Sunday, July 16, 2017

Fascinating GWAS proteomics(?) study.

This post is going to be a bit of a Sunday morning ramble. It began when this interesting paper showed up in my Twitter feed a couple of times today.


And it caught my attention because of the associated text in the retweets:


GWAS is Genome Wide Association Study (wikipedia here). Generally, they proceed in this way: hundreds or thousands of people are tested with SNP arrays that can detect literally millions of different genetic variants. The participants are divided by their phenotype or disease state, and the variant signals are compared across the groups to infer which variants are associated with the phenotype.
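To make that comparison concrete, here is a minimal sketch of a single-variant test: an allelic chi-square on a 2x2 table of allele counts in cases versus controls. The function name, the 0/1/2 genotype coding, and the toy data are all my own assumptions for illustration; real GWAS pipelines repeat something like this across millions of variants and then correct heavily for multiple testing.

```python
import math


def allelic_chi_square(case_genotypes, control_genotypes):
    """Chi-square statistic (1 degree of freedom) on a 2x2 allele-count table.

    Genotypes are coded as copies of the minor allele: 0, 1, or 2.
    (This coding is an assumption of the sketch, not taken from any paper.)
    """
    def allele_counts(genotypes):
        minor = sum(genotypes)
        major = 2 * len(genotypes) - minor  # each person carries two alleles
        return minor, major

    a, b = allele_counts(case_genotypes)     # cases: minor, major
    c, d = allele_counts(control_genotypes)  # controls: minor, major
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # degenerate table, nothing to test
    return n * (a * d - b * c) ** 2 / denom


def chi_square_p_value_1df(stat):
    """Upper-tail p-value for a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


# Made-up toy data for ONE variant: 8 cases and 8 controls.
cases = [2, 1, 1, 2, 1, 0, 2, 1]
controls = [0, 0, 1, 0, 1, 0, 0, 1]

stat = allelic_chi_square(cases, controls)
p_value = chi_square_p_value_1df(stat)
print(f"chi-square = {stat:.3f}, p = {p_value:.4f}")
```

With only one variant and sixteen people this is a toy, but it shows the shape of the question being asked at every position on the array.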

Lots of good stuff has come from GWAS, and lots still will as the tools continue to improve. If all goes well you will identify an expression quantitative trait locus (eQTL) or two that is associated with your disease. GWAS via SNP doesn't identify a gene that is associated with your disease. It identifies an area in the genome that is associated with your disease. In the best case scenario, you are working with a really well annotated genome and a gene with really well understood mechanisms of expression.

Side note: As of this Nature article in 2011, 96% of the 1.7 MILLION samples in the global GWAS catalog were from people of European descent. In this Nature followup last year, this appears to have improved, but there are still shockingly large discrepancies (that same library now has 35 MILLION samples). These articles point out the problems with developing genetic medicines for only certain populations. However, if you are really bored (or interested), check out this paper and the concept of linkage disequilibrium. Genomes aren't static. They can't be or that evolution thing doesn't work very well. You may not be able to carry an inference about the effect of one point on a genome from one population to the next, because that gene might be different or somewhere else entirely.

Wow. What was I writing about? Oh yeah! Okay, so GWAS is powerful, but we're inferring a lot of stuff: 1) that the place that looks upregulated is linked to that gene, and 2) that it is linked to that gene in a way we understand (upregulation of that area could cause regulation of the associated gene to go UP or DOWN).

If you're still reading along (sorry) you can see why I might do a double-take on a GWAS proteomics study. You might also understand why I might read a paper and be a little surprised that no direct protein measurements were ever performed in this study.


This paper introduced me to the concept of pQTLs. These are QTLs associated with protein levels. 71 proteins known to be associated with cardiovascular disease were integrated as factors here.

My interpretation (which is likely wrong) is that rather than saying cohort A vs cohort B, the factors compared were patient group A who had CRP levels above X.X mg/dL compared to patients with CRP below that level. Then you look for QTLs that stand out.
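Here is a rough sketch of that interpretation, and only that: split people on a measured protein level, then ask whether a variant is more common in the high group. Everything in it is hypothetical. The field names, the toy numbers, and especially the use of the sample median as the cutoff are my assumptions, not anything taken from the paper.

```python
import statistics


def split_by_protein_level(samples, threshold):
    """Split samples into (high, low) groups on a measured protein level."""
    high = [s for s in samples if s["protein"] > threshold]
    low = [s for s in samples if s["protein"] <= threshold]
    return high, low


def minor_allele_frequency(group):
    """Fraction of alleles in the group that are the minor allele.

    Genotypes are coded as 0, 1, or 2 copies of the minor allele.
    """
    if not group:
        return 0.0
    return sum(s["genotype"] for s in group) / (2 * len(group))


# Made-up toy data: protein level (arbitrary units) and genotype at ONE variant.
samples = [
    {"protein": 0.4, "genotype": 0},
    {"protein": 0.6, "genotype": 0},
    {"protein": 0.9, "genotype": 1},
    {"protein": 1.1, "genotype": 0},
    {"protein": 1.8, "genotype": 1},
    {"protein": 2.3, "genotype": 2},
    {"protein": 2.9, "genotype": 1},
    {"protein": 3.4, "genotype": 2},
]

# Hypothetical cutoff: the sample median, chosen only so the toy groups are balanced.
threshold = statistics.median(s["protein"] for s in samples)

high, low = split_by_protein_level(samples, threshold)
freq_high = minor_allele_frequency(high)
freq_low = minor_allele_frequency(low)
print(f"minor allele frequency: high group {freq_high:.3f}, low group {freq_low:.3f}")
```

A variant whose frequency differs sharply between the two groups is the kind of thing that would "stand out" as a candidate QTL for that protein, pending a proper statistical test.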

How did they fare? Pretty well. They make some interesting biological conclusions and relate those to the characteristics that the patients manifest. They come up with 20 or so observations where the GWAS predicts the proteins that they know are elevated from the patient files. They find some other QTLs that seem to be associated with the known elevated proteins that might make for better predictive models of different stages of cardiovascular diseases down the road. Some enterprising CVD researcher should pull out this list and see if they do correspond at the protein level.

Is it really a proteomics study? Well...it's more of a transcriptomics study with some integration to a small set of proteins, but it is interesting and it forced me to read 2 or 3 papers to get to this (likely incorrect) interpretation of what they set out to do -- and whether it worked.

Got some guy doing GWAS down the hall and wondering if you could work together? Maybe you should check out this paper!

Incidentally, last summer a great study came out of some lab at Harvard where QTL and protein quantification were systematically compared using an amazing mouse model system. I wrote a post on it here that certainly didn't do it justice, but the original paper is seriously good and helps bridge some gaps -- including terminology -- and can give you a feel for when you can trust QTL measurements and when you can't.
