Thursday, June 20, 2024

pQTL/GWAS studies by LC-MS proteomics identify loads of peptide-level variants!

 

Quantitative Trait Loci, or QTLs, are a great excuse for doing some pretty low-confidence, low-accuracy measurements. In genomics these are done all the time with SNP arrays that can more or less sorta quantify a couple hundred things per sample.

Here's the trick, though: if you get enough samples, you can start to see patterns in that lousy data without doing good genomics or gene-product measurements on lots and lots of people (which is still hard, I don't care what refrigerated room of supercomputers you have).
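To make that logic concrete, here's a minimal sketch of what a single pQTL association test looks like: a linear regression of protein abundance on genotype dosage (0/1/2 copies of the alternate allele), repeated over every SNP-protein pair. The data are simulated and the variable names are mine, not anything from the studies discussed here; real pipelines also throw in covariates and multiple-testing correction.

```python
# Minimal pQTL association sketch (simulated data, illustrative only).
# One test: regress a protein's abundance on genotype dosage (0/1/2)
# and ask whether the slope ("beta", the per-allele effect) is nonzero.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_samples = 1000

# Simulated genotype dosages for one SNP (minor allele frequency ~0.3)
dosage = rng.binomial(2, 0.3, size=n_samples)

# Simulated log2 protein abundance: a true effect of 0.25 per allele, plus a
# lot of measurement noise -- the point is that sample size beats noise
abundance = 0.25 * dosage + rng.normal(0.0, 1.0, size=n_samples)

# Ordinary least squares; real pQTL pipelines add covariates (age, sex,
# ancestry PCs) and correct for the millions of SNP-protein tests they run
slope, intercept, r_value, p_value, stderr = stats.linregress(dosage, abundance)
print(f"beta = {slope:.3f} +/- {stderr:.3f}, p = {p_value:.2e}")
```

The regression itself is trivial; the whole QTL trick is that with a thousand-plus samples, even a noisy per-sample measurement leaves the genotype effect sticking out of the noise.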

This is opening the door to things like... actually, I don't think I want to name some of these affinity arrays right this second. A post on one of them this week has something like 20,000 site visits in about 3 days. You know what I'm talking about.

The fast (are they really, though?), inexpensive (ummm... honestly, it doesn't look like it?), mostly unproven things that may totally work, but there certainly isn't an abundance of evidence yet that they do.

What if you could do decently large QTL-type work at the protein level with proven quantitative technology? What's that worth to the world? I dunno, but is that even possible yet?


This study is a couple of months old (I've been busy), but it certainly implies that, yes, we can do these things today, and we don't even need the best of the best of the best to do so.

This study used the Seer Proteograph for upstream sample prep and then microflow proteomics on a TIMSTOF Pro (possibly a 2, I forget). Thanks to the nice robust microflow setup, they knocked out each sample in about 30 minutes, so 48 samples/day this way.

I think the biggest of the "next gen" studies I've seen so far was 5,000 samples? With reasonable downtime and QC/QA estimates, you're at about 3 months this way, or 4 months if you take weekends off. Are the affinity things faster? Maybe? Are they cheaper? Also... maybe...
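Here's the back-of-the-envelope version of that math. The 30 minutes/sample and 5,000 samples come from above; the ~85% effective uptime for QC blanks and maintenance is purely my own assumption for illustration.

```python
# Rough instrument-time estimate for a 5,000-sample cohort at 30 min/sample.
# The uptime figure is an assumption, not a number from the paper.
minutes_per_sample = 30
samples_per_day = 24 * 60 // minutes_per_sample      # 48 samples/day at 100% uptime
cohort_size = 5000

ideal_days = cohort_size / samples_per_day           # ~104 instrument-days
realistic_days = ideal_days / 0.85                   # assume ~85% effective uptime
print(f"{ideal_days:.0f} days ideal, ~{realistic_days:.0f} days "
      f"(~{realistic_days / 30:.1f} months) with QC and maintenance folded in")
```

That lands right around the 3-to-4-month mark, on a single instrument, without touching the fastest hardware out there.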

However, while I don't know the affinity technologies well, one thing I do know about any affinity-type technology is this: if you didn't design a probe for something before you ran the samples, you will NEVER EVER be able to go back and look for new stuff.

If you did that same study using the platform described here, where they did DIA-based analysis, it's the complete opposite: you can always go back and look for new stuff. I'm doing it all the time right now. As these neural network tools get better, I can go back to single cells we analyzed 2 years ago and rerun them, and my blanks look better, my coverage goes up, and I can find a few more of the mutations and PTMs I care about.

How's the coverage this way? LC-MS sucks at plasma proteomics, right? It's as good as any affinity tech we've seen so far, and, again, as the algorithms and our knowledge of the true human proteome evolve, we can go back to these data.

In fact, you can do it right now if you want. The files are all right here.

Okay, off my soapbox: the authors did a bang-up job of quantifying human variants in these data. It's truly stunning downstream analysis work.

1 comment:

  1. Thank you for discussing our paper! Regarding further GWAS, we just posted "A genome-wide association study of mass spectrometry proteomics using the Seer Proteograph platform" to bioRxiv:

    https://www.biorxiv.org/content/10.1101/2024.05.27.596028v1

    where we use blood samples from a discovery cohort of 1,260 American participants and a replication cohort of 325 individuals from Asia, with diverse ethnic backgrounds, to analyse 1,980 proteins that were quantified in at least 80% of the samples.
