Monday, March 29, 2021

Proteomics needs more spectral library formats!


At both US HUPO and ABRF, which conveniently overlapped a couple of weeks ago, a big story was the emerging alternative proteomics techniques. Illumina is getting into the game and is chasing the SomaScan technology to see who can be the fastest to accurately quantify 1,500 proteins in human samples.

Four new companies launched last year alone, raising huge amounts of money with basically the same pitch: "we want to be the Illumina of proteomics." That probably led to Illumina wondering why it couldn't be the "Illumina of proteomics" too. Google the term in quotation marks. You'll find them and their hugely successful investment raises.

The outside world is excited and ready to invest in proteomics, and it's becoming readily apparent that LCMS is not part of the conversation. Could someone raise anywhere near that kind of investment capital on a pitch based on LCMS? No way. If you take a step outside our little community, spin around in place 3 times, and look back in, you can probably see why the scientific community is fatigued with us and all the dumb shit we spend our time on.

For example: I think that a substantial percentage of people in the field right now are spending their time trying to come up with completely new and completely incompatible spectral library formats. Is that what you're planning to do today?

Why do we need a tool like EncyclopeDIA to have converters for 7 different spectral library formats? Why is that not even enough?

I downloaded the files from a new study in a single-word journal this morning to see the results for myself, and I'm absolutely fucking thrilled to find a new spectral library format. Even Pinnacle, which can natively load practically any format of MS data and offers a scrollable list of accepted input formats, just closes when I try to feed it this amazing new spectral library format. (Did you know that some ultralarge biopharma companies have their own internal spectral library formats? They do, because we've all absorbed too much acetonitrile through our skin and it has done something awful to our brains. I know because Pinnacle has the option to accept those as well on its pulldown list of input options.) I'm sure that OptyTech could fix that for me today, but why should they have to?

When you apply for your next grant and you're beaten by the genomics core across town because they can now use their NovaSeq to quantify 1,500 proteins, don't be bitter. 

Go back to your lab and get back to working on a completely new and unnecessary way to extract proteins from your cell types that we already have 15 ways to effectively work with. Use 100mM TEAB for resuspending your trypsin today and swap over to 50mM AmBiC on Friday. Heck, mess with the ratios of protein to trypsin while you're at it. Tinker with that gradient to get that one extra albumin peptide you've always wanted to see in your global runs. You know you want to. Hell, put a grad student on making an entirely new spectral library format. We probably don't have enough anyway.

Just keep in mind that every step along the way, we have done basically everything we could think of to make proteomics inaccessible to the greater scientific community and to make our results as challenging to reproduce as we possibly could. If Illumina pulls off 3,000 protein identifications in their next generation of technology as they have promised, we'll be lucky if LCMS proteomics exists anywhere outside of Cambridge and Munich, because by and large the scientific community is tired of our circular tail-biting craziness. And they should be.
