Tuesday, December 7, 2021

Conspicuous in absence -- where is the easy and obvious SOMASCAN experiment?


If you're in a big biology or medical institution you probably heard the sigh of relief when this paper dropped.

"Finally", they say, "FI-NALLY we're free of the tyranny of those cantankerous vitamin D deprived mass spectrometrists! Proteomics has been wrested out of their pale and shaky hands and we can do real science with it!"

Honestly, I bet you can't blame them. I certainly can't. I recently submitted to a core proteomics journal for the first time in several years, and I had forgotten the culture of snide comments and borderline trolling that the two "big" proteomics journals permit, which is somewhat unique to this scientific community. That's on me and that's a different story entirely.

The story of the moment is the easy experiment and why it hasn't been done yet. SOMASCAN is an aptamer-based proteomics technology. Aptamers are single-stranded nucleotide chains that can be used to detect proteins or pollutants, or just about anything, really.

This study is the first massive use of the technology on an extremely well-characterized population. If you aren't familiar, deCODE was a huge thing. Iceland is pretty isolated and there is a really interesting method for keeping track of heredity in the absence of records -- the unique use of last names. The suffixes "son" and "dottir" can be used pretty much without fail to track family lineages back through time with or without solid record keeping.

deCODE started back in the 90s when some dude from Boston decided to just go and sequence a ton of people, kicking off a massive genome sequencing project on this population. This group did amazing things to bring in the genomics revolution. I'm a big fan, and we all should be. The point of the background is that when these people do stuff, the rest of the world listens.

All of a sudden, SOMASCAN isn't just some isolated weird company that spends more money on buying Google Ad space than on testing that their panels are quantitatively accurate. They're now a huge weird company that the world knows about, largely because they spend more money buying Google Ad space than testing that their panels have any sort of quantitative accuracy, and one that has just pulled off a 5,000 proteome project with one of the biggest names in medicine in the world. Now, I have a bunch of US government COI stuff, so I can't even invest in Thermo, despite their recent diversification into the highest priced percolating coffee makers on earth.

So feel free to ignore me, as always. I couldn't invest in the exciting new wave of proteomics technology like this one even if I wanted to, but I love to talk about proteomics so much that I'll even answer the phone when people representing big capital management firms call to ask me about stuff when I'm driving. SOMASCAN has been the topic of a lot of calls like this and while I certainly wouldn't tell anyone how to spend their money, this is my reservation. 

Here is the thing. This isn't new tech. It's been around for years now and the experiment that SOMASCAN needs to do to make me and every other naysayer out there shut up isn't hard or all that expensive. It's easy, relatively inexpensive and now, at least 5 years in, the fact it hasn't been done should give you "Elizabeth Holmes knows about that USB drive you buried off the Coyote Trail in San Jose" level chills. Maybe Katie Holmes? Probably also scary.

Here is the experiment that hasn't been done. 

Run samples with SOMASCAN. Run same samples with LCMS. Compare results. Fans of this technology will cite that work has been done. Sure. There is stuff, but LCMS based proteomics comes in many different flavors. There is the truly quantitative stuff and there is the kind of quantitative stuff. For most global proteomics out there, particularly as you travel back through time, it's the kind of quantitative variety. It wasn't that it couldn't be quantitative. It was more like proteomics was used to cast a wide net to look for things to then go after quantitatively. Good accurate quantitative mass spectrometry is BOOOOOOOORING. You need standards and controls and you have to think about %CVs and LODs and LOQs and I don't want to do that on 8,000 protein targets. I want to do something kinda quantitative on 8,000 proteins, then I want to look at the evidence of the cool ones for a month and then pick 1 or 10 to do the boring stuff on.

That's just me. There are nerds out there who basically ONLY do the real quantitative stuff. A bunch of them have congregated in Seattle for some reason, but I worked with one in Pittsburgh recently, so they are spreading. Here is the experiment: Have someone well established in translational or clinical mass spectrometry run those same samples using a quantitative pipeline. Compare the results. This isn't a "gotcha!" There are clinical mass spectrometry assays that have been blessed by the American Society for Clinical Pathology (ASCP) and CLIA and are used to diagnose patients. The error bars and CVs are well controlled and well understood.
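For anyone wondering what "compare the results" could look like in practice, here is a minimal sketch in Python using only the standard library. It is not anyone's published pipeline. The function names (`percent_cv`, `spearman`, `compare_platforms`) and the dict-of-lists input layout are assumptions made up for illustration: it computes a %CV from replicate measurements, and a per-protein Spearman rank correlation between the two platforms across the same samples.

```python
from statistics import mean, stdev


def percent_cv(replicates):
    """Coefficient of variation (%) for replicate measurements of one protein."""
    return 100.0 * stdev(replicates) / mean(replicates)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg_rank
        i = j + 1
    return out


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if x or y is constant."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def compare_platforms(aptamer, lcms):
    """Per-protein Spearman correlation for proteins measured on both platforms.

    Both inputs (hypothetical layout): dict of protein -> list of values,
    with samples in the same order on both platforms.
    """
    shared = sorted(set(aptamer) & set(lcms))
    return {p: spearman(aptamer[p], lcms[p]) for p in shared}
```

Rank correlation is used here because the two platforms report in different units (relative signal versus concentration), so agreement in ordering across samples is the fairest first question. A real comparison would also need the standards, LODs and LOQs mentioned above, which this sketch does not attempt.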

Again, yes, there have been LCMS to SOMASCAN comparisons. Have they been designed well to be quantitative? If so, I'd LOVE to see one because I haven't yet. That isn't sarcasm. If SOMASCAN works I would literally use it. Mass specs are an expensive pain in the ass and if we're just quantifying proteins I do not care if I use a mass spec. It's the alternative proteoforms and PTMs and splicing events that we need LCMS for.

This many years in, though, and without that experiment, it's gone way beyond suspicious.

All that being said, I think this effort by the deCODE group is pretty cool. 5,000 proteomes? Even if the data is lousy, this is a group that makes sense out of GWAS data and that's pretty much the crappiest "-omics" data you can get out of anything except a $510 drip coffee maker and they've done great stuff with it over the years. (GWAS, not the coffee maker, though maybe both.) And science is largely about resources. Mass spectrometers require a LOT of electricity and Iceland has a well documented lack of access to stable and affordable power, which has something to do with all the volcanoes and geothermal stuff around. I think it knocks the power out all the time or something or makes it a lot more expensive. I might have part of that mixed up. Either way, they got these beautiful plots like the one at the top of this post, which I expect were met with either rage or projectile vomiting in certain facilities in Seattle.

Just because protein quantitation doesn't meet the criteria of classically trained analytical chemists doesn't mean that the biology isn't real. (Something is very wrong with that sentence, I'm not sure what). My problem with this approach is that these results could very easily be validated or supported with classical analytical techniques like the ones that are so precise the FDA lets medical technologists in hospitals use them to determine what is up with a patient. 

It would be super easy. Please don't interpret this as me saying someone is hiding something. I ain't saying that someone is hiding something. But it's pretty weird.

As a final statement that requires more time and exploration, this isn't the only "next gen" platform aiming for proteomics. There is another one coming and, this is important, the two do not appear to agree... Which one is right? Is either? I don't know, but it sure wouldn't be all that hard to figure out.

1 comment:

  1. Here is one large scale evaluation of affinity reagents for proteins (antibodies, which are likely more specific than aptamers), and the results are not promising:

    Marcon E, Jain H, Bhattacharya A, Guo H, Phanse S, Pu S, et al. Assessment of a method to characterize antibody selectivity and specificity for use in immunoprecipitation. Nat Methods. 2015;12:725–731. doi: 10.1038/nmeth.3472.