Saturday, January 1, 2022

2021 Proteomics Recap!

Wow! I just found a draft of another proteomics 2021 wrapup that I'd started and not finished.

I like this one better than the last one because it is almost entirely me rambling. 

In no particular order, this is my recap of what happened in Proteomics in 2021! 

1) The outside world noticed Proteomics in 2021! 

What was the tipping point? Was it the petabytes of DNA sequencing data that had been acquired on diseases like Alzheimer's, Huntington's, ALS, schizophrenia, on and on and on, diseases that do NOT alter someone's DNA? ("Come on, guys, if we just increase our read depth and generate 15TB per file, we'll get closer to something we can't possibly measure")

Was it that someone finally looked behind the scenes at Single Cell RNASeq data and realized (as one scientist I have tons of respect for recently asked, "...ummm....should it all be zeros?") that you only get 8,000 transcripts quantified by sequencing at least 10,000 cells, because 90-95% of the values in each individual cell, at the whole transcript level, are zeros? More on this further in. 

I don't know what it was, but someone at Forbes got up and went and took pictures of mass specs in some museum, and lots of people in our field got offered a whole lot more money to go do proteomics somewhere else. Wow, there are a lot of start-ups -- I know more than a few people who either took or hesitantly walked away from 7 figure offers. One consequence....ummm....there are a lot of jobs open.... (this was just one side of one board at ASMS Philly. If we normalize the number of postings by number of attendees this might be sort of insane..)

A few months ago I made a list of over 100 open positions in one U.S. city for a slide deck. We might seriously need to do something about this soon....and I'm not sure that waiving the requirements for degree completion (as some companies are doing) is the best long term strategy for our field, or for students who have been isolated during what should be the coolest years of their lives and whose goals for a graduate degree slip further away each time their lab gets shut down by some evil virus. We'll work it out in the end, but everyone should be aware that things like this are happening. If a CV crosses your desk down the road where someone did 5 years of work and no fancy sword or robe is attached, please stop and think about the fact that COVID has impacted everyone to different extents. 

2) Proteomics (and mass spectrometry) have proven that they belong in monitoring and responding to emerging diseases. 

Whew. I can't even try to link even 1% of them here. Yes, COVID-19 is still around. And the absurdly huge mountain of evidence on how this and other viruses work makes one thing clear -- mass spectrometry HAS to play a role in the detection and study of emerging threats in the future. Heck, I think someone let me ramble for an entire article about it somewhere....

How's this virus work? 

It splices! It uses complex glycosylations!  

And it rapidly mutates in ways that, currently, only mass spectrometry is fast enough to respond to, because we don't need complex new reagents to be produced and shipped out before we can adapt. Who else is continuing to learn lots of stuff about viruses that they never wanted to before? 

3) Single Cell Proteomics (SCP) is our field's moonshot! 

I think people not currently doing (or trying to do) SCP are probably tired of hearing about it, but the challenges of single cell, and the solutions, are clearly starting to trickle down to help everyone out. 

Rumor has it that the first instrument designed specifically for single cell analysis actually went to a lab that does HLA/MHC immunopeptidomics. Which makes a lot of sense because signal limitations are a huge problem with those awful peptides as well.

In addition, we're getting exposure through SCP to new things we didn't know about. Before this year did you know that robots have been around for decades that can move picoliters of solvent around rapidly and accurately? Heck, 18 months ago a regulatory body and I did 4 rounds of paperwork filings on the calibration of a robot that had the job of moving 200 microliters of methanol around and today my problems are failed calibrations of -- not nanoliters -- picoliters -- of acetonitrile, with a robot that was first released 10 years ago.

Other things are trickling down as well, like new surfactants that reduce peptide binding to plastics and glass and better data processing tools for digging deep in our data. We're also punching some holes in some old ideas that have hung around a bit too long from our roots in analytical chemistry. Things like -- maybe you don't need 100% coverage of every peptide every single time to call it an identification.

Also -- I think we're starting to see behind the genetics curtain and finding that they've got some amazing marketing department covering up a lot of problems, like the whole 90% missing values at the gene level thing. 

4) If we're really really careful, we can have conferences again...probably...

I have a 25% written ASMS 2021 wrapup here and it's all positive stuff about the conference. I've had trouble prioritizing it because my impression of ASMS in my head is mostly about how I'd largely forgotten how to interact with human beings in person. To be honest I've never once in my life walked away from a human interaction and thought to myself, "great job, Ben, it totally looked like you've held a conversation with another member of your species before! Have you been practicing?" But I was totally impressed with people I spoke to who seemed to be the same awesome people I hadn't seen in a couple of years. And for those of us who might have appeared to have lost a step, it's great that you were so cool about it. 

I guess the question that we'll need to look at going forward is -- do we need to meet in person? Or is the hybrid venue here to stay? There certainly seem to be environmental implications to think about. The first few hybrid meetings were kind of clumsy, but we're getting better at it and the technology is improving! 

5) You don't have to use an Orbitrap for proteomics!  

From around 2007 or so the Orbitrap just took over. Look at this distribution in datasets at ProteomeXchange ---

There are 18,518 datasets, and the SCIEX Triple TOFs warrant a place on the chart with around 1,100 datasets and the Synapt has 335. All other instruments combined are lumped into one section that makes up less than 10% of the total. 

But check out two massive studies I rambled about earlier this year! (Left and right)

Sure, the Orbitraps are involved, but just as many runs were performed on other instruments. 

6) Probably even cooler? Check out that image above again and see how many times capillary LC was used!  The arguments against the dreaded nanoflow liquid chromatography are continuing to build! 

Or...38,000 runs and counting...? 

Are you doing single cell level, sample-limited work? If the answer is no, then I'm going to continue to say that you don't necessarily need nanoflow. Someone dropped a new NanoLC in 2021 that can do 20nL/min and I sincerely wish anyone crazy enough to do that luck with it. Just doing the math in my head I think that it will take around 74 calendar days to clear a 4 microliter air bubble from your lines. We'll keep running a lot of our stuff at 100 microliters per minute because about 95% of the time there is plenty of sample to get the same coverage. 
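
For anyone who would rather check the flow rate arithmetic than trust head math, here is a rough Python sketch. It assumes an idealized case where the bubble just has to be displaced at the pump's volumetric flow rate; compressibility, dead volume, and everything else that makes a real bubble feel like it takes months are ignored. The function name is made up for illustration.

```python
def minutes_to_displace(volume_ul: float, flow_nl_per_min: float) -> float:
    """Idealized time in minutes to push a volume through at a given flow rate."""
    if flow_nl_per_min <= 0:
        raise ValueError("flow rate must be positive")
    # 1 microliter = 1000 nanoliters
    return volume_ul * 1000.0 / flow_nl_per_min


bubble_ul = 4.0  # the 4 microliter air bubble from the text

# 20 nL/min nanoflow versus 100 uL/min (expressed in nL/min)
nano_minutes = minutes_to_displace(bubble_ul, 20.0)
high_flow_minutes = minutes_to_displace(bubble_ul, 100.0 * 1000.0)

print(f"20 nL/min:  {nano_minutes:.0f} minutes")
print(f"100 uL/min: {high_flow_minutes * 60:.1f} seconds")
```

Even in this best case the gap is a few hours versus a couple of seconds, a 5,000-fold difference in flow rate, and real bubbles compress instead of moving politely down the line.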

7) On a similar note -- other proteomics technologies are coming. Or are here? Fortunately, they seem too busy disagreeing with one another to cause too much of a problem. Slooooooowly the outside world is realizing that even at 4 million read copies of DNA you still won't get to a protein or metabolite abundance. The companies that have sold instruments for DNA and RNA sequencing have tons of firepower, money and way way way better marketing than we do. 

I'll give you an assignment. Ask any random scientist how many transcripts they think are quantified in a single cell using RNASeq. I bet you a beer at HUPO that they answer something like "all of them" or "10,000 or something". They won't believe you when you give them the real answer, which is "a couple hundred". 

I should put some data up to download! That's the answer, though. But they're doing thousands of single cells and so their Venn diagram of unique reads from unique cells works its way up to thousands of transcripts when you've done 100,000 cells. Missing values? Oh...yeah, about 95% missing values at the full transcript level. This isn't me making a joke, this is reality.
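
Until real data goes up, here is a toy Python simulation of how that Venn diagram grows. Every number in it is a made-up assumption for illustration: a pool of 10,000 detectable transcripts, 200 detected per cell, and each cell detecting a uniformly random subset. Real detection is heavily skewed toward abundant transcripts, so treat this as a sketch of the bookkeeping and nothing more.

```python
import random


def simulate_union(n_cells, n_transcripts=10_000, detected_per_cell=200, seed=0):
    """Return (transcripts in the union across cells, per-cell missing fraction).

    Assumes each cell detects a uniformly random subset of transcripts,
    which is a simplification chosen for illustration.
    """
    rng = random.Random(seed)
    seen = set()
    for _ in range(n_cells):
        seen.update(rng.sample(range(n_transcripts), detected_per_cell))
    union_size = len(seen)
    # Fraction of the combined transcript list that is a zero in any one cell
    missing_fraction = 1.0 - detected_per_cell / union_size
    return union_size, missing_fraction


for n in (1, 10, 100, 1000):
    union_size, missing = simulate_union(n)
    print(f"{n:>5} cells: {union_size:>6} transcripts in union, "
          f"{missing:.0%} missing per cell")
```

The combined list climbs into the thousands as cells are added, while any individual cell still only has its couple hundred values, so the missing fraction per cell ends up well above 90%.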

Okay, so where does this ramble end? Not real sure, but maybe here. This is a lot of words. We've got the world's attention right now and let's see if we can handle the pressure and use protein specific measurements to figure out biology and medicine! 
