Thursday, December 5, 2019

Over 1,000 SINGLE CELL PROTEOMES! 2,700 Proteins. 10 days of instrument time!

 You know about SCoPE-MS. Even if you don't pop in on my rambling here, if you've been to a conference this year there have been a few people here and there (and more than a few vendors) who have shown that they can replicate the technique. 

Time to start applying it! And pushing the boundaries. This new preprint shows what we can do in terms of throughput.

By applying enough improvements to the technique that we need to start calling this SCoPE2 -- these authors do single cell proteomes of over 1,000 (one thousand!) cells in 10 freaking days.

How on earth do you do that? In part with TMTPro. 16 channels (minus control and blank) and 95 minute runs, and the math is legit (part of the study is done with TMT 11-plex, the second half with TMTPro/16-plex).

10 days = 240 hours = 14,400 minutes. 14,400 minutes / 95 min per run = 151 injections. 151 injections x 13 or 14 cells (after controls) = room for roughly 2,000 cells at full 16-plex. Math checks out!
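If you want to poke at that arithmetic yourself, here is a minimal sketch. It assumes back-to-back runs with zero overhead between injections and every run at full 16-plex, neither of which is exactly true for the real study, and the function name is just something I made up for illustration.

```python
# Back-of-envelope throughput check using the numbers quoted above.
# Assumptions: no dead time between injections, every run is 16-plex.
DAYS = 10
RUN_MINUTES = 95

total_minutes = DAYS * 24 * 60              # 14,400 minutes of instrument time
injections = total_minutes // RUN_MINUTES   # whole 95 minute runs that fit


def cells_analyzed(n_injections, cells_per_run):
    """Single cells covered by n_injections runs (hypothetical helper)."""
    return n_injections * cells_per_run


for cells_per_run in (13, 14):
    total = cells_analyzed(injections, cells_per_run)
    print(f"{injections} injections x {cells_per_run} cells = {total} cells")
```

Either way you land well above 1,000 cells, which leaves slack for the 11-plex runs, blanks and QC injections.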

I'm downloading the data and -- yo -- the vendor talks this year sure have made it seem like the Eclipse is the only way to go for doing single cell. And there are clear advantages there: sensitivity, speed, and real-time search complementing SPS MS3. I just identified a candidate lab with an Eclipse on the way where I might show up on Xmas with some Canadian whiskey and see how long their director will let me hang around (whaaaaaaaaatup, T?) -- but the RAW files from this new study? I can fully open them in RAWMeat and see everything except MS1 fill -- which means -- it's a Q Exactive!

I was sure the scan header was wrong. There is no way the number of peptides they're reporting (and I'm verifying) is Q Exactive Classic, right? No way.

How'd they do it?

Narrowed the quad isolation (0.7 Da). I know what you might be thinking: didn't the Gygi lab do that a long time ago to limit isolation interference? Totally. Didn't it end up not working as well as SPS MS3? Well...yes and no. SPS MS3, particularly with real-time search, is the best. However, if you look at how the Lumos picks SPS MS3 -- it still drops the quad isolation way way down. Because it's dumb not to.

And -- yes, the problem with the original Q Exactive and Fusion 1 systems is the last-generation quadrupoles that are in them. They don't isolate symmetrically. If you say "give me a 0.7 Da isolation", the ions on the upper and lower sides (the ones 0.30-ish Da/Th above and below target center) are artificially suppressed, because the quad is a lot better at isolating right at target center than at the sides. That can be a bummer -- but right in the middle they isolate great! And that's all that is wanted here. They're trying to just isolate the most intense peak! Could this actually be an advantage? I'm probably just typing too fast....there's no way an asymmetrical isolation could actually improve TMT quan on narrow isolation data....

They focus on optimizing the peak picking time and fill time and -- this data is awesome.

You should read it and download the data and check it out, but -- I'm going to keep rambling (file pulled at random)

This is the RAWMeat TopN plot (how many MS/MS events are actually chosen) and it's revealing. It looks to me like the majority of the time there are only 3? ions selected for fragmentation after each MS1 scan. The majority might be 1.

And fill time for those MS/MS (top panel, TIC bottom)

Yeah....with the narrow isolation and the fact these are SINGLE CELLS(!!!) out of the 6,000 or so MS/MS scans in this experiment it looks like about 5,800 required all 300ms of injection time they were allowed (red line at top).
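To put rough numbers on that, here is a quick sketch. The scan counts are my eyeballed values from the plot, not exact figures from the file, so treat the output as approximate.

```python
# Approximate values read off the RAWMeat fill-time plot described above.
total_msms = 6000    # roughly how many MS/MS scans are in the file
maxed_out = 5800     # roughly how many used the full injection time
max_fill_ms = 300    # maximum injection time allowed per MS/MS scan

fraction_maxed = maxed_out / total_msms
# Lower bound on time spent accumulating ions for MS/MS, counting only
# the scans that ran all the way to the cap.
min_fill_minutes = maxed_out * max_fill_ms / 1000 / 60

print(f"{fraction_maxed:.1%} of MS/MS scans hit the fill-time cap")
print(f"at least {min_fill_minutes:.0f} min of ion accumulation for MS/MS")
```

So roughly 97% of the MS/MS scans were starved for ions, and about 29 minutes of a 95 minute run went to filling for MS/MS alone.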

I'm going on about this data as if it's something crazy awesome, right? But how did they actually do?

2,700 proteins. I'm not making this up.

AND -- they compare this to the 10x Genomics workflow for single cell? And it kills it. It absolutely crushes it. Yes, I'm biased. Clearly. But multiple peptides per protein per cell. It adds up. Sure -- the genomics stuff is still useful and we've got a ways to go. I'd love to do this correctly myself once before I go around kicking over sequencers or whatever they're called, but -- just wow.

One more thing. Basically every week the NCBI has a "CodeAthon". You can check it out on GitHub here. They get a bunch of informatics nerds to submit proposals around a loose central theme, then they pick a project, get all the nerds together, and code away until they have something. So far they've been shockingly successful, as several of the individual events have resulted in accepted papers.

You know who has been ignored? Proteomics. There is one in January that is on Single Cell stuff. I submitted a proposal last month that it should be on integrating single cell DNA/Protein. My biggest concern? Where would I get the dataset? BOOM! I have all the data I need.

If you think this would be a cool idea, shoot in a proposal! This is cool enough that I'd take a train from the warm south to frigid NYC in January if we get to do some proteomics coding. I'll get Canadian Whiskey!


  1. It looks good, but they can only quantify ~500 proteins per run in Figure 2f and Figure S2b. Yet there are over 2,700 quantified proteins in the final data?

  2. Would be curious to hear your opinion on data quality!
    At least in the old data, the empty channel was never empty and had as much signal as the channels containing single cells.

  3. Hello Ben,
    thanks for making people aware of this great paper. Regarding the 0.7 m/z isolation window: what you wrote makes sense. However, they also used a 0.3 m/z offset, which should lead to an inclusion of the mono- and +1 Da peak only (and +2 Da for quadruply charged precursors), probably to boost the number of ions in MS2. In case of a nonsymmetrical isolation efficiency in old quads, this would lead to an inefficient isolation of both desired peaks, though. It apparently works, nonetheless!
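
The isotope arithmetic in that last comment is easy to check with a small sketch. This assumes an idealized rectangular 0.7 m/z window shifted 0.3 m/z above the monoisotopic peak and an isotope spacing of about 1.00335 Da divided by charge. A real quadrupole has sloped edges, so peaks sitting near the boundary would be partially transmitted rather than cleanly in or out. The function name and defaults are made up for illustration.

```python
ISOTOPE_SPACING_DA = 1.00335  # approximate 13C minus 12C mass difference


def isotopes_in_window(charge, width=0.7, offset=0.3, max_isotope=4):
    """Isotope indices (0 = monoisotopic) inside an idealized window.

    Positions are m/z relative to the monoisotopic peak. The window is
    centered at +offset and treated as a perfect rectangle, which is an
    assumption and not how a real quadrupole transmits ions.
    """
    low = offset - width / 2
    high = offset + width / 2
    inside = []
    for k in range(max_isotope + 1):
        position = k * ISOTOPE_SPACING_DA / charge
        if low <= position <= high:
            inside.append(k)
    return inside


for z in (2, 3, 4):
    print(f"charge {z}+: isotopes in window = {isotopes_in_window(z)}")
```

Under those assumptions, 2+ and 3+ precursors get the mono and +1 peaks, and 4+ precursors pick up the +2 peak as well. For 3+ the +2 peak lands at about 0.67 m/z, just outside the 0.65 m/z upper edge, so in practice some of it probably leaks in.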