Saturday, December 12, 2020

Is one of your proteins of interest in your "contaminants" database?


I'm moving fast this morning, but I thought this was a fun thing to bring up. It would be great if every proteomics sample ONLY had the proteins that you digested and wanted to see, but that's not the case, right? You've got your stupid protease hanging around, and you've probably got dog and trash cat keratin falling off of you all the time. In the winter you get this great boost in wool peptide identifications. Common contaminant databases are critical, just about everyone uses them, and a lot of cool software now has the option to add them automatically. 

And...maybe we ought to take a critical look at some of these lists.....

Imagine that you're doing some laser capture microdissection experiments on the epithelium of a tissue slice, and imagine your surprise when you don't detect one of the major protein constituents that should be there. Weird, right? Did you toss keratin 7 because you hard filter your results and use a contaminants database that flags a few extra keratins? 

If you're using the default contaminants.fasta that comes with every MaxQuant download, that might be the case. 

There might be 10 new proteomics studies this fall already on Ubiquitin-Conjugating Enzymes. It's a hot topic out there. I wish you all luck. Blech. If you're doing a meta-analysis of this data and using a hard filter, you might not see a few of the proteins if your contaminant database is derived directly from the great Global Proteomics Machine cRAP database. 

The GPM website clearly breaks out the contaminants in the database by type. There are a bunch of human proteins on the list that are common contaminants if you use the Sigma UPS standard, which a lot of labs do. However, there are some really cool proteins in those standards! 

The direct FASTA download doesn't break the proteins out that way (it can't, FASTA isn't exactly a flexible thing) and it looks like a couple pieces of software have either taken cRAP verbatim or have started with it and added their own in house observations to it. The MetaMorpheus contaminants XML definitely has these proteins in it, for example.

The answer? Probably not hard filtering, I guess. (I have a default filter that makes anything in my contaminants database invisible in PD, and when I open a .tsv from other software I toss anything with an X in the contaminants column. That's on me, but hey! Now I know better!) 
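One way to soften a hard filter is to keep a project-specific "rescue" list of accessions that should survive even when they're flagged as contaminants. Here's a minimal sketch of that idea in Python. The column names ("Protein IDs", "Potential contaminant") follow the MaxQuant output convention, but check the headers in your own .tsv before borrowing this; the rescue accession is just an illustrative placeholder.

```python
# Sketch: instead of blindly dropping every contaminant-flagged row,
# keep any flagged protein that's on a project-specific rescue list.
# Column names follow the MaxQuant convention -- adjust for your tool.
import csv

RESCUE = {"P08729"}  # e.g. keratin 7 -- swap in your proteins of interest


def filter_rows(path):
    kept = []
    with open(path, newline="") as f:
        for row in csv.DictReader(f, delimiter="\t"):
            flagged = row.get("Potential contaminant", "") == "+"
            rescued = any(acc in RESCUE for acc in row["Protein IDs"].split(";"))
            if not flagged or rescued:
                kept.append(row)
    return kept
```

With this, keratin 7 stays visible in your laser capture results while the wool and trypsin still get tossed.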

Friday, December 11, 2020

Extend the capabilities of your higher mileage hardware with 8-plex complementary quan!

The use of the complementary region of a reporter ion tag is not new, but it has been somewhat limited in utility due to the relatively low plexing capabilities. 

You know how those Tandem Mass Things all have the exact same mass? It's because they swap isotopes between the reporter thing (red above) down around 100 m/z that you normally quan off of and this balance region thing (blue above) that we typically just forget about.
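The bookkeeping behind that swap is simple enough to sketch: the total tag mass is fixed, so every bit of mass you move into the reporter comes out of the balance region, which is why the complement ions separate by channel just like the reporters do. The masses below are illustrative monoisotopic values, not a vendor spec.

```python
# Sketch of the reporter/balancer bookkeeping behind complement-ion quan.
# TOTAL_TAG and the per-channel reporter masses are illustrative values.
TOTAL_TAG = 229.1629  # nominal TMT tag mass (Da), for illustration


def balance_mass(reporter_mass, total_tag=TOTAL_TAG):
    # mass left in the balance region, which rides along on the peptide
    return total_tag - reporter_mass


for ch, rep in {"126": 126.1277, "127N": 127.1248, "127C": 127.1311}.items():
    print(ch, round(balance_mass(rep), 4))
```

Since the balance region stays attached to the (big) peptide fragment, you read the quan out at high m/z instead of down in the noisy sub-130 region.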

However, it's really noisy down around 100 m/z, and for a lot of instruments it is 1) impossible to scan down that low or 2) annoying to scan that low (because if you lower your lower limit you also have to lower your upper limit), so the complementary tags have always had a bit of a following, but -- ouch. You're stuck plexing at most 5 samples at once? 

Would it be a better option with some adjusted tag chemistry and 3 more channels? Sure looks like it! 

Not only does the complementary approach beat SPS MS3 in some cases in this comparison (I think the comparison was a Fusion 1, and the authors are very clear about the hardware advancements in the subsequent generations), but the very best looking files out of the ones I've downloaded? They're CID, yo. 

Check out how clean the complementary ion region is here! There is a lot less noise in the higher m/z range in shotgun samples. If you thought that you'd pushed your trusty high mileage hardware to and/or well beyond its limits and you're having trouble competing with the big spenders out there with the newer gizmos? I probably couldn't recommend this new study more! Tighten up that quan while still getting a solid high (or higher) plexing number! 

Oh. Wait. Does processing it look like an absolute nightmare? 

I don't have proof yet, but I'm pretty sure minor adjustments to this will do it.

Less fun details: The resolution that you use will determine which channels you can and cannot use. Since your instrument's effective resolution decreases as m/z increases, those big complement ions will coalesce with the natural C13 isotopes, so you'll lose a channel or two. That's why you probably can't use the N/C swapping at all unless you really crank up the resolution numbers. Still cool. Still highly recommended. 
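For a rough feel of why the N/C channels fall apart at high m/z, here's a back-of-envelope sketch. The ~6.32 mDa mass difference between a 13C and a 15N substitution is a well-known value; the R = m/Δm estimate below is a crude FWHM-style approximation, not an instrument spec.

```python
# Back-of-envelope: resolving power needed to split the ~6.32 mDa
# 13C vs 15N mass defect at a given m/z. Rough estimate, not a spec.
DELTA = 0.00632  # Da, 13C vs 15N substitution mass difference


def required_resolution(mz):
    # minimal R = m / delta_m to separate the two isotopologues
    return mz / DELTA


for mz in (130, 800, 1200):
    print(mz, round(required_resolution(mz)))
```

Around 130 m/z (the reporter region) the split is easy, but out where the complement ions live the required resolving power climbs well into six figures, which is why cranking the resolution (and accepting the slower scans) is the price of those extra channels.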

Thursday, December 10, 2020

Take the fun out of single cell proteomics with false positive rates in quantification!

Is the fact that the hospitals near you are very very full and you're...injury prone...making it hard for you to sleep at night? Or are you just too excited by the promise of single cell proteomic technology and need to rein it in a little? 

Do I ever have the study for you! 

Why the annoying title for this blog post? Because it would be so much more fun to not think about relative errors in quantification in this exciting and emerging field of single cell proteomics. Let's focus on the positive! Do I want to know that the relative error in protein quantification tends to head in a suboptimal direction as the number of relative cells in an experiment decreases? 

Is it important to know? 

Of course it is! 

But it's even more important to know what those relative errors are as you scale down so we can start to think about clever ways to adjust for these errors, like how many replicates would you need to do to make them better?
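As a purely statistical illustration of that replicate question (not a claim about this paper's error model): if the replicate measurements are independent, the standard error shrinks as 1/sqrt(n), so cutting your quant error in half costs roughly four times the replicates.

```python
# Sketch: independent replicates shrink error as 1/sqrt(n), so halving
# the effective CV costs ~4x the cells. Statistics illustration only.
import math


def replicates_needed(cv_single, cv_target):
    # replicates required to bring a single-measurement CV down to target
    return math.ceil((cv_single / cv_target) ** 2)


print(replicates_needed(0.40, 0.20))  # halve the error: 4 replicates
print(replicates_needed(0.40, 0.10))  # quarter the error: 16 replicates
```

The sobering part is the quadratic scaling: the worse the single-cell error gets, the faster the replicate budget blows up.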

Mad props to this group for trudging through and doing the work that I know I sure didn't want to do, and for making the graphs that no one hanging out at a single cell sorter wants to think about but really, truly, absolutely needs to! 

Oh, and this is probably a much better summary of the study: 

Wednesday, December 9, 2020

Tailoring collision energies to search engines -- have we been too complacent?

On the large list of things that I thought we had settled and we, as a field, would never have to think about again.....

You should check out the evidence yourself here; the files are up on MassIVE, and the authors make solid arguments that for both QTOF and Orbitrap instruments we might want to think a little harder about how we set our collision energies -- to the tune of as much as a 40% increase in identifications between optimal and the opposite of optimal. The 40% looks like a pretty extreme example of sub-basement optimal, but it does help to highlight the importance of these parameters.