Friday, November 6, 2020

LFQmbrFDR! (FDR for Match Between Runs!)

That's the level of happy that this new preprint should make you! Keep an eye out for the black pug and then try saying

LFQ-MBR-FDR 3 times! I think I broke my chair! 

This is the preprint I'm about to ramble about! 

Why am I pug puppy chair breaking level excited? Well, let's take this fantastic example of how we typically do Match Between Runs (MBR). 

This is from this great paper
by Schonke et al. (the o in the author's name isn't really an o; I believe it is supposed to be an astonished emoji 😲).

If you don't use match between runs, the only peptides you identify in each individual run are the ones you obtained MS/MS fragments for and could actually identify. That is obviously just a subset of the total peptides present, for a lot of reasons: your instrument is too slow to fragment all of them, or in sample Ob2 some other peptide is coisolating and dragging your peptide score below your cutoff, etc. etc.

MBR allows you to extrapolate from run to run: a peptide you clearly see an MS1 signal for in one run, but just didn't identify there, gets its ID from another run where it was identified.
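To make the idea concrete, here is a minimal sketch of the matching step. It is not how any particular software does it: the `Feature` class, the `match_between_runs` function, and the 10 ppm / 0.5 minute tolerances are all hypothetical names and numbers picked for illustration, and real tools align retention times between runs first. The one thing it does on purpose is tag every ID with where it came from.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Feature:
    """One MS1 feature in one run (hypothetical structure for illustration)."""
    mz: float                      # precursor m/z
    rt: float                      # retention time in minutes
    peptide: Optional[str] = None  # set only if MS/MS identified it in this run
    evidence: str = "none"         # "MS/MS", "MBR", or "none"


def match_between_runs(donor: List[Feature], acceptor: List[Feature],
                       ppm_tol: float = 10.0, rt_tol: float = 0.5) -> List[Feature]:
    """Transfer IDs from donor to unidentified acceptor features.

    Tolerances are assumed values, not recommendations.
    """
    identified = [f for f in donor if f.peptide is not None]
    for feat in acceptor:
        if feat.peptide is not None:
            feat.evidence = "MS/MS"  # identified directly, PSM available
            continue
        best = None
        for d in identified:
            ppm = abs(feat.mz - d.mz) / d.mz * 1e6
            drt = abs(feat.rt - d.rt)
            # Keep the closest donor in retention time within both tolerances.
            if ppm <= ppm_tol and drt <= rt_tol and (best is None or drt < best[0]):
                best = (drt, d)
        if best is not None:
            feat.peptide = best[1].peptide
            feat.evidence = "MBR"    # only m/z and retention time support this ID
    return acceptor


run_a = [Feature(500.2500, 20.0, "PEPTIDEK")]
run_b = [Feature(500.2502, 20.1), Feature(600.3000, 35.0),
         Feature(700.4000, 40.0, "ANOTHERK")]
for f in match_between_runs(run_a, run_b):
    print(f.mz, f.peptide, f.evidence)
```

The first feature in `run_b` picks up its ID by matching, the second stays unidentified, and the third keeps its own MS/MS identification.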

In my always and forever humble opinion, the way Sch😲nke et al. did it is the best way, because for the peptides identified without matching we have a second level of confidence in the peptide ID. We have retention time and MS1 and probably isotopes, and we've got a Peptide Spectral Match. For the matched ones we just have the first 2.

Everyone wants more data, right? But you should denote somehow where you got it from, because there is no confidence metric for your match between runs. UNTIL NOW!

I do think it is worth noting that some guys at Harvard did some really serious work last year to try and estimate how often MBR makes errors. You can read what some weird guy wrote about that here. 

Back to the new preprint:

This group shows off a way of estimating the quality of the MBR data and its application in their currently existing software! I'm going to check today to see if it is already live there, but I think it is.
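For anyone wondering what an FDR on transferred IDs even looks like, here is a generic decoy-counting sketch. To be loud about it: this is NOT the preprint's method, which isn't described in this post. It assumes each transfer has some score (higher is better) and that some transfers are known decoys that cannot be real; the function name `fdr_at_threshold` and the example scores are made up for illustration.

```python
from typing import List, Tuple


def fdr_at_threshold(transfers: List[Tuple[float, bool]], threshold: float) -> float:
    """Estimate FDR as decoys / targets among transfers scoring >= threshold.

    Each transfer is (score, is_decoy). Generic target-decoy counting,
    an assumption for illustration, not the preprint's actual model.
    """
    targets = sum(1 for score, is_decoy in transfers
                  if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in transfers
                 if score >= threshold and is_decoy)
    if targets == 0:
        return 0.0  # nothing accepted, so nothing to be wrong about
    return min(1.0, decoys / targets)


# Made-up scores: (transfer score, is this a decoy transfer?)
example = [(0.9, False), (0.8, False), (0.7, True), (0.6, False), (0.5, True)]
for cutoff in (0.85, 0.65, 0.40):
    print(cutoff, fdr_at_threshold(example, cutoff))
```

Loosening the cutoff lets in more transfers but also more decoys, which is the trade-off a confidence metric for MBR lets you control instead of just hoping.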

They pressure test it against a bunch of OrbitalTrap and TIMMYTOFFY data. 

Where should it really, really be applied? Datasets without a lot of signal. Single cell, in particular!
