Saturday, October 13, 2018
The terrifying FDR Averaging study is live on bioRxiv!!!
....Just a little early for Halloween...!! The scariest study of the year just went up on bioRxiv here!
It seems less bad if you start with the fact that this team has a (computationally expensive) solution and I think it's already live on Crux.
Look -- we all know that all our FDR shortcut things (target decoy, Percolator, Elutator, and so on) are imperfect. And -- we know they need appropriate datasets to work right. This study starts out by pointing out what happens if the dataset that hits the FDR calculator IS NOT right. Fluctuations by as much as 20% in your peptide IDs, just by reshuffling your decoy sequences and searching the same data again??? Ummm.....
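To get a feel for why reshuffling decoys can move the numbers, here is a toy simulation. It is NOT the method from the preprint and it doesn't touch a real search engine: the score distributions, the counts of correct and incorrect matches, and the function names `accepted_at_fdr` and `simulate` are all assumptions I made up for illustration. Each "reshuffle" is modeled as a fresh random draw of decoy scores against the same fixed set of target scores.

```python
import random


def accepted_at_fdr(target_scores, decoy_scores, fdr=0.01):
    """Count targets accepted at the requested FDR.

    Uses the simple target-decoy estimate (#decoys >= t) / (#targets >= t)
    and returns the largest accepted target count for which that estimate
    stays at or below `fdr`.
    """
    labeled = [(s, True) for s in target_scores]
    labeled += [(s, False) for s in decoy_scores]
    labeled.sort(key=lambda pair: pair[0], reverse=True)

    n_targets = n_decoys = best = 0
    for _, is_target in labeled:
        if is_target:
            n_targets += 1
        else:
            n_decoys += 1
        if n_targets and n_decoys / n_targets <= fdr:
            best = n_targets
    return best


def simulate(n_correct=300, n_incorrect=700, n_shuffles=10, seed=0):
    """Hypothetical experiment: same target scores, different decoy draws."""
    rng = random.Random(seed)
    # Assumed score model: correct matches score higher on average.
    targets = [rng.gauss(3.0, 1.0) for _ in range(n_correct)]
    targets += [rng.gauss(0.0, 1.0) for _ in range(n_incorrect)]

    counts = []
    for i in range(n_shuffles):
        # A new seed stands in for "reshuffle the decoy FASTA and search again".
        decoy_rng = random.Random(1000 + i)
        decoys = [decoy_rng.gauss(0.0, 1.0) for _ in range(len(targets))]
        counts.append(accepted_at_fdr(targets, decoys))
    return counts


if __name__ == "__main__":
    counts = simulate()
    print("targets accepted at 1% FDR per decoy draw:", counts)
    print("min:", min(counts), "max:", max(counts))
```

The target scores never change between rounds, so any spread between the min and max comes only from which decoys happened to land near the cutoff. How big that spread is in real data depends on the dataset and the search engine, which is exactly what the study looks at.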
....yeah....fortunately for those of us who use...well...BASICALLY EVERY PIECE OF SOFTWARE I USE....when you make your decoy sequence, you end up using that one pretty much forever.
Let's see....when was my UniProt human decoy FASTA generated.....
Oh. The week I installed software on my new computer?
The reason this is so disturbing: if I were using a program that reshuffled my decoy FASTA on every run, I would have noticed this already, because random shuffling could give me noticeably different results each time I press the <RUN> button. Since my decoys never change, the variation is still there and I just never see it. Okay -- honestly, from a reproducibility standpoint, making one decoy database and sticking to it is a good thing. It keeps people from asking questions like "wait. are you running my results through a random number generator?!?" and I'm grateful I don't have to answer that one.
Okay -- but -- at the end of the day I want to give people the list that most closely represents what the proteins I can detect in their cells are actually doing. And if my current FDR methods are simply masking issues with the data that can be as extreme as described here -- I think upgrading the way I generate my lists and tell true from false needs to go to the top of my priority list.