Sunday, January 13, 2013

Proteois: FDR for label free quan!

Currently in press at MCP is this paper from Marianne Sandin et al. that describes Proteois, an adaptive alignment algorithm for label free quantification experiments. An interesting aspect of this algorithm is the use of a false discovery rate (FDR) calculation during the alignment stages. Another nice feature is that there are multiple readouts during the alignment and quan steps that allow you to rapidly troubleshoot problems with your analysis.


  1. Hi Ben,
    Really enjoy your blog and appreciate the time and effort you put into it. Speaking of label free quan in proteomics, what is your opinion on spectral counting versus peak intensity/MS1 profiling? I've used spectral counting for years and always found it to be robust and accurate. More recently I've used Progenesis to do label free quan, but I find that spectral counting still outperforms it (based on head-to-head comparisons using bacterial proteins spiked into a mammalian background at different amounts). It seems to me that spectral counting is a much more straightforward approach: fewer steps (and assumptions) required to get to an answer, Occam's razor, so to speak. Any thoughts on this?

  2. I guess it really just depends on the application. There certainly is nothing wrong with spectral counting, but dynamic exclusion has a tendency to affect it. Without dynamic exclusion you get an amazing dynamic range out of spectral counting, and it is much more accurate. With dynamic exclusion, if you only see each peptide once or twice, then your maximum number of peptide spectral matches (PSMs) is lower, so you lose a lot of your dynamic range. You do, however, get to read deeper into the proteome. Less quan, more IDs.
    Peak picking applications that use the MS1 precursor intensity do have more variables, but they can maintain a pretty decent dynamic range even when you use dynamic exclusion. So you get good quan and good IDs out of the same run. One more advantage that I have seen in my experience is that peak area quan can be more reliable at the lower regulation levels. In other words, I am more likely to believe a protein that has 3 peaks that are each upregulated 2-fold than I am to believe that a protein is upregulated because I saw 3 peptides in run 1 and 6 in run 2. Once the numbers get higher, I see a pretty nice correlation between the two approaches.
    There have been a couple of really good papers comparing spectral counting to different label free methods (one I remember used spectral counting and peak areas to compare samples that were actually SILAC labeled and they actually knew what the ratios should be). I can't think of the authors off the top of my head, but I'm sure you could easily find them through a Google Scholar search.
    And one caveat to all the stuff I just typed: everything I said about the peak area calculations above is true only if you have enough chromatographic and/or MS1 resolution to truly ID your MS1 peaks. If you don't, such as with short gradients on an ion trap or triple quad instrument, then I go with spectral counting every time.
    Hope this helps more than it jumbles things!
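
To put rough numbers on the "3 peptides versus 6" example and on the dynamic exclusion point above, here is a minimal sketch. It assumes spectral counts behave like Poisson counts, which is a simplification I am adding and not something from the paper or the comments, and it approximates the uncertainty of the log ratio with the usual sqrt(1/a + 1/b) rule. The function names and the cap of 2 PSMs per peptide are made up for illustration.

```python
import math


def log2_ratio_with_error(count_a, count_b):
    """Return (log2 of count_b/count_a, approximate standard error in log2 units).

    Assumption: counts are Poisson distributed, so the standard error of the
    natural-log ratio is roughly sqrt(1/count_a + 1/count_b).
    """
    ratio = math.log2(count_b / count_a)
    std_err = math.sqrt(1 / count_a + 1 / count_b) / math.log(2)
    return ratio, std_err


def cap_counts(counts, max_psms):
    """Mimic dynamic exclusion crudely by capping each count at max_psms."""
    return [min(c, max_psms) for c in counts]


# Same 2-fold change, very different confidence.
low = log2_ratio_with_error(3, 6)       # few spectra
high = log2_ratio_with_error(300, 600)  # many spectra
print(f"3 vs 6:     log2 ratio {low[0]:.2f} +/- {low[1]:.2f}")
print(f"300 vs 600: log2 ratio {high[0]:.2f} +/- {high[1]:.2f}")

# Hypothetical counts without exclusion, then capped at 2 PSMs each.
uncapped = [1, 10, 100, 1000]
capped = cap_counts(uncapped, max_psms=2)
print("dynamic range without cap:", max(uncapped) / min(uncapped))
print("dynamic range with cap:   ", max(capped) / min(capped))
```

Under these assumptions the uncertainty on the 3 versus 6 comparison is about as large as the 2-fold change itself, while at 300 versus 600 it is roughly ten times smaller. The capped list shows how a ceiling on PSMs squeezes a 1000-fold spread in counts down to 2-fold.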