Friday, October 27, 2017

PhoStar -- Identify phosphopeptides BEFORE database search!

No. This blog post is about a super smart way to find phosphopeptide MS/MS spectra before your database search, maybe even without(!!) a database search. It is described here.

What if you took your output MS/MS spectra and searched them against a comprehensive spectral library of experimentally determined phosphopeptides -- like this one?

Spectral library searches are, by definition, fast. However, you'll note that NIST got its human phospho spectral libraries from CPTAC -- so they're iTRAQ 4-plex labeled. PhoStar doesn't care. It uses a supervised machine learning approach to determine how to bin the MS/MS spectra -- do they go into the output file of MS/MS spectra that are likely phosphorylated? Or into the new file of MS/MS spectra that are definitely not phosphorylated?
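To make the "two bins" idea concrete, here is a minimal Python sketch. It is NOT how PhoStar works internally: PhoStar uses a trained model, and this sketch swaps in a much cruder hand-written rule (look for a peak at the precursor m/z minus the neutral loss of phosphoric acid) purely as a stand-in for the classifier. The function names, the dictionary layout for a spectrum, and the tolerance and intensity thresholds are all assumptions made up for illustration.

```python
# Stand-in heuristic, NOT PhoStar's trained classifier (assumption for illustration).
PHOSPHO_LOSS = 97.9769  # monoisotopic mass of H3PO4 in Da


def has_neutral_loss_peak(precursor_mz, charge, peaks, tol=0.02, min_rel_intensity=0.1):
    """Return True if a reasonably intense peak sits at precursor m/z minus H3PO4/charge.

    peaks is a list of (mz, intensity) tuples; tol is in m/z units.
    Both thresholds are hypothetical defaults, not tuned values.
    """
    if not peaks:
        return False
    target = precursor_mz - PHOSPHO_LOSS / charge
    base_peak = max(intensity for _, intensity in peaks)
    return any(
        abs(mz - target) <= tol and intensity >= min_rel_intensity * base_peak
        for mz, intensity in peaks
    )


def split_spectra(spectra):
    """Sort spectra into (likely_phospho, not_phospho) lists.

    Each spectrum is a dict with 'precursor_mz', 'charge' and 'peaks' keys
    (a hypothetical layout chosen for this sketch).
    """
    likely, unlikely = [], []
    for spectrum in spectra:
        flagged = has_neutral_loss_peak(
            spectrum["precursor_mz"], spectrum["charge"], spectrum["peaks"]
        )
        (likely if flagged else unlikely).append(spectrum)
    return likely, unlikely
```

In a real workflow you would write each list out to its own file and only send the "likely" pile through a search that allows phosphorylation. A trained model should do far better than this single rule, which is the whole point of the tool.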

How great is this? In what is possibly my all-time favorite dataset (Bekker-Jensen et al.), the authors get such deep coverage of the human proteome (in only 32 hours of instrument run time) that they find over 10,000 phosphorylation sites without enrichment! I've downloaded and reprocessed about half of this huge study, and even on my last-gen Proteome Destroyer I haven't even considered trying to search for phosphorylation. 584,000 unique peptides from one cell line!! It's tough to process with no modifications whatsoever....

When you have MILLIONS of MS/MS spectra to search through, every phosphorylation is going to make this job tougher. PhoStar gives you the ability to pull those out and worry about them later and separately. And it looks like it does a darned good job!

