Spectral libraries vs. search engines

Anyway, it's the content that matters!  And this topic is going to be cool, and important in the future.

Spectral libraries are something we're hearing more about.  That's because the libraries are getting bigger, more useful, and improving in quality.  I think it is a good time to take a look at what they are and what advantages/disadvantages they are bringing for us.

Spectral library searches have been around for a long time.  Originally, they were used for small molecule searching, but they were quickly adapted for peptide searching.  The two that are integrated into Proteome Discoverer 1.4, MSPepSearch (NIST; link coming when the government reopens...) and SpectraST, were two of the earlier algorithms to pop up.

The concept is simple: you take spectra that you (or somebody else) have sequenced and identified with high confidence in the past and you put them in a library.  Then on your next experiment, rather than go through the statistical magickery of a search engine, you simply compare each of your MS/MS spectra to those in your library.  If the new spectrum looks like the old spectrum, you have a match.
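To make "looks like" a bit more concrete, here is a minimal sketch of one common way to compare two spectra: bin the peaks by m/z and take the cosine (normalized dot product) of the two intensity vectors. This is only an illustration, not a claim about how SpectraST or MSPepSearch actually score things. The function names, the 1.0 m/z bin width, and the 0.7 cutoff are all assumptions I made up for the example, and a real tool would also restrict candidates by precursor mass first.

```python
import math
from collections import defaultdict


def bin_spectrum(peaks, bin_width=1.0):
    """Sum peak intensities into fixed-width m/z bins.

    peaks is a list of (mz, intensity) tuples; bin_width is an assumed value.
    """
    binned = defaultdict(float)
    for mz, intensity in peaks:
        binned[int(mz / bin_width)] += intensity
    return binned


def cosine_score(query_peaks, library_peaks, bin_width=1.0):
    """Cosine similarity between two binned spectra (0 = nothing shared, 1 = identical)."""
    q = bin_spectrum(query_peaks, bin_width)
    lib = bin_spectrum(library_peaks, bin_width)
    dot = sum(value * lib.get(key, 0.0) for key, value in q.items())
    q_norm = math.sqrt(sum(v * v for v in q.values()))
    lib_norm = math.sqrt(sum(v * v for v in lib.values()))
    if q_norm == 0.0 or lib_norm == 0.0:
        return 0.0
    return dot / (q_norm * lib_norm)


def best_library_match(query_peaks, library, threshold=0.7):
    """Return (peptide, score) for the best-scoring library entry, or None.

    library maps a peptide label to its stored peak list; threshold is a
    hypothetical cutoff, not a recommended value.
    """
    best = None
    for peptide, library_peaks in library.items():
        score = cosine_score(query_peaks, library_peaks)
        if score >= threshold and (best is None or score > best[1]):
            best = (peptide, score)
    return best
```

Calling `best_library_match(my_peaks, my_library)` on a toy dictionary of peak lists gives back the best-matching label and its score, or `None` if nothing in the library clears the cutoff.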

Faster.  Way faster.  In the original paper for SpectraST (by the way, I just found out today that this rhymes with "contrast"), on the same PC, the query time per spectrum for SpectraST was 0.005 seconds, while Sequest took 6.4 seconds.  That is almost 1300 times faster.  Partly this comes from the fact that you are comparing one spectrum that actually occurred to another.  In a Sequest search, the engine has to generate the theoretical fragment ions for every candidate peptide in the database and compare your spectrum against each of those.

More sensitive.  By comparing two real spectra, you can get away with fewer fragments, and lower-intensity ones, than you can with a traditional search engine, mostly for the reasons mentioned above.

You've got to have a library.  And an okay library only cuts it if you want okay results.  If you want good results, you need a good library, and excellent results...  If your spectrum has a PTM, but that PTM has never been recorded in a library, that result is gone.

And this is why I haven't really used these engines.  The libraries just haven't been good enough.  But this is the good news:  They are getting better.  Much better.  High resolution MS/MS libraries are around the corner and new tools are coming.

But this is the best news:

With the completion of all these new libraries, we won't have to worry about whether spectral library or traditional searching is better, because we can use them both.  We can use the spectral library engine to filter rapidly through matches, then we can take the spectra that don't match and send those (and only those) through Sequest, Mascot or another engine.  Then, if you really want to, you can take the spectra that still don't match, export those, and search them with Byonic or Peaks or PepNovo+ and really get down into your data.
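The logic of that cascade is simple enough to sketch in a few lines. To be clear, this is not how Proteome Discoverer wires its nodes together; it is just a toy illustration of "search, keep the hits, pass along only the leftovers." The `tiered_search` function and the idea of each engine being a plain function that returns an identification or `None` are assumptions for the sake of the example.

```python
def tiered_search(spectra, engines):
    """Run spectra through an ordered list of search stages.

    spectra: dict mapping a spectrum id to its peak list.
    engines: ordered list of (name, search_fn) pairs, where search_fn is a
             hypothetical callable that takes a peak list and returns an
             identification, or None when it finds no confident match.

    Returns (results, remaining): results maps spectrum id to
    (engine name, identification); remaining holds the spectra that
    no stage could identify.
    """
    results = {}
    remaining = dict(spectra)
    for name, search_fn in engines:
        unmatched = {}
        for spectrum_id, peaks in remaining.items():
            hit = search_fn(peaks)
            if hit is None:
                # No match at this stage: hand it to the next, slower engine.
                unmatched[spectrum_id] = peaks
            else:
                results[spectrum_id] = (name, hit)
        remaining = unmatched
    return results, remaining
```

The order of the list is the whole point: put the fast spectral library stage first, the database search engine second, and the slow de novo style tools last, so each expensive stage only ever sees the spectra the cheaper ones could not explain.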

For a video on how to set up this last part, follow this link (watch in HD only!)

