Tuesday, June 14, 2016

Time to take the free version of PD for a stroll!

What did you do during timeouts in Finals game 5?

I'm running a bunch of different samples through the new free version of PD!

How is it?

If I said it is easily the best piece of proteomics software you can get for $0, you'd probably suspect that I'm biased. So I won't say it. Not trying to insult anyone out there, but this is pretty sweet.

Seriously, though. The framework of Proteome Discoverer is really, really good. You get the ability to organize your experiments -- separate processing and consensus workflows -- persistent workflows -- a friendly node-driven interface -- a whole ton of the awesome things that the full PD version contains -- but you don't have to pay for it!! Seriously, I'm pretty sure the manufacturer dumped a lot of time, energy -- and money -- into developing this interface, and IMP-PD piggybacks on all of that.

Throw in the ability to instantly pull out your XICs for the PSM you're looking at? A really nice label free quan node -- and differential statistics? This is a SERIOUSLY nice piece of software.

The next real question is this -- is the free version of PD better than the pay-for version? Well...I'm totally not going that far. I need SILAC quan (pay-for version only); I like the PD 2.1 TMT quan algorithm better than the IMP one (honestly, possibly cause I'm misusing it -- waiting eagerly for the publication!) and the full version filters better. One more difference -- speed of search.

Same data file; 65,000 MS/MS spectra -- normal-ish settings

Sequest first (I highlighted the wrong line. I meant to highlight the 3.5 minutes for the Sequest search. Did I mention I am watching the NBA finals while writing this?)

Then MSAmanda (the only search engine in the free IMP-PD.)

Alright, here is a limitation. The current version of MSAmanda is the fastest one yet, but it's still not as fast as Sequest. (P.S. The numbers here look different just cause of how the formatting lines up in the Administration window.)

65,000 spectra is pretty small by today's standards. Put this up against the much more normal set of 5e5 to 1e6 total MS/MS spectra? This is gonna be a disadvantage. Throw in some more PTMs? Yikes. Consider that my processing computer is probably a good bit faster than yours? Double yikes.

Wanna see something interesting, though?

Sequest total search time runs 18 min with this set. Mostly cause Percolator takes 9 minutes or so.

Elutator is the IMP alternative to Percolator, and it is also a little slower than Percolator (publication in the works on this one and I'm excited for it -- this little algorithm is superb!) But when you look at all the little steps in the processing pathway, the gap isn't all that big. Percolator + Sequest is 18 minutes in this format. MSAmanda + Elutator = 35 min.

At some point I realized I was running off of my HDD storage drive, rather than off my C:/ which is an SSD. But I was too distracted by the game to rerun this. Times might be faster for both of these datasets.
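
If you want to play with those numbers yourself, here's a quick back-of-the-envelope sketch in Python. It only uses the two totals from this post (18 min and 35 min on the 65,000 spectra file), which works out to the free pipeline taking roughly 1.9x as long. The projection to bigger files ASSUMES run time scales linearly with the number of spectra, which I did not measure, and the function names are just made up for this example.

```python
# Totals reported in this post (minutes), same file, ~65,000 MS/MS spectra.
SPECTRA = 65_000
SEQUEST_PERCOLATOR_MIN = 18.0
MSAMANDA_ELUTATOR_MIN = 35.0


def slowdown(free_min: float, paid_min: float) -> float:
    """How many times longer the free pipeline took than the pay-for one."""
    return free_min / paid_min


def linear_estimate(minutes: float, spectra: int, target_spectra: int) -> float:
    """Naive projection. ASSUMPTION: time scales linearly with spectra count."""
    return minutes * target_spectra / spectra


if __name__ == "__main__":
    ratio = slowdown(MSAMANDA_ELUTATOR_MIN, SEQUEST_PERCOLATOR_MIN)
    print(f"MSAmanda + Elutator vs Sequest + Percolator: {ratio:.2f}x")
    # 5e5 and 1e6 spectra are the "much more normal" file sizes mentioned above.
    for target in (500_000, 1_000_000):
        paid = linear_estimate(SEQUEST_PERCOLATOR_MIN, SPECTRA, target)
        free = linear_estimate(MSAMANDA_ELUTATOR_MIN, SPECTRA, target)
        print(f"{target:>9,} spectra: ~{paid / 60:.1f} h vs ~{free / 60:.1f} h")
```

Treat anything this spits out as a rough guess. More PTMs, a slower PC, or running off an HDD like I did will all move the real numbers.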

Okay, so I'm gonna take the high road (this has been popular in the media this week). The IMP nodes compared to the pay for nodes?

WOOOOOOOOHHHHOOOOOO!!!!!  Using both gets me a bunch more IDs?  Sure. So I go back into the MSAmanda+Elutator matches, look at them using my normal cutoff procedure -- and they're good -- like, they're real good. My worst scoring peptides are great.

The best solution for me is:
Sequest + Percolator
MSAmanda + Elutator

Cause that's a bunch of peptides out of an Orbitrap Elite 2 hour gradient!

See why I'm excited to know about Elutator?!?!
