Thursday, June 8, 2017

Taking the new PeakJuggler for a Spin!



At ASMS we got an updated version of the PeakJuggler nodes for PD 2.x courtesy of IMP and PD-nodes.org. I don't have a ton of data on them because I was just running them on my tablet at night, but here is some stuff I've learned.

First off - PeakJuggler uses R (in the background, don't worry!) but you have to make sure your version is current enough. I had a couple crashes because my tablet was carrying a version that was too old.

If you never use R for anything other than IMP PD nodes, you can just go to CRAN and download the newest version (3.4.0 as I'm writing this). This will put a second (or third) version of R on your PC; you can remove the older ones with Uninstall Programs in Windows.

If you have used R but aren't an expert (like me), don't want multiple installations on your PC, and want to migrate your files over, you can run this script:

# installing/loading the package:
if(!require(installr)) { install.packages("installr"); require(installr)} # load / install+load installr

# using the package:
updateR() # this will start the updating process of your R installation. It will check for newer versions, and if one is available, will guide you through the decisions you'd need to make.
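Before updating, it can help to check which R version is actually installed. Here is a minimal sketch using base R only. The 3.4.0 cutoff is my assumption (it is just the newest release at the time of writing), not a documented PeakJuggler requirement, and `min_version` and `r_is_current` are names made up for this example.

```r
# Assumed cutoff: 3.4.0 is simply the newest release mentioned in this post,
# not a minimum version documented for the PeakJuggler nodes.
min_version <- "3.4.0"

# getRversion() returns the running R version as a comparable version object.
current <- getRversion()
cat("Installed:", R.version.string, "\n")

# Version objects can be compared directly against a version string.
r_is_current <- current >= min_version
if (r_is_current) {
  cat("R is at least", min_version, "\n")
} else {
  cat("R is older than", min_version, "- consider updating.\n")
}
```

If you have more than one R installed, run this in each one, since it only reports on the copy of R that is executing it.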

If you are an expert, I bet you know a smarter way of doing this -- or already use the current version!

TESTING PEAKJUGGLER.

The great people at IMP already benchmarked the new PeakJuggler with the Ramus et al. quan test dataset. It is high/low data of yeast with UPS1 proteins spiked in. Since they did that one, I've been working on the Shalit et al. dataset instead.

This is human digest with E. coli digest spiked in at different levels.

As a first test, I ran just one 3ng E. coli spike in and one 10ng spike in. (These files are relatively large; about 1.5GB each.) With PeakJuggler, MSAmanda, and Percolator, I'm hitting 45 minutes per processing run and around 9 minutes per consensus for each individual file on my trusty old 8 core PC. This is a marked improvement over the previous version! After combining the results with a MultiConsensus report, I get some great output!


Here I can see that the 10ng E. coli spike in is always higher than the 3ng spike in. Sorting through, it is pretty clear the human proteins hover in the 1:1 range. You'll note at the bottom I've got one that wasn't ID'ed in the 3ng. It only has an area of 7e5 in the 10ng sample, and it appears that might be close to the threshold.
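To make the comparison above concrete: with a 10ng versus 3ng spike in, ideal quan would put the E. coli proteins at a ratio of 10/3 (about 3.3) and the human background at 1. The sketch below shows that arithmetic in R. The protein names and peak areas are made up purely for illustration and are not values from my run, and the column names are hypothetical rather than what the PD export actually calls them.

```r
# Hypothetical peak areas, invented for illustration only (NOT real results).
areas <- data.frame(
  protein   = c("ecoli_A", "ecoli_B", "human_A", "human_B"),
  organism  = c("E. coli", "E. coli", "Human", "Human"),
  area_3ng  = c(2.0e6, 9.0e5, 5.0e7, 1.2e7),
  area_10ng = c(6.5e6, 3.1e6, 5.2e7, 1.1e7),
  stringsAsFactors = FALSE
)

# Ratio of the 10ng run to the 3ng run, per protein.
areas$ratio <- areas$area_10ng / areas$area_3ng

# Ratios expected if quantification were perfect.
expected <- c("E. coli" = 10 / 3, "Human" = 1)
areas$expected <- unname(expected[areas$organism])

# Median observed ratio per organism, to compare against the expectation.
observed <- tapply(areas$ratio, areas$organism, median)

print(areas)
print(round(observed, 2))
```

A protein that is missing in one run (like the one at the bottom of my list) would give a ratio of NA or Inf here, so those are worth filtering out or flagging before taking medians.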

I don't think the webpage mentions improvements in the plotting, but at first glance the XICs of the peaks look really nice! In the PD 2.1 version the output appears to be the same as the PIAD, meaning that it can't calculate LFQ ratios.

Summing it up -- once you update your R, this free node is a really nice solution for label free quan in Proteome Discoverer!
