Saturday, February 25, 2017
Saturday morning and it's raining? Time to process some UVPD data!
Okay...so this paper has made the blog a couple of times. This time, however, I'm up early on a Saturday cause I've got the RAW data files from it!
Now you can have them too! They're at ProteomeXchange under PXD003904.
I've been wanting to run this for a few reasons:
1) To see how hard it is to process
2) To figure out the best engine and settings for it
3) ....cause I thought I was going kayaking this morning before I went to lab...and...it's...raining....
(So it's another Science Saturday!)
Why #1? Check out that image! Normally we're just worrying about b/y ions. For those of you who have bribed friends with advanced engineering degrees to help you set this up in your instruments, what do you get for your troubles? Now you get to worry about b ions, y ions, a ions, x ions, c ions AND z ions in each MS/MS spectrum!
(Just a reminder...thanks Wikipedia!) This looks like MASSIVELY more search space. Just what y'all needed, right?
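To put a rough number on that search space, here is a minimal Python sketch (mine, not anything SequestHT actually runs) that lists the singly charged a/b/c/x/y/z fragments for an unmodified peptide. The function name `fragment_mz` and the test peptide are made up for illustration, and I'm assuming the plain textbook definitions (a = b - CO, c = b + NH3, x = y + CO - H2, z = y - NH3). Real UVPD spectra also contain variants like a+1 and x+1 that this ignores.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON, WATER = 1.007276, 18.010565
CO, NH3, H2 = 27.994915, 17.026549, 2.015650


def fragment_mz(peptide):
    """Singly charged a/b/c/x/y/z m/z values for an unmodified peptide.

    Assumes the simple textbook ion definitions; index k-1 in each list
    holds the fragment containing k residues (e.g. frags["b"][1] is b2).
    """
    masses = [RESIDUE[aa] for aa in peptide]
    series = {name: [] for name in "abcxyz"}
    for k in range(1, len(masses)):              # k residues in the fragment
        b = sum(masses[:k]) + PROTON             # N-terminal fragment
        y = sum(masses[-k:]) + WATER + PROTON    # C-terminal fragment
        series["b"].append(b)
        series["a"].append(b - CO)
        series["c"].append(b + NH3)
        series["y"].append(y)
        series["x"].append(y + CO - H2)
        series["z"].append(y - NH3)
    return series


if __name__ == "__main__":
    frags = fragment_mz("PEPTIDE")               # hypothetical example peptide
    by_only = len(frags["b"]) + len(frags["y"])
    all_six = sum(len(v) for v in frags.values())
    print("b/y only:", by_only, "theoretical fragments")
    print("a/b/c/x/y/z:", all_six, "theoretical fragments")
```

Three times the theoretical fragments per peptide, and that's before you let the fragments carry +2, +3 or +4 charges, which multiplies it again.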
Okay -- so I'm going to start with just one file from this dataset. And only care about the positive UVPD -- and I chose this study cause the peptide is not chemically modified AND the fragments are read out in high resolution (I have some ion trap files now....and I'm finding them challenging...more later, maybe!)
First impression? HOLY COW, this data is beautiful!!! Second impression....is my computer going to wake the dogs up?!?!? ALL the fans are running.
This is one file.... Yes....the search space has blown up!
DISCLAIMER: This may not be the smartest way to run this. I'm half-awake, somewhat annoyed, and not a professional scientist. This is how I set it up, and I'm super impressed with the data.
I used a normal UniProt database (from 2011, LOL!) and a tight mass tolerance cutoff, and then allowed SequestHT to have equal weighting on all the fragment ions. I'm doing several iterations with the different engines as well, but this is a nice start.
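If "tight mass tolerance" is fuzzy, this is roughly what a ppm window means when matching a high resolution fragment against a theoretical one. Just a sketch: the helper name `within_ppm` and the 10 ppm figure are my placeholders for illustration, not the settings from this search.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_ppm(observed_mz, theoretical_mz, tol_ppm=10.0):
    """True if the observed peak falls inside the +/- tol_ppm window.

    tol_ppm=10.0 is an illustrative default, not the value used in the post.
    """
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


if __name__ == "__main__":
    # At m/z 500 a 10 ppm window is only +/- 0.005 wide.
    print(within_ppm(500.0049, 500.0))   # inside the window
    print(within_ppm(500.0060, 500.0))   # outside the window
```

A window that narrow is only usable because these fragments were read out in high resolution, and it's a big part of why six ion series don't just turn into random matches.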
If I use exactly these settings and just swap between Percolator and Target Decoy:
Sequest + Target Decoy = 1,700 phosphopeptides
Sequest + Percolator = 4,459 phosphopeptides!
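For anyone who hasn't looked under the hood, a plain target-decoy filter is basically decoy counting above a score cutoff. Here is a minimal sketch with made-up scores. The function name `count_targets_at_fdr` is hypothetical and this is not PD's actual implementation. Percolator, as I understand it, instead re-scores the PSMs with a semi-supervised model over many features before doing this kind of counting, which is one reason it can rescue more IDs.

```python
def count_targets_at_fdr(psms, max_fdr=0.01):
    """Count target PSMs accepted at a simple decoys/targets FDR cutoff.

    psms: iterable of (score, is_decoy) pairs, higher score = better.
    Walks down the ranked list and remembers the largest number of
    targets seen while the running decoys/targets ratio is <= max_fdr.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = decoys = accepted = 0
    for _score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets and decoys / targets <= max_fdr:
            accepted = targets
    return accepted


if __name__ == "__main__":
    # Toy data: 100 good targets, one decoy, 10 more targets, two decoys.
    toy = (
        [(200 - i, False) for i in range(100)]
        + [(100, True)]
        + [(99 - i, False) for i in range(10)]
        + [(50, True), (49, True)]
    )
    print(count_targets_at_fdr(toy, max_fdr=0.01))
```

With a single search score, the cutoff is stuck wherever the decoys start showing up. Re-scoring can push good hits above that line, which fits the jump from 1,700 to 4,459 here.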
Are the extra Percolated peptides real? As far as I can tell? Yeah...they're real. Umm...check this out!
It is a great big phosphopeptide (2 missed cleavages?) and looks better than any ETD phosphopeptide I think I've ever done. I dropped +3 and +4 fragments from this so I could visualize the chart here....IMHO, this data is just stunning. I chose this at random. Most of them look this good!
My worst scoring peptide from Percolator -- I might not put it directly in a paper itself, but I wouldn't be embarrassed to load it into a Supplemental figures file. (Leaving it out for the sake of -- I do have to do a lot of work today...)
How do the other engines I have do with this data? Bad news.
Neither my version of MS Amanda nor my version of Byonic can currently accept the full a/b/c/z/x/y fragment spread....
...but I'm quite certain both of these awesome teams have it on their radar if not ready to rock already!
EDIT: I've been told by a reader (not a vendor!) that I ought to update my Byonic. His version has it. Sweet!
I haven't configured a Mascot server in a few years, but unless they've changed something significantly, I'm pretty sure you can go in, configure a new instrument, and set the weights of the ion series the same way as this. Then you just use PD to access that new instrument type.
TL;DR? We can natively process UVPD data from the literature with SequestHT in PD. It will push our processing CPU pretty hard, but it is doable. And there is a reason the field is excited about the possibilities of UVPD Orbitrap fragments!