Friday, November 18, 2016

Why didn't I get any quan values in Proteome Discoverer for this thing?!?!?

Wait. Do you really have time for this before work?


(LOL! Any picture of this guy makes me happy, but this is a favorite)

When you're running PD you often run your nodes down two distinct pipelines -- one that is your peptide ID pipeline and one that is your quantification pipeline.


I'm no expert in what is going on behind the scenes here in the magic binary land, but I find it really useful from a logical standpoint to keep this in mind. Our end report is going to bring it back together, but these nodes are functioning, for the most part, as distinct and separate executables. The results are all brought back together into the SQLite (is it still this in PD 2.2? I think so, but I'm not 100% sure) table that is our .MSF and .PDRESULT file.
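If the result file really is SQLite (again, I think so, but treat that as an assumption), you can peek inside it with nothing but Python's built-in sqlite3 module. This is a minimal sketch that only lists whatever tables are in the file. It doesn't assume any PD table names, and `list_tables` is just a helper name I made up.

```python
import os
import sqlite3
from contextlib import closing

def list_tables(path):
    """Return the names of all tables in a SQLite file, sorted by name."""
    # sqlite3.connect() would silently create a missing file, so check first.
    if not os.path.isfile(path):
        raise FileNotFoundError(path)
    with closing(sqlite3.connect(path)) as conn:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    return [name for (name,) in rows]
```

Call it with the path to your own file, something like `list_tables("my_run.pdResult")` (a hypothetical file name). I'd point it at a copy of the file, not the original.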

As such -- it is completely possible for our friends Sequest and Percolator to find something that the other side of the aisle did not. Honestly -- dig deep -- it happens quite a bit.

Check this out --


This is an iTRAQ run from a friend who studies possibly one of the hardest things (that isn't a plant) that you can do proteomics on. (14,000 MS/MS events -- 90 PSMs in this file, seriously.) But check this out -- in the processed data I can find thousands of spectra with iTRAQ quan -- but no IDs.

This can only occur when we've seriously separated out the 2 processing pathways.
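One way to picture the two pathways: each one hands back its own list of spectra, and the report is just the overlap plus the leftovers on each side. Here's a toy sketch with made-up scan numbers. This is not how PD does it internally, and `split_by_pipeline` is a name I invented for illustration.

```python
def split_by_pipeline(identified_scans, quantified_scans):
    """Sort scan numbers into the three buckets you see in a report."""
    identified = set(identified_scans)
    quantified = set(quantified_scans)
    return {
        "id_and_quan": identified & quantified,      # the happy case
        "id_without_quan": identified - quantified,  # "Where's the quan?"
        "quan_without_id": quantified - identified,  # reporter ions, no PSM
    }

# Made-up scan numbers, purely for illustration.
result = split_by_pipeline([101, 102, 105], [102, 103, 104, 105])
for label, scans in result.items():
    print(label, sorted(scans))
```

Both of the "leftover" buckets can be non-empty at the same time, which is what you'd expect if the ID side and the quan side never talk to each other until the end.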

This isn't the most common question I get about PD, though!  The question that comes in is -- wait, I ID'ed this! Where's the quan?

First off -- this is gonna be significantly less common in reporter ion quan. If you've got a good fragmentation spectrum, chances are you're going to have reporter ions down there. Even if you spike in a good heavy labeled standard -- like the PRTCs -- you'll probably see reporter ions. (Thought I had proof of this, but I can't find it right now). This is isolation interference. We're never fragmenting a 100% pure population of just our ion of interest. Other stuff sneaks in. But an ID with no reporter ion quan does happen.

If you see something like this, you'll want to look at the Peptide Group level for the "Quan Info" tab. This will give you a vague statement regarding why you didn't get quantification.



It is significantly more common to find ID with no quan in the Event Detector MS1 quantification (SILAC and PIAD). Example...

Check out this SILAC dataset and the stuff we find waaaaay down in the noise. We get some info on why there isn't quan when we look at the Peptide Group Level. "Not used" and "Excluded by Method"

To figure this out, you need to check out this troubleshooting chart from the manual.


This is what PD considers behind the scenes. In PD 2.x we've got control over some of these parameters (in the MSF and Quantification nodes). It might take some detective work to determine what you are looking for. But the Quan Info columns can help you chase it down.

It is a little more manual in the PIAD workflow. Example...


We've got a protein ID with 55% coverage and no quan? What? As the highlighter and misspelled word indicate, you see that this protein only has one Unique peptide. What we need to do is find out why that peptide didn't get quan.
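If the "Unique" part is fuzzy, here's a stripped-down sketch of the idea: a peptide counts as unique when it maps to exactly one protein. The peptide and protein names are invented, and PD's real logic also involves protein grouping, so take this as an illustration of the concept only.

```python
from collections import defaultdict

def unique_peptides(peptide_to_proteins):
    """Map each protein to the peptides that match it and no other protein."""
    unique = defaultdict(list)
    for peptide, proteins in peptide_to_proteins.items():
        distinct = set(proteins)
        if len(distinct) == 1:
            (protein,) = distinct
            unique[protein].append(peptide)
    return dict(unique)

# Invented example: ProtX has three peptides, but two are shared with ProtY.
example = {
    "PEPTIDEA": ["ProtX"],
    "PEPTIDEB": ["ProtX", "ProtY"],
    "PEPTIDEC": ["ProtX", "ProtY"],
    "PEPTIDED": ["ProtY"],
}
print(unique_peptides(example))
```

In this toy case ProtX has plenty of coverage but only one peptide that is its alone, so if that one peptide fails quan, the whole protein comes up empty.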

If we check that protein and then expand the Associated Tables...


We can find that 1 PSM that is unique just to that protein... If we go one layer down... we find an absolute kick in the pants. Remember when you built that method and you said "No Quantification"? (Cause in PD 2.x the PIAD isn't considered a "real" quan method.)

PD 2.2 has "real" label free quan, as do the PeakJuggler and OpenMS community nodes. But PIAD doesn't get some of the troubleshooting benefits that SILAC does.

But we can figure out why this thing didn't get quan.


If we highlight this peptide and then show "the spectra used for quantification" and "show the XIC" we might get to the answer. Check out the XIC at the bottom.  Even with a 6ppm mass tolerance cutoff, this is an ugly peak. If we look at the precursor, we're seeing an awful lot of interference here. (It says 64% isolation interference...which...honestly, is a measurement of something else entirely, but is useful for illustration purposes here.)

The Event detector is seriously strict. Remember, the maximum cutoff you can put in is 4ppm.
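For a feel of how tight a few ppm really is, here's the standard ppm arithmetic as a small sketch. The function names and the m/z values are mine, picked for illustration. Nothing here comes from PD itself.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6

def within_tolerance(observed_mz, theoretical_mz, tolerance_ppm):
    """True if the observed m/z falls inside +/- tolerance_ppm."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tolerance_ppm

def window_half_width(mz, tolerance_ppm):
    """Half-width of the m/z window, in m/z units, for a given ppm tolerance."""
    return mz * tolerance_ppm / 1e6

# Illustrative numbers only: a peak about 0.003 m/z away from 500.
print(window_half_width(500.0, 4.0))
print(within_tolerance(500.003, 500.0, 4.0))
```

At m/z 500, a 4 ppm window is only about 0.002 m/z on either side, so a peak that drifts by 0.003 is already outside it.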

Check out another peptide (the next one down the list and what it does look like)


Again... isolation interference shouldn't be my metric, but it's only 22% for this one and it shows in the peak. The PIAD has no problem working with this one.

I guess the moral of the story is -- PD 2.1 quan has a logical pipeline and you can almost always figure out why you got an ID and no quan. Honestly, it is probably harder to figure out why you got quan but no ID.




3 comments:

  1. Hi Ben,

    thanks a lot for your thoughts about the PD and possible workarounds.
    I'm struggling to get proper quan values when using the Precursor Ion Quantifier node in the new PD 2.2 - sure, I have seen your videos and the particular video for this node...

    We have isotopically labeled peptides (following the TAILS workflow from the Overall group, which you mentioned on this blog quite a while ago) measured on an Orbitrap Velos, CID fragmented.
    Sure, we get a lot of quan values if we run the PD workflows according to your recommendations, but unfortunately they aren't what we expected.
    Why are we concerned? We performed a swap experiment as an internal control for every sample:
    First sample is: "Cells from Day 0 = light labeled" mixed with "Cells from Day 3 = heavy labeled" and the swapped sample is vice versa...
    If we use the TPP and XPRESS we do get proper quan values for both "replicates", but with fewer peptide identifications and in a user-unfriendly web interface.

    Would you suggest performing one (Mascot/Sequest) search with the light Mod as static (28 Da for the first "replicate") and the heavy Mod as dynamic (34 Da for the first "replicate"), and vice versa for the second "replicate"? Or setting both Mods as dynamic? The first setting would be the usual approach for Xtandem through the TPP...

    Could it be more beneficial to use one study for the two replicates (with two processing workflows) and one shared consensus workflow, given the swap character of the experiment?

    Thanks for any ideas on this problem.

    Cheers,
    Max

    1. Max,
      Interesting problem! My first guess is that PD is getting confused in the ID, not the quan. It's linking the wrong things together. Honestly, I'd probably start with making the mods static -- not normalizing anything and then just see what it looks like. Shoot me a direct email if it still looks crazy. I can't guarantee I can help, but I'd try to take a look at it.

  2. Hi Ben,

    thanks for answering. I was already trying different paths and it looks way better than a couple of days ago.
    I will hit you up tomorrow by mail - in good ol' Europe it is already bedtime :)
