Tuesday, June 5, 2012

What mass ranges yield the most (and best) fragment ions?

Here is the experimental setup.  Normal mouse serum was depleted of its 4 most abundant proteins: albumin, transferrin, IgG, and a fourth one (I forget which).
The remaining proteins were digested with trypsin overnight, using a simple FASP-like method in an Amicon filter with a 10 kDa MW cutoff.
The peptides were desalted by ZipTip and 1 µg of peptides was loaded on a 30 cm MAGIC column from Proteomics Plus, running at 250 nL/min.  A standard top 10 method (CID) was used on an Orbitrap Velos with dynamic exclusion after 2 occurrences.  Only ions with a charge >1 were selected, over an m/z range of 350-2000.  The data was processed in Proteome Discoverer 1.3 using Mascot and Sequest with Percolator rescoring.
1348 unique fragmentation events resulted in 956 high-confidence peptide IDs (a 71% identification rate).
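
For anyone who wants to check the identification rate on their own run, here is a minimal sketch of the arithmetic. The function name `identification_rate` is just something I made up for illustration; the two counts are the ones from this run.

```python
def identification_rate(identified, fragmented):
    """Percentage of fragmentation events that produced a confident ID."""
    if fragmented <= 0:
        raise ValueError("fragmented must be positive")
    return 100.0 * identified / fragmented

# Counts from this run: 956 confident peptide IDs out of 1348 fragmentation events.
rate = identification_rate(956, 1348)
print(f"{rate:.1f}%")  # prints 70.9%
```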

I have been curious for a long time about the distribution of useful ions.  The real question is, I guess, am I wasting time?  Are there any peptides in the low and high mass ranges, and if not, why am I scanning all the way to 2,000 on every MS/MS?
The average m/z of the positively ID'ed peptides was 890.45, with a median of 892.98.
The average m/z of the fragmented ions was 856.61, with a median of 836.96.
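
If you want the same two summary numbers from your own export, something like the sketch below should do it. It assumes you have already pulled the m/z values into a plain Python list; the helper `summarize_mz` and the five example values are hypothetical and are not my actual data.

```python
from statistics import mean, median

def summarize_mz(mz_values):
    """Return (mean, median) of a list of m/z values, rounded to 2 decimals."""
    return round(mean(mz_values), 2), round(median(mz_values), 2)

# Hypothetical m/z values, made up for illustration only.
example_mz = [450.25, 612.80, 890.45, 905.10, 1310.60]
avg, med = summarize_mz(example_mz)
print(f"mean m/z = {avg}, median m/z = {med}")
```

A mean and median that sit close together, as they do for the ID'ed peptides here, suggest a roughly symmetric distribution.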

What does the actual distribution look like?
Peptides first:

Wow.  Keep in mind that I started at an m/z of 350, but it's pretty obvious that we don't get a lot of ID'ed peptides from the lower m/z range.  Nor do we see much in the high mass range, with only 1 peptide ID'ed at an m/z >1600.
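
A distribution like this is just a count of m/z values per bin. Here is a rough sketch of how that binning could be done. The 350-2000 range matches my scan range, but the 100 m/z bin width and the function name `bin_mz` are assumptions for illustration, and values exactly at the upper limit are left out.

```python
from collections import Counter

def bin_mz(mz_values, lo=350.0, hi=2000.0, width=100.0):
    """Count m/z values per bin; each bin is labeled by its lower edge."""
    counts = Counter()
    for mz in mz_values:
        if lo <= mz < hi:  # ignore anything outside the scan range
            edge = lo + width * int((mz - lo) // width)
            counts[edge] += 1
    return dict(sorted(counts.items()))

# Hypothetical m/z values, made up for illustration only.
example = [350.0, 449.9, 450.0, 1999.9, 2000.0, 300.0]
print(bin_mz(example))
```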

Does the distribution of fragmented ions look the same?
No.  Up to about 1,400 m/z the counts look about the same from one mass range to the next.  It seems like we are picking ions to fragment more or less at random.

How do the two overlap?  If the totals from each chart are adjusted to 100%, the distribution looks like this:
The red bars are the adjusted number of fragmented ions within each mass range and the blue bars are the adjusted number of confident peptide IDs.  What is striking, I think, is the number of low mass ions we fragmented that were not ID'ed as peptides.  It is possible that these sequences were too short to reach our stringent peptide ID cutoffs.  The adjustment causes a funny occurrence around the peptide/fragment median, where we appear to identify more peptides than we have fragments.  This is merely an artifact of the adjustment: each series is scaled to its own total, so the mass ranges near the peptide ID m/z median, where the ID rate is above the overall rate, take a larger share of the IDs than of the fragments.
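
To show how the adjustment can make it look like there are more IDs than fragments, here is a small sketch with made-up numbers. The three bins and their counts are hypothetical; only the scaling-to-100% step mirrors what I did for the chart.

```python
def to_percent(counts):
    """Scale a {bin: count} mapping so the values sum to 100."""
    total = sum(counts.values())
    return {b: 100.0 * n / total for b, n in counts.items()}

# Hypothetical counts per m/z bin, made up for illustration only.
fragmented = {"low": 40, "mid": 40, "high": 20}
identified = {"low": 10, "mid": 35, "high": 5}

frag_pct = to_percent(fragmented)
id_pct = to_percent(identified)
for b in fragmented:
    print(b, round(frag_pct[b], 1), round(id_pct[b], 1))
```

In the "mid" bin the raw ID count (35) is still below the fragment count (40), but after scaling the IDs take 70% of their total while the fragments take only 40% of theirs, so the ID bar ends up taller.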

These results beg further analysis, and the first questions I have when looking at this are: 1) is this reproducible, or simply a single occurrence in mouse serum and 2) is this a consequence of using the FT for MS and IT for MS/MS?  I'm going to look at 2 experiments where we performed FT-IT and FT-FT (HCD) analysis of the same samples.

The real takeaway here is that, in an FT-IT experiment, we may be wasting precious scanning time on the >1600 m/z range, which is reaping no benefits for our research.


  1. Hi Ben

    Did you ever add up the total number of non-duplicated proteins from plasma in the off-gel work?


  2. Hi Ben,

    Which paper do these data come from? I'd like to read the full text if possible.


  3. No publication, sorry, this was just me messing around with some of my RAW data on an Orbi Velos.