Wednesday, August 29, 2012

Dynamic exclusion mass windows -- and why they matter.

I really thought that I had moved this entry over from the old blog last year.  This is one of my favorite articles, particularly because so many people do DE wrong.  Well, it has been successfully located and migrated over.  I couldn't find the date, but since it appears I was running 80 minute gradients on gel slice fractions, it is probably from last summer or fall.

This entry is about dynamic exclusion, how to do it right and how to do it wrong.
We'll start at the beginning:
In a typical LC-MS experiment, you simplify your complex sample over a chromatography gradient.  This allows your hundreds, thousands, or tens of thousands of ions to elute a few (or a few hundred) at a time, dramatically simplifying the job of the mass spectrometer.
This is a fairly nice example.  This is the total ion chromatogram (TIC), the total number of ions that are making it into the mass spectrometer.  The x-axis is time and the y-axis is intensity.
Each peak in this chromatogram represents a whole ton of ions that eluted off the chromatography column together.  Because of this, each MS1 spectrum is still pretty complex (often hundreds of ions).

Interesting ions from your MS1 spectra are chosen according to criteria you set (charge state, m/z range, intensity) and are fragmented.  Normally the most intense ions are selected for fragmentation.

This is the problem.  A specific ion doesn't just elute for the time it takes to do 1 MS1 and 1 MS2 event (often milliseconds!).  An ion will elute for the width of its specific peak.  For example, on the chromatogram above, from my 80 minute gel slice gradient, my peaks are typically 30-45 seconds wide.  This means that one ion is present in my MS1 spectra for 30-45 seconds.  For simplicity, let's say 30 seconds.
If I am only looking at my top 10 most intense ions, for 30 seconds, all I get is fragments of the same 10 ions.  If my gradient is 80 minutes long, that gives me 160 possible intervals of 10 ions, meaning that the most peptides I will EVER see is 1,600.  (It is worth noting above, that my ions only really elute from 30-65 minutes, so this number is actually FAR lower).
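The back-of-the-envelope ceiling above can be written out as a tiny sketch.  The function name and arguments are made up for illustration, and it makes the same simplifying assumption as the text: without DE, each 30 second peak width yields only the same top 10 ions.

```python
def max_ms2_targets(gradient_min, peak_width_s, top_n):
    """Upper bound on distinct precursors fragmented when no dynamic exclusion is used."""
    intervals = (gradient_min * 60) // peak_width_s  # number of peak-width intervals
    return intervals * top_n

print(max_ms2_targets(80, 30, 10))  # full 80 minute gradient: 160 intervals x 10 ions = 1600
print(max_ms2_targets(35, 30, 10))  # only the 30-65 minute elution window: 70 x 10 = 700
```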
Dynamic exclusion (DE) jumps in and helps.  Since my trusty old 3200 instrument in grad school, every MS instrument I have used has been equipped with this feature.  DE says "enough already!  I've seen this ion too many times, time to look at something else!"
In DE you select 1) How many times you want to see this ion before you ignore it. 2) How long to ignore it. 3) How close you have to be to the exact mass of that ion for it to be ignored.  Use this well, and your data improves dramatically.  Use it wrong and your experiment suffers.
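To make the three settings concrete, here is a toy model of an exclusion list.  This is only a sketch of the idea, not how any vendor's instrument software actually implements it; the class name, method names, and example values are all hypothetical.

```python
class DynamicExclusion:
    """Toy dynamic exclusion list (illustrative only, not a vendor implementation)."""

    def __init__(self, repeat_count, duration_s, tol_da):
        self.repeat_count = repeat_count  # 1) times an ion may be picked before it is ignored
        self.duration_s = duration_s      # 2) how long it stays ignored
        self.tol_da = tol_da              # 3) +/- m/z tolerance that counts as "the same ion"
        self.seen = []      # [mz, count] for ions picked but not yet excluded
        self.excluded = []  # (mz, expiry_time) for ions currently excluded

    def is_excluded(self, mz, now):
        # Drop expired entries, then look for any entry within tolerance.
        self.excluded = [(m, t) for m, t in self.excluded if t > now]
        return any(abs(mz - m) <= self.tol_da for m, _ in self.excluded)

    def record_ms2(self, mz, now):
        """Call each time an ion is fragmented."""
        for entry in self.seen:
            if abs(mz - entry[0]) <= self.tol_da:
                entry[1] += 1
                break
        else:
            entry = [mz, 1]
            self.seen.append(entry)
        if entry[1] >= self.repeat_count:
            self.seen.remove(entry)
            self.excluded.append((entry[0], now + self.duration_s))


# Example values only: exclude after 1 observation, for 30 s, within +/-0.5 Da.
de = DynamicExclusion(repeat_count=1, duration_s=30, tol_da=0.5)
de.record_ms2(500.25, now=0.0)
print(de.is_excluded(500.60, now=10.0))  # True: within 0.5 Da and inside the 30 s
print(de.is_excluded(501.00, now=10.0))  # False: 0.75 Da away
print(de.is_excluded(500.60, now=31.0))  # False: the exclusion has expired
```

Note that the second ion at m/z 500.60 is a different species from the one that was fragmented, yet it gets ignored anyway.  That is the side effect of a wide mass tolerance.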

I recently saw this setting for DE.  1) Only see the ion once before you ignore it. 2) Ignore it for 30 seconds. 3) This is the ion to ignore if it is within 0.5 Da of the observed mass.

So:  Every time we ignore an ion, we are actually ignoring 0.5 Da above and below that observed mass. So this is a 1 Da gap that is being ignored.  We are then ignoring that 1 Da gap for 30 seconds.  If we are using a method that focuses on our top 10 most intense ions, after every MS1/MS2 event cycle, we are producing a 10 Da window that is going to be ignored.  In general, a top 10 method FT/CID on an Orbitrap system occurs in roughly 1 second.  Let's do the math here:  Every 1 second, you ignore a 10 Da gap for 30 seconds.  At the end of that 30 seconds, by the time the first DE list has expired, you have created 29 additional 10 Da windows to ignore.  That is 30 windows of 10 Da, or 300 Da in total.  Just for simplicity's sake, let's squeeze all of these things we are ignoring together so they are touching.

This is the image from above, just with the 300 Da mass window of the MS1 spectra blocked out.  Although the most intense spectra will range wildly by m/z, we know from previous exercises that good spectra only come from a pretty narrow m/z window.  Without a question, the region where the best spectra are is where you are going to be ignoring.  This is going to dramatically lower your PSMs, as well as your identified peptides and proteins.  With these settings, it might actually be better to turn DE off and salvage your experiment.
How can you use DE to your benefit?  Lower your mass window!  This is a big advantage of the high resolution of Orbitrap mass spectrometers.  You can change your exclusion windows to parts per million (PPM).
What if we repeated the same experiment as above but we set the DE mass window to 10 ppm?  Let's assume a central m/z of 500 just to keep the mass easy.  At an m/z of 500, 1 ppm is a window of 0.0005 Da.  10 ppm is 0.005 Da.  Every 1 second we generate a 0.05 Da mass window to ignore.  At the end of the 30 seconds, we are ignoring a total mass window of 1.5 Da.  (If the 10 ppm is applied above and below the observed mass, the way the 0.5 Da was in the example above, double these numbers to 3 Da.  Either way it is about a hundred times smaller than 300 Da.)  You are now successfully ignoring the ions you want to ignore AND not blocking out the other ions.
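The two scenarios can be compared side by side with a short sketch.  The function names are made up for illustration.  It follows the arithmetic in the text: a 1 Da window per ion for the +/-0.5 Da setting, 10 ppm treated as the full window width at m/z 500, and no overlap between excluded windows.

```python
def ppm_to_da(mz, ppm):
    """Width in Da of a ppm window at a given m/z."""
    return mz * ppm / 1e6

def excluded_width_da(top_n, cycle_s, duration_s, per_ion_width_da):
    """Total m/z width on the exclusion list once the list is full,
    assuming (as in the text) that no two excluded windows overlap."""
    cycles = duration_s / cycle_s  # cycles completed before the first entry expires
    return cycles * top_n * per_ion_width_da

# Top 10 method, 1 second cycle, 30 second exclusion duration.
wide = excluded_width_da(10, 1.0, 30, 1.0)                     # +/-0.5 Da = 1 Da per ion
narrow = excluded_width_da(10, 1.0, 30, ppm_to_da(500, 10))    # 10 ppm at m/z 500
print(wide, narrow)  # roughly 300 Da versus roughly 1.5 Da
```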
Feel free to try this experiment yourself.  While these calculations seem like gross simplifications, I have seen numerous experiments where dropping a mass window from even 0.1 Da to 10 ppm has caused dramatic increases in PSMs, peptides and total numbers of proteins ID'ed.

Monday, August 27, 2012

What is in a DTA file?

Wow.  This one is even simpler than the MGF file:

You get the monoisotopic mass (A1), the charge state (B1) and you get the fragment masses and areas.  Each individual DTA file is given its own number, but I am unclear as to whether you can use that ID to reference back to your original scan event (actually, I doubt it).  You could always do it manually.
Again, no charge state assignment for the fragment ions.
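For reference, a minimal sketch of reading that layout: a first line with the precursor mass and charge, then one fragment mass and area per line.  The function name and the numbers in the example are made up for illustration.  One assumption worth checking against your own files: in DTA files the first value is commonly the singly protonated (MH+) mass rather than the neutral mass.

```python
def parse_dta(text):
    """Parse DTA-style text: first line 'mass charge', then 'fragment_mass area' lines."""
    rows = [ln.split() for ln in text.splitlines() if ln.strip()]
    precursor_mass = float(rows[0][0])
    charge = int(rows[0][1])
    fragments = [(float(m), float(a)) for m, a in rows[1:]]
    return precursor_mass, charge, fragments

# Made-up values, for illustration only.
example = """1234.5678 2
175.1190 1520.0
262.1510 980.5
"""
mass, z, frags = parse_dta(example)
print(mass, z, len(frags))
```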

Thursday, August 23, 2012

What is in an MGF file?

Even if I'm the only person to ever reference this, I'm going to make a short series of this -- just to keep them straight.
We'll start with an easy one:  What information is contained in a Mascot Generic Format (MGF) file?

Via Proteome Discoverer 1.3, I exported some Orbitrap Velos RAW data into every possible format.  The MGF file looks like this when opened in a spreadsheet:


So, you get whether the mass is monoisotopic or average; in this case, mono.
"BEGIN IONS" denotes that this is the place where Mascot should start paying attention, and the Title line is a short statement that identifies this spectrum.
The first information is the parent ion mass. The next number is the parent peak area.  This is optional information that is not used in Mascot.
The next line provides the charge state.  In our data, this was assigned within the Orbitrap when the isotopes were successfully resolved.
The retention time and the scan number are next, followed by the fragment ions and the area of their peaks (again, this is optional).  The last line is to tell Mascot to stop paying attention and to start looking for the next "Begin Ions" statement.
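As a text file, the layout described above can be read with a few lines of code.  This is a minimal sketch rather than a complete MGF reader: the function name is made up, the example spectrum uses invented values, and only the standard `KEY=value` lines and peak lines inside a BEGIN IONS / END IONS block are handled.

```python
def parse_mgf(text):
    """Yield one dict per BEGIN IONS ... END IONS block."""
    spectrum = None
    for line in text.splitlines():
        line = line.strip()
        if line == "BEGIN IONS":
            spectrum = {"params": {}, "peaks": []}
        elif line == "END IONS":
            if spectrum is not None:
                yield spectrum
            spectrum = None
        elif spectrum is not None and line:
            if "=" in line:
                # Header lines such as TITLE=, PEPMASS=, CHARGE=, RTINSECONDS=, SCANS=
                key, value = line.split("=", 1)
                spectrum["params"][key] = value
            else:
                # Peak line: fragment m/z, then an optional area/intensity
                parts = line.split()
                area = float(parts[1]) if len(parts) > 1 else None
                spectrum["peaks"].append((float(parts[0]), area))

# Invented values, for illustration only.
example = """MASS=Monoisotopic
BEGIN IONS
TITLE=example spectrum
PEPMASS=500.2500 123456.0
CHARGE=2+
RTINSECONDS=1800.0
SCANS=4321
175.1190 1520.0
262.1510 980.5
END IONS
"""
spectra = list(parse_mgf(example))
print(spectra[0]["params"]["CHARGE"], len(spectra[0]["peaks"]))
```

The CHARGE line describes the parent ion only; each peak line carries an m/z and an area and nothing else.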

Did you notice what was missing?  Charge state assignment for the fragment ions.  In high resolution mass spectrometry, it is possible to resolve the charge states for most of the MS2 fragments.  In this experiment, the fragment ions were read at high resolution, and the majority of fragments were isotopically resolved.  Mascot DOES NOT use this information.  As high resolution mass spectrometry continues to advance and MS2 charge state assignments become the norm, the Mascot algorithm will eventually need to be updated or replaced by algorithms that do use this extremely important information.  At this point, however, the Orbitrap is producing more valuable information than we actually know how to use.

Wednesday, August 22, 2012

MALDI on an Orbitrap

I could write 50 pages about what happened this week.  In fact, I almost already have.  I've been at the Thermo facility in Bremen, Germany where I have taken the user and experts course on the Q Exactive, and the expert course on the Elite Hybrid, and I've taken 45 pages of notes so far.  Today, however, was the user's course on the MALDI Orbitrap.

I have to admit up front that I'm not the world's biggest MALDI expert.  I have only used the ABI 4800 at the Virginia Tech Mass Spectrometry Incubator.  I have, however, kept up with a lot of MALDI technology and papers, particularly the innovative drug discovery work of the Dorrestein lab at UC San Diego.

And of course, I'm probably kind of biased.  There isn't much question that I love the Orbitrap series of instruments.  I did when I got my first XL as a postdoc and that has only been amplified by the rest of the Orbi family (I have, at this point, finally operated every instrument with an Orbitrap except for the 'Classic').

Despite these obvious flaws in my background and impartiality, believe me when I say:  I do not understand how anyone could possibly justify purchasing a MALDI-TOF instrument.  Ever.  The MALDI-Orbitrap produced by Thermo is an Orbitrap XL with the upgraded HCD cell.  That means 100,000 maximum resolution (has a TOF ever exceeded 30,000?) and the sensitivity of an Orbitrap.  I was watching signal intensity being generated at 3E6 with <5 laser pulses.  The MALDI seamlessly switches from plates to imaging mass spectrometry -- again, with the resolution and sensitivity of an Orbitrap.

If this wasn't bad enough for the MALDI-TOF field, it doesn't end at the Orbitrap XL.  Although Thermo has decided to not push the development of MALDI sources on the newer systems, a company in Maryland called Mass Tech has developed an atmospheric pressure MALDI source that can be fitted to just about anything -- including the Q Exactive and the high field Orbitrap Elite.  Anyone want to do MALDI at even higher sensitivity with 240,000 resolution?

Wednesday, August 15, 2012

Peptide Atlas


Holy cow.  It has been nearly a month since my last entry.  Apparently, I've been every bit as busy as I thought I had been....changing jobs, states, and buying a new house are all things that will cut into your free time.
Anyway, this amazing tool has been available for years now, but it only just now came to my attention.  Peptide Atlas is an amazing set of tools, as well as a data repository for all of us out there who have interest in a specific protein or set of proteins and want to target them.
Say you are interested in a cool protein like STAT3 but you didn't see it in your discovery run.  You can go to the Peptide Atlas and see what peptides from STAT3 are most commonly found in your normal proteomics experiment.  You can then export the masses of those peptides and move them to your Include or Parent Mass lists, dramatically increasing the chances that you will find and be able to observe that protein.  It takes the guesswork out of your experiments, particularly what peptides or charge states you should be looking for.  There are other tools that will help you build SRM methods but I haven't gotten to those yet.  Data from the Peptide Atlas can also be directly imported into PinPoint if you are doing true label free and targeted quan.
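If the mass you export is a neutral monoisotopic peptide mass (an assumption; check what your export actually contains), turning it into the m/z values for an inclusion list is simple arithmetic: add one proton per charge and divide by the charge.  A minimal sketch, with a made-up function name and a made-up 1000 Da peptide:

```python
PROTON_MASS = 1.007276  # Da

def mz_for_charge(neutral_mono_mass, z):
    """m/z of a peptide carrying z protons, from its neutral monoisotopic mass."""
    return (neutral_mono_mass + z * PROTON_MASS) / z

# Made-up 1000 Da peptide at the charge states you might target.
for z in (1, 2, 3):
    print(z, round(mz_for_charge(1000.0, z), 4))
```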