Wednesday, February 28, 2018

Why do your alkylation and reduction separately? Do them together!

Okay -- I'm prepared. This is probably one of those things everyone has been doing that I just missed while I wasn't in the lab full time. Admittedly, I don't prep many samples, and I never really have. But I'm prepping a bunch of patient plasma samples this week with some different methods and I forgot how much it sucked for me.  I forget what I'm doing all the time. I can't read my own handwriting and I get very...very...distracted....

What was I..umm....

Oh yeah! Until recently when everyone all of a sudden seemed very interested in PTMs on cysteines (WTFourier?) I blindly reduced and alkylated. Now that I'm thinking about it, I'm pretty concerned. There is this blog post

...and associated paper... that's bad news...serenity now... These are patient samples that I'm really really interested in (note to collaborators -- I'm 150% invested in every sample, not just these). How do they deal with it in the study I'm using as my template? Oh. They don't use IAA. I should have read ahead.

For reference, this is my model study for patient plasma samples. If someone brings you plasma, just do what they did. (I have to mix it up a little myself because I'm looking for something super weird.)

When you get to the reduction and alkylation steps you need to go back in time to this study.

Where you find this gem!

The figure at the very top of this post is 1b, demonstrating the efficiency of this method. BOOM! Reduction and alkylation in one step!  No weird iodine things. No DTT cross-reactions with your alkylating step. Fewer places to get mixed up and forget where you left 80 samples. Everyone wins!

EDIT: 3/1/18. Definitely check out the comments people have time to make on this post!! This story is more complicated than this. I need to spend more time on this as well.

Saturday, February 24, 2018

STOP. Do not phosphoenrich another sample till you check this out!!

WHOA!!! Some of y'all have spent a good part of a million bucks to double your phosphopeptides. What if you could just about do it for free?!?! What if it just requires switching your solvents around and adding an enzyme to degrade the DNA/RNA?  That's what they show you can do here!

Okay -- so they only increase the numbers by like 50%. But that's still a lot!!

What you're doing is just switching up the protocol to drop the amount of crap that is also sticking to your enrichment column.

By UV it looks like this:

A has a lot of crap! B has WAY LESS crap! And that's all there is to it, your instrument spends less time trying to sort out your already-difficult-to-ionize-and-fragment phosphopeptides from a bunch of other stuff and you get more IDs.

As a side effect, maybe missing all that junk is a great way to further preserve column life and keep the instrument cleaner longer.

Even if it doesn't? 50% more phosphopeptides!!!?!?!?!

Friday, February 23, 2018

Tracking metabolite fate with the Colon Simulator!

The importance of mammalian microbiomes is something that just can't be overstated. The genomics people haven't been making it up and we're seeing more and more metabolomics stuff to back it up. Unfortunately it can be a little hard to study what all those things are doing in the context of shifting conditions between the mammal's diet and a zillion other external conditions. Maybe variables need to be minimized.

Solution? The colon simulator! 

You can't just put a mixture of the ten most common gut bacteria in a flask and rotate it at 37C and expect it to simulate physiological conditions. The colon simulator attempts to recreate something closer to what is actually going on. In this study they use it to track labeled polydextrose (that's the stuff in just about every boxed/canned product that is "low sugar". It doesn't take much time on Google to find that there are some well known side-effects of polydextrose -- rather than following some forum posts with some very immature statements in their titles down some sort of a rabbit hole, I decided to consult Wikipedia. Here I learned that these statements are backed by science because polydextrose can cause 10x more flatulence than naturally occurring fiber in some people).

If this isn't enough to ban this compound outright, we should definitely be studying it!

These authors put 13C polyglucose in their simulator and track the heavy breakdown products. They use 2 NMR systems as well as an LC-MicroTOF that they operate at 1Hz (presumably to obtain the highest resolution possible? or for massive scan averaging to simulate resolution?).

The output is really interesting and visually very nice to look at, but you'll have to check it out yourself. How I'm interpreting it is that the bacteria aren't doing anything at all uniform with this carbon source, they are utilizing it in different ways and then the new breakdown products in different ways based on their own metabolic processes and relative numbers (explaining the individual variation in processing polydextrose?) and the more we can learn about it, the better.

Have a great weekend!

Thursday, February 22, 2018

Structural prediction of protein models using distance restraints!

Amateur hour is over for structural proteomics, yo'.  Time to take the formaldehyde and the guess work and get off the stage.

This is how you do it. Step by step. Reagents, mass spec settings, free software. Everything is in here. Okay -- actually -- you have to also have this paper (this is where the mass spec data came from) and THEN you have the entire workflow.

An MS-cleavable crosslinker was utilized (of course) but the gem here is the downstream analysis that takes you a step forward in your data. You go from "this peptide is xlinked to this peptide" to "holy cow this is the way this protein is folded or how these are linked".
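To make "distance restraint" a little more concrete, here is a minimal sketch of the basic idea: each crosslink says two residues must sit within some maximum distance of each other, so a candidate model can be scored by how many crosslinks it satisfies. This is not the workflow from the Protocol. The coordinate dictionary, the function names and the 30 angstrom default are all assumptions for illustration, and the real cutoff depends on your crosslinker.

```python
import math


def ca_distance(a, b):
    """Euclidean distance between two (x, y, z) coordinates, in angstroms."""
    return math.dist(a, b)


def satisfies_restraint(a, b, max_distance=30.0):
    """True if two residues are close enough to be bridged by the crosslinker.

    max_distance is a placeholder value, not taken from the Protocol.
    """
    return ca_distance(a, b) <= max_distance


def score_model(coords, crosslinks, max_distance=30.0):
    """Fraction of crosslinks that a structural model satisfies.

    coords: dict mapping residue number -> (x, y, z) of the alpha carbon.
    crosslinks: list of (residue_i, residue_j) pairs observed by MS.
    """
    if not crosslinks:
        return 0.0
    satisfied = sum(
        satisfies_restraint(coords[i], coords[j], max_distance)
        for i, j in crosslinks
    )
    return satisfied / len(crosslinks)


# Toy model: residues 1 and 2 are 5 angstroms apart, residue 3 is far away.
toy_coords = {1: (0.0, 0.0, 0.0), 2: (3.0, 4.0, 0.0), 3: (0.0, 0.0, 50.0)}
toy_crosslinks = [(1, 2), (1, 3)]
print(score_model(toy_coords, toy_crosslinks))
```

A model that satisfies more of your observed crosslinks is a better candidate fold, which is the step from "these two peptides are linked" to "this is probably how it is folded".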

As someone with a bunch of this planned in the near future -- this Protocol couldn't have come at a better time. Now I just need someone to install all this software....

Wednesday, February 21, 2018

DALEX: Take a look behind the scene in machine learning!

I probably don't need to tell you that people love throwing the term "machine learning" around these days. Heck, we're at a point now where some percentage of them aren't just saying it to sound smart and actually know what it means. (I'm not one of them) 😈 (Is that an evil cat Emoji? Yeah!)

A big problem with the actual machine learning algorithms that are real things that are actually doing computations on computers is how black boxy they are now. A lot of it is one algorithm built on another and another and modern PCs just have the firepower to run all of them.

Now you have a new problem. What are they doing to your data back there!?!  And if they are screwing it up, when would you know? In 2 weeks when all the calculations are done? Or when someone does a follow-up to your study and the prose has a condescending tone (that you probably imagined...😈...)

DALEX is an attempt to figure this out. It is a project by a bunch of bored mathematicians called MI^2 (this is pronounced "Am I square").

You can read about this project here.

Shoutout to Dr. Norris (someone who does fall in the group of people who know what machine learning is and how to use it) for this cool link!

Tuesday, February 20, 2018

Metal bands OR words that exist in the human protein FASTA sequence?

On a lighter note, I just saw this on Twitter. These are some "words" you can find in the human FASTA sequences!

I can't find MISERY, but I checked a few. ANGST looks like it shows up in at least 21 human proteins (NIST fwd FASTA was the first one I had available to check). Finally, a good reason to stop being embarrassed about that Goth period you had in high school!
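If you want to check your own favorite band name, a minimal sketch like this will do it. It is just a plain substring search over FASTA records. The function names are mine, and the three-entry "FASTA" at the bottom is made up for the demo, not real human sequence.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def proteins_containing(word, fasta_lines):
    """Return the headers of every entry whose sequence contains the word."""
    word = word.upper()
    return [h for h, seq in read_fasta(fasta_lines) if word in seq.upper()]


# Made-up entries; the second one has the word split across two lines.
toy_fasta = [">p1", "MKANGSTLL", ">p2", "MKAN", "GSTQ", ">p3", "PEPTIDE"]
print(proteins_containing("angst", toy_fasta))
```

For a real database, pass an open file handle instead of the toy list, e.g. `proteins_containing("ANGST", open("human.fasta"))`, where the filename is whatever your FASTA is called.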

Do you have a data processing task that sounds impossible? Perseus time!

It's NBA All Star Weekend here in the U.S.A. and it's a big enough deal that in my rural community we get a school holiday for it. From the sounds I can hear from my yard, I think that nearly all of the local children are celebrating this respite from arithmetic by firing semi- and fully-automatic weapons. I'm exaggerating. I'm sure there are also untrained adults out there with military grade weapons. I like to hope there are two distinct groups of people who go down my middle-of-nowhere dirt road:  The group with the machine guns and the group that throws all the "Lite" beer cans out the windows of their vehicles. What can I say? I'm an optimist!

Around all this revelry I somehow have found time to start checking something critical off of my bucket list. And this is to finally take a look at where MaxQuant and Perseus are today.  And...I feel kinda dumb...

I'm going to start with Perseus. If you don't have this on your desktop and you have any intention of doing an analysis that is more than peptide ID, you should go here and register (it's free, of course) and get the newest version on your desktop.

Am I always telling everyone to download all sorts of software? Probably. I should justify this.

The current iteration of Perseus can do everything you've ever wanted to do with a complicated proteomics or transcriptomics dataset.

It can process your data through logical and hierarchical filters (and allow you to export your data at every point in the step by step process. NOT JUST AT THE END).  Think about how useful this is for a second. If your workflow looks like poop at the end, you can go back through your data manipulations and look at the report at each step. You can find out exactly where you took that beautiful mass spec data and messed it up.

It also allows single step insanely powerful manipulations of your data. Example: Imagine that, out of the sheer goodness of your heart, you have taken on the data processing of a huge clinical proteomics cohort in a virtually unknown disease. Imagine that this study had the most rigorous QC methodology anyone has ever done for a proteomics study (I didn't do that part. holy cow. the team that did is good. wait. this is hypothetical). Also imagine that you have delivered 16 LFQ reports and everyone is really annoyed that you did Control/Disease state, rather than DiseaseState/Control. (It's clinical, this is a bunch of MDs) and recreating those 16 Consensus reports is more than all the goodness that has, or ever will, exist in your heart.

Perseus? Just pull in all the tables for all the values and hit the Transform button. Type 1/x and export the report.
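For anyone who wants to see what that 1/x transform amounts to outside of Perseus, here is a minimal sketch. This is not Perseus code. The table layout (a list of dicts) and the column names are hypothetical, and missing or non-positive ratios are simply passed through as None.

```python
import math


def invert_ratio(ratio):
    """Flip Control/Disease into Disease/Control (or the reverse)."""
    if ratio is None or ratio <= 0:
        return None  # nothing sensible to invert
    return 1.0 / ratio


def invert_table(rows, ratio_columns):
    """Return a copy of the table with the named ratio columns inverted."""
    flipped = []
    for row in rows:
        new_row = dict(row)
        for column in ratio_columns:
            new_row[column] = invert_ratio(row.get(column))
        flipped.append(new_row)
    return flipped


# Hypothetical report: one ratio column, one protein with no ratio.
report = [
    {"protein": "P1", "control_over_disease": 4.0},
    {"protein": "P2", "control_over_disease": 0.5},
    {"protein": "P3", "control_over_disease": None},
]
for row in invert_table(report, ["control_over_disease"]):
    print(row)
```

If your ratios are already log2 transformed, the same flip is just a sign change, since log2(1/x) = -log2(x).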

I think I literally or figuratively (I get those mixed up) just chose the absolute least powerful thing that you can do with Perseus as an example because it saves me 16 hours of Consensus workflow processing.

What if you have a bunch of SILAC experiments that were done a few years apart and someone realizes that these would be perfect for comparing the light labeled version of 3 of them from the 2011 study to the heavy standards done last year? Sounds like a nightmare, right? There are 10 ways you could do this (PD could do it) but Perseus is actually designed to do it. That's kind of what it is for. There are tutorials specifically made to address this!

If you are thinking -- "wait. aren't you really hard on MaxQuant and Perseus in this blog?" Yeah. Totally. I can't remember even 1% of what I've written on this site, but I think that all of the criticism has been regarding how challenging the software is for beginners or for simple experiments. My first favorable comparison of the two software packages was when PD 1.2 (I believe) could get me the same results the version of MaxQuant did at the time but could do it with a simple saved template that I could generate results from just by hitting the "Play" button. PD has grown up a lot and it is the software I will go to every time (my lab has like 7 licenses and Mascot! w00t!). But if you have something nuts --like -- absolutely nuts -- you may enjoy your life a lot more if you go to software that can do something like this.

This is a multiscatter plot showing the Spearman correlation coefficients for the quantification of 9 different cell lines versus one another. The coefficient is overlain on each plot, and the orange is one set of proteins selected in a single plot, carried over to show where that same set of proteins falls in EVERY OTHER SAMPLE SET.  Is there a set where your proteins of interest are not showing up in the low ratio range? Easy: find that plot, highlight it, it becomes the active plot, and then you can examine them manually.

Now -- I have to be honest. I haven't done these plots. I stole them from last year's MaxQuant summer school lectures. But -- this is important -- I'm giving it a go right now -- and I'm just feeding Perseus PD data. I want to do something that is tough and time consuming in PD, so I'm just feeding it into Perseus. Oh -- and I'm also giving Perseus transcriptomics data, too. Cause Perseus doesn't really care what it's looking at, so long as you tell it the right format!

If I convinced you to also give up your next holiday to learn Perseus, I recommend you take the time and start here.

Part II is here:

And Part III (my personal goal for today): This is the video where Dr. Tyanova shows all the clustering!!

As an added bonus, Dr. Geiger is really funny. You have to really be paying attention to catch it and I suspect if you are replaying the video and pausing it while trying to replicate her live data manipulations it's easier to catch her subtle jokes than if you are sitting in the audience. Or the summer school participants are just really serious (as they should be). You may find yourself looking around and wondering why no one else laughed and then realize you're in your office and there is just a sleeping dog and it's 5pm and you haven't had breakfast and maybe low blood sugar makes you laugh at things no one else laughs at. Who knows? I prefer to think that Dr. Geiger is really funny.

Yes, I just suggested you watch 3 hours of videos and to work along with these awesome operators to learn Perseus. This much power doesn't come for free! There are other resources as well. This great recent paper and there are great focused tutorials and (non video) use cases here at

Monday, February 19, 2018

Time to plan your summer European vacation around these amazing meetings!

Everyone in proteomics should be in Europe in July! Let me help you plan your vacation.

First stop:  July 2-6 -- Zurich

For the DIA/SWATH course. You can register here until March 31st. There is no intro material. This is for mass spectrometrists. Been doing DDA for 10 years and want to see what DIA can do for you? This is your stop.

Stop #2: July 8-13 -- Barcelona

Want to become an expert in the world's most powerful quantitative proteomic packages? This is how you do it. MaxQuant and Perseus for 5 full days with amazing speakers who design the stuff and power users who influence the designers. You can apply for a spot here.

Stop #3: July 16-20 -- VIENNA! 

EuPA's Advanced Practical Proteomics returns -- this year in Vienna. If you aren't familiar, just check out the amazing lecture material produced from the 2017 academy.

SUMOylation, glycoproteomics, big data, proteogenomics, PTMs I haven't heard of. You can get all these 2017 lectures here.

You can register for this event here. While I'd love to get to all 3 this July, I can only realistically do one so it's Vienna in July for me!  I cannot wait!!

Stop #4: July 27-28 -- Oland, Sweden!

Heavy metal festival on an island in rural Sweden?!? What?!?  IN FLAMES plays both days? Wait! Wrong blog! How much vacation time do I get again...?

Sunday, February 18, 2018

More details on our optimized gradients for C-18 PepMap

Thanks for the emails!  I legitimately love to receive them. Even if it starts with "you're an idiot!" Which, honestly, is reasonably rare. 

This is a follow-up to a recent post where I talked about how I'd been totally messing up my LC separations by treating C-18 PepMap like other resins.  I know I didn't provide enough details, but this has been a work in progress as we fine tune our instruments to be able to best handle the intimidating number of projects we've got going on.

After messing around with 6 EasyNLCs (1 Easy2, 2 Easy1200 and 3 Easy1000s) this is what appears to be the best separation I can get in 120 minutes on the 15cm 2um columns.

Please note: Recent versions of the EasyNLC user manual recommend using no more than 80% acetonitrile in Buffer B to protect your pump and valve seals. We switched all of our systems to 80% just last week. With 6 EasyNLCs running around the clock --- even a 1% increase in pump seal life will result in at least a 40% decrease in the number of loud profanities coming from this one weird bald guy in the lab.  Also worth noting -- these LCs use viscosity and temperature and some other stuff to determine flow rates. Don't change your solvents without consulting your manual (get the newest one online. there have been significant revisions over the years!), and don't trust what weird people post online on Sunday mornings.

Buffer A is 0.1% formic acid in 100% LC-MS grade water
Buffer B is 80% MS-grade acetonitrile, 20% LC-MS grade water, 0.1% formic acid.
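One thing that trips people up with an 80% acetonitrile Buffer B: the %B in your method is no longer the % acetonitrile hitting the column. A minimal sketch of the arithmetic, assuming simple volume mixing and ignoring the formic acid (the function names are mine):

```python
ACN_FRACTION_IN_B = 0.80  # Buffer B as described above: 80% acetonitrile


def percent_acn(percent_b, acn_fraction_in_b=ACN_FRACTION_IN_B):
    """Actual % acetonitrile delivered at a given %B."""
    return percent_b * acn_fraction_in_b


def percent_b_for_acn(target_percent_acn, acn_fraction_in_b=ACN_FRACTION_IN_B):
    """%B needed to deliver a target % acetonitrile.

    Handy when porting a gradient written for 100% acetonitrile Buffer B.
    """
    return target_percent_acn / acn_fraction_in_b


# The 24% B and 36% B steps used for less complex samples in this post:
for step in (24, 36):
    print(f"{step}% B is roughly {percent_acn(step):.1f}% acetonitrile")

# A step that was 30% B with a 100% acetonitrile Buffer B:
print(f"30% acetonitrile needs {percent_b_for_acn(30):.1f}% B")
```

So 24% B and 36% B work out to about 19.2% and 28.8% acetonitrile, and an old 30% step needs 37.5% B to match.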

This is with a 2cm C-18 PepMap trap column in line. One with a surprising amount of dead volume. More on that when I have more data.

I'm actually using 500nL/min when I get to the high organics just to flush things out, but I don't like how the graphic represents it.  Translation: I am displaying a method image above that is actually incorrect, purely for my own personal sense of aesthetics...

What happens if I don't ramp up the flow rate at high organic? I get some trailing of some extremely hydrophobic peptides.

I know that is hard to see, but if you click on it - it should expand it. This is a human cell digest from Thermo with the PRTC peptides spiked in. The bottom is the most hydrophilic peptide of the 15 and the middle frame is peptide number 13. You can see that there is still some signal when the method cuts out at 130 min. There really are some peptides coming off at the end there.

If I up the flow rate on high organic, it trails off a good bit better and it sharpens the peak shape on my latest eluting standards.

I really dislike the boring 0-10 minutes at the front. We're running some tests in this weekend's queue, but on one run we've been able to negate it almost completely by using an alternative NanoViper trap column with significantly lower dead volume. I'll share those details when I get them.

For less complex samples we're getting the best separation by mimicking this gradient and ramping to 24% buffer B in 40 minutes and 36% in an additional 15. I brought some RAW files home, but I can't find the funny cord for my portable hard drive. I'll post later if I can find the stupid cord.

Why aren't all USB cords standardized yet?!? It's 2017, for crying out loud.

EDIT: *2018

Saturday, February 17, 2018

Spectral accuracy of an Orbitrap using Isotope Ratios

There has been this myth out there for years that Orbitraps aren't good at isotope ratio analysis.

Okay, it might not actually be a myth. The original Orbitrap and maybe the Orbitrap XL didn't perform very well against a time of flight (TOF) instrument in a study operated by a TOF manufacturer.

Fast-forward 11 years or something and take a look at this newish study. 

Head to head comparison -- a Q Exactive Plus operating at 140,000 resolution (doesn't appear to utilize the enhanced resolution upgrade) versus an honest to goodness isotope ratio mass spec.

How's it do?

Really well. Honestly, better than I expected despite my borderline obsession with these instruments, with an important caveat or two:

There is a dynamic range where the isotope ratios are spot on. It looks to me from the charts in the paper that if you are above 1e5 counts you're in the clear. Drop below that line and things get wobbly.  Stay above it? And you can just look at the isotopic distribution and count the number of carbons, nitrogens and sulfurs in your molecule without any additional information.
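Here is a rough sketch of the carbon-counting part of that idea. It uses the textbook approximation that each carbon adds about 1.1% to the M+1 peak relative to the monoisotopic peak. This is a carbon-only estimate under that assumption: it ignores every other element's contribution and is not the method from the paper, and the function names are mine.

```python
C13_ABUNDANCE = 0.0107  # approximate natural abundance of 13C
C12_ABUNDANCE = 0.9893  # approximate natural abundance of 12C

# Expected M+1 / M ratio contributed by each carbon atom (about 0.0108).
PER_CARBON_RATIO = C13_ABUNDANCE / C12_ABUNDANCE


def expected_m1_ratio(n_carbons):
    """Carbon-only M+1 / M intensity ratio for a molecule with n carbons."""
    return n_carbons * PER_CARBON_RATIO


def estimate_carbons(m1_over_m0):
    """Estimate carbon count from a measured 13C M+1 / M intensity ratio."""
    if m1_over_m0 < 0:
        raise ValueError("intensity ratio cannot be negative")
    return round(m1_over_m0 / PER_CARBON_RATIO)


# A measured 13C peak at 10.8% of the monoisotopic peak:
print(estimate_carbons(0.108))
```

You can see why the isotope ratios have to be spot on: for a 10 carbon molecule, being off by 5% of the ratio is already half a carbon.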

Caveat 2 (I'm adding this one) the mass cutoff of the instrument. Resolution decreases as m/z increases in the Orbitrap so in the low range everything is just incredible, but then you hit the low mass cutoff at 50 m/z....we've got some molecules in the 40 m/z range and it is soul crushing to have to run those compounds on the TSQs...
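To put a number on "resolution decreases as m/z increases": in an Orbitrap, resolving power falls off roughly with the square root of m/z, and the Q Exactive resolution settings are quoted at m/z 200. A minimal sketch of that scaling (an idealized rule of thumb, not a measurement from the paper; the function name is mine):

```python
import math


def orbitrap_resolution(mz, nominal_resolution=140_000, reference_mz=200.0):
    """Approximate resolving power at a given m/z.

    Assumes the nominal setting is specified at reference_mz and that
    resolution scales with 1 / sqrt(m/z).
    """
    if mz <= 0:
        raise ValueError("m/z must be positive")
    return nominal_resolution * math.sqrt(reference_mz / mz)


for mz in (50, 200, 800, 1400):
    print(f"m/z {mz}: ~{orbitrap_resolution(mz):,.0f}")
```

At the 140,000 setting that works out to roughly 280,000 at m/z 50 and roughly 70,000 at m/z 800, which is why the low mass range looks so good right up until you hit the cutoff.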

Is a Q Exactive going to beat a dedicated IRMS instrument? No way. Could you use a QE as a high throughput screening device to determine if compounds needed to be sent off for isotopic determination on an IRMS? Absolutely! As long as it's >50 m/z (and in this paper, they never go over 1,400 m/z). So, as long as:  50 m/z < your compound < 1400 m/z

Friday, February 16, 2018

Systematic analysis of protein turnover in primary cells!!

Edited 2/16/18 for unnecessary profanities, but -- You've still got to check this out!!

What has proteomics done for biology lately? You mean today? Well -- how 'bout figuring out the protein turnover dynamics for over 9,000  (!! over 9,000 !!!) different proteins in B-cells, Natural Born Killer cells, neurons, monocytes, and hepatocytes!?!?

Why is it important? Proteostasis is a critical component of understanding mammalian biology and perturbation of the natural processes is the center of many diseases. Also, all of the cells this study works with are important in their own right.

This dynamic SILAC approach used in the paper is a major improvement over anything I've seen before on this topic -- they can assess protein turnover in a huge dynamic range from a time perspective, assessing proteins that have half-lives as short as 10 hours or as looong as 1,000 hours!
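For a feel for the numbers, here is the simplest possible version of the math behind a half-life. After switching cells to heavy media, the old (light) protein decays away, and if you assume plain first-order decay the half-life falls out of a single time point. This is my own back-of-the-envelope sketch, not the model used in the paper, and it ignores cell division and label recycling.

```python
import math


def half_life_from_fraction(fraction_remaining, hours):
    """Half-life (hours) from the fraction of old protein left after `hours`.

    Assumes first-order decay: fraction = exp(-k * t), half-life = ln(2) / k.
    """
    if not 0 < fraction_remaining < 1:
        raise ValueError("fraction must be between 0 and 1 (exclusive)")
    k = -math.log(fraction_remaining) / hours
    return math.log(2) / k


def half_life_from_heavy_light(heavy_over_light, hours):
    """Same calculation starting from a SILAC heavy/light ratio."""
    fraction_light = 1.0 / (1.0 + heavy_over_light)
    return half_life_from_fraction(fraction_light, hours)


# Half of the light protein is gone after 10 hours -> 10 hour half-life.
print(half_life_from_fraction(0.5, 10))
# Heavy/light of 3 after 20 hours means 25% light left -> also 10 hours.
print(half_life_from_heavy_light(3.0, 20))
```

It also shows why the long end is hard: a protein with a 1,000 hour half-life has lost only a few percent of its light signal after two days, so the measurement has to be very precise.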

This study has "systematic" in its title. This translates here into "a ridiculous amount of work". Just when I think I've got this under wraps, and I can't get any more impressed, I realized that they also apply this turnover analysis to protein complexes. How do all the proteins that construct the nuclear pore complex cycle through degradation/replacement in every one of these cell lines? Oh. Like this.

Yes, I know I get excited about a lot of stuff that I read. My basal level state probably appears to exist somewhere between "in complete awe" and "totally blown away" to outside observers, but this study is biology text book altering level stuff and I'm having a lot of trouble putting it down and going to work this morning.

Important note -- This study also features major improvements on the already awesome isobarquant Python software package that can be directly downloaded from links within this great paper.

All the RAW and processed data is available. Since it is 571 RAW files (!!!) and I've already used 2 screenshots from this paper...

Oh no. I'm not done yet. (What time is it?)

In this study the authors use a Q Exactive (Plus, I think) and if you are going through the methods you'll notice that they use a higher MS/MS target value than I bet you're using. 1e6.

Now. Let's think about this one for a second. Why do we keep our target values lower? Because we're cramming a lot of positively charged things into a little tiny space (the C-trap and Orbitrap). I wrote Dr. Makarov once for advice on maximum ions for SIM scans and he said that I would start to see space charging if I went above 5e4 (I can't remember what instrument I was on, but I promise you I kept that email. I didn't frame it or anything. I'm not that weird.)

But that is a SIM scan, right? My only proof that my ion is my ion is that it has a perfect mass and ion distribution. Any shifts from that and I'm in trouble (some FDA pesticide assays on SIM scans require <1ppm mass accuracy for positive ID). This is an MS/MS scan. We're using a much more wobbly tolerance, typically allowing 0.02 Da.

If I go to my new favorite bookmark, the RedElephant, it tells me that on a 200Da fragment ion, a 0.02Da shift is 100ppm!  Even when these guys purposely tried to space charge an Orbitrap I think they couldn't force it to get 30ppm out on a SIM scan (I forget all the details and I really should go to work sometime...)
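The conversion itself is one line of arithmetic if you ever need it without a browser handy. A minimal sketch (function names are mine, this is not the RedElephant's code):

```python
def da_to_ppm(delta_da, mz):
    """Express a mass error in Da as parts per million at a given m/z."""
    return delta_da / mz * 1e6


def ppm_to_da(ppm, mz):
    """Express a ppm tolerance as an absolute window in Da at a given m/z."""
    return ppm * mz / 1e6


# The example from this post: 0.02 Da on a 200 Da fragment ion.
print(f"{da_to_ppm(0.02, 200):.0f} ppm")
# And the other direction: a 10 ppm tolerance on a 1,000 m/z precursor.
print(f"{ppm_to_da(10, 1000):.3f} Da")
```

The same 0.02 Da window is only 20 ppm on a 1,000 Da fragment, so a fixed Da tolerance is much looser at the low end of the spectrum than at the high end.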

So...if we couldn't possibly space charge our ions out of whack why wouldn't we go for a higher target value for our MS/MS ions? Sure, maybe it is overkill, but if there is no downside?

Okay, so check this out. They didn't just go for 1e6 without evaluating it. They've thoroughly vetted this target level. And it isn't a good idea for reporter ions where you need perfectly focused fragment ion masses.

(I need to read more ACS)

But it appears just fine for everything else. You bet your sweet peppy I'm going to run some head-to-head comparisons as soon as I find some open time on one of our instruments.

Thursday, February 15, 2018

SUMO is back and this is how you identify substrates and partners.

SUMOylation is a PTM that I generally try to forget about, because

1) I don't know how to identify it and
2) I don't know what it does, except...well...if it messes up it's probably bad.

Solution? This great new paper in Press at MCP! 

SUMOylation is a protein-type PTM -- something like ubiquitination, but without the handy-dandy lysine on the third amino acid post-protein binding. We'll probably find out tomorrow, if we haven't already, that this isn't true -- but, ubiquitination means DESTROY THIS PROTEIN IMMEDIATELY.  SUMOylation on a protein -- well...I don't think we're entirely sure, but if the last couple amino acids aren't cleaved off, it doesn't do anything.

This study solves problem #1 for me. These authors describe an enrichment procedure for SUMOylation that they can control in fine detail using Flag tags and stuff. Even better? They develop a protein microarray for SUMOylation substrates!  YES.

Potential collaborators of the future, "What about SUMOylation?"
Ben, "Great idea! Here are the names of 18 people in beautiful Baltimore who know how to do this with protein microarrays. I even know a couple of them. Let's go visit and get a Natty Boh, hon!"

Hey, if you can monitor a variable and complicated PTM with an array, I say do it with an array. And if you aren't interested in sites or are doing exploration, this great paper walks you through the molecular techniques to get a global picture of this weird and complicated modification.

Wednesday, February 14, 2018

phpMS -- More new easy to use proteomics tools online!


EDIT: 10/13/18 After watching a second person use this blog post to get to the phpMS webserver (the way I access this tool I use probably every single day as well), I edited the top of the page to make it easier to find the direct link to the tool.

Powerful tools?
Easy and super simple to use?
Free webserver hosting it?
A random red elephant?

This is the link to the paper!! 

Check check check check!

The tolerance calculator is AWESOME. How often have you immediately wanted to know exactly what the Da or millimass unit tolerance is when you've been thinking in parts per millions (ppm) all day? Type it into the box!

In silico digest online or predict your fragments!  Sure, the Protein Prospector has been able to do this for 20 years -- but...and I mean this in the most respectful way possible...I use the Protein Prospector at least once a day right now, but it's never been the most user friendly tool ever written.
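If you've never looked at what an in silico digest actually does, the core of it is tiny. Here is a minimal sketch using the usual simplified trypsin rule (cut after K or R unless the next residue is P). This is my own toy version, not how phpMS or the Protein Prospector implement it, and the sequence is made up.

```python
def tryptic_digest(sequence, missed_cleavages=0, min_length=1):
    """Return tryptic peptides, optionally allowing missed cleavages."""
    sequence = sequence.upper()

    # First cut the sequence into fully cleaved pieces.
    pieces, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            pieces.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        pieces.append(sequence[start:])  # C-terminal piece

    # Then join neighbouring pieces to model missed cleavages.
    peptides = []
    for i in range(len(pieces)):
        last = min(i + missed_cleavages, len(pieces) - 1)
        for j in range(i, last + 1):
            peptide = "".join(pieces[i:j + 1])
            if len(peptide) >= min_length:
                peptides.append(peptide)
    return peptides


print(tryptic_digest("AAKRPGGRCCK"))
print(tryptic_digest("AAKRPGGRCCK", missed_cleavages=1))
```

Note that the R in "KRP" is not cut, because it is followed by a proline.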

The red elephant is a really nice thing to bookmark when I need a simpler answer without the raw power of my hillbilly friend.

Like most tools people are developing these days, this thing has way more power than the smaller functionalities I seem most impressed with here. I'm just talking about the neat little tools that are going to make my life easier and my day maybe a little more productive.

Tuesday, February 13, 2018

Oh no. Batch effects in proteomics?

I'm stuck behind a paywall so high that I can't even see a thumbnail of this paper's main figure -- or an abstract that will tell me if this is proteomics related.

However, Google flags the term "proteomics" in the super secret hidden text of this study and it makes me think that they address things I'm concerned about.

While I'd like to ignore it because of the crappy abstract, the title says it's something that I'd really like to read...

I can't recommend you check it out, but I'm leaving it here so I don't forget to download it when I'm on the other side of the wall.

Zotero -- Open Source Citation Software!

Supposedly my new job has access to EndNote. I can't figure it out and calling my IT help desk only provided me with something completely unrelated in consensus reality.

Then Reddit suggested that Zotero is better anyway! I can't say for sure yet, but it appears to be 1) free 2) well supported 3) loaded with the correct citation formats for the target journals for the open things on my desktop.

Monday, February 12, 2018

Target decoy methods for spectral library searches!

Could you use 18% more peptide IDs with your normal confidence level (assuming a 1% FDR)?

Wait. I can one up this. Could you use 23% more ID's at a 0.1% FDR?

Is it finally time for us to seriously look at spectral libraries again? I'm gonna say it is. I also think this paper is another good argument to support it. 
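If "1% FDR" versus "0.1% FDR" feels abstract, this is the basic target-decoy counting behind numbers like that. It is a generic sketch, not the method from the paper: FDR is estimated as decoys divided by targets above a score cutoff, tied scores are ignored, and the PSM list is synthetic.

```python
def fdr_at_threshold(psms, threshold):
    """Estimated FDR (decoys / targets) among PSMs scoring >= threshold.

    psms: list of (score, is_decoy) tuples.
    """
    targets = sum(1 for score, is_decoy in psms if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms if score >= threshold and is_decoy)
    return decoys / targets if targets else 0.0


def targets_passing(psms, max_fdr=0.01):
    """Largest number of target PSMs you can accept while staying <= max_fdr."""
    ranked = sorted(psms, key=lambda psm: psm[0], reverse=True)
    targets = decoys = best = 0
    for _score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets and decoys / targets <= max_fdr:
            best = targets
    return best


# Synthetic search: 100 good targets, a decoy, 10 more targets, another decoy.
psms = [(score, False) for score in range(200, 100, -1)]
psms += [(100, True)]
psms += [(score, False) for score in range(99, 89, -1)]
psms += [(89, True)]

print(targets_passing(psms, max_fdr=0.01))
print(targets_passing(psms, max_fdr=0.001))
```

In this toy list, 110 targets pass at 1% and only 100 pass at 0.1%. Better scoring pushes the decoys further down the ranked list, which is how a better search gets you more IDs at the same FDR.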

If you've also been doing this a while you might think "hey, we tried this, but the libraries weren't good enough"

Okay -- so there are some people, I think they're mostly in Germany (but we won't hold that against them) and they are synthesizing something like every human peptide. If you haven't seen it, I recommend you check it out here.

NIST hasn't been sitting around doing nothing all these years either. They've already made huge libraries of increasing quality -- and they've already made spectral libraries of the first ProteomeTools release.

You can check out what NIST has available these days here.

Does anyone know if MSPepSearch works in the IMP-Proteome Discoverer? I'm trying it out, but I won't know for sure until the demo key expires in 54 days.... 

Friday, February 9, 2018

How far has MHC peptidomics moved in the last 6 years?

I'm up bright and early (for Ben) this morning (let's not discuss details such as numbers on time keeping devices. It's early somewhere...) and MHCs are on my mind.

I haven't failed at identifying MHC peptides in around 6 years or so. I think the blame ended up falling on someone else, and to be honest, I didn't know what the heck I was looking at anyway. I knew that they weren't tryptic, they might completely lack basic residues and they might have PTMs all over them. I was treating them exactly like any other sample that would be that insanely miserable to work with.

Fast forward 6 years. MHCs are coming and I'm diving through the literature hoping to find THE MHC IDENTIFICATION WORKFLOW OF THE FUTURE that I dreamed of in the past.

And...well...hmm....HEY! Here is a recent review...

It covers what MHC peptides are and different ways people have failed to identify them. They cover failures with transcriptomics, fancy immunoprecipitation protocols, metabolic labeling and reporter ion quantification proteomics experiments and more.  Heck, people have even been trying to do this with SWATH. I'm not even going to look at that section, I can guess how that turned out.. 😏  (SIDE NOTE:  Holy I have Emojis now?!??...thank you Blogger!) Is this a cactus!?!? 🌵 Why would I need a cactus Emoji!?!


Honestly, when we were looking at this before I was convinced my biggest problem wasn't the mass spec. I was convinced it was the data analysis. This review has good news, by the way. My old lab is still doing MHC work and I think they are primarily doing this with PEAKS, which the review is solidly in favor of. I'll probably request a free trial to see how it does with the data we've acquired so far.

While I'm on the data processing topic, the great scientists in my facility have had success with the approach described in this recent study.

This is really elegant in its simplicity. It takes the protein cleavage bias out of the data processing equation by allowing FASTAs to be generated that specifically contain just the MHC sequences. Sequest can't bias its scoring toward the peptides ending in K or R when your FASTA entry doesn't contain any.  Think of it like Xcomb for MHCs.

Okay -- and I made fun of the SWATH study mentioned in the review, but I'm actually printing that now as well. That study would require another work-around to the data processing problem -- spectral libraries! That could be a huge asset in searching MS/MS fragments from peptides that don't obey our super neat b/y ion rules!

I'm sorry if it seems like I made light of the complications of working with MHCs or studies where people have tried to advance this critical field in medicine. At least this sorry...

I'm glad to have a lot of stuff to read and to see how y'all have advanced the field and I'll be eagerly awaiting what is coming down the pipeline while we're struggling to work with these awful ions ourselves! 


Wait...where are the UVPD MHC studies...?...someone is doing that, right?!?!?  Someone living where there are cacti? (There are cacti in Austin, right? )


Thursday, February 8, 2018

How fast is an Orbitrap Fusion 1 with the "30Hz" upgrade in software 3.0SP1?

I swear, I'm not writing this post to annoy people. This is important information for those of us trying to design the best possible experiments and it is info that can sometimes be hard to get. As I continue to shorten my runs to get the highest possible throughput in samples and replicates per day out of my instruments, the scan speed of each instrument is of paramount importance to my quantification accuracy (I want 12 measurements across each peak -- minimum -- that's why I spend so much time working on "how fast can I get it")

In the newest Fusion software there are some fantastic new features. I 100% recommend doing the upgrade. I didn't have the most efficient upgrade path doing it on my own -- my friendly neighborhood FSE suggests operator error and I don't care, it needed a PM anyway -- what I do care about is the Fusion has been absolutely killing it since the upgrade and cleaning!

One new feature is the ability to run the Orbitrap at 7,500 resolution. The release notes describe it as "30Hz under specific conditions" and this is absolutely the case.

To see how fast the instrument can run, I loaded 200ng of a complex human cell digest. I set the fill times at increasingly lower intervals. From 30ms all the way down to 10ms total fill time for the MS/MS.

With a 10ms max fill time the MS/MS scan rate is right around 30Hz.

Want to know the surprising part? This run isn't even complete garbage! 10ms max fill with only 200ng on column? You do that for experimental purposes just to see how fast the instrument is -- in theory -- you don't expect IDs. And...I still pulled almost 1,000 unique protein groups on this run! Raise this to 22ms and the number of IDs I pull is a good bit better, though!

Okay -- so the Orbitrap can do 30Hz "under specific conditions" as advertised, and honestly it is a little better than this! Right on. However, one of the things we've seen really emphasized in the literature -- for example, in this study and this study and this study -- is the concept of sequencing speed as a measurement of what is fully achievable in a standard LC-MS/MS data dependent experiment.

The Fusion 1 with the new 30Hz upgrade can achieve a ddMS2-MS2 sequencing speed of 30Hz if you run the MS1 at 7,500 resolution as well. If, however, you wish to use a higher resolution you're going to take a hit in the overall sequencing speed.

I generally do my LFQ runs with at least 120,000 resolution at MS1. Might be overkill, based on 3 studies I've really liked this year, but that's how I'm going. This is around 4Hz. If we run the instrument with a TopN or use a TopSpeed method with 1 second then that 4Hz scan is going to have a massive impact on the overall sequencing speed.

This is where the math gets tricky (I'm working on it for a possible update of the Quadrupole Orbitrap Cycle time calculator...we'll see...). If you are using a TopSpeed method that doesn't require an MS1 scan every second, the hit in Hz isn't nearly so bad. For example, the proteomics methods seem to have a default of TopSpeed=3 seconds.

Let's check this math. If you run a 120,000 MS1 scan on the Fusion 1 (assume no overhead, so 4 Hz) and assume a 33ms time to complete the 7,500 resolution MS/MS scan -- if you do an MS1 scan every 1 second, it looks like this:

You are getting around 22-23 scans/second in the Orbitrap. This is comparable to the Q Exactive HF; albeit I am getting 2x the resolution at the MS1 (critical to me) and I'm getting half the resolution at the MS/MS (which I am totally fine with!)
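If you want to play with these numbers yourself, here is a minimal Python sketch of the same back-of-the-envelope math. It is not the cycle time calculator mentioned above, just the arithmetic from this post. The function name and parameters are made up for illustration, and it assumes zero overhead between scans, nothing running in parallel with the MS1 scan, and that fractional scans count.

```python
def msms_per_second(cycle_s, ms1_hz=4.0, msms_ms=33.0):
    """Rough MS/MS scans per second with one MS1 scan every `cycle_s` seconds.

    Assumptions (hypothetical, for illustration only): no overhead between
    scans, no parallelization, and fractional scans are allowed.
    """
    ms1_s = 1.0 / ms1_hz          # 4 Hz MS1 scan -> 0.25 s per scan
    msms_s = msms_ms / 1000.0     # 33 ms per 7,500 resolution MS/MS scan
    msms_per_cycle = (cycle_s - ms1_s) / msms_s  # time left over for MS/MS
    return msms_per_cycle / cycle_s


# One MS1 scan every second, as in the example above
print(f"{msms_per_second(1.0):.1f} MS/MS scans per second")  # prints 22.7
```

Pass a different `cycle_s` to see what a longer TopSpeed cycle does to the rate.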

However, what if you do only have an MS1 scan every 3 seconds?

You are going much faster than the Q Exactive HF. We're looking at >27 MS/MS scans per second. Niiice! This is what I'm looking for. It's still a far cry from...

...but you probably got a tribrid system because you wanted an ion trap. It is worth noting that this recent study states that the ion trap on the Fusion II system exceeds 60Hz when using the ion trap in conjunction with the Orbitrap. If speed is your #1 concern you can always use that second mass analyzer!