Thursday, May 31, 2018

Why is there a crab on the cover of JASMS?!?

What's this about?!? I believe some people are in need of these instructions!

Actually -- the study that scored the cover is really cool, so I'll give them a pass this time.

Multifaceted is a very nice word for -- this team did a ton of work -- including neuropeptide imaging?!? Check out how cool that looks (actually, it looks better in the paper).

Sure -- it's crabs -- but what a promising use of technology I didn't know was possible. Begs the question of how far away is mammalian neuropeptide imaging?

Wednesday, May 30, 2018

Phase constrained deconvolution -- can resolve TMT11 plex in 32 milliseconds!

Could this be a hint of things to come?!?!

Last year some people at Thermo, including some Makarov guy, published a paper on a phase constrained deconvolution alternative to the fast Fourier transform (FFT) (that you can find here). Honestly, it looked like kind of a thought experiment. The Fourier transformation is a great math trick that allows you to go from the frequency of ion oscillations to masses in these instruments (but has, no joke, about a thousand other uses behind the scenes in today's world).
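If the "frequency to mass" part sounds like magic, here is a minimal Python sketch of the idea. Everything in it is a toy assumption of mine: the transient is two made-up cosine waves, the transform is a slow textbook DFT rather than a real FFT, and the calibration constant is hypothetical. The only physics borrowed is that in an Orbitrap the oscillation frequency goes as 1/sqrt(m/z), so m/z goes as 1/frequency squared.

```python
import cmath
import math


def dft_magnitudes(signal):
    """Naive discrete Fourier transform; returns magnitudes for the first half of the bins."""
    n = len(signal)
    mags = []
    for k in range(n // 2):
        total = sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        mags.append(abs(total))
    return mags


def mz_from_frequency(freq, calibration):
    """Orbitrap-style relation: frequency ~ 1/sqrt(m/z), so m/z = calibration / freq**2."""
    return calibration / freq ** 2


# Toy transient: two ion packets oscillating at 20 and 35 cycles per record (made-up values).
N_POINTS = 256
transient = [
    math.cos(2 * math.pi * 20 * t / N_POINTS) + 0.5 * math.cos(2 * math.pi * 35 * t / N_POINTS)
    for t in range(N_POINTS)
]

magnitudes = dft_magnitudes(transient)
# The two strongest frequency bins correspond to the two oscillating ion packets.
strongest = sorted(range(len(magnitudes)), key=lambda k: magnitudes[k], reverse=True)[:2]
top_bins = sorted(strongest)

CALIBRATION = 1.0e5  # hypothetical constant, not a real instrument value
for k in top_bins:
    print("frequency bin", k, "-> toy m/z", round(mz_from_frequency(k, CALIBRATION), 2))
```

Note that the higher frequency maps to the lower m/z, which is why the small reporter ions are the ones oscillating fastest.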

The phase constrained deconvolution goes beyond what the Fourier transform can do -- pushing into the limits of "Fourier uncertainty" where our instruments currently don't, and improving both resolution and sensitivity.  Again -- cool paper, but ---

-- this is WAY COOLER ---

Keep in mind that this chart is biased (smaller m/z gets higher resolution in the Orbitrap -- even with phase constraint model) -- but look at these flippin' numbers for the TMT reporter region resolution (this is an ultra high field Orbitrap -- I think this is the D20 with 5kV on the central electrode, but don't quote me --if I'm right, this is what is in the Tribrids and HF/HF-X; D20 with 2.5(?)kV is Elite)

Only 32 ms --- that is 31 Hz (ignoring overhead) and they got 80,000 resolution at the 127 marker -- enough to resolve the 6mmu (0.0063 Da) separation between the TMT127 N and C reporters.
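If you want to sanity check those numbers yourself, here is a rough Python sketch. The reporter masses are the standard monoisotopic TMT 127N and 127C reporter ion values. The function names are mine, the "peaks one peak width apart" criterion is a simplifying assumption (real baseline separation needs a good bit more resolving power than this prints), and the m/z 200 reference point for the resolution setting is an assumption about how your instrument specifies resolution.

```python
def scan_rate_hz(transient_ms):
    """Scans per second if the transient were the only cost (no overhead)."""
    return 1000.0 / transient_ms


def resolving_power_needed(mz_a, mz_b):
    """m / delta-m for two peaks whose centers sit one peak width (FWHM) apart."""
    center = (mz_a + mz_b) / 2.0
    return center / abs(mz_a - mz_b)


def orbitrap_resolution_at(mz, ref_resolution, ref_mz=200.0):
    """Orbitrap resolution falls off as 1/sqrt(m/z) relative to the reference m/z."""
    return ref_resolution * (ref_mz / mz) ** 0.5


TMT_127N = 127.124761
TMT_127C = 127.131081

print("scan rate at 32 ms:", round(scan_rate_hz(32.0), 2), "Hz")
print("127N/127C gap:", round(TMT_127C - TMT_127N, 5), "Da")
print("minimum m/dm:", round(resolving_power_needed(TMT_127N, TMT_127C)))
# What a 50,000 (at m/z 200) setting works out to down in the reporter region.
print("50k setting at m/z 127:", round(orbitrap_resolution_at(127.13, 50000.0)))
```

The last line is the reason the chart is "biased": the same instrument setting is worth more resolution at m/z 127 than at m/z 200.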


Okay -- so I have to throw this in, because I've got a meeting with the developer of this thing planned --

--Cause -- Yury's booster system appears to be right at this same level --

The box on the right isn't the clearest, but the red peaks show complete baseline resolution of the 127 and 128 TMT C/N isotopes using 15,000 resolution on a Fusion/Lumos. Given that we use 50,000 resolution to resolve these in our lab -- 15,000 translates to roughly 3 times faster!!

Plus, I've been told that it's really easy to just put it out of sight in case your service engineer is all nosy. If he's coming to fix one system you can just take the FTMS Booster and plug it into another one and use it on that one. Whichever one needs a ton of resolution that day.

I'm not trying to stir anything up this morning -- the Phase constrained deconvolution looks amazing -- but the FTMS booster is something that I could buy today (if I had the budget for it) and it costs less than an HPLC that I really want.

EDIT: MassSpecPro posted this link yesterday -- understanding the Fast Fourier Transform. If you want to know more about all of this stuff, this is a great starting point!

Tuesday, May 29, 2018

Let's check out the new PASEF TOF thing!

Have y'all seen this thing yet? I just got to see a great talk on it and I'm ready to say that -- for bottom up proteomics -- this is the most powerful Time Of Flight instrument we've ever seen.

EDIT 6/9/2018: I've spent some time talking to smart people about this thing and I've -- overall -- felt kinda dumb about this post for a number of reasons
1) I got really annoyed about my inability to see the instrument data or process it and kind of went ranting
2) The rants on this page aren't based on real facts, just some grainy marketing info on this device. I should really withhold judgment on this thing until I can see some real data -- however -- I definitely don't want to seem like I'm endorsing it until I do. I've seen MadMen. Marketing people are scary.
3) When in doubt we should be encouraging competition in the marketplace. Just look at what the resurgence of AMD has done in the PC marketplace. Intel has real competition for the first time in years and we're back to exceeding Moore's law for microprocessor performance! Maybe the same thing can happen on the mass spectrometry front.
4) Now I'm afraid people think that I've actually ranked my dogs in the order in which I would eat them if I had to....

I'm going to leave this post in place as a reminder to myself to think before I post. The walkthrough on how to look at Bruker .d data in MaxQuant does appear correct, if you use the new version. And I'll put up a new post later when I actually review some data.

There are a lot of bells and whistles in the instrument, but probably the central technology is PASEF which was described here in 2015. TOFs are FAST. Screaming fast. But -- sensitivity is a serious problem. You need to accumulate some ions before you shoot them down a tube toward a detector -- especially one that is several feet away -- to get anywhere near the sensitivity of a quad, a trap, or an Orbitrap.  PASEF and the TIMSTOF allow accumulation of ions before shooting them and that can bring up the sensitivity a lot. They also allow parallel filling and that speeds everything way up.

Okay -- so -- let's not go into details, but someone sent me a Zip file with some data from one of these things. I'm normally not nervous about science stuff -- but, holy cow, it is really hard to find data from one of these things. I'm sure that with brand new technology it probably takes a while to get everything formatted right for public uploads and things, but when I do see some uploaded via ProteomeXchange partner, I'll feel less like this....

If you get into the file -- you'll find it is an SQLite format (interesting!). That is what the processed data from Proteome Discoverer is -- so if you want to open it you will need to go and get a free copy of SQLite -- you'll need 4.5 or newer -- and if you don't want to hunt for it, you can get it here for Windows 64-bit.
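If you'd rather just poke at the file from a script, Python ships with an SQLite module, so a peek inside can be sketched like this. This is a minimal sketch under assumptions: the function names are mine, I have no idea what tables your particular file will contain, and any path you pass in is whatever file you found inside the data folder. It opens the file read-only so you can't accidentally change the raw data.

```python
import sqlite3
from contextlib import closing
from pathlib import Path


def open_read_only(db_path):
    """Open an SQLite file read-only so the raw data cannot be modified by accident."""
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True)


def list_tables(db_path):
    """Return the names of all tables in the file, sorted alphabetically."""
    with closing(open_read_only(db_path)) as conn:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    return [name for (name,) in rows]


def count_rows(db_path, table):
    """Return how many rows a table holds."""
    # Table names cannot be bound as query parameters, so quote the identifier instead.
    quoted = '"' + table.replace('"', '""') + '"'
    with closing(open_read_only(db_path)) as conn:
        return conn.execute("SELECT COUNT(*) FROM " + quoted).fetchone()[0]
```

Calling `list_tables("path/to/your/file")` (placeholder path) and then `count_rows` on anything that looks interesting is usually enough to tell whether the file holds what you think it holds.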

You'll definitely need to check first to make sure you have a .NET framework that is this new. Here is a guide for how to do that. This will require administrator access. When you get there you'll see something like this, probably. If you do, you'll need a .NET framework upgrade.

To get that, Google "Download .NET framework 4.5" and verify that you're on the legit https:// site (lots of fakes out there). Close everything, upgrade, reboot and come back to this. If you're on Windows 10, this should already be preinstalled. No problem.

To process it you'll need a recent version of MaxQuant, the newest version of PEAKS (haven't checked this one) or -- it sounds like the most recent Mascot! Easy! I'll go with MaxQuant. We need to get reintroduced anyhow.


Okay -- let's assume operator error, put a note on the MaxQuant discussion board and try something else!  EDIT: See bottom of post for correct MaxQuant version to open this data.

Here's a cool video that talks about the instrument! Let's check it out!

This is a really solid video. Dr. Rather explains how the PASEF thing works, and how we have to think about TIMS data in a different way than we're used to seeing. Being a guy who had the first MS/MS spectrum he ever published tattooed on himself, there's obviously some things I'm interested in seeing. At 22min and 37 seconds you'll find a spectrum to check out.

I drew two question marks here. Let's look at the one on the left first!

If you go to Google and type in "proteomics red elephant" it will take you to a blog post I wrote about a tool I use every single day, the pHpMS webserver. I love this tool.

If I punch in the sequence for this peptide in the Fragment Predictor it tells me this instrument is getting the right doubly charged mass

888.94 compared to 888.9329 -- I'm not going to be a jerk here, for real. Yeah -- I like 4 decimal places. Honestly, in this range we're only accurate (without post-acquisition recalibration) to the 3rd decimal, even on a Fusion 1. Even if we assume the very worst, that the measured mass was 888.9449, this is only 13.49 ppm off, right? I'd rather eat my dog that I like than send a peptide to a collaborator where my MS1 was off by 13.49 ppm, but this is the very very maximum, it could easily be 888.9351 rounded up, which would only be 2ppm out. For reference sake, I wouldn't eat my other dog for less than 8ppm.
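For anyone who wants to redo that arithmetic, here is a small Python sketch of the ppm math. The function names are mine. The second helper just works out the range of real values that could hide behind a number that has been rounded for display, which is the "very worst case" logic above.

```python
def ppm_error(measured, theoretical):
    """Mass error in parts per million, relative to the theoretical value."""
    return (measured - theoretical) / theoretical * 1e6


def rounding_bounds(displayed, decimals):
    """Lowest and highest true values that would round to the displayed number."""
    half_step = 0.5 * 10 ** (-decimals)
    return displayed - half_step, displayed + half_step


THEORETICAL_MZ = 888.9329  # doubly charged m/z from the fragment predictor

# Worst and best cases discussed in the text.
print("worst case:", round(ppm_error(888.9449, THEORETICAL_MZ), 2), "ppm")
print("kind case:", round(ppm_error(888.9351, THEORETICAL_MZ), 2), "ppm")

# Full range of errors consistent with a label that only shows 2 decimals.
low, high = rounding_bounds(888.94, 2)
print("range:", round(ppm_error(low, THEORETICAL_MZ), 2), "to",
      round(ppm_error(high, THEORETICAL_MZ), 2), "ppm")
```

The point of the range is that a 2 decimal label on a spectrum can't tell you whether you're looking at a 2 ppm instrument or a 13 ppm instrument.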

How do the MS/MS fragments line up? It's tough to tell with it being all no numbers and stuff, but we can extrapolate. y10 should truly be 1203.5423 and y14 with z=2 (**in the figure) should be 853.4143. If we assume that gap between the scale markers at 1200 and 850 are the same --

--- It's probably just the visualization tool and a low res image of a low res image, right? y14 sits precisely on the 850 marker and y10 does not. The visualization tool used here could be at fault, or, bad as my eyes are, I could be misjudging what appears to be a 3 proton gap in one place and a nonexistent 3 proton gap in the other. I don't even know why I'm still talking about this. I'm sure it's fine.

Let's go somewhere else. What about phosphoproteomics!?! Wait. I didn't do the other question mark. Ummm...let's come back to why we fragmented the same unmodified peptide 3 times...later...or not ever....
(Edit: 6/9/18) Surely it can do monoisotopic averagine modeling, right? Q-TOFs 10 (15?) years ago could do that.... I got some vague answers when I asked. I'm a little hung up on this one. Monoisotopic precursor selection (called MIPS or PeptideMatch by some manufacturers) is a big deal. Chances are you don't know your instrument is even doing it. If you want to see whether it's just a dumb button or not, go in, turn it off and run that sample again. Actually -- new topic. Go to this paper from Dave Muddiman.  Turning off MIPS cut the IDs in half, and that version of MIPS was not very good -- at least compared to today's algorithms. (That one would throw out SILAC 'cause it couldn't resolve the isotopic differences. Buzz me if you need a reference on that, my memory is problematic, but I'm seeing some hazy details -- something about a smart ultra-marathoner from somewhere in Texas who showed that conclusively and the manufacturer fixed it...MIPS is that important! Dr. Bob Swaim would know exactly who I'm talking about, so I'll email him if anyone asks me about it. If you know him, email him for faster turn-around.)

There is this REALLY cool thing and it's called the PhosphoProteomics eBook for this device. You can look it up on the Google. There are two spectra in it --- of basically coeluting isobaric phosphopeptide species (there was an AMAZING talk at ABRF about this from [correction -- Brian Searle gave this talk -- consulted my notes] about how often biologically relevant isobaric phosphopeptides elute together, but I haven't verified he's published what I want to ramble about yet). The isobaric species in the eBook look like this --

Here's the pitch -- without the ion mobility, you'd never be able to tell these two isobaric peptides apart. They'd coelute, muddy the water and you'd never figure them out.

Back to my outdated obsession with accuracy in scientific instruments in the year 2018 (bear with me, please) we have two isomers and one has a mass of 546.2626 and the other isomer has a mass of 546.2536

PPM calculator says --- 16.47ppm....I really like my dog. For real....who wouldn't!?!? Check out this dumb Mufasa thing he does sometimes. That is a really dangerous place to stand.

...but if you think that I wouldn't choose to eat that big majestic monster before I'd tell someone I had two isomers at >15ppm apart and the only evidence I had to back their identity up in a human digest was fragmentation data scaled to the nearest 500 Da and one of the isomers has no evidence of neutral losses whatsoever.... I'm joking, of course. I'd probably eat Bernie if it came down to it, but definitely not Gustopheles.

Edit 5/29/18: Crazy idea I need to leave here so I don't forget. Could we simulate the 2 phosphopeptide fragmentation patterns and then see if Ascore or phosphoRS could tell them apart?

Wow. That was SO MANY WORDs. Not sure why I wrote them all now. Let's sum it up.

At ASMS we're going to hear a lot about a really cool new parallelized ion mobility trapping TOF thing.

It's probably the best TOF for proteomics the world has ever seen.

Yeah --- it can get upwards of 100 scans/second -- but at that speed it can't do sophisticated calculations like monoisotopic precursor selection on the fly. It also may not be capable of dynamic exclusion the way we're used to seeing it. And -- the mass accuracy isn't what we're used to seeing these days from Fourier transform or even modern TOF based instruments.

I'm not saying it's a bad tool. We've discussed how it might fit into our workflows at my day job, but there is this inherent danger in the explosion in value in the mass spectrometry commercial space when business marketing teams get to run ahead of peer reviewed science. There's a lot of flash around this big new box, but -- man -- there sure isn't much on the evidence side yet...

Update 5/28/18: MaxQuant was released on 5/27/18. It has native PASEF TOF support!

Monday, May 28, 2018


You know what I could really use? A --

Super Lazy phosphoproteomics protocol!

[Super Laid-back was suggested this morning. I dig it.]

[Stream Lined? That also works 😺 ]

11 channels for quan? That sounds like a good start!

Spin column phosphopeptide enrichment and elution?

SPS-MS3? Okay -- I can work around that. Especially if the results are really this good....

Sunday, May 27, 2018

Even more features added to SearchGUI/PeptideShaker!

About time someone published an updated paper on what you can do with SearchGUI! I didn't even know about all the stuff on the right side of this picture, but you can read about it all in this new study at JPR here.

SearchGUI is an amazing central interface for a ton of different open search engines -- including the deNovoGUI (which now has DirecTag and the super crazy no-way-it's-that-fast Novor de novo engine)

Now this paper reveals all the work that is going on downstream -- links to Galaxy!??

Saturday, May 26, 2018

BoxCar/BoxFahrt real data and new mysteries!!


So...I'm confused. So far I've had exactly zero luck with forcing BoxFahrt to work on our QE HF using Thermo's factory issued software. The great Dr. Antonius Koller (now of Northeastern University if you can't reach him through his old CUMC account and want to bug him while he's getting set up) and I have been in touch a lot as he has been working on making use of the basic time saving logic behind BoxCar to improve his results. He came up with a work-around this week (raising the default mass settings to match the width of the BoxCar!!) that I haven't been able to try yet, but so far...

While editing (I'm trying to do a 48 hours before posting rule now, so I seem slightly less odd, and don't tell you things like "I'm writing this from a 4 day death metal festival". I already like the blog less. P.S. I'm an adult, I'm definitely not blogging from a tablet and waiting for the Ruins of Beverast) I came across a reader comment -- one major problem with the QE manufacturer software is that you have just one inclusion list. If you use it for your targeted SIM -- it's now problematic for your T-SIM dd-MS2 -- which might be the main misconception for why people (like me) have always thought that method doesn't work. It isn't doing dd-MS2 within your window, it is doing T-SIM and then only doing MS2 if it sees what you're looking for in your T-SIM.  Toni's work-around (essentially increasing the T-SIM inclusion mass accuracy cutoff to include the entire BoxCar) helps over-ride this.

As a side note -- why hasn't a complete industry popped up of people selling software to alter instrument software? For real -- there are thousands of them out there that could be improved. There is only one company I know of -- and they might be closed now, I wrote them for quotes about a month ago... you can run the Q Exactive with Visual Basic for goth sake. In the back of a lot of our brains is Basic -- we had to use it in order to be able to play video games. Commodore 64, yo!

Back to the awesome Bill Murray meme!

I'm not kidding. And I'm not cheating. No MS1 or MS2 spectral libraries. No FASTA with 7e6 entries. Just Proteome Discoverer, UniProt Human (and cRAP) FASTA entries. And BoxFahrt.  Heck, the chromatography doesn't even look that great.


I'll post the method iterations. There is a lot to learn here on the Fusion -- and lots of room to improve from where I am right now.

However -- the approach isn't without some mysteries and drawbacks right now.

Mystery  #1) I can't use Morpheus with these files. No idea why. I get loads of PSMs and Peptide groups, but I only get 2 (possibly the same 2) proteins past 1% FDR. If it is the same 2 proteins, for real, we need to figure out what is special about them. I bet they're full of ANGST.

Edit: 5/28/18: The development team (If you've never been up to Wisconsin to see why they're so great at mass spec -- you should try to go visit. There are such great people up there doing such brilliant stuff -- plus that's a cool town) has reached out to see why this isn't working and I'm sending files now. Thanks for looking at this, Zach!

Mystery #2) Percolator in PD 2.1 HATES these files. HATES them. I only found out by accident by using the default Thermo Fusion basic ID workflow (I think it only corrects by target decoy at the peptide group level).  This is what gives me the almost 6,000 protein groups. Gotta check on that.

Throw in Percolator --- less than half the PSMs make it through the filter, knocking the BoxFahrt 400ng 90 minute HeLa runs down to less than 2,500 protein groups in 85 minutes.

Mystery #3) Are these spectra crap? Well -- they are ion trap -- so they are crap (kidding!!) -- but they aren't any worse than any other ion trap PSMs by eye -- let me know if you want to see and I'll send you the processed data. The image at the very top is my very worst MS/MS spectrum (the default workflow appears to require a minimum XCorr of 2.0 -- which -- back in the day when I'd totally spend multiple days at a death metal festival wondering when I'd run into those fun guys from the Hunt lab -- who are probably also too grown up for this stuff -- I'd have considered pretty darned good).   However, I can't objectively say whether 2e5 MS/MS spectra are worse or better, but wouldn't it be cool to think that there is something important here that Percolator doesn't like about these spectra?

Maybe they're too large? Wait -- where is that picture I made last night...? I'll find it and add it in later. I tried to overlay histograms of the charge and MH+ for peptides ID'ed with each approach. It looks like the stuff that Percolator is throwing out that Target Decoy is keeping are considerably larger and higher charged peptides, but this is inconclusive with the amount of time I have right now.

Mystery #4) Minora doesn't work AT ALL. No traces, no quan and this is a major drawback for me.

I've got some samples in I've been dying to run all year and BoxFahrt gives me loads of peptide IDs, but I need quan -- I had to resort to spectral counts (yes, I died inside a little -- but I didn't throw up or anything...I'm an adult (warning! sound)-- a spectral counting hating adult....) and they lined up with what we know from the phenotype/RNASeq for these cells-- awesome -- but I need real quan -- so the samples went back to EasyStar (IonStar for people with EasyNano and EasySprays -- see -- I'll steal method names from anyone, including my friends and collaborators. IonStar is a much cooler name). Putting results here has been on my to do list for a while. The 50cm is pretty darned close and limited runs with the 75cm EasySpray PepMap suggest that it might have more theoretical plates than the 100cm 3um column. But now I'm off topic.

Here is my best Fusion 1 BoxFahrt method iteration so far. 
Edit 5/28/18 -- here is the link. That would be useful, I guess.

It uses 60,000 resolution MS1 for 3 T-SIMs with each T-SIM getting 1.5 seconds to do as many ddMS2 ion trap MS/MS scans as it can. I use the "use all parallelizable time" AGC target over-ride feature.

If you raise the T-SIM MS1 target any higher (actually, I only tried 5e6) you lose IDs (n=1) ~10% loss.

I tried 120,000 resolution MS1 and it cost me 15% IDs.

I tried turning off the fill time over-ride and that cost me 6-8%

If you have the Fusion 2, it may be possible to alter your MS/MS isolation windows for the msxT-SIMs. I can't do it on my Fusion 1 with this tune build....bummer....

Wow. That's a lot of words -- conclusion?!?  If I can deal with the temporary loss of some of my favorite tools -- if I use staggered msxTSIM-ddMS2 on my Fusion 1 with parallelization in the ion trap, I might possibly be getting the best results I've ever seen from any instrument.

Thursday, May 24, 2018

New bioRxiv paper shows how the EvoSep works!

I've mentioned the EvoSep on here at least once before, but it has been a mystery how this ultrafast and low-to-zero carryover system works. I know that it uses disposable columns for each sample but this new open-access pre-print finally shows how it works.

It's really smart and a nice step forward for clinical proteomics or anything where any carryover is going to sink you.

Wednesday, May 23, 2018

Jailbreak 2.0!

Are you feeling a little limited with your awesome and super easy to use EasySpray source? Want to power it up?

Check out EasySpray JailBreak 2.0! (unless it's illegal -- then don't. and don't tell anyone I sent you to this site)

The original JailBreak kit let you put any nano columns into your EasySpray that you wanted to -- 2.0 lets you go up to MicroFlow -- WITH SHEATH GAS CONTROL!

We're hearing lots about MicroFlow right now -- which -- depending on how you define it, your high flow LC or your nanoflow LC can probably do it (even EasyNanoLCs -- just make sure your gradient is short enough that you don't have to restroke the buffer pumps).

This source addition is the missing link.

Worth noting, the manufacturer does have MicroFlow columns and emitters now, but if they don't have the solution that works for you this looks like a viable option.

Saturday, May 19, 2018

Analysis of PNGase F-resistant glycopeptides with SugarQb!

Is it glycoproteomics week? Sacred Bos taurus, there have been so many awesome papers this week.

If this is your field -- or if you know it is coming your direction, you'll be happy to know that every aspect of it appears to be developing rapidly!

I recommend checking out --

The GlycoPeptide Decoy Generator (and a new Glycopeptide CID library!)

This new metal based enrichment strategy for glycopeptides!

This new proteoglycan deep sequencing paper (they use some neat enrichment with a QE HF and process the data with PEAKS in conjunction with an awesome software package, GlycReSoft, that they developed. While you're there, check out the gold mine of other neat little algorithms they have posted!)

Told you!! One heck of a week for glycoproteomics!!

The title of the post is about this one, though! 

I'm out of blogging time today, but if that title doesn't interest you (cryptic specificities? what?) you're probably here for the jokes -- but if nothing else it's great to see SugarQb being put to use. In case you don't have this free node installed in your copy of Proteome Discoverer 2.1 -- you should -- you can get it here.

Friday, May 18, 2018

Addressing more BoxCar/BoxFahrt comments!

It can be hard to both post and to find comments that are made on this dumb blog. Especially when they go on separate posts. It helps to address them directly sometimes! In no particular order

Q1) Is the Xcalibur add-in available yet?

--Not yet, I don't think. I'll probably run around screaming when I find out it is.

Q2) How can you process the data currently?

-- For the TMT stuff we've been doing (LFQ runs are on this weekend when I did the math and realized there were a few (very rare in our lab) open hours on something!) Proteome Discoverer 2.1 has no problems with the data. Actually -- I know the peptide ID is great, but I need to do the quan comparisons later. I haven't tested PD 2.2 (I'm using IMP-PD nodes for this project and they aren't all available in PD 2.2 yet)

-- The MaxQuant version in the BoxCar paper is specifically equipped for real BoxCar data. I don't know yet (maybe next week) if it can handle BoxFahrt.

-- Testing is in progress right now for RAWQuant --- which, honestly, deserves its own post. It's REALLY cool and I think it is something that we need to integrate into our data processing immediately.

NOT A Q!!!)  Okay -- so -- thank you Chris -- I didn't know that the Fusion has features to allow you to select individual isolation windows. I will evaluate this immediately.  If you can optimize your isolation windows to spread out the densest regions of ion current (like BoxCar does) -- the results I'm getting on the Fusion right now might just be the beginning of the improvements I'm seeing!!

Q3) How does this differ from WiSIMDIA? It's got some similarities in that we're doing gas phase fractionation for the MS1 -- and WiSIMDIA is probably a good starting template for how BoxCar can be adapted to DIA. BoxCar staggers the isolation in the MS1 and allows for a more even distribution of MS1 ions than WiSIM -- and that even distribution allows lower intensity ions to come up out of the noise and be selected for fragmentation.

Thursday, May 17, 2018

Finally learned the XlinkX workflow and nodes!

High on my "to do" list for 2017 was to learn how to use the new XlinkX workflow and nodes. I didn't get to it -- and there I was just wandering through Mount Ember minding my own business and --

I hadn't battled crosslinked proteins in half a decade, but we recently got the XlinkX workflow added into one of our PD boxes --- 2 hours later (mostly because I was reading stuff I didn't actually need to)

(Not my sample -- just being cautious) but HOW COOL IS THAT OUTPUT? Here is what we think it is (very top) -- here is the MS2 evidence (first spectra). Bottom panels -- Here is the MS3 evidence for each side of the DSSO crosslink!

If you did crosslinking in the past and you still wake up from nightmares of the experience, I really recommend you check out this new generation of crosslinking reagents, instrument methods and data processing software. For real. I think the nodes for Proteome Discoverer are $500 in the US. The DSSO reagent set us back $100?

Wednesday, May 16, 2018

EASI-tag -- some lab in Germany is working on new reporter ion technology!

What a busy week! Some labs would be content with identifying a major point of inefficiency in every mass spectrometer in the world and demonstrating a strategy to confront it -- and maybe take some time to sit around and feel smart about it.  This new preprint shows that this isn't how they do things. 

If you're thinking "hey! we have lots of reporter ion tagging technologies already. what could this one add?"

What if I said that you could take 3 samples and label them and mix them in a ratio of 1 to 12 to 144 -- and when you process the data that your output was 1 to 12 to 144? In MS2 -- no funny, time wasting, MS3 tricks, no isotope suppression!

It is worth keeping in mind that this was a single shot of yeast digest on a 95 minute 45cm Dr. Maisch column on a QE HF -- this setup alone (sharp peaks, sample with only around 4,000 proteins on a fast instrument) would probably cause some reduction in ratio suppression, but when TMT10 first came out, Dr. Min Du and I 2D fractionated some human protein digest we labeled in 1:2:4:8:16 -- and 16 doesn't look like 16 -- it looks like 8-10 on a QE Plus. (This is the example set in any of the TMT/iTRAQ Proteome Discoverer processing videos I've made over the years.) If you could really see a 144 fold difference in MS2 scans? This is huge.

How's EASI-tag do it? The reporter fragments off at significantly lower energy than it takes to break the peptide backbone. The authors use a 2 stage collision energy, one low, and one normal.

They also do two things to the QE HF I'm not sure I understand the logic behind.

#1 -- They offset the isolation window for MS/MS
#2 -- They use a special setting they've developed for the QE HF software to preferably isolate the monoisotopic peak. Since we're adding a tag to these peptides, this is about the opposite of what we normally do -- for example --

When looking at a peptide greater than around 1600Th -- the M+1 peak becomes the most intense species, statistically. On something as large as this peptide, the M+2 is almost as abundant as the monoisotopic.  Since I'm signal starved and the heavy isotopes are distributed evenly across the peptide (and the Orbitrap onboard computer can identify the monoisotopic -- regardless of what is fragmented), I generally want the M+1 to be fragmented....

OH -- They describe the reason why they did this in Figure 1 c. Both the preferential selection of the monoisotopic and the offset. I think it has a lot to do with the particular characteristics of this tagging technology and doesn't mean I should start reoptimizing my other reporter ion experiments.

The co-first author on this great new study is one of my fellow instructors at the Advanced Proteomics summer school in Vienna in July! I can compartmentalize it in my brain (forget about this entirely -- there isn't all that much space in here...) and ask a million questions this summer!

Tuesday, May 15, 2018

BoxFahrt (BoxCar-ish on a Fusion -- no hacking required!)

I'm on something like iteration 48 -- but I think I've got it.

First off -- if you haven't seen BoxCar yet, you should check it out.  Once we all can do it, I think it will change how shotgun proteomics is done from here on out.

That is the kicker, though. It's going to be a bit before we can all do it -- and a bit longer before all of our data processing software can handle the output (mucking about with the MS1s is rough on label free quan.)

You know what doesn't have a downside -- just a massive potential upside? REPORTER ION QUAN.

I'm on iteration 48 and (with some false starts) getting massively better data using the native Fusion software (whichever version came out in December -- the one that adds the "30Hz" and the funny quad isolation glitch if you use IC).

I won't walk you through all the stuff -- but if you click on the picture above you'll see what I've gotten around to method-wise.

3 Instrument methods (or segments) consisting of just these things:

The three segments are identical -- with the exception of the fact that each segment has a separate set of 10 T-SIM scans. For TMT, where my care for the MS1 scans across the peak is not the highest, a total cycle time of 4.5 seconds is just fine for me.
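To make the "separate set of 10 T-SIM scans per segment" idea concrete, here is a rough Python sketch of how staggered windows could be laid out. Big assumptions, so take it as an illustration and not my actual method file: the 400-1200 m/z range is a placeholder, the windows are all the same width (real BoxCar sizes its boxes by where the ion current is densest), and the function names are mine. The 3 segments, 10 windows each, and 1.5 seconds per segment are the numbers I'm actually using.

```python
def staggered_windows(mz_start, mz_end, n_segments, windows_per_segment):
    """Split an m/z range into equal boxes and deal them out to segments in turn.

    Neighbouring boxes always land in different segments, so each segment
    samples the whole range with gaps that the other segments fill in.
    """
    total = n_segments * windows_per_segment
    width = (mz_end - mz_start) / total
    segments = [[] for _ in range(n_segments)]
    for i in range(total):
        low = mz_start + i * width
        segments[i % n_segments].append((round(low, 4), round(low + width, 4)))
    return segments


def cycle_time_seconds(n_segments, seconds_per_segment):
    """Total time before the same segment comes around again."""
    return n_segments * seconds_per_segment


# Placeholder m/z range; swap in whatever your method actually covers.
segments = staggered_windows(400.0, 1200.0, n_segments=3, windows_per_segment=10)
for number, windows in enumerate(segments, start=1):
    print("segment", number, "first three windows:", windows[:3])
print("cycle time:", cycle_time_seconds(3, 1.5), "s")
```

Dealing the boxes out round-robin is the simplest way to guarantee that no single segment gets two adjacent boxes, which is the whole point of the staggering.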

What I'm running right this minute uses 60k MS1 scans, so this is the screenshot, but what I've been running has primarily been 120k -- it just hurts me to use 600ms on MS1 scans -- even when they're this good (and HOLY COW, the MS1 scans are good -- see below).

The individual segments are easy to set up. Important to note, the AGC target on the Fusion is the target in the box DIVIDED by the number of MSX events (took 5 runs to figure that out from the scan header -- I was still impressed by the quality of the data)

It's really easy to see the improvements in the MS1 signal distribution as shown in the BoxCar paper (I've got a bunch of examples just like this -- I'm not picking and choosing). In the top, this obnoxious singly charged peak uses up the entire AGC target -- you can't even see the 843 or 722 or a bunch of other really good ions -- msx-T-SIM it -- there they are!  Then -- you start to wonder -- if dropping from 1e9 to 1e7 is gonna work out real well and you realize you divided 20ms fill time between 10 T-SIMs...
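Since that divide-by-the-number-of-MSX-events thing cost me 5 runs, here is the arithmetic as a tiny Python sketch. It assumes the behavior is exactly what I read out of the scan headers (the number in the box is shared evenly across the multiplexed windows); the function names and the example target value are hypothetical, and only the 20 ms across 10 T-SIMs comes from my own method.

```python
def per_window_share(value_in_box, n_msx_events):
    """What each multiplexed window actually gets if the box value is split evenly."""
    return value_in_box / n_msx_events


def box_value_for(desired_per_window, n_msx_events):
    """What to type in the box so that every window ends up with the desired amount."""
    return desired_per_window * n_msx_events


N_TSIM = 10

# 20 ms of fill time spread over 10 T-SIM windows.
print("fill time per window:", per_window_share(20.0, N_TSIM), "ms")

# Hypothetical example: wanting an AGC target of 1e5 charges in each window.
print("AGC target to enter:", box_value_for(1.0e5, N_TSIM))
```

If the per-window number looks tiny, that is the cue to raise the value in the box before blaming the method.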

Looking at the Fusion scan headers has caused people to eat Tide Pods (I can't prove this) but it helps if you think about them historically.

I highlighted the ion injection time -- it brings the first one to the top. This isn't a sum of all the multiplex injections, just the first or shortest one (difficult to discern because the first one is almost always the shortest). You'll see in the second line, the injection times of the first(?) 6 injections. Your fill time is a sum of those (plus the other 4 that can't be displayed).

Thinking about it historically -- how old is Xcalibur? Somebody probably knows. I don't, but if you told me that if you dug really deep into it you'd find Xeroxes of the punch cards that were used to code it, I wouldn't be shocked. 

How long have we been able to multiplex? 2012, when the QE came out? And even on the great Q Exactive Classic, I've never successfully multiplexed >5 ions. If the scan header only has enough room for 6 multi-injection spaces -- it makes sense to me.

Getting MS1 improvement is easy. The tricky part was getting the Fusion to fragment the stuff that I tell it to -- the most abundant, MIPS passing, peptides it sees. If you use an MSX and MSX control, it seems to get confused, cause this is what I tried first --

Don't get me wrong, the MS1 scans look great! But it didn't select anywhere near the number of MS/MS scans that appeared available (to me -- admittedly limited measurements). It is quite likely someone smarter will have better success, but --

Breaking them out into separate segments appears to get me more IDs -- even just using big T-SIM windows in a single MS1 scan appears to help!

Let's sum this up.

#1) I'm a BoxFahrt believer
#2) OMGauss, I can't wait to be able to do BoxCar right
#3) Umm....I'm getting an impressive boost in my TMT labeled peptide IDs with BoxFahrt -- can you imagine what BoxCar can do!??! 
#4) I'm doing TMT11! I'm, of course, not even using the ion trap on the instrument!   Very next thing for me (when I get the excuse to tinker again) BoxFahrt with parallelization for MS/MS in the Ion Trap!! I swear, the more I look at it, the more I think it will work better than the Q Exactive methods.... it could be filling and doing low res MS/MS scans simultaneously!

Monday, May 14, 2018

Become a power user (HACK YOUR FUSION!!)

Do you have an awesome idea for a better way to run your Orbitrap? There have been great examples over the years -- like Gygi lab adding TMT MS3 to their Orbitrap Elite and S. Gallien et al., adding IS-PRM to their Q Exactive and this recent BoxCar thing.

The manufacturer has always offered developer's kits and APIs for those of us who can find somebody with the skills to write some code and change things, but this is the first time I've heard of a course about it!

Conveniently released about 2 hours after my employer's travel people set up my flight, but starting early enough in the morning that I'd never make the beginning of it anyway (winning!) you early birds can register for this cool ASMS (Saturday June 2nd) workshop here!

Sunday, May 13, 2018

BoxCar updates!

So...ummm...BoxCar is a popular topic. I've hardly had a conversation since the paper came out that didn't end up with us talking about the paper and/or our independent reanalyses of the RAW files.

I'd like to bring everyone's attention to this comment on the blog by Florian Meier, the first author on the BoxCar study: 

Dear Ben, all, Thank you very much for your excitement about the BoxCar acquisition method. We are about to release an Xcalibur plug-in that will enable BoxCar scans without the hassle of tweaking the Xcalibur method editor or extra software from the vendor. Please give us some time to fix last bugs and follow for updates and download details once available. Regards Florian
(comment on "BoxFahrt-- BoxCar for people who can't alter their instrument software.")


The fact someone from this team wrote a post here (!!!) and with news THIS good? A Chuck Norris thumbs up is quite literally the most powerful approval I could come up with. (I've heard that on this take the camera survived, but the cameraman and key grip did not). 

Thanks, Florian! We can't wait to try it!! 

Saturday, May 12, 2018

IonStar -- Global proteomics with reproducibility as #1 priority!

Okay, y'all. This is going to look a little self-serving, because I've been lucky enough to contribute in some small way to this amazing project, but I'm on a mission.

This mission is to prove that:
1) Proteomics CAN be reproducible
2) Proteomics CAN contribute to clinical studies
3) Proteomics CAN be part of what we use to diagnose patients -- to find out when they're sick before it's too late and to help pick the drugs that they need to use to get better the fastest.
4) If proteomics focuses on what it can do to COMPLEMENT genomics and transcriptomics, rather than trying to beat them all the time (an exome sequencing is under $350, y'all, and a full 30-50x transcriptome might drop under $1k really really soon) at things they can do better and cheaper -- we can do great amazing things together. Do we really want to try and compete with that -- when they can't do any PTMs and have basically no ability to do proteoforms!?!?

I think an awful lot of people in our field are on this same mission -- but sometimes it doesn't seem like it -- because we don't seem to be able to stop messing around with the settings on our instruments and settle on methods that will make our experiments not only impactful for us for singular studies -- but also impactful for anyone who wants to go to ProteomeXchange and look at our data and compare those to other datasets.

I'm guilty of the same thing. Why have 112 settings I can change on my Fusion IF I DON'T TRY EVERY COMBINATION OF THEM!?!?!?!  I stayed up basically all night last night trying to make a QE HF do BoxCar (follow-up post coming -- I think I got it)

Jun Qu is also on this same mission. To prove it, his lab pretty much stopped changing their sample prep and mass spec methods a couple years ago -- and it's reaping some amazing dividends (more papers published -- just since 2017 -- than I've published in my career...).  So I'm going to present yet another great IonStar paper here.


7,000 proteins ID'ed and quantified in mammal cell lysates with no missing values
Introduces IonStaR Stats which can be downloaded here so you, too, can have all the tools shown in this paper.

The suggestion of -- you know what?!?! if we just do great chromatography and ultra-high resolution MS1 scans -- maybe that triply charged peptide that we've got at 104 +/- 0.5min with 1ppm mass accuracy is the same peptide from run 1 to run 8,412.  Maybe we can use MS1 libraries (not presented here -- but it sure sounds like it might work)...
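For anyone who wants to see what "same peptide" means in numbers, here is a toy sketch of that matching idea. This is NOT the IonStar algorithm. The function names, the dictionary layout, and the example m/z values are all made up for illustration; only the 1 ppm and 0.5 min tolerances come from the sentence above.

```python
def ppm_error(observed_mz, reference_mz):
    """Mass error of observed_mz relative to reference_mz, in parts per million."""
    return (observed_mz - reference_mz) / reference_mz * 1e6


def same_feature(a, b, ppm_tol=1.0, rt_tol_min=0.5):
    """Call two MS1 features the same if charge, m/z and retention time agree.

    a and b are dicts with hypothetical keys "mz", "rt_min" and "charge".
    """
    return (
        a["charge"] == b["charge"]
        and abs(ppm_error(a["mz"], b["mz"])) <= ppm_tol
        and abs(a["rt_min"] - b["rt_min"]) <= rt_tol_min
    )


# Made-up triply charged feature seen in two different runs
run_1 = {"mz": 650.3300, "rt_min": 104.0, "charge": 3}
run_2 = {"mz": 650.3304, "rt_min": 104.3, "charge": 3}
print(same_feature(run_1, run_2))
```

The tighter the chromatography and the mass accuracy, the smaller those two tolerances can be, and the fewer wrong matches sneak through.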

Another advantage -- and maybe just because I'm in a little bit of a funk because of some disappointment the last 2 days -- if you've got a TriBrid you're good to go. You don't have to hack your instrument or anything like that. You need the vendor's software and a seriously nice column (Jun's lab uses 100cm columns); in limited experiments at my facility with 50cm and 75cm EasySprays, the performance doesn't look that far off. The bit I lose, I'd trade for the ease of NanoViper. I'm lazy -- sue me.  The important part is 1) pick a sample prep method, best you can, and follow it exactly 2) Run the same exact instrument method. As much as you want to try that higher AGC target -- DON'T. 3) And consider that if you are just using peak finding (match between runs) there is a certain number of false discoveries that will occur -- use some method of FADR to control it a little!

Friday, May 11, 2018

BoxFahrt-- BoxCar for people who can't alter their instrument software. Might even work!

EDIT 5/13/18: Please ignore this post. BoxCar is under testing and is coming to all of us soon! I'm only leaving this post here as a reminder to myself to think before I hit the "Publish" button -- and in case anyone wants to look at what the BoxCar RAW files look like to help understand the instrument method logic.

In my old neighborhood in Baltimore there is a hilarious race each year. It started as a soapbox derby, but due to all the artists and weirdos, it rapidly descended into chaos.

Now -- a bunch of people race down a hill riding old toilets.  I'm not making this up (proof) . Am I on the right blog? I am!

Okay -- so --- I was really really excited about BoxCar, then I saw a Tweet and blog comment that made me realize -- I can't natively multiplex >10 isolation windows in any Exactive Tune I have -- and they use 16!!   I went to ProteomeXchange and got some of the RAW data -- and...umm... this is completely custom written software on the QE HF...ugh.... However -- I don't think doing something similar is impossible -- maybe I just have to make some compromises!

I've got the RAW files in front of me, a notebook, a pen that is a monkey with googly eyes, a tablet that thinks it is a Q Exactive HF -- working on this late at night -- so some ethanol might have made it into this espresso.

Time to build something that should simulate BoxCar!  In honor of  something very vulgar I said very loudly when I saw the .XML stuff where the instrument method is supposed to be - which made me think of the Toilet Derbies, I'm going to call this method BoxFahrt.

Disclaimers -- yo. if you've read this far and you think that I'm about to do something smart, shame on you. However, just to be sure -- no guarantees this will work, I won't have a chance to start testing it until at least Monday. But, don't you worry, I'll let you know how it goes.

First off, let's look at the RAW files from ProteomeXchange (you can get them here)  and try to diagnose what everything is doing (without trying to read the .XML used as an instrument file).

Using the plasma samples (smallest set at 3.6GB) as an example this is the method as I see it:

1) MS1 scan at 120,000 resolution from 300-1650
2) 16 BoxCar isolations with a 120,000 (?) resolution MSX orbitrap "full scan"
3) Same as 2, but with altered overlapping windows
4) MS/MS scans -- as I flip through the RAW file, it appears that we're looking at something realistically approaching a "Top5"

First question I have -- how important is #1? 3 MS1 scans at 120,000 resolution, even on the HF, is a lot of time. Let's assume it is important and I'll throw it in later. However --- my first attempt at BoxFahrt is going to be --

Step 1:  Set up MSX TSIM-ddMS2 runs (in this example 2)

In BoxCar, the authors run from 400-1200(m/z). BoxFahrt will do the same thing. To get this in 2 windows I'm going to need to do 40Da 80 (m/z) windows. If I do 3 x 10, it's going to be smaller isolation windows. For example purposes, I'm going to go with just the two here.  CORRECTION --80 Da windows.  Should be corrected in image above. 

Downsides of this way of doing things (BoxFahrt wasn't entirely meant to be a huge compliment or anything to this parody of a great method) -- if I set it up this way we're looking at the first round of MSX-t-SIM followed by the MS/MS scans selected from that Orbitrap scan. THEN we're looking at the MS/MS selected from the second round of MSX-t-SIM scans.

BoxCar appears to pick them from the two together, but I don't think that makes a ton of difference. The problem here may be the challenge in AGC control.

I don't have fine tune control over the AGC targets I'm going to be using. I can set just one number for BoxFahrt. I'm going to say 5e5 and 20ms for my 40Da isolation windows.

The logic behind my settings --

We know the QE family can handle 5e6 charges in the C-trap with limited ill effects (no reference, I just have friends who run above 3e6 for MS1 -- I'm sure there are references -- however, I've been working on this for a long time already and I'm getting sleepy.)

If we MSX 10 windows equally, that would allow us to run 5e5 ions per BoxFahrt window.
On a QE Classic or Plus, the 140k scan is something like 512ms. That would allow us to have a Maximum IT per MSX-SIM of 50 ms, give or take (overhead is around 14ms -- so maybe shoot for 40?)
On the QE HF, 120,000 resolution is about half that. I'm erring on the side of caution and going to 5e5 and 20ms. It might be smarter to raise the target. Again -- this is where I'll start when I can actually have free time on our massively overworked instruments.
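The back-of-the-envelope math above fits in a few lines. This is only a sketch of my own reasoning: the function name is made up, the 256 ms HF transient is just "about half" of 512 ms, and subtracting the ~14 ms overhead once per scan is an assumption (I don't actually know whether it applies per scan or per injection).

```python
def msx_budget(ctrap_capacity, transient_ms, n_windows, overhead_ms=0.0):
    """Per-window AGC target and maximum injection time for one MSX scan.

    Assumptions: the C-trap charge capacity is split evenly between windows,
    and all fills have to fit inside one transient minus a single overhead.
    """
    if n_windows < 1:
        raise ValueError("n_windows must be at least 1")
    per_window_target = ctrap_capacity / n_windows
    per_window_max_it_ms = (transient_ms - overhead_ms) / n_windows
    return per_window_target, per_window_max_it_ms


# QE Classic / Plus at 140k: roughly a 512 ms transient, 10 windows
print(msx_budget(5e6, 512, 10, overhead_ms=14))
# QE HF at 120k: assumed to be roughly half of that
print(msx_budget(5e6, 256, 10, overhead_ms=14))
```

Both results land above the 20ms I'm starting with, which is what I mean by erring on the side of caution.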

Now you need to build an inclusion list.

BoxCar alternates the overlapping windows. Please keep in mind that quad isolation isn't truly symmetrical on any quadrupole, but the Q Exactive classic is an older style (non segmented) quad and the isolation discrepancy on the edges is particularly steep off of symmetrical -- the QE Plus and HF have segmented quads that are much closer to symmetric. The BoxCar paper goes into how to best deal with the quad isolation issues on the edges. Considering they use the HF -- just keep in mind that you might be looking at some loss in signal at the edges if you use a QE Classic -- or, to a lesser degree, Fusion 1 systems.

What we need to do in BoxFahrt is build smart windows and (possibly -- can't say for sure yet) control our MSX ID #s(?)
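As a starting point for the list itself, here is a rough sketch that tiles an m/z range and deals the windows out alternately between the MSX scans. The function name is made up. It tiles with no overlap, which gives 40 m/z wide windows for 2 scans x 10 windows over 400-1200; it does not reproduce the 80 Da setting from the correction above or the edge handling described in the BoxCar paper, so treat the width as something to adjust.

```python
def boxcar_like_windows(mz_start, mz_end, n_scans, windows_per_scan):
    """Return one list of (center_mz, width) tuples per MSX scan.

    Neighboring windows go to different scans, so each scan sees
    every n_scans-th slice of the range (a simplification of BoxCar).
    """
    total = n_scans * windows_per_scan
    width = (mz_end - mz_start) / total
    scans = [[] for _ in range(n_scans)]
    for i in range(total):
        center = mz_start + (i + 0.5) * width
        scans[i % n_scans].append((center, width))
    return scans


scans = boxcar_like_windows(400, 1200, n_scans=2, windows_per_scan=10)
for n, windows in enumerate(scans, start=1):
    print(f"MSX scan {n} centers:", [f"{center:g}" for center, _ in windows])
```

Each inner list would become the inclusion list entries for one MSX T-SIM experiment, with the center m/z and isolation width typed in by hand.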

Don't quote me on this (or anything I write here. that goes without saying, right?!?!) -- but I'd probably first try to run with no MSX ID filled in. If that didn't work great, I'll next put in some MSX ID numbers. Even with an MSX of 10, you can't put in #1 for all the ones in the first batch and #2 for all the ones in the second. I think, therefore, that this feature just allows you to keep your scans in order.

EDIT number 4,212: In older versions of the QE tune software (I definitely think up to 2.2) if I put in an inclusion list like the one below and walked away and came back to the method, I'd find that the list had reorganized itself in increasing order. I...believe....that this is no longer the case, but I've never verified. If you go here, I've put links to Planet Orbitrap where you can get the Vendor notes on Tune versions. If you are on an older version of QE Tune -- you'll have an issue setting up inclusion lists like this. You'll end up getting a 2 phase, over-complicated gas phase fractionation method. That might still work, but will be less cool.

I've spent way too much time on this last night and today -- so I'm going to stop here for now. I won't know anything until I actually try shooting some standard protein mixtures on an instrument or 5, but this is where I plan to start.  You'll note I started with a Loop Count of 5 -- in this setup this would be 5 from each MSX-TSim -- so we're really looking at a spaced out simulation of a Top10.

Honestly -- I think this is going to be easier to simulate on the Tribrids, but I'll probably leave this alone until I have some real data.

WAY WAY too much time spent on this the last 24 hours. Gonna have to save it and end here.

EDIT 5/11/18 later in the day:  What? I'm back to this. I want to address another reader comment -- if you did want to do BoxCar right what would you need?

I presume you'd need the API. You need to contact the vendor to get it, I think. It was on the BRIMS portal for a long time, but I don't think it's there now. The API is a Windows Visual Studio interface that allows you to completely control your Q Exactive.  Some really cool stuff can be done with the Q Exactive when you get the API.

Warning, though, it is a LOT of work to use. It is kind of a blank slate. I'd presume, however, that if you cut the instrument method text out of the BoxCar RAW data that it would have most of the things necessary.  There is a PDF talking about the API (directly opens from this link.)

More Edits late the next day:  I've heard, from a reputable source, a reputable sounding rumor that the vendor is investigating making BoxCar available to the rest of us. Some legal review needs to be done to see if distribution can be done. Stay tuned.

Also -- it was pointed out I had misspelled the name of my method repeatedly in the post. I have made these corrections.