Sunday, November 11, 2018

Untargeted analysis of the airways of children with respiratory infections!

According to the sign at my pharmacy it's cold and flu season again! Miss Puff (above) isn't sick. She's 328 years old and runs around like a psycho in 5-10 minute stretches each day and does basically the above the rest of the time.

Why hasn't proteomics solved these viruses yet? Maybe it has. I forget to look each year until I get a cold myself -- surprise! this is a self-centered blogger, which I assume is an oxymoron.

I don't know if this great team solved my problems, but this study is super cool!

Ever thought about how you might sample the proteomes of a human's upper respiratory tract? Neither have I!  Turns out it isn't nearly as easy as you'd think. There are all sorts of rules you have to follow with human beings and they weren't exactly designed with our field in mind.

What you want to get is the epithelial cells that are being affected by the virus or bacteria or whatever, and maybe the cells that are doing the immune stuff to kill those things. You need to start with a swab and then you need to get the proteins off those swabs that you can make into peptides.

This study looks at multiple methods to get the best yield -- and even how to optimize it for TMT experiments....I'll just steal the figure....

Man, I like this paper. EDIT - RAMBLING ABOUT NOTHING, PLEASE IGNORE: [Or...I like DayQuil....will we one day look back on over-the-counter medications of this era with the same kind of disgust we now have for prior procedures like leeches? "Hey! Let's put a stimulant (pseudoephedrine) and a potent dissociative compound (dextromethorphan HBr) and one of the most potent liver toxins ever discovered (acetaminophen) in one big ol' pill. Let's label it for 'staying awake while you're sick'. Don't think it'll work? Let's color it bright orange so it doesn't at all look like an insane thing to ingest!"

For an off topic read, this study is awesome. They review all sorts of evidence on cold "remedies". Turns out basically nothing has any solid evidence of being useful in any way. Possibly, I'm exaggerating, but this bit is gold, though --

--- but I bet the people with the placebo weren't as scary behind the wheel of a car!!  ]

BACK TO PAPER: Sorry -- no, I seriously do like this paper. Whoever designed this experiment wanted to do some solid science. Big cohort. Loads of variable control. Patient diagnoses confirmed by PCR. Really nice iterative approach -- oh yeah, and some solid mass spectrometry!

You know how you get the proteins off these swabs? You use Universal Transport Buffer or something, probably some medical requirement -- and it's loaded with BSA. Yup, you have to take a tiny number of cells and bury them in B.S.Albumin.

Guess how many proteins you get if you don't deplete it? Even if you run it on a Q Exactive using 50cm columns and 3 hour gradients? Four. 

Possibly I'm exaggerating, but it wasn't that long ago that number would be pretty close to accurate.

Heck, I'll steal another figure! Why not?  OPEN ACCESS, FTW!!

Okay -- straight up -- you took a swab, stuck it in someone's nose or throat or whatever, stuck it in a tube full of BSA and you get as many as 2,400 human proteins ID'ed? THAT.IS.AWESOME. I love this field and where we are right now!!

Protocols B, C and C2 all do something great. They use a genetics kit that is made to get DNA and RNA out of a cell. I'd almost bet you based on the manufacturer of the kit that the protein is waste material in the kit instructions. In this case, they use it. B is that material with label free proteomics. C is with TMT and C2 is TMT but they increase the gradient from 3 to 6 hours.

See why I love this paper? All those nerds doing genetics stuff may be throwing away protein material that is cleaner and higher yield than what some of the normal protein extraction techniques we made up even generate (Protocol A is pretty normal). I haven't even gotten to the figures where they try to make sense of the medical data here -- there are loads of pretty plots. All the data was processed in MaxQuant, even the TMT data, though they did the correction factors themselves outside of the program.

The final paragraph tells you that all the pretty pictures were made in R using super secret scripts. All the data has been deposited and you can get it here!

Saturday, November 10, 2018

MassComp -- reduce mzXML data files with no losses!

Back when desktop computers were manufactured by smaller companies there was one called MassComp. I didn't know this until I discovered the MassComp I want to talk about hadn't designed an icon for itself yet.

This is the MassComp and you can get it here!  It's new -- so new, in fact, that you may have to compile it yourself. However -- this is gooooood stuff. Have y'all seen this?

Have you seen more of them if you move a file from one place to another a few times? That might mean that your data transfer protocol is not error checking, or the compression thing that you are using to move that data file is not error checking. That's the opposite of MassComp. It's anti-lossless data compressing. (If you do see that symptom talk to your IT people!!)
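If you'd like to check whether a transfer or compression step is quietly mangling your files, a checksum comparison is the usual trick. Here's a minimal Python sketch. It uses zlib as a stand-in lossless compressor (this is NOT MassComp's algorithm, just something from the standard library), and the example bytes are a made-up placeholder rather than a real mzXML file.

```python
import hashlib
import zlib


def sha256_of_bytes(data: bytes) -> str:
    """Return the SHA-256 hex digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so a huge data file never has to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def roundtrip_is_lossless(data: bytes) -> bool:
    """Compress and decompress with zlib, then compare checksums."""
    restored = zlib.decompress(zlib.compress(data))
    return sha256_of_bytes(restored) == sha256_of_bytes(data)


# Hypothetical placeholder content, not a real mzXML file.
example = b"<mzXML><scan num='1'>placeholder</scan></mzXML>" * 1000
print(roundtrip_is_lossless(example))
```

In practice you would run `sha256_of_file` on the copy at the source and on the copy at the destination and make sure the two strings match. If they don't, something in between is not lossless.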

MassComp is just for mzXML now, but a lot of people use that! 

Friday, November 9, 2018

Histidine rich glycoprotein looks like a legit accelerated aging marker!

Before you have to endure my rambling -- this is the paper in question!

My first thought -- what a terrible name for a protein! Who named this? Booooring....

My second?  -- is a study with a bead array allowed to call itself a proteomics study?  Sure! Probably will get more proteins than a SWATcH study on the same material....

Third? -- Can I deplete this protein the next time I donate platelets? Histidines stick to nickel or something, right? How hard would it be to add some nickels to the tube where they put my platelet depleted blood back into me? Sounds not hard! Unfortunately, both of my arms are busy during the procedure....I will require assistance....

How'd they do this study? It looks like it starts out with GWAS (Genome Wide Association Study, I think) from multiple cohorts all smartly age- and other stuff- matched. The markers linked to the differently aged cohorts (once other variables were statistically modulated) were then assessed by the above mentioned bead arrays -- leading to a protein that we should try to deplete!

It makes me really want to go back through some of the recent smart proteomics studies on aging from the NIA and others -- and see how these data align....I was going to put links to some of the aging studies I've posted here, but there are A LOT. I might be getting obsessed. Just type "aging" into the search bar above if you'd like to find other sets to look for this evil protein in.

Tuesday, November 6, 2018

FlashPack! Now I don't have to spend all day making a terrible nLC column!

I shouldn't make nLC columns. Some other people shouldn't either.

But for people that CAN make columns reproducibly and reliably, FlashPack might be something they'd want to check out!

The authors show that 50cm columns can be packed in 12 minutes (vs 100 hours in a traditional approach? did I read that right?) and somehow pull this all off at low pressure. I don't get how, but it might be an issue with my basic understanding of the procedure....

Monday, November 5, 2018

MaxQuant.Live corrections on the TopN functionality!

Some important corrections from my recent extremely sleep-deprived posts regarding my super high speed Q Exactive HF.

1) Currently, I don't seem to be able to set any resolution that I want for MS1 scans using the TopN or EASI-TAG functionalities added to my HF.  Especially for TopN, I need to set a resolution for MS1 that matches the resolution in my MS1 scan in my normal vendor software --

Yeah -- I was pretty sad about this one. I really really wanted to run my MS1 scans at 867530.9 resolution, but I can't --not in TopN -- sorry Jenny....(mandatory link)

2) I can't hit HF-X speeds with an HF. One of the coolest things about the HF-X is the ridiculous improvement in overhead time versus the older systems (down from >14ms on the QE Classic to somewhere around 3ms on the HF-X). The HF falls in the middle somewhere, the math is on this blog somewhere, but it's a lot more than 3ms. If I try to run at 8,000 resolution the amount of time it takes to collect enough ions to see + overhead does take a big toll on my overall cycle time.

The record today was around 28 Hz. Still a little better than the 22 Hz the HF normally gets, but a far cry from 41.3Hz.  I'll keep tinkering, but it is quite possible that I'm within the margin of error and I'm not actually running realistically faster than with the factory software.

3) Don't hit the stop button in Xcalibur! You'll need to close everything and start over to get the MaxQuant.Live, Xcalibur, SleasyNano handshake thing to all happen right again.

4) Super cool thing that I'm not sure how to use, but I successfully ran MaxQuant.Live from the Tune page. I could realistically do a direct infusion experiment and run BoxCars on it. Calling Glaros lab -- want to write some super sophisticated methods for your PaperSpray contraptions? We might be able to BoxCar and set the new sensitivity record for direct injection. I'm also thinking FIA (flow injection analysis) is something that could gain TONS from BoxCar.

5) Did I have a point to this? I forget. Probably not!

Saturday, November 3, 2018

MaxQuant.Live Targeting! Fragment the same peptides in EVERY run!

Okay -- I apologize if you're already tired of hearing about MaxQuant.Live. I'm afraid it's only going to get worse. I was excited enough once it was installed that I didn't sleep at all the first night -- and I realize that is weird. I just delayed colleagues' samples and played with our new super powered HF and the data coming off of it and I'm having a really hard time not driving the hour into work and just staying there all weekend -- but the HF is busy doing some DIA study....

I haven't used MaxQuant.Live Targeting yet. It's next on the list. And it might be more important to how our field moves forward than even the concept behind BoxCar. It fixes a problem our field has always had and one that I think none of us ever really thought could be fixed.

In proteomics we're always doing the stochastic sampling thing. If you run the exact same sample 3 times, chances are that if you compare the peptides/proteins ID'ed in those runs they'll look something like these 12 minute CE runs I was messing with. In 12 minutes, I can't get the whole yeast proteome -- and the variability when the instrument can't possibly fragment every peptide present has a degree of chance to it, resulting in an overlap like this.

Increase the depth of the coverage relative to your proteome, and this becomes less of an issue. But we still can't get every peptide every run, so this problem is present in every run to some degree.
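To put a number on that overlap for your own replicates, plain set arithmetic is all you need. A minimal Python sketch follows. The run names and peptide strings are invented purely for illustration, and the function assumes you hand it at least one run.

```python
from itertools import combinations


def overlap_summary(runs: dict[str, set[str]]) -> dict[str, float]:
    """Summarize how many peptide IDs are shared across replicate runs."""
    shared = set.intersection(*runs.values())  # seen in every run
    union = set.union(*runs.values())          # seen in at least one run
    summary = {
        "identified_in_any_run": len(union),
        "identified_in_all_runs": len(shared),
        "fraction_in_all_runs": len(shared) / len(union) if union else 0.0,
    }
    # Pairwise overlaps, i.e. the lens-shaped regions of a Venn diagram.
    for (name_a, ids_a), (name_b, ids_b) in combinations(runs.items(), 2):
        summary[f"shared_{name_a}_{name_b}"] = len(ids_a & ids_b)
    return summary


# Made-up peptide identifiers standing in for three replicate injections.
replicates = {
    "run1": {"PEPTIDEA", "PEPTIDEB", "PEPTIDEC"},
    "run2": {"PEPTIDEA", "PEPTIDEB", "PEPTIDED"},
    "run3": {"PEPTIDEA", "PEPTIDEC", "PEPTIDED"},
}
print(overlap_summary(replicates))
```

In this toy case only one of the four peptides shows up in all three runs, which is the stochastic sampling problem in miniature.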

What if we didn't have to stochastically sample? (Am I spelling that wrong? Blogger thinks so)

What if you could run a sample once and then use the data from that run to fragment every peptide from that first run in every sample afterward? Could you run the same sample 3 times and get the exact same results? That sounds impossible, right?

These are the results. They can achieve almost 100% match across 3 runs at over 20,000 peptides. They can even identify peptides that can't be SEEN in the MS1 scans of some samples because they're too low at MS1. The Targeting App intelligently adjusts the retention times while it looks for the peptides on the list it was given, so the ion is essentially pulled out from nowheresville, the same way you can see a peptide by PRM or when using other intelligent acquisition methods like PROMIS-Quan or TOMAHAQ.
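I don't know how the Targeting App does its retention time adjustment internally, so treat the following as a toy illustration of the general idea only. The class name, the one-minute window and the "median of the last few shifts" rule are all my assumptions and not anything taken from MaxQuant.Live.

```python
from statistics import median


class RetentionTimeTracker:
    """Toy real-time retention time correction for a target list.

    Hypothetical sketch: every time a target is actually observed, the
    difference between observed and expected retention time is stored, and
    the median of the most recent differences shifts the windows of the
    targets that have not eluted yet.
    """

    def __init__(self, window_min: float = 1.0, history: int = 5):
        self.window_min = window_min  # full width of the scheduling window
        self.history = history        # how many recent shifts to remember
        self._shifts: list[float] = []

    def record(self, expected_rt: float, observed_rt: float) -> None:
        """Store the shift seen for a target that was just detected."""
        self._shifts.append(observed_rt - expected_rt)
        self._shifts = self._shifts[-self.history:]

    def current_shift(self) -> float:
        """Median of the recent shifts, or zero before anything is seen."""
        return median(self._shifts) if self._shifts else 0.0

    def is_due(self, expected_rt: float, now: float) -> bool:
        """Is this target inside its shifted window at time `now`?"""
        adjusted = expected_rt + self.current_shift()
        return abs(now - adjusted) <= self.window_min / 2


tracker = RetentionTimeTracker(window_min=1.0)
tracker.record(expected_rt=5.0, observed_rt=7.0)  # this run elutes 2 min late
print(tracker.is_due(expected_rt=10.0, now=12.0))
```

The point of the sketch is that a target expected at 10 minutes gets looked for around 12 minutes once the run is known to be eluting late, instead of being missed.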

I've posted the preprint this data came from here previously. The RAW data hasn't been made available yet, and I'll know more once I dump a huge list into my HF when I get a chance.

Friday, November 2, 2018

NeoFusion -- A search engine for spliced peptides!

Why is Madison, Wisconsin the capital of proteomics innovation in the U.S.?  I have a theory developing, but it's obvious it's just snowballing up there. Pun intended. If you go to see what's going on up there for yourself don't take the free Mustang upgrade Hertz offers you up there in February.

Case in point: This brilliant new piece of software. Wait -- honestly -- I thought today I was going to talk about another brilliant piece of software from Wisconsin that we just started using continuously this week -- but NeoFusion has to jump line.

If you're paywalled and you promise not to tell anyone there is a glitch right now on the mobile site where you can get the full text.

Let's fill in some backstory here: Endogenous peptides are super important to systems like self-recognition. Our cells will hold weird peptides in little protein pockets on the outside of our cells and these peptides say to the immune system things like "I'm a Ben cell. Don't eat me!" or "I'm a dying cell that's all infected with stuff. KILL ME!!" They do other stuff but this is all the biologists have gotten through to me with the sock puppets they use to explain immunology to me. Big picture is that if you can figure out the peptides on bad cells you can make antibody drugs to destroy them... and stuff....

Problem is -- there is mounting evidence (3 papers now, I think - and more on the way) that a lot of these peptides are all spliced up from either different regions of themselves or even other proteins as a side effect of the proteasomal degradation process that produces them.

I'm new to the endogenous peptide stuff, but I've been through some studies pretty thoroughly and my take on the data analysis is -- get your LC-MS files and do some de novo sequencing (or use modified proteomics engines) while simultaneously using really large mass tolerances and pretending that you've never heard of FDR. Hey -- whatever -- it's not like people are trying to make cancer drugs -- this is proteomics -- what's important is that you found more peptides than the last team.

NeoFusion is something completely new.  It has this brilliant sequence-based identification method for matching what it's got to the sequences you feed it. It isn't like these systems invent new proteins from nothing. They splice the existing proteome together. NeoFusion uses this to its advantage and it's like a puzzle -- once you find part of the story the sequence continues to support itself if it's correct. Does that make sense?  This team is also all about heavily relying on post-acquisition recalibrations to strengthen the identification of PTMs (crank that mass accuracy up!!!) -- and -- of course this is ridiculously useful here as well. I haven't run anything through this yet -- but I've got a buttload of files waiting for IT permission to install this!
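To make the "spliced from the existing proteome" idea concrete, here is a tiny Python sketch that asks whether a peptide can be explained as two pieces of the same protein stuck together. To be loud about it: this is NOT NeoFusion's algorithm, the function name is mine, and the protein sequence is an invented example.

```python
def find_cis_splices(peptide: str, protein: str, min_fragment: int = 2):
    """List (left, right) fragment pairs from one protein that join into the peptide.

    Returns an empty list when the peptide is simply a contiguous stretch of
    the protein, because then no splicing is needed to explain it.
    """
    if peptide in protein:
        return []
    hits = []
    for cut in range(min_fragment, len(peptide) - min_fragment + 1):
        left, right = peptide[:cut], peptide[cut:]
        # Both halves must exist somewhere in the parent sequence.
        if left in protein and right in protein:
            hits.append((left, right))
    return hits


# Invented sequences for illustration only.
protein = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"
print(find_cis_splices("AYIAKSHFSR", protein))
# -> [('AYIA', 'KSHFSR'), ('AYIAK', 'SHFSR')]
```

Notice that even this toy case gives two equally valid junctions, because the K could have come from either side. That ambiguity, multiplied across a whole proteome, is a big part of why the search space and the false discovery problem get ugly so quickly.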

Bonus points because NeoFusion doesn't forget that FDR is a thing.
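For anyone who has managed to forget what FDR control looks like, here is a bare-bones target-decoy sketch in Python. It is the generic textbook estimate (decoy hits divided by target hits above a score cutoff), not the specific procedure used by NeoFusion or any other tool, and the scores are made up.

```python
def score_threshold_at_fdr(psms, max_fdr=0.01):
    """Find the lowest score cutoff that keeps estimated FDR within max_fdr.

    `psms` is a list of (score, is_decoy) tuples where higher scores are
    better. FDR at a cutoff is estimated as decoys / targets among the
    matches scoring at or above it. Returns None if no cutoff qualifies.
    """
    targets = decoys = 0
    best = None
    for score, is_decoy in sorted(psms, key=lambda p: p[0], reverse=True):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets and decoys / targets <= max_fdr:
            best = score  # everything down to this score is acceptable
    return best


# Made-up scores: (score, is_decoy)
example_psms = [
    (95, False), (90, False), (85, False), (80, True),
    (75, False), (70, False), (60, True),
]
print(score_threshold_at_fdr(example_psms, max_fdr=0.25))
```

Widen your mass tolerance or blow up your search space and the decoys start landing higher in the list, which drags that cutoff up with them. That is the whole reason the estimate exists.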

Thursday, November 1, 2018

MaxQuant.LIVE (Software) Hands on!! Superpowers for your HF and HF-X!

MAXQUANT.LIVE!!!!!!! (Calm down, Ben) Okay....

You know how you've always thought things like "....this instrument is probably capable of so much more than what it can currently do..."?  MaxQuant.Live is the proof that you were right.

What can you do with MaxQuant.Live software on your QE HF or HF-X?

What about BoxCar? 

Wait. What's that number there that I highlighted there? Is that AT LONG LAST the ability to type the resolution that you want out of your instrument into a little box? There is nothing whatsoever magical about the resolutions 17,500, 35,000, 70,000 on your instrument. After you get past the initial pulse from the FT injection the resolution is simply the amount of time the ions spend in the Orbitrap, right? Have you ever looked at your cycle time and thought that the person who thought you didn't need a resolution -any resolution- between 60k and 120k deserved a flying elbow drop off the top rope?

Now you can channel that rage into your sample prep where it belongs!  (I'm running with a Randy Savage metaphor here for some reason, just go with it, okay? I don't condone violence to any person in R&D, not even the ones that came up with SWATcH. They probably realized it was a bad idea and were just as shocked as the rest of us when marketing ran with it.)

In MaxQuant.Live you can type in the resolution you want. Did I immediately type the lowest number possible? Was I on the edge of my seat waiting to see an error come up that said I couldn't do 8,000 resolution on an HF -- could that awesome little box actually be capable of >40 Hz scan speed thanks to the wonderful people at Max Planck.....

Did it actually do it? Well -- yes -- and no. The overhead is still pretty high in the original HF and I only tried to run this on the BoxCar program so it needed long MS1 scans and 100ng of peptides was a poor choice of sample (maxed out my 28ms fill times almost continuously...) but I was too impatient for the nanoLC to load more peptides than that -- but it did 8,000 resolution for sure!!
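The back-of-the-envelope version of that trade-off looks something like the Python sketch below. Everything numeric in it is an assumption for illustration: I'm taking a 32 ms transient at 15,000 resolution as the reference point, scaling transient length linearly with resolution, and assuming the fill for the next scan happens in parallel with the current transient so the slower of the two sets the pace. Plug in your own instrument's numbers.

```python
def transient_ms(resolution: float,
                 ref_resolution: float = 15000.0,
                 ref_transient_ms: float = 32.0) -> float:
    """Transient length, assuming it scales linearly with resolution.

    The reference pair (15,000 resolution, 32 ms) is an assumed value.
    """
    return ref_transient_ms * resolution / ref_resolution


def scans_per_second(resolution: float,
                     overhead_ms: float,
                     fill_ms: float = 0.0) -> float:
    """Rough scan rate when ion filling runs in parallel with detection.

    Each scan costs whichever is longer, the transient or the fill time,
    plus a fixed per-scan overhead.
    """
    per_scan_ms = max(transient_ms(resolution), fill_ms) + overhead_ms
    return 1000.0 / per_scan_ms


# Illustrative overhead values only, not measured instrument specs.
for overhead in (3.0, 8.0, 14.0):
    rate = scans_per_second(resolution=8000, overhead_ms=overhead, fill_ms=28.0)
    print(f"overhead {overhead:4.1f} ms -> {rate:5.1f} scans/s")
```

The useful thing to notice is that once the fill time is longer than the transient, dropping the resolution further buys you nothing. Maxing out the fill time on nearly every scan would produce exactly that situation.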

How is MaxQuant.Live? Just a little tricky, honestly, but you'll get the hang of it! The manuals are great! I did discover that opening a RAW file while it is being acquired with it isn't a great idea (but when has it been...?).

So....I ran BoxCar tonight on a QE HF.
I ran my QE HF at a higher speed than it's ever run before, making it the fastest instrument in the lab again!

And -- you know what? -- that isn't even the best part. Not even close. MaxQuant.Live is the software, but MaxQuant.Live is also a method. And it's way too cool to squeeze into the bottom of this post. You should 100% check it out.

Oh -- and I have no idea why on the cycle in the top picture my system skipped both the MS1 and BoxCar scan and did 81 MS/MS scans. I think it had to do with me trying to search the RAW file before it was completed. Was that stupid? Oh..yeah....

EDIT: I needed to make some corrections to this post. They can be found here.

Wednesday, October 31, 2018

Advanced Precursor Determination -- Bad for TMT? Part 2.

Honestly -- even after reading the second paper -- I'm not sure I get it....

CPTAC-3 has some heavy hitters again in this newest project and a few months back they demonstrated their results of some serious TMT 11-plex optimization.

Their results?

Don't use APD
Don't use MS3 based TMT.

I rambled about that here.

The second one I get. For global proteomics I also go to MS2-based TMT quan. I'd rather get 15 peptides per protein with more isolation interference than get 9 peptides per protein with less interference. The first one -- cheeeeeeeese --- I can't wrap my head around....

A second study on this topic replicates and elaborates on these findings and is brand new here.

Okay -- honestly -- maybe I get it....and maybe it's just denial....cause I've got some TMT 10-plex data on the PC behind me that is some of the best I've ever seen and it came from this study from the Olsen lab.  

The authors report 16,700 TMT10-plex labeled phosphopeptides. I'm pretty sure I got around 10k when I reprocessed it myself (and I'm a picky jerk about PTMs) with offline fractionation and short short gradients (6 hours total run time or something ridiculously short). 

And maybe it's the offline fractionation that improves the coisolation? And maybe phosphopeptides are just simpler? Because on the HF-X -- at least at launch -- APD was always on....

Lots to think about -- later! 

Monday, October 29, 2018

Security vulnerability in Xcalibur Foundation. Download and install this patch!

This is a serious post -- though the picture above is funny.

There is a vulnerability in Foundation (the program underneath Xcalibur starting way back after version 2.0.7.)

If your PC is online and has Xcalibur or even Foundation - with any of these versions -- every single one of them -- your computer may be at risk. This affects both instruments and PCs that just have Xcalibur on them for looking at data.

You can download the newest Foundation and Xcalibur -- Foundation 3.1SP5 and Xcalibur 4.2 or -- you can install the patch.

I only have a direct link to the patch and clicking on this will start the download.

If you're worried that I'm just making this up, call into tech support or your favorite FSE and ask, they'll be going around doing these patches soon. For reference, this is Factory Communication 2018.020

Sunday, October 28, 2018

XINA! Multiplex proteome kinetics in R

(This is the face I'm going to make if I'm asked to do proteome kinetics....)

But now there is a great new R tool -- called XINA (no relation) that takes loads of work out of this horrible sounding idea!  You can read about XINA in press at JPR here.

If there is another package that can do this, I don't know about it. I especially don't know anything that can directly port out the data into StringDB and KEGG (also through R...sorry...)

You can directly download it through BioConductor or you can pull the whole thing down from Github at this link!

Friday, October 26, 2018


(This image was floating around unacknowledged on Google Images. It is copyright Steve Graepel and originally appeared here. This image is used without permission, but better that I hunted down the guy who created it, right? As always, let me know if this is a problem and I'll take it down!)

Okay -- so those degraded peptides?? THOSE ARE A SUPER BIG DEAL!  What if a big team decided to do something crazy and profile those???

BOOM. Here ya' go!

I'll be honest, I'm not 100% sure how they did this. I believe the proteasomes were purified and then the degraded peptides were knocked loose from them somehow. Then MaxQuant was used for an enzyme non-specific search of the entire proteome. Multiple rounds of digested proteomes were used for comparison to make sure they were on the right track.
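If you're wondering why an enzyme non-specific search is such a big ask, count the candidate peptides. The Python sketch below compares a simple tryptic digest (cleave after K or R unless followed by P, no missed cleavages) against every possible substring in the same length range. The length limits and the protein sequence are arbitrary choices of mine for illustration, not the settings used in the paper.

```python
import re


def tryptic_peptides(protein: str, min_len: int = 7, max_len: int = 25) -> set[str]:
    """Fully tryptic peptides with no missed cleavages.

    Cleaves after K or R except when the next residue is P.
    """
    pieces = re.split(r"(?<=[KR])(?!P)", protein)
    return {p for p in pieces if min_len <= len(p) <= max_len}


def nonspecific_peptides(protein: str, min_len: int = 7, max_len: int = 25) -> set[str]:
    """Every substring in the length range, i.e. no enzyme rule at all."""
    return {
        protein[start:start + length]
        for length in range(min_len, max_len + 1)
        for start in range(len(protein) - length + 1)
    }


# Invented sequence for illustration only.
protein = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"
print(len(tryptic_peptides(protein)), len(nonspecific_peptides(protein)))
```

Even for one tiny made-up protein the non-specific list is orders of magnitude longer than the tryptic one. Scale that to the entire proteome and you can see why the comparison against digested proteomes as a sanity check matters.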

And -- I can't even wrap my head around all the potential here, but I'm going to try.

1) The "dark proteome" stuff -- which might have a different definition now than the one I normally put with it. I consider it all the stuff that passes MIPS (or Peptide Match) -- so it isotopically looks like a peptide, elutes off C18 when a peptide should, but we don't know what the Albert heck it is.

The proteasomes are, presumably, active ALL THE TIME. So a lot of the background peptides may have just been profiled in this paper!

2) How these differ between disease states could open up a whole new field in diagnostics!  The proteasomes are tightly regulated by a series of complex processes (typically modulated by ubiquitin, as far as we can tell, right?) Some proteins are labeled for degradation just because they're old (there is an N-terminal instability thing that marks old proteins) or they're degraded as part of the specified, complex, and poorly understood mechanisms.

What if we didn't need to learn the degradation patterns themselves and could just monitor the degraded peptides coming out of the system???  These authors do this here and show the potential this may have -- there are big differences in different diseases!

I'm super psyched to discuss this paper with people who understand the biology behind this and congrats to this team for ---

The first Tweet is my perception of this great paper. The second Tweet -- well -- that's pretty funny...

Thursday, October 25, 2018

Boost your crosslinked peptide IDs by fixing your monoisotopic assignment!

Virtually all of proteomics data processing these days requires a proper monoisotopic assignment to make a match. It's also probably no surprise that today's instruments are trained on perfect tryptic digests.

What if I told you that there is a huge spreadsheet showing that monoisotopic assignments of big peptides (like crosslinked peptide species) are messed up a large percentage of the time?!?

Don't worry! There's a fix and it's in press at JPR here!

The spreadsheet is in the supplemental -- and it's from a very modern instrument!
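For a feel of what a wrong monoisotopic assignment does to a match, here is a small Python sketch of the generic "isotope error" workaround, where you also try the precursor mass minus one, two or three 13C spacings. This is NOT the fix described in the paper, just the general idea. The function names are mine, and the constants are the usual approximate proton mass and 13C-12C mass difference.

```python
PROTON = 1.007276     # approximate proton mass in Da
C13_DELTA = 1.003355  # approximate 13C - 12C mass difference in Da


def candidate_monoisotopic_masses(observed_mz: float, charge: int,
                                  max_isotope_error: int = 3) -> list[float]:
    """Neutral masses to try if the picked peak was really the k-th isotope."""
    neutral = (observed_mz - PROTON) * charge
    return [neutral - k * C13_DELTA for k in range(max_isotope_error + 1)]


def isotope_offset(theoretical_mass: float, observed_mz: float, charge: int,
                   tol_ppm: float = 10.0, max_isotope_error: int = 3):
    """Return the isotope offset k that explains the precursor, or None."""
    candidates = candidate_monoisotopic_masses(observed_mz, charge, max_isotope_error)
    for k, mass in enumerate(candidates):
        if abs(mass - theoretical_mass) / theoretical_mass * 1e6 <= tol_ppm:
            return k
    return None


# Hypothetical 5 kDa crosslinked pair whose +2 isotope peak got picked.
true_mass = 5000.0
picked_mz = (true_mass + 2 * C13_DELTA) / 4 + PROTON
print(isotope_offset(true_mass, picked_mz, charge=4))                       # -> 2
print(isotope_offset(true_mass, picked_mz, charge=4, max_isotope_error=0))  # -> None
```

With no isotope error allowed, that precursor is off by roughly 400 ppm and would never match at a 10 ppm tolerance, so the spectrum is simply lost.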

Wednesday, October 24, 2018

apQuant -- Powerful FDR controlled label free quan for everyone!

The apQuant paper is finally out!

What's this about? It's FDR controlled (by Percolator? what? I know!!) label free quan software that you can use in the free version of Proteome Discoverer (IMP-PD 2.1), PD 2.2 and PD 2.3.

Even better? It's fully compatible with MS2GO!! 

You can check out the paper here.

Tuesday, October 23, 2018

Cross-ID Beta is now available!

I may go on several days of posts on chemical crosslinking. It's something we're doing a lot -- both with some big successes so far and some big not quite so successes.

Good time to be getting into it because the field is blowing up, though!

Another great new tool (one whose website I've been pressing the "refresh" button on for a few weeks) is CROSS-ID from the Heck lab.

Today a new button appeared (can't swear I looked yesterday) with a BETA DOWNLOAD. You can get it here!