Friday, November 30, 2018

The ULTIMATE Proteome Crosslinking Protocol!

THIS IS THE BLYBERMONDAY/CRACK FRIDAY SPECIAL I WAS LOOKING FOR!

EVERY step of the process.

A bunch of stuff I hadn't even thought about, like cell lysis, reaction conditions, offline fractionation (including what fractions to keep and which to toss)

AND A PRIDE FTP SITE WITH PROOF THAT ALL THIS WORKS!!

Is it my data processing pipeline? Nope! Cause now I have files where Proteome Wide Crosslinking actually worked!!

Thursday, November 29, 2018

Last QE Turbo post, for real!


This might be the last one. (Continuation of this post, possibly yesterday? I forget)

What if I ran the same method on a QE HF with exactly the same max fill time (arbitrarily 30ms, cause why not?) and all I changed was the MS/MS resolution? 15k vs 8k?

You'd expect to get more scans when using 8k MS/MS, right? Because you'd definitely get more scans completed when you didn't need the full fill time.  More scans = more peptides!

I had a small opening overnight so I put on 500ng of HeLa and ran it on a 45 minute gradient 4 times.
2x with 60k res MS1/15k res MS/MS with top 30 and 30ms max fill time
2x with same except 8k res MS/MS

With 15k MS/MS -- 46,900 and 46,978 MS/MS scans
With 8k MS/MS -- 54,012 and 53,606 MS/MS scans

14% more MS/MS scans! WOOHOO!!
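
If you want to check my math, here is a quick sketch in Python using the four scan counts above. The variable names are just mine for illustration.

```python
# MS/MS scan counts from the four runs listed above
scans_15k = [46900, 46978]  # 60k MS1 / 15k MS/MS
scans_8k = [54012, 53606]   # 60k MS1 / 8k MS/MS

mean_15k = sum(scans_15k) / len(scans_15k)
mean_8k = sum(scans_8k) / len(scans_8k)

# Average gain per run, in scans and as a percentage of the 15k runs
extra_scans = mean_8k - mean_15k
percent_gain = 100 * extra_scans / mean_15k

print(f"Average extra MS/MS scans per run: {extra_scans:.0f}")
print(f"Gain at 8k relative to 15k: {percent_gain:.1f}%")
```

That works out to a bit under 7,000 extra MS/MS scans per run on average.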

Then it gets less cool.





Clearly this is pseudo-pscientific, but it doesn't appear that my HF benefits from running MS/MS at 8,000 resolution. Honestly -- this is about what I'm seeing on the Fusion 1, to the point I don't use a transient below 15,000 resolution. I would like to take this apart. Like -- where are those 7,000 extra MS/MS scans going? Does mass accuracy appear to suffer due to peak coalescence at the lower resolution? That seems the most likely explanation, but no time to investigate right now.

This is good for me, probably. There was this increasing temptation to use MaxQuant.Live to "hack" my instrument every time there was a gap in the queue, but with this kinda settled, it's time to get back to the metabolomics!!

Wednesday, November 28, 2018

Proteogenomics fills in not-so-subtle blanks in the B.subtilis proteome!


I love these studies where someone tackles a model organism! Bacillus subtilis? Really? We have to know everything about this one, right!?!? Just what I can remember from long long ago grad school...

It has a genome about the same size as E.coli -- but
It can form nearly indestructible spores, hang out for 20,000 years, and start growing again.
It can make super weird fruiting bodies.
You can knock out every one of its penicillin binding proteins (required for it to make cell walls) and it will still grow!

Despite hundreds? Thousands? Of studies on this little organism, loads of mysteries still exist? Such as...only 50% of the proteome is accounted for...???

Ravikumar et al., from the Macek lab, are on the scene, solving these mysteries in this new Nature Scientific Report!

How do you improve on the understanding of something expected to be this well understood?


You grow normal and knockout bacteria in normal media(s) and in SILAC (&Super-SILAC) ones.
You fire up the Orbi-XL and you


Like --
1) In gel digestion
2) In solution digestion
3) You phospho enrich (4 WAYS)
4) You acetyl-enrich
5) You employ the appropriate multi-enzyme approaches

You process 1,688 (!!!!!!) files through MaxQuant.
You use the canonical databases.
You supplement these with 6 frame translated genomes.
You use Motif-X and other tools to assemble everything
..and finally(!) you do this:
...tried to spell that like 12 times. Screenshot!
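
If "6 frame translated genomes" sounds mysterious, the idea is simple: translate the DNA in all three reading frames on both strands and search your spectra against whatever comes out. Here is a minimal Python sketch of the concept. It is not the authors' pipeline. It assumes the standard codon assignments, ignores alternative start codons, and does not split at stop codons or filter short ORFs the way a real proteogenomics workflow would.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code: codons enumerated in TCAG order line up with AMINO_ACIDS
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the opposite strand, read 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate from the first base. Unknown codons become X, '*' marks a stop,
    and a trailing partial codon is dropped."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))


def six_frame_translation(dna):
    """Translate three reading frames on each strand (keys +1..+3 and -1..-3)."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


# Toy example on a made-up 6 base sequence
for frame, protein in six_frame_translation("ATGGCC").items():
    print(frame, protein)
```

Searching against all six frames is what lets you find peptides from genes the canonical annotation missed, at the cost of a much bigger search space.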

What do you get out of this monumental amount of work?

1) By far, the most complete proteome of this important organism ever done.
2) A really intimidating folder on ProteomeXchange/PRIDE (here)  For real, someone deserves an award for just uploading this many files.
3) An amazingly thorough understanding of how this organism tightly regulates itself through a complex combination of acetylation and phosphorylation events.
4) And some useful phylostrafibrastabhic data?
5) B.subtilis finally


(...come on...I had to use that somewhere....)

Tuesday, November 27, 2018

Metaproteomics of medieval teeth tells us about the health of ancient people!!!


Okay -- if the title of this doesn't make you want to read this (open access!) paper we probably can't be friends....


This is like Emily and Antonius's mummy study if you weren't quite as sample limited and you moved forward from an Orbitrap XL to today's fastest Orbitraps!!

They can ID the bacteria that these ancient people had been exposed to and infected with -- from their teeth!!

An Encino Man gif is mandatory here, right?


(thought I could find a better one...)


MWAHAHAHAHAAA!!!! Much better one! 

.....told you it was raining a lot in Maryland....



...turns out it was a failed condenser thingamajing.....

Fingers crossed it turns out okay. This Elite is a total rockstar. We've got a bunch of newer stuff around here, but this one is still getting slammed 24/7 for good reason. Hopefully an intense shower of unknown duration isn't the end of the tour!

Monday, November 26, 2018

Clinical mass spectrometry is more than pain management.

I recently went to demo some mass specs. Boring triple quad routine analysis stuff.

I realized on the second day that every time I said the word "clinical" all the people there heard was "pain management" (is this a US exclusive thing?)

Inspired, I just wanted to take a look at clinical mass spec today and some of the proof that we can do more than just quantify oxycodone in plasma.

<begin rant>

Great new study #1


Great new study #2 (You get HDL, LDL measurements in the clinic all the time -- but this is just a tiny clue to the puzzle of what is wrong. We have the tools to flesh out these differences NOW.)

Great newish study #3 (We don't even need anything special. The tubes that the hospitals are using for collecting samples for DNA analysis? Those are perfect for clinical proteomics analysis!!)



Similar and great newish study #4 (Look, we don't need anything special, Mr. Hospital administrator. We're not asking you to spend an extra $0.06 per patient on a new fancy tube. You're already making more FFPE slides than you will use. We just want one of them....)



Great newish study #5 (Personalized medicine for bladder cancer. We can track biomarkers to determine what chemotherapeutic to use and when. Today. With the technology we have now. "Personalized medicine" is more than just words for politicians to throw around. This is stuff we can do right now to make patients' lives better and improve the chance they go home.)


Okay -- I should wrap this up. And I'm going to end it with the study that is the first image. And -- yes -- this study has appeared on this blog multiple times, with me focusing on different aspects of it, but --- what a fucking awesome piece of work. I'm thrilled that if you Google image search "clinical proteomics" that is now the second picture that pops up.

WITH.A.MASS.SPECTROMETER.WE.CAN.HELP.PATIENTS.NOW.IT.ISNT.EVEN.HARD.

LC-MS.ALLOWS.PERSONALIZED.MEDICINE.NOW.

<end rant>

Sunday, November 25, 2018

What do you get when you run an HF at full X transient speed?


If you can't tell, that's a picture of the racing stripes I'm considering requesting a local auto-body place cut and install on my HF. Now that I'm looking at it, it seems a little dumb. It looks more like suspenders.

Much better!

Okay -- so you should probably check this out on your own, but in case you aren't crazy and want to see how it turns out -- these are my results so far. (Oh. And this is an extension of this post a few weeks ago)

Some results:

500ng of human cell digest at 60k MS1 15k MS/MS (normal HF speed) -- 45 minute gradient;


Okay -- pretty normal.

What if I change nothing, but drop the MS/MS to 8,000 resolution (and the fill time down to 11ms to make sure transient is the limiting factor)?


Loads more MS/MS scans.

Less of everything else. Let's look at why:



Even with the normal method, we're maxing out our fill time the majority (66.85%) of the time.

Cutting that number in more than half? That's totally dumb. Makes me sound as dumb as all these "peer reviewers" keep implying that I am, right?

"Hey Ben, why did you choose 11ms?" Because I'm dumb. Or because I wanted to verify that I really was increasing the number of MS/MS scans and boosting the number of ions sampled per cycle.  And -- I'm increasing the number of MS/MS scans and the number of ions sampled per cycle.

And I wanted to see if I could get anything with 11ms of fill time. Almost 1900 protein groups in 45 minutes with a massive fill time limitation? That's AWESOME. Less than what I get with a normal method and nLC -- but, in case you've never visited this blog before, I HATE nanoLC. Necessary evil and all that, but I'll have a huge party when it's a dead technology we laugh about suffering through.
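
For the curious, the logic behind "make sure transient is the limiting factor" can be sketched in a few lines of Python. Big assumptions here: ion accumulation for the next scan overlaps with detection of the current one, so each MS/MS scan costs roughly whichever is longer, the fill or the transient. The transient lengths below (32 ms at 15k, 16 ms at 8k) are assumed round numbers, not values I pulled off my instrument, and all overhead is ignored.

```python
# Assumed transient lengths in ms (hypothetical round numbers for illustration)
TRANSIENT_MS = {"15k": 32.0, "8k": 16.0}


def scan_time_ms(fill_ms, transient_ms):
    """Approximate time per MS/MS scan when filling and detection run in parallel."""
    return max(fill_ms, transient_ms)


def limiting_factor(fill_ms, transient_ms):
    """Say which of the two is holding up the scan."""
    return "fill time" if fill_ms > transient_ms else "transient"


# (resolution, max fill time in ms) for the settings discussed in these posts
for resolution, fill_ms in (("15k", 30.0), ("8k", 30.0), ("8k", 11.0)):
    transient_ms = TRANSIENT_MS[resolution]
    print(resolution, fill_ms,
          scan_time_ms(fill_ms, transient_ms),
          limiting_factor(fill_ms, transient_ms))
```

Under those assumptions, a scan that maxes out a 30ms fill at 8k is limited by the fill, not the transient, so the shorter transient buys very little. Drop the max fill to 11ms and the transient is the bottleneck again.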

We've been evaluating options to nanoLC. And I've got a solution that shows some promise, but the peaks are waaaaaay too sharp. Like 4 seconds wide sharp. At the top of a 4 second peak, there is a TON of signal. But if your cycle time for only 20 MS/MS events is a full second? You've got high odds of missing that peak apex, and all that signal is gone.
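
To put rough numbers on the peak-apex problem, here is a tiny sketch. It assumes a given precursor gets looked at once per cycle, which is a simplification, and the function names are mine.

```python
def cycles_across_peak(peak_width_s, cycle_time_s):
    """How many acquisition cycles fit inside one chromatographic peak."""
    return peak_width_s / cycle_time_s


def worst_case_apex_miss_s(cycle_time_s):
    """With one look per cycle, the closest look can be up to half a cycle
    away from the apex."""
    return cycle_time_s / 2


peak_width_s = 4.0  # the 4 second wide peaks mentioned above
cycle_time_s = 1.0  # a full second for one MS1 plus 20 MS/MS events

print(cycles_across_peak(peak_width_s, cycle_time_s))
print(worst_case_apex_miss_s(cycle_time_s))
```

Only about four cycles across the whole peak, and you can be half a second off the apex on a peak that is only 4 seconds wide. Shorter cycles are the only way out of that.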

(In case you're wondering 1ug of digest doesn't look much better)

So....more optimization is necessary...but for specific applications, maybe the HF deserves to get those flames!

Assessing NanoParticle/NanoTechnology Safety with Proteomics!


There is a wing in my building that is all locked up. I know it has something to do with nanoparticles and they've got mass specs back there -- and -- I don't know what they do with them -- and until now I only had a vague conception of  what a nanoparticle was (I'm assuming a very small particle.)

If you're curious, a whole special edition thing on NanoParticles has been recently assembled. A link to all the chapters can be found at this intro here, and the links I've clicked because I thought they sounded interesting have been open access.




Proteomics can play a role in these things -- namely in assessing the safety of the particles! This article might play more like an "intro to proteomics for nanoparticles people" but it also does a good job of helping me understand why people are doing this stuff and what the problems might be.



It is more than a little clear that as these technologies develop we're going to be needed to try and stay on top of it!

Saturday, November 24, 2018

How to optimize collision energy for massive numbers of crosslinks!!


Okay -- it might be time to put a page over there ---> for just crosslinking stuff. There is too much and it needs organized somehow. Time for another trip to the Room of Spirit and Time....

By the way, all this stuff isn't getting into the realm of redundancy at all. Crosslinking mass spectrometry does work. When it does, those spectra can immediately jump your research forward 10 steps. I've seen it happen around me at work multiple times this year thanks to data that I've personally generated for other people. However -- it's still got some serious limitations. The biggest one in my mind is that the stuff that my group is actually doing....well....our stuff doesn't work at all. You still need to keep the samples very very simple. When we move the complexity up to even just one organelle -- blech....hundreds of thousands of MS/MS events? A dozen crosslinks the IONTRAP tells you are real. What's a 1% FDR of 100,000? Is 1% FDR of a crosslink MS3 even possible to estimate? Am I typing "1% FDR" repeatedly because I know that this totally incorrect, though somewhat conceptually useful shortcut of confidence estimation that we say out loud all the time makes (some?) statisticians dream about murdering every proteomics person in the entire world so we'll be forced to start over with ones that they can teach basic statistics to?   🙉

Fun thought. What would a statistician do to get rid of all of us...?
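
Before the statisticians get here, the arithmetic behind that "1% FDR of 100,000" question looks something like this. It is a sketch of the simplest possible target-decoy estimate, with function names I made up. Real crosslink FDR is messier than this (two peptides per match), and I'm pretending all 100,000 events were accepted matches, which they wouldn't be.

```python
def expected_false_matches(n_accepted, fdr):
    """Expected number of wrong matches among n_accepted at a given FDR."""
    return n_accepted * fdr


def target_decoy_fdr(n_target_hits, n_decoy_hits):
    """Simplest target-decoy estimate: decoy hits stand in for false target hits."""
    if n_target_hits == 0:
        return 0.0
    return n_decoy_hits / n_target_hits


# If all 100,000 MS/MS events were accepted at 1% FDR
print(expected_false_matches(100_000, 0.01))

# With only a dozen crosslinks, a single decoy hit is already far above 1%
print(target_decoy_fdr(12, 1))
```

With a dozen hits, the smallest nonzero estimate you can get is 1 in 12, so "1%" is not something that list can resolve.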

What? Crosslinking? Oh yeah!

Here is the thing, though. When you see it work it looks like "man, just some minor tweaks will make this work. I swear." You feel so close and that is what makes it both so alluring and so frustrating.

Will this be the study to tip the scales???  I hope so!! 


I'm primarily using the MS3 based crosslinking stuff on the Fusion. We're now investigating a new method the great Dr. Bernard Delanghe taught our group recently that utilizes both MS2 DSSO triggered MS3 and ETHcD and it's showing some serious promise -- but we're all familiar by now with the pseudo continental divide --


You don't need to read much of Dr. Mechtler's stuff (or hear much of his accent 😉) to realize he's firmly European. This group is consistently showing us that if we use mass accuracy (and our heads) we can circumvent the need for multi-level fragmentation and slow, low efficiency fragmentation methods. And this study is another chapter in this growing tome. I'd title this chapter.....

MS3 DSSO crosslinking < HCD MS2 based DSSO Xlinking 

(with proper CE optimization!!!!)


With this paper in hand I now have a reason to head to lab on a gross Saturday morning and see if I can get some of these samples to give up the crosslinks WE KNOW are in there somewhere.....

Friday, November 23, 2018

RawTools! More than you ever even wanted to know about Orbitrap RAW data!


I've been camping off-grid in WV and missed a load of things. It seems like every single day one or three awesome new tools appear to fix a problem our field has -- or one we haven't yet realized we have.

RawTools runs on both sides of this.

1) I need a RawMeat Replacement
2) Wait -- this is WAY more stuff than RawMeat could do when it was alive!

First off -- you should check out the paper at JPR here.

If you can't access it cause you're back on grid, but not at work, a pre-edited print was loaded at BioRxIv a while back that is available here.

If you just want to start interrogating some darned RAW files, you can find this tool in a number of formats in a couple of places.

You can get the most recent release at GitHub here and you can get the earlier releases here.

That's a lot of links? Yup! Sure is, but check out what you can do with this awesome toolkit!


Not only can you dig into single RAW files, but you can monitor big piles of RAW files! On the left, you see the MS2 injection time increasing as the instrument displayed here got to a bunch of injections. Consequently, on E you can see that the ID rate started plummeting.

A super interesting part might be how the loss of instrument sensitivity doesn't line up exactly with the subsequent drop in IDs. We've got some wiggle room on trap instruments, particularly when the fill time doesn't get as high as the transient time. I bet if we dig into this dataset (available here!) that is when the two line up (when fill time > transient).

This is just a single example of what this great new resource can do for us!

One more link, though! Once you've processed your data through this awesome package -- you can visualize it with a simple ShinyApp, either available online here or by running it locally on your computer! Pretty great, right?!?!?

Wednesday, November 21, 2018

Facebook is still terrible, but this ad is on point....


I just got off my third self-imposed "Ben had an over-the-top tantrum about US politics" Facebook ban and logged in to upload 75 pictures of my 18 dogs -- and -- for the first time ever, that awful site tried to sell me something useful!

Need a $25 oscilloscope? Me either!

I'm not gonna buy something from a guy who is played in a movie by the same guy who plays Lex Luthor. For real. That's weird. But you can find this for sale if you Google "DSO138 Oscilloscope".  Hey -- if you have nieces or nephews who have ADHD nearly as severe as yours? You can get it as a kit that looks like it will take FOREVER to assemble. Holiday distraction accomplished!

Bonus:

Friday, November 16, 2018

Your data has been Normalyzered with NormalyzerDE!!


Proteomics needs more statistics and this might be one of the easiest ways to get them ever!

NormalyzerDE is described in this new paper.


Don't want to read? Me either!

You can just try out the software by going to QuantitativeProteomics.Org.

The direct link to the online NormalyzerDE is here, including great HELP tabs to walk you through getting your data in the right formats.

You might also want to check out the main page that I listed above. There are a ton of tools there. Given how innovative and elegant this one is, I have a feeling some of those other links are definitely worth visiting!


Wageningen Biochem has a really nice sample prep guide online!


I'll add this to the Newbie section over there later --> but I've left this tab open on my phone all week and need to post it somewhere now!

This is a really succinct and well written guide to proteomics sample prep, including steps on managing expectations!

Direct PDF download is here.

Thursday, November 15, 2018

I'm sorry to say the 90s are over. Maybe it's time to move past SeQuest!


So...in the early 90s there was a SciFi TV show about submarines that was on network TV in the U.S. Unlike the hairstyles of the actors, the show hasn't aged all that well.

SeQuest the search engine has fared a good bit better. Heck, I bet if you looked at every proteomics paper that has been published this year the majority used this awesome -- way ahead of its time -- bit of awesome out of the Yates lab.

However, I think I'll always remember 2018 as the year of the PTMs that SeQuest got wrong...and...well the multiple times I crossed flood water far too deep to be safe, cause it won't stop raining on the U.S. East Coast -- but mostly all the darned PTMs I've had to manually evaluate and tell people SeQuest got wrong. I desperately wish I was done calling people, but I'm not...

On the UP SIDE! It's 2018!  We've got options! More than we've ever had.

I looked it up. According to Car & Driver, the best car in 1997 was the Audi A4. I got to drive one around that time and it was AWESOME. But by today's standards.... 21mpg highway (before the EPA here readjusted fuel economy, so -- probably 18 mpg highway) 0-60mph in 9.5 seconds. WHAT?? We were impressed by that? My car is old now, averages well over 45mpg and is over 1 second faster -- and it's driven by a moron. All the time.

I'll always love pastel Reebok Pumps, 90s cars and SciFi, and SeQuest. For real. But I'm a grownup -- you probably are too -- and if we take a step back and detach our emotions we can see that all those things actually totally suck.

Convinced? No? Good! What if I show you more options we have? (Click to expand, if you want!)


This is how I've been running my searches today.

Please note. No JNCO Jeans. No SeQuest.

This is all stuff that is very close to launch (why else would someone let me have it?)

You know ProteomeTools, right? 300,000 synthetic human peptides? What if MSPepSearch was powerful enough to search libraries that big? Cause the new one coming totally will be!

And...it takes 5 seconds to search a RAW file. No joking. Okay...this is a small RAW file. But that's hella fast.



Oh. And you can Percolate it.

Then anything that doesn't match my spectral library? Send it to MSAmanda 2.0!

Okay -- I'm going to back up on my SeQuest criticism. If you have a lot of peptides per protein -- it's fine. Go with it. But if you HAVE to prove that peptide is real? For the love of dog, please do not use that engine!  MSAmanda is free and is so much better. If MSAmanda tells me a PTM is there....it probably is. SeQuest? Break out the ruler. It's manual validation time....

Guess what? You can Percolate that too! (So now you're using equivalent metrics for both engines!)

Didn't match that? Okay -- well -- I don't know when this is coming out for everybody else, but I have a de novo search engine in PD!!!  And I can Percolate that, too!!  (More details later. I updated my PD and it broke my Novor node...and my Xcalibur...but now I have 4.2!)
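
The general shape of that workflow (library search first, leftovers to a database search, leftovers from that to de novo) is a cascade. Here is a toy Python sketch of the idea. Everything in it is hypothetical: the three "engines" are stand-in functions, not real PD nodes or real APIs, and the Percolator rescoring step is left out entirely.

```python
def cascade_search(spectra, engines):
    """Run engines in order; each engine only sees spectra the earlier ones missed.

    `engines` is a list of (name, search_fn) pairs. Each search_fn takes a list
    of spectrum IDs and returns a dict {spectrum_id: identification}.
    """
    results = {}
    remaining = list(spectra)
    for name, search_fn in engines:
        hits = search_fn(remaining)
        for spectrum_id, identification in hits.items():
            results[spectrum_id] = (name, identification)
        # Pass only the still-unmatched spectra to the next engine
        remaining = [s for s in remaining if s not in hits]
    return results, remaining


# Toy stand-ins for the three steps (made-up matching rules, made-up peptides)
def library_search(ids):
    return {i: "PEPTIDEK" for i in ids if i % 2 == 0}


def database_search(ids):
    return {i: "SAMPLER" for i in ids if i % 3 == 0}


def de_novo_search(ids):
    return {i: "NOVELSEQ" for i in ids if i == 1}


results, unmatched = cascade_search(
    range(1, 7),
    [("spectral library", library_search),
     ("database search", database_search),
     ("de novo", de_novo_search)],
)
print(results)
print(unmatched)
```

The point of the ordering is that the fast, high-confidence step goes first and the slow, permissive step only has to chew on what is left.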

These are the tools for Proteome Discoverer fans, but if PD isn't your platform:
MSFragger is blowing up and the data I'm seeing from multiple groups says  >SeQuest
MetaMorpheus gets better every week and is way >SeQuest
Mascot is getting updated at least once/year and >SeQuest
PEAKs? MS-GF+, the 75 search engines in SearchGUI (like Comet!! which is >SeQuest), Byonic, on and on.
Oh and some MaxQuant/Andromeda/Perseus thing that I'm getting very re-familiar with now thanks to BoxCar -- and -- there are very good reasons that people are using it. And on and on...

Look, I'll probably queue up some mouse runs before I go to bed on SeQuest. And not just because mouse studies of human cancer when you've got a hospital full of humans with human cancer on your campus are stupid (cue a biologist to tell me things like -- "Ben, I can't test a new drug on humans"...or "...sigh....I need sections from brain for this study....I can't exactly take these from humans..sigh..." To these comments, I say please see the disclaimers section or the "driver of Ben's car is a moron" statement above.)  😇🙉😇, but because I've got deep fractionation and I'm probably going to average 30% coverage of the proteins for all this work. SeQuest will be just fine. Will I get all the spectra? No. Will I be able to trust my PTMs? After you check them with a calculator and ruler, sure!

This is more of a rant that started with something about submarines, but it's a new age and we've got so many awesome tools to work with now!


Sunday, November 11, 2018

Untargeted analysis of the airways of children with respiratory infections!


According to the sign at my pharmacy it's cold and flu season again! Miss Puff (above) isn't sick. She's 328 years old and runs around like a psycho in 5-10 minute stretches each day and does basically the above the rest of the time.

Why hasn't proteomics solved these viruses yet? Maybe it has. I forget to look each year until I get a cold myself -- surprise! this is a self-centered blogger, which I assume is an oxymoron.

I don't know if this great team solved my problems, but this study is super cool!


Ever thought about how you might sample the proteomes of a human's upper respiratory tract? Neither have I!  Turns out it isn't nearly as easy as you'd think. There are all sorts of rules you have to follow with human beings and they weren't exactly designed with our field in mind.

What you want to get is the epithelial cells that are being affected by the virus or bacteria or whatever, and maybe the cells that are doing the immune stuff to kill those things. You need to start with a swab and then you need to get the proteins off those swabs that you can make into peptides.

This study looks at multiple methods to get the best yield -- and even how to optimize it for TMT experiments....I'll just steal the figure....


Man, I like this paper. EDIT - RAMBLING ABOUT NOTHING, PLEASE IGNORE: [Or...I like DayQuil....will we one day look back on over the counter medications of this era with the same kind of disgust we now have for prior procedures like leeches? "Hey! Let's put a stimulant (pseudoephedrine) and a potent dissociative compound (dextromethorphan HBr) and one of the most potent liver toxins ever discovered (acetaminophen) in one big ol' pill. Let's label it for 'staying awake while you're sick'. Don't think it'll work? Let's color it bright orange so it doesn't at all look like an insane thing to ingest!"

For an off topic read, this study is awesome. They review all sorts of evidence on cold "remedies". Turns out basically nothing has any solid evidence of being useful in any way. Possibly, I'm exaggerating, but this bit is gold, though --

--- but I bet the people with the placebo weren't as scary behind the wheel of a car!!  ]

BACK TO PAPER: Sorry -- no, I seriously do like this paper. Whoever designed this experiment wanted to do some solid science. Big cohort. Loads of variable control. Patient diagnoses confirmed by PCR. Really nice iterative approach -- oh yeah, and some solid mass spectrometry!

You know how you get the proteins off these swabs? You use Universal Transport Buffer or something, probably some medical requirement -- and it's loaded with BSA. Yup, you have to take a tiny number of cells and bury them in B.S.Albumin.

Guess how many proteins you get if you don't deplete it? Even if you run it on a Q Exactive using 50cm columns and 3 hour gradients? Four. 

Possibly I'm exaggerating, but it wasn't that long ago that number would be pretty close to accurate.

Heck, I'll steal another figure! Why not?  OPEN ACCESS, FTW!!


Okay -- straight up -- you took a swab, stuck it in someone's nose or throat or whatever, stuck it in a tube full of BSA and you get as many as 2,400 human proteins ID'ed? THAT.IS.AWESOME. I love this field and where we are right now!!

Protocols B, C and C2 all do something great. They use a genetics kit that is made to get DNA and RNA out of a cell. I'd almost bet you based on the manufacturer of the kit that the protein is waste material in the kit instructions. In this case, they use it. B is that material with label free proteomics. C is with TMT and C2 is TMT but they increase the gradient from 3 to 6 hours.

See why I love this paper? All those nerds doing genetics stuff may be throwing away protein material that is cleaner and higher yield than what some of the normal protein extraction techniques we made up can even generate (Protocol A is pretty normal). I haven't even gotten to the figures where they try to make sense of the medical data here -- there are loads of pretty plots. All the data was processed in MaxQuant, even the TMT data, though they did the correction factors themselves outside of the program.

The final paragraph tells you that all the pretty pictures were made in R using super secret scripts. All the data has been deposited and you can get it here!


Saturday, November 10, 2018

MassComp -- reduce mzXML data files with no losses!


Back when desktop computers were manufactured by smaller companies there was one called MassComp. I didn't know this until I discovered the MassComp I want to talk about hadn't designed an icon for itself yet.


This is the MassComp and you can get it here!  It's new -- so new, in fact, that you may have to compile it yourself. However -- this is gooooood stuff. Have y'all seen this?


Have you seen more of them if you move a file from one place to another a few times? That might mean that your data transfer protocol is not error checking, or the compression thing that you are using to move that data file is not error checking. That's the opposite of MassComp. It's anti-lossless data compressing. (If you do see that symptom talk to your IT people!!)
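
If you want to check for yourself whether a file survived a move or a compress/decompress round trip, comparing checksums is the usual trick. Here is a minimal Python sketch using the standard library `hashlib`. This has nothing to do with how MassComp works internally, and the function names are just mine.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so big data files don't have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_match(original_path, copied_path):
    """True if both files have identical contents (same SHA-256 digest)."""
    return sha256_of_file(original_path) == sha256_of_file(copied_path)
```

Run `files_match` on the file before and after the transfer. If it comes back False, something along the way changed your data.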

MassComp is just for mzXML now, but a lot of people use that!