Wednesday, May 22, 2019

Virtual lock masses!

Time Of Flight (TOF) instruments can be ridiculously fast. Next to today's FTMS (mostly Orbitrap) instruments, though, they fall short in two pretty serious regards.

Is this new paper in bioRxiv something that could finally lift one of these limitations?

My blogger-at-5:30AM interpretation is:

1) Sensitivity -- with the exception of TIMS, there is no way to accumulate signal before shooting the ions down a flight tube -- this is a generalization, but I expect around 100x less signal on a TOF compared to even the oldest Orbitrap systems (also keep in mind that more TOF resolution = less TOF signal since the ions are traveling farther in an imperfect vacuum system.)  This is even worse, considering that I sometimes acquire signal in C-traps prior to Orbitrap analysis for 1000ms (or more) when I want to prove that a QE can exceed even quadrupole LOD/LOQ

2) Accuracy -- again, more generalizations, but most TOFs without consistent lockmass are accurate to one decimal place, where Orbitraps are at least to the second decimal and almost always to the third. (TOFs are also massively affected by temperature in the room, with some needing calibration many, many times a day.)

Aha! Okay -- so I haven't run a TOF in a long-ish time now -- but -- I know consistent lockmass sprays are in use that improve #2, but could you mathematically drop in a virtual value and get improved accuracy all the time?  These authors sure think so.
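To put numbers on the accuracy complaint and on what a lock mass buys you, here is a minimal Python sketch. It is NOT the method from the preprint (which works out its reference points from the data itself); it's just the plain single-point rescaling a sprayed lock mass gives you. The 20 ppm drift and the peptide m/z are made up for illustration, and the lock mass value is the polysiloxane background ion that many people use.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def lockmass_correct(mz_values, observed_lock, theoretical_lock):
    """Rescale every m/z by the ratio that puts the lock mass back on target.

    Single-point multiplicative correction; an illustration only, not the
    virtual lock mass algorithm from the preprint.
    """
    scale = theoretical_lock / observed_lock
    return [mz * scale for mz in mz_values]


# "One decimal place" vs "third decimal place" at m/z 500, expressed in ppm.
one_decimal_ppm = ppm_error(500.1, 500.0)      # roughly 200 ppm
third_decimal_ppm = ppm_error(500.001, 500.0)  # roughly 2 ppm

# Hypothetical drift: everything in the spectrum reads 20 ppm high.
theoretical_lock = 445.120025      # polysiloxane background ion
observed_lock = theoretical_lock * (1 + 20e-6)
peptide_true = 785.8421            # made-up peptide m/z
peptide_observed = peptide_true * (1 + 20e-6)

before = ppm_error(peptide_observed, peptide_true)
corrected = lockmass_correct([peptide_observed], observed_lock, theoretical_lock)[0]
after = ppm_error(corrected, peptide_true)

print(f"0.1 off at m/z 500 = {one_decimal_ppm:.0f} ppm")
print(f"0.001 off at m/z 500 = {third_decimal_ppm:.0f} ppm")
print(f"peptide error before lock mass: {before:.1f} ppm, after: {after:.4f} ppm")
```

The rescaling only works this cleanly because the made-up drift is the same relative error everywhere in the spectrum; real drift doesn't have to be that polite, which is presumably why you'd want more than one reference point.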

I tell you what. I was seriously proud of my brain as far as page 5. For real. I probably didn't actually know what was going on, but some combination of espresso and birds chirping at 5:30 made me feel like I did. Go brain go! I had some random thought about "Vince Carter's coming back for one more year, maybe there is still hope for you down this senescence water slide, old brain!" ...then...

...maybe it's time to walk the dogs and look at flowers....

Importantly -- yes, this totally looks like it works, though my dummy's interpretation is that you need a lot of spectra to get the maths to work out, but once you have that pile of spectra you can use learning machines to massively improve the quality of the TOF data.

Now -- I know people often think that the mass accuracy off the Orbitraps is probably more than we need -- and just about no one uses MS/MS lockmass on their Tribrid systems, even though that capability is there -- but -- is it enough? Will future bioinformaGicians look back on our data and wonder "why on earth didn't they lockmass their MS/MS spectra -- I need it for this (insert undiscovered chemical modification here)!??!" I don't know, but it's nice to think that we could improve the data we already have!

Tuesday, May 21, 2019

IPSA -- A handy tool I bet you're going to use all the time!

Sometimes you just need to step back, look at where your pain points are in your field and just put someone smart on fixing them forever.

Pain point? Annotating all those stupid fragments that your GUI output software doesn't feel like doing for whatever reason....or does but it looks awful....

Solution?  The Interactive Peptide Spectral Annotator! You can read about it ahead of print here

I highly recommend you read the paper. I will as soon as I have time to breathe. Writing this and generating the meme took up all of my free time this morning. (Time well spent, in my ever humble opinion).

However -- if you are also pressed for time -- this tool is so amazingly intuitive that you can get going with it right now!

The online version is here!

The Github so you can use it and integrate it with your bioinformagical powers is linked in the paper.

I just punched some random peptides into the online thingy here....

(..I mean..I was already on the site...but -wow, the output is super sharp)

Saturday, May 18, 2019

MARMoSET -- Get publication ready data out of your RAW files!

OMG. I'm either sleepy and/or this is hilarious. Full song and video at YouTube here.

What I'm talking about, though, is this great new paper in press at MCP.

How much time have you spent trying to get data out of your RAW files and to a resolution that you could actually submit?

Bonus -- As someone who has spent a decent amount of time trying to come up with cool abbreviations for methods -- this is an awesome example.

P.S. -- Have we had the UrbanDictionary talk? If not, we should. Please -- before you publish that cool word you came up with for your method -- please -- check Urban Dictionary for it first. This is to ensure that everyone looking for your new software or technique online doesn't first come across something terrible you weren't aware human beings were physiologically capable of doing. Definitely may have happened to someone in our field recently. Don't be a statistic, please.

This is how MARMoSET works -- and how they got the name!!

Friday, May 17, 2019

Focus on the spectra that matter, duh!!

How did I miss this!??!  Looks like I retweeted it months ago, but didn't read it?!? 

How much time are you spending processing PSMs from proteins like this?

12 minute CE-MS/MS run -- 145 PSMs from Tubulin B4. Who cares about Tubulin B4? Literally no one who has ever lived on the planet earth. Okay -- maybe 6 people who ever lived on this planet.  I don't care about tubulin. Especially not enough to waste 145 MS/MS events in a 12 min CE-MS/MS run.

Why don't I care?
Because it isn't changing. I only care about quantitatively changing proteins. Or PTMs that are quantitatively changing. If the ratio is 1:1, no one cares about your dumb protein.

How do we do proteomics, though?

We identify everything -- then we quantify it. If we're really lucky our software does both at the same time.

What we should do is use something like Quandenser/Triqler (a DINOSAUR IS INVOLVED, which is another program, but that's okay) to find out what is changing first -- THEN ID IT! 

You can get this program from THE Matthew here.  This program works for label free stuff.
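The "quantify first, THEN ID it" idea can be sketched in a few lines of Python. To be loud about it: this is not what Quandenser/Triqler actually do under the hood (they're a lot smarter about the statistics); it's only the bare logic of throwing out the 1:1 stuff before you spend any effort identifying it. The feature names, intensities and the 2-fold cutoff are all made up.

```python
import math


def changing_features(features, min_abs_log2_fc=1.0):
    """Keep only features whose abundance ratio is meaningfully off 1:1.

    features: list of (name, control_intensity, treated_intensity) tuples.
    Returns (name, log2 fold change) for everything passing the cutoff.
    """
    kept = []
    for name, control, treated in features:
        log2_fc = math.log2(treated / control)
        if abs(log2_fc) >= min_abs_log2_fc:
            kept.append((name, round(log2_fc, 2)))
    return kept


# Hypothetical MS1 features quantified BEFORE any identification.
features = [
    ("feature_A", 1.0e6, 1.05e6),  # basically 1:1 (hello, tubulin)
    ("feature_B", 2.0e5, 8.0e5),   # up 4-fold
    ("feature_C", 5.0e5, 1.0e5),   # down 5-fold
]

targets = changing_features(features)
print(targets)  # only these would get sent on for identification
```

A hard fold-change cutoff with no replicates and no error model is the crudest possible version of this; it's here to show the order of operations, not how to do the stats.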

Unrelated project, but related idea --

If you want to do something similar for TMT/iTRAQ, you can get RIDAR from Conor Jenkins' GitHub here.

Tuesday, May 14, 2019

Great review of post translational modifications associated with aging!

Okay -- honestly -- I was absolutely following this review from a couple of years ago until it gets to the E. coli stuff. Which is probably awesome/relevant in some way besides making very pretty pictures, but the first part of it is definitely great.

Reviews of PTMs and aging and the literature that has found links between these? Fantastic! You can check it out here.

Monday, May 13, 2019

Effects of APOE on Brain Proteomic Networks!

Somewhere over there -->
I put together a guide a few years ago about what to do after you have proteomics data. I'm glad to hear some people have found it useful, but I know it isn't very good. Truth is...I don't really know what you do next to work out a mechanism if GiaPronto or Ingenuity doesn't say "It's this one!!"

This amazing paper in a journal I've never heard of brings a new set of tools on stage and goes through them in painstaking detail.

I strongly recommend reading it in the HTML format over the PDF because having the references side-by-side makes it much easier to pop a tab open and figure out what they're doing (in case it's a word or stack of letters in a row you've never heard of).

...which...makes me wonder why everything isn't formatted this way!!

Don't get me wrong, the proteomics here is stellar. (They run a Fusion in HighSpeed mode using 50cm columns and 150min gradients and combine this with data from another cohort that used a QE Plus). Most of the methods and basic data analysis are based directly on the spatially resolved mouse brain study Max Planck did a few years ago, but it's the amazing level of detail in both the experimental design and downstream analysis that makes this something I'm very glad I read this weekend!

Saturday, May 11, 2019

Baltimore/Washington Mass Spec Area -- Let's talk about new software!

Thank you @the_ion_doctor for reminding me of this thing I agreed to do a year ago that I definitely wouldn't have forgotten about otherwise, probably!

In the area? Want to talk about data processing? Find our little group here!

Let's talk about --
OpenSearch Strategies (Fragger, TagGraph) and hybrids like MetaMorpheus
Second searching!
Using MS1 libraries
Scaling data processing way up with GPU and Cloud-based processing
And how to take all this lousy nextgen sequencing crap and make it into something that you can actually search with?

That's what the overview on slide 3 of my totally completed and well-orchestrated slide deck says we'll do and that probably won't umm..change...too much... Slides so good I'll post them here after the talk.  Be careful when removing the limiters, though....

Wednesday, May 8, 2019

Register for the ASMS Skyline User Group Meeting!

There is no shortage of things to do the weekend before ASMS! However, if you haven't already signed up for awesome workshops the whole weekend -- registration is still open for Skyline!

You can register here!

1) It's in the Georgia Aquarium
2) If you're really super extraordinary, you might end up with something like this:

3) The speaker lineup is (again this year) loaded!

4) I might finally get to meet my collaborator -- schizophrenia proteomics expert Matt MacDonald in person, since he's doing one of the talks.

Related -- a study we both worked on just went live this week at AJP (classic medical style journal, I'm pretty sure it went into review in 2011....his group at Pitt did all the LC-MS work, so it's really good.)

Tuesday, May 7, 2019

PROTEOFORMER 2.0! Don't compete with RiboSeq, Assemble with it!

Gotta move fast on this one, but -- holy Unicron -- this is exactly what I was looking for. I've got proteomics and RNASeq and some PacmanBio stuff of some samples and this very frustrated researcher was explaining to me that he had "RiboSeq" of the same thing and I just kept staring off into the distance because I do that when I'm trying to assemble information I don't understand and it makes people think that someone is sneaking up behind them. And -- check this out!

Don't know what Ribo-Seq is? Me either! Here is a Wikipedia article.  My understanding is that it bridges the gap -- only things that are actively getting to the ribosome for translation are sequenced. It's the closest you get with probes and genetics before getting to the proteome!

Obviously this isn't new -- there appears to be a 1.0 version, for example -- but it's new to me on a totally new concept -- and it's so powerful that this group uses it to identify NEW PROTEOFORMS from shotgun data!

Bonus -- 1980s chemistry courtesy of poorly paid animators.

Monday, May 6, 2019

ANN-Solo returns with a GUI and GPU processing!!

Okay -- there are still some downsides to ANN-Solo in my mind, but they're mostly because I'm old and dumb. Here is an older post I made on this great software.

Downsides -- Python. The world's most approachable powerful coding software. You need to use it. If you've got neuroplasticity left you can probably learn it in an afternoon. Mine's all used up.  Fortunately I have Python experts around my lab! Two all the time and 3 on Thursdays!

Downsides -- Linux. Wait. You can use OS-X? Is that what Macintosh uses? I forget.

What am I rambling about? This new preprint featuring the return of ANN-Solo with even more power (and a GUI) 

Want a reason to get a Linux/NumPy thing all set up on something in your office?!?!

5.6 milliseconds to search an MS/MS spectrum against a spectral library! To get this fast, there is a catch, though, you have to use a Graphics Processing Unit (GPU).

GPU data processing isn't new. My old Waters Q-TOF used it 7 years ago. You can buy at least one commercial data package that uses it, and I think Darryl Pappin was messing around with them quite a while ago.

GPUs have TONS of cores. My 1080 Ti in the PC I'm typing this on has 3584 cores (they're called CUDA cores). Compare this to my CPU that has 20 cores or threads or so. However, each CUDA core in a GPU is weak and dumb and only capable of doing small tasks, like controlling a few pixels in a video game or performing the same dumb math problem over and over again in order to try and construct a block to win you a Bitcoin. They have another big downside, as well, in that if you exceed what they can do, they aren't smart enough to stop the code. They "overflow" or something and the little core outputs gibberish.

ANN-Solo breaks high resolution data files into tiny little parts so that each little core only has to do a little -- then you can use the thousands of available cores to tear through files.  Compared to a decent CPU, they drop the time to process IPRG2012 data from 50 minutes to 6 minutes.
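Quick back-of-the-envelope on those numbers, as a Python scribble. The only inputs are the figures quoted in this post (50 minutes, 6 minutes, 5.6 ms per spectrum); everything else is just division, and I'm not assuming the 5.6 ms figure and the 6 minute figure come from the same benchmark.

```python
cpu_minutes = 50        # quoted CPU processing time
gpu_minutes = 6         # quoted GPU processing time
ms_per_spectrum = 5.6   # quoted per-spectrum search time on the GPU

speedup = cpu_minutes / gpu_minutes
spectra_per_second = 1000 / ms_per_spectrum
spectra_per_hour = spectra_per_second * 3600

print(f"GPU speedup over CPU: about {speedup:.1f}x")
print(f"Search rate: about {spectra_per_second:.0f} spectra/second")
print(f"That is about {spectra_per_hour:,.0f} spectra/hour")
```

So call it an 8x-ish speedup and a couple hundred spectra a second on a single card.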

In the defense of the CPU, they use a VERY good GPU. The study uses the new 2080 GPU. That's a >$800 card and pretty tough to justify if you're just using it for gaming....

Hey Ben -- this doesn't sound all that exciting....what's the big deal!??!

Have you seen some press releases at your university or facility talking about the "17 zillion (insert fake sounding number here) cores at the new High Performance Computing Thingy" that anyone can have access to? I bet you have.

Have you tried to use it? Were you surprised when someone complained that you used 48 cores for 24 hours and wondered what's up?  They're probably talking about CUDA cores and they probably don't actually have all that many of the cores that your software uses.

I bet the details are in the paper. I dunno, but in general you can link GPU after GPU together. What if ANN-Solo is the way to use all that HPC stuff? How cool would that be? And that's why I'm excited...


Sunday, May 5, 2019

HDX vs FPOP -- They're both better than NMR, but which one is the bestest?

Structural mass spec is advancing like crazy. Crosslinking keeps getting better -- with new reagents, better separation and improved MS/MS techniques and -- of course better software (1, 2, 3) leading the way. (Here is an awesome recent review on all things proteomics structural.)

Crosslinking gives us a lot of power by letting us know what amino acid residues are close or reeaally close together (is this one commercially available yet?)

There are weaknesses here, though. Things can be close together without being within the space necessary for us to chemically lock them together. Conformations of relatively "low stoichiometry" (hey -- if phosphoproteomics can use this terminology as if it's okay for 15 years, let's keep bending it, it's almost a meme now!) are going to be impossible to see AND we learn nothing about modified residues or the outside of the protein structures.

Two techniques are improving all the time that can give you a lot more of this information -- and they go head-to-head in this study I left myself a note to read in September and just found.

Chances are you know more about these techniques than I do -- but I'm learning 'cause I think they're only going to become more important all the time!  HDX requires some arduous sample prep up front or the purchase of an add on system for your mass spec that does all the work for you. Deuterium can't get to the inside of your protein as effectively, so anything that gets labeled is on the outside. Boom! I know what the inside and outside of this protein and protein-protein interaction is like and tons of smart software exists that helps interpret the data. Workbench is a good example!

FPOP uses hydrogen peroxide and

...lasers to modify the outside of proteins, protein complexes, and -- holy shit -- have you seen this?!? -- even works inside whole cells!!  Study 1, study 2. The downside is that the modifications on the outside of the protein may be unpredictable. Better data processing, better resolution and accuracy data have helped make this easier, but it's still tough at this point.

Both of these techniques are better than NMR, obviously, because I don't have an NMR and the whole concept seems really old fashioned, and helium is not getting cheaper -- if you are a new lab starting out you may find that it's hard to get Helium at all (we had to go through 3 vendors and lie and say we're doing medical research (!!kidding about the last part!!) -- to even get tanks!)

 In the study that I originally started talking about (comparing HDX of the same protein to FPOP) HDX comes out on top. The comparison might not be the most fair, though.

1) HDX analysis was performed on a QTOF system
2) FPOP analysis was performed on a QE Plus system

I know someone who has an HDX QE Plus (Hi!) and I wonder if she'd get the same results with the same material? I think the higher resolution and higher sensitivity of system #2 is kind of seriously essential here, maybe particularly because the QE Plus for the FPOP used 300ms(!!!) of fill time for MS/MS, indicating that even with a pure protein collecting a lot of ions is a critical component, which the TOF utilized in this study can't do (only one TOF kinda sorta can, right?!? I'm only putting question marks here because I've discovered there were some really interesting hybrid TOF technologies that were developed in the past that we all seem to have pretty much forgotten about). Looks like FPOP wins! I like this result because HDX sample prep seems too finicky for me to ever get it right!

Saturday, May 4, 2019

Cloud Connected?

What is this thing that has the same name as one of my all time favorite songs from my all time favorite band?!?!?

It's the "control your instrument from freaking anywhere -- doesn't seem to care about your IT jerk's firewall settings -- can even install on your phone Cloud Connection utility"!

You can get it here.

Remember when this blog was good?

Me either!!

Friday, May 3, 2019

ThermoRawFileParser -- get more data out of your RAW files!

The first step in any data processing workflow is getting the actual numbers out of your vendor's proprietary format. This is something most/all of us just take for granted. I did until this dumb paper messed up some perfectly established complacency on the topic back in December --

And this is where the cleverly named ThermoRawFileParser comes to the rescue!

How you get the data out of your RAW files matters. Proof?!?!?

That's 3000 more peptides! For free? For just improving how you pull the data out of the RAW file and make it into mzXML!?!?

Tuesday, April 23, 2019

Ten simple rules for better figures!

I might just be leaving this paper here for me and my lab as we continue frantically preparing posters, talks and preprints for conference season....worth checking out, though!!

Monday, April 22, 2019

Nonsense Induced Transcriptional Compensation (NITC)!

A bunch of new studies just dropped in Nature this month that go a long way toward explaining a lot of results that seemed like nonreproducible nonsense.

The best place to start is this summary!

Here is the paradox mentioned above in my dummy's view of it: in higher organisms you can "knock out" a gene by messing up its structure or you can "knock down" a gene/gene product by messing with its regulator or silencing its RNA production.

You'd think that decreasing the amount of gene by either mechanism would have the same effects ('cause, obviously, the gene doesn't actually do anything --- only the protein and how much of it is around is what is important) -- and sometimes it does.

But, other times the knockout and the knockdown will have VERY different effects on the phenotype. This will look like
1) Either the knockdown or the knockout didn't work
2) The person measuring the phenotype is:

But now there appears to be a mechanism -- it looks like short "nonfunctional" products from knockouts (often called nonsense mutations) may lead to compensation by causing upregulation of similar genes. Which -- from an evolutionary standpoint seems to make a lot of sense, right? Why have 30,000 genes or whatever if one single amino acid variant could call a stop codon and shut the whole organism down?

The evidence is described in zebrafish in this and this great new study(ies?).

Sunday, April 21, 2019

Sci-Hub -- a terrible illegal misuse of the internet you should never ever use.

If Shaq can't convince you not to do it, I probably have no chance, but I'm going to try.

Never ever ever use SciHub. It's totally illegal and bad for publisher profits and we need healthy academic journals making money in order for science as we know it to work today.

What is SciHub? Oh -- it's a way to get any article from any journal in a free way via illegal unethical means that you should never ever use.

How do you use it?

You don't. But if you were to, you'd

1) Find a paper that you can't get access to without doing it the right way (the right way being -- giving Elfsevier $42 for a research study performed by the U.S.A. HHS (which is -- by US law -- REQUIRED to be open access))

2) You'd definitely (for real, I'm not joking, this is just for helping you not make the mistake of accidentally doing it) never Google the words "where is SciHub now"

3) If you didn't listen to my advice and actually did this you'd get a link that would lead you to these unethical criminals' website, where you'd be asked to enter the paper identifier or direct link.

4) Then this criminal website would provide you this study PDF that US taxpayers had already funded completely so that anyone in the world should have access to it, but of course it would be a total crime for you to access without paying the publisher for it.

Again -- this was for purely 100% scientific inquiry. I would never endorse the use of this service. I strongly recommend that you not use it. I encourage you to report anyone who does to the proper authorities.

Keep science ethical, yo.

Tuesday, April 2, 2019

Reliable identification of lactic acid bacteria in food products!

Microbial identification in complex matrices by LC-MS/MS isn't new. I know people within an hour of my house that have published papers on this going back 10 or 15 years (APG, FTW!)

However -- it never seems to make it into routine usage. Part of it might be the technical aspects, but part of it might be that you need to really understand your matrix and what you're looking for. This great new study provides an awesome twist on microbial ID and a way that I could envision LC-MS/MS really going into daily use! 

What's the twist? There are bacteria that we do want around -- particularly in food preparation. But when you've got large populations of bacteria making yogurt or cheese for you it can be hard to tell those from the ones you don't want.

By clever selection of peptides from the extremely well characterized organisms of interest that should never exist in your starting material (or, presumably, in the bacteria you don't want) you can use either untargeted or targeted nLC-QE methods to identify good vs bad microbes.

The reason I'm thinking this is a great example of one that we can adapt to routine analysis is that the signal for these peptides is off the charts -- the XICs on the nanoLC allow you to see even the M+4, maybe even the M+5 isotope on these peptides! You're only getting that when you've got TONS of signal. If you can make out an M+4 at 200nL/min nLC -- this is an ion that you can EASILY find 3 isotopes for using analytical level flow rates (200uL/min)!!!

Yes, I've been pretty hard on the ElfSeverers this week. Good science still ends up there, though, right now and this is one example.

Monday, April 1, 2019


If you do have an ElfSeverer account you might want to check your Spam folder. I just got this email this week and should change my password.

For your information, Little Bunny FooFoo was already in use.

This is not an April Fool's day thing, btw.

Sunday, March 31, 2019

Inductively Coupled Plasma (ICP) -MS for Proteomics stuff!

Hey! What's this thing? This is my INDUCTIVELY COUPLED PLASMA MASS SPEC thing that is being installed on Monday!

What does it do? It uses a frickin' argon plasma beam to convert anything (even metals!) to gases and then basically uses unit resolution to tell the elements apart.
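If "unit resolution to tell the elements apart" sounds too simple to be true, here is about how simple it is, as a toy Python sketch. The lookup table only holds the nominal mass of the most abundant isotope for four elements that come up in this post, and it pretends every ion is singly charged and that nothing else ever lands on the same nominal mass. Real ICP-MS software deals with interferences and isotope patterns; this doesn't.

```python
# Nominal (integer) mass of the most abundant isotope for a few elements.
# Toy table for illustration; not a complete isotope list.
MOST_ABUNDANT_ISOTOPE = {
    102: "Ru",
    192: "Os",
    195: "Pt",
    208: "Pb",
}


def assign_element(measured_mz):
    """Round to the nearest unit mass and look the element up.

    Assumes singly charged ions and no isobaric or polyatomic interferences.
    """
    nominal = round(measured_mz)
    return MOST_ABUNDANT_ISOTOPE.get(nominal, "unassigned")


for mz in (194.96, 207.98, 101.90, 150.20):
    print(f"m/z {mz:.2f} -> {assign_element(mz)}")
```

That's the whole trick: you don't need four decimal places when the only question is "is there platinum in here and how much?"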

It's pretty sensitive -- this white paper tells you how to set up an older model than mine (!!) to quantify how many molecules of cisplatin (a chemotherapeutic that has Platinum in it) are in each individual cell by turning each whole cell into gas phase primary elemental components and measuring the Pt signal.

For those of you who don't live on continents that have banned the ElfSeverer journals -- there was just a special issue about it in one of their journals --

-- I can't access it either. I think the paywall just asked me for more money than Mario Kart for the Nintendo Switch -- and I don't yet have Mario Kart for the Switch...it's the same game as the Wii U...which I do have...but....

...look how cool that guy is! He can play Mario Kart on a fake airplane (where is the other seat!??!) with his Switch! That could be me, I just have better hair.

Sorry ElfSeverer....I'm going to read some other papers and -- whoa -- this is almost related - check this great new study out!

We use a lot of metal based drugs in the clinic. And we use them cause they kill cells we want dead. However, just like everything else out there, we often don't know the whole story.

Here, this team is interested in the mechanism of action of Plecstatin -- which isn't active until it's metabolized -- making the mechanism of action tougher to work out. New drugs based on ruthenium and osmium are kind of important.

As you might be aware -- Platinum is expensive. You can make a LOT more drug for the same price if you can replace Pt with another transition metal (Google says Pt is 30x more expensive than Ru) -- which is one reason to follow up on new drugs. According to this paper -- when they've done these switches it hasn't been exactly apples to apples. The drugs may be just as effective but the mechanisms may be a little different(!?!?)

They sample at multiple time points and use a load of techniques to figure out what is going on with this drug. All the ICP-MS is in the discussion and it sounds like it was used previously cause you can track the Ru easily with that technique. What they do here is pull-down shotgun analysis with a Q Exactive using biotin labeled Plecstatin-1 and activated drug.

The two drugs pull down very different populations of proteins -- with the prodrug interacting with proteins that just appear to be from initiating a generic stress response -- the activated form, however, is binding strongly to the target (Plectin?)

They do some further stuff where they modify the drug and repeat to try and figure out the important parts of the drug toward the mechanism of action they're interested in -- and how it changes over time and it appears to point to specific bonds as critical!

What is the moral of this story? Ben needs more hobbies? Probably. But maybe -- it's that ICP-MS can be a valuable tool for drug mechanism work and all sorts of other things (mine's primary job will be trying to protect young people from inhaling lead particles...a problem that is increasing rather than what it should be doing) but there are a bunch of other things it can do (and if you have some ideas where 3 minutes of intense ionization with an Argon beam might help you solve a puzzle, reach out! No joke -- it's 3 min per run. I'm worried about how to keep it busy!)