Wednesday, May 22, 2019

Virtual lock masses!


Time Of Flight (TOF) instruments can be ridiculously fast. Next to today's FTMS (mostly Orbitrap) instruments, though, they compare poorly in two pretty serious regards.

Is this new paper in bioRxiv something that could finally lift one of these limitations?


My blogger-at-5:30AM interpretation is:

1) Sensitivity -- with the exception of TIMS, there is no way to accumulate signal before shooting the ions down a flight tube -- this is a generalization, but I expect around 100x less signal on a TOF compared to even the oldest Orbitrap systems (also keep in mind that more TOF resolution = less TOF signal since the ions are traveling farther in an imperfect vacuum system).  This is even worse, considering that I sometimes acquire signal in C-traps prior to Orbitrap analysis for 1000ms (or more) when I want to prove that a QE can exceed even quadrupole LOD/LOQ.

2) Accuracy -- again, more generalizations, but most TOFs without consistent lockmass are accurate to one decimal place, where Orbitraps are at least to the second decimal and almost always to the third. (TOFs are also massively affected by temperature in the room, with some needing to be calibrated many, many times per day.)


Aha! Okay -- so I haven't run a TOF in a long-ish time now -- but -- I know consistent lockmass sprays are in use that improve #2, but could you mathematically drop in a virtual value and get improved accuracy all the time?  These authors sure think so.
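For anyone who hasn't stared at lock masses lately, here is a minimal sketch of the plain old single-point version of the idea: measure an ion whose m/z you already know, work out how far off it is, and rescale everything else in the scan by the same factor. This is NOT the virtual lock mass algorithm from the preprint. The function names and the peptide peak are made up for illustration, and the 20 ppm drift is just an assumed number.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def lock_mass_correct(spectrum_mz, lock_observed, lock_theoretical):
    """Rescale every m/z by the ratio that puts the lock mass back where it belongs."""
    scale = lock_theoretical / lock_observed
    return [mz * scale for mz in spectrum_mz]


# Assumed scenario: the polysiloxane background ion at m/z 445.120025
# (often used as a lock mass) is measured 20 ppm too high.
lock_theoretical = 445.120025
lock_observed = lock_theoretical * (1 + 20e-6)

# A made-up peptide peak from the same scan, drifting by the same 20 ppm.
true_peak = 785.8421
observed_peak = true_peak * (1 + 20e-6)

corrected_peak = lock_mass_correct([observed_peak], lock_observed, lock_theoretical)[0]

print("before:", round(ppm_error(observed_peak, true_peak), 2), "ppm")
print("after: ", round(ppm_error(corrected_peak, true_peak), 2), "ppm")
```

A single multiplicative correction only works this neatly when the whole scan drifts by the same relative amount, which is the assumption baked into this toy example.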

I tell you what. I was seriously proud of my brain as far as page 5. For real. I probably didn't actually know what was going on, but some combination of espresso and birds chirping at 5:30 made me feel like I did. Go brain go! I had some random thought about "Vince Carter's coming back for one more year, maybe there is still hope for you down this senescence water slide, old brain!" ...then...



...maybe it's time to walk the dogs and look at flowers....

Importantly -- yes, this totally looks like it works, though my dummy's interpretation is that you need a lot of spectra to get the maths to work out, but once you have that pile of spectra you can use learning machines to massively improve the quality of the TOF data.

Now -- I know people often think that the mass accuracy off the Orbitraps is probably more than we need -- and just about no one uses MS/MS lockmass on their Tribrid systems, even though that capability is there -- but -- is it enough? Will future bioinformaGicians look back on our data and wonder "why on earth didn't they lockmass their MS/MS spectra -- I need it for this (insert undiscovered chemical modification here)!??!" I don't know, but it's nice to think that we could improve the data we already have!

Tuesday, May 21, 2019

IPSA -- A handy tool I bet you're going to use all the time!


Sometimes you just need to step back, look at where your pain points are in your field and just put someone smart on fixing them forever.

Pain point? Annotating all those stupid fragments that your GUI output software doesn't feel like doing for whatever reason....or does but it looks awful....

Solution?  The Interactive Peptide Spectral Annotator! You can read about it ahead of print here


I highly recommend you read the paper. I will as soon as I have time to breathe. Writing this and generating the meme took up all of my free time this morning. (Time well spent, in my ever humble opinion).

However -- if you are also pressed for time -- this tool is so amazingly intuitive that you can get going with it right now!

The online version is here!

The Github so you can use it and integrate it with your bioinformagical powers is linked in the paper.

I just punched some random peptides into the online thingy here....




(..I mean..I was already on the site...but -wow, the output is super sharp)
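If you're wondering what the "annotating fragments" part actually involves under the hood, here is a minimal sketch of the arithmetic for singly charged b and y ions of an unmodified peptide. This is not IPSA's code, just the textbook calculation. The function name is made up, and the peptide is the usual placeholder sequence.

```python
PROTON = 1.007276   # mass of a proton
WATER = 18.010565   # monoisotopic mass of H2O

# Monoisotopic residue masses for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def fragment_ions(peptide):
    """Return lists of (label, m/z) for singly charged b and y ions."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    n = len(masses)
    b_ions = []
    y_ions = []
    for i in range(1, n):
        # b ions: the first i residues plus a proton.
        b_ions.append(("b%d" % i, sum(masses[:i]) + PROTON))
        # y ions: the last i residues plus water plus a proton.
        y_ions.append(("y%d" % i, sum(masses[n - i:]) + WATER + PROTON))
    return b_ions, y_ions


b_ions, y_ions = fragment_ions("PEPTIDE")
for label, mz in b_ions + y_ions:
    print(label, round(mz, 4))
```

Matching these theoretical values against the observed peaks within some tolerance is the boring part that a tool like this does for you, along with modifications, higher charge states, and making it look good.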

Saturday, May 18, 2019

MARMoSET -- Get publication ready data out of your RAW files!


OMG. I'm either sleepy and/or this is hilarious. Full song and video at YouTube here.

What I'm talking about, though, is this great new paper in press at MCP.


How much time have you spent trying to get data out of your RAW files and to a resolution that you could actually submit?

Bonus -- As someone who has spent a decent amount of time trying to come up with cool abbreviations for methods -- this is an awesome example.

P.S. -- Have we had the UrbanDictionary talk? If not, we should. Please -- before you publish that cool word you came up with for your method -- please -- check urban dictionary for it first. This is to ensure that everyone looking for your new software or technique online doesn't first come across something terrible you weren't aware human beings were physiologically capable of doing. Definitely may have happened to someone in our field recently. Don't be a statistic, please.

This is how Marmoset works -- and how they got the name!!


Friday, May 17, 2019

Focus on the spectra that matter, duh!!


How did I miss this!??!  Looks like I retweeted it months ago, but didn't read it?!? 

How much time are you spending processing PSMs from proteins like this?


12 minute CE-MS/MS run -- 145 PSMs from Tubulin B4. Who cares about Tubulin B4? Literally no one who has ever lived on the planet earth. Okay -- maybe 6 people who ever lived on this planet.  I don't care about tubulin. Especially not enough to waste 145 MS/MS events in a 12 min CE-MS/MS run.

Why don't I care?
Because it isn't changing. I only care about quantitatively changing proteins. Or PTMs that are quantitatively changing. If the ratio is 1:1, no one cares about your dumb protein.

How do we do proteomics, though?

We identify everything -- then we quantify it. If we're really lucky our software does both at the same time.

What we should do is use something like Quandenser/Triqler (a DINOSAUR IS INVOLVED, which is another program, but that's okay) to find out what is changing first -- THEN ID IT! 
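To make the "quantify first, then ID" idea concrete, here is a minimal sketch. It is not how Quandenser or Triqler work internally (they use replicates and real statistics). The feature list is invented, and the 2-fold cutoff is just an assumed stand-in for "this is changing".

```python
import math

# Hypothetical MS1 features: (feature_id, intensity in condition A, intensity in condition B)
features = [
    ("feature_1", 1.0e6, 1.1e6),   # basically 1:1, nobody cares
    ("feature_2", 2.0e5, 1.6e6),   # up 8-fold
    ("feature_3", 5.0e6, 4.8e6),   # basically 1:1
    ("feature_4", 8.0e5, 1.0e5),   # down 8-fold
]


def changing_features(features, min_abs_log2_ratio=1.0):
    """Keep only features whose B/A ratio changes by at least the cutoff."""
    selected = []
    for feature_id, a, b in features:
        log2_ratio = math.log2(b / a)
        if abs(log2_ratio) >= min_abs_log2_ratio:
            selected.append((feature_id, log2_ratio))
    return selected


# Only these features would get sent on for identification.
to_identify = changing_features(features)
print([feature_id for feature_id, _ in to_identify])
```

In this toy set only two of the four features survive, so half of the identification effort goes away before a single search is run.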

You can get this program from THE Matthew here.  This program works for label free stuff.

Unrelated project, but related idea --

If you want to do something similar for TMT/iTRAQ, you can get RIDAR from Conor Jenkins' GitHub here.

Tuesday, May 14, 2019

Great review of post translational modifications associated with aging!


Okay -- honestly -- I was absolutely following this review from a couple of years ago until it gets to the E. coli stuff. Which is probably awesome/relevant in some way besides making very pretty pictures, but the first part of it is definitely great.

Reviews of PTMs and aging and the literature that has found links between these? Fantastic! You can check it out here.


Monday, May 13, 2019

Effects of APOE on Brain Proteomic Networks!


Somewhere over there -->
I put together a guide a few years ago about what to do after you have proteomics data. I'm glad to hear some people have found it useful, but I know it isn't very good. Truth is...I don't really know what you do next to work out a mechanism if GiaPronto or Ingenuity doesn't say "It's this one!!"

This amazing paper in a journal I've never heard of brings a new set of tools on stage and goes through them in painstaking detail.

I strongly recommend reading it in the HTML format over the PDF because having the references side-by-side makes it much easier to pop a tab open and figure out what they're doing (in case it's a word or stack of letters in a row you've never heard of).


...which...makes me wonder why everything isn't formatted this way!!

Don't get me wrong, the proteomics here is stellar. (They run a Fusion in HighSpeed mode using 50cm columns and 150min gradients and combine this with data from another cohort that used a QE Plus). Most of the methods and basic data analysis are based directly on the spatially resolved mouse brain study Max Planck did a few years ago, but it's the amazing level of detail in both the experimental design and downstream analysis that makes this something I'm very glad I read this weekend!

Saturday, May 11, 2019

Baltimore/Washington Mass Spec Area -- Let's talk about new software!


Ummm....so...thank you @the_ion_doctor for reminding me of this thing I agreed to do a year ago that I definitely wouldn't have forgotten about otherwise, probably!

In the area? Want to talk about data processing? Find our little group here!

Let's talk about --
OpenSearch Strategies (Fragger, TagGraph) and hybrids like MetaMorpheus
Second searching!
Using MS1 libraries
Scaling data processing way up with GPU and Cloud-based processing
And how to take all this lousy nextgen sequencing crap and make it into something that you can actually search with?

That's what the overview on slide 3 of my totally completed and well-orchestrated slide deck says we'll do and that probably won't umm..change...too much... Slides so good I'll post them here after the talk.  Be careful when removing the limiters, though....




Wednesday, May 8, 2019

Register for the ASMS Skyline User Group Meeting!


There is no shortage of things to do the weekend before ASMS! However, if you haven't already signed up for awesome workshops the whole weekend -- registration is still open for Skyline!

You can register here!

1) It's in the Georgia Aquarium
2) If you're really super extraordinary, you might end up with something like this:


3) The speaker lineup is (again this year) loaded!

4) I might finally get to meet my collaborator -- schizophrenia proteomics expert Matt MacDonald in person, since he's doing one of the talks.

Related -- a study we both worked on just went live this week at AJP (classic medical style journal, I'm pretty sure it went into review in 2011....his group at Pitt did all the LC-MS work, so it's really good.)

Tuesday, May 7, 2019

PROTEOFORMER 2.0! Don't compete with RiboSeq, Assemble with it!


Gotta move fast on this one, but -- holy Unicron -- this is exactly what I was looking for. I've got proteomics and RNASeq and some PacmanBio stuff of some samples and this very frustrated researcher was explaining to me that he had "RiboSeq" of the same thing and I just kept staring off into the distance because I do that when I'm trying to assemble information I don't understand and it makes people think that someone is sneaking up behind them. And -- check this out!


Don't know what Ribo-Seq is? Me either! Here is a Wikipedia article.  My understanding is that it bridges the gap -- only things that are actively getting to the ribosome for translation are sequenced. It's the closest you get with probes and genetics before getting to the proteome!

Obviously this isn't new -- there appears to be a 1.0 version, for example -- but it's new to me on a totally new concept -- and it's so powerful that this group uses it to identify NEW PROTEOFORMS from shotgun data!

Bonus -- 1980s chemistry courtesy of poorly paid animators.

Monday, May 6, 2019

ANN-Solo returns with a GUI and GPU processing!!


Okay -- there are still some downsides to ANN-Solo in my mind, but they're mostly because I'm old and dumb. Here is an older post I made on this great software.

Downsides -- Python. The world's most approachable powerful coding software. You need to use it. If you've got neuroplasticity left you can probably learn it in an afternoon. Mine's all used up.  Fortunately I have Python experts around my lab! Two all the time and 3 on Thursdays!

Downsides -- Linux. Wait. You can use OS-X? Is that what Macintosh uses? I forget.

What am I rambling about? This new preprint featuring the return of ANN-Solo with even more power (and a GUI) 



Want a reason to get Linux and NumPy things all set up on something in your office?!?!



5.6 milliseconds to search an MS/MS spectrum against a spectral library! To get this fast, there is a catch, though, you have to use a Graphics Processing Unit (GPU).

GPU data processing isn't new. My old Waters Q-TOF used it 7 years ago. You can buy at least one commercial data package that uses it, and I think Darryl Pappin was messing around with them quite a while ago.

GPUs have TONS of cores. My 1080TI in the PC I'm typing this on has 3584 cores (they're called CUDA cores). Compare this to my CPU that has 20 cores or threads or so. However, each CUDA core in a GPU is weak and dumb and only capable of doing small tasks, like controlling a few pixels in a video game or performing the same dumb math problem over and over again in order to try and construct a block to win you a Bitcoin. They have another big downside, as well, in that if you exceed what they can do, they aren't smart enough to stop the code. They "overflow" or something and the little core outputs gibberish.

ANN-Solo breaks high resolution data files into tiny little parts so that each little core only has to do a little -- then you can use the thousands of available cores to tear through files.  Compared to a decent CPU, they drop the time to process IPRG2012 data from 50 minutes to 6 minutes.
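Here is a minimal sketch of why this kind of problem suits a pile of weak little cores. It is a generic brute-force spectral library match (bin each spectrum into a vector, then score with a dot product), written in plain Python on a CPU. It is NOT ANN-Solo's actual method or code, and the spectra, names, and bin width are all made up. The point is only that every query-versus-library score is the same small, independent math problem.

```python
import math


def bin_spectrum(peaks, bin_width=1.0, max_mz=2000.0):
    """Turn (m/z, intensity) peaks into a fixed-length vector scaled to unit length."""
    vector = [0.0] * int(max_mz / bin_width)
    for mz, intensity in peaks:
        index = int(mz / bin_width)
        if 0 <= index < len(vector):
            vector[index] += intensity
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm > 0 else vector


def dot(u, v):
    """Dot product of two equal-length vectors (cosine score for unit vectors)."""
    return sum(a * b for a, b in zip(u, v))


def best_match(query_peaks, library):
    """Score the query against every library spectrum and return the top hit."""
    query = bin_spectrum(query_peaks)
    # Each score below is independent of the others, so in principle
    # every one of them could be handed to a different core.
    scores = {name: dot(query, bin_spectrum(peaks)) for name, peaks in library.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Invented library and query spectra, for illustration only.
library = {
    "library_spectrum_A": [(175.1, 40.0), (276.2, 100.0), (389.3, 60.0)],
    "library_spectrum_B": [(147.1, 80.0), (260.2, 30.0), (617.4, 100.0)],
}
query = [(175.1, 35.0), (276.2, 90.0), (389.3, 70.0), (500.5, 5.0)]

name, score = best_match(query, library)
print(name, round(score, 3))
```

With two library entries this is instant anywhere. Scale the library and the number of query spectra up by a few orders of magnitude and you can see where thousands of cores doing the same multiply-and-add start to pay off.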

In the defense of the CPU, they use a VERY good GPU. The study uses the new 2080 GPU. That's a >$800 card and pretty tough to justify if you're just using it for gaming....

Hey Ben -- this doesn't sound all that exciting....what's the big deal!??!

Have you seen some press releases at your university or facility talking about the "17 zillion (insert fake sounding number here) cores at the new High Performance Computing Thingy" that anyone can have access to? I bet you have.

Have you tried to use it? Were you surprised when someone complained that you used 48 cores for 24 hours and wondered what's up?  They're probably talking about CUDA cores and they probably don't actually have all that many of the cores that your software uses.

I bet the details are in the paper. I dunno, but in general you can link GPU after GPU together. What if ANN-Solo is the way to use all that HPC stuff? How cool would that be? And that's why I'm excited...


..groans...


Sunday, May 5, 2019

HDX vs FPOP -- They're both better than NMR, but which one is the bestest?

Structural mass spec is advancing like crazy. Crosslinking keeps getting better -- with new reagents, better separation and improved MS/MS techniques and -- of course -- better software (1, 2, 3) leading the way. (Here is an awesome recent review on all things proteomics structural.)

Crosslinking gives us a lot of power by letting us know what amino acid residues are close or reeaally close together (is this one commercially available yet?)

There are weaknesses here, though. Things can be close together without being within the space necessary for us to chemically lock them together. Conformations of relatively "low stoichiometry" (hey -- if phosphoproteomics can use this terminology as if it's okay for 15 years, let's keep bending it, it's almost a meme now!) are going to be impossible to see AND we learn nothing about modified residues or the outside of the protein structures.

Two techniques are improving all the time that can give you a lot more of this information -- and they go head-to-head in this study I left myself a note to read in September and just found.



Chances are you know more about these techniques than I do -- but I'm learning 'cause I think they're only going to become more important all the time!  HDX requires some arduous sample prep up front or the purchase of an add-on system for your mass spec that does all the work for you. Deuterium can't get to the inside of your protein as effectively, so anything that gets labeled is on the outside. Boom! I know what the inside and outside of this protein and protein-protein interaction is like and tons of smart software exists that helps interpret the data. Workbench is a good example!



FPOP uses hydrogen peroxide and


...lasers to modify the outside of proteins, protein complexes, and -- holy shit -- have you seen this?!? -- even works inside whole cells!!  Study 1, study 2. The downside is that the modifications on the outside of the protein may be unpredictable. Better data processing, better resolution and accuracy data have helped make this easier, but it's still tough at this point.

Both of these techniques are better than NMR, obviously, because I don't have an NMR and the whole concept seems really old fashioned, and helium is not getting cheaper -- if you are a new lab starting out you may find that it's hard to get Helium at all (we had to go through 3 vendors and lie and say we're doing medical research (!!kidding about the last part!!) -- to even get tanks!)

In the study that I originally started talking about (comparing HDX of the same protein to FPOP) HDX comes out on top. The comparison might not be the most fair, though.

1) HDX analysis was performed on a QTOF system
2) FPOP analysis was performed on a QE Plus system

I know someone who has an HDX QE Plus (Hi!) and I wonder if she'd get the same results with the same material? I think the higher resolution and higher sensitivity of system #2 is kind of seriously essential here, maybe particularly because the QE Plus for the FPOP used 300ms(!!!) of fill time for MS/MS, indicating that even with a pure protein collecting a lot of ions is a critical component, which the TOF utilized in this study can't do (only one TOF kinda sorta can, right?!? I'm only putting question marks here because I've discovered there were some really interesting hybrid TOF technologies that were developed in the past that we all seem to have pretty much forgotten about).

So...it looks like FPOP wins! I like this result because HDX sample prep seems too finicky for me to ever get it right!

Saturday, May 4, 2019

Cloud Connected?




What is this thing that has the same name as one of my all time favorite songs from my all time favorite band?!?!?

It's the "control your instrument from freaking anywhere -- doesn't seem to care about your IT jerk's firewall settings -- can even install on your phone Cloud Connection utility"!

You can get it here.

Remember when this blog was good?

Me either!!

Friday, May 3, 2019

ThermoRawFileParser -- get more data out of your RAW files!


The first step in any data processing workflow is getting the actual numbers out of your vendor's proprietary format. This is something most/all of us just take for granted. I did until this dumb paper messed up some perfectly established complacency on the topic back in December --



And this is where the cleverly named ThermoRawFileParser comes to the rescue!



How you get the data out of your RAW files matters. Proof?!?!?


That's 3000 more peptides! For free? For just improving how you pull the data out of the RAW file and make it into mzXML!?!?

Tuesday, April 23, 2019

Ten simple rules for better figures!



I might just be leaving this paper here for me and my lab as we continue frantically preparing posters, talks and preprints for conference season....worth checking out, though!!