Monday, April 12, 2021

Were you dying to know about the VAMPIRE BAT serum proteome?!?

You know, prior to this recent little thing that happened with a bat virus I might have cast this paper in a different and less serious light. However, it seems like we should probably know everything we can possibly know about anything.

Therefore, I present a no-nonsense blog post about this great new study

The study focuses on vampire bats in Belize and samples come in as part of a longitudinal study. Rabies is involved and so is livestock due to agricultural alterations of bat habitats. Yikes. See. Totally serious stuff. 

All LCMS is performed on an Orbitrap Fusion 2 "Lumos" using DIA. As you might imagine the vampire bat FASTAs may not be as well developed and annotated as some other organisms, so the informatics section...

...better just skip that part. 

Do you want to know if they found high quality MS/MS matches to peptides that match the middle eastern coronavirus? Or should I ...lay this one to rest... already? 

Sunday, April 11, 2021

Predict peptides with PROSIT and MS2PIP for proteogenomics!

Wow. Okay, of course this is one way to do it! Actually, it seems blindingly obvious to me right this second, but I certainly hadn't thought to do it the way they did in this new study!  

There are several smart ways to get around proteogenomic challenges that we're increasingly being encouraged to think about. It's a lot easier if we can just use the UniProt SwissProt little tiny library of protein sequences, but that library ignores:


individual biological variation,

the fact that some of those sequences might have been Craig Venter's and if you were picking a normal human being to pull sequences from, would he really be the best pick? 

What this group does? They pull out their little nanopore gizmos and they do some sequencing. Instead of feeding their Variant Call Format (VCF) file, or whatever the equivalent is in Nanopore world, into a search engine, they feed those sequences to deep learning peptide fragmentation prediction tools.

BOOM! Predicted MS/MS spectra that they can match their experimental spectra against! Smart, right?
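If you want to picture the first half of that idea, here is a minimal sketch in Python. It is not the pipeline from the paper. It just assumes you already have one amino acid substitution called from sequencing, applies it to a protein sequence, does a no-missed-cleavage tryptic digest, and keeps the peptides that only exist in the variant. Those are the peptides you would then hand to a spectrum predictor. The function names and the toy sequence are made up for illustration, and no predictor is called here.

```python
import re


def tryptic_peptides(protein, min_len=6):
    """Cleave after K or R (not before P), no missed cleavages."""
    pieces = re.split(r"(?<=[KR])(?!P)", protein)
    return [p for p in pieces if len(p) >= min_len]


def apply_substitution(protein, position, ref, alt):
    """Apply a single amino acid substitution at a 1-based position."""
    if protein[position - 1] != ref:
        raise ValueError("reference residue does not match the sequence")
    return protein[:position - 1] + alt + protein[position:]


def variant_only_peptides(protein, position, ref, alt, min_len=6):
    """Peptides present in the variant digest but not the reference digest."""
    reference = set(tryptic_peptides(protein, min_len))
    mutated = apply_substitution(protein, position, ref, alt)
    return [p for p in tryptic_peptides(mutated, min_len) if p not in reference]


if __name__ == "__main__":
    toy_protein = "AAAGGGKLLLVVVRTTTSSSK"  # hypothetical sequence
    print(variant_only_peptides(toy_protein, 9, "L", "F"))
```

Note that a substitution to or from K/R changes the cleavage pattern itself, which is part of why you can't just patch a reference peptide list by hand.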

Friday, April 9, 2021

Did they use MaxQuant.Live targeting in reverse to boost Ub and SUMO coverage?!?

 I'm mostly sure I understand what is going on here? Maybe? 

What I think is going on is that I just didn't realize that the super powerful MaxQuant.Live targeting function could be used for ultrasmart exclusion as well. And if that is what is going on 1) I feel kind of dumb for not realizing that was possible -- I could totally use this! and 2) you can use it to dig deeper into your data and find the really annoying PTMs like SUMOylation! 

Okay...but I have the interface (Beta 2.0! With Exploris support, which clearly couldn't be used to get into any other instruments with the same handy universal interface --

And I'm not sure that I could reproduce this on our instruments without...sigh...reading more... 

This post appears to have a theme now. I'm not sure why.

If you are also stymied by a number of options that seem too challenging for you to get away with just trying them all to see if they work, I bet the secrets are in one of these places.

The MaxQuant.Live documentation pages? Oh. Maybe I found it here. Not 100%, but it narrowed it down. 

The google group page?  It's here somewhere! 


Thursday, April 8, 2021

It's finally time for the Journal of Proteome Research special edition for tools!

 I've been looking forward to this JPR special edition since I finished the last one! 

You can check it out here! 

Just flipping through:

Updates for Percolator, a new glycoproteomics engine, Python tools for TIMS data (holy cow, does the TIMSTOF need more functional software), an open library for metabolomics, and a bunch more R packages.

Get the right tool for the job! 

Wednesday, April 7, 2021

IonQuant paper is out!


There is a decent chance given the amazing amount of traffic on the MSFragger Google Groups channel that you've already tried IonQuant and may be actively interacting with the developers to make it better and faster and more powerful.

If not, you should check out this ultrafast label free quantification application which actually works out a metric for false discovery rates for match between runs. We're literally using it every day! It's how we process our daily quality assurance runs on TIMMYDuncan. 

Tuesday, April 6, 2021

TMT labeling in a bubble on a chip!

Leaving this one here so I can spend more time on it later, got a busy day ahead and I'm finding labeling extremely small amounts of peptides with TMT to be a decent sized challenge. 

This team forces a little bubble with some solvents, then force labels inside it or something, and gets efficient quantitative labeling of a couple dozen cells worth of material?!?


Monday, April 5, 2021

Integrative proteomic and metabolomic analysis of mouse cardiac deletion

I'll be honest, the only reason I'm interested in this new study at MCP is that I'm constantly looking for a better way to integrate proteomic and metabolomic data than the crappy tool that I made myself. 

Don't get me wrong, there is nothing wrong with this study. It looks like very good metabolomics -- both positive and negative runs on a Q Exactive, data processing in Compound Discoverererer 2.1. The proteomics was processed in MaxQuant and all the data is publicly available. 

Integration of the data was with a commercial tool that I've never heard of before and haven't tried, but I'm going to investigate the abbreviations on their front page to see what they are. The program is called SIMBA, and they used OPALS, MDMA, and OTTERPlots for VeryMultiplication Empires... or...maybe I should screenshot it.... 

Clearly not an endorsement, but the fact that 2 people standing very close together appear to be looking at extremely crappy mass spec data on a Macintosh probably means they're doing something interesting over there.

Sunday, April 4, 2021

Super heavy TMT tags! Setting the stage for SuperTomaHaQ?

 Want a fun puzzle to wrap your head around? 

Test your noggin against the concept of Super Heavy TMT

TMTPro is already pretty darned heavy. You've moved from 229Da to 304Da for your adduct, but due to the really impressive bond chemistry design or something, if I'm losing IDs, I can't really tell. 
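To put a number on "pretty darned heavy", here is a quick back-of-the-envelope sketch. It assumes the usual monoisotopic additions of 229.162932 Da for the classic TMT tags and 304.207146 Da for TMTpro, with one tag on the peptide N-terminus and one on every lysine. The function names and the example peptide are just for illustration.

```python
TMT_MASS = 229.162932      # monoisotopic addition, classic TMT tags
TMTPRO_MASS = 304.207146   # monoisotopic addition, TMTpro tags


def label_count(peptide):
    """One tag on the N-terminus plus one per lysine (assumes full labeling)."""
    return 1 + peptide.count("K")


def mz_shift(peptide, tag_mass, charge):
    """How far the tags move the precursor m/z at a given charge state."""
    return label_count(peptide) * tag_mass / charge


if __name__ == "__main__":
    peptide = "PEPTIDEK"  # hypothetical peptide with one lysine
    for z in (2, 3):
        extra = mz_shift(peptide, TMTPRO_MASS, z) - mz_shift(peptide, TMT_MASS, z)
        print(f"z={z}: TMTpro sits {extra:.4f} m/z above classic TMT")
```

The more lysines a peptide has, the more of its precursor mass is tag rather than peptide, which is the part worth keeping an eye on if your IDs ever do start dropping.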

After a quick screen of this paper my main question was "what on earth could I use this for?" And I have a hunch this JPR paper is setting the stage for something, like a clever application that I'll be impressed by later, something like a SuperHeavyTOMAHAQ solution to a biological question I didn't know existed? If nothing else, new reagents are cool and give us more stuff to be creative about. 

"Super Heavy" comes up with some really funny gifs, as well...

...thanks google images! 

Saturday, April 3, 2021

Albumin adductomics provides readback of pollution exposure!

Uh oh y'all, it looks like albumin actually does something! You might want to take a look at this before you make that decision whether to chemically deplete it or not prior to your next study.

This group (wait, I know some of these people! The full length first names of 3 of them threw me off) specifically looked at the albumin in both an untargeted and targeted (PRM) manner in samples from people in an area where lung cancer is rising at a high rate.

As an aside, have you ever tried to get an intact MS signal for albumin? I've never once got a good one. That protein just does not want to fly for me and someone said that it was the 7 stupid free Cysteines. Well, it turns out that these cysteines do more than make you question your ability to run an instrument. They covalently react with molecules and due to the relatively short half-life of the protein they can bind stuff you don't want and help you get it processed and out. 

By specifically focusing on these regions and what binds to the cysteines, you can end up with a sensitive biomonitoring system for what a patient has been exposed to and whether they cleared it.

All the work here was performed using nanoLC on an Orbitrap Fusion 2 "Lumos" system. What would take this to the next level would be showing that you could pick up these same molecules using a system a little less expensive, heavy, and overly complicated for clinical applications and all the transitions are here to do it! 

Friday, April 2, 2021

MaxQuant.Live update -- Exploris support!

I don't have an Exploris to test this on, but MaxQuant.Live now supports these instruments (in the beta that is good through July).

You've got an Exploris and you love your easy to use node driven next generation workflow method design? Why would you possibly want to operate your instrument through a different piece of software? 

Can your Exploris target 20,000+ peptides per run with on the fly retention time adjustment/alignment? My Q Exactive can through MaxQuant.Live.

Can your Exploris run BoxCar? (If you've got the 480, the answer should be yes. I think that is not the same answer for the 240). 

Can you run your Exploris at 92,412 resolution? Or are you stuck with the 5 settings that the instrument vendor let you have so you wouldn't hurt your little brain with choosing a great big number? Not any more! Punch any dumb number into the box that you want to.

Again, I can't verify that it works, but the good people at Max Planck generally make some decent software that does what it says it will do. 

Go to MaxQuant.Live and check it out

Thursday, April 1, 2021

Study protein turnover with pulsed SILAC DIA!


It's easy to forget about the "gold standard for proteomics quantification" these days since label free quan has gotten so powerful thanks to new algorithms and DIA, but it is still useful.

For a smart application of SILAC to study protein half-lives (turnover) with a smart experimental design -- with DIA -- check out this study!

What proteins are made or degraded when you treat cell line A with cancer drug B? Pulse in the heavy labeled media and you'll see what proteins are incorporating a lot of your heavy labeled amino acids and which ones aren't.
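The arithmetic behind a pulse experiment like this can be sketched in a few lines. This is a simplified textbook model, not the analysis from the study: it assumes a clean switch to heavy media at time zero and simple first-order replacement of the light protein pool, so the heavy fraction follows 1 - exp(-k*t). In dividing cells the rate you get out also includes dilution by growth. The function names are made up for illustration.

```python
import math


def heavy_fraction(heavy, light):
    """Fraction of the protein signal carrying the heavy label."""
    return heavy / (heavy + light)


def half_life(heavy, light, hours):
    """Apparent half-life from one time point, assuming first-order turnover."""
    f = heavy_fraction(heavy, light)
    if not 0.0 < f < 1.0:
        raise ValueError("need both heavy and light signal to estimate a rate")
    k = -math.log(1.0 - f) / hours   # apparent turnover rate per hour
    return math.log(2.0) / k


if __name__ == "__main__":
    # Hypothetical intensities measured 8 hours after the switch to heavy media
    print(half_life(heavy=3.0e6, light=1.0e6, hours=8.0))
```

A single time point gives a rough estimate at best. Fitting several time points per protein is the safer route if you have them.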

The study is easy to follow and should be far less expensive than your typical -grow your cells in $4,000/liter media for 10 passages SILAC type experiment.

You'd think that having heavy labeled peptides in a DIA window would impact data quality by increasing the overall complexity. From these results it does look like that's probably the case, but not by enough to make it a bad idea for solving a fun biological question. And -- check this out -- the authors get almost an extra order of magnitude from the DIA in terms of comparing incorporated vs not over DDA. I think just about everyone would give up a few peptides to dig that much deeper. Biological dynamic range is not our friend. 

A Q Exactive HF-X was used and DIA windows were at 30,000 resolution. I really like Figure 3 and I've never seen anything like it. I assume it's some sort of wizaRdry. 

Wednesday, March 31, 2021

Scanning SWATH with ultrashort gradients-- 2,000 proteins in 1 minute?

 I'm gonna drop this here because everyone else is talking about it. 

Scanning SWATH goes way back to almost the beginnings of SWATH as an idea. I think it is very similar to SONAR from Waters in that the quad is not a stationary bin like we use for quadrupole Orbitrap based DIA. 

The trick here is fully on an informatics level, executed through the impressively easy to use DIA-NN software. 

There was an informal ABRF wrap up meeting with a lot of smart people from around the world that I somehow got invited to and one thing we talked about at length was the new generation of LCMS software that doesn't give you easy access to MS/MS fragments. DIA-NN falls in that group. This isn't going to make everyone comfortable, but it is something to be aware of. There is a whole generation of proteomics people using new software and getting stupendous numbers of identifications that are verified by software but are not (or at least, not easily) verified by checking to see that MS/MS fragments actually exist.

If the software is right? Who cares! If the software is wrong, for many of these tools I don't know how you ever find out. I've been trying to look at the RAW data from this study but MSConvert won't recognize the .wiff.scan files and I'll probably just assume the reviewers did their due diligence on this fantastic sounding study. 
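For anyone who hasn't done the "do the fragments actually exist" check by hand in a while, this is roughly what it amounts to. A minimal sketch, assuming an unmodified peptide, singly charged b and y ions only, and a peak list you have already pulled out of the file somehow. It has nothing to do with how DIA-NN scores anything. The function names are hypothetical.

```python
PROTON = 1.007276
WATER = 18.010565

# Monoisotopic residue masses for the 20 standard amino acids
RESIDUE = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def fragment_mzs(peptide):
    """Singly charged b and y ion m/z values for an unmodified peptide."""
    masses = [RESIDUE[aa] for aa in peptide]
    ions = {}
    for i in range(1, len(peptide)):
        ions[f"b{i}"] = sum(masses[:i]) + PROTON
        ions[f"y{i}"] = sum(masses[-i:]) + WATER + PROTON
    return ions


def matched_fragments(peptide, peak_mzs, tol_ppm=20.0):
    """Names of theoretical fragments with an observed peak within tolerance."""
    matched = []
    for name, mz in fragment_mzs(peptide).items():
        if any(abs(obs - mz) / mz * 1e6 <= tol_ppm for obs in peak_mzs):
            matched.append(name)
    return matched


if __name__ == "__main__":
    observed = [58.0287, 218.1499, 500.0]  # hypothetical peak list
    print(matched_fragments("GAK", observed))
```

With DIA data the hard part is not this arithmetic. It is deciding which peaks in a wide, chimeric window belong to your peptide in the first place, which is exactly the step the software is doing for you out of sight.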

Tuesday, March 30, 2021

More improvements for MSAmanda -- for all operating systems!

MSAmanda has continued to improve, and if you haven't taken a look at this free search tool in a while, now might be the time to take another look. Check out a short summary here!

For people using Macintoshes there aren't exactly a ton of options for proteomics. You can say the same for Linux probably unless you are adept at pulling tools from Github and figuring out what steps the developer thought you were smart enough to know about without them telling you. 

Other highlights include more compatibility with file types recommended by the Proteomics Standards Initiative which must be composed of the most tolerant people who have ever lived on earth. Most normal people after 1 year of trying to get proteomics to standardize anything: 

They've been coming up with ideas for almost 20 years! 

Monday, March 29, 2021

Proteomics needs more spectral library formats!

 At both the conveniently overlapping USHUPO and ABRF a couple weeks ago a big story was the emerging new alternative proteomics techniques. Illumina is getting into the game and is chasing the SOMASCAM technology to see who can be the fastest to accurately quantify 1,500 proteins in human samples. 

Four new companies launched last year alone raising huge amounts of money with basically the same pitch "we want to be the Illumina of proteomics", which probably led to Illumina wondering why it couldn't be the "Illumina of proteomics" too. Google the term in quotation marks. You'll find them and their huge successful investment raises.

The outside world is excited and ready to invest in proteomics and it's becoming readily apparent that LCMS is not part of the conversation. Could someone raise anywhere near that kind of investment capital on a pitch based on LCMS? No way. If you take a step outside our little community spin around in place 3 times and look back in you can probably see why the scientific community is fatigued with us and all the dumb shit we spend our time on. 

For example: I think that a substantial percentage of people in the field right now are spending their time trying to come up with completely new and completely incompatible spectral library formats. Is that what you're planning to do today?

Why do we need a tool like EncyclopeDIA to have converters for 7 different spectral library formats? Why is that not even enough?

I downloaded the files from a new study from a single word journal this morning to see the results for myself and I'm absolutely fucking thrilled to see a new spectral library format this morning. Even Pinnacle, which can natively load practically any format of MS data and has options for accepting a scrollable list of input formats (did you know that some ultralarge biopharma companies have their own internal spectral library formats? they do, because we've all absorbed too much acetonitrile through our skin and it has done something awful to our brains. I know because Pinnacle has the option to accept those as well on its pulldown list of options for input) just closes when I try to feed it this amazing new spectral library format. I'm sure that OptyTech could fix that for me today, but why should they have to?

When you apply for your next grant and you're beaten by the genomics core across town because they can now use their NovaSeq to quantify 1,500 proteins, don't be bitter. 

Go back to your lab and get back to working on a completely new and unnecessary way to extract proteins from your cell types that we already have 15 ways to effectively work with. Alternate between 100mM TEAB for resuspending your trypsin today and swap over to 50mM AmBiC on Friday, heck, mess with the ratios of protein to trypsin while you're at it. Tinker with that gradient to get that one extra albumin peptide you've always wanted to see in your global runs, you know you want to. Hell, put a grad student on making an entirely new spectral library format. We probably don't have enough anyway. 

Just keep in mind that every step along the way we have done basically everything we could think of to make proteomics inaccessible to the greater scientific community and make it as challenging to reproduce our results as we possibly could. If Illumina pulls off 3,000 protein identifications in their next generation of technology as they have promised, we'll be lucky if LCMS proteomics exists anywhere outside of Cambridge and Munich, because by and large the scientific community is tired of our circular tail-biting craziness. And they should be. 

Sunday, March 28, 2021

Mesh -- select and fragment multiple charge states of intact proteins -- in real time!

And the winner for understated study so far for 2021 goes to.....

Man, this is why I'm still here doing this stuff, I guess. From the title and abstract alone, I bet you probably didn't want to read this. Why would you? 

Well, you can get a program developed for this study from Github here that will allow you to operate your Q Exactive system with some new super powers

MetaDrive? What's that do? Well, it's nothing to yell about really, it allows you to look at intact proteins on the fly and deconvolute them and then it selects multiple precursors for your intact protein (like targeted multiplex quantification) and will hit them with stepped CE allowing you to get massively more signal on your intact proteins. That's all. 

When you do intact proteins on an Orbitrap and select an ion for fragmentation the best case scenario is that it looks like this and you selected that one charge state for your MS/MS fragmentation. 

You're only getting a fraction of the total protein signal for fragmentation. It's a large part of the reason that MS/MS scans of proteins look so lousy. 

What if when this signal came off your QE, there was a program sitting there that was smart enough to deconvolute that protein and recognize multiple intense charge states? Keep in mind that the example above is a pure myoglobin standard. Top down doesn't look like that normally because you'll have multiple protein envelopes overlapping. 

Just imagine if you were able to select even these three below? You've effectively tripled your protein signal for fragmentation! 
Even better, since the Orbitrap is by far the slowest part of this process, there may be effectively no increase in the amount of time each cycle takes (like BoxCar, you're accumulating multiple isolated ion windows, holding them and then scanning, but here you can fragment them!) 
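The charge state picture is easy to play with yourself. Here is a minimal sketch, and to be clear it is not how MetaDrive works internally. It assumes an average mass of roughly 16951.5 Da for the myoglobin standard (an approximate value I'm plugging in for illustration), computes where each charge state should land, and shows the classic trick of backing out the charge and mass from two neighboring peaks in the envelope.

```python
PROTON = 1.007276


def charge_state_mz(mass, charge):
    """m/z of a protein of neutral mass `mass` carrying `charge` protons."""
    return (mass + charge * PROTON) / charge


def deconvolute_pair(mz_lower_charge, mz_higher_charge):
    """Charge and neutral mass from two adjacent charge states.

    mz_lower_charge is the peak at charge z (higher m/z);
    mz_higher_charge is its neighbor at charge z + 1 (lower m/z).
    """
    z = round((mz_higher_charge - PROTON) / (mz_lower_charge - mz_higher_charge))
    mass = z * (mz_lower_charge - PROTON)
    return z, mass


if __name__ == "__main__":
    myoglobin = 16951.5  # approximate average mass, assumed for illustration
    for z in range(14, 19):
        print(z, round(charge_state_mz(myoglobin, z), 3))
    print(deconvolute_pair(charge_state_mz(myoglobin, 15),
                           charge_state_mz(myoglobin, 16)))
```

Once you know which peaks belong to the same protein, picking the few most intense ones to isolate together is a sorting problem. Doing that fast enough, on overlapping envelopes, in real time is the hard part.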

You know, not a big deal. The authors might have understated how freaking cool this is because they do note that the real time calculations are a little slow. They're working on implementing the screaming fast FLASHDeconv program into this to help alleviate the bottleneck.

The files are up on PRIDE, but still locked. I already put in a request to open them up if you want them!