Wednesday, April 1, 2020

ASMS 2020 Houston Cancelled -- Online ASMS 2020 details coming soon!

In something that comes as a surprise to no one, we won't be gathering in Houston for my favorite holiday.

An online format is coming. Keep up on details at

If you're crazy enough that you booked your Airbnb a year in advance, you may be able to get forgiveness for all or a percentage of your stay. May 31, 2020 is the last day currently, but details will be posted here.

Man...a lot can change in the world in 3 weeks. The Skyline team was being proactive about the volunteer speaker pool and I was convinced we'd beat this stupid virus (come on, it's only 10 proteins!) by then.

Great virus proteomics references!

This isn't anywhere near a complete list, but if you're finding yourself motivated to jump into virus proteomics all of a sudden for some weird reason or other, here are some awesome resources.

This review is where everyone (IMHO) should start.

If you're looking for something beyond what the 137 references in that paper can provide, you might need to do some jumping jacks.

However, if you want to focus specifically on LCMS methods for virus stuff, a really good next stop would be going to Google Scholar and typing in Virus and Arthur Moseley. That team at Duke has been getting DARPA funding to work on virus detection for years (that's who is acknowledged in at least some of the papers).

In this one they identify host targets using 2D-wait...gels...?...hey, it works and then they build targeted assays for the host biomarkers. 

And this one is my current favorite. In this one they build targeted assays for viral proteins!

The peptides are hunted down with an LC-QTOF, and then LC-MRM assays are developed (all Waters instruments). Stable isotope peptides and 30 minute NanoLC gradients are used for the MRMs.

On the TQ-S they get theoretical LODs in the sub-femtomole range!

AND they show that they can use targets for the variable regions of the nucleocapsids to work out the lineage of the virus.
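
For anyone new to targeted assays, here is a minimal sketch of the usual isotope-dilution arithmetic behind stable isotope peptides in an MRM. The paper's exact quantification procedure isn't described in this post, so treat this as the generic idea only: the function name and all numbers are made up for illustration, and it assumes the light and heavy peptides respond identically in the instrument.

```python
def endogenous_amount_fmol(light_area: float, heavy_area: float, heavy_spiked_fmol: float) -> float:
    """Single-point estimate: light/heavy peak-area ratio times the spiked heavy amount.

    Assumes equal response for the light (endogenous) and heavy (labeled) peptide.
    """
    if heavy_area <= 0:
        raise ValueError("heavy (labeled) peak area must be positive")
    return (light_area / heavy_area) * heavy_spiked_fmol


# Made-up peak areas and spike amount, for illustration only
estimate = endogenous_amount_fmol(light_area=2.5e4, heavy_area=1.0e5, heavy_spiked_fmol=10.0)
print(f"{estimate:.2f} fmol on column")
```

A real assay would normally use a calibration curve rather than a single point, but the ratio is the heart of it.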

Tuesday, March 31, 2020

UniProt got organized on COVID-19!

UniProt just launched this great new page, linking all their COVID-19 resources. I expect a lot more with UniProt2020_02 in just a few weeks.

Monday, March 30, 2020

DeepLC -- Predict the retention times of MODIFIED peptides with a handy GUI!

I think my expectations are typical of what we all expect from the good people of the world making free software and proteomics tools.
I just want: 

1) Completely new ideas that are way better than the old ones. 
2) Ultra powerful algorithms that use resources I couldn't possibly get or use elsewhere.
3) It all bundled in a way that will only take me like 45 seconds to install.
4) It to be intuitive enough that I don't have to read anything to use all this power.

All the perfectly reasonable expectations we all have for our bioinformagicians out there. 

We know that chemically modifying a peptide with a PTM shifts its retention time. How? That depends on the modification. Phosphopeptides generally come out earlier (you probably lost a significant number of them if you used a PepMap trap column), but what about the other ones? That's one of the problems you need DeepLC for. Loads of applications here, but I've gotta move fast today.

You can get the program from this GitHub!

Super easy installation and it only has 8 settings! Tons of new power for me in exchange for exactly zero effort on my part?
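
If you'd rather script your peptide list than type it, here is a small sketch that writes a table of modified peptides. It assumes the three-column layout (seq, modifications, tr) with "position|name" pairs that I remember from the DeepLC README; please check the current README before trusting it. The helper names and the example peptides are made up, and nothing here calls DeepLC itself.

```python
import csv
import io


def format_modifications(mods):
    """Turn [(position, name), ...] into 'pos|name|pos|name' (assumed DeepLC-style string)."""
    return "|".join(f"{pos}|{name}" for pos, name in sorted(mods))


def build_input_csv(peptides):
    """peptides: list of (sequence, mods, retention_time_or_None) -> CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["seq", "modifications", "tr"])
    for seq, mods, rt in peptides:
        # Leave tr empty when there is no observed retention time for the peptide
        writer.writerow([seq, format_modifications(mods), "" if rt is None else rt])
    return buffer.getvalue()


# Made-up example peptides; positions are 1-based residue numbers
example = [
    ("AAGPSLSHTSGGTQSK", [], 12.16),
    ("AANDAGYFNDEMAPIEVK", [(12, "Oxidation")], None),
    ("SGSPEEK", [(3, "Phospho")], None),
]
csv_text = build_input_csv(example)
print(csv_text)
```

The modification names would need to match whatever names the tool recognizes, which is another thing to confirm in its documentation.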

Proteomics + Mice on Turntables!

I'm taking 15 minutes to stop looking at a FAIMS mystery to just, honestly, not look at it for a second and -- BOOM -- someone put mice on turntables and did proteomics of their brains?

I'm also going to grab breakfast/lunch/what day is it? so I can't read this, but I'll put the link here! so I can come back later.

...and a figure from the abstract to prove I'm not joking!

...apparently it is about learning about how mice learn. I'm going to still do the joke I planned.

#JBC MethodsMadness final round Team mass spec needs your votes!

If you weren't aware this is a thing, it's down to the finals and team mass spec needs your votes.

You can vote here! and PCR were beaten in the earlier rounds...because...hmmm....

Okay, but mass spec is DEFINITELY cooler than Cryo-EM. If you don't use a UHMR or EMR to speed up the workflows, even with an inexpensive Cryo-EM ($2M) to do all of the QC work and optimization stuff for your normal Cryo-EM (maybe $8M) you can only solve 4 or 6 protein structures per year with one.

Sunday, March 29, 2020

Prosit spectral libraries for COVID-19 (SARS-CoV-2) vs experimental!

I'm honestly just floored by how ridiculously accurate these Prosit spectra are... Everyone should be using this tool. (Click should expand it.)

First off -- This tool doesn't require you to be a master bioinformagician or anything to use it.
Here is my simple walkthrough on how I generate Prosit spectral libraries.

Second --
At the top is a mirror plot generated in Proteome Discoverer 2.4. The top is the experimental peptide from a COVID-19 / SARS-CoV-2 preprint that came out on Monday 3/23/20. The bottom is the prediction made by Prosit on 1/27/20.

Worth noting: In earlier versions of Proteome Discoverer, MSPepSearch may not accept the Prosit spectral libraries (whateveryoucalledit.MSP). I pretty much jumped from 2.2 to 2.4 since I was doing mostly small molecule stuff for a year. If you are on an earlier version like PD 2.2, there is a solution -- MSAna (which is compatible with PD 2.1-2.4 -- all versions -- including the free versions).

You can get MSAna and the installation instructions from -- now you can use spectral libraries. MSAna also has several options for decoy library generation. It is worth checking out on its own.
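
If you're curious what is actually inside one of those .MSP files, they are plain text, so you can peek at them yourself. Below is a minimal sketch of a reader that assumes the common NIST-style layout (a "Name:" line, some "Key: value" lines, then one "m/z intensity annotation" line per peak). The parse_msp function and the toy entry are mine, all numbers in the toy entry are made up, and real Prosit output may carry fields this doesn't handle.

```python
def parse_msp(text):
    """Parse MSP-style text into a list of dicts with 'name', 'fields' and 'peaks'."""
    entries = []
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line.lower().startswith("name:"):
            # A Name: line starts a new library entry
            current = {"name": line.split(":", 1)[1].strip(), "fields": {}, "peaks": []}
            entries.append(current)
        elif current is None:
            continue  # ignore anything before the first Name: line
        elif line[0].isdigit():
            # Peak line: m/z, intensity, optional annotation
            parts = line.split()
            current["peaks"].append((float(parts[0]), float(parts[1])))
        elif ":" in line:
            key, value = line.split(":", 1)
            current["fields"][key.strip()] = value.strip()
    return entries


# Toy entry with made-up values, for illustration only
TOY_MSP = """Name: PEPTIDEK/2
MW: 927.46
Comment: made-up toy entry
Num peaks: 2
147.11\t0.50\t"y1"
276.16\t1.00\t"y2"
"""

library = parse_msp(TOY_MSP)
print(library[0]["name"], len(library[0]["peaks"]), "peaks")
```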

Back to the library vs theoretical, though!

This is the Prosit predicted fragmentation pattern for this peptide. This is a screenshot from the ridiculously handy and free tool PDV (Proteomics Data Viewer) that I use basically every day now for one reason or another.

Yo, where did y14 and y15 go? And it is kind of a neat characteristic that it only predicts that you'll see a central series of b-ions...

An experimental PSM says...?

....pretty darned close!

Okay -- that's not bad, right? However, you should go back to the top and look at the relative intensities of these fragments.....because Prosit predicts those too! Actually, here is a zoomed in clip!

A deep learning tool predicted the bottom....and the top is the real spectrum from a peptide that had never been experimentally observed in unlabeled form until last Monday!! Crazy, right?
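
If you want a number to go with the eyeball test on a mirror plot, one common approach is a cosine similarity (or the related normalized spectral angle) between the predicted and observed fragment intensities. This is a minimal sketch under my own assumptions: the function names are mine, the fragments are matched by label rather than by m/z tolerance, and the intensities are made up rather than taken from the spectra discussed here.

```python
import math


def cosine_similarity(predicted, observed):
    """Both arguments map a fragment label (e.g. 'y7') to an intensity."""
    labels = sorted(set(predicted) | set(observed))
    p = [predicted.get(label, 0.0) for label in labels]
    o = [observed.get(label, 0.0) for label in labels]
    dot = sum(a * b for a, b in zip(p, o))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in o))
    return dot / norm if norm else 0.0


def normalized_spectral_angle(predicted, observed):
    """1.0 means an identical intensity pattern, 0.0 means no shared signal."""
    cos = max(-1.0, min(1.0, cosine_similarity(predicted, observed)))
    return 1.0 - 2.0 * math.acos(cos) / math.pi


# Made-up relative intensities, for illustration only
predicted = {"y4": 0.30, "y5": 1.00, "y6": 0.55, "b3": 0.20}
observed = {"y4": 0.25, "y5": 1.00, "y6": 0.60, "b3": 0.15, "y7": 0.05}
print(round(cosine_similarity(predicted, observed), 3))
print(round(normalized_spectral_angle(predicted, observed), 3))
```

Because both scores only look at the relative pattern, scaling either spectrum up or down doesn't change them, which is what you want when comparing a prediction to a real scan.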

I have been pretty hard on a lot of the "artificial learning, machine intelligence, deep intelligence" stuff and I still think this is legit funny --

-- and I'm still going to be skeptical of anyone saying those terms (as we should be, of course) but I'm flipping through these spectra from this tool and this is one deep learning thingamabob that looks like it's doing exactly what it's supposed to.

EDIT -- If you use spectral libraries in Proteome Discoverer -- go to the PSM level and double click anywhere on the PSM to open the normal menu you are used to. Now you'll find that this button is not greyed out. Click that and it will open your experimental vs library spectrum(a)(es)

Saturday, March 28, 2020

DEqMS -- Statistical significance of proteomics data -- adjusting for variable PSM numbers matters!

One of the cool things about genomics being a decade or so ahead of us in many regards is that we can learn from the mistakes they made (...well...in theory...) hmmm.... I'm...hmm...okay...well... start over

can steal a lot of their cool ideas and programs! 

If you're trying to figure out what peptides or proteins are significantly different between your conditions, you're probably using a tool that was designed for RNA microarrays, like:

edgeR or

LIMMA works great. We have loads of proof, but there is a huge difference between RNA microarrays and shotgun proteomics.
...besides the fact that RNA doesn't correlate with protein levels....
With an array, you always get the same number of measurements for each target! The old Affy arrays I used would have something like 46,000 RNA things stuck to them. Each sample would hybridize (or whatever) to those 46,000 so, in theory, you're always getting back 46,000 measurements per sample.

Shotgun proteomics isn't like that at all! Some proteins will only get 1 or 2 PSMs, even when you go all out. Even in high abundance proteins, you'll have stochastic effects. You almost never see even a set of technical replicates where you got exactly 84 PSMs for the protein in each one.

What if you adjusted for that in some way? Like you purposely adjust your model so that it expects a situation where the PSM levels are realistically variable from run to run? 
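
To see why the PSM count should matter at all, here is a toy simulation. To be clear, this is not the DEqMS math (which I can't follow anyway) and it doesn't use their package. It just assumes each PSM gives a noisy log-ratio around a true value of zero, rolls the PSMs up to a protein with a median, and looks at how much the protein-level number bounces around. The noise level, run count, and function name are all made up.

```python
import random
import statistics


def simulate_protein_estimates(n_psms, n_runs=2000, psm_sd=0.5, seed=1):
    """Protein log-ratio estimates (median of PSM log-ratios) over many simulated runs."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_runs):
        # Every PSM is an independent noisy measurement of a true log-ratio of 0
        psm_log_ratios = [rng.gauss(0.0, psm_sd) for _ in range(n_psms)]
        estimates.append(statistics.median(psm_log_ratios))
    return estimates


for n_psms in (1, 2, 5, 20, 84):
    variance = statistics.variance(simulate_protein_estimates(n_psms))
    print(f"{n_psms:>3} PSMs -> variance of protein log-ratio: {variance:.4f}")
```

The protein with 1 PSM is far noisier than the one with 84, so a statistical test that treats them as equally trustworthy is leaving information on the table.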

I've rambled enough and I can't pretend I can follow the math, but this group validates the crap out of this approach using a ton of different types of publicly deposited proteomics data and, across the board, it looks fantastic in every one of them. Here are the conclusions that I am the most excited about: 

Friday, March 27, 2020

April 3rd webinar! Christian Munch SARS-CoV-2 Proteomics Talk!

If you aren't curious about how this group in Frankfurt somehow got this much high quality proteomics out the door on SARS-CoV-2 so impossibly fast, you're weird. (If you missed it, here is my recap of it with links to their data. This is the TMT calibrator study and the first proteomics study out on this stupid virus)

The great people of the London Proteomics Discussion Group are moving to a full webinar format and Christian Munch (my keyboard won't make the correct symbols, there is supposed to be something over the vowel in his last name) will talk about this amazing endeavor.

You can register for it here! (I also attempted to make the register button work, it probably didn't)

Huge shoutout to Dr. Harvey Johnston for tipping me off to this.

PicoFlow Proteomics -- Did you think NanoFlow was too easy?

I'm just going to leave this here.

As someone who believes that NanoLC is one of the 3 main reasons why proteomics isn't taken seriously, wow, am I ever excited to think about throwing a 2 micron internal diameter column into my workflows.

75 picograms of peptides. 1,000 proteins ID'ed.

I presume a 1 uL bubble in this system takes 11 years to get out of your system, but it is interesting nonetheless.

Thursday, March 26, 2020

Another day -- another great COVID-19 proteomics study!

Who knew proteomics was so fast?!?! COVID-19 study number 3?!?

I was thrown off at first because I didn't recognize the species in their FASTA files. The SARS-CoV-2 virus was used to infect monkey cells.

Nanopore sequencing was used here as well as "high-low" shotgun proteomics on a Fusion Lumos system.
(High resolution MS1 and ultra fast, low resolution MS/MS in the ion trap. I'm pretty sure they used HCD fragmentation, but I don't have time to go back through it today.)

What I did think was interesting enough to screenshot was the number of phosphorylation sites that they detected. I know I rambled about the ModPred program on here somewhere recently (Edit -- found it!) and, check this out!

I can't use pictures from the preprint due to the copyright terms, but the phosphorylation sites that they find line up surprisingly well with the ModPred predictions!

There is a neat point where they find multiple phosphorylation sites in one area where ModPred is convinced there is only one likely location....and given that it is ion trap data, it wouldn't be strange to think that the localization was difficult. It would be interesting to take a deeper look into which one was right in that scenario.

The RAW files have been publicly uploaded, but I can't seem to pull them down. They are through a service I haven't heard of before.

Wednesday, March 25, 2020

An alternative protein assembly approach!

I'm not qualified to give an opinion on this new study. Fortunately, that hasn't stopped me in at least a decade....

This is how I explain it when I lecture, though.
Proteomics is really good at peptide-spectral-matching.
As a field, it's fair to say that we're better at drinking beer than we are at assembling those peptides back into proteins. (Come on, that's at least part of the reason we're so bummed out about all the conferences we're missing right now. Nothing facilitates a discussion about ion fragmentation better than being in a bar with 100 of the world's best experts on the topic, unless it's going to the next bar and finding another 90 or so.)

Hopefully that protein is 100% unique. No amino acids line up in order with any other protein in all of the universe. If that is the case, we're set! If it isn' code that is designed to be easily integrated into our existing pipelines? That's a total win, even if I don't understand it at all.
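
The "is this peptide unique?" question is easy to sketch in code, even if the real problem is much harder. This is a minimal illustration under my own assumptions, not the method from the study: it does plain substring matching against made-up protein sequences, ignores isoleucine/leucine ambiguity, and the function name is hypothetical.

```python
def classify_peptides(peptides, proteins):
    """Map each peptide to the sorted list of proteins whose sequence contains it.

    proteins: dict of protein name -> amino acid sequence.
    """
    return {
        peptide: sorted(name for name, sequence in proteins.items() if peptide in sequence)
        for peptide in peptides
    }


# Made-up toy sequences, for illustration only
toy_proteins = {
    "PROT_A": "MKTAYIAKQRQISFVK",
    "PROT_B": "MSSAYIAKQRLLEDK",
}
matches = classify_peptides(["AYIAK", "QISFVK", "LLEDK"], toy_proteins)
for peptide, owners in matches.items():
    label = "unique" if len(owners) == 1 else "shared"
    print(f"{peptide}: {label} -> {owners}")
```

The unique peptides are the easy case. The shared one is where protein assembly turns into an actual argument.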

Solid GenomeWeb article on the proteomics studies on COVID-19 so far!

Want a good overview of the two COVID-19 proteomics studies? This article does a great job of covering the giant Munch lab TMT study and the impossibly large protein-protein interaction dataset that was posted on bioRxiv yesterday.

I think you have to be signed up to read it, but I'm pretty sure it's free to read.

Moral of the story? This isn't the proteomics of 2004 when SARS was tearing things up and we couldn't help. We've grown up a lot and have things to offer.

Tuesday, March 24, 2020

COVID-19 (SARS-CoV-2) Protein Interaction and Drug Repurposing study!

It is truly freaking amazing how fast some of these studies for COVID-19 / SARS-CoV-2 / 2019-nCoV are coming together.

As I'm waiting patiently for even the RAW files to download, you have to think -- how the heck did they get this study done this fast!?!??

Okay -- so 75 people working on it? That could be helpful!

Each protein from the virus was cloned and used as a bait -- whoa...they even cloned the predicted cleavage sites separately....and they tested all these drugs....?

A KingFisher was used to help automate the pull-downs/IPs/AEs, whatever you like to call them.

The digestions were performed on-bead and 2 LCMS systems were used. One is a QE Plus system running 75 minute gradients. I'm a little unclear on if 2 identical systems were used or if the second system was something different.

Downstream analysis used SAINT and MIST and MaxQuant and protein complexes were cross-referenced with CORUM. And the bioinformaticians on this weren't sitting around. The number of programs used in the downstream is staggering....

What you want is the ProteomeXchange data and it's all here.

Not a proteomics wizard? You might have come to the wrong weird blog. But if you just want the protein-protein interaction data -- they were uploaded to NDEx!

Monday, March 23, 2020

Lunatic -- Finally a super easy way to quantify/optimize peptide loading!

For today's break from watching people who think this COVID-19 thing is a hoax, I present the easiest way to optimize your peptide loading that I've ever seen!!

I get this question all the time. You get this question all the time. Sure, we did the BCA, so there is all the protein, but how much peptide do we put on the instrument?

The actual value might surprise you! ...nevermind...I guess it is right there in the image....

We can argue about loading amounts vs how dirty your quadrupole gets vs whatever later. I strongly suggest this paper, if only to see the easiest way to truly measure and optimize your peptide load that you've ever tried.

I don't know what is going on in Ghent, but they seem to have this notion that they can just fix all these things in proteomics that have been broken forever.....