Saturday, October 31, 2020

Need some "normal" single shot human files?

I think I came up with (no exaggeration) over 1,000 RAW files from HeLa cell lysates or the Pierce digest standard on my network here. I bet a lot of people would find the same. HeLa is the go-to human cell type for proteomics, despite the fact that basically everyone in genomics decided 10 years ago that use of the cell line is unethical.

The NIH is actually revisiting this issue now

The cell line is useful for proteomics for a lot of reasons, like all the transcripts/proteins are being expressed basically all the time and checkpoint phosphorylation sites may be fully activated and the cell keeps going. 

You know what is tough to find? "Normal" human proteome files! 

There is the Proteome Draft Map (Pandey Lab version): These are a bit older now, but the data quality is great. 24 offline fractions, 90 minute LCMS per fraction and HCD MS/MS. The files from the LTQ Orbitrap Elite are the best and the MS/MS resolution is around 30,000 (there are Velos files as well, but the density is much better in the Elite files). All the RAW files are on ProteomeXchange here. 

More recently, this group employed offline SAX fractionation followed by QE HF/Plus and Lumos (for some tissues and digestion types) to make:

The RAW files are on ProteomeXchange here.

Sometimes, though, you aren't looking to start your project by looking at 5e5-5e6 MS/MS spectra. If you are working with a new tissue, it would just be cool to see what you should expect:

New Resource Alert! 

They do some cool analyses of these tissues, but I'm just looking for a resource today and these great RAW files are at this link at PRIDE

The files I've pulled down are 120 minute single shot files with superb chromatography analyzed on a QE HF-X. The top 28 precursors were selected for fragmentation and the MS/MS scans were acquired at 15,000 resolution. Only precursors with charge states 2 through 5 were selected.
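If "top 28, charges 2-5" is new to you, here is a minimal sketch of the idea in Python. The `Precursor` class, the `select_top_n` function, and the example peaks are all made up for illustration. A real instrument also applies things like dynamic exclusion and intensity thresholds, which are left out here.

```python
from dataclasses import dataclass


@dataclass
class Precursor:
    mz: float
    charge: int
    intensity: float


def select_top_n(precursors, n=28, min_charge=2, max_charge=5):
    """Return up to n of the most intense precursors with an allowed charge."""
    eligible = [p for p in precursors if min_charge <= p.charge <= max_charge]
    return sorted(eligible, key=lambda p: p.intensity, reverse=True)[:n]


# Hypothetical survey scan: the 1+ and 6+ ions are skipped no matter how intense.
survey_scan = [
    Precursor(445.12, 1, 9.0e7),
    Precursor(652.34, 2, 4.0e7),
    Precursor(801.90, 3, 1.5e7),
    Precursor(500.27, 6, 2.0e7),
    Precursor(720.88, 2, 8.0e6),
]
picked = select_top_n(survey_scan, n=2)
print([p.mz for p in picked])
```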

The only minor criticism I have of this great new resource (that I'm now using for free for my research, so I'm obviously a jerk for pointing it out, but it is worth noting if this matters to you) is that the first mass was not set at a specific cutoff. It is instead proportional to the parent m/z.

For example: 

For an ion this big you won't see low mass fragments like oxonium ions from glycan monomers. So, if you were looking to survey, for example, "how often do we see high abundance HexNAc peptides in different tissues", the numbers will be misleading compared to files that did set a static first mass.

Programs like SugarQB that require the presence of these oxonium ions as "reporters" to flag spectra for large library glycan searches won't work very well (in part because the glycans will push up the size of the ions). 
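To make the first mass issue concrete, here is a rough Python sketch that lists which common oxonium ions would land inside the MS/MS scan range. The oxonium m/z values are the standard singly protonated ones. The `ratio` parameter is purely a placeholder assumption: I don't know the exact proportionality the instrument used, so read the real first mass out of the scan headers of your own files before trusting any numbers.

```python
# Common singly charged oxonium ions (monoisotopic m/z).
OXONIUM_IONS = {
    "HexNAc fragment": 138.0550,
    "Hex": 163.0601,
    "HexNAc": 204.0867,
    "NeuAc - H2O": 274.0921,
    "NeuAc": 292.1027,
    "HexHexNAc": 366.1395,
}


def first_mass(precursor_mz, ratio=None, fixed=None):
    """Lowest m/z recorded in the MS/MS scan.

    fixed: a static first mass, if the method set one.
    ratio: ASSUMED proportionality to the precursor m/z (hypothetical value,
           not taken from the instrument documentation).
    """
    if fixed is not None:
        return fixed
    if ratio is None:
        raise ValueError("give either a fixed first mass or a ratio")
    return precursor_mz * ratio


def detectable_oxonium_ions(precursor_mz, ratio=None, fixed=None):
    """Names of the oxonium ions at or above the first mass."""
    cutoff = first_mass(precursor_mz, ratio=ratio, fixed=fixed)
    return [name for name, mz in OXONIUM_IONS.items() if mz >= cutoff]


# A big precursor with a proportional first mass loses the small reporters.
print(detectable_oxonium_ions(1200.0, ratio=0.2))
# The same precursor with a static first mass of 100 keeps all of them.
print(detectable_oxonium_ions(1200.0, fixed=100.0))
```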

Just pointing it out like a jerk for the people who use this stuff! Otherwise, for real, this is a stupendous new resource that I'm going to use for a lot of stuff! 

Thursday, October 22, 2020

The Winner of the 2020 Data Mineathon Challenge/ #ALSMinePTMs is.....

First of all, I can't thank everyone enough for putting up with the chaos that was #ALSMinePTMs. 

Thank you judges! 

Thank you random friends who helped me evaluate some of the data and filter it down for sending out to judging!

And most of all, thank you to the participants who downloaded files that I didn't properly set up access to at first, that I then posted incorrect meta-data for, and still used their valuable time to process this data and return results! 

We are working on the draft of a manuscript to actually summarize these results and more emails will be going out while we pull this awesome data together. 

This is an early observation that we're in the process of evaluating, but this might have been more than just a fun exercise. Some of these PTMs that were discovered look legitimately important to the understanding of this messed up disease and might strengthen some of the hypotheses of where it starts from (environmental exposure, as terrifying as that is). Preliminary analysis of another cohort definitely shows some of these peptides, but more work will be necessary to really check all this out, and that is in progress. 

Just to be clear, if this is the first time you've heard of this: all the participants downloaded a proteomics dataset that was not enriched for PTMs and found PTMs in it, and we're validating the results and we're going to write a paper about it. That is the power of next generation proteomics data processing software.

So. Thank you. 

But that's not what I'm typing about. This was a Challenge. And bragging rights are on the line AND

There were great submissions all around, but the top pick from the judges was a submission sent in by Dr. Sarah Haynes and the first 2 pages looked like this --- 

If you aren't using MSFragger/FragPipe, Philosopher and/or associated tools, I urge you to give them a look. Other tools found many of the same modifications that MSFragger did and that really resonated with the judges, but no tool found as many different classes of solidly confident PTMs. When multiple tools support what your analysis found but nothing found all the ones that you did, that's super impressive, right?

And the follow up analysis done by the team in support of their findings was really just top notch work. 

Again, I'll follow up with more later, but I wanted to get the winner announced and this part of the challenge wrapped up. 

A huge congratulations to this team, and another huge thank you to the people who made this possible. It looks like a real institution is interested in this challenge and maybe in providing resources to keep it going, so it really might have been the 1st Annual Proteomics Data Mineathon Challenge Thing!

Wednesday, October 21, 2020

Need THE guide for today's proteomics for medical collaborators?

 Is this the guide to today's proteomics technology that Carl Sagan would have written? Something that breaks down the crazy stuff we do that is branching into hundreds of distinct directions and makes it as approachable as possible? 

If not, it's as close of an attempt as I've ever seen and it makes me very happy.  100% recommended for sending to that person you think will be cool to work with.  

Monday, October 12, 2020

STAMPS -- Build Metabolomics Assays From the Protein Level!


I'd love to have an argument with you about what "Metabolomics" actually means. To me, the sticker is that whole "-omics" part of the word. Wait. Terrible idea time. 

I'll argue all day that "proteomics" is only just now starting to happen routinely. Here is the argument, though: I'm not sure "metabolomics" is really happening yet. We know that only a small subsection of the metabolites in a cell will stick to reverse phase chromatography, an even smaller section of those will ionize in one polarity, an even smaller section of those will survive ionization, and out of those you'll confidently identify maybe 10%, maybe 30% of them? If you haven't tried global metabolomics, give it a whirl. Then tell me how you figure out which of the 25 things confidently labeled as "inosine" is inosine. (The second one is probably hypoxanthine, it doesn't survive electrospray very well.)

What was I typing abou....STAMPS! You can read about it here.

STAMPS dares to ask the question: "Why are you trying to quantify these frustrating small molecules, when you could be quantifying the proteins responsible for making and degrading them?"

And you say: "Because it would take me 3 miserable months to make the targeted assays"

And they said: "Oh. Here you go. We made them all for you." (In mouse, so far, but more is clearly coming)

This thing is super sweet. Go into the database and find the metabolic pathway you're interested in. (They've got 16,000 proteins in mouse available so far) 

Put a checkmark on the proteins that you want, right in the metabolic pathway for the thing you're interested in. 

Check the spectra of your targets, if you're interested. 

And download your assay. Formatted for input into Skyline! 

Is this still sorta targeted protein quan and not proteomics? Maybe. But if you've got the ability to choose from all the things, to me, that counts. I'll be even more pumped when they set up human, but if I was really interested it wouldn't be all that hard to port this with Picky or Phosphopedia or similar. 

Sunday, October 11, 2020

PRiSM -- A thought experiment on Protein rather than Peptide Spectral Matches!

(Image credit: Lucas Vieira for making this and putting it in the public domain!)

Proteomics hasn't been around all that long (not real proteomics, anyway) but we have been around long enough to develop a nice cognitive box for us to work within. 

The answers to these questions are something like:
1) We do the peptide because it is WAY easier. 
2) You can do it. I mean....they did it.... I couldn't do it. 
3) You need some serious math and a lot of firepower. 
4) Pros? It totally works and they can find things that traditional engines can't.
5) Cons? It's hard. Like seriously hard. 3 hours per MS/MS spectrum per computational core hard, but this is a thought experiment, not a practical optimization study. 

We know there are places that our bottom-up search engines just don't work well. Maybe there is an alternative! 

Saturday, October 10, 2020

Do you have a strong Opinion on Proteomics? This special edition wants to hear it!

 I know you have some opinions about what is good and maybe what is bad in proteomics right now. (I'm not sure anything could possibly be bad about proteomics!) If you're tired of saying that opinion over and over again to yourself, and think the whole world should hear it -- now is your chance, 'cause this special edition in Proteomes is accepting reviews now.

How much do they want your review? No page charges! (Don't tell anyone at Elfseverer about this. They might literally die.)  

I'm not kidding about it being an opinion piece. Special guest editor Matthew Padula described his hope of assembling a "warts and all" compilation of where we are today, so we can take it apart and figure out how to fix it all to get to the next level.

Super cool idea, right? 

Friday, October 9, 2020

Reminder you can control your EasyNLCs from your PC!

 Probably everyone already knows this? Just in case, you can totally control your EasyNLC systems from your computer. It makes remoting in a lot easier. I just confirmed today that it is compatible with the Easy1200 system.

In our test group of mass spectrometrists considered "grumpy" and "deficient in Vitamin D" by their colleagues, we found a 12% reduced incidence in road rage when they were running the horrible "flush air" script on their horrible commutes. 

You can get the installation documents from the great UWPR page here

Thursday, October 8, 2020

Fragment mass prediction for phosphosite localization!

Could we improve on current phosphorylation site localization strategies? This group sure seems to think so. With all this deep learning peptide stuff happening, this new approach seems simple enough to work into a lot of new workflows!

If I get what they're doing, they use the deep learning model for the unmodified peptides and use the phospho and phospho loss (?) shift to score the localization. The plots vs the traditional approach suggest they're onto something solid. 
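Here is a toy Python sketch of the general idea of scoring candidate sites by shifting fragment masses. To be loud about the assumptions: this is not the group's method. There is no deep learning model here (every fragment gets the same weight instead of a predicted intensity), the neutral loss shift is ignored, only singly charged b and y ions are used, and the peptide `ASATAK` and the function names are made up.

```python
# Monoisotopic residue masses for the residues used in the example.
RESIDUE = {"A": 71.03711, "S": 87.03203, "T": 101.04768, "K": 128.09496}
PROTON = 1.007276
WATER = 18.010565
PHOSPHO = 79.966331  # HPO3


def fragment_mzs(sequence, site):
    """Singly charged b and y ion m/z values with a phospho at index `site` (0-based)."""
    masses = [RESIDUE[aa] for aa in sequence]
    masses[site] += PHOSPHO
    ions = []
    for i in range(1, len(sequence)):
        ions.append(sum(masses[:i]) + PROTON)          # b ion with i residues
        ions.append(sum(masses[i:]) + WATER + PROTON)  # y ion with the rest
    return ions


def score_sites(sequence, observed, tol=0.02):
    """Count matched fragments for each candidate S/T/Y site."""
    scores = {}
    for site, aa in enumerate(sequence):
        if aa in "STY":
            theoretical = fragment_mzs(sequence, site)
            scores[site] = sum(
                any(abs(t - o) <= tol for o in observed) for t in theoretical
            )
    return scores


# Pretend the spectrum came from the peptide phosphorylated on the T (index 3).
observed_peaks = fragment_mzs("ASATAK", 3)
scores = score_sites("ASATAK", observed_peaks)
best_site = max(scores, key=scores.get)
print(scores, best_site)
```

Only the fragments that sit between the two candidate sites tell them apart, which is why the wrong site still matches some of the peaks.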

Wednesday, October 7, 2020

The Carrier Proteome Effect in Single Cell Proteomics!

 I saw this on Twitter a few days ago and I couldn't find it anywhere. I think there might be a preprint, but I'm leaving it here so I don't lose it again.

Direct video link is here. 

Tuesday, October 6, 2020

Match between runs for Reporter Ion Quan?!?!?


Have you ever tried to combine multiple TMT studies? I'm paraphrasing, but Akhilesh Pandey said "two will work, and three is okay, but the amount of loss by your fourth plex makes it not worth it" and I have found that to be 100% true in my hands. Our sampling is stochastic, which is fun to say, but I'm not sure that I'm using the word correctly. The Venn diagram of the peptides that you're able to fragment and identify in each plex that you add gets progressively less overlapping. In today's instruments that are blindingly fast, we're getting loads more fragment ions but that doesn't necessarily translate to higher percentages of actual identified ones.
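A back-of-the-envelope way to see why the fourth plex hurts, as a Python sketch. The big assumption (mine, not from any paper) is that each plex identifies a given peptide independently with the same probability, and the 70% figure is made up. Real sampling is biased toward abundant peptides, so real overlap is usually better than this, but the shrinking trend is the same.

```python
def fraction_in_all_plexes(p_identified, n_plexes):
    """Expected fraction of peptides identified in every plex,
    ASSUMING independent identification with probability p_identified."""
    return p_identified ** n_plexes


# Hypothetical 70% chance of identifying any given peptide in one plex.
for n in range(1, 5):
    print(n, round(fraction_in_all_plexes(0.7, n), 4))
```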

Isobaric match between runs (IMBR) allows you to link data from fragmented but not identified peptides back to the identified ones from other plexes. 

I'm legit blown away. There are at least 10 different studies on hard drives sitting around here that this should help with. 

Oh, and they built a new algorithm to normalize without using a pooled channel. 'Cause, you know, this wasn't incredible enough. And these appear to be built into MaxQuant and Perseus now!



I was on a conference call the other day and someone said that we needed more tools for combining TMT plexes. If you're not about to start updating your MaxQuant after reading this post, you should 100% check out IRS from Phil Wilmarth. You'll need to use R to do this, but you can pretty much follow anything Phil does and copypasta what he writes directly into R and just run it.
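If you want the gist of IRS before opening R, here is a minimal sketch of the core scaling step in plain Python. This is my own illustration, not Phil's code: the function name, the `"pool"` channel label, and the numbers are made up. It assumes every plex carries a pooled reference channel, only handles proteins seen in every plex, and skips the other normalization steps his workflow does.

```python
import math


def irs_normalize(plexes, ref_channel="pool"):
    """Scale each plex so its reference channel matches the geometric mean
    of the reference channels across all plexes, protein by protein.

    plexes: list of {protein: {channel: intensity}} dicts, one per plex.
    Intensities must be greater than zero (the geometric mean uses logs).
    """
    shared = set.intersection(*(set(plex) for plex in plexes))
    normalized = [dict() for _ in plexes]
    for protein in shared:
        refs = [plex[protein][ref_channel] for plex in plexes]
        target = math.exp(sum(math.log(r) for r in refs) / len(refs))
        for out, plex, ref in zip(normalized, plexes, refs):
            factor = target / ref
            out[protein] = {ch: v * factor for ch, v in plex[protein].items()}
    return normalized


# Two hypothetical plexes where the same protein reads 4x higher in plex B.
plex_a = {"P1": {"pool": 100.0, "s1": 50.0}}
plex_b = {"P1": {"pool": 400.0, "s1": 200.0}}
normalized = irs_normalize([plex_a, plex_b])
print(normalized)
```

After scaling, the pooled channel for the protein is the same in both plexes, so the sample channels can be compared across plexes.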