Thursday, May 26, 2022

Break up your PD results (spectral counts by file!) with the MS2Go PreFilter!


I must have gotten this question 50+ times over the last 8 years. Something like, "Yo, weirdo, why can't I see my PSMs per file like in other programs?" The easiest way used to be running each file separately and figuring it out. UNTIL MS2GoPreFilter! 

Just slap this into any consensus workflow in the post processing box and -- violin -- there is your counts per file! 
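
If you'd rather check the counts by hand, here is a minimal Python sketch. It is not the MS2Go implementation, just a row count per file. It assumes you have exported your PSM table as tab-delimited text and that one column names the raw file each PSM came from. The column name "Spectrum File" and the tiny example table are assumptions for illustration, so swap in whatever your export actually uses.

```python
import csv
import io
from collections import Counter


def psm_counts_per_file(handle, file_column="Spectrum File"):
    """Count rows (one row = one PSM) per raw file in a tab-delimited table.

    file_column is an assumed column name; change it to match your export.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    return Counter(row[file_column] for row in reader)


# Made-up three-row table standing in for a real PSM export.
example = (
    "Sequence\tSpectrum File\n"
    "PEPTIDEK\trun1.raw\n"
    "ELVISK\trun1.raw\n"
    "PEPTIDEK\trun2.raw\n"
)

counts = psm_counts_per_file(io.StringIO(example))
for name, n in sorted(counts.items()):
    print(name, n)
```

For a real export, open the file with `open(path, newline="")` and pass that handle in place of the `io.StringIO` example.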

Tuesday, May 17, 2022

Get proteomics samples run for free!

If you've unfortunately stumbled on this page because you just heard about proteomics and think it might work for your work, I'm sorry if you discovered it is too expensive for you. 

But -- if your idea is really good, you can, no joke -- get free proteomic samples run by great labs. 

Not like that! Check this out! 

You can take your awesome idea here, convince IDEA how awesome it is and they'll run your stuff for free. I can't tell you how much I love this idea. 

I swear one of the main reasons that Europe is killing the rest of the world on big proteomics papers is the fact they have an entire huge network that is similar to this. People apply with amazing ideas and people who are like "WHOA. that'll get us in a Nature journal for sure!" just run your stuff. And the cycle works because the people who funded it the first time are like "WHOOOOA!!! Did we just get 100 huge papers for an amazingly small (relative) amount of money? We have to fund that again!!"  

It is called EPIC-XS (pronounced EPIC EXCESS! and, since the idea originated in Luxembourg, the proper pronunciation is almost identical to how a Ninja Turtle would say it. I don't make the rules. You can check out this gnarly idea here.) 

Benchmarking DIA for patient samples! -- and a rant about study motivation and new informatics tools.


I hope the resolution of this image turns out a little better than it looks on my screen right this second, because the figures in this new study are fantastic.

The amount of work here is just staggering. We just sent out a paper where I used 5 DDA search engines in an effort to help reviewers feel more comfortable about some peptide IDs I've made, since they've been a little controversial. I also spent most of a week working on an HTML interface so people could see the original spectra, match data, stats, decoy matches and all that stuff. I want everyone who sees these identifications to see the original spectra from which I made them. If I'm wrong, I want (need) to know about it because I'm about to embark on a couple years of work based on these identifications. It will totally suck to find out that I was wrong in 2027. While this is largely just me bragging about how cool the last 3 weeks have been in solitary confinement in my office, there is a secondary point here. 

New software is showing up that is dramatically increasing the number of peptides and proteins that we are identifying -- and that has happened before, yo, and it ended up not working out great for everybody when no one could validate those new protein IDs. It sank some companies and some really high profile projects. I'm not saying that history is repeating itself, by any means, but it is definitely making me nervous. (I'm not the only one. I got to hang out with some core lab directors last year and it was a major topic.) 

I want more identifications in less time just like everyone else, but I need some sort of confidence boost to go along with those identifications as well. Actually, geez, I'm hoping that I'm just continuing to head into this cantankerous old academic stage of my career, but I actually want to see evidence in mass spectra that identifications are real. Not all of them, that's impossible now, but I'd at least like to be able to spotcheck IDs whenever I feel like it -- and absolutely when they are something important. 

I don't want to detract from this awesome study that was a huge amount of work and should definitely be published in this prestigious journal and did a lot of really impressive stuff --  okay, but I have no  idea what any of these words are

However, I do know what these words are and this encapsulates things pretty well for me. 

Everyone has different goals and motivation going into a study. I only know how to do one thing with mass spectrometry data so here are my goals and motivations, basically always: I use proteomics and metabolomics of disease states to find markers. Then I try to make a targeted assay for that/those markers OR I give that list of identifications to someone who has only a basic working knowledge of how I made those IDs, but trusts me that every one I gave them was correct (or, at the very least will be really mean to me if they find a wrong one later that they put work into). 

There is no point in my workflow where anything outweighs a low quality identification. The MD who brought me this stuff will never say "well, I guess it was worth it to waste two years of my time and these patients who suffered physical pain to provide these samples because those plots were pretty." 

Now, that being said, there are other people and other goals and to be perfectly fair, we've identified most of the easy diagnostic markers. Most are combinatorial and need fancy statistics to uncover.  GWAS has helped people despite the fact the data for each patient, in isolation, is difficult to interpret at best, and largely inaccurate at worst. For people trying to use large patient n and machine learning and so on to identify patterns in patient data to make identifications, the negative effects of having low quality identifications WILL be outweighed by having more data. 

In addition, and I have to add this (see the cantankerous statements above), the other groups that will find that these benefits outweigh the downsides are instrument and software manufacturers, creators and vendors. 

I'll end this rant now, with this statement. THE IMPORTANT PART HERE is knowing which group you are in. If you are in a core lab or collaborative center environment and you need to stand by every identification that goes out your door for your and your team's livelihoods, I bet I don't need to tell you to be a little skeptical about new software that boosts your IDs by 20% with no easy way to see if those IDs are backed by spectral evidence. If you are in the other groups I mentioned above, you are probably okay! 

Sunday, May 15, 2022

Building libraries from narrow window DIA?


This idea might seem pretty obvious once you stop to think about it, but I sure haven't tried it and I can't think of seeing it before -- 

If you rely on DDA to develop your library you're inevitably going to miss some stuff. Narrow DIA takes a long time, but if it ionizes you should see it! 

Friday, May 13, 2022

Wednesday, May 11, 2022

Where can I get help with proteomics questions?!?


This list isn't meant to be comprehensive, but I think maybe it could be helpful -- I'll add it as a permanent link over there --> somewhere. 

Let's start with where do I go when I need help with something? Well...over the last 19 years of doing mass spec type stuff I've developed a network of contacts that I go to when I'm stuck. If you are interested in this post chances are you don't have an ill-advised tattoo that someone prominent in this field told you was a great idea after you shared a completely responsible amount of scotch that they  feel just a little guilty about and that you can leverage for help extracting MS1 XICs for the rest of your career. 

Have you tried Twitter? It is probably still alive right now since that smelly emerald mine heir that has tricked everyone into thinking he started Tesla and SpaceX hasn't yet taken over and tried to run that into the ground. There are a lot of helpful proteomics people on it! 

Did you know there is an active proteomics reddit page? That's a great place to put questions, although r/biochemistry and r/bioinformatics have larger subscriber bases and frequently feature proteomics sample prep and data processing questions respectively. 

Questions on the sample prep side? Okay....that's tougher, unless you are using a commercial solution. I use S-Traps, so when I need help I bug the support team at Protifi. If you are doing SP3 via the PreOmics stuff, bug them! If you are doing prep on the cheap, maybe try Reddit? 

Software? This is much much better and I cannot stress this enough, use the Github support pages and Google groups for your software of choice! 

MSFragger has a level of support that is bordering on absolutely impossible to fathom

MetaMorpheus is just as insane...almost 600 closed issues...?....

On this topic, anyone who has a Github up will get an email if you bug them. If you're using the code they published -- bug them! 

MaxQuant has a series of Google groups that include a section on basic mass spectrometry questions in general!

Commercial software? This is the best part of commercial software, they have support people! Don't let them sit around bored, contact them! 

Need help with a method? Have you tried Not everything is in there, but over the last 5 years or so I've tried to convince other people to put their methods there. Maybe it will happen one day. I wonder if the reason the project hasn't really taken off is that most people with mass specs aren't confident enough in their ability to use an instrument to feel okay making their methods available. So... it is about 98% methods that I have personally made, and since I have experience with, and access to, just about everything, the resource still plods along, gets downloaded a lot and I occasionally get positive feedback about it. 

Is your question about mass spectrometry hardware? This is great. Call or email the instrument vendor! Don't have a contact? Do this! 

1) Figure out who your local sales person is 

2) Tell them you have questions! 

This might actually be a useful application for LinkedIn. I bet your sales rep is on that thing! 

Because not everyone knows this, this is largely how the system works. Your local sales rep is paid just about enough money to not lose their home. Not much more. The manufacturing company wants them as desperate as possible to sell instruments. If you are sitting there with your instrument (even an old one) and you are struggling, you don't have time to write a great grant to buy a new one, do you? Even if you do, you probably will buy something else thinking you'll struggle less with it. You struggling with your hardware is bad news for your vendor. Contact your sales rep. 

They will, almost without fail (unless they've had a bad couple of years and are working nights at a hospital or spending all their time applying for jobs) get you help. You probably have local applications people and you definitely have applications support people somewhere. Now, depending on how greedy the vendor company is, when you buy that Fusion 4 instrument, your sales rep will either have a party or a really big party, because even 1% of that instrument is a pretty good day for most people in science, so it works out for everyone somehow. For real, bug them, they'll be excited to hear from you.

I feel like I forgot something else I wanted to add....maybe later! 

Tuesday, May 10, 2022

USP14 reveals how much we've got left to learn about everything...


I've left this one open on my desktop for a really long time despite the fact I could use the space.

If you haven't seen it, maybe you don't want to, but I think it is worth rambling about for a second. 

Of course, my first thought was "wait. we know how a stupid proteasome works, right? RIGHT? Oh, don't we know how ANYTHING works?" 

Turns out that we kind of do, but there was a big mystery here because increasing the amount of USP14 that is around will both upregulate and downregulate the amount of proteasomal activity. LCMS based proteomics didn't solve this one, though. CryoEM with machine learning figured this one out.

And, if I have this right (here is the original paper, btw) the amount of USP14 around is completely irrelevant to this system. The amount of modified (by ubiquitin/ubiquityl) USP14 around is all that matters. Makes you kind of wish these mods weren't such a strangely hard thing to work with, right? 

Monday, May 9, 2022

Dynamic Instrument Control -- History....and Future!

What if our mass spectrometry instrument hardware just stopped improving right now? Would we be able to stop, catch our collective breaths and start improving on the things we're collectively bad at as a field? Or would we find some other way to tinker with our instruments and the computers behind them? 

Probably the last one, because we're getting some really impressive returns right now from technologies like real time search and more intelligent acquisition methods, etc. 

These things have been around a looooooooooooooooooooooooooooooooooong time, though. For a really good perspective of these tools and their capabilities (to help guide all you instrument hackers out there?) and some thoughts on the future, check out this smart review on the topic! 


Sunday, May 8, 2022

Ready to do some deep learning multiplex fragmentation prediction?

Hey you! Were you thinking, I wish I could just predict all my TMT spectra so I didn't have to search them every time? ME TOO!

PROSIT now does deep learning prediction of TMT labeled spectra. 

WHAT??? I know!!

It's live here! 

How to do it? Okay, well this worked for me. I basically just followed this old tutorial I made

Then I opened the EncyclopeDIA output file and made a new column (or row, I still don't know the difference) and I think I just put in HCD in every entry, saved it and loaded it to Prosit.

I made the data an MSPepSearch compatible file and loaded it into a copy of PD where the free software developed by NIST wasn't disabled due to there being no money being funneled to a big company for that particular PC! 
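
For the "put HCD in every entry" step above, you don't have to do it by hand in a spreadsheet. Here is a minimal Python sketch that adds a constant-valued column to a comma-separated file. The new column name `fragmentation`, the example header names, and the example row are all assumptions for illustration, so check what your input file and the Prosit upload page actually expect before trusting it.

```python
import csv
import io


def add_constant_column(src, dst, column="fragmentation", value="HCD"):
    """Copy a CSV from src to dst, appending one column with the same value
    in every row. The default column name and value are assumptions."""
    reader = csv.DictReader(src)
    fieldnames = list(reader.fieldnames) + [column]
    writer = csv.DictWriter(dst, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row[column] = value
        writer.writerow(row)


# Made-up one-peptide input; the header names here are hypothetical.
example_in = io.StringIO(
    "modified_sequence,collision_energy,precursor_charge\n"
    "PEPTIDEK,30,2\n"
)
example_out = io.StringIO()
add_constant_column(example_in, example_out)
print(example_out.getvalue())
```

With real files, pass handles from `open(in_path, newline="")` and `open(out_path, "w", newline="")` instead of the `io.StringIO` objects.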

Saturday, May 7, 2022

SPIN -- Identify bone archaeology samples with LCMS proteomics!


This one is a seriously fun read!

This isn't the first archaeological forensics via proteomics paper, but it's the first one I can think of that used old bones(!!) and had this much evidence to support identifications -- old identifications! 

Lots to digest here, this isn't a small paper, but worth the time. 

Moral of the story might be: we can digest and identify just about anything, and it doesn't really matter how old it is, there are probably trace proteins around and that's all you need today? 

Friday, May 6, 2022

What happened at EUPA 2022?? -- Guest blog post special!

I have been buried in work linked to maintaining my employment for several weeks. So buried that I scheduled a lot of blogposts for 2032. I'll leave them there so they'll post again later.

The annual congress of EuPA is something I've always wanted to attend and it has never quite worked out, so I conned some guest bloggers into providing some insight about what happened at the conference. (Sorry for the month long delay). 

At Proteomics Forum/EuPA2022 in Leipzig, we got word of the shout out from across the ocean to provide a little update on the most exciting stuff that was presented. In search of a brief format, Ghent attendees (#GhentRepresent) spread out and each of us came up with a personal perspective on the most exciting stuff, a predominantly ECR perspective:


Arthur Declercq: “EuPA2022 was the conference of tackling problems in specific proteomics niches such as single cell or immunopeptidomics. It was very nice to see so many people show up for the immunoproteomics symposium where Michal Bassani-Sternberg gave her talk about identifying neo-epitopes for personalized cancer immunotherapy with their proteogenomics pipeline NeoDisc, showing how they bring state-of-the-art research from bench to bedside (10.1038/s41587-021-01072-6).”

Ben commentary: Hey! This is one of the two immunopeptidomics splicing groups!  Despite the controversy about whether HLA peptides splice or not, there are some very successful clinical trials going on right now centered on weird spliced peptides. 


Tine Claeys: “The diversity of proteomics and its wide variety of instruments, tools, applications, … shows the creativity of scientists in tackling very different problems. I was intrigued by the talk from Jennifer Van Eyk on how her group approaches clinical applications and biomarker discovery integrating multi-omics analyses. Combining already existing clinical data with the new advancements in the mass spec field to determine your molecular twin demonstrated again the importance of big-data analysis, especially in a clinical context. The talk from Ileana Cristea on spatiotemporal organelle remodeling (10.1016/j.celrep.2020.107943) blew my mind on how the location of a protein can have a huge impact and can be very altered upon viral infection, yet we're still mainly focused on protein abundance. This will definitely impact the way I approach my own research. I've had some amazing interactions with amazing people and expanded my scientific horizon in ways that would've been impossible without Proteomics Forum 2022!”


Toon Callens: “As I just started my PhD in proteomics (bioinformatics) it was great to see what everyone is working on and to have a broad overview of the field and the state-of-the-art. It was also interesting to see overlap with my own projects, which gave me some new ideas.”


Annelies Bogaert: “Single cells, single cells, single cells! It is great to see how much progress has been made in this field. Also great to see how proteomics is contributing to help conquer the covid-19 pandemic. Chris Overall's talk about how they used TAILS to identify substrates of the SARS-CoV-2 3CLpro protease, which gives insights into how the virus uses this to its benefit, is just one great example (10.1016/j.celrep.2021.109892). Albert Heck also gave a nice presentation showing that we are evolving toward personalized diagnostics, as top-down comparisons of IgG antibodies from one person over time could in the future help diagnose medical conditions (10.1016/j.cels.2021.08.008). Since I spend some of my time trying to better understand the functionality of proteoforms (thanks to Neil Kelleher for introducing this term [10.1038/nmeth.2369]) it was nice to hear that so many talks also considered them and stressed their importance as they are often still neglected! And as a last note, this conference really showed that DIA is becoming the standard over DDA.”


Tim Van Den Bossche: “I was really honored to present the flagship manuscript of the Metaproteomics Initiative and give a short introduction about the Initiative itself. Where the field was often overlooked 10 years ago, it's amazing to see that there's now a dedicated metaproteomics session at the major European proteomics conference.”

 Ben commentary: This is the metaproteomics initiative mentioned:

Ralf Gabriels: “Proteomics is done with simple setups; more challenging workflows is where the interest is now: Single cell, subcellular/spatial, immunopeptidomics, metaproteomics, proteogenomics, open modification... And quantification labels are coming to DIA!? Both non-isobaric with plexDIA (10.1101/2021.11.03.467007) and isobaric with DIA-TMT (10.1016/j.mcpro.2021.100177).”


 Maarten Dhaenens: “plexDIA, using 3-plex nonisobaric mass tags: how could I have missed this one? (Nikolai Slavov) (10.1101/2021.11.03.467007)

Neil Kelleher is doing stuff with single ions I will never understand because I am a QTOF person, but he is so right that “we should measure what we must, not what we can!”. 

Markus Ralser suggested ordering your differential proteins according to chromosome localization, because you might in fact be looking at an aneuploidy event in your cell lines. That reminds me of a shrimp project where we once were thinking along the same lines, but we never really showed it as beautifully as Markus did! 

And OMG, Kathryn Lilley is leaving us to look at nucleotide biomolecules?! Nooooooooo! But then again: what she is noticing on subcellular RNA localization is beyond cool. Please Kathryn, come back when you have finished throwing over their world view and throw over ours a few more times?  

Vadim Demichev got across the message that Match Between Runs (MBR) is making a significant difference in the current DIA-NN implementation. Therefore, he does upfront peak picking and alignment. And hey, that reminded me of something we were doing a while ago... 

In fact, just talking about it became the biggest breakthrough for me personally: our ion-networks could finally be revived (10.1101/726273), all because of the return of in-person interactions and how these are so different from digital meetings! I am convinced that Bruker data will turn out to be the perfect match for ion-networks. So, if ion-networks are a “sleeping beauty”, then Proteomics Forum/EuPA2022 sure woke her up!”

Ben commentary: whoa....check out the highlighted stuff....this happens all the time in cancer, particularly when cells are treated with platinum based therapies. Worth considering, for real...
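
The chromosome-ordering suggestion mentioned above is easy to try on your own differential list. This is only a minimal Python sketch of the general idea, not the method from the talk. It assumes you already have, for each protein, a gene name, a chromosome, a start position and a log2 fold change. The gene names and numbers below are made up for illustration.

```python
from collections import defaultdict
from statistics import median


def chrom_key(chrom):
    """Sort numeric chromosomes numerically, then names like X, Y, MT."""
    return (0, int(chrom), "") if chrom.isdigit() else (1, 0, chrom)


def order_by_chromosome(proteins):
    """Order (gene, chrom, start, log2fc) tuples by chromosome, then position."""
    return sorted(proteins, key=lambda p: (chrom_key(p[1]), p[2]))


def median_fold_change_per_chromosome(proteins):
    """Median log2 fold change per chromosome. A whole chromosome shifted in
    one direction is a hint to look for a copy-number change."""
    by_chrom = defaultdict(list)
    for _gene, chrom, _start, log2fc in proteins:
        by_chrom[chrom].append(log2fc)
    return {chrom: median(values) for chrom, values in by_chrom.items()}


# Hypothetical input: (gene, chromosome, start position, log2 fold change).
proteins = [
    ("GENE_C", "X", 500, 0.1),
    ("GENE_A", "2", 300, 1.0),
    ("GENE_B", "2", 100, 1.5),
    ("GENE_D", "10", 50, -0.2),
]

ordered = order_by_chromosome(proteins)
medians = median_fold_change_per_chromosome(proteins)
for gene, chrom, start, log2fc in ordered:
    print(chrom, start, gene, log2fc)
```

A median is used here rather than a mean so that one wild outlier protein doesn't make a whole chromosome look shifted. That is a design choice for the sketch, and anything it flags still needs a proper statistical look.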

Lennart Martens: “Lots of great interactions and creative new ideas at the first in-person meeting in ages. Also… CAKE!”

Ben commentary: Wait. THE Lennart Martens? Holy shit... This was a great idea! 

CONCLUSION: The conference hall was part of the Zoo of Leipzig. After the conference, we visited the zoo and found a fish that perfectly illustrated how our heads felt after a full week of in-person conference!


Guest reporters:

Arthur Declercq (@DeclercqArthur) 1,3
Tine Claeys (@TineClaeys1) 1,3
Toon Callens (@ToonCallens) 1,3
Annelies Bogaert (@AnneliesBogaert) 2,3
Tim Van Den Bossche (@tvdbossche) 1,3
Ralf Gabriels (@RalfGabriels) 1,3
Maarten Dhaenens (@MaartenDhaenens) 4
Lennart Martens (@CompOmics) 1,3

1. CompOmics, VIB-UGent Center for Biotechnology, VIB, Ghent, Belgium
2. Gevaert Lab, VIB-UGent Center for Biotechnology, VIB, Ghent, Belgium
3. Department of Biomolecular Medicine, Ghent University, Belgium
4. ProGenTomics, Department of Pharmaceutics, Ghent University, Belgium