Wednesday, October 29, 2025

Ignore the EvoSep instructions and load peptide tips however you want!!




The whole point of the EvoSep HPLC is reproducibility, I'm pretty sure. That's what it says on all their things. The tips come with handy little visual instruction cards that tell you exactly how to load them. My team actually made stickers for the boxes where you check off each step you've completed, because we can only spin 2 boxes at a time in our centrifuge and we often prep far more tips than that at once. 

And...I swear about half of the papers I read (particularly from one group I'm going to point out right now) seem to just load the tips however they happen to feel like doing it that day. 

This is by far the funniest one. In this paper on AlphaDIA, a group that has strong ties to EvoSep ignores the instructions entirely - in 2 separate ways, in a single paper. 

Look, this is funny because it's the same study, but if the goal is reproducible peptide binding, why introduce a potentially confounding variable at all?? 

First EvoTip prep


Propanol-1 then water then 99% acetonitrile then re-equilibrate (you never learn how much volume here).

And then a totally different way of loading the same tips! This time with a robot, so maybe it has to do it this way? 


Propanol-1 then 99% acetonitrile twice then water twice with seemingly random volumes? 

In case you aren't familiar, this is what the card says in the box. 


It was tough choosing a gif here, but this one was hard to pass up.



Tuesday, October 28, 2025

AlphaDIA - I don't understand why it is different but it IS open source!

 


I really did read the full body of the text for the published manuscript on AlphaDIA. I kept looking for the "why is this different than the DIA tools that I currently use" and I never had that "OH! That's why this is different!" moment. For real, WTF is transfer learning? I'm either a poor reader or it is never fully explained in the text anywhere. 

However, it:

1) Processes Orbitrap data

2) Processes TIMSTOF data

3) Processes SCIEX ZenoTOF data

4) Processes Astral TOF data

And it is fully open source! YEAH! Which doesn't matter a lot to academics, 'cause 2 of the main tools you probably use are already free to you and it doesn't make a difference. But it matters to a lot of people! 

Looking at just the baseline numbers for what we get in DIA-NN and SpectroNaut on our cancer cell line digests on the TIMSTOF Ultra and the numbers I get when I reanalyze other people's Astral data, the AlphaDIA numbers seem on the lower end, but within reasonable expectations. Also, it's worth noting the Github went live 4 years ago and the preprint is at least 18 months old, so these might be older files. Proteomics hardware/methods and informatics improvements have been nuts the last couple of years, so it's tough to tell how much all that factors in here. 

If you haven't looked at it in a while, I'm happy to report the Github has some neat little animated walkthroughs and things.

OH YEAH! And it runs on Macintoshes, so if you're done enough with Windows 11 that you're going to buy hardware from the US POTUS's very close friend Mr. Apple himself, you're covered.

You can read about it here. 



Monday, October 27, 2025

Does anyone know why there is a cockroach peptidomics paper every couple of years? New one!

 


I won't lie, I'm not even going to go past the first page on this one. I was at Johns Hopkins off and on for like 20 years. I have some cat-sized-cockroach-related psychological trauma I've submerged that I'm not about to bring to the surface just because someone decided to put an actual picture of how they get the neuropeptides out of the cockroach. 

However, every couple of years someone does peptidomics on cockroaches and I do not know why. I probably won't ever know. Hopefully it's some important model or something. I looked and it doesn't appear to be the same group. 

I'm sure they did a great job on this or it wouldn't get published in JPR. If you want to read it, here is the link. Have fun. 

Oh yeah, and I didn't hallucinate this. I just searched "cockroach peptidomics" in JPR's search bar. 2 there in the last 5 years alone! 

Gross.

Sunday, October 26, 2025

Pathogenic demo drops on Steam on Halloween!

 


Listen up gamer nerds. You can keep playing whatever dumb thing you are currently playing - or - hear me out - you can get a free demo for Pathogenic where you play as a pathogen trying to infect a host!! 

Demo goes live on Halloween! More info here.


Thursday, October 23, 2025

US HUPO ABSTRACT SUBMISSION DEADLINE IS TOMORROW! 10/24/2025!!!

 


For all of you excited to leave a decent pile of what you might consider your human rights behind for a few days of fun in the Middle West, you better get on it! 

Abstract deadlines are TOMORROW! 

Get 'em submitted so you can go to Missouri!! 


Wednesday, October 22, 2025

MaxQuant + SDRF enables great reproducibility with no downsides whatsoever!

 


As everyone in proteomics already knows there is absolutely no downside whatsoever to using MaxQuant for every proteomics experiment. It's super fast, visually stunning and modern, incredibly stable and gives you all sorts of insight into your experiments when they succeed and those exceptionally rare times when it just stops running 18 days into analyzing those 4 files.

How could you make our field's very favorite toolkit even better? No way, right? Oh. Do I have one for you. What if you could also get your metadata out in the soon-to-be-mandatory (those are single dashes, no generative AI here - every word on this blog is typed by this one weird dyslexic guy) SDRF format? 


Nature Comms?!? Whoa. What a demonstration of how great it is to work with MaxQuant that you can score a high-impact publication by getting it to export a .JSON table in the format you want! Heads up, editors - one of y'all is about to see our SDRF exporter for Proteome Discoverer as soon as I get an hour to figure out what computer I made it on! Heck, I'll throw in one for Metamorpheus too! Supplemental methods.
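If you've never actually looked at one, the canonical SDRF-Proteomics file is just a tab-separated table: one row per raw file, with the sample and acquisition metadata in the columns. Here's a minimal Python sketch of writing one - not the paper's exporter, the column set is trimmed way down from the real spec (which also wants ontology key=value pairs in many fields), and the sample values are made up:

```python
import csv

# A hand-trimmed subset of SDRF-Proteomics-style columns - the real spec has
# many more, and values like instrument are normally ontology key=value pairs.
columns = [
    "source name",
    "characteristics[organism]",
    "characteristics[organism part]",
    "assay name",
    "comment[instrument]",
    "comment[label]",
    "comment[data file]",
]

# Made-up example rows, purely for illustration.
rows = [
    ["sample 1", "Homo sapiens", "plasma", "run 1",
     "Orbitrap Astral", "label free sample", "run01.raw"],
    ["sample 2", "Homo sapiens", "plasma", "run 2",
     "Orbitrap Astral", "label free sample", "run02.raw"],
]

# SDRF is plain TSV, so the csv module with a tab delimiter is all you need.
with open("experiment.sdrf.tsv", "w", newline="") as handle:
    writer = csv.writer(handle, delimiter="\t")
    writer.writerow(columns)
    writer.writerows(rows)
```

That's it. The hard part was never the file format - it's getting people to fill the columns in.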

Disclaimers: Proteomics metadata is something we should be uploading properly. We aren't. Hell, I'd say 75% of proteomics experiments aren't even having their data put on public repositories. My job is to draw attention to things by generally being a nuisance about it. Anything that makes getting data deposited with appropriate metadata easier is a very, very good thing. Thank you to these authors for their work and effort. 

Tuesday, October 21, 2025

There's a whole book on immunoproteomics / immunopeptidomics!

 


Okay, this one slipped by me. I had a weird 2024...and 2025 isn't going to set any records for normalcy.

It all seems brand new to me. There is some smart immunopeptidomics using TOMAHAQ-derived methods, as well as the clearest description of the ThunderPASEF technique I've seen. The lead-in for why you'd want to do immunopeptidomics in the first place might be the star, though. Totally worth checking out! 

Saturday, October 18, 2025

The first credible quantitative analysis of 7,000 proteins in human plasma!

 


Over the last few years I think I've seen 5 different company presentations that have been in some sort of an arms race - purely with one another - to deep dive in human plasma. This one now says they can get 3k proteins in human plasma, so the other one says 3,500 and the next one has to say 4,000, and then 5,000. I've visited a couple of them personally and the scientists running the samples seem to spend most of their time trying to find new jobs, because their executives and their sales teams seem incapable of shutting the fuck up for even 10 seconds and are pulling these protein coverage numbers out of their own lower body cavities. Yo, if you're ever in a company as a scientist and you answer to the sales team - get out. 12 years hanging around mass spec companies and I've never once seen that setup work. 

Back to the paper! 

Among all this noise there have been credible developments in hardware and chromatography and nanoparticles. And what if you really sat back and had people who are experts at all of these things work together to see what they could really do? 

The best case scenario is that you might end up with something like this:


This is the newest (that I'm aware of) generation of Seer Proteograph linked to some fancy uPAC 50cm chromatography running at 1uL/min on the Orbitrap Astral with about 30 minutes/sample, and a similar but slightly longer method on an Orbitrap Exploris 480. 400ng of peptide (wow! that's a lot in my mind, but people do load a lot) on the Astral and something in that range on the Exploris. Feels like they tuned in the chromatography on the Exploris and then the Astral arrived and they were good to go on the chromatography side.

Of course, the nanoparticle corona stuff had to be done up front, and that's where the magic is to get past those 22 pesky proteins that make up 99% of the protein in blood plasma.

Now - this isn't a SomaScam/Illumina Protein Crap experiment where you can detect all sorts of stuff but you can't quantify any of it because your aptamers have a dynamic range of 2 (not orders, 2). This group spiked in bovine proteins at different levels in different samples and determined how well they could quantitatively recapitulate the expected ratios.

Turns out it works really really well, with or without depletion. For real, there were legit analytical chemists doing legit analytical chemistry here and it makes this whole workflow seem very very smart. 

Sure - this isn't an inexpensive workflow. We all know everyone complains about the Seer kits. I hear rumors of $250-$350/sample all the time. And an Astral isn't cheap, but we're talking about 48 samples/day, so 336 samples/week before controls? That's more than Illumina's solution can do - and an Illumina sequencer (which you also need for O-Link) isn't cheap either.
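Back-of-the-envelope on those throughput and kit numbers, using only the figures quoted above (not the paper's exact accounting):

```python
# Rough throughput/cost math from the numbers in this post - purely illustrative.
minutes_per_sample = 30                              # ~30 min/sample Astral method
samples_per_day = 24 * 60 // minutes_per_sample      # = 48 samples/day
samples_per_week = samples_per_day * 7               # = 336 samples/week, before controls

kit_cost_per_sample = (250, 350)                     # rumored Seer kit cost range, USD
weekly_kit_cost = tuple(c * samples_per_week for c in kit_cost_per_sample)

print(samples_per_day, samples_per_week)             # 48 336
print(weekly_kit_cost)                               # (84000, 117600)
```

So the consumables are the scary part of the budget, not the instrument time.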

It's been a pretty good week for plasma proteomics by mass spectrometry... and sure, I'm biased, but if we're competitive from a price, throughput and coverage perspective, the other technologies seem more than a bit silly. 

Wednesday, October 15, 2025

Are data streams the answer to proteomics processing bottlenecks?

 


I don't know if this is the answer to some of my most pressing current problems, but it does seem silly dumping all the data at once into something that can't handle it.

Maybe if we treat the data as a stream(?) we can allow it to be progressively processed(?)


In their models they get up to a 3-order(!!!)-of-magnitude(!!!) increase in speed when processing large (and largely simulated) datasets of tens to hundreds of thousands of proteomics samples.
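I haven't dug into their actual implementation, but the general streaming idea is easy to sketch: instead of loading every sample's quant table into memory and then aggregating, you fold each one into a running summary as it arrives. A toy Python version (the file layout and the aggregation are made up, this is just the concept):

```python
from pathlib import Path
import numpy as np

def stream_protein_means(sample_files):
    """Toy streaming aggregator: keeps running sums instead of holding
    every sample's intensity matrix in memory at once."""
    running_sum = None
    n_samples = 0
    for path in sample_files:
        intensities = np.load(path)          # one sample's protein intensity vector
        if running_sum is None:
            running_sum = np.zeros_like(intensities, dtype=float)
        running_sum += intensities
        n_samples += 1
        # Partial results are available after every file - nothing waits
        # for the whole cohort to finish loading.
        yield n_samples, running_sum / n_samples

# Hypothetical usage over a big cohort of per-sample .npy quant vectors:
# for n, mean_so_far in stream_protein_means(sorted(Path("cohort").glob("*.npy"))):
#     print(f"after {n} samples, mean of protein 0 = {mean_so_far[0]:.2f}")
```

The memory footprint stays flat no matter how many samples you pour through it, which is presumably the whole trick.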

Worth at least thinking about, IMHO.

Tuesday, October 14, 2025

It's official! The Proteomics Show will be live at International HUPO!

 


Since no one asked for it (and ASMS ignored me)! 

THE Proteomics Show will record live at International HUPO on Monday, November 10th. Main poster/exhibit hall at 2pm (I think!) 

It'll be me for sure and one super secret special host - and audience participation! Woooo! Toronto!  

Sunday, October 5, 2025

Multicenter quantitative evaluation of plasma proteomics spike-ins!

 

Whoa! This has been a big week or two for plasma proteomics! And this multi-center evaluation goes after the most important part of it - who gives a rat's ass if we can detect a protein? Not me and not you - what we want to do is accurately quantify it! Let's spike in all sorts of stuff and have everyone you know run it! 


All sorts of instruments are employed in this study and -

HOT DAMN - Did they get 4,000 proteins in neat plasma?????

Oh. Okay. When you put E. coli and yeast into plasma, even at low concentrations, on a really nice instrument from any vendor you can get a couple thousand proteins. ('Cause it's not too hard to see 3,000 proteins in yeast or 1,000 proteins in E. coli.) 

When you get down to Figures 7f and 7g you see numbers that make a lot more sense. 400-ish proteins in neat plasma. 



There are a ton of fundamental observations in this study that are critical to think about regardless of your instrument or whether you do plasma proteomics or not. How does DDA compare to DIA? Where are lower resolution/faster TOF scans better than slower but higher resolution Orbitrap scans - and vice versa?

Super super super cool study. 

Saturday, October 4, 2025

mzPeak - Is this the solution for proteomics data now - and the future??

 


Okay y'all. I'm going to approach this one with a healthy pile of skepticism, but I need a solution - and probably you do as well. A small label free single cell study for us - like one 384-well plate - is generating maybe 500-600 GB in RAW (Bruker .d) data. Then to run our data in SpectroNaut we first have to do the absurdly infuriating process of converting it to a special SpectroNaut format. It's called .HTRMS, which is probably Swiss for "Hard drive (T?) Room Makes no Sense". This takes your .d file and makes a second file that can only be read by SpectroNaut and is almost exactly the same size. Now you're at one 384-well plate and a TERAbyte or more. 

The problem here is that neither of these things is a UNIVERSAL format. The always forgotten, frequently cursed, consistently ignored, but-they-keep-going-anyway Proteomics Standards Initiative has forever tried to come up with ways to store mass spec and proteomics data in "universal formats". They've had some great ideas. We've ignored all of them. They've evolved those ideas as files went from Megabytes to Gigabytes, which didn't really change much because we - as a field - ignored all the stuff they were talking about anyway.

mzML or mzXML or whatever we're supposed to be working with doesn't work for me. A Bruker .d file still increases in size by about 10x. So...my 384-well plate is now 6TB, and that's the size of my largest onboard hard drives. 

What we need is something that can not only deal with the files of today, but allow us all to deal with the files of tomorrow. I've got some files from the Billie prototype and those files are 10x larger than anything my TIMSTOFs generate. It's not out of the question that things like the 8600, which are doing scanning SWATHs at absurd rates of speed, are also going to be generating preposterous amounts of data - let alone behemoths such as the Astral and Athena. 

Is this the solution? 


I don't know. A bunch of these people seem to contribute to the PSI, which immediately makes me want to ignore this whole thing, but the parquet file format does sound like it might have a bunch of advantages over .sqlites and XMLs and even locked proprietary binary files that are easily corrupted by transferring from one location to another. 
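For what it's worth, Parquet is a columnar format, so a peak list stored that way is basically a long table of scan/m/z/intensity columns that compresses well and can be read back selectively. A totally made-up miniature example - this is not mzPeak's actual schema, just pandas + pyarrow doing columnar storage:

```python
import pandas as pd

# Fake, tiny peak table - the real thing would be millions of rows per file.
peaks = pd.DataFrame({
    "scan":       [1, 1, 1, 2, 2],
    "mz":         [400.2502, 512.7741, 785.8430, 401.2711, 623.3318],
    "intensity":  [1.2e4, 8.9e3, 3.4e5, 2.2e4, 5.1e3],
    "rt_seconds": [12.4, 12.4, 12.4, 12.9, 12.9],
})

# Columnar + compressed on disk (needs pyarrow installed).
peaks.to_parquet("peaks.parquet", compression="zstd")

# You can read back only the columns you need, which is the selling point
# over parsing an entire XML document just to get at the intensities.
mz_only = pd.read_parquet("peaks.parquet", columns=["scan", "mz"])
print(mz_only.head())
```

Whether that actually scales to Astral- or Billie-sized files the way the authors claim is exactly the thing I'd want to test before getting excited.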


Friday, October 3, 2025

optTMT - Rapidly build the ideal TMT experiment for the best quan values!

 


Hey you! Do you need to do a big multi-batch TMT study? I've got a couple coming up! 

What if there was a handy-dandy GUI that would help you design that study to get the maximum quantitative accuracy and the smallest effects from impurities and coisolations fudging all your super cool biological findings?

CHECK THIS OUT! 


Don't want to read? I got you, yo! 

https://marc-antoinegerault.shinyapps.io/TMT_optimization/

Thursday, October 2, 2025

Tired of slow DIA search times? Try DIA-NN 2.3 with InfinDIA!

 


Is your awesome new instrument generating so much data that you can't process it fast anymore? My data density has jumped about 5x from my last hardware to my new stuff, and the data processing is correspondingly less fast. 

A huge thank you to the Director of Proteomic Innovation at Cedars-Sinai, Dr. Simion Kreimer, for texting me about this new thing called InfinDIA (which is in a free [demo?] for academics in DIA-NN 2.3 that you can get and read about here).


I stole one page of the important stuff! 

How much faster are we talking about? One pseudo-bulk run (25 cells) on DIA-NN 2.2 was taking about 25 min/file to process with our previous settings on our admittedly underpowered Dell Craputer with the Ultra 9 285 and 128GB of RAM (running 20 threads so there's still enough power left to remote into the box). 

When toggling the mode to InfinDIA in DIA-NN 2.3, it appears to have completed 6 files with MBR in about the same amount of time! This is just a single run and should be re-checked to see what other changes happen, but it definitely seems worth taking a look at, 'cause...
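(And for the record, the back-of-the-envelope math on that single observation - mine, not anything DIA-NN reports:)

```python
# Very rough speedup estimate from the one comparison described above.
old_minutes_per_file = 25                 # DIA-NN 2.2, previous settings
files_done_in_same_window = 6             # DIA-NN 2.3 with InfinDIA + MBR
window_minutes = old_minutes_per_file     # "about the same amount of time"

new_minutes_per_file = window_minutes / files_done_in_same_window
speedup = old_minutes_per_file / new_minutes_per_file

print(f"~{new_minutes_per_file:.1f} min/file, roughly {speedup:.0f}x faster")
# ~4.2 min/file, roughly 6x faster
```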



Wednesday, October 1, 2025

The PCA-N paper is published AND has links to the data files!!

 



Edits 10/2/2025: Thanks for the emails and comments! It looks like this is only the first 1,700 files. Still fun to download and take a look at! Also, it looks like the newest O-Link Explore HT can do 300+ samples/instrument setup per day. That's faster than an Astral at 100SPD. My notes were on the previous O-Link Explore 30something, which was 82 samples/plate. That's 82 samples prepared per work day/instrument setup, but I believe that takes 2 days. Even if it's just that, it's 410 samples/week compared to 700/week on the Astral. And WAY cheaper on the Astral. 

WOOOOOOO! This paper is out and could represent a full paradigm shift in how we do plasma proteomics! 



Thousands and thousands of plasma samples! Done just about as fast as you could do it with O-Link and WAY WAY WAY WAY faster than you could do it with Illumina Protein Prep. (40,000 samples in this study, not counting QCs, were completed on one instrument in less than 1 year.) This would take just over 4 years with Illumina's new and non-quantitative protein detection technology. One instrument running O-Link Explore would take longer than this study, but not by a lot. 

Here are my notes from the preprint! 

AND all the data is linked in the paper - the same data you might have found very extremely disappointing not to have available with the preprint!