Saturday, January 31, 2026

ADAPT-MS - A starting point for automatically classifying clinical (untargeted) proteomics data!


This one took me a couple of rounds of putting it down and coming back to it later.

It's a smart concept and a very nice thing to think about as proteomics becomes more trusted as a diagnostic. 


I think I first thought it was something that it isn't, and that's why I had such a conceptual problem with it. Obviously, I might still have it wrong, but this is how I'd describe it. 

What if you had a random patient come in and you could do global untargeted plasma proteomics on their sample? Not inside of a controlled cohort that you planned 2 years ago and pulled all the samples from the repository? Just that one sample that just came in. That's how clinical stuff might work. A sleepy 22-year-old who is working nights to save for grad or med school might be studying while running those 12 samples that came into the lab at 3am (typically because it's super important). Could you do anything with global data?

If the answer is no, then the future is not very bright for diagnostic untargeted proteomics. If the answer is shmaybe, then you're getting somewhere, and if it's a yes, then let's start building on this idea right this second.

To simulate it, they pulled some traditional proteomic studies that had a discovery cohort and then a validation cohort, where someone had already done it all the traditional way: found the markers in batch 1 and then checked how predictive those markers were in batch 2. So these authors loaded those data, pretended they didn't know what went where, and used the machine learning things to try and sort it out - and it totally ends up doing okay!

We've got ourselves a shmaybe here! 
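
To make that setup concrete, here is a minimal toy sketch of the idea as I understand it: learn something from a discovery cohort, then classify validation samples one at a time, as if each had just walked in the door. Everything in it is an assumption for illustration only. The data are simulated, the function names (`make_cohort`, `fit_centroids`, `classify`) are made up, and the nearest-centroid classifier is a stand-in, not the model the authors actually use.

```python
import random
import statistics

def make_cohort(n_per_class, n_proteins, shift, rng):
    """Simulate log-intensities; only the first 3 'proteins' differ in disease."""
    samples = []
    for label in ("control", "disease"):
        for _ in range(n_per_class):
            values = [rng.gauss(0.0, 1.0) for _ in range(n_proteins)]
            if label == "disease":
                for i in range(3):
                    values[i] += shift
            samples.append((values, label))
    return samples

def fit_centroids(cohort):
    """Compute the mean protein profile per class from the discovery cohort."""
    by_label = {}
    for values, label in cohort:
        by_label.setdefault(label, []).append(values)
    return {
        label: [statistics.fmean(column) for column in zip(*rows)]
        for label, rows in by_label.items()
    }

def classify(sample, centroids):
    """Assign ONE incoming sample to the class with the closest centroid."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(sample, centroid))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))

rng = random.Random(0)  # fixed seed so the toy run is repeatable
discovery = make_cohort(n_per_class=30, n_proteins=20, shift=3.0, rng=rng)
validation = make_cohort(n_per_class=30, n_proteins=20, shift=3.0, rng=rng)

centroids = fit_centroids(discovery)

# Each validation sample is classified on its own, with no access to the
# rest of the validation cohort, mimicking a single sample arriving at 3am.
correct = sum(classify(values, centroids) == label for values, label in validation)
accuracy = correct / len(validation)
print(f"per-sample accuracy on the held-out cohort: {accuracy:.2f}")
```

The toy data are far cleaner than real plasma proteomics (no batch effects, no missing values, no instrument drift), so the only thing to take from this is the shape of the experiment, not the number it prints.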

I appreciate the transparency of the authors; the conclusions almost read like a "limitations" section. The rest of the paper reads like someone was sending a secret code to Olga Vitek that only she would be able to decipher. If that was really what this was, Nature page fees may be the absolute most expensive way to do it...

Here is the thing, though: it didn't outperform the traditional human approach when the experiment is done really well (the example data they used are superb, probably outliers), but it did reasonably well, and that's still a huge deal.

And everything to reproduce it yourself is reasonably well annotated in these notebooks.
