Friday, April 11, 2025

Is there anything we CAN'T do proteomics on today?

 


Another self-serving blog post but I'm so pumped to see this out! 


About 2 years ago I was asked to review a paper for some journal and my peer review was so positive that the editor asked me if I wanted to write a commentary on it. 

I wrote the most over-the-top commentary about the paper, which was on MANGOS - the fruit. That group somehow did great quantitative proteomics - and glycoproteomics - on mangos (the fruit) sitting on a shelf to figure out why they change when they ripen.

Why was I so impressed? Because the available protein sequences for the mango suck. And - it's this thing, right? We can only do proteomics on organisms that we have really high quality curated genome sequences for! Right? 

Okay - so then, concurrent story - I had to kill somewhere around 70 very venomous black widow spiders that took up residence on my property. It was not cool - at all. 

Like this grumpy asshole


and these violent little jerks. The one above was so grumpy about my murder plans. 


and this asshole -


and this one 

I could keep going. Searching "spider" in my iCloud thing will keep you very busy for a while.

Okay - so the big ones are not only super venomous but they are creepy and they'll do this "I'll roll up so you can't see me - oh, that didn't work because I'm black and RED? Time to run right at you." Particularly if they're protecting an egg sac or a bunch of babies that just came out of an egg sac. BTW, they hatch in groups of 50 or so and then they play a little game of Survivor so only the meanest ones get out. They're awful. 

Tarsh Shah was working on his Master's at Hopkins and was looking for a cool thesis project. He dropped by my lab right after I'd lost an amazing PhD student to her graduation and I had a free lab bench. I had a couple ideas for projects and this was one (a drug analysis project is also on my desktop somewhere). 

Colten, Hannah and Ahmed trained Tarsh on proteomics prep and data analysis, and Ben Neely - in my opinion the world's #1 expert on doing proteomics on under-studied organisms - provided invaluable support and advice for working with organisms with almost no annotated protein sequences. If you want to do a study on something you can't download a good UniProt protein database for - 100% start with Ben Neely's blog.

This wasn't a funded project, so S-Traps and EvoTips were donated or scrounged, and instrument runs were performed on weekends when the instruments weren't in use on real projects. 

Concurrent with this work - the somewhat closely related Western Black Widow Spider had a genome assembly pulled together. With even more help from Neely and tools on his GitHub (thank you!) we followed his instructions and were able to make this into a FASTA file as well.

To be clear - UniProt had 140 proteins from any black widow spider when we started out on this. There are 53,000 spider species. Scrounging RefSeq got us 14 - and when we entered all these sequences into Spectronaut for it to generate spectral libraries for DIA analysis - I sorta expected 400 proteins and a student who now knew how to do S-Traps, load EvoTips, and evaluate a mirror plot of mass spectra.
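For anyone wanting to try the same kind of scrounging, here is a minimal Python sketch of pooling sequences from a couple of sources into one non-redundant FASTA before handing it to a search tool. It is an illustration only, not the pipeline used for this project (Ben Neely's tools and instructions did the real work). The function names and the toy sequences are made up for the example.

```python
from typing import Dict, Iterable, Iterator, List, Tuple


def parse_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header = None
    chunks: List[str] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes out the previous record, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def merge_fasta(sources: Iterable[str]) -> List[Tuple[str, str]]:
    """Pool entries from several FASTA texts, keeping the first
    header seen for each unique sequence."""
    first_header: Dict[str, str] = {}
    for text in sources:
        for header, seq in parse_fasta(text):
            # Normalize case and drop a trailing stop-codon asterisk.
            seq = seq.upper().rstrip("*")
            if seq and seq not in first_header:
                first_header[seq] = header
    return [(header, seq) for seq, header in first_header.items()]


def to_fasta(records: Iterable[Tuple[str, str]], width: int = 60) -> str:
    """Write records back out as FASTA text with wrapped sequence lines."""
    lines: List[str] = []
    for header, seq in records:
        lines.append(">" + header)
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"


# Toy inputs (hypothetical headers and sequences, not real spider proteins).
uniprot_like = ">toyA example protein\nMKTLLVA\nGGH\n>toyB another protein\nMSSPQR\n"
refseq_like = ">toyC same sequence as toyA\nmktllvaggh*\n>toyD new protein\nMADEEK\n"

merged = merge_fasta([uniprot_like, refseq_like])
print(len(merged), "unique sequences")
print(to_fasta(merged))
```

Deduplicating on the exact sequence is the simplest choice here. It will not collapse near-identical isoforms or fragments, which a real workflow might want to handle separately.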

Regardless of what spider FASTA we used we could get over 2,000 protein groups! WTF, right? Not that long ago I was impressed when I got 2,000 proteins from human materials. When we used the Western Black Widow FASTA we got 5,500 or so! 

Now, you can totally just get a bunch of proteins, but how do you know the deep learning neural network things didn't make them up? We started hunting for proteins that made sense - like, what protein should be in a spider head? No idea. But can we find one, and do we only find it in the head? Yes. Then Tarsh, following his interests in pharmacology and drug function (and possibly some ideas about how we might learn from these toxins), found some cool papers about how black widow spider toxins work, including a recent study showing that small spiders only produce toxins for sedating insects, but big mean spiders produce different toxins for murdering amphibians and different ones for murdering mammals. So we looked for those. And - boom - that little spider I was able to capture intact had almost no toxin expression. One of the big mean ones? Toxin proteins EVERYWHERE. Proof of concept, right? That's the figure at the top. Red is high and blue is low. What's fun is that if we filter on toxins, we see those same trends regardless of which spider FASTA we used. 
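The "filter on toxins and compare spiders" sanity check can be sketched in a few lines of Python. This is a rough illustration under assumptions, not the analysis behind the figure: the column names, the keyword match on the protein description, and every intensity value below are invented for the example.

```python
from typing import Dict, List, Sequence

Row = Dict[str, object]


def filter_toxins(rows: List[Row], keywords: Sequence[str] = ("toxin",)) -> List[Row]:
    """Keep protein groups whose description mentions any keyword."""
    return [
        r for r in rows
        if any(k in str(r["description"]).lower() for k in keywords)
    ]


def total_intensity(rows: List[Row], sample: str) -> float:
    """Sum the quantified intensity for one sample column."""
    return sum(float(r[sample]) for r in rows)


# Toy protein-group table (hypothetical values, NOT results from the study).
rows: List[Row] = [
    {"protein": "toy1", "description": "alpha-latrotoxin (toy entry)",
     "small_spider": 0.0, "large_spider": 900.0},
    {"protein": "toy2", "description": "latroinsectotoxin (toy entry)",
     "small_spider": 50.0, "large_spider": 400.0},
    {"protein": "toy3", "description": "actin (toy entry)",
     "small_spider": 1000.0, "large_spider": 1100.0},
]

toxins = filter_toxins(rows)
for sample in ("small_spider", "large_spider"):
    print(sample, total_intensity(toxins, sample))
```

Running the same filter on the output from each FASTA and checking that the small-versus-large pattern holds is one simple way to see whether a trend depends on the database choice.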

Ultimately, this started with something bad that happened that I might have nightmares about for the rest of my life. But we hope it ended as something inspirational - like maybe we can do good proteomics on just about anything, even if that organism doesn't have a beautiful FASTA library on the easiest-to-access websites? 
