Thursday, August 28, 2025

SomaModules - Pathway Analysis for Scan Output! (And publicly available SomaScan data!)

 


This paper took me some time to dig through, with the majority of the time focused on PUBLICLY AVAILABLE SOMASCAN DATA!




What's it about? It's about how to change the output data from SomaScan into something useful. In this case the authors walk you through a large pile of separate R scripts that will take you from your SOMAmer output and get you to a gene identifier for each one....cause...obviously....that's not what you get from the vendor....

Here is what you'll need to take these output data and get to pathway analysis....they all won't fit on the screen....


Now....you absolutely could have all these separate scripts in a single notebook, but it is certainly funnier this way. Transform your silly data format - get this output - put it into this next script - run that - put it into this next one. 
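If it helps to picture the core step (aptamer ID in, gene identifier out), here is a minimal sketch in Python. To be clear, this is not the authors' R code. The annotation table, the `seq.*` IDs, and the choice to take the median when two aptamers hit the same gene are all my assumptions for illustration.

```python
from collections import defaultdict
from statistics import median

# Hypothetical annotation table: aptamer ID -> gene symbol.
# These IDs and mappings are made up for illustration only.
ANNOTATION = {
    "seq.0001.1": "ALB",
    "seq.0002.7": "ALB",   # more than one aptamer can target the same protein
    "seq.0003.2": "KRAS",
}

def collapse_to_genes(measurements, annotation):
    """Group aptamer-level values by gene symbol and take the median per gene.

    Returns (gene_level_values, list_of_aptamer_ids_with_no_annotation).
    """
    by_gene = defaultdict(list)
    unmapped = []
    for seq_id, value in measurements.items():
        gene = annotation.get(seq_id)
        if gene is None:
            unmapped.append(seq_id)
        else:
            by_gene[gene].append(value)
    gene_level = {gene: median(values) for gene, values in by_gene.items()}
    return gene_level, unmapped

# One made-up sample: aptamer ID -> signal
sample = {
    "seq.0001.1": 1200.0,
    "seq.0002.7": 1000.0,
    "seq.0003.2": 310.5,
    "seq.9999.9": 42.0,    # not in the annotation table
}
gene_level, unmapped = collapse_to_genes(sample, ANNOTATION)
print(gene_level)
print(unmapped)
```

Once you have gene-level values like this, they can go into whatever pathway tool you like. The real scripts obviously handle a lot more than this toy does.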

This is a great group and I've got - easily - 500 - Thermo .RAW files from them sitting around on hard drives here from various studies on the Baltimore Longitudinal Study of Human Aging. 

It would have to be a great group, because they can get usable information out of this SomaLogic data! 

And by THIS SomaScan data - I mean - THERE IS DATA HERE IN VARIOUS FORMS! 

And... it's ...revealing.... I think the paper references it as the 11k data, but I count like 7,400 unique protein targets. So I might have that mixed up, it might all be the 7k aptamer data. 

Just like you've heard, everything appears to always be detected, and it appears to always be detected above the blanks. And if that is your goal, stop reading here and have fun!  You've detected proteins! Probably! 

If you are interested in quantitative differences between condition A and condition B, these aptamers do not appear to magically exceed the physical limits that have long been established for assessing quantitative differences of aptamer - protein binding and release. Nor does complexity of the background and the aptamers present increase the maximum dynamic range in protein quantification ever recorded for an aptamer. Shockingly, it appears to be completely the opposite. 

These data deserve a more thorough analysis. And, as I mentioned, I have hundreds of really good LCMS proteomics files already processed from this study. Most of it is 24 offline fraction TMT data that I happen to think was done really well. I will, however, leave you with the impressions from my first couple of attempts to interrogate these data. 



Tuesday, August 26, 2025

Deconvoluting proteomics homogenates with single cell type insights??


 

This idea has been proposed to me a few times recently and I thought something like "oh, sure, and exactly how TF would you do it in practice??"

Not sure it's the solution, but I sure need to leave it here so I can look at it later! 


Sunday, August 24, 2025

Wednesday, August 13, 2025

Characterizing and engineering PTMs with high throughput systems (not mass spectrometers!)

 


Disclaimers! I've had this open on my desktop since I got back from vacation and I do not get it. We're way way out of my wheelhouse. 

I do think it's worth posting here, though, for people to think about, if only so I can close the tab and reference it here later. 


Ultimately, I don't think it has any real overlap with what most of us do PTM-wise with mass spectrometers. We're looking at how all the acetylation sites (for example) change in appearance and abundance when we put drug A into some cells when compared to control cells. 

However, the interest here has to be in being able to drive the PTM engineering using in vitro systems. The proof of concept leads to the discovery of mutations that drive up the concentration of the products they want (confirmed by targeted LCMS/MS). Still really cool (I think). 

Tuesday, August 12, 2025

DeCrypt KRAS signaling using all the common inhibitors!

 



This one had so many steps to get behind the paywall of that I thought I was trying to find a certain global leader's name thousands of times in files about a certain island. 

It was totally worth it though, because this is GREAT! 


KRAS does so many things and has so much redundancy that you can fill entire buildings full of RAS researchers for decades and they still haven't sorted it all out.

Now there are these awesome little small molecule inhibitors, but you're still looking at a big matrix of effects. Tubingen was like - fuck it, we'll just use them all and do super deep 

Proteomics and

phosphoproteomics and 

ubiquityl/ubiquitin/Ubbydooobydoo omics

On cell lines where each particular drug works. Plus 

dose responses! 
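For anyone who hasn't stared at dose-response proteomics before: the idea is that every protein or PTM site gets its own curve across the drug concentrations. A common way to summarize such a curve is a four-parameter logistic. Here is a minimal Python sketch of that, assuming that general form; I have not confirmed this is the exact model used in the paper, and the grid-search "fit" and the numbers are purely illustrative.

```python
def four_param_logistic(dose, top, bottom, ec50, slope):
    """Response at a given dose for a four-parameter log-logistic curve.

    At dose == ec50 the response sits halfway between top and bottom.
    """
    return bottom + (top - bottom) / (1.0 + (dose / ec50) ** slope)

def fit_ec50(doses, responses, top=1.0, bottom=0.0, slope=1.0):
    """Crude grid search for EC50 with the other three parameters held fixed.

    A real analysis would fit all four parameters with a proper optimizer;
    this only illustrates the idea.
    """
    # Candidate EC50 values from 0.001 to 10000, ten steps per decade
    candidates = [10 ** (exp / 10.0) for exp in range(-30, 41)]

    def sum_squared_error(ec50):
        return sum(
            (four_param_logistic(d, top, bottom, ec50, slope) - r) ** 2
            for d, r in zip(doses, responses)
        )

    return min(candidates, key=sum_squared_error)

# Made-up example: a phosphosite that drops with increasing drug (nM),
# generated from a curve with a true EC50 of 100 nM
doses = [1, 10, 100, 1000, 10000]
responses = [four_param_logistic(d, 1.0, 0.0, 100.0, 1.0) for d in doses]
best_ec50 = fit_ec50(doses, responses)
print(best_ec50)
```

The payoff of doing it this way is that you get a potency estimate per site instead of a single treated-vs-control ratio.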

Geez. Science Signaling is a pain in the butt. Even trying to get the PDF makes you go to a weird PDF reader and then you have to download it from there? I bet the impact factor is lousy....

Got it so I can read it! 

TMT11-plex was used for most experiments, suggesting they may have started this study before the 16/18-plex came out, or that the Decrypt downstream stuff is just easier when you maintain the exact same workflow - which I'm 100.00% fine with. 

Oh shit - they also did Cysteine profiling! Now....that's cool. I know someone who is convinced these drugs should be having off-target effects. We did some runs to see - didn't - but I didn't have the bandwidth to do Cysteine enrichments.

Now...if I do have a criticism here of this beautiful work, it's that I'm not super stoked that some of the experiments were done with SPS MS3 while others were high resolution MS/MS. I'm probably going to download the high resolution runs to look for things I'm personally interested in and I'm probably not going to download the ones with ion trap MS/MS. 

I'm just a resolution and accuracy snob, particularly when hunting PTMs. 

Don't let that -at all- detract from what a fantastic and useful piece of science this thing is. It's just a minor comment based on how I will personally reuse these data - that you can get at ProteomeXchange here! https://proteomecentral.proteomexchange.org/cgi/GetDataset?ID=PXD063604

Of course, these data are probably already fully integrated into Decrypt so you don't have to download 300GB of data to dig through yourself. 

However - the last big Decrypt paper that dropped featured some drugs that we've done single cell proteomics on and that data let me vastly improve my analysis of that drug. And - here are 2 more that we've worked on! 

Monday, August 11, 2025

SynchoSep-MS - Another top lab utilizing amazingly complicated HPLC!

 


Ummmmm.....okay.....so when I saw the first preprint with a similar concept, I thought something like "that's a weird idea that probably looks okay on paper but will absolutely never ever work in practice".

And....here is a bunch of top notch scientists like Dan frickin' Polasky, himself, demonstrating a very similar concept? 


Between the 2 studies there are 15 or so authors and I'm pretty sure that means 22 of them are smarter than me - so... 


...wait. That's upside down. 

Do we revisit this idea? It might be like the single ion mass spectrometry thing that I really couldn't figure out, that - to this day - still looks to me like you're just doing scan averaging of incredibly poor signal, but I have to accept that I'm not smart enough to understand why it's something else.

And that's totally fine as long as the science is good! 

Okay - so revisit the idea? Here you inject 2 samples on an LCMS system but each one goes to a different nanoLC column so that peptide A for sample 1 and sample 2 come out at different times? 


If I'm anywhere close to getting this - you're operating under the assumption that your MS system is no longer as limiting in the number of ions/peptides/proteins that it can detect as your chromatography is....

Which is very confusing after a decade and a half (geez...longer...? wtf...?) of having way way way more peptides eluting than you can possibly detect - here is one blog post on the absolutely amazing paper that was the first one I was aware of that actually counted this discrepancy.

If that is true, then this sorta makes sense....? The downsides I noted when the similar preprint dropped - that you're depending on two nanoLC columns (I'd rather not depend on one - two sounds like 4 times the trouble) - still seem valid to me. Also, I'm going to add the concern that this is great for my proteins that exist at 20,000 copies or more, but probably really bad for the proteins in sample A or B that are only about 100 copies per cell.... However, if you're really truly desperate to get that throughput up without multiplexing reagents, this appears to be an avenue you could explore. 

Let me know how it goes? I like my robust simple little HPLC system where my weak point (my column or emitter) is a single instance. 

MaSSyPup returns with USB - install nothing - data processing!


 (That's a massive puppy!) 

And this is MaSSyPupX (yes, there were some double-takes at the screen at my weekly meetup of dyslexic and dysgraphic scientists). 


This probably sounds silly to a lot of people, but I have friends and collaborators who can't install anything at all on their PCs themselves. When FragPipe or SpectroNaut rolls out yet another super powerful upgrade they have to call (and sometimes pay) someone to come out and upgrade that PC for them. I have control over my PCs for now - but not the HPC - so I have to be careful about mismatching the software versions on each. 

I haven't checked it out yet, but MaSSyPupX appears to allow you to circumvent this completely by allowing you to have an external drive with all your pre-installed and pre-configured software on it. Plug it up and run. Now....even with a lot of external drives being super fast SSDs, I expect some latency issues due to USB transfer speeds, but I know those are also improving. Even if it isn't ideal, it may be a solution to a serious problem or two that some people out there have. 
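To put a rough number on the USB worry, here is a back-of-the-envelope Python sketch. The throughput figures are my ballpark assumptions for sustained real-world reads, not measurements of MaSSyPupX or of any particular drive, and the 2 GB file size is just a stand-in for a typical raw file.

```python
def transfer_seconds(file_gb, throughput_mb_per_s):
    """Seconds to move a file of file_gb gigabytes at a sustained MB/s rate.

    Uses 1 GB = 1000 MB and ignores seek time, caching, and protocol overhead.
    """
    return file_gb * 1000.0 / throughput_mb_per_s

# Assumed sustained throughputs in MB/s (illustrative guesses only)
ASSUMED_THROUGHPUT = {
    "older USB stick": 40.0,
    "external SSD over a fast USB port": 400.0,
}

raw_file_gb = 2.0  # assumed size of one raw file
for label, rate in ASSUMED_THROUGHPUT.items():
    print(f"{label}: {transfer_seconds(raw_file_gb, rate):.0f} s per file")
```

So for reading one file at a time it is probably the difference between annoying and unnoticeable, depending on the drive and port you end up with.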

Saturday, August 9, 2025

Single cell proteomics defines functional neutrophil subtypes in glioblastoma!

 


Sometimes you just need a vacation from everything. I ambitiously downloaded a couple dozen papers to my Kindle but I have probably only read half of them.

I spent some solid time on this new preprint for obvious reasons, though! 


Anyone doing single cell proteomics (SCP) is probably running up against the same problems I am - like....are you sure this is the right technology for the problem? Are you sure it wouldn't be better to figure out a way to get 25 or 500 cells from that sample that are identical, so that you can get thousands and thousands of proteins without trying very hard? 

Here we see an exploration of both large cell numbers (500 cells) and bulk homogenates and single cells. And here is the real kicker - even when we think we have the perfect marker for getting "identical" cells out of that population. 


The coverage here is stupendous for these simple little cells. They may be getting as much as 50% of the total neutrophil proteome on this Astral and - yet again - we start to see important functional heterogeneity in systems that would be a whole lot easier to study if they'd just all do the exact same thing!