Thursday, August 28, 2025

SomaModules - Pathway Analysis for Scan Output! (And publicly available SomaScan data!)

 


This paper took me some time to dig through, with the majority of the time focused on PUBLICLY AVAILABLE SOMASCAN DATA!




What's it about? It's about how to turn SomaScan output data into something useful. In this case the authors walk you through a large pile of separate R scripts that take you from your SOMAmer output to a gene identifier for each one....'cause...obviously....that's not what you get from the vendor....
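To make that mapping step concrete, here is a minimal Python sketch of the idea. It is not the authors' R code. The aptamer IDs, gene symbols, column names (`SeqId`, `EntrezGeneSymbol`) and the `|` separator for aptamers with more than one gene are all assumptions made up for illustration.

```python
import csv
import io

# Hypothetical annotation table: one row per aptamer, with made-up IDs and
# gene symbols. The column names and the "|" separator are assumptions.
ANNOTATION_CSV = """SeqId,EntrezGeneSymbol
seq-0001,GENE_A
seq-0002,GENE_B
seq-0003,GENE_C|GENE_D
seq-0004,
"""


def load_annotation(text):
    """Return a dict mapping each aptamer ID to a list of gene symbols."""
    mapping = {}
    for row in csv.DictReader(io.StringIO(text)):
        symbols = [s for s in row["EntrezGeneSymbol"].split("|") if s]
        mapping[row["SeqId"]] = symbols
    return mapping


def map_to_genes(measurements, annotation):
    """Regroup aptamer-level values by gene symbol.

    Returns (values_by_gene, unmapped_ids). Aptamers with no gene symbol,
    or missing from the annotation, are reported instead of silently dropped.
    """
    by_gene = {}
    unmapped = []
    for seq_id, value in measurements.items():
        symbols = annotation.get(seq_id)
        if not symbols:
            unmapped.append(seq_id)
            continue
        for symbol in symbols:
            by_gene.setdefault(symbol, []).append(value)
    return by_gene, unmapped


annotation = load_annotation(ANNOTATION_CSV)
# Made-up signal values for one sample, keyed by aptamer ID.
sample = {"seq-0001": 1200.0, "seq-0002": 340.5, "seq-0003": 88.0,
          "seq-0004": 15.0, "seq-9999": 7.0}
by_gene, unmapped = map_to_genes(sample, annotation)
print(by_gene)
print(unmapped)
```

The gene-level dictionary is the kind of thing a pathway tool wants as input. Keeping the unmapped list around matters, because any aptamer without a gene identifier quietly vanishes from the pathway analysis otherwise.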

Here is what you'll need to take these output data and get to pathway analysis....they won't all fit on the screen....


Now....you absolutely could have all these separate scripts in a single notebook, but it is certainly funnier this way. Transform your silly data format - get this output - put it into this next script - run that - put it into this next one. 

This is a great group and I've got - easily - 500 - Thermo .RAW files from them sitting around on hard drives here from various studies on the Baltimore Longitudinal Study of Aging. 

It would have to be a great group, because they can get usable information out of this SomaLogic data! 

And by THIS SomaScan data - I mean - THERE IS DATA HERE IN VARIOUS FORMS! 

And... it's ...revealing.... I think the paper references it as the 11k data, but I count like 7,400 unique protein targets. So I might have that mixed up, it might all be the 7k aptamer data. 
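If you want to check a count like that yourself, the thing to watch is that the number of aptamers and the number of unique protein targets are not the same number. A rough Python sketch, assuming a hypothetical annotation with one UniProt accession string per aptamer and `|` between accessions when an aptamer has more than one (the IDs below are invented):

```python
# Hypothetical (aptamer ID, UniProt accession) rows. All values are made up.
annotation_rows = [
    ("seq-0001", "P00001"),
    ("seq-0002", "P00001"),          # second aptamer against the same protein
    ("seq-0003", "P00002"),
    ("seq-0004", "P00003|P00004"),   # one aptamer annotated with two proteins
    ("seq-0005", "P00002"),
]


def count_targets(rows):
    """Return (number of distinct aptamers, number of distinct protein targets)."""
    aptamers = {seq_id for seq_id, _ in rows}
    targets = set()
    for _, uniprot in rows:
        targets.update(acc for acc in uniprot.split("|") if acc)
    return len(aptamers), len(targets)


n_aptamers, n_targets = count_targets(annotation_rows)
print(f"{n_aptamers} aptamers, {n_targets} unique protein targets")
```

A gap between those two numbers is one possible reason a panel name and a count of unique targets would not line up, though I haven't confirmed that is what is going on here.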

Just like you've heard, everything appears to always be detected, and it appears to always be detected above the blanks. And if that is your goal, stop reading here and have fun!  You've detected proteins! Probably! 
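For what "detected above the blanks" can mean in practice, here is one simple way to check it in Python. This is my own illustrative rule (blank mean plus 3 standard deviations), not the vendor's method or the paper's, and the signal values are invented.

```python
from statistics import mean, stdev


def detection_rate(sample_signals, blank_signals, k=3.0):
    """Fraction of samples above mean(blank) + k * sd(blank) for one aptamer.

    The threshold rule and k=3 are assumptions for illustration only.
    Needs at least two blank values so a standard deviation exists.
    """
    threshold = mean(blank_signals) + k * stdev(blank_signals)
    above = sum(1 for value in sample_signals if value > threshold)
    return above / len(sample_signals), threshold


# Made-up signal values for a single aptamer.
blanks = [100.0, 110.0, 90.0]
samples = [500.0, 120.0, 800.0, 131.0]

rate, threshold = detection_rate(samples, blanks)
print(f"threshold={threshold:.1f}, fraction above blank={rate:.2f}")
```

Passing a check like this only says the signal is above background. It says nothing about whether a difference between two samples is quantitatively meaningful.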

If you are interested in quantitative differences between condition A and condition B, these aptamers do not appear to magically exceed the physical limits that have long been established for assessing quantitative differences in aptamer-protein binding and release. Nor does the complexity of the background, or of the other aptamers present, increase the maximum dynamic range in protein quantification ever recorded for an aptamer. Shockingly, it appears to be completely the opposite. 

These data deserve a more thorough analysis. And, as I mentioned, I have hundreds of really good LCMS proteomics files already processed from this study. Most of it is 24-fraction offline TMT data that I happen to think was done really well. I will, however, leave you with the impressions from my first couple of attempts to interrogate these data. 


