Friday, July 26, 2024

Extract nuclei from fixed tissues - optimized protocols!

More nuclei extraction proteomics! This is for bulk analysis and this team went all out in testing different procedures for the best yield. Super cool stuff! 




"SomaSignal tests: The next step in the evolution of multiplexed proteomics..."

..... (come on Ben, type something nice, you can do it....got one!) The illustrations in this new advertisement review for aptamer-based proteomics are very nice!


...and....some peer reviewers were fine with letting this thing through, so there must be merit in it somewhere. 

Unfortunately, I was skimming it and saw section headings like this and didn't want to throw up on my keyboard, so I'll leave it here for others to find and read. 


Every author on this review, btw, appears to be an employee of SomaLogic. 

Thursday, July 25, 2024

Reminder that misleading mouse studies waste medical resources


I'm great, Mr. Mouse, thank you for asking. So rather than share any annoyance about what I read over my lunch, we're going to remind people about these 10-year-old findings! 

Look, there are places for animal model data. There really are. Core metabolism is just like ours, with few changes. However, if you're using mice as a neurological model and testing something that impacts higher-level functioning in humans - using mice, which absolutely do not have the pathways in question - you're just being silly. I linked the editorial, because no one has to read much past the fact that zero ALS drugs that worked in mice had any level of efficacy in humans.

Wednesday, July 24, 2024

Rapid one-pot workflow for phosphoproteomics prep (and DDM coating procedure)!

One issue with phosphoproteomics enrichment methods is the large required sample input. There are other issues, such as the fact that the modification is intrinsically transient by evolutionary design, but this group totally tackled the first one. 


I figured that anything sounding this complicated would require me to wait until a commercial kit was available, but the setup sounds surprisingly (shockingly?) simple. The paper is open access, so you can get the parts list yourself if you are interested.

Another reason to check out this paper is that they coat their low-bind plastic in n-Dodecyl-beta-maltoside (DDM) and show that it helps a lot. If you're interested in reducing your surface losses but also worried about the longitudinal effects of a recently described detergent blasting into your mass spectrometer, here is a procedure to coat the plastic and then discard the active solution! 

Tuesday, July 23, 2024

41,000 (forty-one THOUSAND) human plasma proteomics samples!

41,000 plasma proteomic samples?

Clinical data linking 218 different diseases across these people? 

Yes, this was Olink Explore, so just under 3,000 proteins probed (which, reminder, does NOT mean measured; it means they were theoretically detectable because there are probes for those proteins).

This is still exciting.

LESS exciting.....

Monday, July 22, 2024

Single NUCLEI proteomics (and identification of cancer mutations in single cells!)

I'll start with the figure above because it's early in the paper. Super exciting to me because of my personal research interests. These peptides are hard to identify in bulk. This group is doing them in single human cells.

How? They're building libraries, the old-fashioned way, for the software that we like to run library-free. They're capturing the specifics of their instrument as well as the unique characteristics of how peptides at super low intensity/concentration tend to behave. So this is all label free proteomics by data independent acquisition, used to resolve both human mutations AND PTMs in single human cells. Wow, right? 

THEN this group broke through to the next level of what single cell proteomics (SCP) needs to do. THEY ANALYZED SINGLE NUCLEI. For context, if you look at what a large percentage of "single cell seq / scSeq" studies are actually analyzing, it is nuclei harvested from fixed tissues. The nucleus is pretty tough and it can often be separated from materials where the rest of the cell won't be recoverable.

Things like fixed tissue. And we have LOTS of that.

They only analyzed around 100 single nuclei here, but I honestly thought single nuclei were 5 years away. When I first saw these results around the end of 2023, I couldn't believe it. It should be noted that we've recently seen a new preprint that did a few thousand single nuclei, so I was waaaay off in my estimates. Super super cool new paper. 

Sunday, July 21, 2024

Struggling with protein acylation? You should try Acyl(S-) Trapping!

If you are a rational human being and you want to stay one, you should probably just forget that proteins reversibly acylate all the time. Most notably, acylations tend to occur to drive intracellular spatial stuff. Like your protein gets a terrible acylation on it that helps it migrate to the membrane where it now has activity while it's tethered there. 

Typically, to measure these awful things you start with a huge input, then do some enrichment and cleave off your enrichment tag or something. Other times you overexpress your protein in a system where the acylation is forced to occur but has no biological function. Force that KRAS to go to the E. coli membrane, expressed at completely non-physiological concentrations, because that will help you characterize those mods while learning absolutely nothing else about what that system does. 

OR you can, and should, do this.


This clever use of both a modified suspension trap and of what/when/how you add isobaric tags allowed this group to characterize protein acylation in a complex system starting from as little as 20 ug of protein. It's a super cool new approach to get at very relevant protein modifications that are very, very tough to do otherwise. 100% recommended. 

Saturday, July 20, 2024

ARE WE THERE YET?!? What does single cell proteomics need to do next?

Just leaving this here before I run out the door! Great insight (perspective, even?) from one of our most forward-thinking protein informatics groups. Cool that we can do this stuff, but is it ready to help people yet? Why not? Totally worth a read. (Open access)



Wednesday, July 17, 2024

Proteomic analysis of a super promising new active RAS inhibiting drug!


Y'all, KRAS small molecule drugs are sort of my jam and I have exactly zero guesses as to how this drug could possibly work. My calendar is packed today, though, and I'm going to drop this here without looking it up.

Here is the idea, though: KRAS by itself is generally not bad. The problem is that the active sites get mutated and then the stupid new version of the protein stays active all the time. As KRAS and its cousins NRAS and HRAS sit on top of pro-proliferation pathways, having them active all the time is a terrible idea.

This new drug, currently RMC-7977 (the name generally changes if it has success in human trials; you can guess by the designation that it's probably not there yet, so a healthy grain of skepticism should be involved here for this exact molecule), appears to only inhibit the active protein. 

Imagine a KRASG12C mutation where the mutant protein is very actively holding onto GTP, so it is activating those proliferation pathways. You've got too much active stuff around. This drug doesn't block the GTP site; it blocks the activity of the active, GTP-bound protein. 

What's surprising is that a G12D-mutated pocket has a very different shape from a G12V or G12R one, etc., but this thing doesn't care! 

What's funny is that they did some awesome proteomics and it didn't really make the paper: TMT proteomics and phosphoproteomics on an Exploris 480 using the TurboTMT mode. 

If you're as interested as I am, the files are here. There appears to be a second repository, but there is a typo in the paper so I can't find it without some digging. 

Really really cool stuff. And some of the early small molecule RAS inhibitors aren't really doing well in the clinic at all right now, despite being approved for use. However, the molecules based on those are rapidly evolving and each one is better than the last. Even if this one doesn't go forward, the fact that you can inhibit a bunch of deleterious mutations with a single drug is a super promising development!! 



Tuesday, July 16, 2024

Save the date! ABRF 2025 - Las Vegas March 23-26!

Today we'll talk about why you want the coolest person in your field heading your academic associations. 

I'm of course talking about -- 


THE Dr. Sue Weintraub - who may have had nothing at all to do with the fact that the Association of Biomolecular Resource Facilities is meeting in Las Vegas in 2025, but I'm going to pretend that I have inside knowledge that it was entirely her idea. 

Thanks Sue! You can find out more and register here! 

I haven't been to an ABRF meeting in a few years - I did a talk on single cell proteomics in 2022, maybe - because I haven't been in a core facility in a few years. I'm a huge fan of the organization and the conference. 

If you are an early career researcher (ECR) and you are working on building up your credibility and exposure, ABRF can be an amazing thing to be part of. Look, we already know who is going to be headlining ASMS next year, right? Same dudes that founded the whole thing in the 1960s. ABRF heavily promotes ECRs because it is great for everyone. Core labs sometimes don't get to do a lot of method development because they're too busy applying established methods for the hundreds of customers necessary to keep their lights on. You come in from your academic lab with the techniques you developed and you get a chance to present those new methods to the lab pros, who may have a very different angle than you've considered. Someone with 20 years of applying proteomics techniques can be an unbelievable untapped resource for new insights, and ABRF is where you get to hang out with them.

I've never gambled in my life but I've gotten to climb in the Red Rock Canyon just outside of Vegas.... that might still be my profile picture for this blog.....

Obviously you go for the science, but it never hurts when there is fun stuff to do before it starts or a day or two after all the science has wrapped up.



Monday, July 15, 2024

Using a carrier proteome approach to "amplify" pericardial fluid proteins!

The importance of this new paper can be better appreciated after some quick Google searches. Something like "what is pericardial fluid and how do you extract it" is a good start.


You use imaging-based approaches to guide a microbiopsy needle into SOMEONE'S BEATING HEART and you remove some fluid. There is NEITHER A LOT OF SPARE FLUID, NOR DO YOU WANT TO TAKE A LOT OF EXTRA SAMPLE OUT. 

You are - quite hopefully - very sample limited here. So what this group did was TMT label the peptides from a pool of fluids to use as their "carrier", and then TMT label the individual fluids. We know that too much carrier messes with quan, so they carefully optimized the carrier level to get a lot of IDs without compromising their quantification accuracy by too much. 
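Just to build some intuition for why they bothered optimizing, here's some toy arithmetic - my numbers and assumptions, not the paper's. The instrument only samples so many ions per MS2 scan, so the bigger the carrier, the fewer ions (and the more counting noise) left for each individual-sample channel:

```python
# Toy math (my assumptions, NOT this paper's numbers) for why carrier level
# matters: with a roughly fixed ion budget per MS2 scan, a bigger carrier
# channel leaves fewer ions - and more counting noise - per sample channel.
ions_per_scan = 50_000      # assumed ion budget per MS2 scan
n_samples = 8               # individual TMT channels alongside the carrier

for carrier_ratio in (25, 100, 400):
    total_units = carrier_ratio + n_samples       # relative material in the tube
    sample_ions = ions_per_scan / total_units     # ions left per sample channel
    cv = 100 / sample_ions ** 0.5                 # Poisson counting noise, in %
    print(f"{carrier_ratio:3d}x carrier -> ~{sample_ions:5.0f} ions/channel, ~{cv:.1f}% counting CV")
```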

I am required by a new blog rule to point out (thank you, reviewer #1, for a paper just accepted today (wooo!)) that using a labeled sample to increase your odds of detecting peptides in a body fluid is the premise behind the TMTCalibrator, which pre-dates the SCoPE-MS preprint by about a year. 

As you might be able to tell from the aforementioned "googling", I am no expert in pericardial fluids, and I didn't have time to become one on possibly the hottest single day I've ever personally experienced (heat index of 116F? On an isolated wooded mountain in Pennsyltucky? Poor day to mow my grass. I might have broken my brain). But these people are experts, and they seem very happy with the biological applications of their large relative increase in numbers of identifications. 

I'm happy for them, because this is a cool approach and - if I were going to have pericardial fluid removed for some nerds to do proteomics - I'd want them to take as little as possible.

Saturday, July 13, 2024

Equi-CP - Semitargeted quantification of drug treated single cells (and other things!)

I really like this protocol because it is so very straightforward (after the sample prep...ugh...), and we took a swing at something similar a while back and had absolutely no luck whatsoever (on a different instrument platform). Here is a very clear, well-written protocol that shows you how to do it. 


It should be open since I don't have an affiliation right this second and I can read it. 

Here is the idea, though. When you use TMTCalibrator/SCoPE-MS/iBASIL/BOOST, you generally start with a whole proteome that you label with one isobaric tag. Then you mix that with isobaric-tagged, extremely low level or single cell samples, right?

In Equi-CP ("Equine Clostridium perfringens"? To be fair, I didn't read the part where I was supposed to learn where the name came from), you take synthetic peptides and label those with one channel, clean them up, and then spike those into your labeled single cells.

Then this group just does plain old DDA on a nice previous-generation Orbitrap. It goes along doing MS1 scans until it hits your synthetic peptides, fragments those, and - boom - you have single cell data. I also like how they pseudo-randomize their samples with the MANTIS. Had we seen this earlier, Dr. Colten Eberhard's spring spent working out how to pseudo-randomize 4,000 mouse brain cells from 6 different mice might have been a little less stressful, because they ultimately came to very similar methods. (Though Colten had a 6x matrix, not a 2x, and the guy with career-defining dysgraphia/dyslexia typing this post was like "omg, there is no way in this world I can help you at all" - something he clearly knew after 4 years.) 
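Whether the instrument finds your spiked peptides on intensity alone or you nudge it with an inclusion list, you'll want their m/z values somewhere along the line, if only to check they were picked. Here's a hypothetical sketch of that calculation - the sequences are placeholders, not the paper's peptides, and I'm ignoring the isobaric tag mass for simplicity:

```python
# Hypothetical sketch: precursor m/z values for spiked synthetic peptides
# (placeholder sequences; isobaric tag mass ignored for simplicity).
MONO = {"G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
        "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
        "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
        "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
        "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931}
WATER, PROTON = 18.010565, 1.007276

def precursor_mz(sequence: str, charge: int) -> float:
    # neutral monoisotopic peptide mass = summed residue masses + one water
    mass = sum(MONO[aa] for aa in sequence) + WATER
    return (mass + charge * PROTON) / charge

for pep in ("LVNELTEFAK", "YLYEIAR"):   # placeholder peptides
    for z in (2, 3):
        print(f"{pep}  {z}+  m/z {precursor_mz(pep, z):.4f}")
```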

Super cool, though, right? What we tried to do was take all the peptides from the RAS pathway kits from Thermo and then label those with one channel. Honestly, I still don't know if we were just way too ambitious / outside of our dynamic range or if the cells just didn't tag well in that batch. Either way, when I do it again, I'm using this protocol verbatim. 

Friday, July 12, 2024

More single cell histone PTMs (epiproteomics!) label free, targeted WITH derivatization!

I've whined about this before, but I went to an epigenetics meeting last year, had a lightning talk (and an awesomely lucky poster location by the coffee), and pitched single cell histone PTMs ("come talk to me by the coffee"). 

No one did. 

For real, every single single cell proteomics dataset I've analyzed (and I've looked at just about everyone's RAW data - except the people who don't make theirs public even after I request it - and you know who you are and you suck) - has histone PTMs in it. 

If you really want to do histone PTMs right, though, you need to do something about all those basic residues. The gold standard is propionylation, where you derivatize every lysine that is unmodified and then you digest. If you've never done it, it sucks. You can't buy the reagents ready to use. You have to buy some reagents, create the active (reactive) form, and then add it to your sample before it expires. Not fun, and I didn't think - for even a single second - that the reaction could be controlled well enough to do it for single cells.

I was wrong. 


This group did some super high precision work on the CellenOne and not only isolated single cells well and lysed them in tiny volumes, but they also dumped that live reactant into the single cell lysates, derivatizing their lysines from single cells. When I said that I see histone PTMs in every single cell data set, it's not hundreds of them. It's the easy ones. K27/K28 (depending on whether you count the M, which UniProt does) and K80 are friendly and tryptic. Unfortunately, there is a lot of homology across the histones, so while finding a modification site is generally easy at the peptide level, with tryptic digestion it is tough to figure out which histone the peptide came from. If you propionylate, you get longer sequences that can help break those populations up and assign them to the correct original proteoform.

Thursday, July 11, 2024

nSWATH - more success with small fast DIA windows!

Now that we've got multiple instruments breaking the 100 scans/second limit, we're starting to see some promising old ideas coming back. One is data independent acquisition (DIA, or in this case, the trademarked "SWATH") running with isolation widths we'd normally see for data DEpendent acquisition (DDA). To get enough scans across the peak, this group compromised a little on their mass range limit, but they see improvements in peptides/proteins and %CVs in a two-proteome digest standard. 
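The back-of-the-envelope math here is simple enough to sketch out (example numbers are mine, not the paper's settings): narrower windows mean more MS2 scans per cycle, so you either need a faster instrument or a smaller mass range to keep enough points across each chromatographic peak.

```python
# Rough DIA cycle time arithmetic (example numbers, not the paper's settings).
def points_per_peak(mz_range, window_width, scan_rate_hz, peak_width_s, ms1_time_s=0.1):
    n_windows = mz_range / window_width             # MS2 scans per DIA cycle
    cycle_time = n_windows / scan_rate_hz + ms1_time_s
    return peak_width_s / cycle_time

# classic ~25 Th SWATH windows vs. DDA-like 4 Th windows over a 400 m/z range
for width in (25, 4):
    pts = points_per_peak(mz_range=400, window_width=width,
                          scan_rate_hz=100, peak_width_s=6)
    print(f"{width} Th windows -> {pts:.1f} points across a 6 s peak")
```

Shrinking the mass range claws some of those points back, which is exactly the compromise described above.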



Wednesday, July 10, 2024

Mapping the human hematopoietic stem and progenitor cell hierarchy with single cell proteomics!



Holy cow. I think it might be time to stop saying things like "the emerging field of single cell proteomics". I think it's emerged and is now the tasty form of the cicada (where are those things, btw? I thought the US was going to be buried in billions of them this summer...I've seen nothing but the evil spotted lanternfly).



Three things stand out about this one. The first is that this is just a really smart model to apply the technology to, right? Take the stem cells and force them to differentiate out. The single cell proteomics (SCP) method in question appears to be Reticle, which makes sense for 2,500 single cells. 

The second is - wow - the integration of these data is just top notch. I've tried doing this as well, and people have let me publish the results - to see it done THIS well is just fantastic. 

Third, and most important, is that they actually learned something from this IN VIVO analysis of these cells. They took bone marrow cells from 6 patients (!!) and cleverly sorted their control population (stem cells/progenitors? not my field, just trying to follow along) as well as randomly sorted cells that have a differentiation marker on the cell surface. (FACS machine: just take anything within this great big gate and randomly deposit it; I don't care what size the cells are or how they autofluoresce, just catch them arbitrarily and drop them in wells.)

Really, really super cool study and something that should get any immunologist interested in what you can do with our technology. Though.....I'd personally be quick to warn them that integrating the data this well might take a team like this one, which has been steadily improving study after study over the last 4+ years. 

Monday, July 8, 2024

Proteomics discovers first candidate diagnostic markers for disease in seals!

I take a couple of weeks off to go on a bunch of interviews and someone graffitied the front of my favorite journal? 

It turns out that seals have been increasingly suffering from a gross disease called domoic acid toxicosis (link) and there are no diagnostics. Currently it sounds like you've got a sick-acting seal or a dead seal and someone guesses. 

This group fixed that with proteomics of their CSF (cerebrospinal fluid)!

They identify a list of proteins differentially abundant in seals that are suffering from symptoms, and an additional list of proteins that can determine which ones have chronic DAT. 

BOOM. Proteomics steps into a disease I bet almost no one has heard of and comes up with diagnostics. I imagine that it isn't entirely without challenges....like how do you get CSF out of a seal....but it's still a promising use of the magical magnets in our labs. 


Friday, June 21, 2024

Could you use proteomics/mass spectrometry help? Remote or on-site?

It's a wrap, y'all. The TIMSTOF SCP is crated up and leaving today. The SCIEX ZenoTOF crate is in the hallway. The Agilent QQQ is already gone. I've closed the Johns Hopkins Metabolomics core. Hopkins isn't interested in having me as a tenure track faculty member. I probably could stick around as a research associate, but it is a dead-end job. You are on 1-year contracts at ....more.... than a postdoc salary, and you can only have students by paying for someone else's students that you mentor. You can hire technicians, but they'll stick around just long enough to learn enough to go get 2x the salary at one of the companies in the area. You have to write short term pilot grants because no one is going to give you a 3 or 5 year award when you are on 1-year contracts. "Lack of institutional support" should be my middle name. So....

I am currently seeking work. I verbally agreed to a tenure track position months ago, but it is unclear when/if that will actually start. If the paperwork drags out much longer I'll lose the last year of my R01, and at that point I think I'll accept that I'll never have "professor" anywhere in my job title. When I turned down a tenure track offer in 2014-2015(?) to stay in my home town, it never occurred to me for a second that it might be the only shot I'd ever get. 

Right now I'm planning to take an unpaid sabbatical for a while to write the 20-ish papers that are open on my desktop. 8? 8 would be stellar. I used to do a solid amount of industry consulting, so I've got a DUNS number, etc. However, I can only write for about 4 hours/day, and it doesn't pay very well (in fact, it pays $0, and if the papers get accepted I'll have to figure out how to...pay....for them.....cart before the horse, though).  

Are you looking for some mass spectrometry/proteomics help? Here are some ideas I have that I think could be fun and helpful to people, but I'm open to ideas for other consulting-type roles:

1) Did you get one of those giant TIMSTOF things after running an Orbitrap since 2005? Could you use help getting over the hump so it's making the crazy data that everyone says it can? I can help with that. Early adopter. 

2) Want to jump into single cell proteomics? Did you already and then just discover it's actually really super hard? I've been doing this miserable stuff for years. My really good students can generate great data. Not as good as my stuff, and zero insult to them, it just takes time to get it right. 

3) Did you spend a fortune generating proteomics or multi-omics data and need it to talk somehow? 

4) Are you really great at proteomics but need to do some intact protein or metabolomics or glycoproteomics and could use a hand? 

5) Could you use help marketing your proteomics technology, technique, or reagent, or just advice on the business side of proteomics? (This blog is only for sale after I've completely burned through my savings, though I might test putting ads up before I get too desperate.) 

Those are just some ideas. I'm open to remote or on-site support. If you are interested, my personal email address is LCMSmethods@gmail.com.

Thursday, June 20, 2024

pQTL/GWAS studies by LCMS proteomics identify loads of peptide level variants!

Quantitative 

Trait 

Loci 

or QTLs are a great excuse for doing some pretty low confidence and low accuracy measurements. In genomics these are done all the time with SNP arrays that can more or less sorta quantify a couple hundred things per sample. 

Here is the trick, though: if you get enough samples you can start to see the patterns in that lousy data without doing good genomics or gene product measurements on lots and lots of people (which is still hard, I don't care what refrigerated room of supercomputers you have). 
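A toy simulation makes the principle concrete - every number below is invented for illustration, not from any real QTL study. An effect far smaller than the per-sample noise is invisible at n = 50 but unmistakable at n = 5,000:

```python
# Toy pQTL simulation (made-up effect sizes): a genotype that nudges a
# protein's abundance by far less than the measurement noise still falls
# out of a simple regression once the cohort gets big enough.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
effect, noise = 0.1, 1.0   # true per-allele shift vs. per-sample noise

for n in (50, 500, 5000):
    genotype = rng.binomial(2, 0.3, size=n)                # 0/1/2 allele copies
    protein = effect * genotype + rng.normal(0, noise, n)  # noisy abundance
    result = stats.linregress(genotype, protein)
    print(f"n={n:5d}  estimated effect={result.slope:+.3f}  p={result.pvalue:.2g}")
```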

This is opening the door to things like ....actually I don't think I want to say the names of some of these affinity arrays right this second.....a post on one of them this week has like 20,000 site visits in about 3 days... you know what I'm talking about. 

The fast(? are they really, though?) inexpensive(? ummmm.....honestly doesn't look like it?) mostly unproven things that may totally work, but there certainly isn't an abundance of evidence yet that they do.

What if you could do decently large QTL type work at a protein level with proven quantitative technology? What's that worth to the world? I dunno, but is that even possible yet?


This is a couple of months old (I've been busy) but it certainly implies that - yes - we can do these things today and we don't even need the best of the best of the best to do so. 

This study used the Seer Proteograph for upstream sample prep and then nice microflow proteomics on a TIMSTOF Pro (possibly a 2, I forget). Thanks to the nice robust microflow setup they knocked out each sample in about 30 minutes. So 48 samples/day this way. 

I think the biggest of the "next gen" studies I've seen so far was 5,000 samples? Let's go with reasonable downtime and QC/QA estimates. You're at 3 months? 4 months if you take weekends off? Are the affinity things faster? Maybe? Are they cheaper? Also....maybe....
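For what it's worth, the arithmetic behind that guess (the downtime overhead is my assumption, not anything from the paper):

```python
# Sanity check on the timeline guess (downtime overhead is my assumption).
samples = 5000
per_day = 24 * 60 / 30                # 30 min/sample -> 48 samples/day
runtime_days = samples / per_day      # ~104 days of pure acquisition
with_overhead = runtime_days * 1.15   # assume ~15% for QC/QA and maintenance
print(f"{runtime_days:.0f} days runtime, ~{with_overhead / 30:.1f} months with overhead")
```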

However - while I don't know the affinity technologies well, one thing that I do know about any affinity-type technology is that if you didn't design a probe for a target before you ran the thing, you will NEVER EVER be able to go back and look for new stuff. 

If you did that same study using the platform described here, where they did DIA-based analysis, it's the complete opposite: you can always go back and look for new stuff. I'm doing it all the time right now. As these neural network things get better, I can go back to single cells we analyzed 2 years ago and rerun them, and my blanks look better, my coverage goes up, and I can find a few more of the mutations and PTMs I care about.

How's the coverage this way? LCMS sucks at plasma proteomics, right? As good as any affinity tech we've seen so far, and - again - as the algorithms and our knowledge of the true human proteome evolve, we can go back to these data. 

In fact, you can do it right now if you want. The files are all right here.

Off my soapbox: the authors did a bang-up job of quantifying human variants in these data. It's truly stunning downstream analysis work. 

Wednesday, June 19, 2024

New biomarker panel for Parkinson's disease with early predictive power!!

Now for some good news! While there is a surprisingly well-developed set of genetic markers for Parkinson's Disease that you can even get info on through things like 23andMe, sometimes the disease develops without any of them. It's called de novo when that happens. What we need are some protein level markers, because it sounds like this is yet another disease that isn't (or isn't entirely) genetic! 

We've seen some amazing progress on Alzheimer's Disease biomarkers, largely out of the school in St. Louis that sounds like it should be in Seattle, which adds to why literally everyone forgets that it is there. 

Super encouraging paper title for Parkinson's this time!

https://www.nature.com/articles/s41467-024-48961-3 (something is up with my ability to hyperlink on blogger) 


How'd they do it? I definitely expected some of the super high tech nanoparticle-based plasma preparation stuff that lets us see ultra deep into the fluids. Nope. Not here. This is a story about having access to a priceless patient sample set and doing things the hard way. 

They used a standard depletion strategy (read it last night on my phone, but I suspect we're talking about the top12 depletion Agilent column or something similar to what Michal and I were using at NIAID a dozen years ago) and - 

MSe! (WTF is that?) Oh. Let me tell you (it's probably in the terminology translator over there somewhere --> ). In the long history of this blog, which is now a summary of something north of 3,000 proteomics papers (about half of which you can see), I think this is MSe paper number 3. 

It is a technology we had at NIAID in 2010(?), and it is Waters' name for All Ion Fragmentation. It is a near-100% duty cycle technology. You get your MS1 scan, then you get another full scan where every peptide is fragmented in a single window. With the rapid improvements in data independent acquisition algorithms, you'd guess that maybe we could make sense of these data better now than ever. I honestly don't know; I haven't seen it used in a long time. 

Something like 1,200 proteins were quantified in the patient samples. They used some criteria for filtering I don't quite understand, but it sounds more strict than what I'd use in this case, and they worked their way down to around 900 proteins for quantitative analysis - landing on around 120 that they built targeted assays for in their large cohort study. Told you - they put the work in and did this the classical way. 

Of their 120 markers, they can consistently detect about 1/4, and when applied to their larger cohort, a learning machine can accurately classify the patients! 
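(I'm not going to reproduce the paper's model here, but for anyone curious what that workflow looks like in its simplest form, here's a minimal sketch on simulated data - every number below is invented, and this is a generic stand-in, not the authors' pipeline.)

```python
# Minimal sketch of a marker-panel classifier on simulated data - NOT the
# authors' pipeline; every number here is invented for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
n_samples, n_markers = 200, 30                 # ~1/4 of 120 markers detected
X = rng.normal(size=(n_samples, n_markers))    # log-scale marker abundances
y = rng.integers(0, 2, size=n_samples)         # 0 = control, 1 = patient
X[y == 1, :5] += 0.8                           # pretend 5 markers shift in disease

model = LogisticRegression(max_iter=1000)
auc = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
print(f"cross-validated AUC: {auc.mean():.2f}")
```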

Minor criticism of the paper, because I work in a lab without air conditioning during a historic heat wave and it is making me very negative: the targeted data is all up exactly where it should be - on Panorama, where Skyline nerds can take a look at it. The global data does not appear to have been deposited. While....there are probably 10 people on planet earth who can process MSe data (maybe there are more? No way there are 100, right?), this is a dataset that might make some of us want to try. These are, however, clearly actual patient samples, and sometimes IRBs don't allow global data to be deposited, but it would have been cool to see these results. As an aside, some recent TIMSTOF methods described by Vadim's team (link) have essentially been All Ion Fragmentation methods, so DIA neural network tools should be able to make sense of these spectra, if the screwy Waters data format could be converted to something universal. 

Don't let this offset how great this study is, or the real theme here: if a skilled team can get access to the right samples, maybe we don't need the best and most expensive instrument or sample prep method in the world to do something truly important. 



Tuesday, June 18, 2024

Ultrafast acoustic ejection for biomarker studies!

Acoustic ejection mass spec is super promising, right? You couple an Echo, which can accurately transfer liquid from one place to another, to the front of a mass spec, where a droplet gets moved, ionized, and quantified before the system moves to the next one. Dr. Wheeler, who graduated from our group recently, had an internship at Merck where she ran one. She said that you couldn't prepare samples fast enough for it even if you were loading 1536-well plates! I think most of the systems are ending up doing drug screens in pharma.

Could you use it for biomarker studies in humans? Get those big cohorts done over your lunch break? Sure looks like it! (link!)

You aren't getting any upfront separation from HPLC, so this group used ...affinity... enrichment.... they enriched with antibodies and then did really nice, super ultra fast quantification on the Echo. By blog rules I am required to be snide about the use of rabbit blood derivatives to mess up the quantitative nature of a proteomics assay.

However - 

This automated platform for affinity enrichment was put through a rigorous validation for both robustness and reproducibility. It's clearly thorough, because I nearly fell asleep twice trying to read it. The highest %CV in the study appears to be a shockingly low 11%? 

If nothing else, this is the first study I've seen that says the Echo-equipped triple quad can be a legitimate contributor in proteomic validation. 


Monday, June 17, 2024

Re-revisited the organ specific protein aging study - "organ specificity" is not supported by protein level data!

For anyone unfortunate enough to have visited this blog in the past, you might have seen some of my early analysis and reanalysis of a high profile Nature paper back in December. I'm not going to put the link here. But the idea - which got mainstream attention - was that we could measure protein abundance in plasma and use that to infer the biological age of different human organs. 

Here was me going through it, increasingly appalled by several aspects of it.

As you'll see I was less annoyed by the central premise - that 4x more transcript abundance might mean a protein is organ specific - than I was by the lack of publicly available data - or that the validation was performed by ...measuring RNA......not protein....

That rant got me a really cool interview with the science reporter for the Wall Street Journal and a soundbite in the mainstream press.  

Here is the thing, though: being appalled by a seemingly arbitrary and overall rather ignorant set of assumptions - using transcript counts to predict whether a PROTEIN came from a specific organ - doesn't mean that all of the results are meaningless. It could be that if you examined every organ in complete isolation and said "if I count 4x more transcripts for this protein in organ A than in any other organ, then that protein might be pretty specific to that organ," you'd get it right some of the time. 

So I thought something like "wow, wouldn't it be great if there existed somewhere in the world actual protein level measurements of different human organs?" Something like these 2 studies that got the cover of this exact same journal 10 years ago? 


Or - more conveniently - more recent data that is higher depth and really addresses a lot of the weaknesses in the two articles in this 10-year-old thing I love so much that I have the cover framed (the artist who made it is super cool and I respect him a lot). 


Imagine this: you use the same cutoff that Oh et al. used - the protein abundance needs to be 4x more than in any other organ. In isolation. As if every organ is completely disconnected and protein-bearing material doesn't get transferred between them in some sort of an interconnected fluid-based system. 

What's the overlap between the proteins predicted to be organ-specific between the transcript based data and the proteomic data? (Keep in mind there is not 100% overlap of every target or organ). 

Want to follow my step-by-step analysis? It's in Excel and I tried to make it very clear. It's a bunch of VLOOKUPs and things. Heck - this is how the Open Science Framework is supposed to work! Check it out and please tell me if it's wrong or flawed. I spent a lot of time looking at it (and spot-checking "organ specific proteins" at http://www.humanproteomemap.org/).
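If you'd rather do it in pandas than VLOOKUP, the whole check boils down to something like this - the file and column names below are made up, so point it at the actual sheets in the workbook:

```python
# The same overlap check as the spreadsheet, sketched in pandas.
# File and column names are hypothetical - swap in the real workbook sheets.
import pandas as pd

rna = pd.read_csv("transcript_organ_calls.csv")    # columns: protein, organ
prot = pd.read_csv("proteome_organ_calls.csv")     # columns: protein, organ

merged = rna.merge(prot, on="protein", suffixes=("_rna", "_prot"))
agreement = (merged["organ_rna"] == merged["organ_prot"]).mean()
print(f"organ calls agree for {agreement:.1%} of shared proteins")
```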

Drumroll....? 

59.6%. Better than flipping a coin! But...not...much...better....

Okay - but hear me out. What if you actually consider that organs ARE connected by, I dunno, let's call it a "circulation system" or something "circulatory"? I like that one - something that could hypothetically carry proteins from multiple organs. How many of those proteins are higher in abundance - not 4x higher, just any amount higher - than the summed abundance across the organs we have solid PROTEIN LEVEL measurements on? Let's just use the organs in the 29 healthy human tissues map that have a match in the recent Nature paper. 

45.4%. Worse than flipping a coin, but -again - not by very much. 

Now, is Ben just screaming at the sky again? There are grownup ways of doing this stuff. Like contacting the editor at the journal and asking if you could put in a commentary or a "matters arising" that discusses that the very basis of a paper is intrinsically flawed. 

I did that. 

And the editor asked me to have a conversation with the authors, so I contacted the senior author about my concerns. I'm not sure if I can share the emails, so I won't, but I'll provide my interpretation. I am not sure if gaslighting is the correct term, or if it is just sometimes a feature of academia where a Professor assumes anyone who isn't one probably doesn't know 1% of what they do and talks down to them? Hard to tell. But this is a summary of the conversation. 

1) They'd love to share the proteomics data, but it's impossible to share any sort of -omics data without waiting months or years. Ben sighs. Obviously this is not only inaccurate, it is shockingly ignorant. 

2) They might be forming some sort of a consortium to make -omics data publicly available. Ben shuts his PC off for the day. We have a global, extremely well organized multi-national system to share proteomics data. Please do not invent one. Please. 

3) Looking at publicly available proteomics data is something that they might try one day. So...yet again...someone with SomaLogic data skips the easy and obvious experiment.

I spent a lot of time working on my focused breathing, redrafted my email a few times, and explained that in proteomics data sharing is considered mandatory - has been for a decade - unless patient data is compromised or it is flagged for national defense or something. And I shared a summary of the analysis I linked above. I also shared something else that is a decade old about why we have to share proteomics data.

Then I shared this with the editor with my analysis and they thought about it for a month or two. 

I received an email from the editor saying that they had a meeting, couldn't see how actual protein level data could add anything to the findings of the paper, and rejected my submission. 

I guess that was the issue. 

I don't want to add anything to the findings of this paper. I want to point out that the whole central premise of the study is silly and that - in the absence of publicly available data for reanalysis - no aspect of it should be taken seriously at all. Because when you actually look at proteins themselves - which this study is based on - and compare those to results that have been analyzed and reanalyzed, there is virtually no support for this study. 

Along the way somewhere I found out that - unsurprisingly - there is a whole company being spun out of the results of this study. I mean....who wouldn't want to know that their liver is 15 years older than it should be? Right? Cool idea, sign me up! However, there is no reason to believe - at all - that the methods detailed in the study can make anything like those kinds of measurements. 

Here is my analysis - whole thing open with step by step instructions. Please check my work!

Oh wow. While I was rereading this rant the preprint went live, but given the format I wrote it in, the post is actually longer.