Thursday, June 20, 2024

pQTL/GWAS studies by LCMS proteomics identify loads of peptide-level variants!

 

Quantitative Trait Loci, or QTLs, are a great excuse for doing some pretty low confidence and low accuracy measurements. In genomics these are done all the time with SNP arrays that can more or less sorta quantify a couple hundred things per sample. 

Here is the trick, though: if you get enough samples you can start to see the patterns in that lousy data without doing good genomics or gene product measurements on lots and lots of people (still hard, I don't care what refrigerated room of supercomputers you have). 
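If you've never actually run one of these: at its core a pQTL test is nothing fancier than regressing measured abundance on genotype dosage, over and over, for every variant/peptide pair. A minimal sketch with simulated data (every name and number below is made up for illustration):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Hypothetical cohort: genotype dosage (0/1/2 copies of the minor allele)
# and a log-scale peptide abundance for each of 2,000 subjects.
n = 2000
dosage = rng.integers(0, 3, size=n)
# Simulate a real effect: each allele copy bumps abundance a bit, plus noise.
abundance = 0.25 * dosage + rng.normal(0, 1, size=n)

# The pQTL test itself: simple linear regression, peptide ~ genotype.
slope, intercept, r, p, se = stats.linregress(dosage, abundance)
print(f"beta = {slope:.3f}, p = {p:.2e}")
```

Run that across millions of variant x peptide pairs, correct for multiple testing, and you've got the skeleton of the study design.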

This is opening the door to things like ....actually I don't think I want to say the names of some of these affinity arrays right this second.....a post on one of them this week has like 20,000 site visits in about 3 days... you know what I'm talking about. 

The fast(? are they really, though?) inexpensive(? ummmm.....honestly doesn't look like it?) mostly unproven things that may totally work, but there certainly isn't an abundance of evidence yet that they do.

What if you could do decently large QTL type work at a protein level with proven quantitative technology? What's that worth to the world? I dunno, but is that even possible yet?


This is a couple of months old (I've been busy) but it certainly implies that - yes - we can do these things today and we don't even need the best of the best of the best to do so. 

This study used the Seer Proteograph for upstream sample prep and then nice microflow proteomics on a TIMSTOF Pro (possibly a 2, I forget). Thanks to the nice robust microflow setup they knocked out each sample in about 30 minutes. So 48 samples/day this way. 

I think the biggest of the "next gen" studies I've seen so far was 5,000 samples? Let's go with reasonable downtime and QC/QA estimates. You're at 3 months? 4 months if you take weekends off, if you do it this way. Are the affinity things faster? Maybe? Are they cheaper? Also....maybe....
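Back-of-the-envelope on that throughput, if you want to check me (my arithmetic, not anything from the paper):

```python
# Rough cohort math for a 5,000-sample study at 30 min per sample.
samples = 5000
run_min = 30
samples_per_day = 24 * 60 // run_min     # 48/day at full tilt
raw_days = samples / samples_per_day     # ~104 days of pure acquisition
# Pad for QC/QA runs, blanks, column swaps, and the occasional bad day.
padded_days = raw_days * 1.2
print(f"{samples_per_day}/day -> {raw_days:.0f} days raw, "
      f"~{padded_days / 30:.1f} months with downtime")
```

That 20% padding is my guess, but you land right around the 3-4 month range either way.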

However - while I don't know the affinity technologies well, one thing that I do know about any affinity-type technology is that if you didn't design a probe for a target before you ran the thing, you will NEVER EVER be able to go back and look for new stuff. 

If you did that same study using the platform described here, where they did DIA-based analysis, it's the complete opposite - you can always go back and look for new stuff. I'm doing it all the time right now: as these neural network things get better I can go back to single cells we analyzed 2 years ago and rerun them, and my blanks look better, my coverage goes up, and I can find a few more of the mutations and PTMs I care about.

How's the coverage this way? LCMS sucks at plasma proteomics, right? As good as any affinity tech we've seen so far, and - again - as the algorithms (and our knowledge of the true human proteome) evolve, we can go back to these data.

In fact, you can do it right now if you want. The files are all right here

Off my soapbox: the authors did a bang-up job of quantifying human variants in these data. It's truly stunning downstream analysis work. 

Wednesday, June 19, 2024

New biomarker panel for Parkinson's disease with early predictive power!!

 


Now for some good news! While there is a surprisingly well-developed set of genetic markers for Parkinson's Disease - you can even get info on them through things like 23andMe - sometimes the disease develops without any of them. It's called de novo when that happens. What we need are some protein-level markers, because it sounds like this is yet another disease that isn't (or isn't entirely) genetic! 

We've seen some amazing progress on Alzheimer's Disease biomarkers, largely out of the school in St. Louis that sounds like it should be in Seattle, which adds to why literally everyone forgets that it is there. 

Super encouraging paper title - Parkinson's edition!

https://www.nature.com/articles/s41467-024-48961-3 (something is up with my ability to hyperlink on blogger) 


How'd they do it? I definitely expected some of the super high tech nanoparticle based plasma preparation stuff that lets us see ultra deep into the fluids. Nope. Not here. This is a story about having access to a priceless patient sample set and doing things the hard way. 

They used a standard depletion strategy (I read it last night on my phone, but I suspect we're talking about the Agilent top 12 depletion column or something similar to what Michal and I were using at NIAID a dozen years ago) and - 

MSe! (WTF is that?) Oh. Let me tell you (it's probably in the terminology translator over there somewhere --> ). In the long history of this blog - now a summary of something north of 3,000 proteomics papers (about half of which you can see) - I think this is MSe paper number 3. 

It is a technology we had at NIAID in 2010(?) and it is Waters' name for All Ion Fragmentation. It is a near 100% duty cycle technology: you get your MS1 scan, then you get another full scan where every peptide is fragmented in a single window. With the rapid improvements in data independent analysis algorithms, you'd guess that maybe we could make sense of these data better now than ever. I honestly don't know; I haven't seen it used in a long time. 
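If you've never seen MSe, the acquisition scheme itself is dead simple - alternate a low collision energy full scan with a high energy full scan where everything fragments at once, then match fragments to precursors afterward by aligning elution profiles. A cartoon of the duty cycle (obviously not Waters code, and the energies are illustrative):

```python
# Cartoon of an MSe (all ion fragmentation) duty cycle: no quadrupole
# isolation, just alternating low/high collision energy full scans.
def mse_cycle():
    while True:
        yield {"scan": "MS1", "collision_energy_eV": 4,
               "window": "full m/z range"}      # intact precursors
        yield {"scan": "MSe", "collision_energy_eV": (15, 45),
               "window": "full m/z range"}      # everything fragmented

cycle = mse_cycle()
for _ in range(4):
    print(next(cycle))
# Precursor-fragment matching happens post hoc, by aligning
# chromatographic (and now ion mobility) elution profiles.
```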

Something like 1,200 proteins were quantified in the patient samples. They used some criteria for filtering I don't quite understand - it sounds more strict than what I'd use in this case - and worked their way down to around 900 proteins for quantitative analysis, landing on around 120 that they built targeted assays for in their large cohort study. Told you - they put the work in and did this the classical way. 

Of their 120 markers, they can consistently detect about 1/4, and when applied to their larger cohort a learning machine can accurately classify the patients! 
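If "a learning machine can classify the patients" sounds abstract, mechanically it's something like this - a minimal scikit-learn sketch on simulated data (this is not their model, just the shape of the thing):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
# Simulated targeted panel: 30 reliably detected markers, 200 subjects.
n_subjects, n_markers = 200, 30
X = rng.normal(size=(n_subjects, n_markers))
y = rng.integers(0, 2, size=n_subjects)   # 0 = control, 1 = patient
X[y == 1, :5] += 0.8                      # five markers truly separate groups

clf = LogisticRegression(max_iter=1000)
print(cross_val_score(clf, X, y, cv=5).mean())   # held-out accuracy
```

The part that matters is the cross-validation: accuracy on held-out subjects, not on the ones the model was trained on.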

Minor criticism of the paper, because I work in a lab without air conditioning during a historic heat wave and it is making me very negative: the targeted data is all up exactly where it should be - on Panorama, where Skyline nerds can take a look at it. The global data does not appear to have been deposited. While....there are probably 10 people on planet earth who can process MSe data (maybe there are more? No way there are 100, right?), this is a dataset that might make some of us want to try. These are, however, clearly actual patient samples, and sometimes IRBs don't allow global data to be deposited, but it would have been cool to see these results. As an aside, some recent TIMSTOF methods described by Vadim's team have essentially (link) been All Ion Fragmentation methods, so the DIA neural network tools should be able to make sense of these spectra, if the screwy Waters data format could be converted to something universal. 

Don't let this offset how great this study is, and the real theme here: if a skilled team can get access to the right samples, maybe we don't need the best and most expensive instrument or sample prep method in the world to do something truly important. 



Tuesday, June 18, 2024

Ultrafast acoustic ejection for biomarker studies!

 


Acoustic ejection mass spec is super promising, right? You couple an Echo - which can accurately transfer liquid from one place to another - to the front of a mass spec, where a droplet gets moved, ionized, and quantified, and then the system moves to the next. Dr. Wheeler, who graduated from our group recently, had an internship at Merck where she ran one. She said that you couldn't prepare samples fast enough for it even if you were loading 1536-well plates! I think most of the systems are ending up doing drug screens in pharma.

Could you use it for biomarker studies in humans? Get those big cohorts done over your lunch break? Sure looks like it! (link!)

You aren't getting any upfront separation from HPLC, so this group used ...affinity... enrichment.... So they enriched with antibodies, then did really nice super ultra fast quantification on the Echo. By blog rules I am required to be snide about the use of rabbit blood derivatives to mess up the quantitative nature of a proteomics assay.

However - 

This automated platform for affinity enrichment was put through a rigorous validation for both robustness and reproducibility. It's clearly thorough, because I nearly fell asleep twice trying to read it. The highest %CV in the study appears to be a shockingly low 11%? 
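(For reference, %CV is just the standard deviation over the mean, times 100. The five-second version, with made-up replicate peak areas:)

```python
import numpy as np

# Hypothetical peak areas for one peptide across five process replicates.
replicates = np.array([1.10e6, 0.95e6, 1.02e6, 0.88e6, 1.05e6])
cv = 100 * replicates.std(ddof=1) / replicates.mean()
print(f"%CV = {cv:.1f}%")   # ~8.6% here; their WORST assay was ~11%
```

Getting every assay in an antibody enrichment workflow under that bar is genuinely impressive.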

If nothing else this is the first study I've seen that says the Echo equipped triple quad can be a legitimate contributor in proteomic validation. 


Monday, June 17, 2024

Re-revisiting the organ-specific protein aging study - "organ specificity" is not supported by protein-level data!

 



For anyone unfortunate enough to have visited this blog in the past, you might have seen some of my early analysis and reanalysis of a high-profile Nature paper in December. I'm not going to put the link here. But the idea - which got mainstream attention - was that we could measure protein abundance in plasma and use that to infer the biological age of different human organs. 

Here was me going through it, increasingly appalled by several aspects of it:

As you'll see, I was less annoyed by the central premise - that 4x more transcript abundance might mean a protein is organ specific - than I was by the lack of publicly available data, or that the validation was performed by ...measuring RNA......not protein....

That rant got me a really cool interview with the science reporter for the Wall Street Journal and a soundbite in the mainstream press.  

Here is the thing, though: while I'm appalled by a seemingly arbitrary and overall rather ignorant set of assumptions about using transcript counts to predict whether a PROTEIN came from a specific organ, that doesn't mean that all of the results are meaningless. It could be that if you examined every organ in complete isolation and said "if I count 4x more transcripts for this protein in organ A than in any other organ, then that protein might be pretty specific to that organ," you'd sometimes be right. 

So I thought something like "wow, wouldn't it be great if there existed somewhere in the world actual protein level measurements of different human organs?" Something like these 2 studies that got the cover of this exact same journal 10 years ago? 


Or - more conveniently - more recent data that is higher depth and really addresses a lot of the weaknesses in the two articles in this 10-year-old thing I love so much I have the cover framed (the artist who made it is super cool and I respect him a lot). 


Imagine this - you use the same cutoff that Oh et al. used: the protein abundance needs to be 4x more than in any other organ. In isolation. As if every organ is completely disconnected and protein-bearing material doesn't get transferred between them in some sort of an interconnected fluid-based system. 

What's the overlap between the proteins predicted to be organ-specific between the transcript based data and the proteomic data? (Keep in mind there is not 100% overlap of every target or organ). 

Want to follow my step-by-step analysis? It's in Excel and I tried to make it very clear. It's a bunch of VLOOKUPs and things. Heck - this is how the Open Science Framework is supposed to work! Check it out and please tell me if it's wrong or flawed. I spent a lot of time looking at it (and spot-checking "organ specific proteins" at http://www.humanproteomemap.org/). 
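If you'd rather script it than VLOOKUP it, the whole analysis boils down to something like this. The toy tables below are made up; the real inputs are the Oh et al. transcript table and the protein-level tissue map:

```python
import pandas as pd

# Toy abundance tables (rows = proteins, columns = organs).
rna_df = pd.DataFrame(
    {"liver": [90, 5, 10], "heart": [10, 80, 12], "brain": [8, 6, 11]},
    index=["P1", "P2", "P3"])
protein_df = pd.DataFrame(
    {"liver": [85, 20, 10], "heart": [12, 30, 9], "brain": [10, 25, 12]},
    index=["P1", "P2", "P3"])

def organ_specific(df, fold=4.0):
    """Oh et al.-style rule: call a protein organ specific if its top
    organ is at least fold x the runner-up organ."""
    top = df.max(axis=1)
    runner_up = df.apply(lambda row: row.nlargest(2).iloc[-1], axis=1)
    return df.idxmax(axis=1).where(top >= fold * runner_up).dropna()

rna_calls = organ_specific(rna_df)        # transcript-based calls
prot_calls = organ_specific(protein_df)   # protein-level calls
shared = rna_calls.index.intersection(prot_calls.index)
print(f"{100 * (rna_calls[shared] == prot_calls[shared]).mean():.1f}% agree")
```

Swap in the real tables and the agreement number is the one you're about to see.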

Drumroll....? 

59.6%. Better than flipping a coin! But...not...much...better....

Okay - but hear me out. What if you actually consider that organs ARE connected by, I dunno, let's call it a "circulation system" or something "circulatory?" I like that one - that could hypothetically carry proteins from multiple organs. How many of those proteins are higher in abundance - not 4x higher, just any amount higher - than the summed abundance of the organs we have solid PROTEIN LEVEL measurements on? Let's just use the organs in the 29 healthy human tissues map that have a match in the recent Nature paper. 

45.4%. Worse than flipping a coin, but - again - not by very much. 

Now, is Ben just screaming at the sky again? There are grownup ways of doing this stuff. Like contacting the editor at the journal and asking if you could put in a commentary or a "matters arising" that discusses how the very basis of a paper is intrinsically flawed. 

I did that. 

And the editor asked me to have a conversation with the authors - so I contacted the senior author about my concerns. I'm not sure if I can share the emails, so I won't, but I'll provide my interpretation. I am not sure if gaslighting is the correct term, or if it is just sometimes a feature of academia that a professor assumes anyone who isn't one probably doesn't know 1% of what they do and talks down to them? Hard to tell. But this is a summary of the conversation. 

1) They'd love to share the proteomics data, but it's impossible to share any sort of -omics data without waiting months or years. Ben sighs. Obviously this is not only inaccurate, it is shockingly ignorant. 

2) They might be forming some sort of a consortium to make -omics data publicly available. Ben shuts his PC off for the day. We have a global, extremely well organized multi-national system to share proteomics data. Please do not invent one. Please. 

3) Looking at publicly available proteomics data is something that they might try one day. So...yet again... someone with SomaLogic data skips the easy and obvious experiment. 

I spent a lot of time working on my focused breathing, redrafted my email a few times, and explained that in proteomics data sharing is considered mandatory - and has been for a decade - unless patient data is compromised or it is flagged for national defense or something. And I shared a summary of the analysis I linked above. I also shared something else that is a decade old about why we have to share proteomics data.

Then I shared all of this with the editor, along with my analysis, and they thought about it for a month or two. 

I received an email from the editor saying they had a meeting, couldn't see how actual protein-level data could add anything to the findings of the paper, and rejected my submission. 

I guess that was the issue. 

I don't want to add anything to the findings of this paper. I want to point out that the whole central premise of the study is silly and that - in the absence of publicly available data for reanalysis - no aspect of it should be taken seriously at all. Because when you actually look at proteins themselves - which this study is based on - and compare those to results that have been analyzed and reanalyzed, there is virtually no support for this study. 

Along the way somewhere I found out that - unsurprisingly - there is a whole company being spun out of the results of this study. I mean....who wouldn't want to know that their liver is 15 years older than it should be? Right? Cool idea, sign me up! However - there is no reason to believe - at all - that the methods detailed in the study can make anything at all like those kinds of measurements. 

Here is my analysis - whole thing open with step by step instructions. Please check my work!

Oh wow. While I was rereading this rant the preprint went live - though, given the format I wrote it in, this post is actually longer.


Wednesday, June 12, 2024

Astronaut multiomics week with 44 new papers and open access data everywhere!

 


Does it seem like space and astronaut data is all over the place right now? It is! 

44 new papers just dropped and many are proteomics, transcriptomics, metabolomics/lipidomics of astronauts (both human and otherwise). 

You can check it out (and get access to the data if you want to investigate it yourself - this isn't Space Karen stuff, this is NASA stuff, it's open!)

Tuesday, June 11, 2024

Sunday, June 9, 2024

Spaceflight changed the skeletal muscle proteomes of 2 astronauts!

 


A common theme in a lot of serious science fiction is how life from our planet will need to adapt to the challenges of low or zero gravity. Makes sense, right? Astronauts spend a lot of time recovering after time up on the ISS.

Want to actually understand what is changing?!? Of course you do! 

It is a short read and it has some really optimistic statements, like how exercise can help mitigate some of the biggest changes in mitochondria.

This is the journal's front page today, btw, which gives some insight into how you actually exercise on the ISS. 


I'd never heard of this journal before Scholar alerted me that someone I follow published something new. Digging into it, this isn't even the journal's first proteomics entry. There have been multiple proteomics studies in this journal because - I mean....it's not like being in space for 6 months is altering your DNA a whole lot. Those changes are clearly proteomic! 

Another recent paper looked at similar things to this one, but did it in space mice! 


One reason this multi-omics paper is super cool is that it turns out NASA has a whole repository of data from organisms that have gone to space. This group combined phenotypic data with transcriptomics, proteomics and DNA methylation to better understand muscle loss in mice that spent 30 days in space! 


As an aside, a few years ago I volunteered to help analyze proteomics of Arabidopsis that had gone to space. The data hadn't been released yet, I needed to get clearance, the paperwork turned into a hassle for everyone, and they dropped it. I have been pretty busy, but I would have found the time for that study - purely and completely so I could make the following joke. 

Space Plants! 


Meh. Maybe some day. 

Saturday, June 8, 2024

CLIPPER 2.0 - (Re)Annotate those positional isoforms!

 


Picture this - you decided to do something wacky and actually use one of those FASTA databases that contains a lot of different protein isoforms to see if you could find them! Not just the smallest UniProt database, where it's about 100% one open reading frame for every gene for every protein entry. We all know that isn't how biology works at all, but what else do you do? 

Chances are you can make those peptide hits and do a protein rollup and then have no real way to easily dig through those isoforms you were looking for anyway....yay.....

What if you could take your output from a lot of the common tools and drop them into something that can help you find those isoforms? Check this out! 

Welcome to the party, CLIPPER! I've got some RAS mutations I could use your help with. There are different mutations on each chromosome copy, too (super fun), so multiple sites map back to the same single entry, and I think you're going to help. 
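For anyone who hasn't hit this wall: the underlying headache is that one peptide can land in many isoform entries, at a different position in each, while a variant peptide may not land in the canonical entry at all. The bookkeeping looks something like this (a sketch of the problem, not the CLIPPER code; the second sequence is a hypothetical variant entry):

```python
# One peptide, multiple isoform/variant entries, different outcomes.
fasta = {
    "KRAS_canonical":  "MTEYKLVVVGAGGVGKSALTIQLIQNHFVDEYDPTIEDSY",
    "KRAS_G12D_like":  "MTEYKLVVVGAGDVGKSALTIQLIQNHFVDEYDPTIEDSY",
}
peptide = "LVVVGAGGVGK"

for name, seq in fasta.items():
    pos = seq.find(peptide)
    if pos >= 0:
        print(f"{peptide} -> {name}, residues {pos + 1}-{pos + len(peptide)}")
    else:
        print(f"{peptide} -> no match in {name} (variant site?)")
```

Tools like CLIPPER do this positional (re)annotation at scale, so the rollup stops hiding exactly the isoforms you went looking for.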



Friday, June 7, 2024

Searle lab Stellar preprint resolves a lot of questions about the new ion trap!

 


I tried to keep up on what was going on in Anaheim at ASMS, but it's hard to keep up when you're actually there. Not being there is almost harder. I ended up with more questions about the Stellar than answers, but - yet again - Thermo collaborators dropped a series of preprints throughout the conference.

This one really clarifies what this instrument is and can do


I knew from the architecture (an ion trap on the back of what appears to be the phenomenal Altis Plus QQQ instrument?) that it was going to be crazy sensitive, and the headlines were clearly that this ion trap can hit >100 scans/second. 

What this preprint goes into is how you can use this for both global proteomics and targeted validation. They use the instrument in DDA mode with offline fractions to build a library and evaluate that versus gas phase fractionated libraries with DIA. And then they do the targeting. So...if you had questions, it looks like this instrument can at least do all the normal shotgun proteomics stuff. A lot of us old people had bad experiences with nominal mass instruments way back in the day. I swear, if I put enough LTQ XL MS/MS spectra into a program called Bioworks and ran a SEQUEST search (around 2009-2011, this was my workflow) I could literally generate proteomics data to support ANY hypothesis you brought my way. There were low-accuracy, poorly matched MS/MS spectra for every peptide from every protein.

The first real way of estimating FDR wasn't published by the Gygi lab until 2007 (link), but it didn't get to me in any form I could use until I was running Orbitraps. And who uses target-decoy for PSMs today? We all use intelligent deep machine intelligence thingies. It's easy to think that if we'd had these informatics tools, maybe we'd have less fear of nominal mass instruments. Exciting thought, and since this truly sounds like an instrument with a load of capabilities at a much lower price point than the other headline-grabbing ones today, I bet we'll find out soon! 
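For the youths: the target-decoy idea really is this simple - search a reversed or shuffled copy of your database alongside the real one, and at any score cutoff estimate FDR as decoy hits over target hits. A minimal sketch on simulated scores:

```python
import numpy as np

rng = np.random.default_rng(1)
# Simulated PSM scores: true matches score higher than random matches.
target_scores = np.concatenate([rng.normal(3, 1, 800),    # real hits
                                rng.normal(0, 1, 200)])   # junk target hits
decoy_scores = rng.normal(0, 1, 1000)   # hits to the reversed database

def fdr_at(threshold):
    # Decoy hits above the cutoff estimate the number of false targets.
    t = (target_scores >= threshold).sum()
    d = (decoy_scores >= threshold).sum()
    return d / max(t, 1)

for thr in (1.0, 2.0, 3.0):
    print(f"score >= {thr}: estimated FDR = {fdr_at(thr):.3f}")
```

No machine intelligence thingies required - which is why it could have tamed nominal mass instrument data perfectly well back then, if we'd had it in a usable form.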

Thursday, June 6, 2024

Finally! MALDI-prmPASEF for spatial targeting (preprint says prototype software)!


Wow. Do you have to dig for this one. 

Background: if you aren't paying attention and you get a TIMSTOF Flex that with ESI can do PASEF (DDA), diaPASEF (DIA...duh...) and prmPASEF (targeting with ion mobility, the quadrupole, and high resolution fragments!), you might be very confused to find out that none of these features work when you.... turn....on.... the.... MALDI... source. 

What you have to do is figure out the mass of your target (which might change if it picks up a funny matrix adduct) and then its relative ion mobility, and then punch both of those things in to get a TIMS-cleaned-up MS1...at 35,000 mass resolution....The end result is that you 1) can't scan very fast (because you aren't doing the parallel accumulation bit) AND 2) can't remove a lot of background. If you're like me it makes you sort of forget you have a MALDI at all.
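Concretely, your "method" ends up being a hand-built table like this (every value below is made up for illustration):

```python
# What "punch both of those things in" amounts to: a manual target list
# of m/z plus an ion mobility window per analyte, adducts included.
targets = [
    {"name": "analyte_X",          "mz": 455.29, "one_over_k0": (0.98, 1.06)},
    {"name": "analyte_X + matrix", "mz": 607.33, "one_over_k0": (1.10, 1.18)},
]
for t in targets:
    lo, hi = t["one_over_k0"]
    print(f"{t['name']}: m/z {t['mz']}, 1/K0 {lo}-{hi}")
```

Fine for a handful of targets; miserable the moment you want real multiplexing, which is why prm-PASEF on the MALDI side matters.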

While rumored for a while, those rumors got a whole lot louder at some meeting in California this week. I'm not there, so I had to really do some preprint digging and - this group has it! And they hid it behind a ton of words that I do not know and 23 other figures. But it's in here! 



To find data from MALDI prm-PASEF you need to go to Supplemental 18 (top figure) and it looks super super legit. Unequivocal, even? I like the word "prototypic" here. 


Wednesday, June 5, 2024

19,000 phosphopeptides by microflow DIA! Is the PTM tipping point finally here?

 


When you think about why not to do DIA, you've got maybe 2 reasons right now: limited multiplexing capabilities - and PTMs.

Obviously the PTM data is there, right? We need the informatics and maybe the methods to grow up. 

Is this great new study signaling that the tipping point is already upon us for one annoying PTM? 

There is a LOT of good work in this paper. Not limited to - wow - that's a lot of IDs for microflow - as well as the use of multiple search tools to get to these data (comparisons of library generation!). That last part is cool because the spectra used to train these deep learning tools for peptides are from Orbitraps, and the spectra of other mass analyzers may look a little different. There is a lot to read into this if you're on a SCIEX TOF platform. In any case, a really nice study for multiple reasons, including way way way more phosphopeptides than we've ever seen in our lab out of DIA. Maybe it's time to try looking at some old files again. 

Tuesday, June 4, 2024

Set your Q Exactives up for optimal TMT32/35-plex (90k res?) cycle times!


This week at ASMS there were some new reagents on the big stages, including what appears to be either a TMT 32-plex or TMT 35-plex commercial release. 

While I haven't had time around packing the lab to really dig into the fine details, it appears that a minimum of 90,000 resolution @ m/z of 200 is required to achieve baseline separation of all the tags. 

If you're running Q Exactives you probably noticed that you do not have a 90,000 resolution setting. On a Classic or Plus you've got 70,000 and 140,000. While you could just run at 140,000, that's a whopping 512 milliseconds per MS/MS scan! 

Running at 90,000 would only be 329 milliseconds, which works out to roughly an extra MS/MS scan per second. That would definitely add up across a run. 

Assuming you've got the setup I put up there (70k MS1 and 90k MS2), a top 6 cycle would be about 2.2 seconds, rather than about 3.6 seconds. 

That's roughly 3,200 top 6 cycles in 120 min vs 2,000 (multiply by 6 for the raw MS/MS scan counts).
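Here's that arithmetic if you want to poke at it. Transient times are the usual Q Exactive values; real cycles add per-scan overhead, which is why the bare numbers below run slightly fast:

```python
# Orbitrap transient times (ms): roughly doubling per resolution step.
# 90k is the in-between setting MaxQuant.Live unlocks on a QE.
transient_ms = {70_000: 256, 90_000: 329, 140_000: 512}

def cycle_s(ms1_res, ms2_res, topn=6):
    """Seconds per duty cycle: one MS1 plus topn MS/MS scans."""
    return (transient_ms[ms1_res] + topn * transient_ms[ms2_res]) / 1000

run_s = 120 * 60
for ms2 in (90_000, 140_000):
    c = cycle_s(70_000, ms2)
    print(f"MS2 @ {ms2:,}: {c:.1f} s/cycle, ~{run_s / c:.0f} cycles/120 min")
```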

No, that's not a lot of spectra, but you're 32-plexing! You're offline fractionating and stuff, right? Plus it's TMT, you literally do not care about your chromatographic peak shape.

Also, in one of the slides it looks like they show not-quite-baseline-resolved spectra at a lower resolution, which would obviously be faster. 

(Oh yeah, and an HF or HF-X would be about 2x the number of spectra)

How to do this? 

MAXQUANT.LIVE.LIVE.LIVE.LIVE.LIVE (sing it with me, people who plan to retire with a Q Exactive still in their possession!) 

MAXQUANT.LIVE 2.1 is now valid until January 2026!!

If you haven't ever used MaxQuant.Live, I made instrument triggering methods you can download here for the Q Exactive HF (it works the same for the Q Exactive Classic; I've never tried a Plus), as well as a PowerPoint that walks you through the steps. You go into the beta and there are two little paperclips to download the .meth file and the .pptx. 

Monday, June 3, 2024

Near real-time plasma proteomics by LCMS!

 


One of the historic (and largely accurate) criticisms of LCMS proteomics is that it is slow.

Slooow.

Slooooooooooooow.

What parts are slow? 

Sample prep has historically taken 16 hours, though faster, higher-temperature methods have really taken center stage. We often do 2-hour digestions at 47C, but that's after 1 hr of reduction and alkylation, and possibly after homogenization or enrichment. Still not fast. 

Also, the HPLCs on mass specs are slow. Smart new LCs have improved this a lot, including technologies like dual trap single column (DTSC) HPLC.

What if you ran an LCMS system really, really fast? And what if your total sample prep was almost as fast? 


What's your argument against LCMS proteomics then? Dynamic range? Okay, valid, but that number is going up by the day. 

Dual trap single column HPLC mass spec (fast) coupled with a new protease that can digest a sample in minutes??? 

Sunday, June 2, 2024

ASMS Hardware Launch #3 - TIMSTOF Ultra 2!

 


Okay, so the big-time thing here is always a better source for a TIMSTOF. So pumped that they've been putting time into fixing the weakest part of the system with multiple new iterations of sources. 30% more signal on the Ultra is crazy.

The headline on this one should be ion counting and the new ion control (something 2.0). It has always been possible to set a maximum charge limit on each packet of ions on a TIMSTOF. No one knows how to use it, and maybe it never worked. Who knows?

Imagine what it would be like running an Orbitrap without automatic gain control and ion injection time: too many ions, space charging; too few ions, meh spectra. That's what TIMSTOFs have been running with since this NMR company burst onto the scene a few years ago. 
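(AGC in one sentence, for anyone who hasn't had to think about it: estimate the incoming ion flux from a quick pre-scan or the previous scan, then set the injection time so each scan collects close to a target number of charges without overfilling. Roughly:)

```python
# The automatic gain control idea in miniature: pick an injection time
# that hits a target charge count, capped by a maximum fill time.
# Target and cap values here are illustrative, not instrument specs.
def agc_inject_time_ms(ion_flux_per_ms, target_charges=3e6, max_ms=100):
    if ion_flux_per_ms <= 0:
        return max_ms
    return min(target_charges / ion_flux_per_ms, max_ms)

print(agc_inject_time_ms(1e5))   # rich signal: short fill (30 ms)
print(agc_inject_time_ms(1e3))   # weak signal: hits the 100 ms cap
```

That kind of feedback loop is presumably what the new ion control feature is finally bringing to the TIMS side, if it works as advertised.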

YouTube video about it here. 

ASMS Big Hardware Release #2? A little Xevo MRT? 100,000 res at 100 Hz!?!

 


Ummmmmmm......okay....so that is actually what it says.... My first thought was....


Right? Because the MRT works by increasing resolution through a longer flight path.


According to the Waters website, this speed is independent of the resolution. Okay....so a while ago a friend who is in reagent sales said that a company you wouldn't expect was buying a lot of isobaric tagging reagents. I mean.....this is a little benchtop TOF that can do the TMT 18-plex and very likely the TMT xx-plex reagents. 

Y'all might want to swing by to see what Waters is doing. And, like most years, I don't mean that statement as a joke this year. 

ASMS Big Hardware Release 1! The Return...of the linear...ion...trap...?

 

Cleverly timed preprint drop #1!? 

Man, I swear some people know the right person at CSHL to slip cash to or something to time these papers. When I send a preprint it could be out in 15 minutes or in 2 weeks....or....rejected, like my most recent one that was just accepted by a really good journal. You can't appeal a bioRxiv rejection, btw. What was I....

Oh yeah! 

I'm not in Anaheim but I'm spying on the socials and this looks like it's the big mystery box from Thermo! 


Yo. Is that a Q Exactive but they put a linear ion trap where the Orbi ought to be? 

Sure looks like it! 

According to this guy, however, it's a little more sophisticated than a Q Exactive: it's basically the high-end TSQ Altis Plus front end and then a super high speed linear ion trap! 



How fast? The preprint suggests 100 Hz! And with a good quad? I'm not awake enough to read the whole preprint, though. 

And for anyone thinking ...wait a minute...I've seen a triple quad with a trap at the end of it before..... I started my career on a 3200 QTRAP and it was great. 800 ppm mass accuracy, but I was doing glycans and they all have the same mass anyway. Who cares if you can't tell a glutamine from a glutamate or whatever?