Sunday, April 24, 2022

EnFind -- Glycoproteomics without the need for oxonium ions?


I just stumbled on this while working on a tough deadline or three, and now I can cite it! It is a preprint from a couple of years ago. 

Due to the time of flight (effect, not instrument) coming out of the quadrupole on the TIMSTOF you can't really scan low and high mass fragment ions simultaneously. The heavier ions take longer to get there and you didn't get fined by the union that "maintains" your building for altering your ceiling tiles because you wanted to scan slooooow. (BTW: I'm 100% pro-union, labor unions are the only thing standing between lots of members of my family and dying in coal mines in West Virginia so people like the...governor....of the state can save $10. He is really really really rich, btw, largely by operating coal mines unsafely and owes millions of dollars in fines for safety violations that he just doesn't pay because billionaires in the US can literally do anything they want with absolutely no repercussions. I have personally known people who have died underground in mines owned by that walking pile of dogshit, though, thankfully, no family, so please pardon the emotion here.) 

What was I....oh yeah! This preprint!

Okay, so don't quote me on this preprint at all. The data isn't publicly available anywhere and it's been sitting on a preprint server a couple of years, but I think the concept is cool.

I was looking to see if anyone who has shared with me how they are doing glycoproteomics on these ceiling disrupting monsters has published their approach so I can cite it. I'm interested because I really fine tune my pre-pulse storage conditions for my two TMT scan events and crank the collision energy to 11 on the scan that liberates my reporter ions. 

The end result is that in my MS2 spectra I often see crazy high intensity reporter ions and then a great big gap up to around 230 or so before we see the peptide signal. (They look like the spectra at the very top -- see where the signal starts? Ain't gonna see that 204 or 183.) It looks like there is nothing in the 204 range, which causes a lot of the glycoproteomics programs to output nothing at all because a lot of them only consider spectra for glycan ID if they see oxonium ions. This group uses PEAKS and they just don't think about oxonium ions. Boom. Problem solved, maybe? 
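
To make the oxonium ion problem concrete, here is a minimal sketch of the kind of gatekeeping filter I'm describing. It is not how any particular glycoproteomics program does it. The function name, the tolerance, the intensity cutoff and the toy spectra are all made up for illustration, and the m/z values are the commonly quoted low mass HexNAc-derived oxonium ions.

```python
# Commonly quoted low mass HexNAc-derived oxonium ions (monoisotopic m/z).
OXONIUM_MZ = (138.0550, 168.0655, 186.0761, 204.0867)


def has_oxonium_ion(peaks, tol_da=0.02, min_rel_intensity=0.01):
    """Return True if any peak matches a listed oxonium ion.

    peaks is a list of (mz, intensity) tuples. tol_da and min_rel_intensity
    are hypothetical defaults, not values taken from any real tool.
    """
    if not peaks:
        return False
    base_peak = max(intensity for _, intensity in peaks)
    for mz, intensity in peaks:
        # Ignore peaks below the relative intensity cutoff.
        if intensity < min_rel_intensity * base_peak:
            continue
        if any(abs(mz - ox) <= tol_da for ox in OXONIUM_MZ):
            return True
    return False


# Toy spectrum tuned for reporter ions: big reporters, then nothing until ~230.
tmt_style = [(126.1277, 9e5), (127.1248, 8e5), (231.1, 2e4), (480.3, 5e4), (912.5, 3e4)]
# Same toy spectrum if the 204 region had actually been transmitted.
full_range = tmt_style + [(204.0866, 6e4)]

print(has_oxonium_ion(tmt_style))   # filtered out before glycan ID is even tried
print(has_oxonium_ion(full_range))  # passes the filter
```

A tool that only considers spectra passing a check like this would throw away the first spectrum regardless of how good the peptide fragments are, which is why skipping the oxonium requirement is interesting.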

Wednesday, April 20, 2022

Analysis of the popularity of proteomics search engines!

I've often wondered who is using which search engines. I've also made guesses and they don't seem to be anywhere near accurate based on this Google Scholar analysis by Dr. Felipe da Veiga Leprevost who I finally met in person at US HUPO this year. 

He shared this on social media and was clear that it wasn't comprehensive (who can keep up with the 1,000 plus proteomics tools out there? I mean, @PastelBio is doing an incredible job, but I don't think it's at 1,000 yet.) 

This is by citations, so there are lags, and it doesn't tell us what companies that don't publish are doing (and there are a lot of instruments in those places). Click to expand. 

Check out the overlay, though! 

From this it looks like MaxQuant is, and has been, massively dominant for proteomics data analysis in the literature. The fastest growing engine out there, probably surprising few of us, is MSFragger. I didn't use FragPipe for the first time until 2019 and I don't think it had been out for very long when I started using it. 

Monday, April 18, 2022

Double barrel chromatography for ultra low level peptides!


I haven't seen much success with double barrel type chromatography in a while -- but here is a demonstration of a successful application in ultra low levels.

Setups like this are just as fun to set up as they look, but the idea is that you are trapping your peptides on trap number 2 while your peptides are eluting off of analytical column number 1. When everything goes onto column 2, trap and analytical column number 1 are re-equilibrating. The goal is to wipe out that downtime between injections. The result is often two distinct sets of files with very different looking chromatography, but this group seems to have the recipe for sorting it out! 
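
Here is a rough sketch of why people put up with the plumbing. It is a toy timing model, not anything from the paper: the gradient and overhead times are made-up numbers, and "overhead" lumps loading, trapping and re-equilibration into a single block that can run on one line while the other line is eluting into the mass spec.

```python
def single_column_time(n_injections, gradient_min, overhead_min):
    """Total time when loading/re-equilibration and elution happen in series."""
    return n_injections * (gradient_min + overhead_min)


def dual_column_time(n_injections, gradient_min, overhead_min):
    """Total time when two trap/column lines alternate.

    Each line does its overhead while the other is eluting. The mass spec
    can only acquire from one line at a time.
    """
    ms_free_at = 0.0
    line_free_at = [0.0, 0.0]
    for i in range(n_injections):
        line = i % 2
        # The line has to finish its own overhead before it can elute.
        ready_at = line_free_at[line] + overhead_min
        start = max(ready_at, ms_free_at)
        end = start + gradient_min
        ms_free_at = end
        line_free_at[line] = end
    return ms_free_at


# Hypothetical numbers: 30 min gradient, 15 min of load + re-equilibration.
print(single_column_time(4, 30, 15))
print(dual_column_time(4, 30, 15))
```

With these assumed numbers the overhead is only paid once, at the very start, and every injection after that costs just the gradient time.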

Friday, April 15, 2022

What do all the FragPipe output columns mean??!?!


Maybe you already knew what every column header for FragPipe output was and the difference between intensity and MaxLFQ intensity. I didn't! 

There is an entire glossary of definitions available here (thanks for the link, Fengchao!)

Thursday, April 14, 2022

DIA-NN 1.8.1 update -- helpful for you TIMS people!


I hadn't updated DIA-NN in a while AND I'm running it on a newer PC, so I can't say for sure what is making it seem a whole lot faster than before for dia-PASEF.

Wednesday, April 13, 2022

New nPOP protocol just dropped -- supports TMTPro 10-plex for TOFs!


Woohooo! You can check out this protocol here! 

One of the best things that has happened for my work recently has been the addition of increased multiplexing in the TMTPro sets. Sure, on the Orbis we can 18-plex, but on the fast instruments we can now 10-plex! I saw some misunderstandings of how this works on the Twitter thing recently, so this is how it works. 

Just use every other tag! You've got 10! That's as much as you could multiplex with anything (except Neu-Plex) a couple of years ago!  

What we most recently did was buy a 16-plex kit + the two extra tags. Now there are full 18-plex kits available in just the huge 5mg aliquots (enough to label approximately 1.8 million single human cells) that you can aliquot out. We preprinted a super short and fast protocol to aliquot out TMT kits using an inexpensive robot last summer that saved us enough money in one single kit to more than pay for the robot. I bet you can find it. That paper also received a response from a peer reviewer that was so negative that I'm having it framed for my office. Though it might spend some time above my heavy bag for a while. Motivation! 

We use the N-tags for the TOFs and we keep the C-tags for when we make TMT spectral libraries or when a collaborator has just a few samples. We may need to shake this up a little because there are a lot of C-tag aliquots around. 

Since this is unit resolution, you can use these tags for 10-plex multiplexing on ion traps or triple quads as well. AND the collision energy needed to liberate the reporter ions is so much closer to what you need to fragment the peptide backbone that I can't imagine using the TMT6/10/11-plex reagents ever again. 
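
Since there was confusion about it, the "every other tag" trick can be sketched like this. The channel labels are the standard TMTPro 18-plex names. The function name is made up, and dropping the C-tags (rather than the N-tags) is just what we happen to do.

```python
# Standard TMTPro 18-plex channel labels, in reporter mass order.
TMTPRO_18 = [
    "126",
    "127N", "127C", "128N", "128C", "129N", "129C", "130N", "130C",
    "131N", "131C", "132N", "132C", "133N", "133C", "134N", "134C",
    "135N",
]


def unit_resolution_channels(channels):
    """Keep one channel per nominal reporter mass by dropping the C-tags."""
    kept = [label for label in channels if not label.endswith("C")]
    nominal_masses = [int(label[:3]) for label in kept]
    # Sanity check: no two kept channels share a nominal mass.
    assert len(nominal_masses) == len(set(nominal_masses))
    return kept


n_set = unit_resolution_channels(TMTPRO_18)
c_set = [label for label in TMTPRO_18 if label.endswith("C")]

print(n_set)  # 126 plus the nine N-tags
print(c_set)  # the leftover C-tags
```

Every kept reporter is about 1 Da from its neighbors, so an instrument that can't split the N and C pairs can still tell the channels apart.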

Having the ability to leverage these reagents for the super cool nPOP method is going to be a major win for us. The way this is configured it looks like we can prep almost 1,500 multiplexed single cells at a time. 

Tuesday, April 12, 2022

Retention time prediction for TMT peptides!


More and more cool learning and prediction tools to boost our confidence in our IDs! There are two others that I'm dying to talk about. I was looking to see if either was officially launched and found this new one.

Monday, April 11, 2022

Ad hoc learning for phosphopeptide identifications!


Okay...I'm going to post this, but I'll lead with the fact that I'm on the fence about leaving this in the drafts folder. 

It looks really cool and the idea seems smart, but if you aren't running on Linux and ready to do a decent amount of command line typing to set this up for your GPU, it isn't for you. If you're like "omg, I love typing while being strictly penalized for every tiny mistake I make and I didn't know there were operating systems outside of Linux" this one IS for you: 

You can get all this at GitLab here

Sunday, April 10, 2022

MaxQuant 2.1 is out -- Features ZenoTOF support and TMTPro 16/18!


Trying to figure out how to process that ZenoTOF IDA data? Me too! MaxQuant has native support and you don't have to use my buggy fix for TMTPro data analysis. It's default. I bet there are other features to support the jump all the way to 2.something! 

Tuesday, April 5, 2022

Very important corrections on recent posts!!

I make a lot of mistakes. Probably more than average for someone my age. As such, I sometimes need to post important corrections -- and here are two serious mistakes I've recently made on this blog.

A scientist that I respect a lot who now has a title that sounds like it is probably a lot of work contacted me to provide more up-to-date documents, as the copies I posted expired several years ago.

The vendor is much more compatible with developers and even has a Github up that helps you walk through everything you need to alter the operation of your factory Orbitrap hardware. You can find this here. The things I highlighted appear to be totally absent in the new agreements, but it's a lot of words and I don't have time for all of them right now. 

Number 2: 

The company is a joke company! 

Please disregard this post as well! I'll make further corrections as I have time. 

Monday, April 4, 2022

Proteomics Repository Updates for 2022!


I'll lump these two new papers together here, but they're both worth looking at individually. As proteomics keeps getting bigger, the big repositories and online resources we are probably starting to take for granted can't stay static. And that isn't just scaling for data storage. They have tons of powers that most of us rarely use. 

For what is going on at PRIDE check out this one! 

And for what is happening at iProX here

Sunday, April 3, 2022

Well....ummmm.....protein AGE affects PTMs?


Imma just leave this here to think about on my commute....

The TIMSTOF SCP Prototype paper is out!


I haven't had a chance to read this one all the way through yet, and we'll be talking about it in lab a lot when everyone gets back from Experimental Biology, but I've read far enough to know that a lot has changed since the original preprint (almost 2 years ago? I think it was right around ASMS 2020?). 

I had a peer reviewer around 9 months ago talk about how much better this data is than one of our single cell studies we had submitted, and since other authors had to sign off on my response I couldn't type what I wanted, which was: What an amazing and intuitive assessment! I'll be right back after I cut up my Department Chair's 10 month old, $1.3M instrument, add a vacuum pump, a right angle to the ion beam, fabricate an ion transfer tube with a massive hole in the center and modify the tune software to accept all these changes! Thanks for helping!

The fact of the matter is that this is a really nice study. Now that we do have the TIMSTOF SCP, which is based on the prototypes much of this data was generated on, and we have this increased signal and possibly even dynamic range, we're definitely going to try this sample prep and dia-PASEF based approach. (We're in the early learning curve....I've got 4 files that are worth keeping so far, but I'm getting there!)

If you've spent a lot of time on the preprint and wonder if the accepted version is worth your time, I've found a lot here that either I forgot, or I think is new. You aaaaaalmost get a MaxQuantDIA vs. Spectronaut vs. DIA-NN here. What you get is a very polite statement that this wasn't intended to be a software comparison, but...we'll...only show or talk about data from DIA-NN for the rest of the paper....(I forgot the nomenclature for the software). 

Again, it's a really nice study and a really impressive tool and there is some really smart work here comparing single cell seq with their results. I do think it is critical, though, to keep in mind that the majority of mass spectrometry data that has been generated on actual single human cells has been generated on the nearly 10 year old Q Exactive Classic platform. You aren't locked out of SCP if you've got older hardware. The real challenges are getting your cells prepped and figuring out what to do with the relatively low coverage proteomes of 1,000 things that should be a lot more similar than the data suggests they are. 
And maybe, I guess, convincing some reviewers that it's okay to use the hardware that you actually have access to. 

Saturday, April 2, 2022

SOMASCAN vs O-Link -- nextgen proteomics deathmatch!

(I couldn't find any pictures of the actual instrument that does one of these things, so I used my best judgement.) 

The blog hasn't had a head-to-head deathmatch in a while, but wooooo-- what a great topic for one! 

I gotta move fast -- and I've mentioned this preprint in passing a while back, but it keeps coming up at meetings and in calls, and it is absolutely worth revisiting.

That sounds pretty benign, right? 

SomaScan is an aptamer based proteomics technology (nucleotide thingies are specifically designed to bind to whatever you want) that has been around for several years now. It has been getting a lot of attention recently thanks to a couple of big studies and the amazing amount of capital that the company pours into advertising. I think a deep dive into ad costs vs investment capital for this tech would be a lot of fun. Strangely, you can't find a picture of the device that generates the "data." Many of the searches direct you back to this blog, and I'm probably not being all that helpful, but I'll try harder. 

O-link is a newer proteomics technology that is antibody powered, but actually pretty clever for a technology based on a reagent that millions of years of evolution has worked very very hard to make unpredictable. To get around the fact that antibodies will randomly bind to whatever they want to because binding to new things is legitimately their biological task, O-link requires that two matched antibodies bind to a target protein. When that happens the oligonucleotides on antibody A and antibody B will hybridize (bind together into double stranded nucleotides) -- amplification of that sequence can only happen if hybridization happens. Only antibody A? No signal. Same for B. This should rule out some false positives, but due to the unpredictable nature of antibodies, I expect an increase in false negatives also must occur. 
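
The false positive vs. false negative trade-off I'm guessing at can be sketched with some back-of-the-envelope probability. This is my toy model, not anything from the vendor or the preprint: it assumes the two antibodies bind independently, and the binding probabilities are invented numbers.

```python
def dual_binder_rates(sens_a, sens_b, offtarget_a, offtarget_b):
    """Chance of signal when BOTH antibodies must bind the same protein.

    sens_*      : probability each antibody binds its real target
    offtarget_* : probability each antibody binds some wrong protein
    Assumes the two binding events are independent (a big assumption).
    """
    true_signal = sens_a * sens_b
    false_signal = offtarget_a * offtarget_b
    return true_signal, false_signal


# Hypothetical antibodies: 90% on-target binding, 5% off-target binding each.
true_signal, false_signal = dual_binder_rates(0.9, 0.9, 0.05, 0.05)

print(round(true_signal, 4))   # lower than either antibody alone
print(round(false_signal, 4))  # much lower than either antibody alone
```

Under those assumptions, requiring both binders crushes the off-target signal, but it also means you miss the real target more often than you would with a single antibody.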

Deathmatch time! Ready? 

(Unrelated. Just popped up when I was looking for GIFs of impatient people.)

How'd they do this deathmatch? They looked at 871 proteins that both technologies could quantify.

In 10,000 individuals(!?!?!?) These things do seriously have some throughput capabilities! 

Good news? Neither technology appears to have a systematic bias toward specific pathways. That suggests there isn't a copy number bias or anything. 

Bad news? 

One does much worse with membrane proteins than the other, there is a thorough evaluation of that in the study. 

Correlation of results between the two? 0.38. You know, a little bit closer to "a few things match here and there" than to "NOTHING MATCHES." 

Which is right? Is either? Well, these authors go to the trusty pQTLs and genetic elements to see how they agree. Which is...ummm.....
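
For a feel of what a number like 0.38 means, here is a small sketch of a correlation calculation. I'm not claiming this is the statistic the authors used. It is a plain Pearson correlation written out by hand, the measurements are toy values, and the shared variance line only applies if you treat 0.38 as a Pearson r.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Toy measurements of one protein in four people on two platforms.
platform_a = [1.0, 2.0, 3.0, 4.0]
agrees = [2.0, 4.0, 6.0, 8.0]      # different scale, same ranking
disagrees = [8.0, 6.0, 4.0, 2.0]   # exactly backwards

print(pearson(platform_a, agrees))
print(pearson(platform_a, disagrees))

# If 0.38 were a Pearson r, the variance shared between the platforms is r squared.
shared_variance = round(0.38 ** 2, 4)
print(shared_variance)
```

Read that way, the two platforms would share something like 14% of their variance, which is why I keep yelling about it.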

Okay, so one of them might be great. Both of them might be terrible. 

The one thing that is clear from these data is that both of them cannot be right. At least one of these technologies is bad at measuring protein abundance. And this is where my over-the-top frustration comes in. 
The money being poured into these technologies is on levels that traditional mass spec based proteomics has never really had access to. We know what our problems are. We know that we've been too slow in the past, we haven't scaled well or automated much and we haven't standardized. But here is the thing -- how much of this is due to limited resources? If the best group you could think of right this second had access to the kinds of resources these unproven "next gen" proteomics systems have, would they fix all of that? I sure think so. 
My argument evaporates as soon as someone shows some convincing evidence that one of these things can measure a protein accurately. I'll keep complaining, and waiting, till that happens. 

Friday, April 1, 2022

Finally! Another company has built an Orbitrap system!

For years and years and years we've waited for this to happen.



US7714283B2 and GB2434484B were filed during my first year of grad school. Not to brag, but that was a long fucking time ago and patents and trademarks, regardless of how many excuses you make for extensions, don't last forever. 

Someone who actually knows about these things told me after....a responsible number of drinks.... that one of the primary reasons Alexander Makarov hasn't been seriously considered for a Nobel has to do with the fact that one company exclusively owns this and has made $e9 off of it. Now, I might have gotten some details wrong and he/she might have been incorrect during this conversation about a decade ago. But, you can't convince me the Orbi doesn't merit a medal in Stockholm and with the monopoly broken, maybe it finally happens? 

One of the primary reasons I've heard for why the vendor isn't concerned is the sheer difficulty in machining the Orbitrap to specifications. Even the vendor has a high failure rate, and used to have a failure rate high enough that you could win imperfect Orbis at some conferences. In the era of the Motorola Razr, people were making Orbitraps. Can it really be THAT hard in 2022? 

I guess not, because I just got schematics from a company called Veridian Dynamics on the first externally developed Orbitrap system, and at first glance it might be too amazing to believe. 

Hold on to your hat, partner. What about 4? 

Introducing the SLE9D-MS System! 

This design is the brain child of Dr. Ben Neely who has secretly been leading the design of this hardware for a couple of years. 

9 dimensional. 
Mass spec. 

Unnecessary(?) edit 4/4/2022? So....umm....this was totally an April Fool's joke that has some largely unintentional synergy that somehow made it even more funny.