Sunday, November 3, 2024

What's the best software for DDA analysis of host-cell protein contamination in biotherapeutics?

 


This is an interesting analysis for a critical workflow in biopharma/biotherapeutics



It's applicable in other places, but where this is used the most is when an antibody drug is manufactured in a Chinese hamster ovary (CHO) cell line or something similar and you need to make sure that not much of the host cell proteome is going along for the ride.

In this case, do our algorithms designed for deep global analysis make the cut? In the end it looks like they all do pretty well (Figure 5 is a great summary), though some are clearly better than others. One commercial package I somehow wasn't even aware of (Byos, from Protein Metrics) actually has a specific workflow for host cell proteins - and you can imagine that if they were thinking about HCPs during the design they'd do a good job - and that does seem to be the case. 

Saturday, November 2, 2024

Leveraging proteomics to develop an accurate model system for human fallopian tubes!

 


We've eventually got to get away from animal models for human studies. There are clearly dramatic differences between mouse/rat/nematode/yeast biology and human biology that lead to all sorts of false discoveries. These differences are so drastic that some funding agencies have rapidly approaching hard deadlines after which they just won't fund the stuff.

But human biology is tough to mimic in a bioreactor, even if those things are increasingly inexpensive and easier to use. Even modeling something as relatively "simple" as the blood brain barrier is not at all simple. How the f(ourier) do you model something as important and complex (and amazingly under-studied) as the human fallopian tubes??? 

Like this! 


Okay - so one way you could do this would be to get some human donor samples, do some really amazing imaging, and then dig deep through previously deposited data to help construct yourself a map. (P.S. I love that the authors gave such a nice shoutout to my long time neighbors in the JHU Microscopy Core.) Could you do a lot better? What if you also captured physiological function like oocyte transport?!? Could you end up with a map that provided a dynamic understanding of how the system changes during physiological function? That would help as well, but what if you used that information to build the most accurate in vitro system for studying these tissues that you could? This multi-institute team did something like all of this. Cells were grown out to organoids which were coaxed into "assembloids" (we're far outside of my expertise here, so please forgive my interpretations) and, by controlling the matrices and how these cells were coaxed to differentiate and assemble, they get there. Proteomics was used along the way (TIMSTOF HT with EvoSep One, diaPASEF, analysis with SpectroNaut) to verify the system's proteome expression profiles.

Even for someone (me) who couldn't follow a lot of the biology/cell differentiation stuff, this is clearly exciting work. I'd 1,000x prefer having access to engineered systems for my pharmacology work over mouse models. Up that another 1,000x if those systems were backed up with proteomic data showing that they accurately represent human biology. 

Thursday, October 31, 2024

Can Stellar hit "next gen" numbers of protein targets?

 


I've got a pile of huge O-link population level proteomics studies on my desktop to rant about at some point. However, every once in a while a blog post consumes so much of my time that I think "maybe I should float this by an editor to see if I can get real world credit for those hours" - and I made that mistake yet again with that one. 

I love proteomic discovery work. I don't think we've found anywhere near all the protein post-translational modifications that matter in humans, and that's how we'll find them. Heck, we don't even know how many f'ing proteins there even are. The biologists, however, don't seem to have the patience for these primary discoveries. They want large n targeted studies and they want them yesterday. 

You can PRM a decent number of targets at high resolution. A stellar student in my group did 300 targets on a ZenoTOF but that's WAY more than I've personally set up. I don't know if I've done more than 40 on a Q-Orbitrap ever. 

Sure, QQQs are fast, but in real biological matrices for MRMs/SRMs? You need a crapload of transitions and you need a synthetic peptide for every target. There's just too much noise. 

An O-link bigtime assay (maybe that's what they call it? something like that) that uses a super high throughput readout (an increasingly unreliable one - some of that new Illumina super multiplex stuff has a load of garbage artifacts in it) can target (and, please remember, target doesn't necessarily mean detect) thousands of proteins, and that scale really hasn't been in reach for LCMS.

Until now?  Can the handy dandy super fast little unit resolution ion trap target on that same scale? These people seem to think so. 


At first this doesn't seem all that impressive, right? I did nearly all of my PhD work on the new-at-the-time 3200 QQQTrap system. Triple quad with signal boosting trap on the end. Yawn. Same stuff. 

My QQQTrap couldn't do >100 scans/second. And it sure as heck wasn't more sensitive than any QQQ on the market today. The trick, though, is the combination of really sophisticated instrument control software and quantitative software that everyone in the field except for me is really good at using. 

The authors take the Biognosys ~600-target plasma reagent kit and demonstrate really good quan even when running 100 samples/day. What's that? 14 minutes run to run? Something like that. But with the number of concurrent targets that can be measured per cycle, it's clear they could add on a crapload more and still have the cycle time to pull them off. It'll probably be a question for the community soon: how many points of confidence do we need for unit resolution PRMs? Given the full-blown acceptance of tools that predict peptide fragmentation patterns, retention times, and ion mobility (when available) in wide-window experiments....I dunno, it's hard to see too many problems with targeted quan without a heavy internal standard for every peptide. I'd betcha two of my kid's newly acquired KitKat bars that the quan is better than the O-linky and Aptamer-ma-bobs. 
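If you want to sanity check that headroom claim, the scheduling math is simple enough to sketch. Here's a minimal back-of-the-envelope version - the scan rate, peak width, and points-per-peak numbers are my own assumptions for illustration, not values from the paper:

```python
# Rough estimate of how many concurrent PRM targets fit in a duty cycle.
# All numbers below are illustrative assumptions, not values from the paper.

def concurrent_target_capacity(scans_per_sec, peak_width_sec, points_per_peak):
    """How many precursors can share a retention time window while each still
    gets the desired number of points across its chromatographic peak."""
    max_cycle_time_sec = peak_width_sec / points_per_peak  # time budget per full cycle
    return int(scans_per_sec * max_cycle_time_sec)

# Example: 100 Hz MS/MS, 20-second-wide peaks, 8 points across each peak
print(concurrent_target_capacity(100, 20, 8))  # -> 250 concurrent targets
```

With retention time scheduling, the total assay can be many multiples of that concurrent capacity, which is why a 600-ish target kit at 100 samples/day still looks like it leaves room to grow.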

Wednesday, October 30, 2024

msqrob2PTM - Normalize PTM data against global proteomics to actually find the important PTMs!

If you take this great new paper and print out any figure, you could approach any scientist on earth from a long distance while holding it up. As soon as they could see it at all, they'd know: everything in this paper is written in R. Which is totally cool, I'm not making fun. It's just the most R'y proteomics paper I've opened in a while. 


What is it? It's a statistically valid way of taking your PTM-level peptide data and normalizing it against your whole protein data, so you can get stuff like relative phospho-site occupancy per molecule. Great for those biologists who expect this to be the standard output in a PTM study. 
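To make the core idea concrete: the correction is basically done on the log scale, subtracting the parent protein signal from the PTM site signal. Below is a tiny conceptual sketch in Python - the real msqrob2PTM is an R/Bioconductor package with a proper statistical model on top, so treat this as the cartoon version only:

```python
import numpy as np

# Conceptual sketch only - NOT the msqrob2PTM implementation (that's R/Bioconductor
# with a full statistical model). The point: correct each PTM-site intensity for
# its parent protein so "regulated" means occupancy changed, not protein abundance.

def normalize_ptm_to_protein(ptm_log2, protein_log2):
    """Subtract parent-protein log2 intensities from PTM-site log2 intensities,
    sample by sample. Both inputs are arrays shaped (n_samples,)."""
    return np.asarray(ptm_log2) - np.asarray(protein_log2)

# Toy example: the phospho-site looks 4x up in the last two samples, but the
# protein is 4x up too, so the normalized values barely move (no occupancy change).
site    = np.log2([1000, 1100, 4000, 4400])
protein = np.log2([5000, 5500, 20000, 22000])
print(normalize_ptm_to_protein(site, protein))  # ~ -2.32 across all samples
```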

Tuesday, October 29, 2024

Shine up that crappy figure with public domain illustrations from NIAID BioArt!

 


What an awesome new resource from the great scientific illustration team at the US NIAID! 

Check them out here at:  https://bioart.niaid.nih.gov/

Shine up your ugly figures with these great illustrations. If you are concerned about what you can reuse and how, just filter for the ones that are 100% public domain! Which, at least on the protein tab (which is all that really matters, right?), appears to be every single one of them. 

A Don Hunt story (John's Version)?

Several years ago I got to introduce Don Hunt at a packed meeting in Bethesda. Trying to easily summarize the list of accomplishments and pivot points in protein biochemistry where he has played a role? Challenging. Trying to do it so he has time to show new data? Impossible. 

If you've got 5 minutes, this perspective by John something-or-other can really put some of this into ...pers...pective.... Hmmm.... article classifications aren't always so neatly matched to the literal meaning of the category. This one is. 


Bonus - the tale of the twists and turns in getting Sequest published should inspire anyone out there who has really believed in an idea that hasn't gotten the warmest reviews from peers! Totally fun read. 

Also, I really enjoyed reference 31 (I think - I read it yesterday). 






Monday, October 28, 2024

Gain of cysteine missense mutations in both disease and healthy(?) human tissues?!?

 


In shotgun proteomics we generally do our best to ignore cysteines - and especially the super important PTMs that they tend to carry. We reduce the cysteines (losing the PTMs and anything else, like drugs, that might bind to them) and then we put in harsh alkylating chemicals to make sure those cysteines never do anything remotely biological ever again. We assume those 2 reactions occurred with 100.0000000000% efficiency, and we move on. 
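For context, that 100% efficiency assumption is baked right into how searches are set up: alkylation gets treated as a fixed modification on every single cysteine. A minimal sketch of the arithmetic, assuming the usual carbamidomethylation chemistry (the peptide and helper function are made up for illustration):

```python
# Minimal sketch: why search engines treat Cys alkylation as a "fixed" modification.
# Monoisotopic residue masses in Da; carbamidomethylation adds +57.02146 per Cys.
RESIDUE_MASS = {"A": 71.03711, "C": 103.00919, "K": 128.09496,
                "L": 113.08406, "S": 87.03203, "V": 99.06841}
WATER = 18.01056
CARBAMIDOMETHYL = 57.02146

def peptide_mass(seq, alkylate_cys=True):
    """Neutral monoisotopic peptide mass, assuming every Cys reacted (100% efficiency)."""
    mass = WATER + sum(RESIDUE_MASS[aa] for aa in seq)
    if alkylate_cys:
        mass += CARBAMIDOMETHYL * seq.count("C")
    return mass

print(round(peptide_mass("ACLSK"), 5))                       # the mass the search engine looks for
print(round(peptide_mass("ACLSK", alkylate_cys=False), 5))   # what an unreacted Cys would weigh
```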

Chemoproteomics people, however, tend to be very interested in the drugs that bind to active things like cysteines, so they have to use other approaches. And....maybe the coolest discoveries of this awesome new paper weren't a surprise to the authors in the middle of their chemoproteomics work, but I felt like there was an air of increasing surprise (there certainly was for me!) as I read through it! 

The paper isn't a short read because this team did a lot. On the mass spec front, cysteine pull-downs and whole proteomes (which did employ reduction/alkylation) were analyzed on an Orbitrap Eclipse with a nanoLC system. TMT was also used at times.

On the genomics front, cell lines were sequenced and the variant call files were integrated into the database search using a 2-step process with MSFragger run from the command line. I'm not sure if this was just their typical way of doing things or whether integrating the normal FASTA with the processed peptide variants and controlling FDR the way they did required some fine-tuning that is easier to set up outside of the GUI. 
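I have no idea what their exact pipeline looked like, but the general idea of folding missense calls into the search space can be sketched out. Everything below (the accession, the tuple format, the helper names) is a hypothetical illustration of the concept, not their workflow:

```python
# Hypothetical sketch of building variant protein entries to append to a search database.
# Assumes the genomic missense calls have already been translated into protein
# coordinates, e.g. ("P01116", 12, "G", "C") meaning G12C on that protein.

def apply_missense(sequence, position, ref_aa, alt_aa):
    """Return the variant protein sequence; position is 1-based."""
    idx = position - 1
    if sequence[idx] != ref_aa:
        raise ValueError(f"reference mismatch at {position}: {sequence[idx]} != {ref_aa}")
    return sequence[:idx] + alt_aa + sequence[idx + 1:]

def write_variant_fasta(proteins, variants, path):
    """proteins: {accession: sequence}; variants: list of (accession, pos, ref, alt) tuples."""
    with open(path, "w") as out:
        for acc, pos, ref, alt in variants:
            var_seq = apply_missense(proteins[acc], pos, ref, alt)
            out.write(f">{acc}_{ref}{pos}{alt} missense variant of {acc}\n{var_seq}\n")

# The variant entries get appended to the normal FASTA before searching - and since
# single-substitution entries are nearly identical to their parents, FDR needs care.
```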

These cancer cell lines largely suffer from problems repairing mismatch errors in their DNA. (Deficient in MisMatch Repair, or dMMR). Makes sense, right? Cancer is often a DNA disease. Errors propagate until you've created renegade cells that do whatever they want. Missense mutations typically end up changing one amino acid to another. Why would new cysteines be the most likely outcome?  

From a pure codon perspective, it doesn't seem like the most likely outcome! If you are randomly altering bases in DNA you'd think Leucine or Arginine (6 codons each) would be the most likely random occurrence, right? Cysteine only has 2. (Codon table stolen from Wikipedia.)
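If you want to be a little more careful than just counting codons, you can enumerate every single-base substitution in the standard codon table and tally which amino acids can be gained. A quick sketch - it assumes uniform mutation rates and codon usage, which real dMMR genomes definitely don't have:

```python
from itertools import product
from collections import Counter

# Standard codon table written compactly: amino acids listed in TCAG x TCAG x TCAG order.
BASES = "TCAG"
AA = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
      "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AA)}

# Count every (codon, single-base change) pair that gains each amino acid.
gains = Counter()
for codon, aa in CODON_TABLE.items():
    if aa == "*":                                   # skip stop codons as the starting point
        continue
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            new_aa = CODON_TABLE[codon[:pos] + base + codon[pos + 1:]]
            if new_aa not in ("*", aa):             # missense only, no nonsense/synonymous
                gains[new_aa] += 1

for aa, n in gains.most_common():
    print(aa, n)    # cysteine lands well below leucine, arginine, and serine
```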


...but we're talking about the selective pressure of cancer cells....does having more cysteines confer some sort of advantage? Beyond me to think about, but it sure is weird. 

Where it gets weirder is that it looks like these gain-of-cysteine missense mutations also show up in the healthy human data they evaluated.....again....beyond this blogger to really think about - but it should go on the huge pile of reasons to question our current ability to target every human protein. 

As an aside, I found myself reading between the lines of this one more than I should have, but I could imagine someone doing chemoproteomics of cysteine-binding drugs (maybe because I spent a lot of time on sotorasib the last few years) and then finding 800 peptides the drug bound to that had no presence in any human FASTA database. That alone would justify the time they put into this great thought-provoking study! 

Sunday, October 27, 2024

De novo analysis of poly(!!) clonal antibodies from human blood!

 



I can't seem to get this picture in higher resolution, but the paper is open access here

I have a list of things in my head that I don't think you can do with a mass spectrometer. Or should do. Or, maybe, if you do it, it's definitely going to suck.

Mixtures of polyclonal antibodies? Definitely on that list. Mixtures taken from human serum? Yeah, good luck with that! 

There are a LOT of steps here from the best antibody characterization group I personally know of, but solving the absurd mixture of proteoforms that are present in human serum following a viral infection? 

It's hard to quantify KRAS when you know you've got a copy of the WT and one copy of one of the convenient mutant proteoforms on the other chromosome. That's 2 proteoforms of a little (though annoying) protein. mAbs class switch and glycosylate and crosslink in weird places if you look at them funny. So you end up with these 10x larger proteins with a big conserved region shared across all the different variants and then a mixture of craziness from the variable regions way down in the low abundance range! 

Doing it required all those enzymes above with both bottom-up and middle-down proteomics. An Exploris 240 and an Orbitrap Eclipse were both used. I'm a little unclear from the methods, but I'd assume the middle-down work definitely went on the Eclipse for EThcD, though it may have been used for most things. Also, this is the first time I've seen an EvoSep used for mAb mapping and I'm totally psyched that it works well for it. (You know, it's kinda tuned for global proteomics, and in peptide mapping you want the little tiny peptides and the big ones as well.) 

And - this is a paper from a company, right, and - GASP - all the data is up on MASSIVE if you want to check it out for yourself. No joke, the 3 papers I had in front of this one to read went 0/3 for publicly available data, so you won't hear about them here! 

Friday, October 25, 2024

Parallelizing the most challenging steps in proteomic analysis on the cloud!

 


I got this preprint sent to me after my brainstorming on core hour usage for proteomics. I was largely doing that to figure out whether it was worth my time to go slurming around with the 50,000 CPU core hours I just got access to. What I didn't get into in that post was where the Harvard team found their HPC spending the most time - it was, by far, on match between runs. 

In this preprint, this team demonstrates some early results in parallelizing that pain point on the Cloud. 


The best figure in the paper is probably the panel at the top. Go to 1,000 files and - yeah - you use a lot of cores, but you cut 6 days of processing time down to a few hours. Since clouds (which are just someone else's HPC) tend to do a really good job of charging you only for the resources you actually use (because it's a highly competitive commercial environment, and if they didn't do it right you'd give your money to someone else), the costs end up working out to just about the same. Same cost, but you get your results back almost a week sooner? Everyone is taking that deal. 
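The arithmetic behind that deal is about as simple as it gets. A toy sketch with made-up numbers - the per-core-hour rate and the core-hour total are my assumptions, not figures from the preprint:

```python
# Toy model: wall-clock time shrinks with parallelism, but total core-hours (and
# therefore cost) stay roughly flat as long as the work splits cleanly.
# All numbers are illustrative, not taken from the preprint.

def cloud_run(total_core_hours, n_cores, rate_per_core_hour):
    wall_clock_hours = total_core_hours / n_cores
    cost_usd = total_core_hours * rate_per_core_hour
    return wall_clock_hours, cost_usd

serial   = cloud_run(total_core_hours=144, n_cores=1,  rate_per_core_hour=0.05)
parallel = cloud_run(total_core_hours=144, n_cores=48, rate_per_core_hour=0.05)
print(serial)    # (144.0, 7.2) -> six days on one core
print(parallel)  # (3.0, 7.2)   -> a few hours on 48 cores, same bill
```

Of course, match between runs doesn't split perfectly cleanly across files, which is presumably where the actual engineering in the preprint went.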

Again, very preliminary, but you should be excited because you know someone who would like to talk to you about their 5,000 FFPE blocks for proteomics and you can only avoid them for so long. Pretty cool to know that someone is thinking about a bottleneck you haven't gotten to yet! 

Thursday, October 24, 2024

iHUPO release of the upgraded ZenoTOF (+)

 


I only have marketing stuff to go off of, but the ZenoTOF platform got an upgrade at iHUPO in the form of the ZenoTOF+. I guess the other one is now the ZenoTOF Classic or the ZenoTOF Lite. You can read the marketing stuff here. 

The system keeps EAD and Zeno pulsing, but appears to add a fast sliding(?) quad ramp function. 


By rapidly parallelizing the pulse/eject in the Zeno trap, it looks like the efficiency goes way, way up, allowing some ridiculously fast acquisition times for high loads as well as impressive coverage of low load samples (top panel). To get to the numbers in that panel they did drop the microflow and move over to an IonOpticks Aurora 15 cm column at 150 nL/min. 

You know, for the whole time I've been doing mass spec, when you've needed a QQQ it's made sense to talk to each vendor and either visit a demo lab or two or send out samples. How cool is it that it now makes sense to do the same thing for global proteomics? Hopefully this all translates to competitive pricing!