Sunday, January 28, 2024

Crosslinking proteomics plus AlphaFold2 = higher confidence data all around!


I can't really assess how good this new toolkit is, but it makes sense and the illustrations are gorgeous!  

My argument for why it is worth posting is that I've never once put crosslinking reagent into some proteins (or....well....had someone else do it) and had my spectra improve. Those chemicals generally do the opposite. Then, particularly if you're using a fancy tribrid with the real time crosslinking stuff, a lot of the data you get is ion trap (blech) and it's tough to really tell what is what. 

XLMS-tools (on git?lab?) is an ambitious way to leverage the fancy predictive capabilities of AlphaFold 2 to increase your confidence in those putative identifications - AND the opposite. 

Sounds smart, the software and data are publicly available and the use of proteins in open/closed configurations as spot checks is a compelling argument.

Funny thing I learned at EuBiC is that AlphaFold has very strict high mass cutoffs, so this probably wouldn't help you much for a big protein. To fold big proteins they cut the sequences into little chunks and fold those. Probably less dumb than it sounds. But that's AlphaFold's problem, not this nice new study. 

Saturday, January 27, 2024

Single cell proteomics - beyond cell status - the Broad makes a statement!


Woooo.......okay.....this data from the Broad (pronounced like "Toad," don't be that dude)

You aren't supposed to look at the author list and think "oh shit. they're working together? this is gonna be good....." but that's exactly what I thought when I saw the first two names on this. 

You could argue this is maybe the most expensive way possible to do single cell proteomics, but the last name on this preprint has the money to buy an Exploris exclusively to keep his lunch warm and two years from now this'll be 3 generations of hardware behind, so who cares?  😇

Punchline? This features the new prep on the CellenOne system for label free prep -- that can be directly centrifuged into the EvoTips! Then you take that extremely low loss prep and run it on the ULTRA! They use the lovely IonOpticks columns and evaluate 20, 40 and 80SPD runs (which is, before controls, the number of cells/day) running diaPASEF with the ULTRAshort cycle times (8 windows/cycle?) and reproducibly pull 4k proteins/cell.

Straight up, I've never personally gotten 4k proteins on a normal sized single cell. You dump some chemotherapy drugs in so they expand? Heck yeah I can do it then! 

The statement here, though, is in the title. They're getting enough depth that they can track actual perturbed pathways in one cell at a time. So we're getting beyond the whole x numbers of proteins. 

I've said and typed this a lot, but not that long ago my QA for 200 nanograms of peptide on a QE  running DDA was like 2,200 proteins in 60min. If I hit that number, the QE was ready for anything. 4k was a high water mark for me on a looooooong gradient. To reliably get that out of 0.1% of the starting material is absurd. Super impressive stuff that suggests maybe I should do something else with my time....

Friday, January 26, 2024

Proteomic researchers solve one of the big riddles of why tardigrades are indestructible!


(.gif taken from this article, which did not list any limitations on reuse)

You know tardigrades, right? I think they call them water bears or something as well. As animated above, they're a whole lot tougher than most organisms, particularly anything anywhere near their size. Besides being very tough normally, they can also go into an even tougher hibernation thing as well. 

WANNA KNOW HOW THEY DO IT? Check out this new study from Amanda Smythers et al.! 

It's cysteine modifications! Yes, the same ones that you can't see because you dumped DTT into your protein lysates and poured in a bunch of iodoacetamide or methylthiols or something. It makes you wonder how many other signaling pathways we are losing because of this.

Thursday, January 25, 2024

DIAGUI - Easily convert that DIA-NN data into what you're looking for!


Okay! Multiple groups have been working on this problem simultaneously! 

Again, though, the only problem with DIA-NN (and some of these other amazing tools) is that they output a bunch of sheets. We've got 10 good programs for taking MaxQuant sheets and making them into data people want. What if we do this with DIA-NN? 

Welcome to the party, DIAgui! 

There have always been these nice R Tools that you upload your DIA-NN results into. But some of us hate R. 

DIAgui runs the DIA-NN R package without you having to touch the nasty stuff yourself. AND it adds some smart new features and capabilities. 

You don't get the visualization of the raw data that you do with MassDASH, but my recent experience is that just about exactly no one wants to look at mass spectra anyway unless they absolutely have to and there is no other option! 

Wednesday, January 24, 2024

Quantify 102 proteomics samples per LCMS run!! TAG-IBT16!



Okay, so I'm under a tight deadline right now so short posts, if any, but multiplexing 102 samples at once is worth taking a quick look at! 

Obviously, you need an ultra high resolution instrument for this, and they appear to have used MS3 - so an Orbitrap "Tribrid" or similar.

Interestingly they processed the data in Proteome Discoverer 1.4, which I honestly don't think I even have a copy of anymore. I suspect that was because you could fully alter your multiplex processing method so they could build a 102 plex data interpretation method.

I suspect there are downsides here....but......I sure could come up with something this morning I could use a higher multiplex capability for.

Actually - I'll tell you about it. In our TIMSTOF based multiplex single cell proteomics we can multiplex 7 single cells. I use 1 blank, 1 carrier and then we skip the channels between the carrier, so 127n, 128c, 129n, 130c, 131n, 132c, and 133n are used for single cells.

We have an 8 condition experiment in the works. So........we either randomize our 8 conditions across 7 lanes and then deconvolute that mess later? Ignore the fact that the PI has career altering dysgraphia/dyslexia and that still sucks. Add in that latter fact and that sounds like either the PI never once touches anything or Colten spends 3 weeks of extremely long days setting up a huge SCP study on material that has taken him literal years to develop and he walks away with nothing, right? So we're only going to use 4 channels in each plex. Label free would almost be easier but we need an accurate and generally unbiased distribution of the cells present to pull this off at a biological level so 4x faster is still better than nothing. 

I'd take ONE additional channel here. 92 more? 

Monday, January 22, 2024

SCOPE-X - Boost your multiplex single cell relative signal with this one trick!


Holy cow. How did I miss this great and super smart paper!?!?? 

Bogdan's team found a smart solution to reducing the amount of space their carrier channel takes up in each spectrum -- they just don't read it. 

By putting the carrier/boost channel as 126 and then starting the MS/MS scan range at 126.5, the carrier reporter never gets read, so each spectrum is scaled up in relative intensity, which makes the job of the search tool a little easier. 
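To make the relative intensity point concrete, here is a minimal sketch (mine, not code from the paper). The toy spectrum and its intensities are made up for illustration, and `relative_intensities` is a hypothetical helper name. It just shows what happens to base-peak-scaled intensities when everything below a 126.5 first mass is never read.

```python
def relative_intensities(peaks, first_mass=None):
    """Scale peaks to the base peak (= 100) after dropping anything below first_mass."""
    kept = [(mz, inten) for mz, inten in peaks
            if first_mass is None or mz >= first_mass]
    base = max(inten for _, inten in kept)
    return [(mz, round(100.0 * inten / base, 1)) for mz, inten in kept]


# Made-up toy spectrum (illustrative values only): a huge carrier reporter at 126,
# two small single cell reporters, and one peptide fragment ion.
toy_spectrum = [
    (126.128, 10000.0),  # carrier/boost channel
    (127.125, 100.0),    # single cell reporter
    (128.128, 80.0),     # single cell reporter
    (350.200, 500.0),    # peptide fragment
]

full = relative_intensities(toy_spectrum)                    # carrier is the base peak
trimmed = relative_intensities(toy_spectrum, first_mass=126.5)  # carrier never read

for label, spectrum in (("full scan", full), ("start at 126.5", trimmed)):
    print(label, spectrum)
```

In the toy numbers the single cell reporters go from about 1% of the base peak to about 20% of it. Nothing about the ions changed, only what the spectrum is scaled against.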

On the HF-X that they used, I think that this doesn't actually have an effect on the number of ions within the instrument at any phase. There isn't, that I know of, an additional isolation after fragmentation. I'm also very sleepy. Long night. 

However, on those fancy pants things that do perform additional isolation steps prior to reading out the reporter ions, I think this opens up some interesting ideas for how to circumvent the dreaded carrier channel effects on those instruments.

Besides that neat trick, this study is a gold mine of good ideas for applying SCP to drug discovery! I'm sad to say there are a bunch of formulas and math things in it as well, but you can do what I did and just skip that page - and it's still a really good paper. 

Saturday, January 13, 2024

DeepRescore2 - Deep learning of phosphopeptides!


Come on PTM prediction models! I don't want time to go any faster cause, aside from my career, my life has never been better. Slooooooow down. However, I'm looking forward to being able to have deep learning predictions of every known PTM someday! 

Is this a great new step in the right direction?? 

Friday, January 12, 2024

scplainer - are we catching up with the single cell informatics?

Single cell seq data analysis is a wealth of options. I swear, there are 8? commercial packages out there that will help you actually make sense of thousands of noisy low-accuracy measurements of transcripts in single cells. While most people who are good at the stuff are using R or Python pipelines, it's clear they've got the lead. 

We've got some good bioinformaticians in proteomics as well, though! And here is more proof. 

I haven't tried it yet, but I'm dying to. I've got some single cell data from our lab sitting here from drug treated cells that were prepared over the course of a week in 2 separate batches. I think we started with 3,000 single cells at first. 

On the surface, just using a boring old PCA analysis of cells from 2 batches (maybe 560 total shown here) - same cells, same standardized lab SOP for prep, same LCMS method --

Batch 1 and batch 2 are pretty darned clear. 

Is scplainer the answer? We'll see! 

Thursday, January 11, 2024

Proteomics and metabolomics of B. subtilis cells and spores!


Honestly, I figured that given how good the maps of the pathways of how things worked in B. subtilis were when I was in grad school -- largely due to decades of work in Peter Setlow's lab - that we had this bacteria all figured out by now.... 

But, of course we don't, right? When Dr. KBJ was on our podcast she dropped the perceptive bomb on us that we don't even have functional annotation for 1/2 of the genes in E. coli, and that has to be the most studied of the bacteria!  So, time for some Metabolomics and Proteomics of B. subtilis.... from ... Peter Setlow....?....

I had to check Scholar to make sure I wasn't off by a score of years and - yes -  I found a Peter Setlow paper from 1964. And 13 papers in 2023! What a career he is having! 

The paper walks you through a couple of different extraction methods in order to find something universally applicable to a tough bacterium and to extremely ridiculously tough bacterial spores. There is an undertone of concern that, if you use different extraction methods for the two life cycle stages of this organism, the method differences are what you're actually seeing rather than true biological effects, which makes sense. I do think that this is one of the first papers I've seen where the metabolomics and proteomics were both done on a TIMSTOF. I've yet to take the metabolomics plunge with the TIMS and it is nice to see it working so well. It is also explained well. So if you're interested in trying it, this is as nice of a tutorial as I've seen, particularly on the data processing side. 

They use Bruker's solution for the metabolomics and MaxQuant for the proteomics and all the files are up on MassIVE. It appears to be a very solid study. I do have a minor qualm about the implication in the abstract graphic that extensive correlative analysis was performed between metabolites and proteins. If that was done, I don't see evidence of it here. There is a functional analysis of a single pathway that made the main paper, but I'd probably save a full functional analysis of these data for a second paper as well. Now -- the really cool thing, of course - would be to apply this same method across the sporulation cycle -- and then to do the correlative analysis. With an extraction method that works on both ends of the cycle, I won't be surprised to find out that was the plan all along. Probably ought to set a Scholar alert for it. 

Confused about processing single cell proteomics data?


I've spent some time wondering how many people out there have been successful in single cell proteomics and not known they were. I think it's fair to say that the data processing is still more convoluted than bulk lysate proteomics and it's possible to get great LCMS data and not get the most out of the single cell data.

This might be particularly easy to do with DIA-NN which takes longer to process files with lower overall signal to noise ratios. I was working on tutorial videos over the holiday break, but then my voice failed entirely so I only got through DIA-NN and SpectroNaut (which....the latter is probably unnecessary). Now, the end goal of these videos is also to visualize the single cell results in our lab's downstream solution, SCP-Viz, but as noted in the videos you can just stop watching when we get to a simple GUI that can actually make sense of the data, if you'd rather do something much harder. 😇 

How to use SCP-Viz is here

How to process single cell data with DIA-NN is here. (Doing it this way you're talking about maybe 5 minutes/cell you analyze using a very standard 12 threads of a 16 thread desktop PC) 
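If you want to know whether to start the search before or after the weekend, the arithmetic is quick. A rough sketch, assuming the ~5 minutes/cell figure above holds linearly as you add cells (it may not); `diann_hours` is a hypothetical helper, not anything from DIA-NN itself.

```python
def diann_hours(n_cells, minutes_per_cell=5.0):
    """Rough wall-clock estimate in hours, assuming time scales linearly with cell count."""
    return n_cells * minutes_per_cell / 60.0


# Example cell counts are arbitrary, just to show the scale.
for n_cells in (12, 100, 1000):
    print(f"{n_cells} cells: about {diann_hours(n_cells):.1f} hours")
```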

One way to process single cell data with SpectroNaut is here (it is faster to do something similar to what I do with DIA-NN above, but in the new versions there isn't that big of a gap between that and just dumping all your data in as shown here). 

If you're interested in QC'ing your single cell proteomics data post-acquisition and pre-processing and/or cleaning those files of spectra that don't have your single cell reporter ions, as detailed recently here:

Here is how you use the DIDARSCPQC GUI. If you want to use Conor's original Python rather than my 3 button GUI add-on to his program (weirdo) you can find a walkthrough on how to install and run it (I used Spyder within Anaconda) here.

Next up on my list is how to process DDA single cell data in FragPipe, MaxQuant and Proteome Discoverer and I'll add them to the OrsburnLab YouTube channel and to my GitHub as I pull them together. 

Wednesday, January 10, 2024

Phase constrain THE ENTIRE MASS RANGE with GPU enabled Orbitraps!


If you've got a nice newer Orbitrap, I believe several of them have the ability to use the phase constraint calculations to enhance the resolution of a tiny section of the MS/MS mass range before...

...this happens to the Raspberry Pi level processor in your instrument. It's just too much algorithm. At ASMS 2020 I saw a poster or maybe 2 talking about linking GPUs up to an Orbitrap to expand the phase constraint range - and now we've got real results to dig through!   Is this the return of BoxCar? Maybe? It sure doesn't hurt DIA results to get twice the resolution in the same amount of time! 

Tuesday, January 9, 2024

Single spinal neuron proteomics of ALS victims


I've been looking forward to seeing this one out! 

Background: ALS fucking sucks and we still don't know anywhere near enough about it except that there appears to be a genetic component for a very small number of people who will spontaneously get the disease and inevitably die from it. This stupid fucking disease stole one of our field's brightest stars a few years ago and I still have to stop myself from writing him for help when I'm stuck on something. We don't have diagnostics for it, except in maybe 3% of people with a TDP-43 mutation. Fragments of the TDP-43 protein seem to be very linked to the disease because bits of the protein with or without different PTMs appear to accumulate and may be linked to other diseases beyond ALS. 

Unfortunately, motor neurons are surrounded by all sorts of other cells, so homogenization screws up any signal that we should/could have when studying post-mortem tissue from ALS. 

Need a reason for ultra-sensitive mass spectrometry? Here's one! This big collaboration between BioGen and two great labs at BYU uses laser capture microdissection of postmortem tissue from matched healthy controls and ALS victims to cut out the motor neurons and uses nanoPOTS to get all the peptide they can out of the neurons. 

The analysis was performed on an Exploris 480 running what I'm pretty sure was a 20nL/min flow rate over 100 minutes (low flow was achieved using an RSLCnano with a split flow). A data dependent method was used (120,000 MS1 30,000 MS/MS with a 1e5 AGC target and 500ms max injection time) and the results were processed in Proteome Discoverer + INFERYS. Wide windows weren't used in this one. I suspect this study started before some recent method development work from these labs that I've recently read and posted about here.  

End results? Around 500 proteins appear to be significantly different in ALS affected neurons! Way more than anyone has ever seen in an ALS proteomics study. Possible biomarkers? We can hope! Also, we've got a method now for going after other nefarious and poorly understood diseases.

Funny perspective on the study -- "Limitations of this study" includes something like "we were only able to identify 2,500 proteins". Oh. Is that it?  Our second attempt at mouse neurons got us to around 400 proteins/cell over the Xmas holidays but the group will probably do a lot better on the third attempt without me in the way. 

Beautiful amazing study with far reaching ramifications. I can't recommend it enough. 

Monday, January 8, 2024

WARP Columns! Is this the (currently) low cost nanoLC column solution you've been looking for?


Like everyone(?) in LCMS proteomics, I appear to be in this cycle of finding chromatography that I like, then either a big company buys them, relocates production and they start arriving broken in the new boxes, or the guy packing everything retires or dies (rest in power, dude) or just gets too near-sighted to do it anymore. 

Time to start the cycle again with WARP columns! 

Some guy from something that was once called "MyChrome?" started this company last year and I found them while looking for really small custom columns for a very annoying triply phosphorylated drug. They also make NanoLC columns and a 25 cm that is exactly compatible with my setup (nanoViper in, nanoViper or captiveSpray out) was - for a limited time - $500! 

They don't give me anything for posting this. I just get asked a lot about where I'm getting columns, and for a while I think this is where I'm getting some of them. 

Sunday, January 7, 2024

Single cell epigenetics/epiproteomics by mass spectrometry time!

I am super pumped about this one, y'all! Okay, as you might have noticed I LOVE to talk about mass spectrometry and proteomics. So much so that on my 45 min to 1.5 hour commute (each way...blech...) sometimes people will just call me and ask me questions and I'll think later to ask who is interrupting my relaxing death metal time.  The last 3 years or so it has been things like "what is THE application for single cell proteomics where nothing else exists that can compete with it?" 

This is one answer I've kept to myself. (Largely because I've pitched it a bunch of times, including at an epigenetics meeting and the response wasn't encouraging...)

If you are doing single cell proteomics (well) everything has to be perfect. Your sample prep, your instrument, your sample prep, your environmental conditions, and - especially - your sample prep. 

But even if you are having an off day with your sample prep and you only get 158 proteins detected per cell - chances are you've got pretty decent coverage of at least 9 histone proteins. That's because they exist in millions of copies per cell. They do sorta suck because they're packed full of basic residues and trypsin cuts them into little bits, and they elute early enough that they may not retain on your trap column well. 

But there are people out there that really really really care about histone PTMs in single cells! 

Check out this 2023 Nature Biotechnology paper. They successfully measured 2 (TWO!) histone PTMs in single cells! Someone later this year did three, but they can't do 3 simultaneously. They can do a bunch of cells with 2 and then a bunch of cells in the next batch with the third one. 

What my paper shows is that even when using relatively high flow nanoLC/microLC with relatively low coverage per method (by today's insane standards!), I can pretty easily pick up 16 different histone PTMs. And as I increase the throughput, as measured in cells per day, those numbers don't change much. We're sampling the histone peptides over and over and over because there are just so many of them! 

Now - I did use the TIMSTOF SCP system for this study - and it is blazingly fast even when you've only got like 15 nanograms of peptide load on the system and are running at 1uL/min flow rates or higher. However, I've reprocessed just about every single cell proteomics dataset out there in the world (...which...honestly...isn't all that many...) and deep insight into single cell epigenetics/epiproteomics is available in just about every one of them (with the obvious exception of the non-nucleated cell differentiation studies) 

Just because this one is currently open on my desktop - check out the most recent preprint from the Slavov lab (maybe my first post of 2024, on here somewhere) where they did cancer cells that they forced into the epithelial mesenchymal transition with TGF-beta treatment. 

This study uses a very very clean Q Exactive Classic running 70k MS1 (100ms fill) and a top 7 method with 70k MS/MS with 300ms fill times on a 100 min run-to-run cycle time (90-ish minute active gradient). Translation - this data is beautiful - about 2x the resolution of my stuff (about 4x in the low mass region) at the cost of about 1/4 the number of spectra/file and about 1/10 the number of spectra per unit time. And here is 100% sequence coverage of a histone acetylation site that is seen in about 60% of the single cells in this study. 

Funny thing about K+acetyl is that it makes a BEAUTIFUL and very distinct diagnostic fragment ion that holds onto the proton with extreme tenacity. If you don't see it in a peptide your search engine says is "acetylated" your engine might be wrong (or you didn't scan low enough). 

And here it is in this Q Exactive file --- clearly distinct from the carrier channel used in this study (the 126 in the TMTPro 18). You can click to zoom, but the 126.09 is K+acetyl! 
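If you want to spot check this yourself, here is a minimal sketch of the idea. The two m/z constants are the commonly cited theoretical values for the acetyl-lysine diagnostic ion and the TMT 126 reporter (they are my assumption here, not numbers pulled from this file), and the function names and the 10 ppm tolerance are just illustrative choices.

```python
# Commonly cited theoretical m/z values (assumptions, not taken from the dataset):
ACETYL_K_DIAGNOSTIC_MZ = 126.0913  # acetyl-lysine immonium-type ion after NH3 loss
TMT_126_REPORTER_MZ = 126.1277     # TMT / TMTpro 126 reporter ion


def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def has_peak(mz_values, target_mz, tol_ppm=10.0):
    """True if any observed m/z falls within tol_ppm of the target."""
    return any(abs(ppm_error(mz, target_mz)) <= tol_ppm for mz in mz_values)


def resolving_power_needed(mz_a, mz_b):
    """Approximate m / delta-m required to separate two neighboring peaks."""
    return ((mz_a + mz_b) / 2.0) / abs(mz_a - mz_b)


# Hypothetical low mass region of one MS/MS spectrum (made-up observed values).
low_mass_peaks = [126.0914, 126.1278, 127.1248]

print("K+acetyl diagnostic ion present:", has_peak(low_mass_peaks, ACETYL_K_DIAGNOSTIC_MZ))
print("126 carrier reporter present:", has_peak(low_mass_peaks, TMT_126_REPORTER_MZ))
print("Resolving power needed:", round(resolving_power_needed(
    ACETYL_K_DIAGNOSTIC_MZ, TMT_126_REPORTER_MZ)))
```

The two peaks are only about 0.036 apart, so you need a resolving power in the low thousands at m/z 126 to split them. A 70k MS/MS scan has that to spare, which is why they look so clearly distinct in the Q Exactive data.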

(I can see this PTM in a very high percentage of the 400 single cells in this study!) 

Cool, right?!? Okay, so who cares, right? I've shown that I can see these histone PTMs in at least 2 accepted papers so far, but what's the application? Honestly, whatever people study histone PTMs for, right? At the genomics thing I went to they were talking about epigenetics in evolution and in heredity and all sorts of other nerdy stuff that I'm sure is important. 

What I do, however, is study how drugs work and how cells adapt to drugs. And there is a whole class of drugs out there called "histone deacetylase inhibitors" so I chose one that currently has a limited use authorization from the FDA and has a very promising sounding (see all disclaimers) ongoing clinical trial and I had a new MS student (thanks Tarsh!) dose some cells with the drug, then we pseudo randomized control and treated cells after TMT tagging them and - 

BOOM! Tons more signal from K+acetyl - and both cell type and PTM site specific data! And I can see these PTMs whether I'm running 210 cells per day (7 cells/LCMS injection on EvoSep 30SPD) or 420 cells/day (EvoSep 60SPD) or 700 cells per day (EvoSep 100SPD). What is truly crazy is that even when I put on the 4cm column on the EvoSep and run 500 SPD (3,500 cells/day!!!) I can still see 9 histone proteins, but the coisolation interference makes everything look 1:1. That needs work, but 420 cells per day is 2x my normal throughput! The study I published in JPR in December (which is housed in a folder on my desktop as "BIG PANC SCP STUDY!") could be run from beginning to end in a weekend at that throughput. NanoLC at 200nL/min with a 15cm x 75um x 1.5um particle column had a whole lot more proteins/cell, but still! 
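The throughput numbers above are just samples per day times cells per injection. A quick sketch of that arithmetic, with `cells_per_day` as a made-up helper name and 7 single cell channels per plex taken from the post:

```python
def cells_per_day(samples_per_day, cells_per_injection=7):
    """Multiplexed throughput: LCMS injections per day times single cells per injection."""
    return samples_per_day * cells_per_injection


# EvoSep methods mentioned in the post, in samples per day (SPD).
for spd in (30, 60, 100, 500):
    print(f"{spd} SPD x 7 cells/injection = {cells_per_day(spd)} cells/day")
```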

Okay, that's enough typing. I need to go to EuBiC winter school! 

Friday, January 5, 2024

JPR Rising Stars 2024 issue is out!

Kick off whatever year this is with this new collection of papers from JPR that kicks off with a great commentary from Dr. Yates.

There is a great new introduction to LCMS based proteomics in this issue and a significantly more traditional Orbitrap-TOF hybrid system. The ultra super fast SAGE algorithm makes an appearance and some nontraditional approaches like the stellar chemical acetylation technique (just remove ALL the background from your IPs?!??!), single cell proteomics on an ion trap and direct injection proteomics. Totally worth a skim through! 

Thursday, January 4, 2024

Ruh Roh, Reorge.... LCMS might still be winning the proteomics game!


Whoa. This is a fun read, y'all. While maybe a little biased in the people that Adam got sound bites from in the article, it does seem like some of the excitement for nextgen hardware is hesitantly ending up supporting LCMS based proteomics. Let's go. Let's go. Let's go! 

Wednesday, January 3, 2024

What to get the mass spectrometrist who has everything? 200 Hz ion trap in their OrbiTOF!


What do you get for the mass spectrometrist who has everything? What about an ion trap inside of their OrbiTOF? 

Now - there are a lot of terms being thrown around in this great paper about a prototype of an instrument (that would probably cost more than Twitter is currently worth), which makes this seem a little less like science fiction - and a little more like...maybe I should put in that abstract for Anaheim....

Human brains are just good at detecting patterns - whether they exist or not - but if someone has already thought that "ion processor" sounds a lot better than "rectilinear collision cell inserted perpendicular in an Astral" AND drafted loads of professional schematics, it makes you wonder how far that is from production. If you are wondering if some part of the marketing department of one of the big mass spectrometry companies is on my payroll, I'm happy to say that - no - this is entirely their doing, though I'll consider it a late Xmas present! 

Anybody have any ideas for what they might do with ultra-high speed trapping and routing into an Astral? One example shown in this study demonstrates remarkably high resolution of myoglobin at a +23 charge state -- at 200 Hz (the figure actually averages 100 scans, so 2 Hz, but still!) 
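For anyone checking my math on that parenthetical, it is just the scan rate divided by the number of scans averaged. A tiny sketch with hypothetical helper names:

```python
def effective_rate_hz(scan_rate_hz, scans_averaged):
    """Rate of averaged spectra when every output spectrum combines several scans."""
    return scan_rate_hz / scans_averaged


def seconds_per_averaged_spectrum(scan_rate_hz, scans_averaged):
    """Acquisition time behind one averaged spectrum."""
    return scans_averaged / scan_rate_hz


print(effective_rate_hz(200, 100), "Hz effective")
print(seconds_per_averaged_spectrum(200, 100), "s per averaged spectrum")
```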

Now....they don't spend much time on it in the paper, but.....this prototype system clearly exceeds 100,000 resolution following ion processor fragmentation (!!!!!!!!!!!!!)

THE Proteomics Show (Season 4?) is in production!


The end of 2023 was rough here for a lot of reasons. If you are following THE Proteomics Show podcast you'll hear some of it. Neely got COVID and I had RSV symptoms for over a month. There are points where I didn't do anything but whisper for an entire day so that my voice wouldn't fail while recording podcasts with these amazing people who agree to hang out with us while we ask uninformed questions. 

I'm actually making videos on how to process single cell proteomics data with different software packages to get the best possible results, and I sound about 3/4 dead. However, we trucked through and we've just about wrapped up with recording THE US HUPO sponsored series (thank you US HUPO!) OREGON TRAIL. I think we got about half the invited speakers for US HUPO Portland and most of the awesome award winners. We'd love to get everyone, but we all have limited time and energy. 

Despite this, we've started recording season 4 (b-sides, which is a reference people younger than Neely probably won't get) which isn't sponsored by anyone. Neely edits it and since he doesn't know how to turn down the metal intro music, they sound cooler than the ones where we pay a pro to edit it all together.  We dropped in John Arthur as our Holiday Spectacular and that's a clue into the quality of big name scientists we've talked into wasting some time with us. 

While interviewing THE one and only Neil Kelleher himself, we discovered that we're not the only podcast out there (show coming up soon!) 

There is a Kelleher cast, and you can check it out here

If you aren't a podcast person (I'm not) and maybe not a mainstream technology person (I'm not) you can - no kidding - just say your phone's name and tell it to open your podcast app. Type "proteomics" into the box and we're there. Magic. 

Tuesday, January 2, 2024

Targeted mass spectrometry of somatic mutations - without affinity enrichment!!


Remember when mutations were purely the domain of genetics technologies? It is 2024 and those days should be as dead in your mind as the f'ing western blot. (Which....isn't entirely dead....and, I have to admit, has its place sometimes because making targeted assays for ultralow abundance stuff can be a drag)

But check out how great this new study is at targeting super detrimental (and typically low abundance) KRAS mutations!

It would be fair to argue that their sample input is large (>500mg) but they obviously only used a fraction of a percent for each run. Probably also fair to argue that 60 min is a long LCMS experiment, but just check out how great that data looks! They ran at 600nL/min which is easy to maintain and tuning in the FAIMS cleans up the background and overall data quality to a really impressive extent. 

In the era of targeted small molecule inhibitors (and - probably more importantly - KRAS epitope targeted mAbs) knowing the mutation that is actually present - and where - couldn't be more important for patients. And they show they can clearly resolve ALL the main mutations. 

Incredible work I couldn't recommend more.