Wednesday, December 18, 2024

We are the analyzers!


Follow this link if you don't want to watch it in my silly blog interface: https://www.youtube.com/watch?v=wzmJDNmsWK8

Is the US HUPO Ed&Out the coolest committee you can be a part of (it used to be VMO)? 

YES! 

This was a huge success that started when Amanda Smythers began pulling together amazing silly rhymes at a meeting last year, and Dragana Noe ran with it and made this super clutch finished product. 

So much work! Who is pumped for the US HUPO video competition this year? Me? You? Really? Submit your videos! 

Tuesday, December 17, 2024

Set up a DIA experiment with variable size windows on any Q Exactive!

 


I swear, I thought this was on the blog somewhere and I can't find it. 

If you're in Europe or other places, I guess, you can just disregard this entirely. Rumor is you have a variable or variation DIA window button. If you're in the US, there is a trademark thing and only one vendor(?) can give you a V on your instrument by default. 

Huge shoutout here to the Slavov lab for posting RAW files in the plexDIA preprint back in 2021 or something because this way of setting it up is more efficient than how I was setting it up. 

As you can see above - you need to set up multiple experiments. In this case I'm acquiring MS1s on my old Q Exactive in the dark and generally extremely damp Biophysics basement at The John. 

A Q Exactive Classic is slow by today's standards, so I think you'll definitely see people drop the MS1, depending on what their software actually requires. 


This may actually be exactly the Slavov lab method that I am showing here. It might be worth noting that plexDIA was using mTRAQ tags, so the m/z values of the peptides are shifted up a bit by the mass of the tag - but I tell you what - this method will absolutely smoke a standard 20 Da fixed-width DIA method. 

So the trick is to set your loop count to match the number of windows in your "Inclusion list" file.

This is what your inclusion list should look like - I strongly recommend going into this and doing File --> Export --> .csv or .tab delimited (whatever makes you happy), then opening that file in Excel or something else. Then you can use quick equations to set your center mass and overlap. Here I think we're getting a 1 Da overlap on each side of each window, but don't quote me on that. 
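If you'd rather not do the Excel equations by hand, here's a minimal sketch of the same idea - this is my own throwaway script, not the Slavov lab's file, and the tier boundaries, column header, and file names are all placeholders you'd swap for whatever your exported inclusion list actually uses:

```python
# Rough sketch: build variable DIA window centers from a few m/z tiers,
# one csv per tier (each tier = one DIA experiment in the method editor).
# The isolation width you type into each experiment would be the tier's
# window width plus 2x the overlap; the loop count = that tier's window count.
import csv

overlap = 1.0  # Da of overlap on each side of each window (assumed, don't quote me)
# (start m/z, end m/z, window width in Da) - these tiers are made up, tune to your sample
tiers = [(400.0, 712.5, 12.5), (712.5, 1337.5, 25.0), (1337.5, 1400.0, 62.5)]

for i, (start, end, width) in enumerate(tiers, 1):
    centers = []
    lo = start
    while lo < end - 1e-9:
        hi = min(lo + width, end)
        centers.append(round((lo + hi) / 2, 4))
        lo = hi
    # "Mass [m/z]" is a placeholder header - match whatever your exported .csv uses
    with open(f"dia_windows_experiment{i}.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["Mass [m/z]"])
        writer.writerows([c] for c in centers)
    print(f"Experiment {i}: {len(centers)} windows of {width} Da "
          f"(+{overlap} Da overlap per side) -> loop count {len(centers)}")
```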

I'm 91% sure that you can also have Skyline do this for you, but - yet again - my Skyline expert is off making a lot more money than me, so the infinite cycle will continue. My cycle is this - sometime in 2025 I suspect I will pay to have some super bright young person go hang out with Brendan MacLean and some other brilliant targeted mass spec people, and she/he will come back and tell me how cool Mike MacCoss is, and for the next couple of years I'll have a Skyline expert! Then we'll have loads of pretty plots and matrix matched curves, and then disaster will strike - AstraZeneca or Merck or Moderna will offer that brilliant young person more money than I make. I'll be super happy for them, they'll leave, and reviewer comments will come in like "can you do this very simple Skyline thing and we'll accept your paper" and I'll say "...no. No, I can't. I'm way way too dumb to do anything in Skyline myself, even though I have tried to sign every support document for the software since 2012 or something." 


What was I typing about? Aha! Variable DIA windows! 

Above you've got the first 25 windows that are 12.5 Da, then the next batch is 25 Da, and the final batch is a great big 62 Da or whatever. 

Considering where most of your peptides fall in m/z, you might want to adjust this. For example, if you were doing a LysC-only digest and pushing the m/z distribution of your typical peptides up by about 70%, you could imagine a 4th experiment where you use wider windows in the low m/z range, narrow ones in the middle, and then bigger and then really big ones at the top. This specific bit of vague and probably not useful advice was inspired by the r/proteomics post that inspired me to spend 33 minutes before my first meeting today typing this instead of getting a shower and what appears to be a desperately needed second coffee. 

Since the Virginia Tech/Google $20M drama continues, my Google Drive that basically every link on this blog went to has been deleted. What would be helpful, I think, would be putting the actual .meth files and these tables up as links somewhere. I'm now at 36 minutes somehow, so I can't do it now. If you need these, email me - my contact info is over there somewhere ----> 

and I'll try to get these up somewhere you can directly download them later. 




Monday, December 16, 2024

If you knew most of the proteoforms - is MS1 alone enough?

 


Okay - so this is really interesting and I may need to sleep on it, but here is the idea - 

Ideally we'd be able to see every proteoform at the MS1 level rapidly and have instruments fast/sensitive enough to sequence them all. We can use 2D separations to get there pre- or post-digestion, but in no case are those experiments fast.

What if we tossed the MS2s? Could we still do good biology? I mean....an intact proteoform mass with 600 amino acids is a lot less likely to occur completely at random than a 10 amino acid peptide...
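Back-of-envelope only, and these numbers are mine rather than anything from the paper - but even just counting raw sequence space (ignoring PTMs, mass degeneracy, and everything else that actually matters), the number of possible 600-residue proteoforms dwarfs the number of possible 10-mers:

```python
# Rough intuition only: 20 standard amino acids per position.
import math

for length in (10, 600):
    log_combos = length * math.log10(20)
    print(f"{length} residues -> roughly 10^{log_combos:.0f} possible sequences")
# ~10^13 possible 10-mers vs ~10^781 possible 600-mers
```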

Seems to work, too! The proof of concept appears to be an E. coli proteoform atlas. 



Saturday, December 14, 2024

Tau interactome maps and neurodegeneration!

 

Y'all, I am SO PSYCHED for US HUPO 2025. As you might know, THE Proteomics Show podcast was renewed for a 6th (Six? 6?) season! And the whole reason US HUPO put up with our antics in the first place was that we were highlighting invited speakers, award winners, and the big deal plenary people.

Not to brag, but I don't really have any hobbies except reading proteomics papers and maybe finding out that there are a bunch of proteomics people together in some city (or ski resort) and going there whether I'm invited or not. As such, I'm sorta plugged into what's going on in the community. 

I personally know/knew 1 (one! yes, one!) invited speaker for US HUPO 2025: Aleksandra Nita-Lazar, who is as good a scientist as our field has. And then - no one else. So we're legitimately interviewing completely new victims doing crazy badass science. Huge props to US HUPO for this year's lineup. I need to upgrade my tablet thing because there is no way mine will survive all the notes. 

Case in point? This is just about the sickest fucking paper you could read right now, and December of 2024 might be the coolest month for proteomics in all of history. I'm not joking. I can't even keep up with one mindblowing advance after another. And - again - lack of hobbies. 

How did I miss this??? I legitimately do not know.

TAU? 

APEX Interactomics??? (Temporal - hey this protein is hanging out with THESE Proteins! Right now?? Yes, right now)

Neurons generated from iPSC stem cell thingies from healthy patients and those with dementia? (Disclaimer - I went to one stem cell conference in 2009, learned/absorbed very little.)

So cool, and we got to spend a morning talking with Tara Tracy for the show (next week? week after? something soon) and I'd walk to Philly to see her speak. Google helpfully just informed me it's like 12 hours. Meh. Still would do it. 

Friday, December 13, 2024

The antibodies don't work....ummm....duh....?

 


My favorite part about this new Nature news feature was that I thought it was an article from 10 years ago.

No...but the first citation is that article from 10 years ago...

Look - antibodies do work. But is that discount mass produced antibody for $400 what you want to stake 15 years of your career on without validating the hell out of it? Probably not, right? It's a really super ultra complex tool. Like any complex tool it needs QC/QA and we don't see enough of it. 

Wednesday, December 11, 2024

Proteoform level analysis of "purified" albumin reveals shocking levels of complexity!

I'm again going to put off blog posts on the 50,000 human proteome cohorts using "next gen" spot-based targeted proteomics.

I get it - I love the idea that we can use alternative technologies and get to these kinds of population-level protein studies. But - just to remind you - we don't know the answer to this question


What we do know - without any possible doubt whatsoever - is that evolution is super ridiculously stingy when it comes to making new stuff. Sure, there are superfluous things that don't negatively impact the overall survival of a population - but those are exceedingly rare.

If a cell makes an alternative form of a protein you can almost guarantee that there is a good fucking reason for it. 

Case in point - what if you treated one of the single best characterized proteins on the planet not as a protein, but as a proteome?

This team took a look at multiple "purified" forms of trusty old bovine serum albumin and treated them like a population of proteoforms - and 

1) It definitely is

2) "Purifying" a protein is....umm... something you'd think was 100% super well defined. Let's go with "could use further definition and characterization". 

We've seen things like this in the past - here is an old post where a modified Exactive (I think what later became the Exactive EMR, one of my all-time favorite little boxes) pulled out 59 different forms of ovalbumin. 

That's hard to look at and really encapsulate. Woodland et al. went full classic protein biochemistry, and this isn't hard to understand. This is a "purified albumin" separated by isoelectric focusing in dimension 1 and by SDS-PAGE in dimension 2. 


This is a purified protein??? Some of that stuff probably was just tagging along, but a whole ton of that is albumin proteoforms. And - again - there is probably a very good reason for why an organism would expend energy to develop alternative forms of all these proteins, right? 

This is a super cool and thought provoking study that pokes some holes in more than a couple of our normal assumptions. 

Tuesday, December 10, 2024

ProHap - Search your proteomics data against population variants! Critically important new community resource!

 


STOP. IGNORE THE FLOWCHART ABOVE. These are bioinformatics people; they think this stuff is mandatory. I assume their conferences all have contests where the winner makes the flowchart most likely to make someone in another field throw up.

Again - don't look at it - 'cause this is legitimately important. 

You know how the genomics people have been doing things for years with illustrious-sounding titles like "The 1,000 Human Genome Project"? Particularly when a lot of those things kicked off and the technology was more expensive, these projects absorbed HUGE amounts of research dollars. The goal was to understand how human genomes vary across us - as a species. 

And they did these things and they kept the results 100% secret from everyone forever. 

I guess that's not true, but - to me, as a human proteomics researcher - they have been less than useless. Yay, you did a bunch of stuff. Who does that help? Not me or anyone I know. Even researchers I know who focus on health disparities can't get usable data out of these things.

UNTIL NOW. 

What these awesome, though flowchart-loving, people did was dig into these top secret genomic databases and assess - 

-you won't believe it -

Protein level changes across human populations! This is where it gets important. 

How many peptide level variants could there possibly be in 1,000 genomes? 12? 15? 

Try 54,679! Don't believe me? Here is a completely not illegally taken screenshot. Don't sue me!


Almost FIFTY-FIVE THOUSAND PEPTIDE VARIANTS?!?

How many are you looking for in your data? One? Yeah, me too. I mean, unless we're doing deep cancer genomics and then we search for 2 million. Why not normal variants?!? 

Okay - are you thinking - "big deal, I probably need to spend the next 10 days downloading kludgey Python scripts written by proteomics people and finding out that my Docker thing is from 2017? How on earth does this help me?" 

And this is where this is super legit. 

Go here. https://zenodo.org/records/12671302

Download this - 


Use 7-zip or something to unzip it twice. (I don't know, it's right there with the flowchart competition - bioinformatics people have contests to see who can zip things the most times. Bonus - as in here - instead of naming each zip .zip you can name them weird things.) The first thing you unzip is a .gz, that will give you a .tar, and you unzip that too - and you'll get the whole reason I've written this entire thing -


You get a FASTA FILE that represents common peptide level variants that appear in human beings across our population! 


Yeah, it's pretty big. 104 MB and 157k entries. But you're encapsulating a much larger percentage of normal human genetic variation now! 
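If you want a quick sanity check after the double unzip, here's a rough sketch - the file names are made-up placeholders (use whatever the Zenodo record actually gives you) - that unpacks the archive and counts the FASTA entries:

```python
import tarfile

# "r:gz" handles the .gz AND the .tar in one shot
with tarfile.open("prohap_download.tar.gz", "r:gz") as tar:  # placeholder file name
    tar.extractall("prohap_fasta")

# count entries in the extracted FASTA (again, placeholder file name)
entries = 0
with open("prohap_fasta/prohap_variants.fasta") as fasta:
    for line in fasta:
        if line.startswith(">"):
            entries += 1

print(f"{entries} FASTA entries")  # should land somewhere around the ~157k mentioned above
```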

100% check out the paper. They did other smart stuff and there are other (possibly superior) files depending on your application. 

If you're using FragPipe (you should be!) check out this advice from Alexey! 


And check out this additional resource from his team here!

Monday, December 9, 2024

Top down proteoform analysis of kinase inhibitor treatment with an approachable method!

 


Wow. This new study of kinase inhibitor treatment of cancer cells - using top down (intact protein, no digestion) proteomics - is 

1) Super legit

2) Seems really approachable

3) Kind of resets the bar in my head for what we can do right now with today's off-the-shelf technology.

And I might have a surprise for you. While Neil Kelleher's name is here because this paper is part of a special issue in his honor - this isn't a Kelleher lab study! 


Generally when we see a super impressive top down study, I flip through it and then think - cool - maybe I'll be able to replicate it in 10 years? There are often modified instruments or things where you think - if I could keep my core scientific team together in a group for a decade, we could pull off something this hard. 

Not to say there isn't some legit technical firepower on this study (Kevin Gao is a pro's pro mass spectrometrist), but you can read through this protocol and think - wait - could I totally do this? 

Instrumentation is an Exploris 240! (Approachable, affordable, clean-it-yourself hardware!) 

The HPLC is a custom Accela....okay, well...I don't have one of those, but it is running at 400 nL/min with an interesting combination of buffers. I assume any U3000 RSLC, Eksigent, or whatever could hit those same performance metrics. 

Custom nanoLC source. Details in references, but you can make a nanoLC source from Legos. Probably not that tough to reproduce (or necessary). There are funny little bars that are necessary for the Exploris systems when you make your own source and those can set you back several hundred $$

They used the TopPIC suite for the data analysis, which you can get for free here, as long as you sign stuff saying you won't be a jerk. For some of the focused, proteoform-specific work (is the phosphorylation at this site or that one?) they (interestingly) used BioPharma Finder. I've never loaded more than 5 proteins in that at a time and it's super slow with that many. I assume they put in one sequence and a narrow time window in order to really lock down that one target they're trying to localize. 

The results are well displayed - really pretty and clear - and, again, really might just change your mind about doing top down proteomics. Bravo to this team, I legitimately loved reading this paper from beginning to end. 

Wait - found something to complain about! Whew, I was worried. They haven't unlocked the PRIDE repository yet, so I can't look at the files. It was just accepted (JPR ASAP). 

Sunday, December 8, 2024

Is this the year I finally win the US HUPO conference T-shirt design contest?!?

 


I think it is, though I also thought that in 2022....

and...maybe I did win the Chicago one...? No, it looks like I tried to print my own shirt and the company thought I was playing a joke on them? Weird. 

Well, if you think you can beat my entry, go ahead and try! Mwhahahahahahaaaa. You can waste your time submitting one here

Saturday, December 7, 2024

THE (real) single cell proteomics technique scSeq people love - NanoSplits - is out!

 


Check out one of my favorite techniques of the last few years - the NanoSplits paper here! 


The first preprint of this study is somewhere on the blog, but the work has evolved considerably since we initially saw it.

If you aren't familiar, what this does is label-free preparation of REAL, NORMAL-SIZED SINGLE CELLS, ONE (1, uno, um, eins, jeden, yksi, en siffra, een, ichi) at a time, on glass slides using precision robotics. 

THEN the lysed cell is split into 2 fractions with most of the protein going one way and more of the little transcripts going the other way. You do single cell proteomics on the fraction with more protein and you can amplify the transcripts in the other fraction for transcriptomics. 

BOOM! You get everything! Now, there are obviously some drawbacks here, including that it is really hard to do. You need the precision robotics. This team features some people with serious instrumentation backgrounds but also people with a history of simplifying methods so mortals can eventually do them. We've written 2 grant applications where the technique has been prominently featured. The scSeq people are a whole lot more comfortable with this measuring protein thing if they can get evidence that you aren't just making stuff up! 

What's super cool here is that while multiple groups have shown complementary data by doing things like single cell proteomics and single cell seq on the same or very similar populations of cells (in a recent study, my group dosed the same cell line from the same source with the same drug), here you get the real thing - Cell A proteomics and transcriptomics fill in a specific pattern. Cell B, the same. 

The authors are quick to point out that NanoSplits could be a bridge technique to unify findings between more traditional studies where you do either SCP or scSeq, or both, on the same population. A small number of split cells could explain discrepancies between these 2 data types, or help you truly link 2 populations together. 

Seriously - a phenomenal, clever technique with top notch data collection and informatics, and when I resubmit a grant in a couple of months I'm sure my reviewers will be excited to see a prominently published paper rather than a link to a preprint.