Saturday, December 12, 2020

Is one of your proteins of interest in your "contaminants" database?


 

I'm moving fast this morning, but I thought this was a fun thing to bring up. It would be great if every proteomics sample ONLY had the proteins that you digested and wanted to see, but that's not the case, right? You've got your stupid protease hanging around, and you've probably got dog and trash cat keratin falling off of you all the time. In the winter you get this great boost in wool peptide identifications. Common contaminant databases are critical and used by just about everyone and a lot of cool software now just has the option to add them automatically. 

And...maybe we ought to take a critical look at some of these lists.....

Imagine that you're doing some laser capture microdissection experiments on the epithelium of a tissue slice, and imagine your surprise when you don't detect one of the major protein constituents that should be there. Weird, right? Did you toss keratin 7 because you hard filter your results and use a contaminants database that flags a few extra keratins? 

If you're using the default contaminants.fasta that comes with every MaxQuant download, that might be the case. 


There might be 10 new proteomics studies this fall already on Ubiquitin-Conjugating Enzymes. It's a hot topic out there. I wish you all luck. Blech. If you're doing a meta-analysis of this data and using a hard filter, you might not see a few of the proteins if your contaminant database is derived directly from the great Global Proteomics Machine cRAP database. 


The GPM website clearly breaks out the contaminants in the database by type. There are a bunch of human proteins on the list that are common contaminants if you use the Sigma UPS standard, which a lot of labs do. However, there are some really cool proteins in those standards! 

The direct FASTA download doesn't break the proteins out that way (it can't, FASTA isn't exactly a flexible thing) and it looks like a couple pieces of software have either taken cRAP verbatim or have started with it and added their own in house observations to it. The MetaMorpheus contaminants XML definitely has these proteins in it, for example.
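If you want to check your own list before you trust it, here is a minimal Python sketch that looks for your proteins of interest among the accessions in a contaminants FASTA. The function names are made up for illustration, the example headers are written in a generic UniProt style rather than copied from any real contaminants file, and the sequences are placeholders.

```python
def fasta_accessions(fasta_text):
    """Return the set of accessions found in the FASTA header lines."""
    accessions = set()
    for line in fasta_text.splitlines():
        if not line.startswith(">"):
            continue  # sequence line, not a header
        header = line[1:].strip()
        if not header:
            continue
        first_token = header.split()[0]
        parts = first_token.split("|")
        # UniProt-style headers look like sp|P08729|K2C7_HUMAN;
        # anything else is kept as the whole first token.
        accessions.add(parts[1] if len(parts) >= 3 else first_token)
    return accessions


def proteins_at_risk(contaminant_fasta_text, proteins_of_interest):
    """Proteins you care about that are also listed as contaminants."""
    listed = fasta_accessions(contaminant_fasta_text)
    return sorted(set(proteins_of_interest) & listed)


# Hypothetical mini "contaminants" file; sequences are placeholders.
example_fasta = """>sp|P08729|K2C7_HUMAN Keratin, type II cytoskeletal 7
AAAAAAAAAA
>sp|P00761|TRYP_PIG Trypsin
AAAAAAAAAA
"""

print(proteins_at_risk(example_fasta, ["P08729", "P04637"]))
```

Real contaminant files use a mix of header styles, so the accession parsing will probably need adjusting for whichever file your software ships with.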

The answer? Probably not hard filtering, I guess. (I have a default filter that makes anything on my contaminants database invisible in PD, and when I open a .tsv from other software I toss anything with an X in the contaminants column. That's on me, but hey! Now I know better!) 
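One softer alternative to the hard filter, sketched in Python: still drop flagged rows, but rescue anything on a short list of proteins you actually care about, and keep the dropped rows around so you can look at them. The column names ("Protein", "Contaminant") and the function name are assumptions; every tool names these columns differently.

```python
import csv
import io


def split_contaminants(tsv_text, keep_accessions,
                       protein_col="Protein", flag_col="Contaminant"):
    """Split rows into (kept, dropped) instead of silently discarding."""
    kept, dropped = [], []
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        flagged = (row.get(flag_col) or "").strip().upper() == "X"
        if flagged and row[protein_col] not in keep_accessions:
            dropped.append(row)   # a contaminant you don't care about
        else:
            kept.append(row)      # unflagged, or flagged but of interest
    return kept, dropped


# Hypothetical three-row results table.
example_tsv = "Protein\tContaminant\nP08729\tX\nP00761\tX\nP04637\t\n"

kept, dropped = split_contaminants(example_tsv, keep_accessions={"P08729"})
print([row["Protein"] for row in kept])
print([row["Protein"] for row in dropped])
```

Returning the dropped rows rather than deleting them means a missing keratin 7 shows up somewhere you can see it.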

Friday, December 11, 2020

Extend the capabilities of your higher mileage hardware with 8-plex complementary quan!


 The use of the complementary region of a reporter ion tag is not new, but it has been somewhat limited in utility due to the relatively low plexing capabilities. 

You know how those Tandem Mass Things all have the exact same mass? It's because they swap isotopes between the reporter thing (red above) down around 100 m/z that you normally quan off of and this balance region thing (blue above) that we typically just forget about.

However, it's really noisy down around 100 m/z, and for a lot of instruments it is 1) impossible to scan down that low or 2) annoying to scan that low (because if you lower your lower limit you also have to lower your upper limit). So the complementary tags have always had a bit of a following, but -- ouch. You're stuck plexing at most 5 samples at once? 

Would it be a better option with some adjusted tag chemistry and 3 more channels? Sure looks like it! 

Not only does the complementary approach beat SPS MS3 in some cases in this comparison (I think the comparison was a Fusion 1 and the authors are very clear about the hardware advancements in the subsequent generations) the very best looking files out of the ones I've downloaded? They're CID, yo. 


Check out how clean the complementary ion region is here! There is a lot less noise in the higher m/z range in shotgun samples. If you thought that you'd pushed your trusty high mileage hardware to and/or well beyond its limits and you're having trouble competing with the big spenders out there with the newer gizmos? I probably couldn't recommend this new study more! Tighten up that quan while still getting a solid high (or higher) plexing number! 

Oh. Wait. Does processing it look like an absolute nightmare? 

I don't have proof yet, but I'm pretty sure minor adjustments to this will do it.

Less fun details: The resolution that you use will be important to what channels you can and can not use. As your instrument probably decreases in relative resolution as the relative m/z increases, those big ions will coalesce with the natural C13 isotopes, so you'll lose a channel or two. That's why you probably can't use the N/C swapping at all unless you really crank up the resolution numbers. Still cool. Still highly recommended. 
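To put rough numbers on the resolution point, here is a back-of-the-envelope Python sketch. It makes two assumptions that are not from the paper: that resolving power falls off with the square root of m/z relative to a nominal setting at 200 m/z (the usual Orbitrap rule of thumb), and that you need roughly m/z divided by the peak spacing to pull two peaks apart, which is optimistic. The only hard number is the spacing between a 13C and a 15N substitution, about 6.32 mDa.

```python
import math

# Mass gained by swapping in one heavy isotope (Da).
C13_MINUS_C12 = 1.0033548
N15_MINUS_N14 = 0.9970349
NC_SPACING_DA = C13_MINUS_C12 - N15_MINUS_N14  # about 0.00632 Da


def orbitrap_resolution(nominal_resolution, mz, reference_mz=200.0):
    """Approximate resolving power at mz, assuming 1/sqrt(m/z) scaling."""
    return nominal_resolution * math.sqrt(reference_mz / mz)


def resolution_needed(mz, charge=1, spacing_da=NC_SPACING_DA):
    """Rough resolving power needed to separate peaks spacing_da apart."""
    return mz / (spacing_da / charge)


# Hypothetical comparison: a low-mass reporter vs. a big complement ion.
for mz in (126.0, 1000.0):
    have = orbitrap_resolution(60000, mz)
    need = resolution_needed(mz)
    print(f"m/z {mz:7.1f}: have ~{have:,.0f}, need ~{need:,.0f}")
```

The gap between "have" and "need" at high m/z is why the N/C channels are the first ones to go.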

Thursday, December 10, 2020

Take the fun out of single cell proteomics with false positive rates in quantification!


Is the fact that the hospitals near you are very very full and you're...injury prone...making it hard for you to sleep at night? Or are you just too excited by the promise of single cell proteomic technology and need to rein it in a little? 

Do I ever have the study for you! 

Why the annoying title for this blog post? Because it would be so much more fun to not think about relative errors in quantification in this exciting and emerging field of single cell proteomics. Let's focus on the positive! Do I want to know that the relative error in protein quantification tends to head in a suboptimal direction as the number of relative cells in an experiment decreases? 


Is it important to know? 

Of course it is! 

But it's even more important to know what those relative errors are as you scale down so we can start to think about clever ways to adjust for these errors, like how many replicates would you need to do to make them better?
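For the "how many replicates" question, the simplest possible starting point is the textbook one: if replicate errors are independent and random, the error of the mean shrinks with the square root of the number of replicates. This Python sketch uses that assumption only; it is not the model from the study, and systematic errors won't average out no matter how many times you inject.

```python
import math


def replicates_needed(single_run_cv, target_cv):
    """Replicates needed so that CV of the mean is at or below target_cv.

    Assumes independent, random error: cv_mean = cv_single / sqrt(n).
    """
    if single_run_cv <= 0 or target_cv <= 0:
        raise ValueError("CVs must be positive")
    return max(1, math.ceil((single_run_cv / target_cv) ** 2))


# Hypothetical numbers: 40% CV per run, and you want 20% on the mean.
print(replicates_needed(40, 20))
```

Halving the error costs four times the runs, which is the painful part.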

Mad props to this group for trudging through and doing the work that I know I sure didn't want to do, and for making the graphs that no one hanging out at a single cell sorter wants to think about but really, truly, absolutely needs to! 

Oh, and this is probably a much better summary of the study: 



Wednesday, December 9, 2020

Tailoring collision energies to search engines -- have we been too complacent?


On the large list of things that I thought we had settled and we, as a field, would never have to think about again.....



You should check out the evidence yourself here. The files are up on MassIVE, and the authors show solid arguments that for both QTOF and Orbitrap instruments we might want to think a little harder about how we set our collision energies....to the tune of as much as a 40% increase in identifications between optimal and the opposite of optimal. The 40% looks like a pretty extreme example of sub-basement optimal, but it does help to highlight the importance of these parameters. 

Monday, November 30, 2020

NRPro -- A powerful new approach for antibiotic drug discovery!

 


Even if you aren't interested in the fact that the pipeline for new antibiotics pretty much ran dry decades ago, you should check out this new paper at ACS.

You can eventually drive a nail into a wall with the handle of a screwdriver. Similarly, you can eventually get a search engine designed for tryptic peptides to help you find some oddly conformed endogenous peptides. However, if you want to build a house by nailing peptidic natural products together you probably want to try a different tool and come up with your analogies after coffee. 


Most metabolomic tools are terrible for peptidic natural products. These molecules are too complex and too large, and they often multiply charge. Likewise, your peptide tools are looking for a minimum of 7 amino acids or something, and they sure aren't prepped for the mass changes when these little things cyclize! These new tools bridge that gap! 
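The cyclization mass change is easy to show with arithmetic. A head-to-tail cyclic peptide is just the sum of its residues, while the linear version carries one extra water (about 18.0106 Da). The Python sketch below covers only the 20 standard residues and only head-to-tail cyclization, which is a big simplification since real natural products are full of odd building blocks; the sequence is a made-up example and none of this is the NRPro method.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565


def cyclic_mass(sequence):
    """Monoisotopic mass of a head-to-tail cyclized peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence)


def linear_mass(sequence):
    """Monoisotopic mass of the same peptide left linear (adds H2O)."""
    return cyclic_mass(sequence) + WATER


seq = "GAVLF"  # hypothetical five-residue example
print(f"linear {linear_mass(seq):.4f} Da, cyclic {cyclic_mass(seq):.4f} Da")
```

A search engine that assumes every peptide has free termini will be off by that water on every cyclic precursor.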


Sunday, November 29, 2020

Modeling the peptide universe collisional cross sections?


This preprint was not exactly what I was looking for. I was looking for something like PROSIT for collisional cross section predictions, and this is kind of like that.

It feels more like a thought experiment, and the results are really interesting. There are clear collisional cross sectional patterns (like the crazy image above). It is an interesting read and makes me hopeful that what I was originally looking for would make sense. 

Saturday, November 28, 2020

Tutorial slides -- Proteome Discoverer Free Version with MS-Fragger Install!


Would you like to have a fully functioning version of the Proteome Discoverer environment on your PC at home (or on multiple PCs throughout your lab, which is a much more normal thing to do)? You obviously can't have the commercial nodes that the manufacturer has to pay royalties on, but you can have lots of tools including:

!!!MS-FRAGGER!!! operating in PD on any Windows PC. 

The SugarQB glycoproteomics workflow (which...I'd argue is as good as ANYTHING available for glycoproteomics for any price right now, with that caveat that you have to have your glycan mod in your database) 

AND TONS MORE!! 

Here is a link to a 20-ish slide walkthrough for setting up PD with a bunch of cool free nodes. I tried to include every relevant link, so your browser or security settings might be mad about all the extensions in the slide deck. 

This might not be completely accurate (duh), but I tried and I hope it helps. 

I put up the Version info in the image above, because maybe I'll update it going forward. 

As an aside, having PD installed at home helps me morally justify having a badass OMICSPCs system in my basement. It's only coincidental that it can run Crysis. 

Friday, November 27, 2020

The first TIMSTOF post. First week impressions!


I'm a die hard Orbitrap fan and I always will be. However, there are finally some other technologies out there that can compete in some ways with Dr. Makarov's incredible invention. I don't know if this has made the blog or not, it's been a busy year, but I've finally found a long-term position at JOHNS HOPKINS. I'm junior faculty in Namandje Bumpus's lab and we've got an amazing group of smart fast moving young people and a bunch of crazy ideas. Point of evidence? That ceiling disrupting monster in the picture above. We did a lot of research and compared a lot of data over the last year and a lab with all Thermo instruments filled most of a room with an instrument made by the NMR company? Sure did. That 2,000 pound 8'7" monster is the TIMSTOF Flex + MALDI 2. 

There will probably be a lot of TIMSTOF posts as I struggle through learning this thing and making it do what we want it to do (much of which it doesn't seem to want to do) because writing about it will help me think about it. Here are some very early impressions and some information I would have liked to have had going into the last week. 

First impressions. Installation: 

It's frickin' huge. 



The truck that came to deliver it was TOO BIG for our loading docks. The 5 crates that arrived (2 of which were over 5 feet tall) do not include an HPLC (which, if you get it from Bruker, would be box #6). If you have any space limitation of any kind, you need to really plan this out. Boxes need to be opened in a specific order, because some crates are just the equipment necessary to move and install the rest. 

You'll also need to be prepared to remove lots of doors -- and to probably get on your neighbor's nerves for a day or two. Recommendation: Have at least 2 people around in addition to the FSE, and prepare for a solid 12 hours the first day to get moved and powered up. 


In general, though, the installation was pretty clean once the behemoth was in place. There is, of course, a federal law against any mass spectrometer manufacturer providing you fully accurate and complete pre-installation documents. I believe the death penalty is a consideration in some states if you receive the correct NEMA plug type schematic ahead of time. Of course, we installed exactly what the instructions stated, and of course, they were wrong. This isn't a knock on Bruker. It's the law. 

First impressions: Ease of use. 

Ummm....okay....how do I write this without coming off as arrogant or stupid or both? This instrument isn't anywhere near as finished as the gear from other manufacturers. Ease of use was not at the top of the design list. If this is your first mass spectrometer, I think it's going to be rough. 

If you are a biologist and the mass spec is just one of 15 tools you'll use during the week and you don't have a dedicated mass spectrometrist? 


Let's start with hardware first. Swapping from your ESI to your Nano source requires tools. You don't pop two hinges and swap them out. You'll need a tiny hex wrench (1.25mm, I think), and there are multiple small parts that will be devastating to lose or break, including tiny gold o-rings and a spring. It is very very easy to break your nano column while both installing and removing the nanospray source. It is reasonably easy to install your nano emitter incorrectly, which you'll only find out after you've fully installed your nano source, requiring you to take it all apart to reseat the seals on your source. You test that seal by blocking a filter on the completed source with a (GLOVED; there's EVIL STUFF IN THE FILTER) finger tip and monitoring the drop in vacuum pressure. 

Fortunately, you will only be swapping the sources every single day, so your chances of making mistakes will be rare. This can be minimized by putting evil fluorinated compounds into a filter so you have a spread of lock mass compounds into your nanospray at all times. I'm putting up a little sign to remind everyone. Gloves if touching the nanoESI. Huge shoutout to Gabriela and Brett at the UC Davis core for providing this secret protocol to me before I broke every NanoLC column in the entire mid-Atlantic taking the source off every day. 

First impressions: Vendor software

Impressively, the TIMSTOF is compatible in an almost plug-n-play format with every vendor HPLC. If you can find the drivers for the instrument they load up well. Without question, the EasyNLC works better on the TIMSTOF than on any Thermo instrument. They communicate digitally and you have better control over parameters. Weird. 

Interestingly, the hardest thing to find in any of the vendor software packages is a mass spectrum. Chromatograms and mobilograms make the front page of most of the software that must be open while you're running the instrument. Finding a mass spectrum requires around 10 button clicks. I think someone forgot to tell the developers what this thing is. You can, however, eventually find one, but the software "Data Analysis" will be very annoyed that you figured it out and will ask you if you want to save changes when you close the software. 

First impressions: Performance

THIS THING IS FAST. Ricky Bobby drafting off a slingshot fast. Sloth escaping after burning down a hospital fast. FAST.  120 MS/MS scans per second at 40,000 resolution FAST. 


And the sensitivity is there. You wouldn't think that it would be, but it is. TOFS aren't sensitive. That's most of the problem with them, right? Holy cow. When you extract 500,000 MS/MS spectra out of a 60 min run with around 50ng of peptides on a column, it's easy to think "why was I cursing so loud about how hard it was to get this thing to show me a single MS/MS spectrum?" 

There are some really cool features hidden in the instrument methods. Best I can tell, mostly undocumented. There is a cool preprint from those Max Planck people where the authors state "functions of the instrument are largely unexplored". I think I have a 1/4 finished post about that somewhere. 

First impressions: Compatibility with tools

This is getting better all the time, but if you find the fact that there are over 1,000 proteomics software tools in the world a little daunting, this might be an instrument for you! Very very few of the tools are compatible with the data from the device. Frustratingly, some of them will look like they're processing the data, hang out a couple days heating your office, and output gibberish. 

Things that don't work:
ProteoWizard (ouch. yeah. that one hurt. I figured I could just convert it to anything I wanted and reopen all my tools) 
Morpheus - weirded out by this one. Even a Bruker generated mzML crashes for me, but it might just be me. 
Actually, it could just be a poor mzML formatting thing, so I won't list any others till I have a chance to put more time in. 

Things that do work: 
FragPipe/MS-Fragger (in some functions; no TMT, etc.,) 
MaxQuant (also appears to work, but not for TMT)
Skyline (very recent addition/update)
Several commercial software packages: Bolt/Pinnacle, PEAKS, Spectronaut, Byonic, and what is Robin Park's software called again? IP2? It might be called something else now, and it can work in real time! I meant to check that out, but busy busy busy. 

Again, this is just a first week with the instrument and I'm sure my impressions will change and evolve. There are a lot of pluses here. This monster of an instrument is enormously capable. It's fast, sensitive, and there is loads of opportunity to build new and exciting experiments, but it doesn't feel like a fully finished product. I think that early adopters are going to largely feel like beta testers. And if that's what you're down for, hell yeah! Me too!  


But if spending lots of time doing method development and maybe fabricating useful parts or testing less dangerous chemicals for calibrating your instrument, or writing patches to make your favorite tools work with your new million dollar instrument isn't what you consider a good use of the minutes you have left on this planet, this might not be what you're looking for yet. It'll be interesting to see what direction this new tech goes in, though! 

Thursday, November 26, 2020

Process TMTPro(16-plex) reagents in MaxQuant!

 


Huge thank you to Ed Emmott for posting these resources so I could get this workflow going this morning. 

Even if you just downloaded MaxQuant yesterday (1.6.17; i.e. "v.Mannly Wabbit"; [if you didn't know the newest version of R is called "Bunnie-Wunnies Freak Out", I think we could keep this trend going!]) you'll see that you don't have an option for TMTPro yet. That's no problem, but there are a couple of steps! 

You need: 

1) MaxQuant

2) To know where your MaxQuant folders are. 

3) This DropBox link from the Emmott Lab; you'll need both files (link here!)

4) This text file I made. Even though the title of the file is "probably wrong" it seems to work! Woo! 

Step 1: 

1) Make sure MaxQuant is closed. Check your version number that you are currently using. 

2) Find the folder where MaxQuant is actually operating out of. (You probably have a shortcut on your desktop and the actual folders are in your Downloads drive somewhere)

 



This is probably where you want to be. You want to now swap out the modifications.XML that is in your folder for the one from the Emmott lab dropbox. If you're a paranoid weirdo, just move the original XML file, but I've heard paranoia is rare in mass spectrometrists, so you'll probably just copy right over that old one while shouting "YOLO" or "Parkour" or something equally hip.
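For the paranoid weirdos, the backup-then-swap step can be scripted. This Python sketch keeps a copy of the original file before overwriting it. The function name is made up, the folder layout is an assumption (point it at wherever your modifications.xml actually lives for your version), and the demo runs in a throwaway temporary folder so nothing real gets touched.

```python
import shutil
import tempfile
from pathlib import Path


def swap_modifications_xml(conf_dir, new_xml, file_name="modifications.xml"):
    """Back up the existing modifications file, then copy the new one in."""
    conf_dir = Path(conf_dir)
    target = conf_dir / file_name
    backup = conf_dir / (file_name + ".bak")
    if not target.exists():
        raise FileNotFoundError(f"no {file_name} in {conf_dir}")
    if not backup.exists():
        # Only back up once, so a second run can't clobber the real original.
        shutil.copy2(target, backup)
    shutil.copy2(new_xml, target)
    return backup


# Demo in a temporary folder with stand-in file contents.
with tempfile.TemporaryDirectory() as tmp:
    conf = Path(tmp) / "conf"
    conf.mkdir()
    (conf / "modifications.xml").write_text("<original/>")
    downloaded = Path(tmp) / "downloaded_modifications.xml"
    downloaded.write_text("<with_tmtpro/>")

    backup_path = swap_modifications_xml(conf, downloaded)
    demo_result = ((conf / "modifications.xml").read_text(),
                   backup_path.read_text())

print(demo_result)
```

To undo the swap, copy the .bak file back over modifications.xml while MaxQuant is closed.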

Now start MaxQuant. If you go to your Configuration tab, you should see at the bottom that the TMTPro reagent has been added! 

Now that you've got the reagents you can build your new Quan table or import the "TMT16_Ben_Probably wrong" Text file. 

You can also use the second folder in the Emmott Lab dropbox to file --> load parameters. 

And that's it. If you have the correction factors from the kit you purchased and you're a person who uses those you should punch those in. Otherwise you should be set. Looks like it works to me! 

Tuesday, November 24, 2020

Optimal Fragmentation of N- and O- Glycopeptides!


If you're like me and were just kinda hoping you could go the rest of your career without actually knowing the difference between an N-glycan and an O-glycan, I've got some very bad news for you.

You're going to have to go to Wikipedia and get them straightened out in your head before you set up that instrument, particularly if you're thinking of using fancy fragmentation methods.

The good news is that these authors pretty much fine tuned this all out for you in this great paper. 
The one thing that I would mention is that there are differences from one instrument to the next in terms of the fragmentation energies required. (That's why things like PROCAL exist (<-- blog post for paper). Blog post with exact masses and link to JPT who sells it now. )  

The instrument specific fragmentation energies are somewhat easy to forget about when you're running peptides. They handily fragment well right at the peptide bond, but these authors show what appears to be a huge difference in N-glycopeptide identification with an HCD change from 30 to 35 (!yikes!). You might want to verify that your instrument(s) are really doing what they say they're doing if you're thinking about intact glycopeptide work! 

If you have multiple instruments, it's always good to know what settings match up. It would suck to optimize on your Fusion 1 and then move that to another instrument and waste all those glycopeptide runs! 

Monday, November 23, 2020

PASS-DIA -- Ultradeep (Discovery?) DIA Experiments!


One of the limitations of DIA has been "you have to know what is there first." Sensitive discovery of new things? That's tough if you've got 40 bazillion ions all coisolated at once! 

But what if you dropped your DIA windows to minuscule levels -- as small as what you do with DDA, and you just scanned across and fragmented all the things?  Don't pass the turkey on Thursday (because turkey is gross, but not as gross as the COVID your dumb cousin with the red hat is going to give you) PASS-DIA! 


One of the things you don't have in your DIA data is a way to link your precursor up to your fragment ions. You'll need this awesome thing hosted at PNNL. It's called mPE-MMR. (Pronounced "muppet murder")


Next you need to run the sample multiple times. It takes a looooong time to acquire all the 2Da windows across a mass range. For PASS-DIA, you make multiple passes. About 150 Da is examined per experiment, and then you reinject for the next pass, moving your scan window to the next 150 Da. That way you cover absolutely everything in 2Da windows. 
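The window bookkeeping is simple enough to sketch in Python. The 2 Da windows and roughly 150 Da per injection come from the description above; the 400 to 1000 m/z range, the non-overlapping window edges, and the function name are my assumptions, not the authors' actual method files.

```python
def pass_dia_windows(start_mz, end_mz, window_da=2.0, pass_width_da=150.0):
    """Return a list of passes; each pass is a list of (low, high) windows."""
    passes = []
    pass_start = start_mz
    while pass_start < end_mz:
        pass_end = min(pass_start + pass_width_da, end_mz)
        windows = []
        low = pass_start
        while low < pass_end:
            high = min(low + window_da, pass_end)
            windows.append((low, high))
            low = high
        passes.append(windows)
        pass_start = pass_end  # the next injection picks up where this one ended
    return passes


# Hypothetical precursor range of 400-1000 m/z.
scheme = pass_dia_windows(400.0, 1000.0)
print(len(scheme), "injections,", len(scheme[0]), "windows in the first one")
```

Under those assumptions that is 4 injections of 75 windows each, which is where the looooong acquisition time comes from.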

The authors show application in a broad range of tissue types and experiments, including glyco- and phosphopeptide experiments. 



Monday, November 16, 2020

Two-thirds of proteins of the host and parasite are modified!


 




Experimental setup?
Red blood cells
Red blood cells + malaria (Plasmodium falciparum; the ultra-deadly on the first infection type)
Enrich for a ton of different PTMs, including: 

Phosphorylation, acetylation, crotonylation, 2-hydroxyisobutyrylation, N-glycosylation, and ubiquitination!!
Q Exactive LCMS

Stunning downstream analysis.

INCORRECT LINK IN THE PAPER. Don't go to the WolfPSORT link in the paper! 

This is the correct link for it: https://wolfpsort.hgc.jp/ (this alone is really cool, you should check it out, it's a subcellular localization predictor)

What did they get? 

Over 2/3 of the proteins are modified! Which explains a lot of things! 


Sunday, November 15, 2020

The HUPO High-Stringency Inventory -- An editorial update!


This is a quick, open, one/two page editorial that makes a really nice read on where we are on the Human Proteome.

I almost didn't link to it after seeing Donald Rumsfeld quoted in it, but after careful deliberation, assumed that the author considered quoting Lloyd Christmas and figured a real life person of similar intelligence would be better received by ACS. 



Saturday, November 14, 2020

Argonaut: A Webportal for MultiOmics Collaboration!


Coon Lab has been busy this year! Item 1: A reeeeeaaaaaly interesting patent application.




What's this do? Exactly what the title says! Legit multiOmics integration stuff through a web based platform? I'm only messing around with the example data, but check this out. 

What if you had lipidomics and proteomics and metabolomics on a system? You can link it all together! (You should be able to click on the image to expand). 


And within these separate experiments you can do some really cool comparisons like graphically setting up correlations between the observations and the various conditions. 
 


You can directly go to your outliers (or your datapoints of extreme interest) by rapidly and creepily fastily kicking out the reports of exactly what you're looking at or just direct info on that datapoint, what that protein or lipid (gross) is, as well as the evidence that supports it. (In this case you can see how many bon-bons will fit in a Ferrari in scientific notation). 

Obviously, this is their example data on their hosted site, but if the IPSA is any indication, I'd guess that this thing works just as well as described. Some people in Madison have some mad programming skills. 100% recommend you check this out! 

Oh, and surprisingly, this exists as an independent entity of the Coon Lab collaboration of COVID severity symptoms. (Preprint link in my raving about it here.) Direct link to the resource here! I assumed that this was going to be the resource announcement for the system that was used to build that amazing tool, but it does appear to be distinct.

Like I said, busy! 

Thursday, November 12, 2020

RAW mass spec data is too pretty for you? Look at it in R!


I'm not joking. Images like the one above could BE YOURS! 

Impress your friends with a selection of : UNIVERSALLY HIDEOUS R FONTS! 

Like these? 

Yes! Exactly like these! 

Make sure your dog can't critique your manuscript figures with: 16 COLOR GRAPHICS! (Available in other R packages)

Make sure no one can see that monoisotopic mass is off just a little by: OVERLAPPING LABELS FROM YOUR SODIUM ADDUCTS! 

I'm just being a jerk, it's clear from anyone reading anything in science how critical R is to science. It's used so universally now that it's weird to see downstream analysis without it. 

And you know what we've never really had? A way to go straight from RAW files to stats, and that's what this now allows!  Downstream to stats? We've got amazing tools like MSstats and SProCoP and loads of cool tools from people like Gatto and Wilmarth that have been out there for years for us to use.  But have you ever thought "wow, I'd really like to extract out any spectra that have a delta of X"?  OR (better question for the power built innately into R) "how often does this delta occur in my RAW data?" 

Get your RAW data into R and all of a sudden that becomes possible! 
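To make the "how often does this delta occur" question concrete, here is the counting logic as a small Python sketch (not R, and not the package from the paper). It assumes you have already pulled each spectrum out as a plain list of m/z values; the function names, the tolerance, and the two toy spectra are all made up. The example delta is the sodium-for-proton swap, about 21.98194.

```python
from bisect import bisect_left


def has_delta(mz_values, delta, tol=0.01):
    """True if any two peaks in the spectrum are delta +/- tol apart."""
    if delta <= tol:
        raise ValueError("delta must be larger than the tolerance")
    peaks = sorted(mz_values)
    for mz in peaks:
        # First peak at or above the lower edge of the matching window.
        i = bisect_left(peaks, mz + delta - tol)
        if i < len(peaks) and peaks[i] <= mz + delta + tol:
            return True
    return False


def count_spectra_with_delta(spectra, delta, tol=0.01):
    """Number of spectra containing at least one peak pair at that delta."""
    return sum(has_delta(spectrum, delta, tol) for spectrum in spectra)


# Two toy spectra: the first has a peak pair one Na-for-H swap apart.
toy_spectra = [
    [500.0, 521.98194, 600.0],
    [300.0, 400.0],
]
print(count_spectra_with_delta(toy_spectra, delta=21.98194))
```

This version only handles singly charged spacing; for higher charge states you would divide the delta by the charge before searching.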

THEN you could totally make things like this!