Monday, March 18, 2019

Best practices for MetaProteomics -- Designed by a super team!

In case you were concerned, my obsession with metaproteomics is alive and well. For an update on the challenges and progress in the field that uses PSMs to understand whole communities and ecosystems and global warming and who knows what else?!?!?

Check out this brand new review!

Is this everyone in this rapidly growing but still pretty small field? Hard to tell, what with the "rapidly growing" part and everything...but it's more people than I knew were involved.

Sunday, March 17, 2019

Personalized DNA testing has reached a pinnacle with the DNA Friend!

Have you all been seeing DNA test kits everywhere? My local pharmacy has kits from 23andMe and much less expensive kits for all sorts of reasons that don't at all seem like crazy scams.

If you have been on the fence about sending your DNA off to be analyzed on the legal-in-exactly-one-country fully pirated next gen sequencers, I highly recommend you check out the DNA Friend. Your results come back in minutes!

Friday, March 15, 2019

Got a pesky membrane protein? Hexafluoroisopropanol!

For first prize in the best chemical compound to say out loud today!

Nol (yeah?...yay?...nope. Neither works. On the 10th repeat I find myself saying "Nol-uh") I'll work on it on my commute.

What was I doing? Talking about a paper! This paper, in fact!

My good friend Dr. Blonder is a big fan of using organic solvents to get to membrane proteins (Scholar search "Blonder membranes" and you'll find papers doing things like digesting with trypsin directly in methanol and stuff going back to almost the 90s -- and these techniques do totally work) but a fluorinated solvent!??!?! That's at least new to me!

The interesting part is that this not only dissolves the membrane but it also fractionates the proteins, with the integral membrane proteins ending up separated to a large degree from the anchor-bound ones? That could be seriously useful when you're hunting seriously problematic proteins.

Wednesday, March 13, 2019

New therapeutic targets of early-stage hepatocellular carcinoma!

If this Tweet isn't something to be rampagingly optimistic about this morning -- I don't know what your problem is, but I hope you get better!

This is in reference to this article in some journal I've never heard of --

I'm way too behind to spend much time on it -- but they deconvolved a massive biological problem by using 110(!! a decent n?!?! for proteomics!?!?) pair matched samples (this is where you try your best to eliminate things that will muddy your results -- for example you compare normal to cancer tissue where the patient samples are the same gender and approximate age -- all things you need big sample sizes for) and with all this statistical power they realize they aren't just looking at one homogeneous mixture of cancer patients -- there are different subtypes -- THEN when you break them into subtypes -- MARKERS start to become obvious!!!

Was that one very excited sentence? Probably!!

(I love that picture above, they look more excited about this study than me!)

Tuesday, March 12, 2019

Analysis of the stoichiometry of human acetylation?

Does the word "stoichiometry" give you awful flashbacks to adolescence? Have you avoided thinking about it by only using it in sentences like "phosphorylation only occurs at low stoichiometry" and not saying anything further? Time to fix that, because this is the topic in a biologically meaningful context from a brilliant and totally new (at least to me?) approach to understanding acetylation!

Are these sentences terrible? Daylight savings time is dumb and I feel even less coherent than usual.

How does acetylation in human cells happen? 2 ways

1) Tightly controlled enzymatic acetylation/deacetylation (acetyltransferase things?)
2) Non-enzymatic reactions on any available lysine from Acetyl-CoA just floating around.

This group described the abundance of #2 to be at a level where the super important tightly controlled cellular regulation focused #1 ends up being something really hard to define using our traditional method (they actually said "needle in a haystack" which is depressing)

And that's why they did a ton of innovative work to try and figure out which acetylations are which!

I'm not smart enough this morning (possibly ever, honestly) to explain what they did beyond the use of serially diluted SILAC (smart!) and antibody enrichment of acetylation sites, in conjunction with forcing chemical acetylation with 1M(!) acetyl-phosphate and doing loads of smart maths! (A QE HF was used for all the mass spec stuff and MaxQuant for data analysis)

All the data will be available (it isn't yet) at PRIDE here. (PXD009994)

What do we get? A massively better understanding of how and where and when acetylation occurs and which ones we REALLY need to pay attention to when trying to decide --> is this a downstream process caused by a normal (or glitchy) evolved mechanism OR is it just a highly abundant protein that looks important because of chemical acetylation effects?

Monday, March 11, 2019

Anyone out there interested in doing a postdoc for a crazy person?

Do you know how to use mass spectrometers?

Are you tolerant to some really crazy research ideas?

Are you comfortable working with someone who kind of exists in some sort of an odd quantum state? For example -- you might never really know where your adviser is?

Well then!

I'm looking for postdocs, technicians and research associate/scientists for our new labs under construction in Columbia, MD and Irvine, California.

1) You'll totally be publishing. We're doing things no one has done before. Possibly -- no one has been crazy enough.
2)  We already have a bioinformatician. You find a protein bioinformatician? You hire them 6 months ahead of schedule if you need to. Who knows when you'll find the next one available!
3) Unlike many postdoc positions there is room for direct advancement if you kick ass. More labs will be coming online soon -- several should be up in the next 2 years -- and they'll need people to run them! You could come in, learn everything and be heading straight toward running your own facility.

1) You'd have to work with me. (Hence all the stuff above)
2) There is a lot of boring routine stuff that we have to do as we develop the protocols for the robots. (Humans shouldn't prep samples -- it's 2018) Correction: 2019. But we need to fully understand the best ways to do everything so we can make the robots do the smartest things possible.
3) You might find out what we're doing and be like "You ARE crazy..." but you'd never know till you write me.

Time frame?
People for Columbia ASAP.
People for Irvine -- soon.

Interested? Write me!

Sunday, March 10, 2019

ETD and EThcD are complementary in comprehensive ADP-Ribosylomics!

The reason to be excited about this great new paper In Press at MCP is the subtitle on the top of each page:

The biologists have been super excited about these things for a long time -- and probably more than a little disappointed in what proteomics has done to help them understand this ultra important class of PTMs.

Does this fix it?  I don't know how many sites people will want/need or how many exist, but this sure looks like a ton of data. 

For us mass spec nerds, maybe the most interesting part is the surprisingly complementary results generated by ETD and EThcD. In my head I kind of consider the two about the same thing. ETD makes mostly charge reduced species, but if you look close enough you'll find c and z ions. EThcD makes less charge reduced stuff, that -- in theory -- is replaced with a blend of c,z,b and y fragments -- but my experience is that Sequest can't make any sense out of it. 

All this data processing was done in MaxQuant and allowed a dizzying number of dynamic mods: 

This results in some beautiful data -- tons of sites identified and localized and data that ought to make the biologists a whole lot happier than anything I've seen before! 

Saturday, March 9, 2019

Better chromatography for crosslinking!!

Okay! This is more of what proteomics needs!  More sophisticated and better chromatography needs to sneak in here and there and this great note is a perfect example of why.

My heart sank when I realized a lot of people were breaking out the SCX columns to enrich for their crosslinked peptide species. I'm sure SCX still has uses out there in the world, but when I hear those letters all I think of is putting in milligrams of peptides and getting micrograms back, so maybe this is the answer?

Friday, March 8, 2019

Spatial, cell type resolved proteomics of brain samples!

How much better are laser pointers now than they were like 10 years ago? I can get a laser at the Dollar Tree (if you don't have these -- they're amazing. Every item in the store costs $1 and the items stocked are chosen completely at random. Seriously. Go into one and try to cap yourself at $7. I bet you can't do it) and 1) it's way brighter than the ones from even a few years ago 2) the batteries last longer and 3) it's $1. Which is crazy.

What about a real use of lasers, like laser microdissection? Could it possibly be improving at the same rate? Or -- at the rate that proteomics technology is improving?

What if you used the best of both?  And you painstakingly optimized EVERYTHING necessary to link the two techniques together? Then you'd have this new paper in JPR.

I'm assuming if you have a Lumos system and you're using 50cm columns on it, you didn't get your laser at a Dollar Tree. I'm also assuming that if you work your way upwards toward identification of 1,500 human proteins(!!!) from stuff you cut with a laser from a slide of tissue that is 10 micrometers thick (probably 1 cell width, right? Google is confused by the question) that I'm not the only person who is super impressed.

The ion trap in the Lumos was used for a lot of the MS/MS (the sensitivity comes in handy when you're trying to resolve individual cell types off slides) and MaxQuant/Perseus/fancy R stuff was used to pull the story all together.

I'm unclear regarding the isolation of specific cell types and what they are, but in one set of samples this group came close to 4,000 proteins ID'ed!!

If it's been a few years since you last used laser microdissection + proteomics and you know someone with a question that only these techniques could answer, maybe it's time to get a new slide and follow the protocols in this new paper to the letter!

Thursday, March 7, 2019


I need to add some stuff to my bucket list so I've got more reasons to not die than the fact there is probably an elderly dog out there in the world with diabetes, alopecia and incontinence that needs me.

2019 is rocking here in Columbia, MD!! ---

Invited talk(s) at ABRF!! ("Making a core lab nimble and efficient" and "introducing the WIN antibody characterization community project" -- actually, this needs its own sentence...)

Here is the first slide!

Are you guys getting inundated with antibody and antibody drug characterization requests?!? If you aren't, there are probably people at your university or facility who are doing this and sending it elsewhere.

In the U.S. like 2 months ago -- FDA cleared MoxywoxytootymomoloopyMab and it's an antibody with powerful drugs on it -- it localizes to CD22 (I think) on cancer cells and then blows them up. BOOM. And there are dozens like it trying to get clearance. Antibody drugs are coming down the pipeline like crazy AND mass spectrometry is the only way you can 1) verify the sequence identity and 2) verify that drug conjugations were successful.

Here is the idea for the community study -- How is everyone doing this?!??! I started counting them up and I came up with over a dozen different combinations of ways that you could conceivably characterize a monoclonal antibody with mass spectrometry. What if we (ABRF WIN!) found labs that do this -- send them antibodies -- and we all work together to figure out: 1) How everyone is doing it 2) What is the best way?

If you were on the fence about being in San Antonio for ABRF (I know...what are the odds the Spurs have 3 games on the road....?) maybe the fact that you've got 100 requests to do mAb characterization sitting ignored in your inbox is a reason to go?


AND I JUST FOUND OUT TODAY --- I'M SPEAKING AT ASMS!!! I speak on Thursday! YES, I also thought the conference ended on Wednesday. It doesn't!  I have to change my AirBnB and flights! I think one organizer was like -- "come on, just let Ben have a milk crate to stand on outside the convention center on Friday or something. He's been making requests to talk for like 18 years. Who knows, he might be too old to travel next year...."  THANK YOU ORGANIZERS!!

I'm speaking on the work Conor Jenkins and I are doing with OptysTech using stupid amounts of their cloud processing power to find cancer mutations -- without any genetics based sequencing required. There is a preprint with a few details out now and some super promising results in hand that we're working on clarifying/verifying.


Tuesday, March 5, 2019

Metabolomics of hibernating (arctic) squirrels!

Is the super fat squirrel thing happening everywhere? In my back yard they're so round now that even my not-so-athletic dogs seem like they might actually catch one. It occurred to me that it's weird to see squirrels in the middle of the winter. Don't they hibernate or something? According to a lot of articles I found online (IFLS link) it is actually happening everywhere.

Okay -- so that was my question -- what about a real science question that I never would have thought to ask? How does the squirrel metabolome change during hibernation? For the answer you'll need to check out this new paper at JPR. 

It turns out that Arctic squirrels (which, I'm pretty sure, was also a band that my housemate in grad school was a fan of) do definitely hibernate -- and make a good model for studying hibernation.

I assume it's easier to get blood from an arctic squirrel than from one of these things....

(Wikipedia informs me that a bear isn't a true hibernator, but the joke still holds, IMHO)

For the sciency parts --- blood was drawn from the squirrels both in full torpor (what I'd consider hibernation) and coming out of it, and then also fully awake and at other time points, and these are the samples that were used for metabolomics.

This team focused on the metabolic shifts in the red blood cells! I don't read as many metabolomics papers, but this isn't something I've seen before. The RBCs were separated and lysed and this is what went on to the UHPLC Q Exactive system.

When I do read metabolomics papers, I'm always surprised by how low the TopN is. You often see the Top3 or Top5 most abundant ions selected for fragmentation. The now possibly discontinued(?) (and awesome) Q Exactive Focus system was only capable of doing Top3 when it first launched. Unless you're doing UHPLC with crazy sharp peaks, this always seems like a waste of cycle time to me. This group uses a 70,000 resolution MS1 scan followed by a Top15 method for MS/MS acquisition. Rough math in my head says a 1.3 second cycle time if 15 ions are actually selected (which won't be often) and I can't see a downside to this (forget FWHM for peaks for high res MS1, you want baseline to baseline and this is going to still provide plenty of MS1 for accurate quan).
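Here is that rough cycle time math as a minimal sketch. The transient lengths are approximate nominal values for a Q Exactive-class Orbitrap, and the 17,500 MS/MS resolution and the 10 second peak width are assumptions for illustration (neither comes from the paper). Real cycle times run a little longer once you add scan overhead and ion fill time.

```python
# Approximate Orbitrap transient lengths in ms, keyed by resolution setting.
# Nominal values for a Q Exactive-class instrument; treat them as assumptions.
TRANSIENT_MS = {17_500: 64, 35_000: 128, 70_000: 256, 140_000: 512}


def cycle_time_ms(ms1_resolution, ms2_resolution, top_n, overhead_ms_per_scan=0.0):
    """Estimate one duty cycle: one MS1 scan plus top_n MS/MS scans."""
    ms1 = TRANSIENT_MS[ms1_resolution] + overhead_ms_per_scan
    ms2 = TRANSIENT_MS[ms2_resolution] + overhead_ms_per_scan
    return ms1 + top_n * ms2


def ms1_points_across_peak(peak_width_s, cycle_ms):
    """Number of MS1 scans across a peak, measured baseline to baseline."""
    return peak_width_s * 1000.0 / cycle_ms


if __name__ == "__main__":
    # 70,000 resolution MS1 followed by Top15 at an ASSUMED 17,500 MS/MS resolution
    full_cycle = cycle_time_ms(70_000, 17_500, top_n=15)
    print(f"Top15 cycle, all 15 slots used: {full_cycle / 1000:.2f} s")
    # Hypothetical 10 s wide peak, baseline to baseline
    print(f"MS1 points across a 10 s peak: {ms1_points_across_peak(10, full_cycle):.1f}")
```

With no overhead this comes out just over 1.2 seconds for a full Top15 cycle, which is in the same ballpark as the 1.3 second estimate. Any cycle where fewer than 15 ions get picked is shorter still.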

For the downstream processing, Compound Discoverer was used with a ton of databases (yay! if you have them use them!) including NIST, KEGG, LipidMaps, and an in house library of 1,000 compounds. MetaboAnalyst (something I need to check out and keep forgetting) was also employed here.

What did they get from all this work? A really interesting and surprisingly complete picture of the RBC metabolism in and out of hibernation, including some targets that they were able to get standards for and verify with absolute quan techniques.

Did I decide to read this paper initially because of the fat squirrels in my back yard? Yes.

Did I learn a lot about how to put together a solid metabolomics study as a consequence? Also yes!

Added bonus: I discovered that there is a subreddit that is completely devoted to insulting fat squirrels (of course there is...?). The goal seems to be to insult fat squirrels with the most profanities you possibly can, to the point that you have to be 18 or older to enter the site, and I probably shouldn't direct link to that.

Monday, March 4, 2019

It's finally time to discuss MS1 based libraries! And how to use them for anything/everything!

Time for a story! Okay -- not a story -- but something that I think is going to surprise a lot of people here in 'Murica.

Just about everyone is using MS1-based libraries, except for you. You know how I know this? Because it surprised the holy Heck out of me.

Actually -- let's start with a study that I'm a little obsessed with that came from Yale, I think.

This paper is a big deal for a lot of reasons. One is that it's really hard to get human brains. My experience so far is that the proper course of action is a series of begging, trades, justifications, and begging. All this beats going out and getting a bunch of brains yourself, I assume.

This group got material from a bunch of different areas of a bunch of different brains. However -- a lot of material from the stingy brain storage people still generally isn't very much. And you aren't getting a paper in Nature Neuroscience with low coverage.


Okay -- so here was the mistake. This group didn't know that this is supposed to be a European secret strategy for kicking everyone else's asses in proteins/peptides identified per run. They spelled it all out. You're supposed to be vague about it and use the terms "match between runs" a lot.

I'm being facetious, of course. Just because I didn't know that everyone else is doing proteomics this way, doesn't mean anyone was hiding it! It just means you have to read a lot.

In fact, you can completely read about this strategy in this great protocol update on MaxQuant from a couple years ago, because it is described in painstaking detail....

Here's the idea. You generate a pool of all the samples you want to work with and you fractionate the holy heck out of it -- then you lie to your software and tell it that you didn't fractionate it.

Then you run all the samples you care about getting quan on with single shot. It's very important that you use similar chromatography for your single shot and for your fractions.

Then feature identification matches up the stuff that you didn't fragment and ID in your single shot samples with the stuff you did fragment and ID in your deep fractionated "library" samples.
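To make the idea concrete, here is a minimal sketch of transferring IDs between runs by matching MS1 features on m/z, charge and retention time. This is not how MaxQuant or Minora actually do it (they align retention times across runs first and use much smarter feature detection). The `Feature` class, the `transfer_ids` function and the 5 ppm / 1 minute tolerances are all hypothetical choices for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Feature:
    mz: float                      # observed m/z of the MS1 feature
    rt_min: float                  # retention time at the peak apex, in minutes
    charge: int
    peptide: Optional[str] = None  # set only when MS/MS produced an ID


def ppm_error(observed: float, reference: float) -> float:
    """Mass error of observed relative to reference, in parts per million."""
    return (observed - reference) / reference * 1e6


def transfer_ids(
    single_shot: List[Feature],
    library: List[Feature],
    ppm_tol: float = 5.0,      # hypothetical tolerance
    rt_tol_min: float = 1.0,   # hypothetical tolerance, assumes similar chromatography
) -> List[Tuple[Feature, Optional[str], str]]:
    """Label each single-shot feature as 'MS/MS', 'matched' or 'unassigned'."""
    identified_library = [f for f in library if f.peptide is not None]
    results = []
    for feat in single_shot:
        if feat.peptide is not None:
            # Already identified by its own MS/MS spectrum
            results.append((feat, feat.peptide, "MS/MS"))
            continue
        candidates = [
            lib for lib in identified_library
            if lib.charge == feat.charge
            and abs(ppm_error(feat.mz, lib.mz)) <= ppm_tol
            and abs(feat.rt_min - lib.rt_min) <= rt_tol_min
        ]
        # Accept only unambiguous matches: every candidate must be the same peptide
        peptides = {c.peptide for c in candidates}
        if len(peptides) == 1:
            results.append((feat, peptides.pop(), "matched"))
        else:
            results.append((feat, None, "unassigned"))
    return results
```

The retention time window is why similar chromatography matters so much: if the single shot runs drift relative to the library, you either miss real matches or have to widen the window and accept more ambiguous ones.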

Bingo. Totally works -- only trick is you have to stop using whatever you're using and start using MaxQuant....

...or.....maybe your software isn't left handed either? {Groans...}

Can I lie to Proteome Discoverer the same way and get same/similar results? Totally.

Back to the brains above. I have brain samples that we got as detailed above. We even got much better mass spectrometrists than me (shoutout to some winners!) to run them since my system was still en route, and the most we could get was seriously like less than 5 µg of protein -- and well -- it wasn't the newest Orbitrap in the whole world (actually...well...the oldest...but still an Orbitrap!)

Let's lie to some software and get some phenomenal results! (Please note, this is all done in PD 2.2 -- you can do this in PD 2.1 (or any other version equipped with the apQuant nodes), but I haven't tried matching the results)

First off -- I'm going to download the proper section of the brain from the study above. It's at ProteomeXchange/PRIDE here. 005445

Now I have 15 awesome sample fractions from pooled samples from multiple patients from the correct brain area where our stuff came from. THIS IS MY MS1 library!

CRITICAL STEP 1: Add your library files as "Files" not as "Fractions". Fractions will complicate things. Then use your study factors to group them together.

CRITICAL STEP 2: Use Minora (and if you have PD 2.2 also recalibrate your MS1s with SpectrumRC -- super useful -- particularly when comparing newer to older instrument data). If you aren't lucky like me and talk someone into accurately replicating the chromatography conditions of your library samples, you'll want to widen your alignment properties in Minora. (This is actually in the Consensus steps here)

Step 3: (I need to investigate this further): These are my settings for the quantifier.

I think these are all important, but have no proof.

CRITICAL Step 4: Use Data Distributions Post Processing Node!

This makes finding your data so much easier!!

That might be it...maybe I'll just put the workflows up somewhere that they can be downloaded....

What do the results look like?

Let's pick one of my sad -- sample limited single shot files!

Green basically is the stuff that you found by MS/MS and will roughly correspond (+/- statistical changes) to what you'll get if you run this file alone. Blue is what I'm interested in here. That's stuff that is found that is new and thanks to the contributions of the other files present!

1263? That sounds much better! Without this "library", all of the single shot patient samples together didn't come up with as many hits as this single file alone did when compared to the library in this manner.

Okay -- this can obviously be a gamble. How do you estimate the FDR here? Do you weight the discoveries at the same level?

Let's go back to the Yale paper above --  I really think they did this part right as well.

1) They tried to estimate the false discoveries in the match between runs peptides using knowledge of the biological samples (honestly -- I can't entirely follow it, but they come up with a maximum of 3.8% "imposter" matches).
2) They provide all the data tables, clearly indicating what was identified by MS/MS and by match between runs. I thought they also did their pathway analysis stuff using both the small and big lists, but I might have that mixed up with another study.

Downsides I almost forgot to mention!  HOLY COW. THIS TAKES FOREVER!  You're adding feature alignment and tons more files and spectra?!? You're talking about taking 50 files that take 4 hours to process on one of the OmicsPCs MaxDestroyer systems and now you're adding a ton more files to it? It'll push a big data processing system like mine to as much as a day of sitting there crunching numbers and being tough to play video games on. This isn't PD exclusive, it takes comparable time in my hands for MaxQuant as well....

Sunday, March 3, 2019

Intrinsically disordered proteins!

(Figure borrowed from this awesome paper I totally don't get)

Two years ago I was asked by a collaborator (for a paper that was FINALLY accepted this week -- YAY!) to investigate our findings for Intrinsically Disordered Proteins. (IDPs)

I did the right thing and ignored this request completely.

Wanna talk about something that proteomics isn't good at right now? IDPs. Me either. But maybe ignoring a big biological problem that people are increasingly linking to diseases isn't the best idea. And maybe we have tools now?

Should I try to talk about what the problem is first? Sure! That way maybe I'll understand it better. (To clarify -- the original request hasn't actually gone away -- it's simply gotten more urgent).

(BIG Shoutout to Lukasz Kozlowski for making this awesome open gif!)

This is SUMO or something. You know how all proteins are supposed to work cause they've got 3D structures and things? This gif shows that this protein does have 3D structure in part of it -- but a crazy disordered protein part in others. At first this sounds okay, right? Albert Heck! Who cares? I'm going to reduce and alkylate and digest and that won't matter at all, right?

EXACTLY! That's the point. And part of the problem. We completely trash this information with shotgun proteomics. Heck....I don't know if we retain this information with top-down either...aside from the mass shifts from the free cysteines?!??

See why I ignored this? Okay -- but we have some resources, I guess. I've tried this one --

DisProt is a growing resource. It has information on 800+ proteins of interest now. My profanity laden notes from 2017 suggest that the database was significantly smaller then. Maybe you end up with a list of proteins that make no sense at all to IPA or any other resource -- but it turns out it's an enriched list of disordered proteins? Yeah! That sounds smart as I type it. I plan to not reread it.
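If you wanted to actually check whether a confusing protein list is enriched for disordered proteins, a one-sided hypergeometric test is one simple way to do it. The sketch below is only an illustration: the function name is made up, the numbers in the usage example are hypothetical, and using roughly 20,000 proteins as the background is an assumption (DisProt is not a human-only resource, so a real analysis would need a properly matched background).

```python
from math import comb


def hypergeom_enrichment_p(hits_in_list, list_size, annotated_in_background, background_size):
    """P(X >= hits_in_list) when list_size proteins are drawn without replacement
    from a background in which annotated_in_background proteins are disordered."""
    upper = min(list_size, annotated_in_background)
    total = comb(background_size, list_size)
    tail = sum(
        comb(annotated_in_background, k)
        * comb(background_size - annotated_in_background, list_size - k)
        for k in range(hits_in_list, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical example: 50 proteins of interest, 12 of them annotated as
    # disordered, against ~800 annotated entries in an ASSUMED 20,000 protein background.
    p = hypergeom_enrichment_p(12, 50, 800, 20_000)
    print(f"Enrichment p-value: {p:.3g}")
```

By chance you'd expect about 2 of 50 proteins to carry the annotation in that hypothetical setup (50 x 800 / 20,000), so a list with a dozen of them would be worth a second look.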

Okay -- but what if we take 10 steps backward. What if there was a way to tackle this problem directly? ION MOBILITY TO THE RESCUE!

Did we finally find something that Ion Mobility is critical for?!?!?  (Kidding, I don't really think that every application of IMS could also be solved with 4 minutes of attention to chromatography, but it gets people wound up when you say things like that!)

I'm not even going to read this paper today -- because 1) I don't have IM in house (yet...but maybe this is the reason to get a demo scheduled) and 2) I can't apply this to my original files but if you have someone bugging you about disordered proteins and you haven't added the word as a filter to your spam trigger, maybe you should check it out!

Saturday, March 2, 2019

PhoX -- An IMAC enrichable crosslinker!!

Oh. The weird blogger is back -- and he's still talking about crosslinkers....great....

Yes. This is an obsession right now. When it works -- this may be the most powerful technique I've ever used. People get excited and yell things at the projector showing the results and say biology words that I don't know. Then you get an in vivo project and run a single sample, meticulously prepared and fractionated for 2 days and you come back with 35 crosslinks and....

...and the best results you've ever seen aren't actually all that much and you're starting to think this is only good for in vitro stuff and...what the Heck?!??!?

Lots of great stuff in this paper, but what I care about IS.THIS.

From human samples!!! That's more than I'VE EVER SEEN FROM 24+ FRACTIONS!

Friday, March 1, 2019

What is the "software crisis" in bioinformatics?

This paper on bioRxiv that I missed a while back (thanks @Smith_Chem_Wisconsin) is very revealing.

Honestly, I thought the numbers would be far worse for bioinformatics, as a whole, but this is a really interesting scoring procedure and obviously something we need to address as more new tools appear.