Tuesday, December 31, 2013

New methods in Orbitrap methods database

Recently, I've gotten to be in on a couple of Orbitrap Fusion installs and got to play around a little with different samples.  I've uploaded some methods that have produced nice results to the Orbitrap methods database.  The first is an MS2-only method for iTRAQ and the second is the same for the TMT 10plex reagent.

In addition, I received some RAW data from a lipid researcher and uploaded his method for looking at lipids on the Q Exactive.  As always, treat these as a starting point and email me (orsburn@vt.edu) if you have questions/comments/suggestions. These are constructed in my free time and not meant to be considered as the end all, but for at least these 2 Fusion runs I'm pretty happy with what I got out of them!

Monday, December 30, 2013

Proteome screening of pleural effusions identifies galectin 1 as a diagnostic biomarker and highlights several prognostic biomarkers for malignant mesothelioma.

Mesothelioma is a cancer that starts in the protective lining around your organs.  Since the massive removal of asbestos in North America over the last 20 years or so, we rarely hear about this cancer now, as it was the major cause.  Asbestos was pretty useful stuff and since removal is expensive, there are lots of places where this cancer is still a big deal.  Unfortunately, there aren't good biomarkers for clinical assays.

Until now!  In this study, a team of Swedish researchers, in conjunction with a medical institute in Turkey, tracks down a series of good biomarkers for this nasty disease.  The experimental method was simple.  They used iTRAQ 8-plex to label (top-14 depleted) pleural fluid from Turkish patients with various lung conditions.  They used high-resolution isoelectric focusing (see, the OFFGEL isn't dead!) to pre-fractionate the samples before running them on an Orbitrap Velos.

Note on the method:  The method employed is one that could be considered suboptimal.  There are several places where the fill times and method could be tweaked to improve the theoretical peptide coverage and quantitative accuracy.  For current recommendations on running iTRAQ samples on an Orbitrap Velos, please see the Orbitrap methods database on the right-hand side of this page.

The results here, however, are extremely nice.  Multiple biomarkers were identified and validated with statistical significance which can only lead to an improvement in the correct profiling of the patients afflicted with these diseases.

Sunday, December 29, 2013

Most active Twitterers in Proteomics

Oh, Lists! Why do we love you so much?

In another example, Google+ suggested this article for me regarding the most active Twitterers (I don't think that is the best diction, but you probably know what I mean) in our field.  I follow most of these people, but will probably start following more.  I ran low on topics to write about over the holidays!

You can find the list here!

Tuesday, December 24, 2013

Backdated --- Merry Xmas!!!

I took a break from the blog for the holiday but this picture is too awesome to not use it!

Monday, December 23, 2013

Use of quantitative mass spectrometric analysis to elucidate the mechanisms of phospho-priming and auto-activation of the checkpoint kinase Rad53 in vivo

Rad53 is a protein that is involved in DNA repair in yeast.  An extremely similar protein in humans, Rad51, is shown above.  You irradiate some cells and probe with an anti-Rad51 antibody and you get the distinct foci shown in green above.

Despite years of research, DNA break repair is something that is poorly understood.  A really nice paper in MCP uses a spiked in SILAC approach and high res mass spec to try to pull back the curtains.

In this multi-lab (heck, multi-nation) study, this team uses Rad53-deficient yeast strains and MMS (methyl methanesulfonate, a chemical that induces DNA double strand damage) to try to fish out the pathway leading to Rad53 activation.

Minor comment on the paper.  The MS1 search tolerance was set at 25 ppm in the MaxQuant/Andromeda runs.  Particularly in a SILAC study, I think that this window is a little too big and might lead to mismatched pairs and maybe a raised FDR.  Otherwise, this paper is a nice solid look at using a good classic genetics (knockout) approach coupled with HR-MS to fish out pathway differences.  This is also one of those rare systems approaches successfully using spiked-in SILAC.
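To put a number on how wide a 25 ppm window is, here is a quick sketch.  The function names and the example m/z of 1000 are mine, purely for illustration; nothing here comes from the paper or from MaxQuant.

```python
def ppm_window(mz, tol_ppm):
    """Return (low, high) m/z bounds for a +/- tol_ppm window around mz."""
    delta = mz * tol_ppm / 1e6
    return mz - delta, mz + delta


def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error of an observed m/z relative to a theoretical m/z, in ppm."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


# Compare a few tolerances at an arbitrary example precursor m/z of 1000
for tol in (25, 10, 5):
    low, high = ppm_window(1000.0, tol)
    print(f"{tol:>2} ppm at m/z 1000: {low:.4f} to {high:.4f} (width {high - low:.4f})")
```

At m/z 1000, a 25 ppm tolerance accepts anything within 0.025 either side of the theoretical value, which is why I'd rather see something tighter on high res MS1 data.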

You can pull up the original paper while it's still open access here.

Friday, December 20, 2013

Mass spec terminology translator V1 is up

I have no idea what is happening in the image above.  GoogleImages gave it to me when I said "pug translator".
Anyway!  As long promised I finally put up the first version of the mass spec terminology translator (see top right).  This will be evolving.  I had a 6 hour flight with no internet yesterday and this was the first list I came up with.  As new jargon (and more importantly, jargon that I just haven't thought about yet) pops up I'll continue to add on, and hopefully clean up.  A bunch just popped into my head while I was writing this.  Expect expansion.

And a direct link here.

Mass spec free proteomics!?!?!

Let's start off like this:  This is a paper in MCP but they didn't do any mass spectrometry, not even a little, so why do we care?

Next article please!

No, wait!  This is awesome, I promise.

You know how I'm always preaching that the database ought to match the size of the thing we're doing MS/MS on?  Why on earth would we go to all the work of cutting a liver out of something, doing proteomics on it, and then searching all of the proteins that mice can produce?  Does that make sense?  Nope!  But there isn't much we can do about it.

But what if somebody took the genome and cut it down to what proteins are expressed in what tissues by doing RNASeq on individual tissues?

And what if they backed it up with protein arrays to show that the protein levels strongly correlate with the transcript levels?

In my (humble?) opinion, that's a paper that ought to be in MCP!

Who needs a mass spec?  Wait! What?  Everybody!  But this is pretty cool and will be helpful later!!  Let's try re-running that brain, liver, or whatever with a FASTA that only contains proteins from that organ and see how our results jump through the roof and our FDR drops through the floor!
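If you wanted to try the tissue-specific FASTA idea, a rough sketch of the filtering step might look like this.  Everything in it is an assumption for illustration: the accessions (P1, P2, P3), the sequences, and the idea that the accession is the first token of the header are all made up, and the paper does not ship this code.

```python
def read_fasta(lines):
    """Yield (header, sequence) tuples from an iterable of FASTA lines."""
    header, seq = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(seq)
            header, seq = line[1:], []
        else:
            seq.append(line)
    if header is not None:
        yield header, "".join(seq)


def filter_fasta(lines, expressed_ids):
    """Keep entries whose accession (assumed: first token of the header) is in expressed_ids."""
    kept = []
    for header, seq in read_fasta(lines):
        accession = header.split()[0]
        if accession in expressed_ids:
            kept.append((header, seq))
    return kept


# Toy database with made-up entries
example = """>P1 liver enzyme
MKTAYIAK
>P2 brain receptor
MSEQNNTE
>P3 liver transporter
MALWMRLL
""".splitlines()

liver_ids = {"P1", "P3"}  # hypothetical: accessions with transcript evidence in liver

for header, seq in filter_fasta(example, liver_ids):
    print(f">{header}\n{seq}")
```

With a real database you would read the lines from a file and build the accession set from the expression data, but the logic stays the same.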

Thursday, December 19, 2013

Do Byonic searches directly in Proteome Discoverer!!!!!!

I knew this was coming.  I really really did.  But that doesn't stop me from being crazy super psyched about it.

You can buy a Byonic node and put it directly into Proteome Discoverer right now!

If you don't know about Byonic, go to the search bar and look up my ravings about this software over the last year.  Next generation search engines are GO!  With MSAmanda for high-high MS/MS data and Byonic for your PTMs, in 2013 Proteome Discoverer has gone from relying on (perfectly suitable) search algorithms written when I was in high school to brand new engines that 1) take advantage of high res MS/MS data and 2) can search PTMs in a completely new and super efficient way!


I don't have this yet.  Expect me to hunt it down and data to follow!

Tuesday, December 17, 2013

Rockethub -- Another place to crowdsource fund some science

This article came by way of this month's Wired (which is pretty great, btw, as Bill Gates stepped in as editor).  I knew about RocketHub, but I didn't realize that people were successfully using this as a method to fund their research.  You can browse through, look at research proposals, and donate to ones that you feel are viable.

I flipped through a few on the site (by simply using the search term "science", which also brought up a lot of science fiction).  LabScene is one of my favorites, a project that hopes to replace the peer review process!

You can go directly to Rockethub here, and check out the video for LabScene here.

Monday, December 16, 2013

Tools for proteomics app

This is a neat little app.  Visually appealing and with nice little summaries of mass spec techniques and hyperlinks to where you can actually purchase the consumables to do the work.  Now, how many people have purchasing departments that will allow them to use an iPad to place orders?  I don't know, but we all know everything is moving that way so it's only a matter of time.

Sunday, December 15, 2013

Matching cross-linked peptide spectra: only as good as the worse identification

We all want to cross-link peptides and do mass spec on them, right?  We have a protein of interest and we want to know what other proteins are interacting with it.  So the strategy is to throw in one of the 20 or so crosslinkers out there, pull down our protein of interest with an antibody or something and then do MS/MS on everything OR specifically study the crosslinked peptides.

Problems?  When we pull down one protein we pull down tons of proteins.  Part of the reason is that antibodies really aren't 100% specific, particularly due to the incredible number of protein isoforms present in biological systems.  Another part is that no protein system is composed of just a few proteins.  Billions of years of evolution have forced an unbelievable level of intricacy in hundreds if not thousands of proteins working simultaneously together to efficiently achieve even the most simple of tasks in the most energetically favorable manner.  This isn't done by textbook pathway drawings of 6 proteins.  Not when throwing in 100 more will cut the energy requirements of that reaction by 30%.

Another problem?  The false discovery rates of small protein complexes (or, heck, even big ones) suck.  FDR works best with bigger and bigger datasets.  Small ones just don't work right.

A worse problem?  The crosslinked peptides give horrendous FDR calculations.  Awful.  'Cause you have to use so many dynamic modifications per peptide sequence.  This equals horrible dynamics.  Add that to your small sample size and you're often looking at a random number generator.

{End rant}

This paper is badass, btw.  You know what they did?  They look at the crosslinking in a biologically relevant context.  No kidding!  They take into account the protein crystal structure providing the proximity of the residues for crosslinking and throw that into the FDR!!!!  Cause we have that data out there for most proteins (okay, not most, but for most important proteins.)

Okay, so there is a disconnect here, maybe.  Yes.  We have the crystal structure for individual proteins.  Lots and lots of them.  And this process will work for that (they prove it in this paper using RNA Polymerase II as an example).  But what I'm more interested in is in complexes, and we don't have nearly the same degree of data for those.  So I guess I'm extending the real power of this paper a little, but what a step forward!  I'm imagining the extension of this algorithm if it eliminated binding sites we know are in use or are deep in the internal structure of the protein.  But, holy cow, this paper is really really smart....

Read it (currently open access) here!

Saturday, December 14, 2013

Spectral counting in Proteome Discoverer

The image above is stolen directly by/from Google Images.  First thing that pops up if you look up "spectral counting".  Anyway, I often get questions about using Proteome Discoverer for spectral counting.  I have some slides that I cut from an iORBI I attended several years ago showing spectral counting in PD and this is what I send people.

I'm not a big fan of spectral counting, but it does have its place sometimes.  The slides show you a comparison of spectral counting vs. quan of peaks at the MS1 intensity level, as well as the limitations in dynamic range from spectral counting (with a very nice reference).
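For anyone who hasn't seen it spelled out, spectral counting in its simplest form is just tallying how many identified MS/MS spectra point to each protein.  Here is a bare-bones sketch with made-up PSM lists; it is not how Proteome Discoverer does it internally, and the protein names and scan numbers are placeholders.

```python
from collections import Counter


def spectral_counts(psms):
    """Count identified MS/MS spectra (PSMs) per protein.

    psms: iterable of (scan_number, protein_accession) pairs.
    """
    return Counter(protein for _scan, protein in psms)


# Made-up PSM lists for two runs (illustration only)
run_a = [(1, "ALBU"), (2, "ALBU"), (3, "ALBU"), (4, "TRFE")]
run_b = [(1, "ALBU"), (2, "TRFE"), (3, "TRFE"), (4, "TRFE"), (5, "TRFE")]

counts_a = spectral_counts(run_a)
counts_b = spectral_counts(run_b)

# A Counter returns 0 for proteins that were never seen in a run
for protein in sorted(set(counts_a) | set(counts_b)):
    print(protein, counts_a[protein], counts_b[protein])
```

Because the counts are small integers, a protein can only change in whole-spectrum steps, which is one way to picture the dynamic range limitation.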

EDIT (2/1/17) New DropBox link!

Friday, December 13, 2013

Peaks 7

My first Xmas present arrived while I was on vacation.  Peaks 7 came out and I got a nice long trial license to check it out.  I installed it on my plane ride home.

The list of new features from Peaks 6 to Peaks 7 is kind of mind blowing.  This software is very very sophisticated.  It is supposed to be faster, it has online collaboration and sharing modules embedded within it and it now does label free quan (with really pretty heat maps).  There is a full list of the new features available here.

However, the features I'm most interested in checking out are described as "improved de novo localization scores" and "statistical charts for accurate filtration of de novo data".  I love to hear about improvements in scoring accuracy and FDR, and de novo is where FDR needs the most improvement.  Anybody going out of their way to improve that can send me their software in an easy to install way with a nice free trial and I'm going to do a fair job of checking it out in my free time.

Bonus?  You can directly import MSF reports from Proteome Discoverer 1.3 and 1.4.  I don't know if this will simplify my workflow for de novo studies (link to the video I made for working de novo into PD), but it just might.  I'll be back later with first and final impressions!

Thursday, December 12, 2013

MaConDa -- A nice resource for identifying contaminants

This is a cool resource I recently stumbled across!  The MaConDa is a really easy and simple site that has exact mass information on previously identified contaminants in MS/MS runs.  I know most of us have the supplemental information Excel sheet from that cool paper from a few years ago lying around somewhere, but this is like that with a couple of neat twists.

1) You can filter by instrument type (ion trap, QQQ, or TOF [probably what you'd use for Orbi])
2) You can set a custom PPM error, such as that for your instrument
3) You can filter by contaminant type
4) You can output a list that contains adducts.  Icing on the cake!
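
The custom PPM error filter in point 2 boils down to a comparison like the one sketched below.  This is not MaConDa's code or its API; the function is mine and the contaminant names and masses are placeholders I invented so the example runs, so swap in real values from the database.

```python
def match_contaminants(observed_mz, contaminants, tol_ppm):
    """Return (name, mz, ppm_error) for every entry within tol_ppm of observed_mz."""
    hits = []
    for name, mz in contaminants:
        error = (observed_mz - mz) / mz * 1e6
        if abs(error) <= tol_ppm:
            hits.append((name, mz, error))
    return hits


# Placeholder entries only: these are NOT real contaminant masses
placeholder_list = [
    ("contaminant_A", 300.0000),
    ("contaminant_B", 300.0100),
    ("contaminant_C", 450.1000),
]

tight = match_contaminants(300.0005, placeholder_list, tol_ppm=5)
loose = match_contaminants(300.0005, placeholder_list, tol_ppm=50)
print(len(tight), "hit(s) at 5 ppm;", len(loose), "hit(s) at 50 ppm")
```

The wider the tolerance, the more candidate contaminants you have to sort through, which is the whole reason to set it to match your instrument.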

Check MaConDa out at this link.  I bet you'll end up using it sometime.  I used it today!

Sunday, December 8, 2013

GOrilla -- Gene ontology tool with a great name

Quick note that I found while reading the new MCP on vacation...I know....I have a problem.  In my defense, there are some great papers this month....

Anyway, a new-to-me tool for gene ontology is GOrilla, which appears to be hosted by the Weizmann Institute.  I know there are a lot of GO tools out there, but this one has a couple of nice features beyond its great name.  The first is a customizable p value threshold for your gene enrichment analysis.  The second is the easy control of your output format.  You can simply checkmark the box to output your data in Excel format and/or you can export it directly into the Revigo visualization tool.

You can check out GOrilla here.

Friday, December 6, 2013

Ohaiyou gozaimasu!

I've got one backpack full of stuff, my iPad, and I'm on a bus to the mountains of northern Japan.  My goals include seeing a wild snow monkey, snowboarding, and making a dent in the global supply of Sapporo (which is crazy cheap here!).  As a consequence, the blog may not see many updates until I return.  It should be an exciting winter, however, as many many good things are happening and I can't wait to share them with y'all!

Thursday, December 5, 2013

PRTC. I swear I wrote this entry once before!

Are you running some sort of quality control when you do proteomics?  If your answer is "of course" then I like you.  Heck, I'm a friendly guy.  If your answer was "never, and I hate pugs!" I'd probably still like you, but I might like you better if you are running some sort of QC.

We need to have some sort of metric of how our instruments are running.  I'm often asked what my favorite is.  This question commonly comes after people find out that I don't know what a BSA digest should look like...

The answer, and I swear I wrote all of this a long time ago, is PRTC.

PRTC?  Now, I'll admit, I didn't know about this thing until I joined Thermo.  But I like it so much that I keep aliquots in the ziplock baggy that I keep my toothpaste in for when I travel.

PRTC stands for "Pugs Rock The Cazbar!"

or Peptide retention time calibration....

What it is:  A clean, equimolar mixture of 15 isotopically labeled peptides of varying hydrophobicity.  Running these can give you a picture of the performance of your LC gradient and your signal intensity over time.  Since they are isotopically labeled you don't have to worry about them being mistaken for something else if you happen to have a small fraction carry over into your next run.

They are also well aliquoted.  So you can use them as spiked in standards at low concentration to normalize label free peptide quan from sample to sample. (I should have some nice data on this in the next couple of weeks.)  And in Pinpoint, you can simply add in QC peptides and it throws them in.
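
Here is one way the spiked-in normalization could be sketched.  To be clear, the median-ratio approach, the peptide names, and the intensities are all my assumptions for illustration; this is not a Pinpoint or Pierce protocol.

```python
from statistics import median


def normalization_factor(standard_intensities, reference_intensities):
    """Median ratio of reference to observed intensity across the spiked standards."""
    ratios = [
        reference_intensities[peptide] / standard_intensities[peptide]
        for peptide in reference_intensities
        if standard_intensities.get(peptide, 0) > 0
    ]
    return median(ratios)


def normalize(sample_intensities, factor):
    """Scale every peptide intensity in a run by the normalization factor."""
    return {peptide: value * factor for peptide, value in sample_intensities.items()}


# Made-up intensities for three spiked standards in a reference run and a second run
reference_run = {"std1": 1e6, "std2": 2e6, "std3": 4e6}
second_run = {"std1": 5e5, "std2": 1e6, "std3": 2e6}

factor = normalization_factor(second_run, reference_run)
print("factor:", factor)
print(normalize({"pepX": 3e5}, factor))
```

In this toy case the second run reads at half the signal of the reference, so every peptide in it gets scaled up by a factor of 2.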

Oh.  And it's cheap.

Downside?  None.  Period.  Exclamation point.

Want to know more?  Check out the product page at Pierce, or this sweet application note written by some of my favorite people!

Wednesday, December 4, 2013

Comparison of peptide and protein fractionation methods in proteomics

This is a nice analysis that comes from a pretty simple set of experiments that were just done nicely.  The article from Mostovenko et al. (open access!) compares multiple methods of fractionating both an E. coli digest and a single digest of human plasma.

The methods compared are:  SDS-PAGE vs. SCX vs. IEF.  The output is unique peptides and overlap between methods.  Interestingly, in the bacterial digest, SDS-PAGE and SCX run kind of neck and neck, while IEF lags behind.  However, in the plasma digest, SCX fractionation is the clear winner.
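
If you want to run the same kind of comparison on your own fractionation data, the unique peptide and overlap bookkeeping is just set arithmetic.  A small sketch follows; the peptide sequences are invented and do not reproduce the paper's numbers.

```python
def overlap_summary(peptides_by_method):
    """Return per-method totals and method-exclusive counts, plus the peptides seen by all."""
    summary = {}
    for method, peptides in peptides_by_method.items():
        # Everything identified by any of the other methods
        others = set().union(
            *(p for m, p in peptides_by_method.items() if m != method)
        )
        summary[method] = {
            "total": len(peptides),
            "exclusive": len(peptides - others),
        }
    shared = set.intersection(*peptides_by_method.values())
    return summary, shared


# Invented peptide identifications per fractionation method
ids = {
    "SDS-PAGE": {"AAK", "CCR", "DDK"},
    "SCX": {"AAK", "EEK", "DDK"},
    "IEF": {"AAK", "FFR"},
}

summary, shared = overlap_summary(ids)
for method, stats in summary.items():
    print(method, stats)
print("seen by all methods:", sorted(shared))
```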

Tuesday, December 3, 2013

Protein Expression Control Analysis (PECA)

This is a paper for all you bioinformatics people out there.  Partially because you need to have a stronger background in computer stuff than I do to even install and use PECA.

It appears to be a nice tool for the analysis of RNA expression data and it has the capabilities for also inputting quantitative proteomics data from different formats.

PECA differentiates itself from other software by specifically targeting transcripts or proteins that fall within a steep but limited range of almost logarithmic increase or decrease.

For more information, check out the abstract at JPR (not open access) here.  You can also download the software for compiling and install directly from sourceforge here.

Monday, December 2, 2013

Arginine phosphorylation in bacterial stress response

In press at MCP is a great new paper showing how arginine phosphorylation is used by Bacillus subtilis in the regulation of response to stress.

And it isn't a little involved.  It's a lot involved.  This study shows that it can be linked to heat shock response, response to oxidative stress, and resistance to drugs.  Pretty impressive findings out of a well-characterized model organism.

Beyond the fact that we have yet another PTM to worry about, this paper is valuable for the clear (and pretty simple!) methodology for harvesting, enriching, and analyzing arginine phosphopeptides.

A good read, if only for putting in perspective how much we don't know about the physiology of even the most well-studied organisms!  Definitely check it out while it is still open access.  Direct access to the PDF is here.

Sunday, December 1, 2013

How far is the human proteome project at this point?

The human proteome project has been rocking for a while now.  How far has all this work gotten so far?

Well, here is an update (not open access), compliments of JPR and Terry Farrah et al., and the number is around 62%.

62% what?

Oh, 62% of the coding sequences of DNA that we think code for proteins have strong supporting evidence of their existence in MS/MS spectra.  That's pretty cool, right?  Over halfway!

Let's take a moment and think about how great this is, and how far we've come so far!

While doing so, let's forget the fact that one post translational modification can have dramatic ramifications on the function of a protein.  Let's also forget the fact that in 2011, we knew of about 80,000 specific PTMs. Also, let's forget about conformational changes that can have effects every bit as impressive as PTMs.

Please don't get me wrong, I'm not trying to put down the work of the participants of the human proteome project or the good people at ISB who are running the peptide atlas.  I'm simply concerned about our tendency to underestimate the complexity of biological systems.  We did that with the human genome project.  First of all, getting MS/MS spectra for all of the proteins predicted from the HGP data is the tip of the iceberg.  Secondly, let's not declare big ongoing projects completed for a while.  Grant dollars are pretty scarce out there, and we don't need ignorant politicians reading headlines and cutting all the money to our friends because they think the job is done.

Ran into this one thanks to Twitterer @PastelBio

Saturday, November 30, 2013

Comprehensive history of the Orbitrap by Dr. Makarov!

There are a lot of stories out there about the development of the Orbitrap system.  Want the whole story directly from Alexander Makarov?  Check out this month's issue of the Analytical Scientist, cause he wrote out the whole history.

The article, "Orbitrap Against All Odds" is a good read for both people inside the field as well as for anyone who is trying to push through an idea that they believe in, despite the opinions of others.

You can download the complete PDF here!  (You may need to register, first, but it is free!)

Friday, November 29, 2013

How does TMT10 affect peptide charge states?

A reader wrote in with this very sensible question regarding one of my posts on the TMT 10plex reagents.  The question from Javi:  How does the new TMT 10plex reagent affect peptide charge states?  For example, he notes that iTRAQ can lead to an increase in charge states.  The TMT0 reagent, as well, is often used in ETD studies because the charge state gets pushed up.

But does the TMT 10plex do the same thing?  I could probably ask someone, but since I have a lot of data lying about in all of these portable hard drives, maybe I should just look at a few.  Who needs scientific rigor?  This is a blog, after all!

Anyway, I picked two tryptic digests that each had roughly 30,000 MS/MS events, both are adherent cancer cell line digests.

The black bars are the unlabeled digest.  And the majority of the peptides appear to be +2.  The TMT labeled seems a little biased toward +3.

So, in my completely unscientific (and roughly 4-minute) analysis I'd say, yes, the TMT 10plex is similar to other isobaric peptide labels in that it tends to lead to an increase in peptide charge state.
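
If you want to repeat this sort of quick look on your own files, the tally is only a few lines once you have the precursor charge for each MS/MS event.  The charge lists below are made up to show the shape of the output; they are not the numbers from my two digests.

```python
from collections import Counter


def charge_distribution(charges):
    """Return {charge: percent of MS/MS events} for a list of precursor charges."""
    counts = Counter(charges)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {z: 100.0 * n / total for z, n in sorted(counts.items())}


# Made-up precursor charges for ten MS/MS events per run (illustration only)
unlabeled = [2] * 6 + [3] * 3 + [4] * 1
labeled = [2] * 4 + [3] * 5 + [4] * 1

print("unlabeled:", charge_distribution(unlabeled))
print("labeled:  ", charge_distribution(labeled))
```

Reporting percentages rather than raw counts matters here because two runs rarely have exactly the same number of MS/MS events.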

Keep these questions coming!  Sometimes I seriously just run out of things I'm interested enough to write about!

Thursday, November 28, 2013

Turkey (egg shell) proteomics!

Happy Turkey day!  This time, that isn't my pug, he just looks just like him!  That costume is ridiculously expensive.  I'll get it after the holiday when it goes on sale for next year!

Now, I often wonder strange things like:  A whole lot of researchers sure do work with green monkeys as disease models, but the genome has never been finished?  But the turkey genome was finished way back in 2010 (partially, I believe due to my alma mater and the fact that a neutered turkey is our mascot...).  Wow.  This isn't even close to a coherent thought at all!  If you're used to my disjointed rambling, you're probably okay with it, or you've already skipped ahead to the science.

So...what do we do with a turkey genome?  Turkey proteomics, of course!  In this study, Karlheinz and Matthias Mann take a look at the turkey egg shell in comparison to that of the chicken egg shell. Seriously!  And yes, it is a pretty interesting paper!  Ever wondered how to extract protein out of an egg shell?  Not any more!  This (open access) paper has a clear method.

The extracted proteins were run on an Orbitrap Elite in high:high mode and comparisons were done with the fancy statistics in MaxQuant/Andromeda.  It is pretty neat because the extraction required the subfractionation of the proteins present by what they were dissolved in.  Now, I'm a little bit confused about the analysis.  It appears that they used MaxQuant to do a meta-analysis (in short, old data from a database compared to newly acquired data) of the new dataset (turkey) vs an old dataset (chicken).  I do a lot of meta-analysis of genomics data.  But we have all the nice statistical tools we need, as well as enough replicates of the data to verify statistical robustness (a single microarray may have as many as 20-100 signals per protein depending on the array type).  I am unclear as to how this can be done in MaxQuant.  It is likely that the newest versions of MaxQuant have some new statistics tools and I just haven't upgraded recently enough.  That probably means it's time for some MaxQuant reviews!

Now we just need to get working on that green monkey genome.

TL/DR:  Matthias Mann's lab did proteomics on turkey eggs.  Ben wrote this and a lot of other words because he thought it would be funny to write about turkey proteomics on turkey day.

Wednesday, November 27, 2013

What does a good TMT or iTRAQ MS/MS spectrum look like?

Holy cow!  I haven't posted anything in almost a week.  Normally there are very good reasons for this, like 1) I changed my password and forgot it or 2) It was nouveau week, or 3) The super cool projects I'm working on in my spare time are either a) something I can't yet tell you about 'cause it's secret or b) something that didn't actually work.  Possibly a combination of all of these, but I'd appreciate it if y'all would assume it isn't primarily 3b!

But now I'm back, full of espresso and I'm excited to throw this one out here.  I may have written about this before, and I plan to actually do another entry later on "this is a good spectrum, this is not" but this one is pertinent to a lot of people due to the explosive popularity of the (fantastic!) TMT10 reagents.

Here is the question:  When I'm looking at an MS/MS spectrum of a reporter ion tagged peptide, what am I looking for to tell that I have a good one?  I.e., how can I tell that my HCD collision energy is too much or too little?

Disclaimer:  I totally made the following up.  I ran iTRAQ for years for my own research and help at least one person a week optimize their reporter ions.  This is the way I do it.  People have probably published other better ways of doing it, but this one is faster.

I base my opinion of whether I'm looking at a good MS/MS spectrum on only 2 things:
1) Are there reporter ions?
2) Can I find my parent ion at <5% base peak intensity?

Randomly chosen example from a friend's TMT10 run:

This is on an Orbi Velos.  The chromatogram isn't ugly because of spray stability issues (Patricia doesn't mess around when it comes to technique, the spray stability is great).  It is ugly because of the relatively high amount of time it takes the Orbi Velos to do a Top15 method with MS/MS at 30k resolution.

Let's look at criterion #1

Reporter ions!  Check

What about criterion #2?  The base peak is 1.2E6

How about my parent ion?

Hard to read, but the parent is 4E4.  Less than 5%, but still there.  HCD can be a little tricky to optimize.  It can be easy to over-blast your peptides and not have enough left to sequence.  If there is still a small percentage of the parent around, then I can feel pretty confident that I didn't hit the peptide too hard.  If there is a lot of parent around then I didn't hit it hard enough.  The 5% rule is a crude estimation.  Is there parent?  Is there just a tiny bit?  Perfect.
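
For the number crunchers, here is the same check as a little sketch, plugging in the two intensities from this example (parent at 4E4, base peak at 1.2E6).  The function names and the wording of the verdicts are mine, and the 5% cutoff is the same crude rule of thumb described above, nothing more rigorous.

```python
def parent_fraction(parent_intensity, base_peak_intensity):
    """Parent ion intensity as a fraction of the base peak intensity."""
    return parent_intensity / base_peak_intensity


def judge_hcd(has_reporters, parent_intensity, base_peak_intensity, limit=0.05):
    """Apply the two crude criteria: reporter ions present, parent present but below the limit."""
    if not has_reporters:
        return "no reporter ions"
    if parent_intensity <= 0:
        return "no parent left: collision energy may be too high"
    if parent_fraction(parent_intensity, base_peak_intensity) < limit:
        return "looks good"
    return "lots of parent left: collision energy may be too low"


# Numbers from the spectrum discussed in this post
print(f"parent is {parent_fraction(4e4, 1.2e6):.1%} of base peak")
print(judge_hcd(True, 4e4, 1.2e6))
```

4E4 over 1.2E6 works out to about 3.3%, so this spectrum passes with a little parent to spare.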

So this brings into play the big advantage that I perceive between the iTRAQ 8 plex and TMT10, and why every person I've seen do the comparison has switched to TMT10.  This is much easier to optimize.  The iTRAQ 8 chemistry is tricky.  It takes several passes to get your collision energy where you have reporter ions AND you have enough peptide left to sequence.  It is significantly easier to get this right with the TMT10, because the reporters come off with at least the same efficiency as the breaking of the peptide backbone.  When in doubt process the data!  I bet you'll find that spectra optimized like this will end up sequenced with good quan data at a pretty high efficiency.

TL/DR:  It's a good reporter ion MS/MS spectrum if you have reporter ions and you can still find some of your parent ion at a low level in the spectrum.

Thursday, November 21, 2013

SCAMPI-- A statistical approach to protein quantification

We need more statistics in proteomics.  We all know that.  We particularly need them in our quantification studies.  This is a little easier when we're doing label free but, of course, that comes with its own set of new challenges.

I get all sorts of excited when I see a proteomics paper that looks like a listing of fraternity houses, and this new paper from Sarah Gerster, et al., definitely fits that description.  In this study, the team describes SCAMPI, a protein quantification tool written in R, everyone's favorite statistics program.

Now, this is where this blogger stops.  I drew your attention to it.  I looked at every page.  I think anything where we start to treat proteomics like every other science and do robust statistical magic is going to move us forward.  I can't really tell you if this is a good one, but it looks nice and it's got Ruedi Aebersold's name on it, so I figure it's worth checking out.  At the very least it has a memorable name.

Wednesday, November 20, 2013

How to do intact or top-down analysis of intact proteins on an Orbitrap

It is funny that I haven't written about this before, particularly when it is such a common question for me to be asked, and even more particularly because it is so counter-intuitive.

First of all, I don't understand the physics or anything, I just have these simple concepts in my head (heck, as far as I can tell, the physics seems a little controversial anyway).

Concept 1)  Proteins hate to be trapped.
In my head, I visualize the fact that we can't achieve a perfect vacuum, so there are some gas molecules in the traps, regardless of how well we pump them down.  The longer our big ol' proteins are in the trap, the more likely it is that they'll run into one of these stray gas molecules.

Concept 2) Crap sticks to proteins, so we need to blast them a little.
I was around when some previous students of Neil Kelleher's had a lively discussion regarding the physics around this.  My brain was its normal reliable self and went to thinking about something like this:
Fortunately, for all intents and purposes all I really need to know is:  crap sticks to proteins, so blast them a little.

Okay, so those are my concepts.  These are directly linked to how I'm going to get a bad ass intact protein MS1 spectrum:

The steps:
1) Find a nice protein standard and direct inject it.  If it is apomyoglobin, bring it up in 30% organic or higher or it won't dissolve (thanks Rosa!).  Start small, say 10-30kDa.  If doing high flow, you're going to need quite a bit.  It definitely depends on your instrument, sensitivity, etc., but for an Orbi Velos or QE, I'll probably start with something as high as 0.1ug/uL in 30-50% acetonitrile with 0.1%-0.3% formic acid.  Once I get it, I can always dilute the next injection.

2) Use the lowest resolution your instrument has.  (See, counter-intuitive, right?)

3) Fill time is not your friend, that's just more trapping time.  Keep it low, but your AGC target high (3E6 AGC, but 50 or 100ms fill time at most).

4) Microscans ARE your friend.  Rather than filling for 200ms, which is one set of proteins given a chance to react with spare gas molecules, you can do 4 microscans of 50ms, giving 4 times the number of ions 1/4 of the time to get messed up.

5) S-lens RF or tube voltage, depending on the kind of instrument, are going to be interesting things for optimization.  Mess around with them till you get the best signal.

6) Adjust the spray voltage and capillary temperatures.  In general, turn them down lower than you have been using for cal mix.  These can beat up your proteins.  A lot of times if I'm using a HESI source, I just turn off the auxiliary heater (just set it to 0, it will always show you a red mark by that temperature, but that's okay!)

7) Try adding some in-source collision energy to knock some crap off your protein.  Watch for a drop in signal due to fragmentation as you raise the energy levels.

8) Acquire a set number of MS1 scans.  I like 100.  Open the file, average the spectra and see how that looks.  Does it suck?  Increase your microscans and adjust all the things I mentioned above.  Try again till it looks nice.

9) Are you happy with your resolution?  If no, raise the resolution, repeat steps 3-8.  If yes, move on to a bigger protein, and start at number 3 again!  Try cutting your concentration and repeating.  What is your limit of detection?  Keep in mind the rough numbers, because if you move from the ESI to micro or nano-flow, you're going to have increases in sensitivity in most cases (not all!  these big proteins can be harder to solvate with nano than high flow ESI).
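
To put some rough numbers on the microscan idea in step 4, here is a toy sketch in Python.  It is NOT real ion physics.  It just assumes that the chance of a protein surviving in the trap drops off exponentially with trapping time, and it treats every ion as if it sat in the trap for the whole fill (the worst case).  The `mean_survival_ms` value is completely made up for illustration.

```python
import math


def surviving_fraction(trap_time_ms, mean_survival_ms):
    """Toy model: fraction of ions still intact after trap_time_ms.

    Assumes simple exponential decay; mean_survival_ms is a hypothetical
    constant, not a measured instrument value.
    """
    return math.exp(-trap_time_ms / mean_survival_ms)


def compare_fill_strategies(total_ms=200.0, microscans=4, mean_survival_ms=100.0):
    """Compare one long fill against the same total time split into microscans."""
    per_microscan_ms = total_ms / microscans
    return {
        "single_fill": surviving_fraction(total_ms, mean_survival_ms),
        "microscans": surviving_fraction(per_microscan_ms, mean_survival_ms),
    }


result = compare_fill_strategies()
print(f"one 200 ms fill:   {result['single_fill']:.2f} of ions intact")
print(f"4 x 50 ms fills:   {result['microscans']:.2f} of ions intact")
```

Whatever the real number is for your instrument and protein, the shape of the argument is the same: shorter fills give each batch of ions less time to get messed up, and the microscans add the signal back.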

Intacts are hard to do.  Keep that in mind.  This is a process.  It is best to start with a higher concentration of a lower molecular weight protein at low resolution and work your way up to that antibody.  Once you get a nice signal, then you can start thinking about things like SIM scans for better signal and think about fragmenting these big things!
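
If you want to do the averaging from step 8 outside of the vendor software, a minimal sketch might look like the Python below.  It assumes you have already exported each scan as a list of intensities on the same m/z grid (same number of points, same order).  The function name and the tiny example scans are hypothetical, and real data would need to be binned onto a common grid first.

```python
def average_spectra(scans):
    """Average intensities point by point across scans on a shared m/z grid.

    scans: list of equal-length lists of intensities (one list per scan).
    """
    if not scans:
        raise ValueError("need at least one scan to average")
    n_points = len(scans[0])
    if any(len(scan) != n_points for scan in scans):
        raise ValueError("all scans must share the same m/z grid")
    n_scans = len(scans)
    # zip(*scans) walks the scans column by column (one m/z point at a time)
    return [sum(column) / n_scans for column in zip(*scans)]


# Three made-up scans with four m/z points each
example_scans = [
    [0.0, 10.0, 2.0, 1.0],
    [1.0, 12.0, 0.0, 1.0],
    [2.0, 11.0, 1.0, 1.0],
]
print(average_spectra(example_scans))
```

Random noise tends to average down while the real protein peaks stay put, which is why the averaged spectrum usually looks so much nicer than any single scan.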

Important note:  When you buy a protein standard, it comes all full of junk.  There are salts and detergents and preservatives and often other proteins that are in there to preserve that protein.  Most standards will benefit greatly, maybe enormously, by some sort of pre-cleanup method. 

Can't get those last air bubbles out of your nano-LC system?

So you've purged and flushed air, and run your LC at high speed, but you've still got some pesky air bubbles eluting from the tip of your emitter?  Don't just get super angry, do something about it!

I just learned this trick this week after spending a couple days trying to solve exactly this situation.  I received a suggestion from a coworker that seemed a little nutty.  Fortunately, if it had involved a ritual rain dance, I probably would have tried it at that point.

DISCLAIMER:  I don't know very much about LCs at all.  I know they pump liquid of a specific volume in a certain direction at a user-controlled rate.  Do not take any advice from me on this (or quite frankly, on anything else!) without consulting your service manual, engineer or tech support.

Anyway, what I ended up doing, based on this suggestion, was run an injection of 100% isopropanol through the system as it was.  I set the LC to an artificial "1 column setup"  (there were 2, but I didn't tell the LC).  This way all of the isopropanol was pushed through both the trap and analytical column.

And you know what?  It totally worked.  It might not work for you or for anyone else you know.  But it worked for me, and I looked less silly doing it than if I had gone with the option the guys below chose.  Honestly, they look pretty cool.  I would look far less cool doing it, but if the IPA injection fails....

Tuesday, November 19, 2013

Shortix: Cut silica correctly every time!

A group I'm working with this week has this awesome little tool.  It is perfect for people like me who can't cut fused silica cleanly and evenly each and every single time they try.

You push the silica into the device while pressing a little entry button that holds the diamond cutter out of the way.  You tighten this thing down so it holds the silica evenly, let go of the button, rotate the wheel and BOOM! perfectly cut silica.

Down-side?  It is $300.  You can purchase it here.

Monday, November 18, 2013

IPRG 2012 -- What did we learn?

In general, the ABRF (The Association of Biomolecular Resource Facilities) has some awesome ideas and the IPRG 2012 study is no exception.

In this study, synthetic peptides were produced that contained common modifications on their respective amino acids, including phosphorylation, acetylation, methylation, sulfation and nitration events.  The synthetic peptides were spiked into yeast tryptic digest.  The anonymous participants of the study ran these samples and attempted to search for these PTMs using a variety of LC-MS/MS and processing conditions.  While the level/number of identified spectra was a measured metric, the real focus of this study was the efficiency of identification of the modified peptides and the correct localization of those modifications.

The results are definitely interesting across the board.  One place of particular interest is a breakdown in the paper of the number of peptides, both consensus and unique that were identified by each research group.  The study showed that the clear winners were a group that used Byonic as the primary search engine.  Surprisingly, the one researcher who used Proteome Discoverer/Sequest had the lowest number of identified peptides in the study.  Having personally compared PD to every one of the search engines compared in this study on at least a few, if not numerous datasets, I have to think that this group had issues either with their instrumentation or experimental design.  Nothing short of that would explain the discrepancy.  While it would be interesting to know for sure what happened, that would negate a good bit of the anonymity of this study.

Another place where Byonic really showed power was in the identifications of the known modifications and the correct placement of them.  Interestingly, nearly all of the instruments and methodologies had trouble with one modification in particular, tyrosine sulfation.

Now, I want to throw out my cautious opinion on this study. I definitely see the value in comparing lab to lab, particularly when reproducibility is such an active criticism for our field.  It is definitely worth thinking about the small sample size and the huge array of variables that this study is taking a swing at.  Different instruments, LC gradients, packing material, ionization sources and their relative efficiencies, processing schemes, etc., etc., all contribute to these results.

Is it valuable to know where we are in terms of global abilities to accurately assess PTMs?  Absolutely, and this is certainly a valuable snapshot of where we are.  But we should be slow to make judgments based on this small sample size and intrinsic variability.

You can read the paper, In Press, here.

Saturday, November 16, 2013

Nerdy computer note of the month: DDR4 release!

PC nerd alert.  DDR4 memory is about to release.  Crucial says they'll have the first modules out next month.  Want your processing PC to access memory faster, but also use less energy?  Enter DDR4.  Twice as much memory per stick (16GB!  woooohooo!) with access speeds twice that of DDR3.  Read the marketing press release here.

59 proteoforms of ovalbumin?

I'm currently just overwhelmed in my raw appreciation of just how cool science is and of how very very little we seem to know about our world around us.  There is stuff to discover absolutely everywhere!

Case in point, this new paper out of Albert Heck's lab where they use a modified Exactive (essentially the Exactive plus EMR) to study ovalbumin in its native state.  The same ovalbumin that we use as a molecular weight marker for SDS-PAGE.  The same ovalbumin that is sitting on a fridge shelf for some reason or another in virtually every lab in the world.

And what do they see in their nifty native analysis?  59 proteoforms!  59 distinct variations of this standard protein.  Seriously?  59?!?!  Does that blow anyone else's mind a little?

Step sideways a second:  Remember the human genome project release?  When we were super excited that we had 30,000 genes sequenced or whatever after close to a decade of work? (I drank a lot of beers yesterday with a friend who told me that he can sequence a human genome with 30x coverage in 1 day, but that's a different thought for a different day).  So we had 30k genes all worked out, and that is a lot of complexity.  But even if we ignore all the variations in transcription/translation that we know about now, and just pretend that 1 gene made one transcript and that transcript made one protein, here we see 59 variants of that protein that, for the most part, we couldn't/wouldn't find (or it would be pretty difficult to discern) unless we looked at the protein in its intact and native state.  That is a lot of complexity!  But think about the fact that we know there are possibly millions of protein variants at just the linear amino acid/modification level, and throw in the fact that these can actually result in a much larger combination of proteoforms and wow!  Doesn't that make it seem amazing that we have come so far, but also exciting how much further we have to go?!?!

Overwhelming feeling here?  We all need to do more intact and native analysis.  (In a related note, this week I'll be doing some top-down work on a QE Plus with the Protein Mode upgrade {Woooohoooo!}.  Of course, my opinions/results on that will follow!)

Check out this paper.  If only to get an idea about how many things biologically kind of make sense, but don't really, that might make sense if we took into account the fact that what we think of as 1 protein could actually be dozens of variants that we just haven't had the tools (until now!) to even see.

The paper that has inspired me to get out of bed with a ton of appreciation for the world today is called:

Analyzing Protein Micro-Heterogeneity in Chicken Ovalbumin by High-Resolution Native Mass Spectrometry Exposes Qualitatively and Semi-Quantitatively 59 Proteoforms

TL/DR:  Read this paper.

Friday, November 15, 2013

Weak statistics and lack of reproducibility

Umm...this one is disturbing.
Let's start at the title:

Weak statistical standards implicated in scientific irreproducibility

and then move to the subtitle:

One-quarter of studies that meet commonly used statistical cutoff may be false.

Ummm...already disturbing, right?  It gets worse when you start to think about the 2 most common criticisms of our field:  1) A lack of robust statistics and 2) A lack of reproducibility (you generally don't hear them in exactly this order...)

I'm actually not going to go any further.  You should check out this short editorial, though, and the 2 references.  This is a dialogue we're going to need to continue to have as a field through the future.  Yes, I'm dreading it.  Cause I don't want to be doing a lot of statistics either....

The editorial is here (and under 1 page!)

Thursday, November 14, 2013

Poo proteomics!

It is probably a little immature that I'm taking this very serious, interesting, and well published study and reducing it to the term "poo proteomics."  But sometimes, that just happens, and it's still my blog (please refer to disclaimer page)!  (I humbly issue an apology to the authors of this very nice paper if you find it offensive.  You have to admit that my title is catchier.)

The paper is actually called "Host-centric proteomics of stool: A novel strategy focused on intestinal responses to the gut microbiota," and is from a team out of Stanford.  In this very serious study, the researchers use a number of complex in vivo models of different gut flora and perform proteomics on the output.

Just a side note (and I'm totally cracking up here):  I'm picturing the staff scientist who runs this instrument and his/her face when they explain what they want to inject into his/her extremely well maintained analytical instrument....  To my good friends out there in Core lab type roles, I apologize because I've pictured a lot of your faces during this imaginary dialogue in my head.

Back to serious:  What they demonstrate:  more complex gut flora equals more complex poo proteome.  The results sound obvious, but imagine how useful an assay would be for gut infections (like the crazy deadly C. difficile variants) if you only had to take a tiny sample of stool (which a lot of hospitals acquire anyway) to classify. And come on, somebody was going to do this eventually, right?!?

mMass -- easy open source tools for mass spectra

I just happened across this one when 2 people asked me about a nice open source in silico fragmentation predictor in the same day.  Sounds like a search that will end up as a post!

I looked around, downloaded a few, and found my favorite, and it is mMass.  You can check it out at mMass.org.  This very nice piece of open sourceware has a ton of nice options, and is written by a guy who states that programming is his hobby.  The world needs to find more hobby programmers like this!

The program is super easy to download, install and use.  And the interface is very intuitive.  Besides fragment prediction, it is also a file converter, sharer, and viewer.  It can pick peaks, recalibrate your spectra and do some processing.  And on and on.
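
If you are wondering what an in silico fragmentation predictor is doing under the hood, here is a bare-bones sketch in Python.  To be clear, this is NOT how mMass does it (I haven't looked at its code).  It is just the textbook arithmetic for singly charged b- and y-ions of an unmodified peptide, using standard monoisotopic residue masses.  The function name and the example peptide are made up for illustration.

```python
# Standard monoisotopic residue masses in Da
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # monoisotopic mass of H2O
PROTON = 1.00728   # mass of a proton


def fragment_ions(peptide):
    """Return (b_ions, y_ions) as lists of singly charged m/z values.

    b_ions[i] covers the first i+1 residues; y_ions[i] covers the last i+1.
    No modifications, no neutral losses, no higher charge states.
    """
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    b_ions, y_ions = [], []
    for i in range(1, len(masses)):
        b_ions.append(round(sum(masses[:i]) + PROTON, 4))
        y_ions.append(round(sum(masses[-i:]) + WATER + PROTON, 4))
    return b_ions, y_ions


b_ions, y_ions = fragment_ions("PEPTIDEK")
for n, (b, y) in enumerate(zip(b_ions, y_ions), start=1):
    print(f"b{n} = {b:10.4f}    y{n} = {y:10.4f}")
```

A real tool layers modifications, neutral losses, multiple charge states and other ion series on top of this, which is exactly the stuff you'd rather let mMass handle.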

Definitely definitely worth a free download!

Wednesday, November 13, 2013

Uniprot update available today

Uniprot update time!  Last update of 2013.  Update here!

Open source tools for top down proteomics

Want to do some top-down data processing on the cheap?  Are you willing to write a command line here and there and jump through a data conversion hoop or two?  Then there are a couple of tools that will work for you or the bioinformatics guy who is doing your processing.

The first is MS-Deconv from the CCMS.  Simple deconvolution of MS and MS/MS spectra.  It is available for download as a command line driven algorithm, or with a simple graphical user interface.  In order to run this program, you will first have to convert your data to mzXML.  Unfortunately, unlike in some programs, it doesn't seem like you can get away with uploading mzML.  That X is essential here.  (For Thermo Elite, QE, or Fusion, first convert your RAW file to mzML with the PD full version or viewer software, then use ProteoWizard to convert mzML to mzXML, instructions here.)  For most instruments, you can convert your RAW files directly to mzXML with this tool, but I haven't tested it on the newer Thermo instruments in quite a while and it didn't seem to like their RAW data when I last did.
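
If you have a pile of files to convert, you can script the mzML to mzXML hop.  The Python sketch below just builds the ProteoWizard `msconvert` command line using its `--mzXML` and `-o` (output directory) options.  It assumes `msconvert` is installed and on your PATH, and the file and folder names are hypothetical.  Check the ProteoWizard documentation for your version before trusting it.

```python
import shlex
from pathlib import Path


def build_msconvert_command(input_file, out_dir="."):
    """Build the msconvert argument list to write mzXML into out_dir.

    Only builds the command; it does not run anything.
    """
    return ["msconvert", str(Path(input_file)), "--mzXML", "-o", str(Path(out_dir))]


# Hypothetical file and folder names
cmd = build_msconvert_command("intact_run.mzML", "converted")
print(shlex.join(cmd))
```

To actually run it, pass the list to `subprocess.run(cmd, check=True)`.  The resulting mzXML file is what you then feed to MS-Deconv.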

The next tool is MS-Align, which can directly take the output from MS-Deconv and process it for LC driven intact analysis.

I'm doing some intact analysis with a QE today, we'll see how these two tools compare to other ones out there.  Yes, there are some hoops to jump through (and you'll notice a lack of control settings in the MS-Deconv algorithm that you may want), but these could be a nice complementary resource for your top-down studies.

ItunesU free courses in mass spectrometry

Aside from the crazy randomness of the internet, my second favorite thing about it is the easy access to information about everything.  Suddenly fascinated by the fact that gel nail polish is polymerized by placing the customer's nails in a UV light and want to know the chemistry?  Easy access to that information.

In a note more related to the supposed topic of this blog, if you are new to proteomics there are tons of tools out there.  It doesn't need to be that daunting to get into this field.  As an example, yesterday I learned about this resource that is available on iTunes.  A whole slew of intro to proteomics videos produced by our friends at the Broad (like toad) Institute!  They are less than a year old (no old info here!) and broken into concise topics for easy digestion.