Monday, March 31, 2014


I seriously have a list of the next 10 things to investigate/write about, but this one just hopped the queue.  If you haven't noticed, I'm doing a lot of crosslinking stuff right now.  The MS/MS and data analysis is tricky stuff.  Yesterday, my collaborator showed me Xcomb, a tool from my neighbors at the Goodlett lab (how is it that we don't know each other?  Let's fix that soon!).

Xcomb gets around a whole lot of the difficulty of crosslinking studies by building you a custom crosslinked database.  You give it the proteins you are looking for and it makes a FASTA file that contains a whole slew of new "proteins" that are your possible crosslinked species.  Then you throw in your crosslinker mass as a dynamic mod and you can use any software you want to search for crosslinked species. Brilliant.
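To make the idea concrete, here is a minimal sketch of what "build a FASTA of possible crosslinked species" could look like.  This is NOT Xcomb's actual algorithm or output format: the function names, the simplified trypsin rule (cut after K or R, no proline exception), the "internal lysine" filter, and the header style are all assumptions for illustration.

```python
from itertools import combinations_with_replacement


def tryptic_peptides(sequence, missed_cleavages=1):
    """Cut after K or R (simplified: no proline rule), allowing missed cleavages."""
    pieces, start = [], 0
    for i, residue in enumerate(sequence):
        if residue in "KR":
            pieces.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        pieces.append(sequence[start:])
    peptides = []
    for i in range(len(pieces)):
        last = min(i + missed_cleavages, len(pieces) - 1)
        for j in range(i, last + 1):
            peptides.append("".join(pieces[i:j + 1]))
    return peptides


def has_internal_lysine(peptide):
    # Assumption: a crosslinked lysine is not cleaved, so it cannot be
    # the C-terminal residue of the peptide.
    return "K" in peptide[:-1]


def crosslink_fasta(proteins):
    """proteins: dict of name -> sequence. Returns FASTA text of peptide pairs."""
    linkable = sorted({
        pep
        for seq in proteins.values()
        for pep in tryptic_peptides(seq)
        if has_internal_lysine(pep)
    })
    lines = []
    pairs = combinations_with_replacement(linkable, 2)
    for n, (a, b) in enumerate(pairs, start=1):
        lines.append(f">xl_{n} {a}+{b}")  # hypothetical header format
        lines.append(a + b)               # pair written as one pseudo-protein
    return "\n".join(lines)
```

Even this toy version shows why the database grows fast: every linkable peptide gets paired with every other one (and with itself).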

The original paper is here (yup, it is a few years old, but no less cool!)

Sunday, March 30, 2014

Byonic in Proteome Discoverer

Byonic time!

As I mentioned earlier, one of my favorite MS/MS programs ever, Byonic, has been seamlessly integrated into the Proteome Discoverer 1.4 environment.  If you haven't heard of Byonic, I wrote a couple posts on it over the last 2 years.  Essentially, Byonic gives us the opportunity to study PTMs well, thoroughly, and confidently.  And you can just tack it into PD now.

Simple to set up, just like a SequestHT or MSAmanda search (Advanced parameters are hidden because it has too many options for a screenshot on this laptop!)

Specific focuses are available for glycopeptides!  What else does glycopeptide analysis?  Sure, there are some packages out there, but my dog, this is EASY.  I have some glycopeptide files here now that I may process and update you on, but my sample queue is loaded for at least the next 3 days.  The sounds of the CPU fans in this room, lol!

Wildcard search!  Fully customizable wild card search!  Wooooohoooo!  Okay, so you are thinking "big deal, right?  We can wildcard search in Mascot and some other software."  Sure. We. Can.  And not to poo on anybody, but this works.  And it works really really well.

Proof?  You know those crosslinkers that were developed at the University of Victoria Protein Centre? (Described in this paper).  You should.  Cause they're awesome.  Isotopically labeled and CID cleavable.  The idea is that when two proteins are in association, you crosslink them with this compound.  The crosslinks come in two types, the heavy and the light, but otherwise completely identical.  They are exactly 8.00 (I forget) Da apart.  You can then use the Mass tags function on the Orbitraps to, basically, do a gas phase enrichment for the low abundance crosslinked counterparts.  Otherwise, you basically can't see them.
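The heavy/light trick boils down to looking for pairs of peaks separated by a known mass difference divided by the charge.  Here is a rough sketch of that pairing logic.  It is not how the instrument's mass tags function is implemented; the function name, the tolerance, and the charge states are my assumptions, and the spacing is left as a parameter because the exact value isn't pinned down above.

```python
def find_doublets(peaks, delta_da, charges=(2, 3, 4), tol=0.01):
    """peaks: list of m/z values from one MS1 scan.

    Returns (light_mz, heavy_mz, charge) for every pair of peaks whose
    spacing matches delta_da / charge within tol (in m/z units).
    """
    peaks = sorted(peaks)
    hits = []
    for light in peaks:
        for z in charges:
            target = light + delta_da / z
            for heavy in peaks:
                if abs(heavy - target) <= tol:
                    hits.append((light, heavy, z))
    return hits


# Illustrative numbers only; substitute the real spacing of your reagent.
pairs = find_doublets([500.0, 504.0, 600.0], delta_da=8.0, charges=(2,))
```

In this made-up scan, only the 500.0/504.0 pair fits a doubly charged doublet, so that is the only precursor that would get picked.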

So, I took a file that was treated this way and ran it.  Consider this.  It's an extremely low abundance mod.  Oh, and it's >500 Da in size. What are the chances that Byonic could find a mod like this?  So, I gave Byonic this file and didn't tell it about the mod.  At all.  I just said "do a wildcard search up to 600 Da".  It took a while.  Imagine that it had to look at every amino acid possibility and up to 600 Da heavier!  How many combinations is that?  It took a few hours.
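For a back-of-the-envelope feel for that search space, count one candidate per residue per mass bin.  This says nothing about how Byonic actually searches; the function name and the bin width are assumptions for the arithmetic only.

```python
def wildcard_candidates(peptide_length, max_shift_da=600, bin_width_da=1.0):
    """One unknown mod, on any residue, at any mass bin up to max_shift_da."""
    mass_bins = round(max_shift_da / bin_width_da)
    return peptide_length * mass_bins


# A 12-residue peptide with 1 Da bins: 12 sites x 600 masses
per_peptide = wildcard_candidates(12)
```

Multiply that by every candidate peptide in the database and it is not hard to see where the hours go.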

But, you know what it found?

The crosslinked peptides.  Exactly where they should be.  Exactly what they should be.

Unimpressed?  Try this, or something like it with another program.  Let it search its maximum allowable "wildcard" type mass window.  Then email me if you don't have any false hits, because by doing a search like this you have inherently dumped a whole ton of chaos into your search and you have made your FDR engine do impossible numbers of calculations.  I've never seen a search like this come back without silliness in it.  There are too many variables.  And Byonic just does it.  Correctly.  Oh, and it automatically (unless you turn it off) throws in a target decoy search as well as searches common contaminants (it appears to be the cRAP database).
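If the target decoy idea is fuzzy, here is the basic bookkeeping as a sketch.  This is the generic textbook estimate (decoys divided by targets above a score cutoff), not Byonic's or PD's actual FDR calculation, and the function name and input format are assumptions.

```python
def fdr_threshold(psms, max_fdr=0.01):
    """psms: list of (score, is_decoy) tuples, higher score = better.

    Walks down the ranked list and returns the lowest score at which the
    estimated FDR (decoys / targets so far) is still <= max_fdr.
    Returns None if no cutoff qualifies.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = decoys = 0
    cutoff = None
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets and decoys / targets <= max_fdr:
            cutoff = score
    return cutoff
```

The wider the wildcard window, the more random matches land on decoys and targets alike, which is why sloppy scoring shows up so quickly in this kind of search.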

This little company in San Carlos is quietly writing the best shotgun proteomics software I have ever seen.  And if you are using PD, you already know how to use it.

TL/DR:  Byonic is awesome.  I have no idea how they do it, but it is awesome.

Peptide Shaker!

Y'all are always coming up with so much cool stuff!  Peptide Shaker isn't exactly new, but it's mostly new to me.  Heard of it, but I never went out of my way to check it out.  It is real cool.  Not only do you get a nice (and interesting themed) GUI, but there is a big reason that you might want to check it out now if you haven't before.

Peptide Shaker supports MS-GF+ now.  What's that?  It is a program written a few years ago that took a good hard swing at something people ask about a lot.  Advanced peptide statistics.  Namely, using p values to accurately gauge the efficiency of peptide identifications.  In the introduction paper for MS-GF+, the authors showed that on virtually every MS and fragmentation mode, their algorithm produced more and better IDs than Mascot running Percolator.  Now, that's interesting all on its own, but I've never seen a link to download this magic code before.  Now, it's supported by this nice GUI so now these claims can be scrutinized by me (when I finish the next previous 10 things on my list) or by you!

You can get Peptide Shaker here.  And read more about MS-GF+ here.

Thanks @compomics for the link to this!

PS, ignore the "Low memory warning" above.  My poor personal desktop is about to burst into flames.  Replacement is imminent....

Saturday, March 29, 2014

Preview in Proteome Discoverer

I have SO much to write about right now.  It's killing me that I have this real job that I have to do.  This one is going to be kind of short cause today is pretty packed.

Preview is a program from Protein Metrics that quickly evaluates your RAW data and gives you an idea of how you should be setting up your searches.  I had heard about it, but never used it.  Now it is a node that you can buy from Protein Metrics for PD.

This is how it works:  I have a RAW data file that I know comes from a human tumor cell lysate.  I know that this tumor is all screwed up (it's cancer cells, it's gonna be screwed up) but I don't know how screwed up.  So I take the RAW file and I run it through Preview.

I tell Preview what I know about the file:
Or whatever I feel like.  Sure, I could check it to see if it's CID or HCD, but let's let Preview figure it out.  Do you see what is under Search Options?  Below the insanely awesome toggle where you tell it whether you have Phospho Enriched?  Wild card search!  This is the same wild card search that is in Byonic, but you have a little less control over the mass ranges.  They are preset.

In 4 minutes (140 minute Elite run with ~50,000 MS/MS scans on my quad core laptop with SSD buffered hard drive....this software is very very fast.  How fast?  Faster, even, than Morpheus on this PC) you get something like this:

This sweet overview of the file opens in your default browser.  It gives you your top scoring proteins and your potential m/z errors.  You can use that data to fine tune how you set up your real search.

That isn't all.  You get several other HTML pages generated with more details AND you get an Excel sheet with potential wild card results.

In the case of this file, the Excel sheet pointed out that I should definitely search for methionine oxidation AND it pointed out a slew of peptides that have a +128 Da mass shift.  Interestingly, there really isn't anything of that mass in Unimod.  Is it the next -ome that we have never looked at that is dysregulated in tumors?  We won't know until we take a look.  And if we figure out what it is (or not, and just search for it anyway) I can use the new ptmRS to get confident site localization of the mod!

To find out how you can get Preview, follow this link to the Protein Metrics home page.  There is much much more to come.  After I get some real work done I'll show you the awesome stuff that I'm getting out of the Byonic node!

Friday, March 28, 2014

Quantitative analysis of 4000 proteins on a QE!

I have so much to write about right now!  March is a great month for proteomics!

First off, thanks @pastelbio for letting me know about this great new paper in Nature Scientific Reports.

This paper (open access) from Liangliang Sun et al., uses iTRAQ labeling and a Q Exactive to follow the development of the African Clawed Frog throughout its life.  They end up tracking 4000 proteins over 8 different points in the life cycle!  They use an interesting pooling procedure and SCX pre-fractionation to get numbers this amazing with the iTRAQ 8-plex.

Great paper for anyone wondering how to track the development of an organism.  Also a great set up for people interested in using reporter ion quantification on their Q Exactive.

Thursday, March 27, 2014

Getting beyond parsimony

Wanna feel dumb sometime?  Pick up a paper with Oliver Serang listed as an author.  But it's important, very very important that we start thinking about the things he has been thinking about, because what Oliver has been working on is our problems with parsimony.

(I've stolen the following screenshots from a lecture posted by the Tabb lab at Vanderbilt.)

Parsimony is the thing (how do you grammar?) that happens when we sequence a bunch of peptides that can be explained by multiple proteins.
We don't have any evidence to say whether it was Protein A or Protein B that contributed these peptides, or even whether both proteins are present and they both contributed these same LC-MS compatible peptides together.  What we do then is group them together.  If Protein A is shorter than Protein B, then we'd call this a protein group and give it the accession number of the shorter protein in the grouped table.

Another stolen image from Tabb lab (I love you guys, and I owe you a beer!) This one shows how much worse this problem can be:

In this case (not so rare as we might hope), two proteins can explain all of these peptides, but that doesn't truly mean that either is there.  Other proteins could explain this, but it is much simpler to say it is these two and not some other 4.  And I think that really illustrates the problem here.
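To see how the "simplest explanation" gets picked, here is a toy greedy version of parsimony: keep choosing the protein that explains the most still-unexplained peptides.  This is a generic set-cover sketch, not the grouping logic in Proteome Discoverer or any other package, and the function name and input format are assumptions.

```python
def parsimonious_proteins(protein_to_peptides):
    """protein_to_peptides: dict of accession -> set of peptide sequences.

    Greedy set cover: repeatedly pick the protein explaining the most
    uncovered peptides. Ties go to the alphabetically first accession.
    """
    uncovered = set().union(*protein_to_peptides.values())
    chosen = []
    while uncovered:
        best = max(
            sorted(protein_to_peptides),
            key=lambda acc: len(protein_to_peptides[acc] & uncovered),
        )
        chosen.append(best)
        uncovered -= protein_to_peptides[best]
    return chosen


# Two proteins with identical evidence: only one gets reported.
report = parsimonious_proteins({"A": {"p1", "p2"}, "B": {"p1", "p2"}})
```

Notice that protein B vanishes from the report even though the evidence for it is exactly as good as for A.  That silent tie-break is the whole problem.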

What if there was a way of telling which protein contributed this peptide?  Imagine the possibilities here.  Wait, don't imagine it.  If you are using Proteome Discoverer, go into one of your protein reports, turn off protein grouping, and look how many more proteins are on your front page!  Anything that could get us closer to that point would most certainly be a win, right?!?!

This is why we need to be thinking about this.  The potential for free data.  The potential to separate keratin 77 (a potential cancer biomarker) from keratin 90 (crap floating in the air all over the place).

How do we do this?  Crazy advanced statistics.  Probabilistic networking (I think that is the term)....the stuff that Oliver does!

I'll direct you to two different papers.
One I've been puzzling over for a while from the Steen and Steen lab where non-parametric thing-a-ma-jings are evaluated (open access).
A new one in PlosOne that demonstrates how such networking need not eat all of the processing power on the planet for every RAW file (that is brand new and helped remind me how much I was procrastinating on this subject....)

Is this useful to us biologists out there who are scared of Greek letters in general, or is this another one of my useless tangents?  Well, on the other side of the screen where I'm typing this I have the Proteome Discoverer 2.0 Alpha version open.  Now, it is an Alpha, so I can't guarantee everything in this thing is going to be in the full PD 2.0 release...AND...I can't guarantee that I'm allowed to talk about it.  But considering the number of empty Stellas in front of me right now, this isn't my biggest concern at the moment.

BUT...some of these equations appear in PD 2.0 and I'm about to test them as soon as I find my plane.

Wednesday, March 26, 2014

Quality control in Skyline!

Man, I'm gonna get sued for real one day.  But this is the best image I can think of to describe this paper, because we're combining two of my top 10 mass spec things (don't ask for the complete list, it's probably really weird) -- Quality control and Skyline.

It's described in this paper:

(Thanks, Dave, for showing me the SnipIt tool, I'm using it constantly).

This tool, called SProCoP, can monitor 5 important QC metrics directly in Skyline.  Now, if you're thinking, "wow, that's nice for targeted analysis, but...." It works for shotgun analysis as well.  Seriously!

I'm all about QC in proteomics (type quality control into the search bar above for any of my rants).  If you are running any good benchmark sample on a regular basis, you rock.  If you are doing this and using any of the great new statistical monitoring software to keep track of your overall system performance over time, you are a complete and total rock star in this field.  That's the highest compliment that I can give you, and we should probably be friends.

TL/DR:  Take the best open source software in mass spectrometry (Skyline!) and seamlessly integrate a new R-based algorithm that can monitor quality control on virtually any RAW data file.

Tuesday, March 25, 2014

Check out the new Software Portal

I don't know when this changed, but it had to have been recently.  The BRIMS Thermo Omics Software Portal has been totally rebooted.  It looks great and it is even better organized!

Free genomics classes from Harvard

Another great contribution from a super cool person!

Want to know about genomics, including the "next gen" stuff we keep hearing about?  You should consider taking a free eLearning class from Harvard.  How badass is that?  Seriously?

If you are a chemist moving into proteomics, this could be huge for you.  With rare exceptions proteomics cannot exist without genomics data to compare it to, and this is going to cover how that information is, and will be, obtained in the future.

Direct link here.

Monday, March 24, 2014

ptmRS!!! Confident site localization of all PTMs in Proteome Discoverer!


I have yet another awesome new node that you don't have!  If you are interested in any kind of PTM, you are going to want this.

You know PhosphoRS.  Site localization probabilities for your phosphorylation events.  Last week, Karl Mechtler (from the lab that wrote PhosphoRS) contacted me to see if I wanted to check out his team's new node, ptmRS.  Yes.  Yes, I do.

Guess what it does!  Site localization for any post translational modification that you give it.  I'm serious.  I dropped it in this morning (easy install, didn't even look at the instructions), threw in a HeLa digest run from an Elite, gave it Phosphorylation and ubiquitinations to look for.  6 minutes and 18 seconds later, I have this report to look at.

I have localization of my few phospho events as well as a few di-Glycine (left over from ubiquitin cleavage?) and confident localization.  I went after the lysine di-Gly first cause they're kind of low-hanging fruit (we should only have 1, maybe 2 lysines per peptide, right?).  How does it do with some great enriched data?  I just found some on one of my spare hard drives.  Stay tuned.

Saturday, March 22, 2014

SILAC label a whole fish, cause why not?

Do you love working with zebrafish, but are frustrated by how little proteomics has been done on your favorite model organism?  Marcus Kruger's lab at the Max Planck Institute finally said "enough is enough" and labeled the whole damned fish.  Using similar strategies to the SILAC mouse studies this team has used in the past, they were able to quantitatively examine protein expression levels in individual organs extracted from these lucky labeled little guys.  The team reported full quantitative data on multiple organs in 5 days of instrument time using straightforward methods on the Q Exactive.

Sometimes you simply need extreme measures to push your field forward.  Strongly recommended read for anyone using zebrafish or other model organisms that have not been characterized well via quantitative proteomics.

You can read the abstract for this paper here (sorry, not open access).

Thursday, March 20, 2014

Mash Suite

The other day I joined that ASMS thing y'all are always going on about.  And I'm already reaping the benefits.  I got a journal in the mail with a ton of articles and I didn't have to worry about whether they were open access or not, cause I'm a subscriber.  Quite honestly, I'm feeling a little silly for not joining this club before now.

Anyway, one of the highlights of this magazine(?) what do you call it(?) is the Mash Suite.  It's a bunch of tools from Ying Ge's lab at the University of Wisconsin.

What it claims to be: "a user-friendly and versatile software interface for processing high-resolution mass spectrometry data"

What it really is:  a software package that
1) Deconvolutes
2) Does bottom up
3) Does middle down
4) Does top down!
5) And is free, you just have to go to Ying Ge's software website and prove that you are human.

I'm waiting for my license and opinions will follow...eventually!  Get this cool new tool here!

Wednesday, March 19, 2014

Phosphoproteomics of hallucinations?

I'm fascinated by these brain proteomics papers.  How much functioning in the brain can we detect by shifts in post translational modifications?  It sure seems like a lot!

In this paper (in press at MCP and temporarily open access) from Karaki and Becamel et al., these researchers describe an approach to studying the global phosphorylation events that occur during exposure to active hallucinogens.  They do this by using an extremely similar compound that can occupy the same binding sites, but does not actually induce hallucinations.

Using an Orbi Velos, they identify nearly 6,000 phosphorylation events and several that only occur when the active hallucinogen is given.  They follow this up with an incredibly thorough study involving knock downs, immunocytochemistry and giving mice LSD.  The number of experiments in this paper is pretty staggering.  Above and beyond what we normally see in MCP.  Not to put down MCP or nuthin, but if you are going to follow up your phosphoproteomics with this much validation and work you deserve to put this into Cell.  It makes the rest of us look like slackers, quite honestly.

1) Great phosphoproteomics work
2) Amazing level of validation
3) Cool brain stuff
4) Mice on LSD (actually, it was DOI, similar though)

Why are you still reading this and not this great paper?  Oh yeah.  Direct link is here.

Tuesday, March 18, 2014

Byonic node for Proteome Discoverer

Look what my nice friends at ProteinMetrics gave me to test out!  My favorite engine for searching MS/MS data for post translational modifications directly ported into my favorite proteomics processing software!

I'll be back later with tests and first impressions.  It's been a fantastic but long day.

Monday, March 17, 2014

How people respond to data.

I was sent this link this morning and had to share it.

How people respond to data.

Don't worry, I really am going to do the DIA post soon!  If I didn't post silly stuff once in a while, a lot of you would probably stop following me, and you know it.

Credit goes to the stunning Alexis Norris for this one.

Sunday, March 16, 2014

Chemistry lab suite app

(Image stolen from Bah Humpug!)

With the rise of the mobile device, everyone in the world is exploring how to make these powerful little computers we're carrying all the time even more useful than they are.  Two Apps that I use a lot are SparkPlug and the MSBioworks App.  I primarily use Sparkplug to find cool new Orbi papers and application notes and I use MSBioworks primarily to calculate my LC dead volumes (though the app does a ton of other things, like protein digestion, peptide fragment predictions, etc., the LC volume calculation is something I use all the time).

One I just downloaded is the Chemistry Lab Suite.

While it's definitely geared toward you chemists out there, I sure do like the Protein button.  Primarily cause of this, of course (which, yes, you could do in the MSBioworks app, but what if orange on black was your color scheme of choice?):

Suite, right?  (That was an intentional pun.  I haven't got very much sleep this weekend....)

Anyway, if you find yourself needing this information sometimes (and don't always have internet access for ProteinProspector or similar), check this out.  I downloaded it for Android and for iOS and they appear identical.

Saturday, March 15, 2014

The rise of OpenProteomics?

Are we starting to get organized as a field?  Check out this awesome press release from EMBL!

Thanks @attilacsordis for the heads up on this one.  We're making steps in the right direction all the time.

Thursday, March 13, 2014

Targeted quan on the QE part 2!

Yay!  Part 2!

Check out Part 1 here.  The response has been...enthusiastic.  I didn't know so many triple quad users read my blog!

Next on my list is plain old MS1-ddMS2 with an Inclusion List.

Couple of different ways we could process this.  We could quan at the MS1 level, or we could quan on fragment ions at the MS2 level.

Advantages:  Specificity.  We have high resolution accurate mass MS1 to look at and fragment ions to confirm the identity of what we're looking at.  Tons of targets!  We can do an absolute ton of targets here.  How many?  Try 5k.  No joke.  Will we get all 5,000?  Probably not, but that may be more due to the fact that you didn't do a great job making that list of 5,000 ions.  Come on.  You'll do a great job on a list of 10 or 50 or even 100 ions, but you're bound to make a mistake while editing that list of 5,000 ions.  No one can blame you, maybe it's 1am and you're sleepy.

Disadvantages:  Sensitivity.  This is the least sensitive way to do our targeted quan.  Now, sensitivity is a relative term.  We're still smoking every Q-TOF on this planet, but not by the margins that we're used to.

Man, I'm beat.  Tomorrow we'll start the sensitive AND specific assays.

Read part 3 here.

Tuesday, March 11, 2014

TMT labeling video

As a hint of things to come, check out the TMT 6 plex consumable site.  It now includes an instructional video for how to perform TMT labeling.

You can view it here.

Monday, March 10, 2014

Do you love FASP?

I love FASP (filter aided sample preparation).  How can you beat easy, clean digestions?  What about with Enhanced FASP?

This paper from Erde et al. takes aim at improving FASP and comes up with a 300% increase in peptides!  Check it out (warning, not open access).

Sunday, March 9, 2014

Comparative proteomic analysis reveals characteristic molecular changes accompanying the transformation of nonmalignant to cancer lung cells

Interesting paper from Li Zhang's lab.  Straight up comparison between two cell lines from the same patient -- a normal cell line and a lung cancer cell line.

iTRAQ labeling and Q-TOF for analysis isn't the most sensitive way of doing things but the bioinformatics afterward makes up for it.  They get quan on only high abundance proteins, but the quantitative differences let them point upstream at some really well-known (and less well-known) cancer checkpoint proteins.

A great paper that shows how far you can go with some extra elbow grease on older equipment.  Direct link to the paper here.

Thursday, March 6, 2014

Intro to mass spectrometry videos from freelance

I've used this comic from "Toothpaste for Dinner" before, but I still like it.

I stumbled across this guy today.  He has a site called "FreeLance Teacher" and he films himself in front of a chalk board explaining chemistry, physics, and mass spectrometry.  You can "pay what you like" for his videos, but they are pretty decent.  I don't know enough about MS physics to say whether what he's saying is true, but he seems pretty confident about it.  Interested in what's going on behind that sweet plastic casing?  Check these out!

Find the rest at

Monday, March 3, 2014

MS-Viewer -- Powerful new tool for Protein Prospector

An awesome new tool is up on the Protein Prospector and is detailed in this paper, currently in press (and open access) at MCP.

The MS-Viewer takes aim at a lofty goal -- a program that can open post-processed shotgun proteomics data from different sources and open it in a unified format.  For example, if someone posted their Mascot runs on Tranche and they don't match your collaborator's replication that was processed with Proteome Discoverer, you can use MS-Viewer to open them both in the same format.  This might help you to tell whether it is a database issue, a difference in false discovery rate calculation, etc., which led to these differences.

Maybe more commonly, it gives you a chance to stop searching around for a characteristic of the data that you are interested in when meta-analyzing data.

Cool new program that makes Prospector even more useful than it already is!  And there is even a nice video tutorial on how to use it.  Check it out here (video link in red).

Saturday, March 1, 2014

Blogger hates histograms -- or a study on the distribution of ions selected for fragmentation on an Orbitrap

What's this weird thing, you ask?  Well, this is a histogram.  And blogger apparently hates histograms, cause I've tried everything and this is the only way it will display.

Misconception?  If we digest a whole cell lysate with trypsin, we will find that there is an apex somewhere which will represent the average tryptic peptide mass as well as the standard distribution around that point.

So I took a HeLa digest run.  We know it's well over 90% digested, so it should be a good comparison.  I then filtered the m/z of every ion that was selected for fragmentation by a filter of 0.01 Da.  This provided me with 10,979 unique-ish ions selected for fragmentation.
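The "unique-ish" filtering can be sketched in a few lines.  This is a guess at a reasonable way to do it, not the exact steps I used in the spreadsheet: sort the precursor m/z values and drop anything within 0.01 of the last one kept.  The function name is hypothetical.

```python
def unique_mz(mz_values, tol=0.01):
    """Keep one representative per cluster of m/z values closer than tol."""
    kept = []
    for mz in sorted(mz_values):
        # Compare against the last value kept, not the last value seen
        if not kept or mz - kept[-1] > tol:
            kept.append(mz)
    return kept


filtered = unique_mz([500.000, 500.005, 500.02, 612.3])
```

Here 500.005 is dropped as a repeat of 500.000, while 500.02 and 612.3 survive.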

In order to get a visual output representing the ion distribution, I binned the ions in 1 Da units from 299 to 1000.  And I got this neat histogram that appears to show little m/z bias (so no obvious apex and distribution!)

 But it sure is hard to see, right?  So I just re-binned it in units of 50.  Which shows some bias, but definitely not as strong as I was suspecting.
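If you would rather skip the spreadsheet, the binning itself is easy to sketch.  The function name is hypothetical, and treating the range as 299 up to (but not including) 1000 is my assumption about the edges.

```python
from collections import Counter


def bin_mz(mz_values, bin_width, low=299, high=1000):
    """Count m/z values per bin; keys are the lower edge of each bin."""
    counts = Counter()
    for mz in mz_values:
        if low <= mz < high:
            edge = low + int((mz - low) // bin_width) * bin_width
            counts[edge] += 1
    return dict(sorted(counts.items()))


# Same data, two bin widths, like the two histograms above
fine = bin_mz([299.5, 300.2, 348.9, 349.0], bin_width=1)
coarse = bin_mz([299.5, 300.2, 348.9, 349.0], bin_width=50)
```

Switching from 1 Da to 50 Da bins is just a change of one argument, which makes it easy to try a few widths before deciding whether a bias is real.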

I'm surprised and not real sure what to make of this.  In particular due to groups that are using predictions of peptide distribution for pre- and post- processing analysis.  Gonna have to do some reading, but for now, I'm off to:

I can't say the blog is on hiatus, but maybe!

BTW, these histograms were created by the Add-in in Office 365, which is pretty great.