Thursday, June 5, 2014

Human proteome maps part 2: Digging deeper into ProteomicsDB

Wow.  The readers have spoken.  In 6 short days, my comparison of the 2 human proteome maps is already my 3rd most popular post of all time.  You can read part 1 here.

Yeah, 'cause I totally needed more reasons to explore these huge releases in my field!  What better thing to do on a long train ride this early in the morning?  For now, I think I've shown you all that the Human Proteome Map can do, but we've really only scratched the surface of what resources are available from ProteomicsDB.

To start off, I'm going to Browse Proteins.  And I'm not letting the Kuster lab off easy this time.  Let's see how they do with Integrin Alpha 5 (one of my favorite proteins and a nasty one to work with via mass spec. Membrane....ewww...)

First off.  It found it!  As well as isoforms.  Man, I love this thing....  I'm going to stick to full length isoform 1.  This is the summary page.

The chromosome it's on!  All the IDs!  Graphical maps of the peptide coverage, GO and links.  I'm a little confused by the illustration...probably a place the Kuster lab was slacking.  Guess I'll click on it just in case...

...oh...it's DOMAIN MATCH maps!?!?  With confidence statistics.  This is via a page called SMART:  the Simple Modular Architecture Research Tool.  Something to investigate later. Next tab:

Okay!  Finally something that us normal human beings can do.  Assemble a sequence coverage map.  Did I mention that this protein sucks to work with?  From this map, you might not believe me though...
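If you want to see what a coverage number like this boils down to, here is a minimal sketch in Python.  It is not how ProteomicsDB does it (I have no idea what their code looks like); the function name `sequence_coverage` and the toy sequence are made up for illustration.  The idea is just to mark every residue that falls inside at least one identified peptide and report the marked fraction.

```python
def sequence_coverage(protein, peptides):
    """Percent of residues in `protein` covered by at least one peptide.

    Hypothetical helper for illustration; peptides are matched by exact
    substring search, and every occurrence of a peptide is counted.
    """
    covered = [False] * len(protein)
    for pep in peptides:
        if not pep:
            continue  # ignore empty strings
        start = protein.find(pep)
        while start != -1:
            for i in range(start, start + len(pep)):
                covered[i] = True
            # look for further (possibly overlapping) occurrences
            start = protein.find(pep, start + 1)
    return 100.0 * sum(covered) / len(protein)


# Toy sequence and peptides, not real ITGA5 data
toy_protein = "MKAAAKPGGRCCCCCCCK"
toy_peptides = ["AAAKPGGR", "GGRCC"]
coverage = sequence_coverage(toy_protein, toy_peptides)
print(f"{coverage:.1f}% coverage")
```

Overlapping peptides only count once per residue, which is why a boolean mask is used rather than summing peptide lengths.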

Next tab!  What's a protease map?

Definitely click on this one to expand it.  It theoretically digests your protein.  Big deal, Protein Prospector has been able to do that for 20 years.  The metrics are pretty cool, though.  You pick the enzyme you want from the ones available (a user-customizable enzyme might be nice; minor suggestion) and it does the digest.  Hover over the map and the sequence appears in a little bubble.  The mouse obscures it a little, but it's there.  The cool thing is the % theoretical in the lower right-hand corner.  I was pretty psyched the other day about manually calculating % theoretical coverage, and this thing does it automatically (so does X!Tandem, I'm told).  Nice to have all these features in one place!
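For the curious, an in silico digest plus a "% theoretical" number can be sketched in a few lines of Python.  Big caveats: this is my own toy version, not ProteomicsDB's method.  It uses the common simplified trypsin rule (cleave after K or R unless the next residue is P), allows no missed cleavages, and assumes that only peptides of 7 to 30 residues count as observable.  That length window, the function names, and the toy sequence are all assumptions for illustration.

```python
def tryptic_digest(sequence):
    """Return (start, peptide) tuples for a simplified trypsin digest.

    Rule assumed here: cleave C-terminal to K or R, except before P.
    No missed cleavages are generated.
    """
    peptides = []
    start = 0
    for i, aa in enumerate(sequence):
        next_aa = sequence[i + 1] if i + 1 < len(sequence) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append((start, sequence[start:i + 1]))
            start = i + 1
    if start < len(sequence):
        # trailing peptide that does not end in K or R
        peptides.append((start, sequence[start:]))
    return peptides


def theoretical_coverage(sequence, min_len=7, max_len=30):
    """Percent of residues sitting in peptides within the length window.

    The 7-30 residue window is an assumed stand-in for "observable by
    mass spec"; real tools may use mass limits or other criteria.
    """
    covered = sum(
        len(pep)
        for _, pep in tryptic_digest(sequence)
        if min_len <= len(pep) <= max_len
    )
    return 100.0 * covered / len(sequence)


# Toy sequence, not real ITGA5
toy = "MKAAAKPGGRCCCCCCCK"
print([pep for _, pep in tryptic_digest(toy)])
print(f"{theoretical_coverage(toy):.1f}% theoretical coverage")
```

Swapping in another enzyme would just mean changing the cleavage test inside the loop, which is roughly what a user-customizable enzyme option would have to expose.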

Next tab.  Proteotypicity?

I choose iTRAQ and hit the calculate button:

...and I get spectral libraries for iTRAQ labeled peptides from the protein.  Terminology is weird/confusing, but the data is incredibly thorough (see the cool black line at the front where the reporter ions are?)

In part 1, I showed you the expression maps. Let's skip to the Projects tab:

Nice cropping job...did I mention I'm on a train?  Anyway, this is another cool tab.  For example, Mendoza_JPR_2013 found 17 unique peptides from ITGA5.  If I want 15 peptides, I can click the PubMed link, go to the Mendoza paper, and use the method they used (it's a membrane paper, btw).  I can follow the Project link and actually download all the RAW data files for that project.

I'm going to cut this short now.  The WiFi speed on this train has dropped dramatically and uploading these images has become a pain.  I hope that this gives you some sense of the insane amount of work they have put into this resource for us.


  1. Hi Ben,
    The reason for the high seq coverage is that integrins are single pass membrane proteins with rather large extracellular and small intracellular parts.

  2. Thanks for the extra info! I looked it up and definitely agree that you're right, but I've always found them tricky to pick up. I've definitely never had coverage to this extent. I've wondered if it is a limitation of the commercial FASP kits. Or low expression in the cell lines that I've worked with most...