Sunday, December 1, 2013

How far is the human proteome project at this point?

The human proteome project has been rocking for a while now.  How far has all this work gotten?

Well, here is an update (not open access), courtesy of JPR and Terry Farrah et al., and the number is around 62%.

62% what?

Oh, 62% of the DNA sequences that we think code for proteins have strong supporting evidence in MS/MS spectra that their proteins actually exist. That's pretty cool, right?  Over halfway!

Let's take a moment and think about how great this is, and how far we've come!

While doing so, let's forget the fact that one post-translational modification (PTM) can have dramatic ramifications for the function of a protein.  Let's also forget the fact that in 2011, we knew of about 80,000 specific PTMs. Also, let's forget about conformational changes that can have effects every bit as impressive as PTMs.

Please don't get me wrong, I'm not trying to put down the work of the participants of the human proteome project or the good people at ISB who are running PeptideAtlas.  I'm simply concerned about our tendency to underestimate the complexity of biological systems.  We did that with the human genome project.  First of all, getting MS/MS spectra for all of the proteins predicted from the HGP data is just the tip of the iceberg.  Secondly, let's not declare big ongoing projects completed for a while.  Grant dollars are pretty scarce out there, and we don't need ignorant politicians reading headlines and cutting all the money to our friends because they think the job is done.

Ran into this one thanks to Twitterer @PastelBio


  1. This resource from B. Kuster in Germany has (an impressive) 93% coverage of the Human Proteome. Their strategy, to re-analyze high-quality published MS data:

  2. Agreed, post-translational modifications, along with alternative RNA splicing, mean that any of the approximately 20,000 predicted protein-coding genes can produce tens, hundreds, or thousands of distinct protein molecules. So finding MS/MS evidence for 62% of the predicted human protein-coding genes is indeed just the tip of the very large iceberg that is the Human Proteome Project. The project also aims to determine which protein molecules are made in which tissues, at which stages of development, and under which conditions. I haven't met any protein scientist who is underestimating the complexity of this undertaking. However, you make a very good point: we don't want funding agencies to conclude from our 62% that the entire project is 62% complete. It is far from complete when one considers PTMs, and some facets of the project have barely begun. At the same time, it is a good thing to identify a portion of the project that is quite tractable and with which we can make measurable and satisfying progress.