Saturday, October 12, 2019

Human Proteome Project Guidelines version 3.0!

For those of us on the edge of our seats for this -- the new HPP Data Interpretation guidelines are finally available. The big highlight is probably how to incorporate Data Independent Acquisition (DIA) data into our biggest effort to map the human proteome.

How many proteins are we up to that have significant evidence (called PE1 proteins)?

17,694 -- meaning there are some 2,000 proteins that humans may or may not produce (crummy evidence PE2, PE3 or PE4, based on how crummy the evidence is -- these are the "missing proteins" that sometimes pop up in article titles).

On top of this -- I think this is probably the best part of the new guidelines -- and I'm going to steal the text and draw a red line through the link that doesn't appear to work --

This doesn't sound very efficient from an evolutionary standpoint, but we have some regions of our DNA that will lead to the production of completely identical proteins. As they mention here, either it makes sense from a regulation perspective (whatever promoter thingies they are under control of), or there has never been sufficient selective pressure against having identical copies of the same protein produced by different genetic regions. Either way, it could be problematic to assign them to a single protein group based on linear sequence homology alone, so they should be categorized as individual proteins despite the homology.
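To make that concrete, here is a rough Python sketch of how you might flag entries that share an identical sequence while still counting each one as its own protein. Everything in it is made up for illustration: the entry IDs, gene names, sequences, and the `find_identical_sequence_sets` helper are hypothetical and are not taken from the guidelines or from any real database.

```python
from collections import defaultdict

# Toy records: (entry_id, gene, sequence).
# All values are hypothetical placeholders, not real database entries.
entries = [
    ("ENTRY_A1", "GENE_A1", "MKTLLVAGG"),
    ("ENTRY_A2", "GENE_A2", "MKTLLVAGG"),  # identical sequence, different gene
    ("ENTRY_B", "GENE_B", "MSPQRWYTE"),
]


def find_identical_sequence_sets(records):
    """Return sorted lists of entry IDs that share exactly the same sequence."""
    by_sequence = defaultdict(list)
    for entry_id, _gene, sequence in records:
        by_sequence[sequence].append(entry_id)
    # Keep only sequences that more than one entry maps to.
    return [sorted(ids) for ids in by_sequence.values() if len(ids) > 1]


# Entries with identical sequences are flagged for the reader...
shared = find_identical_sequence_sets(entries)

# ...but every entry is still counted as an individual protein.
protein_count = len({entry_id for entry_id, _gene, _seq in entries})
```

The point of the sketch is only that the flagging step and the counting step are kept separate: the two identical entries get reported together, but neither one is collapsed into the other.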

Does this change most of our day-to-day work, even if we're doing human proteomics all day? Probably not, but it is interesting to think about, and if you run into it on neXtpRot or the hUmAN pROtein aTLas, now you know why.
