Is the human proteome (by that, I mean the number of proteins we can express) shrinking the way we've shrunk the majestic mastiff into the super majestic pug?
Let's look at the evidence:
How did these studies determine that these other regions are not coding? By looking at in-depth proteomics studies, of course! One way of doing this would be to say "hey, we've run 7 billion proteomics samples on tissue to this point in time and NO ONE has ever seen a peptide from this protein." That is one way, right, but it doesn't rule out the possibility that this thing is in plasma and has a copy number of 10 proteins per mg of plasma, right?
This group took the evolutionary approach. Genes that are highly conserved among many species produce proteins that are essential to life. The more essential, the more species carry them, and the more conserved they tend to be. What if we then just go to the gene sequence and compare that sequence against monkeys and dogs and a bunch of other things? If no one has seen a peptide from this protein & this gene is not expressed by our closest relatives (or it is highly modified) & it has a structure that looks very unlikely to be a protein THEN we can probably safely say that it isn't a sequence of DNA for making protein (it's probably for mysterious epigenetic weirdness, 'cause it's unlikely it's just taking up space, right?)
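If it helps to see that three-part rule written down, here is a rough Python sketch of the logic as I described it above. To be clear, this is my own toy version and not the authors' actual pipeline: the `GeneEvidence` class, its field names, and the example gene names are all made up for illustration, and the real study works from actual evidence rather than three yes/no flags.

```python
from dataclasses import dataclass


@dataclass
class GeneEvidence:
    """Hypothetical summary of the evidence for one candidate gene."""
    name: str
    peptide_observed: bool        # has anyone ever seen a peptide from it?
    conserved_in_relatives: bool  # intact (not heavily modified) in our close relatives?
    protein_like_features: bool   # does the sequence look like it could make a protein?


def likely_noncoding(gene: GeneEvidence) -> bool:
    """Flag a gene as probably non-coding only when ALL three lines of evidence fail."""
    return not (
        gene.peptide_observed
        or gene.conserved_in_relatives
        or gene.protein_like_features
    )


# Two made-up candidates: one with no supporting evidence at all,
# one that is conserved even though no peptide has been observed.
candidates = [
    GeneEvidence("hypothetical_gene_A", False, False, False),
    GeneEvidence("hypothetical_gene_B", False, True, True),
]

flagged = [g.name for g in candidates if likely_noncoding(g)]
print(flagged)
```

Notice that a gene with no observed peptides still survives if it is conserved and looks protein-like. That is the whole point of combining the evidence: never having seen a peptide isn't enough on its own to throw a gene out.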
The conclusion is that human beings can probably express about 19,000 proteins, which probably means there are only 4 billion proteoforms....
You can read the original article (open access) here.
You can read the article in The Scientist here.