Monday, September 2, 2019

How many proteins should there be, anyway?!?


I've spent a lot of time this year wondering things like "okay -- so -- how many fracking proteins are there supposed to be here, anyway?" And the answers are surprisingly murky.

What I do know -- proteomics loooooooves to use cancer cell lines. You know why? Because...


They aren't normal human cell environments. For one, most of them can't stop dividing regardless of what damage they pick up. "Oh....this neuroblastoma cell line is now expressing tooth enamel production proteins? Not normal, but it probably won't stop that cell from continuing to grow."

If you're doing work on healthy human brain tissue, you probably shouldn't see those tooth enamel production proteins, right?

We all have a decent feel for what we should get out of HeLa digests on our instruments (or HEK or K562 or whatever) and unless you're doing cancer stuff all day those numbers are probably crazy high compared to what you're normally doing. Here is the question, though: how many should be there?

The picture at the top is taken from this Human Protein Atlas page. Of 19,000 or so human proteins, around 11,000 are found in the human liver. Okay -- I chose the human liver as an example at random, but this actually comes from this brand new paper.


There aren't just liver cells -- the liver is an organ made of all sorts of different types of cells.

I'd assume that there is no way that a Kerpuffle cell would express every protein that a Marovaculus encoshelail cell would (if they did, they'd be the same cell, right?) so if we subsection the liver cells by flow cytometry or by laser capture microdissection then we'd expect that number of proteins to drop off markedly, right? We're talking less than 11,000 now. A lot less?
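To make that reasoning concrete, here's a minimal sketch, assuming each cell type's proteome can be treated as a plain set of protein IDs. Everything in it is made up for illustration: the cell type names, the placeholder IDs (P1, P2, ...) and the helper `bulk_proteome` are hypothetical, not real measurements. The bulk tissue is just the union of the sets, so no single cell type can ever have more proteins than the bulk digest.

```python
# Toy data: placeholder protein IDs and cell type names, NOT real measurements.
cell_type_proteomes = {
    "cell_type_a": {"P1", "P2", "P3", "P4"},
    "cell_type_b": {"P1", "P2", "P5"},
    "cell_type_c": {"P1", "P6", "P7"},
}


def bulk_proteome(proteomes):
    """Union of every cell type's proteins: what a bulk tissue digest could contain."""
    return set().union(*proteomes.values())


bulk = bulk_proteome(cell_type_proteomes)
per_type_counts = {name: len(proteins) for name, proteins in cell_type_proteomes.items()}

print("bulk tissue:", len(bulk))
for name, count in per_type_counts.items():
    print(name, count)
```

How far below the bulk number a sorted or microdissected population lands depends entirely on how much the cell types overlap, and that is exactly the part I can't pin down.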

Seems very cell-type specific. For example, probably on the low end are the boring simple old red blood cells. Two recent studies (post 1 and 2 here) suggest they may only have 2,000 or 3,000 total proteins. They don't have to do much but haul hemoglobin and malaria parasites around. They don't need a ton of proteins. I'd expect everything else goes up from there?

Getting a good answer this morning has been tougher than I thought it would be...if anyone knows of a good breakdown or review, that would be great. I feel like I should be able to make one of the Atlas projects make a chart for me, but I haven't figured it out yet. I also can't figure out my stupid washing machine (whatever happened to a dial? what's wrong with the spring loaded -- wash -- spin -- rinse -- spin? why does a washing machine need a really crappy touch screen user interface?) so -- grain of salt...it's probably easy....

Scholar insists that the answer is in this paper (it isn't. this title promises a lot. the paper doesn't deliver)



What about the Human Proteome Map (JHU version)? AHA!

There is this sweet chart that provides solid insight --



The bottom chart is all 30 tissues they tested. There are 2,350 proteins (far right) that were found in every cell type they checked out. On the opposite end are genes/proteins that are unique to one single tissue/cell. Most are in the middle. I think this says a lot -- like the Venn diagram would be horrendous to look at -- OMG -- it would make the best UpSetR plot...though....okay......I've got other stuff I should be doing. This makes sense to me. I don't think RBCs were done, but they'd be the low end -- in this 2,500 protein range and we'd see this complexity all the way up, since this should all be additive, but each human cell type would exist on a spectrum ranging from 2,500 proteins right on up.
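If you had the per-tissue protein lists in hand, the tally behind a chart like that is simple to do yourself. This is only a sketch with made-up inputs: the tissue names, the placeholder IDs and the function `tissue_count_distribution` are all hypothetical, and it is not how the map's authors built their figure. For each protein it counts how many tissues it shows up in, then counts how many proteins fall at each of those values.

```python
from collections import Counter

# Toy data: placeholder protein IDs and tissue names, NOT real measurements.
tissue_proteomes = {
    "tissue_1": {"P1", "P2", "P3"},
    "tissue_2": {"P1", "P2", "P4"},
    "tissue_3": {"P1", "P5"},
}


def tissue_count_distribution(proteomes):
    """Map k -> number of proteins detected in exactly k tissues."""
    # Step 1: for each protein, count the tissues it was seen in.
    tissues_per_protein = Counter()
    for proteins in proteomes.values():
        tissues_per_protein.update(proteins)
    # Step 2: count how many proteins share each tissue count.
    return Counter(tissues_per_protein.values())


dist = tissue_count_distribution(tissue_proteomes)
for k in sorted(dist):
    print(f"found in {k} tissue(s): {dist[k]} protein(s)")
```

In this toy set, the entry for k equal to the number of tissues is the "found everywhere" bar at the far right, and k = 1 is the tissue-unique end.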

Wait. What was the point of this? It wasn't to ask a question and then say -- "sorry, I totally don't know" but that seems to be what happened. There is a take-away, though!

If you're running some proteomics experiments, don't freak out if you don't get the 6,000 or 8,000 or 16,000 proteins that you expect from your HeLa cell line under the same conditions. Your cells probably don't have that many proteins. If you look hard enough in the literature for your specific organ or cell, there is probably guidance on what you should expect. (Transcript studies like this one might be useful guidance -- if it isn't transcribed, it won't be translated, so those are probably the high-end numbers, excluding posttranscriptional/translational thingies).

Chances are it's a lot lower than your cancer control digest, and the more homogeneous the cells going into your digest are, the lower that total # of proteins ID'ed should be.


2 comments:

  1. Thanks Ben, I always enjoy your posts but don't say so often enough :)

  2. 3.5 years later I've been wondering about similar questions... Thank you for the interesting post. Do you have any new takes on the issue?
