Okay, I give up, what is proteomics?

Proteomics is the study of the proteome. Just like genomics is the study of the genome (or all the genes in an organism), proteomics is the study of all the proteins in an organism or a cell.

We aren't there yet. Heck, we still don't even know how many human proteins there are.

We know there are at least 20,000 or so, and there are definitely (probably) fewer than 1 billion. It's really really really hard to measure more than 12,000 or so, and I'd argue that it only recently became realistic to get to that number (I love this paper).

Why would you bother with the proteome when genomics is so powerful?

Stay with me for a second, I'm going to make a really lousy metaphor/analogy thing (always hard to tell because I use the word "like" like way too much.)

Medical researchers are like auto mechanics. We often spend over a decade in college and rack up huge amounts of student loan debt and often don't make more than auto mechanics, but the job is similar.

"Hey Mechanic (medical researcher), there is something wrong with my car (these patients), what is causing the problem and how can we fix it?" 

Good specialized mechanics have massive books of schematics of the cars they work on. Let's use an example I've actually seen. 

Imagine that I gave you every schematic of every stage of constructing a 1986 Porsche 944 Turbo. 

Something like 800-ish pages of things like this, drawn by some engineer with a ruler in the 1980s. 

There is a TON of information in this book. Where every wire runs in the car, what the gauge of that wire is, etc., etc.

In this metaphor gone awry (it'll get worse) the schematic is DNA and the genome. Loads and loads of information about what makes up the car. 

As the mechanic, if someone tells you something is wrong with a 944, would it be easier to figure out what is wrong with the car from hearing about the symptoms and examining your schematics, or would it be easier to actually examine the car?

The proteome is the car. The schematic (genome) has loads of information and if you've got that information you'll definitely want to reference it, but it is ultimately indirect information. For example, in the schematic you'll find that a 20 gauge wire runs from point A to point B, and the book will tell you the insulation thickness for that wire (in a 944 it should be very very thin. How else will you be plagued by shorts and the occasional fire?) and who it was originally sourced from when the car was constructed. Super helpful information. But what if the 3rd owner of the car that is having a problem replaced that awful cheap 80s German wiring harness in 1994 with something made by some Chevy mechanic in Kentucky? That indirect information is no longer nearly as useful.

The actual functional parts in the car (patient) are composed of proteins. Basically everything that performs an action of some kind. You need to spray some dead bugs off your windshield? DNA doesn't do that. A protein pump sprays that blue stuff on your windshield. The DNA tells you what starting material you'd need to make that pump. Transmission? 4,000 proteins working in harmony. The DNA tells you how to start making each one of those proteins. Nothing more. Turbocharger that for some absurd reason will handle 20 pounds of boost? Again, that is a bunch of really awesome proteins working together.

To make it even more complicated, some proteins just modify other proteins. That is their job. So it may be impossible to tell the output power of that engine from your schematic, because the DNA encoded a specific compression ratio, but another protein increased the size of the spring on the wastegate of that turbo and made it run way more boost.

Back to the mechanic (medical researcher).

The mechanic's job is to fix something that is wrong with that car. Over time components start to break down that will eventually send that car to the junk yard (aging).  Something got into the engine and is causing it to rapidly damage itself internally and you're having a snowball effect of parts failing? That's kind of what cancer and some neurological diseases are like. 

Every mechanic is going to be more effective with a comprehensive and direct view of the actual problem. Reference the schematic if you have it, but if you could directly hear a sound in the transmission? Do you have to reference that 800 page book before you remove the transmission? 

In proteomics we take the car apart and we look for what is different between a working car and the one that is broken. We've got great libraries and databases of the protein profiles of healthy human brains and livers, and an atlas of what proteins should look like for just about every single part of every organ of a human being. We can go online right now to ProteomeXchange and start downloading thousands of files to reference. Let's go back to the schematic metaphor. If the genome is that 800-page book, ProteomeXchange is a museum where some other mechanics I see at conferences twice a year have taken one thousand 944s apart and laid the parts out across the floor for me to look at.

Now I can make a list of what is different in the broken car I'm examining compared to some that just came off the track. I hand that list of what is different off to the biologists or medical doctors. 
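If you want to see what "making the list" looks like at its very simplest, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the protein names are placeholders, the intensities are toy numbers I made up, and the function name and the 2-fold cutoff are hypothetical. A real comparison would also need replicates, normalization, and proper statistics, so treat this as the idea and not the method.

```python
import math

def differential_proteins(reference, sample, min_abs_log2_fc=1.0):
    """Return (protein, log2 fold change) pairs that differ between two runs.

    reference / sample: dicts mapping a protein name to a measured intensity.
    min_abs_log2_fc: hypothetical cutoff; 1.0 means at least a 2-fold change.
    """
    hits = []
    # Only compare proteins that were measured in both cars.
    for protein in sorted(set(reference) & set(sample)):
        ref, obs = reference[protein], sample[protein]
        if ref <= 0 or obs <= 0:
            # A log ratio is undefined here; missing values need separate handling.
            continue
        log2_fc = math.log2(obs / ref)
        if abs(log2_fc) >= min_abs_log2_fc:
            hits.append((protein, log2_fc))
    # Biggest changes first.
    return sorted(hits, key=lambda hit: abs(hit[1]), reverse=True)

# Toy, made-up intensities in arbitrary units. Names are placeholders, not real proteins.
track_car = {"PROT_A": 1000.0, "PROT_B": 500.0, "PROT_C": 250.0, "PROT_D": 80.0}
broken_car = {"PROT_A": 1100.0, "PROT_B": 125.0, "PROT_C": 1000.0, "PROT_D": 0.0}

for protein, log2_fc in differential_proteins(track_car, broken_car):
    print(f"{protein}: log2 fold change {log2_fc:+.2f}")
```

With these toy numbers the list that goes to the biologists has two entries: PROT_B is down 4-fold and PROT_C is up 4-fold. PROT_A barely moved, and PROT_D is skipped because a zero reading can't go into a log ratio.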

We're new. The word proteomics has been around for a long time, but the realization of what the word meant only became practical about 3 years ago. I typically won't even look at "proteomics" data that is more than 5 or 6 years old. We've jumped a lightyear since then. Despite being new we've had some absurd successes. This group took a small bit from tumors that were removed from patients, analyzed them thoroughly, and got back to the doctors in a couple of days with a much more informed plan for what chemotherapies would work best. We can do this stuff today, if we're given the chance.

Why has science focused on studying the genome first?

Because it's much much much much easier to study the genome (and transcriptome or RNA). The realization that DNA was the central template of the cell (only a few decades ago) was such a revelation because no one believed that such a simple molecule could map out biological complexity. And it can't. It can give you a start, though.

There isn't a scientist on earth who wouldn't have started with understanding the proteome before understanding the genome. The problem is that the complexity is so much higher that we've only just recently been able to analyze more than a few proteins at a time. 

There are other important structural molecules, like lipids, but if you want to get to an understanding of what a cell or disease is doing, you need to know what the proteins are doing. How many are there? And how many are modified by other proteins? 

What are some other pros and cons of proteomics? 


Proteomics is relatively inexpensive. You are paying the mechanic to look at the car, not study 800 pages of schematics. Genomics requires expensive reagents and supercomputers to look at the data. Proteomics (by mass spectrometry, the most historically established and well-accepted way to do it) only requires a mass spectrometer and someone who knows how to run one. These aren't exactly cheap, but a proteomic analysis costs a small fraction of the price of a genome or transcriptome. The actual cost for a proteomics assay is closer to tens of dollars. Genomics? If you look at all the costs? You're still in the hundreds of dollars per sample to do it right, and closer to $1,000 than you are to the dream of $100. A study just last year estimated the cost for a decent proteome assay to be around $12. (I'll put that reference in when I find it.)


The field of proteomics is treated like a skilled trade, not like a scientific discipline. You can't get a degree in proteomics, while most universities around the world have genetics degrees. To learn proteomics today you need to find an expert and work with her, often through a Ph.D. program or postdoctoral fellowship. Given the shortage of master-level proteomics researchers, many people are coming into this rapidly growing field with no experience at all, and are learning on the job. The effect of this system can be seen in a study the Association of Biomolecular Resource Facilities performed a few years ago. We sent the exact same sample out to any proteomics lab that wanted it, and we compared the results from everyone who anonymously sent data back and answered a few questions about themselves, their lab, and their tools. The only factor that correlated with the data quality was how long the person who performed the experiment had been doing proteomics. That is not a good thing. Particularly when the field is growing and the outside world wants our services. To keep up with a growing demand, something will have to give, and I hope it isn't quality, but it looks like right now it's quality. Every day I hope to hear that someone somewhere has started a proteomics or mass spectrometry degree program. My dream is to do that, but I hope someone else does it first. I'm far from having the resources to try.

And if you stumbled across this page looking for an answer to this question, I'll let you in on a secret. There are about 50 new companies out there that are trying to capitalize on the word "proteomics" and get your money. A few of them are clearly bullshit, best I can tell. A flood of money into an industry is always going to cause things like this to happen. If someone has a "proteome analyzer" with a touch screen that is about the size of an iPad and you're thinking about giving them money, don't. If one existed, I'd have it. I promise. If someone is promising to complete 6,000 "proteomes" at a time with their genomics system and you're thinking about giving them money? Check it out first! There are some innovative and new proteomics technologies out there. There are also established technologies out there that are getting better by the day. There are researchers working tirelessly to improve proteomics, but with the amount of money coming in there will be, and there are, scams. My gosh, there are some funny ones right now, though. Please keep in mind that RNA is not a direct measurement of protein. It never has been and it never will be. It doesn't matter how close you get to the protein (even if it's in the ribosome). If you could measure the protein content from the transcripts, I'd still be doing transcriptomics instead of inventing methods for expensive vacuum chambers. RNA abundance measurements may here and there line up with the protein abundance, but it's still an indirect measurement. And remember that proteins modify other proteins.

Proteomics, though, is real. And it's complementary to the schematic. We'll always use the schematic if we have it, but if I had to choose between the schematic and getting my hands on the car so it could go back on the track, I sure as hell would take the latter.

1 comment:

  1. I am one of those that is now working in this field after running an analytical lab and teaching at a PUI for 20 years. Your description of the challenges and preconceptions in proteomics is very informative. Many of our PIs think that the core facility that runs their samples are also experts on their species, organ, disease, or treatment of study. Thank you for your post!