I was really excited about the OmicsDI paper, but I realize that right now we're kinda being inundated with terms like "Big Data," and everyone's got a "Database" and an "API" and... it does seem like these terms mean different things depending on who you are talking to. This explosion of new terms and ideas makes it a little hard to separate the really good from the blustery jargon.
First off -- OmicsDI can help us with one of these problems. It does so by bringing a bunch of databases together. Here are some examples of what you can do with it! (This is OmicsDI.org, btw!)
I'm going to pick a random cell line. Let's go Colo205. Just typing the cell line into the little search bar gives me access to a ton of information:
There are 3 proteomics studies and 5 transcriptomics ones featuring this cell line that we can directly access via OmicsDI! Just from looking at the results list (T = transcriptome, P = proteome, etc.) you can see an overview of each study and learn a bit about it.
For example, 5 studies by ArrayExpress. Microarrays! Right from the start you see there are 7 studies here that won't provide any meaningful data whatsoever and you can move right along (kidding, of course!)
Clicking on the provided link will take you to ArrayExpress (which, btw, I'd never heard of before), where you can see a summary of the study and get direct links to download its processed and RAW data.
If someone had said to me, "Cool proteomics, has anyone done transcript analysis on this model before?", I would have started like this:
Which, btw, doesn't lead you to anything about science on the front page at all. OmicsDI has already made me more efficient in this hypothetical.
Okay...so...I was a little bummed that there were only 7 studies on this cell line in the repository. Guess what? There is some disagreement about the nomenclature of the cell line. Is it Colo-205 or Colo205? If I type the search "Colo205 or Colo-205" I get 10 more studies.
Including another database I didn't know about (Expression Atlas). Let's follow that one!
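If you want to avoid missing studies because of spelling differences like this, you can generate the name variants up front and paste the combined search into the search bar. Here is a minimal Python sketch of that idea. The helper names are mine, and the search URL pattern is an assumption about how the site's search page is addressed, so check it against the site before relying on it.

```python
from urllib.parse import urlencode

# ASSUMPTION: address of the OmicsDI search page; verify before using.
ASSUMED_SEARCH_URL = "https://www.omicsdi.org/search"


def name_variants(prefix, number, separators=("", "-")):
    """Return spellings of a cell line name, e.g. Colo205 and Colo-205."""
    return [f"{prefix}{sep}{number}" for sep in separators]


def build_query(terms, joiner=" or "):
    """Join the spellings into one search string, as typed in the post."""
    return joiner.join(terms)


def search_url(query, base_url=ASSUMED_SEARCH_URL):
    """Build a link to the search page for the given query string."""
    return f"{base_url}?{urlencode({'q': query})}"


variants = name_variants("Colo", "205")
query = build_query(variants)
print(query)
print(search_url(query))
```

Passing extra separators (for example a space) to `name_variants` is an easy way to try more spellings, though how the search box treats each one is something you would have to test yourself.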
Following the Expression Atlas link leads me to a table that I can search in the web interface or download in its entirety. It holds the expression levels of 24,000 transcripts across a ton of cell lines, with a heat map indicating relative up/down regulation.
Remember that hypothetical question I mentioned above? Did anyone find this in the transcriptomics? Take those proteins you found that were up- or down-regulated and search for them here. And the data is at your fingertips! No looking for a database you didn't know existed!
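If you download the table instead of searching it in the browser, looking up your list of proteins can be scripted. This is only a sketch: I am assuming a tab-separated file with one gene column and one column per cell line, and the column names, gene names, and values below are all invented stand-ins, not real Expression Atlas data.

```python
import csv
import io

# Toy stand-in for the downloaded table. Everything here is made up.
TOY_TABLE = (
    "gene\tColo205\tOtherLine\n"
    "GENE_A\t12.0\t3.0\n"
    "GENE_B\t0.5\t8.0\n"
    "GENE_C\t4.0\t4.0\n"
)


def lookup_expression(table_file, genes, cell_line, gene_column="gene"):
    """Return {gene: expression value} for the requested genes in one cell line."""
    wanted = {g.upper() for g in genes}
    reader = csv.DictReader(table_file, delimiter="\t")
    found = {}
    for row in reader:
        if row[gene_column].upper() in wanted:
            found[row[gene_column]] = float(row[cell_line])
    return found


# Hypothetical list of genes for proteins that changed in a proteomics run.
my_genes = ["GENE_A", "GENE_C", "GENE_X"]
hits = lookup_expression(io.StringIO(TOY_TABLE), my_genes, "Colo205")
missing = [g for g in my_genes if g not in hits]
print(hits)
print("Not in the table:", missing)
```

With a real download you would swap `io.StringIO(TOY_TABLE)` for an opened file and adjust the column names to whatever the file actually uses.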