Sunday, August 16, 2015
An informed opinion on bioinformatics for whole human proteome projects.
From an instrumentation standpoint, assembling an entire proteome is something that many labs have the ability to do. Is it still a challenge? Sure! Sample prep and prefractionation for complex organisms are still things that you're really going to have to get right.
What about the data processing side of things? This might actually be where the real problems are right now. If you've got FDR controlled at 1% at the MS/MS level and you have one million MS/MS spectra, that is saying you probably have about 10,000 things wrong. If you've got a billion? That's 10 MILLION bad matches.
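To make that arithmetic concrete, here is a quick back-of-the-envelope sketch in Python. It makes the same simplifying assumption as the numbers above: every MS/MS spectrum counts as an accepted match, so the expected number of bad matches is just the spectrum count times the FDR. The function name `expected_false_matches` is made up for this illustration and doesn't come from any proteomics package.

```python
def expected_false_matches(n_spectra: int, fdr: float) -> int:
    """Rough expected count of wrong matches at a given spectrum-level FDR.

    Simplification: treats every spectrum as an accepted match, so the
    estimate is n_spectra * fdr, rounded to a whole number.
    """
    return round(n_spectra * fdr)


if __name__ == "__main__":
    # One million and one billion spectra, both at 1% FDR
    for n in (1_000_000, 1_000_000_000):
        bad = expected_false_matches(n, 0.01)
        print(f"{n:,} spectra at 1% FDR: about {bad:,} bad matches")
```

The point isn't the code, it's the scaling: the error rate stays at 1%, but the absolute number of wrong identifications grows right along with the dataset.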
If you follow bioinformatics on social media in any way, chances are you know of Yasset. He has a lot of experience with datasets as large as, and much larger(!) than, the ones we're generating. In this post on his blog, he takes a look at the first human proteome drafts, the re-analyses, and opinions from thought leaders in the field, and adds some of his own thoughtful insight as well.