Ruhroh, Reorge. Glendon Parker is still out there trying to rain on our free proteomic data parade in this new JPR paper.
We've been doing some classical proteogenomics, and getting the "publicly deposited" genomic data for it was a huge hassle. Justification forms, months of waiting, and all of it under a really strict deadline that was imposed by the people who had the genomics data. Longest part of the project? Waiting to get approval for the whole exome sequencing data.
You have to make sure that someone isn't going to use that 250 GB RNASeq data file to extract personally identifiable data from the patient for nefarious purposes. The 24 high pH offline fractionated normal and tumor data from that same patient? Pull that down as fast as your internet connection will allow.
Could you identify that patient by the single amino acid variants you can find in the proteomics data?
....let's umm.....go with...wait! change the subject! (Dr. Parker, that's enough from you and your group. I don't want to wait 2 months to download every .RAW file from every preprint. I'll forget to do it!)
Look, we are going to have to tackle this at some point. Either the genomics people are being crazy paranoid about personal data, or we're being lackadaisical.
I'm definitely being that (did I spell that right or is the spell check off?) last word, because I sat in on a webinar the other day and someone showed a slide that I recognized as a list of single amino acid variants that you can see in my personal plasma that are confirmed by my personal whole genome sequencing data. I've got some plasma proteins that look downregulated vs pool in some analyses because of the variants. I use the slide to point out issues with extreme ratio quan in some LCMS tools and why we need to think about variants and I appear to have shared that deck a lot.
In a world without the terms "pre-existing condition" and "life destroying medical bills" (still the #1 cause of bankruptcy in my country, where people are dying at a really depressing rate because they are using expired or black market INSULIN because they can't afford the real stuff), maybe we wouldn't actually care about which of our personal data is out there in the world.
But if we actually care, we might want to actually care about all of it. Not just the stuff that you need access to an HPC cluster to properly process.
If you do want to identify someone by their single amino acid variants IN THEIR HAIR, this new JPR study will tell you which hardware solution to use for it and how to best set it up.