Wednesday, December 1, 2021

The growing need for controlled access models for proteomics (and metabolomics)...

 


EEEEEEEK. Last year I helped trick like 30 different people into helping us reanalyze something like 1,200 CPTAC files to look for PTMs and mutations. (If you're curious, it is open access here.) Part way through we decided it would be great to get the genomic sequences from those tumors because the NCI had the exomes and transcriptomes from these patients. You can't just download that stuff. 

You apply for access and..... 


.... you............wait..........................and maybe you get it.....maybe you don't................but you wait either way............. 

Genomics people are used to this! This is how it goes. We proteomics people, on the other hand, are used to going to ProteomeXchange and getting every file that we want as fast as our internet connection will allow. 

A growing body of evidence suggests that we probably shouldn't be allowed to do things the way that we do now, and this short story in Nasty Comms is another reinforcement. For now I think most people in the scientific community mostly forget that proteomics exists. With some increased recent interest I think we've gone from being forgotten for months or years at a time to maybe just days or weeks at a time. It's easy to blame those of you that have nice labs with windows rather than the classic subterranean dungeons where giant boxes go in every few years and we largely stay out of sight and mind, but it's probably the science. 

If you can identify a person and their traits with technology A and technology B and doing it with A is both illegal and unethical, you really have to stop and think about how you handle technology B. 

Metabolomics might surprise you a little because it isn't quite as straightforward to identify a person (if you can, I don't know how) but you can get SO much information about them. The way that I try to impress people into letting me do metabolomics is by running some of their precious clinical samples that they have all the information on and then sending back a focused report of drugs and drug metabolites. If you go in blind and show some MDs what drugs their patient was on at that blood draw you get instant credibility. The flip side is that you can easily find that information in basically any untargeted metabolomics study. A fun one we just published was from a collaboration with a group that has done lots of work associating schizophrenia with cannabis use. Global metabolomics on their cohort strongly suggests that the method of screening their control group (giving them a questionnaire about their drug use) has the weakness that people sometimes lie about using drugs. Even worse, sometimes LOTS of people lie about using illegal drugs, so there is a bunch of data on a huge historic cohort that may need a revisit or 12. 

It will be interesting to see how data access continues to develop for us going forward, but I predict that 10 years from now it won't be nearly as easy as it is today. I also suspect a lot of us will learn how HIPAA-secure data storage systems work....
