Tuesday, December 6, 2016

CanProVar -- Another great resource for cancer mutation FASTAs!

How useful is a FASTA that doesn't have your peptide sequence in it? A lot less useful than one that does!  Sure, we can de novo or Byonic it, but try streaming NBA games when your resource monitor has looked something like this since Friday....

(It actually doesn't look that bad right now, but I did have to end some runs and move Byonic from "Heavy" CPU usage to "Normal", which is a nice feature and fine for streaming as long as I don't go above 720p.)

What was I talking about? Oh yeah!

It is a whole lot easier when the FASTA contains your peptide sequence!  And if you're studying something like cancer, chances are the normal database resources don't have what you're looking for.
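If you want a quick sanity check before you burn CPU time, here is a minimal sketch of how you might ask "is my peptide even in this FASTA?" in plain Python. Everything in it is an assumption for illustration: the function names are mine, the two demo entries are toy sequences I typed in (not real database records), and it only does exact substring matching, so it ignores things like I/L ambiguity.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes out the previous entry, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def entries_containing(peptide, lines):
    """Return the headers of every entry whose sequence contains the peptide."""
    return [header for header, seq in read_fasta(lines) if peptide in seq]


# Toy entries for illustration only (hypothetical headers, made-up demo sequences).
demo = """>wild_type_demo
MTEYKLVVVGAGGVGKSALTIQ
>variant_demo
MTEYKLVVVGADGVGKSALTIQ
""".splitlines()

print(entries_containing("LVVVGADGVGK", demo))
```

To run it on a real file, you would pass an open file handle instead of `demo`, since `read_fasta` only needs something it can loop over line by line.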

You'll find a couple of enthusiastic posts on this dumb blog about the XMAn cancer mutation database that Virginia Tech researchers assemble and maintain. (Go Hokies!)

CanProVar is another sweet resource out of Vanderbilt that is built from public repositories and dbSNP files (whatever those are).

Which one should you use?

A first comparison between the files is that CanProVar is smaller than XMAn, which should shorten the search time on big datasets. The downside, of course, is that the database is... smaller... which probably means fewer mutations are present.
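If you want to put numbers on "smaller", a rough sketch like this counts entries and total residues in a FASTA. The function name and the tiny demo lists are mine and purely hypothetical; they are not real CanProVar or XMAn content, and entry count is only a crude proxy for search time.

```python
def fasta_stats(lines):
    """Return (number of entries, total residues) for an iterable of FASTA lines."""
    entries = 0
    residues = 0
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            entries += 1           # each header line marks one entry
        elif line:
            residues += len(line)  # sequence lines contribute their length
    return entries, residues


# Tiny made-up inputs standing in for two databases of different sizes.
demo_a = [">p1", "MKV", "LLA", ">p2", "MTT"]
demo_b = [">p1", "MKV"]

print(fasta_stats(demo_a))
print(fasta_stats(demo_b))
```

For the real thing, you would open each downloaded .fasta file and hand the file handle to `fasta_stats`, then compare the two results side by side.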

You can check out CanProVar here.

Apparently, it predates other resources by quite a bit, 'cause here is a paper on it from 2010. Somehow, I've either never seen it before -- or I've forgotten it. And this is exactly why Twitter is awesome for science!

1 comment:

  1. Howdy Ben, I was wondering if there was a way in PD or with another tool to determine if the peptides identified actually encompass the mutation or just the wild-type flanking peptides? I've searched against the XMAn and CanProVar .fasta files and get lots of matches but have noticed that most of the matches correspond to the non-mutated portions contained in the databases.