Thursday, April 30, 2015

How to import Proteome Discoverer results into Ingenuity Pathway Analysis

I swear I wrote this up somewhere before or made a video but I can't seem to find it.  Here it goes.  Ingenuity Pathway Analysis (IPA) (TM,R) is a great program for figuring out what is happening in your samples.  It was originally designed for microarray analysis but has grown over the years to be a fantastic tool for all sorts of -omics studies.  I used it for years in my previous life as a government researcher and I'll recommend it to just about anyone doing quantitative proteomics.

Why? 'Cause it will find the missing proteins!  We know we aren't getting 100% proteome coverage in our studies, though many groups are starting to come close.  If you have 5 or 6 proteins up- or down-regulated, IPA can figure out which pathway those are most strongly correlated with and can point you to other interesting targets (or the low copy number regulator that might be causing your phenotype).  One of the many upsides of IPA is that all this data is manually curated.  The downside is that this is an awful lot of work for them, so the software isn't free.  You can try it out though (30 day trials, I think...)

As I said, however, IPA was designed for genomics.  It works for proteomics, but your data needs to be in a genomics-friendly format.  I like the universal gene identifiers that are embedded in the Uniprot/Swissprot FASTA files.  There are other ways around this, but this is the easiest way I've found.  I'll show you how I do this in PD 1.4.  PD 2.0 is a little different and I'll highlight that as well.

I grabbed the first data file I could find on my hard drive on this sleepy morning (no coffee in my hotel room?!?! Seriously?!?!  Short pause while I write a scathing review of this place on Google Reviews...)  Okay! Here is the processed TMT 10plex file.

For simplicity's sake I went to the right corner and removed all the unnecessary (for this) stuff, including: accession number, # peptides, # unique peptides, etc. I just kept the Uniprot description and the protein intensities (for label free) or, in this case, the ratios.

Next, you need to export your results so you can edit them in Excel.  In PD 1.0-1.4 you could right click and "export to excel". This function was replaced with the one shown below in PD 2.0.

To keep this easy for me I'm going to stick to the PD 1.4 version. The rest of the process is the same in PD 2.0.

Open your Excel or Text output in Excel.  What you need to do is parse the universal gene identifier out of your protein description.  There are smart ways to do this. Or you can just use the "Text to Columns" feature to get your identifier into its own column.

Keep in mind that in many formats mitochondrial proteins don't have the same descriptor parameters that other human proteins do. The end result should look something like this.
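If you'd rather skip "Text to Columns", here is a minimal Python sketch of one of the "smart ways". It assumes the description carries the gene symbol as a "GN=" tag, the way Uniprot/Swissprot FASTA headers usually do. The function name and the example descriptions are made up for illustration. Entries without a "GN=" tag come back as None so you can deal with them by hand.

```python
import re

# Assumption: the gene symbol appears as "GN=<symbol>" in the description,
# as it usually does in Uniprot/Swissprot FASTA headers.
GENE_PATTERN = re.compile(r"\bGN=(\S+)")


def extract_gene(description):
    """Return the gene symbol from a description, or None if there is no GN= tag."""
    match = GENE_PATTERN.search(description)
    return match.group(1) if match else None


# Illustrative descriptions only, not taken from a real results file.
examples = [
    "Septin-5 OS=Homo sapiens GN=SEPT5 PE=1 SV=1",
    "Some protein with no gene tag OS=Homo sapiens PE=1 SV=1",
]

for text in examples:
    print(extract_gene(text))
```

Anything that prints None is worth a manual look before you import.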

At this point, I like to save this as tab separated text. This ditches any weird formatting stuff hidden in your Excel file.  This text file is now ready to import directly into Ingenuity.  If you were doing relative label free quan with the precursor ion area detector then you'd need to divide your intensities to get ratios.
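For the label free case, the divide-and-save step could be sketched like this in Python. Everything here is an assumption for illustration: the gene names, the intensity values, the column header, and the choice to leave the ratio blank when the control intensity is zero or missing.

```python
import csv
import io


def intensity_ratio(treated, control):
    """Return treated/control, or None when the ratio is undefined."""
    if treated is None or control is None or control == 0:
        return None
    return treated / control


def write_tsv(rows, header, handle):
    """Write rows as tab separated text, leaving undefined values blank."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(header)
    for row in rows:
        writer.writerow(["" if value is None else value for value in row])


# Hypothetical (gene, treated intensity, control intensity) values.
proteins = [("GENE_A", 2.0e6, 1.0e6), ("GENE_B", 5.0e5, 0.0)]
rows = [(gene, intensity_ratio(treated, control)) for gene, treated, control in proteins]

# An in-memory buffer stands in for a real file opened with open(path, "w", newline="").
buffer = io.StringIO()
write_tsv(rows, ["Gene", "Treated/Control"], buffer)
print(buffer.getvalue())
```

Whether a blank is the right thing for a missing ratio depends on how you want those proteins treated downstream, so adjust to taste.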

Things to note:  Excel has this annoying issue with Septin proteins.  It recognizes Sep5 as "September 5th" and will change it to 9/5/15 or something.  You can change this in configuration to turn it off.  There is something else dumb it does, but I forget what that is.
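If you want a quick sanity check before the file goes anywhere near Excel, a rough Python sketch like this can flag gene symbols that look like dates. The pattern is my own heuristic (a month-like prefix followed by one or two digits), not a list of what Excel actually converts, so treat hits as "go look at these".

```python
import re

# Heuristic assumption: symbols made of a month-like prefix plus 1-2 digits
# (e.g. SEP5, SEPT5, MARCH1, DEC1) are the ones at risk of being read as dates.
DATE_LIKE = re.compile(
    r"^(JAN|FEB|MAR|MARCH|APR|MAY|JUN|JUL|AUG|SEP|SEPT|OCT|NOV|DEC)-?\d{1,2}$",
    re.IGNORECASE,
)


def looks_like_date(symbol):
    """Return True if a gene symbol could plausibly be mistaken for a date."""
    return bool(DATE_LIKE.match(symbol.strip()))


# Illustrative symbols only.
symbols = ["SEPT5", "Sep5", "MARCH1", "TP53", "MAPK1"]
at_risk = [symbol for symbol in symbols if looks_like_date(symbol)]
print(at_risk)
```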

When you import into IPA, tell the program you have headers so it ignores the top row. Then identify the gene column as the target and the quan output as the conditions you want to study.  You can set up time courses or multiple comparisons this way.

Is IPA a perfect tool for proteomics? Nope, but it's getting better all the time. And if someone in your lab has a license for genomics research, it's a nice way to quickly see if anything cool is hiding in your data.
