Friday, November 2, 2018

NeoFusion -- A search engine for spliced peptides!

Why is Madison, Wisconsin the capital of proteomics innovation in the U.S.?  I have a theory developing, but it's obvious that it's just snowballing up there. Pun intended. If you go see what's going on for yourself, don't take the free Mustang upgrade Hertz offers you in February.

Case in point: This brilliant new piece of software. Wait -- honestly -- I thought today I was going to talk about another brilliant piece of software from Wisconsin that we just started using continuously this week -- but NeoFusion has to jump the line.

Let's fill in some backstory here: Endogenous peptides are super important to systems like self-recognition. Our cells will hold weird peptides in little protein pockets on the outside of our cells and these peptides say to the immune system things like "I'm a Ben cell. Don't eat me!" or "I'm a dying cell that's all infected with stuff. KILL ME!!" They do other stuff but this is all the biologists have gotten through to me with the sock puppets they use to explain immunology to me. Big picture is that if you can figure out the peptides on bad cells you can make antibody drugs to destroy them... and stuff....

Problem is -- there is mounting evidence (3 papers now, I think -- and more on the way) that a lot of these peptides are all spliced up from either different regions of themselves or even other proteins as a side effect of the proteasomal degradation process that produces them.

I'm new to the endogenous peptide stuff, but I've been through some studies pretty thoroughly and my take on the data analysis is -- get your LC-MS files and do some de novo sequencing (or use modified proteomics engines) while simultaneously using really large mass tolerances and pretending that you've never heard of FDR. Hey -- whatever -- it's not like people are trying to make cancer drugs -- this is proteomics -- what's important is that you found more peptides than the last team.

NeoFusion is something completely new.  It has this brilliant sequence-based identification method for matching what it's got to the sequences you feed it. It isn't like these systems invent new proteins from nothing. They splice the existing proteome together. NeoFusion uses this to its advantage and it's like a puzzle -- once you find part of the story the sequence continues to support itself if it's correct. Does that make sense?  This team is also all about heavily relying on post-acquisition recalibrations to strengthen the identification of PTMs (crank that mass accuracy up!!!) -- and -- of course this is ridiculously useful here as well. I haven't run anything through this yet -- but I've got a buttload of files waiting for IT permission to install this!
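
If the "splice the existing proteome together" idea is hard to picture, here is a toy Python sketch of it. To be clear, this is NOT NeoFusion's algorithm (I haven't even installed it yet). It just brute-forces every peptide you could build by gluing two non-overlapping chunks of one made-up protein together, and throws out anything that already exists as a normal contiguous stretch. The function name, the `min_frag` parameter, and the toy sequence are all my own assumptions for illustration.

```python
def cis_spliced_peptides(protein, length, min_frag=2):
    """Toy enumeration of cis-spliced candidates (hypothetical helper).

    Returns every peptide of `length` residues made by joining two
    non-overlapping fragments of `protein`, in either order, excluding
    peptides that already occur contiguously in the protein.
    """
    results = set()
    n = len(protein)
    for len_a in range(min_frag, length - min_frag + 1):
        len_b = length - len_a
        for i in range(n - len_a + 1):
            frag_a = protein[i:i + len_a]
            for j in range(n - len_b + 1):
                # Skip fragment pairs that share any residue position.
                if i < j + len_b and j < i + len_a:
                    continue
                candidate = frag_a + protein[j:j + len_b]
                # A contiguous match is a normal peptide, not a spliced one.
                if candidate not in protein:
                    results.add(candidate)
    return results


toy_protein = "ACDEFGHIK"  # made-up sequence, not a real protein
candidates = cis_spliced_peptides(toy_protein, length=4)
print(len(candidates), "spliced candidates, e.g.", sorted(candidates)[:3])
```

Even this nine-residue toy spits out a pile of candidates, which gives you a feel for why the search space blows up on a real proteome and why anything that narrows it down using the sequence itself is a big deal.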

Bonus points because NeoFusion doesn't forget that FDR is a thing.
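
For anyone who needs a reminder of what "FDR is a thing" means in practice, here is a minimal sketch of the generic target-decoy approach most proteomics search tools use in some form. I'm not claiming this is how NeoFusion computes it. The function name, the `(score, is_decoy)` tuple layout, and the example scores are all assumptions I made up for illustration.

```python
def filter_by_fdr(psms, max_fdr=0.01):
    """Generic target-decoy filter (hypothetical helper).

    psms: list of (score, is_decoy) tuples, higher score = better match.
    Returns the scores of target hits kept at the deepest point in the
    ranked list where decoys / targets is still <= max_fdr.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = 0
    decoys = 0
    cutoff = 0  # number of top-ranked matches to keep
    for idx, (score, is_decoy) in enumerate(ranked, start=1):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Estimated FDR at this depth is decoy hits over target hits.
        if targets and decoys / targets <= max_fdr:
            cutoff = idx
    return [score for score, is_decoy in ranked[:cutoff] if not is_decoy]


# Made-up scores: True marks a hit against the decoy database.
example = [(10, False), (9, False), (8, False), (7, True), (6, False), (5, True)]
print(filter_by_fdr(example, max_fdr=0.25))
```

The point is just that every identification gets judged against how often junk (decoy) sequences score that well, instead of taking whatever the search engine hands you.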

