Wednesday, April 12, 2017

MSFragger -- A search engine so good that I'll use command line!


First off -- this image is not MSFragger. This is Fragger, a free online video game (think high speed Angry Birds, but the pigs are wearing ski masks? -- and all the birds explode, not just the ones that look like bombs...) that I found while image searching, and which possibly slightly delayed the publishing of this post.

THIS is MSFragger!


What is it?  It is a ridiculously fast search engine (it says that in the title!)

How fast is UltraFast?  I gave it a 1GB RAW file I had converted to mzML and it finished searching it before I had time to wonder if the engine was working. I'm not kidding. It was just done.

If you are like me, sometimes you don't have time to read instructions. Maybe you also happen to like your crooked shelves just the way they are. If you fall in that category, you can get the software directly here! -- But you might want to learn from my completely uninformed mistakes.

First off -- MSFragger runs in the command line. I swear it is worth it, though!  And it isn't one of those command line programs where you have to type in the exact mass of all 6 of your PTMs and if you make one typo the whole thing fails -- it is much better than that.

Once you download and unzip the folder you'll find a "params" file that you can open in Word (Word won't like it, but just say "open any file"). Then you have all your settings and mods and stuff. You can set those however you want.  I'd like to mention -- the default mass tolerance for the MS1 is...500 Da!! Yeah. That's how fast this crazy thing is. It does delta mass searches by default!
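If you'd rather check the settings from a script than from Word, here is a minimal Python sketch that reads "key = value" lines and reports the MS1 window. The excerpt and its key names (database_name, precursor_mass_tolerance, precursor_mass_units) are assumptions for illustration, so compare them against the params file that ships with your copy.

```python
def parse_params(text):
    """Parse 'key = value' lines, ignoring blank lines and '#' comments."""
    params = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop trailing comments
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        params[key.strip()] = value.strip()
    return params


# Hypothetical excerpt: the key names and file name below are assumptions,
# not copied from a real MSFragger release.
SAMPLE_PARAMS = """\
database_name = my_targets_and_decoys.fasta   # hypothetical file name
precursor_mass_tolerance = 500.00             # assumed key name
precursor_mass_units = 0                      # assumed meaning: 0 = Da, 1 = ppm
"""

settings = parse_params(SAMPLE_PARAMS)
tolerance = float(settings["precursor_mass_tolerance"])
units = "Da" if settings["precursor_mass_units"] == "0" else "ppm"
print(f"MS1 window: +/- {tolerance:g} {units}")
```

To point it at a real file, read the text with open(path).read() and pass that to parse_params instead of the sample string.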

Now -- if you do it my way -- you've contacted the authors to get help with running the program in such a way that makes it blatantly obvious that you didn't even look at the paper or the nice included instruction manual (I got excited...) -- and you are now appropriately embarrassed enough to read that very nice instruction manual. Then you'll need to follow these steps (this is Windows):

Open your CMD prompt (I always type "CMD", wait for it to show the icon, then "run as administrator" just so Windows remembers who's in charge around here)

Guide your CMD over to the folder where MSFragger is.


Because this was just a first test for me, I put the mzML file I was testing into this folder, as well as my FASTAs.
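With everything sitting in one folder, the launch line is short. The Python sketch below only assembles that line; it assumes the usual "java -jar" launch with the params file followed by the spectrum files, and the names MSFragger.jar, fragger.params and sample.mzML are placeholders, so check the instruction manual for the layout your version expects.

```python
def build_msfragger_command(jar_path, params_path, spectrum_files, max_heap_gb=8):
    """Assemble a java command line: heap limit, jar, params file, then inputs.

    The argument order (params file first, spectrum files after) is an
    assumption about the MSFragger command line, not a confirmed signature.
    """
    if not spectrum_files:
        raise ValueError("need at least one mzML file to search")
    return [
        "java",
        f"-Xmx{max_heap_gb}g",  # standard JVM flag for the maximum heap size
        "-jar",
        jar_path,
        params_path,
        *spectrum_files,
    ]


# Placeholder file names; swap in whatever is actually in your folder.
cmd = build_msfragger_command("MSFragger.jar", "fragger.params", ["sample.mzML"])
print(" ".join(cmd))
```

Passing the list to subprocess.run(cmd, check=True) from inside the MSFragger folder would start the search; typing the printed line into CMD does the same thing.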

3 things you'll need to keep in mind before you type that line and get blown away by how fast this thing is --
1) You can't run RAW files; they must be converted to a universal format (I converted my RAW to mzML in Proteome Discoverer 2.2 with no filters)
2) You should use a concatenated .FASTA with decoys
3) You'll need something to look at the pepXML output with
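For point 2, a concatenated target-decoy FASTA is just the original entries followed by decoy versions of them. The sketch below builds decoys by reversing each sequence, which is one common approach and not necessarily what any particular tool does; the "rev_" prefix and the example proteins are made up for illustration, so use whatever decoy tag your downstream tools expect.

```python
def read_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def add_reversed_decoys(text, prefix="rev_"):
    """Return the targets followed by reversed-sequence decoys.

    The 'rev_' prefix is an assumed convention, not a required one.
    """
    records = list(read_fasta(text))
    entries = [f">{header}\n{seq}" for header, seq in records]
    entries += [f">{prefix}{header}\n{seq[::-1]}" for header, seq in records]
    return "\n".join(entries) + "\n"


# Made-up example entries, just to show the shape of the output.
TARGETS = ">protA hypothetical protein\nMKTAYIAK\nQRQISF\n>protB\nGGLK\n"
combined = add_reversed_decoys(TARGETS)
print(combined)
```

Each sequence is written on a single line here; wrap at a fixed width if a tool you use insists on it.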

I broke out the Trans Proteomic Pipeline for 2 and 3. This is kind of funny:
1) It took longer to convert the RAW to mzML
and
2) It took longer to make a concatenated .FASTA file
and
3) It took way longer for me to remember how to use the TPP
than it took to search a 1GB LC-MS run.


And...somehow I got a lot more peptide and protein IDs than I did with this same file in Sequest! One button press in the TPP and I've got an Excel file to interrogate.
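If you just want a quick peek at the pepXML before firing up the TPP, the file is plain XML and Python's standard library can read it. This is a rough sketch, not a replacement for proper validation: the embedded snippet is a hand-written, stripped-down stand-in for real output, and the peptides and mass differences in it are invented.

```python
import xml.etree.ElementTree as ET


def top_hits(pepxml_text):
    """Return (peptide, massdiff) for every rank-1 search_hit element."""
    root = ET.fromstring(pepxml_text)
    hits = []
    for elem in root.iter():
        tag = elem.tag.rsplit("}", 1)[-1]  # strip the XML namespace, if any
        if tag == "search_hit" and elem.get("hit_rank") == "1":
            hits.append((elem.get("peptide"), float(elem.get("massdiff"))))
    return hits


# Hand-written, minimal stand-in for a pepXML file (invented values).
SAMPLE_PEPXML = """\
<msms_pipeline_analysis xmlns="http://regis-web.systemsbiology.net/pepXML">
  <msms_run_summary base_name="example_run">
    <spectrum_query spectrum="example_run.00001.00001.2" assumed_charge="2">
      <search_result>
        <search_hit hit_rank="1" peptide="LSSPATLNSR" massdiff="0.0012"/>
        <search_hit hit_rank="2" peptide="LSSPALTNSR" massdiff="0.0012"/>
      </search_result>
    </spectrum_query>
    <spectrum_query spectrum="example_run.00002.00002.2" assumed_charge="2">
      <search_result>
        <search_hit hit_rank="1" peptide="ELVISLIVESK" massdiff="79.9663"/>
      </search_result>
    </spectrum_query>
  </msms_run_summary>
</msms_pipeline_analysis>
"""

for peptide, massdiff in top_hits(SAMPLE_PEPXML):
    print(f"{peptide}\t{massdiff:+.4f}")
```

With a wide MS1 window, the massdiff column is where the delta masses show up, so it is worth eyeballing even in a quick check like this.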

I tried to remove the names of files here, because this is someone else's unpublished stuff.  Also worth noting -- if you use the resource allocation settings from the default params file and you've only got 8 cores and 16GB of RAM and you do some serious searches, you might not have enough memory/processing free for your mouse to work. You can easily adjust the resource usage in the params text file.
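One way to keep the mouse alive is to leave a couple of cores free and write that number into the params text. The Python sketch below does this on a params string; the key name num_threads and the sample line are assumptions, so look for the matching entry in your own params file.

```python
import os
import re


def pick_thread_count(reserve=2):
    """Use all logical cores except a few kept free for the desktop."""
    total = os.cpu_count() or 1
    return max(1, total - reserve)


def set_param(text, key, value):
    """Replace the value on a 'key = value' line, keeping any trailing comment."""
    pattern = re.compile(
        rf"^([ \t]*{re.escape(key)}[ \t]*=[ \t]*)([^#\n]*?)([ \t]*(#.*)?)$",
        re.MULTILINE,
    )
    new_text, count = pattern.subn(
        lambda m: f"{m.group(1)}{value}{m.group(3)}", text
    )
    if count == 0:
        raise KeyError(f"{key} not found in params text")
    return new_text


# Hypothetical params line; 'num_threads' is an assumed key name.
SAMPLE = "num_threads = 0   # assumed meaning: 0 = use every core\n"
updated = set_param(SAMPLE, "num_threads", pick_thread_count())
print(updated, end="")
```

Memory is a separate knob: for a Java program it is usually capped with the JVM's -Xmx option on the command line instead of in the params file.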

2 comments:

  1. When will PD2.2 be available? Looking for Minora.

  2. ASMS 2017! I promise it is worth the wait -- it is ridiculously good.
