News in Proteomics Research: May 2013

Friday, May 31, 2013

NMR instrument the size of a laptop?

I'm no NMR expert, by any means. But I do have 3-D glasses hanging right by my desk primarily so that I can look at nice NMR resolved and elucidated protein structures. The picture above is very similar to the picture of an NMR that I have in my head.
This is why I had to post this video on a benchtop (heck, it's almost handheld!) NMR instrument. While I'm still a biologist at heart, I swear, I really love advances in instrumentation!

Saturday, May 25, 2013

Compass -- Awesome tools from the Coon lab

I needed to write this one: 1) To lead into my upcoming monologue on False Discovery Rates. And 2) Because I use this all the time and somehow I've never written about it.

If you are not familiar, you should be. COMPASS is a set of free tools from the Coon lab. It can do all of the things shown above. I use and underutilize this all the time, because what I use it for is building FASTA files. This is the easiest, fastest way to build a FASTA and get yourself a decoy database. It'll make you a separate decoy. It'll append your decoy to the end of your normal FASTA. As for the decoys themselves, it will make you a reversed one, it'll make a scrambled one, and I think it will make a reversed and scrambled one.
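If you are curious what a decoy generator is actually doing, here is a minimal Python sketch of the idea. To be clear, this is not COMPASS and not how COMPASS is implemented; the function names, the `DECOY_` prefix and the toy protein are all my own assumptions for illustration.

```python
import random


def make_decoy(sequence, method="reverse", seed=0):
    """Return a decoy version of a protein sequence.

    method="reverse" flips the sequence end to end;
    method="scramble" shuffles the residues (seeded, so it is repeatable).
    """
    if method == "reverse":
        return sequence[::-1]
    if method == "scramble":
        residues = list(sequence)
        random.Random(seed).shuffle(residues)
        return "".join(residues)
    raise ValueError("unknown decoy method: " + method)


def append_decoys(entries, method="reverse", prefix="DECOY_"):
    """Return the target entries followed by one decoy per target.

    entries is a list of (header, sequence) tuples. The prefix is a
    hypothetical naming choice, not a standard.
    """
    decoys = [(prefix + header, make_decoy(seq, method)) for header, seq in entries]
    return list(entries) + decoys


def to_fasta(entries, width=70):
    """Format (header, sequence) tuples as FASTA text, wrapping sequence lines."""
    lines = []
    for header, seq in entries:
        lines.append(">" + header)
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"


# Toy example: one made-up target protein, decoy appended to the end.
targets = [("toy_protein_1", "MKTAYIAKQR")]
print(to_fasta(append_decoys(targets, method="reverse")))
```

Swapping `method="reverse"` for `method="scramble"` gives the shuffled flavor; a separate decoy file is just `to_fasta` on the decoy half of the list.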

If you don't have COMPASS, you should spend the 20 seconds it takes to download it and the 10 minutes it takes to learn to use it. It is described in this paper from 2011, and you can download it here.

Friday, May 24, 2013

ROCCIT: Nice new web-based search engine hosted by CalTech

The ROCCIT search engine isn't exactly new, just new to me. A poster was presented at ASMS in 2011. It's always nice to find a new tool that you didn't know about.
I don't know how the algorithm works, as it doesn't seem that it was ever published. What I do know is that it is very easy to use. You go to ROCCIT.caltech.edu, you choose your MGF file, pull down or insert your FASTA file, add your mods and go.
It takes you to a nice queue page that shows your total % progress and then gives you several pages of HTML files with your data. The protein report is nicely grouped (similar to how BioAnalyst used to group Mascot data back in the day) and the peptides are on separate tabs. The results can be downloaded as an XML file as well.
It isn't revolutionary, but it is easy and free and no harm has ever come of running your data on another search engine!

Thursday, May 23, 2013

Recalibrate Thermo RAW files automatically

Okay.
This is AWESOME.
There are several reasons that we might want to go back and recalibrate our RAW files. Sometimes you find out after a big experiment that someone forgot to calibrate the Orbitrap (not me, but other people! I'm a super calibrator). Sometimes you find out you left lockmass off. Or you have an Orbitrap XL or Discovery and lockmass adds to your cycle time, so you leave it off to dig a little deeper into your data.

Until now, there was only one way to recalibrate your RAW files: it required the command prompt, you had to do it one file at a time (single core, even), and it didn't work with anything newer than a Velos.

Then we get this new free node from the Mechtler lab. You download it, follow the simple installation instructions, set up the workflow as shown above (there is another configuration on PD-Nodes.org, but I couldn't get it to work myself), and you recalibrate your RAW files.

I pulled an old Orbitrap XL file from my postdoc and gave it a try. I never used lockmass due to the cycle time issue on this instrument, so it made a perfect test.

206 peptides in the fraction (ran with no modifications):

Run 1: Average mass variation: 4.65 +/- 1.15 ppm (not too shabby for 30,000 resolution)
Run 2 (post calibration!): Average mass variation: 0.87 +/- 1.06 ppm

BOOM! I exported it as an MGF file and then reimported that one back into the exact same workflow. Sub-PPM mass accuracy on an XL without lockmass?

I don't think I have to spend a lot of time on this. I hope the immense value of this amazing tool is apparent. How many good peptides get dropped because they are 11 ppm out? I'm curious, because I think there are a lot.
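For anyone who wants to see the arithmetic behind "ppm out," here is a rough Python sketch. It is not the algorithm in the Mechtler lab node (I don't know how that works internally); it just assumes the error is one constant ppm offset across the run and removes the median. All of the function names and the 10 ppm default tolerance are my own assumptions.

```python
from statistics import median


def ppm_error(observed_mz, theoretical_mz):
    """Mass error in parts per million; positive means the instrument read high."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz, theoretical_mz, tol_ppm=10.0):
    """True if the observed mass would survive a +/- tol_ppm precursor filter."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


def recalibrate(observed, theoretical):
    """Remove a constant ppm offset, estimated as the median error of
    confidently identified peptides (assumption: the drift is a single
    systematic shift, which is a simplification).

    Returns (offset_ppm, corrected_mz_values).
    """
    errors = [ppm_error(o, t) for o, t in zip(observed, theoretical)]
    offset = median(errors)
    corrected = [o / (1.0 + offset * 1e-6) for o in observed]
    return offset, corrected


# Made-up m/z values that all read about 5 ppm high.
theoretical = [500.0, 750.0, 1000.0]
observed = [t * (1.0 + 5e-6) for t in theoretical]
offset, corrected = recalibrate(observed, theoretical)
print("estimated offset (ppm):", round(offset, 3))
print("errors after correction:", [round(ppm_error(c, t), 3) for c, t in zip(corrected, theoretical)])
```

With this in hand, the "11 ppm out" peptide is easy to picture: a precursor at m/z 1000.011 against a theoretical 1000.000 fails a 10 ppm filter, but passes once a systematic offset of a few ppm has been taken out.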

Another idea that just popped into my head: If our goal is to get into the sub-ppm mass accuracy range and we can get it at 30,000 resolution or even 60,000 resolution, would we be able to save cycle time on experiments we would have run at 100,000+ by just running them at lower resolution and then recalibrating them?

So much potential here. I can't wait to experiment further! Thank you Dr. Mechtler and lab!

What is in a FASTA file?

Due to a whole ton of new next gen sequencing data popping up in new databases around the world, this question keeps popping up: what is in a FASTA database?

This is what the NCBI says:

FASTA

A sequence in FASTA format begins with a single-line description, followed by lines of sequence data. The description line (defline) is distinguished from the sequence data by a greater-than (">") symbol at the beginning. It is recommended that all lines of text be shorter than 80 characters in length. An example sequence in FASTA format is:

  >gi|129295|sp|P01013|OVAX_CHICK GENE X PROTEIN (OVALBUMIN-RELATED)
  QIKDLLVSSSTDLDTTLVLVNAIYFKGMWKTAFNAEDTREMPFHVTKQESKPVQMMCMNNSFNVATLPAE
  KMKILELPFASGDLSMLVLLPDEVSDLERIEKTINFEKLTEWTNPNTMEKRRVKVYLPQMKIEEKYNLTS
  VLMALGMTDLFIPSANLTGISSAESLKISQAVHGAFMELSEDGIEMAGSTGVIEDIKHSPESEQFRADHP
  FLFLIKHNPTNTIVYFGRYWSP

Blank lines are not allowed in the middle of FASTA input.

Uniprot/Swissprot entries are going to look different than TREMBL and these are going to look different than RefSeq and so on and so on, but they are all going to follow this basic format.
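Since they all follow the same basic format, reading one is simple. Below is a minimal Python sketch of a reader that follows the NCBI description quoted above: a ">" defline, then sequence lines, and no blank lines in the middle. The function names are mine, and splitting the defline on "|" is only an assumption that happens to fit the NCBI example; other databases lay out their deflines differently.

```python
def parse_fasta(text):
    """Yield (defline, sequence) pairs from FASTA-formatted text.

    Raises ValueError on a blank line in the middle of the input or on
    sequence data that appears before any defline.
    """
    header, chunks, blank_seen = None, [], False
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # Blank lines only matter once a record has started.
            blank_seen = header is not None
            continue
        if blank_seen:
            raise ValueError("blank line in the middle of FASTA input")
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is None:
            raise ValueError("sequence data before the first defline")
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def split_defline(defline):
    """Split a defline into its pipe-delimited identifier fields and the
    free-text description after the first space."""
    identifier, _, description = defline.partition(" ")
    return identifier.split("|"), description


EXAMPLE = """
>gi|129295|sp|P01013|OVAX_CHICK GENE X PROTEIN (OVALBUMIN-RELATED)
QIKDLLVSSSTDLDTTLVLVNAIYFKGMWKTAFNAEDTREMPFHVTKQESKPVQMMCMNNSFNVATLPAE
KMKILELPFASGDLSMLVLLPDEVSDLERIEKTINFEKLTEWTNPNTMEKRRVKVYLPQMKIEEKYNLTS
VLMALGMTDLFIPSANLTGISSAESLKISQAVHGAFMELSEDGIEMAGSTGVIEDIKHSPESEQFRADHP
FLFLIKHNPTNTIVYFGRYWSP
"""

for defline, sequence in parse_fasta(EXAMPLE):
    fields, description = split_defline(defline)
    print(fields, description, len(sequence))
```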

Wednesday, May 22, 2013

Q-Prot: New label free quan software

I can't tell you much about this one, other than you can download it on Sourceforge and that it relies on you having a C compiler and the GNU C proteomics toolkit. So, yes, it does fall into the category of software that is a pain in the neck to install. On the up-side, there is a paper currently in review on it and the instructions are already available in the install package here. It also does label free quan by spectral counting and intensity while trying to perform FDR on the quan values it receives.

I know I'm supposed to be boycotting software that is hard to install, but I've been a big fan of the other (also difficult to install....) software from the Nesvizhskii lab.

Tuesday, May 21, 2013

ProteoCloud --- Process your Proteomics Data on the Amazon Cloud server

Cloud cloud cloud. Commercial commercial commercial. It's all cloud this, cloud that. But what if people started thinking that the cloud cluster computing could be used to do something more than store and transfer videos of cats (and Psy!)? What if people started using all of this power to do proteomics?

Welcome to ProteoCloud!

This is really cool.
One, the interface is easy.
Two, all you need is an AmazonCloud account to use it.
Three, it's a really nice looking interface.
Four, it is cool because it is the first attempt at Proteomics on a multi-use cloud system.

Downsides:
It currently can only take MGF files, so the algorithm is probably dated.
That's all I can come up with. It's too cool for me to put down!

Top Down Proteomics becomes reality -- cool perspective article in Chemical and Engineering News

As the title says, this is just a cool perspective article that came out yesterday. The author got input from some of our favorite people in the intact field and outlines what we've done and where we're going. I think one of the things holding us back on real global top down proteomics has been suitable LC conditions. Of course the mass spec side of things has been a challenge, but we've been able to resolve large intacts for several years now and ETD for fragmentation isn't a new thing. With new chemistries coming over, I think we're going to see more of a push in this direction. Recently, I've been describing my job as "when someone already knows how to do proteomics, but they want to start doing top down, I fly out and help them get going," because I've been doing exactly that quite a bit. I think this is a good indicator of where the field is starting to head, and this article sums it up very nicely.

You can find the article here.

Monday, May 20, 2013

Another new PD node: Advanced filtering parameters!

I am so happy.

Dr. Mechtler made an announcement on the BRIMS portal last week that there were new nodes at PD-Nodes.org. As usual, there was not a lot of explanation of these gifts, just the installation instructions. And that is exactly the way I like it. Install them and find out what Proteome Discoverer can do now. They are all great, but this is the simplest one for me to both use and to explain.

Advanced filtering parameters. Install this node in PD 1.3 and all of a sudden you go from telling your workflow what mass ranges to look at to telling it what ranges to look at and what ranges not to look at. My first thought: now I can set up my iTRAQ and TMT workflows so that they ignore the region where my reporter ions come off, which gives Sequest and the FDR nodes less to do and less to make mistakes with. There are other options, such as the ability to set filtering isolation widths, but this is the option that is going to make the biggest impact on me and my work.
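To make the "look at / don't look at" idea concrete, here is a small Python sketch of the concept. It has nothing to do with how the PD node is written; the function name is mine, and the reporter ion windows are rough illustrative values that I'm assuming, so check them against your own reagent kit before trusting them.

```python
# Illustrative reporter ion regions (assumed values, not taken from the node).
ITRAQ_REPORTER_WINDOW = (112.5, 121.5)
TMT_REPORTER_WINDOW = (125.5, 131.5)


def filter_peaks(peaks, keep_range=None, exclude_ranges=()):
    """Filter an MS/MS peak list.

    peaks: list of (mz, intensity) tuples.
    keep_range: optional (low, high) m/z range to keep; None keeps everything.
    exclude_ranges: iterable of (low, high) m/z ranges to drop.
    """
    kept = []
    for mz, intensity in peaks:
        if keep_range is not None and not (keep_range[0] <= mz <= keep_range[1]):
            continue
        if any(low <= mz <= high for low, high in exclude_ranges):
            continue
        kept.append((mz, intensity))
    return kept


# Made-up spectrum: two TMT reporter peaks plus a few fragment ions.
spectrum = [(126.13, 2.0e4), (127.12, 1.8e4), (175.12, 5.0e3), (860.45, 9.0e3), (2100.0, 10.0)]
for_search = filter_peaks(spectrum, keep_range=(100.0, 2000.0), exclude_ranges=[TMT_REPORTER_WINDOW])
print(for_search)
```

The reporter peaks are still needed for quantitation, of course; the point is only that the copy of the spectrum handed to the search engine doesn't have to carry them.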

As always, if you have Proteome Discoverer, you should have PD-nodes.org bookmarked. You should also be a member of the BRIMS software portal so you'll know when new stuff is coming.

I hope to have some tests of all these new nodes available soon!

Sunday, May 19, 2013

Gas phase fractionation is back!

Every couple of days right now I hear something new about gas phase fractionation. If this is a new concept to you, you're probably a relative newcomer to the field. This was big news for a while thanks to some really great papers around 2001-2003, then it kind of flickered out and we didn't hear much about it. Until now. For those of you going to ASMS, I'd expect to hear a lot about it!

Gas phase fractionation is essentially the "binning" experiments that I mentioned in an entry a couple of months back. You perform your normal MS1 scans, but you only select ions for MS/MS if they fall within a particular mass range. You then repeat the run on another mass range, and another, until you've covered them all. That's all there is to it. It is much simpler than the "tiling" experiments that recently came out of the University of Wisconsin.
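If it helps to see the bookkeeping, here is a tiny Python sketch of how the precursor windows for a set of gas phase fractionation runs could be laid out. Equal-width bins are an assumption I'm making for simplicity (you could just as well size the bins by ion density), and the function names and the 400-1600 m/z example range are mine.

```python
def gas_phase_fractions(mz_start, mz_end, n_fractions, overlap=0.0):
    """Split a precursor m/z range into equal-width windows, one per run.

    overlap widens each window by that many m/z units on both sides,
    clipped to the overall range.
    """
    width = (mz_end - mz_start) / n_fractions
    windows = []
    for i in range(n_fractions):
        low = mz_start + i * width
        high = low + width
        windows.append((max(mz_start, low - overlap), min(mz_end, high + overlap)))
    return windows


def eligible_for_msms(precursor_mz, window):
    """In a given run, only precursors inside that run's window get MS/MS."""
    low, high = window
    return low <= precursor_mz <= high


# Four injections of the same sample, each covering a different slice.
windows = gas_phase_fractions(400.0, 1600.0, 4)
for run_number, window in enumerate(windows, start=1):
    print("run", run_number, "selects precursors in", window,
          "| m/z 722.4 eligible:", eligible_for_msms(722.4, window))
```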

One of the driving forces behind this buzz is a company called NonLinear Dynamics. They are driving it because their software package, Progenesis, is optimized to compile data from these experiments. It seems like a really cool piece of software, with 3 dimensional peak integration and support for all vendor software.

Unfortunately for me, I don't have Progenesis, but they do offer a free trial that I think I'll jump on once I get caught up on the other 100 things I need to do. Fortunately for me, I do have SIEVE and Proteome Discoverer, so I have no problem processing gas phase fractionated data. Since SIEVE only looks at MS1 spectra and PD can be set up to only look at MS/MS spectra, neither package cares what internal instrument methods you are using. I'm still excited to check out this package but I'm not in a huge rush.

Thursday, May 16, 2013

IsobariQ: Use advanced variance statistics for reporter ion quan

I've been meaning to write this one for 6 months or so, but it always seems to slip down the list. We're all aware of the limitations by now regarding reporter ion quan. I've talked about some of them recently. A method of combating some of these limitations is a procedure called variance stabilization normalization (VSN).

While the maths are well beyond the capabilities of this biologist, the way my mathematically inclined friend Patricia explained it is that it uses clever statistics to re-evaluate the apparent ratios. These magical formulas tend to reduce the impact of ratio suppression and coisolation.
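As far as I understand it, the core of VSN is swapping the usual log transform for an arsinh (inverse hyperbolic sine) transform, which behaves like a log for strong signals but flattens out near the noise floor. The Python sketch below shows only that transform, with made-up `offset` and `scale` parameters; the real method estimates those from the data, and this is certainly not IsobariQ's code.

```python
import math


def arsinh_transform(intensity, offset=0.0, scale=1.0):
    """arsinh transform of a reporter ion intensity.

    offset and scale stand in for the calibration parameters that VSN
    would fit to the data; here they are just illustrative assumptions.
    """
    return math.asinh(offset + scale * intensity)


def log_ratio(a, b):
    """Plain natural-log ratio of two intensities."""
    return math.log(a / b)


def arsinh_difference(a, b, offset=0.0, scale=1.0):
    """Difference of transformed intensities, the stand-in for a log ratio."""
    return arsinh_transform(a, offset, scale) - arsinh_transform(b, offset, scale)


# A 2:1 ratio between two strong signals versus two signals near the noise.
for a, b in [(2.0e6, 1.0e6), (2.0, 1.0)]:
    print(a, b,
          "log ratio:", round(log_ratio(a, b), 3),
          "arsinh difference:", round(arsinh_difference(a, b, scale=0.01), 3))
```

For the strong pair the two numbers agree; for the weak pair the log ratio still claims a confident 2:1 while the arsinh difference shrinks toward zero, which is the variance-stabilizing behavior in a nutshell.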

More tangible perks: The software is free, appears well supported and commonly updated, and has a colorful and friendly looking user interface.

Still doing iTRAQ? Maybe you should check this out!

Monday, May 13, 2013

Xenografts induce liver dysfunction in mice

In a step away from proteomics over to metabolomics, this is a pretty interesting story in press at MCP. The paper from Fei Li et al. takes a look at the urine metabolomics of our favorite cancer models, mice that are growing xenografted tumors. We all know this system is a flawed one, but it is absolutely the best thing out there. There are tons of examples where treating a series of mice with these tumors with different combinations of drugs has told us exactly how to eradicate a specific tumor. A variable that no one has evaluated (as far as I know, anyway) is how the growing tumors affect liver function. Obviously implanting a tumor is a pretty bad thing, but here are quantifiable changes. I highly recommend this paper for you metabolomics people as well as anyone working with xenograft models!

Wednesday, May 8, 2013

JPR ASAP Articles are no longer open access

In yet another big hit for open access to the newest literature, it appears that JPR ASAP (as soon as publishable) articles are no longer open access. There may be exceptions, but none of this month's ASAP articles are available to me from my hotel room.
Fortunately, "Just Accepted" articles do appear to still be available as long as you are a registered user. Nothing in this world is free, I guess...

Tuesday, May 7, 2013

Surface Proteome of Malaria Parasites

BOOM! Mapping of the surface proteins of malaria parasites? I'd be more excited if I wasn't thinking "about time"! No, really, these things kill millions of people per year and we've only scratched the surface of their proteomes. This paper reveals nearly 2,000 proteins. I don't care what instrument they used, or how efficient their sporozoite purification was, I want access to this data (pretty please?!?). As rapidly as these things mutate, I'm sure that the 2,000 they identified is just the tip of the iceberg regarding what is really there that we can't yet translate.
BTW, I'm talking about this month's MCP highlight paper from Lindner et al., from my good friends at Seattle Biomed.

Sunday, May 5, 2013

Open source software for everyone.

SO.
My spare time right now: 3 papers in various stages of preparation, trying to get back into climbing shape, and this current obsession: Making good open source software actually accessible for everyone. As some of my readers may have noticed, I've started pulling my reviews on software that is impossible for someone without a computer science degree to install or run. This is because they are not helping. Seriously.
If you write a fantastic program that is the first thing that can ever fully elucidate glycopeptide data, but you write it in Python and don't write installation instructions for Windows, it is not going to get to any real glycopeptide researchers. This is a fact. The cool glycopeptides aren't controls from Pierce, they are excreted from HIV- or malaria-infected cells. They are being found by biologists who cannot install your software.
New rule. If it can't be installed by a guy with a Ph.D. in Biology, I'm not going to give you press.
That being said, I have to backtrack. There is too much good stuff out there, so this is what I've been working on -- let's write some GUIs!
I need help. At most, on my own, I'm going to finish 1, maybe 2, this year (and I can't write in Python, not yet, anyway). And I know exactly what programs I'm going after. I'm dreaming of an open sourceware site with easy-to-install, high quality Open Source software. If you want to help with hosting space for this project, by writing one of these, or even with suggestions of what programs we should go for first, email me (orsburn@vt.edu) or leave a comment!