Guess where I was this week?!?! Wait, don't guess -- the answer is in the subject line!
My day job can be pretty cool sometimes. This week I flew up to Seattle to hang out at the UWPR. I learned a lot and we got a lot done (and maybe I discovered some great Pacific Northwest Microbrews!)
Something I learned about this week was the awesome UWPR website. If you haven't been to it, you should totally check it out here.
Dr. Priska von Haller runs the facility, and she has virtually every instrument you can think of. In her spare time she puts up details of experiments she's run so you can check them out. They make a great framework for experimentation. Thinking of comparing CID to HCD on your Fusion? Check out the numbers that Priska pulled when she tried the direct head-to-head. The percentages are rough comparisons of the number of peptide IDs she obtained when running the same sample with a single method change. There are lots of examples for many of the instruments in the facility. Absolutely worth checking out before you try a new hybrid technique.
Priska gave me permission to link to her resources, but please don't bother her with questions regarding these experiments. I saw a portion of her workload this week and she's got her hands full.
Friday, October 31, 2014
Thursday, October 30, 2014
I learned a lot this week. One of the simplest things to write about, however, would be this cool little company in Northern California.
nanoLCMS solutions sells everything that is mentioned in the screen grab above. And some new friends (who know what they are talking about) swear by the quality of the stuff they sell. Best of all, this company has a nice, simple website and they sell bulk packing material, something I get asked about a lot.
You should totally check out this site, particularly if you pack your own columns. You can find nanoLCMS solutions here.
Wednesday, October 29, 2014
What are all these unmatched spectra? Does this keep you up at night, or do I just have insomnia and an odd sense of priorities? Sometimes I get datasets to look at where Sequest can only explain 10% of the MS/MS spectra. Normally this is on weird organisms, but even with human cells it isn't that odd for only 30% to match the database. Sure, I can dump them through Preview to figure out what PTMs to go after, use multiple engines, and apply other techniques to push these numbers up.
At the end of the day, however, we are searching our data against a translated and extremely conservative DNA sequence. If the DNA that was present really was what made it to the protein level, then what the heck are all these genomics and transcriptomics people doing all day? The fact of the matter is that the coding in the DNA gets mixed up, flipped around and sometimes spliced together before it becomes RNA or protein. At this point, we simply have to ignore a lot of this.
SPLICEPROT is an attempt to go after some of the products of nucleotide sequence splicing. It has a nice web-based interface for constructing a database of splicing events that you can then search with your existing tools.
You can read about this algorithm here.
And you can actually take a swing at SPLICEPROT here.
Thanks to Pastel BioSciences for the link to this one.
Tuesday, October 28, 2014
I had a couple of beers the other night with Dave, the guy who runs Omics Computing. The stuff going on over there is pretty freaking incredible. These guys have taken a look at the problems in proteomics and genomics and are hell bent on providing solutions for them.
I've already told you about their crazy fast PCs (of which I have...a base model...). They recently went out and benchmarked one of their top models, and the speeds they found are just sick. They are using a HeLa digest for most of their benchmarks. Their new Proteome Destroyer can search that HeLa in about a minute: they've got a clock of 42 seconds with Mascot, 70 seconds with Morpheus, and a little over 2 minutes with Proteome Discoverer. BTW, in defense of the speedy Morpheus, it had to pick its own peaks rather than being pre-processed for Mascot. And this isn't their fastest model, either....
What is the next problem? STORAGE!!!! What do we do with all of this crap data? I don't even own a mass spec (yet) and I have to dump the hard drives on my laptops monthly thanks to all the cool stuff I get to process for people. I have some QE and Fusion files that are running as high as 6GB, and we're not going down from there. What do I do with all of this data?
One thing you can do is buy an Omics Cloud or Omics Guardian from these guys. They are little cubes that contain racks where you can pull out ROWS of 4TB drives. All of the drives can be controlled with a simple web interface; all you have to do is plug the box into your ethernet and locate the IP address. I'm a little blurry (I mentioned we were having a couple beers, right?) on the differences between the 2 products, but the Guardian appears to do automatic drive imaging so that if one drive fails you always have a live copy you can go back to. The one I saw was equipped with 20TB of storage, but they can set up to 40TB in a single node.
You can check out the Omics Cloud here. The Omics Guardian doesn't appear to be up on their website yet.
Monday, October 27, 2014
I started reading this article because I thought parturition was referring to something else entirely.
This nice study was done by researchers at 2 different facilities in Bristol, and if you Google "Bristol, UK" you get this image.
Now that I've written the strangest intro of any blog post to-date, I'm going to actually go into this well-orchestrated paper.
Blood was taken from fifteen pregnant participants before and during childbirth.
The plasma was separated (and if it was depleted for the Top15 or anything, they don't mention it). The proteins were trypsin digested and labeled with either TMT2 or TMT6 reagent (depending on the experiment). The TMT-tagged peptides from each experiment were mixed, SCX fractionated and LC-MS/MSed on an Orbitrap Velos operating in High-High mode. A complex multi-stage LC gradient was used. (Minor note: I think the coverage could have been a tiny bit higher if they moved the MS1 scan from 300 to 380 or 400, but this is a minor point. The operator of this instrument obviously knows his/her way around Xcalibur.)
Silliness aside, I've read 4 papers on this plane so far today. Why did this one make the cut for the blog? Because this group did more than TMT 10plex. Look at this, this is 15 pairs. A common criticism of isobaric tagging is that we can't compare beyond our normal dataset. This paper shows that it can be done.
How'd they do it? They ran each experiment separately in PD 1.2 (if anyone knows this group, I hereby volunteer 30 minutes I should be using for sleep to move them up to PD 1.4). Looking at their processing scheme, at this point the observations were what was important. Taking the observations from the different experiments, they separated what was significant from what wasn't with a simple t-test. Yup. Like the t-tests that GraphPad lets you do for free if you cite them?
And it works. At the end of this simple analysis they had 40 proteins that appeared important. And 2 checked out via ELISA. That sounds validated to me. And another argument for why we wouldn't want to do TMT goes down the drain.
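The stats here really are that simple. Here is a minimal sketch of the paired t-test they describe, using only the Python standard library; the intensity values below are invented to illustrate the shape of the math, not the paper's numbers:

```python
import math
import statistics

def paired_t(before, during):
    """Paired t statistic: each participant contributes a before/during pair."""
    diffs = [d - b for b, d in zip(before, during)]
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)
    return mean_d / (sd_d / math.sqrt(len(diffs)))

# Invented log2 reporter-ion intensities for one protein across 15 TMT pairs
before = [10.2, 9.8, 10.5, 10.1, 9.9, 10.3, 10.0, 10.4, 9.7, 10.2, 10.1, 9.9, 10.3, 10.0, 10.2]
during = [11.1, 10.9, 11.4, 11.0, 10.8, 11.2, 11.1, 11.3, 10.7, 11.2, 11.0, 10.9, 11.4, 11.1, 11.2]

t = paired_t(before, during)
# Compare |t| to the critical value for 14 degrees of freedom (2.145 at p = 0.05)
print(f"t = {t:.1f} -> significant" if abs(t) > 2.145 else f"t = {t:.1f} -> not significant")
```

Run this per protein across the merged experiments and you have, in spirit, their whole significance filter.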
You can read this nice paper at EuPA Open Proteomics here.
Sunday, October 26, 2014
In my job, sometimes I hear and see things that I'm not allowed to tell you guys about. I err on the side of caution, 'cause my car insurance is pretty expensive. I've been holding this one in for quite a while, but it's on the Prosight Blog so I know I can talk about it.
Prosight as a node in PD 2.0. BAM!!! Super powerful software that can be controlled inside an interface you already know (or that I could teach you in about an hour...check the videos to the right).
I saw a working copy a few months ago, and I'm about to get it for myself to try it out (maybe this week! I'll have a nice long flight back from Seattle I'm hoping to spend working with it!!!!!) What if you split your protein into two parts, one part that you did via top-down with ProsightPD and the next part that you digested and did with Sequest+MSAmanda+Byonic? Would that be the most thorough protein characterization that you've ever done?!?! And you could do it all in Proteome Discoverer. I'm psyched. And I hope you are!
Friday, October 24, 2014
Histones are nutso. And confusing. And He Huang et al. said "enough of this garbage, let's clarify it in a single freaking page."
You can find it here. And it's awesome.
Thursday, October 23, 2014
Yasset's blog, BioCode's Notes, is one of my very favorites in the blogosphere. Today's entry is super interesting. He did a breakdown, by journal, of authors who upload their data into public repositories. You can skip over to his analysis here.
Wednesday, October 22, 2014
The NCI-60 is one of the most heavily used collections of cell lines in the world. There are cell lines from nearly every major cancer type, and they have been studied in a remarkable number of ways over time. Everything from drug sensitivity to radiation resistance...somebody's done it. And the NCI has painstakingly maintained correct back-frozen freezer stocks so everyone knows they are working with the right things.
This dataset is perfect for meta-analyses. Case in point: this new paper in JPR this month from Maria Karapova et al. In this study they go back through shotgun proteomics data that they acquired on these cell lines and compare it to FASTA database files generated from whole exome sequencing. This gave them a direct way of comparing the transcriptome of these cells to the proteome, and boosted their matches through the ceiling.
Awesome study that shows how much data we're missing, and what we could find if we figure out a way to really search for it!
This is a question that I wish people would ask me more.
The question: Is there really a difference between HPLC and LC-MS grade solvents?
The answer? YES!!!!! And please please stop putting HPLC grade solvents into your mass spectrometer!
Over the course of this summer I went on a whole lot of trips where I helped people restore their mass specs' performance. While the situations were all different, I can say with a high degree of certainty that every lab I visited that was using HPLC grade solvents in their mass spectrometers came to regret it, and that we got better performance when we switched solvents...and an engineer cleaned everything to get all the junk out....
To make an HPLC solvent you need nice clean water. No doubt. But the biggest determinant of what counts as clean water is that it isn't going to absorb UV. Lots and lots of things in this world will have zero effect on chromatography and will also not show up on UV. ALL of these things have mass. A lot will ionize. Many will end up putting undue stress on your nanoLC components, source, quads, lenses and other shiny things in your MS. Some will even screw up the charges of your peptides and proteins.
Sorry this is a rant. I Googled "HPLC vs LC-MS grade solvents" and came up with very little. I hate to ask you to take my word just because it's my word, but I swear that if you are using HPLC grade solvents in your mass spectrometer you are going to have problems. Maybe not today, maybe not tomorrow, but soon...
Tuesday, October 21, 2014
I'm currently sitting through an incredible webinar walking me through "How to get the most from DDA data in Skyline". Nope. Not a typo. Data dependent.
When we think Skyline, a lot of people just think Data INdependent Analysis or SWATH. In this awesome tutorial, Brendon is walking us through how to do quantitative analysis of our normal DDA proteomics experiments.
The video will be on the Skyline page soon, but he will be doing a second webinar at 4pm Pacific time today. If you have an hour, you should definitely check it out!
Monday, October 20, 2014
I recently learned that there is another new Q Exactive instrument out there in the world. Then I saw there was an article about it in BusinessWire, which means I can talk about it! This one is called the Q Exactive Focus. According to BusinessWire it is designed for "cost-per-sample sensitive workflows." It is still a bit of a mystery to me, but it looks like a slightly less powerful version of my favorite mass spectrometer. It's designed for people who need routine analysis rather than the absolute cutting edge. I'm definitely excited to play with one!
I thought this would be a good time to clarify the Exactive product line. There are now a lot of models and people seem to get them a little mixed up.
In the Exactive family there are 3 models:
1) The Exactive
2) The Exactive Plus
3) The Exactive Plus EMR
The Exactive was the original model. 100k max resolution, I think.
The Exactive Plus was the re-boot. 140k max resolution, much faster scan speed (12Hz) and the Exactive Plus can be upgraded to the great Q Exactive by adding an optional HCD cell and front quadrupole.
The Exactive Plus EMR is the one that Albert Heck's lab keeps publishing all the awesome native protein stuff on. It has a crazy high mass range, up to 20,000 m/z! As far as I know, only the Exactive Plus can be upgraded.
There are now 4 Q Exactives:
1) The original QE (140k resolution, 12Hz). Not upgradeable. Which is fine because it is awesome just the way it is!!
2) The QE Plus. This is just like a QE but with better quads in the front. It also has the option to add protein mode and high resolution mode (280k resolution)
3) The QE HF. This is a QE Plus with the high field 5kV Orbitrap (like the one in the Fusion)
4) The QE Focus. While I don't have full details, it looks like a QE with lower resolution and fewer experimental capabilities.
Heck, I'm still waiting to get my hands on a QE HF. But now I'm excited to see yet another new variation. When I get to try one of these out, you'll be the first to know!
You can read the BusinessWire article here!
I found this entertaining feed while browsing Imgur over coffee. The comics belong to something called The Upturned Microscope.
You can check out the feed I saw here.
For more distraction (how much coffee are you going to drink...geez!) here is the enormously entertaining The Upturned Microscope.
Sunday, October 19, 2014
Like a lot of people out there, I don't have time to upgrade every piece of software every time a new one comes out. Like a lot of people out there who lean toward the data processing side of things, I have a lot of computers. So I definitely can't upgrade every piece of software on every computer!
However, if you haven't upgraded your Skyline in a while, I strongly encourage you to download 2.6 right now. The only thing I ever say about Skyline that could possibly be construed as slightly negative is that I've always found the front page imposing. Here is a blank sheet. Now what? I don't know when things changed (this PC had 1.4 on it, LOL!!!!), but 2.6 has this awesome new front page interface that shortcuts you to all of the main tasks.
Saturday, October 18, 2014
Redox states are big regulators in biological systems. Those annoying cysteines that we are always reducing and alkylating so we can ignore them are actually big regulators (regulatees?) of global redox state. One example that pops very rapidly to mind is the NRF regulator. The cysteines in that protein can give you an awful lot of information about what is currently happening in cells, and some labs deliberately target this protein to get early information regarding drug toxicity.
In global proteomics, however, we haven't really ever taken a shot at monitoring the cysteines and their capabilities. Heck, we make it almost impossible to do so. When we set our cysteine alkylation as a static mod (something I don't do, btw) we are saying that 100% of the cysteines in our sample were successfully reduced and successfully alkylated. If something else was going on with the cysteines, we will never ever know it.
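To see what the static-mod assumption throws away, consider what a search engine does when alkylation is made variable instead: every cysteine can be modified or not, so the candidate space grows. A quick sketch (the 57.02146 Da carbamidomethyl delta is the standard monoisotopic value; the peptide mass is made up):

```python
from itertools import product

CAM = 57.02146  # monoisotopic mass (Da) added by carbamidomethylation

def cys_mass_variants(peptide_mass, n_cys):
    """All candidate masses when cysteine alkylation is a *variable* mod:
    each cysteine is either modified or not (2**n_cys states, n_cys+1 masses)."""
    variants = set()
    for states in product((0, 1), repeat=n_cys):
        variants.add(round(peptide_mass + CAM * sum(states), 5))
    return sorted(variants)

# Hypothetical peptide with 2 cysteines and an unmodified mass of 1000.0 Da
print(cys_mass_variants(1000.0, 2))  # [1000.0, 1057.02146, 1114.04292]
# A static mod would consider ONLY the fully alkylated mass, 1114.04292,
# so partially reacted or otherwise-occupied cysteines are invisible.
```

That invisible fraction is exactly what the redox work below is trying to recover.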
Some researchers at the University of Florida have taken a completely opposite approach. In this paper from Jennifer Parker and Kelly Balmant out of Sixue Chen's lab (currently in press at MCP here), these researchers show us that we can be monitoring the global redox information in a cell with full quantitation as well.
Now, this technique isn't for the faint of heart. The technique is called cysTMTRAQ because it employs two sets of isobaric tags. The first is one of my favorites, the cysTMT reagent (if you dig through the blog you'll see how we used this reagent in my postdoc to get at drug mechanisms-of-action...such an awesome and underutilized reagent!) They use this to get quantitative information on the state of the cysteines. Hint: if you don't reduce before you tag, then you get quan on the cysteines currently in an un-linked "active" state. Then they do reduce/alkylate/digest and label with iTRAQ. In this way you get quantitative information on the global protein distribution.
I have a couple of comments about this approach. The first is that this is really, seriously smart. This is the first approach I've ever heard of for monitoring redox, and it is great purely for that reason. The second is that I see a ton of potential in this whole approach for some of us out there who really won't be going after redox states. For example: when I review a paper and it says "drug treatment leads to a decrease in phosphorylation in this list of proteins," my first thought is: how do we know protein abundance didn't shift? Sure, it's tough to rapidly up-regulate proteins (transcription/translation take time), but caspases kick in crazy fast and degrade proteins both rapidly and sometimes selectively.
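That protein-abundance worry is exactly what a double-tag design lets you handle numerically: divide the PTM-level ratio by the protein-level ratio measured in the same experiment. A minimal sketch with invented ratios (the function name is mine, not from the paper):

```python
def normalized_ptm_ratio(ptm_ratio, protein_ratio):
    """Correct a PTM-level fold change (treated/control) for the protein-level
    fold change measured in the same experiment."""
    return ptm_ratio / protein_ratio

# Invented numbers: the phosphopeptide looks down 2-fold (ratio 0.5), but the
# protein itself is also down 2-fold -- say, caspase-mediated degradation.
print(normalized_ptm_ratio(0.5, 0.5))  # 1.0 -> site occupancy actually unchanged

# The same peptide ratio on a stable protein really is a site-level change:
print(normalized_ptm_ratio(0.5, 1.0))  # 0.5 -> genuine dephosphorylation
```

Which is why "can I double tag all of my PTM quantification experiments?" is such a tempting thought.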
Again, I really enjoyed this paper, but the whole time I was thinking "can I double tag all of my PTM quantification experiments?"
Tuesday, October 14, 2014
This one is really smart. And I both expect and hope to give this a try soon.
Proteins have 3-dimensional structures. What a protein looks like in 3D is critical to what it does, who it interacts with, and all sorts of other stuff (yes, there is better terminology, secondary through tertiary structure...I took biochem...a long time ago...)
By necessity (or so we thought) shotgun proteomics has always ignored this fact. We denature the proteins down to their straight linear chains and then digest them from there. As a consequence we only get part of the story, maybe a small part of the story.
This new paper in Nature Biotechnology suggests that we can go after this information, and we don't have to buy a fancy NMR or tons of crazy reagents to do it -- chances are, we have everything we need to find out (in some sense, at least) how our protein 3D structures are changing from one sample to another.
Check out this example I found on Google Images (I couldn't find the original author. If this is you, I apologize; leave a comment to get credit for it). In this awesomely simple image, this protein has 2 states, Open and Closed. In the Open state, if we look right where the arrow is pointing, we have access to this nice middle region. If we were to digest this protein in its native state, we would be able to see this middle region. Skip to the right. Now the protein is Closed. The amino acids in the middle are protected by the rest of the protein chain. My enzyme of choice can't access this site in the native Closed protein, and no peptides from that area would show up in my global proteomic analysis.
Simple and elegant, right?!?!?! So, what we need to do is carefully extract our proteins in their native state in condition 1 and in condition 2 and digest them gently. The authors of this great paper then denature everything and digest under normal conditions. By using 2 different enzymes, they can see that regions exposed in only one of the 2 conditions disappear (or are cut much smaller!), while the regions that don't change between the 2 conditions look the same in the 2 samples.
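The comparison logic boils down to set operations on the peptides observed after native digestion in each condition. A toy sketch (the peptide sequences and the function name are invented for illustration):

```python
def conformationally_changed(peptides_open, peptides_closed):
    """Peptides seen after native digestion in one condition but not the other
    point at regions whose solvent accessibility changed between conditions."""
    exposed_in_open = peptides_open - peptides_closed
    exposed_in_closed = peptides_closed - peptides_open
    unchanged = peptides_open & peptides_closed
    return exposed_in_open, exposed_in_closed, unchanged

# Invented peptide sets for the open/closed example in the figure
open_state = {"LFTGVVK", "AEDLQVGQVE", "SSYLEGQAAK"}
closed_state = {"LFTGVVK", "SSYLEGQAAK"}

changed, _, same = conformationally_changed(open_state, closed_state)
print(changed)  # {'AEDLQVGQVE'}: the middle region only accessible when Open
print(same)     # peptides from regions that look identical in both states
```

In the real workflow you would of course control for abundance and digestion efficiency first; this just shows why two native digests are enough to flag a conformational change.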
Papers like this are the reason I get out of bed...that and the Pug threatening to pee on my floor....
Big thanks go to Dr. Sreelakshmi for tipping me off to this paper!
Monday, October 13, 2014
Every few years or so, a new set of standards pops up for how we should all be reporting our data. These standards have to change due to the exponential growth of data file sizes and shifting priorities about what information we really find valuable. Sometimes it's hard to keep up.
ProteoRed is a new approach for outputting this data. ProteoRed (icon in blue above...) can automatically reach into files set up in the HUPO PRIDE XML format and extract the MIAPE (Minimum Information About a Proteomics Experiment) details that some journals require. It saves us a step and minimizes the human error that can arise when transposing these data.
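I haven't seen ProteoRed's internals, but the general idea of pulling minimum-information fields out of an XML experiment description can be sketched with the standard library. The element names below are simplified placeholders I made up, not the real PRIDE schema:

```python
import xml.etree.ElementTree as ET

# Toy stand-in for a PRIDE-style XML fragment (placeholder element names)
pride_xml = """
<Experiment>
  <Title>Plasma TMT study</Title>
  <Protocol><Step>Trypsin digestion</Step><Step>SCX fractionation</Step></Protocol>
  <Instrument>LTQ Orbitrap Velos</Instrument>
</Experiment>
"""

root = ET.fromstring(pride_xml)
# Pull out the minimum-information fields a MIAPE-style report needs
report = {
    "title": root.findtext("Title"),
    "instrument": root.findtext("Instrument"),
    "protocol": [step.text for step in root.iter("Step")],
}
print(report)
```

Doing this programmatically, rather than copy-pasting from a viewer, is exactly where the "minimizes human error" part comes from.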
You can find ProteoRed here.
Wednesday, October 8, 2014
A highlight of the trip is that I finally got to see capillary electrophoresis on an Orbitrap. I know it's been around for a while, but I've never actually seen the two together.
The lab I visited was using CE to get baseline separation of proteins and protein complexes. It is a pretty impressive jump from my normal intact protein experiments (5cm C4 or C8 columns that resolve only slightly better than direct infusion!) to seeing separation when the proteins differ by a single PTM.
If you're interested in this technology, I was just reading through this paper from 2011 where CE was coupled to an Orbitrap Velos, and it does a great job of explaining the interface!
Monday, October 6, 2014
I don't know everything about proteomics, and I'll never ever claim that I do, but I've been doing this for a while now and I've had some really good teachers over the years. It's rare when a concept completely catches me off guard.
Thermal Proteome Profiling? Never heard of it. Maybe I'm not the only one, though, 'cause it sure landed right square in this month's issue of Science, and that's generally where we put new stuff.
Here is how it works, in a sleepy Monday morning nutshell. You heat up a cell culture. Some proteins are going to denature and fall out of solution before others. At each temperature, the proteins that are still soluble stay in the supernatant, so you digest the supernatant and use quantitative proteomics to see which proteins denature, and when.
The baseline is going to be cool enough, right?!? All by itself, this is a cool concept. But what if you took it a step further, dropped in different drugs, and checked how that affected the stability of those proteins? Ligand binding generally stabilizes a protein, so a drug's targets should melt at a higher temperature. You could learn a ton. What would you do with that data? Maybe it would point you directly toward the proteins that drug is targeting!
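The readout is a melting curve per protein: soluble fraction versus temperature, with the drug shifting the midpoint (Tm) for its targets. A minimal sketch of that idea, with simulated numbers rather than real TPP data:

```python
import math

def soluble_fraction(temp, tm, slope=1.0):
    """Sigmoid melting curve: fraction of a protein still soluble at `temp` (C)."""
    return 1.0 / (1.0 + math.exp((temp - tm) / slope))

def apparent_tm(temps, fractions):
    """Temperature where the soluble fraction crosses 0.5, by linear
    interpolation between the two bracketing measurement points."""
    pts = list(zip(temps, fractions))
    for (t1, f1), (t2, f2) in zip(pts, pts[1:]):
        if f1 >= 0.5 > f2:
            return t1 + (f1 - 0.5) * (t2 - t1) / (f1 - f2)
    return None

# Simulated gradient: the drug-bound protein melts ~5 C later (stabilized).
temps = list(range(37, 68, 3))
vehicle = [soluble_fraction(t, tm=50) for t in temps]
with_drug = [soluble_fraction(t, tm=55) for t in temps]

shift = apparent_tm(temps, with_drug) - apparent_tm(temps, vehicle)
print(f"Tm shift: +{shift:.1f} C")  # a reproducible positive shift flags a candidate target
```

In a real experiment you'd fit the sigmoid to quantitative proteomics ratios for every protein and rank by Tm shift; this just shows the shape of the analysis.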
Super cool concept. Nope, I would never have thought of this one at all. I'd love to give it a try, though! I have a drug in mind whose molecular target I've been trying to figure out for like 5 years or something. Maybe this is the way to finally get to it!
You can check out this cool paper by Mikhail Savitski et al. here.
Sunday, October 5, 2014
XMAn is a new resource from my proteomics teacher, Iuliana Lazar's lab at Virginia Tech (go Hokies!). I saw the poster at ASMS and the paper is currently in ASAP at JPR. This link will lead you to the abstract.
What is it? It's a new kind of FASTA database, one that specifically targets all the currently known protein mutations in cancer. How awesome is that? If you are doing cancer proteomics, you need to check out this paper.
I had some time today and I downloaded it and played around. And it is awesome!
This was my setup in Proteome Discoverer:
This quick and dirty experiment came back with 1703 protein groups. 42 protein groups were matches from the XMAn database. These are proteins that were ONLY detected because the known mutant variant of the peptide was uncovered. To see how many mutated peptides were uncovered, I had to ungroup the proteins. This gave me 85 clearly identified mutations (more than half of them came from proteins where one or more "normal" peptides were also uncovered).
I snapped through the spectra and, again, with pretty tight tolerances, these were some very nice peptide matches. I popped over to COSMIC and verified that more than a few of these mutations were known to occur in HeLa. Without the step of using XMAn, I would have missed 85 mutations, even in this little dataset!
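The counting step itself can be mimicked with a few lines over a toy results table. Note the accession format and sequences below are invented for illustration; check the paper for XMAn's real header convention:

```python
# Toy protein-group table from a search against a reference + XMAn database.
# Accessions and peptide sequences are made up for this example.
protein_groups = [
    {"accession": "P04406", "peptides": ["GALQNIIPASTGAAK"]},
    {"accession": "XMAn_TP53_R175H", "peptides": ["VVVPYEPPEVGSDCTTIHYNYMCNSSCMGGMNR"]},
    {"accession": "XMAn_KRAS_G12D", "peptides": ["LVVVGADGVGK"]},
]

# Groups found only via a known cancer mutation carry the mutant-database tag
mutant_hits = [g for g in protein_groups if g["accession"].startswith("XMAn_")]
print(len(mutant_hits))  # 2 of the 3 groups here are mutation-specific hits
```

Scale that filter up to the full ungrouped list and you get the 85-mutation count described above.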
I had to make a few changes to incorporate XMAn into my copy of Proteome Discoverer 1.4, but this may be a formatting issue specific to the old 32-bit PC I'm on the road with this week. I'm moving the HTML into text and then into PD. Other people on other (better) PCs are able to directly import the database without problems.
Thursday, October 2, 2014
I love global protein abundance experiments. With today's technology, they are now relatively easy. I like phosphoproteomics a little less because it is much tougher to do. And I probably have to say that I like glycoproteomics the least. I don't like the fact that I have to use ETD and CID/HCD to sequence both my sugars and my peptides. That is a lot of work and a long cycle time.
But glycoproteomics continues to climb in importance. I saw a lecture some time this year where the speaker said that every single disease state in man is known to be linked to glycosylation in some form. Sorry, but I forget who said that. It was probably at ASMS, since I haven't been to all that many lectures this year.
What if a very major characteristic of cancer was changes in the glycoproteome? Well then, I guess I'd probably suck it up and fire up the ETD. This paper in PNAS says that it's time to heat up that fluoranthene vial. In extremely thorough detail, this globally spread team looked at the length of the glycan chains on proteins in various cancer types and found a very strong relationship between short chains and cancer progression. Specifically, the truncation of O-glycans appears to directly induce oncogenesis.
The glycoproteomics was performed with my favorite old workhorse, an Orbi XL + ETD. The level of validation is mind-boggling.
I recommend this paper for anybody doing anything with cancer. It really puts into perspective how little we know, but points us in a direction that we maybe should be looking.
Wednesday, October 1, 2014
If you've seen the blog the last couple of months, you've seen how excited I've been about OmicsComputing, a new company that is specifically designing computers for people in our field. They just released their V2 product line, and this stuff is nuts.
For example? Check out their new top end PC, the Maximum Destroyer. It has the Haswell-E chip (8 cores, 16 threads). I didn't even know this chip was out yet! Big deal, right? 16 threads?
Puget Systems is a company that cares about computer processing for professionals. They are interested in the maximum number of theoretical calculations that a computer can carry out. They benchmarked this chip. Not only is it the fastest chip that Intel has ever made, but it outperforms server motherboards with TWO XEON CHIPS.
Brazenly stolen from their website (original article here) [don't sue me]:
That chip that it narrowly beats? That is TWO of the 16 thread Xeon chips. The 5960x can do more mathematical calculations than two 16 thread Xeons combined!!!
Number 3 on that list? Oh, that is a chip it looks like they are using as well, in their new Proteome Destroyer. And this is where this story takes a down turn. Last month I added the original Proteome Destroyer to my office. And it is awesome. It is, by far, the fastest PC I have ever used for searching data. But it looks like their new line would smoke my awesome desktop. Realistically, I kind of exceeded my budget anyway, and this thing is more computer than I actually need...but I'm a busy guy and sometimes....
Anyway, that's how technology goes, right? Sometimes you get your QE Plus out of the box and your neighbor gets a QE HF. Or your buddy buys a Porsche GT3 that makes your tuned-up 944 look like a snail....
I'm still super excited that we have a company out there that is making computers for proteomics! If it takes you hours or days to search your proteomics data, you should follow this link to OmicsComputing and get around to measuring the time to process your data in minutes. Tell them I sent you, and that I wouldn't turn down a good deal on a trade-in.