Monday, August 31, 2015
I have to show this one to you guys. Despite some great new advances recently, we really still do lag behind a lot of other biological fields when it comes to statistics. It is getting better all the time, especially with new bioinformaticians coming from other fields into ours. We'll have it all down soon and I'm excited for all the advances that keep coming.
In terms of bioinformatics, we have two studies out there that have some pretty big targets painted on them. These studies are our first two drafts of the human proteome. They are awesome studies with remarkably high-level sample prep, instrumentation and methodologies. However, both relied heavily on the statistics tools that we all use in our daily work. In data sets this large and complex, our typical tool, namely the 1% false discovery rate, seems particularly weak. 1% bad matches sounds pretty good until you have a billion observations.
The title of this paper is a good one: "Solution to Statistical Challenges in Proteomics Is More Statistics, Not Less," and it really highlights the value of more, and increasingly robust, statistical analyses in our proteomics studies when we are looking at increasingly larger datasets.
It makes me sad (and I find it just a little scary) because I don't personally have the background to effectively evaluate the quality of the statistics. Now that we're in this era where our instruments generate more data than we could ever possibly manually examine, we are going to need to rely more and more on these algorithms to sort things out, and we're going to have to find a way to let go of a little control and trust them.
This paper is a good, short, open access(!) read.
Sunday, August 30, 2015
Umm...so...if you are desperately searching for a way to learn mass spectrometry
Saturday, August 29, 2015
Aging sucks. You spend all this time looking forward to it when you're a puppy. Looking toward the day when you can drive and stay out all night dancing and are trusted to go outside and do your business and come back inside when you're done. But then it should stop. And progress is being made here and there on that front, but before we can stop it we need to understand it a little better. There are tons of good papers out there studying aging in mice and worms and zebrafish. It's tougher to do this stuff with human beings.
For an interesting model of how we could analyze human aging, check out this paper from Christina Menni et al. (open access!). It isn't my typical proteomics paper but there is a lot of interesting stuff here.
First of all, they have a big group of human female twins, called the TwinsUK cohort. Twins are great for studying all sorts of things, but are most often called upon for nature vs. nurture type studies. Another awesome use of twins? Biological replicates! What they did here was take plasma from this large group of British ladies and perform standardized RNA-Seq analysis looking for changes that could be aging related.
Another great thing about studies of this kind is that once you get a bunch of human volunteers, lots of researchers want to test lots of different technologies on their samples. Turns out that most of these volunteers had already had plasma drawn and run through a kind of protein array technology called SOMAScan (you can read about it here if you are interested!).
What did they find? Clear, circulating biomarkers that are detectable at the RNA level and carry over to the protein. When you go for overlaps of technologies between RNA and protein you are going to typically see a pretty small list, and this study is no exception, but they found a cool little list.
Friday, August 28, 2015
As I mentioned in an earlier post, I've been working on a lot of things, but I haven't been finding a huge amount of success lately. I'd like to reverse that right now with maybe my favorite study I've ever set up.
Above, I made a quick PPT drawing for what I consider a typical simple phosphoproteomics experiment. Control cell line vs. treated cell line. Digest out the peptides. Run some of the peptides for global proteomics and enrich the rest for phosphoproteomics. Run it all on the same awesome high resolution instrument. Shouldn't you end up with data like this?
Okay, so what are we looking at here? This is a slice of a screenshot from a Proteome Discoverer 2.0 multiconsensus report. On the left we have accession numbers for the protein of interest. In the middle we have the exact site of phosphorylation within the protein (!!eek!! this one still makes me really happy!!), but even better is what is on the right. We have 4 columns. The first 2 are the total relative area of the PROTEIN from the whole proteomics samples. The next 2 are the relative quantification of that phosphopeptide!!!!
Are you as excited as I am? Okay. Probably not. But check this out. The first one? When we treat with this drug we get up-regulation of the phosphorylation at Serine 83 in this protein of about 3 fold. That could be actual upregulation of the protein itself, because we never detected this protein in the whole proteome sample and so can't rule that out. But check out the last one in the screenshot. Phosphorylation on threonine 399 in this protein is upregulated following drug treatment by 5.2 fold, but if we look at the total quantification of the peptides found for that protein (it doesn't show it here, but there are 7 of them!) the protein doesn't really look up-regulated. We've found a drug-specific phosphorylation event! And this is in one PD 2.0 report.
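If you wanted to run that same comparison on an exported table, a minimal sketch might look like the one below. Everything in it is an assumption for illustration: the function name and the area values are made up, and nothing here is what Proteome Discoverer does internally. The idea is just to divide the phosphopeptide fold change by the protein fold change, and to refuse to normalize when the protein was never quantified in the global run.

```python
from typing import Optional


def site_specific_change(phospho_ctrl: float, phospho_treated: float,
                         protein_ctrl: Optional[float],
                         protein_treated: Optional[float]) -> dict:
    """Compare a phosphopeptide fold change with its protein's fold change."""
    phospho_fold = phospho_treated / phospho_ctrl
    # If the protein was never quantified in the global run we cannot
    # separate phosphorylation changes from protein abundance changes.
    if not protein_ctrl or not protein_treated:
        return {"phospho_fold": phospho_fold, "protein_fold": None,
                "normalized_fold": None}
    protein_fold = protein_treated / protein_ctrl
    return {"phospho_fold": phospho_fold, "protein_fold": protein_fold,
            "normalized_fold": phospho_fold / protein_fold}


# Made-up areas chosen to mimic the two cases described above.
ser83 = site_specific_change(1.0e6, 3.0e6, None, None)
thr399 = site_specific_change(1.0e6, 5.2e6, 4.0e7, 4.2e7)
print(ser83)
print(thr399)
```

In the second case the normalized value stays close to the raw phosphopeptide fold change, which is the pattern you would expect for a phosphorylation event that isn't explained by the protein level moving.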
I can't tell you how much I wish these functions existed during my postdoc...there aren't words.
So, how do you do it? It isn't easy. It took me...several... tries to get it exactly right. So good luck!!
Wait, just kidding. Why don't you watch this video and I'll show you how I did it.
As always, watch it in HD (once it's available!). P.S. This is even better to set up in PD 2.1, but we'll talk about that later!
Thursday, August 27, 2015
Remember RAW Meat? Man, I loved that program. Unfortunately it doesn't work for a lot of newer instruments. What if I told you that virtually every function in RAW Meat is hidden in Proteome Discoverer 2.0?
They are. And they are under the plot functions.
You can plot your mass discrepancy (at PSM or peptide level), and you can plot your ion injection times (for MS/MS scans or for PSMs, and as a histogram or over time). You can get your charge distribution to see if that matches theoretical for your enzyme (2 or 3 for trypsin, etc.), you can plot your missed cleavages, and on and on.
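If you ever need the numbers behind those plots outside of PD, the two simplest ones are easy to compute yourself. This is a rough sketch, assuming you have exported PSMs with an observed m/z, a theoretical m/z and a charge; the values and variable names below are invented for illustration.

```python
from collections import Counter


def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Mass discrepancy in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


# Hypothetical PSMs: (observed m/z, theoretical m/z, charge).
psms = [
    (500.2510, 500.2500, 2),
    (650.3300, 650.3313, 2),
    (433.9000, 433.9000, 3),
    (800.4016, 800.4000, 2),
]

errors = [ppm_error(obs, theo) for obs, theo, _ in psms]
charge_counts = Counter(charge for _, _, charge in psms)

for z in sorted(charge_counts):
    print(f"{z}+: {charge_counts[z]} PSMs")
print("ppm errors:", [round(e, 2) for e in errors])
```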
Now. I picked a bad screenshot above. I can't seem to plot my TopN. Maybe I haven't found it yet. Or maybe I need to submit a feature request!
You can learn more about this here!
Hey! Are you pretty darned good at mass spec? Are you looking for a postdoc? Wanna run cutting edge instruments for super secret U.S. military projects? Want to hang out with me once in a while 'cause it's my buddy's lab and he lets me pop in and run stuff when I'm in town?
Then you should check out this listing here on the ASMS website! Oh, and it pays nearly 3x what I made during my first postdoc...so you know...if that is something that matters to you then you have that too!
Wednesday, August 26, 2015
Tuesday, August 25, 2015
Sunday, August 23, 2015
There are bunches of ways to get phosphopeptides. Hopefully each new method is better than the last. This one? Well, it looks great -- and it's new, and it is hella thorough. The authors describe a two stage enrichment procedure as well as a chemical modification technique that can further boost sample enrichment/recovery.
You can't really stop there and get your method into this journal, right? So they go further and describe software they use called the "CAD Neutral loss finder" to single out the ions with the phospho losses (I want this!!!), oh and they also use ETD to trigger on that loss.
Why might you want to consider their method? Maybe 'cause they are getting phosphopeptide IDs from PICOgrams of starting material. Umm...so...during my postdoc I would start with MILLIgrams before I started enriching...so...you have that.
You can find the abstract for this shockingly powerful method here.
Update: I found the software, too! It's available here!
Saturday, August 22, 2015
Recently, I've been hearing a lot of good things about the Thermo SMART digest kits, but I hadn't had time to really pay attention. They did sound a little bit like the Perfinity digest kits. Turns out, they are more than a little similar!
Found this tidbit that explained the similarity in my LinkedIn feed!
Friday, August 21, 2015
This is AWESOME! I saw people discussing it on Twitter a while back but hadn't had a chance to investigate. How big of a deal is it? Well, it's such a big deal that a writer for CBS in the U.S. was actually assigned to write a page on it! And if you follow U.S. television "news" you know that it takes something BIG for them to devote an instant to something unrelated to the Kardashians.
Check this out:
We have virtually no early detection systems for pancreatic cancer. Typically you go in with severe back pain and THEN you find out it has progressed like crazy and you have months to live. It's a disease bad enough that it can take even this guy out...
...a mere 20 months after his diagnosis... Which is typical. Only around 6% of people who are diagnosed are alive 5 years later. And we may have a functional biomarker IN URINE!! All signs point to it being a beatable cancer, if you catch it in time, we just never do! What if early detection was part of your normal hospital panel when you get your physical every year? You catch it before it is terrible, and you and your capable oncologist beat it, that's what!
How'd they find these biomarkers? By LC-MS/MS of course. LC-MS/MS by the capable hands of the great team at MSBioworks, no less.
Here is the BBC article.
Here is the original journal article abstract at Clinical Cancer Research.
Keep this one in your back pocket the next time some know-it-all says that proteomics and mass spectrometry aren't living up to their promise! Get back to work you awesome people and
Thursday, August 20, 2015
For the last several weeks the Proteome Discoverer interface on my PC has looked kind of terrible. The icons are big and easy to see, but I can't put a workflow on a single page.
Turns out, there is an option in your Control Panel in Windows 7-10 that can have big effects on how the PD interface displays, and it's called the Magnifier tool. If it (somehow?) gets moved up to medium, it'll make it really hard to have the entire Consensus menu open and still be able to see the Post Processing nodes.
Probably this won't happen to you, but just in case! (It was driving me kind of crazy!)
Wednesday, August 19, 2015
I would like to divert you from proteomics to a fantastic study of dog genomes.
In this paper in some journal called "Science..." some brilliant people tricked some funding agency into giving them money to sequence a ton of single nucleotide polymorphisms (SNPs) from a bunch of dogs. OF COURSE, one that they would focus on would be the grand and majestic Pug.
The story has always been this -- that Pugs originated with other normal-faced dogs somewhere in Asia. This is substantiated by old art that seems to show Pugs in early Chinese dynasties...except those Pugs look like Shar Pei...
Okay. So, the genomics might be solving all sorts of questions. The big question, though, still stands. Where did the greatest dogs in world history come from?
I think we need to go all Robert Langdon on it to figure it out. What are the conspiracies surrounding Michelangelo?
Starting in Proteome Discoverer 2.0, we now have the ability to directly download FASTA databases from ProteinCenter. However, it might look a little confusing if you select that option from your FASTA file menu in Administration.
What you need is your Taxonomy ID. For example, if you want to autodownload all the SwissProt entries that ProteinCenter has for our species, you'd want to use code 9606 and click the Import button.
How do I find these Tax IDs? I use this thing!
This is the NCBI Taxonomy Browser (click here for direct link!). Just put your species name in the top and it'll come back with your organism and the Tax ID at the top.
There are several other such tools available on the web, but this is a quick and easy one.
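If you'd rather script the lookup than click through the browser, NCBI also exposes the taxonomy database through its E-utilities. The sketch below only builds the query URL and parses a response; the function names are mine, and the example response is hand-written from my understanding of the esearch JSON layout so that it runs without a network connection. Treat it as a starting point, not a tested client.

```python
from urllib.parse import urlencode

EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def taxonomy_search_url(species: str) -> str:
    """Build an E-utilities query against the NCBI taxonomy database."""
    params = {"db": "taxonomy", "term": species, "retmode": "json"}
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


def parse_tax_ids(response: dict) -> list:
    """Pull the ID list out of an esearch JSON response."""
    return [int(i) for i in response["esearchresult"]["idlist"]]


url = taxonomy_search_url("Homo sapiens")
print(url)

# Hand-written example of the response shape, so the sketch runs offline.
example_response = {"esearchresult": {"idlist": ["9606"]}}
print(parse_tax_ids(example_response))
```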
Important note if you use this tool: Once you download a FASTA from ProteinCenter, you have the ability to search for updates when ProteinCenter posts them. If it finds one, it will directly overwrite the FASTA you have in place. If you are in a role where you need to keep a constant record of where you obtained your FASTA and when, it might be easier to download them yourself the old fashioned way so you can keep a time stamp and record of each individual FASTA.
Shoutout to Brad for pointing out that this process could use some clarification!
Tuesday, August 18, 2015
Wow. We have so many resources out there these days that I honestly can completely forget about some of them (and get them mixed up with others). One that I hadn't thought to check on in a while is the Encyclopedia of Protein Dynamics. Thanks to Twitter (@PastelBio, w00t!) and the fact that I'm too lazy to leave my hotel tonight, I can check out and direct you to a cool tool on the site.
The tool I want to direct you to is the PepTracker (trademarked!). The PepTracker gives you information about your protein of interest, including:
Protein turnover (and half-life plots!)
Linked proteins (like that awesome string map at the top! publication ready, with permissions, of course!)
Localization in certain cell lines
Heatmaps of proteins that cluster with your protein-of-interest
Cell cycle information
And other stuff!
Really really cool tool and handy information all in one place!
Cool tips in this month's Mascot Newsletter for those of you who are studying weird things that haven't gotten fully annotated genomes. Search your MS/MS data against Expressed Sequence Tags (ESTs). The newsletter highlights one fish that only has 1,200 annotated proteins but nearly a quarter-million ESTs. Sure, it's more work for you to sort it out, but it beats only coming up with 1,200 proteins, and they'll annotate it all someday!
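For anyone wondering what searching against ESTs involves under the hood: nucleotide sequences generally get translated in all six reading frames so that peptides can be matched against them. Here is a small sketch of that translation step using the standard genetic code. It is my own illustration, not how Mascot implements it, and the 12-base "EST" is made up.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built from the conventional TCAG ordering.
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(dna: str) -> str:
    """Translate one frame; unknown codons (e.g. with N) become X."""
    codons = (dna[i:i + 3] for i in range(0, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def six_frame(dna: str) -> dict:
    """Return all six reading-frame translations of a nucleotide sequence."""
    dna = dna.upper()
    reverse = dna.translate(COMPLEMENT)[::-1]
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(reverse[offset:])
    return frames


# Made-up 12-base "EST" purely for illustration.
for frame, protein in six_frame("ATGGCCAAGTAA").items():
    print(frame, protein)
```

Stop codons come out as `*`, so a real workflow would still need to split each frame at the stops before digesting in silico.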
Monday, August 17, 2015
I'm going to be perfectly honest here. I know nothing about honeybees. Not a thing. I know there aren't as many of them as there were before(?) and I know David Tennant's Dr. Who didn't seem to know why that was. I did however have an impressively scary experience with what I think were honeybees this weekend when thousands of them descended on what I thought was a hummingbird feeder and this seemed too coincidental to pass up on. (They drained an 8 ounce container of hummingbird food in minutes. It was amazing. May repeat and film when I get home!)
Anywho! In this study, a label free proteomics approach was used to analyze the differences between honeybees as they differentiate into one type or another. Turns out they found some key upregulations in the bigger bees (drones?) and the smaller ones (with royal names). The differentiated proteins made sense, as smaller bees had a whole lot less of the cytoskeletal construction proteins than the larger dumb bees. In an interesting follow-up, they used interfering RNA on one of the proteins in this pathway and totally messed a bee up! You can read this article in ASAP JPR here!
Sunday, August 16, 2015
From an instrumentation stand-point, assembling an entire proteome is something that many labs have the ability to do. Is it still a challenge? Sure! Sample prep and prefractionation for complex organisms is still going to be stuff that you're really going to have to do right.
What about the data processing side of things? This might actually be where the real problems are right now. If you've got an FDR controlled at 1% at the MS/MS level and you have one million MS/MS spectra...that is saying you probably have about 10,000 things wrong. If you've got a billion? That's 10 MILLION bad matches.
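The arithmetic is trivial, but it is worth seeing how fast it scales. A quick sketch (the function name is just for illustration, and it makes the same simplification as above by treating every spectrum as an accepted match):

```python
def expected_false_matches(n_spectra: int, fdr: float = 0.01) -> float:
    """Expected number of incorrect matches at a given FDR."""
    return n_spectra * fdr


for n in (1_000_000, 1_000_000_000):
    bad = expected_false_matches(n)
    print(f"{n:>13,} spectra at 1% FDR -> ~{bad:,.0f} bad matches")
```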
If you follow bioinformatics on social media in any way, chances are you know of Yasset. He has a lot of experience with datasets as large as, and much larger(!) than the ones we're generating. In this post on his blog, he takes a look at the first human proteome drafts, the re-analyses and opinions from thought leaders in the field and adds some of his own thoughtful insight to it as well.
Friday, August 14, 2015
Thursday, August 13, 2015
You'll find a lot of posts in here regarding Galaxy, particularly the GalaxyP project that is kind of centralized around my friends in Minnesota. Galaxy is a HUGE package in genomics research with tons of great tools and an amazing support community. More and more of these tools and researchers are thinking (quite logically!) that they should integrate proteomics, and we all stand to gain from access to these resources.
In a brand new paper in press at MCP, Jun Fan et al. demonstrate the usage of Galaxy tools for transcriptomics and how we can get better peptide identifications and improved protein inference by leveraging these tools versus our MS/MS spectra. I think that was a run-on sentence. Who has time for grammar when you're this excited? Not me!
How'd they do it? Here is an interesting schematic stolen from the paper. Does it look convoluted? Of course it does! If it was simple, why would we even need bioinformaticians? Making it look complicated is job security. But when you break it down, it really isn't that bad.
What are the points I don't have in my typical workflows? Well, I have RAW files, I can search them, I've got FASTAs, and I can BLAST unmatched spectra (even the de novo ones, thanks to the new versions of Peaks and the de novo GUI). The only thing I don't really have here is the ability to make my transcriptomics data easily searchable. But they outline open source tools in the paper that do. Time to convert some variant call files to FASTA!!!
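To make that last step a bit more concrete, here is a toy sketch of the final stage of a variant-to-FASTA conversion. It assumes the hard part is already done, meaning the nucleotide-level VCF records have been annotated down to amino acid substitutions. The protein, accession and variants are hypothetical, and this is not the tooling described in the paper.

```python
def apply_substitution(protein: str, position: int, ref: str, alt: str) -> str:
    """Apply a single amino acid substitution (1-based position)."""
    if protein[position - 1] != ref:
        raise ValueError(
            f"expected {ref} at {position}, found {protein[position - 1]}")
    return protein[:position - 1] + alt + protein[position:]


def variant_fasta(accession: str, protein: str, variants: list) -> str:
    """Write one FASTA entry per variant, tagging the header with the change."""
    entries = []
    for position, ref, alt in variants:
        sequence = apply_substitution(protein, position, ref, alt)
        entries.append(f">{accession}_{ref}{position}{alt}\n{sequence}")
    return "\n".join(entries)


# Hypothetical protein and variants; a real pipeline would first annotate
# the VCF records to work out which amino acid each nucleotide change alters.
print(variant_fasta("DEMO1", "MAKTPSR", [(3, "K", "R"), (6, "S", "F")]))
```

Checking the reference residue before substituting is a cheap way to catch a variant list that was called against a different version of the sequence.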
Wednesday, August 12, 2015
From the outside, one might argue that a weakness of our field is our level of organization. We have tons of tools but they are all over the place. I've mentioned this before, but this is definitely worth bringing up again: Pastel Bioscience has been trying to help us get organized by creating an absolutely exhaustive list of databases and resources.
You can find the list here.
And if your tool isn't present? Shoot them an email so they can add it! Let's get organized!
Tuesday, August 11, 2015
Need a quick read that really breaks down the advantages/disadvantages of DIA vs DDA techniques? How 'bout a new distinction between the ways that different groups are processing DIA data?
You should totally read this new paper from Ying S. Ting et al. The highlight is breaking down the DIA analysis techniques into those that are spectrum-centric and those that are peptide-centric, and it really does make a difference.
Saturday, August 8, 2015
Edit: Severely edited this post after reading some more and discussing with someone who did take immunology AND got an A in it.
SO, I'm gonna direct you to this article in Nature called "Reproducibility crisis: blame it on the antibodies" until I have a chance to gather more information to support my ramblings on this topic.
Oh. And I think I actually have permission from Nature to use that image up there. They have kind of an intense permissions form and I think I did it right!
Friday, August 7, 2015
Okay, in the same vein as this week's post on watching your flow remotely: what if you want to control your EasyNLC remotely?
Well, you should download the file Easy NLC VLC from my FTP site. Now, depending on your security configurations this might or might not be easy. You may need to get one of those IT nerds up to put in their passwords and things. I've had a ton of trouble getting it installed on some high security networks, even with the help of IT nerds.
Shout out to Shan for this and Fred for finding this convenient Zip file! I work in a pretty awesome team these days!
Thursday, August 6, 2015
This is a great perspective paper! It is from Anton Iliuk et al. and came to my desktop thanks to the hard work of @PastelBio in making sure I always have cool stuff to read while caffeinating.
I do love introducing Proteomics to people who are new to the field. One of the big reasons is that I'm a little indoctrinated in the inside perspective. When you talk to someone who has the audacity to want to use our awesome toys as simply a tool to solve a biological problem that they have...well...it makes some of our big papers with their big lists seem kind of silly. Yes, maybe we can see 14,000 phosphosites. But what the heck do you do with all of that??
In this paper, this team wants to take our ability to find tons of phosphorylations and translate them to clinical relevance. I.e., can we use these patterns to discern a disease state and/or the progression of that state before more traditional assays can? My first thought would be:
But, you know what? We aren't quite there yet. And this review really stomps down on the reasons why.
Wednesday, August 5, 2015
I admit it. I HATE western blots. There's nothing like coming in with data from one of the most sensitive/accurate analytical devices ever conjured up by physics and being told to "validate" these observations with funny colored rabbit blood.
Fortunately for us, MStern blotting has nothing at all to do with Western blots, except that it makes use of PVDF membranes to massively speed up our ability to do FASP-style digestions.
I stole the figure above from this paper by Sebastien Berger et al. that is currently open access at MCP here. (oh, some Steens were involved here as well!)
The idea is this: you can do FASP in 96 well plates (a very nice JCVI paper can be found on this blog). The problem is that it takes a long time with most filters. By substituting PVDF membranes for MW cutoff filters they were able to get great digestions and cleanups in much faster times using speeds that are compatible with 96-well plates.
Awesome? Absolutely. And not just because I don't have to do a western blot! Definitely check this one out!
Monday, August 3, 2015
For some of us out there, we have the ability to log into our instruments from home. Is there anything more comforting than going to bed knowing that you queued everything up properly, the QCs looked fine and the spray stability is awesome? Or...seeing that something is messed up so you can shut it down and fix it when you get back in?!?
The missing link for most of us these days is the source camera. If you've got an EasySpray or a nanoFlex ion source, your video likely shows up on a little TV screen on top of your mass spec. That isn't too useful if you're using Remote Login.
Well, Sheng Zhang at the Cornell Proteomics Core has a brilliant solution for us! You can buy Dino-Lite USB controlled microscopes on Amazon and they fit right in where your normal cameras fit into your source! He has a different one on his Elite and his Fusion and we couldn't find the exact part numbers. The one on the Elite is a little better quality, but both are just awesome. The little camera even comes with a small program that projects what it sees onto the PC screen. So no extra software necessary. It doesn't appear to consume many resources at all.
There are tons of options (check this Amazon search page), but one that definitely works with USB 2.0 is this guy for $150!