Sunday, January 19, 2020

ShinyGO! A beautiful, simple and powerful online data interpretation tool.

I didn't want to write about this one until I got these stupid manuscript edits out the door, because I needed ShinyGO and I didn't want anyone else slowing it down.

"Minor" edits done! Tool sharing time! You can read about ShinyGO here.

Don't feel like reading? You can play with ShinyGO online here!

Are there lots of tools like this out there hiding on the web? Yup! There sure are, but this honestly might be the easiest way to dig through a bunch of different resources all at once. You will need to get your protein list to universal gene identifiers or something similar (it'll translate several different types) and then you can start doing all sorts of analyses. For the figure above, I let ShinyGO select the closest related organism (ended up being Scumbag Arabidopsis) and I flipped through different databases until I got some visualizations that made my data make some sense (it turned out to be a nice visualization of KEGG resources with pathway representation scaled). I found that the best way to get what I wanted was to take the up- and down-regulated proteins separately, create a network for each, and then compare them. Worked for this model organism!
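If you want to script the "up and down separately" part before pasting into ShinyGO, here is a minimal sketch. Everything in it is an assumption for illustration: the function name, the log2 fold-change cutoff of 1.0, and the fold-change values are made up, and your quant output will have its own layout.

```python
# Hypothetical helper: split (identifier, log2 fold change) pairs into
# separate up- and down-regulated lists, one list per ShinyGO submission.
def split_by_regulation(rows, threshold=1.0):
    """Return (up, down) identifier lists that pass the fold-change cutoff."""
    up = [gene for gene, lfc in rows if lfc >= threshold]
    down = [gene for gene, lfc in rows if lfc <= -threshold]
    return up, down


# Made-up fold changes attached to Arabidopsis-style gene identifiers.
rows = [("AT1G01010", 2.3), ("AT1G01020", -1.7), ("AT1G01030", 0.2)]
up, down = split_by_regulation(rows)

# One identifier per line is an easy shape to paste into a web form.
print("\n".join(up))
print("\n".join(down))
```

Anything that doesn't pass the cutoff in either direction is simply left out of both lists.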

I love the fact that I can move my network nodes around and then export the image with them in that place. If you don't like that particular visualization you can export the Edges and Nodes and import them into your tool of choice.

At long last -- A Guidance Document for HLA Peptides!

Twenty something years of analyzing HLA Class I/II antigens with mass spectrometry and we need to face facts -- mass spectrometry of these things still sucks. Important? Yes. Super incredibly important. But mobile protons are NOT a fun thing for us to work with as our exclusive charge acceptor. Our technologies work best with doubly charged peptides that end in lysines or arginines.

But LCMS is the only thing that has ever worked at all for these molecules, so we're stuck with them. What we need is a set of guidelines to at least reference --- and here is the first one I've seen! 

Honestly, I expected this to have maybe 50 different authors on it as some sort of an over arching consensus of a big meeting on the topic to sort it out. And, that might make me a little more comfortable, but I tell you what, this document is not bad at all. You know why?

This group has actually validated some HLA peptides and successfully utilized them! This isn't a big hypothetical piece. This is the next stage beyond where most of us have been going. I came out thinking it was really sobering. On our side we're pressured to find more and more of these peptide identifications and hitting so many hundred or thousand is the only metric we have. We don't need to find 1,500 mediocre peptides. We need to find the one really good one that differs in the cell type you want to target. This isn't a long read and if you're doing these kinds of studies, I 100% recommend it.

Saturday, January 18, 2020

Making publication ready annotated spectra with IPSA and PD (or any other tool)!

IGNORE MY WRITING. Make beautiful MS/MS spectra easily online by pushing this hideous big button. 

Lookin' to make beautiful spectra for your poster or publication? Just push that big button!! 

This might, yet again, be a post mostly for me, because I can't seem to remember the name of this tool and I keep going to Google Scholar and looking through 2019 papers from the Coon lab. And that isn't exactly a one-paper-a-year sort of lab. And since I'm already typing I'm going to show you how to use this awesome tool.

You can read about it in MCP here.

If you need to do something fancy, beyond what the online IPSA tool can do, you can download the whole thing on Github here and manipulate it (or run it locally in your own web browser on your offline computer).

I'm going to go through this from a Proteome Discoverer centric application using IMP-PD 2.1 (the free PD version that you can get here.).  Sweet! Here is a great tutorial for installing it that I'd not happened into, as well as a tool I haven't checked out yet!

One thing Proteome Discoverer has never done (and, honestly, neither have most, if not all, software packages) is make images that your editor and reviewers won't make fun of. There are some people out there with the kind of exploited-student free time that lets them remake all sorts of spectra in things like Adobe Illustrator, which puts anyone who can't afford the financial or time costs at a disadvantage. Illustrator images can be so pretty you can hardly look at them.

Compare that to your output from your normal tool of choice. Functional? Yes. Pretty? Probably not (and the ones that are pretty, like Scaffold, often don't contain all the information you want).

IPSA fixes that! You'll need a couple of things first.

1) Your peptide sequence
2) Exact mass of your post-translational modification, if applicable
3) A clean spectrum to work from

I'll assume you have both 1 and 2. For number 3 I'm going to use the free version of Proteome Discoverer and version 2.1 because of two nodes that are compatible with 2.1. The IMP-MS2 Spectrum Processor and the Spectrum grouper.

I think the IMP-MS2 spectrum processor has been integrated into MSAmanda 2.0 for later versions of PD, but this is how I'm doing things (PD 2.1, last I checked, was compatible with the largest group of second party nodes and I'll always keep a version or two installed on everything just in case I need something neat that I don't have in later versions. I strongly advise you take the 15 minutes when you get a new computer and do the same!)

The MS2 spectrum processor will deconvolute your MS/MS fragments to all singly charged. BOOM! much simpler spectra. It won't work well, or at all(?) with low resolution spectra, but it works perfectly on the higher resolution ones. You can also deisotope. I do, just so it isn't so hard to look at everything. It reduces your spectra to the monoisotopic alone. 
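The arithmetic behind "all singly charged" is worth seeing once. This is only a sketch of the charge conversion itself, assuming you already know a fragment's charge; it is not the IMP node's algorithm, which also has to work out that charge from the isotope spacing (which is presumably why it needs high resolution). The function name is mine.

```python
PROTON_MASS = 1.007276466  # Da, mass of a proton


def to_singly_charged(mz, charge):
    """Convert an m/z observed at a given charge to its 1+ equivalent."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    # Multiply back up to the total ion mass, then strip the extra protons.
    return mz * charge - (charge - 1) * PROTON_MASS


# A fragment observed at m/z 500.0 with charge 2+.
print(round(to_singly_charged(500.0, 2), 4))
```

A fragment that is already 1+ comes back unchanged, so it is safe to run over a whole peak list.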

The spectrum grouper finds your MS/MS spectra (if you have them) that are clearly duplicates, even if you fragmented the +2 and +3 versions and puts them together if you select grouping on "singly charged mass." To be perfectly honest, I'm not 100% sure I know what this is doing. I thought I did, but I can't be 100% sure I know what "grouping" means in this context. Meh. I'll investigate later.
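Here is my best guess at the idea, written out so I can poke at it later. It is a sketch under loud assumptions: I am assuming "singly charged mass" means the precursor converted to its 1+ equivalent, the 10 ppm window is arbitrary, and none of this is taken from the node's actual code. The point is just that the +2 and +3 precursors of one peptide land on the same singly charged mass.

```python
PROTON_MASS = 1.007276466  # Da


def singly_charged_mass(mz, charge):
    """Precursor m/z at a given charge expressed as its 1+ equivalent."""
    return mz * charge - (charge - 1) * PROTON_MASS


def group_precursors(precursors, tol_ppm=10.0):
    """Greedy grouping of (mz, charge) pairs by singly charged mass.

    Precursors are sorted by 1+ mass; a new group starts whenever the next
    one is more than tol_ppm away from the first member of the current group.
    """
    ranked = sorted(precursors, key=lambda p: singly_charged_mass(*p))
    groups = []
    for mz, charge in ranked:
        mh = singly_charged_mass(mz, charge)
        if groups and abs(mh - groups[-1]["mh"]) / groups[-1]["mh"] * 1e6 <= tol_ppm:
            groups[-1]["members"].append((mz, charge))
        else:
            groups.append({"mh": mh, "members": [(mz, charge)]})
    return groups


# Made-up example: one peptide (1+ mass 1500.0) seen at 2+ and 3+,
# plus an unrelated 2+ precursor.
precursors = [
    ((1500.0 + PROTON_MASS) / 2, 2),
    ((1500.0 + 2 * PROTON_MASS) / 3, 3),
    (640.25, 2),
]
groups = group_precursors(precursors)
for group in groups:
    print(round(group["mh"], 4), group["members"])
```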

If you're trying to annotate PTM MS/MS spectra, definitely throw ptmRS/AScore (at least similar enough to consider together) into the pipeline.

Run this and make some stupendous identifications!

From the PSM tab in PD you can double click on your peptide of interest and bring up a nice and informative MS/MS spectra that your reviewers will make fun of.  If you right click on it, you can select "Copy Points". This will remove all the MS/MS fragments and these annotations and make a 3 column text output.

Next, go to Excel or your Excel like program of choice and paste the data into it. It'll look something like this. (Please note, examples don't match in image above/below, because I'm lazy.)

I did this so I could ditch the scan headers, which will confuse IPSA. 

All IPSA wants is the MS/MS fragments and intensities. I highlight the cells in rows A &B (columns? I'll live my entire life without ever truly knowing which is which -- thanks, dyslexia, you're the best!) and copy with ctrl+C, or whatever you Macintosh people use. 
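If you'd rather skip the spreadsheet, the same cleanup can be sketched in a few lines. I am assuming the pasted text is tab-separated with m/z and intensity in the first two columns, and that any line that doesn't start with two numbers is a header; I haven't checked that against every PD version, and the example text below is made up.

```python
def extract_peaks(pasted_text):
    """Keep only (m/z, intensity) pairs from pasted multi-column text."""
    peaks = []
    for line in pasted_text.splitlines():
        fields = line.split("\t")
        if len(fields) < 2:
            continue  # blank line or a single-column header
        try:
            mz, intensity = float(fields[0]), float(fields[1])
        except ValueError:
            continue  # non-numeric line, e.g. column titles
        peaks.append((mz, intensity))
    return peaks


# Made-up stand-in for what lands on the clipboard.
example = (
    "Some scan header text\n"
    "m/z\tIntensity\tOther\n"
    "175.119\t1200.5\tx\n"
    "288.203\t845.0\ty\n"
)

# Two tab-separated columns, ready to paste into the IPSA box.
for mz, intensity in extract_peaks(example):
    print(f"{mz}\t{intensity}")
```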

Hit the big button at the top of this blog post and go to IPSA. 

Click to expand the image below. This tool is amazingly straight-forward, but I'm still going to number things. 
1) Ctrl+V your cells into the big obvious box! 

2) Copy/Paste your peptide sequence in. In PD 2.1 it is easiest to do this from the very top line of the Peptide Summary (where you found the "Copy Points" button a couple of images up from this one).

3) Put in the charge state of your peptide.

4) Determine the charge states of the fragments you want to see. I've, in error, selected 2 here, which doesn't make sense in this workflow, but it did make sense with a low resolution spectra I couldn't deconvolute. Keep in mind that the IMP tool isn't perfect and may not catch every MS/MS, if the charge state is particularly high. Worth toying with if you've got an unexplained fragment.

5) Select the fragment ions you want to see, and whether you want to see neutral losses. I only use them if there are a lot of unmatched MS/MS spectra.

6) Use a reasonable fragment tolerance in Da or PPM. Matching tolerance is how low vs. base peak you want to still label. It defaults to zero which might be too messy. Putting a 5 in means that you won't label stuff below 5% of base peak.

Okay -- so this is reeeeeaaaly cool. If you don't like where your labels are, height wise, you can move them. See that y5 for example? Just click and drag it up so you can clearly read it. Then when you....

7) Generate your SVG, it keeps it that way! I'd also recommend exporting the data so that you have this output. It makes a handy CSV with the same title (your peptide sequence) as the SVG and saves it in the same place.
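The two settings in step 6 are easy to mix up, so here is a sketch of what each one is doing. This is my reading of the behavior, not IPSA's source code; the function names are hypothetical and the peaks are made up.

```python
def within_ppm(observed_mz, theoretical_mz, tol_ppm):
    """True if the observed m/z is within tol_ppm of the theoretical m/z."""
    return abs(observed_mz - theoretical_mz) / theoretical_mz * 1e6 <= tol_ppm


def peaks_to_label(peaks, min_percent_base_peak):
    """Drop peaks below a percentage of the most intense (base) peak."""
    base_peak = max(intensity for _, intensity in peaks)
    cutoff = base_peak * min_percent_base_peak / 100.0
    return [(mz, intensity) for mz, intensity in peaks if intensity >= cutoff]


# Made-up (m/z, intensity) pairs; the base peak is 1000.0.
peaks = [(175.119, 1000.0), (288.203, 40.0), (401.287, 60.0)]

# With a threshold of 5, anything under 50.0 intensity is left unlabeled.
print(peaks_to_label(peaks, 5))

# Fragment tolerance in ppm: 0.002 off at m/z 500 is 4 ppm.
print(within_ppm(500.002, 500.0, 10))
```

A ppm tolerance scales with m/z while a Da tolerance doesn't, which is the main reason to pick one over the other for high resolution data.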

Maybe you're done!  However, if you want to make changes, NOW it's time to break out Powerpoint or Illustrator.

Illustrator will directly import the SVG and allow you to manipulate it if you know how to use it right. (I don't and I made my figures much worse).  Powerpoint (at least my 365 version) will directly import the SVG. And then I can make changes to it.

If you want to make changes like add in some text, you aren't done yet. IPSA uses 3 nonstandard fonts that you probably don't have installed. At the top of the page there is a button that says "Download fonts". Do that. Unzip them and then type "Fonts" into your Windows search bar thing.

In Windows 10 (booooooooo!) it'll look like this.

Predictably, it won't work quite right, but if you click/drag/drop enough times and say the exact right combination of profanities it will eventually recognize the fonts you uploaded. Powerpoint may not immediately add the new fonts to the font bar, but if you type the name of the font  in the box it will update it.

The spectra are annotated in OpenSans. On my screen to match it exactly with a spectra taking up a full slide, my b/y ions are OpenSans 16.  This may not be universal. The other 2 fonts are the text around the beautiful spectra.

Why am I adding stuff? I'm just putting in the exact mass of the ions that most clearly illustrate where the NAGs are located in my MS/MS spectra, which is never ever ever on tyrosine.

Now that you've got your spectra in, add the right arrows and colors, save the image however you want (probably .TIFF, since the image isn't compressed.) And then I'm done typing!

Friday, January 17, 2020

Context specific FDR for top down proteomics!

On the bridge of the Starship Northwestern, Captain Kelleher and his crew are exploring the farthest reaches of proteomics, going where no lab has gone before.

I just had the worst idea ever -- and -- of course there is an internet tool where you can take anyone's picture and "Trek" it.


What started this ramble? Well, while we're here on earth still struggling with accurate estimation of FDR for linear and slightly modified peptides, on the Starship Northwestern they're beaming down tools for accurate estimation of FDR for freaking intact proteoforms!

How are you currently assessing FDR for proteoforms? I'll tell you how I am. I'm not. I'm so pumped that I've identified a few dozen proteins from fragmenting their intact forms that I'm just popping them into my list. And -- I'd wager that is what basically everyone is doing outside the 4 labs that do top down proteomics each and every day. And if you've got an exact mass and some sequence information and you can check your 24 proteins that is probably even okay.

However, if you're really getting hundreds/thousands of IDs? You need a real way to estimate these and this great new tool (which is freely available on Github here) provides a real starting point on these calculations. And it turns out that context is very very important.

The authors pressure test this tool using a true known sample and by reanalyzing some previously published materials to show that for today's top down proteomics both on earth and out there where they're exploring, this is the way to engage...

....your results.

Thursday, January 16, 2020

Quantitative live cell imaging + proteomics shows real time influenza progression!

This brand new study at Nature something or other is timely, interesting and shows a combination of techniques complementing each other I'd never have thought of!

Live cell imaging? That can't help with proteomics...How does looking at the surface of a cell help?

It turns out that live cell imaging (light microscopy!) has gotten massively sophisticated! Those images at the top are Rab11 foci!  So... live cell imaging of protein complexes inside a normal human cell. That's pretty awesome all on its own, right?

What could make it even better? Involving proteomics, obviously, but -- you know -- let's leave that part out of the title. THEN let's do something that is right at the front of everyone's mind right now -- influenza!!

If that doesn't make you want to read this, we probably can't be friends.

Rab11 is a protein that maintains other proteins at the cell surface and helps recycle the vesicles. This group shows how the influenza virus messes up Rab11(a?) function by what appears to be messing up the dyneins. You can't figure out that it's the dyneins by even the fanciest of light microscopes, but you can by immunoprecipitation-mass spec assays! Speaking of which, I'd like to direct your eyes to a great way to display data from IP-MS/AE-MS.

Whoever did the plots for the study knows how to drive home the results and conclusions. But if that isn't enough for you, all the data is available at PRIDE via ProteomeXchange here.

Take it all together? And we've got a better understanding of how influenza screws up a key and massively evolutionarily conserved system. Could you have done this study with just proteomics? Probably! However, the light microscopy is surprisingly useful toward driving the point home and the images are stunning. I'd probably put something snarky about how this is a good lesson in catering to the people who still don't understand why we're still going on about that mass spectroscopy stuff, but I'm almost over this goshdarned virus and my mood is much better, so I won't.

Wednesday, January 15, 2020

Phosphorylation sites that alter thermal protein stability.....



The biggest advance for us with all these recent studies is that these techniques are starting to seem a whole lot less daunting from a sample prep and execution standpoint. And, if you look at the results from the earliest studies to these newer ones, the advances in the chromatography and mass spec hardware really illustrate how far we've come. Now we're (well...some people in Seattle rumored to be talented are...) localizing the exact phosphorylation sites that are causing shifts in protein 3D structure large enough to affect something as primary as thermal changes?

!!! Wait!! This isn't even the only study THIS WEEK!!  Here is the second one

Monday, January 13, 2020

EPIC-XS -- Why Europe will always be #1 in Proteomics -- Proposal Deadline 3/31/20!!

Ignore my rambling and just click this big button! Deadline is March 31 for proposals!! 


Did you know the US vastly outspends the EU in mass spectrometry each year? Totally true. Don't ask me how I know. I can't tell you.

Who gets all the Science and Nature papers and, way more importantly,  the studies where you're like "holy shit...we can do that with a mass spec?!??"  Sure...we do okay...I don't mean that just because you're here in 'Murica you can't do something great  but deep down we know that the Europeans run proteomics and we're just trying to keep up by outspending them.

And I think PRIME-XS is a huge reason for why they're ahead right now and EPIC-XS is the reason we'll never catch up.

What is it? It's PRIME-XS powered up!
What was PRIME-XS? It was a way to connect the best and most meaningful projects, the ones most likely to be solved with proteomics technologies, to the best practitioners on an entire continent. (Here is a breakdown of some of the things PRIME-XS did!)

Imagine you're at a US university and you just discovered something big and you need to do some proteomics to finish the project. What do you do?

Option 1: Find a core or service lab: Maybe you go to a core lab and they help you out and everything is great.

But what if what you need is something truly special and hard and new? Cores are -- for the most part -- things that need to make up funds to justify their existence. They (we) have to go through samples and method development is expensive. You may not be able to find someone you could afford to develop this new technology for you. And if you can afford it -- time and time and time again, what do you actually do?

Option 2: You buy your own mass spec....
....and you spend 3-5 years learning how to use it...
...or you spend 3-5 years and a lot of money and you never learn how to use it....
OR -- and this is the worst --- you find out that you weren't right -- that proteomics or mass spectrometry couldn't help you. It's the wrong technology for the job. But you needed loads of experience with the technology to find it is the wrong match.

EPIC-XS offers Option 3:  Access to the expertise to complete your project.

This is my understanding of it: You describe your research and propose your work and an independent panel reviews it and figures out if it is a project that proteomics and mass spec can solve -- and if your project is picked it connects you with the people who have the instruments and expertise to pull your project off AND the funds to do it!

Could you argue that this is like a bunch of labs that are friends who figured out how to rig the system to get themselves the coolest collaborations that could be found on an entire continent? Sure you could! And it's totally and completely brilliant.

And the efficiency for the granting agencies is absurd. Option 2 is inefficient and expensive and, in the end, only good for the shareholders of the instrument companies.

If we want to compete we need a parallel in the US. Which probably won't happen. Instrument manufacturers have lobbyists.....

But for the EU side -- send this to anyone who has a cool project -- send in your own -- support this amazing project! 

And...yes...I'm totally enjoying my controversial blog post titles!

Saturday, January 11, 2020

NanoPore Proteomics -- Is it closer than we thought to replacing LCMS???

I had the distinct pleasure much earlier in my career of becoming a grandmaster in microarray technology -- right after everyone (except -- maybe the NCI) had realized the technology was massively inferior to the newly emerging "next gen" sequencing technologies.

I'll still stand by my QC pipeline for tens of thousands of microarrays by really unsophisticated statistics (powered by huge n!) even if it has, to date, been cited once, possibly making it the highest $$$ per citation in human history.

I love mass spectrometry. LOVE it. And about 18 months ago we saw the first data that conclusively proved that shotgun proteomics could compete head to head with the original "next gen" genomics/transcriptomics technology -- RNA-Seq.

However, if you weren't aware, RNA-Seq is starting to fall by the wayside. It's not going to go away anytime soon, but the writing is on the wall that recent improvements in "long read technologies" will soon replace it. PacBio long read sequencing data (which I will read into proteomics data with zeal over anything from an Illumina platform!! The reading frames are soooo long!! Check it out!) is great, but the big thing -- maybe the biggest thing of the future is a humble little thing called the NanoPore.

If you haven't seen it, I bet you will soon.

No joke -- that's it. You can sequence DNA with that thing. On the far right of it? That is a standard USB port. It uses the power of a USB port to sequence DNA/RNA. And -- it is the future of genomics.

However -- could it be the next step in Proteomics???

These authors seem to think so -- and the argument is compelling!

Now -- this idea has been kicked around for a while and I think the conclusion is "yeah -- this is totally going to happen, but maybe 10-20? years from now?"  Sitting here after reading this? I'd move that timeline up.

LCMS (which the authors refer to as BU-MS, for Bottom Up Mass Spectrometry) has a lot of weaknesses. We're improving them, but the outside world is getting very tired of them and, to be honest, I'm starting to think we'll never fix them.

LCMS Shotgun proteomics people will not:
1) Use the same extraction digestion methods
2) Use the same instrument parameters
3) Use the same data processing methods

And there are good reasons for these and we have lots of usefulness in solving little studies, but -- in the end -- this isn't going to be good enough for the rest of the world and they're looking for a way to replace the whole way we do things.

While I'm having my feverish rant for the 1-4 hours I'll be awake today according to this week's trend, the replacement for proteomics is NOT SomaScan. We don't need a GWAS for proteomics. We already have a couple. They're called "SWATH" and "MS-e".

However -- this is worth looking at. Or at least being prepared for!

Thursday, January 9, 2020

7,500 proteomic runs on ONE column -- the final nail in the coffin of NanoFlow?

I was just writing something about how the blog was on hiatus and it occurred to me that I didn't actually have to do a good job of summarizing every paper that I read. I could just put it here and suggest you read it because it's awesome! (...brain fatigue...and maybe a high fever...)

Someone (I forget who -- and don't worry -- I'm not going to look it up and blame you) had the great idea of superconcentrating peptides and ionizing them with nanoliters of solvent per minute by electrospray ionization. This was invented for something called the LCQ or QTrap system. Young people, if you've never heard of these, don't worry -- they are terrible. Imagine 1 scan per second with 800ppm mass accuracy. Then forget I ever mentioned either.

Even worse? The sensitivity! It's TERRIBLE on those things. The solution? The nanoflow thing above.

And what happened next was amazing -- mass specs got NOT TERRIBLE! Heck, they got good, and then great! Accuracy and speed and sensitivity.

You know what didn't change, though? The stupid, complicated, NanoLC stuff! The instruments are hundreds (thousands?) of times more sensitive and we're still doing dumb and poorly reproducible stuff with our chromatography.

Is this the paper that finally ends it? 

Probably not -- cause -- HPLC systems are expensive, but I hope this will make you consider NOT buying a nanoLC with your next mass spec.


11,000 proteins
11 samples (with TMT)
16 hours of run time on average
And a microflow (1mm x 15cm) column that shows NO decrease in performance over 7,500 runs.

My record for a nanoflow column is 6 weeks -- with a trap -- before the chromatography got all wonky. What's yours? I bet it isn't 7,500 runs!

Wednesday, January 8, 2020

Happy New Year! The blog will be back sometime soon!

Happy 2020!

The blog is currently on hiatus. Conor (@SpecInformatics) and I submitted something along the lines of 14 different papers in 2019 and at this point, 4? maybe 6? have been accepted.

I've been buried under reviewer comments, resubmissions, snarky comments and general fatigue for most of the year, which -- to be honest -- hasn't been really good for this blog.

This blog is supposed to be an homage to YOUR work and YOUR successes when YOU've made proteomics better and faster and more impactful for medicine and biology (and dozens of other fields where I{or almost anyone else, except you} would never have considered that something as fundamental as precise estimation of monoisotopic mass would make a difference and YOU showed it could and would).

Now that I'm doing this all day, every day and trying to compete with you (geez...ya'll are good at this stuff), the blog seems to fall by the wayside at the end of each day in lieu of making pretty R plots and uploading stuff to ProteomeXchange partner repositories.

It isn't going away and this isn't even the longest hiatus I've had in the years I've been typing in this weird orange box, but I need to catch up on some stuff before I can dedicate the surprising amount of time this odd hobby takes.

I'm also currently in Central America and have had flu like symptoms for several days and -- well -- I'd be lying if I said I was 100% sure that I knew what I was typing right now, but I have this vague impression I'm supposed to make a joke about drinking Bailey's out of a shoe.

Anyway, Happy New Year! And thanks for filling my days with amazing stuff to read about!!!

Monday, December 30, 2019

Optimizing/reducing in source fragmentation on 3 different Orbitrap systems!!

Talk about a useful study! If you're doing tryptic peptides, maybe this isn't all that useful, but if you are working on anything that is more fragile than that (glycopeptides? PARPylated? intact/native, metabolites...we could go on and on here) this is probably worth at least thinking about.

On the letterbox systems (the ion transfer tubes with the great big rectangular holes) we use lower RF% to start out with. For peptides on a Q Exactive or HF system, I typically err toward an RF of 50-60%. On the Lumos or Exploris we're typically doing 40-45% for peptides.

The great Katie Southwick explained RF% to me years ago (I need an ELI5 once in a while) as the amount of pulling force in through the very front of the instrument. Bigger things probably want a higher %RF but you have to keep in mind that there are downsides to that extra force and you could break apart smaller or more fragile things.

In this study, this group takes some of the more fragile things that we all hate to work on -- lipids -- and painstakingly compares different systems with different source conditions.

The chart at the top is the one I find the clearest and most valuable out of this great study -- when I'm looking at something that is clearly fragile and I've looked at it on whatever instrument is available -- this provides some guidelines for normalizing a setting that I probably didn't pay enough attention to.

Sunday, December 29, 2019

Urinary Peptidomics Reveals Diabetic Markers?!?!

Well -- if you needed a protocol for doing urine peptidomics, all the way down to standardizing everything to the urine creatinine levels (90umol, if you were wondering) and wanted some WTFourier level proof that this is a good use of your time, may I present: 

1) I didn't know urine peptidomics was a thing
2) This group reduces and alkylates their endogenous peptides. I'm unclear on whether or not I think that is a good idea, but considering how this paper develops downstream, I'm just going to shut up and do exactly what they did.

Discovery was all done on a Q Exactive coupled to a slEasyNano 1000 using an interesting Agilent column I'm not familiar with (post SCX fractionation? SAX? I forget now and I've got stuff to do).

Validation? Well -- they tripled the speed of the mass spec and increased the speed of their separation by over 7 orders of magnitude with an EvoSep coupled to an HF-X. (I guess the EvoSep isn't 1e7 times faster, but it sure feels like the slEasyNLC is taking the length of a human lifetime to load a single sample.)

All the data is up on ProteomeXchange and Panorama, but you should read this great paper and find the links yourself!

Saturday, December 28, 2019

PISA -- Multiplex Thermal Proteome Profiling!

Want to massively increase the speed of your drug mechanism elucidation/drug target workflow? Pack your bags for PISA!

Nope. Not that one. This one! 

Proteomics Integral Solubility Alteration! (PISA is a much better name).

What's it do? It multiplexes Thermal Proteome Profiling -- in the context of drug treatment. Here is a post that will link you to two of the previous studies (including the Nature protocol for ThPP).

The idea is that if your drug binds to some proteins it's going to change the protein's inherent 3D stuff. One readout of that will be a change in the protein's behaviour at different temperatures. In ThPP (an acronym I may have just made up so I don't confuse this with the TPP thing on my desktop) you look for how things change in your proteome at different temperatures. Check out the protocol. It's tough and there is lots of room (in my mind) for human error to lead you to false positives.

One way in proteomics to reduce quantitative error? Multiplexing!

One way to reduce quantitative error in everything? More samples!

PISA uses both of these to end up with a TMT quantitative readout of how the proteome changes at a global level (with both 1D and 2D fractionation for TMT seamlessly integrated just as you'd expect from a TMT based experiment) with lots of replicates all multiplexed together.

Friday, December 27, 2019

The Case for Proteomics and Phosphoproteomics in the Clinic!

After a couple of days of somewhat successfully skirting any discussion of politics with my family for the holidays - with one extremely notable exception, I'm so pumped to type something that people with a similar mindset might read one day.

What about this for building some consensus?

Where are we now? What are the challenges ahead? What do we need to do next? Yo, I'll let them tell you what....

This review has study after study that has shown the promise of proteomics to impact patient health. Now -- you can probably guess where the big technological need is in the personalized space from the picture at the top. HLA peptides still suuuuuuuck. Blech. Yes. We need help on that side, but from many of the other areas we're good to go. We just need a shot. And the paragraph above says ...more chances to prove that we know how to do this stuff.

I highlighted my favorite words: because you know what the medical community is good at? Openness to shifts. That's me being sarcastic, if you can't tell.

I love the angle on the phosphostuff here, because you sure don't hear these cancer people in the clinic talking about protein abundance all that much -- they're all rambling about the "phospho status" of this protein or that one, and doing Westerns and ELISAs to check them. Which, yo, it's almost 2020. Western blots are fucking stupid. I'm not the first person to say that, but if you need someone to reference that statement to, I'm cool with you quoting me. Here's some semi-coherent reasons why. I'm pretty sure ELISAs are stupid as well, but I'm not sure I've ever actually done one, so I'm not sure I feel qualified to make such a strong statement.

A big thing that we're kind of missing in our realm might be the incorporation of -omics data with clinical data. We're not exactly running away with loads of stuff that can help us make these connections, but -- realistically -- we can steal that stuff from the GWAS people!  (There are good examples, of course, but they aren't integrated into a lot of the more common software programs.)

This is a beautiful, optimistic, and valuable review and -- I'm a few months late on posting it (11 months) but it is definitely worth a read!

Thursday, December 26, 2019

PeakOnly -- Deep learning Python code for finding your MS1 features!

There are a lot of ways to find compound peaks in your data, but some compounds/peptides (particularly modified ones) just have lousy elution profiles. Sometimes you just have to go in and look yourself. Isn't that what all this AI/Machine Learning/Deep learning baloney is supposed to be doing for us? Automating tasks that are a little bit harder?

Maybe this is gonna help!?!?

PeakOnly uses Deep Learning to classify peaks. It is meant for metabolomics and was optimized on 1-3 Hz MS1 data, but I'm still putting it here because there is a very short list of things that will make lists of your MS1 peaks and their abundances (quantifying the stuff you didn't identify) and I'll probably need this sooner rather than later.

You can get PeakOnly at this Github. It doesn't blow the traditional peak detection stuff away or anything, but it does identify some compounds here and there that XCMS misses. It makes for a solid proof of concept study with open (with MIT license?) code that deserves a look. I mean...I'm not having trouble with the high abundance ones with the perfectly gaussian distributions...I need help identifying the lousy ones....