Monday, January 20, 2020

Announcing the First Ever News In Proteomics MineAthon (Challenge)!

I have been working on yet another crazy idea off and on for a month or two and it's now almost (like 18%) fully organized.

I'll stand by these words all day. Proteomics hardware is about mature. Yeah, we'll get some cooler stuff down the road, but until we figure out how to fix our informatics problem -- who cares if you get 3% more peptide IDs or 10% more spectra? Most of the tools people are using are only converting a tiny percentage of spectra into biological findings. There is much more to be gained with smarter data processing than even applying phase constraint over a wider mass range. In the most popular data processing pipelines people aren't even looking for PTMs, because it's still really hard to do it.

SO....Let's see where we are right now!

Do you think your data processing pipeline is the best for finding important biological changes and PTMs? Want to prove it, participate in some cool human research, be on a cool paper, a world-wide webcast talk and maybe even get a trophy and definitely get the chance to talk some smack to your peers? Yes?

Time to sign up for the --- 


(EDIT: I was just told an "athon" means you do it now. This is a "challenge" since we do it over an extended time period)


How's it work? 

You register by sending an email on or before the day we start mining data! Let's put a deadline of February 16th, 2020 to start. I'll make a list with your name and contact info on it and definitely will not lose that list. This is important to me.

On February 16th you and anyone else who has signed up (honestly, maybe just you) will be provided the link to download a relatively large label-free human proteomics data set. The one I like is 66 Q Exactive single-shot files, but we're looking for the most important and under-mined set of data we can find, and I can't swear it'll be that one. I want to use something realistic for today's human studies by using a real and awesome human study.

You have until March 31st to turn in your results (I like long deadlines. I figure most of you people have jobs and classes and stuff and probably like decently long deadlines as well).

The goal will be to find the most important differences between patient and control samples with a specific focus on those pesky PTMs!

Why would you do this? 

No reason, to be honest. I'm just too lazy to do it myself and I'm crowdsourcing so I don't have to.  Wait! That's not right! There are reasons!

1) Bragging rights. There will be a real winner to this contest, as well as some top candidates based on some of these criteria by our not-yet-chosen judges:
A) Most PTMs
B) Best evidence of said PTMs
C) Best presentation of said PTMs
D) Most useful PTMs
E) Metrics for the quantitative changes of said PTMs.

Remember when we got dumb trophies for everything? "You ran around the playground without falling down more than twice? Have a trophy!"   Then you never ever get a trophy ever again? That's dumb.  I think we should get an awesome trophy for this. I'll find a trophy store. Not even joking.

2) FAME!! Are you familiar with GenomeWeb? It's a big deal for people that do science business stuff. The top candidates, chosen by our impartial and not-yet-selected judges, will be allowed (if they're interested) to present their analysis and their results via a live-streamed webinar on GenomeWeb. I've talked to them and they didn't say no. I don't think anyone actually said yes, but they were totally cool about it and they're altogether great people.

3) A paper!  Yo, we're going to try and find the most important and under-mined set of files that we can. Then we're going to mine the crap out of it and try to show what today's proteomics can really do! And we're going to showcase the ever loving shit out of the fact that it's 2020 and proteomics isn't just hardware.

I think I'm even going to put this in for at least a poster or a talk or two somewhere so I can talk about how amazing you and your solution are. Somehow I gave like 10 invited talks last year. I hope I'm not dumb enough to do that many this year, but I'll totally get you and your results and solution as much exposure as I can (which I can't swear will help you in any way). I think I get invited to talk places just so people can find out if I'm as strange in person as I appear in writing. And if you are short a qualified proteomics speaker, you can always try me. I clearly love talking about this stuff.

Who is eligible? 

Everyone! We don't care if you wrote your own pipeline or if you've just kluged (is that a word?) together a bunch of different tools into something semi-feasible that totally works for you although you've never been able to explain it to anyone else well enough that they could do it (be honest...that might not be ideal, but I'll work with it!). I don't care what timezone you are in (we'll just adjust the webinar accordingly and I'll ship the trophy wherever). Although if you are somewhere really cool I seriously might come deliver it myself. Again, this is important to me.


There aren't any. A lot of my favorite people I've ever met have been responsible for the software that I use every day. I clearly have my biases and my favorite tools, but that's why I'm going to get some impartial judges. I'd like to just be the hype man.

If no one enters? 

That's okay, too! I really wanted to write something in this box today and I'm going to run the same dataset through every tool I have on my PCs and I'll announce a winning software and I'll be very glad that I put the deadlines so far in the future! Your solution just won't have a chance if I don't know how to use it. Probably your solution isn't very good anyway. Poop head.
