Saturday, June 6, 2020

One ASMS 2020 TakeAway -- GPUs are finally coming!


I'm woefully behind on ASMS stuff. 2020 (the year itself) has been a little tiring.

One takeaway that I've got for this year is that we're finally seeing Graphics Processing Units in proteomics. I think I've rambled about this here before, but I can't find it. Anyway, in today's computer stuff we basically have 2 kinds of processors:

Central Processing Units (CPUs) -- these are the names on the stickers on your computer: "i7", "XEON" or, more commonly now, "Ryzen" and, if you're really lucky, "Threadripper" (I am not this lucky yet)

CPUs have a small number of cores but each core has access to tons of resources.

Graphics Processing Units (GPUs) have TONS of cores, but each core is capable of doing only very small things, like controlling a few pixels.
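If the "tons of tiny cores" idea feels abstract, here's a rough sketch of the pattern in plain Python. To be clear, this is not real GPU code and nothing here runs on a graphics card. The names (scale_kernel, launch) and the made-up intensity values are just my illustration. The point is only the shape of the work: one very small job, repeated once per data point, with no job needing to know about any other.

```python
# Illustrative only: plain Python standing in for a GPU-style "kernel".
# All names and numbers here are hypothetical, not from any real tool.

def scale_kernel(i, intensities, out, factor):
    """The tiny job one GPU core would do: handle element i and nothing else."""
    out[i] = intensities[i] * factor


def launch(kernel, n, *args):
    """Stand-in for a GPU launch: run the kernel once per index.

    On a real GPU these n calls would be spread across many simple cores
    and run largely in parallel. Here they just run one after another.
    """
    for i in range(n):
        kernel(i, *args)


intensities = [10.0, 250.0, 3.5, 99.0]   # pretend peak intensities
scaled = [0.0] * len(intensities)        # output buffer, one slot per element
launch(scale_kernel, len(intensities), intensities, scaled, 2.0)
print(scaled)
```

A CPU is happiest when one core chews through a big, branchy job. A GPU wins when the job looks like the kernel above and there are millions of elements to hand out.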

A lot of genomics has advanced even further, to Application Specific Integrated Circuits (ASICs). These things are even dumber than GPUs -- an ASIC is designed to do only one job. And when you focus processors on just one task they can be really really good at it. The first ASICs for genomics were advertising 100x increases in speed over GPU alignments. We'll get there one day!

Most of us are using CPUs and doing just fine with them. Someone I talked to at ASMS had his PC running for about a month solid on one analysis for his talk....indicating that sometimes we do need more power....but most of the time we're okay.

Where do you need GPUs in proteomics? 

1) Deep learning (Prosit, etc.)
2) Processing absurdly large files (like those 40GB per run timsTOF files, maybe?)

Worth noting, John Yates did a talk for Bruker (you can find it in the clunky Horsebrutality suite thing they have set up) and it's one of the best educational talks about the evolution of proteomics data processing I've ever seen. 100% recommended. The Yates lab has been using GPUs for data processing for several years through the commercial program IP2....which... I think my search bar is broken, I know there is stuff here about that....

3) Other programs are out there, like ANN-Solo GPU, and G-MSR. This isn't new, but -- okay -- this is big -- 

4)  You know that Phase Constraint thing that we keep hearing about (and finally saw in a very very limited form on the Exploris 480 and Eclipse instruments last year)? 

You know why it isn't running on everything all the time? It's super computationally difficult. You're pushing your resources on the Exploris hard to even do the phase constraint in the narrow window around your TMT tags (what, 10 amu?)

I strongly recommend you check out 

MP 127 High dynamic range proteome analysis with BoxCar DIA and super-resolution Orbitrap mass spectrometry

-- because these jokers set up some GPUs (TITANs!! The crazy expensive server ones) so they could run phase constraint across entire mass windows. The speed and resolution increase is so much that they can do BoxCar DIA using Evosep separations and dig way deeper into the plasma proteome.

They process the data in Spectronaut, which I'm going to guess now does something with the BoxCar scans, 'cause I swear it used to look like it didn't use the BoxCar MS1s at all, but I bet they fixed that!

