Saturday, March 18, 2023

Perseus as an R portal for proteomic data (and for people who can't do TIDY!)

 

(That's a UMAP of almost 2,000 single cell proteomes! And I made it without even bothering anyone for help. It almost makes sense!)

Okay, I give Perseus about as hard a time as anyone out there, but every time I absolutely have to sit down and spend several hours remembering how to do something in it, I end up being amazed by it. 

If you aren't familiar, Perseus is a really powerful informatics package that you can use for just about anything. 

It is written in C# (.NET), so it can be stupid fast at some things, and it is node-driven, so you'd think I'd like it better since I love a certain node-driven software for proteomics. 

However, it can be sort of daunting because it ends up looking like this before it does what you want it to.


(Stolen from this amazing YouTube video.)

And it absolutely makes sense if you spend enough time with it and you take the time to annotate your individual nodes so you can remember what you did. 


Something that occurs to me a lot when using Perseus for what I mostly use it for is how much friendlier it is than TIDY in R. I've got this thing with rights and lefts that I've been told makes me a terrifying driver. Extend that to a rigorous data format in which you swap everything 90 degrees, and - no chance my brain can do it. 
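
For anyone who hasn't run into the 90-degree swap, here is a minimal sketch of it in plain Python (one of the two languages Perseus can talk to). The protein IDs, cell names, and the `to_tidy` helper are all made up for illustration; this is not anything Perseus or R does internally. The wide table has one row per protein and one column per cell, and the tidy version has one row per protein/cell measurement.

```python
# Hypothetical wide matrix: one row per protein, one column per cell.
wide = [
    {"protein": "P1", "cell_1": 10.0, "cell_2": 12.5},
    {"protein": "P2", "cell_1": 7.0, "cell_2": None},  # None = missing value
]


def to_tidy(rows, id_col="protein"):
    """Reshape wide rows into long rows: one (protein, cell, intensity) per row."""
    tidy = []
    for row in rows:
        for col, value in row.items():
            if col == id_col:
                continue  # the ID column is carried along, not unpivoted
            tidy.append({id_col: row[id_col], "cell": col, "intensity": value})
    return tidy


tidy = to_tidy(wide)
for r in tidy:
    print(r)
```

Two proteins by two cells become four rows, and every downstream step has to keep track of which column holds the sample name and which holds the value. That bookkeeping is the part my brain refuses to do.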



If you follow these directions, the newer versions of Perseus can be pretty much auto-equipped with several cool tools where Perseus talks to R or Python and brings the data back. However, if that isn't enough for you, you can link in your own stuff. And it seems to me like getting your data into the correct format for R is often 90% of the battle, but that might just be my brain. 
