Wednesday, March 16, 2022

Challenges and opportunities for Bayesian stuff in proteomics!

 

In the "ummm....what the f- even is this?? did this exist when I was tutoring stats in undergrad?!?" category, I am happy (though somehow slightly nauseated, and very aware at 3am of how very long ago undergrad was, since I either forgot every word or they changed them all??) to present -- 


According to the Wikipedia rabbit hole I went down that started here and somehow got worse the more I clicked, this certainly did sort of exist when I was in undergrad, as Bayes's work on this was first published (posthumously) in 1763. 

My interpretation of the introduction of this paper is that Bayesian statistics allow for the detection of less extreme ratios of change because they measure the degree of change (where we're used to frequentist, is-it-significant-or-not testing). Which might actually mean that this post moves from the list of several thousand sitting here unposted to the one where I push the big orange button and it says "are you SURE you want to post this??" because this sounds a whole lot more like a biological system than what it is convenient to pretend biological systems are like. 

It is amazing when you find a biological condition that produces a massive increase in the whole and mostly unmodified form of your protein....and I can think of just one right this second. Myocardial infarctions cause the creatine kinase levels in someone's blood to jump right through the roof compared to their basal level. Even then, there are a bunch of other things that can cause high CK, and some people just have higher/lower levels of the protein in circulation anyway, but that's what we use in the clinic -- since at least the 1980s....suggesting that most things that are easy have already been found? 

A CK spike 6 hours after a heart attack that drops off slowly over the next two days sounds like something with a "frequency" that we could find with what I use for identifying protein changes. All the other more complicated stuff is way harder to find that way. And this is where the Bayesian stuff seems to have a lot more power? 
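
To make "measuring the degree of change" a little less abstract for myself, here is a minimal sketch of what a Bayesian estimate of a small fold change could look like. To be clear, this is not the method from the paper. It is the simplest textbook case I could find (a normal prior on the mean log2 ratio, updated with the replicate data), the function name and the prior settings are my own assumptions, and the ratios are completely made-up numbers.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def posterior_log2fc(log2_ratios, prior_mean=0.0, prior_sd=1.0):
    """Normal-normal conjugate update for the mean log2 fold change.

    Assumptions (mine, for illustration only): the replicate log2 ratios
    are roughly normal, and the sample standard deviation is treated as
    if it were the known noise level, which is a shortcut.
    """
    n = len(log2_ratios)
    sem = stdev(log2_ratios) / sqrt(n)   # standard error of the mean
    prior_prec = 1.0 / prior_sd ** 2     # precision = 1 / variance
    data_prec = 1.0 / sem ** 2
    post_var = 1.0 / (prior_prec + data_prec)
    post_mean = post_var * (
        prior_prec * prior_mean + data_prec * mean(log2_ratios)
    )
    return NormalDist(post_mean, sqrt(post_var))


# Hypothetical log2(treated / control) ratios for one protein, 5 replicates.
ratios = [0.35, 0.22, 0.41, 0.18, 0.30]

post = posterior_log2fc(ratios)
prob_up = 1.0 - post.cdf(0.0)                          # P(change > 0 | data)
low, high = post.inv_cdf(0.025), post.inv_cdf(0.975)   # 95% credible interval

print(f"posterior mean log2FC: {post.mean:.3f}")
print(f"95% credible interval: {low:.3f} to {high:.3f}")
print(f"probability the protein is up at all: {prob_up:.4f}")
```

The part I like is the output: instead of a yes/no at some cutoff, you get a range of plausible sizes for the change and a probability that it is real, which seems like a friendlier way to talk about a protein that only moves a little.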

As the authors note, people doing LCMS-based proteomics might have several reasons for not using Bayesian statistics, such as "lack of familiarity", which is a nicer way of saying "I don't know what the f- you are even talking about? are these words?!? help!!", and a lack of access to these tools.

Every figure for this paper, and how it was generated, is available at this GitHub. I'm not sure that I could use them to apply a Bayesian framework to some data here that is clearly nonfrequential (...meh...) but it would be a place to start! 

