If I have a superpower as a person or a scientist, it is that I'm very okay with being wrong. It helps that it happens all the time, and that I have friends and a domestic partner who are way way way smarter than me. I'm used to being the dumbest person in the room, so I get plenty of practice at discovering I'm wrong.
And boy - was I wrong about this new Illumina Protein Prep thing.
I thought it was just a repackaging of SomaScan, a product that has had the strangest propensity for avoiding the very simple experiment that would make me stop making fun of it. After a decade I was starting to think that either 1) they were doing it just to get on my nerves, or 2) they had done the experiment - and aptamer binding could not be used to estimate a protein concentration in a complex mixture in any meaningful way (translation: it doesn't work).
But Illumina has been killing it for years and years! We have petabytes full of Illumina short read sequencing data all over the world. Sure, you could argue they missed the long read sequencing bandwagon and that is a little weird. But a behemoth of an organization like that has the money and the people to avoid becoming complacent.
So when Illumina acquired whatever SomaScan had changed their name to that month, you had to think, "Wait. Maybe there IS something to it!"
And here I sit, tuning a TOF after a power outage that caused me to miss the last day of a conference. Embarrassed and corrected.
The problem with aptamers is that they are only linear within an EXTREMELY narrow dynamic range. If sample A has x target and sample B has 2x target, you can basically see that difference. If sample C has 10x target, you're probably okay, but you're at the end of the dynamic range. If sample D has 1,000,000,000x more protein, you get about the same value as sample C. More on that and other problems with aptamers here.
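That saturation behavior can be sketched with a toy Langmuir-style saturable binding model - the Kd and signal ceiling here are invented numbers for illustration, not anything to do with SomaScan's actual chemistry:

```python
# Toy saturable-binding model: signal = max_signal * c / (Kd + c).
# Kd and max_signal are made up purely for illustration.
def aptamer_signal(conc, kd=1.0, max_signal=1000.0):
    return max_signal * conc / (kd + conc)

x = 1.0
for label, conc in [("A (x)", x), ("B (2x)", 2 * x),
                    ("C (10x)", 10 * x), ("D (1,000,000,000x)", 1e9 * x)]:
    print(f"{label:>20}: {aptamer_signal(conc):7.1f}")
# A and B are distinguishable; C is already near the ceiling;
# D, a billion-fold higher, reads out almost the same as C.
```

Samples A and B land on the rising part of the curve, but by sample C the response has flattened, and sample D's billion-fold excess buys almost no additional signal.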
This new product is so much more than the original product it was based on - because after you have your aptamer readout you NOW do NGS sequencing on tags on those aptamers. And then you do the quantification off of the NGS readout! By counting the reads! And we all know that there is no better way of doing quantification than counting things. And if there is, it's probably counting an indirect measurement of an indirect measurement. Wait. Didn't we do something like that before?
Okay, but that doesn't fix the linear dynamic range issue of the original measurement. But now you've got rock solid absolutely amazing quan on those narrow measurements, right?
And this is where I change my mind about this whole thing!
This group took a good hard look at precision and accuracy across a pile of different ways to do RNA-Seq, with a special emphasis on low input techniques like single cell (scRNA-Seq) and single nucleus (snRNA-Seq), but with plenty of work on bulk as well.
The CVs ARE AMAZING.
Less than 1! Across the board! Okay, fancy mass spec people, tell me how many times you've reported a CV <1 across an entire dataset. I'd love to say that I only report out proteins with CVs less than 10%, but we use a 20% CV cutoff.
Oh...fuuuuuuuuuuuuuuuck..... they mean CV%, right? Not CV 1 = 100%??
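For the record (assuming the usual definition), the coefficient of variation is just the standard deviation over the mean, so a CV of 1 is a CV% of 100. A quick sketch with made-up replicate intensities:

```python
import statistics

# Hypothetical replicate intensities for one protein; numbers are invented.
replicates = [100.0, 110.0, 95.0, 105.0]

mean = statistics.mean(replicates)   # 102.5
sd = statistics.stdev(replicates)    # sample standard deviation
cv = sd / mean                       # CV as a fraction
print(f"CV = {cv:.3f}, CV% = {100 * cv:.1f}%")
# "CV < 1" only impresses if it means < 1%; as a fraction it's just < 100%.
```

Which is exactly why the units matter: an entire dataset under a CV of 1% would be astonishing, while under 100% is a much lower bar.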





