Wednesday, March 31, 2021

Scanning SWATH with ultrashort gradients-- 2,000 proteins in 1 minute?

 I'm gonna drop this here because everyone else is talking about it. 

Scanning SWATH goes way back to almost the beginnings of SWATH as an idea. I think it is very similar to SONAR from Waters in that the quad doesn't sit in a stationary bin like the ones we use for quadrupole-Orbitrap-based DIA.

The trick here is fully on an informatics level, executed through the impressively easy-to-use DIA-NN software.

There was an informal ABRF wrap-up meeting with a lot of smart people from around the world that I somehow got invited to, and one thing we talked about at length was the new generation of LCMS software that doesn't give you easy access to MS/MS fragments. DIA-NN falls in that group. This isn't going to make everyone comfortable, but it is something to be aware of. There is a whole generation of proteomics people using new software and getting stupendous numbers of identifications that are verified by software but are not (or at least, not easily) verified by checking to see that MS/MS fragments actually exist.

If the software is right? Who cares! If the software is wrong, for many of these tools I don't know how you would ever find out. I've been trying to look at the RAW data from this study, but MSConvert won't recognize the .wiff.scan files, so I'll probably just assume the reviewers did their due diligence on this fantastic-sounding study.

1 comment:

  1. Thanks for the kind blogpost! A comment on the visualisation capabilities. Agreed that DIA-NN, like other recent software tools, is not programmed for visualisation of the extracted ion chromatograms. There are software tools that are good at that, e.g. in the case of Sciex raw data, PeakView does a great job.
    But this does bring up an important point. We don't believe that visual inspection is sufficient to validate these types of proteomes. Nobody can manually look at the 500,000+ chromatograms reconstructed by DIA-NN that result in 60,000+ precursors quantified in a 2-5 minute run, in a typical experiment with hundreds of samples.
    And even if someone could, the manual error would be huge. Even back in my long-ago postdoc days with clinical MRMs, which create a tiny fraction of the spectra, once these were manually integrated the results depended on who integrated the peaks. The most well known example of a technique that does validation by visual inspection btw is the Western Blot... imagine the error rate of scaling up a WB to 60,000 IDs per sample :-). That is just not an option anymore with these technologies. On top of that, DIA-NN and other software tools can increasingly use cross-run information to assign true positive IDs.
    So looking into the spectra is important, e.g. for monitoring chromatography and instrument set-up etc., and one can do this with the visualisation tools, but this doesn't suffice for validating the software or the proteomes. That said, it does matter a lot if the software is right. We just need other means of validating it. For us, a good way is LFQbench-type approaches, where samples from multiple species are mixed at different ratios, and 2-species spectral libraries that you can use to test for false positive calls and to validate the false discovery rate. That is just the beginning, but being accurate is key, even though this software is so complex that it might appear as a black box to many!

    M Ralser
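
For anyone wondering what that 2-species library check might look like in practice, here is a rough Python sketch. It is not DIA-NN's own method and not necessarily what the authors ran. It assumes you searched a single-species sample against a library that also contains precursors from a species that is not in the sample, and that a false match is equally likely to land on any library entry. Every name in it (`LibraryCounts`, `empirical_fdr`, the species labels) is made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class LibraryCounts:
    """Hypothetical container for the size of a 2-species spectral library."""
    present: int  # library precursors from the species that IS in the sample
    absent: int   # library precursors from the species that is NOT in the sample


def empirical_fdr(identified_species, library, absent_label):
    """Estimate the false discovery rate among present-species identifications.

    identified_species: one species label per reported precursor.
    Assumption: false matches hit library entries at random, so each hit to
    the absent species implies (library.present / library.absent) false hits
    hiding among the present-species identifications.
    """
    n_total = len(identified_species)
    if n_total == 0:
        return 0.0
    # Every hit to the absent species is a false positive by construction.
    n_absent = sum(1 for s in identified_species if s == absent_label)
    n_present = n_total - n_absent
    if n_present == 0:
        return 1.0
    expected_false_present = n_absent * library.present / library.absent
    return min(1.0, expected_false_present / n_present)


if __name__ == "__main__":
    # Toy numbers, not taken from any real study.
    hits = ["human"] * 990 + ["maize"] * 10
    lib = LibraryCounts(present=50_000, absent=50_000)
    print(f"estimated FDR: {empirical_fdr(hits, lib, 'maize'):.4f}")
```

If that estimate comes out far above the FDR the software says it is controlling, that is the kind of "the software is wrong" signal you would never get from eyeballing chromatograms one at a time.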