A foundation for reliable spatial proteomics data analysis.

Abstract

Quantitative mass-spectrometry-based spatial proteomics involves elaborate, expensive, and time-consuming experimental procedures, and considerable effort is invested in the generation of such data. Multiple research groups have described a variety of approaches for establishing high-quality proteome-wide datasets. However, data analysis is as critical as data production for reliable and insightful biological interpretation, and no consistent and robust solutions have been offered to the community so far. Here, we introduce the requirements for rigorous spatial proteomics data analysis, as well as the statistical machine learning methodologies needed to address them, including supervised and semi-supervised machine learning, clustering, and novelty detection. We present freely available software solutions that implement innovative state-of-the-art analysis pipelines and illustrate the use of these tools through several case studies involving multiple organisms, experimental designs, mass spectrometry platforms, and quantitation techniques. We also propose sound analysis strategies for identifying dynamic changes in subcellular localization by comparing and contrasting data describing different biological conditions. We conclude by discussing future needs and developments in spatial proteomics data analysis..G., C.M.M., and M.F. were supported by the European Union 7th Framework Program (PRIME-XS Project, Grant No. 262067). L.M.B. was supported by a BBSRC Tools and Resources Development Fund (Award No. BB/K00137X/1). T.B. was supported by the Proteomics French Infrastructure (ProFI, ANR-10-INBS-08). A.C. was supported by BBSRC Grant No. BB/D526088/1. A.J.G. was supported by BBSRC Grant No. BB/E024777/ and a generous gift from King Abdullah University for Science and Technology, Saudi Arabia. D.J.N.H. was supported by a BBSRC CASE studentship (BB/I016147/1)

    Similar works