19 research outputs found

    Revisiting the thorny issue of missing values in single-cell proteomics

    Full text link
    Missing values are a notable challenge when analysing mass spectrometry-based proteomics data. While the field is still actively debating the best practices, the challenge increased with the emergence of mass spectrometry-based single-cell proteomics and the dramatic increase in missing values. A popular approach to deal with missing values is to perform imputation. Imputation has several drawbacks for which alternatives exist, but currently imputation is still a practical solution widely adopted in single-cell proteomics data analysis. This perspective discusses the advantages and drawbacks of imputation. We also highlight 5 main challenges linked to missing value management in single-cell proteomics. Future developments should aim to solve these challenges, whether it is through imputation or data modelling. The perspective concludes with recommendations for reporting missing values, for reporting methods that deal with missing values and for proper encoding of missing values. Comment: The code to reproduce the images presented in the manuscript is available in the GitHub repository: https://github.com/UCLouvain-CBIO/2023_scp_n

    Standardised workflow for mass spectrometry-based single-cell proteomics data processing and analysis using the scp package

    Full text link
    Mass spectrometry (MS) based single-cell proteomics (SCP) explores cellular heterogeneity by focusing on the functional effectors of the cells - proteins. However, extracting meaningful biological information from MS data is far from trivial, especially with single cells. Currently, data analysis workflows are substantially different from one research team to another. Moreover, it is difficult to evaluate pipelines as ground truths are missing. Our team has developed the R/Bioconductor package called scp to provide a standardised framework for SCP data analysis. It relies on the widely used QFeatures and SingleCellExperiment data structures. In addition, we used a design containing cell lines mixed in known proportions to generate controlled variability for data analysis benchmarking. In this work, we provide a flexible data analysis protocol for SCP data using the scp package together with comprehensive explanations at each step of the processing. Our main steps are quality control on the feature and cell level, aggregation of the raw data into peptides and proteins, normalisation and batch correction. We validate our workflow using our ground truth data set. We illustrate how to use this modular, standardised framework and highlight some crucial steps.

    Initial recommendations for performing, benchmarking, and reporting single-cell proteomics experiments

    Full text link
    Analyzing proteins from single cells by tandem mass spectrometry (MS) has become technically feasible. While such analysis has the potential to accurately quantify thousands of proteins across thousands of single cells, the accuracy and reproducibility of the results may be undermined by numerous factors affecting experimental design, sample preparation, data acquisition, and data analysis. Broadly accepted community guidelines and standardized metrics will enhance rigor, data quality, and alignment between laboratories. Here we propose best practices, quality controls, and data reporting recommendations to assist in the broad adoption of reliable quantitative workflows for single-cell proteomics. Comment: Supporting website: https://single-cell.net/guideline

    A principled approach and standardised software for mass spectrometry-based single-cell proteomics data analysis

    No full text
    Recent advances in sample preparation and mass spectrometry (MS) have enabled the emergence of quantitative MS-based single-cell proteomics (SCP). However, the analysis of SCP data is challenging and must address numerous problems that are inherent to both MS-based proteomics technologies and single-cell experiments. Through the development of standardised software and data, this work establishes the foundation for SCP data analysis. Our efforts have led to a comprehensive identification and understanding of the obstacles hindering the accurate extraction of biologically meaningful information from these complex data. Consequently, we have developed a computational approach explicitly designed to address these challenges, facilitating a seamless analysis of SCP data. This work reshapes the analysis of SCP data by moving efforts from dealing with the technical aspects of data analysis to focusing on answering biologically relevant questions. (BIFA - Sciences biomédicales et pharmaceutiques) -- UCL, 202

    Replication of single-cell proteomics data reveals important computational challenges

    No full text
    Introduction: Mass spectrometry-based proteomics is actively embracing quantitative, single-cell level analyses. Indeed, recent advances in sample preparation and mass spectrometry (MS) have enabled the emergence of quantitative MS-based single-cell proteomics (SCP). While exciting and promising, SCP still has many rough edges. The current analysis workflows are custom and built from scratch. The field is therefore craving standardized software that promotes principled and reproducible SCP data analyses. Areas covered: This special report is the first step toward the formalization and standardization of SCP data analysis. scp, the software that accompanies this work, successfully replicates one of the landmark SCP studies and is applicable to other experiments and designs. We created a repository containing the replicated workflow with comprehensive documentation in order to favor further dissemination and improvements of SCP data analyses. Expert opinion: Replicating SCP data analyses uncovers important challenges in SCP data analysis. We describe two such challenges in detail: batch correction and data missingness. We provide the current state-of-the-art and illustrate the associated limitations. We also highlight the intimate dependence that exists between batch effects and data missingness and offer avenues for dealing with these exciting challenges.

    The Current State of Single‐Cell Proteomics Data Analysis

    No full text
    Sound data analysis is essential to retrieve meaningful biological information from single-cell proteomics experiments. This analysis is carried out by computational methods that are assembled into workflows, and their implementations influence the conclusions that can be drawn from the data. In this work, we explore and compare the computational workflows that have been used over the last four years and identify a profound lack of consensus on how to analyze single-cell proteomics data. We highlight the need for benchmarking of computational workflows and standardization of computational tools and data, as well as carefully designed experiments. Finally, we cover the current standardization efforts that aim to fill the gap, list the remaining missing pieces, and conclude with lessons learned from the replication of published single-cell proteomics analyses.

    Replication of single-cell proteomics data reveals important computational challenges

    No full text
    Introduction: Mass spectrometry-based proteomics is actively embracing quantitative, single-cell level analyses. Indeed, recent advances in sample preparation and mass spectrometry (MS) have enabled the emergence of quantitative MS-based single-cell proteomics (SCP). While exciting and promising, SCP still has many rough edges. The current analysis workflows are custom and built from scratch. The field is therefore craving standardized software that promotes principled and reproducible SCP data analyses. Areas covered: This special report is the first step toward the formalization and standardization of SCP data analysis. scp, the software that accompanies this work, successfully replicates one of the landmark SCP studies and is applicable to other experiments and designs. We created a repository containing the replicated workflow with comprehensive documentation in order to favor further dissemination and improvements of SCP data analyses. Expert opinion: Replicating SCP data analyses uncovers important challenges in SCP data analysis. We describe two such challenges in detail: batch correction and data missingness. We provide the current state-of-the-art and illustrate the associated limitations. We also highlight the intimate dependence that exists between batch effects and data missingness and offer avenues for dealing with these exciting challenges.

    Revisiting the Thorny Issue of Missing Values in Single-Cell Proteomics

    No full text
    Missing values are a notable challenge when analyzing mass spectrometry-based proteomics data. While the field is still actively debating the best practices, the challenge increased with the emergence of mass spectrometry-based single-cell proteomics and the dramatic increase in missing values. A popular approach to deal with missing values is to perform imputation. Imputation has several drawbacks for which alternatives exist, but currently, imputation is still a practical solution widely adopted in single-cell proteomics data analysis. This perspective discusses the advantages and drawbacks of imputation. We also highlight 5 main challenges linked to missing value management in single-cell proteomics. Future developments should aim to solve these challenges, whether it is through imputation or data modeling. The perspective concludes with recommendations for reporting missing values, for reporting methods that deal with missing values, and for proper encoding of missing values

    Standardised workflow for mass spectrometry-based single-cell proteomics data processing and analysis using the scp package

    No full text
    Mass spectrometry (MS) based single-cell proteomics (SCP) explores cellular heterogeneity by focusing on the functional effectors of the cells - proteins. However, extracting meaningful biological information from MS data is far from trivial, especially with single cells. Currently, data analysis workflows are substantially different from one research team to another. Moreover, it is difficult to evaluate pipelines as ground truths are missing. Our team has developed the R/Bioconductor package called scp to provide a standardised framework for SCP data analysis. It relies on the widely used QFeatures and SingleCellExperiment data structures. In addition, we used a design containing cell lines mixed in known proportions to generate controlled variability for data analysis benchmarking. In this work, we provide a flexible data analysis protocol for SCP data using the `scp` package together with comprehensive explanations at each step of the processing. Our main steps are quality control on the feature and cell level, aggregation of the raw data into peptides and proteins, normalisation and batch correction. We validate our workflow using our ground truth data set. We illustrate how to use this modular, standardised framework and highlight some crucial steps.

    Identification and implication of tissue-enriched ligands in epithelial–endothelial crosstalk during pancreas development

    No full text
    Development of the pancreas is driven by an intrinsic program coordinated with signals from other cell types in the epithelial environment. These intercellular communications have so far been challenging to study because of the low concentration, localized production and diversity of the signals released. Here, we combined scRNAseq data with a computational interactomic approach to identify signals involved in the reciprocal interactions between the various cell types of the developing pancreas. This in silico approach yielded 40,607 potential ligand‑target interactions between the different main pancreatic cell types. Among this vast network of interactions, we focused on three ligands potentially involved in communications between epithelial and endothelial cells: BMP7 and WNT7B, expressed by pancreatic epithelial cells and predicted to target endothelial cells, and SEMA6D, involved in the reverse interaction. In situ hybridization confirmed the localized expression of Bmp7 in the pancreatic epithelial tip cells and of Wnt7b in the trunk cells. In contrast, Sema6d was enriched in endothelial cells. Functional experiments on ex vivo cultured pancreatic explants indicated that tip cell‑produced BMP7 limited development of endothelial cells. This work identified ligands with a restricted tissular and cellular distribution and highlighted the role of BMP7 in the intercellular communications contributing to vessel development and organization during pancreas organogenesis.