
    From Cellular Characteristics to Disease Diagnosis: Uncovering Phenotypes with Supercells

    Cell heterogeneity and the inherent complexity due to the interplay of multiple molecular processes within the cell pose difficult challenges for current single-cell biology. We introduce an approach that identifies a disease phenotype from multiparameter single-cell measurements, which is based on the concept of “supercell statistics”, a single-cell-based averaging procedure followed by a machine learning classification scheme. We are able to assess the optimal tradeoff between the number of single cells averaged and the number of measurements needed to capture phenotypic differences between healthy and diseased patients, as well as between different diseases that are difficult to diagnose otherwise. We apply our approach to two kinds of single-cell datasets, addressing the diagnosis of a premature aging disorder using images of cell nuclei, as well as the phenotypes of two non-infectious uveitides (the ocular manifestations of Behçet’s disease and sarcoidosis) based on multicolor flow cytometry. In the former case, one nuclear shape measurement taken over a group of 30 cells is sufficient to classify samples as healthy or diseased, in agreement with usual laboratory practice. In the latter, our method is able to identify a minimal set of 5 markers that accurately predict Behçet’s disease and sarcoidosis. This is the first time that a quantitative phenotypic distinction between these two diseases has been achieved. To obtain this clear phenotypic signature, about one hundred CD8+ T cells need to be measured. Although the molecular markers identified have been reported to be important players in autoimmune disorders, this is the first report pointing out that CD8+ T cells can be used to distinguish two systemic inflammatory diseases. 
Beyond these specific cases, the approach proposed here is applicable to datasets generated by other kinds of state-of-the-art and forthcoming single-cell technologies, such as multidimensional mass cytometry, single-cell gene expression, and single-cell full genome sequencing techniques.
    Affiliations:
    Candia, Julian Marcelo. University of Maryland, United States; Consejo Nacional de Investigaciones Científicas y Técnicas, Centro Científico Tecnológico Conicet - La Plata, Instituto de Física de Líquidos y Sistemas Biológicos; Universidad Nacional de La Plata, Facultad de Ciencias Exactas, Instituto de Física de Líquidos y Sistemas Biológicos, Argentina
    Maunu, Ryan. University of Maryland, United States
    Driscoll, Meghan. University of Maryland, United States
    Biancotto, Angélique. National Institutes of Health, United States
    Dagur, Pradeep. National Institutes of Health, United States
    McCoy Jr., J Philip. National Institutes of Health, United States
    Nida Sen, H. National Institutes of Health, United States
    Wei, Lai. National Institutes of Health, United States
    Maritan, Amos. Università di Padova, Italy
    Cao, Kan. University of Maryland, United States
    Nussenblatt, Robert B. National Institutes of Health, United States
    Banavar, Jayanth R. University of Maryland, United States
    Losert, Wolfgang. University of Maryland, United States
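The averaging idea behind "supercell statistics" can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic, the function names `make_supercells` and `threshold_accuracy` are hypothetical, and a simple midpoint threshold (evaluated in-sample) stands in for the machine learning classifier mentioned in the abstract. It only shows how averaging groups of single cells can sharpen a small per-cell difference between two populations.

```python
import random
import statistics


def make_supercells(cells, group_size, rng):
    """Average random, non-overlapping groups of `group_size` single-cell values."""
    shuffled = cells[:]
    rng.shuffle(shuffled)
    n_groups = len(shuffled) // group_size
    return [
        statistics.fmean(shuffled[i * group_size:(i + 1) * group_size])
        for i in range(n_groups)
    ]


def threshold_accuracy(healthy, diseased):
    """In-sample accuracy of a midpoint threshold (stand-in for a real classifier).

    Assumes the diseased group has the higher mean, as in the synthetic data below.
    """
    threshold = (statistics.fmean(healthy) + statistics.fmean(diseased)) / 2
    correct = sum(v < threshold for v in healthy) + sum(v >= threshold for v in diseased)
    return correct / (len(healthy) + len(diseased))


rng = random.Random(0)
# Synthetic single-cell measurements (made-up numbers, not data from the paper).
healthy_cells = [rng.gauss(0.0, 1.0) for _ in range(3000)]
diseased_cells = [rng.gauss(0.5, 1.0) for _ in range(3000)]

accuracy_by_group_size = {}
for n in (1, 10, 30):
    acc = threshold_accuracy(
        make_supercells(healthy_cells, n, rng),
        make_supercells(diseased_cells, n, rng),
    )
    accuracy_by_group_size[n] = acc
    print(f"group size {n:>2}: accuracy {acc:.2f}")
```

Larger groups reduce the spread of each supercell value, but they also leave fewer supercells per sample, which is the tradeoff between cells averaged and measurements needed that the abstract describes.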

    A computational pipeline for the diagnosis of CVID patients

    Common variable immunodeficiency (CVID) is one of the most frequently diagnosed primary antibody deficiencies (PADs), a group of disorders characterized by a decrease in one or more immunoglobulin (sub)classes and/or impaired antibody responses caused by inborn defects in B cells in the absence of other major immune defects. CVID patients suffer from recurrent infections and disease-related non-infectious complications such as autoimmune manifestations, lymphoproliferation, and malignancies. A timely diagnosis is essential for optimal follow-up and treatment. However, CVID is by definition a diagnosis of exclusion, thereby covering a heterogeneous patient population and making it difficult to establish a definite diagnosis. To aid the diagnosis of CVID patients, and distinguish them from other PADs, we developed a machine learning pipeline that performs automated diagnosis based on flow cytometric immunophenotyping. Using this pipeline, we analyzed the immunophenotypic profile in a pediatric and adult cohort of 28 patients with CVID, 23 patients with idiopathic primary hypogammaglobulinemia, 21 patients with IgG subclass deficiency, six patients with isolated IgA deficiency, one patient with isolated IgM deficiency, and 100 unrelated healthy controls. Flow cytometry analysis is traditionally done by manual identification of the cell populations of interest. Yet, this approach has severe limitations, including subjectivity of the manual gating and bias toward known populations. To overcome these limitations, we here propose an automated computational flow cytometry pipeline that successfully distinguishes CVID phenotypes from other PADs and healthy controls. 
Compared to traditional manual analysis, our pipeline is fully automated: it performs quality control and data pre-processing, identifies cell populations (gating), and derives features from these populations to build a machine learning classifier that distinguishes CVID from other PADs and healthy controls. This results in a more reproducible flow cytometry analysis and improves the diagnosis compared to manual analysis: our pipelines achieve on average a balanced accuracy score of 0.93 (+/- 0.07), whereas the manually extracted populations yield an average balanced accuracy score of 0.72 (+/- 0.23).
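For readers unfamiliar with the metric quoted above, the following is a minimal sketch of balanced accuracy under its standard definition, the mean of per-class recall. The abstract does not state exactly how the authors computed their scores, so this is an assumption, and the labels below are made up to mirror only the cohort imbalance (100 controls against 28 CVID patients).

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so every class counts equally."""
    total = defaultdict(int)
    correct = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        correct[truth] += truth == pred
    return sum(correct[c] / total[c] for c in total) / len(total)


# Made-up labels: 100 controls and 28 CVID patients.
y_true = ["control"] * 100 + ["CVID"] * 28
# A degenerate classifier that always answers "control".
y_pred = ["control"] * 128

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(plain_accuracy)                     # 0.78125
print(balanced_accuracy(y_true, y_pred))  # 0.5
```

A classifier that never detects CVID still reaches 78% plain accuracy on this imbalanced cohort, while its balanced accuracy is 0.5, which is why the balanced score is the more informative number for comparisons like the one reported here.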

    Automated Discrimination of Pathological Regions in Tissue Images: Unsupervised Clustering vs Supervised SVM Classification

    Recognizing and isolating cancerous cells from non-pathological tissue areas (e.g. connective stroma) is crucial for fast and objective immunohistochemical analysis of tissue images. This operation allows the further application of fully-automated techniques for quantitative evaluation of protein activity, since it avoids both the prior manual selection of representative pathological areas in the image and the need to take pictures only in the purely cancerous portions of the tissue. In this paper we present a fully-automated method based on unsupervised clustering that performs tissue segmentations highly comparable with those provided by a skilled operator, achieving on average an accuracy of 90%. Experimental results on a heterogeneous dataset of immunohistochemical lung cancer tissue images demonstrate that our proposed unsupervised approach exceeds the accuracy of a theoretically superior supervised method such as Support Vector Machine (SVM) by 8%.

    Essential guidelines for computational method benchmarking

    In computational biology and other sciences, researchers are frequently faced with a choice between several computational methods for performing data analyses. Benchmarking studies aim to rigorously compare the performance of different methods using well-characterized benchmark datasets, to determine the strengths of each method or to provide recommendations regarding suitable choices of methods for an analysis. However, benchmarking studies must be carefully designed and implemented to provide accurate, unbiased, and informative results. Here, we summarize key practical guidelines and recommendations for performing high-quality benchmarking analyses, based on our experiences in computational biology.

    The Evolving Landscape of Flowcytometric Minimal Residual Disease Monitoring in B-Cell Precursor Acute Lymphoblastic Leukemia

    Detection of minimal residual disease (MRD) is a major independent prognostic marker in the clinical management of pediatric and adult B-cell precursor Acute Lymphoblastic Leukemia (BCP-ALL), and risk stratification nowadays heavily relies on MRD diagnostics. MRD can be detected using flow cytometry based on aberrant expression of markers (antigens) during malignant B-cell maturation. Recent advances highlight the significance of novel markers (e.g., CD58, CD81, CD304, CD73, CD66c, and CD123), improving MRD identification. Second and next-generation flow cytometry, such as the EuroFlow consortium’s eight-color protocol, can achieve sensitivities down to 10⁻⁵ (comparable with the PCR-based method) if sufficient cells are acquired. The introduction of targeted therapies (especially those targeting CD19, such as blinatumomab or CAR-T19) introduces several challenges for flow cytometric MRD analysis, such as the occurrence of CD19-negative relapses. Therefore, innovative flow cytometry panels, including alternative B-cell markers (e.g., CD22 and CD24), have been designed. (Semi-)automated MRD assessment, employing machine learning algorithms and clustering tools, shows promise but does not yet allow robust and sensitive automated analysis of MRD. Future directions involve integrating artificial intelligence, further automation, and exploring multicolor spectral flow cytometry to standardize MRD assessment and enhance diagnostic and prognostic robustness of MRD diagnostics in BCP-ALL.
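The condition "if sufficient cells are acquired" can be made concrete with a short back-of-the-envelope sketch. Two things here are assumptions rather than statements from the abstract: the minimum cluster of 10 events needed to call a population is an illustrative value, and the Poisson model of rare-event counts is a common simplification. The function names `cells_required` and `prob_at_least` are hypothetical.

```python
import math


def cells_required(one_in_n, min_events):
    """Cells to acquire so that a 1-in-`one_in_n` population yields `min_events` events on average."""
    return one_in_n * min_events


def prob_at_least(min_events, expected_events):
    """Poisson probability of observing at least `min_events` events."""
    below = sum(
        math.exp(-expected_events) * expected_events ** k / math.factorial(k)
        for k in range(min_events)
    )
    return 1.0 - below


ONE_IN_N = 100_000   # a sensitivity of 10^-5 means one MRD cell per 100,000 cells
MIN_EVENTS = 10      # assumed minimum cluster size (illustrative, not from the abstract)

total_cells = cells_required(ONE_IN_N, MIN_EVENTS)
print(total_cells)  # 1000000

# Chance of actually seeing the minimum cluster at that acquisition depth,
# and again when twice as many cells are acquired.
p_at_minimum = prob_at_least(MIN_EVENTS, total_cells / ONE_IN_N)
p_at_double = prob_at_least(MIN_EVENTS, 2 * total_cells / ONE_IN_N)
print(round(p_at_minimum, 2), round(p_at_double, 2))
```

Under these assumptions, acquiring only the bare minimum gives little better than an even chance of observing the required cluster, whereas doubling the acquisition makes detection very likely. This illustrates why the number of acquired cells limits the achievable sensitivity.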

    Automated and reproducible cell identification in mass cytometry using neural networks

    The principal use of mass cytometry is to identify distinct cell types and changes in their composition, phenotype and function in different samples and conditions. Combining data from different studies has the potential to increase the power of these discoveries in diverse fields such as immunology, oncology and infection. However, current tools lack scalable, reproducible and automated methods to integrate and study mass cytometry data sets, which often use heterogeneous approaches to study similar samples. To address these limitations, we present two novel developments: (1) a pre-trained cell identification model named Immunopred that allows automated identification of immune cells without user-defined prior knowledge of expected cell types and (2) a fully automated cytometry meta-analysis pipeline built around Immunopred. We evaluated this pipeline on six COVID-19 study data sets comprising 270 unique samples and uncovered novel significant phenotypic changes in the wider immune landscape of COVID-19 that were not identified when each study was analyzed individually. Applied widely, our approach will support the discovery of novel findings in research areas where cytometry data sets are available for integration.