1,099 research outputs found

    D-AREdevil: a novel approach for discovering disease-associated rare cell populations in mass cytometry data

    Background: Advances in single-cell technologies such as mass cytometry provide increasing resolution of the complexity of cellular samples, allowing researchers to investigate cellular heterogeneity in greater depth and potentially to discover previously undetectable rare cell populations. Identifying rare cell populations is of paramount importance for understanding the onset, progression and pathogenesis of many diseases. However, their identification remains challenging because of the ever-increasing dimensionality and throughput of the data generated. Aim: This study aimed to implement a straightforward approach that efficiently supports a data analyst in identifying disease-associated rare cell populations in large and complex biological samples, within reasonable limits of time and computational infrastructure. Methods: We proposed a novel computational framework called D-AREdevil (disease-associated rare cells detection) for cytometry datasets. Its main characteristic is the combination of an anomaly detection algorithm (LOF or FiRE) that provides a continuous score for individual cells with one of the best-performing and fastest unsupervised clustering methods (FlowSOM). In our approach, the LOF score serves to select a set of candidate cells belonging to one or more subgroups of similar rare cell populations. We then tested these subgroups of rare cells for association with a patient group, disease type, clinical outcome or other characteristic of interest. Results: We reported the properties and implementation of D-AREdevil and presented an evaluation of its performance and applications on three different testing datasets based on mass cytometry data. We generated data mixed with one or more known rare cell populations at varying frequencies (below 1%) and tested the ability of our approach to identify those cells and bring them to the attention of the data analyst. This is a key step in finding cell subgroups associated with a disease or outcome of interest when their existence is not yet known. Conclusions: We proposed a novel computational framework with demonstrated good sensitivity and precision in detecting target rare cell populations present at very low frequencies in the total dataset (<1%).
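The candidate-selection step described in this abstract (score every cell with LOF, then keep the highest-scoring cells as rare-cell candidates) can be illustrated with a minimal sketch. This is not the D-AREdevil implementation: the function names `lof_scores` and `select_candidates`, the toy 2-D data, the neighbourhood size `k=3` and the 4% cut-off are all assumptions made for illustration, and the subsequent FlowSOM clustering and association testing are left out.

```python
import math


def lof_scores(points, k):
    """Naive O(n^2) Local Outlier Factor; scores well above 1 suggest outliers.

    Simplifications (assumptions of this sketch): each neighbourhood holds
    exactly k points (ties at the k-distance are not added), and the data
    contain no duplicate points, which would give a zero reachability sum.
    """
    n = len(points)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < number of points")
    dist = [[math.dist(p, q) for q in points] for p in points]

    neighbours, k_distance = [], []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        neighbours.append(order[:k])
        k_distance.append(dist[i][order[k - 1]])

    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        reach = [max(k_distance[j], dist[i][j]) for j in neighbours[i]]
        lrd.append(k / sum(reach))

    # LOF: mean density of the neighbours relative to the point's own density.
    return [sum(lrd[j] for j in neighbours[i]) / (k * lrd[i]) for i in range(n)]


def select_candidates(scores, fraction):
    """Return the indices of the top `fraction` of cells by LOF score."""
    n_keep = max(1, round(fraction * len(scores)))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:n_keep])


# Toy data: a dense 5x5 grid of "abundant" cells plus one isolated cell.
cells = [(float(x), float(y)) for x in range(5) for y in range(5)]
cells.append((20.0, 20.0))

scores = lof_scores(cells, k=3)
candidates = select_candidates(scores, fraction=0.04)
print(candidates)
```

On real cytometry data a library implementation with a spatial index would be the practical choice; the naive pairwise distance matrix here only serves to keep the arithmetic visible.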

    Uncovering Intratumoral And Intertumoral Heterogeneity Among Single-Cell Cancer Specimens

    While several tools have been developed to map axes of variation among individual cells, no analogous approaches exist for identifying axes of variation among multicellular biospecimens profiled at single-cell resolution. Developing such an approach is of great translational relevance and interest, as single-cell expression data are now often collected across numerous experimental conditions (e.g., representing different drug perturbation conditions, CRISPR knockdowns, or patients undergoing clinical trials) that need to be compared. In this work, “Phenotypic Earth Mover's Distance” (PhEMD) is presented as a solution to this problem. PhEMD is a general method for embedding a “manifold of manifolds,” in which each datapoint in the higher-level manifold (of biospecimens) represents a collection of points that span a lower-level manifold (of cells). PhEMD is applied to a newly generated, 300-biospecimen mass cytometry drug-screen experiment to map small-molecule inhibitors based on their differing effects on breast cancer cells undergoing epithelial–mesenchymal transition (EMT). These experiments highlight EGFR and MEK1/2 inhibitors as strongly halting EMT at an early stage and PI3K/mTOR/Akt inhibitors as enriching for a drug-resistant mesenchymal cell subtype characterized by high expression of phospho-S6. More generally, these experiments reveal that the final mapping of perturbation conditions has low intrinsic dimension and that the network of drugs demonstrates manifold structure, providing insight into how these single-cell experiments should be computationally modeled and visualized. In the presented drug-screen experiment, the full spectrum of perturbation effects could be learned by profiling just a small fraction (11%) of drugs. Moreover, PhEMD could be integrated with complementary datasets to infer the phenotypes of biospecimens not directly profiled at single-cell resolution. Together, these findings have major implications for conducting future drug-screen experiments: they suggest that large-scale drug screens can be conducted by measuring only a small fraction of the drugs using the most expensive high-throughput single-cell technologies, while the effects of other drugs may be inferred by mapping and extending the perturbation space. PhEMD is also applied to patient tumor biopsies to assess intertumoral heterogeneity. Applied to a melanoma dataset and a clear-cell renal cell carcinoma (ccRCC) dataset, PhEMD maps tumors in the same way it maps perturbation conditions, learning key axes along which tumors vary with respect to their tumor-infiltrating immune cells. In both datasets, PhEMD highlights a subset of tumors demonstrating a marked enrichment of exhausted CD8+ T cells. The wide variability in tumor-infiltrating immune cell abundance, and the particularly prominent exhausted CD8+ T-cell subpopulation, highlight the importance of careful patient stratification when assessing clinical response to T cell-directed immunotherapies. Altogether, this work highlights PhEMD's potential to facilitate drug discovery and patient stratification efforts by uncovering the network geometry of a large collection of single-cell biospecimens. Our varied experiments demonstrate that PhEMD is highly scalable, compatible with leading batch effect correction techniques, and generalizable to multiple experimental designs, with clear applicability to modern precision oncology efforts.
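The idea of comparing biospecimens with an Earth Mover's Distance can be sketched on a toy case. The example below treats each biospecimen as a histogram of cell proportions over three ordered, unit-spaced cell states, in which case the EMD reduces to the summed absolute difference of the cumulative distributions. The ordered-axis ground distance, the function name `emd_1d`, the state labels and the sample proportions are assumptions for illustration only; they are not PhEMD's actual ground distance or data.

```python
from itertools import accumulate


def emd_1d(p, q):
    """Earth Mover's Distance between two histograms on the same ordered,
    unit-spaced bins: the sum of absolute differences between their CDFs."""
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    total_p, total_q = sum(p), sum(q)
    if total_p <= 0 or total_q <= 0:
        raise ValueError("histograms must have positive total mass")
    cdf_p = accumulate(x / total_p for x in p)
    cdf_q = accumulate(x / total_q for x in q)
    return sum(abs(a - b) for a, b in zip(cdf_p, cdf_q))


# Hypothetical cell-state proportions per biospecimen, ordered as
# (epithelial, intermediate, mesenchymal).
samples = {
    "control": [0.10, 0.20, 0.70],
    "drug_a": [0.80, 0.15, 0.05],
    "drug_b": [0.75, 0.20, 0.05],
}

# Pairwise distance matrix between biospecimens; an embedding method could
# then be applied to this matrix to map the samples.
names = list(samples)
distances = {
    (a, b): emd_1d(samples[a], samples[b]) for a in names for b in names
}
for pair, value in distances.items():
    print(pair, round(value, 3))
```

In this toy setting the two drug-treated samples sit close together and far from the control, which is the kind of sample-level structure a distance matrix between biospecimens is meant to expose.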

    Human Haematopoietic Stem Cell Heterogeneity in Postnatal Haematopoiesis and Ontogeny

    Haematopoietic stem cell (HSC) transplants are upheld as one of the most successful therapies in regenerative medicine. While improved purification and functionality assays have advanced understanding of steady-state haematopoiesis and the bona fide human HSC, evidence suggests that significant heterogeneity exists within the HSC compartment in both post-natal and pre-natal haematopoiesis. In post-natal haematopoiesis, both CD34+ and CD34- cells possess robust in vivo repopulating potential. CD34- repopulating cells, however, exhibit distinct repopulation kinetics and the capacity to produce functional CD34+ repopulating cells, and accordingly have been speculated to reside at the apex of the human haematopoietic hierarchy. However, a low repopulating cell frequency has hindered efforts to study these HSCs. We thus aim to improve purification of CD34- HSCs and further expand knowledge of this immature stem cell pool. We successfully identified an additional positive selection marker, CD117 (c-Kit). Through limiting dilution analysis and serial transplantations of enriched CD34- repopulating cells in enhanced NSG mouse models, we observed repopulation and lineage commitment kinetics. To investigate the molecular mechanisms, we conducted single-cell RNA-seq. With these new data we have asserted the importance of human CD34- HSCs and their enormous therapeutic potential. In human foetal haematopoiesis, it is unknown whether CD34- repopulating cells emerge during ontogeny and play a role in foetal haematopoiesis. While HSC-purifying markers have been reported, most of these studies have been restricted to foetal liver, a single gestational stage, and the CD34+ population. In humans, little is known about how the HSC cell surface marker phenotype adapts to the dynamic niches in the liver during expansion, homing to the bone marrow, and bone marrow colonisation. To address this, we optimised a multi-parameter flow cytometry panel and used it to investigate the expression of a number of reported HSC cell surface markers across first- and second-trimester liver and bone marrow. Through a combination of high-dimensional data analysis and mathematical modelling, we have produced an antigen-based map of foetal haematopoietic stem cell and progenitor dynamics.

    Dendritic Spine Shape Analysis: A Clustering Perspective

    Functional properties of neurons are strongly coupled with their morphology. Changes in neuronal activity alter morphological characteristics of dendritic spines. The first step towards understanding the structure-function relationship is to group spines into the main spine classes reported in the literature. Shape analysis of dendritic spines can help neuroscientists understand the underlying relationships. Because reliable automated tools are unavailable, this analysis is currently performed manually, which is a time-intensive and subjective task. Several studies on spine shape classification have been reported in the literature; however, there is an ongoing debate on whether distinct spine shape classes exist or whether spines should be modeled through a continuum of shape variations. Another challenge is the subjectivity and bias introduced by the supervised nature of classification approaches. In this paper, we aim to address these issues by presenting a clustering perspective. In this context, clustering may serve both to confirm known patterns and to discover new ones. We perform cluster analysis on two-photon microscopic images of spines using morphological, shape, and appearance-based features and gain insights into the spine shape analysis problem. We use histogram of oriented gradients (HOG), disjunctive normal shape models (DNSM), morphological features, and intensity-profile-based features for cluster analysis. We use x-means, which selects the number of clusters automatically using the Bayesian information criterion (BIC). For all features, this analysis produces 4 clusters, and we observe the formation of at least one cluster consisting of spines that are difficult to assign to a known class. This observation supports the argument for intermediate shape types. Comment: Accepted for BioImageComputing workshop at ECCV 201

    Predictive modeling of clinical outcomes for hospitalized COVID-19 patients utilizing CyTOF and clinical data.

    In December 2019, an outbreak of a novel coronavirus initiated a global pandemic. Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is the virus that causes coronavirus disease 2019 (COVID-19). Symptoms of COVID-19 vary widely between individuals. While some infected individuals are asymptomatic, others need more extensive care and require hospitalization. Indeed, the COVID-19 pandemic was characterized by a shortage of hospital beds, which presented additional complications in providing adequate care for patients. In this study, we used a combination of T cell population data collected from mass cytometry analysis and clinical markers to form a predictive model of clinical outcomes for hospitalized COVID-19 patients. This thesis details the steps and analysis towards the design of the final model, including data acquisition and preprocessing, missing data handling via multiple imputation, and repeated-imputation inference.
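Repeated-imputation inference is commonly carried out with Rubin's rules, which pool the estimates obtained from each imputed dataset. The sketch below shows that pooling step only. The abstract does not state which pooling procedure the thesis used, so Rubin's rules are an assumption here, and the function name `pool_rubin` and the example numbers are hypothetical.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one parameter across m imputed datasets using Rubin's rules.

    estimates: the point estimate from each imputed dataset
    variances: the squared standard error of each of those estimates
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching estimates and variances from >= 2 imputations")
    pooled = mean(estimates)          # average of the per-imputation estimates
    within = mean(variances)          # average sampling variance
    between = variance(estimates)     # spread across imputations (divides by m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, total


# Hypothetical coefficient estimates from three imputed datasets.
pooled, total = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(pooled, total ** 0.5)  # pooled estimate and its standard error
```

The between-imputation term is what separates this from simply averaging the results: it widens the standard error to reflect uncertainty about the missing values themselves.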

    Single-cell immune profiling of Meniere Disease patients

    This work was supported by B-CTS-68-UGR20 Grant by FEDER Funds, PI17/1644 and PI20-1126 grants from ISCIII by FEDER Funds from the EU, CLINMON-2 from the Meniere's Society UK, and Impact Data Science (IMP0001). MF is funded by F18/00228 grant from ISCIII by FEDER Funds from the EU. AEB is funded by the EU's Horizon 2020 Research and Innovation Programme, Grant Agreement Number 848261. LF is funded by CD20/0153 grant from ISCIII by FEDER Funds from the EU. Funding for open access charge: Universidad de Granada/CBUA. Background: Meniere Disease (MD) is an inner ear syndrome characterized by episodes of vertigo, tinnitus and fluctuating sensorineural hearing loss. The pathological mechanism leading to sporadic MD is still poorly understood; however, an allergic inflammatory response seems to be involved in some patients with MD. Objective: To decipher an immune signature associated with the syndrome. Methods: We performed mass cytometry immune profiling on peripheral blood from MD patients and controls. We analyzed differences in state and in abundance of the different cellular subsets. IgE levels were quantified by ELISA on the supernatant of cultured whole blood. Results: We identified two clusters of individuals according to their single-cell cytokine profile. These clusters differed in IgE levels, in immune cell population abundance (including a reduction of CD56dim NK cells), and in cytokine expression, with a different response to bacterial and fungal antigens. Conclusion: Our results support a systemic inflammatory response in some MD patients, who show a type 2 response with an allergic phenotype and could benefit from personalized IL-4 blockers.

    Development of machine learning techniques for flow cytometry data
