6 research outputs found

    Apprentissage de graphes causaux à partir de données continues ou mixtes d’intérêt biologique ou clinique

    No full text
    The work in this thesis follows the theory primarily developed by Judea Pearl on causal diagrams, graphical models that allow all causal quantities of interest to be derived formally and intuitively. We address the problem of causal network inference from observational data alone, i.e., without any intervention from the experimenter. In particular, we propose to improve existing methods to make them more suitable for analyzing real-world data, by freeing them as much as possible from constraints on data distributions, and by making them more interpretable. We propose an extension of MIIC, a constraint-based information-theoretic approach to recover the equivalence class of the causal graph from observations. Our contribution is an optimal discretization algorithm based on the minimum description length principle to simultaneously estimate the value of mutual (and multivariate) information and evaluate its significance between samples of variables of any nature: continuous, categorical or mixed. We use these developments to analyze mixed datasets of clinical interest (medical records of patients with cognitive disorders, or of patients with breast cancer treated with neoadjuvant chemotherapy) or of biological interest (gene regulation networks of hematopoietic stem and precursor cells).

    Learning clinical networks from medical records based on information estimates in mixed-type data

    No full text
    The precise diagnosis of complex diseases requires integrating a large amount of information from heterogeneous clinical and biomedical data, whose direct and indirect interdependences are notoriously difficult to assess. To this end, we propose an efficient computational approach to simultaneously compute and assess the significance of multivariate information between any combination of mixed-type (continuous/categorical) variables. The method is then used to uncover direct, indirect and possibly causal relationships between mixed-type data from medical records, by extending a recent machine learning method to reconstruct graphical models beyond simple categorical datasets. The method is shown to outperform existing tools on benchmark mixed-type datasets, before being applied to analyze the medical records of elderly patients with cognitive disorders from La Pitié-Salpêtrière Hospital, Paris. The resulting clinical network visually captures the global interdependences in these medical records and some facets of clinical diagnosis practice, without any specific hypothesis or prior knowledge of clinically relevant information. In particular, it provides some physiological insights linking the consequences of cerebrovascular accidents to the atrophy of important brain structures associated with cognitive impairment.

    Inferring Gene Networks in Bone Marrow Hematopoietic Stem Cell-Supporting Stromal Niche Populations

    No full text
    The cardinal property of bone marrow (BM) stromal cells is their capacity to contribute to hematopoietic stem cell (HSC) niches by providing mediators assisting HSC functions. In this study we first contrasted transcriptomes of stromal cells at different developmental stages and then included a large number of HSC-supportive and non-supportive samples. Application of a combination of algorithms, comprising one identifying reliable paths and potential causative relationships in complex systems, revealed gene networks characteristic of the BM stromal HSC-supportive capacity and of defined niche populations of perivascular cells, osteoblasts, and mesenchymal stromal cells. Inclusion of single-cell transcriptomes enabled establishing, for the perivascular cell subset, a partially oriented graph of direct gene-to-gene interactions. As proof of concept we showed that R-spondin-2, expressed by the perivascular subset, synergized with Kit ligand to amplify ex vivo hematopoietic precursors. By identifying classifiers and hubs, this study constitutes a resource to unravel candidate BM stromal mediators.

    Metabolically Primed Multipotent Hematopoietic Progenitors Fuel Innate Immunity

    No full text
    Following infection, hematopoietic stem and progenitor cells (HSPCs) support immunity by increasing the rate of innate immune cell production, but the metabolic cues that guide this process are unknown. To address this question, we developed MetaFate, a method to trace the metabolic expression state and developmental fate of single cells in vivo. Using MetaFate we identified a gene expression program of metabolic enzymes and transporters that confers differences in myeloid differentiation potential in a subset of HSPCs that express CD62L. Using single-cell metabolic profiling, we confirmed that CD62L-high myeloid-biased HSPCs have an increased dependency on oxidative phosphorylation and glucose metabolism. Importantly, metabolism actively regulates immune-cell production, with overexpression of the glucose-6-phosphate dehydrogenase enzyme of the pentose phosphate pathway skewing MPP output from B-lymphocytes towards the myeloid lineages, and expansion of CD62L-high HSPCs occurring to support emergency myelopoiesis. Collectively, our data reveal the metabolic cues that instruct innate immune cell development, highlighting a key role for the pentose phosphate pathway. More broadly, our results show that HSPC metabolism can be manipulated to alter the cellular composition of the immune system.