17 research outputs found
Network analysis of genomic and clinical data
This thesis develops a novel methodological approach for reconstructing networks from biological and clinical data, which overcomes some technical and computational problems of existing methods. Our algorithm, MIIC, handles discrete, continuous and mixed datasets with any type of probability and density distributions, and accounts for the possible presence of latent variables, which matter in real-world settings where it is not always possible to collect all relevant variables. MIIC is available through a web interface at https://miic.curie.fr and as an R package on CRAN. The second part of the thesis is devoted to real-life applications, from the reconstruction of a gene regulatory network and a protein contact map to the study of clinical records of patients affected by cognitive disorders or breast cancer. MIIC can help physicians visualize and analyse direct, indirect and possibly causal effects in patient medical records, discover novel, unexpected direct interdependencies between clinically relevant information, or explain a missing connection through other links found in the reconstruction.
Network reconstruction from genomic and clinical data (Reconstruction de réseaux à partir de données génomiques et cliniques)
This thesis develops a novel methodological approach for reconstructing networks from biological and clinical data, which overcomes some technical and computational problems of existing methods. Our algorithm, MIIC, handles discrete, continuous and mixed datasets with any type of probability and density distributions, and accounts for the possible presence of latent variables, which matter in real-world settings where it is not always possible to collect all relevant variables. MIIC is available through a web interface at https://miic.curie.fr and as an R package on CRAN. The second part of the thesis is devoted to real-life applications, from the reconstruction of a gene regulatory network and a protein contact map to the study of clinical records of patients affected by cognitive disorders or breast cancer. MIIC can help physicians visualize and analyse direct, indirect and possibly causal effects in patient medical records, discover novel, unexpected direct interdependencies between clinically relevant information, or explain a missing connection through other links found in the reconstruction.
MIIC online: a web server to reconstruct causal or non-causal networks from non-perturbative data
Learning clinical networks from medical records based on information estimates in mixed-type data
The precise diagnosis of complex diseases requires integrating a large amount of information from heterogeneous clinical and biomedical data, whose direct and indirect interdependences are notoriously difficult to assess. To this end, we propose an efficient computational approach to simultaneously compute and assess the significance of multivariate information between any combination of mixed-type (continuous/categorical) variables. The method is then used to uncover direct, indirect and possibly causal relationships between mixed-type data from medical records, by extending a recent machine learning method to reconstruct graphical models beyond simple categorical datasets. The method is shown to outperform existing tools on benchmark mixed-type datasets, before being applied to analyze the medical records of elderly patients with cognitive disorders from La Pitié-Salpêtrière Hospital, Paris. The resulting clinical network visually captures the global interdependences in these medical records and some facets of clinical diagnosis practice, without any specific hypothesis or prior knowledge of clinically relevant information. In particular, it provides some physiological insights linking the consequence of cerebrovascular accidents to the atrophy of important brain structures associated with cognitive impairment.
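To make the idea of an information estimate between a continuous and a categorical variable concrete, here is a minimal Python sketch. It is not the method of the paper: it substitutes a plain equal-frequency binning for whatever discretization the authors use, and the function names `equal_frequency_bins` and `mutual_information`, as well as the number of bins, are assumptions chosen for illustration only.

```python
import math
from collections import Counter


def equal_frequency_bins(values, n_bins):
    """Map each continuous value to a bin index so bins hold roughly equal counts.

    Assumption: a simple rank-based binning, used here only as a stand-in.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = rank * n_bins // len(values)
    return bins


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in nats for two discrete sequences."""
    n = len(xs)
    n_xy = Counter(zip(xs, ys))
    n_x = Counter(xs)
    n_y = Counter(ys)
    return sum(
        (c / n) * math.log(c * n / (n_x[x] * n_y[y]))
        for (x, y), c in n_xy.items()
    )


# Hypothetical toy data: a continuous measurement and a categorical label.
measurement = [0.1, 0.2, 0.3, 0.4, 5.1, 5.2, 5.3, 5.4]
label = ["a", "a", "a", "a", "b", "b", "b", "b"]

binned = equal_frequency_bins(measurement, n_bins=2)
mi = mutual_information(binned, label)
```

In this toy case the label is fully determined by the binned measurement, so the estimate equals the entropy of a balanced binary label. On real records the estimate depends strongly on the number of bins and on the sample size, which is why a significance assessment such as the one the abstract mentions is needed on top of the raw value.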
Learning causal networks with latent variables from multivariate information in genomic data
Learning causal networks from large-scale genomic data remains challenging in the absence of time series or controlled perturbation experiments. We report an information-theoretic method which learns a large class of causal or non-causal graphical models from purely observational data, while including the effects of unobserved latent variables, commonly found in many genomic datasets. Starting from a complete graph, the method iteratively removes dispensable edges by uncovering significant information contributions from indirect paths, and assesses edge-specific confidences from randomization of available data. The remaining edges are then oriented based on the signature of causality in observational data. The approach and associated algorithm, miic, outperform earlier methods on a broad range of benchmark networks. Causal network reconstructions are presented at different biological size and time scales, from gene regulation in single cells to whole genome duplication in tumor development as well as long-term evolution of vertebrates. Miic is publicly available at https://github.com/miicTeam/MIIC.
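The edge-removal step described above can be illustrated with a deliberately simplified Python sketch. It is not the miic algorithm: it starts from a complete graph and drops an edge X-Y when the plug-in mutual information I(X;Y), or the conditional mutual information I(X;Y|Z) for a single other variable Z, falls below a fixed cutoff. The function names, the single-variable conditioning and the `threshold` value are all assumptions made for illustration; edge confidences and orientation are not covered.

```python
import math
from collections import Counter
from itertools import combinations


def mutual_information(xs, ys, zs=None):
    """Plug-in estimate (nats) of I(X;Y), or of I(X;Y|Z) when zs is given."""
    n = len(xs)
    if zs is None:
        zs = [0] * n  # a constant conditioning variable reduces to I(X;Y)
    n_xyz = Counter(zip(xs, ys, zs))
    n_xz = Counter(zip(xs, zs))
    n_yz = Counter(zip(ys, zs))
    n_z = Counter(zs)
    return sum(
        (c / n) * math.log(c * n_z[z] / (n_xz[(x, z)] * n_yz[(y, z)]))
        for (x, y, z), c in n_xyz.items()
    )


def prune_skeleton(data, threshold=0.01):
    """Return the undirected edges kept after pruning a complete graph.

    data maps variable names to equal-length lists of discrete values.
    threshold is a hypothetical fixed cutoff, not a value from the paper.
    """
    names = list(data)
    edges = {frozenset(pair) for pair in combinations(names, 2)}
    for x, y in combinations(names, 2):
        # Try no conditioning first, then each single other variable.
        for z in [None] + [v for v in names if v not in (x, y)]:
            zs = None if z is None else data[z]
            if mutual_information(data[x], data[y], zs) < threshold:
                edges.discard(frozenset((x, y)))
                break
    return edges


# Hypothetical toy data in which X and Y depend on each other only through Z.
rows = (
    [(0, 0, 0)] * 9 + [(0, 1, 0)] * 3 + [(1, 0, 0)] * 3 + [(1, 1, 0)] * 1
    + [(1, 1, 1)] * 9 + [(1, 0, 1)] * 3 + [(0, 1, 1)] * 3 + [(0, 0, 1)] * 1
)
data = {
    "X": [r[0] for r in rows],
    "Y": [r[1] for r in rows],
    "Z": [r[2] for r in rows],
}
skeleton = prune_skeleton(data)
```

On this toy dataset X and Y are marginally dependent but independent given Z, so the X-Y edge is removed while X-Z and Y-Z are kept. A fixed cutoff is the weakest part of the sketch, since a sensible threshold depends on sample size and on the number of levels of each variable.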
Network reconstruction at tissue level.
(A) Tumor development and drug resistance in the presence of tetraploid tumor cells following whole genome duplication (WGD). (B) Ploidy distribution in the 807 tumor samples and (C) genomic alterations: ploidy, mutations, normalized under-expression and over-expression changes from COSMIC database [34]. (D) Genomic alteration network obtained between average ploidy (violet), gene mutations (yellow, lower case) and under- or over-expressions (green, upper case). Graph predicted with miic R-package and visualized using cytoscape (blue edges correspond to repressions).
Network reconstruction at organismal and phylogenetic levels.
(A) Two rounds of whole genome duplication (WGD) have led to the evolutionary radiation of vertebrates (and similarly with a third 300-MY-old WGD in teleost fish). (B) Biased distributions of genomic properties within ‘non-ohnolog’ and ‘ohnolog’ genes retained from WGDs in early vertebrates [45]. Numbers in brackets indicate the numbers of genes for which each property is identified (see Materials and Methods and S1 Data). (C) Genomic property network of human genes, see main text. Graph predicted with miic R-package and visualized using cytoscape (blue edges correspond to repressions).
Network reconstruction at cellular level.
(A) Hematopoietic / endothelial differentiation in single cells from mouse embryos [24]. (B) Principal component analysis and (C) K-means clustering of gene expression data [24] with histograms showing the relative proportions of cell populations at each data point (E7.0 to E8.25). (D) Hematopoietic / endothelial differentiation regulatory network between hematopoietic specific (red), endothelial (violet), common (blue) and unclassified (gray) TFs. Graph predicted with miic R-package and visualized using cytoscape (blue edges correspond to repressions).
Inferring Gene Networks in Bone Marrow Hematopoietic Stem Cell-Supporting Stromal Niche Populations
The cardinal property of bone marrow (BM) stromal cells is their capacity to contribute to hematopoietic stem cell (HSC) niches by providing mediators assisting HSC functions. In this study, we first contrasted transcriptomes of stromal cells at different developmental stages and then included a large number of HSC-supportive and non-supportive samples. Application of a combination of algorithms, including one that identifies reliable paths and potential causative relationships in complex systems, revealed gene networks characteristic of the BM stromal HSC-supportive capacity and of defined niche populations of perivascular cells, osteoblasts, and mesenchymal stromal cells. Inclusion of single-cell transcriptomes made it possible to establish, for the perivascular cell subset, a partially oriented graph of direct gene-to-gene interactions. As a proof of concept, we showed that R-spondin-2, expressed by the perivascular subset, synergized with Kit ligand to amplify ex vivo hematopoietic precursors. By identifying classifiers and hubs, this study constitutes a resource for uncovering candidate BM stromal mediators.