61 research outputs found

    Études de réseaux d’expression génique : utilité pour l’élucidation des déterminants génétiques des traits complexes

    Complex quantitative traits are measurable characteristics of living organisms that result from the interaction between multiple genes and environmental factors. Genetic loci associated with a complex trait are called "quantitative trait loci" (QTLs). Recently, by treating the tissue expression levels of thousands of genes as quantitative traits, it has become possible to detect "expression QTLs" (eQTLs). Although eQTLs have been regarded as intermediate phenotypes that help clarify the biological architecture of complex traits, most studies still aim to identify a causal mutation in a single gene. That approach can succeed only when the implicated gene has a major effect on the complex trait, and therefore cannot elucidate cases where complex traits result from interactions among several genes. This thesis proposes a more comprehensive approach that 1) takes into account the multiple possible interactions between genes when detecting eQTLs and 2) considers how polymorphisms affecting the expression of several genes within co-expression groups may contribute to complex quantitative traits.
Our contributions are as follows. We developed a software tool that uses multivariate analysis methods to detect eQTLs and showed that it increases the sensitivity of detection for a particular class of eQTLs. Based on analyses of gene expression data from tissues of recombinant inbred mice, we showed that some polymorphisms can affect the expression of several genes within co-expression gene domains. By combining eQTL detection with gene co-expression network analysis in recombinant inbred mouse strains, we showed that a genetic locus can be linked both to the expression of multiple genes within a co-expression gene domain and to a specific complex trait (i.e., left ventricular mass). Overall, our studies detected several mechanisms by which genetic polymorphisms may be linked to the expression of several genes, which may themselves be linked to complex quantitative traits.
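
As a rough illustration of what it means to treat a gene's expression level as a quantitative trait, the Python sketch below scans markers one at a time and ranks them by the strength of their association with expression. It is a plain single-marker correlation scan, not the multivariate tool developed in the thesis. The marker names, the 0/1 genotype coding (one value per parental allele in a recombinant inbred strain), and the toy expression values are all hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant vector carries no association signal
    return sxy / math.sqrt(sxx * syy)


def scan_markers(genotypes, expression):
    """Rank markers by association with one gene's expression.

    genotypes:  dict mapping marker name -> list of 0/1 alleles, one per strain
    expression: list of expression values, one per strain (same order)
    Returns (marker, r, t) tuples sorted by |t|, strongest first.
    """
    n = len(expression)
    results = []
    for marker, alleles in genotypes.items():
        r = pearson_r(alleles, expression)
        # t statistic with n - 2 degrees of freedom; guard against |r| == 1
        t = r * math.sqrt((n - 2) / max(1.0 - r * r, 1e-12))
        results.append((marker, r, t))
    results.sort(key=lambda item: abs(item[2]), reverse=True)
    return results


# Hypothetical toy data: six strains, two markers, one expression trait.
genotypes = {
    "marker_1": [0, 0, 0, 1, 1, 1],
    "marker_2": [0, 1, 0, 1, 0, 1],
}
expression = [1.0, 1.2, 0.9, 2.1, 2.0, 2.3]
ranked = scan_markers(genotypes, expression)
```

In this toy example `marker_1` separates low-expression from high-expression strains and therefore ranks first. A real analysis would also need multiple-testing correction across markers and genes, which this sketch omits.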

    Annotation des ARN non codants du génome de Candida albicans par méthode bioinformatique

    Bioinformatics is a multidisciplinary field that uses biology, computer science, physics and mathematics to solve problems in biology. One of its topics is the analysis of genomic sequences and the prediction of non-coding RNA (ncRNA) genes. Non-coding RNAs are RNA molecules that are transcribed but not translated into protein and that have a function in the cell. Finding ncRNA genes with biochemistry and molecular biology techniques is rather difficult and relatively expensive, so predicting ncRNA genes with bioinformatics methods is an important issue. This research describes a computational analysis to search for new ncRNAs in the pathogen Candida albicans, together with an experimental validation. Our strategy combined several ncRNA identification programs. We validated a subset of the computational predictions with a microarray experiment covering 1979 regions of the genome, which identified 62 new transcripts in Candida albicans. This work also led to the development of an analysis method for tiling arrays, and presents an attempt to improve ncRNA prediction with a method based on searching for RNA motifs in the sequences.

    A network analysis of cofactor-protein interactions for analyzing associations between human nutrition and diseases

    The involvement of vitamins and other micronutrients in intermediary metabolism was elucidated in the mid-1900s at the level of individual biochemical reactions. Biochemical pathways remain the foundational knowledge base for understanding how micronutrient adequacy modulates health in all life stages. Current daily recommended intakes were usually established on the basis of the association of a single nutrient with a single, most sensitive adverse effect and thus neglect the interdependent and pleiotropic effects of micronutrients on biological systems. Hence, the understanding of the impact of overt or sub-clinical nutrient deficiencies on biological processes remains incomplete, and a more complete view of the role of micronutrients and their metabolic products in protein-mediated reactions is needed. We thus integrated cofactor-protein interaction data from multiple and diverse sources into a multi-layer network representation that links cofactors, cofactor-interacting proteins, biological processes, and diseases. Network representation of this information is a key feature of the present analysis and enables the integration of data from individual biochemical reactions and protein-protein interactions into a systems view, which may guide strategies for targeted nutritional interventions aimed at improving health and preventing diseases.
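
The layered structure described above (cofactors, cofactor-interacting proteins, biological processes, diseases) can be pictured with a minimal Python sketch in which each layer is a mapping from one node type to the next. This only illustrates the data structure and is not the authors' implementation. Every node name below is a hypothetical placeholder and does not correspond to a real biological entity.

```python
# Each dict is one layer of edges between two node types.
# All identifiers are hypothetical placeholders.
cofactor_to_proteins = {
    "cofactor_A": {"protein_1", "protein_2"},
    "cofactor_B": {"protein_2", "protein_3"},
}
protein_to_processes = {
    "protein_1": {"process_X"},
    "protein_2": {"process_X", "process_Y"},
    "protein_3": {"process_Z"},
}
process_to_diseases = {
    "process_X": {"disease_1"},
    "process_Y": {"disease_2"},
    "process_Z": set(),
}


def follow(nodes, layer):
    """Return every node reachable from `nodes` through one layer of edges."""
    reached = set()
    for node in nodes:
        reached |= layer.get(node, set())
    return reached


def diseases_for_cofactor(cofactor):
    """Walk cofactor -> proteins -> processes -> diseases."""
    proteins = follow({cofactor}, cofactor_to_proteins)
    processes = follow(proteins, protein_to_processes)
    return follow(processes, process_to_diseases)
```

Calling `diseases_for_cofactor("cofactor_A")` walks all three layers and returns the set of placeholder diseases reachable from that cofactor. An unknown cofactor yields an empty set.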

    Large-Scale Automatic Feature Selection for Biomarker Discovery in High-Dimensional OMICs Data

    The identification of biomarker signatures in omics molecular profiling is usually performed to predict outcomes in a precision medicine context, such as patient disease susceptibility, diagnosis, prognosis, and treatment response. To identify these signatures, we have developed a biomarker discovery tool called BioDiscML. From a collection of samples and their associated characteristics, i.e., the biomarkers (e.g., gene expression, protein levels, clinico-pathological data), BioDiscML exploits various feature selection procedures to produce signatures associated with machine learning models that efficiently predict a specified outcome. To this end, BioDiscML uses a large variety of machine learning algorithms to select the best combination of biomarkers for predicting categorical or continuous outcomes from highly unbalanced datasets. The software has been implemented to automate all machine learning steps, including data pre-processing, feature selection, model selection, and performance evaluation. BioDiscML is delivered as a stand-alone program and is available for download at https://github.com/mickaelleclercq/BioDiscML
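
To make the notion of feature selection concrete, here is a minimal Python sketch of a generic filter-style ranking: each candidate biomarker is scored by the standardized difference between its mean in the two outcome classes. This is a textbook filter, not the procedure used by BioDiscML, and it does not address class imbalance or model selection. The feature names and values are hypothetical.

```python
import statistics


def rank_features(samples, labels):
    """Rank features by absolute standardized mean difference between classes.

    samples: list of dicts mapping feature name -> numeric value
    labels:  list of 0/1 outcome labels, one per sample
    Returns feature names, most discriminative first.
    """
    features = sorted(samples[0])
    scores = {}
    for feature in features:
        pos = [s[feature] for s, y in zip(samples, labels) if y == 1]
        neg = [s[feature] for s, y in zip(samples, labels) if y == 0]
        spread = statistics.pstdev(pos + neg)
        if spread == 0:
            scores[feature] = 0.0  # a constant feature cannot discriminate
        else:
            diff = statistics.mean(pos) - statistics.mean(neg)
            scores[feature] = abs(diff) / spread
    return sorted(features, key=lambda f: scores[f], reverse=True)


# Hypothetical toy data: four samples, three candidate biomarkers.
samples = [
    {"gene_a": 1.0, "gene_b": 0.5, "gene_c": 3.0},
    {"gene_a": 1.1, "gene_b": 0.9, "gene_c": 3.0},
    {"gene_a": 2.0, "gene_b": 0.6, "gene_c": 3.0},
    {"gene_a": 2.2, "gene_b": 0.8, "gene_c": 3.0},
]
labels = [0, 0, 1, 1]
ranking = rank_features(samples, labels)
```

Here `gene_a` differs clearly between the two classes and ranks first, while the other two features carry little or no signal. A signature would then be built from the top-ranked features and evaluated with a held-out set or cross-validation.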

    Underlying Event measurements in pp collisions at $\sqrt{s} = 0.9$ and 7 TeV with the ALICE experiment at the LHC


    Type: Package; Title: integrated Bayesian Modeling of eQTL data; Version: 1.0.1; Date: 2011-10-28

    R topics documented: iBMQ-package, calculateThreshold, eqtlClassifier, eqtlFinder, eqtlMcmc, gene, genepos, genotype.liver, hotspotFinder, map.liver, phenotype.liver, PPA.liver, probe.liver, snp, snppos

    Ancestors’ dietary patterns and environments could drive positive selection in genes involved in micronutrient metabolism—the case of cofactor transporters

    BACKGROUND: During evolution, humans colonized different ecological niches and adopted a variety of subsistence strategies that gave rise to diverse selective pressures acting across the genome. Environmentally induced selection of vitamin, mineral, or other cofactor transporters could influence micronutrient-requiring molecular reactions and contribute to inter-individual variability in response to foods and nutritional interventions. METHODS: A comprehensive list of genes coding for transporters of cofactors or their precursors was built using data mining procedures from the HGDP dataset and then explored to detect evidence of positive genetic selection. This dataset was chosen because it comprises several genetically diverse worldwide populations whose ancestors evolved in different environments and thus followed various nutritional habits and lifestyles. RESULTS: We identified 312 cofactor transporter (CT) genes involved in between-cell or sub-cellular compartment distribution of 28 cofactors derived from dietary intake. Twenty-four SNPs distributed across 14 CT genes separated populations into continental and intra-continental groups, such as African hunter-gatherers and farmers, and distinguished between Native American sub-populations. Notably, four SNPs were located in SLC24A3, with one being a known eQTL of the NCKX3 protein. CONCLUSIONS: These findings could support the importance of considering an individual's genetic makeup along with their metabolic profile when tailoring personalized dietary interventions for optimizing health.

    Multi-omics integration-a comparison of unsupervised clustering methodologies

    With the recent developments in the field of multi-omics integration, interest in factors such as data preprocessing, the choice of integration method and the number of different omics considered has increased. In this work, the impact of these factors is explored for the problem of sample classification by comparing the performance of five unsupervised algorithms: Multiple Canonical Correlation Analysis, Multiple Co-Inertia Analysis, Multiple Factor Analysis, Joint and Individual Variation Explained, and Similarity Network Fusion. These methods were applied to three real data sets taken from the literature and to several ad hoc simulated scenarios to discuss classification performance under different conditions of noise and signal strength across the data types. The impact of experimental design, feature selection and parameter training has also been evaluated to identify important conditions that can affect the accuracy of the result.

    New Developments and Possibilities in Reanalysis and Reinterpretation of Whole Exome Sequencing Datasets for Unsolved Rare Diseases Using Machine Learning Approaches

    Rare diseases impact the lives of 300 million people in the world. Rapid advances in bioinformatics and genomic technologies have enabled the discovery of the causes of 20–30% of rare diseases. However, most rare diseases remain unsolved to date. Newer tools and the availability of high-throughput sequencing data have enabled the reanalysis of previously undiagnosed patients. In this review, we have systematically compiled the latest developments in the discovery of the genetic causes of rare diseases using machine learning methods. Importantly, we have detailed the methods available to reanalyze existing whole exome sequencing data of unsolved rare diseases. We have identified different reanalysis methodologies to solve problems associated with sequence alterations/mutations, variation re-annotation, protein stability, splice isoform malfunctions and oligogenic analysis. In addition, we give an overview of new developments in the field of rare disease research using whole genome sequencing data and other omics.