Gene expression network studies: usefulness for elucidating the genetic determinants of complex traits
Complex quantitative traits are measurable characteristics of living organisms that result from the interaction between multiple genes and environmental factors. Genetic loci associated with a complex trait are called "quantitative trait loci" (QTL). Recently, by treating the tissue expression levels of thousands of genes as quantitative traits, it has become possible to detect "expression QTLs" (eQTLs). Although eQTLs have been regarded as intermediate phenotypes that help clarify the biological architecture of complex traits, most studies still aim to identify a causal mutation in a single gene. This approach can succeed only when the implicated gene has a major effect on the complex trait, and therefore cannot elucidate situations where complex traits result from interactions among several genes.
This thesis proposes a more comprehensive approach that: 1) takes into account the multiple possible interactions between genes when detecting eQTLs and 2) considers how polymorphisms affecting the expression of several genes within co-expression modules could contribute to complex quantitative traits. Our contributions are as follows:
We developed a software tool that uses multivariate analysis methods to detect eQTLs, and showed that it increases the sensitivity of detection for a particular class of eQTLs.
Based on analyses of gene expression data from tissues of recombinant inbred mice, we showed that some polymorphisms can affect the expression of several genes within co-expression gene domains.
By combining eQTL detection studies with gene co-expression network analysis in recombinant inbred mouse strains, we showed that a genetic locus could be linked both to the expression of several genes within a co-expression gene domain and to a specific complex trait (i.e., left ventricular mass).
Overall, our studies detected several mechanisms by which genetic polymorphisms may be linked to the expression of several genes, which may themselves be linked to complex quantitative traits.
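The basic idea in the abstract above, treating a gene's expression level as a quantitative trait and testing it against marker genotypes, can be sketched as a single-marker scan. This is only a minimal illustration and not the multivariate tool developed in the thesis. The function names (`pearson_r`, `eqtl_scan`), the 0/1 genotype coding for recombinant inbred strains, and the toy data are all assumptions.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant marker or trait carries no association signal
    return cov / sqrt(var_x * var_y)


def eqtl_scan(genotypes, expression):
    """Return {marker: r^2}, the share of expression variance explained by each marker."""
    return {
        marker: pearson_r(calls, expression) ** 2
        for marker, calls in genotypes.items()
    }


# Hypothetical toy data: six strains, two markers (0/1 parental allele), one gene.
genotypes = {
    "marker_1": [0, 0, 0, 1, 1, 1],
    "marker_2": [0, 1, 0, 1, 0, 1],
}
expression = [1.0, 1.2, 0.9, 2.1, 2.0, 2.2]

scores = eqtl_scan(genotypes, expression)
best_marker = max(scores, key=scores.get)
```

In this toy example `marker_1` separates low-expressing from high-expressing strains, so it receives the highest score. A real analysis would add significance thresholds and multiple-testing correction across thousands of genes and markers.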
Annotation of non-coding RNAs in the Candida albicans genome using a bioinformatics method
Bioinformatics is a multidisciplinary field that uses biology, computer science, physics and mathematics to solve problems in biology. One of its topics is the analysis of genomic sequences and the prediction of non-coding RNA (ncRNA) genes. Non-coding RNAs are RNA molecules that are transcribed but not translated into protein and that have a function in the cell. Finding ncRNA genes with biochemistry and molecular biology techniques is rather difficult and relatively expensive, so predicting ncRNA genes with bioinformatics methods is an important challenge. This research describes a computational analysis to search for new ncRNAs in the pathogen Candida albicans, together with an experimental validation. Our strategy combined several ncRNA identification programs. We validated a subset of the computational predictions with a DNA microarray experiment covering 1979 regions of the genome, which identified 62 new transcripts in Candida albicans. This work also led to the development of an analysis method for tiling arrays, and presents an attempt to improve ncRNA prediction with a method based on searching for RNA motifs in sequences.
A network analysis of cofactor-protein interactions for analyzing associations between human nutrition and diseases
The involvement of vitamins and other micronutrients in intermediary metabolism was elucidated in the mid-1900s at the level of individual biochemical reactions. Biochemical pathways remain the foundational knowledge base for understanding how micronutrient adequacy modulates health at all life stages. Current daily recommended intakes were usually established on the basis of the association of a single nutrient with a single, most sensitive adverse effect, and thus neglect the interdependent and pleiotropic effects of micronutrients on biological systems. Hence, the understanding of the impact of overt or sub-clinical nutrient deficiencies on biological processes remains incomplete, and a more complete view of the role of micronutrients and their metabolic products in protein-mediated reactions is needed. We therefore integrated cofactor-protein interaction data from multiple and diverse sources into a multi-layer network representation that links cofactors, cofactor-interacting proteins, biological processes, and diseases. Network representation of this information is a key feature of the present analysis and enables the integration of data from individual biochemical reactions and protein-protein interactions into a systems view, which may guide strategies for targeted nutritional interventions aimed at improving health and preventing diseases.
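A multi-layer representation of the kind described above (cofactors, proteins, biological processes, diseases) can be sketched as a small directed graph whose nodes carry a layer label. This is an illustrative assumption about one possible data structure, not the authors' implementation. The class name `LayeredNetwork`, its methods, and all node names are hypothetical placeholders.

```python
from collections import defaultdict


class LayeredNetwork:
    """Directed links between (layer, name) nodes, e.g. cofactor -> protein -> process -> disease."""

    def __init__(self):
        self._edges = defaultdict(set)

    def add_edge(self, source, target):
        """Link two nodes; each node is a (layer, name) tuple."""
        self._edges[source].add(target)

    def reachable(self, start, layer):
        """Return sorted names of all nodes in `layer` reachable from `start`."""
        seen = {start}
        stack = [start]
        while stack:
            node = stack.pop()
            for nxt in self._edges.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return sorted(name for node_layer, name in seen if node_layer == layer)


# Hypothetical placeholder data, not taken from the study.
net = LayeredNetwork()
net.add_edge(("cofactor", "cofactor_A"), ("protein", "protein_1"))
net.add_edge(("cofactor", "cofactor_A"), ("protein", "protein_2"))
net.add_edge(("cofactor", "cofactor_B"), ("protein", "protein_2"))
net.add_edge(("protein", "protein_1"), ("process", "process_X"))
net.add_edge(("protein", "protein_2"), ("process", "process_Y"))
net.add_edge(("process", "process_X"), ("disease", "disease_P"))
net.add_edge(("process", "process_Y"), ("disease", "disease_Q"))

diseases_for_a = net.reachable(("cofactor", "cofactor_A"), "disease")
diseases_for_b = net.reachable(("cofactor", "cofactor_B"), "disease")
```

Walking the graph from a cofactor to the disease layer gives the systems view the abstract refers to: one query returns every disease connected to that cofactor through any protein and process.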
Large-Scale Automatic Feature Selection for Biomarker Discovery in High-Dimensional OMICs Data
The identification of biomarker signatures in omics molecular profiling is usually performed to predict outcomes in a precision medicine context, such as patient disease susceptibility, diagnosis, prognosis, and treatment response. To identify these signatures, we have developed a biomarker discovery tool called BioDiscML. From a collection of samples and their associated characteristics, i.e., the biomarkers (e.g., gene expression, protein levels, clinico-pathological data), BioDiscML exploits various feature selection procedures to produce signatures, each associated with a machine learning model that efficiently predicts a specified outcome. To this end, BioDiscML uses a large variety of machine learning algorithms to select the best combination of biomarkers for predicting categorical or continuous outcomes from highly unbalanced datasets. The software has been implemented to automate all machine learning steps, including data pre-processing, feature selection, model selection, and performance evaluation. BioDiscML is delivered as a stand-alone program and is available for download at https://github.com/mickaelleclercq/BioDiscML
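To make the feature selection step concrete, the sketch below ranks candidate biomarkers by how well each one separates two outcome classes. It is a simple filter-style illustration and does not reproduce BioDiscML, which combines many selection procedures and learners. The function name `rank_features`, the scoring rule, and the toy samples are assumptions.

```python
from statistics import mean, pstdev


def rank_features(samples, labels):
    """Rank feature names by a two-class separation score (best first).

    samples: list of {feature_name: value} dicts, one per sample.
    labels:  list of 0/1 outcome classes, aligned with samples.
    """
    scores = {}
    for name in samples[0]:
        group_a = [s[name] for s, y in zip(samples, labels) if y == 0]
        group_b = [s[name] for s, y in zip(samples, labels) if y == 1]
        gap = abs(mean(group_a) - mean(group_b))
        spread = pstdev(group_a) + pstdev(group_b)
        # Small constant avoids division by zero for constant features.
        scores[name] = gap / (spread + 1e-12)
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy data: six samples, three candidate biomarkers.
samples = [
    {"gene_1": 1.0, "gene_2": 5.0, "protein_1": 0.3},
    {"gene_1": 1.2, "gene_2": 4.0, "protein_1": 0.9},
    {"gene_1": 0.9, "gene_2": 6.0, "protein_1": 0.5},
    {"gene_1": 3.1, "gene_2": 5.5, "protein_1": 0.4},
    {"gene_1": 2.9, "gene_2": 4.5, "protein_1": 0.8},
    {"gene_1": 3.0, "gene_2": 5.2, "protein_1": 0.6},
]
labels = [0, 0, 0, 1, 1, 1]

ranking = rank_features(samples, labels)
signature = ranking[:1]  # keep the single best-separating feature
```

Here `gene_1` ranks first because its class means differ widely relative to its within-class spread. An automated pipeline would go further by evaluating combinations of features with cross-validated models rather than scoring each feature alone.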
DNA methylation during human adipogenesis and the impact of fructose
Background: Increased adipogenesis and altered adipocyte function contribute to the development of obesity and associated comorbidities. Fructose modifies adipocyte metabolism compared to glucose, but the regulatory mechanisms and consequences for obesity are unknown. Genome-wide methylation and global transcriptomics were analyzed in SGBS pre-adipocytes exposed to 0, 2.5, 5, and 10 mM fructose, added to a 5 mM glucose-containing medium, at 0, 24, 48, 96, 192, and 384 h following the induction of adipogenesis. Results: Time-dependent changes in DNA methylation compared to baseline (0 h) occurred during the final maturation of adipocytes, between 192 and 384 h. Larger percentages (0.1% at 192 h, 3.2% at 384 h) of differentially methylated regions (DMRs) were found in adipocytes differentiated in the glucose-containing control media than in adipocytes differentiated in fructose-supplemented media (0.0006% for 10 mM, 0.001% for 5 mM, and 0.005% for 2.5 mM at 384 h). A total of 1437 DMRs were identified in 5237 differentially expressed genes at 384 h post-induction in glucose-containing (5 mM) control media. Most of them correlated inversely with gene expression, but 666 regions correlated positively with gene expression. Conclusions: Our studies demonstrate that DNA methylation regulates or marks, in a genome-wide manner, the transformation of morphologically differentiating adipocytes (seen at 192 h) into more mature and metabolically robust adipocytes (seen at 384 h). The lowest fructose concentration (2.5 mM) had the most robust effects on methylation compared to the higher concentrations (5 and 10 mM), suggesting that fructose may play a signaling/regulatory role at lower concentrations and act as a substrate at higher concentrations.
Type: Package. Title: integrated Bayesian Modeling of eQTL data. Version: 1.0.1. Date: 2011-10-28
R topics documented: iBMQ-package, calculateThreshold, eqtlClassifier, eqtlFinder, eqtlMcmc, gene, genepos, genotype.liver, hotspotFinder, map.liver, phenotype.liver, PPA.liver, probe.liver, snp, snppos
Ancestors’ dietary patterns and environments could drive positive selection in genes involved in micronutrient metabolism—the case of cofactor transporters
BACKGROUND: During evolution, humans colonized different ecological niches and adopted a variety of subsistence strategies that gave rise to diverse selective pressures acting across the genome. Environmentally induced selection of vitamin, mineral, or other cofactor transporters could influence micronutrient-requiring molecular reactions and contribute to inter-individual variability in response to foods and nutritional interventions. METHODS: A comprehensive list of genes coding for transporters of cofactors or their precursors was built using data mining procedures from the HGDP dataset and then explored to detect evidence of positive genetic selection. This dataset was chosen because it comprises several genetically diverse worldwide populations whose ancestors evolved in different environments and thus followed various nutritional habits and lifestyles. RESULTS: We identified 312 cofactor transporter (CT) genes involved in the between-cell or sub-cellular compartment distribution of 28 cofactors derived from dietary intake. Twenty-four SNPs distributed across 14 CT genes separated populations into continental and intra-continental groups, such as African hunter-gatherers and farmers, and distinguished Native American sub-populations. Notably, four SNPs were located in SLC24A3, one of which is a known eQTL of the NCKX3 protein. CONCLUSIONS: These findings support the importance of considering an individual's genetic makeup along with their metabolic profile when tailoring personalized dietary interventions for optimizing health.
Multi-omics integration: a comparison of unsupervised clustering methodologies
With the recent developments in the field of multi-omics integration, interest in factors such as data preprocessing, the choice of integration method, and the number of different omics considered has increased. In this work, the impact of these factors is explored in the context of sample classification by comparing the performance of five unsupervised algorithms: Multiple Canonical Correlation Analysis, Multiple Co-Inertia Analysis, Multiple Factor Analysis, Joint and Individual Variation Explained, and Similarity Network Fusion. These methods were applied to three real data sets taken from the literature and to several ad hoc simulated scenarios, in order to discuss classification performance under different conditions of noise and signal strength across the data types. The impact of experimental design, feature selection, and parameter training has also been evaluated to uncover important conditions that can affect the accuracy of the result.