1,152 research outputs found

    Detection of regulator genes and eQTLs in gene networks

    Genetic differences between individuals associated with quantitative phenotypic traits, including disease states, are usually found in non-coding genomic regions. These genetic variants are often also associated with differences in expression levels of nearby genes (they are "expression quantitative trait loci", or eQTLs for short) and presumably play a gene regulatory role, affecting the status of molecular networks of interacting genes, proteins and metabolites. Computational systems biology approaches to reconstruct causal gene networks from large-scale omics data have therefore become essential to understand the structure of networks controlled by eQTLs together with other regulatory genes, and to generate detailed hypotheses about the molecular mechanisms that lead from genotype to phenotype. Here we review the main analytical methods and software to identify eQTLs and their associated genes, to reconstruct co-expression networks and modules, to reconstruct causal Bayesian gene and module networks, and to validate predicted networks in silico. Comment: minor revision with typos corrected; review article; 24 pages, 2 figures

    Strategies for the intelligent integration of genetic variance information in multiscale models of neurodegenerative diseases

    A more complete understanding of the genetic architecture of complex traits and diseases can maximize the utility of human genetics in disease screening, diagnosis, prognosis, and therapy. Undoubtedly, the identification of genetic variants linked to polygenic and complex diseases is of supreme interest for clinicians, geneticists, patients, and the public. Furthermore, determining how genetic variants affect an individual’s health and translating this knowledge into the development of new medicine can revolutionize the treatment of most common deleterious diseases. However, this requires the correlation of genetic variants with specific diseases, and accurate functional assessment of genetic variation in human DNA sequencing studies is still a nontrivial challenge in clinical genomics. Assigning functional consequences and clinical significance to genetic variants is an important step in human genome interpretation. The translation of genetic variants into functional molecular mechanisms is essential in disease pathogenesis and, eventually, in therapy design. Although various statistical methods are helpful to short-list genetic variants for fine-mapping investigation, demonstrating their role in a molecular mechanism requires knowledge of their functional consequences. This undoubtedly requires comprehensive investigation. Experimental interpretation of all the observed genetic variants is still impractical. Thus, the prediction of functional and regulatory consequences of genetic variants using in-silico approaches is an important step in the discovery of clinically actionable knowledge. Since the interactions between phenotypes and genotypes are multi-layered and biologically complex, such associations present several challenges and simultaneously offer many opportunities to design new protocols for in-silico variant evaluation strategies.
This thesis presents a comprehensive protocol based on a causal reasoning algorithm that harvests and integrates multifaceted genetic and biomedical knowledge with various types of entities from several resources and repositories to understand how genetic variants perturb molecular interactions and initiate a disease mechanism. Firstly, as a case study of genetic susceptibility loci of Alzheimer’s disease, I reviewed and summarized all the existing methodologies for Genome Wide Association Studies (GWAS) interpretation, currently available algorithms, and computable modelling approaches. In addition, I formulated a new approach for modelling and simulation of genetic regulatory networks as an extension of the syntax of the Biological Expression Language (OpenBEL). This could allow the representation of genetic variation information in cause-and-effect models to predict the functional consequences of disease-associated genetic variants. Secondly, by using the new syntax of OpenBEL, I generated an OpenBEL model for Alzheimer’s disease (AD) together with genetic variants including their DNA, RNA or protein position, variant type and associated allele. To better understand the role of genetic variants in a disease context, I subsequently tried to predict the consequences of genetic variation based on the functional context provided by the network model. I further explained how genetic variation information could help to identify candidate molecular mechanisms for aetiologically complex diseases such as Alzheimer’s disease (AD) and Parkinson’s disease (PD). Though integration of genetic variation information can enhance the evidence base for shared pathophysiology pathways in complex diseases, I have addressed one of the key questions, namely the role of shared genetic variants in initiating shared molecular mechanisms between neurodegenerative diseases.
I systematically analysed shared genetic variation information of AD and PD and mapped it to find shared molecular aetiology between neurodegenerative diseases. My methodology highlighted that a comprehensive understanding of genetic variation needs integration and analysis of all omics data, in order to build a joint model that captures all datasets concurrently. Moreover, genomic loci, rather than individual genetic variants, should be considered when investigating the effects of GWAS variants, because the effect of a single variant is hard to predict in a biologically complex molecular mechanism, particularly when investigating shared pathology.

    Exploiting the mediating role of the metabolome to unravel transcript-to-phenotype associations.

    Despite the success of genome-wide association studies (GWASs) in identifying genetic variants associated with complex traits, understanding the mechanisms behind these statistical associations remains challenging. Several methods that integrate methylation, gene expression, and protein quantitative trait loci (QTLs) with GWAS data to determine their causal role in the path from genotype to phenotype have been proposed. Here, we developed and applied a multi-omics Mendelian randomization (MR) framework to study how metabolites mediate the effect of gene expression on complex traits. We identified 216 transcript-metabolite-trait causal triplets involving 26 medically relevant phenotypes. Among these associations, 58% were missed by classical transcriptome-wide MR, which only uses gene expression and GWAS data. This allowed the identification of biologically relevant pathways, such as the link between ANKH and calcium levels mediated by citrate levels, and the link between SLC6A12 and serum creatinine through modulation of the levels of the renal osmolyte betaine. We show that the signals missed by transcriptome-wide MR are found thanks to the increase in power conferred by integrating multiple omics layers. Simulation analyses show that with larger molecular QTL studies and in case of mediated effects, our multi-omics MR framework outperforms classical MR approaches designed to detect causal relationships between single molecular traits and complex phenotypes.
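The mediation idea in the abstract above (a transcript acting on a trait through a metabolite) can be illustrated with two textbook building blocks: the single-instrument Wald ratio and the product-of-coefficients indirect effect. The sketch below is only an illustration under stated assumptions and is not the framework developed in the paper; the function names and all numeric inputs are hypothetical, and the standard errors use first-order (delta-method) approximations.

```python
import math


def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Single-instrument MR estimate: SNP-outcome effect / SNP-exposure effect.

    The standard error is a first-order approximation that ignores
    uncertainty in the SNP-exposure estimate.
    """
    estimate = beta_outcome / beta_exposure
    se = abs(se_outcome / beta_exposure)
    return estimate, se


def mediated_effect(a, se_a, b, se_b):
    """Product-of-coefficients indirect effect with a Sobel-type standard error.

    a: effect of transcript on metabolite
    b: effect of metabolite on trait
    """
    indirect = a * b
    se = math.sqrt(a * a * se_b * se_b + b * b * se_a * se_a)
    return indirect, se


# Hypothetical summary statistics, chosen only to make the arithmetic visible.
transcript_to_metabolite, se_tm = wald_ratio(0.5, 0.2, 0.025)
metabolite_to_trait, se_mt = wald_ratio(0.4, 0.2, 0.04)
indirect, se_indirect = mediated_effect(
    transcript_to_metabolite, se_tm, metabolite_to_trait, se_mt
)
```

A transcript whose direct effect on the trait is weak can still show a clear indirect effect when both steps of the chain are well estimated, which is one intuition for why adding an intermediate omics layer may recover signals that a transcript-only analysis misses.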

    Mini-Workshop: Recent Developments in Statistical Methods with Applications to Genetics and Genomics

    Recent progress in high-throughput genomic technologies has revolutionized the field of human genetics and promises to lead to important scientific advances. With new improvements in massively parallel biotechnologies, it is becoming increasingly efficient to generate vast amounts of information at the genomic, transcriptomic, proteomic, metabolomic and other levels, opening up as yet unexplored opportunities in the search for the genetic causes of complex traits. Despite this tremendous progress in data generation, it remains very challenging to analyze, integrate and interpret these data. The resulting data are high-dimensional and very sparse, and efficient statistical methods are critical in order to extract the rich information contained in these data. The major focus of the mini-workshop, entitled “Recent Developments in Statistical Methods with Applications to Genetics and Genomics”, has been on integrative methods. Relevant research questions included the optimal study design for integrative genomic analyses; appropriate handling and pre-processing of different types of omics data; statistical methods for integration of multiple types of omics data; adjustment for confounding due to latent factors such as cell or tissue heterogeneity; the optimal use of omics data to enhance or make sense of results identified through genetic studies; and statistical and computational strategies for analysis of multiple types of high-dimensional data.

    Diversity and genetic architecture of agro-morphological traits in a core collection of European traditional tomato

    European traditional tomato varieties have been selected by farmers given their consistent performance and adaptation to local growing conditions. Here we developed a multipurpose core collection, comprising 226 accessions representative of the genotypic, phenotypic, and geographical diversity present in European traditional tomatoes, to investigate the basis of their phenotypic variation, gene×environment interactions, and stability for 33 agro-morphological traits. Comparison of the traditional varieties with a modern reference panel revealed that some traditional varieties displayed excellent agronomic performance and high trait stability, as good as or better than that of their modern counterparts. We conducted genome-wide association and genome-wide environment interaction studies and detected 141 quantitative trait loci (QTLs). Out of those, 47 QTLs were associated with the phenotype mean (meanQTLs), 41 with stability (stbQTLs), and 53 QTL-by-environment interactions (QTIs). Most QTLs displayed additive gene actions, with the exception of stbQTLs, which were mostly recessive and overdominant QTLs. Both common and specific loci controlled the phenotype mean and stability variation in traditional tomato; however, a larger proportion of specific QTLs was observed, indicating that the stability gene regulatory model is the predominant one. Developmental genes tended to map close to meanQTLs, while genes involved in stress response, hormone metabolism, and signalling were found within regions affecting stability. A total of 137 marker–trait associations for phenotypic means and stability were novel, and therefore our study enhances the understanding of the genetic basis of valuable agronomic traits and opens up a new avenue for an exploitation of the allelic diversity available within European traditional tomato germplasm. This work was supported by European Commission H2020 research and innovation program through TRADITOM grant agreement no. 634561, G2P-SOL, grant agreement no. 677379, and HARNESSTOM grant agreement no. 101000716. Networking activities were funded by COST Actions ‘EUROCAROTEN’ CA15136 and ‘RoxyCOST’ CA18210. Postprint (published version)

    Non-linear regression models for time to flowering in wild chickpea combine genetic and climatic factors

    Background: Accurate prediction of crop flowering time is required for reaching maximal farm efficiency. Several models developed to accomplish this goal are based on deep knowledge of plant phenology, requiring large investment for every individual crop or new variety. Mathematical modeling can be used to make better use of more shallow data and to extract information from it with higher efficiency. Cultivars of chickpea, Cicer arietinum, are currently being improved by introgressing wild C. reticulatum biodiversity with very different flowering time requirements. More understanding is required of how flowering time will depend on environmental conditions in these cultivars developed by introgression of wild alleles. Results: We built a novel model for flowering time of wild chickpeas collected at 21 different sites in Turkey and grown in 4 distinct environmental conditions over several different years and seasons. We propose a general approach, in which the analytic forms of dependence of flowering time on climatic parameters, their regression coefficients, and a set of predictors are inferred automatically by stochastic minimization of the deviation of the model output from data. By using a combination of Grammatical Evolution and the Differential Evolution Entirely Parallel method, we have identified a model that reflects the effects of day length, temperature, humidity and precipitation and has a coefficient of determination of R² = 0.97. Conclusions: We used our model to test two important hypotheses. First, we propose that chickpea phenology may be strongly predicted by accession geographic origin, as well as local environmental conditions at the site of growth. Indeed, the site of origin-by-growth environment interaction accounts for about 14.7% of variation in time period from sowing to flowering. Secondly, as the adaptation to specific environments is blueprinted in genomes, the effects of genes on flowering time may be conditioned on environmental factors. Genotype-by-environment interaction accounts for about 17.2% of overall variation in flowering time. We also identified several genomic markers associated with different reactions to climatic factor changes. Our methodology is general and can be further applied to extend existing crop models, especially when phenological information is limited.
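The "stochastic minimization of the deviation of the model output from data" mentioned in the abstract above can be sketched with a basic differential evolution loop. This is a toy illustration, not the authors' Differential Evolution Entirely Parallel method and not their model: the functional form (an exponential decay of days-to-flowering with day length) is fixed by hand rather than inferred by Grammatical Evolution, and the data, parameter names, bounds, and settings are all hypothetical.

```python
import math
import random


def differential_evolution(loss, bounds, pop_size=30, weight=0.7,
                           crossover=0.9, generations=200, seed=0):
    """Plain DE/rand/1/bin minimizer using only the standard library."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [loss(p) for p in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct population members other than the target.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # at least one coordinate is mutated
            trial = []
            for d in range(dim):
                if d == forced or rng.random() < crossover:
                    value = pop[a][d] + weight * (pop[b][d] - pop[c][d])
                    lo, hi = bounds[d]
                    value = min(max(value, lo), hi)  # clip to the search box
                else:
                    value = pop[i][d]
                trial.append(value)
            score = loss(trial)
            if score <= scores[i]:  # greedy selection keeps the better vector
                pop[i], scores[i] = trial, score
    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]


def rmse(params, day_lengths, days_to_flowering):
    """Root-mean-square deviation of a hypothetical model from the data."""
    base, amplitude, rate = params
    squared = [
        (base + amplitude * math.exp(-rate * (x - 8.0)) - y) ** 2
        for x, y in zip(day_lengths, days_to_flowering)
    ]
    return math.sqrt(sum(squared) / len(squared))


# Synthetic, noise-free data generated from known (made-up) parameters.
day_lengths = [8.0 + k for k in range(9)]
days_to_flowering = [30.0 + 20.0 * math.exp(-0.2 * (x - 8.0)) for x in day_lengths]

search_box = [(0.0, 100.0), (0.0, 100.0), (0.0, 1.0)]
best_params, best_error = differential_evolution(
    lambda p: rmse(p, day_lengths, days_to_flowering), search_box
)
```

A population-based search like this needs no gradients, which is convenient when the model form itself is being varied by an outer procedure; the cost is many more loss evaluations than a gradient-based fit would need.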