3,916 research outputs found

    Unbiased functional clustering of gene variants with a phenotypic-linkage network

    Groupwise functional analysis of gene variants is becoming standard in next-generation sequencing studies. As the function of many genes is unknown and their classification into pathways is scant, functional associations between genes are often inferred from large-scale omics data. Such data types, including protein–protein interactions and gene co-expression networks, are used to examine the interrelations of the implicated genes. Statistical significance is assessed by comparing the interconnectedness of the mutated genes with that of random gene sets. However, interconnectedness can be affected by confounding bias, potentially resulting in false positive findings. We show that genes implicated through de novo sequence variants are biased in their coding-sequence length and that longer genes tend to cluster together, which leads to exaggerated p-values in functional studies; we present here an integrative method that addresses these biases. To discern molecular pathways relevant to complex disease, we have inferred functional associations between human genes from diverse data types and assessed them with a novel phenotype-based method. Examining the functional association between de novo gene variants, we control for the heretofore unexplored confounding bias in coding-sequence length. We test different data types and networks and find that the disease-associated genes cluster more significantly in an integrated phenotypic-linkage network than in other gene networks. We present a tool of superior power to identify functional associations among genes mutated in the same disease, even after accounting for significant sequencing study bias, and demonstrate the suitability of this method to functionally cluster variant genes underlying polygenic disorders.
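The comparison against random gene sets described in this abstract can be illustrated with a small permutation test in which each random set is matched to the observed genes by coding-sequence length. This is only a minimal sketch under stated assumptions, not the authors' method: the function names, the use of equal-sized length bins, and the choice of "number of edges inside the set" as the interconnectedness score are all hypothetical.

```python
import random
from collections import defaultdict

def count_internal_edges(gene_set, edges):
    """Interconnectedness score (assumed): edges with both endpoints in the set."""
    members = set(gene_set)
    return sum(1 for a, b in edges if a in members and b in members)

def bin_by_length(cds_length, n_bins):
    """Assign every gene to one of n_bins bins by rank of coding-sequence length."""
    ranked = sorted(cds_length, key=cds_length.get)
    return {gene: rank * n_bins // len(ranked) for rank, gene in enumerate(ranked)}

def length_matched_p_value(observed, edges, cds_length,
                           n_bins=10, n_perm=1000, seed=0):
    """Empirical p-value for the observed score against length-matched random sets."""
    rng = random.Random(seed)
    bin_of = bin_by_length(cds_length, n_bins)

    # Group all genes by length bin so random draws can respect the bins.
    genes_in_bin = defaultdict(list)
    for gene, b in bin_of.items():
        genes_in_bin[b].append(gene)

    # Record how many observed genes fall into each bin.
    needed = defaultdict(int)
    for gene in observed:
        needed[bin_of[gene]] += 1

    observed_score = count_internal_edges(observed, edges)
    hits = 0
    for _ in range(n_perm):
        sample = []
        for b, k in needed.items():
            # Draw the same number of genes from the same length bin.
            sample.extend(rng.sample(genes_in_bin[b], k))
        if count_internal_edges(sample, edges) >= observed_score:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)
```

Drawing random sets from the same length bins as the observed genes means that any tendency of long genes to be well connected affects the null distribution and the observed set alike, which is the kind of control for length bias the abstract describes.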

    Diverse type 2 diabetes genetic risk factors functionally converge in a phenotype-focused gene network

    Type 2 Diabetes (T2D) constitutes a global health burden. Efforts to uncover predisposing genetic variation have been considerable, yet detailed knowledge of the underlying pathogenesis remains poor. Here, we constructed a T2D phenotypic-linkage network (T2D-PLN) by integrating diverse gene functional information that highlights genes which, when disrupted in mice, elicit similar T2D-relevant phenotypes. Sensitising the network to T2D-relevant phenotypes enabled significant functional convergence to be detected between genes implicated in monogenic or syndromic diabetes and genes lying within genomic regions associated with T2D common risk. We extended these analyses to a recent multiethnic T2D case-control exome of 12,940 individuals that found no evidence of T2D risk association for rare frequency variants outside of previously known T2D risk loci. Examining associations involving protein-truncating variants (PTVs), most at low population frequencies, the T2D-PLN was able to identify a convergent set of biological pathways that were perturbed within four of five independent T2D case/control ethnic sets of 2000 to 5000 exomes each. These same pathways were found to be over-represented among both known monogenic or syndromic diabetes genes and genes within T2D-associated common risk loci. Our study demonstrates convergent biology amongst variants representing different classes of T2D genetic risk. Although convergence was observed at the pathway level, few of the contributing genes were found in common between different cohorts or variant classes, most notably between the exome variant sets, which suggests that future rare variant studies may do better to focus their power on a single population of recent common ancestry.

    The clustering of functionally related genes contributes to CNV-mediated disease

    Clusters of functionally related genes can be disrupted by a single copy number variant (CNV). We demonstrate that the simultaneous disruption of multiple functionally related genes is a frequent and significant characteristic of de novo CNVs in patients with developmental disorders (P = 1 × 10⁻³). Using three different functional networks, we identified unexpectedly large numbers of functionally related genes within de novo CNVs from two large independent cohorts of individuals with developmental disorders. The presence of multiple functionally related genes was a significant predictor of a CNV's pathogenicity when compared to CNVs from apparently healthy individuals and a better predictor than the presence of known disease or haploinsufficient genes for larger CNVs. The functionally related genes found in the de novo CNVs belonged to 70% of all clusters of functionally related genes found across the genome. De novo CNVs were more likely to affect functional clusters and affect them to a greater extent than benign CNVs (P = 6 × 10⁻⁴). Furthermore, such clusters of functionally related genes are phenotypically informative: different patients possessing CNVs that affect the same cluster of functionally related genes exhibit more similar phenotypes than expected (P < 0.05). The spanning of multiple functionally similar genes by single CNVs contributes substantially to how these variants exert their pathogenic effects.

    Genetic variants of HvCbf14 are statistically associated with frost tolerance in a European germplasm collection of Hordeum vulgare

    Two quantitative trait loci (Fr-H1 and Fr-H2) for frost tolerance (FT) have been discovered on the long arm of chromosome 5H in barley. Two tightly linked groups of CBF genes, known to play a key role in the FT regulatory network in A. thaliana, have been found to co-segregate with Fr-H2. Here, we investigate the allelic variations of four barley CBF genes (HvCbf3, HvCbf6, HvCbf9 and HvCbf14) in a panel of European cultivars, landraces and H. spontaneum accessions. In the cultivars, a reduction of nucleotide and haplotype diversity in CBFs, compared with the landraces and the wild ancestor H. spontaneum, was evident. In particular, the loss of HvCbf9 genetic variants in cultivars was higher than for the other sequences. To verify whether the pattern of CBF genetic variants correlated with the level of FT, an association procedure was adopted. The pairwise analysis of linkage disequilibrium (LD) among the genetic variants in the four CBF genes was computed to evaluate the resolution of the association procedure. The pairwise plotting revealed a low level of LD in cultivated varieties, despite the tight physical linkage of the CBF genes analysed. A structured association procedure based on a general linear model was implemented, including the variants in CBFs, in Vrn-H1, and in two reference genes not involved in FT (α-Amy1 and Gapdh), and considering the phenotypic data for FT. Association analysis recovered two nucleotide variants of HvCbf14 and one nucleotide variant of Vrn-H1 as statistically associated with FT.

    2D association and integrative omics analysis in rice provides systems biology view in trait analysis.

    The interactions among genes and between genes and environment contribute significantly to the phenotypic variation of complex traits and may be possible explanations for missing heritability. However, to our knowledge no existing tool can address the two kinds of interactions. Here we propose a novel linear mixed model that considers not only the additive effects of biological markers but also the interaction effects of marker pairs. Interaction effect is demonstrated as a 2D association. Based on this linear mixed model, we developed a pipeline, namely PATOWAS. PATOWAS can be used to study transcriptome-wide and metabolome-wide associations in addition to genome-wide associations. Our case analysis with real rice recombinant inbred lines (RILs) at three omics levels demonstrates that 2D association mapping and integrative omics are able to provide a systems biology view into the analyzed traits, leading toward an answer about how genes, transcripts, proteins, and metabolites work together to produce an observable phenotype

    Large-scale neuroanatomical study uncovers 198 gene associations in mouse brain morphogenesis.

    Brain morphogenesis is an important process contributing to higher-order cognition; however, our knowledge about its biological basis is largely incomplete. Here we analyze 118 neuroanatomical parameters in 1,566 mutant mouse lines and identify 198 genes whose disruptions yield NeuroAnatomical Phenotypes (NAPs), mostly affecting structures implicated in brain connectivity. Groups of functionally similar NAP genes participate in pathways involving the cytoskeleton, the cell cycle and the synapse, display distinct fetal and postnatal brain expression dynamics and, importantly, their disruption can yield convergent phenotypic patterns. 17% of human unique orthologues of mouse NAP genes are known loci for cognitive dysfunction. The remaining 83% constitute a vast pool of genes newly implicated in brain architecture, providing the largest study of mouse NAP genes and pathways. This offers a complementary resource to human genetic studies and predicts that many more genes could be involved in mammalian brain morphogenesis.

    Identification of recurrent genetic patterns from targeted sequencing panels with advanced data science: a case-study on sporadic and genetic neurodegenerative diseases

    This work is funded by the University of Bologna, the IRCCS Institute of Neurological Sciences of Bologna, and by the European Grants H2020 GenoMed4All (Grant N. 101017549) and H2020 MSCA-ITN IMforFUTURE (Grant N. 721815). Background: Targeted Next Generation Sequencing is a common and powerful approach used in both clinical and research settings. However, at present, a large fraction of the acquired genetic information is not used since pathogenicity cannot be assessed for most variants. Further complicating this scenario is the increasingly frequent description of a poly/oligogenic pattern of inheritance showing the contribution of multiple variants in increasing disease risk. We present an approach in which the entire genetic information provided by target sequencing is transformed into binary data on which we performed statistical, machine learning, and network analyses to extract all valuable information from the entire genetic profile. To test this approach and explore without bias the presence of recurrent genetic patterns, we studied a cohort of 112 patients affected either by genetic Creutzfeldt–Jakob disease (CJD) caused by two mutations in the PRNP gene (p.E200K and p.V210I) with different penetrance or by sporadic Alzheimer disease (sAD). Results: Unsupervised methods can identify functionally relevant sources of variation in the data, like haplogroups and polymorphisms that do not follow Hardy–Weinberg equilibrium, such as the NOTCH3 rs11670823 (c.3837 + 21 T > A). Supervised classifiers can recognize clinical phenotypes with high accuracy based on the mutational profile of patients. In addition, we found a similar alteration of allele frequencies compared with the European population in sporadic patients and in V210I-CJD, a poorly penetrant PRNP mutation, and sAD, suggesting shared oligogenic patterns in different types of dementia. Pathway enrichment and protein–protein interaction network analysis revealed different altered pathways between the two PRNP mutations. Conclusions: We propose this workflow as a possible approach to gain deeper insights into the genetic information derived from target sequencing, to identify recurrent genetic patterns and improve the understanding of complex diseases. This work could also represent a possible starting point of a predictive tool for personalized medicine and advanced diagnostic applications.
    Tarozzi, M.; Bartoletti-Stella, A.; Dall’Olio, D.; Matteuzzi, T.; Baiardi, S.; Parchi, P.; Castellani, G.; Capellari, S.