197 research outputs found

    Sushi gets serious: the draft genome sequence of the pufferfish Fugu rubripes

    The publication of the Fugu rubripes draft genome sequence will take this fish from culinary delicacy to potent tool in deciphering the mysteries of human genome function.

    The genomic signature of trait-associated variants

    BACKGROUND: Genome-wide association studies have identified thousands of SNP variants associated with hundreds of phenotypes. For most associations the causal variants and the molecular mechanisms underlying pathogenesis remain unknown. Exploration of the underlying functional annotations of trait-associated loci has thrown some light on their potential roles in pathogenesis. However, there are some shortcomings of the methods used to date, which may undermine efforts to prioritize variants for further analyses. Here, we introduce and apply novel methods to rigorously identify annotation classes showing enrichment or depletion of trait-associated variants taking into account the underlying associations due to co-location of different functional annotations and linkage disequilibrium. RESULTS: We assessed enrichment and depletion of variants in publicly available annotation classes such as genic regions, regulatory features, measures of conservation, and patterns of histone modifications. We used logistic regression to build a multivariate model that identified the most influential functional annotations for trait-association status of genome-wide significant variants. SNPs associated with all of the enriched annotations were 8 times more likely to be trait-associated variants than SNPs annotated with none of them. Annotations associated with chromatin state together with prior knowledge of the existence of a local expression QTL (eQTL) were the most important factors in the final logistic regression model. Surprisingly, despite the widespread use of evolutionary conservation to prioritize variants for study we find only modest enrichment of trait-associated SNPs in conserved regions. CONCLUSION: We established odds ratios of functional annotations that are more likely to contain significantly trait-associated SNPs, for the purpose of prioritizing GWAS hits for further studies. Additionally, we estimated the relative and combined influence of the different genomic annotations, which may facilitate future prioritization methods by adding substantial information.

    Chromatin structure and evolution in the human genome

    BACKGROUND: Evolutionary rates are not constant across the human genome but genes in close proximity have been shown to experience similar levels of divergence and selection. The higher-order organisation of chromosomes has often been invoked to explain such phenomena but previously there has been insufficient data on chromosome structure to investigate this rigorously. Using the results of a recent genome-wide analysis of open and closed human chromatin structures we have investigated the global association between divergence, selection and chromatin structure for the first time. RESULTS: In this study we have shown that, paradoxically, synonymous site divergence (dS) at non-CpG sites is highest in regions of open chromatin, primarily as a result of an increased number of transitions, while the rates of other traditional measures of mutation (intergenic, intronic and ancient repeat divergence as well as SNP density) are highest in closed regions of the genome. Analysis of human-chimpanzee divergence across intron-exon boundaries indicates that although genes in relatively open chromatin generally display little selection at their synonymous sites, those in closed regions show markedly lower divergence at their fourfold degenerate sites than in neighbouring introns and intergenic regions. Exclusion of known Exonic Splice Enhancer hexamers has little effect on the divergence observed at fourfold degenerate sites across chromatin categories; however, we show that closed chromatin is enriched with certain classes of ncRNA genes whose RNA secondary structure may be particularly important. CONCLUSION: We conclude that, overall, non-CpG mutation rates are lowest in open regions of the genome and that regions of the genome with a closed chromatin structure have the highest background mutation rate. This might reflect lower rates of DNA damage or enhanced DNA repair processes in regions of open chromatin. Our results also indicate that dS is a poor measure of mutation rates, particularly when used in closed regions of the genome, as genes in closed regions generally display relatively strong levels of selection at their synonymous sites.

    IMPROVE-DD: Integrating Multiple Phenotype Resources Optimises Variant Evaluation in genetically determined Developmental Disorders

    Diagnosing rare developmental disorders using genome-wide sequencing data commonly necessitates review of multiple plausible candidate variants, often using ontologies of categorical clinical terms. We show that Integrating Multiple Phenotype Resources Optimizes Variant Evaluation in Developmental Disorders (IMPROVE-DD) by incorporating additional classes of data commonly available to clinicians and recorded in health records. In doing so, we quantify the distinct contributions of sex, growth, and development in addition to Human Phenotype Ontology (HPO) terms and demonstrate added value from these readily available information sources. We use likelihood ratios for nominal and quantitative data and propose a classifier for HPO terms in this framework. This Bayesian framework results in more robust diagnoses. Using data systematically collected in the Deciphering Developmental Disorders study, we considered 77 genes with pathogenic/likely pathogenic variants in ≥10 individuals. All genes showed at least a satisfactory prediction by receiver operating characteristic when testing on training data (AUC ≥ 0.6), and HPO terms were the best predictor for the majority of genes, though a minority (13/77) of genes were better predicted by other phenotypic data types. Overall, classifiers based upon multiple integrated phenotypic data sources performed better than those based upon any individual source, and importantly, integrated models produced notably fewer false positives. Finally, we show that IMPROVE-DD models with good predictive performance on cross-validation can be constructed from relatively few individuals. This suggests new strategies for candidate gene prioritization and highlights the value of systematic clinical data collection to support diagnostic programs.

    Sequence level mechanisms of human epigenome evolution

    DNA methylation and chromatin states play key roles in development and disease. However, the extent of recent evolutionary divergence in the human epigenome and the influential factors that have shaped it are poorly understood. To determine the links between genome sequence and human epigenome evolution, we examined the divergence of DNA methylation and chromatin states following segmental duplication events in the human lineage. Chromatin and DNA methylation states were found to have been generally well conserved following a duplication event, with the evolution of the epigenome largely uncoupled from the total number of genetic changes in the surrounding DNA sequence. However, the epigenome at tissue-specific, distal regulatory regions was observed to be unusually prone to diverge following duplication, with particular sequence differences, altering known sequence motifs, found to be associated with divergence in patterns of DNA methylation and chromatin. Alu elements were found to have played a particularly prominent role in shaping human epigenome evolution, and we show that human-specific AluY insertion events are strongly linked to the evolution of the DNA methylation landscape and gene expression levels, including at key neurological genes in the human brain. Studying paralogous regions within the same sample enables the study of the links between genome and epigenome evolution while controlling for biological and technical variation. We show DNA methylation and chromatin divergence between duplicated regions are linked to the divergence of particular genetic motifs, with Alu elements having played a disproportionate role in the evolution of the epigenome in the human lineage.

    Integrated molecular characterisation of endometrioid ovarian carcinoma identifies opportunities for stratification

    Endometrioid ovarian carcinoma (EnOC) is an under-investigated ovarian cancer type. Recent studies have described disease subtypes defined by genomics and hormone receptor expression patterns; here, we determine the relationship between these subtyping layers to define the molecular landscape of EnOC with high granularity and identify therapeutic vulnerabilities in high-risk cases. Whole exome sequencing data were integrated with progesterone and oestrogen receptor (PR and ER) expression-defined subtypes in 90 EnOC cases following robust pathological assessment, revealing dominant clinical and molecular features in the resulting integrated subtypes. We demonstrate significant correlation between subtyping approaches: PR-high (PR+/ER+, PR+/ER−) cases were predominantly CTNNB1-mutant (73.2% vs 18.4%, P < 0.001), while PR-low (PR−/ER+, PR−/ER−) cases displayed higher TP53 mutation frequency (38.8% vs 7.3%, P = 0.001), greater genomic complexity (P = 0.007) and more frequent copy number alterations (P = 0.001). PR-high EnOC patients experience favourable disease-specific survival independent of clinicopathological and genomic features (HR = 0.16, 95% CI 0.04–0.71). TP53 mutation further delineates the outcome of patients with PR-low tumours (HR = 2.56, 95% CI 1.14–5.75). A simple, routinely applicable, classification algorithm utilising immunohistochemistry for PR and p53 recapitulated these subtypes and their survival profiles. The genomic profile of high-risk EnOC subtypes suggests that inhibitors of the MAPK and PI3K-AKT pathways, alongside PARP inhibitors, represent promising candidate agents for improving patient survival. Patients with PR-low TP53-mutant EnOC have the greatest unmet clinical need, while PR-high tumours—which are typically CTNNB1-mutant and TP53 wild-type—experience excellent survival and may represent candidates for trials investigating de-escalation of adjuvant chemotherapy to agents such as endocrine therapy.

    Computational disease gene identification: a concert of methods prioritizes type 2 diabetes and obesity candidate genes

    Genome-wide experimental methods to identify disease genes, such as linkage analysis and association studies, generate increasingly large candidate gene sets for which comprehensive empirical analysis is impractical. Computational methods employ data from a variety of sources to identify the most likely candidate disease genes from these gene sets. Here, we review seven independent computational disease gene prioritization methods, and then apply them in concert to the analysis of 9556 positional candidate genes for type 2 diabetes (T2D) and the related trait obesity. We generate and analyse a list of nine primary candidate genes for T2D and five for obesity. Two genes, LPL and BCKDHA, are common to these two sets. We also present a set of secondary candidates for T2D (94 genes) and for obesity (116 genes) with 58 genes in common to both diseases.