Characterization of functional methylomes by next-generation capture sequencing identifies novel disease-associated variants.
Most genome-wide methylation studies (EWAS) of multifactorial disease traits use targeted arrays or enrichment methodologies preferentially covering CpG-dense regions, to characterize sufficiently large samples. To overcome this limitation, we present here a new customizable, cost-effective approach, methylC-capture sequencing (MCC-Seq), for sequencing functional methylomes, while simultaneously providing genetic variation information. To illustrate MCC-Seq, we use whole-genome bisulfite sequencing on adipose tissue (AT) samples and public databases to design AT-specific panels. We establish its efficiency for high-density interrogation of methylome variability by systematic comparisons with other approaches and demonstrate its applicability by identifying novel methylation variation within enhancers strongly correlated to plasma triglyceride and HDL-cholesterol, including at CD36. Our more comprehensive AT panel assesses tissue methylation and genotypes in parallel at ∼4 and ∼3 M sites, respectively. Our study demonstrates that MCC-Seq provides comparable accuracy to alternative approaches but enables more efficient cataloguing of functional and disease-relevant epigenetic and genetic variants for large-scale EWAS. This work was supported by a Canadian Institute of Health Research (CIHR) team grant awarded to E.G., A.T., M.C.V. and M.L. (TEC-128093) and the CIHR funded Epigenome Mapping Centre at McGill University (EP1-120608) awarded to T.P. and M.L., and the Swedish Research Council, Knut and Alice Wallenberg Foundation and the Torsten Söderberg Foundation awarded to L.R. F.A. holds a studentship from The Research Institute of the McGill University Health Center (MUHC). F.G. is a recipient of a research fellowship award from the Heart and Stroke Foundation of Canada. A.T. is the director of a Research Chair in Bariatric and Metabolic Surgery. M.C.V. is the recipient of the Canada Research Chair in Genomics Applied to Nutrition and Health (Tier 1). E.G. and T.P. are recipients of a Canada Research Chair Tier 2 award. The MuTHER Study was funded by a programme grant from the Wellcome Trust (081917/Z/07/Z) and core funding for the Wellcome Trust Centre for Human Genetics (090532). TwinsUK was funded by the Wellcome Trust; European Community's Seventh Framework Programme (FP7/2007-2013). The study also receives support from the National Institute for Health Research (NIHR)-funded BioResource, Clinical Research Facility and Biomedical Research Centre based at Guy's and St Thomas' NHS Foundation Trust in partnership with King's College London. T.D.S. is a holder of an ERC Advanced Principal Investigator award. SNP genotyping was performed by The Wellcome Trust Sanger Institute and National Eye Institute via NIH/CIDR. Finally, we thank the NIH Roadmap Epigenomics Consortium and the Mapping Centers (http://nihroadmap.nih.gov/epigenomics/) for the production of publicly available reference epigenomes. Specifically, we thank the mapping centre at MGH/BROAD for generation of human adipose reference epigenomes used in this study. This is the final version. It was first published by NPG at http://www.nature.com/ncomms/2015/150529/ncomms8211/full/ncomms8211.html#abstrac
Distinct changes of genomic biases in nucleotide substitution at the time of mammalian radiation
Differences in the regional substitution patterns in the human genome created
patterns of large-scale variation of base composition known as genomic
isochores. To gain insight into the origin of the genomic isochores we develop
a maximum likelihood approach to determine the history of substitution patterns
in the human genome. This approach utilizes the vast amount of repetitive
sequence deposited in the human genome over the past ~250 Myr. Using this
approach we estimate the frequencies of seven types of substitutions: the four
transversions, two transitions, and the methyl-assisted transition of cytosine
in CpG. Comparing substitutional patterns in repetitive elements of various
ages, we reconstruct the history of the base-substitutional process in the
different isochores for the past 250 Myr. At around 90 Myr ago (around the time
of the mammalian radiation), we find an abrupt 4- to 8-fold increase of the
cytosine transition rate in CpG pairs compared to that of the reptilian
ancestor. Further analysis of nucleotide substitutions in regions with
different GC-content reveals concurrent changes in the substitutional patterns.
While the substitutional pattern was dependent on the regional GC-content in
such ways that it preserved the regional GC-content before the mammalian
radiation, it lost this dependence afterwards. The substitutional pattern
changed from an isochore-preserving to an isochore-degrading one. We conclude
that isochores were established before the radiation of the eutherian
mammals and have been subject to the process of homogenization since then.
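The seven substitution types named in this abstract can be illustrated with a small classifier. This is only a sketch under stated assumptions: it assumes the classes are strand-symmetric (a substitution and its reverse complement count as the same type), and the function name, labels, and flanking-base arguments are hypothetical rather than taken from the paper.

```python
# Complement table used to fold every substitution onto one strand.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def classify_substitution(ref, alt, left="N", right="N"):
    """Return one of seven strand-symmetric substitution classes.

    ref/alt are the ancestral and derived bases; left/right are the flanking
    bases on the same strand, needed only to detect the CpG context.
    """
    if ref not in COMPLEMENT or alt not in COMPLEMENT or ref == alt:
        raise ValueError("ref and alt must be two different bases from ACGT")
    # C->T inside CpG, or its reverse complement G->A preceded by C.
    if (ref == "C" and alt == "T" and right == "G") or (
        ref == "G" and alt == "A" and left == "C"
    ):
        return "CpG transition"
    # Fold onto the strand where the ancestral base is A or C.
    if ref in ("T", "G"):
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    is_transition = (ref, alt) in (("A", "G"), ("C", "T"))
    kind = "transition" if is_transition else "transversion"
    return f"{ref}:{COMPLEMENT[ref]}>{alt}:{COMPLEMENT[alt]} {kind}"
```

Enumerating every base change in every flanking context yields exactly seven labels: four transversions, two transitions, and the CpG transition.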
Machine Learning and Integrative Analysis of Biomedical Big Data.
Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) is analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges and exacerbates the ones associated with single-omics studies. Specialized computational approaches are required to effectively and efficiently perform integrative analysis of biomedical data acquired from diverse modalities. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance and scalability issues.
Suv4-20h Histone Methyltransferases Promote Neuroectodermal Differentiation by Silencing the Pluripotency-Associated Oct-25 Gene
Post-translational modifications (PTMs) of histones exert fundamental roles in regulating gene expression. During development, groups of PTMs are constrained by unknown mechanisms into combinatorial patterns, which facilitate transitions from uncommitted embryonic cells into differentiated somatic cell lineages. Repressive histone modifications such as H3K9me3 or H3K27me3 have been investigated in detail, but the role of H4K20me3 in development is currently unknown. Here we show that Xenopus laevis Suv4-20h1 and h2 histone methyltransferases (HMTases) are essential for induction and differentiation of the neuroectoderm. Morpholino-mediated knockdown of the two HMTases leads to a selective and specific downregulation of genes controlling neural induction, thereby effectively blocking differentiation of the neuroectoderm. Global transcriptome analysis supports the notion that these effects arise from the transcriptional deregulation of specific genes rather than widespread, pleiotropic effects. Interestingly, morphant embryos fail to repress the Oct4-related Xenopus gene Oct-25. We validate Oct-25 as a direct target of xSuv4-20h enzyme-mediated gene repression, showing by chromatin immunoprecipitation that it is decorated with the H4K20me3 mark downstream of the promoter in normal, but not in double-morphant, embryos. Since knockdown of Oct-25 protein significantly rescues the neural differentiation defect in xSuv4-20h double-morphant embryos, we conclude that the epistatic relationship between Suv4-20h enzymes and Oct-25 controls the transit from pluripotent to differentiation-competent neural cells. Consistent with these results in Xenopus, murine Suv4-20h1/h2 double-knockout embryonic stem (DKO ES) cells exhibit increased Oct4 protein levels before and during EB formation, and reveal a compromised and biased capacity for in vitro differentiation, when compared to normal ES cells. Together, these results suggest a regulatory mechanism, conserved between amphibians and mammals, in which H4K20me3-dependent restriction of specific POU-V genes directs cell fate decisions, when embryonic cells exit the pluripotent state.
Functional Enrichment Analysis of Regulatory Elements
This work has been partially supported by FEDER/Junta de Andalucia-Consejeria de Economia y Conocimiento/(grant CV20-36723), grant PID2020-119032RB-I00, MCIN/AEI/10.13039/501100011033 and FEDER/Junta de Andalucia-Consejeria de Transformacion Economica, Industria, Conocimiento y Universidades (Grant P20_00335).
Statistical methods for enrichment analysis are important tools to extract biological information
from omics experiments. Although these methods have been widely used for the analysis
of gene and protein lists, the development of high-throughput technologies for regulatory elements
demands dedicated statistical and bioinformatics tools. Here, we present a set of enrichment analysis
methods for regulatory elements, including CpG sites, miRNAs, and transcription factors. Statistical
significance is determined via a power weighting function for target genes and tested by the Wallenius
noncentral hypergeometric distribution model to avoid selection bias. These new methodologies have
been applied to the analysis of a set of miRNAs associated with arrhythmia, showing the potential of
this tool to extract biological information from a list of regulatory elements. These new methods are
available in GeneCodis 4, a web tool able to perform singular and modular enrichment analysis that
allows the integration of heterogeneous information.
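As a rough illustration of the kind of test involved, the sketch below computes an enrichment p-value from the ordinary (central) hypergeometric distribution. This is a simplification and not the GeneCodis 4 procedure: the abstract describes a Wallenius noncentral hypergeometric model with per-gene weights, which reduces to this central case only when all weights are equal. The function and parameter names are hypothetical.

```python
from math import comb


def hypergeom_enrichment_pvalue(universe, annotated, selected, overlap):
    """Upper-tail probability P(X >= overlap).

    X counts annotated genes among `selected` genes drawn without replacement
    from `universe` genes, of which `annotated` carry the annotation.
    Assumes equal sampling weights for all genes (central hypergeometric).
    """
    upper = min(annotated, selected)
    total = comb(universe, selected)
    tail = sum(
        comb(annotated, k) * comb(universe - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total
```

For example, `hypergeom_enrichment_pvalue(10, 5, 5, 5)` is the chance that all five selected genes are annotated when half of a ten-gene universe is annotated.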
DNA Repair Deficiency in Huntington's Disease Fibroblasts and Induced Pluripotent Stem Cells
Mutant huntingtin protein (mHtt), the protein responsible for cellular dysfunction in Huntington’s disease (HD), is a product of an expanded trinucleotide repeat (TNR) cytosine-adenine-guanine (CAG) sequence in exon 1 of the huntingtin (HTT) gene. The pathology of HD has been extensively researched; however, the mechanism by which the disease-causing TNR expansions occur in somatic cells remains elusive. Interestingly, HD has often been referred to as a ‘DNA repair disease’, even though DNA repair dysfunction in situ has not been identified. We hypothesized that the presence of the mHtt protein affects the expression of DNA repair genes, ultimately affecting genome stability and thus providing a possible mechanism for TNR instability. Using quantitative polymerase chain reaction (qPCR) gene arrays for 84 DNA repair genes, we identified 18 DNA repair genes down-regulated between 2- and 3-fold, as well as 11 genes down-regulated more than 3-fold, in one HD fibroblast sample relative to a wild-type sample. To ensure our results were not limited to the samples tested, we then increased our number of HD samples and investigated gene expression of APEX1, BRCA1, RPA1, and RPA3 using sensitive TaqMan Gene Expression assays. Further, immunocytochemistry (ICC) analysis validated expression deficiencies at the protein level. These data identify down-regulated genes necessary to maintain stability in the genome of multiple HD-affected fibroblast lines. Our data imply that the presence of the toxic mHtt protein is involved in the DNA repair gene inhibition.
The mutant huntingtin protein (mHtt) produced in HD exhibits a partial gain-of-function in that the hydrophobic, expanded polyglutamine region at the N-terminus aggregates with unintended targets, such as transcription factors and histone modifiers. To identify the broad pathway responsible for this down-regulation of gene expression, we investigated epigenetic regulatory mechanisms. Rapid revival of selected DNA repair gene expression was observed in response to pharmacological hypomethylation treatment, but not to histone modification treatments. This indicates that differential methylation patterns occur as a result of mHtt presence. Using capillary electrophoresis fragment analysis to characterize HTT TNR gene expansions, our data reveal that intermittent 5-azacytidine treatments induced HTT gene stability over 4 population doublings, implicating methylation patterning in TNR instability.
Furthermore, induced pluripotent stem cells (iPSCs) undergo global epigenetic changes relative to their native cell type, including changes in methylation patterning. Upon reprogramming of HD-affected fibroblasts into a pluripotent state, we found that gene expression was recovered to wild-type levels and was maintained through 20 population doublings. In addition, iPSC-HD lines show contraction-biased instability, opposite to the expansion-biased instability in native fibroblast cell types. Differentiation of iPSC-HD lines into mesenchymal-like cells (MLCs) further revealed that APEX1 expression remained static, while expression of the other genes returned to pre-iPSC levels. Interestingly, HD-iPSC-derived MLCs maintained stability in the pathogenic TNR region of the HTT gene, showing no changes in (CAG) repeats. These findings demonstrate that DNA repair gene expression in HD fibroblasts is altered, thus providing insight into the mechanism by which TNR instability persists, ultimately leading to genetic anticipation. This also identifies possible biomarkers that can be used to monitor disease progression and therapeutic treatment success. This study also provides evidence that TNR instability is pharmacologically alterable. This is the first evidence that a DNA repair gene deficiency is present in cells affected by Huntington’s disease. Moreover, our data suggest that mechanisms involved in pluripotency have a protective effect on the pathogenic TNR region of the HTT gene.
Wavelet Screening identifies regions highly enriched for differentially methylated loci for orofacial clefts
DNA methylation is the most widely studied epigenetic mark in humans and plays an essential role in normal biological processes as well as in disease development. More focus has recently been placed on understanding functional aspects of methylation, prompting the development of methods to investigate the relationship between heterogeneity in methylation patterns and disease risk. However, most of these methods are limited in that they use simplified models that may rely on arbitrarily chosen parameters, they can only detect differentially methylated regions (DMRs) one at a time, or they are computationally intensive. To address these shortcomings, we present a wavelet-based method called ‘Wavelet Screening’ (WS) that can perform an epigenome-wide association study (EWAS) of thousands of individuals on a single CPU in only a matter of hours. By detecting multiple DMRs located near each other, WS identifies more complex patterns that can differentiate between different methylation profiles. We performed an extensive set of simulations to demonstrate the robustness and high power of WS, before applying it to a previously published EWAS dataset of orofacial clefts (OFCs). WS identified 82 associated regions containing several known genes and loci for OFCs, while other findings are novel and warrant replication in other OFCs cohorts.
A multi-tissue full lifespan epigenetic clock for mice.
Human DNA-methylation data have been used to develop highly accurate biomarkers of aging ("epigenetic clocks"). Recent studies demonstrate that similar epigenetic clocks for mice (Mus musculus) can be slowed by gold standard anti-aging interventions such as calorie restriction and growth hormone receptor knock-outs. Using DNA methylation data from previous publications with data collected in house for a total of 1189 samples spanning 193,651 CpG sites, we developed 4 novel epigenetic clocks by choosing different regression models (elastic net versus ridge regression) and by considering different sets of CpGs (all CpGs vs highly conserved CpGs). We demonstrate that accurate age estimators can be built on the basis of highly conserved CpGs. However, the most accurate clock results from applying elastic net regression to all CpGs. While the anti-aging effect of calorie restriction could be detected with all types of epigenetic clocks, only ridge regression based clocks replicated the finding of slow epigenetic aging effects in dwarf mice. Overall, this study demonstrates that there are trade-offs when it comes to epigenetic clocks in mice. Highly accurate clocks might not be optimal for detecting the beneficial effects of anti-aging interventions.
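The difference between the two regression models contrasted in this abstract lies in the penalty on the CpG weights. The following is a minimal sketch, assuming the common glmnet-style parametrization and a plain linear clock; the function names are hypothetical, and the published clocks may transform age or differ in other details.

```python
def elastic_net_penalty(coefs, lam, alpha):
    """Penalty added to the squared-error loss (glmnet-style, an assumption).

    alpha = 0 gives the pure ridge (L2) penalty, alpha = 1 the pure lasso (L1)
    penalty, and values in between give the elastic net mixture.
    """
    l1 = sum(abs(b) for b in coefs)
    l2 = sum(b * b for b in coefs)
    return lam * (alpha * l1 + 0.5 * (1.0 - alpha) * l2)


def predict_age(intercept, coefs, methylation):
    """Linear clock: intercept plus the weighted sum of CpG methylation levels."""
    return intercept + sum(b * m for b, m in zip(coefs, methylation))
```

Because the L1 term can shrink weights to exactly zero, an elastic net clock typically uses a subset of CpGs, whereas ridge keeps a nonzero weight on every CpG.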
Genome-wide DNA methylation and gene expression patterns reflect genetic ancestry and environmental differences across the Indonesian archipelago.
Indonesia is the world's fourth most populous country, host to striking levels of human diversity, regional patterns of admixture, and varying degrees of introgression from both Neanderthals and Denisovans. However, it has been largely excluded from the human genomics sequencing boom of the last decade. To serve as a benchmark dataset of molecular phenotypes across the region, we generated genome-wide CpG methylation and gene expression measurements in over 100 individuals from three locations that capture the major genomic and geographical axes of diversity across the Indonesian archipelago. Investigating between- and within-island differences, we find up to 10.55% of tested genes are differentially expressed between the islands of Sumba and New Guinea. Variation in gene expression is closely associated with DNA methylation, with expression levels of 9.80% of genes correlating with nearby promoter CpG methylation, and many of these genes being differentially expressed between islands. Genes identified in our differential expression and methylation analyses are enriched in pathways involved in immunity, highlighting Indonesia's tropical role as a source of infectious disease diversity and the strong selective pressures these diseases have exerted on humans. Finally, we identify robust within-island variation in DNA methylation and gene expression, likely driven by fine-scale environmental differences across sampling sites. Together, these results strongly suggest complex relationships between DNA methylation, transcription, archaic hominin introgression and immunity, all jointly shaped by the environment. This has implications for the application of genomic medicine, both in critically understudied Indonesia and globally, and will allow a better understanding of the interacting roles of genomic and environmental factors shaping molecular and complex phenotypes.
Candidate biomarkers from the integration of methylation and gene expression in discordant autistic sibling pairs
While the genetics of autism spectrum disorders (ASD) has been intensively studied, resulting in the identification of over 100 putative risk genes, the epigenetics of ASD has received less attention, and results have been inconsistent across studies. We aimed to investigate the contribution of DNA methylation (DNAm) to the risk of ASD and identify candidate biomarkers arising from the interaction of epigenetic mechanisms with genotype, gene expression, and cellular proportions. We performed DNAm differential analysis using whole blood samples from 75 discordant sibling pairs of the Italian Autism Network collection and estimated their cellular composition. We studied the correlation between DNAm and gene expression accounting for the potential effects of different genotypes on DNAm. We showed that the proportion of NK cells was significantly reduced in ASD siblings, suggesting an imbalance in their immune system. We identified differentially methylated regions (DMRs) involved in neurogenesis and synaptic organization. Among candidate loci for ASD, we detected a DMR mapping to CLEC11A (neighboring SHANK1) where DNAm and gene expression were significantly and negatively correlated, independently of genotype effects. As reported in previous studies, we confirmed the involvement of immune functions in the pathophysiology of ASD. Notwithstanding the complexity of the disorder, suitable biomarkers such as CLEC11A and its neighbor SHANK1 can be discovered using integrative analyses even with peripheral tissues.