
    Predicting genome-wide DNA methylation using methylation marks, genomic position, and DNA regulatory elements

    Background: Recent assays for individual-specific genome-wide DNA methylation profiles have enabled epigenome-wide association studies to identify specific CpG sites associated with a phenotype. Computational prediction of CpG site-specific methylation levels is important, but current approaches tackle average methylation within a genomic locus and are often limited to specific genomic regions. Results: We characterize genome-wide DNA methylation patterns, and show that correlation among CpG sites decays rapidly, making predictions solely based on neighboring sites challenging. We built a random forest classifier to predict CpG site methylation levels using as features neighboring CpG site methylation levels and genomic distance, and co-localization with coding regions, CpG islands (CGIs), and regulatory elements from the ENCODE project, among others. Our approach achieves 91% to 94% prediction accuracy of genome-wide methylation levels at single CpG site precision. The accuracy increases to 98% when restricted to CpG sites within CGIs. Our classifier outperforms state-of-the-art methylation classifiers and identifies features that contribute to prediction accuracy: neighboring CpG site methylation status, CpG island status, co-localized DNase I hypersensitive sites, and specific transcription factor binding sites were found to be most predictive of methylation levels. Conclusions: Our observations of DNA methylation patterns led us to develop a classifier to predict site-specific methylation levels that achieves the best DNA methylation predictive accuracy to date. Furthermore, our method identified genomic features that interact with DNA methylation, elucidating mechanisms involved in DNA methylation modification and regulation, and linking different epigenetic processes.

    Linking the Epigenome to the Genome: Correlation of Different Features to DNA Methylation of CpG Islands

    DNA methylation of CpG islands plays a crucial role in the regulation of gene expression. More than half of all human promoters contain CpG islands with a tissue-specific methylation pattern in differentiated cells. How DNA methyltransferases determine which regions should be methylated is still not fully understood. There are many hypotheses about which genomic features are correlated to the epigenome that have not yet been evaluated. Furthermore, many exploratory approaches to measuring DNA methylation are limited to a subset of the genome and thus cannot be employed, e.g., for genome-wide biomarker prediction methods. In this study, we evaluated the correlation of genetic, epigenetic and hypothesis-driven features to DNA methylation of CpG islands. To this end, various binary classifiers were trained and evaluated by cross-validation on a dataset comprising DNA methylation data for 190 CpG islands in HEPG2, HEK293, fibroblasts and leukocytes. We achieved an accuracy of up to 91% with an MCC of 0.8 using ten-fold cross-validation and ten repetitions. With these models, we extended the existing dataset to the whole genome and thus predicted the methylation landscape for the given cell types. The method used for these predictions is also validated on another external whole-genome dataset. Our results reveal features correlated to DNA methylation and confirm or disprove various hypotheses of DNA methylation-related features. This study confirms correlations between DNA methylation and histone modifications, DNA structure, DNA sequence, genomic attributes and CpG island properties. Furthermore, the method has been validated on a genome-wide dataset from the ENCODE consortium. The developed software, as well as the predicted datasets and a web-service to compare methylation states of CpG islands are available at http://www.cogsys.cs.uni-tuebingen.de/software/dna-methylation/
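
The abstract above reports classifier quality as both accuracy and Matthews correlation coefficient (MCC). As a minimal sketch of how the two metrics relate, the following computes both from binary confusion-matrix counts. The counts and the function names are hypothetical and chosen only for illustration; they are not taken from the study.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: report 0.0 when any marginal total is zero (MCC undefined).
    return numerator / denominator if denominator else 0.0


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical, balanced counts chosen for illustration only.
counts = dict(tp=45, tn=45, fp=5, fn=5)
print(accuracy(**counts), matthews_corrcoef(**counts))
```

With these made-up balanced counts the accuracy is 0.90 and the MCC is 0.80. On imbalanced data the two metrics can diverge considerably, which is one reason to report both.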

    Dynamic regulation of the transcription initiation landscape at single nucleotide resolution during vertebrate embryogenesis

    Spatiotemporal control of gene expression is central to animal development. Core promoters represent a previously unanticipated regulatory level by interacting with cis-regulatory elements and transcription initiation in different physiological and developmental contexts. Here, we provide the first comprehensive description of the core promoter repertoire and its dynamic use during the development of a vertebrate embryo. By using cap analysis of gene expression (CAGE), we mapped transcription initiation events at single nucleotide resolution across 12 stages of zebrafish development. These CAGE-based transcriptome maps reveal genome-wide rules of core promoter usage, structure, and dynamics, key to understanding the control of gene regulation during vertebrate ontogeny. They revealed the existence of multiple classes of pervasive intra- and intergenic post-transcriptionally processed RNA products and their developmental dynamics. Among these RNAs, we report splice donor site-associated intronic RNA (sRNA) to be specific to genes of the splicing machinery. For the identification of conserved features, we compared the zebrafish data sets to the first CAGE promoter map of Tetraodon and the existing human CAGE data. We show that a number of features, such as promoter type, newly discovered promoter properties such as a specialized purine-rich initiator motif, as well as sRNAs and the genes in which they are detected, are conserved in mammalian and Tetraodon CAGE-defined promoter maps. The zebrafish developmental promoterome represents a powerful resource for studying developmental gene regulation and revealing promoter features shared across vertebrates.

    Integrating genetics and epigenetics in breast cancer: biological insights, experimental, computational methods and therapeutic potential


    Benchmarks for the Point Kinetics Equations

    A new numerical algorithm is presented for the solution of the Point Kinetics Equations, whose accurate solution has been sought for over 60 years. The method couples the simplest of finite difference methods, backward Euler, with Richardson extrapolation, also called acceleration. From this coupling, a series of benchmarks have emerged. These include cases from the literature as well as several new ones. The novelty of this presentation lies in the breadth of insertions considered, covering both prescribed and feedback reactivities, and the extreme eight to nine-digit accuracy achievable. The benchmarks presented are intended to provide guidance to those who wish to develop further improvements.
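
The coupling of backward Euler with Richardson extrapolation described above can be sketched on a deliberately simplified case. This is not the paper's algorithm: it assumes a single delayed neutron group, a constant (step) reactivity, and one extrapolation level using step sizes h and h/2. All parameter values and function names are hypothetical.

```python
import math

# Hypothetical one-delayed-group parameters (illustrative, not from the paper).
BETA = 0.0065      # delayed neutron fraction
GEN_TIME = 1.0e-4  # neutron generation time, s
DECAY = 0.08       # precursor decay constant, 1/s
RHO = 0.003        # constant (step) reactivity, below prompt critical


def kinetics_matrix(rho=RHO):
    """Matrix A of the linear system y' = A y with y = (n, C)."""
    return ((rho - BETA) / GEN_TIME, DECAY), (BETA / GEN_TIME, -DECAY)


def backward_euler(a, y0, t_end, steps):
    """Integrate y' = A y with backward Euler for a constant 2x2 matrix A."""
    (a11, a12), (a21, a22) = a
    h = t_end / steps
    # Backward Euler solves (I - h A) y_new = y_old at every step.
    m11, m12 = 1.0 - h * a11, -h * a12
    m21, m22 = -h * a21, 1.0 - h * a22
    det = m11 * m22 - m12 * m21
    n, c = y0
    for _ in range(steps):
        n, c = (m22 * n - m12 * c) / det, (-m21 * n + m11 * c) / det
    return n, c


def richardson(a, y0, t_end, steps):
    """Cancel the O(h) backward Euler error using step sizes h and h/2."""
    coarse = backward_euler(a, y0, t_end, steps)
    fine = backward_euler(a, y0, t_end, 2 * steps)
    return tuple(2.0 * f - c for f, c in zip(fine, coarse))


def exact_density(a, y0, t):
    """Closed-form n(t) from the two eigenvalues of A (constant reactivity)."""
    (a11, a12), (a21, a22) = a
    trace, det = a11 + a22, a11 * a22 - a12 * a21
    root = math.sqrt(trace * trace - 4.0 * det)
    w1, w2 = (trace + root) / 2.0, (trace - root) / 2.0
    s = y0[0] / a12
    c1 = (y0[1] - s * (w2 - a11)) / (w1 - w2)
    c2 = s - c1
    return a12 * (c1 * math.exp(w1 * t) + c2 * math.exp(w2 * t))


A = kinetics_matrix()
Y0 = (1.0, BETA / (GEN_TIME * DECAY))  # start from equilibrium at n = 1

if __name__ == "__main__":
    exact = exact_density(A, Y0, 1.0)
    print("backward Euler error:", abs(backward_euler(A, Y0, 1.0, 200)[0] - exact))
    print("extrapolated error:  ", abs(richardson(A, Y0, 1.0, 100)[0] - exact))
```

Because backward Euler is first-order accurate, the combination 2·y(h/2) − y(h) removes the leading error term. The closed-form solution is included only so that the two errors can be compared in this constant-reactivity case.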

    The solution of the point kinetics equations via converged accelerated Taylor series (CATS)

    This paper deals with finding accurate solutions of the point kinetics equations including nonlinear feedback, in a fast, efficient and straightforward way. A truncated Taylor series is coupled to continuous analytical continuation to provide the recurrence relations to solve the ordinary differential equations of point kinetics. Nonlinear (Wynn-epsilon) and linear (Romberg) convergence accelerations are employed to provide highly accurate results for the evaluation of Taylor series expansions and extrapolated values of neutron and precursor densities at desired edits. The proposed Converged Accelerated Taylor Series, or CATS, algorithm automatically performs successive mesh refinements until the desired accuracy is obtained, making use of the intermediate results for converged initial values at each interval. Numerical performance is evaluated using case studies available from the literature. Nearly perfect agreement is found with the literature results generally considered most accurate. Benchmark quality results are reported for several cases of interest including step, ramp, zigzag and sinusoidal prescribed insertions and insertions with adiabatic Doppler feedback. A larger than usual (9) number of digits is included to encourage honest benchmarking. The benchmark is then applied to the enhanced piecewise constant algorithm (EPCA) currently being developed by the second author.
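
The Wynn-epsilon acceleration named above can be illustrated on its own with a slowly converging series. The sketch below is not the CATS algorithm: it applies the standard epsilon recurrence to partial sums of the alternating harmonic series, whose limit is ln 2. The choice of test series and the function names are assumptions made for illustration.

```python
import math


def wynn_epsilon(partial_sums):
    """Accelerate a sequence of partial sums with Wynn's epsilon algorithm.

    Pass an odd number of partial sums so the last column is an even one,
    since only even-numbered columns carry estimates of the limit.
    """
    if len(partial_sums) % 2 == 0:
        raise ValueError("use an odd number of partial sums")
    previous = [0.0] * (len(partial_sums) + 1)   # column eps_{-1}
    current = list(partial_sums)                 # column eps_0
    while len(current) > 1:
        # eps_{k+1}^{(j)} = eps_{k-1}^{(j+1)} + 1 / (eps_k^{(j+1)} - eps_k^{(j)})
        following = [
            previous[j + 1] + 1.0 / (current[j + 1] - current[j])
            for j in range(len(current) - 1)
        ]
        previous, current = current, following
    return current[0]


def alternating_harmonic_sums(count):
    """Partial sums of 1 - 1/2 + 1/3 - ..., which converge slowly to ln 2."""
    total, sums = 0.0, []
    for k in range(1, count + 1):
        total += (-1) ** (k + 1) / k
        sums.append(total)
    return sums


if __name__ == "__main__":
    sums = alternating_harmonic_sums(9)
    print("last partial sum error:", abs(sums[-1] - math.log(2)))
    print("accelerated error:     ", abs(wynn_epsilon(sums) - math.log(2)))
```

Nine partial sums are enough here for the accelerated value to be several orders of magnitude closer to ln 2 than the last raw partial sum, which shows why a nonlinear accelerator is attractive when evaluating series expansions.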

    GM2-1 pancreatic islet ganglioside: identification and characterization of a novel islet-specific molecule.

    Recent studies have indicated that GM2-1, a pancreatic islet monosialo-ganglioside, is an islet-specific component whose expression is metabolically regulable and represents one of the target antigens of cytoplasmic islet cell antibodies. In the present study we aimed to biochemically characterize this molecule using a panel of biochemical techniques including gas chromatography, thin layer chromatography, enzymatic digestion and mass spectrometry. GM2-1 ganglioside was extracted from human pancreas and purified by thin-layer chromatography. Fatty acids in the ceramide (the hydrophobic portion of the molecule), identified by gas chromatography, ranged from C16:1 to C24:1. The oligosaccharide chain was enzymatically digested by the sequential application of various exoglycosidases (neuraminidase followed by beta-galactosidase, followed by beta-hexosaminidase) and characterized by gas chromatography identification of the liberated sugars. The following structure was deduced from enzymatic studies and confirmed by mass spectrometry analysis: N-acetyl neuraminic acid-galactose-galactosamine-galactosamine-glucose-ceramide. This is a novel ganglioside structure, not yet described, which shares characteristics with a neuronal glycolipid autoantigen: the LM1 ganglioside. Both GM2-1 and LM1 have a single sialic acid residue in the terminal position, the same migration position on thin layer chromatography and the same number of carbohydrate moieties. In conclusion, we have characterized a novel islet-specific ganglioside molecule with unusual characteristics, such as the terminal sialic acid and the galactosamine residues, which may facilitate both its antigenicity and its involvement in beta-cell autoimmunity.