6 research outputs found

    Structural features based genome-wide characterization and prediction of nucleosome organization

    Background: Nucleosome distribution along chromatin dictates genomic DNA accessibility and thus profoundly influences gene expression. However, the underlying mechanism of nucleosome formation remains elusive. Here, taking a structural perspective, we systematically explored the nucleosome formation potential of genomic sequences and its effect on chromatin organization and gene expression in S. cerevisiae.
    Results: We analyzed twelve structural features related to the flexibility, curvature and energy of DNA sequences. Several of them, including DNA denaturation, DNA-bending stiffness, stacking energy, Z-DNA, propeller twist and free energy, were highly correlated with in vitro and in vivo nucleosome occupancy. They fall into two classes, one positively and the other negatively correlated with nucleosome occupancy. These two kinds of structural features facilitated nucleosome binding in centromere regions and repressed nucleosome formation in the promoter regions of protein-coding genes to mediate transcriptional regulation. Based on these analyses, we integrated all twelve structural features in a model that predicts nucleosome occupancy in vivo more accurately than existing methods, which mainly depend on sequence compositional features. Furthermore, we developed a novel approach, named DLaNe, that locates nucleosomes by detecting peaks of structural profiles, and built a meta predictor to integrate information from different structural features. As a comparison, we also constructed a hidden Markov model (HMM) to locate nucleosomes based on the profiles of these structural features. The results showed that the meta DLaNe and HMM-based methods performed better than the existing methods, demonstrating the power of these structural features in predicting nucleosome positions.
    Conclusions: Our analysis revealed that DNA structures significantly contribute to nucleosome organization and influence chromatin structure and gene expression regulation. The results indicated that our proposed methods are effective in predicting nucleosome occupancy and positions and that these structural features are highly predictive of nucleosome organization.
    The implementation of our DLaNe method based on structural features is available online.
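    The idea of locating nucleosomes by detecting peaks in a structural profile can be illustrated with a minimal sketch. This is not the DLaNe implementation: the smoothing window, the greedy selection rule and the function names `smooth` and `call_peaks` are assumptions made for illustration, and the only biological constant used is the roughly 147 bp of DNA wrapped around one nucleosome core.

```python
from typing import List

NUCLEOSOME_BP = 147  # approximate DNA length wrapped around one nucleosome core


def smooth(profile: List[float], window: int) -> List[float]:
    """Centered moving average; near the edges only the part of the window that fits is used."""
    half = window // 2
    out = []
    for i in range(len(profile)):
        lo, hi = max(0, i - half), min(len(profile), i + half + 1)
        out.append(sum(profile[lo:hi]) / (hi - lo))
    return out


def call_peaks(profile: List[float], window: int = 11,
               min_spacing: int = NUCLEOSOME_BP) -> List[int]:
    """Hypothetical peak caller: pick local maxima of the smoothed profile,
    strongest first, skipping any closer than min_spacing to one already chosen."""
    s = smooth(profile, window)
    # A local maximum rises from the left and does not rise further to the right.
    candidates = [i for i in range(1, len(s) - 1)
                  if s[i] > s[i - 1] and s[i] >= s[i + 1]]
    candidates.sort(key=lambda i: s[i], reverse=True)
    chosen: List[int] = []
    for i in candidates:
        if all(abs(i - j) >= min_spacing for j in chosen):
            chosen.append(i)
    return sorted(chosen)


# Toy profile (made-up values): two triangular bumps centered at 100 and 300.
toy = [max(0, 50 - abs(i - 100)) + max(0, 40 - abs(i - 300)) for i in range(400)]
peaks = call_peaks(toy)
```

    The spacing constraint stands in for the fact that two nucleosomes cannot occupy overlapping DNA; a real predictor would combine several structural profiles rather than a single score.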

    Nucleosome positioning: resources and tools online

    Nucleosome positioning is an important process required for proper genome packing and for making the genome accessible so that the genetic program is executed in a cell-specific, timely manner. In recent years hundreds of papers have been devoted to the bioinformatics, physics and biology of nucleosome positioning. The purpose of this review is to cover a practical aspect of this field, namely, to provide a guide to the multitude of nucleosome positioning resources available online. These include almost 300 experimental datasets of genome-wide nucleosome occupancy profiles determined in different cell types and more than 40 computational tools for the analysis of experimental nucleosome positioning data and the prediction of intrinsic nucleosome formation probabilities from the DNA sequence. A manually curated, up-to-date list of these resources will be maintained at http://generegulation.info

    Multi Layer Analysis

    This thesis presents a new methodology for analyzing one-dimensional signals through an approach called Multi Layer Analysis (MLA). It also provides new insights into the relationship between one-dimensional signals processed by MLA and tree kernels, tests of randomness and signal processing techniques. The MLA approach has a wide range of applications in pattern discovery and matching, computational biology and many other areas of computer science and signal processing. The thesis also includes applications of this approach to real problems in biology and seismology

    Bayesian Model Based Approaches In The Analysis Of Chromatin Structure And Motif Discovery

    Efficient detection of transcription factor (TF) binding sites is an important and unsolved problem in computational genomics. Recently, due to the poor predictive ability of motif finding algorithms, along with the proliferation of high-throughput genomic technologies, there has been a drive to utilize secondary information, such as the positioning of nucleosomes, for improving predictions. Nucleosomes prevent transcription factor binding at the sites they occupy by blocking TF access to the DNA. We aimed to construct an accurate map of nucleosome-free regions (NFRs), based on data from high-throughput genomic tiling arrays in yeast. Direct use of hidden Markov models (HMMs) is not always applicable due to variable-sized gaps and missing data, so we have extended the hidden Markov model procedure to a continuous-time version while efficiently incorporating DNA sequence features that are relevant to nucleosome formation. Simulation studies and an application to a yeast nucleosomal assay demonstrate the advantages of the new method. The established biological role of nucleosomes in relation to TF binding led us to formulate a joint model in the fourth chapter. The algorithm was implemented on the FAIRE data set, and comparisons were made with existing motif search algorithms. The fifth chapter deals with HMM asymptotics. We obtained results on consistency, asymptotic normality and contiguity of a hidden Markov model. These have helped our inference on the convergence properties of the posterior and the consistency of the Bayesian posterior estimates. This has led to the conclusion that Bayesian inference for an HMM run on sufficiently large datasets (which is typical in the case of genomic data) leads us very close to the underlying true parameters, as in the case of iid models. The result is general enough to justify HMM inference in a wide variety of datasets
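    Why a continuous-time formulation helps with variable-sized gaps can be seen in a small sketch. It is not the thesis's model: it assumes a two-state chain (state 0 = nucleosome-occupied, state 1 = nucleosome-free), Gaussian emissions, and hypothetical rate and emission parameters. The one standard ingredient is the closed-form transition matrix of a two-state continuous-time Markov chain, which makes the switching probability depend on the distance between neighboring probes.

```python
import math
from typing import List, Sequence, Tuple


def transition_matrix(gap: float, rate_01: float, rate_10: float) -> Tuple[Tuple[float, float], Tuple[float, float]]:
    """Transition probabilities of a two-state continuous-time Markov chain over a gap."""
    total = rate_01 + rate_10
    pi0, pi1 = rate_10 / total, rate_01 / total  # stationary distribution
    decay = math.exp(-total * gap)               # memory left after the gap
    return ((pi0 + pi1 * decay, pi1 * (1 - decay)),
            (pi0 * (1 - decay), pi1 + pi0 * decay))


def safe_log(p: float) -> float:
    return math.log(p) if p > 0 else float("-inf")


def log_gauss(x: float, mean: float, sd: float) -> float:
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)


def viterbi(positions: Sequence[float], signals: Sequence[float],
            rate_01: float, rate_10: float,
            means: Sequence[float], sds: Sequence[float]) -> List[int]:
    """Most likely state path for irregularly spaced probes (positions strictly increasing)."""
    total = rate_01 + rate_10
    start = (rate_10 / total, rate_01 / total)   # start from the stationary distribution
    score = [safe_log(start[j]) + log_gauss(signals[0], means[j], sds[j]) for j in (0, 1)]
    back: List[List[int]] = []
    for k in range(1, len(signals)):
        P = transition_matrix(positions[k] - positions[k - 1], rate_01, rate_10)
        new_score, ptr = [], []
        for j in (0, 1):
            cand = [score[i] + safe_log(P[i][j]) for i in (0, 1)]
            best = 0 if cand[0] >= cand[1] else 1
            new_score.append(cand[best] + log_gauss(signals[k], means[j], sds[j]))
            ptr.append(best)
        score = new_score
        back.append(ptr)
    state = 0 if score[0] >= score[1] else 1
    path = [state]
    for ptr in reversed(back):                   # trace the best path backwards
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


# Made-up probes: note the 300 bp gap between the third and fourth probe.
path = viterbi([0, 50, 100, 400, 450, 500], [1.0, 1.1, 0.9, -1.0, -1.2, -0.9],
               rate_01=0.005, rate_10=0.005, means=(1.0, -1.0), sds=(0.5, 0.5))
```

    Across a short gap the chain mostly keeps its state, while across a long gap the transition probabilities approach the stationary distribution, so missing probes simply weaken the link between their neighbors instead of breaking the model.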

    BioGRO: a new high-resolution method for the genome-wide study of nascent transcription in yeast

    This thesis builds on an existing, widely used and validated genomic technique for studying nascent transcription in yeast: Genomic run-on, which relies on macroarrays of the complete ORFs of the S. cerevisiae genome. Given the progressive emergence of new platforms that can interrogate every region of the genome at higher resolution, such as tiling arrays, the main objective of this thesis is to set up a technique adapted to them that allows a detailed analysis of nascent transcription. The specific objectives were:
    - Develop a new genome-scale run-on procedure that replaces radioactivity-based platforms and takes advantage of higher-resolution platforms, together with the bioinformatics tools needed to analyze the resulting data.
    - Study global nascent transcription profiles, exploiting the strand-specific nature of the data to examine the dynamics of the global yeast transcriptome. Compare and evaluate the complementarity of the data with other currently available measures of transcription rates.
    - Apply the technique to study the effect of mutants related to the RNA synthesis and degradation cycle, in order to extract information on the functioning of the transcriptional machinery and its regulation.
    - Detect possible patterns of RNA polymerase activity along transcripts and their flanking regions that might reflect constraints imposed by their chromatin context or by other factors.
    - Characterize the nascent transcription produced by the other nuclear RNAPs of yeast.
    - Develop a protocol for analyzing nascent RNAs at maximum resolution by massively parallel sequencing