34 research outputs found

    Conserved substitution patterns around nucleosome footprints in eukaryotes and Archaea derive from frequent nucleosome repositioning through evolution.

    Nucleosomes, the basic repeat units of eukaryotic chromatin, have been suggested to influence the evolution of eukaryotic genomes, both by altering the propensity of DNA to mutate and by selection acting to maintain or exclude nucleosomes in particular locations. Contrary to the popular idea that nucleosomes are unique to eukaryotes, histone proteins have also been discovered in some archaeal genomes. Archaeal nucleosomes, however, are quite unlike their eukaryotic counterparts in many respects, including their assembly into tetramers (rather than octamers) from histone proteins that lack N- and C-terminal tails. Here, we show that despite these fundamental differences the association between nucleosome footprints and sequence evolution is strikingly conserved between humans and the model archaeon Haloferax volcanii. In light of this finding, we examine whether selection or mutation can explain concordant substitution patterns in the two kingdoms. Unexpectedly, we find that neither the mutation nor the selection model is sufficient to explain the observed association between nucleosomes and sequence divergence. Instead, we demonstrate that nucleosome-associated substitution patterns are more consistent with a third model where sequence divergence results in frequent repositioning of nucleosomes during evolution. Indeed, we show that nucleosome repositioning is both necessary and largely sufficient to explain the association between current nucleosome positions and biased substitution patterns. This finding highlights the importance of considering the direction of causality between genetic and epigenetic change.

    Nucleosome Positioning and Its Role in Gene Regulation in Yeast

    The nucleosome, composed of a 147-bp segment of DNA helix wrapped around a histone protein octamer, serves as the basic unit of chromatin. Nucleosome positioning refers to the position of the DNA double helix relative to the histone octamer. Positioning plays an important role in transcription, DNA replication and other DNA transactions, since packing DNA into nucleosomes occludes protein binding sites. Moreover, nucleosomes bear histone modifications, which have a profound effect on regulation. Nucleosome positioning and its roles have been studied extensively in the model organism yeast. In this chapter, nucleosome organization and its roles in gene regulation are reviewed. Typically, nucleosomes are depleted around transcription start sites (TSSs), resulting in a nucleosome-free region (NFR) that is flanked by two well-positioned H2A.Z-containing nucleosomes. The nucleosomes downstream of the TSS are equally spaced in a nucleosome array. DNA sequences, especially 10–11 bp periodicities of some specific dinucleotides, partly determine nucleosome positioning. Nucleosome occupancy can be determined with high-throughput sequencing techniques. Importantly, nucleosome positions are dynamic across cell types and environments. Histone depletion, histone mutations, heat shock and changes in carbon source profoundly change nucleosome organization. In yeast cells, upon mutation of the histones, nucleosomes change drastically at promoters, and highly expressed genes, such as ribosome genes, undergo more change. Changes in nucleosomes are tightly associated with transcription initiation, elongation and termination. H2A.Z is contained in the +1 and −1 nucleosomes and is thus involved in transcription. The chaperone Chz1 and the elongation factor Spt16 function in H2A.Z deposition on chromatin. The chapter covers the basic concept of nucleosomes, nucleosome determinants, the techniques for mapping nucleosomes, nucleosome alteration upon stress and mutation, and Htz1 dynamics on chromatin.

    Regulation of the nucleosome repeat length in vivo by the DNA sequence, protein concentrations and long-range interactions.

    The nucleosome repeat length (NRL) is an integral chromatin property important for its biological functions. Recent experiments revealed several conflicting trends of the NRL dependence on the concentrations of histones and other architectural chromatin proteins, both in vitro and in vivo, but a systematic theoretical description of NRL as a function of DNA sequence and epigenetic determinants is currently lacking. To address this problem, we have performed an integrative biophysical and bioinformatics analysis in species ranging from yeast to frog to mouse where NRL was studied as a function of various parameters. We show that in simple eukaryotes such as yeast, a lower limit for the NRL value exists, determined by internucleosome interactions and remodeler action. For higher eukaryotes, an upper limit also exists, since NRL is an increasing but saturating function of the linker histone concentration. Counterintuitively, smaller H1 variants or non-histone architectural proteins can initiate larger effects on the NRL due to entropic reasons. Furthermore, we demonstrate that different regimes of the NRL dependence on histone concentrations exist depending on whether DNA sequence-specific effects dominate over boundary effects or vice versa. We consider several classes of genomic regions with apparently different regimes of the NRL variation. As one extreme, our analysis reveals that the period of oscillations of the nucleosome density around bound RNA polymerase coincides with the period of oscillations of positioning sites of the corresponding DNA sequence. At another extreme, we show that although mouse major satellite repeats intrinsically encode well-defined nucleosome preferences, they have no unique nucleosome arrangement and can undergo a switch between two distinct types of nucleosome positioning.

    Affinity, stoichiometry and cooperativity of heterochromatin protein 1 (HP1) binding to nucleosomal arrays

    Heterochromatin protein 1 (HP1) participates in establishing and maintaining heterochromatin via its histone-modification-dependent chromatin interactions. In recent papers HP1 binding to nucleosomal arrays was measured in vitro and interpreted in terms of nearest-neighbour cooperative binding. This mode of chromatin interaction could lead to the spreading of HP1 along the nucleosome chain. Here, we reanalysed previous data by representing the nucleosome chain as a 1D binding lattice and showed how the experimental HP1 binding isotherms can be explained by a simpler model without cooperative interactions between neighbouring HP1 dimers. Based on these calculations and spatial models of dinucleosomes and nucleosome chains, we propose that binding stoichiometry depends on the nucleosome repeat length (NRL) rather than protein interactions between HP1 dimers. According to our calculations, more open nucleosome arrays with long DNA linkers are characterized by a larger number of binding sites in comparison to chains with a short NRL. Furthermore, we demonstrate by Monte Carlo simulations that the NRL-dependent folding of the nucleosome chain can induce allosteric changes of HP1 binding sites. Thus, HP1 chromatin interactions can be modulated by the change of binding stoichiometry and the type of binding to condensed (methylated) and non-condensed (unmethylated) nucleosome arrays in the absence of direct interactions between HP1 dimers.

    Structural constraints revealed in consistent nucleosome positions in the genome of S. cerevisiae

    Background: Recent advances in the field of high-throughput genomics have rendered possible the performance of genome-scale studies to define the nucleosomal landscapes of eukaryote genomes. Such analyses are aimed towards providing a better understanding of the process of nucleosome positioning, for which several models have been suggested. Nevertheless, questions regarding the sequence constraints of nucleosomal DNA and how they may have been shaped through evolution remain open. In this paper, we analyze in detail different experimental nucleosome datasets with the aim of providing a hypothesis for the emergence of nucleosome-forming sequences.

    Results: We compared the complete sets of nucleosome positions for the budding yeast (Saccharomyces cerevisiae) as defined in the output of two independent experiments with the use of two different experimental techniques. We found that fewer than 10% of the experimentally defined nucleosome positions were consistently positioned in both datasets. This subset of well-positioned nucleosomes, when compared with the bulk, was shown to have particular properties at both sequence and structural levels. Consistently positioned nucleosomes were also shown to occur preferentially in pairs of dinucleosomes, and to be surprisingly less conserved compared with their adjacent nucleosome-free linkers.

    Conclusion: Our findings may be combined into a hypothesis for the emergence of a weak nucleosome-positioning code. According to this hypothesis, consistent nucleosomes may be partly guided by nearby nucleosome-free regions through statistical positioning. Once established, a set of well-positioned consistent nucleosomes may impose secondary constraints that further shape the structure of the underlying DNA. We were able to capture these constraints through the application of a recently introduced structural property that is related to the symmetry of DNA curvature. Furthermore, we found that both consistently positioned nucleosomes and their adjacent nucleosome-free regions show an increased tendency towards conservation of this structural feature.

    A study on chromatin structure via nucleosome positioning pattern classification using n-gram graphs

    Knowing the exact locations of nucleosomes in a genome is crucial for understanding how gene expression is organized. Consequently, in recent years there has been growing interest in applying text-mining techniques to genomic studies; one example is examining whether the primary structure of DNA, i.e. textual data extracted from genomes, influences nucleosome positioning and, thus, chromatin structure. To the best of our knowledge, there is no complete study of the effect of representation on the classification of genomic sequences as nucleosome-free regions (NFR), i.e. sequences depleted of nucleosomes, or nucleosome-binding sites (NBS), i.e. sequences where nucleosomes are present. In this thesis we study three different genomic sequence representations (Hidden Markov Models, Bag-of-Words and N-gram Graphs) in combination with a number of machine learning algorithms on the task of classifying genomic sequences as NFR or NBS. Finally, we conclude that, based on our findings, approaches that use different representations or algorithms can be more or less effective at predicting nucleosome positioning from the textual data of the underlying genomic sequence.
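As a rough illustration of what a Bag-of-Words representation of a genomic sequence can look like, the sketch below counts overlapping k-mers and discards their order. It is not taken from the thesis: the function name `kmer_bag_of_words` and the default `k = 3` are assumptions chosen for the example, and the thesis may define its "words" differently.

```python
from collections import Counter


def kmer_bag_of_words(sequence: str, k: int = 3) -> Counter:
    """Count overlapping k-mers in a DNA sequence (hypothetical helper).

    The sequence is treated as text: each k-mer is a "word", and only
    word counts are kept, not word order.
    """
    if k <= 0:
        raise ValueError("k must be a positive integer")
    seq = sequence.upper()
    # Slide a window of width k across the sequence, one base at a time.
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


if __name__ == "__main__":
    # Illustrative input only; not data from the thesis.
    print(kmer_bag_of_words("ACGACG", k=3))
```

Count vectors of this kind could then be passed to any standard classifier to separate NFR from NBS sequences.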

    Analysing and quantitatively modelling nucleosome binding preferences

    The main emphasis of my work as a PhD student was the analysis and prediction of nucleosome positioning, focusing on the role sequence features play. Part I gives a broad overview of nucleosomes, before defining important technical terms. It continues by describing and reviewing experiments that measure nucleosome positioning and bioinformatic methods that learn the sequence preferences of nucleosomes to predict their positioning. Part II describes a collaboration project with the Gaul-lab, where I analyzed MNase-Seq measurements of nucleosomes in Drosophila. The original intention was to investigate the extent to which experimental biases influence the measurements. We extended the analysis to categorize and explore fragile, average and resistant nucleosome populations. I focused on the relation between nucleosome fragility and the sequence landscape, especially at promoters and enhancers. Analyzing the partial unwrapping of nucleosomes genome-wide, I found that the G+C ratio is a determinant of asymmetric unwrapping. I excluded an analysis of histone modifications from this work, which was part of this collaboration, due to its low relevance to the rest of the presented work. Part III describes my main project of developing a probabilistic nucleosome-position prediction method. I developed a maximum likelihood approach to learn a biophysical model of nucleosome binding. By including the low positional resolution of MNase-Seq and the sequence bias of CC-Seq in the likelihood, I could separate them from the nucleosome binding preferences and learn highly correlated nucleosome binding energy models. My analysis shows that nucleosomes have a position-specific binding preference and might be uninfluenced by G+C content or even disfavor it, contrary to the consensus in the literature. Part IV describes further analyses I did during my time as a PhD student that are not part of any planned publications. The main topics are: ancillary elements of my main project, unsuccessful attempts to correct experimental biases, analysis of the quality of experimental measurements, and adapting my probabilistic nucleosome-position prediction method to work with occupancy measurements. Lastly, I give a general outlook that reflects on my results and discusses next steps, like ways to improve my method further. I excluded two collaboration projects I participated in from this thesis, because they are still ongoing: a systematic analysis of how the core promoter sequence influences gene expression in Drosophila and the development of an experiment to measure nucleosome occupancy more precisely.

    Nucleosome positioning dynamics in evolution and disease

    Nucleosome positioning is involved in a variety of cellular processes, and it provides a likely substrate for species evolution and may play roles in human disease. However, many fundamental aspects of nucleosome positioning remain controversial, such as the relative importance of underlying sequence features, genomic neighbourhood and trans-acting factors. In this thesis, I have focused on analyses of the divergence and conservation of nucleosome positioning, associated substitution spectra, and the interplay between them. I have investigated the extent to which nucleosome positioning patterns change following the duplication of a DNA sequence and its insertion into a new genomic region within the same species, by assessing the relative nucleosome positioning between paralogous regions in both the human (using in vitro and in vivo datasets) and yeast (in vivo) genomes. I observed that the positioning of paralogous nucleosomes is generally well conserved and detected a strong rotational preference where nucleosome positioning has diverged. I have also found, in all datasets, that DNA sequence features appear to be more important than local chromosomal environments in nucleosome positioning evolution, while controlling for trans-acting factors that can potentially confound inter-species comparisons. I have also examined the relationships between chromatin structure and DNA sequence variation, with a particular focus on the spectra of (germline and somatic) substitutions seen in human diseases. Both somatic and germline substitutions are found to be enriched at sequences coinciding with nucleosome cores. In addition, transitions appear to be enriched in germline relative to somatic substitutions at nucleosome core regions. This difference in transition to transversion ratio is also seen at transcription start sites (TSSs) genome-wide. However, the contrasts seen between somatic and germline mutational spectra do not appear to be attributable to alterations in nucleosome positioning between cell types. Examination of multiple human nucleosome positioning datasets shows conserved positioning across TSSs and strongly conserved global phasing between 4 cancer cell lines and 7 non-cancer cell lines. This suggests that the particular mutational profiles seen for somatic and germline cells occur upon a common landscape of conserved chromatin structure. I extended my studies of mutational spectra by analysing genome sequencing data from various tissues in a cohort of individuals to identify human somatic mutations. This allowed an assessment of the relationship between age and mutation accumulation and a search for inherited genetic variants linked to high somatic mutation rates. A list of candidate germline variants that potentially predispose to increased somatic mutation rates was the outcome. Together these analyses contribute to an integrated view of genome evolution, encompassing the divergence of DNA sequence and chromatin structure, and explorations of how they may interact in human disease.
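For readers unfamiliar with the transition to transversion ratio mentioned in this abstract, the following minimal sketch shows one common way to compute it from a list of single-base substitutions. The function names and the `(ref, alt)` input format are assumptions made for illustration; the thesis does not specify its implementation.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref: str, alt: str) -> bool:
    """True for purine<->purine or pyrimidine<->pyrimidine changes."""
    pair = {ref.upper(), alt.upper()}
    # A real substitution has two distinct bases; a transition keeps
    # both of them within the same chemical class.
    return len(pair) == 2 and (pair <= PURINES or pair <= PYRIMIDINES)


def ti_tv_ratio(substitutions) -> float:
    """Transition/transversion ratio for (ref, alt) base pairs.

    Hypothetical helper: identical ref/alt pairs are ignored, and a
    ValueError is raised when no transversions are present.
    """
    transitions = transversions = 0
    for ref, alt in substitutions:
        if ref.upper() == alt.upper():
            continue  # not a substitution
        if is_transition(ref, alt):
            transitions += 1
        else:
            transversions += 1
    if transversions == 0:
        raise ValueError("ratio undefined: no transversions observed")
    return transitions / transversions


if __name__ == "__main__":
    # Illustrative input only; not data from the thesis.
    print(ti_tv_ratio([("A", "G"), ("C", "T"), ("A", "C"), ("G", "T")]))
```

Computing this ratio separately for substitutions inside and outside nucleosome cores, or for germline and somatic sets, gives the kind of contrast the abstract describes.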

    The mechanical genome : inquiries into the mechanical function of genetic information

    The four possible segments A, T, C and G that link together to form DNA molecules, and with their ordering encode genetic information, are not only different in name, but also in their physical and chemical properties. The result is that DNA molecules with different sequences have different physical behavior. For instance, one sequence may lead to a very flexible DNA molecule, another to a very stiff one. A DNA molecule with a given sequence may be straight, or intrinsically curved. This leads to an interplay between the information stored in a DNA molecule on one hand, and the physical properties of that molecule on the other. This is of great importance in our cells, where lengths of DNA far longer than the size of the cells that contain them need to be significantly folded up. The research presented in this thesis looks at how we can model this interplay, what its effects can be, and whether nature has made use of it to encode mechanical signals into real genomes. Theoretical Physic