Chromatin Heterogeneity and Distribution of Regulatory Elements in the Late-Replicating Intercalary Heterochromatin Domains of Drosophila melanogaster Chromosomes.
Late-replicating domains (intercalary heterochromatin) in the Drosophila genome display a number of features suggesting that their organization is unique. Typically, they are quite large and encompass clusters of functionally unrelated tissue-specific genes. They correspond to the topologically associating domains and conserved microsynteny blocks. Our study aims at exploring further details of molecular organization of intercalary heterochromatin and has uncovered surprising heterogeneity of chromatin composition in these regions. Using the 4HMM model developed in our group earlier, intercalary heterochromatin regions were found to host chromatin fragments with a particular epigenetic profile. Aquamarine chromatin fragments (spanning 0.67% of late-replicating regions) are characterized as a class of sequences that appear heterogeneous in terms of their decompactization. These fragments are enriched with enhancer sequences and binding sites for insulator proteins. They likely mark the chromatin state that is related to the binding of cis-regulatory proteins. Malachite chromatin fragments (11% of late-replicating regions) appear to function as universal transitional regions between two contrasting chromatin states. Namely, they invariably delimit intercalary heterochromatin regions from the adjacent active chromatin of interbands. Malachite fragments also flank aquamarine fragments embedded in the repressed chromatin of late-replicating regions. Significant enrichment of insulator proteins CP190, SU(HW), and MOD2.2 was observed in malachite chromatin. Neither aquamarine nor malachite chromatin types appear to correlate with the positions of highly conserved non-coding elements (HCNE) that are typically abundant in intercalary heterochromatin. Malachite chromatin found on the flanks of intercalary heterochromatin regions tends to replicate earlier than the malachite chromatin embedded in intercalary heterochromatin.
In other words, there exists a gradient of replication progressing from the flanks of intercalary heterochromatin regions toward their centers. The peculiar organization and features of replication in large late-replicating regions are discussed as possible factors shaping the evolutionary stability of intercalary heterochromatin.
Overlap between 4HMM fragments and chromatin states and types.
<p>5 principal chromatin colors reported in Filion et al., [<a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0157147#pone.0157147.ref004" target="_blank">4</a>] (A); 9 chromatin states by Kharchenko et al., [<a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0157147#pone.0157147.ref008" target="_blank">8</a>], S2 cells (B), BG3 cells (C); and 3 chromatin compactization classes [<a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0157147#pone.0157147.ref010" target="_blank">10</a>] (D).</p>
Enrichment levels of the paused RNA pol II in the four chromatin types.
<p>X axis shows the number of 5'-forward short non-polyadenylated transcripts (reads) produced by the paused RNA pol II (peaks). The density of peaks overlapping with a particular chromatin type is shown on the Y axis.</p>
Gene expression in different 4HMM chromatin types.
<p>(A) The number of tissues where genes are active (RPKM>3). (B) Magnitude of gene expression summed for 29 tissues. Expression range in S2 (C) and Kc167 cells (D). Quartiles computed for the RPKM values are classified by the chromatin type. The distribution of the first, second and third quartiles of RPKM values for the datasets of transcripts is restricted by the chromatin types. For each chromatin color, the bottom part of the bar denotes the interval from the first to the second quartile; the top part denotes the interval from the second to the third quartile. In all panels, various chromatin types are shown on the X axis (from left to right: border malachite IH, internal malachite IH, internal aquamarine IH, aquamarine genome, ruby genome). Whiskers below/above the 1st/3rd quartile correspond to the 12.5th and 87.5th percentiles.</p>
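The summary statistics named in this caption (quartiles, whiskers at the 12.5th and 87.5th percentiles, and the RPKM>3 activity threshold) can be sketched as follows. This is only an illustration: the function names and input layout are hypothetical, and the interpolation rule is whatever Python's `statistics.quantiles` uses by default, which may differ from the one used for the published figure.

```python
from statistics import quantiles


def box_summary(rpkm_values):
    """Return quartiles and 12.5th/87.5th percentile whiskers for RPKM values.

    Splitting the data into 8 equal-probability groups yields 7 cut points
    at the 12.5, 25, 37.5, 50, 62.5, 75 and 87.5 percentiles.
    """
    cuts = quantiles(rpkm_values, n=8)
    return {
        "whisker_low": cuts[0],   # 12.5th percentile
        "q1": cuts[1],            # first quartile
        "median": cuts[3],        # second quartile
        "q3": cuts[5],            # third quartile
        "whisker_high": cuts[6],  # 87.5th percentile
    }


def count_active_tissues(rpkm_by_tissue, threshold=3.0):
    """Count tissues in which a gene is active (RPKM above the threshold).

    `rpkm_by_tissue` is a hypothetical mapping of tissue name to RPKM.
    """
    return sum(1 for value in rpkm_by_tissue.values() if value > threshold)
```

Calling `box_summary` once per chromatin type gives the five numbers needed to draw one bar with its whiskers.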
Replication timing gradient in the IH region 47A1-2.
<p>Comparison of positions of 4HMM-derived chromatin types with replication timing. Data on the distribution of ORC2 protein and replication timing were taken from the figures published in Belyaeva et al., [<a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0157147#pone.0157147.ref023" target="_blank">23</a>].</p>
Enhancers and protein distributions across different 4HMM chromatin types.
<p>(A) Ratio of the observed fraction of overlapping fragments to the expected one. Observed fraction means the ratio of the total length of genomic regions associated with a protein of interest to the total length of the chromatin type in IH. Expected fraction is the fraction of overlap expected by chance (under random distribution model). Only the values above the "expected" threshold are shown. Asterisks denote probabilities of occurrence by chance: * p<0.05; ** p<1E<sup>-3</sup>; *** p<1E<sup>-25</sup>. (B) Probability values that the observed overlap happened by chance. Bar height (-log<sub>10</sub>[P]) shows the significance levels for the enrichment of a chromatin type with regulatory elements or proteins indicated on the X axis. The probabilities were computed by the permutation Monte-Carlo test, as described in Materials and Methods section. Enhancers (1) and (2) are taken from RedFly [<a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0157147#pone.0157147.ref036" target="_blank">36</a>] and Kvon et al. [<a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0157147#pone.0157147.ref029" target="_blank">29</a>], respectively.</p>
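The observed-to-expected ratio and the permutation Monte-Carlo probability described in this caption can be illustrated with a generic sketch. The Materials and Methods section is not reproduced here, so the shuffling scheme below (random non-overlapping placement of the protein intervals within one region) is an assumption, as are all function and parameter names.

```python
import random


def overlap_length(a, b):
    """Total bp shared by two sorted lists of non-overlapping (start, end) intervals."""
    total, i, j = 0, 0, 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo < hi:
            total += hi - lo
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return total


def shuffle_intervals(intervals, region_length, rng):
    """Re-place intervals of the same lengths at random, without overlaps."""
    lengths = [end - start for start, end in intervals]
    rng.shuffle(lengths)
    free = region_length - sum(lengths)
    # Sorted random offsets into the free space keep intervals ordered and disjoint.
    cuts = sorted(rng.randint(0, free) for _ in lengths)
    placed, shift = [], 0
    for cut, length in zip(cuts, lengths):
        start = cut + shift
        placed.append((start, start + length))
        shift += length
    return placed


def enrichment(chromatin, protein, region_length, n_perm=1000, seed=0):
    """Return (observed/expected overlap ratio, one-sided permutation p-value)."""
    rng = random.Random(seed)
    observed = overlap_length(chromatin, protein)
    null = [
        overlap_length(chromatin, shuffle_intervals(protein, region_length, rng))
        for _ in range(n_perm)
    ]
    expected = sum(null) / n_perm
    ratio = observed / expected if expected else float("inf")
    # Add-one correction so the estimated p-value is never exactly zero.
    p_value = (1 + sum(x >= observed for x in null)) / (n_perm + 1)
    return ratio, p_value
```

Because both fractions share the same denominator (the total length of the chromatin type), the ratio of overlap lengths equals the ratio of observed to expected fractions. Note that with `n_perm` permutations the smallest reportable p-value is 1/(n_perm + 1), so very small probabilities would need a far larger number of permutations or an analytical approximation.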
NotI flanking sequences: a tool for gene discovery and verification of the human genome
A set of 22 551 unique human NotI flanking sequences (16.2 Mb) was generated. More than 40% of the set had regions with significant similarity to known proteins and expressed sequences. The data demonstrate that regions flanking NotI sites are less likely to form nucleosomes efficiently and resemble promoter regions. The draft human genome sequence contained 55.7% of the NotI flanking sequences, Celera's database contained matches to 57.2% of the clones and all public databases (including non-human and previously sequenced NotI flanks) matched 89.2% of the NotI flanking sequences (identity ≥90% over at least 50 bp, data from December 2001). The data suggest that the shotgun sequencing approach used to generate the draft human genome sequence resulted in a bias against cloning and sequencing of NotI flanks. A rough estimation (based primarily on chromosomes 21 and 22) is that the human genome contains 15 000-20 000 NotI sites, of which 6000-9000 are unmethylated in any particular cell. The results of the study suggest that the existing tools for computational determination of CpG islands fail to identify a significant fraction of functional CpG islands, and unmethylated DNA stretches with a high frequency of CpG dinucleotides can be found even in regions with low CG content.
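As a rough illustration of the kind of computation this abstract refers to, the sketch below locates NotI recognition sites (GCGGCCGC), extracts their flanks, and computes the widely used CpG observed/expected ratio. It is not the pipeline used in the study; the function names and the flank width are hypothetical.

```python
NOTI_SITE = "GCGGCCGC"  # NotI recognition sequence


def find_noti_sites(seq):
    """Return 0-based start positions of every NotI site, including overlapping ones."""
    seq = seq.upper()
    positions = []
    start = seq.find(NOTI_SITE)
    while start != -1:
        positions.append(start)
        start = seq.find(NOTI_SITE, start + 1)
    return positions


def flanks(seq, pos, width):
    """Return the sequences up to `width` bp left and right of a site at `pos`."""
    site_end = pos + len(NOTI_SITE)
    return seq[max(0, pos - width):pos], seq[site_end:site_end + width]


def cpg_obs_exp(seq):
    """CpG observed/expected ratio: (CpG count * length) / (C count * G count)."""
    seq = seq.upper()
    c_count, g_count = seq.count("C"), seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c_count * g_count)
```

A ratio computed over a short window can be high even when the overall GC content of the surrounding region is low, which is the situation the abstract says existing CpG island tools tend to miss.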