68 research outputs found

    The number of promoter-enhancer linkages connecting the endpoints of domains in different resolutions.

    No full text
    <p>As the resolution increases, the growing number of boundaries captures more potential interactions. The blue curve shows the corresponding increase for an ensemble of randomly reshuffled TADs. More promoter-enhancer linkages connect the endpoints of real domains than connect the endpoints of their random counterparts.</p>
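The comparison with reshuffled TADs described in this caption can be sketched roughly as follows. This is only an illustration under stated assumptions: it assumes that "reshuffling" means permuting the order of domain sizes along the chromosome, and that a linkage "connects endpoints" when both of its anchors fall on boundary bins. Neither definition is confirmed by the caption, and the function names and toy data are hypothetical.

```python
import random

def reshuffle_boundaries(boundaries, rng):
    """Permute domain sizes (assumed null model) and return new boundary positions."""
    sizes = [b - a for a, b in zip(boundaries, boundaries[1:])]
    rng.shuffle(sizes)
    shuffled = [boundaries[0]]
    for size in sizes:
        shuffled.append(shuffled[-1] + size)
    return shuffled

def count_endpoint_links(links, boundaries):
    """Count linkages whose two anchors both sit on domain boundaries (assumed definition)."""
    edge_bins = set(boundaries)
    return sum(1 for a, b in links if a in edge_bins and b in edge_bins)

# Toy data: boundary bins of four domains and three promoter-enhancer linkages.
real_boundaries = [0, 2, 5, 9, 10]
links = [(2, 5), (5, 9), (3, 7)]

rng = random.Random(0)
real_count = count_endpoint_links(links, real_boundaries)
random_counts = [
    count_endpoint_links(links, reshuffle_boundaries(real_boundaries, rng))
    for _ in range(100)
]
mean_random = sum(random_counts) / len(random_counts)
```

Comparing `real_count` with `mean_random` mirrors the real-versus-random comparison in the figure. Repeating it for boundary sets called at several resolutions would give the two curves.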

    Partition of oxygen in different membrane proteomes.

    No full text
    <p>Boxplot of the ratio of the number of low-oxygen transmembrane proteins to the number of high-oxygen transmembrane proteins for 373 complete genomes (30 eukaryotes, 34 archaea and 309 eubacteria). We removed four prokaryotic outliers that had a ratio greater than 3. Key: Euks–Eukaryotes, Arch–Archaebacteria, Bact–Eubacteria, and Proks–Prokaryotes.</p>

    MrTADFinder: A network modularity based approach to identify topologically associating domains in multiple resolutions

    No full text
    <div><p>Genome-wide proximity ligation based assays such as Hi-C have revealed that eukaryotic genomes are organized into structural units called topologically associating domains (TADs). From a visual examination of the chromosomal contact map, however, it is clear that the organization of the domains is not simple or obvious. Instead, TADs exhibit various length scales and, in many cases, a nested arrangement. Here, by exploiting the resemblance between TADs in a chromosomal contact map and densely connected modules in a network, we formulate TAD identification as a network optimization problem and propose an algorithm, MrTADFinder, to identify TADs from intra-chromosomal contact maps. MrTADFinder is based on the network-science concept of modularity. A key component of it is deriving an appropriate background model for contacts in a random chain, by numerically solving a set of matrix equations. The background model preserves the observed coverage of each genomic bin as well as the distance dependence of the contact frequency for any pair of bins exhibited by the empirical map. Also, by introducing a tunable resolution parameter, MrTADFinder provides a self-consistent approach for identifying TADs at different length scales, hence the acronym "Mr" standing for Multiple Resolutions. We then apply MrTADFinder to various Hi-C datasets. The identified domain boundaries are marked by characteristic signatures in chromatin marks and transcription factors (TF) that are consistent with earlier work. Moreover, by calling TADs at different length scales, we observe that boundary signatures change with resolution, with different chromatin features having different characteristic length scales. Furthermore, we report an enrichment of HOT (high-occupancy target) regions near TAD boundaries and investigate the role of different TFs in determining boundaries at various resolutions. 
To further explore the interplay between TADs and epigenetic marks, as tumor mutational burden is known to be coupled to chromatin structure, we examine how somatic mutations are distributed across boundaries and find a clear stepwise pattern. Overall, MrTADFinder provides a novel computational framework to explore the multi-scale structures in Hi-C contact maps.</p></div>
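As a rough illustration of the modularity idea in the abstract above, the sketch below scores a partition of a contact map into contiguous domains using a resolution parameter `gamma`. It assumes the standard Newman-Girvan null model, in which the expected contact between two bins is proportional to the product of their coverages. It does not reproduce the distance-aware background model that MrTADFinder derives by solving matrix equations. The function name and toy matrix are hypothetical.

```python
def modularity(contacts, boundaries, gamma=1.0):
    """Generalized modularity of a partition into contiguous domains.

    contacts   : symmetric square matrix (list of lists) of contact counts
    boundaries : sorted start index of each domain, e.g. [0, 3]
    gamma      : resolution parameter; larger values favor smaller domains
    """
    n = len(contacts)
    coverage = [sum(row) for row in contacts]
    total = sum(coverage)  # sum of all matrix entries
    # Map every bin to the index of the domain that contains it.
    edges = list(boundaries) + [n]
    label = [0] * n
    for d in range(len(boundaries)):
        for i in range(edges[d], edges[d + 1]):
            label[i] = d
    q = 0.0
    for i in range(n):
        for j in range(n):
            if label[i] == label[j]:
                # Assumed null model: coverage-preserving, no distance dependence.
                expected = coverage[i] * coverage[j] / total
                q += contacts[i][j] - gamma * expected
    return q / total

# Toy contact map with two dense blocks of three bins each.
toy = [
    [0, 5, 5, 1, 0, 0],
    [5, 0, 5, 0, 1, 0],
    [5, 5, 0, 0, 0, 1],
    [1, 0, 0, 0, 5, 5],
    [0, 1, 0, 5, 0, 5],
    [0, 0, 1, 5, 5, 0],
]
two_domains = modularity(toy, [0, 3], gamma=1.0)
one_domain = modularity(toy, [0], gamma=1.0)
```

On this toy map the two-block partition scores higher than a single domain at `gamma = 1.0`, while at a large value such as `gamma = 3.0` a partition into single bins scores higher than the two blocks. This is the sense in which the resolution parameter selects the length scale.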

    Boundary signatures of histone modifications in different resolutions.

    No full text
    <p>A) Histone modifications near the TAD boundary regions obtained at various resolutions. The peak density is obtained by counting the number of peaks in every 40 kb bin and normalizing by a null model in which peaks are randomly distributed. B) Different histone marks show different levels of enrichment near TAD boundaries at different resolutions. Despite a general decreasing trend, the signal of certain marks like H3K27me3 remains flat until a very high resolution.</p>
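The normalization described in panel A can be sketched as below: peaks are counted in fixed-size bins around each boundary and divided by the count expected if the same peaks were spread uniformly along the chromosome. This is a minimal sketch, not the authors' code. The uniform null model, the function name, the `flank_bins` parameter and the toy coordinates are assumptions, and bins that run past the chromosome ends are not treated specially.

```python
def boundary_enrichment(peaks, boundaries, chrom_length, bin_size=40_000, flank_bins=2):
    """Peak count per bin around boundaries, divided by the uniform expectation."""
    # Expected peaks per bin if peaks were scattered uniformly (assumed null model).
    expected = len(peaks) * bin_size / chrom_length
    offsets = range(-flank_bins, flank_bins + 1)
    counts = {k: 0 for k in offsets}
    for boundary in boundaries:
        for peak in peaks:
            # Bin offset relative to the boundary; offset 0 covers
            # [boundary, boundary + bin_size), offset -1 the bin just upstream.
            k = (peak - boundary) // bin_size
            if k in counts:
                counts[k] += 1
    return {k: counts[k] / (len(boundaries) * expected) for k in offsets}

# Toy example: four peaks on a 1 Mb chromosome, one boundary at 100 kb.
enrichment = boundary_enrichment(
    peaks=[100_000, 110_000, 500_000, 900_000],
    boundaries=[100_000],
    chrom_length=1_000_000,
)
```

Values above 1 indicate more peaks in that bin than the uniform null model predicts. Here the bin starting at the boundary holds two of the four peaks.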

    Identification of TADs in multiple resolutions.

    No full text
    <p>A) Part of the contact map of chromosome 10 in hES cells. The greenish triangles below represent TADs called by MrTADFinder at three different resolutions. The called TADs agree well visually with the contact map. The blue and red triangles represent TADs called in human ES cells and human IMR90 cells, respectively, as reported in [<a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1005647#pcbi.1005647.ref008" target="_blank">8</a>]. B) The size of TADs called at different resolutions. The median TAD size decreases from 3 Mbp to 300 kbp as the resolution increases from 0.75 to 3.5. C) The number of TADs increases as the resolution increases. When <i>γ</i> = 2.25, there are about 2600 TADs in hES cells with a median size of roughly 1 Mb. The median size goes down to 300 kb when the resolution increases to 3.5. The number of TADs identified in [<a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1005647#pcbi.1005647.ref008" target="_blank">8</a>] is marked by the arrow. Comparing TADs called by MrTADFinder with TADs called in [<a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1005647#pcbi.1005647.ref008" target="_blank">8</a>], the two algorithms agree most at a particular resolution (<i>γ</i> ≈ 2.875).</p>

    Mutational burdens across TAD boundaries.

    No full text
    <p>The 3 clusters of boundary regions exhibit distinct patterns in terms of mutational burden. For blue and red clusters, the area marks the first and the third quartiles. For the green cluster, only the mean values at different positions are shown for clarity. The inset shows the average Repli-seq signal for the red and blue clusters.</p>

    Transcription factors binding in different resolutions.

    No full text
    <p>A) Enrichment of HOT (high-occupancy target) and XOT (extreme-occupancy target) regions near TAD boundaries in hES cells. Boundaries are identified by MrTADFinder at a resolution <i>γ</i> = 2.75. The y-axis is normalized by a null model in which peaks are randomly distributed along the chromosome. B) A logistic regression model to classify real TAD boundaries and random boundaries based on the binding pattern of 60 TFs. The most influential factors responsible for TAD boundary formation at different resolutions are listed. Factors with a positive coefficient have a direct effect on border establishment or maintenance, whereas factors like MYC have a negative effect. The factors are sorted by their corresponding P-values and only the significant factors are displayed.</p>

    Enrichment of CTCF peaks near TAD boundaries at two different resolutions.

    No full text
    <p>The blue line shows the same analysis using TADs reported in [<a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1005647#pcbi.1005647.ref008" target="_blank">8</a>].</p>

    MrTADFinder: A network modularity based approach to identify topologically associating domains in multiple resolutions - Fig 4

    No full text
    <p>A) Distribution of house-keeping genes and tissue-specific genes near TAD boundaries at different resolutions. House-keeping genes are more enriched near TAD boundaries as compared to tissue-specific genes. B) House-keeping genes and tissue-specific genes show different levels of enrichment near TAD boundaries at different resolutions. Tissue-specific genes show a general decreasing trend, whereas the number of house-keeping genes remains flat until a high resolution.</p>

    Classification of canonical temporal expression trajectories for iPDP eigenvalue types.

    No full text
    <p>The internal principal dynamic patterns (iPDPs) represent canonical temporal expression trajectories, which can be, for example, increasing or a damped oscillation, depending on the iPDP’s eigenvalues (bottom row).</p>