23 research outputs found

    Complementing Hi-C information for 3D chromatin reconstruction by ChromStruct

    A multiscale method proposed elsewhere for reconstructing plausible 3D configurations of chromatin in cell nuclei is recalled, based on integrating contact data from Hi-C experiments with additional information from ChIP-seq, RNA-seq and ChIA-PET experiments. Provided that the additional data come from independent experiments, this approach is expected to leverage them to complement possibly noisy, biased or missing Hi-C records. When the different data sources concur, the resulting solutions are corroborated; otherwise, their validity is weakened. Here a problem of reliability arises, entailing an appropriate choice of the relative weights assigned to the different informational contributions. A series of experiments is presented that helps to quantify the advantages and the limitations of this strategy. Whereas the gains in accuracy are not always significant, the case of missing Hi-C data demonstrates the effectiveness of additional information in reconstructing the highly packed segments of the structure.
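    The weighting problem described above can be illustrated with a minimal sketch (not the actual ChromStruct objective; the function names, the quadratic form of both terms, and the default weights are illustrative assumptions): a reconstruction score combines a Hi-C distance term with an auxiliary-data term, and a mask lets the auxiliary term alone constrain bead pairs whose Hi-C records are missing.

    ```python
    import numpy as np

    def pairwise_distances(coords):
        """Euclidean distance matrix for an array of 3D bead coordinates."""
        diff = coords[:, None, :] - coords[None, :, :]
        return np.sqrt((diff ** 2).sum(-1))

    def combined_objective(coords, target_hic, target_aux, mask_hic,
                           w_hic=1.0, w_aux=0.5):
        """Weighted sum of a Hi-C distance term and an auxiliary-data term.

        Missing Hi-C records (mask_hic == False) contribute nothing to the
        Hi-C term, so the auxiliary term alone constrains those bead pairs.
        The relative weights w_hic and w_aux encode the reliability
        assigned to each data source.
        """
        d = pairwise_distances(coords)
        hic_term = ((d - target_hic) ** 2)[mask_hic].sum()
        aux_term = ((d - target_aux) ** 2).sum()
        return w_hic * hic_term + w_aux * aux_term
    ```

    When the two data sources agree (identical target matrices), a structure matching them scores zero; conflicting targets leave a residual whose split is governed entirely by the weights, which is where the reliability question above enters.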

    Inferring Single-Cell 3D Chromosomal Structures Based On the Lennard-Jones Potential

    Reconstructing three-dimensional (3D) chromosomal structures based on single-cell Hi-C data is a challenging scientific problem due to the extreme sparseness of the single-cell Hi-C data. In this research, we used the Lennard-Jones potential to reconstruct both 500 kb and high-resolution 50 kb chromosomal structures based on single-cell Hi-C data. A chromosome was represented by a string of 500 kb or 50 kb DNA beads and put into a 3D cubic lattice for simulations. A 2D Gaussian function was used to impute the sparse single-cell Hi-C contact matrices. We designed a novel loss function based on the Lennard-Jones potential, in which the Δ value, i.e., the well depth, indicates how stable the binding of each pair of beads is. For bead pairs that have single-cell Hi-C contacts, and for their neighboring bead pairs, the loss function assigns stronger binding stability. The Metropolis-Hastings algorithm was used to try different locations for the DNA beads, and simulated annealing was used to optimize the loss function. We demonstrated the correctness and validity of the reconstructed 3D structures by evaluating the models according to multiple criteria and comparing the models with 3D-FISH data.
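    The ingredients described above (a pairwise Lennard-Jones loss whose well depth encodes binding stability, plus Metropolis-Hastings moves on a lattice) can be sketched minimally as follows. This is an illustrative reduction, not the authors' code: the well-depth matrix, move set, and acceptance rule are generic textbook choices.

    ```python
    import math
    import random

    def lj(r, epsilon, sigma=1.0):
        """Lennard-Jones potential; the well depth epsilon sets how stable
        the binding of a bead pair is (deeper well = stronger binding)."""
        sr6 = (sigma / r) ** 6
        return 4.0 * epsilon * (sr6 ** 2 - sr6)

    def total_loss(positions, eps):
        """Sum LJ terms over all bead pairs; eps[i][j] would be larger for
        pairs with (imputed) single-cell Hi-C contacts and their neighbors."""
        loss = 0.0
        for i in range(len(positions)):
            for j in range(i + 1, len(positions)):
                loss += lj(math.dist(positions[i], positions[j]), eps[i][j])
        return loss

    def mh_step(positions, eps, temperature, rng):
        """One Metropolis-Hastings move on a cubic lattice: perturb a random
        bead, accept with probability min(1, exp(-dLoss / T))."""
        before = total_loss(positions, eps)
        i = rng.randrange(len(positions))
        old = positions[i]
        positions[i] = tuple(c + rng.choice((-1, 0, 1)) for c in old)
        occupied = any(positions[i] == positions[j]
                       for j in range(len(positions)) if j != i)
        if not occupied:
            delta = total_loss(positions, eps) - before
            if delta <= 0 or rng.random() < math.exp(-delta / temperature):
                return True  # move accepted
        positions[i] = old  # reject: restore the old position
        return False
    ```

    Simulated annealing would wrap `mh_step` in a loop that gradually lowers `temperature`, so early iterations explore freely and late iterations settle into deep wells. Note that the LJ minimum sits at r = 2^(1/6)·sigma with value −epsilon, which is what makes the well depth a direct knob for binding stability.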

    The little skate genome and the evolutionary emergence of wing-like fins

    Skates are cartilaginous fish whose body plan features enlarged wing-like pectoral fins, enabling them to thrive in benthic environments [1,2]. However, the molecular underpinnings of this unique trait remain unclear. Here we investigate the origin of this phenotypic innovation by developing the little skate Leucoraja erinacea as a genomically enabled model. Analysis of a high-quality chromosome-scale genome sequence for the little skate shows that it preserves many ancestral jawed vertebrate features compared with other sequenced genomes, including numerous ancient microchromosomes. Combining genome comparisons with extensive regulatory datasets in developing fins—including gene expression, chromatin occupancy and three-dimensional conformation—we find skate-specific genomic rearrangements that alter the three-dimensional regulatory landscape of genes involved in the planar cell polarity pathway. Functional inhibition of planar cell polarity signalling resulted in a reduction in anterior fin size, confirming that this pathway is a major contributor to batoid fin morphology. We also identified a fin-specific enhancer that interacts with several hoxa genes, consistent with the redeployment of hox gene expression in anterior pectoral fins, and confirmed its potential to activate transcription in the anterior fin using zebrafish reporter assays. Our findings underscore the central role of genome reorganization and regulatory variation in the evolution of phenotypes, shedding light on the molecular origin of an enigmatic trait.

    Emergent Structure and Dynamics from Stochastic Pairwise Crosslinking in Chromosomal Polymer Models

    The spatio-temporal organization of the genome is critical to the ability of the cell to store huge amounts of information in highly compacted DNA while also performing vital cellular functions. Experimental methods provide a window into the geometry of the chromatin but cannot provide a full picture in space and time. Polymer models have been shown to reproduce properties of chromatin and can be used to make simulated observations, informing biological experimentation. We apply a previously studied model of the full yeast genome with dynamic protein crosslinking in the nucleolus, which showed the emergence of clustering when the crosslinking timescale was sufficiently fast. We investigate the crosslinking timescale at finer resolution and newly identify a "flexible clustering" regime for intermediate timescales, which maximizes mixing of nucleolar beads and is of significant interest due to the role mixing plays in nuclear processes. To robustly identify spatio-temporal clustering structure, we map our problem to a multi-layer network and then apply the multi-layer modularity community detection algorithm, showing the presence of spatio-temporal community structure in the fast and intermediate clustering regimes. We analyze the relationship between cluster size and the ensuing stability of clusters, revealing a heterogeneous collection of clusters in which cluster size correlates with stability. We view the stochastic switching as producing an effective thermal equilibrium by extending the WKB approach for deriving quasipotentials in switching systems to the case of an overdamped Langevin equation with a switching force term, and derive the associated Hamilton-Jacobi equation. We apply the string method for finding most-probable transition paths, revealing previously unreported numerical challenges; we present modifications to the algorithms to overcome them.
    We show that our methods can correctly compute asymptotic escape times by comparison to Monte Carlo simulations, and verify an important principle: the effective force is often significantly weaker than a naive average of the switching suggests. Through this multifaceted approach, we have shown how stochastic crosslinking leads to complex emergent structure, with different timescales optimizing different properties, and how the structure can be analyzed using both network-based tools and stochastic averaging principles.
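    The switching-force Langevin setting referenced above can be made concrete with a small simulation (a generic sketch, not the authors' model: the two-state stiffness, switching rate, and noise level are arbitrary assumptions). An overdamped particle feels a spring force whose stiffness jumps stochastically between discrete states, a telegraph-like switching process.

    ```python
    import math
    import random

    def langevin_switching(steps, dt, k_states, switch_rate, noise, rng):
        """Euler-Maruyama integration of overdamped Langevin dynamics
        x' = -k(t) * x + noise, where the spring stiffness k(t) switches
        stochastically between the discrete values in k_states."""
        x, state = 1.0, 0
        traj = []
        for _ in range(steps):
            # Markov switching of the force state (rate * dt per step)
            if rng.random() < switch_rate * dt:
                state = (state + 1) % len(k_states)
            drift = -k_states[state] * x
            x += drift * dt + noise * math.sqrt(dt) * rng.gauss(0.0, 1.0)
            traj.append(x)
        return traj
    ```

    Averaging escape statistics from many such trajectories against an equivalent fixed-stiffness model is one way to probe the principle stated above, that the effective force can be weaker than the naive average of the switching states suggests.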

    CTCF knockout in zebrafish induces alterations in regulatory landscapes and developmental gene expression

    Coordinated chromatin interactions between enhancers and promoters are critical for gene regulation. The architectural protein CTCF mediates chromatin looping and is enriched at the boundaries of topologically associating domains (TADs), which are sub-megabase chromatin structures. In vitro, CTCF depletion leads to a loss of TADs but has only limited effects on gene expression, challenging the concept that CTCF-mediated chromatin structures are a fundamental requirement for gene regulation. However, how CTCF and a perturbed chromatin structure impact gene expression during development remains poorly understood. Here we link the loss of CTCF to gene regulation during patterning and organogenesis in a ctcf knockout zebrafish model. The absence of CTCF leads to loss of chromatin structure and affects the expression of thousands of genes, including many developmental regulators. Our results demonstrate the essential role of CTCF in providing the structural context for enhancer-promoter interactions, thus regulating developmental genes.

    Statistical methods for high-throughput genomic data


    From nanometers to centimeters: Imaging across spatial scales with smart computer-aided microscopy

    Microscopes have been an invaluable tool throughout the history of the life sciences, as they allow researchers to observe the minuscule details of living systems in space and time. However, modern biology studies complex and non-obvious phenotypes and their distributions in populations and thus requires that microscopes evolve from visual aids for anecdotal observation into instruments for objective and quantitative measurements. To this end, many cutting-edge developments in microscopy are fuelled by innovations in the computational processing of the generated images. Computational tools can be applied in the early stages of an experiment, where they allow for reconstruction of images with higher resolution and contrast or more colors compared to raw data. In the final analysis stage, state-of-the-art image analysis pipelines seek to extract interpretable and humanly tractable information from the high-dimensional space of images. In the work presented in this thesis, I performed super-resolution microscopy and wrote image analysis pipelines to derive quantitative information about multiple biological processes. I contributed to studies on the regulation of DNMT1 by implementing machine learning-based segmentation of replication sites in images and performed quantitative statistical analysis of the recruitment of multiple DNMT1 mutants. To study the spatiotemporal distribution of the DNA damage response, I performed STED microscopy and could provide a lower bound on the size of the elementary spatial units of DNA repair. In this project, I also wrote image analysis pipelines and performed statistical analysis to show a decoupling of DNA density and heterochromatin marks during repair. More on the experimental side, I helped in the establishment of a protocol for many-fold color multiplexing by iterative labelling of diverse structures via DNA hybridization.
    Turning from small-scale details to the distribution of phenotypes in a population, I wrote a reusable pipeline for fitting models of cell cycle stage distribution and inhibition curves to high-throughput measurements to quickly quantify the effects of innovative antiproliferative antibody-drug conjugates. The main focus of the thesis is BigStitcher, a tool for the management and alignment of terabyte-sized image datasets. Such enormous datasets are nowadays generated routinely with light-sheet microscopy and sample preparation techniques such as clearing or expansion. Their sheer size, high dimensionality and unique optical properties pose a serious bottleneck for researchers and require specialized processing tools, as the images often do not fit into the main memory of most computers. BigStitcher primarily allows for fast registration of such many-dimensional datasets on conventional hardware using optimized multi-resolution alignment algorithms. The software can also correct a variety of aberrations such as fixed-pattern noise, chromatic shifts and even complex sample-induced distortions. A defining feature of BigStitcher, as well as of the various image analysis scripts developed in this work, is their interactivity. A central goal was to leverage the user's expertise at key moments and bring innovations from the big data world to the lab, with its smaller and much more diverse datasets, without replacing scientists with automated black-box pipelines. To this end, BigStitcher was implemented as a user-friendly plug-in for the open-source image processing platform Fiji and provides users with a nearly instantaneous preview of the aligned images and opportunities for manual control of all processing steps. With its powerful features and ease of use, BigStitcher paves the way to the routine application of light-sheet microscopy and other methods producing equally large datasets.
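    The coarse-to-fine idea behind multi-resolution alignment can be sketched in one dimension (an illustrative toy, not BigStitcher's actual algorithm; the function names and the sum-of-squared-differences criterion are assumptions): estimate the translation on heavily downsampled signals first, then refine at each finer level within a small window around twice the previous estimate. This is what makes registration tractable when the full-resolution data would not fit in memory.

    ```python
    import numpy as np

    def downsample(a, f):
        """Block-average a 1D signal by an integer factor f."""
        n = (len(a) // f) * f
        return a[:n].reshape(-1, f).mean(axis=1)

    def best_shift(a, b, center, radius):
        """Integer shift s in [center-radius, center+radius] minimizing the
        mean squared difference over the overlap of a shifted against b."""
        best, best_err = center, float("inf")
        for s in range(center - radius, center + radius + 1):
            if s >= 0:
                ov = min(len(a) - s, len(b))
                if ov <= 0:
                    continue
                err = float(((a[s:s + ov] - b[:ov]) ** 2).mean())
            else:
                ov = min(len(a), len(b) + s)
                if ov <= 0:
                    continue
                err = float(((a[:ov] - b[-s:-s + ov]) ** 2).mean())
            if err < best_err:
                best, best_err = s, err
        return best

    def multires_shift(a, b, levels=3, radius=2):
        """Coarse-to-fine translation estimate: exhaustive search at the
        coarsest level, then narrow refinement around 2x the previous guess."""
        shift = 0
        for level in range(levels, -1, -1):
            f = 2 ** level
            da, db = downsample(a, f), downsample(b, f)
            if level == levels:
                shift = best_shift(da, db, 0, len(da) // 2)
            else:
                shift = best_shift(da, db, 2 * shift, radius)
        return shift
    ```

    The exhaustive search happens only on the smallest arrays; every finer level touches just a handful of candidate shifts, which is the essential trade-off that multi-resolution schemes exploit.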

    Detection of 3D Genome Folding at Multiple Scales

    Understanding 3D genome structure is crucial to learning how chromatin folds and how genes are regulated through the spatial organization of regulatory elements. Various technologies have been developed to investigate genome architecture. These include ligation-based 3C methodologies such as Hi-C and Micro-C, ligation-based pull-down methods like Proximity Ligation-Assisted ChIP-seq (PLAC-seq) and Chromatin Interaction Analysis by Paired-End Tag sequencing (ChIA-PET), and ligation-free methods like Split-Pool Recognition of Interactions by Tag Extension (SPRITE) and Genome Architecture Mapping (GAM). Although these technologies have provided great insight into chromatin organization, a systematic evaluation of them has been lacking. Among these technologies, Hi-C has been one of the most widely used methods to map genome-wide chromatin interactions for over a decade. To understand how the choice of experimental parameters determines the ability to detect and quantify the features of chromosome folding, we first systematically evaluated two critical parameters in the Hi-C protocol: cross-linking and digestion of chromatin. We found that different protocols capture distinct 3D genome features with different efficiencies depending on the cell type (Chapter 2). The updated Hi-C protocol with new parameters, which we call Hi-C 3.0, was subsequently evaluated and found to provide the best loop detection of all previous Hi-C protocols, as well as better compartment quantification than Micro-C (Chapter 3). Finally, to understand how the aforementioned technologies (Hi-C, Micro-C, PLAC-seq, ChIA-PET, SPRITE, GAM) that measure 3D organization could provide a comprehensive understanding of genome structure, we performed a comparison of these technologies. We found that each of these methods captures different aspects of chromatin folding (Chapter 4).
    Collectively, these studies suggest that improving the 3D methodologies and integrative analyses of these methods will reveal unprecedented details of genome structure and function.