24 research outputs found

    Genome-Wide Mapping of Human DNA Replication by Optical Replication Mapping Supports a Stochastic Model of Eukaryotic Replication Timing [preprint]

    DNA replication is regulated by the location and timing of replication initiation. Therefore, much effort has been invested in identifying and analyzing the sites of human replication initiation. However, the heterogeneous nature of eukaryotic replication kinetics and the low efficiency of individual initiation site utilization in metazoans have made mapping the location and timing of replication initiation in human cells difficult. A potential solution to the problem of human replication mapping is single-molecule analysis. However, current approaches do not provide the throughput required for genome-wide experiments. To address this challenge, we have developed Optical Replication Mapping (ORM), a high-throughput single-molecule approach to map newly replicated DNA, and used it to map early initiation events in human cells. The single-molecule nature of our data, and a total of more than 2000-fold coverage of the human genome on 27 million fibers averaging ~300 kb in length, allow us to identify initiation sites and their firing probability with high confidence. In particular, for the first time, we are able to measure the absolute efficiency of human replication initiation genome-wide. We find that the distribution of human replication initiation is consistent with inefficient, stochastic initiation of heterogeneously distributed potential initiation complexes enriched in accessible chromatin. In particular, we find that sites of human replication initiation are not confined to well-defined replication origins but are instead distributed across broad initiation zones consisting of many initiation sites. Furthermore, we find no correlation of initiation events between neighboring initiation zones. Although most early initiation events occur in early-replicating regions of the genome, a significant number occur in late-replicating regions.
    The fact that initiation sites in typically late-replicating regions have some probability of firing in early S phase suggests that the major difference between initiation events in early- and late-replicating regions is their intrinsic probability of firing, as opposed to a qualitative difference in their firing-time distributions. Moreover, modeling of replication kinetics demonstrates that measuring the efficiency of initiation-zone firing in early S phase suffices to predict the average firing time of such initiation zones throughout S phase, further suggesting that the differences between the firing times of early and late initiation zones are quantitative, rather than qualitative. These observations are consistent with stochastic models of initiation-timing regulation and suggest that stochastic regulation of replication kinetics is a fundamental feature of eukaryotic replication, conserved from yeast to humans.

    Neuron-specific signatures in the chromosomal connectome associated with schizophrenia risk

    To explore the developmental reorganization of the three-dimensional genome of the brain in the context of neuropsychiatric disease, we monitored chromosomal conformations in differentiating neural progenitor cells. Neuronal and glial differentiation was associated with widespread developmental remodeling of the chromosomal contact map and included interactions anchored in common variant sequences that confer heritable risk for schizophrenia. We describe cell type-specific chromosomal connectomes composed of schizophrenia risk variants and their distal targets, which altogether show enrichment for genes that regulate neuronal connectivity and chromatin remodeling, and evidence for coordinated transcriptional regulation and proteomic interaction of the participating genes. Developmentally regulated chromosomal conformation changes at schizophrenia-relevant sequences disproportionately occurred in neurons, highlighting the existence of cell type-specific disease risk vulnerabilities in spatial genome organization.

    Identification of Copy Number Aberrations in Breast Cancer Subtypes Using Persistence Topology

    DNA copy number aberrations (CNAs) are of biological and medical interest because they help identify regulatory mechanisms underlying tumor initiation and evolution. Identification of tumor-driving CNAs (driver CNAs), however, remains a challenging task, because they are frequently hidden by CNAs that are the product of random events that take place during tumor evolution. Experimental detection of CNAs is commonly accomplished through array comparative genomic hybridization (aCGH) assays followed by supervised and/or unsupervised statistical methods that combine the segmented profiles of all patients to identify driver CNAs. Here, we extend a previously presented supervised algorithm for the identification of CNAs that is based on a topological representation of the data. Our method associates a two-dimensional (2D) point cloud with each aCGH profile and generates a sequence of simplicial complexes, mathematical objects that generalize the concept of a graph. This representation of the data permits segmenting the data at different resolutions and identifying CNAs by interrogating the topological properties of these simplicial complexes. We tested our approach on a published dataset with the goal of identifying specific breast cancer CNAs associated with specific molecular subtypes. Identification of CNAs associated with each subtype was performed by analyzing each subtype separately from the others and by taking the rest of the subtypes as the control. We found a new amplification in 11q at the location of the progesterone receptor in the Luminal A subtype. Aberrations in the Luminal B subtype were found only upon removal of the basal-like subtype from the control set. Under those conditions, all regions found in the original publication, except for 17q, were confirmed; all aberrations, except those in chromosome arms 8q and 12q, were confirmed in the basal-like subtype.
    These two chromosome arms, however, were detected only upon removal of three patients with exceedingly large copy number values. More importantly, we detected 10 and 21 additional regions in the Luminal B and basal-like subtypes, respectively. Most of the additional regions were validated on an independent dataset, using GISTIC, or both. Furthermore, we found three new CNAs in the basal-like subtype: a combination of gains and losses in 1p, a gain in 2p and a loss in 14q. Based on these results, we suggest that topological approaches that incorporate multiresolution analyses and that interrogate topological properties of the data can help in the identification of copy number changes in cancer.
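    The idea of turning a profile into a 2D point cloud and reading off topological properties at several resolutions can be illustrated with a minimal sketch. It is not the authors' algorithm: the embedding of consecutive values as points (y_i, y_{i+1}), the function names `point_cloud` and `betti0`, and the toy profile are all assumptions made for illustration. The sketch only counts connected components (the 0-dimensional topology) of the Vietoris-Rips complex at a chosen scale `eps`.

```python
from itertools import combinations
from math import dist


def point_cloud(profile):
    """Map a 1D profile to 2D points (y_i, y_{i+1}).

    Assumed embedding for illustration; the abstract does not specify one.
    """
    return [(profile[i], profile[i + 1]) for i in range(len(profile) - 1)]


def betti0(points, eps):
    """Count connected components of the Vietoris-Rips complex at scale eps.

    Two points are joined by an edge when their distance is at most eps;
    components are tracked with a union-find structure.
    """
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the root, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(points)), 2):
        if dist(points[i], points[j]) <= eps:
            parent[find(i)] = find(j)
    return len({find(i) for i in range(len(points))})


# Hypothetical log-ratio profile: a baseline, a short gain, then baseline again.
profile = [0.0, 0.1, 0.0, 1.0, 1.1, 1.0, 0.0, 0.1]
cloud = point_cloud(profile)

# Component counts at three resolutions, from fine to coarse.
counts = {eps: betti0(cloud, eps) for eps in (0.05, 0.2, 2.0)}
print(counts)
```

    In this toy case the baseline points and the gained points form separate clusters at the intermediate scale and merge into one component at the coarsest scale, which is the kind of multiresolution behavior the abstract refers to.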

    Prediction of homo- and hetero-protein complexes by protein docking and template-based modeling: a CASP-CAPRI experiment

    We present the results for CAPRI Round 30, the first joint CASP-CAPRI experiment, which brought together experts from the protein structure prediction and protein-protein docking communities. The Round comprised 25 targets from amongst those submitted for the CASP11 prediction experiment of 2014. The targets included mostly homodimers, a few homotetramers, and two heterodimers, and comprised protein chains that could readily be modeled using templates from the Protein Data Bank. On average, 24 CAPRI groups and 7 CASP groups submitted docking predictions for each target, and 12 CAPRI groups per target participated in the CAPRI scoring experiment. In total, more than 9500 models were assessed against the 3D structures of the corresponding target complexes. Results show that the prediction of homodimer assemblies by homology modeling techniques and docking calculations is quite successful for targets featuring large enough subunit interfaces to represent stable associations. Targets with ambiguous or inaccurate oligomeric state assignments, often featuring crystal contact-sized interfaces, represented a confounding factor. For those, a much poorer prediction performance was achieved, although the predictions nonetheless often provided helpful clues on the correct oligomeric state of the protein. The prediction performance was very poor for genuine tetrameric targets, where the inaccuracy of the homology-built subunit models and the smaller pair-wise interfaces severely limited the ability to derive the correct assembly mode. Our analysis also shows that docking procedures tend to perform better than standard homology modeling techniques and that highly accurate models of the protein components are not always required to identify their association modes with acceptable accuracy.

    High-throughput modeling and scoring of TCR-pMHC complexes to predict cross-reactive peptides

    MOTIVATION: The binding of T cell receptors (TCRs) to their target peptide MHC (pMHC) ligands initiates the cell-mediated immune response. In autoimmune diseases such as multiple sclerosis, the TCR erroneously recognizes self-peptides as foreign and activates an immune response against healthy cells. Such responses can be triggered by cross-recognition of foreign peptides by the autoreactive TCR. Hence, it would be desirable to identify such foreign-antigen triggers to provide a mechanistic understanding of autoimmune diseases. However, the large sequence space of foreign antigens presents an obstacle in the identification of cross-reactive peptides. RESULTS: Here, we present an in silico modeling and scoring method which exploits the structural properties of TCR-pMHC complexes to predict the binding of cross-reactive peptides. We analyzed three mouse TCRs and one human TCR isolated from a patient with multiple sclerosis. Cross-reactive peptides for these TCRs were previously identified via yeast display coupled with deep sequencing, providing a robust dataset for evaluating our method. Modeling query peptides in their associated TCR-pMHC crystal structures, our method accurately selected the top binding peptides from sets containing more than a hundred thousand unique peptides. AVAILABILITY AND IMPLEMENTATION: Analyses were performed using custom Python and R scripts available at https://github.com/tborrman/antigen-predict. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.