    Replication of alpha-satellite DNA arrays in endogenous human centromeric regions and in human artificial chromosome

    In human chromosomes, centromeric regions comprise megabase-size arrays of 171 bp alpha-satellite DNA monomers. The large distances spanned by these arrays preclude their replication from external sites and imply that the repetitive monomers contain replication origins. However, replication within these arrays has not previously been profiled, and the role of alpha-satellite DNA in initiation of DNA replication has not yet been demonstrated. Here, replication of alpha-satellite DNA in endogenous human centromeric regions and in a de novo formed human artificial chromosome (HAC) was analyzed. We showed that alpha-satellite monomers could function as origins of DNA replication and that alphoid arrays organized into centrochromatin replicated earlier than those organized into heterochromatin. The distribution of inter-origin distances within centromeric alphoid arrays was comparable to the distribution of inter-origin distances in randomly selected non-centromeric chromosomal regions. Depletion of CENP-B, a kinetochore protein that binds directly to a 17 bp CENP-B box motif common to alpha-satellite DNA, resulted in enrichment of alpha-satellite sequences for proteins of the ORC complex, suggesting that CENP-B may have a role in regulating the replication of centromeric regions. Mapping of replication initiation sites in the HAC revealed that replication preferentially initiated in transcriptionally active regions.

    Nanopore sequencing and assembly of a human genome with ultra-long reads

    We report the sequencing and assembly of a reference genome for the human GM12878 Utah/CEPH cell line using the MinION (Oxford Nanopore Technologies) nanopore sequencer. 91.2 Gb of sequence data, representing ~30× theoretical coverage, were produced. Reference-based alignment enabled detection of large structural variants and epigenetic modifications. De novo assembly of nanopore reads alone yielded a contiguous assembly (NG50 ~3 Mb). Next, we developed a protocol to generate ultra-long reads (N50 > 100 kb, up to 882 kb). Incorporating an additional 5× coverage of these data more than doubled the assembly contiguity (NG50 ~6.4 Mb). The final assembled genome was 2,867 million bases in size, covering 85.8% of the reference. Assembly accuracy, after incorporating complementary short-read sequencing data, exceeded 99.8%. Ultra-long reads enabled assembly and phasing of the 4 Mb major histocompatibility complex (MHC) locus in its entirety, measurement of telomere repeat length, and closure of gaps in the reference human genome assembly GRCh38.
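The abstract above reports contiguity as N50 and NG50 and gives a theoretical coverage figure. As a rough illustration of how such numbers are conventionally computed, here is a minimal Python sketch. It is not the authors' pipeline: the function names `n50` and `coverage` are hypothetical, the toy contig lengths are made up for the example, and the haploid genome size of 3.1 Gb is an assumption not stated in the abstract.

```python
def n50(lengths, genome_size=None):
    """Return the N50 of a list of sequence lengths.

    If genome_size is given, the threshold is half the genome size
    instead of half the total assembled length, which yields NG50.
    """
    total = genome_size if genome_size is not None else sum(lengths)
    target = total / 2
    running = 0
    # Walk from the longest sequence down until half the target is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= target:
            return length
    # Reached only when the sequences cover less than half of genome_size.
    return 0


def coverage(total_bases, genome_size):
    """Theoretical coverage: sequenced bases divided by genome size."""
    return total_bases / genome_size


# Toy contig lengths (made up, arbitrary units).
contigs = [10, 8, 6, 4, 2]
print(n50(contigs))                  # N50 relative to assembled length
print(n50(contigs, genome_size=40))  # NG50 relative to an assumed genome size

# 91.2 Gb of data over an ASSUMED 3.1 Gb genome.
print(round(coverage(91.2e9, 3.1e9), 1))
```

Under the assumed 3.1 Gb genome size, the coverage calculation comes out close to the ~30× quoted in the abstract. NG50 is never larger than N50 when the assembly is smaller than the genome size used as the reference.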

    Telomere-to-telomere assembly of a complete human X chromosome

    Centromeric Satellite DNAs: Hidden Sequence Variation in the Human Population

    The central goal of medical genomics is to understand the inherited basis of sequence variation that underlies human physiology, evolution, and disease. Functional association studies currently ignore millions of bases that span each centromeric region and acrocentric short arm. These regions are enriched in long arrays of tandem repeats, or satellite DNAs, that are known to vary extensively in copy number and repeat structure in the human population. Satellite sequence variation in the human genome is often so large that it is detected cytogenetically, yet due to the lack of a reference assembly and informatics tools to measure this variability, contemporary high-resolution disease association studies are unable to detect causal variants in these regions. Nevertheless, recently uncovered associations between satellite DNA variation and human disease support the view that these regions represent a substantial and biologically important fraction of human sequence variation. Therefore, there is a pressing and unmet need to detect and incorporate this uncharacterized sequence variation into broad studies of human evolution and medical genomics. Here I discuss the current knowledge of satellite DNA variation in the human genome, focusing on centromeric satellites and their potential implications for disease.