
    Comparison of long-range PE sequencing methods.

    <p>(A–D) Long-range PE sequencing with linker oligonucleotides. In these methods, biotin-labeled linker oligonucleotides are added to both ends of long-range DNA fragments, followed by enzyme-induced intra-molecule circularization and recovery of the paired ends for sequencing. The addition of linker oligonucleotides and the subsequent complex enzyme reactions require 5–8 DNA recovery steps before the paired ends can be captured from the circularized DNA fragments. The expensive enzymes also add cost. (E) Long-range PE sequencing by direct intra-molecule ligation, or molecular linker-free circularization. In this method, the 3′ ends of long-range DNA fragments are biotin-labeled, followed by direct intra-molecule circularization and recovery of the paired ends. This method requires fewer recovery steps (3–4) and no complex enzyme reaction system. The steps for DNA recovery are in bold. Method E was used in this research.</p>

    Two long insertions in YH genome detected by long-range PE.

    <p>Mapping the long-range PE reads back to the human genome (NCBI build 37) detected a previously identified ∼8 kb insertion on chromosome 7 (A) and a novel ∼7 kb insertion on chromosome 14 (B) in the YH genome. The abnormally mapped PE reads that support the insertions by showing unexpectedly short insert sizes are shown.</p>
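The idea of insertion support from unexpectedly short insert sizes can be illustrated with a minimal sketch. It assumes each read pair has already been reduced to a mapped span on the reference, and that the library's expected insert size and standard deviation are known. The function names, the cutoff of `k` standard deviations, and the example numbers are all hypothetical; the caption does not give the criteria actually used.

```python
from statistics import median

def short_span_pairs(spans, expected, sd, k=3.0):
    """Return the mapped spans that fall more than k standard deviations
    below the expected insert size (hypothetical cutoff rule)."""
    cutoff = expected - k * sd
    return [s for s in spans if s < cutoff]

def estimate_insertion_size(short_spans, expected):
    """Sequence present in the sample but absent from the reference makes
    pairs map closer together, so the shortfall approximates its length."""
    return expected - median(short_spans)

# Made-up spans (bp) for a library with a 10 kb expected insert size.
spans = [9800, 10100, 2100, 1900, 10300, 2000]
supporting = short_span_pairs(spans, expected=10000, sd=500)
insertion_estimate = estimate_insertion_size(supporting, expected=10000)
```

With these made-up values, three pairs fall below the cutoff and the estimated insertion length is the difference between the expected insert size and their median span.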

    Insert-size distributions of long-range PE sequencing libraries.

    <p>(A), 2- to 35-kb libraries; (B), 10 kb-WGA and 10 kb-dam libraries. The read pairs that were uniquely mapped to the human genome (NCBI build 37) were used for this analysis. The insert size of a library and its corresponding small-insert read contamination are shown in the ‘−’ and ‘+’ directions of the x-axis, respectively. The ‘−’ direction represents the orientation relationship between PEs from circularized long-range DNA molecules (>1 kb) when mapped to the human genome, while ‘+’ represents that between the two ends from linear small DNA fragments (∼500 bp).</p>
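The signed x-axis described above can be sketched as follows. This is an illustration only: it assumes that pairs from circularized fragments map facing outward and pairs from linear small fragments map facing inward, which is a common convention for mate-pair data but is not stated in the caption. The function name and its arguments are hypothetical.

```python
from typing import Optional

def signed_insert_size(pos1: int, is_reverse1: bool, len1: int,
                       pos2: int, is_reverse2: bool, len2: int) -> Optional[int]:
    """Return the mapped span of a read pair, signed by orientation.

    pos1/pos2 are leftmost mapped coordinates on the same chromosome.
    Positive: inward-facing pair (assumed linear small fragment).
    Negative: outward-facing pair (assumed circularized long fragment).
    None: both reads on the same strand, so neither class applies.
    """
    if is_reverse1 == is_reverse2:
        return None
    # Identify which read lies leftmost on the reference.
    if pos1 <= pos2:
        left_pos, left_reverse = pos1, is_reverse1
        right_end = pos2 + len2
    else:
        left_pos, left_reverse = pos2, is_reverse2
        right_end = pos1 + len1
    span = right_end - left_pos
    # Left read on the forward strand means the reads point at each other.
    return span if not left_reverse else -span

inward = signed_insert_size(1000, False, 100, 1400, True, 100)
outward = signed_insert_size(1000, True, 100, 10900, False, 100)
```

A histogram of these signed values would place the long-range inserts on the negative side and the small-insert contamination on the positive side, matching the layout the caption describes.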

    Paired-End Sequencing of Long-Range DNA Fragments for <em>De Novo</em> Assembly of Large, Complex Mammalian Genomes by Direct Intra-Molecule Ligation

    <div><h3>Background</h3><p>The relatively short read lengths from next-generation sequencing (NGS) technologies still pose a challenge for <em>de novo</em> assembly of complex mammalian genomes. One important solution is to use paired-end (PE) sequence information experimentally obtained from long-range DNA fragments (>1 kb). Here, we characterize and extend a long-range PE library construction method based on direct intra-molecule ligation (or molecular linker-free circularization) for NGS.</p> <h3>Results</h3><p>We found that the method performs stably for PE sequencing of 2- to 5-kb DNA fragments, and can be extended to 10–20 kb (and, in extreme cases, up to ∼35 kb). We also characterized the impact of low-quality input DNA on the method, and developed a whole-genome amplification (WGA) based protocol using limited input DNA (<1 µg). Using this PE dataset, we accurately assembled the YanHuang (YH) genome, the first sequenced Asian genome, into a scaffold N50 size of >2 Mb, which is over 100 times greater than the initial size produced with only small-insert PE reads (17 kb). In addition, we mapped two 7- to 8-kb insertions in the YH genome using the larger insert sizes of the long-range PE data.</p> <h3>Conclusions</h3><p>In conclusion, we demonstrate here the effectiveness of this long-range PE sequencing method and its use for the <em>de novo</em> assembly of a large, complex genome using NGS short reads.</p> </div>

    Summary of <i>de novo</i> YH genome assembly.

    <p>The data from the YH project were used for the contig and initial scaffold assembly. Then, the long-range PE data were added step by step for scaffold construction. Genome coverage and gene coverage were calculated using NCBI build 37 and the RefSeq gene set as references, respectively. The X and Y chromosomes were excluded when calculating genome coverage and gene coverage. For the calculation of scaffold N50, N90 and total length, the intra-scaffold gaps were included.</p>
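For readers unfamiliar with the N50 and N90 statistics, the following minimal sketch shows the standard definition. It assumes scaffold lengths are given in base pairs with intra-scaffold gap bases already counted, as the caption specifies. The function name and the example lengths are hypothetical and are not data from the paper.

```python
def nxx(lengths, percent):
    """Return the Nxx value: the length of the scaffold at which the
    running total, taken from longest to shortest, first reaches
    `percent` percent of the total assembly length."""
    if not lengths:
        raise ValueError("no scaffold lengths given")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Integer arithmetic avoids floating-point rounding at the boundary.
        if running * 100 >= percent * total:
            return length
    return min(lengths)  # only reached if percent exceeds 100

# Made-up scaffold lengths in bp, gaps included.
scaffold_lengths = [100, 80, 60, 40, 20]
n50 = nxx(scaffold_lengths, 50)
n90 = nxx(scaffold_lengths, 90)
```

Because gap bases count toward each scaffold's length, joining contigs across longer gaps raises these statistics even when no new sequence is added, which is worth keeping in mind when comparing N50 values across assemblies.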

    <i>De novo</i> assembly of the YH genome.

    <p>(A), The YH scaffold N50 (green bar) and N90 (blue bar) sizes improved dramatically with the addition of long-range PE information (from 2 kb to 35 kb). The trends of improvement are shown as dashed lines. (B), Alignment between the assembled YH scaffolds (y-axis) and the reference human genome (NCBI build 37, x-axis) on chr8. The local repeat level in the reference chr8 (calculated in a 1-kb window) is shown in color along the chromosome in the bar at the top. The white blocks in the bar represent the gaps in the reference genome. (C), Alignment of YH scaffold 320 to the reference chr8. The local repeat level in this region of the reference chr8 is also shown in color along the sequence (calculated in a 1-kb window).</p>