
    Comparison of long-range PE sequencing methods.

    <p>(A–D) Long-range PE sequencing with linker oligonucleotides. In these methods, biotin-labeled linker oligonucleotides are added to the two ends of long-range DNA fragments, followed by enzyme-induced intra-molecule circularization and recovery of the paired ends for sequencing. The addition of linker oligonucleotides and the subsequent complex enzyme reactions require 5–8 recoveries before the paired ends can be captured from the circularized DNA fragments. In addition, the use of expensive enzymes adds cost. (E) Long-range PE sequencing by direct intra-molecule ligation, or molecular linker-free circularization. In this method, the 3′ ends of long-range DNA fragments are biotin-labeled, followed by direct intra-molecule circularization and recovery of the paired ends. This method requires fewer recovery steps (3–4) and no complex enzyme reaction system. The steps for DNA recovery are in bold. We applied method E in this study.</p>

    Insert-size distributions of long-range PE sequencing libraries.

    <p>(A), 2- to 35-kb libraries; (B), 10 kb-WGA and 10 kb-dam libraries. Only read pairs that mapped uniquely to the human genome (NCBI build 37) were used for this analysis. The insert size of each library and its corresponding small-insert read contamination are shown in the ‘−’ and ‘+’ directions of the x-axis, respectively. The ‘−’ direction represents the orientation relationship between PEs from circularized long-range DNA molecules (>1 kb) when mapped to the human genome, while the ‘+’ direction represents that between the two ends of linear small DNA fragments (∼500 bp).</p>

    Two long insertions in YH genome detected by long-range PE.

    <p>Mapping the long-range PE reads back to the human genome (NCBI build 37) revealed a previously identified ∼8 kb insertion in chromosome 7 (A) and a novel ∼7 kb insertion in chromosome 14 (B) in the YH genome. The abnormally mapped PE reads that support the insertions by showing unexpectedly short insert sizes are shown.</p>

    Paired-End Sequencing of Long-Range DNA Fragments for <em>De Novo</em> Assembly of Large, Complex Mammalian Genomes by Direct Intra-Molecule Ligation

    <div><h3>Background</h3><p>The relatively short read lengths of next generation sequencing (NGS) technologies still pose a challenge for <em>de novo</em> assembly of complex mammalian genomes. One important solution is to use paired-end (PE) sequence information obtained experimentally from long-range DNA fragments (>1 kb). Here, we characterize and extend a long-range PE library construction method for NGS based on direct intra-molecule ligation (or molecular linker-free circularization).</p> <h3>Results</h3><p>We found that the method performs stably for PE sequencing of 2- to 5-kb DNA fragments and can be extended to 10–20 kb (and, in extreme cases, up to ∼35 kb). We also characterized the impact of low-quality input DNA on the method and developed a whole-genome amplification (WGA) based protocol for limited input DNA (<1 µg). Using this PE dataset, we accurately assembled the YanHuang (YH) genome, the first sequenced Asian genome, into a scaffold N50 size of >2 Mb, which is over 100 times greater than the initial size produced with only small-insert PE reads (17 kb). In addition, we mapped two 7- to 8-kb insertions in the YH genome using the larger insert sizes of the long-range PE data.</p> <h3>Conclusions</h3><p>We demonstrate the effectiveness of this long-range PE sequencing method and its use for the <em>de novo</em> assembly of a large, complex genome from NGS short reads.</p> </div>

    Summary of <i>de novo</i> YH genome assembly.

    <p>Data from the YH project were used for the contig and initial scaffold assembly. The long-range PE data were then added step by step for scaffold construction. Genome coverage and gene coverage were calculated using NCBI build 37 and the RefSeq gene set as references, respectively. The X and Y chromosomes were excluded when calculating genome coverage and gene coverage. Intra-scaffold gaps were included when calculating scaffold N50, N90 and total length.</p>
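
    The scaffold N50 and N90 statistics mentioned in this caption can be illustrated with a minimal sketch. It assumes the usual definition of Nx (the length of the scaffold at which the running total, taken from longest to shortest, first reaches x% of the total assembly length) and counts gap characters (`N`) toward scaffold length, as the caption describes. The function names `scaffold_length` and `nx` are hypothetical and do not come from the YH assembly pipeline.

```python
def scaffold_length(sequence):
    """Length of a scaffold with intra-scaffold gaps (runs of 'N') included."""
    return len(sequence)


def nx(lengths, percent):
    """Return the Nx statistic (e.g. percent=50 for N50) for a list of lengths.

    Scaffolds are sorted from longest to shortest; the result is the length of
    the scaffold at which the cumulative length first reaches `percent` of the
    total. Integer arithmetic avoids floating-point rounding at the threshold.
    """
    if not lengths:
        raise ValueError("at least one scaffold length is required")
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 100 >= total * percent:
            return length


# Toy example with made-up scaffold lengths (not data from the paper).
toy_lengths = [100, 80, 60, 40, 20]
n50 = nx(toy_lengths, 50)
n90 = nx(toy_lengths, 90)
```

    Because the sort is descending, N90 can never exceed N50, which is why the two statistics together describe how evenly the assembly length is spread across scaffolds.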

    <i>De novo</i> assembly of the YH genome.

    <p>(A), The YH scaffold N50 (green bar) and N90 (blue bar) sizes improved dramatically with the addition of long-range PE information (from 2 kb to 35 kb). The trends of improvement are shown as dashed lines. (B), Alignment between the assembled YH scaffolds (y-axis) and the reference human genome (NCBI build 37, x-axis) on chr8. The local repeat level in the reference chr8 (calculated in a 1-kb window) is shown in color along the chromosome in the bar at the top. The white blocks in the bar represent gaps in the reference genome. (C), Alignment of YH scaffold 320 onto the reference chr8. The local repeat level in this region of the reference chr8 is also shown in color along the sequence (calculated in a 1-kb window).</p>

    Non-Invasive Prenatal Diagnosis of Lethal Skeletal Dysplasia by Targeted Capture Sequencing of Maternal Plasma

    <div><p>Background</p><p>Since the discovery of cell-free foetal DNA in the plasma of pregnant women, many non-invasive prenatal testing assays have been developed. In the area of skeletal dysplasia diagnosis, some PCR-based non-invasive prenatal testing assays have been developed to facilitate the ultrasound diagnosis of skeletal dysplasias that are caused by de novo mutations. However, skeletal dysplasias are a group of heterogeneous genetic diseases, PCR-based methods struggle to detect multiple genes or loci simultaneously, and the diagnosis rate is highly dependent on the accuracy of the ultrasound diagnosis. In this study, we investigated the feasibility of using targeted capture sequencing to detect foetal de novo pathogenic mutations responsible for skeletal dysplasia.</p><p>Methodology/Principal Findings</p><p>Three families whose foetuses were affected by skeletal dysplasia and two control families whose foetuses were affected by other single-gene diseases were included in this study. Sixteen genes related to some common lethal skeletal dysplasias were selected for analysis, and probes were designed to capture the coding regions of these genes. Targeted capture sequencing was performed on the maternal plasma DNA, the maternal genomic DNA, and the paternal genomic DNA. The de novo pathogenic variants in the plasma DNA data were identified using a bioinformatic process developed for low-frequency mutation detection and a strict variant interpretation strategy. The causal variants could be specifically identified in the plasma, and the results were identical to those obtained by sequencing amniotic fluid samples. Furthermore, a mean of 97% of foetal-specific alleles, which are alleles that are not shared by maternal genomic DNA and amniotic fluid DNA, were identified successfully in plasma samples.</p><p>Conclusions/Significance</p><p>Our study shows that capture sequencing of maternal plasma DNA can be used for non-invasive detection of de novo pathogenic variants. This method has the potential to facilitate the prenatal diagnosis of skeletal dysplasia.</p></div>