18 research outputs found

    Generation of Long Insert Pairs Using a Cre-LoxP Inverse PCR Approach

    Large insert mate pair reads have a major impact on the overall success of de novo assembly and the discovery of inherited and acquired structural variants. The positional information of mate pair reads generally improves genome assembly by resolving repeat elements and/or ordering contigs. Currently available methods for building such libraries have one or more limitations, such as relatively small insert size, inability to distinguish the junction of the two ends, and/or low throughput. We developed a new approach, Cre-LoxP Inverse PCR Paired-End (CLIP-PE), which exploits the advantages of (1) the Cre-LoxP recombination system to efficiently circularize large DNA fragments, (2) inverse PCR to enrich for the desired products that contain both ends of the large DNA fragments, and (3) restriction enzymes to introduce a recognizable junction site between ligated fragment ends and to improve the self-ligation efficiency. We have successfully created CLIP-PE libraries up to 22 kb that are rich in informative read pairs and low in small fragment background. These libraries have demonstrated the ability to improve genome assemblies. The CLIP-PE methodology can be implemented with existing and future next-generation sequencing platforms.

    Critical Assessment of Metagenome Interpretation: A benchmark of metagenomics software

    In metagenome analysis, computational methods for assembly, taxonomic profiling and binning are key components facilitating downstream biological data interpretation. However, a lack of consensus about benchmarking datasets and evaluation metrics complicates proper performance assessment. The Critical Assessment of Metagenome Interpretation (CAMI) challenge has engaged the global developer community to benchmark their programs on datasets of unprecedented complexity and realism. Benchmark metagenomes were generated from ~700 newly sequenced microorganisms and ~600 novel viruses and plasmids, including genomes with varying degrees of relatedness to each other and to publicly available ones and representing common experimental setups. Across all datasets, assembly and genome binning programs performed well for species represented by individual genomes, while performance was substantially affected by the presence of related strains. Taxonomic profiling and binning programs were proficient at high taxonomic ranks, with a notable performance decrease below the family level. Parameter settings substantially impacted performances, underscoring the importance of program reproducibility. While highlighting current challenges in computational metagenomics, the CAMI results provide a roadmap for software selection to answer specific research questions.

    Results from the alignment of Saccharomyces cerevisiae Illumina 12 kb CLIP-PE libraries to the reference genome.

    Percentages are calculated by dividing by “Mapped Paired Reads”. Non-redundant pairs map unambiguously to the reference and are de-replicated. The mean gap size and number of gaps are from the mate pair coverage, not raw read coverage.
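    The percentage convention in this caption can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, the dictionary keys and all counts are hypothetical, and the only thing taken from the caption is that every category is divided by the mapped paired read count.

```python
def percent_of_mapped(counts, denominator_key="mapped_paired_reads"):
    """Express each category count as a percentage of the mapped paired reads.

    `counts` maps category names to read-pair counts. The key names are
    hypothetical; only the choice of denominator follows the caption.
    """
    total = counts[denominator_key]
    if total <= 0:
        raise ValueError("mapped paired read count must be positive")
    return {
        name: 100.0 * value / total
        for name, value in counts.items()
        if name != denominator_key
    }


# Made-up counts, for illustration only (not values from the table).
example_counts = {
    "mapped_paired_reads": 1000,
    "non_redundant_pairs": 250,
    "redundant_pairs": 50,
}
example_percentages = percent_of_mapped(example_counts)
```

    Dividing by mapped pairs rather than by all sequenced pairs means the percentages describe the composition of the alignable fraction only.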

    Histogram of insert sizes from Saccharomyces cerevisiae Illumina 22 kb CLIP-PE libraries.

    A: cut with NlaIII; B: cut with HpyCH4IV; C: random shearing approach. The distribution of insert lengths was determined by aligning the reads to the reference genome.
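    Deriving an insert size distribution from aligned read pairs can be sketched as below. This is an illustration under stated assumptions, not the pipeline used for the figure: it assumes both reads of a pair map to the same reference sequence, that positions are 0-based leftmost alignment starts, and that all reads share one length. The function names, bin width and coordinates are hypothetical.

```python
from collections import Counter


def insert_size(pos_a, pos_b, read_len):
    """Outer distance on the reference spanned by the two reads of a pair.

    Assumes both reads map to the same reference sequence and that each
    position is the 0-based leftmost aligned base of a read of `read_len` bases.
    """
    left = min(pos_a, pos_b)
    right = max(pos_a, pos_b) + read_len
    return right - left


def insert_size_histogram(pairs, read_len, bin_width=1000):
    """Count pairs per insert-size bin; keys are the lower edges of the bins."""
    hist = Counter()
    for pos_a, pos_b in pairs:
        size = insert_size(pos_a, pos_b, read_len)
        hist[(size // bin_width) * bin_width] += 1
    return dict(sorted(hist.items()))


# Made-up alignment start positions, for illustration only.
example_pairs = [(1000, 22900), (5000, 27150), (300, 12200)]
example_hist = insert_size_histogram(example_pairs, read_len=100)
```

    In practice the positions would come from an alignment file, and pairs mapping to different chromosomes or to multiple locations would need to be filtered out first.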

    Results from the alignment of Saccharomyces cerevisiae Illumina 22 kb CLIP-PE libraries to a reference genome.

    Percentages are calculated by dividing by “Mapped Paired Reads”. Non-redundant pairs map unambiguously to the reference and are de-replicated. The mean gap size and number of gaps are from the mate pair coverage, not raw read coverage.

    Mis-assembly numbers of Saccharomyces cerevisiae CLIP-PE libraries.

    Std refers to the standard Illumina 250 bp library, sim 12 kb refers to the simulated 12 kb mate pair library, and sim 22 kb refers to the simulated 22 kb mate pair library.

    Assembly metrics for Saccharomyces cerevisiae Illumina CLIP-PE libraries.

    std refers to the standard Illumina 250 bp library, sim 12 kb refers to the simulated 12 kb mate pair library, and sim 22 kb refers to the simulated 22 kb mate pair library.

    A schematic representation of the CLIP-PE library construction strategy.

    Following fragmentation, the DNA molecules are end-repaired and ligated to LoxP-P1 and LoxP-P2 adaptors, which incorporate the Illumina P1 or P2 sequences. After separation and size selection, the DNA is circularized by Cre recombinase, and non-circularized DNA is removed by exonuclease digestion. A restriction enzyme is then used to digest and fragment the DNA. (Alternatively, the circularized DNA can be fragmented by random shearing to 400–500 bp, followed by end-repair.) The DNA is then self-ligated. Inverse PCR with Illumina P1 and P2 PCR primers is used to enrich the mate-paired molecules for sequencing. The final libraries consist of short fragments made up of two DNA segments that were originally separated by 5–22 kb.