
    BIGMAC : breaking inaccurate genomes and merging assembled contigs for long read metagenomic assembly.

    Background: The problem of de-novo assembly for metagenomes using only long reads is gaining attention. We study whether post-processing metagenomic assemblies with the original input long reads can result in quality improvement. Previous approaches have focused on pre-processing reads and optimizing assemblers. BIGMAC takes an alternative perspective to focus on the post-processing step. Results: Using both the assembled contigs and original long reads as input, BIGMAC first breaks the contigs at potentially mis-assembled locations and subsequently scaffolds contigs. Our experiments on metagenomes assembled from long reads show that BIGMAC can improve assembly quality by reducing the number of mis-assemblies while maintaining or increasing N50 and N75. Moreover, BIGMAC shows the largest N75 to number of mis-assemblies ratio on all tested datasets when compared to other post-processing tools. Conclusions: BIGMAC demonstrates the effectiveness of the post-processing approach in improving the quality of metagenomic assemblies.
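The N50 and N75 statistics mentioned in this abstract follow a standard definition: the length of the contig at which the running total of contig lengths, sorted from longest to shortest, first reaches 50% (or 75%) of the total assembly length. The sketch below is only an illustration of that definition; the function name `nx` and the example lengths are hypothetical and are not taken from BIGMAC.

```python
def nx(lengths, fraction):
    """Return the Nx statistic (e.g. fraction=0.5 for N50) of contig lengths.

    Hypothetical helper for illustration; not part of BIGMAC.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    threshold = sum(lengths) * fraction
    running = 0
    # Walk contigs from longest to shortest, accumulating length.
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= threshold:
            return length


# Made-up contig lengths, total 300 bases.
contigs = [100, 80, 60, 40, 20]
n50 = nx(contigs, 0.50)
n75 = nx(contigs, 0.75)
```

For the made-up lengths above, the running totals are 100, 180, 240, so N50 is 80 and N75 is 60. A ratio such as N75 divided by the number of mis-assemblies could then be computed from an evaluation tool's mis-assembly count.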

    Assembling large, complex environmental metagenomes

    The large volumes of sequencing data required to sample complex environments deeply pose new challenges to sequence analysis approaches. De novo metagenomic assembly effectively reduces the total amount of data to be analyzed but requires significant computational resources. We apply two pre-assembly filtering approaches, digital normalization and partitioning, to make large metagenome assemblies more computationally tractable. Using a human gut mock community dataset, we demonstrate that these methods result in assemblies nearly identical to assemblies from unprocessed data. We then assemble two large soil metagenomes from matched Iowa corn and native prairie soils. The predicted functional content and phylogenetic origin of the assembled contigs indicate significant taxonomic differences despite similar function. The assembly strategies presented are generic and can be extended to any metagenome; full source code is freely available under a BSD license. Comment: Includes supporting information
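The abstract does not spell out how digital normalization works. As commonly described, it streams through the reads, estimates each read's coverage as the median count of its k-mers seen so far, and discards reads whose estimated coverage already meets a cutoff. The following is a toy sketch under those assumptions: the function names, the exact in-memory counter, and the values of `k` and `cutoff` are illustrative choices, not the authors' implementation (which would need memory-efficient counting for real datasets).

```python
from collections import Counter
from statistics import median


def kmers(seq, k):
    """All overlapping substrings of length k in seq."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def digital_normalize(reads, k=4, cutoff=2):
    """Keep a read only if its median k-mer count so far is below cutoff.

    Toy illustration: exact counts in memory, no reverse complements.
    """
    counts = Counter()
    kept = []
    for read in reads:
        ks = kmers(read, k)
        if not ks:
            continue  # read shorter than k
        if median(counts[km] for km in ks) < cutoff:
            kept.append(read)
            counts.update(ks)  # only retained reads add to coverage
    return kept


# Five copies of a high-coverage read plus one rare read (made-up data).
reads = ["ACGTACGT"] * 5 + ["TTTTGGGG"]
kept = digital_normalize(reads, k=4, cutoff=2)
```

With this made-up input, redundant copies of the abundant read are dropped once its estimated coverage reaches the cutoff, while the rare read is retained, which is the data reduction the abstract refers to.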

    GraphBin2: Refined and Overlapped Binning of Metagenomic Contigs Using Assembly Graphs

    Metagenomic sequencing allows us to study structure, diversity and ecology in microbial communities without the necessity of obtaining pure cultures. In many metagenomics studies, the reads obtained from metagenomics sequencing are first assembled into longer contigs and these contigs are then binned into clusters of contigs where contigs in a cluster are expected to come from the same species. As different species may share common sequences in their genomes, one assembled contig may belong to multiple species. However, existing tools for contig binning only support non-overlapped binning, i.e., each contig is assigned to at most one bin (species). In this paper, we introduce GraphBin2, which refines the binning results obtained from existing tools and, more importantly, is able to assign contigs to multiple bins. GraphBin2 uses the connectivity and coverage information from assembly graphs to adjust existing binning results on contigs and to infer contigs shared by multiple species. Experimental results on both simulated and real datasets demonstrate that GraphBin2 not only improves the binning results of existing tools but also supports assigning contigs to multiple bins.

    Pyrosequencing/Sanger Plant Genome Assembly (Limitations, Problems And Solutions) - On The Way To Cucumber (Cucumis sativus L. cv. Borszczagowski) Draft Genome Sequence Publishing

    Genome sequencing gives us an opportunity to broaden the range and improve the quality of omics studies. New sequencing technologies make it possible to obtain sequencing reads quickly and cheaply. Since the assembly step for such next-generation reads is still not well standardized, it is the most cumbersome part of sequencing projects.
    We focus here on the wide range of approaches undertaken to assemble the draft cucumber (Cucumis sativus L. cv. Borszczagowski) genome, using 12x coverage of pyrosequencing Titanium chemistry shotgun paired and unpaired reads combined with Sanger reads.