Metagenomic analysis of planktonic riverine microbial consortia using nanopore sequencing reveals insight into river microbe taxonomy and function
Background Riverine ecosystems are biogeochemical powerhouses driven largely by microbial communities that inhabit water columns and sediments. Because rivers are used extensively for anthropogenic purposes (drinking water, recreation, agriculture, and industry), it is essential to understand how these activities affect the composition of river microbial consortia. Recent studies have shown that river metagenomes vary considerably, suggesting that microbial community data should be included in broad-scale river ecosystem models. But such ecogenomic studies have not been applied on a broad “aquascape” scale, and few if any have applied the newest nanopore technology. Results We investigated the metagenomes of 11 rivers across 3 continents using MinION nanopore sequencing, a portable platform that could be useful for future global river monitoring. Up to 10 Gb of data per run were generated, with average read lengths of 3.4 kb. Diversity analysis and diagnosis of river functional potential were accomplished with 0.5–1.0 × 10^6 long reads. Our observations for 7 of the 11 rivers conformed to other river-omic findings, and we exposed previously unrecognized microbial biodiversity in the other 4 rivers. Conclusions The deeper understanding that emerged is that river microbial consortia and the ecological functions they fulfil did not align with geographic location but instead implicated ecological responses of microbes to urban and other anthropogenic effects, and that changes in taxa manifested over a very short geographic space.
MEGAN-LR: new algorithms allow accurate binning and easy interactive exploration of metagenomic long reads and contigs
Background There are numerous computational tools for taxonomic or functional analysis of microbiome samples, optimized to run on hundreds of millions of short, high-quality sequencing reads. Programs such as MEGAN allow the user to interactively navigate these large datasets. Long-read sequencing technologies continue to improve and produce increasing numbers of longer reads (of varying lengths in the range of 10k-1M bps, say), but of low quality. There is an increasing interest in using long reads in microbiome sequencing, and there is a need to adapt short-read tools to long-read datasets. Methods We describe a new LCA-based algorithm for taxonomic binning, and an interval-tree based algorithm for functional binning, that are explicitly designed for long reads and assembled contigs. We provide a new interactive tool for investigating the alignment of long reads against reference sequences. For taxonomic and functional binning, we propose to use LAST to compare long reads against the NCBI-nr protein reference database so as to obtain frame-shift aware alignments, and then to process the results using our new methods. Results All presented methods are implemented in the open source edition of MEGAN, and we refer to this new extension as MEGAN-LR (MEGAN long read). We evaluate the LAST+MEGAN-LR approach in a simulation study, and on a number of mock community datasets consisting of Nanopore reads, PacBio reads and assembled PacBio reads. We also illustrate the practical application on a Nanopore dataset that we sequenced from an anammox bioreactor community. Reviewers This article was reviewed by Nicola Segata together with Moreno Zolfo, Pete James Lockhart and Serghei Mangul. Conclusion This work extends the applicability of the widely used metagenomic analysis software MEGAN to long reads. Our study suggests that the presented LAST+MEGAN-LR pipeline is sufficiently fast and accurate.
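The abstract above mentions an LCA-based (lowest common ancestor) algorithm for taxonomic binning. As a rough illustration of the general LCA idea only, the following minimal Python sketch assigns a read to the deepest taxon shared by all of its aligned references. It is the plain, naive LCA and not the long-read variant described in the MEGAN-LR paper; the toy taxonomy, the `PARENT` table and the function names are hypothetical and chosen purely for illustration.

```python
from typing import Dict, List, Optional

# Hypothetical toy taxonomy (child -> parent), for illustration only.
PARENT: Dict[str, Optional[str]] = {
    "root": None,
    "Bacteria": "root",
    "Proteobacteria": "Bacteria",
    "Escherichia": "Proteobacteria",
    "Salmonella": "Proteobacteria",
    "Firmicutes": "Bacteria",
    "Bacillus": "Firmicutes",
}


def path_to_root(taxon: str) -> List[str]:
    """Return the lineage of a taxon, ordered from the root down to the taxon."""
    path: List[str] = []
    node: Optional[str] = taxon
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path[::-1]


def naive_lca(taxa: List[str]) -> str:
    """Return the deepest taxon that is an ancestor of (or equal to) every input taxon."""
    if not taxa:
        raise ValueError("at least one taxon is required")
    paths = [path_to_root(t) for t in taxa]
    lca = "root"
    # Walk the lineages level by level until they diverge.
    for level in zip(*paths):
        if all(name == level[0] for name in level):
            lca = level[0]
        else:
            break
    return lca


# A read whose alignments hit two genera in the same phylum is binned to that phylum.
print(naive_lca(["Escherichia", "Salmonella"]))
```

With this toy table, a read hitting both Escherichia and Bacillus would be placed at Bacteria, which shows why conflicting alignments push assignments toward higher taxonomic ranks.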