1,809 research outputs found
Multiple Comparative Metagenomics using Multiset k-mer Counting
Background. Large-scale metagenomic projects aim to extract biodiversity
knowledge across different environmental conditions. Current methods for
comparing microbial communities face important limitations. Those based on
taxonomic or functional assignment rely on the small subset of sequences
that can be associated with known organisms. De novo methods, on the other hand,
which compare the whole sets of sequences, either do not scale up to ambitious
metagenomic projects or do not provide precise and exhaustive results.
Methods. These limitations motivated the development of a new de novo
metagenomic comparative method, called Simka. This method computes a large
collection of standard ecological distances by replacing species counts with
k-mer counts. Simka scales up to today's metagenomic projects thanks to a new
parallel k-mer counting strategy on multiple datasets.
Results. Experiments on public Human Microbiome Project datasets demonstrate
that Simka captures the essential underlying biological structure. Simka was
able to compute in a few hours both qualitative and quantitative ecological
distances on hundreds of metagenomic samples (690 samples, 32 billion
reads). We also demonstrate that results obtained by analyzing metagenomes at
the k-mer level are highly correlated with those of extremely precise de novo
comparison techniques that rely on an all-versus-all sequence alignment
strategy or that are based on taxonomic profiling.
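The idea of replacing species counts with k-mer counts can be illustrated with a minimal sketch. This is not Simka's implementation: the function names, the toy reads, and the choice of k = 4 are assumptions made for illustration, and the abstract's parallel multi-dataset counting strategy is not reproduced here. The sketch computes one quantitative distance (Bray-Curtis) and one qualitative distance (Jaccard) from per-sample k-mer count tables.

```python
from collections import Counter


def kmer_counts(reads, k):
    """Count every k-mer (substring of length k) across a list of reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def bray_curtis(a, b):
    """Quantitative (abundance-based) Bray-Curtis dissimilarity."""
    shared = sum(min(a[kmer], b[kmer]) for kmer in a.keys() & b.keys())
    total = sum(a.values()) + sum(b.values())
    return 1.0 - 2.0 * shared / total if total else 0.0


def jaccard_distance(a, b):
    """Qualitative (presence/absence) Jaccard distance."""
    union = a.keys() | b.keys()
    if not union:
        return 0.0
    return 1.0 - len(a.keys() & b.keys()) / len(union)


# Hypothetical toy samples; real metagenomes contain millions of reads.
sample_a = kmer_counts(["ACGTACGT", "ACGTTT"], k=4)
sample_b = kmer_counts(["ACGTACGA", "TTTTGG"], k=4)

print(bray_curtis(sample_a, sample_b))
print(jaccard_distance(sample_a, sample_b))
```

Both functions take the k-mer tables exactly where a classical ecological analysis would take species abundance tables, which is why no taxonomic assignment is needed.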
Methods for comparative metagenomics
Background: Metagenomics is a rapidly growing field of research that aims at studying uncultured organisms to understand the true diversity of microbes, their functions, cooperation and evolution, in environments such as soil, water, ancient remains of animals, or the digestive system of animals and humans. The recent development of ultra-high throughput sequencing technologies, which do not require cloning or PCR amplification, and can produce huge numbers of DNA reads at an affordable cost, has boosted the number and scope of metagenomic sequencing projects. Increasingly, there is a need for new ways of comparing multiple metagenomics datasets, and for fast and user-friendly implementations of such approaches. Results: This paper introduces a number of new methods for interactively exploring, analyzing and comparing multiple metagenomic datasets, which will be made freely available in a new, comparative version 2.0 of the stand-alone metagenome analysis tool MEGAN. Conclusion: There is a great need for powerful and user-friendly tools for comparative analysis of metagenomic data and MEGAN 2.0 will help to fill this gap.
Comparative metagenomics of Daphnia symbionts
BACKGROUND: Shotgun sequences of DNA extracts from whole organisms allow a comprehensive assessment of possible symbionts. The current project makes use of four shotgun datasets from three species of the planktonic freshwater crustaceans Daphnia: one dataset from clones of D. pulex and D. pulicaria and two datasets from one clone of D. magna. We analyzed these datasets with three aims: First, we search for bacterial symbionts, which are present in all three species. Second, we search for evidence for Cyanobacteria and plastids, which had been suggested to occur as symbionts in a related Daphnia species. Third, we compare the metacommunities revealed by two different 454 pyrosequencing methods (GS 20 and GS FLX). RESULTS: In all datasets we found evidence for a large number of bacteria belonging to diverse taxa. The vast majority of these were Proteobacteria. Of those, most sequences were assigned to different genera of the Betaproteobacteria family Comamonadaceae. Other taxa represented in all datasets included the genera Flavobacterium, Rhodobacter, Chromobacterium, Methylibium, Bordetella, Burkholderia and Cupriavidus. A few taxa matched sequences only from the D. pulex and the D. pulicaria datasets: Aeromonas, Pseudomonas and Delftia. Taxa with many hits specific to a single dataset were rare. For most of the identified taxa earlier studies reported the finding of related taxa in aquatic environmental samples. We found no clear evidence for the presence of symbiotic Cyanobacteria or plastids. The apparent similarity of the symbiont communities of the three Daphnia species breaks down on a species and strain level. Communities have a similar composition at a higher taxonomic level, but the actual sequences found are divergent. The two Daphnia magna datasets obtained from two different pyrosequencing platforms revealed rather similar results. CONCLUSION: Three clones from three species of the genus Daphnia were found to harbor a rich community of symbionts. 
These communities are similar at the genus and higher taxonomic levels, but are composed of different species. The similarity of these three symbiont communities hints that some of these associations may be stable in the long term.
CoMet—a web server for comparative functional profiling of metagenomes
Analyzing the functional potential of newly sequenced genomes and metagenomes has become a common task in biomedical and biological research. With the advent of high-throughput sequencing technologies, comparative metagenomics opens the way to elucidating the genetically determined similarities and differences of complex microbial communities. We developed the web server ‘CoMet’ (http://comet.gobics.de), which provides an easy-to-use comparative metagenomics platform that is well suited to the analysis of large collections of metagenomic short read data. CoMet combines ORF finding and the subsequent assignment of protein sequences to Pfam domain families with a comparative statistical analysis. Besides comprehensive tabular data files, the CoMet server also provides visually interpretable output in the form of hierarchical clustering and multi-dimensional scaling plots, and thus allows a quick overview of a given set of metagenomic samples.
An application of statistics to comparative metagenomics
BACKGROUND: Metagenomics, the sequence analysis of genomic DNA isolated directly from the environment, can be used to identify organisms and model the community dynamics of a particular ecosystem. Metagenomics also has the potential to identify significantly different metabolic potential in different environments. RESULTS: Here we use a statistical method to compare curated subsystems, to predict the physiology, metabolism, and ecology from metagenomes. This approach can be used to identify those subsystems that are significantly different between metagenome sequences. Subsystems that were overrepresented in the Sargasso Sea and Acid Mine Drainage metagenomes when compared to non-redundant databases were identified. CONCLUSION: The methodology described herein applies statistics to the comparison of metabolic potential in metagenomes. This analysis reveals those subsystems that are more, or less, represented in the different environments that are compared. These differences in metabolic potential lead to several testable hypotheses about the physiology and metabolism of microbes from these ecosystems.
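As a rough illustration of what testing a subsystem for over- or under-representation between two metagenomes can look like, the sketch below applies a two-proportion z-test to hit counts. This is an assumption made for illustration, not necessarily the statistical method used in the paper; the function name and the counts are hypothetical.

```python
import math


def two_proportion_z(hits_a, total_a, hits_b, total_b):
    """Return (z, two-sided p-value) for the difference between two proportions.

    hits_x  : sequences assigned to the subsystem in metagenome x
    total_x : all assigned sequences in metagenome x
    """
    p_a = hits_a / total_a
    p_b = hits_b / total_b
    pooled = (hits_a + hits_b) / (total_a + total_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if se == 0:
        return 0.0, 1.0  # no variation at all, so no evidence of a difference
    z = (p_a - p_b) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts for one subsystem in two metagenomes.
z, p = two_proportion_z(120, 1000, 60, 1000)
print(z, p)
```

When many subsystems are tested at once, the resulting p-values would normally need a multiple-testing correction before any subsystem is called significantly different.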
Accelerating exhaustive pairwise metagenomic comparisons
In this manuscript, we present an optimized and parallel version of our previous work IMSAME, an exhaustive gapped aligner for the pairwise and accurate comparison of metagenomes. Parallelization strategies are applied to take advantage of modern multiprocessor architectures. In addition, sequential optimizations in CPU time and memory consumption are provided. These algorithmic and computational enhancements enable IMSAME to calculate near-optimal alignments, which are used to directly assess similarity between metagenomes without requiring reference databases. We show that the overall efficiency of the parallel implementation is above 80% while retaining scalability as the number of parallel cores used increases. Moreover, we also show that sequential optimizations yield up to 8x speedup for scenarios with larger data. Universidad de Málaga. Campus de Excelencia Internacional Andalucía Tec
Comparative metagenomics of PHA synthase genes in soil
Polyhydroxyalkanoates (PHAs) are biopolymers produced naturally by bacteria. They are of considerable scientific interest as fundamental components of bacterial carbon metabolism and have biotechnological applications as potential bioplastics. To date, studies of PHA metabolism have focused on a restricted set of PHA-producing bacterial species. Therefore, the diversity of PHA-producing taxa and gene sequences, and the efficiency of existing primers in recognizing PHA marker genes, are unclear. In this thesis, I report the first large-scale metagenomic analysis of PHA-producing taxa through taxonomic and functional profiling of 45 soil metagenomes from a broad range of soil types (bulk and rhizosphere). From a total of 229,070 detected class I-III PHA synthase (phaC) genes, PHA-producing microbial communities were inferred and compared between soil environments, and the sequence diversity and primer efficiency for different classes of phaC genes were analyzed. The analysis revealed several main findings: 1) both known and novel PHA-producing taxa were inferred to contribute high proportions of phaC genes in environmental samples; 2) distinct shifts in the PHA-producer communities were observed both between soil types and between phaC classes; 3) phaC-containing species were detected at relatively higher abundance in rhizosphere soils, implying a significant role for PHA storage in rhizobacteria; 4) existing primers did not adequately cover the sequence diversity of environmental homologs, and metagenomic diversity can be used to suggest modifications that improve primer efficiency.
Comparative Metagenomic Analysis of Two Hot Springs From Ourense (Northwestern Spain) and Others Worldwide
With their circumneutral pH and their moderate temperature (66 and 68°C, respectively), As Burgas and Muiño da Veiga are two important human-use hot springs, previously studied with traditional culture methods, but never explored with a metagenomic approach. In the present study, we have performed metagenomic sequence-based analyses to compare the taxonomic composition and functional potential of these hot springs. Proteobacteria, Deinococcus-Thermus, Firmicutes, Nitrospirae, and Aquificae are the dominant phyla in both geothermal springs, but there is a significant difference in the abundance of these phyla between As Burgas and Muiño da Veiga. Phylum Proteobacteria dominates the As Burgas ecosystem while Aquificae is the most abundant phylum in Muiño da Veiga. Taxonomic and functional analyses reveal that the variability in water geochemistry might be shaping the differences in the microbial communities inhabiting these geothermal springs. The organic compound content of As Burgas water promotes the presence of heterotrophic populations of the genera Acidovorax and Thermus, whereas the sulfate-rich water of Muiño da Veiga favors the co-dominance of the genera Sulfurihydrogenibium and Thermodesulfovibrio. Differences in ammonia concentration exert a selective pressure toward the growth of nitrogen-fixing bacteria such as Thermodesulfovibrio in Muiño da Veiga. Temperature and pH are two important factors shaping hot spring microbial communities, as determined by comparative analysis with other thermal springs. This study received financial support from the following organizations: Xunta de Galicia (Consolidación GRC) co-financed by FEDER (Grant Number ED431C 2020/08) and Ministerio de Ciencia, Innovación y Universidades (MICINN) (Grant Number RTI2018-099249-B-I00). The work of M-ED was supported by an FPU fellowship (Ministerio de Educación Cultura y Deporte) FPU12/05050. 
The metagenome sequencing of As Burgas water was performed by M-ED in the Dinsdale Lab (Department of Biology, San Diego State University), as part of a short stay financed by the Short-Term Mobility program of the FPU scholarship.