    Local Binary Patterns as a Feature Descriptor in Alignment-free Visualisation of Metagenomic Data

    Shotgun sequencing has facilitated the analysis of complex microbial communities. However, clustering and visualising these communities without prior taxonomic information is a major challenge. Feature descriptor methods can be utilised to extract these taxonomic relations from the data. Here, we present a novel approach consisting of local binary patterns (LBP) coupled with randomised singular value decomposition (RSVD) and Barnes-Hut t-distributed stochastic neighbor embedding (BH-tSNE) to highlight the underlying taxonomic structure of the metagenomic data. The effectiveness of our approach is demonstrated using several simulated metagenomic datasets and a real metagenomic dataset.
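
    To make the feature-extraction step concrete, here is a minimal sketch of a one-dimensional LBP histogram computed over a nucleotide sequence. It is an illustration under stated assumptions, not the authors' implementation: the numeric encoding in `ENCODING`, the `radius` parameter, and the function names `lbp_codes` and `lbp_histogram` are all hypothetical, since the abstract does not say how sequences are turned into signals. The RSVD and BH-tSNE steps are not shown.

```python
from collections import Counter

# Hypothetical numeric encoding of nucleotides (an assumption; the abstract
# does not specify how a sequence is mapped to a numeric signal).
ENCODING = {"A": 1, "C": 2, "G": 3, "T": 4}


def lbp_codes(signal, radius=2):
    """Return one LBP code per position that has `radius` neighbours on each side.

    Each neighbour contributes one bit: 1 if it is >= the centre value, else 0.
    """
    codes = []
    for i in range(radius, len(signal) - radius):
        center = signal[i]
        # Neighbours to the left and right of the centre, centre excluded.
        neighbours = signal[i - radius:i] + signal[i + 1:i + radius + 1]
        code = 0
        for bit, value in enumerate(neighbours):
            if value >= center:
                code |= 1 << bit
        codes.append(code)
    return codes


def lbp_histogram(seq, radius=2):
    """Normalised histogram of LBP codes, usable as a fixed-length feature vector."""
    # Characters outside A/C/G/T (for example N) are skipped.
    signal = [ENCODING[base] for base in seq.upper() if base in ENCODING]
    codes = lbp_codes(signal, radius)
    n_bins = 1 << (2 * radius)  # one bin per possible bit pattern
    if not codes:
        return [0.0] * n_bins
    counts = Counter(codes)
    return [counts[c] / len(codes) for c in range(n_bins)]
```

    Each sequence yields a vector of length `2 ** (2 * radius)` regardless of its own length, so the vectors for many sequences can be stacked into a matrix before dimensionality reduction.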

    HabiSign: a novel approach for comparison of metagenomes and rapid identification of habitat-specific sequences

    BACKGROUND: One of the primary goals of comparative metagenomic projects is to study the differences in the microbial communities residing in diverse environments. Besides providing valuable insights into the inherent structure of the microbial populations, these studies have potential applications in several important areas of medical research like disease diagnostics, detection of pathogenic contamination and identification of hitherto unknown pathogens. Here we present a novel and rapid, alignment-free method called HabiSign, which utilizes patterns of tetra-nucleotide usage in microbial genomes to bring out the differences in the composition of both diverse and related microbial communities. RESULTS: Validation results show that the metagenomic signatures obtained using the HabiSign method are able to accurately cluster metagenomes at biome, phenotypic and species levels, as compared to an average tetranucleotide frequency based approach and the recently published dinucleotide relative abundance based approach. More importantly, the method is able to identify subsets of sequences that are specific to a particular habitat. Apart from this, being alignment-free, the method can rapidly compare and group multiple metagenomic data sets in a short span of time. CONCLUSIONS: The proposed method is expected to have immense applicability in diverse areas of metagenomic research ranging from disease diagnostics and pathogen detection to bio-prospecting. A web-server for the HabiSign algorithm is available at http://metagenomics.atc.tcs.com/HabiSign/.

    Comparison of metagenomic samples using sequence signatures

    BACKGROUND: Sequence signatures, as defined by the frequencies of k-tuples (or k-mers, k-grams), have been used extensively to compare genomic sequences of individual organisms, to identify cis-regulatory modules, and to study the evolution of regulatory sequences. Recently many next-generation sequencing (NGS) read data sets of metagenomic samples from a variety of different environments have been generated. The assembly of these reads can be difficult and analysis methods based on mapping reads to genes or pathways are also restricted by the availability and completeness of existing databases. Sequence-signature-based methods, however, do not need the complete genomes or existing databases and thus, can potentially be very useful for the comparison of metagenomic samples using NGS read data. Still, the applications of sequence signature methods for the comparison of metagenomic samples have not been well studied. RESULTS: We studied several dissimilarity measures for the comparison of metagenomic samples using sequence signatures: $d_2$, $d_2^*$ and $d_2^S$, recently developed by our group; a measure (hereinafter noted as Hao) used in CVTree, developed by Hao’s group (Qi et al., 2004); measures based on relative di-, tri-, and tetra-nucleotide frequencies as in Willner et al. (2009); and standard $L_p$ measures between the frequency vectors. We compared their performance using a series of extensive simulations and three real NGS metagenomic datasets: 39 fecal samples from 33 mammalian host species, 56 marine samples across the world, and 13 fecal samples from human individuals. Results showed that the dissimilarity measure $d_2^S$ can achieve superior performance when comparing metagenomic samples by clustering them into different groups as well as recovering environmental gradients affecting microbial samples. New insights into the environmental factors affecting microbial compositions in metagenomic samples are obtained through the analyses. Our results show that sequence signatures of the mammalian gut are closely associated with diet and gut physiology of the mammals, and that sequence signatures of marine communities are closely related to location and temperature. CONCLUSIONS: Sequence signatures can successfully reveal major group and gradient relationships among metagenomic samples from NGS reads without alignment to reference databases. The $d_2^S$ dissimilarity measure is a good choice in all application scenarios. The optimal choice of tuple size depends on sequencing depth, but it is quite robust within a range of choices for moderate sequencing depths.
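
    The simplest, uncentred measure in this family can be sketched in a few lines. The block below is an illustration under assumptions, not the authors' implementation: it assumes the dissimilarity takes the form 0.5 * (1 - cosine similarity) of the two k-tuple count vectors, and the names `kmer_counts` and `d2_dissimilarity` are hypothetical. The centred variants need a background sequence model and are not shown.

```python
from collections import Counter
from math import sqrt


def kmer_counts(seq, k):
    """Count overlapping k-tuples in a sequence, skipping any that contain non-ACGT characters."""
    seq = seq.upper()
    valid = set("ACGT")
    counts = Counter()
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if set(word) <= valid:
            counts[word] += 1
    return counts


def d2_dissimilarity(x, y):
    """Assumed form: 0.5 * (1 - D2 / (|x| * |y|)), where D2 is the inner product of the counts.

    Returns 0 for proportional count vectors and 0.5 for vectors sharing no k-tuples.
    """
    dot = sum(x[w] * y[w] for w in x.keys() & y.keys())
    norm_x = sqrt(sum(v * v for v in x.values()))
    norm_y = sqrt(sum(v * v for v in y.values()))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("both samples must contain at least one valid k-tuple")
    return 0.5 * (1.0 - dot / (norm_x * norm_y))
```

    For a metagenomic sample, the counts from all reads would be summed into a single `Counter` before comparison; the choice of `k` is left to the caller, in line with the abstract's remark that the best tuple size depends on sequencing depth.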

    Microbial Similarity between Students in a Common Dormitory Environment Reveals the Forensic Potential of Individual Microbial Signatures.

    The microbiota of the built environment is an amalgamation of both human and environmental sources. While human sources have been examined within single-family households or in public environments, it is unclear what effect a large number of cohabitating people have on the microbial communities of their shared environment. We sampled the public and private spaces of a college dormitory, disentangling individual microbial signatures and their impact on the microbiota of common spaces. We compared multiple methods for marker gene sequence clustering and found that minimum entropy decomposition (MED) was best able to distinguish between the microbial signatures of different individuals and was able to uncover more discriminative taxa across all taxonomic groups. Further, weighted UniFrac- and random forest-based graph analyses uncovered two distinct spheres of hand- or shoe-associated samples. Using graph-based clustering, we identified spheres of interaction and found that connection between these clusters was enriched for hands, implicating them as a primary means of transmission. In contrast, shoe-associated samples were found to be freely interacting, with individual shoes more connected to each other than to the floors they interact with. Individual interactions were highly dynamic, with groups of samples originating from individuals clustering freely with samples from other individuals, while all floor and shoe samples consistently clustered together. IMPORTANCE: Humans leave behind a microbial trail, regardless of intention. This may allow for the identification of individuals based on the "microbial signatures" they shed in built environments. In a shared living environment, these trails intersect, and through interaction with common surfaces may become homogenized, potentially confounding our ability to link individuals to their associated microbiota. We sought to understand the factors that influence the mixing of individual signatures and how best to process sequencing data to tease apart these signatures.

    Metagenomic analysis of dental calculus in ancient Egyptian baboons

    Dental calculus, or mineralized plaque, represents a record of ancient biomolecules and food residues. Recently, ancient metagenomics made it possible to unlock the wealth of microbial and dietary information of dental calculus to reconstruct oral microbiomes and lifestyle of humans from the past. Although most studies have so far focused on ancient humans, dental calculus is known to form in a wide range of animals, potentially informing on how human-animal interactions changed the animals' oral ecology. Here, we characterise the oral microbiome of six ancient Egyptian baboons held in captivity during the late Pharaonic era (9th-6th centuries BC) and of two historical baboons from a zoo via shotgun metagenomics. We demonstrate that these captive baboons possessed a distinctive oral microbiome when compared to ancient and modern humans, Neanderthals and a wild chimpanzee. These results may reflect the omnivorous dietary behaviour of baboons, even though health, food provisioning and other factors associated with human management may have changed the baboons' oral microbiome. We anticipate our study to be a starting point for more extensive studies on ancient animal oral microbiomes to examine the extent to which domestication and human management in the past affected the diet, health and lifestyle of target animals.

    Tiny microbes, enormous impacts: what matters in gut microbiome studies?

    Many factors affect the microbiomes of humans, mice, and other mammals, but substantial challenges remain in determining which of these factors are of practical importance. Considering the relative effect sizes of both biological and technical covariates can help improve study design and the quality of biological conclusions. Care must be taken to avoid technical bias that can lead to incorrect biological conclusions. The presentation of quantitative effect sizes in addition to P values will improve our ability to perform meta-analysis and to evaluate potentially relevant biological effects. A better consideration of effect size and statistical power will lead to more robust biological conclusions in microbiome studies.

    Microbial genomic taxonomy

    A need for a genomic species definition is emerging from several independent studies worldwide. In this commentary paper, we discuss recent studies on the genomic taxonomy of diverse microbial groups and a unified species definition based on genomics. Accordingly, strains from the same microbial species share >95% Average Amino Acid Identity (AAI) and Average Nucleotide Identity (ANI), >95% identity based on multiple alignment genes, and >70% in silico Genome-to-Genome Hybridization similarity (GGDH). Species of the same genus will form monophyletic groups on the basis of 16S rRNA gene sequences, Multilocus Sequence Analysis (MLSA) and supertree analysis. In addition to the established requirements for species descriptions, we propose that new taxa descriptions should also include at least a draft genome sequence of the type strain in order to obtain a clear outlook on the genomic landscape of the novel microbe. The application of the new genomic species definition put forward here will allow researchers to use genome sequences to define simultaneously coherent phenotypic and genomic groups.
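
    The thresholds listed in this abstract can be read as a simple decision rule. The sketch below encodes them as such; it is only an illustration, and the class `GenomePairSimilarity`, the function `same_species`, and the field names are hypothetical. Treating the 70% GGDH figure as a lower bound is an assumption, since the abstract gives the number without a comparison sign.

```python
from dataclasses import dataclass


@dataclass
class GenomePairSimilarity:
    """Pairwise similarity values for two strains, all in percent (hypothetical container)."""
    aai: float           # Average Amino Acid Identity
    ani: float           # Average Nucleotide Identity
    mla_identity: float  # identity based on multiple alignment genes
    ggdh: float          # in silico Genome-to-Genome Hybridization similarity


def same_species(pair, identity_cutoff=95.0, ggdh_cutoff=70.0):
    """Return True if every threshold quoted in the abstract is met.

    Assumption: GGDH is treated as a lower bound (>= ggdh_cutoff).
    """
    return (
        pair.aai > identity_cutoff
        and pair.ani > identity_cutoff
        and pair.mla_identity > identity_cutoff
        and pair.ggdh >= ggdh_cutoff
    )
```

    The cutoffs are exposed as parameters so that a different reading of the thresholds can be tested without changing the rule itself.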

    Dietary history contributes to enterotype-like clustering and functional metagenomic content in the intestinal microbiome of wild mice

    Understanding the origins of gut microbial community structure is critical for the identification and interpretation of potential fitness-related traits for the host. The presence of community clusters characterized by differences in the abundance of signature taxa, referred to as enterotypes, is a debated concept first reported in humans and later extended to other mammalian hosts. In this study, we provide a thorough assessment of their existence in wild house mice using a panel of evaluation criteria. We identify support for two clusters that are compositionally similar to clusters identified in humans, chimpanzees, and laboratory mice, characterized by differences in Bacteroides, Robinsoniella, and unclassified genera belonging to the family Lachnospiraceae. To further evaluate these clusters, we (i) monitored community changes associated with moving mice from the natural to a laboratory environment, (ii) performed functional metagenomic sequencing, and (iii) subjected wild-caught samples to stable isotope analysis to reconstruct dietary patterns. This process reveals differences in the proportions of genes involved in carbohydrate versus protein metabolism in the functional metagenome, as well as differences in plant- versus meat-derived food sources between clusters. In conjunction with wild-caught mice quickly changing their enterotype classification upon transfer to a standard laboratory chow diet, these results provide strong evidence that dietary history contributes to the presence of enterotype-like clustering in wild mice.