441 research outputs found

    Flower: extracting information from pyrosequencing data

    Summary: The SFF file format produced by Roche's 454 sequencing technology is a compact binary format that contains the flow values used for base and quality calling of the reads. Applications, e.g. in metagenomics, often depend on accurate sequence information, and access to flow values is important for estimating the probability of errors. Unfortunately, the programs supplied by Roche for accessing this information are not publicly available. Flower is a program that can extract the information contained in SFF files and convert it to various textual output formats.
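To illustrate why access to flow values helps with error estimation, here is a minimal Python sketch. It is not Flower's or Roche's algorithm: it assumes the flow values have already been extracted as floating-point signal intensities, assumes the cyclic flow order "TACG", and calls bases by naive rounding. The function names `call_bases` and `flow_ambiguity` are hypothetical.

```python
def call_bases(flow_values, flow_order="TACG"):
    """Naive base calling: round each flow value to a homopolymer length.

    Assumption: one flow value per nucleotide flow, cycling through flow_order.
    """
    bases = []
    for i, value in enumerate(flow_values):
        nucleotide = flow_order[i % len(flow_order)]
        run_length = int(value + 0.5)  # round half up to the nearest integer
        bases.append(nucleotide * run_length)
    return "".join(bases)


def flow_ambiguity(flow_values):
    """Distance of each flow value from its nearest integer (0.0 to 0.5).

    Values near 0.5 are the least certain homopolymer calls.
    """
    return [abs(value - int(value + 0.5)) for value in flow_values]


if __name__ == "__main__":
    flows = [1.02, 0.05, 0.98, 2.45]
    print(call_bases(flows))
    print(flow_ambiguity(flows))
```

In this sketch the last flow (2.45) sits almost halfway between a run of two and a run of three bases, which is the kind of uncertainty that is lost once only the called sequence is kept.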

    The distinct features of microbial 'dysbiosis' of Crohn's disease do not occur to the same extent in their unaffected, genetically linked kindred

    Background/Aims: Studying the gut microbiota in unaffected relatives of people with Crohn’s disease (CD) may advance our understanding of the role of bacteria in disease aetiology. Methods: Faecal microbiota composition (16S rRNA gene sequencing), genetic functional capacity (shotgun metagenomics) and faecal short chain fatty acids (SCFA) were compared in unaffected adult relatives of CD children (CDR, n = 17) and adult healthy controls unrelated to CD patients (HUC, n = 14). The microbiota characteristics of 19 CD children were used as a benchmark of CD ‘dysbiosis’. Results: The CDR microbiota was less diverse (p = 0.044) than that of the HUC group. Local contribution of β-diversity analysis showed no difference in community structure between the CDR and HUC groups. Twenty-one of 1,243 (1.8%) operational taxonomic units discriminated CDR from HUC. The metagenomic functional capacity (p = 0.207) and SCFA concentration or pattern were similar between CDR and HUC (p > 0.05 for all SCFA). None of the KEGG metabolic pathways differed between these two groups. Compared with children with CD, both groups (HUC and CDR) had higher microbiota α-diversity (CDR, p = 0.026; HUC, p < 0.001) and a distinct community structure (β-diversity). Conclusions: While some alterations were observed, a distinct microbial ‘dysbiosis’, characteristic of CD patients, was not observed in their unaffected, genetically linked kindred.
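The abstract does not say which diversity indices were used. As an illustration only, the sketch below assumes the Shannon index for α-diversity (within one sample) and Bray-Curtis dissimilarity for β-diversity (between two samples), both computed from per-taxon counts. The function names are hypothetical.

```python
import math


def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln p_i) over taxa with non-zero counts."""
    total = sum(counts)
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in proportions)


def bray_curtis(counts_a, counts_b):
    """Bray-Curtis dissimilarity between two samples over the same taxa.

    0.0 means identical counts; 1.0 means no shared taxa.
    """
    shared = sum(min(a, b) for a, b in zip(counts_a, counts_b))
    return 1.0 - 2.0 * shared / (sum(counts_a) + sum(counts_b))


if __name__ == "__main__":
    even_sample = [10, 10, 10, 10]    # four equally abundant taxa
    skewed_sample = [37, 1, 1, 1]     # one dominant taxon
    print(shannon_index(even_sample), shannon_index(skewed_sample))
    print(bray_curtis(even_sample, skewed_sample))
```

A more even community gives a higher Shannon index, which is the sense in which one group can be described as "less diverse" than another.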

    Dirichlet Multinomial Mixtures: Generative Models for Microbial Metagenomics

    We introduce Dirichlet multinomial mixtures (DMM) for the probabilistic modelling of microbial metagenomics data. This data can be represented as a frequency matrix giving the number of times each taxon is observed in each sample. The samples have different sizes, and the matrix is sparse, as communities are diverse and skewed to rare taxa. Most methods used previously to classify or cluster samples have ignored these features. We describe each community by a vector of taxa probabilities. These vectors are generated from one of a finite number of Dirichlet mixture components each with different hyperparameters. Observed samples are generated through multinomial sampling. The mixture components cluster communities into distinct ‘metacommunities’, and, hence, determine envirotypes or enterotypes, groups of communities with a similar composition. The model can also deduce the impact of a treatment and be used for classification. We wrote software for the fitting of DMM models using the ‘evidence framework’ (http://code.google.com/p/microbedmm/). This includes the Laplace approximation of the model evidence. We applied the DMM model to human gut microbe genera frequencies from Obese and Lean twins. From the model evidence four clusters fit this data best. Two clusters were dominated by Bacteroides and were homogeneous; two had a more variable community composition. We could not find a significant impact of body mass on community structure. However, Obese twins were more likely to derive from the high variance clusters. We propose that obesity is not associated with a distinct microbiota but increases the chance that an individual derives from a disturbed enterotype. This is an example of the ‘Anna Karenina principle (AKP)’ applied to microbial communities: disturbed states having many more configurations than undisturbed. We verify this by showing that in a study of inflammatory bowel disease (IBD) phenotypes, ileal Crohn's disease (ICD) is associated with a more variable community.
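The generative process described in this abstract (pick a mixture component, draw taxa probabilities from that component's Dirichlet distribution, then draw counts by multinomial sampling) can be sketched in a few lines of Python. This is only an illustration of the sampling direction, not the authors' fitting software: the function names, mixture weights and hyperparameters below are made up for the example, and only the standard library is used.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw a probability vector from Dirichlet(alpha) via normalised gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def sample_multinomial(n_reads, probs, rng):
    """Draw taxon counts for n_reads reads given taxa probabilities."""
    counts = [0] * len(probs)
    for taxon in rng.choices(range(len(probs)), weights=probs, k=n_reads):
        counts[taxon] += 1
    return counts


def generate_sample(mixture_weights, alphas, n_reads, rng):
    """Generate one community sample from a Dirichlet multinomial mixture."""
    # 1. Choose the metacommunity (mixture component).
    component = rng.choices(range(len(mixture_weights)), weights=mixture_weights, k=1)[0]
    # 2. Draw this community's taxa probabilities from that component's Dirichlet.
    probs = sample_dirichlet(alphas[component], rng)
    # 3. Draw the observed counts by multinomial sampling.
    return component, sample_multinomial(n_reads, probs, rng)


if __name__ == "__main__":
    rng = random.Random(0)
    weights = [0.6, 0.4]                  # hypothetical mixture weights
    alphas = [[50.0, 5.0, 5.0],           # large values: homogeneous communities
              [0.5, 0.5, 0.5]]            # small values: highly variable communities
    for _ in range(5):
        print(generate_sample(weights, alphas, n_reads=100, rng=rng))
```

With these illustrative hyperparameters, samples from the first component look alike, while samples from the second vary widely, which mirrors the contrast between homogeneous and high-variance clusters described above.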

    Characteristics of 454 pyrosequencing data—enabling realistic simulation with flowsim

    Motivation: The commercial launch of 454 pyrosequencing in 2005 was a milestone in genome sequencing in terms of performance and cost. Throughout the three available releases, average read lengths have increased to ∼500 base pairs and are thus approaching read lengths obtained from traditional Sanger sequencing. Study design of sequencing projects would benefit from being able to simulate experiments.

    A Comparison of rpoB and 16S rRNA as Markers in Pyrosequencing Studies of Bacterial Diversity

    Background: The 16S rRNA gene is the gold standard in molecular surveys of bacterial and archaeal diversity, but it has the disadvantages that it is often multiple-copy, has little resolution below the species level and cannot be readily interpreted in an evolutionary framework. We compared the 16S rRNA marker with the single-copy, protein-coding rpoB marker by amplifying and sequencing both from a single soil sample. Because the higher genetic resolution of the rpoB gene prohibits its use as a universal marker, we employed consensus-degenerate primers targeting the Proteobacteria. Methodology/Principal Findings: Pyrosequencing can be problematic because of the poor resolution of homopolymer runs. As these erroneous runs disrupt the reading frame of protein-coding sequences, removal of sequences containing nonsense mutations was found to be a valuable filter in addition to flowgram-based denoising. Although both markers gave similar estimates of total diversity, the rpoB marker revealed more species, requiring an order of magnitude fewer reads to obtain 90% of the true diversity. The application of population genetic methods was demonstrated on a particularly abundant sequence cluster. Conclusions/Significance: The rpoB marker can be a complement to the 16S rRNA marker for high throughput microbial diversity studies focusing on specific taxonomic groups. Additional error filtering is possible and tests for recombination or selection can be employed.
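The nonsense-mutation filter mentioned in this abstract can be illustrated with a short Python sketch. It is not the authors' pipeline: it assumes reads are already trimmed so that the reading frame is known (frame 0 by default), uses the three standard stop codons, and the function names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def has_in_frame_stop(seq, frame=0):
    """Return True if any complete codon in the given frame is a stop codon."""
    seq = seq.upper()
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return True
    return False


def filter_reads(reads, frame=0):
    """Keep only reads without an in-frame stop codon."""
    return [read for read in reads if not has_in_frame_stop(read, frame)]


if __name__ == "__main__":
    intact = "ATGCTGAAC"      # codons ATG CTG AAC, no stop
    frameshifted = "ATGTGAAC"  # one base lost: ATG TGA ..., stop appears
    print(filter_reads([intact, frameshifted]))
```

A single miscalled homopolymer length shifts the reading frame, so stop codons tend to appear downstream. That is why such a filter can catch errors that survive flowgram-based denoising.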

    Dramatic Shifts in Benthic Microbial Eukaryote Communities following the Deepwater Horizon Oil Spill

    Benthic habitats harbour a significant (yet unexplored) diversity of microscopic eukaryote taxa, including metazoan phyla, protists, algae and fungi. These groups are thought to underpin ecosystem functioning across diverse marine environments. Coastal marine habitats in the Gulf of Mexico experienced visible, heavy impacts following the Deepwater Horizon oil spill in 2010, yet our scant knowledge of prior eukaryotic biodiversity has precluded a thorough assessment of this disturbance. Using a marker gene and morphological approach, we present an intensive evaluation of microbial eukaryote communities prior to and following oiling around heavily impacted shorelines. Our results show significant changes in community structure, with pre-spill assemblages of diverse Metazoa giving way to dominant fungal communities in post-spill sediments. Post-spill fungal taxa exhibit low richness and are characterized by an abundance of known hydrocarbon-degrading genera, compared to prior communities that contained smaller and more diverse fungal assemblages. Comparative taxonomic data from nematodes further suggest drastic impacts; while pre-spill samples exhibit high richness and evenness of genera, post-spill communities contain mainly predatory and scavenger taxa alongside an abundance of juveniles. Based on this community analysis, our data suggest considerable (hidden) initial impacts across Gulf beaches may be ongoing, despite the disappearance of visible surface oil in the region.

    Evidence for hydrogen oxidation and metabolic plasticity in widespread deep-sea sulfur-oxidizing bacteria

    Author Posting. © The Author(s), 2012. This is the author's version of the work. It is posted here by permission of National Academy of Sciences for personal use, not for redistribution. The definitive version was published in Proceedings of the National Academy of Sciences of the United States of America 110 (2013): 330-335, doi:10.1073/pnas.1215340110. Hydrothermal vents are a well-known source of energy that powers chemosynthesis in the deep sea. Recent work suggests that microbial chemosynthesis is also surprisingly pervasive throughout the dark oceans, serving as a significant CO2 sink even at sites far removed from vents. Ammonia and sulfur have been identified as potential electron donors for this chemosynthesis, but they do not fully account for measured rates of dark primary production in the pelagic water column. Here we use metagenomic and metatranscriptomic analyses to show that deep-sea populations of the SUP05 group of uncultured sulfur-oxidizing Gammaproteobacteria, which are abundant in widespread and diverse marine environments, contain and highly express genes encoding group 1 Ni-Fe hydrogenase enzymes for H2 oxidation. Reconstruction of near-complete genomes of two co-occurring SUP05 populations in hydrothermal plumes and deep waters of the Gulf of California enabled detailed population-specific metatranscriptomic analyses, revealing dynamic patterns of gene content and transcript abundance. SUP05 transcripts for genes involved in H2 and sulfur oxidation are most abundant in hydrothermal plumes where these electron donors are enriched. In contrast, a second hydrogenase has more abundant transcripts in background deep-sea samples. Coupled with results from a bioenergetic model that suggest that H2 oxidation can contribute significantly to the SUP05 energy budget, these findings reveal the potential importance of H2 as a key energy source in the deep ocean. This study also highlights the genomic plasticity of SUP05, which enables this widely distributed group to optimize its energy metabolism (electron donor and acceptor) to local geochemical conditions. This project is funded in part by the Gordon and Betty Moore Foundation and the National Science Foundation (OCE 1029242).

    Data for Millennia of genomic stability within the invasive Para C Lineage of Salmonella enterica: date estimation 1

    Salmonella enterica serovar Paratyphi C is the causative agent of enteric (paratyphoid) fever. While today a potentially lethal infection of humans that occurs in Africa and Asia, early 20th century observations in Eastern Europe suggest it may once have had a wider-ranging impact on human societies. We recovered a draft Paratyphi C genome from the 800-year-old skeleton of a young woman in Trondheim, Norway, who likely died of enteric fever. Analysis of this genome against a new, significantly expanded database of related modern genomes demonstrated that Paratyphi C is descended from the ancestors of swine pathogens, serovars Choleraesuis and Typhisuis, together forming the Para C Lineage. Our results indicate that Paratyphi C has been a pathogen of humans for at least 1,000 years, and may have evolved after zoonotic transfer from swine during the Neolithic period.