18 research outputs found

    Questioning the Quality of 16S rRNA Gene Sequences Derived From Human Gut Metagenome-Assembled Genomes

    The recent introduction of metagenome-assembled genomes (MAGs) has marked a major milestone in the human gut microbiome field (Almeida et al., 2019; Nayfach et al., 2019; Pasolli et al., 2019). Such reference-free, de novo-assembled genomes (Hugerth et al., 2015) have revealed a wide range of hitherto uncultured microbial species in human gut samples. The significance of MAGs in unravelling human gut microbial diversity was supported by their overwhelming representation in a comprehensive human gut prokaryotic collection filtered by metagenome data dereplicated at 97.5% average nucleotide identity (ANI) (Hiseni et al., 2021). More than 90% of the collection consists of MAGs, while the rest of the collection mainly comprises RefSeq genomes (Figure 1A).

    fixedTimeEvents: An R package for the distribution of distances between discrete events in fixed time

    When a series of Bernoulli trials occur within a fixed time frame or limited space, it is often interesting to assess if the successful outcomes have occurred completely at random, or if they tend to group together. One example, in genetics, is detecting grouping of genes within a genome. Approximations of the distribution of successes are possible, but they become inaccurate for small sample sizes. In this article, we describe the exact distribution of time between random, non-overlapping successes in discrete time of fixed length. A complete description of the probability mass function, the cumulative distribution function, mean, variance and recurrence relation is included. We propose an associated test for the over-representation of short distances and illustrate the methodology through relevant examples. The theory is implemented in an R package including probability mass, cumulative distribution, quantile function, random number generator, simulation functions, and functions for testing

    micropan: An R-package for microbial pan-genomics


    microclass: An R-package for 16S taxonomy classification

    Background: Taxonomic classification based on the 16S rRNA gene sequence is important for the profiling of microbial communities. In addition to giving the best possible accuracy, it is also important to quantify uncertainties in the classifications. Results: We present an R package with tools for making such classifications, where the heavy computations are implemented in C++ but operated through the standard R interface. The user may train classifiers based on specialized data sets, but we also supply a ready-to-use function trained on a comprehensive training data set designed specifically for this purpose. This tool also includes some novel ways to quantify uncertainties in the classifications. Conclusions: Based on input sequences of varying length and quality, we demonstrate how the output from the classifications can be used to obtain high quality taxonomic assignments from 16S sequences within the R computing environment. The package is publicly available at the Comprehensive R Archive Network.

    The nucleotide composition of microbial genomes indicates differential patterns of selection on core and accessory genomes

    Background: The core genome consists of genes shared by the vast majority of a species and is therefore assumed to have been subjected to substantially stronger purifying selection than the more mobile elements of the genome, also known as the accessory genome. Here we examine intragenic base composition differences in core genomes and corresponding accessory genomes in 36 species, represented by the genomes of 731 bacterial strains, to assess the impact of selective forces on base composition in microbes. We also explore, in turn, how these results compare with findings for whole genome intragenic regions. Results: We found that GC content in coding regions is significantly higher in core genomes than accessory genomes and whole genomes. Likewise, GC content variation within coding regions was significantly lower in core genomes than in accessory genomes and whole genomes. Relative entropy in coding regions, measured as the difference between observed and expected trinucleotide frequencies estimated from mononucleotide frequencies, was significantly higher in the core genomes than in accessory and whole genomes. Relative entropy was positively associated with coding region GC content within the accessory genomes, but not within the corresponding coding regions of core or whole genomes. Conclusion: The higher intragenic GC content and relative entropy, as well as the lower GC content variation, observed in the core genomes is most likely associated with selective constraints. It is unclear whether the positive association between GC content and relative entropy in the more mobile accessory genomes constitutes signatures of selection or selective neutral processes.

    Rapid succession of actively transcribing denitrifier populations in agricultural soil during an anoxic spell

    Denitrification allows sustained respiratory metabolism during periods of anoxia, an advantage in soils with frequent anoxic spells. However, the gains may be more than evened out by the energy cost of producing the denitrification machinery, particularly if the anoxic spell is short. This dilemma could explain the evolution of different regulatory phenotypes observed in model strains, such as sequential expression of the four denitrification genes needed for a complete reduction of nitrate to N2, or a “bet hedging” strategy where all four genes are expressed only in a fraction of the cells. In complex environments such strategies would translate into progressive onset of transcription by the members of the denitrifying community. We exposed soil microcosms to anoxia, sampled for amplicon sequencing of napA/narG, nirK/nirS, and nosZ genes and transcripts after 1, 2 and 4 h, and monitored the kinetics of NO, N2O, and N2. The cDNA libraries revealed a succession of transcribed genes from active denitrifier populations, which probably reflects various regulatory phenotypes in combination with cross-talk via intermediates (NO2−, NO) produced by the “early onset” denitrifying populations. This suggests that the regulatory strategies observed in individual isolates are also displayed in complex communities, and pinpoints the importance of successive sampling when identifying active key player organisms.
