20 research outputs found

    Defining Reference Sequences for Nocardia Species by Similarity and Clustering Analyses of 16S rRNA Gene Sequence Data

    BACKGROUND: The intra- and inter-species genetic diversity of bacteria and the absence of 'reference', or the most representative, sequences of individual species present a significant challenge for sequence-based identification. The aims of this study were to determine the utility, and compare the performance of several clustering and classification algorithms to identify the species of 364 sequences of 16S rRNA gene with a defined species in GenBank, and 110 sequences of 16S rRNA gene with no defined species, all within the genus Nocardia. METHODS: A total of 364 16S rRNA gene sequences of Nocardia species were studied. In addition, 110 16S rRNA gene sequences assigned only to the Nocardia genus level at the time of submission to GenBank were used for machine learning classification experiments. Different clustering algorithms were compared with a novel algorithm or the linear mapping (LM) of the distance matrix. Principal Components Analysis was used for the dimensionality reduction and visualization. RESULTS: The LM algorithm achieved the highest performance and classified the set of 364 16S rRNA sequences into 80 clusters, the majority of which (83.52%) corresponded with the original species. The most representative 16S rRNA sequences for individual Nocardia species have been identified as 'centroids' in respective clusters from which the distances to all other sequences were minimized; 110 16S rRNA gene sequences with identifications recorded only at the genus level were classified using machine learning methods. Simple kNN machine learning demonstrated the highest performance and classified Nocardia species sequences with an accuracy of 92.7% and a mean frequency of 0.578. CONCLUSION: The identification of centroids of 16S rRNA gene sequence clusters using novel distance matrix clustering enables the identification of the most representative sequences for each individual species of Nocardia and allows the quantitation of inter- and intra-species variability.
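    The idea of a 'centroid' as the cluster member from which distances to all other members are minimized can be illustrated with a minimal sketch. It assumes a precomputed pairwise distance matrix and cluster labels, and it takes "minimized" to mean the smallest summed distance; the function name `cluster_representatives` and the toy data are hypothetical, and this is not the LM algorithm from the abstract.

```python
import numpy as np

def cluster_representatives(dist, labels):
    """Map each cluster label to the index of its most central member.

    Assumption: 'most central' means the member with the smallest summed
    distance to the other members of the same cluster.
    """
    dist = np.asarray(dist, dtype=float)
    labels = np.asarray(labels)
    representatives = {}
    for lab in np.unique(labels):
        members = np.flatnonzero(labels == lab)
        # Restrict the distance matrix to this cluster's members.
        sub = dist[np.ix_(members, members)]
        # Row sums give each member's total distance to the others.
        best = int(np.argmin(sub.sum(axis=1)))
        representatives[lab.item()] = int(members[best])
    return representatives

# Toy example: five points on a line stand in for sequences.
positions = np.array([0.0, 1.0, 2.0, 10.0, 11.0])
toy_dist = np.abs(positions[:, None] - positions[None, :])
toy_labels = [0, 0, 0, 1, 1]
toy_result = cluster_representatives(toy_dist, toy_labels)
```

    In the toy data the middle point of the first cluster is selected; ties, as in the two-member cluster, resolve to the first member.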

    Analysis of bacterial core communities in the central Baltic by comparative RNA-DNA-based fingerprinting provides links to structure-function relationships.

    Understanding structure-function links of microbial communities has been a central theme of microbial ecology since its beginning. To this end, we studied the spatial variability of the bacterioplankton community structure and composition across the central Baltic Sea at four stations, which were up to 450 km apart and at a depth profile representative for the central part (Gotland Deep, 235 m). Bacterial community structure was followed by 16S ribosomal RNA (rRNA)- and 16S rRNA gene-based fingerprints using single-strand conformation polymorphism (SSCP) electrophoresis. Species composition was determined by sequence analysis of SSCP bands. High similarities of the bacterioplankton communities across several hundred kilometers were observed in the surface water using RNA- and DNA-based fingerprints. In these surface communities, the RNA- and DNA-based fingerprints resulted in very different patterns, presumably indicating large differences between the active members of the community as represented by RNA-based fingerprints and the present members represented by the DNA-based fingerprints. This large discrepancy changed gradually over depth, resulting in highly similar RNA- and DNA-based fingerprints in the anoxic part of the water column below 130 m depth. A conceivable mechanism explaining this high similarity could be the reduced oxidative stress in the anoxic zone. The stable communities on the surface and in the anoxic zone indicate the strong influence of the hydrography on the bacterioplankton community structure. Comparative analysis of RNA- and DNA-based community structure provided criteria for the identification of the core community, its key members and their links to biogeochemical functions.

    High-throughput amplicon sequencing and stream benthic bacteria: identifying the best taxonomic level for multiple-stressor research

    Disentangling the individual and interactive effects of multiple stressors on microbial communities is a key challenge to our understanding and management of ecosystems. Advances in molecular techniques allow studying microbial communities in situ and with high taxonomic resolution. However, the taxonomic level which provides the best trade-off between our ability to detect multiple-stressor effects versus the goal of studying entire communities remains unknown. We used outdoor mesocosms simulating small streams to investigate the effects of four agricultural stressors (nutrient enrichment, the nitrification inhibitor dicyandiamide (DCD), fine sediment and flow velocity reduction) on stream bacteria (phyla, orders, genera, and species represented by Operational Taxonomic Units with 97% sequence similarity). Community composition was assessed using amplicon sequencing (16S rRNA gene, V3-V4 region). DCD was the most pervasive stressor, affecting evenness and most abundant taxa, followed by sediment and flow velocity. Stressor pervasiveness was similar across taxonomic levels and lower levels did not perform better in detecting stressor effects. Community coverage decreased from 96% of all sequences for abundant phyla to 28% for species. Order-level responses were generally representative of responses of corresponding genera and species, suggesting that this level may represent the best compromise between stressor sensitivity and coverage of bacterial communities.
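    The trade-off between taxonomic resolution and community coverage can be illustrated with a minimal sketch. It assumes that coverage means the share of all sequences belonging to the most abundant taxa at a given level; the abstract does not state how abundant taxa were selected, so the `top_n` cutoff, the function names, and the toy counts are all hypothetical.

```python
from collections import Counter

def aggregate_counts(otu_counts, taxonomy, level):
    """Sum OTU read counts into taxa at the requested taxonomic level."""
    totals = Counter()
    for otu, count in otu_counts.items():
        totals[taxonomy[otu][level]] += count
    return totals

def coverage_of_top_taxa(otu_counts, taxonomy, level, top_n):
    """Fraction of all reads that fall in the top_n most abundant taxa.

    Assumption: 'abundant' is defined by rank (top_n), which may differ
    from the criterion used in the study.
    """
    totals = aggregate_counts(otu_counts, taxonomy, level)
    total_reads = sum(totals.values())
    covered = sum(count for _, count in totals.most_common(top_n))
    return covered / total_reads

# Toy data: four OTUs with made-up counts and made-up taxon names.
toy_counts = {"otu1": 50, "otu2": 30, "otu3": 15, "otu4": 5}
toy_taxonomy = {
    "otu1": {"phylum": "P1", "order": "O1"},
    "otu2": {"phylum": "P1", "order": "O2"},
    "otu3": {"phylum": "P2", "order": "O3"},
    "otu4": {"phylum": "P2", "order": "O4"},
}
phylum_cov = coverage_of_top_taxa(toy_counts, toy_taxonomy, "phylum", top_n=1)
order_cov = coverage_of_top_taxa(toy_counts, toy_taxonomy, "order", top_n=1)
```

    With the same number of taxa retained, the finer level covers a smaller share of the reads in this toy data, which mirrors the direction of the decrease reported in the abstract.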