30 research outputs found

    Defining Reference Sequences for Nocardia Species by Similarity and Clustering Analyses of 16S rRNA Gene Sequence Data

    BACKGROUND: The intra- and inter-species genetic diversity of bacteria and the absence of 'reference', or most representative, sequences for individual species present a significant challenge for sequence-based identification. The aims of this study were to determine the utility, and compare the performance, of several clustering and classification algorithms in identifying the species of 364 16S rRNA gene sequences with a defined species in GenBank and 110 16S rRNA gene sequences with no defined species, all within the genus Nocardia. METHODS: A total of 364 16S rRNA gene sequences of Nocardia species were studied. In addition, 110 16S rRNA gene sequences assigned only to the Nocardia genus level at the time of submission to GenBank were used for machine learning classification experiments. Different clustering algorithms were compared with a novel algorithm, the linear mapping (LM) of the distance matrix. Principal Components Analysis was used for dimensionality reduction and visualization. RESULTS: The LM algorithm achieved the highest performance and classified the set of 364 16S rRNA sequences into 80 clusters, the majority of which (83.52%) corresponded with the original species. The most representative 16S rRNA sequences for individual Nocardia species were identified as 'centroids' of their respective clusters, that is, the sequences from which the distances to all other sequences in the cluster were minimized. The 110 16S rRNA gene sequences with identifications recorded only at the genus level were classified using machine learning methods. Simple kNN machine learning demonstrated the highest performance and classified Nocardia species sequences with an accuracy of 92.7% and a mean frequency of 0.578.
    CONCLUSION: The identification of centroids of 16S rRNA gene sequence clusters using novel distance matrix clustering enables the identification of the most representative sequences for each individual species of Nocardia and allows the quantitation of inter- and intra-species variability.
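
    The two ideas named in the abstract above, choosing a cluster 'centroid' as the sequence with the smallest total distance to the other members, and assigning a species to an unlabelled sequence by kNN, can be illustrated with a minimal Python sketch. This is not the study's LM algorithm or its pipeline. It assumes pre-aligned sequences of equal length and a simple mismatch-fraction distance, and the toy sequences, the labels `species_A` and `species_B`, and all function names are hypothetical.

```python
from collections import Counter


def p_distance(a, b):
    """Fraction of mismatched positions between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def distance_matrix(seqs):
    """Pairwise distance matrix as a list of lists."""
    n = len(seqs)
    return [[p_distance(seqs[i], seqs[j]) for j in range(n)] for i in range(n)]


def centroid_index(dist, members):
    """Index of the cluster member with the smallest summed distance to the other members."""
    return min(members, key=lambda i: sum(dist[i][j] for j in members))


def knn_label(query, seqs, labels, k=3):
    """Majority label among the k labelled sequences nearest to the query."""
    nearest = sorted(range(len(seqs)), key=lambda i: p_distance(query, seqs[i]))[:k]
    return Counter(labels[i] for i in nearest).most_common(1)[0][0]


# Hypothetical toy data: two clusters of three short aligned sequences each.
seqs = ["ACGTACGT", "ACGTACGA", "ACGAACGT",
        "TTGTCCGT", "TTGTCCGA", "TTGACCGT"]
labels = ["species_A"] * 3 + ["species_B"] * 3

dist = distance_matrix(seqs)
centroid_a = centroid_index(dist, [0, 1, 2])
centroid_b = centroid_index(dist, [3, 4, 5])
predicted = knn_label("ACGTACGG", seqs, labels, k=3)
```

    In this toy data the first sequence of each cluster differs from each of its two neighbours at one position, while the neighbours differ from each other at two, so it is selected as the representative. The query sequence lies closest to the three `species_A` sequences and is labelled accordingly.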

    Pathogenic diversity amongst serotype C VGIII and VGIV Cryptococcus gattii isolates

    Cryptococcus gattii is one of the causative agents of human cryptococcosis. Highly virulent strains of serotype B C. gattii have been studied in detail, but little information is available on the pathogenic properties of serotype C isolates. In this study, we analyzed pathogenic determinants in three serotype C C. gattii isolates (106.97, ATCC 24066 and WM 779). Isolate ATCC 24066 (molecular type VGIII) differed from isolates WM 779 and 106.97 (both VGIV) in capsule dimensions, expression of CAP genes, chitooligomer distribution, and induction of host chitinase activity. Isolate WM 779 was more efficient than the others in producing pigments, and all three isolates had distinct patterns of reactivity with antibodies to glucuronoxylomannan. This great phenotypic diversity was reflected in differential pathogenicity. VGIV isolates WM 779 and 106.97 were similar in their ability to cause lethality and produced higher pulmonary fungal burden in a murine model of cryptococcosis, while isolate ATCC 24066 (VGIII) was unable to reach the brain and caused reduced lethality in intranasally infected mice. These results demonstrate a high diversity in the pathogenic potential of isolates of C. gattii belonging to the molecular types VGIII and VGIV.