Sequences, sequence clusters and bacterial species
Whatever else they should share, strains of bacteria assigned to the same species should have house-keeping genes that are similar in sequence. Single gene sequences (or rRNA gene sequences) have very few informative sites to resolve the strains of closely related species, and relationships among similar species may be confounded by interspecies recombination. A more promising approach (multilocus sequence analysis, MLSA) is to concatenate the sequences of multiple house-keeping loci and to observe the patterns of clustering among large populations of strains of closely related named bacterial species. Recent studies have shown that large populations can be resolved into non-overlapping sequence clusters that agree well with species assigned by the standard microbiological methods. The use of clustering patterns to inform the division of closely related populations into species has many advantages for poorly studied bacteria (or for re-evaluating well-studied species), as it provides a way of recognizing natural discontinuities in the distribution of similar genotypes. Clustering patterns can be used by expert groups as the basis of a pragmatic approach to assigning species, taking into account whatever additional data are available (e.g. similarities in ecology, phenotype and gene content). The development of large MLSA Internet databases makes it possible to assign new strains to previously defined species clusters and provides an electronic taxonomy. The advantages and problems in using sequence clusters as the basis of species assignments are discussed.
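The MLSA idea described above (concatenate several house-keeping loci per strain, then look for clusters) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not taken from the paper: the function names, the toy sequences, the use of a simple proportion-of-differences distance, single-linkage grouping, and the 0.2 threshold. Real analyses typically use proper alignments and phylogenetic methods.

```python
from itertools import combinations


def concatenate_loci(strain_loci, locus_order):
    """Join the aligned locus sequences of one strain in a fixed locus order."""
    return "".join(strain_loci[locus] for locus in locus_order)


def p_distance(a, b):
    """Proportion of differing sites between two equal-length aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def sequence_clusters(strains, locus_order, threshold):
    """Single-linkage grouping: strains closer than `threshold` share a cluster."""
    concat = {name: concatenate_loci(loci, locus_order)
              for name, loci in strains.items()}
    parent = {name: name for name in concat}

    def find(x):
        # Follow parent links to the cluster representative (with path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(concat, 2):
        if p_distance(concat[a], concat[b]) < threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for name in concat:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical toy data: four strains, two already-aligned loci each.
strains = {
    "s1": {"locusA": "ACGTACGT", "locusB": "GGCCTTAA"},
    "s2": {"locusA": "ACGTACGA", "locusB": "GGCCTTAA"},
    "s3": {"locusA": "TTGTACCA", "locusB": "GACCTAAT"},
    "s4": {"locusA": "TTGTACCA", "locusB": "GACCTAAA"},
}
clusters = sequence_clusters(strains, ["locusA", "locusB"], threshold=0.2)
print(clusters)  # [['s1', 's2'], ['s3', 's4']]
```

In this toy case the two groups are separated by a clear gap in pairwise distance, which is the kind of discontinuity the abstract refers to. With real data the threshold would not be fixed in advance but read off the observed clustering pattern.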
Soft topographic map for clustering and classification of bacteria
In this work a new method for clustering and building a topographic representation of a bacteria taxonomy is presented. The method is based on the analysis of stable parts of the genome, the so-called “housekeeping genes”. The proposed method generates topographic maps of the bacteria taxonomy, where relations among different type strains can be visually inspected and verified. Two well-known DNA alignment algorithms are applied to the genomic sequences. Topographic maps are optimized to represent the similarity among the sequences according to their evolutionary distances. The experimental analysis is carried out on 147 type strains of the Gammaproteobacteria class by means of the 16S rRNA housekeeping gene. Complete sequences of the gene have been retrieved from the NCBI public database. In the experimental tests the maps show clusters of homologous type strains and present some singular cases potentially due to incorrect classification or erroneous annotations in the database.
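As an illustration of what an evolutionary distance between two aligned sequences can look like, the sketch below computes the Jukes-Cantor (JC69) distance. The abstract does not say which distance model or alignment algorithms were used, so the choice of JC69, the function name, and the gap handling are all assumptions made for this example.

```python
import math


def jukes_cantor_distance(a, b):
    """JC69 distance between two aligned DNA sequences, skipping gap columns."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    # Keep only columns where neither sequence has a gap character.
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    p = sum(x != y for x, y in pairs) / len(pairs)
    if p == 0:
        return 0.0
    if p >= 0.75:
        # The JC69 correction is undefined at or beyond saturation.
        return math.inf
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


d_same = jukes_cantor_distance("ACGTACGT", "ACGTACGT")
d_quarter = jukes_cantor_distance("ACGT", "ACGA")
```

The correction makes the distance larger than the raw proportion of differing sites, because it accounts for multiple substitutions at the same site. A matrix of such pairwise distances is the kind of input a topographic map could then be fitted to.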