25 research outputs found

    Sequence variability of Rhizobiales orthologs and relationship with physico-chemical characteristics of proteins

    Background: Chromosomal orthologs can reveal the shared ancestral gene set and their evolutionary trends. Additionally, physico-chemical properties of encoded proteins could provide information about functional adaptation and ecological niche requirements. Results: We analyzed 7080 genes (five groups of 1416 orthologs each) from Rhizobiales species (S. meliloti, R. etli, and M. loti, plant symbionts; A. tumefaciens, a plant pathogen; and B. melitensis, an animal pathogen). We evaluated their phylogenetic relationships and observed three main topologies: the first with a closer association of R. etli to A. tumefaciens; the second with R. etli closer to S. meliloti; and the third with A. tumefaciens and S. meliloti as the closest pair. This was not unusual, given the close relatedness of these three species. We calculated the synonymous (dS) and nonsynonymous (dN) substitution rates of these orthologs and found that informational and metabolic functions showed relatively low dN rates; in contrast, genes from hypothetical functions and cellular processes showed high dN rates. An alternative measure of sequence variability, percentage of changes by species, was used to evaluate the most specific proportion of amino acid residues from alignments. When dN was compared with that measure, a high correlation was obtained, revealing that much of the evolutionary information was captured by the percentage of changes by species at the amino acid level. By analyzing the sequence variability of orthologs with a set of five properties (polarity, electrostatic charge, formation of secondary structures, molecular volume, and amino acid composition), we found that physico-chemical characteristics of proteins correlated with specific functional roles, and association of species did not follow their typical phylogeny, probably reflecting more adaptation to their lifestyles and niche preferences. In addition, orthologs with low dN rates had residues with more positive values of polarity, volume and electrostatic charge. Conclusions: These findings revealed that even when orthologs perform the same function in each genomic background, their sequences reveal important evolutionary tendencies and differences related to adaptation. This article was reviewed by: Dr. Purificación López-García, Prof. Jeffrey Townsend (nominated by Dr. J. Peter Gogarten), and Ms. Olga Kamneva.

    Evolutionary, structural and functional relationships revealed by comparative analysis of syntenic genes in Rhizobiales

    BACKGROUND: Comparative genomics has provided valuable insights into the nature of gene sequence variation and chromosomal organization of closely related bacterial species. However, questions about the biological significance of gene order conservation, or synteny, remain open. Moreover, few comprehensive studies have been reported for rhizobial genomes. RESULTS: We analyzed the genomic sequences of four fast-growing Rhizobiales (Sinorhizobium meliloti, Agrobacterium tumefaciens, Mesorhizobium loti and Brucella melitensis). We made a comprehensive gene classification to define chromosomal orthologs, genes with homologs in other replicons such as plasmids, and those which were species-specific. About two thousand genes were predicted to be orthologs in each chromosome and about 80% of these were syntenic. A striking gene colinearity was found in pairs of organisms, and a large fraction of the microsyntenic regions and operons were similar. Syntenic products showed higher identity levels than non-syntenic ones, suggesting a resistance to sequence variation due to functional constraints; also, an unusually high fraction of syntenic products contained membranal segments. Syntenic genes encoded a high proportion of essential cell functions and presented a high level of functional relationships and a very low horizontal gene transfer rate. The sequence variability of the proteins can be considered the species signature in response to specific niche adaptation. Comparatively, an analysis with genomes of Enterobacteriales showed a different gene organization but gave similar results for synteny conservation, the essential role of syntenic genes and the higher functional linkage among the genes of the microsyntenic regions. CONCLUSION: Syntenic bacterial genes represent a commonly evolved group. They not only reveal the core chromosomal segments present in the last common ancestor and determine the metabolic characteristics shared by these microorganisms, but also show resistance to sequence variation and rearrangement, possibly due to their essential character. In Rhizobiales and Enterobacteriales, syntenic genes encode a high proportion of essential cell functions and present a high level of functional relationships.

    Computational prediction of promotors in Agrobacterium tumefaciens strain C58 by using the machine learning technique

    Promoters are genomic regions upstream of genes that are bound by RNA polymerase to start gene transcription. Because they are the most critical elements of gene expression, the recognition of promoters is crucial to understanding the regulation of gene expression. This study aimed to develop a machine learning-based model to predict promoters in Agrobacterium tumefaciens (A. tumefaciens) strain C58. In the model, promoter sequences were encoded by three different kinds of feature descriptors, namely, accumulated nucleotide frequency, k-mer nucleotide composition, and binary encodings. The obtained features were optimized by using correlation and the mRMR-based algorithm. These optimized features were input into a random forest (RF) classifier to discriminate promoter sequences from non-promoter sequences in A. tumefaciens strain C58. Ten-fold cross-validation showed that the proposed model could yield an overall accuracy of 0.837. This model should aid the study of promoters in A. tumefaciens strain C58.
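As a rough illustration of two of the feature descriptors named in the abstract above (k-mer nucleotide composition and binary encoding), the sketch below turns a DNA string into numeric features. The function names, the choice of k, and the handling of ambiguous bases are assumptions made here for illustration. It does not reproduce the authors' pipeline, their mRMR feature selection, or their random forest model.

```python
from itertools import product


def kmer_composition(seq, k=2):
    """Return normalized k-mer frequencies over ACGT, in lexicographic k-mer order."""
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows containing ambiguous bases such as N
            counts[window] += 1
            total += 1
    return [counts[m] / total if total else 0.0 for m in kmers]


def binary_encoding(seq):
    """One-hot encode each base as four bits (A, C, G, T); unknown bases become zeros."""
    table = {
        "A": (1, 0, 0, 0),
        "C": (0, 1, 0, 0),
        "G": (0, 0, 1, 0),
        "T": (0, 0, 0, 1),
    }
    bits = []
    for base in seq.upper():
        bits.extend(table.get(base, (0, 0, 0, 0)))
    return bits


if __name__ == "__main__":
    example = "TTGACAATATAAT"  # hypothetical sequence, not taken from the study
    features = kmer_composition(example, k=2) + binary_encoding(example)
    print(len(features))
```

Feature vectors of this kind could then be passed to any standard classifier. Which features to keep, and how the model is trained, is left to the study itself.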

    Hyperthermophily and the origin and earliest evolution of life

    The possibility of a high-temperature origin of life has gained support based on indirect evidence of a hot, early Earth and on the basal position of hyperthermophilic organisms in rRNA-based phylogenies. However, although the availability of more than 80 completely sequenced cellular genomes has led to the identification of hyperthermophilic-specific traits, such as a trend towards smaller genomes, reduced protein-encoding gene sizes, and glutamic-acid-rich simple sequences, none of these characteristics are in themselves an indication of primitiveness. There is no geological evidence for the physical setting in which life arose, but current models suggest that the Earth’s surface cooled down rapidly. Moreover, at 100°C the half-lives of several organic compounds, including ribose, nucleobases, and amino acids, which are generally thought to have been essential for the emergence of the first living systems, are too short to allow for their accumulation in the prebiotic environment. Accordingly, if hyperthermophily is not truly primordial, then heat-loving lifestyles may be relics of a secondary adaptation that evolved after the origin of life, and before or soon after separation of the major lineages.

    Increasing biological complexity is positively correlated with the relative genome-wide expansion of non-protein-coding DNA sequences

    Background: Prior to the current genomic era it was suggested that the number of protein-coding genes that an organism made use of was a valid measure of its complexity. It is now clear, however, that major incongruities exist and that there is only a weak relationship between biological complexity and the number of protein-coding genes. For example, using the protein-coding gene number as a basis for evaluating biological complexity would make urochordates and insects less complex than nematodes, and humans less complex than rice. Results: We analyzed the ratio of noncoding to total genomic DNA (ncDNA/tgDNA) for 85 sequenced species and found that this ratio correlates well with increasing biological complexity. The ncDNA/tgDNA ratio is generally contained within the bandwidth of 0.05-0.24 for prokaryotes, but rises to 0.26-0.52 in unicellular eukaryotes, and to 0.62-0.985 for developmentally complex multicellular organisms. Significantly, prokaryotic species display a non-uniform species distribution approaching the mean of 0.1177 ncDNA/tgDNA (p=1.58 x 10^-13), and a nonlinear ncDNA/tgDNA relationship to genome size (r=0.15). Importantly, the ncDNA/tgDNA ratio corrects for ploidy, and is not substantially affected by variable loads of repetitive sequences. Conclusions: We suggest that the observed noncoding DNA increases and compositional patterns are primarily a function of increased information content. It is therefore possible that introns, intergenic sequences, repeat elements, and genomic DNA previously regarded as genetically inert may be far more important to the evolution and functional repertoire of complex organisms than has been previously appreciated. Comment: 25 pages, 2 figures, 1 table
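The ncDNA/tgDNA ratio described in the abstract above is a simple proportion, and a minimal sketch may make the arithmetic concrete. The function names and the example genome sizes are hypothetical. The three bands are the ranges quoted in the abstract, and a value falling between them is reported as outside those ranges, not assigned to a group.

```python
def ncdna_ratio(genome_bp, coding_bp):
    """Fraction of the genome that is noncoding: (total - coding) / total."""
    if genome_bp <= 0:
        raise ValueError("genome_bp must be positive")
    if not 0 <= coding_bp <= genome_bp:
        raise ValueError("coding_bp must lie between 0 and genome_bp")
    return (genome_bp - coding_bp) / genome_bp


# Ranges as quoted in the abstract; the labels are shorthand chosen here.
REPORTED_BANDS = [
    (0.05, 0.24, "prokaryote range"),
    (0.26, 0.52, "unicellular eukaryote range"),
    (0.62, 0.985, "complex multicellular range"),
]


def reported_band(ratio):
    """Return the label of the quoted range containing ratio, if any."""
    for low, high, label in REPORTED_BANDS:
        if low <= ratio <= high:
            return label
    return "outside reported ranges"


if __name__ == "__main__":
    # Hypothetical genome: 4.0 Mb in total with 3.5 Mb of protein-coding sequence.
    ratio = ncdna_ratio(4_000_000, 3_500_000)
    print(ratio, reported_band(ratio))
```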

    Bacterial genomic G + C composition-eliciting environmental adaptation

    Bacterial genomes reflect their adaptation strategies through nucleotide usage trends found in their chromosome composition. Bacteria, unlike eukaryotes, span a wide range of genomic G + C content. This wide variability may be viewed as a response to environmental adaptation. Two overarching trends are observed across bacterial genomes: the first correlates genomic G + C with environmental niche and lifestyle, while the second uses intra-genomic G + C incongruence to delineate horizontally transferred material. In this review, we focus on the influence of several properties, including biochemical factors, genetic flows, selection biases, and the biochemical-energetic properties shaping genome composition. Outcomes indicate a trend toward high G + C and larger genomes in free-living organisms, as a result of more complex and varied environments (higher chance for horizontal gene transfer). Conversely, nutrient-limiting and nutrient-poor environments dictate smaller genomes of low G + C in an attempt to conserve replication expense. Varied processes, including translesion repair mechanisms, phage insertion and cytosine degradation, have been shown to introduce higher AT in genomic sequences. We conclude the review with an analysis of current bioinformatics tools seeking to elicit compositional variances and highlight the practical implications when using such techniques.
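The idea of using intra-genomic G + C incongruence to flag candidate horizontally transferred regions can be sketched with a sliding window. This is only a toy illustration under assumptions made here: the window size, step, and deviation threshold are arbitrary, the function names are hypothetical, and the tools surveyed in the review use more careful statistics.

```python
def gc_content(seq):
    """Fraction of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    return gc / (gc + at) if (gc + at) else 0.0


def windowed_gc(seq, window, step):
    """G + C fraction for each full window of the given size, moving by step."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    return [
        gc_content(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, step)
    ]


def flag_atypical(seq, window, step, threshold):
    """Return (start, gc) for windows whose G + C differs from the whole
    sequence by more than threshold."""
    overall = gc_content(seq)
    return [
        (i * step, gc)
        for i, gc in enumerate(windowed_gc(seq, window, step))
        if abs(gc - overall) > threshold
    ]


if __name__ == "__main__":
    toy = "ATATATATGGGG"  # hypothetical sequence with a G + C-rich tail
    print(flag_atypical(toy, window=4, step=4, threshold=0.5))
```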

    Variation in the strength of selected codon usage bias among bacteria

    Among bacteria, many species have synonymous codon usage patterns that have been influenced by natural selection for those codons that are translated more accurately and/or efficiently. However, in other species selection appears to have been ineffective. Here, we introduce a population genetics-based model for quantifying the extent to which selection has been effective. The approach is applied to 80 phylogenetically diverse bacterial species for which whole genome sequences are available. The strength of selected codon usage bias, S, is found to vary substantially among species; in 30% of the genomes examined, there was no significant evidence that selection had been effective. Values of S are highly positively correlated with both the number of rRNA operons and the number of tRNA genes. These results are consistent with the hypothesis that species exposed to selection for rapid growth have more rRNA operons, more tRNA genes and more strongly selected codon usage bias. For example, Clostridium perfringens, the species with the highest value of S, can have a generation time as short as 7 min.

    Unique Roles for ISE2 in Chloroplast RNA Metabolism and Regulation of Plasmodesmata-mediated Intercellular Trafficking

    INCREASED SIZE EXCLUSION LIMIT 2 (ISE2) is a nuclear gene encoding a chloroplast-localized RNA helicase that is essential for Arabidopsis thaliana embryogenesis, chloroplast RNA metabolic events and the regulation of plasmodesmal permeability. Here I report that ISE2 is essential for the editing of several chloroplast transcripts. Emb175/PPR103 is a nuclear gene encoding a pentatricopeptide repeat (PPR) protein that was previously reported to be required for embryogenesis in Arabidopsis thaliana and for seedling survival in Zea mays. EMB175/PPR103 was previously identified in our lab in a yeast two-hybrid interaction screen with ISE2 and subsequently named ISE2 PROTEIN INTERACTOR (IPI)1. Confocal fluorescence microscopy illustrates that IPI1-YFP, similar to ISE2-YFP, localizes to chloroplasts, consistent with its predicted chloroplast N-terminal targeting sequence. In Nicotiana benthamiana, silencing of emb175/PPR103/IPI1 in mature leaf tissue produces a chlorotic phenotype coupled to defective chloroplast structural integrity. Interestingly, virus-induced gene silencing (VIGS) of N. benthamiana emb175/PPR103/IPI1 or N. benthamiana ISE2 revealed defects in the RNA editing of N. benthamiana chloroplast transcripts. However, ISE2-silenced plants displayed increased plasmodesmata-mediated intercellular trafficking, whereas no intercellular trafficking defect was observed in N. benthamiana plants silenced for emb175/PPR103/IPI1. These results indicate that ISE2 performs unique functions in the regulation of plasmodesmata (PD) permeability. Collectively, our results identify IPI1 as an ISE2-interacting protein that localizes to the chloroplast and that participates in the proper RNA editing of select N. benthamiana chloroplast transcripts. These observations add to the rapidly growing knowledge base of RNA helicase and PPR protein function in plants.