74 research outputs found

    Gene Decay in Shigella as an Incipient Stage of Host-Adaptation

    BACKGROUND: Many facultative bacterial pathogens have undergone extensive gene decay processes, possibly due to a lack of selection pressure during evolutionary conversion from a free-living to an intracellular lifestyle. Shigella, the causative agents of human shigellosis, have arisen from different E. coli-like ancestors independently by convergent paths. As these bacteria all have lost large numbers of genes by mutation or deletion, they can be used as ideal models for systematically studying the process of gene function loss in different bacteria living under similar selection pressures. METHODOLOGIES/PRINCIPAL FINDINGS: We compared the sequenced Shigella genomes and re-defined decayed genes (pseudogenes plus deleted genes) in these bacteria. Altogether, 85 genes are commonly decayed in the five analyzed Shigella strains and 1456 genes are decayed in at least one Shigella strain. Genes coding for carbon utilization, cell motility, transporter or membrane proteins are prone to be inactivated. Decayed genes tend to concentrate in certain operons rather than being distributed evenly across the whole genome. Genes in the decayed operons accumulated more non-synonymous mutations than the remaining genes and have lower expression levels. CONCLUSIONS: Different Shigella lineages underwent convergent gene decay processes, and inactivation of one gene would lead to a lesser selection pressure for the other genes in the same operon. The pool of superfluous genes for Shigella may contain at least two thousand genes and the gene decay processes may still continue in Shigella until a minimum genome harboring only essential genes is reached.

    Superhelical Destabilization in Regulatory Regions of Stress Response Genes

    Stress-induced DNA duplex destabilization (SIDD) analysis exploits the known structural and energetic properties of DNA to predict sites that are susceptible to strand separation under negative superhelical stress. When this approach was used to calculate the SIDD profile of the entire Escherichia coli K12 genome, it was found that strongly destabilized sites occur preferentially in intergenic regions that are either known or inferred to contain promoters, but rarely occur in coding regions. Here, we investigate whether the genes grouped in different functional categories have characteristic SIDD properties in their upstream flanks. We report that strong SIDD sites in the E. coli K12 genome are statistically significantly overrepresented in the upstream regions of genes encoding transcriptional regulators. In particular, the upstream regions of genes that directly respond to physiological and environmental stimuli are more destabilized than are those regions of genes that are not involved in these responses. Moreover, if a pathway is controlled by a transcriptional regulator whose gene has a destabilized 5′ flank, then the genes (operons) in that pathway also usually contain strongly destabilized SIDD sites in their 5′ flanks. We observe this statistically significant association of SIDD sites with upstream regions of genes functioning in transcription in 38 of 43 genomes of free-living bacteria, but in only four of 18 genomes of endosymbionts or obligate parasitic bacteria. These results suggest that strong SIDD sites 5′ to participating genes may be involved in transcriptional responses to environmental changes, which are known to transiently alter superhelicity. We propose that these SIDD sites are active and necessary participants in superhelically mediated regulatory mechanisms governing changes in the global pattern of gene expression in prokaryotes in response to physiological or environmental changes

    Gene fusions and gene duplications: relevance to genomic annotation and functional analysis

    BACKGROUND: Escherichia coli, a model organism, provides information for annotation of other genomes. Our analysis of its genome has shown that proteins encoded by fused genes need special attention. Such composite (multimodular) proteins consist of two or more components (modules) encoding distinct functions. Multimodular proteins have been found to complicate both annotation and generation of sequence-similar groups. Previous work overstated the number of multimodular proteins in E. coli. This work corrects the identification of modules by including sequence information from proteins in 50 sequenced microbial genomes. RESULTS: Multimodular E. coli K-12 proteins were identified from sequence similarities between their component modules and non-fused proteins in 50 genomes and from the literature. We found 109 multimodular proteins in E. coli containing either two or three modules. Most modules had standalone sequence relatives in other genomes. The separated modules together with all the single (un-fused) proteins constitute the sum of all unimodular proteins of E. coli. Pairwise sequence relationships among all E. coli unimodular proteins generated 490 sequence-similar paralogous groups. Groups ranged in size from 92 to 2 members and had varying degrees of relatedness among their members. Some E. coli enzyme groups were compared to homologs in other bacterial genomes. CONCLUSION: The deleterious effects of multimodular proteins on annotation and on the formation of groups of paralogs are emphasized. To improve annotation results, all multimodular proteins in an organism should be detected and, when known, each function should be connected with its location in the sequence of the protein. When transferring functions by sequence similarity, alignment locations must be noted, particularly when alignments cover only part of the sequences, in order to enable transfer of the correct function. Separating multimodular proteins into module units makes it possible to generate protein groups related by both sequence and function, avoiding mixing of unrelated sequences. Organisms differ in sizes of groups of sequence-related proteins. A sample comparison of orthologs to selected E. coli paralogous groups correlates with known physiological and taxonomic relationships between the organisms.

    A functional update of the Escherichia coli K-12 genome

    Author Posting. © 2001 Serres et al. The definitive version was published in Genome Biology 2 (2001): research0035.1–0035.7, doi:10.1186/gb-2001-2-9-research0035. Background: Since the genome of Escherichia coli K-12 was initially annotated in 1997, additional functional information based on biological characterization and functions of sequence-similar proteins has become available. On the basis of this new information, an updated version of the annotated chromosome has been generated. Results: The E. coli K-12 chromosome is currently represented by 4,401 genes encoding 116 RNAs and 4,285 proteins. The boundaries of the genes identified in the GenBank Accession U00096 were used. Some protein-coding sequences are compound and encode multimodular proteins. The coding sequences (CDSs) are represented by modules (protein elements of at least 100 amino acids with biological activity and independent evolutionary history). There are 4,616 identified modules in the 4,285 proteins. Of these, 48.9% have been characterized, 29.5% have an imputed function, 2.1% have a phenotype and 19.5% have no function assignment. Only 7% of the modules appear unique to E. coli, and this number is expected to be reduced as more genome data becomes available. The imputed functions were assigned on the basis of manual evaluation of functions predicted by BLAST and DARWIN analyses and by the MAGPIE genome annotation system. Conclusions: Much knowledge has been gained about functions encoded by the E. coli K-12 genome since the 1997 annotation was published. The data presented here should be useful for analysis of E. coli gene products as well as gene products encoded by other genomes. This work was supported by NIH grant RO1 RR07861, the NASA Astrobiology Institute grant NCC2-1054, grants from the Edward Mallinckrodt, Jr Foundation and the Sinsheimer Foundation, and NSF grants NSF DBI - 9984882 and NSF IIS - 9996304.

    EchoBASE: an integrated post-genomic database for Escherichia coli

    EchoBASE (http://www.ecoli-york.org) is a relational database designed to contain and manipulate information from post-genomic experiments using the model bacterium Escherichia coli K-12. Its aim is to collate information from a wide range of sources to provide clues to the functions of the approximately 1500 gene products that have no confirmed cellular function. The database is built on an enhanced annotation of the updated genome sequence of strain MG1655 and the association of experimental data with the E.coli genes and their products. Experiments that can be held within EchoBASE include proteomics studies, microarray data, protein–protein interaction data, structural data and bioinformatics studies. EchoBASE also contains annotated information on ‘orphan’ enzyme activities from this microbe to aid characterization of the proteins that catalyse these elusive biochemical reactions

    Microarray-based screening of differentially expressed genes of E. coli O157:H7 Sakai during preharvest survival on butterhead lettuce

    Numerous outbreaks of Escherichia coli O157:H7 have been linked to the consumption of leafy vegetables. However, up to the present, little has been known about E. coli O157:H7's adaptive responses to survival on actively growing (and thus responsive) plants. In this study, whole genome transcriptional profiles were generated from E. coli O157:H7 cells (isolate Sakai, stx-) one hour and two days after inoculation on the leaves of growing butterhead lettuce, and compared with an inoculum control. A total of 273 genes of E. coli O157:H7 Sakai (5.04% of the whole genome) were significantly induced or repressed by at least two-fold (p < 0.01) in at least one of the analyzed time points in comparison with the control. Several E. coli O157:H7 genes associated with oxidative stress and antimicrobial resistance were upregulated, including the iron-sulfur cluster and the multiple antibiotic resistance (mar) operon, whereas the Shiga toxin virulence genes were downregulated. Nearly 40% of the genes with significantly different expression were poorly characterized genes or genes with unknown functions. These genes are of special interest for future research as they may play an important role in the pathogens' adaptation to a lifestyle on plants. In conclusion, these findings suggest that the pathogen actively interacts with the plant environment by adapting its metabolism and responding to oxidative stress

    Bacterial Adaptation through Loss of Function

    The metabolic capabilities and regulatory networks of bacteria have been optimized by evolution in response to selective pressures present in each species' native ecological niche. In a new environment, however, the same bacteria may grow poorly due to regulatory constraints or biochemical deficiencies. Adaptation to such conditions can proceed through the acquisition of new cellular functionality due to gain of function mutations or via modulation of cellular networks. Using selection experiments on transposon-mutagenized libraries of bacteria, we illustrate that even under conditions of extreme nutrient limitation, substantial adaptation can be achieved solely through loss of function mutations, which rewire the metabolism of the cell without gain of enzymatic or sensory function. A systematic analysis of similar experiments under more than 100 conditions reveals that adaptive loss of function mutations exist for many environmental challenges. Drawing on a wealth of examples from published articles, we detail the range of mechanisms through which loss-of-function mutations can generate such beneficial regulatory changes, without the need for rare, specific mutations to fine-tune enzymatic activities or network connections. The high rate at which loss-of-function mutations occur suggests that null mutations play an underappreciated role in the early stages of adaption of bacterial populations to new environments

    PLoS Genet

    The metabolic capabilities and regulatory networks of bacteria have been optimized by evolution in response to selective pressures present in each species' native ecological niche. In a new environment, however, the same bacteria may grow poorly due to regulatory constraints or biochemical deficiencies. Adaptation to such conditions can proceed through the acquisition of new cellular functionality due to gain of function mutations or via modulation of cellular networks. Using selection experiments on transposon-mutagenized libraries of bacteria, we illustrate that even under conditions of extreme nutrient limitation, substantial adaptation can be achieved solely through loss of function mutations, which rewire the metabolism of the cell without gain of enzymatic or sensory function. A systematic analysis of similar experiments under more than 100 conditions reveals that adaptive loss of function mutations exist for many environmental challenges. Drawing on a wealth of examples from published articles, we detail the range of mechanisms through which loss-of-function mutations can generate such beneficial regulatory changes, without the need for rare, specific mutations to fine-tune enzymatic activities or network connections. The high rate at which loss-of-function mutations occur suggests that null mutations play an underappreciated role in the early stages of adaption of bacterial populations to new environments. Grant support: 5R01AI077562/AI/NIAID NIH HHS/United States; 8DP1ES022578/DP/NCCDPHP CDC HHS/United States. PMID: 23874220. PMC370884