31 research outputs found

    Persistence drives gene clustering in bacterial genomes

    Background: Gene clustering plays an important role in the organization of the bacterial chromosome, and several mechanisms have been proposed to explain its extent. However, the controversies raised about the validity of each of these mechanisms remind us that the cause of this gene organization remains an open question. Models proposed to explain clustering took into account neither the function of the gene products nor the likely presence or absence of a given gene in a genome. Yet genomes harbor two very different categories of genes: those present in a majority of organisms (persistent genes) and those present in very few organisms (rare genes).
    Results: We show that two classes of genes are significantly clustered in bacterial genomes: the highly persistent genes and the rare genes. The clustering of rare genes is readily explained by the selfish operon theory. Genes persistently present in bacterial genomes are also clustered, however, and we try to understand why. We propose a model accounting specifically for such clustering, and show that indispensability in a genome with frequent gene deletion and insertion leads to the transient clustering of these genes. The model describes how clusters are created via the gene flux that continuously introduces new genes while deleting others. We then test whether known selective processes, such as co-transcription, physical interaction or functional neighborhood, account for the stabilization of these clusters.
    Conclusion: We show that it is the strong selective pressure acting on the function of persistent genes, amid a permanent flux of genes that keeps bacterial genome size fairly constant, that drives the clustering of persistent genes. A further selective stabilization process might contribute to maintaining the clustering.

    The impact of the neisserial DNA uptake sequences on genome evolution and stability

    A study of the origin and distribution of the abundant short DNA uptake sequence (DUS) in six genomes of Neisseria suggests that transformation and recombination are tightly linked in evolution and that recombination has a key role in the establishment of DUS

    Protein evolution driven by symmetric structural repeats

    IMG/PR: a database of plasmids from genomes and metagenomes with rich annotations and metadata.

    Plasmids are mobile genetic elements found in many clades of Archaea and Bacteria. They drive horizontal gene transfer, impacting ecological and evolutionary processes within microbial communities, and hold substantial importance in human health and biotechnology. To support plasmid research and provide scientists with data on an unprecedented diversity of plasmid sequences, we introduce the IMG/PR database, a new resource encompassing 699 973 plasmid sequences derived from genomes, metagenomes and metatranscriptomes. IMG/PR is the first database to provide data on plasmids that were systematically identified from diverse microbiome samples. IMG/PR plasmids are associated with rich metadata that includes geographical and ecosystem information, host taxonomy, similarity to other plasmids, functional annotation, and the presence of genes involved in conjugation and antibiotic resistance. The database offers diverse methods for exploring its extensive plasmid collection, enabling users to navigate plasmids through metadata-centric queries, plasmid comparisons and BLAST searches. The web interface for IMG/PR is accessible at https://img.jgi.doe.gov/pr. Plasmid metadata and sequences can be downloaded from https://genome.jgi.doe.gov/portal/IMG_PR

    Identification of protein secretion systems in bacterial genomes using MacSyFinder version 2

    Protein secretion systems are complex molecular machineries that translocate proteins through the outer membrane and sometimes through multiple other barriers. They have evolved by co-option of components from other envelope-associated cellular machineries, making them sometimes difficult to identify and discriminate. Here, we describe how to identify protein secretion systems in bacterial genomes using the MacSyFinder program. This flexible computational tool uses the knowledge gathered from experimental studies to identify homologous systems in genome data. It can be used with a set of pre-defined MacSyFinder models, "TXSScan", to identify all major secretion systems of diderm bacteria (i.e., with inner and LPS-containing outer membranes) as well as evolutionarily related cell appendages (pili and flagella). For this, it identifies and clusters co-localized genes encoding proteins of secretion systems using sequence similarity search with Hidden Markov Model (HMM) protein profiles. Finally, it checks if the clusters’ genetic content and genomic organization satisfy the constraints of the model. TXSScan models can be altered in the command line or customized to search for variants of known secretion systems. Models can also be built from scratch to identify novel systems. In this chapter, we describe a complete pipeline of analysis, starting from i) the integration of information from a reference set of experimentally studied systems, ii) the identification of conserved proteins and the construction of their HMM protein profiles, iii) the definition and optimization of "macsy-models", and iv) their use and online distribution as tools to search genomic data for secretion systems of interest. MacSyFinder is available here: https://github.com/gem-pasteur/macsyfinder, and MacSyFinder models here: https://github.com/macsy-models
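    The two steps named in the abstract above (clustering co-localized hits, then checking a cluster against the constraints of a model) can be illustrated with a minimal Python sketch. This is not MacSyFinder's code or API: the `Hit` class, the function names, the gene names and the parameters `max_gap`, `min_mandatory` and `min_total` are all hypothetical, and the HMM search is assumed to have already produced the hits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    gene: str      # hypothetical name of the HMM profile that matched
    position: int  # rank of the matching gene along the replicon


def cluster_hits(hits, max_gap):
    """Group hits separated by at most `max_gap` intervening genes."""
    clusters = []
    for hit in sorted(hits, key=lambda h: h.position):
        # Number of genes lying between this hit and the previous one.
        if clusters and hit.position - clusters[-1][-1].position - 1 <= max_gap:
            clusters[-1].append(hit)
        else:
            clusters.append([hit])
    return clusters


def satisfies_model(cluster, mandatory, accessory, min_mandatory, min_total):
    """Check a cluster's gene content against simple quorum constraints."""
    found = {h.gene for h in cluster}
    n_mandatory = len(found & mandatory)
    n_total = len(found & (mandatory | accessory))
    return n_mandatory >= min_mandatory and n_total >= min_total


# Illustrative hits only; the gene names and positions are made up.
hits = [Hit("geneA", 10), Hit("geneB", 12), Hit("geneC", 13), Hit("geneA", 40)]
clusters = cluster_hits(hits, max_gap=2)
results = [
    satisfies_model(
        c,
        mandatory={"geneA", "geneB"},
        accessory={"geneC"},
        min_mandatory=2,
        min_total=3,
    )
    for c in clusters
]
print(results)
```

    With these made-up inputs, the first three hits form one cluster that meets both quorum thresholds, while the isolated hit at position 40 forms a second cluster that does not. The constraints a real model can express may be richer than this sketch.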