52 research outputs found

    Visualization of pseudogenes in intracellular bacteria reveals the different tracks to gene destruction

    Variably present genes and pseudogenes in Rickettsia species tend to have been acquired more recently and to be more divergent from the genes conserved across all specie

    The Journey to smORFland

    The genome sequences completed so far contain more than 20 000 genes with unknown function and no similarity to genes in other genomes. The origin and evolution of the orphan genes is an enigma. Here, we discuss the suggestion that some orphan genes may represent pseudogenes or short fragments of genes that were functional in the genome of a common ancestor. These may be the remains of unsuccessful duplication or horizontal gene transfer events, in which the acquired sequences have entered the fragmentation process and thereby lost their similarity to genes in other species. This scenario is supported by a recent case study of orphan genes in several closely related species of Rickettsia, where full-length ancestral genes were reconstructed from sets of short, overlapping orphan genes. One of these was found to display similarity to genes encoding proteins with ankyrin-repeat domains

    AcetoBase: a functional gene repository and database for formyltetrahydrofolate synthetase sequences

    Acetogenic bacteria are imperative to environmental carbon cycling and diverse biotechnological applications, but their extensive physiological and taxonomical diversity is an impediment to systematic taxonomic studies. Acetogens are chemolithoautotrophic bacteria that perform reductive carbon fixation under anaerobic conditions through the Wood–Ljungdahl pathway (WLP)/acetyl-coenzyme A pathway. The gene encoding formyltetrahydrofolate synthetase (FTHFS), a key enzyme of this pathway, is highly conserved and can be used as a molecular marker to probe acetogenic communities. However, there is a lack of systematic collection of FTHFS sequence data at nucleotide and protein levels. In an attempt to streamline investigations on acetogens, we developed AcetoBase - a repository and database for systematically collecting and organizing information related to FTHFS sequences. AcetoBase also provides an opportunity to submit data and obtain accession numbers, perform homology searches for sequence identification and access a customized BLAST database of submitted sequences. AcetoBase provides the prospect to identify potential acetogenic bacteria, based on metadata information related to genome content and the WLP, supplemented with FTHFS sequence accessions, and can be an important tool in the study of acetogenic communities. AcetoBase can be publicly accessed at https://acetobase.molbio.slu.se

    An Anomalous Type IV Secretion System in Rickettsia Is Evolutionarily Conserved

    Bacterial type IV secretion systems (T4SSs) comprise a diverse transporter family functioning in conjugation, competence, and effector molecule (DNA and/or protein) translocation. Thirteen genome sequences from Rickettsia, obligate intracellular symbionts/pathogens of a wide range of eukaryotes, have revealed a reduced T4SS relative to the Agrobacterium tumefaciens archetype (vir). However, the Rickettsia T4SS has not been functionally characterized for its role in symbiosis/virulence, and none of its substrates are known. Superimposition of T4SS structural/functional information over previously identified Rickettsia components implicates a functional Rickettsia T4SS. virB4, virB8 and virB9 are duplicated, yet only one copy of each has the conserved features of similar genes in other T4SSs. An extraordinarily duplicated VirB6 gene encodes five hydrophobic proteins conserved only in a short region known to be involved in DNA transfer in A. tumefaciens. virB1, virB2 and virB7 are newly identified, revealing a Rickettsia T4SS lacking only virB5 relative to the vir archetype. Phylogeny estimation suggests vertical inheritance of all components, despite gene rearrangements into an archipelago of five islets. Similarities of Rickettsia VirB7/VirB9 to ComB7/ComB9 proteins of epsilon-proteobacteria, as well as phylogenetic affinities to the Legionella lvh T4SS, imply the Rickettsiales ancestor acquired a vir-like locus from distantly related bacteria, perhaps while residing in a protozoan host. Modern modifications of these systems likely reflect diversification with various eukaryotic host cells. We present the rvh (Rickettsiales vir homolog) T4SS, an evolutionarily conserved transporter with an unknown role in rickettsial biology. This work lays the foundation for future laboratory characterization of this system, and also identifies the Legionella lvh T4SS as a suitable genetic model

    Rickettsia Phylogenomics: Unwinding the Intricacies of Obligate Intracellular Life

    BACKGROUND: Completed genome sequences are rapidly increasing for Rickettsia, obligate intracellular alpha-proteobacteria responsible for various human diseases, including epidemic typhus and Rocky Mountain spotted fever. In light of phylogeny, the establishment of orthologous groups (OGs) of open reading frames (ORFs) will distinguish the core rickettsial genes and other group specific genes (class 1 OGs or C1OGs) from those distributed indiscriminately throughout the rickettsial tree (class 2 OG or C2OGs). METHODOLOGY/PRINCIPAL FINDINGS: We present 1823 representative (no gene duplications) and 259 non-representative (at least one gene duplication) rickettsial OGs. While the highly reductive (approximately 1.2 MB) Rickettsia genomes range in predicted ORFs from 872 to 1512, a core of 752 OGs was identified, depicting the essential Rickettsia genes. Unsurprisingly, this core lacks many metabolic genes, reflecting the dependence on host resources for growth and survival. Additionally, we bolster our recent reclassification of Rickettsia by identifying OGs that define the AG (ancestral group), TG (typhus group), TRG (transitional group), and SFG (spotted fever group) rickettsiae. OGs for insect-associated species, tick-associated species and species that harbor plasmids were also predicted. Through superimposition of all OGs over robust phylogeny estimation, we discern between C1OGs and C2OGs, the latter depicting genes either decaying from the conserved C1OGs or acquired laterally. Finally, scrutiny of non-representative OGs revealed high levels of split genes versus gene duplications, with both phenomena confounding gene orthology assignment. Interestingly, non-representative OGs, as well as OGs comprised of several gene families typically involved in microbial pathogenicity and/or the acquisition of virulence factors, fall predominantly within C2OG distributions. 
CONCLUSION/SIGNIFICANCE: Collectively, we determined the relative conservation and distribution of 14354 predicted ORFs from 10 rickettsial genomes across robust phylogeny estimation. The data, available at PATRIC (PathoSystems Resource Integration Center), provide novel information for unwinding the intricacies associated with Rickettsia pathogenesis, expanding the range of potential diagnostic, vaccine and therapeutic targets

    Taking the pseudo out of pseudogenes

    Pseudogenes are defined as fragments of once-functional genes that have been silenced by one or more nonsense, frameshift or missense mutations. Despite continuing increases in the speed of sequencing and annotating bacterial genomes, the identification and categorisation of pseudogenes remains problematic. Even when identified, pseudogenes are considered to be rare and tend to be ignored. On the contrary, pseudogenes are surprisingly prevalent and can persist for long evolutionary time periods, representing a record of once-functional genetic characteristics. Most importantly, pseudogenes provide an insight into prokaryotic evolutionary history as a record of phenotypic traits that have been lost. Focusing on the intracellular and symbiotic bacteria in which pseudogenes predominate, this review discusses the importance of identifying pseudogenes to fully understand the abilities of bacteria, and to understand prokaryotes within their evolutionary context

    Methods and Applications in Comparative Bacterial Genomics

    Comparative studies of bacterial genomes, now counting in the hundreds, generate massive amounts of information. In order to support a systematic and efficient approach to genomic analyses, a database-driven system with graphic visualization of genomic properties was developed - GenComp. The software was applied to studies of obligate intracellular bacteria. In all studies, ORFs were extracted and grouped into ORF-families. Based on gene order synteny, orthologous clusters of core genes and variable spacer ORFs were identified and extracted for alignments and computation of substitution frequencies. The software was applied to the genomes of six Chlamydia trachomatis strains to identify the most rapidly evolving genes. Five genes were chosen for genotyping, and close to a 3-fold higher discrimination capacity was achieved than that of serotypes. With GenComp as the backbone, a massive comparative analysis was performed on the variable gene set in the Rickettsiaceae, which includes Rickettsia prowazekii and Orientia tsutsugamushi, the agents of epidemic and scrub typhus, respectively. O. tsutsugamushi has the most exceptional bacterial genome identified to date; the 2.2 Mb genome is 200-fold more repeated than the 1.1 Mb R. prowazekii genome due to an extensive proliferation of conjugative type IV secretion systems and associated genes. GenComp identified 688 core genes that are conserved across 7 closely related Rickettsia genomes along with a set of 469 variably present genes with homologs in other species. The analysis indicates that up to 70% of the extensively degraded and variably present genes represent mobile genetic elements and genes putatively acquired by horizontal gene transfer. This explains the paradox of the high pseudogene load in the small Rickettsia genomes. 
This study demonstrates that GenComp provides an efficient system for pseudogene identification and may help distinguish genes from spurious ORFs in the many pan-genome sequencing projects going on worldwide
