
    Ancestral genome estimation reveals the history of ecological diversification in Agrobacterium

    Horizontal gene transfer (HGT) is considered a major source of innovation in bacteria, and as such is expected to drive adaptation to new ecological niches. However, among the many genes acquired through HGT along the diversification history of genomes, only a fraction may have actively contributed to sustained ecological adaptation. We used a phylogenetic approach accounting for the transfer of genes (or groups of genes) to estimate the history of genomes in Agrobacterium biovar 1, a diverse group of soil and plant-dwelling bacterial species. We identified clade-specific blocks of cotransferred genes encoding coherent biochemical pathways that may have contributed to the evolutionary success of key Agrobacterium clades. This pattern of gene coevolution rejects a neutral model of transfer, in which neighboring genes would be transferred independently of their function, and instead suggests purifying selection on collectively coded acquired pathways. The acquisition of these synapomorphic blocks of cofunctioning genes probably drove the ecological diversification of Agrobacterium and defined features of ancestral ecological niches, which consistently hint at a strong selective role of host plant rhizospheres.

    Evolution of genes and repeats in the Nimrod superfamily

    The recently identified Nimrod superfamily is characterized by the presence of a special type of EGF repeat, the NIM repeat, located right after a typical CCXGY/W amino acid motif. On the basis of structural features, nimrod genes can be divided into three types. The proteins encoded by Draper-type genes have an EMI domain at the N-terminal part and only one copy of the NIM motif, followed by a variable number of EGF-like repeats. The products of Nimrod B-type and Nimrod C-type genes (including the eater gene) have different kinds of N-terminal domains, and lack EGF-like repeats but contain a variable number of NIM repeats. Draper and Nimrod C-type (but not Nimrod B-type) proteins carry a transmembrane domain. Several members of the superfamily were claimed to function as receptors in phagocytosis and/or binding of bacteria, which indicates an important role in cellular immunity and the elimination of apoptotic cells. In this paper, the evolution of the Nimrod superfamily is studied with various methods on the level of genes and repeats. A hypothesis is presented in which the NIM repeat, along with the EMI domain, emerged by structural reorganizations at the end of an EGF-like repeat chain, suggesting a mechanism for the formation of novel types of repeats. The analyses revealed diverse evolutionary patterns in the sequences containing multiple NIM repeats. Although the repeats in the Nimrod B and Nimrod C proteins show characteristics of independent evolution, many internal NIM repeats in Eater sequences seem to have undergone concerted evolution. An analysis of the nimrod genes has been performed using phylogenetic and other methods, and an evolutionary scenario of the origin and diversification of the Nimrod superfamily is proposed. Our study presents an intriguing example of how the evolution of multigene families may contribute to the complexity of the innate immune response.

    The inference of gene trees with species trees

    Molecular phylogeny has focused mainly on improving models for the reconstruction of gene trees based on sequence alignments. Yet, most phylogeneticists seek to reveal the history of species. Although the histories of genes and species are tightly linked, they are seldom identical, because genes duplicate, are lost or horizontally transferred, and because alleles can co-exist in populations for periods that may span several speciation events. Building models describing the relationship between gene and species trees can thus improve the reconstruction of gene trees when a species tree is known, and vice versa. Several approaches have been proposed to solve the problem in one direction or the other, but in general neither gene trees nor species trees are known. Only a few studies have attempted to jointly infer gene trees and species trees. In this article we review the various models that have been used to describe the relationship between gene trees and species trees. These models account for gene duplication and loss, transfer or incomplete lineage sorting. Some of them consider several types of events together, but none exists currently that considers the full repertoire of processes that generate gene trees along the species tree. Simulations as well as empirical studies on genomic data show that combining gene tree-species tree models with models of sequence evolution improves gene tree reconstruction. In turn, these better gene trees provide a better basis for studying genome evolution or reconstructing ancestral chromosomes and ancestral gene sequences. We predict that gene tree-species tree methods that can deal with genomic data sets will be instrumental to advancing our understanding of genomic evolution. Comment: Review article in relation to the "Mathematical and Computational Evolutionary Biology" conference, Montpellier, 201

    Evolution and diversity of secretome genes in the apicomplexan parasite Theileria annulata

    BACKGROUND: Little is known about how apicomplexan parasites have evolved to infect different host species and cell types. Theileria annulata and Theileria parva invade and transform bovine leukocytes but each species favours a different host cell lineage. Parasite-encoded proteins secreted from the intracellular macroschizont stage within the leukocyte represent a critical interface between host and pathogen systems. Genome sequencing has revealed that several Theileria-specific gene families encoding secreted proteins are positively selected at the inter-species level, indicating diversification between the species. We extend this analysis to the intra-species level, focusing on allelic diversity of two major secretome families. These families represent a well-characterised group of genes implicated in control of the host cell phenotype and a gene family of unknown function. To gain further insight into their evolution and function, this study investigates whether representative genes of these two families are diversifying or constrained within the T. annulata population. RESULTS: Strong evidence is provided that the sub-telomerically encoded SVSP family and the host-nucleus targeted TashAT family have evolved under contrasting pressures within natural T. annulata populations. SVSP genes were found to possess atypical codon usage and be evolving neutrally, with high levels of nucleotide substitutions and multiple indels. No evidence of geographical sub-structuring of allelic sequences was found. In contrast, TashAT family genes, implicated in control of host cell gene expression, are strongly conserved at the protein level and geographically sub-structured allelic sequences were identified among Tunisian and Turkish isolates. Although different copy numbers of DNA binding motifs were identified in alleles of TashAT proteins, motif periodicity was strongly maintained, implying conserved functional activity of these sites. CONCLUSIONS: This analysis provides evidence that two distinct secretome gene families have evolved under contrasting selective pressures. The data support current hypotheses regarding the biological role of TashAT family proteins in the management of host cell phenotype that may have evolved to allow adaptation of T. annulata to a specific host cell lineage. We provide new evidence of extensive allelic diversity in representative members of the enigmatic SVSP gene family, which supports a putative role for the encoded products in subversion of the host immune response.

    Orthology prediction methods: a quality assessment using curated protein families

    The increasing number of sequenced genomes has prompted the development of several automated orthology prediction methods. Tests to evaluate the accuracy of predictions and to explore biases caused by biological and technical factors are therefore required. We used 70 manually curated families to analyze the performance of five public methods in Metazoa. We analyzed the strengths and weaknesses of the methods and quantified the impact of biological and technical challenges. From the latter part of the analysis, genome annotation emerged as the largest single influencer, affecting up to 30% of the performance. Generally, most methods did well in assigning orthologous groups, but they failed to assign the exact number of genes for half of the groups. The publicly available benchmark set (http://eggnog.embl.de/orthobench/) should facilitate the improvement of current orthology assignment protocols, which is of utmost importance for many fields of biology and should be tackled by a broad scientific community.

    Exact reconciliation of undated trees

    Reconciliation methods aim at recovering macroevolutionary events and at localizing them in the species history, by observing discrepancies between gene family trees and species trees. In this article we introduce an Integer Linear Programming (ILP) approach for the NP-hard problem of computing a most parsimonious time-consistent reconciliation of a gene tree with a species tree when dating information on speciations is not available. The ILP formulation, which builds upon the DTL model, returns a most parsimonious reconciliation ranging over all possible datings of the nodes of the species tree. By studying its performance on plausible simulated data we conclude that the ILP approach is significantly faster than a brute force search through the space of all possible species tree datings. Although the ILP formulation is currently limited to small trees, we believe that it is an important proof of concept that opens the door to the possibility of developing an exact, parsimony-based approach to dating species trees. The software (ILPEACE) is freely available for download.

    The evolutionary dynamics of variant antigen genes in Babesia reveal a history of genomic innovation underlying host-parasite interaction

    Babesia spp. are tick-borne, intraerythrocytic hemoparasites that use antigenic variation to resist host immunity, through sequential modification of the parasite-derived variant erythrocyte surface antigen (VESA) expressed on the infected red blood cell surface. We identified the genomic processes driving antigenic diversity in genes encoding VESA (ves1) through comparative analysis within and between three Babesia species (B. bigemina, B. divergens and B. bovis). Ves1 structure diverges rapidly after speciation, notably through the evolution of shortened forms (ves2) from 5′ ends of canonical ves1 genes. Phylogenetic analyses show that ves1 genes are transposed between loci routinely, whereas ves2 genes are not. Similarly, analysis of sequence mosaicism shows that recombination drives variation in ves1 sequences, but less so for ves2, indicating the adoption of different mechanisms for variation of the two families. Proteomic analysis of the B. bigemina PR isolate shows that two dominant VESA1 proteins are expressed in the population, whereas numerous VESA2 proteins are co-expressed, consistent with differential transcriptional regulation of each family. Hence, VESA2 proteins are abundant and previously unrecognized elements of Babesia biology, with evolutionary dynamics consistently different to those of VESA1, suggesting that their functions are distinct.