72 research outputs found

    Evolution through segmental duplications and losses : A Super-Reconciliation approach

    The classical gene and species tree reconciliation, used to infer the history of gene gain and loss explaining the evolution of gene families, assumes an independent evolution for each family. While this assumption is reasonable for genes that are far apart in the genome, it is not appropriate for genes grouped into syntenic blocks, which are more plausibly the result of a concerted evolution. Here, we introduce the Super-Reconciliation problem, which consists in inferring a history of segmental duplication and loss events (involving a set of neighboring genes) leading to a set of present-day syntenies from a single ancestral one. In other words, we extend the traditional Duplication-Loss reconciliation problem of a single gene tree to a set of trees, accounting for segmental duplications and losses. Existence of a Super-Reconciliation depends on individual gene tree consistency. In addition, ignoring rearrangements implies that existence also depends on gene order consistency. We first show that the problem of reconstructing a most parsimonious Super-Reconciliation, if any, is NP-hard and give an exact exponential-time algorithm to solve it. Alternatively, we show that accounting for rearrangements in the evolutionary model, but still only minimizing segmental duplication and loss events, leads to an exact polynomial-time algorithm. We finally assess the time efficiency of the former exponential-time algorithm for the Duplication-Loss model on simulated datasets, and give a proof of concept on the opioid receptor genes.

    Reconstructing the History of Syntenies Through Super-Reconciliation

    Classical gene and species tree reconciliation, used to infer the history of gene gain and loss explaining the evolution of gene families, assumes an independent evolution for each family. While this assumption is reasonable for genes that are far apart in the genome, it is clearly not suited for genes grouped in syntenic blocks, which are more plausibly the result of a concerted evolution. Here, we introduce the Super-Reconciliation model, which extends the traditional Duplication-Loss model to the reconciliation of a set of trees, accounting for segmental duplications and losses. From a complexity point of view, we show that the associated decision problem is NP-hard. We then give an exact exponential-time algorithm for this problem, assess its time efficiency on simulated datasets, and give a proof of concept on the opioid receptor genes.
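For orientation, the classical single-gene-tree reconciliation that the two Super-Reconciliation abstracts above set out to extend can be sketched with the standard LCA mapping. This is a minimal illustration only, not the Super-Reconciliation algorithm from either paper: it handles one gene tree, counts duplications but not losses, and assumes trees are written as nested tuples whose leaves are species names (each gene leaf is labeled by the species it belongs to). The function names `clusters`, `lca`, and `count_duplications` are hypothetical.

```python
def clusters(tree):
    """Return the leaf set of every node of a nested-tuple species tree.

    The cluster of `tree` itself is always the last element of the result.
    """
    if isinstance(tree, str):
        return [frozenset([tree])]
    result = []
    leaves = frozenset()
    for child in tree:
        sub = clusters(child)
        result.extend(sub)
        leaves |= sub[-1]  # last entry is the child's own cluster
    result.append(leaves)
    return result


def lca(species_clusters, species):
    """Smallest species-tree cluster containing all the given species."""
    return min((c for c in species_clusters if species <= c), key=len)


def count_duplications(gene_tree, species_clusters):
    """Return (species-tree node this gene node maps to, duplications below it).

    A gene-tree node is called a duplication when it maps to the same
    species-tree node as one of its children (the usual LCA-mapping rule).
    """
    if isinstance(gene_tree, str):
        return frozenset([gene_tree]), 0
    child_maps, dups = [], 0
    for child in gene_tree:
        mapped, below = count_duplications(child, species_clusters)
        child_maps.append(mapped)
        dups += below
    here = lca(species_clusters, frozenset().union(*child_maps))
    if here in child_maps:
        dups += 1
    return here, dups


# Assumed toy data: species tree ((A,B),C) and a gene tree with two copies in A.
species_tree = (("A", "B"), "C")
gene_tree = (("A", "B"), ("A", "C"))
_, n_dups = count_duplications(gene_tree, clusters(species_tree))
print(n_dups)
```

In the toy gene tree, the root maps to the same species-tree node as its right child, so it is the only node counted as a duplication. A Super-Reconciliation would additionally require that the events inferred for neighboring genes be explained jointly as segmental events, which this sketch does not attempt.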

    Contrasting modes of diversification in the Aux/IAA and ARF gene families

    The complete genomic sequence for Arabidopsis provides the opportunity to combine phylogenetic and genomic approaches to study the evolution of gene families in plants. The Aux/IAA and ARF gene families, consisting of 29 and 23 loci in Arabidopsis, respectively, encode proteins that interact to mediate auxin responses and regulate various aspects of plant morphological development. We developed scenarios for the genomic proliferation of the Aux/IAA and ARF families by combining phylogenetic analysis with information on the relationship between each locus and the previously identified duplicated genomic segments in Arabidopsis. This analysis shows that both gene families date back at least to the origin of land plants and that the major Aux/IAA and ARF lineages originated before the monocot-eudicot divergence. We found that the extant Aux/IAA loci arose primarily through segmental duplication events, in sharp contrast to the ARF family and to the general pattern of gene family proliferation in Arabidopsis. Possible explanations for the unusual mode of Aux/IAA duplication include evolutionary constraints imposed by complex interactions among proteins and pathways, or the presence of long-distance cis-regulatory sequences. The antiquity of the two gene families and the unusual mode of Aux/IAA diversification have a number of potential implications for understanding both the functional and evolutionary roles of these genes.

    The Evolution of Mammalian Gene Families

    Gene families are groups of homologous genes that are likely to have highly similar functions. Differences in family size due to lineage-specific gene duplication and gene loss may provide clues to the evolutionary forces that have shaped mammalian genomes. Here we analyze the gene families contained within the whole genomes of human, chimpanzee, mouse, rat, and dog. In total we find that more than half of the 9,990 families present in the mammalian common ancestor have either expanded or contracted along at least one lineage. Additionally, we find that a large number of families are completely lost from one or more mammalian genomes, and a similar number of gene families have arisen subsequent to the mammalian common ancestor. Along the lineage leading to modern humans we infer the gain of 689 genes and the loss of 86 genes since the split from chimpanzees, including changes likely driven by adaptive natural selection. Our results imply that humans and chimpanzees differ by at least 6% (1,418 of 22,000 genes) in their complement of genes, which stands in stark contrast to the oft-cited 1.5% difference between orthologous nucleotide sequences. This genomic “revolving door” of gene gain and loss represents a large number of genetic differences separating humans from our closest relatives.

    Phylogenetic Comparison of F-Box (FBX) Gene Superfamily within the Plant Kingdom Reveals Divergent Evolutionary Histories Indicative of Genomic Drift

    The emergence of multigene families has been hypothesized as a major contributor to the evolution of complex traits and speciation. To help understand how such multigene families arose and diverged during plant evolution, we examined the phylogenetic relationships of F-Box (FBX) genes, one of the largest and most polymorphic superfamilies known in the plant kingdom. FBX proteins comprise the target recognition subunit of SCF-type ubiquitin-protein ligases, where they individually recruit specific substrates for ubiquitylation. Through the extensive analysis of 10,811 FBX loci from 18 plant species, ranging from the alga Chlamydomonas reinhardtii to numerous monocots and eudicots, we discovered strikingly diverse evolutionary histories. The number of FBX loci varies widely and appears independent of the growth habit and life cycle of land plants, with as few as 198 predicted for Carica papaya to as many as 1350 predicted for Arabidopsis lyrata. This number differs substantially even among closely related species, with evidence for extensive gains/losses. Despite this extraordinary inter-species variation, one subset of FBX genes was conserved among most species examined. Together with evidence of strong purifying selection and expression, the ligases synthesized from these conserved loci likely direct essential ubiquitylation events. Another subset was much more lineage specific, showed more relaxed purifying selection, and was enriched in loci with little or no evidence of expression, suggesting that they either control more limited, species-specific processes or arose from genomic drift and thus may provide reservoirs for evolutionary innovation. Numerous FBX loci were also predicted to be pseudogenes, with their numbers tightly correlated with the total number of FBX genes in each species. Taken together, it appears that the FBX superfamily has independently undergone substantial birth/death in many plant lineages, with its size and rapid evolution potentially reflecting a central role for ubiquitylation in driving plant fitness.

    Polyploidy Did Not Predate the Evolution of Nodulation in All Legumes

    BACKGROUND: Several lines of evidence indicate that polyploidy occurred by around 54 million years ago, early in the history of legume evolution, but it has not been known whether this event was confined to the papilionoid subfamily (Papilionoideae; e.g. beans, medics, lupins) or occurred earlier. Determining the timing of the polyploidy event is important for understanding whether polyploidy might have contributed to rapid diversification and radiation of the legumes near the origin of the family, and whether polyploidy might have provided genetic material that enabled the evolution of a novel organ, the nitrogen-fixing nodule. Although symbioses with nitrogen-fixing partners have evolved in several lineages in the rosid I clade, nodules are widespread only in legume taxa, being nearly universal in the papilionoids and in the mimosoid subfamily (e.g., mimosas, acacias), which diverged from the papilionoid legumes around 58 million years ago, soon after the origin of the legumes. METHODOLOGY/PRINCIPAL FINDINGS: Using transcriptome sequence data from Chamaecrista fasciculata, a nodulating member of the mimosoid clade, we tested whether this species underwent polyploidy within the timeframe of legume diversification. Analysis of gene family branching orders and synonymous-site divergence data from C. fasciculata, Glycine max (soybean), Medicago truncatula, and Vitis vinifera (grape; an outgroup to the rosid taxa) establishes that the polyploidy event known from soybean and Medicago occurred after the separation of the mimosoid and papilionoid clades, and at or shortly before the Papilionoideae radiation. CONCLUSIONS: The ancestral legume genome was not fundamentally polyploid. Moreover, because there has not been an independent instance of polyploidy in the Chamaecrista lineage, there is no necessary connection between polyploidy and nodulation in legumes. Chamaecrista may serve as a useful model in the legumes that lacks a paleopolyploid history, at least relative to the widely studied papilionoid models.

    Genome-wide analysis of the nucleotide binding site leucine-rich repeat genes of four orchids revealed extremely low numbers of disease resistance genes

    Orchids are one of the most diverse flowering plant families, yet possibly maintain the smallest number of nucleotide-binding site-leucine-rich repeat (NBS-LRR) type plant resistance (R) genes among the angiosperms. In this study, a genome-wide search in four orchid taxa identified 186 NBS-LRR genes. Furthermore, 214 NBS-LRR genes were identified from seven orchid transcriptomes. A phylogenetic analysis recovered 30 ancestral lineages (29 CNL and one RNL), far fewer than in other angiosperm families. From a genetic perspective, the relatively low number of ancestral R genes alone is unlikely to explain the low number of R genes in orchids, as historical gene loss and scarce gene duplication have continuously occurred and also contribute to the low number of R genes. Due to recent sharp expansions, Phalaenopsis equestris and Dendrobium catenatum have 52 and 115 genes, respectively, and exhibit an "early shrinking to recent expanding" evolutionary pattern, while Gastrodia elata and Apostasia shenzhenica both exhibit a "consistently shrinking" evolutionary pattern and have retained only five and 14 NBS-LRR genes, respectively. RNL genes remain in extremely low numbers with only one or two copies per genome. Notably, all of the orchid RNL genes belong to the ADR1 lineage. A separate lineage, NRG1, was entirely absent and was likely lost in the common ancestor of all monocots. All of the TNL genes were absent as well, coincident with the RNL NRG1 lineage, which supports the previously proposed notion of a potential functional association between the TNL and RNL NRG1 genes.

    The Distance and Median Problems in the Single-Cut-Or-Join Model with Single-Gene Duplications

    Background. In the field of genome rearrangement algorithms, models accounting for gene duplication often lead to hard problems. For example, while computing the pairwise distance is tractable in most duplication-free models, the problem is NP-complete for most extensions of these models accounting for duplicated genes. Moreover, problems involving more than two genomes, such as the genome median and the Small Parsimony problem, are intractable for most duplication-free models, with some exceptions, for example the Single-Cut-or-Join (SCJ) model. Results. We introduce a variant of the SCJ distance that accounts for duplicated genes, in the context of directed evolution from an ancestral genome to a descendant genome where orthology relations between ancestral genes and their descendants are known. Our model includes two duplication mechanisms: single-gene tandem duplication and the creation of single-gene circular chromosomes. We prove that in this model, computing the directed distance and a parsimonious evolutionary scenario in terms of SCJ and single-gene duplication events can be done in linear time. We also show that the directed median problem is tractable for this distance, while the rooted median problem, where we assume that one of the given genomes is ancestral to the median, is NP-complete. We also describe an Integer Linear Program for solving this problem. We evaluate the directed distance and rooted median algorithms on simulated data. Conclusion. Our results provide a simple genome rearrangement model, extending the SCJ model to account for single-gene duplications, for which we prove a mix of tractability and hardness results. For the NP-complete rooted median problem, we design a simple Integer Linear Program. Our publicly available implementation of these algorithms for the directed distance and median problems allows these problems to be solved efficiently on large instances.
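As background for the abstract above, the original duplication-free SCJ distance between two genomes with the same gene content is the size of the symmetric difference of their adjacency sets: each adjacency present in only one genome costs one cut or one join. The sketch below illustrates only that baseline, not the duplication-aware variant the abstract introduces. It assumes linear chromosomes written as lists of signed integers (sign gives orientation) and no duplicated genes; the function names `adjacencies` and `scj_distance` are hypothetical.

```python
def adjacencies(genome):
    """Return the set of adjacencies of a genome.

    `genome` is a list of linear chromosomes, each a list of signed gene ids.
    A gene +g is read tail-to-head, -g head-to-tail. An adjacency is the
    unordered pair of gene extremities that touch between two neighbors.
    """
    adj = set()
    for chrom in genome:
        for left, right in zip(chrom, chrom[1:]):
            a = (abs(left), "h" if left > 0 else "t")    # right end of `left`
            b = (abs(right), "t" if right > 0 else "h")  # left end of `right`
            adj.add(frozenset([a, b]))
    return adj


def scj_distance(genome_a, genome_b):
    """Duplication-free SCJ distance: cuts needed plus joins needed."""
    return len(adjacencies(genome_a) ^ adjacencies(genome_b))


# Assumed toy genomes: the second one has gene 2 inverted.
ancestor = [[1, 2, 3]]
descendant = [[1, -2, 3]]
print(scj_distance(ancestor, descendant))
```

Inverting gene 2 destroys both of its adjacencies and creates two new ones, so the toy pair is at distance 4. Reading a chromosome in the opposite direction (for example `[[-3, -2, -1]]`) yields the same adjacency set and therefore distance 0.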

    Reconstruction of Oomycete Genome Evolution Identifies Differences in Evolutionary Trajectories Leading to Present-Day Large Gene Families

    The taxonomic class of oomycetes contains numerous pathogens of plants and animals but is related to nonpathogenic diatoms and brown algae. Oomycetes have flexible genomes comprising large gene families that play roles in pathogenicity. The evolutionary processes that shaped the gene content have not yet been studied by applying systematic tree reconciliation of the phylome of these species. We analyzed evolutionary dynamics of ten Stramenopiles. Gene gains, duplications, and losses were inferred by tree reconciliation of 18,459 gene trees constituting the phylome with a highly supported species phylogeny. We reconstructed a strikingly large last common ancestor of the Stramenopiles that contained ∼10,000 genes. Throughout evolution, the genomes of pathogenic oomycetes have constantly gained and lost genes, though gene gains through duplications outnumber the losses. The branch leading to the plant pathogenic Phytophthora genus was identified as a major transition point characterized by increased frequency of duplication events that has likely driven the speciation within this genus. Large gene families encoding different classes of enzymes associated with pathogenicity, such as glycoside hydrolases, are formed by complex and distinct patterns of duplications and losses leading to their expansion in extant oomycetes. This study unveils the large-scale evolutionary dynamics that shaped the genomes of pathogenic oomycetes. By the application of phylogenetics-based analysis methods, it provides additional insights that shed light on the complex history of oomycete genome evolution and the emergence of large gene families characteristic of this important class of pathogens.