347 research outputs found

    Parsimony and likelihood reconstruction of human segmental duplications

    Motivation: Segmental duplications > 1 kb in length with ≥ 90% sequence identity between copies comprise nearly 5% of the human genome. They are frequently found in large, contiguous regions known as duplication blocks that can contain mosaic patterns of thousands of segmental duplications. Reconstructing the evolutionary history of these complex genomic regions is a non-trivial, but important task

    The amphioxus genome and the evolution of the chordate karyotype

    Lancelets ('amphioxus') are the modern survivors of an ancient chordate lineage, with a fossil record dating back to the Cambrian period. Here we describe the structure and gene content of the highly polymorphic, approximately 520-megabase genome of the Florida lancelet Branchiostoma floridae, and analyse it in the context of chordate evolution. Whole-genome comparisons illuminate the murky relationships among the three chordate groups (tunicates, lancelets and vertebrates), and allow not only reconstruction of the gene complement of the last common chordate ancestor but also partial reconstruction of its genomic organization, as well as a description of two genome-wide duplications and subsequent reorganizations in the vertebrate lineage. These genome-scale events shaped the vertebrate genome and provided additional genetic variation for exploitation during vertebrate evolution

    Sorting genomes with rearrangements and segmental duplications through trajectory graphs

    We study the problem of sorting genomes under an evolutionary model that includes genomic rearrangements and segmental duplications. We propose an iterative algorithm to improve any initial evolutionary trajectory between two genomes in terms of parsimony. Our algorithm is based on a new graphical model, the trajectory graph, which models not only the final states of two genomes but also an existing evolutionary trajectory between them. We show that redundant rearrangements in the trajectory correspond to certain cycles in the trajectory graph, and prove that our algorithm converges to an optimal trajectory for any initial trajectory involving only rearrangements

    Phylogeny Analysis from Gene-Order Data with Massive Duplications

    Background: Gene order changes, under rearrangements, insertions, deletions and duplications, have been used as a new type of data source for phylogenetic reconstruction. Because these changes are rare compared to sequence mutations, they allow the inference of phylogeny further back in evolutionary time. There exist many computational methods for the reconstruction of gene-order phylogenies, including widely used maximum parsimony methods and maximum likelihood methods. However, both methods face challenges in handling large genomes with many duplicated genes, especially in the presence of whole genome duplication. Methods: In this paper, we present three simple yet powerful methods based on maximum-likelihood (ML) approaches that encode multiplicities of both gene adjacency and gene content information for phylogenetic reconstruction. Results: Extensive experiments on simulated data sets show that our new method achieves the most accurate phylogenies compared to existing approaches. We also evaluate our method on real whole-genome data from eleven mammals. The package is publicly accessible at http://www.geneorder.org. Conclusions: Our new encoding schemes successfully incorporate the multiplicity information of gene adjacencies and gene content into an ML framework, and show promising results in reconstructing phylogenies for whole-genome data in the presence of massive duplications

    Determining the evolutionary history of gene families

    Motivation: Recent large-scale studies of individuals within a population have demonstrated that there is widespread variation in copy number in many gene families. In addition, there is increasing evidence that the variation in gene copy number can give rise to substantial phenotypic effects. In some cases, these variations have been shown to be adaptive. These observations show that a full understanding of the evolution of biological function requires an understanding of gene gain and gene loss. Accurate, robust evolutionary models of gain and loss events are, therefore, required. Results: We have developed weighted parsimony and maximum likelihood methods for inferring gain and loss events. To test these methods, we have used Markov models of gain and loss to simulate data with known properties. We examine three models: a simple birth–death model, a single rate model and a birth–death innovation model with parameters estimated from Drosophila genome data. We find that for all simulations maximum likelihood-based methods are very accurate for reconstructing the number of duplication events on the phylogenetic tree, and that maximum likelihood and weighted parsimony have similar accuracy for reconstructing the ancestral state. Our implementations are robust to different model parameters and provide accurate inferences of ancestral states and the number of gain and loss events. For ancestral reconstruction, we recommend weighted parsimony because it has similar accuracy to maximum likelihood, but is much faster. For inferring the number of individual gene loss or gain events, maximum likelihood is noticeably more accurate, albeit at greater computational cost. Biotechnology and Biological Sciences Research Council, UK
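As a rough illustration of what simulating gene gain and loss with a Markov model can look like, the sketch below draws the copy number of one gene family at the end of a single branch under a linear birth–death process. The abstract does not specify its models in detail, so the function name `simulate_birth_death`, the per-copy rates `birth` and `death`, and the Gillespie-style event loop are all assumptions made for illustration, not the authors' implementation.

```python
import random


def simulate_birth_death(n0, birth, death, branch_length, rng):
    """Simulate gene-family copy number along one branch (hypothetical helper).

    Assumes a linear birth-death process: each existing copy duplicates at
    rate `birth` and is lost at rate `death`, independently of the others.
    Returns the copy number at the end of the branch.
    """
    n = n0
    elapsed = 0.0
    per_copy_rate = birth + death
    while n > 0 and per_copy_rate > 0:
        # Waiting time to the next event is exponential in the total rate.
        elapsed += rng.expovariate(n * per_copy_rate)
        if elapsed > branch_length:
            break
        # Choose a gain or a loss in proportion to the two rates.
        if rng.random() < birth / per_copy_rate:
            n += 1
        else:
            n -= 1
    return n


if __name__ == "__main__":
    rng = random.Random(0)
    counts = [simulate_birth_death(3, 0.5, 0.4, 1.0, rng) for _ in range(5)]
    print(counts)
```

In this sketch a family that reaches zero copies stays at zero; modelling innovation (new families arising) would need an extra rate.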

    Optimized ancestral state reconstruction using Sankoff parsimony

    Background: Parsimony methods are widely used in molecular evolution to estimate the most plausible phylogeny for a set of characters. Sankoff parsimony determines the minimum number of changes required in a given phylogeny when a cost is associated with transitions between character states. Although optimizations exist to reduce the computations in the number of taxa, the original algorithm takes time O(n^2) in the number of states, making it impractical for large values of n. Results: In this study we introduce an optimization of Sankoff parsimony for the reconstruction of ancestral states when ultrametric or additive cost matrices are used. We analyzed its performance for randomly generated matrices, Jukes-Cantor and Kimura's two-parameter models of DNA evolution, and in the reconstruction of elongation factor-1α and ancestral metabolic states of a group of eukaryotes, showing that in all cases the execution time is significantly less than with the original implementation. Conclusion: The algorithms presented here provide a fast computation of Sankoff parsimony for a given phylogeny. Problems where the number of states is large, such as reconstruction of ancestral metabolism, are particularly well suited to this optimization. Since we are reducing the computations required to calculate the parsimony cost of a single tree, our method can be combined with optimizations in the number of taxa that aim at finding the most parsimonious tree.
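For orientation, the baseline that this abstract refers to can be sketched as follows. This is the textbook Sankoff dynamic programme, in which each parent state scans every child state (the quadratic step in the number of states); it is not the optimized variant the paper introduces. The tree encoding, the function name `sankoff_cost` and the example cost matrix are assumptions chosen for illustration.

```python
from math import inf


def sankoff_cost(tree, root, leaf_states, states, cost):
    """Minimum weighted parsimony cost of one character on a rooted tree.

    tree        -- dict: internal node -> list of child nodes (assumed encoding)
    leaf_states -- dict: leaf node -> observed state
    states      -- list of all possible states
    cost        -- dict of dicts, cost[a][b] = cost of changing a into b
    """
    def visit(node):
        # Leaves: zero cost for the observed state, infinite otherwise.
        if node in leaf_states:
            return {s: (0 if s == leaf_states[node] else inf) for s in states}
        total = {s: 0 for s in states}
        for child in tree[node]:
            child_cost = visit(child)
            for s in states:
                # Quadratic step: every parent state scans every child state.
                total[s] += min(cost[s][t] + child_cost[t] for t in states)
        return total

    return min(visit(root).values())


# Example: transitions (A<->G, C<->T) cost 1, transversions cost 2.
STATES = ["A", "C", "G", "T"]
PURINES = {"A", "G"}
COST = {
    a: {b: 0 if a == b else (1 if (a in PURINES) == (b in PURINES) else 2)
        for b in STATES}
    for a in STATES
}
TREE = {"root": ["x", "y"], "x": ["t1", "t2"], "y": ["t3", "t4"]}
LEAVES = {"t1": "A", "t2": "G", "t3": "C", "t4": "C"}

if __name__ == "__main__":
    print(sankoff_cost(TREE, "root", LEAVES, STATES, COST))
```

For the four-leaf example the cheapest history needs one transition and one transversion. The recursion is written for clarity; very deep trees would call for an iterative post-order traversal.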

    Chromosomal evolution of the PKD1 gene family in primates

    Correction to Kirsch S, Pasantes J, Wolf A, Bogdanova N, Münch C, Pennekamp P, Krawczak M, Dworniczak B, Schempp W: Chromosomal evolution of the PKD1 gene family in primates. BMC Evolutionary Biology 2008, 8:263 (doi:10.1186/1471-2148-8-263)

    Evolution and functional divergence of NLRP genes in mammalian reproductive systems

    Background: NLRPs (Nucleotide-binding oligomerization domain, Leucine rich Repeat and Pyrin domain containing Proteins) are members of the NLR (Nod-like receptors) protein family. Recent research has shown that NLRP genes play important roles in both the mammalian innate immune system and the reproductive system. Several NLRP genes were shown to be specifically expressed in the oocyte in mammals. The aim of the present work was to study how these genes evolved and diverged after their duplication, as well as whether natural selection played a role during their evolution. Results: Using in silico methods, we have evaluated the evolution and functional divergence of NLRP genes, in particular of mouse reproduction-related Nlrp genes. We found that (1) major NLRP genes were duplicated before the divergence of mammals, with certain lineage-specific duplications in primates (NLRP7 and 11) and in rodents (Nlrp1, 4 and 9 duplicates); (2) tandem duplication events gave rise to a mammalian reproduction-related NLRP cluster including the NLRP2, 4, 5, 7, 8, 9, 11, 13 and 14 genes; (3) the function of mammalian oocyte-specific NLRP genes (NLRP4, 5, 9 and 14) might have diverged during gene evolution; (4) recent segmental duplications involving Nlrp4 copies and vomeronasal 1 receptor encoding genes (V1r) have occurred in the mouse; and (5) duplicates of Nlrp4 and 9 in the mouse might have been subjected to adaptive evolution. Conclusion: This study provides novel information on the evolution of mammalian reproduction-related NLRPs. On the one hand, NLRP genes duplicated and functionally diversified in mammalian reproductive systems (such as NLRP4, 5, 9 and 14). On the other hand, during evolution, different lineages adapted to develop their own NLRP genes, particularly in reproductive function (such as the specific expansion of Nlrp4 and Nlrp9 in the mouse).

    Evolution at the Subgene Level: Domain Rearrangements in the Drosophila Phylogeny

    Supplementary sections 1–13, tables S1–S10, and figures S1–S9 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/). Although the possibility of gene evolution by domain rearrangements has long been appreciated, current methods for reconstructing and systematically analyzing gene family evolution are limited to events such as duplication, loss, and sometimes, horizontal transfer. However, within the Drosophila clade, we find domain rearrangements occur in 35.9% of gene families, and thus, any comprehensive study of gene evolution in these species will need to account for such events. Here, we present a new computational model and algorithm for reconstructing gene evolution at the domain level. We develop a method for detecting homologous domains between genes and present a phylogenetic algorithm for reconstructing maximum parsimony evolutionary histories that include domain generation, duplication, loss, merge (fusion), and split (fission) events. Using this method, we find that genes involved in fusion and fission are enriched in signaling and development, suggesting that domain rearrangements and reuse may be crucial in these processes. We also find that fusion is more abundant than fission, and that fusion and fission events occur predominantly alongside duplication, with 92.5% and 34.3% of fusion and fission events retaining ancestral architectures in the duplicated copies. We provide a catalog of ∼9,000 genes that undergo domain rearrangement across nine sequenced species, along with possible mechanisms for their formation. These results dramatically expand on evolution at the subgene level and offer several insights into how new genes and functions arise between species. National Science Foundation (U.S.) (Graduate Research Fellowship); National Science Foundation (U.S.) (CAREER award NSF 0644282)

    Peregrine and saker falcon genome sequences provide insights into evolution of a predatory lifestyle

    As top predators, falcons possess unique morphological, physiological and behavioral adaptations that allow them to be successful hunters: for example, the peregrine is renowned as the world's fastest animal. To examine the evolutionary basis of predatory adaptations, we sequenced the genomes of both the peregrine (Falco peregrinus) and saker falcon (Falco cherrug), and we present parallel, genome-wide evidence for evolutionary innovation and selection for a predatory lifestyle. The genomes, assembled using Illumina deep sequencing with greater than 100-fold coverage, are both approximately 1.2 Gb in length, with transcriptome-assisted prediction of approximately 16,200 genes for both species. Analysis of 8,424 orthologs in both falcons, chicken, zebra finch and turkey identified consistent evidence for genome-wide rapid evolution in these raptors. SNP-based inference showed contrasting recent demographic trajectories for the two falcons, and gene-based analysis highlighted falcon-specific evolutionary novelties for beak development and olfaction and specifically for homeostasis-related genes in the arid environment–adapted saker