418 research outputs found

    The evolution of the tape measure protein: units, duplications and losses

    Background: A large family of viruses that infect bacteria, called phages, is characterized by long tails used to inject DNA into their victims' cells. The tape measure protein got its name because the length of the corresponding gene is proportional to the length of the phage's tail, a fact shown by copying or splicing out parts of the DNA in exemplar species. A natural question is whether there exist units for these tape measures, and whether different tape measures have different units and lengths. Such units would allow us to retrace the evolution of tape measure proteins using their duplication/loss history. The vast number of sequenced phage genomes allows us to attack this problem with a comparative genomics approach. Results: Here we describe a subset of phages whose tape measure proteins contain variable numbers of an 11-amino-acid sequence repeat, aligned using sequence similarity, structural properties, and simple arithmetic. This subset provides a unique opportunity for the combinatorial study of phage evolution, without the added uncertainties of multiple alignments, which are trivial in this case, or of protein functions, which are well established. We give a heuristic that reconstructs the duplication history of these sequences, using divergent strains to discriminate between mutations that occurred before and after speciation, or lineage divergence. The heuristic is based on an efficient algorithm that exhaustively enumerates all possible parsimonious reconstructions of the duplication/speciation history of a single nucleotide. Finally, we present a method that, when possible, discriminates between duplication and loss events. Conclusions: Establishing the evolutionary history of viruses is difficult, in part due to extensive recombination and gene transfer, and to high mutation rates that often erase detectable similarity between homologous genes. In this paper, we introduce new tools to address this problem.

    Efficient algorithms for analyzing segmental duplications with deletions and inversions in genomes

    Background: Segmental duplications, or low-copy repeats, are common in mammalian genomes. In the human genome, most segmental duplications are mosaics composed of multiple duplicated fragments. This complex genomic organization complicates analysis of the evolutionary history of these sequences. One model proposed to explain this mosaic pattern is repeated aggregation and subsequent duplication of genomic sequences. Results: We describe a polynomial-time exact algorithm to compute duplication distance, a genomic distance defined as the most parsimonious way to build a target string by repeatedly copying substrings of a fixed source string. This distance models the process of repeated aggregation and duplication. We also describe extensions of this distance to include certain types of substring deletions and inversions. Finally, we provide a description of a sequence of duplication events as a context-free grammar (CFG). Conclusion: These new genomic distances will permit more biologically realistic analyses of segmental duplications in genomes.
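
To make the idea of building a target string from copies of source substrings concrete, here is a minimal sketch of a deliberately simplified, concatenation-only variant. It is not the paper's algorithm: the function name `concat_copy_count` is hypothetical, and the sketch assumes copies may only be appended at the end of the growing string. It ignores richer operations such as pasting a copy inside an earlier one, as well as deletions and inversions, so it should be read as an upper bound on such a distance.

```python
def concat_copy_count(source: str, target: str) -> int:
    """Fewest substrings of `source` whose concatenation equals `target`.

    Simplified illustration (assumption): copies are only appended.
    Greedy longest-match is optimal here because every prefix of a
    substring of `source` is itself a substring of `source`.
    """
    count = 0
    i = 0
    while i < len(target):
        # Extend the current copy for as long as it still occurs in source.
        length = 0
        while i + length < len(target) and target[i:i + length + 1] in source:
            length += 1
        if length == 0:
            raise ValueError(f"character {target[i]!r} does not occur in source")
        i += length
        count += 1
    return count


if __name__ == "__main__":
    print(concat_copy_count("abc", "abcabc"))
    print(concat_copy_count("abc", "aabcbc"))
```

The naive substring test keeps the sketch short; a suffix automaton or suffix array over the source would be the usual choice for long genomic strings.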

    Evolution of a Complex Locus: Exon Gain, Loss and Divergence at the Gr39a Locus in Drosophila

    Background. Gene families typically evolve by gene duplication followed by the adoption of new or altered gene functions. A different way to evolve new but related functions is alternative splicing of existing exons of a complex gene. The chemosensory gene families of animals are characterised by numerous loci of related function. Alternative splicing has only rarely been reported in chemosensory loci, for example in 5 out of around 120 loci in Drosophila melanogaster. The gustatory receptor gene Gr39a has four large exons that are alternatively spliced with three small conserved exons. Recently the genome sequences of eleven additional species of Drosophila have become available, allowing us to examine variation in the structure of the Gr39a locus across a wide phylogenetic range of fly species. Methodology/Principal Findings. We describe a fifth exon and show that the locus has a complex evolutionary history with several duplications, pseudogenisations and losses of exons. PAML analyses suggested that the whole gene has a history of purifying selection, although this was less strong in exons that underwent duplication. Conclusions/Significance. Estimates of functional divergence between exons were similar in magnitude to functional divergence between duplicated genes, suggesting that exon divergence is broadly equivalent to gene duplication.

    Approches algorithmiques pour l’inférence d’histoires de duplication en tandem avec inversions et délétions pour des familles multigéniques

    Tandemly arrayed genes (TAGs) represent an important fraction of most eukaryotic genomes. A fundamental mechanism at the origin of TAG clusters is unequal crossing-over during meiosis, leading to the duplication of chromosomal segments containing one or many adjacent genes. Such duplications are called tandem duplications, as the duplicated segment is placed next to the original one on the chromosome. Different algorithms have been proposed to infer the tandem duplication history of a TAG cluster. However, their applicability is limited in practice since they do not take into account other frequent evolutionary events such as inversion, inverted duplication and deletion. In this thesis, we propose different algorithmic approaches that integrate these evolutionary events into the original tandem duplication model of evolution. Our contributions are summarized as follows: • We integrate inversion events into a tandem duplication model restricted to single-gene duplications, and we propose an exact algorithm that computes the minimum number of inversions explaining the evolution of a TAG cluster. • We generalize this model to the study of orthologous TAG clusters in different species. • We propose an algorithm that infers the evolutionary history of a TAG cluster through tandem duplication, inverted duplication, inversion and deletion of chromosomal segments containing one or many adjacent genes.

    The mitochondrial genome of Sinentomon erythranum (Arthropoda: Hexapoda: Protura): an example of highly divergent evolution

    Background: The phylogenetic position of the Protura, traditionally considered the most basal hexapod group, is disputed because it has many unique morphological characters compared with other hexapods. Although mitochondrial genome information has been used extensively in phylogenetic studies, such information is not available for the Protura. This has impeded phylogenetic studies of this taxon, as well as studies of the evolution of the arthropod mitochondrial genome. Results: In this study, the mitochondrial genome of Sinentomon erythranum was sequenced, the first to be reported for a proturan species. The genome contains a number of special features that differ from those of other hexapods and arthropods. It is a very small arthropod mitochondrial genome, whose 14,491 nucleotides encode 37 typical mitochondrial genes. Compared with other metazoan mtDNA, it has the most biased nucleotide composition, with T = 52.4%, an extreme and reversed AT-skew of -0.351 and a GC-skew of 0.350. Two tandemly repeated regions occur in the A+T-rich region, and both could form stable stem-loop structures. Eighteen of the 22 tRNAs are greatly reduced in size, with truncated secondary structures. The gene order is novel among available arthropod mitochondrial genomes. Rearrangements have involved not only small tRNA genes, but also PCGs (protein-coding genes) and ribosomal RNA genes. A large block of genes has experienced inversion and another nearby block has been reshuffled, which can be explained by the tandem duplication and random loss model. The most remarkable finding is that trnL2(UUR) is not located between cox1 and cox2 as observed in most hexapod and crustacean groups, but is between rrnL and nad1 as in the ancestral arthropod ground pattern. The "cox1-cox2" pattern was further confirmed in three more representative proturan species. The phylogenetic analyses based on the amino acid sequences of 13 mitochondrial PCGs suggest that S. erythranum does not group with other hexapod groups. Conclusions: The mitochondrial genome of S. erythranum shows many features that differ from other hexapod and arthropod mitochondrial genomes. It underwent highly divergent evolution. The "cox1-cox2" pattern probably represents the ancestral state for all proturan mitogenomes, and suggests a long evolutionary history for the Protura.
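
The abstract quotes AT-skew and GC-skew values without stating how they are computed. A minimal sketch, assuming the conventional strand-specific definitions AT-skew = (A - T)/(A + T) and GC-skew = (G - C)/(G + C); the function name `nucleotide_skews` and the toy sequence are illustrative and not taken from the paper.

```python
from collections import Counter


def nucleotide_skews(seq: str) -> tuple[float, float]:
    """Return (AT-skew, GC-skew) for one DNA strand.

    Assumes the conventional definitions:
        AT-skew = (A - T) / (A + T)
        GC-skew = (G - C) / (G + C)
    Ambiguous bases such as N are ignored.
    """
    counts = Counter(seq.upper())
    a, t, g, c = (counts[base] for base in "ATGC")
    at_skew = (a - t) / (a + t) if a + t else 0.0
    gc_skew = (g - c) / (g + c) if g + c else 0.0
    return at_skew, gc_skew


if __name__ == "__main__":
    # Toy T-rich sequence (hypothetical): more T than A gives a negative AT-skew.
    at, gc = nucleotide_skews("AATTTTGGGC")
    print(f"AT-skew = {at:.3f}, GC-skew = {gc:.3f}")
```

Under these definitions a negative AT-skew means T outnumbers A on the analysed strand, which matches the T-rich composition the abstract reports.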