17 research outputs found

    Maximum likelihood estimates of pairwise rearrangement distances

    Accurate estimation of evolutionary distances between taxa is important for many phylogenetic reconstruction methods. In the case of bacteria, distances can be estimated using a range of different evolutionary models, from single nucleotide polymorphisms to large-scale genome rearrangements. For sequence evolution, models such as the Jukes-Cantor model and its associated metric have been used to correct pairwise distances. Similar correction methods for genome rearrangement processes are required to improve inference. Current attempts at correction fall into three categories: empirical computational studies, Bayesian/MCMC approaches, and combinatorial approaches. Here we introduce a maximum likelihood estimator (MLE) for the inversion distance between a pair of genomes, using the recently introduced group-theoretic approach to modelling inversions. This MLE functions as a corrected distance: in particular, we show that because of the way sequences of inversions interact with each other, it is quite possible for minimal distance and MLE distance to order the distances of two genomes from a third differently. This has obvious implications for the use of minimal distance in phylogeny reconstruction. The work also tackles the above problem while allowing free rotation of the genome. Generally a frame of reference is locked, and all computation is made accordingly. This work incorporates the action of the dihedral group so that distance estimates are free from any a priori frame of reference. Comment: 21 pages, 7 figures. To appear in the Journal of Theoretical Biology
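The abstract above refers to the Jukes-Cantor correction of pairwise sequence distances as the model for the kind of correction it seeks for rearrangements. As a minimal illustrative sketch (not code from the paper), the standard Jukes-Cantor formula d = -(3/4) ln(1 - 4p/3) can be applied to the observed proportion of differing sites p. The function names `p_distance` and `jukes_cantor_distance` are hypothetical, chosen here for illustration, and the input sequences are assumed to be already aligned and of equal length.

```python
import math


def p_distance(a: str, b: str) -> float:
    """Observed proportion of differing sites between two aligned sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be aligned, non-empty, and of equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def jukes_cantor_distance(p: float) -> float:
    """Jukes-Cantor corrected distance d = -(3/4) * ln(1 - 4p/3).

    The correction is undefined once p reaches 0.75, the expected
    proportion of differences between two unrelated sequences.
    """
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75)")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


if __name__ == "__main__":
    p = p_distance("ACGTACGT", "ACGTTCGA")
    print(p, jukes_cantor_distance(p))
```

The corrected distance always meets or exceeds the observed proportion, because multiple substitutions at the same site hide part of the true change. The abstract argues that minimal inversion distance underestimates true rearrangement distance in an analogous way.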

    Origin and distribution of epipolythiodioxopiperazine (ETP) gene clusters in filamentous ascomycetes

    Background: Genes responsible for biosynthesis of fungal secondary metabolites are usually tightly clustered in the genome and co-regulated with metabolite production. Epipolythiodioxopiperazines (ETPs) are a class of secondary metabolite toxins produced by disparate ascomycete fungi and implicated in several animal and plant diseases. Gene clusters responsible for their production have previously been defined in only two fungi. Fungal genome sequence data have been surveyed for the presence of putative ETP clusters, and cluster data have been generated from several fungal taxa where genome sequences are not available. Phylogenetic analysis of cluster genes has been used to investigate the assembly and heredity of these gene clusters.

    Results: Putative ETP gene clusters are present in 14 ascomycete taxa, but absent in numerous other ascomycetes examined. These clusters are discontinuously distributed in ascomycete lineages. Gene content is not absolutely fixed; however, common genes are identified and phylogenies of six of these are separately inferred. In each phylogeny almost all cluster genes form monophyletic clades, with non-cluster fungal paralogues being the nearest outgroups. This relatedness of cluster genes suggests that a progenitor ETP gene cluster assembled within an ancestral taxon. Within each of the cluster clades, the cluster genes group together in consistent subclades; however, these relationships do not always reflect the phylogeny of ascomycetes. Micro-synteny of several of the genes within the clusters provides further support for these subclades.

    Conclusion: ETP gene clusters appear to have a single origin and have been inherited relatively intact rather than assembling independently in the different ascomycete lineages. This progenitor cluster has given rise to a small number of distinct phylogenetic classes of clusters that are represented in a discontinuous pattern throughout ascomycetes. The disjunct heredity of these clusters is discussed with consideration of multiple instances of independent cluster loss and lateral transfer of gene clusters between lineages.

    CO-phylum: An Assembly-Free Phylogenomic Approach for Close Related Organisms

    Phylogenomic approaches developed thus far are either too time-consuming or lack a solid evolutionary basis. Moreover, no phylogenomic approach is capable of constructing a tree directly from unassembled raw sequencing data. A new phylogenomic method, CO-phylum, is developed to alleviate these flaws. CO-phylum can generate a high-resolution and highly accurate tree using complete genomes or unassembled sequencing data of closely related organisms. In addition, the CO-phylum distance is almost linear with p-distance. Comment: 21 pages, 6 figures

    Mitochondrial evolution in the fission yeasts

    Thesis digitized by the Direction des bibliothèques of the Université de Montréal

    A versatile palindromic amphipathic repeat coding sequence horizontally distributed among diverse bacterial and eucaryotic microbes

    Background: Intragenic tandem repeats occur throughout all domains of life and impart functional and structural variability to diverse translation products. Repeat proteins confer distinctive surface phenotypes to many unicellular organisms, including those with minimal genomes such as the wall-less bacterial monoderms, Mollicutes. One such repeat pattern in this clade is distributed in a manner suggesting its exchange by horizontal gene transfer (HGT). Expanding genome sequence databases reveal the pattern in a widening range of bacteria, and recently among eucaryotic microbes. We examined the genomic flux and consequences of the motif by determining its distribution, predicted structural features and association with membrane-targeted proteins.

    Results: Using a refined hidden Markov model, we document a 25-residue protein sequence motif tandemly arrayed in variable-number repeats in ORFs lacking assigned functions. It appears sporadically in unicellular microbes from disparate bacterial and eucaryotic clades, representing diverse lifestyles and ecological niches that include host parasitic, marine and extreme environments. Tracts of the repeats predict a malleable configuration of recurring domains, with conserved hydrophobic residues forming an amphipathic secondary structure in which hydrophilic residues endow extensive sequence variation. Many ORFs with these domains also have membrane-targeting sequences that predict assorted topologies; others may comprise reservoirs of sequence variants. We demonstrate expressed variants among surface lipoproteins that distinguish closely related animal pathogens belonging to a subgroup of the Mollicutes. DNA sequences encoding the tandem domains display dyad symmetry. Moreover, in some taxa the domains occur in ORFs selectively associated with mobile elements. These features, a punctate phylogenetic distribution, and different patterns of dispersal in genomes of related taxa, suggest that the repeat may be disseminated by HGT and intra-genomic shuffling.

    Conclusions: We describe novel features of PARCELs (Palindromic Amphipathic Repeat Coding ELements), a set of widely distributed repeat protein domains and coding sequences that were likely acquired through HGT by diverse unicellular microbes, further mobilized and diversified within genomes, and co-opted for expression in the membrane proteome of some taxa. Disseminated by multiple gene-centric vehicles, ORFs harboring these elements enhance accessory gene pools as part of the "mobilome" connecting genomes of various clades, in taxa sharing common niches.

    Characterization of chloroplast and mitochondrial genomes from green algae belonging to the class ulvophyceae, and identification of this class position within the chlorophyta lineage

    Green algae are divided into five classes: Charophyceae, Prasinophyceae, Ulvophyceae, Trebouxiophyceae and Chlorophyceae. To resolve the phylogenetic position of the class Ulvophyceae among these multiple lineages and to gain information on the evolutionary trends of its organelle genomes, I sequenced the chloroplast DNA (cpDNA) and mitochondrial DNA (mtDNA) of the basal ulvophytes Pseudendoclonium akinetum and Oltmannsiellopsis viridis, carried out detailed comparative genomic analyses of chlorophyte cpDNA and mtDNA, and performed in-depth phylogenetic analyses derived from these organelles. The comparative analyses of organelle genomes revealed that their architecture is highly fluid in the Chlorophyta, showing great variability in structure, gene order, gene content, intron content and repeated elements. They also provided indisputable evidence of intracellular, interorganellar transfer of genetic elements in ulvophyte cells. In addition, phylogenetic analyses of the structural and molecular data derived from these organelles strongly support the affiliation between Ulvophyceae and Chlorophyceae.