417 research outputs found

    Phylogeny of Prokaryotes and Chloroplasts Revealed by a Simple Composition Approach on All Protein Sequences from Complete Genomes Without Sequence Alignment

    The complete genomes of living organisms have provided much information on their phylogenetic relationships. Similarly, the complete genomes of chloroplasts have helped to resolve the evolution of this organelle in photosynthetic eukaryotes. In this paper we propose an alternative method of phylogenetic analysis using compositional statistics for all protein sequences from complete genomes. This new method is conceptually simpler than, and computationally as fast as, the one proposed by Qi et al. (2004b) and Chu et al. (2004). The same data sets used in Qi et al. (2004b) and Chu et al. (2004) are analyzed using the new method. Our distance-based phylogenetic tree of the 109 prokaryotes and eukaryotes agrees with the biologists' tree of life based on 16S rRNA comparison in a predominant majority of basic branching and most lower taxa. Our phylogenetic analysis also shows that the chloroplast genomes are separated into two major clades corresponding to chlorophytes s.l. and rhodophytes s.l. The interrelationships among the chloroplasts are largely in agreement with the current understanding of chloroplast evolution.
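As a rough illustration of what "compositional statistics" over protein sequences can look like, the sketch below builds a relative-frequency vector of all length-k strings for a set of proteins and compares two such vectors. This is not the method of the paper: the function names, the choice of k, the handling of non-standard residues, and the cosine-based dissimilarity are all assumptions made only for this example.

```python
from collections import Counter
from itertools import product
import math

# The 20 standard amino acids; k-strings containing other letters are skipped (an assumption).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition_vector(proteins, k=2):
    """Relative frequencies of all length-k strings across a list of protein sequences."""
    alphabet = set(AMINO_ACIDS)
    counts = Counter()
    for seq in proteins:
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if set(kmer) <= alphabet:
                counts[kmer] += 1
    total = sum(counts.values())
    # Fixed key order so vectors from different genomes are comparable.
    keys = ("".join(p) for p in product(AMINO_ACIDS, repeat=k))
    return [counts[key] / total if total else 0.0 for key in keys]


def cosine_dissimilarity(u, v):
    """1 - cos(u, v); an illustrative choice, not the distance defined in the paper."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


if __name__ == "__main__":
    genome_a = ["MKVLAAGIVG", "MKTAYIAKQR"]  # toy sequences, not real data
    genome_b = ["MKVLAAGLVG", "MSTAYIAKQR"]
    va = composition_vector(genome_a)
    vb = composition_vector(genome_b)
    print(round(cosine_dissimilarity(va, vb), 3))
```

A matrix of such pairwise dissimilarities could then be passed to any distance-based tree-building method; no alignment step is involved at any point.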

    A Mutual Information Based Sequence Distance For Vertebrate Phylogeny Using Complete Mitochondrial Genomes

    Traditional sequence distances require alignment. A new mutual information based sequence distance without alignment is defined in this paper. This distance is based on compositional vectors of DNA sequences or protein sequences from complete genomes. First we establish the mathematical foundation of this distance. Then this distance is applied to analyze the phylogenetic relationship of 64 vertebrates using complete mitochondrial genomes. The phylogenetic tree shows that the mitochondrial genomes are separated into three major groups. One group corresponds to mammals; one group corresponds to fish; and the last one is Archosauria (including birds and reptiles). The topology of the tree based on our new distance is roughly in agreement with the currently known phylogenies of vertebrates.

    Proving universal common ancestry with similar sequences

    Douglas Theobald recently developed an interesting test putatively capable of quantifying the evidence for a Universal Common Ancestry uniting the three domains of life (Eukarya, Archaea and Bacteria) against hypotheses of Independent Origins for some of these domains. We review here his model, in particular in relation to the treatment of Horizontal Gene Transfer and to the quality of sequence alignment.

    Taking the First Steps towards a Standard for Reporting on Phylogenies: Minimum Information about a Phylogenetic Analysis (MIAPA)

    In the eight years since phylogenomics was introduced as the intersection of genomics and phylogenetics, the field has provided fundamental insights into gene function, genome history and organismal relationships. The utility of phylogenomics is growing with the increase in the number and diversity of taxa for which whole genome and large transcriptome sequence sets are being generated. We assert that the synergy between genomic and phylogenetic perspectives in comparative biology would be enhanced by the development and refinement of minimal reporting standards for phylogenetic analyses. Encouraged by the development of the Minimum Information About a Microarray Experiment (MIAME) standard, we propose a similar roadmap for the development of a Minimal Information About a Phylogenetic Analysis (MIAPA) standard. Key in the successful development and implementation of such a standard will be broad participation by developers of phylogenetic analysis software, phylogenetic database developers, practitioners of phylogenomics, and journal editors. This paper is part of the special issue of OMICS on data standards. Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/63208/1/omi.2006.10.231.pd

    Studying Evolutionary Change: Transdisciplinary Advances in Understanding and Measuring Evolution

    Evolutionary processes can be found in almost any historical, i.e. evolving, system that erroneously copies from the past. Well-studied examples originate not only in evolutionary biology but also in historical linguistics. Yet an approach that would bind together studies of such evolving systems is still elusive. This thesis is an attempt to narrow this gap to some extent. An evolving system can be described using characters that identify its changing features. While the problem of a proper choice of characters is beyond the scope of this thesis and remains in the hands of experts, we concern ourselves with some theoretical as well as data-driven approaches. Given a well-chosen set of characters describing a system of different entities such as homologous genes, i.e. genes of the same origin in different species, we can build a phylogenetic tree. Consider the special case of gene clusters containing paralogous genes, i.e. genes of the same origin within a species, usually located close together, such as the well-known HOX cluster. These clusters are formed by stepwise duplication of their members, often involving unequal crossing over that forms hybrid genes. Gene conversion and possibly other mechanisms of concerted evolution further obfuscate phylogenetic relationships. Hence, it is very difficult or even impossible to disentangle the detailed history of gene duplications in gene clusters. The expansion of gene clusters by unequal crossing over, as proposed by Walter Gehring, leads to distinctive patterns of genetic distances. We show that this special class of distances still helps in extracting phylogenetic information from the data. Disregarding genome rearrangements, we find that the shortest Hamiltonian path then coincides with the ordering of paralogous genes in a cluster. 
This observation can be used to detect ancient genomic rearrangements of gene clusters and to distinguish gene clusters whose evolution was dominated by unequal crossing over within genes from those that expanded through other mechanisms. While the evolution of DNA or protein sequences is well studied and can be formally described, we find that this does not hold for other systems such as language evolution. This is due to a lack of detectable mechanisms that drive the evolutionary processes in other fields. Hence, it is hard to quantify distances between entities, e.g. languages, and therefore the characters describing them. Starting out with distortions of distances, we first see that poor choices of the distance measure can lead to incorrect phylogenies. Given that phylogenetic inference requires additive metrics, we can infer the correct phylogeny from a distance matrix D if there is a monotonic, subadditive function ζ such that ζ^−1(D) is additive. We compute the metric-preserving transformation ζ as the solution of an optimization problem. This result shows that the problem of phylogeny reconstruction is well defined even if a detailed mechanistic model of the evolutionary process is missing. Yet, this does not hinder studies of language evolution using automated tools. As the number of large digital corpora available increased, so did the possibilities to study them automatically. The obvious parallels between historical linguistics and phylogenetics have led to many studies adapting bioinformatics tools to linguistic ends. Here, we use jAlign to calculate bigram alignments, i.e. alignments computed by an algorithm that takes the adjacency of letters into account. Its performance is tested in different cognate recognition tasks. A major obstacle with pairwise alignments is the systematic errors they make, such as underestimating and misplacing gaps. 
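The additivity requirement mentioned above can be checked with the four-point condition, a standard characterization of tree metrics: for every four taxa, the two largest of the three pairwise-sum combinations must be equal. The sketch below is only an illustration under assumptions; the function name, the tolerance, the toy tree, and the use of the square root as an example of a monotonic, subadditive distortion ζ are not taken from the thesis.

```python
from itertools import combinations
import math


def is_additive(D, tol=1e-9):
    """Four-point condition on a symmetric distance matrix D (list of lists)."""
    n = len(D)
    for i, j, k, l in combinations(range(n), 4):
        sums = sorted([
            D[i][j] + D[k][l],
            D[i][k] + D[j][l],
            D[i][l] + D[j][k],
        ])
        # The two largest sums must coincide for a tree metric.
        if sums[2] - sums[1] > tol:
            return False
    return True


if __name__ == "__main__":
    # Path lengths on a toy tree ((a:1, b:2):1, (c:3, d:4)) with one internal edge of length 1.
    tree = [
        [0, 3, 5, 6],
        [3, 0, 6, 7],
        [5, 6, 0, 7],
        [6, 7, 7, 0],
    ]
    # Hypothetical distortion zeta(x) = sqrt(x): monotonic and subadditive.
    distorted = [[math.sqrt(x) for x in row] for row in tree]
    # Undoing the distortion with zeta^-1(x) = x**2 restores additivity.
    restored = [[x ** 2 for x in row] for row in distorted]
    print(is_additive(tree), is_additive(distorted), is_additive(restored))
```

In this toy case the distortion is known in advance; the thesis instead describes computing ζ as the solution of an optimization problem, which this sketch does not attempt.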
Applying multiple sequence alignments instead of a pairwise algorithm implicitly includes more evolutionary information and thus can overcome the problem of correct gap placement. They can be seen as a generalization of the string-to-string edit problem to more than two strings. With the steady increase in computational power, exact dynamic programming solutions have also become feasible in practice for 3- and 4-way alignments. For the pairwise (2-way) case, there is a clear distinction between local and global alignments. As more sequences are considered, this distinction, which can in fact be made independently for both ends of each sequence, gives rise to a rich set of partially local alignment problems. So far these have remained largely unexplored. Thus, a general formal framework that gives rise to a classification of partially local alignment problems is introduced. It leads to a generic scheme that guides the principled design of exact dynamic programming solutions for particular partially local alignment problems.
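For reference, the pairwise string-to-string edit problem that the abstract takes as its starting point has a textbook dynamic programming solution. The sketch below assumes unit costs for insertion, deletion, and substitution and covers only the global 2-way case; it is not the thesis's framework for 3- and 4-way or partially local alignments, and the function name is chosen for this example.

```python
def edit_distance(a, b):
    """Global string-to-string edit distance with unit costs (assumed scoring)."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # match or substitute
            ))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(edit_distance("kitten", "sitting"))  # 3
```

Extending the table from two to N sequences multiplies the number of cells by roughly one sequence length per added sequence, which is why exact solutions are only practical for small N.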

    Precise phylogenetic analysis of microbial isolates and genomes from metagenomes using PhyloPhlAn 3.0

    Microbial genomes are available at an ever-increasing pace, as cultivation and sequencing become cheaper and obtaining metagenome-assembled genomes (MAGs) becomes more effective. Phylogenetic placement methods to contextualize hundreds of thousands of genomes must thus be efficiently scalable and sensitive from closely related strains to divergent phyla. We present PhyloPhlAn 3.0, an accurate, rapid, and easy-to-use method for large-scale microbial genome characterization and phylogenetic analysis at multiple levels of resolution. PhyloPhlAn 3.0 can assign genomes from isolate sequencing or MAGs to species-level genome bins built from >230,000 publicly available sequences. For individual clades of interest, it reconstructs strain-level phylogenies from among the closest species using clade-specific maximally informative markers. At the other extreme of resolution, it scales to large phylogenies comprising >17,000 microbial species. Examples including Staphylococcus aureus isolates, gut metagenomes, and meta-analyses demonstrate the ability of PhyloPhlAn 3.0 to support genomic and metagenomic analyses.

    “Frauds and fish species authentication: study of the complete mitochondrial genome of some Sparidae species to provide specific barcode markers”

    The growing global diffusion of seafood for human consumption requires ever more accurate sanitary and quality controls. Among the globally marketed fish, the species belonging to the family Sparidae are excellent food fishes of high economic value. The Sparidae family comprises about 38 genera and 159 species, some of which are highly appreciated as seafood. In Italy, the fish species of commercial interest are defined by law (DM n. 19105 of 22 September 2017). Among them, 41 species belong to the Sparidae family. These species are difficult to identify even when external characters are preserved, owing to their morphological similarity. Species identification becomes even harder after industrial processing, when distinctive external traits are removed. Despite their similarity, sparid species differ in organoleptic quality, which corresponds to variable prices in fish markets. Consequently, substitution of commercially important Sparidae species is common. The sparid species with the highest commercial value, Dentex, is often replaced with less expensive species or lower quality alternatives. Given all this, Sparidae species identification requires immediate solutions. Research on fish mitochondrial DNA (mtDNA, mitogenome) has led to substantial advances in the fields of species authentication and population biology. The main species-specific DNA sequences used as markers for fish species identification belong to the mitochondrial genes encoding ribosomal 16S and 12S subunits, cytochrome b (Cytb), and cytochrome c oxidase I (COI). However, current research shows that mitochondrial DNA markers perform well for certain species but may be less discriminating for others. Therefore, they cannot be used indiscriminately for the identification of all fish species. 
This situation has led to the proposal to study and analyze the complete mtDNA sequence in order to identify mitochondrial markers, or multiple-marker approaches, with higher and more specific discrimination capacity. At present, mtDNA genomes of Sparidae fishes are not well represented in the scientific literature, which makes it difficult to understand whether currently used genetic markers are the most effective for sparid species identification. This thesis project aimed to study and analyze the complete mtDNA sequence of sparids to identify mitochondrial markers or a multiple-marker approach with higher and more specific discrimination capacity. We focused on the genera Dentex, Pagellus and Pagrus, listed in the Ministerial Decree.