
    Greedy Selection of Species for Ancestral State Reconstruction on Phylogenies: Elimination Is Better than Insertion

    Accurate reconstruction of ancestral character states on a phylogeny is crucial in many genomics studies. We study how to select species to achieve the best reconstruction of ancestral character states on a phylogeny. We first show that the marginal maximum likelihood method has the monotonicity property that more taxa give better reconstruction, whereas the Fitch method lacks this property even on an ultrametric phylogeny. We then validate a greedy approach to species selection using simulation. The validation tests indicate that backward greedy selection (elimination) outperforms forward greedy selection (insertion). In addition, by applying our selection strategy, we obtain a set of the ten most informative species for reconstructing the genomic sequence of the so-called boreoeutherian ancestor of placental mammals. This study has broad relevance in comparative genomics and paleogenomics, since limited research resources do not allow researchers to sequence the large number of descendant species required to reconstruct an ancestral sequence.
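    The contrast between backward (elimination) and forward (insertion) greedy selection can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the species list, and `toy_score` are all hypothetical. In particular, `toy_score` is only a stand-in for whatever estimate of reconstruction accuracy is used in practice (for example, one obtained by simulation on the phylogeny).

```python
from typing import Callable, FrozenSet, Iterable, List

# A scoring function maps a set of species to an accuracy estimate (higher is better).
Score = Callable[[FrozenSet[str]], float]


def backward_greedy(species: Iterable[str], k: int, score: Score) -> List[str]:
    """Start from all species and repeatedly eliminate the one whose
    removal leaves the highest-scoring subset, until k remain."""
    selected = set(species)
    while len(selected) > k:
        to_drop = max(sorted(selected),
                      key=lambda s: score(frozenset(selected - {s})))
        selected.remove(to_drop)
    return sorted(selected)


def forward_greedy(species: Iterable[str], k: int, score: Score) -> List[str]:
    """Start from the empty set and repeatedly insert the species that
    yields the highest-scoring subset, until k are selected."""
    remaining = set(species)
    selected: set = set()
    while len(selected) < k and remaining:
        to_add = max(sorted(remaining),
                     key=lambda s: score(frozenset(selected | {s})))
        selected.add(to_add)
        remaining.remove(to_add)
    return sorted(selected)


# Hypothetical toy score: each species contributes a made-up weight, and
# holding both "mouse" and "rat" incurs a redundancy penalty.
WEIGHTS = {"human": 3.0, "mouse": 2.0, "dog": 2.5, "cow": 1.0, "rat": 1.5}


def toy_score(subset: FrozenSet[str]) -> float:
    total = sum(WEIGHTS[s] for s in subset)
    if {"mouse", "rat"} <= subset:
        total -= 1.0
    return total


if __name__ == "__main__":
    print(backward_greedy(WEIGHTS, 3, toy_score))
    print(forward_greedy(WEIGHTS, 3, toy_score))
```

    On this toy score the two strategies happen to select the same three species; the sketch only shows the mechanics of each procedure and says nothing about which performs better on real phylogenies.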

    More Taxa Are Not Necessarily Better for the Reconstruction of Ancestral Character States

    We show that the accuracy of reconstructing an ancestral state is not an increasing function of the size of taxon sampling. Comment: 21 pages

    Evolutionary modes in protein observable space: the case of Thioredoxins

    In this article, we investigated the structural and dynamical evolutionary behaviour of a set of ten thioredoxin proteins, comprising three extant forms and seven forms resurrected in the laboratory. Starting from the crystallographic structures, we performed all-atom molecular dynamics simulations and compared the trajectories in terms of structural and dynamical properties. Interestingly, the structural properties related to protein density (i.e. the number of residues divided by the excluded molecular volume) describe the protein evolutionary behaviour well. Our results also suggest that the sequence changes that occurred during evolution have affected the proteins' essential motions, allowing us to discriminate between ancient and extant proteins in terms of their dynamical behaviour. These results are even more evident when the bacterial, archaeal and eukaryotic thioredoxins are analysed separately.

    BOOL-AN: A method for comparative sequence analysis and phylogenetic reconstruction

    A novel discrete mathematical approach is proposed as an additional tool for molecular systematics; it does not require prior statistical assumptions concerning the evolutionary process. The method is based on algorithms that generate mathematical representations directly from DNA/RNA or protein sequences and then output numerical (scalar or vector) and visual (graph) characteristics. The binary-encoded sequence information is transformed into a compact analytical form, called the Iterative Canonical Form (ICF) of Boolean functions, which can then be used as a generalized molecular descriptor. The method provides raw vector data for calculating different distance matrices, which in turn can be analyzed by neighbor-joining or UPGMA to derive a phylogenetic tree, or by principal coordinates analysis to obtain an ordination scattergram. The new method and the associated software for inferring phylogenetic trees are called Boolean analysis, or BOOL-AN.