SunStar: an implementation of the generalized STAR method

Author: Bettisworth Benjamin
Publication date: 01/05/2017

Master's Project (M.S.) University of Alaska Fairbanks, 2017. STAR ... is a method of computing species trees from gene trees. Later, STAR was generalized and proven to be statistically consistent given a few conditions (Allman, Degnan, and Rhodes 2013). Using these conditions, it is possible to investigate robustness in the species tree inference process, the lack of which will produce instabilities in the tree resulting from STAR. We have developed a software package, called SunStar, that estimates support for inferred trees.
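A minimal sketch of the rank-based distance step that STAR is usually described as using, for illustration only. It is not SunStar's code. It assumes that each gene tree is a rooted nested tuple of taxon names, that the root gets rank n (the number of taxa) with the rank dropping by one per level, and that the distance between two taxa is twice the average rank of their most recent common ancestor across gene trees. The function names are hypothetical, and the final tree-building step (for example neighbor joining on these distances) is omitted.

```python
from itertools import combinations

def leaves(tree):
    """Return the set of leaf names under a nested-tuple tree."""
    if isinstance(tree, str):
        return {tree}
    return set().union(*(leaves(child) for child in tree))

def mrca_ranks(tree, rank, out):
    """Record, for each pair of leaves, the rank of their most recent common ancestor."""
    if isinstance(tree, str):
        return
    child_leaf_sets = [leaves(child) for child in tree]
    # Leaves that sit under different children of this node have it as their MRCA.
    for a_set, b_set in combinations(child_leaf_sets, 2):
        for a in a_set:
            for b in b_set:
                out[frozenset((a, b))] = rank
    # Assumed ranking scheme: each level below the root is one rank lower.
    for child in tree:
        mrca_ranks(child, rank - 1, out)

def star_distances(gene_trees):
    """Twice the MRCA rank of each taxon pair, averaged over all gene trees."""
    taxa = sorted(leaves(gene_trees[0]))
    n = len(taxa)
    totals = {frozenset(pair): 0 for pair in combinations(taxa, 2)}
    for tree in gene_trees:
        ranks = {}
        mrca_ranks(tree, n, ranks)  # the root receives rank n
        for pair, r in ranks.items():
            totals[pair] += r
    return {pair: 2 * total / len(gene_trees) for pair, total in totals.items()}

gene_trees = [(("A", "B"), "C"), (("A", "C"), "B")]
distances = star_distances(gene_trees)
```

With these two conflicting gene trees, the pairs A-B and A-C end up at the same distance, so a tree built from the matrix has nothing to separate the two groupings. That is the kind of instability a support estimate is meant to expose.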

ScholarWorks@UA

On the accuracy of language trees

Author: A Rambaut
C Christensen
CH Langley
D Bakker
D Bryant
D Robinson
EW Holmann
F Petroni
F Tria
Francesca Tria
H Kishino
J Nerbonne
JL Thorne
M Dunn
M Randers
M Serva
M Swadesh
Matjaz Perc
MJ Sanderson
N Saitou
PMQ Atkinson
Q Atkinson
R Desper
RD Gray
S Pompei
S Wichmann
Simone Pompei
SJ Greenhill
VI Levenshtein
Vittorio Loreto
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2011

Historical linguistics aims at inferring the most likely language phylogenetic tree starting from information concerning the evolutionary relatedness of languages. The available information typically consists of lists of homologous (lexical, phonological, syntactic) features or characters for many different languages. From this perspective the reconstruction of language trees is an example of an inverse problem: starting from present, incomplete and often noisy information, one aims at inferring the most likely past evolutionary history. A fundamental issue in inverse problems is the evaluation of the inference made. A standard way of dealing with this question is to generate data with artificial models in order to have full access to the evolutionary process one is going to infer. This procedure presents an intrinsic limitation: when dealing with real data sets, one typically does not know which model of evolution is the most suitable for them. A possible way out is to compare algorithmic inference with expert classifications. This is the point of view we take here by conducting a thorough survey of the accuracy of reconstruction methods as compared with the Ethnologue expert classifications. We focus in particular on state-of-the-art distance-based methods for phylogeny reconstruction using worldwide linguistic databases. In order to assess the accuracy of the inferred trees we introduce and characterize two generalizations of standard definitions of distances between trees. Based on these scores we quantify the relative performances of the distance-based algorithms considered. Further we quantify how the completeness and the coverage of the available databases affect the accuracy of the reconstruction. Finally we draw some conclusions about where the accuracy of the reconstructions in historical linguistics stands and about the leading directions to improve it. Comment: 36 pages, 14 figures
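For orientation, one standard distance between trees is the Robinson-Foulds distance, which counts the groupings found in one tree but not the other. The sketch below is the rooted, clade-based form and is illustrative only. The abstract does not specify the two generalizations the paper introduces, so they are not reproduced here. The nested-tuple tree encoding and the function names are assumptions.

```python
def collect_clades(tree, out):
    """Add the leaf set of every internal node to `out`; return this subtree's leaf set."""
    if isinstance(tree, str):
        return frozenset([tree])
    leaf_set = frozenset().union(*(collect_clades(child, out) for child in tree))
    out.add(leaf_set)
    return leaf_set

def rf_distance(tree_a, tree_b):
    """Number of clades present in exactly one of the two rooted trees."""
    clades_a, clades_b = set(), set()
    collect_clades(tree_a, clades_a)
    collect_clades(tree_b, clades_b)
    # Symmetric difference: clades unique to either tree.
    return len(clades_a ^ clades_b)

inferred = ((("A", "B"), "C"), "D")
expert = ((("A", "C"), "B"), "D")
score = rf_distance(inferred, expert)
```

Here the two trees disagree only on whether A groups first with B or with C, so each contributes one clade the other lacks.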

arXiv.org e-Print Archive

Public Library of Science (PLOS)

CiteSeerX

Crossref

Directory of Open Access Journals

PubMed Central

Archivio della ricerca- Università di Roma La Sapienza

Inferring ancestral sequences in taxon-rich phylogenies

Author: Gascuel Olivier
Steel Mike
Publication date: 01/01/2010

Statistical consistency in phylogenetics has traditionally referred to the accuracy of estimating phylogenetic parameters for a fixed number of species as we increase the number of characters. However, since sequences are often of fixed length (e.g. for a gene) while we are often able to sample more taxa, it is useful to consider a dual type of statistical consistency where we increase the number of species, rather than characters. This raises some basic questions: what can we learn about the evolutionary process as we increase the number of species? In particular, does having more species allow us to infer the ancestral state of characters accurately? This question is particularly relevant when sequence site evolution varies in a complex way from character to character, as well as for reconstructing ancestral sequences. In this paper, we assemble a collection of results to analyse various approaches for inferring ancestral information with increasing accuracy as the number of taxa increases. Comment: 32 pages, 5 figures, 1 table
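A toy calculation can make the "more taxa" question concrete. The setting is deliberately simplified and is not taken from the paper: a star tree in which each of n leaves independently shows the root's state of a two-state character with probability p_match, and the root state is estimated by majority vote over the leaves. The function name and parameters are hypothetical.

```python
from math import comb

def majority_accuracy(n_taxa, p_match):
    """P(majority of leaf states equals the root state), ties broken by a fair coin."""
    total = 0.0
    for k in range(n_taxa + 1):
        # Probability that exactly k of the n leaves carry the root's state.
        prob_k = comb(n_taxa, k) * p_match**k * (1 - p_match) ** (n_taxa - k)
        if 2 * k > n_taxa:
            total += prob_k        # clear majority for the true state
        elif 2 * k == n_taxa:
            total += 0.5 * prob_k  # tie: guess correctly half the time
    return total

accuracy_by_n = {n: majority_accuracy(n, 0.6) for n in (1, 3, 11, 101)}
```

Under this independence assumption the accuracy rises toward 1 as n grows whenever p_match exceeds 1/2. On real trees the leaves share branches and are correlated, so this simple argument does not carry over directly.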

arXiv.org e-Print Archive

CiteSeerX