1,529 research outputs found

    A Fast Quartet Tree Heuristic for Hierarchical Clustering

    The Minimum Quartet Tree Cost problem is to construct an optimal-weight tree from the $3{n \choose 4}$ weighted quartet topologies on $n$ objects, where optimality means that the summed weight of the embedded quartet topologies is optimal (so it can be the case that the optimal tree embeds all quartets as nonoptimal topologies). We present a Monte Carlo heuristic, based on randomized hill climbing, for approximating the optimal-weight tree, given the quartet topology weights. The method repeatedly transforms a dendrogram, with all objects involved as leaves, achieving a monotonic approximation to the exact single globally optimal tree. The problem and the solution heuristic have been extensively used for general hierarchical clustering of nontree-like (non-phylogeny) data in various domains and across domains with heterogeneous data. We also present a greatly improved heuristic, reducing the running time by a factor of order a thousand to ten thousand. All this is implemented and available as part of the CompLearn package. We compare performance and running time of the original and improved versions with those of UPGMA, BioNJ, and NJ, as implemented in the SplitsTree package, on genomic data for which the latter are optimized. Keywords: Data and knowledge visualization, Pattern matching--Clustering--Algorithms/Similarity measures, Hierarchical clustering, Global optimization, Quartet tree, Randomized hill-climbing. Comment: LaTeX, 40 pages, 11 figures; this paper has substantial overlap with arXiv:cs/0606048 in cs.D

    A New Quartet Tree Heuristic for Hierarchical Clustering

    We consider the problem of constructing an optimal-weight tree from the 3*(n choose 4) weighted quartet topologies on n objects, where optimality means that the summed weight of the embedded quartet topologies is optimal (so it can be the case that the optimal tree embeds all quartets as non-optimal topologies). We present a heuristic for reconstructing the optimal-weight tree, and a canonical manner to derive the quartet-topology weights from a given distance matrix. The method repeatedly transforms a bifurcating tree, with all objects involved as leaves, achieving a monotonic approximation to the exact single globally optimal tree. This contrasts with other heuristic search methods from biological phylogeny, like DNAML or quartet puzzling, which repeatedly and incrementally construct a solution from a random order of objects, and subsequently add agreement values. Comment: 22 pages, 14 figures
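
    As a rough illustration of the counts in the abstract above, the following sketch enumerates the 3*(n choose 4) quartet topologies on n objects and attaches a cost to each one from a distance matrix. The cost rule used here, cost(ab|cd) = d(a,b) + d(c,d), is an assumption made for the example (the abstract only says that a canonical derivation exists), and the function names and the toy matrix are hypothetical.

```python
from itertools import combinations
from math import comb


def quartet_topologies(labels):
    """Yield the three pairings ab|cd, ac|bd, ad|bc for every 4-subset of labels."""
    for a, b, c, d in combinations(labels, 4):
        yield (a, b), (c, d)
        yield (a, c), (b, d)
        yield (a, d), (b, c)


def quartet_costs(dist):
    """Map each topology ab|cd to the ASSUMED cost d(a,b) + d(c,d)."""
    n = len(dist)
    return {
        (p, q): dist[p[0]][p[1]] + dist[q[0]][q[1]]
        for p, q in quartet_topologies(range(n))
    }


# Toy symmetric distance matrix (made up): objects 0,1 are close, as are 2,3.
dist = [
    [0.0, 0.1, 0.8, 0.9],
    [0.1, 0.0, 0.7, 0.8],
    [0.8, 0.7, 0.0, 0.2],
    [0.9, 0.8, 0.2, 0.0],
]

costs = quartet_costs(dist)
# The cheapest topology groups the two close pairs together.
best = min(costs, key=costs.get)
# Sanity check on the count for a larger label set.
count_for_six = sum(1 for _ in quartet_topologies(range(6)))
```

    With four objects there is a single 4-subset and therefore three candidate topologies; a search heuristic of the kind described would score a whole tree by summing the costs of the topologies it embeds.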

    Coevolutionary history of ecological replicates: comparing phylogenies of wing and body lice to Columbiform hosts

    Book chapter. Phylogenies depict the history of speciation for groups of organisms. Comparing the phylogenies of interacting groups can reveal instances of tandem speciation, or "cospeciation" (Brooks and McLennan, 1991; Hoberg et al., 1997; Paterson and Gray, 1997). Understanding the conditions under which cospeciation takes place is a challenging task. In the case of hosts and their parasites, cospeciation occurs when isolation of host populations also isolates the parasites on those hosts. Patterns of cospeciation can break down owing to dispersal of parasites among host populations, sympatric speciation of parasites on a single host population, or extinction of parasites on a host population (Page and Charleston, 1998). All else being equal, ecologically similar parasites living on the same host should respond to isolation of host populations in the same way, yielding similar coevolutionary histories. In this chapter we compare cospeciation events in two such "replicate" groups of lice living on the same hosts. If forces promoting speciation, such as host speciation, act on these parasites in similar ways, then we would expect cospeciation events to be correlated between these parasite groups. On the other hand, if the parasites respond to isolation differently, then cospeciation events should be independent in the two groups.

    Phylogenetic signal and the utility of 12S and 16S mtDNA in frog phylogeny

    Genes selected for a phylogenetic study need to contain conserved information that reflects the phylogenetic history at the specific taxonomic level of interest. Mitochondrial ribosomal genes have been used for a wide range of phylogenetic questions in general and in anuran systematics in particular. We checked the plausibility of phylogenetic reconstructions in anurans that were built from commonly used 12S and 16S rRNA gene sequences. For up to 27 species arranged in taxon sets of graded inclusiveness, we inferred phylogenetic hypotheses based on different a priori decisions, i.e., choice of alignment method and alignment parameters, including/excluding variable sites, choice of reconstruction algorithm and models of evolution. Alignment methods and parameters, as well as taxon sampling, all had notable effects on the results, leading to a large number of conflicting topologies. Very few nodes were supported in all of the analyses. Data sets in which fast-evolving and ambiguously aligned sites had been excluded performed worse than the complete data sets. There was moderate support for the monophyly of the Discoglossidae, Pelobatoidea, Pelobatidae and Pipidae. The clade Neobatrachia was robustly supported, and the intrageneric relationships within Bombina and Discoglossus were well resolved, indicating the usefulness of the genes for relatively recent phylogenetic events. Although 12S and 16S rRNA genes seem to carry some phylogenetic signal of deep (Mesozoic) splitting events, the signal was not strong enough to consistently resolve the inter-relationships of major clades within the Anura under varied methods and parameter settings.

    Bilaterian Phylogeny Based on Analyses of a Region of the Sodium-potassium ATPase alpha-subunit Gene

    Molecular investigations of deep-level relationships within and among the animal phyla have been hampered by a lack of slowly evolving genes that are amenable to study by molecular systematists. To provide new data for use in deep-level metazoan phylogenetic studies, primers were developed to amplify a 1.3-kb region of the alpha subunit of the nuclear-encoded sodium-potassium ATPase gene from 31 bilaterians representing several phyla. Maximum parsimony, maximum likelihood, and Bayesian analyses of these sequences (combined with ATPase sequences for 23 taxa downloaded from GenBank) yield congruent trees that corroborate recent findings based on analyses of other data sets (e.g., the 18S ribosomal RNA gene). The ATPase-based trees support monophyly for several clades (including Lophotrochozoa, a form of Ecdysozoa, Vertebrata, Mollusca, Bivalvia, Gastropoda, Arachnida, Hexapoda, Coleoptera, and Diptera) but do not support monophyly for Deuterostomia, Arthropoda, or Nemertea. Parametric bootstrapping tests reject monophyly for Arthropoda and Nemertea but are unable to reject deuterostome monophyly. Overall, the sodium-potassium ATPase alpha-subunit gene appears to be useful for deep-level studies of metazoan phylogeny