
60 research outputs found

A New Quartet Tree Heuristic for Hierarchical Clustering

Author: Cilibrasi Rudi
Vitanyi Paul M. B.
Publication date: 01/01/2006

We consider the problem of constructing an optimal-weight tree from the 3*(n choose 4) weighted quartet topologies on n objects, where optimality means that the summed weight of the embedded quartet topologies is optimal (so it can be the case that the optimal tree embeds all quartets as non-optimal topologies). We present a heuristic for reconstructing the optimal-weight tree, and a canonical manner to derive the quartet-topology weights from a given distance matrix. The method repeatedly transforms a bifurcating tree, with all objects involved as leaves, achieving a monotonic approximation to the exact single globally optimal tree. This contrasts with other heuristic search methods from biological phylogeny, like DNAML or quartet puzzling, which repeatedly and incrementally construct a solution from a random order of objects and subsequently add agreement values. Comment: 22 pages, 14 figures
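As a rough illustration of the counting in this abstract, the sketch below enumerates the three possible topologies of every 4-subset of n objects and attaches a weight to each from a distance matrix. The abstract does not state the weight formula, so the rule used here (cost of topology uv|wx is d(u,v) + d(w,x)) is an assumption, and the function name `quartet_topology_costs` and the sample matrix are hypothetical.

```python
from itertools import combinations
from math import comb


def quartet_topology_costs(dist):
    """Map each 4-subset (u, v, w, x) to the costs of its three topologies.

    ASSUMPTION (not stated in the abstract): the cost of topology uv|wx
    is dist[u][v] + dist[w][x].
    """
    n = len(dist)
    costs = {}
    for u, v, w, x in combinations(range(n), 4):
        costs[(u, v, w, x)] = {
            ((u, v), (w, x)): dist[u][v] + dist[w][x],
            ((u, w), (v, x)): dist[u][w] + dist[v][x],
            ((u, x), (v, w)): dist[u][x] + dist[v][w],
        }
    return costs


# Hypothetical symmetric distance matrix on 5 objects.
dist = [
    [0.00, 0.20, 0.90, 0.80, 0.70],
    [0.20, 0.00, 0.85, 0.90, 0.75],
    [0.90, 0.85, 0.00, 0.10, 0.60],
    [0.80, 0.90, 0.10, 0.00, 0.65],
    [0.70, 0.75, 0.60, 0.65, 0.00],
]

costs = quartet_topology_costs(dist)
n_topologies = sum(len(t) for t in costs.values())
# Lowest-cost topology for the quartet {0, 1, 2, 3} under the assumed rule.
best_0123 = min(costs[(0, 1, 2, 3)], key=costs[(0, 1, 2, 3)].get)
```

For n = 5 this yields 5 quartets and 15 topologies, matching 3*(n choose 4).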

arXiv.org e-Print Archive

CiteSeerX

DROPS Dagstuhl Research Online Publication Server

A Fast Quartet Tree Heuristic for Hierarchical Clustering

Author: Cilibrasi Rudi L.
Vitanyi Paul M. B.
Publication date: 12/09/2014

The Minimum Quartet Tree Cost problem is to construct an optimal weight tree from the $3\binom{n}{4}$ weighted quartet topologies on $n$ objects, where optimality means that the summed weight of the embedded quartet topologies is optimal (so it can be the case that the optimal tree embeds all quartets as nonoptimal topologies). We present a Monte Carlo heuristic, based on randomized hill climbing, for approximating the optimal weight tree, given the quartet topology weights. The method repeatedly transforms a dendrogram, with all objects involved as leaves, achieving a monotonic approximation to the exact single globally optimal tree. The problem and the solution heuristic have been extensively used for general hierarchical clustering of nontree-like (non-phylogeny) data in various domains and across domains with heterogeneous data. We also present a greatly improved heuristic, reducing the running time by a factor of order a thousand to ten thousand. All this is implemented and available as part of the CompLearn package. We compare performance and running time of the original and improved versions with those of UPGMA, BioNJ, and NJ, as implemented in the SplitsTree package, on genomic data for which the latter are optimized. Keywords: Data and knowledge visualization, Pattern matching--Clustering--Algorithms/Similarity measures, Hierarchical clustering, Global optimization, Quartet tree, Randomized hill-climbing. Comment: LaTeX, 40 pages, 11 figures; this paper has substantial overlap with arXiv:cs/0606048 in cs.D
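The randomized hill-climbing idea named in this abstract can be sketched generically: propose a random change, keep it only if the score improves, so the best score never decreases. This is not the authors' implementation. The names `hill_climb`, `score` and `mutate` are hypothetical, and the toy state is an integer rather than a dendrogram; in the paper's setting the state would be a tree and the mutation a tree transformation, neither of which is modeled here.

```python
import random


def hill_climb(initial, score, mutate, steps, rng):
    """Generic randomized hill climbing that maximizes `score`.

    A candidate replaces the current best only when it scores strictly
    higher, so the recorded best score is monotonically non-decreasing.
    """
    best, best_score = initial, score(initial)
    history = [best_score]
    for _ in range(steps):
        candidate = mutate(best, rng)
        candidate_score = score(candidate)
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
        history.append(best_score)
    return best, history


# Toy problem (an assumption for illustration): find the integer closest to 3.
rng = random.Random(0)
best, history = hill_climb(
    initial=0,
    score=lambda x: -(x - 3) ** 2,
    mutate=lambda x, r: x + r.choice([-1, 1]),
    steps=200,
    rng=rng,
)
```

Accepting only strict improvements is what gives the monotonic behavior; a real tree search would also need a stopping rule and restarts, which this sketch leaves out.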

arXiv.org e-Print Archive

CiteSeerX

CWI's Institutional Repository

Compressed and Practical Data Structures for Strings

Author: Christiansen Anders Roy
Publication venue: DTU Compute
Publication date: 01/01/2018

Online Research Database In Technology

The development and application of metaheuristics for problems in graph theory: A computational study

Author: Consoli Sergio
Publication venue: Brunel University, School of Information Systems, Computing and Mathematics PhD Theses
Publication date: 01/01/2008

This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. It is known that graph theoretic models have extensive application to real-life discrete optimization problems. Many of these models are NP-hard and, as a result, exact methods may be impractical for large scale problem instances. Consequently, there is great interest in developing efficient approximate methods that yield near-optimal solutions in acceptable computational times. A class of such methods, known as metaheuristics, has been proposed with success. This thesis considers some recently proposed NP-hard combinatorial optimization problems formulated on graphs. In particular, the minimum labelling spanning tree problem, the minimum labelling Steiner tree problem, and the minimum quartet tree cost problem are investigated. Several metaheuristics are proposed for each problem, from classical approximation algorithms to novel approaches. A comprehensive computational investigation is reported in which the proposed methods are compared with other algorithms recommended in the literature. The results show that the proposed metaheuristics outperform the algorithms recommended in the literature, obtaining optimal or near-optimal solutions in short computational running times. In addition, a thorough analysis of the implementation of these methods provides insights for the implementation of metaheuristic strategies for other graph theoretic problems

OpenGrey Repository

Brunel University Research Archive

Faster Algorithms for Computing the Hairpin Completion Distance and Minimum Ancestor

Author: Boneh Itai
Fried Dvir
Miclăuş Adrian
Popa Alexandru
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 34th Annual Symposium on Combinatorial Pattern Matching (CPM 2023)
Publication date: 01/01/2023

DROPS Dagstuhl Research Online Publication Server

A Fast Quartet Tree Heuristic for Hierarchical Clustering

Author: Cilibrasi R. (Rudi)
Vitányi P.M.B. (Paul)
Publication date: 12/09/2014

CWI's Institutional Repository

Faster Path Queries in Colored Trees via Sparse Matrix Multiplication and Min-Plus Product

Author: Gao Younan
He Meng
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 30th Annual European Symposium on Algorithms (ESA 2022)
Publication date: 01/01/2022

DROPS Dagstuhl Research Online Publication Server

A new quartet tree heuristic for hierarchical clustering

Author: Cilibrasi R.L. (Rudi)
Vitányi P.M.B. (Paul)
Publication date: 07/07/2006

We present a new quartet tree heuristic for hierarchical clustering from weighted quartet topologies, and a standard manner to derive those from a given distance matrix. We do not assume that there is a true ternary tree that generated the quartet topologies or distances which we wish to recover as closely as possible. Our aim is to just model the input data as faithfully as possible by the quartet tree. Our method is capable of handling up to 60–80 objects in a matter of hours, while no existing quartet heuristic can directly compute a quartet tree of more than about 20–30 objects without running for years. The method is implemented and available as public software

CWI's Institutional Repository

How Fitch-Margoliash Algorithm can Benefit from Multi Dimensional Scaling

Author: Hitchcock E.
Darwin C.
Edwards A.W.F.
Sneath P.H.A.
Saitou N.
Salemi M.
Lespinats S.
Jolliffe I.
Kuhner M.K.
Zaretsky K.
Cavalli-Sforza L.L.
Matsuda H.
Swofford D.L.
Li J.
Press W.H.
Glover F.
Goldberg D.E.
Reeves C.R.
Dowsland K.A.
Chalmers M.
Gromov M.
Milman V.D.
Bulmer M.
Demartines P.
Fleiss J.L.
Publication venue: Libertas Academica
Publication date: 01/01/2011

Whatever the phylogenetic method, genetic sequences are often described as strings of characters; molecular sequences can thus be viewed as elements of a multi-dimensional space. As a consequence, studying motion in this space (i.e., the evolutionary process) must deal with the surprising features of high-dimensional spaces, such as the concentration of measure phenomenon

Crossref

Hal - Université Grenoble Alpes

Directory of Open Access Journals

INRIA a CCSD electronic archive server

PubMed Central

Warwick Research Archives Portal Repository

Online Research Database In Technology