Search CORE

573 research outputs found

A New Quartet Tree Heuristic for Hierarchical Clustering

Author: Cilibrasi Rudi
Vitanyi Paul M. B.
Publication venue
Publication date: 01/01/2006
Field of study

We consider the problem of constructing an an optimal-weight tree from the 3*(n choose 4) weighted quartet topologies on n objects, where optimality means that the summed weight of the embedded quartet topologiesis optimal (so it can be the case that the optimal tree embeds all quartets as non-optimal topologies). We present a heuristic for reconstructing the optimal-weight tree, and a canonical manner to derive the quartet-topology weights from a given distance matrix. The method repeatedly transforms a bifurcating tree, with all objects involved as leaves, achieving a monotonic approximation to the exact single globally optimal tree. This contrasts to other heuristic search methods from biological phylogeny, like DNAML or quartet puzzling, which, repeatedly, incrementally construct a solution from a random order of objects, and subsequently add agreement values.Comment: 22 pages, 14 figure

arXiv.org e-Print Archive

CiteSeerX

Dagstuhl Research Online Publication Server

A Fast Quartet Tree Heuristic for Hierarchical Clustering

Author: Cilibrasi Rudi L.
Vitanyi Paul M. B.
Publication venue
Publication date: 12/09/2014
Field of study

The Minimum Quartet Tree Cost problem is to construct an optimal weight tree from the

3{n \choose 4}

weighted quartet topologies on

n

objects, where optimality means that the summed weight of the embedded quartet topologies is optimal (so it can be the case that the optimal tree embeds all quartets as nonoptimal topologies). We present a Monte Carlo heuristic, based on randomized hill climbing, for approximating the optimal weight tree, given the quartet topology weights. The method repeatedly transforms a dendrogram, with all objects involved as leaves, achieving a monotonic approximation to the exact single globally optimal tree. The problem and the solution heuristic has been extensively used for general hierarchical clustering of nontree-like (non-phylogeny) data in various domains and across domains with heterogeneous data. We also present a greatly improved heuristic, reducing the running time by a factor of order a thousand to ten thousand. All this is implemented and available, as part of the CompLearn package. We compare performance and running time of the original and improved versions with those of UPGMA, BioNJ, and NJ, as implemented in the SplitsTree package on genomic data for which the latter are optimized. Keywords: Data and knowledge visualization, Pattern matching--Clustering--Algorithms/Similarity measures, Hierarchical clustering, Global optimization, Quartet tree, Randomized hill-climbing,Comment: LaTeX, 40 pages, 11 figures; this paper has substantial overlap with arXiv:cs/0606048 in cs.D

arXiv.org e-Print Archive

CiteSeerX

CWI's Institutional Repository

Recommended from our members

Metaheuristic approaches for the quartet method of hierarchical clustering

Author: Consoli S
Darby-Dowman K
Geleijnse G
Korst J
Pauws S
Publication venue
Publication date: 01/01/2008
Field of study

Given a set of objects and their pairwise distances, we wish to determine a visual representation of the data. We use the quartet paradigm to compute a hierarchy of clusters of the objects. The method is based on an NP-hard graph optimization problem called the Minimum Quartet Tree Cost problem. This paper presents and compares several metaheuristic approaches to approximate the optimal hierarchy. The performance of the algorithms is tested through extensive computational experiments and it is shown that the Reduced Variable Neighbourhood Search metaheuristic is the most effective approach to the problem, obtaining high quality solutions in short computational running times

Brunel University Research Archive

A Fast Quartet Tree Heuristic for Hierarchical Clustering

Author: Cilibrasi R. (Rudi)
Vitányi P.M.B. (Paul)
Publication venue
Publication date: 12/09/2014
Field of study

CWI's Institutional Repository

Automatic Music Composition using Answer Set Programming

Author: Boenn Georg
Brain Martin
De Vos Marina
ffitch John
Publication venue
Publication date: 25/06/2010
Field of study

Music composition used to be a pen and paper activity. These these days music is often composed with the aid of computer software, even to the point where the computer compose parts of the score autonomously. The composition of most styles of music is governed by rules. We show that by approaching the automation, analysis and verification of composition as a knowledge representation task and formalising these rules in a suitable logical language, powerful and expressive intelligent composition tools can be easily built. This application paper describes the use of answer set programming to construct an automated system, named ANTON, that can compose melodic, harmonic and rhythmic music, diagnose errors in human compositions and serve as a computer-aided composition tool. The combination of harmonic, rhythmic and melodic composition in a single framework makes ANTON unique in the growing area of algorithmic composition. With near real-time composition, ANTON reaches the point where it can not only be used as a component in an interactive composition tool but also has the potential for live performances and concerts or automatically generated background music in a variety of applications. With the use of a fully declarative language and an "off-the-shelf" reasoning engine, ANTON provides the human composer a tool which is significantly simpler, more compact and more versatile than other existing systems. This paper has been accepted for publication in Theory and Practice of Logic Programming (TPLP).Comment: 31 pages, 10 figures. Extended version of our ICLP2008 paper. Formatted following TPLP guideline

arXiv.org e-Print Archive

OPUS

Statistical Inference through Data Compression

Author: Cilibrasi R. (Rudi)
Publication venue
Publication date: 23/02/2007
Field of study

CWI's Institutional Repository

An approximate search engine for structure

Author: Shan Huiyuan
Publication venue: Digital Commons @ NJIT
Publication date: 31/05/2004
Field of study

As the size of structural databases grows, the need for efficiently searching these databases arises. Thanks to previous and ongoing research, searching by attribute-value and by text has become commonplace in these databases. However, searching by topological or physical structure, especially for large databases and especially for approximate matches, is still an art. In this dissertation, efficient search techniques are presented for retrieving trees from a database that are similar to a given query tree. Rooted ordered labeled trees, rooted unordered labeled trees and free trees are considered. Ordered labeled trees are trees in which each node has a label and the left-to-right order among siblings matters. Unordered labeled trees are trees in which the parent-child relationship is significant, but the order among siblings is unimportant. Free trees (unrooted unordered trees) are acyclic graphs. These trees find many applications in bioinformatics, Web log analysis, phyloinformatics, XML processing, etc. Two types of similarity measures are investigated: (i) counting the mismatching paths in the query tree and a data tree, and (ii) measuring the topological relationship between the trees. The proposed approaches include storing the paths of trees in a suffix array, employing hashing techniques to speed up retrieval, and counting the number of up-down operations to move a token from one node to another node in a tree. Various filters for accelerating a search, different strategies for parallelizing these search algorithms and applications of these algorithms to XML and phylogenetic data management are discussed. The proposed techniques have been implemented into a phylogenetic search engine which is fully operational and is available on the World Wide Web. Experimental results on comparing the similarity measures with existing tree metrics and on evaluating the efficiency of the search techniques demonstrate the effectiveness of the search engine. Future work includes extending the techniques to other structural data, as well as developing new filters and algorithms for speeding up searching and mining in complex structures

Digital Commons @ New Jersey Institute of Technology (NJIT)

A new quartet tree heuristic for hierarchical clustering

Author: Cilibrasi R.L. (Rudi)
Vitányi P.M.B. (Paul)
Publication venue
Publication date: 07/07/2006
Field of study

We present a new quartet tree heuristic for hierarchical clustering from weighted quartet topologies, and a standard manner to derive those from a given distance matrix. We do not assume that there is a true ternary tree that generated the quartet topologies or distances which we wish to recover as closely as possible. Our aim is to just model the input data as faithfully as possible by the quartet tree. Our method is capable of handling up to 60–80 objects in a matter of hours, while no existing quartet heuristic can directly compute a quartet tree of more than about 20–30 objects without running for years. The method is implemented and available as public software

CWI's Institutional Repository

29th International Symposium on Algorithms and Computation: ISAAC 2018, December 16-19, 2018, Jiaoxi, Yilan, Taiwan

Author: ISAAC <29. 2018, Jiaoxi, Yilan>
Publication venue: Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing
Publication date: 01/12/2018
Field of study

Digitale Bibliothek Thüringen