60 research outputs found
A New Quartet Tree Heuristic for Hierarchical Clustering
We consider the problem of constructing an optimal-weight tree from the
3*(n choose 4) weighted quartet topologies on n objects, where optimality means
that the summed weight of the embedded quartet topologies is optimal (so it can
be the case that the optimal tree embeds all quartets as non-optimal
topologies). We present a heuristic for reconstructing the optimal-weight tree,
and a canonical manner to derive the quartet-topology weights from a given
distance matrix. The method repeatedly transforms a bifurcating tree, with all
objects involved as leaves, achieving a monotonic approximation to the exact
single globally optimal tree. This contrasts with other heuristic search methods
from biological phylogeny, like DNAML or quartet puzzling, which repeatedly and
incrementally construct a solution from a random order of objects, and
subsequently add agreement values.
Comment: 22 pages, 14 figures
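The quartet bookkeeping described in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the cost of a topology uv|wx is assumed here to be d(u,v) + d(w,x), the function name `quartet_costs` is hypothetical, and the distance matrix is made-up example data, not data from the paper.

```python
from itertools import combinations
from math import comb


def quartet_costs(dist):
    """Map every quartet topology to a cost.

    Assumption for illustration: cost(uv|wx) = d(u,v) + d(w,x).
    """
    n = len(dist)
    costs = {}
    for a, b, c, d in combinations(range(n), 4):
        # The three ways to split four objects into two pairs.
        pairings = (
            ((a, b), (c, d)),
            ((a, c), (b, d)),
            ((a, d), (b, c)),
        )
        for (u, v), (w, x) in pairings:
            costs[((u, v), (w, x))] = dist[u][v] + dist[w][x]
    return costs


# Hypothetical symmetric distance matrix on five objects.
dist = [
    [0, 1, 4, 4, 5],
    [1, 0, 4, 4, 5],
    [4, 4, 0, 2, 5],
    [4, 4, 2, 0, 5],
    [5, 5, 5, 5, 0],
]
costs = quartet_costs(dist)

# Each of the (n choose 4) quartets contributes three topologies.
assert len(costs) == 3 * comb(len(dist), 4)
```

A tree on the same objects embeds exactly one of the three topologies for each quartet, so its total cost is a sum of (n choose 4) entries from this table.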
A Fast Quartet Tree Heuristic for Hierarchical Clustering
The Minimum Quartet Tree Cost problem is to construct an optimal weight tree
from the 3*(n choose 4) weighted quartet topologies on n objects, where
optimality means that the summed weight of the embedded quartet topologies is
optimal (so it can be the case that the optimal tree embeds all quartets as
nonoptimal topologies). We present a Monte Carlo heuristic, based on randomized
hill climbing, for approximating the optimal weight tree, given the quartet
topology weights. The method repeatedly transforms a dendrogram, with all
objects involved as leaves, achieving a monotonic approximation to the exact
single globally optimal tree. The problem and the solution heuristic have been
extensively used for general hierarchical clustering of nontree-like
(non-phylogeny) data in various domains and across domains with heterogeneous
data. We also present a greatly improved heuristic, reducing the running time
by a factor of order a thousand to ten thousand. All this is implemented and
available, as part of the CompLearn package. We compare performance and running
time of the original and improved versions with those of UPGMA, BioNJ, and NJ,
as implemented in the SplitsTree package on genomic data for which the latter
are optimized.
  Keywords: Data and knowledge visualization, Pattern
matching--Clustering--Algorithms/Similarity measures, Hierarchical clustering,
Global optimization, Quartet tree, Randomized hill-climbing
Comment: LaTeX, 40 pages, 11 figures; this paper has substantial overlap with
  arXiv:cs/0606048 in cs.D
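The idea of randomized hill climbing with a monotonically improving cost can be illustrated with a deliberately restricted sketch. It is not the heuristic from the paper: the tree shape is fixed to a caterpillar (leaves hanging off a single spine), the only mutation is a swap of two leaves, and the cost of a topology uv|wx is assumed to be d(u,v) + d(w,x). The names `tree_cost` and `hill_climb`, the step count, and the distance matrix are all hypothetical.

```python
import random
from itertools import combinations


def tree_cost(order, dist):
    """Summed cost of the quartet topologies embedded by a caterpillar tree.

    `order` lists the objects as they appear along the spine. For leaves at
    spine positions p < q < r < s, a caterpillar embeds the topology pq|rs.
    """
    total = 0
    for p, q, r, s in combinations(range(len(order)), 4):
        total += dist[order[p]][order[q]] + dist[order[r]][order[s]]
    return total


def hill_climb(dist, steps=2000, seed=0):
    """Randomized hill climbing over leaf placements (leaf swaps only)."""
    rng = random.Random(seed)
    order = list(range(len(dist)))
    rng.shuffle(order)
    best = tree_cost(order, dist)
    history = [best]
    for _ in range(steps):
        i, j = rng.sample(range(len(order)), 2)
        order[i], order[j] = order[j], order[i]  # propose a leaf swap
        cost = tree_cost(order, dist)
        if cost < best:
            best = cost  # keep the improvement
        else:
            order[i], order[j] = order[j], order[i]  # undo the swap
        history.append(best)
    return order, best, history


# Hypothetical distance matrix with three tight pairs of objects.
dist = [
    [0, 1, 6, 6, 7, 7],
    [1, 0, 6, 6, 7, 7],
    [6, 6, 0, 2, 7, 7],
    [6, 6, 2, 0, 7, 7],
    [7, 7, 7, 7, 0, 3],
    [7, 7, 7, 7, 3, 0],
]
order, best, history = hill_climb(dist)
```

Because a swap is kept only when it lowers the cost, `history` never increases, which mirrors the monotonic approximation mentioned in the abstract. A fuller version would also mutate the tree shape rather than only the leaf labels.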
The development and application of metaheuristics for problems in graph theory: A computational study
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. It is known that graph theoretic models have extensive application
to real-life discrete optimization problems. Many of these models
are NP-hard and, as a result, exact methods may be impractical for
large scale problem instances. Consequently, there is a great interest
in developing efficient approximate methods that yield near-optimal
solutions in acceptable computational times. A class of such methods,
known as metaheuristics, has been proposed with success.
This thesis considers some recently proposed NP-hard combinatorial
optimization problems formulated on graphs. In particular, the minimum
labelling spanning tree problem, the minimum labelling Steiner
tree problem, and the minimum quartet tree cost problem are investigated.
Several metaheuristics are proposed for each problem, from
classical approximation algorithms to novel approaches. A comprehensive
computational investigation in which the proposed methods
are compared with other algorithms recommended in the literature is
reported. The results show that the proposed metaheuristics outperform
the algorithms recommended in the literature, obtaining optimal
or near-optimal solutions in short computational running times. In
addition, a thorough analysis of the implementation of these methods
provides insights for the implementation of metaheuristic strategies for
other graph theoretic problems.
A new quartet tree heuristic for hierarchical clustering
We present a new quartet tree heuristic for hierarchical clustering from weighted quartet topologies, and a standard manner to derive those from a given distance matrix. We do not assume that there is a true ternary tree that generated the quartet topologies or distances which we wish to recover as closely as possible. Our aim is just to model the input data as faithfully as possible by the quartet tree. Our method is capable of handling up to 60–80 objects in a matter of hours, while no existing quartet heuristic can directly compute a quartet tree of more than about 20–30 objects without running for years. The method is implemented and available as public software.
How Fitch-Margoliash Algorithm can Benefit from Multi Dimensional Scaling
Whatever the phylogenetic method, genetic sequences are often described as strings of characters; thus molecular sequences can be viewed as elements of a multi-dimensional space. As a consequence, studying motion in this space (i.e., the evolutionary process) must deal with the surprising features of high-dimensional spaces, such as the concentration of measure phenomenon.
- …
