A New Quartet Tree Heuristic for Hierarchical Clustering
We consider the problem of constructing an optimal-weight tree from the
3*(n choose 4) weighted quartet topologies on n objects, where optimality means
that the summed weight of the embedded quartet topologies is optimal (so it can
be the case that the optimal tree embeds all quartets as non-optimal
topologies). We present a heuristic for reconstructing the optimal-weight tree,
and a canonical manner to derive the quartet-topology weights from a given
distance matrix. The method repeatedly transforms a bifurcating tree, with all
objects involved as leaves, achieving a monotonic approximation to the exact
single globally optimal tree. This contrasts with other heuristic search methods
from biological phylogeny, like DNAML or quartet puzzling, which repeatedly
construct a solution incrementally from a random order of objects, and
subsequently add agreement values.
Comment: 22 pages, 14 figure
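The count of 3*(n choose 4) quartet topologies in the abstract above can be checked with a minimal Python sketch. The enumeration follows directly from the abstract: each set of four objects can be split into two pairs in three ways. The weighting rule in `topology_weights` is an assumption for illustration (the abstract does not state the canonical derivation): it takes the weight of topology uv|wx to be d(u,v) + d(w,x). All function names are hypothetical.

```python
from itertools import combinations
from math import comb


def quartet_topologies(objects):
    """Yield every quartet topology ((a, b), (c, d)) on the given objects."""
    for a, b, c, d in combinations(objects, 4):
        # Four objects can be split into two pairs in exactly three ways.
        yield (a, b), (c, d)
        yield (a, c), (b, d)
        yield (a, d), (b, c)


def expected_count(n):
    """Number of quartet topologies on n objects: 3 * (n choose 4)."""
    return 3 * comb(n, 4)


def topology_weights(dist, objects):
    """Map each topology uv|wx to a weight.

    ASSUMPTION: the weight is d(u, v) + d(w, x), the sum of the two
    within-pair distances. The abstract does not spell out the rule.
    """
    return {
        ((u, v), (w, x)): dist[u][v] + dist[w][x]
        for (u, v), (w, x) in quartet_topologies(objects)
    }
```

For example, `len(list(quartet_topologies(range(5))))` agrees with `expected_count(5)`.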
A Fast Quartet Tree Heuristic for Hierarchical Clustering
The Minimum Quartet Tree Cost problem is to construct an optimal weight tree
from the 3*(n choose 4) weighted quartet topologies on n objects, where
optimality means that the summed weight of the embedded quartet topologies is
optimal (so it can be the case that the optimal tree embeds all quartets as
nonoptimal topologies). We present a Monte Carlo heuristic, based on randomized
hill climbing, for approximating the optimal weight tree, given the quartet
topology weights. The method repeatedly transforms a dendrogram, with all
objects involved as leaves, achieving a monotonic approximation to the exact
single globally optimal tree. The problem and the solution heuristic have been
extensively used for general hierarchical clustering of nontree-like
(non-phylogeny) data in various domains and across domains with heterogeneous
data. We also present a greatly improved heuristic, reducing the running time
by a factor of order a thousand to ten thousand. All this is implemented and
available, as part of the CompLearn package. We compare performance and running
time of the original and improved versions with those of UPGMA, BioNJ, and NJ,
as implemented in the SplitsTree package on genomic data for which the latter
are optimized.
Keywords: Data and knowledge visualization, Pattern
matching--Clustering--Algorithms/Similarity measures, Hierarchical clustering,
Global optimization, Quartet tree, Randomized hill-climbing
Comment: LaTeX, 40 pages, 11 figures; this paper has substantial overlap with
arXiv:cs/0606048 in cs.D
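The randomized hill climbing described in the abstract above can be illustrated with a deliberately simplified Python sketch. It is not the CompLearn implementation. Two assumptions are made for brevity: the search is restricted to caterpillar trees (leaves hanging off a single path), so the only mutation is a swap of two leaves, and the cost of an embedded topology ab|cd is taken to be d(a,b) + d(c,d). The names `caterpillar_cost`, `hill_climb`, and `example_dist` are hypothetical.

```python
import random
from itertools import combinations


def quartet_weight(dist, a, b, c, d):
    # ASSUMPTION: cost of topology ab|cd is the sum of within-pair distances.
    return dist[a][b] + dist[c][d]


def caterpillar_cost(dist, order):
    """Summed cost of the quartet topologies embedded in a caterpillar tree.

    Leaves hang off a path in the given order, so for any four leaves taken
    in path order (a, b, c, d) the embedded topology is ab|cd.
    """
    return sum(quartet_weight(dist, a, b, c, d)
               for a, b, c, d in combinations(order, 4))


def hill_climb(dist, steps=2000, seed=0):
    """Randomized hill climbing over leaf orders; cost never increases."""
    rng = random.Random(seed)
    order = list(range(len(dist)))
    rng.shuffle(order)
    cost = caterpillar_cost(dist, order)
    history = [cost]
    for _ in range(steps):
        i, j = rng.sample(range(len(order)), 2)
        order[i], order[j] = order[j], order[i]      # propose a leaf swap
        new_cost = caterpillar_cost(dist, order)
        if new_cost <= cost:
            cost = new_cost                          # keep the mutation
        else:
            order[i], order[j] = order[j], order[i]  # undo the mutation
        history.append(cost)
    return order, cost, history


# Toy distance matrix with two tight clusters: {0, 1, 2} and {3, 4, 5}.
example_dist = [
    [0, 1, 1, 9, 9, 9],
    [1, 0, 1, 9, 9, 9],
    [1, 1, 0, 9, 9, 9],
    [9, 9, 9, 0, 1, 1],
    [9, 9, 9, 1, 0, 1],
    [9, 9, 9, 1, 1, 0],
]
```

Calling `hill_climb(example_dist)` returns the final leaf order, its cost, and the cost after each step. Because a mutation is kept only when it does not raise the cost, the recorded costs are non-increasing, which mirrors the monotonic approximation mentioned in the abstract. A fuller version would also mutate the tree shape, for instance by moving subtrees.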
Towards relating the kappa-symmetric and pure-spinor versions of the supermembrane
We study the relation between the kappa-symmetric formulation of the
supermembrane in eleven dimensions and the pure-spinor version. Recently,
Berkovits related the Green-Schwarz and pure-spinor superstrings. In this
paper, we attempt to extend this method to the supermembrane. We show that it
is possible to reinstate the reparameterisation constraints in the pure-spinor
formulation of the supermembrane by introducing a topological sector and
performing a similarity transformation. The resulting BRST charge is then of
conventional type and is argued to be (related to) the BRST charge of the
kappa-symmetric supermembrane in a formulation where all second class
constraints are 'gauge unfixed' to first class constraints. In our analysis we
also encounter a natural candidate for a (non-covariant) supermembrane analogue
of the superstring b ghost.
Comment: 15 Page
Flavor Doubling and the Nature of Asymptopia
We consider the possibility that QCD with N flavors has a useful low-energy
description with 2N flavors. Specifically, we investigate a free theory of 2N
quarks. Although the free theory is U(N)_L X U(N)_R invariant, it admits a
larger U(2N) invariance. However, when the axial anomaly is accounted for in
the effective theory by a 't Hooft interaction, only SU(N)_L X SU(N)_R X U(1)_B
\subset U(2N) survives. There is, however, a residual discrete symmetry that is
not a symmetry of the QCD lagrangian. This S_2 subgroup of U(2N) has many
interesting properties. For instance, when explicit chiral symmetry breaking
effects are present, S_2 is broken unless \bar\theta=0 or \pi. By expressing the
free theory on the light-front, we show that flavor doubling implies several
superconvergence relations in pion-hadron scattering. Implicit in the 2N-flavor
effective theory is a Regge trajectory with vacuum quantum numbers and unit
intercept whose behavior is constrained by S_2. In particular, S_2 implies that
forward pion-hadron scattering becomes purely elastic at high energies, in good
agreement with experiment.
Comment: 26 pages TeX, uses mtexsis.te
DNA sequence evidence for speciation, paraphyly and a Mesozoic dispersal of cancellothyridid articulate brachiopods
Because the classification of extant and fossil articulate brachiopods is based largely upon shell characters observable in fossils, it identifies morphotaxa whose biological status can, in practice, best be inferred from estimates of genetic divergence. Allozyme polymorphism and restriction fragment length polymorphism of mitochondrial DNA (mtDNA RFLP) have been used to show that nuclear and mitochondrial genetic divergence between samples of the cancellothyridid brachiopods Terebratulina septentrionalis from Canada and T. retusa from Europe is compatible with biological speciation, but the genetic distances obtained were biased by methodological limitations. Here, we report estimates of divergence in 12S rDNA mitochondrial sequences within and between samples of these brachiopods. The sequence-based genetic distance between these samples (5.98-0.07% SE) is at least 10 times greater than within them and, since they also differ in a complex life-history trait, their species status is considered to be securely established. Divergence levels between 12S rDNA genes of three other cancellothyridids, T. unguicula from Alaska, T. crossei from near Japan, and Cancellothyris hedleyi from near Australia, are higher than between the two North Atlantic species, and the mean nucleotide distance between all these cancellothyridids is similar to the mean distance between species of Littorina (Mollusca: Gastropoda). Sequences of both 12S and 16S genes from cancellothyridids and other short-looped brachiopod species show neither saturation nor lineage-specific rate differences and, when analysed with different outgroups, either separately or together, yield one unexpected, but well-supported, tree with Alaskan T. unguicula basal and C. hedleyi nested within Terebratulina, i.e. these genera are paraphyletic. A geologically dated divergence between Antarctic and New Zealand species of the short-looped brachiopod Liothyrella is used to calibrate the rate of 12S divergence at ca. 0.1% per million years (MY), and this rate is used to infer that T. septentrionalis and T. retusa have been diverging for ca. 60 MY and that they and T. unguicula have been diverging from their last common ancestor for ca. 100 MY. This indicates a Mesozoic origin for the present-day distribution of cancellothyridids, and the basal position of T. unguicula suggests a possible North Pacific centre of origin, with separate Atlantic and Pacific radiations. The inclusion of Cancellothyris within Terebratulina also shows that adult shell characters such as umbo, foramen and symphytium shape, whilst probably indispensable for the practical classification of fossils, are not reliable guides to genealogy.
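The ca. 60 MY figure in the abstract above is consistent with a simple linear molecular clock, sketched here in Python. The sketch assumes that divergence time equals pairwise sequence distance divided by the calibrated divergence rate; the abstract does not state the exact calculation used. The function name is hypothetical, and the inputs are the 5.98% distance and the ca. 0.1% per MY rate quoted in the abstract.

```python
def divergence_time_my(distance_percent, rate_percent_per_my):
    """Estimate divergence time in million years (MY).

    ASSUMPTION: a linear clock, time = pairwise distance / divergence rate.
    """
    if rate_percent_per_my <= 0:
        raise ValueError("rate must be positive")
    return distance_percent / rate_percent_per_my


# Distance between T. septentrionalis and T. retusa, with the Liothyrella
# calibration of about 0.1% 12S divergence per million years.
estimate = divergence_time_my(5.98, 0.1)
```

Here `estimate` comes out at about 59.8 MY, in line with the ca. 60 MY quoted in the abstract.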