Fractal geometry of spin-glass models
Stability and diversity are two key properties that living entities share
with spin glasses, where they are manifested through the breaking of the phase
space into many valleys or local minima connected by saddle points. The
topology of the phase space can be conveniently condensed into a tree
structure, akin to the biological phylogenetic trees, whose tips are the local
minima and internal nodes are the lowest-energy saddles connecting those
minima. For the infinite-range Ising spin glass with p-spin interactions, we
show that the average size-frequency distribution of saddles obeys a power law
in w, where w=w(s) is the number of minima that can be
connected through saddle s, and D is the fractal dimension of the phase space.
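The tree construction described in this abstract can be illustrated on a toy system. The following is a minimal sketch, not the authors' method: it assumes p = 3, Gaussian couplings on every triple of spins, single-spin-flip moves, and a system small enough to enumerate. The names p_spin_energy and barrier_tree are hypothetical. States are flooded in order of increasing energy; a state with no lower neighbor is a local minimum (a tip), and a state that joins two or more basins is a lowest-energy saddle (an internal node) with w(s) equal to the number of minima it connects.

```python
import itertools
import random


def p_spin_energy(spins, couplings):
    """Energy = minus the sum of J_abc * s_a * s_b * s_c over all coupled triples."""
    return -sum(j * spins[a] * spins[b] * spins[c]
                for (a, b, c), j in couplings.items())


def barrier_tree(n_spins=6, seed=0):
    """Return (minima, saddles); each saddle is a pair (state, w)."""
    rng = random.Random(seed)
    # Assumption: independent Gaussian couplings on every triple (p = 3).
    couplings = {t: rng.gauss(0.0, 1.0)
                 for t in itertools.combinations(range(n_spins), 3)}
    states = list(itertools.product((-1, 1), repeat=n_spins))
    energy = {s: p_spin_energy(s, couplings) for s in states}

    parent = {}    # union-find forest over the states flooded so far
    n_minima = {}  # basin root -> number of local minima inside that basin

    def find(s):
        while parent[s] != s:
            parent[s] = parent[parent[s]]  # path compression
            s = parent[s]
        return s

    minima, saddles = [], []
    for s in sorted(states, key=energy.get):
        # Basins already flooded that touch s through a single spin flip.
        roots = set()
        for i in range(n_spins):
            neighbor = s[:i] + (-s[i],) + s[i + 1:]
            if neighbor in parent:
                roots.add(find(neighbor))
        parent[s] = s
        if not roots:
            # No lower neighbor: s is a local minimum, a tip of the tree.
            minima.append(s)
            n_minima[s] = 1
            continue
        w = sum(n_minima[r] for r in roots)
        for r in roots:
            parent[r] = s  # s becomes the root of the merged basin
        n_minima[s] = w
        if len(roots) > 1:
            # s joins distinct basins: an internal node with size w(s).
            saddles.append((s, w))
    return minima, saddles


if __name__ == "__main__":
    minima, saddles = barrier_tree()
    print(len(minima), "minima; saddle sizes w(s):", [w for _, w in saddles])
```

Collecting the values of w(s) over many random coupling samples gives the size-frequency distribution that the abstract refers to; a system this small only illustrates the bookkeeping and is far too small to show the scaling.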
An Even Faster and More Unifying Algorithm for Comparing Trees via Unbalanced Bipartite Matchings
A widely used method for determining the similarity of two labeled trees is
to compute a maximum agreement subtree of the two trees. Previous work on this
similarity measure is only concerned with the comparison of labeled trees of
two special kinds, namely, uniformly labeled trees (i.e., trees with all their
nodes labeled by the same symbol) and evolutionary trees (i.e., leaf-labeled
trees with distinct symbols for distinct leaves). This paper presents an
algorithm for comparing trees that are labeled in an arbitrary manner. In
addition to this generality, this algorithm is faster than the previous
algorithms.
Another contribution of this paper is on maximum weight bipartite matchings.
We show how to speed up the best known matching algorithms when the input
graphs are node-unbalanced or weight-unbalanced. Based on these enhancements,
we obtain an efficient algorithm for a new matching problem called the
hierarchical bipartite matching problem, which is at the core of our maximum
agreement subtree algorithm.
Comment: To appear in Journal of Algorithms
Trickle-down processes and their boundaries
It is possible to represent each of a number of Markov chains as an evolving
sequence of connected subsets of a directed acyclic graph that grow in the
following way: initially, all vertices of the graph are unoccupied, particles
are fed in one-by-one at a distinguished source vertex, successive particles
proceed along directed edges according to an appropriate stochastic mechanism,
and each particle comes to rest once it encounters an unoccupied vertex.
Examples include the binary and digital search tree processes, the random
recursive tree process and generalizations of it arising from nested instances
of Pitman's two-parameter Chinese restaurant process, tree-growth models
associated with Mallows' phi model of random permutations and with
Schuetzenberger's non-commutative q-binomial theorem, and a construction due to
Luczak and Winkler that grows uniform random binary trees in a Markovian
manner. We introduce a framework that encompasses such Markov chains, and we
characterize their asymptotic behavior by analyzing in detail their Doob-Martin
compactifications, Poisson boundaries and tail sigma-fields.
Comment: 62 pages, 8 figures, revised to address referee's comments
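The growth rule in this abstract can be made concrete with its first example, the binary search tree process. The following is a minimal sketch under stated assumptions: the underlying graph is the infinite binary tree with vertices written as words over L and R, the source vertex is the empty word, and each particle carries an independent uniform key that steers it left or right. The function name trickle_down_bst is hypothetical.

```python
import random


def trickle_down_bst(n_particles, seed=0):
    """Feed particles in one by one at the root; each rests at the first free vertex."""
    rng = random.Random(seed)
    occupied = {}  # vertex (word over 'L'/'R', root = "") -> key of the resting particle
    history = []   # vertex claimed by each successive particle
    for _ in range(n_particles):
        key = rng.random()  # assumption: i.i.d. uniform keys
        vertex = ""         # every particle enters at the source vertex
        while vertex in occupied:
            # Stochastic mechanism of the BST process: compare keys and step down.
            vertex += "L" if key < occupied[vertex] else "R"
        occupied[vertex] = key  # first unoccupied vertex: the particle stops here
        history.append(vertex)
    return occupied, history


if __name__ == "__main__":
    occupied, history = trickle_down_bst(10)
    print(history)
```

After every step the occupied set is a connected subset containing the source, which is the evolving sequence of subsets the abstract describes. Replacing the comparison rule with another routing mechanism would give the other tree-growth models in the same scheme.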
Phase transition in the sample complexity of likelihood-based phylogeny inference
Reconstructing evolutionary trees from molecular sequence data is a
fundamental problem in computational biology. Stochastic models of sequence
evolution are closely related to spin systems that have been extensively
studied in statistical physics and that connection has led to important
insights on the theoretical properties of phylogenetic reconstruction
algorithms as well as the development of new inference methods. Here, we study
maximum likelihood, a classical statistical technique which is perhaps the most
widely used in phylogenetic practice because of its superior empirical
accuracy.
At the theoretical level, except for its consistency, that is, the guarantee
of eventual correct reconstruction as the size of the input data grows, much
remains to be understood about the statistical properties of maximum likelihood
in this context. In particular, the best bounds on the sample complexity or
sequence-length requirement of maximum likelihood, that is, the amount of data
required for correct reconstruction, are exponential in the number, n, of
tips, which is far from known lower bounds based on information-theoretic arguments.
Here we close the gap by proving a new upper bound on the sequence-length
requirement of maximum likelihood that matches up to constants the known lower
bound for some standard models of evolution.
More specifically, for the r-state symmetric model of sequence evolution on
a binary phylogeny with bounded edge lengths, we show that the sequence-length
requirement behaves logarithmically in n when the expected amount of mutation
per edge is below what is known as the Kesten-Stigum threshold. In general, the
sequence-length requirement is polynomial in n. Our results imply moreover
that the maximum likelihood estimator can be computed efficiently on randomly
generated data provided sequences are as above.
Comment: To appear in Probability Theory and Related Fields