424 research outputs found

    Phase transitions in Phylogeny

    We apply the theory of Markov random fields on trees to derive a phase transition in the number of samples needed to reconstruct phylogenies. We consider the Cavender-Farris-Neyman model of evolution on trees, where all the inner nodes have degree at least 3, and the net transition on each edge is bounded by e. Motivated by a conjecture by M. Steel, we show that if 2(1 - 2e)^2 > 1, then for balanced trees, the topology of the underlying tree, having n leaves, can be reconstructed from O(log n) samples (characters) at the leaves. On the other hand, we show that if 2(1 - 2e)^2 < 1, then there exist topologies which require at least poly(n) samples for reconstruction. Our results are the first rigorous results to establish the role of phase transitions for Markov random fields on trees, as studied in probability, statistical physics and information theory, in the study of phylogenies in mathematical biology. Comment: To appear in Transactions of the AM
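
    The condition 2(1 - 2e)^2 > 1 from the abstract above can be turned into an explicit bound on e. The following is a minimal sketch of that arithmetic only; the function names `critical_edge_bound` and `regime` are hypothetical and do not come from the paper.

```python
import math


def critical_edge_bound() -> float:
    """Solve 2 * (1 - 2e)^2 = 1 for e in [0, 1/2): e* = (1 - 1/sqrt(2)) / 2."""
    return (1.0 - 1.0 / math.sqrt(2.0)) / 2.0


def regime(e: float) -> str:
    """Classify an edge bound e against the condition quoted in the abstract."""
    if not 0.0 <= e < 0.5:
        raise ValueError("this sketch assumes 0 <= e < 1/2")
    value = 2.0 * (1.0 - 2.0 * e) ** 2
    if value > 1.0:
        return "O(log n) samples suffice on balanced trees"
    if value < 1.0:
        return "some topologies need poly(n) samples"
    return "critical case, not covered by the abstract"


if __name__ == "__main__":
    print(round(critical_edge_bound(), 4))  # roughly 0.1464
    print(regime(0.1))
    print(regime(0.2))
```

    Under this reading, edge bounds below roughly 0.146 fall in the logarithmic regime and bounds above it fall in the polynomial regime.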

    Complete Characterization of Functions Satisfying the Conditions of Arrow's Theorem

    Arrow's theorem implies that a social choice function satisfying Transitivity, the Pareto Principle (Unanimity) and Independence of Irrelevant Alternatives (IIA) must be dictatorial. When non-strict preferences are allowed, a dictatorial social choice function is defined as a function for which there exists a single voter whose strict preferences are followed. This definition allows for many different dictatorial functions. In particular, we construct examples of dictatorial functions which do not satisfy Transitivity and IIA. Thus Arrow's theorem, in the case of non-strict preferences, does not provide a complete characterization of all social choice functions satisfying Transitivity, the Pareto Principle, and IIA. The main results of this article provide such a characterization for Arrow's theorem, as well as for follow-up results by Wilson. In particular, we strengthen Arrow's and Wilson's results by giving an exact if and only if condition for a function to satisfy Transitivity and IIA (and the Pareto Principle). Additionally, we derive formulas for the number of functions satisfying these conditions. Comment: 11 pages, 1 figur

    Mixing under monotone censoring

    We initiate the study of mixing times of Markov chains under monotone censoring. Suppose we have some Markov chain $M$ on a state space $\Omega$ with stationary distribution $\pi$ and a monotone set $A \subset \Omega$. We consider the chain $M'$ which is the same as the chain $M$ started at some $x \in A$ except that moves of $M$ of the form $x \to y$ where $x \in A$ and $y \notin A$ are censored and replaced by the move $x \to x$. If $M$ is ergodic and $A$ is connected, the new chain converges to $\pi$ conditional on $A$. In this paper we are interested in the mixing time of the chain $M'$ in terms of properties of $M$ and $A$. Our results are based on new connections with the field of property testing. A number of open problems are presented. Comment: 6 page
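
    The censoring rule described above (a move leaving A is replaced by staying put) can be sketched on a finite state space as follows. This is an illustration under stated assumptions, not the paper's construction: states are taken to be the integers 0..n-1, the chain is given as a row-stochastic list of lists, and the function name `censor_chain` is hypothetical.

```python
def censor_chain(P, A):
    """Build the transition matrix of the censored chain restricted to A.

    P : row-stochastic matrix as a list of lists, with states 0..n-1 (assumed).
    A : set of state indices; moves x -> y with x in A and y not in A
        are replaced by the self-loop x -> x.
    Returns (states, P_censored), where states is the sorted list of A.
    """
    n = len(P)
    states = sorted(A)
    censored = []
    for pos, x in enumerate(states):
        # Keep the moves that stay inside A.
        row = [P[x][y] for y in states]
        # Total probability of moves that would leave A.
        escaped = sum(P[x][y] for y in range(n) if y not in A)
        # Censoring: that probability becomes a self-loop at x.
        row[pos] += escaped
        censored.append(row)
    return states, censored


if __name__ == "__main__":
    # Lazy random walk on the path 0-1-2-3 (uniform stationary distribution).
    P = [
        [0.75, 0.25, 0.00, 0.00],
        [0.25, 0.50, 0.25, 0.00],
        [0.00, 0.25, 0.50, 0.25],
        [0.00, 0.00, 0.25, 0.75],
    ]
    states, Pc = censor_chain(P, {0, 1, 2})
    for row in Pc:
        print(row)
```

    In this small example the censored chain on {0, 1, 2} is again doubly stochastic, so its stationary distribution is uniform on A, which matches the uniform distribution conditioned on A.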

    Majority rule has transition ratio 4 on Yule trees under a 2-state symmetric model

    Inferring the ancestral state at the root of a phylogenetic tree from states observed at the leaves is a problem arising in evolutionary biology. The simplest technique -- majority rule -- estimates the root state by the most frequently occurring state at the leaves. Alternative methods -- such as maximum parsimony -- explicitly take the tree structure into account. Since either method can outperform the other on particular trees, it is useful to consider the accuracy of the methods on trees generated under some evolutionary null model, such as a Yule pure-birth model. In this short note, we answer a recently posed question concerning the performance of majority rule on Yule trees under a symmetric 2-state Markovian substitution model of character state change. We show that majority rule is accurate precisely when the ratio of the birth (speciation) rate of the Yule process to the substitution rate exceeds the value 4. By contrast, maximum parsimony has been shown to be accurate only when this ratio is at least 6. Our proof relies on a second moment calculation, coupling, and a novel application of a reflection principle. Comment: 6 pages, 1 figur
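
    The majority-rule estimator described above ignores the tree and looks only at the leaf states. A minimal sketch follows; the function name `majority_rule` is hypothetical, and returning `None` on a tie is an assumption made here, since the abstract does not say how ties are broken.

```python
from collections import Counter


def majority_rule(leaf_states):
    """Estimate the root state as the most frequent state among the leaves.

    Returns None when the most frequent state is not unique (assumed
    tie convention; a coin flip would be another reasonable choice).
    """
    if not leaf_states:
        raise ValueError("need at least one leaf state")
    counts = Counter(leaf_states)
    top = max(counts.values())
    winners = [state for state, count in counts.items() if count == top]
    return winners[0] if len(winners) == 1 else None


if __name__ == "__main__":
    print(majority_rule([0, 1, 1, 0, 1]))  # state 1 occurs three times out of five
    print(majority_rule([0, 1]))           # tie
```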

    Robust dimension free isoperimetry in Gaussian space

    We prove the first robust dimension free isoperimetric result for the standard Gaussian measure $\gamma_n$ and the corresponding boundary measure $\gamma_n^+$ in $\mathbb{R}^n$. The main result in the theory of Gaussian isoperimetry (proven in the 1970s by Sudakov and Tsirelson, and independently by Borell) states that if $\gamma_n(A)=1/2$ then the surface area of $A$ is bounded by the surface area of a half-space with the same measure, $\gamma_n^+(A)\leq(2\pi)^{-1/2}$. Our results imply in particular that if $A\subset \mathbb{R}^n$ satisfies $\gamma_n(A)=1/2$ and $\gamma_n^+(A)\leq(2\pi)^{-1/2}+\delta$ then there exists a half-space $B\subset \mathbb{R}^n$ such that $\gamma_n(A\Delta B)\leq C\log^{-1/2}(1/\delta)$ for an absolute constant $C$. Since the Gaussian isoperimetric result was established, only recently a robust version of the Gaussian isoperimetric result was obtained by Cianchi et al., who showed that $\gamma_n(A\Delta B)\le C(n)\sqrt{\delta}$ for some function $C(n)$ with no effective bounds. Compared to the results of Cianchi et al., our results have optimal (i.e., no) dependence on the dimension, but worse dependence on $\delta$. Comment: Published at http://dx.doi.org/10.1214/13-AOP860 in the Annals of Probability (http://www.imstat.org/aop/) by the Institute of Mathematical Statistics (http://www.imstat.org
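
    The constant $(2\pi)^{-1/2}$ in the abstract above is the boundary measure of a half-space through the origin. A short worked check, assuming the standard facts that a half-space $H_t = \{x : x_1 \le t\}$ has Gaussian measure $\Phi(t)$ and boundary measure $\varphi(t)$; the symbols $H_t$, $\Phi$ and $\varphi$ are notation introduced here, not taken from the abstract:

```latex
\[
  \gamma_n(H_t) = \Phi(t) = \int_{-\infty}^{t} \varphi(s)\,ds,
  \qquad
  \gamma_n^+(H_t) = \varphi(t) = \frac{1}{\sqrt{2\pi}}\, e^{-t^2/2},
\]
\[
  \gamma_n(H_0) = \Phi(0) = \tfrac{1}{2},
  \qquad
  \gamma_n^+(H_0) = \varphi(0) = (2\pi)^{-1/2} \approx 0.3989 .
\]
```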

    Phylogenetic mixtures: Concentration of measure in the large-tree limit

    The reconstruction of phylogenies from DNA or protein sequences is a major task of computational evolutionary biology. Common phenomena, notably variations in mutation rates across genomes and incongruences between gene lineage histories, often make it necessary to model molecular data as originating from a mixture of phylogenies. Such mixed models play an increasingly important role in practice. Using concentration of measure techniques, we show that mixtures of large trees are typically identifiable. We also derive sequence-length requirements for high-probability reconstruction. Comment: Published in at http://dx.doi.org/10.1214/11-AAP837 the Annals of Applied Probability (http://www.imstat.org/aap/) by the Institute of Mathematical Statistics (http://www.imstat.org

    Approximation Resistant Predicates From Pairwise Independence

    We study the approximability of predicates on $k$ variables from a domain $[q]$, and give a new sufficient condition for such predicates to be approximation resistant under the Unique Games Conjecture. Specifically, we show that a predicate $P$ is approximation resistant if there exists a balanced pairwise independent distribution over $[q]^k$ whose support is contained in the set of satisfying assignments to $P$
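
    The condition in the abstract above involves a balanced pairwise independent distribution over $[q]^k$. A minimal sketch of how such a property could be checked by brute force is given below. It assumes $k \ge 2$, represents $[q]$ as `range(q)`, and reads "balanced pairwise independent" as "every pair of coordinates is uniform on $[q]^2$" (which also makes each single coordinate uniform); the function name is hypothetical.

```python
from itertools import combinations, product


def is_balanced_pairwise_independent(dist, q, k, tol=1e-12):
    """Check that every pair of coordinates of dist is uniform on [q]^2.

    dist : dict mapping k-tuples over range(q) to probabilities (assumed
           to sum to 1).
    Assumes k >= 2; uniform pairs then imply uniform single coordinates.
    """
    target = 1.0 / (q * q)
    for i, j in combinations(range(k), 2):
        for a, b in product(range(q), repeat=2):
            mass = sum(p for x, p in dist.items() if x[i] == a and x[j] == b)
            if abs(mass - target) > tol:
                return False
    return True


if __name__ == "__main__":
    # Uniform distribution on the even-parity strings in {0,1}^3.
    parity = {x: 0.25 for x in product(range(2), repeat=3) if sum(x) % 2 == 0}
    print(is_balanced_pairwise_independent(parity, q=2, k=3))

    # Uniform distribution on {000, 111}: coordinates are fully correlated.
    repeated = {(0, 0, 0): 0.5, (1, 1, 1): 0.5}
    print(is_balanced_pairwise_independent(repeated, q=2, k=3))
```

    In the first example the support is exactly the set of satisfying assignments of the 3-variable even-parity predicate, so that predicate meets the stated sufficient condition under this reading.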