33 research outputs found

    Convex Rank Tests and Semigraphoids

    Convex rank tests are partitions of the symmetric group which have desirable geometric properties. The statistical tests defined by such partitions involve counting all permutations in the equivalence classes. Each class consists of the linear extensions of a partially ordered set specified by data. Our methods refine existing rank tests of non-parametric statistics, such as the sign test and the runs test, and are useful for exploratory analysis of ordinal data. We establish a bijection between convex rank tests and probabilistic conditional independence structures known as semigraphoids. The subclass of submodular rank tests is derived from faces of the cone of submodular functions, or from Minkowski summands of the permutohedron. We enumerate all small instances of such rank tests. Of particular interest are graphical tests, which correspond both to graphical models and to graph associahedra.

    Generalized Permutohedra from Probabilistic Graphical Models

    A graphical model encodes conditional independence relations via the Markov properties. For an undirected graph these conditional independence relations can be represented by a simple polytope known as the graph associahedron, which can be constructed as a Minkowski sum of standard simplices. There is an analogous polytope for conditional independence relations coming from a regular Gaussian model, and it can be defined using multiinformation or relative entropy. For directed acyclic graphical models and also for mixed graphical models containing undirected, directed and bidirected edges, we give a construction of this polytope, up to equivalence of normal fans, as a Minkowski sum of matroid polytopes. Finally, we apply this geometric insight to construct a new ordering-based search algorithm for causal inference via directed acyclic graphical models. Comment: Appendix B is expanded. Final version to appear in SIAM J. Discrete Mat

    The Cyclohedron Test for Finding Periodic Genes in Time Course Expression Studies

    The problem of finding periodically expressed genes from time course microarray experiments is at the center of numerous efforts to identify the molecular components of biological clocks. We present a new approach to this problem based on the cyclohedron test, which is a rank test inspired by recent advances in algebraic combinatorics. The test has the advantage of being robust to measurement errors, and can be used to ascertain the significance of top-ranked genes. We apply the test to recently published measurements of gene expression during mouse somitogenesis and find 32 genes that collectively are significant. Among these are previously identified periodic genes involved in the Notch/FGF and Wnt signaling pathways, as well as novel candidate genes that may play a role in regulating the segmentation clock. These results confirm that there is an abundance of exceptionally periodic genes expressed during somitogenesis. The emphasis of this paper is on the statistics and combinatorics that underlie the cyclohedron test and its implementation within a multiple testing framework. Comment: Revision consists of reorganization and further statistical discussion; 19 pages, 4 figure

    "Building" exact confidence nets

    Confidence nets, that is, collections of confidence intervals that fill out the parameter space and whose exact parameter coverage can be computed, are familiar in nonparametric statistics. Here, the distributional assumptions are based on invariance under the action of a finite reflection group. Exact confidence nets are exhibited for a single parameter, based on the root system of the group. The main result is a formula for the generating function of the coverage interval probabilities. The proof makes use of the theory of "buildings" and the Chevalley factorization theorem for the length distribution on Cayley graphs of finite reflection groups. Comment: 20 pages. To appear in Bernoull

    Using TPA to count linear extensions

    A linear extension of a poset $P$ is a permutation of the elements of the set that respects the partial order. Let $L(P)$ denote the number of linear extensions. It is a #P-complete problem to determine $L(P)$ exactly for an arbitrary poset, and so randomized approximation algorithms that draw randomly from the set of linear extensions are used. In this work, the set of linear extensions is embedded in a larger state space with a continuous parameter ?. The introduction of a continuous parameter allows for the use of a more efficient method for approximating $L(P)$ called TPA. Our primary result is that it is possible to sample from this continuous embedding in time that is as fast as or faster than the best known methods for sampling uniformly from linear extensions. For a poset containing $n$ elements, this means we can approximate $L(P)$ to within a factor of $1 + \epsilon$ with probability at least $1 - \delta$ using an expected number of random bits and comparisons in the poset which is at most $O(n^3 (\ln n)(\ln L(P)) \epsilon^{-2} \ln \delta^{-1})$. Comment: 12 pages, 4 algorithm
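
To make the quantity L(P) concrete, here is a minimal Python sketch that counts linear extensions exactly on very small posets. It is not the TPA method from the abstract: it is a plain exponential-time dynamic program over subsets of already-placed elements, shown only to illustrate the definition. The function name `count_linear_extensions` and the input encoding (elements `0..n-1`, relations as `(a, b)` pairs meaning a comes before b) are assumptions made for this example.

```python
from functools import lru_cache


def count_linear_extensions(n, relations):
    """Count linear extensions of a poset on elements 0..n-1.

    `relations` is an iterable of (a, b) pairs meaning a < b in the poset.
    The pairs are assumed to be acyclic; a cyclic input simply yields 0.
    Hypothetical helper for illustration; exponential in n, so small n only.
    """
    # preds[x] is a bitmask of the elements that must appear before x.
    preds = [0] * n
    for a, b in relations:
        preds[b] |= 1 << a

    full = (1 << n) - 1

    @lru_cache(maxsize=None)
    def ways(placed):
        # `placed` is a bitmask of elements already put into the ordering.
        if placed == full:
            return 1
        total = 0
        for x in range(n):
            bit = 1 << x
            already_placed = (placed & bit) != 0
            preds_ready = (preds[x] & placed) == preds[x]
            if not already_placed and preds_ready:
                total += ways(placed | bit)
        return total

    return ways(0)
```

With no relations every permutation is a linear extension, so three unrelated elements give 3! = 6, while a chain 0 < 1 < 2 gives exactly one. The #P-completeness mentioned above is the reason an exact count like this does not scale and randomized approximation is used instead.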