
    Enumerating contingency tables via random permanents

    Given m positive integers R=(r_i), n positive integers C=(c_j) such that sum r_i = sum c_j = N, and mn non-negative weights W=(w_{ij}), we consider the total weight T=T(R, C; W) of non-negative integer matrices (contingency tables) D=(d_{ij}) with row sums r_i and column sums c_j, where the weight of D equals prod w_{ij}^{d_{ij}}. We present a randomized algorithm, of complexity polynomial in N, which computes a number T'=T'(R, C; W) such that T' < T < alpha(R, C) T', where alpha(R, C) = min{prod r_i! r_i^{-r_i}, prod c_j! c_j^{-c_j}} N^N/N!. In many cases, ln T' provides an asymptotically accurate estimate of ln T. The idea of the algorithm is to express T as the expectation of the permanent of an N x N random matrix with exponentially distributed entries and to approximate the expectation by the integral T' of an efficiently computable log-concave function on R^{mn}. Applications to counting integer flows in graphs are also discussed. Comment: 19 pages, bounds are sharpened, references are added.
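
    To make the definitions of T and alpha(R, C) concrete, here is a minimal brute-force sketch. It is not the randomized algorithm of the abstract: it enumerates every table, so it takes exponential time and is only usable on tiny margins. The function names `compositions`, `total_weight`, and `alpha` are illustrative assumptions.

```python
from math import factorial, prod


def compositions(total, parts):
    """Yield all tuples of `parts` non-negative integers summing to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


def total_weight(R, C, W):
    """Brute-force T(R, C; W): sum over all tables D of prod w_ij ** d_ij."""
    n = len(C)

    def rec(i, remaining):
        # `remaining` holds the column sums still to be filled by rows i, i+1, ...
        if i == len(R):
            return 1.0 if all(c == 0 for c in remaining) else 0.0
        acc = 0.0
        for row in compositions(R[i], n):
            if all(d <= c for d, c in zip(row, remaining)):
                w = prod(W[i][j] ** row[j] for j in range(n))
                left = tuple(c - d for c, d in zip(remaining, row))
                acc += w * rec(i + 1, left)
        return acc

    return rec(0, tuple(C))


def alpha(R, C):
    """The factor alpha(R, C) from the abstract, for positive margins R and C."""
    N = sum(R)
    rows = prod(factorial(r) / r**r for r in R)
    cols = prod(factorial(c) / c**c for c in C)
    return min(rows, cols) * N**N / factorial(N)
```

    With all margins equal to 1, the tables are permutation matrices and T is the permanent of W: `total_weight((1, 1), (1, 1), [[2, 3], [5, 7]])` gives 2*7 + 3*5 = 29.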

    Markov bases and subbases for bounded contingency tables

    In this paper we study the computation of Markov bases for contingency tables whose cell entries have an upper bound. In general, a Markov basis for unbounded contingency tables under a given model differs from a Markov basis for bounded tables. Rapallo (2007) applied Lawrence lifting to compute a Markov basis for contingency tables whose cell entries are bounded. However, in the process, one has to compute the universal Gr\"obner basis of the ideal associated with the design matrix for a model, which is, in general, larger than any reduced Gr\"obner basis. Thus, this is also infeasible in small- and medium-sized problems. In this paper we focus on bounded two-way contingency tables under the independence model and show that if the bounds on the cells are positive, i.e., they are not structural zeros, the set of basic moves of all $2 \times 2$ minors connects all tables with given margins. We end this paper with an open problem: given that the margins are positive, find a necessary and sufficient condition on the set of structural zeros under which the set of basic moves of all $2 \times 2$ minors connects all incomplete contingency tables with given margins. Comment: 22 pages. It will appear in the Annals of the Institute of Statistical Mathematics.
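
    The basic moves mentioned in the abstract can be sketched as follows. This is a minimal illustration, assuming a table stored as a list of lists with at least two rows and two columns, and a hypothetical helper name `basic_move`. A move adds +1 on one diagonal of a randomly chosen 2 x 2 minor and -1 on the other, which leaves every row and column sum unchanged, and it is rejected if any touched cell would leave its range [0, bound].

```python
import random


def basic_move(table, bounds, rng=random):
    """Attempt one 2x2 basic move; return the new table, or the old one if rejected."""
    m, n = len(table), len(table[0])
    i1, i2 = rng.sample(range(m), 2)  # two distinct rows
    j1, j2 = rng.sample(range(n), 2)  # two distinct columns
    new = [row[:] for row in table]
    # +1 on one diagonal of the minor, -1 on the other: margins are preserved.
    new[i1][j1] += 1
    new[i2][j2] += 1
    new[i1][j2] -= 1
    new[i2][j1] -= 1
    touched = [(i1, j1), (i2, j2), (i1, j2), (i2, j1)]
    if all(0 <= new[i][j] <= bounds[i][j] for i, j in touched):
        return new
    return table  # the move would violate a cell bound, so stay put
```

    Repeating `table = basic_move(table, bounds)` gives a random walk on the bounded tables with the starting table's margins. Whether this walk reaches every such table is exactly the connectivity question the abstract addresses.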

    Independence Models for Integer Points of Polytopes.

    The integer points of a high-dimensional polytope P are generally difficult to count or sample uniformly. We consider a class of low-complexity random models for these points which arise from an entropy maximization problem. From these models, by way of "anti-concentration" results for sums of independent random variables, we derive general, efficiently computable upper bounds on the number of integer points of P. We make a detailed study of contingency tables with bounded entries, which are the integer points of a transportation polytope truncated by a cuboid. We provide efficiently computable estimates for the logarithm of the number of m by n tables with specified row and column sums r_1, ..., r_m, c_1, ..., c_n and bounds on the entries. These estimates are asymptotic as m and n go to infinity simultaneously, given that no r_i (resp., c_j) is allowed to exceed a fixed multiple of the average row sum (resp., column sum). As an application, we consider a random, uniformly selected table with entries in {0, 1, ..., kappa} having a given sum. Responding to questions raised by Diaconis and Efron in the context of statistical significance testing, we show that the occurrence of row sums r_1, ..., r_m is positively correlated with the occurrence of column sums c_1, ..., c_n when kappa > 1 and r_1, ..., r_m, c_1, ..., c_n are sufficiently extreme. We give evidence that the opposite is true for near-average values of r_1, ..., r_m, c_1, ..., c_n. Ph.D., Mathematics, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/86295/1/auspex_1.pd

    A decomposition based proof for fast mixing of a Markov chain over balanced realizations of a joint degree matrix

    A joint degree matrix (JDM) specifies the number of connections between nodes of given degrees in a graph, for all degree pairs, and uniquely determines the degree sequence of the graph. We consider the space of all balanced realizations of an arbitrary JDM, realizations in which the links between any two degree groups are placed as uniformly as possible. We prove that a swap Markov Chain Monte Carlo (MCMC) algorithm in the space of all balanced realizations of an arbitrary graphical JDM mixes rapidly, i.e., the relaxation time of the chain is bounded from above by a polynomial in the number of nodes $n$. To prove fast mixing, we first prove a general factorization theorem similar to the Martin-Randall method for disjoint decompositions (partitions). This theorem can be used to bound from below the spectral gap with the help of fast mixing subchains within every partition and a bound on an auxiliary Markov chain between the partitions. Our proof of the general factorization theorem is direct and uses conductance-based methods (Cheeger inequality). Comment: submitted, 18 pages, 4 figures.

    Stream sketches, sampling, and sabotage

    Exact solutions are unattainable for important problems. The calculations are limited by the memory of our computers and the length of time that we can wait for a solution. The field of approximation algorithms has grown to address this problem; it is practically important and theoretically fascinating. We address three questions along these lines. What are the limits of streaming computation? Can we efficiently compute the likelihood of a given network of relationships? How robust are the solutions to combinatorial optimization problems? High speed network monitoring and rapid acquisition of scientific data require the development of space efficient algorithms. In these settings it is impractical or impossible to store all of the data; nonetheless, the need for analyzing it persists. Typically, the goal is to compute some simple statistics on the input using sublinear, or even polylogarithmic, space. Our main contributions here are the complete classification of the space necessary for several types of statistics. Our sharpest results characterize the complexity in terms of the domain size and stream length. Furthermore, our algorithms are universal for their respective classes of statistics. A network of relationships, for example friendships or species-habitat pairings, can often be represented as a binary contingency table, which is a {0,1}-matrix with given row and column sums. A natural null model for hypothesis testing here is the uniform distribution on the set of binary contingency tables with the same line sums as the observation. However, exact calculation, asymptotic approximation, and even Monte-Carlo approximation of p-values are so far practically unattainable for many interesting examples. This thesis presents two new algorithms for sampling contingency tables. One is a hybrid algorithm that combines elements of two previously known algorithms. It is intended to exploit certain properties of the margins that are observed in some data sets. Our other algorithm samples from a larger set of tables, but it has the advantage of being fast. The robustness of a system can be assessed from optimal attack strategies. Interdiction problems ask about the worst-case impact of a limited change to an underlying optimization problem. Most interdiction problems are NP-hard, and furthermore, even designing efficient approximation algorithms that allow for estimating the order of magnitude of a worst-case impact has turned out to be very difficult. We suggest a general method to obtain pseudoapproximations for many interdiction problems.

    New Classes of Degree Sequences with Fast Mixing Swap Markov Chain Sampling

    In network modelling of complex systems one is often required to sample random realizations of networks that obey a given set of constraints, usually in the form of graph measures. A much studied class of problems targets uniform sampling of simple graphs with given degree sequence or also with given degree correlations expressed in the form of a Joint Degree Matrix. One approach is to use Markov chains based on edge switches (swaps) that preserve the constraints, are irreducible (ergodic) and fast mixing. In 1999, Kannan, Tetali and Vempala (KTV) proposed a simple swap Markov chain for sampling graphs with given degree sequence, and conjectured that it mixes rapidly (in polynomial time) for arbitrary degree sequences. Although the conjecture is still open, it has been proved for special degree sequences, in particular for those of undirected and directed regular simple graphs, half-regular bipartite graphs, and graphs with certain bounded maximum degrees. Here we prove the fast mixing KTV conjecture for novel, exponentially large classes of irregular degree sequences. Our method is based on a canonical decomposition of degree sequences into split graph degree sequences, a structural theorem for the space of graph realizations and on a factorization theorem for Markov chains. After introducing bipartite ‘splitted’ degree sequences, we also generalize the canonical split graph decomposition for bipartite and directed graphs. Copyright © Cambridge University Press 201
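
    A single step of an edge-swap chain of the kind described above can be sketched as follows. This is a simplified illustration, not necessarily the exact chain analysed in the paper: the edge representation (a set of sorted vertex pairs) and the function name `swap_step` are assumptions. Two edges (a, b) and (c, d) are replaced by (a, d) and (c, b) whenever the result is still a simple graph, so every vertex keeps its degree.

```python
import random


def swap_step(edges, rng=random):
    """Attempt one degree-preserving swap on a simple graph.

    `edges` is a set of tuples (u, v) with u < v. Returns a new edge set,
    or the original one if the proposed swap is rejected.
    """
    e1, e2 = rng.sample(sorted(edges), 2)  # two distinct edges
    a, b = e1
    c, d = e2
    if rng.random() < 0.5:
        c, d = d, c  # choose one of the two possible rewirings at random
    if len({a, b, c, d}) < 4:
        return edges  # shared endpoint: the swap would create a loop or change nothing
    f1 = tuple(sorted((a, d)))
    f2 = tuple(sorted((c, b)))
    if f1 in edges or f2 in edges:
        return edges  # the swap would create a multi-edge
    return (edges - {e1, e2}) | {f1, f2}
```

    Rejected proposals leave the graph unchanged, which makes the chain lazy on the set of simple realizations of the degree sequence. How quickly such a chain mixes is the question the abstract is about, and this sketch says nothing about it.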

    Geometric Combinatorics of Transportation Polytopes and the Behavior of the Simplex Method

    This dissertation investigates the geometric combinatorics of convex polytopes and connections to the behavior of the simplex method for linear programming. We focus our attention on transportation polytopes, which are sets of all tables of non-negative real numbers satisfying certain summation conditions. Transportation problems are, in many ways, the simplest kind of linear programs and thus have a rich combinatorial structure. First, we give new results on the diameters of certain classes of transportation polytopes and their relation to the Hirsch Conjecture, which asserts that the diameter of every $d$-dimensional convex polytope with $n$ facets is bounded above by $n-d$. In particular, we prove a new quadratic upper bound on the diameter of $3$-way axial transportation polytopes defined by $1$-marginals. We also show that the Hirsch Conjecture holds for $p \times 2$ classical transportation polytopes, but that there are infinitely many Hirsch-sharp classical transportation polytopes. Second, we present new results on subpolytopes of transportation polytopes. We investigate, for example, a non-regular triangulation of a subpolytope of the fourth Birkhoff polytope $B_4$. This implies the existence of non-regular triangulations of all Birkhoff polytopes $B_n$ for $n \geq 4$. We also study certain classes of network flow polytopes and prove new linear upper bounds for their diameters. Comment: PhD thesis submitted June 2010 to the University of California, Davis. 183 pages, 49 figures.