Enumerating contingency tables via random permanents

Author: Barvinok Alexander
Publication date: 01/01/2005

Given m positive integers R=(r_i), n positive integers C=(c_j) such that sum r_i = sum c_j = N, and mn non-negative weights W=(w_{ij}), we consider the total weight T=T(R, C; W) of non-negative integer matrices (contingency tables) D=(d_{ij}) with the row sums r_i, column sums c_j, and the weight of D equal to prod w_{ij}^{d_{ij}}. We present a randomized algorithm, of complexity polynomial in N, which computes a number T'=T'(R, C; W) such that T' < T < alpha(R, C) T', where alpha(R, C) = min{prod r_i! r_i^{-r_i}, prod c_j! c_j^{-c_j}} N^N/N!. In many cases, ln T' provides an asymptotically accurate estimate of ln T. The idea of the algorithm is to express T as the expectation of the permanent of an N x N random matrix with exponentially distributed entries and approximate the expectation by the integral T' of an efficiently computable log-concave function on R^{mn}. Applications to counting integer flows in graphs are also discussed. Comment: 19 pages, bounds are sharpened, references are added.
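To make the quantities in this abstract concrete, here is a minimal Python sketch that computes T(R, C; W) by exhaustive enumeration and evaluates the factor alpha(R, C) from the stated formula. It is not the randomized permanent-based algorithm of the paper; it is a brute-force check that is only practical for very small margins, and the function names (`compositions`, `tables`, `total_weight`, `alpha`) are hypothetical.

```python
from math import factorial, prod


def compositions(total, caps):
    """Yield tuples d with 0 <= d[j] <= caps[j] and sum(d) == total."""
    if not caps:
        if total == 0:
            yield ()
        return
    for first in range(min(total, caps[0]) + 1):
        for rest in compositions(total - first, caps[1:]):
            yield (first,) + rest


def tables(rows, cols):
    """Yield every non-negative integer matrix with the given row and column sums."""
    if not rows:
        if all(c == 0 for c in cols):
            yield ()
        return
    # Fill the first row, then recurse on the column sums that are left over.
    for row in compositions(rows[0], cols):
        remaining = tuple(c - d for c, d in zip(cols, row))
        for rest in tables(rows[1:], remaining):
            yield (row,) + rest


def total_weight(rows, cols, weights):
    """T(R, C; W): sum over all tables D of prod w_ij ** d_ij (brute force)."""
    return sum(
        prod(weights[i][j] ** d for i, row in enumerate(table) for j, d in enumerate(row))
        for table in tables(rows, cols)
    )


def alpha(rows, cols):
    """alpha(R, C) = min{prod r_i!/r_i^r_i, prod c_j!/c_j^c_j} * N^N / N!."""
    n = sum(rows)
    row_term = prod(factorial(r) / r**r for r in rows)
    col_term = prod(factorial(c) / c**c for c in cols)
    return min(row_term, col_term) * n**n / factorial(n)
```

For R = (2, 1) and C = (1, 2) there are exactly two tables, so with all weights equal to 1 the total weight is 2, and alpha(R, C) evaluates to 2.25.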

arXiv.org e-Print Archive

CiteSeerX

Markov bases and subbases for bounded contingency tables

Author: A. Agresti
A. Bigatti
D. Cox
F. Rapallo
Fabio Rapallo
J. De Loera
J. Shao
P. Diaconis
Ruriko Yoshida
S. Aoki
Y. Chen
Publication date: 01/01/2010

In this paper we study the computation of Markov bases for contingency tables whose cell entries have an upper bound. In general a Markov basis for unbounded contingency tables under a certain model differs from a Markov basis for bounded tables. Rapallo (2007) applied Lawrence lifting to compute a Markov basis for contingency tables whose cell entries are bounded. However, in the process, one has to compute the universal Gröbner basis of the ideal associated with the design matrix for a model which is, in general, larger than any reduced Gröbner basis. Thus, this is also infeasible in small- and medium-sized problems. In this paper we focus on bounded two-way contingency tables under the independence model and show that if these bounds on cells are positive, i.e., they are not structural zeros, the set of basic moves of all $2 \times 2$ minors connects all tables with given margins. We end this paper with an open problem: if we know the given margins are positive, we want to find the necessary and sufficient condition on the set of structural zeros so that the set of basic moves of all $2 \times 2$ minors connects all incomplete contingency tables with given margins. Comment: 22 pages. It will appear in the Annals of the Institution of Statistical Mathematics
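As an illustration of the basic moves mentioned in this abstract, the following Python sketch applies a single $2 \times 2$ minor move to a two-way table: it adds 1 to cells (i, j) and (k, l) and subtracts 1 from cells (i, l) and (k, j), which leaves every row and column sum unchanged. The function names and the optional `upper` matrix of cell bounds are assumptions made for this example, not notation from the paper.

```python
def apply_basic_move(table, i, k, j, l, upper=None):
    """Apply the 2x2 basic move on rows i != k and columns j != l.

    Returns the new table, or None if a cell would drop below 0 or exceed
    its bound in `upper` (a hypothetical matrix of per-cell upper bounds).
    """
    new = [list(row) for row in table]
    for r, c, delta in ((i, j, 1), (k, l, 1), (i, l, -1), (k, j, -1)):
        new[r][c] += delta
        if new[r][c] < 0:
            return None
        if upper is not None and new[r][c] > upper[r][c]:
            return None
    return new


def margins(table):
    """Return (row sums, column sums) of a table given as a list of rows."""
    return [sum(row) for row in table], [sum(col) for col in zip(*table)]
```

A move that is rejected (returns None) simply means the chain stays where it is; whether the accepted moves connect all tables with the given margins is exactly the question the paper addresses.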

arXiv.org e-Print Archive

Crossref

Research Papers in Economics

Archivio istituzionale della ricerca - Università di Genova

Archivio Istituzionale della Ricerca- Università del Piemonte Orientale

Independence Models for Integer Points of Polytopes.

Author: Shapiro Austin Warren
Publication date: 01/01/2011

The integer points of a high-dimensional polytope P are generally difficult to count or sample uniformly. We consider a class of low-complexity random models for these points which arise from an entropy maximization problem. From these models, by way of "anti-concentration" results for sums of independent random variables, we derive general, efficiently computable upper bounds on the number of integer points of P. We make a detailed study of contingency tables with bounded entries, which are the integer points of a transportation polytope truncated by a cuboid. We provide efficiently computable estimates for the logarithm of the number of m by n tables with specified row and column sums r_1, ..., r_m, c_1, ..., c_n and bounds on the entries. These estimates are asymptotic as m and n go to infinity simultaneously, given that no r_i (resp., c_j) is allowed to exceed a fixed multiple of the average row sum (resp., column sum). As an application, we consider a random, uniformly selected table with entries in {0, 1, ..., kappa} having a given sum. Responding to questions raised by Diaconis and Efron in the context of statistical significance testing, we show that the occurrence of row sums r_1, ..., r_m is positively correlated with the occurrence of column sums c_1, ..., c_n when kappa > 1 and r_1, ..., r_m, c_1, ..., c_n are sufficiently extreme. We give evidence that the opposite is true for near-average values of r_1, ..., r_m, c_1, ..., c_n. Ph.D., Mathematics, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/86295/1/auspex_1.pd

CiteSeerX

Deep Blue Documents at the University of Michigan

A decomposition based proof for fast mixing of a Markov chain over balanced realizations of a joint degree matrix

Author: Erdős Péter
Miklós István
Toroczkai Z.
Publication venue: 'Society for Industrial & Applied Mathematics (SIAM)'
Publication date: 16/09/2014

A joint degree matrix (JDM) specifies the number of connections between nodes of given degrees in a graph, for all degree pairs and uniquely determines the degree sequence of the graph. We consider the space of all balanced realizations of an arbitrary JDM, realizations in which the links between any two degree groups are placed as uniformly as possible. We prove that a swap Markov Chain Monte Carlo (MCMC) algorithm in the space of all balanced realizations of an arbitrary graphical JDM mixes rapidly, i.e., the relaxation time of the chain is bounded from above by a polynomial in the number of nodes $n$. To prove fast mixing, we first prove a general factorization theorem similar to the Martin-Randall method for disjoint decompositions (partitions). This theorem can be used to bound from below the spectral gap with the help of fast mixing subchains within every partition and a bound on an auxiliary Markov chain between the partitions. Our proof of the general factorization theorem is direct and uses conductance based methods (Cheeger inequality). Comment: submitted, 18 pages, 4 figures

arXiv.org e-Print Archive

SZTAKI Publication Repository

Repository of the Academy's Library

MPG.PuRe

Stream sketches, sampling, and sabotage

Author: Chestnut Stephen Robert
Publication venue: Johns Hopkins University
Publication date: 01/01/2015

Exact solutions are unattainable for important problems. The calculations are limited by the memory of our computers and the length of time that we can wait for a solution. The field of approximation algorithms has grown to address this problem; it is practically important and theoretically fascinating. We address three questions along these lines. What are the limits of streaming computation? Can we efficiently compute the likelihood of a given network of relationships? How robust are the solutions to combinatorial optimization problems? High-speed network monitoring and rapid acquisition of scientific data require the development of space-efficient algorithms. In these settings it is impractical or impossible to store all of the data; nonetheless, the need for analyzing it persists. Typically, the goal is to compute some simple statistics on the input using sublinear, or even polylogarithmic, space. Our main contributions here are the complete classification of the space necessary for several types of statistics. Our sharpest results characterize the complexity in terms of the domain size and stream length. Furthermore, our algorithms are universal for their respective classes of statistics. A network of relationships, for example friendships or species-habitat pairings, can often be represented as a binary contingency table, which is a {0,1}-matrix with given row and column sums. A natural null model for hypothesis testing here is the uniform distribution on the set of binary contingency tables with the same line sums as the observation. However, exact calculation, asymptotic approximation, and even Monte-Carlo approximation of p-values are so far practically unattainable for many interesting examples. This thesis presents two new algorithms for sampling contingency tables. One is a hybrid algorithm that combines elements of two previously known algorithms. It is intended to exploit certain properties of the margins that are observed in some data sets. Our other algorithm samples from a larger set of tables, but it has the advantage of being fast. The robustness of a system can be assessed from optimal attack strategies. Interdiction problems ask about the worst-case impact of a limited change to an underlying optimization problem. Most interdiction problems are NP-hard, and furthermore, even designing efficient approximation algorithms that allow for estimating the order of magnitude of a worst-case impact has turned out to be very difficult. We suggest a general method to obtain pseudoapproximations for many interdiction problems

CiteSeerX

JScholarship

New Classes of Degree Sequences with Fast Mixing Swap Markov Chain Sampling

Author: Erdős Péter
Miklós István
Toroczkai Zoltán
Publication venue: 'Cambridge University Press (CUP)'
Publication date: 14/12/2016

In network modelling of complex systems one is often required to sample random realizations of networks that obey a given set of constraints, usually in the form of graph measures. A much studied class of problems targets uniform sampling of simple graphs with given degree sequence or also with given degree correlations expressed in the form of a Joint Degree Matrix. One approach is to use Markov chains based on edge switches (swaps) that preserve the constraints, are irreducible (ergodic) and fast mixing. In 1999, Kannan, Tetali and Vempala (KTV) proposed a simple swap Markov chain for sampling graphs with given degree sequence, and conjectured that it mixes rapidly (in polynomial time) for arbitrary degree sequences. Although the conjecture is still open, it has been proved for special degree sequences, in particular for those of undirected and directed regular simple graphs, half-regular bipartite graphs, and graphs with certain bounded maximum degrees. Here we prove the fast mixing KTV conjecture for novel, exponentially large classes of irregular degree sequences. Our method is based on a canonical decomposition of degree sequences into split graph degree sequences, a structural theorem for the space of graph realizations and on a factorization theorem for Markov chains. After introducing bipartite ‘splitted’ degree sequences, we also generalize the canonical split graph decomposition for bipartite and directed graphs. Copyright © Cambridge University Press 201
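The edge switch (swap) operation described in this abstract can be sketched in a few lines of Python. The block below shows one deterministic swap attempt on a simple undirected graph stored as a set of two-element frozensets; a full chain would pick the two edges at random and typically also include a lazy step, which is omitted here. The names `try_swap` and `degrees` are hypothetical, and the sketch is not claimed to reproduce the exact chain analyzed in the paper.

```python
from collections import Counter


def try_swap(edges, e1, e2):
    """Try to replace edges (a, b), (c, d) by (a, d), (c, b).

    `edges` is a set of frozensets of size 2. Returns the new edge set, or
    None if the swap would create a loop or a multi-edge, so that the graph
    stays simple and every vertex keeps its degree.
    """
    a, b = e1
    c, d = e2
    if frozenset(e1) not in edges or frozenset(e2) not in edges:
        return None
    if len({a, b, c, d}) < 4:  # shared endpoint: swap would create a loop or change nothing
        return None
    new1, new2 = frozenset((a, d)), frozenset((c, b))
    if new1 in edges or new2 in edges:  # swap would create a multi-edge
        return None
    return (edges - {frozenset(e1), frozenset(e2)}) | {new1, new2}


def degrees(edges):
    """Return a Counter mapping each vertex to its degree."""
    return Counter(v for edge in edges for v in edge)
```

Because each of the four vertices loses one edge and gains one, the degree sequence is preserved by construction; the hard part, which the paper addresses, is bounding how quickly repeated random swaps mix.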

arXiv.org e-Print Archive

Repository of the Academy's Library

Geometric Combinatorics of Transportation Polytopes and the Behavior of the Simplex Method

Author: Kim Edward D.
Publication date: 01/01/2010

This dissertation investigates the geometric combinatorics of convex polytopes and connections to the behavior of the simplex method for linear programming. We focus our attention on transportation polytopes, which are sets of all tables of non-negative real numbers satisfying certain summation conditions. Transportation problems are, in many ways, the simplest kind of linear programs and thus have a rich combinatorial structure. First, we give new results on the diameters of certain classes of transportation polytopes and their relation to the Hirsch Conjecture, which asserts that the diameter of every $d$-dimensional convex polytope with $n$ facets is bounded above by $n-d$. In particular, we prove a new quadratic upper bound on the diameter of $3$-way axial transportation polytopes defined by $1$-marginals. We also show that the Hirsch Conjecture holds for $p \times 2$ classical transportation polytopes, but that there are infinitely many Hirsch-sharp classical transportation polytopes. Second, we present new results on subpolytopes of transportation polytopes. We investigate, for example, a non-regular triangulation of a subpolytope of the fourth Birkhoff polytope $B_4$. This implies the existence of non-regular triangulations of all Birkhoff polytopes $B_n$ for $n \geq 4$. We also study certain classes of network flow polytopes and prove new linear upper bounds for their diameters. Comment: PhD thesis submitted June 2010 to the University of California, Davis. 183 pages, 49 figures

arXiv.org e-Print Archive

eScholarship - University of California

CERN Document Server