Enumerating contingency tables via random permanents
Given m positive integers R=(r_i), n positive integers C=(c_j) such that sum
r_i = sum c_j =N, and mn non-negative weights W=(w_{ij}), we consider the total
weight T=T(R, C; W) of non-negative integer matrices (contingency tables)
D=(d_{ij}) with the row sums r_i, column sums c_j, and the weight of D equal to
prod w_{ij}^{d_{ij}}. We present a randomized algorithm, of complexity
polynomial in N, which computes a number T'=T'(R,C; W) such that T' < T < alpha(R, C)
T' where alpha(R,C) = min{prod r_i! r_i^{-r_i}, prod c_j! c_j^{-c_j}} N^N/N!.
In many cases, ln T' provides an asymptotically accurate estimate of ln T. The
idea of the algorithm is to express T as the expectation of the permanent of an
N x N random matrix with exponentially distributed entries and approximate the
expectation by the integral T' of an efficiently computable log-concave
function on R^{mn}. Applications to counting integer flows in graphs are also
discussed.
Comment: 19 pages, bounds are sharpened, references are added
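The approximation factor alpha(R, C) in the abstract above can be evaluated directly from the margins. The following is a minimal sketch, assuming only the formula as stated there; the function name `log_alpha` is hypothetical, and the computation is done in log-space because the factorials overflow quickly.

```python
import math


def log_alpha(rows, cols):
    """Return ln alpha(R, C) for positive integer margins with equal totals.

    alpha(R, C) = min{prod r_i! r_i^{-r_i}, prod c_j! c_j^{-c_j}} * N^N / N!
    """
    n_total = sum(rows)
    if n_total != sum(cols):
        raise ValueError("row sums and column sums must have the same total N")

    def log_term(margins):
        # ln prod m! m^{-m} = sum (ln m! - m ln m); lgamma(m + 1) = ln m!
        return sum(math.lgamma(m + 1) - m * math.log(m) for m in margins)

    # ln(N^N / N!) = N ln N - ln N!
    log_scale = n_total * math.log(n_total) - math.lgamma(n_total + 1)
    return min(log_term(rows), log_term(cols)) + log_scale
```

For example, `log_alpha([1, 1], [1, 1])` equals ln 2: both products are 1, and N^N/N! = 4/2.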
Markov bases and subbases for bounded contingency tables
In this paper we study the computation of Markov bases for contingency tables
whose cell entries have an upper bound. In general a Markov basis for unbounded
contingency tables under a certain model differs from a Markov basis for bounded
tables. Rapallo (2007) applied Lawrence lifting to compute a Markov basis for
contingency tables whose cell entries are bounded. However, in the process, one
has to compute the universal Gr\"obner basis of the ideal associated with the
design matrix for a model which is, in general, larger than any reduced
Gr\"obner basis. Thus, this is also infeasible in small- and medium-sized
problems. In this paper we focus on bounded two-way contingency tables under
the independence model and show that if these bounds on cells are positive, i.e.,
they are not structural zeros, the set of basic moves of all 2 x 2
minors connects all tables with given margins. We end this paper with an open
problem: given that the margins are positive, find the
necessary and sufficient condition on the set of structural zeros so that the
set of basic moves of all 2 x 2 minors connects all incomplete
contingency tables with given margins.
Comment: 22 pages. It will appear in the Annals of the Institution of
Statistical Mathematics
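To make the notion of a basic move concrete, here is a minimal sketch of a single move on a bounded two-way table. It assumes the standard form of a basic move for the independence model (add 1 to two diagonal cells of a 2 x 2 minor and subtract 1 from the two off-diagonal cells); the function name `apply_basic_move` and the list-of-lists representation are hypothetical, and this is not the paper's algorithm.

```python
def apply_basic_move(table, bounds, i, k, j, l, sign=1):
    """Apply a 2 x 2 basic move on rows (i, k) and columns (j, l).

    Adds `sign` to cells (i, j) and (k, l) and subtracts it from (i, l) and
    (k, j), so every row sum and column sum is unchanged. Returns the new
    table, or None if some cell would leave the range [0, bound].
    """
    if i == k or j == l:
        raise ValueError("a basic move needs two distinct rows and two distinct columns")
    new = [row[:] for row in table]  # copy so the input table is not mutated
    for r, c, delta in ((i, j, sign), (k, l, sign), (i, l, -sign), (k, j, -sign)):
        new[r][c] += delta
        if not 0 <= new[r][c] <= bounds[r][c]:
            return None  # move is infeasible under the cell bounds
    return new
```

A Markov chain over bounded tables with fixed margins could propose such moves at random and stay in place whenever the function returns `None`.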
Independence Models for Integer Points of Polytopes.
The integer points of a high-dimensional polytope P are generally difficult to count or sample uniformly. We consider a class of low-complexity random models for these points which arise from an entropy maximization problem. From these models, by way of "anti-concentration" results for sums of independent random variables, we derive general, efficiently computable upper bounds on the number of integer points of P.
We make a detailed study of contingency tables with bounded entries, which are the integer points of a transportation polytope truncated by a cuboid. We provide efficiently computable estimates for the logarithm of the number of m by n tables with specified row and column sums r_1, ..., r_m, c_1, ..., c_n and bounds on the entries. These estimates are asymptotic as m and n go to infinity simultaneously, given that no r_i (resp., c_j) is allowed to exceed a fixed multiple of the average row sum (resp., column sum).
As an application, we consider a random, uniformly selected table with entries in {0, 1, ..., kappa} having a given sum. Responding to questions raised by Diaconis and Efron in the context of statistical significance testing, we show that the occurrence of row sums r_1, ..., r_m is positively correlated with the occurrence of column sums c_1, ..., c_n when kappa > 1 and r_1, ..., r_m, c_1, ..., c_n are sufficiently extreme. We give evidence that the opposite is true for near-average values of r_1, ..., r_m, c_1, ..., c_n.
Ph.D., Mathematics, University of Michigan, Horace H. Rackham School of Graduate Studies
http://deepblue.lib.umich.edu/bitstream/2027.42/86295/1/auspex_1.pd
A decomposition based proof for fast mixing of a Markov chain over balanced realizations of a joint degree matrix
A joint degree matrix (JDM) specifies the number of connections between nodes
of given degrees in a graph, for all degree pairs and uniquely determines the
degree sequence of the graph. We consider the space of all balanced
realizations of an arbitrary JDM, realizations in which the links between any
two degree groups are placed as uniformly as possible. We prove that a swap
Markov Chain Monte Carlo (MCMC) algorithm in the space of all balanced
realizations of an arbitrary graphical JDM mixes rapidly, i.e., the
relaxation time of the chain is bounded from above by a polynomial in the
number of nodes. To prove fast mixing, we first prove a general
factorization theorem similar to the Martin-Randall method for disjoint
decompositions (partitions). This theorem can be used to bound from below the
spectral gap with the help of fast mixing subchains within every partition and
a bound on an auxiliary Markov chain between the partitions. Our proof of the
general factorization theorem is direct and uses conductance-based methods
(Cheeger inequality).
Comment: submitted, 18 pages, 4 figures
Stream sketches, sampling, and sabotage
Exact solutions are unattainable for important problems. The calculations are limited by the memory of our computers and the length of time that we can wait for a solution. The field of approximation algorithms has grown to address this problem; it is practically important and theoretically fascinating. We address three questions along these lines. What are the limits of streaming computation? Can we efficiently compute the likelihood of a given network of relationships? How robust are the solutions to combinatorial optimization problems?
High-speed network monitoring and rapid acquisition of scientific data require the development of space-efficient algorithms. In these settings it is impractical or impossible to store all of the data; nonetheless, the need for analyzing it persists. Typically, the goal is to compute some simple statistics on the input using sublinear, or even polylogarithmic, space. Our main contributions here are the complete classification of the space necessary for several types of statistics. Our sharpest results characterize the complexity in terms of the domain size and stream length. Furthermore, our algorithms are universal for their respective classes of statistics.
A network of relationships, for example friendships or species-habitat pairings, can often be represented as a binary contingency table, which is a {0,1}-matrix with given row and column sums. A natural null model for hypothesis testing here is the uniform distribution on the set of binary contingency tables with the same line sums as the observation. However, exact calculation, asymptotic approximation, and even Monte Carlo approximation of p-values are so far practically unattainable for many interesting examples. This thesis presents two new algorithms for sampling contingency tables. One is a hybrid algorithm that combines elements of two previously known algorithms. It is intended to exploit certain properties of the margins that are observed in some data sets. Our other algorithm samples from a larger set of tables, but it has the advantage of being fast.
The robustness of a system can be assessed from optimal attack strategies. Interdiction problems ask about the worst-case impact of a limited change to an underlying optimization problem. Most interdiction problems are NP-hard, and furthermore, even designing efficient approximation algorithms that allow for estimating the order of magnitude of a worst-case impact has turned out to be very difficult. We suggest a general method to obtain pseudoapproximations for many interdiction problems
New Classes of Degree Sequences with Fast Mixing Swap Markov Chain Sampling
In network modelling of complex systems one is often required to sample random realizations of networks that obey a given set of constraints, usually in the form of graph measures. A much studied class of problems targets uniform sampling of simple graphs with given degree sequence or also with given degree correlations expressed in the form of a Joint Degree Matrix. One approach is to use Markov chains based on edge switches (swaps) that preserve the constraints, are irreducible (ergodic) and fast mixing. In 1999, Kannan, Tetali and Vempala (KTV) proposed a simple swap Markov chain for sampling graphs with given degree sequence, and conjectured that it mixes rapidly (in polynomial time) for arbitrary degree sequences. Although the conjecture is still open, it has been proved for special degree sequences, in particular for those of undirected and directed regular simple graphs, half-regular bipartite graphs, and graphs with certain bounded maximum degrees. Here we prove the fast mixing KTV conjecture for novel, exponentially large classes of irregular degree sequences. Our method is based on a canonical decomposition of degree sequences into split graph degree sequences, a structural theorem for the space of graph realizations and on a factorization theorem for Markov chains. After introducing bipartite ‘splitted’ degree sequences, we also generalize the canonical split graph decomposition for bipartite and directed graphs. Copyright © Cambridge University Press 201
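The edge-switch idea described above can be illustrated with a minimal sketch of one degree-preserving swap on a simple undirected graph. This is a generic swap step under stated assumptions, not necessarily the exact chain proposed by Kannan, Tetali and Vempala: the function name `swap_step`, the edge representation (a set of two-element frozensets), and the rule of staying in place on an invalid proposal are all illustrative choices, and the graph is assumed to have at least two edges.

```python
import random


def swap_step(edges, rng):
    """Attempt one degree-preserving edge swap on a simple graph.

    edges: set of frozenset({u, v}) with at least two edges.
    Picks edges {a, b} and {c, d} and proposes {a, c} and {b, d}.
    Returns the new edge set, or the input unchanged if the proposal
    would create a loop or a multi-edge.
    """
    # Sort for a reproducible order before sampling with the given generator.
    (a, b), (c, d) = (sorted(e) for e in rng.sample(sorted(edges, key=sorted), 2))
    if rng.random() < 0.5:
        c, d = d, c  # choose between the two possible rewirings
    new_1, new_2 = frozenset((a, c)), frozenset((b, d))
    if len({a, b, c, d}) < 4 or new_1 in edges or new_2 in edges:
        return edges  # rejected: stay at the current realization
    return (edges - {frozenset((a, b)), frozenset((c, d))}) | {new_1, new_2}
```

Each accepted swap removes one edge at every endpoint and adds one back, so the degree sequence is unchanged; how many steps are needed for the chain to mix is exactly the question the abstract addresses.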
Geometric Combinatorics of Transportation Polytopes and the Behavior of the Simplex Method
This dissertation investigates the geometric combinatorics of convex
polytopes and connections to the behavior of the simplex method for linear
programming. We focus our attention on transportation polytopes, which are sets
of all tables of non-negative real numbers satisfying certain summation
conditions. Transportation problems are, in many ways, the simplest kind of
linear programs and thus have a rich combinatorial structure. First, we give
new results on the diameters of certain classes of transportation polytopes and
their relation to the Hirsch Conjecture, which asserts that the diameter of
every d-dimensional convex polytope with n facets is bounded above by
n-d. In particular, we prove a new quadratic upper bound on the diameter of
3-way axial transportation polytopes defined by 1-marginals. We also show
that the Hirsch Conjecture holds for p x 2 classical transportation
polytopes, but that there are infinitely many Hirsch-sharp classical
transportation polytopes. Second, we present new results on subpolytopes of
transportation polytopes. We investigate, for example, a non-regular
triangulation of a subpolytope of the fourth Birkhoff polytope B_4. This
implies the existence of non-regular triangulations of all Birkhoff polytopes
B_n for n >= 4. We also study certain classes of network flow polytopes
and prove new linear upper bounds for their diameters.
Comment: PhD thesis submitted June 2010 to the University of California,
Davis. 183 pages, 49 figures