46 research outputs found

Combinatorial Miller-Hagberg Algorithm for Randomization of Dense Networks

Author: Sayama Hiroki
Publication date: 19/11/2017

We propose a slightly revised Miller-Hagberg (MH) algorithm that efficiently generates a random network from a given expected degree sequence. The revision replaces the approximated edge probability between a pair of nodes with a combinatorially calculated edge probability that better captures the likelihood of edge presence, especially where edges are dense. The computational complexity of this combinatorial MH algorithm remains of the same order as that of the original. We evaluated the proposed algorithm through several numerical experiments. The results demonstrated that the proposed algorithm was particularly good at accurately representing high-degree nodes in dense, heterogeneous networks. This algorithm may be a useful alternative to other, more established network randomization methods, given that data are becoming increasingly large and dense in today's network science research. Comment: 8 pages, 3 figures; to appear in the Proceedings of CompleNet 2018, in press

arXiv.org e-Print Archive
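
To make the idea of generating a network from an expected degree sequence concrete, here is a minimal sketch of a naive expected-degree sampler. It assumes the approximated edge probability is the usual product form min(1, w_i w_j / S), with S the sum of the expected degrees; that assumption, the function name `naive_expected_degree_graph`, and the exclusion of self-loops are illustrative choices, not details taken from the abstract. It is neither the efficient MH algorithm nor the combinatorial revision the paper proposes, only the quadratic-time baseline that such algorithms speed up.

```python
import random


def naive_expected_degree_graph(weights, seed=None):
    """Sample an undirected graph from expected degrees (naive O(n^2) sketch).

    Each pair i < j is connected independently with probability
    min(1, w_i * w_j / S), where S = sum(weights). This product form is an
    assumption made for illustration; self-loops are excluded.
    """
    rng = random.Random(seed)
    total = sum(weights)
    n = len(weights)
    edges = []
    if total == 0:
        return edges  # no expected degree anywhere, so no edges
    for i in range(n):
        for j in range(i + 1, n):
            p = min(1.0, weights[i] * weights[j] / total)
            if rng.random() < p:
                edges.append((i, j))
    return edges


if __name__ == "__main__":
    # Hypothetical expected degrees for a five-node network.
    print(naive_expected_degree_graph([3, 2, 2, 1, 1], seed=1))
```

The cap at 1 is where the product form loses accuracy for dense networks with high-degree nodes, which is the regime the abstract singles out.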

On the impossibility of constructing good population mean estimators in a realistic Respondent Driven Sampling model

Author: Barbour Russell
Guntuboyina Adityanand
Heimer Robert
Publication date: 07/11/2014

Current methods for population mean estimation from data collected by Respondent Driven Sampling (RDS) are based on the Horvitz-Thompson estimator together with a set of assumptions on the sampling model under which the inclusion probabilities can be determined from the information contained in the data. In this paper, we argue that such a set of assumptions is too simplistic to be realistic and that, under realistic sampling models, the situation is far more complicated. Specifically, we study a realistic RDS sampling model that is motivated by a real-world RDS dataset. We show that, for this model, the inclusion probabilities, which are necessary for the application of the Horvitz-Thompson estimator, cannot be determined from the information in the sample alone. An implication is that, unless additional information about the underlying population network is obtained, it is hopeless to conceive of a general theory of population mean estimation from current RDS data. Comment: 13 pages, 2 figures

arXiv.org e-Print Archive
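
The role of the inclusion probabilities is easiest to see in the estimator itself. The sketch below computes a textbook Horvitz-Thompson population mean, assuming the population size and every sampled unit's inclusion probability are known; the function name `horvitz_thompson_mean` and the numbers in the usage lines are hypothetical. The abstract's point is that, under a realistic RDS model, the `inclusion_probs` argument cannot be recovered from the sample alone.

```python
def horvitz_thompson_mean(values, inclusion_probs, population_size):
    """Horvitz-Thompson estimate of a population mean.

    Each observed value y_i is weighted by 1 / pi_i, the inverse of its
    inclusion probability, and the weighted total is divided by the
    (assumed known) population size N.
    """
    if len(values) != len(inclusion_probs):
        raise ValueError("values and inclusion_probs must have equal length")
    if any(p <= 0 or p > 1 for p in inclusion_probs):
        raise ValueError("inclusion probabilities must lie in (0, 1]")
    weighted_total = sum(y / p for y, p in zip(values, inclusion_probs))
    return weighted_total / population_size


if __name__ == "__main__":
    # Hypothetical sample: three units, each included with probability 0.5,
    # drawn from a population of six.
    print(horvitz_thompson_mean([1, 2, 3], [0.5, 0.5, 0.5], 6))
```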

Information content of coevolutionary game landscapes

Author: Richter Hendrik
Publication date: 20/03/2018

Coevolutionary game dynamics is the result of players that may change their strategies and their network of interaction. For such games, and based on interpreting strategies as configurations, strategy-to-payoff maps can be defined for every interaction network, which makes it possible to derive game landscapes. This paper presents an analysis of these game landscapes by their information content. By this analysis, we particularly study the effect of a rescaled payoff matrix generalizing social dilemmas and differences between well-mixed and structured populations.

arXiv.org e-Print Archive

Negative Examples for Sequential Importance Sampling of Binary Contingency Tables

Author: Bezakova Ivona
Sinclair Alistair
Stefankovic Daniel
Vigoda Eric
Publication date: 28/06/2011

The sequential importance sampling (SIS) algorithm has gained considerable popularity for its empirical success. One of its noted applications is to the binary contingency tables problem, an important problem in statistics, where the goal is to estimate the number of 0/1 matrices with prescribed row and column sums. We give a family of examples in which the SIS procedure, if run for any subexponential number of trials, will underestimate the number of tables by an exponential factor. This result holds for any of the usual design choices in the SIS algorithm, namely the ordering of the columns and rows. These are apparently the first theoretical results on the efficiency of the SIS algorithm for binary contingency tables. Finally, we present experimental evidence that the SIS algorithm is efficient for row and column sums that are regular. Our work is a first step in determining the class of inputs for which SIS is effective.

arXiv.org e-Print Archive
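
The quantity that SIS tries to estimate can be computed exactly for very small inputs. The following sketch counts 0/1 matrices with prescribed row and column sums by brute-force enumeration; it is not the SIS algorithm, and the function name `count_binary_tables` is hypothetical. Its running time grows exponentially with the table size, which is why estimators such as SIS are used in practice.

```python
from itertools import combinations, product


def count_binary_tables(row_sums, col_sums):
    """Count 0/1 matrices with the given row and column sums by enumeration.

    Every row independently chooses which columns hold its ones; a
    combination of row choices is counted when the resulting column totals
    match col_sums exactly.
    """
    n = len(col_sums)
    row_choices = [list(combinations(range(n), r)) for r in row_sums]
    target = list(col_sums)
    count = 0
    for rows in product(*row_choices):
        totals = [0] * n
        for cols in rows:
            for j in cols:
                totals[j] += 1
        if totals == target:
            count += 1
    return count


if __name__ == "__main__":
    # Row sums (2, 1) and column sums (1, 1, 1).
    print(count_binary_tables([2, 1], [1, 1, 1]))
```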

A Parallel Algorithm for Generating a Random Graph with a Prescribed Degree Sequence

Author: Bhuiyan Hasanuzzaman
Khan Maleq
Marathe Madhav
Publication date: 10/09/2017

Random graphs (or networks) have attracted significantly increased interest owing to their popularity in modeling and simulating many complex real-world systems. The degree sequence is one of the most important aspects of these systems. Random graphs with a given degree sequence can capture many characteristics, such as dependent edges and non-binomial degree distributions, that are absent in many classical random graph models such as the Erdős-Rényi graph model. In addition, they have important applications in the uniform sampling of random graphs, counting the number of graphs having the same degree sequence, as well as in string theory, random matrix theory, and matching theory. In this paper, we present an OpenMP-based shared-memory parallel algorithm for generating a random graph with a prescribed degree sequence, which achieves a speedup of 20.5 with 32 cores. One of the steps in our parallel algorithm requires checking the Erdős-Gallai characterization, i.e., whether there exists a graph obeying the given degree sequence, in parallel. This paper presents the first non-trivial parallel algorithm for checking the Erdős-Gallai characterization, which achieves a speedup of 23 using 32 cores. Comment: 10 pages

arXiv.org e-Print Archive
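
For reference, the Erdős-Gallai characterization mentioned in the abstract can be checked sequentially in a few lines. The sketch below is a plain, unoptimized sequential version written for illustration; it is not the parallel algorithm the paper presents, and the function name `is_graphical` is a choice made here. A sequence d_1 >= ... >= d_n is graphical when its sum is even and, for every k, the sum of the k largest degrees is at most k(k-1) plus the sum of min(d_i, k) over the remaining entries.

```python
def is_graphical(degrees):
    """Return True if a simple graph with this degree sequence exists.

    Sequential Erdős-Gallai test, quadratic in the sequence length
    because both sides of the inequality are recomputed for every k.
    """
    d = sorted(degrees, reverse=True)
    n = len(d)
    if any(x < 0 for x in d) or sum(d) % 2 != 0:
        return False
    for k in range(1, n + 1):
        lhs = sum(d[:k])  # sum of the k largest degrees
        rhs = k * (k - 1) + sum(min(x, k) for x in d[k:])
        if lhs > rhs:
            return False
    return True


if __name__ == "__main__":
    print(is_graphical([3, 3, 3, 3]))  # complete graph on four nodes
    print(is_graphical([3, 3, 1, 1]))  # violates the inequality at k = 2
```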

A Dynamic Programming Approach for Approximate Uniform Generation of Binary Matrices with Specified Margins

Author: Harrison Matthew T.
Publication date: 25/01/2013

Consider the collection of all binary matrices having a specific sequence of row and column sums and consider sampling binary matrices uniformly from this collection. Practical algorithms for exact uniform sampling are not known, but there are practical algorithms for approximate uniform sampling. Here it is shown how dynamic programming and recent asymptotic enumeration results can be used to simplify and improve a certain class of approximate uniform samplers. The dynamic programming perspective suggests interesting generalizations. Comment: 27 pages, minor typographic corrections from previous version, superseded by arXiv:1301.392

arXiv.org e-Print Archive

Expand and Contract: Sampling graphs with given degrees and other combinatorial families

Author: Zhao James Y.
Publication date: 29/08/2013

Sampling from combinatorial families can be difficult. However, complicated families can often be embedded within larger, simpler ones, for which easy sampling algorithms are known. We take advantage of such a relationship to describe a sampling algorithm for the smaller family, via a Markov chain started at a random sample of the larger family. The utility of the method is demonstrated via several examples, with particular emphasis on sampling labelled graphs with given degree sequence, a well-studied problem for which existing algorithms leave much room for improvement. For graphs with given degrees, with maximum degree $O(m^{1/4})$ where $m$ is the number of edges, we obtain an asymptotically uniform sample in $O(m)$ steps, which substantially improves upon existing algorithms.

arXiv.org e-Print Archive

Fast uniform generation of random graphs with given degree sequences

Author: Arman Andrii
Gao Pu
Wormald Nicholas
Publication date: 21/01/2021

In this paper we provide an algorithm that generates a graph with given degree sequence uniformly at random. Provided that $\Delta^4=O(m)$, where $\Delta$ is the maximal degree and $m$ is the number of edges, the algorithm runs in expected time $O(m)$. Our algorithm significantly improves the previously most efficient uniform sampler, which runs in expected time $O(m^2\Delta^2)$ for the same family of degree sequences. Our method uses a novel ingredient which progressively relaxes restrictions on an object being generated uniformly at random, and we use this to give fast algorithms for uniform sampling of graphs with other degree sequences as well. Using the same method, we also obtain algorithms with expected run time which is (i) linear for power-law degree sequences in cases where the previous best was $O(n^{4.081})$, and (ii) $O(nd+d^4)$ for $d$-regular graphs when $d=o(\sqrt n)$, where the previous best was $O(nd^3)$.

arXiv.org e-Print Archive

Efficient importance sampling for binary contingency tables

Author: Blanchet Jose H.
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 01/01/2009

Importance sampling has been reported to produce algorithms with excellent empirical performance in counting problems. However, the theoretical support for its efficiency in these applications has been very limited. In this paper, we propose a methodology that can be used to design efficient importance sampling algorithms for counting and test their efficiency rigorously. We apply our techniques after transforming the problem into a rare-event simulation problem, thereby connecting complexity analysis of counting problems with efficiency in the context of rare-event simulation. As an illustration of our approach, we consider the problem of counting the number of binary tables with fixed column and row sums, $c_j$'s and $r_i$'s, respectively, and total marginal sums $d=\sum_jc_j$. Assuming that $\max_jc_j=o(d^{1/2})$, $\sum c_j^2=O(d)$ and the $r_i$'s are bounded, we show that a suitable importance sampling algorithm, proposed by Chen et al. [J. Amer. Statist. Assoc. 100 (2005) 109--120], requires $O(d^3\varepsilon^{-2}\delta^{-1})$ operations to produce an estimate that has $\varepsilon$-relative error with probability $1-\delta$. In addition, if $\max_jc_j=o(d^{1/4-\delta_0})$ for some $\delta_0>0$, the same coverage can be guaranteed with $O(d^3\varepsilon^{-2}\log(\delta^{-1}))$ operations. Comment: Published in the Annals of Applied Probability (http://www.imstat.org/aap/) by the Institute of Mathematical Statistics (http://www.imstat.org) at http://dx.doi.org/10.1214/08-AAP558

arXiv.org e-Print Archive

CiteSeerX

Characterizing Optimal Sampling of Binary Contingency Tables via the Configuration Model

Author: Blanchet Jose
Stauffer Alexandre
Publication date: 11/10/2011

A binary contingency table is an m x n array of binary entries with prescribed row sums r=(r_1,...,r_m) and column sums c=(c_1,...,c_n). The configuration model for uniformly sampling binary contingency tables proceeds as follows. First, label N=\sum_{i=1}^{m} r_i tokens of type 1, arrange them in m cells, and let the i-th cell contain r_i tokens. Next, label another set of tokens of type 2 containing N=\sum_{j=1}^{n}c_j elements arranged in n cells, and let the j-th cell contain c_j tokens. Finally, pair the type-1 tokens with the type-2 tokens by generating random permutations until the resulting pairing corresponds to a binary contingency table. Generating one random permutation takes O(N) time, which is optimal up to constant factors. A fundamental question is whether a constant number of permutations is sufficient to obtain a binary contingency table. In the current paper, we solve this problem by giving a necessary and sufficient condition for the probability that the configuration model outputs a binary contingency table to remain bounded away from 0 as N goes to \infty. Our finding shows surprising differences from recent results for binary symmetric contingency tables.

arXiv.org e-Print Archive
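
The configuration model described in the abstract above translates almost directly into code. The sketch below follows those steps under a few assumptions made for illustration: the function name `sample_binary_table`, the `max_tries` cap, and returning `None` on failure are all hypothetical choices, not part of the paper. Each attempt shuffles the column tokens, pairs them with the row tokens in order, and is rejected as soon as some (row, column) cell would receive two tokens.

```python
import random


def sample_binary_table(row_sums, col_sums, max_tries=10000, seed=None):
    """Rejection sampler for binary tables via the configuration model.

    Returns an m x n list of 0/1 rows with the requested margins, or None
    if no attempt succeeds within max_tries (a cap added for safety).
    """
    if sum(row_sums) != sum(col_sums):
        raise ValueError("row sums and column sums must have the same total")
    rng = random.Random(seed)
    m, n = len(row_sums), len(col_sums)
    # Type-1 tokens: r_i copies of row index i; type-2 tokens: c_j copies of j.
    row_tokens = [i for i, r in enumerate(row_sums) for _ in range(r)]
    col_tokens = [j for j, c in enumerate(col_sums) for _ in range(c)]
    for _ in range(max_tries):
        rng.shuffle(col_tokens)  # one random permutation, O(N) time
        table = [[0] * n for _ in range(m)]
        valid = True
        for i, j in zip(row_tokens, col_tokens):
            if table[i][j]:  # cell already used: the pairing is not binary
                valid = False
                break
            table[i][j] = 1
        if valid:
            return table
    return None


if __name__ == "__main__":
    print(sample_binary_table([2, 1, 1], [1, 2, 1], seed=0))
```

Every binary table with the given margins arises from the same number of pairings, so an accepted attempt is uniform over such tables. How many attempts are needed before one is accepted is exactly the question the abstract addresses.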