46 research outputs found

    Combinatorial Miller-Hagberg Algorithm for Randomization of Dense Networks

    We propose a slightly revised Miller-Hagberg (MH) algorithm that efficiently generates a random network from a given expected degree sequence. The revision replaces the approximated edge probability between a pair of nodes with a combinatorially calculated edge probability that better captures the likelihood of edge presence, especially where edges are dense. The computational complexity of this combinatorial MH algorithm remains of the same order as that of the original. We evaluated the proposed algorithm through several numerical experiments. The results demonstrated that the proposed algorithm was particularly good at accurately representing high-degree nodes in dense, heterogeneous networks. This algorithm may be a useful alternative to other, more established network randomization methods, given that data are becoming increasingly large and dense in today's network science research. Comment: 8 pages, 3 figures; to appear in the Proceedings of CompleNet 2018, in pres
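
    As a rough illustration of the approximated edge probability that the abstract refers to, the sketch below assumes the usual expected-degree (Chung-Lu) form, w_i * w_j / S capped at 1, where S is the sum of all expected degrees. It is a naive O(n^2) sampler, not the MH skipping algorithm and not the combinatorial probability proposed in the paper; the function names are hypothetical.

```python
import random


def chung_lu_edge_probability(w_i, w_j, total_weight):
    """Approximate edge probability w_i * w_j / S, capped at 1 (assumed form)."""
    return min(1.0, w_i * w_j / total_weight)


def naive_expected_degree_graph(weights, seed=None):
    """Sample an undirected simple graph by testing every node pair once.

    This is a quadratic-time baseline for illustration only; it does not
    reproduce the efficient skipping strategy of the MH algorithm.
    """
    rng = random.Random(seed)
    total = sum(weights)
    if total == 0:
        return []  # no expected edges at all
    edges = []
    n = len(weights)
    for i in range(n):
        for j in range(i + 1, n):
            p = chung_lu_edge_probability(weights[i], weights[j], total)
            if rng.random() < p:
                edges.append((i, j))
    return edges
```

    Whenever w_i * w_j exceeds S the cap at 1 takes effect, which may be one place where the approximation loses accuracy for high-degree nodes in dense networks.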

    On the impossibility of constructing good population mean estimators in a realistic Respondent Driven Sampling model

    Current methods for population mean estimation from data collected by Respondent Driven Sampling (RDS) are based on the Horvitz-Thompson estimator together with a set of assumptions on the sampling model under which the inclusion probabilities can be determined from the information contained in the data. In this paper, we argue that such a set of assumptions is too simplistic to be realistic and that, under realistic sampling models, the situation is far more complicated. Specifically, we study a realistic RDS sampling model that is motivated by a real-world RDS dataset. We show that, for this model, the inclusion probabilities, which are necessary for the application of the Horvitz-Thompson estimator, cannot be determined from the information in the sample alone. An implication is that, unless additional information about the underlying population network is obtained, it is hopeless to conceive of a general theory of population mean estimation from current RDS data. Comment: 13 pages, 2 figure
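
    To make the role of the inclusion probabilities concrete, here is a minimal sketch of the Horvitz-Thompson mean estimator and its ratio (Hajek-type) variant. The function names are hypothetical, and the inclusion probabilities are simply passed in as known, which is exactly the assumption the abstract argues may fail for realistic RDS models.

```python
def horvitz_thompson_mean(values, inclusion_probs, population_size):
    """Estimate the population mean as (1/N) * sum(y_i / pi_i)."""
    if len(values) != len(inclusion_probs):
        raise ValueError("values and inclusion_probs must have equal length")
    if any(p <= 0 or p > 1 for p in inclusion_probs):
        raise ValueError("inclusion probabilities must lie in (0, 1]")
    weighted_total = sum(y / p for y, p in zip(values, inclusion_probs))
    return weighted_total / population_size


def ratio_mean(values, inclusion_probs):
    """Ratio variant: replaces the population size N by sum(1 / pi_i).

    Only the probabilities' relative sizes matter here, so N need not be known.
    """
    if len(values) != len(inclusion_probs):
        raise ValueError("values and inclusion_probs must have equal length")
    if any(p <= 0 for p in inclusion_probs):
        raise ValueError("inclusion probabilities must be positive")
    numerator = sum(y / p for y, p in zip(values, inclusion_probs))
    denominator = sum(1 / p for p in inclusion_probs)
    return numerator / denominator
```

    Both estimators divide each observation by its inclusion probability, so neither can be computed if those probabilities cannot be recovered from the sample.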

    Information content of coevolutionary game landscapes

    Coevolutionary game dynamics results from players that may change both their strategies and their network of interaction. For such games, by interpreting strategies as configurations, strategy-to-payoff maps can be defined for every interaction network, which makes it possible to derive game landscapes. This paper presents an analysis of these game landscapes by their information content. Through this analysis, we study in particular the effect of a rescaled payoff matrix generalizing social dilemmas, and the differences between well-mixed and structured populations.

    Negative Examples for Sequential Importance Sampling of Binary Contingency Tables

    The sequential importance sampling (SIS) algorithm has gained considerable popularity for its empirical success. One of its noted applications is to the binary contingency tables problem, an important problem in statistics, where the goal is to estimate the number of 0/1 matrices with prescribed row and column sums. We give a family of examples in which the SIS procedure, if run for any subexponential number of trials, will underestimate the number of tables by an exponential factor. This result holds for any of the usual design choices in the SIS algorithm, namely the ordering of the columns and rows. These are apparently the first theoretical results on the efficiency of the SIS algorithm for binary contingency tables. Finally, we present experimental evidence that the SIS algorithm is efficient for row and column sums that are regular. Our work is a first step in determining the class of inputs for which SIS is effective.

    A Parallel Algorithm for Generating a Random Graph with a Prescribed Degree Sequence

    Random graphs (or networks) have attracted significantly increased interest owing to their popularity in modeling and simulating many complex real-world systems. The degree sequence is one of the most important aspects of these systems. Random graphs with a given degree sequence can capture many characteristics, such as dependent edges and non-binomial degree distributions, that are absent in many classical random graph models such as the Erdős-Rényi graph model. In addition, they have important applications in the uniform sampling of random graphs, counting the number of graphs having the same degree sequence, as well as in string theory, random matrix theory, and matching theory. In this paper, we present an OpenMP-based shared-memory parallel algorithm for generating a random graph with a prescribed degree sequence, which achieves a speedup of 20.5 with 32 cores. One of the steps in our parallel algorithm requires checking the Erdős-Gallai characterization, i.e., whether there exists a graph obeying the given degree sequence, in parallel. This paper presents the first non-trivial parallel algorithm for checking the Erdős-Gallai characterization, which achieves a speedup of 23 using 32 cores. Comment: 10 page
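
    For reference, the Erdős-Gallai characterization mentioned above can be checked sequentially as in the sketch below. This is a plain quadratic-time version written for clarity, not the parallel algorithm from the paper; the function name is hypothetical.

```python
def is_graphical(degrees):
    """Return True if some simple undirected graph has this degree sequence.

    Erdős-Gallai: with degrees sorted in non-increasing order d_1 >= ... >= d_n,
    the sum must be even and, for every k in 1..n,
        d_1 + ... + d_k <= k * (k - 1) + sum(min(d_i, k) for i > k).
    """
    d = sorted(degrees, reverse=True)
    if any(x < 0 for x in d) or sum(d) % 2 != 0:
        return False
    prefix = 0
    for k in range(1, len(d) + 1):
        prefix += d[k - 1]
        # Recomputing the tail sum each time keeps the code short (O(n^2) overall).
        tail = sum(min(x, k) for x in d[k:])
        if prefix > k * (k - 1) + tail:
            return False
    return True
```

    For example, `is_graphical([3, 3, 3, 3])` holds because the complete graph on four nodes realizes it, while `[3, 3, 1, 1]` fails the inequality at k = 2.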

    A Dynamic Programming Approach for Approximate Uniform Generation of Binary Matrices with Specified Margins

    Consider the collection of all binary matrices having a specific sequence of row and column sums and consider sampling binary matrices uniformly from this collection. Practical algorithms for exact uniform sampling are not known, but there are practical algorithms for approximate uniform sampling. Here it is shown how dynamic programming and recent asymptotic enumeration results can be used to simplify and improve a certain class of approximate uniform samplers. The dynamic programming perspective suggests interesting generalizations. Comment: 27 pages, minor typographic corrections from previous version, superseded by arXiv:1301.392

    Expand and Contract: Sampling graphs with given degrees and other combinatorial families

    Sampling from combinatorial families can be difficult. However, complicated families can often be embedded within larger, simpler ones, for which easy sampling algorithms are known. We take advantage of such a relationship to describe a sampling algorithm for the smaller family, via a Markov chain started at a random sample of the larger family. The utility of the method is demonstrated via several examples, with particular emphasis on sampling labelled graphs with given degree sequence, a well-studied problem for which existing algorithms leave much room for improvement. For graphs with given degrees, with maximum degree O(m^{1/4}), where m is the number of edges, we obtain an asymptotically uniform sample in O(m) steps, which substantially improves upon existing algorithms.

    Fast uniform generation of random graphs with given degree sequences

    In this paper we provide an algorithm that generates a graph with given degree sequence uniformly at random. Provided that \Delta^4=O(m), where \Delta is the maximal degree and m is the number of edges, the algorithm runs in expected time O(m). Our algorithm significantly improves on the previously most efficient uniform sampler, which runs in expected time O(m^2\Delta^2) for the same family of degree sequences. Our method uses a novel ingredient which progressively relaxes restrictions on an object being generated uniformly at random, and we use this to give fast algorithms for uniform sampling of graphs with other degree sequences as well. Using the same method, we also obtain algorithms with expected run time which is (i) linear for power-law degree sequences in cases where the previous best was O(n^{4.081}), and (ii) O(nd+d^4) for d-regular graphs when d=o(\sqrt n), where the previous best was O(nd^3).

    Efficient importance sampling for binary contingency tables

    Importance sampling has been reported to produce algorithms with excellent empirical performance in counting problems. However, the theoretical support for its efficiency in these applications has been very limited. In this paper, we propose a methodology that can be used to design efficient importance sampling algorithms for counting and test their efficiency rigorously. We apply our techniques after transforming the problem into a rare-event simulation problem--thereby connecting complexity analysis of counting problems with efficiency in the context of rare-event simulation. As an illustration of our approach, we consider the problem of counting the number of binary tables with fixed column and row sums, c_j's and r_i's, respectively, and total marginal sums d=\sum_j c_j. Assuming that \max_j c_j=o(d^{1/2}), \sum c_j^2=O(d) and the r_j's are bounded, we show that a suitable importance sampling algorithm, proposed by Chen et al. [J. Amer. Statist. Assoc. 100 (2005) 109--120], requires O(d^3\varepsilon^{-2}\delta^{-1}) operations to produce an estimate that has \varepsilon-relative error with probability 1-\delta. In addition, if \max_j c_j=o(d^{1/4-\delta_0}) for some \delta_0>0, the same coverage can be guaranteed with O(d^3\varepsilon^{-2}\log(\delta^{-1})) operations. Comment: Published in at http://dx.doi.org/10.1214/08-AAP558 the Annals of Applied Probability (http://www.imstat.org/aap/) by the Institute of Mathematical Statistics (http://www.imstat.org

    Characterizing Optimal Sampling of Binary Contingency Tables via the Configuration Model

    A binary contingency table is an m x n array of binary entries with prescribed row sums r=(r_1,...,r_m) and column sums c=(c_1,...,c_n). The configuration model for uniformly sampling binary contingency tables proceeds as follows. First, label N=\sum_{i=1}^{m} r_i tokens of type 1, arrange them in m cells, and let the i-th cell contain r_i tokens. Next, label another set of tokens of type 2 containing N=\sum_{j=1}^{n} c_j elements arranged in n cells, and let the j-th cell contain c_j tokens. Finally, pair the type-1 tokens with the type-2 tokens by generating random permutations until the resulting pairing corresponds to a binary contingency table. Generating one random permutation takes O(N) time, which is optimal up to constant factors. A fundamental question is whether a constant number of permutations is sufficient to obtain a binary contingency table. In the current paper, we solve this problem by giving a necessary and sufficient condition for the probability that the configuration model outputs a binary contingency table to remain bounded away from 0 as N goes to \infty. Our finding shows surprising differences from recent results for binary symmetric contingency tables.
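
    The token-pairing procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name, the `max_tries` cutoff, and the `seed` parameter are hypothetical, and a pairing is accepted only when no (row, column) cell receives more than one pair.

```python
import random


def sample_binary_table(row_sums, col_sums, max_tries=1000, seed=None):
    """Rejection-sample a 0/1 table with the given margins via random pairings.

    Returns the table as a list of rows, or None if no valid pairing is
    found within max_tries permutations.
    """
    if sum(row_sums) != sum(col_sums):
        raise ValueError("row and column sums must have the same total N")
    rng = random.Random(seed)
    # Type-1 tokens: row index i repeated r_i times; type-2 tokens likewise.
    row_tokens = [i for i, r in enumerate(row_sums) for _ in range(r)]
    col_tokens = [j for j, c in enumerate(col_sums) for _ in range(c)]
    for _ in range(max_tries):
        rng.shuffle(col_tokens)  # one random permutation, O(N) time
        pairs = list(zip(row_tokens, col_tokens))
        if len(set(pairs)) == len(pairs):  # no cell used twice: entries are 0/1
            table = [[0] * len(col_sums) for _ in row_sums]
            for i, j in pairs:
                table[i][j] = 1
            return table
    return None
```

    The expected number of permutations needed is the reciprocal of the acceptance probability, which is the quantity the abstract asks to keep bounded away from 0.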