4 research outputs found

    Using Bayesian Network Representations for Effective Sampling from Generative Network Models

    Bayesian networks (BNs) are used for inference and sampling by exploiting conditional independence among random variables. Context-specific independence (CSI) is a property of graphical models where additional independence relations arise in the context of particular values of random variables (RVs). Identifying and exploiting CSI properties can simplify inference. Some generative network models (models that generate social/information network samples from a network distribution P(G)), with complex interactions among a set of RVs, can be represented with probabilistic graphical models, in particular with BNs. In the present work we show one such case. We discuss how a mixed Kronecker Product Graph Model can be represented as a BN, and study its BN properties that can be used for efficient sampling. Specifically, we show that instead of exhibiting CSI properties, the model has deterministic context-specific dependence (DCSD). Exploiting this property focuses the sampling method on a subset of the sampling space, which improves efficiency.

    The HyperKron Graph Model for higher-order features

    Graph models have long been used in lieu of real data, which can be expensive and hard to come by. A common class of models constructs a matrix of probabilities, and samples an adjacency matrix by flipping a weighted coin for each entry. Examples include the Erdős-Rényi model, Chung-Lu model, and the Kronecker model. Here we present the HyperKron Graph model: an extension of the Kronecker Model, but with a distribution over hyperedges. We prove that we can efficiently generate graphs from this model in time proportional to the number of edges times a small log-factor, and find that in practice the runtime is linear with respect to the number of edges. We illustrate a number of useful features of the HyperKron model including non-trivial clustering and highly skewed degree distributions. Finally, we fit the HyperKron model to real-world networks, and demonstrate the model's flexibility with a complex application of the HyperKron model to networks with coherent feed-forward loops. Comment: 17 pages, 9 figures
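
The coin-flipping procedure described in this abstract can be sketched for the plain Kronecker model (not HyperKron itself). This is a minimal illustration under stated assumptions: the function names `kron_prob` and `coin_flip_kronecker` and the initiator matrix `K` are hypothetical, and the probability of entry (i, j) is taken to be the product of initiator entries selected by the base-b digits of i and j.

```python
import random


def kron_prob(K, k, i, j):
    """Entry (i, j) of the k-fold Kronecker power of the b-by-b matrix K.

    The entry is the product of K[di][dj] over the paired base-b digits
    of i and j, so the full matrix never has to be stored.
    """
    b = len(K)
    prob = 1.0
    for _ in range(k):
        i, di = divmod(i, b)
        j, dj = divmod(j, b)
        prob *= K[di][dj]
    return prob


def coin_flip_kronecker(K, k, rng=None):
    """Flip one weighted coin per entry; costs O(n^2) flips for n = b**k."""
    rng = rng or random.Random()
    n = len(K) ** k
    return [
        (i, j)
        for i in range(n)
        for j in range(n)
        if rng.random() < kron_prob(K, k, i, j)
    ]


if __name__ == "__main__":
    K = [[0.9, 0.5], [0.5, 0.1]]  # hypothetical 2-by-2 initiator matrix
    edges = coin_flip_kronecker(K, 3, random.Random(0))
    print(len(edges), "nonzero entries in an 8-by-8 adjacency matrix")
```

The quadratic number of coin flips is what makes this approach slow for large sparse graphs.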

    Coin-flipping, ball-dropping, and grass-hopping for generating random graphs from matrices of edge probabilities

    Common models for random graphs, such as Erdős-Rényi and Kronecker graphs, correspond to generating random adjacency matrices where each entry is non-zero based on a large matrix of probabilities. Generating an instance of a random graph based on these models is easy, although inefficient, by flipping biased coins (i.e. sampling binomial random variables) for each possible edge. This process is inefficient because most large graph models correspond to sparse graphs where the vast majority of coin flips will result in no edges. We describe some not-entirely-well-known, but not-entirely-unknown, techniques that will enable us to sample a graph by finding only the coin flips that will produce edges. Our analogies for these procedures are ball-dropping, which is easier to implement, but may need extra work due to duplicate edges, and grass-hopping, which results in no duplicated work or extra edges. Grass-hopping does this using geometric random variables. In order to use this idea on complex probability matrices such as those in Kronecker graphs, we decompose the problem into three steps, each of which is an independently useful computational primitive: (i) enumerating non-decreasing sequences, (ii) unranking multiset permutations, and (iii) decoding and encoding z-curve and Morton codes and permutations. The third step is the result of a new connection between repeated Kronecker product operations and Morton codes. Throughout, we draw connections to ideas underlying applied math and computer science including coupon collector problems. Comment: 43 pages, 16 problems
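
The grass-hopping idea mentioned above can be sketched for the simplest case, where every entry of an n-by-n matrix has the same probability p. This is an illustrative sketch and not the paper's code: the name `grass_hop_er` is hypothetical, and the geometric skip is drawn by inverse-transform sampling, which is one standard way to do it.

```python
import math
import random


def grass_hop_er(n, p, rng=None):
    """Sample the nonzero entries of an n-by-n 0/1 matrix with i.i.d. entries.

    Instead of flipping n*n coins, jump directly from one success to the
    next: the number of failures between successes is geometric.
    """
    rng = rng or random.Random()
    total = n * n
    if p <= 0.0:
        return []
    if p >= 1.0:
        return [divmod(idx, n) for idx in range(total)]
    log_q = math.log1p(-p)  # log(1 - p), negative
    edges = []
    idx = -1  # linear index of the last success (none yet)
    while True:
        u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
        skip = int(math.log(u) / log_q)  # failures before the next success
        idx += skip + 1
        if idx >= total:
            break
        edges.append(divmod(idx, n))  # linear index -> (row, column)
    return edges


if __name__ == "__main__":
    edges = grass_hop_er(1000, 0.001, random.Random(0))
    print(len(edges), "edges sampled without visiting every entry")
```

The loop runs once per sampled edge plus one final overshoot, so the work scales with the number of edges rather than with n squared. Extending this to Kronecker probability matrices needs the additional steps the abstract lists, which this sketch does not attempt.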

    Recent Advances in Scalable Network Generation

    Random graph models are frequently used as a controllable and versatile data source for experimental campaigns in various research fields. Generating such data-sets at scale is a non-trivial task as it requires design decisions typically spanning multiple areas of expertise. Challenges begin with the identification of relevant domain-specific network features, continue with the question of how to compile such features into a tractable model, and culminate in algorithmic details arising while implementing the pertaining model. In the present survey, we explore crucial aspects of random graph models with known scalable generators. We begin by briefly introducing network features considered by such models, and then discuss random graphs alongside generation algorithms. Our focus lies on modelling techniques and algorithmic primitives that have proven successful in obtaining massive graphs. We consider concepts and graph models for various domains (such as social networks, infrastructure, ecology, and numerical simulations), and discuss generators for different models of computation (including shared-memory parallelism, massive-parallel GPUs, and distributed systems).