9,299 research outputs found

    Generating realistic scaled complex networks

    Get PDF
    Research on generative models is a central project in the emerging field of network science, and it studies how statistical patterns found in real networks could be generated by formal rules. Output from these generative models is then the basis for designing and evaluating computational methods on networks, and for verification and simulation studies. During the last two decades, a variety of models has been proposed with an ultimate goal of achieving comprehensive realism for the generated networks. In this study, we (a) introduce a new generator, termed ReCoN; (b) explore how ReCoN and some existing models can be fitted to an original network to produce a structurally similar replica, (c) use ReCoN to produce networks much larger than the original exemplar, and finally (d) discuss open problems and promising research directions. In a comparative experimental study, we find that ReCoN is often superior to many other state-of-the-art network generation methods. We argue that ReCoN is a scalable and effective tool for modeling a given network while preserving important properties at both micro- and macroscopic scales, and for scaling the exemplar data by orders of magnitude in size.Comment: 26 pages, 13 figures, extended version, a preliminary version of the paper was presented at the 5th International Workshop on Complex Networks and their Application

    BDGS: A Scalable Big Data Generator Suite in Big Data Benchmarking

    Full text link
    Data generation is a key issue in big data benchmarking that aims to generate application-specific data sets to meet the 4V requirements of big data. Specifically, big data generators need to generate scalable data (Volume) of different types (Variety) under controllable generation rates (Velocity) while keeping the important characteristics of raw data (Veracity). This gives rise to various new challenges about how we design generators efficiently and successfully. To date, most existing techniques can only generate limited types of data and support specific big data systems such as Hadoop. Hence we develop a tool, called Big Data Generator Suite (BDGS), to efficiently generate scalable big data while employing data models derived from real data to preserve data veracity. The effectiveness of BDGS is demonstrated by developing six data generators covering three representative data types (structured, semi-structured and unstructured) and three data sources (text, graph, and table data)

    2.5K-Graphs: from Sampling to Generation

    Get PDF
    Understanding network structure and having access to realistic graphs plays a central role in computer and social networks research. In this paper, we propose a complete, and practical methodology for generating graphs that resemble a real graph of interest. The metrics of the original topology we target to match are the joint degree distribution (JDD) and the degree-dependent average clustering coefficient (cˉ(k)\bar{c}(k)). We start by developing efficient estimators for these two metrics based on a node sample collected via either independence sampling or random walks. Then, we process the output of the estimators to ensure that the target properties are realizable. Finally, we propose an efficient algorithm for generating topologies that have the exact target JDD and a cˉ(k)\bar{c}(k) close to the target. Extensive simulations using real-life graphs show that the graphs generated by our methodology are similar to the original graph with respect to, not only the two target metrics, but also a wide range of other topological metrics; furthermore, our generator is order of magnitudes faster than state-of-the-art techniques

    Spectral Graph Forge: Graph Generation Targeting Modularity

    Full text link
    Community structure is an important property that captures inhomogeneities common in large networks, and modularity is one of the most widely used metrics for such community structure. In this paper, we introduce a principled methodology, the Spectral Graph Forge, for generating random graphs that preserves community structure from a real network of interest, in terms of modularity. Our approach leverages the fact that the spectral structure of matrix representations of a graph encodes global information about community structure. The Spectral Graph Forge uses a low-rank approximation of the modularity matrix to generate synthetic graphs that match a target modularity within user-selectable degree of accuracy, while allowing other aspects of structure to vary. We show that the Spectral Graph Forge outperforms state-of-the-art techniques in terms of accuracy in targeting the modularity and randomness of the realizations, while also preserving other local structural properties and node attributes. We discuss extensions of the Spectral Graph Forge to target other properties beyond modularity, and its applications to anonymization

    Kronecker Graphs: An Approach to Modeling Networks

    Full text link
    How can we model networks with a mathematically tractable model that allows for rigorous analysis of network properties? Networks exhibit a long list of surprising properties: heavy tails for the degree distribution; small diameters; and densification and shrinking diameters over time. Most present network models either fail to match several of the above properties, are complicated to analyze mathematically, or both. In this paper we propose a generative model for networks that is both mathematically tractable and can generate networks that have the above mentioned properties. Our main idea is to use the Kronecker product to generate graphs that we refer to as "Kronecker graphs". First, we prove that Kronecker graphs naturally obey common network properties. We also provide empirical evidence showing that Kronecker graphs can effectively model the structure of real networks. We then present KronFit, a fast and scalable algorithm for fitting the Kronecker graph generation model to large real networks. A naive approach to fitting would take super- exponential time. In contrast, KronFit takes linear time, by exploiting the structure of Kronecker matrix multiplication and by using statistical simulation techniques. Experiments on large real and synthetic networks show that KronFit finds accurate parameters that indeed very well mimic the properties of target networks. Once fitted, the model parameters can be used to gain insights about the network structure, and the resulting synthetic graphs can be used for null- models, anonymization, extrapolations, and graph summarization
    • …
    corecore