Kronecker Graphs: An Approach to Modeling Networks
How can we model networks with a mathematically tractable model that allows
for rigorous analysis of network properties? Networks exhibit a long list of
surprising properties: heavy tails for the degree distribution; small
diameters; and densification and shrinking diameters over time. Most present
network models either fail to match several of the above properties, are
complicated to analyze mathematically, or both. In this paper we propose a
generative model for networks that is both mathematically tractable and can
generate networks that have the above-mentioned properties. Our main idea is to
use the Kronecker product to generate graphs that we refer to as "Kronecker
graphs".
First, we prove that Kronecker graphs naturally obey common network
properties. We also provide empirical evidence showing that Kronecker graphs
can effectively model the structure of real networks.
We then present KronFit, a fast and scalable algorithm for fitting the
Kronecker graph generation model to large real networks. A naive approach to
fitting would take super-exponential time. In contrast, KronFit takes linear
time, by exploiting the structure of Kronecker matrix multiplication and by
using statistical simulation techniques.
Experiments on large real and synthetic networks show that KronFit finds
accurate parameters that closely mimic the properties of target
networks. Once fitted, the model parameters can be used to gain insights about
the network structure, and the resulting synthetic graphs can be used for
null-models, anonymization, extrapolations, and graph summarization.
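As a rough illustration of the Kronecker-product idea described in this abstract, the following sketch raises a small initiator matrix to a Kronecker power and samples one edge per cell. The function names and the 2x2 initiator values are assumptions chosen for illustration; this is not the paper's KronFit code, and the naive quadratic sampling shown here is only practical for small graphs.

```python
import random


def kronecker_product(a, b):
    """Kronecker product of two matrices given as lists of lists."""
    rows_b, cols_b = len(b), len(b[0])
    return [
        [a[i // rows_b][j // cols_b] * b[i % rows_b][j % cols_b]
         for j in range(len(a[0]) * cols_b)]
        for i in range(len(a) * rows_b)
    ]


def kronecker_power(initiator, k):
    """k-th Kronecker power of the initiator matrix (k >= 1)."""
    result = initiator
    for _ in range(k - 1):
        result = kronecker_product(result, initiator)
    return result


def sample_graph(prob, seed=0):
    """Treat each cell as an independent edge probability and sample edges."""
    rng = random.Random(seed)
    n = len(prob)
    return [(u, v) for u in range(n) for v in range(n)
            if rng.random() < prob[u][v]]


# Hypothetical initiator: the values are illustrative, not fitted parameters.
initiator = [[0.9, 0.5], [0.5, 0.1]]
prob = kronecker_power(initiator, 3)   # 8 x 8 matrix of edge probabilities
edges = sample_graph(prob, seed=42)    # directed edge list on 8 nodes
```

With a 0/1 initiator the same functions produce a deterministic Kronecker graph; with entries in [0, 1] each power yields a matrix of edge probabilities.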
Modeling Social Networks with Node Attributes using the Multiplicative Attribute Graph Model
Networks arising from social, technological and natural domains exhibit rich
connectivity patterns and nodes in such networks are often labeled with
attributes or features. We address the question of modeling the structure of
networks where nodes have attribute information. We present a Multiplicative
Attribute Graph (MAG) model that considers nodes with categorical attributes
and models the probability of an edge as the product of individual attribute
link formation affinities. We develop a scalable variational expectation
maximization parameter estimation method. Experiments show that the MAG model
reliably captures network connectivity and provides insights into how
different attributes shape the network structure.
Comment: 15 pages, 7 figures, 7 tables
BDGS: A Scalable Big Data Generator Suite in Big Data Benchmarking
Data generation is a key issue in big data benchmarking that aims to generate
application-specific data sets to meet the 4V requirements of big data.
Specifically, big data generators need to generate scalable data (Volume) of
different types (Variety) under controllable generation rates (Velocity) while
keeping the important characteristics of raw data (Veracity). This gives rise
to various new challenges about how we design generators efficiently and
successfully. To date, most existing techniques can only generate limited types
of data and support specific big data systems such as Hadoop. Hence we develop
a tool, called Big Data Generator Suite (BDGS), to efficiently generate
scalable big data while employing data models derived from real data to
preserve data veracity. The effectiveness of BDGS is demonstrated by developing
six data generators covering three representative data types (structured,
semi-structured and unstructured) and three data sources (text, graph, and
table data).
A Scalable Null Model for Directed Graphs Matching All Degree Distributions: In, Out, and Reciprocal
Degree distributions are arguably the most important property of real world
networks. The classic edge configuration model or Chung-Lu model can generate
an undirected graph with any desired degree distribution. This serves as a good
null model to compare algorithms or perform experimental studies. Furthermore,
there are scalable algorithms that implement these models and they are
invaluable in the study of graphs. However, networks in the real world are
often directed, and have a significant proportion of reciprocal edges. A
stronger relation exists between two nodes when they each point to one another
(reciprocal edge) as compared to when only one points to the other (one-way
edge). Despite their importance, reciprocal edges have been disregarded by most
directed graph models.
We propose a null model for directed graphs inspired by the Chung-Lu model
that matches the in-, out-, and reciprocal-degree distributions of the real
graphs. Our algorithm is scalable and requires $O(m)$ random numbers to
generate a graph with $m$ edges. We perform a series of experiments on real
datasets and compare with existing graph models.
Comment: Camera ready version for IEEE Workshop on Network Science; fixed some
typos in tabl
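For context, the classic undirected Chung-Lu model that this abstract builds on can be sketched as follows. This is the textbook undirected version with edge probability min(1, d_u d_v / sum of degrees), not the directed, reciprocal-aware null model the paper proposes; the function names and the degree sequence are assumptions for illustration, and the quadratic loop is not the scalable algorithm the abstract refers to.

```python
import random


def chung_lu_probabilities(degrees):
    """Matrix of Chung-Lu edge probabilities min(1, d_u * d_v / sum(d))."""
    total = sum(degrees)
    return [[min(1.0, du * dv / total) for dv in degrees] for du in degrees]


def sample_chung_lu(degrees, seed=0):
    """Sample an undirected simple graph; self-loops are skipped, so the
    expected degrees come out slightly below the targets."""
    rng = random.Random(seed)
    prob = chung_lu_probabilities(degrees)
    n = len(degrees)
    return [(u, v) for u in range(n) for v in range(u + 1, n)
            if rng.random() < prob[u][v]]


# Hypothetical target degree sequence.
degrees = [2, 2, 1, 1, 1, 1]
edges = sample_chung_lu(degrees, seed=7)
```

When no probability is clipped at 1, each row of the probability matrix sums to the corresponding target degree, which is why the model matches degrees in expectation.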
Generating realistic scaled complex networks
Research on generative models is a central project in the emerging field of
network science, and it studies how statistical patterns found in real networks
could be generated by formal rules. Output from these generative models is then
the basis for designing and evaluating computational methods on networks, and
for verification and simulation studies. During the last two decades, a variety
of models has been proposed with an ultimate goal of achieving comprehensive
realism for the generated networks. In this study, we (a) introduce a new
generator, termed ReCoN; (b) explore how ReCoN and some existing models can be
fitted to an original network to produce a structurally similar replica, (c)
use ReCoN to produce networks much larger than the original exemplar, and
finally (d) discuss open problems and promising research directions. In a
comparative experimental study, we find that ReCoN is often superior to many
other state-of-the-art network generation methods. We argue that ReCoN is a
scalable and effective tool for modeling a given network while preserving
important properties at both micro- and macroscopic scales, and for scaling the
exemplar data by orders of magnitude in size.
Comment: 26 pages, 13 figures, extended version, a preliminary version of the
paper was presented at the 5th International Workshop on Complex Networks and
their Applications
Graphs, Matrices, and the GraphBLAS: Seven Good Reasons
The analysis of graphs has become increasingly important to a wide range of
applications. Graph analysis presents a number of unique challenges in the
areas of (1) software complexity, (2) data complexity, (3) security, (4)
mathematical complexity, (5) theoretical analysis, (6) serial performance, and
(7) parallel performance. Implementing graph algorithms using matrix-based
approaches provides a number of promising solutions to these challenges. The
GraphBLAS standard (istc-bigdata.org/GraphBlas) is being developed to bring
the potential of matrix-based graph algorithms to the broadest possible
audience. The GraphBLAS mathematically defines a core set of matrix-based graph
operations that can be used to implement a wide class of graph algorithms in a
wide range of programming environments. This paper provides an introduction to
the GraphBLAS and describes how the GraphBLAS can be used to address many of
the challenges associated with analysis of graphs.
Comment: 10 pages; International Conference on Computational Science workshop
on the Applications of Matrix Computational Methods in the Analysis of Modern
Data
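To make the matrix-based view of graph algorithms concrete, here is a small sketch of breadth-first search written as repeated matrix-vector products over a Boolean (OR, AND) semiring. It uses plain Python lists and does not call any GraphBLAS library; the function name and the example graph are assumptions for illustration only.

```python
def bfs_levels(adj, source):
    """BFS levels from `source`; adj[u][v] is truthy when edge u -> v exists.

    Each step computes next = (A^T x) over the (OR, AND) semiring and masks
    out vertices that already have a level. Unreachable vertices keep -1.
    """
    n = len(adj)
    levels = [-1] * n
    frontier = [False] * n
    frontier[source] = True
    depth = 0
    while any(frontier):
        # Record the current depth for every vertex in the frontier.
        for v in range(n):
            if frontier[v]:
                levels[v] = depth
        # Boolean matrix-vector product, masked by the unvisited set.
        frontier = [
            levels[v] == -1
            and any(frontier[u] and adj[u][v] for u in range(n))
            for v in range(n)
        ]
        depth += 1
    return levels


# Hypothetical example: a directed path 0 -> 1 -> 2 plus an isolated vertex 3.
adj = [
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
levels = bfs_levels(adj, 0)
```

Swapping the semiring (for example, to min-plus) turns the same loop structure into other graph computations, which is the kind of reuse the matrix formulation is meant to enable.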
Multiplicative Attribute Graph Model of Real-World Networks
Large scale real-world network data such as social and information networks
are ubiquitous. The study of such social and information networks seeks to find
patterns and explain their emergence through tractable models. In most
networks, and especially in social networks, nodes have a rich set of
attributes (e.g., age, gender) associated with them.
Here we present a model that we refer to as the Multiplicative Attribute
Graphs (MAG), which naturally captures the interactions between the network
structure and the node attributes. We consider a model where each node has a
vector of categorical latent attributes associated with it. The probability of
an edge between a pair of nodes then depends on the product of individual
attribute-attribute affinities. The model lends itself to mathematical
analysis and we derive thresholds for the connectivity and the emergence of the
giant connected component, and show that the model gives rise to networks with
a constant diameter. We analyze the degree distribution to show that the MAG model
can produce networks with either log-normal or power-law degree distributions
depending on certain conditions.
Comment: 33 pages, 6 figures
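The edge rule described in this abstract (edge probability as a product of per-attribute affinities) can be sketched as below. The function names, the binary attributes, and the affinity values are assumptions for illustration; parameter fitting and the analysis in the paper are out of scope here.

```python
import random


def mag_edge_probability(attrs_u, attrs_v, affinities):
    """Product over attributes i of affinities[i][attrs_u[i]][attrs_v[i]]."""
    prob = 1.0
    for theta, a_u, a_v in zip(affinities, attrs_u, attrs_v):
        prob *= theta[a_u][a_v]
    return prob


def sample_mag_graph(node_attrs, affinities, seed=0):
    """Sample each directed edge (u, v), u != v, independently."""
    rng = random.Random(seed)
    n = len(node_attrs)
    return [
        (u, v)
        for u in range(n)
        for v in range(n)
        if u != v
        and rng.random() < mag_edge_probability(
            node_attrs[u], node_attrs[v], affinities)
    ]


# Hypothetical setup: two binary attributes sharing one affinity matrix.
theta = [[0.9, 0.5], [0.5, 0.1]]
affinities = [theta, theta]
node_attrs = [(0, 0), (0, 1), (1, 0), (1, 1)]
edges = sample_mag_graph(node_attrs, affinities, seed=5)
```

For example, nodes with attributes (0, 1) and (0, 0) connect with probability 0.9 * 0.5 under these illustrative affinities.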