Methods to Determine Node Centrality and Clustering in Graphs with Uncertain Structure
Much of the past work in network analysis has focused on analyzing discrete
graphs, where binary edges represent the "presence" or "absence" of a
relationship. Since traditional network measures (e.g., betweenness centrality)
utilize a discrete link structure, complex systems must be transformed to this
representation in order to investigate network properties. However, in many
domains there may be uncertainty about the relationship structure and any
uncertainty information would be lost in translation to a discrete
representation. Uncertainty may arise in domains where there is moderating link
information that cannot be easily observed, e.g., links may become inactive over
time without being dropped, or observed links may not always correspond to a
valid relationship. In order to represent and reason with these types of
uncertainty, we move beyond the discrete graph framework and develop social
network measures based on a probabilistic graph representation. More
specifically, we develop measures of path length, betweenness centrality, and
clustering coefficient---one set based on sampling and one based on
probabilistic paths. We evaluate our methods on three real-world networks from
Enron, Facebook, and DBLP, showing that our proposed methods more accurately
capture salient effects without being susceptible to local noise, and that the
resulting analysis produces a better understanding of the graph structure and
the uncertainty resulting from its change over time.
Comment: Longer version of a paper appearing in the Fifth International AAAI Conference on Weblogs and Social Media. 9 pages, 4 Figures
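The sampling-based idea described in this abstract can be illustrated with a short sketch: draw discrete graph instances according to per-edge existence probabilities and average a standard measure over the samples. The code below is a minimal, hypothetical illustration (not the authors' implementation); the edge attribute name "p", the sample count, and the use of networkx are assumptions made for the example.

```python
# Hypothetical sketch of a sampling-based estimator over a probabilistic graph.
# Assumes each edge carries an existence probability in the attribute "p".
import random
import networkx as nx

def expected_betweenness(prob_graph, num_samples=200, seed=0):
    """Estimate expected betweenness centrality by sampling discrete
    graph instances according to the edge probabilities."""
    rng = random.Random(seed)
    totals = {v: 0.0 for v in prob_graph.nodes()}
    for _ in range(num_samples):
        # Draw one possible world: keep each edge independently with prob p.
        sample = nx.Graph()
        sample.add_nodes_from(prob_graph.nodes())
        for u, v, data in prob_graph.edges(data=True):
            if rng.random() < data.get("p", 1.0):
                sample.add_edge(u, v)
        # Accumulate the measure computed on this sampled instance.
        for node, score in nx.betweenness_centrality(sample).items():
            totals[node] += score
    return {node: total / num_samples for node, total in totals.items()}

# Example usage on a toy probabilistic graph.
G = nx.Graph()
G.add_edge("a", "b", p=0.9)
G.add_edge("b", "c", p=0.5)
G.add_edge("a", "c", p=0.2)
print(expected_betweenness(G, num_samples=500))
```

The same pattern applies to the other measures mentioned above (path length, clustering coefficient): compute the measure on each sampled instance and average across samples.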
Network Sampling: From Static to Streaming Graphs
Network sampling is integral to the analysis of social, information, and
biological networks. Since many real-world networks are massive in size,
continuously evolving, and/or distributed in nature, the network structure is
often sampled in order to facilitate study. For these reasons, a more thorough
and complete understanding of network sampling is critical to support the field
of network science. In this paper, we outline a framework for the general
problem of network sampling, by highlighting the different objectives,
population and units of interest, and classes of network sampling methods. In
addition, we propose a spectrum of computational models for network sampling
methods, ranging from the traditionally studied model based on the assumption
of a static domain to a more challenging model that is appropriate for
streaming domains. We design a family of sampling methods based on the concept
of graph induction that generalize across the full spectrum of computational
models (from static to streaming) while efficiently preserving many of the
topological properties of the input graphs. Furthermore, we demonstrate how
traditional static sampling algorithms can be modified for graph streams for
each of the three main classes of sampling methods: node, edge, and
topology-based sampling. Our experimental results indicate that our proposed
family of sampling methods more accurately preserves the underlying properties
of the graph for both static and streaming graphs. Finally, we study the impact
of network sampling algorithms on the parameter estimation and performance
evaluation of relational classification algorithms.
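As a rough illustration of sampling with graph induction over an edge stream, the sketch below admits nodes from streamed edges until a node budget is filled and then induces every later edge whose endpoints both fall in the sampled node set. This is a simplified, hypothetical example in the spirit of the abstract, not the paper's published algorithm; the node-budget policy and lack of reservoir-style replacement are assumptions made to keep the sketch short.

```python
# Hypothetical single-pass sketch of streaming node sampling with graph induction.
import random
import networkx as nx

def stream_sample_with_induction(edge_stream, node_budget):
    """Process edges one at a time: admit endpoints of streamed edges until
    the node budget is filled, then keep (induce) any streamed edge whose
    endpoints are both already in the sampled node set."""
    sample = nx.Graph()
    for u, v in edge_stream:
        for node in (u, v):
            if sample.number_of_nodes() < node_budget:
                sample.add_node(node)
        # Graph induction step: retain edges among already-sampled nodes.
        if sample.has_node(u) and sample.has_node(v):
            sample.add_edge(u, v)
    return sample

# Example usage on a shuffled edge stream of a synthetic graph.
full = nx.barabasi_albert_graph(1000, 3, seed=1)
stream = list(full.edges())
random.Random(1).shuffle(stream)
subgraph = stream_sample_with_induction(stream, node_budget=100)
print(subgraph.number_of_nodes(), subgraph.number_of_edges())
```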