Network Sampling: From Static to Streaming Graphs
Network sampling is integral to the analysis of social, information, and
biological networks. Since many real-world networks are massive in size,
continuously evolving, and/or distributed in nature, the network structure is
often sampled in order to facilitate study. For these reasons, a more thorough
and complete understanding of network sampling is critical to support the field
of network science. In this paper, we outline a framework for the general
problem of network sampling, by highlighting the different objectives,
population and units of interest, and classes of network sampling methods. In
addition, we propose a spectrum of computational models for network sampling
methods, ranging from the traditionally studied model based on the assumption
of a static domain to a more challenging model that is appropriate for
streaming domains. We design a family of sampling methods based on the concept
of graph induction that generalize across the full spectrum of computational
models (from static to streaming) while efficiently preserving many of the
topological properties of the input graphs. Furthermore, we demonstrate how
traditional static sampling algorithms can be modified for graph streams for
each of the three main classes of sampling methods: node, edge, and
topology-based sampling. Our experimental results indicate that our proposed
family of sampling methods more accurately preserves the underlying properties
of the graph for both static and streaming graphs. Finally, we study the impact
of network sampling algorithms on the parameter estimation and performance
evaluation of relational classification algorithms.
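As a rough illustration of what sampling "based on the concept of graph induction" can look like on a static graph, the sketch below draws edges uniformly at random and then adds every edge of the original graph whose two endpoints were both sampled. The function name `induced_edge_sample` and its parameters are hypothetical, chosen for this example; this is a minimal static-graph sketch under those assumptions and not necessarily the authors' exact method or its streaming variant.

```python
import random


def induced_edge_sample(edges, fraction, seed=None):
    """Hypothetical helper: uniform edge sampling followed by graph induction.

    edges    -- iterable of (u, v) pairs describing an undirected graph
    fraction -- share of edges to draw in the first (edge sampling) step
    Returns the sampled node set and the induced edge list.
    """
    rng = random.Random(seed)
    edges = list(edges)
    k = max(1, round(fraction * len(edges)))

    # Step 1: pick k edges uniformly at random, without replacement.
    picked = rng.sample(edges, k)

    # Step 2: the sampled nodes are the endpoints of the picked edges.
    nodes = {x for edge in picked for x in edge}

    # Step 3 (induction): keep every original edge between sampled nodes,
    # including edges that were not picked in step 1.
    induced = [(u, v) for (u, v) in edges if u in nodes and v in nodes]
    return nodes, induced


if __name__ == "__main__":
    toy_edges = [(1, 2), (2, 3), (1, 3), (3, 4), (4, 5)]
    sampled_nodes, sampled_edges = induced_edge_sample(toy_edges, 0.4, seed=0)
    print(sorted(sampled_nodes), sampled_edges)
```

The induction step is what distinguishes this from plain edge sampling: it recovers edges among the sampled nodes that the random draw missed, which tends to keep the sample denser than the picked edges alone.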
Respondent-Driven Sampling: An Assessment of Current Methodology
Respondent-Driven Sampling (RDS) employs a variant of a link-tracing network
sampling strategy to collect data from hard-to-reach populations. By tracing
the links in the underlying social network, the process exploits the social
structure to expand the sample and reduce its dependence on the initial
(convenience) sample.
The primary goal of RDS is typically to estimate population averages in the
hard-to-reach population. The current estimates make strong assumptions in
order to treat the data as a probability sample. In particular, we evaluate
three critical sensitivities of the estimators: to bias induced by the initial
sample, to uncontrollable features of respondent behavior, and to the
without-replacement structure of sampling.
This paper sounds a cautionary note for the users of RDS. While current RDS
methodology is powerful and clever, the favorable statistical properties
claimed for the current estimates are shown to be heavily dependent on often
unrealistic assumptions.
Comment: 35 pages, 29 figures, under review
Estimating and Sampling Graphs with Multidimensional Random Walks
Estimating characteristics of large graphs via sampling is a vital part of
the study of complex networks. Current sampling methods such as (independent)
random vertex sampling and random walks are useful but have drawbacks. Random vertex
sampling may require too many resources (time, bandwidth, or money). Random
walks, which normally require fewer resources per sample, can suffer from large
estimation errors in the presence of disconnected or loosely connected graphs.
In this work we propose a new m-dimensional random walk that uses m
dependent random walkers. We show that the proposed sampling method, which we
call Frontier sampling, exhibits all of the nice sampling properties of a
regular random walk. At the same time, our simulations over large real world
graphs show that, in the presence of disconnected or loosely connected
components, Frontier sampling exhibits lower estimation errors than regular
random walks. We also show that Frontier sampling is more suitable than random
vertex sampling to sample the tail of the degree distribution of the graph.
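The idea of several dependent walkers can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's reference implementation: the graph is an adjacency-list dictionary in which every node has at least one neighbor, the walkers start at nodes drawn uniformly at random, and at each step one walker is chosen with probability proportional to the degree of its current node and then moves to a uniformly chosen neighbor. The name `frontier_sampling` and its parameters are hypothetical.

```python
import random


def frontier_sampling(adj, m, steps, seed=None):
    """Hypothetical sketch of a walk with m dependent walkers.

    adj   -- dict mapping node -> list of neighbors (each list non-empty)
    m     -- number of walkers kept in the frontier
    steps -- number of edges to sample
    Returns the list of sampled edges (u, v) in the order they were traversed.
    """
    rng = random.Random(seed)
    nodes = list(adj)

    # Assumption: walkers start at nodes chosen uniformly at random.
    frontier = [rng.choice(nodes) for _ in range(m)]
    sampled_edges = []

    for _ in range(steps):
        # Choose which walker moves, weighted by the degree of its position.
        weights = [len(adj[u]) for u in frontier]
        i = rng.choices(range(m), weights=weights, k=1)[0]

        # Move that walker along a uniformly chosen incident edge.
        u = frontier[i]
        v = rng.choice(adj[u])
        sampled_edges.append((u, v))
        frontier[i] = v

    return sampled_edges


if __name__ == "__main__":
    # Two disconnected triangles: a single walker would never leave its component.
    graph = {
        "a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b"],
        "x": ["y", "z"], "y": ["x", "z"], "z": ["x", "y"],
    }
    print(frontier_sampling(graph, m=4, steps=10, seed=0))
```

Because the walkers are spread over the graph at the start, the sample can cover several components, which a single random walk cannot do on a disconnected graph.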
Evolutionary Approaches to Optimization Problems in Chimera Topologies
Chimera graphs define the topology of one of the first commercially available
quantum computers. A variety of optimization problems have been mapped to this
topology to evaluate the behavior of quantum-enhanced optimization heuristics
relative to other optimizers; solving these problems efficiently by classical
means allows them to serve as benchmarks for quantum machines. In this paper we
investigate for the first time the use of Evolutionary Algorithms (EAs) on
Ising spin glass instances defined on the Chimera topology. Three genetic
algorithms (GAs) and three estimation of distribution algorithms (EDAs) are
evaluated over hard instances of the Ising spin glass constructed from
Sidon sets. We focus on determining whether the information about the topology
of the graph can be used to improve the results of EAs and on identifying the
characteristics of the Ising instances that influence the success rate of GAs
and EDAs.
Comment: 8 pages, 5 figures, 3 tables