1,795 research outputs found
Detection of Core-Periphery Structure in Networks Using Spectral Methods and Geodesic Paths
We introduce several novel and computationally efficient methods for
detecting "core--periphery structure" in networks. Core--periphery structure is
a type of mesoscale structure that includes densely-connected core vertices and
sparsely-connected peripheral vertices. Core vertices tend to be well-connected
both among themselves and to peripheral vertices, which tend not to be
well-connected to other vertices. Our first method, which is based on
transportation in networks, aggregates information from many geodesic paths in
a network and yields a score for each vertex that reflects the likelihood that
a vertex is a core vertex. Our second method is based on a low-rank
approximation of a network's adjacency matrix, which can often be expressed as
a tensor-product matrix. Our third approach uses the bottom eigenvector of the
random-walk Laplacian to infer a coreness score and a classification into core
and peripheral vertices. We also design an objective function to (1) help
classify vertices into core or peripheral vertices and (2) provide a
goodness-of-fit criterion for classifications into core versus peripheral
vertices. To examine the performance of our methods, we apply our algorithms to
both synthetically-generated networks and a variety of networks constructed
from real-world data sets.Comment: This article is part of EJAM's December 2016 special issue on
"Network Analysis and Modelling" (available at
https://www.cambridge.org/core/journals/european-journal-of-applied-mathematics/issue/journal-ejm-volume-27-issue-6/D245C89CABF55DBF573BB412F7651ADB
Transforming Graph Representations for Statistical Relational Learning
Relational data representations have become an increasingly important topic
due to the recent proliferation of network datasets (e.g., social, biological,
information networks) and a corresponding increase in the application of
statistical relational learning (SRL) algorithms to these domains. In this
article, we examine a range of representation issues for graph-based relational
data. Since the choice of relational data representation for the nodes, links,
and features can dramatically affect the capabilities of SRL algorithms, we
survey approaches and opportunities for relational representation
transformation designed to improve the performance of these algorithms. This
leads us to introduce an intuitive taxonomy for data representation
transformations in relational domains that incorporates link transformation and
node transformation as symmetric representation tasks. In particular, the
transformation tasks for both nodes and links include (i) predicting their
existence, (ii) predicting their label or type, (iii) estimating their weight
or importance, and (iv) systematically constructing their relevant features. We
motivate our taxonomy through detailed examples and use it to survey and
compare competing approaches for each of these tasks. We also discuss general
conditions for transforming links, nodes, and features. Finally, we highlight
challenges that remain to be addressed
From which world is your graph?
Discovering statistical structure from links is a fundamental problem in the
analysis of social networks. Choosing a misspecified model, or equivalently, an
incorrect inference algorithm will result in an invalid analysis or even
falsely uncover patterns that are in fact artifacts of the model. This work
focuses on unifying two of the most widely used link-formation models: the
stochastic blockmodel (SBM) and the small world (or latent space) model (SWM).
Integrating techniques from kernel learning, spectral graph theory, and
nonlinear dimensionality reduction, we develop the first statistically sound
polynomial-time algorithm to discover latent patterns in sparse graphs for both
models. When the network comes from an SBM, the algorithm outputs a block
structure. When it is from an SWM, the algorithm outputs estimates of each
node's latent position.Comment: To appear in NIPS 201
Topology Discovery of Sparse Random Graphs With Few Participants
We consider the task of topology discovery of sparse random graphs using
end-to-end random measurements (e.g., delay) between a subset of nodes,
referred to as the participants. The rest of the nodes are hidden, and do not
provide any information for topology discovery. We consider topology discovery
under two routing models: (a) the participants exchange messages along the
shortest paths and obtain end-to-end measurements, and (b) additionally, the
participants exchange messages along the second shortest path. For scenario
(a), our proposed algorithm results in a sub-linear edit-distance guarantee
using a sub-linear number of uniformly selected participants. For scenario (b),
we obtain a much stronger result, and show that we can achieve consistent
reconstruction when a sub-linear number of uniformly selected nodes
participate. This implies that accurate discovery of sparse random graphs is
tractable using an extremely small number of participants. We finally obtain a
lower bound on the number of participants required by any algorithm to
reconstruct the original random graph up to a given edit distance. We also
demonstrate that while consistent discovery is tractable for sparse random
graphs using a small number of participants, in general, there are graphs which
cannot be discovered by any algorithm even with a significant number of
participants, and with the availability of end-to-end information along all the
paths between the participants.Comment: A shorter version appears in ACM SIGMETRICS 2011. This version is
scheduled to appear in J. on Random Structures and Algorithm
HYPA: Efficient Detection of Path Anomalies in Time Series Data on Networks
The unsupervised detection of anomalies in time series data has important
applications in user behavioral modeling, fraud detection, and cybersecurity.
Anomaly detection has, in fact, been extensively studied in categorical
sequences. However, we often have access to time series data that represent
paths through networks. Examples include transaction sequences in financial
networks, click streams of users in networks of cross-referenced documents, or
travel itineraries in transportation networks. To reliably detect anomalies, we
must account for the fact that such data contain a large number of independent
observations of paths constrained by a graph topology. Moreover, the
heterogeneity of real systems rules out frequency-based anomaly detection
techniques, which do not account for highly skewed edge and degree statistics.
To address this problem, we introduce HYPA, a novel framework for the
unsupervised detection of anomalies in large corpora of variable-length
temporal paths in a graph. HYPA provides an efficient analytical method to
detect paths with anomalous frequencies that result from nodes being traversed
in unexpected chronological order.Comment: 11 pages with 8 figures and supplementary material. To appear at SIAM
Data Mining (SDM 2020
- …