Change Point Detection on a Separable Model for Dynamic Networks
This paper studies the change point detection problem in time series of
networks, with the Separable Temporal Exponential-family Random Graph Model
(STERGM). We consider a sequence of networks generated from a piecewise
constant distribution that is altered at unknown change points in time.
Detection of the change points can identify the discrepancies in the underlying
data generating processes and facilitate downstream dynamic network analysis
tasks. Moreover, the STERGM that focuses on network statistics is a flexible
model to fit dynamic networks with both dyadic and temporal dependence. We
propose a new estimator derived from the Alternating Direction Method of
Multipliers (ADMM) and the Group Fused Lasso to simultaneously detect multiple
time points where the parameters of STERGM have changed. We also provide a
Bayesian information criterion for model selection to assist the detection. Our
experiments show good performance of the proposed method on both simulated and
real data. Lastly, we develop an R package CPDstergm to implement our method.
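A minimal sketch of the Group Fused Lasso idea mentioned in the abstract above, assuming the parameters are already available as one vector per time point. It is not the CPDstergm implementation, and the function names are hypothetical. It shows the group soft-thresholding operator that ADMM-style solvers commonly apply to a vector as a whole, the fused penalty on consecutive differences, and how change points can be read off a piecewise constant parameter sequence.

```python
import math

def group_soft_threshold(v, lam):
    """Proximal operator of lam * ||v||_2: shrink the whole vector toward zero."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm <= lam:
        # The entire group is set to zero, which is what yields exact "no change" steps.
        return [0.0] * len(v)
    scale = 1.0 - lam / norm
    return [scale * x for x in v]

def fused_penalty(thetas):
    """Group fused lasso penalty: sum over t of ||theta[t+1] - theta[t]||_2."""
    return sum(math.dist(a, b) for a, b in zip(thetas, thetas[1:]))

def change_points(thetas, tol=1e-8):
    """Indices t where theta[t] differs from theta[t-1] by more than tol (tol is an assumed cutoff)."""
    return [t for t in range(1, len(thetas))
            if math.dist(thetas[t], thetas[t - 1]) > tol]

# Toy parameter sequence with a single jump between t=1 and t=2.
thetas = [[0.0, 0.0], [0.0, 0.0], [3.0, 4.0], [3.0, 4.0]]
detected = change_points(thetas)
penalty = fused_penalty(thetas)
```

Because the penalty acts on the Euclidean norm of each difference vector, all coordinates of a difference are zeroed together, so a change point is shared across parameters rather than detected one coordinate at a time.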
Change Point Methods on a Sequence of Graphs
Given a finite sequence of graphs, e.g., coming from technological,
biological, and social networks, the paper proposes a methodology to identify
possible changes in stationarity in the stochastic process generating the
graphs. In order to cover a large class of applications, we consider the
general family of attributed graphs where both topology (number of vertices and
edge configuration) and related attributes are allowed to change, even in the
stationary case. Novel Change Point Methods (CPMs) are proposed that (i) map
graphs into a vector domain; (ii) apply a suitable statistical test in the
vector space; (iii) detect the change, if any, according to a confidence
level and provide an estimate of its time of occurrence. Two specific
multivariate CPMs have been designed: one that detects shifts in the
distribution mean, the other addressing generic changes affecting the
distribution. We ground our proposal with theoretical results showing how to
relate the inference attained in the numerical vector space to the graph
domain, and vice versa. We also show how to extend the methodology for handling
multiple change points in the same sequence. Finally, the proposed CPMs have
been validated on real data sets coming from epileptic-seizure detection
problems and on labeled data sets for graph classification. Results show the
effectiveness of the proposed methods in relevant application scenarios.
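The three steps in the abstract above (embed, test, locate) can be sketched roughly as follows. This is an illustration under simplifying assumptions, not the paper's method: the embedding (node count and edge count) and the scan statistic (a weighted squared gap between the means before and after each candidate time) are stand-ins chosen for brevity, and the confidence-level decision is left out because it would need a calibrated threshold. All function names are hypothetical.

```python
def embed(edges):
    """Map a graph, given as a list of (u, v) edges, to a small feature vector."""
    nodes = {u for edge in edges for u in edge}
    return [float(len(nodes)), float(len(edges))]

def mean(vectors):
    """Coordinate-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def mean_shift_scan(vectors):
    """Scan every split time t and return the (t, statistic) pair with the largest mean gap."""
    n = len(vectors)
    best_t, best_stat = None, 0.0
    for t in range(1, n):
        before, after = mean(vectors[:t]), mean(vectors[t:])
        gap = sum((a - b) ** 2 for a, b in zip(before, after))
        stat = t * (n - t) / n * gap  # weight down splits close to either end
        if stat > best_stat:
            best_t, best_stat = t, stat
    return best_t, best_stat

# Three single-edge graphs followed by three triangles: the change is at index 3.
single_edge = [(0, 1)]
triangle = [(0, 1), (1, 2), (0, 2)]
graphs = [single_edge] * 3 + [triangle] * 3
t_hat, stat = mean_shift_scan([embed(g) for g in graphs])
```

A mean-gap statistic like this one only responds to shifts in the mean of the embedded vectors; detecting generic distribution changes, as the second CPM in the abstract does, would need a different test.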
A survey of statistical network models
Networks are ubiquitous in science and have become a focal point for
discussion in everyday life. Formal statistical models for the analysis of
network data have emerged as a major topic of interest in diverse areas of
study, and most of these involve a form of graphical representation.
Probability models on graphs date back to 1959. Along with empirical studies in
social psychology and sociology from the 1960s, these early works generated an
active network community and a substantial literature in the 1970s. This effort
moved into the statistical literature in the late 1970s and 1980s, and the past
decade has seen a burgeoning network literature in statistical physics and
computer science. The growth of the World Wide Web and the emergence of online
networking communities such as Facebook, MySpace, and LinkedIn, and a host of
more specialized professional network communities has intensified interest in
the study of networks and network data. Our goal in this review is to provide
the reader with an entry point to this burgeoning literature. We begin with an
overview of the historical development of statistical network modeling and then
we introduce a number of examples that have been studied in the network
literature. Our subsequent discussion focuses on a number of prominent static
and dynamic network models and their interconnections. We emphasize formal
model descriptions, and pay special attention to the interpretation of
parameters and their estimation. We end with a description of some open
problems and challenges for machine learning and statistics.
Comment: 96 pages, 14 figures, 333 references
Evolutionary Events in a Mathematical Sciences Research Collaboration Network
This study examines long-term trends and shifting behavior in the
collaboration network of mathematics literature, using a subset of data from
Mathematical Reviews spanning 1985-2009. Rather than modeling the network
cumulatively, this study traces the evolution of the "here and now" using
fixed-duration sliding windows. The analysis uses a suite of common network
diagnostics, including the distributions of degrees, distances, and clustering,
to track network structure. Several random models that take these diagnostics
as parameters help separate the effect of each one from the values of the others. Some
behaviors are consistent over the entire interval, but most diagnostics
indicate that the network's structural evolution is dominated by occasional
dramatic shifts in otherwise steady trends. These behaviors are not distributed
evenly across the network; stark differences in evolution can be observed
between two major subnetworks, loosely thought of as "pure" and "applied",
which approximately partition the aggregate. The paper characterizes two major
events along the mathematics network trajectory and discusses possible
explanatory factors.
Comment: 30 pages, 14 figures, 1 table; supporting information: 5 pages, 5
figures; published in Scientometrics
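A minimal sketch of the fixed-duration sliding-window idea from the abstract above, assuming collaboration records arrive as (year, author, author) triples. The record format, window width, and function names are illustrative assumptions, and mean degree stands in for the fuller suite of diagnostics the study uses.

```python
from collections import defaultdict

def window_edges(records, start, width):
    """Distinct undirected edges whose year falls in [start, start + width)."""
    return {frozenset((u, v)) for year, u, v in records
            if start <= year < start + width and u != v}

def mean_degree(edges):
    """Average degree over the nodes that appear in at least one edge."""
    degree = defaultdict(int)
    for edge in edges:
        for node in edge:
            degree[node] += 1
    return sum(degree.values()) / len(degree) if degree else 0.0

def sliding_mean_degree(records, first, last, width):
    """Mean degree for every full window that fits inside the years first..last."""
    return {start: mean_degree(window_edges(records, start, width))
            for start in range(first, last - width + 2)}

# Toy data: each tuple is (publication year, author, coauthor).
records = [
    (1985, "a", "b"), (1986, "b", "c"), (1987, "a", "c"),
    (1988, "c", "d"), (1988, "a", "d"),
]
trend = sliding_mean_degree(records, first=1985, last=1988, width=2)
```

Each window is rebuilt from scratch rather than accumulated, so a collaboration that falls out of the window stops contributing, which is the difference between this view and cumulative modeling.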
Online Causal Structure Learning in the Presence of Latent Variables
We present two online causal structure learning algorithms which can track
changes in a causal structure and process data in a dynamic real-time manner.
Standard causal structure learning algorithms assume that causal structure does
not change during the data collection process, but in real-world scenarios, it
does often change. Therefore, it is inappropriate to handle such changes with
existing batch-learning approaches, and instead, a structure should be learned
in an online manner. The online causal structure learning algorithms we present
here can revise correlation values without reprocessing the entire dataset and
use an existing model to avoid relearning the causal links in the prior model
that still fit the data. The proposed algorithms are tested on synthetic and
real-world datasets, the latter being a seasonally adjusted commodity price
index dataset for the U.S. The online causal structure learning algorithms
outperformed standard FCI by a large margin in learning the changed causal
structure correctly and efficiently when latent variables were present.
Comment: 16 pages, 9 figures, 2 tables
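One way to revise a correlation value without reprocessing the entire dataset, as the abstract above describes, is a running (Welford-style) update of means and co-moments. The sketch below is an assumption about how such an update could look, not the paper's algorithm; in particular it weights all observations equally, whereas an online method that tracks structural change may weight recent data more heavily. The class name is hypothetical.

```python
import math

class OnlineCorrelation:
    """Pearson correlation of a stream of (x, y) pairs, updated in O(1) per pair."""

    def __init__(self):
        self.n = 0
        self.mean_x = 0.0
        self.mean_y = 0.0
        self.m2_x = 0.0   # running sum of squared deviations of x
        self.m2_y = 0.0   # running sum of squared deviations of y
        self.c_xy = 0.0   # running sum of cross deviations

    def update(self, x, y):
        self.n += 1
        dx = x - self.mean_x          # deviation from the old mean of x
        dy = y - self.mean_y          # deviation from the old mean of y
        self.mean_x += dx / self.n
        self.mean_y += dy / self.n
        self.m2_x += dx * (x - self.mean_x)
        self.m2_y += dy * (y - self.mean_y)
        self.c_xy += dx * (y - self.mean_y)

    def correlation(self):
        denom = math.sqrt(self.m2_x * self.m2_y)
        return self.c_xy / denom if denom > 0 else float("nan")

# Feed observations one at a time; no earlier data point is revisited.
tracker = OnlineCorrelation()
for x, y in [(1.0, 3.0), (2.0, 1.0), (3.0, 2.0)]:
    tracker.update(x, y)
rho = tracker.correlation()
```

Only six numbers are stored per variable pair, so memory use does not grow with the length of the stream.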