Dynamic Behavioral Mixed-Membership Model for Large Evolving Networks
The majority of real-world networks are dynamic and extremely large (e.g.,
Internet Traffic, Twitter, Facebook, ...). To understand the structural
behavior of nodes in these large dynamic networks, it may be necessary to model
the dynamics of behavioral roles representing the main connectivity patterns
over time. In this paper, we propose a dynamic behavioral mixed-membership
model (DBMM) that captures the roles of nodes in the graph and how they evolve
over time. Unlike other node-centric models, our model is scalable for
analyzing large dynamic networks. In addition, DBMM is flexible,
parameter-free, has no functional form or parameterization, and is
interpretable (identifies explainable patterns). The performance results
indicate our approach can be applied to very large networks while the
experimental results show that our model uncovers interesting patterns
underlying the dynamics of these networks.
Graphs in machine learning: an introduction
Graphs are commonly used to characterise interactions between objects of
interest. Because they are based on a straightforward formalism, they are used
in many scientific fields from computer science to historical sciences. In this
paper, we give an introduction to some methods relying on graphs for learning.
This includes both unsupervised and supervised methods. Unsupervised learning
algorithms usually aim at visualising graphs in latent spaces and/or clustering
the nodes. Both focus on extracting knowledge from graph topologies. While most
existing techniques are only applicable to static graphs, where edges do not
evolve through time, recent developments have shown that they could be extended
to deal with evolving networks. In a supervised context, one generally aims at
inferring labels or numerical values attached to nodes using both the graph
and, when they are available, node characteristics. Balancing the two sources
of information can be challenging, especially as they can disagree locally or
globally. In both contexts, supervised and unsupervised, data can be
relational (augmented with one or several global graphs) as described above, or
graph valued. In this latter case, each object of interest is given as a full
graph (possibly completed by other characteristics). In this context, natural
tasks include graph clustering (as in producing clusters of graphs rather than
clusters of nodes in a single graph), graph classification, etc.
1 Real networks
One of the first practical studies on graphs can be dated back to the
original work of Moreno [51] in the 30s. Since then, there has been a growing
interest in graph analysis associated with strong developments in the modelling
and the processing of these data. Graphs are now used in many scientific
fields. In Biology [54, 2, 7], for instance, metabolic networks can describe
pathways of biochemical reactions [41], while in social sciences networks are
used to represent relation ties between actors [66, 56, 36, 34]. Other examples
include power grids [71] and the web [75]. Recently, networks have also been
considered in other areas such as geography [22] and history [59, 39]. In
machine learning, networks are seen as powerful tools to model problems in
order to extract information from data and for prediction purposes. This is the
object of this paper. For more complete surveys, we refer to [28, 62, 49, 45].
In this section, we introduce notations and highlight properties shared by most
real networks. In Section 2, we then consider methods aiming at extracting
information from a unique network. We will particularly focus on clustering
methods where the goal is to find clusters of vertices. Finally, in Section 3,
techniques that take a series of networks into account, where each network i
Seasonality in Dynamic Stochastic Block Models
Sociotechnological and geospatial processes exhibit time-varying structure
that makes insight discovery challenging. This paper proposes a new statistical
model for such systems, modeled as dynamic networks, to address this challenge.
It assumes that vertices fall into one of k types and that the probability of
edge formation at a particular time depends on the types of the incident nodes
and the current time. The time dependencies are driven by unique seasonal
processes, which many systems exhibit (e.g., predictable spikes in geospatial
or web traffic each day). The paper defines the model as a generative process
and an inference procedure to recover the seasonal processes from data when
they are unknown. Evaluation with synthetic dynamic networks shows the recovery
of the latent seasonal processes that drive their formation.
Comment: 4 page worksho
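The generative process described in this abstract can be illustrated with a minimal sketch. The abstract does not specify the form of the seasonal processes, so the sinusoidal modulation below is purely an assumption, as are the names `seasonal_edge_prob`, `sample_dynamic_sbm`, `base`, `amplitude`, and `period`. The sketch only shows the general idea: each vertex has one of k types, and the probability of an edge at time t depends on the two types and on t.

```python
import math
import random


def seasonal_edge_prob(base, amplitude, period, t):
    """Edge probability at time t.

    Assumption: a base rate modulated by a sinusoid with the given period.
    The result is clamped to the interval [0, 1].
    """
    p = base + amplitude * math.sin(2 * math.pi * t / period)
    return min(1.0, max(0.0, p))


def sample_dynamic_sbm(types, base, amplitude, period, n_steps, seed=0):
    """Sample one undirected edge set per time step.

    types[i] is the type (0..k-1) of vertex i; base[a][b] and amplitude[a][b]
    are hypothetical k x k parameter tables indexed by the two endpoint types.
    """
    rng = random.Random(seed)
    n = len(types)
    snapshots = []
    for t in range(n_steps):
        edges = set()
        for i in range(n):
            for j in range(i + 1, n):
                a, b = types[i], types[j]
                p = seasonal_edge_prob(base[a][b], amplitude[a][b], period, t)
                if rng.random() < p:
                    edges.add((i, j))
        snapshots.append(edges)
    return snapshots


if __name__ == "__main__":
    types = [0, 0, 1, 1]
    base = [[0.6, 0.1], [0.1, 0.6]]
    amplitude = [[0.3, 0.05], [0.05, 0.3]]
    for t, edges in enumerate(sample_dynamic_sbm(types, base, amplitude, 4, 4)):
        print(t, sorted(edges))
```

Inference in the paper runs in the opposite direction, recovering the seasonal processes from observed snapshots; the sketch covers only the forward sampling step.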
A survey of statistical network models
Networks are ubiquitous in science and have become a focal point for
discussion in everyday life. Formal statistical models for the analysis of
network data have emerged as a major topic of interest in diverse areas of
study, and most of these involve a form of graphical representation.
Probability models on graphs date back to 1959. Along with empirical studies in
social psychology and sociology from the 1960s, these early works generated an
active network community and a substantial literature in the 1970s. This effort
moved into the statistical literature in the late 1970s and 1980s, and the past
decade has seen a burgeoning network literature in statistical physics and
computer science. The growth of the World Wide Web and the emergence of online
networking communities such as Facebook, MySpace, and LinkedIn, and a host of
more specialized professional network communities has intensified interest in
the study of networks and network data. Our goal in this review is to provide
the reader with an entry point to this burgeoning literature. We begin with an
overview of the historical development of statistical network modeling and then
we introduce a number of examples that have been studied in the network
literature. Our subsequent discussion focuses on a number of prominent static
and dynamic network models and their interconnections. We emphasize formal
model descriptions, and pay special attention to the interpretation of
parameters and their estimation. We end with a description of some open
problems and challenges for machine learning and statistics.
Comment: 96 pages, 14 figures, 333 reference
Exact ICL maximization in a non-stationary temporal extension of the stochastic block model for dynamic networks
The stochastic block model (SBM) is a flexible probabilistic tool that can be
used to model interactions between clusters of nodes in a network. However, it
does not account for interactions of time-varying intensity between clusters.
The extension of the SBM developed in this paper addresses this shortcoming
through a temporal partition: assuming interactions between nodes are recorded
on fixed-length time intervals, the inference procedure associated with the
model we propose makes it possible to cluster the nodes of the network and
the time intervals simultaneously. The number of clusters of nodes and of time intervals, as
well as the memberships to clusters, are obtained by maximizing an exact
integrated complete-data likelihood, relying on a greedy search approach.
Experiments on simulated and real data are carried out in order to assess the
proposed methodology.
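To make the idea of clustering nodes and time intervals together more concrete, the sketch below aggregates recorded interactions into counts per (source cluster, target cluster, interval cluster). This is only an illustration of the data layout such a model operates on. It is not the paper's ICL criterion or its greedy search, and the function name `aggregate_counts`, the directed `(i, j, interval)` triples, and the cluster assignments are all assumptions.

```python
from collections import Counter


def aggregate_counts(interactions, node_cluster, interval_cluster):
    """Count interactions per (source cluster, target cluster, interval cluster).

    interactions: iterable of (i, j, u) triples, meaning node i interacted
    with node j during the fixed-length time interval with index u.
    node_cluster: mapping from node to its cluster label (hypothetical input).
    interval_cluster: sequence mapping interval index to its cluster label.
    """
    counts = Counter()
    for i, j, u in interactions:
        key = (node_cluster[i], node_cluster[j], interval_cluster[u])
        counts[key] += 1
    return counts


if __name__ == "__main__":
    interactions = [(0, 1, 0), (0, 1, 1), (2, 0, 1), (1, 2, 2)]
    node_cluster = {0: 0, 1: 0, 2: 1}
    interval_cluster = [0, 0, 1]
    print(aggregate_counts(interactions, node_cluster, interval_cluster))
```

A search procedure of the kind the abstract mentions would repeatedly change the cluster assignments and re-score the resulting counts; the scoring function itself is not reproduced here.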