Clustering multilayer graphs with missing nodes
Relationships between agents can be conveniently represented by graphs. When
these relationships have different modalities, they are better modelled by
multilayer graphs where each layer is associated with one modality. Such graphs
arise naturally in many contexts including biological and social networks.
Clustering is a fundamental problem in network analysis where the goal is to
regroup nodes with similar connectivity profiles. In the past decade, various
clustering methods have been extended from the unilayer setting to multilayer
graphs in order to incorporate the information provided by each layer. While
most existing works assume - rather restrictively - that all layers share the
same set of nodes, we propose a new framework that allows for layers to be
defined on different sets of nodes. In particular, the nodes not recorded in a
layer are treated as missing. Within this paradigm, we investigate several
generalizations of well-known clustering methods in the complete setting to the
incomplete one and prove some consistency results under the Multi-Layer
Stochastic Block Model assumption. Our theoretical results are complemented by
thorough numerical comparisons between our proposed algorithms on synthetic
data, and also on real datasets, thus highlighting the promising behaviour of
our methods in various settings.
Comment: 27 pages, 7 figures, accepted to AISTATS 202
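To make the missing-node setting concrete, here is a minimal sketch of one simple baseline: fill the rows and columns of unobserved nodes with zeros, sum the layers, and split the nodes by the sign of the second leading eigenvector. This is an illustration under stated assumptions, not necessarily one of the algorithms studied in the paper. The function names, the two-community restriction, the deterministic missingness pattern, and all parameter values are hypothetical.

```python
import numpy as np

def sample_sbm_layer(labels, p_in, p_out, rng):
    """Sample a symmetric 0/1 adjacency matrix from a two-block SBM."""
    same = labels[:, None] == labels[None, :]
    probs = np.where(same, p_in, p_out)
    upper = np.triu(rng.random(probs.shape) < probs, k=1)  # no self-loops
    return (upper | upper.T).astype(float)

def cluster_zero_filled_sum(layers, observed):
    """Two-way spectral clustering of the zero-filled sum of layers.

    layers:   list of (n, n) adjacency matrices
    observed: list of boolean masks of length n (True = node recorded)
    """
    n = layers[0].shape[0]
    total = np.zeros((n, n))
    for adj, mask in zip(layers, observed):
        # Edges touching a node that is missing in this layer count as zero.
        total += adj * np.outer(mask, mask)
    _, vecs = np.linalg.eigh(total)  # eigenvalues in ascending order
    # Sign pattern of the eigenvector for the second largest eigenvalue.
    return (vecs[:, -2] > 0).astype(int)

rng = np.random.default_rng(0)
n, n_layers = 80, 3
labels = np.repeat([0, 1], n // 2)
layers = [sample_sbm_layer(labels, 0.7, 0.1, rng) for _ in range(n_layers)]
# Assumed missingness: layer l does not record nodes with index % 5 == l,
# so every node is observed in at least two of the three layers.
observed = [np.arange(n) % 5 != l for l in range(n_layers)]

pred = cluster_zero_filled_sum(layers, observed)
agreement = np.mean(pred == labels)
accuracy = max(agreement, 1 - agreement)  # cluster labels are arbitrary
print(f"accuracy: {accuracy:.2f}")
```

Zero-filling biases the aggregated matrix towards nodes that are observed in more layers, which is one reason more careful handling of the incomplete setting is of interest.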
Joint Spectral Clustering in Multilayer Degree-Corrected Stochastic Blockmodels
Modern network datasets are often composed of multiple layers, either as
different views, time-varying observations, or independent sample units,
resulting in collections of networks over the same set of vertices but with
potentially different connectivity patterns on each network. These data require
models and methods that are flexible enough to capture local and global
differences across the networks, while at the same time being parsimonious and
tractable to yield computationally efficient and theoretically sound solutions
that are capable of aggregating information across the networks. This paper
considers the multilayer degree-corrected stochastic blockmodel, where a
collection of networks share the same community structure, but
degree-corrections and block connection probability matrices are permitted to
be different. We establish the identifiability of this model and propose a
spectral clustering algorithm for community detection in this setting. Our
theoretical results demonstrate that the misclustering error rate of the
algorithm improves exponentially with multiple network realizations, even in
the presence of significant layer heterogeneity with respect to degree
corrections, signal strength, and spectral properties of the block connection
probability matrices. Simulation studies show that this approach improves on
existing multilayer community detection methods in this challenging regime.
Furthermore, in a case study of US airport data from January 2016 to
September 2021, we find that this methodology identifies meaningful community
structure and trends in airport popularity influenced by pandemic impacts on
travel.
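The abstract does not spell out the steps of the spectral algorithm, so the following is only a rough sketch of one way to combine layers that share communities but differ in degree corrections and block probabilities. It embeds each layer with its leading eigenvectors, normalizes each row to unit length to damp degree effects, concatenates the layer embeddings, and splits the nodes along the leading direction of the centered result. The function names, the two-community restriction, the sign-based split (used here in place of a general clustering step), and all parameter values are assumptions for illustration.

```python
import numpy as np

def sample_dcsbm_layer(labels, theta, p_in, p_out, rng):
    """Sample a symmetric 0/1 adjacency matrix from a two-block DC-SBM."""
    same = labels[:, None] == labels[None, :]
    probs = np.outer(theta, theta) * np.where(same, p_in, p_out)
    upper = np.triu(rng.random(probs.shape) < probs, k=1)  # no self-loops
    return (upper | upper.T).astype(float)

def normalized_embedding(adj, dim):
    """Spectral embedding from the largest-magnitude eigenvalues, unit rows."""
    vals, vecs = np.linalg.eigh(adj)
    top = np.argsort(np.abs(vals))[-dim:]
    emb = vecs[:, top] * np.sqrt(np.abs(vals[top]))
    norms = np.linalg.norm(emb, axis=1, keepdims=True)
    return emb / np.maximum(norms, 1e-12)  # guard against isolated nodes

def joint_two_way_clustering(layers):
    """Concatenate normalized layer embeddings and split along the top direction."""
    stacked = np.hstack([normalized_embedding(adj, 2) for adj in layers])
    centered = stacked - stacked.mean(axis=0)
    u, _, _ = np.linalg.svd(centered, full_matrices=False)
    return (u[:, 0] > 0).astype(int)

rng = np.random.default_rng(1)
n = 120
labels = np.repeat([0, 1], n // 2)
# Each layer gets its own degree corrections and block probabilities.
block_probs = [(0.8, 0.1), (0.6, 0.1), (0.7, 0.2)]
layers = [
    sample_dcsbm_layer(labels, rng.uniform(0.6, 1.0, size=n), p_in, p_out, rng)
    for p_in, p_out in block_probs
]

pred = joint_two_way_clustering(layers)
agreement = np.mean(pred == labels)
accuracy = max(agreement, 1 - agreement)  # cluster labels are arbitrary
print(f"accuracy: {accuracy:.2f}")
```

Row normalization is what makes the layers comparable here: after it, a node's embedded position depends mainly on its community rather than on its layer-specific degree parameter.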