Enhanced detectability of community structure in multilayer networks through layer aggregation
Many systems are naturally represented by a multilayer network in which edges
exist in multiple layers that encode different, but potentially related, types
of interactions, and it is important to understand limitations on the
detectability of community structure in these networks. Using random matrix
theory, we analyze detectability limitations for multilayer (specifically,
multiplex) stochastic block models (SBMs) in which L layers are derived from a
common SBM. We study the effect of layer aggregation on detectability for
several aggregation methods, including summation of the layers' adjacency
matrices for which we show the detectability limit vanishes as O(L^{-1/2}) with
increasing number of layers, L. Importantly, we find a similar scaling behavior
when the summation is thresholded at an optimal value, providing insight into
the common - but not well understood - practice of thresholding
pairwise-interaction data to obtain sparse network representations.
Comment: 7 pages, 4 figures
Enhanced Detectability of Community Structure in Multilayer Networks through Layer Aggregation
Many systems are naturally represented by a multilayer network in which edges exist in multiple layers that encode different, but potentially related, types of interactions, and it is important to understand limitations on the detectability of community structure in these networks. Using random matrix theory, we analyze detectability limitations for multilayer (specifically, multiplex) stochastic block models (SBMs) in which L layers are derived from a common SBM. We study the effect of layer aggregation on detectability for several aggregation methods, including summation of the layers' adjacency matrices, for which we show the detectability limit vanishes as O(L^{-1/2}) with increasing number of layers, L. Importantly, we find a similar scaling behavior when the summation is thresholded at an optimal value, providing insight into the common - but not well understood - practice of thresholding pairwise-interaction data to obtain sparse network representations.
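The aggregation schemes named in this abstract (summing the layers' adjacency matrices, and thresholding that sum) can be illustrated with a minimal sketch. It is not the authors' code: the two-community setup, the function names, and the use of the leading eigenvector of a modularity matrix as the detection step are all assumptions chosen for illustration.

```python
import numpy as np

def sample_sbm_layer(labels, p_in, p_out, rng):
    """Draw one undirected, simple layer from an SBM shared by all layers."""
    n = labels.size
    same = labels[:, None] == labels[None, :]
    probs = np.where(same, p_in, p_out)
    # Sample the strict upper triangle only, then mirror it.
    upper = np.triu(rng.random((n, n)) < probs, k=1)
    return (upper | upper.T).astype(int)

def aggregate_sum(layers):
    """Summation aggregation: entry (i, j) counts layers containing edge (i, j)."""
    return np.sum(layers, axis=0)

def aggregate_threshold(layers, tau):
    """Thresholded aggregation: keep an edge if it appears in at least tau layers."""
    return (aggregate_sum(layers) >= tau).astype(int)

def spectral_bisection(adj):
    """Split nodes by the sign of the leading modularity-matrix eigenvector.

    Assumes the graph has at least one edge (otherwise 2m = 0).
    """
    deg = adj.sum(axis=1)
    two_m = deg.sum()
    modularity = adj - np.outer(deg, deg) / two_m
    _, vecs = np.linalg.eigh(modularity)  # eigenvalues in ascending order
    return (vecs[:, -1] > 0).astype(int)

def accuracy(pred, truth):
    """Fraction of correctly labeled nodes, up to swapping the two labels."""
    agree = np.mean(pred == truth)
    return max(agree, 1 - agree)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    labels = np.repeat([0, 1], 100)
    # Hypothetical weak-signal parameters: single layers are hard to partition.
    layers = [sample_sbm_layer(labels, 0.06, 0.04, rng) for _ in range(16)]
    print("single layer :", accuracy(spectral_bisection(layers[0]), labels))
    print("summed       :", accuracy(spectral_bisection(aggregate_sum(layers)), labels))
    print("thresholded  :", accuracy(spectral_bisection(aggregate_threshold(layers, 2)), labels))
```

The threshold `tau` is left as a free parameter here; the abstract refers to an optimal value, which this sketch does not attempt to compute.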
Super-resolution community detection for layer-aggregated multilayer networks
Applied network science often involves preprocessing network data before
applying a network-analysis method, and there is typically a theoretical
disconnect between these steps. For example, it is common to aggregate
time-varying network data into windows prior to analysis, and the tradeoffs of
this preprocessing are not well understood. Focusing on the problem of
detecting small communities in multilayer networks, we study the effects of
layer aggregation by developing random-matrix theory for modularity matrices
associated with layer-aggregated networks with N nodes and L layers, which
are drawn from an ensemble of Erdős-Rényi networks. We study phase
transitions in which eigenvectors localize onto communities (allowing their
detection) and which occur for a given community provided its size surpasses a
detectability limit K^*. When layers are aggregated via a summation, we
obtain K^* \propto O(\sqrt{NL}/T), where T is the number of
layers across which the community persists. Interestingly, if T is allowed to
vary with L then summation-based layer aggregation enhances small-community
detection even if the community persists across a vanishing fraction of layers,
provided that T/L decays more slowly than O(L^{-1/2}). Moreover,
we find that thresholding the summation can in some cases cause K^* to decay
exponentially, decreasing by orders of magnitude in a phenomenon we call
super-resolution community detection. That is, layer aggregation with
thresholding is a nonlinear data filter enabling detection of communities that
are otherwise too small to detect. Importantly, different thresholds generally
enhance the detectability of communities having different properties,
illustrating that community detection can be obscured if one analyzes network
data using a single threshold.
Comment: 11 pages, 8 figures
Super-Resolution Community Detection for Layer-Aggregated Multilayer Networks
Applied network science often involves preprocessing network data before applying a network-analysis method, and there is typically a theoretical disconnect between these steps. For example, it is common to aggregate time-varying network data into windows prior to analysis, and the trade-offs of this preprocessing are not well understood. Focusing on the problem of detecting small communities in multilayer networks, we study the effects of layer aggregation by developing random-matrix theory for modularity matrices associated with layer-aggregated networks with N nodes and L layers, which are drawn from an ensemble of Erdős-Rényi networks with communities planted in subsets of layers. We study phase transitions in which eigenvectors localize onto communities (allowing their detection) and which occur for a given community provided its size surpasses a detectability limit K*. When layers are aggregated via a summation, we obtain K* ∝ O(√(NL)/T), where T is the number of layers across which the community persists. Interestingly, if T is allowed to vary with L, then summation-based layer aggregation enhances small-community detection even if the community persists across a vanishing fraction of layers, provided that T/L decays more slowly than O(L^{-1/2}). Moreover, we find that thresholding the summation can, in some cases, cause K* to decay exponentially, decreasing by orders of magnitude in a phenomenon we call super-resolution community detection. In other words, layer aggregation with thresholding is a nonlinear data filter enabling detection of communities that are otherwise too small to detect. Importantly, different thresholds generally enhance the detectability of communities having different properties, illustrating that community detection can be obscured if one analyzes network data using a single threshold.
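The condition on T/L can be unpacked in a few lines. This is a sketch, assuming the summation scaling has the square-root form K* ∝ √(NL)/T (the form consistent with the stated O(L^{-1/2}) condition). The symbols ρ and α are introduced here for illustration and do not appear in the abstract.

```latex
% Let \rho = T/L be the fraction of layers in which the community persists.
\begin{align*}
K^* \;\propto\; \frac{\sqrt{NL}}{T}
    \;=\; \frac{\sqrt{NL}}{\rho L}
    \;=\; \frac{1}{\rho}\sqrt{\frac{N}{L}} .
\end{align*}
% Suppose the fraction decays as a power law, \rho \sim L^{-\alpha}, \alpha \ge 0:
\begin{align*}
K^* \;\propto\; \sqrt{N}\, L^{\alpha - 1/2} .
\end{align*}
% For fixed N, K^* decreases with L exactly when \alpha < 1/2,
% i.e. when T/L decays more slowly than L^{-1/2}.
```

For example, with α = 1/4 the limit shrinks as L^{-1/4}, whereas with α = 1/2 aggregation gives no asymptotic improvement under this scaling.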
Community Detection and Improved Detectability in Multiplex Networks
We investigate the widely encountered problem of detecting communities in
multiplex networks, such as social networks, with an unknown arbitrary
heterogeneous structure. To improve detectability, we propose a generative
model that leverages the multiplicity of a single community in multiple layers,
with no prior assumption on the relation of communities among different layers.
Our model relies on a novel idea of incorporating a large set of generic
localized community label constraints across the layers, in conjunction with
the celebrated Stochastic Block Model (SBM) in each layer. Accordingly, we
build a probabilistic graphical model over the entire multiplex network by
treating the constraints as Bayesian priors. We mathematically prove that these
constraints/priors promote existence of identical communities across layers
without introducing further correlation between individual communities. The
constraints are further tailored to render a sparse graphical model and the
numerically efficient Belief Propagation algorithm is subsequently employed. We
further demonstrate by numerical experiments that in the presence of consistent
communities between different layers, consistent communities are matched, and
the detectability is improved over a single layer. We compare our model with a
"correlated model" which exploits the prior knowledge of community correlation
between layers. Similar detectability improvement is obtained under such a
correlation, even though our model relies on much milder assumptions than the
correlated model. Our model even shows a better detection performance over a
certain correlation and signal-to-noise ratio (SNR) range. In the absence of
community correlation, the correlated model naturally fails, while ours
maintains its performance.
Analysis and Actions on Graph Data.
Graphs are commonly used for representing relations between entities and handling data processing in various research fields, especially in social, cyber and physical networks. Many data mining and inference tasks can be interpreted as certain actions on the associated graphs, including graph spectral decompositions, and insertions and removals of nodes or edges. For instance, the task of graph clustering is to group similar nodes on a graph, and it can be solved by graph spectral decompositions. The task of cyber attack is to find effective node or edge removals that lead to maximal disruption in network connectivity.
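The node-removal task described above can be made concrete with a small sketch. It uses a generic greedy heuristic (remove the node whose deletion most shrinks the largest connected component) and is not the dissertation's method. The adjacency-dictionary representation and both function names are assumptions for illustration.

```python
from collections import deque

def largest_component_size(adj, removed=frozenset()):
    """Size of the largest connected component after deleting `removed` nodes.

    `adj` maps each node to an iterable of its neighbors (undirected graph).
    """
    seen = set(removed)  # treat removed nodes as already visited
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:  # breadth-first search over one component
            u = queue.popleft()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        best = max(best, size)
    return best

def greedy_node_removal(adj, budget):
    """Greedily pick up to `budget` nodes, each minimizing the largest component."""
    removed = set()
    for _ in range(budget):
        candidates = [v for v in adj if v not in removed]
        if not candidates:
            break
        target = min(
            candidates,
            key=lambda v: largest_component_size(adj, removed | {v}),
        )
        removed.add(target)
    return removed

if __name__ == "__main__":
    # Star graph: node 0 is the hub, nodes 1..4 are leaves.
    star = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
    print(greedy_node_removal(star, 1))
```

Each greedy step recomputes components for every candidate, so the sketch is only practical for small graphs.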
In this dissertation, we focus on the following topics in graph data analytics:
(1) Fundamental limits of spectral algorithms for graph clustering in single-layer and multilayer graphs.
(2) Efficient algorithms for actions on graphs, including graph spectral decompositions and insertions and removals of nodes or edges.
(3) Applications to deep community detection, event propagation in online social networks, and topological network resilience for cyber security.
For (1), we established fundamental principles governing the performance of graph clustering for both spectral clustering and spectral modularity methods, which play an important role in unsupervised learning and data science. The framework is then extended to multilayer graphs entailing heterogeneous connectivity information.
For (2), we developed efficient algorithms for large-scale graph data analytics with theoretical guarantees, and proposed theory-driven methods for automatic model order selection in graph clustering.
For (3), we proposed a disruptive method for discovering deep communities in graphs, developed a novel method for analyzing event propagation on Twitter, and devised effective graph-theoretic approaches against explicit and lateral attacks in cyber systems.
PhD, Electrical & Computer Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies
http://deepblue.lib.umich.edu/bitstream/2027.42/135752/1/pinyu_1.pd