4,239 research outputs found
A Tensor Approach to Learning Mixed Membership Community Models
Community detection is the task of detecting hidden communities from observed
interactions. Guaranteed community detection has so far been mostly limited to
models with non-overlapping communities such as the stochastic block model. In
this paper, we remove this restriction, and provide guaranteed community
detection for a family of probabilistic network models with overlapping
communities, termed as the mixed membership Dirichlet model, first introduced
by Airoldi et al. This model allows for nodes to have fractional memberships in
multiple communities and assumes that the community memberships are drawn from
a Dirichlet distribution. Moreover, it contains the stochastic block model as a
special case. We propose a unified approach to learning these models via a
tensor spectral decomposition method. Our estimator is based on low-order
moment tensor of the observed network, consisting of 3-star counts. Our
learning method is fast and is based on simple linear algebraic operations,
e.g. singular value decomposition and tensor power iterations. We provide
guaranteed recovery of community memberships and model parameters and present a
careful finite sample analysis of our learning method. As an important special
case, our results match the best known scaling requirements for the
(homogeneous) stochastic block model
Online Tensor Methods for Learning Latent Variable Models
We introduce an online tensor decomposition based approach for two latent
variable modeling problems namely, (1) community detection, in which we learn
the latent communities that the social actors in social networks belong to, and
(2) topic modeling, in which we infer hidden topics of text articles. We
consider decomposition of moment tensors using stochastic gradient descent. We
conduct optimization of multilinear operations in SGD and avoid directly
forming the tensors, to save computational and storage costs. We present
optimized algorithm in two platforms. Our GPU-based implementation exploits the
parallelism of SIMD architectures to allow for maximum speed-up by a careful
optimization of storage and data transfer, whereas our CPU-based implementation
uses efficient sparse matrix computations and is suitable for large sparse
datasets. For the community detection problem, we demonstrate accuracy and
computational efficiency on Facebook, Yelp and DBLP datasets, and for the topic
modeling problem, we also demonstrate good performance on the New York Times
dataset. We compare our results to the state-of-the-art algorithms such as the
variational method, and report a gain of accuracy and a gain of several orders
of magnitude in the execution time.Comment: JMLR 201
Detecting the community structure and activity patterns of temporal networks: a non-negative tensor factorization approach
The increasing availability of temporal network data is calling for more
research on extracting and characterizing mesoscopic structures in temporal
networks and on relating such structure to specific functions or properties of
the system. An outstanding challenge is the extension of the results achieved
for static networks to time-varying networks, where the topological structure
of the system and the temporal activity patterns of its components are
intertwined. Here we investigate the use of a latent factor decomposition
technique, non-negative tensor factorization, to extract the community-activity
structure of temporal networks. The method is intrinsically temporal and allows
to simultaneously identify communities and to track their activity over time.
We represent the time-varying adjacency matrix of a temporal network as a
three-way tensor and approximate this tensor as a sum of terms that can be
interpreted as communities of nodes with an associated activity time series. We
summarize known computational techniques for tensor decomposition and discuss
some quality metrics that can be used to tune the complexity of the factorized
representation. We subsequently apply tensor factorization to a temporal
network for which a ground truth is available for both the community structure
and the temporal activity patterns. The data we use describe the social
interactions of students in a school, the associations between students and
school classes, and the spatio-temporal trajectories of students over time. We
show that non-negative tensor factorization is capable of recovering the class
structure with high accuracy. In particular, the extracted tensor components
can be validated either as known school classes, or in terms of correlated
activity patterns, i.e., of spatial and temporal coincidences that are
determined by the known school activity schedule
Consistent Estimation of Mixed Memberships with Successive Projections
This paper considers the parameter estimation problem in Mixed Membership
Stochastic Block Model (MMSB), which is a quite general instance of random
graph model allowing for overlapping community structure. We present the new
algorithm successive projection overlapping clustering (SPOC) which combines
the ideas of spectral clustering and geometric approach for separable
non-negative matrix factorization. The proposed algorithm is provably
consistent under MMSB with general conditions on the parameters of the model.
SPOC is also shown to perform well experimentally in comparison to other
algorithms
Consistency of adjacency spectral embedding for the mixed membership stochastic blockmodel
The mixed membership stochastic blockmodel is a statistical model for a
graph, which extends the stochastic blockmodel by allowing every node to
randomly choose a different community each time a decision of whether to form
an edge is made. Whereas spectral analysis for the stochastic blockmodel is
increasingly well established, theory for the mixed membership case is
considerably less developed. Here we show that adjacency spectral embedding
into , followed by fitting the minimum volume enclosing convex
-polytope to the principal components, leads to a consistent estimate
of a -community mixed membership stochastic blockmodel. The key is to
identify a direct correspondence between the mixed membership stochastic
blockmodel and the random dot product graph, which greatly facilitates
theoretical analysis. Specifically, a norm and central
limit theorem for the random dot product graph are exploited to respectively
show consistency and partially correct the bias of the procedure.Comment: 12 pages, 6 figure
- …