Mixed membership stochastic blockmodels
Observations consisting of measurements on relationships for pairs of objects
arise in many settings, such as protein interaction and gene regulatory
networks, collections of author-recipient email, and social networks. Analyzing
such data with probabilistic models can be delicate because the simple
exchangeability assumptions underlying many boilerplate models no longer hold.
In this paper, we describe a latent variable model of such data called the
mixed membership stochastic blockmodel. This model extends blockmodels for
relational data to ones which capture mixed membership latent relational
structure, thus providing an object-specific low-dimensional representation. We
develop a general variational inference algorithm for fast approximate
posterior inference. We explore applications to social and protein interaction
networks.
Comment: 46 pages, 14 figures, 3 tables
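The abstract above does not spell out the generative process, so the following is only a minimal sketch of the commonly stated formulation of the mixed membership stochastic blockmodel: each node draws a membership vector from a Dirichlet distribution, and for every ordered pair of nodes the sender and receiver each draw a role before a Bernoulli edge is sampled. The function names, the `alpha` concentration values, and the `block_matrix` entries are illustrative assumptions, not taken from the paper.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet(alpha) vector by normalizing Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def sample_mmsb(n_nodes, alpha, block_matrix, seed=0):
    """Sample a directed graph from a mixed membership stochastic blockmodel.

    alpha and block_matrix are hypothetical inputs chosen by the caller.
    Returns (memberships, adjacency).
    """
    rng = random.Random(seed)
    k = len(alpha)
    # One mixed membership vector per node.
    memberships = [sample_dirichlet(alpha, rng) for _ in range(n_nodes)]
    adjacency = [[0] * n_nodes for _ in range(n_nodes)]
    for p in range(n_nodes):
        for q in range(n_nodes):
            if p == q:
                continue
            # Each node picks a role afresh for every ordered pair.
            sender_role = rng.choices(range(k), weights=memberships[p])[0]
            receiver_role = rng.choices(range(k), weights=memberships[q])[0]
            edge_prob = block_matrix[sender_role][receiver_role]
            adjacency[p][q] = 1 if rng.random() < edge_prob else 0
    return memberships, adjacency
```

A call such as `sample_mmsb(5, [0.5, 0.5], [[0.9, 0.05], [0.05, 0.8]])` would return five membership vectors and a 5 x 5 adjacency matrix with an empty diagonal. Posterior inference, which the paper handles variationally, is not covered by this sketch.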
Supervised Blockmodelling
Collective classification models attempt to improve classification
performance by taking into account the class labels of related instances.
However, they tend not to learn patterns of interactions between classes and/or
make the assumption that instances of the same class link to each other
(assortativity assumption). Blockmodels provide a solution to these issues,
being capable of modelling assortative and disassortative interactions, and
learning the pattern of interactions in the form of a summary network. The
Supervised Blockmodel provides good classification performance using link
structure alone, whilst simultaneously providing an interpretable summary of
network interactions to allow a better understanding of the data. This work
explores three variants of supervised blockmodels of varying complexity and
tests them on four structurally different real-world networks.
Comment: Workshop on Collective Learning and Inference on Structured Data 201
Consistency of adjacency spectral embedding for the mixed membership stochastic blockmodel
The mixed membership stochastic blockmodel is a statistical model for a
graph, which extends the stochastic blockmodel by allowing every node to
randomly choose a different community each time a decision of whether to form
an edge is made. Whereas spectral analysis for the stochastic blockmodel is
increasingly well established, theory for the mixed membership case is
considerably less developed. Here we show that adjacency spectral embedding
into $\mathbb{R}^k$, followed by fitting the minimum volume enclosing convex
$k$-polytope to the $k-1$ principal components, leads to a consistent estimate
of a $k$-community mixed membership stochastic blockmodel. The key is to
identify a direct correspondence between the mixed membership stochastic
blockmodel and the random dot product graph, which greatly facilitates
theoretical analysis. Specifically, a $2 \to \infty$ norm and central
limit theorem for the random dot product graph are exploited to respectively
show consistency and partially correct the bias of the procedure.
Comment: 12 pages, 6 figures
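The correspondence mentioned above can be sketched in a few lines. This is an illustration under the assumption that the block matrix $B$ is positive semidefinite with factorization $B = V V^\top$; the symbols $\pi_i$, $B$, $V$, $v_a$ and $x_i$ are notation introduced here, not quoted from the paper.

```latex
% Edge probability in the mixed membership model, given memberships
% \pi_i, \pi_j (probability vectors of length k):
\Pr(A_{ij} = 1 \mid \pi_i, \pi_j)
  = \sum_{a=1}^{k} \sum_{b=1}^{k} \pi_{ia}\, B_{ab}\, \pi_{jb}
  = \pi_i^\top B\, \pi_j .

% Assume B = V V^\top, where v_a^\top denotes the a-th row of V, and define
x_i = V^\top \pi_i = \sum_{a=1}^{k} \pi_{ia}\, v_a .

% Then the model is a random dot product graph with latent positions x_i:
x_i^\top x_j = \pi_i^\top V V^\top \pi_j = \pi_i^\top B\, \pi_j .
```

Because each $\pi_i$ is a probability vector, every $x_i$ is a convex combination of $v_1, \dots, v_k$, so the latent positions lie in a polytope with those $k$ vertices. That is one way to see why fitting an enclosing polytope to the embedded points is a natural estimation step.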
How Many Communities Are There?
Stochastic blockmodels and variants thereof are among the most widely used
approaches to community detection for social networks and relational data. A
stochastic blockmodel partitions the nodes of a network into disjoint sets,
called communities. The approach is inherently related to clustering with
mixture models and raises a similar model selection problem for the number of
communities. The Bayesian information criterion (BIC) is a popular solution;
however, for stochastic blockmodels, the conditional independence assumption
given the communities of the endpoints among different edges is usually
violated in practice. In this regard, we propose composite likelihood BIC
(CL-BIC) to select the number of communities, and we show it is robust against
possible misspecifications in the underlying stochastic blockmodel assumptions.
We derive the requisite methodology and illustrate the approach using both
simulated and real data. Supplementary materials containing the relevant
computer code are available online.
Comment: 26 pages, 3 figures
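For orientation, here is a minimal sketch of the plain BIC baseline that the abstract contrasts with, not the CL-BIC proposed in the paper. It assumes an undirected Bernoulli stochastic blockmodel with community labels already fixed, counts one parameter per block pair, and uses the number of dyads as the sample size. The function name `sbm_bic` and these penalty choices are assumptions made for illustration.

```python
import math


def sbm_bic(adjacency, labels, n_communities):
    """BIC of an undirected Bernoulli stochastic blockmodel for fixed labels.

    Computes -2 * loglik + (number of block parameters) * log(number of dyads);
    lower is better. This is plain BIC, not the CL-BIC of the paper.
    """
    n = len(adjacency)
    edges = [[0] * n_communities for _ in range(n_communities)]
    dyads = [[0] * n_communities for _ in range(n_communities)]
    for i in range(n):
        for j in range(i + 1, n):
            a, b = sorted((labels[i], labels[j]))
            dyads[a][b] += 1
            edges[a][b] += adjacency[i][j]
    loglik = 0.0
    for a in range(n_communities):
        for b in range(a, n_communities):
            m, d = edges[a][b], dyads[a][b]
            # The maximum-likelihood block density is m / d; blocks with
            # m == 0 or m == d contribute zero to the log-likelihood.
            if 0 < m < d:
                p = m / d
                loglik += m * math.log(p) + (d - m) * math.log(1 - p)
    n_params = n_communities * (n_communities + 1) // 2
    n_dyads = n * (n - 1) // 2
    return -2.0 * loglik + n_params * math.log(n_dyads)
```

Comparing `sbm_bic` across candidate numbers of communities, each with its own fitted labels, and keeping the smallest value is the usual selection rule. The criterion relies on edges being conditionally independent given the labels, which is the assumption the abstract says is often violated.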
Model-based clustering for populations of networks
Until recently, obtaining data on populations of networks was rare.
However, with the advancement of automatic monitoring devices and the growing
social and scientific interest in networks, such data has become more widely
available. From sociological experiments involving cognitive social structures
to fMRI scans revealing large-scale brain networks of groups of patients, there
is a growing awareness that we urgently need tools to analyse populations of
networks and particularly to model the variation between networks due to
covariates. We propose a model-based clustering method based on mixtures of
generalized linear (mixed) models that can be employed to describe the joint
distribution of a population of networks in a parsimonious manner and to
identify subpopulations of networks that share certain topological properties
of interest (degree distribution, community structure, effect of covariates on
the presence of an edge, etc.). Maximum likelihood estimation for the proposed
model can be efficiently carried out with an implementation of the EM
algorithm. We assess the performance of this method on simulated data and
conclude with an example application on advice networks in a small business.
Comment: The final (published) version of the article can be downloaded for
free (Open Access) from the editor's website (click on the DOI link below
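To make the mixture idea concrete, here is a minimal sketch of EM for the simplest special case: a mixture in which each cluster of networks is described by a single edge density (an intercept-only Bernoulli model with no covariates or random effects). The paper's model is more general; the function name `em_density_mixture`, the input layout, and the initial densities are assumptions made for illustration.

```python
import math


def em_density_mixture(edge_counts, n_dyads, init_densities, n_iter=50):
    """EM for a mixture of Bernoulli (Erdos-Renyi) network models.

    edge_counts[g] is the number of edges in network g, and every network
    has n_dyads possible edges. init_densities must lie strictly in (0, 1).
    Returns (weights, densities, responsibilities).
    """
    k = len(init_densities)
    weights = [1.0 / k] * k
    densities = list(init_densities)
    resp = []
    for _ in range(n_iter):
        # E-step: posterior cluster probabilities, computed in log space.
        resp = []
        for m in edge_counts:
            logs = [
                math.log(w) + m * math.log(p) + (n_dyads - m) * math.log(1 - p)
                for w, p in zip(weights, densities)
            ]
            top = max(logs)
            unnorm = [math.exp(v - top) for v in logs]
            total = sum(unnorm)
            resp.append([u / total for u in unnorm])
        # M-step: update mixing weights and per-cluster edge densities.
        for c in range(k):
            # Floor the cluster mass so empty clusters do not divide by zero.
            mass = max(sum(r[c] for r in resp), 1e-12)
            weights[c] = mass / len(edge_counts)
            edges = sum(r[c] * m for r, m in zip(resp, edge_counts))
            # Clamp the density to keep the logarithms finite.
            densities[c] = min(max(edges / (mass * n_dyads), 1e-9), 1 - 1e-9)
    return weights, densities, resp
```

The binomial coefficient is omitted from the E-step because it is the same for every cluster and cancels in the responsibilities. Replacing the single density per cluster with a logistic regression on dyad covariates would move this sketch closer to the generalized linear models the abstract describes.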