
    Mixed membership stochastic blockmodels

    Observations consisting of measurements on relationships for pairs of objects arise in many settings, such as protein interaction and gene regulatory networks, collections of author-recipient email, and social networks. Analyzing such data with probabilistic models can be delicate because the simple exchangeability assumptions underlying many boilerplate models no longer hold. In this paper, we describe a latent variable model of such data called the mixed membership stochastic blockmodel. This model extends blockmodels for relational data to ones that capture mixed membership latent relational structure, thus providing an object-specific low-dimensional representation. We develop a general variational inference algorithm for fast approximate posterior inference. We explore applications to social and protein interaction networks. Comment: 46 pages, 14 figures, 3 tables
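
    As a rough illustration of the model described in this abstract, the sketch below samples a directed graph from the generative process usually given for the mixed membership stochastic blockmodel: each node draws a membership vector from a Dirichlet prior, and for every ordered pair the sender and receiver each pick a community from their own membership vector before a Bernoulli edge is drawn. The abstract does not spell out these steps, so the process, the function name `sample_mmsb`, and its parameters are assumptions made for illustration. The variational inference algorithm is not shown.

```python
import numpy as np


def sample_mmsb(n_nodes, alpha, block_matrix, seed=0):
    """Sample memberships and a directed adjacency matrix (hypothetical helper).

    alpha        : Dirichlet concentration vector of length K (assumed prior).
    block_matrix : K x K matrix of edge probabilities between communities.
    """
    rng = np.random.default_rng(seed)
    alpha = np.asarray(alpha, dtype=float)
    B = np.asarray(block_matrix, dtype=float)
    k = alpha.shape[0]

    # One mixed-membership vector per node; each row sums to 1.
    memberships = rng.dirichlet(alpha, size=n_nodes)

    adjacency = np.zeros((n_nodes, n_nodes), dtype=int)
    for p in range(n_nodes):
        for q in range(n_nodes):
            if p == q:
                continue  # no self-loops in this sketch
            # Sender and receiver each choose a community for this pair only.
            z_send = rng.choice(k, p=memberships[p])
            z_recv = rng.choice(k, p=memberships[q])
            adjacency[p, q] = rng.binomial(1, B[z_send, z_recv])
    return memberships, adjacency
```

    Because communities are redrawn for every pair, a node can behave as a member of different blocks in different relationships, which is what distinguishes this model from a blockmodel with one fixed label per node.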

    Stochastic blockmodels and community structure in networks

    Stochastic blockmodels have been proposed as a tool for detecting community structure in networks as well as for generating synthetic networks for use as benchmarks. Most blockmodels, however, ignore variation in vertex degree, making them unsuitable for applications to real-world networks, which typically display broad degree distributions that can significantly distort the results. Here we demonstrate how the generalization of blockmodels to incorporate this missing element leads to an improved objective function for community detection in complex networks. We also propose a heuristic algorithm for community detection using this objective function or its non-degree-corrected counterpart and show that the degree-corrected version dramatically outperforms the uncorrected one in both real-world and synthetic networks. Comment: 11 pages, 3 figures
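
    The abstract does not state the objective function itself. The sketch below assumes the profile log-likelihood form commonly associated with this work: the sum over group pairs of m_rs * log(m_rs / (kappa_r * kappa_s)) for the degree-corrected model, with the group degree sums kappa_r replaced by group sizes n_r for the uncorrected one. Here m_rs counts edges between groups r and s (within-group edges counted twice). The function name `blockmodel_objective` is hypothetical, and the heuristic search over partitions is not shown.

```python
import numpy as np


def blockmodel_objective(adjacency, labels, degree_corrected=True):
    """Score a partition of an undirected graph (assumed objective, see lead-in)."""
    A = np.asarray(adjacency, dtype=float)
    labels = np.asarray(labels)
    groups = np.unique(labels)

    # One-hot group indicator matrix of shape (n_nodes, n_groups).
    onehot = (labels[:, None] == groups[None, :]).astype(float)

    # m[r, s] = edge count between groups r and s; for a symmetric A the
    # diagonal entries count each within-group edge twice.
    m = onehot.T @ A @ onehot

    if degree_corrected:
        scale = m.sum(axis=1)       # kappa_r: total degree of group r
    else:
        scale = onehot.sum(axis=0)  # n_r: number of nodes in group r

    denom = np.outer(scale, scale)
    mask = m > 0                    # terms with m_rs = 0 contribute nothing
    return float(np.sum(m[mask] * np.log(m[mask] / denom[mask])))
```

    A higher score indicates a better fit under the assumed objective, so candidate partitions can be compared by calling the function on each.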

    How Many Communities Are There?

    Stochastic blockmodels and variants thereof are among the most widely used approaches to community detection for social networks and relational data. A stochastic blockmodel partitions the nodes of a network into disjoint sets, called communities. The approach is inherently related to clustering with mixture models and raises a similar model selection problem for the number of communities. The Bayesian information criterion (BIC) is a popular solution; however, for stochastic blockmodels, the assumption that different edges are conditionally independent given the communities of their endpoints is usually violated in practice. In this regard, we propose composite likelihood BIC (CL-BIC) to select the number of communities, and we show it is robust against possible misspecifications in the underlying stochastic blockmodel assumptions. We derive the requisite methodology and illustrate the approach using both simulated and real data. Supplementary materials containing the relevant computer code are available online. Comment: 26 pages, 3 figures

    Model-based clustering for populations of networks

    Until recently, data on populations of networks were rarely available. However, with the advancement of automatic monitoring devices and the growing social and scientific interest in networks, such data have become more widely available. From sociological experiments involving cognitive social structures to fMRI scans revealing large-scale brain networks of groups of patients, there is a growing awareness that we urgently need tools to analyse populations of networks and particularly to model the variation between networks due to covariates. We propose a model-based clustering method based on mixtures of generalized linear (mixed) models that can be employed to describe the joint distribution of a population of networks in a parsimonious manner and to identify subpopulations of networks that share certain topological properties of interest (degree distribution, community structure, effect of covariates on the presence of an edge, etc.). Maximum likelihood estimation for the proposed model can be efficiently carried out with an implementation of the EM algorithm. We assess the performance of this method on simulated data and conclude with an example application on advice networks in a small business. Comment: The final (published) version of the article can be downloaded for free (Open Access) from the editor's website (click on the DOI link below)

    Consistency of adjacency spectral embedding for the mixed membership stochastic blockmodel

    The mixed membership stochastic blockmodel is a statistical model for a graph, which extends the stochastic blockmodel by allowing every node to randomly choose a different community each time a decision of whether to form an edge is made. Whereas spectral analysis for the stochastic blockmodel is increasingly well established, theory for the mixed membership case is considerably less developed. Here we show that adjacency spectral embedding into $\mathbb{R}^k$, followed by fitting the minimum volume enclosing convex $k$-polytope to the $k-1$ principal components, leads to a consistent estimate of a $k$-community mixed membership stochastic blockmodel. The key is to identify a direct correspondence between the mixed membership stochastic blockmodel and the random dot product graph, which greatly facilitates theoretical analysis. Specifically, a $2 \rightarrow \infty$ norm and central limit theorem for the random dot product graph are exploited to respectively show consistency and partially correct the bias of the procedure. Comment: 12 pages, 6 figures
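
    The first step named in this abstract, adjacency spectral embedding into $\mathbb{R}^k$, can be sketched as follows. The sketch assumes the usual definition: keep the $k$ eigenvalues of the symmetric adjacency matrix that are largest in magnitude and scale the matching eigenvectors by the square roots of their absolute values. The function name `adjacency_spectral_embedding` is hypothetical, and the subsequent fit of a minimum volume enclosing polytope is not shown.

```python
import numpy as np


def adjacency_spectral_embedding(adjacency, k):
    """Embed the nodes of an undirected graph into R^k (assumed definition)."""
    A = np.asarray(adjacency, dtype=float)

    # eigh is appropriate because A is assumed symmetric.
    eigvals, eigvecs = np.linalg.eigh(A)

    # Indices of the k eigenvalues that are largest in absolute value.
    top = np.argsort(np.abs(eigvals))[::-1][:k]

    # Scale each retained eigenvector by sqrt(|eigenvalue|).
    return eigvecs[:, top] * np.sqrt(np.abs(eigvals[top]))
```

    Each row of the returned array is the estimated latent position of one node. When the input is an exactly rank-$k$ positive semidefinite matrix, the inner products of these rows reproduce that matrix.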