Mixed membership stochastic blockmodels
Observations consisting of measurements on relationships for pairs of objects
arise in many settings, such as protein interaction and gene regulatory
networks, collections of author-recipient email, and social networks. Analyzing
such data with probabilistic models can be delicate because the simple
exchangeability assumptions underlying many boilerplate models no longer hold.
In this paper, we describe a latent variable model of such data called the
mixed membership stochastic blockmodel. This model extends blockmodels for
relational data to ones which capture mixed membership latent relational
structure, thus providing an object-specific low-dimensional representation. We
develop a general variational inference algorithm for fast approximate
posterior inference. We explore applications to social and protein interaction
networks.
Comment: 46 pages, 14 figures, 3 tables
Variational approximation for mixtures of linear mixed models
Mixtures of linear mixed models (MLMMs) are useful for clustering grouped
data and can be estimated by likelihood maximization through the EM algorithm.
The conventional approach to determining a suitable number of components is to
compare different mixture models using penalized log-likelihood criteria such
as BIC. We propose fitting MLMMs with variational methods, which can perform
parameter estimation and model selection simultaneously. A variational
approximation is described where the variational lower bound and parameter
updates are in closed form, allowing fast evaluation. A new variational greedy
algorithm is developed for model selection and learning of the mixture
components. This approach allows an automatic initialization of the algorithm
and returns a plausible number of mixture components automatically. In cases of
weak identifiability of certain model parameters, we use hierarchical centering
to reparametrize the model and show empirically that there is a gain in
efficiency by variational algorithms similar to that in MCMC algorithms.
Related to this, we prove that the approximate rate of convergence of
variational algorithms by Gaussian approximation is equal to that of the
corresponding Gibbs sampler, which suggests that reparametrizations can lead to
improved convergence in variational algorithms as well.
Comment: 36 pages, 5 figures, 2 tables, submitted to JCG
Statistical Indicators of Collective Behavior and Functional Clusters in Gene Networks of Yeast
We analyze gene expression time-series data of yeast S. cerevisiae measured
along two full cell-cycles. We quantify these data by using q-exponentials,
gene expression ranking and a temporal mean-variance analysis. We construct
gene interaction networks based on correlation coefficients and study the
formation of the corresponding giant components and minimum spanning trees. By
coloring genes according to their cell function, we find functional clusters in
the correlation networks and functional branches in the associated trees. Our
results suggest that a percolation point of functional clusters can be
identified on these gene expression correlation networks.
Comment: 8 pages, 4 figures
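The network construction described in this abstract can be illustrated with a minimal sketch. It is not the authors' code. It assumes that two genes are linked when the absolute Pearson correlation of their expression time series reaches a chosen threshold, and it reports the size of the largest connected component. The helper names `pearson` and `giant_component_size`, the use of the absolute value, the threshold values, and the toy series are all hypothetical choices made for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y) if sd_x > 0 and sd_y > 0 else 0.0


def giant_component_size(series, threshold):
    """Size of the largest connected component of the thresholded correlation network.

    Assumption: nodes i and j are linked when |r_ij| >= threshold.
    """
    n = len(series)
    parent = list(range(n))  # union-find forest over the nodes

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if abs(pearson(series[i], series[j])) >= threshold:
                parent[find(i)] = find(j)

    sizes = {}
    for i in range(n):
        root = find(i)
        sizes[root] = sizes.get(root, 0) + 1
    return max(sizes.values())


# Hypothetical toy "expression" series, one list per gene.
toy = [
    [1, 2, 3, 4, 5],
    [2, 4, 6, 8, 10],
    [5, 4, 3, 2, 1],
    [1, -1, 1, -1, 1],
    [3, 1, 4, 1, 5],
]

# Lowering the threshold adds links, so the largest component can only grow.
for t in (0.99, 0.9, 0.3):
    print(t, giant_component_size(toy, t))
```

Sweeping the threshold over a fine grid and plotting the component size against it is one simple way to look for an abrupt, percolation-like transition of the kind the abstract mentions.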