Mixed membership models for rank data: Investigating structure in Irish voting data
A mixed membership model is an individual-level mixture model in which individuals have partial membership in the profiles (or groups) that characterize a population. A mixed membership model for rank data is outlined and illustrated through the analysis of voting in the 2002 Irish general election. This election used a voting system called proportional representation by means of a single transferable vote (PR-STV), in which voters rank some or all of the candidates in order of preference. The data set considered consists of all votes in a constituency from the 2002 Irish general election. Interest lies in highlighting distinct voting profiles within the electorate and studying how voters affiliate themselves with these voting profiles. The mixed membership model for rank data is fitted to the voting data and is shown to give a concise and highly interpretable explanation of voting patterns in this election.
Statistical Inference for Structured High-dimensional Models
High-dimensional statistical inference is a newly emerged direction of statistical science in the 21st century. Its importance is due to the increasing dimensionality and complexity of the models needed to process and understand modern real-world data. The main idea that makes meaningful inference about such models possible is to assume a suitable lower-dimensional underlying structure, or low-dimensional approximations, for which the error can be reasonably controlled. Several types of such structures have recently been introduced, including sparse high-dimensional regression, sparse and/or low-rank matrix models, matrix completion models, dictionary learning, network models (stochastic block model, mixed membership models) and more. The workshop focused on recent developments in structured sequence and regression models, matrix and tensor estimation, robustness, statistical learning in complex settings, network data, and topic models.
Hierarchical relational models for document networks
We develop the relational topic model (RTM), a hierarchical model of both
network structure and node attributes. We focus on document networks, where the
attributes of each document are its words, that is, discrete observations taken
from a fixed vocabulary. For each pair of documents, the RTM models their link
as a binary random variable that is conditioned on their contents. The model
can be used to summarize a network of documents, predict links between them,
and predict words within them. We derive efficient inference and estimation
algorithms based on variational methods that take advantage of sparsity and
scale with the number of links. We evaluate the predictive performance of the
RTM for large networks of scientific abstracts, web documents, and
geographically tagged news. Comment: Published at http://dx.doi.org/10.1214/09-AOAS309 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org).
Consistent Estimation of Mixed Memberships with Successive Projections
This paper considers the parameter estimation problem in the Mixed Membership
Stochastic Block Model (MMSB), a quite general random graph model that allows
for overlapping community structure. We present a new algorithm, successive
projection overlapping clustering (SPOC), which combines the ideas of spectral
clustering with a geometric approach to separable non-negative matrix
factorization. The proposed algorithm is provably
consistent under MMSB with general conditions on the parameters of the model.
SPOC is also shown to perform well experimentally in comparison to other
algorithms.
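The "successive projections" ingredient named in the abstract above can be illustrated with a minimal sketch of the generic successive projection algorithm (SPA) used in separable non-negative matrix factorization. This is not the SPOC procedure itself: the function name `successive_projection` and the toy input are assumptions made for illustration, and the spectral-embedding step that would produce the rows in practice is omitted.

```python
def successive_projection(rows, k):
    """Return indices of k rows that act as vertices of the point cloud.

    Generic SPA sketch (hypothetical helper, not the paper's code):
    repeatedly pick the row with the largest norm, then project every
    row onto the orthogonal complement of the picked row.
    """
    residual = [list(r) for r in rows]
    selected = []
    for _ in range(k):
        norms = [sum(x * x for x in r) for r in residual]
        j = max(range(len(residual)), key=norms.__getitem__)
        if norms[j] == 0.0:
            break  # nothing left to select
        selected.append(j)
        u = list(residual[j])
        u_norm_sq = norms[j]
        # Remove the component along u from every row.
        projected = []
        for r in residual:
            coeff = sum(a * b for a, b in zip(r, u)) / u_norm_sq
            projected.append([a - coeff * b for a, b in zip(r, u)])
        residual = projected
    return selected


# Toy input: rows 1 and 3 are "pure" vertices, rows 0 and 2 are mixtures of them.
points = [(1.0, 0.5), (2.0, 0.0), (0.6, 0.7), (0.0, 1.0)]
pure_rows = successive_projection(points, 2)
```

When every row is a convex combination of k linearly independent vertices, the largest-norm row at each step is a vertex, which is why the selected indices correspond to the pure rows in the toy input.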
Consistency of adjacency spectral embedding for the mixed membership stochastic blockmodel
The mixed membership stochastic blockmodel is a statistical model for a
graph, which extends the stochastic blockmodel by allowing every node to
randomly choose a different community each time a decision of whether to form
an edge is made. Whereas spectral analysis for the stochastic blockmodel is
increasingly well established, theory for the mixed membership case is
considerably less developed. Here we show that adjacency spectral embedding
into $\mathbb{R}^k$, followed by fitting the minimum volume enclosing convex
$k$-polytope to the principal components, leads to a consistent estimate
of a $k$-community mixed membership stochastic blockmodel. The key is to
identify a direct correspondence between the mixed membership stochastic
blockmodel and the random dot product graph, which greatly facilitates
theoretical analysis. Specifically, a $2 \to \infty$ norm and central
limit theorem for the random dot product graph are exploited to respectively
show consistency and partially correct the bias of the procedure. Comment: 12 pages, 6 figures
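The generative mechanism described in the abstract above, in which each node draws a community afresh for every potential edge, can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' code: the names `sample_mmsb`, `alpha` and `B` are hypothetical, memberships are assumed to be Dirichlet distributed, and the graph is assumed to be undirected with no self-loops.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one membership vector from Dirichlet(alpha) via normalized gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def sample_mmsb(n, alpha, B, seed=0):
    """Sample memberships and an adjacency matrix from a simple MMSB sketch.

    n     : number of nodes
    alpha : Dirichlet parameters, one per community (assumed prior)
    B     : k x k matrix of edge probabilities between communities
    """
    rng = random.Random(seed)
    k = len(alpha)
    memberships = [sample_dirichlet(alpha, rng) for _ in range(n)]
    adjacency = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Each node picks a community anew for this particular pair.
            zi = rng.choices(range(k), weights=memberships[i])[0]
            zj = rng.choices(range(k), weights=memberships[j])[0]
            edge = 1 if rng.random() < B[zi][zj] else 0
            adjacency[i][j] = adjacency[j][i] = edge
    return memberships, adjacency


memberships, adjacency = sample_mmsb(
    6, [1.0, 1.0], [[0.9, 0.1], [0.1, 0.8]], seed=1
)
```

Averaging over the per-pair community draws gives an edge probability equal to the bilinear form of the two membership vectors with `B`, which is the kind of structure that a correspondence with the random dot product graph relies on.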