15 research outputs found
Computationally efficient inference for latent position network models
Latent position models are widely used for the analysis of networks in a
variety of research fields. In fact, these models possess a number of desirable
theoretical properties, and are particularly easy to interpret. However,
statistical methodologies to fit these models generally incur a computational
cost which grows with the square of the number of nodes in the graph. This
makes the analysis of large social networks impractical. In this paper, we
propose a new method characterised by a linear computational complexity, which
can be used to fit latent position models on networks of several tens of
thousands of nodes. Our approach relies on an approximation of the likelihood
function, where the amount of noise introduced by the approximation can be
arbitrarily reduced at the expense of computational efficiency. We establish
several theoretical results that show how the likelihood error propagates to
the invariant distribution of the Markov chain Monte Carlo sampler. In
particular, we demonstrate that one can achieve a substantial reduction in
computing time and still obtain a good estimate of the latent structure.
Finally, we present applications of our method to simulated networks and to a
large coauthorship network, highlighting the usefulness of our approach.
Comment: 39 pages, 10 figures, 1 table
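To illustrate the quadratic cost mentioned in the abstract above, here is a minimal sketch of an exact likelihood evaluation. It assumes the standard logistic latent distance model, logit P(y_ij = 1) = alpha - ||z_i - z_j||, which the abstract does not spell out. The function name and the parameter `alpha` are hypothetical, and the paper's linear-time approximation is not reproduced here.

```python
import math

def log_likelihood(adj, positions, alpha):
    """Exact log-likelihood of a logistic latent distance model.

    Assumption (not stated in the abstract): each undirected edge y_ij is
    Bernoulli with logit P(y_ij = 1) = alpha - ||z_i - z_j||.
    Returns the log-likelihood and the number of node pairs visited.
    """
    n = len(positions)
    total = 0.0
    pairs_visited = 0
    for i in range(n):
        for j in range(i + 1, n):  # every unordered pair: n(n-1)/2 terms
            eta = alpha - math.dist(positions[i], positions[j])
            # Bernoulli log-likelihood in logit form: y*eta - log(1 + exp(eta))
            total += adj[i][j] * eta - math.log1p(math.exp(eta))
            pairs_visited += 1
    return total, pairs_visited

# Toy network: four nodes on the corners of a unit square.
positions = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
adj = [[0, 1, 1, 0],
       [1, 0, 0, 1],
       [1, 0, 0, 1],
       [0, 1, 1, 0]]
ll, pairs = log_likelihood(adj, positions, alpha=1.0)
```

The pair counter makes the scaling visible: doubling the number of nodes roughly quadruples the number of terms, which is what an approximate likelihood would need to avoid.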
Continuous Latent Position Models for Instantaneous Interactions
We create a framework to analyse the timing and frequency of instantaneous
interactions between pairs of entities. This type of interaction data is
especially common nowadays, and easily available. Examples of instantaneous
interactions include email networks, phone call networks and some common types
of technological and transportation networks. Our framework relies on a novel
extension of the latent position network model: we assume that the entities are
embedded in a latent Euclidean space, and that they move along individual
trajectories which are continuous over time. These trajectories are used to
characterize the timing and frequency of the pairwise interactions. We discuss
an inferential framework where we estimate the individual trajectories from the
observed interaction data, and propose applications on artificial and real
data.
Comment: 33 pages
A mixture of experts model for rank data with applications in election studies
A voting bloc is defined to be a group of voters who have similar voting
preferences. The cleavage of the Irish electorate into voting blocs is of
interest. Irish elections employ a "single transferable vote" electoral
system; under this system voters rank some or all of the electoral candidates
in order of preference. These rank votes provide a rich source of preference
information from which inferences about the composition of the electorate may
be drawn. Additionally, the influence of social factors or covariates on the
electorate composition is of interest. A mixture of experts model is a mixture
model in which the model parameters are functions of covariates. A mixture of
experts model for rank data is developed to provide a model-based method to
cluster Irish voters into voting blocs, to examine the influence of social
factors on this clustering and to examine the characteristic preferences of the
voting blocs. The Benter model for rank data is employed as the family of
component densities within the mixture of experts model; generalized linear
model theory is employed to model the influence of covariates on the mixing
proportions. Model fitting is achieved via a hybrid of the EM and MM
algorithms. An example of the methodology is illustrated by examining an Irish
presidential election. The existence of voting blocs in the electorate is
established and it is determined that age and government satisfaction levels
are important factors in influencing voting in this election.
Comment: Published at http://dx.doi.org/10.1214/08-AOAS178 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
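The abstract above says that covariates enter through the mixing proportions. A minimal sketch of that idea follows, assuming a multinomial-logit link (one common generalized linear model choice; the abstract does not name the link). The function name, the covariate layout, and the coefficient values are all hypothetical, and the Benter component densities are not implemented.

```python
import math

def mixing_proportions(covariates, coefficients):
    """Covariate-dependent mixing proportions for one voter.

    Assumption: a multinomial-logit link, so component k has weight
    proportional to exp(beta_k . x). coefficients[k] is beta_k; the first
    component is the baseline with all-zero coefficients.
    """
    scores = [sum(b * x for b, x in zip(beta, covariates))
              for beta in coefficients]
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical voter: intercept term, standardised age, satisfaction score.
voter = [1.0, 0.4, -1.2]
betas = [[0.0, 0.0, 0.0],   # baseline voting bloc
         [0.5, 1.0, 0.0],   # bloc whose weight increases with age
         [-0.2, 0.0, 0.8]]  # bloc whose weight increases with satisfaction
proportions = mixing_proportions(voter, betas)
```

Each voter gets a separate vector of bloc probabilities that sums to one, so changing a covariate shifts the bloc the voter is most likely to belong to.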
Active Ranking using Pairwise Comparisons
This paper examines the problem of ranking a collection of objects using
pairwise comparisons (rankings of two objects). In general, the ranking of $n$
objects can be identified by standard sorting methods using $n \log_2 n$
pairwise comparisons. We are interested in natural situations in which
relationships among the objects may allow for ranking using far fewer pairwise
comparisons. Specifically, we assume that the objects can be embedded into a
$d$-dimensional Euclidean space and that the rankings reflect their relative
distances from a common reference point in $R^d$. We show that under this
assumption the number of possible rankings grows like $n^{2d}$ and demonstrate
an algorithm that can identify a randomly selected ranking using just slightly
more than $d \log n$ adaptively selected pairwise comparisons, on average. If
instead the comparisons are chosen at random, then almost all pairwise
comparisons must be made in order to identify any ranking. In addition, we
propose a robust, error-tolerant algorithm that only requires that the pairwise
comparisons are probably correct. Experimental studies with synthetic and real
datasets support the conclusions of our theoretical analysis.
Comment: 17 pages, an extended version of our NIPS 2011 paper. The new version
revises the argument of the robust section and slightly modifies the result
there to give it more impact
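The setting described in the abstract above can be sketched as follows: objects are points in a Euclidean space, and each pairwise comparison reveals which of two objects is closer to a reference point. This sketch uses the standard-sorting baseline and counts the comparison queries it issues. It is not the paper's adaptive algorithm, and the function and variable names are hypothetical.

```python
import math
from functools import cmp_to_key

def rank_by_comparisons(points, reference):
    """Rank object indices from nearest to farthest from `reference`.

    The sort only sees the outcome of pairwise comparison queries.
    Returns (ranking, number_of_queries).
    """
    queries = 0

    def compare(i, j):
        nonlocal queries
        queries += 1  # one pairwise comparison is issued per call
        di = math.dist(points[i], reference)
        dj = math.dist(points[j], reference)
        return (di > dj) - (di < dj)  # -1, 0, or 1

    ranking = sorted(range(len(points)), key=cmp_to_key(compare))
    return ranking, queries

# Four objects on a line, with the reference point at the origin.
points = [(3.0, 0.0), (1.0, 0.0), (2.0, 0.0), (0.5, 0.0)]
ranking, queries = rank_by_comparisons(points, reference=(0.0, 0.0))
```

The query counter is the quantity the abstract is concerned with: an adaptive scheme that exploits the embedding aims to recover the same ranking with far fewer queries than generic sorting.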
Model-based clustering for multivariate partial ranking data
This paper proposes the first model-based clustering algorithm dedicated to multivariate partial ranking data. It extends the Insertion Sorting Rank (isr) model for ranking data, a meaningful and effective model obtained by treating the ranking generating process as a sorting algorithm. The heterogeneity of the rank population is modelled by a mixture of isr models, whereas a conditional independence assumption allows the extension to multivariate rankings. Maximum likelihood estimation is performed through a SEM-Gibbs algorithm, and partial rankings are treated as missing data, which allows them to be simulated during the estimation process. After the estimation algorithm is validated on simulations, three real datasets are studied: the 1980 American Psychological Association (APA) presidential election votes, the results of French students on a general knowledge test, and the votes of European countries in the Eurovision song contest. For each application, the proposed model fits the data well and leads to meaningful interpretations. In particular, the Eurovision analysis exhibits regional alliances between European countries, which are often suspected but have never been proved.