Top-Rank Enhanced Listwise Optimization for Statistical Machine Translation
Pairwise ranking methods are the basis of many widely used discriminative
training approaches for structure prediction problems in natural language
processing (NLP). Decomposing the problem of ranking hypotheses into pairwise
comparisons enables simple and efficient solutions. However, neglecting the
global ordering of the hypothesis list may hinder learning. We propose a
listwise learning framework for structure prediction problems such as machine
translation. Our framework directly models the entire translation list's
ordering to learn parameters which may better fit the given listwise samples.
Furthermore, we propose top-rank enhanced loss functions, which are more
sensitive to ranking errors at higher positions. Experiments on a large-scale
Chinese-English translation task show that both our listwise learning framework
and top-rank enhanced listwise losses lead to significant improvements in
translation quality. Comment: Accepted to CoNLL 201
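The paper's top-rank enhanced listwise loss is not reproduced in this abstract; a minimal sketch of the general idea is a position-weighted ListMLE-style loss, where the Plackett-Luce log-likelihood of the reference ordering is summed position by position and early positions receive larger weights. The function name and the choice of decaying weights here are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def top_rank_listmle_loss(scores, weights=None):
    """Position-weighted ListMLE-style loss (illustrative sketch).

    `scores` holds model scores for hypotheses already sorted by the
    reference ordering (best first). The Plackett-Luce log-likelihood
    of that ordering decomposes over positions; weighting early
    positions more heavily makes the loss more sensitive to ranking
    errors at the top of the list.
    """
    scores = np.asarray(scores, dtype=float)
    n = len(scores)
    if weights is None:
        weights = 1.0 / (np.arange(n) + 1.0)  # assumed decaying position weights
    loss = 0.0
    for k in range(n):
        # -log P(item k is chosen first among the remaining items k..n-1)
        log_denom = np.log(np.sum(np.exp(scores[k:])))
        loss += weights[k] * (log_denom - scores[k])
    return loss
```

A correctly ordered list (scores decreasing with the reference ranking) yields a smaller loss than a reversed one, which is the property the training objective exploits.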
Bayesian nonparametric models for ranked data
We develop a Bayesian nonparametric extension of the popular Plackett-Luce
choice model that can handle an infinite number of choice items. Our framework
is based on the theory of random atomic measures, with the prior specified by a
gamma process. We derive a posterior characterization and a simple and
effective Gibbs sampler for posterior simulation. We develop a time-varying
extension of our model, and apply it to the New York Times lists of weekly
bestselling books. Comment: NIPS - Neural Information Processing Systems (2012)
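The finite Plackett-Luce model that this nonparametric extension generalizes can be sketched as sequential sampling without replacement: items are drawn one at a time with probability proportional to their positive weights among the items not yet ranked. The function name below is an illustrative assumption.

```python
import numpy as np

def sample_plackett_luce(weights, rng):
    """Draw one complete ranking from a finite Plackett-Luce model.

    At each step an item is chosen with probability proportional to
    its weight among the items not yet ranked, then removed from the
    candidate pool.
    """
    weights = np.asarray(weights, dtype=float)
    remaining = list(range(len(weights)))
    ranking = []
    while remaining:
        w = weights[remaining]
        idx = rng.choice(len(remaining), p=w / w.sum())
        ranking.append(remaining.pop(idx))
    return ranking
```

The Bayesian nonparametric version replaces this finite weight vector with the atoms of a gamma process, allowing the pool of choice items to be unbounded.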
Minimax-optimal Inference from Partial Rankings
This paper studies the problem of inferring a global preference based on the
partial rankings provided by many users over different subsets of items
according to the Plackett-Luce model. A question of particular interest is how
to optimally assign items to users for ranking and how many item assignments
are needed to achieve a target estimation error. For a given assignment of
items to users, we first derive an oracle lower bound of the estimation error
that holds even for the more general Thurstone models. Then we show that the
Cram\'er-Rao lower bound and our upper bounds inversely depend on the spectral
gap of the Laplacian of an appropriately defined comparison graph. When the
system is allowed to choose the item assignment, we propose a random assignment
scheme. Our oracle lower bound and upper bounds imply that it is
minimax-optimal up to a logarithmic factor among all assignment schemes and the
lower bound can be achieved by the maximum likelihood estimator as well as
popular rank-breaking schemes that decompose partial rankings into pairwise
comparisons. The numerical experiments corroborate our theoretical findings. Comment: 16 pages, 2 figures
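The rank-breaking schemes mentioned above decompose each partial ranking into its induced pairwise comparisons, which can then be fed to a pairwise estimator. A minimal sketch of full breaking, with an assumed function name:

```python
from itertools import combinations

def full_breaking(ranking):
    """Full rank-breaking: decompose a partial ranking over a subset
    of items into all induced (winner, loser) pairs.

    `ranking` lists item ids from most to least preferred; every item
    beats every item ranked below it.
    """
    return [(ranking[i], ranking[j])
            for i, j in combinations(range(len(ranking)), 2)]
```

A ranking over m items yields m(m-1)/2 pairs, so a single partial ranking can contribute many pairwise observations; the paper's point is that estimators built on such pairs can still attain the minimax rate up to a logarithmic factor.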
Efficient Bayesian Inference for Generalized Bradley-Terry Models
The Bradley-Terry model is a popular approach to describe probabilities of
the possible outcomes when elements of a set are repeatedly compared with one
another in pairs. It has found many applications including animal behaviour,
chess ranking and multiclass classification. Numerous extensions of the basic
model have also been proposed in the literature including models with ties,
multiple comparisons, group comparisons and random graphs. From a computational
point of view, Hunter (2004) has proposed efficient iterative MM
(minorization-maximization) algorithms to perform maximum likelihood estimation
for these generalized Bradley-Terry models whereas Bayesian inference is
typically performed using MCMC (Markov chain Monte Carlo) algorithms based on
tailored Metropolis-Hastings (M-H) proposals. We show here that these MM
algorithms can be reinterpreted as special instances of
Expectation-Maximization (EM) algorithms associated with suitable sets of latent
variables and propose some original extensions. These latent variables allow us
to derive simple Gibbs samplers for Bayesian inference. We demonstrate
experimentally the efficiency of these algorithms on a variety of applications.
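For the basic Bradley-Terry model, Hunter's MM iteration has a simple closed form: each skill parameter is updated as the item's win count divided by a weighted sum over its opponents, then the skills are renormalized. The sketch below implements that standard update (the function name and iteration count are assumptions for illustration).

```python
import numpy as np

def bradley_terry_mm(wins, n_iter=200):
    """MM iterations for Bradley-Terry maximum likelihood (Hunter, 2004).

    `wins[i, j]` counts how often item i beat item j. Each sweep applies
        lambda_i <- W_i / sum_{j != i} n_ij / (lambda_i + lambda_j)
    where W_i is item i's total wins and n_ij the number of comparisons
    between i and j; this monotonically increases the likelihood.
    Skills are normalized to sum to one after each sweep.
    """
    n = wins.shape[0]
    n_ij = wins + wins.T          # total comparisons per pair
    W = wins.sum(axis=1)          # total wins per item
    lam = np.ones(n)
    for _ in range(n_iter):
        denom = np.zeros(n)
        for i in range(n):
            for j in range(n):
                if i != j and n_ij[i, j] > 0:
                    denom[i] += n_ij[i, j] / (lam[i] + lam[j])
        lam = W / denom
        lam /= lam.sum()
    return lam
```

The paper's contribution is to recast exactly this kind of update as an EM algorithm with latent variables, from which Gibbs samplers for the Bayesian posterior follow directly.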
Random Utility Theory for Social Choice
Random utility theory models an agent's preferences on alternatives by drawing a real-valued score for each alternative (typically independently) from a parameterized distribution, and then ranking the alternatives according to the scores. A special case that has received significant attention is the Plackett-Luce model, for which fast inference methods for maximum likelihood estimators are available. This paper develops conditions on general random utility models that enable fast inference within a Bayesian framework through MC-EM, providing concave log-likelihood functions and bounded sets of global maxima solutions. Results on both real-world and simulated data support the scalability of the approach and its capability for model selection among general random utility models, including Plackett-Luce.
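The random utility view described above can be made concrete with the classical Gumbel construction: adding independent Gumbel(0, 1) noise to each item's mean utility and sorting by the noisy scores reproduces the Plackett-Luce model with weights exp(mean utility). The function name is an assumption; the Gumbel-to-Plackett-Luce equivalence is a standard result.

```python
import numpy as np

def sample_ranking_random_utility(log_weights, rng):
    """Random utility sampler for the Plackett-Luce special case.

    Each item's score is its mean utility (log weight) plus independent
    Gumbel(0, 1) noise; ranking items by score yields a draw from the
    Plackett-Luce model with weights exp(log_weights).
    """
    scores = np.asarray(log_weights, dtype=float) + rng.gumbel(size=len(log_weights))
    return list(np.argsort(-scores))  # most preferred first
```

Other choices of noise distribution give other random utility models (e.g. Gaussian noise yields a Thurstonian model), which is the general family the paper's MC-EM framework targets.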
Multi-Level Spatial Comparative Judgement Models To Map Deprivation
While current comparative judgement models provide strong algorithmic efficiency, they remain data inefficient, often requiring days or weeks of extensive data collection to provide sufficient pairwise comparisons for stable and accurate parameter estimation. This disparity between data and algorithm efficiency is preventing widespread adoption, especially in challenging data-collection environments such as mapping human rights abuses. We address the data-inefficiency challenge by introducing the finite element Gaussian process Bradley–Terry mixture model, an approach that significantly reduces the number of pairwise comparisons required by comparative judgement models. This is achieved via the integration of prior spatial assumptions, encoded as a mixture of functions, each introducing a spatial smoothness constraint at a specific resolution. These functions are modelled nonparametrically, through Gaussian process prior distributions. We use our method to map deprivation in the city of Dar es Salaam, Tanzania, and to locate slums in the city where poverty reduction measures can be carried out.
Inferring from an imprecise Plackett–Luce model : application to label ranking
Learning ranking models is a difficult task, in which data may be scarce and cautious predictions desirable. To address such issues, we explore the extension of the popular parametric probabilistic Plackett–Luce model, often used to model rankings, to the imprecise setting where estimated parameters are set-valued. In particular, we study how to achieve cautious or conservative inference with it, and illustrate its application on label ranking problems, a specific supervised learning task.