Comparison of Resampling Schemes for Particle Filtering
This contribution is devoted to the comparison of various resampling
approaches that have been proposed in the literature on particle filtering. It
is first shown using simple arguments that the so-called residual and
stratified methods do yield an improvement over the basic multinomial
resampling approach. A simple counter-example showing that this property does
not hold true for systematic resampling is given. Finally, some results on the
large-sample behavior of the simple bootstrap filter algorithm are given. In
particular, a central limit theorem is established for the case where
resampling is performed using the residual approach.
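For concreteness, the following is a minimal NumPy sketch of the four resampling schemes compared above, following their textbook definitions; function names and the small demo at the end are illustrative.

```python
import numpy as np

def multinomial_resample(weights, rng):
    # N i.i.d. ancestor draws from the normalised weight distribution.
    n = len(weights)
    return rng.choice(n, size=n, p=weights)

def stratified_resample(weights, rng):
    # One uniform draw per stratum [k/N, (k+1)/N).
    n = len(weights)
    u = (np.arange(n) + rng.random(n)) / n
    # minimum() guards against floating-point round-off in the cumsum.
    return np.minimum(np.searchsorted(np.cumsum(weights), u), n - 1)

def systematic_resample(weights, rng):
    # A single uniform draw, shifted across all N strata.
    n = len(weights)
    u = (np.arange(n) + rng.random()) / n
    return np.minimum(np.searchsorted(np.cumsum(weights), u), n - 1)

def residual_resample(weights, rng):
    # Keep floor(N * w_i) copies deterministically, then draw the
    # remaining particles multinomially from the residual weights.
    n = len(weights)
    counts = np.floor(n * weights).astype(int)
    idx = np.repeat(np.arange(n), counts)
    n_rest = n - counts.sum()
    if n_rest > 0:
        residual = n * weights - counts
        residual /= residual.sum()
        idx = np.concatenate([idx, rng.choice(n, size=n_rest, p=residual)])
    return idx

rng = np.random.default_rng(0)
w = rng.random(8)
w /= w.sum()
for f in (multinomial_resample, stratified_resample,
          systematic_resample, residual_resample):
    print(f.__name__, np.sort(f(w, rng)))
```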
Stochastic Bandit Models for Delayed Conversions
Online advertising and product recommendation are important domains of
applications for multi-armed bandit methods. In these fields, the reward that
is immediately available is most often only a proxy for the actual outcome of
interest, which we refer to as a conversion. For instance, in web advertising,
clicks can be observed within a few seconds after an ad display but the
corresponding sale --if any-- will take hours, if not days to happen. This
paper proposes and investigates a new stochastic multi-armed bandit model in
the framework proposed by Chapelle (2014) --based on empirical studies in the
field of web advertising-- in which each action may trigger a future reward
that will then happen with a stochastic delay. We assume that the probability
of conversion associated with each action is unknown while the distribution of
the conversion delay is known, distinguishing between the (idealized) case
where the conversion events may be observed whatever their delay and the more
realistic setting in which late conversions are censored. We provide
performance lower bounds as well as two simple but efficient algorithms based
on the UCB and KLUCB frameworks. The latter algorithm, which is preferable when
conversion rates are low, is based on a Poissonization argument, of independent
interest in other settings where aggregation of Bernoulli observations with
different success probabilities is required.
Comment: Conference on Uncertainty in Artificial Intelligence, Aug 2017, Sydney, Australia.
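To make the censored setting concrete, here is a minimal sketch of a UCB-style index for delayed conversions: each past play is discounted by the known delay CDF tau, giving an effective sample size for the conversion-rate estimate. The index, the geometric delay, and all constants are illustrative assumptions, not the paper's exact algorithms.

```python
import math
import random

def tau_geometric(d, p=0.3):
    # CDF of a geometric conversion delay with parameter p (assumed known).
    return 1.0 - (1.0 - p) ** max(d, 0)

class DelayedUCB:
    def __init__(self, n_arms, tau):
        self.tau = tau
        self.play_times = [[] for _ in range(n_arms)]
        self.observed = [0] * n_arms    # conversions seen so far, per arm

    def select(self, t):
        best, best_index = 0, -float("inf")
        for a, times in enumerate(self.play_times):
            # A play at time s contributes tau(t - s): the probability
            # that its conversion, if any, is already visible at time t.
            n_eff = sum(self.tau(t - s) for s in times)
            if n_eff == 0.0:
                return a                # initialisation: try each arm once
            index = self.observed[a] / n_eff + math.sqrt(
                2.0 * math.log(max(t, 2)) / n_eff)
            if index > best_index:
                best, best_index = a, index
        return best

random.seed(1)
theta = [0.10, 0.05]                    # unknown conversion probabilities
agent = DelayedUCB(len(theta), tau_geometric)
pending = []                            # (conversion arrival time, arm)
for t in range(1, 2001):
    for arrival, a in [c for c in pending if c[0] <= t]:
        agent.observed[a] += 1          # late conversion finally observed
    pending = [c for c in pending if c[0] > t]
    arm = agent.select(t)
    agent.play_times[arm].append(t)
    if random.random() < theta[arm]:    # a conversion will eventually occur
        delay = 1
        while random.random() > 0.3:    # geometric delay, matching tau
            delay += 1
        pending.append((t + delay, arm))
print("plays per arm:", [len(p) for p in agent.play_times])
```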
Multiple-Play Bandits in the Position-Based Model
Sequentially learning to place items in multi-position displays or lists is a
task that can be cast into the multiple-play semi-bandit setting. However, a
major concern in this context is when the system cannot decide whether the user
feedback for each item is actually exploitable. Indeed, much of the content may
have been simply ignored by the user. The present work proposes to exploit
available information regarding the display position bias under the so-called
Position-based click model (PBM). We first discuss how this model differs from
the Cascade model and its variants considered in several recent works on
multiple-play bandits. We then provide a novel regret lower bound for this
model as well as computationally efficient algorithms that display good
empirical and theoretical performance.
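As a concrete illustration of the PBM, the sketch below simulates clicks as examination times attraction, with assumed-known examination probabilities kappa per position, and shows that dividing click counts by accumulated kappa exposure debiases the position effect; all numbers are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
kappa = np.array([1.0, 0.6, 0.3])           # examination prob per position
theta = np.array([0.20, 0.15, 0.10, 0.05])  # item attractiveness (unknown)

clicks = np.zeros(len(theta))
exposure = np.zeros(len(theta))             # accumulated kappa mass per item
for _ in range(20000):
    slate = rng.choice(len(theta), size=len(kappa), replace=False)
    for pos, item in enumerate(slate):
        exposure[item] += kappa[pos]
        # PBM: a click occurs iff the position is examined AND the item
        # attracts the user, independently.
        if rng.random() < kappa[pos] * theta[item]:
            clicks[item] += 1

# Dividing clicks by accumulated examination mass recovers theta
# (up to Monte Carlo noise), which is what makes feedback exploitable.
print(np.round(clicks / exposure, 3))
```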
On the equivalence between standard and sequentially ordered hidden Markov models
Chopin (2007) introduced a sequentially ordered hidden Markov model, for
which states are ordered according to their order of appearance, and claimed
that such a model is a re-parametrisation of a standard Markov model. This note
gives a formal proof that this equivalence holds in Bayesian terms, as both
formulations generate equivalent posterior distributions, but does not hold in
Frequentist terms, as both formulations generate incompatible likelihood
functions. Perhaps surprisingly, this shows that Bayesian re-parametrisation
and Frequentist re-parametrisation are not identical concepts.
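A minimal sketch of the re-parametrisation at issue: relabelling a hidden state sequence by order of first appearance, so that any two sequences differing only by a permutation of state labels map to the same ordered representative.

```python
def sequentially_order(states):
    # Relabel states by order of first appearance.
    relabel = {}
    out = []
    for s in states:
        if s not in relabel:
            relabel[s] = len(relabel)   # next unused label
        out.append(relabel[s])
    return out

print(sequentially_order([2, 2, 0, 2, 1]))  # [0, 0, 1, 0, 2]
print(sequentially_order([1, 1, 2, 1, 0]))  # [0, 0, 1, 0, 2] (same orbit)
```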
Sequential Monte Carlo smoothing with application to parameter estimation in non-linear state space models
This paper concerns the use of sequential Monte Carlo methods (SMC) for
smoothing in general state space models. A well-known problem when applying the
standard SMC technique in the smoothing mode is that the resampling mechanism
introduces degeneracy of the approximation in the path space. However, when
performing maximum likelihood estimation via the EM algorithm, all functionals
involved are of additive form for a large subclass of models. To cope with the
problem in this case, a modification of the standard method (based on a
technique proposed by Kitagawa and Sato) is suggested. Our algorithm relies on
forgetting properties of the filtering dynamics and the quality of the
estimates produced is investigated, both theoretically and via simulations.
Comment: Published at http://dx.doi.org/10.3150/07-BEJ6150 in Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm).
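The sketch below illustrates, on an assumed linear-Gaussian model, a bootstrap particle filter estimating the additive functional sum_t E[x_t | y_{0:T}] with a fixed-lag mechanism in the spirit of the Kitagawa-style modification mentioned above: each additive increment is frozen once the filter has moved `lag` steps past it, by tracing the current genealogy backwards. Model, lag, and estimator details are choices for the example, not the paper's exact algorithm.

```python
import numpy as np
from collections import deque

rng = np.random.default_rng(0)
T, N, lag = 200, 500, 10
phi, sig_x, sig_y = 0.9, 1.0, 1.0

# Simulate from x_t = phi * x_{t-1} + eps_t,  y_t = x_t + eta_t.
x = np.zeros(T)
for t in range(1, T):
    x[t] = phi * x[t - 1] + sig_x * rng.standard_normal()
y = x + sig_y * rng.standard_normal(T)

def weights(t, p):
    # Normalised bootstrap weights from the Gaussian observation density.
    w = np.exp(-0.5 * ((y[t] - p) / sig_y) ** 2)
    return w / w.sum()

particles = rng.standard_normal(N)
w = weights(0, particles)
hist = deque([particles.copy()], maxlen=lag + 1)  # last lag+1 clouds
anc = deque(maxlen=lag)                           # last lag ancestor maps
smoothed = 0.0
for t in range(1, T):
    a = rng.choice(N, size=N, p=w)                # multinomial resampling
    particles = phi * particles[a] + sig_x * rng.standard_normal(N)
    w = weights(t, particles)
    hist.append(particles.copy())
    anc.append(a)
    if len(anc) == lag:
        # Freeze the increment for time t - lag: trace the genealogy of
        # the current cloud back `lag` generations.
        idx = np.arange(N)
        for a_k in reversed(anc):
            idx = a_k[idx]
        smoothed += np.sum(w * hist[0][idx])
# Tail: the last `lag` increments, estimated from the final cloud.
idx = np.arange(N)
for j in range(len(anc), 0, -1):
    smoothed += np.sum(w * hist[j][idx])
    idx = anc[j - 1][idx]
print("estimate of sum_t E[x_t | y]:", round(smoothed, 2))
print("true sum_t x_t             :", round(x.sum(), 2))
```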
Efficient Learning of Sparse Conditional Random Fields for Supervised Sequence Labelling
Conditional Random Fields (CRFs) constitute a popular and efficient approach
for supervised sequence labelling. CRFs can cope with large description spaces
and can integrate some form of structural dependency between labels. In this
contribution, we address the issue of efficient feature selection for CRFs
based on imposing sparsity through an L1 penalty. We first show how sparsity of
the parameter set can be exploited to significantly speed up training and
labelling. We then introduce coordinate descent parameter update schemes for
CRFs with L1 regularization. We finally provide some empirical comparisons of
the proposed approach with state-of-the-art CRF training strategies. In
particular, it is shown that the proposed approach is able to take advantage
of the sparsity to speed up processing and hence potentially handle
higher-dimensional models.
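For a flavour of coordinate-wise L1 training, here is a hedged sketch of the soft-thresholding coordinate descent update on plain L1-penalised least squares; a CRF would replace the quadratic loss with the negative log-likelihood computed by forward-backward, but a similar thresholding step is what produces the exploitable sparsity.

```python
import numpy as np

def soft_threshold(z, lam):
    # Proximal operator of the L1 penalty.
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def coordinate_descent_lasso(X, y, lam, n_iter=100):
    # Minimise (1/2)||y - X beta||^2 + n * lam * ||beta||_1 coordinatewise.
    n, d = X.shape
    beta = np.zeros(d)
    col_sq = (X ** 2).sum(axis=0)
    resid = y - X @ beta
    for _ in range(n_iter):
        for j in range(d):
            # Partial residual excluding feature j, then exact 1-D update.
            rho = X[:, j] @ resid + col_sq[j] * beta[j]
            new = soft_threshold(rho, n * lam) / col_sq[j]
            resid += X[:, j] * (beta[j] - new)
            beta[j] = new
    return beta

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 50))
true = np.zeros(50)
true[:3] = [2.0, -1.5, 1.0]
y = X @ true + 0.1 * rng.standard_normal(200)
beta = coordinate_descent_lasso(X, y, lam=0.1)
print("nonzero coefficients:", np.flatnonzero(beta))  # expect [0, 1, 2]
```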