166 research outputs found
Adaptive Importance Sampling in General Mixture Classes
In this paper, we propose an adaptive algorithm that iteratively updates both
the weights and component parameters of a mixture importance sampling density
so as to optimise importance sampling performance, as measured by an
entropy criterion. The method is shown to be applicable to a wide class of
importance sampling densities, which includes in particular mixtures of
multivariate Student t distributions. The performance of the proposed scheme
is studied on both artificial and real examples, highlighting in particular
the benefit of a novel Rao-Blackwellisation device which can be easily
incorporated into the updating scheme.
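The update loop can be sketched as adaptive importance sampling over a Student t mixture, with Rao-Blackwellised responsibilities (posterior component probabilities) used in place of sampled component labels. This is a minimal illustration of the general idea only, not the authors' exact entropy-criterion scheme; the toy Gaussian target, fixed degrees of freedom, and moment-matching updates are assumptions:

```python
import numpy as np
from scipy.stats import norm, t as student_t

rng = np.random.default_rng(0)

def log_target(x):
    return norm.logpdf(x, loc=3.0, scale=1.0)   # toy target density

K, nu, n = 2, 5.0, 2000
alpha = np.ones(K) / K                 # mixture weights
mu = np.array([-2.0, 2.0])             # component locations

for _ in range(20):
    # sample from the current mixture of Student t proposals
    z = rng.choice(K, size=n, p=alpha)
    x = student_t.rvs(df=nu, loc=mu[z], size=n, random_state=rng)
    comp = np.stack([student_t.pdf(x, df=nu, loc=mu[k]) for k in range(K)])
    w = np.exp(log_target(x)) / (alpha @ comp)   # importance weights
    w /= w.sum()
    # Rao-Blackwellised responsibilities replace sampled component labels
    rho = (alpha[:, None] * comp) / (alpha @ comp)
    alpha = rho @ w                     # update mixture weights
    mu = (rho * x) @ w / alpha          # update component locations
```

After a few iterations the mixture mean `sum(alpha * mu)` tracks the target mean; using responsibilities rather than the sampled labels `z` is the Rao-Blackwellisation the abstract refers to.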
Scalable Bayesian Non-Negative Tensor Factorization for Massive Count Data
We present a Bayesian non-negative tensor factorization model for
count-valued tensor data, and develop scalable inference algorithms (both batch
and online) for dealing with massive tensors. Our generative model can handle
overdispersed counts as well as infer the rank of the decomposition. Moreover,
leveraging a reparameterization of the Poisson distribution as a multinomial
facilitates conjugacy in the model and enables simple and efficient Gibbs
sampling and variational Bayes (VB) inference updates, with a computational
cost that only depends on the number of nonzeros in the tensor. The model also
provides a nice interpretability for the factors; in our model, each factor
corresponds to a "topic". We develop a set of online inference algorithms that
allow further scaling up the model to massive tensors, for which batch
inference methods may be infeasible. We apply our framework on diverse
real-world applications, such as \emph{multiway} topic modeling on a scientific
publications database, analyzing a political science data set, and analyzing a
massive household transactions data set.
Comment: ECML PKDD 201
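The reparameterization the model exploits is the standard Poisson-multinomial equivalence: drawing independent Poisson counts per component is distributionally identical to drawing one Poisson total and allocating it multinomially. A quick numerical check, with toy rates as an assumption:

```python
import numpy as np

rng = np.random.default_rng(1)
rates = np.array([0.5, 1.5, 3.0])      # toy component rates
p = rates / rates.sum()
n = 50_000

# direct scheme: independent Poisson draws per component
direct = rng.poisson(rates, size=(n, rates.size))

# reparameterized scheme: one Poisson total, then a multinomial allocation
totals = rng.poisson(rates.sum(), size=n)
alloc = np.array([rng.multinomial(t, p) for t in totals])

# both schemes give the same per-component distribution
print(direct.mean(axis=0), alloc.mean(axis=0))
```

It is this multinomial form of the likelihood that makes the conjugate Gibbs and VB updates possible, and since zero totals allocate nothing, the cost scales with the nonzeros.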
Practical Open-Loop Optimistic Planning
We consider the problem of online planning in a Markov Decision Process when
given only access to a generative model, restricted to open-loop policies,
i.e. sequences of actions, and under a budget constraint. In this setting, the
Open-Loop Optimistic Planning (OLOP) algorithm enjoys good theoretical
guarantees but is overly conservative in practice, as we show in numerical
experiments. We propose a modified version of the algorithm with tighter
upper-confidence bounds, KLOLOP, that leads to better practical performance
while retaining the sample complexity bound. Finally, we propose an efficient
implementation that significantly improves the time complexity of both
algorithms.
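The tighter bounds in question are KL-type (Chernoff) upper confidence bounds. A minimal sketch of a Bernoulli kl-UCB computed by bisection follows; the function names and the generic exploration budget are illustrative assumptions, not the paper's exact bound:

```python
import math

def bernoulli_kl(p, q):
    """KL divergence between Bernoulli(p) and Bernoulli(q)."""
    eps = 1e-12
    p, q = min(max(p, eps), 1 - eps), min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

def kl_ucb(mean, count, budget, iters=50):
    """Largest q in [mean, 1] with count * KL(mean, q) <= budget, by bisection."""
    lo, hi = mean, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if count * bernoulli_kl(mean, mid) <= budget:
            lo = mid
        else:
            hi = mid
    return lo

print(kl_ucb(0.9, 100, math.log(1000.0)))
```

By Pinsker's inequality, this bound never exceeds the corresponding Hoeffding-style bound `mean + sqrt(budget / (2 * count))`, and it is markedly tighter when the empirical mean sits near 0 or 1.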
A Novel Document Generation Process for Topic Detection based on Hierarchical Latent Tree Models
We propose a novel document generation process based on hierarchical latent
tree models (HLTMs) learned from data. An HLTM has a layer of observed word
variables at the bottom and multiple layers of latent variables on top. For
each document, we first sample values for the latent variables layer by layer
via logic sampling, then draw relative frequencies for the words conditioned on
the values of the latent variables, and finally generate words for the document
using the relative word frequencies. The motivation for the work is to take
word counts into consideration with HLTMs. In comparison with LDA-based
hierarchical document generation processes, the new process achieves
drastically better model fit with far fewer parameters. It also yields more
meaningful topics and topic hierarchies, and sets a new state of the art for
hierarchical topic detection.
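The three-stage process can be sketched as ancestral ("logic") sampling down the latent layers, a Dirichlet draw of relative word frequencies, and multinomial word generation. The two-layer structure, conditional probability table, and Dirichlet emission parameters below are invented for illustration and are not the learned HLTM parameters:

```python
import numpy as np

rng = np.random.default_rng(2)

V = 6                                   # toy vocabulary size
root_probs = np.array([0.5, 0.5])       # distribution of the top latent
cpt = np.array([[0.9, 0.1],             # hypothetical CPT: child given root
                [0.2, 0.8]])
dir_params = np.array([[5., 5., 5., 1., 1., 1.],   # assumed emission priors
                       [1., 1., 1., 5., 5., 5.]])

def generate_document(length=50):
    # logic (ancestral) sampling: root first, then children given parents
    z_root = rng.choice(2, p=root_probs)
    z_leaf = rng.choice(2, p=cpt[z_root])
    # draw relative word frequencies conditioned on the latent values
    theta = rng.dirichlet(dir_params[z_leaf])
    # generate words for the document from those relative frequencies
    return rng.choice(V, size=length, p=theta)

doc = generate_document()
```

The Dirichlet step is what lets word counts vary per document while the latent layers control which region of the vocabulary simplex the frequencies are drawn from.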
Kernel Sequential Monte Carlo
We propose kernel sequential Monte Carlo (KSMC), a framework for sampling from static target densities. KSMC is a family of
sequential Monte Carlo algorithms that are based on building emulator
models of the current particle system in a reproducing kernel Hilbert
space. We here focus on modelling nonlinear covariance structure and
gradients of the target. The emulator’s geometry is adaptively updated
and subsequently used to inform local proposals. Unlike in adaptive
Markov chain Monte Carlo, continuous adaptation does not compromise
convergence of the sampler. KSMC combines the strengths of sequential
Monte Carlo and kernel methods: superior performance for multimodal
targets and the ability to estimate model evidence as compared to Markov
chain Monte Carlo, and the emulator’s ability to represent targets that
exhibit high degrees of nonlinearity. As KSMC does not require access to
target gradients, it is particularly applicable on targets whose gradients
are unknown or prohibitively expensive. We describe necessary tuning
details and demonstrate the benefits of the proposed methodology on
a series of challenging synthetic and real-world examples.
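In the linear-kernel special case, the RKHS covariance emulator reduces to the weighted sample covariance of the current particle system, which already illustrates how emulator geometry can inform local proposals inside SMC without target gradients. The tempered target sequence, toy banana-shaped density, and single Metropolis move per step below are assumptions of this sketch, not the paper's full kernel construction:

```python
import numpy as np

rng = np.random.default_rng(6)

def log_target(x):        # toy 2-D target with mild nonlinearity
    return -0.5 * (x[..., 0] ** 2 + (x[..., 1] - 0.1 * x[..., 0] ** 2) ** 2)

def log_prior(x):         # broad Gaussian starting distribution
    return -0.5 * np.sum(x ** 2, axis=-1) / 9.0

N = 1000
x = rng.normal(0.0, 3.0, size=(N, 2))
logw = np.zeros(N)
temps = np.linspace(0.0, 1.0, 11)

for t0, t1 in zip(temps[:-1], temps[1:]):
    # reweight towards the next tempered target
    logw += (t1 - t0) * (log_target(x) - log_prior(x))
    w = np.exp(logw - logw.max()); w /= w.sum()
    # "emulate" the target covariance from the weighted particle system
    mu = w @ x
    cov = (x - mu).T @ ((x - mu) * w[:, None]) + 1e-6 * np.eye(2)
    # resample, then move with a covariance-informed Metropolis kernel
    x, logw = x[rng.choice(N, size=N, p=w)], np.zeros(N)
    prop = x + rng.multivariate_normal(np.zeros(2), cov, size=N)
    def log_tempered(z, t=t1):
        return t * log_target(z) + (1 - t) * log_prior(z)
    accept = np.log(rng.random(N)) < log_tempered(prop) - log_tempered(x)
    x[accept] = prop[accept]
```

The nonlinear-covariance and gradient emulators of the paper replace the sample covariance here with richer kernel-based models, but the adapt-then-propose structure is the same.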
A population Monte Carlo scheme with transformed weights and its application to stochastic kinetic models
This paper addresses the problem of Monte Carlo approximation of posterior
probability distributions. In particular, we have considered a recently
proposed technique known as population Monte Carlo (PMC), which is based on an
iterative importance sampling approach. An important drawback of this
methodology is the degeneracy of the importance weights when the dimension of
either the observations or the variables of interest is high. To alleviate this
difficulty, we propose a novel method that performs a nonlinear transformation
on the importance weights. This operation reduces the weight variation, hence
it avoids their degeneracy and increases the efficiency of the importance
sampling scheme, especially when drawing from proposal functions that are
poorly adapted to the true posterior.
For the sake of illustration, we have applied the proposed algorithm to the
estimation of the parameters of a Gaussian mixture model. This is a very simple
problem that enables us to clearly show and discuss the main features of the
proposed technique. As a practical application, we have also considered the
popular (and challenging) problem of estimating the rate parameters of
stochastic kinetic models (SKM). SKMs are highly multivariate systems that
model molecular interactions in biological and chemical problems. We introduce
a particularization of the proposed algorithm to SKMs and present numerical
results.
Comment: 35 pages, 8 figures
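One simple nonlinear transformation of the weights is clipping the largest ones at a threshold, which flattens the weight distribution and raises the effective sample size. The sketch below, with a deliberately mismatched Gaussian proposal, shows the effect; the clipping level, toy target, and proposal are assumptions rather than the paper's specific transformation:

```python
import numpy as np

rng = np.random.default_rng(3)

def ess(w):
    """Effective sample size of a set of importance weights."""
    w = w / w.sum()
    return 1.0 / np.sum(w ** 2)

# deliberately mismatched proposal: broad N(0, 5^2) for a N(4, 1) target
x = rng.normal(0.0, 5.0, size=5000)
logw = -0.5 * (x - 4.0) ** 2 + 0.5 * (x / 5.0) ** 2   # log target - log proposal
w = np.exp(logw - logw.max())

# nonlinear transformation: clip the weights at the 100th largest value
w_clipped = np.minimum(w, np.sort(w)[-100])

print(ess(w), ess(w_clipped))    # clipping raises the effective sample size
```

Taming the weight tail introduces a small bias but sharply reduces variance, which is the trade-off that makes such transformations attractive in high-dimensional PMC.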
Recovering the state sequence of hidden Markov models using mean-field approximations
Inferring the sequence of states from observations is one of the most
fundamental problems in Hidden Markov Models. In statistical physics language,
this problem is equivalent to computing the marginals of a one-dimensional
model with a random external field. While this task can be accomplished through
transfer matrix methods, it becomes quickly intractable when the underlying
state space is large.
This paper develops several low-complexity approximate algorithms to address
this inference problem when the state space becomes large. The new algorithms
are based on various mean-field approximations of the transfer matrix. Their
performances are studied in detail on a simple realistic model for DNA
pyrosequencing.
Comment: 43 pages, 41 figures
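The transfer-matrix computation being approximated is the classic forward-backward recursion, which is exact in O(T·S²) time and hence intractable for very large state spaces S; the paper's mean-field algorithms replace these matrix products with cheaper factorized updates. A sketch of the exact baseline on a toy chain (all parameters invented):

```python
import numpy as np

S, T = 3, 6                                   # small enough for exact passes
A = np.full((S, S), 0.1) + 0.7 * np.eye(S)    # sticky transitions (invented)
E = np.array([[0.8, 0.1, 0.1],
              [0.1, 0.8, 0.1],
              [0.1, 0.1, 0.8]])               # emission probabilities
obs = [0, 0, 1, 1, 2, 2]

# forward and backward passes: O(T * S^2) transfer-matrix products
alpha = np.zeros((T, S))
beta = np.ones((T, S))
alpha[0] = E[:, obs[0]] / S                   # uniform initial distribution
for t in range(1, T):
    alpha[t] = (alpha[t - 1] @ A) * E[:, obs[t]]
for t in range(T - 2, -1, -1):
    beta[t] = A @ (E[:, obs[t + 1]] * beta[t + 1])

# smoothing marginals P(state_t | all observations)
gamma = alpha * beta
gamma /= gamma.sum(axis=1, keepdims=True)
```

Each matrix-vector product here costs O(S²); the mean-field approximations of the paper are aimed precisely at avoiding that cost when S is large.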
Localizing the Latent Structure Canonical Uncertainty: Entropy Profiles for Hidden Markov Models
This report addresses state inference for hidden Markov models. These models
rely on unobserved states, which often have a meaningful interpretation. This
makes it necessary to develop diagnostic tools for quantification of state
uncertainty. The entropy of the state sequence that explains an observed
sequence for a given hidden Markov chain model can be considered as the
canonical measure of state sequence uncertainty. This canonical measure of
state sequence uncertainty is not reflected by the classic multivariate state
profiles computed by the smoothing algorithm, which summarizes the possible
state sequences. Here, we introduce a new type of profiles which have the
following properties: (i) these profiles of conditional entropies are a
decomposition of the canonical measure of state sequence uncertainty along the
sequence and make it possible to localize this uncertainty, (ii) these
profiles are univariate and thus remain easily interpretable on tree
structures. We show how to extend the smoothing algorithms for hidden Markov
chain and tree models to compute these entropy profiles efficiently.
Comment: Submitted to Journal of Machine Learning Research; No RR-7896 (2012
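Property (i) can be checked directly on a tiny chain: by the chain rule, the profile of conditional entropies sums exactly to the entropy of the posterior state-sequence distribution. The brute-force enumeration below (with invented parameters) verifies the decomposition; the paper of course computes it with smoothing-type recursions rather than enumeration:

```python
import itertools
import numpy as np

S, T = 2, 4                                # tiny chain: exact enumeration
A = np.array([[0.8, 0.2], [0.3, 0.7]])     # transitions (invented)
E = np.array([[0.9, 0.1], [0.2, 0.8]])     # emissions (invented)
init = np.array([0.5, 0.5])
obs = [0, 1, 1, 0]

# posterior over complete state sequences given the observations
seqs = list(itertools.product(range(S), repeat=T))
p = np.array([init[s[0]] * E[s[0], obs[0]] *
              np.prod([A[s[t - 1], s[t]] * E[s[t], obs[t]]
                       for t in range(1, T)])
              for s in seqs])
post = p / p.sum()

# canonical measure: entropy of the state-sequence posterior
H = -np.sum(post * np.log(post))

# profile: H(S_t | S_{1:t-1}, obs) via prefix-marginal entropies
prefix_entropy = [0.0]
for t in range(1, T + 1):
    marg = {}
    for s, q in zip(seqs, post):
        marg[s[:t]] = marg.get(s[:t], 0.0) + q
    q = np.array(list(marg.values()))
    prefix_entropy.append(-np.sum(q * np.log(q)))
profile = np.diff(prefix_entropy)          # sums exactly to H
```

Each entry of `profile` is a univariate, nonnegative quantity attached to one position, which is what makes the decomposition localizable along chains and trees.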
Ergodicity, Decisions, and Partial Information
In the simplest sequential decision problem for an ergodic stochastic process
X, at each time n a decision u_n is made as a function of past observations
X_0,...,X_{n-1}, and a loss l(u_n,X_n) is incurred. In this setting, it is
known that one may choose (under a mild integrability assumption) a decision
strategy whose pathwise time-average loss is asymptotically smaller than that
of any other strategy. The corresponding problem in the case of partial
information proves to be much more delicate, however: if the process X is not
observable, but decisions must be based on the observation of a different
process Y, the existence of pathwise optimal strategies is not guaranteed.
The aim of this paper is to exhibit connections between pathwise optimal
strategies and notions from ergodic theory. The sequential decision problem is
developed in the general setting of an ergodic dynamical system (\Omega,B,P,T)
with partial information Y\subseteq B. The existence of pathwise optimal
strategies is grounded in two basic properties: the conditional ergodic theory of
the dynamical system, and the complexity of the loss function. When the loss
function is not too complex, a general sufficient condition for the existence
of pathwise optimal strategies is that the dynamical system is a conditional
K-automorphism relative to the past observations \bigvee_n T^n Y. If the
conditional ergodicity assumption is strengthened, the complexity assumption
can be weakened. Several examples demonstrate the interplay between complexity
and ergodicity, which does not arise in the case of full information. Our
results also yield a decision-theoretic characterization of weak mixing in
ergodic theory, and establish pathwise optimality of ergodic nonlinear filters.
Comment: 45 pages
Interacting Multiple Try Algorithms with Different Proposal Distributions
We propose a new class of interacting Markov chain Monte Carlo (MCMC)
algorithms designed for increasing the efficiency of a modified multiple-try
Metropolis (MTM) algorithm. The extension with respect to the existing MCMC
literature is twofold. The proposed sampler extends the basic MTM algorithm by
allowing different proposal distributions in the multiple-try generation step.
We exploit the structure of the MTM algorithm with different proposal
distributions to naturally introduce an interacting MTM mechanism (IMTM) that
expands the class of population Monte Carlo methods. We show the validity of
the algorithm and discuss the choice of the selection weights and of the
different proposals. We provide numerical studies which show that the new
algorithm can perform better than the basic MTM algorithm and that the
interaction mechanism allows the IMTM to explore the state space efficiently.
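A single MTM step with different proposal distributions can be sketched as follows: symmetric Gaussian random walks of several scales, selection weights proportional to the target (one valid choice of the MTM weight functions when the kernels are symmetric), and the usual reference-point acceptance ratio. The interaction across multiple chains is omitted, and the standard-normal target and scale choices are toy assumptions:

```python
import numpy as np

rng = np.random.default_rng(5)

def log_pi(x):                       # toy target: standard normal
    return -0.5 * x ** 2

scales = np.array([0.1, 1.0, 5.0])   # one random-walk scale per try

def mtm_step(x):
    # one candidate per proposal distribution (symmetric random walks)
    ys = x + scales * rng.standard_normal(scales.size)
    wy = np.exp(log_pi(ys))
    j = rng.choice(scales.size, p=wy / wy.sum())   # select one candidate
    y = ys[j]
    # reference points: redraw from each kernel around y, keep x at slot j
    xs = y + scales * rng.standard_normal(scales.size)
    xs[j] = x
    wx = np.exp(log_pi(xs))
    if rng.random() < min(1.0, wy.sum() / wx.sum()):
        return y
    return x

x, chain = 0.0, []
for _ in range(20_000):
    x = mtm_step(x)
    chain.append(x)
chain = np.asarray(chain)
```

Mixing scales in one step lets small proposals refine locally while large ones attempt distant jumps; the interacting mechanism of the paper additionally shares candidates across parallel chains.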