KL-based Control of the Learning Schedule for Surrogate Black-Box Optimization
This paper investigates the control of an ML component within the Covariance
Matrix Adaptation Evolution Strategy (CMA-ES) devoted to black-box
optimization. The known CMA-ES weakness is its sample complexity, the number of
evaluations of the objective function needed to approximate the global optimum.
This weakness is commonly addressed through surrogate optimization, learning an
estimate of the objective function a.k.a. surrogate model, and replacing most
evaluations of the true objective function with the (inexpensive) evaluation of
the surrogate model. This paper presents a principled control of the learning
schedule (when to relearn the surrogate model), based on the Kullback-Leibler
divergence between the current search distribution and the training distribution of
the former surrogate model. The experimental validation of the proposed
approach shows significant performance gains on a comprehensive set of
ill-conditioned benchmark problems, compared to the best state of the art
including the quasi-Newton high-precision BFGS method.
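As a minimal sketch of such a control rule (our illustration, not the paper's exact procedure), assume both the current search distribution and the surrogate's training distribution are multivariate Gaussians, as in CMA-ES, and that the surrogate is relearned once their KL divergence exceeds a tunable threshold:

```python
import numpy as np

def gaussian_kl(mu0, cov0, mu1, cov1):
    """KL( N(mu0, cov0) || N(mu1, cov1) ) in closed form."""
    d = mu0.shape[0]
    cov1_inv = np.linalg.inv(cov1)
    diff = mu1 - mu0
    return 0.5 * (np.trace(cov1_inv @ cov0) + diff @ cov1_inv @ diff - d
                  + np.log(np.linalg.det(cov1) / np.linalg.det(cov0)))

def should_relearn(search_mu, search_cov, train_mu, train_cov, threshold=1.0):
    """Trigger surrogate retraining once the search distribution has drifted
    too far (in KL) from the surrogate's training distribution.
    The threshold here is a hypothetical tunable hyperparameter."""
    return gaussian_kl(search_mu, search_cov, train_mu, train_cov) > threshold
```

The closed-form Gaussian KL makes this check essentially free relative to even a single evaluation of an expensive objective.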
Variational Hamiltonian Monte Carlo via Score Matching
Traditionally, the field of computational Bayesian statistics has been
divided into two main subfields: variational methods and Markov chain Monte
Carlo (MCMC). In recent years, however, several methods have been proposed
based on combining variational Bayesian inference and MCMC simulation in order
to improve their overall accuracy and computational efficiency. This marriage
of fast evaluation and flexible approximation provides a promising means of
designing scalable Bayesian inference methods. In this paper, we explore the
possibility of incorporating variational approximation into a state-of-the-art
MCMC method, Hamiltonian Monte Carlo (HMC), to reduce the required gradient
computation in the simulation of Hamiltonian flow, which is the bottleneck for
many applications of HMC in big data problems. To this end, we use a free-form
approximation induced by a fast and flexible surrogate function
based on single-hidden layer feedforward neural networks. The surrogate
provides sufficiently accurate approximation while allowing for fast
exploration of parameter space, resulting in an efficient approximate inference
algorithm. We demonstrate the advantages of our method on both synthetic and
real data problems.
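The following sketch illustrates the idea under simplifications of ours (the callable surrogate_grad stands in for the gradient of the trained single-hidden-layer surrogate; this is not the paper's exact algorithm): the cheap surrogate gradient drives the leapfrog integrator, and the expensive exact log-density is evaluated only in the final accept/reject step.

```python
import numpy as np

def hmc_step(q, log_prob, surrogate_grad, step_size=0.1, n_leapfrog=20, rng=None):
    """One HMC proposal whose leapfrog integrator uses the cheap surrogate
    gradient; the expensive exact log_prob is evaluated only twice, in the
    Metropolis accept/reject step."""
    rng = rng if rng is not None else np.random.default_rng()
    p = rng.standard_normal(q.shape)
    q_new, p_new = q.copy(), p.copy()
    # Leapfrog integration of Hamiltonian dynamics driven by surrogate forces.
    p_new += 0.5 * step_size * surrogate_grad(q_new)
    for _ in range(n_leapfrog - 1):
        q_new += step_size * p_new
        p_new += step_size * surrogate_grad(q_new)
    q_new += step_size * p_new
    p_new += 0.5 * step_size * surrogate_grad(q_new)
    # Metropolis correction against the true target.
    log_accept = (log_prob(q_new) - 0.5 * p_new @ p_new
                  - log_prob(q) + 0.5 * p @ p)
    return q_new if np.log(rng.uniform()) < log_accept else q
```

Because the surrogate-driven leapfrog map remains reversible and volume-preserving, correcting with the exact target preserves the stationary distribution; an inaccurate surrogate only lowers the acceptance rate.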
Model-based relative entropy stochastic search
Stochastic search algorithms are general black-box optimizers. Due to their ease
of use and their generality, they have recently also gained a lot of attention in operations
research, machine learning and policy search. Yet, these algorithms require
a lot of evaluations of the objective, scale poorly with the problem dimension, are
affected by highly noisy objective functions and may converge prematurely. To
alleviate these problems, we introduce a new surrogate-based stochastic search
approach. We learn simple, quadratic surrogate models of the objective function.
As the quality of such a quadratic approximation is limited, we do not greedily exploit
the learned models: the algorithm could otherwise be misled by an inaccurate optimum
introduced by the surrogate. Instead, we use information-theoretic constraints to
bound the 'distance' between the new and old data distributions while maximizing
the objective function. Additionally, the new method is able to sustain the exploration
of the search distribution to avoid premature convergence. We compare our
method with state-of-the-art black-box optimization methods on standard uni-modal
and multi-modal optimization functions, on simulated planar robot tasks and a
complex robot ball throwing task. The proposed method considerably outperforms
the existing approaches.
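For intuition, here is a simplified sketch of such a KL-bounded update (it omits the entropy bound of the full method, and the names are ours): with a Gaussian search distribution and a quadratic surrogate R(x) = -0.5 x^T A x + a^T x with A positive definite, the constrained maximizer stays Gaussian, and the temperature eta is found where the trust-region constraint becomes tight.

```python
import numpy as np
from scipy.optimize import brentq

def gaussian_kl(mu0, cov0, mu1, cov1):
    """KL( N(mu0, cov0) || N(mu1, cov1) ) in closed form."""
    d = mu0.shape[0]
    cov1_inv = np.linalg.inv(cov1)
    diff = mu1 - mu0
    return 0.5 * (np.trace(cov1_inv @ cov0) + diff @ cov1_inv @ diff - d
                  + np.log(np.linalg.det(cov1) / np.linalg.det(cov0)))

def kl_bounded_update(mu, cov, A, a, epsilon=0.1):
    """Maximize E[R(x)] subject to KL(new || old) <= epsilon for a Gaussian
    search distribution and quadratic surrogate R(x) = -0.5 x'Ax + a'x.
    The solution is proportional to pi_old(x) * exp(R(x) / eta), which is
    again Gaussian with the parameters computed below."""
    prec = np.linalg.inv(cov)

    def updated(eta):
        cov_new = np.linalg.inv(prec + A / eta)
        mu_new = cov_new @ (prec @ mu + a / eta)
        return mu_new, cov_new

    def kl_gap(eta):
        mu_new, cov_new = updated(eta)
        return gaussian_kl(mu_new, cov_new, mu, cov) - epsilon

    # Small eta -> aggressive step (large KL); large eta -> tiny step.
    # Pick eta so the trust-region constraint is exactly tight.
    eta = brentq(kl_gap, 1e-6, 1e6)
    return updated(eta)
```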
Target-based Surrogates for Stochastic Optimization
We consider minimizing functions for which it is expensive to compute the
(possibly stochastic) gradient. Such functions are prevalent in reinforcement
learning, imitation learning and adversarial training. Our target optimization
framework uses the (expensive) gradient computation to construct surrogate
functions in a \emph{target space} (e.g. the logits output by a linear model
for classification) that can be minimized efficiently. This allows for multiple
parameter updates to the model, amortizing the cost of gradient computation. In
the full-batch setting, we prove that our surrogate is a global upper-bound on
the loss, and can be (locally) minimized using a black-box optimization
algorithm. We prove that the resulting majorization-minimization algorithm
ensures convergence to a stationary point of the loss. Next, we instantiate our
framework in the stochastic setting and propose an algorithm that can
be viewed as projected stochastic gradient descent in the target space. This
connection enables us to prove theoretical guarantees when minimizing
convex functions. Our framework allows the use of standard stochastic
optimization algorithms to construct surrogates which can be minimized by any
deterministic optimization method. To evaluate our framework, we consider a
suite of supervised learning and imitation learning problems. Our experiments
indicate the benefits of target optimization and the effectiveness of the proposed algorithm.
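As a sketch of the mechanics under assumptions of ours (a linear model with targets z = X @ theta and a loss that is beta-smooth in the targets; names are illustrative, not the paper's code): one expensive target-space gradient defines a quadratic upper bound that several cheap inner updates can minimize.

```python
import numpy as np

def target_surrogate_step(theta, X, loss_grad_z, beta, inner_steps=10, lr=0.1):
    """One outer iteration: compute the expensive target-space gradient once
    at z0 = X @ theta, then minimize the quadratic upper bound
        L(z0) + g . (z - z0) + (beta / 2) * ||z - z0||^2
    (valid when the loss is beta-smooth in the targets) with several cheap
    parameter updates."""
    z0 = X @ theta
    g = loss_grad_z(z0)                      # single expensive gradient call
    for _ in range(inner_steps):
        z = X @ theta
        # Chain rule: gradient of the surrogate with respect to theta.
        theta = theta - lr * (X.T @ (g + beta * (z - z0)))
    return theta
```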
A Tutorial on Bayesian Optimization of Expensive Cost Functions, with Application to Active User Modeling and Hierarchical Reinforcement Learning
We present a tutorial on Bayesian optimization, a method of finding the
maximum of expensive cost functions. Bayesian optimization employs the Bayesian
technique of setting a prior over the objective function and combining it with
evidence to get a posterior function. This permits a utility-based selection of
the next observation to make on the objective function, which must take into
account both exploration (sampling from areas of high uncertainty) and
exploitation (sampling areas likely to offer improvement over the current best
observation). We also present two detailed extensions of Bayesian optimization,
with experiments---active user modelling with preferences, and hierarchical
reinforcement learning---and a discussion of the pros and cons of Bayesian
optimization based on our experiences.
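As a minimal illustration of one iteration of this loop (our sketch, using scikit-learn's GP rather than anything from the tutorial; in practice the acquisition would be optimized rather than enumerated over a fixed candidate set):

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

def expected_improvement(candidates, gp, f_best):
    """EI for maximization: balances exploitation (high posterior mean)
    against exploration (high posterior uncertainty)."""
    mu, sigma = gp.predict(candidates, return_std=True)
    sigma = np.maximum(sigma, 1e-9)
    z = (mu - f_best) / sigma
    return (mu - f_best) * norm.cdf(z) + sigma * norm.pdf(z)

def bo_step(X_obs, y_obs, candidates):
    """Fit the GP posterior to all evaluations so far, then query the
    candidate that maximizes expected improvement next."""
    gp = GaussianProcessRegressor(normalize_y=True).fit(X_obs, y_obs)
    ei = expected_improvement(candidates, gp, y_obs.max())
    return candidates[np.argmax(ei)]
```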
Universal Reinforcement Learning Algorithms: Survey and Experiments
Many state-of-the-art reinforcement learning (RL) algorithms typically assume
that the environment is an ergodic Markov Decision Process (MDP). In contrast,
the field of universal reinforcement learning (URL) is concerned with
algorithms that make as few assumptions as possible about the environment. The
universal Bayesian agent AIXI and a family of related URL algorithms have been
developed in this setting. While numerous theoretical optimality results have
been proven for these agents, there has been no empirical investigation of
their behavior to date. We present a short and accessible survey of these URL
algorithms under a unified notation and framework, along with results of some
experiments that qualitatively illustrate some properties of the resulting
policies, and their relative performance on partially-observable gridworld
environments. We also present an open-source reference implementation of the
algorithms which we hope will facilitate further understanding of, and
experimentation with, these ideas. (Comment: 8 pages, 6 figures; Twenty-sixth
International Joint Conference on Artificial Intelligence, IJCAI-17.)
MCMC-driven learning
This paper is intended to appear as a chapter for the Handbook of Markov
Chain Monte Carlo. The goal of this chapter is to unify various problems at the
intersection of Markov chain Monte Carlo (MCMC) and machine
learning (which includes black-box variational inference,
adaptive MCMC, normalizing flow construction and transport-assisted MCMC,
surrogate-likelihood MCMC, coreset construction for MCMC with big data, Markov
chain gradient descent, Markovian score climbing, and
more) within one common framework. By doing so, the theory and
methods developed for each may be translated and generalized.
Information theoretic stochastic search
The MAP-i Doctoral Programme in Informatics of the Universities of Minho, Aveiro and Porto.
Optimization is the research field that studies the design of algorithms for finding the
best solutions to problems we may throw at them. While the whole domain is practically
important, the present thesis will focus on the subfield of continuous black-box
optimization, presenting a collection of novel, state-of-the-art algorithms for solving
problems in that class. In this thesis, we introduce two novel general-purpose
stochastic search algorithms for black-box optimization. Stochastic search algorithms
aim at repeating the type of mutations that led to the fittest search points in a population.
We can model those mutations by a stochastic distribution; typically, the
distribution is modelled as a multivariate Gaussian. The key idea is to
iteratively change the parameters of the distribution towards higher expected fitness.
However, we leverage information-theoretic trust regions to limit the change of the
new distribution. We show how plain maximization of the expected fitness without
bounding the change of the distribution is destined to fail because of overfitting,
which results in premature convergence. Being derived from first principles, the
proposed methods can be elegantly extended to the contextual learning setting, which allows
for learning context-dependent stochastic distributions that generate optimal
individuals for a given context, i.e., instead of learning one task at a time, we can
learn multiple related tasks at once. However, the search distribution typically uses
a parametric model based on hand-defined context features. Finding good context
features is a challenging task, and hence non-parametric methods are often preferred
over their parametric counterparts. Therefore, we further propose a non-parametric
contextual stochastic search algorithm that can learn a non-parametric search distribution
for multiple tasks simultaneously.
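A toy sketch of the contextual extension, with simplifications of ours: the search distribution is a Gaussian whose mean is linear in hand-defined context features phi(c), and the trust-region dual variable is replaced by a fixed temperature eta, reducing the update to fitness-weighted regression.

```python
import numpy as np

def contextual_update(contexts, samples, fitness, phi, eta=1.0):
    """Refit a contextual Gaussian search distribution x ~ N(W @ phi(c), cov).
    Samples are re-weighted by exponentiated fitness (eta stands in for the
    temperature that a KL trust region would determine), then the mean map W
    and the covariance are refit by weighted maximum likelihood."""
    w = np.exp((fitness - fitness.max()) / eta)   # soft trust-region weights
    Phi = np.stack([phi(c) for c in contexts])    # (N, k) context features
    wPhi = Phi * w[:, None]
    # Weighted ridge regression for the mean map: samples ≈ Phi @ W.T
    W = np.linalg.solve(Phi.T @ wPhi + 1e-6 * np.eye(Phi.shape[1]),
                        wPhi.T @ samples).T
    resid = samples - Phi @ W.T
    cov = (resid.T @ (w[:, None] * resid)) / w.sum()
    return W, cov
```

Sampling x ~ N(W @ phi(c), cov) then yields individuals tailored to a new context c, i.e., several related tasks share one search distribution.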
Funded by FCT - Fundação para a Ciência e a Tecnologia, by the European Union's
FP7 under EuRoC grant agreement CP-IP 608849, and by LIACC (UID/CEC/00027/2015)
and IEETA (UID/CEC/00127/2015).