Deriving and improving CMA-ES with Information geometric trust regions
CMA-ES is one of the most popular stochastic search algorithms.
It performs favourably in many tasks without the need for extensive
parameter tuning. The algorithm has many beneficial properties,
including automatic step-size adaptation, efficient covariance updates
that incorporate the current samples as well as the evolution
path, and its invariance properties. Its update rules are composed
of well-established heuristics where the theoretical foundations of
some of these rules are also well understood. In this paper we
will fully derive all CMA-ES update rules within the framework of
expectation-maximisation-based stochastic search algorithms using
information-geometric trust regions. We show that the use of the trust
region results in similar updates to CMA-ES for the mean and the
covariance matrix while it allows for the derivation of an improved
update rule for the step-size. Our new algorithm, Trust-Region Covariance
Matrix Adaptation Evolution Strategy (TR-CMA-ES), is
fully derived from first-order optimization principles and performs
favourably in comparison to the standard CMA-ES algorithm.
Warped Hypertime Representations for Long-Term Autonomy of Mobile Robots
This letter presents a novel method for introducing time into discrete and continuous spatial representations used in mobile robotics, by modeling long-term, pseudo-periodic variations caused by human activities or natural processes. Unlike previous approaches, the proposed method does not treat time and space separately, and its continuous nature respects both the temporal and spatial continuity of the modeled phenomena. The key idea is to extend the spatial model with a set of wrapped time dimensions that represent the periodicities of the observed events. By performing clustering over this extended representation, we obtain a model that allows the prediction of probabilistic distributions of future states and events in both discrete and continuous spatial representations. We apply the proposed algorithm to several long-term datasets acquired by mobile robots and show that the method enables a robot to predict future states of representations with different dimensions. The experiments further show that the method achieves more accurate predictions than the previous state of the art.
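The idea of wrapping time onto periodic dimensions can be illustrated with a small sketch. It assumes that each period is represented by a cosine and sine pair on the unit circle; the function name `hypertime_embed`, its parameters, and the choice of periods are illustrative assumptions, not the authors' implementation, and the clustering step described in the abstract is not shown.

```python
import numpy as np

def hypertime_embed(positions, times, periods):
    """Append wrapped time dimensions to spatial points (illustrative sketch).

    positions: array of shape (n, d) with spatial coordinates.
    times:     array of shape (n,) with timestamps in seconds.
    periods:   iterable of assumed period lengths in seconds.
    Returns an array of shape (n, d + 2 * len(periods)).
    """
    positions = np.atleast_2d(np.asarray(positions, dtype=float))
    times = np.asarray(times, dtype=float).reshape(-1, 1)
    columns = [positions]
    for period in periods:
        phase = 2.0 * np.pi * times / period
        # Each period contributes two dimensions: a point on the unit circle.
        columns.append(np.cos(phase))
        columns.append(np.sin(phase))
    return np.hstack(columns)
```

With a daily period, two observations taken exactly 24 hours apart at the same place map to the same point in the extended space, so a clustering method applied to this representation would group them together.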
Information theoretic stochastic search
The MAP-i Doctoral Programme in Informatics, of the Universities of Minho, Aveiro and Porto
Optimization is the research field that studies the design of algorithms for finding the
best solutions to problems we may throw at them. While the whole domain is practically
important, the present thesis will focus on the subfield of continuous black-box
optimization, presenting a collection of novel, state-of-the-art algorithms for solving
problems in that class. In this thesis, we introduce two novel general-purpose
stochastic search algorithms for black-box optimisation. Stochastic search algorithms
aim at repeating the type of mutations that led to the fittest search points in a population.
We can model those mutations by a stochastic distribution. Typically the stochastic
distribution is modelled as a multivariate Gaussian distribution. The key idea is to
iteratively change the parameters of the distribution towards higher expected fitness.
However, we leverage information-theoretic trust regions and limit the change of the
new distribution. We show how plain maximisation of the fitness expectation without
bounding the change of the distribution is destined to fail because of overfitting
and the resulting premature convergence. Being derived from first principles, the
proposed methods can be elegantly extended to the contextual learning setting, which allows
for learning context-dependent stochastic distributions that generate optimal
individuals for a given context, i.e., instead of learning one task at a time, we can
learn multiple related tasks at once. However, the search distribution typically uses
a parametric model using some hand-defined context features. Finding good context
features is a challenging task, and hence, non-parametric methods are often preferred
over their parametric counterparts. Therefore, we further propose a non-parametric
contextual stochastic search algorithm that can learn a non-parametric search distribution
for multiple tasks simultaneously.
Funding: FCT - Fundação para a Ciência e a Tecnologia, as well as funding from the European Union's
FP7 under EuRoC grant agreement CP-IP 608849 and by LIACC (UID/CEC/00027/2015)
and IEETA (UID/CEC/00127/2015).
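The general idea of updating a Gaussian search distribution while bounding its change can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's algorithm: the greedy target is a plain elite-sample maximum-likelihood fit, the bound is placed on KL(new || old), and the bound is enforced by bisecting an interpolation factor. The names `gaussian_kl` and `trust_region_step` and all default parameter values are hypothetical.

```python
import numpy as np

def gaussian_kl(mu_p, cov_p, mu_q, cov_q):
    """KL divergence KL(p || q) between two multivariate Gaussians."""
    dim = mu_p.shape[0]
    diff = mu_q - mu_p
    cov_q_inv = np.linalg.inv(cov_q)
    _, logdet_p = np.linalg.slogdet(cov_p)
    _, logdet_q = np.linalg.slogdet(cov_q)
    return 0.5 * (np.trace(cov_q_inv @ cov_p) + diff @ cov_q_inv @ diff
                  - dim + logdet_q - logdet_p)

def trust_region_step(mu, cov, objective, rng, n_samples=50,
                      elite_frac=0.25, epsilon=0.5, n_bisect=30):
    """One KL-bounded update of a Gaussian search distribution (minimisation)."""
    samples = rng.multivariate_normal(mu, cov, size=n_samples)
    values = np.array([objective(x) for x in samples])
    n_elite = max(2, int(elite_frac * n_samples))
    elite = samples[np.argsort(values)[:n_elite]]

    # Greedy (unbounded) update: fit a Gaussian to the elite samples.
    mu_greedy = elite.mean(axis=0)
    centered = elite - mu_greedy
    cov_greedy = centered.T @ centered / n_elite + 1e-8 * np.eye(len(mu))

    def blend(eta):
        # Assumption: linear interpolation between old and greedy parameters.
        return mu + eta * (mu_greedy - mu), cov + eta * (cov_greedy - cov)

    if gaussian_kl(*blend(1.0), mu, cov) <= epsilon:
        return blend(1.0)
    lo, hi = 0.0, 1.0  # lo always satisfies the KL bound (eta = 0 gives KL = 0)
    for _ in range(n_bisect):
        mid = 0.5 * (lo + hi)
        if gaussian_kl(*blend(mid), mu, cov) <= epsilon:
            lo = mid
        else:
            hi = mid
    return blend(lo)
```

Setting `epsilon` very large recovers the greedy update, which is the unbounded maximisation that the abstract argues tends to converge prematurely; a smaller `epsilon` limits how far each iteration may move the distribution.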
Active Learning based on Data Uncertainty and Model Sensitivity
Robots can rapidly acquire new skills from demonstrations. However, during
generalisation of skills or transitioning across fundamentally different
skills, it is unclear whether the robot has the necessary knowledge to perform
the task. Failing to detect missing information often leads to abrupt movements
or to collisions with the environment. Active learning can quantify the
uncertainty of performing the task and, in general, locate regions of missing
information. We introduce a novel algorithm for active learning and demonstrate
its utility for generating smooth trajectories. Our approach is based on deep
generative models and metric learning in latent spaces. It relies on the
Jacobian of the likelihood to detect non-smooth transitions in the latent
space, i.e., transitions that lead to abrupt changes in the movement of the
robot. When non-smooth transitions are detected, our algorithm asks for an
additional demonstration from that specific region. The newly acquired
knowledge modifies the data manifold and allows for learning a latent
representation for generating smooth movements. We demonstrate the efficacy of
our approach on generalising elementary skills, transitioning across different
skills, and implicitly avoiding collisions with the environment. For our
experiments, we use a simulated pendulum where we observe its motion from
images and a 7-DoF anthropomorphic arm.
Comment: Published on 2018 IEEE/RSJ International Conference on Intelligent
Robots and Syste
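The role of the Jacobian in detecting non-smooth latent transitions can be illustrated with a toy sketch. It assumes a generic decoder function mapping a latent vector to an output vector, approximates the Jacobian with central finite differences, and flags latent points whose Jacobian norm exceeds a threshold. The helper names, the threshold, and the use of a plain decoder rather than the likelihood of a deep generative model are all assumptions made for illustration.

```python
import numpy as np

def numerical_jacobian(decoder, z, step=1e-5):
    """Central finite-difference Jacobian of decoder at latent point z."""
    z = np.asarray(z, dtype=float)
    out_dim = np.asarray(decoder(z), dtype=float).size
    jac = np.zeros((out_dim, z.size))
    for i in range(z.size):
        delta = np.zeros_like(z)
        delta[i] = step
        forward = np.asarray(decoder(z + delta), dtype=float).ravel()
        backward = np.asarray(decoder(z - delta), dtype=float).ravel()
        jac[:, i] = (forward - backward) / (2.0 * step)
    return jac

def flag_non_smooth(decoder, latent_path, threshold):
    """Return a boolean mask of latent points with a large Jacobian norm."""
    norms = np.array([np.linalg.norm(numerical_jacobian(decoder, z))
                      for z in latent_path])
    return norms > threshold, norms
```

In an active-learning loop of the kind the abstract describes, the flagged latent points would mark the regions where an additional demonstration is requested.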
Data-Efficient Reinforcement Learning with Probabilistic Model Predictive Control
Trial-and-error based reinforcement learning (RL) has seen rapid advancements
in recent times, especially with the advent of deep neural networks. However,
the majority of autonomous RL algorithms require a large number of interactions
with the environment. A large number of interactions may be impractical in many
real-world applications, such as robotics, and many practical systems have to
obey limitations in the form of state space or control constraints. To reduce
the number of system interactions while simultaneously handling constraints, we
propose a model-based RL framework based on probabilistic Model Predictive
Control (MPC). In particular, we propose to learn a probabilistic transition
model using Gaussian Processes (GPs) to incorporate model uncertainty into
long-term predictions, thereby reducing the impact of model errors. We then
use MPC to find a control sequence that minimises the expected long-term cost.
We provide theoretical guarantees for first-order optimality in the GP-based
transition models with deterministic approximate inference for long-term
planning. We demonstrate that our approach not only achieves
state-of-the-art data efficiency, but is also a principled way for RL in
constrained environments.
Comment: Accepted at AISTATS 2018
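A probabilistic transition model of the kind mentioned above can be sketched with standard Gaussian process regression. The example below is a minimal NumPy version assuming a squared-exponential kernel with fixed hyperparameters; the function names and default values are illustrative, and neither the paper's approximate inference for long-term predictions nor the MPC optimisation is shown.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between the rows of a and b."""
    sq_dist = (np.sum(a ** 2, axis=1)[:, None]
               + np.sum(b ** 2, axis=1)[None, :] - 2.0 * a @ b.T)
    return variance * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / lengthscale ** 2)

def gp_fit(x_train, y_train, noise=1e-4):
    """Precompute the Cholesky factor and weights for GP regression."""
    k = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    chol = np.linalg.cholesky(k)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    return chol, alpha

def gp_predict(x_train, chol, alpha, x_test):
    """Posterior mean and variance of the latent function at x_test."""
    k_star = rbf_kernel(x_test, x_train)
    mean = k_star @ alpha
    v = np.linalg.solve(chol, k_star.T)
    var = rbf_kernel(x_test, x_test).diagonal() - np.sum(v ** 2, axis=0)
    return mean, var
```

Here each training input would be a (state, action) pair and each target the next state. The predictive variance grows away from the observed data, which is the model uncertainty that a planner can take into account when evaluating candidate control sequences.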