Hidden Parameter Markov Decision Processes: A Semiparametric Regression Approach for Discovering Latent Task Parametrizations
Control applications often feature tasks with similar, but not identical,
dynamics. We introduce the Hidden Parameter Markov Decision Process (HiP-MDP),
a framework that parametrizes a family of related dynamical systems with a
low-dimensional set of latent factors, and propose a semiparametric
regression approach for learning its structure from data. In the control
setting, we show that a learned HiP-MDP rapidly identifies the dynamics of a
new task instance, allowing an agent to flexibly adapt to task variations.
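To make the semiparametric idea concrete, below is a minimal sketch, assuming the HiP-MDP's dynamics factor into a shared feature basis combined with a low-dimensional latent weight vector per task instance. Random Fourier features and ridge regression stand in for the paper's learned basis and regression machinery; every name here is hypothetical, not the paper's API.

```python
# Minimal sketch (illustrative, not the paper's model): dynamics are a shared
# feature basis phi(s, a) combined with a low-dimensional latent weight
# vector w_b specific to each task instance b.
import numpy as np

def phi(s, a, n_features=16, seed=0):
    """Random Fourier features as a stand-in for learned basis functions;
    the fixed seed keeps the projection shared across all task instances."""
    rng = np.random.default_rng(seed)
    x = np.concatenate([np.atleast_1d(s), np.atleast_1d(a)]).astype(float)
    W = rng.normal(size=(n_features, x.size))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.cos(W @ x + b)

def fit_latent_weights(transitions, n_features=16, lam=1e-3):
    """Identify a task instance's latent weights w_b from a few
    (s, a, s_next) tuples via ridge regression on next states."""
    Phi = np.stack([phi(s, a, n_features) for s, a, _ in transitions])
    Y = np.array([s_next for _, _, s_next in transitions], dtype=float)
    return np.linalg.solve(Phi.T @ Phi + lam * np.eye(n_features), Phi.T @ Y)

def predict_next_state(s, a, w_b):
    """Instance-specific dynamics prediction once w_b is identified."""
    return phi(s, a, n_features=w_b.shape[0]) @ w_b
```

For a new task instance, a handful of observed transitions is enough to estimate w_b, after which predict_next_state gives instance-specific dynamics; this is the rapid identification step the abstract describes.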
Deep Variational Reinforcement Learning for POMDPs
Many real-world sequential decision making problems are partially observable
by nature, and the environment model is typically unknown. Consequently, there
is a great need for reinforcement learning methods that can tackle such problems
given only a stream of incomplete and noisy observations. In this paper, we
propose deep variational reinforcement learning (DVRL), which introduces an
inductive bias that allows an agent to learn a generative model of the
environment and perform inference in that model to effectively aggregate the
available information. We develop an n-step approximation to the evidence lower
bound (ELBO), allowing the model to be trained jointly with the policy. This
ensures that the latent state representation is suitable for the control task.
In experiments on Mountain Hike and flickering Atari, we show that our method
outperforms previous approaches that rely on recurrent neural networks to
encode the past.
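Below is a minimal sketch of the training signal, assuming a single-Gaussian recurrent latent-state model (the full method performs inference with a particle filter; this simplification is for brevity). The negative n-step ELBO would be added to an actor-critic loss computed from the same belief state; all module names and dimensions are hypothetical.

```python
# Minimal PyTorch sketch: a recurrent latent-state model whose n-step ELBO is
# meant to be optimized jointly with an actor-critic loss, so the belief state
# h serves both reconstruction and control. Single Gaussian latent for
# brevity; all module names and dimensions are hypothetical.
import torch
import torch.nn as nn

class DVRLSketch(nn.Module):
    def __init__(self, obs_dim, act_dim, z_dim=8, h_dim=32):
        super().__init__()
        self.rnn = nn.GRUCell(z_dim + obs_dim, h_dim)     # belief update
        self.enc = nn.Linear(h_dim + obs_dim, 2 * z_dim)  # q(z_t | h_t, o_t)
        self.prior = nn.Linear(h_dim, 2 * z_dim)          # p(z_t | h_t)
        self.dec = nn.Linear(z_dim + h_dim, obs_dim)      # p(o_t | z_t, h_t)
        self.policy = nn.Linear(h_dim, act_dim)           # actor head on belief
        self.value = nn.Linear(h_dim, 1)                  # critic head on belief

    def elbo_step(self, o, h):
        """One ELBO term: Gaussian reconstruction minus KL(q || prior)."""
        qm, qlv = self.enc(torch.cat([h, o], dim=-1)).chunk(2, dim=-1)
        pm, plv = self.prior(h).chunk(2, dim=-1)
        z = qm + torch.randn_like(qm) * (0.5 * qlv).exp()   # reparameterize
        recon = -((self.dec(torch.cat([z, h], dim=-1)) - o) ** 2).sum(-1)
        kl = 0.5 * (plv - qlv
                    + (qlv.exp() + (qm - pm) ** 2) / plv.exp() - 1).sum(-1)
        return recon - kl, self.rnn(torch.cat([z, o], dim=-1), h)

def n_step_elbo_loss(model, obs_seq, h, beta=1.0):
    """Negative ELBO summed over an n-step window; in training this would be
    added to the policy/value losses computed from the same belief h."""
    elbo = 0.0
    for o in obs_seq:                      # obs_seq: iterable of (B, obs_dim)
        term, h = model.elbo_step(o, h)
        elbo = elbo + term.mean()
    return -beta * elbo, h
```

Because the policy and value heads read the same hidden state h that the ELBO shapes, the latent representation is trained to serve both reconstruction and control, which is the joint-training property the abstract emphasizes.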
Decision-Making Under Uncertainty: Beyond Probabilities
This position paper reflects on the state-of-the-art in decision-making under
uncertainty. A classical assumption is that probabilities can sufficiently
capture all uncertainty in a system. In this paper, the focus is on the
uncertainty that goes beyond this classical interpretation, particularly by
employing a clear distinction between aleatoric and epistemic uncertainty. The
paper features an overview of Markov decision processes (MDPs) and extensions
to account for partial observability and adversarial behavior. These models
sufficiently capture aleatoric uncertainty but fail to account for epistemic
uncertainty robustly. Consequently, we present a thorough overview of so-called
uncertainty models that capture epistemic uncertainty in a more robust way. We
present solution techniques for both discrete and continuous models, ranging
from formal verification, through control-based abstractions, to reinforcement
learning. As an integral part of this paper, we list and discuss several key
challenges that arise when dealing with rich types of uncertainty in a
model-based fashion.
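As one concrete instance of such an uncertainty model, the sketch below implements robust value iteration for an interval MDP, where each transition probability is only known to lie between a lower and an upper bound (a simple form of epistemic uncertainty) and each Bellman backup optimizes against the worst-case distribution in that set. The data layout and function name are illustrative, not drawn from the paper.

```python
# Sketch of robust value iteration on an interval MDP: transition
# probabilities are only known to satisfy P_lo <= P <= P_hi elementwise
# (epistemic uncertainty), and each Bellman backup minimizes over that set.
# Assumes sum(P_lo[s, a]) <= 1 <= sum(P_hi[s, a]) for every state-action pair.
import numpy as np

def robust_value_iteration(P_lo, P_hi, R, gamma=0.9, iters=200):
    """P_lo, P_hi: (S, A, S) probability bounds; R: (S, A) rewards.
    Returns the worst-case (pessimistic) value function."""
    S, A, _ = P_lo.shape
    V = np.zeros(S)
    for _ in range(iters):
        Q = np.empty((S, A))
        for s in range(S):
            for a in range(A):
                # The adversary starts from the lower bounds and assigns the
                # remaining probability mass to low-value successors first.
                p = P_lo[s, a].copy()
                slack = 1.0 - p.sum()
                for s2 in np.argsort(V):           # worst successors first
                    add = min(P_hi[s, a, s2] - p[s2], slack)
                    p[s2] += add
                    slack -= add
                Q[s, a] = R[s, a] + gamma * (p @ V)
        V = Q.max(axis=1)
    return V
```

When P_lo equals P_hi this reduces to standard value iteration; widening the intervals yields increasingly conservative value estimates.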
- …