Variational inference for policy search in changing situations
Many policy search algorithms fit their policy by minimizing the Kullback-Leibler (KL) divergence to a certain
target distribution. The commonly used KL-divergence forces the resulting
policy to be 'reward-attracted': the policy tries to reproduce all positively rewarded experience
while negative experience is neglected. However, the KL-divergence is not symmetric,
and we can also minimize the reversed KL-divergence, which is typically used in variational
inference. The policy then becomes 'cost-averse': it tries to avoid reproducing any negatively rewarded experience while maximizing exploration. Due to this 'cost-averseness' of the policy, Variational Inference for Policy Search (VIP) has several interesting properties. It requires neither a kernel bandwidth nor an exploration rate; such settings are
determined automatically by the inference. The algorithm matches the performance of state-of-the-art
methods while being applicable to learning in multiple situations simultaneously. We concentrate on using VIP for policy search in robotics. We apply our algorithm to learn dynamic counterbalancing of different kinds of
pushes with human-like 2-link and 4-link robots.
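
The asymmetry of the KL divergence mentioned in the abstract can be seen in a toy setting. The sketch below is not the VIP algorithm; it only fits a single Gaussian q to a hypothetical bimodal target p by brute-force grid search, once minimizing KL(p || q) and once minimizing the reversed KL(q || p). The target, the grids, and all function names are assumptions made for illustration.

```python
import numpy as np

def log_gauss(x, mean, std):
    """Log-density of a 1-D Gaussian evaluated at x."""
    return -0.5 * ((x - mean) / std) ** 2 - np.log(std) - 0.5 * np.log(2.0 * np.pi)

def kl(log_a, log_b, dx):
    """Riemann-sum approximation of KL(a || b) on a uniform grid."""
    return float(np.sum(np.exp(log_a) * (log_a - log_b)) * dx)

x = np.linspace(-10.0, 10.0, 2001)
dx = x[1] - x[0]

# Hypothetical bimodal target: equal mixture of N(-2, 0.5^2) and N(2, 0.5^2).
log_p = np.logaddexp(log_gauss(x, -2.0, 0.5), log_gauss(x, 2.0, 0.5)) - np.log(2.0)

def fit(direction):
    """Grid search for the Gaussian q that minimizes the chosen KL direction."""
    best_value, best_mean, best_std = np.inf, None, None
    for mean in np.linspace(-3.0, 3.0, 61):
        for std in np.linspace(0.3, 3.0, 28):
            log_q = log_gauss(x, mean, std)
            if direction == "forward":
                value = kl(log_p, log_q, dx)   # KL(p || q)
            else:
                value = kl(log_q, log_p, dx)   # KL(q || p), the reversed form
            if value < best_value:
                best_value, best_mean, best_std = value, mean, std
    return best_mean, best_std

fwd_mean, fwd_std = fit("forward")
rev_mean, rev_std = fit("reverse")
print("KL(p || q) fit: mean=%.2f std=%.2f" % (fwd_mean, fwd_std))
print("KL(q || p) fit: mean=%.2f std=%.2f" % (rev_mean, rev_std))
```

In this toy problem the forward fit spreads over both modes, while the reversed fit settles on a single mode and avoids placing mass where the target density is low, which is loosely analogous to the 'cost-averse' behaviour described above.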
Abstraction in decision-makers with limited information processing capabilities
A distinctive property of human and animal intelligence is the ability to
form abstractions by neglecting irrelevant information, which makes it possible to separate
structure from noise. From an information-theoretic point of view, abstractions
are desirable because they allow for very efficient information processing. In
artificial systems, abstractions are often implemented through computationally
costly formations of groups or clusters. In this work we establish the relation
between the free-energy framework for decision making and rate-distortion
theory and demonstrate how the application of rate-distortion theory to
decision-making leads to the emergence of abstractions. We argue that
abstractions are induced by a limit in information processing capacity.
Comment: Presented at the NIPS 2013 Workshop on Planning with Information
Constraint
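
A rough way to see how a capacity limit can induce abstraction is a Blahut-Arimoto-style iteration, a standard method for rate-distortion problems; the abstract does not say which procedure the authors use, so this is only an illustrative sketch. The utility matrix, the parameter `beta` (an inverse temperature trading utility against information), and all function names are hypothetical.

```python
import numpy as np

def bounded_rational_policy(utility, p_w, beta, n_iter=200):
    """Alternate between p(a|w) proportional to p(a) * exp(beta * U(w, a))
    and the marginal p(a) = sum_w p(w) p(a|w)."""
    n_actions = utility.shape[1]
    p_a = np.full(n_actions, 1.0 / n_actions)
    for _ in range(n_iter):
        weights = p_a[None, :] * np.exp(beta * utility)
        p_a_given_w = weights / weights.sum(axis=1, keepdims=True)
        p_a = p_w @ p_a_given_w
    return p_a_given_w, p_a

def mutual_information(p_a_given_w, p_a, p_w):
    """I(W; A) in nats, skipping zero-probability entries."""
    joint = p_w[:, None] * p_a_given_w
    marginal = np.broadcast_to(p_a, joint.shape)
    mask = joint > 0
    return float(np.sum(joint[mask] * np.log(p_a_given_w[mask] / marginal[mask])))

# Hypothetical task: four world states, four specific actions (utility 1 in
# exactly one state) and two generic actions (utility 0.6 in a pair of states).
utility = np.array([
    [1.0, 0.0, 0.0, 0.0, 0.6, 0.0],
    [0.0, 1.0, 0.0, 0.0, 0.6, 0.0],
    [0.0, 0.0, 1.0, 0.0, 0.0, 0.6],
    [0.0, 0.0, 0.0, 1.0, 0.0, 0.6],
])
p_w = np.full(4, 0.25)

policy_low, p_a_low = bounded_rational_policy(utility, p_w, beta=0.1)
policy_high, p_a_high = bounded_rational_policy(utility, p_w, beta=20.0)
mi_low = mutual_information(policy_low, p_a_low, p_w)
mi_high = mutual_information(policy_high, p_a_high, p_w)
print("I(W;A) at beta=0.1: %.4f nats" % mi_low)
print("I(W;A) at beta=20:  %.4f nats" % mi_high)
```

With a small `beta` the policy barely depends on the world state, so different states are treated alike; with a large `beta` the policy distinguishes every state. Intermediate values can be explored to see when the generic actions are preferred.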
Doubly Stochastic Variational Inference for Deep Gaussian Processes
Gaussian processes (GPs) are a good choice for function approximation as they
are flexible, robust to over-fitting, and provide well-calibrated predictive
uncertainty. Deep Gaussian processes (DGPs) are multi-layer generalisations of
GPs, but inference in these models has proved challenging. Existing approaches
to inference in DGP models assume approximate posteriors that force
independence between the layers, and do not work well in practice. We present a
doubly stochastic variational inference algorithm, which does not force
independence between layers. With our method of inference we demonstrate that a
DGP model can be used effectively on data ranging in size from hundreds to a
billion points. We provide strong empirical evidence that our inference scheme
for DGPs works well in practice in both classification and regression.
Comment: NIPS 201