205,977 research outputs found
Reinforcement learning or active inference?
This paper questions the need for reinforcement learning or control theory when optimising behaviour. We show that it is fairly simple to teach an agent complicated and adaptive behaviours using a free-energy formulation of perception. In this formulation, agents adjust their internal states and sampling of the environment to minimize their free-energy. Such agents learn causal structure in the environment and sample it in an adaptive and self-supervised fashion. This results in behavioural policies that reproduce those optimised by reinforcement learning and dynamic programming. Critically, we do not need to invoke the notion of reward, value or utility. We illustrate these points by solving a benchmark problem in dynamic programming; namely the mountain-car problem, using active perception or inference under the free-energy principle. The ensuing proof-of-concept may be important because the free-energy formulation furnishes a unified account of both action and perception and may speak to a reappraisal of the role of dopamine in the brain
Learning a Policy for Opportunistic Active Learning
Active learning identifies data points to label that are expected to be the
most useful in improving a supervised model. Opportunistic active learning
incorporates active learning into interactive tasks that constrain possible
queries during interactions. Prior work has shown that opportunistic active
learning can be used to improve grounding of natural language descriptions in
an interactive object retrieval task. In this work, we use reinforcement
learning for such an object retrieval task, to learn a policy that effectively
trades off task completion with model improvement that would benefit future
tasks.Comment: EMNLP 2018 Camera Read
Learning how to Active Learn: A Deep Reinforcement Learning Approach
Active learning aims to select a small subset of data for annotation such
that a classifier learned on the data is highly accurate. This is usually done
using heuristic selection methods, however the effectiveness of such methods is
limited and moreover, the performance of heuristics varies between datasets. To
address these shortcomings, we introduce a novel formulation by reframing the
active learning as a reinforcement learning problem and explicitly learning a
data selection policy, where the policy takes the role of the active learning
heuristic. Importantly, our method allows the selection policy learned using
simulation on one language to be transferred to other languages. We demonstrate
our method using cross-lingual named entity recognition, observing uniform
improvements over traditional active learning.Comment: To appear in EMNLP 201
- …