Active Bayesian perception and reinforcement learning
In a series of papers, we have formalized an active Bayesian perception approach for robotics based on recent progress in understanding animal perception. However, an issue for applied robot perception is how to tune this method to a task, using: (i) a belief threshold that adjusts the speed-accuracy tradeoff; and (ii) an active control strategy for relocating the sensor, e.g. to a preset fixation point. Here we propose that these two variables should be learnt by reinforcement from a reward signal that evaluates the decision outcome. We test this claim with a biomimetic fingertip that senses surface curvature under uncertainty about contact location. Appropriate formulation of the problem allows use of multi-armed bandit methods to optimize the threshold and fixation point of the active perception. In consequence, the system learns to balance speed versus accuracy and sets the fixation point to optimize both quantities. Although we consider one example in robot touch, we expect that the underlying principles have general applicability.
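The bandit formulation can be sketched in a few lines. The sketch below is an illustrative assumption, not the authors' exact setup: each arm is a candidate (belief threshold, fixation point) pair, the reward function trades accuracy against decision time, and a simple epsilon-greedy rule selects arms.

```python
import random

def run_bandit(arms, reward_fn, n_trials=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy multi-armed bandit: each arm is a (belief_threshold,
    fixation_point) pair; reward_fn scores a decision outcome for that setting."""
    rng = random.Random(seed)
    counts = [0] * len(arms)
    values = [0.0] * len(arms)  # running mean reward per arm
    for _ in range(n_trials):
        if rng.random() < epsilon:
            i = rng.randrange(len(arms))                        # explore
        else:
            i = max(range(len(arms)), key=values.__getitem__)   # exploit
        r = reward_fn(arms[i], rng)
        counts[i] += 1
        values[i] += (r - values[i]) / counts[i]  # incremental mean update
    best = max(range(len(arms)), key=values.__getitem__)
    return arms[best], values

# Hypothetical reward model: accuracy improves with the belief threshold and
# peaks when the fixation point is central, while time cost grows with the
# threshold (higher threshold = slower, more evidence gathered).
def reward_fn(arm, rng):
    threshold, fixation = arm
    accuracy = threshold - abs(fixation - 0.5)
    time_cost = 0.3 * threshold
    return accuracy - time_cost + rng.gauss(0, 0.05)  # noisy outcome

arms = [(t, f) for t in (0.7, 0.9, 0.99) for f in (0.2, 0.5, 0.8)]
best_arm, _ = run_bandit(arms, reward_fn)
# The bandit settles on the high-threshold, central-fixation arm.
```

Under this toy reward, the learnt arm jointly sets the speed-accuracy tradeoff and the fixation point, mirroring the role the two tuned variables play in the paper.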
Reinforcement learning or active inference?
This paper questions the need for reinforcement learning or control theory when optimising behaviour. We show that it is fairly simple to teach an agent complicated and adaptive behaviours using a free-energy formulation of perception. In this formulation, agents adjust their internal states and sampling of the environment to minimise their free energy. Such agents learn the causal structure of the environment and sample it in an adaptive and self-supervised fashion. This results in behavioural policies that reproduce those optimised by reinforcement learning and dynamic programming. Critically, we do not need to invoke the notion of reward, value or utility. We illustrate these points by solving a benchmark problem in dynamic programming, namely the mountain-car problem, using active perception or inference under the free-energy principle. The ensuing proof of concept may be important because the free-energy formulation furnishes a unified account of both action and perception and may speak to a reappraisal of the role of dopamine in the brain.
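The mountain-car benchmark can be stated compactly. The sketch below uses the standard discretised dynamics (an assumption; the paper's own agent minimises free energy rather than following a hand-coded policy) to show why the task demands anticipatory behaviour: the car is underpowered, so it must first move away from the goal to build momentum.

```python
import math

def step(x, v, a):
    """One step of the standard mountain-car dynamics (action a in {-1, 0, 1})."""
    v = min(max(v + 0.001 * a - 0.0025 * math.cos(3 * x), -0.07), 0.07)
    x = min(max(x + v, -1.2), 0.6)
    if x == -1.2:
        v = max(v, 0.0)  # inelastic wall at the left boundary
    return x, v

def run(policy, x=-0.5, v=0.0, max_steps=500):
    """Return the step at which the goal (x >= 0.5) is reached, else None."""
    for t in range(max_steps):
        if x >= 0.5:
            return t
        x, v = step(x, v, policy(x, v))
    return None

always_forward = lambda x, v: 1                     # greedy: push toward goal
with_momentum = lambda x, v: 1 if v >= 0 else -1    # push along current velocity

run(always_forward)  # underpowered: never reaches the goal
run(with_momentum)   # rocking back and forth builds momentum and succeeds
```

The second policy pumps energy into the oscillation on every step, which is the qualitative behaviour any successful solution (value-based or free-energy-based) must discover.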
Embodied Question Answering
We present a new AI task -- Embodied Question Answering (EmbodiedQA) -- where an agent is spawned at a random location in a 3D environment and asked a question ("What color is the car?"). In order to answer, the agent must first intelligently navigate to explore the environment, gather information through first-person (egocentric) vision, and then answer the question ("orange"). This challenging task requires a range of AI skills -- active perception, language understanding, goal-driven navigation, commonsense reasoning, and grounding of language into actions. In this work, we develop the environments, end-to-end-trained reinforcement learning agents, and evaluation protocols for EmbodiedQA.
Comment: 20 pages, 13 figures. Webpage: https://embodiedqa.org
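The perceive-navigate-answer loop described above can be sketched as follows. The environment and policy here are toy stand-ins (a 1-D corridor and a trivial "move forward" rule), not the authors' 3D houses or trained models; the point is only the structure of an EmbodiedQA episode.

```python
class GridEnv:
    """Toy stand-in for a 3D house: the answer becomes visible at cell `target`."""
    def __init__(self, target=3):
        self.pos, self.target = 0, target

    def observe(self):
        # Egocentric "view": None until the relevant object is in sight.
        return "orange" if self.pos == self.target else None

    def act(self, action):
        self.pos += 1 if action == "forward" else 0

def embodied_qa(env, question, max_steps=10):
    """Navigate until the first-person view contains the answer, then answer."""
    for _ in range(max_steps):
        view = env.observe()
        if view is not None:      # grounding: answer is visible in the view
            return view
        env.act("forward")        # goal-driven navigation (here: trivial policy)
    return "unknown"

answer = embodied_qa(GridEnv(), "What color is the car?")
# answer == "orange"
```

In the real task the navigation policy is learnt end-to-end by reinforcement, and the answer is produced by a question-answering module over the gathered egocentric frames.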
Intrinsically Motivated Learning of Visual Motion Perception and Smooth Pursuit
We extend the framework of efficient coding, which has been used to model the development of sensory processing in isolation, to model the development of the perception/action cycle. Our extension combines sparse coding and reinforcement learning so that sensory processing and behavior co-develop to optimize a shared intrinsic motivational signal: the fidelity of the neural encoding of the sensory input under resource constraints. Applying this framework to a model system consisting of an active eye behaving in a time-varying environment, we find that this generic principle leads to the simultaneous development of both smooth pursuit behavior and model neurons whose properties are similar to those of primary visual cortical neurons selective for different directions of visual motion. We suggest that this general principle may form the basis for a unified and integrated explanation of many perception/action loops.
Comment: 6 pages, 5 figures
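The shared intrinsic signal can be illustrated with a toy calculation. In the sketch below (an illustrative assumption, not the paper's sparse-coding model), encoding fidelity degrades with retinal slip, so the action that maximises the intrinsic reward is precisely the one that tracks the stimulus, i.e. smooth pursuit.

```python
def retinal_slip(stimulus_velocity, eye_velocity):
    """Relative motion of the stimulus on the retina."""
    return stimulus_velocity - eye_velocity

def intrinsic_reward(slip, resource_cost=0.01):
    # Toy model: encoding fidelity falls off quadratically with slip,
    # and a constant term stands in for the resource constraint.
    return -(slip ** 2) - resource_cost

stimulus_v = 2.0
candidate_eye_velocities = [0.0, 1.0, 2.0, 3.0]
best = max(candidate_eye_velocities,
           key=lambda v: intrinsic_reward(retinal_slip(stimulus_v, v)))
# best == 2.0: matching the stimulus velocity maximises coding fidelity,
# so pursuit behavior emerges without any task-specific reward.
```

In the full model the same scalar is also the gradient signal for the sparse-coding dictionary, which is why sensory tuning and pursuit behavior co-develop.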
Deep Active Inference for Partially Observable MDPs
Deep active inference has been proposed as a scalable approach to perception and action that deals with large policy and state spaces. However, current models are limited to fully observable domains. In this paper, we describe a deep active inference model that can learn successful policies directly from high-dimensional sensory inputs. The deep learning architecture optimizes a variant of the expected free energy and encodes the continuous state representation by means of a variational autoencoder. We show, on the OpenAI benchmark, that our approach has comparable or better performance than deep Q-learning, a state-of-the-art deep reinforcement learning algorithm.
Comment: 1st International Workshop on Active Inference, European Conference on Machine Learning (ECML/PKDD 2020)
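The free-energy quantity underlying these models is easy to compute in the discrete case. The sketch below evaluates the variational free energy F = E_q[log q(s) - log p(o, s)] for a two-state example (the distributions are illustrative; the paper's model amortises this with a variational autoencoder over continuous states). F is minimised by the exact posterior, where it equals the surprise -log p(o).

```python
import math

def free_energy(q, prior, likelihood, obs):
    """Variational free energy for a discrete latent state s.
    q, prior: dicts state -> probability; likelihood[s][o] = p(o | s)."""
    return sum(q[s] * (math.log(q[s]) - math.log(prior[s] * likelihood[s][obs]))
               for s in q if q[s] > 0)

prior = {"A": 0.5, "B": 0.5}
likelihood = {"A": {"o1": 0.8, "o2": 0.2},
              "B": {"o1": 0.3, "o2": 0.7}}
obs = "o1"

# Exact Bayesian posterior via normalisation of prior * likelihood.
evidence = sum(prior[s] * likelihood[s][obs] for s in prior)   # p(o) = 0.55
posterior = {s: prior[s] * likelihood[s][obs] / evidence for s in prior}

F_post = free_energy(posterior, prior, likelihood, obs)  # equals -log p(o)
F_prior = free_energy(prior, prior, likelihood, obs)     # strictly larger
```

Because F = KL(q || p(s|o)) - log p(o), any approximate q scores worse than the posterior, which is the sense in which minimising free energy performs perception; the "expected" variant in the paper extends the same quantity over future policies.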