Search CORE

2,505 research outputs found

Reinforcement learning or active inference?

This paper questions the need for reinforcement learning or control theory when optimising behaviour. We show that it is fairly simple to teach an agent complicated and adaptive behaviours using a free-energy formulation of perception. In this formulation, agents adjust their internal states and sampling of the environment to minimize their free-energy. Such agents learn causal structure in the environment and sample it in an adaptive and self-supervised fashion. This results in behavioural policies that reproduce those optimised by reinforcement learning and dynamic programming. Critically, we do not need to invoke the notion of reward, value or utility. We illustrate these points by solving a benchmark problem in dynamic programming, namely the mountain-car problem, using active perception or inference under the free-energy principle. The ensuing proof-of-concept may be important because the free-energy formulation furnishes a unified account of both action and perception and may speak to a reappraisal of the role of dopamine in the brain.
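For readers unfamiliar with the mountain-car benchmark named in this abstract, the following is a minimal sketch of the task in its common textbook form. The constants, the discrete action set, and the two hand-written policies are assumptions for illustration; the paper's own landscape and its free-energy agent are not reproduced here. The sketch only shows why the task is a useful benchmark: the engine is too weak to climb the hill directly, so the car must first move away from the goal to gain momentum.

```python
import math

# Assumed textbook constants; not taken from the paper itself.
X_MIN, X_MAX = -1.2, 0.5   # track limits; the goal sits at X_MAX
V_MAX = 0.07               # speed limit in either direction


def step(x, v, action):
    """Advance one time step; action is -1 (reverse), 0 (coast) or +1 (forward)."""
    v += 0.001 * action - 0.0025 * math.cos(3 * x)  # weak engine plus gravity
    v = max(-V_MAX, min(V_MAX, v))
    x += v
    if x <= X_MIN:          # inelastic wall on the left
        x, v = X_MIN, 0.0
    return x, v


def run(policy, x=-0.5, v=0.0, max_steps=1000):
    """Return the number of steps taken to reach the goal, or None on failure."""
    for t in range(1, max_steps + 1):
        x, v = step(x, v, policy(x, v))
        if x >= X_MAX:
            return t
    return None


def full_throttle(x, v):
    """Hypothetical naive policy: always drive towards the goal."""
    return 1


def pump_energy(x, v):
    """Hypothetical heuristic: push in the direction of travel to build momentum."""
    return 1 if v >= 0 else -1


if __name__ == "__main__":
    print("full throttle:", run(full_throttle))
    print("energy pumping:", run(pump_energy))
```

Under these assumed dynamics the naive policy stalls partway up the slope, while the momentum-building heuristic reaches the goal.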

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

A brief review of neural networks based learning and control and their applications for robots

Author: Jiang Yiming
Li Guang
Li Yanan
Na Jing
Yang Chenguang
Zhong Junpei
Publication venue: Hindawi Limited
Publication date: 01/01/2017

As an imitation of biological nervous systems, neural networks (NNs), which are characterized by a powerful learning ability, have been employed in a wide range of applications, such as control of complex nonlinear systems, optimization, system identification and pattern recognition. This article aims to give a brief review of state-of-the-art NNs for complex nonlinear systems. Recent progress in NNs, in both theoretical developments and practical applications, is investigated and surveyed. Specifically, NN-based robot learning and control applications are further reviewed, including NN-based robot manipulator control, NN-based human-robot interaction, and NN-based behavior recognition and generation.

Crossref

Directory of Open Access Journals

Queen Mary Research Online

Sussex Research Online

Embodied neuromorphic intelligence

Author: Bartolozzi Chiara
Donati Elisa
Indiveri Giacomo
Publication venue: Springer Science and Business Media LLC
Publication date: 23/02/2022

The design of robots that interact autonomously with the environment and exhibit complex behaviours is an open challenge that can benefit from understanding what makes living beings fit to act in the world. Neuromorphic engineering studies neural computational principles to develop technologies that can provide a computing substrate for building compact and low-power processing systems. We discuss why endowing robots with neuromorphic technologies – from perception to motor control – represents a promising approach for the creation of robots that can seamlessly integrate into society. We present initial attempts in this direction, highlight open challenges, and propose actions required to overcome current limitations.

ZORA

Motor Control and Learning Theories

Author: Alessandro Cristiano
Beckers Niek
Goebel Peter
González-Vargas José
Osu Rieko
Resquin Francisco
Publication venue: Springer
Publication date: 01/01/2016

University of Twente Research Information

Linear combination of one-step predictive information with an external reward in an episodic policy gradient setting: a critical analysis

Author: Ay Nihat
Martius Georg
Zahedi Keyan
Publication date: 01/01/2013

One of the main challenges in the field of embodied artificial intelligence is the open-ended autonomous learning of complex behaviours. Our approach is to use task-independent, information-driven intrinsic motivation(s) to support task-dependent learning. The work presented here is a preliminary step in which we investigate the predictive information (the mutual information of the past and future of the sensor stream) as an intrinsic drive, ideally supporting any kind of task acquisition. Previous experiments have shown that the predictive information (PI) is a good candidate to support autonomous, open-ended learning of complex behaviours, because a maximisation of the PI corresponds to an exploration of morphology- and environment-dependent behavioural regularities. The idea is that these regularities can then be exploited in order to solve any given task. Three different experiments are presented, and their results lead to the conclusion that the linear combination of the one-step PI with an external reward function is not generally recommended in an episodic policy gradient setting. Only for hard tasks can a great speed-up be achieved, at the cost of a loss in asymptotic performance.
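To make the quantity in this abstract concrete, here is a minimal sketch of a one-step predictive information estimate, that is, the mutual information between consecutive sensor values, together with a linear combination with an external reward. It assumes an already discretised sensor stream and a simple plug-in (frequency-count) estimator; the function names and the weight `beta` are hypothetical and are not taken from the paper, which may estimate the PI differently.

```python
from collections import Counter
from math import log2


def one_step_predictive_information(stream):
    """Plug-in estimate of I(S_t; S_{t+1}) in bits for a discrete sequence."""
    pairs = list(zip(stream[:-1], stream[1:]))  # (current, next) sensor values
    n = len(pairs)
    joint = Counter(pairs)
    past = Counter(a for a, _ in pairs)
    future = Counter(b for _, b in pairs)
    # Sum over observed pairs of p(a, b) * log2( p(a, b) / (p(a) * p(b)) ).
    return sum(
        (c / n) * log2(c * n / (past[a] * future[b]))
        for (a, b), c in joint.items()
    )


def combined_objective(reward, pi, beta):
    """Hypothetical linear combination of external reward and one-step PI."""
    return reward + beta * pi


if __name__ == "__main__":
    regular = [0, 1] * 50   # each value fully determines the next one
    constant = [0] * 100    # no variability, hence nothing to predict
    print(one_step_predictive_information(regular))
    print(one_step_predictive_information(constant))
```

A perfectly alternating binary stream yields close to one bit, while a constant stream yields zero, which reflects that the PI is high only when behaviour is both varied and predictable.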

arXiv.org e-Print Archive

Frontiers - Publisher Connector

Principles of sensorimotor learning.

Author: Daniel M. Wolpert
J. Randall Flanagan
Jörn Diedrichsen
Publication venue: Scholarship@Western
Publication date: 27/10/2011

The exploits of Martina Navratilova and Roger Federer represent the pinnacle of motor learning. However, when considering the range and complexity of the processes that are involved in motor learning, even the mere mortals among us exhibit abilities that are impressive. We exercise these abilities when taking up new activities - whether it is snowboarding or ballroom dancing - but also engage in substantial motor learning on a daily basis as we adapt to changes in our environment, manipulate new objects and refine existing skills. Here we review recent research in human motor learning with an emphasis on the computational mechanisms that are involved.

Scholarship@Western

Crossref

Behavioural robustness and the distributed mechanisms hypothesis

Author: Fernandez Leon Jose A
Publication date: 01/01/2011

A current challenge in neuroscience and systems biology is to better understand properties that allow organisms to exhibit and sustain appropriate behaviours despite the effects of perturbations (behavioural robustness). There are still significant theoretical difficulties in this endeavour, mainly due to the context-dependent nature of the problem. Biological robustness, in general, is considered in the literature as a property that emerges from the internal structure of organisms, rather than being a dynamical phenomenon involving agent-internal controls, the organism body, and the environment. Our hypothesis is that the capacity for behavioural robustness is rooted in dynamical processes that are distributed between agent ‘brain’, body, and environment, rather than warranted exclusively by organisms’ internal mechanisms. Distribution is operationally defined here based on perturbation analyses. Evolutionary Robotics (ER) techniques are used here to construct four computational models to study behavioural robustness from a systemic perspective. Dynamical systems theory provides the conceptual framework for these investigations. The first model evolves situated agents in a goal-seeking scenario in the presence of neural noise perturbations. Results suggest that evolution implicitly selects neural systems that are noise-resistant during coupling behaviour by concentrating search in regions of the fitness landscape that retain functionality for goal approaching. The second model evolves situated, dynamically limited agents exhibiting minimal-cognitive behaviour (categorization task). Results indicate a small but significant tendency toward better performance under most types of perturbations by agents showing further cognitive-behavioural dependency on their environments. The third model evolves experience-dependent robust behaviour in embodied, one-legged walking agents. Evidence suggests that robustness is rooted in both internal and external dynamics, but robust motion always emerges from the system-in-coupling. The fourth model implements a historically dependent, mobile-object tracking task under sensorimotor perturbations. Results indicate two different modes of distribution: one in which inner controls necessarily depend on a set of specific environmental factors to exhibit behaviour, making these controls more vulnerable to perturbations on that set, and another in which these factors are equally sufficient for behaviours. Vulnerability to perturbations depends on the particular distribution. In contrast to most existing approaches to the study of robustness, this thesis argues that behavioural robustness is better understood in the context of agent-environment dynamical couplings, not in terms of internal mechanisms. Such couplings, however, are not always the full determinants of robustness. Challenges and limitations of our approach are also identified for future studies.

Sussex Research Online

OpenGrey Repository