
    Planning with neural networks and reinforcement learning

    This thesis presents the design, implementation and investigation of predictive-planning controllers built with neural networks and inspired by Dyna-PI architectures (Sutton, 1990). Dyna-PI architectures are planning systems based on actor-critic reinforcement learning methods and a model of the environment. The controllers are tested with a simulated robot that solves a stochastic path-finding landmark navigation task. A critical review of ideas and models proposed in the literature on problem solving, planning, reinforcement learning, and neural networks precedes the presentation of the controllers, isolating the ideas relevant to the design of planners based on neural networks. A "neural forward planner" is implemented that, unlike the Dyna-PI architectures, is taskable in a strong sense. This planner builds a "partial policy" focussed around efficient start-goal paths, and can decide to re-plan if "unexpected" states are encountered. Planning iteratively generates "chains of predictions" starting from the current state and using the model of the environment; this model consists of neural networks trained to predict the next input when an action is executed. A "neural bidirectional planner" that generates trajectories backward from the goal and forward from the current state is also implemented. This planner exploits the knowledge (image) of the goal, further focuses planning around efficient start-goal paths, and updates evaluations more quickly. In several experiments the generalisation capacity of neural networks proves important for learning, but it also causes problems of interference. To deal with these problems a modular neural architecture is implemented that uses a mixture-of-experts network for the critic and a simple hierarchical modular network for the actor. The research also implements a simple form of neural abstract planning named "coarse planning", and investigates its strengths in terms of exploration and the updating of evaluations. Some experiments with coarse planning and with the other controllers suggest that discounted reinforcement learning may have problems dealing with long-lasting tasks.
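    The planning mechanism described above (imagined "chains of predictions" rolled out through a model of the environment and used to update the critic's evaluations and the actor's action preferences) can be illustrated with a minimal Python sketch. The toy grid, the deterministic world_model stand-in and the update constants are illustrative assumptions, not components of the thesis:

        # Minimal sketch of Dyna-style forward planning with a learned world model.
        # All names (world_model, reward_fn, the toy grid) are illustrative
        # placeholders and not the thesis' actual components.
        import numpy as np

        rng = np.random.default_rng(0)
        N_STATES, N_ACTIONS = 25, 4                  # toy 5x5 grid, 4 moves
        GAMMA, ALPHA = 0.95, 0.1

        V = np.zeros(N_STATES)                                     # critic: state evaluations
        policy = np.full((N_STATES, N_ACTIONS), 1.0 / N_ACTIONS)   # actor: action probabilities

        def world_model(state, action):
            """Stand-in for the neural networks that predict the next input
            given the current input and an action (here: a deterministic grid)."""
            r, c = divmod(int(state), 5)
            dr, dc = [(-1, 0), (1, 0), (0, -1), (0, 1)][action]
            return min(max(r + dr, 0), 4) * 5 + min(max(c + dc, 0), 4)

        def reward_fn(state):
            return 1.0 if state == N_STATES - 1 else 0.0           # goal in a corner

        def plan_forward(start, horizon=20, n_chains=50):
            """Generate 'chains of predictions' from the current state and use them
            to update evaluations and action preferences, without acting."""
            for _ in range(n_chains):
                s = start
                for _ in range(horizon):
                    a = rng.choice(N_ACTIONS, p=policy[s])
                    s_next = world_model(s, a)
                    td = reward_fn(s_next) + GAMMA * V[s_next] - V[s]
                    V[s] += ALPHA * td
                    policy[s, a] += 0.05 * td                      # reinforce useful actions
                    policy[s] = np.clip(policy[s], 1e-3, None)
                    policy[s] /= policy[s].sum()
                    s = s_next

        plan_forward(start=0)
        print("evaluation of the start state after planning:", round(float(V[0]), 3))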

    Reinforcement learning algorithms that assimilate and accommodate skills with multiple tasks

    Children are capable of acquiring a large repertoire of motor skills and of efficiently adapting them to novel conditions. In previous work we proposed a hierarchical modular reinforcement learning model (RANK) that can learn multiple motor skills in continuous action and state spaces. The model is based on the mixture-of-experts model, suitably adapted to work with reinforcement learning: a high-level gating network assigns responsibilities for acting and for learning to a set of low-level expert networks. The model was also developed with the goal of exploiting the Piagetian mechanisms of assimilation and accommodation to support the learning of multiple tasks. This paper proposes a new model (TERL - Transfer Expert Reinforcement Learning) that substantially improves on RANK. The key difference with respect to the previous model is the decoupling of the mechanisms that generate the experts' responsibility signals for learning and for control, which makes it possible to satisfy different constraints for functioning and for learning. We test both the TERL and the RANK models with a two-DOF dynamic arm engaged in solving multiple reaching tasks, and compare them with a simple, flat reinforcement learning model. The results show that both models are capable of exploiting assimilation and accommodation processes to transfer knowledge between similar tasks while avoiding catastrophic interference. Furthermore, the TERL model significantly outperforms the RANK model thanks to its faster and more stable specialization of experts.
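    The key architectural point, separate responsibility signals for control and for learning, can be sketched as follows. The two-temperature softmax over gating scores is purely an assumption made for illustration and is not the actual TERL formulation:

        # Minimal sketch of decoupled expert responsibilities for control and for
        # learning in a mixture-of-experts gating scheme. The two-temperature
        # softmax is an assumption made for illustration, not the TERL equations.
        import numpy as np

        def softmax(scores, temperature):
            z = np.exp((scores - scores.max()) / temperature)
            return z / z.sum()

        def expert_responsibilities(gating_scores, t_control=0.1, t_learning=1.0):
            """gating_scores: one gating score per expert for the current task.
            Returns separate responsibility vectors for acting and for updating."""
            control = softmax(gating_scores, t_control)    # sharp: one expert dominates control
            learning = softmax(gating_scores, t_learning)  # soft: similar experts share the update
            return control, learning

        scores = np.array([2.0, 1.5, -0.5])
        control_resp, learning_resp = expert_responsibilities(scores)
        print("control responsibilities: ", np.round(control_resp, 3))
        print("learning responsibilities:", np.round(learning_resp, 3))

    With the sharp softmax a single expert is effectively responsible for acting, while the softer learning responsibilities let similar experts share the update; this is one way to picture the assimilation/accommodation trade-off discussed above.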

    Forward and bidirectional planning based on reinforcement learning and neural networks in a simulated robot.

    Building intelligent systems that are capable of learning, acting reactively and planning actions before their execution is a major goal of artificial intelligence. This paper presents two reactive and planning systems that contain important novelties with respect to previous neural-network planners and reinforcement-learning based planners: (a) the introduction of a new component (the "matcher") allows both planners to execute genuine taskable planning (whereas previous reinforcement-learning based models used planning only to speed up learning); (b) the planners show for the first time that trained neural-network models of the world can generate long prediction chains that are interestingly robust to noise; (c) two novel algorithms are presented that generate chains of predictions in order to plan and that control the flow of information between the systems' different neural components; (d) one of the planners uses backward "predictions" to exploit the knowledge of the pursued goal; (e) the two systems integrate reactive behavior and planning on the basis of a measure of "confidence" in action. The soundness and potential of the two reactive and planning systems are tested and compared using a simulated robot engaged in a stochastic path-finding task. The paper also presents an extensive literature review of the relevant issues.
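    Two of the ideas above, the "matcher" that compares predicted states with the pursued goal and the "confidence" measure that decides between reacting and planning, can be sketched as follows. The toy corridor, thresholds and update rule are illustrative assumptions, not the paper's implementation:

        # Minimal sketch of a "matcher" that compares predicted states with the
        # pursued goal and of a "confidence" measure that decides between reacting
        # and planning. The toy corridor, thresholds and update rule are
        # illustrative assumptions, not the paper's implementation.
        import numpy as np

        rng = np.random.default_rng(1)
        N_STATES, N_ACTIONS, GOAL = 10, 2, 9          # toy 1-D corridor, actions: left/right
        GAMMA, ALPHA, CONF_THRESHOLD = 0.9, 0.2, 0.6

        policy = np.full((N_STATES, N_ACTIONS), 0.5)  # reactive actor
        V = np.zeros(N_STATES)                        # critic

        def world_model(s, a):                        # stand-in for the prediction networks
            return max(0, s - 1) if a == 0 else min(N_STATES - 1, s + 1)

        def matcher(predicted_state, goal):           # the goal can be changed at run time
            return predicted_state == goal

        def plan(start, n_chains=20, horizon=12):
            """Chains of predictions, rewarded only when the matcher detects the goal."""
            for _ in range(n_chains):
                s = start
                for _ in range(horizon):
                    a = rng.choice(N_ACTIONS, p=policy[s])
                    nxt = world_model(s, a)
                    td = float(matcher(nxt, GOAL)) + GAMMA * V[nxt] - V[s]
                    V[s] += ALPHA * td
                    policy[s, a] = max(policy[s, a] + 0.1 * td, 1e-3)
                    policy[s] /= policy[s].sum()
                    if matcher(nxt, GOAL):
                        break
                    s = nxt

        def control_step(s):
            """Act reactively when confident, plan before acting otherwise."""
            if policy[s].max() < CONF_THRESHOLD:      # low confidence in the action choice
                plan(s)
            return int(np.argmax(policy[s]))

        print("action chosen from state 0:", control_step(0))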