115 research outputs found
Pseudorehearsal in actor-critic agents with neural network function approximation
Catastrophic forgetting has a significant negative impact on reinforcement
learning. The purpose of this study is to investigate how pseudorehearsal can
change the performance of an actor-critic agent with neural-network function
approximation. We tested the agent in a pole-balancing task and compared
different pseudorehearsal approaches. We found that pseudorehearsal can assist
learning and decrease forgetting.
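As background, the core pseudorehearsal idea (not the specific actor-critic variants studied here) can be sketched: pseudo-items are generated by passing random inputs through the current network and recording its outputs, then interleaving them with new training data so that old input-output mappings are rehearsed without storing any real experience. A minimal sketch with a linear function approximator and illustrative values:

```python
import random

def predict(w, x):
    # Linear function approximator: y = w . x
    return sum(wi * xi for wi, xi in zip(w, x))

def make_pseudo_items(w, n, dim):
    # Pseudorehearsal: label random inputs with the CURRENT model's
    # outputs, approximating old knowledge without stored real data.
    items = []
    for _ in range(n):
        x = [random.uniform(-1.0, 1.0) for _ in range(dim)]
        items.append((x, predict(w, x)))
    return items

def sgd_step(w, x, target, lr=0.1):
    # One gradient step on squared error.
    err = predict(w, x) - target
    return [wi - lr * err * xi for wi, xi in zip(w, x)]

random.seed(0)
old_w = [1.0, -2.0]                     # weights holding "old" knowledge
pseudo = make_pseudo_items(old_w, 20, 2)

w = list(old_w)
for _ in range(200):
    w = sgd_step(w, [1.0, 1.0], 3.0)    # new experience
    x, y = random.choice(pseudo)        # interleaved pseudo-item
    w = sgd_step(w, x, y)
```

Interleaving the pseudo-items pulls the weights back toward the old mapping on inputs the new data never visits, which is the mechanism by which pseudorehearsal limits forgetting.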
Pseudorehearsal in value function approximation
Catastrophic forgetting is of special importance in reinforcement learning,
as the data distribution is generally non-stationary over time. We study and
compare several pseudorehearsal approaches for Q-learning with function
approximation in a pole balancing task. We found that pseudorehearsal
seems to assist learning even in such simple problems, given proper
initialization of the rehearsal parameters.
Self-adaptive node-based PCA encodings
In this paper we propose an algorithm, Simple Hebbian PCA, and prove that it
is able to calculate the principal component analysis (PCA) in a distributed
fashion across nodes. It simplifies existing network structures by removing
intralayer weights, essentially cutting the number of weights that need to be
trained in half.
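The Simple Hebbian PCA update itself is not reproduced here; as background for Hebbian PCA without explicit intralayer machinery, the classical single-unit rule such algorithms relate to (Oja's rule) can be sketched, with synthetic data whose dominant direction is an assumption of the example:

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def oja_step(w, x, lr=0.01):
    # Oja's rule: a Hebbian update with implicit weight normalisation;
    # the weight vector converges to the first principal component
    # of the input stream.
    y = dot(w, x)
    return [wi + lr * (y * xi - y * y * wi) for wi, xi in zip(w, x)]

random.seed(1)
w = [0.5, 0.5]
true_pc = [0.8, 0.6]   # assumed dominant direction of the synthetic data
for _ in range(5000):
    s = random.gauss(0.0, 1.0)
    x = [s * true_pc[0] + random.gauss(0.0, 0.05),
         s * true_pc[1] + random.gauss(0.0, 0.05)]
    w = oja_step(w, x)
```

After training, `w` aligns with `true_pc` up to sign and has approximately unit norm, without any explicit normalisation step.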
Towards Sensorimotor Coupling of a Spiking Neural Network and Deep Reinforcement Learning for Robotics Application
Deep reinforcement learning augments the reinforcement learning framework with the powerful representations of deep neural networks. Recent works have demonstrated the achievements of deep reinforcement learning in various domains including finance, medicine, healthcare, video games, robotics, and computer vision. Deep neural networks started with the multi-layer perceptron (1st generation), developed into deep neural networks (2nd generation), and are now moving toward spiking neural networks, known as the 3rd generation of neural networks. Spiking neural networks aim to bridge the gap between neuroscience and machine learning, using biologically realistic models of neurons to carry out computation. In this thesis, we first provide a comprehensive review of both spiking neural networks and deep reinforcement learning, with an emphasis on robotic applications. We then demonstrate how to develop a robotics application for context-aware scene understanding that performs sensorimotor coupling. Our system contains two modules, corresponding to scene understanding and robotic navigation. The first module is implemented as a spiking neural network that carries out semantic segmentation to understand the scene in front of the robot. The second module provides high-level navigation commands to the robot, which is treated as an agent and implemented by online reinforcement learning. This module uses a biologically plausible local learning rule that allows the agent to adapt quickly to the environment. To benchmark our system, we tested the first module on the Oxford-IIIT Pet dataset and the second module on a custom-made Gym environment. Our experimental results show that our system achieves competitive results with deep neural networks on the segmentation task and adapts quickly to the environment.
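As a concrete illustration of the biologically realistic neuron models mentioned above (not the thesis's actual segmentation network), a leaky integrate-and-fire neuron can be sketched; all parameter values here are illustrative:

```python
def lif_step(v, i_in, tau=20.0, v_th=1.0, v_reset=0.0, dt=1.0):
    # Leaky integrate-and-fire: the membrane potential leaks toward rest
    # while integrating input current; crossing the threshold emits a
    # spike and resets the potential.
    v = v + dt * (-v + i_in) / tau
    if v >= v_th:
        return v_reset, 1
    return v, 0

v, spikes = 0.0, []
for _ in range(100):
    v, s = lif_step(v, i_in=1.5)   # constant suprathreshold input
    spikes.append(s)
```

Under constant suprathreshold input the neuron fires at a regular rate, which is the spike-based code that spiking networks compute with instead of continuous activations.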
Sequential learning in the form of shaping as a source of cognitive flexibility
Humans and animals have the ability to quickly learn new tasks, a rapidity that is
unlikely to be manageable by pure trial and error learning on each task separately.
Instead, key to this rapid adaptability appears to be the ability to integrate skills and
knowledge obtained from previous tasks. This is assumed for example in the sequential
build-up of curricula in education, and has been employed in training animals for
behavioural experiments at least since the initial work on shaping by Skinner in 1938.
Despite its importance to natural learning, from a computational neuroscience point
of view the question of sequential learning of tasks has largely been ignored. Instead,
learning algorithms have often been devised that are capable of learning from an initial
naive state. However, it is known that simply training sequentially with the same
algorithms can often harm learning through interference, rather than enhance it.
In this thesis, we explore the effects of sequential training in the form of shaping in
the cognitive domain. We consider abstract, yet neurally inspired, learning models and
propose extensions and requirements to ensure that shaping is beneficial.
We take the 12-AX task, a hierarchical working memory task with rich structure, define
a shaping sequence to break the hierarchical structure of the task into separate smaller
and simpler tasks, and compare performance between learning the task in one fell swoop
to that of learning it with the help of shaping.
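For reference, the standard formulation of the 12-AX target rule can be sketched as follows (the function name and symbol encoding are illustrative, not taken from the thesis):

```python
def target_12ax(seq):
    # 12-AX: the most recent digit cue selects the active rule
    # (1 -> respond to an A-X pair, 2 -> respond to a B-Y pair).
    # The target response 'R' fires on the second symbol of the
    # active inner pair; every other symbol gets the default 'L'.
    out, cue, prev = [], None, None
    for s in seq:
        if s in '12':
            cue = s
        if (cue == '1' and prev == 'A' and s == 'X') or \
           (cue == '2' and prev == 'B' and s == 'Y'):
            out.append('R')
        else:
            out.append('L')
        prev = s
    return out

print(target_12ax('1AXBY'))  # → ['L', 'L', 'R', 'L', 'L']
```

The hierarchy is visible in the code: the outer loop (digit cue) gates which inner pair counts as a target, and a shaping curriculum can train the inner pairs before introducing the outer gating.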
Using a Long Short-Term Memory (LSTM) network model, we show that learning
times can be reduced substantially through shaping. Furthermore, other metrics such
as forms of abstraction and generalisation may also show differential effects. Crucial
to this, though, is the ability to prevent interference, which we achieve through an
architectural extension in the form of "resource allocation".
Finally, we present initial, human behavioural data on the 12-AX task, showing that
humans can learn it in a single session. Nevertheless, the task is sufficiently challenging
to reveal interesting behavioural structure. This supports its use as a candidate to
probe computational aspects of cognitive learning, including shaping. Furthermore,
our data show that the shaping protocol used in the modelling studies can also improve
averaged asymptotic performance in humans.
Overall, we show the importance of taking sequential task learning into account, provided
there is additional architectural support. We propose and demonstrate candidates
for this support
- …