5 research outputs found

Pseudorehearsal in value function approximation

Author: A Robins
B Baddeley
CJ Watkins
J Gama
JL McClelland
JN Tsitsiklis
KP Murphy
M Frean
M Hattori
M McCloskey
R Coop
R Ratcliff
RJ Williams
RM French
RS Sutton
S Adam
Publication date: 21/03/2017

Catastrophic forgetting is of special importance in reinforcement learning, as the data distribution is generally non-stationary over time. We study and compare several pseudorehearsal approaches for Q-learning with function approximation in a pole balancing task. We found that pseudorehearsal appears to assist learning even in such simple problems, given proper initialization of the rehearsal parameters.

arXiv.org e-Print Archive

Crossref

Pseudorehearsal in actor-critic agents with neural network function approximation

Author: Marochko Vladimir
Johard Leonard
Mazzara Manuel
Longo Luca
Publication date: 19/02/2018

Catastrophic forgetting has a significant negative impact in reinforcement learning. The purpose of this study is to investigate how pseudorehearsal can change the performance of an actor-critic agent with neural-network function approximation. We tested the agent in a pole balancing task and compared different pseudorehearsal approaches. We found that pseudorehearsal can assist learning and decrease forgetting.

arXiv.org e-Print Archive

FigShare

Pseudorehearsal in actor-critic agents with neural network function approximation

Author: Johard Leonard
Longo Luca
Marochko Vladimir
Mazzara Manuel
Publication date: 01/01/2018

arXiv.org e-Print Archive

Crossref

Arrow@TUDublin

Achieving continual learning in deep neural networks through pseudo-rehearsal

Author: Atkinson Craig Robert
Publication venue: University of Otago Library
Publication date: 14/09/2020

Neural networks are very powerful computational models, capable of outperforming humans on a variety of tasks. However, unlike humans, these networks tend to catastrophically forget previous information when learning new information. This thesis aims to solve this catastrophic forgetting problem, so that a deep neural network model can sequentially learn a number of complex reinforcement learning tasks. The primary model proposed by this thesis, termed RePR, prevents catastrophic forgetting by introducing a generative model and a dual memory system. The generative model learns to produce data representative of previously seen tasks. This generated data is rehearsed, while learning a new task, through a process called pseudo-rehearsal. This process allows the network to learn the new task, without forgetting previous tasks. The dual memory system is used to split learning into two systems. The short-term system is only responsible for learning the new task through reinforcement learning and the long-term system is responsible for retaining knowledge of previous tasks, while being taught the new task by the short-term system. The RePR model was shown to learn and retain a short sequence of reinforcement tasks to above human performance levels. Additionally, RePR was found to substantially outcompete state-of-the-art solutions and prevent forgetting similarly to a model which rehearsed real data from previously learnt tasks. RePR achieved this without: increasing in memory size as the number of tasks expands; revisiting previously learnt tasks; or directly storing data from previous tasks. Further results showed that RePR could be improved by informing the generator which image features are most important to retention and that, when challenged by a longer sequence of tasks, RePR would typically demonstrate gradual forgetting rather than dramatic forgetting. Finally, results also demonstrated RePR can successfully be adapted to other deep reinforcement learning algorithms.

Te Tumu Eprints Repository