Search CORE

26,793 research outputs found

Evolutionary Algorithms for Reinforcement Learning

Author: Grefenstette J. J.
Moriarty D. E.
Schultz A. C.
Publication venue: 'AI Access Foundation'
Publication date: 01/06/2011
Field of study

There are two distinct approaches to solving reinforcement learning problems, namely, searching in value function space and searching in policy space. Temporal difference methods and evolutionary algorithms are well-known examples of these approaches. Kaelbling, Littman and Moore recently provided an informative survey of temporal difference methods. This article focuses on the application of evolutionary algorithms to the reinforcement learning problem, emphasizing alternative policy representations, credit assignment methods, and problem-specific genetic operators. Strengths and weaknesses of the evolutionary approach to reinforcement learning are presented, along with a survey of representative applications

arXiv.org e-Print Archive

Traffic Light Control Using Deep Policy-Gradient and Value-Function Based Reinforcement Learning

Author: Howley Enda
Mousavi Seyed Sajad
Schukat Michael
Publication venue
Publication date: 27/05/2017
Field of study

Recent advances in combining deep neural network architectures with reinforcement learning techniques have shown promising potential results in solving complex control problems with high dimensional state and action spaces. Inspired by these successes, in this paper, we build two kinds of reinforcement learning algorithms: deep policy-gradient and value-function based agents which can predict the best possible traffic signal for a traffic intersection. At each time step, these adaptive traffic light control agents receive a snapshot of the current state of a graphical traffic simulator and produce control signals. The policy-gradient based agent maps its observation directly to the control signal, however the value-function based agent first estimates values for all legal control signals. The agent then selects the optimal control action with the highest value. Our methods show promising results in a traffic network simulated in the SUMO traffic simulator, without suffering from instability issues during the training process

arXiv.org e-Print Archive

Access to Research at National University of Ireland, Galway

How Laminar Frontal Cortex and Basal Ganglia Circuits Interact to Control Planned and Reactive Saccades

Author: Brown Joshua
Bullock Daniel
Grossberg Stephen
Publication venue: Boston University Center for Adaptive Systems and Department of Cognitive and Neural Systems
Publication date: 01/10/2000
Field of study

The basal ganglia and frontal cortex together allow animals to learn adaptive responses that acquire rewards when prepotent reflexive responses are insufficient. Anatomical studies show a rich pattern of interactions between the basal ganglia and distinct frontal cortical layers. Analysis of the laminar circuitry of the frontal cortex, together with its interactions with the basal ganglia, motor thalamus, superior colliculus, and inferotemporal and parietal cortices, provides new insight into how these brain regions interact to learn and perform complexly conditioned behaviors. A neural model whose cortical component represents the frontal eye fields captures these interacting circuits. Simulations of the neural model illustrate how it provides a functional explanation of the dynamics of 17 physiologically identified cell types found in these areas. The model predicts how action planning or priming (in cortical layers III and VI) is dissociated from execution (in layer V), how a cue may serve either as a movement target or as a discriminative cue to move elsewhere, and how the basal ganglia help choose among competing actions. The model simulates neurophysiological, anatomical, and behavioral data about how monkeys perform saccadic eye movement tasks, including fixation; single saccade, overlap, gap, and memory-guided saccades; anti-saccades; and parallel search among distractors.Defense Advanced Research Projects Agency and the Office of Naval Research (N00014-95-l-0409, N00014-92-J-1309, N00014-95-1-0657); National Science Foundation (IRI-97-20333)

Boston University Institutional Repository (OpenBU)