26,793 research outputs found

    Evolutionary Algorithms for Reinforcement Learning

    Full text link
    There are two distinct approaches to solving reinforcement learning problems, namely, searching in value function space and searching in policy space. Temporal difference methods and evolutionary algorithms are well-known examples of these approaches. Kaelbling, Littman and Moore recently provided an informative survey of temporal difference methods. This article focuses on the application of evolutionary algorithms to the reinforcement learning problem, emphasizing alternative policy representations, credit assignment methods, and problem-specific genetic operators. Strengths and weaknesses of the evolutionary approach to reinforcement learning are presented, along with a survey of representative applications

    Traffic Light Control Using Deep Policy-Gradient and Value-Function Based Reinforcement Learning

    Full text link
    Recent advances in combining deep neural network architectures with reinforcement learning techniques have shown promising potential results in solving complex control problems with high dimensional state and action spaces. Inspired by these successes, in this paper, we build two kinds of reinforcement learning algorithms: deep policy-gradient and value-function based agents which can predict the best possible traffic signal for a traffic intersection. At each time step, these adaptive traffic light control agents receive a snapshot of the current state of a graphical traffic simulator and produce control signals. The policy-gradient based agent maps its observation directly to the control signal, however the value-function based agent first estimates values for all legal control signals. The agent then selects the optimal control action with the highest value. Our methods show promising results in a traffic network simulated in the SUMO traffic simulator, without suffering from instability issues during the training process

    How Laminar Frontal Cortex and Basal Ganglia Circuits Interact to Control Planned and Reactive Saccades

    Full text link
    The basal ganglia and frontal cortex together allow animals to learn adaptive responses that acquire rewards when prepotent reflexive responses are insufficient. Anatomical studies show a rich pattern of interactions between the basal ganglia and distinct frontal cortical layers. Analysis of the laminar circuitry of the frontal cortex, together with its interactions with the basal ganglia, motor thalamus, superior colliculus, and inferotemporal and parietal cortices, provides new insight into how these brain regions interact to learn and perform complexly conditioned behaviors. A neural model whose cortical component represents the frontal eye fields captures these interacting circuits. Simulations of the neural model illustrate how it provides a functional explanation of the dynamics of 17 physiologically identified cell types found in these areas. The model predicts how action planning or priming (in cortical layers III and VI) is dissociated from execution (in layer V), how a cue may serve either as a movement target or as a discriminative cue to move elsewhere, and how the basal ganglia help choose among competing actions. The model simulates neurophysiological, anatomical, and behavioral data about how monkeys perform saccadic eye movement tasks, including fixation; single saccade, overlap, gap, and memory-guided saccades; anti-saccades; and parallel search among distractors.Defense Advanced Research Projects Agency and the Office of Naval Research (N00014-95-l-0409, N00014-92-J-1309, N00014-95-1-0657); National Science Foundation (IRI-97-20333)
    • …
    corecore