15 research outputs found

    Improved automatic discovery of subgoals for options in hierarchical reinforcement learning

    Options have been shown to be a key step in extending reinforcement learning beyond low-level reactive systems to higher-level planning systems. Most of the options research involves hand-crafted options; there has been only very limited work on the automated discovery of options. We extend early work in automated option discovery with a flexible and robust method.
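
    The options framework this abstract builds on can be stated compactly: an option couples an initiation set, an internal policy, and a termination condition. Below is a minimal Python sketch of that triple, plus a helper that wraps a discovered subgoal state as an option; the class layout and the `option_for_subgoal` helper are illustrative assumptions, not the paper's implementation.

```python
# A minimal sketch of the options framework, assuming a discrete state/action
# space. The class layout and option_for_subgoal are illustrative, not the
# paper's implementation.
from dataclasses import dataclass
from typing import Callable, Set

State = int
Action = int

@dataclass
class Option:
    initiation_set: Set[State]             # states where the option may be invoked
    policy: Callable[[State], Action]      # internal policy followed while the option runs
    termination: Callable[[State], float]  # probability of terminating in each state

def option_for_subgoal(subgoal: State,
                       initiation_set: Set[State],
                       policy_to_subgoal: Callable[[State], Action]) -> Option:
    """Wrap a discovered subgoal as an option: run a policy toward the
    subgoal and terminate with certainty on arrival."""
    return Option(
        initiation_set=initiation_set,
        policy=policy_to_subgoal,
        termination=lambda s: 1.0 if s == subgoal else 0.0,
    )
```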

    A synthesis of reinforcement learning and robust control theory

    Department Head: Stephen B. Seidman. 2000 Summer. Includes bibliographical references (pages 227-231). The pursuit of control algorithms with improved performance drives the entire control research community, as well as large parts of the mathematics, engineering, and artificial intelligence research communities. A fundamental limitation on achieving control performance is the conflicting requirement of maintaining system stability. In general, the more aggressive the controller, the better the control performance, but also the closer the system is to instability. Robust control is a collection of theories, techniques, and tools that form one of the leading-edge approaches to control. Most controllers are designed not on the physical plant to be controlled but on a mathematical model of the plant; hence, these controllers often do not perform well on the physical plant and are sometimes unstable. Robust control overcomes this problem by adding uncertainty to the mathematical model. The result is a more general, less aggressive controller which performs well on both the model and the physical plant. However, the robust control method also sacrifices some control performance in order to achieve its guarantees of stability. Reinforcement-learning-based neural networks offer some distinct advantages for improving control performance. Their nonlinearity enables the neural network to implement a wider range of control functions, and their adaptability permits them to improve control performance via on-line, trial-and-error learning. However, neuro-control is typically plagued by a lack of stability guarantees. Even momentary instability cannot be tolerated in most physical plants, and thus the threat of instability prohibits the application of neuro-control in many situations. In this dissertation, we develop a stable neuro-control scheme by synthesizing the two fields of reinforcement learning and robust control theory. We provide a learning system with many of the advantages of neuro-control. Using functional uncertainty to represent the nonlinear and time-varying components of the neural networks, we apply robust control techniques to guarantee the stability of our neuro-controller. Our scheme provides stable control not only for a specific fixed-weight neural network, but also for a neuro-controller in which the weights are changing during learning. Furthermore, we apply our stable neuro-controller to several control tasks to demonstrate that the theoretical stability guarantee is readily applicable to real-life control situations. We also discuss several problems we encounter and identify potential avenues of future research.
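
    The scheme described above keeps a learning controller inside a region that robust-control analysis has certified as stable, even while the weights change. A heavily hedged sketch of that gating idea follows; `stability_certificate` is only a placeholder for the dissertation's actual robust-control test, and the norm bound is an invented stand-in.

```python
# A heavily hedged sketch of gating weight updates behind a stability test.
# stability_certificate is a placeholder for the dissertation's robust-control
# analysis; the norm bound here is an invented stand-in, not the thesis's test.
import numpy as np

def stability_certificate(weights: np.ndarray, bound: float = 10.0) -> bool:
    """Placeholder: accept weights only while they remain inside a region
    previously certified as stable by the robust-control analysis."""
    return float(np.linalg.norm(weights)) <= bound

def safe_update(weights: np.ndarray, gradient: np.ndarray,
                lr: float = 0.01) -> np.ndarray:
    """Apply a learning update only if the new weights still pass the test;
    otherwise keep the old (certified) weights."""
    candidate = weights - lr * gradient
    return candidate if stability_certificate(candidate) else weights
```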

    Parallel Reinforcement Learning

    We examine the dynamics of multiple reinforcement learning agents who are interacting with and learning from the same environment in parallel. Due to the stochasticity of the environment, each agent will have a different learning experience, though they should all ultimately converge upon the same value function. The agents can accelerate the learning process by sharing information at periodic points during learning.
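
    A minimal sketch of the periodic-sharing idea, assuming tabular Q-learning agents and a simple table-averaging rule (both assumptions; the abstract specifies neither the learning algorithm nor the sharing mechanism). Each agent runs its own episodes against the same stochastic environment, and every `share_every` episodes the agents average their value tables; the `env_step` interface is hypothetical.

```python
# A minimal sketch of parallel tabular Q-learning with periodic sharing.
# env_step(s, a) -> (s2, r, done) is a hypothetical environment interface,
# and table averaging is an assumed sharing rule.
import numpy as np

def parallel_q_learning(env_step, n_states, n_actions, n_agents=4,
                        episodes=1000, share_every=50,
                        alpha=0.1, gamma=0.99, eps=0.1, seed=0):
    rng = np.random.default_rng(seed)
    Q = [np.zeros((n_states, n_actions)) for _ in range(n_agents)]
    for ep in range(episodes):
        for q in Q:                           # each agent runs its own episode
            s, done = 0, False
            while not done:
                a = int(rng.integers(n_actions)) if rng.random() < eps \
                    else int(q[s].argmax())   # epsilon-greedy action choice
                s2, r, done = env_step(s, a)  # same stochastic environment for all
                q[s, a] += alpha * (r + gamma * q[s2].max() - q[s, a])
                s = s2
        if (ep + 1) % share_every == 0:       # periodic information sharing
            avg = sum(Q) / n_agents
            for q in Q:
                q[:] = avg
    return sum(Q) / n_agents
```

    Because exploration is independent, the agents sample different trajectories; averaging pools that experience while leaving each agent free to keep exploring on its own.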

    Using Temporal Neighborhoods to Adapt Function Approximators in Reinforcement Learning

    To avoid the curse of dimensionality, function approximators are used in reinforcement learning to learn value functions for individual states. In order to make better use of computational resources (basis functions), many researchers are investigating ways to adapt the basis functions during the learning process so that they better fit the value-function landscape. Here we introduce temporal neighborhoods as small groups of states that experience frequent intragroup transitions during on-line sampling. We then form basis functions along these temporal neighborhoods. Empirical evidence is provided which demonstrates the effectiveness of this scheme. We discuss a class of RL problems for which this method might be plausible. 1 Overview: In reinforcement learning an agent navigates an environment (a state space) by selecting various actions in each state. As the agent makes actions, it receives rewards indicating the "goodness" of the action. Reinforcement learning is a methodology...
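
    One plausible reading of the construction, sketched below: tally transitions seen during on-line sampling, merge states whose mutual transition count crosses a threshold into neighborhoods, and emit one indicator basis function per neighborhood. The threshold and the union-find grouping are illustrative assumptions, not the authors' procedure.

```python
# A hedged sketch: tally observed transitions, merge states whose mutual
# transition count crosses a threshold, and emit one indicator basis function
# per neighborhood. Threshold and union-find grouping are assumptions.
from collections import defaultdict

def temporal_neighborhoods(transitions, threshold=5):
    """transitions: iterable of (s, s2) pairs observed during on-line sampling."""
    counts = defaultdict(int)
    for s, s2 in transitions:
        counts[frozenset((s, s2))] += 1

    parent = {}                                # union-find over states
    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]      # path halving
            x = parent[x]
        return x

    for pair, c in counts.items():
        if c >= threshold and len(pair) == 2:  # frequent intragroup transition
            a, b = pair
            parent[find(a)] = find(b)

    groups = defaultdict(set)
    for s in parent:
        groups[find(s)].add(s)
    # one binary basis function per neighborhood: phi_g(s) = 1 iff s is in g
    return [lambda s, g=frozenset(g): float(s in g) for g in groups.values()]
```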

    Tree Traversals and Permutations

    We build upon previous work on the subset of permutations known as stack words and stack-sortable words. We use preorder, inorder, and postorder traversals of binary trees to establish multiple bijections between binary trees and these words. We show that these operators satisfy a sort of multiplicative cancellation. We further expand the study of these operators by demonstrating how properties on trees are related to properties on words.
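
    One classical bijection of this kind can be sketched directly: label the nodes of a binary tree 1..n in inorder, then read the labels back in preorder. The resulting words are exactly the stack-sortable (231-avoiding) permutations, so the map is a bijection between binary trees and such words; the Python encoding below is an illustrative choice.

```python
# Label a binary tree's nodes 1..n in inorder, then read labels in preorder.
# The output words are exactly the 231-avoiding (stack-sortable) permutations,
# giving a bijection of the kind the abstract studies. The tree encoding is
# an illustrative choice.
from typing import List, Optional

class Node:
    def __init__(self, left: "Optional[Node]" = None,
                 right: "Optional[Node]" = None):
        self.left, self.right, self.label = left, right, 0

def label_inorder(root: Optional[Node], counter: List[int]) -> None:
    if root is None:
        return
    label_inorder(root.left, counter)
    counter[0] += 1
    root.label = counter[0]        # inorder position becomes the label
    label_inorder(root.right, counter)

def read_preorder(root: Optional[Node], out: List[int]) -> None:
    if root is None:
        return
    out.append(root.label)         # visit the root before its subtrees
    read_preorder(root.left, out)
    read_preorder(root.right, out)

# Example: a 4-node tree yields the stack-sortable word [2, 1, 4, 3].
tree = Node(left=Node(), right=Node(left=Node()))
label_inorder(tree, [0])
word: List[int] = []
read_preorder(tree, word)
print(word)  # [2, 1, 4, 3]
```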

    Essentials of the Java Programming Language

    1.1 Preliminaries. 1.1.1 Learning a Language. Programming a computer is both a creative activity and a process structured by rules. Computers are programmed, or given instruction, through the use of programming...

    A Neighborhood Search Technique for the Freeze Tag Problem

    The Freeze Tag Problem arises naturally in the field of swarm robotics. Given n robots at different locations, the problem is to devise a schedule to activate all robots in the minimum amount of time. Activation of robots, other than the initial robot, only occurs when an active robot physically moves to the location of an inactive robot. Several authors have devised heuristic algorithms to build solutions to the Freeze Tag Problem. Here, we investigate an update procedure based on a hill-climbing local search algorithm to solve the Freeze Tag Problem. Subject Classifications: metaheuristics, degree-bounded minimum-diameter spanning trees, swarm robotics, neighborhood structure, improvement graph, combinatorial optimization
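
    A hedged sketch of a hill-climbing local search of this flavor: a candidate solution is a wake-up tree in which each robot (after the root) is activated by its parent, parents visit their children in a fixed order, and the local move reassigns one robot to a different parent, kept only if the makespan improves. The move set is an illustrative neighborhood, not the paper's improvement-graph construction.

```python
# A hedged sketch of hill climbing over wake-up trees. The move (reparenting
# one robot) is an illustrative neighborhood, not the paper's improvement-graph
# construction; positions are 2-D points and travel cost is Euclidean.
import math, random

def makespan(pos, children, root=0):
    """Largest activation time in the wake-up tree."""
    worst, stack = 0.0, [(root, 0.0)]       # (robot, time it became active)
    while stack:
        p, t = stack.pop()
        x = pos[p]                           # p starts from its own position
        for c in children[p]:
            t += math.dist(x, pos[c])        # travelling to c activates it
            x = pos[c]
            worst = max(worst, t)
            stack.append((c, t))             # c now recruits its own children
    return worst

def hill_climb(pos, iters=2000, seed=0):
    rng = random.Random(seed)
    n = len(pos)
    parent = [0] * n
    children = [[] for _ in range(n)]
    children[0] = list(range(1, n))          # initial tree: the root wakes everyone
    best = makespan(pos, children)
    for _ in range(iters):
        i = rng.randrange(1, n)
        p_new = rng.randrange(n)
        if p_new in (i, parent[i]):
            continue
        a = p_new                            # reject moves that create a cycle
        while a != 0 and a != i:
            a = parent[a]
        if a == i:
            continue
        old_parent = parent[i]
        idx = children[old_parent].index(i)
        children[old_parent].pop(idx)
        children[p_new].append(i)
        parent[i] = p_new
        if (cand := makespan(pos, children)) < best:
            best = cand                      # keep the improving move
        else:                                # revert to the previous tree
            children[p_new].pop()
            children[old_parent].insert(idx, i)
            parent[i] = old_parent
    return best, children
```

    Starting from the star tree keeps construction trivial and lets the search discover branching, which is where Freeze Tag makespans improve: each newly activated robot can recruit others in parallel.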