
    Feature Selection by Singular Value Decomposition for Reinforcement Learning

    Solving reinforcement learning problems using value function approximation requires good state features, but constructing them manually is often difficult or impossible. We propose Fast Feature Selection (FFS), a new method for automatically constructing good features in problems with high-dimensional state spaces but low-rank dynamics. Such problems are common when, for example, controlling simple dynamic systems using direct visual observations, with states represented by raw images. FFS relies on domain samples and singular value decomposition to construct features that can approximate the optimal value function well. Compared with earlier methods, such as LFD, FFS is simpler and enjoys better theoretical performance guarantees. Our experimental results show that our approach is also more stable, computes better solutions, and can be faster than prior work.
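    The sketch below illustrates the general idea, not the paper's exact algorithm: take a truncated SVD of sampled high-dimensional states, project states onto the top singular directions to get features, and solve an LSTD-style system for value-function weights. All data, dimensions, and the regularisation constant are illustrative stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 500, 256, 8  # n samples, d raw state dimensions, k features kept

S = rng.standard_normal((n, d))                   # sampled states s_i (stand-ins for real data)
S_next = S @ (0.1 * rng.standard_normal((d, d)))  # sampled next states s'_i
r = rng.standard_normal(n)                        # sampled rewards

# Truncated SVD of the sampled state matrix: the top-k right singular
# vectors span the dominant directions and serve as the feature map.
_, _, Vt = np.linalg.svd(S, full_matrices=False)
Phi = S @ Vt[:k].T            # phi(s_i): projection of each state onto the basis
Phi_next = S_next @ Vt[:k].T  # same feature map applied to next states

# LSTD-style solve: find w with V(s) ~ phi(s) @ w under discount gamma.
gamma = 0.95
A = Phi.T @ (Phi - gamma * Phi_next)
b = Phi.T @ r
w = np.linalg.solve(A + 1e-6 * np.eye(k), b)

print("approximate value of the first sampled state:", Phi[0] @ w)
```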

    Constructing Abstraction Hierarchies Using a Skill-Symbol Loop

    We describe a framework for building abstraction hierarchies whereby an agent alternates skill-acquisition and representation-acquisition phases to construct a sequence of increasingly abstract Markov decision processes. Our formulation builds on recent results showing that the appropriate abstract representation of a problem is specified by the agent's skills. We describe how such a hierarchy can be used for fast planning, and illustrate the construction of an appropriate hierarchy for the Taxi domain.
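    Schematically, the alternating loop might look like the sketch below; discover_skills and build_symbolic_mdp are hypothetical placeholders for the paper's skill- and representation-acquisition phases.

```python
def build_hierarchy(base_mdp, levels, discover_skills, build_symbolic_mdp):
    """Alternate skill acquisition and representation acquisition to build
    a sequence of increasingly abstract MDPs (most abstract last)."""
    hierarchy = [base_mdp]
    mdp = base_mdp
    for _ in range(levels):
        skills = discover_skills(mdp)          # skill-acquisition phase
        mdp = build_symbolic_mdp(mdp, skills)  # representation-acquisition phase
        hierarchy.append(mdp)
    return hierarchy
```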

    Manifold Representations for Continuous-State Reinforcement Learning

    Reinforcement learning (RL) has shown itself to be an effective paradigm for solving optimal control problems with a finite number of states. Generalizing RL techniques to problems with a continuous state space has proven difficult. We present an approach to modeling the RL value function using a manifold representation. By explicitly modeling the topology of the value function's domain, traditional problems with discontinuities and resolution can be addressed without resorting to complex function approximators. We describe how manifold techniques can be applied to value-function approximation, and present methods for constructing manifold representations in both batch and online settings. We present empirical results demonstrating the effectiveness of our approach.
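    As a toy illustration of a chart-based representation (not the paper's construction), the sketch below covers a 2-D state space with fixed-radius charts, fits a local affine value model per chart, and averages the predictions of overlapping charts; all data and parameters are made up.

```python
import numpy as np

rng = np.random.default_rng(1)
states = rng.uniform(-1, 1, size=(400, 2))         # sampled continuous states
values = np.sin(3 * states[:, 0]) + states[:, 1]   # stand-in value targets

centres = rng.uniform(-1, 1, size=(10, 2))         # chart centres
radius = 0.6

chart_models = []
for c in centres:
    mask = np.linalg.norm(states - c, axis=1) < radius    # states in this chart
    X = np.c_[np.ones(mask.sum()), states[mask] - c]      # local affine basis
    w, *_ = np.linalg.lstsq(X, values[mask], rcond=None)  # local least-squares fit
    chart_models.append((c, w))

def predict(s):
    """Average the local models of every chart containing s."""
    preds = [np.r_[1.0, s - c] @ w
             for c, w in chart_models
             if np.linalg.norm(s - c) < radius]
    return np.mean(preds) if preds else 0.0

print(predict(np.array([0.2, -0.3])))
```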

    Avoiding Wireheading with Value Reinforcement Learning

    How can we design good goals for arbitrarily intelligent agents? Reinforcement learning (RL) is a natural approach. Unfortunately, RL does not work well for generally intelligent agents, as RL agents are incentivised to shortcut the reward sensor for maximum reward -- the so-called wireheading problem. In this paper we suggest an alternative to RL called value reinforcement learning (VRL). In VRL, agents use the reward signal to learn a utility function. The VRL setup allows us to remove the incentive to wirehead by placing a constraint on the agent's actions. The constraint is defined in terms of the agent's belief distributions, and does not require an explicit specification of which actions constitute wireheading.
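    A highly simplified, hypothetical illustration of the core idea follows: the agent maintains a posterior over candidate utility functions and treats each reward as evidence about which utility is true, rather than as the quantity to maximise directly. The candidate utilities, noise model, and data are invented for the example, and the paper's belief-based action constraint is not modelled.

```python
import numpy as np

# Candidate utility functions over a 1-D state (illustrative stand-ins).
utilities = [lambda s: s, lambda s: -s, lambda s: s ** 2]
belief = np.ones(len(utilities)) / len(utilities)  # uniform prior

def update_belief(belief, state, reward, noise=0.5):
    """Bayesian update: reward is treated as a noisy observation of the
    true utility of `state`, not as a quantity to be maximised."""
    lik = np.array([np.exp(-((reward - u(state)) ** 2) / (2 * noise ** 2))
                    for u in utilities])
    post = belief * lik
    return post / post.sum()

# A few observed (state, reward) pairs shift the posterior toward
# the utility function that best explains the rewards.
for state, reward in [(1.0, 0.9), (2.0, 2.1), (0.5, 0.4)]:
    belief = update_belief(belief, state, reward)

print("posterior over candidate utilities:", belief.round(3))
```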