50 research outputs found

    Regularized fitted Q-iteration: application to planning

    We consider planning in a Markovian decision problem, i.e., the problem of finding a good policy given access to a generative model of the environment. We propose to use fitted Q-iteration with penalized (or regularized) least-squares regression as the regression subroutine to address the problem of controlling model complexity. The algorithm is presented in detail for the case when the function space is a reproducing kernel Hilbert space underlying a user-chosen kernel function. We derive bounds on the quality of the solution and argue that data-dependent penalties can lead to almost optimal performance. A simple example is used to illustrate the benefits of using a penalized procedure.

    Projective simulation for artificial intelligence

    We propose a model of a learning agent whose interaction with the environment is governed by a simulation-based projection, which allows the agent to project itself into future situations before it takes real action. Projective simulation is based on a random walk through a network of clips, which are elementary patches of episodic memory. The network of clips changes dynamically, both due to new perceptual input and due to certain compositional principles of the simulation process. During simulation, the clips are screened for specific features which trigger factual action of the agent. The scheme is different from other, computational, notions of simulation, and it provides a new element in an embodied cognitive science approach to intelligent action and learning. Our model provides a natural route for generalization to quantum-mechanical operation and connects the fields of reinforcement learning and quantum computation. Comment: 22 pages, 18 figures. Close to published version, with footnotes retained.

    Deep Reinforcement Learning: An Overview

    In recent years, a specific machine learning method called deep learning has attracted considerable attention, as it has obtained astonishing results in broad applications such as pattern recognition, speech recognition, computer vision, and natural language processing. Recent research has also shown that deep learning techniques can be combined with reinforcement learning methods to learn useful representations for problems with high-dimensional raw data input. This chapter reviews the recent advances in deep reinforcement learning with a focus on the most used deep architectures, such as autoencoders, convolutional neural networks and recurrent neural networks, which have been successfully combined with the reinforcement learning framework. Comment: Proceedings of SAI Intelligent Systems Conference (IntelliSys) 201

    Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path

    We consider the problem of finding a near-optimal policy in continuous-space, discounted Markovian decision problems given the trajectory of some behaviour policy. We study the policy iteration algorithm where, in successive iterations, the action-value functions of the intermediate policies are obtained by picking a function from some fixed function set (chosen by the user) that minimizes an unbiased finite-sample approximation to a novel loss function that upper-bounds the unmodified Bellman-residual criterion. The main result is a finite-sample, high-probability bound on the performance of the resulting policy that depends on the mixing rate of the trajectory, the capacity of the function set as measured by a novel capacity concept that we call the VC-crossing dimension, the approximation power of the function set and the discounted-average concentrability of the future-state distribution. To the best of our knowledge this is the first theoretical reinforcement learning result for off-policy control learning over continuous state-spaces using a single trajectory.

    Cyclic animation using Partial differential Equations

    This work presents an efficient and fast method for achieving cyclic animation using Partial Differential Equations (PDEs). The boundary-value nature associated with elliptic PDEs offers a fast analytic solution technique for setting up a framework for this type of animation. The surface of a given character is thus created from a set of pre-determined curves, which are used as boundary conditions so that a number of PDEs can be solved. Two different approaches to cyclic animation are presented here. The first consists of attaching the set of curves to a skeletal system holding the animation for cyclic motions linked to a set of mathematical expressions; the second exploits the spine associated with the analytic solution of the PDE as a driving mechanism to achieve cyclic animation, which is also manipulated mathematically. The first of these approaches is implemented within a framework related to cyclic motions inherent to human-like characters, whereas the spine-based approach is focused on modelling the undulatory movement observed in fish when swimming. The proposed method is fast and accurate. Additionally, the animation can be either used in the PDE-based surface representation of the model or transferred to the original mesh model by means of a point-to-point map. Thus, the user is offered the choice of using either of these two animation representations of the same object; the selection depends on the computing resources, such as storage and memory capacity, associated with each particular application.

    Learning and Tracking Cyclic Human Motion

    We present methods for learning and tracking human motion in video. We estimate a statistical model of typical activities from a large set of 3D periodic human motion data by segmenting these data automatically into "cycles". Then the mean and the principal components of the cycles are computed using a new algorithm that accounts for missing information and enforces smooth transitions between cycles. The learned temporal model provides a prior probability distribution over human motions that can be used in a Bayesian framework for tracking human subjects in complex monocular video sequences and recovering their 3D motion.

    Articulated Body Tracking by Immune Particle Filter


    Library Trends 53 (4) Spring 2005: The Commercialized Web: Challenges for Libraries and Democracy

    This issue of Library Trends addresses Web content within the context of Internet commercialization and democracy. These are big ideas and problems, with potentially big solutions, so this issue has cast a wide net, pulling together voices from multiple disciplines, including communication studies, informatics, information management, research programming, computer science, engineering, and library science.