
    Optimal Learning Theory and Approximate Optimal Learning Algorithms

    The exploration/exploitation dilemma is a fundamental but often computationally intractable problem in reinforcement learning. The dilemma also affects data efficiency, which can be pivotal when interactions between the agent and the environment are constrained. Traditional optimal control theory defines an objective criterion, such as regret, whose optimization yields optimal exploration and exploitation. This approach has been successful for the multi-armed bandit problem but becomes impractical and largely intractable to compute for multi-state problems. For complex problems with large state spaces where function approximation is applied, the exploration/exploitation decision at each interaction is in practice made in an ad hoc way with heavy parameter tuning, such as ε-greedy. Drawing inspiration from several research communities, optimal learning strives to find the optimal balance between exploration and exploitation by applying principles from optimal control theory. The contribution of this thesis consists of two parts: 1. to establish a theoretical framework of optimal learning based on reinforcement learning in a stochastic (non-Markovian) decision process and, through the lens of optimal learning, to unify Bayesian (model-based) reinforcement learning and partially observable reinforcement learning; 2. to improve existing reinforcement learning algorithms from the optimal learning view; the improved algorithms are referred to as approximate optimal learning algorithms. Three classes of approximate optimal learning algorithms are proposed, drawing on the following principles respectively: (1) approximate Bayesian inference explicitly by training a recurrent neural network entangled with a feed-forward neural network; (2) approximate Bayesian inference implicitly by training and sampling from a pool of prediction neural networks used as dynamics models; (3) use a memory-based recurrent neural network to extract features from observations. Empirical evidence is provided to show the improvement of the proposed algorithms.
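
    The following is a minimal sketch, not the thesis code, of principle (2) above: approximating Bayesian inference implicitly by training and sampling from a pool of neural-network dynamics models. Each pool member fits a bootstrap resample of the transition data, and sampling one member per episode gives posterior-sampling-style exploration. The network sizes, optimizer, and bootstrap scheme are illustrative assumptions.

    ```python
    import random
    import torch
    import torch.nn as nn

    class DynamicsModel(nn.Module):
        """Predicts the next state from (state, action)."""
        def __init__(self, state_dim, action_dim, hidden=64):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
                nn.Linear(hidden, state_dim),
            )

        def forward(self, state, action):
            return self.net(torch.cat([state, action], dim=-1))

    class ModelPool:
        """Ensemble of dynamics models standing in for a posterior over dynamics."""
        def __init__(self, n_models, state_dim, action_dim):
            self.models = [DynamicsModel(state_dim, action_dim) for _ in range(n_models)]
            self.optims = [torch.optim.Adam(m.parameters(), lr=1e-3) for m in self.models]

        def train_step(self, states, actions, next_states):
            # Each member fits a bootstrap resample of the same transition batch.
            n = states.shape[0]
            for model, opt in zip(self.models, self.optims):
                idx = torch.randint(0, n, (n,))
                pred = model(states[idx], actions[idx])
                loss = nn.functional.mse_loss(pred, next_states[idx])
                opt.zero_grad()
                loss.backward()
                opt.step()

        def sample_model(self):
            # Drawing one member from the pool approximates sampling dynamics
            # from an (implicit) posterior over models.
            return random.choice(self.models)
    ```

    In such a scheme, planning or action selection would use the sampled model for a whole episode, so disagreement across the pool drives exploration without an explicit ε-greedy schedule.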

    Learning to infer: RL-based search for DNN primitive selection on Heterogeneous Embedded Systems

    Deep Learning is increasingly being adopted by industry for computer vision applications running on embedded devices. While Convolutional Neural Networks have reached a mature and remarkable level of accuracy, inference latency and throughput remain a major concern, especially when targeting low-cost and low-power embedded platforms. CNN inference latency may become a bottleneck for Deep Learning adoption by industry, as it is a crucial specification for many real-time processes. Furthermore, deploying CNNs across heterogeneous platforms presents major compatibility issues due to vendor-specific technology and acceleration libraries. In this work, we present QS-DNN, a fully automatic search based on Reinforcement Learning which, combined with an inference engine optimizer, efficiently explores the design space and empirically finds the optimal combinations of libraries and primitives to speed up the inference of CNNs on heterogeneous embedded devices. We show that an optimized combination can achieve a 45x speedup in inference latency on CPU compared to a dependency-free baseline, and 2x on average on GPGPU compared to the best vendor library. Further, we demonstrate that both the quality of results and the time-to-solution are much better than with Random Search, achieving up to 15x better results for a short-time search.
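
    The following is a minimal sketch, not the QS-DNN implementation, of an RL-style search over per-layer primitive choices as described above. It uses a bandit-style value table per layer with ε-greedy selection, where the reward is the negative measured latency of a full configuration. The layer names, primitive names, and measure_latency() are hypothetical placeholders for a real inference-engine benchmark on the target device.

    ```python
    import random

    LAYERS = ["conv1", "conv2", "conv3", "fc1"]
    PRIMITIVES = ["direct", "im2col_gemm", "winograd", "vendor_lib"]

    def measure_latency(config):
        # Placeholder: a real search would benchmark the network on the device.
        return sum(random.uniform(1.0, 5.0) for _ in config)

    def search(episodes=200, eps=0.2, alpha=0.3):
        # Q[layer_index][primitive] estimates the value of picking that primitive.
        Q = [{p: 0.0 for p in PRIMITIVES} for _ in LAYERS]
        best_config, best_latency = None, float("inf")
        for _ in range(episodes):
            # Build one full configuration with epsilon-greedy choices per layer.
            config = []
            for i in range(len(LAYERS)):
                if random.random() < eps:
                    config.append(random.choice(PRIMITIVES))
                else:
                    config.append(max(Q[i], key=Q[i].get))
            latency = measure_latency(config)
            reward = -latency
            # Credit the episode reward to every per-layer choice taken.
            for i, prim in enumerate(config):
                Q[i][prim] += alpha * (reward - Q[i][prim])
            if latency < best_latency:
                best_config, best_latency = config, latency
        return best_config, best_latency

    if __name__ == "__main__":
        config, latency = search()
        print(dict(zip(LAYERS, config)), latency)
    ```

    This is deliberately simplified: it illustrates how learned per-layer value estimates can steer the search toward low-latency combinations faster than purely random sampling, which is the intuition behind comparing against Random Search.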