    EMI: Exploration with Mutual Information

    Reinforcement learning algorithms struggle when the reward signal is very sparse. In these cases, naive random exploration methods essentially rely on a random walk to stumble onto a rewarding state. Recent works use intrinsic motivation to guide exploration via generative models, predictive forward models, or discriminative modeling of novelty. We propose EMI, an exploration method that constructs an embedding representation of states and actions. The representation does not rely on generative decoding of the full observation; instead, it extracts predictive signals that guide exploration based on forward prediction in the representation space. Our experiments show competitive results on challenging locomotion tasks with continuous control and on image-based exploration tasks with discrete actions on Atari. The source code is available at https://github.com/snu-mllab/EMI. Comment: Accepted and to appear at ICML 201

    Laser assisted decay of quasistationary states

    The effects of intense electromagnetic fields on the decay of quasistationary states are investigated theoretically. We focus on the parameter regime of strong laser fields and nonlinear effects where an essentially nonperturbative description is required. Our approach is based on the imaginary time method previously introduced in the theory of strong-field ionization. Spectra and total decay rates are presented for a test case and the results are compared with exact numerical calculations. The potential of this method is confirmed by good quantitative agreement with the numerical results. Comment: 24 pages, 5 figures

    Double smoothing technique for infinite-dimensional optimization problems with applications to optimal control

    In this paper, we propose an efficient technique for solving some infinite-dimensional problems over sets of functions of time. In our problem, besides the convex point-wise constraints on state variables, we have convex coupling constraints with finite-dimensional image. Hence, we can formulate a finite-dimensional dual problem, which can be solved by efficient gradient methods. We show that it is possible to reconstruct an approximate primal solution. In order to accelerate our schemes, we apply a double-smoothing technique. As a result, our method has complexity $O\left(\frac{1}{\epsilon}\ln\frac{1}{\epsilon}\right)$ gradient iterations, where $\epsilon$ is the desired accuracy of the solution of the primal-dual problem. Our approach covers, in particular, optimal control problems with a trajectory governed by a system of ordinary differential equations. An additional requirement could be that the trajectory crosses some convex sets at certain moments of time. Keywords: convex optimization, optimal control, fast gradient methods, complexity bounds, smoothing technique

    NFFT meets Krylov methods: Fast matrix-vector products for the graph Laplacian of fully connected networks

    The graph Laplacian is a standard tool in data science, machine learning, and image processing. The corresponding matrix inherits the complex structure of the underlying network and is in certain applications densely populated. This makes computations with the graph Laplacian, in particular matrix-vector products, a hard task. A typical application is the computation of a number of its eigenvalues and eigenvectors. Standard methods become infeasible when the number of nodes in the graph is too large. We propose the use of fast summation based on the nonequispaced fast Fourier transform (NFFT) to perform the dense matrix-vector product with the graph Laplacian quickly, without ever forming the whole matrix. The enormous flexibility of the NFFT algorithm allows us to embed the accelerated multiplication into Lanczos-based eigenvalue routines or iterative linear system solvers, and even to consider kernels other than the standard Gaussian. We illustrate the feasibility of our approach on a number of test problems, from image segmentation to semi-supervised learning based on graph-based PDEs. In particular, we compare our approach with the Nyström method. Moreover, we present and test an enhanced, hybrid version of the Nyström method, which internally uses the NFFT. Comment: 28 pages, 9 figures
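The idea of multiplying by the graph Laplacian "without ever forming the whole matrix" can be illustrated with a minimal matrix-free sketch. It is not the method of the paper: it uses direct O(n^2) summation where the abstract uses NFFT-based fast summation, and it assumes the unnormalized Laplacian L = D - W with Gaussian weights w_ij = exp(-||p_i - p_j||^2 / sigma^2) and zero diagonal. The names `gaussian_weight` and `laplacian_matvec`, and the kernel scaling, are hypothetical choices made for illustration.

```python
import math


def gaussian_weight(p, q, sigma):
    """Gaussian kernel weight between two points (assumed scaling: sigma^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(p, q))
    return math.exp(-sq_dist / (sigma ** 2))


def laplacian_matvec(points, x, sigma=1.0):
    """Return y = (D - W) x for the fully connected graph on `points`.

    W is never stored: each weight is recomputed on the fly, so memory
    stays O(n). The double loop is plain O(n^2) summation; a fast
    summation scheme would replace exactly this part.
    """
    n = len(points)
    y = [0.0] * n
    for i in range(n):
        acc = 0.0
        for j in range(n):
            if i != j:
                w = gaussian_weight(points[i], points[j], sigma)
                # (L x)_i = sum_{j != i} w_ij * (x_i - x_j)
                acc += w * (x[i] - x[j])
        y[i] = acc
    return y
```

A routine of this shape is all an iterative eigenvalue or linear solver needs, since such solvers touch the matrix only through matrix-vector products. That is why the summation step can be swapped for a faster one without changing the surrounding solver.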