EMI: Exploration with Mutual Information
Reinforcement learning algorithms struggle when the reward signal is very
sparse. In these cases, naive random exploration methods essentially rely on a
random walk to stumble onto a rewarding state. Recent works utilize intrinsic
motivation to guide the exploration via generative models, predictive forward
models, or discriminative modeling of novelty. We propose EMI, an
exploration method that constructs embedding representations of states and
actions that do not rely on generative decoding of the full observation but
extract predictive signals that can be used to guide exploration based on
forward prediction in the representation space. Our experiments show
competitive results on challenging locomotion tasks with continuous control and
on image-based exploration tasks with discrete actions on Atari. The source
code is available at https://github.com/snu-mllab/EMI.
Comment: Accepted and to appear at ICML 201
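The idea of guiding exploration by forward prediction in the representation space can be illustrated with a minimal sketch. It is not taken from the EMI source code: the additive forward model phi(s') ~ phi(s) + psi(a), the squared-error bonus, and the name intrinsic_reward are all assumptions made for illustration, and the embeddings are passed in as plain vectors rather than produced by learned networks.

```python
import numpy as np

def intrinsic_reward(phi_s, psi_a, phi_next):
    """Exploration bonus: squared error of a forward prediction in embedding space.

    Assumed (hypothetical) forward model: phi(s') is predicted as phi(s) + psi(a).
    """
    predicted = phi_s + psi_a
    return float(np.sum((phi_next - predicted) ** 2))

# Stand-in embeddings; in practice these would come from learned encoders.
phi_s = np.array([0.5, -1.0, 2.0])      # embedding of the current state
psi_a = np.array([0.1, 0.2, -0.3])      # embedding of the action taken
phi_next = np.array([0.6, -0.8, 1.7])   # embedding of the observed next state

# A transition the forward model already predicts well earns almost no bonus;
# a poorly predicted one earns a larger bonus.
bonus_predictable = intrinsic_reward(phi_s, psi_a, phi_next)
bonus_surprising = intrinsic_reward(phi_s, psi_a, phi_next + 1.0)
```

Under these assumptions the bonus would be added to the sparse environment reward, so that transitions the model cannot yet predict are visited more often.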
Laser assisted decay of quasistationary states
The effects of intense electromagnetic fields on the decay of quasistationary
states are investigated theoretically. We focus on the parameter regime of
strong laser fields and nonlinear effects where an essentially nonperturbative
description is required. Our approach is based on the imaginary time method
previously introduced in the theory of strong-field ionization. Spectra and
total decay rates are presented for a test case and the results are compared
with exact numerical calculations. The potential of this method is confirmed by
good quantitative agreement with the numerical results.
Comment: 24 pages, 5 figures
Double smoothing technique for infinite-dimensional optimization problems with applications to optimal control
In this paper, we propose an efficient technique for solving some infinite-dimensional problems over sets of functions of time. In our problem, besides the convex point-wise constraints on state variables, we have convex coupling constraints with finite-dimensional image. Hence, we can formulate a finite-dimensional dual problem, which can be solved by efficient gradient methods. We show that it is possible to reconstruct an approximate primal solution. To accelerate our schemes, we apply a double-smoothing technique. As a result, our method has a complexity of $O\left(\frac{1}{\epsilon}\ln\frac{1}{\epsilon}\right)$ gradient iterations, where $\epsilon$ is the desired accuracy of the solution of the primal-dual problem. Our approach covers, in particular, optimal control problems with the trajectory governed by a system of ordinary differential equations. An additional requirement could be that the trajectory crosses certain convex sets at certain moments of time.
Keywords: convex optimization, optimal control, fast gradient methods, complexity bounds, smoothing technique
NFFT meets Krylov methods: Fast matrix-vector products for the graph Laplacian of fully connected networks
The graph Laplacian is a standard tool in data science, machine learning, and
image processing. The corresponding matrix inherits the complex structure of
the underlying network and is in certain applications densely populated. This
makes computations, in particular matrix-vector products, with the graph
Laplacian a hard task. A typical application is the computation of a number of
its eigenvalues and eigenvectors. Standard methods become infeasible when the
number of nodes in the graph is too large. We propose the use of the fast
summation based on the nonequispaced fast Fourier transform (NFFT) to perform
the dense matrix-vector product with the graph Laplacian fast without ever
forming the whole matrix. The enormous flexibility of the NFFT algorithm allows
us to embed the accelerated multiplication into Lanczos-based eigenvalue
routines or iterative linear system solvers and even to consider kernels other
than the standard Gaussian. We illustrate the feasibility of our approach on a
number of test problems from image segmentation to semi-supervised learning
based on graph-based PDEs. In particular, we compare our approach with the
Nyström method. Moreover, we present and test an enhanced, hybrid version of
the Nyström method, which internally uses the NFFT.
Comment: 28 pages, 9 figures
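The matrix-free idea in this abstract, a Lanczos-type eigensolver that only ever sees matrix-vector products, can be sketched as follows. This is an illustration under stated assumptions, not the authors' method: the NFFT-based fast summation is replaced here by a direct blockwise summation (still quadratic in time, but it never stores the full matrix), the Laplacian is taken as L = D - W with Gaussian weights W_ij = exp(-||x_i - x_j||^2 / sigma^2) and no self-loops, and the names laplacian_operator, sigma and chunk are hypothetical.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, eigsh

def laplacian_operator(points, sigma=1.0, chunk=256):
    """Matrix-free L = D - W for a fully connected graph with Gaussian weights."""
    n = points.shape[0]

    def kernel_times(v):
        # Direct summation in row blocks (stand-in for NFFT fast summation):
        # only a chunk-by-n slice of W exists in memory at any time.
        out = np.empty(n)
        for start in range(0, n, chunk):
            block = points[start:start + chunk]
            sq = ((block[:, None, :] - points[None, :, :]) ** 2).sum(axis=2)
            w = np.exp(-sq / sigma**2)
            rows = np.arange(block.shape[0])
            w[rows, start + rows] = 0.0  # no self-loops
            out[start:start + chunk] = w @ v
        return out

    degrees = kernel_times(np.ones(n))  # row sums of W, i.e. the diagonal of D

    def matvec(v):
        v = np.asarray(v, dtype=float).ravel()
        return degrees * v - kernel_times(v)

    return LinearOperator((n, n), matvec=matvec, dtype=float)

# Usage: a few eigenvalues via SciPy's ARPACK (Lanczos-type) solver.
rng = np.random.default_rng(0)
points = rng.normal(size=(200, 2))
op = laplacian_operator(points, sigma=1.0, chunk=64)
largest = eigsh(op, k=3, which="LA", return_eigenvectors=False)
```

Swapping kernel_times for a faster approximate summation would leave the eigensolver call unchanged, which is the design point the abstract makes about embedding the accelerated product into existing routines.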