Search CORE

406,246 research outputs found

Increasing the Action Gap: New Operators for Reinforcement Learning

Author: Bellemare Marc G.
Guez Arthur
Munos Rémi
Ostrovski Georg
Thomas Philip S.
Publication date: 15/12/2015

This paper introduces new optimality-preserving operators on Q-functions. We first describe an operator for tabular representations, the consistent Bellman operator, which incorporates a notion of local policy consistency. We show that this local consistency leads to an increase in the action gap at each state; increasing this gap, we argue, mitigates the undesirable effects of approximation and estimation errors on the induced greedy policies. This operator can also be applied to discretized continuous space and time problems, and we provide empirical results evidencing superior performance in this context. Extending the idea of a locally consistent operator, we then derive sufficient conditions for an operator to preserve optimality, leading to a family of operators which includes our consistent Bellman operator. As corollaries we provide a proof of optimality for Baird's advantage learning algorithm and derive other gap-increasing operators with interesting properties. We conclude with an empirical study on 60 Atari 2600 games illustrating the strong potential of these new operators.
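The abstract does not spell out the operators, so the following is only a minimal tabular sketch under stated assumptions. The operator forms (bootstrapping from Q(x, a) on self-transitions for the consistent operator, and subtracting alpha times V(x) - Q(x, a) for advantage learning) are assumed from the usual statement of these operators, and the random MDP, array names, and parameter values are all hypothetical. It applies each operator once to the optimal Q-function and compares action gaps.

```python
import numpy as np

def bellman(Q, P, R, gamma):
    """Bellman optimality operator; Q, R have shape (S, A), P has shape (S, A, S)."""
    V = Q.max(axis=1)                      # V(x) = max_b Q(x, b)
    return R + gamma * (P @ V)

def consistent_bellman(Q, P, R, gamma):
    """Assumed form: on a self-transition, bootstrap from Q(x, a) rather than V(x)."""
    V = Q.max(axis=1)
    p_self = np.einsum("xax->xa", P)       # P(x | x, a), shape (S, A)
    return bellman(Q, P, R, gamma) - gamma * p_self * (V[:, None] - Q)

def advantage_learning(Q, P, R, gamma, alpha):
    """Assumed form: subtract alpha * (V(x) - Q(x, a)) from the Bellman backup."""
    V = Q.max(axis=1)
    return bellman(Q, P, R, gamma) - alpha * (V[:, None] - Q)

def action_gap(Q):
    """Per-state difference between the best and second-best action values."""
    ordered = np.sort(Q, axis=1)
    return ordered[:, -1] - ordered[:, -2]

# Hypothetical random MDP, used only for illustration.
rng = np.random.default_rng(0)
S, A, gamma, alpha = 5, 3, 0.9, 0.5
P = rng.random((S, A, S))
P /= P.sum(axis=2, keepdims=True)          # normalise rows into probabilities
R = rng.random((S, A))

# Plain value iteration to approximate the optimal Q-function.
Q_star = np.zeros((S, A))
for _ in range(1000):
    Q_star = bellman(Q_star, P, R, gamma)

Q_c = consistent_bellman(Q_star, P, R, gamma)
Q_al = advantage_learning(Q_star, P, R, gamma, alpha)

print("gap at Q*:           ", action_gap(Q_star))
print("gap after consistent:", action_gap(Q_c))
print("gap after AL:        ", action_gap(Q_al))
```

In this sketch both operators leave the value of the greedy action unchanged and lower the values of the other actions, which is the sense in which the gap grows while the greedy policy is kept.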

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Improved asymptotics of the spectral gap for the Mathieu operator

Author: Anahtarci Berkay
Djakov Plamen
Publication date: 21/02/2012

The Mathieu operator \begin{equation*} L(y) = -y'' + 2a\cos(2x)\,y, \quad a \in \mathbb{C},\; a \neq 0, \end{equation*} considered with periodic or anti-periodic boundary conditions has, close to $n^2$ for large enough $n$, two periodic (if $n$ is even) or anti-periodic (if $n$ is odd) eigenvalues $\lambda_n^-$, $\lambda_n^+$. For fixed $a$, we show that \begin{equation*} \lambda_n^+ - \lambda_n^- = \pm \frac{8(a/4)^n}{[(n-1)!]^2} \left[1 - \frac{a^2}{4n^3} + O\left(\frac{1}{n^4}\right)\right], \quad n \rightarrow \infty. \end{equation*} This result extends the asymptotic formula of Harrell-Avron-Simon by providing more asymptotic terms.
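As a rough illustration of how quickly this gap closes, the sketch below evaluates the right-hand side of the formula with the O(1/n^4) remainder dropped and the overall sign ignored. The function name and its `correction` flag are hypothetical, and the output is only an approximation for large n.

```python
import math

def mathieu_gap_approx(a, n, correction=True):
    """Evaluate 8 (a/4)^n / [(n-1)!]^2 * (1 - a^2 / (4 n^3)), remainder dropped.

    Works for real or complex a; the +/- sign in the formula is not resolved here.
    """
    leading = 8 * (a / 4) ** n / math.factorial(n - 1) ** 2
    if not correction:
        return leading
    return leading * (1 - a ** 2 / (4 * n ** 3))

# The factorial in the denominator makes the gap shrink super-exponentially in n.
for n in (2, 5, 10, 20):
    print(n, mathieu_gap_approx(2.0, n))
```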

arXiv.org e-Print Archive

Elsevier - Publisher Connector

Crossref

Sabanci University Research Database

Einstein gravity from ANEC correlators

Author: Belin Alexandre
Hofman Diego M.
Mathys Gregoire
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 11/04/2019

We study correlation functions with multiple averaged null energy (ANEC) operators in conformal field theories. For large $N$ CFTs with a large gap to higher spin operators, we show that the OPE between a local operator and the ANEC can be recast as a particularly simple differential operator acting on the local operator. This operator is simple enough that we can resum it and obtain the finite distance OPE. Under the large $N$ - large gap assumptions, the vanishing of the commutator of ANEC operators tightly constrains the OPE coefficients of the theory. An important example of this phenomenon is the conclusion that $a=c$ in $d=4$. This implies that the bulk dual of such a CFT is semi-classical Einstein gravity with minimally coupled matter.

Comment: 32 pages + appendices, 6 figures; v2: typos corrected and a comment added in introduction

arXiv.org e-Print Archive

CERN Document Server

UvA-DARE

International Migration, Integration and Social Cohesion online publications

Random walk on surfaces with hyperbolic cusps

Author: Colin Guillarmou
G. Lebeau
Hans Christianson
Laurent Michel
P. Diaconis
W. Müller
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 16/05/2010

We consider the operator associated with a random walk on finite volume surfaces with hyperbolic cusps. We study the spectral gap (upper and lower bounds) associated with this operator and deduce a rate of convergence of the iterated kernel towards its stationary distribution.

Comment: 28 pages

arXiv.org e-Print Archive

Crossref

HAL-UNICE

Estimating the spectral gap of a trace-class Markov operator

Author: Hobert James P.
Khare Kshitij
Qin Qian
Publication date: 01/01/2019

The utility of a Markov chain Monte Carlo algorithm is, in large part, determined by the size of the spectral gap of the corresponding Markov operator. However, calculating (and even approximating) the spectral gaps of practical Monte Carlo Markov chains in statistics has proven to be an extremely difficult and often insurmountable task, especially when these chains move on continuous state spaces. In this paper, a method for accurate estimation of the spectral gap is developed for general state space Markov chains whose operators are non-negative and trace-class. The method is based on the fact that the second largest eigenvalue (and hence the spectral gap) of such operators can be bounded above and below by simple functions of the power sums of the eigenvalues. These power sums often have nice integral representations. A classical Monte Carlo method is proposed to estimate these integrals, and a simple sufficient condition for finite variance is provided. This leads to asymptotically valid confidence intervals for the second largest eigenvalue (and the spectral gap) of the Markov operator. In contrast with previously existing techniques, our method is not based on a near-stationary version of the Markov chain, which, paradoxically, cannot be obtained in a principled manner without bounds on the spectral gap. On the other hand, it can be quite expensive from a computational standpoint. The efficiency of the method is studied both theoretically and empirically.
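The power-sum idea can be sketched on a small finite chain. This is an illustration under assumptions, not the paper's estimator: the power sums are computed exactly from matrix traces rather than by Monte Carlo, the specific bounds used (the ratio s_k / s_{k-1} from below and the root s_k^(1/k) from above) are one natural choice that holds for any non-negative spectrum, and the lazy walk on a cycle and the function name are hypothetical.

```python
import numpy as np

def power_sum_bounds(K, k):
    """Return (lower, upper) bounds on the second largest eigenvalue of K.

    Assumes K is symmetric with eigenvalues in [0, 1] and top eigenvalue 1,
    and that k >= 2. Here s_j is the sum of lambda_i^j over all eigenvalues
    except the top one, so that s_k / s_{k-1} <= lambda_1 <= s_k^(1/k).
    """
    s_k = np.trace(np.linalg.matrix_power(K, k)) - 1.0
    s_km1 = np.trace(np.linalg.matrix_power(K, k - 1)) - 1.0
    return s_k / s_km1, s_k ** (1.0 / k)

# Hypothetical example: lazy random walk on a cycle with 8 states.
n = 8
W = np.zeros((n, n))
for i in range(n):
    W[i, (i - 1) % n] = 0.5
    W[i, (i + 1) % n] = 0.5
K = 0.5 * (np.eye(n) + W)          # laziness makes every eigenvalue non-negative

lam1 = np.sort(np.linalg.eigvalsh(K))[-2]      # exact second largest eigenvalue
lower, upper = power_sum_bounds(K, k=10)
print("lower bound:", lower, "exact:", lam1, "upper bound:", upper)
print("spectral gap between", 1 - upper, "and", 1 - lower)
```

Larger values of k tighten both bounds, at the cost of needing more accurate power sums.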

arXiv.org e-Print Archive

Crossref

A natural derivative on [0,n] and a binomial Poincaré inequality

Author: Hillion Erwan
Johnson Oliver
Yu Yaming
Publication venue: 'EDP Sciences'
Publication date: 01/07/2011

We consider probability measures supported on a finite discrete interval $[0,n]$. We introduce a new finite difference operator $\nabla_n$, defined as a linear combination of left and right finite differences. We show that this operator $\nabla_n$ plays a key role in a new Poincaré (spectral gap) inequality with respect to binomial weights, with the orthogonal Krawtchouk polynomials acting as eigenfunctions of the relevant operator. We briefly discuss the relationship of this operator to the problem of optimal transport of probability measures.

arXiv.org e-Print Archive

Crossref

EDP Sciences OAI-PMH repository (1.2.0)

HAL AMU

Numérisation de Documents Anciens Mathématiques

Open Repository and Bibliography - Luxembourg

Explore Bristol Research

A priori estimates for the Hill and Dirac operators

Author: A. M. Savchuk
B. Levitan
B. Ya. Levin
E. Korotyaev
H. Flaschka
J. Garnett
M. A. Lavrent’ev
M. I. Neĭman-zade
M. Krein
P. Kargaev
R. Johnson
T. Kappeler
V. A. Marchenko
Publication venue: 'Pleiades Publishing Ltd'
Publication date: 16/01/2007

Consider the Hill operator $Ty = -y'' + q'(t)y$ in $L^2(\mathbb{R})$, where $q \in L^2(0,1)$ is a 1-periodic real potential. The spectrum of $T$ is absolutely continuous and consists of bands separated by gaps $\gamma_n$, $n \ge 1$, with length $|\gamma_n| \ge 0$. We obtain a priori estimates of the gap lengths, effective masses, and action variables for the KdV equation. For example, if $\mu_n^\pm$ are the effective masses associated with the gap $\gamma_n = (\lambda_n^-, \lambda_n^+)$, then $|\mu_n^- + \mu_n^+| \le C|\gamma_n|^2 n^{-4}$ for some constant $C = C(q)$ and any $n \ge 1$. To prove these results we analyze a conformal mapping corresponding to the quasimomentum of the Hill operator. This makes it possible to reformulate the problems for the differential operator as problems of conformal mapping theory. The proof is then based on the analysis of the conformal mapping and the identities. Moreover, we obtain similar estimates for the Dirac operator.

arXiv.org e-Print Archive

Crossref