
    Increasing the Action Gap: New Operators for Reinforcement Learning

    This paper introduces new optimality-preserving operators on Q-functions. We first describe an operator for tabular representations, the consistent Bellman operator, which incorporates a notion of local policy consistency. We show that this local consistency leads to an increase in the action gap at each state; increasing this gap, we argue, mitigates the undesirable effects of approximation and estimation errors on the induced greedy policies. This operator can also be applied to discretized continuous space and time problems, and we provide empirical results evidencing superior performance in this context. Extending the idea of a locally consistent operator, we then derive sufficient conditions for an operator to preserve optimality, leading to a family of operators which includes our consistent Bellman operator. As corollaries we provide a proof of optimality for Baird's advantage learning algorithm and derive other gap-increasing operators with interesting properties. We conclude with an empirical study on 60 Atari 2600 games illustrating the strong potential of these new operators.
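
The abstract does not spell out the operators, so the following is only a minimal tabular sketch of one gap-increasing operator in the advantage-learning form, T_AL Q(x,a) = TQ(x,a) - alpha * (V(x) - Q(x,a)) with V(x) = max_b Q(x,b). The array layout (`P`, `R`), the function names, and the plain fixed-point iteration are illustrative assumptions, not the paper's code.

```python
import numpy as np

def bellman(Q, P, R, gamma):
    """Standard Bellman optimality operator on a tabular Q-function.

    Assumed layout: P has shape (S, A, S) with P[x, a, y] = Pr(y | x, a),
    and R has shape (S, A).
    """
    V = Q.max(axis=1)            # V(y) = max_b Q(y, b)
    return R + gamma * (P @ V)   # (S, A, S) @ (S,) -> (S, A)

def advantage_learning(Q, P, R, gamma, alpha):
    """Advantage-learning operator: TQ(x, a) - alpha * (V(x) - Q(x, a)).

    alpha is assumed to lie in [0, 1). Because V(x) - Q(x, a) >= 0, the
    result never exceeds TQ and equals it at the greedy action.
    """
    V = Q.max(axis=1, keepdims=True)
    return bellman(Q, P, R, gamma) - alpha * (V - Q)

def iterate(op, shape, n_iter=2000):
    """Apply the operator `op` repeatedly, starting from Q = 0."""
    Q = np.zeros(shape)
    for _ in range(n_iter):
        Q = op(Q)
    return Q
```

On a small random MDP, iterating either operator should give the same greedy policy and the same state values, while the advantage-learning iterate assigns lower values to non-greedy actions, which is the wider action gap the abstract refers to.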

    Improved asymptotics of the spectral gap for the Mathieu operator

    The Mathieu operator \begin{equation*} L(y) = -y'' + 2a\cos(2x)\,y, \quad a \in \mathbb{C},\; a \neq 0, \end{equation*} considered with periodic or anti-periodic boundary conditions has, close to $n^2$ for large enough $n$, two periodic (if $n$ is even) or anti-periodic (if $n$ is odd) eigenvalues $\lambda_n^-$, $\lambda_n^+$. For fixed $a$, we show that \begin{equation*} \lambda_n^+ - \lambda_n^- = \pm \frac{8(a/4)^n}{[(n-1)!]^2} \left[1 - \frac{a^2}{4n^3} + O\left(\frac{1}{n^4}\right)\right], \quad n \rightarrow \infty. \end{equation*} This result extends the asymptotic formula of Harrell-Avron-Simon by providing more asymptotic terms.
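
As a quick way to get a feel for the size of these gaps, the stated formula can be evaluated directly. This is a sketch only: the function name and the `corrected` flag are hypothetical, the overall sign is left out, and the O(1/n^4) remainder is dropped.

```python
from math import factorial

def mathieu_gap_asymptotic(a, n, corrected=True):
    """Evaluate 8 (a/4)^n / [(n-1)!]^2 * [1 - a^2 / (4 n^3)].

    Returns the expression up to the overall sign and without the
    O(1/n^4) remainder. With corrected=False only the leading factor
    is returned. `a` may be real or complex; n is a positive integer.
    """
    leading = 8 * (a / 4) ** n / factorial(n - 1) ** 2
    if not corrected:
        return leading
    return leading * (1 - a ** 2 / (4 * n ** 3))
```

For example, `mathieu_gap_asymptotic(1.0, 2, corrected=False)` gives the leading factor 8 * (1/4)^2 = 0.5. The formula is asymptotic in n, so values for small n are only indicative.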

    Einstein gravity from ANEC correlators

    We study correlation functions with multiple averaged null energy (ANEC) operators in conformal field theories. For large-$N$ CFTs with a large gap to higher spin operators, we show that the OPE between a local operator and the ANEC can be recast as a particularly simple differential operator acting on the local operator. This operator is simple enough that we can resum it and obtain the finite distance OPE. Under the large-$N$, large-gap assumptions, the vanishing of the commutator of ANEC operators tightly constrains the OPE coefficients of the theory. An important example of this phenomenon is the conclusion that $a=c$ in $d=4$. This implies that the bulk dual of such a CFT is semi-classical Einstein gravity with minimally coupled matter. Comment: 32 pages + appendices, 6 figures; v2: typos corrected and a comment added in introduction.

    Random walk on surfaces with hyperbolic cusps

    We consider the operator associated to a random walk on finite volume surfaces with hyperbolic cusps. We study the spectral gap (upper and lower bounds) associated to this operator and deduce a rate of convergence of the iterated kernel towards its stationary distribution. Comment: 28 pages.

    A natural derivative on [0,n] and a binomial Poincaré inequality

    We consider probability measures supported on a finite discrete interval $[0,n]$. We introduce a new finite difference operator $\nabla_n$, defined as a linear combination of left and right finite differences. We show that this operator $\nabla_n$ plays a key role in a new Poincaré (spectral gap) inequality with respect to binomial weights, with the orthogonal Krawtchouk polynomials acting as eigenfunctions of the relevant operator. We briefly discuss the relationship of this operator to the problem of optimal transport of probability measures.

    Estimating the spectral gap of a trace-class Markov operator

    The utility of a Markov chain Monte Carlo algorithm is, in large part, determined by the size of the spectral gap of the corresponding Markov operator. However, calculating (and even approximating) the spectral gaps of practical Monte Carlo Markov chains in statistics has proven to be an extremely difficult and often insurmountable task, especially when these chains move on continuous state spaces. In this paper, a method for accurate estimation of the spectral gap is developed for general state space Markov chains whose operators are non-negative and trace-class. The method is based on the fact that the second largest eigenvalue (and hence the spectral gap) of such operators can be bounded above and below by simple functions of the power sums of the eigenvalues. These power sums often have nice integral representations. A classical Monte Carlo method is proposed to estimate these integrals, and a simple sufficient condition for finite variance is provided. This leads to asymptotically valid confidence intervals for the second largest eigenvalue (and the spectral gap) of the Markov operator. In contrast with previously existing techniques, our method is not based on a near-stationary version of the Markov chain, which, paradoxically, cannot be obtained in a principled manner without bounds on the spectral gap. On the other hand, it can be quite expensive from a computational standpoint. The efficiency of the method is studied both theoretically and empirically.
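
To illustrate the kind of bound described here, the sketch below works on a finite state space where the power sums can be computed exactly as traces. It assumes eigenvalues 1 = l_0 >= l_1 >= ... >= 0 with l_1 > 0 and uses the elementary inequalities s_k / s_{k-1} <= l_1 <= s_k^(1/k), where s_k is the sum of l_i^k over i >= 1. The function name and the choice of these particular bounds are assumptions for illustration. The abstract's method estimates the power sums by Monte Carlo on general state spaces, which is not attempted here.

```python
import numpy as np

def power_sum_bounds(K, k):
    """Lower and upper bounds on the second largest eigenvalue of K.

    Assumes K is a symmetric, positive semidefinite stochastic matrix
    whose second largest eigenvalue is positive, and that k >= 2.
    The power sums exclude the top eigenvalue 1, hence the "- 1.0".
    """
    s_prev = np.trace(np.linalg.matrix_power(K, k - 1)) - 1.0
    s_k = np.trace(np.linalg.matrix_power(K, k)) - 1.0
    return s_k / s_prev, s_k ** (1.0 / k)
```

A two-step version `M @ M` of a symmetric random walk `M` is one simple way to obtain a positive semidefinite test matrix. In this finite setting the result can be checked against `np.linalg.eigvalsh`.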

    A priori estimates for the Hill and Dirac operators

    Consider the Hill operator $Ty=-y''+q'(t)y$ in $L^2(\mathbb{R})$, where $q\in L^2(0,1)$ is a 1-periodic real potential. The spectrum of $T$ is absolutely continuous and consists of bands separated by gaps $\gamma_n$, $n\ge 1$, with length $|\gamma_n|\ge 0$. We obtain a priori estimates of the gap lengths, effective masses, and action variables for the KdV equation. For example, if $\mu_n^\pm$ are the effective masses associated with the gap $\gamma_n=(\lambda_n^-,\lambda_n^+)$, then $|\mu_n^-+\mu_n^+|\le C|\gamma_n|^2 n^{-4}$ for some constant $C=C(q)$ and any $n\ge 1$. To prove these results we use the analysis of a conformal mapping corresponding to the quasimomentum of the Hill operator. This makes it possible to reformulate the problems for the differential operator as problems of conformal mapping theory. The proof is then based on the analysis of the conformal mapping and the identities. Moreover, we obtain similar estimates for the Dirac operator.

    Whittaker-Hill equation and semifinite-gap Schroedinger operators

    A periodic one-dimensional Schroedinger operator is called semifinite-gap if every second gap in its spectrum is eventually closed. We construct explicit examples of semifinite-gap Schroedinger operators in trigonometric functions by applying Darboux transformations to the Whittaker-Hill equation. We give a criterion for the regularity of the corresponding potentials and investigate the spectral properties of the new operators. Comment: Revised version.