Increasing the Action Gap: New Operators for Reinforcement Learning
This paper introduces new optimality-preserving operators on Q-functions. We
first describe an operator for tabular representations, the consistent Bellman
operator, which incorporates a notion of local policy consistency. We show that
this local consistency leads to an increase in the action gap at each state;
increasing this gap, we argue, mitigates the undesirable effects of
approximation and estimation errors on the induced greedy policies. This
operator can also be applied to discretized continuous space and time problems,
and we provide empirical results evidencing superior performance in this
context. Extending the idea of a locally consistent operator, we then derive
sufficient conditions for an operator to preserve optimality, leading to a
family of operators which includes our consistent Bellman operator. As
corollaries we provide a proof of optimality for Baird's advantage learning
algorithm and derive other gap-increasing operators with interesting
properties. We conclude with an empirical study on 60 Atari 2600 games
illustrating the strong potential of these new operators.
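As a rough illustration of the gap-increasing idea in the abstract above, the following Python sketch compares the standard Bellman operator T with Baird's advantage learning operator, written here as T_AL Q(x,a) = T Q(x,a) - alpha [max_b Q(x,b) - Q(x,a)]. The two-state MDP, the discount factor, and alpha = 0.5 are hypothetical choices made for this example and are not taken from the paper.

```python
# Toy comparison of the Bellman operator and advantage learning (AL).
# The MDP below is hypothetical and deterministic: NEXT[x][a] is the
# successor state and REWARD[x][a] the reward for action a in state x.
GAMMA = 0.9
NEXT = [[0, 1], [0, 1]]
REWARD = [[0.0, 1.0], [0.0, 2.0]]


def bellman(q):
    """Standard Bellman optimality operator T."""
    return [[REWARD[x][a] + GAMMA * max(q[NEXT[x][a]]) for a in range(2)]
            for x in range(2)]


def advantage_learning(q, alpha):
    """AL operator: T Q(x,a) - alpha * (max_b Q(x,b) - Q(x,a))."""
    tq = bellman(q)
    return [[tq[x][a] - alpha * (max(q[x]) - q[x][a]) for a in range(2)]
            for x in range(2)]


def iterate(operator, steps=2000):
    """Apply the operator repeatedly, starting from Q = 0."""
    q = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(steps):
        q = operator(q)
    return q


def greedy(q):
    """Index of the maximizing action in each state."""
    return [row.index(max(row)) for row in q]


def action_gap(q):
    """Best minus second-best action value (two actions per state here)."""
    return [max(row) - min(row) for row in q]


ALPHA = 0.5
q_bellman = iterate(bellman)
q_al = iterate(lambda q: advantage_learning(q, ALPHA))

print("greedy policy (Bellman):", greedy(q_bellman))
print("greedy policy (AL):     ", greedy(q_al))
print("action gaps (Bellman):  ", action_gap(q_bellman))
print("action gaps (AL):       ", action_gap(q_al))
```

In this toy problem both iterations settle on the same greedy policy and the same optimal state values, while the AL action gaps come out larger by a factor of 1/(1 - alpha).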
Improved asymptotics of the spectral gap for the Mathieu operator
The Mathieu operator
\begin{equation*}
L(y) = -y'' + 2a\cos(2x)\,y, \quad a \in \mathbb{C},\; a \neq 0,
\end{equation*}
considered with periodic or anti-periodic boundary conditions has, close to
$n^2$ for large enough $n$, two periodic (if $n$ is even) or anti-periodic
(if $n$ is odd) eigenvalues $\lambda_n^-$, $\lambda_n^+$. For fixed $a$, we
show that
\begin{equation*}
\lambda_n^+ - \lambda_n^- = \pm \frac{8(a/4)^n}{[(n-1)!]^2} \left[1 - \frac{a^2}{4n^3} + O\left(\frac{1}{n^4}\right)\right], \quad n \rightarrow \infty.
\end{equation*}
This result extends the asymptotic formula of Harrell-Avron-Simon by
providing more asymptotic terms.
Einstein gravity from ANEC correlators
We study correlation functions with multiple averaged null energy (ANEC)
operators in conformal field theories. For large $N$ CFTs with a large gap to
higher spin operators, we show that the OPE between a local operator and the
ANEC can be recast as a particularly simple differential operator acting on the
local operator. This operator is simple enough that we can resum it and obtain
the finite distance OPE. Under the large $N$ - large gap assumptions, the
vanishing of the commutator of ANEC operators tightly constrains the OPE
coefficients of the theory. An important example of this phenomenon is the
conclusion that $a=c$ in $d=4$. This implies that the bulk dual of such a CFT
is semi-classical Einstein-gravity with minimally coupled matter.
Comment: 32 pages + appendices, 6 figures; v2: typos corrected and a comment
added in introduction
Random walk on surfaces with hyperbolic cusps
We consider the operator associated to a random walk on finite volume
surfaces with hyperbolic cusps. We study the spectral gap (upper and lower
bound) associated to this operator and deduce some rate of convergence of the
iterated kernel towards its stationary distribution.
Comment: 28 pages
Estimating the spectral gap of a trace-class Markov operator
The utility of a Markov chain Monte Carlo algorithm is, in large part,
determined by the size of the spectral gap of the corresponding Markov
operator. However, calculating (and even approximating) the spectral gaps of
practical Monte Carlo Markov chains in statistics has proven to be an extremely
difficult and often insurmountable task, especially when these chains move on
continuous state spaces. In this paper, a method for accurate estimation of the
spectral gap is developed for general state space Markov chains whose operators
are non-negative and trace-class. The method is based on the fact that the
second largest eigenvalue (and hence the spectral gap) of such operators can be
bounded above and below by simple functions of the power sums of the
eigenvalues. These power sums often have nice integral representations. A
classical Monte Carlo method is proposed to estimate these integrals, and a
simple sufficient condition for finite variance is provided. This leads to
asymptotically valid confidence intervals for the second largest eigenvalue
(and the spectral gap) of the Markov operator. In contrast with previously
existing techniques, our method is not based on a near-stationary version of
the Markov chain, which, paradoxically, cannot be obtained in a principled
manner without bounds on the spectral gap. On the other hand, it can be quite
expensive from a computational standpoint. The efficiency of the method is
studied both theoretically and empirically.
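The bounding idea in the abstract above can be sketched on a finite state space. For eigenvalues 1 = lambda_0 >= lambda_1 >= ... >= 0, write s_k for the sum of lambda_i^k over i >= 1; then s_k / s_{k-1} <= lambda_1 <= s_k^{1/k}. This is one elementary pair of bounds consistent with the description, and whether it matches the paper's exact form is an assumption. The 3x3 matrix below is a hypothetical stand-in for a non-negative trace-class operator, and the power sums are computed exactly from traces, whereas the paper estimates them by Monte Carlo.

```python
# Power-sum bounds on the second largest eigenvalue of a Markov operator.
# P is a hypothetical symmetric stochastic matrix with eigenvalues
# 1, 0.6 and 0.4, so the true second largest eigenvalue is 0.6.
P = [[0.7, 0.2, 0.1],
     [0.2, 0.6, 0.2],
     [0.1, 0.2, 0.7]]


def matmul(a, b):
    """Plain-Python product of two square matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def power_sum(matrix, k):
    """s_k = trace(P^k) - 1: the k-th power sum of the eigenvalues
    with the leading eigenvalue 1 removed."""
    result = matrix
    for _ in range(k - 1):
        result = matmul(result, matrix)
    return sum(result[i][i] for i in range(len(matrix))) - 1.0


def bounds(matrix, k):
    """Return (lower, upper) with lower <= lambda_1 <= upper, for k >= 2."""
    s_k = power_sum(matrix, k)
    s_prev = power_sum(matrix, k - 1)
    return s_k / s_prev, s_k ** (1.0 / k)


for k in (2, 4, 8):
    lower, upper = bounds(P, k)
    print(f"k={k}: {lower:.4f} <= lambda_1 <= {upper:.4f}")
```

Both bounds tighten as k grows, and the spectral gap is then bracketed by 1 minus the upper bound and 1 minus the lower bound.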
A natural derivative on [0,n] and a binomial Poincaré inequality
We consider probability measures supported on a finite discrete interval
$[0,n]$. We introduce a new finite-difference operator, defined as a
linear combination of left and right finite differences. We show that this
operator plays a key role in a new Poincaré (spectral gap)
inequality with respect to binomial weights, with the orthogonal Krawtchouk
polynomials acting as eigenfunctions of the relevant operator. We briefly
discuss the relationship of this operator to the problem of optimal transport
of probability measures.
A priori estimates for the Hill and Dirac operators
Consider the Hill operator with a 1-periodic real potential. Its spectrum is absolutely
continuous and consists of bands separated by gaps $\gamma_n$, $n\ge 1$, with length
$|\gamma_n|\ge 0$. We obtain a priori estimates of the gap lengths, effective
masses, and action variables for the KdV equation. For example, if $\mu_n^\pm$ are the
effective masses associated with the gap $\gamma_n=(\lambda_n^-,\lambda_n^+)$, then
$|\mu_n^-+\mu_n^+|\le C|\gamma_n|^2n^{-4}$ for some constant $C$ and any $n\ge 1$. To prove these results we analyze a conformal mapping
corresponding to the quasimomentum of the Hill operator. This makes it possible to
reformulate the problems for the differential operator as problems of
conformal mapping theory. The proof is then based on the analysis of the
conformal mapping and the identities. Moreover, we obtain similar estimates
for the Dirac operator.