91 research outputs found
Adaptive covariance estimation with model selection
We provide in this paper a fully adaptive penalized procedure to select a
covariance among a collection of models, observing i.i.d. replications of the
process at fixed observation points. For this, we generalize previous results of
Bigot et al. and propose a data-driven penalty to obtain an oracle
inequality for the estimator. We prove that this method extends the work of
Baraud to the matricial regression model.
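The abstract above describes the penalized procedure only at a high level. As a purely illustrative sketch of the general recipe (the banded model collection, the Frobenius contrast, and the constant `c` below are assumptions for the example, not the paper's actual models or data-driven penalty), one can select among nested covariance models by minimizing an empirical risk plus a dimension-based penalty:

```python
import numpy as np

rng = np.random.default_rng(0)

# n i.i.d. replications of a process observed at p fixed points
n, p = 200, 8
X = rng.standard_normal((n, p)) @ np.diag(np.linspace(1.0, 2.0, p))

S = X.T @ X / n  # empirical covariance at the observation points


def band_project(S, k):
    # candidate model: keep only entries within bandwidth k of the diagonal
    idx = np.arange(len(S))
    mask = np.abs(np.subtract.outer(idx, idx)) <= k
    return S * mask


def select_bandwidth(S, n, c=1.0):
    # penalized criterion: Frobenius risk plus c * (model dimension) / n,
    # a toy stand-in for a data-driven penalty
    p = len(S)
    best, best_crit = None, np.inf
    for k in range(p):
        Sk = band_project(S, k)
        dim = p + 2 * sum(p - j for j in range(1, k + 1))  # free entries
        crit = np.linalg.norm(S - Sk, "fro") ** 2 + c * dim / n
        if crit < best_crit:
            best, best_crit = k, crit
    return best


k_hat = select_bandwidth(S, n)  # selected bandwidth
```

Extreme penalties behave as expected: a huge `c` forces the smallest model (bandwidth 0), while `c = 0` selects the saturated model that reproduces `S` exactly.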
Adaptive density estimation for stationary processes
We propose an algorithm to estimate the common marginal density of a stationary
process. We suppose that the process is either β-mixing or
τ-mixing. We provide a model selection procedure based on a generalization
of Mallows' C_p and we prove oracle inequalities for the selected estimator
under a few prior assumptions on the collection of models and on the mixing
coefficients. We prove that our estimator is adaptive over a class of Besov
spaces; namely, we prove that it achieves the same rates of convergence as in
the i.i.d. framework.
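As a hedged illustration of the kind of criterion involved (for i.i.d. data and regular histograms only; the paper's generalized criterion for mixing processes is not reproduced here, and the penalty constant `c` is an assumption), a Mallows-type penalized least-squares selection of the number of histogram bins can be sketched as:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000
x = rng.beta(2.0, 5.0, size=n)  # i.i.d. sample on [0, 1] for the toy example


def crit(x, D, c=2.0):
    # least-squares contrast of the regular D-bin histogram on [0, 1]:
    # gamma(D) = -sum_j phat_j^2 * h, with phat_j = N_j / (n h),
    # plus a Mallows-like linear penalty c * D / n
    n = len(x)
    counts, _ = np.histogram(x, bins=D, range=(0.0, 1.0))
    h = 1.0 / D
    phat = counts / (n * h)
    return -np.sum(phat**2) * h + c * D / n


def select_bins(x, Dmax=50, c=2.0):
    # pick the bin count minimizing the penalized criterion
    return min(range(1, Dmax + 1), key=lambda D: crit(x, D, c))


D_hat = select_bins(x)
```

A very large penalty constant collapses the choice to a single bin, while the default trades bias against the `D / n` variance term.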
Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path
We consider the problem of finding a near-optimal policy in continuous-space, discounted Markovian Decision Problems given the trajectory of some behaviour policy. We study the policy iteration algorithm where, in successive iterations, the action-value functions of the intermediate policies are obtained by picking a function from some fixed function set (chosen by the user) that minimizes an unbiased finite-sample approximation to a novel loss function that upper-bounds the unmodified Bellman-residual criterion. The main result is a finite-sample, high-probability bound on the performance of the resulting policy that depends on the mixing rate of the trajectory, the capacity of the function set as measured by a novel capacity concept that we call the VC-crossing dimension, the approximation power of the function set, and the discounted-average concentrability of the future-state distribution. To the best of our knowledge, this is the first theoretical reinforcement-learning result for off-policy control learning over continuous state spaces using a single trajectory.
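As a much-simplified, hypothetical sketch of approximate policy iteration from a single behaviour trajectory (tabular, finite, deterministic, and via an empirical model, so none of the paper's function sets, modified Bellman-residual loss, or continuous-space analysis appears; the chain MDP below is invented for the example):

```python
import numpy as np

rng = np.random.default_rng(2)
S, A, gamma = 5, 2, 0.9  # states, actions, discount factor


def step(s, a):
    # deterministic chain: action 0 moves left, action 1 moves right;
    # reward 1 whenever the next state is the rightmost one
    s2 = max(0, s - 1) if a == 0 else min(S - 1, s + 1)
    return s2, float(s2 == S - 1)


# a single long trajectory generated by a uniform random behaviour policy
T, s, traj = 5000, 0, []
for _ in range(T):
    a = int(rng.integers(A))
    s2, r = step(s, a)
    traj.append((s, a, r, s2))
    s = s2

# empirical reward and transition model estimated from that one sample path
cnt = np.zeros((S, A))
rsum = np.zeros((S, A))
nxt = np.zeros((S, A, S))
for (s, a, r, s2) in traj:
    cnt[s, a] += 1
    rsum[s, a] += r
    nxt[s, a, s2] += 1
Rhat = rsum / np.maximum(cnt, 1)
Phat = nxt / np.maximum(cnt, 1)[:, :, None]


def evaluate(pi, sweeps=300):
    # fixed-point iteration for the empirical Bellman equations of policy pi
    Q = np.zeros((S, A))
    for _ in range(sweeps):
        Q = Rhat + gamma * Phat @ Q[np.arange(S), pi]
    return Q


pi = np.zeros(S, dtype=int)  # start from "always move left"
for _ in range(10):          # policy iteration: evaluate, then act greedily
    Q = evaluate(pi)
    pi = Q.argmax(axis=1)
```

On this toy chain the iteration should recover the always-right policy `[1, 1, 1, 1, 1]`; the paper's contribution lies precisely in what this sketch omits, namely finite-sample guarantees when the evaluation step is a loss minimization over a rich function set on continuous state spaces.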
Unfortunate homonymies
Volume: 88, Start Page: 135, End Page: 13
New Scarabaeoidea beetles from the Palaearctic fauna
Volume: 97, Start Page: 295, End Page: 30
Contribution to the knowledge of the genus Eulasia Truqui (Coleoptera, Scarabaeoidea, Glaphyridae)
Volume: 97, Start Page: 107, End Page: 13
- …