91 research outputs found
Adaptive covariance estimation with model selection
We provide in this paper a fully adaptive penalized procedure to select a
covariance among a collection of models, observing i.i.d. replications of the
process at fixed observation points. For this, we generalize previous results of
Bigot et al. and propose a data-driven penalty to obtain an oracle
inequality for the estimator. We prove that this method extends the work of
Baraud to the matricial regression model.
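The abstract above describes the penalized procedure only at a high level. As a purely illustrative sketch of the general recipe (the banded model collection, the Frobenius contrast, and the constant `c` below are assumptions for the example, not the paper's actual models or data-driven penalty), one can select among nested covariance models by minimizing an empirical risk plus a dimension-based penalty:

```python
import numpy as np

rng = np.random.default_rng(0)

# n i.i.d. replications of a process observed at p fixed points
n, p = 200, 8
X = rng.standard_normal((n, p)) @ np.diag(np.linspace(1.0, 2.0, p))

S = X.T @ X / n  # empirical covariance at the observation points


def band_project(S, k):
    # candidate model: keep only entries within bandwidth k of the diagonal
    idx = np.arange(len(S))
    mask = np.abs(np.subtract.outer(idx, idx)) <= k
    return S * mask


def select_bandwidth(S, n, c=1.0):
    # penalized criterion: Frobenius risk plus c * (model dimension) / n,
    # a toy stand-in for a data-driven penalty
    p = len(S)
    best, best_crit = None, np.inf
    for k in range(p):
        Sk = band_project(S, k)
        dim = p + 2 * sum(p - j for j in range(1, k + 1))  # free entries
        crit = np.linalg.norm(S - Sk, "fro") ** 2 + c * dim / n
        if crit < best_crit:
            best, best_crit = k, crit
    return best


k_hat = select_bandwidth(S, n)  # selected bandwidth
```

Extreme penalties behave as expected: a huge `c` forces the smallest model (bandwidth 0), while `c = 0` selects the saturated model that reproduces `S` exactly.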
Adaptive density estimation for stationary processes
We propose an algorithm to estimate the common marginal density of a stationary
process. We suppose that the process is either β-mixing or
τ-mixing. We provide a model selection procedure based on a generalization
of Mallows' C_p and we prove oracle inequalities for the selected estimator
under a few prior assumptions on the collection of models and on the mixing
coefficients. We prove that our estimator is adaptive over a class of Besov
spaces; namely, we prove that it achieves the same rates of convergence as in
the i.i.d. framework.
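As a hedged illustration of the kind of criterion involved (for i.i.d. data and regular histograms only; the paper's generalized criterion for mixing processes is not reproduced here, and the penalty constant `c` is an assumption), a Mallows-type penalized least-squares selection of the number of histogram bins can be sketched as:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000
x = rng.beta(2.0, 5.0, size=n)  # i.i.d. sample on [0, 1] for the toy example


def crit(x, D, c=2.0):
    # least-squares contrast of the regular D-bin histogram on [0, 1]:
    # gamma(D) = -sum_j phat_j^2 * h, with phat_j = N_j / (n h),
    # plus a Mallows-like linear penalty c * D / n
    n = len(x)
    counts, _ = np.histogram(x, bins=D, range=(0.0, 1.0))
    h = 1.0 / D
    phat = counts / (n * h)
    return -np.sum(phat**2) * h + c * D / n


def select_bins(x, Dmax=50, c=2.0):
    # pick the bin count minimizing the penalized criterion
    return min(range(1, Dmax + 1), key=lambda D: crit(x, D, c))


D_hat = select_bins(x)
```

A very large penalty constant collapses the choice to a single bin, while the default trades bias against the `D / n` variance term.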
Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path
We consider the problem of finding a near-optimal policy in continuous-space, discounted Markovian Decision Problems given the trajectory of some behaviour policy. We study the policy iteration algorithm where, in successive iterations, the action-value functions of the intermediate policies are obtained by picking a function from some fixed function set (chosen by the user) that minimizes an unbiased finite-sample approximation to a novel loss function that upper-bounds the unmodified Bellman-residual criterion. The main result is a finite-sample, high-probability bound on the performance of the resulting policy that depends on the mixing rate of the trajectory, the capacity of the function set as measured by a novel capacity concept that we call the VC-crossing dimension, the approximation power of the function set, and the discounted-average concentrability of the future-state distribution. To the best of our knowledge, this is the first theoretical reinforcement-learning result for off-policy control learning over continuous state spaces using a single trajectory.
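As a much-simplified, hypothetical sketch of approximate policy iteration from a single behaviour trajectory (tabular, finite, deterministic, and via an empirical model, so none of the paper's function sets, modified Bellman-residual loss, or continuous-space analysis appears; the chain MDP below is invented for the example):

```python
import numpy as np

rng = np.random.default_rng(2)
S, A, gamma = 5, 2, 0.9  # states, actions, discount factor


def step(s, a):
    # deterministic chain: action 0 moves left, action 1 moves right;
    # reward 1 whenever the next state is the rightmost one
    s2 = max(0, s - 1) if a == 0 else min(S - 1, s + 1)
    return s2, float(s2 == S - 1)


# a single long trajectory generated by a uniform random behaviour policy
T, s, traj = 5000, 0, []
for _ in range(T):
    a = int(rng.integers(A))
    s2, r = step(s, a)
    traj.append((s, a, r, s2))
    s = s2

# empirical reward and transition model estimated from that one sample path
cnt = np.zeros((S, A))
rsum = np.zeros((S, A))
nxt = np.zeros((S, A, S))
for (s, a, r, s2) in traj:
    cnt[s, a] += 1
    rsum[s, a] += r
    nxt[s, a, s2] += 1
Rhat = rsum / np.maximum(cnt, 1)
Phat = nxt / np.maximum(cnt, 1)[:, :, None]


def evaluate(pi, sweeps=300):
    # fixed-point iteration for the empirical Bellman equations of policy pi
    Q = np.zeros((S, A))
    for _ in range(sweeps):
        Q = Rhat + gamma * Phat @ Q[np.arange(S), pi]
    return Q


pi = np.zeros(S, dtype=int)  # start from "always move left"
for _ in range(10):          # policy iteration: evaluate, then act greedily
    Q = evaluate(pi)
    pi = Q.argmax(axis=1)
```

On this toy chain the iteration should recover the always-right policy `[1, 1, 1, 1, 1]`; the paper's contribution lies precisely in what this sketch omits, namely finite-sample guarantees when the evaluation step is a loss minimization over a rich function set on continuous state spaces.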
Unfortunate homonymies
Volume: 88, Start Page: 135, End Page: 13
New Scarabaeoidea beetles from the Palaearctic fauna
Volume: 97, Start Page: 295, End Page: 30
Contribution to the knowledge of the genus Eulasia Truqui (Coleoptera, Scarabaeoidea, Glaphyridae)
Volume: 97, Start Page: 107, End Page: 13
- …