Bounding the expectation of the supremum of an empirical process over a (weak) VC-major class
Given a bounded class of functions G and independent random variables X_1, ..., X_n, we provide an upper bound for the expectation of the supremum of the empirical process over the elements of G having a small variance. Our bound applies in the cases where G is a VC-subgraph or a VC-major class, and it is of smaller order than the one that a universal entropy bound over the whole class G would give. It also involves explicit constants and does not require the knowledge of the entropy of G.
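In schematic terms (the notation below is ours, not reproduced from the paper), the quantity being controlled is

\[
\mathbb{E}\Big[\sup_{g\in G_{\sigma}}\Big|\sum_{i=1}^{n}\big(g(X_i)-\mathbb{E}\big[g(X_i)\big]\big)\Big|\Big],
\qquad
G_{\sigma}=\Big\{g\in G:\ \frac{1}{n}\sum_{i=1}^{n}\mathrm{Var}\big(g(X_i)\big)\le\sigma^{2}\Big\},
\]

and the point of the result is that this expectation admits a bound with explicit constants that exploits the VC-subgraph or VC-major structure of G directly, instead of going through a universal entropy bound over the whole class.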
Rates of convergence of rho-estimators for sets of densities satisfying shape constraints
The purpose of this paper is to pursue our study of rho-estimators built from
i.i.d. observations that we defined in Baraud et al. (2014). For a
\rho-estimator based on some model S (which means that the estimator belongs to
S) and a true distribution of the observations that also belongs to S, the risk
(with squared Hellinger loss) is bounded by a quantity which can be viewed as a
dimension function of the model and is often related to the "metric dimension"
of this model, as defined in Birg\'e (2006). This is a minimax point of view
and it is well-known that it is pessimistic. Typically, the bound is accurate
for most points in the model but may be very pessimistic when the true
distribution belongs to some specific part of it. This is the situation that we
want to investigate here. For some models, like the set of decreasing densities
on [0,1], there exist specific points in the model that we shall call
"extremal" and for which the risk is substantially smaller than the typical
risk. Moreover, the risk at a non-extremal point of the model can be bounded by
the sum of the risk bound at a well-chosen extremal point plus the square of
its distance to this point. This implies that if the true density is close
enough to an extremal point, the risk at this point may be smaller than the
minimax risk on the model and this actually remains true even if the true
density does not belong to the model. The result is based on some refined
bounds on the suprema of empirical processes that are established in Baraud
(2016).
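A hedged sketch of this phenomenon, writing h for the Hellinger distance, \hat{s} for the \rho-estimator and R(\bar{s}) for the risk bound at a well-chosen extremal point \bar{s} (the exact constants and conditions are those of the paper, not reproduced here):

\[
C\,\mathbb{E}\big[h^{2}(s,\hat{s})\big]\;\le\;R(\bar{s})\;+\;h^{2}(s,\bar{s}),
\]

so a true density s close enough to an extremal point inherits the smaller risk available at that point, even when s lies outside the model.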
Robust Bayes-like estimation: Rho-Bayes estimation
We consider the problem of estimating the joint distribution P of n independent random variables within the Bayes paradigm from a non-asymptotic point of view. Assuming that P admits some density s with respect to a given reference measure, we consider a density model S for s that we endow with a prior distribution \pi (with support S) and we build a robust alternative to the classical Bayes posterior distribution which possesses similar concentration properties around s whenever it belongs to the model S. Furthermore, in density estimation, the Hellinger distance between the classical and the robust posterior distributions tends to 0, as the number of observations tends to infinity, under suitable assumptions on the model and the prior, provided that the model contains the true density s. However, unlike what happens with the classical Bayes posterior distribution, we show that the concentration properties of this new posterior distribution are still preserved in the case of a misspecification of the model, that is when s does not belong to S but is close enough to it with respect to the Hellinger distance.
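For reference, the squared Hellinger distance between two probabilities P and Q dominated by a measure \mu is

\[
h^{2}(P,Q)=\frac{1}{2}\int\Big(\sqrt{\frac{dP}{d\mu}}-\sqrt{\frac{dQ}{d\mu}}\Big)^{2}\,d\mu,
\]

which is the loss under which both the concentration and the misspecification statements above are formulated.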
Rho-estimators revisited: General theory and applications
Following Baraud, Birg\'e and Sart (2017), we pursue our attempt to design a robust universal estimator of the joint distribution of independent (but not necessarily i.i.d.) observations for a Hellinger-type loss. Given such observations with an unknown joint distribution P and a dominated model \mathcal{Q} for P, we build an estimator \hat{P} based on \mathcal{Q} and measure its risk by a Hellinger-type distance. When P does belong to the model, this risk is bounded by some quantity which relies on the local complexity of the model in a vicinity of P. In most situations this bound corresponds to the minimax risk over the model (up to a possible logarithmic factor). When P does not belong to the model, its risk involves an additional bias term proportional to the distance between P and \mathcal{Q}, whatever the true distribution P. From this point of view, this new version of \rho-estimators improves upon the previous one described in Baraud, Birg\'e and Sart (2017), which required that P be absolutely continuous with respect to some known reference measure. Further improvements have been brought as compared to the former construction. In particular, the new version provides a very general treatment of the regression framework with random design as well as a computationally tractable procedure for aggregating estimators. We also give some conditions for the Maximum Likelihood Estimator to be a \rho-estimator. Finally, we consider the situation where the statistician has many different models at their disposal, and we build a penalized version of the \rho-estimator for model selection and adaptation purposes. In the regression setting, this penalized estimator not only allows one to estimate the regression function but also the distribution of the errors.
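Schematically, with the notation used above (and with D_n denoting an unspecified local complexity term, a symbol chosen here for illustration), the risk bound has the familiar bias-plus-complexity shape

\[
C\,\mathbb{E}\big[h^{2}(P,\hat{P})\big]\;\le\;\inf_{Q\in\mathcal{Q}}h^{2}(P,Q)\;+\;D_{n}(\mathcal{Q}),
\]

where the infimum is the bias term that vanishes when P belongs to the model and D_n(\mathcal{Q}) accounts for the local complexity of \mathcal{Q} in a vicinity of P.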
Estimating composite functions by model selection
We consider the problem of estimating a function s on a k-dimensional domain for large values of k by looking for some best approximation of s by composite functions of the form g \circ u. Our solution is based on model selection and leads to a very general approach to solve this problem with respect to many different types of functions and statistical frameworks. In particular, we handle the problems of approximating s by additive functions, single and multiple index models, neural networks, mixtures of Gaussian densities (when s is a density), among other examples. We also investigate the situation where s = g \circ u for functions g and u belonging to possibly anisotropic smoothness classes. In this case, our approach leads to a completely adaptive estimator with respect to the regularity of g and u.
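Concrete instances of the structures handled above, written out with standard definitions for clarity:

\[
s(x)=\sum_{j=1}^{k}g_{j}(x_{j})\ \ \text{(additive)},\qquad
s(x)=g(\langle\theta,x\rangle)\ \ \text{(single index)},\qquad
s(x)=g\big(\langle\theta_{1},x\rangle,\ldots,\langle\theta_{m},x\rangle\big)\ \ \text{(multiple index)},
\]

each of which is of the composite form s = g \circ u for suitable choices of g and u.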
Tests and estimation strategies associated to some loss functions
We consider the problem of estimating the joint distribution of n independent random variables. Given a loss function and a family of candidate probabilities, which we shall call a model, we aim at designing an estimator with values in our model that possesses good estimation properties not only when the distribution of the data belongs to the model but also when it lies close enough to it. The losses we have in mind are the total variation, Hellinger, Wasserstein and L_p-distances, to name a few. We show that the risk of our estimator can be bounded by the sum of an approximation term that accounts for the loss between the true distribution and the model and a complexity term that corresponds to the bound we would get if this distribution did belong to the model. Our results hold under mild assumptions on the true distribution of the data and are based on exponential deviation inequalities that are non-asymptotic and involve explicit constants. Interestingly, when the model reduces to two distinct probabilities, our procedure results in a robust test whose errors of first and second kinds only depend on the losses between the true distribution and the two tested probabilities.
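Writing \ell for the chosen loss, P^{\star} for the true joint distribution, \mathcal{M} for the model and \hat{P} for the estimator (symbols ours), the announced bound takes the schematic form

\[
\ell\big(P^{\star},\hat{P}\big)\;\le\;C\Big[\inf_{Q\in\mathcal{M}}\ell\big(P^{\star},Q\big)+\Delta_{n}(\mathcal{M},\xi)\Big]
\quad\text{with probability at least } 1-e^{-\xi},
\]

where the infimum is the approximation term, \Delta_{n} the complexity term, and the exponential dependence on \xi reflects the non-asymptotic deviation inequalities with explicit constants.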
From robust tests to Bayes-like posterior distributions
In the Bayes paradigm and for a given loss function, we propose the construction of a new type of posterior distribution, which extends the classical Bayes one, for estimating the law of an n-sample. The loss functions we have in mind are based on the total variation and Hellinger distances as well as some L_p-ones. We prove that, with a
probability close to one, this new posterior distribution concentrates its mass
in a neighbourhood of the law of the data, for the chosen loss function,
provided that this law belongs to the support of the prior or, at least, lies
close enough to it. We therefore establish that the new posterior distribution
enjoys some robustness properties with respect to a possible misspecification
of the prior, or more precisely, its support. For the total variation and
squared Hellinger losses, we also show that the posterior distribution keeps
its concentration properties when the data are only independent, hence not
necessarily i.i.d., provided that most of their marginals or the average of
these are close enough to some probability distribution around which the prior
puts enough mass. The posterior distribution is therefore also stable with
respect to the equidistribution assumption. We illustrate these results by
several applications. We consider the problems of estimating a location
parameter or both the location and the scale of a density in a nonparametric
framework. Finally, we also tackle the problem of estimating a density, with
the squared Hellinger loss, in a high-dimensional parametric model under some
sparsity conditions. The results established in this paper are non-asymptotic
and provide, as much as possible, explicit constants.
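A schematic rendering of the concentration property (notation ours): if \hat{\pi}_{X} denotes the new posterior built from the sample X, \ell the chosen loss and P^{\star} the law of the data, then with probability close to one

\[
\hat{\pi}_{X}\Big(\big\{Q:\ \ell\big(P^{\star},Q\big)\le C\big[\ell\big(P^{\star},\mathrm{supp}(\pi)\big)+r_{n}\big]\big\}\Big)\;\ge\;1-\epsilon,
\]

for some rate term r_n; the robustness statement is that only the distance from P^{\star} to the support of the prior enters, not membership of P^{\star} in the model.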