Rates of convergence of rho-estimators for sets of densities satisfying shape constraints
The purpose of this paper is to pursue our study of rho-estimators built from
i.i.d. observations that we defined in Baraud et al. (2014). For a
rho-estimator based on some model S (which means that the estimator belongs to
S) and a true distribution of the observations that also belongs to S, the risk
(with squared Hellinger loss) is bounded by a quantity which can be viewed as a
dimension function of the model and is often related to the "metric dimension"
of this model, as defined in Birgé (2006). This is a minimax point of view
and it is well-known that it is pessimistic. Typically, the bound is accurate
for most points in the model but may be very pessimistic when the true
distribution belongs to some specific part of it. This is the situation that we
want to investigate here. For some models, like the set of decreasing densities
on [0,1], there exist specific points in the model that we shall call
"extremal" and for which the risk is substantially smaller than the typical
risk. Moreover, the risk at a non-extremal point of the model can be bounded by
the sum of the risk bound at a well-chosen extremal point plus the square of
its distance to this point. This implies that if the true density is close
enough to an extremal point, the risk at this point may be smaller than the
minimax risk on the model and this actually remains true even if the true
density does not belong to the model. The result is based on some refined
bounds on the suprema of empirical processes that are established in Baraud
(2016).
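The two risk bounds described above can be summarized schematically in squared Hellinger loss (a sketch with assumed notation: h for the Hellinger distance, D_n for the dimension function of a model, s the true density and s-hat the rho-estimator; not taken verbatim from the paper):

```latex
% Minimax-type bound on the model S for the rho-estimator \hat{s}:
\mathbb{E}\big[h^2(s,\hat{s})\big] \;\le\; C\,\frac{D_n(S)}{n}
  \quad\text{for all } s \in S.
% Improved bound near an extremal point p of S
% (also meaningful when s lies outside S):
\mathbb{E}\big[h^2(s,\hat{s})\big] \;\le\;
  C'\Big(\frac{D_n(p)}{n} \;+\; h^2(s,p)\Big).
```

The second bound captures the claim that the risk at a non-extremal point is controlled by the risk bound at a well-chosen extremal point plus the squared distance to that point.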
Caveats for information bottleneck in deterministic scenarios
Information bottleneck (IB) is a method for extracting information from one
random variable X that is relevant for predicting another random variable
Y. To do so, IB identifies an intermediate "bottleneck" variable T that has
low mutual information I(X;T) and high mutual information I(Y;T). The "IB
curve" characterizes the set of bottleneck variables that achieve maximal
I(Y;T) for a given I(X;T), and is typically explored by maximizing the "IB
Lagrangian", I(Y;T) - beta*I(X;T). In some cases, Y is a deterministic
function of X, including many classification problems in supervised learning
where the output class Y is a deterministic function of the input X. We
demonstrate three caveats when using IB in any situation where Y is a
deterministic function of X: (1) the IB curve cannot be recovered by
maximizing the IB Lagrangian for different values of beta; (2) there are
"uninteresting" trivial solutions at all points of the IB curve; and (3) for
multi-layer classifiers that achieve low prediction error, different layers
cannot exhibit a strict trade-off between compression and prediction, contrary
to a recent proposal. We also show that when Y is a small perturbation away
from being a deterministic function of X, these three caveats arise in an
approximate way. To address problem (1), we propose a functional that, unlike
the IB Lagrangian, can recover the IB curve in all cases. We demonstrate the
three caveats on the MNIST dataset.
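Caveat (1) can be made concrete with a hedged sketch (notation assumed, not quoted from the paper): when Y is a deterministic function of X, I(Y;T) is capped both by I(X;T) and by the entropy H(Y), which makes the IB curve piecewise linear and pins the Lagrangian's maximizer to a corner for every value of beta:

```latex
% IB Lagrangian over bottleneck variables T with Markov chain T - X - Y:
\mathcal{L}_{\mathrm{IB}}(T;\beta) \;=\; I(Y;T) \;-\; \beta\, I(X;T).
% If Y = f(X), then I(Y;T) \le \min\{\, I(X;T),\; H(Y) \,\}:
% on the linear segment, where I(Y;T) = I(X;T) = r, the objective equals
% (1-\beta)\, r, which increases in r for \beta \in (0,1);
% past the corner the objective equals H(Y) - \beta r, which decreases in r.
% Hence the maximizer sits at the single corner r = H(Y) for all
% \beta \in (0,1), and intermediate points of the IB curve are never
% recovered by sweeping \beta.
```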
Slope heuristics and V-Fold model selection in heteroscedastic regression using strongly localized bases
We investigate the optimality for model selection of the so-called slope
heuristics, V-fold cross-validation and V-fold penalization in a
heteroscedastic regression context with random design. We consider a new class
of linear models that we call strongly localized bases and that generalize
histograms, piecewise polynomials and compactly supported wavelets. We derive
sharp oracle inequalities that prove the asymptotic optimality of the slope
heuristics---when the optimal penalty shape is known---and V-fold
penalization. Furthermore, V-fold cross-validation seems to be suboptimal for
a fixed value of V since it recovers asymptotically the oracle learned from a
sample size equal to 1-1/V of the original amount of data. Our results are
based on genuine concentration inequalities for the true and empirical excess
risks that are of independent interest. We show in our experiments the good
behavior of the slope heuristics for the selection of linear wavelet models.
Furthermore, V-fold cross-validation and V-fold penalization have
comparable efficiencies.
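The procedures being compared can be sketched in a common penalized form (hypothetical notation, not from the paper): each selects the model minimizing an empirical criterion plus a penalty, and asymptotic optimality is expressed through a sharp oracle inequality:

```latex
% Penalized model selection over a collection of models (S_m):
\widehat{m} \;\in\; \arg\min_{m}
  \Big\{\, P_n\,\gamma(\widehat{s}_m) \;+\; \mathrm{pen}(m) \,\Big\}.
% Sharp (asymptotically optimal) oracle inequality:
\ell\big(s_*, \widehat{s}_{\widehat{m}}\big) \;\le\;
  (1+\varepsilon_n)\, \inf_{m}\, \ell\big(s_*, \widehat{s}_m\big),
  \qquad \varepsilon_n \to 0.
% V-fold CV with fixed V instead tracks the oracle for an effective
% sample size of n\,(1 - 1/V), which explains its suboptimality.
```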
Estimating composite functions by model selection
We consider the problem of estimating a function s on [-1,1]^k for
large values of k by looking for some best approximation of s by composite
functions of the form g o u. Our solution is based on model selection and
leads to a very general approach to solve this problem with respect to many
different types of functions g, u and statistical frameworks. In particular,
we handle the problems of approximating s by additive functions, single and
multiple index models, neural networks, mixtures of Gaussian densities (when
s is a density) among other examples. We also investigate the situation where
s = g o u for functions g and u belonging to possibly anisotropic
smoothness classes. In this case, our approach leads to a completely adaptive
estimator with respect to the regularity of s.
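The composite structure and the special cases listed above can be written schematically (notation assumed for illustration only):

```latex
% Composite approximation: s \approx g \circ u, with inner map
% u : [-1,1]^k \to [-1,1]^l and outer map g : [-1,1]^l \to \mathbb{R}:
s(x) \;\approx\; g\big(u(x)\big).
% Examples recovered as special cases:
%   additive models:     g(y_1,\dots,y_l) = \textstyle\sum_j g_j(y_j),
%                        with u acting coordinatewise;
%   single index model:  l = 1,\quad u(x) = \langle \theta, x \rangle,
%                        so that s(x) = g(\langle \theta, x \rangle).
```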