Discussion of "2004 IMS Medallion Lecture: Local Rademacher complexities and oracle inequalities in risk minimization" by V. Koltchinskii [arXiv:0708.0083]
Comment: Published at http://dx.doi.org/10.1214/009053606000001064 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)
Optimal change-point estimation from indirect observations
We study nonparametric change-point estimation from indirect noisy
observations. Focusing on the white noise convolution model, we consider two
classes of functions that are smooth apart from the change-point. We establish
lower bounds on the minimax risk in estimating the change-point and develop
rate optimal estimation procedures. The results demonstrate that the best
achievable rates of convergence are determined both by smoothness of the
function away from the change-point and by the degree of ill-posedness of the
convolution operator. Optimality is obtained by introducing a new technique
that involves, as a key element, detection of zero crossings of an estimate of
the properly smoothed second derivative of the underlying function.
Comment: Published at http://dx.doi.org/10.1214/009053605000000750 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)
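The zero-crossing idea in the abstract above can be illustrated with a minimal sketch. This is not the paper's rate-optimal estimator for indirect observations; it only shows the underlying principle, with illustrative names and tuning: a jump in the function, once smoothed by a kernel, produces a sign change of the smoothed second derivative at the jump location.

```python
import numpy as np

def estimate_change_point(y, x, bandwidth):
    """Locate a jump as the zero crossing of a kernel-smoothed second
    derivative. Schematic sketch only; the paper's procedure for the
    white noise convolution model is more involved."""
    h = bandwidth
    dx = x[1] - x[0]
    u = (x[:, None] - x[None, :]) / h
    # Gaussian kernel derivatives (unnormalized): K'(u) and K''(u)
    k1 = -u * np.exp(-0.5 * u**2)
    k2 = (u**2 - 1.0) * np.exp(-0.5 * u**2)
    d = k1 @ y            # smoothed first derivative: peaks at a jump
    s = k2 @ y            # smoothed second derivative: crosses zero at a jump
    # Ignore a boundary band of width 2h where the convolution is distorted.
    m = int(np.ceil(2 * h / dx))
    i0 = m + int(np.argmax(np.abs(d[m:len(x) - m])))
    # Find the sign change of s nearest to the peak of |d|.
    for j in range(len(x)):
        for i in (i0 - j, i0 + j):
            if 0 <= i < len(x) - 1 and s[i] * s[i + 1] <= 0:
                return 0.5 * (x[i] + x[i + 1])
    return x[i0]
```

For a noisy step function with a jump at 0.6, the estimator localizes the jump to within a few grid points.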
Estimation of matrices with row sparsity
An increasing number of applications is concerned with recovering a sparse
matrix from noisy observations. In this paper, we consider the setting where
each row of the unknown matrix is sparse. We establish minimax optimal rates of
convergence for estimating matrices with row sparsity. A major focus of the
present paper is the derivation of lower bounds.
Learning by mirror averaging
Given a finite collection of estimators or classifiers, we study the problem
of model selection type aggregation, that is, we construct a new estimator or
classifier, called aggregate, which is nearly as good as the best among them
with respect to a given risk criterion. We define our aggregate by a simple
recursive procedure which solves an auxiliary stochastic linear programming
problem related to the original nonlinear one and constitutes a special case of
the mirror averaging algorithm. We show that the aggregate satisfies sharp
oracle inequalities under some general assumptions. The results are applied to
several problems including regression, classification and density estimation.
Comment: Published at http://dx.doi.org/10.1214/07-AOS546 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)
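As a rough illustration of the recursive scheme described above, the following sketch computes a Cesàro average of exponential-weights distributions over a finite family. The temperature `beta` and the loss structure are illustrative; the paper's algorithm and its tuning are specific to the risk criterion.

```python
import numpy as np

def mirror_averaging_weights(losses, beta):
    """Cesàro-averaged exponential weights over a finite family (sketch).

    losses: (n, M) array, losses[t, j] = loss of estimator j on observation t.
    Returns the average over t of the exponential-weights distributions
    based on cumulative losses up to time t.
    """
    n, M = losses.shape
    cum = np.zeros(M)
    avg_w = np.zeros(M)
    for t in range(n):
        logits = -beta * cum
        logits -= logits.max()              # numerical stability
        w = np.exp(logits)
        w /= w.sum()
        avg_w += w                          # running sum of weight vectors
        cum += losses[t]                    # update cumulative losses
    return avg_w / n
```

With a consistently best estimator in the family, the averaged weights concentrate on it while remaining a probability vector.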
Penalized maximum likelihood and semiparametric second-order efficiency
We consider the problem of estimation of a shift parameter of an unknown
symmetric function in Gaussian white noise. We introduce a notion of
semiparametric second-order efficiency and propose estimators that are
semiparametrically efficient and second-order efficient in our model. These
estimators are of a penalized maximum likelihood type with an appropriately
chosen penalty. We argue that second-order efficiency is crucial in
semiparametric problems since only the second-order terms in asymptotic
expansion of the risk account for the behavior of the "nonparametric
component" of a semiparametric procedure, and they are not dramatically
smaller than the first-order terms.
Comment: Published at http://dx.doi.org/10.1214/009053605000000895 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)
Variable selection with Hamming loss
We derive non-asymptotic bounds for the minimax risk of variable selection
under expected Hamming loss in the Gaussian mean model in R^d for
classes of s-sparse vectors separated from 0 by a constant a > 0. In some
cases, we get exact expressions for the non-asymptotic minimax risk as a
function of (d, s, a) and find explicitly the minimax selectors. These results
are extended to dependent or non-Gaussian observations and to the problem of
crowdsourcing. Analogous conclusions are obtained for the probability of wrong
recovery of the sparsity pattern. As corollaries, we derive necessary and
sufficient conditions for such asymptotic properties as almost full recovery
and exact recovery. Moreover, we propose data-driven selectors that provide
almost full and exact recovery adaptively to the parameters of the classes.
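In the Gaussian mean model, selectors of the kind discussed above can be illustrated by simple coordinate-wise thresholding. The threshold `t` below is a generic placeholder; the paper derives the exact minimax selectors, whose thresholds depend on the class parameters and the noise level.

```python
import numpy as np

def threshold_selector(y, t):
    """Estimate the sparsity pattern by thresholding: eta_hat_i = 1{|y_i| > t}.
    Generic illustration; the minimax choice of t is derived in the paper."""
    return (np.abs(y) > t).astype(int)

def hamming_loss(eta_hat, eta):
    """Number of wrongly classified coordinates."""
    return int(np.sum(eta_hat != eta))
```

For a well-separated signal, thresholding at half the separation level already recovers the pattern with few mistakes.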
Distributed Zero-Order Optimization under Adversarial Noise
We study the problem of distributed zero-order optimization for a class of strongly convex functions. They are formed by the average of local objectives associated with different nodes in a prescribed network. We propose a distributed zero-order projected gradient descent algorithm to solve the problem. Exchange of information within the network is permitted only between neighbouring nodes. An important feature of our procedure is that it can query only function values, subject to a general noise model that requires neither zero-mean nor independent errors. We derive upper bounds for the average cumulative regret and optimization error of the algorithm which highlight the role played by a network connectivity parameter, the number of variables, the noise level, the strong convexity parameter, and smoothness properties of the local objectives. The bounds indicate some key improvements of our method over the state of the art, both in the distributed and standard zero-order optimization settings. We also comment on lower bounds and observe that the dependence of the bounds on certain function parameters is nearly optimal.
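The consensus-plus-local-step structure described above can be sketched as follows. The mixing matrix `W`, step size `eta`, and projection are illustrative, and the gradients are passed in as arguments; in the paper each node forms a zero-order gradient estimate from noisy function values.

```python
import numpy as np

def distributed_zo_step(X, W, grads, eta, proj):
    """One round of a distributed projected gradient method (schematic):
    each node mixes its iterate with its neighbours' via a doubly
    stochastic matrix W, then takes a projected gradient step using its
    own local gradient estimate.

    X: (n_nodes, d) iterates; grads: (n_nodes, d) local gradient estimates.
    """
    X_mixed = W @ X                       # consensus/averaging step
    X_new = X_mixed - eta * grads         # local gradient step
    return np.apply_along_axis(proj, 1, X_new)
```

With exact gradients of local quadratics and a fully connected network, the network average of the iterates converges to the minimizer of the average objective.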
Noisy Independent Factor Analysis Model for Density Estimation and Classification
We consider the problem of multivariate density estimation when the unknown density is assumed to follow a particular form of dimensionality reduction, a noisy independent factor analysis (IFA) model. In this model the data are generated by a number of latent independent components having unknown distributions and are observed in Gaussian noise. We do not assume that either the number of components or the mixing matrix is known. We show that densities of this form can be estimated with a fast rate. Using the mirror averaging aggregation algorithm, we construct a density estimator which achieves a nearly parametric rate (log^{1/4} n)/√n, independent of the dimensionality of the data, as the sample size n tends to infinity. This estimator is adaptive to the number of components, their distributions and the mixing matrix. We then apply this density estimator to construct nonparametric plug-in classifiers and show that they achieve the best obtainable rate of the excess Bayes risk, to within a logarithmic factor independent of the dimension of the data. Applications of this classifier to simulated data sets and to real data from a remote sensing experiment show promising results.
Comment: Financial support from the IAP research network of the Belgian government (Belgian Federal Science Policy) is gratefully acknowledged. Research of A. Samarov was partially supported by NSF grant DMS-0505561 and by a grant from Singapore-MIT Alliance (CSB). Research of A.B. Tsybakov was partially supported by the grant ANR-06-BLAN-0194 and by the PASCAL Network of Excellence.
Exploiting higher order smoothness in derivative-free optimization and continuous bandits
We study the problem of zero-order optimization of a strongly convex function. The goal is to find the minimizer of the function by a sequential exploration of its values, under measurement noise. We study the impact of higher order smoothness properties of the function on the optimization error and on the cumulative regret. To solve this problem we consider a randomized approximation of the projected gradient descent algorithm. The gradient is estimated by a randomized procedure involving two function evaluations and a smoothing kernel. We derive upper bounds for this algorithm both in the constrained and unconstrained settings and prove minimax lower bounds for any sequential search method. Our results imply that the zero-order algorithm is nearly optimal in terms of sample complexity and the problem parameters. Based on this algorithm, we also propose an estimator of the minimum value of the function achieving almost sharp oracle behavior. We compare our results with the state of the art, highlighting a number of key improvements.
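The gradient estimator described above, built from two function evaluations along a random direction combined with a smoothing kernel, can be sketched as follows. The kernel here is the simplest admissible one, sufficient for smoothness level beta = 2; the paper uses higher-order kernels for beta > 2.

```python
import numpy as np

def two_point_gradient_estimate(f, x, h, rng):
    """One randomized two-point gradient estimate (schematic).
    f: zero-order oracle returning (possibly noisy) function values;
    x: current point, shape (d,); h: discretization step."""
    d = x.size
    zeta = rng.standard_normal(d)
    zeta /= np.linalg.norm(zeta)          # uniform direction on the unit sphere
    r = rng.uniform(-1.0, 1.0)            # scalar smoothing variable
    K = 3.0 * r                           # kernel with E[r K(r)] = 1 for r ~ U[-1, 1]
    return (d / (2.0 * h)) * (f(x + h * r * zeta) - f(x - h * r * zeta)) * K * zeta
```

For a noiseless linear function the estimate is unbiased, so averaging many draws recovers the true gradient.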