203 research outputs found

    Discussion of "2004 IMS Medallion Lecture: Local Rademacher complexities and oracle inequalities in risk minimization" by V. Koltchinskii

    Discussion of "2004 IMS Medallion Lecture: Local Rademacher complexities and oracle inequalities in risk minimization" by V. Koltchinskii [arXiv:0708.0083]. Published at http://dx.doi.org/10.1214/009053606000001064 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org).

    Optimal change-point estimation from indirect observations

    We study nonparametric change-point estimation from indirect noisy observations. Focusing on the white noise convolution model, we consider two classes of functions that are smooth apart from the change-point. We establish lower bounds on the minimax risk in estimating the change-point and develop rate optimal estimation procedures. The results demonstrate that the best achievable rates of convergence are determined both by smoothness of the function away from the change-point and by the degree of ill-posedness of the convolution operator. Optimality is obtained by introducing a new technique that involves, as a key element, detection of zero crossings of an estimate of the properly smoothed second derivative of the underlying function. Published at http://dx.doi.org/10.1214/009053605000000750 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org).
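
    As a rough illustration of the zero-crossing idea, the sketch below works with direct (rather than indirect) observations and a Gaussian smoothing kernel, both simplifying assumptions: it estimates a jump location as the zero crossing of a kernel-smoothed second derivative between its largest and smallest values.

        import numpy as np

        def changepoint_from_zero_crossing(y, x, bandwidth):
            # Estimate (K_h * f)'' by convolving the noisy data with the
            # second derivative of a Gaussian kernel of bandwidth h.
            h = bandwidth
            u = (x[:, None] - x[None, :]) / h
            kernel_dd = (u**2 - 1.0) * np.exp(-0.5 * u**2) / (np.sqrt(2.0 * np.pi) * h**3)
            d2 = kernel_dd @ y * (x[1] - x[0])
            # A jump produces a large positive peak next to a large negative
            # one; the zero crossing between them marks the jump location.
            i_max, i_min = int(np.argmax(d2)), int(np.argmin(d2))
            lo, hi = sorted((i_max, i_min))
            return x[lo + int(np.argmin(np.abs(d2[lo:hi + 1])))]

        # Toy example: a smooth function with a jump at 0.6, observed in noise.
        rng = np.random.default_rng(0)
        x = np.linspace(0.0, 1.0, 400)
        y = np.sin(2.0 * x) + (x > 0.6) + 0.1 * rng.standard_normal(x.size)
        print(changepoint_from_zero_crossing(y, x, bandwidth=0.05))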

    Estimation of matrices with row sparsity

    An increasing number of applications are concerned with recovering a sparse matrix from noisy observations. In this paper, we consider the setting where each row of the unknown matrix is sparse. We establish minimax optimal rates of convergence for estimating matrices with row sparsity. A major focus of the present paper is the derivation of lower bounds.
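
    A minimal sketch of one natural estimator in this setting: entrywise hard thresholding tuned to the row sparsity level. The threshold below is a standard choice for $s$-sparse vectors, used here for illustration rather than as the paper's exact tuning.

        import numpy as np

        def row_sparse_hard_threshold(Y, s, sigma):
            # Y = A + sigma * noise, where each row of A has at most s nonzero
            # entries out of d2. Keep an entry only if it clearly exceeds the
            # noise level; sigma * sqrt(2 * log(e * d2 / s)) is an illustrative
            # threshold, not necessarily the constant analysed in the paper.
            d1, d2 = Y.shape
            t = sigma * np.sqrt(2.0 * np.log(np.e * d2 / s))
            return np.where(np.abs(Y) > t, Y, 0.0)

        # Toy example: a 50 x 200 matrix with 5 nonzero entries per row.
        rng = np.random.default_rng(1)
        A = np.zeros((50, 200))
        A[:, :5] = 3.0
        Y = A + rng.standard_normal(A.shape)
        print(np.mean((row_sparse_hard_threshold(Y, s=5, sigma=1.0) - A) ** 2))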

    Learning by mirror averaging

    Given a finite collection of estimators or classifiers, we study the problem of model selection type aggregation, that is, we construct a new estimator or classifier, called the aggregate, which is nearly as good as the best among them with respect to a given risk criterion. We define our aggregate by a simple recursive procedure which solves an auxiliary stochastic linear programming problem related to the original nonlinear one and constitutes a special case of the mirror averaging algorithm. We show that the aggregate satisfies sharp oracle inequalities under some general assumptions. The results are applied to several problems including regression, classification and density estimation. Published at http://dx.doi.org/10.1214/07-AOS546 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org).
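
    The recursive scheme can be sketched as averaged exponential weighting: at each step the candidate estimators are weighted by the exponential of their negative, scaled cumulative losses on the data seen so far, and the final weights are the time average of these per-step weights. The temperature parameter beta below is left as a user choice, an assumption of this sketch.

        import numpy as np

        def mirror_averaging_weights(losses, beta):
            # losses[i, j]: loss of estimator j on observation i (e.g. squared error).
            # Step i uses the cumulative losses of steps 1..i-1; the returned
            # vector is the average over steps of the per-step exponential
            # weights, and the aggregate is the corresponding convex
            # combination of the candidate estimators.
            n, M = losses.shape
            cum_prev = np.vstack([np.zeros(M), np.cumsum(losses, axis=0)[:-1]])
            logw = -cum_prev / beta
            logw -= logw.max(axis=1, keepdims=True)   # numerical stability
            w = np.exp(logw)
            w /= w.sum(axis=1, keepdims=True)
            return w.mean(axis=0)

        # Toy example: three constant "estimators" of an unknown mean.
        rng = np.random.default_rng(2)
        obs = 1.0 + rng.standard_normal(100)
        candidates = np.array([0.0, 1.0, 2.0])
        losses = (obs[:, None] - candidates[None, :]) ** 2
        w = mirror_averaging_weights(losses, beta=4.0)
        print(w, w @ candidates)   # aggregate = weighted combination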

    Penalized maximum likelihood and semiparametric second-order efficiency

    We consider the problem of estimation of a shift parameter of an unknown symmetric function in Gaussian white noise. We introduce a notion of semiparametric second-order efficiency and propose estimators that are semiparametrically efficient and second-order efficient in our model. These estimators are of a penalized maximum likelihood type with an appropriately chosen penalty. We argue that second-order efficiency is crucial in semiparametric problems, since only the second-order terms in the asymptotic expansion of the risk account for the behavior of the "nonparametric component" of a semiparametric procedure, and they are not dramatically smaller than the first-order terms. Published at http://dx.doi.org/10.1214/009053605000000895 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org).
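
    For illustration only, the sketch below recovers a shift by minimizing an empirical asymmetry criterion over a grid of candidate shifts; this is a deliberate simplification of the problem, not the paper's penalized maximum likelihood estimator.

        import numpy as np

        def shift_by_symmetry(y, x, candidates):
            # y: noisy observations of f(x - theta) on the grid x, with f symmetric.
            # Score each candidate shift by how well the signal matches its own
            # mirror image around that point, and return the best candidate.
            def asymmetry(theta):
                y_reflected = np.interp(2.0 * theta - x, x, y)
                return np.mean((y - y_reflected) ** 2)
            scores = [asymmetry(t) for t in candidates]
            return candidates[int(np.argmin(scores))]

        # Toy example: a Gaussian bump shifted to theta = 0.3.
        rng = np.random.default_rng(3)
        x = np.linspace(-2.0, 2.0, 800)
        y = np.exp(-(x - 0.3) ** 2) + 0.05 * rng.standard_normal(x.size)
        print(shift_by_symmetry(y, x, np.linspace(-0.5, 0.5, 201)))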

    Variable selection with Hamming loss

    We derive non-asymptotic bounds for the minimax risk of variable selection under expected Hamming loss in the Gaussian mean model in $\mathbb{R}^d$ for classes of $s$-sparse vectors separated from 0 by a constant $a > 0$. In some cases, we get exact expressions for the non-asymptotic minimax risk as a function of $d, s, a$ and find explicitly the minimax selectors. These results are extended to dependent or non-Gaussian observations and to the problem of crowdsourcing. Analogous conclusions are obtained for the probability of wrong recovery of the sparsity pattern. As corollaries, we derive necessary and sufficient conditions for such asymptotic properties as almost full recovery and exact recovery. Moreover, we propose data-driven selectors that provide almost full and exact recovery adaptively to the parameters of the classes.
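
    A sketch of a coordinatewise thresholding selector for this model appears below; the threshold $t = a/2 + (\sigma^2/a)\log((d-s)/s)$ is of the form appearing in this line of work, used here as an illustrative choice rather than as the paper's exact minimax rule.

        import numpy as np

        def threshold_selector(y, s, a, sigma=1.0):
            # y_j = theta_j + sigma * xi_j, with at most s components satisfying
            # |theta_j| >= a and the remaining components equal to zero.
            # Select a component when its observation exceeds the threshold.
            d = y.size
            t = a / 2.0 + (sigma**2 / a) * np.log((d - s) / s)
            return (np.abs(y) > t).astype(int)   # estimated sparsity pattern

        # Toy example with d = 1000, s = 10, a = 4.
        rng = np.random.default_rng(4)
        d, s, a = 1000, 10, 4.0
        theta = np.zeros(d)
        theta[:s] = a
        y = theta + rng.standard_normal(d)
        print("Hamming loss:", np.sum(threshold_selector(y, s, a) != (theta != 0)))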

    Distributed Zero-Order Optimization under Adversarial Noise

    We study the problem of distributed zero-order optimization for a class of strongly convex functions formed by the average of local objectives associated with different nodes in a prescribed network. We propose a distributed zero-order projected gradient descent algorithm to solve the problem. Exchange of information within the network is permitted only between neighbouring nodes. An important feature of our procedure is that it can query only function values, subject to a general noise model that requires neither zero-mean nor independent errors. We derive upper bounds for the average cumulative regret and optimization error of the algorithm which highlight the role played by a network connectivity parameter, the number of variables, the noise level, the strong convexity parameter, and smoothness properties of the local objectives. The bounds indicate some key improvements of our method over the state of the art, both in the distributed and standard zero-order optimization settings. We also comment on lower bounds and observe that the dependence of the bound on certain function parameters is nearly optimal.
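
    One round of such a scheme might look like the sketch below, under simplifying assumptions not taken from the paper: a spherical two-point gradient estimate, a Euclidean-ball constraint, and a given doubly stochastic mixing matrix W encoding the network.

        import numpy as np

        def distributed_zero_order_step(X, W, oracles, step, h, radius, rng):
            # X: (n_nodes, d) current iterates; W: doubly stochastic mixing matrix
            # with nonzero entries only between neighbours; oracles[i](x) returns
            # a noisy value of the local objective f_i at the point x.
            n, d = X.shape
            G = np.zeros_like(X)
            for i in range(n):
                e = rng.standard_normal(d)
                e /= np.linalg.norm(e)              # random direction on the sphere
                y_plus = oracles[i](X[i] + h * e)
                y_minus = oracles[i](X[i] - h * e)
                G[i] = (d / (2.0 * h)) * (y_plus - y_minus) * e   # two-point estimate
            X_new = W @ X - step * G                # consensus step + gradient step
            # Project each node's iterate onto the Euclidean ball of given radius.
            norms = np.linalg.norm(X_new, axis=1, keepdims=True)
            return X_new * np.minimum(1.0, radius / np.maximum(norms, 1e-12))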

    Noisy Independent Factor Analysis Model for Density Estimation and Classification

    We consider the problem of multivariate density estimation when the unknown density is assumed to follow a particular form of dimensionality reduction, a noisy independent factor analysis (IFA) model. In this model the data are generated by a number of latent independent components having unknown distributions and are observed in Gaussian noise. We do not assume that either the number of components or the matrix mixing the components is known. We show that densities of this form can be estimated at a fast rate. Using the mirror averaging aggregation algorithm, we construct a density estimator which achieves a nearly parametric rate $(\log^{1/4} n)/\sqrt{n}$, independent of the dimensionality of the data, as the sample size n tends to infinity. This estimator is adaptive to the number of components, their distributions and the mixing matrix. We then apply this density estimator to construct nonparametric plug-in classifiers and show that they achieve the best obtainable rate of the excess Bayes risk, to within a logarithmic factor independent of the dimension of the data. Applications of this classifier to simulated data sets and to real data from a remote sensing experiment show promising results. Financial support from the IAP research network of the Belgian government (Belgian Federal Science Policy) is gratefully acknowledged. Research of A. Samarov was partially supported by NSF grant DMS-0505561 and by a grant from Singapore-MIT Alliance (CSB). Research of A.B. Tsybakov was partially supported by the grant ANR-06-BLAN-0194 and by the PASCAL Network of Excellence.
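
    The plug-in classification step can be sketched as follows, with scipy's gaussian_kde standing in for the IFA-based density estimator that the paper actually plugs in.

        import numpy as np
        from scipy.stats import gaussian_kde

        def plugin_classifier(X_train, y_train, X_test):
            # Estimate each class density and the class priors, then assign a
            # test point to the class maximising prior * estimated density.
            classes = np.unique(y_train)
            priors = {c: np.mean(y_train == c) for c in classes}
            densities = {c: gaussian_kde(X_train[y_train == c].T) for c in classes}
            scores = np.column_stack(
                [priors[c] * densities[c](X_test.T) for c in classes]
            )
            return classes[np.argmax(scores, axis=1)]

        # Toy example: two 2-dimensional Gaussian classes.
        rng = np.random.default_rng(5)
        X0 = rng.standard_normal((200, 2))
        X1 = rng.standard_normal((200, 2)) + 2.0
        X_train = np.vstack([X0, X1])
        y_train = np.repeat([0, 1], 200)
        print(plugin_classifier(X_train, y_train, np.array([[0.0, 0.0], [2.0, 2.0]])))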

    Exploiting higher order smoothness in derivative-free optimization and continuous bandits

    We study the problem of zero-order optimization of a strongly convex function. The goal is to find the minimizer of the function by a sequential exploration of its values, under measurement noise. We study the impact of higher order smoothness properties of the function on the optimization error and on the cumulative regret. To solve this problem, we consider a randomized approximation of the projected gradient descent algorithm. The gradient is estimated by a randomized procedure involving two function evaluations and a smoothing kernel. We derive upper bounds for this algorithm both in the constrained and unconstrained settings and prove minimax lower bounds for any sequential search method. Our results imply that the zero-order algorithm is nearly optimal in terms of sample complexity and the problem parameters. Based on this algorithm, we also propose an estimator of the minimum value of the function achieving almost sharp oracle behavior. We compare our results with the state of the art, highlighting a number of key improvements.
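
    The gradient estimator can be sketched as below: a random direction on the unit sphere, a random scalar r, and a weighting kernel applied to a two-point difference of function values. The kernel K(r) = 3r used here corresponds to exploiting smoothness of order two; higher-order kernels (orthogonal to low-degree polynomials) handle smoother objectives. The constant step size and bandwidth in the toy loop are a crude illustrative choice, not the paper's schedule.

        import numpy as np

        def kernel_two_point_gradient(f, x, h, rng, kernel=lambda r: 3.0 * r):
            # Two function evaluations along a random direction e on the unit
            # sphere, at distance h*r, reweighted by the smoothing kernel.
            d = x.size
            e = rng.standard_normal(d)
            e /= np.linalg.norm(e)
            r = rng.uniform(-1.0, 1.0)
            return (d / (2.0 * h)) * (f(x + h * r * e) - f(x - h * r * e)) * kernel(r) * e

        # Toy example: gradient descent on a noisy strongly convex quadratic.
        rng = np.random.default_rng(6)
        f = lambda x: np.sum((x - 1.0) ** 2) + 0.01 * rng.standard_normal()
        x = np.zeros(5)
        for _ in range(2000):
            g = kernel_two_point_gradient(f, x, h=0.1, rng=rng)
            x -= 0.05 * g        # small constant step, a simplification of the paper's schedule
        print(x)                 # close to the minimiser (1, ..., 1)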