5,236 research outputs found

    Optimal linear estimation under unknown nonlinear transform

    Linear regression studies the problem of estimating a model parameter $\beta^* \in \mathbb{R}^p$ from $n$ observations $\{(y_i,\mathbf{x}_i)\}_{i=1}^n$ drawn from the linear model $y_i = \langle \mathbf{x}_i,\beta^* \rangle + \epsilon_i$. We consider a significant generalization in which the relationship between $\langle \mathbf{x}_i,\beta^* \rangle$ and $y_i$ is noisy, quantized to a single bit, potentially nonlinear, noninvertible, as well as unknown. This model is known as the single-index model in statistics, and, among other things, it represents a significant generalization of one-bit compressed sensing. We propose a novel spectral-based estimation procedure and show that we can recover $\beta^*$ in settings (i.e., classes of link function $f$) where previous algorithms fail. In general, our algorithm requires only very mild restrictions on the (unknown) functional relationship between $y_i$ and $\langle \mathbf{x}_i,\beta^* \rangle$. We also consider the high dimensional setting where $\beta^*$ is sparse, and introduce a two-stage nonconvex framework that addresses estimation challenges in high dimensional regimes where $p \gg n$. For a broad class of link functions between $\langle \mathbf{x}_i,\beta^* \rangle$ and $y_i$, we establish minimax lower bounds that demonstrate the optimality of our estimators in both the classical and high dimensional regimes. Comment: 25 pages, 3 figures
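To make the observation model in this abstract concrete, the following is a minimal sketch that simulates one-bit single-index data and recovers the direction of the parameter with a simple first-moment average. This is not the spectral procedure the abstract proposes: the Gaussian design, the sign link, the noise level, and the function names `simulate_single_index` and `first_moment_estimate` are all assumptions chosen for illustration.

```python
import numpy as np


def simulate_single_index(n, p, noise_std, rng):
    """Draw (X, y) from y_i = sign(<x_i, beta> + eps_i) with Gaussian x_i.

    The sign link is an illustrative choice; the abstract treats the link
    as unknown.
    """
    beta = rng.standard_normal(p)
    beta /= np.linalg.norm(beta)          # only the direction is identifiable
    X = rng.standard_normal((n, p))       # assumed standard Gaussian design
    eps = noise_std * rng.standard_normal(n)
    y = np.sign(X @ beta + eps)           # one-bit quantized response
    return X, y, beta


def first_moment_estimate(X, y):
    """Baseline direction estimate: normalized average of y_i * x_i.

    A simple baseline, not the estimator from the abstract.
    """
    b = X.T @ y / len(y)
    return b / np.linalg.norm(b)


rng = np.random.default_rng(0)
X, y, beta = simulate_single_index(n=5000, p=10, noise_std=0.5, rng=rng)
beta_hat = first_moment_estimate(X, y)
cosine = float(beta_hat @ beta)           # alignment between estimate and truth
print(f"cosine similarity: {cosine:.3f}")
```

A first-moment average of this kind carries no signal when the link is an even function of $\langle \mathbf{x}_i,\beta^* \rangle$ and the design is symmetric, since the expectation of $y_i \mathbf{x}_i$ is then zero. That is one example of a link class on which a simple baseline breaks down.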

    Fast learning rate of multiple kernel learning: Trade-off between sparsity and smoothness

    We investigate the learning rate of multiple kernel learning (MKL) with $\ell_1$ and elastic-net regularizations. The elastic-net regularization is a composition of an $\ell_1$-regularizer for inducing sparsity and an $\ell_2$-regularizer for controlling smoothness. We focus on a sparse setting where the total number of kernels is large but the number of nonzero components of the ground truth is relatively small, and show sharper convergence rates than those previously established for both $\ell_1$ and elastic-net regularizations. Our analysis reveals some relations between the choice of regularization function and the performance. If the ground truth is smooth, we show a faster convergence rate for the elastic-net regularization under fewer conditions than for $\ell_1$-regularization; otherwise, a faster convergence rate for the $\ell_1$-regularization is shown. Comment: Published at http://dx.doi.org/10.1214/13-AOS1095 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org). arXiv admin note: text overlap with arXiv:1103.043
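As a schematic illustration of the two regularizers this abstract compares, one common way to write an elastic-net MKL objective is shown below. The notation is assumed rather than taken from the paper: $M$ candidate kernels with reproducing kernel Hilbert spaces $\mathcal{H}_m$, component functions $f_m \in \mathcal{H}_m$, squared loss, and tuning parameters $\lambda_1, \lambda_2 \ge 0$. The paper's exact objective may differ.

```latex
\min_{f_m \in \mathcal{H}_m,\; m = 1,\dots,M}\;
  \frac{1}{n} \sum_{i=1}^{n} \Bigl( y_i - \sum_{m=1}^{M} f_m(x_i) \Bigr)^{2}
  \;+\; \lambda_1 \sum_{m=1}^{M} \lVert f_m \rVert_{\mathcal{H}_m}
  \;+\; \lambda_2 \sum_{m=1}^{M} \lVert f_m \rVert_{\mathcal{H}_m}^{2}
```

In this form, setting $\lambda_2 = 0$ gives the pure $\ell_1$ penalty, which drives whole components $f_m$ to zero. The squared-norm term adds the $\ell_2$-type smoothness control described in the abstract.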