
    $L_1$-Penalization in Functional Linear Regression with Subgaussian Design

    We study functional regression with random subgaussian design and real-valued response. The focus is on the problems in which the regression function can be well approximated by a functional linear model with the slope function being "sparse" in the sense that it can be represented as a sum of a small number of well separated "spikes". This can be viewed as an extension of now classical sparse estimation problems to the case of infinite dictionaries. We study an estimator of the regression function based on penalized empirical risk minimization with quadratic loss and the complexity penalty defined in terms of the $L_1$-norm (a continuous version of LASSO). The main goal is to introduce several important parameters characterizing sparsity in this class of problems and to prove sharp oracle inequalities showing how the $L_2$-error of the continuous LASSO estimator depends on the underlying sparsity of the problem.
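    As a rough illustration of the kind of estimator involved (not the paper's continuous construction), one can discretize the slope function on a grid, after which the penalized empirical risk minimization reduces to an ordinary finite-dimensional LASSO; the sketch below solves that finite problem by proximal gradient (ISTA). The grid size, spike locations, noise level and penalty `lam` are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 200, 500                          # samples, grid points
beta = np.zeros(m)
beta[[100, 250, 400]] = [1.0, -1.5, 2.0] # a few well separated "spikes"

X = rng.standard_normal((n, m))          # subgaussian (here Gaussian) design
y = X @ beta + 0.1 * rng.standard_normal(n)

def ista_lasso(X, y, lam, iters=3000):
    """Proximal gradient for 0.5/n * ||y - X w||^2 + lam * ||w||_1."""
    n_samples = len(y)
    step = n_samples / np.linalg.norm(X, 2) ** 2      # 1 / Lipschitz constant
    w = np.zeros(X.shape[1])
    for _ in range(iters):
        grad = X.T @ (X @ w - y) / n_samples
        w = w - step * grad
        w = np.sign(w) * np.maximum(np.abs(w) - step * lam, 0.0)  # soft threshold
    return w

beta_hat = ista_lasso(X, y, lam=0.05)
print("estimated support:", np.flatnonzero(np.abs(beta_hat) > 0.1))
```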

    Greedy Algorithms for Cone Constrained Optimization with Convergence Guarantees

    Greedy optimization methods such as Matching Pursuit (MP) and Frank-Wolfe (FW) algorithms regained popularity in recent years due to their simplicity, effectiveness and theoretical guarantees. MP and FW address optimization over the linear span and the convex hull of a set of atoms, respectively. In this paper, we consider the intermediate case of optimization over the convex cone, parametrized as the conic hull of a generic atom set, leading to the first principled definitions of non-negative MP algorithms for which we give explicit convergence rates and demonstrate excellent empirical performance. In particular, we derive sublinear ($\mathcal{O}(1/t)$) convergence on general smooth and convex objectives, and linear convergence ($\mathcal{O}(e^{-t})$) on strongly convex objectives, in both cases for general sets of atoms. Furthermore, we establish a clear correspondence of our algorithms to known algorithms from the MP and FW literature. Our novel algorithms and analyses target general atom sets and general objective functions, and hence are directly applicable to a large variety of learning settings.
    Comment: NIPS 201
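    The paper's algorithms and rates cover general atom sets; the sketch below is only a plausible non-negative matching pursuit variant for a finite dictionary, illustrating the greedy conic-hull idea: select the atom with the largest positive correlation with the residual and take an exact line-search step. The dictionary and target here are synthetic.

```python
import numpy as np

def nonneg_mp(atoms, y, iters=50):
    """Greedily approximate y by a conic (non-negative) combination of atoms.

    atoms : (d, N) array, one atom per column; y : (d,) target vector.
    """
    coef = np.zeros(atoms.shape[1])
    residual = y.copy()
    for _ in range(iters):
        scores = atoms.T @ residual              # inner products with the residual
        j = int(np.argmax(scores))
        if scores[j] <= 0.0:                     # no atom improves the fit within the cone
            break
        step = scores[j] / (atoms[:, j] @ atoms[:, j])  # exact line search, always > 0
        coef[j] += step                          # coefficients therefore stay non-negative
        residual -= step * atoms[:, j]
    return coef

rng = np.random.default_rng(1)
A = rng.standard_normal((30, 100))
y = A[:, :5] @ np.abs(rng.standard_normal(5))    # target lies in the conic hull of 5 atoms
coef = nonneg_mp(A, y)
print("atoms used:", np.flatnonzero(coef > 1e-8),
      "residual norm:", np.linalg.norm(y - A @ coef))
```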

    Estimation in high dimensions: a geometric perspective

    This tutorial provides an exposition of a flexible geometric framework for high dimensional estimation problems with constraints. The tutorial develops geometric intuition about high dimensional sets, justifies it with some results of asymptotic convex geometry, and demonstrates connections between geometric results and estimation problems. The theory is illustrated with applications to sparse recovery, matrix completion, quantization, linear and logistic regression and generalized linear models.
    Comment: 56 pages, 9 figures. Multiple minor change
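    A minimal concrete instance of estimation over a constraint set, in the spirit of the setup the tutorial surveys; the specific choices below (an l1-ball constraint, projected gradient descent, and the synthetic sparse-recovery data) are illustrative and not taken from the tutorial.

```python
import numpy as np

def project_l1_ball(v, radius):
    """Euclidean projection of v onto the l1-ball {x : ||x||_1 <= radius}."""
    if np.abs(v).sum() <= radius:
        return v
    u = np.sort(np.abs(v))[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u * np.arange(1, len(u) + 1) > css - radius)[0][-1]
    theta = (css[rho] - radius) / (rho + 1.0)
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

def constrained_ls(A, y, radius, iters=500):
    """Least squares over the constraint set K (here an l1-ball), by projected gradient."""
    x = np.zeros(A.shape[1])
    step = 1.0 / np.linalg.norm(A, 2) ** 2       # 1 / Lipschitz constant of the gradient
    for _ in range(iters):
        x = project_l1_ball(x - step * A.T @ (A @ x - y), radius)
    return x

rng = np.random.default_rng(2)
n, d = 80, 200
x_true = np.zeros(d)
x_true[[5, 50, 120]] = [1.0, -2.0, 1.5]          # sparse signal to recover
A = rng.standard_normal((n, d)) / np.sqrt(n)
y = A @ x_true + 0.01 * rng.standard_normal(n)
x_hat = constrained_ls(A, y, radius=np.abs(x_true).sum())
print("estimation error:", np.linalg.norm(x_hat - x_true))
```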

    Sparse recovery in convex hulls via entropy penalization

    Let $(X,Y)$ be a random couple in $S\times T$ with unknown distribution $P$ and $(X_1,Y_1),...,(X_n,Y_n)$ be i.i.d. copies of $(X,Y)$. Denote by $P_n$ the empirical distribution of $(X_1,Y_1),...,(X_n,Y_n)$. Let $h_1,...,h_N: S\mapsto [-1,1]$ be a dictionary that consists of $N$ functions. For $\lambda\in\mathbb{R}^N$, denote $f_{\lambda}:=\sum_{j=1}^N\lambda_j h_j$. Let $\ell: T\times\mathbb{R}\mapsto\mathbb{R}$ be a given loss function and suppose it is convex with respect to the second variable. Let $(\ell\bullet f)(x,y):=\ell(y;f(x))$. Finally, let $\Lambda\subset\mathbb{R}^N$ be the simplex of all probability distributions on $\{1,...,N\}$. Consider the following penalized empirical risk minimization problem
    \begin{eqnarray*}\hat{\lambda}^{\varepsilon}:={\mathop{argmin}_{\lambda\in\Lambda}}\Biggl[P_n(\ell\bullet f_{\lambda})+\varepsilon\sum_{j=1}^N\lambda_j\log\lambda_j\Biggr]\end{eqnarray*}
    along with its distribution dependent version
    \begin{eqnarray*}\lambda^{\varepsilon}:={\mathop{argmin}_{\lambda\in\Lambda}}\Biggl[P(\ell\bullet f_{\lambda})+\varepsilon\sum_{j=1}^N\lambda_j\log\lambda_j\Biggr],\end{eqnarray*}
    where $\varepsilon\geq 0$ is a regularization parameter. It is proved that the "approximate sparsity" of $\lambda^{\varepsilon}$ implies the "approximate sparsity" of $\hat{\lambda}^{\varepsilon}$, and the impact of "sparsity" on bounding the excess risk of the empirical solution is explored. Similar results are also discussed in the case of entropy penalized density estimation.
    Comment: Published in at http://dx.doi.org/10.1214/08-AOS621 the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org
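    A minimal sketch of computing $\hat{\lambda}^{\varepsilon}$ numerically, taking squared loss as one example of a convex loss $\ell$ and using exponentiated-gradient (mirror) descent on the simplex; the optimization scheme, step size, $\varepsilon$ and the synthetic dictionary are illustrative choices, not part of the paper.

```python
import numpy as np

def entropy_penalized_erm(H, y, eps, eta=0.1, iters=500):
    """Approximate argmin over the simplex of (1/n)*||y - H lam||^2 + eps * sum(lam * log lam).

    H : (n, N) matrix with H[i, j] = h_j(X_i), entries in [-1, 1].
    """
    n, N = H.shape
    lam = np.full(N, 1.0 / N)                      # start at the uniform distribution
    for _ in range(iters):
        grad = 2.0 * H.T @ (H @ lam - y) / n + eps * (np.log(lam) + 1.0)
        lam = lam * np.exp(-eta * grad)            # multiplicative (mirror) update
        lam = np.maximum(lam, 1e-16)               # numerical floor, keeps log finite
        lam /= lam.sum()                           # renormalize onto the simplex
    return lam

rng = np.random.default_rng(3)
n, N = 300, 50
H = np.clip(rng.standard_normal((n, N)), -1.0, 1.0)     # dictionary values in [-1, 1]
lam_true = np.zeros(N)
lam_true[[3, 17]] = [0.6, 0.4]                           # sparse mixture of two dictionary functions
y = H @ lam_true + 0.05 * rng.standard_normal(n)
lam_hat = entropy_penalized_erm(H, y, eps=0.01)
print("largest weights at indices:", np.argsort(lam_hat)[-3:])
```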

    SISSO: a compressed-sensing method for identifying the best low-dimensional descriptor in an immensity of offered candidates

    The lack of reliable methods for identifying descriptors - the sets of parameters capturing the underlying mechanisms of a materials property - is one of the key factors hindering efficient materials development. Here, we propose a systematic approach for discovering descriptors for materials properties, within the framework of compressed-sensing based dimensionality reduction. SISSO (sure independence screening and sparsifying operator) tackles immense and correlated feature spaces, and converges to the optimal solution from a combination of features relevant to the materials' property of interest. In addition, SISSO gives stable results even with small training sets. The methodology is benchmarked with the quantitative prediction of the ground-state enthalpies of octet binary materials (using ab initio data) and applied to the showcase example of predicting the metal/insulator classification of binaries (with experimental data). Accurate, predictive models are found in both cases. For the metal-insulator classification model, the predictive capability is tested beyond the training data: it rediscovers the available pressure-induced insulator-to-metal transitions and it allows for the prediction of yet unknown transition candidates, ripe for experimental validation. As a step forward with respect to previous model-identification methods, SISSO can become an effective tool for automatic materials development.
    Comment: 11 pages, 5 figures, in press in Phys. Rev. Material
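    A much-simplified sketch of the screening-plus-L0 loop behind SISSO (it omits the construction of the large nonlinear feature space that SISSO builds from operators, and it is not the authors' implementation): rank features by correlation with the current residual, add the top ones to a pool, then search exhaustively over small subsets of the pool by least squares.

```python
import numpy as np
from itertools import combinations

def sisso_like(F, y, max_dim=2, per_iter=20):
    """Select a low-dimensional descriptor from the columns of F for target y."""
    n, p = F.shape
    pool, residual, best_model = [], y.copy(), None
    for d in range(1, max_dim + 1):
        # SIS step: add the features most correlated with the current residual.
        scores = np.abs(F.T @ residual)
        ranked = [j for j in np.argsort(scores)[::-1] if j not in pool]
        pool += ranked[:per_iter]
        # SO step: exhaustive least-squares search over all d-subsets of the pool.
        best_err = np.inf
        for subset in combinations(pool, d):
            A = np.column_stack([F[:, list(subset)], np.ones(n)])   # features + intercept
            coef, *_ = np.linalg.lstsq(A, y, rcond=None)
            err = np.linalg.norm(y - A @ coef)
            if err < best_err:
                best_err, best_model = err, (subset, coef)
        subset, coef = best_model
        A = np.column_stack([F[:, list(subset)], np.ones(n)])
        residual = y - A @ coef
    return best_model

rng = np.random.default_rng(4)
F = rng.standard_normal((100, 500))                  # candidate features (one per column)
y = 2.0 * F[:, 10] - 1.0 * F[:, 200] + 0.05 * rng.standard_normal(100)
subset, coef = sisso_like(F, y)
print("selected descriptor (feature indices):", subset)
```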

    Non-Negative Sparse Regression and Column Subset Selection with L1 Error

    We consider the problems of sparse regression and column subset selection under L1 error. For both problems, we show that in the non-negative setting it is possible to obtain tight and efficient approximations, without any additional structural assumptions (such as restricted isometry, incoherence, expansion, etc.). For sparse regression, given a matrix A and a vector b with non-negative entries, we give an efficient algorithm to output a vector x of sparsity O(k), for which ||Ax - b||_1 is comparable to the smallest error possible using non-negative k-sparse x. We then use this technique to obtain our main result: an efficient algorithm for column subset selection under L1 error for non-negative matrices.
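    The paper's contribution is the sparse selection procedure with approximation guarantees; the sketch below only shows the underlying dense benchmark problem, min ||Ax - b||_1 over x >= 0, written as a linear program. The use of scipy's linprog and the synthetic data are illustrative choices, not the paper's algorithm.

```python
import numpy as np
from scipy.optimize import linprog

def nonneg_l1_regression(A, b):
    """Minimize ||A x - b||_1 subject to x >= 0, via an LP with slack variables t."""
    m, n = A.shape
    c = np.concatenate([np.zeros(n), np.ones(m)])           # objective: sum of slacks t
    A_ub = np.block([[A, -np.eye(m)],                        #  A x - t <= b
                     [-A, -np.eye(m)]])                      # -A x - t <= -b
    b_ub = np.concatenate([b, -b])
    res = linprog(c, A_ub=A_ub, b_ub=b_ub)                   # default bounds enforce x, t >= 0
    return res.x[:n]

rng = np.random.default_rng(5)
A = np.abs(rng.standard_normal((60, 15)))                    # non-negative design
x_true = np.zeros(15)
x_true[[2, 7]] = [1.5, 0.5]
b = A @ x_true + 0.1 * rng.standard_normal(60)
x_hat = nonneg_l1_regression(A, b)
print("L1 error:", np.abs(A @ x_hat - b).sum())
```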