
    Alternating direction method of multipliers for penalized zero-variance discriminant analysis

    We consider the task of classification in the high-dimensional setting, where the number of features of the given data is significantly greater than the number of observations. To accomplish this task, we propose a heuristic, called sparse zero-variance discriminant analysis (SZVD), for simultaneously performing linear discriminant analysis and feature selection on high-dimensional data. This method combines classical zero-variance discriminant analysis, where discriminant vectors are identified in the null space of the sample within-class covariance matrix, with penalization applied to induce sparse structure in the resulting vectors. To approximately solve the resulting nonconvex problem, we develop a simple algorithm based on the alternating direction method of multipliers (ADMM). Further, we show that this algorithm is applicable to a larger class of penalized generalized eigenvalue problems, including a particular relaxation of the sparse principal component analysis problem. Finally, we establish theoretical guarantees for convergence of our algorithm to stationary points of the original nonconvex problem, and empirically demonstrate the effectiveness of our heuristic for classifying simulated data and data drawn from applications in time-series classification.
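
    The ADMM scheme the abstract describes follows a standard splitting pattern. Below is a minimal, illustrative sketch for a generic l1-penalized quadratic surrogate; the actual SZVD subproblem restricts iterates to the null space of the within-class covariance, which is omitted here, and all names are ours rather than the authors'.

    ```python
    import numpy as np

    def soft_threshold(v, t):
        """Elementwise soft-thresholding: the prox of t * ||.||_1."""
        return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

    def admm_l1_quadratic(A, b, lam, rho=1.0, n_iter=200, tol=1e-6):
        """Sketch of ADMM for min_x 0.5 x'Ax - b'x + lam * ||x||_1,
        split as f(x) + g(z) subject to x = z (scaled dual form)."""
        n = A.shape[0]
        x, z, u = np.zeros(n), np.zeros(n), np.zeros(n)
        M = A + rho * np.eye(n)  # x-update solves (A + rho*I) x = rhs
        for _ in range(n_iter):
            x = np.linalg.solve(M, b + rho * (z - u))
            z_old = z
            z = soft_threshold(x + u, lam / rho)  # z-update: l1 prox
            u = u + x - z                         # dual update
            # stop when primal and dual residuals are both small
            if (np.linalg.norm(x - z) < tol
                    and rho * np.linalg.norm(z - z_old) < tol):
                break
        return z
    ```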

    On High Dimensional Sparse Regression and Its Inference

    In the first part of this work, we aim to develop a sparse projection regression modeling (SPReM) framework to perform multivariate regression modeling with a large number of responses and a multivariate covariate of interest. We propose two novel heritability ratios to simultaneously perform dimension reduction, response selection, estimation, and testing, while explicitly accounting for correlations among multivariate responses. SPReM is devised specifically to address the low statistical power of many standard statistical approaches, such as Hotelling's $T^2$ test statistic or a mass univariate analysis, for high-dimensional data. We formulate the estimation problem of SPReM as a novel sparse unit rank projection (SURP) problem and propose a fast optimization algorithm for SURP. Furthermore, we extend SURP to the sparse multi-rank projection (SMURP) by adopting a sequential SURP approximation. Theoretically, we systematically investigate the convergence properties of SURP and the convergence rate of SURP estimates. Our simulation results and real data analysis show that SPReM outperforms other state-of-the-art methods. In the second part of this work, we propose a Hard Thresholded Regression (HTR) framework for simultaneous variable selection and unbiased estimation in high-dimensional linear regression. This new framework is motivated by its close connection with $L_0$ regularization and best subset selection under orthogonal design, while enjoying several key computational and theoretical advantages over many existing penalization methods (e.g., SCAD or MCP). Computationally, HTR is a fast two-stage estimation procedure consisting of a first step that calculates a coarse initial estimator and a second step that solves a linear program. Theoretically, under some mild conditions, the HTR estimator is shown to enjoy the strong oracle property and the thresholded property even when the number of covariates may grow at an exponential rate. We also propose to incorporate a regularized covariance estimator into the estimation procedure in order to better trade off noise accumulation against correlation modeling. In this scenario with a regularized covariance matrix, HTR includes Sure Independence Screening as a special case. Both simulation and real data results show that HTR outperforms other state-of-the-art methods. In the third part of this work, we focus on multicategory classification and propose sparse multicategory discriminant analysis. Many supervised machine learning tasks can be cast as multicategory classification problems. Linear discriminant analysis has been well studied for two-class classification problems and can be easily extended to multicategory cases. For high-dimensional classification, traditional linear discriminant analysis fails due to diverging spectra and the accumulation of noise, and researchers have therefore proposed penalized LDA (Fan et al., 2012; Witten and Tibshirani, 2011). However, most available methods for high-dimensional multi-class LDA are based on an iterative algorithm, which is computationally expensive and not theoretically justified. We present a new framework for sparse multicategory discriminant analysis (SMDA) for high-dimensional multi-class classification that extracts the discriminant directions simultaneously. SMDA can be cast as a convex program, which distinguishes it from other state-of-the-art methods. We evaluate the performance of the resulting methods in an extensive simulation study and a real data analysis.
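
    As a rough illustration of the two-stage idea behind HTR: compute a coarse initial estimate, hard-threshold it, then refit on the kept support for unbiasedness. The paper's second stage solves a linear program; the ordinary-least-squares refit below is a simplified stand-in, and the ridge initializer and threshold are our choices.

    ```python
    import numpy as np

    def hard_thresholded_regression(X, y, tau, ridge=1e-2):
        """Illustrative two-stage hard-thresholding estimator.
        Stage 1: coarse initial estimate (ridge, for stability).
        Stage 2: unpenalized refit on the selected support; the
        paper uses a linear program here, OLS is a stand-in."""
        n, p = X.shape
        # Stage 1: coarse initial estimator
        beta0 = np.linalg.solve(X.T @ X + ridge * np.eye(p), X.T @ y)
        # Hard threshold: keep coefficients with |beta0_j| > tau
        support = np.abs(beta0) > tau
        # Stage 2: least-squares refit restricted to the support
        beta = np.zeros(p)
        if support.any():
            Xs = X[:, support]
            beta[support] = np.linalg.lstsq(Xs, y, rcond=None)[0]
        return beta, support
    ```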

    Supervised Classification Using Sparse Fisher's LDA

    It is well known that in a supervised classification setting, when the number of features is smaller than the number of observations, Fisher's linear discriminant rule is asymptotically Bayes. However, there are numerous modern applications where classification is needed in the high-dimensional setting. Naive implementation of Fisher's rule in this case fails to provide good results because the sample covariance matrix is singular. Moreover, a classifier that relies on all features makes interpretation of the results challenging. Our goal is to provide robust classification that relies only on a small subset of important features and accounts for the underlying correlation structure. We apply a lasso-type penalty to the discriminant vector to ensure sparsity of the solution and use a shrinkage-type estimator for the covariance matrix. The resulting optimization problem is solved using an iterative coordinate ascent algorithm. Furthermore, we analyze the effect of nonconvexity on the sparsity level of the solution and highlight the difference between the penalized and the constrained versions of the problem. Simulation results show that the proposed method performs favorably in comparison to alternatives. The method is used to classify leukemia patients based on DNA methylation features.
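
    One common way to pose such a lasso-penalized discriminant problem (not necessarily this paper's exact objective) is min_b 0.5 b'Sb - d'b + lam * ||b||_1, where S is a shrinkage covariance estimate and d the difference of class means. A coordinate-wise sketch, using scikit-learn's Ledoit-Wolf estimator as one shrinkage-type choice:

    ```python
    import numpy as np
    from sklearn.covariance import LedoitWolf

    def sparse_lda_direction(X0, X1, lam, n_iter=100, tol=1e-8):
        """Coordinate-wise sketch for min_b 0.5 b'Sb - d'b + lam*||b||_1,
        a common penalized-LDA surrogate. X0, X1: samples from the
        two classes (rows = observations)."""
        d = X1.mean(axis=0) - X0.mean(axis=0)
        centered = np.vstack([X0 - X0.mean(axis=0), X1 - X1.mean(axis=0)])
        S = LedoitWolf().fit(centered).covariance_  # shrinkage estimate
        p = S.shape[0]
        b = np.zeros(p)
        for _ in range(n_iter):
            b_old = b.copy()
            for j in range(p):
                # linear coefficient seen by b_j with the rest fixed
                c = d[j] - S[j] @ b + S[j, j] * b[j]
                # closed-form coordinate update: soft-threshold
                b[j] = np.sign(c) * max(abs(c) - lam, 0.0) / S[j, j]
            if np.max(np.abs(b - b_old)) < tol:
                break
        return b
    ```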

    Manifold Elastic Net: A Unified Framework for Sparse Dimension Reduction

    It is difficult to find the optimal sparse solution of a manifold-learning-based dimensionality reduction algorithm. Lasso- or elastic-net-penalized manifold learning is not directly a lasso-penalized least-squares problem, so least angle regression (LARS; Efron et al., 2004), one of the most popular algorithms in sparse learning, cannot be applied. Therefore, most current approaches take indirect routes or impose strict settings, which can be inconvenient in applications. In this paper, we propose the manifold elastic net (MEN). MEN incorporates the merits of both manifold-learning-based and sparse-learning-based dimensionality reduction. Using a series of equivalent transformations, we show that MEN is equivalent to a lasso-penalized least-squares problem, so LARS can be adopted to obtain the optimal sparse solution of MEN. In particular, MEN has the following advantages for subsequent classification: 1) the local geometry of samples is well preserved in the low-dimensional data representation, 2) both margin maximization and classification error minimization are considered in computing the sparse projection, 3) the projection matrix of MEN improves parsimony in computation, 4) the elastic net penalty reduces over-fitting, and 5) the projection matrix of MEN can be interpreted psychologically and physiologically. Experimental evidence on face recognition over various popular datasets suggests that MEN is superior to leading dimensionality reduction algorithms.
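
    The kind of "equivalent transformation" the abstract invokes can be illustrated with the standard elastic-net-to-lasso augmentation of Zou and Hastie (2005), after which LARS applies directly. MEN's manifold terms are assumed to have been folded into the design upstream, and the scikit-learn penalty scaling below is our bookkeeping, not the paper's.

    ```python
    import numpy as np
    from sklearn.linear_model import LassoLars

    def elastic_net_via_lars(X, y, lam1, lam2):
        """Naive elastic net min ||y-Xw||^2 + lam2*||w||^2 + lam1*||w||_1
        recast as a lasso on an augmented design (Zou & Hastie, 2005)
        and solved with LARS; illustrates the reduction MEN relies on."""
        n, p = X.shape
        c = 1.0 / np.sqrt(1.0 + lam2)
        # augmented design: p extra rows encode the ridge term
        X_aug = c * np.vstack([X, np.sqrt(lam2) * np.eye(p)])
        y_aug = np.concatenate([y, np.zeros(p)])
        # LassoLars minimizes (1/(2m))*||y - Xw||^2 + alpha*||w||_1,
        # so rescale the penalty for the m = n + p augmented rows
        gamma = lam1 * c
        model = LassoLars(alpha=gamma / (2.0 * (n + p)), fit_intercept=False)
        model.fit(X_aug, y_aug)
        return c * model.coef_  # undo the augmentation scaling
    ```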

    Sparse multinomial kernel discriminant analysis (sMKDA)

    Dimensionality reduction via canonical variate analysis (CVA) is important for pattern recognition and has been extended in various ways to permit more flexibility, e.g. by "kernelizing" the formulation. This can lead to over-fitting, usually ameliorated by regularization. Here, a method for sparse multinomial kernel discriminant analysis (sMKDA) is proposed, using a sparse basis to control complexity. It is based on the connection between CVA and least squares, and uses forward selection via orthogonal least squares to approximate a basis, generalizing a similar approach for binomial problems. Classification can be performed directly via minimum Mahalanobis distance in the canonical variates. sMKDA achieves state-of-the-art performance in terms of accuracy and sparseness on 11 benchmark datasets.
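
    A sketch of forward selection via orthogonal least squares over kernel (Gram) columns, the sparse-basis mechanism the abstract describes; the one-hot class-indicator targets and the fixed basis size are our assumptions.

    ```python
    import numpy as np

    def ols_forward_selection(K, Y, n_basis):
        """Greedy forward selection: pick columns of the Gram matrix
        K (n x n) that best explain the class-indicator targets Y
        (n x c), orthogonalizing each candidate against the columns
        already chosen, as in orthogonal least squares."""
        n = K.shape[0]
        selected = []
        Q = []        # orthonormal basis spanning the chosen columns
        R = Y.astype(float).copy()  # residual targets
        for _ in range(n_basis):
            best_j, best_gain, best_q = None, -np.inf, None
            for j in range(n):
                if j in selected:
                    continue
                q = K[:, j].astype(float).copy()
                for qk in Q:          # Gram-Schmidt against the basis
                    q -= (qk @ q) * qk
                nq = np.linalg.norm(q)
                if nq < 1e-10:        # candidate already spanned
                    continue
                q /= nq
                gain = np.sum((q @ R) ** 2)  # residual energy explained
                if gain > best_gain:
                    best_j, best_gain, best_q = j, gain, q
            if best_j is None:
                break
            selected.append(best_j)
            Q.append(best_q)
            R = R - np.outer(best_q, best_q @ R)  # deflate residual
        return selected
    ```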