Alternating direction method of multipliers for penalized zero-variance discriminant analysis
We consider the task of classification in the high dimensional setting where
the number of features of the given data is significantly greater than the
number of observations. To accomplish this task, we propose a heuristic, called
sparse zero-variance discriminant analysis (SZVD), for simultaneously
performing linear discriminant analysis and feature selection on high
dimensional data. This method combines classical zero-variance discriminant
analysis, where discriminant vectors are identified in the null space of the
sample within-class covariance matrix, with penalization applied to induce
sparse structures in the resulting vectors. To approximately solve the
resulting nonconvex problem, we develop a simple algorithm based on the
alternating direction method of multipliers. Further, we show that this
algorithm is applicable to a larger class of penalized generalized eigenvalue
problems, including a particular relaxation of the sparse principal component
analysis problem. Finally, we establish theoretical guarantees for convergence
of our algorithm to stationary points of the original nonconvex problem, and
empirically demonstrate the effectiveness of our heuristic for classifying
simulated data and data drawn from applications in time-series classification.
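The classical zero-variance step described in this abstract (a discriminant vector taken from the null space of the sample within-class covariance matrix) can be sketched as follows. This is a minimal NumPy illustration of the unpenalized building block only, not the SZVD algorithm or its ADMM solver; the function name `zvd_direction` and the tolerance `tol` are assumptions introduced here.

```python
import numpy as np

def zvd_direction(X, y, tol=1e-10):
    """Unit vector in null(W) that maximizes between-class scatter.

    X: (n, p) data matrix with p > n; y: (n,) integer class labels.
    Hypothetical helper for illustration; no sparsity penalty is applied.
    """
    classes = np.unique(y)
    mean_all = X.mean(axis=0)
    Xc = X.astype(float).copy()
    M = []
    for k in classes:
        idx = (y == k)
        mu = X[idx].mean(axis=0)
        Xc[idx] -= mu                                   # center by class mean
        M.append(np.sqrt(idx.sum()) * (mu - mean_all))  # weighted mean offset
    M = np.array(M)                                     # B = M.T @ M

    # W = Xc.T @ Xc has the same null space as Xc, so use the SVD of Xc.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=True)
    rank = int((s > tol * s.max()).sum())
    N = Vt[rank:].T                                     # basis of null(W)

    # Maximize w^T B w over w = N v with ||v|| = 1:
    # v is the leading right singular vector of M @ N.
    _, _, Vt2 = np.linalg.svd(M @ N, full_matrices=False)
    return N @ Vt2[0]
```

Because the returned vector lies in the null space of the within-class scatter, every training observation of a given class projects to the same value, which is where the "zero-variance" name comes from.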
On High Dimensional Sparse Regression and Its Inference
In the first part of this work, we aim to develop a sparse projection regression modeling (SPReM) framework to perform multivariate regression modeling with a large number of responses and a multivariate covariate of interest. We propose two novel heritability ratios to simultaneously perform dimension reduction, response selection, estimation, and testing, while explicitly accounting for correlations among multivariate responses. Our SPReM is devised to specifically address the low statistical power issue of many standard statistical approaches, such as Hotelling's test statistic or a mass univariate analysis, for high-dimensional data. We formulate the estimation problem of SPReM as a novel sparse unit rank projection (SURP) problem and propose a fast optimization algorithm for SURP. Furthermore, we extend SURP to the sparse multi-rank projection (SMURP) by adopting a sequential SURP approximation. Theoretically, we have systematically investigated the convergence properties of SURP and the convergence rate of SURP estimates. Our simulation results and real data analysis have shown that SPReM outperforms other state-of-the-art methods. In the second part of this work, we propose a Hard Thresholded Regression (HTR) framework for simultaneous variable selection and unbiased estimation in high dimensional linear regression. This new framework is motivated by its close connection with the regularization and best subset selection under orthogonal design, while enjoying several key computational and theoretical advantages over many existing penalization methods (e.g., SCAD or MCP). Computationally, HTR is a fast two-stage estimation procedure consisting of a first step that computes a coarse initial estimator and a second step that solves a linear program. Theoretically, under some mild conditions, the HTR estimator is shown to enjoy the strong oracle property and thresholded property even when the number of covariates may grow at an exponential rate.
We also propose to incorporate the regularized covariance estimator into the estimation procedure in order to better trade off between noise accumulation and correlation modeling. Under this scenario with regularized covariance matrix, HTR includes Sure Independence Screening as a special case. Both simulation and real data results show that HTR outperforms other state-of-the-art methods. In the third part of this work, we focus on multicategory classification and propose the sparse multicategory discriminant analysis. Many supervised machine learning tasks can be cast as multicategory classification problems. Linear discriminant analysis has been well studied in two-class classification problems and can be easily extended to multicategory cases. For high dimensional classification, traditional linear discriminant analysis fails due to diverging spectra and accumulation of noise. Therefore, researchers have proposed penalized LDA (Fan et al., 2012; Witten and Tibshirani, 2011). However, most available methods for high dimensional multi-class LDA are based on an iterative algorithm, which is computationally expensive and not theoretically justified. In this paper, we present a new framework for sparse multicategory discriminant analysis (SMDA) for high dimensional multi-class classification by simultaneously extracting the discriminant directions. Our SMDA can be cast as a convex program, which distinguishes it from other state-of-the-art methods. We evaluate the performance of the resulting methods in an extensive simulation study and a real data analysis. Doctor of Philosophy
Supervised Classification Using Sparse Fisher's LDA
It is well known that in a supervised classification setting when the number
of features is smaller than the number of observations, Fisher's linear
discriminant rule is asymptotically Bayes. However, there are numerous modern
applications where classification is needed in the high-dimensional setting.
Naive implementation of Fisher's rule in this case fails to provide good
results because the sample covariance matrix is singular. Moreover, when a
classifier relies on all features, the results are difficult to interpret.
Our goal is to provide robust classification that
relies only on a small subset of important features and accounts for the
underlying correlation structure. We apply a lasso-type penalty to the
discriminant vector to ensure sparsity of the solution and use a shrinkage type
estimator for the covariance matrix. The resulting optimization problem is
solved using an iterative coordinate ascent algorithm. Furthermore, we analyze
the effect of nonconvexity on the sparsity level of the solution and highlight
the difference between the penalized and the constrained versions of the
problem. The simulation results show that the proposed method performs
favorably in comparison to alternatives. The method is used to classify
leukemia patients based on DNA methylation features.
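The two ingredients named in this abstract, a lasso-type penalty on the discriminant vector and a shrinkage-type covariance estimator, can be illustrated with a simplified stand-in. The sketch below is not the authors' nonconvex formulation or their coordinate ascent algorithm: it swaps in a convex surrogate, minimizing 0.5 * w^T S w - d^T w + lam * ||w||_1 by cyclic coordinate descent, where S is a covariance estimate shrunk toward its diagonal and d is the difference of class means. All function names, `alpha`, `lam`, and `n_iter` are assumptions introduced here.

```python
import numpy as np

def soft_threshold(a, lam):
    """Soft-thresholding operator associated with the lasso penalty."""
    return np.sign(a) * np.maximum(np.abs(a) - lam, 0.0)

def shrunk_covariance(Xc, alpha):
    """Shrink the sample covariance of centered data Xc toward its diagonal.

    alpha in (0, 1]; assumes every feature has nonzero sample variance.
    """
    S = Xc.T @ Xc / Xc.shape[0]
    return (1.0 - alpha) * S + alpha * np.diag(np.diag(S))

def sparse_direction(S, d, lam, n_iter=200):
    """Cyclic coordinate descent for 0.5*w'Sw - d'w + lam*||w||_1."""
    p = len(d)
    w = np.zeros(p)
    for _ in range(n_iter):
        for j in range(p):
            # Partial residual for coordinate j, excluding w[j]'s own term.
            r = d[j] - S[j] @ w + S[j, j] * w[j]
            w[j] = soft_threshold(r, lam) / S[j, j]
    return w
```

With an identity covariance the update reduces to soft-thresholding the mean difference, so entries of d smaller than lam in absolute value are set exactly to zero; this is the mechanism by which the penalty selects a small subset of features.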
Manifold Elastic Net: A Unified Framework for Sparse Dimension Reduction
It is difficult to find the optimal sparse solution of a manifold learning
based dimensionality reduction algorithm. The lasso or the elastic net
penalized manifold learning based dimensionality reduction is not directly a
lasso penalized least squares problem and thus the least angle regression (LARS)
(Efron et al.), one of the most popular algorithms in sparse
learning, cannot be applied. Therefore, most current approaches take indirect
ways or have strict settings, which can be inconvenient for applications. In
this paper, we propose the manifold elastic net, or MEN for short. MEN
incorporates the merits of both the manifold learning based dimensionality
reduction and the sparse learning based dimensionality reduction. By using a
series of equivalent transformations, we show MEN is equivalent to the lasso
penalized least squares problem and thus LARS is adopted to obtain the optimal
sparse solution of MEN. In particular, MEN has the following advantages for
subsequent classification: 1) the local geometry of samples is well preserved
for low dimensional data representation, 2) both the margin maximization and
the classification error minimization are considered for sparse projection
calculation, 3) the projection matrix of MEN improves the parsimony in
computation, 4) the elastic net penalty reduces the over-fitting problem, and
5) the projection matrix of MEN can be interpreted psychologically and
physiologically. Experimental evidence on face recognition over various popular
datasets suggests that MEN is superior to top-level dimensionality reduction
algorithms. Comment: 33 pages, 12 figures
Sparse multinomial kernel discriminant analysis (sMKDA)
Dimensionality reduction via canonical variate analysis (CVA) is important for pattern recognition and has been extended variously to permit more flexibility, e.g. by "kernelizing" the formulation. This can lead to over-fitting, usually ameliorated by regularization. Here, a method for sparse, multinomial kernel discriminant analysis (sMKDA) is proposed, using a sparse basis to control complexity. It is based on the connection between CVA and least-squares, and uses forward selection via orthogonal least-squares to approximate a basis, generalizing a similar approach for binomial problems. Classification can be performed directly via minimum Mahalanobis distance in the canonical variates. sMKDA achieves state-of-the-art performance in terms of accuracy and sparseness on 11 benchmark datasets.
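The classification rule mentioned in this abstract, minimum Mahalanobis distance in the canonical variates, can be sketched generically as below. The sketch assumes the projected points `Z`, the per-class means `means`, and a shared covariance `cov` in the projected space are already available; it does not cover the sparse kernel basis selection, and the function name `mahalanobis_classify` is an assumption introduced here.

```python
import numpy as np

def mahalanobis_classify(Z, means, cov):
    """Assign each row of Z to the class with the nearest mean.

    Z: (n, q) projected points; means: (K, q) class means;
    cov: (q, q) shared covariance in the projected space (assumed invertible).
    Returns an (n,) array of class indices in 0..K-1.
    """
    P = np.linalg.inv(cov)                      # precision matrix
    diff = Z[:, None, :] - means[None, :, :]    # (n, K, q) offsets to each mean
    # Squared Mahalanobis distance of every point to every class mean.
    d2 = np.einsum('nkq,qr,nkr->nk', diff, P, diff)
    return d2.argmin(axis=1)
```

When `cov` is the identity, the rule reduces to nearest-mean classification under Euclidean distance.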