This paper considers sparse linear discriminant analysis of high-dimensional
data. In contrast to the existing methods which are based on separate
estimation of the precision matrix \O and the difference \de of the mean
vectors, we introduce a simple and effective classifier by estimating the
product \O\de directly through constrained ℓ1 minimization. The
estimator can be implemented efficiently using linear programming and the
resulting classifier is called the linear programming discriminant (LPD) rule.
The LPD rule is shown to have desirable theoretical and numerical properties.
It exploits the approximate sparsity of $\Omega\delta$ and, as a consequence,
can perform well even when $\Omega$ and/or $\delta$ cannot be
estimated consistently. Asymptotic properties of the LPD rule are investigated,
and consistency and rates of convergence are established. The LPD classifier
has superior finite-sample performance and significant computational advantages
over existing methods that require separate estimation of $\Omega$ and $\delta$.
The LPD rule is also applied to analyze real datasets from lung cancer and
leukemia studies. The classifier performs favorably in comparison to existing
methods.
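As an illustration of the constrained $\ell_1$-minimization step described above, the estimator of $\Omega\delta$ can be recast as a standard linear program: minimize $\|\beta\|_1$ subject to $\|\hat\Sigma\beta - \hat\delta\|_\infty \le \lambda$. The following is a minimal sketch of that reformulation, not the authors' implementation; the helper names `lpd_direction` and `lpd_classify`, the tuning parameter `lam`, and the use of SciPy's `linprog` solver are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import linprog

def lpd_direction(Sigma_hat, delta_hat, lam):
    """Sketch: solve min ||beta||_1 s.t. ||Sigma_hat @ beta - delta_hat||_inf <= lam
    as a linear program in the stacked variables x = [beta, u] with |beta_j| <= u_j."""
    p = len(delta_hat)
    c = np.concatenate([np.zeros(p), np.ones(p)])   # objective: sum of u_j
    I = np.eye(p)
    A_ub = np.vstack([
        np.hstack([ I, -I]),                          #  beta - u <= 0
        np.hstack([-I, -I]),                          # -beta - u <= 0
        np.hstack([ Sigma_hat, np.zeros((p, p))]),    #  Sigma beta <= delta + lam
        np.hstack([-Sigma_hat, np.zeros((p, p))]),    # -Sigma beta <= -delta + lam
    ])
    b_ub = np.concatenate([np.zeros(2 * p), delta_hat + lam, -delta_hat + lam])
    bounds = [(None, None)] * p + [(0, None)] * p     # beta free, u >= 0
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    return res.x[:p]                                   # estimate of Omega * delta

def lpd_classify(x, xbar1, xbar2, beta_hat):
    """Assign x to class 1 when (x - (xbar1 + xbar2)/2)' beta_hat >= 0, else class 2."""
    return 1 if (x - (xbar1 + xbar2) / 2) @ beta_hat >= 0 else 2
```

At the optimum the auxiliary variables satisfy $u_j = |\beta_j|$, so minimizing their sum is equivalent to minimizing the $\ell_1$ norm; this is what makes the sup-norm-constrained problem solvable with an off-the-shelf LP solver. The choice of $\lambda$ would in practice be tuned, for example by cross-validation.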