992 research outputs found
Penalized Orthogonal Iteration for Sparse Estimation of Generalized Eigenvalue Problem
We propose a new algorithm for sparse estimation of eigenvectors in
generalized eigenvalue problems (GEP). The GEP arises in a number of modern
data-analytic situations and statistical methods, including principal component
analysis (PCA), multiclass linear discriminant analysis (LDA), canonical
correlation analysis (CCA), sufficient dimension reduction (SDR) and invariant
co-ordinate selection. We propose to modify the standard generalized orthogonal
iteration with a sparsity-inducing penalty for the eigenvectors. To achieve
this goal, we generalize the equation-solving step of orthogonal iteration to a
penalized convex optimization problem. The resulting algorithm, called
penalized orthogonal iteration, provides accurate estimation of the true
eigenspace, when it is sparse. Also proposed is a computationally more
efficient alternative, which works well for PCA and LDA problems. Numerical
studies reveal that the proposed algorithms are competitive, and that our
tuning procedure works well. We demonstrate applications of the proposed
algorithm to obtain sparse estimates for PCA, multiclass LDA, CCA and SDR.
Supplementary materials are available online
API design for machine learning software: experiences from the scikit-learn project
Scikit-learn is an increasingly popular machine learning li- brary. Written
in Python, it is designed to be simple and efficient, accessible to
non-experts, and reusable in various contexts. In this paper, we present and
discuss our design choices for the application programming interface (API) of
the project. In particular, we describe the simple and elegant interface shared
by all learning and processing units in the library and then discuss its
advantages in terms of composition and reusability. The paper also comments on
implementation details specific to the Python ecosystem and analyzes obstacles
faced by users and developers of the library
- …