3,900 research outputs found

    Binary Linear Classification and Feature Selection via Generalized Approximate Message Passing

    Full text link
    For the problem of binary linear classification and feature selection, we propose algorithmic approaches to classifier design based on the generalized approximate message passing (GAMP) algorithm, recently proposed in the context of compressive sensing. We are particularly motivated by problems where the number of features greatly exceeds the number of training examples, but where only a few features suffice for accurate classification. We show that sum-product GAMP can be used to (approximately) minimize the classification error rate and max-sum GAMP can be used to minimize a wide variety of regularized loss functions. Furthermore, we describe an expectation-maximization (EM)-based scheme to learn the associated model parameters online, as an alternative to cross-validation, and we show that GAMP's state-evolution framework can be used to accurately predict the misclassification rate. Finally, we present a detailed numerical study to confirm the accuracy, speed, and flexibility afforded by our GAMP-based approaches to binary linear classification and feature selection

    Performance Analysis of Spectral Clustering on Compressed, Incomplete and Inaccurate Measurements

    Full text link
    Spectral clustering is one of the most widely used techniques for extracting the underlying global structure of a data set. Compressed sensing and matrix completion have emerged as prevailing methods for efficiently recovering sparse and partially observed signals respectively. We combine the distance preserving measurements of compressed sensing and matrix completion with the power of robust spectral clustering. Our analysis provides rigorous bounds on how small errors in the affinity matrix can affect the spectral coordinates and clusterability. This work generalizes the current perturbation results of two-class spectral clustering to incorporate multi-class clustering with k eigenvectors. We thoroughly track how small perturbation from using compressed sensing and matrix completion affect the affinity matrix and in succession the spectral coordinates. These perturbation results for multi-class clustering require an eigengap between the kth and (k+1)th eigenvalues of the affinity matrix, which naturally occurs in data with k well-defined clusters. Our theoretical guarantees are complemented with numerical results along with a number of examples of the unsupervised organization and clustering of image data
    • …
    corecore