    Improving Sparse Representation-Based Classification Using Local Principal Component Analysis

    Sparse representation-based classification (SRC), proposed by Wright et al., seeks the sparsest decomposition of a test sample over the dictionary of training samples and assigns the sample to the class that contributes most to that decomposition. Because it assumes test samples can be written as linear combinations of their same-class training samples, the success of SRC depends on the size and representativeness of the training set. Our proposed classification algorithm enlarges the training set by using local principal component analysis to approximate the basis vectors of the tangent hyperplane of the class manifold at each training sample. The dictionary in SRC is replaced by a local dictionary that adapts to the test sample and includes training samples and their corresponding tangent basis vectors. We use a synthetic data set and three face databases to demonstrate that this method can achieve higher classification accuracy than SRC in cases of sparse sampling, nonlinear class manifolds, and stringent dimension reduction.
    Comment: Published in "Computational Intelligence for Pattern Recognition," editors Shyi-Ming Chen and Witold Pedrycz. The original publication is available at http://www.springerlink.co
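
    As a rough illustration (not the authors' implementation), the sketch below augments an SRC dictionary with local-PCA tangent vectors and classifies by class-wise residual. The Lasso-based L_1 solver, the helper names tangent_basis and src_classify, and all parameter values (k neighbours, r tangent directions, alpha) are illustrative assumptions; for simplicity it uses the full augmented dictionary rather than the test-adaptive local one the abstract describes.

```python
# Hedged sketch: SRC with a tangent-augmented dictionary.
# All names and parameters are illustrative, not from the paper.
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.neighbors import NearestNeighbors

def tangent_basis(X, k=5, r=2):
    """Local PCA: for each training sample, estimate r tangent basis
    vectors of the class manifold from its k nearest neighbours."""
    _, idx = NearestNeighbors(n_neighbors=k + 1).fit(X).kneighbors(X)
    bases = []
    for i in range(len(X)):
        nbrs = X[idx[i, 1:]] - X[i]              # centre on the sample itself
        _, _, vt = np.linalg.svd(nbrs, full_matrices=False)
        bases.append(vt[:r])                      # top-r principal directions
    return bases

def src_classify(x, X, y, bases, alpha=0.01):
    """Sparse decomposition of test sample x over the augmented
    dictionary; assign x to the class with the smallest residual."""
    cols, owners = [], []
    for i in range(len(X)):
        cols.append(X[i]); owners.append(y[i])
        for b in bases[i]:                        # tangent vectors keep the label
            cols.append(b); owners.append(y[i])
    D = np.column_stack(cols)
    D = D / (np.linalg.norm(D, axis=0) + 1e-12)   # unit-norm dictionary atoms
    coef = Lasso(alpha=alpha, fit_intercept=False, max_iter=10000).fit(D, x).coef_
    owners = np.asarray(owners)
    resid = {c: np.linalg.norm(x - D[:, owners == c] @ coef[owners == c])
             for c in np.unique(y)}
    return min(resid, key=resid.get)
```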

    Sparse Correlation Kernel Analysis and Reconstruction

    This paper presents a new paradigm for signal reconstruction and superresolution, Correlation Kernel Analysis (CKA), that is based on the selection of a sparse set of bases from a large dictionary of class-specific basis functions. The basis functions that we use are the correlation functions of the class of signals we are analyzing. To choose the appropriate features from this large dictionary, we use Support Vector Machine (SVM) regression and compare this to traditional Principal Component Analysis (PCA) for the tasks of signal reconstruction, superresolution, and compression. The testbed we use in this paper is a set of images of pedestrians. This paper also presents results of experiments in which we use a dictionary of multiscale basis functions and then use Basis Pursuit De-Noising to obtain a sparse, multiscale approximation of a signal. The results are analyzed and we conclude that 1) when used with a sparse representation technique, the correlation function is an effective kernel for image reconstruction and superresolution, 2) for image compression, PCA and SVM have different tradeoffs, depending on the particular metric that is used to evaluate the results, 3) in sparse representation techniques, L_1 is not a good proxy for the true measure of sparsity, L_0, and 4) the L_epsilon norm may be a better error metric for image reconstruction and compression than the L_2 norm, though the exact psychophysical metric should take into account higher-order structure in images.
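
    The Basis Pursuit De-Noising step mentioned here has a standard Lagrangian form, min_a 0.5*||x - D a||_2^2 + lambda*||a||_1, which can be sketched with an off-the-shelf Lasso solver (with L_1 serving as the convex surrogate for L_0, the very proxy the conclusions question). The helper bpdn_reconstruct, the random dictionary, the signal sizes, and the lam value below are toy assumptions, not the paper's multiscale setup.

```python
# Hedged sketch: Basis Pursuit De-Noising in its Lagrangian (Lasso) form.
import numpy as np
from sklearn.linear_model import Lasso

def bpdn_reconstruct(signal, dictionary, lam=0.05):
    """Solve min_a 0.5*||signal - D a||_2^2 + lam*||a||_1 and return
    the sparse code and the reconstruction.  (sklearn's Lasso scales
    the data-fit term by 1/n_samples, so lam is only proportional to
    the usual lambda.)"""
    D = dictionary / (np.linalg.norm(dictionary, axis=0) + 1e-12)  # unit atoms
    a = Lasso(alpha=lam, fit_intercept=False, max_iter=10000).fit(D, signal).coef_
    return a, D @ a

# Toy usage: overcomplete random dictionary, 5-sparse ground truth.
rng = np.random.default_rng(0)
D = rng.standard_normal((64, 256))
a_true = np.zeros(256)
a_true[rng.choice(256, 5, replace=False)] = 1.0
x = D @ a_true + 0.01 * rng.standard_normal(64)
a_hat, x_hat = bpdn_reconstruct(x, D)
print(np.count_nonzero(a_hat), np.linalg.norm(x - x_hat))
```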

    Multivariate Functional Principal Component Analysis for Data Observed on Different (Dimensional) Domains

    Existing approaches for multivariate functional principal component analysis are restricted to data on the same one-dimensional interval. The presented approach focuses on multivariate functional data on different domains that may differ in dimension, e.g. functions and images. The theoretical basis for multivariate functional principal component analysis is given in terms of a Karhunen-Loève Theorem. For the practically relevant case of a finite Karhunen-Loève representation, a relationship between univariate and multivariate functional principal component analysis is established. This offers an estimation strategy to calculate multivariate functional principal components and scores based on their univariate counterparts. For the resulting estimators, asymptotic results are derived. The approach can be extended to finite univariate expansions in general, not necessarily orthonormal, bases. It is also applicable to sparse functional data or data with measurement error. A flexible R implementation is available on CRAN. The new method is shown to be competitive with existing approaches for data observed on a common one-dimensional domain. The motivating application is a neuroimaging study, where the goal is to explore how longitudinal trajectories of a neuropsychological test score covary with FDG-PET brain scans at baseline. Supplementary material, including detailed proofs, additional simulation results, and software, is available online.
    Comment: Revised version. R code for the online appendix is available in the .zip file associated with this article in subdirectory "/Software". The software associated with this article is available on CRAN (packages funData and MFPCA).
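
    The univariate-to-multivariate estimation strategy can be sketched structurally: fit a separate PCA per data element, then a second PCA on the concatenated score matrices. The snippet below is only a schematic of that idea on discretised data; it omits the functional setting, quadrature weights, and measurement-error handling of the actual method (which lives in the MFPCA R package cited above), and mfpca_scores with all its sizes is hypothetical.

```python
# Hedged sketch of the univariate-to-multivariate estimation idea.
import numpy as np
from sklearn.decomposition import PCA

def mfpca_scores(elements, n_univ=10, n_multi=5):
    """elements: list of arrays, one per data element, each of shape
    (n_subjects, n_points) -- e.g. sampled curves and flattened images.
    Step 1: a univariate PCA per element yields per-element scores.
    Step 2: a PCA on the concatenated scores yields multivariate
    scores, mirroring the univariate-to-multivariate strategy."""
    blocks = [PCA(n_components=min(n_univ, min(X.shape))).fit_transform(X)
              for X in elements]
    stacked = np.hstack(blocks)            # (n_subjects, total components)
    return PCA(n_components=n_multi).fit_transform(stacked)

# Toy usage: 100 subjects, one curve element and one image element.
rng = np.random.default_rng(1)
curves = rng.standard_normal((100, 50))           # functions on an interval
images = rng.standard_normal((100, 20 * 20))      # flattened 20x20 scans
print(mfpca_scores([curves, images]).shape)       # (100, 5)
```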

    Taming Wild High Dimensional Text Data with a Fuzzy Lash

    The bag-of-words (BOW) model represents a corpus as a matrix whose elements are word frequencies. Each row of this matrix is, however, a very high-dimensional sparse vector. Dimension reduction (DR) is a popular way to address the resulting sparsity and high dimensionality. Among the different strategies for developing DR methods, Unsupervised Feature Transformation (UFT), which maps all words onto a new basis for representing the BOW matrix, is a popular one. The recent growth of text data and its challenges imply that the DR area still needs new perspectives. Although a wide range of methods based on the UFT strategy has been developed, the fuzzy approach has not been considered for DR under this strategy. This research investigates fuzzy clustering as a UFT-based DR method that collapses the BOW matrix into a lower-dimensional representation of documents in terms of word clusters instead of individual words, as sketched below. The quantitative evaluation shows that fuzzy clustering produces superior performance and features compared to Principal Component Analysis (PCA) and Singular Value Decomposition (SVD), two popular UFT-based DR methods.
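
    To make the collapsing step concrete, the sketch below fuzzily clusters the word columns of a documents-by-words BOW matrix (each word described by its document profile) and uses the membership matrix to project documents onto word clusters. This is one plausible reading of the UFT idea, not the paper's algorithm; the fuzzy_cmeans helper, the cluster count c, and the toy matrix sizes are all assumptions.

```python
# Hedged sketch: fuzzy c-means memberships of words as a DR basis.
import numpy as np
from scipy.spatial.distance import cdist

def fuzzy_cmeans(X, c=20, m=2.0, iters=100, seed=0):
    """Plain fuzzy c-means on the rows of X; returns the membership
    matrix U, shape (n_rows, c), whose rows sum to one."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(iters):
        W = U ** m                                   # fuzzified memberships
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        d = cdist(X, centers) + 1e-12                # distances to centres
        U = d ** (-2.0 / (m - 1.0))                  # standard FCM update
        U /= U.sum(axis=1, keepdims=True)
    return U

# Collapse a documents-by-words BOW matrix to documents-by-clusters.
rng = np.random.default_rng(2)
bow = rng.poisson(0.3, size=(200, 1000)).astype(float)   # docs x words
U = fuzzy_cmeans(bow.T, c=20)                            # words x clusters
docs_lowdim = bow @ U                                    # docs x clusters
print(docs_lowdim.shape)                                 # (200, 20)
```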