research

Subset selection in dimension reduction methods

Abstract

Dimension reduction methods play an important role in multivariate statistical analysis, in particular with high-dimensional data. Linear methods can be seen as a linear mapping from the original feature space to a dimension reduction subspace. The aim is to transform the data so that the essential structure is more easily understood. However, highly correlated variables provide redundant information, whereas some other feature may be irrelevant, and we would like to identify and then discard both of them while pursuing dimension reduction. Here we propose a greedy search algorithm, which avoids the search over all possible subsets, for ranking subsets of variables based on their ability to explain variation in the dimension reduction variates.Dimension reduction methods, Linear mapping, Subset selection, Greedy search

    Similar works