Information-theoretic Feature Selection via Tensor Decomposition and Submodularity
Feature selection by maximizing high-order mutual information between the
selected feature vector and a target variable is the gold standard in terms of
selecting the best subset of relevant features that maximizes the performance
of prediction models. However, such an approach typically requires knowledge of
the multivariate probability distribution of all features and the target, and
involves a challenging combinatorial optimization problem. Recent work has
shown that any joint Probability Mass Function (PMF) can be represented as a
naive Bayes model, via Canonical Polyadic (tensor rank) Decomposition. In this
paper, we introduce a low-rank tensor model of the joint PMF of all variables
and indirect targeting as a way of mitigating complexity and maximizing the
classification performance for a given number of features. Through low-rank
modeling of the joint PMF, it is possible to circumvent the curse of
dimensionality by learning principal components of the joint distribution. By
indirectly aiming to predict the latent variable of the naive Bayes model
instead of the original target variable, it is possible to formulate the
feature selection problem as maximization of a monotone submodular function
subject to a cardinality constraint, which can be tackled using a greedy
algorithm that comes with performance guarantees. Numerical experiments with
several standard datasets suggest that the proposed approach compares favorably
to the state of the art for this important problem.
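
The greedy step mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the objective below is a toy set-coverage function, used only as a stand-in because it is a well-known monotone submodular function, and the names `greedy_max`, `coverage`, and `COVERS` are hypothetical. In the paper's setting the objective would instead be the mutual information between the selected features and the latent variable of the naive Bayes model.

```python
from typing import Callable, FrozenSet, Hashable, Iterable, List


def greedy_max(
    ground_set: Iterable[Hashable],
    objective: Callable[[FrozenSet], float],
    k: int,
) -> List[Hashable]:
    """Pick up to k items, each time adding the one with the largest marginal gain."""
    selected: List[Hashable] = []
    remaining = list(ground_set)
    current = objective(frozenset())
    for _ in range(k):
        best_item, best_gain = None, 0.0
        for item in remaining:
            gain = objective(frozenset(selected + [item])) - current
            if gain > best_gain:
                best_item, best_gain = item, gain
        if best_item is None:  # no remaining item adds value; stop early
            break
        selected.append(best_item)
        remaining.remove(best_item)
        current += best_gain
    return selected


# Toy stand-in objective (an assumption, NOT the paper's objective):
# each "feature" covers a set of elements; the value of a subset is the
# number of distinct elements covered, which is monotone and submodular.
COVERS = {
    "f1": {1, 2, 3},
    "f2": {3, 4},
    "f3": {4, 5, 6, 7},
    "f4": {1, 7},
}


def coverage(subset: FrozenSet) -> float:
    covered = set()
    for name in subset:
        covered |= COVERS[name]
    return float(len(covered))


chosen = greedy_max(COVERS.keys(), coverage, k=2)
print(chosen, coverage(frozenset(chosen)))  # ['f3', 'f1'] 7.0
```

For a monotone submodular objective under a cardinality constraint, this kind of greedy selection is known to reach at least a (1 - 1/e) fraction of the optimal value, which is the sort of performance guarantee the abstract refers to.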