Feature Selection Via an Upper Bound (to Any Degree of Tightness) on Probability of Misclassification

Abstract

Currently, many techniques exist for feature selection which are related to the probability of misclassification, but unfortunately in an indeterminable way. In this paper a procedure is presented which yields an upper bound (to any degree of tightness) on the probability of misclassification in sample Gaussian maximum likelihood classification between each pair of categories in p-dimensional space. The technique permits features to be selected so that the optimal q (q ≤ p) features have the property that no other subset of q features yields a smaller value of the upper bound on the probability of misclassification. A computer-assessable transformation is utilized which permits a multiple integral over the misclassification region in p-dimensional space to be approximated, to any degree of accuracy, by the product of p iterated integrals, each over univariate space, and each of which may be obtained by a simple table-look-up procedure. Quite often, transformations are used without consideration of loss of information; however, the one utilized in this procedure results in no loss of information and leaves the standard likelihood ratio invariant in value.
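The paper's specific bound is not reproduced in this abstract, but the subset-selection idea it describes can be sketched with a stand-in: the well-known Bhattacharyya upper bound on the pairwise misclassification probability of two Gaussian classes. The sketch below (an illustration, not the paper's actual procedure) exhaustively evaluates every q-feature subset and keeps the one that minimizes the bound, which is the optimality property the abstract claims for its own bound.

```python
from itertools import combinations
import numpy as np

def bhattacharyya_bound(mu1, cov1, mu2, cov2, p1=0.5, p2=0.5):
    """Bhattacharyya upper bound on the probability of misclassification
    between two Gaussian classes with priors p1, p2."""
    cov = 0.5 * (cov1 + cov2)
    diff = mu2 - mu1
    # Bhattacharyya distance between the two Gaussians
    term1 = 0.125 * diff @ np.linalg.solve(cov, diff)
    term2 = 0.5 * np.log(np.linalg.det(cov) /
                         np.sqrt(np.linalg.det(cov1) * np.linalg.det(cov2)))
    return np.sqrt(p1 * p2) * np.exp(-(term1 + term2))

def best_subset(mu1, cov1, mu2, cov2, q):
    """Return the q-feature subset minimizing the upper bound, and the bound.
    Exhaustive search guarantees no other q-subset yields a smaller value."""
    p = len(mu1)
    best = None
    for idx in combinations(range(p), q):
        i = list(idx)
        b = bhattacharyya_bound(mu1[i], cov1[np.ix_(i, i)],
                                mu2[i], cov2[np.ix_(i, i)])
        if best is None or b < best[1]:
            best = (idx, b)
    return best
```

For example, with three features where only the third separates the class means, `best_subset(..., q=1)` selects that feature, since it alone drives the bound down. Exhaustive search over subsets is exponential in p; the paper's contribution is making each bound evaluation cheap via the univariate table-look-up decomposition.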
