4 research outputs found

    Data Mining Feature Subset Weighting and Selection Using Genetic Algorithms

    We present a simple genetic algorithm (sGA), developed within the Genetic Rule and Classifier Construction Environment (GRaCCE), that solves the feature subset selection and weighting problem to improve the classification accuracy of the k-nearest neighbor (KNN) algorithm. Our hypotheses are that weighting the features affects the performance of the KNN algorithm and that it yields a better classification accuracy rate than binary (unweighted) feature selection. The weighted-sGA uses real-valued chromosomes to find the feature weights, while the binary-sGA uses integer-valued chromosomes to select a subset of features from the original feature set. A repair algorithm is developed for the weighted-sGA to guarantee the feasibility of chromosomes; by feasibility we mean that the gene values of a chromosome must sum to 1. To calculate the fitness of each chromosome in the population, we use the KNN algorithm as the fitness function: the Euclidean distance from one individual to the others is computed in the d-dimensional feature space to classify an unknown instance. GRaCCE searches for good feature subsets and their associated weights. These feature weights are then multiplied with the normalized feature values, and the resulting values are used to compute the distances between instances.
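
    The weighting pipeline described above (a repair step that rescales each real-valued chromosome so its genes sum to 1, followed by a KNN fitness function computed on weight-scaled, normalized features) can be sketched roughly as follows. This is a minimal illustration, not GRaCCE's actual implementation; the function names, the clipping step, and the leave-one-out evaluation are assumptions made for the example.

```python
import numpy as np

def repair(weights):
    """Repair step (assumed form): clip negative genes and rescale the
    real-valued chromosome so its genes sum to 1, the feasibility
    condition stated in the abstract."""
    w = np.clip(np.asarray(weights, dtype=float), 0.0, None)
    total = w.sum()
    # Fall back to uniform weights if the chromosome degenerates to all zeros.
    return w / total if total > 0 else np.full(len(w), 1.0 / len(w))

def weighted_knn_accuracy(weights, X, y, k=3):
    """Illustrative fitness function: leave-one-out accuracy of a KNN
    classifier whose Euclidean distances are computed on features that
    have been multiplied by the (repaired) weights."""
    w = repair(weights)
    Xw = X * w                        # scale each normalized feature by its weight
    correct = 0
    for i in range(len(Xw)):
        d = np.linalg.norm(Xw - Xw[i], axis=1)
        d[i] = np.inf                 # exclude the query point itself
        neighbors = np.argsort(d)[:k]
        votes = np.bincount(y[neighbors])
        correct += int(votes.argmax() == y[i])
    return correct / len(Xw)

# Example: uniform weights on random, already-normalized data.
rng = np.random.default_rng(1)
X = rng.random((60, 5))
y = np.repeat(np.arange(3), 20)
print(weighted_knn_accuracy(np.ones(5) / 5, X, y, k=3))
```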

    Flexibility and accuracy enhancement techniques for neural networks

    Master's thesis (Master of Engineering)

    Performing Effective Feature Selection by Investigating the Deep Structure of the Data

    This paper introduces ADHOC (Automatic Discoverer of Higher-Order Correlation), an algorithm that combines the advantages of both filter and feedback models to enhance the understanding of the given data and to increase the efficiency of the feature selection process. ADHOC partitions the observed features into a number of groups, called factors, that reflect the major dimensions of the phenomenon under consideration. The set of learned factors defines the starting point of the search for the best performing feature subset. A genetic algorithm is used to explore the feature space originated by the factors and to determine the set of most informative feature configurations. The feature subset evaluation function is the performance of the induction algorithm. This approach offers three main advantages: (i) the likelihood of selecting good performing features grows; (ii) the complexity of the search diminishes consistently; (iii) the possibility of selecting a bad feature subset due to over-fitting problems decreases. Extensive experiments on real-world data have been conducted to demonstrate the effectiveness of ADHOC as a data reduction technique as well as a feature selection method.
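
    ADHOC's wrapper stage (a genetic algorithm searching over feature subsets, scored by the performance of the induction algorithm) could look roughly like the sketch below. The factor-based partitioning that seeds the search is omitted here, and the 1-NN induction algorithm, tournament selection, and all parameter values are assumptions made for illustration, not the paper's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

def induction_accuracy(mask, X, y):
    """Wrapper evaluation: leave-one-out 1-NN accuracy on the features
    selected by the binary mask (1-NN stands in for the paper's
    induction algorithm)."""
    if mask.sum() == 0:
        return 0.0
    Xs = X[:, mask.astype(bool)]
    correct = 0
    for i in range(len(Xs)):
        d = np.linalg.norm(Xs - Xs[i], axis=1)
        d[i] = np.inf                          # leave the query point out
        correct += int(y[np.argmin(d)] == y[i])
    return correct / len(Xs)

def ga_feature_search(X, y, pop_size=20, generations=30, p_mut=0.05):
    """Plain GA over binary feature masks: tournament selection,
    uniform crossover, bit-flip mutation, with elitism."""
    n_features = X.shape[1]
    pop = rng.integers(0, 2, size=(pop_size, n_features))
    for _ in range(generations):
        fit = np.array([induction_accuracy(ind, X, y) for ind in pop])
        children = [pop[fit.argmax()].copy()]          # keep the best mask
        while len(children) < pop_size:
            a, b = rng.integers(0, pop_size, size=2)
            p1 = pop[a] if fit[a] >= fit[b] else pop[b]
            a, b = rng.integers(0, pop_size, size=2)
            p2 = pop[a] if fit[a] >= fit[b] else pop[b]
            cross = rng.random(n_features) < 0.5       # uniform crossover
            child = np.where(cross, p1, p2)
            flip = rng.random(n_features) < p_mut      # bit-flip mutation
            children.append(np.where(flip, 1 - child, child))
        pop = np.array(children)
    fit = np.array([induction_accuracy(ind, X, y) for ind in pop])
    return pop[fit.argmax()], fit.max()

# Example: two informative features among eight noisy ones.
X = rng.random((80, 8))
y = (X[:, 0] + X[:, 2] > 1.0).astype(int)
best_mask, best_acc = ga_feature_search(X, y)
print(best_mask, round(best_acc, 3))
```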