Feature subset selection and ranking for data dimensionality reduction
A new unsupervised forward orthogonal search (FOS) algorithm is introduced for feature selection and ranking. In the new algorithm, features are selected in a stepwise way, one at a time, by estimating the capability of each specified candidate feature subset to represent the overall features in the measurement space. A squared correlation function is employed as the criterion to measure the dependency between features, and this makes the new algorithm easy to implement. The forward orthogonalization strategy, which combines good effectiveness with high efficiency, enables the new algorithm to produce efficient feature subsets with a clear physical interpretation.
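The abstract gives enough to reconstruct the core loop: pick features one at a time, scoring each remaining candidate by how well its component orthogonal to the already-selected features correlates with the full measurement space. Below is a minimal NumPy sketch, assuming the summed squared correlation as the selection score; the paper's exact normalization may differ.

```python
import numpy as np

def fos_rank(X, n_select):
    """Unsupervised forward orthogonal search sketch.
    X: (n_samples, n_features) data matrix; returns selected indices in rank order."""
    Xc = X - X.mean(axis=0)                  # centre every feature column
    R = Xc.copy()                            # residual columns, orthogonalised as we go
    col_ss = np.einsum('ij,ij->j', Xc, Xc)   # per-feature sum of squares
    selected = []
    for _ in range(n_select):
        best_j, best_score = -1, -np.inf
        for j in range(X.shape[1]):
            if j in selected:
                continue
            q = R[:, j]
            qq = q @ q
            if qq < 1e-12:                   # candidate already fully explained
                continue
            # Summed squared correlation between the residual direction q
            # and every original (centred) feature.
            score = np.sum((q @ Xc) ** 2 / np.maximum(qq * col_ss, 1e-12))
            if score > best_score:
                best_j, best_score = j, score
        if best_j < 0:
            break                            # nothing left to explain
        selected.append(best_j)
        # Gram-Schmidt deflation: remove the chosen direction from the
        # remaining residual columns.
        u = R[:, best_j] / np.linalg.norm(R[:, best_j])
        R -= np.outer(u, u @ R)
    return selected
```

The Gram-Schmidt step is what makes the ranking "orthogonal": once a feature is selected, the variance it explains is removed from every remaining candidate before the next round of scoring.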
Practical feature subset selection for machine learning
Machine learning algorithms automatically extract knowledge from machine readable information. Unfortunately, their success is usually dependant on the quality of the data that they operate on. If the data is inadequate, or contains extraneous and irrelevant information, machine learning algorithms may produce less accurate and less understandable results, or may fail to discover anything of use at all. Feature subset selection can result in enhanced performance, a reduced hypothesis search space, and, in some cases, reduced storage requirement. This paper describes a new feature selection algorithm that uses a correlation based heuristic to determine the “goodness” of feature subsets, and evaluates its effectiveness with three common machine learning algorithms. Experiments using a number of standard machine learning data sets are presented. Feature subset selection gave significant improvement for all three algorithm
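The "goodness" heuristic rewards subsets whose features correlate strongly with the class but weakly with each other. A sketch of the standard correlation-based merit in this family follows; the paper's exact formulation may weight terms differently.

```python
import numpy as np

def cfs_merit(corr_fc, corr_ff, subset):
    """Correlation-based subset merit (standard CFS-style form).
    corr_fc: vector of |feature-class| correlations.
    corr_ff: matrix of |feature-feature| correlations with unit diagonal.
    subset:  list of feature indices."""
    k = len(subset)
    r_cf = corr_fc[subset].mean()                    # avg feature-class correlation
    if k == 1:
        return r_cf
    sub = np.ix_(subset, subset)
    r_ff = (corr_ff[sub].sum() - k) / (k * (k - 1))  # avg off-diagonal correlation
    return k * r_cf / np.sqrt(k + k * (k - 1) * r_ff)
```

Intuitively, the numerator grows with relevance to the class while the denominator penalizes redundancy among the chosen features.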
A new genetic algorithm for multi-label correlation-based feature selection.
This paper proposes a new Genetic Algorithm for Multi-Label Correlation-Based Feature Selection (GA-ML-CFS). This GA performs a global search in the space of candidate feature subsets in order to select a high-quality feature subset to be used by a multi-label classification algorithm: in this work, the Multi-Label k-NN algorithm. We compare the results of GA-ML-CFS with those of the previously proposed Hill-Climbing for Multi-Label Correlation-Based Feature Selection (HC-ML-CFS) across 10 multi-label datasets.
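As a rough illustration of the global search, a GA over boolean feature masks might look like the sketch below. The operators shown (binary tournament selection, uniform crossover, bit-flip mutation) are illustrative assumptions, not necessarily the paper's exact design, and `fitness` stands in for the multi-label CFS merit the paper plugs in.

```python
import numpy as np

rng = np.random.default_rng(0)

def ga_feature_search(n_features, fitness, pop=50, gens=100, p_cx=0.9, p_mut=0.01):
    # Population of boolean masks; each mask encodes one candidate subset.
    P = rng.random((pop, n_features)) < 0.5
    fit = np.array([fitness(ind) for ind in P])
    for _ in range(gens):
        children = []
        while len(children) < pop:
            # Binary tournament selection of two parents.
            i = rng.integers(pop, size=2)
            j = rng.integers(pop, size=2)
            a = P[i[np.argmax(fit[i])]].copy()
            b = P[j[np.argmax(fit[j])]].copy()
            if rng.random() < p_cx:
                # Uniform crossover: swap genes at randomly chosen positions.
                m = rng.random(n_features) < 0.5
                a[m], b[m] = b[m], a[m]
            for c in (a, b):
                # Bit-flip mutation toggles each gene with probability p_mut.
                flip = rng.random(n_features) < p_mut
                c[flip] = ~c[flip]
                children.append(c)
        P = np.array(children[:pop])
        fit = np.array([fitness(ind) for ind in P])
    return P[np.argmax(fit)]
```

The multi-label adaptation of the merit function is the paper's contribution and is not reproduced here; any callable from mask to score will drive the search.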
Differential Evolution based feature subset selection
In this paper, a novel feature selection algorithm based on the Differential Evolution (DE) optimization technique is presented. The new algorithm, called DEFS, modifies DE, which is a real-valued optimizer, to suit the problem of feature selection. The proposed DEFS substantially reduces computational cost while delivering strong performance. The DEFS technique is applied to a brain-computer interface (BCI) application and compared with other dimensionality reduction techniques. The practical results indicate the significance of the proposed algorithm in terms of solution optimality, memory requirements, and computational cost.
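DE mutates and recombines real-valued vectors, so using it for subset selection requires mapping real genes to a discrete mask. The sketch below uses the simplest such mapping, thresholding each gene at 0.5, within a standard DE/rand/1/bin loop; the DEFS paper's own encoding is more elaborate and is precisely what cuts its computational cost.

```python
import numpy as np

rng = np.random.default_rng(1)

def de_feature_select(n_features, score, pop=30, gens=200, F=0.5, CR=0.9):
    P = rng.random((pop, n_features))          # real-valued genomes in [0, 1)
    fit = np.array([score(ind > 0.5) for ind in P])
    for _ in range(gens):
        for i in range(pop):
            # DE/rand/1: perturb one random member by a scaled difference
            # of two others.
            a, b, c = rng.choice(pop, size=3, replace=False)
            mutant = P[a] + F * (P[b] - P[c])
            # Binomial crossover; force at least one gene from the mutant.
            cross = rng.random(n_features) < CR
            cross[rng.integers(n_features)] = True
            trial = np.where(cross, mutant, P[i])
            f = score(trial > 0.5)             # threshold genes to get a mask
            if f >= fit[i]:                    # greedy one-to-one replacement
                P[i], fit[i] = trial, f
    return P[np.argmax(fit)] > 0.5
```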
Feature subset selection: a correlation based filter approach
Recent work has shown that feature subset selection can have a positive effect on the performance of machine learning algorithms. Some algorithms can be slowed, or their performance adversely affected, by too much data, some of which may be irrelevant or redundant to the learning task. Feature subset selection, then, is a method of enhancing the performance of learning algorithms, reducing the hypothesis search space, and, in some cases, reducing the storage requirement. This paper describes a feature subset selector that uses a correlation-based heuristic to determine the goodness of feature subsets, and evaluates its effectiveness with three common ML algorithms: a decision tree inducer (C4.5), a naive Bayes classifier, and an instance-based learner (IB1). Experiments using a number of standard data sets drawn from real and artificial domains are presented. Feature subset selection gave significant improvement for all three algorithms, and C4.5 generated smaller decision trees.
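The heuristic only scores subsets; a search strategy must propose them. A plain greedy forward search, shown below as a simple stand-in for the best-first search usually paired with this kind of filter, works with any merit function such as `cfs_merit` above.

```python
def forward_select(n_features, merit, max_size=None):
    """Greedy forward search: repeatedly add the feature that most
    improves the subset merit, stopping when no addition helps."""
    selected, best = [], -float('inf')
    remaining = set(range(n_features))
    while remaining and (max_size is None or len(selected) < max_size):
        cand, cand_merit = None, best
        for f in sorted(remaining):
            m = merit(selected + [f])
            if m > cand_merit:
                cand, cand_merit = f, m
        if cand is None:                 # no feature improves the merit
            break
        selected.append(cand)
        remaining.remove(cand)
        best = cand_merit
    return selected
```

In practice `merit` would close over precomputed correlation matrices, e.g. `lambda s: cfs_merit(corr_fc, corr_ff, s)`.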
Feature Selection Library (MATLAB Toolbox)
Feature Selection Library (FSLib) is a widely applicable MATLAB library for Feature Selection (FS). FS is an essential component of machine learning and data mining that has been studied for many years under many different conditions and in diverse scenarios. FS algorithms aim to rank and select a subset of relevant features according to their degrees of relevance, preference, or importance, as defined in a specific application. Because feature selection reduces the number of features used to train classification models, it alleviates the curse of dimensionality, speeds up the learning process, improves model performance, and enhances data understanding. This short report provides an overview of the feature selection algorithms included in the FSLib MATLAB toolbox, spanning filter, embedded, and wrapper methods.
Comment: Feature Selection Library (FSLib) 201
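As a flavor of what the filter methods in such a toolbox compute (a generic illustration, not an FSLib API call): score each feature independently against the labels and sort by relevance.

```python
import numpy as np

def filter_rank(X, y):
    """Rank features by |Pearson correlation| with the labels,
    most relevant first. X: (n_samples, n_features); y: numeric labels."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc)
    scores = np.abs(yc @ Xc) / np.maximum(denom, 1e-12)
    return np.argsort(scores)[::-1]
```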
Exploring Language-Independent Emotional Acoustic Features via Feature Selection
We propose a novel feature selection strategy to discover language-independent acoustic features that tend to be responsible for emotions regardless of language, linguistic content, and other factors. Experimental results suggest that the discovered language-independent feature subset yields performance comparable to the full feature set on various emotional speech corpora.
Comment: 15 pages, 2 figures, 6 tables
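The abstract does not spell out the selection strategy, but one natural baseline for "language independence" is to keep only the features that rank highly in every corpus. A hypothetical sketch of that idea, with per-corpus rankings produced by any filter method:

```python
def stable_features(rankings, top_k=100):
    """Keep features ranked in the top_k of every corpus.
    rankings: list of arrays, each giving feature indices ranked per corpus."""
    common = set(rankings[0][:top_k])
    for r in rankings[1:]:
        common &= set(r[:top_k])
    return sorted(common)
```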
