117 research outputs found
Semantics-Preserving Dimensionality Reduction: Rough and Fuzzy-Rough-Based Approaches
Abstract—Semantics-preserving dimensionality reduction refers to the problem of selecting those input features that are most predictive of a given outcome; a problem encountered in many areas such as machine learning, pattern recognition, and signal processing. This has found successful application in tasks that involve data sets containing huge numbers of features (in the order of tens of thousands), which would be impossible to process further. Recent examples include text processing and Web content classification. One of the many successful applications of rough set theory has been to this feature selection area. This paper reviews those techniques that preserve the underlying semantics of the data, using crisp and fuzzy rough set-based methodologies. Several approaches to feature selection based on rough set theory are experimentally compared. Additionally, a new area in feature selection, feature grouping, is highlighted and a rough set-based feature grouping technique is detailed. Index Terms—Dimensionality reduction, feature selection, feature transformation, rough selection, fuzzy-rough selection.
Performing Feature Selection with ACO
Summary. The main aim of feature selection is to determine a minimal feature subset from a problem domain while retaining a suitably high accuracy in representing the original features. In real world problems FS is a must due to the abundance of noisy, irrelevant or misleading features. However, current methods are inadequate at finding optimal reductions. This chapter presents a feature selection mechanism based on Ant Colony Optimization in an attempt to combat this. The method is then applied to the problem of finding optimal feature subsets in the fuzzy-rough data reduction process. The present work is applied to two very different challenging tasks, namely web classification and complex systems monitoring.
- …