Fuzzy-Rough Attribute Reduction with Application to Web Categorization

Jensen, Richard; Shen, Qiang

Authors: Richard Jensen, Qiang Shen
Publication date: 1 January 2004

Abstract

Due to the explosive growth of electronically stored information, automatic methods must be developed to aid users in maintaining and using this abundance of information effectively. In particular, the sheer volume of redundancy present must be dealt with, leaving only the information-rich data to be processed. This paper presents a novel approach, based on an integrated use of fuzzy and rough set theories, to greatly reduce this data redundancy. Formal concepts of fuzzy-rough attribute reduction are introduced and illustrated with a simple example. The work is applied to the problem of web categorization, considerably reducing dimensionality with minimal loss of information. Experimental results show that fuzzy-rough reduction is more powerful than the conventional rough set-based approach. Classifiers that use a lower-dimensional set of attributes retained by fuzzy-rough reduction outperform those that employ more attributes returned by the existing crisp rough reduction method.