Filtering Large Propositional Rule Sets While Retaining Classifier Performance

By Thomas Ågotnes

Abstract

Data mining is the problem of inducing models from data. Models have both a descriptive and a predictive aspect. Descriptive models can be inspected and used for knowledge discovery. Models consisting of decision rules, such as those produced by methods from Pawlak's rough set theory, are in principle descriptive, but in practice the induced models are too large to be inspected. This thesis considers extracting descriptive models from complex models that have already been induced. According to the principle of Occam's razor, the simpler of two models that are both consistent with the observed data should be chosen. A descriptive model can be found by simplifying a complex model while retaining predictive performance. The approach taken in this thesis is rule filtering: post-pruning of complete rules from a model. Two methods for finding high-performance subsets of a set of rules are investigated. The first is to use a genetic algorithm to search the space of subsets. The second method is to creat..
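
The first method named in the abstract, a genetic algorithm searching the space of rule subsets, could be sketched roughly as follows. This is a minimal illustration and not the thesis's implementation: the toy data, the rule encoding, the majority-vote classifier, the fitness function (accuracy minus a small size penalty), and all parameter values are assumptions made for the example. For brevity it scores subsets on the same data the rules describe; a realistic setup would presumably use held-out data.

```python
import random

# Hypothetical toy data: (attribute values, class label).
DATA = [
    ({"a": 1, "b": 0}, "yes"),
    ({"a": 1, "b": 1}, "yes"),
    ({"a": 0, "b": 1}, "no"),
    ({"a": 0, "b": 0}, "no"),
]

# Hypothetical propositional rules: (conditions, predicted class).
RULES = [
    ({"a": 1}, "yes"),
    ({"a": 0}, "no"),
    ({"b": 1}, "yes"),          # only partly correct on DATA
    ({"a": 1, "b": 0}, "yes"),  # redundant given the first rule
    ({"b": 0}, "no"),           # only partly correct on DATA
]


def classify(obj, rules, default="no"):
    """Majority vote among the rules whose conditions all match obj."""
    votes = {}
    for conditions, label in rules:
        if all(obj.get(attr) == value for attr, value in conditions.items()):
            votes[label] = votes.get(label, 0) + 1
    if not votes:
        return default  # assumed fallback when no rule fires
    # Ties are broken alphabetically so the result is deterministic.
    return max(sorted(votes), key=votes.get)


def fitness(mask, rules=RULES, data=DATA, size_weight=0.01):
    """Accuracy of the selected subset minus an assumed per-rule penalty."""
    subset = [rule for rule, keep in zip(rules, mask) if keep]
    correct = sum(classify(obj, subset) == label for obj, label in data)
    return correct / len(data) - size_weight * len(subset)


def filter_rules(rules=RULES, data=DATA, pop_size=20, generations=30,
                 mutation_rate=0.1, seed=0):
    """Search bitmasks over the rule set; 1 keeps a rule, 0 drops it."""
    rng = random.Random(seed)
    n = len(rules)

    def score(mask):
        return fitness(mask, rules, data)

    # Start from the unfiltered model plus random subsets.
    population = [(1,) * n] + [
        tuple(rng.randint(0, 1) for _ in range(n)) for _ in range(pop_size - 1)
    ]
    for _ in range(generations):
        children = [max(population, key=score)]  # elitism: keep the best mask
        while len(children) < pop_size:
            # Binary tournament selection for each parent.
            p1 = max(rng.sample(population, 2), key=score)
            p2 = max(rng.sample(population, 2), key=score)
            # Uniform crossover followed by bit-flip mutation.
            child = tuple(x if rng.random() < 0.5 else y for x, y in zip(p1, p2))
            child = tuple(1 - g if rng.random() < mutation_rate else g for g in child)
            children.append(child)
        population = children
    return max(population, key=score)


if __name__ == "__main__":
    best = filter_rules()
    kept = [rule for rule, keep in zip(RULES, best) if keep]
    print(best, kept, fitness(best))
```

Because the initial population contains the full rule set and the best mask is always carried over, the returned subset never scores worse under this fitness function than the unfiltered model. The size penalty is what pushes the search toward smaller, more inspectable subsets.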

Year: 1999
OAI identifier: oai:CiteSeerX.psu:10.1.1.49.3657
Provided by: CiteSeerX