Rough set approaches to rule induction from incomplete data

By Jerzy W. Grzymala-Busse and Sachin Siddhaye


In this paper we assume that data are presented in the form of decision tables, which are incomplete when some attribute values are missing. Two main cases of missing attribute values are considered: lost (the original value was erased) and "do not care" conditions (the original value was irrelevant). The main tool used in this paper is attribute-value pair blocks. These blocks are used to construct characteristic sets, characteristic relations, and lower and upper approximations for decision tables with missing attribute values. For such tables three different definitions of lower and upper approximations may be applied: singleton, subset, and concept. A modified version of the LEM2 rule induction algorithm, accepting input data with both lost values and "do not care" conditions, is described. Results of experiments on some real-life incomplete data, in which all missing attribute values were considered to be either lost or "do not care" conditions, are presented as well. The conclusion is that the classification error rate is smaller when missing attribute values are considered to be lost.
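The abstract names attribute-value pair blocks, characteristic sets, and three kinds of approximations without defining them. The Python sketch below illustrates one common formulation from the rough-set literature; it is not taken from this record and should be checked against the paper itself. Its assumptions: lost values are written "?" and "do not care" conditions "*"; a case with a lost value is left out of every block of that attribute, while a case with a "do not care" condition is placed in every block of that attribute; the characteristic set of a case is the intersection of the blocks for its known values. The toy table and all function names are hypothetical.

```python
LOST, DONT_CARE = "?", "*"   # assumed symbols for the two kinds of missing value

# Hypothetical toy decision table: case id -> (attribute values, decision)
TABLE = {
    1: ({"Temp": "high", "Headache": "?"}, "yes"),
    2: ({"Temp": "high", "Headache": "yes"}, "yes"),
    3: ({"Temp": "*", "Headache": "no"}, "no"),
    4: ({"Temp": "normal", "Headache": "no"}, "no"),
}

def blocks(table):
    """Map each (attribute, known value) pair to the set of matching cases."""
    known = {}
    for attrs, _ in table.values():
        for a, v in attrs.items():
            if v not in (LOST, DONT_CARE):
                known.setdefault(a, set()).add(v)
    result = {}
    for a, values in known.items():
        for v in values:
            # "*" joins every block of the attribute; "?" joins none.
            result[(a, v)] = {x for x, (attrs, _) in table.items()
                              if attrs[a] in (v, DONT_CARE)}
    return result

def characteristic_sets(table):
    """Intersect the blocks of each case's known values; missing values do not constrain."""
    b = blocks(table)
    k = {}
    for x, (attrs, _) in table.items():
        s = set(table)  # start from the whole universe of cases
        for a, v in attrs.items():
            if v not in (LOST, DONT_CARE):
                s &= b[(a, v)]
        k[x] = s
    return k

def approximations(table, concept):
    """Return (lower, upper) pairs for the singleton, subset, and concept definitions."""
    k = characteristic_sets(table)
    universe = set(table)

    def union(sets):
        return set().union(*sets)

    return {
        "singleton": ({x for x in universe if k[x] <= concept},
                      {x for x in universe if k[x] & concept}),
        "subset": (union(k[x] for x in universe if k[x] <= concept),
                   union(k[x] for x in universe if k[x] & concept)),
        "concept": (union(k[x] for x in concept if k[x] <= concept),
                    union(k[x] for x in concept if k[x] & concept)),
    }

result = approximations(TABLE, {1, 2})   # concept: cases with decision "yes"
```

On this toy table all three lower approximations are {2}, while the singleton upper approximation is {1, 2} and the subset and concept upper approximations are both {1, 2, 3}. The singleton version collects cases, whereas the other two collect whole characteristic sets, which is why case 3 enters their upper approximations.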

Year: 2004
OAI identifier: oai:CiteSeerX.psu:
Provided by: CiteSeerX