Experiments on Incomplete Data Sets Using Modifications to Characteristic Relation

Abstract

Rough set theory is a useful approach to decision rule induction and is applied to large real-life data sets. Lower and upper approximations of concepts are used to induce rules from incomplete data sets. In this research we study the validity of modifications suggested to the characteristic relation. We discuss the implementation of these modifications and the local definability of each modified set. We show that none of the suggested modification sets is locally definable, except for maximal consistent blocks, which are restricted to data sets with "do not care" conditions. A comparative analysis was conducted for characteristic sets and their modifications in terms of the cardinality of the lower and upper approximations of each concept and the decision rules induced by each modification. Experiments were conducted on four incomplete data sets with lost and "do not care" conditions. The LEM2 algorithm was implemented to induce certain and possible rules from the incomplete data sets. To measure the average classification error rate of the induced rules, ten-fold cross-validation was used. Our results show that there is no significant difference in the quality of the rules induced from each modification.
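As an illustration of the baseline that the modifications start from, the following is a minimal sketch of characteristic sets and concept lower and upper approximations for a table with lost and "do not care" values. It assumes the usual conventions from the rough set literature: "?" marks a lost value, "*" marks a "do not care" condition, a case with "*" for an attribute belongs to every block of that attribute, and a case with "?" belongs to none. The toy table, the attribute names, and all function names are hypothetical and are not taken from the experiments described here.

```python
LOST, DONT_CARE = "?", "*"

# Hypothetical toy decision table: case id -> attribute values.
TABLE = {
    1: {"temp": "high", "headache": "yes"},
    2: {"temp": LOST, "headache": "yes"},
    3: {"temp": DONT_CARE, "headache": "yes"},
    4: {"temp": "normal", "headache": "no"},
}

def attribute_value_blocks(table, attribute):
    """Block of (attribute, v): cases with value v or with a "do not care" value."""
    values = {row[attribute] for row in table.values()} - {LOST, DONT_CARE}
    return {
        v: {case for case, row in table.items()
            if row[attribute] in (v, DONT_CARE)}
        for v in values
    }

def characteristic_sets(table, attributes):
    """Intersect, for each case, the blocks of its specified attribute values."""
    universe = set(table)
    blocks = {a: attribute_value_blocks(table, a) for a in attributes}
    result = {}
    for case, row in table.items():
        k = set(universe)
        for a in attributes:
            # Lost and "do not care" values place no restriction on the set.
            if row[a] not in (LOST, DONT_CARE):
                k &= blocks[a][row[a]]
        result[case] = k
    return result

def concept_approximations(char_sets, concept):
    """Concept lower and upper approximations built from characteristic sets."""
    lower, upper = set(), set()
    for x in concept:
        upper |= char_sets[x]
        if char_sets[x] <= concept:
            lower |= char_sets[x]
    return lower, upper

K = characteristic_sets(TABLE, ["temp", "headache"])
lower, upper = concept_approximations(K, {3, 4})
```

Certain rules would then be induced from the lower approximation and possible rules from the upper approximation; comparing the sizes of `lower` and `upper` across alternative definitions is the kind of cardinality comparison the abstract refers to.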
