Introducing Partial Matching Approach in Association Rules for Better Treatment of Missing Values
Handling missing values in training datasets for constructing learning models
or extracting useful information is considered to be an important research task
in data mining and knowledge discovery in databases. In recent years, many
techniques have been proposed for imputing missing values by considering
attribute relationships between the observation with the missing value and the
other observations of the training dataset. The main deficiency of such
techniques is that they depend on a single approach and do not combine
multiple approaches, which is why they are less accurate. To improve the
accuracy of missing value imputation, in this paper we introduce a novel
partial matching concept in association rule mining, which shows better
results than the full matching concept that we described in our previous work.
Our imputation technique combines the partial matching concept in association
rules with the k-nearest neighbor approach. Because this is a hybrid
technique, its accuracy is much better than that of techniques that depend on
a single approach. To check the efficiency of our technique, we also provide
detailed experimental results on a number of benchmark datasets, which show
better results than previous approaches.
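The abstract does not give algorithmic details, so the following is only a minimal Python sketch of how a hybrid of partially matched association rules and k-nearest neighbors could work for categorical data. The scoring (match fraction times rule confidence), the thresholds, the Hamming distance, the function names (mine_rules, rule_vote, knn_vote, impute), and the toy rows are all assumptions made for illustration and are not the authors' method.

```python
from collections import Counter, defaultdict
from itertools import combinations


def mine_rules(rows, target, max_len=2, min_support=2, min_conf=0.6):
    """Mine rules ((attr, value), ...) -> target value from complete rows."""
    complete = [r for r in rows if all(v is not None for v in r.values())]
    counts = defaultdict(Counter)  # antecedent -> tally of target values
    for r in complete:
        items = sorted((a, v) for a, v in r.items() if a != target)
        for n in range(1, max_len + 1):
            for ante in combinations(items, n):
                counts[ante][r[target]] += 1
    rules = []
    for ante, tally in counts.items():
        value, hits = tally.most_common(1)[0]
        conf = hits / sum(tally.values())
        if hits >= min_support and conf >= min_conf:
            rules.append((ante, value, conf))
    return rules


def rule_vote(row, rules, min_match=0.5):
    """Vote with rules whose antecedent matches at least min_match of its items.

    min_match=1.0 corresponds to full matching; lower values allow
    partial matching (assumed scoring: match fraction * confidence).
    """
    scores = Counter()
    for ante, value, conf in rules:
        matched = sum(1 for a, v in ante if row.get(a) == v)
        frac = matched / len(ante)
        if frac >= min_match:
            scores[value] += frac * conf
    return scores.most_common(1)[0][0] if scores else None


def knn_vote(row, rows, target, k=3):
    """Majority target value among the k nearest donors (Hamming distance)."""
    known = [a for a, v in row.items() if a != target and v is not None]
    donors = [r for r in rows if r[target] is not None]
    donors.sort(key=lambda r: sum(r[a] != row[a] for a in known))
    return Counter(r[target] for r in donors[:k]).most_common(1)[0][0]


def impute(row, rows, target, rules):
    """Try the rule-based vote first; fall back to k-NN if no rule fires."""
    guess = rule_vote(row, rules)
    return guess if guess is not None else knn_vote(row, rows, target)


# Toy, made-up categorical data (hypothetical attribute names).
rows = [
    {"fever": "yes", "cough": "yes", "flu": "yes"},
    {"fever": "yes", "cough": "yes", "flu": "yes"},
    {"fever": "yes", "cough": "no", "flu": "yes"},
    {"fever": "no", "cough": "no", "flu": "no"},
    {"fever": "no", "cough": "no", "flu": "no"},
    {"fever": "no", "cough": "yes", "flu": "no"},
]
rules = mine_rules(rows, target="flu")
query = {"fever": "yes", "cough": None, "flu": None}
print(impute(query, rows, "flu", rules))
```

In this sketch, a rule with a two-item antecedent can still contribute when only one of its items matches the incomplete record, which is the point of partial matching; under full matching (min_match=1.0) the same rule would be ignored.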
An Approach to Find Missing Values in Medical Datasets
Mining medical datasets is a challenging problem for data mining
researchers, as these datasets pose several hidden challenges compared to
conventional datasets. Starting from the collection of samples through field
experiments and clinical trials to performing classification, there are numerous
challenges at every stage of the mining process. The preprocessing phase of the
mining process is itself a challenge when we work on medical datasets.
One of the prime challenges in mining medical datasets is handling missing
values, which is part of the preprocessing phase. In this paper, we address the
issue of handling missing values in medical datasets consisting of categorical
attribute values. The main contribution of this research is the use of the
proposed imputation measure to estimate and fix the missing values. We discuss
a case study to demonstrate the working of the proposed measure.
Comment: 7 pages, ACM Digital Library, ICEMIS September 201