2 research outputs found

Introducing Partial Matching Approach in Association Rules for Better Treatment of Missing Values

Author: Baig Abdul Rauf
Bashir Shariq
Maqbool Umer
Razzaq Saad
Tahir Sonya
Publication date: 21/04/2009

Handling missing values in training datasets when constructing learning models or extracting useful information is considered an important research task in data mining and knowledge discovery in databases. In recent years, many techniques have been proposed for imputing missing values by considering the relationships between the attributes of the observation with missing values and the other observations in the training dataset. The main deficiency of such techniques is that they depend on a single approach and do not combine multiple approaches, which is why they are less accurate. To improve the accuracy of missing value imputation, in this paper we introduce a novel partial matching concept in association rule mining, which shows better results than the full matching concept that we described in our previous work. Our imputation technique combines the partial matching concept in association rules with the k-nearest neighbor approach. Because it is a hybrid technique, its accuracy is much better than that of techniques which depend on a single approach. To check the efficiency of our technique, we also provide detailed experimental results on a number of benchmark datasets, which show better results than previous approaches.
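The abstract above does not spell out how partial matching is scored or how it is combined with k-nearest neighbor, so the following is only a minimal sketch under stated assumptions, not the authors' method. It assumes categorical records stored as dictionaries with `None` for missing values, rules given as hypothetical `(antecedent, (attribute, value), confidence)` tuples, a rule that "partially matches" when at least `min_match` of its antecedent items agree with the record, and a fallback to a Hamming-distance k-nearest neighbor vote when no rule qualifies. All function names and parameters are illustrative.

```python
from collections import Counter


def partial_match_score(record, antecedent):
    """Fraction of antecedent items that the record satisfies (assumed scoring)."""
    hits = sum(1 for attr, val in antecedent.items() if record.get(attr) == val)
    return hits / len(antecedent)


def impute_with_rules(record, target, rules, min_match=0.5):
    """Return the best rule-based value for `target`, or None if no rule qualifies.

    rules: list of (antecedent dict, (attribute, value), confidence) tuples.
    A rule is scored by match fraction * confidence (an assumption).
    """
    best_value, best_score = None, 0.0
    for antecedent, (attr, value), confidence in rules:
        if attr != target:
            continue
        match = partial_match_score(record, antecedent)
        score = match * confidence
        if match >= min_match and score > best_score:
            best_value, best_score = value, score
    return best_value


def impute_with_knn(record, target, dataset, k=3):
    """Majority vote over the k rows closest to `record` by Hamming distance."""

    def distance(other):
        # Compare only attributes known in both rows, excluding the target.
        shared = [
            a for a in record
            if a != target and record[a] is not None and other.get(a) is not None
        ]
        if not shared:
            return 1.0
        return sum(record[a] != other[a] for a in shared) / len(shared)

    candidates = [row for row in dataset if row.get(target) is not None]
    if not candidates:
        return None
    nearest = sorted(candidates, key=distance)[:k]
    return Counter(row[target] for row in nearest).most_common(1)[0][0]


def impute(record, target, rules, dataset, min_match=0.5, k=3):
    """Hybrid imputation: try partially matched rules first, then fall back to k-NN."""
    value = impute_with_rules(record, target, rules, min_match)
    if value is None:
        value = impute_with_knn(record, target, dataset, k)
    return value
```

Setting `min_match=1.0` reduces the rule step to full matching, which makes it easy to compare the two matching modes on the same data.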

arXiv.org e-Print Archive

An Approach to Find Missing Values in Medical Datasets

Author: Bai B. Mathura
Mangathayaru N.
Rani B. Padmaja
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 25/04/2016

Mining medical datasets is a challenging problem for data mining researchers, as these datasets have several hidden challenges compared to conventional datasets. Starting from the collection of samples through field experiments and clinical trials to performing classification, there are numerous challenges at every stage of the mining process. The preprocessing phase of the mining process is itself a challenging issue when we work on medical datasets. One of the prime challenges in mining medical datasets is handling missing values, which is part of the preprocessing phase. In this paper, we address the issue of handling missing values in medical datasets consisting of categorical attribute values. The main contribution of this research is to use the proposed imputation measure to estimate and fix the missing values. We discuss a case study to demonstrate the working of the proposed measure.

Comment: 7 pages, ACM Digital Library, ICEMIS September 201

arXiv.org e-Print Archive