6 research outputs found
Neural Techniques for Improving the Classification Accuracy of Microarray Data Set using Rough Set Feature Selection Method
Abstract: Classification, a data mining task, is an effective method for categorizing data in the process of Knowledge Data Discovery. Classification algorithms are widely used in the medical field to classify medical data for diagnosis. Feature selection increases the accuracy of a classifier because it eliminates irrelevant attributes. This paper analyzes the performance of neural network classifiers with and without feature selection, in terms of accuracy and efficiency, by building models on four different datasets. It provides a rough set feature selection scheme and evaluates the relative performance of four neural network classification procedures incorporating it: the Learning Vector Quantisation (LVQ) variants LVQ1, LVQ3 and optimized-learning-rate LVQ1 (OLVQ1), and the Self-Organizing Map (SOM). Experimental results show that LVQ3 is an appropriate classification method that makes it possible to construct high-performance classification models for microarray data.
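To make the LVQ family named above more concrete, the following is a minimal sketch of the textbook LVQ1 update rule, not the paper's implementation. The function names, the linearly decaying learning rate and the toy data are all assumptions introduced for illustration. The nearest prototype is pulled toward a training sample when their labels agree and pushed away when they differ.

```python
import math

def nearest(prototypes, x):
    """Return the index of the prototype closest to x (Euclidean distance)."""
    return min(range(len(prototypes)), key=lambda i: math.dist(prototypes[i], x))

def train_lvq1(samples, labels, prototypes, proto_labels, lr=0.1, epochs=20):
    """Textbook LVQ1. The decay schedule below is an assumption, not from the paper."""
    protos = [list(p) for p in prototypes]  # work on a copy
    for epoch in range(epochs):
        alpha = lr * (1.0 - epoch / epochs)  # linearly decaying learning rate
        for x, label in zip(samples, labels):
            w = nearest(protos, x)
            # Attract the winner if its class matches, repel it otherwise.
            sign = 1.0 if proto_labels[w] == label else -1.0
            protos[w] = [p + sign * alpha * (xi - p) for p, xi in zip(protos[w], x)]
    return protos

def predict(samples, prototypes, proto_labels):
    """Label each sample with the class of its nearest prototype."""
    return [proto_labels[nearest(prototypes, x)] for x in samples]

# Hypothetical toy data: two well-separated classes in 2-D.
samples = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = [0, 0, 0, 1, 1, 1]
trained = train_lvq1(samples, labels, [(1, 1), (4, 4)], [0, 1])
predicted = predict(samples, trained, [0, 1])
```

LVQ3 and OLVQ1 refine this rule (LVQ3 updates the two nearest prototypes, OLVQ1 keeps a separate learning rate per prototype); they are not shown here.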
Optimization of attribute selection model using bio-inspired algorithms
Attribute selection, also known as feature selection, is an essential process in predictive analysis. To date, various feature selection algorithms have been introduced; nevertheless, they all work independently, which reduces the consistency of the accuracy rate. The aim of this paper is to investigate the use of bio-inspired search algorithms in producing an optimal attribute set. This is achieved in two stages: 1) create attribute selection models by combining search methods and feature selection algorithms, and 2) determine an optimized attribute set by employing bio-inspired algorithms. Classification performance of the produced attribute set is analyzed based on accuracy and the number of selected attributes. Experimental results on six (6) public real datasets reveal that the feature selection model implemented with a bio-inspired search algorithm consistently achieves good classification performance (i.e., higher accuracy with fewer attributes) on the selected datasets. This finding indicates that bio-inspired algorithms can help identify the few most important features to be used in data mining model construction.
Rough set approach for categorical data clustering
A few rough set techniques for categorical data clustering exist to group objects
having similar characteristics. However, the performance of these techniques is an
issue because of low accuracy, high computational complexity and low cluster purity.
This work proposes a new technique called Maximum Dependency Attributes
(MDA) to address these issues. The proposed
technique is based on rough set theory and takes into account the dependency of
attributes in an information system. The main contribution of this work is
a new technique to classify objects from categorical datasets that performs
better than the baseline techniques.
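The attribute dependency mentioned above can be illustrated with the standard rough set dependency degree. The sketch below is an assumption-laden illustration, not the thesis's MATLAB code. The selection rule (pick the attribute whose best dependency degree is highest) is a guess at how a "maximum dependency" criterion might work, and the function names and toy table are hypothetical.

```python
from collections import defaultdict

def partition(rows, attr):
    """Group row indices into equivalence classes by their value on attr."""
    blocks = defaultdict(set)
    for i, row in enumerate(rows):
        blocks[row[attr]].add(i)
    return list(blocks.values())

def dependency_degree(rows, target, source):
    """Degree to which `target` depends on `source`: the fraction of objects
    whose `source` equivalence class lies entirely inside one `target` class."""
    target_blocks = partition(rows, target)
    positive = 0
    for block in partition(rows, source):
        if any(block <= t for t in target_blocks):
            positive += len(block)
    return positive / len(rows)

def select_clustering_attribute(rows, attrs):
    """Assumed selection rule: choose the attribute with the largest
    dependency degree on any single other attribute."""
    best = {
        a: max(dependency_degree(rows, a, b) for b in attrs if b != a)
        for a in attrs
    }
    return max(best, key=best.get), best

# Hypothetical information system with three categorical attributes.
rows = [
    {"a": 1, "b": "x", "c": "p"},
    {"a": 1, "b": "x", "c": "q"},
    {"a": 2, "b": "y", "c": "p"},
    {"a": 2, "b": "z", "c": "q"},
]
chosen, degrees = select_clustering_attribute(rows, ["a", "b", "c"])
```

In this toy table, attribute `a` is fully determined by `b`, so it has the highest dependency degree and is the one selected.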
The algorithm of the proposed technique is implemented in MATLAB®
version 7.6.0.324 (R2008a). It is executed sequentially on an Intel Core
2 Duo CPU. The total main memory is 1 gigabyte and the operating system is
Windows XP Professional SP3. Results collected during the experiments on four
small datasets and thirteen UCI benchmark datasets for selecting a clustering
attribute show that the proposed MDA technique is an efficient approach in terms of
accuracy and computational complexity as compared to the BC, TR and MMR
techniques. For cluster purity, the results on the Soybean and Zoo datasets show that
the MDA technique improved purity by up to 17% and 9%, respectively.
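The abstract does not state how cluster purity is computed. A common definition, assumed here, counts each cluster's majority class, sums those counts and divides by the number of objects. The function name and sample data below are hypothetical.

```python
from collections import Counter

def purity(cluster_ids, class_labels):
    """Overall purity: sum of each cluster's majority-class count over N.
    This is the commonly used definition, assumed rather than taken from the source."""
    per_cluster = {}
    for cluster, label in zip(cluster_ids, class_labels):
        per_cluster.setdefault(cluster, Counter())[label] += 1
    majority_total = sum(max(counts.values()) for counts in per_cluster.values())
    return majority_total / len(class_labels)

# Hypothetical example: cluster 0 holds two "a" and one "b"; cluster 1 holds three "b".
score = purity([0, 0, 0, 1, 1, 1], ["a", "a", "b", "b", "b", "b"])
```

A purity of 1.0 means every cluster contains objects of a single class.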
The experimental results on supplier chain management clustering also
demonstrate how the MDA technique can contribute to a practical system, improving
computational complexity and cluster purity by up to 90% and
23%, respectively.