3 research outputs found

    Optimization of attribute selection model using bio-inspired algorithms

    Attribute selection, also known as feature selection, is an essential process in predictive analysis. To date, various feature selection algorithms have been introduced; however, they all work independently, which reduces the consistency of the accuracy rate. The aim of this paper is to investigate the use of bio-inspired search algorithms in producing an optimal attribute set. This is achieved in two stages: 1) create attribute selection models by combining search methods and feature selection algorithms, and 2) determine an optimized attribute set by employing bio-inspired algorithms. Classification performance of the produced attribute set is analyzed based on accuracy and number of selected attributes. Experimental results on six (6) public real datasets reveal that the feature selection model implemented with a bio-inspired search algorithm consistently classifies well (i.e., higher accuracy with fewer attributes) on the selected data sets. This finding indicates that bio-inspired algorithms can help identify the few most important features to be used in data mining model construction.
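As a rough illustration of the second stage described in this abstract, the following is a minimal sketch of a bio-inspired search over attribute subsets, using a simple genetic algorithm. It is not the authors' method: the function names, the parameters, and the toy `fitness` (a hypothetical per-attribute relevance score minus a size penalty) are assumptions made for illustration. A real wrapper would instead train a classifier on each candidate subset and score it by accuracy and number of attributes.

```python
import random


def fitness(mask, scores, penalty=0.05):
    """Toy fitness: summed relevance of the chosen attributes minus a size penalty.

    `scores` is a hypothetical per-attribute relevance list (an assumption);
    a real wrapper would use classifier accuracy on the selected attributes.
    """
    if not any(mask):
        return float("-inf")  # an empty attribute set is not usable
    return sum(s for s, m in zip(scores, mask) if m) - penalty * sum(mask)


def genetic_search(scores, pop_size=20, generations=30, mutation_rate=0.1, seed=0):
    """Search for a good 0/1 attribute mask with a small genetic algorithm."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    n = len(scores)            # needs at least 2 attributes for crossover
    population = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    best = max(population, key=lambda m: fitness(m, scores))

    def pick():
        # Tournament selection: the fitter of two random individuals wins.
        a, b = rng.sample(population, 2)
        return a if fitness(a, scores) >= fitness(b, scores) else b

    for _ in range(generations):
        children = [best[:]]  # elitism: carry the best mask over unchanged
        while len(children) < pop_size:
            p1, p2 = pick(), pick()
            cut = rng.randint(1, n - 1)       # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Flip each bit with a small probability (mutation).
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        population = children
        best = max(population, key=lambda m: fitness(m, scores))
    return best


if __name__ == "__main__":
    relevance = [0.30, 0.01, 0.25, 0.02, 0.00, 0.20]  # made-up example values
    print(genetic_search(relevance))
```

The size penalty plays the role of the "fewer attributes" criterion: an attribute is only worth keeping when its contribution exceeds the penalty.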

    Nonsmooth optimization models and algorithms for data clustering and visualization

    Cluster analysis deals with the problem of organizing a collection of patterns into clusters based on a similarity measure. Various distance functions can be used to define this measure. Clustering problems with the similarity measure defined by the squared Euclidean distance have been studied extensively over the last five decades. However, problems with other Minkowski norms have attracted significantly less attention. The use of different similarity measures may help to identify different cluster structures in a data set. This in turn may help to significantly improve the decision-making process. High-dimensional data visualization is another important task in the field of data mining and pattern recognition. To date, principal component analysis and self-organizing maps have been used to solve such problems. In this thesis we develop algorithms for solving clustering problems in large data sets using various similarity measures. Such similarity measures are based on the squared L...
    Doctor of Philosophy
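To illustrate the abstract's point that different similarity measures can reveal different cluster structures, here is a minimal sketch showing how the nearest cluster center for a point can change with the Minkowski norm. The helper names `minkowski` and `nearest_center` and the example coordinates are assumptions for illustration, not the thesis's algorithms.

```python
import math


def minkowski(x, y, p):
    """Minkowski distance of order p between two points; p = inf gives the max norm."""
    diffs = [abs(a - b) for a, b in zip(x, y)]
    if math.isinf(p):
        return max(diffs)
    return sum(d ** p for d in diffs) ** (1.0 / p)


def nearest_center(point, centers, p):
    """Index of the center closest to `point` under the order-p Minkowski distance."""
    return min(range(len(centers)), key=lambda i: minkowski(point, centers[i], p))


if __name__ == "__main__":
    point = (0, 0)
    centers = [(3, 3), (4, 0)]  # made-up example centers
    for p in (1, 2, math.inf):
        print(p, nearest_center(point, centers, p))
```

For this point, the center at (4, 0) is closest under the L1 and L2 norms, while the center at (3, 3) is closest under the max norm, so the same data would be assigned to different clusters depending on the measure.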