2 research outputs found

    Multi-Objective Genetic Programming for Feature Extraction and Data Visualization

    Get PDF
    Feature extraction transforms high dimensional data into a new subspace of lower dimensionalitywhile keeping the classification accuracy. Traditional algorithms do not consider the multi-objective nature of this task. Data transformations should improve the classification performance on the new subspace, as well as to facilitate data visualization, which has attracted increasing attention in recent years. Moreover, new challenges arising in data mining, such as the need to deal with imbalanced data sets call for new algorithms capable of handling this type of data. This paper presents a Pareto-basedmulti-objective genetic programming algorithm for feature extraction and data visualization. The algorithm is designed to obtain data transformations that optimize the classification and visualization performance both on balanced and imbalanced data. Six classification and visualization measures are identified as objectives to be optimized by the multi-objective algorithm. The algorithm is evaluated and compared to 11 well-known feature extraction methods, and to the performance on the original high dimensional data. Experimental results on 22 balanced and 20 imbalanced data sets show that it performs very well on both types of data, which is its significant advantage over existing feature extraction algorithms

    Parallel non-linear dimension reduction algorithm on GPU

    No full text
    [[abstract]]Advances in non-linear dimensionality reduction provide a way to understand and visualise the underlying structure of complex datasets. The performance of large-scale non-linear dimensionality reduction is of key importance in data mining, machine learning, and data analysis. In this paper, we concentrate on improving the performance of non-linear dimensionality reduction using large-scale datasets on the GPU. In particular, we focus on solving problems including k-nearest neighbour (KNN) search and sparse spectral decomposition for large-scale data, and propose an efficient framework for local linear embedding (LLE). We implement a k-d tree-based KNN algorithm and Krylov subspace method on the GPU to accelerate non-linear dimensionality reduction for large-scale data. Our results enable GPU-based k-d tree LLE processes of up to about 30-60? faster compared to the brute force KNN (Hernandez et al., 2007) LLE model on the CPU. Overall, our methods save O(n²-6n-2k-3) memory space.[[cooperationtype]]國外[[booktype]]紙
    corecore