2 research outputs found
Multi-Objective Genetic Programming for Feature Extraction and Data Visualization
Feature extraction transforms high-dimensional data into a new subspace of lower dimensionality while preserving classification accuracy. Traditional algorithms do not consider the multi-objective nature of this task: data transformations should improve classification performance in the new subspace as well as facilitate data visualization, which has attracted increasing attention in recent years. Moreover, new challenges arising in data mining, such as the need to deal with imbalanced data sets, call for new algorithms capable of handling this type of data. This paper presents a Pareto-based multi-objective genetic programming algorithm for feature extraction and data visualization. The algorithm is designed to obtain data transformations that optimize classification and visualization performance on both balanced and imbalanced data. Six classification and visualization measures are identified as objectives to be optimized by the multi-objective algorithm. The algorithm is evaluated and compared to 11 well-known feature extraction methods, as well as to the performance on the original high-dimensional data. Experimental results on 22 balanced and 20 imbalanced data sets show that it performs very well on both types of data, which is a significant advantage over existing feature extraction algorithms.
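The abstract's key mechanism is Pareto-based multi-objective optimization: a candidate transformation is kept only if no other candidate is at least as good on all six objectives and strictly better on one. A minimal sketch of that dominance test and front extraction (illustrative only — the function names and the toy objective vectors are assumptions, not the paper's code):

```python
import numpy as np

def dominates(a, b):
    """True if objective vector a Pareto-dominates b (all objectives
    minimized): a is no worse everywhere and strictly better somewhere."""
    a, b = np.asarray(a), np.asarray(b)
    return bool(np.all(a <= b) and np.any(a < b))

def pareto_front(points):
    """Indices of the non-dominated points among a list of objective
    vectors, assuming minimization of every objective."""
    points = np.asarray(points)
    return [
        i for i, p in enumerate(points)
        if not any(dominates(points[j], p)
                   for j in range(len(points)) if j != i)
    ]

# Toy example: (1, 2) and (2, 1) are mutually non-dominated,
# while (3, 3) is dominated by both.
scores = [(1, 2), (2, 1), (3, 3)]
print(pareto_front(scores))  # -> [0, 1]
```

In the paper's setting, each objective vector would hold the six classification and visualization measures of one evolved transformation, and the genetic programming loop would select parents from the current non-dominated front.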
Parallel non-linear dimension reduction algorithm on GPU
Advances in non-linear dimensionality reduction provide a way to understand and visualise the underlying structure of complex datasets. The performance of large-scale non-linear dimensionality reduction is of key importance in data mining, machine learning, and data analysis. In this paper, we concentrate on improving the performance of non-linear dimensionality reduction on large-scale datasets using the GPU. In particular, we focus on solving problems including k-nearest neighbour (KNN) search and sparse spectral decomposition for large-scale data, and propose an efficient framework for locally linear embedding (LLE). We implement a k-d tree-based KNN algorithm and a Krylov subspace method on the GPU to accelerate non-linear dimensionality reduction for large-scale data. Our results enable GPU-based k-d tree LLE to run up to about 30-60 times faster than the brute-force KNN (Hernandez et al., 2007) LLE model on the CPU. Overall, our methods save O(n²-6n-2k-3) memory space.
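The pipeline the abstract accelerates — k-d tree KNN search, local reconstruction weights, then a spectral embedding — can be sketched on the CPU as follows. This is a minimal dense sketch, not the paper's GPU implementation: it uses SciPy's `cKDTree` for the KNN step and a dense eigensolver in place of the sparse Krylov subspace method, and all function and parameter names are assumptions.

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.linalg import eigh

def lle(X, n_neighbors=10, n_components=2, reg=1e-3):
    """Minimal locally linear embedding of the rows of X."""
    n = X.shape[0]
    # 1. KNN search via a k-d tree (query k+1 so each point can
    #    skip itself as its own nearest neighbour).
    tree = cKDTree(X)
    _, idx = tree.query(X, k=n_neighbors + 1)
    idx = idx[:, 1:]
    # 2. Solve for the weights that best reconstruct each point
    #    from its neighbours (regularized local covariance).
    W = np.zeros((n, n))
    for i in range(n):
        Z = X[idx[i]] - X[i]                  # centered neighbourhood
        C = Z @ Z.T                           # local covariance
        C += reg * np.trace(C) * np.eye(n_neighbors)
        w = np.linalg.solve(C, np.ones(n_neighbors))
        W[i, idx[i]] = w / w.sum()
    # 3. Embedding: bottom eigenvectors of M = (I - W)^T (I - W),
    #    skipping the constant eigenvector. (The paper replaces this
    #    dense step with a sparse Krylov subspace solver on the GPU.)
    I = np.eye(n)
    M = (I - W).T @ (I - W)
    _, vecs = eigh(M)
    return vecs[:, 1:n_components + 1]

# Toy usage: embed 3-D points that lie near a 2-D linear subspace.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2)) @ rng.normal(size=(2, 3))
Y = lle(X, n_neighbors=12, n_components=2)
print(Y.shape)  # (200, 2)
```

The k-d tree step is what replaces the O(n²) brute-force KNN comparison the abstract benchmarks against; for large n, both that search and the step-3 eigenproblem are the costs the paper moves to the GPU.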