Can Genetic Programming Do Manifold Learning Too?
Exploratory data analysis is a fundamental aspect of knowledge discovery that
aims to find the main characteristics of a dataset. Dimensionality reduction,
such as manifold learning, is often used to reduce the number of features in a
dataset to a manageable level for human interpretation. However, most
manifold learning techniques explain nothing about the original
features or the true characteristics of a dataset. In this paper, we propose a
genetic programming approach to manifold learning called GP-MaL which evolves
functional mappings from a high-dimensional space to a lower dimensional space
through the use of interpretable trees. We show that GP-MaL is competitive with
existing manifold learning algorithms, while producing models that can be
interpreted and re-used on unseen data. A number of promising future directions
for research are identified in the process.

Comment: 16 pages, accepted in EuroGP '1
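The abstract describes individuals as interpretable trees that map a high-dimensional point to a lower-dimensional one. A minimal sketch of that representation idea, with all names and the example trees invented here for illustration (not taken from GP-MaL itself):

```python
# Illustrative sketch only: a GP individual for manifold learning can be
# represented as one expression tree per output (embedding) dimension.
# Tree nodes are tuples: ('x', i) reads input feature i; (op, left, right)
# applies a binary operator. Operator set and trees below are hypothetical.

def eval_tree(tree, x):
    """Recursively evaluate an expression tree on feature vector x."""
    if tree[0] == 'x':
        return x[tree[1]]
    op, left, right = tree
    a, b = eval_tree(left, x), eval_tree(right, x)
    return a + b if op == '+' else a - b if op == '-' else a * b

# A hand-written 2-output individual: maps a 4-feature point to 2 dimensions.
individual = [
    ('+', ('x', 0), ('*', ('x', 1), ('x', 2))),  # embedding dimension 1
    ('-', ('x', 3), ('x', 0)),                   # embedding dimension 2
]

point = [1.0, 2.0, 3.0, 4.0]
embedding = [eval_tree(t, point) for t in individual]
# embedding == [1 + 2*3, 4 - 1] == [7.0, 3.0]
```

Because each output dimension is an explicit expression over the original features, the mapping can be read directly and applied to unseen points, which is the interpretability and re-use property the abstract emphasises.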
New representations in genetic programming for feature construction in k-means clustering
© Springer International Publishing AG 2017. k-means is one of the fundamental and most well-known algorithms in data mining. It has been widely used in clustering tasks, but suffers from a number of limitations on large or complex datasets. Genetic Programming (GP) has been used to improve the performance of data mining algorithms by performing feature construction: the process of combining multiple attributes (features) of a dataset to produce more powerful constructed features. In this paper, we propose novel representations for using GP to perform feature construction to improve the clustering performance of the k-means algorithm. Our experiments show significant performance improvements compared to k-means across a variety of difficult datasets. Several GP programs are also analysed to provide insight into how feature construction is able to improve clustering performance.
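The pipeline the abstract describes (evolve a constructed feature, then cluster with k-means in the constructed space) can be sketched as follows. This is a hand-made illustration, not the paper's representation: the "evolved" program is fixed by hand, and the tiny 1-D k-means is written inline rather than taken from a library.

```python
# Illustrative sketch only: a GP-constructed feature combines raw
# attributes into a derived feature, and k-means then clusters in the
# constructed space. The feature below stands in for an evolved GP tree.
import random

def constructed_feature(x):
    # Hypothetical evolved program: f(x) = x0 * x1 - x2
    return x[0] * x[1] - x[2]

def kmeans_1d(values, k=2, iters=20, seed=0):
    """Tiny 1-D k-means over the constructed feature values."""
    rng = random.Random(seed)
    centers = rng.sample(values, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in values:
            # Assign each value to its nearest center.
            clusters[min(range(k), key=lambda c: abs(v - centers[c]))].append(v)
        # Recompute each center as its cluster mean (keep old center if empty).
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return sorted(centers)

# Two groups that are hard to separate attribute-by-attribute but split
# cleanly under the constructed feature.
data = [[1, 2, 0], [2, 1, 1], [5, 6, 2], [6, 5, 1]]
feats = [constructed_feature(x) for x in data]  # [2, 1, 28, 29]
centers = kmeans_1d(feats)                      # converges to [1.5, 28.5]
```

In the paper's setting the constructed feature would be found by evolutionary search against a clustering-quality fitness measure rather than written by hand; the sketch only shows why a good constructed feature can make the k-means step easier.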