Exploratory data analysis using self-organising maps defined in up to three dimensions

Abstract

The self-organising map (SOM) is an artificial neural network based on an unsupervised learning process that performs a nonlinear mapping of high-dimensional input data onto an ordered and structured array of nodes, designated as the SOM output space. Being simultaneously a quantization algorithm and a projection algorithm, the SOM is able to summarize and map the data, allowing its visualization. Because a SOM defined with more than two dimensions is very difficult or even impossible to visualize with the most common visualization methods, the SOM output space is generally a regular two-dimensional grid of nodes. However, there are no theoretical problems in generating SOMs with higher-dimensional output spaces. In this thesis we present evidence that the SOM output space defined in up to three dimensions can be used successfully for the exploratory analysis of spatial data, two-way data and three-way data. Despite the differences between the methods proposed to visualize each group of data, the approach adopted is commonly based on the projection of colour codes, obtained from the output space of 3D SOMs, onto a specific two-dimensional surface where the data can be represented according to their own characteristics. In some cases this approach is complemented by the simultaneous use of SOMs defined in one and two dimensions, so that patterns in the data can be properly revealed. The results obtained with this visualization strategy indicate not only the benefits of using the SOM defined in up to three dimensions but also the relevance of the combined and simultaneous use of different SOM models in exploratory data analysis.
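The colour-code idea described above can be illustrated with a minimal sketch. It assumes a basic online SOM with a Gaussian neighbourhood and linearly decaying learning rate, and it assumes that the three axes of the output grid are mapped directly to the red, green and blue channels. The function names (`train_som_3d`, `colour_codes`), the parameter values and the synthetic data are hypothetical and are not taken from the thesis, whose actual colour assignment may differ.

```python
import numpy as np


def train_som_3d(data, shape=(4, 4, 4), n_iter=2000, lr0=0.5, sigma0=2.0, seed=0):
    """Train a small online SOM whose output space is a 3D grid of nodes."""
    rng = np.random.default_rng(seed)
    # Coordinates of every node in the 3D output space, shape (n_nodes, 3).
    axes = [np.arange(s) for s in shape]
    grid = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1)
    grid = grid.reshape(-1, 3).astype(float)
    # Initialise the codebook vectors from randomly chosen input samples.
    weights = data[rng.integers(0, len(data), size=len(grid))].astype(float)
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                  # decaying learning rate
        sigma = max(sigma0 * (1.0 - frac), 0.5)  # shrinking neighbourhood radius
        x = data[rng.integers(0, len(data))]
        # Best-matching unit: the node whose codebook vector is closest to x.
        bmu = np.argmin(((weights - x) ** 2).sum(axis=1))
        # Gaussian neighbourhood measured in output-space (grid) distance.
        d2 = ((grid - grid[bmu]) ** 2).sum(axis=1)
        h = np.exp(-d2 / (2.0 * sigma ** 2))
        weights += lr * h[:, None] * (x - weights)
    return grid, weights


def colour_codes(data, grid, weights, shape):
    """Give each sample an RGB triple in [0, 1] from its BMU's grid position."""
    dists = ((data[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    bmus = dists.argmin(axis=1)
    # Assumption: grid axes map one-to-one onto the R, G and B channels.
    denom = np.maximum(np.array(shape, dtype=float) - 1.0, 1.0)
    return grid[bmus] / denom


# Synthetic stand-in for real observations: 200 samples with 5 attributes.
shape = (4, 4, 4)
data = np.random.default_rng(1).random((200, 5))
grid, weights = train_som_3d(data, shape=shape)
colours = colour_codes(data, grid, weights, shape)
```

Each row of `colours` can then be painted onto whatever two-dimensional surface suits the data, for example the map polygon of a spatial unit, so that observations falling in nearby regions of the 3D output space receive similar colours.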
