19 research outputs found

    Path finding on a spherical self-organizing map using distance transformations

    Spatialization methods create visualizations that allow users to analyze high-dimensional data in an intuitive manner and facilitate the extraction of meaningful information. Just as geographic maps are simplified representations of geographic spaces, these visualizations are essentially maps of abstract data spaces that are created through dimensionality reduction. While we are familiar with geographic maps for path planning/finding applications, research into using maps of high-dimensional spaces for such purposes has been largely ignored. However, the literature has shown that it is possible to use these maps to track temporal and state changes within a high-dimensional space. A popular dimensionality reduction method that produces a mapping for these purposes is the Self-Organizing Map (SOM). By using its topology-preserving capabilities with a colour-based visualization method known as the U-Matrix, state transitions can be visualized as trajectories on the resulting mapping. Through these trajectories, one can gather information on the transition path between two points in the original high-dimensional state space. This raises the interesting question of whether the Self-Organizing Map can be used to discover the transition path between two points in an n-dimensional space. In this thesis, we use a spherically structured Self-Organizing Map called the Geodesic Self-Organizing Map for dimensionality reduction and the creation of a topological mapping that approximates the n-dimensional space. We first present an intuitive method for a user to navigate the surface of the Geodesic SOM. A new application of the distance transformation algorithm is then proposed to compute the path between two points on the surface of the SOM, which correspond to two points in the data space. A discussion follows on how this application could be improved using some form of surface shape analysis. The new approach presented in this thesis is then evaluated by analyzing the results of using the Geodesic SOM for manifold embedding and by carrying out data analyses using carbon dioxide emissions data.
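
The idea of computing a path with a distance transformation can be illustrated on a much simpler surface than the Geodesic SOM. The sketch below is not the thesis's method: it assumes a flat rectangular grid with 4-connected cells, where `0` marks a free cell and `1` a blocked one, and the names `GRID`, `distance_transform` and `trace_path` are hypothetical. A breadth-first pass assigns every free cell its step distance to the goal, and the path is then read off by repeatedly stepping to a neighbour whose distance is one less.

```python
from collections import deque

# 4-connected neighbourhood (an assumption; a geodesic grid would differ)
NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))

def distance_transform(grid, goal):
    """Breadth-first distance transform: steps from each free cell to `goal`.

    Cells that cannot reach the goal keep the value None.
    """
    rows, cols = len(grid), len(grid[0])
    dist = [[None] * cols for _ in range(rows)]
    dist[goal[0]][goal[1]] = 0
    queue = deque([goal])
    while queue:
        r, c = queue.popleft()
        for dr, dc in NEIGHBOURS:
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] == 0 and dist[nr][nc] is None):
                dist[nr][nc] = dist[r][c] + 1
                queue.append((nr, nc))
    return dist

def trace_path(dist, start):
    """Descend the distance values from `start` until the goal (distance 0)."""
    if dist[start[0]][start[1]] is None:
        return []  # the goal is unreachable from this cell
    rows, cols = len(dist), len(dist[0])
    r, c = start
    path = [start]
    while dist[r][c] > 0:
        for dr, dc in NEIGHBOURS:
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and dist[nr][nc] == dist[r][c] - 1):
                r, c = nr, nc
                break
        path.append((r, c))
    return path

# Hypothetical toy map: a wall with a single gap separates the two rows.
GRID = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
]
dist = distance_transform(GRID, goal=(0, 0))
path = trace_path(dist, start=(2, 0))
```

On a trained map, the cell costs would presumably come from the map itself (for example U-Matrix heights) rather than from a binary free/blocked mask; that extension is left out here.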

    WeVoS-ViSOM: an ensemble summarization algorithm for enhanced data visualization

    This study presents a novel version of the Visualization Induced Self-Organizing Map (ViSOM) based on the application of a new fusion algorithm for summarizing the results of an ensemble of topology-preserving mapping models. The algorithm is referred to as Weighted Voting Superposition (WeVoS). Its main feature is the preservation of the topology of the map, in order to obtain the most accurate possible visualization of the data sets under study. To do so, a weighted voting process takes place between the units of the maps in the ensemble to determine the characteristics of the units of the resulting map. Several different quality measures are applied to this novel neural architecture, known as WeVoS-ViSOM, and the results are analyzed to present a thorough study of its capabilities. To complete the study, it has also been compared, using the same quality measures, with the well-known SOM and its fusion version, the WeVoS-SOM, and with two other previously devised fusion algorithms (Fusion by Euclidean Distance and Fusion by Voronoi Polygon Similarity). All three summarization methods were applied to three widely used data sets from the UCI Repository. A rigorous performance analysis clearly demonstrates that the novel fusion algorithm outperforms the other single and summarization methods in terms of data set visualization. This research has been partially supported through projects CIT-020000-2008-2 and CIT-020000-2009-12 of the Spanish Ministry of Education and Innovation and project BUO06A08 of the Junta of Castilla and Leon. The authors would also like to thank the manufacturer of components for vehicle interiors, Grupo Antolin Ingenieria, S.A., within the framework of the MAGNO2008-1028 CENIT project, funded by the Spanish Ministry of Science and Innovation.

    Methods for Estimation of Intrinsic Dimensionality

    Dimension reduction is an important tool used to describe the structure of complex data (explicitly or implicitly) through a small but sufficient number of variables, and thereby make data analysis more efficient. It is also useful for visualization purposes. Dimension reduction helps statisticians to overcome the 'curse of dimensionality'. However, most dimension reduction techniques require the intrinsic dimension of the low-dimensional subspace to be fixed in advance. The availability of reliable intrinsic dimension (ID) estimation techniques is therefore of major importance. The main goal of this thesis is to develop algorithms for determining the intrinsic dimensions of recorded data sets in a non-linear context. Whilst this is a well-researched topic for linear planes, based mainly on principal components analysis, relatively little attention has been paid to ways of estimating this number for non-linear variable interrelationships. The algorithms proposed here are based on existing concepts that can be categorized into local methods, relying on randomly selected subsets of a recorded variable set, and global methods, utilizing the entire data set. This thesis provides an overview of ID estimation techniques, with special consideration given to recent developments in non-linear techniques, such as charting manifold and fractal-based methods. Despite their nominal existence, the practical implementation of these techniques is far from straightforward. The intrinsic dimension is estimated via Brand's algorithm by examining the growth point process, which counts the number of points in hyper-spheres. The estimation needs to determine the starting point for each hyper-sphere. In this thesis we provide settings for selecting starting points which work well for most data sets. Additionally, we propose approaches for estimating dimensionality via Brand's algorithm: the Dip method and the Regression method. Other approaches are proposed for estimating the intrinsic dimension by fractal dimension estimation methods, which exploit the intrinsic geometry of a data set. The most popular concept from this family of methods is the correlation dimension, which requires the estimation of the correlation integral for a ball of radius tending to 0. In this thesis we propose new approaches to approximate the correlation integral in this limit: the Intercept method, the Slope method and the Polynomial method. In addition we propose a new approach, a localized global method, which could be defined as a local version of global ID methods. The objective of the localized global approach is to improve the algorithm based on a local ID method, which could significantly reduce the negative bias. Experimental results on real-world and simulated data are used to demonstrate the algorithms and compare them to other methods. A simulation study which verifies the effectiveness of the proposed methods is also provided. Finally, these algorithms are contrasted using a recorded data set from an industrial melter process.
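
To make the correlation dimension concrete, here is a minimal sketch of the textbook two-radius estimate, not of the Intercept, Slope or Polynomial methods proposed in the thesis. The correlation integral C(r) is taken as the fraction of distinct point pairs closer than r, and the dimension is approximated by the slope of log C(r) against log r between two small radii. The function names, the radii and the toy data (evenly spaced points on a straight line embedded in three dimensions) are illustrative assumptions.

```python
import math

def correlation_integral(points, r):
    """Fraction of distinct point pairs whose Euclidean distance is below r."""
    n = len(points)
    close = 0
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(points[i], points[j]) < r:
                close += 1
    return close / (n * (n - 1) / 2)

def correlation_dimension(points, r_small, r_large):
    """Slope of log C(r) versus log r between two radii.

    Both radii must be large enough that C(r) > 0, otherwise the
    logarithm is undefined.
    """
    c_small = correlation_integral(points, r_small)
    c_large = correlation_integral(points, r_large)
    return (math.log(c_large) - math.log(c_small)) / (
        math.log(r_large) - math.log(r_small)
    )

# Toy data: 300 evenly spaced points on a line in 3-D space, so the
# ambient dimension is 3 but the intrinsic dimension is 1.
line = [(t, 2 * t, -t) for t in (i / 299 for i in range(300))]
estimate = correlation_dimension(line, r_small=0.05, r_large=0.5)
```

For this line the estimate comes out close to 1 rather than 3. The result depends on the chosen radii, which is the practical difficulty of evaluating the limit of radius tending to 0 that the abstract refers to.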

    Fusion of Visualization Induced SOM

    In this study, ensemble techniques have been applied in the framework of topology-preserving mappings for visualization purposes. A novel extension of the ViSOM (Visualization Induced SOM) is obtained by the use of the ensemble meta-algorithm and a later fusion process. This main fusion algorithm has two different variants, considering two different criteria for the similarity of nodes. These criteria are Euclidean distance and similarity on Voronoi polygons. The goal of this upgrade is to improve the quality and robustness of the single model. Experiments performed over different datasets, applying the two variants of the fusion and other simpler models, are included for comparison purposes.

    Quality of Adaptation of Fusion ViSOM

    This work presents research on the performance capabilities of an extension of the ViSOM (Visualization Induced SOM) algorithm by the use of the ensemble meta-algorithm and a later fusion process. This main fusion process has two different variants, considering two different criteria for the similarity of nodes. These criteria are Euclidean distance and similarity on Voronoi polygons. The capabilities, strengths and weaknesses of the different variants of the model are discussed and compared in more depth in the present work. The details of several experiments performed over different datasets, applying the variants of the fusion to the ViSOM algorithm along with the same variants of fusion with the SOM, are included for this purpose.

    Automated Ham Quality Classification Using Ensemble Unsupervised Mapping Models

    This multidisciplinary study focuses on the application and comparison of several topology-preserving mapping models upgraded with some classifier ensemble and boosting techniques in order to improve their visualization capabilities. The aim is to test their suitability for classification purposes in the food industry, and more particularly in the case of dry-cured ham. The data is obtained from an electronic device able to emulate the sensory olfactory evaluation of ham samples. The data is then classified using the previously mentioned techniques in order to detect, in an automated way, which batches have an anomalous smell (acidity, rancidity and different types of taints).

    Beta hebbian learning: definition and analysis of a new family of learning rules for exploratory projection pursuit

    This thesis comprises an investigation into the derivation of learning rules in artificial neural networks from probabilistic criteria. Beta Hebbian Learning (BHL): first, a new family of learning rules is derived, based on maximising the likelihood of the residual from a negative feedback network when that residual is deemed to come from the Beta distribution. The resulting algorithm, called Beta Hebbian Learning, outperforms current neural algorithms in Exploratory Projection Pursuit. Beta-Scale Invariant Map (Beta-SIM): second, Beta Hebbian Learning is applied to a well-known topology-preserving map algorithm called the Scale Invariant Map (SIM) to design a new version of it called the Beta-Scale Invariant Map (Beta-SIM). It is developed to facilitate the clustering and visualization of the internal structure of complex high-dimensional datasets effectively and efficiently, especially those characterized by an internal radial distribution. The behaviour of Beta-SIM is thoroughly analysed by comparing its results, in terms of performance quality measures, with other well-known topology-preserving models. Weighted Voting Superposition Beta-Scale Invariant Map (WeVoS-Beta-SIM): finally, the use of ensembles such as the Weighted Voting Superposition (WeVoS) is tested on the novel Beta-SIM algorithm, in order to improve its stability and to generate accurate topology maps when using complex datasets. The WeVoS-Beta-Scale Invariant Map (WeVoS-Beta-SIM) is therefore presented, analysed and compared with other well-known topology-preserving models. All algorithms have been successfully tested using different artificial datasets to corroborate their properties, and also with highly complex real datasets.

    Elastic Maps and Nets for Approximating Principal Manifolds and Their Application to Microarray Data Visualization

    Principal manifolds are defined as lines or surfaces passing through "the middle" of the data distribution. Linear principal manifolds (Principal Components Analysis) are routinely used for dimension reduction, noise filtering and data visualization. Recently, methods for constructing non-linear principal manifolds were proposed, including our elastic maps approach, which is based on a physical analogy with elastic membranes. We have developed a general geometric framework for constructing "principal objects" of various dimensions and topologies with the simplest quadratic form of the smoothness penalty, which allows very effective parallel implementations. Our approach is implemented in three programming languages (C++, Java and Delphi) with two graphical user interfaces (VidaExpert http://bioinfo.curie.fr/projects/vidaexpert and ViMiDa http://bioinfo-out.curie.fr/projects/vimida applications). In this paper we give an overview of the method of elastic maps and present in detail one of its major applications: the visualization of microarray data in bioinformatics. We show that the method of elastic maps outperforms linear PCA in terms of data approximation, representation of between-point distance structure, preservation of local point neighborhood and representation of point classes in low-dimensional spaces. Comment: 35 pages, 10 figures.

    Spiking neurons in 3D growing self-organising maps

    In Kohonen's Self-Organising Map (SOM) learning, preserving the map topology so that it reflects the actual input features is a significant process. Misinterpretation of the training samples can lead to failure in identifying the important features that may affect the outcomes generated by the SOM model. Nonetheless, this is a challenging task, as most real problems involve complex and insufficient data. The Spiking Neural Network (SNN) is the third generation of Artificial Neural Network (ANN), in which information is transferred from one neuron to another using spikes, processed, and used to trigger a response as output. This study therefore embedded spiking neurons in SOM learning in order to enhance the learning process. The proposed method was divided into five main phases. Phase 1 investigated issues related to the SOM learning algorithm, while in Phase 2, datasets were collected for the analyses carried out in Phase 3, in which a neural coding scheme for the data representation process was implemented in the classification task. Next, in Phase 4, the spiking SOM model was designed, developed, and evaluated using classification accuracy rate and quantisation error. The outcomes showed that the proposed model attained a high classification accuracy rate with a low quantisation error, preserving the quality of the generated map based on the original input data. Lastly, in the final phase, a Spiking 3D Growing SOM is proposed to address the surface reconstruction issue by enhancing the spiking SOM using a 3D map structure in the SOM algorithm with a growing grid mechanism. The application of spiking neurons to enhance the performance of SOM is relevant in this study due to their ability to spike and to send a reaction when special features are identified, based on what has been learned from the presented datasets. The study outcomes contribute to the enhancement of SOM in learning the patterns of the datasets, as well as in proposing a better tool for data analysis.