Archivio istituzionale della ricerca - Università di Palermo

How Many Dissimilarity/Kernel Self Organizing Map Variants Do We Need?

Author: Rossi Fabrice
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 02/07/2014

In many application contexts, data are too rich and too complex to be represented by numerical vectors. A general approach to extending machine learning and data mining techniques to such data is to rely on a dissimilarity or on a kernel that measures how different or similar two objects are. This approach has been used to define several variants of the Self Organizing Map (SOM). This paper reviews those variants using a common notation in order to outline the differences and similarities between them. It discusses the advantages and drawbacks of the variants, as well as the actual relevance of the dissimilarity/kernel SOM for practical applications.
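
As an illustration of what relying on a kernel can mean in practice, the sketch below computes the squared feature-space distance between an object and a prototype written as a weighted combination of the data, using only kernel values. It is a minimal sketch, not the paper's own formulation: the function name `kernel_sq_distance`, the coefficient vector `alpha`, and the toy linear kernel are all assumptions made for this example.

```python
def kernel_sq_distance(K, i, alpha):
    """Squared distance in feature space between object i and the prototype
    sum_j alpha[j] * phi(x_j), computed from the kernel matrix K alone:
    K[i][i] - 2 * sum_j alpha[j] K[i][j] + sum_{j,l} alpha[j] alpha[l] K[j][l].
    """
    n = len(K)
    cross = sum(alpha[j] * K[i][j] for j in range(n))
    norm = sum(alpha[j] * alpha[l] * K[j][l]
               for j in range(n) for l in range(n))
    return K[i][i] - 2.0 * cross + norm


# Toy data (hypothetical): three 1-D points with a linear kernel k(a, b) = a * b.
points = [0.0, 1.0, 3.0]
K = [[a * b for b in points] for a in points]

# Prototype = mean of the first two points, i.e. 0.5 in the original space.
alpha = [0.5, 0.5, 0.0]
d = kernel_sq_distance(K, 2, alpha)  # equals (3 - 0.5) ** 2 for this linear kernel
```

With a linear kernel the result can be checked by hand against the ordinary squared Euclidean distance; with any other positive definite kernel the same expression applies without ever forming the feature map explicitly.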

Multidimensional Urban Segregation - Toward A Neural Network Measure

Author: Cottrell Marie
Hazan Aurélien
Olteanu Madalina
Randon-Furling Julien
Publication date: 05/06/2018

We introduce a multidimensional, neural-network approach to reveal and measure urban segregation phenomena, based on the Self-Organizing Map algorithm (SOM). The multidimensionality of SOM makes it possible to take into account a large number of variables simultaneously, defined on census or other types of statistical blocks, and to perform clustering along them. Levels of segregation are then measured through correlations between distances on the neural network and distances on the actual geographical map. Further, the stochasticity of SOM enables one to quantify levels of heterogeneity across census blocks. We illustrate this new method on data available for the city of Paris. Comment: NCAA S.I. WSOM+ 201
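
The idea of comparing distances on the map with distances on the ground can be sketched as follows. This is only an illustrative sketch under stated assumptions: it supposes that each census block already has a best-matching-unit position on the SOM grid (`bmu_coords`) and a geographic position (`geo_coords`), and it uses a plain Pearson correlation over all pairs of blocks, which may differ from the correlation measure used in the paper. All names are hypothetical.

```python
import math
from itertools import combinations


def pairwise_distances(points):
    """Euclidean distances between all unordered pairs of points."""
    return [math.dist(p, q) for p, q in combinations(points, 2)]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def map_geo_correlation(bmu_coords, geo_coords):
    """Correlate block-to-block distances on the SOM grid with
    block-to-block distances on the geographical map."""
    return pearson(pairwise_distances(bmu_coords),
                   pairwise_distances(geo_coords))


# Hypothetical toy input: four blocks whose map layout is an exact
# rescaling of their geographic layout, so the two distance lists are
# proportional and the correlation is 1.
geo_coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 3.0)]
bmu_coords = [(2 * x, 2 * y) for x, y in geo_coords]
r = map_geo_correlation(bmu_coords, geo_coords)
```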

HAL Descartes

HAL - UPEC / UPEM

Batch and median neural gas

Author: Alexander Hasenfuß
Barbara Hammer
Marie Cottrell
Thomas Villmann
Publication venue: 'Elsevier BV'
Publication date: 01/01/2006

Neural Gas (NG) is a very robust clustering algorithm for Euclidean data which does not suffer from the problem of local minima like simple vector quantization, or from topological restrictions like the self-organizing map. Based on the cost function of NG, we introduce a batch variant of NG which shows much faster convergence and which can be interpreted as an optimization of the cost function by the Newton method. This formulation has the additional benefit that, based on the notion of the generalized median in analogy to Median SOM, a variant for non-vectorial proximity data can be introduced. We prove convergence of batch and median versions of NG, SOM, and k-means in a unified formulation, and we investigate the behavior of the algorithms in several experiments. Comment: In Special Issue after WSOM 05 Conference, 5-8 September, 2005, Pari
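
A batch update of the kind described above can be sketched in a few lines. The sketch assumes the usual neural gas ingredients, which the abstract does not spell out: each data point ranks the prototypes by distance, a rank `k` receives weight `exp(-k / lam)`, and every prototype is replaced by the weighted mean of the data. The function name, the annealing schedule, and the toy data are assumptions for illustration only.

```python
import math


def batch_ng_step(data, prototypes, lam):
    """One batch neural gas step: rank the prototypes for every data point,
    then move each prototype to the rank-weighted mean of the data."""
    dim = len(data[0])
    num = [[0.0] * dim for _ in prototypes]
    den = [0.0] * len(prototypes)
    for x in data:
        dists = [sum((a - b) ** 2 for a, b in zip(x, w)) for w in prototypes]
        # order[0] is the closest prototype (rank 0), order[1] the next, ...
        order = sorted(range(len(prototypes)), key=dists.__getitem__)
        for rank, i in enumerate(order):
            h = math.exp(-rank / lam)  # neighbourhood weight of this rank
            den[i] += h
            for d in range(dim):
                num[i][d] += h * x[d]
    return [[v / den[i] for v in num[i]] for i in range(len(prototypes))]


# Hypothetical toy data: two 1-D groups around 0.1 and 9.9.
data = [[0.0], [0.2], [9.8], [10.0]]
prototypes = [[1.0], [2.0]]

# Assumed annealing schedule: shrink the neighbourhood range step by step.
for lam in [2.0, 1.0, 0.5, 0.1, 0.01]:
    prototypes = batch_ng_step(data, prototypes, lam)
```

On this toy data the two prototypes end up very close to the two group means. Extremely small values of `lam` could make the weights underflow to zero, so a practical implementation would guard the division.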

Publications at Bielefeld University

How to improve robustness in Kohonen maps and display additional information in Factorial Analysis: application to text mining

Author: Bourgeois Nicolas
Cottrell Marie
Déruelle Benjamin
Lamassé Stéphane
Letrémy Patrick
Publication venue: 'Elsevier BV'
Publication date: 23/07/2014

This article is an extended version of a paper presented at the WSOM'2012 conference [1]. We present a combination of factorial projections, the SOM algorithm and graph techniques applied to a text mining problem. The corpus contains 8 medieval manuscripts which were used to teach arithmetic techniques to merchants. Among the techniques for Data Analysis, those used for Lexicometry (such as Factorial Analysis) highlight the discrepancies between manuscripts. The reason for this is that they focus on the deviation from independence between words and manuscripts. Still, we also want to discover and characterize the vocabulary common to the whole corpus. Using the properties of stochastic Kohonen maps, which define neighborhood between inputs in a non-deterministic way, we highlight the words which seem to play a special role in the vocabulary. We call them fickle and use them to improve both Kohonen map robustness and the significance of FCA visualization. Finally, we use graph algorithms to exploit this fickleness for the classification of words.

Some Further Evidence about Magnification and Shape in Neural Gas

Author: Parigi Giacomo
Pedrini Andrea
Piastra Marco
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 28/03/2015

Neural gas (NG) is a robust vector quantization algorithm with a well-known mathematical model. According to this model, neural gas samples the underlying data distribution following a power law with a magnification exponent that depends on data dimensionality only. The effects of shape in the input data distribution, however, are not entirely covered by the NG model, due to the technical difficulties involved. The experimental work described here shows that shape is indeed relevant in determining the overall NG behavior; in particular, some experiments reveal richer and more complex behaviors induced by shape that cannot be explained by the power law alone. Although a more comprehensive analytical model remains to be defined, the evidence collected in these experiments suggests that the NG algorithm has an interesting potential for detecting complex shapes in noisy datasets.
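
For reference, the power law mentioned above is usually written as follows. The abstract itself does not give the formula; the form below is the classical magnification result commonly quoted for neural gas, and the symbols $\rho$ (prototype density), $P$ (data density), $\alpha$ (magnification exponent) and $d$ (data dimensionality) are notation assumed for this note.

```latex
\rho(\mathbf{w}) \;\propto\; P(\mathbf{w})^{\alpha},
\qquad
\alpha = \frac{d}{d + 2}
```

Under this form the exponent depends only on $d$: for example, $d = 2$ gives $\alpha = 1/2$, and $\alpha$ approaches 1 as the dimensionality grows.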

Techniques for clustering gene expression data

Author: Crane Martin
Doolan Padraig
Kerr Gráinne
Ruskin Heather J.
Publication venue: 'Elsevier BV'
Publication date: 01/01/2007

Many clustering techniques have been proposed for the analysis of gene expression data obtained from microarray experiments. However, the choice of suitable method(s) for a given experimental dataset is not straightforward. Common approaches do not translate well and fail to take account of the data profile. This review paper surveys state-of-the-art applications that recognise these limitations and implement procedures to overcome them. It provides a framework for the evaluation of clustering in gene expression analyses. The nature of microarray data is discussed briefly. Selected examples are presented for the clustering methods considered.

DCU Online Research Access Service

Irish Universities

Batch kernel SOM and related Laplacian methods for social network analysis

Author: Bertrand Jouve
Fabrice Rossi
Nathalie Villa
Romain Boulet
Publication date: 01/01/2008

Large graphs are natural mathematical models for describing the structure of the data in a wide variety of fields, such as web mining, social networks, information retrieval and biological networks. For all these applications, automatic tools are required to get a synthetic view of the graph and to reach a good understanding of the underlying problem. In particular, discovering groups of tightly connected vertices and understanding the relations between those groups is very important in practice. This paper shows how a kernel version of the batch Self Organizing Map can be used to achieve these goals via kernels derived from the Laplacian matrix of the graph, especially when it is used in conjunction with more classical methods based on the spectral analysis of the graph. The proposed method is used to explore the structure of a medieval social network modeled through a weighted graph that has been directly built from a large corpus of agrarian contracts.
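
One common way to derive a kernel from the Laplacian of a weighted graph is the heat (diffusion) kernel `exp(-beta * L)`. The sketch below is illustrative only: the abstract does not say which Laplacian-based kernel is used, so the choice of the heat kernel, the parameter `beta`, the truncated Taylor series, and the toy graph are all assumptions. The series is only adequate for small graphs and moderate `beta`.

```python
def laplacian(W):
    """Graph Laplacian L = D - W for a symmetric weight matrix W
    (assumed to have a zero diagonal)."""
    n = len(W)
    return [[(sum(W[i]) if i == j else 0.0) - W[i][j] for j in range(n)]
            for i in range(n)]


def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def heat_kernel(W, beta, terms=40):
    """Approximate K = exp(-beta * L) by a truncated Taylor series."""
    L = laplacian(W)
    n = len(L)
    M = [[-beta * L[i][j] for j in range(n)] for i in range(n)]
    K = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in K]  # current series term, starts at identity
    for k in range(1, terms):
        term = [[v / k for v in row] for row in matmul(term, M)]  # M^k / k!
        K = [[K[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return K


# Hypothetical weighted path graph 0 -(1)- 1 -(2)- 2.
W = [[0.0, 1.0, 0.0],
     [1.0, 0.0, 2.0],
     [0.0, 2.0, 0.0]]
K = heat_kernel(W, beta=0.1)
```

Because every row of the Laplacian sums to zero, each row of the resulting kernel sums to one, which gives a quick sanity check. The matrix `K` could then be passed to any kernel method that works from pairwise similarities.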