Fast Algorithm and Implementation of Dissimilarity Self-Organizing Maps
In many real world applications, data cannot be accurately represented by
vectors. In those situations, one possible solution is to rely on dissimilarity
measures that enable sensible comparison between observations. Kohonen's
Self-Organizing Map (SOM) has been adapted to data described only through their
dissimilarity matrix. This algorithm provides both nonlinear projection and
clustering of non-vector data. Unfortunately, the algorithm suffers from a high
computational cost that makes it difficult to use with large data sets. In this
paper, we propose a new algorithm that substantially reduces the
theoretical cost of the dissimilarity SOM without changing its outcome (the
results are exactly the same as the ones obtained with the original algorithm).
Moreover, we introduce implementation methods that result in very short running
times. Improvements deduced from the theoretical cost model are validated on
simulated and real world data (a word list clustering problem). We also
demonstrate that the proposed implementation methods reduce the running time of
the fast algorithm by a factor of up to 3 compared with a standard implementation.
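As a rough illustration of how the prototype update of a dissimilarity SOM can be made cheaper without changing its result, the sketch below compares a naive update with one that first accumulates per-cluster partial sums. It is a minimal sketch under stated assumptions, not the paper's implementation: the function names, the 1-D map, the toy neighborhood weights and the sample data are all hypothetical. With N observations and M units, the naive update costs on the order of M·N² operations, while the partial-sum variant costs on the order of N² + M²·N.

```python
def neighborhood(u, j):
    """Toy neighborhood on a 1-D map (an assumption): 1 for the unit itself,
    0.5 for its direct neighbors, 0 otherwise."""
    gap = abs(u - j)
    return 1.0 if gap == 0 else 0.5 if gap == 1 else 0.0


def assign(D, prototypes):
    """Assign each observation to the unit whose prototype is least dissimilar."""
    return [min(range(len(prototypes)), key=lambda j: D[i][prototypes[j]])
            for i in range(len(D))]


def update_naive(D, clusters, n_units):
    """For each unit, pick the observation k that minimizes the
    neighborhood-weighted sum of dissimilarities to all observations."""
    n = len(D)
    new_prototypes = []
    for j in range(n_units):
        costs = [sum(neighborhood(clusters[i], j) * D[i][k] for i in range(n))
                 for k in range(n)]
        new_prototypes.append(min(range(n), key=costs.__getitem__))
    return new_prototypes


def update_partial_sums(D, clusters, n_units):
    """Same minimization, but the inner sums are first accumulated per cluster."""
    n = len(D)
    # S[u][k] = sum of D[i][k] over the observations i assigned to unit u
    S = [[0.0] * n for _ in range(n_units)]
    for i in range(n):
        row = S[clusters[i]]
        for k in range(n):
            row[k] += D[i][k]
    new_prototypes = []
    for j in range(n_units):
        costs = [sum(neighborhood(u, j) * S[u][k] for u in range(n_units))
                 for k in range(n)]
        new_prototypes.append(min(range(n), key=costs.__getitem__))
    return new_prototypes


# Hypothetical data: points on a line, described only by pairwise dissimilarities.
xs = [0, 1, 3, 6, 7, 10]
D = [[abs(a - b) for b in xs] for a in xs]
prototypes = [0, 2, 5]  # prototypes are indices of observations
clusters = assign(D, prototypes)
print(clusters)
print(update_naive(D, clusters, 3), update_partial_sums(D, clusters, 3))
```

Both update functions minimize the same quantity, so they select the same prototypes; only the order in which the sums are accumulated differs.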
Computing Exact Clustering Posteriors with Subset Convolution
An exponential-time exact algorithm is provided for the task of clustering n
items of data into k clusters. Instead of seeking one partition, posterior
probabilities are computed for summary statistics: the number of clusters, and
pairwise co-occurrence. The method is based on subset convolution, and yields
the posterior distribution for the number of clusters in O(n * 3^n) operations,
or O(n^3 * 2^n) using fast subset convolution. Pairwise co-occurrence
probabilities are then obtained in O(n^3 * 2^n) operations. This is
considerably faster than exhaustive enumeration of all partitions. Comment: 6 figures
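To make the subset convolution mentioned above concrete, here is a minimal sketch of the direct version, which visits every pair (T, S) with T a subset of S and therefore uses on the order of 3^n operations. It is not the paper's code: the function names are hypothetical, and the weight function used in the demonstration (1 for every non-empty subset) is a stand-in for per-cluster weights, chosen so that the result simply counts partitions into k clusters. The fast subset convolution variant is not shown.

```python
from math import factorial


def subset_convolution(f, g, n):
    """Direct subset convolution over an n-element ground set.

    Subsets are encoded as bitmasks; h[S] is the sum of f[T] * g[S minus T]
    over all subsets T of S.
    """
    size = 1 << n
    h = [0] * size
    for S in range(size):
        T = S
        while True:  # enumerate every submask T of S, ending with the empty set
            h[S] += f[T] * g[S ^ T]
            if T == 0:
                break
            T = (T - 1) & S
    return h


def count_partitions(n, k):
    """Count the partitions of n items into exactly k non-empty clusters."""
    size = 1 << n
    f = [0] + [1] * (size - 1)  # assumed weight: 1 for each non-empty subset
    h = f
    for _ in range(k - 1):
        h = subset_convolution(h, f, n)
    # The k-fold convolution counts ordered partitions; divide by k! to unorder them.
    return h[size - 1] // factorial(k)


print([count_partitions(4, k) for k in range(1, 5)])
```

Replacing the constant weight with a likelihood term per candidate cluster turns the same recurrence into a sum over weighted partitions, which is the kind of quantity a posterior over the number of clusters is built from.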
Analysis of Data Clusters Obtained by Self-Organizing Methods
Self-organizing methods were used to investigate the financial
market. As an example, we consider time series of the Dow Jones index for the
years 2002-2003 (R. Mantegna, cond-mat/9802256). In order to reveal new
structures in the stock market behavior of the companies that make up the Dow
Jones index, we apply SOM (Self-Organizing Maps) and GMDH (Group Method of Data
Handling) algorithms. Using SOM techniques, we obtain SOM maps that establish a
new relationship in market structure. The obtained clusters were analyzed with
GMDH. Comment: 10 pages, 4 figures
How to use the Kohonen algorithm to simultaneously analyse individuals in a survey
The Kohonen algorithm (SOM, Kohonen, 1984, 1995) is a very powerful tool for
data analysis. It was originally designed to model organized connections
between some biological neural networks. It was also immediately recognized as
a very good algorithm for vector quantization and, at the same time,
for pertinent classification, with nice properties for visualization. If the
individuals are described by quantitative variables (ratios, frequencies,
measurements, amounts, etc.), the straightforward application of the original
algorithm builds code vectors and associates with each of them the
class of all the individuals that are more similar to this code vector than to
the others. However, for individuals described by categorical (qualitative)
variables having a finite number of modalities (like in a survey), it is
necessary to define a specific algorithm. In this paper, we present a new
algorithm inspired by the SOM algorithm, which provides a simultaneous
classification of the individuals and of their modalities. Comment: Special issue ESANN 0
Self-Organizing Time Map: An Abstraction of Temporal Multivariate Patterns
This paper adopts and adapts Kohonen's standard Self-Organizing Map (SOM) for
exploratory temporal structure analysis. The Self-Organizing Time Map (SOTM)
applies SOM-type learning to one-dimensional arrays for individual time
units, preserves the orientation with short-term memory and arranges the arrays
in ascending order of time. The two-dimensional representation of the SOTM
thus attempts twofold topology preservation, where the horizontal direction
preserves time topology and the vertical direction data topology. This enables
discovering the occurrence and exploring the properties of temporal structural
changes in data. To represent the qualities and properties of SOTMs, we adapt
measures and visualizations from the standard SOM paradigm and
introduce a measure of temporal structural changes. The functioning of the
SOTM, along with its visualizations and quality and property measures, is
illustrated on artificial toy data. The usefulness of the SOTM in a real-world
setting is shown on poverty, welfare and development indicators.