Real-time Unsupervised Clustering
In our research program, we are developing machine learning algorithms to enable a mobile robot to build a compact representation of its environment. This requires the processing of each new input to terminate in constant time. Existing machine learning algorithms are either incapable of meeting this constraint or deliver problematic results. In this paper, we describe a new algorithm for real-time unsupervised clustering, Bounded Self-Organizing Clustering. It executes in constant time for each input, and it produces clusterings that are significantly better than those created by the Self-Organizing Map, its closest competitor, on sensor data acquired from a physically embodied mobile robot.
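The published Bounded Self-Organizing Clustering algorithm is not reproduced here, but the constant-time-per-input property it shares with fixed-budget prototype methods can be sketched minimally: with a fixed number of prototypes K, each update costs O(K) no matter how many inputs have been seen.

```python
import numpy as np

def online_cluster_step(prototypes, x, lr=0.1):
    """Move the nearest of K fixed prototypes toward input x.
    Cost is O(K) per input -- constant, independent of how many
    inputs have already been processed. Illustrative sketch only,
    not the published BSOC algorithm."""
    d = np.linalg.norm(prototypes - x, axis=1)   # distance to each prototype
    winner = int(np.argmin(d))                   # best matching unit
    prototypes[winner] += lr * (x - prototypes[winner])  # pull winner toward x
    return winner
```

Because the prototype array never grows, memory and per-input time stay bounded, which is the property the abstract requires of a robot's on-line learner.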
Financial time series analysis with competitive neural networks
The main objective of this Master's thesis is the modelling of non-stationary time series data. While classical statistical models attempt to correct non-stationary data through differencing and de-trending, I attempt to create localized clusters of stationary time series data through the use of the self-organizing map algorithm. While numerous techniques have been developed that model time series using the self-organizing map, I attempt to build a mathematical framework that justifies its use in the forecasting of financial time series.
Additionally, I compare existing forecasting methods using the SOM with those for which a framework has been developed but which have not been applied in a forecasting context. I then compare these methods with the well-known ARIMA method of time series forecasting. The second objective of this thesis is to demonstrate the self-organizing map's ability to cluster data vectors, as it was originally developed as a neural network approach to clustering. Specifically, I will demonstrate its clustering abilities on limit order book data and present various visualization methods of its output.
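The thesis's exact framework is not reproduced here, but the basic idea of SOM-based local forecasting can be sketched: train a small 1-D SOM on lag windows of the series, then forecast the next point as the mean "next value" of the training windows mapped to the unit that wins for the most recent window. All parameters below (window length, number of units, schedules) are illustrative assumptions.

```python
import numpy as np

def som_window_forecast(series, window=3, n_units=5, epochs=40, seed=0):
    """One-step forecast via a 1-D SOM over lag windows (sketch only;
    not the thesis's actual method or parameterization)."""
    rng = np.random.default_rng(seed)
    series = np.asarray(series, dtype=float)
    X = np.array([series[i:i + window] for i in range(len(series) - window)])
    y = series[window:]                            # value following each window
    W = X[rng.choice(len(X), n_units, replace=False)].copy()
    for t in range(epochs):                        # plain online SOM training
        lr = 0.5 * (1 - t / epochs)
        sigma = max(1.5 * (1 - t / epochs), 0.3)   # shrinking neighbourhood width
        for i in rng.permutation(len(X)):
            bmu = int(np.argmin(np.linalg.norm(W - X[i], axis=1)))
            h = np.exp(-(np.arange(n_units) - bmu) ** 2 / (2 * sigma ** 2))
            W += lr * h[:, None] * (X[i] - W)      # neighbourhood-weighted update
    bmu_new = int(np.argmin(np.linalg.norm(W - series[-window:], axis=1)))
    owners = np.array([int(np.argmin(np.linalg.norm(W - x, axis=1))) for x in X])
    mask = owners == bmu_new
    return float(y[mask].mean()) if mask.any() else float(y.mean())
```

Each SOM unit here plays the role of one "localized cluster" of window shapes, with a trivial local predictor (the cluster mean of next values) attached to it.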
A binary self-organizing map and its FPGA implementation
A binary Self-Organizing Map (SOM) has been designed and implemented on a Field Programmable Gate Array (FPGA) chip. A novel learning algorithm that takes binary inputs and maintains tri-state weights is presented. The binary SOM has the capability of recognizing binary input sequences after training. A novel tri-state rule is used to update the network weights during the training phase. The rule's implementation is highly suited to the FPGA architecture and allows extremely rapid training. This architecture may be used in real time for fast pattern clustering and classification of binary features.
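The paper's exact tri-state rule is not given in this abstract; one plausible sketch of the idea is that each weight takes a value in {-1, 0, +1} and steps one level toward the bipolar view of the binary input, saturating at the extremes. Such integer-only state transitions map naturally onto FPGA logic.

```python
import numpy as np

def tristate_update(weights, x):
    """Hedged sketch of a tri-state weight update for binary inputs
    (assumed rule, not the paper's): weights live in {-1, 0, +1};
    each steps one level toward the bipolar input (+1 for bit 1,
    -1 for bit 0) and saturates at +/-1."""
    target = np.where(x == 1, 1, -1)      # bipolar view of the binary input
    step = np.sign(target - weights)      # -1, 0, or +1 per component
    return np.clip(weights + step, -1, 1)
```

Because every transition is a single-level step among three states, the update needs no multipliers, which is consistent with the abstract's claim that the rule suits FPGA hardware and trains rapidly.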
Using SOMbrero for clustering and visualizing complex data
Over the years, the self-organizing map (SOM) algorithm has proven to be a powerful and convenient tool for clustering and visualizing data. While the original algorithm was initially designed for numerical vectors, the data available in applications have become more and more complex, frequently too rich to be described by a fixed set of numerical attributes alone. This is the case, for example, when the data are described by relations between objects (individuals involved in a social network) or by measures of resemblance/dissimilarity. This presentation will illustrate how the SOM algorithm can be used to cluster and visualize complex data such as graphs, categorical time series, or panel data. In particular, it will focus on the use of the R package SOMbrero, which implements an online version of the relational self-organizing map, able to process any dissimilarity data. The package offers many graphical outputs and diagnostic tools, and comes with a user-friendly web graphical interface based on R-Shiny. Several examples on various real-world datasets will be given to highlight the functionalities of the package.
Universidad de Málaga. Campus de Excelencia Internacional Andalucía Tech
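The key device behind relational SOM variants such as the one in SOMbrero is that each prototype is a convex combination of the observations, so observation-to-prototype distances can be computed from the dissimilarity matrix alone, via d(x_i, w_k) = (D a_k)_i - ½ a_k^T D a_k. A minimal NumPy rendering of that formula (SOMbrero itself is an R package):

```python
import numpy as np

def relational_distances(D, alpha):
    """Distances between observations and relational prototypes.
    Each prototype k is a convex combination alpha[k] (rows sum
    to 1) of the observations; D is the n x n pairwise
    dissimilarity matrix. Implements the standard relational-SOM
    formula d(x_i, w_k) = (D @ a_k)_i - 0.5 * a_k^T D a_k."""
    DA = D @ alpha.T                                   # (n, K) cross terms
    self_terms = 0.5 * np.einsum('ki,ij,kj->k', alpha, D, alpha)
    return DA - self_terms[None, :]
```

A quick sanity check: with squared-Euclidean dissimilarities and one-hot coefficient rows (each prototype equal to one observation), the formula returns the dissimilarity matrix itself.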
Detection of Anomalies and Novelties in Time Series with Self-Organizing Networks
This paper introduces the DANTE project: Detection of Anomalies and Novelties in Time sEries with self-organizing networks. The goal of this project is to evaluate several self-organizing networks in the detection of anomalies/novelties in dynamic data patterns. For this purpose, we first describe three standard clustering-based approaches which use well-known self-organizing neural architectures, such as the SOM and the Fuzzy ART algorithms, and then present a novel approach based on the Operator Map (OPM) network. The OPM is a generalization of the SOM where neurons are regarded as temporal filters for dynamic patterns. The OPM is used to build local adaptive filters for a given nonstationary time series. Non-parametric confidence intervals are then computed for the residuals of the local models and used as decision thresholds for detecting novelties/anomalies. Computer simulations are carried out to compare the performances of the aforementioned algorithms.
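The decision step described above can be sketched directly: form a non-parametric interval from empirical quantiles of the local model's training residuals, and flag new residuals outside it. (The project's exact interval construction may differ; the quantile approach below is one common choice.)

```python
import numpy as np

def novelty_flags(train_residuals, new_residuals, alpha=0.05):
    """Flag residuals outside a quantile-based (non-parametric)
    confidence interval fitted on training residuals. Sketch of the
    thresholding idea in the abstract, not DANTE's exact procedure."""
    lo = np.quantile(train_residuals, alpha / 2)       # lower decision threshold
    hi = np.quantile(train_residuals, 1 - alpha / 2)   # upper decision threshold
    r = np.asarray(new_residuals)
    return (r < lo) | (r > hi)                         # True = anomaly/novelty
```

Because the thresholds come from empirical quantiles rather than a Gaussian fit, the method makes no distributional assumption about the residuals, matching the "non-parametric" qualifier in the abstract.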
A Semi-Supervised Self-Organizing Map with Adaptive Local Thresholds
In recent years, there has been growing interest in semi-supervised learning, since in many learning tasks there is a plentiful supply of unlabeled data but insufficient labeled data. Hence, semi-supervised learning models can benefit from both types of data to improve the obtained performance. It is also important to develop methods that are easy to parameterize in a way that is robust to the different characteristics of the data at hand. This article presents a new method based on the Self-Organizing Map (SOM) for clustering and classification, called the Adaptive Local Thresholds Semi-Supervised Self-Organizing Map (ALTSS-SOM). It can dynamically switch between two forms of learning at training time, according to the availability of labels, as in previous models, and can automatically adjust itself to the local variance observed in each data cluster. The results show that the ALTSS-SOM surpasses the performance of other semi-supervised methods in terms of classification, and of pure clustering methods when no labels are available, while also being less sensitive than previous methods to parameter values.
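ALTSS-SOM's exact update rules are not given in this abstract; the mode-switching idea can nonetheless be sketched with an assumed, simplified rule: when no label accompanies the sample, take a plain winner-take-all step, and when a label is present, restrict the winner to units carrying that label, LVQ-style.

```python
import numpy as np

def semisup_step(W, unit_labels, x, y=None, lr=0.1):
    """Illustrative switch between unsupervised and supervised
    updates by label availability (assumed rule, not ALTSS-SOM's)."""
    d = np.linalg.norm(W - x, axis=1)
    if y is None:
        winner = int(np.argmin(d))                    # unsupervised mode
    else:
        candidates = np.flatnonzero(unit_labels == y) # same-label units only
        winner = int(candidates[np.argmin(d[candidates])])
    W[winner] += lr * (x - W[winner])                 # move winner toward x
    return winner
```

The point of the sketch is only the switching logic: the same map absorbs both labeled and unlabeled samples in one training stream, which is what lets such models "benefit from both types of data."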
Applying Cluster Ensemble to Adaptive Tree Structured Clustering
Adaptive tree structured clustering (ATSC) is our proposed divisive hierarchical clustering method that recursively divides a data set into two subsets using the self-organizing feature map (SOM). In each partition, the data set is quantized by the SOM and the quantized data are divided using agglomerative hierarchical clustering. ATSC can divide data sets in feasible time regardless of data size. However, the stability of ATSC's clustering results is as poor as that of other divisive hierarchical and partitional clustering methods. In this paper, we apply a cluster ensemble to each data partition of ATSC in order to improve stability. A cluster ensemble is a framework for improving the stability of partitional clustering. As a result of applying the cluster ensemble, ATSC yields unique clustering results that could not be obtained by previous hierarchical clustering methods, because a different inter-cluster distance function is used in each division.
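A standard way to combine the base partitions in a cluster ensemble is the evidence-accumulation (co-association) matrix; ATSC's exact combination scheme may differ, but the basic object looks like this: entry (i, j) is the fraction of base partitions that put points i and j in the same cluster, and the final split is derived from this matrix rather than from any single unstable run.

```python
import numpy as np

def coassociation(partitions):
    """Co-association matrix of a cluster ensemble (standard
    evidence-accumulation idea, not necessarily ATSC's scheme).
    partitions: (n_runs, n_points) array of cluster labels."""
    P = np.asarray(partitions)
    same = P[:, :, None] == P[:, None, :]   # per-run co-membership (n_runs, n, n)
    return same.mean(axis=0)                # fraction of runs agreeing per pair
```

Averaging over runs smooths out the run-to-run variation that makes a single divisive split unstable, which is precisely the stability problem the paper targets.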
Fast Algorithm and Implementation of Dissimilarity Self-Organizing Maps
In many real-world applications, data cannot be accurately represented by vectors. In those situations, one possible solution is to rely on dissimilarity measures that enable sensible comparisons between observations. Kohonen's Self-Organizing Map (SOM) has been adapted to data described only through their dissimilarity matrix. This algorithm provides both nonlinear projection and clustering of non-vector data. Unfortunately, it suffers from a high cost that makes it quite difficult to use with voluminous data sets. In this paper, we propose a new algorithm that provides an important reduction of the theoretical cost of the dissimilarity SOM without changing its outcome (the results are exactly the same as those obtained with the original algorithm). Moreover, we introduce implementation methods that result in very short running times. Improvements deduced from the theoretical cost model are validated on simulated and real-world data (a word-list clustering problem). We also demonstrate that the proposed implementation methods reduce the running time of the fast algorithm by a factor of up to 3 over a standard implementation.
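The expensive step that such papers accelerate is the prototype update of the dissimilarity ("median") SOM, where each prototype must be an actual observation. The naive baseline, sketched below with the neighbourhood weighting omitted for brevity, scans every observation as a candidate prototype for every unit; the paper's contribution is doing this kind of computation much faster with identical results.

```python
import numpy as np

def median_som_update(D, assign, n_units):
    """Naive prototype update of a dissimilarity SOM (slow baseline
    only; neighbourhood term omitted). Each unit's prototype is the
    observation minimizing total dissimilarity to the unit's members.
    D: (n, n) dissimilarity matrix; assign: unit index per point."""
    protos = np.zeros(n_units, dtype=int)
    for k in range(n_units):
        members = np.flatnonzero(assign == k)
        if members.size == 0:
            continue                           # leave empty units at index 0
        costs = D[:, members].sum(axis=1)      # every observation is a candidate
        protos[k] = int(np.argmin(costs))
    return protos
```

Note that the candidate scan touches the full dissimilarity matrix on each pass, which is why the cost grows quickly with data set size and why the abstract emphasizes reducing it without changing the outcome.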