Somoclu: An Efficient Parallel Library for Self-Organizing Maps
Somoclu is a massively parallel tool for training self-organizing maps on
large data sets written in C++. It builds on OpenMP for multicore execution,
and on MPI for distributing the workload across the nodes in a cluster. It is
also able to boost training by using CUDA if graphics processing units are
available. A sparse kernel is included, which is useful for high-dimensional
but sparse data, such as the vector spaces common in text mining workflows.
Python, R and MATLAB interfaces facilitate interactive use. Apart from fast
execution, memory use is highly optimized, enabling training large emergent
maps even on a single computer.
Comment: 26 pages, 9 figures. The code is available at
https://peterwittek.github.io/somoclu
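Somoclu's interfaces wrap a parallel implementation, but what they parallelize is the standard SOM training loop. A minimal pure-NumPy sketch of that loop (this is not Somoclu's code; the map size, linear cooling schedules, and toy data are illustrative assumptions):

```python
import numpy as np

def train_som(data, n_rows=8, n_cols=8, epochs=10,
              radius0=3.0, scale0=0.1, seed=0):
    """Plain online SOM training; a parallel library distributes steps like these."""
    rng = np.random.default_rng(seed)
    dim = data.shape[1]
    codebook = rng.random((n_rows * n_cols, dim))
    # Grid coordinates of each map unit, used by the neighborhood function.
    grid = np.array([(r, c) for r in range(n_rows)
                     for c in range(n_cols)], dtype=float)
    n_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in data:
            t = step / n_steps                       # progress in [0, 1)
            radius = radius0 * (1.0 - t) + 1.0 * t   # linear radius cooling
            scale = scale0 * (1.0 - t)               # linear learning-rate decay
            # Best matching unit: codebook vector closest to the input.
            bmu = int(np.argmin(((codebook - x) ** 2).sum(axis=1)))
            d2 = ((grid - grid[bmu]) ** 2).sum(axis=1)
            h = np.exp(-d2 / (2.0 * radius ** 2))    # Gaussian neighborhood
            codebook += scale * h[:, None] * (x - codebook)
            step += 1
    return codebook

data = np.random.default_rng(1).random((200, 5))
codebook = train_som(data)
```

The BMU search and the neighborhood-weighted update over all units are the data-parallel hot spots that OpenMP, MPI, or CUDA can accelerate.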
Fast training of self organizing maps for the visual exploration of molecular compounds
Visual exploration of scientific data in the life sciences is a growing
research field due to the large amount of available data. Kohonen's
Self-Organizing Map (SOM) is a widely used tool for the visualization of
multidimensional data.
In this paper we present a fast learning algorithm for SOMs
that uses a simulated annealing method to adapt the learning
parameters. The algorithm has been adopted in a data analysis
framework for the generation of similarity maps. Such maps
provide an effective tool for the visual exploration of large and
multi-dimensional input spaces. The approach has been applied
to data generated during the High Throughput Screening
of molecular compounds; the generated maps allow a visual
exploration of molecules with similar topological properties.
The experimental analysis on real-world data from the
National Cancer Institute shows the speed-up of the proposed
SOM training process in comparison to a traditional approach.
The resulting visual landscape groups molecules with similar
chemical properties in densely connected regions.
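The abstract does not reproduce the paper's annealing schedule, so the following is only a hypothetical sketch of the general idea: tie the SOM's learning rate and neighborhood radius to a geometrically cooled temperature, so that large corrections happen early and the map settles as the temperature drops. All parameter names and values are invented for illustration:

```python
import numpy as np

def som_epoch(codebook, grid, data, scale, radius):
    """One online SOM pass; returns the mean quantization error."""
    err = 0.0
    for x in data:
        d2 = ((codebook - x) ** 2).sum(axis=1)
        bmu = int(np.argmin(d2))
        err += np.sqrt(d2[bmu])
        g2 = ((grid - grid[bmu]) ** 2).sum(axis=1)
        h = np.exp(-g2 / (2.0 * radius ** 2))
        codebook += scale * h[:, None] * (x - codebook)
    return err / len(data)

def annealed_som(data, n_rows=6, n_cols=6, t0=1.0, cooling=0.8,
                 t_min=0.05, scale0=0.5, radius0=3.0, seed=0):
    rng = np.random.default_rng(seed)
    codebook = rng.random((n_rows * n_cols, data.shape[1]))
    grid = np.array([(r, c) for r in range(n_rows)
                     for c in range(n_cols)], dtype=float)
    t = t0
    errors = []
    while t > t_min:
        # Learning rate and radius are tied to the current "temperature";
        # geometric cooling shrinks both as training proceeds.
        errors.append(som_epoch(codebook, grid, data,
                                scale=scale0 * t,
                                radius=1.0 + (radius0 - 1.0) * t))
        t *= cooling
    return codebook, errors

data = np.random.default_rng(2).random((100, 3))
codebook, errors = annealed_som(data)
```

Because the cooled schedule lets the map converge in fewer, more aggressive epochs, this is one plausible source of the speed-up the paper reports over a fixed decay schedule.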
A binary self-organizing map and its FPGA implementation
A binary Self-Organizing Map (SOM) has been designed and
implemented on a Field Programmable Gate Array (FPGA) chip. A novel learning algorithm that takes binary inputs and maintains tri-state weights is presented. The binary SOM has the capability of recognizing binary input sequences after training. A novel tri-state rule is used to update the network weights during the training phase. The rule's implementation is highly suited to the FPGA architecture and allows extremely rapid training. This architecture may be used in real time for fast pattern clustering and classification of binary features.
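The abstract does not give the tri-state rule itself, so the following is only an illustrative guess at how such a scheme could look: inputs encoded as ±1, weights restricted to {-1, 0, +1} (with 0 acting as "don't care"), and each update stepping the winning unit's weights one state toward the input. None of this is taken from the paper:

```python
import numpy as np

def train_binary_som(patterns, n_units=4, epochs=5):
    """Sketch of a binary SOM: ±1 inputs, tri-state {-1, 0, +1} weights."""
    dim = patterns.shape[1]
    # Start every weight in the neutral "don't care" state 0.
    weights = np.zeros((n_units, dim), dtype=int)
    for _ in range(epochs):
        for x in patterns:
            # Best matching unit: largest correlation with the input;
            # in hardware this is a popcount, not a multiply.
            bmu = int(np.argmax(weights @ x))
            # Tri-state update: step each weight one state toward the input.
            weights[bmu] += np.sign(x - weights[bmu]).astype(int)
    return weights

# Binary sequences encoded as ±1 vectors (hypothetical toy data).
patterns = np.array([[1, 1, -1, -1],
                     [-1, -1, 1, 1]])
weights = train_binary_som(patterns, n_units=2)
```

After training, each committed unit's weights equal one of the input patterns, so recognition reduces to a correlation threshold, which is the kind of operation that maps cheaply onto FPGA logic.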
Batch and median neural gas
Neural Gas (NG) constitutes a very robust clustering algorithm given
Euclidean data which does not suffer from the problem of local minima like
simple vector quantization, or topological restrictions like the
self-organizing map. Based on the cost function of NG, we introduce a batch
variant of NG which shows much faster convergence and which can be interpreted
as an optimization of the cost function by the Newton method. This formulation
has the additional benefit that, based on the notion of the generalized median
in analogy to Median SOM, a variant for non-vectorial proximity data can be
introduced. We prove convergence of batch and median versions of NG, SOM, and
k-means in a unified formulation, and we investigate the behavior of the
algorithms in several experiments.Comment: In Special Issue after WSOM 05 Conference, 5-8 september, 2005, Pari
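The batch NG variant replaces online steps with rank-weighted means: each prototype is set to w_i = Σ_j h_λ(k_ij) x_j / Σ_j h_λ(k_ij), where k_ij is the rank of prototype i among all prototypes by distance to data point j, and h_λ(k) = exp(-k/λ). A NumPy sketch of that update (the annealing schedule and parameters are illustrative choices, not the paper's):

```python
import numpy as np

def batch_neural_gas(data, n_prototypes=4, epochs=20,
                     lambda0=2.0, lambda_final=0.1, seed=0):
    """Batch NG: each prototype becomes a rank-weighted mean of all data."""
    rng = np.random.default_rng(seed)
    w = data[rng.choice(len(data), n_prototypes, replace=False)].copy()
    for t in range(epochs):
        # Anneal the neighborhood range lambda from lambda0 to lambda_final.
        lam = lambda0 * (lambda_final / lambda0) ** (t / (epochs - 1))
        # Squared distance of every prototype to every point, then the rank
        # of each prototype per point: 0 = closest, 1 = second closest, ...
        d = ((w[:, None, :] - data[None, :, :]) ** 2).sum(axis=2)
        ranks = np.argsort(np.argsort(d, axis=0), axis=0)
        h = np.exp(-ranks / lam)                     # neighborhood weights
        # Batch update: w_i = sum_j h(k_ij) x_j / sum_j h(k_ij)
        w = (h @ data) / h.sum(axis=1, keepdims=True)
    return w

rng = np.random.default_rng(3)
data = np.vstack([rng.normal(0.0, 0.1, (50, 2)),
                  rng.normal(3.0, 0.1, (50, 2))])
prototypes = batch_neural_gas(data, n_prototypes=2)
```

As λ shrinks toward zero, h_λ concentrates on rank 0 and the update degenerates to the k-means batch step, which is exactly the unified view of batch NG, SOM, and k-means the paper exploits in its convergence proof.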
Mining Dynamic Document Spaces with Massively Parallel Embedded Processors
Currently Océ investigates future document management services. One of these services is accessing dynamic document spaces, i.e. improving access to document spaces which are frequently updated (like newsgroups). This process is rather computationally intensive. This paper describes the research conducted on software development for massively parallel processors. A prototype has been built which processes streams of information from specified newsgroups and transforms them into personal information maps. Although this technology does speed up the training part compared to a general-purpose processor implementation, its real benefits emerge with larger problem dimensions because of the scalable approach. It is recommended to improve the quality of the map as well as the visualisation, and to better profile the performance of the other parts of the pipeline, i.e. feature extraction and visualisation.
ART and ARTMAP Neural Networks for Applications: Self-Organizing Learning, Recognition, and Prediction
ART and ARTMAP neural networks for adaptive recognition and prediction have been applied to a variety of problems. Applications include parts design retrieval at the Boeing Company, automatic mapping from remote sensing satellite measurements, medical database prediction, and robot vision. This chapter features a self-contained introduction to ART and ARTMAP dynamics and a complete algorithm for applications. Computational properties of these networks are illustrated by means of remote sensing and medical database examples. The basic ART and ARTMAP networks feature winner-take-all (WTA) competitive coding, which groups inputs into discrete recognition categories. WTA coding in these networks enables fast learning, which allows the network to encode important rare cases but may lead to inefficient category proliferation with noisy training inputs. This problem is partially solved by ART-EMAP, which uses WTA coding for learning but distributed category representations for test-set prediction. In medical database prediction problems, which often feature inconsistent training input predictions, the ARTMAP-IC network further improves ARTMAP performance with distributed prediction, category instance counting, and a new search algorithm. A recently developed family of ART models (dART and dARTMAP) retains stable coding, recognition, and prediction, but allows arbitrarily distributed category representation during learning as well as performance.
National Science Foundation (IRI 94-01659, SBR 93-00633); Office of Naval Research (N00014-95-1-0409, N00014-95-0657)
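The WTA competitive coding and vigilance-driven search that the chapter introduces can be sketched compactly. This is a simplified ART1-style fragment, not the chapter's complete algorithm; the choice parameter, vigilance value, and toy inputs are illustrative assumptions:

```python
import numpy as np

def art1_learn(inputs, rho=0.7, beta=0.5, max_categories=10):
    """Minimal ART1-style sketch: WTA choice, vigilance test, fast learning."""
    weights = []                              # committed category templates
    labels = []
    for I in inputs:
        # Choice function for every committed category (winner-take-all).
        scores = [np.logical_and(I, w).sum() / (beta + w.sum())
                  for w in weights]
        order = np.argsort(scores)[::-1]      # search best-matching first
        chosen = None
        for j in order:
            match = np.logical_and(I, weights[j]).sum() / I.sum()
            if match >= rho:                  # vigilance test passed
                chosen = int(j)
                break                         # otherwise reset, try next
        if chosen is not None:
            # Fast learning: template becomes its intersection with the input.
            weights[chosen] = np.logical_and(I, weights[chosen]).astype(int)
        elif len(weights) < max_categories:
            weights.append(I.astype(int).copy())  # commit a new category
            chosen = len(weights) - 1
        # If capacity is exhausted, the input stays unlabeled (None).
        labels.append(chosen)
    return weights, labels

inputs = np.array([[1, 1, 0, 0],
                   [0, 0, 1, 1],
                   [1, 1, 0, 0]])
weights, labels = art1_learn(inputs)
```

The vigilance parameter rho controls category granularity: raising it forces finer, more numerous categories, which is the mechanism behind the category-proliferation trade-off the abstract describes.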