Somoclu: An Efficient Parallel Library for Self-Organizing Maps
Somoclu is a massively parallel tool for training self-organizing maps on
large data sets written in C++. It builds on OpenMP for multicore execution,
and on MPI for distributing the workload across the nodes in a cluster. It is
also able to boost training by using CUDA if graphics processing units are
available. A sparse kernel is included, which is useful for high-dimensional
but sparse data, such as the vector spaces common in text mining workflows.
Python, R and MATLAB interfaces facilitate interactive use. Apart from fast
execution, memory use is highly optimized, enabling training large emergent
maps even on a single computer.
Comment: 26 pages, 9 figures. The code is available at
https://peterwittek.github.io/somoclu
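The dense and sparse kernels described above all implement the same core update rule. A minimal serial NumPy sketch of that loop (not Somoclu's code, only the computation it parallelizes; the grid size, decay schedule, and epoch count here are illustrative assumptions):

```python
import numpy as np

def train_som(data, n_rows=6, n_cols=6, epochs=10, lr0=0.5, sigma0=3.0, seed=0):
    """Minimal serial SOM trainer. data: (n_samples, dim) array.
    Returns a codebook of shape (n_rows * n_cols, dim)."""
    rng = np.random.default_rng(seed)
    n_units = n_rows * n_cols
    codebook = rng.standard_normal((n_units, data.shape[1]))
    # Grid coordinates of each unit, used by the neighborhood function.
    grid = np.array([(r, c) for r in range(n_rows) for c in range(n_cols)],
                    dtype=float)
    for epoch in range(epochs):
        lr = lr0 * (1.0 - epoch / epochs)               # decaying learning rate
        sigma = sigma0 * (1.0 - epoch / epochs) + 0.5   # shrinking radius
        for x in data:
            bmu = np.argmin(((codebook - x) ** 2).sum(axis=1))  # best-matching unit
            d2 = ((grid - grid[bmu]) ** 2).sum(axis=1)          # grid distance to BMU
            h = np.exp(-d2 / (2.0 * sigma ** 2))                # Gaussian neighborhood
            codebook += lr * h[:, None] * (x - codebook)        # pull units toward x
    return codebook

# Tiny demo: two well-separated Gaussian blobs in 2D.
rng = np.random.default_rng(1)
data = np.vstack([rng.normal(-3, 0.3, (50, 2)), rng.normal(3, 0.3, (50, 2))])
codebook = train_som(data)
```

The inner loop over samples is what the OpenMP, MPI, and CUDA backends distribute; the per-sample BMU search and codebook update are independent enough to batch across workers.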
XPySom: High-performance self-organizing maps
In this paper, we introduce XPySom, a new open-source Python implementation of the well-known Self-Organizing Map (SOM) technique. It is designed to achieve high performance on a single node, exploiting widely available Python libraries for vector processing on multi-core CPUs and GP-GPUs. We present results from an extensive experimental evaluation of XPySom against widely used open-source SOM implementations, showing that it outperforms the available alternatives. Our experiments on the Extended MNIST open data set show speed-ups of about 7x and 100x over the best open-source multi-core implementations we could find, with multi-core and GP-GPU acceleration respectively, while achieving the same accuracy in terms of quantization error.
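The quantization error used as the accuracy metric above has a simple definition: the mean Euclidean distance from each sample to its best-matching codebook vector. A hedged NumPy sketch (the toy codebook and data are illustrative, not XPySom's API):

```python
import numpy as np

def quantization_error(data, codebook):
    """Mean Euclidean distance from each sample to its best-matching unit.

    The vectorized pairwise-distance matrix below is also the kind of bulk
    array operation that SOM implementations push onto multi-core CPUs or
    GP-GPUs instead of looping per sample.
    """
    # (n_samples, n_units) matrix of pairwise distances, computed in one shot.
    dists = np.linalg.norm(data[:, None, :] - codebook[None, :, :], axis=2)
    return dists.min(axis=1).mean()

# Toy check: samples that coincide with codebook vectors contribute zero error.
codebook = np.array([[0.0, 0.0], [1.0, 1.0]])
data = np.array([[0.0, 0.0], [1.0, 1.0], [1.0, 0.0]])
qe = quantization_error(data, codebook)  # two exact hits plus one at distance 1
```

Because the metric depends only on the final codebook, it lets differently parallelized implementations be compared for accuracy independently of how they schedule the training updates.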
Neural networks for handwritten digit recognition
In the presented work, a Hopfield neural network was constructed for recognizing the handwritten digit patterns contained in the MNIST database. Ten Hopfield neural networks were built, one for each digit. The centers of the clusters built with a Kohonen neural network were taken as the objects to "memorize". Two methods were proposed for the storage step in the Hopfield neural network; an analysis of these methods was carried out, the error of each method was calculated, and the pros and cons of their use were identified. Clustering of the handwritten digits from the MNIST training sample is performed with a Kohonen neural network, selecting an optimal number of clusters (not exceeding 50) for each digit and using the Euclidean norm as the metric. The network is trained both by a serial algorithm on the CPU and by a parallel algorithm on the GPU using CUDA technology. Graphs of the time spent training the neural network for each digit are given, and the serial and parallel training times are compared: on average, CUDA accelerates training almost 17-fold. Digits from the MNIST test sample are used to evaluate the clustering accuracy; more than 90% of the test vectors fall into the correct cluster for each digit. The F-measure is calculated for each digit: the best values are obtained for the digits 0 and 1 (F-measure 0.974), whereas the worst value is obtained for the digit 9 (F-measure 0.903). The introduction briefly describes the content of the work, the research currently available, and the relevance of this work; this is followed by the problem statement and the technologies used. The first chapter covers the theoretical aspects and how each stage of the work is solved. The second chapter contains a description of the program and the results obtained, including the parallelization of the Kohonen network's learning algorithm. In the third chapter, the software is tested; the results are the recognition response of each neural network (the stored image most similar to the input image) and the overall recognition percentage for each network.
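The "memorization" step described above can be sketched with the classic Hebbian outer-product rule, with binarized ±1 vectors standing in for the Kohonen cluster centers. This is a generic Hopfield sketch under those assumptions, not the paper's specific storage methods:

```python
import numpy as np

def hopfield_store(patterns):
    """Hebbian rule: W = (1/n) * sum_p x_p x_p^T, with zero diagonal.
    patterns: (n_patterns, n) array of ±1 values."""
    n = patterns.shape[1]
    W = patterns.T @ patterns / n
    np.fill_diagonal(W, 0.0)  # no self-connections
    return W

def hopfield_recall(W, x, steps=10):
    """Synchronous sign updates until a fixed point (or a step limit)."""
    for _ in range(steps):
        nxt = np.where(W @ x >= 0, 1.0, -1.0)
        if np.array_equal(nxt, x):
            break
        x = nxt
    return x

# Store two random ±1 patterns (stand-ins for binarized cluster centers)
# and recover one of them from a slightly corrupted probe.
rng = np.random.default_rng(0)
p1 = np.where(rng.random(64) < 0.5, 1.0, -1.0)
p2 = np.where(rng.random(64) < 0.5, 1.0, -1.0)
W = hopfield_store(np.stack([p1, p2]))
probe = p1.copy()
probe[:5] *= -1.0  # flip 5 of the 64 bits
recovered = hopfield_recall(W, probe)
```

With only a few stored patterns relative to the vector length, a mildly corrupted probe settles back onto the nearest stored pattern, which is exactly the "most similar stored image" response the abstract reports per network.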
A GPU-Accelerated Algorithm for Self-Organizing Maps in a Distributed Environment.
In this paper we introduce a MapReduce-based implementation of self-organizing maps that performs compute-bound operations on distributed GPUs. The kernels are optimized to ensure coalesced memory access and effective use of shared memory. We have performed extensive tests of our algorithms on a cluster of eight nodes with two NVIDIA Tesla M2050 GPUs attached to each, and we achieve a 10x speedup for self-organizing maps over a distributed CPU algorithm.
Sponsorship: Amazon Web Services
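Batch SOM training decomposes naturally into MapReduce form: each mapper accumulates neighborhood-weighted sums over its data shard, and the reducer adds the partials and normalizes. A serial NumPy sketch of that decomposition (shard count, grid size, and neighborhood width are illustrative; the real system runs the map step on GPUs):

```python
import numpy as np

def map_step(shard, codebook, grid, sigma=1.0):
    """Per-shard work (one node/GPU in the distributed version): find each
    sample's BMU and accumulate neighborhood-weighted numerators/denominators."""
    num = np.zeros_like(codebook)
    den = np.zeros(len(codebook))
    for x in shard:
        bmu = np.argmin(((codebook - x) ** 2).sum(axis=1))
        h = np.exp(-((grid - grid[bmu]) ** 2).sum(axis=1) / (2 * sigma ** 2))
        num += h[:, None] * x
        den += h
    return num, den

def reduce_step(partials):
    """Sum the per-shard partials and form the new codebook."""
    num = sum(p[0] for p in partials)
    den = sum(p[1] for p in partials)
    den = np.where(den > 0, den, 1.0)  # guard against empty units
    return num / den[:, None]

# One batch epoch over two shards, as the map/reduce pipeline would run it.
rng = np.random.default_rng(0)
data = rng.normal(size=(100, 2))
grid = np.array([(r, c) for r in range(3) for c in range(3)], dtype=float)
codebook = rng.normal(size=(9, 2))
shards = np.array_split(data, 2)
new_codebook = reduce_step([map_step(s, codebook, grid) for s in shards])

# Sharding must give the same result as a single pass over all the data.
single = reduce_step([map_step(data, codebook, grid)])
```

Because the batch update is a pure sum over samples, the sharded and single-pass results agree up to floating-point rounding, which is what makes the algorithm safe to distribute across nodes.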
PERICLES Deliverable 4.3:Content Semantics and Use Context Analysis Techniques
The current deliverable summarises the work conducted within task T4.3 of WP4, focusing on the extraction and the subsequent analysis of semantic information from digital content, which is imperative for its preservability. More specifically, the deliverable defines content semantic information from a visual and textual perspective, explains how this information can be exploited in long-term digital preservation and proposes novel approaches for extracting this information in a scalable manner. Additionally, the deliverable discusses novel techniques for retrieving and analysing the context of use of digital objects. Although this topic has not been extensively studied by existing literature, we believe use context is vital in augmenting the semantic information and maintaining the usability and preservability of the digital objects, as well as their ability to be accurately interpreted as initially intended.