
    Somoclu: An Efficient Parallel Library for Self-Organizing Maps

    Somoclu is a massively parallel tool, written in C++, for training self-organizing maps on large data sets. It builds on OpenMP for multicore execution and on MPI for distributing the workload across the nodes of a cluster. It can also boost training with CUDA when graphics processing units are available. A sparse kernel is included, which is useful for high-dimensional but sparse data, such as the vector spaces common in text mining workflows. Python, R and MATLAB interfaces facilitate interactive use. Apart from fast execution, memory use is highly optimized, enabling the training of large emergent maps even on a single computer. Comment: 26 pages, 9 figures. The code is available at https://peterwittek.github.io/somoclu
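
    A minimal sketch of such interactive use through the Python interface (assuming the somoclu package from PyPI, with the Somoclu class, train(), codebook and bmus as documented on the project page; map size, data and epoch count are illustrative):

        import numpy as np
        import somoclu

        # Toy data: 1000 random 50-dimensional samples; the library expects float32.
        data = np.random.rand(1000, 50).astype(np.float32)

        # A 30x20 map; training runs in the C++/OpenMP core, or on CUDA if built with it.
        som = somoclu.Somoclu(30, 20)
        som.train(data, epochs=10)

        print(som.codebook.shape)   # trained codebook weights
        print(som.bmus[:5])         # best-matching units of the first few samples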

    XPySom: High-performance self-organizing maps

    In this paper, we introduce XPySom, a new open-source Python implementation of the well-known Self-Organizing Map (SOM) technique. It is designed to achieve high performance on a single node, exploiting widely available Python libraries for vector processing on multi-core CPUs and GP-GPUs. We present results from an extensive experimental evaluation of XPySom in comparison to widely used open-source SOM implementations, showing that it outperforms the available alternatives. Our experiments on the Extended MNIST open data set show a speed-up of about 7x with multi-core acceleration and about 100x with GP-GPU acceleration over the best open-source multi-core implementation we could find, while achieving the same accuracy in terms of quantization error.
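
    The portability between multi-core CPUs and GP-GPUs described above typically comes from expressing the whole computation as array operations, so the same code can run on NumPy or, with CuPy swapped in, on a GPU. The sketch below illustrates that style for the best-matching-unit search and the quantization error used as the accuracy metric; it is not XPySom's code, and all names and sizes are made up:

        import numpy as np
        # import cupy as np   # the same array code is intended to run on a GP-GPU via CuPy

        def bmus(X, W):
            """Best-matching unit of each sample: X is (n, d) data, W is (m, d) codebook."""
            # ||x - w||^2 = ||x||^2 + ||w||^2 - 2 x.w, computed with one matrix product.
            d2 = (X**2).sum(1)[:, None] + (W**2).sum(1)[None, :] - 2.0 * X @ W.T
            return d2.argmin(axis=1)

        def quantization_error(X, W):
            """Mean distance of each sample to its best-matching codebook vector."""
            idx = bmus(X, W)
            return float(np.linalg.norm(X - W[idx], axis=1).mean())

        rng = np.random.default_rng(0)
        X = rng.random((10_000, 784), dtype=np.float32)   # stand-in for flattened EMNIST images
        W = rng.random((20 * 20, 784), dtype=np.float32)  # codebook of a 20x20 map
        print(quantization_error(X, W))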

    Using neural networks for handwritten digit recognition

    In the presented work, a Hopfield neural network was constructed for recognizing handwritten digit patterns contained in the MNIST database. Ten Hopfield neural networks were built, one for each digit. The centers of the clusters built with a Kohonen neural network were taken as the objects to be "memorized". Two methods were proposed for a supporting step in the Hopfield neural network, and an analysis of these methods was carried out. The error of each method was calculated, and the pros and cons of their use were identified. Handwritten digits from the MNIST training sample are clustered using a Kohonen neural network. The optimal number of clusters (not exceeding 50) is selected for each digit, and the Euclidean norm is used as the metric of the Kohonen network. The network is trained with a serial algorithm on the CPU and with a parallel algorithm on the GPU using CUDA. Graphs of the training time for each digit are given, and the serial and parallel training times are compared; the average speed-up of training with CUDA is found to be almost 17-fold.
    Digits from the MNIST test sample are used to evaluate the clustering accuracy. For each digit, more than 90% of the test vectors fall into the correct cluster. The F-measure is calculated for each digit; the best values are obtained for the digits 0 and 1 (F-measure 0.974) and the worst for the digit 9 (F-measure 0.903). The introduction briefly describes the content of the work, the existing research, and the relevance of the work, followed by a statement of the problem and the technologies used. The first chapter covers the theoretical aspects and how each stage of the work is solved. The second chapter describes the implemented software and the results obtained, including the parallelization of the Kohonen network learning algorithm. In the third chapter, the software is tested; the results are the recognition response of each neural network (the stored image most similar to the input image) and the overall recognition percentage for each network.
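
    A compact illustration of the memorization scheme described above (not the authors' code): cluster centers, such as those produced by a Kohonen network, are binarized to ±1 and stored in a Hopfield network with the Hebbian rule, and a corrupted digit is then recalled by iterated sign updates. The data, sizes and noise level here are hypothetical:

        import numpy as np

        def hopfield_store(patterns):
            """Hebbian weight matrix for bipolar (+1/-1) patterns of shape (p, n)."""
            p, n = patterns.shape
            W = patterns.T @ patterns / n
            np.fill_diagonal(W, 0.0)          # no self-connections
            return W

        def hopfield_recall(W, x, steps=20):
            """Synchronous sign updates until a fixed point or `steps` iterations."""
            for _ in range(steps):
                x_new = np.sign(W @ x)
                x_new[x_new == 0] = 1.0
                if np.array_equal(x_new, x):
                    break
                x = x_new
            return x

        # Hypothetical cluster centers (e.g. from a Kohonen map), binarized to +/-1.
        rng = np.random.default_rng(1)
        centers = np.where(rng.random((10, 784)) > 0.5, 1.0, -1.0)
        W = hopfield_store(centers)

        # Recall from a corrupted copy of one center (about 10% of the pixels flipped).
        probe = centers[3].copy()
        probe[rng.choice(784, size=80, replace=False)] *= -1
        print(np.array_equal(hopfield_recall(W, probe), centers[3]))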

    A GPU-Accelerated Algorithm for Self-Organizing Maps in a Distributed Environment.

    In this paper we introduce a MapReduce-based implementation of self-organizing maps that performs compute-bound operations on distributed GPUs. The kernels are optimized to ensure coalesced memory access and effective use of shared memory. We have performed extensive tests of our algorithms on a cluster of eight nodes, each with two NVidia Tesla M2050 GPUs attached, and we achieve a 10x speedup for self-organizing maps over a distributed CPU algorithm. Sponsorship: Amazon Web Services
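
    The MapReduce decomposition of the batch-SOM update is easy to sketch on the CPU: each map task computes partial numerator and denominator sums for its chunk of the data, and the reduce step aggregates them into the new codebook. This is only an illustration of the formulation, not the paper's GPU kernels; all names and sizes are invented:

        import numpy as np

        def map_step(chunk, W, grid, sigma):
            """Per-chunk work (one worker/GPU in the distributed setting): find BMUs
            and return the partial sums of the batch update."""
            d2 = (chunk**2).sum(1)[:, None] + (W**2).sum(1)[None, :] - 2.0 * chunk @ W.T
            bmu = d2.argmin(axis=1)
            g2 = ((grid[bmu][:, None, :] - grid[None, :, :])**2).sum(axis=2)
            h = np.exp(-g2 / (2.0 * sigma**2))    # Gaussian neighborhood weights, shape (chunk, m)
            return h.T @ chunk, h.sum(axis=0)     # partial numerator (m, d) and denominator (m,)

        def reduce_step(partials):
            """Aggregate the partial sums from all workers into the new codebook."""
            num = sum(p[0] for p in partials)
            den = sum(p[1] for p in partials)[:, None] + 1e-12
            return num / den

        rows, cols, dim = 10, 10, 16
        rng = np.random.default_rng(0)
        W = rng.random((rows * cols, dim))
        grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
        chunks = [rng.random((500, dim)) for _ in range(4)]   # stand-in for partitioned data
        for sigma in np.linspace(3.0, 0.5, 5):                # shrink the neighborhood over epochs
            W = reduce_step([map_step(c, W, grid, sigma) for c in chunks])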

    PERICLES Deliverable 4.3:Content Semantics and Use Context Analysis Techniques

    The current deliverable summarises the work conducted within task T4.3 of WP4, focusing on the extraction and the subsequent analysis of semantic information from digital content, which is imperative for its preservability. More specifically, the deliverable defines content semantic information from a visual and textual perspective, explains how this information can be exploited in long-term digital preservation, and proposes novel approaches for extracting this information in a scalable manner. Additionally, the deliverable discusses novel techniques for retrieving and analysing the context of use of digital objects. Although this topic has not been extensively studied in the existing literature, we believe use context is vital in augmenting the semantic information and maintaining the usability and preservability of the digital objects, as well as their ability to be accurately interpreted as initially intended.