12,346 research outputs found
Somoclu: An Efficient Parallel Library for Self-Organizing Maps
Somoclu is a massively parallel tool for training self-organizing maps on
large data sets written in C++. It builds on OpenMP for multicore execution,
and on MPI for distributing the workload across the nodes in a cluster. It is
also able to boost training by using CUDA if graphics processing units are
available. A sparse kernel is included, which is useful for high-dimensional
but sparse data, such as the vector spaces common in text mining workflows.
Python, R and MATLAB interfaces facilitate interactive use. Apart from fast
execution, memory use is highly optimized, enabling training large emergent
maps even on a single computer.Comment: 26 pages, 9 figures. The code is available at
https://peterwittek.github.io/somoclu
First Evaluation of the CPU, GPGPU and MIC Architectures for Real Time Particle Tracking based on Hough Transform at the LHC
Recent innovations focused around {\em parallel} processing, either through
systems containing multiple processors or processors containing multiple cores,
hold great promise for enhancing the performance of the trigger at the LHC and
extending its physics program. The flexibility of the CMS/ATLAS trigger system
allows for easy integration of computational accelerators, such as NVIDIA's
Tesla Graphics Processing Unit (GPU) or Intel's \xphi, in the High Level
Trigger. These accelerators have the potential to provide faster or more energy
efficient event selection, thus opening up possibilities for new complex
triggers that were not previously feasible. At the same time, it is crucial to
explore the performance limits achievable on the latest generation multicore
CPUs with the use of the best software optimization methods. In this article, a
new tracking algorithm based on the Hough transform will be evaluated for the
first time on a multi-core Intel Xeon E5-2697v2 CPU, an NVIDIA Tesla K20c GPU,
and an Intel \xphi\ 7120 coprocessor. Preliminary time performance will be
presented.Comment: 13 pages, 4 figures, Accepted to JINS
Massively-Parallel Break Detection for Satellite Data
The field of remote sensing is nowadays faced with huge amounts of data.
While this offers a variety of exciting research opportunities, it also yields
significant challenges regarding both computation time and space requirements.
In practice, the sheer data volumes render existing approaches too slow for
processing and analyzing all the available data. This work aims at accelerating
BFAST, one of the state-of-the-art methods for break detection given satellite
image time series. In particular, we propose a massively-parallel
implementation for BFAST that can effectively make use of modern parallel
compute devices such as GPUs. Our experimental evaluation shows that the
proposed GPU implementation is up to four orders of magnitude faster than the
existing publicly available implementation and up to ten times faster than a
corresponding multi-threaded CPU execution. The dramatic decrease in running
time renders the analysis of significantly larger datasets possible in seconds
or minutes instead of hours or days. We demonstrate the practical benefits of
our implementations given both artificial and real datasets.Comment: 10 page
Massively Parallel Computing and the Search for Jets and Black Holes at the LHC
Massively parallel computing at the LHC could be the next leap necessary to
reach an era of new discoveries at the LHC after the Higgs discovery.
Scientific computing is a critical component of the LHC experiment, including
operation, trigger, LHC computing GRID, simulation, and analysis. One way to
improve the physics reach of the LHC is to take advantage of the flexibility of
the trigger system by integrating coprocessors based on Graphics Processing
Units (GPUs) or the Many Integrated Core (MIC) architecture into its server
farm. This cutting edge technology provides not only the means to accelerate
existing algorithms, but also the opportunity to develop new algorithms that
select events in the trigger that previously would have evaded detection. In
this article we describe new algorithms that would allow to select in the
trigger new topological signatures that include non-prompt jet and black
hole--like objects in the silicon tracker.Comment: 15 pages, 11 figures, submitted to NIM
Massively Parallel Ray Tracing Algorithm Using GPU
Ray tracing is a technique for generating an image by tracing the path of
light through pixels in an image plane and simulating the effects of
high-quality global illumination at a heavy computational cost. Because of the
high computation complexity, it can't reach the requirement of real-time
rendering. The emergence of many-core architectures, makes it possible to
reduce significantly the running time of ray tracing algorithm by employing the
powerful ability of floating point computation. In this paper, a new GPU
implementation and optimization of the ray tracing to accelerate the rendering
process is presented
- …