CoreTSAR: Task Scheduling for Accelerator-aware Runtimes
Heterogeneous supercomputers that incorporate computational accelerators
such as GPUs are increasingly popular due to their high
peak performance, energy efficiency and comparatively low cost.
Unfortunately, the programming models and frameworks designed
to extract performance from all computational units still lack the
flexibility of their CPU-only counterparts. Accelerated OpenMP
improves this situation by supporting natural migration of OpenMP
code from CPUs to a GPU. However, these implementations currently
lose one of OpenMP's best features, its flexibility: typical
OpenMP applications can run on any number of CPUs. GPU implementations
do not transparently employ multiple GPUs on a node
or a mix of GPUs and CPUs. To address these shortcomings, we
present CoreTSAR, our runtime library for dynamically scheduling
tasks across heterogeneous resources, and propose straightforward
extensions that incorporate this functionality into Accelerated
OpenMP. We show that our approach can provide nearly linear
speedup to four GPUs over only using CPUs or one GPU while
increasing the overall flexibility of Accelerated OpenMP.
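As a rough illustration of what scheduling loop iterations across heterogeneous devices can look like, the sketch below splits a loop in proportion to each device's measured throughput and refreshes those measurements after every pass. It is not CoreTSAR's algorithm or API; the function names `split_iterations` and `update_rates` and the proportional-share policy are assumptions made for this example.

```python
def split_iterations(total, rates):
    """Split `total` loop iterations across devices in proportion to `rates`.

    `rates` holds one positive throughput estimate (iterations per second)
    per device. Hypothetical helper, not part of any real runtime.
    """
    if total < 0 or not rates or any(r <= 0 for r in rates):
        raise ValueError("need total >= 0 and a non-empty list of positive rates")
    rate_sum = sum(rates)
    # Floor each proportional share, then distribute what is left over.
    shares = [int(total * r / rate_sum) for r in rates]
    leftover = max(0, total - sum(shares))
    fastest_first = sorted(range(len(rates)), key=lambda i: rates[i], reverse=True)
    for i in fastest_first[:leftover]:
        shares[i] += 1
    return shares


def update_rates(prev_rates, shares, elapsed):
    """Re-estimate each device's throughput from the last pass.

    A device that received no work (or reported no elapsed time) keeps
    its previous estimate so it is never divided by zero or starved.
    """
    return [
        s / t if s > 0 and t > 0 else prev
        for prev, s, t in zip(prev_rates, shares, elapsed)
    ]


if __name__ == "__main__":
    rates = [1.0, 1.0]                      # start by assuming equal devices
    shares = split_iterations(100, rates)   # even split on the first pass
    rates = update_rates(rates, shares, elapsed=[2.0, 0.5])
    print(split_iterations(100, rates))     # the faster device now gets more
```

A real runtime would also have to account for data-transfer cost and for devices whose speed changes with the chunk size; the sketch ignores both.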
Efficient data structures for masks on 2D grids
This article discusses various methods of representing and manipulating
arbitrary coverage information in two dimensions, with a focus on space- and
time-efficiency when processing such coverages, storing them on disk, and
transmitting them between computers. While these considerations were originally
motivated by the specific tasks of representing sky coverage and cross-matching
catalogues of astronomical surveys, they can be profitably applied in many
other situations as well. Comment: accepted by A&
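One common compact way to store a coverage mask, shown here only as an illustrative sketch and not necessarily one of the methods the article evaluates, is a sorted list of half-open index ranges over the pixels in row-major order. Intersecting two such masks is then a single linear merge. The names `mask_to_ranges` and `intersect` are hypothetical.

```python
def mask_to_ranges(bits):
    """Convert a flat sequence of 0/1 pixel flags into sorted, disjoint
    half-open ranges (start, stop) covering the set pixels."""
    ranges = []
    start = None
    for i, bit in enumerate(bits):
        if bit and start is None:
            start = i                      # a covered run begins
        elif not bit and start is not None:
            ranges.append((start, i))      # the run ended just before i
            start = None
    if start is not None:
        ranges.append((start, len(bits)))  # run extends to the last pixel
    return ranges


def intersect(a, b):
    """Intersect two sorted, disjoint range lists in O(len(a) + len(b))."""
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo < hi:
            out.append((lo, hi))
        # Advance whichever range finishes first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


if __name__ == "__main__":
    survey_a = mask_to_ranges([0, 1, 1, 1, 0, 0, 1, 1])
    survey_b = mask_to_ranges([1, 1, 1, 0, 0, 1, 1, 0])
    print(intersect(survey_a, survey_b))   # pixels covered by both surveys
```

The storage cost of this layout scales with the number of runs rather than the number of pixels, which is why it suits masks made of large contiguous regions.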
Integrating LHCb workflows on HPC resources: status and strategies
High Performance Computing (HPC) supercomputers are expected to play an
increasingly important role in HEP computing in the coming years. While HPC
resources are not necessarily the optimal fit for HEP workflows, computing time
at HPC centers on an opportunistic basis has already been available to the LHC
experiments for some time, and it is also possible that part of the pledged
computing resources will be offered as CPU time allocations at HPC centers in
the future. The integration of the experiment workflows to make the most
efficient use of HPC resources is therefore essential. This paper describes the
work that has been necessary to integrate LHCb workflows at a specific HPC
site, the Marconi-A2 system at CINECA in Italy, where LHCb benefited from a
joint PRACE (Partnership for Advanced Computing in Europe) allocation with the
other Large Hadron Collider (LHC) experiments. This required addressing two
types of challenges: in the software application workloads, optimising
their performance on a many-core hardware architecture that differs
significantly from those traditionally used in WLCG (Worldwide LHC Computing
Grid) by reducing the memory footprint with a multi-process approach; and in the
distributed computing area, submitting these workloads using more than one
logical processor per job, which had not previously been done in LHCb. Comment: 9 pages, submitted to CHEP2019 proceedings in EPJ Web of Conference
A Force-Directed Approach for Offline GPS Trajectory Map Matching
We present a novel algorithm to match GPS trajectories onto maps offline (in
batch mode) using techniques borrowed from the field of force-directed graph
drawing. We consider a simulated physical system where each GPS trajectory is
attracted or repelled by the underlying road network via electrical-like
forces. We let the system evolve under the action of these physical forces such
that individual trajectories are attracted towards candidate roads to obtain a
map matching path. Our approach has several advantages compared to traditional,
routing-based, algorithms for map matching, including the ability to account
for noise and to avoid large detours due to outliers in the data whilst taking
into account the underlying topological restrictions (such as one-way roads).
Our empirical evaluation using real GPS traces shows that our method produces
better map matching results compared to alternative offline map matching
algorithms on average, especially for routes in dense, urban areas. Comment: 10 pages, 12 figures, accepted version of article submitted to ACM
SIGSPATIAL 2018, Seattle, US
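To make the idea of trajectories being pulled towards candidate roads concrete, here is a heavily simplified sketch, not the paper's algorithm. It applies only an attractive force, moving each GPS point a fixed fraction of the way towards the nearest point on any road segment at every iteration. Repulsive forces, trajectory continuity, and topological restrictions such as one-way roads are all omitted, and the names `closest_point_on_segment` and `relax` and the parameters `step` and `iterations` are assumptions for this example.

```python
def closest_point_on_segment(p, a, b):
    """Return the point on segment a-b that is nearest to p (2D tuples)."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return a                                  # degenerate segment
    # Project p onto the line, then clamp the parameter to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return (ax + t * dx, ay + t * dy)


def relax(points, roads, step=0.5, iterations=20):
    """Iteratively pull each point towards its nearest road segment.

    `roads` is a list of (a, b) segment endpoints. Each iteration moves a
    point by `step` times the vector to its nearest road point, a crude
    stand-in for integrating an attractive force.
    """
    pts = list(points)
    for _ in range(iterations):
        moved = []
        for p in pts:
            candidates = [closest_point_on_segment(p, a, b) for a, b in roads]
            target = min(
                candidates,
                key=lambda q: (q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2,
            )
            moved.append((p[0] + step * (target[0] - p[0]),
                          p[1] + step * (target[1] - p[1])))
        pts = moved
    return pts


if __name__ == "__main__":
    roads = [((0.0, 0.0), (10.0, 0.0)), ((10.0, 0.0), (10.0, 10.0))]
    noisy_trace = [(2.0, 1.0), (6.0, -0.8), (9.5, 4.0)]
    print(relax(noisy_trace, roads))
```

Because each point is treated independently, this toy version can snap neighbouring points to different roads; handling that is exactly where a fuller force model and the road topology come in.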
Concurrent High-performance Persistent Hash Table In Java
Current trading systems must handle both high volumes of trading and large amounts of trading data. One crucial module in high-performance trading is fast storage and retrieval of large volumes of data simultaneously accessed by multiple computer traders. To speed up access, a high-performance in-memory software cache stores the dynamic working set of trades during a trading day. To utilize memory efficiently, it is beneficial to provide a single shared cache for multiple trading applications. Much of the cache access is read-only, as information is gathered before a transaction to determine its value. Hence, extremely fast lookup is essential to support quick information gathering for assessment. This thesis presents a software cache, called MapHash, that is a high-performance hash table for use in Java