4,078 research outputs found

    CoreTSAR: Task Scheduling for Accelerator-aware Runtimes

    Get PDF
    Heterogeneous supercomputers that incorporate computational accelerators such as GPUs are increasingly popular due to their high peak performance, energy efficiency and comparatively low cost. Unfortunately, the programming models and frameworks designed to extract performance from all computational units still lack the flexibility of their CPU-only counterparts. Accelerated OpenMP improves this situation by supporting natural migration of OpenMP code from CPUs to a GPU. However, these implementations currently lose one of OpenMP’s best features, its flexibility: typical OpenMP applications can run on any number of CPUs. GPU implementations do not transparently employ multiple GPUs on a node or a mix of GPUs and CPUs. To address these shortcomings, we present CoreTSAR, our runtime library for dynamically scheduling tasks across heterogeneous resources, and propose straightforward extensions that incorporate this functionality into Accelerated OpenMP. We show that our approach can provide nearly linear speedup to four GPUs over only using CPUs or one GPU while increasing the overall flexibility of Accelerated OpenMP

    Efficient data structures for masks on 2D grids

    Full text link
    This article discusses various methods of representing and manipulating arbitrary coverage information in two dimensions, with a focus on space- and time-efficiency when processing such coverages, storing them on disk, and transmitting them between computers. While these considerations were originally motivated by the specific tasks of representing sky coverage and cross-matching catalogues of astronomical surveys, they can be profitably applied in many other situations as well.Comment: accepted by A&

    Integrating LHCb workflows on HPC resources: status and strategies

    Full text link
    High Performance Computing (HPC) supercomputers are expected to play an increasingly important role in HEP computing in the coming years. While HPC resources are not necessarily the optimal fit for HEP workflows, computing time at HPC centers on an opportunistic basis has already been available to the LHC experiments for some time, and it is also possible that part of the pledged computing resources will be offered as CPU time allocations at HPC centers in the future. The integration of the experiment workflows to make the most efficient use of HPC resources is therefore essential. This paper describes the work that has been necessary to integrate LHCb workflows at a specific HPC site, the Marconi-A2 system at CINECA in Italy, where LHCb benefited from a joint PRACE (Partnership for Advanced Computing in Europe) allocation with the other Large Hadron Collider (LHC) experiments. This has required addressing two types of challenges: on the software application workloads, for optimising their performance on a many-core hardware architecture that differs significantly from those traditionally used in WLCG (Worldwide LHC Computing Grid), by reducing memory footprint using a multi-process approach; and in the distributed computing area, for submitting these workloads using more than one logical processor per job, which had never been done yet in LHCb.Comment: 9 pages, submitted to CHEP2019 proceedings in EPJ Web of Conference

    A Force-Directed Approach for Offline GPS Trajectory Map Matching

    Full text link
    We present a novel algorithm to match GPS trajectories onto maps offline (in batch mode) using techniques borrowed from the field of force-directed graph drawing. We consider a simulated physical system where each GPS trajectory is attracted or repelled by the underlying road network via electrical-like forces. We let the system evolve under the action of these physical forces such that individual trajectories are attracted towards candidate roads to obtain a map matching path. Our approach has several advantages compared to traditional, routing-based, algorithms for map matching, including the ability to account for noise and to avoid large detours due to outliers in the data whilst taking into account the underlying topological restrictions (such as one-way roads). Our empirical evaluation using real GPS traces shows that our method produces better map matching results compared to alternative offline map matching algorithms on average, especially for routes in dense, urban areas.Comment: 10 pages, 12 figures, accepted version of article submitted to ACM SIGSPATIAL 2018, Seattle, US

    Concurrent High-performance Persistent Hash Table In Java

    Get PDF
    Current trading systems must handle both high volumes of trading and large amounts of trading data. One crucial module in high-performance trading is fast storage and retrieval of large volumes of data simultaneously accessed by multiple computer traders. To speed up access, a high-performance in-memory software-cache stores the dynamic working-set of trades during a trading day. To utilize memory effeciently, it is beneficial to provide a single shared cache for multiple trading applications. Much of the cache access is read-only, as information is gathered before a transaction to determine its value. Hence, extremely fast lookup is essential to support quick information gathering for assessment. This thesis presents a software-cache, called MapHash, that is a high-performance hash-table for use in Java
    • 

    corecore