Search CORE

1,837 research outputs found

A Review of Lightweight Thread Approaches for High Performance Computing

Author: Balaji Pavan
Castelló Adrián
Mayo Rafael
Peña Antonio J.
Quintana-Ortí Enrique S.
Seo Sangmin
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 08/12/2016
Field of study

High-level, directive-based solutions are becoming the programming models (PMs) of the multi/many-core architectures. Several solutions relying on operating system (OS) threads perfectly work with a moderate number of cores. However, exascale systems will spawn hundreds of thousands of threads in order to exploit their massive parallel architectures and thus conventional OS threads are too heavy for that purpose. Several lightweight thread (LWT) libraries have recently appeared offering lighter mechanisms to tackle massive concurrency. In order to examine the suitability of LWTs in high-level runtimes, we develop a set of microbenchmarks consisting of commonly-found patterns in current parallel codes. Moreover, we study the semantics offered by some LWT libraries in order to expose the similarities between different LWT application programming interfaces. This study reveals that a reduced set of LWT functions can be sufficient to cover the common parallel code patterns andthat those LWT libraries perform better than OS threads-based solutions in cases where task and nested parallelism are becoming more popular with new architectures.The researchers from the Universitat Jaume I de Castelló were supported by project TIN2014-53495-R of the MINECO, the Generalitat Valenciana fellowship programme Vali+d 2015, and FEDER. This work was partially supported by the U.S. Dept. of Energy, Office of Science, Office of Advanced Scientific Computing Research (SC-21), under contract DEAC02-06CH11357. We gratefully acknowledge the computing resources provided and operated by the Joint Laboratory for System Evaluation (JLSE) at Argonne National Laboratory.Peer ReviewedPostprint (author's final draft

Crossref

UPCommons. Portal del coneixement obert de la UPC

R friendly multi-threading in C++

Author: Nagler Thomas
Publication venue
Publication date: 08/04/2019
Field of study

Calling multi-threaded C++ code from R has its perils. Since the R interpreter is single-threaded, one must not check for user interruptions or print to the R console from multiple threads. One can, however, synchronize with R from the main thread. The R package RcppThread (current version 0.5.3) contains a header only C++ library for thread safe communication with R that exploits this fact. It includes C++ classes for threads, a thread pool, and parallel loops that routinely synchronize with R. This article explains the package's functionality and gives examples of its usage. The synchronization mechanism may also apply to other threading frameworks. Benchmarks suggest that, although synchronization causes overhead, the parallel abstractions of RcppThread are competitive with other popular libraries in typical scenarios encountered in statistical computing

arXiv.org e-Print Archive

Journal of Statistical Software

AutoParallel: A Python module for automatic parallelization and distributed execution of affine loop nests

Author: Amela Ramon
Badia Rosa M.
Clauss Philippe
Ejarque Jorge
Ramon-Cortes Cristian
Publication venue
Publication date: 26/10/2018
Field of study

The last improvements in programming languages, programming models, and frameworks have focused on abstracting the users from many programming issues. Among others, recent programming frameworks include simpler syntax, automatic memory management and garbage collection, which simplifies code re-usage through library packages, and easily configurable tools for deployment. For instance, Python has risen to the top of the list of the programming languages due to the simplicity of its syntax, while still achieving a good performance even being an interpreted language. Moreover, the community has helped to develop a large number of libraries and modules, tuning them to obtain great performance. However, there is still room for improvement when preventing users from dealing directly with distributed and parallel computing issues. This paper proposes and evaluates AutoParallel, a Python module to automatically find an appropriate task-based parallelization of affine loop nests to execute them in parallel in a distributed computing infrastructure. This parallelization can also include the building of data blocks to increase task granularity in order to achieve a good execution performance. Moreover, AutoParallel is based on sequential programming and only contains a small annotation in the form of a Python decorator so that anyone with little programming skills can scale up an application to hundreds of cores.Comment: Accepted to the 8th Workshop on Python for High-Performance and Scientific Computing (PyHPC 2018

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot

Peachy Parallel Assignments (EduHPC 2018)

Author: Alvarez Lluc
Ayaguadé Eduard
Banchelli Fabio
Bunde David P.
Burtscher Martin
González Escribano Arturo
Gutierrez Julian
Joiner David A.
Kaeli David
Previlon Fritz
Rodríguez Gutiez Eduardo
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2018
Field of study

Peachy Parallel Assignments are a resource for instructors teaching parallel and distributed programming. These are high-quality assignments, previously tested in class, that are readily adoptable. This collection of assignments includes implementing a subset of OpenMP using pthreads, creating an animated fractal, image processing using histogram equalization, simulating a storm of high-energy particles, and solving the wave equation in a variety of settings. All of these come with sample assignment sheets and the necessary starter code.Departamento de Informática (Arquitectura y Tecnología de Computadores, Ciencias de la Computación e Inteligencia Artificial, Lenguajes y Sistemas Informáticos)Facilitar la inclusión de ejercicios prácticos de programación paralela en cursos de Computación Paralela o de alto rendimiento (HPC)Comunicación en congreso: Descripción de ejercicios prácticos con acceso a material ya desarrollado y probado

Crossref

UPCommons. Portal del coneixement obert de la UPC

Repositorio Documental de la Universidad de Valladolid

Kean Digital Learning Commons

Verifying Parallel Loops with Separation Logic

Author: Blom Stefan
Darabi Saeed
Huisman Marieke
Publication venue: 'Open Publishing Association'
Publication date: 12/04/2014
Field of study

This paper proposes a technique to specify and verify whether a loop can be parallelised. Our approach can be used as an additional step in a parallelising compiler to verify user annotations about loop dependences. Essentially, our technique requires each loop iteration to be specified with the locations it will read and write. From the loop iteration specifications, the loop (in)dependences can be derived. Moreover, the loop iteration specifications also reveal where synchronisation is needed in the parallelised program. The loop iteration specifications can be verified using permission-based separation logic.Comment: In Proceedings PLACES 2014, arXiv:1406.331

arXiv.org e-Print Archive

Crossref

Directory of Open Access Journals

University of Twente Research Information