Search CORE

93 research outputs found

DART-MPI: An MPI-based Implementation of a PGAS Runtime System

Author: Fürlinger Karl
Glass Colin W.
Gracia José
Idrees Kamran
Mhedheb Yousri
Tao Jie
Zhou Huan
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2014
Field of study

A Partitioned Global Address Space (PGAS) approach treats a distributed system as if the memory were shared on a global level. Given such a global view on memory, the user may program applications very much like shared memory systems. This greatly simplifies the tasks of developing parallel applications, because no explicit communication has to be specified in the program for data exchange between different computing nodes. In this paper we present DART, a runtime environment, which implements the PGAS paradigm on large-scale high-performance computing clusters. A specific feature of our implementation is the use of one-sided communication of the Message Passing Interface (MPI) version 3 (i.e. MPI-3) as the underlying communication substrate. We evaluated the performance of the implementation with several low-level kernels in order to determine overheads and limitations in comparison to the underlying MPI-3.Comment: 11 pages, International Conference on Partitioned Global Address Space Programming Models (PGAS14

arXiv.org e-Print Archive

Crossref

Evaluation of the power efficiency of UPC, OpenMP and MPI

Author: Cai Xing
Ha Hoai Phuong
Lagraviere Jeremie
Publication venue: Institutt for informatikk Tromsø
Publication date: 01/01/2015
Field of study

In this study we compare the performance and power efficiency of Unified Parallel C (UPC), MPI and OpenMP by running a set of kernels from the NAS Benchmark. One of the goals of this study is to focus on the Partitioned Global Address Space (PGAS) model, in order to describe it and compare it to MPI and OpenMP. In particular we consider the power effi- ciency expressed in millions operations per second per watt as a criterion to evaluate the suitability of PGAS compared to MPI and OpenMP. Based on these measurements, we provide an analysis to explain the difference of performance between UPC, MPI, and OpenMP

Munin - Open Research Archive

Exascale machines require new programming paradigms and runtimes

Author: Astsatryan Hrachya
Da Costa Georges
Fahringer Thomas
Grasso Ivan
Hristov Atanas
Karatza Helen D.
Lastovetsky Alexey
Marozzo Fabrizio
Petcu Dana
Rico-Gallego Juan-Antonio
Stavrinides Georgios L.
Talia Domenico
Trufio Paolo
Publication venue
Publication date: 01/01/2015
Field of study

Extreme scale parallel computing systems will have tens of thousands of optionally accelerator-equiped nodes with hundreds of cores each, as well as deep memory hierarchies and complex interconnect topologies. Such Exascale systems will provide hardware parallelism at multiple levels and will be energy constrained. Their extreme scale and the rapidly deteriorating reliablity of their hardware components means that Exascale systems will exhibit low mean-time-between-failure values. Furthermore, existing programming models already require heroic programming and optimisation efforts to achieve high efficiency on current supercomputers. Invariably, these efforts are platform-specific and non-portable. In this paper we will explore the shortcomings of existing programming models and runtime systems for large scale computing systems. We then propose and discuss important features of programming paradigms and runtime system to deal with large scale computing systems with a special focus on data-intensive applications and resilience. Finally, we also discuss code sustainability issues and propose several software metrics that are of paramount importance for code development for large scale computing systems

Scientific Publications of the University of Toulouse II Le Mirail

Open Archive Toulouse Archive Ouverte

Open Access Repository

Survey of Novel Programming Models for Parallelizing Applications at Exascale

Author: Cook R.
Dube E.
Lee I.
Nau L.
Shereda C.
Wang F.
Publication venue: 'Office of Scientific and Technical Information (OSTI)'
Publication date: 21/11/2011
Field of study

Crossref

UNT Digital Library

High-performance epistasis detection in quantitative trait GWAS

Author: Groth Brandon
Koltes James
Kraeva Marina
Kramer Luke
Luecke Glenn
Ma Li
Reecy James
Reecy James
Weeks Nathan
Weeks Nathan
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/2016
Field of study

epiSNP is a program for identifying pairwise single nucleotide polymorphism (SNP) interactions (epistasis) in quantitative-trait genome-wide association studies (GWAS). A parallel MPI version (EPISNPmpi) was created in 2008 to address this computationally expensive analysis on large data sets with many quantitative traits and SNP markers. However, the falling cost of genotyping has led to an explosion of large-scale GWAS data sets that challenge EPISNPmpi’s ability to compute results in a reasonable amount of time. Therefore, we optimized epiSNP for modern multi-core and highly parallel many-core processors to efficiently handle these large data sets. This paper describes the serial optimizations, dynamic load balancing using MPI-3 RMA operations, and shared-memory parallelization with OpenMP to further enhance load balancing and allow execution on the Intel Xeon Phi coprocessor (MIC). For a large GWAS data set, our optimizations provided a 38.43× speedup over EPISNPmpi on 126 nodes using 2 MICs on TACC’s Stampede Supercomputer. We also describe a Coarray Fortran (CAF) version that demonstrates the suitability of PGAS languages for problems with this computational pattern. We show that the Coarray version performs competitively with the MPI version on the NERSC Edison Cray XC30 supercomputer. Finally, the performance benefits of hyper-threading for this application on Edison (average 1.35× speedup) are demonstrated

Digital Repository @ Iowa State University (ISU)

The Landscape of Exascale Research: A Data-Driven Literature Analysis

Author: Belloum A.S.Z.
Heldens S.
Hijma P.
Maassen J.
Van Nieuwpoort R.V.
Van Werkhoven B.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/06/2020
Field of study

International Migration, Integration and Social Cohesion online publications

Enabling Independent Communication for FPGAs in High Performance Computing

Author: Lant Joshua
Publication venue
Publication date: 01/08/2020
Field of study

The University of Manchester - Institutional Repository

Applying Type Oriented Programming to the PGAS Memory Model

Author: Brown Nicholas
Publication venue
Publication date: 01/01/2013
Field of study

Edinburgh Research Explorer

The AXIOM software layers

AXIOM project aims at developing a heterogeneous computing board (SMP-FPGA).The Software Layers developed at the AXIOM project are explained.OmpSs provides an easy way to execute heterogeneous codes in multiple cores. People and objects will soon share the same digital network for information exchange in a world named as the age of the cyber-physical systems. The general expectation is that people and systems will interact in real-time. This poses pressure onto systems design to support increasing demands on computational power, while keeping a low power envelop. Additionally, modular scaling and easy programmability are also important to ensure these systems to become widespread. The whole set of expectations impose scientific and technological challenges that need to be properly addressed.The AXIOM project (Agile, eXtensible, fast I/O Module) will research new hardware/software architectures for cyber-physical systems to meet such expectations. The technical approach aims at solving fundamental problems to enable easy programmability of heterogeneous multi-core multi-board systems. AXIOM proposes the use of the task-based OmpSs programming model, leveraging low-level communication interfaces provided by the hardware. Modular scalability will be possible thanks to a fast interconnect embedded into each module. To this aim, an innovative ARM and FPGA-based board will be designed, with enhanced capabilities for interfacing with the physical world. Its effectiveness will be demonstrated with key scenarios such as Smart Video-Surveillance and Smart Living/Home (domotics).Peer ReviewedPostprint (author's final draft

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Crossref

UPCommons. Portal del coneixement obert de la UPC

Archivio della Ricerca - Università degli Studi di Siena