16 research outputs found

The Landscape of Exascale Research: A Data-Driven Literature Analysis

Author: Belloum A.S.Z.
Heldens S.
Hijma P.
Maassen J.
Van Nieuwpoort R.V.
Van Werkhoven B.
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/06/2020

International Migration, Integration and Social Cohesion online publications

The Case for Non-Volatile RAM in Cloud HPCaaS

Author: Fridman Yehonatan
Harel Re'em
Oren Gal
Publication date: 03/08/2022

HPC as a service (HPCaaS) is a new way to expose HPC resources via cloud services. However, continued effort to port large-scale tightly coupled applications with high interprocessor communication to multiple (and many) nodes synchronously, as in on-premise supercomputers, is still far from satisfactory due to network latencies. As a consequence, in such cases, HPCaaS is recommended to be used with one or a few instances. In this paper we claim that a new piece of memory hardware, namely Non-Volatile RAM (NVRAM), can allow such computations to scale up to an order of magnitude with a marginal penalty in comparison to RAM. Moreover, we suggest that the introduction of NVRAM to HPCaaS can be cost-effective to the users and the suppliers in numerous forms.

arXiv.org e-Print Archive

A job dispatcher for large and heterogeneous HPC systems running modern applications

Author: Galleguillos C.
Kiziltan Z.
Soto R.
Publication venue: Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing
Publication date: 01/01/2021

High-performance Computing (HPC) systems have become essential instruments in our modern society. As they get closer to exascale performance, HPC systems become larger in size and more heterogeneous in their computing resources. With recent advances in AI, HPC systems are also increasingly being used for applications that employ many short jobs with strict timing requirements. HPC job dispatchers therefore need to adopt techniques that go beyond the capabilities of those developed for small or homogeneous systems, or for traditional compute-intensive applications. In this paper, we present a job dispatcher suitable for today's large and heterogeneous systems running modern applications. Unlike its predecessors, our dispatcher solves the entire dispatching problem using Constraint Programming (CP) with a model size independent of the system size. Experimental results based on a simulation study show that our approach can bring about significant performance gains over the existing CP-based dispatchers in a large or heterogeneous system.

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

A Job Dispatcher for Large and Heterogeneous HPC Systems Running Modern Applications

Author: Galleguillos Cristian
Kiziltan Zeynep
Soto Ricardo
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 27th International Conference on Principles and Practice of Constraint Programming (CP 2021)
Publication date: 01/01/2021

Dagstuhl Research Online Publication Server

Term Rewriting on GPUs

Author: Groote Jan Friso
Hijma Pieter
Martens Jan
van Eerd Johri
Wijs Anton
Publication date: 15/09/2020

We present a way to implement term rewriting on a GPU. We do this by letting the GPU repeatedly perform a massively parallel evaluation of all subterms. We find that if the term rewrite systems exhibit sufficient internal parallelism, GPU rewriting substantially outperforms the CPU. Since we expect that our implementation can be further optimized, and because in any case GPUs will become much more powerful in the future, this suggests that GPUs are an interesting platform for term rewriting. As term rewriting can be viewed as a universal programming language, this also opens a route towards programming GPUs by term rewriting, especially for irregular computations.

arXiv.org e-Print Archive

VU Research Portal

Lightning: Scaling the GPU Programming Model Beyond a Single GPU

Author: Heldens S.
Hijma P.
Maassen J.
van Nieuwpoort R.V.
van Werkhoven B.
Publication venue: IEEE Computer Society
Publication date: 01/01/2022

The GPU programming model is primarily aimed at the development of applications that run on one GPU. However, this limits the scalability of GPU code to the capabilities of a single GPU in terms of compute power and memory capacity. To scale GPU applications further, a great engineering effort is typically required: work and data must be divided over multiple GPUs by hand, possibly in multiple nodes, and data must be manually spilled from GPU memory to higher-level memories. We present Lightning: a framework that follows the common GPU programming paradigm but enables scaling to large problems with ease. Lightning supports multi-GPU execution of GPU kernels, even across multiple nodes, and seamlessly spills data to higher-level memories (main memory and disk). Existing CUDA kernels can easily be adapted for use in Lightning, with data access annotations on these kernels allowing Lightning to infer their data requirements and the dependencies between subsequent kernel launches. Lightning efficiently distributes the work/data across GPUs and maximizes efficiency by overlapping scheduling, data movement, and kernel execution when possible. We present the design and implementation of Lightning, as well as experimental results on up to 32 GPUs for eight benchmarks and one real-world application. Evaluation shows excellent performance and scalability, such as a speedup of 57.2x over the CPU using Lightning with 16 GPUs over 4 nodes and 80 GB of data, far beyond the memory capacity of one GPU.

arXiv.org e-Print Archive

VU Research Portal

International Migration, Integration and Social Cohesion online publications

UvA-DARE

A unified framework to improve the interoperability between HPC and Big Data languages and programming models

Author: Pichel Campos Juan Carlos
Piñeiro Pomar César Alfredo
Publication venue: Elsevier BV
Publication date: 01/01/2022

One of the most important issues in the path to the convergence of HPC and Big Data is caused by the differences in their software stacks. Despite some research efforts, the interoperability between their programming models and languages is still limited. To deal with this problem we introduce a new computing framework called IgnisHPC, whose main objective is to unify the execution of Big Data and HPC workloads in the same framework. IgnisHPC has native support for multi-language applications using JVM and non-JVM-based languages. Since MPI was used as its backbone technology, IgnisHPC takes advantage of many communication models and network architectures. Moreover, MPI applications can be directly executed in an efficient way in the framework. The main consequence is that users could combine in the same multi-language code HPC tasks (using MPI) with Big Data tasks (using MapReduce operations). The experimental evaluation demonstrates the benefits of our proposal in terms of performance and productivity with respect to other frameworks. IgnisHPC is publicly available for the Big Data and HPC research community. This work has been supported by MICINN, Spain (RTI2018-093336-B-C21, PLEC2021-007662), Xunta de Galicia, Spain (ED431G/08, ED431G-2019/04 and ED431C-2018/19) and the European Regional Development Fund (ERDF).

Repositorio Institucional da Universidade de Santiago de Compostela

Towards Chapel-based Exascale Tree Search Algorithms: dealing with multiple GPU accelerators

Author: Carneiro Tiago
Hayashi Akihiro
Melab Nouredine
Sarkar Vivek
Publication venue: HAL CCSD
Publication date: 22/03/2021

Tree-based search algorithms applied to combinatorial optimization problems are highly irregular and time consuming when it comes to solving big instances. Solving such instances efficiently requires the use of massively parallel distributed-memory supercomputers. According to recent Top 500 trends, the degree of parallelism in these supercomputers continues to increase in size and complexity, with millions of heterogeneous (mainly CPU-GPU) cores. Harnessing this scale of computing resources raises at least three challenging issues which are described and addressed in this paper. Indeed, as a step towards exascale computing, we revisit the design and implementation of tree search algorithms dealing with multiple GPUs, in addition to scalability and productivity-awareness using Chapel. The proposed algorithm exploits Chapel's distributed iterators by combining a partial search strategy with pre-compiled CUDA kernels for more efficient exploitation of the intra-node parallelism. Extensive experimentation on big N-Queens problem instances using 24 GPUs shows that up to 90% of the linear speedup can be achieved.

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot

Modern computing: vision and challenges

Author: Abraham Ajith
Arora Priyansh
Buyya Rajkumar
Cetinkaya Oktay
Ding Yuemin
Dustdar Schahram
Ghosh Soumya K
Gill Sukhpal Singh
Haunschild David
Kanhere Salil S
Li Ruidong
Lutfiyya Hanan
Ottaviani Carlo
Parlikad Ajith Kumar
Patros Panos
Pujol Victor Casamayor
Qadir Junaid
Ramamohanarao Kotagiri
Rana Omer
Rodrigues Joel JPC
Sakellariou Rizos
Song Houbing Herbert
Stankovski Vlado
Uhlig Steve
Wu Huaming
Publication venue: Elsevier
Publication date: 08/01/2024

Over the past six decades, the computing systems field has experienced significant transformations, profoundly impacting society with transformational developments, such as the Internet and the commodification of computing. Underpinned by technological advancements, computer systems, far from being static, have been continuously evolving and adapting to cover multifaceted societal niches. This has led to new paradigms such as cloud, fog, edge computing, and the Internet of Things (IoT), which offer fresh economic and creative opportunities. Nevertheless, this rapid change poses complex research challenges, especially in maximizing potential and enhancing functionality. As such, to maintain an economical level of performance that meets ever-tighter requirements, one must understand the drivers of new model emergence and expansion, and how contemporary challenges differ from past ones. To that end, this article investigates and assesses the factors influencing the evolution of computing systems, covering established systems and architectures as well as newer developments, such as serverless computing, quantum computing, and on-device AI on edge devices. Trends emerge when one traces technological trajectory, which includes the rapid obsolescence of frameworks due to business and technical constraints, a move towards specialized systems and models, and varying approaches to centralized and decentralized control. This comprehensive review of modern computing systems looks ahead to the future of research in the field, highlighting key challenges and emerging trends, and underscoring their importance in cost-effectively driving technological progress.

Oxford University Research Archive