12 research outputs found

FINJ: A Fault Injection Tool for HPC Systems

Author: A Gainaru
C Lameter
F Cappello
J Calhoun
MC Hsueh
N DeBardeleben
O Tuncer
Publication date: 01/09/2018

We present FINJ, a high-level fault injection tool for High-Performance Computing (HPC) systems, with a focus on the management of complex experiments. FINJ provides support for custom workloads and allows generation of anomalous conditions through the use of fault-triggering executable programs. FINJ can also be integrated seamlessly with most other lower-level fault injection tools, allowing users to create and monitor a variety of highly complex and diverse fault conditions in HPC systems that would be difficult to recreate in practice. FINJ is suitable for experiments involving many, potentially interacting nodes, making it a very versatile design and evaluation tool.

Comment: To be presented at the 11th Resilience Workshop in the 2018 Euro-Par conference

arXiv.org e-Print Archive

Crossref

Archivio della Ricerca - Università di Pisa

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

The Scalable Commutativity Rule: Designing Scalable Software for Multicore Processors

Author: Boyd-Wickizer S.
Cadar C.
Corbet J.
Koopman P.
Lameter C.
McKenney P. E.
Roy A.
Shapiro M.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2013

What fundamental opportunities for scalability are latent in interfaces, such as system call APIs? Can scalability opportunities be identified even before any implementation exists, simply by considering interface specifications? To answer these questions, this paper introduces the following rule: Whenever interface operations commute, they can be implemented in a way that scales. This rule aids developers in building more scalable software starting from interface design and carrying on through implementation, testing, and evaluation. To help developers apply the rule, a new tool named Commuter accepts high-level interface models and generates tests of operations that commute and hence could scale. Using these tests, Commuter can evaluate the scalability of an implementation. We apply Commuter to 18 POSIX calls and use the results to guide the implementation of a new research operating system kernel called sv6. Linux scales for 68% of the 13,664 tests generated by Commuter for these calls, and Commuter finds many problems that have been observed to limit application scalability. sv6 scales for 99% of the tests.

Engineering and Applied Science
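To make the idea of commuting operations concrete, here is a minimal sketch in Python. It is not the paper's formal definition and not how Commuter works (Commuter starts from high-level interface models); it only checks, for one concrete starting state, whether two operations return the same results and leave the same final state in either order. The names `commutes` and `put` are hypothetical helpers introduced for this illustration.

```python
import copy


def commutes(state, op_a, op_b):
    """Check whether op_a and op_b commute on one concrete starting state.

    Simplified illustration only: the operations commute here if each one
    returns the same result, and the final state is the same, regardless
    of which operation runs first.
    """
    first = copy.deepcopy(state)
    result_a1 = op_a(first)
    result_b1 = op_b(first)

    second = copy.deepcopy(state)
    result_b2 = op_b(second)
    result_a2 = op_a(second)

    return (result_a1, result_b1, first) == (result_a2, result_b2, second)


def put(key, value):
    """Build a hypothetical 'put' operation that returns the previous value."""
    def op(table):
        old = table.get(key)
        table[key] = value
        return old
    return op


table = {"a": 1}
print(commutes(table, put("x", 1), put("y", 2)))  # True: different keys
print(commutes(table, put("x", 1), put("x", 2)))  # False: same key, order matters
```

In this toy model the two writes to different keys commute, so by the rule stated in the abstract an implementation exists in which they scale; the two writes to the same key do not commute, so the rule makes no such promise for them.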

CiteSeerX

DSpace@MIT

Crossref

Harvard University - DASH

NUMA (Non-Uniform Memory Access): An Overview

Author: Christoph Lameter
Lameter C.
Publication venue: 'Association for Computing Machinery (ACM)'

Crossref

An overview of non-uniform memory access

Author: Braithwaite R.
Christoph Lameter
Lameter C.
Li Y.
Schimmel K.
Publication venue: 'Association for Computing Machinery (ACM)'

Crossref

FINJ: A fault injection tool for HPC systems

Author: A Gainaru
C Lameter
F Cappello
J Calhoun
MC Hsueh
N DeBardeleben
O Tuncer
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2019

Crossref

Archivio della Ricerca - Università di Pisa

NUMA-Awareness as a Plug-In for an Eventify-Based Fast Multipole Method

Author: A Amer
C Lameter
E Agullo
L Greengard
L Ying
M Abduljabbar
R Beatson
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2020

Following the trend towards Exascale, today’s supercomputers consist of increasingly complex and heterogeneous compute nodes. To exploit the performance of these systems, research software in HPC needs to keep up with the rapid development of hardware architectures. Since manual tuning of software to each and every architecture is neither sustainable nor viable, we aim to tackle this challenge through appropriate software design. In this article, we aim to improve the performance and sustainability of FMSolvr, a parallel Fast Multipole Method for Molecular Dynamics, by adapting it to Non-Uniform Memory Access architectures in a portable and maintainable way. The parallelization of FMSolvr is based on Eventify, an event-based tasking framework we co-developed with FMSolvr. We describe a layered software architecture that enables the separation of the Fast Multipole Method from its parallelization. The focus of this article is on the development and analysis of a reusable NUMA module that improves performance while keeping both layers separated to preserve maintainability and extensibility. By means of the NUMA module, we introduce diverse NUMA-aware data distribution, thread pinning, and work stealing policies for FMSolvr. During the performance analysis, the modular design of the NUMA module proved advantageous, since it facilitates the combination, interchange, and redesign of the developed policies. The performance analysis reveals that the runtime of FMSolvr is reduced by 21% from 1.48 ms to 1.16 ms through these policies.

Crossref

Juelich Shared Electronic Resources

Optimistic concurrency with OPTIK

Author: Dragojević A.
Gramoli V.
Harris T.
Herlihy M.
Lameter C.
McKenney P. E.
Rajwar R.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 08/03/2016

We introduce OPTIK, a new practical design pattern for designing and implementing fast and scalable concurrent data structures. OPTIK relies on the commonly used technique of version numbers for detecting conflicting concurrent operations. We show how to implement the OPTIK pattern using the novel concept of OPTIK locks. These locks enable the use of version numbers for implementing very efficient optimistic concurrent data structures. Existing state-of-the-art lock-based data structures acquire the lock and then check for conflicts. In contrast, with OPTIK locks, we merge the lock acquisition with the detection of conflicting concurrency in a single atomic step, similarly to lock-free algorithms. We illustrate the power of our OPTIK pattern and its implementation by introducing four new algorithms and by optimizing four state-of-the-art algorithms for linked lists, skip lists, hash tables, and queues. Our results show that concurrent data structures built using OPTIK are more scalable than the state of the art.
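The idea of merging lock acquisition with conflict detection can be sketched as a versioned lock. The following Python sketch is an illustration under stated assumptions, not the authors' implementation: the class and method names (`VersionedLock`, `try_lock_at`, `Counter`) are hypothetical, the even/odd version encoding is an assumption, and the internal `threading.Lock` merely stands in for the single atomic compare-and-swap that a real implementation would use.

```python
import threading


class VersionedLock:
    """Hypothetical OPTIK-style lock: even version = free, odd = held."""

    def __init__(self):
        self._version = 0
        # Assumption: this guard emulates one atomic compare-and-swap.
        self._guard = threading.Lock()

    def read_version(self):
        return self._version

    def try_lock_at(self, expected):
        """Acquire the lock only if the version still equals `expected`.

        Validation and acquisition happen in one atomic step, so success
        also proves that no writer intervened since `expected` was read.
        """
        with self._guard:
            if self._version != expected or expected % 2 == 1:
                return False
            self._version = expected + 1  # odd: lock is now held
            return True

    def unlock(self):
        with self._guard:
            self._version += 1  # back to even, and a new version


class Counter:
    """Toy data structure updated with the optimistic pattern."""

    def __init__(self):
        self.lock = VersionedLock()
        self.value = 0

    def increment(self):
        while True:
            seen = self.lock.read_version()
            new_value = self.value + 1       # optimistic work, no lock held
            if self.lock.try_lock_at(seen):  # validate and lock in one step
                self.value = new_value
                self.lock.unlock()
                return
            # Version changed or lock was held: retry from the start.
```

A caller reads the version, prepares its update without holding the lock, and commits only if `try_lock_at` succeeds; a failed attempt means another operation interfered, so the caller retries.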

Infoscience - École polytechnique fédérale de Lausanne

Crossref

Security Vulnerability in Processor-Interconnect Router Design

Author: Anderson D.
Dally W. J.
Huang A.
Karlin J.
Kauer B.
Lameter C.
Lineberry A.
Moscibroda T.
Niu S.
Sumrall N.
Wojtczuk R.
Publication venue: 'Association for Computing Machinery (ACM)'

Crossref

Scalability techniques for practical synchronization primitives

Author: Boyd-Wickizer S.
Bueso D.
Corbet J.
Fuerst S.
Gray J.N.
Lameter C.
McKenney P.E.
Molnar I.
Scott M.L.
Unrau R.C.
van Riel R.
Zijlstra P.
Publication venue: 'Association for Computing Machinery (ACM)'

Crossref

NOrec

Author: Dalessandro L.
Dice D.
Lameter C.
Larus J.R.
Lev Y.
Luke Dalessandro
Marathe V.J.
McKenney P.E.
Michael F. Spear
Michael L. Scott
Minh C.C.
Spear M.F.
Publication venue: 'Association for Computing Machinery (ACM)'

Crossref