5 research outputs found

Not all applications have boring communication patterns: Profiling message matching with BMM

Author: Alverson B
Cook B
Groves T
Keen N
Ravichandrasekaran N
Roweth D
Trebotich D
Underwood K
Wright NJ
Publication venue: eScholarship, University of California
Publication date: 01/01/2021

Message matching within MPI is an important performance consideration for applications that utilize two-sided semantics. In this work, we present an instrumentation of the CrayMPI library that allows the collection of detailed message-matching statistics as well as an implementation of hashed matching in software. We use this functionality to profile key DOE applications with complex communication patterns to determine under what circumstances an application might benefit from hardware offload capabilities within the NIC to accelerate message matching. We find that there are several applications and libraries that exhibit sufficiently long match list lengths to motivate a Binned Message Matching approach.

eScholarship - University of California

Navigating An Evolutionary Fast Path to Exascale.

Author: C T Vaughan
D Roweth
D W Doerfler
J P Luitjens
M A Heroux
R F Barrett
S D Hammond
Publication venue
Publication date: 01/01/2012

The computing community is in the midst of a disruptive architectural change. The advent of manycore and heterogeneous computing nodes forces us to reconsider every aspect of the system software and application stack. To address this challenge there is a broad spectrum of approaches, which we roughly classify as either revolutionary or evolutionary. With the former, the entire code base is re-written, perhaps using a new programming language or execution model. The latter, which is the focus of this work, seeks a piecewise path of effective incremental change. The end effect of our approach will be revolutionary in that the control structure of the application will be markedly different in order to utilize single-instruction multiple-data/thread (SIMD/SIMT), manycore and heterogeneous nodes, but the physics code fragments will be remarkably similar. Our approach is guided by a set of mission-driven applications and their proxies, focused on balancing performance potential with the realities of existing application code bases. Although the specifics of this process have not yet converged, we find that there are several important steps that developers of scientific and engineering application programs can take to prepare for making effective use of these challenging platforms. Aiding an evolutionary approach is the recognition that the performance potential of the architectures is, in a meaningful sense, an extension of existing capabilities: vectorization, threading, and a re-visiting of node interconnect capabilities. Therefore, as architectures, programming models, and programming mechanisms continue to evolve, the preparations described herein will provide significant performance benefits on existing and emerging architectures.

CiteSeerX

Network-accelerated non-contiguous memory transfers

Author: Benini L.
Beranek J.
Besta M.
Di Girolamo S.
Hoefler T.
Kurth A.
Roweth D.
Schaffner M.
Schneider T.
Taranov K.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2019

Applications often communicate data that is non-contiguous in the send- or the receive-buffer, e.g., when exchanging a column of a matrix stored in row-major order. While non-contiguous transfers are well supported in HPC (e.g., MPI derived datatypes), they can still be up to 5x slower than contiguous transfers of the same size. As we enter the era of network acceleration, we need to investigate which tasks to offload to the NIC: In this work we argue that non-contiguous memory transfers can be transparently network-accelerated, truly achieving zero-copy communications. We implement and extend sPIN, a packet streaming processor, within a Portals 4 NIC SST model, and evaluate strategies for NIC-offloaded processing of MPI datatypes, ranging from datatype-specific handlers to general solutions for any MPI datatype. We demonstrate up to 8x speedup in the unpack throughput of real applications, demonstrating that non-contiguous memory transfers are a first-class candidate for network acceleration.

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

Shared memory programming on the Meiko CS-2

Author: Koeln Univ. (Germany). Angewandte Mathematik und Informatik
Pfenning T. (Koeln Univ. (Germany). Zentrum fuer Paralleles Rechnen)
Roweth D. (Meiko Scientific, Bristol (United Kingdom))
Publication venue
Publication date: 01/01/1994

An interesting feature of some recent parallel computers is the fact that the underlying transport mechanism behind the currently dominating message passing interfaces is based on a global address space model. By accessing this global address space directly, most of the inherent delays for administering message buffers and queues can be avoided. Using this interface we have implemented a user-level distributed shared memory layer using the virtual memory protection mechanisms of the operating system. The synchronisation required for maintaining the coherency of the memory is addressed by implementing a distributed shared lock which exploits the remote atomic store operations provided by the Meiko CS-2. This allows an asynchronous style of programming where the load is dynamically distributed over the nodes of a parallel partition. (orig.)

SIGLE. Available from TIB Hannover: RO 3476(180) / FIZ - Fachinformationszentrum Karlsruhe / TIB - Technische Informationsbibliothek. DE. German.

OpenGrey Repository

Two novel, putative mechanisms of action for citalopram-induced platelet inhibition

Author: A Bonnin
A DeLean
AA Cook
AM Galan
AMD Carneiro
BJ Hoffman
BT Kinsella
BZS Paul
C Goubau
C Walraven van
D Strümper
DC Snell
DG Hackam
E Caron
E Maurer-Spurej
G Grynkiewicz
GE Jarvis
H Chen
HG Roweth
HS Lee
I Lopez-vilchez
J Hyttel
J Ren
JM Burkhart
JR Crittenden
JW Black
L Opatrny
L Stefanini
M Dall
M Moser
MD Filippi
MJ Owens
ML Lozano
N Herr
NS Poulter
P Bellavite
RD Blakely
S Kumar
SM Jung
SO Dalton
VP Yakubenko
W Bergmeier
YL Tseng
Z Honda
Publication venue: 'Springer Science and Business Media LLC'
Publication date

Crossref