Learning from the Success of MPI

Author: A. Geist
A. Skjellum
C.H. Koelbel
J. Boyle
J. Cownie
J. Dongarra
J.L. Traeff
K. Krechmer
Message Passing Interface Forum
Message Passing Interface Forum MPI2
N. Carriero
O. Zaki
P.B. Hansen
R. Hempel
R.C. Whaley
R.W. Numrich
W. Gropp
W.W. Carlson
Publication date: 01/01/2001

The Message Passing Interface (MPI) has been extremely successful as a portable way to program high-performance parallel computers. This success has occurred in spite of the view of many that message passing is difficult and that other approaches, including automatic parallelization and directive-based parallelism, are easier to use. This paper argues that MPI has succeeded because it addresses all of the important issues in providing a parallel programming model.

Comment: 12 pages, 1 figure

arXiv.org e-Print Archive

CiteSeerX

Crossref

UNT Digital Library

A performance comparison of several superscalar processsor [sic] models with a VLIW processor

Author: Bagherzadeh Nader
Lenell John
Publication venue: eScholarship, University of California
Publication date: 01/01/1992

Superscalar and VLIW processors can both execute multiple instructions each cycle. Each employs a different instruction scheduling method to achieve multiple instruction execution. Superscalar processors schedule instructions dynamically, and VLIW processors execute statically scheduled instructions. This paper quantitatively compares various superscalar processor architectures with a Very Long Instruction Word architecture developed at the University of California, Irvine. An architectural overview and performance analysis of the superscalar processor models and VIPER, a VLIW processor designed to take advantage of the parallelizing capabilities of Percolation Scheduling, are presented. The motivation for this comparison is to study the capability of a dynamically scheduled processor to obtain the same performance achieved by a statically scheduled processor, and examine the hardware resources required by each.

eScholarship - University of California

On the Resilience of RTL NN Accelerators: Fault Characterization and Mitigation

Author: Cristal Adrian
Salami Behzad
Unsal Osman
Publication date: 14/06/2018

Machine Learning (ML) is making a strong resurgence in tune with the massive generation of unstructured data, which in turn requires massive computational resources. Due to the inherently compute- and power-intensive structure of Neural Networks (NNs), hardware accelerators emerge as a promising solution. However, with technology node scaling below 10nm, hardware accelerators become more susceptible to faults, which in turn can impact the NN accuracy. In this paper, we study the resilience aspects of the Register-Transfer Level (RTL) model of NN accelerators, in particular, fault characterization and mitigation. By following a High-Level Synthesis (HLS) approach, first, we characterize the vulnerability of various components of RTL NN. We observed that the severity of faults depends on both i) application-level specifications, i.e., NN data (inputs, weights, or intermediate), NN layers, and NN activation functions, and ii) architectural-level specifications, i.e., data representation model and the parallelism degree of the underlying accelerator. Second, motivated by characterization results, we present a low-overhead fault mitigation technique that can efficiently correct bit flips, performing 47.3% better than state-of-the-art methods.

Comment: 8 pages, 6 figures

arXiv.org e-Print Archive

Crossref

UPCommons. Portal del coneixement obert de la UPC