7 research outputs found

Parallel programming environment for OpenMP

Author: Insung Park
Michael J Voss
Rudolf Eigenmann
Seon Wook Kim
Publication date: 11/04/2020

We present our effort to provide a comprehensive parallel programming environment for the OpenMP parallel directive language. This environment includes a parallel programming methodology for the OpenMP programming model and a set of tools (Ursa Minor and InterPol) that support this methodology. Our toolset provides automated and interactive assistance to parallel programmers in the time-consuming tasks of the proposed methodology. The features provided by our tools include performance and program structure visualization, interactive optimization, support for performance modeling, and performance advising for finding and correcting performance problems. The presented evaluation demonstrates that our environment offers significant support in general parallel tuning efforts and that the toolset facilitates many common tasks in OpenMP parallel programming in an efficient manner.

CiteSeerX

Tradeoffs in buffering speculative memory state for thread-level speculation in multiprocessors

Author: Akkary H.
Cintra M.
Figueiredo R.
Garzarán M. J.
Gopal S.
Gupta M.
Hammond L.
Josep Torrellas
José María Llabería
Knight T.
Lawrence Rauchwerger
Marcuello P.
María Jesús Garzarán
Milos Prvulovic
Prvulovic M.
Rauchwerger L.
Rundberg P.
Sohi G. S.
Steffan J.
Tremblay M.
Víctor Viñals
Zhang Y.
Publication venue: 'Association for Computing Machinery (ACM)'

Crossref

Tiled Algorithms for Matrix Computations on Multicore Architectures

Author: Elizabeth R. Jessup
Gita Alaghband
Henricus M Bouwmeester
Julien Langou (Advisor)
Stephen Billups
Publication date: 13/03/2013

The current computer architecture has moved towards the multi/many-core structure. However, the algorithms in the current sequential dense numerical linear algebra libraries (e.g. LAPACK) do not parallelize well on multi/many-core architectures. A new family of algorithms, the tile algorithms, has recently been introduced to circumvent this problem. Previous research has shown that it is possible to write efficient and scalable tile algorithms for performing a Cholesky factorization, a (pseudo) LU factorization, and a QR factorization. The goal of this thesis is to study tiled algorithms in a multi/many-core setting and to provide new algorithms which exploit the current architecture to improve performance relative to current state-of-the-art libraries while maintaining the stability and robustness of these libraries.

Comment: PhD Thesis, 2012 http://math.ucdenver.ed

arXiv.org e-Print Archive

CiteSeerX

SPIRAL: Code Generation for DSP Transforms

Author: A. Gacic
B.W. Singer
D. Padua
F. Franchetti
J.M.F. Moura
J.R. Johnson
Jianxin Xiong
K. Chen
M. Puschel
M.M. Veloso
N. Rizzolo
R.W. Johnson
Y. Voronenko
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'

Crossref

On the Automatic Parallelization of the Perfect Benchmarks

Author: David Padua
Jay Hoeflinger
Rudolf Eigenmann
Publication date: 01/01/1994

This paper presents the results of the Cedar Hand-Parallelization Experiment, conducted from 1989 through 1992 within the Center for Supercomputing Research and Development (CSRD) at the University of Illinois. In this experiment we manually transformed the Perfect Benchmarks® into parallel program versions. In doing so, we used techniques that may be automated in an optimizing compiler. We then ran these programs on the Cedar multiprocessor (built at CSRD during the 1980s) and measured the speed improvement due to each technique.

CiteSeerX

On the Automatic Parallelization of the Perfect Benchmarks®

Author: Rudolf Eigenmann
Jay Hoeflinger
David Padua

This paper presents the results of the Cedar Hand-Parallelization Experiment, conducte

CiteSeerX