13 research outputs found

GPU Behavior on a Large HPC Cluster

Author: A. Danalis
F. Cappello
M. Fatica
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014

Online impact analysis via dynamic compilation technology

Author: A. Danalis
B. Breech
Lori Pollock
Stacey Shindo

Dynamic impact analysis based on whole path profiling of method calls and returns has been shown to provide more useful predictions of software change impacts than method-level static slicing and to avoid the overhead of expensive dependency analysis needed for dynamic slicing-based impact analysis. This paper presents the design, implementation, and evaluation of an online approach to dynamic impact analysis as an extension to the DynamoRIO binary code modification system and to the Jikes Research Virtual Machine. Storage and postmortem analysis of program traces, even compressed, are avoided.

CiteSeerX

Maestro: Data Orchestration and Tuning for OpenCL Devices

Author: A. Danalis
C.I. Rodrigues
J. Bolz
J. Owens
J.E. Stone
J.S. Meredith
S. Venkatasubramanian
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2010

Crossref

An OpenMP 3.1 Validation Testsuite

Author: A. Danalis
A. Duran
C. Bienia
C. Burgess
C. Liao
M. Wong
V. Aslot
W. McKeeman
X. Yang
Publication date: 01/01/2012

Parallel programming models are evolving so rapidly that it must be ensured that OpenMP can be used easily to program multicore devices. There is also an effort under way to have OpenMP accepted as a de facto standard in the embedded systems community. However, ensuring the correctness of an OpenMP implementation requires an up-to-date validation suite. In this paper, we present a portable and robust validation testsuite execution environment to validate the OpenMP implementation in several compilers. We cover all the directives and clauses of OpenMP up to the latest release, OpenMP Version 3.1. Our primary focus is to determine and evaluate the correctness of the OpenMP implementation in our research compiler, OpenUH, and a few others such as Intel, Sun/Oracle, and GNU. We also aim to find ambiguities in the OpenMP specification and to help refine it with the validation suite. Furthermore, we include deeper tests such as cross tests and orphan tests in the testsuite.

CiteSeerX

Crossref

Improving Performance Portability in OpenCL Programs

Author: A. Danalis
A.A. Chien
D. Loveman
J. Fang
J.A. Stratton
K. Goto
P. Du
P. Thoman
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013

Crossref

An Automated Approach to Improve Communication-Computation Overlap in Clusters. Senior Thesis

Author: A. Danalis
E. Zapata
F. J. Peters
G. R. Joubert
L. Fishgold
L. Pollock
M. Swany
O. Plata
P. Tirado
W. E. Nagel

Permission to make digital or hard copies of portions of this work for personal or classroom use is granted provided that the copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page.

CiteSeerX

A Case for Non-blocking Collective Operations

Author: A. Alexandrov
A. Danalis
A. Dubey
A. Wagner
C. Iancu
F. Baude
F. Petrini
J.S. Vetter
P. Shivam
P. Terry
P.Y. Calland
R. Brightwell
S. Gorlatch
W. Lawry
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2006

Crossref

CUDA-For-Clusters: A System for Efficient Execution of CUDA Kernels on Multi-Core Clusters

Author: A. Danalis
B. Chamberlain
C. Amza
F. Cappello
G.F. Diamos
I. Gelado
J. Gummaraju
J.A. Stratton
K. Li
M. Snir
N.P. Manoj
P. Charles
S.V. Adve
Publication date: 01/01/2012

Rapid advancements in multi-core processor architectures along with low-cost, low-latency, high-bandwidth interconnects have made clusters of multi-core machines a common computing resource. Unfortunately, writing good parallel programs to efficiently utilize all the resources in such a cluster is still a major challenge. Programmers have to manually deal with low-level details that should ideally be the responsibility of an intelligent compiler or a run-time layer. Various programming languages have been proposed as a solution to this problem, but are yet to be adopted widely to run performance-critical code, mainly due to the relatively immature software framework and the effort involved in re-writing existing code in the new language. In this paper, we motivate and describe our initial study in exploring CUDA as a programming language for a cluster of multi-cores. We develop CUDA-For-Clusters (CFC), a framework that transparently orchestrates execution of CUDA kernels on a cluster of multi-core machines. The well-structured nature of a CUDA kernel, the growing number of CUDA developers and benchmarks, and the stability of the CUDA software stack collectively make CUDA a good candidate to be considered as a programming language for a cluster. CFC uses a mixture of source-to-source compiler transformations, a work distribution runtime, and a light-weight software distributed shared memory to manage parallel executions. Initial results on running several standard CUDA benchmark programs achieve impressive speedups of up to 7.5X on a cluster with 8 nodes, thereby opening up an interesting direction of research for further investigation.

CiteSeerX

Crossref

Open Access Repository of IISc Research Publications

Fast polyenergetic forward projection for image formation using OpenCL on a heterogeneous parallel computing platform

Author: Danalis A.
Dorgham O.
Gendrin C. W. C.
Keck B.
Knaup M.
Riabkov D.
Ruijters D.
Scherl H.
Zubal I. G.
Publication venue: 'Wiley'

Crossref

Extreme-Scale Task-Based Cholesky Factorization Toward Climate and Weather Prediction Applications

Author: Agullo E.
Akbudak K.
Amestoy P. R.
Bauer M.
Bebendorf M.
Bosilca G.
Chan E.
Danalis A.
Dokulil J.
Garg R.
Hoque R.
Lacoste X.
Ltaief H.
Parr R.G.
Pei Y.
Tillenius M.
Tsafrir D.
Wilson A. G.
Wu W.
Yu C.D.
Publication date: 29/06/2020

Crossref

The University of Manchester - Institutional Repository