49 research outputs found

A Hybrid Approach to Proving Memory Reference Monotonicity

Author: C.E. Oancea
H. Yu
J. Hoeflinger
L. Rauchwerger
M. Berry
M.W. Hall
P. Feautrier
S. Rus
T. Fahringer
W. Blume
W. Pugh
Y. Lin
Y. Paek
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013

Crossref

Value Prediction and Speculative Execution on GPU

Author: Christine Eisenbeis
Jean-Luc Gaudiot
L. Hammond
L. Rauchwerger
M. Franklin
S. Liu
Shaoshan Liu
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Tradeoffs in buffering speculative memory state for thread-level speculation in multiprocessors

Author: Akkary H.
Cintra M.
Figueiredo R.
Garzarán M. J.
Gopal S.
Gupta M.
Hammond L.
Josep Torrellas
José María Llabería
Knight T.
Lawrence Rauchwerger
Marcuello P.
María Jesús Garzarán
Milos Prvulovic
Prvulovic M.
Rauchwerger L.
Rundberg P.
Sohi G. S.
Steffan J.
Tremblay M.
Víctor Viñals
Zhang Y.
Publication venue: 'Association for Computing Machinery (ACM)'

Crossref

Automatically Harnessing Sparse Acceleration

Sparse linear algebra is central to many scientific programs, yet compilers fail to optimize it well. High-performance libraries are available, but adoption costs are significant. Moreover, libraries tie programs into vendor-specific software and hardware ecosystems, creating non-portable code. In this paper, we develop a new approach based on our specification Language for implementers of Linear Algebra Computations (LiLAC). Rather than requiring the application developer to (re)write every program for a given library, the burden is shifted to a one-off description by the library implementer. The LiLAC-enabled compiler uses this to insert appropriate library routines without source code changes. LiLAC provides automatic data marshaling, maintaining state between calls and minimizing data transfers. Appropriate places for library insertion are detected in compiler intermediate representation, independent of source languages. We evaluated on large-scale scientific applications written in FORTRAN; standard C/C++ and FORTRAN benchmarks; and C++ graph analytics kernels. Across heterogeneous platforms, applications and data sets we show speedups of $1.1\times$ to over $10\times$ without user intervention. Comment: Accepted to CC 202

arXiv.org e-Print Archive

Crossref

Edinburgh Research Explorer

New insights into the synergism of nucleoside analogs with radiotherapy

Author: A Cohen
A Fyrberg
A Neschadim
A Sandoval
A Saven
A Zhenchuk
AD Bulgar
AD Seidman
AR Pettitt
B Ewald
B Lund
B Pauwels
BD Cheson
Bo Xu
C Nabhan
C Smal
C Xie
C Yang
CE Cass
CF Pollera
CM Galmarini
CO Rodriguez Jr
CU Lambe
D Genini
D Latz
DA Carson
DR Rauchwerger
DS Shewach
E Sabini
ER Giblett
ES Arner
ES Casper
EW Gelfand
FJ Keith
FM Wachters
GP Leung
H Anderson
H Kawasaki
HB Latourette
HM Kantarjian
J Bernier
J Carmichael
J Griffig
J Sigmond
JA Montgomery
JA Ubersax
JK Lamba
JK Owens
JR Mackey
JS Ryu
JWG van Putten
K Bhalla
K Fabianowska-Majewska
K Lotfi
K Ohmine
KL Prus
KM King
L Danhauser
L Li
L Taricani
L Wang
LD Piro
LE Robertson
M Grever
M Iacobini
M Johansson
M Moore
M Nitsche
M Sundaram
MA Stackhouse
Michael W Lee
MJ Cariveau
MJ Pugmire
NA Kocabas
NM Chandler
P Hatzis
P Hentosh
P Huang
P Rossolillo
PL Bonate
R Amsailale
RL Capizzi
RP Abratt
RW Brockman
S Hazra
S Nagai
S Seto
T McSorley
T Yamauchi
TA Krenitsky
TS Lawrence
U Consoli
V Gandhi
V Gregoire
V Heinemann
V Verhoef
VI Avramis
VL Damaraju
VM Santana
W Plunkett
WB Parker
William B Parker
Y Saiki
Y Zhang
Z Csapo
ZS Chen
Publication venue: 'Information Bulletin on Variable Stars (IBVS)'
Publication date: 01/01/2013

Nucleoside analogs have been frequently used in combination with radiotherapy in the clinical setting, as it has long been understood that inhibition of DNA repair pathways is an important means by which many nucleoside analogs synergize. Recent advances in our understanding of the structure and function of deoxycytidine kinase (dCK), a critical enzyme required for the anti-tumor activity of many nucleoside analogs, have clarified the mechanistic role this kinase plays in chemo- and radio-sensitization. A heretofore unrecognized role of dCK in the DNA damage response and cell cycle machinery has helped explain the synergistic effect of these agents with radiotherapy. Since most currently employed nucleoside analogs are primarily activated by dCK, these findings lend fresh impetus to efforts focused on profiling and modulating dCK expression and activity in tumors. In this review, we briefly survey the pharmacology and biochemistry of the major nucleoside analogs in clinical use that are activated by dCK. This is followed by discussions of recent advances in our understanding of dCK activation via post-translational modifications in response to radiation and of current strategies aimed at enhancing this activity in cancer cells.

Crossref

Springer - Publisher Connector

PubMed Central

University of Central Florida (UCF): STARS (Showcase of Text, Archives, Research & Scholarship)

Double Inspection for Run-Time Loop Parallelization

Author: C.Q. Zhu
J.H. Saltz
L. Rauchwerger
M. Kulkarni
S.P. Midkiff
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013

Crossref

On the Scalability of an Automatically Parallelized Irregular Application

Author: J. Larus
L. Rauchwerger
L.J. Hendren
R.J. Allen
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2008

Abstract. Irregular applications, i.e., programs that manipulate pointer-based data structures such as graphs and trees, constitute a challenging target for parallelization because the amount of parallelism is input dependent and changes dynamically. Traditional dependence analysis techniques are too conservative to expose this parallelism. Even manual parallelization is difficult, time consuming, and error prone. The Galois system parallelizes such applications using an optimistic approach that exploits higher-level semantics of abstract data types. In this paper, we study the performance and scalability of a Galoised, that is, automatically parallelized, version of Delaunay mesh refinement (DR) on a shared-memory system with 128 CPUs. DR is an important irregular application that is used, e.g., in graphics and finite-element codes. The parallelized program scales to 64 threads, where it reaches a speedup of 25.8. For large numbers of threads, the performance is hampered by the load imbalance and the nonuniform memory latency, both of which grow as the number of threads increases. While these two issues will have to be addressed in future work, we believe our results already show the Galois approach to be very promising.

CiteSeerX

Crossref

An Efficient Run-Time Scheme for Exploiting Parallelism on Multiprocessor Systems

Author: C. Polychronopoulos
C. Q. Zhu
J. Saltz
L. Rauchwerger
S. Midkiff
T. C. Huang
Publication venue: 'Springer Science and Business Media LLC'

Crossref