Preference-Guided Register Assignment

Author: B. Boissinot
C. Wimmer
F. Bouchez
F. Pereira
G.J. Chaitin
G.Y. Lueh
J. Fabri
J. Park
L. George
M. Braun
O. Traub
P. Briggs
P. Brisk
R. Morgan
S. Hack
T. Nakaike
T.A. Wagner
V. Sarkar
Z. Budimlić
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2010

Abstract. This paper deals with coalescing in SSA-based register allocation. Current coalescing techniques all require the interference graph to be built. This is generally considered to be too compile-time intensive for just-in-time compilation. In this paper, we present a biased coloring approach that gives results similar to standalone coalescers while significantly reducing compile time.
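
The idea of biasing register choice without an interference graph can be illustrated with a small sketch. This is not the paper's algorithm: it assumes live ranges are given as half-open intervals, that there are enough registers (no spilling), and that each value names at most one move-related partner. The function name `assign_registers` and the tuple layout are hypothetical.

```python
def assign_registers(intervals, num_regs):
    """Assign registers to live intervals, biased toward a move partner's register.

    intervals: list of (name, start, end, affinity) with half-open [start, end);
    affinity is the name of a move-related value or None (assumed layout).
    """
    free = set(range(num_regs))
    active = []   # (end, name) for values currently holding a register
    reg_of = {}
    for name, start, end, affinity in sorted(intervals, key=lambda iv: iv[1]):
        # Release registers of values whose live range has ended.
        for item in [a for a in active if a[0] <= start]:
            active.remove(item)
            free.add(reg_of[item[1]])
        if not free:
            raise RuntimeError("spilling is out of scope for this sketch")
        # Bias: reuse the partner's register when it is free, so the move
        # between the two values becomes a no-op.
        wanted = reg_of.get(affinity)
        reg = wanted if wanted in free else min(free)
        free.remove(reg)
        reg_of[name] = reg
        active.append((end, name))
    return reg_of


example = [("a", 0, 3, None), ("b", 1, 2, None), ("c", 3, 5, "b")]
print(assign_registers(example, 3))
```

In this example an unbiased choice would give `c` the lowest free register, whereas the bias places it in the register already used by `b`.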

CiteSeerX

Crossref

Global Optimization for Future Gravitational Wave Detectors' Sites

Author: Gondan Laszlo
Hendry Martin
Heng Ik Siong
Hu Yi-Ming
Kelecsenyi Nandor
Marka Szabolcs
Marka Zsuzsa
Raffai Peter
Publication venue: 'IOP Publishing'
Publication date: 01/01/2015

We consider the optimal site selection of future generations of gravitational wave detectors. Previously, Raffai et al. optimized a 2-detector network with a combined figure of merit. This optimization was extended to networks with more than two detectors in a limited way by first fixing the parameters of all other component detectors. In this work we present a more general optimization that allows the locations of all detectors to be chosen simultaneously. We adopt the metric of Raffai et al. that defines the suitability of a given detector network. Given the locations of the component detectors in the network, we compute a measure of the network's ability to distinguish the polarization, constrain the sky localization and reconstruct the parameters of a gravitational wave source. We further define the 'flexibility index' for a possible site location by counting the number of multi-detector networks with a sufficiently high figure of merit that include that site location. We confirm the conclusion of Raffai et al. that, in terms of the flexibility index as defined in this work, Australia hosts the best candidate site to build a future generation gravitational wave detector. This conclusion is valid for either a 3-detector network or a 5-detector network. For a 3-detector network, site locations in Northern Europe display a flexibility index comparable to that of sites in Australia. However, for a 5-detector network, Australia is found to be a clearly better candidate than any other location. Comment: 30 pages, 23 figures, 2 tables
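
The counting definition of the flexibility index can be sketched as follows. This is an illustration only: the function `flexibility_index`, the threshold, and the toy figure of merit are assumptions, and the toy scoring has no physical meaning.

```python
from collections import Counter
from itertools import combinations


def flexibility_index(sites, network_size, figure_of_merit, threshold):
    """Count, per site, the networks of the given size whose figure of
    merit reaches the threshold and that include the site."""
    counts = Counter({site: 0 for site in sites})
    for network in combinations(sites, network_size):
        if figure_of_merit(network) >= threshold:
            counts.update(network)  # each member site gains one network
    return dict(counts)


# Toy figure of merit (placeholder): sum of made-up per-site scores.
scores = {"A": 3, "B": 2, "C": 1, "D": 0}
toy_fom = lambda network: sum(scores[s] for s in network)

print(flexibility_index(list(scores), 2, toy_fom, threshold=4))
```

Enumerating all combinations grows quickly with the number of candidate sites, so a real study would need a coarser grid or a smarter search than this brute-force loop.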

arXiv.org e-Print Archive

Enlighten

ELTE Digital Institutional Repository (EDIT)

Efficient and Reasonable Object-Oriented Concurrency

Author: Meyer Bertrand
Nanz Sebastian
West Scott
Publication date: 27/07/2015

Making threaded programs safe and easy to reason about is one of the chief difficulties in modern programming. This work provides an efficient execution model for SCOOP, a concurrency approach that provides not only data race freedom but also pre/postcondition reasoning guarantees between threads. The extensions we propose change the underlying semantics to increase the amount of concurrent execution that is possible, exclude certain classes of deadlocks, and enable greater performance. These extensions are used as the basis of an efficient runtime and optimization pass that improve performance 15x over a baseline implementation. This new implementation of SCOOP is also 2x faster than other well-known safe concurrent languages. The measurements are based on both coordination-intensive and data-manipulation-intensive benchmarks designed to offer a mixture of workloads. Comment: Proceedings of the 10th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC/FSE '15). ACM, 201

arXiv.org e-Print Archive

CiteSeerX

Crossref

Efficient optimization of memory accesses in parallel programs

Author: Barik Rajkishore
Publication date: 01/01/2009

The power, frequency, and memory wall problems have caused a major shift in mainstream computing by introducing processors that contain multiple low power cores. As multi-core processors are becoming ubiquitous, software trends in both parallel programming languages and dynamic compilation have added new challenges to program compilation for multi-core processors. This thesis proposes a combination of high-level and low-level compiler optimizations to address these challenges. The high-level optimizations introduced in this thesis include new approaches to May-Happen-in-Parallel analysis and Side-Effect analysis for parallel programs and a novel parallelism-aware Scalar Replacement for Load Elimination transformation. A new Isolation Consistency (IC) memory model is described that permits more scalar replacement transformation opportunities than many existing memory models. The low-level optimizations include a novel approach to register allocation that retains the compile time and space efficiency of Linear Scan, while delivering runtime performance superior to both Linear Scan and Graph Coloring. The allocation phase is modeled as an optimization problem on a Bipartite Liveness Graph (BLG) data structure. The assignment phase focuses on reducing the number of spill instructions by using register-to-register move and exchange instructions wherever possible. Experimental evaluations of our scalar replacement for load elimination transformation in the Jikes RVM dynamic compiler show decreases in dynamic counts for getfield operations of up to 99.99%, and performance improvements of up to 1.76x on 1 core, and 1.39x on 16 cores, when compared with the load elimination algorithm available in Jikes RVM. A prototype implementation of our BLG register allocator in Jikes RVM demonstrates runtime performance improvements of up to 3.52x relative to Linear Scan on an x86 processor. When compared to the Graph Coloring register allocator in the GCC compiler framework, our allocator resulted in an execution time improvement of up to 5.8%, with an average improvement of 2.3% on a POWER5 processor. With the experimental evaluations combined with the foundations presented in this thesis, we believe that the proposed high-level and low-level optimizations are useful in addressing some of the new challenges emerging in the optimization of parallel programs for multi-core architectures.

CiteSeerX

DSpace at Rice University

A Polynomial Spilling Heuristic: Layered Allocation

Author: Cohen Albert
Diouf Boubacar
Rastello Fabrice
Publication venue: HAL CCSD
Publication date: 02/07/2012

Register allocation is one of the oldest and most important compiler optimizations. Its purpose is to map temporary variables either to machine registers or to main memory locations with explicit load/store instructions. The latter option is referred to as spilling. This paper addresses the minimization of the spill code overhead, one of the difficult problems in register allocation. We devised a heuristic approach called layered allocation. It is rooted in the recent advances in SSA-based register allocation. As opposed to the conventional incremental spilling approaches, our method incrementally allocates clusters of variables. We describe a new polynomial method, the layered-optimal allocator, compare it with state-of-the-art methods and with an optimal approach, and demonstrate its quasi-optimality on standard benchmarks and on two architectures.

HAL-ENS-LYON

Crossref

INRIA a CCSD electronic archive server

Hal-Diderot

Physics, Astrophysics and Cosmology with Gravitational Waves

Author: A Abramovici
A Bruce
A Buonanno
A Eckart
A Giazotto
A Hewish
A Królak
A Megevand
A Pai
A Reisenegger
A Stroeer
A Vilenkin
AA Penzias
AA Watson
AC Searle
AG Lyne
AG Riess
AI MacFadyen
AL Watts
AM Cruise
AM Sintes
B Abbott
B Allen
B Brügmann
B Caron
B Knispel
B Krishnan
B Willke
BD Lackey
BEJ Pagel
BF Schutz
BG Keating
BJ Owen
BS Sathyaprakash
BW Stappers
C Alcock
C Bennett
C Broeck Van Den
C Cutler
C Rover
C Ungarelli
CAK Robinson
CD Ott
CJ Hogan
CL MacLeod
CM Caves
CM Will
CV Vishveshwara
CW Helstrom
CW Misner
D Baskaran
D Hils
D Merritt
D Nicholson
D Pollney
D Richstone
DE Holz
DF Chernoff
DJB Payne
DN Spergel
DS Sivia
E Berti
E Coccia
E Müller
E Noyola
E Poisson
E Schreier
F Acernese
F Herrmann
F Pretorius
FA Jenet
FD Ryan
FJ Raab
FJ Zerilli
FP Gavriil
FR Pearce
G Nelemans
G Steigman
G Ushomirsky
GF Smoot
H Dimmelmeier
H Lück
H Mukhopadhyay
H Tagoshi
H Vahlbruch
HD Falcke
IS Heng
J Arons
J Crowder
J Faulkner
J Hjorth
J Hough
J Smak
J Veitch
J Weber
JA Faber
JA Gonzalez
JDE Creighton
JG Baker
JH Taylor
JH Taylor Jr
JL Friedman
JM Lattimer
JM Weisberg
JR Gair
JT Whelan
K Danzmann
K Gebhardt
K Glampedakis
K Tsubono
KA Compton
KG Arun
KS Thorne
L Baggio
L Baiotti
L Barack
L Bildsten
L Blanchet
L Gottardi
L Lindblom
L Page
L Randall
L Rezzolla
L Wen
LP Grishchuk
LS Finn
M Bonaldi
M Boyle
M Burgay
M Campanelli
M Klis van der
M Kramer
M Landgraf
M Milosavljević
M Nayyar
M Schmidt
M Shibata
M Trias
M Vallisneri
MG Haehnelt
MJ Rees
MJ Valtonen
MV Plissi
N Andersson
N Arnaud
N Christensen
N Dalal
N Stergioulas
NJ Cornish
OD Aguiar
P Ajith
P Amaro-Seoane
P Astone
P Jaranowski
P Marronetti
P Mészáros
PC Peters
PL Bender
PR Brady
R Balasubramanian
R Brustein
R Schneider
R Umstatter
RA Capon
RA Hulse
RJ Dupuis
RN Lang
RR Caldwell
RV Wagoner
RW Klebesadel
RWP Drever
S Babak
S Bose
S Chandrasekhar
S Chatterji
S Kawamura
S Klimenko
S Komossa
S Perlmutter
S Sigurdsson
SA Hughes
SA Teukolsky
SD Mohanty
SE Woosley
SF Portegies Zwart
SV Dhurandhar
T Akutsu
T Cokelaer
T Daisuke
T Damour
T Futamase
T Nakamura
T Regge
TA Apostolatos
TC Quinn
V Ferrari
V Kalogera
VF Mukhanov
VM Kaspi
W Sutherland
WG Anderson
WH Press
Y Gürsel
Y Mino
Y Pan
ZA Allen
ÉÉ Flanagan
Publication venue: 'Living Reviews'
Publication date: 01/01/2009

Gravitational wave detectors are already operating at interesting sensitivity levels, and they have an upgrade path that should result in secure detections by 2014. We review the physics of gravitational waves, how they interact with detectors (bars and interferometers), and how these detectors operate. We study the most likely sources of gravitational waves and review the data analysis methods that are used to extract their signals from detector noise. Then we consider the consequences of gravitational wave detections and observations for physics, astrophysics, and cosmology. Comment: 137 pages, 16 figures, Published version <http://www.livingreviews.org/lrr-2009-2

arXiv.org e-Print Archive

CiteSeerX

Crossref

Online Research @ Cardiff

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

MPG.PuRe

Optimizing Local Memory Allocation and Assignment Through a Decoupled Approach

Author: Cohen Albert
Diouf Boubacar
Özturk Özcan
Publication venue: HAL CCSD
Publication date: 01/10/2009

Software-controlled local memories (LMs) are widely used to provide fast, scalable, power-efficient and predictable access to critical data. While many studies have addressed LM management, keeping hot data in the LM remains a major headache. This paper revisits LM management of arrays in light of recent progress in register allocation, supporting multiple live-range splitting schemes through a generic integer linear program. These schemes differ in the grain of decision points. The model can also be extended to address fragmentation, assigning live ranges to precise offsets. We show that the links between LM management and register allocation have been underexploited, leaving many fundamental questions open and effective applications to be explored.

INRIA a CCSD electronic archive server

A new approach to reversible computing with applications to speculative parallel simulation

Author: Cingolani Davide
Publication date: 22/02/2019

In this thesis, we propose an innovative approach to reversible computing that shifts the focus from the operations to the memory outcome of a generic program. This choice allows us to overcome some typical challenges of "plain" reversible computing. Our methodology is to instrument a generic application with the help of an instrumentation tool, namely Hijacker, which we have redesigned and developed for the purpose. Through compile-time instrumentation, we enhance the program's code to keep track of the memory trace it produces until the end. Regardless of the complexity behind the generation of each computational step of the program, we can build inverse machine instructions just by inspecting the instruction that is attempting to write some value to memory. From this information, we craft an ad-hoc instruction that conveys the value being overwritten and the location where it must be restored. This instruction becomes part of a more comprehensive structure, namely the reverse window. Through this structure, we have sufficient information to undo all the updates made by the generic program during its execution. In this thesis, we discuss the structure of the reverse window as the building block of the whole reversing framework we designed and implemented. Although we situate our solution in the specific context of parallel discrete event simulation (PDES) adopting the Time Warp synchronization protocol, this framework paves the way for further general-purpose development and use. We also present two additional contributions stemming from our reversibility approach, both of which still embrace the traditional state-saving-based rollback strategy. The first contribution aims to harness the advantages of both approaches. We implement the rollback operation by combining state saving with our reversible support through a mathematical model. This model enables the system to autonomously choose the best rollback strategy according to the changing runtime dynamics of programs. The second contribution explores an orthogonal direction, still related to reversible computing. In particular, we address the problem of reversing shared libraries. By their nature, shared objects are visible to the whole system, and so is every external modification of their code. As a consequence, it is not possible to instrument them without affecting other unaware applications. We propose a different method to deal with the instrumentation of shared objects. All our proposals have been assessed using the latest generation of the open source ROOT-Sim PDES platform, where we integrated our solutions. ROOT-Sim is a C-based package implementing a general purpose simulation environment based on the Time Warp synchronization protocol.
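
The reverse-window idea (record the value being overwritten and its location, then replay those records backwards to roll back) can be modelled at a high level as follows. This is only a sketch under stated assumptions: the thesis generates inverse machine instructions via Hijacker, whereas this model uses a Python list as memory, and the class and method names (`ReverseWindow`, `write`, `rollback`) are hypothetical.

```python
class ReverseWindow:
    """Toy undo log: one (address, old_value) record per instrumented write."""

    def __init__(self, memory):
        self.memory = memory      # assumed: a mutable list standing in for RAM
        self.undo_log = []        # records in program order, oldest first

    def write(self, address, value):
        # Instrumented store: save the value about to be overwritten first.
        self.undo_log.append((address, self.memory[address]))
        self.memory[address] = value

    def rollback(self):
        # Apply the inverse records newest-first so earlier values win.
        while self.undo_log:
            address, old_value = self.undo_log.pop()
            self.memory[address] = old_value


memory = [1, 2, 3]
window = ReverseWindow(memory)
window.write(0, 10)
window.write(0, 20)
window.write(2, 30)
window.rollback()
print(memory)
```

Undoing in reverse order matters when the same address is written more than once: only the oldest record holds the value that was there before the window opened.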

Archivio della ricerca- Università di Roma La Sapienza