Introducing Molly: Distributed Memory Parallelization with LLVM

Author: Kruse Michael
Publication date: 01/01/2013

Programming for distributed memory machines has always been tedious, yet it remains necessary because compilers have not been able to optimize sufficiently for such machines on their own. Molly is an extension to the LLVM compiler toolchain that can distribute and reorganize workload and data, provided the program is organized in statically determined loop control flows. These are represented as polyhedral integer-point sets, to which program transformations can be applied. Memory distribution and layout can be declared by the programmer as needed, and the necessary asynchronous MPI communication is generated automatically. The primary motivation is to run Lattice QCD simulations on IBM Blue Gene/Q supercomputers, but since the implementation is not yet complete, this paper demonstrates the capabilities on Conway's Game of Life.
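
The kind of loop nest this abstract refers to can be illustrated with a minimal Python sketch of one Game of Life generation. It is not Molly code and uses none of Molly's declarations; the function name `life_step` and the dead-border convention are assumptions made for illustration. The point is that both loop bounds depend only on the grid size, so the set of iterations is a rectangle of integer points that is fixed before the loops run.

```python
def life_step(grid):
    """Compute one Game of Life generation; border cells are kept dead."""
    rows, cols = len(grid), len(grid[0])
    new = [[0] * cols for _ in range(rows)]
    # Statically determined loop nest: the bounds depend only on the grid
    # size, never on the cell values, so the iteration space is the
    # rectangle {(i, j) : 1 <= i < rows - 1, 1 <= j < cols - 1}.
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            # Count the eight neighbors of cell (i, j).
            neighbors = sum(
                grid[i + di][j + dj]
                for di in (-1, 0, 1)
                for dj in (-1, 0, 1)
                if (di, dj) != (0, 0)
            )
            alive = grid[i][j] == 1
            # Standard rules: birth on 3 neighbors, survival on 2 or 3.
            new[i][j] = 1 if neighbors == 3 or (alive and neighbors == 2) else 0
    return new
```

Each cell reads only its immediate neighbors, so if the grid were split across nodes, only the cells along the partition boundaries would need to be exchanged between generations.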

arXiv.org e-Print Archive

HAL-CentraleSupelec

HAL - Lille 3

INRIA a CCSD electronic archive server

HAL-Rennes 1

Interest rate convergence in the EMS prior to European Monetary Union

Author: Frömmel Michael
Kruse Robinson
Publication venue: Universiteit Gent, Faculteit Economie en Bedrijfskunde
Publication date: 01/01/2009

In this paper we analyze the convergence of interest rates in the European Monetary System (EMS) in a framework of changing persistence. This allows us to estimate the exact date of full convergence from the data. A change in persistence means that a time series switches from stationarity to non-stationarity, or vice versa. It is often argued that, due to the specific historical situation in the EMS, the interest rate differential was non-stationary before full convergence of interest rates was achieved and stationary afterwards. Our empirical results suggest that the convergence date differed considerably between Belgium, France, the Netherlands and Italy, and they are in line with the conclusions one would draw from a narrative approach. We compare three different estimators for the convergence date and find that the results are quite robust. Our results therefore stress the importance of credibility for monetary policy.

Ghent University Academic Bibliography

Lattice QCD estimate of the $\eta_{c}(2S)\to J/\psi\gamma$ decay rate

Author: Becirevic Damir
Kruse Michael
Sanfilippo Francesco
Publication venue: Springer Science and Business Media LLC
Publication date: 24/11/2014

We compute the hadronic matrix element relevant to the physical radiative decay $\eta_{c}(2S)\to J/\psi\gamma$ by means of lattice QCD. We use the (maximally) twisted mass QCD action with $N_f=2$ light dynamical quarks, and from the computations made at four lattice spacings we were able to take the continuum limit. The value of the mass ratio $m_{\eta_c(2S)}/m_{\eta_c(1S)}$ we obtain is consistent with the experimental value, and our prediction for the form factor is $V^{\eta_{c}(2S)\to J/\psi\gamma}(0)\equiv V_{12}(0)=0.32(6)(2)$, leading to $\Gamma(\eta_c (2S) \to J/\psi\gamma) = (15.7\pm 5.7)$ keV, which is much larger than $\Gamma(\psi (2S) \to \eta_c\gamma)$ and within reach of modern experiments.

Comment: 19 pages, 4 figures

arXiv.org e-Print Archive

Springer - Publisher Connector

Perfrewrite -- Program Complexity Analysis via Source Code Instrumentation

Author: Kruse Michael
Publication venue: HAL CCSD
Publication date: 01/01/2012

ACACES 2012 summer school. Most program profiling methods output the execution time of one specific program execution, but not its computational complexity class in terms of big-O notation. Perfrewrite is a tool based on LLVM's Clang compiler that rewrites a program so that it tracks semantic information while it executes and uses that information to guess memory usage, communication and computational complexity. While source code instrumentation is a standard technique for profiling, using it to derive formulas is an uncommon approach.
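
The general idea of instrumenting code to estimate complexity, rather than timing a single run, can be sketched in a few lines of Python. This is not Perfrewrite's method or interface: the tool rewrites source code through Clang and derives formulas, whereas the sketch below is instrumented by hand and swaps in a simple log-log slope estimate. All names (`OpCounter`, `instrumented_pair_sum`, `estimate_exponent`) are hypothetical.

```python
import math


class OpCounter:
    """Counts the basic operations reported by instrumented code."""

    def __init__(self):
        self.count = 0

    def tick(self):
        self.count += 1


def instrumented_pair_sum(values, counter):
    """Sum of all pairwise products, with one tick per inner iteration."""
    total = 0
    for a in values:
        for b in values:
            counter.tick()  # hand-inserted instrumentation point
            total += a * b
    return total


def estimate_exponent(func, sizes):
    """Estimate k in count ~ n**k from the smallest and largest input size."""
    counts = []
    for n in sizes:
        counter = OpCounter()
        func(list(range(n)), counter)
        counts.append(counter.count)
    # Slope of log(count) against log(n) between the two end points.
    return math.log(counts[-1] / counts[0]) / math.log(sizes[-1] / sizes[0])
```

Calling `estimate_exponent(instrumented_pair_sum, [10, 100])` gives a slope close to 2, matching the quadratic growth of the nested loop. A slope estimate of this kind only suggests a polynomial degree; it does not produce a formula.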

HAL-CentraleSupelec

HAL - Lille 3

INRIA a CCSD electronic archive server

Hal-Diderot

HAL-Rennes 1

Generalizing Hierarchical Parallelism

Author: Kruse Michael
Publication date: 04/09/2023

Since the days of OpenMP 1.0, computer hardware has become more complex, typically by specializing compute units for coarse- and fine-grained parallelism in incrementally deeper hierarchies of parallelism. Newer versions of OpenMP reacted by introducing new mechanisms for querying or controlling its individual levels, each time adding another concept such as places, teams, and progress groups. In this paper we propose going back to the roots of OpenMP in the form of nested parallelism, for a simpler model and more flexible handling of arbitrarily deep hardware hierarchies.

Comment: IWOMP'23 preprint

arXiv.org e-Print Archive