Discontinuous collocation methods and gravitational self-force applications
Numerical simulations of extreme mass ratio inspirals, the most important
sources for the LISA detector, face several computational challenges. We
present a new approach to evolving partial differential equations occurring in
black hole perturbation theory and calculations of the self-force acting on
point particles orbiting supermassive black holes. Such equations are
distributionally sourced, and standard numerical methods, such as
finite-difference or spectral methods, face difficulties associated with
approximating discontinuous functions. However, in the self-force problem we
typically have access to full a-priori information about the local structure of
the discontinuity at the particle. Using this information, we show that
high-order accuracy can be recovered by adding to the Lagrange interpolation
formula a linear combination of certain jump amplitudes. We construct
discontinuous spatial and temporal discretizations by operating on the
corrected Lagrange formula. In a method-of-lines framework, this provides a
simple and efficient method of solving time-dependent partial differential
equations, without loss of accuracy near moving singularities or
discontinuities. This method is well-suited for the problem of time-domain
reconstruction of the metric perturbation via the Teukolsky or
Regge-Wheeler-Zerilli formalisms. Parallel implementations on modern CPU and
GPU architectures are discussed. Comment: 29 pages, 5 figures
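The jump-corrected interpolation described in the abstract above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes a single discontinuity at a known location `xi`, with known jumps `jumps[m]` in the m-th derivative across it, and the function names (`lagrange_basis`, `jump_correction`, `interpolate`) are hypothetical.

```python
import numpy as np
from math import factorial


def lagrange_basis(nodes, j, x):
    """Value at x of the j-th Lagrange basis polynomial on `nodes`."""
    others = np.delete(nodes, j)
    return np.prod((x - others) / (nodes[j] - others))


def jump_correction(xj, x, xi, jumps):
    """Shift applied to the sample at node xj when evaluating at x.

    jumps[m] is the assumed-known jump in the m-th derivative at xi
    (right-hand limit minus left-hand limit).
    """
    # Truncated Taylor series of the jump, evaluated at the node.
    g = sum(J * (xj - xi) ** m / factorial(m) for m, J in enumerate(jumps))
    if xj < xi <= x:   # node left of the jump, target on the right
        return g
    if x < xi <= xj:   # node right of the jump, target on the left
        return -g
    return 0.0         # node and target on the same side: no shift


def interpolate(nodes, values, x, xi=None, jumps=()):
    """Lagrange interpolation at x; jump-corrected when xi is given."""
    total = 0.0
    for j, (xj, fj) in enumerate(zip(nodes, values)):
        shift = 0.0 if xi is None else jump_correction(xj, x, xi, jumps)
        total += (fj + shift) * lagrange_basis(nodes, j, x)
    return total
```

Calling `interpolate(nodes, values, x)` gives the ordinary Lagrange interpolant, while `interpolate(nodes, values, x, xi, jumps)` shifts the samples on the far side of the discontinuity so that the polynomial only ever sees data from one smooth branch of the function.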
Discontinuous collocation and symmetric integration methods for distributionally-sourced hyperboloidal partial differential equations
This work outlines a time-domain numerical integration technique for linear
hyperbolic partial differential equations sourced by distributions (Dirac
delta functions and their derivatives). Such problems arise when studying
binary black hole systems in the extreme mass ratio limit. We demonstrate that
such source terms may be converted to effective domain-wide sources when
discretized, and we introduce a class of time-steppers that directly account
for these discontinuities in time integration. Moreover, our time-steppers are
constructed to respect time reversal symmetry, a property that has been
connected to conservation of physical quantities like energy and momentum in
numerical simulations. To illustrate the utility of our method, we numerically
study a distributionally-sourced wave equation that shares many features with
the equations governing linear perturbations to black holes sourced by a point
mass. Comment: 29 pages, 4 figures
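The link between time-reversal symmetry and conservation mentioned in the abstract above can be illustrated on a toy problem. The sketch below does not reproduce the paper's time-steppers. It swaps in the implicit trapezoidal rule, one of the simplest time-symmetric schemes, and compares it with forward Euler on a linear system du/dt = A u. The function names and the harmonic-oscillator setup in the usage note are assumptions made for illustration.

```python
import numpy as np


def trapezoidal_step(A, u, h):
    """One implicit trapezoidal step for du/dt = A u (time-symmetric)."""
    I = np.eye(len(u))
    # Solve (I - h/2 A) u_new = (I + h/2 A) u_old.
    return np.linalg.solve(I - 0.5 * h * A, (I + 0.5 * h * A) @ u)


def euler_step(A, u, h):
    """One forward Euler step for du/dt = A u (not time-symmetric)."""
    return u + h * (A @ u)


def energy(u):
    """Energy of a unit harmonic oscillator with state u = (q, p)."""
    return 0.5 * float(u @ u)
```

With `A = np.array([[0.0, 1.0], [-1.0, 0.0]])` (a unit harmonic oscillator), repeated calls to `trapezoidal_step` keep `energy(u)` constant up to rounding, and a step with `h` followed by a step with `-h` returns the initial state. Forward Euler has neither property, and its energy grows steadily.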
The effect of body mass index and melphalan dose adjustments on outcomes in patients undergoing autologous haematopoietic cell transplantation for multiple myeloma: a single-centre retrospective study
Neural Architecture Search as Program Transformation Exploration
Improving the performance of deep neural networks (DNNs) is important to both
the compiler and neural architecture search (NAS) communities. Compilers apply
program transformations in order to exploit hardware parallelism and memory
hierarchy. However, legality concerns mean they fail to exploit the natural
robustness of neural networks. In contrast, NAS techniques mutate networks by
operations such as the grouping or bottlenecking of convolutions, exploiting
the resilience of DNNs. In this work, we express such neural architecture
operations as program transformations whose legality depends on a notion of
representational capacity. This allows them to be combined with existing
transformations into a unified optimization framework. This unification allows
us to express existing NAS operations as combinations of simpler
transformations. Crucially, it allows us to generate and explore new tensor
convolutions. We prototyped the combined framework in TVM and were able to find
optimizations across different DNNs that significantly reduce inference time,
by over 3x in the majority of cases.
Furthermore, our scheme dramatically reduces NAS search time. Code is
available at
https://github.com/jack-willturner/nas-as-program-transformation-exploration
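To see why grouping a convolution, one of the NAS operations named in the abstract above, trades representational capacity for speed, it helps to count weights and multiply-accumulate operations. The helper below is a hypothetical back-of-the-envelope sketch, not code from the linked repository. It assumes a square kernel, and `height` and `width` refer to the output feature map.

```python
def conv_cost(c_in, c_out, kernel, height, width, groups=1):
    """Return (weights, MACs) for a 2D convolution with `groups` groups.

    With groups > 1 each output channel reads only c_in / groups
    input channels, so both counts shrink by a factor of `groups`.
    """
    if c_in % groups or c_out % groups:
        raise ValueError("groups must divide both channel counts")
    weights = c_out * (c_in // groups) * kernel * kernel
    macs = weights * height * width  # one MAC per weight per output pixel
    return weights, macs
```

For example, comparing `conv_cost(64, 128, 3, 32, 32)` with `conv_cost(64, 128, 3, 32, 32, groups=4)` shows both counts dropping by a factor of four. A conventional compiler cannot apply such a rewrite because it changes what the program computes.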
mlirSynth: Automatic, Retargetable Program Raising in Multi-Level IR using Program Synthesis
MLIR is an emerging compiler infrastructure for modern hardware, but existing
programs cannot take advantage of MLIR's high-performance compilation if they
are described in lower-level general purpose languages. Consequently, to avoid
programs needing to be rewritten manually, this has led to efforts to
automatically raise lower-level to higher-level dialects in MLIR. However,
current methods rely on manually-defined raising rules, which limit their
applicability and make them challenging to maintain as MLIR dialects evolve.
We present mlirSynth -- a novel approach which translates programs from
lower-level MLIR dialects to high-level ones without manually defined rules.
Instead, it uses available dialect definitions to construct a program space and
searches it effectively using type constraints and equivalences. We demonstrate
its effectiveness by raising C programs to two distinct high-level MLIR
dialects, which enables us to use existing high-level dialect specific
compilation flows. On Polybench, we show a greater coverage than previous
approaches, resulting in geomean speedups of 2.5x (Intel) and 3.4x (AMD) over
state-of-the-art compilation flows for the C programming language. mlirSynth
also enables retargetability to domain-specific accelerators, resulting in a
geomean speedup of 21.6x on a TPU.
HETSIM: Simulating Large-Scale Heterogeneous Systems using a Trace-driven, Synchronization and Dependency-Aware Framework
- …