214 research outputs found
Machine Learning in Compiler Optimization
In the last decade, machine learning based compilation has moved from an obscure research niche to a mainstream activity. In this article, we describe the relationship between machine learning and compiler optimisation and introduce the main concepts of features, models, training and deployment. We then provide a comprehensive survey and a road map for the wide variety of research areas. We conclude with a discussion of open issues in the area and potential research directions. This paper provides both an accessible introduction to the fast-moving area of machine learning based compilation and a detailed bibliography of its main achievements.
Specialization Opportunities in Graphical Workloads
Computer games are complex performance-critical graphical applications which require specialized GPU hardware. For this reason, GPU drivers often include many heuristics to help optimize throughput. Recently, however, new APIs have emerged which sacrifice many heuristics for lower-level hardware control and more predictable driver behavior. This shifts the burden for many optimizations from GPU driver developers to game programmers, but also provides numerous opportunities to exploit application-specific knowledge.
This paper examines different opportunities for specializing GPU code and reducing redundant data transfers. Static analysis of commercial games shows that 5-18% of GPU code is specializable by pruning dead data elements or moving portions to different graphics pipeline stages. In some games, up to 97% of the programs’ data inputs of a particular type, namely uniform variables, are unused, as well as up to 62% of those in the GPU internal vertex-fragment interface. This shows potential for improving memory usage and communication overheads. In some test scenarios, removing dead uniform data can lead to 6x performance improvements.
We also explore the upper limits of specialization if all dynamic inputs are constant at run-time. For instance, if uniform inputs are constant, up to 44% of instructions can be eliminated in some games, with a further 14% becoming constant-foldable at compile time. Analysis of run-time traces reveals that 48-91% of uniform inputs are constant in real games, so values close to the upper limit may be achieved in practice.
Discontinuous collocation methods and gravitational self-force applications
Numerical simulations of extreme mass ratio inspirals, the most important
sources for the LISA detector, face several computational challenges. We
present a new approach to evolving partial differential equations occurring in
black hole perturbation theory and calculations of the self-force acting on
point particles orbiting supermassive black holes. Such equations are
distributionally sourced, and standard numerical methods, such as
finite-difference or spectral methods, face difficulties associated with
approximating discontinuous functions. However, in the self-force problem we
typically have access to full a-priori information about the local structure of
the discontinuity at the particle. Using this information, we show that
high-order accuracy can be recovered by adding to the Lagrange interpolation
formula a linear combination of certain jump amplitudes. We construct
discontinuous spatial and temporal discretizations by operating on the
corrected Lagrange formula. In a method-of-lines framework, this provides a
simple and efficient method of solving time-dependent partial differential
equations, without loss of accuracy near moving singularities or
discontinuities. This method is well-suited for the problem of time-domain
reconstruction of the metric perturbation via the Teukolsky or
Regge-Wheeler-Zerilli formalisms. Parallel implementations on modern CPU and
GPU architectures are discussed.
Comment: 29 pages, 5 figures
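The jump-corrected interpolation described in this abstract can be illustrated with a small sketch. This is not the authors' code: the function names (`lagrange_basis`, `jump_poly`, `interpolate`), the test function, the jump location `xi` and the jump amplitudes are all hypothetical choices, and the specific form of the correction (shifting each node value across the discontinuity by a Taylor polynomial built from the known jumps) is an assumption about how such a correction can be written. It compares plain Lagrange interpolation with the corrected version on a function that has a known jump in value and in first derivative.

```python
import math
import numpy as np

def lagrange_basis(nodes, x):
    """Return P with P[i, j] = pi_j(x[i]), the j-th Lagrange basis polynomial."""
    P = np.ones((x.size, nodes.size))
    for j, xj in enumerate(nodes):
        for k, xk in enumerate(nodes):
            if k != j:
                P[:, j] *= (x - xk) / (xj - xk)
    return P

def jump_poly(y, jumps):
    """g(y) = sum_m J_m y^m / m!, where J_m is the known jump in the m-th derivative."""
    g = np.zeros_like(y, dtype=float)
    for m, J in enumerate(jumps):
        g += J * y**m / math.factorial(m)
    return g

def step(z):
    """Heaviside step with the convention H(0) = 0."""
    return (z > 0).astype(float)

def interpolate(nodes, fvals, x, xi=None, jumps=()):
    """Lagrange interpolation; if xi is given, add the jump correction at xi."""
    P = lagrange_basis(nodes, x)
    if xi is None:
        return P @ fvals
    g = jump_poly(nodes - xi, jumps)
    # Node values on the opposite side of xi from x are shifted across the jump.
    corr = (step(x - xi)[:, None] - step(nodes - xi)[None, :]) * g[None, :]
    return np.sum((fvals[None, :] + corr) * P, axis=1)

# Hypothetical test problem: smooth part sin(x) plus known jumps J_0, J_1 at xi.
xi, jumps = 0.3, (1.0, -2.0)

def f(x):
    return np.sin(x) + step(x - xi) * (jumps[0] + jumps[1] * (x - xi))

N = 12
nodes = np.cos(np.pi * np.arange(N + 1) / N)  # Chebyshev-Gauss-Lobatto nodes
x = np.linspace(-1.0, 1.0, 1000)              # evaluation grid (does not contain xi)

err_plain = np.max(np.abs(interpolate(nodes, f(nodes), x) - f(x)))
err_corrected = np.max(np.abs(interpolate(nodes, f(nodes), x, xi, jumps) - f(x)))
print(err_plain, err_corrected)
```

In this toy setting the jumps are polynomial, so the correction removes the discontinuity exactly and the remaining error is that of interpolating the smooth part alone. Differentiating the corrected formula to build differentiation matrices, as the abstract describes, is not attempted here.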
Discontinuous collocation and symmetric integration methods for distributionally-sourced hyperboloidal partial differential equations
This work outlines a time-domain numerical integration technique for linear
hyperbolic partial differential equations sourced by distributions (Dirac
δ-functions and their derivatives). Such problems arise when studying
binary black hole systems in the extreme mass ratio limit. We demonstrate that
such source terms may be converted to effective domain-wide sources when
discretized, and we introduce a class of time-steppers that directly account
for these discontinuities in time integration. Moreover, our time-steppers are
constructed to respect time reversal symmetry, a property that has been
connected to conservation of physical quantities like energy and momentum in
numerical simulations. To illustrate the utility of our method, we numerically
study a distributionally-sourced wave equation that shares many features with
the equations governing linear perturbations to black holes sourced by a point
mass.
Comment: 29 pages, 4 figures
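The link between time-reversal symmetry and conservation mentioned in this abstract can be seen in a textbook setting. The sketch below is not the class of time-steppers introduced in the paper: it uses the ordinary trapezoidal rule, a standard time-symmetric scheme, on a harmonic oscillator, and compares it with explicit Euler, which is not time-symmetric. All names (`trapezoidal_step`, `euler_step`, `energy`) and parameter values are hypothetical.

```python
import numpy as np

def trapezoidal_step(A, u, dt):
    """One trapezoidal-rule step for du/dt = A u (a time-symmetric scheme)."""
    I = np.eye(A.shape[0])
    return np.linalg.solve(I - 0.5 * dt * A, (I + 0.5 * dt * A) @ u)

def euler_step(A, u, dt):
    """One explicit Euler step for du/dt = A u (not time-symmetric)."""
    return u + dt * (A @ u)

def energy(u):
    """Energy of the unit harmonic oscillator, E = (q^2 + p^2) / 2."""
    return 0.5 * np.dot(u, u)

# Harmonic oscillator q' = p, p' = -q written as du/dt = A u with u = (q, p).
A = np.array([[0.0, 1.0], [-1.0, 0.0]])
u0 = np.array([1.0, 0.0])
dt, n_steps = 0.1, 1000

u_trap, u_euler = u0.copy(), u0.copy()
for _ in range(n_steps):
    u_trap = trapezoidal_step(A, u_trap, dt)
    u_euler = euler_step(A, u_euler, dt)

drift_trap = abs(energy(u_trap) - energy(u0))
drift_euler = abs(energy(u_euler) - energy(u0))

# Time-reversal check: one step forward, then one step with -dt, returns to u0.
u_back = trapezoidal_step(A, trapezoidal_step(A, u0, dt), -dt)

print(drift_trap, drift_euler, u_back)
```

For this linear, skew-symmetric system the trapezoidal update is an orthogonal map, so the energy is preserved up to rounding, while the Euler energy grows by a factor of 1 + dt^2 per step. Handling distributional source terms inside the time integration, which is the subject of the paper, is outside the scope of this sketch.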