Reduced Order and Surrogate Models for Gravitational Waves
We present an introduction to some of the state of the art in reduced order
and surrogate modeling in gravitational wave (GW) science. Approaches that we
cover include Principal Component Analysis, Proper Orthogonal Decomposition,
the Reduced Basis approach, the Empirical Interpolation Method, Reduced Order
Quadratures, and Compressed Likelihood evaluations. We divide the review into
three parts: representation/compression of known data, predictive models, and
data analysis. The target audience is practitioners in GW
science, a field in which building predictive models and data analysis tools
that are both accurate and fast to evaluate, especially when dealing with large
amounts of data and intensive computations, is necessary yet can be
challenging. As such, practical presentations and, sometimes, heuristic
approaches are preferred here over rigor when the latter is not available. This
review aims to be self-contained, within reasonable page limits, and to require
little prior knowledge (at the undergraduate level) of mathematics,
scientific computing, and other disciplines. Emphasis is placed on optimality,
as well as the curse of dimensionality and approaches that might have the
promise of beating it. We also review most of the state of the art of GW
surrogates. Some numerical algorithms, conditioning details, scalability,
parallelization and other practical points are discussed. The approaches
presented are to a large extent non-intrusive and data-driven and may therefore
be applicable to other disciplines. We close with open challenges in
high-dimensional surrogates, which are not unique to GW science.
Comment: Invited article for Living Reviews in Relativity. 93 page
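To make the Reduced Basis idea named in the abstract concrete, here is a minimal sketch of a greedy basis selection in plain Python. It is not taken from the review: the toy training set (sampled sinusoids rather than GW waveforms), the tolerance, and the helper names `dot`, `residual`, and `greedy_reduced_basis` are all assumptions made for illustration.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def residual(v, basis):
    """Return v minus its projection onto span(basis) (modified Gram-Schmidt)."""
    r = list(v)
    for e in basis:
        c = dot(e, r)
        r = [ri - c * ei for ri, ei in zip(r, e)]
    return r

def greedy_reduced_basis(training, tol):
    """Add the worst-approximated training vector until all errors are <= tol."""
    basis, picked = [], []
    while len(basis) < len(training):
        residuals = [residual(v, basis) for v in training]
        errors = [math.sqrt(dot(r, r)) for r in residuals]
        worst = max(range(len(training)), key=errors.__getitem__)
        if errors[worst] <= tol:
            break
        # Normalise the residual so the basis stays orthonormal.
        basis.append([x / errors[worst] for x in residuals[worst]])
        picked.append(worst)
    return basis, picked

# Toy training set (an assumption, not GW data): sin(w * t) on a uniform grid.
times = [k / 199 for k in range(200)]
freqs = [1.0 + 2.0 * m / 39 for m in range(40)]
training = [[math.sin(w * t) for t in times] for w in freqs]

basis, picked = greedy_reduced_basis(training, tol=1e-6)
max_error = max(math.sqrt(dot(r, r))
                for r in (residual(v, basis) for v in training))
print(len(training), "training vectors ->", len(basis),
      "basis vectors; max projection error", max_error)
```

Because this family depends smoothly on the parameter `w`, the sketch should need far fewer basis vectors than training vectors, which is the compression effect that reduced order models rely on.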
Systematic Design Methods for Efficient Off-Chip DRAM Access
Typical design flows for digital hardware take, as their input, an abstract description
of computation and data transfer between logical memories. No existing commercial
high-level synthesis tool demonstrates the ability to map logical memory inferred from
a high level language to external memory resources. This thesis develops techniques for
doing this, specifically targeting off-chip dynamic memory (DRAM) devices. These are
a commodity technology in widespread use with standardised interfaces. In use, the
bandwidth of an external memory interface and the latency of memory requests asserted
on it may become the bottleneck limiting the performance of a hardware design. Careful
consideration of this is especially important when designing with DRAMs, whose latency
and bandwidth characteristics depend upon the sequence of memory requests issued by
a controller.
Throughout the work presented here, we pursue exact compile-time methods for designing
application-specific memory systems with a focus on guaranteeing predictable performance
through static analysis. This contrasts with much of the surveyed existing work,
which considers general purpose memory controllers and optimized policies which improve
performance in experiments run using simulation of suites of benchmark codes.
The work targets loop-nests within imperative source code, extracting a mathematical
representation of the loop-nest statements and their associated memory accesses, referred
to as the ‘Polytope Model’. We extend this mathematical representation to represent the
physical DRAM ‘row’ and ‘column’ structures accessed when performing memory transfers.
From this augmented representation, we can automatically derive DRAM controllers
which buffer data in on-chip memory and transfer data in an efficient order. Buffering
data and exploiting ‘reuse’ of data is shown to enable up to a 50× reduction in the quantity
of data transferred to external memory. Reordering memory transactions to exploit
knowledge of the physical layout of the DRAM device allows up to a 4× improvement in
the efficiency of those data transfers.
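The sensitivity of DRAM efficiency to request order can be illustrated with a small sketch. It assumes a deliberately simplified model that is not the thesis's own: a single bank, an open-row policy, and a hypothetical row size of 64 words. The function `row_activations` and the constants are illustrative names only.

```python
def row_activations(addresses, words_per_row):
    """Count row activations for an address trace (single bank, open-row model)."""
    open_row = None
    activations = 0
    for addr in addresses:
        row = addr // words_per_row
        if row != open_row:      # a different row must be opened first
            activations += 1
            open_row = row
    return activations

N = 64               # hypothetical N x N array, stored row-major
WORDS_PER_ROW = 64   # hypothetical DRAM row size, in words

# The same set of addresses, visited in two different loop orders.
row_major = [i * N + j for i in range(N) for j in range(N)]
col_major = [i * N + j for j in range(N) for i in range(N)]

print("row-major traversal:   ", row_activations(row_major, WORDS_PER_ROW))
print("column-major traversal:", row_activations(col_major, WORDS_PER_ROW))
```

Both traces touch exactly the same words, but in this model the column-major order opens a new row on every access while the row-major order opens each row once. Real devices add banks, timing constraints, and refresh, so the numbers here only indicate the direction of the effect.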
An Active-Library Based Investigation into the Performance Optimisation of Linear Algebra and the Finite Element Method
In this thesis, I explore an approach called "active libraries". These are libraries that take
part in their own optimisation, enabling both high-performance code and the presentation of
intuitive abstractions.
I investigate the use of active libraries in two domains. The first is dense and sparse linear algebra,
particularly the solution of linear systems of equations. The second is the specification and solution
of finite element problems.
Extending my earlier (MEng) thesis work, I describe the modifications to my linear algebra
library "Desola" required to perform sparse-matrix code generation. I show that optimisations
easily applied in the dense case using code-transformation must be applied at a higher level of
abstraction in the sparse case. I present performance results for sparse linear system solvers
generated using Desola and compare against an implementation using the Intel Math Kernel
Library. I also present improved dense linear-algebra performance results.
Next, I explore the active-library approach by developing a finite element library that captures
runtime representations of basis functions, variational forms and sequences of operations between
discretised operators and fields. Using captured representations of variational forms and
basis functions, I demonstrate optimisations to cell-local integral assembly that this approach
enables, and compare against the state of the art.
As part of my work on optimising local assembly, I extend the work of Hosangadi et al. on
common sub-expression elimination and factorisation of polynomials. I improve the weight
function presented by Hosangadi et al., increasing the number of factorisations found. I present
an implementation of an optimised branch-and-bound algorithm inspired by reformulating the
original matrix-covering problem as a maximal graph biclique search problem. I evaluate the
algorithm's effectiveness on the expressions generated by our finite element solver.
A unified approach to evaluation algorithms for multivariate polynomials
We present a unified framework for most of the known and a few new evaluation algorithms for multivariate polynomials expressed in a wide variety of bases including the Bernstein-Bézier, multinomial (or Taylor), Lagrange and Newton bases. This unification is achieved by considering evaluation algorithms for multivariate polynomials expressed in terms of L-bases, a class of bases that includes the Bernstein-Bézier, multinomial, and a rich subclass of Lagrange and Newton bases. All of the known evaluation algorithms can be generated either by considering up recursive evaluation algorithms for L-bases or by examining change of basis algorithms for L-bases. For polynomials of degree $n$ in $s$ variables, the class of up recursive evaluation algorithms includes a parallel up recurrence algorithm with computational complexity $O(n^{s+1})$, a nested multiplication algorithm with computational complexity $O(n^s \log n)$ and a ladder recurrence algorithm with computational complexity $O(n^s)$. These algorithms also generate a new generalization of the Aitken-Neville algorithm for evaluation of multivariate polynomials expressed in terms of Lagrange L-bases. The second class of algorithms, based on certain change of basis algorithms between L-bases, includes a nested multiplication algorithm with computational complexity $O(n^s)$, a divided difference algorithm, a forward difference algorithm, and a Lagrange evaluation algorithm with computational complexities $O(n^s)$, $O(n^s)$ and $O(n)$ per point respectively for the evaluation of multivariate polynomials at several points.
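As a point of reference for the nested multiplication algorithms mentioned in this abstract, the sketch below shows ordinary nested (Horner-style) multiplication for a bivariate polynomial in the monomial basis. It is the textbook scheme, not the paper's L-basis algorithm, and the function names and coefficient layout are assumptions made for illustration.

```python
def horner(coeffs, x):
    """Evaluate sum(coeffs[k] * x**k) by nested multiplication."""
    acc = 0
    for c in reversed(coeffs):
        acc = acc * x + c
    return acc

def eval_bivariate(c, x, y):
    """Evaluate sum over i + j <= n of c[i][j] * x**i * y**j.

    c[i] holds the coefficients of x**i, ordered by increasing power of y,
    so c[i] has n - i + 1 entries for total degree n.
    """
    # Collapse each row to a number by Horner in y, then apply Horner in x.
    inner = [horner(row, y) for row in c]
    return horner(inner, x)

# p(x, y) = 1 + 2y + 3y^2 + 4x + 5xy + 6x^2  (total degree n = 2)
coeffs = [[1, 2, 3],
          [4, 5],
          [6]]
print(eval_bivariate(coeffs, 2, 3))  # 1 + 6 + 27 + 8 + 30 + 24 = 96
```

The scheme performs roughly one multiplication per coefficient. A polynomial of total degree n in two variables has (n + 1)(n + 2) / 2 coefficients, so the cost grows quadratically in n for this two-variable case.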