2,536 research outputs found

    On Computational Small Steps and Big Steps: Refocusing for Outermost Reduction

    Get PDF
    We study the relationship between small-step semantics, big-step semantics and abstract machines, for programming languages that employ an outermost reduction strategy, i.e., languages where reductions near the root of the abstract syntax tree are performed before reductions near the leaves.In particular, we investigate how Biernacka and Danvy's syntactic correspondence and Reynolds's functional correspondence can be applied to inter-derive semantic specifications for such languages.The main contribution of this dissertation is three-fold:First, we identify that backward overlapping reduction rules in the small-step semantics cause the refocusing step of the syntactic correspondence to be inapplicable.Second, we propose two solutions to overcome this in-applicability: backtracking and rule generalization.Third, we show how these solutions affect the other transformations of the two correspondences.Other contributions include the application of the syntactic and functional correspondences to Boolean normalization.In particular, we show how to systematically derive a spectrum of normalization functions for negational and conjunctive normalization

    Towards an Achievable Performance for the Loop Nests

    Full text link
    Numerous code optimization techniques, including loop nest optimizations, have been developed over the last four decades. Loop optimization techniques transform loop nests to improve the performance of the code on a target architecture, including exposing parallelism. Finding and evaluating an optimal, semantic-preserving sequence of transformations is a complex problem. The sequence is guided using heuristics and/or analytical models and there is no way of knowing how close it gets to optimal performance or if there is any headroom for improvement. This paper makes two contributions. First, it uses a comparative analysis of loop optimizations/transformations across multiple compilers to determine how much headroom may exist for each compiler. And second, it presents an approach to characterize the loop nests based on their hardware performance counter values and a Machine Learning approach that predicts which compiler will generate the fastest code for a loop nest. The prediction is made for both auto-vectorized, serial compilation and for auto-parallelization. The results show that the headroom for state-of-the-art compilers ranges from 1.10x to 1.42x for the serial code and from 1.30x to 1.71x for the auto-parallelized code. These results are based on the Machine Learning predictions.Comment: Accepted at the 31st International Workshop on Languages and Compilers for Parallel Computing (LCPC 2018

    An Active-Library Based Investigation into the Performance Optimisation of Linear Algebra and the Finite Element Method

    No full text
    In this thesis, I explore an approach called "active libraries". These are libraries that take part in their own optimisation, enabling both high-performance code and the presentation of intuitive abstractions. I investigate the use of active libraries in two domains. Firstly, dense and sparse linear algebra, particularly, the solution of linear systems of equations. Secondly, the specification and solution of finite element problems. Extending my earlier (MEng) thesis work, I describe the modifications to my linear algebra library "Desola" required to perform sparse-matrix code generation. I show that optimisations easily applied in the dense case using code-transformation must be applied at a higher level of abstraction in the sparse case. I present performance results for sparse linear system solvers generated using Desola and compare against an implementation using the Intel Math Kernel Library. I also present improved dense linear-algebra performance results. Next, I explore the active-library approach by developing a finite element library that captures runtime representations of basis functions, variational forms and sequences of operations between discretised operators and fields. Using captured representations of variational forms and basis functions, I demonstrate optimisations to cell-local integral assembly that this approach enables, and compare against the state of the art. As part of my work on optimising local assembly, I extend the work of Hosangadi et al. on common sub-expression elimination and factorisation of polynomials. I improve the weight function presented by Hosangadi et al., increasing the number of factorisations found. I present an implementation of an optimised branch-and-bound algorithm inspired by reformulating the original matrix-covering problem as a maximal graph biclique search problem. I evaluate the algorithm's effectiveness on the expressions generated by our finite element solver
    • …
    corecore