833 research outputs found

    Type-driven automated program transformations and cost modelling for optimising streaming programs on FPGAs

    Get PDF
    In this paper we present a novel approach to program optimisation based on compiler-based type-driven program transformations and a fast and accurate cost/performance model for the target architecture. We target streaming programs for the problem domain of scientific computing, such as numerical weather prediction. We present our theoretical framework for type-driven program transformation, our target high-level language and intermediate representation languages and the cost model and demonstrate the effectiveness of our approach by comparison with a commercial toolchain

    Generating Fast Sparse Matrix Vector Multiplication From a High Level Generic Functional IR

    Get PDF
    Usage of high-level intermediate representations promises the generation of fast code from a high-level description, improving the productivity of developers while achieving the performance traditionally only reached with low-level programming approaches. High-level IRs come in two flavors: 1) domain-specific IRs designed only for a specific application area; or 2) generic high-level IRs that can be used to generate high-performance code across many domains. Developing generic IRs is more challenging but offers the advantage of reusing a common compiler infrastructure across various applications. In this paper, we extend a generic high-level IR to enable efficient computation with sparse data structures. Crucially, we encode sparse representation using reusable dense building blocks already present in the high-level IR. We use a form of dependent types to model sparse matrices in CSR format by expressing the relationship between multiple dense arrays explicitly separately storing the length of rows, the column indices, and the non-zero values of the matrix. We achieve high-performance compared to sparse low-level library code using our extended generic high-level code generator. On an Nvidia GPU, we outperform the highly tuned Nvidia cuSparse implementation of spmv multiplication across 28 sparse matrices of varying sparsity on average by 1.7×

    Genetic improvement of GPU software

    Get PDF
    We survey genetic improvement (GI) of general purpose computing on graphics cards. We summarise several experiments which demonstrate four themes. Experiments with the gzip program show that genetic programming can automatically port sequential C code to parallel code. Experiments with the StereoCamera program show that GI can upgrade legacy parallel code for new hardware and software. Experiments with NiftyReg and BarraCUDA show that GI can make substantial improvements to current parallel CUDA applications. Finally, experiments with the pknotsRG program show that with semi-automated approaches, enormous speed ups can sometimes be had by growing and grafting new code with genetic programming in combination with human input

    High Performance Stencil Code Generation with LIFT

    Get PDF
    Stencil computations are widely used from physical simulations to machine-learning. They are embarrassingly parallel and perfectly fit modern hardware such as Graphic Processing Units. Although stencil computations have been extensively studied, optimizing them for increasingly diverse hardware remains challenging. Domain Specific Languages (DSLs) have raised the programming abstraction and offer good performance. However, this places the burden on DSL implementers who have to write almost full-fledged parallelizing compilers and optimizers. Lift has recently emerged as a promising approach to achieve performance portability and is based on a small set of reusable parallel primitives that DSL or library writers can build upon. Lift’s key novelty is in its encoding of optimizations as a system of extensible rewrite rules which are used to explore the optimization space. However, Lift has mostly focused on linear algebra operations and it remains to be seen whether this approach is applicable for other domains. This paper demonstrates how complex multidimensional stencil code and optimizations such as tiling are expressible using compositions of simple 1D Lift primitives. By leveraging existing Lift primitives and optimizations, we only require the addition of two primitives and one rewrite rule to do so. Our results show that this approach outperforms existing compiler approaches and hand-tuned codes

    Term Rewriting on GPUs

    Full text link
    We present a way to implement term rewriting on a GPU. We do this by letting the GPU repeatedly perform a massively parallel evaluation of all subterms. We find that if the term rewrite systems exhibit sufficient internal parallelism, GPU rewriting substantially outperforms the CPU. Since we expect that our implementation can be further optimized, and because in any case GPUs will become much more powerful in the future, this suggests that GPUs are an interesting platform for term rewriting. As term rewriting can be viewed as a universal programming language, this also opens a route towards programming GPUs by term rewriting, especially for irregular computations

    Managing Industrial Simulator Visual Databases Using Geographic Information Systems

    Get PDF
    Geographic Information Systems are developed to handle enormous volumes of data and are equipped with numerous functionalities intended to capture, store, edit, organise, process and analyse or represent the geographically referenced information. On the other hand, industrial simulators for driver training are real-time applications that require a virtual environment, either geospecific, geogeneric or a combination of the two, over which the simulation programs will be run. In the final instance, this environment constitutes a geographic location with its specific characteristics of geometry, appearance, functionality, topography, etc. The set of elements that enables the virtual simulation environment to be created and in which the simulator user can move, is usually called the Visual Database (VDB). The main idea behind the work being developed approaches a topic that is of major interest in the field of industrial training simulators, which is the problem of analysing, structuring and describing the virtual environments to be used in large driving simulators. This paper sets out a methodology that uses the capabilities and benefits of Geographic Information Systems for organising, optimising and managing the visual Database of the simulator and for generally enhancing the quality and performance of the simulator
    • …
    corecore