
2,917 research outputs found

Towards Automatic Learning of Heuristics for Mechanical Transformations of Procedural Code

Author: Carro Manuel
Mariño Julio
Tamarit Salvador
Vigueras Guillermo
Publication date: 09/03/2016

The current trend in next-generation exascale systems is towards integrating a wide range of specialized (co-)processors into traditional supercomputers. However, integrating different specialized devices increases the degree of heterogeneity and the complexity of programming such systems. Because heterogeneous systems are efficient in terms of Watt and FLOPS per surface unit, opening access to heterogeneous platforms to a wider range of users is an important problem to tackle. To bridge the gap between heterogeneous systems and programmers, in this paper we propose a machine learning-based approach to learning heuristics that define the transformation strategies of a program transformation system. Our approach proposes a novel combination of reinforcement learning and classification methods to efficiently tackle the problems inherent to this type of system. Preliminary results demonstrate the suitability of the approach for easing the programmability of heterogeneous systems.

Comment: Part of the Program Transformation for Programmability in Heterogeneous Architectures (PROHA) workshop, Barcelona, Spain, 12th March 2016, 9 pages, LaTe

arXiv.org e-Print Archive

Directory of Open Access Journals

Type-driven automated program transformations and cost modelling for optimising streaming programs on FPGAs

Author: Nabi Syed Waqar
Urlea Cristian
Vanderbauwhede Wim
Publication venue: Springer Science and Business Media LLC
Publication date: 25/04/2018

In this paper we present a novel approach to program optimisation based on compiler-based, type-driven program transformations and a fast and accurate cost/performance model for the target architecture. We target streaming programs for the problem domain of scientific computing, such as numerical weather prediction. We present our theoretical framework for type-driven program transformation, our target high-level language and intermediate representation languages, and the cost model, and we demonstrate the effectiveness of our approach by comparison with a commercial toolchain.

Enlighten

Batch solution of small PDEs with the OPS DSL

Author: E László
GR Mudalige
H Carter Edwards
H Wang
IZ Reguly
JE Stone
JG Verwer
K In’t Hout
M Wyns
P MacNeice
R Chandra
R Nath
S Kronawitter
SP Jammy
T Deakin
W Gropp
W Hundsdorfer
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2019

In this paper we discuss the challenges and optimisation opportunities when solving a large number of small, equally sized discretised PDEs on regular grids. We present an extension of the OPS (Oxford Parallel library for Structured meshes) embedded Domain Specific Language, and show how support can be added for solving multiple systems, and how OPS makes it easy to deploy a variety of transformations and optimisations. The new capabilities in OPS allow data structure transformations, as well as execution schedule transformations, to be applied automatically to deliver high performance on a variety of hardware platforms. We evaluate our work on an industrially representative finance simulation on Intel CPUs, as well as NVIDIA GPUs.

Crossref

Warwick Research Archives Portal Repository

Repository of the Academy's Library

Relay: A New IR for Machine Learning Frameworks

Author: Abadi Martin
Chen Tianqi
Krizhevsky Alex
Rotem Nadav
Shankar Asim
Vasilache Nicolas
Wei Richard
Wiltschko Alex
Publication venue: Association for Computing Machinery (ACM)
Publication date: 25/09/2018

Machine learning powers diverse services in industry, including search, translation, recommendation systems, and security. The scale and importance of these models require that they be efficient, expressive, and portable across an array of heterogeneous hardware devices. These constraints are often at odds; to better accommodate them, we propose a new high-level intermediate representation (IR) called Relay. Relay is being designed as a purely functional, statically typed language with the goal of balancing efficient compilation, expressiveness, and portability. We discuss the goals of Relay and highlight its important design constraints. Our prototype is part of the open source NNVM compiler framework, which powers Amazon's deep learning framework MXNet.

arXiv.org e-Print Archive

Crossref