4,675 research outputs found
Compilers that learn to optimise: a probabilistic machine learning approach
Compiler optimisation is the process of making a compiler produce better code, i.e. code that,
for example, runs faster on a target architecture. Although numerous program transformations
for optimisation have been proposed in the literature, these transformations are not always beneficial and they can interact in very complex ways. Traditional approaches adopted by compiler
writers fix the order of the transformations and decide when and how these transformations
should be applied to a program by using hard-coded heuristics. However, these heuristics require a lot of time and effort to construct and may sacrifice performance on programs they have
not been tuned for.This thesis proposes a probabilistic machine learning solution to the compiler optimisation problem that automatically determines "good" optimisation strategies for programs. This
approach uses predictive modelling in order to search the space of compiler transformations.
Unlike most previous work that learns when/how to apply a single transformation in isolation or
a fixed-order set of transformations, the techniques proposed in this thesis are capable of tackling the general problem of predicting "good" sequences of compiler transformations. This is
achieved by exploiting transference across programs with two different techniques: Predictive
Search Distributions (PSD) and multi-task Gaussian process prediction (multi-task GP). While
the former directly addresses the problem of predicting "good" transformation sequences, the
latter learns regression models (or proxies) of the performance of the programs in order to
rapidly scan the space of transformation sequences.Both methods, PSD and multi-task GP, are formulated as general machine learning techniques. In particular, the PSD method is proposed in order to speed up search in combinatorial
optimisation problems by learning a distribution over good solutions on a set of problem in¬
stances and using that distribution to search the optimisation space of a problem that has not
been seen before. Likewise, multi-task GP is proposed as a general method for multi-task learning that directly models the correlation between several machine learning tasks, exploiting the
shared information across the tasks.Additionally, this thesis presents an extension to the well-known analysis of variance
(ANOVA) methodology in order to deal with sequence data. This extension is used to address the problem of optimisation space characterisation by identifying and quantifying the
main effects of program transformations and their interactions.Finally, the machine learning methods proposed are successfully applied to a data set that
has been generated as a result of the application of source-to-source transformations to 12 C
programs from the UTDSP benchmark suite
Portable compiler optimisation across embedded programs and microarchitectures using machine learning
Building an optimising compiler is a difficult and time consuming task which must be repeated for each generation of a microprocessor. As the underlying microarchitecture changes from one generation to the next, the compiler must be retuned to optimise specifically for that new system. It may take several releases of the compiler to effectively exploit a processor’s performance potential, by which time a new generation has appeared and the process starts again. We address this challenge by developing a portable optimising compiler. Our approach employs machine learning to automatically learn the best optimisations to apply for any new program on a new microarchitectural configuration. It achieves this by learning a model off-line which maps a microarchitecture description plus the hardware counters from a single run of the program to the best compiler optimisation passes. Our compiler gains 67 % of the maximum speedup obtainable by an iterative compiler search using 1000 evaluations. We obtain, on average, a 1.16x speedup over the highest default optimisation level across an entire microarchitecture configuration space, achieving a 4.3x speedup in the best case. We demonstrate the robustness of this technique by applying it to an extended microarchitectural space where we achieve comparable performance
Machine Learning in Compiler Optimization
In the last decade, machine learning based compilation has moved from an an obscure research niche to a mainstream activity. In this article, we describe the relationship between machine learning and compiler optimisation and introduce the main concepts of features, models, training and deployment. We then provide a comprehensive survey and provide a road map for the wide variety of different research areas. We conclude with a discussion on open issues in the area and potential research directions. This paper provides both an accessible introduction to the fast moving area of machine learning based compilation and a detailed bibliography of its main achievements
Adaptive Neural Compilation
This paper proposes an adaptive neural-compilation framework to address the
problem of efficient program learning. Traditional code optimisation strategies
used in compilers are based on applying pre-specified set of transformations
that make the code faster to execute without changing its semantics. In
contrast, our work involves adapting programs to make them more efficient while
considering correctness only on a target input distribution. Our approach is
inspired by the recent works on differentiable representations of programs. We
show that it is possible to compile programs written in a low-level language to
a differentiable representation. We also show how programs in this
representation can be optimised to make them efficient on a target distribution
of inputs. Experimental results demonstrate that our approach enables learning
specifically-tuned algorithms for given data distributions with a high success
rate.Comment: Submitted to NIPS 2016, code and supplementary materials will be
available on author's pag
Evolutionary improvement of programs
Most applications of genetic programming (GP) involve the creation of an entirely new function, program or expression to solve a specific problem. In this paper, we propose a new approach that applies GP to improve existing software by optimizing its non-functional properties such as execution time, memory usage, or power consumption. In general, satisfying non-functional requirements is a difficult task and often achieved in part by optimizing compilers. However, modern compilers are in general not always able to produce semantically equivalent alternatives that optimize non-functional properties, even if such alternatives are known to exist: this is usually due to the limited local nature of such optimizations. In this paper, we discuss how best to combine and extend the existing evolutionary methods of GP, multiobjective optimization, and coevolution in order to improve existing software. Given as input the implementation of a function, we attempt to evolve a semantically equivalent version, in this case optimized to reduce execution time subject to a given probability distribution of inputs. We demonstrate that our framework is able to produce non-obvious optimizations that compilers are not yet able to generate on eight example functions. We employ a coevolved population of test cases to encourage the preservation of the function's semantics. We exploit the original program both through seeding of the population in order to focus the search, and as an oracle for testing purposes. As well as discussing the issues that arise when attempting to improve software, we employ rigorous experimental method to provide interesting and practical insights to suggest how to address these issues
Toolflows for Mapping Convolutional Neural Networks on FPGAs: A Survey and Future Directions
In the past decade, Convolutional Neural Networks (CNNs) have demonstrated
state-of-the-art performance in various Artificial Intelligence tasks. To
accelerate the experimentation and development of CNNs, several software
frameworks have been released, primarily targeting power-hungry CPUs and GPUs.
In this context, reconfigurable hardware in the form of FPGAs constitutes a
potential alternative platform that can be integrated in the existing deep
learning ecosystem to provide a tunable balance between performance, power
consumption and programmability. In this paper, a survey of the existing
CNN-to-FPGA toolflows is presented, comprising a comparative study of their key
characteristics which include the supported applications, architectural
choices, design space exploration methods and achieved performance. Moreover,
major challenges and objectives introduced by the latest trends in CNN
algorithmic research are identified and presented. Finally, a uniform
evaluation methodology is proposed, aiming at the comprehensive, complete and
in-depth evaluation of CNN-to-FPGA toolflows.Comment: Accepted for publication at the ACM Computing Surveys (CSUR) journal,
201
- …