Machine learning in compilers
Tuning a compiler so that it produces optimised code is a difficult task because modern processors
are complicated; they have a large number of components operating in parallel and each
is sensitive to the behaviour of the others. Building analytical models on which optimisation
heuristics can be based has become harder as processor complexity has increased, and this trend is
bound to continue as the world moves towards further heterogeneous parallelism. Compiler
writers need to spend months to get a heuristic right for any particular architecture, and these
days compilers often support a wide range of disparate devices. Whenever a new processor
comes out, even if derived from a previous one, the compiler’s heuristics will need to be retuned
for it. This is typically too much effort, and so in practice most compilers are out of date.
Machine learning has been shown to help: by running example programs compiled in
different ways, and observing how those ways affect program run-time, automatic machine
learning tools can predict good settings with which to compile new, as yet unseen programs.
The field is nascent, but has demonstrated significant results already and promises a day when
compilers will be tuned for new hardware without the need for months of compiler experts’
time. Many hurdles still remain, however, and while experts no longer have to worry about
the details of heuristic parameters, they must spend their time on the details of the machine
learning process instead to get the full benefits of the approach.
This thesis aims to remove some of the aspects of machine learning based compilers for
which human experts are still required, paving the way for a completely automatic, retuning
compiler.
First, we tackle the most conspicuous area of human involvement: feature generation. In all
previous work on machine learning for compilers, the features, which describe the important aspects
of each example to the machine learning tools, must be constructed by an expert. Should
that expert choose features poorly, they will miss crucial information without which the machine
learning algorithm can never excel. We show that not only can we automatically derive
good features, but that these features outperform those of human experts. We demonstrate our
approach on loop unrolling, and find we do better than previous work, obtaining XXX% of the
available performance, more than the XXX% of previous state of the art.
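As a rough illustration of what automatic feature derivation involves (this is not the thesis's actual algorithm; the candidate features, loop fields, and training data below are all invented for the example), one can generate a pool of candidate loop features and keep the one whose values correlate most strongly with the known-best unroll factors on training loops:

```python
import math

def pearson(xs, ys):
    # Plain Pearson correlation; 0.0 when either side has no variance.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

def best_feature(candidates, loops, best_unroll):
    # Rank each machine-generated candidate feature by how strongly its
    # value across the training loops correlates with the best unroll
    # factor, and keep the winner.
    return max(candidates,
               key=lambda name: abs(pearson([candidates[name](l) for l in loops],
                                            best_unroll)))

# Toy training set: two invented candidate features over three loops.
loops = [{"trip": 4, "body": 2}, {"trip": 8, "body": 2}, {"trip": 16, "body": 2}]
candidates = {"trip_count": lambda l: l["trip"], "body_size": lambda l: l["body"]}
print(best_feature(candidates, loops, [1, 2, 4]))  # trip_count varies with the label
```

A real system would search a much larger, automatically generated candidate space and use a stronger selection criterion than single-feature correlation.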
Next, we demonstrate a new method to efficiently capture the raw data needed for machine
learning tasks. The iterative compilation on which machine learning in compilers depends is
typically time-consuming, often requiring months of compute time. The underlying processes
are also noisy, so most prior work falls into two categories: those which attempt to gather
clean data by executing a large number of times, and those which ignore the statistical validity
of their data to keep experiment times feasible. Our approach, on the other hand, guarantees
clean data while adapting to the experiment at hand, needing an order of magnitude less work
than prior techniques.
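A minimal sketch of the underlying idea, assuming nothing about the thesis's actual statistics (the tolerance and run limits are invented parameters): re-run a benchmark only until the standard error of the mean run-time falls below a chosen fraction of the mean, so noisy experiments get more repetitions while stable ones stop early.

```python
import math
import statistics

def measure_until_stable(run, rel_tol=0.02, min_runs=3, max_runs=1000):
    # Adaptive sampling: stop as soon as the standard error of the mean
    # run-time is within rel_tol of the mean itself, instead of always
    # doing a fixed, conservatively large number of runs.
    times = [run() for _ in range(min_runs)]
    while len(times) < max_runs:
        mean = statistics.mean(times)
        sem = statistics.stdev(times) / math.sqrt(len(times))
        if sem <= rel_tol * mean:
            break
        times.append(run())
    return statistics.mean(times), len(times)

# A perfectly repeatable benchmark stops after the minimum number of runs.
mean, n = measure_until_stable(lambda: 1.0)
print(mean, n)  # 1.0 3
```

The pay-off is that the number of executions tracks the noise actually observed, rather than being fixed in advance for the worst case.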
A Compilation Target for Probabilistic Programming Languages
Forward inference techniques such as sequential Monte Carlo and particle
Markov chain Monte Carlo for probabilistic programming can be implemented in
any programming language by creative use of standardized operating system
functionality including processes, forking, mutexes, and shared memory.
Exploiting this, we have defined, developed, and tested a probabilistic
programming intermediate representation we call probabilistic
C, which itself can be compiled to machine code by standard compilers and
linked to operating system libraries yielding an efficient, scalable, portable
probabilistic programming compilation target. This opens up a new hardware and
systems research path for optimizing probabilistic programming systems.
Comment: In Proceedings of the 31st International Conference on Machine Learning (ICML), 201
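The abstract's central trick, duplicating a particle's entire state with an operating-system fork, can be sketched in a few lines (POSIX only; the toy state and increment step are invented for illustration, and the real probabilistic C runtime is far more involved):

```python
import os

def duplicate_particle(state):
    # fork() gives the child a copy-on-write snapshot of the whole
    # process, which is how a particle can be duplicated at an SMC
    # resampling point; the child reports its value via its exit status.
    pid = os.fork()
    if pid == 0:
        state["x"] += 1              # mutates only the child's copy
        os._exit(state["x"] % 256)
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status), state["x"]

child_value, parent_value = duplicate_particle({"x": 41})
print(child_value, parent_value)  # 42 41: the parent's copy is untouched
```

Because the operating system performs the copy lazily, duplicating a heavy particle this way costs far less than serialising and restoring its state by hand.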
Towards an Achievable Performance for the Loop Nests
Numerous code optimization techniques, including loop nest optimizations,
have been developed over the last four decades. Loop optimization techniques
transform loop nests to improve the performance of the code on a target
architecture, including exposing parallelism. Finding and evaluating an
optimal, semantic-preserving sequence of transformations is a complex problem.
The sequence is guided using heuristics and/or analytical models and there is
no way of knowing how close it gets to optimal performance or if there is any
headroom for improvement. This paper makes two contributions. First, it uses a
comparative analysis of loop optimizations/transformations across multiple
compilers to determine how much headroom may exist for each compiler. Second,
it presents an approach to characterize the loop nests based on their
hardware performance counter values and a Machine Learning approach that
predicts which compiler will generate the fastest code for a loop nest. The
prediction is made for both auto-vectorized, serial compilation and for
auto-parallelization. The results show that the headroom for state-of-the-art
compilers ranges from 1.10x to 1.42x for the serial code and from 1.30x to
1.71x for the auto-parallelized code. These results are based on the Machine
Learning predictions.
Comment: Accepted at the 31st International Workshop on Languages and Compilers for Parallel Computing (LCPC 2018)
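As a hedged sketch of the prediction step (the paper's actual model and counters may differ, and the counter values and compiler names below are invented), a nearest-neighbour classifier over hardware performance counter vectors can pick the compiler expected to produce the fastest code for a new loop nest:

```python
def predict_fastest_compiler(training, counters):
    # training: (counter_vector, winning_compiler) pairs for loop nests
    # whose fastest compiler is already known; predict by copying the
    # label of the nearest training point (1-NN, squared Euclidean).
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(training, key=lambda pair: dist(pair[0], counters))[1]

# Invented, normalised counters: (cache_miss_rate, branch_mispredict_rate).
training = [((0.9, 0.1), "compilerA"), ((0.2, 0.8), "compilerB")]
print(predict_fastest_compiler(training, (0.85, 0.15)))  # compilerA
```

The attraction of counter-based features is that they can be collected from a single profiled run of the loop nest, without compiling it with every candidate compiler first.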
A Survey on Compiler Autotuning using Machine Learning
Since the mid-1990s, researchers have been trying to use machine-learning
based approaches to solve a number of different compiler optimization problems.
These techniques primarily enhance the quality of the obtained results and,
more importantly, make it feasible to tackle two main compiler optimization
problems: optimization selection (choosing which optimizations to apply) and
phase-ordering (choosing the order of applying optimizations). The compiler
optimization space continues to grow due to the advancement of applications,
increasing number of compiler optimizations, and new target architectures.
Generic optimization passes in compilers cannot fully leverage newly introduced
optimizations and, therefore, cannot keep up with the pace of increasing
options. This survey summarizes and classifies the recent advances in using
machine learning for the compiler optimization field, particularly on the two
major problems of (1) selecting the best optimizations and (2) the
phase-ordering of optimizations. The survey highlights the approaches taken so
far, the obtained results, the fine-grain classification among different
approaches and, finally, the influential papers of the field.
Comment: version 5.0 (updated September 2018) - Preprint version for our accepted journal @ ACM CSUR 2018 (42 pages) - This survey will be updated quarterly here (send me your new published papers to be added in the subsequent version). History: Received November 2016; Revised August 2017; Revised February 2018; Accepted March 2018