17,363 research outputs found
A Survey on Compiler Autotuning using Machine Learning
Since the mid-1990s, researchers have been trying to use machine-learning
based approaches to solve a number of different compiler optimization problems.
These techniques primarily enhance the quality of the obtained results and,
more importantly, make it feasible to tackle two main compiler optimization
problems: optimization selection (choosing which optimizations to apply) and
phase-ordering (choosing the order of applying optimizations). The compiler
optimization space continues to grow due to the advancement of applications,
increasing number of compiler optimizations, and new target architectures.
Generic optimization passes in compilers cannot fully leverage newly introduced
optimizations and, therefore, cannot keep up with the pace of increasing
options. This survey summarizes and classifies the recent advances in using
machine learning for the compiler optimization field, particularly on the two
major problems of (1) selecting the best optimizations and (2) the
phase-ordering of optimizations. The survey highlights the approaches taken so
far, the obtained results, the fine-grain classification among different
approaches and finally, the influential papers of the field.Comment: version 5.0 (updated on September 2018)- Preprint Version For our
Accepted Journal @ ACM CSUR 2018 (42 pages) - This survey will be updated
quarterly here (Send me your new published papers to be added in the
subsequent version) History: Received November 2016; Revised August 2017;
Revised February 2018; Accepted March 2018
Building Efficient Query Engines in a High-Level Language
Abstraction without regret refers to the vision of using high-level
programming languages for systems development without experiencing a negative
impact on performance. A database system designed according to this vision
offers both increased productivity and high performance, instead of sacrificing
the former for the latter as is the case with existing, monolithic
implementations that are hard to maintain and extend. In this article, we
realize this vision in the domain of analytical query processing. We present
LegoBase, a query engine written in the high-level language Scala. The key
technique to regain efficiency is to apply generative programming: LegoBase
performs source-to-source compilation and optimizes the entire query engine by
converting the high-level Scala code to specialized, low-level C code. We show
how generative programming allows to easily implement a wide spectrum of
optimizations, such as introducing data partitioning or switching from a row to
a column data layout, which are difficult to achieve with existing low-level
query compilers that handle only queries. We demonstrate that sufficiently
powerful abstractions are essential for dealing with the complexity of the
optimization effort, shielding developers from compiler internals and
decoupling individual optimizations from each other. We evaluate our approach
with the TPC-H benchmark and show that: (a) With all optimizations enabled,
LegoBase significantly outperforms a commercial database and an existing query
compiler. (b) Programmers need to provide just a few hundred lines of
high-level code for implementing the optimizations, instead of complicated
low-level code that is required by existing query compilation approaches. (c)
The compilation overhead is low compared to the overall execution time, thus
making our approach usable in practice for compiling query engines
Simple and Effective Type Check Removal through Lazy Basic Block Versioning
Dynamically typed programming languages such as JavaScript and Python defer
type checking to run time. In order to maximize performance, dynamic language
VM implementations must attempt to eliminate redundant dynamic type checks.
However, type inference analyses are often costly and involve tradeoffs between
compilation time and resulting precision. This has lead to the creation of
increasingly complex multi-tiered VM architectures.
This paper introduces lazy basic block versioning, a simple JIT compilation
technique which effectively removes redundant type checks from critical code
paths. This novel approach lazily generates type-specialized versions of basic
blocks on-the-fly while propagating context-dependent type information. This
does not require the use of costly program analyses, is not restricted by the
precision limitations of traditional type analyses and avoids the
implementation complexity of speculative optimization techniques.
We have implemented intraprocedural lazy basic block versioning in a
JavaScript JIT compiler. This approach is compared with a classical flow-based
type analysis. Lazy basic block versioning performs as well or better on all
benchmarks. On average, 71% of type tests are eliminated, yielding speedups of
up to 50%. We also show that our implementation generates more efficient
machine code than TraceMonkey, a tracing JIT compiler for JavaScript, on
several benchmarks. The combination of implementation simplicity, low
algorithmic complexity and good run time performance makes basic block
versioning attractive for baseline JIT compilers
Survey on Combinatorial Register Allocation and Instruction Scheduling
Register allocation (mapping variables to processor registers or memory) and
instruction scheduling (reordering instructions to increase instruction-level
parallelism) are essential tasks for generating efficient assembly code in a
compiler. In the last three decades, combinatorial optimization has emerged as
an alternative to traditional, heuristic algorithms for these two tasks.
Combinatorial optimization approaches can deliver optimal solutions according
to a model, can precisely capture trade-offs between conflicting decisions, and
are more flexible at the expense of increased compilation time.
This paper provides an exhaustive literature review and a classification of
combinatorial optimization approaches to register allocation and instruction
scheduling, with a focus on the techniques that are most applied in this
context: integer programming, constraint programming, partitioned Boolean
quadratic programming, and enumeration. Researchers in compilers and
combinatorial optimization can benefit from identifying developments, trends,
and challenges in the area; compiler practitioners may discern opportunities
and grasp the potential benefit of applying combinatorial optimization
Less is More: Exploiting the Standard Compiler Optimization Levels for Better Performance and Energy Consumption
This paper presents the interesting observation that by performing fewer of
the optimizations available in a standard compiler optimization level such as
-O2, while preserving their original ordering, significant savings can be
achieved in both execution time and energy consumption. This observation has
been validated on two embedded processors, namely the ARM Cortex-M0 and the ARM
Cortex-M3, using two different versions of the LLVM compilation framework; v3.8
and v5.0. Experimental evaluation with 71 embedded benchmarks demonstrated
performance gains for at least half of the benchmarks for both processors. An
average execution time reduction of 2.4% and 5.3% was achieved across all the
benchmarks for the Cortex-M0 and Cortex-M3 processors, respectively, with
execution time improvements ranging from 1% up to 90% over the -O2. The savings
that can be achieved are in the same range as what can be achieved by the
state-of-the-art compilation approaches that use iterative compilation or
machine learning to select flags or to determine phase orderings that result in
more efficient code. In contrast to these time consuming and expensive to apply
techniques, our approach only needs to test a limited number of optimization
configurations, less than 64, to obtain similar or even better savings.
Furthermore, our approach can support multi-criteria optimization as it targets
execution time, energy consumption and code size at the same time.Comment: 15 pages, 3 figures, 71 benchmarks used for evaluatio
Evolutionary improvement of programs
Most applications of genetic programming (GP) involve the creation of an entirely new function, program or expression to solve a specific problem. In this paper, we propose a new approach that applies GP to improve existing software by optimizing its non-functional properties such as execution time, memory usage, or power consumption. In general, satisfying non-functional requirements is a difficult task and often achieved in part by optimizing compilers. However, modern compilers are in general not always able to produce semantically equivalent alternatives that optimize non-functional properties, even if such alternatives are known to exist: this is usually due to the limited local nature of such optimizations. In this paper, we discuss how best to combine and extend the existing evolutionary methods of GP, multiobjective optimization, and coevolution in order to improve existing software. Given as input the implementation of a function, we attempt to evolve a semantically equivalent version, in this case optimized to reduce execution time subject to a given probability distribution of inputs. We demonstrate that our framework is able to produce non-obvious optimizations that compilers are not yet able to generate on eight example functions. We employ a coevolved population of test cases to encourage the preservation of the function's semantics. We exploit the original program both through seeding of the population in order to focus the search, and as an oracle for testing purposes. As well as discussing the issues that arise when attempting to improve software, we employ rigorous experimental method to provide interesting and practical insights to suggest how to address these issues
- …