12,348 research outputs found
Evolutionary improvement of programs
Most applications of genetic programming (GP) involve the creation of an entirely new function, program or expression to solve a specific problem. In this paper, we propose a new approach that applies GP to improve existing software by optimizing its non-functional properties such as execution time, memory usage, or power consumption. In general, satisfying non-functional requirements is a difficult task and often achieved in part by optimizing compilers. However, modern compilers are in general not always able to produce semantically equivalent alternatives that optimize non-functional properties, even if such alternatives are known to exist: this is usually due to the limited local nature of such optimizations. In this paper, we discuss how best to combine and extend the existing evolutionary methods of GP, multiobjective optimization, and coevolution in order to improve existing software. Given as input the implementation of a function, we attempt to evolve a semantically equivalent version, in this case optimized to reduce execution time subject to a given probability distribution of inputs. We demonstrate that our framework is able to produce non-obvious optimizations that compilers are not yet able to generate on eight example functions. We employ a coevolved population of test cases to encourage the preservation of the function's semantics. We exploit the original program both through seeding of the population in order to focus the search, and as an oracle for testing purposes. As well as discussing the issues that arise when attempting to improve software, we employ rigorous experimental method to provide interesting and practical insights to suggest how to address these issues
Building Efficient Query Engines in a High-Level Language
Abstraction without regret refers to the vision of using high-level
programming languages for systems development without experiencing a negative
impact on performance. A database system designed according to this vision
offers both increased productivity and high performance, instead of sacrificing
the former for the latter as is the case with existing, monolithic
implementations that are hard to maintain and extend. In this article, we
realize this vision in the domain of analytical query processing. We present
LegoBase, a query engine written in the high-level language Scala. The key
technique to regain efficiency is to apply generative programming: LegoBase
performs source-to-source compilation and optimizes the entire query engine by
converting the high-level Scala code to specialized, low-level C code. We show
how generative programming allows to easily implement a wide spectrum of
optimizations, such as introducing data partitioning or switching from a row to
a column data layout, which are difficult to achieve with existing low-level
query compilers that handle only queries. We demonstrate that sufficiently
powerful abstractions are essential for dealing with the complexity of the
optimization effort, shielding developers from compiler internals and
decoupling individual optimizations from each other. We evaluate our approach
with the TPC-H benchmark and show that: (a) With all optimizations enabled,
LegoBase significantly outperforms a commercial database and an existing query
compiler. (b) Programmers need to provide just a few hundred lines of
high-level code for implementing the optimizations, instead of complicated
low-level code that is required by existing query compilation approaches. (c)
The compilation overhead is low compared to the overall execution time, thus
making our approach usable in practice for compiling query engines
A Survey on Compiler Autotuning using Machine Learning
Since the mid-1990s, researchers have been trying to use machine-learning
based approaches to solve a number of different compiler optimization problems.
These techniques primarily enhance the quality of the obtained results and,
more importantly, make it feasible to tackle two main compiler optimization
problems: optimization selection (choosing which optimizations to apply) and
phase-ordering (choosing the order of applying optimizations). The compiler
optimization space continues to grow due to the advancement of applications,
increasing number of compiler optimizations, and new target architectures.
Generic optimization passes in compilers cannot fully leverage newly introduced
optimizations and, therefore, cannot keep up with the pace of increasing
options. This survey summarizes and classifies the recent advances in using
machine learning for the compiler optimization field, particularly on the two
major problems of (1) selecting the best optimizations and (2) the
phase-ordering of optimizations. The survey highlights the approaches taken so
far, the obtained results, the fine-grain classification among different
approaches and finally, the influential papers of the field.Comment: version 5.0 (updated on September 2018)- Preprint Version For our
Accepted Journal @ ACM CSUR 2018 (42 pages) - This survey will be updated
quarterly here (Send me your new published papers to be added in the
subsequent version) History: Received November 2016; Revised August 2017;
Revised February 2018; Accepted March 2018
RVSDG: An Intermediate Representation for Optimizing Compilers
Intermediate Representations (IRs) are central to optimizing compilers as the
way the program is represented may enhance or limit analyses and
transformations. Suitable IRs focus on exposing the most relevant information
and establish invariants that different compiler passes can rely on. While
control-flow centric IRs appear to be a natural fit for imperative programming
languages, analyses required by compilers have increasingly shifted to
understand data dependencies and work at multiple abstraction layers at the
same time. This is partially evidenced in recent developments such as the MLIR
proposed by Google. However, rigorous use of data flow centric IRs in general
purpose compilers has not been evaluated for feasibility and usability as
previous works provide no practical implementations. We present the
Regionalized Value State Dependence Graph (RVSDG) IR for optimizing compilers.
The RVSDG is a data flow centric IR where nodes represent computations, edges
represent computational dependencies, and regions capture the hierarchical
structure of programs. It represents programs in demand-dependence form,
implicitly supports structured control flow, and models entire programs within
a single IR. We provide a complete specification of the RVSDG, construction and
destruction methods, as well as exemplify its utility by presenting Dead Node
and Common Node Elimination optimizations. We implemented a prototype compiler
and evaluate it in terms of performance, code size, compilation time, and
representational overhead. Our results indicate that the RVSDG can serve as a
competitive IR in optimizing compilers while reducing complexity
- …