A Survey on Compiler Autotuning using Machine Learning
Since the mid-1990s, researchers have been trying to use machine-learning
based approaches to solve a number of different compiler optimization problems.
These techniques primarily enhance the quality of the obtained results and,
more importantly, make it feasible to tackle two main compiler optimization
problems: optimization selection (choosing which optimizations to apply) and
phase-ordering (choosing the order of applying optimizations). The compiler
optimization space continues to grow due to the advancement of applications,
increasing number of compiler optimizations, and new target architectures.
Generic optimization passes in compilers cannot fully leverage newly introduced
optimizations and, therefore, cannot keep up with the pace of increasing
options. This survey summarizes and classifies the recent advances in using
machine learning for the compiler optimization field, particularly on the two
major problems of (1) selecting the best optimizations and (2) the
phase-ordering of optimizations. The survey highlights the approaches taken so
far, the obtained results, the fine-grained classification among different
approaches and, finally, the influential papers of the field.
Comment: version 5.0 (updated September 2018). Preprint version of our accepted journal article at ACM CSUR 2018 (42 pages). This survey will be updated quarterly here (send me your newly published papers to be added in a subsequent version). History: Received November 2016; Revised August 2017; Revised February 2018; Accepted March 2018.
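As a rough illustration of the optimization-selection problem described above, the following Python sketch runs a naive random search over a small set of GCC flags, compiling and timing a benchmark for each sample. The flag list, file names, and search loop are illustrative assumptions, not a method taken from the survey.

```python
import random
import subprocess
import time

# Hypothetical flag set and benchmark file; both are illustrative assumptions.
CANDIDATE_FLAGS = ["-O2", "-funroll-loops", "-fomit-frame-pointer", "-ftree-vectorize"]
SOURCE = "benchmark.c"

def evaluate(flags):
    """Compile with the given flags and return the benchmark's wall-clock time."""
    subprocess.run(["gcc", *flags, SOURCE, "-o", "bench"], check=True)
    start = time.perf_counter()
    subprocess.run(["./bench"], check=True)
    return time.perf_counter() - start

def random_search(trials=20):
    """Naive iterative compilation: sample flag subsets and keep the fastest."""
    best_flags, best_time = [], float("inf")
    for _ in range(trials):
        flags = [f for f in CANDIDATE_FLAGS if random.random() < 0.5]
        elapsed = evaluate(flags)
        if elapsed < best_time:
            best_flags, best_time = flags, elapsed
    return best_flags, best_time

if __name__ == "__main__":
    print(random_search())
```

A learned model would replace the random sampling with predictions driven by program features, which is exactly the step the surveyed work automates.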
Survey on Combinatorial Register Allocation and Instruction Scheduling
Register allocation (mapping variables to processor registers or memory) and
instruction scheduling (reordering instructions to increase instruction-level
parallelism) are essential tasks for generating efficient assembly code in a
compiler. In the last three decades, combinatorial optimization has emerged as
an alternative to traditional, heuristic algorithms for these two tasks.
Combinatorial optimization approaches can deliver optimal solutions according
to a model, can precisely capture trade-offs between conflicting decisions, and
are more flexible at the expense of increased compilation time.
This paper provides an exhaustive literature review and a classification of
combinatorial optimization approaches to register allocation and instruction
scheduling, with a focus on the techniques that are most applied in this
context: integer programming, constraint programming, partitioned Boolean
quadratic programming, and enumeration. Researchers in compilers and
combinatorial optimization can benefit from identifying developments, trends,
and challenges in the area; compiler practitioners may discern opportunities
and grasp the potential benefit of applying combinatorial optimization.
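As a toy illustration of the enumeration-based approaches the survey classifies, the sketch below exhaustively searches register/memory assignments over a small interference graph and keeps the valid assignment with the lowest spill cost. The variables, interference edges, and costs are invented for illustration and do not come from the paper.

```python
from itertools import product

# Toy interference graph and register file; purely illustrative assumptions.
VARIABLES = ["a", "b", "c", "d"]
INTERFERES = {("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")}
REGISTERS = ["r0", "r1"]
SPILL_COST = {"a": 4, "b": 3, "c": 5, "d": 1}  # cost of keeping a variable in memory

def conflicts(assignment):
    """True if two interfering variables share a register (memory never conflicts)."""
    return any(
        assignment[u] == assignment[v] and assignment[u] != "mem"
        for u, v in INTERFERES
    )

def optimal_allocation():
    """Enumerate all register/memory assignments and keep the cheapest valid one."""
    best, best_cost = None, float("inf")
    for choice in product(REGISTERS + ["mem"], repeat=len(VARIABLES)):
        assignment = dict(zip(VARIABLES, choice))
        if conflicts(assignment):
            continue
        cost = sum(SPILL_COST[v] for v, loc in assignment.items() if loc == "mem")
        if cost < best_cost:
            best, best_cost = assignment, cost
    return best, best_cost

print(optimal_allocation())
```

Plain enumeration scales poorly, which is why the surveyed approaches lean on integer programming, constraint programming, and PBQP solvers to prune the same search space.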
Specifying and Executing Optimizations for Parallel Programs
Compiler optimizations, usually expressed as rewrites on program graphs, are
a core part of all modern compilers. However, even production compilers have
bugs, and these bugs are difficult to detect and resolve. The problem only
becomes more complex when compiling parallel programs; from the choice of graph
representation to the possibility of race conditions, optimization designers
have a range of factors to consider that do not appear when dealing with
single-threaded programs. In this paper we present PTRANS, a domain-specific
language for formal specification of compiler transformations, and describe its
executable semantics. The fundamental approach of PTRANS is to describe program
transformations as rewrites on control flow graphs with temporal logic side
conditions. The syntax of PTRANS allows cleaner, more comprehensible
specification of program optimizations; its executable semantics allows these
specifications to act as prototypes for the optimizations themselves, so that
candidate optimizations can be tested and refined before going on to include
them in a compiler. We demonstrate the use of PTRANS to state, test, and refine
the specification of a redundant store elimination optimization on parallel
programs.
Comment: In Proceedings GRAPHITE 2014, arXiv:1407.767
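For intuition about the running example, here is a minimal Python sketch of redundant store elimination on a single basic block. It deliberately ignores PTRANS's control flow graphs, temporal logic side conditions, and parallelism; the instruction encoding is an assumption made only for illustration.

```python
# A basic block as (op, target, operand) tuples; this encoding is an
# illustrative assumption and not PTRANS syntax.
BLOCK = [
    ("store", "x", "1"),   # redundant: x is written again before any read
    ("store", "y", "2"),
    ("store", "x", "y"),   # this write kills the first store to x
    ("load",  "z", "x"),
]

def eliminate_redundant_stores(block):
    """Drop a store when the same location is stored again before any read of it."""
    kept = []
    for i, (op, target, operand) in enumerate(block):
        if op == "store":
            redundant = False
            for later_op, later_target, later_operand in block[i + 1:]:
                if later_operand == target:  # the stored value is read later: keep it
                    break
                if later_op == "store" and later_target == target:
                    redundant = True         # overwritten before any read
                    break
            if redundant:
                continue
        kept.append((op, target, operand))
    return kept

print(eliminate_redundant_stores(BLOCK))
```

On parallel programs the "no intervening read" condition must hold across all interleavings and all CFG paths, which is precisely what PTRANS's temporal-logic side conditions are designed to express.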
Rule-based optimization with K-framework
The thesis discusses pre-compiler optimization using rule-based rewriting. Our goal is to facilitate the proof of correctness of the program optimization process. A source-to-source optimizer based on the proposed strategy can serve as a preprocessor to a certified compiler such as CompCert, and in this way it facilitates the certification of a sophisticated compiler. CompCert is highly assured, but this assurance comes at the cost of efficiency: unlike compilers such as the Intel C compiler or the GNU C compiler, CompCert does not optimize aggressively. The thesis discusses the optimization rules implemented using the K framework and how much efficiency improvement is achieved by applying them.
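As a loose analogue (in Python rather than K) of rule-based source-to-source rewriting, the sketch below applies two pattern-action rules bottom-up to a tiny expression tree until a fixed point is reached. The AST encoding and the rules are illustrative assumptions, not rules from the thesis.

```python
# Tiny expression AST as nested tuples, e.g. ("*", "x", ("+", 1, 1)); this
# encoding and the rule set are illustrative assumptions, not K definitions.

def fold_constants(node):
    """Rule: evaluate an arithmetic operator when both operands are literals."""
    if isinstance(node, tuple) and all(isinstance(a, int) for a in node[1:]):
        op, a, b = node
        return {"+": a + b, "-": a - b, "*": a * b}[op]
    return node

def strength_reduce(node):
    """Rule: rewrite x * 2 into x + x (a classic strength reduction)."""
    if isinstance(node, tuple) and node[0] == "*" and node[2] == 2:
        return ("+", node[1], node[1])
    return node

RULES = [fold_constants, strength_reduce]

def rewrite(node):
    """Rewrite children first, then apply the rules until nothing changes."""
    if isinstance(node, tuple):
        node = (node[0],) + tuple(rewrite(child) for child in node[1:])
    changed = True
    while changed:
        changed = False
        for rule in RULES:
            new = rule(node)
            if new != node:
                node, changed = new, True
    return node

print(rewrite(("*", "x", ("+", 1, 1))))  # -> ('+', 'x', 'x')
```

In K, each rule would instead be a rewrite over the language's formal semantics, which is what makes the correctness argument tractable.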
A Compiler and Runtime Infrastructure for Automatic Program Distribution
This paper presents the design and the implementation of a compiler and runtime infrastructure for automatic program distribution. We are building a research infrastructure that enables experimentation with various program partitioning and mapping strategies and the study of automatic distribution's effect on resource consumption (e.g., CPU, memory, communication). Since many optimization techniques are faced with conflicting optimization targets (e.g., memory and communication), we believe that it is important to be able to study their interaction.
We present a set of techniques that enable flexible resource modeling and program distribution: dependence analysis, weighted graph partitioning, code and communication generation, and profiling. We have developed these ideas in the context of the Java language. We present in detail the design and implementation of each technique as part of our compiler and runtime infrastructure. Then, we evaluate our design and present preliminary experimental data for each component, as well as for the entire system.
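As a rough sketch of the weighted-graph-partitioning step mentioned above, the following Python snippet greedily places components on machines by trading communication cut weight against CPU load. The graph, the weights, and the heuristic are illustrative assumptions, not the infrastructure's actual partitioner.

```python
# Toy call graph: components with CPU weights and edges with communication
# weights. The graph and the greedy heuristic are illustrative assumptions.
CPU = {"parse": 3, "analyze": 5, "render": 2, "store": 1}
COMM = {("parse", "analyze"): 4, ("analyze", "render"): 1, ("analyze", "store"): 2}

def greedy_partition(num_machines=2):
    """Place each component where (communication cut + CPU load) is smallest,
    trading off the two conflicting resources mentioned in the abstract."""
    placement, load = {}, [0] * num_machines

    def cut_cost(comp, machine):
        # Communication added if `comp` lands on `machine` given current placement.
        return sum(w for (a, b), w in COMM.items()
                   if (a == comp and placement.get(b, machine) != machine)
                   or (b == comp and placement.get(a, machine) != machine))

    for comp in sorted(CPU, key=CPU.get, reverse=True):
        best = min(range(num_machines), key=lambda m: cut_cost(comp, m) + load[m])
        placement[comp] = best
        load[best] += CPU[comp]
    return placement, load

print(greedy_partition())
```

Profiling data would replace the hard-coded weights, and the dependence analysis determines which edges carry communication at all.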
Redesigning OP2 Compiler to Use HPX Runtime Asynchronous Techniques
Maximizing parallelism level in applications can be achieved by minimizing
overheads due to load imbalances and waiting time due to memory latencies.
Compiler optimization is one of the most effective solutions to tackle this
problem. The compiler is able to detect the data dependencies in an application
and is able to analyze the specific sections of code for parallelization
potential. However, all of the techniques a compiler provides are usually
applied at compile time, so they rely on static analysis, which is insufficient
for achieving maximum parallelism and the desired application scalability. One
solution to this challenge is the use of runtime methods, implemented by
delaying a certain amount of code analysis until runtime. In this research, we
improve the performance of parallel applications generated by the OP2 compiler
by leveraging HPX, a C++
runtime system, to provide runtime optimizations. These optimizations include
asynchronous tasking, loop interleaving, dynamic chunk sizing, and data
prefetching. The results were evaluated using an Airfoil application, which
showed a 40-50% improvement in parallel performance.
Comment: 18th IEEE International Workshop on Parallel and Distributed Scientific and Engineering Computing (PDSEC 2017)
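As a loose, Python-level stand-in for two of the runtime techniques listed above (asynchronous tasking and dynamic chunk sizing, here with thread-pool futures rather than HPX), the sketch below launches waves of chunked tasks and doubles the chunk size whenever a wave finishes quickly. The workload, wave size, and adaptation rule are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
import time

def process(chunk):
    """Pretend per-element work for one chunk."""
    return sum(x * x for x in chunk)

def run(data, target_seconds=0.01, chunk_size=1_000):
    results = []
    with ThreadPoolExecutor() as pool:
        i = 0
        while i < len(data):
            # Asynchronous tasking: launch one wave of futures with the current chunk size.
            start = time.perf_counter()
            futures = [pool.submit(process, data[j:j + chunk_size])
                       for j in range(i, min(i + 8 * chunk_size, len(data)), chunk_size)]
            results.extend(f.result() for f in futures)
            i += 8 * chunk_size
            # Dynamic chunk sizing: grow chunks when a wave completes quickly.
            if time.perf_counter() - start < target_seconds:
                chunk_size *= 2
    return sum(results), chunk_size

print(run(list(range(200_000))))
```

An HPX-based implementation would additionally overlap waves and prefetch data for upcoming chunks instead of synchronizing at each wave boundary.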
