5,147 research outputs found
Lost in translation: Exposing hidden compiler optimization opportunities
Existing iterative compilation and machine-learning-based optimization
techniques have been proven very successful in achieving better optimizations
than the standard optimization levels of a compiler. However, they were not
engineered to support the tuning of a compiler's optimizer as part of the
compiler's daily development cycle. In this paper, we first establish the
required properties which a technique must exhibit to enable such tuning. We
then introduce an enhancement to the classic nightly routine testing of
compilers which exhibits all the required properties, and thus, is capable of
driving the improvement and tuning of the compiler's common optimizer. This is
achieved by leveraging resource usage and compilation information collected
while systematically exploiting prefixes of the transformations applied at
standard optimization levels. Experimental evaluation using the LLVM v6.0.1
compiler demonstrated that the new approach was able to reveal hidden
cross-architecture and architecture-dependent potential optimizations on two
popular processors: the Intel i5-6300U and the Arm Cortex-A53-based Broadcom
BCM2837 used in the Raspberry Pi 3B+. As a case study, we demonstrate how the
insights from our approach enabled us to identify and remove a significant
shortcoming of the CFG simplification pass of the LLVM v6.0.1 compiler.Comment: 31 pages, 7 figures, 2 table. arXiv admin note: text overlap with
arXiv:1802.0984
A Survey on Compiler Autotuning using Machine Learning
Since the mid-1990s, researchers have been trying to use machine-learning
based approaches to solve a number of different compiler optimization problems.
These techniques primarily enhance the quality of the obtained results and,
more importantly, make it feasible to tackle two main compiler optimization
problems: optimization selection (choosing which optimizations to apply) and
phase-ordering (choosing the order of applying optimizations). The compiler
optimization space continues to grow due to the advancement of applications,
increasing number of compiler optimizations, and new target architectures.
Generic optimization passes in compilers cannot fully leverage newly introduced
optimizations and, therefore, cannot keep up with the pace of increasing
options. This survey summarizes and classifies the recent advances in using
machine learning for the compiler optimization field, particularly on the two
major problems of (1) selecting the best optimizations and (2) the
phase-ordering of optimizations. The survey highlights the approaches taken so
far, the obtained results, the fine-grain classification among different
approaches and finally, the influential papers of the field.Comment: version 5.0 (updated on September 2018)- Preprint Version For our
Accepted Journal @ ACM CSUR 2018 (42 pages) - This survey will be updated
quarterly here (Send me your new published papers to be added in the
subsequent version) History: Received November 2016; Revised August 2017;
Revised February 2018; Accepted March 2018
Full-Stack, Real-System Quantum Computer Studies: Architectural Comparisons and Design Insights
In recent years, Quantum Computing (QC) has progressed to the point where
small working prototypes are available for use. Termed Noisy Intermediate-Scale
Quantum (NISQ) computers, these prototypes are too small for large benchmarks
or even for Quantum Error Correction, but they do have sufficient resources to
run small benchmarks, particularly if compiled with optimizations to make use
of scarce qubits and limited operation counts and coherence times. QC has not
yet, however, settled on a particular preferred device implementation
technology, and indeed different NISQ prototypes implement qubits with very
different physical approaches and therefore widely-varying device and machine
characteristics.
Our work performs a full-stack, benchmark-driven hardware-software analysis
of QC systems. We evaluate QC architectural possibilities, software-visible
gates, and software optimizations to tackle fundamental design questions about
gate set choices, communication topology, the factors affecting benchmark
performance and compiler optimizations. In order to answer key cross-technology
and cross-platform design questions, our work has built the first top-to-bottom
toolflow to target different qubit device technologies, including
superconducting and trapped ion qubits which are the current QC front-runners.
We use our toolflow, TriQ, to conduct {\em real-system} measurements on 7
running QC prototypes from 3 different groups, IBM, Rigetti, and University of
Maryland. From these real-system experiences at QC's hardware-software
interface, we make observations about native and software-visible gates for
different QC technologies, communication topologies, and the value of
noise-aware compilation even on lower-noise platforms. This is the largest
cross-platform real-system QC study performed thus far; its results have the
potential to inform both QC device and compiler design going forward.Comment: Preprint of a publication in ISCA 201
- …