1,998 research outputs found
A Survey on Compiler Autotuning using Machine Learning
Since the mid-1990s, researchers have been trying to use machine-learning
based approaches to solve a number of different compiler optimization problems.
These techniques primarily enhance the quality of the obtained results and,
more importantly, make it feasible to tackle two main compiler optimization
problems: optimization selection (choosing which optimizations to apply) and
phase-ordering (choosing the order of applying optimizations). The compiler
optimization space continues to grow due to the advancement of applications,
increasing number of compiler optimizations, and new target architectures.
Generic optimization passes in compilers cannot fully leverage newly introduced
optimizations and, therefore, cannot keep up with the pace of increasing
options. This survey summarizes and classifies the recent advances in using
machine learning for the compiler optimization field, particularly on the two
major problems of (1) selecting the best optimizations and (2) the
phase-ordering of optimizations. The survey highlights the approaches taken so
far, the obtained results, the fine-grain classification among different
approaches and finally, the influential papers of the field.Comment: version 5.0 (updated on September 2018)- Preprint Version For our
Accepted Journal @ ACM CSUR 2018 (42 pages) - This survey will be updated
quarterly here (Send me your new published papers to be added in the
subsequent version) History: Received November 2016; Revised August 2017;
Revised February 2018; Accepted March 2018
Transformations of High-Level Synthesis Codes for High-Performance Computing
Specialized hardware architectures promise a major step in performance and
energy efficiency over the traditional load/store devices currently employed in
large scale computing systems. The adoption of high-level synthesis (HLS) from
languages such as C/C++ and OpenCL has greatly increased programmer
productivity when designing for such platforms. While this has enabled a wider
audience to target specialized hardware, the optimization principles known from
traditional software design are no longer sufficient to implement
high-performance codes. Fast and efficient codes for reconfigurable platforms
are thus still challenging to design. To alleviate this, we present a set of
optimizing transformations for HLS, targeting scalable and efficient
architectures for high-performance computing (HPC) applications. Our work
provides a toolbox for developers, where we systematically identify classes of
transformations, the characteristics of their effect on the HLS code and the
resulting hardware (e.g., increases data reuse or resource consumption), and
the objectives that each transformation can target (e.g., resolve interface
contention, or increase parallelism). We show how these can be used to
efficiently exploit pipelining, on-chip distributed fast memory, and on-chip
streaming dataflow, allowing for massively parallel architectures. To quantify
the effect of our transformations, we use them to optimize a set of
throughput-oriented FPGA kernels, demonstrating that our enhancements are
sufficient to scale up parallelism within the hardware constraints. With the
transformations covered, we hope to establish a common framework for
performance engineers, compiler developers, and hardware developers, to tap
into the performance potential offered by specialized hardware architectures
using HLS
FIST: A Feature-Importance Sampling and Tree-Based Method for Automatic Design Flow Parameter Tuning
Design flow parameters are of utmost importance to chip design quality and
require a painfully long time to evaluate their effects. In reality, flow
parameter tuning is usually performed manually based on designers' experience
in an ad hoc manner. In this work, we introduce a machine learning-based
automatic parameter tuning methodology that aims to find the best design
quality with a limited number of trials. Instead of merely plugging in machine
learning engines, we develop clustering and approximate sampling techniques for
improving tuning efficiency. The feature extraction in this method can reuse
knowledge from prior designs. Furthermore, we leverage a state-of-the-art
XGBoost model and propose a novel dynamic tree technique to overcome
overfitting. Experimental results on benchmark circuits show that our approach
achieves 25% improvement in design quality or 37% reduction in sampling cost
compared to random forest method, which is the kernel of a highly cited
previous work. Our approach is further validated on two industrial designs. By
sampling less than 0.02% of possible parameter sets, it reduces area by 1.83%
and 1.43% compared to the best solutions hand-tuned by experienced designers
Approximate Computing Survey, Part II: Application-Specific & Architectural Approximation Techniques and Applications
The challenging deployment of compute-intensive applications from domains
such Artificial Intelligence (AI) and Digital Signal Processing (DSP), forces
the community of computing systems to explore new design approaches.
Approximate Computing appears as an emerging solution, allowing to tune the
quality of results in the design of a system in order to improve the energy
efficiency and/or performance. This radical paradigm shift has attracted
interest from both academia and industry, resulting in significant research on
approximation techniques and methodologies at different design layers (from
system down to integrated circuits). Motivated by the wide appeal of
Approximate Computing over the last 10 years, we conduct a two-part survey to
cover key aspects (e.g., terminology and applications) and review the
state-of-the art approximation techniques from all layers of the traditional
computing stack. In Part II of our survey, we classify and present the
technical details of application-specific and architectural approximation
techniques, which both target the design of resource-efficient
processors/accelerators & systems. Moreover, we present a detailed analysis of
the application spectrum of Approximate Computing and discuss open challenges
and future directions.Comment: Under Review at ACM Computing Survey
- …