13 research outputs found

    MaSiF: Machine learning guided auto-tuning of parallel skeletons

    Portable compiler optimisation across embedded programs and microarchitectures using machine learning

    Building an optimising compiler is a difficult and time-consuming task which must be repeated for each generation of a microprocessor. As the underlying microarchitecture changes from one generation to the next, the compiler must be retuned to optimise specifically for that new system. It may take several releases of the compiler to effectively exploit a processor’s performance potential, by which time a new generation has appeared and the process starts again. We address this challenge by developing a portable optimising compiler. Our approach employs machine learning to automatically learn the best optimisations to apply for any new program on a new microarchitectural configuration. It achieves this by learning a model off-line which maps a microarchitecture description plus the hardware counters from a single run of the program to the best compiler optimisation passes. Our compiler gains 67% of the maximum speedup obtainable by an iterative compiler search using 1000 evaluations. We obtain, on average, a 1.16x speedup over the highest default optimisation level across an entire microarchitecture configuration space, achieving a 4.3x speedup in the best case. We demonstrate the robustness of this technique by applying it to an extended microarchitectural space where we achieve comparable performance.

    A Survey on Compiler Autotuning using Machine Learning

    Since the mid-1990s, researchers have been trying to use machine-learning-based approaches to solve a number of different compiler optimization problems. These techniques primarily enhance the quality of the obtained results and, more importantly, make it feasible to tackle two main compiler optimization problems: optimization selection (choosing which optimizations to apply) and phase-ordering (choosing the order of applying optimizations). The compiler optimization space continues to grow due to the advancement of applications, increasing number of compiler optimizations, and new target architectures. Generic optimization passes in compilers cannot fully leverage newly introduced optimizations and, therefore, cannot keep up with the pace of increasing options. This survey summarizes and classifies the recent advances in using machine learning for the compiler optimization field, particularly on the two major problems of (1) selecting the best optimizations and (2) the phase-ordering of optimizations. The survey highlights the approaches taken so far, the obtained results, the fine-grain classification among different approaches and, finally, the influential papers of the field.
    Comment: version 5.0 (updated September 2018); preprint of the version accepted by ACM CSUR 2018 (42 pages). The survey is updated quarterly, and authors are invited to send newly published papers for inclusion in subsequent versions. History: received November 2016; revised August 2017; revised February 2018; accepted March 2018.
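The difference between the two problems named in this abstract can be made concrete by enumerating both search spaces for a tiny pass list. The sketch below is illustrative only: the pass names are placeholders, and phase-ordering is modelled as ordered sequences of any length without repetition, which is a simplifying assumption (real compilers may also repeat passes).

```python
from itertools import combinations, permutations
from math import factorial

PASSES = ["inline", "unroll", "vectorize"]  # placeholder pass names


def selections(passes):
    """Optimization selection: every on/off subset, applied in a fixed order."""
    return [c for r in range(len(passes) + 1) for c in combinations(passes, r)]


def orderings(passes):
    """Phase-ordering: every ordered sequence of distinct passes, of any length."""
    return [p for r in range(len(passes) + 1) for p in permutations(passes, r)]


def selection_count(n):
    """Number of subsets of n passes."""
    return 2 ** n


def ordering_count(n):
    """Number of ordered sequences of distinct passes drawn from n passes."""
    return sum(factorial(n) // factorial(n - r) for r in range(n + 1))
```

With three passes there are already twice as many orderings as selections, and the gap widens quickly as the pass list grows, which is one reason exhaustive search becomes impractical.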

    Predictive Modeling in a Polyhedral Optimization Space

    High-level program optimizations, such as loop transformations, are critical for high performance on multi-core targets. However, complex sequences of loop transformations are often required to expose parallelism (both coarse-grain and fine-grain) and improve data locality. The polyhedral compilation framework has proved to be very effective at representing these complex sequences and restructuring compute-intensive applications, seamlessly handling perfectly and imperfectly nested loops. Nevertheless, identifying the most effective loop transformations remains a major challenge. We address the problem of selecting the best polyhedral optimizations with dedicated machine learning models, trained specifically on the target machine. We show that these models can quickly select high-performance optimizations with very limited iterative search. Our end-to-end framework is validated using numerous benchmarks on two modern multi-core platforms. We investigate a variety of different machine learning algorithms and hardware counters, and we obtain performance improvements over production compilers ranging on average from 3.2x to 8.7x, while running no more than 6 program variants from a polyhedral optimization space.
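The combination of a predictive model with "very limited iterative search" can be sketched as a rank-then-measure loop. This is a minimal illustration under stated assumptions, not the paper's framework: `predicted_speedup` stands in for any trained model, `measure` stands in for compiling and timing a variant, and both names are hypothetical. The budget of 6 matches the figure quoted in the abstract.

```python
def pick_variant(variants, predicted_speedup, measure, budget=6):
    """Rank variants by the model's prediction, then measure only the top few.

    variants          -- candidate program variants (any hashable/orderable objects)
    predicted_speedup -- hypothetical model: variant -> predicted speedup
    measure           -- hypothetical timing harness: variant -> measured speedup
    budget            -- maximum number of variants actually run
    """
    ranked = sorted(variants, key=predicted_speedup, reverse=True)
    shortlist = ranked[:budget]
    # max() calls measure once per shortlisted variant, so at most `budget` runs.
    return max(shortlist, key=measure)
```

The model only has to place a good variant somewhere in the shortlist; the measurements then correct for prediction error within that small set.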

    Compiler and Runtime Optimization Techniques for Implementation Scalable Parallel Applications

    A compiler can detect the data dependencies in an application and analyze specific sections of code for parallelization potential. However, these techniques are usually applied at compile time, so they rely on static analysis, which is insufficient for achieving maximum parallelism and the desired application scalability. Compiler techniques should consider both the static information gathered at compile time and the dynamic information about the system captured at runtime in order to generate a safe parallel application. On the other hand, runtime information is often speculative, and relying on it alone doesn't guarantee maximal parallel performance, so collecting information at compile time could significantly improve the performance of runtime techniques. This research achieves that goal by introducing new techniques for both the compiler and the runtime system that enable them to cooperate and to use both static and dynamic analysis information to maximize parallel application performance. In the proposed framework, a compiler can use dynamic runtime methods in its parallelization optimizations, and a runtime system can apply static information when implementing its parallelization methods. The proposed techniques are able to use high-level programming abstractions and machine learning to relieve the programmer of difficult and tedious decisions that can significantly affect program behavior and performance.