Search CORE

45,305 research outputs found

A Survey on Compiler Autotuning using Machine Learning

Author: Ashouri Amir H.
Cavazos John
Killian William
Palermo Gianluca
Silvano Cristina
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 03/09/2018
Field of study

Since the mid-1990s, researchers have been trying to use machine-learning based approaches to solve a number of different compiler optimization problems. These techniques primarily enhance the quality of the obtained results and, more importantly, make it feasible to tackle two main compiler optimization problems: optimization selection (choosing which optimizations to apply) and phase-ordering (choosing the order of applying optimizations). The compiler optimization space continues to grow due to the advancement of applications, increasing number of compiler optimizations, and new target architectures. Generic optimization passes in compilers cannot fully leverage newly introduced optimizations and, therefore, cannot keep up with the pace of increasing options. This survey summarizes and classifies the recent advances in using machine learning for the compiler optimization field, particularly on the two major problems of (1) selecting the best optimizations and (2) the phase-ordering of optimizations. The survey highlights the approaches taken so far, the obtained results, the fine-grain classification among different approaches and finally, the influential papers of the field.Comment: version 5.0 (updated on September 2018)- Preprint Version For our Accepted Journal @ ACM CSUR 2018 (42 pages) - This survey will be updated quarterly here (Send me your new published papers to be added in the subsequent version) History: Received November 2016; Revised August 2017; Revised February 2018; Accepted March 2018

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Politecnico di Milano

Independence in constraint logic programs

Author: García de la Banda M.
Hermenegildo Manuel V.
Marriott K.
Publication venue: Facultad de Informática (UPM)
Publication date: 01/01/1993
Field of study

Studying independence of literals, variables, and substitutions has proven very useful in the context of logic programming (LP). Here we study independence in the broader context of constraint logic programming (CLP). We show that a naive extrapolation of the LP definitions of independence to CLP is unsatisfactory (in fact, wrong) for two reasons. First, because interaction between variables through constraints is more complex than in the case of logic programming. Second, in order to ensure the efUciency of several optimizations not only must independence of the search space be considered, but also an orthogonal issue - "independence of constraint solving." We clarify these issues by proposing various types of search independence and constraint solver independence, and show how they can be combined to allow different independence-related optimizations, from parallelism to intelligent backtracking. Sufficient conditions for independence which can be evaluated "a-priori" at run-time are also proposed. Our results suggest that independence, provided a suitable definition is chosen, is even more useful in CLP than in LP

CiteSeerX

Archivo Digital UPM

Computing the Greedy Spanner in Linear Space

Author: Alewijnse Sander P. A.
Bouts Quirijn W.
Brink Alex P. ten
Buchin Kevin
Publication venue
Publication date: 01/01/2013
Field of study

The greedy spanner is a high-quality spanner: its total weight, edge count and maximal degree are asymptotically optimal and in practice significantly better than for any other spanner with reasonable construction time. Unfortunately, all known algorithms that compute the greedy spanner of n points use Omega(n^2) space, which is impractical on large instances. To the best of our knowledge, the largest instance for which the greedy spanner was computed so far has about 13,000 vertices. We present a O(n)-space algorithm that computes the same spanner for points in R^d running in O(n^2 log^2 n) time for any fixed stretch factor and dimension. We discuss and evaluate a number of optimizations to its running time, which allowed us to compute the greedy spanner on a graph with a million vertices. To our knowledge, this is also the first algorithm for the greedy spanner with a near-quadratic running time guarantee that has actually been implemented

arXiv.org e-Print Archive

CiteSeerX

Repository TU/e

Pure OAI Repository

Sympiler: Transforming Sparse Matrix Codes by Decoupling Symbolic Analysis

Author: Cheshmi Kazem
Dehnavi Maryam Mehri
Kamil Shoaib
Strout Michelle Mills
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 18/05/2017
Field of study

Sympiler is a domain-specific code generator that optimizes sparse matrix computations by decoupling the symbolic analysis phase from the numerical manipulation stage in sparse codes. The computation patterns in sparse numerical methods are guided by the input sparsity structure and the sparse algorithm itself. In many real-world simulations, the sparsity pattern changes little or not at all. Sympiler takes advantage of these properties to symbolically analyze sparse codes at compile-time and to apply inspector-guided transformations that enable applying low-level transformations to sparse codes. As a result, the Sympiler-generated code outperforms highly-optimized matrix factorization codes from commonly-used specialized libraries, obtaining average speedups over Eigen and CHOLMOD of 3.8X and 1.5X respectively.Comment: 12 page

arXiv.org e-Print Archive

Crossref

Optimizing a Certified Proof Checker for a Large-Scale Computer-Generated Proof

Author: A Fouilhe
C Sternagel
DC Voorhis van
DE Knuth
E Contejean
L Cruz-Filipe
N Oury
P Letouzey
R O’Connor
R Thiemann
RW Floyd
X Leroy
Y Bertot
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2015
Field of study

In recent work, we formalized the theory of optimal-size sorting networks with the goal of extracting a verified checker for the large-scale computer-generated proof that 25 comparisons are optimal when sorting 9 inputs, which required more than a decade of CPU time and produced 27 GB of proof witnesses. The checker uses an untrusted oracle based on these witnesses and is able to verify the smaller case of 8 inputs within a couple of days, but it did not scale to the full proof for 9 inputs. In this paper, we describe several non-trivial optimizations of the algorithm in the checker, obtained by appropriately changing the formalization and capitalizing on the symbiosis with an adequate implementation of the oracle. We provide experimental evidence of orders of magnitude improvements to both runtime and memory footprint for 8 inputs, and actually manage to check the full proof for 9 inputs.Comment: IMADA-preprint-c

arXiv.org e-Print Archive

Crossref

University of Southern Denmark Research Output