
    Achieving High-Performance the Functional Way: A Functional Pearl on Expressing High-Performance Optimizations as Rewrite Strategies

    Optimizing programs to run efficiently on modern parallel hardware is hard but crucial for many applications. The predominantly used imperative languages - like C or OpenCL - force the programmer to intertwine the code describing functionality and optimizations. This results in a portability nightmare that is particularly problematic given the accelerating trend towards specialized hardware devices to further increase efficiency. Many emerging DSLs used in performance-demanding domains such as deep learning or high-performance image processing attempt to simplify or even fully automate the optimization process. Using a high-level - often functional - language, programmers focus on describing functionality in a declarative way. In some systems, such as Halide or TVM, a separate schedule specifies how the program should be optimized. Unfortunately, these schedules are not written in well-defined programming languages. Instead, they are implemented as a set of ad-hoc predefined APIs that the compiler writers have exposed. In this functional pearl, we show how to employ functional programming techniques to solve this challenge with elegance. We present two functional languages that work together - each addressing a separate concern. RISE is a functional language for expressing computations using well-known functional data-parallel patterns. ELEVATE is a functional language for describing optimization strategies. A high-level RISE program is transformed into a low-level form using optimization strategies written in ELEVATE. From the rewritten low-level program, high-performance parallel code is automatically generated. In contrast to existing high-performance domain-specific systems with scheduling APIs, in our approach programmers are not restricted to a set of built-in operations and optimizations but freely define their own computational patterns in RISE and optimization strategies in ELEVATE in a composable and reusable way. We show how our holistic functional approach achieves competitive performance with the state-of-the-art imperative systems Halide and TVM.
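    To make the idea of optimization strategies as ordinary functional programs concrete, the following is a minimal sketch in Scala rather than ELEVATE: a strategy is a function that either rewrites an expression or fails, a few combinators compose strategies, and a single map-fusion rule is applied exhaustively. The Expr type, the combinator names, and the rule are illustrative assumptions, not the actual RISE or ELEVATE definitions.

```scala
// Minimal sketch of strategies-as-functions, in the spirit of the paper
// (illustrative only; not the actual ELEVATE API).
sealed trait Expr
case class Var(name: String) extends Expr
case class MapPat(f: Expr, xs: Expr) extends Expr    // data-parallel map pattern
case class Compose(f: Expr, g: Expr) extends Expr    // function composition f . g

object Strategies {
  // A strategy either rewrites an expression or fails (None).
  type Strategy = Expr => Option[Expr]

  val id: Strategy = e => Some(e)

  // Sequential composition and left-biased choice of strategies.
  def seq(s1: Strategy, s2: Strategy): Strategy = e => s1(e).flatMap(s2)
  def lChoice(s1: Strategy, s2: Strategy): Strategy = e => s1(e).orElse(s2(e))

  // `attempt` never fails; `repeat` applies a strategy until it no longer applies.
  def attempt(s: Strategy): Strategy = lChoice(s, id)
  def repeat(s: Strategy): Strategy = e => s(e) match {
    case Some(e2) => repeat(s)(e2)
    case None     => Some(e)
  }

  // One rewrite rule: map f (map g xs)  ~>  map (f . g) xs
  val mapFusion: Strategy = {
    case MapPat(f, MapPat(g, xs)) => Some(MapPat(Compose(f, g), xs))
    case _                        => None
  }
}

object FusionDemo {
  import Strategies._
  def main(args: Array[String]): Unit = {
    val prog = MapPat(Var("f"), MapPat(Var("g"), MapPat(Var("h"), Var("xs"))))
    // Exhaustive fusion turns three array traversals into a single one.
    println(repeat(mapFusion)(prog))
  }
}
```

    Because strategies are just values, they compose and can be reused across programs, which is the property the paper contrasts with ad-hoc scheduling APIs.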

    Normalisierung und partielle Auswertung von funktional-logischen Programmen (Normalization and Partial Evaluation of Functional Logic Programs)

    This thesis deals with the development of a normalization scheme and a partial evaluator for the functional logic programming language Curry. The functional logic programming paradigm combines the two most important fields of declarative programming, namely functional and logic programming. While functional languages provide concepts such as algebraic data types, higher-order functions, or demand-driven evaluation, logic languages usually support non-deterministic evaluation and a built-in search for results. Functional logic languages combine these two paradigms in an integrated way, providing multiple syntactic constructs and concepts to facilitate the concise notation of high-level programs. However, both the variety of syntactic constructs and the high degree of abstraction complicate the translation into efficient target programs. To reduce the syntactic complexity of functional logic languages, a typical compilation scheme incorporates a normalization phase that successively replaces complex constructs by simpler ones until a minimal language subset is reached. While the individual transformations are usually simple, they also have to be combined correctly to make the syntactic constructs interact in the intended way. The efficiency of normalized programs can then be improved by means of different optimization techniques. A very powerful optimization technique is the partial evaluation of programs. Partial evaluation anticipates the execution of certain program fragments at compile time and computes a semantically equivalent program, which is usually more efficient at run time. Since partial evaluation is a fully automatic optimization technique, it can also be incorporated into the normal compilation scheme of programs. Nevertheless, this also requires termination of the optimization process, which, besides semantic equivalence, is one of the main challenges for partial evaluation. In this work we consider the language Curry as a representative of the functional logic programming paradigm. We develop a formal representation of the normalization process of Curry programs into a kernel language, while respecting the interaction of different language constructs. We then define the dynamic semantics of this kernel language, before we develop a partial evaluation scheme and show its correctness and termination. Due to the previously described normalization process, this scheme is directly applicable to arbitrary Curry programs. Furthermore, the implementation of a practical partial evaluator is sketched based on the partial evaluation scheme, and its applicability and usefulness are documented by a variety of typical partial evaluation examples.
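    As a rough illustration of what partial evaluation does (written in Scala rather than Curry, and far simpler than the scheme developed in the thesis), the sketch below specializes a small arithmetic expression with respect to the statically known part of its input: static subexpressions are computed at compile time, while the rest is kept as residual code. The Exp type and the pe function are illustrative assumptions.

```scala
// Minimal sketch of partial evaluation over a toy expression language
// (illustrative only; not the Curry partial evaluator developed in the thesis).
sealed trait Exp
case class Lit(n: Int) extends Exp
case class Ref(x: String) extends Exp
case class Add(a: Exp, b: Exp) extends Exp
case class Mul(a: Exp, b: Exp) extends Exp

object PartialEval {
  // `static` maps the variables whose values are already known at compile time.
  def pe(e: Exp, static: Map[String, Int]): Exp = e match {
    case Lit(_) => e
    case Ref(x) => static.get(x).map(Lit(_)).getOrElse(e)
    case Add(a, b) => (pe(a, static), pe(b, static)) match {
      case (Lit(x), Lit(y)) => Lit(x + y)   // both operands static: compute now
      case (ra, rb)         => Add(ra, rb)  // otherwise: emit residual code
    }
    case Mul(a, b) => (pe(a, static), pe(b, static)) match {
      case (Lit(x), Lit(y)) => Lit(x * y)
      case (ra, rb)         => Mul(ra, rb)
    }
  }

  def main(args: Array[String]): Unit = {
    // a*x + 3*4, with a = 2 known statically and x only known at run time
    val prog = Add(Mul(Ref("a"), Ref("x")), Mul(Lit(3), Lit(4)))
    println(pe(prog, Map("a" -> 2)))  // Add(Mul(Lit(2),Ref(x)),Lit(12))
  }
}
```

    This naive specializer terminates because it only follows the structure of the expression; the termination challenge mentioned above arises once function calls are unfolded during specialization.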

    Building Efficient Query Engines in a High-Level Language

    Abstraction without regret refers to the vision of using high-level programming languages for systems development without experiencing a negative impact on performance. A database system designed according to this vision offers both increased productivity and high performance, instead of sacrificing the former for the latter as is the case with existing, monolithic implementations that are hard to maintain and extend. In this article, we realize this vision in the domain of analytical query processing. We present LegoBase, a query engine written in the high-level language Scala. The key technique to regain efficiency is to apply generative programming: LegoBase performs source-to-source compilation and optimizes the entire query engine by converting the high-level Scala code to specialized, low-level C code. We show how generative programming makes it easy to implement a wide spectrum of optimizations, such as introducing data partitioning or switching from a row to a column data layout, which are difficult to achieve with existing low-level query compilers that handle only queries. We demonstrate that sufficiently powerful abstractions are essential for dealing with the complexity of the optimization effort, shielding developers from compiler internals and decoupling individual optimizations from each other. We evaluate our approach with the TPC-H benchmark and show that: (a) with all optimizations enabled, LegoBase significantly outperforms a commercial database and an existing query compiler; (b) programmers need to provide just a few hundred lines of high-level code to implement the optimizations, instead of the complicated low-level code required by existing query compilation approaches; and (c) the compilation overhead is low compared to the overall execution time, making our approach usable in practice for compiling query engines.
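    A very small sketch of the generative-programming idea mentioned above (illustrative only; not LegoBase's actual compiler): a high-level description of a scan operator with a selection predicate is turned, at compile time, into specialized low-level C code for that particular predicate, instead of being interpreted by a generic operator at run time. All names here (compileScan, the table and column names) are invented for the example.

```scala
// Minimal sketch of generative programming for query operators
// (illustrative only; not LegoBase's source-to-source compiler).
sealed trait Predicate
case class LessThan(column: String, value: Int) extends Predicate

object QueryGen {
  // Generate a specialized C loop for one scan with one predicate, so that
  // the emitted code contains no generic operator dispatch at run time.
  def compileScan(table: String, rowCount: String, pred: Predicate): String = pred match {
    case LessThan(col, v) =>
      s"""for (int i = 0; i < $rowCount; i++) {
         |  if ($table[i].$col < $v) emit(&$table[i]);
         |}""".stripMargin
  }

  def main(args: Array[String]): Unit =
    println(compileScan("lineitem", "lineitem_count", LessThan("l_quantity", 24)))
}
```

    The same staging step is where layout decisions such as row versus column storage can be baked into the generated code without changing the high-level operator description.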

    Speculative Staging for Interpreter Optimization

    Interpreters have a bad reputation for having lower performance than just-in-time compilers. We present a new way of building high-performance interpreters that is particularly effective for executing dynamically typed programming languages. The key idea is to combine speculative staging of optimized interpreter instructions with a novel technique of incrementally and iteratively concerting them at run-time. This paper introduces the concepts behind deriving optimized instructions from existing interpreter instructions---incrementally peeling off layers of complexity. When compiling the interpreter, these optimized derivatives are compiled along with the original interpreter instructions. Therefore, our technique is portable by construction, since it leverages the existing compiler's backend. At run-time we use instruction substitution from the interpreter's original and expensive instructions to optimized instruction derivatives to speed up execution. Our technique unites high performance with the simplicity and portability of interpreters---we report that our optimization makes the CPython interpreter up to more than four times faster, where our interpreter closes the gap to, and sometimes even outperforms, PyPy's just-in-time compiler.
    Comment: 16 pages, 4 figures, 3 tables. Uses CPython 3.2.3 and PyPy 1.
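    The run-time instruction substitution described above can be pictured with a tiny stack interpreter (a sketch in Scala, not CPython): a generic add instruction handles any operand types, and once it has observed integer operands it rewrites itself in the code array into a cheaper, integer-only derivative. The instruction set and value types are illustrative assumptions.

```scala
// Minimal sketch of rewriting a generic instruction into an optimized
// derivative at run time (illustrative only; not the CPython implementation).
sealed trait Value
case class IntV(n: Int) extends Value
case class StrV(s: String) extends Value

sealed trait Instr
case class Push(v: Value) extends Instr
case object GenericAdd extends Instr   // handles any operand types
case object IntAdd     extends Instr   // optimized derivative for integers only

object Interp {
  def run(code: Array[Instr]): Value = {
    var stack: List[Value] = Nil
    var pc = 0
    while (pc < code.length) {
      code(pc) match {
        case Push(v) => stack = v :: stack
        case GenericAdd => stack match {
          case b :: a :: rest => (a, b) match {
            case (IntV(x), IntV(y)) =>
              stack = IntV(x + y) :: rest
              code(pc) = IntAdd        // speculate: substitute the cheap derivative
            case (StrV(x), StrV(y)) =>
              stack = StrV(x + y) :: rest
            case _ => sys.error("unsupported operand types")
          }
          case _ => sys.error("stack underflow")
        }
        case IntAdd => stack match {   // fast path: no generic type dispatch
          case IntV(y) :: IntV(x) :: rest => stack = IntV(x + y) :: rest
          case _ => sys.error("expected two integers")
        }
      }
      pc += 1
    }
    stack.head
  }

  def main(args: Array[String]): Unit =
    println(run(Array(Push(IntV(1)), Push(IntV(2)), GenericAdd)))
}
```

    A real implementation would additionally need a guard or fallback to the generic instruction in case the speculation later turns out to be wrong.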

    AMaχoS—Abstract Machine for Xcerpt

    Web query languages promise convenient and efficient access to Web data such as XML, RDF, or Topic Maps. Xcerpt is one such Web query language with a strong emphasis on novel high-level constructs for effective and convenient query authoring, particularly tailored to versatile access to data in different Web formats such as XML or RDF. However, so far it lacks an efficient implementation to complement the convenient language features. AMaχoS is an abstract machine implementation for Xcerpt that aims at efficiency and ease of deployment. It strictly separates compilation and execution of queries: queries are compiled once to abstract machine code that consists of (1) a code segment with instructions for evaluating each rule and (2) a hint segment that provides the abstract machine with optimization hints derived during query compilation. This article summarizes the motivation and principles behind AMaχoS and discusses how its current architecture realizes these principles.
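    To illustrate the two-segment structure of the compiled form described above, here is a small sketch in Scala; the instruction and hint constructors are invented for the example and do not correspond to the actual AMaχoS instruction set.

```scala
// Minimal sketch of a compiled query with separate code and hint segments
// (illustrative only; not the actual AMaχoS machine-code format).
sealed trait Instruction
case class MatchElement(label: String) extends Instruction
case class BindVariable(name: String)  extends Instruction
case object EmitResult                 extends Instruction

sealed trait Hint
case class UseIndex(onLabel: String)                extends Hint
case class EvaluateRulesInOrder(ruleIds: List[Int]) extends Hint

// One instruction sequence per rule, plus global hints that the machine may
// consult during execution but is free to ignore.
case class CompiledQuery(
  codeSegment: Map[Int, List[Instruction]],
  hintSegment: List[Hint]
)

object CompiledQueryExample {
  val query: CompiledQuery = CompiledQuery(
    codeSegment = Map(
      1 -> List(MatchElement("book"), BindVariable("Title"), EmitResult)
    ),
    hintSegment = List(UseIndex("book"), EvaluateRulesInOrder(List(1)))
  )

  def main(args: Array[String]): Unit = println(query)
}
```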

    Implementation, Compilation, Optimization of Object-Oriented Languages, Programs and Systems - Report on the Workshop ICOOOLPS'2006 at ECOOP'06

    ICOOOLPS'2006 was the first edition of the ECOOP ICOOOLPS workshop. It aimed to bring researchers and practitioners from both academia and industry together, in a spirit of openness, to identify and begin to address the numerous and highly varied issues of optimization. This succeeded, as can be seen from the papers, the attendance, and the liveliness of the discussions that took place during and after the workshop, not to mention a few new collaborations and postdoctoral contracts. The 22 talented people from different groups who participated unanimously appreciated this first edition and recommended that ICOOOLPS be continued next year. A community is thus beginning to form, and it should be reinforced by a second edition next year, incorporating the improvements that this first edition brought to light.
    Comment: The original publication is available at http://www.springerlink.co