89,125 research outputs found
Achieving High-Performance the Functional Way: A Functional Pearl on Expressing High-Performance Optimizations as Rewrite Strategies
Optimizing programs to run efficiently on modern parallel hardware is hard but crucial for many applications. The predominantly used imperative languages - like C or OpenCL - force the programmer to intertwine the code describing functionality and optimizations. This results in a portability nightmare that is particularly problematic given the accelerating trend towards specialized hardware devices to further increase efficiency.
Many emerging DSLs used in performance demanding domains such as deep learning or high-performance image processing attempt to simplify or even fully automate the optimization process. Using a high-level - often functional - language, programmers focus on describing functionality in a declarative way. In some systems such as Halide or TVM, a separate schedule specifies how the program should be optimized. Unfortunately, these schedules are not written in well-defined programming languages. Instead, they are implemented as a set of ad-hoc predefined APIs that the compiler writers have exposed.
In this functional pearl, we show how to employ functional programming techniques to solve this challenge with elegance. We present two functional languages that work together - each addressing a separate concern. RISE is a functional language for expressing computations using well known functional data-parallel patterns. ELEVATE is a functional language for describing optimization strategies. A high-level RISE program is transformed into a low-level form using optimization strategies written in ELEVATE . From the rewritten low-level program high-performance parallel code is automatically generated. In contrast to existing high-performance domain-specific systems with scheduling APIs, in our approach programmers are not restricted to a set of built-in operations and optimizations but freely define their own computational patterns in RISE and optimization strategies in ELEVATE in a composable and reusable way. We show how our holistic functional approach achieves competitive performance with the state-of-the-art imperative systems Halide and TVM
Normalisierung und partielle Auswertung von funktional-logischen Programmen
This thesis deals with the development of a normalization scheme and a partial evaluator for the functional logic programming language Curry. The functional logic programming paradigm combines the two most important fields of declarative programming, namely functional and logic programming. While functional languages provide concepts such as algebraic data types, higher-order functions or demanddriven evaluation, logic languages usually support a non-deterministic evaluation and a built-in search for results. Functional logic languages finally combine these two paradigms in an integrated way, hence providing multiple syntactic constructs and concepts to facilitate the concise notation of high-level programs. However, both the variety of syntactic constructs and the high degree of abstraction complicate the translation into efficient target programs. To reduce the syntactic complexity of functional logic languages, a typical compilation scheme incorporates a normalization phase to subsequently replace complex constructs by simpler ones until a minimal language subset is reached. While the individual transformations are usually simple, they also have to be correctly combined to make the syntactic constructs interact in the intended way. The efficiency of normalized programs can then be improved by means of different optimization techniques. A very powerful optimization technique is the partial evaluation of programs. Partial evaluation basically anticipates the execution of certain program fragments at compile time and computes a semantically equivalent program, which is usually more efficient at run time. Since partial evaluation is a fully automatic optimization technique, it can also be incorporated into the normal compilation scheme of programs. Nevertheless, this also requires termination of the optimization process, which establishes one of the main challenges for partial evaluation besides semantic equivalence. In this work we consider the language Curry as a representative of the functional logic programming paradigm. We develop a formal representation of the normalization process of Curry programs into a kernel language, while respecting the interference of different language constructs. We then define the dynamic semantics of this kernel language, before we subsequently develop a partial evaluation scheme and show its correctness and termination. Due to the previously described normalization process, this scheme is then directly applicable to arbitrary Curry programs. Furthermore, the implementation of a practical partial evaluator is sketched based on the partial evaluation scheme, and its applicability and usefulness is documented by a variety of typical partial evaluation examples
Building Efficient Query Engines in a High-Level Language
Abstraction without regret refers to the vision of using high-level
programming languages for systems development without experiencing a negative
impact on performance. A database system designed according to this vision
offers both increased productivity and high performance, instead of sacrificing
the former for the latter as is the case with existing, monolithic
implementations that are hard to maintain and extend. In this article, we
realize this vision in the domain of analytical query processing. We present
LegoBase, a query engine written in the high-level language Scala. The key
technique to regain efficiency is to apply generative programming: LegoBase
performs source-to-source compilation and optimizes the entire query engine by
converting the high-level Scala code to specialized, low-level C code. We show
how generative programming allows to easily implement a wide spectrum of
optimizations, such as introducing data partitioning or switching from a row to
a column data layout, which are difficult to achieve with existing low-level
query compilers that handle only queries. We demonstrate that sufficiently
powerful abstractions are essential for dealing with the complexity of the
optimization effort, shielding developers from compiler internals and
decoupling individual optimizations from each other. We evaluate our approach
with the TPC-H benchmark and show that: (a) With all optimizations enabled,
LegoBase significantly outperforms a commercial database and an existing query
compiler. (b) Programmers need to provide just a few hundred lines of
high-level code for implementing the optimizations, instead of complicated
low-level code that is required by existing query compilation approaches. (c)
The compilation overhead is low compared to the overall execution time, thus
making our approach usable in practice for compiling query engines
Speculative Staging for Interpreter Optimization
Interpreters have a bad reputation for having lower performance than
just-in-time compilers. We present a new way of building high performance
interpreters that is particularly effective for executing dynamically typed
programming languages. The key idea is to combine speculative staging of
optimized interpreter instructions with a novel technique of incrementally and
iteratively concerting them at run-time.
This paper introduces the concepts behind deriving optimized instructions
from existing interpreter instructions---incrementally peeling off layers of
complexity. When compiling the interpreter, these optimized derivatives will be
compiled along with the original interpreter instructions. Therefore, our
technique is portable by construction since it leverages the existing
compiler's backend. At run-time we use instruction substitution from the
interpreter's original and expensive instructions to optimized instruction
derivatives to speed up execution.
Our technique unites high performance with the simplicity and portability of
interpreters---we report that our optimization makes the CPython interpreter up
to more than four times faster, where our interpreter closes the gap between
and sometimes even outperforms PyPy's just-in-time compiler.Comment: 16 pages, 4 figures, 3 tables. Uses CPython 3.2.3 and PyPy 1.
AMaχoS—Abstract Machine for Xcerpt
Web query languages promise convenient and efficient access
to Web data such as XML, RDF, or Topic Maps. Xcerpt is one such Web
query language with strong emphasis on novel high-level constructs for
effective and convenient query authoring, particularly tailored to versatile
access to data in different Web formats such as XML or RDF.
However, so far it lacks an efficient implementation to supplement the
convenient language features. AMaχoS is an abstract machine implementation
for Xcerpt that aims at efficiency and ease of deployment. It
strictly separates compilation and execution of queries: Queries are compiled
once to abstract machine code that consists in (1) a code segment
with instructions for evaluating each rule and (2) a hint segment that
provides the abstract machine with optimization hints derived by the
query compilation. This article summarizes the motivation and principles
behind AMaχoS and discusses how its current architecture realizes
these principles
Implementation, Compilation, Optimization of Object-Oriented Languages, Programs and Systems - Report on the Workshop ICOOOLPS'2006 at ECOOP'06
ICOOOLPS'2006 was the first edition of ECOOP-ICOOOLPS workshop. It intended
to bring researchers and practitioners both from academia and industry
together, with a spirit of openness, to try and identify and begin to address
the numerous and very varied issues of optimization. This succeeded, as can be
seen from the papers, the attendance and the liveliness of the discussions that
took place during and after the workshop, not to mention a few new cooperations
or postdoctoral contracts. The 22 talented people from different groups who
participated were unanimous to appreciate this first edition and recommend that
ICOOOLPS be continued next year. A community is thus beginning to form, and
should be reinforced by a second edition next year, with all the improvements
this first edition made emerge.Comment: The original publication is available at http://www.springerlink.co
- …