12 research outputs found

    Reasoning About Multi-stage Programs

    Get PDF
    Multi-stage programming (MSP) is a style of writing program generators---programs which generate programs---supported by special annotations that direct construction, combination, and execution of object programs. Various researchers have shown MSP to be effective in writing efficient programs without sacrificing genericity. However, correctness proofs of such programs have so far received limited attention, and approaches and challenges for that task have been largely unexplored. In this thesis, I establish formal equational properties of the multi-stage lambda calculus and related proof techniques, as well as results that delineate the intricacies of multi-stage languages that one must be aware of. In particular, I settle three basic questions that naturally arise when verifying multi-stage functional programs. Firstly, can adding staging MSP to a language compromise the interchangeability of terms that held in the original language? Unfortunately it can, and more care is needed to reason about terms with free variables. Secondly, staging annotations, as the term ``annotations'' suggests, are often thought to be orthogonal to the behavior of a program, but when is this formally guaranteed to be the case? I give termination conditions that characterize when this guarantee holds. Finally, do multi-stage languages satisfy extensional facts, for example that functions agreeing on all arguments are equivalent? I develop a sound and complete notion of applicative bisimulation, which can establish not only extensionality but, in principle, any other valid program equivalence as well. These results improve our general understanding of staging and enable us to prove the correctness of complicated multi-stage programs

    Lightweight Modular Staging: A Pragmatic Approach to Runtime Code Generation and Compiled DSLs

    Get PDF
    Good software engineering practice demands generalization and abstraction, whereas high performance demands specialization and concretization. These goals are at odds, and compilers can only rarely translate expressive high-level programs to modern hardware platforms in a way that makes best use of the available resources. Generative programming is a promising alternative to fully automatic translation. Instead of writing down the target program directly, developers write a program generator, which produces the target program as its output. The generator can be written in a high-level, generic style and still produce efficient, specialized target programs. In practice, however, developing high-quality program generators requires a very large effort that is often hard to amortize. We present Lightweight Modular Staging (LMS), a generative programming approach that lowers this effort significantly. LMS seamlessly combines program generator logic with the generated code in a single program, using only types to distinguish the two stages of execution. Through extensive use of component technology, LMS makes a reusable and extensible compiler framework available at the library level, allowing programmers to tightly integrate domain-specific abstractions and optimizations into the generation process, with common generic optimizations provided by the framework. LMS is well suited to develop embedded domain specific languages (DSLs) and has been used to develop powerful performance-oriented DSLs for demanding domains such as machine learning, with code generation for heterogeneous platforms including GPUs. LMS has also been used to generate SQL for embedded database queries and JavaScript for web applications

    Lightweight Modular Staging: A Pragmatic Approach to Runtime Code Generation and Compiled DSLs

    Get PDF
    Software engineering demands generality and abstraction, performance demands specialization and concretization. Generative programming can provide both, but the effort required to develop high-quality program generators likely offsets their benefits, even if a multi-stage programming language is used. We present lightweight modular staging, a library-based multi-stage programming approach that breaks with the tradition of syntactic quasi-quotation and instead uses only types to distinguish between binding times. Through extensive use of component technology, lightweight modular staging makes an optimizing compiler framework available at the library level, allowing programmers to tightly integrate domain-specific abstractions and optimizations into the generation process. We argue that lightweight modular staging enables a form of language virtualization, i.e. allows to go from a pure-library embedded language to one that is practically equivalent to a stand-alone implementation with only modest effort

    Scala-Virtualized: Linguistic Reuse for Deep Embeddings

    Get PDF
    Scala-Virtualized extends the Scala language to better support hosting embedded DSLs. Scala is an expressive language that provides a flexible syntax, type-level computation using implicits, and other features that facilitate the development of em- bedded DSLs. However, many of these features work well only for shallow embeddings, i.e. DSLs which are implemented as plain libraries. Shallow embeddings automatically profit from features of the host language through linguistic reuse: any DSL expression is just as a regular Scala expression. But in many cases, directly executing DSL programs within the host language is not enough and deep embeddings are needed, which reify DSL programs into a data structure representation that can be analyzed, optimized, or further translated. For deep embeddings, linguistic reuse is no longer automatic. Scala-Virtualized defines many of the language’s built-in constructs as method calls, which enables DSLs to redefine the built-in semantics using familiar language mechanisms like overloading and overriding. This in turn enables an easier progression from shallow to deep embeddings, as core language constructs such as conditionals or pattern matching can be redefined to build a reified representation of the operation itself. While this facility brings shallow, syntactic, reuse to deep embeddings, we also present examples of what we call deep linguistic reuse: combining shallow and deep components in a single DSL in such a way that certain features are fully implemented in the shallow embedding part and do not need to be reified at the deep embedding level

    Lightweight Modular Staging and Embedded Compilers:Abstraction without Regret for High-Level High-Performance Programming

    Get PDF
    Programs expressed in a high-level programming language need to be translated to a low-level machine dialect for execution. This translation is usually accomplished by a compiler, which is able to translate any legal program to equivalent low-level code. But for individual source programs, automatic translation does not always deliver good results: Software engineering practice demands generalization and abstraction, whereas high performance demands specialization and concretization. These goals are at odds, and compilers can only rarely translate expressive high-level programs tomodern hardware platforms in a way that makes best use of the available resources. Explicit program generation is a promising alternative to fully automatic translation. Instead of writing down the program and relying on a compiler for translation, developers write a program generator, which produces a specialized, efficient, low-level program as its output. However, developing high-quality program generators requires a very large effort that is often hard to amortize. In this thesis, we propose a hybrid design: Integrate compilers into programs so that programs can take control of the translation process, but rely on libraries of common compiler functionality for help. We present Lightweight Modular Staging (LMS), a generative programming approach that lowers the development effort significantly. LMS combines program generator logic with the generated code in a single program, using only types to distinguish the two stages of execution. Through extensive use of component technology, LMS makes a reusable and extensible compiler framework available at the library level, allowing programmers to tightly integrate domain-specific abstractions and optimizations into the generation process, with common generic optimizations provided by the framework. Compared to previous work on programgeneration, a key aspect of our design is the use of staging not only as a front-end, but also as a way to implement internal compiler passes and optimizations, many of which can be combined into powerful joint simplification passes. LMS is well suited to develop embedded domain specific languages (DSLs) and has been used to develop powerful performance-oriented DSLs for demanding domains such as machine learning, with code generation for heterogeneous platforms including GPUs. LMS has also been used to generate SQL for embedded database queries and JavaScript for web applications

    Specialising Parsers for Queries

    Get PDF
    Many software systems consist of data processing components that analyse large datasets to gather information and learn from these. Often, only part of the data is relevant for analysis. Data processing systems contain an initial preprocessing step that filters out the unwanted information. While efficient data analysis techniques and methodologies are accessible to non-expert programmers, data preprocessing seems to be forgotten, or worse, ignored. This despite real performance gains being possible by efficiently preprocessing data. Implementations of the data preprocessing step traditionally have to trade modularity for performance: to achieve the former, one separates the parsing of raw data and filtering it, and leads to slow programs because of the creation of intermediate objects during execution. The efficient version is a low-level implementation that interleaves parsing and querying. In this dissertation we demonstrate a principled and practical technique to convert the modular, maintainable program into its interleaved efficient counterpart. Key to achieving this objective is the removal, or deforestation, of intermediate objects in a program execution. We first show that by encoding data types using Böhm-Berarducci encodings (often referred to as Church encodings), and combining these with partial evaluation for function composition we achieve deforestation. This allows us to implement optimisations themselves as libraries, with minimal dependence on an underlying optimising compiler. Next we illustrate the applicability of this approach to parsing and preprocessing queries. The approach is general enough to cover top-down and bottom-up parsing techniques, and deforestation of pipelines of operations on lists and streams. We finally present a set of transformation rules that for a parser on a nested data format and a query on the structure, produces a parser specialised for the query. As a result we preserve the modularity of writing parsers and queries separately while also minimising resource usage. These transformation rules combine deforested implementations of both libraries to yield an efficient, interleaved result

    Specialization of applications using shared libraries

    Get PDF
    Ph.DDOCTOR OF PHILOSOPH

    A Monadic Approach for Avoiding Code Duplication when Staging Memoized Functions

    No full text
    Building program generators that do not duplicate generated code can be challenging. At the same time, code duplication can easily increase both generation time and runtime of generated programs by an exponential factor. We identify an instance of this problem that can arise when memoized functions are staged. Without addressing this problem, it would be impossible to effectively stage dynamic programming algorithms. Intuitively, direct staging undoes the effect of memoization. To solve this problem once and for all, and for any function that uses memoization, we propose a staged monadic combinator library. Experimental results confirm that the library works as expected. Preliminary results also indicate that the library is useful even when memoization is not used

    Domain-specific languages for modeling and simulation

    Get PDF
    Simulation models and simulation experiments are increasingly complex. One way to handle this complexity is developing software languages tailored to specific application domains, so-called domain-specific languages (DSLs). This thesis explores the potential of employing DSLs in modeling and simulation. We study different DSL design and implementation techniques and illustrate their benefits for expressing simulation models as well as simulation experiments with several examples.Simulationsmodelle und -experimente werden immer komplexer. Eine Möglichkeit, dieser Komplexität zu begegnen, ist, auf bestimmte Anwendungsgebiete spezialisierte Softwaresprachen, sogenannte domänenspezifische Sprachen (\emph{DSLs, domain-specific languages}), zu entwickeln. Die vorliegende Arbeit untersucht, wie DSLs in der Modellierung und Simulation eingesetzt werden können. Wir betrachten verschiedene Techniken für Entwicklung und Implementierung von DSLs und illustrieren ihren Nutzen für das Ausdrücken von Simulationsmodellen und -experimenten anhand einiger Beispiele
    corecore