19 research outputs found

    Optimizing and Incrementalizing Higher-order Collection Queries by AST Transformation

    Get PDF
    In modern programming languages, queries on in-memory collections are often more expensive than needed. While database queries can be readily optimized, it is often not trivial to use them to express collection queries which employ nested data and first-class functions, as enabled by functional programming languages. Collection queries can be optimized and incrementalized by hand, but this reduces modularity, and is often too error-prone to be feasible or to enable maintenance of resulting programs.
To free programmers from such burdens, in this thesis we study how to optimize and incrementalize such collection queries. Resulting programs are expressed in the same core language, so that they can be subjected to other standard optimizations. To enable optimizing collection queries which occur inside programs, we develop a staged variant of the Scala collection API that reifies queries as ASTs. On top of this interface, we adapt domain-specific optimizations from the fields of programming languages and databases; among others, we rewrite queries to use indexes chosen by programmers. Thanks to the use of indexes, our experimental evaluation shows significant speedups, with an average of 12x and a maximum of 12800x. To incrementalize higher-order programs by program transformation, we extend finite differencing [Paige and Koenig, 1982; Blakeley et al., 1986; Gupta and Mumick, 1999] and develop the first approach to incrementalization by program transformation for higher-order programs. Base programs are transformed to derivatives, that is, programs that transform input changes to output changes. We prove that our incrementalization approach is correct: we develop the theory underlying incrementalization for the simply-typed and untyped λ-calculus, and discuss extensions to System F. Derivatives often need to reuse results produced by base programs; to enable such reuse, we extend work by Liu and Teitelbaum [1995] to higher-order programs, and develop and prove correct a program transformation that converts higher-order programs to cache-transfer style. For efficient incrementalization, it is necessary to choose appropriate primitive operations and to incrementalize them by hand. We incrementalize a significant subset of collection operations and perform case studies, showing order-of-magnitude speedups both in practice and in asymptotic complexity.
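    To make the derivative idea concrete, here is a minimal Haskell sketch (ours, not taken from the thesis; BagChange, sumBag' and applyChange are hypothetical names): changes to a bag of integers are recorded as insertions and removals, and the derivative of a summation maps such an input change directly to an output change, avoiding recomputation.

        -- Our toy instance of finite differencing; all names are hypothetical.
        -- A change to a bag of integers: elements added and elements removed.
        data BagChange = BagChange { added :: [Int], removed :: [Int] }

        -- Base program: sum of a bag (here represented as a list).
        sumBag :: [Int] -> Int
        sumBag = sum

        -- Derivative: maps an input change to an output change without
        -- rescanning the whole bag.
        sumBag' :: [Int] -> BagChange -> Int
        sumBag' _old dx = sum (added dx) - sum (removed dx)

        -- Applying a change to the input; the correctness statement is
        --   sumBag xs + sumBag' xs dx == sumBag (applyChange dx xs)
        -- (assuming each removed element actually occurs in xs).
        applyChange :: BagChange -> [Int] -> [Int]
        applyChange dx xs = added dx ++ foldr deleteOne xs (removed dx)
          where deleteOne y ys = let (a, b) = break (== y) ys in a ++ drop 1 b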

    A Theory of Higher-Order Subtyping with Type Intervals (Extended Version)

    Full text link
    The calculus of Dependent Object Types (DOT) has enabled a more principled and robust implementation of Scala, but its support for type-level computation has proven insufficient. As a remedy, we propose $F^\omega_{..}$, a rigorous theoretical foundation for Scala's higher-kinded types. $F^\omega_{..}$ extends $F^\omega_{<:}$ with interval kinds, which afford a unified treatment of important type- and kind-level abstraction mechanisms found in Scala, such as bounded quantification, bounded operator abstractions, translucent type definitions and first-class subtyping constraints. The result is a flexible and general theory of higher-order subtyping. We prove type and kind safety of $F^\omega_{..}$, as well as weak normalization of types and undecidability of subtyping. All our proofs are mechanized in Agda using a fully syntactic approach based on hereditary substitution. Comment: 73 pages; to be presented at the 26th ACM SIGPLAN International Conference on Functional Programming (ICFP 2021), 22-27 August 2021.
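    As a rough sketch of the idea (ours, using the usual encodings rather than the paper's exact rules): the interval kind $A..B$ classifies the types $T$ with $A <: T <: B$, so the bounded constructs listed above arise as special cases.

        \[
        \begin{array}{ll}
          X : \bot..A              & \text{upper-bounded abstraction } (X <: A)\\
          X : A..\top              & \text{lower-bounded abstraction } (A <: X)\\
          X : A..A                 & \text{translucent type definition } (X = A)\\
          \forall(X : \bot..A).\,T & \text{bounded quantification } \forall X{<:}A.\,T
        \end{array}
        \]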

    Coordination Of Hierarchical Command And Control Services

    Get PDF
    The purpose of this program is to show that emerging information technologies can significantly improve key areas of tactical operations, resulting in the transition of software developed under the ATO to existing battlefield systems. One such key area is Information Dissemination and Management (ID&M). The key software to be developed under the ID&M portion requires a collection of agent-based software services that will collaborate during tactical mission planning and execution.

    The Interplay of Mathematics and Computer Science: Safety and Extensibility in Computational Algebra and Haskell

    Get PDF
    筑波大学 (University of Tsukuba), 201

    Extending old languages for new architectures

    Get PDF
    Architectures evolve quickly. The number of transistors available to chip designers doubles every 18 months, allowing increasingly complex architectures to be developed on a single chip. Power dissipation issues have forced chip designers to look for new ways to use the transistors at their disposal. This situation inevitably leads to new architectural features on a fairly regular basis. Enabling programmers to benefit from these new architectural features can be problematic. Since architectures change frequently and compilers last for a long time, it is clear that compilers should be designed to be extensible. This thesis argues that to support evolving architectures a compiler should support the creation of high-level language extensions; in particular, it must support extending the compiler's middle-end. We describe the design of EMCC, a C compiler that allows extension of its front-, middle- and back-ends. OpenMP is an extension to the C programming language to support parallelism. It has recently added support for task-based parallelism, a dynamic form of parallelism made popular by Cilk. However, implementing task-based parallelism efficiently requires much more involved program transformation than the simple static parallelism originally supported by OpenMP. We use EMCC to create an implementation of OpenMP, with a particular focus on the efficient implementation of task-based parallelism. We also demonstrate the benefits of supporting high-level analysis through an extended middle-end by developing and implementing an interprocedural analysis that improves the performance of task-based parallelism by allowing tasks to share stacks. We develop a novel generalisation of logic programming that we use to concisely express this analysis, and use this formalism to demonstrate that the analysis can be executed in polynomial time. Finally, we design extensions to OpenMP to support heterogeneous architectures.

    Reliably composable language extensions

    Get PDF
    University of Minnesota Ph.D. dissertation. May 2017. Major: Computer Science. Advisor: Eric Van Wyk. 1 computer file (PDF); x, 300 pages.
    Many programming tasks are dramatically simpler when an appropriate domain-specific language can be used to accomplish them. These languages offer a variety of potential advantages, including programming at a higher level of abstraction, custom analyses specific to the problem domain, and the ability to generate very efficient code. But they also suffer many disadvantages as a result of their implementation techniques. Fully separate languages (such as YACC or SQL) are quite flexible, but they are distinct monolithic entities, so we are unable to draw on the features of several in combination to accomplish a single task; that is, we cannot compose their domain-specific features. "Embedded" DSLs (such as parser combinators) accomplish something like a different language, but are actually implemented simply as libraries within a flexible host language. This approach allows different libraries to be imported and used together, enabling composition, but it is limited in analysis and translation capabilities by the host language they are embedded within. A promising combination of these two approaches is to allow a host language to be directly extended with new features (syntactic and semantic). However, while there are plausible ways to attempt to compose language extensions, they can easily fail, making this approach unreliable. This thesis introduces reliably composable language extensions as a technique for the implementation of DSLs. This technique preserves most of the advantages of both separate and "embedded" DSLs. Unlike many prior approaches to language extension, this technique ensures that the composition of multiple language extensions will succeed, and it preserves strong properties about the behavior of the resulting composed compiler. We define an analysis on language extensions that guarantees that the composition of several extensions will be well-defined, and we further define a set of testable properties that ensure the resulting compiler will behave as expected, along with a principle that assigns "blame" for bugs that may ultimately appear as a result of composition. Finally, to concretely compare our approach to our original goals for reliably composable language extensions, we use these techniques to develop an extensible C compiler front-end, together with several example composable language extensions.
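    For readers unfamiliar with the embedded style, here is a minimal parser-combinator sketch in Haskell (our illustration, not from the thesis): the "DSL" is just library code, so parsers compose freely, but the host compiler sees only ordinary functions and cannot analyse or optimise the grammar they denote.

        -- A tiny embedded DSL: parsers as plain host-language values.
        newtype Parser a = Parser { runParser :: String -> [(a, String)] }

        -- Consume one character, if any.
        item :: Parser Char
        item = Parser $ \s -> case s of
          []     -> []
          (c:cs) -> [(c, cs)]

        instance Functor Parser where
          fmap f (Parser p) = Parser $ \s -> [ (f a, rest) | (a, rest) <- p s ]

        instance Applicative Parser where
          pure a = Parser $ \s -> [(a, s)]
          Parser pf <*> Parser pa =
            Parser $ \s -> [ (f a, s'') | (f, s') <- pf s, (a, s'') <- pa s' ]

        -- Choice: try both alternatives, collecting all results.
        orElse :: Parser a -> Parser a -> Parser a
        orElse (Parser p) (Parser q) = Parser $ \s -> p s ++ q s

        -- Composition is ordinary function application in the host language.
        twoChars :: Parser (Char, Char)
        twoChars = (,) <$> item <*> item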

    A formal, resource consumption-preserving translation from actors with cooperative scheduling to Haskell

    Get PDF
    We present a formal translation of a resource-aware extension of the Abstract Behavioral Specification (ABS) language to the functional language Haskell. ABS is an actor-based language tailored to the modeling of distributed systems. It combines asynchronous method calls with a suspend-and-resume mode of execution of method invocations. To cater for the resulting cooperative scheduling of the method invocations of an actor, the translation compiles ABS methods to Haskell functions with continuations. The main result of this article is a correctness proof of the translation by means of a simulation relation between a formal semantics of the source language and a high-level operational semantics of the target language, i.e., a subset of Haskell. We further prove that the resource consumption of an ABS program extended with a cost model is preserved over this translation, as we establish an equivalence between the cost of executing the ABS program and that of its Haskell translation. Concretely, under a given cost model, the resources consumed by the original ABS program and those consumed by the Haskell program are the same. Consequently, the resource bounds automatically inferred for ABS programs extended with a cost model, using resource analysis tools, are also sound resource bounds for the translated Haskell programs. Our experimental evaluation confirms the resource preservation over a set of benchmarks featuring different asymptotic costs.
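    The suspend-and-resume mechanism can be pictured with a small resumption monad in Haskell (our sketch, not the article's actual compilation scheme; Res, yield and schedule are hypothetical names): a method body either finishes with a value or suspends, exposing its continuation so a scheduler can interleave the method invocations of an actor.

        -- Hypothetical sketch of cooperative scheduling via continuations.
        -- A computation either finishes or suspends with its continuation.
        data Res a = Val a | Suspend (Res a)

        instance Functor Res where
          fmap f (Val a)     = Val (f a)
          fmap f (Suspend k) = Suspend (fmap f k)

        instance Applicative Res where
          pure = Val
          Val f     <*> r = fmap f r
          Suspend k <*> r = Suspend (k <*> r)

        instance Monad Res where
          Val a     >>= f = f a
          Suspend k >>= f = Suspend (k >>= f)

        -- An await-like yield point: hand control back to the scheduler.
        yield :: Res ()
        yield = Suspend (Val ())

        -- A round-robin scheduler interleaving two processes to completion.
        schedule :: Res a -> Res b -> (a, b)
        schedule (Val a)     q           = (a, runToEnd q)
        schedule p           (Val b)     = (runToEnd p, b)
        schedule (Suspend p) (Suspend q) = schedule p q

        runToEnd :: Res a -> a
        runToEnd (Val a)     = a
        runToEnd (Suspend k) = runToEnd k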

    Understanding the interaction between elaboration and quotation

    Get PDF

    A Tool for Producing Verified, Explainable Proofs

    Get PDF
    Mathematicians are reluctant to use interactive theorem provers. In this thesis I argue that this is because proof assistants don't emphasise explanations of proofs, and that in order to produce good explanations, the system must create proofs in a manner that mimics how humans would create proofs. My research goals are to determine what constitutes a human-like proof and to represent human-like reasoning within an interactive theorem prover to create formalised, understandable proofs. Another goal is to produce a framework to visualise the goal states of this system. To demonstrate this, I present HumanProof: a piece of software built for the Lean 3 theorem prover. It is used for interactively creating proofs that resemble how human mathematicians reason. The system provides a visual, hierarchical representation of the goal and a system for suggesting available inference rules. The system produces output in the form of both natural language and formal proof terms which are checked by Lean's kernel. This is made possible with the use of a structured goal state system which interfaces with Lean's tactic system, as detailed in Chapter 3. In Chapter 4, I present the subtasks automation planning subsystem, which is used to produce equality proofs in a human-like fashion. The basic strategy of the subtasks system is to break a given equality problem into a hierarchy of tasks and then maintain a stack of these tasks in order to determine the order in which to apply equational rewriting moves. This process produces equality chains for simple problems without having to resort to brute force or specialised procedures such as normalisation. This makes proofs more human-like by breaking the problem into a hierarchical set of tasks in the same way that a human would. To produce the interface for this software, I also created the ProofWidgets system for Lean 3, detailed in Chapter 5. The ProofWidgets system uses Lean's metaprogramming framework to allow users to write their own interactive, web-based user interfaces, displayed within the VSCode editor and in an online web-editor. The entire tactic state is available to the rendering engine, and hence expression structure and types of subexpressions can be explored interactively. The ProofWidgets system also allows the user interface to interactively edit the proof document, enabling a truly interactive modality for creating proofs; human-like or not. In Chapter 6, the system is evaluated by asking real mathematicians about the output of the system and what it means for a proof to be understandable to them. The user group study asks participants to rank and comment on proofs created by HumanProof alongside natural language and pure Lean proofs. The study finds that participants generally prefer the HumanProof format over the Lean format. The verbal responses collected during the study indicate that providing intuition and signposting are the most important properties of a proof that aid understanding.
    EPSRC
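    The equality chains targeted by the subtasks system resemble Lean 3 calc proofs; a tiny hand-written example (ours, not from the thesis) of the format:

        -- Each step of a `calc` block is one human-readable move (Lean 3).
        example (a b c : ℕ) (h₁ : a = b) (h₂ : b = c) : a = c :=
        calc a = b : h₁
           ... = c : h₂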

    Effectful Programming in Declarative Languages with an Emphasis on Non-Determinism: Applications and Formal Reasoning

    Get PDF
    This thesis investigates effectful declarative programming with an emphasis on non-determinism as an effect. On the one hand, we are interested in developing applications that use non-determinism as the underlying implementation idea. We discuss two applications using the functional logic programming language Curry. The key idea of these implementations is to exploit the interplay of non-determinism and non-strictness that Curry employs. The first application investigates sorting algorithms parametrised over a comparison function. By applying a non-deterministic predicate to these sorting functions, we obtain a permutation enumeration function. We compare the implementation in Curry with an implementation in Haskell that uses a monadic interface to model non-determinism. The other application that we discuss in this work is a library for probabilistic programming. Instead of modelling distributions as lists of event-probability pairs, we model distributions using Curry's built-in non-determinism. In both cases we observe that the combination of non-determinism and non-strictness has advantages over an implementation using lists to model non-determinism. On the other hand, we present an idea for applying formal reasoning to effectful declarative programming languages. In order to start with simple effects, we focus on modelling a functional subset first; that is, the effects of interest are totality and partiality. We then observe that the general scheme used to model these two effects can be generalised to capture a wide range of effects. The natural next step is to apply the idea to model non-determinism. More precisely, we implement a model for the non-determinism of Curry: non-strict non-determinism with call-time choice. Finally, we discuss why the current representation models call-by-name rather than Curry's call-by-need semantics, and give an outlook on ideas to tackle this problem.
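    The Haskell side of the comparison can be sketched with the list monad (our illustration, not the thesis code, which instead obtains permutations by applying sorting functions to a non-deterministic predicate): a non-deterministic insertion returns every possible result, and binding it through the list enumerates all permutations.

        -- Modelling non-determinism with the list monad: insertND returns
        -- every way to insert an element, perms enumerates permutations.
        insertND :: a -> [a] -> [[a]]
        insertND x []     = [[x]]
        insertND x (y:ys) = (x : y : ys) : map (y :) (insertND x ys)

        perms :: [a] -> [[a]]
        perms []     = [[]]
        perms (x:xs) = perms xs >>= insertND x

        -- ghci> perms [1,2,3]
        -- [[1,2,3],[2,1,3],[2,3,1],[1,3,2],[3,1,2],[3,2,1]]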