246 research outputs found

    A foundation for synthesising programming language semantics

    Get PDF
    Programming or scripting languages used in real-world systems are seldom designed with a formal semantics in mind from the outset. Therefore, the first step for developing well-founded analysis tools for these systems is to reverse-engineer a formal semantics. This can take months or years of effort. Could we automate this process, at least partially? Though desirable, automatically reverse-engineering semantics rules from an implementation is very challenging, as found by Krishnamurthi, Lerner and Elberty. They propose automatically learning desugaring translation rules, mapping the language whose semantics we seek to a simplified, core version, whose semantics are much easier to write. The present thesis contains an analysis of their challenge, as well as the first steps towards a solution. Scaling methods with the size of the language is very difficult due to state space explosion, so this thesis proposes an incremental approach to learning the translation rules. I present a formalisation that both clarifies the informal description of the challenge by Krishnamurthi et al, and re-formulates the problem, shifting the focus to the conditions for incremental learning. The central definition of the new formalisation is the desugaring extension problem, i.e. extending a set of established translation rules by synthesising new ones. In a synthesis algorithm, the choice of search space is important and non-trivial, as it needs to strike a good balance between expressiveness and efficiency. The rest of the thesis focuses on defining search spaces for translation rules via typing rules. Two prerequisites are required for comparing search spaces. The first is a series of benchmarks, a set of source and target languages equipped with intended translation rules between them. The second is an enumerative synthesis algorithm for efficiently enumerating typed programs. I show how algebraic enumeration techniques can be applied to enumerating well-typed translation rules, and discuss the properties expected from a type system for ensuring that typed programs be efficiently enumerable. The thesis presents and empirically evaluates two search spaces. A baseline search space yields the first practical solution to the challenge. The second search space is based on a natural heuristic for translation rules, limiting the usage of variables so that they are used exactly once. I present a linear type system designed to efficiently enumerate translation rules, where this heuristic is enforced. Through informal analysis and empirical comparison to the baseline, I then show that using linear types can speed up the synthesis of translation rules by an order of magnitude

    Complete and easy type Inference for first-class polymorphism

    Get PDF
    The Hindley-Milner (HM) typing discipline is remarkable in that it allows statically typing programs without requiring the programmer to annotate programs with types themselves. This is due to the HM system offering complete type inference, meaning that if a program is well typed, the inference algorithm is able to determine all the necessary typing information. Let bindings implicitly perform generalisation, allowing a let-bound variable to receive the most general possible type, which in turn may be instantiated appropriately at each of the variable’s use sites. As a result, the HM type system has since become the foundation for type inference in programming languages such as Haskell as well as the ML family of languages and has been extended in a multitude of ways. The original HM system only supports prenex polymorphism, where type variables are universally quantified only at the outermost level. This precludes many useful programs, such as passing a data structure to a function in the form of a fold function, which would need to be polymorphic in the type of the accumulator. However, this would require a nested quantifier in the type of the overall function. As a result, one direction of extending the HM system is to add support for first-class polymorphism, allowing arbitrarily nested quantifiers and instantiating type variables with polymorphic types. In such systems, restrictions are necessary to retain decidability of type inference. This work presents FreezeML, a novel approach for integrating first-class polymorphism into the HM system, focused on simplicity. It eschews sophisticated yet hard to grasp heuristics in the type systems or extending the language of types, while still requiring only modest amounts of annotations. In particular, FreezeML leverages the mechanisms for generalisation and instantiation that are already at the heart of ML. Generalisation and instantiation are performed by let bindings and variables, respectively, but extended to types beyond prenex polymorphism. The defining feature of FreezeML is the ability to freeze variables, which prevents the usual instantiation of their types, allowing them instead to keep their original, fully polymorphic types. We demonstrate that FreezeML is as expressive as System F by providing a translation from the latter to the former; the reverse direction is also shown. Further, we prove that FreezeML is indeed a conservative extension of ML: When considering only ML programs, FreezeML accepts exactly the same programs as ML itself. # We show that type inference for FreezeML can easily be integrated into HM-like type systems by presenting a sound and complete inference algorithm for FreezeML that extends Algorithm W, the original inference algorithm for the HM system. Since the inception of Algorithm W in the 1970s, type inference for the HM system and its descendants has been modernised by approaches that involve constraint solving, which proved to be more modular and extensible. In such systems, a term is translated to a logical constraint, whose solutions correspond to the types of the original term. A solver for such constraints may then be defined independently. To this end, we demonstrate such a constraint-based inference approach for FreezeML. We also discuss the effects of integrating the value restriction into FreezeML and provide detailed comparisons with other approaches towards first-class polymorphism in ML alongside a collection of examples found in the literature

    Language integrated relational lenses

    Get PDF
    Relational databases are ubiquitous. Such monolithic databases accumulate large amounts of data, yet applications typically only work on small portions of the data at a time. A subset of the database defined as a computation on the underlying tables is called a view. Querying views is helpful, but it is also desirable to update them and have these changes be applied to the underlying database. This view update problem has been the subject of much previous work before, but support by database servers is limited and only rarely available. Lenses are a popular approach to bidirectional transformations, a generalization of the view update problem in databases to arbitrary data. However, perhaps surprisingly, lenses have seldom actually been used to implement updatable views in databases. Bohannon, Pierce and Vaughan propose an approach to updatable views called relational lenses. However, to the best of our knowledge this proposal has not been implemented or evaluated prior to the work reported in this thesis. This thesis proposes programming language support for relational lenses. Language integrated relational lenses support expressive and efficient view updates, without relying on updatable view support from the database server. By integrating relational lenses into the programming language, application development becomes easier and less error-prone, avoiding the impedance mismatch of having two programming languages. Integrating relational lenses into the language poses additional challenges. As defined by Bohannon et al. relational lenses completely recompute the database, making them inefficient as the database scales. The other challenge is that some parts of the well-formedness conditions are too general for implementation. Bohannon et al. specify predicates using possibly infinite abstract sets and define the type checking rules using relational algebra. Incremental relational lenses equip relational lenses with change-propagating semantics that map small changes to the view into (potentially) small changes to the source tables. We prove that our incremental semantics are functionally equivalent to the non-incremental semantics, and our experimental results show orders of magnitude improvement over the non-incremental approach. This thesis introduces a concrete predicate syntax and shows how the required checks are performed on these predicates and show that they satisfy the abstract predicate specifications. We discuss trade-offs between static predicates that are fully known at compile time vs dynamic predicates that are only known during execution and introduce hybrid predicates taking inspiration from both approaches. This thesis adapts the typing rules for relational lenses from sequential composition to a functional style of sub-expressions. We prove that any well-typed functional relational lens expression can derive a well-typed sequential lens. We use these additions to relational lenses as the foundation for two practical implementations: an extension of the Links functional language and a library written in Haskell. The second implementation demonstrates how type-level computation can be used to implement relational lenses without changes to the compiler. These two implementations attest to the possibility of turning relational lenses into a practical language feature

    Measuring with confidence : leveraging expressive type systems for correct-by-construction software

    Get PDF
    Modern programming language type systems help programmers write correct software, and furthermore helps them write the software they actually intended to write. We show how expressive types can be used to encode dimension and units of measure information, which can be used to avoid dimensional mistakes and guide software construction, and how types can even help to generate code automatically, which eliminates a whole class of bugs

    Towards A Practical High-Assurance Systems Programming Language

    Full text link
    Writing correct and performant low-level systems code is a notoriously demanding job, even for experienced developers. To make the matter worse, formally reasoning about their correctness properties introduces yet another level of complexity to the task. It requires considerable expertise in both systems programming and formal verification. The development can be extremely costly due to the sheer complexity of the systems and the nuances in them, if not assisted with appropriate tools that provide abstraction and automation. Cogent is designed to alleviate the burden on developers when writing and verifying systems code. It is a high-level functional language with a certifying compiler, which automatically proves the correctness of the compiled code and also provides a purely functional abstraction of the low-level program to the developer. Equational reasoning techniques can then be used to prove functional correctness properties of the program on top of this abstract semantics, which is notably less laborious than directly verifying the C code. To make Cogent a more approachable and effective tool for developing real-world systems, we further strengthen the framework by extending the core language and its ecosystem. Specifically, we enrich the language to allow users to control the memory representation of algebraic data types, while retaining the automatic proof with a data layout refinement calculus. We repurpose existing tools in a novel way and develop an intuitive foreign function interface, which provides users a seamless experience when using Cogent in conjunction with native C. We augment the Cogent ecosystem with a property-based testing framework, which helps developers better understand the impact formal verification has on their programs and enables a progressive approach to producing high-assurance systems. Finally we explore refinement type systems, which we plan to incorporate into Cogent for more expressiveness and better integration of systems programmers with the verification process

    Erasure in dependently typed programming

    Get PDF
    It is important to reduce the cost of correctness in programming. Dependent types and related techniques, such as type-driven programming, offer ways to do so. Some parts of dependently typed programs constitute evidence of their typecorrectness and, once checked, are unnecessary for execution. These parts can easily become asymptotically larger than the remaining runtime-useful computation, which can cause linear-time algorithms run in exponential time, or worse. It would be unnacceptable, and contradict our goal of reducing the cost of correctness, to make programs run slower by only describing them more precisely. Current systems cannot erase such computation satisfactorily. By modelling erasure indirectly through type universes or irrelevance, they impose the limitations of these means to erasure. Some useless computation then cannot be erased and idiomatic programs remain asymptotically sub-optimal. This dissertation explains why we need erasure, that it is different from other concepts like irrelevance, and proposes two ways of erasing non-computational data. One is an untyped flow-based useless variable elimination, adapted for dependently typed languages, currently implemented in the Idris 1 compiler. The other is the main contribution of the dissertation: a dependently typed core calculus with erasure annotations, full dependent pattern matching, and an algorithm that infers erasure annotations from unannotated (or partially annotated) programs. I show that erasure in well-typed programs is sound in that it commutes with single-step reduction. Assuming the Church-Rosser property of reduction, I show that properties such as Subject Reduction hold, which extends the soundness result to multi-step reduction. I also show that the presented erasure inference is sound and complete with respect to the typing rules; that this approach can be extended with various forms of erasure polymorphism; that it works well with monadic I/O and foreign functions; and that it is effective in that it not only removes the runtime overhead caused by dependent typing in the presented examples, but can also shorten compilation times."This work was supported by the University of St Andrews (School of Computer Science)." -- Acknowledgement

    A Direct-Style Effect Notation for Sequential and Parallel Programs

    Get PDF
    Modeling sequential and parallel composition of effectful computations has been investigated in a variety of languages for a long time. In particular, the popular do-notation provides a lightweight effect embedding for any instance of a monad. Idiom bracket notation, on the other hand, provides an embedding for applicatives. First, while monads force effects to be executed sequentially, ignoring potential for parallelism, applicatives do not support sequential effects. Composing sequential with parallel effects remains an open problem. This is even more of an issue as real programs consist of a combination of both sequential and parallel segments. Second, common notations do not support invoking effects in direct-style, instead forcing a rigid structure upon the code. In this paper, we propose a mixed applicative/monadic notation that retains parallelism where possible, but allows sequentiality where necessary. We leverage a direct-style notation where sequentiality or parallelism is derived from the structure of the code. We provide a mechanisation of our effectful language in Coq and prove that our compilation approach retains the parallelism of the source program

    Hoogle?: Constants and ?-abstractions in Petri-net-based Synthesis using Symbolic Execution

    Get PDF
    Type-directed component-based program synthesis is the task of automatically building a function with applications of available components and whose type matches a given goal type. Existing approaches to component-based synthesis, based on classical proof search, cannot deal with large sets of components. Recently, Hoogle+, a component-based synthesizer for Haskell, overcomes this issue by reducing the search problem to a Petri-net reachability problem. However, Hoogle+ cannot synthesize constants nor ?-abstractions, which limits the problems that it can solve. We present Hoogle?, an extension to Hoogle+ that brings constants and ?-abstractions to the search space, in two independent steps. First, we introduce the notion of wildcard component, a component that matches all types. This enables the algorithm to produce incomplete functions, i.e., functions containing occurrences of the wildcard component. Second, we complete those functions, by replacing each occurrence with constants or custom-defined ?-abstractions. We have chosen to find constants by means of an inference algorithm: we present a new unification algorithm based on symbolic execution that uses the input-output examples supplied by the user to compute substitutions for the occurrences of the wildcard. When compared to Hoogle+, Hoogle? can solve more kinds of problems, especially problems that require the generation of constants and ?-abstractions, without performance degradation

    A dependently typed programming language with dynamic equality

    Get PDF
    Dependent types offer a uniform foundation for both proof systems and programming languages. While the proof systems built with dependent types have become relatively popular, dependently typed programming languages are far from mainstream. One key issue with existing dependently typed languages is the overly conservative definitional equality that programmers are forced to use. When combined with a traditional typing workflow, these systems can be quite challenging and require a large amount of expertise to master. This thesis explores an alternative workflow and a more liberal handling of equality. Programmers are given warnings that contain the same information as the type errors that would be given by an existing system. Programmers can run these programs optimistically, and they will behave appropriately unless a direct contradiction confirming the warning is found. This is achieved by localizing equality constraints using a new form of elaboration based on bidirectional type inference. These local checks, or casts, are given a runtime behavior (similar to those of contracts and monitors). The elaborated terms have a weakened form of type soundness: they will not get stuck without an explicit counter example. The language explored in this thesis will be a Calculus of Constructions like language with recursion, type-in-type, data types with dependent indexing and pattern matching. Several meta-theoretic results will be presented. The key result is that the core language, called the cast system, "will not get stuck without a counter example"; a result called cast soundness. A proof of cast soundness is fully worked out for the fragment of the system without user defined data, and a Coq proof is available. Several other properties based on the gradual guarantees of gradual typing are also presented. In the presence of user defined data and pattern matching these properties are conjectured to hold. A prototype implementation of this work is available

    PureCake: A verified compiler for a lazy functional language

    Get PDF
    We present PureCake, a mechanically-verified compiler for PureLang, a lazy, purely functional programming language with monadic effects. PureLang syntax is Haskell-like and indentation-sensitive, and its constraint-based Hindley-Milner type system guarantees safe execution. We derive sound equational reasoning principles over its operational semantics, dramatically simplifying some proofs. We prove end-to-end correctness for the compilation of PureLang down to machine code---the first such result for any lazy language---by targeting CakeML and composing with its verified compiler. Multiple optimisation passes are necessary to handle realistic lazy idioms effectively. We develop PureCake entirely within the HOL4 interactive theorem prover
    corecore