
    Software refactoring and rewriting: from the perspective of code transformations

    To refactor already-working code while preserving reliability, compatibility, and possibly security, we can borrow ideas from micropass/nanopass compilers. By treating software refactoring as the composition of code transformations, and by compressing repetitive transformations with automation tools, we can often obtain representations of refactoring processes short enough that their correctness can be analysed manually. Unlike in compilers, refactoring usually only needs to consider the codebase in question, so plain text processing can be used extensively, fully exploiting patterns present only in that codebase. Beyond directly applying code transformations borrowed from compilers, many other kinds of equivalence properties can also be exploited. In this paper, two refactoring projects are given as the main examples, in which 10-100x simplifications were achieved by applying a few kinds of useful transformations.
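
    To make the pass-composition idea concrete, here is a minimal Python sketch of refactoring as a pipeline of small, individually auditable text transformations; the pass names and patterns are hypothetical, not taken from the paper:

        import re

        # Each pass is a small, auditable text transformation.
        def rename_legacy_api(src):
            # Purely mechanical rename; trivially checkable by inspection.
            return src.replace("old_open(", "new_open(")

        def drop_redundant_cast(src):
            # Exploits a convention that holds only in this hypothetical
            # codebase: every *_count variable is already an int.
            return re.sub(r"\(int\)\s*(\w+_count)", r"\1", src)

        def compose(*passes):
            # The whole refactoring is the composition of the passes, so it
            # can be reviewed one small transformation at a time.
            def run(src):
                for p in passes:
                    src = p(src)
                return src
            return run

        refactor = compose(rename_legacy_api, drop_redundant_cast)
        print(refactor("fd = old_open(path); n = (int) item_count;"))
        # -> fd = new_open(path); n = item_count;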

    Catala: A Programming Language for the Law

    Law at large underpins modern society, codifying and governing many aspects of citizens' daily lives. Oftentimes, law is subject to interpretation, debate and challenges throughout various courts and jurisdictions. But in some other areas, law leaves little room for interpretation, and essentially aims to rigorously describe a computation, a decision procedure or, simply said, an algorithm. Unfortunately, prose remains a woefully inadequate tool for the job. The lack of formalism leaves room for ambiguities; the structure of legal statutes, with many paragraphs and sub-sections spread across multiple pages, makes it hard to compute the intended outcome of the algorithm underlying a given text; and, as with any other piece of poorly-specified critical software, the use of informal language leaves corner cases unaddressed. We introduce Catala, a new programming language that we specifically designed to allow a straightforward and systematic translation of statutory law into an executable implementation. Catala aims to bring together lawyers and programmers through a shared medium, which together they can understand, edit and evolve, bridging a gap that often results in dramatically incorrect implementations of the law. We have implemented a compiler for Catala, and have proven the correctness of its core compilation steps using the F* proof assistant. We evaluate Catala on several legal texts that are algorithms in disguise, notably section 121 of the US federal income tax law and the byzantine French family benefits; in doing so, we uncover a bug in the official implementation. We observe as a consequence of the formalization process that using Catala enables rich interactions between lawyers and programmers, leading to a greater understanding of the original legislative intent, while producing a correct-by-construction executable specification reusable by the greater software ecosystem.
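
    For a flavour of "law as an algorithm", here is a deliberately oversimplified Python sketch of the decision procedure behind Section 121 (exclusion of gain from the sale of a principal residence); the thresholds and tests below are simplified, and the real statute carries many more conditions, which is precisely the complexity Catala is designed to capture faithfully:

        # Toy model of Section 121 (not Catala, and not legal advice):
        # gain on the sale of a principal residence is excluded up to a cap
        # when simplified ownership-and-use tests are met.
        def section_121_exclusion(gain, years_owned, years_used, joint_return):
            cap = 500_000 if joint_return else 250_000          # simplified caps
            meets_tests = years_owned >= 2 and years_used >= 2  # 2-of-last-5-years, simplified
            return min(gain, cap) if meets_tests else 0

        print(section_121_exclusion(gain=300_000, years_owned=3,
                                    years_used=3, joint_return=False))  # -> 250000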

    Higher-Order, Data-Parallel Structured Deduction

    State-of-the-art Datalog engines include expressive features such as ADTs (structured heap values), stratified aggregation and negation, various primitive operations, and the opportunity for further extension using FFIs. Current parallelization approaches for state-of-the-art Datalogs target shared-memory data structures with locking, using conventional multi-threading, or use the map-reduce model for distributed computing. Furthermore, current state-of-the-art approaches cannot scale to formal systems that pervasively manipulate structured data, due to their lack of indexing for structured data stored in the heap. In this paper, we describe a new approach to data-parallel structured deduction that involves a key semantic extension of Datalog to permit first-class facts and higher-order relations via defunctionalization, and an implementation approach that enables parallelism uniformly both across sets of disjoint facts and over individual facts with nested structure. We detail a core language, DL_s, whose key invariant (subfact closure) ensures that each subfact is materialized as a first-class fact. We extend DL_s to Slog, a fully-featured language whose forms facilitate leveraging subfact closure to rapidly implement expressive, high-performance formal systems. We demonstrate Slog by building a family of control-flow analyses from abstract machines, systematically, along with several implementations of classical type systems (such as STLC and LF). We performed experiments on EC2, Azure, and ALCF's Theta at up to 1000 threads, showing orders-of-magnitude scalability improvements versus competing state-of-the-art systems.
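
    To illustrate the subfact-closure invariant (a sketch only, not Slog's actual algorithm or syntax), here is a naive Datalog-style fixpoint in Python in which facts may contain nested structured values and every subterm is materialized as a fact of its own:

        # Facts are tuples whose arguments may themselves be tuples.
        def subfacts(fact):
            yield fact
            for arg in fact[1:]:
                if isinstance(arg, tuple):        # nested structured value
                    yield from subfacts(arg)

        def step(db):
            new = set()
            # path(X, Y) :- edge(X, Y).
            new |= {("path",) + f[1:] for f in db if f[0] == "edge"}
            # path(X, Z) :- edge(X, Y), path(Y, Z).
            for f in db:
                if f[0] == "edge":
                    for g in db:
                        if g[0] == "path" and g[1] == f[2]:
                            new.add(("path", f[1], g[2]))
            # Subfact closure: every nested subterm becomes a fact too.
            return {s for fact in new for s in subfacts(fact)}

        # One edge endpoint is a structured value rather than an atom.
        db = {("edge", "a", ("node", "b")), ("edge", ("node", "b"), "c")}
        while True:                               # naive fixpoint iteration
            nxt = db | step(db)
            if nxt == db:
                break
            db = nxt
        print(sorted(db, key=str))  # includes ('path', 'a', 'c') and the subfact ('node', 'b')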

    Generic Programming with Extensible Data Types; Or, Making Ad Hoc Extensible Data Types Less Ad Hoc

    We present a novel approach to generic programming over extensible data types. Row types capture the structure of records and variants, and can be used to express record and variant subtyping, record extension, and modular composition of case branches. We extend row typing to capture generic programming over rows themselves, capturing patterns including the lifting of operations to records and variants from their component types, and the duality between case blocks over variants and records of labeled functions, without placing specific requirements on the fields or constructors present in the records and variants. We formalize our approach in System Rω, an extension of Fω with row types, and give a denotational semantics for (stratified) Rω in Agda.
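
    The duality mentioned above is easy to see even without Rω's static row types; here is a small dynamically-typed Python sketch (illustrative only; nothing here enforces the typing discipline of the paper):

        # A "cases" block over a variant is a record (dict) of labeled
        # functions, applied at the variant's tag.
        def cases(handlers, tag, payload):
            return handlers[tag](payload)

        area = {"circle": lambda r: 3.14159 * r * r,
                "square": lambda s: s * s}
        print(cases(area, "circle", 2.0))   # dispatch on the variant's label

        # Lifting a binary operation pointwise to records, without fixing
        # in advance which fields the records contain.
        def lift_record(op, rec1, rec2):
            return {label: op(rec1[label], rec2[label]) for label in rec1}

        print(lift_record(lambda a, b: a + b, {"x": 1, "y": 2}, {"x": 10, "y": 20}))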

    Pattern eliminating transformations

    Program transformation is a common practice in computer science, and its many applications have a range of different objectives. For example, a program written in an original high-level language may be translated into machine code for execution, or into a language suitable for formal verification. Such compilations are split into several so-called passes, each of which generally aims to eliminate certain constructs of the original language, yielding a series of intermediate languages and finally the target code. Rewriting is a widely established formalism for describing the mechanism and the logic behind such transformations. In a typed context featuring type-preserving rewrite rules, the underlying type system can be used to give syntactic guarantees on the shape of the results obtained after each pass, but this approach can lead to an accumulation of auxiliary types that must be considered. We propose an approach in which the function symbols corresponding to the transformations performed in a pass are annotated with the (anti-)patterns they are supposed to eliminate, and we show how to check that the transformation is consistent with its annotations and thus that it eliminates the respective patterns.
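
    As a rough illustration of the annotation idea (the paper checks this statically over typed rewrite rules; the Python sketch below only checks it dynamically, and the pass itself is a hypothetical stand-in):

        import re

        # Annotate a pass with the pattern it claims to eliminate, and
        # check the claim on the pass's output.
        def eliminates(pattern):
            def wrap(transform):
                def checked(src):
                    out = transform(src)
                    assert not re.search(pattern, out), \
                        f"pass failed to eliminate pattern {pattern!r}"
                    return out
                return checked
            return wrap

        @eliminates(r"\bfor\b")   # promise: no 'for' construct remains
        def desugar_for_to_while(src):
            # Stand-in for a real desugaring of a toy language.
            return src.replace("for", "while")

        print(desugar_for_to_while("for x in xs: use(x)"))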

    A Type and Scope Safe Universe of Syntaxes with Binding: Their Semantics and Proofs

    Almost every programming language’s syntax includes a notion of binder and corresponding bound occurrences, along with the accompanying notions of α-equivalence, capture-avoiding substitution, typing contexts, runtime environments, and so on. In the past, implementing and reasoning about programming languages required careful handling to maintain the correct behaviour of bound variables. Modern programming languages include features that enable constraints like scope safety to be expressed in types. Nevertheless, the programmer is still forced to write the same boilerplate over and over again for each new implementation of a scope-safe operation (e.g., renaming, substitution, desugaring, printing), and then again for the corresponding correctness proofs. We present an expressive universe of syntaxes with binding and demonstrate how to (1) implement scope-safe traversals once and for all by generic programming; and (2) derive properties of these traversals by generic proving. Our universe description, generic traversals and proofs, and our examples have all been formalised in Agda and are available in the accompanying material.
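
    The "one traversal, many operations" idea can be sketched in plain Python with de Bruijn indices (the paper's Agda version additionally guarantees type and scope safety statically); renaming and substitution differ only in the small "kit" handed to a single generic traversal:

        from dataclasses import dataclass

        @dataclass
        class Var: idx: int                       # de Bruijn index
        @dataclass
        class App: fn: object; arg: object
        @dataclass
        class Lam: body: object

        def traverse(kit, env, t):
            # Generic traversal: only the Var case and the extension of the
            # environment under a binder vary between operations.
            to_term, weaken, fresh = kit
            if isinstance(t, Var):
                return to_term(env(t.idx))
            if isinstance(t, App):
                return App(traverse(kit, env, t.fn), traverse(kit, env, t.arg))
            extended = lambda i: fresh if i == 0 else weaken(env(i - 1))
            return Lam(traverse(kit, extended, t.body))

        # Renaming kit: environment entries are indices.
        ren_kit = (lambda i: Var(i), lambda i: i + 1, 0)
        def rename(rho, t): return traverse(ren_kit, rho, t)

        # Substitution kit: entries are terms; weakening reuses renaming.
        sub_kit = (lambda e: e, lambda e: rename(lambda i: i + 1, e), Var(0))
        def subst(sigma, t): return traverse(sub_kit, sigma, t)

        body = App(Var(0), Var(1))
        print(subst(lambda i: Var(5) if i == 0 else Var(i - 1), body))
        # -> App(fn=Var(idx=5), arg=Var(idx=0))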

    A nanopass framework for commercial compiler development

    Contemporary commercial compilers typically handle sophisticated high-level source languages, generate efficient assembly or machine code for multiple hardware architectures, run under, and generate code to run under, multiple operating systems, and support source-level debugging, profiling, and other program-development tools. As a result, commercial compilers tend to be among the most complex of software systems. Nanopass frameworks are designed to help make this complexity manageable. A nanopass framework is a domain-specific language, embedded in a general-purpose programming language, that aids compiler development. A nanopass compiler comprises many small passes, each of which performs a single task and specifies only the interesting transformations to be performed by the pass. Intermediate languages are formally specified by the compiler writer, which allows the infrastructure both to verify that the output of each pass is well formed and to fill in the uninteresting boilerplate parts of each pass. Prior nanopass frameworks were prototype systems aimed at educational use, but we believe that a suitable nanopass framework can support the development of commercial compilers. We have created such a framework and demonstrated its effectiveness by using it to create a new commercial compiler that is a “plug replacement” for an existing commercial compiler. The new compiler uses a more sophisticated, although slower, register allocator and implements nearly all of the optimizations of the original compiler, along with several “new and improved” optimizations. When compared to the original compiler on a set of benchmarks, code generated by the new compiler runs, on average, 21.5% faster. The average compile time for these benchmarks is less than twice as long as with the original compiler. This dissertation provides a description of the new framework, the new compiler, and several experiments that demonstrate the performance and effectiveness of both, as well as a presentation of several optimizations performed by the new compiler and facilitated by the infrastructure.
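
    The discipline is easy to miniaturize; here is a toy Python sketch (real nanopass frameworks generate this machinery from declarative language definitions, while the sketch hand-codes everything) in which each pass maps one formally specified intermediate language to the next and every pass's output is checked for well-formedness:

        # L1: int | ("add", L1, L1).   L2: flat RPN token list.
        def wf_L1(e):
            return isinstance(e, int) or (isinstance(e, tuple) and len(e) == 3
                and e[0] == "add" and wf_L1(e[1]) and wf_L1(e[2]))

        def wf_L2(code):
            return all(tok == "add" or isinstance(tok, int) for tok in code)

        def fold_constants(e):                    # pass 1: L1 -> L1
            if isinstance(e, int):
                return e
            a, b = fold_constants(e[1]), fold_constants(e[2])
            return a + b if isinstance(a, int) and isinstance(b, int) else ("add", a, b)

        def to_rpn(e):                            # pass 2: L1 -> L2
            return [e] if isinstance(e, int) else to_rpn(e[1]) + to_rpn(e[2]) + ["add"]

        def run(passes, program):
            # Check each pass's output against its target language.
            for transform, wf_out in passes:
                program = transform(program)
                assert wf_out(program), transform.__name__ + ": ill-formed output"
            return program

        print(run([(fold_constants, wf_L1), (to_rpn, wf_L2)],
                  ("add", ("add", 1, 2), ("add", 3, 4))))   # -> [10]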