Meta-F*: Proof Automation with SMT, Tactics, and Metaprograms
We introduce Meta-F*, a tactics and metaprogramming framework for the F*
program verifier. The main novelty of Meta-F* is that it allows tactics and
metaprograms to discharge assertions not solvable by SMT, or to simplify them
into well-behaved SMT fragments. In addition, Meta-F* can be used to generate
verified code automatically.
Meta-F* is implemented as an F* effect which, thanks to the powerful effect
system of F*, greatly increases code reuse and even enables lightweight
verification of metaprograms. Metaprograms can be either interpreted, or
compiled to efficient native code that can be dynamically loaded into the F*
type-checker and can interoperate with interpreted code. Evaluation on
realistic case studies shows that Meta-F* provides substantial gains in proof
development, efficiency, and robustness.
Comment: Full version of ESOP'19 paper
DecoderLens: Layerwise Interpretation of Encoder-Decoder Transformers
In recent years, many interpretability methods have been proposed to help interpret the internal states of Transformer models, at different levels of precision and complexity. Here, to analyze encoder-decoder Transformers, we propose a simple new method: DecoderLens. Inspired by the LogitLens (for decoder-only Transformers), this method allows the decoder to cross-attend to representations of intermediate encoder layers, instead of using only the final encoder output as is normally done in encoder-decoder models. The method thus maps previously uninterpretable vector representations to human-interpretable sequences of words or symbols. We report results from DecoderLens applied to models trained on question answering, logical reasoning, speech recognition, and machine translation. DecoderLens reveals several specific subtasks that are solved at low or intermediate layers, shedding new light on the information flow inside the encoder component of this important class of models.
Parametric Compositional Data Types
In previous work we have illustrated the benefits that compositional data
types (CDTs) offer for implementing languages and in general for dealing with
abstract syntax trees (ASTs). Based on Swierstra's data types à la carte,
CDTs are implemented as a Haskell library that enables the definition of
recursive data types and functions on them in a modular and extendable fashion.
Although CDTs provide a powerful tool for analysing and manipulating ASTs, they
lack a convenient representation of variable binders. In this paper we remedy
this deficiency by combining the framework of CDTs with Chlipala's parametric
higher-order abstract syntax (PHOAS). We show how a generalisation from
functors to difunctors enables us to capture PHOAS while still maintaining the
features of the original implementation of CDTs, in particular its modularity.
Unlike previous approaches, we avoid so-called exotic terms without resorting
to abstract types: this is crucial when we want to perform transformations on
CDTs that inspect the recursively computed CDTs, e.g. constant folding.
Comment: In Proceedings MSFP 2012, arXiv:1202.240
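For readers unfamiliar with the underlying encoding, here is a minimal Haskell sketch of Swierstra's data types à la carte, on which CDTs build. The names (`:+:`, `Term`, `foldTerm`, `Eval`) are illustrative, not the CDT library's actual API, and the sketch omits the paper's difunctor/PHOAS extension for binders:

```haskell
{-# LANGUAGE TypeOperators #-}

-- Functor coproduct: compose independent signature pieces modularly
data (f :+: g) e = Inl (f e) | Inr (g e)

instance (Functor f, Functor g) => Functor (f :+: g) where
  fmap h (Inl e) = Inl (fmap h e)
  fmap h (Inr e) = Inr (fmap h e)

-- Tie the recursive knot over any signature functor
newtype Term f = In (f (Term f))

-- Two independent signature pieces
data Lit e = Lit Int
data Add e = Add e e

instance Functor Lit where fmap _ (Lit n)   = Lit n
instance Functor Add where fmap h (Add x y) = Add (h x) (h y)

-- A fold (catamorphism) defines functions over any composed signature
foldTerm :: Functor f => (f a -> a) -> Term f -> a
foldTerm alg (In t) = alg (fmap (foldTerm alg) t)

-- Evaluation is defined per signature piece and composes automatically
class Eval f where evalAlg :: f Int -> Int
instance Eval Lit where evalAlg (Lit n)   = n
instance Eval Add where evalAlg (Add x y) = x + y
instance (Eval f, Eval g) => Eval (f :+: g) where
  evalAlg (Inl e) = evalAlg e
  evalAlg (Inr e) = evalAlg e

eval :: (Functor f, Eval f) => Term f -> Int
eval = foldTerm evalAlg

-- 1 + 2 in the combined signature
ex :: Term (Lit :+: Add)
ex = In (Inr (Add (In (Inl (Lit 1))) (In (Inl (Lit 2)))))
```

New constructs (say, a `Mul` piece) can be added without touching `Lit`, `Add`, or existing instances, which is the modularity the paper preserves when moving to difunctors.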
Typeful Normalization by Evaluation
We present the first typeful implementation of Normalization by Evaluation for the simply typed lambda-calculus with sums and control operators: we guarantee type preservation and eta-long (modulo commuting conversions), beta-normal forms using only Generalized Algebraic Data Types in a general-purpose programming language, here OCaml; and we account for sums and control operators with Continuation-Passing Style. First, we implement the standard NbE algorithm for the implicational fragment in a typeful way that is correct by construction. We then derive its call-by-value continuation-passing counterpart, which maps a lambda-term with sums and call/cc into a CPS term in normal form, which we express in a typed dedicated syntax. Beyond showcasing the expressive power of GADTs, we emphasize that type inference gives a smooth way to re-derive the encodings of the syntax and typing of normal forms in Continuation-Passing Style.
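To give a flavor of the typeful approach, the following is a minimal sketch of NbE for the implicational fragment using GADTs, written in Haskell rather than the paper's OCaml. All names (`Sem`, `Rep`, `reify`, `reflect`) are illustrative, and the sketch covers neither sums nor control operators:

```haskell
{-# LANGUAGE GADTs #-}

data Base                                   -- uninterpreted base type

-- First-order syntax of eta-long beta-normal forms
data Nf = NLam String Nf | NAt Ne  deriving (Eq, Show)
data Ne = NVar String | NApp Ne Nf deriving (Eq, Show)

-- Type-indexed semantic domain: base type = delayed neutral term
-- (Int supplies the next fresh variable level), arrow = Haskell function
data Sem a where
  SBase :: (Int -> Ne) -> Sem Base
  SFun  :: (Sem a -> Sem b) -> Sem (a -> b)

-- Runtime representatives of object types drive reify/reflect
data Rep a where
  RBase :: Rep Base
  RArr  :: Rep a -> Rep b -> Rep (a -> b)

-- reify: semantic value -> eta-long normal form
reify :: Rep a -> Sem a -> Int -> Nf
reify RBase      (SBase e) n = NAt (e n)
reify (RArr a b) (SFun f)  n =
  let x = "x" ++ show n
  in  NLam x (reify b (f (reflect a (const (NVar x)))) (n + 1))

-- reflect: neutral term -> semantic value
reflect :: Rep a -> (Int -> Ne) -> Sem a
reflect RBase      e = SBase e
reflect (RArr a b) e = SFun (\s -> reflect b (\n -> NApp (e n) (reify a s n)))

-- Example: reifying the semantic identity at type (Base -> Base) -> Base -> Base
-- produces its eta-long form  \x0. \x1. x0 x1
etaLongId :: Nf
etaLongId = reify (RArr (RArr RBase RBase) (RArr RBase RBase)) (SFun id) 0
```

The point mirrored from the paper: the type index on `Sem` makes type preservation hold by construction, so an ill-typed normalizer simply does not compile.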
A Typed, Algebraic Approach to Parsing
In this paper, we recall the definition of the context-free expressions (or µ-regular expressions), an algebraic presentation of the context-free languages. Then, we define a core type system for the context-free expressions which gives a compositional criterion for identifying those context-free expressions which can be parsed unambiguously by predictive algorithms in the style of recursive descent or LL(1).
Next, we show how these typed grammar expressions can be used to derive a parser combinator library which both guarantees linear-time parsing with no backtracking and single-token lookahead, and which respects the natural denotational semantics of context-free expressions. Finally, we show how to exploit the type information to write a staged version of this library, which produces dramatic increases in performance, even outperforming code generated by the standard parser generator tool ocamlyacc.
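As a rough illustration of the idea (not the paper's library, which is in OCaml and enforces these checks statically in a type system), the following Haskell sketch attaches a grammar expression's "type", its nullability and FIRST set, to each parser, and rejects combinations outside the predictive (LL(1)) fragment at combinator-construction time; all names are hypothetical:

```haskell
-- A parser carries its nullability and FIRST set alongside a
-- deterministic, non-backtracking run function.
data P a = P { nullable :: Bool
             , firstSet :: [Char]
             , run      :: String -> Maybe (a, String) }

-- Single token
tok :: Char -> P Char
tok c = P False [c] step
  where step (x:rest) | x == c = Just (c, rest)
        step _                 = Nothing

-- Empty string
eps :: a -> P a
eps a = P True [] (\s -> Just (a, s))

-- Sequencing: FIRST set includes the second parser's only if the first is nullable
pseq :: P a -> P b -> P (a, b)
pseq p q = P (nullable p && nullable q)
             (firstSet p ++ (if nullable p then firstSet q else []))
             (\s -> do (a, s')  <- run p s
                       (b, s'') <- run q s'
                       Just ((a, b), s''))

-- Choice: legal only when the alternatives are separable by one token
-- of lookahead; the run function then dispatches without backtracking.
alt :: P a -> P a -> P a
alt p q
  | nullable p && nullable q              = error "ill-typed: both alternatives nullable"
  | any (`elem` firstSet q) (firstSet p)  = error "ill-typed: overlapping FIRST sets"
  | otherwise = P (nullable p || nullable q) (firstSet p ++ firstSet q) step
  where
    step s@(c:_) | c `elem` firstSet p = run p s
                 | c `elem` firstSet q = run q s
    step s | nullable p = run p s
           | nullable q = run q s
    step _ = Nothing
```

Because `alt` never tries both branches, parsing is linear-time by construction, which is the guarantee the paper derives from its type system rather than from runtime checks.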
The Impact of Depth and Width on Transformer Language Model Generalization
To process novel sentences, language models (LMs) must generalize
compositionally -- combine familiar elements in new ways. What aspects of a
model's structure promote compositional generalization? Focusing on
transformers, we test the hypothesis, motivated by recent theoretical and
empirical work, that transformers generalize more compositionally when they are
deeper (have more layers). Because simply adding layers increases the total
number of parameters, confounding depth and size, we construct three classes of
models which trade off depth for width such that the total number of parameters
is kept constant (41M, 134M and 374M parameters). We pretrain all models as LMs
and fine-tune them on tasks that test for compositional generalization. We
report three main conclusions: (1) after fine-tuning, deeper models generalize
better out-of-distribution than shallower models do, but the relative benefit
of additional layers diminishes rapidly; (2) within each family, deeper models
show better language modeling performance, but returns are similarly
diminishing; (3) the benefits of depth for compositional generalization cannot
be attributed solely to better performance on language modeling or on
in-distribution data.
Converting data-parallelism to task-parallelism by rewrites: purely functional programs across multiple GPUs
High-level domain-specific languages for array processing on the GPU are increasingly common, but they typically only run on a single GPU. As computational power is distributed across more devices, languages must target multiple devices simultaneously. To this end, we present a compositional translation that fissions data-parallel programs in the Accelerate language, allowing subsequent compiler and runtime stages to map computations onto multiple devices for improved performance---even programs that begin as a single data-parallel kernel.
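The essence of the fission rewrite can be modeled on ordinary Haskell lists (the real translation operates on Accelerate's array programs, not lists; `chunk` and `fissionMap` are illustrative names): a single data-parallel map is split into independent pieces, one per device, whose results are reassembled, and the rewrite preserves the original semantics:

```haskell
-- Split a list into (at most) k roughly equal chunks.
chunk :: Int -> [a] -> [[a]]
chunk k xs = go xs
  where n = max 1 ((length xs + k - 1) `div` k)  -- chunk size, rounded up
        go [] = []
        go ys = let (h, t) = splitAt n ys in h : go t

-- Fissioned map: each inner `map f piece` is an independent task that a
-- runtime could schedule on its own GPU; concatenation reassembles the result.
fissionMap :: Int -> (a -> b) -> [a] -> [b]
fissionMap k f xs = concat [ map f piece | piece <- chunk k xs ]
```

The correctness condition is simply `fissionMap k f xs == map f xs` for every `k`, which is what lets later compiler stages apply the rewrite freely.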
A Computational Interpretation of Parametricity
Reynolds' abstraction theorem has recently been extended to lambda-calculi with dependent types. In this paper, we show how this theorem can be internalized. More precisely, we describe an extension of Pure Type Systems with a special parametricity rule (with computational content), and prove fundamental properties such as Church-Rosser and strong normalization. All instances of the abstraction theorem can be both expressed and proved in the calculus itself. Moreover, one can apply parametricity to the parametricity rule: parametricity is itself parametric.
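For orientation, the binary relational interpretation underlying Reynolds' abstraction theorem reads as follows in its standard simply-stated form (this is the classical formulation, not the paper's dependently typed calculus):

```latex
% Arrow types: related functions map related arguments to related results
\llbracket A \to B \rrbracket(f, f') \iff
  \forall a\, a'.\;
  \llbracket A \rrbracket(a, a') \implies \llbracket B \rrbracket(f\,a,\, f'\,a')

% Universal types: related polymorphic values are related at every relation R
\llbracket \forall \alpha.\, T \rrbracket(g, g') \iff
  \forall A\, A'.\, \forall R \subseteq A \times A'.\;
  \llbracket T \rrbracket_{\alpha \mapsto R}(g_{A},\, g'_{A'})
```

Internalizing parametricity means these clauses become definable and provable inside the calculus itself, rather than remaining a meta-level theorem about it.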
A Presheaf Model of Parametric Type Theory
We extend Martin-Löf's Logical Framework with special constructions and typing rules providing internalized parametricity. Compared to previous similar proposals, this version comes with a denotational semantics which is a refinement of the standard presheaf semantics of dependent type theory. Further, this presheaf semantics is a refinement of the one used to interpret nominal sets with restrictions. The present calculus is a candidate for the core of a proof assistant with internalized parametricity.