Partial evaluation in an optimizing prolog compiler
Specialization of programs and meta-programs written in high-level languages has been an active area of research for some time, and it contributes to improved program performance. We begin with the hypothesis that partial evaluation provides a framework for several traditional back-end optimizations. The present work proposes a new compiler back-end optimization technique based on specialization of low-level RISC-like machine code, using partial evaluation to specialize the low-level code. Berkeley Abstract Machine (BAM) code generated during compilation of Prolog is used as the candidate low-level language to test the hypothesis. A partial evaluator of BAM code was designed and implemented to demonstrate the proposed optimization technique and to study its design issues. The major contributions of the present work are as follows. It demonstrates a new low-level compiler back-end optimization technique that provides a framework for several conventional optimizations while also creating opportunities for machine-specific optimizations. It presents a study of the issues, and solutions to several problems, encountered during the design and implementation of a low-level-language partial evaluator intended to serve as a back-end phase in a real-world Prolog compiler. We also present an implementation-independent denotational semantics of BAM code, a low-level language, which provides a vehicle for showing the correctness of instruction transformations. We believe this work provides the first concrete step toward the use of partial evaluation on low-level code as a compiler back-end optimization technique in real-world compilers.
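The specialization idea can be illustrated with a toy sketch (in Python, and deliberately not BAM code or the paper's actual partial evaluator; the instruction names and representation are ours): registers with statically known values are propagated through a small RISC-like instruction list, and operations on them are folded away, leaving only residual code.

```python
def specialize(prog, known):
    """Partially evaluate prog, a list of instruction tuples, w.r.t. `known`,
    a dict mapping registers to statically known constants. Returns the
    residual (specialized) instruction list."""
    env = dict(known)   # static store: register -> known constant
    residual = []       # dynamic (residual) instructions
    for instr in prog:
        op = instr[0]
        if op == 'loadi':                      # loadi dst, imm
            _, dst, imm = instr
            env[dst] = imm                     # fold: value is now static
        elif op == 'move':                     # move dst, src
            _, dst, src = instr
            if src in env:
                env[dst] = env[src]            # copy a static value
            else:
                env.pop(dst, None)             # dst becomes dynamic
                residual.append(instr)
        elif op == 'add':                      # add dst, a, b
            _, dst, a, b = instr
            if a in env and b in env:
                env[dst] = env[a] + env[b]     # fold the addition
            else:
                for r in (a, b):               # materialize known operands
                    if r in env:
                        residual.append(('loadi', r, env[r]))
                        del env[r]
                env.pop(dst, None)
                residual.append(instr)
    for r, v in sorted(env.items()):           # emit remaining static values
        residual.append(('loadi', r, v))
    return residual
```

When every operand is static, an entire computation collapses into constant loads; a real back-end specializer must additionally handle control flow, memory, and machine-specific details.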
Strong Normalization by Type-Directed Partial Evaluation and Run-Time Code Generation (Preliminary Version)
We investigate the synergy between type-directed partial evaluation and run-time code generation for the Caml dialect of ML. Type-directed partial evaluation maps simply typed, closed Caml values to a representation of their long beta-eta-normal form. Caml uses a virtual machine and has the capability to load byte code at run time. Representing the long beta-eta-normal forms as byte code gives us the ability to strongly normalize higher-order values (i.e., weak head normal forms in ML), to compile the resulting strong normal forms into byte code, and to load this byte code all in one go, at run time. We conclude this note with a preview of our current work on scaling up strong normalization by run-time code generation to the Caml module language.
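The reify/reflect core of type-directed partial evaluation can be sketched in a few lines (here in Python rather than Caml, with terms as tuples; the names and encoding are ours). `reify` maps a semantic value to its eta-long normal form guided by the type, and `reflect` embeds a residual term back as a value:

```python
import itertools

_fresh = itertools.count()  # supply of fresh variable names

def reflect(t, e):
    """Map a residual term e of type t into a semantic value."""
    if t == 'b':                    # base type: the term is the value
        return e
    _, t1, t2 = t                   # t = ('->', t1, t2)
    return lambda v: reflect(t2, ('app', e, reify(t1, v)))

def reify(t, v):
    """Map a semantic value v of type t to its long beta-eta-normal form."""
    if t == 'b':                    # base type: the value is the term
        return v
    _, t1, t2 = t
    x = 'x%d' % next(_fresh)        # eta-expand with a fresh variable
    return ('lam', x, reify(t2, v(reflect(t1, ('var', x)))))
```

Applying `reify` to the identity function at an arrow type produces its eta-long normal form; the paper's contribution is to emit such normal forms as Caml byte code and load them at run time.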
Neural Machine Translation for Code Generation
Neural machine translation (NMT) methods developed for natural language
processing have been shown to be highly successful in automating translation
from one natural language to another. Recently, these NMT methods have been
adapted to the generation of program code. In NMT for code generation, the task
is to generate output source code that satisfies constraints expressed in the
input. In the literature, a variety of different input scenarios have been
explored, including generating code based on natural language description,
lower-level representations such as binary or assembly (neural decompilation),
partial representations of source code (code completion and repair), and source
code in another language (code translation). In this paper we survey the NMT
for code generation literature, cataloging the variety of methods that have
been explored according to input and output representations, model
architectures, optimization techniques used, data sets, and evaluation methods.
We discuss the limitations of existing methods and future research directions.
Comment: 33 pages, 1 figure
A Symmetric Approach to Compilation and Decompilation
Just as specializing a source interpreter can achieve compilation from a source language to a target language, we observe that specializing a target interpreter can achieve compilation from the target language to the source language. In both cases, the key issue is the choice of whether to perform an evaluation or to emit code that represents this evaluation. We substantiate this observation by specializing two source interpreters and two target interpreters. We first consider a source language of arithmetic expressions and a target language for a stack machine, and then the lambda-calculus and the SECD-machine language. In each case, we prove that the target-to-source compiler is a left inverse of the source-to-target compiler, i.e., it is a decompiler. In the context of partial evaluation, compilation by source-interpreter specialization is classically referred to as a Futamura projection. By symmetry, it seems logical to refer to decompilation by target-interpreter specialization as a Futamura embedding.
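The symmetry can be made concrete with a toy pair of translations (a Python sketch of the first example, not the paper's formal development; all names are ours): specializing a source interpreter for arithmetic expressions with respect to a fixed expression leaves only stack operations as residual target code, and symbolically replaying that code rebuilds the source expression.

```python
def compile_expr(e):
    """'Specialize the source interpreter' w.r.t. expression e: the recursion
    over e happens now; only stack-machine instructions remain."""
    if isinstance(e, int):
        return [('push', e)]
    op, l, r = e                          # e = (op, left, right)
    return compile_expr(l) + compile_expr(r) + [(op,)]

def run(code):
    """Target interpreter: a small stack machine."""
    stack = []
    for instr in code:
        if instr[0] == 'push':
            stack.append(instr[1])
        else:                             # binary operator
            b, a = stack.pop(), stack.pop()
            stack.append(a + b if instr[0] == 'add' else a * b)
    return stack[0]

def decompile(code):
    """'Specialize the target interpreter' w.r.t. code: run the stack machine
    symbolically, rebuilding expressions instead of computing numbers."""
    stack = []
    for instr in code:
        if instr[0] == 'push':
            stack.append(instr[1])
        else:
            b, a = stack.pop(), stack.pop()
            stack.append((instr[0], a, b))
    return stack[0]
```

On this fragment, `decompile` is a left inverse of `compile_expr`, mirroring the paper's claim that the target-to-source compiler is a decompiler.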
Zero-cost meta-programmed stateful functors in F*
Writing code is hard; proving it correct is even harder. As the scale of
verified software projects reaches new heights, the problem of efficiently
verifying large amounts of software becomes more and more salient. Nowhere is
this issue more evident than in the context of verified cryptographic
libraries. To achieve feature-parity and be competitive with unverified
cryptographic libraries, a very large number of algorithms and APIs need to be
verified. However, the task is oftentimes repetitive, and factoring out
commonality between algorithms is fraught with difficulties, requiring until
now a significant amount of manual effort.
This paper shows how a judicious combination of known functional programming
techniques leads to an order-of-magnitude improvement in the amount of verified
code produced by the popular HACL* cryptographic library, without compromising
performance. We review three techniques that build upon each other, in order of
increasing sophistication. First, we use dependent types to crisply capture the
specification and state machine of a block algorithm, a cryptographic notion
that was until now only informally and imprecisely specified. Next, we rely on
partial evaluation to author a higher-order, stateful functor that transforms
any unsafe block API into a safe counterpart. Finally, we rely on elaborator
reflection to automate the very process of authoring a functor, using a
code-rewriting tactic. This culminates in a style akin to templatized C++ code,
but relying on a userland tactic and partial evaluation, rather than built-in
compiler support.
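The functor idea can be sketched outside F* (in Python, with no partial evaluation or static proofs, only the runtime shape of the state machine; all names here are ours, not HACL*'s API): a higher-order wrapper turns an unsafe init/update/finish block API into one that enforces the call order. In F*, partial evaluation then erases the wrapper's overhead and the checks become static.

```python
def make_safe(init, update, finish):
    """Given the three operations of an 'unsafe' block API, return a class
    whose instances enforce the init -> update* -> finish state machine."""
    class Safe:
        def __init__(self):
            self.state = 'live'
            self.acc = init()
        def update(self, block):
            assert self.state == 'live', "update after finish"
            self.acc = update(self.acc, block)
        def finish(self):
            assert self.state == 'live', "finish called twice"
            self.state = 'finished'
            return finish(self.acc)
    return Safe

# Instantiating the "functor" for a toy running-sum block algorithm.
Checksum = make_safe(lambda: 0,
                     lambda acc, block: (acc + sum(block)) % 256,
                     lambda acc: acc)
```

One `make_safe` definition serves every block algorithm, which is the factoring-out of commonality the abstract describes, here at the cost of runtime checks the F* version avoids.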
Aspects of functional programming
This thesis explores the application of functional programming in new areas and its
implementation using new technologies. We show how functional languages can be
used to implement solutions to problems in fuzzy logic using a number of languages:
Haskell, Ginger and Aladin. A compiler for the weakly-typed, lazy language Ginger
is developed using Java byte-code as its target code. This is used as the inspiration
for an implementation of Aladin, a simple functional language which has two novel
features: its primitives are designed to be written in any language, and evaluation
is controlled by declaring the strictness of all functions. Efficient denotational and
operational semantics are given for this machine and an implementation is
developed using these semantics. We then show that by using the advantages of
Aladin (simplicity and strictness control) we can employ partial evaluation to
achieve considerable speed-ups in the running times of Aladin programs.
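Aladin's strictness declarations can be mimicked in a small sketch (in Python, with thunks as zero-argument functions; the design is illustrative, not Aladin's actual mechanism): a per-function strictness list decides which arguments are forced before the call, so a lazy argument that would fail is simply never evaluated.

```python
def force(v):
    """Evaluate a (possibly layered) thunk to a value. In this sketch,
    any callable is treated as a thunk."""
    while callable(v):
        v = v()
    return v

def apply_with_strictness(f, strictness, args):
    """Call f, forcing exactly the arguments declared strict (True)."""
    return f(*(force(a) if s else a for s, a in zip(strictness, args)))

# 'const' ignores its second argument: declared lazy, a thunk that would
# raise ZeroDivisionError if forced is passed through harmlessly.
const = lambda x, y: x
```

Declaring all arguments strict recovers eager evaluation; such declarations are also exactly the binding-time information a partial evaluator can exploit.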
PDEBENCH: An Extensive Benchmark for Scientific Machine Learning
Machine learning-based modeling of physical systems has experienced increased
interest in recent years. Despite some impressive progress, there is still a
lack of benchmarks for Scientific ML that are easy to use but still challenging
and representative of a wide range of problems. We introduce PDEBench, a
benchmark suite of time-dependent simulation tasks based on Partial
Differential Equations (PDEs). PDEBench comprises both code and data to
benchmark the performance of novel machine learning models against both
classical numerical simulations and machine learning baselines. Our proposed
set of benchmark problems contribute the following unique features: (1) A much
wider range of PDEs compared to existing benchmarks, ranging from relatively
common examples to more realistic and difficult problems; (2) much larger
ready-to-use datasets compared to prior work, comprising multiple simulation
runs across a larger number of initial and boundary conditions and PDE
parameters; (3) more extensible source codes with user-friendly APIs for data
generation and baseline results with popular machine learning models (FNO,
U-Net, PINN, Gradient-Based Inverse Method). PDEBench allows researchers to
extend the benchmark freely for their own purposes using a standardized API and
to compare the performance of new models to existing baseline methods. We also
propose new evaluation metrics with the aim to provide a more holistic
understanding of learning methods in the context of Scientific ML. With those
metrics we identify tasks which are challenging for recent ML methods and
propose these tasks as future challenges for the community. The code is
available at https://github.com/pdebench/PDEBench.
Comment: 16 pages (main body) + 34 pages (supplemental material), accepted for
publication in NeurIPS 2022 Track Datasets and Benchmarks.
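As a concrete example of the kind of scalar summary such a benchmark reports (a common choice in the Scientific ML literature, not necessarily PDEBench's exact definition, which is given in the paper and repository), a root-mean-square error normalized by the magnitude of the reference solution can be computed as:

```python
import math

def nrmse(pred, ref):
    """RMSE of a predicted field vs a reference solution, normalized by
    the RMS magnitude of the reference (flattened to 1-D lists here)."""
    n = len(ref)
    mse = sum((p - r) ** 2 for p, r in zip(pred, ref)) / n
    norm = math.sqrt(sum(r * r for r in ref) / n)
    return math.sqrt(mse) / norm
```

Normalizing makes scores comparable across PDEs whose solution fields differ in scale, which matters when one benchmark suite spans many equations.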