1,772 research outputs found
A formally verified compiler back-end
This article describes the development and formal verification (proof of
semantic preservation) of a compiler back-end from Cminor (a simple imperative
intermediate language) to PowerPC assembly code, using the Coq proof assistant
both for programming the compiler and for proving its correctness. Such a
verified compiler is useful in the context of formal methods applied to the
certification of critical software: the verification of the compiler guarantees
that the safety properties proved on the source code hold for the executable
compiled code as well
Compiler architecture using a portable intermediate language
The back end of a compiler performs machine-dependent tasks and low-level optimisations that are laborious to implement and difficult to debug. In addition, in languages that require run-time services such as garbage collection, the back end must interface with the run-time system to provide
those services. The net result is that building a compiler back end entails a high implementation cost.
In this dissertation I describe reusable code generation infrastructure that enables the construction of a complete programming language implementation (compiler and run-time system) with reduced effort. The infrastructure consists of a portable intermediate language, a compiler for this language and a low-level run-time system. I provide an implementation of this system and I show that it can support a variety of source programming languages, it reduces the overall eort required to implement a programming
language, it can capture and retain information necessary to support run-time services and optimisations, and it produces efficient code
Recommended from our members
Duplo: A framework for OCaml post-link optimisation
We present a novel framework,
Duplo
, for the low-level post-link optimisation of OCaml programs, achieving a speedup of 7% and a reduction of at least 15% of the code size of widely-used OCaml applications. Unlike existing post-link optimisers, which typically operate on target-specific machine code, our framework operates on a Low-Level Intermediate Representation (LLIR) capable of representing both the OCaml programs and any C dependencies they invoke through the foreign-function interface (FFI). LLIR is analysed, transformed and lowered to machine code by our post-link optimiser, LLIR-OPT. Most importantly, LLIR allows the optimiser to cross the OCaml-C language boundary, mitigating the overhead incurred by the FFI and enabling analyses and transformations in a previously unavailable context. The optimised IR is then lowered to amd64 machine code through the existing target-specific code generator of LLVM, modified to handle garbage collection just as effectively as the native OCaml backend. We equip our optimiser with a suite of SSA-based transformations and points-to analyses capable of capturing the semantics and representing the memory models of both languages, along with a cross-language inliner to embed C methods into OCaml callers. We evaluate the gains of our framework, which can be attributed to both our optimiser and the more sophisticated amd64 backend of LLVM, on a wide-range of widely-used OCaml applications, as well as an existing suite of micro- and macro-benchmarks used to track the performance of the OCaml compiler.
EPSRC EP/P020011/1, Cambridge Trust
The C++0x "Concepts" Effort
C++0x is the working title for the revision of the ISO standard of the C++
programming language that was originally planned for release in 2009 but that
was delayed to 2011. The largest language extension in C++0x was "concepts",
that is, a collection of features for constraining template parameters. In
September of 2008, the C++ standards committee voted the concepts extension
into C++0x, but then in July of 2009, the committee voted the concepts
extension back out of C++0x.
This article is my account of the technical challenges and debates within the
"concepts" effort in the years 2003 to 2009. To provide some background, the
article also describes the design space for constrained parametric
polymorphism, or what is colloquially know as constrained generics. While this
article is meant to be generally accessible, the writing is aimed toward
readers with background in functional programming and programming language
theory. This article grew out of a lecture at the Spring School on Generic and
Indexed Programming at the University of Oxford, March 2010
Program representation size in an intermediate language with intersection and union types
The CIL compiler for core Standard ML compiles whole programs using a novel typed intermediate language (TIL) with intersection and union types and flow labels on both terms and types. The CIL term representation duplicates portions of the program where intersection types are introduced and union types are eliminated. This duplication makes it easier to represent type information and to introduce customized data representations. However, duplication incurs compile-time space costs that are potentially much greater than are incurred in TILs employing type-level abstraction or quantification. In this paper, we present empirical data on the compile-time space costs of using CIL as an intermediate language. The data shows that these costs can be made tractable by using sufficiently fine-grained flow analyses together with standard hash-consing techniques. The data also suggests that non-duplicating formulations of intersection (and union) types would not achieve significantly better space complexity.National Science Foundation (CCR-9417382, CISE/CCR ESS 9806747); Sun grant (EDUD-7826-990410-US); Faculty Fellowship of the Carroll School of Management, Boston College; U.K. Engineering and Physical Sciences Research Council (GR/L 36963, GR/L 15685
A Tracing JIT Compiler for Erlang using LLVM
We have modified the Erlang runtime to add support for a tracing just-in-time (JIT) compiler, similar to Mozilla’s TraceMonkey. Tracing is a technique to augment an existing interpreter with a JIT simply by recording the instructions executed during a loop iteration, and then generate optimized native code from this. Tracing compilers are particularly suited to optimize number crunching tight loops, an area where Erlang traditionally has been lacking. We make use of the LLVM compiler library to optimize and emit native code. In micro benchmarks we show some major improvements, reducing execution time by up to 75%. However, from an engineering point of view, we conclude that the effort of an industrial strength implementation would be substantial – essentially reimplementing large parts of Erlang’s interpreter – and discuss a potential solution based on recent research in the area.Nästan alla moderna programspråk använder en interpretator – en flexibel och praktisk om än långsam lösning. Vi prövar ett enkelt sätt att kraftigt öka prestandan på Erlangs interpretator
Removing and restoring control flow with the Value State Dependence Graph
This thesis studies the practicality of compiling with only data flow information.
Specifically, we focus on the challenges that arise when using the Value
State Dependence Graph (VSDG) as an intermediate representation (IR).
We perform a detailed survey of IRs in the literature in order to discover
trends over time, and we classify them by their features in a taxonomy. We
see how the VSDG fits into the IR landscape, and look at the divide between
academia and the 'real world' in terms of compiler technology. Since most
data flow IRs cannot be constructed for irreducible programs, we perform an
empirical study of irreducibility in current versions of open source software,
and then compare them with older versions of the same software. We also
study machine-generated C code from a variety of different software tools.
We show that irreducibility is no longer a problem, and is becoming less so
with time. We then address the problem of constructing the VSDG. Since
previous approaches in the literature have been poorly documented or ignored
altogether, we give our approach to constructing the VSDG from a common
IR: the Control Flow Graph. We show how our approach is independent of
the source and target language, how it is able to handle unstructured control
flow, and how it is able to transform irreducible programs on the fly. Once the
VSDG is constructed, we implement Lawrence's proceduralisation algorithm
in order to encode an evaluation strategy whilst translating the program into
a parallel representation: the Program Dependence Graph. From here, we
implement scheduling and then code generation using the LLVM compiler.
We compare our compiler framework against several existing compilers, and
show how removing control flow with the VSDG and then restoring it later
can produce high quality code. We also examine specific situations where the
VSDG can put pressure on existing code generators. Our results show that the
VSDG represents a radically different, yet practical, approach to compilation
Aspects of functional programming
This thesis explores the application of functional programming in new areas and its
implementation using new technologies. We show how functional languages can be
used to implement solutions to problems in fuzzy logic using a number of languages:
Haskell, Ginger and Aladin. A compiler for the weakly-typed, lazy language Ginger
is developed using Java byte-code as its target code. This is used as the inspiration
for an implementation of Aladin, a simple functional language which has two novel
features: its primitives are designed to be written in any language, and evaluation
is controlled by declaring the strictness of all functions. Efficient denotational and
operational semantics are given for this machine and an implementation is devel-
oped using these semantics. We then show that by using the advantages of Aladin
(simplicity and strictness control) we can employ partial evaluation to achieve con-
siderable speed-ups in the running times of Aladin programs
- …