7,497 research outputs found
Verified Compilers for a Multi-Language World
Though there has been remarkable progress on formally verified compilers in recent years, most of these compilers suffer from a serious limitation: they are proved correct under the assumption that they will only be used to compile whole programs. This is an unrealistic assumption since most software systems today are comprised of components written in different languages - both typed and untyped - compiled by different compilers to a common target, as well as low-level libraries that may be handwritten in the target language.
We are pursuing a new methodology for building verified compilers for today\u27s world of multi-language software. The project has two central themes, both of which stem from a view of compiler correctness as a language interoperability problem. First, to specify correctness of component compilation, we require that if a source component s compiles to target component t, then t linked with some arbitrary target code t\u27 should behave the same as s interoperating with t\u27. The latter demands a formal semantics of interoperability between the source and target languages. Second, to enable safe interoperability between components compiled from languages as different as ML, Rust, Python, and C, we plan to design a gradually type-safe target language based on LLVM that supports safe interoperability between more precisely typed, less precisely typed, and type-unsafe components. Our approach opens up a new avenue for exploring sensible language interoperability while also tackling compiler correctness
Compositional Compiler Verification for a Multi-Language World
Verified compilers are typically proved correct under severe
restrictions on what the compiler\u27s output may be linked with, from no
linking at all to linking only with code compiled from the same source
language. Such assumptions contradict the reality of how we use these
compilers since most software systems today are comprised of
components written in different languages compiled by different
compilers to a common target, as well as low-level libraries that may
be handwritten in the target language.
The key challenge in verifying compilers for today\u27s world of
multi-language software is how to formally state a compiler
correctness theorem that is compositional along two dimensions.
First, the theorem must guarantee correct compilation of components
while allowing compiled code to be composed (linked) with
target-language components of arbitrary provenance, including those
compiled from other languages. Second, the theorem must support
verification of multi-pass compilers by composing correctness proofs
for individual passes. In this talk, I will describe a methodology
for verifying compositional compiler correctness for a higher-order
typed language and discuss the challenges that lie ahead. I will
argue that compositional compiler correctness is, in essence, a
language interoperability problem: for viable solutions in the long
term, high-level languages must be equipped with principled
foreign-function interfaces that specify safe interoperability
between high-level and low-level components, and between more
precisely and less precisely typed code
A formally verified compiler back-end
This article describes the development and formal verification (proof of
semantic preservation) of a compiler back-end from Cminor (a simple imperative
intermediate language) to PowerPC assembly code, using the Coq proof assistant
both for programming the compiler and for proving its correctness. Such a
verified compiler is useful in the context of formal methods applied to the
certification of critical software: the verification of the compiler guarantees
that the safety properties proved on the source code hold for the executable
compiled code as well
Beyond Good and Evil: Formalizing the Security Guarantees of Compartmentalizing Compilation
Compartmentalization is good security-engineering practice. By breaking a
large software system into mutually distrustful components that run with
minimal privileges, restricting their interactions to conform to well-defined
interfaces, we can limit the damage caused by low-level attacks such as
control-flow hijacking. When used to defend against such attacks,
compartmentalization is often implemented cooperatively by a compiler and a
low-level compartmentalization mechanism. However, the formal guarantees
provided by such compartmentalizing compilation have seen surprisingly little
investigation.
We propose a new security property, secure compartmentalizing compilation
(SCC), that formally characterizes the guarantees provided by
compartmentalizing compilation and clarifies its attacker model. We reconstruct
our property by starting from the well-established notion of fully abstract
compilation, then identifying and lifting three important limitations that make
standard full abstraction unsuitable for compartmentalization. The connection
to full abstraction allows us to prove SCC by adapting established proof
techniques; we illustrate this with a compiler from a simple unsafe imperative
language with procedures to a compartmentalized abstract machine.Comment: Nit
Modular, Fully-abstract Compilation by Approximate Back-translation
A compiler is fully-abstract if the compilation from source language programs
to target language programs reflects and preserves behavioural equivalence.
Such compilers have important security benefits, as they limit the power of an
attacker interacting with the program in the target language to that of an
attacker interacting with the program in the source language. Proving compiler
full-abstraction is, however, rather complicated. A common proof technique is
based on the back-translation of target-level program contexts to
behaviourally-equivalent source-level contexts. However, constructing such a
back- translation is problematic when the source language is not strong enough
to embed an encoding of the target language. For instance, when compiling from
STLC to ULC, the lack of recursive types in the former prevents such a
back-translation.
We propose a general and elegant solution for this problem. The key insight
is that it suffices to construct an approximate back-translation. The
approximation is only accurate up to a certain number of steps and conservative
beyond that, in the sense that the context generated by the back-translation
may diverge when the original would not, but not vice versa. Based on this
insight, we describe a general technique for proving compiler full-abstraction
and demonstrate it on a compiler from STLC to ULC. The proof uses asymmetric
cross-language logical relations and makes innovative use of step-indexing to
express the relation between a context and its approximate back-translation.
The proof extends easily to common compiler patterns such as modular
compilation and it, to the best of our knowledge, it is the first compiler full
abstraction proof to have been fully mechanised in Coq. We believe this proof
technique can scale to challenging settings and enable simpler, more scalable
proofs of compiler full-abstraction
Automatically Leveraging MapReduce Frameworks for Data-Intensive Applications
MapReduce is a popular programming paradigm for developing large-scale,
data-intensive computation. Many frameworks that implement this paradigm have
recently been developed. To leverage these frameworks, however, developers must
become familiar with their APIs and rewrite existing code. Casper is a new tool
that automatically translates sequential Java programs into the MapReduce
paradigm. Casper identifies potential code fragments to rewrite and translates
them in two steps: (1) Casper uses program synthesis to search for a program
summary (i.e., a functional specification) of each code fragment. The summary
is expressed using a high-level intermediate language resembling the MapReduce
paradigm and verified to be semantically equivalent to the original using a
theorem prover. (2) Casper generates executable code from the summary, using
either the Hadoop, Spark, or Flink API. We evaluated Casper by automatically
converting real-world, sequential Java benchmarks to MapReduce. The resulting
benchmarks perform up to 48.2x faster compared to the original.Comment: 12 pages, additional 4 pages of references and appendi
A Fast Compiler for NetKAT
High-level programming languages play a key role in a growing number of
networking platforms, streamlining application development and enabling precise
formal reasoning about network behavior. Unfortunately, current compilers only
handle "local" programs that specify behavior in terms of hop-by-hop forwarding
behavior, or modest extensions such as simple paths. To encode richer "global"
behaviors, programmers must add extra state -- something that is tricky to get
right and makes programs harder to write and maintain. Making matters worse,
existing compilers can take tens of minutes to generate the forwarding state
for the network, even on relatively small inputs. This forces programmers to
waste time working around performance issues or even revert to using
hardware-level APIs.
This paper presents a new compiler for the NetKAT language that handles rich
features including regular paths and virtual networks, and yet is several
orders of magnitude faster than previous compilers. The compiler uses symbolic
automata to calculate the extra state needed to implement "global" programs,
and an intermediate representation based on binary decision diagrams to
dramatically improve performance. We describe the design and implementation of
three essential compiler stages: from virtual programs (which specify behavior
in terms of virtual topologies) to global programs (which specify network-wide
behavior in terms of physical topologies), from global programs to local
programs (which specify behavior in terms of single-switch behavior), and from
local programs to hardware-level forwarding tables. We present results from
experiments on real-world benchmarks that quantify performance in terms of
compilation time and forwarding table size
- …