19 research outputs found
Repairing and mechanising the JavaScript relaxed memory model
© 2020 ACM. Modern JavaScript includes the SharedArrayBuffer feature, which provides access to true shared memory concurrency. SharedArrayBuffers are simple linear buffers of bytes, and the JavaScript specification defines an axiomatic relaxed memory model to describe their behaviour. While this model is heavily based on the C/C++11 model, it diverges in some key areas. JavaScript chooses to give a well-defined semantics to data-races, unlike the "undefined behaviour" of C/C++11. Moreover, the JavaScript model is mixed-size. This means that its accesses are not to discrete locations, but to (possibly overlapping) ranges of bytes. We show that the model, in violation of the design intention, does not support a compilation scheme to ARMv8 which is used in practice. We propose a correction, which also incorporates a previously proposed fix for a failure of the model to provide Sequential Consistency of Data-Race-Free programs (SC-DRF), an important correctness condition. We use model checking, in Alloy, to generate small counter-examples for these deficiencies, and investigate our correction. To accomplish this, we also develop a mixed-size extension to the existing ARMv8 axiomatic model. Guided by our Alloy experimentation, we mechanise (in Coq) the JavaScript model (corrected and uncorrected), our ARMv8 model, and, for the corrected JavaScript model, a "model-internal" SC-DRF proof and a compilation scheme correctness proof to ARMv8. In addition, we investigate a non-mixed-size subset of the corrected JavaScript model, and give proofs of compilation correctness for this subset to x86-TSO, Power, RISC-V, ARMv7, and (again) ARMv8, via the Intermediate Memory Model (IMM). As a result of our work, the JavaScript standards body (ECMA TC39) will include fixes for both issues in an upcoming edition of the specification
Recommended from our members
Mechanising and evolving the formal semantics of WebAssembly: the Web's new low-level language
WebAssembly is the first new programming language to be supported natively by all major Web browsers since JavaScript. It is designed to be a natural low-level compilation target for languages such as C, C++, and Rust, enabling programs written in these languages to be compiled and executed efficiently on the Web. WebAssembly’s specification is managed by the W3C WebAssembly Working Group (made up of representatives from a number of major tech companies). Uniquely, the language is specified by way of a full pen-and-paper formal semantics.
This thesis describes a number of ways in which I have both helped to shape the specification of WebAssembly, and built upon it. By mechanising the WebAssembly formal semantics in Isabelle/HOL while it was being drafted, I discovered a number of errors in the specification, drove the adoption of official corrections, and provided the first type soundness proof for the corrected language. This thesis also details a verified type checker and interpreter, and a security type system extension for cryptography primitives, all of which have been mechanised as extensions of my initial WebAssembly mechanisation.
A major component of the thesis is my work on the specification of shared memory concurrency in Web languages: correcting and verifying properties of JavaScript’s existing relaxed memory model, and defining the WebAssembly-specific extensions to the corrected model which have been adopted as the basis of WebAssembly’s official threads specification. A number of deficiencies in the original JavaScript model are detailed. Some errors have been corrected, with the verified fixes officially adopted into subsequent editions of the language specification. However one discovered deficiency is fundamental to the model, an instance of the well-known "thin-air problem".
My work demonstrates the value of formalisation and mechanisation in industrial programming language design, not only in discovering and correcting specification errors, but also in building confidence both in the correctness of the language’s design and in the design of proposed extensions.2019 Google PhD Fellowship in Programming Technology and Software Engineering
Peterhouse Research Fellowshi
Mechanizing Webassembly Proposals
WebAssembly is a modern low-level programming language designed to provide high performance and security. To enable these goals, the language specifies a relatively small number of low-level types, instructions, and language constructs. The language is proven to be sound with respect to its types and execution, and a separate mechanized formalization of the specification and type soundness proofs confirms this. As an emerging technology, the language is continuously being developed, with modifications being proposed and discussed in the open and on a frequent basis.
In order to ensure the soundness properties exhibited by the original core language are maintained as WebAssembly evolves, these proposals should too be mechanized and verified to be sound. This work extends the existing Isabelle mechanization to include three such proposals which add additional features to the language, and shows that the language maintains its soundness properties with their inclusion
Recommended from our members
Weakening WebAssembly
WebAssembly (Wasm) is a safe, portable virtual instruction set that can be hosted in a wide range of environments, such as a Web browser. It is a low-level language whose instructions are intended to compile directly to
bare hardware. While the initial version of Wasm focussed on single-threaded computation, a recent proposal extends it with low-level support for multiple threads and atomic instructions for synchronised access to
shared memory. To support the correct compilation of concurrent programs, it is necessary to give a suitable specification of its memory model.
Wasm’s language definition is based on a fully formalised specification that carefully avoids undefined behaviour. We present a substantial extension to this semantics, incorporating a relaxed memory model, along
with a few proposed operational extensions. Wasm’s memory model is unique in that its linear address space can be dynamically grown during execution, while all accesses are bounds-checked. This leads to the novel
problem of specifying how observations about the size of the memory can propagate between threads. We argue that, considering desirable compilation schemes, we cannot give a sequentially consistent semantics to memory growth.
We show that our model guarantees Sequential Consistency of Data-Race-Free programs (SC-DRF). However, because Wasm is to run on the Web, we must also consider interoperability of its model with that of JavaScript.
We show, by counter-example, that JavaScript’s memory model is not SC-DRF, in contrast to what is claimed in its specification. We propose two axiomatic conditions that should be added to the JavaScript model to
correct this difference.
We also describe a prototype SMT-based litmus tool which acts as an oracle for our axiomatic model, visualising its behaviours, including memory resizing
The Leaky Semicolon
Program logics and semantics tell a pleasant story about sequential composition: when executing (S1;S2), we first execute S1 then S2. To improve performance, however, processors execute instructions out of order, and compilers reorder programs even more dramatically. By design, single-threaded systems cannot observe these reorderings; however, multiple-threaded systems can, making the story considerably less pleasant. A formal attempt to understand the resulting mess is known as a “relaxed memory model.” Prior models either fail to address sequential composition directly, or overly restrict processors and compilers, or permit nonsense thin-air behaviors which are unobservable in practice.
To support sequential composition while targeting modern hardware, we enrich the standard event-based approach with preconditions and families of predicate transformers. When calculating the meaning of (S1; S2), the predicate transformer applied to the precondition of an event e from S2 is chosen based on the set of events in S1 upon which e depends. We apply this approach to two existing memory models
Armed Cats: formal concurrency modelling at Arm
International audienceWe report on the process for formal concurrency modelling at Arm. An initial formal consistency model of the Arm achitecture, written in the cat language, was published and upstreamed to the herd+diy tool suite in 2017. Since then, we have extended the original model with extra features, for example mixed-size accesses, and produced two provably equivalent alternative formulations. In this paper, we present a comprehensive review of work done at Arm on the consistency model. Along the way, we also show that our principle for handling mixed-size accesses applies to x86: we confirm this via vast experimental campaigns. We also show that our alternative formulations are applicable to any model phrased in a style similar to the one chosen by Arm
Recommended from our members
Repairing and mechanising the JavaScript relaxed memory model
© 2020 ACM. Modern JavaScript includes the SharedArrayBuffer feature, which provides access to true shared memory concurrency. SharedArrayBuffers are simple linear buffers of bytes, and the JavaScript specification defines an axiomatic relaxed memory model to describe their behaviour. While this model is heavily based on the C/C++11 model, it diverges in some key areas. JavaScript chooses to give a well-defined semantics to data-races, unlike the "undefined behaviour" of C/C++11. Moreover, the JavaScript model is mixed-size. This means that its accesses are not to discrete locations, but to (possibly overlapping) ranges of bytes. We show that the model, in violation of the design intention, does not support a compilation scheme to ARMv8 which is used in practice. We propose a correction, which also incorporates a previously proposed fix for a failure of the model to provide Sequential Consistency of Data-Race-Free programs (SC-DRF), an important correctness condition. We use model checking, in Alloy, to generate small counter-examples for these deficiencies, and investigate our correction. To accomplish this, we also develop a mixed-size extension to the existing ARMv8 axiomatic model. Guided by our Alloy experimentation, we mechanise (in Coq) the JavaScript model (corrected and uncorrected), our ARMv8 model, and, for the corrected JavaScript model, a "model-internal" SC-DRF proof and a compilation scheme correctness proof to ARMv8. In addition, we investigate a non-mixed-size subset of the corrected JavaScript model, and give proofs of compilation correctness for this subset to x86-TSO, Power, RISC-V, ARMv7, and (again) ARMv8, via the Intermediate Memory Model (IMM). As a result of our work, the JavaScript standards body (ECMA TC39) will include fixes for both issues in an upcoming edition of the specification