99 research outputs found
Recommended from our members
Mechanising and evolving the formal semantics of WebAssembly: the Web's new low-level language
WebAssembly is the first new programming language to be supported natively by all major Web browsers since JavaScript. It is designed to be a natural low-level compilation target for languages such as C, C++, and Rust, enabling programs written in these languages to be compiled and executed efficiently on the Web. WebAssembly’s specification is managed by the W3C WebAssembly Working Group (made up of representatives from a number of major tech companies). Uniquely, the language is specified by way of a full pen-and-paper formal semantics.
This thesis describes a number of ways in which I have both helped to shape the specification of WebAssembly, and built upon it. By mechanising the WebAssembly formal semantics in Isabelle/HOL while it was being drafted, I discovered a number of errors in the specification, drove the adoption of official corrections, and provided the first type soundness proof for the corrected language. This thesis also details a verified type checker and interpreter, and a security type system extension for cryptography primitives, all of which have been mechanised as extensions of my initial WebAssembly mechanisation.
A major component of the thesis is my work on the specification of shared memory concurrency in Web languages: correcting and verifying properties of JavaScript’s existing relaxed memory model, and defining the WebAssembly-specific extensions to the corrected model which have been adopted as the basis of WebAssembly’s official threads specification. A number of deficiencies in the original JavaScript model are detailed. Some errors have been corrected, with the verified fixes officially adopted into subsequent editions of the language specification. However one discovered deficiency is fundamental to the model, an instance of the well-known "thin-air problem".
My work demonstrates the value of formalisation and mechanisation in industrial programming language design, not only in discovering and correcting specification errors, but also in building confidence both in the correctness of the language’s design and in the design of proposed extensions.2019 Google PhD Fellowship in Programming Technology and Software Engineering
Peterhouse Research Fellowshi
Mechanising and verifying the WebAssembly specification
WebAssembly is a new low-level language currently being implemented in all major web browsers. It is designed to become the universal compilation target for the web, obsoleting existing solutions in this area, such as asm.js and Native Client. The WebAssembly working group has incorporated formal techniques into the development of the language, but their efforts so far have focussed on pen and paper formal specification.
We present a mechanised Isabelle specification for the WebAssembly language, together with a verified executable interpreter and type checker. Moreover, we present a fully mechanised proof of the soundness of the WebAssembly type system, and detail how our work on this proof has exposed several issues with the official WebAssembly specification, influencing its development. Finally, we give a brief account of our efforts in performing differential fuzzing of our interpreter against industry implementations
Towards logic-based verification of javascript programs
In this position paper, we argue for what we believe is a correct pathway to achieving scalable symbolic verification of JavaScript based on separation logic. We highlight the difficulties imposed by the language, the current state-of-the-art in the literature, and the sequence of steps that needs to be taken. We briefly describe JaVerT, our semiautomatic toolchain for JavaScript verification
An executable formal semantics of PHP with applications to program analysis
Nowadays, many important activities in our lives involve the web. However, the software and protocols on which web applications are based were not designed with the appropriate level of security in mind. Many web applications have reached a level of complexity for which testing, code reviews and human inspection are no longer sufficient quality-assurance guarantees. Tools that employ static analysis techniques are needed in order to explore all possible execution paths through an application and guarantee the absence of undesirable behaviours. To make sure that an analysis captures the properties of interest, and to navigate the trade-offs between efficiency and precision, it is necessary to base the design and the development of static analysis tools on a firm understanding of the language to be analysed. When this underlying knowledge is missing or erroneous, tools can’t be trusted no matter what advanced techniques they use to perform their task. In this Thesis, we introduce KPHP, the first executable formal semantics of PHP, one of the most popular languages for server-side web programming. Then, we demonstrate its practical relevance by developing two verification tools, of increasing complexity, on top of it - a simple verifier based on symbolic execution and LTL model checking and a general purpose, fully configurable and extensible static analyser based on Abstract Interpretation. Our LTL-based tool leverages the existing symbolic execution and model checking support offered by K, our semantics framework of choice, and constitutes a first proof-of-concept of the usefulness of our semantics. Our abstract interpreter, on the other hand, represents a more significant and novel contribution to the field of static analysis of dynamic scripting languages (PHP in particular). Although our tool is still a prototype and therefore not well suited for handling large real-world codebases, we demonstrate how our semantics-based, principled approach to the development of verification tools has lead to the design of static analyses that outperform existing tools and approaches, both in terms of supported language features, precision, and breadth of possible applications.Open Acces
Rigorous engineering for hardware security: Formal modelling and proof in the CHERI design and implementation process
The root causes of many security vulnerabilities include a pernicious combination of two problems, often regarded as inescapable aspects of computing. First, the protection mechanisms provided by the mainstream processor architecture and C/C++ language abstractions, dating back to the 1970s and before, provide only coarse-grain virtual-memory-based protection. Second, mainstream system engineering relies almost exclusively on test-and-debug methods, with (at best) prose specifications. These methods have historically sufficed commercially for much of the computer industry, but they fail to prevent large numbers of exploitable bugs, and the security problems that this causes are becoming ever more acute.
In this paper we show how more rigorous engineering methods can be applied to the development of a new security-enhanced processor architecture, with its accompanying hardware implementation and software stack. We use formal models of the complete instruction-set architecture (ISA) at the heart of the design and engineering process, both in lightweight ways that support and improve normal engineering practice -- as documentation, in emulators used as a test oracle for hardware and for running software, and for test generation -- and for formal verification. We formalise key intended security properties of the design, and establish that these hold with mechanised proof. This is for the same complete ISA models (complete enough to boot operating systems), without idealisation.
We do this for CHERI, an architecture with \emph{hardware capabilities} that supports fine-grained memory protection and scalable secure compartmentalisation, while offering a smooth adoption path for existing software. CHERI is a maturing research architecture, developed since 2010, with work now underway on an Arm industrial prototype to explore its possible adoption in mass-market commercial processors. The rigorous engineering work described here has been an integral part of its development to date, enabling more rapid and confident experimentation, and boosting confidence in the design.This work was supported by EPSRC programme grant EP/K008528/1 (REMS: Rigorous Engineering for Mainstream Systems).
This work was supported by a Gates studentship (Nienhuis).
This project has received funding from the European Research Council
(ERC) under the European Union's Horizon 2020 research and innovation
programme (grant agreement 789108, ELVER).
This work was supported by the Defense
Advanced Research Projects Agency (DARPA) and the Air Force Research
Laboratory (AFRL), under contracts FA8750-10-C-0237 (CTSRD),
HR0011-18-C-0016 (ECATS),
and FA8650-18-C-7809 (CIFV)
Modular, Fully-abstract Compilation by Approximate Back-translation
A compiler is fully-abstract if the compilation from source language programs
to target language programs reflects and preserves behavioural equivalence.
Such compilers have important security benefits, as they limit the power of an
attacker interacting with the program in the target language to that of an
attacker interacting with the program in the source language. Proving compiler
full-abstraction is, however, rather complicated. A common proof technique is
based on the back-translation of target-level program contexts to
behaviourally-equivalent source-level contexts. However, constructing such a
back- translation is problematic when the source language is not strong enough
to embed an encoding of the target language. For instance, when compiling from
STLC to ULC, the lack of recursive types in the former prevents such a
back-translation.
We propose a general and elegant solution for this problem. The key insight
is that it suffices to construct an approximate back-translation. The
approximation is only accurate up to a certain number of steps and conservative
beyond that, in the sense that the context generated by the back-translation
may diverge when the original would not, but not vice versa. Based on this
insight, we describe a general technique for proving compiler full-abstraction
and demonstrate it on a compiler from STLC to ULC. The proof uses asymmetric
cross-language logical relations and makes innovative use of step-indexing to
express the relation between a context and its approximate back-translation.
The proof extends easily to common compiler patterns such as modular
compilation and it, to the best of our knowledge, it is the first compiler full
abstraction proof to have been fully mechanised in Coq. We believe this proof
technique can scale to challenging settings and enable simpler, more scalable
proofs of compiler full-abstraction
Necessity Specifications for Robustness
Robust modules guarantee to do only what they are supposed to do - even in
the presence of untrusted, malicious clients, and considering not just the
direct behaviour of individual methods, but also the emergent behaviour from
calls to more than one method. Necessity is a language for specifying
robustness, based on novel necessity operators capturing temporal implication,
and a proof logic that derives explicit robustness specifications from
functional specifications. Soundness and an exemplar proof are mechanised in
Coq
JSExplain: a double debugger for JavaScript
We present JSExplain, a reference interpreter for JavaScript that closely follows the specification and that produces execution traces. These traces may be interactively investigated in a browser, with an interface that displays not only the code and the state of the interpreter, but also the code and the state of the interpreted program. Conditional breakpoints may be expressed with respect to both the interpreter and the interpreted program. In that respect, JSExplain is a double-debugger for the specification of JavaScript
- …