20 research outputs found
Exploring C semantics and pointer provenance
The semantics of pointers and memory objects in C has been a vexed question for many years. C values cannot be treated as either purely abstract or purely concrete entities: the language exposes their representations, but compiler optimisations rely on analyses that reason about provenance and initialisation status, not just runtime representations. The ISO WG14 standard leaves much of this unclear, and in some respects differs with de facto standard usage - which itself is difficult to investigate.
In this paper we explore the possible source-language semantics for memory objects and pointers, in ISO C and in C as it is used and implemented in practice, focussing especially on pointer provenance. We aim to, as far as possible, reconcile the ISO C standard, mainstream compiler behaviour, and the semantics relied on by the corpus of existing C code. We present two coherent proposals, tracking provenance via integers and not; both address many design questions. We highlight some pros and cons and open questions, and illustrate the discussion with a library of test cases. We make our semantics executable as a test oracle, integrating it with the Cerberus semantics for much of the rest of C, which we have made substantially more complete and robust, and equipped with a web-interface GUI. This allows us to experimentally assess our proposals on those test cases. To assess their viability with respect to larger bodies of C code, we analyse the changes required and the resulting behaviour for a port of FreeBSD to CHERI, a research architecture supporting hardware capabilities, which (roughly speaking) traps on the memory safety violations which our proposals deem undefined behaviour. We also develop a new runtime instrumentation tool to detect possible provenance violations in normal C code, and apply it to some of the SPEC benchmarks. We compare our proposal with a source-language variant of the twin-allocation LLVM semantics proposal of Lee et al. Finally, we describe ongoing interactions with WG14, exploring how our proposals could be incorporated into the ISO standard
Cerberus-BMC: A Principled Reference Semantics and Exploration Tool for Concurrent and Sequential C
C remains central to our infrastructure, making verification of C code
an essential and much-researched topic, but the semantics of C is
remarkably complex, and important aspects of it are still unsettled,
leaving programmers and verification tool builders on shaky
ground.
This paper describes a tool, Cerberus-BMC, that for the first time
provides a principled reference semantics that simultaneously supports
(1) a choice of concurrency memory model (including substantial
fragments of the C11, RC11, and Linux kernel memory models), (2) a modern memory object model,
and (3) a well-validated thread-local semantics for a large fragment
of the language.
The tool should be useful for C programmers, compiler writers,
verification tool builders, and members of the C/C++ standards committees
C provenance semantics: examples
This note discusses the design of provenance semantics for C, looking at a series of examples. We consider three variantsof the provenance-not-via-integer (PNVI) model: PNVI plain, PNVI address-exposed (PNVI-ae) and PNVI address-exposeduser-disambiguation (PNVI-ae-udi), and also the provenance-via-integers (PVI) model. The examples include those of ExploringC Semantics and Pointer Provenance [POPL 2019] (also available as ISO WG14 N2311 http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2311.pdf), with several additions. This is based on recent discussion in the C memory object model study group. Itshould be read together with the two companion notes, one giving detailed diffs to the C standard text (N2362), and anothergiving detailed semantics for these variants (N2364)
Formalizing Stack Safety as a Security Property
The term stack safety is used to describe a variety of compiler, runtime, and hardware mechanisms for protecting stack memory. Unlike âthe heap,â the ISA-level stack does not correspond to a single high-level language concept: different compilers use it in different ways to support procedural and functional abstraction mechanisms from a wide range of languages. This protean nature makes it difficult to nail down what it means to correctly enforce stack safety
Rigorous engineering for hardware security: Formal modelling and proof in the CHERI design and implementation process
The root causes of many security vulnerabilities include a pernicious combination of two problems, often regarded as inescapable aspects of computing. First, the protection mechanisms provided by the mainstream processor architecture and C/C++ language abstractions, dating back to the 1970s and before, provide only coarse-grain virtual-memory-based protection. Second, mainstream system engineering relies almost exclusively on test-and-debug methods, with (at best) prose specifications. These methods have historically sufficed commercially for much of the computer industry, but they fail to prevent large numbers of exploitable bugs, and the security problems that this causes are becoming ever more acute.
In this paper we show how more rigorous engineering methods can be applied to the development of a new security-enhanced processor architecture, with its accompanying hardware implementation and software stack. We use formal models of the complete instruction-set architecture (ISA) at the heart of the design and engineering process, both in lightweight ways that support and improve normal engineering practice -- as documentation, in emulators used as a test oracle for hardware and for running software, and for test generation -- and for formal verification. We formalise key intended security properties of the design, and establish that these hold with mechanised proof. This is for the same complete ISA models (complete enough to boot operating systems), without idealisation.
We do this for CHERI, an architecture with \emph{hardware capabilities} that supports fine-grained memory protection and scalable secure compartmentalisation, while offering a smooth adoption path for existing software. CHERI is a maturing research architecture, developed since 2010, with work now underway on an Arm industrial prototype to explore its possible adoption in mass-market commercial processors. The rigorous engineering work described here has been an integral part of its development to date, enabling more rapid and confident experimentation, and boosting confidence in the design.This work was supported by EPSRC programme grant EP/K008528/1 (REMS: Rigorous Engineering for Mainstream Systems).
This work was supported by a Gates studentship (Nienhuis).
This project has received funding from the European Research Council
(ERC) under the European Union's Horizon 2020 research and innovation
programme (grant agreement 789108, ELVER).
This work was supported by the Defense
Advanced Research Projects Agency (DARPA) and the Air Force Research
Laboratory (AFRL), under contracts FA8750-10-C-0237 (CTSRD),
HR0011-18-C-0016 (ECATS),
and FA8650-18-C-7809 (CIFV)
Rust for Morello: Always-on memory safety, even in unsafe code (experience paper)
Memory safety issues are a serious concern in systems programming. Rust is a systems language that provides memory safety through a combination of a static checks embodied in the type system and ad hoc dynamic checks inserted where this analysis becomes impractical. The Morello prototype architecture from ARM uses capabilities, fat pointers augmented with object bounds information, to catch failures of memory safety. This paper presents a compiler from Rust to the Morello architecture, together with a comparison of the performance of Rustâs runtime safety checks and the hardware-supported checks of Morello. The cost of Morelloâs always-on memory safety guarantees is 39% in our 19 benchmark suites from the Rust crates repository (comprising 870 total benchmarks). For this cost, Morelloâs capabilities ensure that even unsafe Rust code benefits from memory safety guarantees
Recommended from our members
Complete spatial safety for C and C++ using CHERI capabilities
Lack of memory safety in commonly used systems-level languages such as C and C++ results in a constant stream of new exploitable software vulnerabilities and exploit techniques. Many exploit mitigations have been proposed and deployed over the years, yet none address the root issue: lack of memory safety. Most C and C++ implementations assume a memory model based on a linear array of bytes rather than an object-centric view. Whilst more efficient on contemporary CPU architectures, linear addresses cannot encode the target object, thus permitting memory errors such as spatial safety violations (ignoring the bounds of an object). One promising mechanism to provide memory safety is CHERI
(Capability Hardware Enhanced RISC Instructions), which extends existing processor architectures with capabilities that provide hardware-enforced checks for all accesses and can be used to prevent spatial memory violations. This dissertation prototypes and evaluates a pure-capability programming model (using CHERI capabilities for all pointers) to provide complete spatial memory protection for traditionally unsafe languages.
As the first step towards memory safety, all language-visible pointers can be implemented as capabilities. I analyse the programmer-visible impact of this change and refine the pure-capability programming model to provide strong source-level compatibility with existing code. Second, to provide robust spatial safety, language-invisible pointers (mostly arising from program linkage) such as those used for functions calls and global variable accesses must also be protected. In doing so, I highlight trade-offs between performance and privilege minimization for implicit and programmer-visible pointers. Finally, I present
CheriSH, a novel and highly compatible technique that protects against buffer overflows between fields of the same object, hereby ensuring that the CHERI spatial memory protection is complete.
I find that the byte-granular spatial safety provided by CHERI pure-capability code is not only stronger than most other approaches, but also incurs almost negligible performance overheads in common cases (0.1% geometric mean) and a worst-case overhead of only 23.3% compared to the insecure MIPS baseline. Moreover, I show that the pure-capability programming model provides near-complete source-level compatibility with existing programs. I evaluate this based on porting large widely used open-source applications such as PostgreSQL and WebKit with only minimal changes: fewer than 0.1% of source lines.
I conclude that pure-capability CHERI C/C++ is an eminently viable programming environment offering strong memory protection, good source-level compatibility and low performance overheads