Memory corruption vulnerabilities are endemic to unsafe languages such as C, and they can even be found in safe languages that are themselves implemented in unsafe languages or linked with libraries implemented in unsafe languages. Robust compilation mitigates the threat of linking with memory-unsafe libraries. The source language is a C-like language, enriched with a notion of a component which encapsulates data and code, exposing functionality through well-defined interfaces. Robust compilation defines what security properties a component still has even if one or more components are compromised. The main contribution of this work is to demonstrate that the compartmentalization necessary for a compiler that has the robust compilation property can be realized on a basic RISC processor using software fault isolation.
Problem and Motivation
Formal definitions of secure compilation have been proposed by Juglaret et al. [7] and, more recently, by Garg et al. [4].
This work is part of the effort to propose a new definition for robust compilation of unsafe low-level languages [3]. A compiler has the robust compilation property if any attack on a compiled variant of a program (a set of components) that can be mounted by a component linked and executed with it can also be mounted at the source level by a component. In the source-level semantics, it is impossible to write to another component's memory, and only procedures exported by the callee and imported by the caller can be called. Thus, for the robust compilation property to hold, a strong machine-level separation of the compiled program and the target context is necessary. Juglaret et al.'s [6] implementation targeted a micro-policy architecture [2] with special tagging capabilities at the level of memory locations. This work focuses on supporting the new definition of secure compilation on a generic RISC processor, without specialized hardware. We use software fault isolation [13] mechanisms to provide a proof-of-concept implementation of a compiler back-end to a basic RISC machine.
Background and Related Work
Software fault isolation was proposed in 1993 by Wahbe et al. [13]. A distrusted module is sandboxed into its own fault domain, a logical region of the address space. To prevent it from modifying data or executing code belonging to the rest of the application, its object code is instrumented. The physical address is split logically into a segment id and an offset, and the introduced instrumentation prevents writes outside the data domain and prevents execution from escaping the code domain, other than through predefined exit points. Many applications that use software fault isolation followed. Google's Native Client [14] uses software fault isolation to sandbox C/C++ code in the Chrome web browser. Morrisett et al. [10] proposed a semantics of the x86 architecture and constructed a machine-verified checker for Native Client. ARMor [15] is a machine-verified system that uses software fault isolation to sandbox application code running on embedded processors. In this research, we combine ideas from this previous work and apply them to support robust compilation on a processor without specialized hardware.
Abadi [1] defined full abstraction as the property of a compiler to preserve and reflect observational equivalence. Achieving observational equivalence in the presence of side channels, such as timing, is impossible. Instead, robust compilation focuses only on mapping back to the source level a context that induces a certain behavior on a program. The robust compilation property for unsafe languages proposed by Fachini et al. [3] is the following: for all source-level programs P and all low-level contexts C_T there exists a source-level context C_S, with no undefined behavior, such that the low-level trace t of compiled P linked with C_T and the source-level trace t' of P linked with C_S match up to an undefined behavior in P.
Approach and Uniqueness
The work presented in this abstract is part of a project [3] that aims at defining a new security property and implementing a proof-of-concept compiler from a C-like language with components to two target machines: a generic RISC processor and a micro-policy machine [2]. The back-end compiler phase described here targets the generic RISC processor, and the generated executable runs on the bare hardware. While promising, the micro-policy machine [2] does not exist yet. Here we target a generic load-store machine with no specialized hardware for protection. The novelty of this new software fault isolation implementation is that, instead of protecting an application from one or more potentially malicious libraries, all components are potentially malicious and, thus, mutually distrustful.
arXiv:1802.01044v1 [cs.CR] 3 Feb 2018
In our approach, a source-level program is translated from the C-like language with components to an intermediate-level language that uses a memory model similar to CompCert's [9], enriched with a notion of component and interfaces between components. The addresses are not resolved and the interface calls between components are abstract. Our work implements a compiler pass in Coq [12]. It takes this intermediate program and generates a RISC assembly program that satisfies the following invariants:
1. a component can write only within its own data memory;
2. a component can only jump within its own code memory, except for predefined exit points allowed by the interface; and
3. if, after a call to another component, execution is transferred back to the caller component, then it always returns to the instruction after the call.
The assumptions in this research are that the basic RISC machine has a minimal load-store instruction set. The register file contains a set of registers dedicated to the software fault isolation instrumentation. The memory is unbounded and it is split into slots. The slots are allocated statically to each component and their type, code or data, is also statically determined. A physical address is an unbounded integer whose bits, starting from the least significant, are: the offset within the slot, the component identifier, and the slot identifier. The offset and the component identifier are bounded, and the slot identifier is not. Thus, each component has an unbounded memory, but a limit on the contiguous memory it can allocate.
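The address layout described above can be sketched as follows. This is a minimal illustration, not the project's Coq code: the field widths `OFFSET_BITS` and `COMP_BITS` and the helper names `encode` and `decode` are assumptions introduced here, since the text only states that the offset and the component identifier are bounded. Python integers are unbounded, which conveniently models the unbounded slot identifier.

```python
# Hypothetical field widths; the text only says both fields are bounded.
OFFSET_BITS = 12
COMP_BITS = 2


def encode(slot, comp, offset):
    """Pack (slot, component, offset) into one address, offset in the low bits."""
    assert 0 <= offset < (1 << OFFSET_BITS)
    assert 0 <= comp < (1 << COMP_BITS)
    assert slot >= 0  # the slot identifier has no upper bound
    return (slot << (OFFSET_BITS + COMP_BITS)) | (comp << OFFSET_BITS) | offset


def decode(addr):
    """Split an address back into (slot, component, offset)."""
    offset = addr & ((1 << OFFSET_BITS) - 1)
    comp = (addr >> OFFSET_BITS) & ((1 << COMP_BITS) - 1)
    slot = addr >> (OFFSET_BITS + COMP_BITS)
    return slot, comp, offset
```

With this layout, consecutive offsets inside one slot are contiguous, while the next slot of the same component lies a full `1 << (OFFSET_BITS + COMP_BITS)` addresses away, which is why contiguous allocation is limited to the slot size.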
To enforce the first two invariants, this work uses a strategy from Wahbe et al. [13] that relies on two extra instructions and three dedicated registers. Using binary bitwise operations on an address, the bits corresponding to the component identifier are set to those of the current component. All the data slots are odd, and the instrumentation for the store instruction sets the least significant bit of the slot. All the code slots are even, and the instrumentation for jump clears the least significant bit of the slot. Thus, no writes are possible in the code segment.
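The masking described above can be modeled on plain integers. The sketch below is an assumption-laden model rather than the generated assembly: the field widths and the names `force_component`, `sandbox_store`, `sandbox_jump`, and `fields` are hypothetical, and the dedicated registers are not represented. Note that the address is coerced into the current component's memory rather than checked and rejected.

```python
# Hypothetical field widths (offset in the low bits, then component, then slot).
OFFSET_BITS = 12
COMP_BITS = 2
SLOT_SHIFT = OFFSET_BITS + COMP_BITS
COMP_MASK = ((1 << COMP_BITS) - 1) << OFFSET_BITS
SLOT_LSB = 1 << SLOT_SHIFT


def force_component(addr, comp):
    """Overwrite the component bits of addr with the current component."""
    return (addr & ~COMP_MASK) | (comp << OFFSET_BITS)


def sandbox_store(addr, comp):
    """Effective store address: own component, odd (data) slot."""
    return force_component(addr, comp) | SLOT_LSB


def sandbox_jump(addr, comp):
    """Effective jump target: own component, even (code) slot."""
    return force_component(addr, comp) & ~SLOT_LSB


def fields(addr):
    """Return (slot, component, offset) for inspection."""
    return (addr >> SLOT_SHIFT,
            (addr >> OFFSET_BITS) & ((1 << COMP_BITS) - 1),
            addr & ((1 << OFFSET_BITS) - 1))
```

For example, if component 1 attempts a store to code slot 4 of component 3, the sandboxed address lands in data slot 5 of component 1, so neither another component's memory nor any code slot is written.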
For the enforcement of the cross-component control flow, we use a dedicated protected control stack and a dedicated register for the stack pointer. The protected control stack is kept in a reserved memory, which can be accessed only from special instrumentation sequences. To ensure continuous execution of a certain number of instructions needed for managing the protected control stack, we align the instructions [10]. The first two sandboxing invariants do not protect the currently executing component, but rather protect all other components from it. Special care must be taken to protect the control stack. First, the procedures called externally are placed at an unaligned address and are preceded by a Halt instruction. Thus, spurious pushes onto the protected control stack are avoided. Second, to avoid the error of popping from an empty stack, the execution starts by pushing the address of a Halt instruction onto the protected control stack, and then execution is transferred to the main function.
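The role of the protected control stack can be illustrated with a small model. This is only a sketch under stated assumptions: `HALT_ADDR`, the class `ProtectedControlStack`, and its method names are hypothetical, and instruction alignment and the reserved memory region are not modeled.

```python
HALT_ADDR = 0  # hypothetical address of a Halt instruction


class ProtectedControlStack:
    """Model of the return-address stack used for cross-component calls."""

    def __init__(self):
        # Execution starts by pushing the address of a Halt instruction,
        # so a return from main lands on Halt instead of popping an empty stack.
        self._frames = [HALT_ADDR]

    def call(self, return_addr):
        """Cross-component call: record the address of the instruction after the call."""
        self._frames.append(return_addr)

    def ret(self, requested_addr=None):
        """Cross-component return: the target comes from the stack only.

        Any address the returning component supplies is ignored.
        """
        return self._frames.pop()
```

A callee that tries to return to an address of its own choosing still resumes the caller at the recorded address, and the final return from main yields `HALT_ADDR`.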
Results and Contributions
The project is implemented in Coq [12] and uses the QuickChick [11] framework to test the three invariants. A test consists of the following steps: randomly generate an intermediate program using QuickChick's primitives [8], compile it with our proof-of-concept compiler, execute it in a simulator while recording a log specific to each invariant using a state monad, and verify the log with a checker [8]. The generated intermediate programs were syntactically correct and no tests were discarded. Currently, we are working on simulating an attack by randomly injecting a change to the data memory of a component.
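The generate, execute, log, and check loop can be sketched outside Coq as follows. This is not QuickChick code: Python's standard `random` module stands in for the generators, a list of raw store addresses stands in for a generated intermediate program, and the field widths and all function names are assumptions made for illustration. Only the first invariant (stores stay in the component's own data memory) is checked.

```python
import random

# Hypothetical field widths (offset in the low bits, then component, then slot).
OFFSET_BITS = 12
COMP_BITS = 2
SLOT_SHIFT = OFFSET_BITS + COMP_BITS
COMP_MASK = ((1 << COMP_BITS) - 1) << OFFSET_BITS


def sandbox_store(addr, comp):
    """Instrumented store address: own component, odd (data) slot."""
    return ((addr & ~COMP_MASK) | (comp << OFFSET_BITS)) | (1 << SLOT_SHIFT)


def run_with_log(stores, comp):
    """Simulate the instrumented stores and log (component, effective address)."""
    return [(comp, sandbox_store(addr, comp)) for addr in stores]


def check_store_log(log):
    """Checker: every logged store hits a data slot of the storing component."""
    for comp, addr in log:
        owner = (addr >> OFFSET_BITS) & ((1 << COMP_BITS) - 1)
        slot = addr >> SLOT_SHIFT
        if owner != comp or slot % 2 == 0:
            return False
    return True


def test_store_invariant(trials=200, seed=0):
    """Generate random store sequences and check the log of each run."""
    rng = random.Random(seed)
    for _ in range(trials):
        comp = rng.randrange(1 << COMP_BITS)
        stores = [rng.getrandbits(40) for _ in range(rng.randrange(1, 20))]
        if not check_store_log(run_with_log(stores, comp)):
            return False
    return True
```

Keeping the checker separate from the simulator mirrors the structure described above: the simulator only records what happened, and the checker decides whether the log respects the invariant.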
The robust compilation property definition cannot be directly applied at the target level, where the addresses are resolved and a certain layout in memory and instrumentation are expected. Here, the adversarial context is linked and compiled together with the program, and the robust compilation property is restated as follows.
In Figure 1 the program P has three components, and it is linked with the adversarial component C_a. Together, they are compiled and executed in the target machine semantics and produce the trace t. By robust compilation, there exists a component S, with no undefined behavior, such that S together with P can be executed in the intermediate semantics, producing a trace t'. The trace t' is a prefix of trace t, up to the point where S induces an undefined behavior in P.
In conclusion, we designed and implemented a compiler transformation from a RISC-like intermediate language to a basic RISC assembly language that uses software fault isolation mechanisms to provide the memory and control-flow separation required by the robust compilation property. We tested the implementation using property-based testing [5] and the QuickChick framework [11]. Enforcing the robust compilation property does not require specialized hardware. More work is needed to support system calls and dynamic loading, but this is an encouraging first step.
