Abstract. For the verification of system software, information flow properties of the instruction set architecture (ISA) are essential. They show how information propagates through the processor, including sometimes opaque control registers. Thus, they can be used to guarantee that user processes cannot infer the state of privileged system components, such as secure partitions. Formal ISA models (for example, for the HOL4 theorem prover) have been available for a number of years. However, little work has been published on the formal analysis of these models. In this paper, we present a general framework for proving information flow properties of a number of ISAs automatically, for example for ARM. The analysis is represented in HOL4 using a direct semantical embedding of noninterference, and does not use an explicit type system, in order to (i) minimize the trusted computing base, and to (ii) support a large degree of context-sensitivity, which is needed for the analysis. The framework determines automatically which system components are accessible at a given privilege level, guaranteeing both soundness and accuracy.
Introduction
From a security perspective, isolation of processes on lower privilege levels is one of the main tasks of system software. More and more vulnerabilities discovered in operating systems and hypervisors demonstrate that assurance of this isolation is far from given. That is why an increasing effort has been made to formally verify system software, with noticeable progress in recent years [10, 14, 16, 6, 1].
However, system software depends on hardware support to guarantee isolation.
Usually, this involves at least the ability to execute code on different privilege levels and with basic memory protection. Kernels need to control access to their own code and data and to critical software, both in memory and as content of registers or other components. Moreover, they need to control the management of the access control itself. For the correct configuration of hardware, it is essential to understand how and under which circumstances information flows through the system. Hardware must comply with a contract that kernels can rely on. In practice, however, information flows can be indirect and hidden. For example, some processors automatically set control flags on context switches that can later be used by unprivileged code to see if neighbouring processes have been running or to establish a covert channel [19]. Such attacks can be addressed by the kernel, but to that end, kernel developers need machinery to identify the exact components available to unprivileged code, and specifications often fail to provide this information in a concise form. When analysing information flow, it is insufficient to focus on direct register and memory access. Confidentiality, in particular, can be broken in more subtle ways. Even if direct reads from a control flag are prevented by hardware, the flag can be set as an unintended side effect of an action by one process and later influence the behaviour of another process, allowing the latter to learn something about the control flow of the former.
In this paper we present a framework to automate information flow analysis of instruction set architectures (ISAs) and their operational semantics inside the interactive theorem prover HOL4 [11]. We employ the framework on ISA models developed by Fox et al. [7] and verify noninterference, that is, that secret (high) components cannot influence public (low) components. Besides an ISA model, the input consists of desired conditions (such as a specific privilege mode) and a candidate labelling, specifying which system components are already to be considered as low (such as the program counter) and, implicitly, which components might possibly be high. The approach then iteratively refines the candidate labelling by downgrading new components from high to low until a proper noninterference labelling is obtained, reminiscent of [12]. The iteration may fail for decidability reasons. However, on successful termination, both soundness and accuracy are guaranteed unless a warning is given indicating that only an approximate, sound, but not necessarily accurate solution has been found.
What makes accurate ISA information flow analysis challenging is not only the size and complexity of modern instruction sets, but also particularities in semantics and representation of their models. For example, arithmetic operations (e.g., with bitmasks) can cancel out some information flows and data structures can contain a mix of high and low information. Modification of the models to suit the analysis is error prone and requires manual effort. Automatic, and provably correct, preprocessing of the specifications could overcome some, but not all, of those difficulties, but then the added value of standard approaches such as type systems over a direct implementation becomes questionable. By directly embedding noninterference into HOL4, we can make use of machinery to address the discussed difficulties and at the same time we are able to minimize the trusted computing base (TCB), since the models, the preprocessing and the actual reasoning are all implemented/represented in HOL4. Previous work on HOL4 noninterference proofs for ISA models [13] had to rely on some manual proofs, since its compositional approach suffered from the lack of sufficient context in some cases (e.g., the secrecy level of a register access in one step can depend on location lookups in earlier steps). In contrast, the approach suggested in the present paper analyses ISAs one instruction at a time, allowing for accuracy and automation at the same time. However, since many instructions involve a number of subroutines, this instruction-wide context introduces complexity challenges. We address those by unfolding definitions of transitions in such a way that their effects can be extracted in an efficient manner.
Our analysis is divided into three steps: (i) rewriting to unfold and simplify instruction definitions, (ii) the actual proof attempt, and (iii) automated counterexample-guided refinement of the labelling in cases where the proof fails.
The framework can, with minor adaptations, be applied to arbitrary HOL4 ISA models. We present benchmarks for ARMv7 and MIPS. With a suitable labelling identified, the median verification time for one ARMv7 instruction is about 40 seconds. For MIPS, the complete analysis took slightly more than one hour and made configuration dependencies explicit that we had not been aware of before.
We report on the following contributions: (i) a backward proof tactic to automatically verify noninterference of HOL4 state transition functions, as used in operational ISA semantics; (ii) the automated identification of sound and accurate labellings; (iii) benchmarks for the ISAs of ARMv7-A and MIPS, based on an SML implementation of the approach.
Processor Models

ISA Models
In recent years, Fox et al. have created ISA models for x86-64, MIPS, several versions of ARM and other architectures [8, 7]. The instruction sets are modelled based on official documentation and on the abstraction level of the programmer's view, thus being agnostic to internals like pipelines. The newest models are produced in the domain-specific language L3 [7] and can be exported to the interactive theorem prover HOL4. Our analysis targets those purely functional HOL4 models for single-core systems. An ISA is formalized as a state transition system, with the machine state represented as a record structure (on memory, registers, operational modes, control flags, etc.) and the operational semantics as functions (or transitions) on such states. The top-level transition NEXT advances the CPU by one instruction. While L3 also supports export to HOL4 definitions in monadic style, we focus our work on the standard functional representation based on let-expressions. States resulting from an unpredictable (i.e., underspecified) operation are tagged with an exception marker (see Section 7 for a discussion).
Notation
A state $s = \{C_1 := c_1, C_2 := c_2, \dots\}$ is a record, where the fields $C_1, C_2, \dots$ depend on the concrete ISA. As a naming convention, we use $R_i$ for fields that are records themselves (such as control registers) and $F_i$ for fields of a function/mapping type (such as general purpose register sets
Memory Management
For simplicity, our analysis focuses on core-internal flows (e.g., between registers) and abstracts away from the concrete behaviour of the memory subsystem (including address translation, memory protection, caching, peripherals, buses, etc.). Throughout the course of the (otherwise core-internal) analysis, a contract on the memory subsystem is assumed that then allows reasoning about global properties. The core can communicate with the memory subsystem through an interface, but never directly accesses its internal state. The interface expects inputs like the type of access (read, fetch, write, ...), the virtual address, the privilege state of the processor, and other parameters. It updates the state of the memory subsystem and returns a success or error message along with possibly read data. While being agnostic about the concrete behaviour of the memory subsystems, we assume that there is a secure memory configuration $P_m$, restricting unprivileged accesses, e.g., through page table settings. Furthermore, we assume the existence of a low-equivalence relation $R_m$ on pairs of memory subsystems.
Typically, two memories in $R_m$ would agree on memory content accessible in an unprivileged processor mode. When in unprivileged processor mode and starting from secure memory configurations, transitions on memory subsystems are assumed to maintain both the memory relation and secure configurations. Consider an update of state $s$ assigning the sum of the values of register y and the memory at location a to register x, slightly simplified: s[x := s.y + read(a, s.mem)]. Since read, as a function of the memory interface, satisfies the constraints above, for two pre-states $s_1$ and $s_2$ satisfying $P_m\,s_1.\mathrm{mem} \land P_m\,s_2.\mathrm{mem} \land R_m(s_1.\mathrm{mem}, s_2.\mathrm{mem})$, we can infer that read will return the same value or error. Overall, with preconditions met, two states that agree on x, y, and the low parts of the memory before the computation will also agree after the computation. That is, as long as read fulfils the contract, the analysis of the core (and in the end the global analysis) does not need to be concerned with details of the memory subsystem.
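The register update discussed above can be played through on a toy model. The following is a minimal sketch, assuming a memory subsystem split into a user-readable part and a protected part; the dictionary layout and the names `P_m`, `R_m`, `read`, `step` and `low_equiv` are illustrative stand-ins, not the HOL4 definitions.

```python
# Toy model of the memory contract; all names are illustrative assumptions.

def P_m(mem):
    # Secure configuration: no address is both user-readable and protected.
    return not (mem["low"].keys() & mem["high"].keys())

def R_m(mem1, mem2):
    # Low-equivalence of memories: agreement on user-readable content.
    return mem1["low"] == mem2["low"]

def read(addr, mem):
    # Interface call for an unprivileged read; None models an error reply.
    return mem["low"].get(addr)

def step(s, a):
    # s[x := s.y + read(a, s.mem)]; an error leaves the state unchanged.
    value = read(a, s["mem"])
    return s if value is None else {**s, "x": s["y"] + value}

def low_equiv(s1, s2):
    # Agreement on x, y and the low part of the memory.
    return (s1["x"] == s2["x"] and s1["y"] == s2["y"]
            and R_m(s1["mem"], s2["mem"]))

# Two pre-states that differ only in protected memory content.
s1 = {"x": 0, "y": 5, "mem": {"low": {0x10: 7}, "high": {0x80: 1}}}
s2 = {"x": 0, "y": 5, "mem": {"low": {0x10: 7}, "high": {0x80: 2}}}
```

Calling `step` on both states with the same address preserves `low_equiv`, whether the address is readable or triggers the error reply, which is the property the core analysis relies on.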
ISA Information Flow Analysis
Objectives
Consider an ISA model with an initial specification determining some preconditions (e.g., on the privilege mode) and some system components, typically only the program counter, that are to be regarded as observable (or low) by some given actor. If there is information flow from some other component (say, a control register) to some of these initially-low components, this other component must be regarded as observable too for noninterference to hold. The objective of the analysis is to identify all these other components that are observable due to their direct or indirect influence on the given low components.
A labelling L assigns to each atomic component (component without subcomponents) a label, high or low.³ It is sound if it does not mark any component as high that can influence, and hence pass information to, a component marked low. […] the refinement order such that L is sound and refines the initial labelling.
Determining whether a labelling is accurate is generally undecidable. Suppose C(P(x), s.C, 0) is assigned to a low component. Deciding whether C needs to be deemed low requires deciding whether there is some valid instantiation of x, such that P(x) holds, which might not be decidable. However, it appears that in many cases, including those considered here, accurate labellings are feasible.
In our approach we check the necessity of a label refinement by identifying an actual flow from the witness component to some low component. We cannot guarantee that this check always succeeds, for undecidability reasons. If it does not, the tool still tries to refine the low equivalence and a warning that the final relation may no longer be accurate is generated. For the considered case studies the tool always finds an accurate labelling, which is then by construction unique.
³ We have not found a use for ISA security lattices of finer granularity.
Labellings correspond to low-equivalence relations on pairs of states, relations that agree on all low components including the memory relation $R_m$ and leave all other components unrestricted. Noninterference holds if the only components affecting the state or any return value are themselves low. Formally, assume the two pre-states $s_1$ and $s_2$ agree on the low-labelled components, expressed by a low-equivalence relation R on those states. Then, for a given transition Φ and preconditions P, noninterference N(R, P, Φ) holds if after Φ the post-states are again in R and the resulting return values are equal:
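Spelled out as a formula, the property described above can be sketched as follows. This is one reading consistent with the prose, not necessarily the exact formulation of the original: it assumes that a transition maps a state to a pair of return value and post-state, and the projections $\mathrm{ret}$ and $\mathrm{st}$ are names introduced here for illustration.

```latex
N(R, P, \Phi) \;:=\; \forall s_1, s_2.\;
  R(s_1, s_2) \land P(s_1) \land P(s_2) \implies
  R\bigl(\mathrm{st}(\Phi(s_1)),\, \mathrm{st}(\Phi(s_2))\bigr) \land
  \mathrm{ret}(\Phi(s_1)) = \mathrm{ret}(\Phi(s_2))
```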
Preconditions on the starting states can include architecture properties (version number, present extensions, etc.), a secure memory configuration and a specification of the privilege level. In our framework the user defines relevant preconditions and an initial low-equivalence relation $R_0$ for an input ISA. The goal of the analysis is to statically and automatically find an accurate refinement of $R_0$ so that noninterference holds for Φ = NEXT. The analysis yields the final low-equivalence relation, the corresponding HOL4 noninterference theorem demonstrating the soundness of the relation, and a notification of whether the analysis succeeded in establishing a guarantee on the relation's accuracy. The proof search is not guaranteed to terminate successfully, but we have found it robust enough to reliably produce accurate output on ISA models of considerable complexity (see Section 5). We do not treat timing and probabilistic channels and leave safety properties about unmodified components for future work.
Challenges
Our goal is to perform the analysis from an initial, user-supplied labelling on a standard ISA with minimal user interaction. In particular, we wish to avoid user-supplied label annotations and error-prone manual rewrites of the ISA specification that a type-based approach might depend on to eliminate some of the complications specific to ISA models. Instead, we address those challenges with symbolic evaluation and the application of simplification theorems. Since both are available in HOL4, and so are the models, we verify noninterference in HOL4 directly. This also frees us from external preprocessing and soundness proofs, thus minimizing the TCB. Below, we give examples for common challenges.
Representation The functional models that we use represent register sets as mappings. Static type systems for (purely) functional languages [9, 17] need to assign secrecy levels uniformly to all image values, even if a mapping has both public and secret entries. Adaptations of representation and type system might allow more accurate typing for lookups on constant locations. But common lookup patterns on locations represented by variables or complex terms would require a preprocessing that propagates constraints throughout large expressions.
Semantics Unprivileged ARMv7 processes can access the current state of the control register CPSR. The ISA specifies to (i) map all subcomponents of the control register to a 32-bit word and (ii) apply a bitmask to the resulting word.
As a result, the returned value does not actually depend on all subcomponents of the CPSR, even though all of them were referred to in the first step. For accuracy, an actual understanding of the arithmetic is required.
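The cancelling effect of the bitmask can be illustrated on a small example. This is a sketch under loud assumptions: the register layout, the field names and the mask value below are simplified and hypothetical, and do not reproduce the ARMv7 encoding of the CPSR.

```python
# Hypothetical, simplified status register layout (not the ARMv7 encoding).
def pack(cpsr):
    # Step (i): map every subcomponent into one 32-bit word.
    return ((cpsr["N"] << 31) | (cpsr["Z"] << 30) |
            (cpsr["IT"] << 8) | (cpsr["T"] << 5) | cpsr["M"])

USER_MASK = 0xC000001F  # keeps N, Z and M; clears IT and T (illustrative)

def read_status(cpsr):
    # Step (ii): apply the bitmask to the packed word.
    return pack(cpsr) & USER_MASK

a = {"N": 1, "Z": 0, "IT": 0xAB, "T": 1, "M": 0b10000}
b = {**a, "IT": 0x00, "T": 0}   # differs from a only in masked-out fields
c = {**a, "N": 0}               # differs from a in a visible field
```

Here `read_status(a)` and `read_status(b)` coincide although `pack` reads `IT` and `T`, so a purely syntactic dependency analysis would report a flow that the arithmetic cancels out.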
Context-sensitivity Earlier work on ISA information flow [13] deals with ARM's complex operational semantics in a stepwise analysis, focusing on one subprocedure at a time. This allows for a systematic solution, but comes with the risk of insufficient context. For example, when reading from a register, usually two steps are involved: first, the concrete register identifier with respect to the current processor mode is looked up; second, the actual reading is performed.
Analysing the reading operation in isolation is not accurate, since the lack of constraints on the register identifier would require deeming all registers low. In order to include restrictions from the context, [13] required a number of manual proofs. To avoid this, we analyse entire instructions at a time, using HOL4's machinery to propagate constraints.
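The two-step register read can be sketched as follows. The bank table, the mode names and the helper `reachable` are hypothetical and serve only to show why the lookup step constrains the read step.

```python
# Hypothetical banked-register table: (mode, index) -> register identifier.
BANK = {
    ("usr", 0): "r0", ("svc", 0): "r0",
    ("usr", 13): "sp_usr", ("svc", 13): "sp_svc",
}

def lookup(mode, n):
    # Step 1: resolve the concrete register identifier for the mode.
    return BANK[(mode, n)]

def read_reg(regs, ident):
    # Step 2: the actual read; in isolation, ident is unconstrained.
    return regs[ident]

def read(state, n):
    # The composed operation, as an instruction would use it.
    return read_reg(state["regs"], lookup(state["mode"], n))

def reachable(mode):
    # Identifiers that step 2 can ever receive under a mode precondition.
    return {ident for (m, _n), ident in BANK.items() if m == mode}
```

With the precondition that the mode is `"usr"`, `reachable` excludes `sp_svc`, so only the user-visible registers need to be low; without the context of step 1, every key of `regs` would have to be.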
Approach
We are not the first to study (semi-)automated hardware verification using theorem proving. As [5] points out for hardware refinement proofs, a large share of the proof obligations can be discharged by repeated unfolding (rewriting) of definitions, case splits and basic simplification. While easy to automate, these steps lead easily to an increase in complexity. The challenge, thus, is to find efficient and effective ways of rewriting and to minimize case splits throughout the proof.
Our framework traverses the instruction set instruction by instruction, managing a task queue. For each instruction, three steps are performed: (i) rewriting/unfolding to obtain evaluated forms, (ii) attempting to prove noninterference for the instruction, (iii) on failure, using the identified counterexample to refine the low-equivalence relation. This section details those steps. After each refinement, the instructions verified so far are re-enqueued. The steps are repeated until the queue is empty and each instruction has successfully been verified with the most recent low-equivalence relation. Finally, noninterference is shown for NEXT, employing all instruction lemmas, as well as rewrite theorems for the fetch and decode transitions. Soundness is inherited from HOL4's machinery. Accuracy is tracked by the counterexample verification in step (iii).
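The queue discipline described above can be sketched on a toy model in which each instruction is given as an explicit list of flows (destination, sources). This representation, and the functions `prove` and `analyse`, are assumptions made for illustration; the real proof attempt works on HOL4 terms, and the witness check of step (iii) is not modelled here.

```python
from collections import deque

def prove(flows, low):
    # Toy "proof attempt": return None on success, otherwise a component
    # that flows into a low component without being low itself.
    for dest, sources in flows:
        if dest in low:
            missing = sorted(set(sources) - low)
            if missing:
                return missing[0]
    return None

def analyse(isa, initial_low):
    low = set(initial_low)
    queue = deque(isa)          # instruction names still to be verified
    verified = []
    while queue:
        name = queue.popleft()
        culprit = prove(isa[name], low)
        if culprit is None:
            verified.append(name)
            continue
        low.add(culprit)        # refine: downgrade culprit to low
        queue.appendleft(name)  # retry the failed instruction first
        queue.extend(verified)  # re-enqueue everything verified so far
        verified.clear()
    return low

ISA = {
    "mov": [("pc", ["pc"]), ("r0", ["r1"])],
    "bx":  [("pc", ["r0"])],
}
```

Starting from `{"pc"}`, the loop first downgrades `r0` because of `bx`, which invalidates the earlier result for `mov` and leads to `r1` being downgraded as well. Termination follows in this toy setting because the low set only grows and the component set is finite.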
Rewriting towards an Evaluated Form
The evaluated form of instructions is obtained through symbolic evaluation.
Starting from the definition of a given transition, (i) let-expressions are eliminated, (ii) parameters of subtransitions are evaluated (in a call-by-value manner), (iii) the subtransitions are recursively unfolded by replacing them with their respective evaluated forms, (iv) the result is normalized, and (v) in a few cases substituted with an abstraction. Normalization and abstraction are described below. For the first three steps we reuse evaluation machinery from [7] and extend it, mainly to add support for automated subtransition identification and recursion. Preconditions, for example on the privilege level, help to reduce rewriting time and the size of the result. Since they can become invalid during instruction execution, they have to be re-evaluated for each recursive invocation.
Throughout the whole rewriting process, various simplifications are applied, for example on nested conditional expressions, case distinctions, words, and pairs, as well as conditional lifting, which we motivate below. For soundness, all steps produce equivalence theorems.
Step Library The ISA models are provided together with so-called step libraries, specific to every architecture [7]. They include a database of precomputed rewrite theorems, connecting transitions to their evaluated forms.
Those theorems are computed in an automated manner, but are guided manually. Our tool is able to employ them as hints, as long as their preconditions are not too restrictive for the general security analysis. Otherwise, we compute the evaluated forms autonomously. Besides instruction-specific theorems, we use some datatype-specific theorems and general machinery from [7].
Conditional Lifting Throughout the rewriting process, the evaluated forms of two sequential subtransitions might be composed by passing the result of the first transition into the formal parameters of the second. This often leads to terms
However, in order to derive equality properties in the noninterference proof (e.g., $[s_1.C_3 = s_2.C_3] \vdash \gamma(s_1) = \gamma(s_2)$) or to check validity of premises (e.g., $\gamma(s) = 0$), conditional lifting is applied:
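In its generic form, conditional lifting moves a function application inside the branches of a conditional. A sketch of the rule, assuming that $\mathcal{C}(b, x, y)$ denotes "if $b$ then $x$ else $y$" and that $\tau_1$, $\tau_2$ stand for arbitrary state terms (both are notational assumptions made here, and the concrete instance shown in the original may differ):

```latex
\gamma\bigl(\mathcal{C}(b,\, \tau_1,\, \tau_2)\bigr)
  \;=\;
\mathcal{C}\bigl(b,\, \gamma(\tau_1),\, \gamma(\tau_2)\bigr)
```

After lifting, each branch can be simplified or compared separately, at the price of duplicating the context $\gamma$ once per branch.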
To mitigate exponential blow-up, conditional lifting should only be applied where […]
For a state term $\tau$ updating state variable $s$ in the fields $C_1, \dots, C_n$ with the values $c_1, \dots, c_n$, we verify the normalized form in a forward construction (omitting subcomponents here and below for readability; they are treated analogously):
We significantly improve proof performance with the abstraction of complex expressions by showing (1) independently of the concrete $\tau$ and (2) independently of the values of the updates, both those inside $\tau$ and those applied to $\tau$. We obtain $c_1, \dots, c_n$ by similar means to those shown in the lifting example of $\gamma$ above.
In [7], both conditional lifting and normalization are based on the precomputation of datatype-specific lifting and unlifting lemmas for updates. Our procedures are largely independent of record types and update patterns. However, because of the performance benefits of [7], we plan to generalize/automate their normalization machinery or combine both approaches in future work.
Abstracted Transitions Even with normalization, the specification of a transition grows quickly when unfolding complex subtransitions, especially for loops.
We therefore choose to abstract selected subtransitions. To this end, we substitute their evaluated forms with terms that make potential flows explicit, but abstract away from concrete specifications. Let the normalized form of transition Φ be $\varphi\,s = (\beta(s),\ s[C_1 := \gamma_1(s), \dots, C_n := \gamma_n(s)])$. The values of all primitive state updates $\gamma_1(s), \dots, \gamma_n(s)$ on $s$ and the return value $\beta(s)$ of Φ are substituted with new function constants $f_0, f_1, \dots, f_n$ applied to relevant state components actually accessed instead of to the entire state:
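A possible shape of the abstracted form is sketched below. It assumes that $A_0$ collects the state components read by $\beta$ and $A_i$ those read by $\gamma_i$; the tuples $A_i$ and the indices $k_i$ are notation introduced for this sketch and are not taken from the original.

```latex
\varphi\,s \;=\; \bigl(f_0(A_0),\;
  s[C_1 := f_1(A_1), \dots, C_n := f_n(A_n)]\bigr),
\qquad
A_i = (s.C_{i,1}, \dots, s.C_{i,k_i})
```

Under this reading, two pre-states that agree on every component in $A_i$ yield the same value $f_i(A_i)$ regardless of how $f_i$ is defined.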
Except for situations that suggest the need for a refinement of the low-equivalence relation, $f_0, \dots, f_n$ do not need to be unfolded in the further processing of Φ.
Low-equivalence of the post-states can be inferred trivially:
To avoid accuracy losses in cases where φ mentions components that neither return value nor low components actually depend on, we unfold abstractions as a last resort before declaring a noninterference proof as failed.
Backward Proof Strategy
Having computed the evaluated form for an instruction Φ, we proceed with the verification attempt of N(R, P, Φ) through a backward proof, for the user-provided preconditions P and the current low-equivalence relation R. The sound backward proof employs a combination of the following steps:
Conditional Lifting: Especially in order to resolve record field accesses on complex state expressions, we apply conditional lifting in various scopes (record accesses, operators, operands) and degrees of aggressiveness.
Equality of Subexpressions: Let F be a functional component and n and m be two variables ranging over {0, 1, 2}. The equality $C(n = 2,\, 0,\, s_1.F(C(n, a, b, c))) + s_1.F(C(m, a, b, a)) = C(n = 2,\, 0,\, s_2.F(C(n, a, b, c))) + s_2.F(C(m, a, b, a))$ can be established from the premises $s_1.F(a) = s_2.F(a)$ and $s_1.F(b) = s_2.F(b)$ by lifting the distinctions on n and m outwards or, alternatively, by case splitting on n and m. Either way, equality should be established for each summand separately, in order to limit the number of considered cases to 3 + 3 instead of 3 × 3. Doing so in explicit subgoals also helps in discarding unreachable cases, such as the one where c would be chosen. We identify relevant expressions via pre-defined and user-defined patterns.
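The case count can be checked by brute force on a small instance. In this sketch, `cond` and `case` are assumed readings of the two uses of C (a conditional and an indexed selection), and the mappings `F1` and `F2` are made-up values that satisfy the premises while disagreeing at c.

```python
from itertools import product

def cond(b, x, y):        # C(b, x, y): conditional
    return x if b else y

def case(n, *alts):       # C(n, a, b, c): selection by index
    return alts[n]

def summand1(F, n):
    # C(n = 2, 0, F(C(n, a, b, c))): the branch selecting c is unreachable.
    return cond(n == 2, 0, F[case(n, "a", "b", "c")])

def summand2(F, m):
    # F(C(m, a, b, a)): never selects c at all.
    return F[case(m, "a", "b", "a")]

# Two mappings satisfying the premises: equal at a and b, different at c.
F1 = {"a": 1, "b": 2, "c": 3}
F2 = {"a": 1, "b": 2, "c": 99}

# Per-summand reasoning: 3 + 3 cases.
separate = ([summand1(F1, n) == summand1(F2, n) for n in range(3)] +
            [summand2(F1, m) == summand2(F2, m) for m in range(3)])

# Reasoning on the whole sum: 3 x 3 cases.
joint = [summand1(F1, n) + summand2(F1, m) == summand1(F2, n) + summand2(F2, m)
         for n, m in product(range(3), repeat=2)]
```

Both lists contain only `True`, but `separate` needs six checks where `joint` needs nine; the gap widens quickly with more summands or larger ranges.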
Memory Reasoning: Axioms and derived theorems on noninterference properties of the memory subsystem and maintained invariants are applied.
Simplifications: Throughout the whole proof process, various simplifications take effect, for example on record field updates.
Case Splitting: Usually the mentioned steps are sufficient. For a few harder instructions or if the low-equivalence relation requires refinement, we apply case splits, following the branching structure closely.
Evaluation: After the case splitting, a number of more aggressive simplifications, evaluations, and automatic proof tactics are used to unfold remaining constants and to reason about words, bit operations, unusual forms of record accesses, and other corner cases.
Relation Refinement
Throughout the analysis, refinement of the low-equivalence relation is required whenever noninterference does not hold for the instruction currently considered. Counterexamples to noninterference enable the identification of new components to be downgraded to low. When managed carefully, failed backward proofs of noninterference allow such counterexamples to be extracted. However, backward proofs are not complete. Unsatisfiable subgoals might be introduced despite the goal being verifiable. For accuracy, we thus verify the necessity of downgrading a component C before the actual refinement of the relation. To that end, it is sufficient to identify two witness states that fulfil the preconditions P, agree on all components except C, and lead to a violation of noninterference with respect to the analysed instruction Φ and the current (yet to be refined) relation R. We refer to the existence of such witnesses as N:
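The witness condition described above can be written out as follows. This is a sketch, not the original formula: the name $\overline{N}_C$, the quantification over components $C'$, and the projections $\mathrm{ret}$ and $\mathrm{st}$ (return value and post-state of a transition) are assumptions chosen here for illustration.

```latex
\overline{N}_C(R, P, \Phi) \;:=\; \exists s_1, s_2.\;
  P(s_1) \land P(s_2) \land R(s_1, s_2) \land
  \bigl(\forall C' \neq C.\; s_1.C' = s_2.C'\bigr) \land
  \neg\Bigl( R\bigl(\mathrm{st}(\Phi(s_1)),\, \mathrm{st}(\Phi(s_2))\bigr)
      \land \mathrm{ret}(\Phi(s_1)) = \mathrm{ret}(\Phi(s_2)) \Bigr)
```

Since the two states differ at most in $C$, and $C$ is not yet restricted by $R$, the conjunct $R(s_1, s_2)$ is implied by the agreement on all other components; it is kept here only to make the violation of noninterference explicit.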
If such witnesses exist, any sound relation $R'$ refining $R$ will have to contain some restriction on C. With the chosen granularity, that translates to $\forall s_1, s_2 : R'(s_1, s_2) \Rightarrow s_1.C = s_2.C$.
We proceed with the weakest such relation, i.e., $R'(s_1, s_2) := (R(s_1, s_2) \land s_1.C = s_2.C)$. As discussed in Section 3.1, it can be undecidable whether the current relation needs refinement. However, for the models that we analyzed, our framework was always able to verify the existence of suitable witnesses. The identification and verification of new low components consists of three steps:
1. Identification of a new low component. We transform subgoal G on top of the goal stack into a subgoal false with premises extended by ¬G. In this updated list of premises for the pre-states $s_1$ and $s_2$, we identify a premise on $s_1$ which would solve the transformed subgoal by contradiction when assumed for $s_2$ as well. Intuitively, we suspect that noninterference is prevented by the disagreement on components in the identified premise. We arbitrarily pick one such component as candidate for downgrading.
2. Existential verification of the scenario. To ensure that the extended premises alone are not already in contradiction, we prove the existence of a scenario in which all of them hold. We furthermore introduce the additional premise that the two pre-states disagree on the chosen candidate, but agree on all other components. An instantiation satisfying this existential statement is a promising suspect for the set of witnesses for N. The existential proof in HOL4 refines existentially quantified variables with patterns, e.g., symbolic states for state variables, bit vectors for words, and mappings with abstract updates for function variables (allowing the reduction of ∃f : P(f(n)) to ∃x : P(x)). If possible, existential goals are split. Further simplifications include HOL4 tactics particular to existential reasoning, the application of type-specific existential inequality theorems, and simplifications on word and bit operations. If after those steps and automatic reasoning existential subgoals remain, the tool attempts to finish the proof with different combinations of standard values for the remaining existentially quantified variables.
3. Witness verification. We use the anonymous witnesses of the existential statement in the previous step as witnesses for N. After initialisation, the core parts of the proof strategy from the failed noninterference proof are repeated until the violation of noninterference has been demonstrated.
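The fallback of trying combinations of standard values, followed by the check that the chosen states really violate noninterference, can be sketched by explicit enumeration. This is a toy stand-in for the symbolic reasoning in HOL4; the function `find_witnesses`, the state representation as dictionaries and the example transition `phi` are all hypothetical.

```python
from itertools import product

def find_witnesses(phi, pre, low, candidate, components, values=(0, 1)):
    # Try small "standard" values; return two pre-states that differ only
    # in `candidate` and violate noninterference, or None.
    for vals in product(values, repeat=len(components)):
        s1 = dict(zip(components, vals))
        for v in values:
            s2 = {**s1, candidate: v}
            if v == s1[candidate] or not (pre(s1) and pre(s2)):
                continue
            (r1, t1), (r2, t2) = phi(s1), phi(s2)
            if r1 != r2 or any(t1[c] != t2[c] for c in low):
                return s1, s2
    return None

# Toy transition: in mode 0, a hidden flag leaks into r0.
def phi(s):
    return 0, {**s, "r0": s["flag"] if s["mode"] == 0 else 0}

COMPONENTS = ["mode", "flag", "r0", "r1"]
user = lambda s: s["mode"] == 0
```

For the candidate `"flag"` under the precondition `user`, a witness pair exists, so downgrading is necessary. For `"r1"`, or for `"flag"` under a precondition that excludes mode 0, the search returns `None`, which in this finite toy setting shows that downgrading would cost accuracy without being needed.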
In order to keep the analysis focused, it is important to handle case splits before entering the refinement stage. At the same time, persistent case splits can be expensive on a non-provable goal. Therefore, we implemented a depth-first proof tactical, which introduces hardly any performance overhead on successful proofs, but fails early in cases where the proof strategy does not succeed. Furthermore, whenever case splits become necessary in the proof attempt, the framework strives to diverge early, prioritizing case splits on state components.
Evaluation
We applied our framework to analyse information flows on ARMv7-A and MIPS-III (64-bit RS4000). For ARM, we focus on user mode execution without security or virtualization extension. Since unprivileged ARM code is able to switch between several instruction sets (ARM, Thumb, Thumb2, ThumbEE), the information flow analysis has to be performed for all of them. For MIPS, we consider all three privilege modes (user, kernel, and supervisor).
Table 1. Identified flows (model components might deviate from physical systems)
Table 1 shows the initial and accurate final low-equivalence relations for the two ISAs with different configurations. All relations refine the memory relation.
The final relation column only lists components not already restricted by the corresponding initial relations. For simplicity, the initial relation for MIPS restricts three components accessed on the highest level of NEXT. The corresponding table cell also lists components already restricted by the preconditions. Initially unaware of the privilege management in MIPS, we were surprised that our tool first yielded the same results for all MIPS processor modes and that even user processes can read the entire state of system coprocessor CP0, which is responsible for privileged operations such as the management of interrupts, exceptions, or contexts. To restrict user privileges, the CU0 status flag must be cleared (see last line of the table). While ARMv7 processes in user mode cannot read from banked registers of privileged modes, they can infer the state of various control registers.
Alignment control register flags (CP15.SCTLR.A/U in ARMv7) are a good example of implicit flows in CPUs. Depending on their values, an unaligned address will either be accessed as is, forcibly aligned, or cause an alignment fault. Table 2 shows the time that rewriting, instruction proofs (including relation refinement), and the composing proof for NEXT took on a single Xeon X3470 core. The first benchmark for MIPS refers to unrestricted user mode (with similar times as for kernel and supervisor mode), the second one to restricted user mode. Even though we borrowed a few data type theorems and some basic machinery from the step library, we did not use instruction-specific theorems for the MIPS verification […]
Table 3. Performance ARMv7
[…] proof of L3 compared to 2080 lines [7], the specifications of the ARMv7 instructions are both larger and more complex. Consequently, we observed a remarkable difference in performance. However, as Table 3 shows, minimum, median, and mean processing times (given in seconds) for the ARM instructions are actually moderate throughout all steps (rewriting, successful and failed noninterference proofs, and relation refinement). Merely a few complex outliers are responsible for the high verification time of the ARM ISA. While we believe that optimizations and parallelization could significantly improve performance, those outliers still demonstrate the limits of analyzing entire instructions as a whole. Combining our approach with compositional solutions such as [13] could overcome this remaining challenge. We leave this for future work.
Related Work
While most work on processor verification focuses on functional correctness [4, 5, 21] and ignores information flow, we survey hardware noninterference, both for special separation hardware and for general purpose hardware.
Noninterference Verification for Separation Hardware Wilding et al. [24] verify noninterference for the partitioning system of the AAMP7G microprocessor. The processor can be seen as a separation kernel in hardware, but lacks, for example, user-visible registers. Security is first shown for an abstract model, which is later refined to a more concrete model of the system, comprising about 3000 lines of ACL2. The proof appears to be performed semi-automatically.
SAFE is a computer system with hardware operating on tagged data [2] .
Noninterference is first proven for a more abstract machine model and then transferred to the concrete machine by refinement. The proof in Coq does not seem to involve much automation.
Sinha et al. [20] […] From our own experience on ISA level, the bottleneck is mainly constituted by the preprocessing to obtain the model's evaluated form and by the identification of a suitable labelling. The actual verification is comparatively fast.
In earlier work [13] we described a HOL4 proof for the noninterference (and other isolation properties) of a monadic ARMv7-model. A compositional approach based on proof rules was used to support a semi-automatic analysis.
However, due to insufficient context, a number of transitions had to be verified manually or with the support of context-enhancing proof rules. In the present work, we overcome this issue by analysing entire instructions. Furthermore, our new analysis exhibits the low-equivalence relation automatically, while [13] provides it as fixed input. Finally, the framework described in the present paper is less dependent on the analysed architecture.
Verification of Binaries Fox's ARM model is also used to automatically verify security properties of binary code. Balliu et al. [3] do this for noninterference, Tan et al. [22] for safety properties. Despite the seeming similarities, ISA analysis and binary code analysis differ in many respects. While binary verification considers concrete assembly instructions for (partly) known parameters, ISA analysis has to consider all possible assembly instructions for all possible parameters. On the other hand, it is sufficient for an ISA analysis to do this for each instruction in isolation, while binary verification usually reasons on a sequence (or a tree) of instructions. In effect, that makes the verification of a binary program an analysis on imperative code. In contrast, ISA analysis (in our setting) is really concerned with functional code, namely the operational semantics that describe the different steps of single instructions. In either case, to enable full automation, both analyses have to include a broader context when the local context is not sufficient to verify the desired property for a single step in isolation. As discussed above, we choose an instruction-wide context from the beginning. Both [3] and [22] employ a more local reasoning. In [22] a Hoare-style logic is used and context is provided by selective synchronisation of pre- and postconditions between neighbouring code blocks. In [3] a forward symbolic analysis carries the context in a path condition when advancing from instruction to instruction. SMT solvers then allow symbolic states with non-satisfiable paths to be discarded.
7 Discussion on Unpredictable Behaviour
In addition, newer versions still model a reasonable behaviour for such cases, but there is no guarantee that the manufacturer chooses the same behaviour.
A physical implementation might include flows from more components than the model does, or vice versa. A more conservative analysis like ours takes state changes after model exceptions into account, but can still miss flows simply not specified. To the rescue might come statements from processor designers like ARM that unpredictable behaviour must not represent security holes.⁵ In one
