A Verified Architecture for Proofs of Execution on Remote Devices under
  Full Software Compromise by Nunes, Ivan De Oliveira et al.
A Verified Architecture for Proofs of Execution on
Remote Devices under Full Software Compromise
Ivan De Oliveira Nunes∗, Karim Eldefrawy†, Norrathep Rattanavipanon∗, Gene Tsudik∗
∗University of California, Irvine
†SRI International
{ivanoliv, nrattana, gene.tsudik}@uci.edu, karim.eldefrawy@sri.com
Abstract—Modern society is increasingly surrounded by, and
accustomed to, a wide range of Cyber-Physical Systems (CPS),
Internet-of-Things (IoT), and smart devices. They often perform
safety-critical functions, e.g., personal medical devices, auto-
motive CPS and industrial automation (smart factories). Some
devices are small, cheap and specialized sensors and/or actuators.
They tend to run simple software and operate under control of a
more sophisticated central control unit. The latter is responsible
for the decision-making and orchestrating the entire system.
If devices are left unprotected, consequences of forged sensor
readings or ignored actuation commands can be catastrophic,
particularly in safety-critical settings. This prompts the following
three questions: (1) How to trust data produced by a simple remote
embedded device? (2) How to ascertain that this data was
produced via execution of expected software? (3) Is it possible
to attain (1) and (2) under the assumption that all software on
the remote device could be modified or compromised?
In this paper we answer these questions by designing, proving
security of, and formally verifying, VAPE: Verified Architecture
for Proofs of Execution. To the best of our knowledge, this is the
first of its kind result for low-end embedded systems. Our work
has a range of applications, especially, to authenticated sensing
and trustworthy actuation, which are increasingly relevant in the
context of safety-critical systems. VAPE architecture is publicly
available and our evaluation demonstrates that it incurs low
overhead, affordable even for lowest-end embedded devices, e.g.,
those based on MSP430 or AVR ATMega processors.
I. INTRODUCTION
The number and variety of special-purpose computing
devices has been increasing dramatically. This includes all
kinds of embedded devices, cyber-physical systems (CPS) and
Internet-of-Things (IoT) gadgets, utilized in various “smart”
or instrumented settings, such as homes, offices, factories,
automotive systems and public venues. Tasks performed by
these devices are often safety-critical. For example, a typical
industrial control system depends on physical measurements
(e.g., temperature, pressure, humidity, speed) reported by
sensors, and on actions taken by actuators, e.g., turning on
the A/C, sounding an alarm, or reducing speed.
A cyber-physical control system is usually composed of
multiple sensors and actuators, in the form of low-cost micro-
controller units (MCUs). Such devices run simple software
(often on "bare metal") and operate under control of a re-
mote central control unit. Despite their potential importance
to overall system functionality, low-end MCUs are typically
designed to minimize cost, size and energy consumption, e.g.,
TI MSP430.
Therefore, their architectural security is often primitive
or non-existent, thus making them vulnerable to malware
infestations and other malicious software modifications. A
compromised MCU can, for instance, spoof sensed quan-
tities or ignore actuation commands leading to potentially
catastrophic results. For example, in a smart city, large-
scale erroneous reports of electricity consumption by smart
meters might lead to power outages. A medical device that
returns incorrect values when queried by a remote physician
might result in a wrong drug being prescribed to a patient.
A compromised car engine temperature sensor that reports
incorrect (low) readings can lead to undetected overheating
and major damage. However, despite the very real risks of
remote software compromise, most people tend to believe that
these devices execute the expected software and thus perform
their expected function.
In this paper, we argue that Proofs of Execution (PoX)
are both important and necessary for securing low-end MCUs.
Specifically, we demonstrate in Section VII that PoX schemes
can be used to construct sensors and actuators that “cannot
lie”, even under the assumption of full software compromise.
In a nutshell, a PoX conveys that the remote (and possibly
compromised) device really executed specific software, and
all execution results are authenticated and cryptographically
bound to this execution. This functionality is similar to
authenticated outputs that can be produced by software execution in
SGX-like architectures [1], [2]. However, such architectures
are comparatively heavy-weight and unsuitable for low-end
MCUs; see Section I-A for further details on targeted devices.
One of our main building blocks in designing PoX schemes
is Remote Attestation (RA). Basically, RA is a means to detect
malware on a remote low-end MCU. It allows a trusted verifier
(Vrf) to remotely measure memory contents (or software state)
of an untrusted embedded device (Prv). RA is usually realized
as a 2-message challenge-response protocol:
1) Vrf sends an attestation request containing a challenge
(Chal) to Prv. It might also contain a token derived from
a secret (shared by Vrf and Prv) that allows Prv to
authenticate Vrf.
2) Prv receives the attestation request and computes an au-
thenticated integrity check over its memory and Chal. The
memory region might be either pre-defined, or explicitly
specified in the request.
3) Prv returns the result to Vrf.
4) Vrf receives the result, and checks whether it corresponds
to a valid memory state.
The authenticated integrity check is typically realized as a
Message Authentication Code (MAC) computed over Prv
memory. We overview one concrete RA architecture in Sec-
tion III.
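The four protocol steps above can be illustrated with a minimal sketch. It assumes HMAC-SHA256 as the MAC, a key pre-shared by Vrf and Prv, and a fixed-length challenge. The function names prv_attest and vrf_check are hypothetical and are not taken from any RA architecture discussed in this paper.

```python
import hashlib
import hmac
import os


def prv_attest(key: bytes, chal: bytes, memory: bytes) -> bytes:
    """Prv side (step 2): MAC over the challenge and the attested memory.

    Assumption: chal has a fixed length, so chal + memory is unambiguous.
    """
    return hmac.new(key, chal + memory, hashlib.sha256).digest()


def vrf_check(key: bytes, chal: bytes, expected_memory: bytes, report: bytes) -> bool:
    """Vrf side (step 4): recompute the MAC over the expected memory state."""
    expected = hmac.new(key, chal + expected_memory, hashlib.sha256).digest()
    return hmac.compare_digest(expected, report)


if __name__ == "__main__":
    key = os.urandom(32)            # illustrative pre-shared secret
    chal = os.urandom(16)           # step 1: fresh challenge chosen by Vrf
    good = b"\x01\x02\x03\x04"      # memory contents Vrf expects
    bad = b"\xff\x02\x03\x04"       # memory contents after a modification
    print(vrf_check(key, chal, good, prv_attest(key, chal, good)))
    print(vrf_check(key, chal, good, prv_attest(key, chal, bad)))
```

In this sketch a report computed over modified memory, or over a different challenge, fails verification. Protecting the key from software on Prv is exactly what an RA architecture must add and is not modeled here.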
Despite major progress and several proposals for RA archi-
tectures with different assumptions and guarantees [3]–[17],
RA alone is insufficient to obtain proofs of execution.
RA allows Vrf to ascertain integrity of software residing in
Prv attested memory region. However, RA by itself offers
no guarantee that malware is not present elsewhere in Prv
memory. It also does not guarantee that the attested software
is ever executed or that any such execution completes
successfully. Even if the attested software is executed, there is
no guarantee that it has not been modified (e.g., by malware
residing elsewhere in memory) in the time between its execution
and its attestation. This phenomenon is well known as the RA
Time-Of-Check-Time-Of-Use (TOCTOU) problem. Finally,
RA does not guarantee authenticity and integrity of any output
produced by the execution of the attested software.
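The TOCTOU problem can be made concrete with a toy model. The sketch below is an illustration under simplifying assumptions (memory is a Python dictionary, attestation is a plain HMAC-SHA256, and all names are hypothetical): malware swaps the code before execution and restores it before attestation, so the report still verifies.

```python
import hashlib
import hmac

KEY = b"\x00" * 32          # placeholder key, for illustration only
GOOD_CODE = b"read_sensor"  # code that Vrf expects on Prv
BAD_CODE = b"report_fake"   # code that malware wants to run instead


def attest(memory: dict, chal: bytes) -> bytes:
    """Measure the code region as it is at the time of the call."""
    return hmac.new(KEY, chal + memory["code"], hashlib.sha256).digest()


def toctou_run(chal: bytes):
    """Return (code that actually ran, attestation report)."""
    memory = {"code": GOOD_CODE}
    memory["code"] = BAD_CODE    # malware overwrites the code region
    executed = memory["code"]    # the modified code is what executes
    memory["code"] = GOOD_CODE   # malware restores the original code
    report = attest(memory, chal)  # attestation sees only the restored state
    return executed, report
```

The report returned by toctou_run equals the report of an untouched device, even though BAD_CODE is what ran. RA alone therefore says nothing about what was executed.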
To bridge this gap, we design and implement VAPE:
Verified Architecture for Proofs of Execution. In addition to
RA, VAPE allows Vrf to request an unforgeable proof that
the attested software executed successfully and (optionally)
produced certain authenticated output. These guarantees
hold even in case of full software compromise on Prv. Our
intended contributions are:
– New security service: we design and implement VAPE
for unforgeable remote proofs of execution (PoX). VAPE is
built on top of VRASED [17], a formally verified hybrid RA
architecture. VAPE overhead vis-a-vis VRASED is small and,
to the best of our knowledge, it is the first security architecture
for proofs of remote software execution on low-end devices.
– Provable security & implementation verification: We
prove that the composition of VRASED with VAPE yields a
secure PoX architecture. All security properties expected from
VAPE are formally specified using Linear Temporal Logic
(LTL) and VAPE modules are verified to adhere to these
properties.
– Evaluation, publicly available implementation and ap-
plications: VAPE was implemented on a real-world low-end
MCU (TI MSP430) and deployed using commodity FPGAs.
Its design (along with verification) is publicly available at
[18]. Our evaluation demonstrates low hardware overhead,
which we consider affordable even for low-end MCUs. The
implementation is accompanied by a sample PoX application
(see Section VII). In particular, we use VAPE to construct
trustworthy safety-critical devices. On such a device, even if
malware has full control of the software, it cannot spoof
measurements (or fake performing actuation) without detection.
A. Targeted Devices & Scope
This work focuses on CPS/IoT sensors and actuators with
relatively low computing power. These are some of the lowest-
end devices based on low-power single core MCUs with only
a few KBytes of program and data memory. A representative
of this class of devices is the Texas Instruments MSP430
MCU family [19]. It has a 16-bit word size, resulting in
≈ 64 KBytes of addressable memory. SRAM is used as
data memory and its size ranges between 4 and 16KBytes
(depending on the specific MSP430 model), while the rest of
the address space is used for program memory, e.g., ROM and
Flash. MSP430 is a von Neumann architecture processor with
common data and code address spaces. It has no memory
management unit (MMU) and thus no support for virtual memory.
Instead, MSP430 accesses memory directly by
physical address. Multiple memory accesses can be performed
within a single instruction; its instruction execution
time varies from 1 to 6 clock cycles, and instruction length
varies from 16 to 48 bits. MSP430 was designed for low
power and low cost. It is widely used in many application
domains, e.g., automotive industry, utility meters, as well as
consumer devices and computer peripherals. Our choice is also
motivated by availability of a well-maintained open-source
MSP430 hardware design from Open Cores [20]. Nevertheless,
our machine model is applicable to other low-end MCUs in
the same class as MSP430 (e.g., Atmel AVR ATMega).
B. Organization
Section II discusses related work on remote attestation,
formal verification of security services and control flow at-
testation. Section III provides some background on automated
verification, and on VRASED’s RA architecture. Section IV
introduces Proofs of Execution (PoX), followed by a realiza-
tion thereof in Section V, including technical details of VAPE
design, as well as the adversarial model and assumptions.
Section VI presents VAPE’s formal verification. Next, in
Section VII, we describe how to use VAPE to implement
authenticated sensing/actuation. Section VIII concludes the
paper with a summary of results.
II. RELATED WORK
Remote Attestation (RA)– architectures can be divided into
three categories: hardware-based, software-based, or hybrid.
Hardware-based [21]–[23] relies on dedicated secure hardware
components, e.g., Trusted Platform Modules (TPMs) [24].
However, the cost of such hardware is normally prohibitive
for low-end IoT/CPS devices. Software-based attestation [25]–
[27] requires no hardware security features but imposes strong
security assumptions about communication between Prv and
Vrf, which are unrealistic in the IoT/CPS ecosystem (although
it is the only choice for legacy devices). Hybrid RA [8], [9],
[28]–[30] aims to achieve security equivalent to hardware-
based mechanisms at minimal cost. It thus entails minimal
hardware requirements while relying on software to reduce
overall complexity and RA footprint on Prv.
The first hybrid RA architecture – SMART [6] – acknowl-
edged the importance of executing code on Prv, in addition
to just attesting Prv’s memory. Using an attest-then-execute
approach, SMART attempted to achieve software execution
guarantees by specifying the address of the first instruction to
be executed after completion of attestation. We consider this
to be a best-effort approach which merely guarantees that the
code will start executing. It does not guarantee that execution
completes successfully. For example, SMART’s approach can
not detect if execution is interrupted and never resumed. It also
can not detect when a reset (e.g., due to software bugs, or Prv
running low on power) happens in the middle of execution,
preventing its completion. Furthermore, direct memory access
(DMA) may happen during execution and it can modify the
code being executed or its output. In other words, SMART
offers no guarantees beyond “invoking the executable”.
Another notable RA architecture is TrustLite [7], which
builds upon SMART to allow secure interrupts. However,
TrustLite does not enforce temporal consistency of attested
memory, and is thus conceptually vulnerable to self-relocating
malware and memory modification during attestation [31].
Consequently, it is challenging to derive a secure PoX scheme from
TrustLite. Several other prominent low-to-medium-end RA
architectures – e.g., SANCUS [11], HYDRA [9], and TyTaN [8]
– do not offer PoX. In this paper, we show that the
execute-then-attest approach, built on top of a temporally consistent
RA architecture, provides unforgeable proofs of execution that
are produced only if execution completes successfully.
Control Flow Attestation (CFA)– In contrast with RA, which
measures Prv’s software integrity, CFA techniques [32]–[35]
provide Vrf with a measurement of the exact control flow path
taken during execution of specific software on Prv. Such a
measurement allows Vrf to detect run-time attacks. We believe
that it is possible to construct a PoX scheme that relies on CFA
to produce proofs of execution based on the attested control
flow path. However, in this paper, we advocate a different
approach – specific for proofs of execution – for two main
reasons:
• CFA requires heavy-weight hardware (e.g., TrustZone
in [32], branch monitor and hash engine in [33], [35]) to
attest executed instructions in real time, along with mem-
ory addresses and the program counter. Such hardware
components are not viable for low-end devices, since their
cost (in terms of price, size, and energy consumption)
is typically higher than the cost of a low-end MCU
itself. For example, the cheapest Trusted Platform Module
(TPM) [24] is about 10× more expensive than the MSP430
MCU itself¹. As discussed in Appendix VIII, current CFA
architectures are also more expensive than the MCU [20]
itself.
• CFA assumes that Vrf can enumerate a large (potentially
exponential) number of valid control flow paths for a
given program, and verify a valid response for each.
This burden is unnecessary for determining if a proof of
execution is valid, because one does not need to know the
exact execution path in order to determine if execution
occurred (and terminated) successfully.
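To give a sense of scale for the second point, consider a hypothetical loop-free program with n independent two-way branches. This program shape is an assumption made purely for illustration. Every combination of taken and not-taken outcomes is a distinct valid path, so there are 2^n of them:

```python
from itertools import product


def enumerate_paths(num_branches: int):
    """All taken/not-taken outcome sequences for independent two-way branches."""
    return list(product((False, True), repeat=num_branches))


if __name__ == "__main__":
    for n in (4, 8, 16):
        print(n, len(enumerate_paths(n)))
```

Loops and data-dependent jumps only enlarge this set. A verifier that merely needs to know whether execution occurred and terminated does not need it at all.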
¹ Source: https://www.digikey.com/
Instead of relying on CFA, our work introduces the concepts
of ephemeral immutability and ephemeral atomicity. We
use them to show how to construct a provably secure PoX
architecture. Our VAPE architecture is non-invasive (it does not
modify MCU behavior and semantics) and has low hardware
overhead (around 2% for registers and 12% for LUTs). Also,
Vrf is not required to enumerate valid control flow graphs and
the verification burden of PoX is exactly the same as the effort
to verify a typical RA response for the same code.
Formally Verified Security Services– In recent years, several
efforts focused on formally verifying security-critical systems.
In terms of cryptographic primitives, Hawblitzel et al. [36]
verified implementations of SHA, HMAC, and RSA. Bond
et al. [37] verified an assembly implementation of SHA-256,
Poly1305, AES and ECDSA. Zinzindohoué et al. [38] developed
HACL*, a verified cryptographic library containing the
entire cryptographic API of NaCl [39]. Larger security-critical
systems have also been successfully verified. Bhargavan [40]
implemented the TLS protocol with verified cryptographic
security. CompCert [41] is a C compiler that is formally
verified to preserve C code semantics in generated assembly
code. Klein et al. [42] designed and proved functional correct-
ness of the seL4 microkernel. More recently, VRASED [17]
realized a verified hybrid RA architecture. VAPE architecture
proposed in this paper builds upon VRASED’s formally verified
properties (see Section III-B for details) and adds additional
properties to obtain PoX. Our implementation is also formally
verified to guarantee such properties.
III. BACKGROUND
This section provides some background on formal verifica-
tion and overviews VRASED.
A. Formal Verification, Model Checking & Linear Temporal
Logic
Computer-aided formal verification typically involves three
basic steps. First, the system of interest (e.g., hardware,
software, communication protocol) is described using a formal
model, e.g., a Finite State Machine (FSM). Second, properties
that the model should satisfy are formally specified. Third, the
system model is checked against formally specified properties
to guarantee that the system retains them. This can be achieved
via either Theorem Proving or Model Checking. In this work,
we use the latter to verify the implementation of system
modules, and the former to derive new properties from sub-
properties that were proved for the modules’ implementation.
In one instantiation of model checking, properties are speci-
fied as formulae using Temporal Logic (TL) and system mod-
els are represented as FSMs. Hence, a system is represented
by a triple (S, S0, T ), where S is a finite set of states, S0 ⊆ S
is the set of possible initial states, and T ⊆ S × S is the
transition relation set – it describes the set of states that can
be reached in a single step from each state. The use of TL
to specify properties allows representation of expected system
behavior over time.
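As a rough illustration of the triple (S, S0, T), the sketch below explores the reachable states of a small FSM and checks an invariant on all of them, which corresponds to a property of the form G p. It is a toy explicit-state search with hypothetical names, not a description of how any particular model checker is implemented.

```python
from collections import deque


def reachable(initial, transitions):
    """Return the set of states reachable from any state in S0 via T."""
    successors = {}
    for src, dst in transitions:
        successors.setdefault(src, set()).add(dst)
    seen = set(initial)
    queue = deque(initial)
    while queue:
        state = queue.popleft()
        for nxt in successors.get(state, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def check_invariant(initial, transitions, invariant):
    """True iff the invariant holds in every reachable state."""
    return all(invariant(s) for s in reachable(initial, transitions))


# Toy system: S = {idle, run, reset, bad}, S0 = {idle}.
S0 = {"idle"}
T = {("idle", "run"), ("run", "idle"), ("run", "reset"),
     ("reset", "idle"), ("bad", "idle")}
```

Here the state "bad" has an outgoing transition but no incoming one from a reachable state, so the invariant "the system is never in bad" holds.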
We apply the model checker NuSMV [43], which can be
used to verify generic HW or SW models. For digital hardware
described at Register Transfer Level (RTL) – which is the
case in this work – conversion from Hardware Description
Language (HDL) to NuSMV model specification is simple.
Furthermore, it can be automated [44], because the standard
RTL design already relies on describing hardware as an FSM.
In NuSMV, properties are specified in Linear Temporal
Logic (LTL), which is particularly useful for verifying se-
quential systems, since LTL extends common logic statements
with temporal clauses. In addition to propositional connectives,
such as conjunction (∧), disjunction (∨), negation (¬), and
implication (→), LTL includes temporal connectives, thus
enabling sequential reasoning. In this paper, we are interested
in the following temporal connectives:
• Xφ – neXt φ: holds if φ is true at the next system state.
• Fφ – Future φ: holds if there exists a future state where
φ is true.
• Gφ – Globally φ: holds if for all future states φ is true.
• φ U ψ – φ Until ψ: holds if there is a future state where
ψ holds and φ holds for all states prior to that.
• φ B ψ – φ Before ψ: holds if the existence of state where
ψ holds implies the existence of an earlier state where φ
holds. This connective can be expressed using U through
the equivalence: φ B ψ ≡ ¬(¬φ U ψ).
This set of temporal connectives combined with propositional
connectives (with their usual meanings) allows us to specify
powerful rules. NuSMV works by checking LTL specifications
against the system FSM for all reachable states in such FSM.
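The connectives above can be explored with a small evaluator. The sketch below makes a simplifying assumption: it interprets formulae over finite traces (lists of sets of atomic propositions), whereas LTL is defined over infinite executions and NuSMV reasons about the FSM itself. All function names are hypothetical.

```python
def atom(name):
    """Atomic proposition: holds at position i if name is in trace[i]."""
    return lambda trace, i: name in trace[i]


def neg(f):
    return lambda trace, i: not f(trace, i)


def nxt(f):       # X f
    return lambda trace, i: i + 1 < len(trace) and f(trace, i + 1)


def future(f):    # F f
    return lambda trace, i: any(f(trace, j) for j in range(i, len(trace)))


def globally(f):  # G f
    return lambda trace, i: all(f(trace, j) for j in range(i, len(trace)))


def until(f, g):  # f U g
    def holds(trace, i):
        for j in range(i, len(trace)):
            if g(trace, j):
                return True   # g reached, and f held at every earlier step
            if not f(trace, j):
                return False  # f failed before g was reached
        return False          # g never holds on this finite trace
    return holds


def before(f, g):  # f B g, defined as not (not f U g)
    return neg(until(neg(f), g))
```

For example, on the trace [{a}, {a}, {b}] both a U b and a B b hold at position 0, while on [{}, {b}] the formula a B b fails because b occurs without any earlier a.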
B. VRASED Architecture
VRASED [17] is a formally verified hybrid (hardware/-
software co-design) RA architecture. It was built as a set
of sub-modules; each guaranteeing a specific set of sub-
properties. All VRASED sub-modules, both hardware and
software, are individually verified. Finally, the composition
of all sub-modules is proved to achieve formal definitions of
RA soundness and security. RA soundness guarantees that
an integrity-ensuring function (HMAC in VRASED’s case) is
correctly computed on the memory being attested. Moreover,
it guarantees that attested memory can not be modified after
the start of RA computation, protecting against “hide-and-
seek” attacks caused by self-relocating malware [31]. RA
security ensures that RA execution generates an unforgeable
authenticated memory measurement and that the secret key
K used in computing this measurement is not leaked before,
during, or after, attestation.
Figure 1 illustrates VRASED architecture. To achieve afore-
mentioned goals, VRASED’s software (SW-Att in Figure 1) is
stored in Read-Only Memory (ROM) and relies on a formally
verified HMAC implementation from HACL* cryptographic
library [38]. A typical execution of SW-Att is carried out as
follows:
1) Read challenge Chal from memory region MR.
2) Derive a one-time key from Chal and the attestation
master key K.
3) Generate an attestation token H by computing an HMAC
over an attested memory region AR using the derived
key:
H = HMAC(KDF(K, MR), AR)
4) Write H into MR and return execution to unprivileged
software, i.e., normal applications.
Fig. 1. VRASED architecture (from [17]). [Figure omitted; labels: MCU core, memory backbone, HW-Mod, SW-Att, K, SW-Att stack (XS), application-available RAM, application code, ROM, RAM, FLASH; signals: PC, irq, Ren, Wen, Daddr, DMAen, DMAaddr, reset.]
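The four SW-Att steps listed above can be sketched in a few lines. This is an illustration only: it assumes HMAC-SHA256 for the HMAC, instantiates KDF as HMAC-SHA256 keyed with K over the challenge (the text does not specify KDF), and models MR as a 32-byte buffer. The name sw_att is hypothetical, and none of the hardware protections are modeled.

```python
import hashlib
import hmac


def kdf(master_key: bytes, chal: bytes) -> bytes:
    """Assumed KDF: HMAC-SHA256 keyed with K over the challenge."""
    return hmac.new(master_key, chal, hashlib.sha256).digest()


def sw_att(master_key: bytes, mr: bytearray, ar: bytes) -> None:
    """Compute H = HMAC(KDF(K, MR), AR) and leave it in MR."""
    chal = bytes(mr)                                  # step 1: read Chal from MR
    one_time_key = kdf(master_key, chal)              # step 2: derive one-time key
    token = hmac.new(one_time_key, ar, hashlib.sha256).digest()  # step 3
    mr[:] = token                                     # step 4: write H into MR


if __name__ == "__main__":
    k = bytes(range(32))                 # illustrative master key
    mr = bytearray(b"\xaa" * 32)         # MR initially holds the challenge
    sw_att(k, mr, b"attested region contents")
    print(mr.hex())
```

Because the one-time key depends on Chal, a token computed for one challenge is not valid for another.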
VRASED’s Hardware (HW-Mod in Figure 1) monitors 7 MCU
signals:
• PC: Current Program Counter value;
• Ren: Signal that indicates if the MCU is reading from
memory (1-bit);
• Wen: Signal that indicates if the MCU is writing to
memory (1-bit);
• Daddr: Address for an MCU memory access;
• DMAen: Signal that indicates if Direct Memory Access
(DMA) is currently enabled (1-bit);
• DMAaddr: Memory address being accessed by DMA;
• irq: Signal that indicates if an interrupt is happening (1-bit).
These signals are used to determine a one-bit reset signal
output, that, when set to 1, triggers an immediate system-wide
MCU reset, i.e., before execution of the next instruction. The
reset output is triggered when VRASED’s hardware detects
any violation of security properties. VRASED’s hardware is
described in Register Transfer Level (RTL) using Finite State
Machines (FSMs). Then, NuSMV Model Checker [45] is
used to automatically prove that such FSMs achieve claimed
security sub-properties. Finally, the proof that the conjunction
of hardware and software sub-properties implies end-to-end
soundness and security is done using an LTL theorem prover.
More formally, VRASED’s end-to-end security proof guarantees
that no probabilistic polynomial time (PPT) adversary can win
the security game in Definition 1 with more than negligible
probability in the security parameter.
Definition 1. VRASED’s Security Game [17]
1.1 RA Security Game (RA-game):
Notation:
- l is the security parameter and |K| = |Chal| = |MR| = l
- AR(t) denotes the content of AR at time t
RA-game:
1) Setup: Adv is given oracle access to SW-Att calls.
2) Challenge: A random challenge Chal ←$ {0, 1}^l is generated
and given to Adv.
3) Response: Adv responds with a pair (M, σ), where σ is either
forged by Adv, or is the result of calling SW-Att at some
arbitrary time t.
4) Adv wins if and only if M ≠ AR(t) and σ =
HMAC(KDF(K, Chal), M).
1.2 RA Security Definition:
An RA scheme is considered secure if for all PPT adversaries Adv,
there exists a negligible function negl such that:
Pr[Adv, RA-game] ≤ negl(l)
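The winning condition of the RA-game can be exercised with a toy simulation. The sketch assumes HMAC-SHA256, the same assumed KDF instantiation HMAC(K, Chal), and a memory region AR that does not change during the game. The three adversary strategies are hypothetical examples; running them illustrates the condition and proves nothing about security.

```python
import hashlib
import hmac
import os


def kdf(k: bytes, chal: bytes) -> bytes:
    return hmac.new(k, chal, hashlib.sha256).digest()  # assumed KDF


def sw_att(k: bytes, chal: bytes, ar: bytes) -> bytes:
    return hmac.new(kdf(k, chal), ar, hashlib.sha256).digest()


def adv_wins(k, chal, ar_at_t, m, sigma) -> bool:
    """Step 4: M != AR(t) and sigma = HMAC(KDF(K, Chal), M)."""
    return m != ar_at_t and hmac.compare_digest(sigma, sw_att(k, chal, m))


def play(adversary, ar: bytes) -> bool:
    """Run one game; the adversary never sees K, only the SW-Att oracle."""
    k = os.urandom(32)
    chal = os.urandom(32)
    oracle = lambda: sw_att(k, chal, ar)
    m, sigma = adversary(chal, oracle, ar)
    return adv_wins(k, chal, ar, m, sigma)


def honest_relay(chal, oracle, ar):
    return ar, oracle()                  # valid token, but M == AR(t)


def random_forger(chal, oracle, ar):
    return b"malicious", os.urandom(32)  # guesses a 256-bit token


def token_reuser(chal, oracle, ar):
    return b"malicious", oracle()        # token was computed over AR, not M
```

None of these strategies wins, except with negligible probability for the random guess, whereas a party that knows K can trivially produce a winning pair.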
IV. PROOF OF EXECUTION (PoX) SCHEMES
A Proof of Execution (PoX) is a scheme² involving two parties:
(1) a trusted verifier Vrf, and (2) an untrusted (potentially
infected) prover Prv. Informally, the goal of PoX is to allow
Vrf to request the execution of specific software S by Prv. As
a part of PoX, Prv must reply to Vrf with an authenticated
unforgeable cryptographic proof (H) that convinces Vrf that
Prv indeed executed S. To accomplish this, H must prove
that: (1) S executed atomically, in its entirety, and that such
execution occurred on Prv (and not on some other device); and
(2) any claimed result/output value of such execution, that is
accepted as legitimate by Vrf, could not have been spoofed or
modified. In addition, the size and behavior (i.e., instructions)
of S, as well as the size of its output (if any), should be
configurable and optionally specified by Vrf. In other words,
PoX should provide proofs of execution for arbitrary software,
along with corresponding authenticated outputs. Definition 2
specifies PoX schemes in more detail.
We now justify the need to include atomic execution of
S in the definition of PoX. On low-end MCUs, software
typically runs on “bare metal” and, in most cases, there is no
mechanism to enforce memory isolation between applications.
Therefore, allowing S’s execution to be interrupted would
permit other (potentially malicious) software running on Prv
to alter the behavior of S. This might be done, for example,
by an application that interrupts execution of S and changes
intermediate computation results in S’s data memory, thus
tampering with its output or control flow. Another example is
an interrupt that resumes S at a different instruction, modifying
S’s execution flow. Such an action could modify S’s behavior
completely via return-oriented programming (ROP).
² We refer to PoX as a scheme, instead of a protocol, because its security
relies on support from the underlying hardware implementation in Prv.
A. PoX Adversarial Model & Security Definition
We consider an adversary, Adv, that might control Prv’s
entire software state, code, and data. Adv can modify any
writable memory and read any memory that is not explicitly
protected by (hardware-enforced) access control rules, i.e., it
can read anything (including secrets) that is not explicitly
protected by the “trusted” hardware. Adv may also have full
control over all Direct Memory Access (DMA) controllers on
Prv. DMA allows a hardware controller to directly access
main memory (e.g., RAM, flash or ROM) without going through
the CPU.
We consider a scheme PoX = (XRequest, XAtomicExec,
XProve, XVerify) to be secure if the aforementioned Adv
has negligible probability of convincing Vrf that S executed
successfully when, in reality, such execution did not take place
or was interrupted. In addition, we require that, if execution
of S takes place, Adv cannot tamper with, or influence,
this execution’s outputs. These notions are formalized by the
security game in Definition 3.
We note that Definition 3 binds execution of S to the time
between Vrf issuing the request and receiving the response.
Therefore, if a PoX scheme is secure according to this
definition, Vrf can be certain about freshness of the execution.
In the same vein, the output produced by such execution is also
guaranteed to be fresh. This timeliness property is important
to avoid replays of previous valid executions; in fact, it is
essential for safety-critical applications. (See Section VII for
examples).
Physical Attacks: physical and hardware attacks are out of
scope in this paper. Specifically, Adv cannot modify the
code in ROM, induce hardware faults, or retrieve Prv secrets
via physical presence side-channels. Protection against such
attacks is considered orthogonal and could be supported via
standard physical-security techniques [46].
V. VAPE: A SECURE PoX ARCHITECTURE
We now present VAPE, a PoX architecture that realizes
our PoX security definition presented in Definition 3. One
key aspect of VAPE is a computer-aided formally verified
and publicly available implementation thereof. This section
first provides some intuition behind VAPE’s design. All VAPE
properties, which are overviewed informally in this section,
are formalized in Section VI.
In the rest of this section we use the term “unprivileged
software” to refer to any software other than SW-Att code
from VRASED. Adv is allowed to overwrite or bypass any
“unprivileged software”. Meanwhile, “trusted software” refers
to VRASED’s implementation of SW-Att (see Section III for
details) which is formally verified and cannot be modified
by Adv, since it is stored in ROM. VAPE is designed such
that no changes to SW-Att are required. Therefore, both
functionalities (RA and PoX, i.e., VRASED and VAPE) can co-
exist on the same device without interfering with each other.
Notation is summarized in Table I.
Definition 2 (Proof of Execution (PoX) Scheme).
A Proof of Execution (PoX) scheme is a tuple of algorithms [XRequest, XAtomicExec, XProve, XVerify] performed between Prv and Vrf where:
1) XRequest^{Vrf→Prv}(S, ·): an algorithm executed by Vrf which takes as input some software S (consisting of a list of instructions
{s_1, s_2, ..., s_m}). Vrf expects an honest Prv to execute S. XRequest generates a challenge Chal and embeds it, alongside S, into an output
request message asking Prv to execute S and to prove that such execution took place.
2) XAtomicExec^{Prv}(ER, ·): an algorithm (with possible hardware support) that takes as input some executable region ER in Prv’s memory,
containing a list of instructions {i_1, i_2, ..., i_m}. XAtomicExec runs on Prv and is considered successful iff: (1) instructions in ER are executed
from its first instruction, i_1, and end at its last instruction, i_m; (2) ER’s execution is atomic, i.e., if E is the sequence of instructions executed
between i_1 and i_m, then {e | e ∈ E} ⊆ ER; and (3) ER’s execution flow is not altered by external events, i.e., MCU interrupts or DMA events.
The XAtomicExec algorithm outputs a string O. Note that O may be a default string (⊥) if ER’s execution does not result in any output.
3) XProve^{Prv}(ER, Chal, O, ·): an algorithm (with possible hardware support) that takes as input some ER, Chal and O and is run by Prv to
output H, i.e., a proof that XRequest^{Vrf→Prv}(S, ·) and XAtomicExec^{Prv}(ER, ·) happened (in this sequence) and that O was produced by
XAtomicExec^{Prv}(ER, ·).
4) XVerify^{Prv→Vrf}(H, O, S, Chal, ·): an algorithm executed by Vrf with the following inputs: some S, Chal, H and O. The XVerify algorithm
checks whether H is a valid proof of the execution of S (i.e., executed memory region ER corresponds to S) on Prv given the challenge Chal,
and whether O is an authentic output/result of such an execution. If both checks succeed, XVerify outputs 1; otherwise it outputs 0.
Remark: In the parameter lists, (·) denotes that additional parameters might be included depending on the specific PoX construction.
Fig. 2. Definition of Proof of Execution (PoX) Scheme
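The interface in Definition 2 can be mirrored by four small functions. The sketch below is a software-only illustration under strong assumptions: the proof H is an HMAC-SHA256 over Chal, a hash of ER, and O; the key is assumed to be hardware-protected; and the atomicity that XAtomicExec requires is not modeled at all, since it depends on hardware support. All function names and the encoding of the MAC input are hypothetical.

```python
import hashlib
import hmac
import os
from typing import Callable, Tuple


def x_request(software: bytes) -> Tuple[bytes, bytes]:
    """Vrf: generate a fresh challenge and send it alongside S."""
    return os.urandom(32), software


def x_atomic_exec(run_er: Callable[[], bytes]) -> bytes:
    """Prv: run ER and return its output O (b"" stands in for the default string).

    Atomicity and interrupt/DMA checks are NOT modeled in this sketch.
    """
    return run_er()


def x_prove(key: bytes, er_code: bytes, chal: bytes, output: bytes) -> bytes:
    """Prv: bind challenge, executed code and output into one proof H."""
    msg = chal + hashlib.sha256(er_code).digest() + output
    return hmac.new(key, msg, hashlib.sha256).digest()


def x_verify(key: bytes, proof: bytes, output: bytes,
             software: bytes, chal: bytes) -> int:
    """Vrf: output 1 iff H matches S, Chal and O, else 0."""
    expected = x_prove(key, software, chal, output)
    return 1 if hmac.compare_digest(proof, expected) else 0


if __name__ == "__main__":
    key = os.urandom(32)
    chal, s = x_request(b"read_temperature")
    o = x_atomic_exec(lambda: b"21C")
    h = x_prove(key, s, chal, o)
    print(x_verify(key, h, o, s, chal))
```

Because the challenge and the hash have fixed lengths, the concatenation in x_prove is unambiguous. A proof for one challenge or output does not verify for another.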
Definition 3 (PoX Security Game).
– Let t_req denote the time when Vrf issues Chal ← XRequest^{Vrf→Prv}(S).
– Let t_verif denote the time when Vrf receives H and O back from Prv in response to XRequest^{Vrf→Prv}.
– Let XAtomicExec^{Prv}(S, t_req → t_verif) denote that XAtomicExec^{Prv}(ER, ·), such that ER ≡ S, was invoked and completed within the time
interval [t_req, t_verif].
– Let O ≡ XAtomicExec^{Prv}(S, t_req → t_verif) denote that XAtomicExec^{Prv}(S, t_req → t_verif) produces output O. Conversely, O ≢
XAtomicExec^{Prv}(S, t_req → t_verif) indicates that O is not produced by XAtomicExec^{Prv}(S, t_req → t_verif).
3.1 PoX Security Game (PoX-game): Challenger plays the following game with Adv:
1) Adv is given full control over Prv software state and oracle access to calls to the algorithms XAtomicExec^{Prv} and XProve^{Prv}.
2) At time t_req, Adv is presented with software S and challenge Chal.
3) Adv wins in two cases:
a) None or incomplete execution: Adv produces (H_Adv, O_Adv), such that XVerify(H_Adv, O_Adv, S, Chal, ·) = 1,
without calling XAtomicExec^{Prv}(S, t_req → t_verif).
b) Execution with tampered output: Adv calls XAtomicExec^{Prv}(S, t_req → t_verif) and can produce (H_Adv, O_Adv),
such that XVerify(H_Adv, O_Adv, S, Chal, ·) = 1 and O_Adv ≢ XAtomicExec^{Prv}(S, t_req → t_verif).
3.2 PoX Security Definition:
A PoX scheme is considered secure for security parameter l if, for all PPT adversaries Adv, there exists a negligible function negl such that:
Pr[Adv, PoX-game] ≤ negl(l)
Fig. 3. Definition of PoX Security Game
A. Protocol and Architecture
VAPE implements a secure PoX = (XRequest,
XAtomicExec, XProve, XVerify) scheme conforming to
Definition 4. The steps in VAPE workflow are illustrated in
Figure 5. The main idea is to first execute code contained in
ER. Then, at some later time, VAPE invokes VRASED verified
RA functionality to attest the code in ER and include, in
the attestation result, additional information that allows Vrf
to verify that ER code actually executed. If ER execution
produces an output (e.g., Prv is a sensor running ER’s
code to obtain some physical/ambient quantity), authenticity
and integrity of this output can also be verified. These are
achieved by including the EXEC flag among inputs to
HMAC computed as part of VRASED RA. The value of this
flag is controlled by VAPE formally verified hardware and its
memory can not be written by any software running on
Prv. VAPE hardware module runs in parallel with the MCU
monitoring its behavior and deciding the value of EXEC
accordingly.
Figure 6 depicts VAPE’s architecture. In addition to
VRASED hardware that provides secure RA by monitoring
a set of CPU signals (see Section III-B for details), VAPE
also monitors values stored in the dedicated physical memory
region — METADATA. METADATA contains address-
es/pointers to memory boundaries of ER (i.e., ERmin and
ERmax) and memory boundaries of expected output: ORmin
and ORmax. These addresses are sent by Vrf as part of
XRequest, and are configurable at run-time. The code S to
Definition 4 (Proof of Execution Protocol). VAPE instantiates a PoX = (XRequest, XAtomicExec, XProve, XVerify) scheme behaving
as follows:
1) XRequestVrf→Prv(S, ERmin, ERmax, ORmin, ORmax): includes a set of configuration parameters ERmin, ERmax, ORmin,
ORmax. The Executable Range (ER) is a contiguous memory block in which S is to be installed: ER = [ERmin, ERmax]. Similarly,
the Output Range (OR) is also configurable and defined by Vrf’s request as OR = [ORmin, ORmax]. If S does not produce any output
ORmin = ORmax =⊥. S is the software to be installed in ER and executed. If S is unspecified (S =⊥) the protocol will execute whatever
code was pre-installed on ER on Prv, i.e., Vrf is not required to provide S in every request, only when it wants to update ER contents before
executing it. If the code for S is sent by Vrf, untrusted auxiliary software in Prv is responsible for copying S into ER. Prv also receives a
random l-bit challenge Chal (|Chal| = l) as part of the request, where l is the security parameter.
2) XAtomicExecPrv(ER,OR,METADATA): This algorithm starts with unprivileged auxiliary software writing the values of: ERmin,
ERmax, ORmin, ORmax and Chal to a special pre-defined memory region denoted by METADATA. VAPE’s verified hardware enforces
immutability, atomic execution and access control rules according to the values stored in METADATA; details are described in Section V-A.
Finally, it begins execution of S by setting the program counter to the value of ERmin.
3) XProvePrv(ER, Chal, OR): produces proof of execution H. H allows Vrf to decide whether: (1) code contained in ER actually executed;
(2) ER contained specified (expected) S’s code during execution; (3) this execution is fresh, i.e., performed after the most recent XRequest;
and (4) claimed output in OR is indeed produced by this execution. As mentioned earlier, VAPE uses VRASED’s RA architecture to compute
H by attesting at least the executable, along with its output, and corresponding execution metadata. More formally:
H = HMAC(KDF (K, Chal), ER,OR,METADATA, ...) (1)
METADATA also contains the EXEC flag that is read-only to all software running in Prv and can only be written to by VAPE’s
formally verified hardware. This hardware monitors execution and sets EXEC = 1 only if ER executed successfully (XAtomicExec) and
memory regions of METADATA, ER, and OR were not modified between the end of ER’s execution and the computation of H. The
reasons for these requirements are detailed in Section V-C. If any malware residing on Prv attempts to violate any of these properties VAPE’s
verified hardware (provably) sets EXEC to zero. After computing H, Prv returns it and contents of OR (O) produced by ER’s execution to Vrf.
4) XVerifyPrv→Vrf(H,O,S,METADATAVrf) : Upon receiving H and O, Vrf checks whether H is produced by a legitimate execution of S
and reflects parameters specified in XRequest, i.e., METADATAVrf = Chal||ORmin||ORmax||ERmin||ERmax||EXEC = 1. This way,
Vrf concludes that S successfully executed on Prv and produced output O if:
H ≡ HMAC(KDF (K, ChalVrf),S,O,METADATAVrf , ...) (2)
Fig. 4. Definition of Proof of Execution Protocol
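As an illustration of Equations (1) and (2), the following Python sketch computes H on the Prv side and re-computes it on the Vrf side with EXEC = 1. It is not VAPE's implementation: the choice of HMAC-SHA256, the instantiation of KDF as HMAC(K, Chal), the 16-bit big-endian address encoding in pack_metadata, and all function names are assumptions made only for this example.

```python
import hashlib
import hmac


def kdf(key: bytes, chal: bytes) -> bytes:
    # Assumption: KDF(K, Chal) is instantiated as HMAC-SHA256(K, Chal).
    return hmac.new(key, chal, hashlib.sha256).digest()


def pack_metadata(chal: bytes, or_min: int, or_max: int,
                  er_min: int, er_max: int, exec_flag: int) -> bytes:
    # Hypothetical encoding of Chal||ORmin||ORmax||ERmin||ERmax||EXEC:
    # 16-bit big-endian addresses followed by one byte for EXEC.
    out = chal
    for addr in (or_min, or_max, er_min, er_max):
        out += addr.to_bytes(2, "big")
    return out + bytes([exec_flag])


def x_prove(key: bytes, chal: bytes, er_bytes: bytes,
            or_bytes: bytes, metadata: bytes) -> bytes:
    # Equation (1): H = HMAC(KDF(K, Chal), ER, OR, METADATA).
    msg = er_bytes + or_bytes + metadata
    return hmac.new(kdf(key, chal), msg, hashlib.sha256).digest()


def x_verify(key: bytes, chal: bytes, h: bytes, output: bytes,
             software: bytes, or_min: int, or_max: int,
             er_min: int, er_max: int) -> int:
    # Equation (2): Vrf rebuilds METADATA with EXEC = 1 and compares tags.
    expected_md = pack_metadata(chal, or_min, or_max, er_min, er_max, 1)
    expected = hmac.new(kdf(key, chal), software + output + expected_md,
                        hashlib.sha256).digest()
    return 1 if hmac.compare_digest(h, expected) else 0
```

In this sketch, a proof computed while the hardware holds EXEC = 0 can never verify, because Vrf always inserts EXEC = 1 into its own copy of METADATA before recomputing the tag.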
TABLE I
NOTATION SUMMARY
Notation | Description
PC | Current Program Counter value
Ren | Signal that indicates if the MCU is reading from memory (1-bit)
Wen | Signal that indicates if the MCU is writing to memory (1-bit)
Daddr | Address for an MCU memory access
DMAen | Signal that indicates if DMA is currently enabled (1-bit)
DMAaddr | Memory address being accessed by DMA, if any
irq | Signal that indicates if an interrupt is happening
CR | Memory region where SW-Att is stored: CR = [CRmin, CRmax]
MR (MAC Region) | Memory region in which SW-Att computation result is written: MR = [MRmin, MRmax]. The same region is also used to pass the attestation challenge as input to SW-Att
AR (Attested Region) | Memory region to be attested. Can be fixed/predefined or specified in an authenticated request from Vrf: AR = [ARmin, ARmax]
reset | A 1-bit signal that reboots/resets the MCU when set to logical 1
ER (Execution Region) | Memory region that stores an executable to be executed: ER = [ERmin, ERmax]
OR (Output Region) | Memory region that stores execution output: OR = [ORmin, ORmax]
EXEC | 1-bit execution flag indicating whether a successful execution has happened
METADATA | Memory region containing VAPE’s metadata
Fig. 5. Overview of VAPE’s workflow
be stored in ER is optionally3 sent by Vrf.
METADATA includes the EXEC flag, which is ini-
tialized to 0 and only changes from 0 to 1 (by VAPE’s
hardware) when ER execution starts, i.e., when the PC points
to ERmin. Afterwards, any violation of VAPE’s security
properties (detailed in Section V-C) immediately changes
EXEC back to 0. After a violation, the only way to set
the flag back to 1 is to re-start execution of ER from the
very beginning, i.e., with PC=ERmin. In other words, VAPE’s
verified hardware makes sure that the EXEC value covered by
the HMAC’s result (represented by H) is 1 if and only if ER
code executed successfully. As mentioned earlier, we consider
an execution to be successful if it runs atomically (i.e., without
being interrupted), from its first instruction (ERmin) to its last
instruction (ERmax).
3 Sending the code to be executed is optional because S might be pre-
installed on Prv. In that case the proof of execution will allow Vrf to conclude
that the pre-installed S was not modified and that it was executed.
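[Figure 6: block diagram. HW-Mod, comprising the VRASED and VAPE modules, monitors the MCU core signals PC, irq, Ren, Wen, Daddr, DMAen, and DMAaddr and outputs reset. The MCU’s address space holds ER, OR, and the fields Chal, ORmax, ORmin, ERmax, ERmin, and EXEC.]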
Fig. 6. HW-Mod composed of both VAPE and VRASED hardware modules.
Shaded area represents VAPE’s METADATA.
In addition to EXEC, HMAC covers a set of parameters
(in METADATA memory region) that allows Vrf to
check whether executed software was indeed located in
ER = [ERmin, ERmax]. If any output is expected, Vrf
specifies a memory range OR = [ORmin, ORmax] for storing
output. Contents of OR are also covered by the computed
HMAC, allowing Vrf to verify authenticity of the output of
the execution. VAPE protocol is presented in Definition 4.
Remark: Our notion of successful execution requires
S to have a single exit point – ERmax. Any self-contained
code with multiple legal exits can be trivially
instrumented to have a single exit point, as follows:
Replace each exit instruction with a jump
to the unified exit point ERmax. This notion also
requires S to run atomically. Since this constraint might be
undesirable in some real-time systems, we discuss how it can
be relaxed in Appendix VIII-C. Finally, Vrf is responsible for
defining OR memory region according to the behavior of S.
OR should be large enough to fit all output produced by S
and OR boundaries should correspond to addresses where S
writes the output values to be sent to Vrf.
B. MCU Assumptions
As mentioned in section V-A, VAPE extends VRASED
to enable a verified architecture for proofs of execution.
Therefore, we assume the same machine model introduced
in VRASED and make no additional assumptions. We review
these assumptions throughout the rest of this section and then
formalize them as an LTL machine model in Section VI.
Verification of the entire CPU is beyond the scope of this
paper. Therefore, we assume the CPU architecture strictly
adheres to, and correctly implements, its specifications. In
particular, our design and verification rely on the following
simple axioms:
A1 - Program Counter (PC): PC always contains the address
of the instruction being executed in a given CPU cycle.
A2 - Memory Address: Whenever memory is read or written,
a data-address signal (Daddr) contains the address of the
corresponding memory location. For a read access, a data
read-enable bit (Ren) must be set, while, for a write
access, a data write-enable bit (Wen) must be set.
A3 - DMA: Whenever the DMA controller attempts to ac-
cess the main system memory, a DMA-address signal
(DMAaddr) reflects the address of the memory location
being accessed and a DMA-enable bit (DMAen) must
be set. DMA can not access memory when DMAen is
off (logical zero).
A4 - MCU Reset: At the end of a successful reset routine, all
registers (including PC) are set to zero before resuming
normal software execution flow. Resets are handled by the
MCU in hardware. Thus, the reset handling routine can
not be modified. When a reset happens, the corresponding
reset signal is set. The reset signal is also set when the
MCU initializes for the first time.
A5 - Interrupts: When interrupts happen, the corresponding
irq signal is set.
C. VAPE’s Sub-Properties at a High-Level
We now describe the sub-properties enforced by VAPE.
Section VI formalizes these sub-properties in LTL and pro-
vides a single end-to-end definition for VAPE’s correctness.
This end-to-end correctness notion is provably implied by
the composition of all sub-properties. The sub-properties fall
into two major groups: Execution Protection and Metadata
Protection. A violation of any of these properties implies one
or more of the following:
• Code in ER was not executed atomically and in its
entirety;
• Output in OR was not produced by ER execution;
• Code in ER was not executed in a timely manner, i.e.,
after receiving the latest XRequest.
Therefore, whenever VAPE detects any violation, EXEC is
set to 0. Then, since EXEC is included among the inputs
to the computation of HMAC (conveyed in Prv’s response),
it will be interpreted by Vrf as failure to prove execution of
code in ER.
Execution Protection:
P1 - Ephemeral Immutability: Code in ER can not be
modified from the start of its execution until the end of
SW-Att computation in XProve routine. This property
is necessary to ensure that the attestation result in fact
reflects the code that executed. Lack of this property
would allow Adv to execute some other code ERAdv,
then overwrite it with expected ER and finally call
XProve; it would result in a valid proof for the execution
of ER even though ERAdv was executed instead.
P2 - Ephemeral Atomicity: ER execution is only considered
successful if ER runs starting from ERmin until ERmax
atomically, i.e., without any interruption. This property
conforms with XAtomicExec routine in Definition 2 and
with what is considered to be a successful execution in
the context of this work. As discussed in Section IV,
ER must run atomically to prevent malware residing on
Prv from interrupting ER execution and resuming it at a
different instruction, or modifying intermediate execution
results in data memory. Without this property, Return
Oriented Programming (ROP) and similar attacks on ER
could change its behavior completely and unpredictably,
making any proof of execution (and corresponding
output) useless.
P3 - Output Protection: Similar to P1, VAPE must make
sure that OR is not modified from the time after ER
code execution is finished until the time when HMAC
computation in XProve completes. Lack of this property
would allow Adv to overwrite OR and successfully
spoof OR produced by ER, thus convincing Vrf that it
produced output ORAdv.
Metadata Protection:
P4 - Executable/Output (ER/OR) Boundaries: VAPE hard-
ware ensures properties P1, P2, and P3 according to the
values ERmin, ERmax, ORmin, ORmax. These values
are configurable and can be decided by Vrf according
to application needs. They are written into metadata-
dedicated physical addresses in Prv’s memory before ER
execution. Therefore, once ER execution starts, VAPE
hardware must ensure that such values remain unchanged
until XProve completes. Otherwise, Adv would generate
valid attestation results, by attesting [ERmin, ERmax],
while, in fact, having executed code in a different region:
$[ER^{Adv}_{min}, ER^{Adv}_{max}]$.
P5 - Response Protection: The appropriate response to Vrf’s
challenge must be unforgeable and non-invertible. This
implies that, in the XProve routine, K used to compute
HMAC must never be leaked (with non-negligible prob-
ability) and HMAC implementation must be functionally
correct (adhere to its cryptographic specification). More-
over, contents of memory being attested must not change
during the HMAC computation. We rely on VRASED ver-
ified RA architecture to ensure these properties. Also, to
ensure trustworthiness of the response, VAPE guarantees
that no software in Prv can ever modify EXEC flag and
that, once EXEC = 0, it can only become 1 again if
ER’s execution re-starts completely.
P6 - Challenge Temporal Consistency: VAPE must ensure
that Chal can not be modified between ER’s execution
and HMAC computation in XProve. Without this prop-
erty, the following attack is possible: (1) Prv-resident
malware first executes ER properly (i.e., by not violating
P1-P5), resulting in EXEC = 1 after execution stops,
and (2) at some later time, malware receives Chal from
Vrf and simply calls XProve on this Chal without execut-
ing ER. As a result, malware would acquire a valid proof
of execution (since EXEC remains 1 when the proof is
generated) even though no ER execution occurred before
Chal was received. Such attacks can be prevented by
setting EXEC = 0 whenever the memory region storing
Chal is modified.
D. Formal Verification Methodology
Our formal verification approach starts by formalizing the VAPE
sub-properties discussed in this section using Linear Temporal
Logic (LTL) to define invariants that must hold throughout
the entire execution. We then use a theorem prover [47] to
write a computer-aided proof that the conjunction of the LTL
sub-properties implies an end-to-end formal definition for the
guarantee expected from VAPE hardware. VAPE correctness,
when properly composed with VRASED guarantees, yields a
PoX scheme secure according to Definition 3. This is proved
by showing that, if the composition between the two is imple-
mented as described in Definition 4, VRASED security can be
reduced to VAPE security. For more details see Section VI.
VAPE hardware module is composed of several sub-modules
written in the Verilog Hardware Description Language (HDL).
Each sub-module is responsible for enforcing a set of LTL
sub-properties and is described as an FSM in: (1) Verilog at
Register Transfer Level (RTL); and (2) the Model-Checking
language SMV [43]. We then use the NuSMV model checker
to verify that the FSM complies with LTL specifications. If
verification fails, the sub-module is re-designed.
Once each sub-module is verified, they are combined into
a single Verilog design. The composition is converted to
SMV using the automatic translation tool Verilog2SMV [44].
The resulting SMV is simultaneously verified against all LTL
specifications to prove that the final Verilog design for HW-Mod
complies with all necessary properties. Automatic conversion
of the composition of HW-Mod from Verilog to SMV rules
out the possibility of human mistakes in representing Verilog
FSMs as SMV.
VI. FORMAL SPECIFICATION & VERIFIED
IMPLEMENTATION
We now describe VAPE formally verified implementation.
We start by defining a generic machine model for low-end
embedded systems composed of a subset of VRASED machine
model and expressed in LTL. Then, we formally state the end-
goal of VAPE implementation. Next, we prove that a set of
LTL sub-properties, corresponding to the formal specification
of P1–P6, when applied to this machine model, implies VAPE
end goal. Finally, we implement VAPE hardware by applying
the methodology described in Section V-D to verify that
the implementation conforms to all LTL sub-properties, thus
implying VAPE end goal.
A. Machine Model
Definition 5 models the behavior of low-end MCUs, as
described in Section I-A. It consists of a subset of the machine
model introduced by VRASED. Nonetheless, this subset mod-
els all MCU behavior that is relevant for stating and verifying
the correctness of VAPE’s implementation.
Definition 5. Machine Model (subset)
1) Modify_Mem(i) →
(Wen ∧Daddr = i) ∨ (DMAen ∧DMAaddr = i)
2) Interrupt → irq
3) MR, CR, AR, KR, XS, and METADATA are non-
overlapping memory regions
Modify_Mem models that a given memory address can be
modified in two cases: by a CPU instruction or by DMA. In
the former, Wen signal must be on and Daddr must contain the
memory address being accessed. In the second case, DMAen
signal must be on and DMAaddr must contain the address
being modified by DMA. The requirements for reading from
a given address are similar, except that instead of Wen, Ren
must be on. We do not explicitly state this behavior since it
is not used in VAPE proofs. For the same reason, modeling
the effects of instructions that only modify register values
(e.g., ALU operations, such as add and mul) is also not
necessary. The machine model also captures the fact that, when
an interrupt happens during execution, the irq signal in MCU
hardware is set to 1.
With respect to memory layout, the model states that MR,
CR, AR, KR, XS, and METADATA are disjoint memory
regions. The first five memory regions are defined in
VRASED, as shown in Figure 1. As shown in Figure 6,
METADATA is a fixed memory region used by VAPE to
store information about software execution status.
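The machine model can be mirrored in a few lines of Python, which may help when reading the LTL statements that follow. This is only a sketch: the Signals record, the helper names, and the representation of a region as an inclusive (min, max) pair are assumptions for this example. Note also that Definition 5 states an implication (a modification requires one of the two conditions), whereas modify_mem below evaluates the right-hand side, so it over-approximates actual modifications.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signals:
    """One cycle of the MCU signals used by the model (names follow Table I)."""
    pc: int
    wen: bool = False
    daddr: int = 0
    dma_en: bool = False
    dma_addr: int = 0
    irq: bool = False
    reset: bool = False


def in_region(addr, region):
    lo, hi = region  # inclusive bounds, e.g. ER = (ERmin, ERmax)
    return lo <= addr <= hi


def modify_mem(sig, region):
    # Right-hand side of item 1: a CPU write or a DMA access targets the region.
    cpu_write = sig.wen and in_region(sig.daddr, region)
    dma_access = sig.dma_en and in_region(sig.dma_addr, region)
    return cpu_write or dma_access


def disjoint(regions):
    # Item 3: after sorting, each region must end before the next one starts.
    ordered = sorted(regions)
    return all(prev[1] < nxt[0] for prev, nxt in zip(ordered, ordered[1:]))
```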
B. Security & Implementation Correctness
Our strategy in proving that VAPE is a secure PoX archi-
tecture (according to Definition 3) is two-part:
[A]: We show that properties P1-P6, discussed in Section V-C
and formally specified next in Section VI-C, are sufficient
to guarantee that EXEC flag is 1 if and only if S indeed
executed on Prv. To show this, we compose a computer
proof using SPOT LTL proof assistant [47].
[B]: We use cryptographic reduction proofs to show that, as
long as part A holds, VRASED security with respect to
Definition 1 can be reduced to VAPE’s PoX security from
Definition 3. In turn, HMAC’s existential unforgeability
can be reduced to VRASED’s security [17]. Therefore,
both VAPE and VRASED rely on the assumption that
HMAC is a secure MAC.
In the rest of this section, we convey the intuition behind
both of these steps. Proof details are in Appendix A.
The goal of part A above is to show that VAPE’s sub-
properties imply Definition 6. LTL specification in Definition 6
captures the conditions that must hold in order for EXEC to
be set to 1 during execution of XProve, enabling generation
of a valid proof of execution. This specification ensures that,
in order to have EXEC = 1 during execution of XProve (i.e.,
for [EXEC ∧PC ∈ CR] to hold), at least once before such
time the following must have happened:
1) The system reached state S0 where software stored in
ER started executing from its first instruction (PC =
ERmin).
2) The system eventually reached a state S1 when ER fin-
ished executing (PC = ERmax). In the interval between
S0 and S1 PC kept executing instructions within ER,
there were no interrupts, no resets, and DMA remained
inactive.
3) The system eventually reached a state S2 when XProve
started executing (PC = CRmin). In the interval be-
tween S0 and S2, METADATA and ER regions were
not modified.
4) In the interval between S0 and S2, OR region was
only modified by ER’s execution, i.e., PC ∈ ER ∨
¬ Modify_Mem(OR).
Figure 7 shows the time windows wherein each memory region
must not change during VAPE’s PoX as implied by VAPE’s
correctness (Definition 6). Violating any of these conditions
will cause EXEC to have value 0 during XProve’s computation.
Consequently, any violation will result in Vrf rejecting the
proof of execution since it will not conform to the expected
value of H, per Equation 2 in Definition 4.
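Conditions 1-4 can be read as a check over a finite execution trace. The sketch below is a simplified, finite-trace reading of the left-hand side of Definition 6, not the LTL proof itself: the trace format (one dictionary of signal values per cycle) and the function names are assumptions for this example. It returns True if some state S0 with PC = ERmin is followed by an undisturbed run up to PC = ERmax, with ER, METADATA, and OR protected until PC = CRmin.

```python
def in_region(addr, region):
    return region[0] <= addr <= region[1]


def writes(step, region):
    # CPU write or DMA access into the region during this cycle.
    cpu = step.get("wen", False) and in_region(step.get("daddr", -1), region)
    dma = step.get("dma_en", False) and in_region(step.get("dma_addr", -1), region)
    return cpu or dma


def valid_execution_exists(trace, er, out_r, meta, cr):
    for s0, first in enumerate(trace):
        if first["pc"] != er[0]:
            continue  # S0 requires PC = ERmin
        # First "until": stay inside ER, undisturbed, until PC = ERmax (S1).
        reached_s1 = False
        for st in trace[s0:]:
            if st["pc"] == er[1]:
                reached_s1 = True
                break
            if (not in_region(st["pc"], er) or st.get("irq")
                    or st.get("reset") or st.get("dma_en")):
                break
        if not reached_s1:
            continue
        # Second "until": memory stays protected until PC = CRmin (S2).
        for st in trace[s0:]:
            if st["pc"] == cr[0]:
                return True
            foreign_or_write = writes(st, out_r) and not in_region(st["pc"], er)
            if writes(st, er) or writes(st, meta) or foreign_or_write:
                break
    return False
```

For example, a trace in which OR is written by an instruction outside ER after S1 but before S2 makes the function return False, matching condition 4.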
The intuition behind the cryptographic reduction (part B
of our proof strategy) is that computing H consists
simply of invoking VRASED SW-Att with MR = Chal,
ER ∈ AR, OR ∈ AR, and METADATA ∈ AR. Therefore,
a successful forgery of VAPE’s H implies breaking VRASED
security. Since H always includes the value of EXEC, this
implies that VAPE is PoX-secure (Definition 3). The complete
reduction is presented in Appendix A.
C. VAPE’s Sub-Properties in LTL
We now introduce formal definitions for the necessary sub-
properties enforced by VAPE as LTL specifications 3–12 in
Definition 7. We describe how they map to high-level notions
P1–P6 discussed in Section V-C. Appendix A discusses a
computer proof that conjunction of this set of properties is
sufficient to satisfy a formal definition of VAPE correctness
from Definition 6. Then, Section VI-D shows examples of
VAPE hardware sub-modules, designed as FSMs and verified
to enforce properties in Definition 7.
LTL 3 enforces P1 – Ephemeral immutability by making
sure that whenever ER memory region is written, either by
CPU or DMA, EXEC is immediately set to logical 0 (false).
P2 – Ephemeral Atomicity is enforced by a set of three
LTL specifications. LTL 4 enforces that the only way for ER’s
execution to terminate, without setting EXEC to logical 0, is
through its last instruction: PC = ERmax. This is specified
by checking the relation between current and next PC values
using LTL neXt operator. In particular, if current PC value is
Definition 6. Formal specification of VAPE’s correctness.
{
PC = ERmin ∧ [(PC ∈ ER ∧ ¬Interrupt ∧ ¬reset ∧ ¬DMAen) U PC = ERmax] ∧
[(¬ Modify_Mem(ER) ∧ ¬ Modify_Mem(METADATA) ∧ (PC ∈ ER ∨ ¬ Modify_Mem(OR))) U PC = CRmin]
} B {EXEC ∧ PC ∈ CR}
Definition 7. Necessary Sub-Properties for Secure Proofs of Execution in LTL.
Ephemeral Immutability:
G : {[Wen ∧ (Daddr ∈ ER)] ∨ [DMAen ∧ (DMAaddr ∈ ER)]→ ¬EXEC} (3)
Ephemeral Atomicity:
G : {(PC ∈ ER) ∧ ¬(X(PC) ∈ ER)→ PC = ERmax ∨ ¬X(EXEC) } (4)
G : {¬(PC ∈ ER) ∧ (X(PC) ∈ ER)→ X(PC) = ERmin ∨ ¬X(EXEC)} (5)
G : {(PC ∈ ER) ∧ irq → ¬EXEC} (6)
Output Protection:
G : {[¬(PC ∈ ER) ∧ (Wen ∧Daddr ∈ OR)] ∨ (DMAen ∧DMAaddr ∈ OR) ∨ (PC ∈ ER ∧DMAen)→ ¬EXEC} (7)
Executable/Output (ER/OR) Boundaries & Challenge Temporal Consistency:
G : {ERmin > ERmax ∨ORmin > ORmax → ¬EXEC} (8)
G : {ERmin ≤ CRmax ∨ ERmax ≤ CRmax → ¬EXEC} (9)
G : {[Wen ∧ (Daddr ∈METADATA)] ∨ [DMAen ∧ (DMAaddr ∈METADATA)]→ ¬EXEC} (10)
Remark: Note that Chalmem ∈METADATA.
Response Protection:
G : {¬EXEC ∧ X(EXEC)→ X(PC = ERmin)} (11)
G : {reset→ ¬EXEC} (12)
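[Figure 7: timeline from treq through t(ERmin), t(ERmax), t(CRmin), and t(CRmax) to tverif, marking State S0, State S1, State S2, and "H ready". Rows for OR, ER, and METADATA show, across the ER execution and attestation phases, the intervals of unchanged memory required by VAPE and of unchanged memory enforced by VRASED.]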
Fig. 7. Illustration of time intervals that each memory region must remain
unchanged in order to produce a valid H (EXEC = 1). t(X) denotes the
time when PC = X .
within ER, and next PC value is outside ER, then
either current PC value is the address of ERmax, or EXEC
is set to 0 in the next cycle. Also, LTL 5 enforces that the only
way for PC to enter ER is through the very first instruction:
ERmin. This prevents ER execution from starting at some
point in the middle of ER, thus making sure that ER always
executes in its entirety. Finally, LTL 6 enforces that EXEC
is set to zero if an interrupt happens in the middle of ER
execution. Even though LTLs 4 and 5 already enforce that PC
can not change to anywhere outside ER, interrupts could be
programmed to return to an arbitrary instruction within ER.
Although this would not violate LTLs 4 and 5, it would still
modify ER’s behavior. Therefore, LTL 6 is needed to prevent
that.
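To make LTLs 4-6 concrete, the following Python sketch checks them over a finite trace of (PC, irq, EXEC) values and reports which property is violated at which step. It is an illustrative monitor only; the trace format and function names are assumptions for this example, and a finite trace is a simplification of the LTL semantics over infinite executions.

```python
def in_er(pc, er):
    return er[0] <= pc <= er[1]


def atomicity_violations(trace, er):
    """Return a list of (step index, LTL number) pairs for violated properties."""
    violations = []
    for i, cur in enumerate(trace):
        cur_in = in_er(cur["pc"], er)
        # LTL 6: an interrupt while PC is in ER requires EXEC = 0.
        if cur_in and cur["irq"] and cur["exec"]:
            violations.append((i, 6))
        if i + 1 < len(trace):
            nxt = trace[i + 1]
            nxt_in = in_er(nxt["pc"], er)
            # LTL 4: leaving ER from anywhere but ERmax requires EXEC = 0 next.
            if cur_in and not nxt_in and cur["pc"] != er[1] and nxt["exec"]:
                violations.append((i, 4))
            # LTL 5: entering ER anywhere but ERmin requires EXEC = 0 next.
            if not cur_in and nxt_in and nxt["pc"] != er[0] and nxt["exec"]:
                violations.append((i, 5))
    return violations
```

A trace that enters at ERmin, leaves through ERmax, and sees no interrupt yields an empty list, whereas hardware that kept EXEC = 1 after an early exit from ER would be flagged under LTL 4.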
P3 – Output Protection is enforced by LTL 7 by making
sure that: (1) DMA controller does not write into OR; (2)
CPU can only modify OR when executing instructions within
ER; and (3) DMA can not be active during ER execution;
otherwise, a compromised DMA could change intermediate
results of ER computation in data memory, potentially modifying
ER behavior.
Similar to P3, P4 – Executable/Output Boundaries and P6
– Challenge Temporal Consistency are enforced by LTL 10.
Since Chal as well as ERmin, ERmax, ORmin, and ORmax
are all stored in METADATA reserved memory region, it
suffices to ensure that EXEC is set to logical 0 whenever this
region is modified. Also, LTL 8 enforces that EXEC is only
set to one if ER and OR are configured (by METADATA
values ERmin, ERmax, ORmin, ORmax) as valid memory
regions.
Finally, LTLs 11 and 12 (in addition to VRASED verified
RA architecture) are responsible for ensuring P5- Response
Protection by making sure that EXEC always reflects what
is intended by VAPE hardware. LTL 11 specifies that the only
way to change EXEC from 0 to 1 is by starting ER’s
execution over. LTL 12 states that, whenever a reset
happens (this also includes the system initial booting state)
and execution is initialized, the initial value of EXEC is 0.
To conclude, we recall that no software running on Prv can
modify EXEC. Therefore, it is not possible for malware to
change it directly.
D. Formally Verified Modules
VAPE is designed as a set of seven sub-modules. We now
describe VAPE’s verified implementation, by focusing on two
of these sub-modules and their corresponding properties. The
Verilog implementation of omitted sub-modules is available
in [18]. Each sub-module enforces a subset of the LTL
specifications in Definition 7. As discussed in Section III-A,
sub-modules are designed as FSMs. In particular, we implement
them as Mealy FSMs, i.e, their output changes as a function
of both the current state and current input values. Each FSM
takes as input a subset of signals shown in Figure 6 and
produces only one output – EXEC – indicating violation of
PoX properties.
To simplify the presentation, we do not explicitly represent
the value of EXEC for each state transition. Instead, we
define the following implicit representation:
1) EXEC is 0 whenever an FSM transitions to NotExec
state;
2) EXEC remains 0 until a transition leaving NotExec
state is triggered;
3) EXEC is 1 in all other states.
4) Sub-modules composition: Since all PoX properties
must simultaneously hold, the value of EXEC produced
by VAPE is the conjunction (logical AND) of all sub-
modules’ individual EXEC flags.
Figure 8 represents a verified model enforcing LTLs 4-
6, corresponding to the high-level property P2- Ephemeral
Atomicity. The FSM consists of five states. notER and
midER represent states when PC is: (1) outside ER, and
(2) within ER, respectively, excluding the first (ERmin) and
last (ERmax) instructions. Meanwhile, fstER and lstER
correspond to states when PC points to the first and last
instructions, respectively. The only possible path from notER
to midER is through fstER. Similarly, the only path from
midER to notER is through lstER. A transition to the
NotExec state is triggered whenever: (1) any sequence of
values for PC does not follow the aforementioned conditions,
or (2) irq is logical 1 while PC is inside ER. Lastly, the only
way to transition out of the NotExec state is to restart ER’s
execution.
[Figure 8: state diagram with states NotExec, notER, fstER, midER, and lstER. Edges are labeled with guards over PC and irq: PC = ERmin ∧ ¬irq, (PC > ERmin ∧ PC < ERmax) ∧ ¬irq, PC = ERmax ∧ ¬irq, (PC < ERmin ∨ PC > ERmax), and "otherwise".]
Fig. 8. Verified FSM for LTLs 4-6, a.k.a., P2- Ephemeral Atomicity.
[Figure 9: state diagram with states Run and NotExec. Run moves to NotExec on [Wen ∧ (Daddr ∈ METADATA)] ∨ [DMAen ∧ (DMAaddr ∈ METADATA)]; NotExec moves to Run on PC = ERmin ∧ ¬[Wen ∧ (Daddr ∈ METADATA)] ∧ ¬[DMAen ∧ (DMAaddr ∈ METADATA)]; both states loop on "otherwise".]
Fig. 9. Verified FSM for LTL 10, a.k.a., P6- Challenge Temporal Consistency.
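The state machine described above can be simulated in software to see how EXEC reacts to different PC sequences. The Python sketch below is a reading of the textual description, not the verified Verilog: the class name, the choice of NotExec as the initial state (consistent with EXEC = 0 after reset), and the per-cycle step interface are assumptions for this example. It also assumes ERmin < ERmax.

```python
NOT_ER, FST_ER, MID_ER, LST_ER, NOT_EXEC = "notER", "fstER", "midER", "lstER", "NotExec"


class AtomicityFSM:
    def __init__(self, er_min, er_max):
        self.er_min, self.er_max = er_min, er_max
        self.state = NOT_EXEC  # assumption: EXEC starts at 0

    @property
    def exec_flag(self):
        # EXEC is 0 in NotExec and 1 in every other state.
        return int(self.state != NOT_EXEC)

    def step(self, pc, irq=False):
        first = pc == self.er_min and not irq
        middle = self.er_min < pc < self.er_max and not irq
        last = pc == self.er_max and not irq
        outside = pc < self.er_min or pc > self.er_max
        s = self.state
        if s == NOT_EXEC:
            nxt = FST_ER if first else NOT_EXEC  # only a restart leaves NotExec
        elif s == NOT_ER:
            nxt = NOT_ER if outside else FST_ER if first else NOT_EXEC
        elif s == FST_ER:
            nxt = FST_ER if first else MID_ER if middle else NOT_EXEC
        elif s == MID_ER:
            nxt = MID_ER if middle else LST_ER if last else NOT_EXEC
        else:  # LST_ER
            nxt = LST_ER if last else NOT_ER if (outside and not irq) else NOT_EXEC
        self.state = nxt
        return self.exec_flag


def run(fsm, steps):
    # steps is a list of (pc, irq) pairs; returns EXEC after each cycle.
    return [fsm.step(pc, irq) for pc, irq in steps]
```

Feeding the simulator a complete run from ERmin to ERmax keeps EXEC at 1, while an interrupt inside ER or a jump into the middle of ER drops it to 0 until execution restarts at ERmin.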
Figure 9 shows the FSM verified to comply with LTL 10
(P6- Challenge Temporal Consistency). The FSM has two
states: Run and NotExec. The FSM transitions to the
NotExec state and outputs EXEC = 0 whenever a vio-
lation happens, i.e., whenever METADATA is modified in
software. It transitions back to Run when ER’s execution is
restarted without such violation.
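The two-state machine for LTL 10 is small enough to sketch directly. As before, this Python model is only an illustration under stated assumptions: the class name, the inclusive (min, max) encoding of METADATA, and the initial NotExec state are hypothetical choices, not taken from the verified design.

```python
class MetadataFSM:
    def __init__(self, meta, er_min):
        self.meta = meta      # inclusive (min, max) bounds of METADATA
        self.er_min = er_min
        self.state = "NotExec"  # assumption: EXEC starts at 0

    def step(self, pc, wen=False, daddr=0, dma_en=False, dma_addr=0):
        lo, hi = self.meta
        # Violation: CPU write or DMA access targeting METADATA (LTL 10).
        violation = (wen and lo <= daddr <= hi) or (dma_en and lo <= dma_addr <= hi)
        if violation:
            self.state = "NotExec"
        elif self.state == "NotExec" and pc == self.er_min:
            self.state = "Run"  # ER restarted without a violation
        return int(self.state == "Run")  # this sub-module's EXEC output
```

This also reproduces the attack discussed under P6: if malware runs ER first and only afterwards writes a newly received Chal into METADATA, the write moves the FSM to NotExec, so EXEC is 0 when XProve runs unless ER is executed again from ERmin.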
E. Evaluation
VAPE incurs modest hardware overhead, compared to the
VRASED baseline: ≈ 2% for registers and 12% for LUTs. The
runtime to produce a proof of S execution depends on the size
of S , which determines VRASED’s attestation runtime used to
produce H. In the most expensive or extreme case, when the
entire program memory (8 kB) is occupied by ER+OR, this
computation takes around 900ms on the 8MHz MSP430. Due
to space limitations, a more detailed evaluation is deferred to
Appendix VIII.
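As a rough sanity check on the worst-case figure above, the reported runtime can be converted into a per-byte cost. The calculation below assumes 8 kB = 8192 bytes and attributes the entire 900 ms to processing ER+OR, ignoring METADATA and key derivation; it is a back-of-the-envelope estimate, not a measured value.

```latex
\[
0.9\,\text{s} \times 8 \times 10^{6}\,\tfrac{\text{cycles}}{\text{s}}
  = 7.2 \times 10^{6}\ \text{cycles},
\qquad
\frac{7.2 \times 10^{6}\ \text{cycles}}{8192\ \text{bytes}}
  \approx 879\ \text{cycles per byte}.
\]
```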
VII. AUTHENTICATED SENSING/ACTUATION
As discussed in Section I, an important functionality that
can be realized using PoX is authenticated sensing/actuation.
In this section, we demonstrate how VAPE can be used to
build sensors and actuators that “can not lie”.
As a running example we use a fire sensor: a safety critical
low-end embedded device commonly present in households
and workplaces. Such a device consists of an MCU equipped
with analog hardware capable of measuring physical/chemical
quantities, e.g., temperature, humidity, and CO2 level. It is
also usually equipped with actuation-capable analog hardware,
such as a buzzer. Analog hardware components are directly
connected to MCU General Purpose Input/Output (GPIO)
ports. GPIO ports are physical wires directly mapped to
fixed memory locations in MCU memory. Therefore, software
running on the MCU can read the physical quantities directly
from GPIO memory.
In this example, we consider that MCU’s software peri-
odically reads these values, and transmits them to a remote
safety authority, e.g., a fire department. The safety authority
then decides to take action. The MCU also triggers the buzzer
actuator whenever sensed values indicate a fire. Given the
safety-critical nature of this application, it is important for
the safety authority to be sure that reported values are au-
thentic and were produced by execution of expected software.
Otherwise, malware could spoof such values (e.g., by not
reading them from the proper GPIO). PoX can guarantee that
reported values were read from the correct GPIO port (since
the memory address is specified by the instructions in the ER
executable), and that the produced output (stored in OR) was
indeed generated by execution of ER and was not modified
thereafter. Thus, upon receiving sensed values accompanied
by a proof of execution, the safety authority can be sure that
the reported sensed value can be trusted.
As a proof of concept, we use VAPE to implement a simple
fire sensor that operates with temperature and humidity quanti-
ties. It communicates with a remote Vrf (e.g., fire department)
using a low-power ZigBee radio4 typically used by low-end
CPS/IoT devices. Temperature and humidity analog devices
are connected to a VAPE-enabled MSP430 MCU running at
8MHz and synthesized using a Basys3 Artix-7 FPGA board.
As shown in Figure 10, MCU GPIO ports are connected to the
temperature/humidity sensor and to the buzzer.
VAPE is used to prove execution of fire sensor software.
This software is shown in Figure 12a in Appendix VIII-C.
The software consists of two main functions: ReadSensor
and SoundAlarm. Proofs of execution are requested by
the safety authority via an XRequest to issue commands
to execute these functions. ReadSensor reads and processes
the value generated by the temperature/humidity analog device
via memory-mapped GPIO, and copies this value to OR. The
SoundAlarm function turns the buzzer on for 2 seconds, i.e.,
it writes “1” to the memory address mapped to the buzzer,
busy-waits for 2 seconds, and then writes “0” to the same
memory address. This implementation corresponds to the one in the
open-source repository⁵ and was ported to a VAPE-enabled
MCU. Our porting effort was minimal; it involved around 30
additional lines of C code, mainly for re-implementing sub-
functions that are originally implemented as shared APIs, e.g.,
digitalRead/Write. Finally, we transform the ported code to
be compatible with VAPE’s PoX architecture. Details can be
found in Appendix VIII-C.
Fig. 10. Hardware setup for a fire sensor
⁴ https://www.zigbee.org/
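The SoundAlarm behavior described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from the ported repository: the bit mask BUZZER_BIT, the function names, and the loop-count delay are hypothetical, and the output register is passed as a pointer so that the sketch also compiles on a host machine. On the MCU, the pointer would be the memory-mapped GPIO address wired to the buzzer, and the iteration count would be calibrated to 2 seconds.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit position of the buzzer on its GPIO port (assumption). */
#define BUZZER_BIT 0x10u

/* Busy-wait for `iterations` loop iterations. The volatile counter keeps
 * the compiler from optimizing the loop away. Mapping iterations to
 * wall-clock time is device-specific and not modeled here. */
void busy_wait(uint32_t iterations) {
    for (volatile uint32_t i = 0; i < iterations; i++) {
        /* spin */
    }
}

/* Turn the buzzer on, wait, then turn it off again.
 * `port_out` stands for the memory-mapped output register of the buzzer. */
void sound_alarm(volatile uint8_t *port_out, uint32_t iterations) {
    assert(port_out != 0);
    *port_out |= BUZZER_BIT;            /* write "1" to the buzzer bit  */
    busy_wait(iterations);              /* buzzer stays on in between   */
    *port_out &= (uint8_t)~BUZZER_BIT;  /* write "0" to the same bit    */
}
```

Only the buzzer bit is touched, so other pins on the same port keep their values.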
VIII. CONCLUSION
This paper introduces VAPE, a novel and formally verified
security service targeting low-end embedded devices. It allows
a remote untrusted prover to generate unforgeable proofs
of remote software execution. We envision VAPE’s use in
many IoT application domains, such as authenticated sensing
and actuation. Our implementation of VAPE is realized on a
real embedded system platform, MSP430, synthesized on an
FPGA, and the verified implementation is publicly available.
Our evaluation shows that VAPE has low overhead in terms of both
hardware footprint and time to generate proofs of execution.
We believe that this work is especially relevant to safety-
critical environments and applications.
REFERENCES
[1] Intel, “Intel Software Guard Extensions (Intel SGX).”
[2] V. Costan, I. Lebedev, and S. Devadas, “Sanctum: Minimal hardware
extensions for strong software isolation,” in 25th USENIX Security
Symposium (USENIX Security 16), 2016.
[3] A. Seshadri, M. Luk, A. Perrig, L. van Doorn, and P. Khosla, “Scuba:
Secure code update by attestation in sensor networks,” in ACM workshop
on Wireless security, 2006.
[4] D. Perito and G. Tsudik, “Secure code update for embedded devices via
proofs of secure erasure,” in ESORICS, 2010.
[5] Y. Li, J. M. McCune, and A. Perrig, “Viper: Verifying the integrity of
peripherals’ firmware,” in CCS, ACM, 2011.
[6] K. Eldefrawy, G. Tsudik, A. Francillon, and D. Perito, “SMART: Secure
and minimal architecture for (establishing dynamic) root of trust,” in
NDSS, Internet Society, 2012.
5https://github.com/Seeed-Studio/LaunchPad_Kit
[7] P. Koeberl, S. Schulz, A.-R. Sadeghi, and V. Varadharajan, “TrustLite:
A security architecture for tiny embedded devices,” in EuroSys, ACM,
2014.
[8] F. Brasser et al., “Tytan: Tiny trust anchor for tiny devices,” in DAC,
2015.
[9] K. Eldefrawy, N. Rattanavipanon, and G. Tsudik, “HYDRA: hybrid
design for remote attestation (using a formally verified microkernel),”
in Wisec, ACM, 2017.
[10] F. Brasser, A.-R. Sadeghi, and G. Tsudik, “Remote attestation for low-
end embedded devices: the prover’s perspective,” in DAC, 2016.
[11] J. Noorman, J. V. Bulck, J. T. Mühlberg, F. Piessens, P. Maene,
B. Preneel, I. Verbauwhede, J. Götzfried, T. Müller, and F. Freiling,
“Sancus 2.0: A low-cost security architecture for iot devices,” ACM
Trans. Priv. Secur., vol. 20, pp. 7:1–7:33, July 2017.
[12] X. Carpent, N. Rattanavipanon, and G. Tsudik, “ERASMUS: Efficient
remote attestation via self-measurement for unattended settings,” in
Design, Automation and Test in Europe (DATE), 2018.
[13] I. D. O. Nunes, G. Dessouky, A. Ibrahim, N. Rattanavipanon, A.-R.
Sadeghi, and G. Tsudik, “Towards systematic design of collective remote
attestation protocols,” in ICDCS, 2019.
[14] X. Carpent, N. Rattanavipanon, and G. Tsudik, “Remote attestation of iot
devices via SMARM: Shuffled measurements against roving malware,”
in IEEE International Symposium on Hardware Oriented Security and
Trust (HOST), 2018.
[15] A. Ibrahim, A.-R. Sadeghi, and S. Zeitouni, “SeED: secure non-
interactive attestation for embedded devices,” in ACM Conference on
Security and Privacy in Wireless and Mobile Networks (WiSec), 2017.
[16] X. Carpent, K. Eldefrawy, N. Rattanavipanon, and G. Tsudik, “Tempo-
ral consistency of integrity-ensuring computations and applications to
embedded systems security,” in ASIACCS, 2018.
[17] I. De Oliveira Nunes, K. Eldefrawy, N. Rattanavipanon, M. Steiner, and
G. Tsudik, “Vrased: A verified hardware/software co-design for remote
attestation,” USENIX Security’19 (To appear). Pre-print available at:
https://arxiv.org/abs/1811.00175, 2019.
[18] Anonymous Authors, “VAPE source code.” https://www.dropbox.com/
sh/9id1ntfrnjy40tc/AADONZgUdibXlONxSMdlm6npa, 2018.
[19] Texas Instruments, “MSP430 ultra-low-power sensing &
measurement MCUs.” http://www.ti.com/microcontrollers/
msp430-ultra-low-power-mcus/overview.html.
[20] O. Girard, “openMSP430,” 2009.
[21] J. Petroni et al., “Copilot — A coprocessor-based kernel runtime
integrity monitor,” in USENIX, 2004.
[22] Trusted Computing Group (TCG), “Website.”
http://www.trustedcomputinggroup.org, 2015.
[23] X. Kovah et al., “New results for timing-based attestation,” in IEEE S&P
’12, 2012.
[24] Trusted Computing Group., “Trusted platform module (tpm),” 2017.
[25] R. Kennell et al., “Establishing the genuinity of remote computer
systems,” in USENIX, 2003.
[26] A. Seshadri et al., “SWATT: Software-based attestation for embedded
devices,” in IEEE S&P ’04, 2004.
[27] A. Seshadri et al., “Pioneer: Verifying code integrity and enforcing
untampered code execution on legacy systems,” in ACM SOSP, 2005.
[28] K. Eldefrawy et al., “SMART: Secure and minimal architecture for
(establishing a dynamic) root of trust,” in NDSS, 2012.
[29] P. Koeberl et al., “TrustLite: A security architecture for tiny embedded
devices,” in EuroSys, 2014.
[30] A. Francillon et al., “A minimalist approach to remote attestation,” in
DATE, 2014.
[31] X. Carpent, K. Eldefrawy, N. Rattanavipanon, and G. Tsudik, “Temporal
consistency of integrity-ensuring computations and applications to em-
bedded systems security,” in Proceedings of the 2018 on Asia Conference
on Computer and Communications Security, pp. 313–327, ACM, 2018.
[32] T. Abera et al., “C-flat: Control-flow attestation for embedded systems
software,” in CCS ’16, 2016.
[33] G. Dessouky, S. Zeitouni, T. Nyman, A. Paverd, L. Davi, P. Koeberl,
N. Asokan, and A.-R. Sadeghi, “Lo-fat: Low-overhead control flow
attestation in hardware,” in Proceedings of the 54th Annual Design
Automation Conference 2017, p. 24, ACM, 2017.
[34] S. Zeitouni, G. Dessouky, O. Arias, D. Sullivan, A. Ibrahim, Y. Jin,
and A.-R. Sadeghi, “Atrium: Runtime attestation resilient under mem-
ory attacks,” in Proceedings of the 36th International Conference on
Computer-Aided Design, pp. 384–391, IEEE Press, 2017.
[35] G. Dessouky, T. Abera, A. Ibrahim, and A.-R. Sadeghi, “Litehax:
lightweight hardware-assisted attestation of program execution,” in 2018
IEEE/ACM International Conference on Computer-Aided Design (IC-
CAD), pp. 1–8, IEEE, 2018.
[36] C. Hawblitzel, J. Howell, J. R. Lorch, A. Narayan, B. Parno, D. Zhang,
and B. Zill, “Ironclad apps: End-to-end security via automated full-
system verification,” in OSDI, vol. 14, pp. 165–181, 2014.
[37] B. Bond, C. Hawblitzel, M. Kapritsos, K. R. M. Leino, J. R. Lorch,
B. Parno, A. Rane, S. Setty, and L. Thompson, “Vale: Verifying high-
performance cryptographic assembly code,” in USENIX, 2017.
[38] J.-K. Zinzindohoué, K. Bhargavan, J. Protzenko, and B. Beurdouche,
“Hacl*: A verified modern cryptographic library,” in Proceedings of
the 2017 ACM SIGSAC Conference on Computer and Communications
Security, pp. 1789–1806, ACM, 2017.
[39] D. J. Bernstein, T. Lange, and P. Schwabe, “The security impact of a
new cryptographic library,” in International Conference on Cryptology
and Information Security in Latin America, 2012.
[40] K. Bhargavan, C. Fournet, M. Kohlweiss, A. Pironti, and P.-Y. Strub,
“Implementing TLS with verified cryptographic security,” in SP, 2013.
[41] X. Leroy, “Formal verification of a realistic compiler,” Communications
of the ACM, vol. 52, no. 7, pp. 107–115, 2009.
[42] G. Klein, K. Elphinstone, G. Heiser, J. Andronick, D. Cock, P. Derrin,
D. Elkaduwe, K. Engelhardt, R. Kolanski, M. Norrish, T. Sewell,
H. Tuch, and S. Winwood, “seL4: Formal verification of an OS kernel,”
in Proceedings of the ACM SIGOPS 22Nd Symposium on Operating
Systems Principles, SOSP ’09, (New York, NY, USA), pp. 207–220,
ACM, 2009.
[43] A. Cimatti, E. Clarke, E. Giunchiglia, F. Giunchiglia, M. Pistore,
M. Roveri, R. Sebastiani, and A. Tacchella, “NuSMV 2: An opensource
tool for symbolic model checking,” in International Conference on
Computer Aided Verification, pp. 359–364, Springer, 2002.
[44] A. Irfan, A. Cimatti, A. Griggio, M. Roveri, and R. Sebastiani, “Ver-
ilog2SMV: A tool for word-level verification,” in Design, Automation &
Test in Europe Conference & Exhibition (DATE), 2016, pp. 1156–1159,
IEEE, 2016.
[45] A. Cimatti, E. Clarke, E. Giunchiglia, F. Giunchiglia, M. Pistore,
M. Roveri, R. Sebastiani, and A. Tacchella, “Nusmv 2: An opensource
tool for symbolic model checking,” in International Conference on
Computer Aided Verification, pp. 359–364, Springer, 2002.
[46] S. Ravi, A. Raghunathan, and S. Chakradhar, “Tamper resistance
mechanisms for secure embedded systems,” in VLSI Design, 2004.
Proceedings. 17th International Conference on, pp. 605–611, IEEE,
2004.
[47] A. Duret-Lutz, A. Lewkowicz, A. Fauchille, T. Michaud, E. Renault, and
L. Xu, “Spot 2.0—a framework for ltl and ω-automata manipulation,”
in International Symposium on Automated Technology for Verification
and Analysis, pp. 122–129, Springer, 2016.
APPENDIX A: PROOFS FOR IMPLEMENTATION
CORRECTNESS & SECURITY
In this section we discuss the computer proof for VAPE’s
implementation correctness (Theorem 1) and the reduction
proof that VAPE is a secure PoX architecture as long as
VRASED is a secure RA architecture (Theorem 2).
Theorem 1. Definition 5 ∧ LTLs 3–12 → Definition 6.
A formal LTL computer proof for Theorem 1 is available
at [18]. Here we discuss the intuition behind this proof.
Theorem 1 states that LTLs 3 – 12, when considered in
conjunction with the machine model in Definition 5, imply
VAPE’s implementation correctness.
Recall that Definition 6 states that, in order to have
EXEC = 1 during the computation of XProve, at least once
before such time the following must have happened:
1) The system reached state S0 in which the software stored
in ER started executing from its first instruction (PC =
ERmin).
2) The system eventually reached a state S1 when ER
finished executing (PC = ERmax). In the interval
between S0 and S1, PC remained within ER,
there were no interrupts, no resets, and DMA
remained inactive.
3) The system eventually reached a state S2 when XProve
started executing (PC = CRmin). In the interval be-
tween S0 and S2 the memory regions of METADATA
and ER were not modified.
4) In the interval between S0 and S2 the OR memory region
was only modified by ER’s software execution (PC ∈
ER ∨ ¬ Modify_Mem(OR)).
The first two properties to be noted are LTL 12 and LTL 11.
LTL 12 establishes the default state of EXEC is 0. LTL 11
enforces that the only possible way to change EXEC from 0
to 1 is by having PC = ERmin. In other words, EXEC is
1 during the computation of XProve only if, at some point before
that, the code stored in ER started to execute (state S0).
To see why state S1 (when ER execution finishes, i.e.,
PC = ERmax) is reached and until then ER executes
atomically, we look at LTLs 4, 5, 6, and 9. LTLs 4, 5 and
6 enforce that PC will stay inside ER until S1 or otherwise
EXEC will be set to 0. Moreover, it is impossible to
execute instructions of XProve (PC ∈ CR) without leaving
ER, because LTL 9 guarantees that either ER and CR do not
overlap or EXEC = 0.
So far we have argued that to have a token H that reflects
EXEC = 1 the code contained in ER must have executed
successfully. What remains to be shown is that producing this
token implies that ER and METADATA are not
modified in the interval between S0 and S2, and that only ER’s
execution can modify OR in the same time interval.
Clearly, the contents of ER cannot be modified after
S0 because Modify_Mem(ER) directly implies that LTL 3
will set EXEC = 0. The same reasoning is applicable
for modifications to METADATA region with respect to
LTL 10. The same argument applies to modifying OR, with
the only exception that OR modifications are allowed only
by the CPU and when PC ∈ ER (LTL 7). This means that
OR can only be modified by the execution of ER. In addition,
LTL 7 also ensures that DMA is disabled during the execution
of ER to prevent unauthorized modification of intermediate
results in data memory. Therefore, the timeline presented in
Figure 7 is strictly implied by VAPE’s implementation. This
concludes the reasoning behind Theorem 1.
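The reasoning above can be made concrete with a small software model of the EXEC flag. The sketch below is only an illustration of the intuition: it is not VAPE's verified hardware, the signal and type names are hypothetical, and the comments map rules to LTL numbers loosely, following the prose above rather than the formal specification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Inclusive address range [min, max]. */
typedef struct { uint16_t min, max; } region_t;

/* Signals sampled at each step (illustrative names, not VAPE's wires). */
typedef struct {
    uint16_t pc;        /* program counter          */
    bool     irq;       /* interrupt asserted       */
    bool     reset;     /* reset asserted           */
    bool     dma_en;    /* DMA active               */
    bool     cpu_wen;   /* CPU write enable         */
    uint16_t cpu_addr;  /* address written by CPU   */
    uint16_t dma_addr;  /* address accessed by DMA  */
} signals_t;

typedef struct {
    region_t er, out, meta;  /* ER, OR and METADATA regions */
    uint16_t prev_pc;
    bool     prev_in_er;
    bool     exec;           /* modeled EXEC flag */
} monitor_t;

static bool in_region(region_t r, uint16_t a) {
    return a >= r.min && a <= r.max;
}

void monitor_init(monitor_t *m, region_t er, region_t out, region_t meta) {
    assert(er.min <= er.max);
    m->er = er; m->out = out; m->meta = meta;
    m->prev_pc = 0;
    m->prev_in_er = false;
    m->exec = false;                       /* default state is 0 (LTL 12) */
}

/* Advance the model by one step and return the resulting EXEC value. */
bool monitor_step(monitor_t *m, const signals_t *s) {
    bool in_er = in_region(m->er, s->pc);
    bool bad = s->reset;

    /* ER is entered only at ERmin and left only from ERmax (LTLs 4-6). */
    if (in_er && !m->prev_in_er && s->pc != m->er.min) bad = true;
    if (!in_er && m->prev_in_er && m->prev_pc != m->er.max) bad = true;

    /* No interrupts and no DMA while ER executes (LTLs 4-7). */
    if (in_er && (s->irq || s->dma_en)) bad = true;

    /* ER and METADATA must not be written (LTLs 3 and 10). */
    if (s->cpu_wen && (in_region(m->er, s->cpu_addr) ||
                       in_region(m->meta, s->cpu_addr))) bad = true;
    if (s->dma_en && (in_region(m->er, s->dma_addr) ||
                      in_region(m->meta, s->dma_addr) ||
                      in_region(m->out, s->dma_addr))) bad = true;

    /* OR may be written only by the CPU while PC is in ER (LTL 7). */
    if (s->cpu_wen && in_region(m->out, s->cpu_addr) && !in_er) bad = true;

    if (bad) {
        m->exec = false;
    } else if (s->pc == m->er.min) {
        m->exec = true;                    /* only way from 0 to 1 (LTL 11) */
    }
    m->prev_pc = s->pc;
    m->prev_in_er = in_er;
    return m->exec;
}
```

Feeding the model a trace that starts at ERmin, stays in ER, and leaves from ERmax keeps EXEC at 1. An interrupt inside ER or a later write to OR from outside ER drops it to 0, mirroring the timeline argued for above.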
Theorem 2. VAPE is secure according to Definition 3 as
long as VRASED is a secure RA architecture according
to Definition 1.
Proof. Assume that AdvPoX is an adversary capable of winning the
security game in Definition 3 against VAPE with more than negligible
probability. We show that, if such AdvPoX exists, then it can be used
to construct (in a polynomial number of steps) AdvRA that wins
VRASED’s security game (Definition 1) with more than negligible
probability. Therefore, by contradiction, nonexistence of AdvRA (i.e.,
VRASED’s security) implies nonexistence of AdvPoX (VAPE’s security).
First we recall that to win VAPE’s security game AdvPoX must
provide (HAdv, OAdv), such that XVerify(HAdv, OAdv, S, Chal, ·) =
1. To comply with conditions 3.a and 3.b in Definition 3, this must
be done in either of the following two cases:
Case1 Adv does not execute S in the time window between treq
and tverif (i.e., ¬XAtomicExecPrv(S, treq → tverif )).
Case2 Adv calls XAtomicExecPrv(S, treq → tverif ) but modifies
its output O in between the time when the execution of S
completes and the time when XProve is called.
However, according to the specification of VAPE’s XVerify algo-
rithm (see Definition 4) a token HAdv will only be accepted if it
reflects an input value with EXEC = 1, as expected by Vrf. In
VAPE’s implementation O is stored in region OR, and S in region
ER. Moreover, given Theorem 1, we know that having EXEC = 1
during XProve implies three conditions have been fulfilled:
Cond1 The code in ER executed successfully.
Cond2 The code in ER and METADATA were not modified
after starting ER’s execution and before calling XProve.
Cond3 Outputs in OR were not modified after completing ER’s
execution and before calling XProve.
The third condition rules out Case2, since that case
assumes that Adv can modify O, which resides in OR, after ER’s execution
while EXEC remains logical 1 during XProve. We further break down
Case1 into three cases:
Case1.1 Adv does not follow Cond1-Cond3. The only way for
Adv to produce (HAdv, OAdv) in this case is to not call
XProve, i.e., by directly guessing H.
Case1.2 Adv follows Cond1-Cond3 but does not execute S be-
tween treq and tverif . Instead, it produces (HAdv, OAdv) by
calling:
OAdv ≡ XAtomicExecPrv(ERAdv, treq → tverif ) (13)
where ERAdv is a memory region different from the one speci-
fied by Vrf on XRequest (AdvPoX can do this by modifying
METADATA to different values of ERmin and ERmax
before calling XAtomicExec).
Case1.3 Similar to Case1.2, but ERAdv is the same region speci-
fied by Vrf on XRequest containing a different executable SAdv.
We show that an adversary that succeeds in any of these cases
can be used to win VRASED’s security game. To see why this is
the case, we note that VAPE’s XProve function is implemented by
using VRASED’s SW-Att without any modification. SW-Att covers
memory regions MR (challenge memory) and AR (attested region).
Hence, VAPE instantiates these memory regions as:
1) MR = Chal;
2) ER ⊂ AR;
3) OR ⊂ AR;
4) METADATA ⊂ AR;
Doing so ensures that all sensitive memory regions used by VAPE
are included among the inputs to VRASED’s attestation. Let X(t)
denote the content in memory region X at time t. AdvRA can then
be constructed using AdvPoX as follows:
1) AdvRA receives Chal from the challenger in step (2) of RA
security game of Definition 1.
2) At arbitrary time t, AdvRA has 3 options to write AR(t) =
ARAdv and call AdvPoX:
a) Modify ER(t) ≠ S or OR(t) ≠ O or
METADATA(t) ≠ METADATAVrf . It then calls
AdvPoX in Case1.1.
b) Modify ER to be different from the range chosen by Vrf.
Therefore, METADATA(t) ≠ METADATAVrf . It then
calls AdvPoX in Case1.2.
c) Modify ER(t) to be different from S. It then calls AdvPoX
in Case1.3.
In any of these options, AdvRA will produce (HAdv,OAdv),
such that XVerify(HAdv,OAdv,S, Chal, ·) = 1 with non-
negligible probability.
3) AdvRA replies to the challenger with the pair (M, HAdv),
where M corresponds to the values of S, O and
METADATAVrf , matching HAdv and OAdv generated by
AdvPoX. By construction M ≠ ARAdv = AR(t), as required
by Definition 1.
4) Challenger will accept (M,HAdv) with the same non-negligible
probability that AdvPoX has of producing (HAdv,OAdv) such
that XVerify(HAdv,OAdv,S, Chal, ·) = 1.
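The role that the attested regions play in the proof above can be illustrated with a verifier-side check. The sketch below is hypothetical in every detail: the function names, the 9-byte METADATA layout, and the order in which inputs are absorbed are assumptions, and the keyed FNV-1a hash is a toy placeholder that is NOT cryptographically secure (VRASED's SW-Att computes an HMAC). It only shows the shape of the check: Vrf recomputes the token over Chal, the expected METADATA with EXEC = 1, the expected executable S, and the claimed output O, and accepts only on a match.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy FNV-1a absorb step. Placeholder only: NOT a secure MAC. */
static uint32_t absorb(uint32_t h, const uint8_t *data, size_t len) {
    for (size_t i = 0; i < len; i++) {
        h ^= data[i];
        h *= 16777619u;
    }
    return h;
}

/* Assumed METADATA contents: region bounds plus the EXEC flag. */
typedef struct {
    uint16_t er_min, er_max;
    uint16_t or_min, or_max;
    uint8_t  exec;
} metadata_t;

/* Token over key, Chal, METADATA, ER contents and OR contents. */
uint32_t pox_token(const uint8_t *key, size_t key_len,
                   const uint8_t *chal, size_t chal_len,
                   const metadata_t *md,
                   const uint8_t *er, size_t er_len,
                   const uint8_t *out, size_t out_len) {
    /* Serialize METADATA explicitly to avoid struct padding bytes. */
    uint8_t m[9] = {
        (uint8_t)(md->er_min & 0xFF), (uint8_t)(md->er_min >> 8),
        (uint8_t)(md->er_max & 0xFF), (uint8_t)(md->er_max >> 8),
        (uint8_t)(md->or_min & 0xFF), (uint8_t)(md->or_min >> 8),
        (uint8_t)(md->or_max & 0xFF), (uint8_t)(md->or_max >> 8),
        md->exec
    };
    uint32_t h = 2166136261u;
    h = absorb(h, key, key_len);
    h = absorb(h, chal, chal_len);
    h = absorb(h, m, sizeof m);
    h = absorb(h, er, er_len);
    h = absorb(h, out, out_len);
    return h;
}

/* Accept only a token that reflects EXEC = 1 for the expected S and the
 * claimed output O under the challenge issued by Vrf. */
bool xverify(uint32_t h_recv,
             const uint8_t *key, size_t key_len,
             const uint8_t *chal, size_t chal_len,
             const metadata_t *md_expected,
             const uint8_t *s, size_t s_len,
             const uint8_t *o, size_t o_len) {
    metadata_t md = *md_expected;
    md.exec = 1;  /* Vrf always expects a successful execution */
    return pox_token(key, key_len, chal, chal_len, &md,
                     s, s_len, o, o_len) == h_recv;
}
```

Because EXEC, ER, OR and METADATA are all inputs to the token, a token computed with EXEC = 0, with a different executable, or with a modified output does not match the value Vrf recomputes.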
APPENDIX B: IMPLEMENTATION & FURTHER EVALUATION
A. FPGA Implementation
Following VRASED, VAPE targets the MSP430 architecture
and leverages OpenMSP430 [20] as its open core implementa-
tion. We extended VRASED to implement the hardware archi-
tecture presented in Figure 6. In addition to the VAPE module
in HW-Mod, we added another peripheral module responsible
for storing and maintaining VAPE’s METADATA. As a
peripheral, the content in METADATA can be accessed
in a pre-defined memory address via the standard periph-
eral memory access. We also ensure that EXEC (which is
located inside METADATA) is unmodifiable in software
by removing software-write wires in hardware. Finally, we
use Xilinx Vivado to synthesize an RTL description of the
modified HW-Mod and deploy it on the Artix-7 FPGA class.
B. Overhead
Architecture       Registers   Look-Up Tables (LUTs)
OpenMSP430 [20]    691         1904
VRASED [17]        721         1964
VAPE               735         2206
TABLE II
VAPE’S HARDWARE OVERHEAD.
Hardware & Memory Overhead. Table II reports VAPE’s
hardware overhead when compared to the unmodified Open-
MSP430 [20] and VRASED [17]. VAPE’s hardware overhead
is small compared to the baseline VRASED; it requires 2% and
12% additional registers and LUTs, respectively. In absolute
numbers, it adds 44 registers and 302 LUTs to the underlying
MCU. In terms of memory, VAPE requires 9 additional bytes
of RAM for storing METADATA. This overhead corresponds
to 0.01% of MSP430 16-bit address space.
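The percentages and absolute numbers quoted above follow directly from Table II. As a worked check (relative figures use VRASED as the baseline, absolute figures use the unmodified OpenMSP430, and the address space is taken as 2^16 bytes):

```latex
\begin{align*}
\text{Registers vs. VRASED:}     &\quad \frac{735 - 721}{721} = \frac{14}{721} \approx 1.9\% \\
\text{LUTs vs. VRASED:}          &\quad \frac{2206 - 1964}{1964} = \frac{242}{1964} \approx 12.3\% \\
\text{Registers vs. OpenMSP430:} &\quad 735 - 691 = 44 \\
\text{LUTs vs. OpenMSP430:}      &\quad 2206 - 1904 = 302 \\
\text{METADATA RAM:}             &\quad \frac{9}{2^{16}} = \frac{9}{65536} \approx 0.014\%
\end{align*}
```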
Run-time. We do not observe any overhead for software’s
execution time on the VAPE-enabled Prv. This is because
VAPE does not introduce new instructions or modifications
to the MSP430 ISA; VAPE’s hardware runs in parallel with
the original MSP430 CPU. The run-time to produce a proof of
S’s execution comprises: (1) time to execute S (XAtomicExec),
and (2) time to compute an attestation token (XProve). The
first depends only on the behavior of S itself (e.g., S can
be a small sequence of instructions or have long loops). As
mentioned above, VAPE does not affect S’s run-time. XProve’s
run-time is linear in the total size of ER + OR. In a worst
case setting where these regions occupy the entire program
memory, 8 kB, XProve takes around 900ms to complete on
an 8MHz device.
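Since XProve's run-time is linear in the attested size, the worst-case figure above can be scaled to smaller regions. The helper below is a rough back-of-the-envelope sketch, not a measurement: it assumes that "8 kB" means 8192 bytes, that scaling is purely proportional with no fixed cost, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define REF_BYTES 8192u     /* assumption: "8 kB" read as 8192 bytes  */
#define REF_MS    900u      /* worst-case XProve time reported above  */
#define CLOCK_HZ  8000000u  /* 8 MHz device                           */

/* Estimated XProve time in milliseconds for a region of the given size,
 * obtained by proportional scaling of the worst-case figure. */
uint32_t xprove_ms_estimate(uint32_t attested_bytes) {
    assert(attested_bytes <= REF_BYTES);
    return (attested_bytes * REF_MS) / REF_BYTES;
}

/* Approximate clock cycles spent per attested byte in the worst case. */
uint32_t xprove_cycles_per_byte(void) {
    uint32_t total_cycles = (CLOCK_HZ / 1000u) * REF_MS;  /* 7,200,000 */
    return total_cycles / REF_BYTES;
}
```

Under these assumptions, halving ER + OR to 4096 bytes would halve the estimate to 450 ms, and the worst case corresponds to roughly 878 cycles per attested byte.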
C. Comparison with CFA
To the best of our knowledge, VAPE is the first architec-
ture for proofs of execution. Therefore, there are no other
architectures that are directly comparable. Nonetheless, to
provide a (performance and overhead) point of reference and
a comparison, we contrast VAPE’s overhead with that of three
Control Flow Attestation (CFA) architectures. As discussed
in Section II, even though CFA is not directly applicable to
produce proofs of execution with authenticated outputs, we
consider it to be the most closely related service, since it
reports on the execution path of a program.
In this comparison, we consider three recent CFA ar-
chitectures: Atrium [34], LiteHAX [35], and LO-FAT [33].
Figure 11.a compares VAPE to these architectures in terms
of number of additional hardware Look-Up Tables (LUTs)
required. In this figure, the black dashed line represents the
total cost of the MSP430 MCU: 1904 LUTs. Figure 11.b
presents a similar comparison for the amount of additional
registers required by these architectures. In this case, the total
cost of the MSP430 MCU itself is of 691 registers. Finally,
Figure 11.c presents the amount of dedicated RAM required
by these architectures (VAPE’s dedicated RAM corresponds to
the exclusive access stack implemented by VRASED).
As expected, VAPE requires much lower overhead. Accord-
ing to these results, the cheapest CFA architecture, LiteHAX,
would represent an overhead of nearly 100% LUTs and 300%
registers, if applied to MSP430. In addition, LiteHAX would
require 150 kB of dedicated RAM. This amount is far above
the entire addressable memory (64 kB) of 16-bit processors,
such as MSP430. These results support our claim that CFA is
not applicable to this class of low-end devices. VAPE, on the
other hand, introduces a total of 12% additional LUTs and 2%
additional registers. VRASED requires about 2 kB of reserved
RAM, which is not increased by VAPE’s support for proofs of
execution.
APPENDIX C: EXECUTABLE LIMITATIONS
We now discuss the limitations that our approach imposes on
executable types.
Shared libraries. In order to produce a valid proof, Vrf
must ensure that execution of S does not depend on external
code located outside its execution range ER (e.g., shared
libraries). A call to such code would violate LTL 4, resulting
in EXEC = 0 during the HMAC computation. One possible
way to support this type of executable is to transform it into a
self-contained executable by statically linking all dependencies
at compile time. Another is to appropriately set
ER to cover all external code used by S.
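The self-containment condition above can be pictured as a range check on the code addresses that S may transfer control to. The sketch below is purely illustrative: the function name is hypothetical, and it assumes that Vrf has already extracted the call and branch targets of S with some external tool (not shown). Targets computed at run-time, such as indirect branches, are not covered.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Return true if every control-transfer target of S lies inside
 * [er_min, er_max], i.e., S does not depend on code outside ER. */
bool is_self_contained(const uint16_t *targets, size_t n,
                       uint16_t er_min, uint16_t er_max) {
    assert(er_min <= er_max);
    for (size_t i = 0; i < n; i++) {
        if (targets[i] < er_min || targets[i] > er_max) {
            return false;  /* would leave ER, so EXEC would become 0 */
        }
    }
    return true;
}
```

An executable that fails such a check would need to be statically linked, or ER would need to be widened, before a valid proof can be expected.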
#include <stdint.h>
#include <string.h>

// Port 4 GPIO registers (memory-mapped)
#define P4IN  (*(volatile unsigned char *) 0x001C)
#define P4OUT (*(volatile unsigned char *) 0x001D)
#define P4DIR (*(volatile unsigned char *) 0x001E)
#define P4SEL (*(volatile unsigned char *) 0x001F)
#define BIT4  (0x0010)

#define MAXTIMINGS 85

#define OR 0xEEE0 // OR is in AR

#define HIGH   0x1
#define LOW    0x0
#define INPUT  0x0
#define OUTPUT 0x1

// Busy-wait helpers; their definitions are not part of this listing
void delayMS(unsigned int ms);
void delayMicroseconds(unsigned int us);
void ReadSensor(void);

__attribute__ ((section (".exec.entry"), naked)) void ReadSensorEntry() {
    // ERmin
    ReadSensor();
    __asm__ volatile ("br #__exec_leave" "\n\t");
}

__attribute__ ((section (".exec.body"))) int digitalRead() {
    if (P4IN & BIT4) return HIGH;
    else return LOW;
}

__attribute__ ((section (".exec.body"))) void digitalWrite(uint8_t val) {
    if (val == LOW)
        P4OUT &= ~BIT4;
    else
        P4OUT |= BIT4;
}

__attribute__ ((section (".exec.body"))) void pinMode(uint8_t val) {
    if (val == INPUT)
        P4DIR &= ~BIT4;
    else if (val == OUTPUT)
        P4DIR |= BIT4;
}

__attribute__ ((section (".exec.body"))) void ReadSensor() {
    // Tell the sensor that we are about to read
    digitalWrite(HIGH);
    delayMS(250);
    pinMode(OUTPUT);
    digitalWrite(LOW);
    delayMS(20);
    digitalWrite(HIGH);
    delayMicroseconds(40);
    pinMode(INPUT);
    uint8_t laststate = HIGH, counter = 0, j = 0, i;
    uint8_t data[5] = {0};
    uint16_t avg = 0; // running sum of pulse widths of "1" bits
    uint8_t k = 0;    // number of "1" bits seen
    // Read the sensor's value
    for (i = 0; i < MAXTIMINGS; i++) {
        counter = 0;
        while (digitalRead() == laststate) {
            counter++;
            if (counter == 255) {
                break;
            }
        }
        laststate = digitalRead();
        if (counter == 255) break;
        if ((i >= 4) && (i % 2 == 0)) {
            data[j / 8] <<= 1;
            if (counter > 100) {
                data[j / 8] |= 1;
                avg += counter;
                k++;
            }
            j++;
        }
    }
    // Copy the reading to OR
    memcpy((uint8_t *) OR, data, 5);
}

__attribute__ ((section (".exec.exit"), naked)) void ReadSensorExit() {
    __asm__ volatile ("ret" "\n\t");
    // ERmax
}
(a) Fire Sensor’s code written in C
...
SECTIONS
{
  ...
  .text :
  {
    ...
    *(.exec.entry)
    . = ALIGN(2);
    *(.exec.body)
    . = ALIGN(2);
    PROVIDE (__exec_leave = .);
    *(.exec.exit)
  } > REGION_TEXT
  ...
}
...
(b) Linker script
Fig. 12. Code snippets for (a) fire sensor described in Section VII (b) linker
script
[Figure 11: three bar charts, each comparing VAPE, Atrium, LiteHAX, and LO-FAT.]
(a) Additional HW overhead (%) in Number of Look-Up Tables (y-axis: Number of Additional Look-Up Tables)
(b) Additional HW overhead (%) in Number of Registers (y-axis: Number of Additional Registers)
(c) Dedicated RAM (y-axis: Additional Dedicated RAM (kB))
Fig. 11. Overhead comparison between VAPE and CFA architectures. Dashed
lines in (a) and (b) represent the total hardware cost of MSP430. The dashed
line in (c) represents the total addressable memory (64 kB) on MSP430.
Self-modifying code (SMC). SMC is a type of executable
that alters itself while executing. Clearly, this executable type
violates LTL 3, which requires the code in ER to remain
unchanged during ER’s execution. It is unclear how VAPE can
be adapted to support SMC; however, we are unaware of any
legitimate and realistic use-case of SMC in our target bare-
metal applications.
Interrupts. Our notion of successful execution in Section V-A
prohibits interrupts from occurring during S’s execution. This
limitation can be problematic, especially for interrupt-driven
programs such as those in real-time systems. Nonetheless,
simply allowing interrupts during execution may
result in attacks that allow malware to modify intermediate
execution results in data memory and consequently influence
the execution output. One possible way to remedy this issue
is to allow interrupts as long as all interrupt handlers are:
(1) immutable from the start of execution until the end of
attestation and (2) included in the attested memory range
during the attestation process. Vrf can then determine whether
an interrupt that may have happened during the execution is
malicious by inspecting all interrupt handlers covered by the proof
of execution.
APPENDIX D: SOFTWARE TRANSFORMATION
Recall that our notion of successful execution (in Sec-
tion V-A) requires the function’s entry point to be at the first
instruction and the exit point to be at the last instruction. In
this section, we discuss an efficient way to transform arbitrary
software (besides the ones in Appendix VIII-C) implementing
a function to conform with this requirement.
Figure 12a shows a (partial) implementation
of the ReadSensor function described in Section VII. This
implementation, when converted to an executable, does not
guarantee VAPE’s executable requirement, since the compiler
may choose to place one of its sub-functions, instead of
ReadSensor, at the entry and/or exit points of the
executable. One obvious way to fix this issue is to implement
all of its sub-functions as inline functions; however, such an
approach may be inefficient since, in this example, it would
duplicate the code of the same sub-functions (e.g.,
digitalWrite) multiple times inside the executable.
Instead, we create dedicated functions for the entry
(ReadSensorEntry) and exit (ReadSensorExit) points, and assign
those functions to separate executable sections: “.exec.entry” for the
entry and “.exec.exit” for the exit. Then, we assign all sub-
functions used by ReadSensor, as well as ReadSensor
itself, to the same section, “.exec.body”, and modify the
MSP430 linker script to place “.exec.body” between the “.exec.entry”
and “.exec.exit” sections. The modified linker script is shown
in Figure 12b. This way, we ensure that the entry and exit
functions are located at the beginning and the end of the executable,
respectively, and thus the resulting executable conforms with
VAPE’s requirement. The overhead of this approach is small:
it adds a constant 10 bytes to the instrumented executable.
