HECTOR-V: A Heterogeneous CPU Architecture for a Secure RISC-V Execution
  Environment by Nasahl, Pascal et al.
Pascal Nasahl
pascal.nasahl@iaik.tugraz.at
Graz University of Technology
Graz, Austria
Robert Schilling
robert.schilling@iaik.tugraz.at
Graz University of Technology
Graz, Austria
Mario Werner
mario.werner@iaik.tugraz.at
Graz University of Technology
Graz, Austria
Stefan Mangard
stefan.mangard@iaik.tugraz.at
Graz University of Technology
Graz, Austria
ABSTRACT
To ensure secure and trustworthy execution of applications in potentially
insecure environments, vendors frequently embed trusted execution
environments (TEE) into their systems. Applications executed in this safe,
isolated space are protected from adversaries, including a malicious
operating system. TEEs are usually built by integrating protection
mechanisms directly into the processor or by using dedicated external
secure elements. However, both of these approaches only cover a narrow
threat model, resulting in limited security guarantees. Enclaves nested
into the application processor typically provide weak isolation between
the secure and non-secure domain, especially when considering side-channel
attacks. Although external secure elements do provide strong isolation,
the slow communication interface to the application processor is exposed
to adversaries and restricts the use cases. Independently of the
implementation approach used, TEEs often lack the possibility to establish
secure communication with external peripherals, and most operating systems
executed inside TEEs do not provide state-of-the-art defense strategies,
making them vulnerable to various attacks.
We argue that TEEs, such as Intel SGX or ARM TrustZone, implemented on the
main application processor, are insecure, especially when considering
side-channel attacks. In this paper, we demonstrate how a heterogeneous
multicore architecture can be utilized to realize a secure TEE design. We
directly embed a secure processor into our HECTOR-V architecture to provide
strong isolation between the secure and non-secure domain. The tight
coupling of the TEE and the application processor enables HECTOR-V to
provide mechanisms for establishing secure communication channels between
different devices. We further introduce RISC-V Secure Co-Processor (RVSCP),
a security-hardened processor tailored for TEEs. To secure applications
executed inside the TEE, RVSCP provides hardware-enforced control-flow
integrity and rigorously restricts I/O accesses to certain execution
states. RVSCP reduces the trusted computing base to a minimum by providing
operating system services directly in hardware.
KEYWORDS
trusted execution environment, secure I/O, heterogeneous computer
architecture, RISC-V
1 INTRODUCTION
With the growing demand for complex IT applications, such as autonomous
driving or smart city infrastructures, software complexity increases
steadily. The codebase of the Linux kernel, for example, increases by 250k
lines of code (LOC) each year, reaching 27.8M LOC in 2020 [12]. This is
challenging because one can expect roughly 1 to 25 bugs in 1,000 lines of
code [41]. While not all of these bugs might be exploitable by an attacker,
growing codebase complexity clearly leads to a larger attack surface.
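As a rough back-of-the-envelope illustration, the following Python sketch
multiplies the kernel size cited above by the cited defect density range.
The names are illustrative, and the result is only an order-of-magnitude
estimate derived from the two cited figures, not a measurement.

```python
# Figures taken from the text: 27.8M LOC and 1 to 25 bugs per 1,000 LOC.
KERNEL_LOC = 27_800_000
BUGS_PER_KLOC_LOW = 1
BUGS_PER_KLOC_HIGH = 25


def estimated_bugs(loc: int, bugs_per_kloc: int) -> int:
    """Expected number of bugs for a codebase of `loc` lines."""
    return loc * bugs_per_kloc // 1000


low = estimated_bugs(KERNEL_LOC, BUGS_PER_KLOC_LOW)
high = estimated_bugs(KERNEL_LOC, BUGS_PER_KLOC_HIGH)
print(f"Estimated bugs: {low:,} to {high:,}")  # Estimated bugs: 27,800 to 695,000
```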
One strategy to deal with the growing complexity and meet security goals is
to isolate all security-critical applications using a trusted execution
environment (TEE) [23]. A TEE establishes a secure execution environment by
creating a safe, isolated space using hardware and software primitives
[24]. Most TEE threat models consider a powerful adversary controlling user
space applications, the operating system, or even the hypervisor, trying to
influence the execution of applications in the trusted environment. To
achieve the protection needed for this threat model, TEEs require strong
isolation between the TEE and the so-called rich execution environment
(REE).
The powerful concept of shifting security-critical applications into a
trusted execution environment has already been adopted by major vendors
like Intel, ARM, and Apple. One of the most common TEE design approaches is
to create a virtual secure processor in the main application processor by
using hardware extensions. Intel SGX [20] and ARM TrustZone [7] take this
approach. In contrast to embedding the TEE tightly into the CPU, Google’s
Titan [34] and Apple’s T2 [3] implement a secure element by externally
mounting a dedicated security processor next to the main CPU. However, both
TEE design approaches have different weaknesses. Several recent attacks
[13, 14, 17, 28, 29, 37, 49, 54, 66] showed that the isolation of TrustZone
and SGX can be bypassed by mounting cache or transient-based side-channel
attacks. While dedicated secure elements provide strong isolation between
REE and TEE, Google’s Titan, e.g., uses a slow SPI connection for
communication between the two domains, limiting potential use cases.
Furthermore, this off-chip communication fabric between REE and TEE is
physically exposed to an attacker, making it vulnerable to probing attacks.
Independently of the TEE design approach used, typical TEE implementations
exhibit several weaknesses. First, although a security breach within the
TEE is fatal, operating systems deployed in the
secure environment surprisingly do not offer state-of-the-art protection
mechanisms like ASLR, guard pages, or stack cookies [15]. Second, most TEEs
do not provide architectural features to establish secure communication
with I/O devices. The lack of trusted I/O paths in TEE systems is critical
because secrets shared between user and TEE are unprotected.
With the Intel Lakefield architecture [35] and an AMD patent [42], major
vendors recently announced plans to introduce heterogeneous multi-core
architectures in upcoming processor designs. While the approach of tightly
coupling different processor cores on one chip to balance power and energy
efficiency is already used by ARM’s big.LITTLE technology [6] in mobile
platforms, Intel and AMD are planning to introduce this concept in
forthcoming computer architectures. This design strategy raises the
following research question:
RQ1: "Can the tight coupling of distinct processor cores on a SoC be used
to increase the security of trusted execution environments?"
Contribution
In this paper, we introduce HECTOR-V, a design methodology utilizing a
heterogeneous multi-core architecture to develop secure trusted execution
environments. HECTOR-V achieves strong isolation between the secure and
non-secure domain and provides protection against various side-channel
attacks, such as cache and transient-based attacks, by using a distinct
processor of the heterogeneous core cluster for the secure environment. The
tight coupling of TEE and REE in the shared SoC infrastructure yields
several advantages. In the HECTOR-V architecture, the application processor
and TEE are connected through a communication fabric, enabling a high-speed
link between the two processors. Additionally, since the TEE is embedded
into the SoC, all peripherals integrated into the system are also available
to the secure environment. To manage peripheral sharing and protect
peripherals from unauthorized accesses, HECTOR-V further introduces a
secure I/O path concept. The identifier-based strategy, deeply nested into
the communication fabric and the peripherals, allows fine-grained
protection of the attached peripherals. In HECTOR-V, the access permission
management is implemented using a hardware-based security monitor. While
the security monitor owner can configure a set of access permissions, the
other parties in the system can request access to certain peripherals by
consulting the security monitor. We further extend this mechanism to
dynamically grant and deny access to certain peripherals by introducing the
concept of security monitor ownership transfer. To enable various use
cases, the owner of the security monitor can dynamically transfer the
ownership to other parties.
Furthermore, we introduce RVSCP, a security-hardened RISC-V processor
tailored for use as a TEE. Although the HECTOR-V architecture is generic
and independent of the TEE processor used, we show that RVSCP further
increases the TEE security by utilizing features of HECTOR-V. RVSCP isolates
applications within the trusted execution environment by enforcing the
integrity of the control flow. We combine the control-flow information with
the secure I/O mechanism to only grant access to certain peripherals when
reaching a predefined control-flow state. To reduce the trusted computing
base, and therefore the attack surface, to a minimum, RVSCP implements
operating system features, such as multitasking, directly in hardware.
We demonstrate the HECTOR-V concept by introducing a heterogeneous
multi-core architecture for RISC-V. We embedded a RISC-V processor with a
control-flow integrity protection unit into the HECTOR-V architecture. To
verify the functionality of the HECTOR-V and RVSCP design approaches, we use
secure boot as a prototype application on our FPGA implementation. In
summary, our contributions are:
HECTOR-V, a design methodology for developing secure TEEs. HECTOR-V
utilizes a heterogeneous multi-core architecture to realize a secure
trusted execution environment. The architecture includes a configurable
security monitor managing access permissions to specific peripherals. This
mechanism allows the system to establish secure communication channels
between peripherals, users, and the processor cores. The dynamic transfer
of security monitor ownership enables the HECTOR-V architecture to realize
several use cases, such as secure boot or executing trusted applications.
RVSCP, a security-hardened processor integrated into the HECTOR-V
architecture. RVSCP protects applications within the TEE by using a
control-flow integrity protection unit and combines this mechanism with the
peripheral access protection offered by HECTOR-V. The secure processor
provides operating system features in hardware to minimize the trusted
computing base.
2 BACKGROUND
To protect the processing of sensitive data, such as biometric data or
cryptographic keys, in an untrusted environment, trusted execution
environments (TEE) are used. These secure environments, mostly used in
mobile platforms and cloud computing, comprise a combination of hardware
and software features, the trusted computing base (TCB), which needs to be
trusted by the system [23]. A TEE guarantees the secure execution of
trusted applications, the trustlets, even when considering a malicious
operating system running in the rich execution environment (REE). This
feature can be realized with a virtual processor approach, by using an
external secure processor, or by embedding a secure processor into the SoC.
2.1 ARM TrustZone
ARM TrustZone [7] is a TEE mostly used in mobile platforms. TrustZone
protects trustlets executed inside the TEE from malicious accesses and
tampering attempts, even from the operating system [45]. Due to the massive
deployment of ARM cores in Android phones, TrustZone is commonly used by
vendors, such as Samsung with KNOX [46], to provide secure key storage and
payment. The strong isolation of TrustZone is achieved by partitioning the
system into a secure and non-secure world using hardware extensions.
TrustZone enforces this separation by introducing a non-secure (NS) bit
directly into the bus architecture. This bit indicates in which domain the
cache, the TrustZone Address Space Controller (TZASC), and other
peripherals are working and protects secure world devices from non-secure
world accesses [44]. Secure communication between the two domains and the
handling of context switches, interrupts, and exceptions are managed by the
security monitor. Since the rigorous segregation between the two
security domains is enforced for all SoC components, one physical TrustZone
core operates like two virtual cores [63]. To deploy multiple trusted
applications in the trusted execution environment, an operating system is
usually deployed in the secure environment. Trusty [27], which is used in
Android devices supporting TrustZone, consists of a small kernel and
provides a communication interface to the non-secure world. Trusted
applications, such as key storage and fingerprint processing, are directly
integrated into the kernel by the device vendor. Deploying third-party
applications is currently not supported by Trusty. Other small-footprint
operating systems deployed in TrustZone-assisted TEEs are the open-source
OP-TEE [25] OS developed by Linaro and Qualcomm’s QSEE.
Security of TrustZone. The main idea of a TEE is to reduce the trusted
computing base to a minimum. Kernels deployed in ARM TrustZone comply with
this requirement by only providing a small set of operating system
services. Although applications executed in the TEE require strong
isolation between the trustlets, operating systems like Trusty and OP-TEE
fail to provide state-of-the-art protection mechanisms like ASLR or stack
cookies [15]. Due to the insufficient protection of applications executed
in the TEE, several attacks, like the Qualcomm QSEE attack [10], were
discovered in the past. In this particular attack, an adversary could gain
remote code execution by exploiting a single buffer overflow vulnerability
in a trustlet. These attacks are critical because they threaten not only
the security of the TEE but also the security of the REE. Since Qualcomm’s
QSEE allows the TEE to modify arbitrary memory, an attacker could easily
obtain control over the Android system executed in the REE [11]. However,
not only is the lack of conventional security concepts in TEE kernels
problematic, but TrustZone is also vulnerable to various side-channel
attacks. CLKSCREW [51] utilizes the DVFS energy management technique to
perform a fault attack that retrieves secret keys from the TEE or deploys
malicious code. Although TrustZone separates the cache between the secure
and non-secure world, past research demonstrated several cache attacks
[29, 37, 66].
2.2 Intel SGX
Intel SGX [20] provides hardware-based isolation for enclaves by
introducing a set of dedicated instructions embedded into the
microarchitecture of the CPU. The threat model covers potentially malicious
user applications, the operating system, and the hypervisor and protects
the enclave content from unauthorized accesses [21]. To verify the program
loaded into the enclave by untrusted parties, Intel SGX offers a software
attestation service. SGX preserves confidentiality and integrity of
external memory by using transparent memory encryption [30].
Security of SGX. Although SGX provides protection against physical
attackers targeting the external memory, Intel explicitly excludes
side-channel attacks from its threat model and recommends that developers
prevent such attacks at the enclave level [32]. Past research showed that
SGX is vulnerable to cache attacks [13, 28, 49], transient attacks
[14, 17, 54], and fault attacks such as Plundervolt [43]. In addition to
side-channel attacks, trustlets in Intel SGX are also vulnerable to
classical software attacks, such as attacks exploiting memory
vulnerabilities [19, 36, 57]. Furthermore, SGX does not natively provide
secure I/O interaction; it requires additional techniques, such as SGXIO
[58], to establish secure communication with peripherals.
2.3 Google Titan
Google Titan [34] is a secure element providing a hardware-based root of
trust for cloud computing. To mitigate attacks against the integrity of the
basic firmware or the operating system, Google uses Titan to verify the
boot process on server platforms [26]. Titan internally stores signatures
of the boot files and uses cryptographic accelerators to verify the loaded
boot images. In addition, Titan also provides several cryptographic
services to the main CPU. The dedicated chip is mounted next to the main
application processor and establishes communication over a slow SPI
interconnect. Google recently announced plans to use the open-source
version, OpenTitan [39], in the next Titan generations. Similarly,
OpenTitan consists of a small RISC-V processor and several cryptographic
hardware accelerators. A smaller version of Titan, the Titan-M, is used by
Google in current Pixel phones.
2.4 Apple Secure Enclave Processor
Apple’s Secure Enclave Processor (SEP) [31], which was introduced with the
iPhone 5s, is a dedicated ARM secure coprocessor directly embedded into the
main SoC. Similar to ARM TrustZone in mobile platforms, SEP is mainly used
to store keys and enable secure processing of biometric data [4]. SEP,
which runs the L4-based SEPOS, fully controls several dedicated secure
peripherals, such as cryptographic accelerators, and shares other
peripherals on the SoC with the application processor. The memory
controller allows the secure coprocessor to define isolated memory regions
in the main memory. To enable communication between the main application
processor and SEP, a mailbox system is used. Since the secure enclave
processor also contains a key and signature storage, SEP is also used to
verify the boot process of the application processor.
Security of SEP. Although SEP and SEPOS implement basic protection
mechanisms like stack cookies, guard pages, and memory encryption,
state-of-the-art defense strategies such as ASLR are still missing [40]. In
contrast to ARM TrustZone, peripherals are either exclusively owned by SEP
or are shared, like the power manager or the PLL clock generator. This
makes SEP possibly vulnerable to attacks similar to CLKSCREW [51].
Additionally, the decryption keys for some SEP implementations were already
leaked [33].
3 THREAT MODEL
Our threat model considers the common attack scenarios on TEEs defined by
Cerdeira et al. [15]. We consider an attacker directly exploiting
architectural weaknesses of the TEE and the TEE kernel. Here, an adversary
uses a combination of bugs in the kernel, flaws in the hardware protection
mechanism, and missing state-of-the-art defense strategies, such as ASLR or
guard pages, to compromise the system. Furthermore, we expect bugs in
trustlets, which can be exploited by an attacker over the communication
interface between REE and TEE. The security of applications in the REE or
TEE can also be threatened by cache-based or transient-based side-channel
attacks. Additionally, we consider malicious trustlets explicitly trying to
attack the SoC. We also extend the threat model of common TEEs to cover
attacks on peripherals, i.e., illegitimate accesses to secure storage
elements, sensitive peripherals such as a fingerprint reader or SPI [1], or
protected memory regions. In summary, we expect a powerful attacker having
full control over user applications, the operating system, trustlets, or
even the hypervisor executed on the application processor.
Figure 1: Overall HECTOR-V design (blocks shown: Core Cluster,
Interconnect, SM, Peripheral).
4 DESIGN
This section presents our secure TEE design approach consisting of the
HECTOR-V architecture and the secure processor RVSCP. We first introduce
HECTOR-V, which consists of several architectural features, such as trusted
I/O paths, a security monitor, and a secure TEE integration. Then, we
introduce RVSCP, a concrete proposal for a secure processor used as a TEE,
and demonstrate how RVSCP and HECTOR-V combined form a secure TEE system. In
Section 5, we then demonstrate a prototype of HECTOR-V and RVSCP integrated
into our RISC-V base platform, the lowRISC chip.
4.1 HECTOR-V Design
The HECTOR-V design proposes architectural features for creating secure
TEEs. As seen in Figure 1, the TEE in HECTOR-V is realized by mounting a
secure processor directly into the main SoC. This approach is similar to
ARM’s big.LITTLE technology [6], where independent cores are embedded into
the chip. However, instead of using the additional cores for power
efficiency, HECTOR-V uses the dedicated processor for security purposes.
Although placing a secure processor directly into the SoC increases the
area overhead, we argue that this approach is feasible. First, we contend
that offering strong isolation between the secure and non-secure domain is
only possible with a dedicated secure processor for the TEE. This design
choice completely separates the instruction pipelines and caches of the
cores, resulting in an independent execution flow of REE and TEE
applications. Second, services deployed on TEE processors, such as
applications handling banking information or processing passwords, are
typically rather small. This constrained processing requirement allows us
to deploy a tiny secure processor. Therefore, when comparing the TEE
processor with a state-of-the-art multicore processor, the area overhead is
negligible.
While integrating a dedicated secure processor into the SoC infrastructure
is straightforward, a secure and viable TEE system requires establishing
communication channels between REE and TEE, as well as to peripherals
enabling user I/O. In the remaining part of this section, we introduce our
trusted I/O path design utilizing a hardware-based security monitor and
elaborate how these concepts allow REE and TEE to share devices on the SoC
and realize features such as secure storage elements.
4.1.1 Trusted I/O Paths. The trusted I/O path mechanism is the central
element of the HECTOR-V architecture. This concept allows HECTOR-V to
securely share devices on the SoC, establish secure communication channels
between external users and the TEE or REE, and implement concepts like
secure storage elements. HECTOR-V establishes trusted I/O paths by
assigning a unique identifier to each party of the system, i.e., the
processor cores. When a party accesses a device, like the SD card or the
block RAM (BRAM), the peripheral uses an internal mechanism to verify the
identifier. This strategy allows the architecture to enforce access
permissions, for example, that the fingerprint reader can only be accessed
by the TEE.
Identifier. In HECTOR-V, an identifier is used to distinguish between
legitimate and illegitimate accesses to a peripheral. While this concept is
similar to ARM TrustZone [7], TrustZone only uses a 1-bit identifier to
distinguish between accesses from the secure or non-secure world. The
identifier used by our system consists of a core identifier, a process
identifier, and a peripheral identifier. The core identifier is permanently
and immutably assigned to each processing core at design time, e.g., one
identifier for the application processor and one identifier for the secure
processor, whereas the process and peripheral identifiers can be assigned
by each entity itself.
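The three-part identifier can be pictured as a small record that is packed
into one word. The following Python sketch is only an illustration: the
field widths (4, 8, and 4 bits), the field order, and the names
`Identifier`, `pack`, and `unpack` are assumptions, since the text does not
specify an encoding.

```python
from dataclasses import dataclass

# Field widths are assumptions for illustration; the text does not specify them.
CORE_BITS = 4
PROCESS_BITS = 8
PERIPHERAL_BITS = 4


@dataclass(frozen=True)
class Identifier:
    core: int        # fixed per core at design time
    process: int     # chosen by the entity itself
    peripheral: int  # chosen by the entity itself

    def pack(self) -> int:
        """Pack the fields into one integer, core ID in the most significant bits."""
        for value, bits in ((self.core, CORE_BITS),
                            (self.process, PROCESS_BITS),
                            (self.peripheral, PERIPHERAL_BITS)):
            if not 0 <= value < (1 << bits):
                raise ValueError("identifier field out of range")
        return ((self.core << (PROCESS_BITS + PERIPHERAL_BITS))
                | (self.process << PERIPHERAL_BITS)
                | self.peripheral)

    @staticmethod
    def unpack(word: int) -> "Identifier":
        """Inverse of pack()."""
        peripheral = word & ((1 << PERIPHERAL_BITS) - 1)
        process = (word >> PERIPHERAL_BITS) & ((1 << PROCESS_BITS) - 1)
        core = word >> (PROCESS_BITS + PERIPHERAL_BITS)
        return Identifier(core, process, peripheral)


secure_core = Identifier(core=1, process=2, peripheral=3)
packed = secure_core.pack()  # 0x1023 with the widths assumed above
```

Compared with the single NS bit of TrustZone, such a multi-field word lets
a peripheral distinguish not only between cores but also between processes
running on the same core.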
Interconnect. To efficiently transport the identifier from the participants
to the peripherals, we integrate the identifier directly into the system
interconnect. The AMBA AXI4 [5] protocol, which is used as interconnect by
our lowRISC base platform and many other SoC designs, allows embedding
user-defined signals of up to 1024 bits into the protocol. By embedding the
identifier into the user-defined signals, which are not used by most AXI
devices, the identifier is transported without any protocol overhead.
Additionally, this approach ensures that the identifier is sent along with
the address and data on each AXI request. Since the core identifier is
hardcoded directly into the interconnect interface of the participants, an
attacker cannot change this identifier from software.
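The following Python sketch models this idea at a very high level: every
request issued through a participant's interconnect interface carries the
identifier alongside address and data, and the core ID part is fixed when
the interface is created. The names `AxiRequest` and `MasterPort` are
hypothetical, and the model ignores all AXI4 protocol details such as
channels, bursts, and handshakes.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class AxiRequest:
    address: int
    data: int
    user: Tuple[int, int, int]  # (core, process, peripheral) in the user-defined signals


class MasterPort:
    """Interconnect interface of one participant with a hardwired core ID."""

    def __init__(self, core_id: int) -> None:
        self._core_id = core_id  # models the value fixed at design time
        self.process_id = 0      # software-controlled
        self.peripheral_id = 0   # software-controlled

    @property
    def core_id(self) -> int:
        # Read-only property: there is deliberately no setter. Python cannot
        # truly prevent tampering; in hardware the value is simply wired.
        return self._core_id

    def issue(self, address: int, data: int) -> AxiRequest:
        """Attach the full identifier to every outgoing request."""
        user = (self._core_id, self.process_id, self.peripheral_id)
        return AxiRequest(address, data, user)


port = MasterPort(core_id=2)
port.process_id = 7
request = port.issue(0x1000, 0xAB)  # request.user == (2, 7, 0)
```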
Peripherals. HECTOR-V creates a trusted channel between an entity, e.g.,
the application or the secure processor, and a peripheral by enforcing the
identifier-based access check directly in hardware at the peripheral
driver. The peripherals, e.g., the hardware implementation of the SD-card
controller or the BRAM, use the identifier to filter illegitimate accesses.
This scheme, which requires the peripherals to be identifier-aware, is
implemented by introducing a lightweight wrapper module for each peripheral
instance. In our architecture, the peripheral wrapper module consists of
two communication channels: the data channel and the configuration channel.
The data channel interface of the peripheral wrapper is directly connected
to the AXI4 communication fabric, allowing other parties, e.g., the secure
processor, to access the peripheral. By using the configuration channel,
the configuration party can set or unset the identifier in the ID field. In
HECTOR-V, the data and configuration channels are physically separated by
introducing a dedicated, lightweight AXI4-Lite [5] communication fabric.
When the identifier of the AXI4 communication request transported over the
communication fabric matches the identifier stored
in the peripheral wrapper ID field, the access is permitted and the request
is directly forwarded to the actual peripheral driver. On an ID mismatch,
the firewall mechanism blocks the request and returns an exception over the
AXI4 channel, which can be processed by the issuer. If the process or
peripheral identifier is set to zero, the peripheral wrapper ignores these
ID fields.
The peripheral wrapper resides in one of two states: claimed or unclaimed.
While in the unclaimed state all data channel requests are blocked, in the
claimed state only the requests with the matching identifier are permitted.
Furthermore, we differentiate between configurable and non-configurable
peripheral wrappers. To realize functionalities like secure storage
elements, which must not be accessible by any party except one,
non-configurable peripheral wrappers store a hardcoded identifier,
consisting of the core and process ID, in the ID field.
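The wrapper behavior described above can be sketched as a small software
model. This is an illustration, not the hardware implementation: the class
and method names are hypothetical, the peripheral is reduced to a
dictionary, and the sketch assumes that a zero process or peripheral ID in
the stored ID field is the one that is ignored during comparison.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Identifier:
    core: int
    process: int = 0
    peripheral: int = 0


class AccessError(Exception):
    """Stands in for the error response returned over the AXI4 data channel."""


class PeripheralWrapper:
    def __init__(self, device: Dict[int, int],
                 hardcoded_id: Optional[Identifier] = None) -> None:
        self._device = device                      # toy peripheral: address -> value
        self._configurable = hardcoded_id is None
        self._id_field = hardcoded_id              # None models the unclaimed state

    @property
    def claimed(self) -> bool:
        return self._id_field is not None

    def set_id(self, ident: Optional[Identifier]) -> None:
        """Configuration channel: set (or clear, with None) the ID field."""
        if not self._configurable:
            raise PermissionError("ID field of this wrapper is hardcoded")
        self._id_field = ident

    def _permits(self, requester: Identifier) -> bool:
        stored = self._id_field
        if stored is None:                         # unclaimed: block everything
            return False
        if stored.core != requester.core:
            return False
        # Assumption: a stored process/peripheral ID of zero is ignored.
        if stored.process and stored.process != requester.process:
            return False
        if stored.peripheral and stored.peripheral != requester.peripheral:
            return False
        return True

    def read(self, requester: Identifier, address: int) -> int:
        """Data channel: forward the request only on an ID match."""
        if not self._permits(requester):
            raise AccessError("identifier mismatch")
        return self._device.get(address, 0)
```

A non-configurable wrapper, e.g., for a secure storage element, would be
created with `hardcoded_id` set, so that `set_id` is rejected and only the
single matching party can ever read from it.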
4.1.2 Security Monitor. The architectural features of identifier,
identifier transportation, and peripheral firewalls are the foundation for
establishing trusted I/O paths on the SoC. To enforce security policies
utilizing the trusted paths, we embed a tiny security monitor into the SoC
infrastructure. This hardware-based security monitor (SM) module is
responsible for managing access permissions to peripherals. Internally, the
SM consists of a table tracking these access permissions. For each
peripheral included in the system, the security monitor maintains a table
entry tracking the state (claimed or unclaimed) and a list of identifiers
allowed to claim the device.
Security Monitor Owner. The HECTOR-V architecture introduces the security
monitor owner privilege, which is assigned to a certain party at design
time. This party is solely responsible for configuring the security
monitor. More concretely, only the SM owner is allowed to define which
peripheral can be claimed by which participant. The security monitor stores
the identifier of the current owner and accepts configuration commands only
from this party.
Peripheral Claiming and Releasing. To access a certain peripheral, the
requester first needs to claim the peripheral by sending a claim request to
the security monitor. Then, the SM checks whether the peripheral is
currently unclaimed and verifies that the ID of the issuer is in the list
of identifiers permitted to access the peripheral. Finally, the security
monitor sends the ID of the issuer to the peripheral wrapper over the
AXI4-Lite bus and the peripheral firewall sets this identifier in its ID
field. Now, the identifier transported in the user-defined signals of each
AXI4 request issued by the requester matches the identifier in the ID field
and the requester can exclusively access the peripheral. However, if the
requester is not allowed to claim the peripheral, i.e., the identifier of
the requester is not in the list of privileged entities, the ID
verification fails and the security monitor sends an access denied message
back to the issuer. If the requester ID is in the table entry of the
peripheral, but the peripheral is currently claimed, the security monitor
notifies the issuer.
HECTOR-V is a cooperative scheme, meaning that the entity currently
claiming a peripheral needs to release it after using it. To release a
peripheral, the entity sends a dedicated release command to the security
monitor. The security monitor processes this request by clearing the ID
field of the peripheral wrapper.
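The claim and release protocol, together with the owner-only configuration
of the permission table, can be sketched as follows. This is a simplified
software model with hypothetical names (`SecurityMonitor`, `Wrapper`,
`Reply`); identifiers are reduced to plain integers, and the check that
only the current claimant may release a peripheral is an assumption that
the text does not state explicitly.

```python
from enum import Enum
from typing import Dict, Iterable, Optional, Set


class Reply(Enum):
    OK = "ok"
    DENIED = "access denied"
    BUSY = "peripheral currently claimed"


class Wrapper:
    """Stand-in for a peripheral wrapper; only its ID field is modeled."""

    def __init__(self) -> None:
        self.id_field: Optional[int] = None


class SecurityMonitor:
    def __init__(self, owner_id: int, wrappers: Dict[str, Wrapper]) -> None:
        self.owner_id = owner_id
        self._wrappers = wrappers
        # One table entry per peripheral: permitted IDs and current claimant.
        self._allowed: Dict[str, Set[int]] = {name: set() for name in wrappers}
        self._claimant: Dict[str, Optional[int]] = {name: None for name in wrappers}

    def configure(self, issuer: int, name: str, allowed: Iterable[int]) -> Reply:
        """Only the SM owner may define who can claim a peripheral."""
        if issuer != self.owner_id:
            return Reply.DENIED
        self._allowed[name] = set(allowed)
        return Reply.OK

    def claim(self, issuer: int, name: str) -> Reply:
        if issuer not in self._allowed[name]:
            return Reply.DENIED
        if self._claimant[name] is not None:
            return Reply.BUSY
        self._claimant[name] = issuer
        self._wrappers[name].id_field = issuer  # sent over the configuration channel
        return Reply.OK

    def release(self, issuer: int, name: str) -> Reply:
        # Assumption: only the current claimant may release the peripheral.
        if self._claimant[name] != issuer:
            return Reply.DENIED
        self._claimant[name] = None
        self._wrappers[name].id_field = None
        return Reply.OK
```

For example, with the secure processor as owner and both cores permitted
to claim an SD-card wrapper, a second claim while the first is active
yields `Reply.BUSY` until the first claimant releases the peripheral.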
Peripheral Access Withdrawal. Claiming a peripheral and binding access
exclusively to an entity is a powerful concept and establishes a trusted,
secure channel between this entity and a peripheral, but it can also be
abused. A participant, e.g., the application processor or the secure
processor, under the control of an attacker could permanently occupy one or
more peripherals, resulting in a denial-of-service (DoS) attack. To
mitigate such attacks and manage unresponsive participants, the security
monitor has the capability to withdraw access from certain peripherals. A
simple approach to implement this functionality would be to clear the ID
field in the peripheral wrapper. However, this approach is dangerous
because it could enable time-of-check to time-of-use (TOCTOU) attacks. For
example, when withdrawing access to a peripheral currently processing a
secret, the SM owner would be able to retrieve the result of the request.
To securely withdraw access to certain peripherals, HECTOR-V introduces a
withdrawing mechanism, which can be triggered by any entity in the system.
While a withdraw request issued by the SM owner is always granted, a
request from other parties first needs to be approved by the security
monitor. When a withdraw request is received by the security monitor, the
SM simultaneously starts a timer with a predefined timeout and notifies the
owner of the peripheral. The notification of the peripheral owner is
realized by introducing a dedicated interrupt line and an interrupt service
routine (ISR) provided by the processors for each peripheral. The ISR
implements a cleanup function responsible for clearing secrets, stopping
current transactions, and gracefully releasing the corresponding
peripheral. If the timeout is reached in the security monitor and the
peripheral has not been released gracefully by the ISR, the SM forcefully
releases the peripheral by clearing the ID field in the firewall.
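The withdraw mechanism can be sketched as a small discrete-time model. The
sketch below rests on several assumptions: time advances in abstract ticks,
the interrupt is modeled as a synchronous callback, the approval policy for
non-owners is a pluggable predicate (`approve`, rejecting by default)
because the text does not define it, and all names are hypothetical.

```python
from typing import Callable, Dict, Optional


class Peripheral:
    def __init__(self) -> None:
        self.id_field: Optional[int] = None  # None means unclaimed
        self.buffer: Optional[bytes] = None  # data of an in-flight transaction


class SecurityMonitor:
    def __init__(self, owner_id: int, timeout_ticks: int,
                 approve: Callable[[int, str], bool] = lambda issuer, name: False) -> None:
        self.owner_id = owner_id
        self.timeout_ticks = timeout_ticks
        self.approve = approve                       # hypothetical approval policy
        self.peripherals: Dict[str, Peripheral] = {}
        self.isrs: Dict[int, Callable[[Peripheral], None]] = {}  # claimant ID -> ISR
        self._timers: Dict[str, int] = {}

    def withdraw(self, issuer: int, name: str) -> bool:
        """Handle a withdraw request; returns False if it is not approved."""
        if issuer != self.owner_id and not self.approve(issuer, name):
            return False
        periph = self.peripherals[name]
        if periph.id_field is None:
            return True                              # nothing to withdraw
        self._timers[name] = self.timeout_ticks      # start the timer ...
        isr = self.isrs.get(periph.id_field)
        if isr is not None:
            isr(periph)                              # ... and notify the claimant
        return True

    def tick(self) -> None:
        """Advance time by one tick and force-release on timeout."""
        for name in list(self._timers):
            periph = self.peripherals[name]
            if periph.id_field is None:              # released gracefully
                del self._timers[name]
                continue
            self._timers[name] -= 1
            if self._timers[name] <= 0:
                periph.id_field = None               # forced release: clear the ID field
                del self._timers[name]


def cooperative_isr(periph: Peripheral) -> None:
    """Cleanup routine: clear secrets, then release the peripheral."""
    periph.buffer = None
    periph.id_field = None
```

A cooperative claimant thus gets the chance to wipe its data before
access is withdrawn, while an unresponsive one loses the peripheral once
the timer expires.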
Security Monitor Ownership Transfer. A significant advantage of HECTOR-V,
compared to other TEE architectures, is the possibility to utilize the TEE
infrastructure for various use cases. While in TEE systems like ARM
TrustZone one entity, e.g., the secure world, is the exclusive owner of the
highest privilege level, HECTOR-V introduces a dynamic ownership transfer
mechanism. The security monitor privilege, which allows the SM owner to
configure access privileges and release arbitrary peripherals, can be
transferred to any other entity by the SM owner. To initiate an SM
ownership transfer, the entity sends a request with the identifier of the
new owner to the security monitor. The security monitor acknowledges this
request by setting the received identifier into the SM owner ID field. To
obtain a clean state, we recommend that the security monitor owner first
releases all peripherals.
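A minimal sketch of the ownership transfer, under the assumption that
identifiers are plain integers and that the monitor only tracks its owner
ID field and the current claimants, could look as follows. The method names
`release_all` and `transfer_ownership` are hypothetical.

```python
from typing import Dict


class SecurityMonitor:
    def __init__(self, initial_owner: int) -> None:
        self.owner_id = initial_owner      # SM owner ID field, set at design time
        self.claimed: Dict[str, int] = {}  # peripheral name -> claiming ID

    def release_all(self, issuer: int) -> bool:
        """Owner privilege: release arbitrary (here: all) peripherals."""
        if issuer != self.owner_id:
            return False
        self.claimed.clear()
        return True

    def transfer_ownership(self, issuer: int, new_owner: int) -> bool:
        """Only the current owner can hand over the SM owner privilege."""
        if issuer != self.owner_id:
            return False
        self.owner_id = new_owner
        return True


sm = SecurityMonitor(initial_owner=1)
sm.claimed = {"sd": 1, "uart": 2}
sm.release_all(1)            # recommended: start the new owner from a clean state
sm.transfer_ownership(1, 2)  # entity 2 is now the SM owner
```

After the transfer, requests from the previous owner are treated like those
of any other party, so `sm.transfer_ownership(1, 3)` would be rejected.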
4.2 RVSCP Design
To build a secure TEE environment, a processor of the heterogeneous core
cluster could be utilized to deploy an operating system. This operating
system is then in charge of assigning each trustlet on the core an
individual process identifier, annotating data in the cache with the
process ID, and managing accesses to peripherals. However, deploying a
fairly complex OS, which increases the TCB and the attack surface, is
contrary to the usage model of most TEEs. We argue that trustlets that are
commonly deployed in TEEs, such as those providing an interface to a
cryptographic accelerator, a key storage,
or a fingerprint sensor, do not make extensive use of operating system
features. Therefore, we reduce the TCB of RVSCP to a minimum by providing
basic operating system services, such as context switches and ID
management, directly in hardware. Furthermore, RVSCP utilizes a control-flow
integrity (CFI) unit and the secure I/O scheme of HECTOR-V to restrict
access to sensitive peripherals.
4.2.1 Control-Flow Integrity. To protect a program from attacks that aim
to alter the control flow, CFI schemes are commonly used [52, 55, 65].
These schemes mitigate attacks like ROP [50] or JOP [16] by ensuring that
the control flow of the program cannot escape the control-flow graph (CFG)
determined at compile time. The integrity of the instruction stream can be
enforced at different granularities. While some schemes [2, 8, 18]
preserve the correctness of the execution flow at basic block level, other
techniques [22, 60, 61] maintain the integrity of the control flow even at
instruction granularity.
Control-Flow Integrity Unit. RVSCP utilizes the fine-grained Sponge-Based
Control-Flow Protection (SCFP) [60] scheme to protect trustlets within the
secure processor from attack attempts. The main idea of SCFP is to encrypt
a program using a sponge-based authenticated encryption primitive at
compile time and decrypt it instruction by instruction at runtime.
Decryption of the individual instructions is realized using a dedicated
decryption stage in the processor pipeline. To successfully decrypt an
instruction, the pipeline stage needs to know the key and an internal
state. The SCFP scheme accumulates information over all previously
executed instructions in this state. If the integrity of the state is
violated, the decryption fails and returns a faulty instruction, which can
be detected with a certain probability by the CPU. An attacker who
modifies instructions, e.g., using fault injection, or inserts additional
instructions alters the state and can therefore be detected by SCFP.
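The effect of a state that accumulates over all previously executed instructions can be sketched with a toy model. The code below is not the SCFP primitive: SHA-256 merely stands in for the sponge permutation, and the function names, the key, and the example words are assumptions chosen for illustration. It shows only the chaining property, namely that a single modified instruction word corrupts the decryption of that word and of everything that follows.

```python
import hashlib


def _absorb(state, data):
    """Stand-in for the sponge permutation: mix data into the running state."""
    return hashlib.sha256(state + data).digest()


def _pad(state):
    """Derive a 32-bit pad from the current state."""
    return int.from_bytes(hashlib.sha256(b"pad" + state).digest()[:4], "little")


def encrypt(key, program):
    """Compile-time step: encrypt 32-bit instruction words one by one."""
    state = _absorb(b"init", key)
    out = []
    for instr in program:
        cipher = instr ^ _pad(state)
        out.append(cipher)
        # The state now depends on every instruction processed so far.
        state = _absorb(state, cipher.to_bytes(4, "little"))
    return out


def decrypt(key, encrypted):
    """Runtime step: decrypt word by word with the same running state."""
    state = _absorb(b"init", key)
    out = []
    for cipher in encrypted:
        out.append(cipher ^ _pad(state))
        state = _absorb(state, cipher.to_bytes(4, "little"))
    return out


program = [0x00000013, 0x00100093, 0x00208133, 0x00008067]  # arbitrary 32-bit words
key = b"trustlet-key"                                       # hypothetical key
enc = encrypt(key, program)

tampered = list(enc)
tampered[1] ^= 0x1               # single bit flip, e.g., caused by a fault injection

good = decrypt(key, enc)         # equals the original program
bad = decrypt(key, tampered)     # diverges from word 1 onwards
```

In this toy model the first word still decrypts correctly, while the tampered word and all later words decrypt to unrelated values, which corresponds to the faulty instructions the CPU can detect.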
4.2.2 Hardware Scheduling. We extend the native SCFP implementation, which
allows executing a bare-metal program on a processor, to support
multitasking. Multitasking with CFI-protected trustlets could be realized
purely in software using an operating system. However, since TEE operating
systems, such as OP-TEE [25], do not provide state-of-the-art protection
mechanisms, such as ASLR or guard pages, deploying an operating system
would also increase the attack surface of the TEE. Therefore, similar to
Antikernel [67], RVSCP offers hardware features to run multiple trustlets
on the processor and reduce the software TCB to a minimum.
Hardware Scheduling Unit. RVSCP introduces a hardware entity providing
minimal OS functionality for trustlets. This hardware unit is responsible
for performing secure context switches between individual trustlets in
hardware. The round-robin scheduling mechanism internally maintains a list
of trustlets, and after a certain number of executed cycles, a context
switch is conducted. When performing a context switch, the hardware entity
stops the current trustlet, stores its architectural state in a secure
place, and loads the architectural state of the next trustlet.
Additionally, the hardware context switch mechanism also exchanges the
decryption key used for SCFP. Using an individual key for each trustlet
yields two major advantages. First, since the programs are encrypted with
different keys, only the developer with the correct key can access the
plain program, leading to an IP protection mechanism for trustlets.
Second, using different keys for trustlets enables strong isolation
between the applications. After the context switch, execution is resumed
and the next trustlet is executed.
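The context switch described above can be sketched as a behavioral model. This is an illustrative Python sketch under stated assumptions, not the RTL of the scheduler: the names Slot and HardwareSchedulerModel, the 32-entry register list, and the time slice of three cycles are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Slot:
    """Saved context of one trustlet (field names are hypothetical)."""
    process_id: int
    scfp_key: bytes
    registers: list = field(default_factory=lambda: [0] * 32)
    scfp_state: int = 0


class HardwareSchedulerModel:
    """Round-robin scheduler that swaps registers, SCFP state, and SCFP key."""

    def __init__(self, slots, time_slice):
        self.slots = slots
        self.time_slice = time_slice       # cycles per trustlet (assumed value)
        self.current = 0
        self.cycles = 0
        self.active_key = slots[0].scfp_key

    def tick(self, cpu_registers, cpu_scfp_state):
        """Advance one cycle and return the context the core continues with."""
        self.cycles += 1
        if self.cycles < self.time_slice:
            return cpu_registers, cpu_scfp_state
        # Context switch: store the state of the outgoing trustlet ...
        outgoing = self.slots[self.current]
        outgoing.registers = list(cpu_registers)
        outgoing.scfp_state = cpu_scfp_state
        # ... then select the next slot and load its state and decryption key.
        self.current = (self.current + 1) % len(self.slots)
        incoming = self.slots[self.current]
        self.active_key = incoming.scfp_key
        self.cycles = 0
        return list(incoming.registers), incoming.scfp_state


slots = [Slot(process_id=i + 1, scfp_key=bytes([i]) * 16) for i in range(4)]
sched = HardwareSchedulerModel(slots, time_slice=3)
regs, state = [7] * 32, 0xABC
for _ in range(3):
    regs, state = sched.tick(regs, state)
```

After three cycles the model has saved the context of the first trustlet and continues with the registers, state, and key of the second one.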
While this hardware scheduling unit allows the processor to effectively
consist of several virtual processor cores, only a fixed number of
trustlets can run simultaneously on the physical core. However, since most
TEEs are limited in their processing power anyway and the vendor usually
deploys only a well-chosen set of trustlets, an upper bound on the number
of processes is acceptable. Furthermore, we argue that for simple services
typically used in mobile platforms, such as a process handling biometric
data for unlocking the device or a process handling the secure key
storage, no dedicated operating system is needed. Completely omitting the
operating system and providing operating system services using a tiny
hardware unit reduces the software TCB to a minimum and would even allow
formal verification of the simple hardware unit.
4.2.3 Control-Flow Integrity with Secure I/O. The fine-granular
control-flow integrity unit embedded into the HECTOR-V architecture
prevents an attacker from performing control-flow hijacking attacks by
limiting the control flow of the program to valid paths through its
control-flow graph. However, while the CFI scheme protects the control
flow by detecting integrity violations of forward and backward edges, and
thus prevents attacks such as ROP or JOP, data-oriented attacks cannot be
detected by this countermeasure. In such attacks, an adversary modifies
control-related or non-control-related data to break the security of the
system. By manipulating control-related data, such as the condition value
in an if statement, the attacker can indirectly influence the control flow
of the program. Furthermore, an attacker could leak sensitive data, such
as passwords or cryptographic keys, by manipulating addresses in the
system. For example, by exploiting a buffer overflow bug, an attacker
could redirect a pointer from the location of the ciphertext in memory to
the encryption key stored in a secure storage element, so that the API
returns the key instead of the ciphertext to the user. To lower the impact
of such attacks, RVSCP binds access to certain assets to a certain CFI
state. More concretely, the program is permitted to access the distinct
peripheral only when the CFI-protected program reaches a predefined CFI
state. In RVSCP, this strategy is realized by tunneling each interaction
request with a peripheral through a dedicated function with a certain
state. RVSCP automatically sets the peripheral identifier of the processor
to the current state. Access to the device is granted only if this state
matches the ID stored in the peripheral wrapper. If an attacker calls the
peripheral access function outside the valid control-flow graph, the CFI
mechanism detects this violation. Additionally, if the adversary crafts an
address accessing the asset, the state used as an identifier does not
match the identifier of the peripheral and access to the device is denied.
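The binding of peripheral access to a CFI state can be sketched as follows. The model is a simplification in Python: the value of S_SE, the class names, and the way the access function "reaches" the state are assumptions. In RVSCP the state results from SCFP decryption and is not set by software.

```python
S_SE = 0x2A5   # hypothetical identifier derived from the predefined CFI state


class PeripheralWrapperModel:
    """Grants access only if the request carries the configured identifier."""

    def __init__(self, expected_id):
        self.expected_id = expected_id
        self.storage = {}

    def write(self, request_id, addr, data):
        if request_id != self.expected_id:
            return "SLVERR"
        self.storage[addr] = data
        return "OKAY"


class CoreModel:
    """Core whose bus requests are tagged with the current CFI state."""

    def __init__(self, wrapper):
        self.cfi_state = 0x001     # some unrelated state of the running program
        self.wrapper = wrapper

    def bus_write(self, addr, data):
        # The hardware, not the software, places the state into the request.
        return self.wrapper.write(self.cfi_state, addr, data)

    def write_se(self, addr, data):
        """Dedicated access function: a valid call reaches state S_SE."""
        saved = self.cfi_state
        self.cfi_state = S_SE
        try:
            return self.bus_write(addr, data)
        finally:
            self.cfi_state = saved


secure_element = PeripheralWrapperModel(expected_id=S_SE)
core = CoreModel(secure_element)
via_access_function = core.write_se(0x0, 1)   # request is tagged with S_SE
crafted_access = core.bus_write(0x0, 99)      # crafted address, wrong state
```

The crafted access is answered with an error and leaves the stored value untouched, since the identifier in the request does not match the one configured in the wrapper.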
5 IMPLEMENTATION
In this section, we provide a prototype implementation of the HECTOR-V
architecture and the secure RVSCP processor. We first introduce the RISC-V
lowRISC chip, which we use as our base platform. Then, we extend this
platform with the HECTOR-V features. Finally, we present our RVSCP design
and demonstrate how this processor is embedded into the SoC
infrastructure.
Figure 2: lowRISC base platform. (Block diagram: Rocket Chip with
TileLink, AXI crossbar, DDR3 controller, BRAM, and the SD, UART, PS2, and
SPI peripherals.)
5.1 Base Platform
The prototype implementation is based on the open-source lowRISC [38]
project. Internally, the lowRISC chip consists of an instance of the
64-bit RISC-V Rocket chip [9, 56]. The SoC, capable of running Linux,
provides an external off-chip DDR3 memory and an on-chip BRAM. As shown in
Figure 2, peripherals, such as an SD-card controller and an SPI interface,
are connected to the processor core over an AXI4 communication fabric.
5.2 HECTOR-V
Figure 3 depicts the prototype implementation of the HECTOR-V
architecture. The extended lowRISC base platform consists of the Rocket
Core application processor and the secure processor RVSCP integrated into
the TEE. By providing an AXI4 master interface (1, 2), REE and TEE gain
access to various peripherals over the shared communication fabric (3). To
differentiate between REE and TEE, the immutable core identifier is
directly embedded into this AXI4 master interface. We further introduce a
security monitor, a memory protection unit (MPU), a reset unit, and a
secure storage element as part of the HECTOR-V architecture.
Interconnect. The SoC communication infrastructure consists of a shared
AXI4 (3) and AXI4-lite (C) interconnect. In our architecture, the AXI4 bus
is used to enable interaction between the processing cores and the
peripherals. We extended the AXI4 bus protocol and the crossbar to
transport the 16-bit user signal along with each bus request. This user
signal carries the 1-bit core identifier, the 4-bit process identifier,
and the 10-bit peripheral identifier. While communication between
peripherals and cores requires a high-speed link, the configuration of the
security monitor and the peripheral wrappers only consists of small
configuration commands. Hence, we use a lightweight AXI4-lite bus as a
configuration channel. The configuration of the security monitor is
realized by introducing a point-to-point communication channel between REE
and SM (A) and between TEE and SM (B). By using a separate AXI4-lite
interconnect (C), the configuration of the peripheral wrappers can only be
initiated by the security monitor. This strategy ensures that neither the
TEE nor the REE can directly manipulate the peripheral firewalls;
configuration can only be requested via the security monitor module.
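The identifier fields carried in the user signal can be illustrated with a small packing routine. The text specifies only the field widths (1, 4, and 10 bits in a 16-bit signal); the bit order used below, with the core identifier in the upper bits and one bit left unused, is an assumption, as are the function names.

```python
CORE_BITS, PROCESS_BITS, PERIPHERAL_BITS = 1, 4, 10   # widths from the text


def pack_user(core_id, process_id, peripheral_id):
    """Pack the three identifiers into one user-signal value (assumed layout)."""
    if not 0 <= core_id < (1 << CORE_BITS):
        raise ValueError("core ID does not fit into 1 bit")
    if not 0 <= process_id < (1 << PROCESS_BITS):
        raise ValueError("process ID does not fit into 4 bits")
    if not 0 <= peripheral_id < (1 << PERIPHERAL_BITS):
        raise ValueError("peripheral ID does not fit into 10 bits")
    return ((core_id << (PROCESS_BITS + PERIPHERAL_BITS))
            | (process_id << PERIPHERAL_BITS)
            | peripheral_id)


def unpack_user(user):
    """Recover (core ID, process ID, peripheral ID) from a user-signal value."""
    peripheral_id = user & ((1 << PERIPHERAL_BITS) - 1)
    process_id = (user >> PERIPHERAL_BITS) & ((1 << PROCESS_BITS) - 1)
    core_id = (user >> (PROCESS_BITS + PERIPHERAL_BITS)) & ((1 << CORE_BITS) - 1)
    return core_id, process_id, peripheral_id


user = pack_user(core_id=1, process_id=3, peripheral_id=0x2A5)
fields = unpack_user(user)
```

With this layout the three fields occupy 15 bits, so every packed value fits into the 16-bit user signal.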
Security Monitor. To receive commands from REE and TEE, the security
monitor implements two AXI4-lite slave interfaces. The protocol used to
interact with the security monitor consists of two privileged and four
unprivileged commands. With the privileged configuration command, the SM
owner configures the table entries of the hardware module. The table
includes all peripherals known to the SM, the current claiming status, and
a list of identifiers allowed to request access to the device. By using
the privileged ownership transfer command, the SM owner can define a new
designated SM owner. The permission to issue a privileged command is
checked by the security monitor using the SM owner ID field stored in the
SM module. The unprivileged commands include a claim and a release request
command, which allow the issuer to gain and give up access to a
peripheral. A status command can be used to determine the permission level
of the peripheral and whether it is currently claimed. To gain access to
an already claimed peripheral, the unprivileged withdraw request command
can be used. While a withdraw request issued by the SM owner is always
granted, the request of an unprivileged participant first needs to be
approved. When the withdraw request is granted, the security monitor uses
dedicated interrupt lines (F) to notify the current owner of the
peripheral about the withdraw request. For each peripheral, a dedicated
interrupt line between SM and TEE or REE is introduced. When a request is
approved by the security monitor, the configuration command is forwarded
to the peripheral wrapper over the AXI4-lite crossbar (C).
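The command set can be summarized in a compact behavioral model. The sketch below is written in Python for illustration only; the enum members, return strings, and table layout are assumptions, and the approval step for withdraw requests of unprivileged participants is simplified to a check against the list of allowed identifiers, since the text does not detail the approval policy.

```python
from enum import Enum, auto


class Cmd(Enum):
    CONFIGURE = auto()            # privileged
    TRANSFER_OWNERSHIP = auto()   # privileged
    CLAIM = auto()
    RELEASE = auto()
    STATUS = auto()
    WITHDRAW = auto()


PRIVILEGED = {Cmd.CONFIGURE, Cmd.TRANSFER_OWNERSHIP}


class SecurityMonitorModel:
    def __init__(self, owner_id):
        self.owner_id = owner_id   # SM owner ID field
        self.table = {}            # peripheral -> {"allowed": set, "owner": ID or None}
        self.irq_log = []          # (notified entity, peripheral) per withdraw

    def handle(self, issuer, cmd, peripheral=None, arg=None):
        if cmd in PRIVILEGED and issuer != self.owner_id:
            return "DENIED"
        if cmd is Cmd.CONFIGURE:
            self.table[peripheral] = {"allowed": set(arg), "owner": None}
            return "OK"
        if cmd is Cmd.TRANSFER_OWNERSHIP:
            self.owner_id = arg
            return "OK"
        entry = self.table.get(peripheral)
        if entry is None:
            return "DENIED"
        if cmd is Cmd.STATUS:
            return {"claimed": entry["owner"] is not None,
                    "allowed": issuer in entry["allowed"]}
        if cmd is Cmd.CLAIM:
            if issuer not in entry["allowed"] or entry["owner"] is not None:
                return "DENIED"
            entry["owner"] = issuer
            return "OK"
        if cmd is Cmd.RELEASE:
            if entry["owner"] != issuer:
                return "DENIED"
            entry["owner"] = None
            return "OK"
        if cmd is Cmd.WITHDRAW:
            if entry["owner"] is None:
                return "DENIED"
            # Assumed policy: the SM owner is always granted, others must be allowed.
            if issuer != self.owner_id and issuer not in entry["allowed"]:
                return "DENIED"
            self.irq_log.append((entry["owner"], peripheral))  # notify current owner
            return "OK"
        return "DENIED"


TEE, REE = 1, 0   # hypothetical identifiers
sm = SecurityMonitorModel(owner_id=TEE)
sm.handle(TEE, Cmd.CONFIGURE, "uart", {TEE, REE})
claim = sm.handle(REE, Cmd.CLAIM, "uart")
reconfigure = sm.handle(REE, Cmd.CONFIGURE, "uart", {REE})
withdraw = sm.handle(TEE, Cmd.WITHDRAW, "uart")
status = sm.handle(TEE, Cmd.STATUS, "uart")
```

In the model, the withdraw request only records the interrupt for the current owner; the peripheral remains claimed until the owner releases it, which mirrors the graceful release described in the text.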
Peripheral Wrapper. An AXI4 read or write request issued by the TEE or REE
and transported over the AXI4 interconnect (3) is not directly sent to the
peripheral driver. First, a simple logic deployed in the peripheral
wrapper checks whether the identifier stored in the user signal of the
AXI4 request matches the identifier stored in the ID field of the wrapper
module. When the process ID or the peripheral ID is set to zero, the
access control is only conducted with the core ID, allowing all entities
on either the TEE or REE to access the peripheral. If the ID transported
in the request matches the ID stored in the module, the request is
forwarded to the actual peripheral. On an ID mismatch, however, the
peripheral wrapper returns the error code SLVERR to the issuer using the
RRESP or BRESP AXI4 signal. In addition to the AXI4 slave interface, the
peripheral wrappers also implement an AXI4-lite slave interface. This
interface is used by the SM to set or unset the identifier in the ID field
of the wrapper.
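The check performed by the wrapper logic can be written down in a few lines. The Python sketch below follows the wording of the paragraph literally: if the configured process ID or peripheral ID is zero, only the core ID is compared. A per-field wildcard would be another plausible reading, so this behavior, like the names Ident and wrapper_check, should be treated as an assumption.

```python
from typing import NamedTuple


class Ident(NamedTuple):
    core: int
    process: int
    peripheral: int


OKAY, SLVERR = "OKAY", "SLVERR"   # AXI4 response names used in the text


def wrapper_check(request: Ident, configured: Ident) -> str:
    """Return the response the wrapper would give for an incoming request."""
    if configured.process == 0 or configured.peripheral == 0:
        # Access control is conducted with the core ID only.
        return OKAY if request.core == configured.core else SLVERR
    return OKAY if request == configured else SLVERR


exact_match = wrapper_check(Ident(1, 2, 5), Ident(1, 2, 5))
wrong_process = wrapper_check(Ident(1, 3, 5), Ident(1, 2, 5))
core_only = wrapper_check(Ident(1, 3, 9), Ident(1, 0, 0))
wrong_core = wrapper_check(Ident(0, 3, 9), Ident(1, 0, 0))
```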
Interrupt. The peripheral wrapper is also responsible for routing the
interrupt line of the peripheral to the current peripheral owner. To
realize correct interrupt handling, each peripheral wrapper module has one
dedicated interrupt line for each participant (REE and TEE). Based on the
identifier of the current peripheral owner, the interrupt is routed either
to the TEE or to the REE.
Software Support. To interact with the security monitor, we provide a
Linux kernel module for the application processor. This kernel module
allows user processes to claim, release, withdraw, or query the status of
a peripheral. Furthermore, the kernel module also provides functionality
to configure the security monitor. To handle interrupts from the security
monitor withdraw mechanism, each peripheral driver is extended with a
dedicated cleanup interrupt service routine. This ISR is responsible for
clearing any secrets, aborting communication channels with other parties,
and releasing the peripheral gracefully using the release mechanism.
For the TEE, we provide a small library. Similar to the kernel module,
this library provides basic functionality to interact with the security
monitor.
Figure 3: The HECTOR-V architecture. (Block diagram: the Rocket Chip (REE)
and the TEE reach the peripherals over the AXI4 crossbar (1, 2, 3) and the
security monitor over AXI4-lite (A, B); the security monitor configures
the peripheral wrappers over the AXI4-lite crossbar (C) and drives the
interrupt and reset lines. Each wrapper compares the request identifier
ID_R with the configured identifier ID_C and forwards the request on a
match; otherwise it answers with SLVERR.)
Physical Memory Protection. Isolating a peripheral by binding it
exclusively to one entity does not work for the shared, external DDR3
memory. Therefore, the HECTOR-V architecture introduces a memory
protection unit (MPU), which is placed directly between the memory
controller and the AXI4 bus interface. The MPU can be claimed like any
other peripheral by the TEE or REE using the security monitor and the
dedicated AXI4-lite slave interface. The party currently claiming the MPU
is able to divide the physical memory into up to 16 regions. These regions
can then either be accessed exclusively by one entity or be shared among
multiple entities. Each incoming AXI4 read or write access is checked by
the memory protection unit using the identifier transported with the
request. This mechanism provides a secure storage place for the REE and
each virtual core of the RVSCP, and allows shared communication channels
between REE and TEE to be established.
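The region check of the memory protection unit can be sketched as a lookup over at most 16 entries. This is an illustrative Python model; the class name, the address ranges, the identifier tuples, and the default-deny rule for addresses outside any region are assumptions that the text does not specify.

```python
MAX_REGIONS = 16   # upper bound stated in the text


class MpuModel:
    def __init__(self):
        self.regions = []   # list of (base, size, allowed identifiers)

    def add_region(self, base, size, allowed_ids):
        if len(self.regions) >= MAX_REGIONS:
            raise ValueError("all 16 regions are already in use")
        self.regions.append((base, size, frozenset(allowed_ids)))

    def check(self, requester_id, addr):
        """Decide whether a read or write at addr is allowed for the requester."""
        for base, size, allowed in self.regions:
            if base <= addr < base + size:
                return requester_id in allowed
        return False   # assumption: addresses outside any region are denied


REE = (0, 0)        # hypothetical (core ID, process ID) pairs
VC1 = (1, 1)

mpu = MpuModel()
mpu.add_region(0x8000_0000, 0x1000_0000, {REE})        # exclusive to the REE
mpu.add_region(0x9000_0000, 0x0010_0000, {VC1})        # exclusive to a trustlet
mpu.add_region(0x9010_0000, 0x0001_0000, {REE, VC1})   # shared channel
```

An exclusive region rejects the other party, while the shared region accepts both and can therefore serve as a communication channel between REE and TEE.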
Reset Unit. The reset unit controls the reset lines of the application and
the secure processor. Similar to other peripherals, this entity can be
claimed by each participant in the system. The owner of the device can
release the reset lines and turn the other entity in the system on or off.
Secure Storage. To securely store sensitive data, such as cryptographic
keys, biometric data, or user passwords, HECTOR-V introduces a secure
storage element. In contrast to other peripherals, a predefined, immutable
identifier consisting of the core ID and the process ID is programmed into
the ID field of the wrapper module at design time. Therefore, only the
entity with the corresponding identifier can access the storage element.
In the prototype, each of the four virtual cores of RVSCP possesses its
own secure storage.
5.3 RVSCP
The RVSCP prototype implementation is based on the 32-bit RISC-V REMUS
core [47, 60], which already offers the sponge-based CFI unit. The REMUS
core is originally based on the RI5CY core [53], which achieves
performance similar to an ARM Cortex-M4 core. We further extend the core
with the RVSCP features and embed the processor into the HECTOR-V
architecture.
TEE Infrastructure. By using the AXI4 master interface (2) connected to
the AXI4 crossbar (3), the RVSCP is able to interact with the peripherals,
such as the UART or PS2 controller. Similar to the main application
processor, the secure processor implements an AXI4-lite master interface
(B) to configure the security monitor and to transmit peripheral access
requests to it.
Context Switch. The hardware scheduler unit is responsible for performing
secure context switches. For the RVSCP prototype, the hardware scheduler
maintains a list of four slots, enabling four virtual cores VC0 ... VC3 on
the RVSCP core. On a context switch, the hardware scheduler saves the
current SCFP state and the current register file and loads the state, the
decryption key, and the register file for the next trustlet. To implement
the replacement of the register file, we added four additional register
sets to our processor. Note that the register file replacement could also
be implemented by storing and loading the content of the registers to and
from memory, e.g., a secure storage element, to keep the area overhead of
the processor low. To differentiate between the four trustlets executed on
the RVSCP, each of the four slots is assigned an individual process
identifier. While the core ID is identical for all slots, the hardware
scheduler replaces the process ID directly in the AXI4 and AXI4-lite
master interfaces individually for each slot. Since all four slots use the
same core ID, a peripheral can be configured to be accessible by all four
trustlets.
Decryption Keys. To decrypt the encrypted instruction stream, the SCFP
unit needs to know the decryption key for the corresponding trustlet. In
the prototype implementation, the key is stored in a dedicated control and
status register of each virtual core, which is only accessible from the
respective core. To initially load the key into the secure storage, our
prototype trustlet consists of a small, unencrypted boot code and the
actual, encrypted code. The unencrypted boot code can either generate the
key, load the key from the network, or load it directly from the binary.
After storing the key into the key register, the actual encrypted trustlet
starts to execute.
Figure 4: Access function for secure peripheral accesses. (main() calls
write_se(0x0,1) and write_se(0x1,2); the access function
write_se(addr,data) performs *addr = data.)
Code Storage. Each of the virtual processor cores VC0 ... VC3 running on
the RVSCP processor executes code from an on-chip BRAM. While VC1 ... VC3
fetch code from a claimable BRAM, the first virtual processor core fetches
its code from a secure code storage element exclusively and permanently
owned by VC0. To utilize one of the virtual cores of RVSCP as an enclave,
the issuer needs to store the trustlet code in the BRAM of this core.
Since the code of the first virtual core is stored in a secure storage
element, this code cannot be changed by the REE or by the other virtual
cores of RVSCP at any point in time.
Control-Flow Integrity with Secure I/O. To implement the concept of
binding access to peripherals to a certain CFI state, each AXI4 request
leaving the RVSCP is annotated with the current CFI state. The processor
directly places a compressed form of the current state into the 10-bit
peripheral identifier field of the AXI4 user signal. To only allow, e.g.,
write access to the secure element when reaching a predefined state, the
trustlet tunnels all write accesses to this peripheral through the
function write_se. Since decrypting this function only works when a
certain state S_SE is reached, the SCFP scheme automatically patches the
state of the callee with the patch value p_a or p_b. As seen in Figure 4,
the patching mechanism of SCFP ensures that, on each valid call of the
access function, state S_SE is reached. By using this state as the
peripheral identifier, access to a specific peripheral is only granted
when reaching state S_SE through the access function. Similar to the setup
procedure of the decryption keys, the boot code of the trustlet claims the
peripheral used in the code by sending the identifier consisting of the
core identifier (ID of RVSCP), the process identifier (ID of the virtual
processor), and the peripheral identifier (compressed CFI state S_SE) to
the security monitor. Now, access to secrets stored in the secure storage
element or to other sensitive peripherals, like a fingerprint reader, is
only permitted when reaching a predefined CFI state. If a trustlet does
not need explicit protection of certain peripherals, the peripheral
identifier in the peripheral claiming request is set to zero.
5.4 Area Overhead
To quantify the area overhead introduced by the additional RISC-V core and
the HECTOR-V features, we synthesized the lowRISC base platform for a
Xilinx Kintex-7 series FPGA. Since the HECTOR-V architecture is generic,
we synthesized the extended lowRISC platform with the additional HECTOR-V
features and a selection of different RISC-V cores used as TEE processor.
Table 1 shows the total hardware overhead of the HECTOR-V architecture
with either the RI5CY [53], the REMUS [47, 60], or the FRANKENSTEIN [48]
core used as secure TEE processor. The REMUS core, which is based on the
RI5CY, implements a decryption unit to offer the Sponge-Based Control-Flow
Protection (SCFP) mechanism. FRANKENSTEIN, which is an extended version of
the REMUS core, is a 64-bit RISC-V processor including the SCFP scheme and
a pointer protection scheme for secure memory accesses. While the Rocket
Core used in the lowRISC platform is capable of running Linux, the
processor core is still rather small compared to application processors
from vendors like Intel, ARM, and AMD. Therefore, when using a larger
processor in the HECTOR-V architecture, the relative overhead introduced
by the additional secure TEE processor is negligible. In Table 2, we list
the area numbers for different components of the HECTOR-V architecture
using the RI5CY core as secure TEE processor.
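The relative overhead reported in Table 1 is the increase in LUTs over the lowRISC base platform, i.e., (configuration - base) / base. The following short Python check recomputes the percentages from the LUT counts given in the table; only the variable names are introduced here.

```python
BASE_LUTS = 55_443   # lowRISC base platform (Table 1)

hector_v_luts = {
    "RI5CY": 63_648,
    "REMUS": 67_024,
    "FRANKENSTEIN": 68_746,
}

# Overhead in percent relative to the base platform, rounded to two decimals.
overhead_percent = {
    core: round(100 * (luts - BASE_LUTS) / BASE_LUTS, 2)
    for core, luts in hector_v_luts.items()
}
```

The computed values agree with the table: 14.8 % for RI5CY, 20.89 % for REMUS, and 23.99 % for FRANKENSTEIN.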
Table 1: Lookup table (LUT) overhead of the overall architecture
consisting of the HECTOR-V features and an additional secure processor
compared to the lowRISC base project.

Configuration                   Area [LUTs]   Area Overhead [%]
lowRISC base platform           55,443        -
HECTOR-V with RI5CY             63,648        14.8
HECTOR-V with REMUS             67,024        20.89
HECTOR-V with FRANKENSTEIN      68,746        23.99

Table 2: Number of lookup tables for different features of the HECTOR-V
architecture.

Component                       Area [LUTs]   Area [%]
Rocket Chip                     33,341        52.38
RI5CY                           5,780         9.08
Security Monitor                446           0.7
Peripheral Wrapper              43            0.07
AXI4 Crossbar                   3,052         4.79
AXI4-lite Crossbar              93            0.14

6 USE CASES
This section proposes use cases for the secure TEE design consisting of
the HECTOR-V architecture and the RVSCP. More concretely, we demonstrate
how the TEE can be used to securely boot Linux on the main application
processor. After the secure boot use case, we show how the RVSCP can be
utilized to securely execute trustlets. Note that these two scenarios are
only a selection of the many use cases that can be realized with HECTOR-V.
6.1 Secure Boot
On the unmodified lowRISC platform, a zero-stage bootloader (ZSBL)
permanently stored in the on-chip BRAM first loads the Berkeley boot
loader (BBL) from the external SD-card. Then, the ZSBL hands over control
to the BBL, which acts as the first-stage bootloader (FSBL). The BBL,
which is linked against the Linux kernel, fetches the Linux image and sets
up the environment by configuring the hardware threads (HARTs) and the
memory. Finally, the Linux operating system starts. However, loading the
FSBL and the Linux image directly from the SD-card to boot up the device
is problematic in many ways. An attacker with physical access to the
device can boot arbitrary code by simply modifying the unprotected boot
files stored on the SD-card. This attack methodology can also be used in
an online attack by overwriting the boot files. To only allow
authenticated software to boot, vendors frequently embed a secure boot
mechanism into their systems. Here, a chain of trust is established by
authenticating each boot file before executing it.
In our use case implementing secure boot using features of the HECTOR-V
architecture, the first virtual core VC0 of RVSCP is the designated
security monitor owner. At design time, the reset unit of the system is
configured to release the reset line of RVSCP and keep the REE processor
halted when applying power to the SoC. Additionally, the security monitor
owned by VC0 is unconfigured, except for the reset unit, which is claimed
by the SM owner. When applying power to the device, VC0 starts to execute
the ZSBL stored in the secure code storage. Since the secure code storage
is permanently and exclusively owned by VC0, the ZSBL is isolated from the
other parties and only VC0 can update this code using an update mechanism.
Due to these strong protection mechanisms, the ZSBL stored in the secure
code storage is the root of trust of the system. The ZSBL code first
claims the SD-card controller and configures the memory protection unit of
the DDR3 memory. Then, the ZSBL fetches the BBL from the SD-card and
stores it in the external memory. Before passing control to the BBL, the
ZSBL determines the hash value of the loaded BBL image. When this hash
value does not match the expected hash value stored in the secure storage
element of VC0, the boot process is aborted. Again, an update of the
expected hash value can only be initiated by the first virtual processor
core of RVSCP because the secure storage element is exclusively owned by
this party and cannot be claimed by any other party at any point in time.
If the verification of the BBL was successful, the ZSBL releases the
SD-card driver, gives the main application processor access to the DDR3
memory region by configuring the MPU, and triggers the reset unit to start
the REE. Now, the Rocket Core starts to execute the modified BBL from the
external memory. The modified BBL then requests access to the SD-card
controller, loads the Linux image from the SD-card, and verifies the
loaded image by comparing the computed hash value with the expected hash
value. Finally, if the verification process succeeds, the Linux operating
system starts and can claim the first peripherals by sending requests to
the security monitor. Since each stage of the boot process is now
cryptographically verified, the system is in a genuine state. Now, VC0
passes the SM owner privilege to the main application processor (AP). This
change is initiated by sending the identifier of the AP to the SM using
the ownership transfer command. Although the REE is now in full control of
the system, e.g., is able to switch off the secure processor using the
reset unit, secrets stored in the secure storage and the secure code
storage are still protected from AP accesses.
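The chain of trust in this boot flow can be sketched as two hash comparisons followed by the ownership transfer. The Python sketch below is a simplification: SHA-256 is an assumption (the text does not name the hash function), the function and key names are hypothetical, and claiming peripherals, configuring the MPU, and releasing the reset line are reduced to log entries.

```python
import hashlib
import hmac


def image_hash(image: bytes) -> bytes:
    # Assumption: SHA-256; the text only speaks of "the hash value".
    return hashlib.sha256(image).digest()


def verify_stage(image: bytes, expected_hash: bytes) -> bool:
    """Compare the computed hash with the stored expected hash."""
    return hmac.compare_digest(image_hash(image), expected_hash)


def secure_boot(sd_card: dict, expected_bbl: bytes, expected_linux: bytes) -> list:
    log = []
    # Stage 1: the ZSBL on VC0 loads the BBL and checks it against the value
    # kept in the secure storage element of VC0.
    if not verify_stage(sd_card["bbl"], expected_bbl):
        return log + ["abort: BBL hash mismatch"]
    log.append("start REE with verified BBL")
    # Stage 2: the modified BBL loads and verifies the Linux image.
    if not verify_stage(sd_card["linux"], expected_linux):
        return log + ["abort: Linux hash mismatch"]
    log.append("boot Linux")
    # Stage 3: VC0 hands the SM owner privilege to the application processor.
    log.append("transfer SM ownership to AP")
    return log


genuine = {"bbl": b"bbl-image", "linux": b"linux-image"}
expected_bbl = image_hash(genuine["bbl"])
expected_linux = image_hash(genuine["linux"])

ok_log = secure_boot(genuine, expected_bbl, expected_linux)
tampered = dict(genuine, bbl=b"evil-bbl")
bad_log = secure_boot(tampered, expected_bbl, expected_linux)
```

With a modified BBL on the SD-card, the flow stops before the REE is started, so unverified code never runs on the application processor.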
6.2 Trustlets
In this use case, we utilize the virtual processor VC1 of RVSCP to execute
a trustlet. We compile the trustlet with an LLVM-based toolchain
supporting the SCFP scheme and encrypt the program with the key K. To
interact with the outside world and to store secrets, the trustlet uses
the embedded UART controller and the secure storage element. The secure
storage element is protected from malicious accesses by tunneling all
requests through a dedicated access function and binding the access to the
CFI state S_SE. We define access to the UART controller as uncritical.
Therefore, the interaction with this controller is unprotected.
First, the application processor switches off the RVSCP by utilizing the
reset unit. Then, the processor claims a claimable BRAM and stores the
code, consisting of a small unencrypted boot code and the encrypted
trustlet, in this storage. After the code is saved, the AP sets VC1 as
owner of the code storage in the security monitor. Furthermore, the MPU is
configured to provide a memory region owned by the trustlet to use as RAM
and a shared memory region to establish a communication channel between
REE and the trustlet. Finally, the RVSCP is started by releasing the reset
line of the secure processor, and the code is executed. To initialize key
K and state S_SE, the boot code of the trustlet writes the decryption key
into the key register of VC1 and claims the secure key storage element by
setting the compressed state S_SE into the peripheral identifier field.
Now, control is passed to the SCFP-protected trustlet code and the
decryption stage of the processor decrypts the code using the key. To
access the UART controller, the trustlet needs to claim the controller by
sending a claim request to the SM. The secure storage element is accessed
by using the dedicated access function. Since the compressed state is set
directly into the peripheral ID field of each AXI4 request leaving the
processor, access to the secure storage element is only permitted when
reaching the predefined state S_SE. The communication channel established
using the shared, external memory allows the REE and the trustlet to
exchange information. When the designated SM owner, the AP, withdraws
access to the UART controller, the claimable BRAM, or the shared memory
region, the trustlet is notified using an interrupt. This interrupt is
handled by the ISR implemented in the trustlet. The ISR clears all
secrets, aborts communication with the UART controller, gracefully
releases the peripheral, and enters a predefined IDLE state.
6.3 Security Discussion
For the security analysis of our heterogeneous architecture offering a
secure environment, we consider three attacker models.
Exploitable Trustlet. The first attacker model considers a trustlet
containing a bug exploitable by an adversary. Similar to recent attacks
[10, 11], the bug, e.g., a buffer overflow, could be exploited over the
communication API between REE and TEE. RVSCP protects against this
attacker model by offering a CFI unit that mitigates control-flow
hijacking attacks on forward and backward edges. Furthermore, RVSCP limits
the impact of a data-oriented attack by restricting access to sensitive
peripherals through the combination of the CFI unit with the secure I/O
concept.
Malicious Trustlet. The second attacker model considers a trustlet
containing explicit code to attack the architecture directly. To prevent
an adversary from performing a DoS attack by permanently claiming all
peripherals, HECTOR-V offers a withdraw mechanism. Since the core,
process, and peripheral IDs are managed by the hardware, a malicious
trustlet cannot change these IDs in software to gain unauthorized access
to peripherals. While the heterogeneous architecture prevents an attacker
from performing cache and transient-based attacks between REE and TEE,
RVSCP needs additional countermeasures, such as flushing the
microarchitectural state [64] on a context switch, to also protect against
cache attacks between trustlets. Due to the simplicity and openness of
RVSCP, transient-based attacks cannot be performed by an adversary. RVSCP
protects trustlets within the processor from each other by using a
hardware mechanism to perform context switches, different encryption keys,
and a separate code storage and RAM region for each trustlet.
Malicious Security Monitor Owner. This attacker model considers a
malicious SM owner targeting the other entities. The concept of setting an
immutable ID into the ID field of critical elements prevents the SM owner
from accessing secrets, such as keys, hash values, or the secure boot
code. In HECTOR-V, withdrawing access to specific peripherals can only be
initiated by using the dedicated withdraw command. Since the withdraw
mechanism notifies the current owner of the peripheral about the incoming
withdraw procedure, sensitive data can be cleared first. To prevent the SM
owner from switching off a different entity in the system and then
accessing secrets stored in the peripherals, the reset unit could, similar
to the withdraw mechanism, notify the entity about the incoming reset.
Except for resetting participants and withdrawing peripherals, the
execution of trustlets cannot be interrupted by the SM owner.
7 RELATED WORK
Concurrently with the development of the HECTOR-V architecture, SiFive
introduced WorldGuard [62]. In WorldGuard, each core is assigned a world
ID and each process on the core can be annotated with a process ID.
Similar to HECTOR-V, the ID is transported using the interconnect, and
requests from participants are filtered by peripherals, the memory, and
the caches. However, both architectures differ in various design choices.
First, HECTOR-V uses a hardware-based security monitor which can only be
configured by one party. In contrast to WorldGuard, the security monitor
ownership can be dynamically transferred to other parties, allowing
flexible use cases. Additionally, the security monitor allows each
participant in HECTOR-V to request access to certain peripherals and to
request access to already claimed peripherals using a withdraw request. We
further propose a concrete secure processor design utilizing features of
the heterogeneous architecture to create a trusted execution environment.
Moreover, we comprehensively describe the hardware-software interaction
and demonstrate features of HECTOR-V by introducing several use case
scenarios.
8 CONCLUSION AND FUTURE WORK
In this paper, we proposed HECTOR-V, a secure TEE design strategy
using a heterogeneous CPU architecture. Our design establishes
secure paths between peripherals and the cores by tagging each
party with an identifier. The peripherals enforce access permissions
by checking the identifier, which is transported along with each
bus request. To configure these access permissions, we integrate a
hardware-based security monitor into the architecture. The security
monitor, which exclusively can set permissions, is owned by
a configuration party. By allowing this ownership to be transferred to
other parties, HECTOR-V enables flexible permission management. In
contrast to similar design approaches, we provide a notifier-based
mechanism to withdraw access to certain peripherals securely. We
further introduce RVSCP, a security-hardened CPU design tailored
for our architecture. RVSCP combines a fine-granular control-flow
integrity scheme with the secure I/O concept of HECTOR-V to
restrict access to assets. To complete our TEE design, we introduce
secure data and code storage elements, a reset unit, and a memory
protection unit. We examine the features of our architecture in a
secure boot and enclave scenario.
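As a compact illustration of the identifier check and the transferable monitor ownership summarized above, consider the following software model. It is a minimal sketch, not the hardware design: the names `BusRequest`, `FilteredPeripheral`, and `MonitorModel`, the use of integer IDs, and the single allowed ID per peripheral are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BusRequest:
    """Hypothetical bus request carrying the identifier of its source."""
    source_id: int
    address: int


class FilteredPeripheral:
    """Hypothetical peripheral that filters requests by identifier."""

    def __init__(self):
        self.allowed_id = None  # written only through the security monitor

    def access(self, request):
        # Grant access only if the transported ID matches the configured ID.
        return self.allowed_id is not None and request.source_id == self.allowed_id


class MonitorModel:
    """Hypothetical security monitor owned by exactly one party at a time."""

    def __init__(self, owner_id):
        self.owner_id = owner_id

    def set_permission(self, caller_id, peripheral, allowed_id):
        # Only the current owner of the monitor may configure permissions.
        if caller_id != self.owner_id:
            return False
        peripheral.allowed_id = allowed_id
        return True

    def transfer_ownership(self, caller_id, new_owner_id):
        # Ownership can be handed over, but only by the current owner.
        if caller_id != self.owner_id:
            return False
        self.owner_id = new_owner_id
        return True
```

After a transfer, the previous owner can no longer change permissions in this model, which is the property that makes the ownership handover meaningful.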
Future Work. To offer a feature commonly used by TEEs, our
platform can be extended with a remote attestation mechanism.
This feature could be integrated into RVSCP by utilizing the SCFP
scheme [59]. Furthermore, the security monitor can be extended to
also support fine-granular access permissions, such as read-only or
write-only. Additionally, as already discussed in Section 6.3, RVSCP needs
to be extended to mitigate cache-based attacks by either
flushing the cache or integrating ID checks in the cache. Although
the identifier-based system guarantees that peripherals can only be
accessed by certain parties, communication with externally attached
devices is not authenticated. The HECTOR-V architecture may be
extended with a mechanism to authenticate externally attached
devices to ensure that, e.g., the TEE only communicates with
trusted devices.
ACKNOWLEDGMENTS
This project has received funding from the European Research
Council (ERC) under the European Union’s Horizon 2020 research
and innovation programme (grant agreement No 681402) and by
the Austrian Research Promotion Agency (FFG) via the competence
center Know-Center (grant number 844595), which is funded in the
context of COMET - Competence Centers for Excellent Technolo-
gies by BMVIT, BMWFW, and Styria.
REFERENCES
[1] 2018. CVE-2016-10423.
[2] Martín Abadi, Mihai Budiu, Úlfar Erlingsson, and Jay Ligatti. 2009. Control-flow
integrity principles, implementations, and applications. ACM Trans. Inf. Syst.
Secur. 13 (2009). https://doi.org/10.1145/1609956.1609960
[3] Apple. 2020. About the Apple T2 Security Chip. Accessed: 2020-07-10.
[4] Apple. 2020. Product security certifications, validations, and guidance for SEP:
Secure Key Store. Accessed: 2020-07-11.
[5] ARM. 2019. AMBA AXI and ACE Protocol Specification. arm.com (2019).
[6] ARM. 2020. Processing Architecture for Power Efficiency and Performance.
Accessed: 2020-07-07.
[7] Architecure ARM. 2009. Security technology building a secure system using
trustzone technology (white paper). ARM Limited (2009).
[8] Divya Arora, Srivaths Ravi, Anand Raghunathan, and Niraj K. Jha. 2006.
Hardware-Assisted Run-Time Monitoring for Secure Program Execution on
Embedded Processors. IEEE Trans. Very Large Scale Integr. Syst. 14 (2006).
hps://doi.org/10.1109/TVLSI.2006.887799
[9] Krste Asanovic, Rimas Avizienis, Jonathan Bachrach, Scott Beamer, David Biancolin,
Christopher Celio, Henry Cook, Daniel Dabbelt, John Hauser, Adam
Izraelevitz, Sagar Karandikar, Ben Keller, Donggyu Kim, John Koenig, Yunsup
Lee, Eric Love, Martin Maas, Albert Magyar, Howard Mao, Miquel Moreto, Albert
Ou, David A Patterson, Brian Richards, Colin Schmidt, Stephen Twigg, Huy Vo,
and Andrew Waterman. 2016. The Rocket Chip Generator. EECS Department,
University of California, Berkeley, Technical Report (2016).
[10] Gal Beniamini. 2016. QSEE privilege escalation vulnerability and exploit (CVE-
2015-6639). Accessed: 2020-05-13.
[11] Gal Beniamini. 2016. War of the Worlds - Hijacking the Linux Kernel from QSEE.
Accessed: 2020-05-13.
[12] Swapnil Bhartiya. 2020. Linux in 2020: 27.8 million lines of code in the kernel, 1.3
million in systemd. Accessed: 2020-07-07.
[13] Ferdinand Brasser, Urs Müller, Alexandra Dmitrienko, Kari Kostiainen, Srdjan
Capkun, and Ahmad-Reza Sadeghi. 2017. Software Grand Exposure: SGX Cache
Attacks Are Practical. In Workshop on Offensive Technologies – WOOT.
[14] Jo Van Bulck, Marina Minkin, Or Weisse, Daniel Genkin, Baris Kasikci, Frank
Piessens, Mark Silberstein, Thomas F. Wenisch, Yuval Yarom, and Raoul Strackx.
2018. Foreshadow: Extracting the Keys to the Intel SGX Kingdom with Transient
Out-of-Order Execution. In USENIX Security Symposium.
[15] David Cerdeira, Nuno Santos, Pedro Fonseca, and Sandro Pinto. 2020. SoK:
Understanding the Prevailing Security Vulnerabilities in TrustZone-assisted TEE
Systems. In Proceedings of the IEEE Symposium on Security and Privacy (S&P),
San Francisco, CA, USA.
[16] Stephen Checkoway, Lucas Davi, Alexandra Dmitrienko, Ahmad-Reza Sadeghi,
Hovav Shacham, and Marcel Winandy. 2010. Return-oriented programming
without returns. In Conference on Computer and Communications Security – CCS.
hps://doi.org/10.1145/1866307.1866370
[17] Guoxing Chen, Sanchuan Chen, Yuan Xiao, Yinqian Zhang, Zhiqiang Lin, and
Ten-Hwang Lai. 2020. SgxPectre: Stealing Intel Secrets From SGX Enclaves via
Speculative Execution. IEEE Secur. Priv. 18 (2020).
https://doi.org/10.1109/MSEC.2019.2963021
[18] Nick Christoulakis, George Christou, Elias Athanasopoulos, and Sotiris Ioanni-
dis. 2016. HCFI: Hardware-enforced Control-Flow Integrity. In Conference on
Data and Application Security and Privacy – CODASPY.
https://doi.org/10.1145/2857705.2857722
[19] Tobias Cloosters, Michael Rodler, and Lucas Davi. 2020. TeeRex: Discovery and
Exploitation of Memory Corruption Vulnerabilities in SGX Enclaves. In 29th
USENIX Security Symposium (USENIX Security 20).
[20] Intel Corporation. 2019. Intel 64 and IA-32 Architectures Software Developer’s
Manual. (2019).
[21] Victor Costan and Srinivas Devadas. 2016. Intel SGX Explained. ePrint 2016/86
(2016).
[22] Ruan de Clercq, Johannes Götzfried, David Übler, Pieter Maene, and Ingrid
Verbauwhede. 2017. SOFIA: Software and control flow integrity architecture.
Comput. Secur. 68 (2017). https://doi.org/10.1016/j.cose.2017.03.013
[23] Jan-Erik Ekberg, Kari Kostiainen, and N. Asokan. 2014. The Untapped Potential
of Trusted Execution Environments on Mobile Devices. IEEE Secur. Priv. 12 (2014).
https://doi.org/10.1109/MSP.2014.38
[24] Jan-Erik Ekberg, Kari Kostiainen, and N Asokan. 2014. The untapped potential
of trusted execution environments on mobile devices. IEEE Security & Privacy
12 (2014).
[25] Trusted Firmware. 2020. OP-TEE. Accessed: 2020-05-12.
[26] Google. 2020. Titan in depth: Security in plaintext. Accessed: 2020-07-07.
[27] Google. 2020. Trusty TEE. Accessed: 2020-06-07.
[28] Johannes Götzfried, Moritz Eckert, Sebastian Schinzel, and Tilo Müller. 2017.
Cache Attacks on Intel SGX. In Proceedings of the 10th European Workshop
on Systems Security, EUROSEC 2017, Belgrade, Serbia, April 23, 2017.
https://doi.org/10.1145/3065913.3065915
[29] Roberto Guanciale, Hamed Nemati, Christoph Baumann, and Mads Dam. 2016.
Cache Storage Channels: Alias-Driven Attacks and Verified Countermeasures.
In IEEE Symposium on Security and Privacy – S&P.
https://doi.org/10.1109/SP.2016.11
[30] Shay Gueron. 2016. A Memory Encryption Engine Suitable for General Purpose
Processors. ePrint 2016/204 (2016).
[31] Apple Inc. 2020. Security enclave processor for a system on a chip. Accessed:
2020-06-07.
[32] Intel. 2017. Intel SGX and Side-Channels. Accessed: 2020-05-23.
[33] The iPhone Wiki. 2020. Greensburg 14G60 (iPhone6,1). Accessed: 2020-07-07.
[34] Scott Johnson, Dominic Rizzo, Parthasarathy Ranganathan, Jon McCune, and
Richard Ho. 2018. Titan: enabling a transparent silicon root of trust for Cloud.
In Hot Chips: A Symposium on High Performance Chips.
[35] Sanjeev Khushu and Wilfred Gomes. 2019. Lakefield: Hybrid cores in 3D Package.
In 2019 IEEE Hot Chips 31 Symposium (HCS), Cupertino, CA, USA, August 18-20,
2019. https://doi.org/10.1109/HOTCHIPS.2019.8875641
[36] Jae-Hyuk Lee, Jin Soo Jang, Yeongjin Jang, Nohyun Kwak, Yeseul Choi, Changho
Choi, Taesoo Kim, Marcus Peinado, and Brent ByungHoon Kang. 2017. Hacking
in Darkness: Return-oriented Programming against Secure Enclaves. In USENIX
Security Symposium.
[37] Moritz Lipp, Daniel Gruss, Raphael Spreitzer, Clémentine Maurice, and Stefan
Mangard. 2016. ARMageddon: Cache Attacks on Mobile Devices. In USENIX
Security Symposium.
[38] LowRISC. 2019. lowRISC Chip. Accessed: 2020-05-07.
[39] lowRISC. 2020. OpenTitan Open source silicon root of trust (RoT). Accessed:
2020-06-22.
[40] Tarjei Mandt, Mathew Solnik, and David Wang. 2016. Demystifying the secure
enclave processor. Black Hat Las Vegas (2016).
[41] Steve McConnell. 2004. Code complete. Pearson Education.
[42] Elliot H. Mednick and Edward McLellan. 2020. Instruction subset implementation for
low power operation. US10698472B2 (2020).
[43] Kit Murdock, David Oswald, Flavio D Garcia, Jo Van Bulck, Daniel Gruss, and
Frank Piessens. 2020. Plundervolt: Software-based fault injection attacks against
Intel SGX. In 2020 IEEE Symposium on Security and Privacy (SP).
[44] Bernard Ngabonziza, Daniel Martin, Anna Bailey, Haehyun Cho, and Sarah
Martin. 2016. TrustZone Explained: Architectural Features and Use Cases. In 2nd
IEEE International Conference on Collaboration and Internet Computing, CIC 2016,
Pisburgh, PA, USA, November 1-3, 2016. hps://doi.org/10.1109/CIC.2016.065
[45] Sandro Pinto and Nuno Santos. 2019. Demystifying Arm TrustZone: A Compre-
hensive Survey. ACM Comput. Surv. 51 (2019). hps://doi.org/10.1145/3291047
[46] Samsung. 2020. KNOX and ARM®TrustZone®. Accessed: 2020-05-07.
[47] David Schaenrath. 2016. Fault-Aack Secure Processor Design. Graz University
of Technology (2016).
[48] Robert Schilling, Mario Werner, Pascal Nasahl, and Stefan Mangard. 2018. Point-
ing in the Right Direction - Securing Memory Accesses in a Faulty World. In
Annual Computer Security Applications Conference – ACSAC. hps://doi.org/10.
1145/3274694.3274728
[49] Michael Schwarz, Samuel Weiser, Daniel Gruss, Cle´mentine Maurice, and Stefan
Mangard. 2017. Malware Guard Extension: Using SGX to Conceal Cache Aacks.
In Detection of Intrusions and Malware & Vulnerability Assessment – DIMVA.
hps://doi.org/10.1007/978-3-319-60876-1 1
[50] Hovav Shacham. 2007. e geometry of innocent esh on the bone: return-
into-libc without function calls (on the x86). In Conference on Computer and
Communications Security – CCS. hps://doi.org/10.1145/1315245.1315313
[51] Adrian Tang, Simha Sethumadhavan, and Salvatore J. Stolfo. 2017. CLKSCREW:
Exposing the Perils of Security-Oblivious Energy Management. In USENIX Secu-
rity Symposium.
[52] Caroline Tice, Tom Roeder, Peter Collingbourne, Stephen Checkoway, Úlfar
Erlingsson, Luis Lozano, and Geoff Pike. 2014. Enforcing Forward-Edge Control-Flow
Integrity in GCC & LLVM. In USENIX Security Symposium.
[53] Andreas Traber, Florian Zaruba, Sven Stucki, Antonio Pullini, Germain Haugou,
Eric Flamand, Frank K Gurkaynak, and Luca Benini. 2016. PULPino: A small
single-core RISC-V SoC. In 3rd RISCV Workshop.
[54] Jo Van Bulck, Daniel Moghimi, Michael Schwarz, Moritz Lipp, Marina Minkin,
Daniel Genkin, Yuval Yarom, Berk Sunar, Daniel Gruss, and Frank Piessens.
2020. LVI: Hijacking transient execution through microarchitectural load value
injection. In 41st IEEE Symposium on Security and Privacy (S&P’20).
[55] Zhi Wang and Xuxian Jiang. 2010. HyperSafe: A Lightweight Approach to
Provide Lifetime Hypervisor Control-Flow Integrity. In IEEE Symposium on
Security and Privacy – S&P. https://doi.org/10.1109/SP.2010.30
[56] Andrew Waterman, Yunsup Lee, David A. Patterson, and Krste Asanović. 2011.
The RISC-V Instruction Set Manual, Volume I: Base User-Level ISA. Technical
Report. EECS Department, University of California, Berkeley.
[57] Nico Weichbrodt, Anil Kurmus, Peter R. Pietzuch, and Rüdiger Kapitza. 2016.
AsyncShock: Exploiting Synchronisation Bugs in Intel SGX Enclaves. In European
Symposium on Research in Computer Security – ESORICS.
https://doi.org/10.1007/978-3-319-45744-4_22
[58] Samuel Weiser and Mario Werner. 2017. SGXIO: Generic Trusted I/O Path for
Intel SGX. In Conference on Data and Application Security and Privacy – CODASPY.
https://doi.org/10.1145/3029806.3029822
[59] Mario Werner. 2020. System Architectures and Techniques for Efficient, Secure,
and Trusted Code Execution. Accessed: 2020-08-13.
[60] Mario Werner, Thomas Unterluggauer, David Schaffenrath, and Stefan Mangard.
2018. Sponge-Based Control-Flow Protection for IoT Devices. In European Symposium
on Security and Privacy – EuroS&P. https://doi.org/10.1109/EuroSP.2018.00023
[61] Mario Werner, Erich Wenger, and Stefan Mangard. 2015. Protecting the Control
Flow of Embedded Processors against Fault Attacks. In Smart Card Research and
Advanced Applications – CARDIS. https://doi.org/10.1007/978-3-319-31271-2_10
[62] Bob Wheeler. 2019. SIFIVE SECURES RISC-V. Microprocessor report (2019).
[63] Johannes Winter. 2008. Trusted computing building blocks for embedded linux-
based ARM trustzone platforms. In Conference on Computer and Communications
Security – CCS. https://doi.org/10.1145/1456455.1456460
[64] Nils Wistoff, Moritz Schneider, Frank K. Gürkaynak, Luca Benini, and Gernot
Heiser. 2020. Prevention of Microarchitectural Covert Channels on an Open-Source
64-bit RISC-V Core. arXiv abs/2005.02193 (2020).
[65] Mingwei Zhang and R. Sekar. 2013. Control Flow Integrity for COTS Binaries.
In USENIX Security Symposium.
[66] Ning Zhang, He Sun, Kun Sun, Wenjing Lou, and Yiwei Thomas Hou. 2016.
CacheKit: Evading Memory Introspection Using Cache Incoherence. In European
Symposium on Security and Privacy – EuroS&P.
https://doi.org/10.1109/EuroSP.2016.34
[67] Andrew D. Zonenberg and Bülent Yener. 2016. Antikernel: A Decentralized
Secure Hardware-Software Operating System Architecture. In Cryptographic
Hardware and Embedded Systems – CHES.
https://doi.org/10.1007/978-3-662-53140-2_12
