Fort-NoCs: Mitigating the Threat of a Compromised NoC

ABSTRACT
In this paper, we uncover a novel and imminent threat to
an emerging computing paradigm: MPSoCs built with 3rd
party IP NoCs. We demonstrate that a compromised NoC
(C-NoC) can enable a range of security attacks with an accomplice software component. To counteract these threats,
we propose Fort-NoCs, a series of techniques that work together to provide protection from a C-NoC in an MPSoC.
Fort-NoCs’s foolproof protection disables covert backdoor
activation, and reduces the chance of a successful side-channel
attack by "clouding" the information obtained by an attacker.
Compared to recently proposed techniques, Fort-NoCs offers
substantially better protection with lower overheads.

1. INTRODUCTION

Three emerging technology trends are conspiring to impose massive challenges in secure hardware design. First, the increasing integration of uncore components in a processor, such as accelerators and IP cores with different signaling protocols, necessitates an efficient interconnect system. Figure 1 presents the evolution of commodity multicore server processors. We see a transition from bus-based crossbars connecting a few cores (e.g. Core i7) to an on-chip network connecting many cores (e.g. Tilera, Intel SCC [9,27]). Second, as Multiprocessor-System-on-Chip (MPSoC) development grows in complexity and cost, there is an increasing emphasis on employing third party IP Network-on-Chip (NoC) blocks to connect these components for cost-reduction purposes [14, 16, 21, 22], also shown in Figure 1. Even if all other SoC components are designed with the highest security standards, a malicious NoC that has access to all nodes can wreak havoc on an otherwise secure system. Lastly, the recent trend of using MPSoCs in cloud computing data centers exposes these systems to new threats, as different co-scheduled applications (secure and non-secure) are forced to share the underlying hardware resources [26].
In this paper, we uncover a potent threat posed by a third party NoC in an MPSoC cloud computing setup. We demonstrate that a malicious third party vendor can provide a compromised NoC by embedding a hardware trojan within the IP block. Such a trojan can facilitate a range of possible attacks with an accomplice software component. For example, the accomplice software can establish a covert communication channel with the NoC to snoop on the ongoing data communication through the NoC, thereby stealing classified information. Many other types of attacks, such as voluntary data corruption or denial of service, are also possible. A key challenge then is how to provide secure and reliable communication when the underlying NoC is compromised. In the context of existing works (see Section 6), to the best of our knowledge, this is the first work to explicitly explore the threat posed by a Compromised NoC (C-NoC).

We propose Fort-NoCs (footnote 1), a three-layer security mechanism for an MPSoC system with a potentially compromised communication platform. These security measures are introduced in the SoC firmware that interfaces the processing element (PE) with the network interface (NI) of the NoC. Fort-NoCs comprises: (1) a lower layer, Data Scrambling (DS), that creates a stiff barrier for activation of a hardware trojan in the NoC; (2) a middle layer, Packet Certification (PC), that breaks the communication link between the untrusted NoC hardware and its accomplice thread; and (3) a top layer, Node Obfuscation (NObf), that decouples the source and destination nodes of a communication to dramatically increase side-channel resilience. Combined, our three-layer security mechanisms mitigate the threat of a C-NoC with minimal power-performance overhead.
We make the following contributions in this paper:

• We illustrate a new threat model stemming from a C-NoC design. This model can be a potent security threat as third party NoCs become more prominent in low-cost cloud computing on MPSoCs (Section 2).
• We show a detailed design of a C-NoC and evaluate its design footprint and runtime performance overhead. Our rigorous analysis demonstrates that it is possible to realize this threat model with a minimal footprint, clearly showcasing the potency of the threat model we uncover in this paper (Section 3).
• We present Fort-NoCs, a holistic layered approach to harden security on systems with a C-NoC: Node Obfuscation (NObf), Packet Certification (PC) and Data Scrambling (DS). Our techniques play complementary roles in hardware level protection by preventing the two-way communication between software and the hardware C-NoC, and introducing noise in the NoC data communication semantics with negligible overhead on performance and bandwidth of the NoC (Section 4).
• We analyze our secure design solutions by measuring the associated power, area, and performance overheads. Compared to the recently proposed state-of-the-art NoC-MPU [19], our proposed Fort-NoCs offers compelling advantages in power-performance overheads and threat resilience from a C-NoC (Section 5).

2. THREAT OF A C-NOC
In this section, we outline the security threat posed by a compromised NoC. We describe the threat model and briefly discuss the significance of this threat in modern software and hardware practices. In the next section, we discuss how an attack can be implemented with low overhead, demonstrating the high potency of this threat.
Footnote 1: Fort-NoCs is a play on words combining NoCs and Fort Knox, a security-hardened structure housing US gold deposits.

Figure 1: Cloud Chips Design Evolution. [Timeline along a Time axis: ~2000, multicore, bus-based with a crossbar switch (Core i7, Barcelona); ~2011, many-core, NoC-centric (Tilera, Intel SCC); ~2015, cloud scale MPSoC, NoC-centric + 3rd-party IP NoC. Legend: designer components, 3rd-party IP components.]
2.1 Threat Overview
Figure 2 shows a conceptual overview of one specific threat posed by a compromised NoC: information leak. The security attack must have two complementary hardware and software components. First, a hardware trojan implanted inside the NoC is responsible for facilitating an attack on the on-chip communication network. Second, an accomplice application must secretly communicate with the hardware trojan to activate it and send commands. Based on these commands, the trojan can carry out a plethora of potent attacks, such as information leak and denial of service.

Figure 2: Compromised NoC snooping data messages between programs A & B, and leaking to the accomplice program C.

Without loss of generality, we outline the sequence of phases to realize information leaking on a compromised NoC.
1. Design Phase: In the design phase, a third party NoC IP provider inserts a hardware trojan in the NoC. The trojan is able to detect and act upon commands, which are inserted in the flits communicated through the NoC. Section 3 shows how to insert this trojan in a standard NoC router with a minimal footprint.
2. Activation Phase: There are two specific steps in the activation phase. First, an accomplice thread (AcTh) is scheduled on one of the on-chip processing elements (PE). This step can be realized in a cloud computing setup where a client thread is co-scheduled with other client threads of the service provider. Second, the AcTh establishes a covert communication channel with the embedded hardware trojan in the NoC. For example, the AcTh can write a covert message in its own address space, and then cause eviction of the cache block containing that message. When a close-by NoC node receives data flits containing that covert message, it sends an acknowledgment message back to the AcTh. That NoC node subsequently sends covert messages to other NoC nodes in the network to activate them, while reporting the location of the AcTh. Once all activation sequences are received, the trojan in each NoC node waits for further commands from the AcTh.
3. Operational Phase: During the operational phase, the AcTh sends commands to initiate a malicious activity. In the case of information theft, the AcTh may request duplication of specific on-going data communication. For example in Figure 2, an AcTh residing in the local PE of NoC node X could send a command to the NoC node W to leak the communication between nodes Y and Z. The hardware trojan at NoC node W will then duplicate all packets going in/out of its local PE and send them to the AcTh at the PE of NoC node X.
4. Tear down Phase: In the tear down phase, the AcTh informs the compromised NoC to suspend its malicious activity. Similar to activation, this operation can be realized by sending a previously agreed upon message through a dirty cache block.

2.2 Threat Relevance
The setting of the security attack in this work is a cloud
computing system with many users running different programs on an MPSoC. This threat will be of growing importance as MPSoCs are poised to take over general purpose
processors in cloud computing hosting [14]. We also assume
that for promoting a cost-efficient MPSoC design, the on-chip
interconnect is designed with a 3rd party NoC IP. This is a
very likely scenario as NoC IP blocks continue to find their
way in many SoCs. In fact, iSuppli, an independent market research firm, has determined that four out of the top
five Chinese fabless semiconductor OEM companies use the
FlexNoC interconnect from Arteris [22]. Consequently, Arteris has achieved a three-year 1002% sales growth through
IP licensing [21]. Given this trend, we have shown a unique
threat model where the NoC design is compromised. With
an accomplice software, such a C-NoC can engage in a wide
range of malicious activities such as information extraction,
denial of service or voluntary data corruption.
A key question now is: can a third party design a seemingly innocuous but malicious NoC? To answer this question, we must understand two central aspects of a C-NoC design. First, we need to evaluate the design footprint of such a design, in terms of its area and power overhead. Second, we want to analyze the runtime performance overhead on other applications while the NoC is engaged in malicious activities. An attacker would want to keep both overheads low, since low overheads make the trojan harder to detect and thus increase the threat potency. We now delve into these key issues.

3. EVALUATING A C-NOC
In this section, we go into more detail about the design
of a C-NoC (Section 3.1), and evaluate its design footprint
(Section 3.2) and the runtime overhead (Section 3.3).

3.1 Design Overview
Figure 3 shows a classic NoC 4-stage virtual-channel (VC)
router pipeline. The four stages are the input buffers/route
calculation, VC allocation, switch allocation and switch traversal. There are p ports with v virtual channels in each port. Trojan hardware (HW) that duplicates incoming packets from the local processing element can be inserted in one or all of the ports, as shown in the figure. The trojan taps the incoming links from the network interface (NI) and watches for covert signals from a possible accomplice thread.
The brain of a trojan is a state machine that has three major states: inactive, waiting and leaking, described in detail in Table 2.

State      Description
Inactive   The trojan is inactive and is waiting for an accomplice thread to communicate.
Waiting    An accomplice thread has established communication. The trojan is waiting for further commands.
Leaking    The trojan is currently sending duplicate data to an accomplice thread.

Table 2: States of the Trojan.

Figure 3: Compromised NoC with Hardware Trojan.
[Figure legend: AcTh = Accomplice Thread, T = Trojan.]

There are minor states in between the major states to ensure that the major state changes are sanctioned by the AcTh. To change the state of the trojan, the AcTh sends a coded sequence of flits. Although accidental major state changes are possible due to random traffic, such an occurrence has an extremely low probability. Equation 1 gives the probability that m consecutive n-bit sequences occur in random traffic.
\[ P_{change} = \left( \frac{1}{2^{n}} \right)^{m} \tag{1} \]

For m = 5 and n = 32, Pchange = 2^-160 ≈ 6.8 × 10^-49. Both variables can be customized by the trojan creator according to their needs.
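
As a quick numerical check of Equation 1, the following minimal sketch evaluates the expression with exact rational arithmetic. It assumes that flits in random traffic are independent and uniformly distributed, which is the assumption implicit in the formula; the function name `p_change` is ours and purely illustrative.

```python
from fractions import Fraction


def p_change(n_bits: int, m_flits: int) -> Fraction:
    """Probability that m consecutive n-bit values all match by chance.

    Direct evaluation of Equation 1, assuming independent, uniformly
    random flit contents (an idealization of real NoC traffic).
    """
    return Fraction(1, 2 ** n_bits) ** m_flits


# Parameters used in the text: m = 5 flits of n = 32 bits each.
p = p_change(n_bits=32, m_flits=5)
print(f"P_change = 2^-{32 * 5} ~= {float(p):.2e}")
```

Because the probability shrinks exponentially in the product n × m, even short coded sequences make accidental activation negligible under this idealized traffic model.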

3.2 Design Footprint
To evaluate the power and area overhead of adding a trojan in an NoC, we modify the RTL of an open-source NoC
Router [1]. We add the logic for the trojan described in this
section. We also assume that the router is used as part of
a mesh topology with 5-input/output ports (4 cardinal directions + 1 local) and 5 virtual channels in each port. We
synthesize the design with the TSMC 45nm library using the
Synopsys Design Compiler. Our results for power and area
overhead are shown in Table 1. Adding the trojan HW yields
only 4.62% and 0.28% overheads in area and power, respectively. This result demonstrates that the footprint added by
the trojan HW is really low and hard to detect.
Metric        Baseline   with Trojan   Overhead
Area (µm²)    66400      69474         4.62%
Power (mW)    705.56     707.71        0.28%

Table 1: Design footprint of the NoC Trojan Hardware.

3.3 Runtime Overhead
Evaluating the runtime overhead of the attack is critical, because an attacker wants the covert activity to add little overhead so that it remains undetected. We therefore simulate the effect of a covert theft on the network traffic.
We use the gem5+garnet simulator [3] with PARSEC [2] benchmarks to simulate execution on an 8x8 mesh topology. We then add our trojan model, which duplicates every packet inserted at the target node and redirects the copy to the AcTh. Finally, we measure the average packet latency increase due to the additional traffic caused by duplicating packets. In a large NoC, the relative position of the target node and the malicious node, where the AcTh is scheduled, can vary substantially. Figure 4 shows two of the many possible theft paths. Depending on this relative position, the overall impact on network latency can change substantially. Thus, we analyze all possible theft paths across all benchmarks to evaluate the runtime overhead.

Figure 4: Trojan with two possible locations of AcTh. The length of the theft path depends on where the AcTh is.
Figure 5 shows the probability distribution function (PDF) of the average latency increase versus the number of possible theft paths. On average, 83% of the theft paths have a 0% overhead, 14% have between 0-5% performance overhead and only 0.25% of the theft paths have an overhead of more than 10%. These results show that this type of covert theft can occur with a very small impact on the network latency when the longer theft paths are avoided. Given this result, it is clear that a compromised NoC poses a potent threat to MPSoCs in the near future. We now outline our approach to protect against this threat model.
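
To give a feel for the size of the search space in this analysis, the sketch below enumerates candidate theft paths on an 8x8 mesh. It is only an illustration under stated assumptions: a theft path is modeled as an ordered (target node, AcTh node) pair, and its length is the hop count under dimension-ordered (XY) routing, which may differ from the routing used in the simulations. It does not reproduce the latency measurements.

```python
from collections import Counter
from itertools import product

MESH_DIM = 8  # 8x8 mesh, matching the evaluation setup in the text


def hop_count(src: tuple, dst: tuple) -> int:
    """Hop count between two mesh nodes under XY routing (assumed policy)."""
    return abs(src[0] - dst[0]) + abs(src[1] - dst[1])


def theft_paths(dim: int = MESH_DIM) -> list:
    """All ordered (target, accomplice) node pairs with distinct endpoints."""
    nodes = list(product(range(dim), repeat=2))
    return [(target, acth) for target in nodes for acth in nodes if target != acth]


paths = theft_paths()
# Histogram: theft-path length in hops -> number of paths with that length.
histogram = Counter(hop_count(target, acth) for target, acth in paths)

for hops in sorted(histogram):
    print(f"{hops:2d} hops: {histogram[hops]:4d} paths")
```

Long paths are rare in this enumeration, which is consistent with the observation that an attacker can avoid them at little cost.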

4. MITIGATING THE THREAT OF A C-NOC
In this section, we describe our proposed design solutions to prevent covert data theft by a compromised NoC. Our solutions introduce security measures in the SoC firmware that interfaces the processing element with the network interface (NI) of the NoC. Figure 6 shows the design methodology used by SoC integrators. We assume two security levels: a trusted in-house design team hired by the SoC integrator and one or more untrusted 3rd party IP vendor(s). Once the in-house design team and the IP vendors agree on which protocol to use, the design team can independently develop its own trusted interface to the IP NoC chip. We assume the NIs use the Open-Core Protocol (OCP) [17] for interfacing on-chip components. Hence, Fort-NoCs’s techniques do not depend on 3rd party vendors, including the NoC IP provider, and are protected from malicious alterations by them.
Fort-NoCs is a three-layer security system that provides both proactive and reactive protection against information leaking attacks. We briefly outline the three layers of our proposed design below:

Figure 5: Runtime Performance Overhead. The figure shows overhead distribution across all possible theft paths. [Plot: x-axis Average Packet Delay Increase (-0.25% to >2.00%); y-axis Distribution (% of Theft Paths), 0% to 100%; one series per benchmark: blackscholes, bodytrack, canneal, ferret, fluidanimate, swaptions, vips, x264.]

Figure 6: Fort-NoCs Design Perspective. [Diagram labels: Core, OCP Interface, Request, Response, Fort-NoCs, In-house Design Team, IP Vendor, Trojan Attacker, On-Chip Interconnection Network.]

• Layer 1–Data Scrambling: At the lowermost layer, we propose a low-overhead data scrambling technique to avoid clean data transmission over the NoC at all times (Section 4.1). This technique creates a stiff barrier for the backdoor activation of the trojan and dramatically reduces the efficacy of NoC data snooping.
• Layer 2–Packet Certification: The second layer creates a
barrier between the NoC and the processing element
through a dynamic packet certification mechanism (Section 4.2). Flits with invalid certificates are never forwarded to the processing element, thereby breaking the
link between the software running on the processing element and the NoC hardware.
• Layer 3–Node Obfuscation: At the topmost layer, we propose an elegant technique to dynamically hide the communication nodes in an NoC, thereby introducing massive noise in the NoC data (Section 4.3). Our proposed
technique achieves remarkable side channel resilience
with minimal performance overhead, and zero overhead on NoC bandwidth demands.

4.1 Layer 1: Data Scrambling (DS)
Our first technique, Data Scrambling (DS), proactively prevents covert trojan backdoor activations by scrambling the
data in the SoC firmware before injecting it in the NoC. Consequently, the activation key sent by the AcTh will be distorted, thereby blocking its intended communication with
the trojan in the C-NoC. Apart from preventing activations,
it also renders the leaked data incomprehensible to the attacker. The SoC firmware scrambles the data using a transformation technique that can only be reversed at the intended
destination. There are many ways to provide low overhead hardware encryption. In this work, we use the well-known XOR cipher to encrypt the data before sending it into the compromised NoC. We discuss the implementation and a concrete example of DS in Section A1.
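
The idea behind DS can be illustrated with a minimal software sketch of an XOR cipher. This is not the firmware implementation (which is described in Section A1); the flit width, the key handling, and the names `FLIT_BYTES` and `xor_scramble` are assumptions made for illustration only. The sketch shows the two properties DS relies on: a coded activation pattern no longer appears on the wire, and the intended destination recovers the data by applying the same transformation.

```python
import secrets

FLIT_BYTES = 16  # hypothetical flit payload width, chosen for illustration


def xor_scramble(payload: bytes, key: bytes) -> bytes:
    """XOR every payload byte with the (repeating) key.

    XOR is its own inverse, so calling this twice with the same key
    restores the original payload.
    """
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(payload))


# Assumption: sender and receiver firmware share a key the NoC never sees.
key = secrets.token_bytes(FLIT_BYTES)

# A made-up activation pattern an accomplice thread might try to send.
activation_pattern = bytes.fromhex("deadbeef") * 4

on_the_wire = xor_scramble(activation_pattern, key)  # what the NoC observes
restored = xor_scramble(on_the_wire, key)            # at the destination

print("pattern visible to the NoC:", on_the_wire == activation_pattern)
print("restored at destination:  ", restored == activation_pattern)
```

A single fixed key is the simplest choice for a sketch; how keys are provisioned and refreshed is a design decision this example does not address.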

4.2 Layer 2: Packet Certification (PC)
The goal of PC is to provide a second layer of security that remains effective if the previous technique has been breached. Although DS is effective in preventing intended activations, the trojan backdoor can still be activated accidentally, albeit with a very low probability (Section 3.1). Other types of triggers that cannot be prevented by DS can also be integrated in the circuit. For instance, analog thermal triggers (i.e. triggering after a certain temperature is reached) and timer triggers can also be inserted by attackers [11]. This is precisely why a multi-layer security approach is important to protect against C-NoCs.
PC provides data integrity protection by attaching an encrypted tag at the end of the packet before injecting it in the
NoC. The SoC firmware creates a dynamically generated random lookup table at the boot-time that lists a 16-bit unique
identifier for each node in the system. Based on the destination node of a packet, each data packet embeds a tag containing the translated identifier of the destination node from the
lookup table. Before forwarding the data to the PE, the SoC
firmware at the destination verifies this certificate by comparing its own local copy of the lookup table and the destination ID from the OCP header. PC relies on the fact that an ongoing information leak attack will always involve sending a
copy of a flit to an unintended location, where the malicious
application resides. Once this is detected, the SoC firmware
drops the packet, while also triggering an exception that will
alert the hypervisor/OS of an ongoing security breach.
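
The certification flow described above can be sketched as follows. This is a simplified software model under several assumptions: packets are plain dictionaries, the tag is stored unencrypted (the paper attaches an encrypted tag), a leaked copy is modeled as the same packet with its header destination rewritten to the accomplice's node, and all names (`build_id_table`, `certify`, `verify`, `CertificationError`) are hypothetical.

```python
import random

_rng = random.SystemRandom()  # OS-backed randomness for the boot-time table


class CertificationError(Exception):
    """Raised when a packet's tag does not certify it for this node."""


def build_id_table(num_nodes: int) -> dict:
    """Boot-time lookup table: node index -> unique random 16-bit identifier."""
    return dict(enumerate(_rng.sample(range(1 << 16), num_nodes)))


def certify(packet: dict, table: dict) -> dict:
    """Sender side: attach the identifier of the intended destination."""
    return {**packet, "tag": table[packet["dest"]]}


def verify(packet: dict, local_node: int, table: dict) -> bytes:
    """Receiver side: forward the payload only if the tag matches this node."""
    if packet["dest"] != local_node or packet.get("tag") != table[local_node]:
        raise CertificationError(f"packet rejected at node {local_node}")
    return packet["payload"]


table = build_id_table(num_nodes=64)
packet = certify({"dest": 5, "payload": b"secret"}, table)

print(verify(packet, local_node=5, table=table))  # legitimate delivery

leaked_copy = {**packet, "dest": 9}  # duplicate redirected to node 9
try:
    verify(leaked_copy, local_node=9, table=table)
except CertificationError as err:
    print("dropped:", err)  # firmware would also alert the hypervisor/OS
```

Because identifiers are unique per node, a copy delivered anywhere other than the original destination carries a tag that cannot match, which is the property PC relies on.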
With sufficient insight and inside information, a creative attacker can defeat PC by thinking ahead and dropping the tail flits, or by replacing them with a copy of tail flits originally sent to the node of the malicious application. This can fool the verification test performed by the SoC firmware at the PE interface. The problem can be solved by using a post-silicon configuration that determines in which part of the packet the tag is inserted, and by periodically altering this configuration at boot time. By configuring post-silicon, we ensure that the designer is always ahead of the attacker in the "cat-and-mouse" security game.

4.3 Layer 3: Node Obfuscation (NObf)
Node Obfuscation (NObf) is the final armament in Fort-NoCs’s arsenal of security measures. One of the key aspects of security threats posed by a compromised NoC depends on identifying the target application running on a particular node on the NoC. NObf aims to obfuscate the NoC communication semantics by periodically decoupling the source-destination pair of a given communication. We discuss the insight behind our proposed mechanism, the implementation of the scheme, and how to estimate the efficacy of this technique.

4.3.1 Design Insight
Wang and Suh have shown that a malicious thread running together with a hashing algorithm can analyze or narrow down the possible values of a private key by observing its own traffic latency [25]. Pellegrini et al. also showed that private keys can be derived from observations of various faulty encrypted data [18]. These observations led us to explore a fundamentally different and complementary approach: introducing massive noise in the NoC data communication by altering the source/destination. When the data sent over any specific routing path is mixed, recovering meaningful sensitive information from an application becomes almost impossible.

4.3.2 Implementation
NObf is implemented in the SoC firmware by initiating a routine seamless migration of the running application to another node in the system. Of course, there are various types of PEs in the MPSoC, and care must be taken while performing these migrations so that matching PEs are selected when a thread is migrated. To this end, the SoC firmware keeps track of the type of each PE and uses this information when performing NObf. The overhead of NObf comes from moving the architected state (processor registers) and from cold cache effects. For practical intervals between two migrations, our architectural simulations show these effects to be low for our benchmarks. More details on the implementation are given in Section A2.
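
The PE-matching constraint can be sketched as a small selection routine. This is a hypothetical illustration, not the firmware logic of Section A2: the PE-type table, the notion of "occupied" nodes, and the uniform random choice among candidates are all assumptions, as is the function name `pick_migration_target`.

```python
import random


def pick_migration_target(current: int, pe_types: dict, occupied: set,
                          rng=random):
    """Pick a node to migrate to, or None if no compatible node is free.

    A candidate must host the same PE type as the current node, must not
    be the current node, and must not already be occupied (assumption:
    threads are moved to free nodes rather than swapped).
    """
    wanted = pe_types[current]
    candidates = [node for node, pe_type in pe_types.items()
                  if pe_type == wanted and node != current
                  and node not in occupied]
    return rng.choice(candidates) if candidates else None


# Hypothetical 4-node system: three general-purpose cores, one accelerator.
pe_types = {0: "cpu", 1: "accel", 2: "cpu", 3: "cpu"}

print(pick_migration_target(0, pe_types, occupied={3}))    # only node 2 fits
print(pick_migration_target(1, pe_types, occupied=set()))  # no match: None
```

Choosing the target at random keeps the new location unpredictable to an observer, at the cost of ignoring locality; a real policy would likely weigh both.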

4.3.3 Efficacy Estimation
To estimate the efficacy of our proposed NObf, we utilize the recent work by Kocher et al. to transform this problem into a signal-to-noise ratio (SNR) estimation. Though noise injection to the signal does not provide theoretical security, it increases the attacker’s effort of extracting secret keys [20]. Kocher et al. have shown that decreasing the SNR of the side-channel information can linearly or quadratically increase the number of inputs required for a successful side-channel analysis [13]. Hence, we analyze the disturbance introduced by NObf to the original data in the network. Our experiment involves computing the amount of original data (i.e. signal) communicated over a given routing path, compared to unrelated data communicated over the same path. Given a scheduling quantum Q, the time P before NObf migrates a running application to a different node (termed the Singular Vulnerability Period (SVP)), and the number of threads N, the SNR is given by Equation 2. The derivation of the formula can be found in Section A3.

\[ SNR_{NObf} = \frac{P \times \left\lceil \frac{Q}{PN} \right\rceil}{Q - P \times \left\lceil \frac{Q}{PN} \right\rceil} \tag{2} \]
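
The following sketch evaluates Equation 2 for a few parameter choices. It assumes that Q and P are expressed in the same time unit and treats a non-positive denominator (no unrelated traffic on the path) as an unbounded SNR; both are our assumptions, since the derivation is deferred to Section A3. The function name `snr_nobf` and the example values are illustrative only.

```python
import math


def snr_nobf(q: float, p: float, n: int) -> float:
    """Evaluate Equation 2 for scheduling quantum q, SVP p and n threads."""
    numerator = p * math.ceil(q / (p * n))  # P x ceil(Q / (P N))
    denominator = q - numerator             # Q - P x ceil(Q / (P N))
    if denominator <= 0:
        # Assumption: nothing but original data on the path in this case.
        return math.inf
    return numerator / denominator


# Illustrative values: quantum of 100 time units, SVP of 1 unit.
for threads in (1, 2, 4, 8, 16):
    print(f"N = {threads:2d}: SNR = {snr_nobf(q=100, p=1, n=threads):.3f}")
```

With these example values the SNR falls as more threads share the rotation, which matches the intent of NObf: the more unrelated data is mixed into a routing path, the less signal an observer on that path retains.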

4.4 Potential Spoofing Attacks
In our threat model, the compromised NoC can spoof any
node in order to create a dummy request of a privileged information. We discuss how Fort-NoCs can fully deal with
this type of attack in Section A4.

5. EXPERIMENTAL RESULTS
In this section, we present the area, power and performance overheads incurred by adding our proposed Fort-NoCs. We have also observed scalable side channel resilience of our proposed schemes (presented in detail in Section A6). Our baseline is a traditional NoC without any security features. We also compare Fort-NoCs against state-of-the-art related work in preventing information leakage. We compare with the NoC-MPU scheme by Porquet et al. that prevents unauthorized read and write memory accesses at each NI [19]. Section A5 outlines our implementation of the NoC-MPU.

5.1 Implementation
We evaluate our proposed designs on several metrics such
as power, area and performance. To obtain power and area
overheads, we use the open-source Stanford Verilog model of
a modern NoC router as our baseline [1]. We insert the trojan
on top of a virtual channel router to implement the functionality described in Section 3. The NoC router model that we use has a 3-stage speculative pipeline composed of RC/VC allocation, switch allocation and switch traversal. To evaluate the performance overhead, we model DS, PC and NObf in our gem5+garnet setup [3]. These modules are interfaced during packet dispatch and reception. We then simulate real-world PARSEC benchmarks using full-system simulation.

5.2 Area and Power
Table 3 shows the overhead of adding security features in
a standard SoC OCP interface [8]. NObf is excluded from
the table as it doesn’t change the OCP interface. Fort-NoCs’s
schemes show lower overhead compared to the NoC-MPU.
Between Fort-NoCs’s two schemes, PC has about 15× lower
area overhead compared to DS. The large area overhead of
DS is due to the shifters needed to transform the data bits
during the packetization and depacketization stages. PC has
no power overhead as it needs minimal logic to attach a tag
on the unused field of the head flit. NoC-MPU has a large
overhead compared to Fort-NoCs because of the logic to check
if each access is authorized. This involves checking each entry of a content-addressable-memory (CAM) like the PLB to
see if a thread has read/write access to a memory location.
Metric        Baseline   PC       DS       NoC-MPU
Area (µm²)    4917       +0.34%   +9.57%   +42.00%
Power (mW)    1.705      +0%      +5.08%   +64.29%

Table 3: Overhead of Security Schemes.

5.3 Performance
Fort-NoCs provides a three-layer security to a system with
a compromised communication platform. As such, we evaluate the incremental effect on the performance overhead of
adding each layer of security. This trade-off analysis will allow designers to configure Fort-NoCs based on the type of
security required by the application or design. For instance,
high-security domains can use Fort-NoCs’s three layers at
the cost of higher performance penalty while low-security
domains will probably need only 1 or 2 layers of security.
Figure 7 shows the performance overhead of Fort-NoCs.
We show the incremental overhead of adding each layer of
protection. All Fort-NoCs techniques have low overheads.
On average, DS adds 3.8% overhead, PC adds 2%, while NObf adds a 0.01% increase in latency. The reason NObf adds minimal overhead on average is that some of the threads experience degradation, while some experience improvement due to application-dependent traffic patterns. This correlation between thread location and performance was also observed by Misler and Jerger [15]. Between DS and PC, PC has a smaller overhead because the size of the certificate to be checked and decrypted is much smaller compared to the size of the data flits. Combining the first two layers’ security yields 5.8% overhead. Fort-NoCs’s overhead (3-layers) is at 5.9%. NoC-MPU’s overhead is larger (8.03%) because when an access to the PLB misses, it needs to wait for a PLB refill before authenticating the current transaction. On average, Fort-NoCs’s 3-layer security system has 26% less overhead compared to NoC-MPU.
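
The relative reduction quoted above follows from the two average overheads reported in this section (5.9% for three-layer Fort-NoCs, 8.03% for NoC-MPU). The worked step below uses our own notation, which is not taken from the paper:

```latex
\[
\frac{O_{\text{NoC-MPU}} - O_{\text{Fort-NoCs}}}{O_{\text{NoC-MPU}}}
  = \frac{8.03 - 5.9}{8.03}
  = \frac{2.13}{8.03}
  \approx 0.265
\]
```

That is, the relative reduction is roughly 26%.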

Figure 7: Performance Overhead. [Bar chart: y-axis Percentage Overhead (0 to 15); x-axis benchmarks blackscholes, bodytrack, canneal, ferret, fluidanimate, swaptions, vips, x264 and average; series DS, DS_PC, DS_PC_NObf and NMPU.]

6. RELATED WORK
A vast amount of recent work focuses on hardware and software security, primarily to address a problem of growing importance. We focus our discussion on NoC security, which is most relevant to our work.
The current state-of-the-art in NoC security revolves around
protecting information traveling in the network against side
channel, physical and software attacks. Table 4 presents a
high-level comparison of existing works on NoC security based
on four major factors. Similar to software protection mechanisms, many existing works provide access control by monitoring the memory addresses (e.g., Data Protection Unit (DPU)
proposed by Fiorin et al. [6], firewall from Sonics [23], access control on memory banks by Diguet et al. [5]). Some
proposals aim to use encrypted data transmission over the
NoC (e.g., [7, 10]) or partition the NoC into separate zones
based on trust [25, 26]. In the industry, ARM has approached trusted computing through its TrustZone platform [24]. The idea is to establish secure and non-secure states throughout the chip. By changing the security mode of a component along with safety checks, information leakage from secure to non-secure areas is prevented.
Our work in this paper explores a new threat model orthogonal to these works. First, we assume that the hardware trojan is embedded in the NoC itself, whereas others
assume a trustworthy NoC. Second, we demonstrate that information can be extracted from the NoC without relying
on memory access (either through on-chip cache or off-chip
memory). Third, we assume the presence of an accomplice
software thread, which acts as the orchestrator of the attack.
The hardware-software coalition in our threat model is similar to the Illinois Malicious Processor (IMP) [12]. However,
the IMP does not consider the threat of a C-NoC.
Work            Trojan Loc (a)   IS (b)    Prot (c)   TM (d)
Fort-NoCs       NoC              NoC       NI         S/W
DPU [6]         S/W              Mem       NI         -
KeyCore [7]     S/W              Mem       NI         -
surfNoC [26]    S/W              S/W       NoC        -
AE [10]         S/W              Mem       NI         -
IMP [12]        µP               µP/Mem    -          S/W
NoC-MPU [19]    S/W              Mem       NI         -

(a) The part of the system where a trojan is inserted.
(b) The part of the system where the information is stolen.
(c) The part of the system where a protection mechanism is implemented to prevent an attack.
(d) The triggering mechanism in case of a hardware trojan.
Table 4: Comparison of Threats in NoCs.

7. CONCLUSIONS
NoCs designed by third-party entities may contain IP blocks with hardware trojans. Such C-NoCs can coordinate an attack with an accomplice software thread to pose a potent threat to emerging computing platforms. We propose Fort-NoCs, implemented by augmenting the SoC firmware, which mitigates the threat of a C-NoC through a layered approach. Through a rigorous evaluation methodology, we demonstrate compelling advantages of Fort-NoCs over existing schemes.

8. REFERENCES
[1] Open Source NoC Router RTL. https://nocs.stanford.edu/cgi-bin/trac.cgi/wiki/Resources/Router.
[2] Bienia, C. and others. The PARSEC benchmark suite: characterization and architectural implications. In PACT (2008), pp. 72–81.
[3] Binkert, N. and others. The gem5 simulator. SIGARCH Comput. Archit. News 39, 2 (Aug. 2011), 1–7.
[4] Bovet, D., and Cesati, M. Understanding the Linux Kernel - from I/O ports to process management: covers version 2.6 (3. ed.). O’Reilly, 2005.
[5] Diguet, J.-P. and others. NOC-centric Security of Reconfigurable SoC. In NOCS (2007), pp. 223–232.
[6] Fiorin, L. and others. Secure Memory Accesses on Networks-on-Chip. TC 57, 9 (2008), 1216–1229.
[7] Gebotys, C. H., and Gebotys, R. J. A framework for security on NoC technologies. In VLSI, 2003. Proceedings. IEEE Computer Society Annual Symposium on (2003), IEEE, pp. 113–117.
[8] Gudla, R. P., and Stevens, K. Design and Implementation of Clocked OCP Interfaces between IP Cores and On-Chip Network Fabric, April 2011. Masters Honors Thesis at University of Utah.
[9] Howard, J. and others. A 48-Core IA-32 Processor in 45 nm CMOS Using On-Die Message-Passing and DVFS for Performance and Power Scaling. J. of Solid-State Circ. 46, 1 (2011), 173–183.
[10] Kapoor, H. K. and others. A Security Framework for NoC Using Authenticated Encryption and Session Keys. Circuits, Systems, and Signal Processing (2013), 1–18.
[11] Karri, R. and others. Trustworthy Hardware: Identifying and Classifying Hardware Trojans. Computer 43, 10 (2010), 39–46.
[12] King, S. T. and others. Designing and Implementing Malicious Hardware. LEET 8 (2008), 1–8.
[13] Kocher, P. C. and others. Introduction to differential power analysis. J. Cryptographic Engineering 1, 1 (2011), 5–27.
[14] Li, S. and others. System-level integrated server architectures for scale-out datacenters. In Proc. of MICRO (2011), ACM, pp. 260–271.
[15] Misler, M., and Jerger, N. D. E. Moths: Mobile threads for on-chip networks. ACM Trans. Embedded Comput. Syst. 12, 1s (2013), 56.
[16] Objective Analysis Semiconductor Market Research Inc. NoC Interconnect Improves SoC Economics. www.objective-analysis.com/uploads/NoC_Interconnect_Improves_SoC_Economics_-_Objective_Analysis.pdf, June 2011.
[17] OCP International Partnership. Specification, Release 3.0, 2009. www.ocpip.org.
[18] Pellegrini, A. and others. Fault-based attack of RSA authentication. In Proc. of DATE (2010), pp. 855–860.
[19] Porquet, J. and others. NoC-MPU: a secure architecture for flexible co-hosting on shared memory MPSoCs. In Proc. of DATE (2011), pp. 1–4.
[20] Rostami, M. and others. Hardware Security: Threat Models and Metrics. In Proc. of ICCAD (2013), ACM, pp. 109–118.
[21] Shuler, K. Arteris Makes Big Gains on Inc. 500 List of America’s Fastest-Growing Private Companies. www.arteris.com/Inc-500-Arteris-pr-2013-august-20, August 2013.
[22] Shuler, K. Majority of Leading China Semiconductor Companies Rely on Arteris Network-on-Chip Interconnect IP. www.arteris.com/China_Majority_Arteris_pr_19_august_2013, August 2013.
[23] Sonics Inc. Sonics Inc. SonicsMX: SMART Interconnect Security Datasheet. http://www.sonicsinc.com.
[24] Alves, T., and Felton, D. TrustZone: Integrated Hardware and Software Security. White Paper. Tech. rep., ARM, 2004.
[25] Wang, Y., and Suh, G. E. Efficient timing channel protection for on-chip networks. In NOCS (2012), IEEE, pp. 142–151.
[26] Wassel, H. M. G. and others. SurfNoC: a low latency and provably non-interfering approach to secure networks-on-chip. In Proc. of ISCA (2013), pp. 583–594.
[27] Wentzlaff, D. and others. On-Chip Interconnection Architecture of the Tile Processor. Micro, IEEE (sept.-oct. 2007).

Appendix Section

A1. IMPLEMENTATION AND EXAMPLE OF DS

Figure 8 shows a conceptual view of the DS implementation with XOR cipher encryptors and decryptors. The sender end of the SoC firmware encrypts the data packet using a dynamic key, while the receiver end of the SoC firmware decrypts the packet using the same dynamic key. For this paper, we use a low-overhead dynamic key for every destination node, where the key is generated by the firmware at system boot time and stored in a lookup table. For example, one can use the node ID itself as a key to encrypt the messages. Figure 9 shows an example of how DS works with a dynamic key (the destination node ID in this case). Table 5 lists the possible decodings of the message as it is routed through different nodes, including the destination node 8. As such, the keys are never exposed to untrusted parties involved in the MPSoC development process. The configuration can be done through the firmware and can be changed at any time.

Figure 8: Low-overhead XOR Cipher allows encryption/decryption using dynamic keys from the firmware. (Sending interface: data XOR dynamic key gives encrypted data. Receiving interface: encrypted data XOR dynamic key gives data.)

Dest Node #   Result     Dest Node #   Result
0             -          5             0x579B
1             0xB579     6             0xAF36
2             0x6AF3     7             0x5E6D
3             0xD5E6     8             0xBCDA
4             0xABCD
Table 5: Possible Decoding of 0xDABC at different nodes.

Figure 9: Example for DS.

A2. FURTHER IMPLEMENTATION DETAILS OF NODE OBFUSCATION

Node Obfuscation is implemented through the firmware micro-code utilizing the support for application migration provided by many modern ISAs [SR2, SR3]. The micro-code uses special instructions to save the processor state (all registers) into a pre-designated memory location. Subsequently, the processor state is restored from that memory location in the new processing element. The memory state is transferred through the cache coherence protocol of the system.
Two key problems with such application migration stem from portability: backward and forward. The backward compatibility problem arises when the application is migrated to a processing element that does not fully support all the functionality of the previous processing element. Similarly, forward compatibility can be a problem when migration alters the error behavior sequence (e.g., a previously invalid opcode is correctly decoded on the new machine). To resolve these problems, the SoC firmware maintains a compatibility list for each on-chip processing element, and enables node obfuscation only between compatible nodes.

A3. SNR DERIVATION

For better comprehension, we analyze the communication profile in a given scheduling quantum (Q), which can be a regular OS scheduling quantum (commonly 10ms, but can be up to 100ms for high priority tasks [4]). To estimate the SNR, we must assess the maximum time a given communication, uniquely identified by its source-destination pair, utilizes a given routing path. Without NObf, this time is the entire Q. However, NObf migrates a running application to a different node after every P time units (referred to as the Singular Vulnerability Period (SVP)), thereby creating $\lceil Q/P \rceil$ intervals within a Q. Assuming there are N threads in the system, a given application may occupy the same node for $\lceil Q/(PN) \rceil$ intervals, for a total time of $P \times \lceil Q/(PN) \rceil$. For all other times in the Q, the communication between the said source node and the destination node is essentially noise in the system. Thus, the SNR of our proposed NObf is:

\[
\mathrm{SNR}_{NObf} = \frac{P \times \left\lceil \frac{Q}{PN} \right\rceil}{Q - P \times \left\lceil \frac{Q}{PN} \right\rceil}
\]

A4. POTENTIAL SPOOFING ATTACKS

In our threat model, the compromised NoC can spoof any node in order to create a bogus request for privileged information. Our proposed techniques successfully prevent these spoofing attacks. For PT, embedding identification information in the tail of the packet distinguishes a real request from a bogus one. Intra-node spoofing attacks are also possible when multiple threads run in a PE, for example when the PE is a simultaneous multithreading processor. In that case, an untrusted co-running program can initiate an attack and steal information from its co-scheduled thread. To nullify this attack, we copy the thread identification tag from the OCP header into the unused portion of the head flit. Subsequently, the NI matches these thread IDs before allowing the thread to access the NoC data.
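
The thread-ID check described above can be sketched as follows. This is a minimal software illustration in Python, not the NI hardware logic: the `HeadFlit` fields, the `NetworkInterface` class, and its `deliver` method are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HeadFlit:
    """Hypothetical head flit holding only the fields needed for the check."""
    dest_node: int
    thread_id: int   # copy of the OCP thread tag kept in the unused head-flit bits
    payload: bytes


class NetworkInterface:
    """Hypothetical NI model that releases data only to the matching thread."""

    def __init__(self, node_id: int):
        self.node_id = node_id

    def deliver(self, flit: HeadFlit, requesting_thread: int):
        # Refuse packets that are not addressed to this node.
        if flit.dest_node != self.node_id:
            return None
        # Refuse a co-scheduled thread whose ID differs from the tag in the flit.
        if flit.thread_id != requesting_thread:
            return None
        return flit.payload


ni = NetworkInterface(node_id=3)
flit = HeadFlit(dest_node=3, thread_id=7, payload=b"secret")
print(ni.deliver(flit, requesting_thread=7))   # owning thread gets the payload
print(ni.deliver(flit, requesting_thread=8))   # co-running thread gets None
```

Returning `None` for a mismatch is a modeling choice for this sketch; the text does not specify how the NI reports a rejected access.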

Figure 10: Side Channel Resilience Comparison (lower is better). (Axes: Signal-to-Noise Ratio versus Vulnerability Period (x0.1 ms); series: 16, 64, and 256 threads; marked points: A, B, and C.)

A spoofing attack could also succeed if the NoC trojan can correctly decode the tags or the scrambling of the flits. However, even if an attacker is aware of the mechanisms used by Fort-NoCs, the added complexity needed to circumvent these security measures would greatly increase the design overhead of the trojan, making it easier to detect and thereby defeating its purpose.
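
To make concrete what a trojan would have to decode, the sketch below reproduces the entries of Table 5. It assumes that, when the node ID is used as the key, a node decodes a 16-bit word by rotating it left by its own node ID. This rotation is inferred from the values in Table 5 and is not stated in the text, and the function name `rotl16` is introduced only for this illustration.

```python
def rotl16(word: int, shift: int) -> int:
    """Rotate a 16-bit word left by `shift` bit positions."""
    shift %= 16
    return ((word << shift) | (word >> (16 - shift))) & 0xFFFF


SCRAMBLED = 0xDABC  # scrambled word from Table 5

# Decoding attempted at each node, keyed by that node's ID (assumed rotation).
decoded = {node: rotl16(SCRAMBLED, node) for node in range(1, 9)}

for node, value in decoded.items():
    print(f"node {node}: 0x{value:04X}")
```

Under this assumption, destination node 8 obtains 0xBCDA, the last entry of Table 5, while every other node on the path obtains a different word.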

A5. NOC-MPU IMPLEMENTATION

The implementation of the NoC-MPU that we use has a small 10-entry permission lookaside buffer (PLB) in the NI. On a PLB miss, the PLB accesses a Permission Table (PT) in a separate on-chip memory. The PT is not counted towards the area and power overhead. The MPU takes 1 cycle to authenticate an operation and 10 cycles to refill a PLB miss.
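
The latency behavior described above can be illustrated with a small model. This is a sketch under stated assumptions, not the evaluated hardware: the class name `PLB`, the region-keyed permission table, the LRU replacement policy, and the choice to add the refill cost to the authentication cycle on a miss are all assumptions made for this example. Only the 10-entry size and the 1-cycle and 10-cycle figures come from the text.

```python
from collections import OrderedDict

PLB_ENTRIES = 10     # from the text
AUTH_CYCLES = 1      # from the text
REFILL_CYCLES = 10   # from the text


class PLB:
    """Hypothetical 10-entry permission lookaside buffer with LRU replacement."""

    def __init__(self, permission_table):
        # permission_table: region id -> set of allowed operations (assumed layout)
        self.permission_table = permission_table
        self.entries = OrderedDict()

    def check(self, region, op):
        """Return (allowed, cycles) for one access."""
        cycles = AUTH_CYCLES
        if region in self.entries:
            self.entries.move_to_end(region)        # mark as most recently used
        else:
            cycles += REFILL_CYCLES                 # miss: refill from the PT
            if len(self.entries) >= PLB_ENTRIES:
                self.entries.popitem(last=False)    # evict least recently used
            self.entries[region] = self.permission_table.get(region, frozenset())
        return op in self.entries[region], cycles


plb = PLB({0: {"read"}})
print(plb.check(0, "read"))    # first access misses in the PLB
print(plb.check(0, "read"))    # second access hits
```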

A6. SIDE-CHANNEL RESILIENCY

Figure 10 shows the side channel resilience, measured as SNR, of our proposed NObf as a function of the SVP (see Section 4.3.3). We used round-robin scheduling of threads because, compared to a random scheduler, it prolongs the time before the target thread is scheduled again on the target node. In the figure, we want a minimal SNR so that the actual information is hidden among other data. At the same time, we want this minimum SNR to occur at as large a period as possible so that NObf does not have to incur the overhead of rescheduling threads very often. However, designers do not have to use the minimum SNR and can set a threshold at a more practical period size.
A key feature of our proposed scheme is its scalability, as seen from the figure: increasing the thread count substantially lowers (improves) the SNR. For instance, in a 16-threaded application, there are several minima. However, the optimum period is at point A because it incurs less performance overhead. As we increase the number of applications or threads on the system, the optimum period gets pushed to the left (points B and C), as we need more and more applications to use any specific node in order to introduce more noise. For a 64-threaded application, the optimum period is 0.25ms for a 10ms quantum. However, for a 256-thread system, the optimum P is less than 0.1ms, but one may even choose 0.25ms and achieve an SNR comparable to that of a 64-thread system.
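
The SNR expression from the SNR derivation section can be evaluated directly to explore this trade-off. The sketch below is a minimal Python illustration. The function name `snr_nobf` is an assumption, and exact fractions are used so that the ceiling is not affected by floating-point rounding. The values it produces follow only from the formula and are not read off Figure 10.

```python
from fractions import Fraction
from math import ceil


def snr_nobf(P, Q, N):
    """SNR of NObf: P * ceil(Q / (P*N)) divided by the rest of the quantum Q.

    P: singular vulnerability period, Q: scheduling quantum (same time unit),
    N: number of threads in the system (the result is undefined if the
    thread occupies the node for the whole quantum).
    """
    P, Q = Fraction(P), Fraction(Q)
    signal = P * ceil(Q / (P * N))   # time the thread spends on the same node
    noise = Q - signal               # remaining time in the quantum
    return signal / noise


Q_MS = 10  # 10 ms quantum, as in the text
for n_threads in (16, 64, 256):
    snr = snr_nobf(Fraction(1, 4), Q_MS, n_threads)   # P = 0.25 ms
    print(n_threads, float(snr))
```

With P = 0.25ms, the formula gives the same value (1/39) for 64 and 256 threads, which is consistent with the remark that a 256-thread system at 0.25ms achieves an SNR comparable to that of a 64-thread system.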

A7. REFERENCES
[SR1] Daniel Becker and William Dally. Open Source NoC Router RTL. https://nocs.stanford.edu/cgi-bin/trac.cgi/wiki/Resources/Router.
[SR2] Uhlig, R. and others. Intel Virtualization Technology. Computer 38, 5 (2005).
[SR3] Advanced Micro Devices. Live Migration with AMD-V Extended Machine Migration, April 2008.

