PASCAL: Timing SCA Resistant Design and Verification Flow by Lai, Xinhui et al.
©2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any
current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new
collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other
works.
arXiv:2002.11108v2 [cs.CR] 19 Aug 2020
PASCAL: Timing SCA Resistant Design and
Verification Flow
Xinhui Lai1, Maksim Jenihhin1, Jaan Raik1, Kolin Paul1,2
1 Computer Systems, Tallinn University of Technology, Estonia
2 Department of Computer Science & Engg. Indian Institute of Technology Delhi, India
email:{Xinhui.Lai,Maksim.Jenihhin,Jaan.Raik,Kolin.Paul}@taltech.ee
Abstract—A large number of crypto accelerators are being
deployed with the widespread adoption of IoT. It is vitally im-
portant that these accelerators and other security hardware IPs
are provably secure. Security is an extra-functional requirement,
and hence many security verification tools are not mature. We
propose a flow, PASCAL, that works on RTL
designs and discovers potential timing Side Channel Attack
(SCA) vulnerabilities in them. Based on information flow analysis,
the flow identifies Timing Disparate Security Paths that
could lead to information leakage. The flow also (automatically)
eliminates the information leakage caused by the timing channel.
The insertion of a lightweight Compensator Block as balancing
or compliance FSM removes the timing channel with minimum
modifications to the design with no impact on the clock cycle
time or combinational delay of the critical path in the circuit.
I. INTRODUCTION
Security is not a first-class citizen in (hardware) design and
is rarely considered during design space exploration. Bugs
or vulnerabilities can originate from design flaws, some of
which can be fully eliminated after a complete verification.
The goal of the adversary in a security-critical application
is to learn information to which the adversary has no legitimate
access, e.g. classified data or secret keys. Attack vectors
such as side-channel analysis rely on design features to build
efficient exploits that undermine assumptions regarding the
accessibility of internal secret information in a computing
system. For example, Timing Driven Attacks derive secret
information from timing differences in execution traces, which
arise when information flows along different paths that share
the same (controllable) start node and (observable) end node.
Denning introduced the concept of secure information
flow in a computer system, whereby it can be shown that no
unauthorized flow of information is possible through control
and data flow [1]. However, in recent years, side channels,
i.e. out-of-band data channels, have been exploited to exfiltrate
or deduce secret information. Consequently, Information Flow
Tracking (IFT) has evolved and has been used as a formal
methodology for modeling and reasoning about security
properties related to integrity, confidentiality and side channels.
The problem becomes more interesting and harder because high-level
architecture abstractions are translated into transparent
microarchitecture implementations, while the hardware behavior in
the microarchitecture can cause additional information flows
which can be exploited to form these side channels.
As opposed to physical Side Channel Attacks (SCA) such as
differential power attacks, which require physical access to
the computer system, Timing SCA can be launched (relatively)
easily on general-purpose compute environments that contain
a memory hierarchy or performance-enhancing microarchitecture
features like speculative execution. The key invariant in
these attacks is that there are different timing paths that provide
out-of-band information. Security path verification addresses
a specific, important aspect of overall security verification by
checking access to the secure data on the hardware to make
certain that attackers cannot access the secure (secret) data through
illegal logic paths. For example, in Figure 1, there are paths
from A to B which are controlled by the node S containing
the secret. Tools perform Taint Propagation/Taint Analysis, which
is a conservative approximation of secure information flow
analysis, to find such paths [2]. A timing side channel exists
if the contents of S can be derived/deduced by analyzing the time
of arrival of K at N. We call two or more such paths with
unequal transit times Timing Disparate Security Paths.
These Timing Disparate Security Paths are potential
vulnerabilities open to a timing side channel exploit.
Fig. 1: Timing Disparate Paths
The primary contribution of our work is a secure automated
digital design flow, PAth based Side Channel AnaLysis
(PASCAL), that creates a secure IP core or system-on-chip.
The proposed flow starts from the RTL design and the threat
model, uses a state-of-the-art Security Path verification
tool to identify potential timing side channel vulnerabilities,
and removes them by enforcing uniform timing, which eliminates
the data-dependent cycle count variations behind the timing
side channels.
The remainder of the paper is organized as follows.
Section II summarizes the state of the art in this area.
Section III details the approach and presents the key algorithmic
contribution of this work for identifying Timing Disparate
Security Paths, while Section IV presents a lightweight method
for Timing SCA Resistant Design that uses the results of
Section III. Section V describes
the implementation results on a widely used crypto core and
demonstrates the efficacy of the proposed mitigation
method. Section VI summarises the contributions of the paper
and provides directions for future work.
II. RELATED WORK
Timing side channel attacks are known to be a hard and
very important problem in modern systems. They have
been used to extract cryptographic secrets from running systems.
Even differential privacy systems are not immune to these
attacks, which can be mounted by both remote and local
adversaries. Koeune et al. present an in-depth tutorial on Side
Channel Attacks [3].
A popular approach for defending against both local and
remote timing attacks is to ensure that the low-level instruction
sequence does not contain instructions whose performance
depends on secret information. This can be enforced by
manually re-writing the code, as was done in OpenSSL or
by changing the compiler to ensure that the generated code
has this property [4].
While methods for high-performance or low-power design
are available, design for security is still ad hoc. Only recently
have systematic methods that support design for trust and security
been described in the literature [5]. Menichelli et al. present an
exploration approach centered on high-level simulation based
on SystemC to improve the knowledge and
identification of weaknesses in cryptographic algorithm
implementations [6]. Ardeshiricham et al. have proposed an
information flow based method for secure hardware design [7]
that analyzes all logical code flows of the RTL code. In contrast,
VeriCoq-IFT converts designs from HDL to Coq to analyze
formal security properties [8]. SecVerilog requires explicitly
annotating each variable in the design with a security label;
this is similar to using a type system to track information
flow in the code [9]. Deng et al. have proposed using Computation
Tree Logic to model execution paths of the processor cache
logic and derive formulas for paths that can lead to timing
side-channel vulnerabilities [10].
Most of the mitigation techniques that have been proposed
try to remove data-dependent instruction cycle count variations
by balancing timing, or flatten power consumption to remove power
peaks/anomalies [11]. In some cases, pipeline randomization
for power and timing is also attempted. Alternatively, packet
route randomisation has been proposed as a mechanism to
increase NoC resilience [12]. Recently, Jiang et al.
have proposed a high-level synthesis (HLS) infrastructure that
incorporates static information flow analysis to remove timing
channels in a verifiable manner on HLS-generated hardware
accelerators [13].
The methodology proposed in this paper is based on a
formal method that can identify all Timing Disparate Security
Paths at RT level, and it improves on the state of the art with a
simple mitigation scheme for potentially SCA-vulnerable timing
channels.
In the next section, we discuss the proposed method for
discovering Timing Disparate Security Paths in RTL designs.
III. METHODOLOGY
Hardware implementations of encryption algorithms are being
increasingly used, as hardware is regarded as a more effective
root of trust. RSA is an asymmetric cryptographic algorithm
that has been shown to be vulnerable to Timing SCAs, and
mitigation techniques have also been proposed. However, the
major focus continues to be on verifying the correctness of
encryption algorithms and their implementation in software
and hardware. We present an approach based on RT level
analysis that allows a precise understanding of possible flows
for side channels based on timing. The methodology relies
on a formal analysis tool, the Cadence JasperGold Security Path
Verification App (JG SPV) [14]. The original objective of the
tool is security verification: it checks access to or leakage
of the secure data on the hardware to make certain that the
attacker cannot breach the authentication logic and reach the
secure data through illegal paths.
Based on a formal method of path sensitization from the
secret information to the observable output points, we propose
a method that can detect possible Timing-Disparate Paths
in RTL designs which could be exploited as Timing Side
Channel(s). As a result of this analysis, a simple and effective
retiming of Timing SCA sensitive paths is proposed to make
the design immune to the threats under the chosen threat
model. We illustrate this on a standard RSA RTL Verilog code.
Algorithm 1: Example: RSA Modulus Code
Input: Cm, Pn; // C is the m-bit cipher text, P is the n-bit private key
Output: Om; // O is the m-bit output plain text
1 R0 ← Montgomery(Cm) and R1 ← Montgomery(1)
2 j ← 0
3 while j ≤ n − 1 do
4   R0 ← Montgomery Reduction(R1 ∗ R1)
5   if P[j] then
6     R0 ← Montgomery Reduction(R0 ∗ R1)
7   end
8   Om ← Montgomery^−1(R0)
9 end
The decryption of the RSA modulus in Algorithm 1 uses
Montgomery modular multiplication with the square-and-multiply
algorithm. We do not detail how the key is chosen or how the
Montgomery algorithm works, but
focus on explaining the unintended timing channel in RSA
which can be used by an attacker to recover the private key. In
Algorithm 1, n, the number of bits of Pn, decides the total number
of loop iterations, while the value of each single bit P[j] of Pn
determines the operations in each iteration: only when P[j]
equals 1 are the statements at Lines 5, 6 and 7 executed; when
P[j] equals 0 they are skipped.
For the decryption of RSA, the total number of operations to be
executed may therefore differ between private keys.
Assuming the time for a single bit P[j] is
$t_{P[j]}$, the final execution time is
$t_{total} = \sum_{j=0}^{n-1} t_{P[j]}$. Thus
keys with a different number of '1's lead to different
execution times. This opens a timing side channel for
attackers.
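The dependence of the operation count on the key can be illustrated with a minimal Python sketch. It is not the RTL under analysis: plain modular arithmetic stands in for the Montgomery operations of Algorithm 1, a textbook left-to-right square-and-multiply is assumed, and the function name `modexp_op_count` is hypothetical.

```python
def modexp_op_count(base: int, exponent_bits: list[int], modulus: int) -> tuple[int, int]:
    """Left-to-right square-and-multiply.

    Returns (base**exponent mod modulus, number of modular multiplications).
    exponent_bits lists the key bits, most significant bit first.
    """
    result = 1
    ops = 0
    for bit in exponent_bits:
        # Square step: executed for every key bit.
        result = (result * result) % modulus
        ops += 1
        if bit:
            # Multiply step: executed only for '1' bits (the data-dependent part).
            result = (result * base) % modulus
            ops += 1
    return result, ops


if __name__ == "__main__":
    for key in ([1, 0, 1, 1], [1, 0, 0, 0]):
        value, ops = modexp_op_count(7, key, 1009)
        print(key, value, ops)
```

For the two 4-bit keys in the usage example, which contain three and one '1' bits, the sketch performs 7 and 5 modular multiplications respectively, so the operation count alone reveals the number of '1's in the key.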
For this Timing SCA, the PASCAL flow is shown in Figure 2.
First, we use JG SPV to analyze whether one or several
paths exist from a variable deemed to be secure and unobservable
to the output. JG SPV uses a special path sensitization
technology, implying taint analysis, to find whether the private key P
can be propagated to the output O. If such a path exists, JG
SPV gives a counterexample along with an execution trace
detailing the exact number of clock cycles (say X). The example
in Figure 3 shows waveforms of the related signals
along the path. We use the command "[get property info
-list{max length} property exponent to finish]" to get the
total execution time (in clock cycles) of an existing path for this
specified secure signal pair. Here it needs 44 clock cycles to
propagate (an additional 2 clock cycles are for setting up).
Fig. 2: PASCAL: Graphical Representation
Fig. 3: Counter Example and Execution Trace
After that, JG SPV is used to find another functional path (if it exists)
from P to the output O with a length different from X
cycles. This is achieved by invoking JG SPV on a modified
design, shown in Figure 4. A counter is added which drives
the multiplexer to select the situation where the DUV has
finished the decryption AND the length Y of the execution
trace is not equal to X. If JG SPV finds another path with
an execution trace length equal to neither X nor Y, it is added to
the Union Clause of the multiplexer select condition and the
process is repeated until all the timing classes are found.
Fig. 4: Modified DUV
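The iterative search for timing classes described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the callback `find_trace_length` is a hypothetical stand-in for one JG SPV run on the modified DUV (it returns some reachable trace length outside the excluded set, or `None` when no such trace exists), and `toy_oracle` is a brute-force toy model of a square-and-multiply core, not the formal tool or the RSA RTL.

```python
from itertools import product
from typing import Callable, Optional


def enumerate_timing_classes(
    find_trace_length: Callable[[frozenset[int]], Optional[int]],
) -> list[int]:
    """Repeat the query, excluding every length found so far, until none is left."""
    classes: set[int] = set()
    while True:
        length = find_trace_length(frozenset(classes))
        if length is None:          # no further disparate path exists
            break
        classes.add(length)         # extend the exclusion (union) condition
    return sorted(classes)


def toy_oracle(n_bits: int) -> Callable[[frozenset[int]], Optional[int]]:
    """Hypothetical oracle: one cycle per key bit plus one extra cycle per '1' bit."""

    def latency(key_bits: tuple[int, ...]) -> int:
        return n_bits + sum(key_bits)

    def oracle(excluded: frozenset[int]) -> Optional[int]:
        # Brute-force search over all keys; a formal tool would do this symbolically.
        for key_bits in product((0, 1), repeat=n_bits):
            t = latency(key_bits)
            if t not in excluded:
                return t
        return None

    return oracle


if __name__ == "__main__":
    print(enumerate_timing_classes(toy_oracle(4)))
```

In this toy model a 4-bit key yields the five timing classes 4 to 8 cycles. Each loop iteration corresponds to one formal query, which is why the number of tool invocations grows with the number of timing classes.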
IV. TIMING SCA SECURE DESIGN FLOW
We also propose a method that aims to achieve timing-sensitive
noninterference for the synthesized design, which
ensures that confidential or secret values cannot be
revealed by observing/measuring the timing of events at
observable ports. An intuitive method to remove this Timing
SCA vulnerability is to insert additional registers in the faster
paths using path-balanced scheduling [15]. However, as shown
in Figure 1, there could be many paths t1, t2, · · · , tn in the
same basic block. Assume, without loss of generality, that
there are n/2 paths, each differing by one cycle. A
path-balanced scheduling synthesis procedure would then insert
1 + 2 + 3 + · · · + n/2, i.e. O(n^2), registers.
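The quadratic growth follows from the closed form of the arithmetic series. As a short worked step (assuming, as in the text, that n is even and the k-th path needs k balancing registers):

```latex
\sum_{k=1}^{n/2} k \;=\; \frac{1}{2}\cdot\frac{n}{2}\left(\frac{n}{2}+1\right)
\;=\; \frac{n^{2}}{8}+\frac{n}{4} \;=\; O(n^{2})
```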
The method we propose is shown in Algorithm 2. Since
Timing Disparate Security Paths result in a Timing SCA
vulnerability only if they are observable at user interfaces
(output ports), we can relax the constraints of the path-balanced
scheduling approach and enforce indistinguishable timing
behavior only at the observable points in the design. Clearly, for
the basic block or the core to be timing insensitive at the
observable points, the output should be observable modulo
tmax cycles, where tmax = maximum(t1, t2, · · · , tn). We
enable the output port/interface every tmax cycles using a
counter and an AND gate. This small additional circuitry
acts as the Compensator, or balancing/compliance FSM, and
provides the (read) enable / data ready signal for the observable
interface.
This leads to a very simple synthesis technique
for ensuring a path-balanced design with a single lightweight
Compensator Block at the observable points of interest in
the design. The additional circuit is a small
counter which counts up to tm (the maximum execution length,
tmax) to generate the control input
for the AND gate, which in turn provides the enable signal to the
observable register. The counter is reset every time a new
input enters the basic block. This additional logic incurs
no penalty in the critical path of the system and avoids
resource duplication, since a single counter ensures that the
results from the different Timing Disparate Security Paths are
delivered to the observable interface with the same latency.
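The behavior of the Compensator Block can be illustrated with a small cycle-level Python model. It is a behavioral sketch under assumptions, not the synthesized circuit: the function name `observed_latency` and the signal names `result_ready` and `enable` are hypothetical, and the DUV is reduced to the cycle in which it finishes internally.

```python
def observed_latency(finish_cycle: int, t_max: int) -> int:
    """Return the cycle in which the result becomes visible at the output port.

    finish_cycle: cycle in which the DUV finishes internally (key dependent).
    t_max: longest execution length over all timing classes.
    """
    if not 1 <= finish_cycle <= t_max:
        raise ValueError("finish_cycle must lie in 1..t_max")

    counter = 0           # reset when a new input enters the block
    result_ready = False  # internal 'done' flag of the DUV
    cycle = 0
    while True:
        cycle += 1
        counter += 1
        if cycle >= finish_cycle:
            result_ready = True
        enable = counter == t_max       # terminal count of the compensator counter
        if enable and result_ready:     # AND gate driving the output register enable
            return cycle


if __name__ == "__main__":
    # Internal finish times differ, observed latency does not.
    print({t: observed_latency(t, 64) for t in (33, 48, 64)})
```

Whatever the internal finish time, the result is released in cycle t_max, so an observer at the output port sees a single timing class. The cost is that every operation takes the worst-case latency.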
Algorithm 2: Timing Channel Removal
Input: Design Under Verification (DUV)
Output: Secure Design Under Test (sDUV)
1 P ← PASCAL(DUV);
2 tm = findMaxExecutionLength(P);
3 Compensator Logic Block ← Counter(tm) + ResetLogic;
4 sDUV ← Compensator Logic Block + DUV
V. RESULTS
The RSA cryptographic RTL implementation based on Montgomery
modular multiplication with the square-and-multiply algorithm
is vulnerable to timing SCA. This is because, for different
keys, the time differences depend on the number of
'1's in the key, as explained for Algorithm 1. Figure 5 shows
the time required to generate the timing disparate classes
for 32-bit, 64-bit and 128-bit RSA.
The time needed to identify a timing class
varies: the first few timing classes are obtained
quickly, while the last few need a very
long time.
Our method correctly identifies all timing classes using
formal methods. For example, for the 32-bit RSA Verilog
implementation, it identifies all the timing classes with cycle times from
33 to 64. As for the mitigation method mentioned in Figure 4,
since the counter needs to count to 64, we only need a 7-bit
counter, which incurs an approximate area penalty of 7 flops. In
contrast, the path-balanced scheduling strategy would require
about 512 flip-flops. Clearly, with a 64-bit RSA, the savings
are more significant. As mentioned earlier, the Compliance
State Machine is not in the critical path and incurs no penalty
in the operational speed of the circuit.
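The area figures above can be checked with a short calculation. The sketch assumes that the counter must be able to hold every value from 0 to tmax, and that the 32 timing classes of the 32-bit design play the role of the n/2 paths in the 1 + 2 + · · · + n/2 estimate of Section IV; both function names are hypothetical.

```python
def counter_bits(t_max: int) -> int:
    """Flip-flops needed for a binary counter that can hold the values 0..t_max."""
    return t_max.bit_length()


def path_balancing_registers(num_paths: int) -> int:
    """Registers inserted by path balancing when the k-th path needs k registers."""
    return num_paths * (num_paths + 1) // 2


if __name__ == "__main__":
    print(counter_bits(64))              # compensator counter for t_max = 64
    print(path_balancing_registers(32))  # 1 + 2 + ... + 32
```

Under these assumptions the compensator needs 7 flip-flops, while path balancing needs 528 registers, which is of the same order as the roughly 512 flip-flops quoted above.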
 
Fig. 5: Normalized Execution Times
(Plot: x-axis, identified disparate timing classes; y-axis, CPU time for a next disparate timing class identification, in seconds; one curve each for RSA-32, RSA-64 and RSA-128.)
VI. CONCLUSION
Significant numbers of hardware IPs and crypto accelerators
are being deployed with the widespread adoption of IoT. It
is vitally important that these IPs are provably secure. We
have proposed a novel approach to discover timing SCA
vulnerabilities that (can) exist in designs. This flow also
(automatically) eliminates the information leakage caused by the
timing channel. The insertion of a lightweight Compensator
Block removes the timing channel with minimum modifications
to the design and with no impact on the clock cycle time
or combinational delay of the critical path in the circuit. In
future work, designs with multiple secrets or multiple public
interfaces will be studied. We will also integrate this
framework into a High Level Synthesis flow so that more accurate
estimates of area can be obtained.
ACKNOWLEDGEMENTS
This research was supported in part by projects H2020 MSCA
ITN RESCUE funded from the EU H2020 programme under the
MSC grant agreement No. 722325, by the Estonian Ministry of
Education and Research institutional research grant no. IUT19-1 and
by the European Union through the European Structural and Regional
Development Funds.
REFERENCES
[1] D. E. Denning, “A lattice model of secure information flow,” Commun.
ACM, vol. 19, no. 5, pp. 236–243, 1976.
[2] J. Ming, D. Wu, G. Xiao, J. Wang, and P. Liu, “Taintpipe: Pipelined
symbolic taint analysis,” in 24th USENIX Security Symposium (USENIX
Security 15). Washington, D.C.: USENIX Association, 2015, pp. 65–
80.
[3] F. Koeune and F.-X. Standaert, “Foundations of security analysis and
design iii,” A. Aldini, R. Gorrieri, and F. Martinelli, Eds. Berlin,
Heidelberg: Springer-Verlag, 2005, ch. A Tutorial on Physical Security
and Side-channel Attacks, pp. 78–108.
[4] B. Coppens, I. Verbauwhede, K. D. Bosschere, and B. D. Sutter,
“Practical mitigations for timing-based side-channel attacks on modern
x86 processors,” in 30th IEEE Symposium on Security and Privacy (S&P
2009), 17-20 May 2009, Oakland, California, USA, 2009, pp. 45–60.
[5] K. Tiri and I. Verbauwhede, “A vlsi design flow for secure side-channel
attack resistant ics,” in Design, Automation and Test in Europe, March
2005, pp. 58–63 Vol. 3.
[6] F. Menichelli, R. Menicocci, M. Olivieri, and A. Trifiletti, “High-level
side-channel attack modeling and simulation for security-critical systems
on chips,” IEEE Transactions on Dependable and Secure Computing,
vol. 5, no. 3, pp. 164–176, July 2008.
[7] A. Ardeshiricham, W. Hu, J. Marxen, and R. Kastner, “Register transfer
level information flow tracking for provably secure hardware design,” in
Proceedings of the Conference on Design, Automation & Test in Europe,
ser. DATE ’17. 3001 Leuven, Belgium, Belgium: European Design and
Automation Association, 2017, pp. 1695–1700.
[8] M. Bidmeshki and Y. Makris, “Toward automatic proof generation for
information flow policies in third-party hardware ip,” in 2015 IEEE
International Symposium on Hardware Oriented Security and Trust
(HOST), vol. 00, May 2015, pp. 163–168.
[9] D. Zhang, Y. Wang, G. E. Suh, and A. C. Myers, “A hardware design
language for timing-sensitive information-flow security,” SIGPLAN Not.,
vol. 50, no. 4, pp. 503–516, Mar. 2015.
[10] S. Deng, W. Xiong, and J. Szefer, “Cache timing side-channel vulner-
ability checking with computation tree logic,” in Proceedings of the
7th International Workshop on Hardware and Architectural Support for
Security and Privacy, ser. HASP ’18. New York, NY, USA: ACM,
2018, pp. 2:1–2:8.
[11] E. Oswald, S. Mangard, N. Pramstaller, and V. Rijmen, “A side-
channel analysis resistant description of the aes s-box,” in Fast Software
Encryption, H. Gilbert and H. Handschuh, Eds. Berlin, Heidelberg:
Springer Berlin Heidelberg, 2005, pp. 413–423.
[12] L. S. Indrusiak, J. Harbin, and M. J. Sepúlveda, “Side-channel attack
resilience through route randomisation in secure real-time networks-on-
chip,” CoRR, vol. abs/1607.03450, 2016.
[13] Z. Jiang, S. Dai, G. E. Suh, and Z. Zhang, “High-level synthesis with
timing-sensitive information flow enforcement,” in Proceedings of the
International Conference on Computer-Aided Design, ser. ICCAD ’18.
New York, NY, USA: ACM, 2018, pp. 88:1–88:8.
[14] JasperGold Security Path Verification App. [Online]. Available:
https://www.cadence.com/content/cadence-www/global/en US/home/
tools/system-design-and-verification/formal-and-static-verification/
jasper-gold-verification-platform.html
[15] S. Peter and T. Givargis, “Towards a timing attack aware high-level
synthesis of integrated circuits,” in 34th IEEE International Conference
on Computer Design, ICCD 2016, Scottsdale, AZ, USA, October 2-5,
2016, 2016, pp. 452–455.
