Side channel analysis of stream cipher hardware by Anderson, Jonathan




Side Channel Analysis of 
Stream Cipher Hardware 
by 
© Jonathan Anderson, B.Eng. 
A thesis submitted to the 
School of Graduate Studies 
in partial fulfillment of the 
requirements for the degree of 
Master of Engineering in Computer Engineering 
Faculty of Engineering and Applied Science 
Memorial University of Newfoundland 
September 2008 
St. John's, Newfoundland
Abstract 
In today's world of ubiquitous connectivity, communications security is an
ever-present concern. In order to protect sensitive information from eavesdropping by
foreign governments, identity thieves and other curious individuals and organizations,
cryptography is today deployed on a wide scale. No longer strictly the domain
of large banks and governments, cryptographic systems are found in such everyday
places as building passes and vehicle ignition keys. Cryptanalysis is the study of
methods - called attacks - that can be used to extract secret information from these
cryptographic systems. It is largely a statistical discipline, but out of it has grown a
more hands-on approach: side channel analysis.

Side channel analysis is an exciting field of study which attempts to extract secret
information from cryptographic systems through the careful measurement of physical
characteristics such as power usage and execution time. These characteristics provide
"side channels" of information flow that algorithm designers may not anticipate.
This research focuses on the power side channel, which extracts information from the
instantaneous power either used or radiated by a cryptographic system. Traditional
forms of power analysis are ineffective against a large class of ciphers called stream
ciphers, but a recently-introduced group of techniques - template attacks - has
been shown to be effective against microcontroller-based implementations of stream
ciphers.
This thesis describes the theory behind template attacks, and describes how we
have applied them to perform power analysis of hardware implementations of stream
ciphers. We have built hardware for this purpose, called the Side Channel Analysis
Board (SCAB), as well as designed software to perform the necessary analysis. We
used our experimental setup to measure the power usage of FPGA-based hardware
- specifically the Actel ProASIC3 - running a stream cipher building block called
LFSR-16. We have also simulated and analysed the power usage of LFSR-16 and a
functional stream cipher, Trivium. Trivium is a hardware-focused stream cipher that
was vetted by the recent eSTREAM initiative, and is thus of great importance. In
both simulation and hardware, we were able to extract secret key information with
a probability greater than we would expect to achieve through random guessing. In
the case of the cipher building block LFSR-16, we were able to correctly classify
four key bits with accuracy greater than 90%. In the case of the stream cipher
Trivium, average classification success exceeded 20% where random guessing would
have achieved a success rate of just 6.25%.

Thus, we may state that the template attack technique is applicable to hardware-based
stream ciphers, and that implementers of such ciphers must be aware of such
techniques and attempt to apply appropriate countermeasures where possible.
Acknowledgements 
Chrissy, my wife, has been an unfailing source of encouragement and joy.
Dr. Howard Heys, my supervisor, has provided me with the freedom to explore
and the guidance to succeed.
Mr. Chris Batten, of MUN Technical Services, lent his invaluable aid in assembling
the SCAB platform. After a design flaw was discovered in SCAB Mk I, his steady
hand re-routed a single signal from a 208-pin surface-mount chip, and saved a month's
worth of refabrication.
This research was supported by the Natural Sciences and Engineering Research
Council (NSERC), through the Canada Graduate Scholarship and Discovery Grant
programs.
Soli Deo gloria.
Contents 
1 Introduction
2 Background
2.1 Cryptography
2.1.1 Goals and Actors
2.1.2 Ciphers and Attacks
2.1.2.1 Cryptanalysis
2.1.2.2 Public Key Cryptography
2.1.2.3 One-Time Pad
2.1.2.4 Block Ciphers
2.1.2.5 Stream Ciphers
2.2 Side Channel Analysis
2.2.1 Timing Analysis
2.2.2 Fault Analysis
2.2.3 Power Analysis & Electromagnetic Analysis
2.3 Summary
3 Template Attacks
3.1 Attack Overview
3.2 Attack Details
3.2.1 The Multivariate Normal Distribution
3.2.2 Maximum Likelihood Estimators
3.2.3 Signal Classification
3.2.4 Template Masking
3.3 Attack Application
3.3.1 Inapplicability of DPA
3.3.2 Applicability of Template Attacks
3.3.3 Applicability to Hardware Implementations
3.4 Summary
4 Experimental Setup
4.1 SCAB - Side Channel Analysis Board
4.1.1 Design Constraints
4.1.2 Power Analysis
4.1.3 Fault Analysis
4.1.4 Timing Analysis
4.2 Other Hardware
4.3 Measurement Equipment
4.4 Software
4.4.1 Power Trace Formatting
4.4.2 Calculating Trace Mean Vectors
4.4.3 Simulating Power Usage
4.4.4 Viewing Power Traces
4.4.5 Building Templates
4.4.6 Classifying Power Traces
4.4.7 Evaluating Classification Success Rate
4.5 Summary
5 Experimental Results and Analysis
5.1 Initial Experiments
5.2 LFSR-16
5.2.1 Simulation Results
5.2.2 Experimental Results
5.3 Summary
6 Application of Template Attack to Trivium
6.1 Description
6.2 Simulation Results
6.2.1 Classification Success Rate vs. Template Size
6.2.2 Classification Success vs. Training Samples
6.2.3 Classification Success Rate vs. Bits Under Attack
6.3 Trivium Hardware
6.4 Summary
7 Conclusions
A Detailed Results
A.1 Simulation
A.1.1 LFSR-16
A.1.2 Trivium
A.1.2.1 One Key Bit
A.1.2.2 Two Key Bits
A.1.2.3 Four Key Bits
A.1.2.4 Eight Key Bits
A.2 Physical Measurement
B Software Data Formats
B.1 Cleverscope Text Files
B.1.1 Header
B.1.2 Body
B.2 Analog Trace Files
B.2.1 Text
B.2.2 Binary
B.3 Digital Trace Files
B.3.1 Text
B.3.2 Binary
B.4 Power Usage Files
B.5 Power Simulation
B.5.1 Power Model
B.5.2 Cipher Model
List of Tables 
3.1 Stream ciphering operations
5.1 Power usage characteristics
5.2 Classification success rate vs. bits under attack
A.1 16 training samples per operation (10^-8 W noise)
A.2 16 training samples per operation (10^-7 W noise)
A.3 16 training samples per operation (10^-6 W noise)
A.4 16 training samples per operation (10^-5 W noise)
A.5 16 training samples per operation (10^-4 W noise)
A.6 16 training samples per operation (10^-3 W noise)
A.7 16 training samples per operation (0.01 W noise)
A.8 16 training samples per operation (0.1 W noise)
A.9 16 training samples per operation (1 W noise)
A.10 32 training samples per operation (10^-8 W noise)
A.11 32 training samples per operation (10^-7 W noise)
A.12 32 training samples per operation (10^-6 W noise)
A.13 32 training samples per operation (10^-5 W noise)
A.14 64 training samples per operation (10^-8 W noise)
A.15 64 training samples per operation (10^-7 W noise)
A.16 64 training samples per operation (10^-6 W noise)
A.17 64 training samples per operation (10^-5 W noise)
A.18 64 training samples per operation (10^-4 W noise)
A.19 64 training samples per operation (10^-3 W noise)
A.20 64 training samples per operation (10^-2 W noise)
A.21 64 training samples per operation (0.1 W noise)
A.22 64 training samples per operation (1 W noise)
A.23 128 training samples per operation (10^-6 W noise)
A.24 128 training samples per operation (10^-5 W noise)
A.25 256 training samples per operation (10^-7 W noise)
A.26 256 training samples per operation (10^-5 W noise)
A.27 Trivium results - attacking one key bit
A.28 Trivium results - attacking two key bits
A.29 Trivium results - attacking four key bits, 64 samples, 10^-8 peak noise
A.30 Trivium results - attacking four key bits, 64 samples, 10^-7 peak noise
A.31 Trivium results - attacking four key bits, 256 samples, 10^-8 peak noise
A.32 Trivium results - attacking four key bits, 256 samples, 10^-7 peak noise
A.33 Trivium results - attacking four key bits, 1024 samples, 10^-8 peak noise
A.34 Trivium results - attacking four key bits, 4096 samples, 10^-8 peak noise
A.35 Trivium results - attacking eight key bits
A.36 Physical measurement results
List of Figures 
2.1 Alice, Bob and Eve
2.2 The one-time pad in operation
2.3 Block cipher operation
2.4 Stream cipher operation
2.5 An abstract model of a cipher
2.6 A more realistic model of a cipher
2.7 Data dependent branching
2.8 Power analysis and electromagnetic analysis
2.9 Simple Power Analysis
3.1 Inter-operation mean and standard deviation vectors for actual hardware
3.2 DPA key guesses
3.3 DPA bit guess
3.4 DPA trace differences
3.5 Difference of averages - 1 sample
3.6 Difference of averages - 50 samples
4.1 SCAB - Side Channel Analysis Board
4.2 PCB Layout for SCAB
4.3 Experimental setup
4.4 Switch debouncing circuit
4.5 Cleverscope PC interface
4.6 Workflow - data files
4.7 Partitioning trace
4.8 traceview showing the contents of a Cleverscope file
4.9 traceview used to select subtrace mask
4.10 classify output
4.11 success output
5.1 The "FlipFlopper" Circuit
5.2 FlipFlopper output and instantaneous power usage
5.3 Design of LFSR-16
5.4 Basic statistics of simulated LFSR-16
5.5 Classification success vs. template size
5.6 Classification success vs. template size
5.7 Classification success vs. peak noise
5.8 Inter-operation statistics: varying bits 0-3
5.9 Inter-operation statistics: varying bits 4-7
5.10 Inter-operation statistics: varying bits 8-11
5.11 Inter-operation statistics: varying bits 12-15
5.12 Classification success vs. training samples
5.13 Classification success vs. template size
5.14 Hardware LFSR-16 statistics
6.1 Trivium
6.2 Trivium initialization
6.3 Trivium keystream generation
6.4 Classification success vs. template size - Trivium
6.5 Classification success vs. training samples - Trivium
6.6 Trivium classification success vs. bits being attacked
6.7 Trivium information leakage
B.1 Cleverscope text file example
B.2 Example of a text-based AnalogTrace file
B.3 Writing a binary AnalogTrace file
B.4 Example of a binary AnalogTrace file
B.5 Example of a text-based DigitalTrace file
B.6 Writing a binary AnalogTrace file
B.7 Example of a binary AnalogTrace file
B.8 Example of a binary PowerUsage file
B.9 PowerUsageModel interface
B.10 Cipher interface
Chapter 1 
Introduction 
The world today is more connected than it has ever been. Business employees log
into corporate computers from home via Virtual Private Networks (VPNs), banking
customers access their accounts via mobile phones and billions of dollars are spent
in online shopping and auctions. With all of this sensitive information flowing across
public networks, the incentive for criminals and others to eavesdrop is very high, so
security is a top priority.

The study of securing communications is cryptography, and it is concerned with
two central problems: how to safeguard secret messages, and how to bypass the
safeguards of others. The solution to each problem benefits the other, as we cannot
build or select security tools without understanding the attacks that may be applied
against them. With this principle in mind, in the work presented in this thesis we
proceed to attack ciphers that have been implemented in digital hardware, in an effort
to circumvent their protections and extract secret information.

The primary tools of cryptography are ciphers, which perform encryption (to
protect information that is to be kept secret) and decryption (to render encrypted
data readable again). These ciphers can be classified as belonging to one of two sets:
block ciphers or stream ciphers. There are different applications for these ciphers, but
both are important. In 2001, the US National Institute of Standards and Technology
(NIST), after a competitive process, published the Advanced Encryption Standard
(AES) [1], which has become the de facto global standard for block ciphers. In
2008, the European Union's eSTREAM process identified a portfolio of strong stream
ciphers - F-FCSR-H v2 [2], Grain v2 [3], MICKEY v2 [4] and Trivium [5] - and it is
to this more recently recognized group that we turn our attention.

Our goal, then, is to extract secret information from stream ciphers surreptitiously;
i.e. to attack them. Rather than the traditional (and well-studied) method
of cryptanalysis, whereby mathematical relationships are found among secret information
and encrypted data, we turn to the newer approach of side channel analysis
[6], which extracts secret information from careful measurement of physical quantities
such as power consumption.

Traditional forms of side channel analysis are often ineffective against stream ciphers,
but a recent class of techniques known as template attacks [7] has proved effective
against microcontroller-based implementations of stream ciphers (see Chapter 3).
Microcontrollers, however, are large, complex systems. The question before us was,
could such attacks be effective against hardware implementations of cryptographic
systems? Could we demonstrate their efficacy, not just against a theoretical model of
power usage, but against physical hardware? Such a demonstration would impact the
design and implementation of stream cipher hardware in embedded devices such
as smart cards and RFIDs, which could in turn affect the payment and authentication
technology sectors.

We built both hardware and software in an attempt to answer these questions.
This experimental setup, which comprises a custom FPGA-bearing PCB, a
purpose-bought mixed-signal oscilloscope and thousands of lines of analysis software,
is described in detail in Chapter 4.

Finally, we discovered the answers to our questions: yes, template attacks are effective
against the power usage of hardware cryptosystems, and yes, this effectiveness
can be demonstrated using physical hardware.
Chapter 2 
Background 
2.1 Cryptography 
The word cryptography comes from the Greek κρυπτός (secret) and γράφω (writing)
[8]. Cryptography is the "science and art of designing ciphers," [9] which are used in
many applications to make secret the messages communicated among two or more
parties. A basic understanding of cryptography, and its goals, is requisite to understanding
the purpose and methodology of the attack that we will present in Chapter
3, and whose results we will give in Chapter 5.
2.1.1 Goals and Actors
Cryptography has many goals, including confidentiality (the ability to keep secrets
from those whom we wish not to know them), integrity (the ability to verify that
messages have not been altered), authentication (the ability to verify the identity of
a message's sender) and non-repudiation (the ability to prove that a party sent a
message, even if they choose to deny it later). To illustrate these goals, we will
introduce three characters who figure prominently in the literature:
Alice, Bob and Eve.
Alice and Bob. In the literature, Alice and Bob are often used to represent any two
parties who wish to communicate in a secure manner [10, 11]. Since their communications
are of a sensitive nature, they use cryptographic tools to protect the content
of their messages from being discerned by eavesdroppers (e.g. learning the name of
a reporter's source), to prevent adversaries from making undetectable changes to the
substance of their messages (e.g. changing a beneficiary's name in a will), and if
desired, to prevent each other from later denying that they sent a particular message
(e.g. an agreement to pay for a good or service). Stated more formally, they use
cryptography to provide their communications with confidentiality, integrity,
authentication and non-repudiation.

Eve. Eavesdroppers are commonly represented by an actor named Eve. Eve is assumed
to have complete access to the communications channels that Alice and Bob are
using, even the ability to send messages to one or both parties, but good cryptography
will prevent her from understanding what Alice and Bob communicate (violating
confidentiality), changing the meaning of messages (violating integrity), masquerading
as either Alice or Bob (falsifying authentication) or helping either party deny their
communication (repudiating transactions).
Figure 2.1: Alice, Bob and Eve 
2.1.2 Ciphers and Attacks 
The primary cryptographic tool used to provide confidentiality is the cipher. A cipher
transforms information that we wish to remain confidential - the plaintext - into a 
stream of data - the ciphertext - that can be safely transmitted via untrusted channels 
such as public networks. This transformation is called encryption, and it - as well as 
the reverse transformation, decryption - is parametrized by secret information called 
the key. Without this key, an adversary in possession of ciphertext material should 
not be able to decrypt any of the ciphertext to read the original plaintext. 
2.1.2.1 Cryptanalysis 
The field of cryptanalysis is dedicated to finding weaknesses in cryptographic algorithms
such as ciphers, whether for the purposes of better understanding cipher design
(as in academic settings) or eavesdropping on secret communications (as in some
industrial or governmental settings). There are several methods that can be used to
attack a cipher, all of which assume that the attacker knows the cipher being used
[12]:

Ciphertext-only attack. In this type of attack, it is assumed that the attacker has
access to ciphertext, as well as knowledge of the cipher algorithm. It should be
computationally infeasible for the attacker to ascertain any plaintext or key information.
The most obvious such attack is an exhaustive search (colloquially, a "brute
force" attack). In this approach, the attacker checks every possible key to see if it can
be used to decrypt the given ciphertext into an intelligible plaintext. This approach
is very inefficient: for an n-bit key, the attacker must on average search half of the
key space, so the expected number of keys the attacker must search, N, is

$$N = 2^{n-1} \qquad (2.1)$$

For the block cipher DES (the Data Encryption Standard) [13], which uses a 56-bit key,
an attacker can expect to search through $2^{55} \approx 36 \times 10^{15}$ keys. This can be
achieved today using dedicated hardware such as the "Deep Crack" machine [14], so
newer encryption standards use longer keys [1]. For instance, the smallest key
permissible for use with AES is 128 bits [1], so we would expect an exhaustive search
to take $N = 2^{127} \approx 1.7 \times 10^{38}$ decryption operations. An attacker would have
to be able to search on the order of $10^{21}$ keys per second in order to expect to finish
this search before the death of our sun [15].
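The arithmetic behind Equation 2.1 can be checked with a short script. This is a minimal sketch rather than part of the thesis software: the function names are hypothetical, and the figure of five billion years for the remaining life of the sun is an assumed round number, not a value taken from [15].

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def expected_keys_searched(key_bits: int) -> int:
    """Expected number of keys tried in an exhaustive search: half the key space."""
    if key_bits < 1:
        raise ValueError("key_bits must be at least 1")
    return 2 ** (key_bits - 1)


def required_rate(key_bits: int, years_available: float) -> float:
    """Keys per second needed to finish the expected search in the given time."""
    return expected_keys_searched(key_bits) / (years_available * SECONDS_PER_YEAR)


if __name__ == "__main__":
    # ASSUMPTION: 5e9 years is a round figure for the remaining life of the sun.
    for bits in (56, 128):
        keys = expected_keys_searched(bits)
        rate = required_rate(bits, 5e9)
        print(f"{bits}-bit key: {keys:.2e} keys expected, {rate:.2e} keys/s needed")
```

For a 56-bit key the expected search is $2^{55}$ keys; for a 128-bit key it is $2^{127}$, roughly $1.7 \times 10^{38}$.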
Known-plaintext attack. In a known-plaintext attack, the attacker has knowledge
of the cipher algorithm, ciphertext and corresponding plaintext. Even with full
knowledge of cipher input and output, it should still be computationally infeasible
for the attacker to determine the key (or to decrypt later ciphertext).

Chosen-plaintext attack. In this most powerful type of theoretical attack, not
only does the attacker have full knowledge of cipher input and output, but she can
actually choose plaintexts that are convenient for her purposes. A secure cipher will
resist chosen-plaintext attacks - it will still be computationally infeasible for the
attacker to determine any key information, or to be able to decrypt later ciphertexts
whose plaintexts are not known to the attacker.
Implementation attack. Beyond the realm of strict cryptanalysis - attacks on
cipher algorithms - there is also a class of attacks that exploit physical properties of
cipher implementations. Such implementation attacks include timing analysis, fault
analysis, power analysis and electromagnetic analysis, and will be discussed in Section
2.2.
2.1.2.2 Public Key Cryptography
One of the fundamental problems of classical cryptography was the key distribution
problem [16]. People separated by long distances could protect their communication
via ciphers, but this protection was meaningless unless a secret key could be securely
communicated. Banks and governments could use trusted couriers and diplomatic
pouches, but such methods were beyond the means of private individuals.

Key distribution remained an open problem until the 1970s, when public-key
cryptography was invented. Public-key cryptography uses one-way mathematical functions
- functions whose inverses, e.g. discrete logarithms, are very hard to calculate - in
such a way that encryption can be performed by anyone, using a public key, but
decryption is only feasible for the owner of a secret key. The quintessential public-key
cryptosystem is RSA, named for its authors: Rivest, Shamir and Adleman [16]. With
such a system, encryption keys could be published openly, largely solving the key
distribution problem.

The focus of this thesis, however, is stream ciphers, which use symmetric keys.
Symmetric-key (or secret-key) cryptography uses the same, secret key for both
encryption and decryption. Symmetric-key ciphers are still important, as public-key
cryptography is very computationally complex, and is thus often used for the purposes
of setting up a session key - a secret key that traditional, lower-complexity
cryptosystems can use to provide confidentiality for a session. This is the premise
behind systems such as PGP - Pretty Good Privacy [17].
2.1.2.3 One-Time Pad 
During World War I, Vernam proposed the idea of a simple cipher that could not be
broken: the one-time pad [18]. Shannon subsequently demonstrated in [12] that this
cipher did indeed provide perfect secrecy - if the key is truly random, then intercepting
ciphertext provides the attacker with no information about the plaintext.

The critical requirement for perfect security is that the set of possible keys be
at least as large as the set of possible plaintexts. In a one-time pad, a long stream
of random bits is generated and distributed to both communicating parties (e.g. an
embassy's key could be encoded on optical tape and shipped in a diplomatic bag [9]).
When a message is encrypted, each plaintext symbol is added to a symbol of key
using Galois Field arithmetic, and that portion of key is discarded, never to be used
again. Decryption occurs via the inverse process: each ciphertext symbol is added
to the Galois Field inverse of an identical keystream symbol - which is afterwards
discarded - to produce the original plaintext. If the symbol alphabet is in GF(2),
then both encryption and decryption are simply the XOR operation.
Since there is as much key material as plaintext, it is impossible to "break" the
cipher, provided that the key material is truly random. If the plaintext and key both
have an alphabet of L symbols, then there are $L^N$ possible plaintexts and $L^N$
possible keys, where N is the number of symbols transmitted. From the ciphertext
"GDIFBALDKRPDFZLSB" it is impossible to know which of the 17-letter plaintexts
"MEETMEATNINETODAY", "MAXSMARTISAGENT86" or even "LOVEYOUSWEETHEART"
is correct, as each of their corresponding keys is equally likely
to be correct - as shown in Figure 2.2.
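The GF(2) case of the one-time pad, and the reason every candidate plaintext is equally plausible, can be illustrated in a few lines. This is a minimal sketch under stated assumptions: it works on bytes with bitwise XOR rather than on the letter alphabet of Figure 2.2, and the function names `otp_xor` and `key_explaining` are hypothetical.

```python
import secrets


def otp_xor(data: bytes, pad: bytes) -> bytes:
    """Encrypt or decrypt by XORing each byte with one byte of pad (GF(2) addition)."""
    if len(pad) != len(data):
        raise ValueError("pad must be exactly as long as the data")
    return bytes(d ^ k for d, k in zip(data, pad))


def key_explaining(ciphertext: bytes, candidate_plaintext: bytes) -> bytes:
    """Return the pad under which the ciphertext decrypts to the candidate plaintext."""
    return otp_xor(ciphertext, candidate_plaintext)


if __name__ == "__main__":
    plaintext = b"MEETMEATNINETODAY"
    pad = secrets.token_bytes(len(plaintext))  # truly random key material, used once
    ciphertext = otp_xor(plaintext, pad)
    assert otp_xor(ciphertext, pad) == plaintext

    # Every same-length guess is consistent with *some* pad, so the
    # ciphertext alone cannot distinguish between the guesses.
    for guess in (b"MAXSMARTISAGENT86", b"LOVEYOUSWEETHEART"):
        alternative_pad = key_explaining(ciphertext, guess)
        assert otp_xor(ciphertext, alternative_pad) == guess
```

Because a pad exists for every candidate plaintext of the same length, and all pads are equally likely, an eavesdropper gains no information from the ciphertext.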
Plaintext   M E E T M E A T N I N E T O D A Y
Key         U A E M ...
Ciphertext  G D I F B A L D K R P D F Z L S B

Plaintext   M A X S M A R T I S A G E N T 8 6
Key         U D L N ...
Ciphertext  G D I F B A L D K R P D F Z L S B

Plaintext   L O V E Y O U S W E E T H E A R T
Key         V P N B ...
Ciphertext  G D I F B A L D K R P D F Z L S B

Figure 2.2: The one-time pad in operation

Because of the logistical costs of generating and distributing vast amounts of
key material, the one-time pad is not used extensively outside of diplomatic and
intelligence circles [9], but we will see in Section 2.1.2.5 how its principles are applied
in many practical ciphers.
2.1.2.4 Block Ciphers
The last fifty years of symmetric-key cryptography have been dominated by the block
cipher. A block cipher is a cipher that operates on fixed-size blocks of data (typically
of 64 or 128 bits), transforming plaintext blocks into ciphertext blocks (encryption)
or vice versa (decryption). An example is the Advanced Encryption Standard [1], a
block cipher selected in 2001 to be an official standard of the US National Institute
of Standards and Technology.
As shown in Figure 2.3, the unit of communication is a block of ciphertext. An
eavesdropper cannot decrypt a block without the secret key, providing confidentiality,
as a single bit difference between the correct key and the key guess will render the
entire block undecryptable.

Figure 2.3: Block cipher operation
(Diagram: a plaintext block and the key enter the encryption module E, which outputs a
ciphertext block; the ciphertext block and the key enter the decryption module D, which
outputs the plaintext block.)

There are several modes of operation for block ciphers,
but all operate with two common characteristics:
• Complexity
- Block ciphers are often large, complex hardware systems whose power usage
can vary greatly, depending on input elements such as plaintext and
secret key
• Key Usage
- Because of the overhead associated with changing encryption keys (both in
key management and cipher setup), block ciphers perform many
encryptions/decryptions with a single key
While neither of these characteristics interferes with block ciphers' ability to operate
securely in a theoretical sense, they will become important when we discuss
implementation attacks in Section 2.2.
2.1.2.5 Stream Ciphers 
Another class of symmetric-key ciphers, typically associated with resource-constrained
environments, is the stream cipher. This class of cipher, shown in Figure 2.4, uses
identical encryption and decryption modules, each of which produces a very long
(e.g. $2^{80}$ bits) pseudo-random stream of symbols that is parametrized or seeded by a
public bit vector - the initialization vector - and a secret key. This pseudo-random
stream is called the keystream, and it is an approximation of the one-time pad
described in Section 2.1.2.3. An example of such a cipher is Trivium [5], part of the
eSTREAM portfolio of stream ciphers. Trivium and the eSTREAM portfolio will be
addressed in detail in Chapter 6.
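Figure 2.4: Stream cipher operation
(Diagram: at the transmitter, the key and initialization vector seed a keystream
generator E whose output is XORed with the plaintext to give the ciphertext; the
ciphertext crosses the channel; at the receiver, an identical generator E, seeded with
the same initialization vector and key, is XORed with the ciphertext to give the
plaintext.)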
As in the case of the one-time pad, the keystream is added to the plaintext using
Galois Field arithmetic - typically in GF(2), which is the binary XOR - to produce a
ciphertext stream. The ciphertext is transmitted through the communication channel,
where it is added to another keystream to produce plaintext again. If the secret keys
at the transmitter and receiver are identical, then their keystreams will be as well, so
the original plaintext will be recovered. If the keys differ, however, even by a single
bit, then the two keystreams will be very different, and the original plaintext will not
be recovered from the ciphertext.
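The encrypt/decrypt symmetry described above can be sketched with a toy keystream generator. The generator below is an assumption made purely for illustration: a bare 16-bit Fibonacci LFSR with the commonly used taps 16, 14, 13 and 11, seeded with the XOR of a key and an initialization vector. It is not the LFSR-16 design studied in this thesis, and a bare LFSR is linear and offers no real security.

```python
def keystream(key: int, iv: int, nbits: int) -> list[int]:
    """Toy keystream: 16-bit Fibonacci LFSR (taps 16, 14, 13, 11) seeded by key XOR iv."""
    state = (key ^ iv) & 0xFFFF
    if state == 0:
        raise ValueError("key XOR iv must be non-zero (an all-zero LFSR never changes)")
    bits = []
    for _ in range(nbits):
        bits.append(state & 1)  # output the low bit
        feedback = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
        state = (state >> 1) | (feedback << 15)
    return bits


def crypt(bits: list[int], key: int, iv: int) -> list[int]:
    """Encrypt or decrypt: XOR each data bit with one keystream bit (addition in GF(2))."""
    return [b ^ k for b, k in zip(bits, keystream(key, iv, len(bits)))]


if __name__ == "__main__":
    plaintext = [1, 0, 1, 1, 0, 0, 1, 0] * 4
    key, iv = 0xACE1, 0x1234          # hypothetical example values
    ciphertext = crypt(plaintext, key, iv)
    assert crypt(ciphertext, key, iv) == plaintext       # same key: recovered
    assert crypt(ciphertext, key ^ 1, iv) != plaintext   # one key bit wrong: not recovered
```

The same function serves for both directions because XOR is its own inverse; a real stream cipher replaces the LFSR with a generator whose output cannot be predicted without the key.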
Two characteristics of stream ciphers that will become important in Section 2.2
are: 
• Complexity 
- Stream ciphers are typically very simple systems, and the power used by 
their hardware implementations does not vary as greatly as that of block 
ciphers 
• Key Usage 
- The internal state of many stream ciphers (e.g. Grain [3] and Trivium [5]) 
is initialized with the secret key, but continually changes in such a way 
that key information is "mixed in" to the state, so no two bits of keystream 
are generated from the same internal state. 
2.2 Side Channel Analysis
When cryptographers design a new cipher, the approach that they take is often very 
abstract; many cryptographers would agree with [19] when it says that "essentially 
a block cipher is a keyed permutive mapping (encryption) together with its inverse 
(decryption)". Such an abstract, mathematical model of a cipher appears in Figure 
2.5. 
Figure 2.5: An abstract model of a cipher
(Diagram: the encryption module E takes the plaintext and the key and produces the
ciphertext; the key is secret, while the plaintext and ciphertext are known to the
attacker.)
In this model, one assumes that the cipher algorithm is known to potential attackers,
and possibly even pairs of plaintext (data to be encrypted) and ciphertext
(encrypted data). The key, which parametrizes the cipher, is not known to the attacker;
it is this key that the attacker attempts to find using mathematical relationships
between the plaintext and ciphertext.

No cryptographic function, however, exists as merely an abstract algorithm; it is
not useful until it has been implemented in hardware or software. The operation of an
actual cryptographic cipher implementation can yield information about its internals
that the designer did not expect or plan for. This information is said to flow through
"side channels", which include:
• power usage - how much power a device uses
• electromagnetic radiation - how much power a device radiates
• execution time - how long an operation takes
• response to faults - how the device reacts to intentionally-induced errors
A more realistic cipher model which incorporates these side channels is shown in
Figure 2.6. Careful measurement and analysis of the signals in these channels can
capture information "leaking" out of the cipher. The techniques that cryptographers
use to exploit this leakage are collectively known as side channel analysis [6].
Figure 2.6: A more realistic model of a cipher
(Diagram: as in Figure 2.5, the encryption module E takes the plaintext and the secret
key and produces the ciphertext; in addition, the attacker can observe execution time,
power usage and EM radiation, and can induce faults and observe the response to them.)
Most techniques of side channel analysis require some level of physical access to 
the cryptographic device under attack. This was once an implausible assumption, but 
as cryptography moves from secure server rooms to notebook PCs to smart cards in 
our wallets, it becomes an increasingly realistic and important component of security 
threat models. 
2.2.1 Timing Analysis
Timing Analysis, first demonstrated in [20], uses very precise measurement of algorithm
execution time in order to infer bits of data upon which the algorithm is
operating. At first glance, there may not seem to be a correlation between these two 
things, but in fact, execution time can be related to: 
• key- or data-dependent branch instructions 
• cache hits 
• long processor instructions (e.g. multiplication) 
For instance, in public-key cryptography, mathematical operations are often performed
on very large (512- or 1024-bit) integers. In order to improve performance,
many public-key implementations will use conditional (if/else) software instructions
that depend on key or data bits, as in Figure 2.7.
for (i = ...) {
    if ((input & (1 << i)) != 0) {   /* branch depends on a data bit */
        output <<= 1;
        output *= input;
    }
}
Figure 2.7: Data dependent branching 
This pseudocode, which could be part of a large-integer exponentiator, has two
lines that only execute if a particular input bit is 1. If the input and output
variables in this pseudocode are 512-bit integers, there will be a very significant
difference in execution time depending on how many bits of input are 1.
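The effect can be illustrated with a minimal sketch. It is not the routine of Figure 2.7: it assumes a textbook left-to-right square-and-multiply exponentiation, and the function name, base and modulus are hypothetical values chosen for illustration. Counting multiplications stands in for measuring execution time.

```python
def square_and_multiply(base, exponent, modulus):
    """Left-to-right binary exponentiation that also counts multiplications."""
    result = 1
    multiplications = 0
    for i in reversed(range(exponent.bit_length())):
        result = (result * result) % modulus        # always executed
        multiplications += 1
        if (exponent >> i) & 1:                     # data-dependent branch
            result = (result * base) % modulus      # executed only for 1 bits
            multiplications += 1
    return result, multiplications


# Two 16-bit exponents of equal length but different Hamming weight
# (illustrative values, not taken from the thesis).
light = 0b1000000000000001   # two bits set
heavy = 0b1111111111111111   # sixteen bits set

r_light, ops_light = square_and_multiply(7, light, 1000003)
r_heavy, ops_heavy = square_and_multiply(7, heavy, 1000003)
print(ops_light, ops_heavy)
```

Both exponents need the same number of squarings, so the difference in the operation count is exactly the difference in the number of 1 bits. With 512-bit operands each extra multiplication is expensive, which is what makes the difference measurable.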
Timing attacks have been applied to block ciphers such as RC5 [21] and are even
applicable to careless implementations of the Advanced Encryption Standard [22].
2.2.2 Fault Analysis 
Fault analysis attempts to induce small (usually single-bit) errors into a cryptographic
computation [23]. The resultant ciphertext can be compared to the ciphertext that
would emerge if there were no error, and the differences between the two can yield
insight into internal bits that should be secret.
Fault Induction In [23], the theoretical effectiveness of fault analysis was demon-
strated, but no concrete results against actual, physical cryptosystems were given. 
Rather, it was assumed the attacker had the power to cause bits in the system to 
"flip" from 0 to 1 or 1 to 0, as required by the attack. This model was first given in 
[24], but more recent work has demonstrated practical fault induction. 
More recently, [25] showed that, if such faults can be generated, then complete
AES keys can be recovered using as few as 128 faulty encryptions. Such faults have
been demonstrated in [26], where such commonplace equipment as lenses and camera
flashes were used to set individual bits of microcontroller memory with precise timing.
This does require opening the packaging of the chip, however. More insidious is the
attack in [27], which claims that arbitrary memory bits may be set or reset by an
attacker. If practical, such attacks would be very difficult to defend against.
Clock Glitches Side channel analysis is often used against the "smart cards" that
control mobile phones, pay TV and satellite receivers, and in some places, even power
meters. These devices are very small, and they usually have no on-board power or
clock sources; they rely on connected equipment, and this can leave them vulnerable
to clock glitches.
In [28], it was demonstrated that by sending a 20MHz pulse to a smart card which
operates at 5MHz, faults could be introduced in the system. In fact, it was shown
that individual instructions executing on a microcontroller could be bypassed by way
of such faults. Hence, the number of encryption rounds could be reduced, making
cryptanalysis trivial.
Chip Rewriting In [28], it was shown that single ROM bits could be overwritten
with a laser cutter microscope. Again, this could be used to attack data, but it is
even more effective to attack the program, reducing encryption rounds to one or two
and allowing for trivial cryptanalysis of the system.
In [29], a focused ion beam is used to cut traces inside of a microchip, or even to
lay down new traces. This equipment is expensive, on the order of millions of dollars,
but it can be rented for much less. Against such a powerful adversary, it is difficult
to imagine countermeasures that would have more efficacy than slowing the attacker
down. Indeed, [29] contains confirmations from "a senior agency official" and "a senior
scientist at a leading chip maker" that the contents of a microchip cannot be kept
secret indefinitely from a skilled, equipped and motivated attacker.
2.2.3 Power Analysis & Electromagnetic Analysis
The power usage and electromagnetic radiation side channels are closely related. 
Power analysis attempts to find internal secrets by correlating them with how much 
power an electronic device is consuming [30]. Electromagnetic analysis attempts to 
find correlations with the power that a device is radiating, either from the entire 
system or from a specific location on a chip [31]. The experimental setups involved
with both can be very simple, as shown in Figure 2.8.
[Figure: (a) power analysis setup, (b) electromagnetic analysis setup]
Figure 2.8: Power analysis and electromagnetic analysis 
Both forms of SCA have their place: power analysis does not have to contend 
with high levels of ambient noise, but electromagnetic analysis allows the attacker to 
focus on a specific part of the device under attack - concentrating on cryptography 
and ignoring unrelated hardware. 
In Figure 2.8(a), we see a hardware device with a small resistor inserted between
its Vcc terminal and the actual Vcc supply. The power consumed by such a device
can easily be calculated as
p(t) = v(t). i(t) = Vt(t) . vh(t) ~ Vt(t). (2.2) 
Figure 2.8(b) shows a small electromagnetic probe receiving radiation from the 
cryptographic device. In both cases, deep memory oscilloscopes are used to record 
the power being consumed or emitted. 
A capture of the power usage over a complete cryptographic operation is referred 
to as a power trace. While it is possible to correlate these traces with internal secrets, 
it is often made difficult by a very low signal-to-noise ratio (SNR). An attacker may 
be interested in whether a particular flip-flop in a cryptographic device changes from 
0 to 1 or from 1 to 0, but there may be thousands of flip-flops in the device, each of 
which has just as much effect on overall power usage as the bit being attacked. In 
order to overcome this SNR problem, increasingly sophisticated methods of analysis 
are being developed. 
Simple Power Analysis (SPA) The first, and simplest, method of power analysis
can be used to analyse the power consumption of microcontroller-based software
implementations. Naive implementations of ciphers may incorporate software techniques
like branching instructions that depend on key bits. Such techniques build a
very high correlation between power usage and individual key bits, as the difference
in power traces where a branch was or was not taken can be obvious even to the
naked eye. In these situations, an attacker may be able to simply examine the power
trace and pick out key bits by observing whether or not power-intensive instructions
following branch instructions were executed.
For instance, Figure 2.9 shows a current trace from a DES operation. Arrows point
to dips in current characteristic of rotation functions, clearly showing the attacker that
one rotation occurred in one round and two in the next. Since the number of rotations
is key-dependent, being able to count rotations gives the attacker information about
the secret key.
[Figure: current trace plotted against time, with arrows marking the dips in current]
Figure 2.9: Simple Power Analysis [30]
This technique is called simple power analysis (SPA) [30], and it relies on a rel-
atively high SNR. It has been used practically, as shown in the SPA attack against 
DES in [30], but it is a very simple matter for a cipher implementation to counteract
this threat: all that is required is for the designer and/or implementer to ensure
that branching instructions do not depend on key bits. This may increase execution 
time, but avoiding key-dependent shortcuts means that the high SNR necessary for 
Simple Power Analysis is not attained, and so SPA is rendered ineffective against the 
implementation. 
Differential Power Analysis (DPA) The differential power analysis (DPA) technique
[30], which is suitable for attacking hardware and software systems, attempts
to overcome low SNR by analysing many power traces which use the same key information
and statistically testing hypotheses concerning internal key bits. The greater
the number of traces used, the higher the resultant SNR, but there is a caveat: the attacker
may have to gather thousands of power traces from the device being attacked,
which may be difficult to acquire without arousing suspicion.
It has been shown that DPA can be used in practical attacks on real cryptosystems
involving block ciphers such as DES [30], but it is often not effective against stream
ciphers [32]. The reason is that, as stated in Section 2.1.2.5, the secret key often only
exists in the cipher's internal state for a few clock cycles, so the analysis described in
[30] does not work unless the attacker can obtain many traces from cipher re-keyings.
Obtaining power traces of several thousand re-keyings, with the same key, from an
in-production device can be prohibitively difficult; this type of key re-use is purposefully
avoided in most protocols to minimize susceptibility to traditional cryptanalysis.
Template Attacks A newer approach to the SNR problem, which removes the
requirement for obtaining thousands of power traces from the device under attack, is
referred to as a template attack, as presented in [7]. We will present this attack in
more detail in Chapter 3.
The template attack takes a two-step approach to power analysis:
1. Template Preparation
A cryptographic device identical to the one under attack is acquired (see footnote 1)
and templates are built, which are multivariate Gaussian models of the noise associated
with particular guesses at key bits.
2. Actual Attack
In this step, a single power trace is collected from an actual device in use and,
for each template, the probability that it belongs to that template is calculated.
Footnote 1: When we say that a device has been acquired, it may be either constructed or
otherwise obtained (e.g. by purchasing the same model of device). It is very realistic to
assume that this is practical, as many cryptographic systems are built with standard
commercial components, such as ISO-standard smart cards [33]; only the secret keys are not
available to the attacker.
Communications engineers will see that this approach is analogous to using matched
filters to resolve received signals. The technique shows much promise, having been
used to successfully attack a microcontroller-based implementation of the stream
cipher RC4 [7]. To date, however, the template approach has not been applied to
hardware-based implementations of stream ciphers.
2.3 Summary 
Cryptography is an important part of daily life in our networked world. One of the 
most fundamental tools of cryptography is the cipher, which provides confidentiality 
for parties wishing to communicate in the presence of an eavesdropping threat. These 
ciphers may be attacked through methods of cryptanalysis, which may be classified 
as ciphertext-only, known-plaintext, chosen-plaintext or implementation attacks. 
Ciphers can be categorized as symmetric-key or asymmetric-key. Among symmetric-key
ciphers, which this thesis is concerned with, there are two broad categories: block
ciphers and stream ciphers. Stream ciphers attempt to approximate the one-time pad
- which has perfect secrecy - by generating long pseudo-random keystreams from secret
keys. Encryption consists of adding this keystream to the plaintext stream, and
decryption consists of adding it to the ciphertext stream. One important stream
cipher today is Trivium, which will be considered in detail in Chapter 6.
Side channel analysis is a broad term for a class of implementation attacks that 
attempt to extract secret information via careful measurement of various physical 
characteristics, such as execution time, response to induced faults and the power con-
sumed or radiated by a system. These measurements can be analysed by inspection, 
partitioning-based statistics and multivariate Gaussian analysis. The latter approach 
is called a template attack, and its details are the subject of Chapter 3. 
Chapter 3 
Template Attacks 
The template attack is a powerful method for extracting secret information from
cryptographic hardware. Chari et al. claimed in [7] that it is "the strongest form of
side channel attack possible in an information theoretic sense" (under certain assumptions
concerning the nature of the side channel - see Section 3.3.2). The attack is
effective when physical access is limited - an attacker needs just one power trace from
the device under attack - and is even effective against stream ciphers, which resist
traditional power analysis techniques such as simple power analysis and differential
power analysis (see Section 4.1.2 for more information on SPA and DPA).
In this thesis, we focus on the power usage side channel, but template attacks
are not inherently limited to power analysis; they can also be applied to other side
channels such as electromagnetic radiation and execution time.
3.1 Attack Overview 
Template attacks operate according to the principles of signal detection, and they are
optimal in the same sense that the matched filter approach is the optimal technique
available in its domain. Unlike traditional power analysis techniques, template attacks
are split into two steps, only one of which actually requires access to the device
under attack [7]:
Template Preparation Analogous to the preparation of matched filters, this step
of the attack involves the construction of templates - collections of statistical information
that will later be used to recognize secret parameters to cryptographic
operations.
Cryptographic hardware, similar or identical to the hardware to be attacked (see
footnote 1), is run through the initial stages of operation many times - hundreds or even
thousands of times - with certain parameters (e.g. several bits of the secret key) set to
known values. Other parameters are permitted to vary randomly, so that, as the number of
sample traces increases, the template comes to reflect only that side channel information
which depends on the parameters set above.
The side channel (power usage, timing, etc.) is measured carefully for each operation
that the attacker chooses to target (e.g. one for each of the 16 possible
combinations of four particular key bits). Statistics are compiled, and a set of this
statistical data - the template - is generated. The set of templates - one per operation
- is then used in the second step of the attack.
Template Application In this second step, the attacker captures traces from the
device under attack - just one sample can be sufficient, though additional traces can
increase the probability of success. These traces are then compared to each template
to determine which template each trace is "closest" to (see Equation 3.9 on page 30 for
a precise definition of closeness). The operation associated with this template (e.g.
a guess at a portion of the secret key) is assumed to be correct, and the attack can
repeat on other parameters, such as other key bits.
Footnote 1: The assumption is that the attacker has access to similar or identical hardware,
but this assumption is very realistic. From smart cards to tamper-resistant PC cards and
associated libraries, standard hardware and software is available on the open market for
the would-be attacker to legally acquire.
3.2 Attack Details 
The theory of template attacks is rooted in the statistics of multivariate normal
distributions, as presented in [34] and [35]. It is assumed that side channel measurements
can be characterized by such a distribution; the validity of this assumption is considered
in Section 3.3.2.
3.2.1 The Multivariate Normal Distribution 
Suppose we have a random variable which is an n x 1 vector. Since it is both random
and a vector, it will be represented here by $\vec{x}$. For the purpose of the template
attack, this random variable could be a set of power or timing measurements. The
probability distribution of this vector can be represented by a mean vector and an
independent covariance matrix. The mean vector is defined as:
\[ E\{\vec{x}\} = \begin{bmatrix} E\{x_1\} \\ E\{x_2\} \\ \vdots \\ E\{x_n\} \end{bmatrix} = \vec{\mu}_x \tag{3.1} \]
where $\vec{x}$ is the n x 1 random variable, $x_i$ is an element in the random variable (e.g.
a single power or timing value), $E\{\}$ is the expectation operator and $\vec{\mu}_x$ is the mean
vector. The covariance matrix of the random variable is:
\[ \Sigma_x = \mathrm{cov}(\vec{x}) = \begin{bmatrix} \sigma_{11} & \sigma_{12} & \cdots & \sigma_{1n} \\ \sigma_{21} & \sigma_{22} & \cdots & \sigma_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{n1} & \sigma_{n2} & \cdots & \sigma_{nn} \end{bmatrix} \tag{3.2} \]
where $\sigma_{ij} = E\{(x_i - \mu_i)(x_j - \mu_j)\}$ is the covariance of the i-th and j-th elements of the
n x 1 random variable $\vec{x}$, and $\mu_i = E\{x_i\}$.
We may now express the distribution's probability density function (PDF) as
\[ f_x(\vec{x}) = \frac{1}{\sqrt{(2\pi)^n |\Sigma_x|}} e^{-\frac{1}{2} (\vec{x} - \vec{\mu}_x)^T \Sigma_x^{-1} (\vec{x} - \vec{\mu}_x)}, \tag{3.3} \]
where $\vec{x}$ is an n x 1 vector, $\vec{\mu}_x$ is the distribution's mean vector, $\Sigma_x$ the covariance
matrix and $|\Sigma_x|$ the determinant of the covariance matrix. This PDF is the general
(multi-dimensional) form of the well-known univariate Gaussian PDF:
\[ f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{(x - \mu)^2}{2\sigma^2}} \tag{3.4} \]
where x is a real value, $\mu$ is the distribution's mean value and $\sigma$ is its standard
deviation.
Our side channel values (e.g. the power usage of the cipher when, say, the last 
four key bits are 0110) can be represented by such a distribution, with each element 
of the vector x being a different power measurement (e.g. power at time t = 20ns,
t = 40ns, etc.). 
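As a quick consistency check between the two densities, the following worked step sets n = 1 in the general form. It is a sketch that assumes the standard definition of the multivariate normal density, including the usual factor of one half in the exponent; the covariance matrix is then the 1 x 1 matrix $[\sigma^2]$.

```latex
\begin{align*}
\Sigma_x = [\sigma^2] \;&\Rightarrow\; |\Sigma_x| = \sigma^2, \qquad \Sigma_x^{-1} = [1/\sigma^2] \\
f_x(x) &= \frac{1}{\sqrt{(2\pi)^1 \sigma^2}}
          \exp\left(-\frac{1}{2}\,(x - \mu)\,\frac{1}{\sigma^2}\,(x - \mu)\right)
        = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)
\end{align*}
```

The determinant plays the role of the variance in the normalising constant, and the quadratic form in the exponent plays the role of the squared, variance-scaled deviation from the mean.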
3.2.2 Maximum Likelihood Estimators 
The template in a template attack is a maximum-likelihood estimation of $\vec{\mu}$ and $\Sigma$ for
a set of possible side channel values, e.g. the power usage of a cipher for a particular
subset of key bits. After collecting a large number of side-channel values - hundreds
or thousands - we can calculate the maximum likelihood estimations of the actual
mean and covariance matrix.
| Operation | Description                         |
| Q(1)      | Cipher with last four key bits 0000 |
| Q(2)      | Cipher with last four key bits 0010 |
| Q(3)      | Cipher with last four key bits 0011 |
| ...       | ...                                 |
| Q(16)     | Cipher with last four key bits 1111 |
Table 3.1: Stream ciphering operations
Using the nomenclature of [7], we first identify a number of operations that we
wish to study. If the identified operations are microprocessor instructions, then the
template attack will enable an attacker to identify when particular instructions execute.
In our case - attacking stream cipher hardware - an operation will be the
execution of a cipher with a particular subset of known key bits. For instance, the
initial round of attack may involve 16 operations, given in Table 3.1.
Again using the nomenclature of [7], we will now define several values important 
to the attack: 
K The number of operations we wish to study 
L The number of sample traces we will measure per operation 
N The number of data points in each sample trace 
Note that, for reasons given below, L should be greater than or equal to N (and 
in practice, L > 2N). 
We may organize the sample values into K matrices, one per operation, each
denoted g(k), where k represents the operation, and containing L vectors of N points
of side channel data:
g (k) = (3.5) 
Having generated the matrix g(k), we may estimate the operation's mean vector. The
maximum-likelihood estimation of the true mean vector is simply the sample mean
vector, an arithmetic average over the sample traces. Let k be the number of the
operation being studied (in the range [1, K]) and $\vec{s}_j^{(k)}$ be the j-th column of g(k). The
arithmetic average of all L samples of side channel measurements for operation Q(k)
is a vector of N values, represented by $\hat{\mu}^{(k)}$ and given by
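\[ \hat{\mu}^{(k)} = \begin{bmatrix} \frac{1}{L} \sum_{i=1}^{L} s_{i1} \\ \frac{1}{L} \sum_{i=1}^{L} s_{i2} \\ \vdots \\ \frac{1}{L} \sum_{i=1}^{L} s_{iN} \end{bmatrix} \tag{3.6} \]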
Once we have calculated an operation's sample mean vector, we may calculate the
noise vector, $\vec{n}_i^{(k)}$, for each sample trace for operation Q(k), $\vec{s}_i^{(k)}$:
\[ \vec{n}_i^{(k)} = \vec{s}_i^{(k)} - \hat{\mu}^{(k)} \tag{3.7} \]
The noise vectors of all L sample traces are used to calculate the maximum-likelihood 
estimate of the operation's covariance matrix: 
(3.8) 
This N x N matrix is our maximum-likelihood estimate of the operation's covariance
matrix $\Sigma^{(k)}$, and the template for operation Q(k) is $(\hat{\mu}, \hat{\Sigma})$.
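The construction of a single template can be sketched in a few lines. This is a minimal illustration rather than the implementation used in this work: the function name and the toy traces are hypothetical, and the covariance is normalised by 1/L (the maximum-likelihood form), which is an assumption because the normalisation of Equation 3.8 is not reproduced here.

```python
def build_template(traces):
    """Return (mean vector, covariance matrix) for L traces of N points each."""
    L = len(traces)
    N = len(traces[0])
    # Sample mean: average each of the N points over all L traces.
    mean = [sum(trace[j] for trace in traces) / L for j in range(N)]
    # Noise vectors: each trace minus the sample mean.
    noise = [[trace[j] - mean[j] for j in range(N)] for trace in traces]
    # Covariance estimate with 1/L normalisation (assumed, see lead-in).
    cov = [[sum(n[u] * n[v] for n in noise) / L for v in range(N)]
           for u in range(N)]
    return mean, cov


# Three toy traces of two points each (made-up numbers).
traces = [[1.0, 2.0], [3.0, 2.0], [2.0, 5.0]]
mean, cov = build_template(traces)
print(mean)
print(cov)
```

With only L = 3 traces for N = 2 points the estimate is crude; the requirement that L be at least N (and in practice much larger) reflects the need for enough noise vectors to produce a usable covariance matrix.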
3.2.3 Signal Classification 
Having built K templates, one per operation, we can classify any signal s by calculating
that signal's noise vector, $\vec{n}$, and the Mahalanobis distance between that noise
vector and each operation's mean, $\hat{\mu}^{(k)}$ [36]:
(3.9) 
where $\hat{\Sigma}^{(k)}$ is the sample covariance matrix for operation Q(k). Having calculated $D^{(k)}$
for each of the K templates, we may classify the signal s as belonging to the operation
Q(k) which has the smallest Mahalanobis distance $D^{(k)}(\vec{n})$.
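Classification by smallest distance can be sketched as follows. The sketch assumes the standard squared Mahalanobis distance, n^T Sigma^-1 n, where n is the trace minus the template mean; squaring does not change which template is closest. The helper names and the two toy templates are hypothetical, and the linear solve avoids forming the inverse covariance matrix explicitly.

```python
def solve(matrix, vector):
    """Solve matrix * x = vector by Gaussian elimination with partial pivoting."""
    n = len(vector)
    a = [row[:] + [vector[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= factor * a[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(a[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (a[r][n] - tail) / a[r][r]
    return x


def squared_distance(trace, mean, cov):
    """Squared Mahalanobis distance between a trace and one template."""
    noise = [t - m for t, m in zip(trace, mean)]
    weighted = solve(cov, noise)            # equals cov^-1 * noise
    return sum(n * w for n, w in zip(noise, weighted))


def classify(trace, templates):
    """Return the index of the template closest to the trace."""
    distances = [squared_distance(trace, mean, cov) for mean, cov in templates]
    return min(range(len(templates)), key=distances.__getitem__)


# Two toy templates as (mean, covariance) pairs; the numbers are made up.
templates = [
    ([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]]),
    ([4.0, 4.0], [[2.0, 0.5], [0.5, 1.0]]),
]
best = classify([3.5, 3.0], templates)
print(best)
```

Indices here start at 0, so template index 1 corresponds to the second operation in the list.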
The method introduced in [7] attempts to effect classification by using the multivariate
Gaussian PDF directly as a probability:
'... the noise probability distribution is given by the N-dimensional
multivariate Gaussian distribution $p_{N_i}(\cdot)$ where the probability of observing
a noise vector n is:
\[ p_{N_i}(n) = \frac{1}{\sqrt{(2\pi)^N |\Sigma_{N_i}|}} \exp\left(-\frac{1}{2} n^T \Sigma_{N_i}^{-1} n\right) \]
where $|\Sigma_{N_i}|$ denotes the determinant of $\Sigma_{N_i}$ and $\Sigma_{N_i}^{-1}$ is its inverse.' [7]
This is not strictly valid, as a point on a PDF is not a probability. The probability of
a point on a continuous distribution is vanishingly small, as probabilities are obtained
by integrating under a PDF and the area underneath a point is infinitesimal.
While the nomenclature is not precise, the method does work - it is concerned 
with ratios of "probabilities" rather than the probabilities themselves. Indeed, though 
a value of a point on the PDF may be much greater than 1, a ratio-based comparison 
of PDF values can be an effective classification mechanism. 
Given Equation 3.9, we see that the PDF from [7] can be represented as:
(3.10) 
The ratio between PDF values for a given noise vector and two operations, Q(k0) and
Q(k1), is:
\[ \frac{f^{(k_0)}(\vec{n})}{f^{(k_1)}(\vec{n})} \]
Since $e^x$, $x^2$ and $\sqrt{x}$ are monotonically increasing functions with respect to x (where
$x \geq 0$), we see that if $|\hat{\Sigma}^{(k_0)}| = |\hat{\Sigma}^{(k_1)}|$, then choosing the operation whose PDF
value is largest is equivalent to choosing the operation whose Mahalanobis distance
is smallest. Our experiments have shown that, while $|\hat{\Sigma}^{(k_0)}|$ may not be equal to
$|\hat{\Sigma}^{(k_1)}|$, they are typically on the same order of magnitude, whereas $e^{D^{(k_0)}(\vec{n})}$ and
$e^{D^{(k_1)}(\vec{n})}$ often differ by orders of magnitude. Thus, the method described in [7] is
effective, even if the nomenclature is imprecise.
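The equivalence argued above can be made explicit with a short worked derivation. It is a sketch under assumed standard forms that are not quoted from this thesis: the squared Mahalanobis distance $(D^{(k)}(\vec{n}))^2 = \vec{n}^T (\hat{\Sigma}^{(k)})^{-1} \vec{n}$ and the Gaussian density with a factor of one half in the exponent.

```latex
\begin{align*}
f^{(k)}(\vec{n}) &= \frac{1}{\sqrt{(2\pi)^N |\hat{\Sigma}^{(k)}|}}
                    \exp\left(-\tfrac{1}{2} \left(D^{(k)}(\vec{n})\right)^2\right) \\
\ln \frac{f^{(k_0)}(\vec{n})}{f^{(k_1)}(\vec{n})}
  &= \tfrac{1}{2} \ln \frac{|\hat{\Sigma}^{(k_1)}|}{|\hat{\Sigma}^{(k_0)}|}
   + \tfrac{1}{2} \left[ \left(D^{(k_1)}(\vec{n})\right)^2 - \left(D^{(k_0)}(\vec{n})\right)^2 \right]
\end{align*}
```

When the two determinants are equal the first term vanishes, so the ratio exceeds 1 exactly when the distance to template k0 is the smaller one. When the determinants are merely of the same order of magnitude, the first term is small and the distance term usually decides the comparison.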
3.2.4 Template Masking 
Computing large templates can be computationally intensive: for L sample traces
and template size N, the computational complexity is in the class:
\[ \Theta(LN + LN + LN^2) = \Theta(LN^2). \]
Fortunately, we are able to reduce the template size N through a process of masking,
as not all points in a side-channel trace are equally significant. Often, the power used
or emitted by a cryptographic device at the passing of a clock edge is more significant
than the power used or emitted between clock pulses. Some clock cycles may be more
significant than others, as the change in Hamming weight may vary more because of
certain key bits than others at certain times.
We reduce the size of templates by selecting for the template only those points
in the side-channel trace which are significant. For instance, we may select 32 data
points out of 1600 measured, which reduces N by a factor of 50 and, since the cost
grows as N^2, reduces the computational cost by a factor of roughly 2500.
This selection is accomplished as follows:
1. The sample mean vector $\hat{\mu}^{(k)}$ is calculated for each operation Q(k).
2. An overall mean vector $\bar{\mu}$ is calculated:
\[ \bar{\mu} = \frac{1}{K} \sum_{k=1}^{K} \hat{\mu}^{(k)} \]
3. The inter-operation standard deviation of the mean vectors is calculated:
\[ \vec{\sigma} = \sqrt{\frac{1}{K} \sum_{k=1}^{K} \left(\hat{\mu}^{(k)} - \bar{\mu}\right)^2} \]
where the square and square root are taken element by element.
4. For a chosen value N (e.g. 32 points of interest), the N points with the greatest
inter-operation standard deviation are selected for template generation.
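The selection steps above can be sketched as follows. The function name and the toy mean vectors are hypothetical, and the standard deviation is computed in its population form (dividing by K), which is an assumption; dividing by K - 1 instead would scale every point equally and leave the ranking unchanged.

```python
import math


def select_points(means, count):
    """Return indices of the `count` points with the largest inter-operation spread."""
    K = len(means)
    N = len(means[0])
    # Step 2: overall mean vector across the K operation means.
    overall = [sum(m[j] for m in means) / K for j in range(N)]
    # Step 3: per-point standard deviation of the operation means.
    spread = [math.sqrt(sum((m[j] - overall[j]) ** 2 for m in means) / K)
              for j in range(N)]
    # Step 4: keep the points that differ most between operations.
    ranked = sorted(range(N), key=lambda j: spread[j], reverse=True)
    return sorted(ranked[:count])


# Mean vectors for three operations over four trace points (made-up numbers).
means = [
    [1.0, 5.0, 2.0, 9.0],
    [1.1, 7.0, 2.0, 3.0],
    [0.9, 6.0, 2.0, 6.0],
]
points = select_points(means, 2)
print(points)
```

Point 2 is identical for every operation and so carries no information about which operation ran; points 1 and 3 vary the most and are the ones retained.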
Actual inter-operation mean and standard deviation vectors are shown in Figure 3.1.
This data was derived from the experimental setup to be described in Chapter 4,
and it illustrates just how significant differences can be among data points in the
inter-operation standard deviation.
The upper graph shows the inter-operation mean vector. In this vector, we can see
clear spikes of power usage whenever a clock edge occurs. This behaviour is common
to all operations, and thus, it can be observed in the inter-operation mean. The
lower graph is the inter-operation standard deviation vector. This vector shows us
two important facts:
1. The greatest differences occur at clock edges.
2. The greatest differences occur early in the operations - before the secret key
can "mix into" the cipher state.
[Figure: two plots of power usage, the inter-operation mean (upper) and the
inter-operation standard deviation (lower)]
Figure 3.1: Inter-operation mean and standard deviation vectors for actual hardware
3.3 Attack Application 
Template attacks have been applied in [7] against microcontrollers running the stream
cipher ARC4 (the "Alleged RC4™", so called because the name "RC4" is still protected
by trademark, though source code to produce data equivalent to RC4 has been
available on the Internet since 1994). As a stream cipher, ARC4 is resistant to differential
power analysis (see Section 2.2.3 on page 21), but is highly susceptible to
template attacks.
3.3.1 Inapplicability of DPA 
Differential power analysis, which can be applied quite successfully to block ciphers,
is simply not applicable to most stream ciphers, including ARC4. The reason has 
to do with the persistence of secret key information. We now turn our attention to 
explaining this important distinction in detail. 
DPA and Block Ciphers When DPA is applied against a block cipher, the attacker
makes several guesses at a subset of the secret key, as shown in Figure 3.2 on
the following page.
This figure shows a model of a block cipher with four encryption rounds, each
having S-boxes (providing non-linear substitution), a permutation layer (providing
linear diffusion) and a key mixing layer. This model is similar to the substitution-permutation
network presented in [37], but with the addition of a key mixing layer
between each round. The block on the left-hand side of the figure represents the key
scheduler, which converts the secret key into several round keys, which are added via
XOR in each round's key mixing layer. The secret key shown is 0xXXBXX7XX in
hexadecimal, where an 'X' digit represents bits of key that are not part of the current
guess.
[Figure: block cipher model with key scheduler, plaintext input and ciphertext output,
showing the guessed key digits]
Figure 3.2: DPA key guesses
After making this guess, the attacker observes a large number of blocks of ciphertext
and records power traces associated with their production. The secret key guess
allows the attacker to work backwards through the cipher to determine a single bit
whose value can be inferred if the key guess is correct.
[Figure: block cipher model with key scheduler, plaintext input and ciphertext output,
showing the internal bit inferred from the key guess]
Figure 3.3: DPA bit guess
In Figure 3.3, the guess of subkey bits, combined with knowledge of the key
scheduling algorithm, permutation layer and S-box construction allows the attacker
to evaluate a particular key bit entering an S-box in the last round of encryption. This
bit is used to partition the side channel data into two sets: those for which the internal
bit is 0, and those for which that bit is 1. If the attacker's key guess was incorrect,
then we expect traces whose internal bit was 0 and 1 to be evenly distributed among
the two sets. If, however, the guess is correct, then the partitioning will be correct,
and there will be a significant difference between the averages of the two sets of side
channel data. Resultant trace differences are shown in Figure 3.4, which shows four
graphs:
1. A reference current trace (from which power may be derived, since p = vi)
2. A graph showing the difference between the averages of two sample trace sets,
where the sets have been partitioned by a correct key guess
3. Two graphs showing differences between the averages of two sample trace sets
each, where the sets have been partitioned by an incorrect key guess
The current spikes in the middle of the trace show that there is a material difference
between the traces in the partitioned sets. That is, the initial key guess was correct,
which made the partitioning effective. The attacker may now move on to another
subset of the key, then another, until eventually the entire key is revealed. In this
way, the secret key of a block cipher can be recovered in a linear way using a divide
and conquer approach instead of the 2^N approach of exhaustive search.
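The partitioning test can be illustrated with a small simulation. This is a toy model under loudly stated assumptions, not the attack of [30]: traces are synthetic Gaussian noise, a single point leaks one bit of a 4-bit S-box output, the S-box table is chosen only for illustration, and the prediction works forwards from a known input nibble rather than backwards from ciphertext. All names and parameters are hypothetical.

```python
import random

# A 4-bit S-box used purely for illustration.
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def predicted_bit(data, key_guess):
    """Bit 1 of the S-box output: the internal bit used for partitioning."""
    return (SBOX[data ^ key_guess] >> 1) & 1


def simulate_trace(data, key, rng, points=8, leak_at=3):
    """Gaussian noise everywhere, plus the internal bit at one point."""
    trace = [rng.gauss(0.0, 1.0) for _ in range(points)]
    trace[leak_at] += predicted_bit(data, key)
    return trace


def peak_difference(inputs, traces, key_guess):
    """Largest absolute difference between the averages of the two partitions."""
    points = len(traces[0])
    sums = [[0.0] * points, [0.0] * points]
    counts = [0, 0]
    for data, trace in zip(inputs, traces):
        b = predicted_bit(data, key_guess)
        counts[b] += 1
        for j, value in enumerate(trace):
            sums[b][j] += value
    return max(abs(sums[1][j] / counts[1] - sums[0][j] / counts[0])
               for j in range(points))


rng = random.Random(1)
secret_key = 0xB
inputs = [rng.randrange(16) for _ in range(2000)]
traces = [simulate_trace(d, secret_key, rng) for d in inputs]
peaks = [peak_difference(inputs, traces, guess) for guess in range(16)]
recovered = max(range(16), key=peaks.__getitem__)
print(hex(recovered))
```

Only 16 guesses are tested for this 4-bit subkey; repeating the procedure for each subkey in turn is what replaces exhaustive search over the whole key.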
[Figure: four graphs plotted against time, a reference current trace and three
difference traces]
Figure 3.4: DPA trace differences [30]
DPA and Stream Ciphers Against stream ciphers, however, the differential
power analysis technique is ineffective. In order to perform DPA, the attacker must
gather a large number of ciphertext/side channel pairs while the cipher's secret key
remains constant. A stream cipher's internal state, however, is constantly changing;
the secret key is only found at initialization, and by the time the keystream (and,
thus, ciphertext) can be generated, the key has been mixed with an initialization
vector (IV) so that the starting state is never the same twice.
A DPA attack might be effective if the attacker could obtain many ciphertext/side
channel pairs from a device using a constant key and initialization vector, but
cryptographic protocols are designed not to re-use IVs. Hence, an attack that does
not rely on a persistent secret key being used by the device under attack is needed;
one such attack is the template attack.
3.3.2 Applicability of Template Attacks 
Figure 3.5 shows two graphs, each a difference between two power traces obtained
from the key initialization phase of ARC4. These graphs show:
1. The difference between two power traces produced using the same key.
2. The difference between two power traces produced using different keys.
These graphs do not reveal the striking differences that an attacker might expect
to see - differences like those in Figure 3.6. In fact, the first graph actually exhibits
larger differences than the second, even though its keys were identical. This is due to
the stochastic nature of power measurements: there is always noise associated with
unrelated hardware or operations, so identical operations may be more dissimilar than
different operations.
Figure 3.6 shows graphs for the same conditions - ARC4, the first graph for traces
using the same key and the second graph for traces using different keys - but unlike
Figure 3.5, the graphs show differences between averages of sets of traces and there
are spikes of dissimilarity in the second graph which do not appear in the first (or in
Figure 3.5). This dissimilarity reveals that the two graphs were produced by dissimilar
operations, which in turn gives the attacker information about the secret key. It is
this information that can be exploited by a template attack.
Template attacks assume that side channel information can be characterized by
a multivariate Gaussian distribution. This characterization may introduce error -
indeed, the interdependence of hardware elements suggests that the Gaussian
distribution is not ideal - but the results in [7] show that it is a useful characterization.
Further study could better model the characteristics of side channels, but such study
is beyond the scope of this research.
Figure 3.5: Difference of averages - 1 sample [7]
Figure 3.6: Difference of averages - 50 samples [7]
3.3.3 Applicability to Hardware Implementations 
Through template attacks, the classification success rate for ARC4 running on an 
embedded microcontroller was shown in [7] to be 98.1% - 99.3%, a clear success. 
Microcontrollers, while less resource-intensive than traditional CPUs, are still much
more complex systems than the "pure" hardware that many ciphers are implemented
in. The lower complexity of simple hardware implementations has been thought to
be a defense against side channel analysis:
"the only exposure for [fast cipher hardware] is the loading of the key bytes
from EEPROM which usually leaks the hamming weight" [7]
The question that we set about answering was: 
Can template attacks be used to differentiate secret keys on such a small scale of 
power usage as seen in digital hardware? 
3.4 Summary 
Template attacks, based on the theory of signal detection and classification, are very
powerful side channel attacks. If the noise associated with the side channel is Gaussian,
then this technique is in fact optimal [7].
Template attacks consist of two important stages:
1. Template Preparation
(a) The attacker collects a large number of sample traces from a cryptographic
system identical to the one being attacked.
(b) These traces are used to build templates - pairs, each consisting of a mean
vector and a covariance matrix - for a number of operations.
2. Template Application
(a) A small number of side channel traces (possibly just one) are taken from
the device being attacked.
(b) The traces are classified using their Mahalanobis distances to each operation:
for each trace, the operation whose Mahalanobis distance is smallest
- or whose probability density function is largest - is selected as the
producer of the trace.
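The two stages above can be sketched in C++ as follows. This is a minimal illustration, not the software written for this thesis: it assumes a diagonal covariance matrix (per-point variances only, all non-zero) so that no matrix inversion is needed, and the names `Template`, `buildTemplate`, `mahalanobisSquared` and `classify` are hypothetical.

```cpp
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical template: a mean vector plus per-point variances
// (a diagonal stand-in for the full covariance matrix).
struct Template {
    std::vector<double> mean;
    std::vector<double> variance;
};

// Stage 1: build a template from the training traces of one operation.
// Assumes at least two traces, all of the same length.
Template buildTemplate(const std::vector<std::vector<double>>& traces) {
    const std::size_t n = traces.front().size();
    const double count = static_cast<double>(traces.size());
    Template t{std::vector<double>(n, 0.0), std::vector<double>(n, 0.0)};
    for (const auto& trace : traces)
        for (std::size_t i = 0; i < n; ++i) t.mean[i] += trace[i] / count;
    for (const auto& trace : traces)
        for (std::size_t i = 0; i < n; ++i) {
            const double d = trace[i] - t.mean[i];
            t.variance[i] += d * d / (count - 1.0);  // sample variance
        }
    return t;
}

// Squared Mahalanobis distance under the diagonal-covariance assumption.
double mahalanobisSquared(const Template& t, const std::vector<double>& trace) {
    double sum = 0.0;
    for (std::size_t i = 0; i < trace.size(); ++i) {
        const double d = trace[i] - t.mean[i];
        sum += d * d / t.variance[i];
    }
    return sum;
}

// Stage 2: return the index of the template closest to the trace.
std::size_t classify(const std::vector<Template>& templates,
                     const std::vector<double>& trace) {
    std::size_t best = 0;
    double bestDistance = std::numeric_limits<double>::infinity();
    for (std::size_t k = 0; k < templates.size(); ++k) {
        const double d = mahalanobisSquared(templates[k], trace);
        if (d < bestDistance) {
            bestDistance = d;
            best = k;
        }
    }
    return best;
}
```

With a full covariance matrix the distance would use the inverse of that matrix instead of the per-point division; the selection rule (smallest distance wins) stays the same.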
Computational effort can be reduced by "masking" the trace data points used to
construct the template; only those points with significant inter-operation standard
deviation are used. This pruning process may reduce computation time a thousandfold,
but often does not reduce the success rate of the attack by more than several
percentage points.
Template attacks are applicable to stream ciphering systems, though older tech-
niques such as differential power analysis (DPA) are not. This is because, unlike 
DPA, template attacks do not rely on a system using a persistent secret key. Tem-
plate attacks have been shown to be very successful against microcontroller-based 
cryptographic implementations, but we will show in Chapter 5 that they are also 
effective against implementations in digital hardware. 
First, however, we will describe the setup used in our experimentation. This 
experimental setup is the subject of Chapter 4. 
Chapter 4 
Experimental Setup 
For this thesis, we wished to both apply the template attack by simulating hardware -
based on a model derived from actual physical characteristics - and also to apply the 
attack to measurements taken of the characteristics of physical hardware. These
characteristics, especially power usage, were to be measured while real hardware performs
cryptographic operations. Simulating, measuring and analysing these characteristics
required:
1. A hardware platform on which we could run cryptographic operations
2. Sensitive measurement equipment with suitable amounts of memory 
3. Software to simulate hardware and analyse physical measurement data 
4.1 SCAB - Side Channel Analysis Board
The Side Channel Analysis Board (SCAB), shown in Figure 4.1, is a development
board intended to facilitate the study of side channel attacks (SCA) against
cryptographic hardware. It was developed for the purpose of the research contained in this
thesis, but its long-term objective is to provide security researchers with a platform
on which any algorithm can run - thanks to reconfigurable hardware - and many
physical properties can be studied.
Figure 4.1: SCAB - Side Channel Analysis Board 
In order to study arbitrary cryptographic operations we chose to make use of
reconfigurable hardware, namely Field Programmable Gate Arrays (FPGAs). After
studying many commercially available FPGA development kits, it was found that
none were suitable for our use, for several reasons:
• development boards include hardware extraneous to our purposes (e.g. LEDs,
keypads, media and storage I/O) that could obscure the FPGA's power usage
• many FPGA I/O pins are tied to this extraneous hardware, making it difficult
to load and unload blocks of data and keys
• measuring power usage, which requires inserting a resistor between Vcc and the
FPGA, would require physically altering the board - cutting traces and inserting
the resistor
For these reasons, as well as the opportunity for learning, we decided to build our
own development board, SCAB.
4.1.1 Design Constraints 
Although in this thesis, SCAB was used only for the application of template attacks 
to the power usage side channel of stream cipher hardware, it was designed to be 
a facility for subsequent researchers to also use. These researchers could focus on 
any number of side channels, and any number of cryptographic systems that can be 
implemented in hardware. 
The design of SCAB had to satisfy several constraints, some external and some 
owing to the intrinsic nature of the research: 
• It must be possible to configure SCAB with large, fast implementations of mod-
ern ciphers such as AES. 
- To accommodate high-throughput designs like that found in [38], the min-
imum acceptable gate count of the FPGA is 60k gates. 
• It must be possible to transfer blocks of data through parallel I/Os.
- To accommodate large, modern block ciphers, we wish to be able to transfer
128-bit blocks of data in a single clock period.
• It must be possible to assemble SCAB in Memorial's PCB facilities.
- Non-local PCB construction was acceptable (and indeed, required), but
the assembly process required interaction with technicians, which would
not have been possible unless SCAB was assembled locally.
- High-pin count packages such as Ball Grid Array (BGA) may not be used;
only through-hole and surface-mount packages are acceptable.
• The design should be as simple as possible.
- Increased hardware complexity would increase the time required to design
SCAB.
- FPGA support chips would influence power usage and obscure side channel
information.
• Each type of SCA also presents its own requirements and constraints; they are
discussed below.
In order to fulfill all of these requirements, we selected an Actel ProASIC3 FPGA
which has 125,000 gates, 131 digital I/Os, independent core and I/O power inputs,
surface-mount packaging - Quad Flat Package (QFP) - and on-board flash memory,
which eliminates the need for external memory chips on the board.
We would have liked to use an FPGA with at least 256 digital I/O pins, but such
chips require packaging technology which cannot be handled in a local assembly of
the PCB.
4.1.2 Power Analysis 
In order to facilitate power analysis, SCAB has two independent power supply nets:
Vcc, which powers the FPGA's core logic, and VccI, which powers FPGA I/O and
anything else whose power usage is not relevant to the research.
The former of these nets, Vcc, has a resistor - R1 in Figure 4.2 - inserted in
series with the power supply so that the FPGA's current draw can be measured.
This resistor must have a very small value (we have selected 50) so as to keep Vcc
from falling outside the FPGA's operating envelope. Vcc may be powered by one
of two sources; when performing power analysis, a researcher will typically choose to
power Vcc by the voltage regulator V1. The other option - best for fault analysis -
is discussed in Section 4.1.3.
Figure 4.2: PCB Layout for SCAB
The latter net, VccI, is independent of Vcc, so that power used for FPGA I/O
does not affect the measurement of core power usage. VccI is powered by the voltage
regulator V2, which ensures that I/O voltage is always held steady, even during fault
attacks (Section 4.1.3). This regulator is, in turn, connected to the DC power jack
J2 whenever the main power switch (S2) is closed.
4.1.3 Fault Analysis
SCAB also facilitates fault analysis, in which the researcher attempts to induce an
incorrect computation through externally-induced faults. The source of these faults
may include glitches in the clock signal or unusual power supply characteristics (e.g.
too high, too low, spikes). SCAB provides the access needed to study the effect of
such faults through its power supply design, external clock and large number of I/O
pins.
Power Supply As mentioned in Section 4.1.2, SCAB's Vcc supply net may be
driven by a voltage regulator or, more interestingly for the purpose of fault analysis,
an external power source. This direct connection to the Vcc net allows a researcher
to set up abnormal power supply conditions, including undervoltage or overvoltage
conditions as well as voltage spikes.
External Clock SCAB also supports timing fault analysis. Since SCAB's clock is
driven externally (connected via BNC), a researcher can modify clock signals, inducing
glitches and changing duty cycles and periods, in an attempt to induce erroneous
computation.
I/O Pins Finally, SCAB's large I/O bank allows a researcher to export up to 128
internal signals from a hardware design, which permits the direct observation of how
internal values change while the system is under external stress (power, clock or
temperature glitches, ionizing radiation, etc.). This level of access permits the study
of fault propagation, and it also allows researchers to verify existing fault models.
4.1.4 Timing Analysis
SCAB's external clock and large I/O bank also support timing analysis: a researcher
can manipulate clock signals at will and gain insight into the internals of a
hardware implementation, looking to see not just when an algorithm is complete, but
when sub-sections of it are complete.
4.2 Other Hardware 
The complete experimental setup is shown in Figure 4.3. 
Figure 4.3: Experimental setup 
Besides SCAB, other hardware that can be seen in Figure 4.3 includes: 
• oscilloscope used for measurement 
• DC power supply 
• DIP switches (used for parallel key and/or IV bit inputs)
• momentary reset switch
• "go" switch
The last of these switches provided the hardware with the signal to start (and
continue) cryptographic operation. Interfacing this switch directly with the hardware
required debouncing to prevent momentary glitches in the "go" signal - caused by the
mechanical bouncing of switch elements - from reaching the cryptographic hardware.
The debouncing circuit is shown in Figure 4.4.
Figure 4.4: Switch debouncing circuit
4.3 Measurement Equipment 
The focus of our research is the application of template attacks to the power usage
side channel of stream ciphers using SCAB. Our power measurements were all made
with the Cleverscope CS328A [39], a PC-based mixed-mode oscilloscope with high
time resolution (10 ns minimum period) and deep memory (up to one million data
points per channel). This scope allowed us to:
1. Generate clock signals
2. Monitor eight digital channels
3. Measure two analog voltage channels, using analog, digital and/or external
triggering
The Cleverscope interface is shown in Figure 4.5.
Figure 4.5: Cleverscope PC interface
The eight digital channels were used to monitor the hardware being tested,
including clock input and keystream output. The analog channels were used to measure
the voltage before and after the resistor R1 in Figure 4.2. Calculating instantaneous
power usage from these voltages is very simple:
(4.1)
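A minimal sketch of such a calculation is given below. It assumes that the quantity of interest is the power delivered to the FPGA core, i.e. the voltage measured after the resistor multiplied by the current through the resistor (obtained from the voltage drop by Ohm's law). This is one plausible reading of the measurement arrangement rather than a restatement of Equation 4.1, and the function and parameter names are hypothetical.

```cpp
// Hypothetical helper: power drawn downstream of the sense resistor, in watts.
// vBefore and vAfter are the voltages on either side of the resistor (volts);
// resistance is the resistor's value in ohms.
double instantaneousPower(double vBefore, double vAfter, double resistance) {
    const double current = (vBefore - vAfter) / resistance;  // Ohm's law
    return vAfter * current;  // power delivered after the resistor
}
```

Applying the function to each pair of simultaneous samples from the two analog channels yields one power trace point per sample.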
With one million data points per channel at our disposal, we were able to capture
thousands of points per key/IV pair, or approximately 50 data points per clock cycle.
This data allowed us to construct accurate templates (see Chapter 5).
The data captured by the Cleverscope was saved to text files and interpreted by
our software, described in the following section.
4.4 Software 
Turning physical simulations or measurements into classification statistics requires 
software. The software workflow is shown in Figure 4.6. We wrote approximately 
10,000 lines of C++ to accomplish these tasks; the programs which accomplish them 
are described here. 
[Workflow diagram; recoverable elements: hardware model, physical measurement, Cleverscope program (from the Cleverscope manufacturer), Cleverscope file (.cscope), trace mean (.mean), traceview, subtrace mask (.mask), success, success statistics (.success)]
Figure 4.6: Workflow - data files
4.4.1 Power Trace Formatting
After taking physical measurements of side channel data, the output of the Cleverscope
program is a file that, for every time increment, specifies the voltage for each
analog trace and logic value for each digital trace. We must take this information
and turn it into a format more amenable to interpretation¹.
The powercat program reads voltage traces from input files (Cleverscope, Tektronix
GRAB2212 or our own analog trace format) and outputs them to binary trace
files. These files may be a concatenation of several trace files - hence the name of the
program - and contain just a power trace and a digital "partitioning" trace (which is
described in Section 4.4.2, below).
A text file containing a full capture of Cleverscope memory occupies 50 MB of
disk space. If many such captures are required (e.g. when capturing output from
multiple keys), storage requirements quickly become enormous. Converting this data
to a binary format saves both storage space and computation time, as text
parsing is not required every time we load a power trace.
¹ Details concerning file formats are given in Appendix B on page 115.
4.4.2 Calculating Trace Mean Vectors
The traceaverage program takes a power trace file, partitions it and averages all of
the subtraces.
Aside: Partitions and Subtraces A single power trace file may contain traces
for many samples. Each of these subtraces is delimited by a single digital trace, called
the partitioning trace. This partitioning trace is shown in Figure 4.7, and it is used
by the traceaverage program (and others) to partition a long trace file into multiple
subtraces. The partitioning trace is equal to 1 during each encryption operation and 0
between them. Thus, whenever the partitioning trace switches from 0 to 1, a new
subtrace (sample trace) has begun.
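The partitioning rule can be sketched as follows. This is an illustrative fragment rather than the traceaverage source: the function names are hypothetical, traces are held in memory as vectors instead of being read from binary trace files, and subtraces of unequal length are simply truncated to the shortest one before averaging.

```cpp
#include <cstddef>
#include <vector>

// Split a long power trace into subtraces wherever the partitioning trace is 1.
std::vector<std::vector<double>> splitSubtraces(
        const std::vector<double>& power,
        const std::vector<int>& partitioning) {
    std::vector<std::vector<double>> subtraces;
    int previous = 0;
    for (std::size_t i = 0; i < power.size() && i < partitioning.size(); ++i) {
        if (partitioning[i] == 1) {
            if (previous == 0) subtraces.emplace_back();  // 0 -> 1: new subtrace
            subtraces.back().push_back(power[i]);
        }
        previous = partitioning[i];
    }
    return subtraces;
}

// Point-wise mean of the subtraces, truncated to the shortest subtrace.
std::vector<double> meanVector(const std::vector<std::vector<double>>& subtraces) {
    if (subtraces.empty()) return {};
    std::size_t n = subtraces.front().size();
    for (const auto& s : subtraces)
        if (s.size() < n) n = s.size();
    std::vector<double> mean(n, 0.0);
    for (const auto& s : subtraces)
        for (std::size_t i = 0; i < n; ++i) mean[i] += s[i];
    for (double& m : mean) m /= static_cast<double>(subtraces.size());
    return mean;
}
```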
4.4.3 Simulating Power Usage 
The simulate program simulates the power usage of a hardware implementation of
a cryptographic cipher. The user can specify a number of parameters:
• the cipher to simulate
- currently LFSR-16 or Trivium
• the secret key to use
- specified as 0xXXXX or 0bXXXX, where X can be:
* a value (0-1 for binary, 0-f for hex)
* the literal X, meaning "assign randomly for each sample"
• how much noise to add 
• sampling period 
• the number of samples to simulate 
• the number of clock cycles to simulate per sample 
• the number of samples to simulate per clock cycle 
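The key notation in the list above can be illustrated with a short sketch. It is not the parser used by simulate: `expandKeyPattern` is a hypothetical name, only lower-case hex digits and an upper-case X wildcard are accepted, and the result is limited to 64 bits (so an 80-bit Trivium key would need a wider type).

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <stdexcept>
#include <string>

// Expand a key pattern such as "0xfXXX" or "0b10XX" into a concrete value.
// Every literal 'X' after the prefix is replaced by a freshly drawn random digit.
std::uint64_t expandKeyPattern(const std::string& pattern, std::mt19937& rng) {
    if (pattern.size() < 3 || pattern[0] != '0' ||
        (pattern[1] != 'x' && pattern[1] != 'b'))
        throw std::invalid_argument("expected a 0x or 0b prefix");
    const bool hex = pattern[1] == 'x';
    const unsigned bitsPerDigit = hex ? 4 : 1;
    const unsigned digitMax = hex ? 15 : 1;
    std::uniform_int_distribution<unsigned> randomDigit(0, digitMax);
    std::uint64_t key = 0;
    for (std::size_t i = 2; i < pattern.size(); ++i) {
        const char c = pattern[i];
        unsigned digit = 0;
        if (c == 'X') digit = randomDigit(rng);
        else if (c >= '0' && c <= '9') digit = static_cast<unsigned>(c - '0');
        else if (c >= 'a' && c <= 'f') digit = static_cast<unsigned>(c - 'a') + 10;
        else throw std::invalid_argument("unexpected character in key pattern");
        if (digit > digitMax) throw std::invalid_argument("digit out of range");
        key = (key << bitsPerDigit) | digit;
    }
    return key;
}
```

Calling the function once per sample with the same pattern gives keys that share the fixed digits and differ in the wildcard positions.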
The power model used can be customized by writing a C++ class that implements
the PowerUsageModel interface (see Section B.5.1). This model tells the simulator
how much power is consumed by a flip-flop that either:
• changes from high to low,
• changes from low to high,
• remains steady at low or
• remains steady at high.
Determining the number of flip-flops that maintain or change state is the job of
another C++ class, one which inherits from the abstract class Cipher (see Section
B.5.2). This class tells the simulator, on each clock cycle, how its internal state has
changed since the last cycle.
Using the power usage model and the cipher model, the simulator generates power
traces for particular ciphers running on particular (simulated) hardware.
The output of this simulation is a power trace file (.power extension) and a
subtrace mean file (.mean extension).
Figure 4.7: Partitioning trace
4.4.4 Viewing Power Traces 
The traceview program is used for two purposes: 
1. To view voltage and power trace files, as well as simple statistics about them
2. To view inter-operation statistics and choose subtrace masks (see Section 3.2.4
on page 32).
In the first mode, the program simply displays the contents of a trace file, as in Figure
4.8.
Figure 4.8: traceview showing the contents of a Cleverscope file
In the second mode, several subtrace mean files are loaded (one per operation), and
simple inter-operation statistics can be viewed. From the inter-operation standard
deviation, we select a subtrace mask to apply when building templates. In Figure
4.9, we can see a line drawn across the inter-operation standard deviation. This line
is the cutoff above which the subtrace mask will accept points and below which it
will reject them.
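The cutoff rule can be sketched as below. The fragment is illustrative only: the function names are hypothetical, the inter-operation standard deviation is computed as a population standard deviation of the per-operation mean traces (an assumption), and the mask is represented as a list of accepted indices.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Standard deviation, at each time index, across the per-operation mean traces.
// Assumes at least one mean trace and that all have the same length.
std::vector<double> interOperationStdDev(
        const std::vector<std::vector<double>>& operationMeans) {
    const std::size_t n = operationMeans.front().size();
    const double k = static_cast<double>(operationMeans.size());
    std::vector<double> result(n, 0.0);
    for (std::size_t i = 0; i < n; ++i) {
        double mean = 0.0;
        for (const auto& m : operationMeans) mean += m[i] / k;
        double sumSquares = 0.0;
        for (const auto& m : operationMeans)
            sumSquares += (m[i] - mean) * (m[i] - mean);
        result[i] = std::sqrt(sumSquares / k);  // population standard deviation
    }
    return result;
}

// Indices of the points whose deviation lies above the cutoff line.
std::vector<std::size_t> selectMask(const std::vector<double>& stdDev,
                                    double cutoff) {
    std::vector<std::size_t> mask;
    for (std::size_t i = 0; i < stdDev.size(); ++i)
        if (stdDev[i] > cutoff) mask.push_back(i);
    return mask;
}
```

Raising the cutoff shrinks the mask, and hence the template size N; lowering it admits more points at the cost of more computation.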
In this case, 32 points in each sample trace will become part of the template;
the template's size will be N = 32. This number N can be varied until the desired
value is reached, whether by inspection - placing the cutoff line above the level of
background noise - or to achieve a particular probability of classification success. See
Chapter 5 on page 64 for graphs of classification success rate versus template size.
4.4.5 Building Templates 
The build-template program takes as input a power trace (containing a number of
subtraces) and an optional subtrace mask. Its output is a template file containing
the pair (μ(k), Σ(k)) for a particular operation Q(k) as explained in Section 3.8 on
page 30. This file has a .template extension (see Figure 4.6 on page 53) and can be
read by the next program in the software workflow, classify.
Figure 4.9: traceview used to select subtrace mask
4.4.6 Classifying Power Traces 
The classify program takes as input a power trace file (.power) - whose subtraces
are known to have been generated by a particular operation Q(k) - a number of
template files (.template) and an optional subtrace mask file (.mask). It partitions
the trace file into subtraces, and classifies each as being likeliest to correspond to one
of the given templates. The output is a classification file (.classification), whose
format is shown in Figure 4.10.
The probabilities are derived using the procedure given in Section 3.2.3 on page 30,
with one modification: since the attacker knows that one of the generated templates
is for the operation that generated the trace, the probability density function values
are normalized such that they add to 1 and represent the probability that a particular
operation produced the trace, given that one of the templates is correct.
classify
Loading mask file: ../mask32.mask
Opening templates: 0x00.template 0x01.template 0x02.template
0x03.template 0x04.template 0x05.template 0x06.template
0x07.template 0x08.template 0x09.template 0x0a.template
0x0b.template 0x0c.template 0x0d.template 0x0e.template
0x0f.template
Done reading templates.
Reading data to classify...
Opening '../dut/0x00.power'...
[==================================================] 100%
256 traces, sized [32-32] samples/trace
Trace 0:
Probability of template 0x00.template: 0.999997
Probability of template 0x01.template: 1.20044e-07
Probability of template 0x02.template: 1.66347e-14
Probability of template 0x03.template: 5.76915e-08
Probability of template 0x04.template: 9.61753e-07
Figure 4.10: classify output
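The normalization step described in this section amounts to dividing each density value by the sum over all templates. A minimal sketch follows; the function name is hypothetical and it assumes that at least one density value is non-zero.

```cpp
#include <vector>

// Scale probability density values so that they sum to 1.
// Assumes the sum of the inputs is non-zero.
std::vector<double> normalizeDensities(const std::vector<double>& densities) {
    double total = 0.0;
    for (double d : densities) total += d;
    std::vector<double> probabilities;
    probabilities.reserve(densities.size());
    for (double d : densities) probabilities.push_back(d / total);
    return probabilities;
}
```

For high-dimensional templates the raw density values can underflow, so a practical implementation might work with logarithms of the densities before normalizing; that refinement is omitted here.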
4.4.7 Evaluating Classification Success Rate
The success program reads .classification files (one per operation) and produces
summary statistics, both on per-key and overall bases. The output of this program is
shown in Figure 4.11.
success
Opening output files...
0x00.classification... key: 0x00
255 correct guesses (99.6094%, 98.3821% certainty)
1 incorrect guesses (0.390625%, 85.6105% certainty)
0x01.classification... key: 0x01
Lowest success rate: 97.6562%
Highest success rate: 100%
Average success rate: 99.3896%
Figure 4.11: success output
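The overall figures at the end of this output can be derived from the per-key success rates as sketched below. The names are hypothetical and the fragment is not taken from the success program; rates are expressed as fractions between 0 and 1, and the guess rate assumes a uniform random choice among the candidate keys.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

struct SuccessSummary {
    double lowest;
    double highest;
    double average;
};

// Summarize per-key success rates. Assumes the input is not empty.
SuccessSummary summarize(const std::vector<double>& perKeyRates) {
    const auto extremes =
        std::minmax_element(perKeyRates.begin(), perKeyRates.end());
    const double sum =
        std::accumulate(perKeyRates.begin(), perKeyRates.end(), 0.0);
    return {*extremes.first, *extremes.second,
            sum / static_cast<double>(perKeyRates.size())};
}

// Success rate expected from guessing uniformly among the candidates.
double guessRate(std::size_t candidates) {
    return 1.0 / static_cast<double>(candidates);
}
```

With 16 candidate keys, `guessRate(16)` is 0.0625, i.e. 6.25%.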
4.5 Summary
We have described the experimental setup used to simulate, realize and measure the
characteristics of cryptographic hardware for this thesis.
The Side Channel Analysis Board (SCAB) was designed to be a platform for 
security researchers to investigate many kinds of side channel analysis. It was designed 
to meet the following constraints: 
• It must be possible to reconfigure SCAB with large, fast implementations of 
modern ciphers such as AES. 
• It must be possible to transfer blocks of data through parallel I/Os.
• It must be possible to assemble SCAB in Memorial's PCB facilities.
• The design should be as simple as possible.
• It should meet side channel-specific constraints:
- Power Analysis
* SCAB should incorporate two independent supply nets, one for core
logic and one for I/O.
* The core logic supply should have a small-valued resistor inserted in
series with the power supply for measurement purposes.
- Fault Analysis
* SCAB should incorporate both on-board - i.e. regulated - and external
power supply options.
* SCAB should be driven by an external clock.
* SCAB should have a large number of I/O pins to expose internal state
and allow verification of fault models.
- Timing Analysis
* SCAB should be driven by an external clock.
* SCAB should have a large number of I/O pins to expose internal state.
Other hardware in the setup included power supplies, switches and measurement
equipment.
This measurement equipment consisted of the Cleverscope CS328A, a PC-based 
oscilloscope. It was purchased for this research, and performed its tasks well. 
We also wrote 10,000 lines of C++ software to do many things: 
• reformat power data 
• calculate average power usage 
• simulate power usage
• view power traces and select template masks
• build templates
• classify power traces
• calculate classification success rates
This experimental setup was used to apply template attacks to stream cipher hardware.
The results of this application are given in the next two chapters.
Chapter 5 
Experimental Results and Analysis 
In this chapter, we present the initial results of our experimentation. These results 
consist of basic measurements of hardware characteristics and the application of the 
template attack technique to both simulated and measured power usage characteris-
tics of a stream cipher building block. 
5.1 Initial Experiments 
One of the first uses of the experimental setup was to evaluate the difference between 
the power consumed during the flip of a flip-flop and the power consumed at other 
times. In order to test this, we built a very simple circuit called "Flip-Flopper," 
shown in Figure 5.1. This circuit used a large number of identical elements, acting in 
parallel, to increase the ratio of data-dependent power usage to background noise. 
This circuit consists of 512 D flip-flops switching in concert between the values
0 and 1. There is a counter and two comparators, one to check for a counter value
of 5 and the other to check for a counter value of 9. These comparators control the
changes of the flip-flops: after initializing all values to 0, the circuit counts five clock
cycles, then all 512 flip-flops change their values from 0 to 1. After another nine clock
cycles, all flip-flops change from 1 back to 0. The output of this simple circuit, as
well as its instantaneous power usage, is shown in Figure 5.2.
Figure 5.1: The "FlipFlopper" Circuit
This power trace was determined according to Equation 4.1, and it shows that
not only is the power consumed by a bit flip greater than the static power, but the
power consumed when the bits flip from 0 to 1 (approximately 11 mW in total, for
all 512 flip-flops) is greater than the power consumed when the bits flip from 1 to 0
(approximately 6 mW in total). This difference between types of bit flip provides us
with more information than we expected.
Figure 5.2: FlipFlopper output and instantaneous power usage
These initial results provided us with the basic power characteristics of D flip-flops
in our FPGA hardware, as given in Table 5.1. The values in this table were derived
by measuring the total power usage of the system and dividing by 512, the number
of flip-flops in the system.
Event             | Minimum Power | Maximum Power | Mean Power | Power Range
Static            | 10.0 µW       | 11.9 µW       | 11.9 µW    | ±0.97 µW
Bit flip (1 to 0) | 17.2 µW       | 18.0 µW       | 17.6 µW    | ±0.39 µW
Bit flip (0 to 1) | 21.3 µW       | 22.5 µW       | 21.9 µW    | ±0.59 µW
Table 5.1: Power usage characteristics
From the mean values in this table, we created a very simple simulation model:
for every bit in a cryptographic system that remains the same, power usage will be
11.9 µW. For every bit that changes from 0 to 1, the power usage will be 21.9 µW, and
for every bit that changes from 1 to 0, 17.6 µW. To this ideal value, we add additive
white Gaussian noise (AWGN); how much noise we add is a parameter that we vary
while studying the attack's effectiveness.
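This simulation model can be sketched for a 16-bit register as follows. The fragment is an illustration, not the simulator itself: the function names are hypothetical, only the register's own flip-flops are counted, and the noise is parameterized by its standard deviation (an assumption; the experiments in this chapter quote peak noise values instead).

```cpp
#include <cstdint>
#include <random>

// Mean per-flip-flop power figures from Table 5.1, in watts.
constexpr double kStaticPower = 11.9e-6;
constexpr double kFallingPower = 17.6e-6;  // bit flips from 1 to 0
constexpr double kRisingPower = 21.9e-6;   // bit flips from 0 to 1

// Ideal power for one clock cycle of a 16-bit register, before noise.
double idealPower(std::uint16_t previous, std::uint16_t current) {
    double power = 0.0;
    for (int bit = 0; bit < 16; ++bit) {
        const bool was = ((previous >> bit) & 1u) != 0;
        const bool is = ((current >> bit) & 1u) != 0;
        if (was == is) power += kStaticPower;
        else if (is) power += kRisingPower;
        else power += kFallingPower;
    }
    return power;
}

// Ideal power plus additive white Gaussian noise.
// noiseStdDev must be greater than zero.
double noisyPower(std::uint16_t previous, std::uint16_t current,
                  double noiseStdDev, std::mt19937& rng) {
    std::normal_distribution<double> noise(0.0, noiseStdDev);
    return idealPower(previous, current) + noise(rng);
}
```

For example, a register that goes from 0x0000 to 0x0001 has fifteen static bits and one rising bit, giving an ideal value of 15 × 11.9 µW + 21.9 µW = 200.4 µW.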
5.2 LFSR-16 
Before attacking a full-fledged stream cipher, we started by attacking a basic building
block of many stream ciphers, the linear feedback shift register (LFSR).
LFSRs are shift registers that feed back on themselves, inserting a new bit at their
tail every clock cycle which is a linear combination of other bits in the register. On
its own, an LFSR is not a stream cipher: it can be cryptanalysed trivially because
of its linear nature. It is, however, a useful building block in the construction of real
stream ciphers.
We chose a simple 16-bit LFSR with the characteristic polynomial given in
Equation 5.1.
(5.1)
This LFSR has a maximal period: assuming it is not loaded with 0, it will shift 2^16 - 1
times before it repeats a previous state. A simple diagram of LFSR-16 is shown in
Figure 5.3.
Figure 5.3: Design of LFSR-16 
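The behaviour of a maximal-period 16-bit LFSR can be illustrated with the sketch below. The feedback polynomial used here, x^16 + x^14 + x^13 + x^11 + 1, is an assumption chosen because it is a commonly cited maximal-length example; it is not necessarily the polynomial of Equation 5.1, and the shift direction and function names are likewise illustrative.

```cpp
#include <cstdint>

// One shift of a 16-bit Fibonacci LFSR with taps 16, 14, 13 and 11
// (assumed polynomial x^16 + x^14 + x^13 + x^11 + 1).
std::uint16_t lfsrStep(std::uint16_t state) {
    const unsigned feedback =
        ((state >> 0) ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1u;
    return static_cast<std::uint16_t>((state >> 1) | (feedback << 15));
}

// Number of shifts before the register returns to its starting state.
std::uint32_t lfsrPeriod(std::uint16_t start) {
    std::uint16_t state = start;
    std::uint32_t period = 0;
    do {
        state = lfsrStep(state);
        ++period;
    } while (state != start);
    return period;
}
```

Starting from any non-zero state, the register visits all 2^16 - 1 = 65535 non-zero states before repeating, while the all-zero state maps to itself, which is why the register must not be loaded with 0.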
5.2.1 Simulation Results
Using the power usage characteristics in Table 5.1, we simulated LFSR-16 running on
an Actel ProASIC3 FPGA. The power usage of this cipher was simulated using a 16-bit
initial state (key) that was determined randomly, except for the most significant
four bits. These bits were fixed for any particular operation Q(k), so we were able to
generate 16 templates, one for each of the keys { 0x0XXX, 0x1XXX ... 0xfXXX },
where the most significant bit of the key is loaded into the left-most bit of the shift
register in Figure 5.3.
For each simulation of LFSR-16, we simulated a different number of sample traces
(16, 32, 64, 128 or 256). We also varied the amount of noise added to the power trace,
as well as the number of data points included in the template mask (see Section 3.2.4
on page 32). The detailed results of this analysis can be found in Appendix A, but
we present an overview here.
Figure 5.4 shows the basic inter-operation statistics for the simulated LFSR-16
(64 samples, peak noise 10^-6 W). The top graph is the inter-operation mean, and the
bottom graph is the inter-operation standard deviation. As expected, the greatest
deviation is early in the sample traces, before the key bits "mix in" to the cipher's
state.
The line across the standard deviation graph shows the cutoff for trace masking
with N = 8. Even with such a small number of data points, we are able to obtain
useful information from the trace so as to have very good classification success.
Figure 5.5 shows the classification success rate for simulated LFSR-16 when we
use L = 64 training samples per operation and the noise present has peak values of
10^-5 W. This noise value was chosen because it fits with the characteristics in Table
5.1. It shows the success rate increasing with template size, and even with template
size N < 5, the average classification success rate is greater than 6.25%, which is the
success rate we would expect if we were guessing randomly among the 16 candidate keys.
T he four lines on this graph are: 
..,...... - Trace Viewer 
file f dit View ttelp 
File I rTformation Statistics Trace; Power Usage 
Mean 
! ' !llj~~ ~ ! j ~ I I ' I I I \H! . ~ l I ,!II\ N '',,,/\ 11 ~I I 'l/ 1 \ ·~ 1/i\ 1 1/ I i i \f 
Standard Deviation 
Figure 5.4: Basic statistics of simulated LFSR-16 
I ~~ ~ ~ . II \ill \iii I ~ 
-
- -- -
~ 
A 
v 
CHAPTER 5. EXPERIMENTAL RESULTS AND ANALYSIS 70 
1. Maximum success rate: the highest rate of correct classification for any opera-
tion 
2. Average success rate: the average rate of correct classification over all operations 
3. Minimum success rate: the lowest rate of correct classification for any operation 
4. Guess rate: how successful we would expect to be if we guessed randomly 
Figure 5.5: Classification success vs. template size (simulated LFSR-16, 64 samples, 1e-5 W noise; x-axis: Template Size (N); y-axis: Classification Success Rate; series: Max, Avg, Min, Guess) 
Figure 5.6 shows the classification success rate versus template size when the noise 
is much lower, with a power peak of 10^-7 W. This noise level is lower than observed, 
but as we will see in Section 5.2.2, the results are a closer approximation to those 
obtained through physical experiment than those obtained using the noise level of 
10^-5 W (peak). 
With this noise level, we were able to achieve > 90% average classification success 
using as few as four data points and approximately 80% minimum success with as 
few as 10 points, making this an attack of remarkably low computational complexity. 
Figure 5.6: Classification success vs. template size (simulated LFSR-16, 64 samples, 1e-7 W noise; x-axis: Template Size (N); y-axis: Classification Success Rate; series: Max, Avg, Min, Guess) 
Effect of Noise. Increasing the amount of noise in the power traces has a negative 
effect on classification success, as shown in Figure 5.7. This graph shows a general 
downwards trend in classification success as the peak noise increases from 10^-7 W. 
Effect of Varying Bits Under Attack. Varying which key bits templates were 
constructed from also affected the success rates of the template attack. Inter-operation 
statistics are shown in Figures 5.8, 5.9, 5.10 and 5.11. 
These figures show that attacking less significant bits leads to more similar 
operation means, as shown by fewer peaks in inter-operation standard deviation. 
Correspondingly, we see in Table 5.2 that classification success rates diminish as 
we attack progressively less significant bits in the LFSR-16 key. 
Figure 5.7: Classification success vs. peak noise (simulated LFSR-16, N = 10, 16 samples; x-axis: peak noise from 1e-7 to 1e+0; y-axis: Classification Success Rate; series: Max, Avg, Min, Guess) 
| Bits Under Attack | Minimum | Average | Maximum | 
| 0 - 3             | 11.4%   | 19.3%   | 38.3%   | 
| 4 - 7             | 9.7%    | 14.9%   | 26.9%   | 
| 8 - 11            | 12.2%   | 17.2%   | 29.1%   | 
| 12 - 15           | 10.3%   | 16.2%   | 25.1%   | 
Table 5.2: Classification success rate vs. bits under attack 
This behaviour can be explained by observing the feedback "taps" in LFSR-16. 
The less significant the bits which vary according to operation, the more clock cycles 
it will take them to reach the feedback taps and affect other bits. Once the least 
significant bits have reached the feedback taps, however, the operation of the LFSR 
will have caused internal states to vary just as greatly within operations as between 
operations. Thus, inter-operation differences are reduced, as are classification success 
rates. 
Figure 5.8: Inter-operation statistics: varying bits 0-3 
Figure 5.9: Inter-operation statistics: varying bits 4-7 
Figure 5.10: Inter-operation statistics: varying bits 8-11 
Figure 5.11: Inter-operation statistics: varying bits 12-15 
Effect of Number of Training Samples. For a fixed number of sample points 
in the template mask, a higher number of training samples was more likely to yield 
a correct result, as shown in Figure 5.12. With such low noise, classification is very 
successful even with a low number of training samples, but as we will see in Section 
5.2.2, this level of classification success is realistic. 
Figure 5.12: Classification success vs. training samples (simulated LFSR-16, N = 12, 1e-7 W noise; x-axis: Training Samples (L); y-axis: Classification Success Rate; series: Max, Avg, Min, Guess) 
With this success in hand, we proceed to apply template attacks to the LFSR-16 
implemented in hardware. 
5.2.2 Experimental Results 
Having successfully attacked a simulated LFSR-16, we proceeded to a practical 
application of the template attack technique in real hardware. Using the SCAB and 
Cleverscope described in Chapter 4, we carefully measured the power used by LFSR-16 
during its initialization. The secret key was set to { 0xXX00, ..., 0xXX0F }, and 
256 samples were taken, allowing the hardware to initialize once with each possible 
combination of unspecified key bits. 
Figure 5.13 shows the classification success rate versus template size. As in the 
simulated results, approximately 90% average success was achieved with low 
template sizes (N = 11), and with N ≥ 12, the minimum classification success was 
approximately 80% or higher. 
Figure 5.13: Classification success vs. template size (hardware LFSR-16, 256 samples; x-axis: Template Size (N); y-axis: Classification Success Rate; series: Max, Avg, Min, Guess) 
The fact that the template attack performed better against real hardware than 
against simulated hardware has to do with information content. The simulated LFSR-16's 
power usage carries information at the edge of a clock pulse, but the physical 
LFSR-16's trace carries some information during the rest of the pulse, too - though 
less than at the edge. The inter-operation standard deviation for the physical LFSR-16 
is shown in Figure 5.14; compared to Figure 5.4, we can see that there are many 
data points per clock cycle whose standard deviation rises well above the background. 
The line in Figure 5.14's standard deviation graph (the bottom graph) shows that 
48 points - all centred around five clock transitions - can be selected from the trace 
whose values are clearly more significant than the others. 
Information content also affects classification success in that, for implementation 
reasons, the keys used for the hardware LFSR-16 had four fixed bits. As mentioned 
above, the secret keys were in the set { 0xXX00, ..., 0xXX0F }, not { 0xXXX0, ..., 
0xXXXF }. This is because keys were fed to the LFSR-16 via manual interaction, in 
the form of DIP switches. To attack a full LFSR-16, we would have to build more 
sophisticated off-board hardware to load randomly-generated keys and initialization 
vectors. This, combined with the PC software to drive it, is beyond the scope of this 
research. To attack LFSR-16, we simply fixed four key bits to 0, set four more in an 
operation-dependent manner and iterated through all 256 possibilities for the eight 
unfixed bits. 
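The key schedule just described can be made concrete with a short sketch. It assumes the 16-bit key is read as the hexadecimal value 0xXX0p, where the low nibble p selects the operation, the next nibble is fixed at 0 and the upper byte XX is swept exhaustively; the function name and the mapping from hex digits to physical flip-flops are illustrative assumptions, not taken from the experimental software.

```python
def hardware_keys(operation):
    """Yield the 256 keys of the form 0xXX0p for one operation p (0..15)."""
    if not 0 <= operation <= 0xF:
        raise ValueError("operation must fit in four bits")
    for upper in range(256):            # the eight unfixed bits (XX)
        yield (upper << 8) | operation  # bits 4-7 are left fixed at 0


# One list of 256 keys per operation, i.e. 16 * 256 initializations in total.
keys_by_operation = {op: list(hardware_keys(op)) for op in range(16)}
```

Each operation is therefore observed once for every setting of the unfixed byte, which is where the 256 samples per operation come from.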
Inter-operation standard deviation peaks are observed later in Figure 5.14 than in 
Figure 5.4; this is due to the loading of secret keys { 0xXX00, ..., 0xXX0F } and not 
{ 0xF0XX, ..., 0xFFXX }. The peaks start occurring at clock edge 8 instead of clock 
edge 0; this is precisely what we would expect if the differing key bits were loaded 
into the four right-most flip-flops in Figure 5.3. 
Figure 5.14: Hardware LFSR-16 statistics (Trace Viewer screenshot; panels: Mean, Standard Deviation) 
5.3 Summary 
In this chapter, we revealed the results of our initial experiments using the FlipFlopper 
and LFSR-16 circuits. 
Using the FlipFlopper circuit, we were able to measure the power usage 
characteristics of the FPGA on SCAB. These characteristics led us to a power model to use 
for simulating hardware on the FPGA. 
Using this power model, we simulated the operation of LFSR-16 hardware, and 
applied the template attack to its power usage. As expected, increased noise caused 
a decrease in classification success, but we were able to recover information about the 
secret key very successfully in many different noise conditions. 
We then proceeded to apply the template attack to a real hardware implemen-
tation of LFSR-16. We were able to correctly guess secret key bits over 90% of the 
time, even with such small template sizes as N = 12. 
Having successfully attacked LFSR-16 in both simulation and hardware, and having 
found good classification success with both, we proceeded to attack a simulated 
implementation of a real stream cipher: Trivium. 
Chapter 6 
Application of Template Attack to 
Trivium 
Trivium is a candidate cipher for the eSTREAM stream cipher selection process 
(hardware profile) [5]. By applying Template Attacks, we were able to extract secret 
key material from a simulated version of this cipher. 
6.1 Description 
Trivium is a stream cipher that was developed for eSTREAM, a four-year effort to 
identify "promising new stream ciphers," some targeting software implementation and 
some targeting hardware [40]. Trivium is of the latter group, and it was designed "as 
an exercise in exploring how far a stream cipher can be simplified without sacrificing 
its security, speed or flexibility" [5]. 
Trivium has a 288-bit internal state which is updated through a combination of 
linear and non-linear feedback. It can generate up to 2^64 bits of keystream from an 
80-bit secret key and 80-bit initialization vector. It was designed to be implemented 
in a parallel fashion: no state bit is used for 64 clock cycles after it is updated, so up 
to 64 iterations of the cipher can be calculated in parallel [5]. 
Precise specifications are given below, but intuitively, Trivium can be thought of 
as a collection of Feedback Shift Registers, as shown in Figure 6.1. 
Figure 6.1: Trivium [5] 
Before the keystream can be generated, the internal state has to be initialized. 
The state is initially loaded with the 80-bit secret key (state bits 1 - 80) and an 80-bit 
initialization vector (state bits 94 - 173). Three bits are then set to 1 (bits 286 - 288) 
and the remaining 125 bits are set to 0. The initialization procedure from Figure 6.2 
is then followed, where s[i] is the ith bit of the internal state and t1, t2 and t3 are 
temporary variables. 
for i = 1 to 4 * 288 do 
    t1 = s[66] xor (s[91] and s[92]) xor s[93] xor s[171] 
    t2 = s[162] xor (s[175] and s[176]) xor s[177] xor s[264] 
    t3 = s[243] xor (s[286] and s[287]) xor s[288] xor s[69] 
    (s[1], s[2], ..., s[93]) = (t3, s[1], ..., s[92]) 
    (s[94], s[95], ..., s[177]) = (t1, s[94], ..., s[176]) 
    (s[178], s[179], ..., s[288]) = (t2, s[178], ..., s[287]) 
end for 
Figure 6.2: Trivium initialization 
Keystream generation - shown in Figure 6.3 - is similar, but involves an output 
variable z, which is the current keystream output. 
for i = 1 to N do 
    t1 = s[66] xor s[93] 
    t2 = s[162] xor s[177] 
    t3 = s[243] xor s[288] 
    z = t1 xor t2 xor t3 
    t1 = t1 xor (s[91] and s[92]) xor s[171] 
    t2 = t2 xor (s[175] and s[176]) xor s[264] 
    t3 = t3 xor (s[286] and s[287]) xor s[69] 
    (s[1], s[2], ..., s[93]) = (t3, s[1], ..., s[92]) 
    (s[94], s[95], ..., s[177]) = (t1, s[94], ..., s[176]) 
    (s[178], s[179], ..., s[288]) = (t2, s[178], ..., s[287]) 
end for 
Figure 6.3: Trivium keystream generation 
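The two procedures above translate almost line for line into Python. The sketch below is illustrative rather than a reference implementation: the function names are hypothetical, the state is held in a list whose index 0 is unused so that indices match the 1-based figures, and the order in which key and IV bits are placed into the state is an assumption that should be checked against the Trivium specification [5] before comparing with published test vectors.

```python
def trivium_load(key_bits, iv_bits):
    """Build the initial 288-bit state from an 80-bit key and 80-bit IV."""
    if len(key_bits) != 80 or len(iv_bits) != 80:
        raise ValueError("key and IV must each be 80 bits")
    s = [0] * 289                 # s[0] is unused; s[1]..s[288] hold the state
    s[1:81] = key_bits            # key in s[1]..s[80] (bit order assumed)
    s[94:174] = iv_bits           # IV in s[94]..s[173] (bit order assumed)
    s[286] = s[287] = s[288] = 1  # the last three bits are set to 1
    return s


def trivium_step(s):
    """Clock the state once in place and return the keystream bit z."""
    t1 = s[66] ^ s[93]
    t2 = s[162] ^ s[177]
    t3 = s[243] ^ s[288]
    z = t1 ^ t2 ^ t3
    t1 ^= (s[91] & s[92]) ^ s[171]
    t2 ^= (s[175] & s[176]) ^ s[264]
    t3 ^= (s[286] & s[287]) ^ s[69]
    s[1:94] = [t3] + s[1:93]        # shift register 1 (93 bits)
    s[94:178] = [t1] + s[94:177]    # shift register 2 (84 bits)
    s[178:289] = [t2] + s[178:288]  # shift register 3 (111 bits)
    return z


def trivium_keystream(key_bits, iv_bits, n):
    """Initialize with 4 * 288 discarded clocks, then return n keystream bits."""
    s = trivium_load(key_bits, iv_bits)
    for _ in range(4 * 288):
        trivium_step(s)           # same update as Figure 6.2; z is discarded
    return [trivium_step(s) for _ in range(n)]
```

Because the initialization update in Figure 6.2 is the keystream update of Figure 6.3 with the output ignored, a single step function can serve both phases.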
6.2 Simulation Results 
Most Trivium simulations were performed with AWGN added, at a peak power noise 
of 10^-7 W. This is a value which we found, for LFSR-16, produced success rates 
approximately equivalent to those obtained from hardware experimentation. In all 
cases, the right-most key bits were varied according to operation; the remaining bits 
were allowed to vary randomly. 
6.2.1 Classification Success Rate vs. Template Size 
Figure 6.4 shows our classification success rates for simulated Trivium versus template 
size. There are four lines on the graph: 
• Maximum success rate 
  - this is the highest classification success for any operation 
  - e.g. if four operations { Q(0), Q(1), Q(2), Q(3) } had classification success 
    rates { 45%, 32%, 51%, 29% }, the maximum success rate would be 51% 
• Average success rate 
  - this is the average of classification success rates over all operations 
  - e.g. if four operations { Q(0), Q(1), Q(2), Q(3) } had classification success 
    rates { 45%, 32%, 51%, 29% }, the average success rate would be 39.25% 
• Minimum success rate 
  - this is the lowest classification success for any operation 
  - e.g. if four operations { Q(0), Q(1), Q(2), Q(3) } had classification success 
    rates { 45%, 32%, 51%, 29% }, the minimum success rate would be 29% 
• "Guess" rate 
  - this is how successful we would expect to be if we guessed randomly 
  - this rate is 1/2^n, where n is the number of bits included in the template 
    * if we fixed four bits, we would have 2^4 = 16 operations and the 
      probability of a correct guess would be 1/2^4 = 6.25% 
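The four quantities plotted on each graph can be reproduced from a list of per-operation success rates. The following sketch uses the worked numbers from the list above; the function names are illustrative and rates are expressed in percent.

```python
def summarize_success(rates_percent):
    """Return (maximum, average, minimum) of per-operation success rates."""
    average = sum(rates_percent) / len(rates_percent)
    return max(rates_percent), average, min(rates_percent)


def guess_rate_percent(bits):
    """Chance of guessing an n-bit subkey correctly at random, in percent."""
    return 100 / 2 ** bits


# Per-operation rates for Q(0)..Q(3) from the example in the text.
maximum, average, minimum = summarize_success([45, 32, 51, 29])
four_bit_guess = guess_rate_percent(4)
```

With these inputs the maximum is 51%, the average 39.25% and the minimum 29%, and the four-bit guess rate is 6.25%, matching the figures quoted in the text.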
Figure 6.4: Classification success vs. template size - Trivium (simulated Trivium, 4096 training samples, 1e-8 W noise, 16 templates; x-axis: Template Size (N); y-axis: Classification Success Rate; series: Max, Avg, Min, Guess) 
These success rates are lower than for LFSR-16 with the same amount of added 
noise but average success rates as high as 22% were achieved - better than the 6.25% 
that we would expect to achieve through random guessing. 
6.2.2 Classification Success vs. Training Samples 
As expected, increasing the number of training samples increased the probability of 
success, though success rates increased roughly linearly for exponentially increasing 
numbers of samples. This is shown in Figure 6.5, where we can see that the maximum, 
average and minimum classification success rates are monotonically increasing with 
the number of training samples. 
Figure 6.5: Classification success vs. training samples - Trivium (simulated Trivium, N = 32, 1e-8 W noise, 16 templates; logarithmic x-axis from 10 to 10000; y-axis: Classification Success Rate; series: Max, Avg, Min, Guess) 
For this research, we spent considerable time simulating the cipher under varying 
conditions. Simulating Trivium with 4,096 training samples per operation might 
only take an hour, but simulating the cipher's operation and performing analysis for 
varying template sizes might take a day. Thus, while using more than 4,096 training 
samples would be prohibitively time-consuming for this research, an individual or 
organization mounting a serious side channel attack could spend significant time - 
and computational power - building templates from many training samples. 
6.2.3 Classification Success Rate vs. Bits Under Attack 
With Trivium simulations, we also varied the number of bits under attack, running 
simulations and analysis for one-bit templates (2^1 = 2 operations), two-bit templates 
(2^2 = 4 operations), four-bit templates (2^4 = 16 operations) and eight-bit templates 
(2^8 = 256 operations). 
A linear increase in the number of bits under attack led to an exponential increase in 
the computation required to perform all simulation and analysis. On first inspection, 
however, maximum, average and minimum classification success rates all seem to 
vary exponentially with the inverse of the number of bits being attacked, as shown 
in Figure 6.6. 
Figure 6.6: Trivium classification success vs. bits being attacked (simulated Trivium, 64 samples, N = 20, 1e-8 W noise; x-axis: Bits Being Attacked (One Bit, Two Bits, Four Bits, Eight Bits); y-axis: Classification Success Rate; series: Max, Avg, Min) 
What this graph does not reveal, however, is how the classification success 
compares to the expected classification success rate if we had no information about the 
cipher - i.e. if we guessed randomly. For n bits, we expect that random guessing 
would yield the correct subkey 1/2^n of the time. Our improvement over this rate tells 
us how much information each guess must reveal to enable as many correct guesses 
as we have made, and this information leakage can be calculated by Equation 6.1: 
    l = B - \log_2\left(\frac{1}{s}\right),    (6.1) 
where l is the information leakage, B is the number of bits being attacked and s 
is the classification success rate. 
Figure 6.7: Trivium information leakage (panel title: Information Leakage vs. Template Size; simulated Trivium, 64 training samples, 1e-8 W noise; x-axis: Template Size (N); series: One Bit, Two Bits, Four Bits, Eight Bits) 
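As a quick numerical check of Equation 6.1, the sketch below evaluates the leakage for a few success rates. It assumes the equation reads l = B - log2(1/s), which gives zero leakage at the random-guess rate s = 1/2^B and B bits at s = 1; the function name is illustrative.

```python
import math


def information_leakage(bits_attacked, success_rate):
    """Leakage in bits, l = B - log2(1/s), for a success rate s in (0, 1]."""
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    return bits_attacked - math.log2(1 / success_rate)


no_leak = information_leakage(4, 1 / 16)  # guessing at random: 0 bits
full_leak = information_leakage(4, 1.0)   # always correct: all 4 bits
observed = information_leakage(4, 0.22)   # 22% success on four bits
```

Under this reading, a 22% success rate on four bits corresponds to roughly 1.8 bits of leakage per guess, and any rate below the guess rate yields a negative value.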
Figure 6.7 shows the information leakage for attacks on various numbers of bits. 
From this graph, we can see that the information obtained via attacking eight bits of 
key can be approximately twice that obtained from attacking four bits of key. 
The attack is stronger as more bits are attacked, but this greatly increases 
computational complexity: if n is the number of bits being attacked, then 2^n templates must 
be generated, requiring L * 2^n total template samples. As mentioned above, however, 
a serious side channel attack could be mounted on a system using resources such as 
computing clusters. This would make practical attack a very realistic possibility. 
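To give a feel for how quickly this cost grows, the following sketch tabulates the number of templates and training traces for the template widths used in this chapter. The function name is illustrative, and L = 4,096 is simply the largest per-operation sample count mentioned above.

```python
def template_cost(bits_attacked, samples_per_template):
    """Return (number of templates, total training traces) for n attacked bits."""
    templates = 2 ** bits_attacked
    return templates, samples_per_template * templates


# Cost for one-, two-, four- and eight-bit templates with L = 4096.
costs = {n: template_cost(n, 4096) for n in (1, 2, 4, 8)}
```

Moving from four-bit to eight-bit templates raises the template count from 16 to 256 and the number of training traces from 65,536 to 1,048,576.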
6.3 Trivium Hardware 
We did not apply template attacks to the power usage of Trivium hardware, as we did 
with LFSR-16. The reason for this is entirely practical: feeding random key and IV 
values from attack software to the cipher hardware would require a more sophisticated 
experimental setup than we currently have. While this would be a logical attack for 
future work to implement, it is beyond the scope of this thesis. 
Considering the similarity of our simulation and hardware results against LFSR-16, 
however, we conjecture that template attacks could be applied against real 
hardware implementations of Trivium. 
6.4 Summary 
Trivium is a very simple stream cipher which has withstood the rigours of the 
eSTREAM process, and is part of the final eSTREAM portfolio. Because of this, as 
well as its overwhelming popularity among stream cipher researchers [41], it is a very 
important cipher. 
By applying template attacks, we were able to extract secret key material from a 
simulated version of Trivium. Our classification success rate was as high as 22% in 
noise conditions that we saw were reasonable in Chapter 5. This success rate could 
be increased by using more training samples per operation or by capturing multiple 
traces from the device under attack and exploiting the joint information contained in 
all of them. 
We conjecture that these attacks could be realised against practical cryptosystems. 
Implementers of Trivium, and other practical stream ciphers, should take care to 
ensure that their implementations are not vulnerable to these attacks. 
Chapter 7 
Conclusions 
As cryptography continues to be implemented in embedded systems such as smart 
cards and RFIDs, implementers of cryptographic systems must consider threat models 
that include adversaries having physical access to cipher hardware. This physical 
access enables attack via side channel analysis, including the powerful class of attack 
known as template attacks. 
In this thesis, we have demonstrated that template attacks can be applied to 
stream ciphers implemented not just via microcontrollers, but also in reconfigurable 
hardware. To this end, we have prepared an experimental setup that includes the 
Side Channel Analysis Board (SCAB), measurement equipment and software. SCAB 
is a custom PCB designed to support research in side channel analysis, with features 
to aid researchers in performing power analysis, electromagnetic analysis, fault 
analysis and timing analysis. In this research, we have used the power analysis features 
of SCAB, measuring the power used by stream cipher hardware with a PC-based 
oscilloscope called Cleverscope. We have also written 10,000 lines of C++ code to 
perform simulation and analysis of the power usage of cryptographic hardware. 
Using this experimental setup, we measured the power usage characteristics of 
FPGA-based hardware. Having found that these characteristics could be exploited for 
side channel analysis, we constructed a simple power usage model from them, which 
included the above characteristics and Additive White Gaussian Noise (AWGN). We 
simulated the operation of a stream cipher building block, a 16-bit Linear Feedback 
Shift Register (LFSR-16), and applied template attacks to its simulated power usage. 
We were able to recover secret key material from these simulated power traces - 
success rates depended on the amount of AWGN present, but even with very high 
amounts of noise, success still exceeded the 6.25% rate that we would expect had 
we made random guesses at key bits. We then implemented LFSR-16 in hardware, 
measuring its power usage with the Cleverscope and analysing it with our software. 
We were able to recover secret key bits with success rates greater than 90% even 
with small template sizes (N < 20). 
From this success, we simulated the power usage of Trivium, a stream cipher that 
has been vetted by the eSTREAM initiative. For this complete stream cipher, we 
were able to retrieve four correct bits of key information for over 20% of our guesses, 
and our investigations indicate that higher success would be possible for a dedicated 
attacker with reasonable computational resources. 
We thus conclude that side channel analysis is a very real threat to stream cipher 
hardware, and implementers of such hardware should take care to evaluate their 
implementations for susceptibility to this class of attacks. 
Future Work 
This thesis presents a black-box approach to attacking stream cipher hardware. 
Future work would include attacking different groups of bits within Trivium to determine 
the bits which are most or least susceptible to Template Attacks, as well as exploring 
techniques to combine attacks so as to extract the maximum amount of key 
information possible. 
Future work would also include more application of the method to physical 
hardware, especially the final eSTREAM portfolio ciphers (hardware focus) - F-FCSR-H 
v2 [2], Grain v2 [3], MICKEY v2 [4] and Trivium [5]. This work will require a more 
elaborate experimental setup. The number of key and IV bits that must be 
determined randomly will be much larger - approximately 80 bits each - which rules out 
the current method of IV generation: exhaustive search. Rather, unfixed key bits 
and all IV bits must be generated by hardware and/or software external to the device 
being tested - likely in software on the PC controlling the attack - and exported to 
the hardware being analysed. 
Other important future work is determining the effectiveness of traditional side 
channel countermeasures against the Template Attack. Many countermeasures were 
designed to defeat Differential Power Analysis, but the principles of the Template 
Attack are quite different. Whether or not they can be applied, and what techniques are 
effective at foiling the Template Attack, should be of particular interest to hardware 
designers. 
Bibliography 
[1] "Announcing the Advanced Encryption Standard," National Institute of 
Standards and Technology (NIST), Tech. Rep. FIPS 197, Nov. 2001. 
[2] F. Arnault and T. Berger, "F-FCSR: design of a new class of stream ciphers," 
Fast Software Encryption-FSE, vol. 3557, pp. 83-97, 2005. 
[3] M. Hell, T. Johansson, and W. Meier, "Grain - a stream cipher for 
constrained environments," eSTREAM - ECRYPT Stream Cipher Project, Tech. 
Rep. 2005/010, 2005. 
[4] S. Babbage and M. Dodd, "The stream cipher MICKEY-128," eSTREAM - 
ECRYPT Stream Cipher Project, Tech. Rep. 2005/016, 2005. 
[5] C. de Canniere and B. Preneel, "Trivium Specifications," available from 
eSTREAM (http://www.ecrypt.eu.org/stream/triviump2.html). 
[6] J. Muir, "Techniques of Side Channel Cryptanalysis," Master's thesis, University 
of Waterloo, 2001. 
[7] S. Chari, J. Rao, and P. Rohatgi, "Template Attacks," in Proceedings of 
Cryptographic Hardware and Embedded Systems, vol. LNCS 2535, 2002, pp. 13-28. 
[8] L. Smith, Cryptography: The Science of Secret Writing. Dover Publications, 
1955. 
[9] R. Anderson, Security Engineering: A Guide to Building Dependable Distributed 
Systems, 2001. 
[10] N. Ferguson and B. Schneier, Practical cryptography. John Wiley & Sons, 2003. 
[11] W. Diffie, P. Oorschot, and M. Wiener, "Authentication and authenticated key 
exchanges," Designs, Codes and Cryptography, vol. 2, no. 2, pp. 107-125, 1992. 
[12] C. Shannon, "Communication theory of secrecy systems." 
[13] "Data Encryption Standard (DES)," National Institute of Standards and 
Technology (NIST), Tech. Rep. FIPS 46-3, Oct. 1999. 
[14] E. Foundation, M. Loukides, and J. Gilmore, Cracking DES: Secrets of 
Encryption Research, Wiretap Politics and Chip Design. O'Reilly & Associates, Inc., 
Sebastopol, CA, USA, 1998. 
[15] N. Mowlavi, "The Future of our Sun and Stars," The Future of the Universe and 
the Future of our Civilization, pp. 57-69, 2000. 
[16] R. Rivest, A. Shamir, and L. Adleman, "A method for obtaining digital signatures 
and public-key cryptosystems," Communications of the ACM, vol. 21, no. 2, pp. 
120-126, 1978. 
[17] S. Garfinkel, PGP: Pretty Good Privacy. O'Reilly, 1995. 
[18] D. Kahn, The Codebreakers. New York: Macmillan, 1967. 
[19] J. Daemen, R. Govaerts, and J. Vandewalle, "A New Approach Towards Block 
Cipher Design," in Fast Software Encryption, FSE 2003. Springer-Verlag, 1993. 
[20] P. C. Kocher, "Timing Attacks on Implementations of Diffie-Hellman, RSA, DSS, 
and Other Systems," in Advances in Cryptology: Proceedings of CRYPTO '96, 
vol. 96. Springer-Verlag, 1996, pp. 104-113. 
[21] H. Handschuh and H. M. Heys, "A Timing Attack on RC5," Lecture Notes in 
Computer Science 1556: Selected Areas in Cryptography - SAC '98, pp. 306-318, 
1999. 
[22] D. Osvik, A. Shamir, and E. Tromer, "Cache attacks and countermeasures: the 
case of AES," CT-RSA, pp. 1-20, 2006. 
[23] E. Biham and A. Shamir, "Differential Fault Analysis," in Advances in 
Cryptology: Proceedings of CRYPTO '97, vol. LNCS 1294. Springer-Verlag, 1997, pp. 
513-525. 
[24] D. Boneh, "On the Importance of Eliminating Errors in Cryptographic 
Computations," Journal of Cryptology, vol. 14, no. 2, pp. 101-119, 2001. 
[25] J. Blömer and J.-P. Seifert, "Fault based cryptanalysis of the advanced encryption 
standard (AES)," LNCS 2742: FC 2003, pp. 162-181, 2003. 
[26] S. Skorobogatov and R. Anderson, "Optical fault induction attacks," Proceedings 
of CHES '02, pp. 2-12, 2002. 
[27] J.-J. Quisquater and D. Samyde, "Eddy current for magnetic analysis with active 
sensor," Proceedings of Int. Conf. on Research in Smart Cards (E-Smart 2002), 
pp. 185-194, 2002. 
[28] R. Anderson and M. Kuhn, "Low Cost Attacks on Tamper Resistant Devices," 
in 5th International Workshop on Security Protocols, vol. LNCS 1361. 
Springer-Verlag, 1997, pp. 125-126. 
[29] --, "Tamper Resistance - A Cautionary Note," in Proceedings of the Second 
USENIX Workshop on Electronic Commerce, 1996. 
[30] P. C. Kocher, J. Jaffe, and B. Jun, "Differential Power Analysis," in Proceedings of 
the 19th Annual International Cryptology Conference on Advances in Cryptology, 
1999, pp. 388-397. 
[31] K. Gandolfi, C. Mourtel, and F. Olivier, "Electromagnetic Analysis: Concrete 
Results," Cryptographic hardware and embedded systems - CHES 2001: Third 
International Workshop, Paris, France, May 14-16, 2001: Proceedings, 2001. 
[32] C. Rechberger and E. Oswald, "Stream Ciphers and Side-Channel Analysis," in 
Workshop on the State of the Art in Stream Ciphers, 2004, pp. 320-326. 
[33] Identification cards - Integrated circuit cards - Part 1: Physical characteristics, 
International Standards Organization (ISO), Oct. 1998. 
[34] S. F. Arnold, The Theory of Linear Models and Multivariate Analysis, ser. Wiley 
Series in Probability and Mathematical Statistics. Wiley, 1981. 
[35] C. S. Davis, Statistical Methods for the Analysis of Repeated Measurements, ser. 
Springer Texts in Statistics. Springer, 2002. 
[36] R. Duda, P. Hart et al., Pattern classification and scene analysis, 1973. 
[37] A. Menezes, P. Van Oorschot, and S. Vanstone, Handbook of Applied 
Cryptography. CRC Press, 1997. 
[38] C. Su, T. Lin, C. Huang, and C. Wu, "A high-throughput low-cost AES 
processor," Communications Magazine, IEEE, vol. 41, no. 12, pp. 86-91, 2003. 
[39] "Cleverscope." [Online]. Available: http://www.cleverscope.com/ 
[40] The eSTREAM Project. [Online]. Available: http://www.ecrypt.eu.org/stream/ 
[41] S. Babbage, C. D. Canniere, A. Canteaut, C. Cid, H. Gilbert, T. Johansson, 
M. Parker, B. Preneel, V. Rijmen, and M. Robshaw, "The eSTREAM Portfolio," 
ECRYPT, Tech. Rep., Apr. 2008. 
[42] Qt cross-platform application framework. [Online]. Available: 
http://trolltech.com/products/qt/ 
Appendix A 
Detailed Results 
A.1 Simulation 
A.1.1 LFSR-16 
All LFSR-16 data is for an attack against four key bits. 
16 Training Samples per Operation 
Table A.1 contains the classification success rates when the peak power noise was 
10^-8 W. 
| N  | Minimum | Average | Maximum | 
| 1  | 0       | 10.3%   | 51%     | 
| 2  | 0       | 25.8%   | 56%     | 
| 3  | 17%     | 48.5%   | 71%     | 
| 4  | 40%     | 94.2%   | 100%    | 
| 6  | 0       | 12.5%   | 100%    | 
| 8  | 0       | 43.8%   | 100%    | 
| 10 | 0       | 56.1%   | 100%    | 
| 12 | 0       | 0       | 0       | 
| 14 | 0       | 0       | 0       | 
Table A.1: 16 training samples per operation (10^-8 W noise) 
Table A.2 contains the classification success rates when the peak power noise was 
w-7 w. 
N I Minimum I Average I Maximum I 
1 0 9.8% 59% 
2 2% 25.2% 56% 
3 17% 52.1% 70% 
4 42% 94.1% 100% 
6 50% 94.4% 100% 
8 48% 94.9% 100% 
10 54% 93.6% 100% 
12 58% 89.5% 100% 
14 24% 75.8% 100% 
Table A.2: 16 training samples per operation (10^-7 W noise)
Table A.3 contains the classification success rates when the peak power noise was
10^-6 W.
N | Minimum | Average | Maximum
1 0 10.3% 59% 
2 0 12.4% 30% 
3 1% 14.9% 30% 
4 9% 20.7% 39% 
6 9% 21.8% 39% 
8 11% 22.6% 36% 
10 9% 22.9% 43% 
12 4% 22.9% 41% 
14 4% 23.6% 34% 
Table A.3: 16 training samples per operation (10^-6 W noise)
Table A.4 contains the classification success rates when the peak power noise was
10^-5 W.
N | Minimum | Average | Maximum
1 0 6.1% 31% 
2 0 6.6% 26% 
3 0 7.7% 24% 
4 0 8.1% 25% 
6 0 9.8% 27% 
8 0 10.7% 25% 
10 0 12.8% 34% 
12 0 12.4% 35% 
14 0 12.6% 44% 
Table A.4: 16 training samples per operation (10^-5 W noise)
Table A.5 contains the classification success rates when the peak power noise was
10^-4 W.
N | Minimum | Average | Maximum
1 0 6.7% 49% 
2 0 7.9% 33% 
3 0 7.9% 26% 
4 0 8.8% 28% 
6 0 10.2% 30% 
8 0 11.6% 27% 
10 0 12.3% 32% 
12 0 12.8% 37% 
14 0 12.5% 36% 
Table A.5: 16 training samples per operation (10^-4 W noise)
Table A.6 contains the classification success rates when the peak power noise was
10^-3 W.
N | Minimum | Average | Maximum
1 0 6.5% 34% 
2 0 6.6% 29% 
3 0 6.4% 22% 
4 0 8.1% 29% 
6 0 9.4% 26% 
8 0 9.5% 24% 
10 0 11.3% 27% 
12 0 12.3% 31% 
Table A.6: 16 training samples per operation (10^-3 W noise)
Table A.7 contains the classification success rates when the peak power noise was
0.01 W.
N | Minimum | Average | Maximum
1 0 7.8% 46% 
2 0 7.4% 31% 
3 0 7.7% 25% 
4 0 8.0% 21% 
6 0 9.1% 21% 
8 0 11.7% 23% 
10 0 12.7% 28% 
12 0 13.2% 30% 
Table A.7: 16 training samples per operation (0.01 W noise)
Table A.8 contains the classification success rates when the peak power noise was
0.1 W.
N | Minimum | Average | Maximum
1 0 6.8% 59% 
2 0 8.0% 56% 
3 0 8.6% 70% 
4 0 10.4% 100% 
6 0 11.5% 100%
8 0 14.3% 100% 
10 0 16.8% 100% 
12 0 19.8% 100% 
Table A.8: 16 training samples per operation (0.1 W noise)
Table A.9 contains the classification success rates when the peak power noise was
1 W.
N | Minimum | Average | Maximum
1 0 6.6% 48% 
2 0 8.5% 26% 
3 0 9.1% 23% 
4 0 9.6% 21% 
6 0 11.8% 23% 
8 0 12.6% 21% 
10 0 16.8% 22% 
12 0 19.3% 28% 
Table A.9: 16 training samples per operation (1 W noise)
32 Training Samples per Operation
Table A.10 contains the classification success rates when the peak power noise was
10^-8 W.
N | Minimum | Average | Maximum
1 0 8.2% 55% 
2 0 28.2% 66% 
3 16% 51.7% 77% 
4 50% 95.0% 100% 
6 53% 95.7% 100% 
8 56% 95.9% 100% 
10 54% 95.5% 100% 
12 64% 96.2% 100% 
14 72% 96.6% 100% 
16 0 0 0 
20 0 0 0 
24 0 10.3% 89% 
Table A.10: 32 training samples per operation (10^-8 W noise)
Table A.11 contains the classification success rates when the peak power noise was
10^-7 W.
N | Minimum | Average | Maximum
1 0 10.1% 65% 
2 0 28.0% 56% 
3 14% 53.2% 78% 
4 50% 94.8% 100% 
6 53% 94.8% 100% 
8 58% 95.3% 100% 
10 59% 95.0% 100% 
12 53% 94.8% 100% 
14 51% 94.4% 100% 
16 51% 94.3% 100% 
20 50% 92.6% 100% 
24 54% 90.4% 100% 
Table A.11: 32 training samples per operation (10^-7 W noise)
Table A.12 contains the classification success rates when the peak power noise was
10^-6 W.
N | Minimum | Average | Maximum
1 0 10.7% 47% 
2 0 14.8% 38% 
3 2% 19.5% 43% 
4 9% 24.8% 41% 
6 14% 26.3% 39% 
8 16% 27.0% 42% 
10 15% 28.4% 41% 
12 18% 29.6% 40% 
14 26% 31.4% 43% 
16 20% 31.2% 40% 
20 14% 32.6% 40% 
24 7% 33.9% 40% 
Table A.12: 32 training samples per operation (10^-6 W noise)
Table A.13 contains the classification success rates when the peak power noise was
10^-5 W.
N | Minimum | Average | Maximum
1 0 7.3% 43% 
2 0 8.0% 24% 
3 0 9.0% 23% 
4 0 9.1% 24% 
6 0 10.3% 26% 
8 0 11.1% 32% 
10 0 13.8% 32% 
12 0 14.8% 37% 
14 0 16.7% 40% 
16 0 18.1% 44% 
20 0 19.6% 45% 
24 0 20.1% 43% 
Table A.13: 32 training samples per operation (10^-5 W noise)
64 Training Samples per Operation
Table A.14 contains the classification success rates when the peak power noise was
10^-8 W.
N | Minimum | Average | Maximum
1 0 11.7% 40% 
2 0 25.8% 58% 
3 27% 53.5% 73% 
4 52% 94.2% 100% 
6 64% 95.8% 100% 
8 62% 95.4% 100% 
10 59% 95.1% 100% 
12 56% 94.6% 100% 
14 58% 94.9% 100% 
16 56% 95.0% 100% 
20 51% 94.9% 100% 
24 41% 94.1% 89%
Table A.14: 64 training samples per operation (10^-8 W noise)
Table A.15 contains the classification success rates when the peak power noise was
10^-7 W.
N | Minimum | Average | Maximum
1 0 12.4% 63% 
2 0 28.2% 56% 
3 14% 55.6% 70% 
4 50% 94.1% 100% 
6 53% 95.3% 100% 
8 58% 96.4% 100% 
10 59% 96.8% 100% 
12 53% 96.6% 100% 
14 51% 96.8% 100% 
16 51% 97.1% 100% 
20 50% 97.2% 100% 
24 54% 97.9% 100% 
Table A.15: 64 training samples per operation (10^-7 W noise)
Table A.16 contains the classification success rates when the peak power noise was
10^-6 W.
N | Minimum | Average | Maximum
1 0 9.8% 48% 
2 0 14.1% 36% 
3 5% 18.3% 33% 
4 3% 23.1% 38% 
6 13% 26.0% 39% 
8 10% 26.4% 42% 
10 11% 29.6% 42% 
12 13% 31.6% 49% 
14 11% 32.8% 49% 
16 12% 34.7% 50% 
20 12% 38.1% 54% 
24 8% 40.9% 61% 
Table A.16: 64 training samples per operation (10^-6 W noise)
Table A.17 contains the classification success rates when the peak power noise was
10^-5 W.
N | Minimum | Average | Maximum
1 0 7.6% 46% 
2 0 8.8% 34% 
3 0 9.7% 32% 
4 0 9.5% 24%
6 2% 11.0% 22% 
8 5% 13.6% 23% 
10 2% 15.2% 28% 
12 2% 18.6% 32% 
14 2% 20.6% 30% 
16 4% 24.1% 34% 
20 2% 32.3% 47% 
24 2% 39.0% 55% 
Table A.17: 64 training samples per operation (10^-5 W noise)
In addition to the full data sets, we also collected partial sets at 64 training
samples per operation for other noise values.
Table A.18 contains the classification success rates when the peak power noise was
10^-4 W.
N | Minimum | Average | Maximum
24 1% 24.5% 57%
Table A.18: 64 training samples per operation (10^-4 W noise)
Table A.19 contains the classification success rates when the peak power noise was
10^-3 W.
N | Minimum | Average | Maximum
24 1% 37.4% 53%
Table A.19: 64 training samples per operation (10^-3 W noise)
Table A.20 contains the classification success rates when the peak power noise was
10^-2 W.
N | Minimum | Average | Maximum
24 1% 26.8% 53%
Table A.20: 64 training samples per operation (10^-2 W noise)
Table A.21 contains the classification success rates when the peak power noise was
0.1 W.
N | Minimum | Average | Maximum
24 3% 37.4% 53%
Table A.21: 64 training samples per operation (0.1 W noise)
Table A.22 contains the classification success rates when the peak power noise was
1 W.
N | Minimum | Average | Maximum
24 2% 21.0% 52%
Table A.22: 64 training samples per operation (1 W noise)
128 Training Samples per Operation
Table A.23 contains the classification success rates when the peak power noise was
10^-6 W.
N | Minimum | Average | Maximum
1 0 10.8% 61% 
2 0 14.4% 39% 
3 5% 18.8% 33% 
4 1% 22.1% 41% 
6 7% 26.7% 46% 
8 14% 28.9% 47% 
10 14% 30.1% 52% 
12 15% 31.1% 52% 
14 11% 33.0% 58% 
16 11% 34.2% 59% 
20 11% 38.3% 65% 
24 10% 42.8% 66% 
Table A.23: 128 training samples per operation (10^-6 W noise)
Table A.24 contains the classification success rates when the peak power noise was
10^-5 W.
N | Minimum | Average | Maximum
1 0 8.1% 47% 
2 0 8.5% 38% 
3 0 10.2% 36% 
4 0 10.1% 34% 
6 0 11.0% 31% 
8 0 11.7% 30% 
10 1% 12.9% 28% 
12 1% 13.6% 29% 
14 1% 14.9% 32% 
16 0 16.9% 37% 
20 0 22.1% 55% 
24 0 26.3% 57%
24 1% 34.9% 75% 
Table A.24: 128 training samples per operation (10^-5 W noise)
256 Training Samples per Operation
Table A.25 contains the classification success rates when the peak power noise was
10^-7 W.
N | Minimum | Average | Maximum
1 0 10.5% 69% 
2 1% 27.4% 62% 
3 15% 54.6% 68% 
4 58% 94.9% 100% 
6 70% 96.5% 100% 
8 70% 96.8% 100% 
10 69% 96.3% 100% 
12 70% 96.3% 100% 
14 61% 95.9% 100% 
16 60% 95.9% 100% 
20 62% 96.4% 100% 
24 66% 97.2% 100% 
Table A.25: 256 training samples per operation (10^-7 W noise)
Table A.26 contains the classification success rates when the peak power noise was
10^-5 W.
N | Minimum | Average | Maximum
1 0 7.3% 36% 
2 0 8.0% 36% 
3 0 7.2% 31% 
4 0 6.6% 25% 
6 2% 6.9% 18% 
8 2% 7.1% 12% 
10 4% 7.8% 20% 
12 3% 7.8% 18% 
14 2% 7.4% 19% 
16 3% 7.8% 22% 
20 2% 8.1% 24% 
24 0 8.1% 29% 
Table A.26: 256 training samples per operation (10^-5 W noise)
A.1.2 Trivium
All Trivium simulations, unless otherwise specified, were performed with 256 training
samples and a peak power noise of 10^-8 W.
A.1.2.1 One Key Bit
The results of attacking one key bit (79 bits randomly assigned) are given in Table
A.27.
N | Minimum | Average | Maximum
1 43% 49.5% 56% 
2 43% 53.0% 63% 
3 44% 53.0% 62% 
4 48% 57.0% 66% 
6 46% 53.5% 61% 
8 50% 52.5% 55% 
10 48% 52.0% 56% 
12 45% 47.5% 50% 
14 44% 51.5% 59% 
16 44% 50.5% 57% 
20 51% 55.0% 59%
24 39% 48.0% 57% 
32 35% 49.0% 49% 
Table A.27: Trivium results - attacking one key bit 
A.1.2.2 Two Key Bits 
The results of attacking two key bits (78 bits randomly assigned) are given in Table 
A.28. 
N | Minimum | Average | Maximum
1 1% 27.0% 42% 
2 9% 27.3% 48% 
3 10% 27.3% 44% 
4 19% 29.5% 41% 
6 21% 24.8% 29%
8 22% 25.0% 30% 
10 17% 23.3% 27% 
12 22% 24.3% 27% 
14 24% 27.3% 30% 
16 26% 26.8% 27% 
20 22% 26.3% 31% 
24 26% 29.0% 37% 
32 23% 25.8% 39% 
Table A.28: Trivium results - attacking two key bits
A.1.2.3 Four Key Bits
When attacking four key bits (76 bits randomly assigned), simulations were performed
with power noise of 10^-7 W and 10^-8 W for several numbers of training samples.
64 Training Samples per Operation
The attack results when the power noise is 10^-8 W are given in Table A.29.
N | Minimum | Average | Maximum
1 0% 7.25% 47% 
2 0% 7.19% 29% 
3 1% 7.19% 24% 
4 0% 5.81% 25% 
6 1% 6.19% 17% 
8 3% 7.38% 12% 
10 2% 6.81% 14% 
12 3% 5.81% 11% 
14 1% 8.56% 16% 
16 8% 13.3% 20% 
20 8% 12.1% 17% 
24 5% 11.3% 18%
32 5% 11.9% 16% 
Table A.29: Trivium results - attacking four key bits, 64 samples, 10^-8 W peak noise
The attack results when the power noise is 10^-7 W are given in Table A.30.
N | Minimum | Average | Maximum
1 0% 6.88% 32% 
2 0% 7.00% 21% 
3 1% 7.31% 21% 
4 1% 6.00% 14% 
6 2% 6.31% 12% 
8 1% 5.94% 13% 
10 1% 7.00% 14% 
12 4% 6.50% 12% 
14 1% 6.44% 13% 
16 3% 6.69% 13% 
20 1% 7.50% 13% 
24 2% 6.25% 10% 
32 2% 6.13% 10% 
Table A.30: Trivium results - attacking four key bits, 64 samples, 10^-7 W peak noise
256 Training Samples per Operation
The attack results when the power noise is 10^-8 W are given in Table A.31.
N | Minimum | Average | Maximum
1 0% 7.2% 52% 
2 0% 7.2% 34% 
3 0% 7.2% 36% 
4 0% 5.8% 28% 
6 0% 6.2% 24% 
8 1% 7.4% 22% 
10 2% 6.8% 14% 
12 2% 5.8% 14% 
16 10% 13.3% 24% 
20 6% 12.1% 21% 
24 7% 11.3% 18% 
32 8% 11.9% 18% 
Table A.31: Trivium results - attacking four key bits, 256 samples, 10^-8 W peak noise
The attack results when the power noise is 10^-7 W are given in Table A.32.
N | Minimum | Average | Maximum
1 0% 7.8% 32% 
2 0% 8.5% 21% 
3 0% 7.0% 21% 
4 0% 7.8% 14% 
6 1% 7.9% 12% 
8 1% 7.7% 13% 
10 2% 6.7% 14% 
12 2% 7.6% 12% 
16 3% 11.6% 13% 
20 6% 10.2% 13% 
24 4% 9.3% 10% 
32 2% 8.5% 10% 
Table A.32: Trivium results - attacking four key bits, 256 samples, 10^-7 W peak noise
1024 Training Samples per Operation
The attack results when the power noise is 10^-8 W are given in Table A.33.
N | Minimum | Average | Maximum
1 0% 6.9% 55% 
2 0% 7.1% 43% 
3 0% 7.7% 36% 
4 0% 7.5% 29% 
6 1% 7.4% 23% 
8 2% 7.0% 21%
10 1% 7.4% 17% 
12 1% 7.0% 17% 
16 6% 16.6% 27% 
20 10% 16.4% 28% 
24 11% 20.1% 32% 
32 12% 19.3% 29% 
Table A.33: Trivium results - attacking four key bits, 1024 samples, 10^-8 W peak noise
4096 Training Samples per Operation
The attack results when the power noise is 10^-8 W are given in Table A.34.
N | Minimum | Average | Maximum
1 0% 8.6% 58% 
2 0% 8.0% 49% 
3 0% 8.2% 46% 
4 0% 8.1% 43% 
6 0% 7.8% 40% 
8 0% 8.1% 33% 
10 0% 7.4% 30% 
12 1% 7.2% 29% 
16 8% 17.8% 27% 
20 12% 22.1% 37% 
24 11% 22.4% 35% 
32 13% 21.6% 35% 
Table A.34: Trivium results - attacking four key bits, 4096 samples, 10^-8 W peak noise
A.1.2.4 Eight Key Bits
The results of attacking eight key bits (72 bits randomly assigned) are given in Table
A.35. 64 training samples were used in all cases.
N | Minimum | Average | Maximum
1 0% 0.48% 21% 
2 0% 0.54% 21% 
3 0% 0.47% 9% 
4 0% 0.49% 8% 
6 0% 0.41% 9% 
8 0% 0.46% 17% 
10 0% 0.43% 4% 
12 0% 0.49% 3% 
14 0% 0.75% 5% 
16 0% 0.92% 5% 
20 0% 1.63% 5% 
24 0% 1.39% 6% 
32 0% 1.10% 5% 
Table A.35: Trivium results - attacking eight key bits 
A.2 Physical Measurement 
Physical measurements were performed on the LFSR-16 cipher building block using 256
training samples. Results are given in Table A.36.
N | Minimum | Average | Maximum
1 0 11.3% 40.6%
2 0 26.7% 59.0% 
3 29.7% 52.3% 58.6% 
4 60.2% 65.9% 73.4% 
6 65.2% 73.0% 80.5%
8 72.3% 85.6% 95.7%
10 72.7% 88.2% 96.1%
12 77.7% 91.6% 96.5% 
16 85.2% 94.1% 97.7% 
20 93.4% 97.4% 99.6% 
24 95.7% 98.5% 100% 
32 97.7% 99.4% 100% 
40 98.8% 99.7% 100% 
Table A.36: Physical measurement results 
Appendix B 
Software Data Formats 
B.1 Cleverscope Text Files
The Cleverscope text-based format has a header, beginning with the line "[Sample
Definition]", and a body, beginning with the line "[Data]". An example of this
format is shown in Figure B.1.
[Sample Definition]
Type=Time
UseBuffer=FALSE
UseDig=TRUE
ChAscale=1.000000
ChAoffset=0.000000
ChBscale=1.000000
ChBoffset=0.000000
delta=0.0000000100
start=0.0006427900
nsample=7047
offset=0
Save Time=8/20/2007 2:35:46 PM
[Data]
Time        Chan A      Chan B      Dig
0.00064279  1.50011814  1.48075295  240.00000000
0.00064280  1.49992914  1.47955595  240.00000000
0.00064281  1.50005514  1.48058195  240.00000000
0.00064282  1.49879514  1.47915695  240.00000000
0.00064283  1.49948814  1.47927095  240.00000000
0.00064284  1.49911014  1.47852995  240.00000000
0.00064285  1.49961414  1.47898595  240.00000000
0.00064286  1.49904714  1.47972695  240.00000000
0.00064287  1.49929914  1.47938495  240.00000000
Figure B.1: Cleverscope text file example
B.1.1 Header
The header of a Cleverscope text file contains several pieces of information important
to our analysis:
• Usage of digital traces
- If digital traces were captured by the Cleverscope unit, the UseDig parameter
is TRUE; otherwise, it is FALSE.
• Analog scale, offset
- Each analog channel (A and B) has a scale and an offset associated with
it; the analog channel data specified below must be multiplied by the
scale and have the offset added to it.
• Sampling period
- The time between samples is given by the delta parameter. While the
sampling period does not affect template attacks directly, we do read and
store it to ensure that we only attempt to add or multiply traces with the
same sampling period.
• Number of samples
- The number of sample points in the trace is given by the nsample parameter.
After loading sample points from the file, we ensure that the entire
file was loaded by comparing the number of loaded points to nsample.
Other parameters, such as "Save Time", are not important for the research, but are
nonetheless parsed and saved.
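The header handling described above can be sketched in standard C++ as follows. This is a minimal illustration rather than the code used in this work: the names trim, parseHeader and calibrate are hypothetical, the Qt types are replaced by standard library ones, and we assume that the scale is applied before the offset.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper: strips leading and trailing spaces, tabs and CRs.
static std::string trim(const std::string& s)
{
    const auto first = s.find_first_not_of(" \t\r");
    if (first == std::string::npos) return "";
    const auto last = s.find_last_not_of(" \t\r");
    return s.substr(first, last - first + 1);
}

// Collects the "key=value" lines found between "[Sample Definition]"
// and "[Data]" into a map; parsing stops at the start of the body.
std::map<std::string, std::string> parseHeader(const std::string& text)
{
    std::map<std::string, std::string> header;
    std::istringstream in(text);
    std::string line;
    bool inHeader = false;
    while (std::getline(in, line)) {
        line = trim(line);
        if (line == "[Sample Definition]") { inHeader = true; continue; }
        if (line == "[Data]") break;
        const auto eq = line.find('=');
        if (inHeader && eq != std::string::npos)
            header[trim(line.substr(0, eq))] = trim(line.substr(eq + 1));
    }
    return header;
}

// Applies a channel's scale and offset to one raw sample
// (assumed order: multiply by the scale, then add the offset).
double calibrate(double raw, double scale, double offset)
{
    assert(scale != 0.0);  // a zero scale would discard the measurement
    return raw * scale + offset;
}
```

After the body has been read, the number of loaded points can be compared against std::stoi(header.at("nsample")) to confirm that the whole file was loaded.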
B.1.2 Body
The body of a Cleverscope text file contains tab-delimited lines of data in four
columns: 
1. Time: the time, in seconds, that the data was sampled 
2. Chan A: the voltage measured by Channel A 
3. Chan B: the voltage measured by Channel B 
4. Dig: digital trace values 
(a) This number varies between 0 and 255, and represents the values of all 
eight digital traces 
(b) Retrieving a particular trace's value is a matter of bit masking: 
for (int j = 0; j < 8; j++)
    digitalTraces[j]->append(value & (1 << j));
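As a self-contained illustration of the bit masking above, the following sketch splits one Dig sample into its eight trace values using only the standard library. The function name splitDigitalSample is hypothetical; bit j of the sample is taken to be the value of digital trace j, as in the loop above.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Splits one Dig sample (0..255) into its eight digital trace values.
// Bit j of the sample is the value of digital trace j.
std::array<bool, 8> splitDigitalSample(std::uint8_t value)
{
    std::array<bool, 8> bits{};
    for (int j = 0; j < 8; j++)
        bits[j] = ((value >> j) & 1u) != 0;
    return bits;
}
```

For the Dig value of 240 in Figure B.1, traces 4 to 7 are high and traces 0 to 3 are low.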
B.2 Analog Trace Files
An AnalogTrace C++ object has six attributes:
Name          Type            Description
myName        QString         Name of the trace (e.g. "Channel A")
unit          Unit*           Unit of trace values (e.g. Volts, Watts)
timeDivision  double          Time between samples
trace         QList<double>   Actual trace values
minValue      double          Smallest value in the trace
maxValue      double          Largest value in the trace
QString and QList are data structures from the Qt C++ toolkit [42], double is the
64-bit IEEE-standard C++ primitive and Unit is a class that we wrote to manage
trace units (e.g. dissimilar units cannot be added, multiplying an Amp by a Volt
produces a Watt).
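The two unit rules mentioned above can be illustrated with a small sketch. It is not the Unit class used in this work: the Unit and Quantity types below are hypothetical stand-ins that reduce a unit to integer exponents of volts and amps, which is enough to express "Volt times Amp gives Watt" and to reject the addition of dissimilar units.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for a unit: integer exponents of volts and amps,
// so that Watt = Volt * Amp corresponds to exponents (1, 1).
struct Unit {
    int volts;
    int amps;

    bool operator==(const Unit& other) const
    {
        return volts == other.volts && amps == other.amps;
    }

    // Multiplying quantities adds the exponents of their units.
    Unit operator*(const Unit& other) const
    {
        return Unit{volts + other.volts, amps + other.amps};
    }
};

const Unit Volt{1, 0};
const Unit Amp{0, 1};
const Unit Watt{1, 1};

// A value tagged with its unit.
struct Quantity {
    double value;
    Unit unit;
};

Quantity operator*(const Quantity& a, const Quantity& b)
{
    return Quantity{a.value * b.value, a.unit * b.unit};
}

// Addition is only defined for identical units.
Quantity operator+(const Quantity& a, const Quantity& b)
{
    if (!(a.unit == b.unit))
        throw std::invalid_argument("cannot add dissimilar units");
    return Quantity{a.value + b.value, a.unit};
}
```

Checking units at run time in this way means that an attempt to add, say, a voltage trace to a current trace fails immediately instead of silently producing meaningless values.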
Such a trace can be written to two types of files: text-based or binary. 
B.2.1 Text 
When writing small traces to file, we may choose to write them in a text-based format
that facilitates direct inspection. This is accomplished via the Qt class QTextStream.
A QTextStream object, which is associated with a QFile object, can be used to read
or write primitives such as strings and double-precision floating-point numbers. An
example of the output is shown in Figure B.2.
AnalogTrace ("Mean Power Usage for Unnamed Trace", W, 192
values, 1e-06s apart, range [1.11306e-05:0.000221819]){
0.000221806 1.11333e-05 1.11326e-05 0.000221819
1.11322e-05 1.11331e-05 0.000221815 1.11339e-05 1.11316e-05
0.000221804 1.11328e-05 1.11315e-05 0.000215942
}
Figure B.2: Example of a text-based AnalogTrace file
B.2.2 Binary 
When writing files that are large or will be read many times, it is more efficient to
write AnalogTrace objects in a binary format. Such a format is smaller than the
equivalent text-based format, and it saves the computational effort required to parse
floating-point numbers from text.
Writing an AnalogTrace to a binary file - or reading it back - is accomplished using
the Qt class QDataStream. Like QTextStream above, a QDataStream object can be
used to read or write primitives such as strings and double-precision floating-point
numbers. The binary format includes a "magic" number - used to recognize the
format - and a binary format version (currently version 2). The process of writing
such a file is shown in Figure B.3, and a sample trace as viewed in a hex editor is
shown in Figure B.7.
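The role of the magic number and version can be illustrated without Qt. The sketch below is an assumption-laden stand-in for the QDataStream code: it writes the two 32-bit fields in big-endian order (QDataStream's default byte order) into a byte vector and checks them on the way back in. The magic value 0x5CAB00A7 is the one given in Figure B.3; the function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Magic number from Figure B.3 and the current format version.
const std::uint32_t kAnalogMagic = 0x5CAB00A7;
const std::uint32_t kAnalogVersion = 2;

// Appends a 32-bit value in big-endian order (most significant byte first).
void putU32(std::vector<std::uint8_t>& out, std::uint32_t v)
{
    for (int shift = 24; shift >= 0; shift -= 8)
        out.push_back(static_cast<std::uint8_t>((v >> shift) & 0xFF));
}

// Reads a big-endian 32-bit value starting at offset pos.
std::uint32_t getU32(const std::vector<std::uint8_t>& in, std::size_t pos)
{
    assert(pos + 4 <= in.size());
    std::uint32_t v = 0;
    for (std::size_t i = 0; i < 4; i++)
        v = (v << 8) | in[pos + i];
    return v;
}

// Returns true if the buffer starts with the expected magic number and version.
bool looksLikeAnalogTrace(const std::vector<std::uint8_t>& file)
{
    return file.size() >= 8
        && getU32(file, 0) == kAnalogMagic
        && getU32(file, 4) == kAnalogVersion;
}
```

Checking both fields before reading anything else lets a reader reject files of another type, or of a newer format version, before it misinterprets their contents.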
B.3 Digital Trace Files 
Digital trace files are much simpler than analog traces, as they contain a binary
trace - there are no units or minimum/maximum values to be concerned with. A
DigitalTrace C++ object has just two attributes:
Name          Type          Description
timeDivision  double        Sampling period
trace         QList<bool>   Binary trace
QDataStream& power::operator << (QDataStream& d,
                                 AnalogTrace& trace)
{
    d << (quint32) 0x5CAB00A7;   // magic: SCAB AT (Analog Trace)
    d << (quint32) 2;            // binary format version
    d << trace.name();
    d << trace.units().toString();
    d << trace.period();
    d << trace.values().size();
    for (long int i = 0; i < trace.values().size(); i++)
        d << trace.values()[i];
    return d;
}
Figure B.3: Writing a binary AnalogTrace file
Digital traces are also simpler to parse than analog traces - there is only
one floating-point number per file - and in our usage, they are also much smaller,
since we only use them for subtrace masking, and subtraces are much smaller than
full traces (see Section 4.4.2). Thus, we only write digital trace files in a text-based
format, though a binary representation is required when writing digital traces as part
of power usage files.
B.3.1 Text
The text-based digital trace file format is quite simple, as shown in Figure B.5,
incorporating just the trace length, sampling period and actual trace values.
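A reader for this format might look like the following sketch, which relies only on the layout visible in Figure B.5 ("DigitalTrace (N values, Ts apart){ bits }"). It is an illustration in standard C++, not the parser used in this work; the ParsedDigitalTrace type and parseDigitalTrace function are hypothetical, and any character other than 0 or 1 between the braces is simply skipped.

```cpp
#include <cassert>
#include <cstdio>
#include <string>
#include <vector>

struct ParsedDigitalTrace {
    double period = 0.0;       // sampling period in seconds
    std::vector<bool> values;  // one entry per sample
};

// Parses text of the form "DigitalTrace (N values, Ts apart){ 0101... }".
// Returns false if the header is malformed or the bit count differs from N.
bool parseDigitalTrace(const std::string& text, ParsedDigitalTrace& out)
{
    unsigned long count = 0;
    double period = 0.0;
    if (std::sscanf(text.c_str(), " DigitalTrace (%lu values, %lfs apart)",
                    &count, &period) != 2)
        return false;

    const auto open = text.find('{');
    const auto close = text.find('}');
    if (open == std::string::npos || close == std::string::npos || close < open)
        return false;

    out.period = period;
    out.values.clear();
    for (auto i = open + 1; i < close; i++) {
        if (text[i] == '0' || text[i] == '1')
            out.values.push_back(text[i] == '1');
    }
    return out.values.size() == count;
}
```

Comparing the number of parsed bits against the declared trace length serves the same purpose as the nsample check for Cleverscope files: a truncated file is detected rather than silently accepted.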
[Screenshot: test.trace opened in KHexEdit, with the "Magic", "Version" and "Trace Name" fields annotated]
Figure B.4: Example of a binary AnalogTrace file 
DigitalTrace (192 values, 1s apart){
100100100100100100100 ... 0000 }
Figure B.5: Example of a text-based DigitalTrace file
B.3.2 Binary
Like that of an AnalogTrace, the DigitalTrace's binary representation uses a "magic"
value for the purposes of format recognition and a format version - currently version
1. The code to write such a file is shown in Figure B.6.
QDataStream& power::operator << (QDataStream& ds,
                                 DigitalTrace& t)
{
    ds << (quint32) 0x5CAB00D7;   // magic: SCAB DT (Digital Trace)
    ds << (quint32) 1;            // binary format version
    ds << t.period();             // sampling period
    ds << t.size();               // size
    const QList<bool> values = t.values();
    ds << values;
    return ds;
}
Figure B.6: Writing a binary DigitalTrace file
B.4 Power Usage Files 
A PowerUsage file is a binary representation of two things: 
• an AnalogTrace containing a power trace 
• a DigitalTrace that partitions the trace into subtraces (see Section 4.4.2)
This file consists of another "magic" number, a version (currently version 1), two binary
values (to indicate the presence of an analog and digital trace, respectively) and then
the binary representations of the power trace and partitioning trace. An example of
this format is shown in Figure B.8.
[Screenshot: test.trace opened in KHexEdit, with the "Magic" and "Trace Name" fields annotated]
Figure B.7: Example of a binary AnalogTrace file
B.5 Power Simulation 
Hardware was modeled in this work in two parts: first the hardware itself was
characterized, then hardware power usage was simulated using these characteristics as a
model.
[Screenshot: a simulated LFSR-16 PowerUsage file opened in a hex editor, with the PowerUsage "Magic", "Version" and AnalogTrace "Magic" fields annotated; the embedded trace name reads "Power Usage of LFSR-16 using key 0xXXX0"]
Figure B.8: Example of a binary PowerUsage file 
class PowerUsageModel
{
public:
    double noiseLevel() const;          //!< Amount of AWGN in ...
    void setNoiseLevel(double);         //!< Set amount of AWGN ...

    virtual float basic() const = 0;
    virtual float zeroToOne() const = 0;
    virtual float oneToZero() const = 0;

protected:
    float noise() const;                //!< Noise
    float noise(float scale) const;     //!< Noise in a specific +- ...

private:
    double myNoiseLevel;                //!< Amount of AWGN
};
Figure B.9: PowerUsageModel interface
B.5.1 Power Model 
Once hardware has been characterized, a C++ class can be written which implements
the PowerUsageModel interface, which is shown in Figure B.9.
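As an illustration of what such an implementation might look like, the sketch below restates a reduced form of the interface using only the standard library and derives one concrete model from it. Several things here are assumptions rather than details of this work: the class ExampleFlipFlopModel and its three power figures are placeholders, basic() is interpreted as the power drawn when a bit holds its value, and the noise level is treated as the standard deviation of a zero-mean Gaussian.

```cpp
#include <cassert>
#include <random>

// Reduced restatement of the PowerUsageModel interface (standard library only).
class PowerUsageModel
{
public:
    virtual ~PowerUsageModel() = default;

    double noiseLevel() const { return myNoiseLevel; }
    void setNoiseLevel(double level) { myNoiseLevel = level; }

    virtual float basic() const = 0;      // assumed: bit holds its value
    virtual float zeroToOne() const = 0;  // low -> high transition
    virtual float oneToZero() const = 0;  // high -> low transition

protected:
    // Zero-mean Gaussian noise; noiseLevel() is assumed to be its
    // standard deviation. Returns 0 when no noise is configured.
    float noise() const
    {
        if (myNoiseLevel <= 0.0) return 0.0f;
        std::normal_distribution<double> awgn(0.0, myNoiseLevel);
        return static_cast<float>(awgn(myGenerator));
    }

private:
    double myNoiseLevel = 0.0;
    mutable std::mt19937 myGenerator{12345};  // fixed seed for repeatability
};

// Hypothetical model: the power figures (in watts) are placeholders,
// not measurements from this work.
class ExampleFlipFlopModel : public PowerUsageModel
{
public:
    float basic() const override { return 1.0e-5f + noise(); }
    float zeroToOne() const override { return 2.2e-4f + noise(); }
    float oneToZero() const override { return 2.0e-4f + noise(); }
};
```

Keeping the noise source in the base class means that every model derived from it responds to setNoiseLevel() in the same way, so the same cipher can be simulated at different noise levels without changing the model itself.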
B.5.2 Cipher Model 
Simulating hardware requires simulating the number of high-low and low-high
transitions of a cipher. A Qt/C++ class (a C++ class using the Qt class library and
preprocessed by Qt's Meta Object Compiler - MOC) must be written which inherits
from the abstract class Cipher, shown in Figure B.10.
class Cipher : public QObject
{
public:
    //! Represents what happens when a cipher changes state
    struct StateChange
    {
        // ...
        int ll; int lh; int hl; int hh;
    };

    //! The cipher's name
    virtual QString name() const = 0;

    //! Current cipher state (should be human-readable)
    virtual QString stateString() const = 0;

    virtual int minimumKeySize() const = 0;
    virtual int maximumKeySize() const = 0;
    virtual int minimumIVSize() const = 0;
    virtual int maximumIVSize() const = 0;

    //! Initialize the cipher for use
    virtual void initialize(const Cryptovariable& key,
                            const Cryptovariable& iv) = 0;
    virtual void initialize(const QList<bool>& key,
                            const QList<bool>& iv) = 0;

    /**
     * Cycle the clock
     *
     * @returns a StateChange class representing the number of
     *          low->high and high->low transitions etc.
     */
    virtual StateChange clock() = 0;

signals:
    //! The internal state has changed
    void newState(QString state);
};
Figure B.10: Cipher interface
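To make the meaning of StateChange concrete, the following sketch counts the four kinds of bit transition for a 16-bit shift register on each clock cycle. It uses only the standard library and does not derive from the Qt-based Cipher class; the class ExampleLfsr16 is hypothetical, and its tap positions (16, 14, 13, 11) are a common textbook choice rather than the taps of the LFSR-16 block studied in this work.

```cpp
#include <cassert>
#include <cstdint>

// Counts of bit transitions between two consecutive states
// (ll = low->low, lh = low->high, hl = high->low, hh = high->high).
struct StateChange {
    int ll = 0;
    int lh = 0;
    int hl = 0;
    int hh = 0;
};

// Compares two 16-bit register states bit by bit.
StateChange countTransitions(std::uint16_t before, std::uint16_t after)
{
    StateChange change;
    for (int bit = 0; bit < 16; bit++) {
        const bool was = ((before >> bit) & 1u) != 0;
        const bool is = ((after >> bit) & 1u) != 0;
        if (!was && !is) change.ll++;
        else if (!was && is) change.lh++;
        else if (was && !is) change.hl++;
        else change.hh++;
    }
    return change;
}

// Hypothetical 16-bit Fibonacci LFSR with taps 16, 14, 13 and 11.
class ExampleLfsr16
{
public:
    explicit ExampleLfsr16(std::uint16_t seed) : state(seed) {}

    // Advances one clock cycle and reports the resulting transitions.
    StateChange clock()
    {
        const std::uint16_t old = state;
        const std::uint16_t feedback =
            ((old >> 0) ^ (old >> 2) ^ (old >> 3) ^ (old >> 5)) & 1u;
        state = static_cast<std::uint16_t>((old >> 1) | (feedback << 15));
        return countTransitions(old, state);
    }

    std::uint16_t value() const { return state; }

private:
    std::uint16_t state;
};
```

The four counts always sum to the register width, and a power model can then assign a cost to each kind of transition to estimate the power drawn in that clock cycle.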




