Physical Security Analysis of Embedded Devices by Ege, B.
PDF hosted at the Radboud Repository of the Radboud University Nijmegen
The following full text is a publisher's version.
For additional information about this publication, see http://hdl.handle.net/2066/158436
Please be advised that this information was generated on 2017-12-05 and may be subject to change.
Physical Security Analysis
of Embedded Devices
Barış Ege
Copyright © Barış Ege, 2016
ISBN: 978-90-9029785-9
IPA dissertation series: 2016-07
Typeset using LaTeX
Cover design: Barış Ege, Rachel–Groß Hardt and Valentina Banciu. PCB photo on
the cover is a derivative of “electronic circuits” © http://creativity103.com
Printed by: Gildeprint
The work in this thesis has been carried out under the auspices of the research school
IPA (Institute for Programming research and Algorithmics).
This work is in part financed by the Dutch Technology Foundation STW (project
12624 - SIDES).
Physical Security Analysis
of Embedded Devices
Doctoral Thesis
to obtain the degree of doctor
from Radboud University Nijmegen
on the authority of the Rector Magnificus,
according to the decision of the Council of Deans
to be defended in public on Tuesday, July 5, 2016
at 10:30 hours
by Barış Ege
born on May 26, 1984
in Ankara (Turkey)
Supervisor:
Prof. dr. B.P.F. Jacobs
Co-supervisor:
Dr. L. Batina
Doctoral Thesis Committee:
Prof. dr. J.J.C. Daemen
Prof. dr. F.X. Standaert (Université Catholique de Louvain, Belgium)
Prof. dr. I. Verbauwhede (KU Leuven, Belgium)
Dr. E. Oswald (University of Bristol, United Kingdom)
Dr. P. Schaumont (Virginia Polytechnic Institute and State University, USA)
Acknowledgements
I’ve had many extraordinary experiences in the past 4 years as a PhD student and
I would like to thank a few people for their support and friendship. First of all, I
would like to mention that I feel privileged to have been a part of the DS Group in
Nijmegen as a PhD student. I would like to thank Bart Jacobs for establishing and
maintaining such a pleasant work environment. I would like to thank Lejla Batina for
believing in me and pushing hard to take me as a PhD student even though funding
was scarce at the time I started. Being hired as a PhD student in Nijmegen was a
trigger for many positive changes in my life and I’m grateful to her for that. I would
also like to thank Elisabeth Oswald, Patrick Schaumont, François-Xavier Standaert
and Ingrid Verbauwhede for taking the time to be on my reading committee. I would
like to thank Joan Daemen for his patience in our discussions and many insightful
comments on improving this thesis.
Early on in my PhD period, Roel Verdult, Flavio Garcia and I found ourselves
in a predicament. Although this event made the news with flashy headlines later on,
the process was difficult, not to mention stressful. I would like to use this
opportunity to thank Roel Verdult and Flavio Garcia for the amazing effort they
put in to defend our case. But most importantly, I would like to thank Bart Jacobs
for the support and guidance he provided during this difficult time of our lives.
In the 4 years I spent as a PhD student, there were a few periods which I
spent visiting other researchers. The first was my internship at Riscure, soon
after I started my PhD. There I worked with Ileana Buhan and enjoyed the
warm working environment. I learned a lot in that period and I’m grateful for the
kickstart Riscure provided me on side channel analysis. Later on, I had a brief visit
to Michael Hutter in Graz, where I met and worked with Thomas Korak. Seeing
their intense approach to work inspired me to push myself harder and I believe it
changed me for the better. The last of my visits was to Thomas Eisenbarth, to whom
I’m grateful as I learned a lot from him through our many thought-provoking
discussions. I had to study new topics and learn fast to be able to keep up with his
vast knowledge in side channel analysis. I would like to thank him for pushing me to
my limits.
I would also like to thank my co-authors, Łukasz Chmielewski, Amitabh Das,
Thomas Eisenbarth, Santosh Ghosh, Michael Hutter, Elif Bilge Kavun, Thomas Korak,
Jasmina Omic, Kostas Papagiannopoulos, Stjepan Picek and Tolga Yalçın for
their inspiration and contribution to this thesis.
Moving thousands of kilometres away from home and getting used to a very different
culture was not an easy task to accomplish. However, there are a few people
that I would like to thank for making this process easier for me. Thank you Olha
Shkaravska for the intriguing discussions on European culture, politics and religion.
Thank you Joeri de Ruiter for showing me that Dutch people not accepting foreigners
as close friends is just a myth. Thank you Freek, Barbara and Peter for sharing your
wisdom and guiding me through my moments of despair. Thank you Fabian, Rody,
Bas, Freek and Bernard for not just introducing me to different kinds of beers but
also showing me how to cook with them. Thank you Peter, Alexandra, Erik, Barbara,
Anna, and Robert for the delicious meals and engaging conversations we had over the
years. Thank you Ronny, Bram, Bas, Engelbert, Gergely, Bernard, Merel, Paulan,
Mathys, Antonio and all the members of DS that I cannot list here for contributing
to the many nice conversations we had at coffee breaks and lunches. Thanks to all the
people of our many game nights: Veelasha, Joost, Pedro, Joost, Ko and Tommy for the
exciting and fun evenings.
I would like to thank my parents, and my sister Bilgen for their patience and
support all these years. Living so far apart from each other is a challenge for us all,
and I’m grateful to you for encouraging me to do things at my own pace.
Finally I would like to express my gratitude to my partner Rachel. I wouldn’t
be the person I am today without you challenging me intellectually. Your love and
support helped me go through the most difficult parts of my PhD. I feel lucky to be
able to call you my partner in life.
Barış Ege
Utrecht, May 2016
Summary
As more and more of our personal data is stored and processed by electronic systems,
security evaluations of such systems have become a necessity rather than a choice in
today’s world. Even if a protocol or a cryptographic primitive is theoretically secure,
security vulnerabilities can be introduced in its implementation. Side channel
attacks exploit such vulnerabilities and, if no countermeasures are used in the
implementation, can be mounted with ease using widely accessible equipment. Therefore,
any product that implements cryptography should go through a rigorous security
evaluation which also covers its physical behaviour. This thesis focuses on the physical
security evaluation of cryptographic implementations.
In Chapter 2, we compare different theoretical metrics which attempt to quantify
the security of a cryptographic algorithm against differential side channel attacks. We
show that some of these metrics fail to capture the level of security against differential
side channel attacks. Additionally, we show that the security of a cryptographic
algorithm against differential side channel attacks may vary depending on whether
the input or the output data is used by the attacker.
In Chapter 3, we revisit the topic of exploiting side channel leakage in the frequency
domain. Our mathematical derivations and experiments suggest that the signal trend
(a component in a side channel signal that is constant and shared among independent
experiments) can affect the success rate of a side channel attack in the frequency
domain. Moreover, we show that the signal to noise ratio can be increased in the
frequency domain if the attacker can manipulate the signal trend.
In Chapter 4, we introduce a new approach to side channel collision attacks in an
attempt to relax the assumptions required for correlation based side channel analysis.
We further show that, using this approach, we can significantly improve an attack on
a class of low entropy masking schemes.
In Chapter 5, we investigate how ambient temperature can affect the vulnerability
of a device to clock glitches. We show that increased temperature makes the target
device more vulnerable to clock glitch attacks. We also show, for the first time in
the literature, that an instruction can be repeated if a clock glitch is carefully
timed.
In Chapter 6, we investigate the security of test compression algorithms. With
the increasing complexity of modern electronic chips, their structural testing is
becoming more challenging. Test compression is a solution which is widely used by
manufacturers to save time and keep the test quality high. However, contrary to
the assumptions made in the literature, such systems do not automatically provide
security. We show that even when industrial countermeasures are employed, classical
differential scan attacks can compromise the security of a chip under some conditions.
Samenvatting
More and more of our personal data is stored and processed by digital systems.
Security evaluations of these systems have therefore become more a necessity
than a choice. Even if a cryptographic primitive used in such a system is theoretically
secure, security vulnerabilities can be introduced during implementation. Attacks
based on side effects (side channels) exploit these vulnerabilities and can easily be
mounted with widely accessible equipment if no countermeasures are applied in the
implementation. Therefore, every product that implements cryptography should undergo
a rigorous security evaluation that also considers its physical behaviour. This thesis
is about the physical security evaluation of cryptographic implementations.
In the second chapter, we compare different theoretical metrics that attempt to
quantify the security of a cryptographic algorithm against differential side channel
attacks. We show that some of these metrics do not give an adequate indication of
the level of security against such attacks. In addition, we show that the security of a
cryptographic algorithm against differential side channel attacks can vary depending
on whether the attacker uses the input or the output data.
In Chapter 3, we look at the leakage of information through side channels in the
frequency domain. Our mathematical derivations and experiments suggest that the
signal trend can affect the success rate of a side channel attack in the frequency
domain. A signal trend is a component of a side channel signal that is constant and
shared among different experiments. Moreover, we show that the signal to noise ratio
can be increased in the frequency domain if the attacker can manipulate the signal
trend.
In Chapter 4, we introduce a new approach to side channel collision attacks in an
attempt to relax the assumptions required for correlation based side channel analysis.
We further show that, using this method, we can significantly improve an attack on a
class of low entropy masking schemes.
In Chapter 5, we investigate how ambient temperature can influence the vulnerability
of a device to deviations in the clock (so-called clock glitches). We show that an
increased temperature makes the device more vulnerable to clock glitch attacks. We
also show for the first time that an instruction can be repeated if a clock glitch is
timed precisely.
In Chapter 6, we investigate the security of test compression algorithms. Due to the
increasing complexity of modern electronic chips, testing them is becoming ever more
challenging. Test compression is a solution that is widely used by manufacturers to
save time and keep the test quality high. However, contrary to the assumptions in
the literature, these solutions do not automatically provide security. We show that
even when standard security measures are used, classical differential scan attacks can
compromise security under certain conditions.
Contents
Acknowledgements
Summary
Samenvatting
1 Introduction
1.1 Cryptography
1.1.1 Symmetric Cryptography
1.1.2 Asymmetric Cryptography
1.2 Cryptanalysis
1.3 Side Channel Analysis
1.3.1 Fault Injection Attacks
1.3.2 Scan Attacks
1.3.3 Power/EM Analysis
1.3.4 Countermeasures
1.4 Security Evaluations
1.5 Outline and Contributions
2 Side Channel Resistance by Design
2.1 Background
2.2 The Phantom Property
2.3 Analysis from Input/Output
2.4 FPGA and Microcontroller Experiments
2.4.1 Software (AVR)
2.4.2 Hardware (FPGA)
2.5 Conclusions
3 Side Channel Analysis in Frequency Domain
3.1 Background
3.2 Leakage in Frequency Domain
3.2.1 Transformation of Leakage
3.2.2 Simulated Experiments: Effect of the Signal Trend
3.2.3 Real Experiments: Effect of the Signal Trend
3.3 How to Increase SNR in the Frequency Domain
3.3.1 Using Truncated Sine Waves as Signal Trend
3.3.2 Real Experiments: Increasing the Signal in Frequency Domain
3.4 Conclusion
4 Near Collision Side Channel Attacks
4.1 Background & Notation
4.2 Side Channel Near Collision Attack
4.2.1 Simulated Experiments on Unprotected AES Implementation
4.2.2 Implementation Efficiency of NCA
4.3 Near Collision Attack Against LEMS
4.3.1 Leaking Set Collision Attack
4.3.2 Leaking Set Near Collision Attack
4.3.3 Simulated Experiments on AES Implementation with LEMS
4.3.4 Experiments on DPA Contest v4 Traces
4.3.5 Implementation Efficiency of the Attack
4.4 Conclusions
5 Ambient Factors on Clock Glitches
5.1 Related Work
5.2 The Fault Injection Setup
5.2.1 Fault Board for Clock Tampering
5.2.2 Heating Plate with Temperature Measurement
5.2.3 The Investigated Microcontroller - AVR ATmega162
5.3 The Experiments
5.4 Results
5.4.1 Inconsistent Faults
5.4.2 Modified Instructions
5.4.3 Repeated Instructions
5.5 The Influence of Ambient Temperature
5.5.1 Temperature Derating Factor
5.6 Conclusion
6 Security of Test Compression Algorithms
6.1 Background
6.1.1 AES
6.1.2 Industrial Test Compression Schemes
6.1.3 Introduction to Scan Attack on AES
6.1.4 Differential Scan Attack on AES
6.2 Motivation
6.3 Attack Strategy
6.3.1 Distributions Considered
6.4 Scan Attacks on Industrial Test Compression Algorithms
6.4.1 Scan Attack on Synopsys Adaptive Scan
6.4.2 Scan Attack on Cadence OPMISR
6.4.3 Scan Attack on Mentor Graphics EDT
6.5 Scan Attack Countermeasures
6.5.1 Insertion of Inverters in the Scan Path
6.5.2 Partial Scan
6.5.3 Scan Chain Scrambling
6.5.4 Masking
6.5.5 Our Proposal
6.6 Conclusions
Bibliography
Curriculum vitae

Chapter 1
Introduction
Nowadays, we are integrating technological gadgets more and more into our daily
lives. We use electronic tokens to access our apartment buildings and drive our cars.
We use our personal computers or mobile phones to do banking transactions and
declare our taxes. Although these gadgets introduce convenience to our lives, this
comes at a price. As such means of handling our everyday business become
ubiquitous, we can no longer ensure our security and privacy by simply having a solid
lock on our door and closing the curtains. In the times we live in, any information can
easily be collected and distributed. Therefore, we have to be more aware of what data
we share with which entity to keep our personal data and transactions secure.
It is not only the information we spread about ourselves, willingly or unwillingly,
that poses a threat to our security and privacy; the devices we use do so as well. We
rely on our “secure” USB tokens or on point of sale (POS) devices at any shop for
handling banking transactions and assume that no personal information can be retrieved
and copied from such devices. However, these mobile devices can fall into the hands of
people with malicious intent, giving them the possibility to physically alter these
devices to recover their secrets. Such attacks hurt not only the individuals who are the
victims of criminals, but also the manufacturers of these devices, as they can face
significant bad publicity as a result. Therefore, a thorough physical security analysis
of such devices has become a necessity rather than a luxury in today’s world.
The security of an embedded device is like a metal chain: the entire system is only
as strong as the weakest link in the security chain. Therefore, the security analysis
of such a system should also be versatile and cover sufficiently many attack vectors
for the targeted security level. Be it the built-in test infrastructure for efficient
structural testing of chips or the branch prediction algorithms that make programs run
faster on our computers, functionality designed for efficiency can also introduce weak
links into the security chain. This problem is not unique to the security of embedded
devices; it applies to any security solution that is in production. Take a door lock
for instance. The lock may introduce pioneering technology to deter an attacker from
even attempting to pick it. However, if the lock is installed on a cam made of soft
metal, an attack with a lever can successfully circumvent the intended security measure
(Chapter 11.2.4 of [And08]). Much like this example, physical security analysis exploits
unintentional sources of information to recover secret data from embedded devices
which are otherwise secure on conventional channels (e.g. protocol level).
1.1 Cryptography
Security is a concept that is intermingled with cryptography in today’s world as we rely
more and more on electronic devices. The word cryptography itself is a combination
of ancient Greek words which would roughly translate to ‘secret writing’. In earlier
times, it was a means of covert communication between high-profile people or
military personnel. Nowadays, cryptography is a necessity since our communication
has become omnipresent and it is moving to the “cloud”. If we are to transmit
and store our personal data in the cloud, we need strong cryptography to ensure its
security.
Although cryptography has been present since ancient times, it has also evolved
since then. The first examples of cryptographic algorithms are based on substitution
of each letter with another one in the alphabet, as in the Caesar cipher or the affine
cipher. As long as the other party knows the substitution used, she can recover the
original text. The original text here is referred to as the “plaintext”, the output of a
cryptographic cipher as the “ciphertext”, the method of generating a ciphertext from
the plaintext as “encryption” and the method that does the opposite as “decryption”.
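As a minimal sketch of the substitution idea described above, the affine cipher maps the letter at position x to (a*x + b) mod 26, and the Caesar cipher is the special case a = 1. The function names and the restriction to the 26-letter uppercase alphabet are assumptions made for this illustration only.

```python
import string

ALPHABET = string.ascii_uppercase  # assumption: 26-letter Latin alphabet
M = len(ALPHABET)


def affine_encrypt(plaintext: str, a: int, b: int) -> str:
    """Substitute each letter x by (a*x + b) mod 26; `a` must be coprime to 26."""
    out = []
    for ch in plaintext.upper():
        if ch in ALPHABET:
            x = ALPHABET.index(ch)
            out.append(ALPHABET[(a * x + b) % M])
        else:
            out.append(ch)  # leave spaces and punctuation untouched
    return "".join(out)


def affine_decrypt(ciphertext: str, a: int, b: int) -> str:
    """Invert the substitution using the modular inverse of `a` (Python 3.8+)."""
    a_inv = pow(a, -1, M)
    out = []
    for ch in ciphertext.upper():
        if ch in ALPHABET:
            y = ALPHABET.index(ch)
            out.append(ALPHABET[(a_inv * (y - b)) % M])
        else:
            out.append(ch)
    return "".join(out)


def caesar_encrypt(plaintext: str, shift: int) -> str:
    """The Caesar cipher is the affine cipher with a = 1."""
    return affine_encrypt(plaintext, 1, shift)


print(caesar_encrypt("HELLO", 3))  # prints KHOOR
```

Here the pair (a, b) plays the role of the shared secret: anyone who knows it can run `affine_decrypt`, while the small number of valid pairs is what makes such ciphers trivial to break by exhaustive search.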
Other examples of ancient cryptography, such as the scytale, use a permutation
(alternatively called a transposition) of the plaintext. First an empty strip of
parchment or leather is wrapped around a cylinder with a certain diameter. Then the
plaintext is written out and once the strip is unwrapped, it will seem like a random
collection of letters. Unless the other party has a cylinder with the same diameter,
it is difficult to recover the plaintext. Therefore, the diameter of the cylinder can be
referred to as the “key” that opens the secrets of a scytale ciphertext. Recovering
the plaintext without the required cylinder, however, is commonly referred to as
“breaking” the cipher.
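The scytale can be modelled, under simplifying assumptions, as writing the plaintext row by row into a grid and reading it out column by column. In the sketch below the hypothetical parameter `faces` (the number of letters that fit around the cylinder) stands in for the diameter, and `_` is an arbitrary padding character chosen for this illustration.

```python
import math


def scytale_encrypt(plaintext: str, faces: int, pad: str = "_") -> str:
    """Write the text in `faces` rows along the cylinder, then unwrap the strip."""
    cols = math.ceil(len(plaintext) / faces)
    padded = plaintext.ljust(faces * cols, pad)
    # Row r holds padded[r*cols:(r+1)*cols]; the unwrapped strip reads the
    # grid column by column.
    return "".join(padded[r * cols + c] for c in range(cols) for r in range(faces))


def scytale_decrypt(ciphertext: str, faces: int, pad: str = "_") -> str:
    """Re-wrap the strip around a cylinder with the same number of faces."""
    cols = len(ciphertext) // faces
    text = "".join(ciphertext[c * faces + r] for r in range(faces) for c in range(cols))
    return text.rstrip(pad)  # note: also strips pad characters that ended the message


message = "HELPMEIAMUNDERATTACK"
strip = scytale_encrypt(message, faces=4)
print(strip)                            # prints HENTEIDTLAEAPMRCMUAK
print(scytale_decrypt(strip, faces=4))  # prints HELPMEIAMUNDERATTACK
```

Decrypting with a different value of `faces` yields a different rearrangement of the same letters, which mirrors the role of the cylinder diameter as the key.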
Since computers have become an integral part of our lives, cryptography also had
to be adapted to this new way of life. The letters are replaced by ones and zeros
(bits) that computers can process, and the first modern cryptography standards were
born in the 1970s. Modern cryptography can be divided into two main fields: symmetric
(secret-key) cryptography and asymmetric (public-key) cryptography.
In the following sections of this chapter, a brief introduction to modern
cryptographic primitives is given. As this dissertation’s main focus is on the security
of the implementations of cryptographic primitives, this introduction is intentionally
kept brief to cover only the fundamental cryptographic primitives.
1.1.1 Symmetric Cryptography
The cryptographic algorithms that belong to this area either use the same key for
encryption and decryption, or the decryption key can be easily computed from the
encryption key. Algorithms of this type are highly efficient and are therefore
used for secure communications once a key is agreed upon. There are three main
cryptographic primitives for symmetric cryptography: stream ciphers, block ciphers
and hash functions.
Stream Ciphers
Stream ciphers are cryptographic algorithms inspired by one-time pad
encryption, which was proven to provide “perfect secrecy” by Shannon in the 1940s
[Sha49]. Although the one-time pad is proven to be perfectly secure, its key is
as large as the plaintext itself, so it is of little use for real-time communications:
the key has to be transferred securely in the first place. Stream ciphers attempt to
solve this issue by generating a large “key stream” from a relatively small key which
can be transferred efficiently. Further information on stream ciphers can be found in
the final report of the eSTREAM competition which provides a portfolio of stream
ciphers that can be used for software and hardware environments [RB08].
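The key stream idea can be sketched as follows: a short key is expanded into as many pseudo-random bytes as the message needs, and encryption is a bytewise XOR, exactly as with a one-time pad. The key stream generator below (SHA-256 over key, nonce and a counter) is an assumption chosen only because it is available in the Python standard library; it is a toy construction and not one of the eSTREAM portfolio ciphers.

```python
import hashlib


def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy key stream: SHA-256(key || nonce || counter), for illustration only."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out.extend(hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(out[:length])


def xor_stream(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt by XORing the data with the key stream."""
    ks = keystream(key, nonce, len(data))
    return bytes(d ^ k for d, k in zip(data, ks))


key = b"sixteen byte key"
nonce = b"nonce-01"
ciphertext = xor_stream(key, nonce, b"attack at dawn")
recovered = xor_stream(key, nonce, ciphertext)  # the same operation decrypts
```

Because XOR is its own inverse, one function serves for both directions. Reusing the same key and nonce for two messages would reuse the key stream, which is the classic way such schemes fail.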
Block Ciphers
As the name suggests, rather than using a stream of bits to encrypt data, block
ciphers process a fixed-size block of bits at a time. Early examples of modern block
ciphers followed in the footsteps of their ancient predecessors, using substitutions and
bit permutations to provide a complex enough relationship between the plaintext
and the ciphertext depending on a relatively short key. However, the current trend is
moving towards using non-linear layers followed by a linear mixing layer to provide the
complexity, and therefore the security, required from a block cipher. The Data Encryption
Standard (DES) and the Advanced Encryption Standard (AES) are the most commonly
used block ciphers.
Although block ciphers are reliable cryptographic primitives, they encrypt only a
small chunk of data at once (usually 64 or 128 bits, which translate to 8 or 16 ASCII
characters respectively). Therefore, they have to be used in conjunction with
so-called “modes of operation” to be able to encrypt data larger than one
block. There are many different modes of operation for encryption; some modes can also
be used to provide data integrity and data authenticity. The interested reader can
find further information on the topic of block ciphers and modes of operation in The
Block Cipher Companion by Knudsen and Robshaw [KR11].
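To illustrate why a mode of operation matters, the sketch below compares two well-known modes, ECB (each block encrypted independently) and CBC (each block XORed with the previous ciphertext block first). The block cipher itself is replaced by a deliberately trivial and insecure placeholder, `toy_encrypt_block`, which is an assumption of this example; a real system would use DES or AES in its place.

```python
BLOCK = 8  # block size in bytes (64 bits, as in DES)


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def toy_encrypt_block(block: bytes, key: bytes) -> bytes:
    """Placeholder, NOT a real cipher: XOR with the key, then rotate the bytes."""
    mixed = xor(block, key)
    return mixed[1:] + mixed[:1]


def toy_decrypt_block(block: bytes, key: bytes) -> bytes:
    unrotated = block[-1:] + block[:-1]
    return xor(unrotated, key)


def split_blocks(data: bytes) -> list:
    if len(data) % BLOCK != 0:
        raise ValueError("this sketch expects data padded to a multiple of BLOCK")
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]


def ecb_encrypt(plaintext: bytes, key: bytes) -> bytes:
    """ECB: every block is encrypted independently."""
    return b"".join(toy_encrypt_block(b, key) for b in split_blocks(plaintext))


def cbc_encrypt(plaintext: bytes, key: bytes, iv: bytes) -> bytes:
    """CBC: each block is XORed with the previous ciphertext block first."""
    prev, out = iv, []
    for block in split_blocks(plaintext):
        prev = toy_encrypt_block(xor(block, prev), key)
        out.append(prev)
    return b"".join(out)


def cbc_decrypt(ciphertext: bytes, key: bytes, iv: bytes) -> bytes:
    prev, out = iv, []
    for block in split_blocks(ciphertext):
        out.append(xor(toy_decrypt_block(block, key), prev))
        prev = block
    return b"".join(out)
```

With two identical plaintext blocks, ECB produces two identical ciphertext blocks and so reveals the repetition, whereas the chaining in CBC hides it. This holds regardless of how strong the underlying block cipher is, which is why the choice of mode is a security decision in its own right.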
Hash Functions
Unlike block ciphers and stream ciphers, hash functions play a different role in the
world of cryptography. No matter the size of the input, hash functions output a
small-sized fingerprint called the “hash value”. Cryptographic hash functions have
their own criteria to be considered secure. Firstly, it should be computationally
infeasible to find an input which produces a given hash value. This is referred to as
the pre-image resistance of a hash function. Secondly, given the original input, it
should not be feasible to find any other input which computes the same hash value.
This is called the second pre-image resistance of a hash function. Another basic
criterion is that it should be infeasible to find any two messages which produce the
same hash value, which is called the collision resistance of a hash function. NIST
recently announced the new FIPS standard for cryptographic hashing after a lengthy
competition which attracted attention from all over the world. In the end, Keccak
[BDPA11] was adopted as the SHA-3 (Secure Hash Algorithm 3) standard.
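The following sketch illustrates two of the points above using the SHA-3 implementation in Python's `hashlib`: the hash value has a fixed size regardless of the input size, and collision resistance depends on the output being long enough. To make a collision search feasible, the example truncates the hash to 2 bytes; this truncation and the helper names are assumptions made for illustration and say nothing about the security of full SHA-3.

```python
import hashlib


def sha3_hex(data: bytes) -> str:
    """Full 256-bit SHA-3 hash value as 64 hexadecimal characters."""
    return hashlib.sha3_256(data).hexdigest()


def truncated(data: bytes, nbytes: int) -> bytes:
    """Deliberately weakened fingerprint: only the first `nbytes` bytes."""
    return hashlib.sha3_256(data).digest()[:nbytes]


def find_truncated_collision(nbytes: int = 2):
    """Search for two different messages with the same truncated hash value."""
    seen = {}
    counter = 0
    while True:
        msg = str(counter).encode()
        tag = truncated(msg, nbytes)
        if tag in seen:
            return seen[tag], msg
        seen[tag] = msg
        counter += 1


# Inputs of very different sizes produce hash values of the same length.
print(len(sha3_hex(b"")), len(sha3_hex(b"x" * 10_000)))

# With a 16-bit fingerprint, a collision must appear within 65537 messages.
m1, m2 = find_truncated_collision(2)
```

By the birthday bound a collision on an n-bit fingerprint is expected after roughly 2^(n/2) attempts, so the 16-bit search typically ends after a few hundred messages, while the same search on the full 256-bit output is far out of reach.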
1.1.2 Asymmetric Cryptography
The cryptographic algorithms that are categorized as asymmetric (or public-key)
systems take this name from their use of different keys for encryption and decryption.
Given the encryption key for an asymmetric cryptosystem, it is computationally
infeasible to compute the decryption key without the knowledge of a “trapdoor” value.
Therefore, the encryption key can be public knowledge. This is the reason why these
types of cryptosystems are also referred to as public-key systems. The first examples
of public-key cryptosystems make use of modular exponentiation [DH76]. But the
key size required by RSA [RSA78], for instance, is ten to twenty times larger than
the required key size of a symmetric cipher which provides a similar level of
security. There are other, more efficient alternatives to RSA which are based on more
complex mathematical structures. These are commonly referred to as Elliptic Curve
Cryptography [Mil86, Kob87]. Although the mathematical background is more complex
with these cryptosystems, it leads to a smaller key size; only twice as large as that
of symmetric systems with a similar level of security.
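As an illustration of how modular exponentiation supports public-key techniques, the sketch below runs a toy key agreement in the style of [DH76]: each party publishes g^x mod p and keeps x secret, and both arrive at the same shared value. The modulus (the Mersenne prime 2^127 - 1), the base 3 and the function names are assumptions chosen for brevity; the parameters are far too small and unvetted for real use.

```python
import secrets

P = 2**127 - 1  # toy prime modulus (a Mersenne prime), far too small in practice
G = 3           # toy base, chosen arbitrarily for this sketch


def keypair():
    """Return (private exponent, public value g^private mod p)."""
    private = secrets.randbelow(P - 3) + 2  # uniform in [2, P - 2]
    public = pow(G, private, P)             # modular exponentiation
    return private, public


def shared_secret(own_private: int, peer_public: int) -> int:
    """Both sides compute g^(a*b) mod p without revealing a or b."""
    return pow(peer_public, own_private, P)


alice_private, alice_public = keypair()
bob_private, bob_public = keypair()
assert shared_secret(alice_private, bob_public) == shared_secret(bob_private, alice_public)
```

Only the public values cross the channel. Recovering a private exponent from them is the discrete logarithm problem, which is what forces the large moduli behind the key sizes mentioned above.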
1.2 Cryptanalysis
The design process of a cryptographic algorithm is incomplete without a thorough
analysis of its security. This analysis is called “cryptanalysis” and it is essential
for estimating the security of a cryptographic algorithm. This term should
not be confused with the commonly used term “breaking” a cipher. Cryptanalysis
can lead to breaking a cipher if the cipher design is not secure enough. However,
calling cryptanalysis merely the science of “breaking” cryptosystems would not do
justice to the field, which plays an essential role in constructing secure
cryptosystems.
Modern cryptography also created its modern counterpart in the 1970s: modern
cryptanalysis. When DES (Data Encryption Standard) was first proposed by
NIST, NSA is said to have interfered with the design process to make DES more
resistant to differential cryptanalysis: a method which was not published until the
year 1991 [BS91]. NSA’s involvement in the design process and the fact that it was
published by NIST attracted much attention to cryptanalysis from the academic
community. In 1994, linear cryptanalysis was proposed by Matsui [Mat94] and it
was also shown to be the most efficient analysis of DES, being 512 times faster than
searching through all possible 2^56 keys. After these initial analysis techniques, many
academic articles followed on new cryptanalysis techniques [BBS99, BDK01, NSZW09]
and therefore DES is believed to have triggered the current widespread academic
interest in cryptography and cryptanalysis.
1.3 Side Channel Analysis
The fact that a cipher is secure against known cryptanalysis techniques does not
imply that every implementation of it is secure in practice. Implementing
a cryptographic algorithm to be used in a device is a challenging task.
The process is full of pitfalls which can compromise the final product if extreme
caution is not taken. Unfortunately, some of these pitfalls may not even be detectable
until a prototype has been produced. On top of all this, the engineer has the challenge
of implementing a cryptographic algorithm under constraints such as low latency, low
area, low power/energy consumption, or high throughput, which might conflict with
the required level of physical security of the final product.
The way a cryptosystem is implemented on a device can make it vulnerable to
analysis techniques exploiting the physical aspects of computing data in embedded
devices. Such techniques are widely studied in the field of “side channel analysis”.
Side channel analysis and countermeasures are among the most active research areas
of applied cryptography today. Many ways to exploit the so-called “side channel
leakage” have been proposed since the seminal work of Kocher et al. [KJJ99]. The
assumptions for attacks and the adversary models vary, but defending against this
type of attack remains a challenge, as the adversary is assumed to always take the
next unknown step.
Side channel attacks can be categorized in two classes: passive and active attacks.
Passive attacks are based on physical observations of the target device while it is
functioning within established parameters. Attacks which measure the power consumption or the
electromagnetic (EM) emanations of a device are examples of passive attacks. Active
attacks, on the other hand, aim at placing the target under stress, thereby forcing
it to make errors in internal computations. These errors (more commonly referred to
as faults) are either exploited directly, or compared with the normal behaviour of the
target for recovering the secrets.
Side channel attacks can further be categorized into three independent groups:
invasive, semi–invasive and non–invasive attacks. Invasive attacks assume the exis-
tence of a powerful attacker who can alter the device, break connections, or make
new ones on the target to retrieve the secrets. Semi–invasive attacks usually involve
altering the working conditions of the device to be able to either have more detailed
observations on the internal workings of the target, or to inject faults which are
otherwise either more difficult to achieve or not possible at all. Non–invasive attacks are
side channel attacks which do not require altering the target. For instance,
power and EM analysis are classified as passive, non–invasive attacks. As this type
of attack usually does not leave a trace behind, detecting the attack at a later time is
not possible. Therefore, for any device that we use today, it is very important to do
a thorough security analysis for such attacks before it is released on the market.
1.3.1 Fault Injection Attacks
Fault injection attacks are classified as active side channel attacks. The core idea
of fault attacks is to deliberately put the target device under stress by perturbing
the device through various methods. A number of methods are proposed in the
literature on perturbation of embedded devices. The ways to inject faults on an
embedded device include, but are not limited to: overclocking, varying the supplied
power, focused laser beam, heating and electromagnetic pulses [BECN+06, SH07].
Unless the erroneous behaviour is detected and compensated for, a fault attack is
deemed successful. A successful fault attack on an embedded device can either lead
to complete circumvention of the crypto used in the device, or to the recovery of the
secret key used for the cryptographic algorithm implemented in the device. Hence a
thorough security evaluation using these methods is also very important for almost
every electronic device we use today.
1.3.2 Scan Attacks
As manufacturing technologies shrink, testing of modern embedded
devices is becoming more and more difficult. This is due to the growing number of functional
elements that can be fit in smaller areas, which increases the complexity
of these devices. The small size of these circuits also makes them more vulnerable
to physical imperfections that can occur during the manufacturing process. Therefore,
after manufacturing and before it is implemented in a larger system, an embedded
device should be thoroughly checked for these imperfections. However, with the
increasing complexity of such devices, functionally testing all possible inputs to an
embedded device and comparing with the expected outputs has become infeasible.
Instead, nowadays structural testing methods are used to test embedded devices for
imperfections. A commonly used method for structural testing is to replace the
registers (i.e. temporary memory elements in embedded devices) with so called “scan–
registers”. These registers are then connected to one another, forming a “scan–chain”.
This way, the tester can input an automatically generated test vector to the entire
circuit and can test the switching activity of each functional element. These test
vectors are automatically generated by commercial products in a way that many
transitions (from 0 to 1 or 1 to 0) in the circuit are tested at once. Therefore test
time can be greatly reduced by using this technique.
However, if the implementation of the cryptographic circuit is also included in such
a scan–chain, the test structure (the scan–chains) can be exploited by an attacker to
recover the key used for cryptographic operations in the embedded device. This type
of active side channel attack is called a “scan attack”. Although this has been a known issue
since 2004 [HFB+04], in 2013 we showed that such attacks are still applicable to new
practices used in structural testing of embedded devices [DEG+13] (see Chapter 6).
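The scan-chain mechanism described above can be illustrated with a toy model. The sketch below is only an illustration under stated assumptions and is not the attack of [HFB+04] or [DEG+13]: the class name `ScanChain`, the 8-bit register width and the stand-in "round" (a plain key XOR without an S-box) are all hypothetical choices made for brevity. It shows why a state register that is part of a scan chain is dangerous: after one round in functional mode, switching to test mode lets the intermediate state be shifted out bit by bit.

```python
class ScanChain:
    """Toy model: one 8-bit state register wired into a scan chain."""

    def __init__(self, key):
        self._key = key          # secret, never output directly
        self.state = [0] * 8     # scan flip-flops, index 0 = scan-out end

    def run_round(self, plaintext):
        # Functional mode: hypothetical stand-in for one cipher round.
        # A real cipher would apply an S-box after the key addition.
        value = plaintext ^ self._key
        self.state = [(value >> i) & 1 for i in range(8)]

    def shift_out(self):
        # Test mode: shift the chain by one position per clock cycle and
        # collect the bit that appears at the scan-out pin.
        bits = []
        for _ in range(8):
            bits.append(self.state[0])
            self.state = self.state[1:] + [0]   # scan-in tied to 0
        return bits


def recover_key(chip, plaintext):
    chip.run_round(plaintext)            # normal operation for one round
    bits = chip.shift_out()              # then switch to test mode
    observed = sum(bit << i for i, bit in enumerate(bits))
    return observed ^ plaintext          # remove the known plaintext


chip = ScanChain(key=0xA7)
print(hex(recover_key(chip, plaintext=0x3C)))
```

In a real cipher the observed state is the output of a non-linear round function, so the attacker compares scan-out data for several chosen plaintexts instead of undoing a single XOR.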
1.3.3 Power/EM Analysis
In 1999, Kocher et al. showed that by analysing the amount of power an embedded
device consumes when executing a cryptographic operation, it is possible to recover
the secret key that is used for that operation [KJJ99]. Following Kocher et al.’s
observation, in 2001, Gandolfi et al. successfully applied the differential power analysis (DPA) method to
electromagnetic (EM) emanations measured from a smartcard [GMO01]. Soon after the
initial attacks based on power consumption and EM measurements, Brier et al. proposed
to use the Pearson correlation coefficient, a well-studied statistical function
indicating linear relationships between data sets [BCO04]. This type of attack requires
some assumptions to be made on the data leakage visible in the power and
EM measurements. The ﬁrst works on side channel analysis demonstrated a linear
relationship between the Hamming weight of a processed value (i.e. the number of
non–zero bits in the bit representation of the value) and its power consumption (or
EM emanation). Although such assumptions improve the attack success when they
hold in practice, they also make the analysis technique less valuable when the target
does not leak data in the assumed way. In an attempt to address this shortcoming
of correlation based DPA, Schindler et al. proposed to use stochastic models which
estimate the power consumption of a target through a linear “leakage function” of
the bits of the target variable [SLP05]. Stochastic models can be extended to leakage
functions of higher degrees, however this signiﬁcantly increases the computational
complexity of the analysis method.
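The correlation-based attack with a Hamming weight model can be sketched on simulated data. The example below is a minimal sketch under stated assumptions and is not taken from [BCO04]: the 4-bit PRESENT S-box, the Gaussian noise model, the trace count and all function names are choices made for illustration. Each simulated measurement is the Hamming weight of an S-box output plus noise, and the key guess whose predicted Hamming weights correlate best with the measurements is selected.

```python
import random

# Assumption: the 4-bit PRESENT S-box is used as a small example target.
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def hamming_weight(value):
    return bin(value).count("1")


def pearson(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_traces(key, count, noise_sigma, rng):
    # One leakage sample per encryption: HW of the S-box output plus noise.
    plaintexts = [rng.randrange(16) for _ in range(count)]
    leakage = [hamming_weight(SBOX[p ^ key]) + rng.gauss(0.0, noise_sigma)
               for p in plaintexts]
    return plaintexts, leakage


def cpa(plaintexts, leakage):
    # Signed correlation is used because the simulated leakage grows
    # with the Hamming weight; real targets may require the absolute value.
    scores = {}
    for guess in range(16):
        model = [hamming_weight(SBOX[p ^ guess]) for p in plaintexts]
        scores[guess] = pearson(model, leakage)
    return max(scores, key=scores.get), scores


rng = random.Random(1)
plaintexts, leakage = simulate_traces(key=0x9, count=1000,
                                      noise_sigma=1.0, rng=rng)
best_guess, scores = cpa(plaintexts, leakage)
print(best_guess, round(scores[best_guess], 2))
```

If the simulated device leaked something other than the Hamming weight, the same code would need more traces or fail outright, which is the limitation of model-based attacks noted above.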
Batina et al. further relaxed these assumptions and proposed Mutual Information
Analysis (MIA) which can be applied to a wide range of devices with diﬀerent leakage
functions [BGP+10]. Although MIA requires more measurements to be analysed in
general, it attempts to propose a generic distinguisher that can be used to analyse
different target devices. Following in MIA’s footsteps, Whitnall et al. proposed to use
the two-sample Kolmogorov-Smirnov (KS) test as a counterpart to MIA, and showed
that analyses based on the two-sample KS test are more robust to noise when univariate
attacks are considered [WOM11].
Together with the increasing popularity of side channel attacks, an important
question arose for cryptographic algorithm designers: is their algorithm more
(or less) susceptible to this sort of attack? One of the earliest metrics addressing
this question was put forward by Prouff in 2005: the so–called “transparency order” [Pro05]. An improvement
of this metric was proposed later in 2014 [CSM+14]. Orthogonally, in 2012, Fei et
al. put forward the concept of the “confusion coefficient”, which can estimate the result
of a side channel attack rather accurately with respect to the leakage function and
the noise level [FLD12]. In Chapter 2 we lay down a comparative evaluation of such
theoretical metrics which estimate the side channel leakage for a given algorithm.
Another way of doing side channel analysis is to exploit the internal collisions
that can occur when certain inputs are being encrypted. The ﬁrst example of such
an attack exploits the internal collisions that can happen at the output of the round
function of DES [SWP03]. If these collisions can be detected by comparing the cor-
responding power or EM measurements, they can be exploited for full key recovery.
This method was later extended to the first round output of AES [SLFP04].
In order to be able to apply collision attacks in practice, detecting internal colli-
sions on side channel measurements becomes very important. Bogdanov proposed
solutions to both improve the collision detection [Bog07] and also improve the key
recovery technique, but kept the chosen plaintext nature of the attack [Bog08]. In
2009, Batina et al. proposed a new method to detect internal collisions using cluster
statistics [BGLR09]. Unlike previous collision attacks, this attack can be applied in
a known plaintext setting, i.e. there is no need to choose the particular plaintexts to
be encrypted for the attack to work. In 2010, Moradi et al. simpliﬁed the previously
proposed key recovery techniques and proposed to use Pearson correlation to detect
collisions between diﬀerent S–box outputs [MME10]. Although this way of combining
collision detection and key recovery makes the attacks simpler, the idea of looking
for collisions between S–box outputs makes these types of collision attacks inherently
bivariate when software implementations are considered. When the number of samples
required to digitally reconstruct a side channel signal is taken into account, the
search for the related time samples can greatly complicate the attack. To address
this issue, we have introduced the so–called “near collision attack” which looks for
S–box outputs which differ in only one bit [EEB16] (see Chapter 4).
This attack can be applied in a univariate setting, therefore eliminating the added
complexity when software implementations are considered.
The most powerful side channel attacks to date belong to the class of “template
attacks”. In template attacks, the attacker is assumed to have access to an identical
copy of the target device. A further assumption is that she can also modify the key
and send queries to the device with diﬀerent data. Therefore, she can “proﬁle” the
device and learn how it leaks data and what aspects of the target can be exploited. At
a later stage, when the attacker has access to a device which she has already proﬁled
before, she can recover the secrets of this device with a small number of measurements
gathered from the target device.
1.3.4 Countermeasures
Various countermeasures to reduce the side channel leakage in an implementation are
introduced in the literature. Side channel countermeasures can be classiﬁed into two
main groups: countermeasures which aim at decreasing the signal to noise ratio (SNR)
in power/EM measurements, and countermeasures which aim to break the statistical
relation between the processed data and the power/EM measurements. The most
common countermeasures which decrease the SNR, and which are covered widely in academia,
are the ones introducing noise in either the time or the amplitude domain, and the ones using side
channel resistant logic styles. Introducing noise in the time domain can be achieved in
two ways. One can either insert dummy operations randomly into the execution of
the encryption algorithm [CK10], or perform certain operations of the cryptographic
algorithm in a random order whenever possible [THM07]. As for introducing noise
in the amplitude domain, there are multiple paths to follow for the designer. For
instance, if the implementation processes multiple chunks of the key in parallel, the
SNR can be reduced signiﬁcantly. For other methods of decreasing SNR, we direct
the interested reader to Chapter 7 of the book Power Analysis Attacks [MOP07].
Although these countermeasures can successfully decrease the SNR in the time
domain, there are multiple works which show that when the measurements are
transformed into the frequency domain, the effectiveness of the countermeasures which
introduce timing noise can be reduced [GHT05]. There are also multiple pre–processing
techniques that aim at reducing noise in the time domain through filtering [TOT+14],
or overcoming noise in the amplitude domain by using transformations such as
principal component analysis [MBLM12]. In Chapter 3 we cover the basics of how the
side channel leakage is transformed and important points to pay attention to when
doing analysis in the frequency domain.
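The reason the frequency domain helps against timing noise can be seen in a small numerical sketch. The example below rests on simplifying assumptions and is not the method of [GHT05] or of Chapter 3: the delay is modelled as a circular shift of a short made-up trace, and a direct discrete Fourier transform is written out by hand to keep the example self-contained. Under these assumptions a delay changes only the phase of each frequency component, so the amplitude spectra of the original and the delayed trace coincide.

```python
import cmath


def dft_amplitudes(signal):
    # Direct O(n^2) discrete Fourier transform, amplitudes only.
    n = len(signal)
    return [abs(sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t in range(n)))
            for k in range(n)]


def circular_shift(signal, delay):
    # Model of a random delay: the whole trace is rotated by `delay` samples.
    return signal[-delay:] + signal[:-delay]


# Hypothetical trace with a single leakage peak.
trace = [0.0, 1.0, 3.0, 1.0, 0.0, 0.0, 0.0, 0.0]
delayed = circular_shift(trace, 3)

original_spectrum = dft_amplitudes(trace)
delayed_spectrum = dft_amplitudes(delayed)
max_difference = max(abs(a - b)
                     for a, b in zip(original_spectrum, delayed_spectrum))
print(max_difference < 1e-9)
```

Real misalignment is not an exact circular shift, and samples entering or leaving the measurement window change the amplitudes slightly, so the invariance holds only approximately in practice.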
Another important ﬁeld in side channel countermeasures is masking. This class of
countermeasures randomizes the processed data in an attempt to break the statistical
relationship between the processed data and the power/EM traces. Depending on
the algorithm that is intended to be protected against side channel attacks and also
depending on the target platform, diﬀerent methods are proposed in the literature
for masking. For software implementations of block ciphers for instance, Prouﬀ and
Rivain gave a security proof of a secret sharing based masking scheme [PR13]. For
hardware implementations, Nikova et al.’s threshold implementation has attracted
much attention in the literature [NRR06]. Recently this technique has been extended
to protect implementations from higher order attacks as well [BGN+14].
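The basic idea of masking can be illustrated with first-order Boolean masking of a single S-box lookup. The sketch below is a generic textbook-style construction (table recomputation) under stated assumptions; it is not the secret sharing scheme of [PR13] nor a threshold implementation [NRR06]. The 4-bit PRESENT S-box and all function names are choices made for this example. The unmasked value `plaintext ^ key` is never computed: the input is masked before the key addition and the table is recomputed for fresh input and output masks.

```python
import secrets

# Assumption: the 4-bit PRESENT S-box is used as a small example.
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def masked_sbox_table(mask_in, mask_out):
    # Recomputed table T with T[x ^ mask_in] = SBOX[x] ^ mask_out.
    table = [0] * 16
    for x in range(16):
        table[x ^ mask_in] = SBOX[x] ^ mask_out
    return table


def masked_lookup(plaintext, key):
    # Fresh random masks for every execution.
    mask_in = secrets.randbelow(16)
    mask_out = secrets.randbelow(16)
    # Mask first, then add the key: plaintext ^ key never appears unmasked.
    masked_input = (plaintext ^ mask_in) ^ key
    table = masked_sbox_table(mask_in, mask_out)
    masked_output = table[masked_input]
    return masked_output, mask_out


masked_output, mask_out = masked_lookup(plaintext=0x6, key=0x9)
# Unmasking here is only a correctness check for the example.
print(masked_output ^ mask_out == SBOX[0x6 ^ 0x9])
```

Drawing two fresh 4-bit masks per lookup shows where the randomness cost of masking comes from; a real implementation would also have to protect the table recomputation itself.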
Although proposals for masked implementations of block ciphers exist in the
literature, the amount of randomness they require usually makes such schemes
challenging to implement. A recently proposed method called “low entropy
masking scheme” (LEMS) attempted to reduce the amount of randomness required in
masking schemes [NGD11]. This masking scheme was later extended to resist higher
order attacks [MGD11]. However, Grosso et al. pointed out that the security of such
schemes depends on the leakage function of the target [GSP14]. Moreover, an attack
which exploits the basic principle behind low entropy masking was later proposed by
Ye and Eisenbarth [YE14]. In Chapter 4, we extend this attack to a certain class of
LEMS which uses codewords of linear codes as its mask values.
1.4 Security Evaluations
Once an embedded device is designed and the countermeasures to mitigate possible
attacks are put in place, the responsibility falls on the shoulders of the security evalu-
ator to make an assessment on the practical security of the device. Depending on the
application that the device is designed for, there are certain criteria that should be
satisfied before the device can be released to the market [Com12]. The security
evaluator’s job therefore is to test the security of an embedded device to see if the device
complies with the required level of security. For the evaluator to
achieve this, she has to have access to all the means that an attacker has. This applies not only
to equipment: the analysis methods should also be constantly improved so
that no product with an undetected security flaw is released to the market. Therefore
it is also the academic community’s responsibility to constantly improve and publish
such methods in the public domain.
1.5 Outline and Contributions
The work put together in this thesis attempts to achieve the goal of bringing security
evaluations to a higher level. This thesis can be useful in assisting designers, who
aim at developing secure products, or security evaluators, who assess the security
of a given product. From the algorithm design phase to manufacturing the ﬁnal
product, there are multiple stages where a security evaluation can be done to ensure
the security of the ﬁnal product. In fact, if a cryptographer wants to ensure a base
level of side channel attack security, she can use theoretical metrics to evaluate how
strong diﬀerent components in a cryptographic algorithm would be against diﬀerential
side channel attacks and identify potential weaknesses.
In case the cryptographic algorithm is implemented with certain design con-
straints, a prototype can ﬁrst be analysed before it is included in the ﬁnal product.
If a security vulnerability is detected, developers can use this early feedback from
the security analysis to improve their product design. However, a prototype that is
deemed secure up to a given security level may not necessarily result in the same secu-
rity level in the ﬁnal product. Once it is embedded in its ultimate environment, new
vulnerabilities may arise which may not be possible to detect on a prototype. Take
a crypto co–processor for instance; the co–processor itself might be secure against
fault injection attacks on a prototype board. However, if the co-processor can be
circumvented due to a fault injection attack on the system that it is embedded in,
this will lead to an insecure product. Therefore an additional security analysis would
be valuable to ensure the security of the ﬁnal product. Figure 1.1 shows a graphical
representation of this process and how this thesis contributes to diﬀerent stages of
such security analyses. The contributions of each chapter in the thesis are summarized
in the rest of this section.
[Figure 1.1 (diagram not reproduced). Stages shown: Algorithm Design; HW/SW Design & Prototyping; Manufacturing; Final Product; with three Security Analysis steps. Contributions shown: Chapter 2: Metrics; Chapter 3: Frequency domain analysis; Chapter 4: Near collision SCA; Chapter 5: Fault Injection; Chapter 6: Scan attacks.]
Figure 1.1: Graphical representation of this thesis’ contributions mapped to different
types and stages of security analyses.
Chapter 2 Side channel analysis makes recovering the secret key from an embedded
device practical through analysing power and EM measurements, even though the
underlying cryptographic algorithm is secure. Algorithm designers have therefore
proposed a number of theoretical metrics in an attempt to estimate the difficulty of
analysing diﬀerent block ciphers from a diﬀerential side channel attack perspective.
However, a thorough investigation of these metrics has not yet been put together. This
chapter ﬁlls this gap in the literature, comparing various theoretical metrics that aim
at quantifying the resistance of a cryptographic algorithm to diﬀerential side channel
attacks. We show that some theoretical metrics do not accurately quantify the side
channel leakage of a cryptosystem. The chapter combines the analyses given in the
articles listed below and extends them further to side channel attacks using output
data:
• “On Using Genetic Algorithms for Intrinsic Side-channel Resistance: The Case
of AES S-box” by Stjepan Picek, the author, Kostas Papagiannopoulos, Lejla
Batina and Domagoj Jakobovic [PEB+14].
• “Confused by Confusion: Systematic Evaluation of DPA Resistance of Various
S-boxes” by Stjepan Picek, Kostas Papagiannopoulos, the author, Lejla Batina
and Domagoj Jakobovic [PPE+14].
• “Improving DPA Resistance of S-boxes: How far can we go?” by the author,
Kostas Papagiannopoulos, Lejla Batina and Stjepan Picek [EPBP15].
My contribution in this chapter is running the practical experiments required to verify
the validity of the proposed theoretical metrics, and comparing different theoretical
metrics to see which ones better represent the actual side channel
security of the implementation of a given algorithm. Finally, the
so–called phantom S–boxes and the analysis given in the chapter are my contributions
to this line of work.
Chapter 3 As side channel analysis focuses on the implementation of the underly-
ing cryptosystem, the theoretical security of the cryptosystem can be circumvented
for key recovery. Following the initial works in side channel analysis, many countermeasures
have been proposed to introduce noise in the temporal domain in an attempt to
reduce the amount of signal with respect to noise in the measurements. However,
when the measurements are transformed into the frequency domain, this temporal noise
can in turn be circumvented by computing the amplitudes of the underlying signal
components. Since these amplitudes provide time-invariant information about the
measurements, researchers have used this information for improved key recovery attacks
on implementations with temporal noise. This chapter explores the eﬀect of the sig-
nal trend, a component of the measured signal which has been neglected in the side
channel analysis literature so far. Through a theoretical analysis, we show how signal
trend can signiﬁcantly aﬀect the leakage in frequency domain. Moreover, we propose
a method to use an artiﬁcial signal trend to improve leakage in the frequency domain.
This chapter is based on the submission below:
• “Understanding the Side Channel Leakage in Frequency Domain” by the author,
Jasmina Omic and Lejla Batina (under review).
I am the main contributor to this work. I worked on devising the necessary simulations
and ran the practical experiments to support the ideas put forward in the chapter.
Moreover, I have made the ﬁrst observations leading to the co-development of the
method to improve the signal to noise ratio in frequency domain that is presented in
the chapter.
Chapter 4 One of the most widely used methods for performing side channel anal-
ysis is the correlation power analysis (CPA). However, the analyst has to make strong
assumptions on how the target leaks data. This requirement of strong assumptions
makes CPA less adaptable to devices which leak data in unconventional ways. Although
such devices can be analysed using side channel collision attacks, this
can introduce further problems since these attacks are intrinsically bivariate when applied to
bijective S-boxes. In this chapter, we propose a technique which enables an analyst
to run a correlation based collision attack on a bijective S-box in a univariate setting.
The technique we propose offers a middle way between correlation analysis
and collision attacks, as it makes simpler assumptions on the leakage of the target device.
In addition, we use the same technique to signiﬁcantly improve a previously proposed
analysis of a masking scheme. This chapter is based on the following article:
• “Near Collision Side Channel Attacks” by the author, Thomas Eisenbarth and
Lejla Batina [EEB16].
I am the main contributor in this work. I designed the attack on unprotected im-
plementations and the improvement of the previously proposed attack on linear code
based low entropy masking schemes. I further devised and ran simulations as well as
ran the practical attack on DPA Contest v4 traces.
Chapter 5 Another class of side channel attacks that can compromise an embedded
device is fault injection. Attacks of this type are widely used in the security analysis
of everyday devices and can pose a large threat to the security of an embedded
system, as it can be possible to circumvent any cryptographic operation through such
attacks. Although these attacks are well studied individually, the combination of these
methods has attracted attention from only a handful of researchers. In this chapter,
we investigate how clock glitch attempts are aﬀected by the ambient temperature.
By heating up a microcontroller, we explore what types of faults can be induced in
the target. Our analysis shows that when the temperature is increased, the target is
more vulnerable to clock faults. Moreover, we show for the ﬁrst time in the literature
that repeating an instruction is also possible through clock fault injection. The work
presented in this chapter is based on the following article:
• “Clock Glitch Attacks in the Presence of Heating” by Thomas Korak, Michael
Hutter, the author and Lejla Batina [KHEB14].
My contribution in this work lies in designing the interface for running the experiments.
A large part of the experiments presented in the chapter, as well as most of the
analysis done in the chapter, are my contributions to this work.
Chapter 6 During the manufacturing process of an embedded device, unintended
faults can occur due to imperfections in the chip die. To be able to detect these
imperfections before using the chip as a component of a larger system, each chip
should be tested. The testing process can be rather time consuming but nonetheless
it is of crucial importance for reliability. Built–in self–test is a method widely adopted
by manufacturers to reduce testing time while keeping the quality of such tests at a
high level. However, if a cryptographic implementation is also included in this testing
environment that is embedded on the chip, it can also introduce vulnerabilities. Scan
attacks exploit the existence of such a testing infrastructure embedded in the chip
to recover secret data used by the cryptographic algorithm embedded in the chip.
In this chapter we introduce the ﬁrst extension of the scan attack methodology to
major industrial test compression algorithms. Prior to our work, some industrial test
compression algorithms were believed to be secure against scan attacks. This belief was based
on an analysis showing that ﬁnding the exact locations of the ﬂip ﬂops which leak
key information is not possible due to the mixing and compression layers employed
in industrial systems. We showed, for the ﬁrst time in the literature, that such
systems can also be attacked using diﬀerential scan attacks by extending the number
of observations to enable a statistical analysis. In the chapter we investigate various
scenarios in which an AES circuit is distributed over the scan chain infrastructure
of a circuit and provide experimental results to estimate the security of each such
distribution. The chapter is based on the following articles:
• “Differential Scan Attack on AES with X-tolerant and X-masked Test Response
Compactor” by the author, Amitabh Das, Santosh Ghosh and Ingrid Verbauwhede
[EDGV12].
• “Security Analysis of Industrial Test Compression Schemes” by Amitabh Das,
the author, Santosh Ghosh, Lejla Batina and Ingrid Verbauwhede [DEG+13].
I am the main contributor to these works. My contribution was to devise the attack
and run the simulated experiments. I further extended these simulations to industrial
test compression algorithms based on the RTL codes that my co-authors generated.
In addition, designing the experimental framework to present results and writing the
security analysis sections of the aforementioned articles were also my contributions.
Other works that the author contributed to, but that fall outside the scope of this thesis
are: [VGE13,BDE+13,BEE+13,PBJ+14,PEP+14].
Chapter 2
Side Channel Resistance by Design
This chapter investigates if the theoretical metrics available in the literature to as-
sess diﬀerential side channel attack resistance of cryptographic algorithms are good
representatives of their actual security in terms of side channel attacks. The analysis
given in this chapter is based on the following papers: [PEB+14,PPE+14,EPBP15].
When evaluating S-boxes, the widely used non-linear building blocks used in block
ciphers, algorithm designers need to consider a number of diﬀerent properties to be
met. Each property characterizes the resistance of the cipher employing that S-box to
a certain attack. However, when DPA resistance of S-boxes is considered, the situation
is more complex. The main diﬃculty lies in ensuring both side channel analysis
(SCA) resistance and cryptanalytical security of S-boxes. More precisely, higher
nonlinearity of an S-box implies better security against cryptanalysis. However, this
can make the ciphers which employ such an S-box more vulnerable to side channel
attacks [HRG14].
So far, there exist only a few properties that address the resilience of S-boxes
to Differential Power Analysis (DPA) attacks. Properties such as the signal-to-noise
ratio (SNR-DPA) [GP04] or the transparency order [Pro05] have not been thoroughly investigated,
and the results so far do not fully support their suitability for the design of
side channel secure ciphers. More recently, research in this direction introduced the
confusion coefficient [FLD12], which measures the difficulty of an SCA key recovery. All
those eﬀorts suggest a clear interest in this type of research but we are still far from
taking a standard metric for SCA resistance into consideration when designing a new
block cipher.
Starting from the initial work of Kocher et al. [KJJ99], the topic of differential
power analysis has received sustained interest from both academia and industry. This
is largely due to the practicality of these attacks and, as a consequence, the
topic also attracts attention from cryptographic algorithm designers. For
block ciphers, the resistance of a cryptographic algorithm to attacks
such as differential [BS91] and linear [MY93] cryptanalysis is well studied. However,
improving the resistance to DPA and improving the resistance to linear and
differential attacks are claimed to be conflicting goals [GP04,HRG14].
In the literature, there are multiple attempts to quantify the resistance of a block
cipher against power analysis. In 2004, Guilley and Pacalet proposed SNR (DPA) as a
ﬁrst attempt to quantify the level of leakage expected from a design under certain as-
sumptions [GP04]. Shortly after that, Prouﬀ proposed the “transparency order” in an
attempt to compare S-boxes in terms of their resistance to DPA [Pro05]. The metric
was picked up by researchers doing practical analysis, and results have been presented for
various platforms. However, the improvement in terms of side channel security seems
to be heavily platform-dependent. Mazumdar et al. generated new S-boxes with dif-
ferent transparency order values and tested them on an FPGA [MMS13a,MMS13b].
Their work reports a substantial increase in the number of measurements required
to recover the secret key. On the other hand, we showed that such an increase in the
level of security is not observed on a software implementation [PEB+14]. Whereas
that work deals with 8×8 S-boxes only, in follow-up work
we showed that for 4×4 S-boxes it is possible to achieve better transparency
order values while the resistance to linear and differential cryptanalysis is
kept at a high level [PEP+14]. However, a close look at the practical analysis shows
that the increase in security against DPA is not signiﬁcant. Chakraborty et al. re-
cently presented limitations to Prouﬀ’s transparency order and proposed amendments
to the metric [CSM+14].
Guilley et al. revisited the topic in 2007 by improving their metric and defining a
more efficient one that quantifies the resistance of an S-box against correlation
power analysis (CPA) [GHPS07]. In 2012, Fei et al. took another approach and proposed
the “confusion coefficient” for S-boxes, which quantifies how distinguishable two key
candidates can be in the case of DPA [FLD12]. Shortly after, they extended this work to
include CPA [FDLZ14] and also covered the security of masking schemes [DZFL14].
The main goal for this type of research, aiming at intrinsically improving SCA
resistance for cryptographic building blocks, is to unify different approaches and come
up with a generic framework. Ideally, the best metric to evaluate SCA resistance
at the design stage should be valid for all implementation options and be platform-
agnostic. Another challenge is to incorporate countermeasures such as masking in the
proposed metrics.
In this chapter, we mainly focus on how accurate these metrics are in a realistic
simulation setting before moving on to real world experiments. We compute diﬀerent
metrics for various S-boxes and compare if our simulations match what would be
expected from each metric. An important contribution of the chapter is to analyse
both the S-box and the inverse of it for side channel resistance. This helps to quantify
how strong the side channel resistance is when the implementations are analysed
using the input data or the output data. Our results show that it is easier to analyse
some ciphers from the input side or the output side depending on the S-box that
is used. Moreover, we also provide a mathematical analysis on why the so called
“Phantom” S-boxes are more diﬃcult to attack with CPA. We conclude the chapter
by extending the given analysis to real world analysis on a software platform as well as
an FPGA platform and by providing a few points for algorithm designers to consider
for estimating the resistance of their ciphers to DPA and CPA.
2.1 Background
As mentioned previously, there exist several properties of S-boxes where each property
is relevant in providing protection against certain classes of cryptographic attacks.
However, here we concentrate only on several basic properties such as bijectivity,
linearity and δ-uniformity as well as the theoretical measures quantifying side channel
attack resistance of an S-box.
The addition modulo 2 is denoted as $\oplus$. The inner product of vectors $a$ and $b$ is
denoted as $a \cdot b$ and equals $a \cdot b = \bigoplus_{i=1}^{n} a_i b_i$.
Function $F$, called S-box or vectorial Boolean function, of size $(n,m)$ is defined as
any mapping $F$ from $\mathbb{F}_2^n$ to $\mathbb{F}_2^m$ [Pro05]. When $m$ equals 1 the function is called a
Boolean function. Boolean functions $f_i$, where $i \in \{1, \ldots, m\}$, are coordinate functions
of $F$, where every Boolean function has $n$ variables.
Hamming weight: HW of a vector $\bar{a}$, where $\bar{a} \in \mathbb{F}_2^n$, is the number of non-zero
positions in the vector.
An $(n,m)$-function is called balanced if it takes every value of $\mathbb{F}_2^m$ the same
number $2^{n-m}$ of times [CH10].
Walsh transform: $W_F(a, v)$ of a function $F$ is defined as
\[ W_F(a, v) = \sum_{x \in \mathbb{F}_2^n} (-1)^{v \cdot F(x) \oplus a \cdot x}. \tag{2.1} \]
Nonlinearity: $N_F$ of a function $F$ of size $(n,m)$ is equal to the minimum nonlin-
earity of all non-zero linear combinations $v \cdot F$, where $v \neq 0$, of its coordinate functions
$f_i$ [Car05].
\[ N_F = 2^{n-1} - \frac{1}{2} \max_{\substack{a \in \mathbb{F}_2^n \\ v \in \mathbb{F}_2^{m*}}} |W_F(a, v)|. \tag{2.2} \]
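As a minimal sketch of Eqs. (2.1) and (2.2), the Walsh transform and the nonlinearity of a small S-box can be computed by exhaustive enumeration. The helper names below are illustrative and not taken from the thesis; the PRESENT lookup table is copied from the public cipher specification [BKL+07], and vectors are assumed to be encoded as integers.

```python
# Illustrative helpers (names are assumptions, not from the thesis).

def dot(a, b):
    """Inner product over F_2: parity of the bitwise AND of a and b."""
    return bin(a & b).count("1") & 1


def walsh(sbox, a, v):
    """W_F(a, v) = sum over x of (-1)^(v.F(x) XOR a.x), as in Eq. (2.1)."""
    return sum(1 - 2 * (dot(v, y) ^ dot(a, x)) for x, y in enumerate(sbox))


def nonlinearity(sbox, n, m):
    """N_F = 2^(n-1) - (1/2) max |W_F(a, v)| over all a and non-zero v, Eq. (2.2)."""
    peak = max(abs(walsh(sbox, a, v))
               for a in range(2 ** n)
               for v in range(1, 2 ** m))
    return 2 ** (n - 1) - peak // 2  # Walsh values are even, so // is exact


# PRESENT S-box lookup table from the public specification.
PRESENT = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
           0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]

if __name__ == "__main__":
    print("nonlinearity of the PRESENT S-box:", nonlinearity(PRESENT, 4, 4))
```

For the PRESENT S-box this evaluates to 4, the value expected of an optimal 4×4 S-box. The exhaustive loop is only practical for small n and m; an 8×8 S-box would normally call for a fast Walsh-Hadamard transform instead.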
After this brief summary of notation and fundamental definitions for Boolean
functions, we now summarize the metrics that are used in this chapter.
Differential uniformity: $\delta$ represents the largest value in the difference
distribution table without counting the value $2^n$ in the first row and first column
position [Nyb91].
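The definition of δ can be illustrated with a short sketch that builds the difference distribution table row by row and skips the trivial zero input difference. The function names are hypothetical, and the PRESENT table is again taken from the public specification.

```python
from collections import Counter


def ddt_row(sbox, a):
    """Row of the difference distribution table for input difference a."""
    return Counter(sbox[x] ^ sbox[x ^ a] for x in range(len(sbox)))


def differential_uniformity(sbox):
    """Largest DDT entry over all non-zero input differences."""
    return max(max(ddt_row(sbox, a).values()) for a in range(1, len(sbox)))


# PRESENT S-box lookup table from the public specification.
PRESENT = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
           0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]

if __name__ == "__main__":
    print("differential uniformity of PRESENT:", differential_uniformity(PRESENT))
```

The PRESENT S-box is differentially 4-uniform, so the sketch reports 4. A linear mapping such as the identity would instead give the worst possible value, 2^n.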
Transparency Order [Pro05]: $T_F$ of a function $F$ of size $(n,m)$ is defined as:
\[ T_F = \max_{\beta \in \mathbb{F}_2^m} \left( |m - 2\,\mathrm{HW}(\beta)| - \frac{1}{2^{2n} - 2^n} \sum_{a \in \mathbb{F}_2^{n*}} \left| \sum_{\substack{v \in \mathbb{F}_2^m \\ \mathrm{HW}(v) = 1}} (-1)^{v \cdot \beta}\, W_{D_a F}(0, v) \right| \right) \tag{2.3} \]
where $W_{D_a F}$ represents the Walsh transform of the derivative of $F$ with respect to a vector
$a \in \mathbb{F}_2^n$. According to the directions given in the original paper, a low transparency
order value should result in an S-box which is more robust to side channel attacks.
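A direct, unoptimised evaluation of Eq. (2.3) can be sketched as follows. The sketch assumes the usual derivative $D_a F(x) = F(x) \oplus F(x \oplus a)$ and uses the fact that the vectors $v$ of Hamming weight 1 simply select single output bits; the function and variable names are illustrative.

```python
def hw(v):
    """Hamming weight of a non-negative integer."""
    return bin(v).count("1")


def transparency_order(sbox, n, m):
    """Transparency order T_F of Eq. (2.3) by exhaustive enumeration."""
    size = 2 ** n
    norm = 2 ** (2 * n) - 2 ** n
    # auto[a][i] = W_{D_a F}(0, e_i) = sum_x (-1)^(bit i of F(x) XOR F(x XOR a))
    auto = [[sum(1 - 2 * (((sbox[x] ^ sbox[x ^ a]) >> i) & 1) for x in range(size))
             for i in range(m)]
            for a in range(size)]
    best = float("-inf")
    for beta in range(2 ** m):
        penalty = sum(
            abs(sum((1 - 2 * ((beta >> i) & 1)) * auto[a][i] for i in range(m)))
            for a in range(1, size))
        best = max(best, abs(m - 2 * hw(beta)) - penalty / norm)
    return best


# PRESENT S-box lookup table from the public specification.
PRESENT = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
           0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]

if __name__ == "__main__":
    print("T_F(PRESENT) = %.4f" % transparency_order(PRESENT, 4, 4))
```

For the PRESENT S-box the maximum is attained at β = 0 (and at the all-ones β) and evaluates to 4 − 112/240 ≈ 3.53, which agrees with the value listed for PRESENT in Table 2.4.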
Guilley metric [GHPS07]: $G07_F$ of an $(n,m)$-function $F$ is defined as:
\[ G07_F = \min_{\beta \in \mathbb{F}_2^{n*}} \sum_{e \in \mathbb{F}_2^n} \left( \sum_{x \in \mathbb{F}_2^n} \sum_{j=1}^{m} \sum_{i=1}^{m} (-1)^{f_i(x \oplus e)} \left( (-1)^{f_j(x \oplus \beta)} - (-1)^{f_j(x)} \right) \right)^2. \tag{2.4} \]
One should note that this metric is based on an attack using a maximum likelihood
estimator. However, in this chapter we evaluate if it is representative of the strength
of an S-box against correlation attacks (CPA). Guilley et al. state that the higher the G07
value of an S-box, the easier it will be to analyse in a side channel analysis context.
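Eq. (2.4) can be evaluated with a few lines of code, sketched here under one simplifying observation: the double sum over $i$ and $j$ factorises, because $\sum_i (-1)^{f_i(x)} = m - 2\,\mathrm{HW}(F(x))$. The names `signed_weight` and `g07` are illustrative, and the two lookup tables are the PRESENT S-box (public specification) and the Phantom S-box defined later in this section.

```python
def signed_weight(y, m):
    """sum_i (-1)^(bit i of y) = m - 2 * HW(y)."""
    return m - 2 * bin(y).count("1")


def g07(sbox, n, m):
    """Guilley metric of Eq. (2.4), using the factorised double sum."""
    size = 2 ** n
    s = [signed_weight(y, m) for y in sbox]
    best = None
    for beta in range(1, size):
        total = 0
        for e in range(size):
            inner = sum(s[x ^ e] * (s[x ^ beta] - s[x]) for x in range(size))
            total += inner * inner
        best = total if best is None else min(best, total)
    return best


PRESENT = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
           0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]
PHANTOM = [6, 4, 7, 8, 0, 5, 2, 10, 14, 3, 13, 1, 12, 15, 9, 11]

if __name__ == "__main__":
    print("G07(PRESENT) =", g07(PRESENT, 4, 4))
    print("G07(Phantom) =", g07(PHANTOM, 4, 4))
```

With these tables the sketch yields 6400 for PRESENT and 2048 for Phantom, matching the G07 column of Table 2.4.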
In 2014, Chakraborty et al. revisit the transparency order and point out some
incorrect assumptions made in the original paper proposing the transparency order.
They also propose a remedy to the problems they identify and call the new metric
Improved Transparency Order [CSM+14]. The improved transparency order
$IT_F$ of an $(n,m)$-function $F$ is defined as:
\[ IT_F = \max_{\beta \in \mathbb{F}_2^m} \left( m - \frac{1}{2^{2n} - 2^n} \sum_{a \in \mathbb{F}_2^{n*}} \sum_{j=1}^{m} \left| \sum_{i=1}^{m} (-1)^{\beta_i \oplus \beta_j}\, C_{f_i,f_j}(a) \right| \right) \tag{2.5} \]
where $\beta_i$ is the $i$-th bit of $\beta$ and $C_{f_i,f_j}$ is the cross correlation spectrum:
\[ C_{f_i,f_j}(\omega) = \sum_{x \in \mathbb{F}_2^n} (-1)^{f_i(x) \oplus f_j(x \oplus \omega)}. \]
Similar to the original proposal of the transparency order,
lower values should represent S-boxes which are more robust to side channel analysis.
Confusion Coefficient [FDLZ14]: Analysing individual S-boxes requires an
evaluation metric that clearly separates the eﬀect of the physical characteristics of
the device under attack (such as noise) from the algorithmic eﬀect of the cipher that
is targeted. For this purpose, Fei et al. proposed the Confusion Coeﬃcient model for
DPA [FLD12] and CPA [FDLZ14].
The suggested model is closely related to the selection function that we use to
perform the attack. For the following selection function $\gamma$:
\[ \gamma(\mathit{input}, \mathit{key}) = \mathrm{HW}(F(\mathit{input} \oplus \mathit{key})) \tag{2.6} \]
the confusion coefficient $\kappa$ is defined for two different keys $k_i, k_j$, where $k_i \neq k_j$, as:
\[ \kappa(k_i, k_j) = E\left[ (\gamma(\mathit{input}, k_i) - \gamma(\mathit{input}, k_j))^2 \right] \tag{2.7} \]
where $E$ denotes the expected value over all possible inputs. The resulting $\kappa(k_i, k_j)$
describes the effect of an algorithmic confusion: a large value indicates that it is
easy to distinguish the keys $k_i$ and $k_j$ if a side channel attack is performed with the
selection function $\gamma$. In general, the coefficient $\kappa(k_i, k_j)$ indicates how likely it is
that the two keys can be distinguished successfully.
In order to fully characterize the behaviour of the S-box outputs in terms of the
selection function $\gamma$ in a side channel context, we need to compute all possible values
of $\kappa$ for all distinct key pairs $k_i$ and $k_j$. After acquiring all possible $\kappa$ values, we
proceed to construct the frequency distribution of the confusion coefficients for the
chosen S-box. According to Heuser et al. [HRG14], highly non-linear components
lead to a frequency distribution with low variance, compared to linear elements, which
demonstrate high variance. Therefore, in order to improve side channel resistance, we
need to search for S-boxes whose frequency distributions demonstrate high variance.
To this end, we employ heuristic techniques such as genetic algorithms (for a full
description see [PPE+14]) that generate new S-boxes with high variance in their
respective confusion coefficient values $\kappa$, while presenting high resistance to linear
and differential cryptanalysis.

Table 2.1: Variance of the confusion coefficient computed for different S-boxes.

          PRESENT   Phantom   New
var(κ)    0.6600    1.3880    1.3629
Following the aforementioned heuristics, we have generated two “improved” S-
boxes: the so-called “Phantom” and “New” that are deﬁned as follows:
Phantom = {6, 4, 7, 8, 0, 5, 2, 10, 14, 3, 13, 1, 12, 15, 9, 11}
New = {15, 11, 8, 4, 2, 0, 14, 13, 9, 3, 1, 5, 12, 10, 7, 6}
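The variance of the confusion coefficient can be computed along the following lines. This is a sketch under a stated assumption: the text does not spell out the normalisation of the variance, and the choice made here (the unbiased sample variance taken over all ordered key pairs with $k_i \neq k_j$) is an assumption that reproduces the figures of Table 2.1. The function names are illustrative, and the PRESENT table comes from the public specification.

```python
from itertools import permutations
from statistics import variance


def hw(v):
    """Hamming weight of a non-negative integer."""
    return bin(v).count("1")


def kappa(sbox, ki, kj):
    """Confusion coefficient of Eq. (2.7) for the HW selection function of Eq. (2.6)."""
    size = len(sbox)
    return sum((hw(sbox[x ^ ki]) - hw(sbox[x ^ kj])) ** 2 for x in range(size)) / size


def confusion_variance(sbox):
    """Sample variance of kappa over all ordered key pairs with ki != kj (assumed)."""
    keys = range(len(sbox))
    return variance(kappa(sbox, ki, kj) for ki, kj in permutations(keys, 2))


PRESENT = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
           0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]
PHANTOM = [6, 4, 7, 8, 0, 5, 2, 10, 14, 3, 13, 1, 12, 15, 9, 11]
NEW = [15, 11, 8, 4, 2, 0, 14, 13, 9, 3, 1, 5, 12, 10, 7, 6]

if __name__ == "__main__":
    for name, table in (("PRESENT", PRESENT), ("Phantom", PHANTOM), ("New", NEW)):
        print("%-8s var(kappa) = %.4f" % (name, confusion_variance(table)))
```

Since $\kappa(k_i, k_j)$ depends only on $k_i \oplus k_j$ for this selection function, only 15 distinct values occur per S-box; the full enumeration is kept here for clarity rather than speed.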
The “Phantom” S-box demonstrates an increased variance of the frequency dis-
tribution (see Table 2.1), when compared to the PRESENT [BKL+07] S-box, while
it remains in the “optimal” S-box classes as deﬁned by Leander et al. [LP07]. The
“Phantom” S-box actually exhibits increased resistance (resembling the behaviour of
a linear element) as it results in ghost peaks after CPA. Hence, we refer to it as the
“Phantom S-box”.
The “New” S-box, on the other hand, does not result in ghost peaks in a software
setting and in fact has a lower variance of the distribution of the confusion coeﬃcient
when compared to the Phantom S-box (see Table 2.1). Nevertheless, when imple-
mented in hardware, this S-box yielded the best results in our experiments presented
in Section 2.4.2.
2.2 The Phantom Property
The Phantom S-box was first introduced in our earlier work presented at Indocrypt ’14
[PPE+14]. The name ‘Phantom’ is inspired by the strong ghost peaks present in the
CPA results. Figure 2.1 presents the expected CPA result when the Hamming weight
model is assumed to attack a system employing the Phantom S-box and the S-box
output is used as the selection function. The plot given in the figure assumes the
correct key to be 0, and presents the correlation value that each key candidate would
get from a CPA when a signal with no noise is analysed. As is clearly visible from
the figure, strong ghost peaks appear after running a CPA assuming the Hamming
weight (HW) model as the power model of the target. An interesting note here is
that even if the actual device has a different leakage function than strict Hamming
weight leakage, as long as the attacker assumes HW as the power model, strong ghost
peaks will be prominent in the attack output. This can be easily seen by following
the outputs of the S-box for related inputs.
[Figure 2.1 plot: correlation values (y-axis, −1 to 1) against key candidates (x-axis, 0 to 15).]
Figure 2.1: CPA results on Phantom S-box in a noiseless setup.
Table 2.2: Inputs and outputs of the Phantom S-box with related keys: k′ = k ⊕ 9.

X ⊕ k   S(X ⊕ k)   HW(S(X ⊕ k))   X ⊕ k′   S(X ⊕ k′)   HW(S(X ⊕ k′))
    0          6              2        9           3               2
    1          4              1        8          14               3
    2          7              3       11           1               1
    3          8              1       10          13               3
    4          0              0       13          15               4
    5          5              2       12          12               2
    6          2              1       15          11               3
    7         10              2       14           9               2
    8         14              3        1           4               1
    9          3              2        0           6               2
   10         13              3        3           8               1
   11          1              1        2           7               3
   12         12              2        5           5               2
   13         15              4        4           0               0
   14          9              2        7          10               2
   15         11              3        6           2               1
Table 2.2 shows the HW values to be used to predict the power consumption of
a device when the selection function is the S-box output. One should pay particular
attention to the columns printed in bold fonts (the two HW columns), as they show
how the power model prediction would change depending on a given key guess. Note
that, in every row, the values in these two columns add up to 4. In other words, they
are at the same distance from the mean (which is 2 in this case), but have
complementary signs when computing the correlation coefficient given in Eq. (2.8).
\[ r_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\, \sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}} \tag{2.8} \]
Note that the denominator stays the same for both keys. However when the numerator
is considered, $(y_i - \bar{y})$ stays the same since it depends on the measurements taken from
the device. Therefore the term $(x_i - \bar{x})$ will make the difference when computing the
correlation coefficient for keys $k$ and $k'$. Following the bold font columns of Table 2.2,
one can easily see that the term $(x_i - \bar{x})$ takes the same value in magnitude, but with
complementary signs for keys $k$ and $k'$. Therefore the correlation coefficient for both
keys will be of the same magnitude and will have complementary signs as a result of
the analysis. Hence, unless the analyst knows the exact way the target intermediate
value leaks for a given time sample in power traces, distinguishing the two keys will
not be possible. Note that the complementary CPA results hold for any related key
pair $k$ and $k'$ such that $k' = k \oplus 9$ for the Phantom S-box.
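The complementary behaviour described above can be checked numerically. The sketch below runs a noiseless CPA over all 16 inputs with the Hamming weight model; the helper names are illustrative, and the correct key is assumed to be 0 as in Figure 2.1.

```python
import math

PHANTOM = [6, 4, 7, 8, 0, 5, 2, 10, 14, 3, 13, 1, 12, 15, 9, 11]


def hw(v):
    """Hamming weight of a non-negative integer."""
    return bin(v).count("1")


def pearson(xs, ys):
    """Pearson correlation coefficient of Eq. (2.8)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = (math.sqrt(sum((x - mean_x) ** 2 for x in xs))
           * math.sqrt(sum((y - mean_y) ** 2 for y in ys)))
    return num / den


def noiseless_cpa(sbox, true_key):
    """Correlation of every key guess with a noise-free HW leakage."""
    inputs = range(len(sbox))
    leakage = [hw(sbox[x ^ true_key]) for x in inputs]
    return [pearson([hw(sbox[x ^ guess]) for x in inputs], leakage)
            for guess in range(len(sbox))]


if __name__ == "__main__":
    # Every row of Table 2.2: the two HW predictions add up to 4.
    print(all(hw(PHANTOM[x]) + hw(PHANTOM[x ^ 9]) == 4 for x in range(16)))
    for guess, rho in enumerate(noiseless_cpa(PHANTOM, 0)):
        print("key guess %2d: correlation %+.2f" % (guess, rho))
```

The guess 0 ⊕ 9 = 9 is perfectly anti-correlated with the leakage, so an analyst who ranks candidates by the magnitude of the correlation cannot separate it from the correct key.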
Although the Phantom S-box, or any other S-box exhibiting the Phantom
property, makes it harder for the analyst to pinpoint the correct key out of all possible
candidates, the analyst can still reduce the search space significantly by increasing
the signal to noise ratio or simply increasing the number of measurements she takes.
This way, the candidate key space can be reduced to only a couple of key candi-
dates, which in turn can be brute forced once all key parts are analysed. Hence, the
Phantom property cannot be seen as a sole countermeasure to thwart SCA, but as
a simple method that can be used by the algorithm designers to increase the side
channel security of their block cipher.
2.3 Analysis from Input/Output
There are many previous works in the literature considering the theoretical side
channel security of cryptographic primitives [MMS13a, MMS13b, HRG14, PPE+14,
EPBP15] and the numbers given in those works are only for attacks using the input
data. However, an SCA can be mounted using either the input or the output data.
In case the output data is used, the S-box that is used for analysis is eﬀectively the
inverse S-box. Therefore if a theoretical analysis of a block cipher is to be done, one
has to consider the inverse S-box to assess the side channel security of most block
ciphers. To the best of the author’s knowledge, there is only one work which consid-
ers presenting properties for the inverse S-box [Sto15]. However, that work does not
draw conclusions on the diﬀerence in side channel resistance of analysis from input or
output. Therefore this section ﬁlls that gap in the current literature.
Table 2.3: Properties of the analysed 8×8 S-boxes

S-box             T_F      IT_F     G07_F        σ(κ(F))
AES               7.8600   6.9161   13 742 080   0.1113
inv(AES)          7.8515   6.9114   11 341 312   0.0804
Sbox 1 [Pic15]    7.5650   6.8729      801 024   4.0574
inv(Sbox 1)       7.7951   6.8494   14 364 928   0.1046
Sbox 2 [Pic15]    7.3588   6.5903   10 166 016   0.4104
inv(Sbox 2)       7.8037   6.8563   10 802 944   0.1247

Table 2.3 presents the theoretical indicators, which are summarized in Section 2.1,
for side channel security of an S-box, for a selection of 8×8 S-boxes. As a basis for
our analysis, we have chosen the AES S-box and included two 8×8 S-boxes from
Picek’s PhD thesis [Pic15] since they achieve better side channel resistance indicator
values at the expense of a small decline in cryptanalytical security relative to the AES S-box.
In fact, the differential uniformity value is increased from 4 for the AES S-box, to 12
for both S-box 1 and S-box 2. Although this seems like a large compromise in the
cryptanalytical security, this difference becomes negligible for AES in practice thanks
to the large security margin it has with respect to classical differential cryptanalysis.
In particular, thanks to the structure of AES, any 4-round differential trail has at
least 25 active S-boxes. A differential uniformity of 4 implies all differentials over
the S-box have differential probability (DP) at most 1/64, leading to a bound of
$(1/64)^{25} = 2^{-150}$. The increase of the uniformity to 12 due to the alternative S-box
weakens this bound to $(3/64)^{25} \approx 2^{-110.38}$. As the number of plaintext-ciphertext pairs
required to attack the cryptographic algorithm is of the same order as $1/\mathrm{DP}$, we can safely
assume that a differential attack against an AES-like algorithm employing one of the
aforementioned S-boxes becomes practically infeasible when the full 10 round cipher is
considered. Although other techniques such as truncated differential attacks might be
applicable with higher probability, we expect these attacks to be practically infeasible
on the full cipher due to the decreased number of differential relations with non-zero
DP for the S-box. Therefore, we expect these
S-boxes to be a good choice that can be achieved with a “reasonable” cryptanalytical
security. In the case of S-box 1 from Picek’s PhD thesis, an attack from the input
side would seem to be rather difficult in comparison to other S-boxes included in this
section. However, when an attack using the output data is considered, we see that it
is “theoretically” one of the weakest S-boxes in comparison to others, in terms of side
channel attacks.
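The two differential bounds quoted in this paragraph follow from simple arithmetic, which the sketch below reproduces. The function name and its default arguments are illustrative; the figure of 25 active S-boxes over 4 rounds is the one stated in the text.

```python
from math import log2


def trail_bound_log2(delta, sbox_bits=8, active_sboxes=25):
    """log2 of the trail bound (delta / 2^sbox_bits)^active_sboxes."""
    return active_sboxes * (log2(delta) - sbox_bits)


if __name__ == "__main__":
    for delta in (4, 12):
        print("delta = %2d: bound = 2^%.2f" % (delta, trail_bound_log2(delta)))
```

For δ = 4 the exponent is exactly −150, and for δ = 12 it is about −110.38, so roughly 40 bits of margin are given up against classical differential trails over 4 rounds.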
To attest the actual security of the selected S-boxes, we have run 1000 separate
analyses on digitized simulated measurements. The reason that we have chosen to
digitize the simulated power consumption values is to provide more realistic
simulated data. Although in the case of CPA we don’t expect a significant variation in
attack results, attacks like MIA can be greatly affected by this choice. The method
used to test the actual side channel security of a cipher employing the S-boxes given in
Table 2.3 is CPA. As the selection function, the S-box output is used and the HW
model of the intermediate value is used as the power model. The average guessing
entropy¹ over 1000 simulated attacks on each S-box is presented in Figure 2.2.

¹ First proposed by Massey [Mas94] and introduced as an indicator of the success of a side channel
attack by Standaert et al. [SMY09b], guessing entropy captures the remaining workload of a side
channel adversary to recover the secret key.

[Figure 2.2 plot, titled "Avg GE of 1000 simulations (CPA−HW), 2000 traces": log2(guessing entropy) (y-axis, 0 to 6) against log2(SNR) (x-axis, −10 to −3); curves: AES, inv(AES), Sbox-1, inv(Sbox-1), Sbox-2, inv(Sbox-2).]
Figure 2.2: Simulation results for 8×8 S-boxes

Although the theoretical indicators show some differences between the S-boxes that
are considered, in the 8×8 case, the effect on actual attack resistance does not seem
to be clearly visible from Figure 2.2. The simulation results show that a small
difference in the theoretical indicators for side channel resistance does not translate into a
significant difference in the attack results, in contrast to what was reported in [MMS13b].
Most importantly, our simulations show that for the case at hand, the original transparency
order and the updated transparency order do not seem to be representative as an
indicator.
As introduced before, a low transparency value should translate to a strong
resistance against CPA. However, if S-box 1 and S-box 2 are considered, contrary to the
results presented in [MMS13b] we see that neither the transparency order nor the
updated transparency order is representative of the side channel resistance of an S-box
for these sample S-boxes. Moreover, looking at the simulation results, the variance
of the confusion coefficient matrix and the metric proposed by Guilley et al. seem to
be better representatives for side channel security for the sample S-boxes analysed in
this chapter.
When 4×4 S-boxes are considered however, the difference in strength in terms
of side channel attack resistance seems to be more pronounced. As in the previous
case, Table 2.4 presents the theoretical indicators for side channel resistance for a
number of S-boxes. Looking at the indicators given in the table, the S-box generated
by Picek et al. [PMMB15] (named as Modified Transparency in Figure 2.3) using
genetic algorithms and the updated transparency order in the fitness function seems
to be the best one when transparency order metrics are considered. If the variance of
the confusion coefficient is considered however, the strongest S-boxes seem to be the
Phantom [PPE+14], New [EPBP15] and inverse PRESENT [BKL+07] S-boxes.

Table 2.4: Properties of the analysed 4×4 S-boxes

S-box               T_F    IT_F   G07_F   σ(κ(F))
Mod. Trans.         3.27   1.90    1280   1.2625
inv(Mod. Trans.)    3.33   2.40    4096   0.5847
New                 3.47   2.50     512   1.3629
inv(New)            3.60   2.63    7424   0.3085
PRESENT             3.53   2.47    6400   0.6600
inv(PRESENT)        3.33   2.33     512   1.3629
PRINCE              3.40   2.33    5376   0.6600
inv(PRINCE)         3.33   2.30    2304   1.2123
Phantom             3.33   2.50    2048   1.3880
inv(Phantom)        3.47   2.50    4096   0.5847

Similar to the experiments for the 8×8 case, we have run experiments on digitized
simulated data to assess the security of each S-box w.r.t. side channel analysis. We
have run 1000 independent experiments using the S-box output as the selection
function and the HW model as the power model. We compute the average guessing entropy
depending on the amplitudes of the correlation values and overlay the curves for
each analysed S-box in Figure 2.3. Again, similar to the 8×8 case, transparency order
metrics do not seem to be representative for the set of S-boxes that we have used for
comparison.
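The simulation procedure can be sketched as follows. This is an illustration rather than the exact setup used for the figures: the digitisation step is omitted, the noise is assumed to be Gaussian, candidates are ranked by the magnitude of the correlation, and all function names and parameters are hypothetical. For a 4-bit HW leakage with uniform inputs the signal variance is 1, so the SNR equals 1/σ² under these assumptions.

```python
import math
import random

# PRESENT S-box lookup table from the public specification.
PRESENT = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
           0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def hw(v):
    """Hamming weight of a non-negative integer."""
    return bin(v).count("1")


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = math.sqrt(sum((x - mean_x) ** 2 for x in xs)
                    * sum((y - mean_y) ** 2 for y in ys))
    return num / den if den else 0.0


def cpa_rank(sbox, key, n_traces, noise_sigma, rng):
    """Rank (1 = best) of the correct key after one simulated CPA."""
    inputs = [rng.randrange(len(sbox)) for _ in range(n_traces)]
    traces = [hw(sbox[x ^ key]) + rng.gauss(0.0, noise_sigma) for x in inputs]
    scores = [abs(pearson([hw(sbox[x ^ guess]) for x in inputs], traces))
              for guess in range(len(sbox))]
    order = sorted(range(len(sbox)), key=lambda guess: scores[guess], reverse=True)
    return order.index(key) + 1


def guessing_entropy(sbox, n_experiments, n_traces, noise_sigma, seed=0):
    """Average rank of the correct key over independent simulated attacks."""
    rng = random.Random(seed)
    ranks = [cpa_rank(sbox, rng.randrange(len(sbox)), n_traces, noise_sigma, rng)
             for _ in range(n_experiments)]
    return sum(ranks) / len(ranks)


if __name__ == "__main__":
    for sigma in (0.5, 4.0, 16.0):
        ge = guessing_entropy(PRESENT, 50, 200, sigma)
        print("sigma = %5.1f  SNR = 2^%.1f  GE = %.2f"
              % (sigma, math.log2(1.0 / sigma ** 2), ge))
```

Plotting log2 of the returned value against log2(SNR) for each S-box and its inverse gives curves of the kind shown in Figure 2.3, although the absolute numbers depend on the assumptions listed above.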
[Figure 2.3 plot, titled "Avg GE of 1000 CPA simulations (HW), 2000 traces": log2(guessing entropy) (y-axis, 0 to 2.5) against log2(SNR) (x-axis, −10 to −3); curves: Modified Transparency, inv(Modified Transparency), New, inv(New), Present, inv(Present), Prince, inv(Prince), Phantom, inv(Phantom).]
Figure 2.3: Simulation results for 4×4 S-boxes
One interesting observation that can be made from the simulation results presented
in Figure 2.3 is how robust the inverse Modified Transparency S-box is w.r.t. side channel
attacks, contrary to all theoretical indicators. However, this is due to the fact that the
inverse Modified Transparency S-box also presents the phantom property explained
in Section 2.2. Although most key candidates do not result in a strong ghost peak
after CPA (see Figure 2.4), the key candidate k′ = k ⊕ 7 results in complementary
HW values when computing the hypothetical power model. Therefore the correct key
cannot be distinguished unless the exact leakage function of the device is known, but
the number of key candidates can be reduced to only two with strong confidence,
unlike the case of the Phantom S-box.
[Figure 2.4 plot: correlation values (y-axis, −1 to 1) against key candidates (x-axis, 0 to 15).]
Figure 2.4: CPA results on inverse Modified Transparency S-box in a noiseless setup
Another interesting observation is that the difference between S-boxes in terms of
side channel security seems to be more pronounced for the 4×4 case. This difference
is also picked up by the theoretical indicators presented in Table 2.4. Moreover, the
strength of each S-box and its inverse seems rather different with the exception of the
PRINCE S-box [BCG+12]. Looking at Figure 2.3, one can conclude that analysing
a block cipher employing the Phantom S-box from the output side is much easier than
attacking from the input side. On the contrary, attacking a PRESENT implementation
using input data seems to be the better option for an analyst.
2.4 FPGA and Microcontroller Experiments
In this section, we investigate experimentally if the “Phantom” S-box provides in-
creased security in a real world setting (software and hardware). Rather than im-
plementing only the S-box lookup for testing purposes, we take the more realistic
approach of embedding the S-box in the block cipher PRESENT [BKL+07]. Note
that the confusion coeﬃcient (and its distribution) is computed using the selection
function of a given cryptographic algorithm. In Section 2.1 we mentioned that it is
possible to use heuristics in order to generate S-boxes resistant to a specific selection
function. For instance, the “Phantom” S-box was tailor-made to be resistant to the
selection function γ that is used in attacks on software implementations of PRESENT,
where the HW of the S-box output would be the strongest source of leakage. Thus, it
is of direct interest to verify whether this holds for an AVR software implementation
as well as to investigate the Phantom S-box’s behaviour on an FPGA implementation.
Our choice of PRESENT is motivated by its standardization as a lightweight block
cipher with a 4×4 S-box [ISO12]. PRESENT employs the substitution permutation
network (SPN) design: the S-Layer is formed of 4×4 S-box lookups to provide non-
linearity, and the P-Layer is a bit permutation that ensures diﬀusion.
2.4.1 Software (AVR)
For software analysis, we use an AVR smartcard with an ATmega163 microcontroller.
We used Riscure Power Tracer 3 to communicate with the smartcard and extract
the power consumption traces using a LeCroy WaveRunner 610Zi oscilloscope. We
acquired 2 500 traces using the low-noise measurements supplied by the Power
Tracer and ran 50 independent experiments (with 50 traces each) to generate the
guessing entropy plot presented in Figure 2.5. The ATmega163 microcontroller leaks
the Hamming weight of the intermediate values and we use the selection function γ
as speciﬁed in Eq. (2.6).
Our target implementation uses separate look up tables for the S-layer and the
P-layer. However, as the code is running on an 8-bit microcontroller, we used 8-bit
look ups for the S-layer, rather than using 4-bit look ups which would have introduced
an unnecessary delay in the computation. While doing the analysis however, we have
analysed the key in a nibble (i.e. 4-bit chunks) by nibble manner.
The average guessing entropy over 50 independent experiments is presented in
Figure 2.5. As expected, the Phantom property of the S-box (explained in Section 2.2)
prevents the attack from identifying the correct key with certainty. However the
key space can be successfully reduced to only two candidates after analysing about
15 traces. This is an expected result, since our analysis is mounted using a selection
function which benefits the Phantom S-box. Hence, it is of greater interest to
investigate how the Phantom S-box would behave on a hardware implementation.

[Figure 2.5 plot: guessing entropy (log2) (y-axis, 0 to 4) against number of traces (x-axis, 5 to 50); curves: Present S-box, Phantom S-box (HW), New S-box (HW).]
Figure 2.5: Guessing entropy with respect to the number of processed traces.
2.4.2 Hardware (FPGA)
For hardware analysis we use the SASEBO board with a XILINX Virtex-II Pro
(XC2VP7) FPGA for running the PRESENT design. The block diagram of the
design we implemented in the FPGA is given in Fig. 2.6. As shown in the figure, the
data register is updated only once in each round. This makes the selection function
$\gamma$ slightly more complicated than the one used in the software case. When attacking
from the input, rather than simply computing the Hamming weight of the S-box
output, we need to put the values through the P-layer and compute which bit positions
in the data register are affected. Then, we compute the Hamming distance (HD)
between the old and new register state, in order to estimate the power consumption:
\[ \gamma(\mathit{input}, \mathit{key}) = \mathrm{HD}(R_{old}, R_{new}) = \mathrm{HD}\big(R_{old}(0, 16, 32, 48),\ P(S(R_{old}(0, 1, 2, 3) \oplus \mathit{key}))\big). \]
We emphasize again the fact that the improved S-box was crafted for the software
selection function $\gamma$ of Eq. (2.6), based on the Hamming weight leakage assumption.
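The Hamming distance selection function can be sketched for the least significant nibble as follows. The sketch follows the formula above literally and rests on several assumptions: the register is represented as a 64-bit integer with bit 0 as the least significant bit, the key nibble is XORed before the S-box lookup, and the PRESENT bit permutation maps S-box output bit i of nibble 0 to register bit 16·i. The function name is hypothetical.

```python
# PRESENT S-box lookup table from the public specification.
PRESENT = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
           0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def hd_selection(r_old, key_nibble, sbox=PRESENT):
    """Hamming distance on register bits 0, 16, 32 and 48 for one key-nibble guess.

    r_old      -- 64-bit register content before the update (assumed known)
    key_nibble -- 4-bit guess for the key bits mixed with register bits 0..3
    """
    s_out = sbox[(r_old & 0xF) ^ key_nibble]
    distance = 0
    for i in range(4):
        new_bit = (s_out >> i) & 1          # bit i of the S-box output ...
        old_bit = (r_old >> (16 * i)) & 1   # ... overwrites register bit 16*i
        distance += new_bit ^ old_bit
    return distance


if __name__ == "__main__":
    print(hd_selection(0x0000000000000000, 0x0))
    print(hd_selection(0x0001000100000000, 0x0))
```

Unlike the Hamming weight model of Eq. (2.6), the prediction now depends on the previous register content as well as on the S-box output, which is why an S-box tuned for HW leakage need not help in this setting.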
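[Figure 2.6 diagram: blocks labelled Input, Data Register, S-Layer, P-Layer and Output, with two ⊕ (XOR) nodes fed by the key inputs labelled Key r_i and Key r_31.]
Figure 2.6: Block diagram of the implemented PRESENT cipher core.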
For the experiments, we acquired a total of 150 000 traces for the new S-box we
generated, 70 000 traces for the phantom S-box, and 50 000 traces for the original
PRESENT using a LeCroy WaveRunner 610Zi oscilloscope. To compute the guessing
entropy diagram [SMY09a], ten separate attacks are run and the results obtained
from the attacks are summarized in Figure 2.7. As is clearly visible in the figure,
the behavior of the phantom S-box in an FPGA implementation is rather similar
to the behavior of the original PRESENT S-box. Thus, we observe that
an S-box crafted to reduce the Hamming weight leakage has negligible effect when
applied in a context where Hamming distance leakage is prevalent. On the other
hand, the new S-box given in Section 2.1 behaves rather differently when compared to
the phantom and original PRESENT S-boxes. From these experiments we can deduce
that improved security (in terms of SCA resistance) for a particular leakage model
does not necessarily imply adequately improved security in another setting.

[Figure 2.7 plot: guessing entropy (log2) (y-axis, 0 to 4) against number of traces in units of 100 (x-axis, 0 to 150); curves: Present S-box, Phantom S-box (HW), New S-box (HW).]
Figure 2.7: Guessing entropy with respect to the number of processed traces.
2.5 Conclusions
Although there are many theoretical indicators proposed in the literature, the analy-
sis done in the ﬁrst part of this chapter shows that some of these indicators are not
exactly representative of the diﬀerential side channel attack security of cryptographic
primitives. Our simulated analysis shows that the transparency order and its im-
proved version are not reliable indicators for the set of S-boxes we have used in this
chapter. Even in cases that the analysed S-boxes have very good transparency order
values, we see that this does not translate to exceptional resistance to diﬀerential
side channel attacks. On the other hand, the metric proposed by Guilley et al. and
the confusion coeﬃcient proposed by Fei et al. seem to be good indicators for side
channel attack resistance of a cryptographic primitive. However, one should note that
the confusion coeﬃcient is highly dependent on the selection function to be used for
the analysis, and if taken out of context (as shown in our FPGA experiments), it is
no longer representative. Therefore, if confusion coeﬃcient is to be used to estimate
the side channel attack security of a cryptographic algorithm, the designer should
take into account multiple possible leakage scenarios before drawing conclusions on
the side channel attack resistance of an algorithm.
In conclusion, modifying the cryptographic primitives for improved side channel
security cannot be considered as a side channel countermeasure. However, taking
the side channel security into account at the design level can improve the overall security of
the product when combined with adding time domain noise to the implementation
through random delays in execution [CK10] or shuffling [THM07]. This way the SNR
value that can be achieved by an attacker can be reduced, and if this reduction is
at a significant level, the attacker’s job can be made harder as demonstrated in
Figures 2.2 & 2.3 when the cryptographic primitives are chosen with the metric proposed
by Guilley et al. or the confusion coefficient in mind.

Chapter 3
Side Channel Analysis in Frequency Domain
Since the seminal work of Kocher et al. [KJJ99], many novel works have been published
in the ﬁeld of side channel analysis (SCA). One of the fundamental assumptions for
diﬀerential power analysis attacks is that the power measurements should be aligned.
This led to multiple methods being proposed in the literature to hide leakage through
introducing noise in temporal dimension such as: introducing random delays [CK10]
and shuﬄing the order of computations [THM07] in crypto implementations. These
countermeasures can be listed as the most popular side channel countermeasures
utilized in real world applications due to their relatively low overhead with respect to the
gain they provide in side channel security. However, as the power spectrum of a signal
is invariant to time shifts, these countermeasures led to a new branch of side channel analysis
which utilizes frequency domain to either synchronize traces [GKLD11], use frequency
domain information to increase SNR in time domain [TOT+14], or directly do the
analysis in frequency domain [GHT05, Luo10, TM13, DSEA+12, KNdA13, CTMD15].
Following the initial works by Gebotys et al. [GHT05] and by Zhang et al. [ZDZC09]
showing that diﬀerential power analysis (DPA) and correlation power analysis (CPA)
in frequency domain are valid methods for side channel analysis, many experimen-
tal works followed proving the advantages of this method. Aside from using power
spectral density (PSD) for SCA, other methods using frequency domain information
are also proposed in the literature such as: magnitude squared coherence [TM13],
instantaneous frequency against voltage scaling countermeasure [KNdA13], or even
a method to mount a second order attack against masking schemes [BBB+14]. Re-
cently, Carbone et al. showed that using mutual information analysis (MIA) is a
better alternative to CPA in frequency domain since the correlation coeﬃcient used
in CPA favours linear relationships between two vectors [CTMD15]. Orthogonally to
that work, we show that SNR in frequency domain can be improved such that CPA
can perform better on the power spectrum output.
All those works provide powerful insights on how useful frequency domain in-
formation can be in the context of side channel analysis. However two works are
particularly distinguished among the works covering SCA in frequency domain: the
work by Luo [Luo10] and another by Tiran et al. [TOT+14] as they study how time
domain leakage is transformed into frequency domain. The former uses this transfor-
mation to show the improvement on security when the parameters of the implemented
time domain countermeasure (either random delays or shuﬄing) change. On the other
hand, the latter uses it to choose device speciﬁc ﬁlters which increase the leakage in
time domain. We also follow the approach of studying the transformation of time do-
main leakage in an attempt to fully utilize correlation power spectral analysis (CPSA)
by studying the signal components which aﬀect the leakage in frequency domain.
Our contributions:
In this chapter we propose a simple, yet eﬀective signal processing technique which
improves the leakage in frequency domain in a power analysis scenario. The technique
is as simple as adding a sine wave generated following the directions given in this
chapter. Our mathematical analysis shows that the signal trend (the side channel
signal which is data independent and shared among all measurements) signiﬁcantly
aﬀects leakage in frequency domain. Therefore, using an artiﬁcial trend is the key to
improve correlation power analysis in frequency domain. Mathematical background
is also given in the chapter to provide directions to generate a sine wave to be used
as an artiﬁcial signal trend. Using this method, we show how to increase SNR in
frequency domain and present results showing that adding an artiﬁcial signal trend
is still helpful even when the traces cannot be perfectly aligned. Our experiments on
a smartcard employing a hardware DES engine confirm that, using the technique
proposed in the chapter, it is possible to improve SNR in frequency domain. This in
turn translates to a successful key recovery attack with fewer traces than are required
for a straightforward CPSA.
The rest of the chapter is structured as follows: in Section 3.1 we lay down our
assumptions and make our ﬁrst observations on leakage in frequency domain. In Sec-
tion 3.2 we discuss how a given leakage in time domain is transformed into frequency
domain. This section also includes simulated and real experiments to support the
claims made in the section. In Section 3.3 we show how the observations made in
previous sections can be turned into a signal processing technique which improves
leakage in frequency domain. This section is also supported by simulated and real
experiments.
3.1 Background
DPA uses the differences in power consumption to infer the data used in cryptographic
operations. The differences caused by processed data tend to be small and are
sometimes overshadowed by device noise or measurement errors. In this section we
ﬁrst model the measured signal as a combination of “leakage” and “noise”. Then,
using the linearity of Fourier transform, we show how the power spectral density of
the signal is related to the leakage and noise components of the time domain signal.
Following this, we derive methods to better utilize the frequency domain to exploit
side channel leakage.
In order to perform a DPA attack, it is necessary to collect power consumption
traces Stot of m encryptions with N samples each. As the traces are discrete samples
of the real signal, we use a vector of dimension N to represent the measured power
consumption.
\[ S_{tot} = \begin{bmatrix} s_{tot}(0) \\ s_{tot}(1) \\ \vdots \\ s_{tot}(N-1) \end{bmatrix} \tag{3.1} \]
Out of these N samples only a subset is influenced by the leaking data. Assume
that the leakage is present in the signal Stot and denote it with a vector Sl of dimension
N. Further assume that the leakage is present in a small number of samples of Sl and
that the rest of the samples of Sl are equal to 0 (similar to the middle part of Figure 3.1).
In addition, the noise in the signal is denoted by Sn. Therefore we have
stot(i) = sl(i) + sn(i) (3.2)
for each time sample i ∈ {0, 1, . . . , N − 1}. Thus the signal can be represented in
vector form as:
Stot = Sl + Sn . (3.3)
Note that Gaussian noise is represented together with signal trend by the signal Sn.
The mentioned signal trend is usually caused by the inner workings of the target
device (e.g. device clock) and it is constant for all power consumption traces. An
illustration of the extracted leakage in Eq. (3.2) is given in Figure 3.1. Although this
perfect separation of leakage and the rest of the signal is not practically possible,
we use this representation in order to derive the transformed leakage in frequency
domain.
In order to determine which frequency components inﬂuence the leakage, we trans-
form the signal consisting of noise component and leakage component as in Eq. (3.2).
Fourier transform of the signal Stot can be represented in terms of amplitude and
phase for a given frequency component k as:
\[ F_{s_{tot}}(k) = A_{s_{tot}}(k)\, e^{j \phi_{s_{tot}}(k)} \tag{3.4} \]
where Astot(k) is the amplitude of the frequency component k and φstot(k) is its phase.
Since the signal is a linear combination of two components and the Fourier
transform is linear, we can transform the leakage and the noise separately.
Therefore the signal represented in amplitude and phase form is
\[ F_{s_{tot}}(k) = A_{s_l}(k)\, e^{j \phi_{s_l}(k)} + A_{s_n}(k)\, e^{j \phi_{s_n}(k)} \tag{3.5} \]
where Asl(k) and Asn(k) are the amplitudes of the leakage and the noise at the frequency
component k, and φsl(k) and φsn(k) are their corresponding phases.
Figure 3.1: Illustration of signal decomposition. Top: total signal including leakage
and noise (Stot), middle: leakage component (Sl), bottom: the noise component (Sn).
Horizontal axis: time samples; vertical axis: power consumption.
We are interested in the power spectral density (PSD) of the signal given by
Eq. (3.2), which is a sum of two signals. In order to derive PSD values, we first
separate the signal into its real and imaginary parts:
\[ F_{s_{tot}}(k) = A_{s_l}(k)\bigl(\cos(\phi_{s_l}(k)) + j \sin(\phi_{s_l}(k))\bigr) + A_{s_n}(k)\bigl(\cos(\phi_{s_n}(k)) + j \sin(\phi_{s_n}(k))\bigr). \tag{3.6} \]
Then multiplying the signal with its complex conjugate we get
\[ \|F_{s_{tot}}(k)\|^2 = A_{s_l}^2(k) + A_{s_n}^2(k) + 2\, A_{s_l}(k)\, A_{s_n}(k) \cos[\phi_{s_l}(k) - \phi_{s_n}(k)] . \tag{3.7} \]
Note that the noise component sn consists of signal trend and Gaussian noise in the
above equation. Let us now consider the eﬀects of signal trend that is constant over
all traces, such as a strong clock signal.
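The linearity argument and the identity in Eq. (3.7) can be checked numerically. The following is a minimal sketch assuming NumPy and arbitrary synthetic signals; the names `leak` and `rest`, the window length and the amplitudes are illustrative choices and are not taken from the experiments in this chapter.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 256

# Hypothetical leakage component: non-zero only in a few samples.
leak = np.zeros(N)
leak[100:110] = 0.5

# Hypothetical remaining signal: a clock-like trend plus Gaussian noise.
n = np.arange(N)
rest = np.sin(2 * np.pi * 8 * n / N) + rng.normal(0.0, 0.1, N)

F_l = np.fft.fft(leak)
F_n = np.fft.fft(rest)
F_tot = np.fft.fft(leak + rest)   # equals F_l + F_n by linearity

A_l, phi_l = np.abs(F_l), np.angle(F_l)
A_n, phi_n = np.abs(F_n), np.angle(F_n)

# Left and right hand sides of Eq. (3.7) for every frequency component.
lhs = np.abs(F_tot) ** 2
rhs = A_l**2 + A_n**2 + 2 * A_l * A_n * np.cos(phi_l - phi_n)

linearity_ok = np.allclose(F_tot, F_l + F_n)
identity_ok = np.allclose(lhs, rhs)
print(linearity_ok, identity_ok)
```

The cross term is the only place where the leakage and the rest of the signal interact, which is why the phase difference decides how the leakage appears in the PSD.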
The quadratic relation in Eq. (3.7) shows that the leakage at a given frequency
component of the PSD (a function of ‖Fstot‖²) may or may not follow the time domain
leakage, depending on the phase difference between the leakage and the noise. We can
see this relation more clearly if we assume that Sn is formed purely of signal trend and
consider Eq. (3.7) where Asl contains information on only the leakage, and Asn contains
information on only the signal trend. Then Asl can take a fixed set of values depending
on the leakage, namely the five values (Asl0, Asl1, . . . , Asl4) if the leakage is the
Hamming weight (HW) of a 4-bit variable.
Now, simplifying Eq. (3.7) by combining the phase from the leakage and the noise
φ(k) = φsl(k)− φsn(k), we get
\[ \|F_{s_{tot}}(k)\|^2 = A_{s_l}^2(k) + 2\, A_{s_l}(k)\, A_{s_n}(k) \cos(\phi(k)) + A_{s_n}^2(k) . \tag{3.8} \]
We analyse the resulting parabola with the signal trend fixed, namely with fixed values
of Asn and φsn. Note that the minimum of the quadratic function given in Eq. (3.8)
is attained at Asl = −Asn cos(φ), while the value of the parabola ‖Fstot(k)‖² at the point
Asl = 0 is Asn².
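The location of the minimum follows from completing the square in Eq. (3.8). The short derivation below treats Asn and φ as fixed and drops the frequency index k for brevity.

```latex
\begin{aligned}
\|F_{s_{tot}}\|^2
  &= A_{s_l}^2 + 2 A_{s_l} A_{s_n}\cos\phi + A_{s_n}^2 \\
  &= \left(A_{s_l} + A_{s_n}\cos\phi\right)^2 + A_{s_n}^2 \sin^2\phi .
\end{aligned}
```

The squared term vanishes at Asl = −Asn cos(φ), where the parabola takes its minimum value Asn² sin²(φ). Substituting Asl = 0 gives Asn² cos²(φ) + Asn² sin²(φ) = Asn².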
Figure 3.2 illustrates the relation between the different leakage values Asli and the
PSD values (a function of ‖Fstot‖²) of the signal stot for a fixed frequency component.
Any change in the signal trend, namely in Asn and φsn, will cause the parabola to move
within the upper two quadrants of the coordinate system. If the phases of the signal trend
and the leakage combine so that the minimum of the parabola lies outside the range of the
leakage values, the signal is transformed as in Figure 3.2(a) and the same power model
used in time domain can be used in the frequency domain as well. On the other hand, the
phases can also combine in such a way that the minimum of the parabola is shifted to a
value in between Asl0 and Asl4. In this case, the power model used in time domain will
not hold and the leakage will degrade, as it is more difficult to distinguish the data
dependent amplitude changes in that frequency component, as depicted in Figure 3.2(b).
Figure 3.2: (a): Frequency domain leakage is similar to time domain leakage. (b):
Frequency domain leakage is different from time domain leakage. Both panels plot
‖FStot‖² against Asl with the leakage values Asl0, . . . , Asl4 marked; the parabola has
its minimum at Asl = −Asn · cos(φ) and takes the value Asn² at Asl = 0.
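The two situations of Figure 3.2 can be reproduced with a few lines of arithmetic. This is a minimal sketch assuming a leakage amplitude equal to the Hamming weight of a uniformly distributed 4-bit value; `trend_proj` is a hypothetical name standing for Asn cos(φ), and the values 20 and −2 are chosen only for illustration.

```python
import numpy as np

# Hamming weights of all sixteen 4-bit values (uniformly distributed data).
hw = np.array([bin(v).count("1") for v in range(16)], dtype=float)

def psd_value(a_leak, trend_proj):
    """Eq. (3.8) up to the constant A_sn^2 sin^2(phi);
    trend_proj stands for A_sn * cos(phi)."""
    return (a_leak + trend_proj) ** 2

def pearson(x, y):
    """Pearson correlation coefficient between two vectors."""
    return float(np.corrcoef(x, y)[0, 1])

# Minimum of the parabola far to the left of all leakage values, as in Figure 3.2(a).
rho_favourable = pearson(hw, psd_value(hw, 20.0))

# Minimum of the parabola at HW = 2, in the middle of the leakage values, as in Figure 3.2(b).
rho_unfavourable = pearson(hw, psd_value(hw, -2.0))

print(rho_favourable, rho_unfavourable)
```

In the second case the PSD value is symmetric around HW = 2, so a Hamming weight model correlates with it essentially not at all, even though the data dependence is still present.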
In a real world scenario, Sn can be composed of signal trend, electronic noise and
switching noise [MOP07]. However, as long as electronic noise and switching noise are
independent from signal trend and the leakage, we can compute their contribution to
a given frequency component of PSD separately. In case of electronic noise (assumed
to be Gaussian) the contribution to each frequency component will be constant. As
for switching noise, as long as the inputs provided to the cryptosystem for acquisition
of power traces are uniformly random, the contribution of switching noise will be
constant as well. Therefore they contribute to components related to sn in Eq. (3.8)
as constant oﬀsets. Hence, it is indeed possible to improve leakage in frequency
domain by focusing on the signal trend.
3.2 Leakage in Frequency Domain
Following the arguments in the previous section, one can see how useful it can be to
control the signal trend in a given set of power traces. However, this control can only
aﬀect the particular frequency components that the signal trend itself is composed of.
Therefore, to be able to get the best results in the frequency domain, studying how a
given leakage (as in Figure 3.1) is transformed into frequency domain is essential. In
this section, we discuss diﬀerent shapes of leakage and present simulated results as a
proof of concept before moving to real measurements.
3.2.1 Transformation of Leakage
In [TOT+14], the authors assume triangular and rectangular shaped models for
data dependent power consumption, the parameters of which depend on the device and
the environmental conditions. This assumption is confirmed on different targets as
previous works which present experimental results on frequency domain based analysis
show strong leakage in the low frequency components of the power spectrum [MG10,
OP13]. Therefore we choose to follow the footsteps of our predecessors and use
information on Fourier transform of triangular and rectangular shaped leakage signals.
In case the leakage can be approximated with a square pulse, the power spectrum
of the leakage depends on the length of the pulse as can be seen by the following
equation:
\[ F_{S_l}(k) = \epsilon \cdot \frac{\sin\left(\frac{2\pi k \tau}{N}\right)}{\frac{2\pi k}{N}} \tag{3.9} \]
where ε represents the level of the pulse in time domain and τ represents the length of the
pulse.
Similarly, when the leakage can be approximated with a triangle function, the
spectrum is governed by the equation:
\[ F_{S_l}(k) = \left( \epsilon \cdot \frac{\sin\left(\frac{2\pi k \tau}{N}\right)}{\frac{2\pi k}{N}} \right)^2 . \tag{3.10} \]
Based on the frequency representation of diﬀerent types of pulses, we can conclude
that the leakage should dominantly reside in the low frequencies. In other words, the
longer the leakage in time domain is, the lower the leaking frequency components are
in frequency domain, and vice versa.
Note that Eq. (3.9) (and Eq. (3.10)) can be equal to zero depending on a given
frequency component k. In case of an exceptionally short leakage, the ﬁrst frequency
component with zero amplitude in Eq. (3.9) and in Eq. (3.10) is positioned in a relatively
high frequency range. As the first segment up to the first zero value contains
most of the signal power, it is possible that higher frequencies also leak. For
instance, a leakage lasting 1 ns in time domain spreads up to 1 GHz in frequency domain.
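The position of the first zero and the share of power below it can be illustrated with a discrete Fourier transform. The sketch below assumes a rectangular leakage occupying τ consecutive samples of a window of N samples; the values N = 1000 and τ = 10 mirror the simulations of Section 3.2.2 and are otherwise arbitrary.

```python
import numpy as np

N, tau = 1000, 10          # illustrative values, not measured ones
pulse = np.zeros(N)
pulse[:tau] = 1.0          # rectangular leakage of tau consecutive samples

amplitude = np.abs(np.fft.rfft(pulse))

# Index of the first frequency component whose amplitude vanishes.
first_zero = int(np.argmax(amplitude < 1e-9))

# Share of the non-DC spectral power that lies below the first zero.
power = amplitude[1:] ** 2                 # power[i] belongs to component i + 1
share_below = power[: first_zero - 1].sum() / power.sum()

print(first_zero, share_below)
```

For such a pulse the first zero falls at the component N/τ, which is the bound used in Section 3.3; a shorter pulse moves this zero to higher components and a longer pulse moves it lower.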
3.2.2 Simulated Experiments: Effect of the Signal Trend
Signal trend is considered to be the type of noise which is constant and shared among
power traces, and it is usually produced by the device operations in the window of
interest; a strong clock signal being one such example. We have brieﬂy discussed the
ampliﬁcation of leakage in the presence of appropriate signal trend in Section 3.1. In
this section, we present simulated experiments and show that signal trend can in fact
make or break the leakage in frequency domain.
We simulated 10 000 traces with 1000 samples each, out of which only 10 samples
correlate with the data. Gaussian noise with mean 0, and standard deviation 0.1 is
added to each simulated trace which leads to a signal–to–noise ratio (SNR) value of
2 in time domain at its maximum. Note that the SNR value here is computed based
on the deﬁnition given in Chapter 4.3.2 of Power Analysis Attacks by Mangard et
al. [MOP07]. The leakage is represented with a triangle function in the simulations
and therefore the SNR values range from 2 to 0 progressively in the course of 10
leaking time samples.
Assuming that each simulated time sample represents a microsecond in time, sine
waves with frequency of 20Hz are added to the traces with various phase shifts. Then,
the SNR of the PSD of each set with a diﬀerent phase shift is computed. Figure 3.3
shows all these SNR traces overlapped. Note that at the frequency component cor-
responding to the artiﬁcial signal trend (the added sine wave), the SNR value varies
signiﬁcantly. This variation depends on the phase change in the added signal trend
with respect to the leaking time samples in the simulated power traces. It is also
important to note that the green plot in the ﬁgure represents the SNR trace in case
there was no added sine wave as a signal trend. This way, one can easily point out
the advantage (or disadvantage) of having a particular sine wave as the signal trend.
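The experiment described above can be sketched as follows. This is a reduced simulation under stated assumptions rather than the exact setup behind Figure 3.3: it uses 3000 traces instead of 10 000 to stay fast, places the leakage at samples 500 to 509, uses a sine wave of unit amplitude, and takes the frequency "20" to mean 20 cycles per window. The SNR helper follows the usual definition Var(signal)/Var(noise) with the class means as signal.

```python
import numpy as np

rng = np.random.default_rng(1)
n_traces, N = 3000, 1000            # fewer traces than in the text, to keep the sketch fast
sigma = 0.1                         # standard deviation of the Gaussian noise
k_trend = 20                        # frequency component of the added sine wave

# Hamming weight of a uniformly random 4-bit value per trace.
hw = np.array([bin(v).count("1") for v in rng.integers(0, 16, n_traces)])

# Triangular leakage over 10 samples; peak scaled so that the time domain SNR is 2.
shape = np.zeros(N)
shape[500:510] = np.sqrt(2.0) * sigma * np.linspace(1.0, 0.0, 10, endpoint=False)
traces = hw[:, None] * shape[None, :] + rng.normal(0.0, sigma, (n_traces, N))

def snr(values, classes):
    """Var(class means) / Var(residual) for one sample per trace."""
    means = np.zeros(classes.max() + 1)
    for c in np.unique(classes):
        means[c] = values[classes == c].mean()
    signal = means[classes]
    return signal.var() / (values - signal).var()

def psd_bin(x, k):
    """Squared magnitude of frequency component k for every trace."""
    return np.abs(np.fft.rfft(x, axis=1)[:, k]) ** 2

t = np.arange(N)
snr_without = snr(psd_bin(traces, k_trend), hw)
snr_by_phase = []
for phase in np.linspace(0.0, 2 * np.pi, 16, endpoint=False):
    trend = np.sin(2 * np.pi * k_trend * t / N + phase)   # artificial signal trend
    snr_by_phase.append(snr(psd_bin(traces + trend, k_trend), hw))

print(snr_without, min(snr_by_phase), max(snr_by_phase))
```

Sweeping the phase moves the SNR at this one frequency component between values well below and well above the SNR obtained without the added sine wave.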
To be able to correctly interpret this ﬁgure, we need to recall Eq. (3.8). Since
we add a sine wave of frequency exactly 20Hz, we can focus on this one particular
frequency and rewrite the leakage in frequency domain (cf ) as:
\[ c_f = d^2 + 2\, d\, t \cos(\phi) + t^2 \tag{3.11} \]
where d represents the data dependent leakage which leaks in Hamming weight, and
t is the amplitude of the signal trend together with a small constant contributed by
the Gaussian noise in the signal. It should be noted that the only variable in the
above relation in this experiment (and in this particular frequency) is φ = φsl − φsn .
Therefore, depending on the value of φsn , cos(φ) will determine how strongly the
Hamming weight leakage is represented in the frequency domain. Examining closely
the magnified part of Figure 3.3, one can see that the SNR takes various values
between 0 and 0.11. This means that, depending on the phase of the signal trend, Hamming
weight leakage may not be exploitable in frequency domain. On the other hand, for
other values of the phase of the signal trend, one can get much higher SNR values than
she would get without alterations to the signal trend.
Figure 3.3: SNR in frequency domain for different phase values of signal trend.
Horizontal axis: frequency components; vertical axis: signal-to-noise ratio (SNR). The
inset magnifies frequency components 18 to 22.
Although we have seen how the phase of signal trend can affect leakage in frequency
domain, there is another important observation to be made from Eq. (3.11),
concerning the amplitude of the signal trend. Looking back at Eq. (3.11), we see that the
leakage model is also modified when traces are transformed into frequency domain. If
the amplitude of the signal trend, represented by t in Eq. (3.11), is too small, the leakage
will be related to the square of the Hamming weight of the data that is processed by
the device. Only if the amplitude of the signal trend is large enough will the
Hamming weight leakage be linearly related to the frequency domain signal,
resulting in a better correlation value for the correct key.
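The role of the amplitude can be made explicit by factoring t² out of Eq. (3.11). The rearrangement below assumes t > 0 and is only a restatement of that equation.

```latex
c_f = t^2 \left[ 1 + 2\,\frac{d}{t}\cos(\phi) + \left(\frac{d}{t}\right)^2 \right]
\quad\Longrightarrow\quad
c_f \approx
\begin{cases}
t^2 + 2\, t\, d \cos(\phi), & t \gg d \ \text{and}\ \cos(\phi) \not\approx 0,\\[2pt]
d^2, & t \ll d .
\end{cases}
```

For a large trend the quadratic term (d/t)² is negligible next to the linear term, so the frequency domain signal is an affine function of d with slope 2 t cos(φ); for a small trend only the d² term remains.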
3.2.3 Real Experiments: Effect of the Signal Trend
To conﬁrm the assumptions made in the previous section, we have collected power
traces from a smartcard while the DES hardware engine embedded in the microcon-
troller encrypts uniformly random data. Since we would like to test the eﬀect of the
signal trend on side channel attacks in the frequency domain, we have run attacks on
both the original traces and centred traces where the global mean trace is subtracted
from each trace in the set. A sample power trace, its centred form and the SNR
in time domain are given in Figure 3.4. The SNR is computed based on a single
6–bit part of the correct key and the Hamming distance between the corresponding
bits of the first and the second round inputs, which can be computed using the 6–bit
key part.
Figure 3.4: A sample power trace in time domain (top), centred power trace (middle)
and SNR in time domain (bottom). Horizontal axis: time samples; vertical axes: power
consumption, centred consumption and signal-to-noise ratio (scale ×10^-3).
An important point to note here is that if one is to analyse centred traces in the
PSD output, one has to pay attention to the side eﬀects of this centring process.
If time domain traces are centred around 0, the leakage cannot be detected with a
ﬁrst order statistical distinguisher such as the Pearson correlation coeﬃcient in the
PSD output. This is due to the fact that the PSD of a signal is representative of its
energy on various frequency components. In fact, if we assume that the device leaks
proportionally to the Hamming weight of the target intermediate value, then data with
Hamming weight 0 and data with the maximum Hamming weight (e.g. 4 for the DES S–box)
are represented with equal magnitudes in the PSD output. Although one way to avoid
this problem is to compute mutual information between the Hamming weights and
the PSD output, there is an easier option. If a large enough offset is applied to the
centred time domain signals, then each Hamming weight value will be represented
with a different magnitude in the PSD output. We used the latter method due to its
simplicity and effectiveness.
To show the effect of the signal trend on the key distinguishing ability in the
frequency domain analysis, we use the relative distinguishing margin proposed by
Whitnall and Oswald [WO11]. With this metric, any negative value represents an
unsuccessful attack, while larger non-negative values indicate that the correct key is
better distinguished from the other key candidates. The analysis is run on the PSD
output computed over a window of 512 time samples, and the Pearson correlation
coefficient is used as the statistical distinguisher. The results are presented as a
scaled image plot in Figure 3.5. Each column in the figure represents the amount of
circular shift applied to the time domain traces before PSD computation. Even though
the window is shifted for each column, all leaking samples are always kept within the
window on which the PSD is computed. One can argue that the effect visible on the left
image of Figure 3.5 is due to the aperiodic nature of the time domain signal. Common
practice to deal with such a problem is to use a windowing function before FFT
computation. Therefore, to avoid unwanted noise in the frequency domain representation
of the signal, a Hamming window is applied to the selected set of time domain samples
before FFT computation.
Figure 3.5: Relative distinguishing margin with respect to the number of traces
processed and for various shift amounts for the leaking samples within the window for
FFT computation. The results are presented for both original traces (left) and centred
traces (right). Horizontal axes: shift amount; vertical axes: number of traces; the
colour scale spans −2 to 3.
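The sign convention of the metric used in Figure 3.5 can be illustrated with a small helper. This is a sketch under an assumption: the margin is taken here as the gap between the distinguisher value of the correct key and that of its strongest rival, normalised by the standard deviation of the distinguisher values over all candidates. The exact definition should be taken from [WO11]; the function name and the synthetic scores are hypothetical.

```python
import numpy as np

def relative_margin(scores, correct_key):
    """Gap between the correct key and its strongest rival, in units of the
    standard deviation of the distinguisher values over all key candidates.
    (Normalisation assumed here; see [WO11] for the authoritative definition.)"""
    scores = np.asarray(scores, dtype=float)
    rivals = np.delete(scores, correct_key)
    return (scores[correct_key] - rivals.max()) / scores.std()

# Hypothetical correlation values for the 64 candidates of a 6-bit DES subkey.
rng = np.random.default_rng(2)
scores = rng.normal(0.0, 0.01, 64)
scores[37] = 0.08                      # the correct key clearly stands out
margin_success = relative_margin(scores, 37)

scores[12] = 0.09                      # a wrong candidate now ranks first
margin_failure = relative_margin(scores, 37)

print(margin_success, margin_failure)
```

A positive value means the correct key ranks first, and a negative value means at least one wrong candidate scores higher, which matches the reading of the colour scale described above.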
Since the only diﬀerence between the two images given in Figure 3.5 is the signal
trend, the eﬀect becomes clear in terms of key distinguishability in the frequency
domain. One should also note that depending on the shift amount, different parts
of the signal trend are included in the window for PSD computation. When the two
images given in Figure 3.5 are compared, one notices that some parts of the signal
trend (visible in the top plot of Figure 3.4) can favour key distinguishability in the
frequency domain. However, the analysis given in Section 3.2.2 implies that the
opposite eﬀect can also occur. In fact, looking at the results given in Figure 3.5
around the shift amount value of 335, the correct key can be better distinguished
when the signal trend is removed. Therefore, if the analyst can control the signal
trend by adding an artiﬁcial one of her choice, key distinguishability can further be
improved in the frequency domain.
Figure 3.6: Maximum correlation values one can get in frequency domain for sine
waves with various periods added to the signal as a signal trend. (a) Aligned;
(b) Misaligned by 25 time samples. Horizontal axes: frequency of the added sine wave;
vertical axes: max(SNR). Marked points: (a) X = 7, Y = 0.1286 and X = 75, Y = 0.03112;
(b) X = 2, Y = 0.1154 and X = 12, Y = 0.03112.
3.3 How to Increase SNR in the Frequency Domain
If we are to use an artiﬁcial signal trend to improve SNR in frequency domain, it is
very important to determine the optimal period of the sine wave that is to be added
to the signal. To show how the period of the added signal trend aﬀects leakage in
frequency domain, we add a sine wave of a speciﬁc frequency to the same simulations
described in Section 3.2.2, and we search for the phase value which gives the maximum
SNR. We repeat this experiment for a range of frequencies and the maximum SNR
values one can get from such an experiment are presented in Figure 3.6(a). The same
experiment is repeated for the case that the traces cannot be perfectly aligned (see
Figure 3.6(b)). Note that in a real experimental setting, the signal trend can easily
be removed through subtracting the global mean trace from the measurements, in
case the traces are aligned. However, in case re-aligning the signal is not possible, the
underlying signal trend can also interfere with the SNR in frequency domain following
the relation given in Eq. (3.11). To show this effect experimentally, we also included
an underlying signal trend which is the sum of two sine waves with periods N/3 and
N/5 in the simulated traces, where N is the total number of simulated time samples.
The ﬁrst observation we can make on the plots in Figure 3.6 is that they both
converge to the same SNR value of about 0.031. Figure 3.3 shows that this is the
maximum correlation value that one can get by correlating PSD with the data without
adding an artiﬁcial signal trend to it. Therefore we can conclude from Figure 3.6 that
adding an artiﬁcial sine wave of a low frequency is essential for achieving higher SNR
values in frequency domain.
The second observation is that both plots reach about the same maximum value,
which indicates that adding a constant sine wave to the traces can significantly
improve the SNR in frequency domain even when the traces are not aligned.
The final observation involves the frequencies that one can use for the sine wave
to be added as an artificial signal trend to improve SNR in frequency domain. From
Eq. (3.9) and Eq. (3.10) we already know that most of the leakage in frequency domain
is concentrated on the frequencies below N/τ, where N is the number of time samples
and τ is the number of consecutive leaking time samples. Recall that there are
10 time samples which correlate with the data and a total of 1000 samples
per trace. Following the relations given in Eq. (3.9) and Eq. (3.10), we know that
the frequencies which leak the most are below N/τ = 1000/10 = 100 Hz (assuming
that each time sample represents a microsecond in time). In Figure 3.6(a) we see that
the frequencies at which we can have a meaningful influence on the correlation value
in frequency domain match this observation from Section 3.2.1. Moreover, when the
traces are misaligned, the effect in the frequency domain analysis is as if there were
a larger number of leaking time samples, hence the frequency components leading to
high SNR values are lower than in the aligned case. However, in reality, the number
of leaking frequencies does not decrease in the PSD output since it is time invariant.
The effect visible in Figure 3.6(b) is due to the fact that the added sine wave should
have a lower frequency to cover all leaking samples throughout the entire trace set
when they are not aligned. Hence, when a sine wave is to be added to a misaligned
set of measurements in an attempt to improve SNR in frequency domain, one should
aim for even lower frequencies than the leaking ones.
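The time invariance of the PSD mentioned above can be verified directly. The sketch below assumes a circular shift of a synthetic trace, for which the invariance is exact; for a real misalignment inside a finite window it holds only approximately. The trace contents and the shift of 25 samples are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 1000
trace = rng.normal(0.0, 0.1, N)
trace[500:510] += 1.0                    # illustrative leakage burst

shifted = np.roll(trace, 25)             # misalignment by 25 time samples

psd = np.abs(np.fft.rfft(trace)) ** 2
psd_shifted = np.abs(np.fft.rfft(shifted)) ** 2

same_psd = np.allclose(psd, psd_shifted)     # magnitudes are unchanged by the shift
same_trace = np.allclose(trace, shifted)     # the time domain samples are not
print(same_psd, same_trace)
```

A shift only rotates the phase of each frequency component. An added sine wave, however, is fixed and does not move with the traces, so its phase relative to the leakage varies from trace to trace, which is consistent with the recommendation to choose a lower frequency for misaligned sets.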
From the analysis and experiments given above, we can conclude that adding
a sine wave to the traces as an artiﬁcial signal trend leads to an improvement in
frequency domain analysis even when the traces cannot be perfectly aligned. This
is particularly important as frequency domain analysis is usually required when the
collected measurements from a target cannot be perfectly aligned.
3.3.1 Using Truncated Sine Waves as Signal Trend
So far, we have seen that it is possible to aﬀect SNR in frequency domain for a given
frequency component. However, to be able to utilize an artiﬁcial signal trend to
increase leakage in frequency domain, we need to ﬁnd the optimal frequencies that
can be used for the artificial trend. In a real scenario, however, it might not always be
possible to pinpoint the frequencies that can be used for improving SNR in frequency
domain. Although these frequencies can be estimated using the relations given in
Section 3.2.1, we need a more efficient methodology to search for the frequency
and phase of the sine wave to be added to the traces to increase the leakage in the
frequency domain.
In this context, using a truncated sine wave can be handy since it will affect
multiple frequency components in the window over which the PSD is computed. As this
signal is not periodic in the window over which the FFT is computed, it will be
represented in multiple frequency components in the power spectrum. However, one
important note is that this aperiodic signal trend will also introduce noise on the
entire spectrum unless it is compensated for. Therefore, when using a truncated sine
wave as an artificial signal trend, it is essential to apply a window function before
computing the FFT over the time domain signal. This minimizes the unintended noise in
the frequency domain introduced by adding an aperiodic signal to the time domain signal.
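Both effects, the spreading of a truncated sine wave over several frequency components and the reduction of the resulting broadband leakage by a window function, can be observed on a toy signal. The sketch below assumes a window of 1000 samples, a sine wave truncated to three quarters of one period, and NumPy's `np.hamming`; the helper names and thresholds are hypothetical.

```python
import numpy as np

N = 1000
t = np.arange(N)
window = np.hamming(N)

def occupied_bins(x, rel_threshold=0.01):
    """Number of frequency components above rel_threshold of the largest one."""
    amp = np.abs(np.fft.rfft(x))
    return int(np.sum(amp > rel_threshold * amp.max()))

def far_leakage(x, first_bin=50):
    """Fraction of spectral power located at or above first_bin."""
    psd = np.abs(np.fft.rfft(x)) ** 2
    return psd[first_bin:].sum() / psd.sum()

full = np.sin(2 * np.pi * 5 * t / N)          # 5 full periods: periodic in the window
truncated = np.sin(2 * np.pi * 0.75 * t / N)  # three quarters of one period

bins_full = occupied_bins(full)
bins_truncated = occupied_bins(truncated)
leak_plain = far_leakage(truncated)               # no window function applied
leak_windowed = far_leakage(truncated * window)   # Hamming window applied first

print(bins_full, bins_truncated, leak_plain, leak_windowed)
```

The periodic sine wave occupies a single component, whereas the truncated one reaches many low components at once; the Hamming window keeps that low frequency spread while reducing the power that spills into distant components.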
The truncated sine wave we used in this section is three quarters of the full period
of a sine wave. To minimize the aforementioned noise introduced by the aperiodic
artiﬁcial trend in the frequency domain, we applied the Hamming window to the
simulated time domain signals before transforming them. We repeated the previous
experiment of varying the phase of the added sine wave but this time also varying
the period of the truncated sine wave which is added as an artiﬁcial signal trend. As
Figure 3.7 shows, the truncated sine wave affects multiple frequency components of the
SNR trace as expected. For better visibility of the effect, we plot only the two extreme
cases for which the maximum and minimum SNR values are achieved in frequency
domain, together with the case where no sine wave is added to the signal. We see,
once again, that an “appropriate” signal trend can increase SNR in frequency domain.
The advantage of adding a truncated sine wave to the signal is that it gives an error
margin for estimating the leaking frequency components in frequency domain.
Figure 3.7: Correlation in frequency domain affected by the added truncated sine
wave. Horizontal axis: frequency components; vertical axis: signal-to-noise ratio (SNR).
Legend: Best Match, Worst Match, Without Added Trend.
Finding the “appropriate” signal trend, on the other hand, can be done efficiently
as follows. Since we know that the frequencies below 100 can be used to improve
the SNR in frequency domain (see discussion of Figure 3.6 in Section 3.3), we used
10 base frequencies below 100 in our search for the best fitting period. For each
period, we used 10 different phase values for the added truncated sine wave.
Although this seems like running the attack 100 times, Figure 3.7 shows that only
very few frequencies exhibit high SNR in frequency domain. Therefore, after each
PSD computation, only the amplitudes of the 10 lowest frequencies need to be stored,
which in turn results in again 1000 samples to analyse. Hence, following this technique,
the only additional cost is computing the PSD of the trace set 100 times, which
can be done efficiently using FFT.
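The search described above can be sketched as a small grid search over 10 frequencies and 10 phases that keeps only the 10 lowest frequency components. This is an illustration on simulated traces under several assumptions: 4000 traces, leakage at samples 500 to 509, a trend amplitude of ten standard deviations, and candidate frequencies between 0.75 and 9.75 cycles per window. The sketch also transforms the trace set only once and adds the spectrum of the windowed trend afterwards, which is valid by linearity of the Fourier transform but is a shortcut of this sketch rather than a step described in the text.

```python
import numpy as np

rng = np.random.default_rng(4)
n_traces, N, sigma = 4000, 1000, 0.1       # illustrative simulation sizes
n_keep = 10                                # lowest frequency components kept per run

hw = np.array([bin(v).count("1") for v in rng.integers(0, 16, n_traces)])
shape = np.zeros(N)
shape[500:510] = np.sqrt(2.0) * sigma * np.linspace(1.0, 0.0, 10, endpoint=False)
traces = hw[:, None] * shape[None, :] + rng.normal(0.0, sigma, (n_traces, N))

def snr(values, classes):
    """Per-column Var(class means) / Var(residual)."""
    means = np.zeros((classes.max() + 1, values.shape[1]))
    for c in np.unique(classes):
        means[c] = values[classes == c].mean(axis=0)
    signal = means[classes]
    return signal.var(axis=0) / (values - signal).var(axis=0)

window = np.hamming(N)
t = np.arange(N)

# The trace set is transformed once; by linearity the spectrum of the
# windowed trend can be added afterwards (a shortcut assumed by this sketch).
base_spectrum = np.fft.rfft(traces * window, axis=1)[:, :n_keep]
snr_without = snr(np.abs(base_spectrum) ** 2, hw).max()

amplitude = 10 * traces.std()              # a multiple of the signal's standard deviation
best = (0.0, None, None)                   # (SNR, cycles per window, phase)
for cycles in np.linspace(0.75, 9.75, 10):             # non-integer: truncated sine waves
    for phase in np.linspace(0.0, 2 * np.pi, 10, endpoint=False):
        trend = amplitude * np.sin(2 * np.pi * cycles * t / N + phase)
        trend_spectrum = np.fft.rfft(trend * window)[:n_keep]
        psd = np.abs(base_spectrum + trend_spectrum) ** 2
        candidate = snr(psd, hw).max()
        if candidate > best[0]:
            best = (candidate, cycles, phase)

print(snr_without, best)
```

Only the best parameter set is kept in memory here; storing the retained components of all 100 runs instead would give the 1000 samples per trace mentioned above.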
3.3.2 Real Experiments: Increasing the Signal in Frequency
Domain
Following the mathematical derivations and their conﬁrmation on simulations, in this
section, we examine if this method can be utilized in a real world application. The
aim is to gain control of the signal trend, therefore controlling the leakage in frequency
domain. For unprotected implementations this is as easy as adding a sine wave of
“appropriate frequency” and searching for the “correct phase” (see Section 3.3.1 for
directions to eﬃciently search for the period and phase of the artiﬁcial signal trend).
This way, one can eﬀectively improve attack success on an implementation where
multiple time samples leak information on data that is processed in the target device.
The target is the same smartcard we used for the experiments in Section 3.2.3,
which has an embedded hardware DES engine. Although the card is provided with
an external clock of 4 MHz for communication, the hardware DES engine operates at
30.4 MHz. We collected 35 000 power measurements from the smartcard at a 250 MS/s
sampling rate using a LeCroy WaveRunner 610Zi oscilloscope, with a Riscure Power
Tracer handling the communication with the card and extracting the power
measurements. Note that although the card does not implement any countermeasures,
the power traces are shifted by ±50 samples to imitate an unreliable internal clock.
Time domain analysis of the target is given in Figure 3.8 together with the signal
to noise ratio (SNR) and a sample power trace. SNR is computed for only one 6–
bit key part and using the Hamming distance between the corresponding bits of the
ﬁrst and the second round inputs. As the middle plot of Figure 3.8 shows, there are
about 50 time samples that leak signiﬁcantly in time domain. This is an important
observation since it indicates the frequencies that can be used for the artiﬁcial signal
trend to improve SNR in the frequency domain. Following the relations given in
Section 3.2.1, one can search for a sine wave of appropriate frequency below N/τ for a
leakage length approximation of τ = 50 time samples and a total number of available
time samples N = 512. We computed the SNR to quantify the effectiveness of our
signal processing technique for different sine waves with frequencies less than or equal
to N/τ, and the best results are summarized in Table 3.1. Note that the SNR trace is
computed using all available power traces. Furthermore, we chose the shift amount
which results in the highest relative distinguishing margin value found in Section 3.2.3
for computing the frequency domain SNR, for a fair comparison with our proposed
technique.
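As a worked example with the figures quoted above (sampling rate 250 MS/s, N = 512, τ = 50), and assuming the usual DFT relation between the index of a frequency component and its physical frequency, the search range translates as follows.

```latex
\Delta f = \frac{f_s}{N} = \frac{250\ \text{MHz}}{512} \approx 0.488\ \text{MHz},
\qquad
\frac{N}{\tau} = \frac{512}{50} \approx 10.24,
\qquad
\frac{N}{\tau}\,\Delta f = \frac{f_s}{\tau} = \frac{250\ \text{MHz}}{50} = 5\ \text{MHz}.
```

Under these assumptions the candidate sine waves lie within roughly the ten lowest frequency components of the 512-sample window, that is below about 5 MHz.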
Figure 3.8: Top: Time domain trace sample, middle: signal-to-noise ratio, bottom:
correlation traces. Horizontal axis: time samples; vertical axes: power consumption,
SNR (scale ×10^-3) and correlation value.
Table 3.1: Change in SNR with respect to the processing technique used.
Processed Traces                        Maximum SNR
Time Domain                             1.939 × 10^-3
Frequency Domain                        4.402 × 10^-3
Improved Freq. Domain (Added Sine)      6.559 × 10^-3
To achieve the SNR values given in Table 3.1, the phase and the amplitude of
the sine wave to be added should follow certain criteria. Note that our goal is to
maximise the leakage function, which is modelled in the attack as HW(Reg0 ⊕ Reg1),
where Reg0 and Reg1 represent the same four bits of the round register in different
rounds of encryption. Therefore, if the Hamming weight leakage in frequency domain
is increased, this should intuitively lead to a better attack success. Recall Eq. (3.11)
and the discussions in Section 3.2.2 that the amplitude of the signal trend aﬀects the
leakage in frequency domain. Hence, the amplitude of the added sine wave should be
large enough to overshadow the squared Hamming weight leakage (d2 in Eq. (3.11) )
in the PSD output. A good practice is to start from a multiple of the standard
deviation of the signal and that way one can verify that the contribution of increasing
the amplitude converges very quickly. As for searching for the phase, the best practice
is to cover the entire period of a sine wave for phase shifts and computing SNR in
frequency domain for each phase shift. Picking the phase which give the highest SNR
value, one can then move on to carrying out the attack in frequency domain. Note
that on a real case, the SNR values will be distinctively large for only a small number
of frequency components, therefore this method can be improved by storing the PSD
values of a small number of frequency components to be further analysed. This way,
the number of samples to be analysed can be kept low and avoid additional processing
time during analysis.
For our experiments, we fixed the amplitude of our artificial signal trend and
repeated the attack on 10 intermediate phase values within the full period of the sine
wave added as the artificial signal trend. We have searched over 10 distinct periods (not
necessarily integer values, so that a truncated sine wave may be generated) less than or equal to
N/τ, as Figure 3.6 shows that the best period for the artificial signal trend is always
lower than N/τ. As mentioned before, one can save time by limiting the number of
frequencies to keep for analysis. In our case, we have kept 20 frequency components
for further analysis and discarded the rest. Since we keep 20 frequency components
for 10 distinct periods and 10 different phase values each, this results in a total of 2000
samples to analyse if all are collected together. This search can also be implemented
such that only the PSD output leading to the largest SNR value is kept in
memory, and it is updated if another set of parameters leads to a higher SNR in
the PSD output.
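The search described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the code used for the experiments: the function names (`psd`, `snr`, `search_sine`, `simulate`), the toy trace generator, and the SNR estimate (variance of the class means divided by the mean class variance, with the Hamming weight as class label) are all hypothetical choices. The sine frequency is expressed in cycles per trace and the candidate values are left to the caller. Only the best parameter set is kept in memory.

```python
import cmath
import math
import random
from statistics import mean, pvariance


def psd(trace):
    """|DFT|^2 of a trace for bins 0..N/2 (naive DFT, fine for short traces)."""
    n = len(trace)
    return [abs(sum(v * cmath.exp(-2j * math.pi * f * t / n)
                    for t, v in enumerate(trace))) ** 2
            for f in range(n // 2 + 1)]


def snr(values, labels):
    """Assumed SNR estimate: var(class means) / mean(class variances)."""
    groups = {}
    for v, c in zip(values, labels):
        groups.setdefault(c, []).append(v)
    signal = pvariance([mean(g) for g in groups.values()])
    noise = mean(pvariance(g) for g in groups.values())
    return signal / noise


def search_sine(traces, labels, amplitude, cycles_list, n_phases=10):
    """Try every (cycles, phase) pair and keep only the best PSD-domain SNR."""
    n = len(traces[0])
    best = (-1.0, None, None)            # (max SNR, cycles, phase)
    for cycles in cycles_list:           # sine frequency in cycles per trace
        for p in range(n_phases):        # phases cover one full period
            phase = 2 * math.pi * p / n_phases
            sine = [amplitude * math.sin(2 * math.pi * cycles * t / n + phase)
                    for t in range(n)]
            spectra = [psd([a + b for a, b in zip(tr, sine)]) for tr in traces]
            score = max(snr([s[f] for s in spectra], labels)
                        for f in range(n // 2 + 1))
            if score > best[0]:
                best = (score, cycles, phase)
    return best


def simulate(n_traces, n=32, tau=4, sigma=0.5, seed=1):
    """Toy misaligned traces: the HW of a 4-bit value leaks over tau samples."""
    rng = random.Random(seed)
    traces, labels = [], []
    for _ in range(n_traces):
        d = bin(rng.randrange(16)).count("1")
        start = rng.randrange(n - tau)   # random position models misalignment
        traces.append([(d if start <= t < start + tau else 0.0)
                       + rng.gauss(0, sigma) for t in range(n)])
        labels.append(d)
    return traces, labels
```

For example, `search_sine(*simulate(40), amplitude=2.0, cycles_list=[1.5, 4.0], n_phases=4)` returns the best SNR together with the frequency and phase that produced it. Restricting `psd` to a handful of bins would mirror the idea of keeping only a small number of frequency components.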
For comparison purposes we have run the attack and computed guessing entropy
[SMY09a] in both time domain, and frequency domain. Note that each guessing
entropy plot presented in Figure 3.9 results from analysis of 7 independent data sets.
The gain in SNR (see Table 3.1) is also represented in the guessing entropy plot in
Figure 3.9. However, more experiments are required for a clearer picture. Results on
this data set show that using an artificial signal trend improves the SNR and the
attack success in the frequency domain.
3.4 Conclusion
In this chapter, we have used a simple model for signal decomposition and pointed out
important facts when traces are transformed into the frequency domain. The mathematical
derivations laid out in the chapter are supported by both simulations and analysis
on a hardware implementation of a block cipher. Using the methods proposed in this
chapter, we showed that frequency domain analysis can be improved significantly by
using a simple signal processing method. Even when the power traces cannot be perfectly
aligned, the signal processing method can still be used and yields better results
than straightforward frequency domain analysis.
Figure 3.9: Comparative guessing entropy figure for various processing methods. The plot shows
log2(guessing entropy) against the number of processed traces (0 to 5000) for the Time,
Frequency and Frequency Carried methods.

Chapter 4
Near Collision Side Channel Attacks
This chapter introduces the concept of side channel near collision attack and its
extension to low entropy masking schemes that use mask sets generated by linear
codes. The work presented in this chapter is based on the paper [EEB16].
Side channel collision attacks exploit the fact that identical intermediate values
consume the same power and hence similar patterns can be observed in power/EM
measurements. In more detail, an internal collision attack exploits the fact that a
collision in an algorithm often occurs for some intermediate values. This happens if,
for at least two diﬀerent inputs, a function within the algorithm returns the same
output. In this case, the side channel traces are assumed to be very similar during
the time span when the internal collision persists (a more elaborate description is
given in Section 4.1). Since their original proposal [SWP03], a number of works have
improved on various aspects of collision attacks, such as collision ﬁnding [Bog08] or
eﬀective key recovery [GS12].
There are also diﬀerent approaches in collision detection. Batina et al. introduce
Diﬀerential Cluster Analysis (DCA) as a new method to detect internal collisions
and extract keys from side channel signals [BGLR09]. The new strategy includes key
hypothesis testing and the partitioning step similar to those of DPA. Being inher-
ently multivariate, DCA as a technique also inspired a simple extension of standard
DPA to multivariate analysis. The approach by Moradi et al. [MME10] extends col-
lision attacks by creating a ﬁrst order (or higher order in [Mor12]) leakage model and
comparing it to the leakage of other key parts through correlation. The approach is
univariate only if leakages for diﬀerent sub-keys occur at the same time instance, i.e.
for parallel implementations, as often found in dedicated hardware. When software
implementations are considered, these two sensitive values leak at different
times; therefore other papers pursued the possibility of mounting a similar attack on
software implementations in a bivariate setting [YCE14, CFG+11]. Although finding
the exact time samples which leak information about the intended intermediate variables
increases the attack complexity, this type of attack is especially favourable
when the leakage function is unknown, or when it is a non-linear function of the bits of the
sensitive variable [GS12].
In general, it is desirable for attacks to apply to a wide range of leakage functions.
Some strategies have no assumptions on the leakage model, e.g. Mutual Information
Analysis [GBTP08] or Kolmogorov-Smirnov Analysis [WOM11]. In contrast to this
assumption-less leakage model approach, there is also an alternative in choosing a very
generic model as in stochastic models approach [SLP05]. We follow this direction in
terms of restricting ourselves to leakages that are linear functions of the contributing
bits. Nevertheless, in our scenario this is considered merely as a ballpark rather than
a restriction.
When univariate attacks such as the one proposed in this
chapter are considered, the best mitigation is to implement a masking scheme. However, one of
the biggest drawbacks of masking schemes is the overhead introduced into implementations.
Recently there has been a rising interest in reducing the entropy needed, and
thereby the implementation overhead, by cleverly choosing masks from a reduced set.
These approaches are commonly referred to as low entropy masking schemes (LEMS)
or leakage squeezing. In fact, LEMS are a low-cost solution proposed to at least keep or
even enhance the security over classical masking [NGD11, NSGD12, CDGM12]. Since
the proposal, LEMS have been analysed from different angles, including specific attacks
[YE14], a detailed analysis of the applicability of the assumptions made [GSP14]
and problems that may occur during implementation [MGH14]. Attention to
LEMS has focused on one specific version, Rotating S-box Masking
(RSM) [NGD11], since it has been used for both DPA contest v4.1 and v4.2 [BBD+14].
Our Contributions
The contribution of this work can be summarised as follows:
• We introduce a new way of analysing side channel measurements which is void
of strong assumptions on the power consumption of a device.
• The attack that we propose is a non-proﬁled univariate attack which only as-
sumes that the leakage function of the target device is linear.
• We further extend this idea to analyse a low entropy masking scheme by im-
proving on [YE14], and we show that our technique is more eﬃcient to recover
the key than generic univariate mutual information analysis.
• The proposed attack is applicable to any low entropy mask set that is a binary
linear code [BCG13].
Structure
The rest of the chapter is structured as follows. Section 4.1 introduces the notation
used throughout the work and also the ideas in the literature underlying our new anal-
ysis method. Section 4.2 introduces the near collision attack and presents simulated
results in comparison to other similar attacks in the literature. Section 4.3 introduces
the extension of our idea to a low entropy masking scheme together with a summary
of the previous work that is improved upon. This section also presents comparative
results of the extended attack and other attacks similar to it in the literature, and a
discussion on the attack complexity. Finally, Section 4.4 concludes the chapter.
4.1 Background & Notation
In this work, we focus on the information leakage on the power consumption or electromagnetic
leakage of an implementation. Further, we use the Advanced Encryption
Standard (AES) to explain our new attack and run experiments as it is a widely
deployed crypto algorithm around the world. This ensures comparability with other
works in the literature that use AES for presenting results, but does not hinder
generalization to other block ciphers whose nonlinear elements can be implemented
with table look ups.
Correlation based power or EM analysis (CPA) against AES implementations
usually focuses on the output of the S-box operation which is the only non-linear
element of the algorithm. This non-linearity ensures a good distinguishability between
the correct and incorrect key guesses for CPA; the correlation between the observed
and the predicted power or EM leakage will be closer to zero if the key guess is
incorrect, due to the highly nonlinear relation between the predicted state and the
key. To run a CPA, the analyst observes the power (or EM) leakages of the device
for each input $x \in X$ and stores it as an observed value $o_x \in O^X$. The next step is to
reveal the relation between $o_x$ and $x$ through estimating the power consumption of
the target device. Assume that the analyst would like to estimate power consumption
with the Hamming weight function $\mathrm{HW}(x)$, which returns the number of ones in
the bit representation of a given variable. In this case, the power estimation for
the input value $x$ becomes $P(x, k_g) = \mathrm{HW}(S(x \oplus k_g))$, where $k_g$ is a key guess for
the part of the key related to $x$. Proceeding this way, the analyst forms 256 sets
$P_{k_g} = \{P(x, k_g) : x \in X\}$ from the known input values $x_i \in X$, one for each key guess
$k_g \in \mathbb{F}_2^8$. What remains is to compare the estimated power consumptions $P_{k_g}$ with
the set of observations $O^X$ on the power consumption through a distinguisher, in this
case through computing the Pearson correlation coefficient $\rho(P_{k_g}, O^X)$, $\forall k_g \in \mathbb{F}_2^8$.
If the analyst has sufficient data and if the modelled leakage $P$ is close enough to
the actual leakage function $L$ of the device (i.e. a linear representative of $L$), then
the correct key $k_c$ should result in a distinguishing value for the Pearson correlation
when compared to the wrong key guesses $k_w$. If $P$ is not a linear representative
of $L$ however, then the correct key may not be distinguishable with this technique.
Therefore, the choice of power model determines the strength of a side channel attack
which makes use of such a model.
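As an illustration of the CPA procedure described above, the following Python sketch simulates Hamming weight leakage of the S-box output and ranks all 256 key guesses by Pearson correlation. It is a minimal sketch under stated assumptions: the helper names (`aes_sbox`, `pearson`, `cpa`, `simulate`), the Gaussian noise and the single-sample traces are illustrative and not taken from the measurement setup of this work.

```python
import random


def aes_sbox():
    """Build the AES S-box: inverse in GF(2^8), then the affine map."""
    exp, log, x = [0] * 255, [0] * 256, 1
    for i in range(255):                 # powers of the generator 0x03
        exp[i], log[x] = x, i
        x ^= (x << 1) ^ (0x11B if x & 0x80 else 0)
    box = []
    for a in range(256):
        inv = exp[(255 - log[a]) % 255] if a else 0
        s = inv
        for r in range(1, 5):            # XOR of four left rotations
            s ^= ((inv << r) | (inv >> (8 - r))) & 0xFF
        box.append(s ^ 0x63)
    return box


def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5 if va and vb else 0.0


def cpa(inputs, observations, sbox):
    """Return rho(P_kg, O^X) for all 256 key guesses (HW power model)."""
    hw = [bin(v).count("1") for v in range(256)]
    return [pearson([hw[sbox[x ^ kg]] for x in inputs], observations)
            for kg in range(256)]


def simulate(key, n, sigma, sbox, seed=0):
    """One leaking sample per trace: HW(S(x ^ key)) plus Gaussian noise."""
    rng = random.Random(seed)
    xs = [rng.randrange(256) for _ in range(n)]
    obs = [bin(sbox[x ^ key]).count("1") + rng.gauss(0, sigma) for x in xs]
    return xs, obs
```

Calling `cpa(*simulate(0x2B, 1000, 1.0, S), S)` with `S = aes_sbox()` yields one correlation value per key guess; the index of the largest value is the key candidate. If the simulated leakage were replaced by a function that is not linearly related to the Hamming weight, the same code would illustrate how the distinguisher degrades.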
Collision attacks aim to amend this problem by removing the requirement to
estimate L in an accurate manner. Linear collision attacks against AES use the fact
that if there are two S-box outputs equal to each other, then their power consumption
should be the same [Bog08]. If two S-box outputs for inputs xi and xj are equal to
each other, then
$$S(x_i \oplus k_i) = S(x_j \oplus k_j) \;\Rightarrow\; x_i \oplus k_i = x_j \oplus k_j \;\Rightarrow\; x_i \oplus x_j = k_i \oplus k_j \tag{4.1}$$
when S is an injective function. Since the AES S-box is bijective, collisions as above
reveal information about the relation of a pair of key bytes. Detecting collisions can be
a challenging task, and therefore Moradi et al. [MME10] propose to use Pearson correlation
for collision detection by comparing pairs of time samples which correspond
to the times at which two input bytes with a fixed difference are processed. If
a collision is detected, then following the relations given in Eqn. (4.1), this difference
between the input bytes represents the difference between the corresponding key
bytes. After running a linear collision attack (referred to as the ‘correlation enhanced
collision attack’), the analyst can reduce the key space radically and solve the resulting
linear system of equations to recover the entire key. Hence the only remaining
challenge is to find the time instances where the targeted leakages occur. This can be
a time consuming task, in particular depending on the number of samples the analyst
acquires for a security analysis.
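The correlation-based collision detection can be illustrated with a small simulation. The sketch below is hypothetical and simplified: it assumes that the two leaking time samples are already known, that both bytes leak through the same unknown function (modelled as a random table, into which the fixed S-box is absorbed), and that every input value occurs equally often. The names `mean_per_value`, `key_difference_scores` and `simulate` are illustrative.

```python
import random


def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5 if va and vb else 0.0


def mean_per_value(inputs, observations):
    """Average the observations for each of the 256 input byte values."""
    sums, counts = [0.0] * 256, [0] * 256
    for x, o in zip(inputs, observations):
        sums[x] += o
        counts[x] += 1
    return [s / c for s, c in zip(sums, counts)]   # assumes every value occurs


def key_difference_scores(mean_i, mean_j):
    """Correlate byte i at input x with byte j at input x ^ delta, per delta."""
    return [pearson(mean_i, [mean_j[x ^ delta] for x in range(256)])
            for delta in range(256)]


def simulate(ki, kj, reps=16, sigma=0.5, seed=0):
    """Two key bytes leaking through the same unknown function of x ^ k."""
    rng = random.Random(seed)
    leak = [rng.gauss(0, 1) for _ in range(256)]   # no leakage model assumed
    xs = list(range(256)) * reps
    obs_i = [leak[x ^ ki] + rng.gauss(0, sigma) for x in xs]
    obs_j = [leak[x ^ kj] + rng.gauss(0, sigma) for x in xs]
    return mean_per_value(xs, obs_i), mean_per_value(xs, obs_j)
```

The index of the largest score returned by `key_difference_scores` is the candidate for the key difference of Eqn. (4.1). Note that no power model enters the computation; only the averaged observations are compared.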
Although the resulting work load for brute forcing the key is reduced significantly
by a linear collision attack, to further reduce the work load, Ye et al. [YCE14] propose
a new collision attack (namely the ‘non-linear collision attack’) to directly recover a
key byte rather than a linear relation of two key bytes. Rather than looking for a
collision in the same power measurement, the non-linear collision attack looks for a linear
relation between the S-box output for a plaintext value x and the S-box input
of another plaintext value x′, which are related to the same key byte as:
x′ ⊕ k = S(x⊕ k) (4.2)
x′ = S(x⊕ k)⊕ k . (4.3)
Therefore, the collision can be tested by building a hypothesis for k, and whenever
a collision is detected, the correct key for that byte is immediately revealed. Even
though the key byte can be recovered directly with this attack, the intrinsic problem
remains: the challenge of finding the two leaking samples which correspond
to the leakage of such values. The next section presents a univariate solution to this
problem, which removes the requirement of strong leakage assumptions to be able to
mount a side channel attack similar to side channel collision attacks.
4.2 Side Channel Near Collision Attack
In this section, we introduce a univariate non-profiled attack, namely the side channel
near collision attack (NCA), with an example on an AES implementation. NCA is
very similar to other collision attacks in the sense that a priori knowledge of the
exact leakage function is not required to mount it. However, unlike the collision attacks
proposed up until now, the near collision attack exploits the existence of very similar but
yet distinct values that are computed when the inputs are assumed to be selected
uniformly at random from the entire set of inputs, $\mathbb{F}_2^8$. This brings up an implicit
power model assumption that the power consumption should be linearly related to
the bits of the sensitive value that is computed in the device. In comparison to the
popular Hamming weight model, this implicit power model assumption is a much
weaker one and therefore makes the attack more powerful against a wider range of
platforms and devices with different leakage functions.
The main idea of a near collision attack (NCA) is to separate the measurements
into two vectors and statistically compare how these two vectors are related to each
other. Assuming that the S-box output leaks in the measurements, for any input byte
$x_0$, another input value $x_1$ is computed for the same byte and for a key guess $k_g$ as:
$$S(x_1 \oplus k_g) = S(x_0 \oplus k_g) \oplus \Delta(t) \tag{4.4}$$
$$x_1 = S^{-1}(S(x_0 \oplus k_g) \oplus \Delta(t)) \oplus k_g \tag{4.5}$$
where $\Delta(t)$ is an 8-bit value with a ‘1’ at the $t$th bit position and ‘0’ elsewhere. If the
key guess is correct, then the S-box outputs have only one bit (XOR) difference. If the
key guess is not correct, then the outputs will have a random (XOR) difference. Note
that this property holds due to the AES S-box’s strength against differential cryptanalysis.
Proceeding in this way, one can form a pair of vectors, $X_0 = [x_0^1 \cdots x_0^{128}]$ and
$X_1 = [x_1^1 \cdots x_1^{128}]$, such that
$$X_0 = [x_0^i \in \mathbb{F}_2^8 : i \in \{1, \dots, 128\},\ S(x_0^i \oplus k_g) \wedge \Delta(t) = 0] \tag{4.6}$$
where $\wedge$ is the bit-wise AND operation, and $X_1$ is formed from each element of $X_0$
through the relation given in Eqn. (4.5). This way the whole set of values in $\mathbb{F}_2^8$ is
separated into two vectors, which can now be used to generate a statistic for $k_g$
which in turn can be used to distinguish the correct key from others.
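The construction of the two vectors can be made concrete with a short Python sketch. It assumes that t = 1 denotes the least significant bit (the text does not fix this convention), and the helper names `aes_sbox`, `inverse`, `nca_vectors` and `output_differences` are illustrative.

```python
def aes_sbox():
    """Build the AES S-box: inverse in GF(2^8), then the affine map."""
    exp, log, x = [0] * 255, [0] * 256, 1
    for i in range(255):                 # powers of the generator 0x03
        exp[i], log[x] = x, i
        x ^= (x << 1) ^ (0x11B if x & 0x80 else 0)
    box = []
    for a in range(256):
        inv = exp[(255 - log[a]) % 255] if a else 0
        s = inv
        for r in range(1, 5):            # XOR of four left rotations
            s ^= ((inv << r) | (inv >> (8 - r))) & 0xFF
        box.append(s ^ 0x63)
    return box


def inverse(sbox):
    """Inverse lookup table of a bijective S-box."""
    inv = [0] * 256
    for x, s in enumerate(sbox):
        inv[s] = x
    return inv


def nca_vectors(kg, t, sbox, inv_sbox):
    """Form X0 (Eqn. 4.6) and the paired X1 (Eqn. 4.5) for key guess kg."""
    delta = 1 << (t - 1)                 # assumption: t = 1 is the LSB
    x0 = [x for x in range(256) if (sbox[x ^ kg] & delta) == 0]
    x1 = [inv_sbox[sbox[x ^ kg] ^ delta] ^ kg for x in x0]
    return x0, x1


def output_differences(x0, x1, key, sbox):
    """XOR differences of the S-box outputs actually computed under `key`."""
    return {sbox[a ^ key] ^ sbox[b ^ key] for a, b in zip(x0, x1)}
```

With the correct key guess, `output_differences` contains the single value ∆(t); with a wrong guess, the pairs are expected to produce many different values, which is what the distinguisher exploits.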
For $t = 8$, the observed values corresponding to the sets $X_0$ and $X_1$ are
visualized in Figure 4.1 for an incorrect and a correct key guess under the assumption
that the Hamming weight of a value leaks in observations. The difference between
the observed values is also included in the plot for ease of comparison. As is visible
from Figure 4.1, when the key guess is correct (Figure 4.1(right)), the difference
between the vectors of observed values $O^{X_0}$ and $O^{X_1}$, corresponding to $X_0$ and $X_1$
respectively, is constant. If the key guess is not correct (Figure 4.1(left)), then there is
no directly visible relation between the two vectors $O^{X_0}$ and $O^{X_1}$. Therefore the Pearson
correlation coefficient $\rho(O^{X_0}, O^{X_1})$ can be used as a statistical distinguisher in this
case to recover the key.
Following the high level description of the near collision side channel attack, the
method can be applied to real measurements in a known plaintext setting by following
the steps below:
Figure 4.1: Simulation values in sets X0 and X1 for incorrect (left) and correct (right)
key guesses. Each panel plots the power consumption against the pair index (i) for the
observed values of X0, the observed values of X1, and their difference.
1. Make a key guess $k_g$.
2. Pick a bit value $t$ for $\Delta(t)$.
3. Partition the inputs into sets $X_0$ and $X_1$ following the relation in Eqn. (4.5).
4. Compute the mean traces ($\mu(O^{x_0^i})$ and $\mu(O^{x_1^i})$ for all $i \in \{1, \dots, 128\}$) corresponding
to each element of $X_0$ and $X_1$.
5. Compute the Pearson correlation coefficient between $O^{X_0}$ and $O^{X_1}$ (i.e. $\rho(O^{X_0}, O^{X_1})$).
6. Repeat steps 2 to 5 for each value of $t \in \{1, \dots, 8\}$, and store the sum of
correlation coefficients for each $k_g$.
After following these steps, the correct key is expected to get the highest cumulative
correlation value.
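The six steps above can be sketched as follows. This is a hedged illustration on simulated single-sample traces rather than the implementation evaluated in this chapter: the function names, the random linear leakage and the noise level are assumptions, and the sketch relies on every input byte value occurring at least once.

```python
import random


def aes_sbox():
    """Build the AES S-box: inverse in GF(2^8), then the affine map."""
    exp, log, x = [0] * 255, [0] * 256, 1
    for i in range(255):                 # powers of the generator 0x03
        exp[i], log[x] = x, i
        x ^= (x << 1) ^ (0x11B if x & 0x80 else 0)
    box = []
    for a in range(256):
        inv = exp[(255 - log[a]) % 255] if a else 0
        s = inv
        for r in range(1, 5):            # XOR of four left rotations
            s ^= ((inv << r) | (inv >> (8 - r))) & 0xFF
        box.append(s ^ 0x63)
    return box


def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5 if va and vb else 0.0


def nca_scores(inputs, observations, sbox):
    """Cumulative correlation per key guess, following steps 1 to 6."""
    inv = [0] * 256
    for x, s in enumerate(sbox):
        inv[s] = x
    # Step 4: one mean observation per input byte value (independent of kg).
    sums, counts = [0.0] * 256, [0] * 256
    for x, o in zip(inputs, observations):
        sums[x] += o
        counts[x] += 1
    mu = [s / c for s, c in zip(sums, counts)]     # assumes all values occur
    scores = []
    for kg in range(256):                           # step 1
        total = 0.0
        for t in range(8):                          # steps 2 and 6
            delta = 1 << t
            x0 = [x for x in range(256) if (sbox[x ^ kg] & delta) == 0]
            x1 = [inv[sbox[x ^ kg] ^ delta] ^ kg for x in x0]   # step 3
            total += pearson([mu[x] for x in x0],
                             [mu[x] for x in x1])   # step 5
        scores.append(total)
    return scores


def simulate(key, sbox, reps=20, sigma=0.5, seed=0):
    """Random linear leakage of the S-box output bits plus Gaussian noise."""
    rng = random.Random(seed)
    w = [rng.uniform(-1, 1) for _ in range(8)]
    xs = list(range(256)) * reps
    obs = [sum(w[b] for b in range(8) if (sbox[x ^ key] >> b) & 1)
           + rng.gauss(0, sigma) for x in xs]
    return xs, obs
```

The key candidate is the index of the largest entry of `nca_scores`. Since the mean per input value does not depend on the key guess, it is computed once and reused for all 256 guesses.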
4.2.1 Simulated Experiments on Unprotected AES Implementation
We have run simulated experiments to assess the capabilities of the near collision
attack (NCA) and its eﬃciency in comparison to other similar attacks in the literature.
To evaluate how our attack reacts to noise, we have ﬁxed the number of traces and
conducted experiments with various signal to noise ratio (SNR) values. Assuming the
“signal” is the power consumption of the target intermediate variable, and “noise”
is additive noise, the SNR is computed as var(signal)/var(noise) [Man04].
An important note here is that we use discrete simulated values to mimic the mea-
surements collected from an oscilloscope. Usually when simulated measurements are
analysed, the fact that the simulations provide unnaturally optimistic measurements
is neglected. Since this may lead to misleading simulation results which cannot be
reproduced in real life, we have chosen to filter the simulated traces and scale them
to the resolution of an 8-bit oscilloscope, therefore producing 256 unique values for
traces. Note that, depending on the noise level, the simulated traces can cover a large
range of values. Therefore we have chosen to scale the values such that the
maximum and minimum values (128 and -127) are assigned to the values (µ + 3 × σ) and
(µ − 3 × σ) respectively, where µ is the mean and σ is the standard deviation of the
simulated traces. The rest of the values are distributed equally over the sub-ranges,
which are of equal size.
To measure the robustness of our technique against different linear leakage
functions, we have used two ways to compute the simulated traces:
(a) The ﬁrst method computes the Hamming weight of the S-box output (HW
model).
(b) The second method is a weighted linear function of the bits of the S-box output,
where the weight values are picked uniformly at random in the range [−1, 1] ⊂ R
(Random linear model).
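The simulation set-up described in this section (two leakage models, additive noise for a target SNR, and scaling to 8-bit resolution) can be sketched as below. The function names are hypothetical, the noise is assumed to be Gaussian, and the quantisation is one possible reading of the scaling rule: values beyond µ ± 3σ are clipped to the extreme levels, which are taken from the text as -127 and 128.

```python
import random
from statistics import mean, pstdev, pvariance


def hw_leak(v):
    """(a) Hamming weight model."""
    return bin(v).count("1")


def random_linear_leak(rng):
    """(b) Random linear model: one weight in [-1, 1] per bit."""
    w = [rng.uniform(-1.0, 1.0) for _ in range(8)]
    return lambda v: sum(w[b] for b in range(8) if (v >> b) & 1)


def add_noise(signal, snr, rng):
    """Add Gaussian noise targeting var(signal) / var(noise) = snr."""
    sigma = (pvariance(signal) / snr) ** 0.5
    return [s + rng.gauss(0.0, sigma) for s in signal]


def quantize(trace, lo=-127, hi=128):
    """Map [mu - 3 sigma, mu + 3 sigma] onto hi - lo + 1 equal sub-ranges."""
    mu, sd = mean(trace), pstdev(trace)      # assumes a non-constant trace
    bottom, width = mu - 3 * sd, 6 * sd / (hi - lo + 1)
    out = []
    for v in trace:
        level = lo + int((v - bottom) // width)
        out.append(max(lo, min(hi, level)))  # clip outliers beyond 3 sigma
    return out
```

A simulated data set is then obtained by applying one of the two leakage functions to the S-box outputs, passing the result through `add_noise` for the desired SNR, and finally through `quantize`.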
For comparison, we have selected the popular non-proﬁling univariate attacks:
correlation power analysis (CPA) [BCO04], absolute sum DPA (AS-DPA) [ARR03],
non-proﬁled linear regression attack (NP-LRA) [DPRS11], and univariate mutual in-
formation analysis (UMIA) [GBTP08]. We have included CPA with Hamming weight
model to have a basis for comparison as it is a popular choice for doing side channel
analysis. AS-DPA and NP-LRA are chosen to have a comparison with attacks
which also have weak assumptions on the leakage model; AS-DPA assumes that
each bit of the sensitive variable contributes significantly to the power consumption,
whereas NP-LRA usually limits the algebraic order of the leakage function. For this
work, we have restricted the basis functions of NP-LRA to linear relations (the case
d = 1 in [DPRS11]), so that it is a fair comparison to our work. Furthermore,
we have included the leakage model dependent UMIA with the Hamming weight model
(UMIA-(HW)), and the leakage model agnostic variant of UMIA which measures the
mutual information between the least significant 7 bits of the sensitive variable and
the power measurements (UMIA-(7 LSB)). For both instances of UMIA we use histograms
to estimate the probability distributions.
We have run the experiments with 10 000 traces to put all methods on fair ground.
Note that MIA requires a large number of traces as its distinguishing ability depends
on the accuracy of the joint probability distribution estimations between the sensitive
variable and power traces. We have computed the guessing entropy [SMY09a] over
100 independent experiments for each SNR value considered. Figure 4.2 presents the
results of these experiments. As is visible in Figure 4.2(a), which shows results
for the Hamming weight leakage function, CPA has an obvious advantage over all other
methods. When the second leakage function is considered however (Figure 4.2(b)),
the attacks using relaxed assumptions on the leakage function outperform CPA. We
also clearly see that MIA cannot deal with high levels of noise as efficiently as NCA,
AS-DPA and NP-LRA.
Figure 4.2: SNR vs Guessing Entropy values computed over 100 independent experiments
with 10 000 traces for perfect HW leakage (a), and random linear leakage
(b). Both panels plot log2(guessing entropy) against log2(SNR) for This Work (All bits),
CPA (HW), UMIA (HW), UMIA (7 LSB), NP-LRA (d=1) and AS-DPA.
Finally, if we only consider the attacks which have fewer assumptions on the
leakage function, absolute sum DPA and the non-profiled linear regression attack
seem to be able to deal with noise more efficiently than NCA in an
unprotected setting. Section 4.3 explains how the near collision approach of looking
for small diﬀerences in the sensitive values can lead to a signiﬁcant improvement over
the state of the art attack against low entropy masking schemes.
4.2.2 Implementation Efficiency of NCA
Although near collision attack has the advantage of having reduced assumptions on
the target device, this comes at a price, namely in computation time. For each key
guess, the analyst should find the measurements which have a particular value in their
corresponding plaintext. Although this can be a cumbersome operation, it does not
scale up when the analyst has to run the analysis on multiple samples of collected
measurements. To give a more accurate idea of the timing cost of NCA, Table 4.1
summarizes the average running time of each attack that is run in the previous section.
The table presents average running time of each attack on 10 000 traces. All attacks
are implemented as Matlab scripts executed in Matlab 2015a running on a PC with a
Xeon E7 CPU. Note that the performance numbers assume the traces to be already
loaded into memory in all cases.
Looking at the results presented in Table 4.1 and also Figure 4.2, AS-DPA seems
to be the best choice for the analyst in the tested cases in terms of running time and
the ability to deal with Gaussian noise. However, even though AS-DPA and NP-LRA
are more eﬃcient in the unprotected case, these techniques are not applicable in a
univariate attack setting against low entropy masking schemes.
Table 4.1: Average timing results from 100 independent experiments.
Technique        Time (sec.)
NCA              5.2727
CPA (HW)         1.0285
AS-DPA           1.1568
NP-LRA (d=1)     2.7621
UMIA (HW)        1.4130
UMIA (7 LSB)     6.6153
4.3 Near Collision Attack Against LEMS
A rather effective countermeasure against first order attacks such as those introduced in the
previous sections of this work is to use a masking scheme. However, one of the biggest
drawbacks of masking schemes is the overhead introduced into implementations.
Low entropy masking schemes (LEMS) are a solution proposed to keep the security
of classical masking [NGD11, NSGD12, CDGM12] while reducing the implementation
costs significantly. This is achieved by using a small subset of the entire mask set
and requiring less randomness than is required by a traditional masking scheme
(e.g. for AES, the amount of randomness required for each encryption is 16 bits in
comparison to 256 bits for a conventional masking scheme). In this section, we show
how the near collision attack idea can be extended to low entropy masking schemes. In
particular, we focus on the mask set that is also used in the DPA Contest v4 traces:
M16 = {00, 0F, 36, 39, 53, 5C, 65, 6A, 95, 9A, A3, AC, C6, C9,F0,FF} .
4.3.1 Leaking Set Collision Attack
Leaking set collision attack is based on the observation that two sensitive variables
which are masked with the mask set M16 lead to the same 16 leaking (masked) values
if they are the bit-wise complement of each other [YE14]. Following this observation,
the so-called leaking sets for each input $x$ and its pair $x'$ are computed for a key
guess $k_g$ as:
$$x' = S^{-1}(S(x \oplus k_g) \oplus (\mathrm{FF})_{16}) \tag{4.7}$$
Once the input pairs per key guess are computed, the analyst collects the observed
values $O^x$ and $O^{x'}$ corresponding to $x$ and $x'$. If the key guess is correct, $O^x$ and
$O^{x'}$ should have the same distribution. If the key guess is not correct however,
the resulting distributions will diﬀer signiﬁcantly, thanks to the AES S-box’s good
resistance against diﬀerential attacks. For comparing the distributions of these two
sets, authors of [YE14] propose to use the 2-sample Kolmogorov-Smirnov (KS) test
statistic. As KS test measures the distance between two distributions, the correct key
guess should result in a lower KS test statistic than the incorrect key guesses do (see
Figure 4.3(b)).
4.3.2 Leaking Set Near Collision Attack
We now deﬁne the ‘leaking set near collision attack ’ (LS-NCA) as a combination of
the LSCA idea proposed in [YE14] and the near collision attack (NCA) introduced in
Section 4.2. Leaking set near collision attack can be summarized as an extension of
the idea explained in Section 4.3.1 to the entire mask set M16. Similarly, we use the
same observation that some input values lead to the same distribution in the S-box
output as a direct result of the properties of the mask set that is used. As the authors
of [YE14] point out in their work, whenever a sensitive value x is protected with the
mask set M16, the value x⊕ (FF)16 also results in the same values after applying the
mask set. A further observation on the mask set M16 is that it is a closed set with
respect to the XOR operation. In other words, XOR of any two elements in the set
M16 results in another element of the mask set:
mi ⊕mj ∈M16, ∀mi,mj ∈M16 . (4.8)
This means that a sensitive variable x protected with the mask set M16, and another
sensitive variable y(i) = x ⊕ mi, mi ∈ M16 leads to the same masked values. It
is easy to see that the property exploited in [YE14] is one particular case of the
observation given in Eqn. (4.8). Therefore, rather than directly comparing two similar
distributions, if one collects all data from the input values which lead to the same
distribution in the same set, this will lead to an equally reliable statistical analysis
with less data. One should note that once all data that contribute to the same
distribution are collected together, it is no longer possible to make a comparison
between leaking sets and expect the same distribution as in a leaking set collision
attack (see Section 4.3.1). Therefore, we utilize a similar approach as we have done
in Section 4.2 and look for sensitive values with 1-bit diﬀerences for comparison.
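The closure property of Eqn. (4.8) and its consequence for the leaking sets can be checked directly. The following Python sketch uses the mask set M16 given above; the helper names `is_closed` and `leaking_set` are illustrative.

```python
M16 = (0x00, 0x0F, 0x36, 0x39, 0x53, 0x5C, 0x65, 0x6A,
       0x95, 0x9A, 0xA3, 0xAC, 0xC6, 0xC9, 0xF0, 0xFF)


def is_closed(mask_set):
    """Eqn. (4.8): the XOR of any two masks is again a mask."""
    return all((a ^ b) in mask_set for a in mask_set for b in mask_set)


def leaking_set(v, mask_set=M16):
    """All masked values that a sensitive value v can take."""
    return frozenset(v ^ m for m in mask_set)


def distinct_leaking_sets(mask_set=M16):
    """The different leaking sets over all 256 sensitive values."""
    return {leaking_set(v, mask_set) for v in range(256)}
```

Because the set is closed, `leaking_set(x)` equals `leaking_set(x ^ m)` for every mask m, so the 256 sensitive values fall into only 16 distinct leaking sets of 16 values each. The complement pair exploited in [YE14] corresponds to the single mask (FF)16.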
Leaking set near collision attack can be summarized as follows:
1. Partition the inputs ($x$) such that their S-box outputs that contribute to the
same distribution are collected together:
$$D^{x_i}_{M_{16}} = \{x : x = S^{-1}(S(x_i) \oplus m),\ \forall m \in M_{16}\}.$$
2. Make a key guess $k_g$.
3. For each input byte $x \oplus k_g$ which contributes to the same distribution (e.g.
$x \oplus k_g \in D^{x_i}_{M_{16}}$), collect the corresponding measurement sample in a set $O^{x_i}(k_g)$.
4. Use the 2-sample Kolmogorov-Smirnov (KS) test to check how similar the distributions
of $O^{x_i}(k_g)$ and $O^{x_j}(k_g)$ are, where $S(x_i) \oplus S(x_j) = \Delta(t)$, $\forall t \in \{1, \dots, 8\}$.
5. Store the sum of all 2-sample KS test statistics for each $k_g$.
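The steps above can be sketched in Python as follows. This is an illustrative sketch on simulated traces, not the implementation used for the results in this chapter: it assumes Hamming weight leakage of the masked S-box output, Gaussian noise, a uniformly chosen mask from M16 per trace, and a hand-written 2-sample KS statistic. Each set is labelled by the smallest element of its leaking set, and the optional `candidates` argument (a hypothetical convenience) restricts the key guesses.

```python
import random

M16 = (0x00, 0x0F, 0x36, 0x39, 0x53, 0x5C, 0x65, 0x6A,
       0x95, 0x9A, 0xA3, 0xAC, 0xC6, 0xC9, 0xF0, 0xFF)


def aes_sbox():
    """Build the AES S-box: inverse in GF(2^8), then the affine map."""
    exp, log, x = [0] * 255, [0] * 256, 1
    for i in range(255):                 # powers of the generator 0x03
        exp[i], log[x] = x, i
        x ^= (x << 1) ^ (0x11B if x & 0x80 else 0)
    box = []
    for a in range(256):
        inv = exp[(255 - log[a]) % 255] if a else 0
        s = inv
        for r in range(1, 5):            # XOR of four left rotations
            s ^= ((inv << r) | (inv >> (8 - r))) & 0xFF
        box.append(s ^ 0x63)
    return box


def ks_sorted(a, b):
    """2-sample KS statistic for two already sorted, non-empty samples."""
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        v = min(a[i], b[j])
        while i < len(a) and a[i] <= v:
            i += 1
        while j < len(b) and b[j] <= v:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def ls_nca_scores(inputs, observations, sbox, candidates=range(256)):
    """Cumulative KS statistic per key guess (largest expected for the key)."""
    rep = [min(v ^ m for m in M16) for v in range(256)]    # step 1: set label
    scores = {}
    for kg in candidates:                                   # step 2
        sets = {}
        for x, o in zip(inputs, observations):              # step 3
            sets.setdefault(rep[sbox[x ^ kg]], []).append(o)
        for obs in sets.values():
            obs.sort()
        total = 0.0
        for r, obs in sets.items():                         # step 4
            for t in range(8):
                r2 = rep[r ^ (1 << t)]
                if r < r2 and r2 in sets:                   # each pair once
                    total += ks_sorted(obs, sets[r2])
        scores[kg] = total                                  # step 5
    return scores


def simulate(key, n, sigma, sbox, seed=0):
    """HW of the masked S-box output, with a fresh mask from M16 per trace."""
    rng = random.Random(seed)
    xs = [rng.randrange(256) for _ in range(n)]
    obs = [bin(sbox[x ^ key] ^ rng.choice(M16)).count("1")
           + rng.gauss(0, sigma) for x in xs]
    return xs, obs
```

The candidate returned by `max(scores, key=scores.get)` is the key guess with the largest cumulative statistic. Sorting each set once per key guess keeps the cost of the pairwise comparisons low.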
The KS test statistic indicates how diﬀerent two distributions are by comparing
their empirical distribution functions. Therefore, computing the KS test statistic does
not require estimating the distributions that the drawn samples (i.e. observed values)
come from. Moreover, as the KS test has been shown to be more robust to noise in
measurements [WOM11] than other alternatives in a side channel analysis setting, we
choose to utilize KS-test for comparing distributions in Step 4 of our analysis.
Note that in Step 4, only the sets which have a 1-bit difference in between are
used for the 2-sample KS-test statistic calculation. In fact, sets with more than one bit
difference in their S-box outputs might have the same Hamming weight, which in turn
leads to similar (but not the same) distributions. Therefore, we expect the correct
key to lead to a large distance between the two distributions. In case of an incorrect
key guess however, each of the 16 elements in the set $D^{x_i}_{M_{16}}$ will lead to 16 distinct
values after the S-box, therefore resulting in a distribution which spans the entire
space $\mathbb{F}_2^8$. Sets with only one bit difference however will always result in different
distributions. For instance, if the device leaks the Hamming weight of a value it
computes, comparing sets with more than one bit difference would introduce noise
in the cumulative KS-test statistic, as the values (05)16 and (03)16 have a 2-bit XOR
difference in between but have the same Hamming weight. Further note that doing
the analysis on 1-bit different sensitive values limits the analysis to 64 calls to the
2-sample KS-test, and therefore saves running time when the device leaks the Hamming
weight of the sensitive variable.
On the other hand, if the leakage function is an injection, all $\binom{16}{2} = 120$ combinations
should be compared cumulatively. Here, using only the sets with 1-bit difference
for comparison can be thought of as a method similar to using mutual information analysis
(MIA) while estimating power consumption with the Hamming weight model, since
there is an implicit leakage function assumption that there is no inter-bit interaction
in the leaking (sensitive) variable. In fact, if the leakage function is non-linear, the improvement
gained through using only 1-bit different sets for comparing distributions
would be less pronounced.
Unlike LSCA (outlined in Section 4.3.1), the cumulative test statistic now results
in much smaller values for the incorrect key guesses. This is because
a wrong key guess ($k_w$) results in a random sampling of the set $\mathbb{F}_2^8$; taking into
account that the 16 masks in $M_{16}$ result in 16 distinct values for each sample in the
set, the resulting $O^{x_i}(k_w)$ will cover the contributions of almost all of the elements
in $\mathbb{F}_2^8$. However, this does not diminish the distinguishability of the correct key from
other candidates. In the case where the key guess is correct ($k_c$), the set $O^{x_i}(k_c)$
will have around 16 unique values (the exact number can increase due to noise in
the measurement setup). Now that we have a much smaller sampling of the set $\mathbb{F}_2^8$,
a comparison of distinct sets ($[O^{x_i}(k_c), O^{x_j}(k_c)]$, $i \neq j$) is more meaningful, and in
fact the cumulative 2-sample KS-test statistic is larger than the
one obtained for a wrong key guess, as the distributions are definitely different (see
Figure 4.3(a)).
Note that this mask set is an example of the mask sets that are generated as
a linear code [BCG13]. As binary linear codes have the intrinsic property of being
closed sets with respect to the XOR operation, any mask set that is a binary linear
code is vulnerable to the attack explained in this section.
[Figure 4.3: two panels, (a) and (b); x-axis: Key Candidates (0 to 250); y-axis: Cumulative KS Test Statistic.]
Figure 4.3: Analysis results of Leaking Set Near Collision Attack (a) and Leaking Set
Collision Attack [YE14] (b), run over 40 000 simulated traces with SNR = 4.
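The closure property, and the counts of 64 and 120 comparisons mentioned in Section 4.3.2, can be checked with a short Python sketch. The mask values below are an assumption: they are quoted from the public description of DPA Contest v4 and are not listed in this section, so they should be verified against that source. Treating the 16 leaking sets as the cosets of the mask set is likewise one reading that reproduces the two counts, not necessarily the exact construction used in the experiments; all function names are hypothetical.

```python
from itertools import combinations

# Assumption: mask set as publicly documented for DPA Contest v4 (verify before use).
M16 = [0x00, 0x0F, 0x36, 0x39, 0x53, 0x5C, 0x65, 0x6A,
       0x95, 0x9A, 0xA3, 0xAC, 0xC6, 0xC9, 0xF0, 0xFF]


def is_xor_closed(masks) -> bool:
    """True if the XOR of any two masks is again a mask (linear code property)."""
    mask_set = set(masks)
    return all((a ^ b) in mask_set for a in mask_set for b in mask_set)


def cosets(masks):
    """Partition all byte values into sets of the form {v ^ m : m in masks}."""
    seen, result = set(), []
    for value in range(256):
        if value not in seen:
            coset = frozenset(value ^ m for m in masks)
            seen |= coset
            result.append(coset)
    return result


def one_bit_pairs(coset_list):
    """Unordered pairs of sets that are mapped onto each other by one bit flip."""
    index = {v: i for i, coset in enumerate(coset_list) for v in coset}
    pairs = set()
    for i, coset in enumerate(coset_list):
        representative = min(coset)
        for bit in range(8):
            j = index[representative ^ (1 << bit)]
            if i != j:
                pairs.add((min(i, j), max(i, j)))
    return pairs


sets16 = cosets(M16)
all_pairs = list(combinations(range(len(sets16)), 2))
linear_pairs = one_bit_pairs(sets16)
```

With these masks, `len(all_pairs)` corresponds to the comparisons of the identity variant and `len(linear_pairs)` to those of the linear variant.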
4.3.3 Simulated Experiments on AES Implementation with LEMS
In this section we present results of our simulated experiments with different SNR
values and with two different leakage functions, as was done in Section 4.2.1. In
our simulations, we compare our attacks to the previously proposed univariate,
non-profiled attacks: univariate MIA (UMIA) and LSCA, which is recalled in Section 4.3.1.
To compare the efficiency of the attacks in terms of the expected remaining work
to find the key, we use the guessing entropy metric [SMY09a]. For each experiment
using a random linear model as the leakage function, we generate 8 values which
are picked uniformly at random from $[1, 10] \subset \mathbb{R}$. Unlike the simulations presented in
Section 4.2.1, we have chosen a slightly different linear leakage function, one that
favours attacks which assume that each bit of the sensitive value contributes to
the leakage. Although this may not always be the case in real life, we choose
this leakage function as it is a favourable leakage model for MIA using the Hamming
weight model. We present results which show that the technique proposed in Section 4.3.2 is
more efficient than MIA in terms of handling the noise in an unknown linear leakage
model setting, even when the leakage model favours MIA.
The experiments are carried out with various SNR values and the results are
presented in Figure 4.4 for both the Hamming weight model and the random linear leakage
model we computed for each experiment. Note that the attack which assumes a linear
leakage model and computes only 64 comparisons is marked as ‘Linear’, and the attack
which computes all 120 possible comparisons is marked as ‘ID’ for identity model.
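A minimal Python sketch of such a simulated leakage is given below. It is an illustration under stated assumptions rather than the simulation code behind Figure 4.4: the noise is assumed to be Gaussian, the SNR is taken as the ratio of signal variance to noise variance for uniformly distributed bytes, and all function names are hypothetical.

```python
import math
import random


def random_linear_weights(rng: random.Random, low: float = 1.0, high: float = 10.0):
    """Eight per-bit weights drawn uniformly at random from [low, high]."""
    return [rng.uniform(low, high) for _ in range(8)]


def linear_leakage(value: int, weights) -> float:
    """Noise-free leakage: weighted sum of the bits of an 8-bit value."""
    return sum(w for bit, w in enumerate(weights) if (value >> bit) & 1)


def noise_sigma(weights, snr: float) -> float:
    """Noise standard deviation such that Var(signal) / Var(noise) equals snr."""
    # For a uniform byte every bit is an independent fair coin with variance 1/4.
    signal_variance = sum(w * w for w in weights) / 4.0
    return math.sqrt(signal_variance / snr)


def simulate_sample(value: int, weights, snr: float, rng: random.Random) -> float:
    """One simulated measurement: linear leakage plus Gaussian noise (assumption)."""
    return linear_leakage(value, weights) + rng.gauss(0.0, noise_sigma(weights, snr))


# The Hamming weight model is the special case in which every weight equals 1.
HW_WEIGHTS = [1.0] * 8
```

Setting all weights to 1 recovers the perfect Hamming weight leakage, which makes it easy to run both models through the same simulation path.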
A quick look at Figure 4.4 shows that, similar to the near collision attack
proposed in Section 4.2, the attack is indifferent to changes in the leakage model as
long as it stays linear. Moreover, if the leakage model is a random linear function of
the bits of the sensitive variable, univariate MIA fails to recover the key for leakage
models whose weight values have a high variance. However, when the leakage model
follows a strict Hamming weight leakage, univariate MIA seems to handle noise
more efficiently than both of our proposed approaches to analysing the low entropy
masking scheme at hand.
[Figure 4.4: two panels, (a) and (b), with the titles "Avg GE of 32 simulations, 40000 traces, Random Linear Model" and "Avg GE of 32 simulations, 40000 traces, HW Model" (in extraction order); x-axis: log2(SNR); y-axis: log2(guessing entropy); curves: This Work (Linear), This Work (ID), LSCA, UMIA (HW), UMIA (7 LSB).]
Figure 4.4: SNR vs Guessing Entropy values computed over 32 independent experiments
with 40 000 traces for perfect HW leakage (a), and random linear leakage (b).
4.3.4 Experiments on DPA Contest v4 Traces
This section shows the efficiency of our attacks compared to similar previously
proposed methods in a real-world scenario. For reproducibility of our
results, we used the DPA Contest v4 traces, which are EM measurements collected from
a smart card with an ATMega-163 microcontroller that implements an LEMS
against first and second order attacks [NSGD12]. The implementation uses the mask
set $M_{16}$ given in Section 4.8.
Figure 4.5 presents an analysis in the time domain which reveals that even with 1000
traces, it is possible to recover the key with high confidence. To further test the
reliability of our technique on the DPA Contest v4 traces, we focused our analysis
on the time samples where each of the 16 S-box outputs leads to the highest
signal-to-noise ratio (SNR, computed following the definition in [MOP07]) when computing
guessing entropy; the results of the analysis are presented in Figure 4.6.
The first thing to notice in Figure 4.6 is that LSCA has some room for improvement
even when compared to a generic univariate MIA (UMIA (7 bit) in Figure 4.6). On
the other hand, when MIA is applied with a more accurate power model (in this
case the Hamming weight model), the gap is rather large. When the leaking set near
collision attack proposed in this work is considered, it is easy to see that the variant
which does not assume any power model (‘ID’) is twice as efficient as the generic
univariate MIA (‘7 bit’) in terms of the number of traces required to recover the full
key. Moreover, when the leakage function is assumed to have
a linear relation with respect to the bits of the leaking value, the results are almost
identical to the ones from a univariate MIA which models the power consumption as
the Hamming weight of the leaking value. One should note that the Hamming weight
model is rather accurate in this case, as the SNR values (computed with the Hamming weight
model) vary between 3 and 5 for the points taken into consideration for the analysis.
[Figure 4.5: x-axis: Time Samples (0 to 1000); y-axis: Cumulative 2-Sample KS-Test Statistic.]
Figure 4.5: Leaking set near collision attack results vs time samples.
[Figure 4.6: panel title "Avg GE of 16 bytes from DPA Contest v4 traces"; x-axis: Number of traces (1000 to 6000); y-axis: log2(guessing entropy); curves: This Work (Linear), This Work (ID), LSCA, UMIA (HW), UMIA (7 bit).]
Figure 4.6: Logarithm of average partial guessing entropy for various univariate attacks
on DPA Contest v4 traces.
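For reference, the guessing entropy plotted in Figures 4.4 and 4.6 can be estimated as the average rank of the correct key over repeated experiments. The Python sketch below assumes that a higher score means a more likely candidate, that ranks start at 1, and that ties are resolved in favour of the correct key; the function names are hypothetical and this is not the evaluation code used for the figures.

```python
import math


def key_rank(scores, correct_key: int) -> int:
    """1-based rank of the correct key when candidates are sorted by descending score."""
    correct_score = scores[correct_key]
    return 1 + sum(1 for s in scores if s > correct_score)


def guessing_entropy(score_tables, correct_key: int) -> float:
    """Average rank of the correct key over several independent experiments."""
    ranks = [key_rank(scores, correct_key) for scores in score_tables]
    return sum(ranks) / len(ranks)


def log2_guessing_entropy(score_tables, correct_key: int) -> float:
    """Logarithmic form, so that 0 means the key is always ranked first."""
    return math.log2(guessing_entropy(score_tables, correct_key))
```

For a byte-wise attack, each entry of `score_tables` would hold 256 values, one cumulative test statistic per key candidate.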
4.3.5 Implementation Efficiency of the Attack
Similar to the near collision attack, the leaking set near collision attack also requires
finding the traces in the measurement set which correspond to a set of input bytes.
Although this operation is computationally heavy when applied to a large trace set,
it does not get worse when multiple samples need to be analysed. The analyst
can group all the traces corresponding to a leaking set and then compute the 2-sample
KS test statistic for each sample of a pair of leaking sets.
Table 4.2: Average timing results from 100 independent experiments.
Technique          Time (sec.)
LS-NCA (Linear)    13.7144
LS-NCA (ID)        15.4003
LSCA               8.6176
UMIA (HW)          1.5648
UMIA (7 LSB)       7.6951
As in Section 4.2, we have run simulated experiments to assess the time required
to run the proposed attacks in comparison to the other attacks run in this section.
Table 4.2 presents the average running times of each attack applied to the chosen
low entropy masking scheme. Timings presented in the table are average running
times over 100 independent experiments that are run over 10 000 traces. As in
the earlier experiments, all attacks are implemented as Matlab scripts and executed in
Matlab 2015a on a PC with a Xeon E7 CPU. Note that the performance numbers
assume the traces to be already loaded into memory in all cases.
Considering the results presented in Table 4.2 together with the fact that
the leaking set near collision attacks (LS-NCA) require fewer traces, LS-NCA emerge
as the strongest of the attacks considered here against software implementations of LEMS.
4.4 Conclusions
In this chapter, we introduced a new way of analysing side channel traces, namely the
side channel near collision attack (NCA). Unlike the collision attacks proposed in the
literature, NCA is intrinsically univariate and only assumes the leakage function to
be linear. Simulations show that NCA is indiﬀerent to changes in the linear leakage
function.
Furthermore, we present a new attack, leaking set near collision attack, against the
low entropy masking scheme used in DPA Contest v4 [NSGD12]. This attack improves
the attack proposed in [YE14] by fully exploiting the properties of the used mask set,
and combining it with the NCA approach. As the proposed attack is univariate, it is
especially of interest for software implementations of low entropy masking schemes.
Simulations show that if the leakage function diverges from a perfect Hamming
weight leakage but stays a linear function, our attack outperforms univariate MIA.
It should be noted that not only the mask set $M_{16}$, but all mask sets which have
a linear relation in between (as proposed in [BCG13]) are vulnerable to the attack
presented in this chapter.
Chapter 5
Ambient Factors on Clock Glitches
In this chapter we investigate the eﬀect of ambient temperature on clock glitches based
on experiments done on an AVR microcontroller. This chapter is based on [KHEB14].
Fault attacks pose a serious threat to cryptographic implementations. In the
worst case, a single fault can reveal the entire secret key, which has been shown
to be feasible by many researchers in the last decade. There exist several techniques
to inject faults; the most prominent ones modify the power supply or the
clock source by injecting spikes or glitches. Other methods have proven even
more powerful, such as optical fault induction, which allows a precise localization of the
fault injection, global and local EM pulses, or temperature variations. In this chapter, we
first evaluate the impact of ambient temperature and supply voltage on fault attacks,
and then combine these techniques to improve the performance of practical fault
attacks.
In principle, fault attacks can be either non-invasive, semi-invasive, or invasive.
Non-invasive fault attacks do not require a modiﬁcation of the targeted device. Vari-
ations in the supply voltage, the clock signal, or the temperature are used to force a
faulty behaviour during the calculation of the microcontroller. Semi-invasive attacks
require the de-capsulation of the chip package to, for example, be able to inject faults
using light. Exposing the opened chip to an intense light source (e.g., laser beam,
ﬂash light) or using small needles to probe single wires on the metal layer of the chip
are typical techniques used in the past. Invasive attacks make modiﬁcations in the
chip (e.g., add additional wire connections, cutting wires, etc.) that can sometimes
lead to the destruction of the device. Clock glitch, power glitch and temperature attacks,
which are considered in this chapter, are typically non-invasive and do not require
de-capsulation or modification of the device.
After injection of a fault, several analysis techniques can be applied to reveal the
secret key, e.g., Differential Fault Analysis (DFA) [BS97], Collision Fault Analysis
(CFA) [Hem04], or Ineffective Fault Analysis (IFA) [BS03]. The choice of
analysis technique depends on various factors, such as which algorithm is
attacked and whether the injected fault affects the program flow or the processed data.
A detailed overview of different kinds of fault attacks can be found
in the work of Bar-El, Choukri, Naccache, Tunstall, and Whelan [BECN+06];
Verbauwhede, Karaklajic, and Schmidt [VKS11] also published a classification of current
fault-injection techniques and countermeasures to prevent them.
In 2011, Balasch, Gierlichs, and Verbauwhede [BGV11] performed an in-depth
analysis of the eﬀects of clock glitches on an 8-bit AVR microcontroller. They did
not target a speciﬁc implementation but evaluated the inﬂuence of clock glitches on
the instruction-execution pipeline of the microcontroller. Results show that the fetch
as well as the execute stage of the pipeline can be aﬀected by clock glitches. The
work presented in this chapter in part extends their research in the sense that we
additionally investigate the impact of high-temperature on the same platform (this
has not been done before to the best of the author’s knowledge).
In this chapter, we present various non-invasive fault attacks on an AVR ATmega162
microcontroller. We performed clock-glitch fault attacks and evaluated the
impact of high ambient temperature with respect to improving the performance of the
attack. We injected faults while varying the fault-injection time and shape at
two different device temperatures: 25 °C and 100 °C. We further analysed the impact
of the clock frequency by repeating the experiments with the device clocked
at 10 MHz and 20 MHz. All attacks were performed using a low-cost custom-made
FPGA board that allows injection of highly parametrized clock glitches. The obtained
results and contributions of this chapter can be summarized as follows:
• We show that with increased temperature, the device under attack gets more
sensitive to clock-glitch attacks, i.e., we were able to inject faults that were not
injected at room temperature.
• We also show that with increased temperature, the time frame in which the device
is sensitive to glitches becomes larger, which makes the device more susceptible
to practical attacks.
• We further demonstrate that individual AVR instructions can be simply re-
peated by inducing glitches (independent of the ambient temperature). In con-
trast to related work in [BGV11], we identify that the program counter is not
incremented due to a glitch and that no instructions are skipped.
• We are able to insert new random instructions during the program ﬂow without
skipping other instructions.
• We conﬁrm the outcomes of [BGV11] and show that with our setup we were
able to change the opcode of individual instructions, e.g., changing an ADD
instruction to a MOV, or to change the operand addresses, e.g., changing the
operand address from register R5 to R14.
The rest of the chapter is structured as follows. In Section 5.1, we give a brief
overview on related work. In Section 5.2 we describe the used setup to inject faults and
to perform heating experiments. Section 5.3 describes the experiments and Section 5.4
presents the results of our work and discusses the details of the induced fault types.
A discussion of the obtained results is given in Section 5.5 and conclusions are drawn
in Section 5.6.
5.1 Related Work
A large number of successful fault attacks have been reported in recent years. It
has become evident that hardware as well as software implementations of symmetric
and asymmetric cryptographic primitives are vulnerable. Kömmerling et al. [KK99]
recognized the threat of fault attacks as early as 1999 and proposed some low-cost
protection concepts in their work.
Many papers presented successful attacks by intentionally modifying the power
supply of a device during cryptographic operations. For example, Choukri and Tun-
stall [CT05] presented an attack in 2005 where the number of rounds of round-based
block ciphers can be reduced by injecting faults. They used power-supply glitches
for that purpose. A similar fault-injection technique was applied by Schmidt and
Herbst [SH08] who targeted an RSA implementation that makes use of the square-
and-multiply algorithm. Selmane, Guilley, and Danger presented underpowering at-
tacks in 2008 [SGD08]. They caused timing violations to attack an AES implemen-
tation on a smart card.
There also exists related work on electromagnetic glitch attacks. Dehbaoui et
al. [DMM+13], for example, presented an attack on AES in 2013. They injected
transient faults by using electromagnetic pulses. For their fault injection, no physical
access to the attacked device is required, and they show the applicability by modifying
the round counter of an AES implementation.
Clock glitches in particular have been exploited by Fukunaga et al. [FT09] in
2009 to attack a wide range of block ciphers implemented on a large-scale integrated
circuit (LSI). They reduce the clock period to modify the internal state of the cipher
through setup-time violations. A detailed description of the effects of clock glitches
on integrated circuits is given in [ADN+10]. In that paper, the authors confirm their
theoretical assumptions by attacking the AES block cipher implemented on an FPGA.
It has been shown in the past that by tampering with the clock signal, the power
supply voltage, or with electromagnetic pulses, faults can be injected that mainly
cause timing violations in the digital circuit. Temperature fault attacks, in contrast,
have been shown to be effective against data memory. One of the first
to demonstrate successful temperature attacks was Skorobogatov [Sko02], who
performed data-retention attacks on different SRAM chips in 2002. He decreased
the ambient temperature of these chips by cooling the devices down to −20 °C and
below. He showed that the data is effectively frozen and can still be read out some
seconds after power down. Samyde, Skorobogatov, Anderson, and Quisquater published
similar experiments in the same year [SSAQ02]. Another similar experiment
was done by Müller and Spreitzenbarth [MS11] in 2011. They developed a tool
called FROST (forensic recovery of scrambled telephones), which allows recovering
the RAM content of modern Android smartphones. The tool allows retrieving disk
encryption keys from RAM, and the approach is comparable to cold boot attacks on
PCs [HSH+08].
While low temperatures and cooling increase the data-retention and remanence
time, high temperatures and heating can change the memory content: Quisquater
and Samyde [QS02] were among the first to observe that high temperatures
cause memory errors after hours of extensive heating. Govindavajhala and Appel [GA03]
were able to induce errors into memories using a 50 watt spotlight clip-on
lamp. By heating a device up to 100 °C, they were able to inject faults with a probability
of 71.4%. Recently, Hutter and Schmidt [HS13] presented heating fault attacks
on an AVR microcontroller in 2014. They operated the device above the temperature
specification (> 125 °C). The authors verify the efficiency of this high-temperature
attack by successfully attacking an RSA implementation.
The impact of temperature in combination with power or clock glitch attacks
has not been analysed in prior work. In this chapter, we therefore address the open
research question of whether the sensitivity to glitch attacks is affected by temperature
and, if so, to what extent.
5.2 The Fault Injection Setup
In this section, we describe the fault-injection setup we used. It is based on a custom-made
prototyping board featuring a flexible Field Programmable Gate Array
(FPGA). Afterwards, we give an overview of the heating process that lets the device
under attack operate in a higher temperature environment. Finally, we give a brief
introduction to the targeted AVR family of microcontrollers and describe the basic
architecture and instruction set.
5.2.1 Fault Board for Clock Tampering
In order to inject faults during the computation of a microcontroller, we designed
a custom-made Printed Circuit Board (PCB). This board features a Xilinx
Spartan-6 (XC6SLX45) FPGA and communicates with a PC over a USB-over-serial
connection. It is equipped with many I/O pins that can be used to connect
a wide range of microcontrollers or other FPGAs. Figure 5.2 shows the setup on the
left side of the figure.
We mainly used two pins of the FPGA to generate clock glitches for the microcontroller.
The first pin provides a clock signal that can be adapted by the FPGA
(meaning that we are able to change the clock duration and edges individually). The
second pin provides a trigger signal that indicates the starting point of the glitch
injection. In addition to these two pins, we used two power pins of our fault board to
supply the microcontroller. We set the power supply to 3.3 V in our experiments,
which is within the normal specification range of the AVR.
Figure 5.1 shows the experimental setup as a block diagram. The clock and
the trigger signals were captured by a digital storage oscilloscope, a
PicoScope 5203 from Pico Technology. The oscilloscope and
the fault board are both connected to a computer that runs Matlab. Via Matlab scripts
we were able to automatically configure our fault board (e.g., setting different clock-glitch
parameters) and to start and stop individual measurements of the clock and
trigger signal.
[Figure 5.1: block diagram; labelled connections: clock, trigger, config, traces.]
Figure 5.1: Block diagram of the experimental setup.
In order to heat up the microcontroller and to evaluate the influence of heating
during clock-glitch injections, we placed the device on top of a heating plate. Our
custom-made fault board is connected to this microcontroller via insulated copper
wires. These wires have a diameter of 0.2 mm and allow the microcontroller to be placed
in the middle of the heating plate while our fault board is kept away from excessive
heat. Figure 5.2 shows the setup on the right side of the figure.
Figure 5.2: The experimental setup: The fault board is located on the left side and
the microcontroller is connected to it using thin copper wires. This allows the
microcontroller to be placed in the middle of the heating plate.
Clock-glitch generation. We applied an approach similar to the one presented in
[ADN+10,ESH+11] in order to inject clock glitches into the target device. A block
diagram of this clock-generation unit is shown in Figure 5.3. The clock generation
works as follows. The hardware module takes as inputs a reference clock signal $clk$
and a glitch-enable signal $gl_{en}$. By using the reference clock signal $clk$, we are able to
generate phase-shifted versions of it, which we further denote by $clk_{s1}$ and $clk_{s2}$. For
this, we used two Digital Clock Managers (DCMs) of the FPGA that provide these
phase-shifting capabilities. Furthermore, we denote by $gl_{act}$ the time when the glitch is
active. The $gl_{act}$ signal is generated using a two-input AND gate. One input is the
synchronized $gl_{en}$ signal, named $gl_{en,sync}$, while $clk_{s1}$ serves as the second input. As
an output, the clock-generation unit provides a new clock signal, further denoted by
$clk_{gl}$, that includes an inserted glitch.
[Figure 5.3: block diagram; labels: DCM1 ($d_1$), DCM2 ($d_2$), Multiplexer (S1, S2), D, Q, C, ENB, &; signals: $clk$, $gl_{en}$, $gl_{en,sync}$, $clk_{s1}$, $clk_{s2}$, $gl_{act}$, $clk_{gl}$.]
Figure 5.3: Block diagram of the clock-generation unit using two Digital Clock Manager
(DCM) blocks of the FPGA.
The shape of the clock glitch can be parameterized by two values, $d_1$ and $d_2$.
The first value $d_1$ represents the starting time when the glitch is inserted. The second
value $d_2$ represents the ending time of the glitch. To be more exact, these two values
represent the phase shifts of $clk_{s1}$ and $clk_{s2}$ and define the final shape of the inserted
clock glitch. Figure 5.4 shows all involved signals and the final clock $clk_{gl}$ for a small
value of $d_1$ (meaning that the clock glitch is started very early after a positive clock
edge). Figure 5.5 shows the signals for a bigger value of $d_1$ (meaning that the clock
glitch is started right before the end of the high phase of the clock).
The figures also show the low time of the clock signal, denoted by $t_{low}$. A closer
look at the two figures shows that the low time of the glitch-injected
clock signal $clk_{gl}$ is different and depends on the parameter $d_1$. In particular,
$t_{low,gl} = t_{low} - d_1$, so the low time becomes shorter the higher the value of $d_1$.
This means that the low phase of the clock becomes shorter the later the clock glitch is
injected during the high phase. Both parameters $d_1$ and $d_2$, and also
the decreased low time $t_{low,gl}$, can be used to cause faulty computations on the
microcontroller.
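As a small worked example of this relation, the following Python sketch computes the shortened low time and compares it against a minimum low time. It assumes a 50% duty cycle for the reference clock, which is not stated in the text, and uses the 25 ns minimum low time listed in Table 5.1 as the default threshold; the function names are hypothetical.

```python
def glitched_low_time_ns(period_ns: float, d1_ns: float, duty_cycle: float = 0.5) -> float:
    """t_low,gl = t_low - d1, with t_low derived from the period (duty cycle is assumed)."""
    t_low = period_ns * (1.0 - duty_cycle)
    if not 0.0 <= d1_ns <= t_low:
        raise ValueError("d1 must lie between 0 and t_low in this sketch")
    return t_low - d1_ns


def violates_min_low_time(t_low_gl_ns: float, min_low_ns: float = 25.0) -> bool:
    """True if the glitched low time is below the given minimum low time."""
    return t_low_gl_ns < min_low_ns


# 10 MHz reference clock (T = 100 ns): a late glitch shortens the low time further.
early = glitched_low_time_ns(100.0, 10.0)
late = glitched_low_time_ns(100.0, 30.0)
```

Under these assumptions, the early glitch leaves a low time of 40 ns, whereas the late glitch leaves only 20 ns, which falls below the 25 ns threshold.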
[Figure 5.4: timing diagram of the signals $gl_{act}$, $clk_{s1}$, $clk_{s2}$, $gl_{en}$ and $clk_{gl}$, annotated with $d_1$, $d_2$, $t_{low}$ and $t_{low,gl}$.]
Figure 5.4: Clock-glitch generation for a small value of $d_1$.
[Figure 5.5: timing diagram of the signals $gl_{act}$, $clk_{s1}$, $clk_{s2}$, $gl_{en}$ and $clk_{gl}$, annotated with $d_1$, $d_2$, $t_{low}$ and $t_{low,gl}$.]
Figure 5.5: Clock-glitch generation for a big value of $d_1$. Note that $t_{low,gl}$ is lower
compared to $t_{low}$.
For the experiments, we used and evaluated the impact of two different clock
frequencies, i.e., 10 MHz (T = 100 ns) and 20 MHz (T = 50 ns). Figure 5.6 shows the
measured clock signals for three different [$d_1$, $d_2$] settings, once for a reference clock
frequency of 10 MHz and once for a reference clock frequency of 20 MHz.
These figures also show the relationship between $d_1$ and $t_{low,gl}$. The corresponding
settings for [$d_1$, $d_2$] are shown in the legends of these plots. Values for $d_2$ between
7.0 ns and 50.0 ns were used for the setting of $f_{clk}$ = 10 MHz. This corresponds to glitch
frequencies between 20 MHz and 142 MHz. For the setting of $f_{clk}$ = 20 MHz, values
for $d_2$ between 7.0 ns and 25.0 ns were used. The clock glitch is inserted after a trigger
event on a predefined pin of the FPGA. Defining the number of clock cycles between
the trigger event and the glitch insertion allows precise control of the point in time
when the clock glitch actually takes effect.
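The quoted glitch frequencies follow from treating $d_2$ as the period of the inserted glitch, which is how the numbers in the text relate to each other. A one-function Python sketch (the function name is hypothetical):

```python
def glitch_frequency_mhz(d2_ns: float) -> float:
    """Glitch frequency in MHz for a glitch period of d2 nanoseconds."""
    if d2_ns <= 0:
        raise ValueError("d2 must be positive")
    return 1000.0 / d2_ns


slowest = glitch_frequency_mhz(50.0)  # upper end of the d2 range at 10 MHz
fastest = glitch_frequency_mhz(7.0)   # lower end of the d2 range
```

By the same calculation, the upper end of the $d_2$ range used at 20 MHz (25.0 ns) corresponds to a 40 MHz glitch.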
5.2.2 Heating Plate with Temperature Measurement
For heating up the microcontroller, a laboratory heating plate from Schott Instruments
(SLK1) was used. It does not allow the temperature to be controlled accurately
by a control system, but measuring the temperature and regulating the heating power
turned out to be sufficient for the performed experiments. For the temperature
measurements we used a PT100 sensor element. Temperature sensors based on the
PT100 sensor element are very common in industrial applications. The resistance of
the PT100 changes with temperature; at 0 °C it equals
100 Ω. Several approaches exist to measure the resistance (or a proportional
value like voltage drop or current), depending on the intended accuracy. For
the experiments in this work, we used the resistance measurement function of a
Fluke 111 TRUE RMS multimeter to acquire the resistance value. The
resistance of the connection wires was subtracted from the value shown on the multimeter,
and the temperature was then calculated with an online tool
(http://www.thermibel.be/documents/pt100/conv-rtd.xml).
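The conversion from a measured resistance to a temperature can be sketched in Python as follows. This is not the online tool used in the experiments: it assumes the standard IEC 60751 characteristic for platinum sensors above 0 °C, and the function names and the wire-resistance parameter are hypothetical.

```python
import math

R0 = 100.0      # PT100 resistance at 0 degC, in ohms
A = 3.9083e-3   # IEC 60751 coefficient, 1/degC (assumed sensor characteristic)
B = -5.775e-7   # IEC 60751 coefficient, 1/degC^2


def pt100_resistance(temp_c: float) -> float:
    """Resistance in ohms for temperatures at or above 0 degC."""
    return R0 * (1.0 + A * temp_c + B * temp_c * temp_c)


def pt100_temperature(measured_ohm: float, wire_ohm: float = 0.0) -> float:
    """Temperature in degC after subtracting the connection-wire resistance."""
    r = measured_ohm - wire_ohm
    if r < R0:
        raise ValueError("this sketch only covers temperatures at or above 0 degC")
    # Solve R0 * (1 + A*T + B*T^2) = r for T, taking the physically meaningful root.
    return (-A + math.sqrt(A * A - 4.0 * B * (1.0 - r / R0))) / (2.0 * B)
```

With these coefficients, a reading of about 138.5 Ω (after wire compensation) corresponds to the 100 °C operating point used in the experiments.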
Remark: During the experiments it turned out that the heating plate introduces
electromagnetic interference, which lowers the signal quality. This did not influence
our experiments, but if power measurements are performed in addition, we suggest
using a resistor-based heating element for heating up the device under test. Furthermore,
such a resistor-based heating element can be controlled very easily and the temperature
can be adjusted more accurately than with a heating plate. We used this resistor-based
heating element to perform our final power measurements.
5.2.3 The Investigated Microcontroller - AVR ATmega162
We decided to analyze heating effects during clock glitch attacks on an ATmega162.
The reason for this choice is that the microcontroller is commonly used, especially in
the field of embedded systems, and has been widely investigated by the crypto-research
community due to its ease of use, availability, and architecture documentation.
The ATmega162 is an 8-bit low-power microcontroller from Atmel. It is part of the
AVR family and is based on a RISC architecture. The ATmega162 supports 131 instructions,
most of which are single-cycle operations. The device can be clocked at up to
8 MHz with an internal clock source or up to 16 MHz using an external clock (depending
on the supply voltage). It provides 32 internal general-purpose registers (denoted
by R0 ... R31) that can be used by applications. Some of them are dedicated to special
functions, such as the registers R0 and R1 which store the result of a multiplication,
or the pairs (R26,R27), (R28,R29), and (R30,R31) which can be used for memory addressing
purposes (they are referred to as registers X, Y, and Z in the documentation).
It further has 1 kB of internal SRAM and 16 kB of programmable flash memory.
Further information about the ATmega162 can be found in the datasheet [Atm03].
Table 5.1 summarizes the parameters which are important for the further experiments.
Note that the minimum clock signal low time is of special interest because if
it gets lower than 25 ns, one can tamper with the program counter of the device.
Table 5.1: AVR ATmega162: External Clock Drive
                   VCC = 2.7-5.5 V    VCC = 4.5-5.5 V
Parameter          Min.     Max.      Min.     Max.      Units
Clock Frequency    0        8         0        16        MHz
Clock Period       125      -         62.5     -         ns
High Time          50       -         25       -         ns
Low Time           50       -         25       -         ns
Period Change      -        2         -        2         %
5.3 The Experiments
In the following, we describe the experiments in detail. First, we describe the
microcontroller program which was written to analyze the impact of clock glitches as well
as temperature on the investigated instructions. Second, we explain the measurement
process where we used a Matlab script to control the entire fault-injection process.
The AVR microcontroller program. At the beginning of the execution, some
device-specific configurations are performed, including setting up the serial interface
for communication with the control computer as well as setting the clock source to an
external clock. After that, the program runs in a loop and waits for instructions from
the measurement PC. Specific commands are used to select different initialization
values for the registers and different instructions which should be affected by the
clock glitch. After reception of a command (further denoted by cmd), the registers are
initialized to known values, which enables us to evaluate the impact of any
fault that occurs and guarantees a defined start state. After raising a trigger pin and
executing a fixed number of NOP instructions, the targeted instruction is executed.
We have surrounded the targeted instruction with NOP instructions to avoid any side
effects, introduced by the clock glitch, on the rest of the program flow. The fixed
number of NOP instructions between the trigger event and the attacked instruction
further allows us to precisely select the point in time at which the clock glitch should occur.
At the end of the execution, the current register states are transferred to the control
computer to evaluate the impact of the current glitch shape on the instruction.
Evaluation process. The following experiments are all performed for ambient
temperatures of 25 °C (room temperature) and 100 °C, respectively. Note that the
maximum temperature rating is specified as 125 °C in the datasheet, so we do not
operate the device beyond its specifications. The whole procedure was automated
using a Matlab script in order to maximize the performance and minimize human
interaction. The following values had to be defined before the script was started:
d1,start    Start value for d1
d1,end      End value for d1
d2 − d1     Shape of the inserted glitch (glitch duration)
∆d1         Step size for increasing d1
N           Number of repetitions for the same glitch shape
cmd         Command defining the targeted instruction
Reg_ref     Reference register values for the current command
f_clk       Clock frequency for the microcontroller
For each parameter set [d1, d2, cmd], the following procedure is then performed N
times:
1. Conﬁgure the fault board with the current clock glitch parameters [d1, d2].
2. Arm the clock glitch function on the fault board to insert the clock glitch with
the deﬁned shape after a trigger event.
3. Send the command cmd to the microcontroller.
4. Wait for the response of the microcontroller.
5. Compare the received register contents with the reference register contents
Regref of the reference execution without clock glitch and store the values if
there are deviations.
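The procedure above can be sketched as a nested parameter sweep. The following Python
sketch is not the MATLAB script used in the experiments: SimulatedSetup, its methods,
the command name, and the toy fault rule are hypothetical stand-ins for the fault board
and the microcontroller, introduced only to illustrate the five steps.

```python
from dataclasses import dataclass


@dataclass
class SimulatedSetup:
    """Hypothetical stand-in for the fault board plus the microcontroller."""
    d1: float = 0.0
    d2: float = 0.0
    armed: bool = False

    def configure(self, d1, d2):
        # Step 1: set the glitch shape on the fault board.
        self.d1, self.d2 = d1, d2

    def arm(self):
        # Step 2: the glitch fires once, after the next trigger event.
        self.armed = True

    def run_command(self, cmd, reg_ref):
        # Steps 3 and 4: send cmd and wait for the register dump.
        regs = dict(reg_ref)
        # Toy fault model (an assumption, not measured behaviour):
        # a glitch placed early in the cycle corrupts R16.
        if self.armed and self.d1 < 10.0:
            regs["R16"] = 0
        self.armed = False
        return regs


def sweep(setup, cmd, reg_ref, d1_start, d1_end, delta_d1, duration, n):
    """Return (d1, d2, registers) for every response that deviates from reg_ref."""
    faults = []
    steps = int(round((d1_end - d1_start) / delta_d1)) + 1
    for i in range(steps):
        d1 = d1_start + i * delta_d1
        d2 = d1 + duration
        for _ in range(n):                 # N repetitions per glitch shape
            setup.configure(d1, d2)
            setup.arm()
            regs = setup.run_command(cmd, reg_ref)
            if regs != reg_ref:            # Step 5: store deviations only
                faults.append((d1, d2, regs))
    return faults


# Reference state after an unglitched ADD R16,R5 with R16 = 112 and R5 = 101.
reference = {"R5": 101, "R16": 213}
faults = sweep(SimulatedSetup(), "add_r16_r5", reference,
               d1_start=5.0, d1_end=15.0, delta_d1=5.0, duration=5.0, n=2)
print(len(faults), "faulty responses recorded")
```

With real hardware, only the three methods of the setup object would have to be
replaced by calls to the fault board and the serial interface; the sweep itself
stays the same.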
5.4 Results
In this section, we present two main sets of results, obtained with the
AVR microcontroller clocked at 10 MHz (T = 100 ns) and at 20 MHz (T = 50 ns).
Although the datasheet of the ATmega162 specifies a maximum clock frequency of
16 MHz, slight overclocking to 20 MHz did not cause any faulty
behavior when no clock glitch was present. We performed the additional
overclocking experiments to push the device to its limits and thereby make it more
vulnerable to glitches.
Although we experimented with various instructions, the results summarized
in this section were collected by injecting a clock glitch during the execution
phase of the instruction ADD R16,R5 (R16 ← R16 + R5), where R16 is the destination
register and R5 is the source register. For verification purposes, different source and
destination registers were used in further experiments. It turned out that a clock
glitch injected while the instruction ADD R16,R5 is executed can cause three different
types of faults, depending on the configured glitch-shape parameters:
1. Inconsistent faults aﬀecting the value of the destination register of an ADD in-
struction as well as one neighboring register.
2. Consistent faults modifying the executed instruction.
3. Consistent faults repeating the executed instruction.
In the following, we describe each fault type in more detail and provide the
experimental results.
5.4.1 Inconsistent Faults
The first type of fault is generated when a clock glitch is introduced early in the
positive clock-edge phase of the clock signal. Exemplary glitch shapes generating this
type of fault are shown in the top plot of Figure 5.6. When the device is
clocked at 10 MHz (see Figure 5.6), an additional positive clock edge is inserted around
220 ns, 6.20 ns after the previous positive clock edge. When the device is
clocked at 20 MHz, an additional positive clock edge is inserted around 95 ns, 5.75 ns
after the previous positive clock edge. This type of fault not only sets the destination
register (R16) to an incorrect value but also sets the next register (R17) in the register
bank to an unpredictable value. Although R17 is cleared most of the
time, our experiments also produced other non-reproducible
values such as 1, 17, 96, or 97. Interestingly, this type of fault
occurs only when the destination register is in the upper half of the register bank
(R16...R31); no fault is caused in registers R0...R15.
[Figure 5.6 plot: three stacked panels (top, middle, bottom) showing the glitched clock
signal, voltage [V] over time [ns] from 0 to 500 ns, each annotated with its glitch
parameters d1 and d2.]
Figure 5.6: Three different clock-glitch shapes generated with the fault board. The
frequency of the reference clock clk was set to 10 MHz (T = 100 ns) for generating this
plot (glitch length = 5 ns).
In order to further evaluate this inconsistent behavior, we slightly modified the
attacked instruction. Different destination registers were used, all located in the
upper half of the register bank. The results showed that for even destination registers
(R16, R18, ...) the current and the next register are changed by the glitch in an
unpredictable way. For odd destination registers (R17, R19, ...) the current and the
previous register are changed by the glitch in an unpredictable way. This is
interesting because there is a special AVR instruction, MOVW, which moves
two 8-bit values at once, i.e., it copies one register pair into another
register pair. A prerequisite of this instruction, however, is that both the source and
the destination register addresses are even. This matches our
finding that a glitch can modify not only a single register but also
a register pair, which is due to the underlying hardware architecture and the supported
instruction set. The instructions ADIW (add immediate to word) and SBIW
(subtract immediate from word) are further candidates for instructions that modify two
subsequent registers. A closer look at the description of these instructions, however,
reveals that they only work for registers R24...R31.
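The even/odd pairing observed above can be expressed compactly by clearing or setting
the least significant bit of the register index. The following Python sketch is only an
illustration of that observation; the function name affected_pair is hypothetical.

```python
def affected_pair(dest):
    """Return the (even, odd) register pair changed when `dest` is the destination.

    Returns None for R0..R15, where this fault type was not observed.
    """
    if not 16 <= dest <= 31:
        return None
    return (dest & ~1, dest | 1)


for d in (16, 17, 19, 5):
    print(d, affected_pair(d))
```

The pair is the same whether the even or the odd register is targeted, which is the
alignment that MOVW requires for its operands.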
Further evaluation showed that the glitch attack indeed causes the addition to be
replaced by a MOV (or MOVW) instruction. Interestingly, the following
instruction is also replaced by a MOV. We verified this by executing two subsequent
ADD instructions and attacking the first one. For glitch shapes affecting two registers,
the second ADD instruction was not executed. These results show that the opcodes of
two consecutive instructions are modified by a single clock glitch. An executed
MOVW instruction would require that the values in the two modified, subsequent registers
also appear in two other subsequent registers in the register bank, which would
have served as the source registers; this was not the case in our experiments.
This observation, together with the replacement of the instruction after the attacked ADD
instruction, makes the execution of two MOV instructions due to the clock glitch more
likely than the execution of a MOVW instruction.
If the instruction before the attacked ADD instruction is not a NOP, as it was
in the previous experiments, different results are observed for a similar glitch
shape. We replaced the preceding NOP instruction with an ADD instruction,
which led to the attack result described in Table 5.2. The attacked ADD instruction at
program counter value n+1 is replaced by a MOV instruction whose source register
is the destination register of the previous ADD instruction and whose destination
register is the destination register of the attacked ADD instruction. In the next
clock cycle, the attacked ADD instruction is executed in place of the original instruction,
in the current example a NOP.
Table 5.2: Modifications of instructions when a clock glitch for inconsistent faults is
introduced at n+1.
Prog. Counter   Execution without fault        Execution with fault
n               ADD R_{d,Add1}, R_{s,Add1}     ADD R_{d,Add1}, R_{s,Add1}
n+1             ADD R_{d,Add2}, R_{s,Add2}     MOV R_{d,Add2}, R_{d,Add1}
n+2             NOP                            ADD R_{d,Add2}, R_{s,Add2}
Summing up these first observations, too many parameters
(previous instruction, next instruction, involved registers) affect the outcome of the
injected clock glitch. Because the faults result in unpredictable values, their practical
use for attacking an implementation is limited.
5.4.2 Modified Instructions
The second type of fault appears after the first type when the glitch
parameters are increased, which essentially shifts the glitch to the right within the
clock cycle. Exemplary glitch shapes generating this type of fault are shown in the
middle plot of Figure 5.6 for 10 MHz. This type of fault modifies the executed
instruction in an unpredictable but reproducible way. For instance, the instruction
ADD R16, R5 can be turned into a copy instruction, MOV R16, R4, or into another
addition instruction with a modified source register, such as ADD R16, R14. The
resulting instructions caused by a clock glitch are summarized in Table 5.3, together
with their opcodes. The instructions are listed in the order in which they appear when
the glitch is moved to the right within the clock cycle. Comparing the opcodes of the
modified instructions with the original one, the following statements can be made:
bits in the opcode can flip from 1 to 0 and from 0 to 1, at most three bits are flipped,
and the destination register is never affected in our experiments.
Table 5.3: The resulting instructions when a clock glitch is introduced while executing
the instruction ADD R16, R5. The first row is the original instruction. Opcodes follow
the AVR encodings 0000 11rd dddd rrrr (ADD) and 0010 11rd dddd rrrr (MOV).
Instruction     Opcode
ADD R16,R5      0000 1101 0000 0101
ADD R16,R4      0000 1101 0000 0100
ADD R16,R20     0000 1111 0000 0100
MOV R16,R4      0010 1101 0000 0100
ADD R16,R14     0000 1101 0000 1110
ADD R16,R12     0000 1101 0000 1100
ADD R16,R13     0000 1101 0000 1101
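The statements about flipped opcode bits can be checked mechanically. The following
Python sketch assumes the two-register encoding from the AVR instruction set manual
(0000 11rd dddd rrrr for ADD and 0010 11rd dddd rrrr for MOV, with bit 9 carrying the
high bit of the source index and bit 8 the high bit of the destination index); the
helper names are hypothetical.

```python
ADD_BASE = 0x0C00   # 0000 11rd dddd rrrr
MOV_BASE = 0x2C00   # 0010 11rd dddd rrrr
DEST_MASK = 0x01F0  # bits 8..4 hold the five destination bits


def encode(base, d, r):
    """Encode a two-register AVR instruction with destination d and source r."""
    return base | ((r & 0x10) << 5) | ((d & 0x1F) << 4) | (r & 0x0F)


original = encode(ADD_BASE, 16, 5)
faulty = [
    encode(ADD_BASE, 16, 4),
    encode(ADD_BASE, 16, 20),
    encode(MOV_BASE, 16, 4),
    encode(ADD_BASE, 16, 14),
    encode(ADD_BASE, 16, 12),
    encode(ADD_BASE, 16, 13),
]

# Number of opcode bits that differ from the original instruction.
flips = [bin(original ^ op).count("1") for op in faulty]
# True if no destination bit differs in any faulty opcode.
dest_unchanged = all((original ^ op) & DEST_MASK == 0 for op in faulty)

print(flips, dest_unchanged)
```

Under this encoding, no faulty instruction differs from ADD R16,R5 in more than three
bits, and none of the differing bits lies in the destination field.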
We verified these changes in the executed instructions by repeating the experiments
with different sets of initialization values for the registers. This type of
fault turned out to be reproducible for a given glitch shape and a fixed temperature.
The reproducibility of these faults suggests that a glitch introduced
at a certain time within the high part of the clock cycle can force the
instruction decoder to produce the same faulty result every time. However,
the instruction is always modified such that the destination register remains unchanged:
either the operation itself or the source register changes, as shown in Table 5.3.
Additional experiments using different destination registers likewise did not yield any
modified instruction in which the destination register was changed.
The glitch parameters that produce these kinds of faults are shown in Figure 5.7 and
Figure 5.8.
5.4.3 Repeated Instructions
The third type of fault causes the instruction that is in the execution phase to be
repeated, while the remaining instructions run in the correct order. Glitch shapes
leading to this kind of fault are shown in the bottom plot of Figure 5.6.
Reconstructing the internal workings of this kind of fault is not straightforward, since we
investigate the device in a black-box model and have no access to internal information
such as the value of the program counter or the output of the instruction decoder.
For some particular glitch shapes (also shown in the bottom plot of Figure 5.6),
we consistently observed that the result of the
addition (58) was much lower than the expected value (213) when the
glitch was introduced in the execution phase of the instruction ADD R16, R5. Since the
initial value of R5 is 101 and the initial value of R16 is 112, the expected
result of the addition is 101 + 112 ≡ 213 (mod 2^8). If this addition were
repeated, the result would be
101 + 112 + 101 ≡ 314 ≡ 58 (mod 2^8).
Note that all arithmetic is performed modulo 2^8 since the Device
under Test (DUT) is an 8-bit microcontroller. This particular result of the
addition therefore indicates that the current instruction was executed twice.
There are two main scenarios that can cause this kind of behavior: either
the instruction in the pre-fetch phase is replaced with the instruction that
is currently in the execution phase, or the program counter is not updated in the
presence of such a clock glitch. To investigate which scenario occurred in our
experiments, we changed the test code from a single ADD instruction to two distinct
ADD instructions, as shown in the upper half of Table 5.4. The final value of R16
then reveals whether the program counter was updated.
However, there is a subtle point worth noting. If the program counter is
updated and the instruction that is already pre-fetched into the instruction register
is executed independently of the current value of the program counter, then the program
counter has the value n+3 after the execution of the second ADD operation,
which skips the instruction that should have been executed at program
counter value n+2. Therefore, we added two more instructions (a CLR instruction and
an LDI instruction) to the test code to verify whether the program counter is
in fact updated and how the instruction register behaves in relation to the program
counter. The complete test code is shown in Table 5.4. First, two additions are
performed, each with R16 as destination; the first uses R5 and the second
R21 as source register. The other two instructions are a clear instruction, which clears
R4, and a load instruction, which writes 0xFF to R18. The rightmost
column shows the value of R16 after each instruction to make the expected behavior
of the microcontroller easier to follow.
Table 5.4: Order of instruction calls to test the glitches that cause the executed
instruction to be repeated, where the value of R5 is 101 and the value of R21 is 117.
Prog. Counter   Instruction      Value of R16
n-1             NOP              112
n               ADD R16, R5      213
n+1             ADD R16, R21     74
n+2             CLR R4           74
n+3             LDI R18, 0xFF    74
n+4             NOP              74
This experiment confirmed that no instruction is skipped (or replaced with
a NOP instruction) and that the first addition is executed twice. This is only
possible if the program counter is not updated. Given that this type
of fault occurs whenever the negative time of the clock is very short, we believe
that the program counter update on the DUT depends on the negative time of the
clock signal. Our interpretation of the instruction modification is given in Table 5.5.
The first row of the table shows the initial value of R16 before the glitch
(112). The following rows show the program flow and how the value of R16 is
updated. Further experiments showed that the repetition of the ADD instruction is
independent of the involved registers as well as of the values stored in the registers.
Table 5.5: Interpretation of a glitch resulting in the repetition of an instruction, where
the value of R5 is 101 and the value of R21 is 117.
Prog. Counter   Instruction      Value of R16
n-1             NOP              112
n               ADD R16, R5      213
n               ADD R16, R5      58
n+1             ADD R16, R21     175
n+2             CLR R4           175
n+3             LDI R18, 0xFF    175
n+4             NOP              175
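The R16 columns of Tables 5.4 and 5.5 can be reproduced with a few lines of 8-bit
arithmetic. The following Python sketch is a toy interpreter written for this purpose,
not a model of the AVR pipeline; the function run and the tuple format are hypothetical,
and the initial values of R4 and R18 are placeholders since only R5, R16, and R21 are
given in the text.

```python
def run(program, regs):
    """Execute (mnemonic, dest, operand) tuples on an 8-bit register file."""
    regs = dict(regs)
    for op, dest, operand in program:
        if op == "ADD":
            regs[dest] = (regs[dest] + regs[operand]) % 256  # wraps at 2^8
        elif op == "CLR":
            regs[dest] = 0
        elif op == "LDI":
            regs[dest] = operand          # operand is the immediate value
        # NOP: nothing to do
    return regs


NOP = ("NOP", None, None)
test_code = [NOP, ("ADD", 16, 5), ("ADD", 16, 21),
             ("CLR", 4, None), ("LDI", 18, 0xFF), NOP]

# Interpretation of Table 5.5: the instruction at n is executed twice.
glitched_code = test_code[:2] + [("ADD", 16, 5)] + test_code[2:]

initial = {4: 0, 5: 101, 16: 112, 18: 0, 21: 117}  # R4, R18: placeholders
expected = run(test_code, initial)
glitched = run(glitched_code, initial)

print(expected[16], glitched[16])   # 74 175
```

In both runs R4 ends at 0 and R18 at 0xFF, which is how the experiment distinguishes a
repeated instruction from a skipped one.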
Similarly to the case given in Table 5.5, it can also happen that either the first or
the second execution of the repeated instruction is modified. In our experiments
we observed values in {57, 66, 67}, which can only be explained by a repeated ADD
instruction in which either the first or the second execution is
misinterpreted by the instruction decoder, again resulting in a value much lower
than the expected result.
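One possible reading of the values 57, 66, and 67 combines the repetition with the
source-register modifications of Table 5.3. The following Python sketch rests on two
assumptions that are not stated in the text: that the numbers in parentheses on the
vertical axes of Figures 5.7 to 5.9 are the R16 results of the corresponding single
modified instruction, and that the source register contents can therefore be inferred
by subtracting the initial R16 value of 112.

```python
# Axis labels of Figures 5.7 to 5.9: R16 after a single ADD R16, Rx.
single_add_result = {4: 212, 20: 228, 14: 222, 12: 220, 13: 221}

R16_INIT, R5 = 112, 101          # initial values given in the text
observed = {57, 66, 67}          # values reported for modified repetitions

explanations = {}
for reg, label in single_add_result.items():
    src_value = label - R16_INIT             # inferred content of Rx
    # One execution adds R5, the other adds Rx. Addition commutes, so
    # the order of the two executions cannot be told apart from R16.
    result = (R16_INIT + R5 + src_value) % 256
    if result in observed:
        explanations[result] = reg

print(explanations)
```

Under these assumptions, the three observed values correspond to one of the two
executions using R4, R13, or R14 as the source register instead of R5, and the final
value alone cannot reveal whether the first or the second execution was modified.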
5.5 The Influence of Ambient Temperature
Figure 5.7 and Figure 5.8 show the types of faults that can be caused for a given
difference between the glitch parameters d1 and d2 when the microcontroller is clocked
at 10 MHz and 20 MHz, respectively. In both figures, only the parameters that
consistently produce a particular type of fault are plotted. The vertical axis lists the
different types of faults that can be generated, and the horizontal axis gives the
timing of the glitch within the clock cycle. Results of the
experiments at 25 °C are plotted as blue circles (◦), and results of the
experiments in which the device was heated to 100 °C are plotted as red asterisks (∗).
A more detailed version of Figure 5.7 is given in Figure 5.9 for the interested reader.
[Figure 5.7 plot: four panels for (d2 − d1) = 4, 5, 6, and 7 ns. Horizontal axis: d1 [ns].
Vertical axis (fault type): Inconsistent, ADD R16, R4 (212), ADD R16, R20 (228),
MOV R16, R4 (100), ADD R16, R14 (222), ADD R16, R12 (220), ADD R16, R13 (221),
Repeat & Modified, Repeat Same. Legend: 25 °C, 100 °C.]
Figure 5.7: Types of glitches generated depending on the glitch parameters used and
the ambient temperature, while executing the instruction ADD R16, R5. The device
is clocked at 10 MHz for these experiments.
By increasing the ambient temperature during clock-glitch injections, we made
the following key observations:
1. Essentially the same faults that can be caused at room
temperature can also be caused at high temperature, so higher temperature has no
negative impact on obtaining the different fault types. However, the
clock-glitch injection time is shifted: the glitch needs to be injected
at a later instant within the targeted clock cycle. The reason
for this is explained in Section 5.5.1.
2. In some cases, e.g., for a clock-glitch duration of
d2 − d1 = 5 ns, the sensitivity window, i.e., the time span in which the device
is sensitive to clock-glitch injections, becomes longer at higher temperature.
This means that glitch attacks are easier to perform in practice when the device is
heated, because the time frame in which the glitch causes faults is larger.
3. For certain glitch parameters (e.g., d2 − d1 = 6 ns), faults can be caused
consistently when the device runs at high ambient temperature,
which is not possible at room temperature. The
success rate of clock-glitch attacks on this device therefore increases with
the ambient temperature.
4. When the device was clocked at 20 MHz (results shown
in Figure 5.8), the fault that repeats the executed instruction unmodified did
not occur when the device was heated to 100 °C. Although this behavior is
visible at 25 °C for d2 − d1 = 4 ns,
our experiments did not yield this kind of fault when the device was heated and
clocked at 20 MHz. At 100 °C, the repeated instructions were always
modified in the second execution in the presence of a clock glitch.
5. When the device is overclocked at 20 MHz, the longer glitch shapes, such as
d2 − d1 = 6 or 7 ns, result in a larger variety of faults when the device is heated to
100 °C. This behavior can be clearly seen in the bottom plots of Figure 5.8.
[Figure 5.8 plot: four panels for (d2 − d1) = 4, 5, 6, and 7 ns. Horizontal axis: d1 [ns],
6 to 16 ns. Vertical axis (fault type): Inconsistent, ADD R16, R4 (212),
ADD R16, R20 (228), MOV R16, R4 (100), ADD R16, R14 (222), ADD R16, R12 (220),
ADD R16, R13 (221), Repeat & Modified, Repeat Same. Legend: 25 °C, 100 °C.]
Figure 5.8: Types of glitches generated depending on the glitch parameters used and
the ambient temperature, while executing the instruction ADD R16, R5. The device
is clocked at 20 MHz for these experiments.
[Figure 5.9 plot: four panels for (d2 − d1) = 4, 5, 6, and 7 ns. Horizontal axis: d1 [ns],
6 to 16 ns. Vertical axis (fault type): Inconsistent, ADD R16, R4 (212),
ADD R16, R20 (228), MOV R16, R4 (100), ADD R16, R14 (222), ADD R16, R12 (220),
ADD R16, R13 (221), Repeat & Modified, Repeat Same. Legend: 25 °C, 100 °C.]
Figure 5.9: Types of glitches generated depending on the glitch parameters used and
the ambient temperature, while executing the instruction ADD R16, R5. The device
is clocked at 10 MHz for these experiments.
5.5.1 Temperature Derating Factor
Figures 5.7 and 5.8 clearly show that the sensitivity window for
inducing the faults is shifted to the right when the temperature is higher. We
now explain this effect with a simple example. Consider the simple
synchronous circuit shown in Figure 5.10. The registers sample the data input D at
the positive clock edge, and the same clock signal CLK is provided to all registers.
A combinational logic block is inserted between the transmitting registers RegTX and
the receiving registers RegRX. This combinational logic block has a propagation
delay tp,comb, which defines the worst-case time after an input change at which the
output has settled to a stable value. tp,comb grows with the
junction temperature, i.e., the higher the
junction temperature, the longer the propagation delay of a combinational
circuit. In industry, the derating factor KΘ is used to describe the influence of
temperature on the speed of a circuit. To fully describe the impact of PTV
(process, temperature, and voltage) variation on the speed of a circuit, derating
factors for the process (KP) and the supply voltage (KV) also exist. The
nominal timing is multiplied by the product of KΘ, KP, and KV to obtain the timing
for a specific condition. More detailed information about derating factors can
be found, for example, in Chapter 12 (p. 590) of the book Digital Integrated Circuit
Design by H. Kaeslin [Kae08].
[Figure 5.10 diagram: datain → RegTX (D/Q, CLK) → combinational logic with
propagation delay tp,comb → RegRX (D/Q, CLK) → dataout.]
Figure 5.10: A simple synchronous circuit with a combinational logic block between
two storage elements. tp,comb equals the propagation delay of the combinational logic
block.
Several intermediate values (IV1, IV2, IV3, ...) appear at the output of the
combinational logic block before it settles to the stable value. If the receiving registers
sample their input before the combinational block provides a stable value (because the
clock frequency is too high or because a clock glitch is inserted to perform a fault
attack), the result is wrong. The intermediate values that can be
observed at the output of the combinational logic block depend on the previous input
value datain(t − 1) and the new input value datain(t). Each intermediate value can
be observed for a specific time interval. With rising temperature, the
combinational logic slows down as discussed above, so the temperature influences
the signal-propagation time and therefore the fault-injection window in which a glitch
is effective. This is shown in the timing diagram in Figure 5.11. A higher
temperature both lengthens the signal-propagation intervals and shifts their
position to the right. If the same clock glitch is inserted at two different
temperatures, the type of fault differs if the receiving registers sample different
intermediate values. This is also illustrated in Figure 5.11. By applying the modified
clock signal CLKgl, the receiving registers are forced to sample the output of the
combinational logic block dataout before it has settled to a stable value. At a
temperature of 25 °C, IV3 is sampled, while at 100 °C, IV2 is sampled.
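The effect of the derating factor on the sampled intermediate value can be illustrated
numerically. In the following Python sketch, all numbers are hypothetical: the nominal
switching times of dataout, the sampling instant of the glitched clock, and the value
KΘ = 1.2 for 100 °C were chosen only so that the outcome mirrors the qualitative
behavior described above.

```python
import bisect

# Value visible at dataout before the first and after each switching time.
LABELS = ["previous", "IV1", "IV2", "IV3", "valid"]


def sampled_value(sample_time_ns, nominal_edges_ns, k_theta):
    """Return which value the receiving register captures at sample_time_ns.

    nominal_edges_ns: times at which dataout switches to IV1, IV2, IV3, and
    the final valid value under nominal conditions. Every switching time is
    scaled by the temperature derating factor k_theta.
    """
    edges = [t * k_theta for t in nominal_edges_ns]
    return LABELS[bisect.bisect_right(edges, sample_time_ns)]


nominal_edges = [5.0, 10.0, 15.0, 20.0]   # hypothetical timing at 25 °C
glitch_sample_time = 16.0                 # hypothetical early clock edge

at_25c = sampled_value(glitch_sample_time, nominal_edges, k_theta=1.0)
at_100c = sampled_value(glitch_sample_time, nominal_edges, k_theta=1.2)
print(at_25c, at_100c)
```

With these numbers the same glitch captures IV3 at the nominal temperature and IV2 on
the slower, heated circuit, while a regular clock edge at, for example, 50 ns would
capture the valid value in both cases.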
[Figure 5.11 timing diagram: signals CLK, CLKgl, datain, and dataout at 25 °C and at
100 °C over time. At each temperature, dataout passes through IV1, IV2, and IV3 before
becoming valid. Annotation: "RegTX samples data at input at this time instance because
of clock glitch".]
Figure 5.11: Timing diagram showing the influence of temperature on the result of
a clock glitch insertion. At different temperatures, different intermediate values are
sampled by the receiving registers due to the changed timing.
5.6 Conclusion
In this chapter, we aimed to answer the question of whether and how temperature
affects the success rate of clock-glitch fault attacks on cryptographic
implementations. As a target device, we evaluated the temperature impact on an
8-bit AVR ATmega162 microcontroller. In addition to this contribution, we presented
and discussed new fault types caused by clock glitches on the AVR. Our investigations
showed that it is possible to repeat individual instructions, to insert new random
instructions, and to change the opcode of instructions such that the operand address
is changed to another value. We further demonstrated that clock-glitch
attacks become more effective with increased ambient temperature. This means that it
is possible 1) to inject faults that could not be injected at room temperature and 2) to
enlarge the time frame in which the device is sensitive to faults. The latter makes
practical attacks easier to perform since the outcome is less sensitive to the exact
fault-injection time.
Chapter 6
Security of Test Compression Algorithms
In this chapter we discuss how straight-forward scan based attacks can be applied to
industrial test compression systems, and propose a countermeasure to mitigate this
vulnerability. This chapter is based on [DEG+13] and [EDGV12].
In the VLSI industry, design for testability (DfT) infrastructure is included in most
circuits for efficient testing of the final product. However, when circuits with
cryptographic algorithms are concerned, the testing functionality can be exploited by an
attacker to recover the secret key used for encryption. These attacks can be
categorized as a form of side-channel attack that targets the scan chain structure
widely deployed as a DfT technique.
Scan chains were exploited in [YWK06] to recover secret keys of Data Encryption
Standard (DES) and Advanced Encryption Standard (AES) hardware implementations.
These scan-based attacks were later extended to break public-key ciphers
in [NSY+10b,NSY+10a,DRDDN+12a,DRDDN+12b]. Some of these works take
modern test compression schemes into account and perform the attacks on a
generic structure of the schemes. XOR-tree based test compression was attacked
successfully using the signature attack in [DRDNFR11a,DRDNFR11b], while advanced
DfT structures such as X-Masking and partial scan were attacked by a new attack
in [DRDNF12].
In this chapter, we present scan based side channel attacks on popular DfT struc-
tures generated by the major test tools of leading EDA vendors: Synopsys, Cadence,
and Mentor Graphics. The speciﬁc test structures are incorporated on an AES cir-
cuit using the DfT toolkits. Success rates of the attacks are reported for various
DfT conﬁgurations. Though AES is taken as a case study, the scan attack principle
outlined in this work is also applicable for other symmetric-key ciphers which have
similar diﬀusion properties. One of the purposes of the work presented in this chap-
ter is to illustrate that the classical diﬀerential scan attack (DSA) principle [YWK06]
is still useful for attacking advanced DfT structures such as test compression with
X-Tolerant and X-Masking by enhancing it further using techniques outlined in this
chapter.
Apart from performing DSA on industrial test structures, another contribution of
this chapter is an analysis of the security provided by scan attack countermeasures.
Preliminary scan attack results on generic test compression structures were
presented in [EDGV12], along with a brief discussion of the effectiveness of scan attack
countermeasures. This chapter expands that analysis by providing
comprehensive scan attack success rates on AES in the presence of actual test
compression algorithms and X-state handling schemes included in commercial test tools,
for a wide distribution of key-dependent flip-flops (KFFs) on slices and scan chains.
Moreover, an analysis of the number of test inputs required for a successful scan
attack on AES designs with test compression is provided, and attack successes on the
scan attack countermeasures are given in detail. Finally, we propose a new
countermeasure, a noise injector, along with experimental results that demonstrate
its security against DSA for various parameters, albeit at increased test cost. Its
implementation and a comparison with other countermeasures are also provided.
The rest of the chapter is organized in the following way. In Section 6.1, we brieﬂy
describe the AES block cipher, scan attack and industrial DfT techniques. Initial
works on DSA are also described in that section together with our improvements
presented in [EDGV12]. We provide the motivation and objective for our work in
Section 6.2. The overall DSA strategy used in this chapter to analyse the industrial
test compression algorithms is outlined in Section 6.3. Section 6.4 presents our scan
attack results on test compression methods provided by the main EDA vendors. In
Section 6.5, we ﬁrst present DSA on the existing scan attack countermeasures, fol-
lowed by our new noise injector countermeasure. Finally, the chapter is concluded in
Section 6.6.
6.1 Background
6.1.1 AES
AES [DR02] is one of the widely used industrial standard block ciphers which encrypts
blocks of 128-bit messages and supports variable key sizes: 128 bits, 192 bits, or
256 bits [DR02]. AES round function consists of four operations which are applied
to the cipher state in the following order: SubBytes, ShiftRows, MixColumns and
AddRoundKey. SubBytes operation is a non-linear transformation which operates
on each byte of the state. ShiftRows rotates the bytes in each row of the state.
MixColumns multiplies each column with a Maximum Distance Separable (MDS)
matrix of branch number ﬁve. Therefore, each byte of the input will aﬀect all four
bytes of the MixColumns output which forms the basis of DSAs on AES. For instance,
a non-zero byte diﬀerence in the ﬁrst byte, as in Figure 6.1, will transform into a non-
zero diﬀerence after SubBytes and will not be aﬀected by the ShiftRows operation.
This diﬀerence will be transformed by the MixColumns operation and leads to four
non-zero diﬀerences on that column. Since the AddRoundKey operation simply XORs
the round key to the state, it has no eﬀect on diﬀerence propagation as long as the
same key is used. Additionally, there is an initial key XOR step before the encryption
starts, and this is the key that is targeted in scan attacks to recover the encryption
key of AES. In this work, we have used an open-source AES implementation from the
Gezel hardware/software co-design website [Gez] as our attack target. This design
contains a table-lookup-based S-box. The Gezel HDL code has been converted into
VHDL using the fdlvhd converter tool and synthesized into a gate-level Verilog netlist
using Synopsys Design Compiler v2009.06 with a Faraday 130 nm library, on which
the test compression structures are added.
Figure 6.1: Difference propagation of a byte difference through an AES round (SubBytes, ShiftRows, MixColumns, AddRoundKey).
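The difference propagation through MixColumns described above can be checked with a few lines of Python. This is a minimal sketch, not taken from the thesis: the helper names `xtime` and `mix_single_byte_difference` are hypothetical, and only the linear MixColumns step is modelled (SubBytes merely turns a non-zero difference into another non-zero difference, and ShiftRows and AddRoundKey do not alter its value).

```python
def xtime(b):
    """Multiply a byte by 2 in GF(2^8) modulo the AES polynomial 0x11B."""
    b <<= 1
    return (b ^ 0x11B) & 0xFF if b & 0x100 else b


def mix_single_byte_difference(d):
    """Column difference after MixColumns when only the first byte differs by d.

    MixColumns is linear, so a difference is mapped like a value. The first
    column of the MDS matrix is (2, 1, 1, 3), hence the result (2d, d, d, 3d).
    """
    return (xtime(d), d, d, xtime(d) ^ d)


# Every non-zero byte difference spreads to all four bytes of the column.
for d in range(1, 256):
    assert all(byte != 0 for byte in mix_single_byte_difference(d))

print([hex(b) for b in mix_single_byte_difference(0x89)])
```

Because all four output bytes are non-zero for every non-zero input difference, a single-byte plaintext difference always touches 32 bits of the round register, which is where the number of 32 key-dependent flip-flops comes from.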
6.1.2 Industrial Test Compression Schemes
Test compression is now widely deployed in the semiconductor industry for testing
complex circuits in a short time without compromising on test quality. An example
of a space compaction scheme using output XOR gates is shown in Figure 6.2. Each
key-dependent bit (KFF) in the ﬁgure is a part of the (crypto) state register. In the
figure, each column of scan flip-flops represents a slice. A slice containing at least one
KFF is called an active slice. Similarly, an active scan chain is deﬁned as a scan chain
containing at least one KFF. Figure 6.2 shows a scan design with 3 KFFs which are
distributed over 2 active scan chains and 2 active slices. In this work, attack success
rates are presented for a scan structure containing a maximum of 32 scan chains and
32 active slices. The number 32 is chosen because it corresponds to the 32 bits (four bytes)
affected by the AES MixColumns operation. When test vectors are generated for a
circuit by an automatic test pattern generator (ATPG), most of the test vector bits
are unspeciﬁed, or don’t care states, which are randomly ﬁlled with 0s or 1s, to enable
their use on an Automatic Test Equipment (ATE). These states can be removed from
the test vectors in an eﬃcient manner using test compression, allowing for substantial
reduction in test time and cost. The three popular industrial test compression tools
are Synopsys DFTMAX employing Adaptive Scan, Cadence Encounter Test using
OPMISR, and Mentor Graphics Tessent TestKompress using Embedded Deterministic
Test (EDT). In addition to providing space compaction, industrial DfT tools also have
provisions such as X-tolerance and X-masking for dealing with unknown values (X-
states).
Figure 6.2: Generic scan compression structure with X-state handling
Figure 6.3: X-Tolerant compressor logic structure.
Figure 6.3 shows the structure of an X-Tolerant space compressor employed in
Synopsys Adaptive Scan [WWPA03]. Due to the combination of different scan chain
outputs at the space compactor, when DSA is considered, there is further loss of
observability of the internal states of the scan chains compared to a simple XOR
tree. The other technique, X-masking, as shown in Figure 6.2, blocks certain parts
of the scan chains which contain a higher probability of unknown states (X-states).
This is achieved by inserting a mask layer between the test outputs and the test
response compactor. The mask can be static as in OPMISR or dynamic as in EDT.
As shown in the ﬁgure, in static masking a ﬁxed mask value feeds the mask decoder,
whereas in dynamic masking, part of the input vector is given to the decoder to
generate a variable mask. This masking makes DSA even more diﬃcult as some of
the KFFs are not present in the compacted output anymore. Time compaction uses
sequential logic to compact test responses. Here, multiple input signature registers
(MISRs) are employed to reduce test time. In order to minimize the need to shift
out test responses, the scan cell outputs are compressed into a signature with a
MISR [WWW06]. In Built-in Self-Test (BIST), this compressed signature is compared
with a golden signature.
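The effect of X-masking on observability can be illustrated with a small model. The sketch below is only an assumption-laden illustration: `compact_slice`, the four-chain slice, and the mask values are hypothetical, and the mask decoder is reduced to a plain bit vector (1 lets a scan chain through, 0 blocks it).

```python
from functools import reduce
from operator import xor


def compact_slice(slice_bits, mask_bits):
    """XOR together the scan-chain outputs of one slice that the mask lets through."""
    return reduce(xor, (bit & m for bit, m in zip(slice_bits, mask_bits)), 0)


# Two runs whose scan contents differ only on chain 2 (a key-dependent flip-flop).
slice_a = [1, 0, 1, 1]
slice_b = [1, 0, 0, 1]

open_mask = [1, 1, 1, 1]      # nothing masked
blocking_mask = [1, 1, 0, 1]  # chain 2 masked out

print(compact_slice(slice_a, open_mask), compact_slice(slice_b, open_mask))
print(compact_slice(slice_a, blocking_mask), compact_slice(slice_b, blocking_mask))
```

With the open mask the two compacted bits differ, so the difference on chain 2 is visible. With the blocking mask both runs give the same output, which is the loss of KFF information that makes DSA harder.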
One might expect test compression techniques to provide some security as the ex-
ternally observable values are only the compressed stimuli and compacted responses.
The theoretical security analysis of EDT was presented in [LH07], while the security
claims of Tessent TestKompress tool are investigated in a Mentor Graphics whitepa-
per [Men10]. Although no such security claims have been made to-date from Ca-
dence and Synopsys regarding their test compression methods, they oﬀer similar test
response compaction structures. The following sections show that security is not
directly provided by test compression techniques and therefore the designer should
take further measures to increase the security of the product.
6.1.3 Introduction to Scan Attack on AES
Scan design has been generally accepted as the standard method of testing chips due
to the high fault coverage and low overhead. Including scan while designing the chip
requires one additional pin to the primary I/O to serve as the test control pin (TC).
Internally, there is little impact on the design since the standard ﬂip-ﬂops (FFs) are
replaced with scan ﬂip-ﬂops (SFFs) (ﬂip-ﬂops with an input multiplexer) which are
then linked to one another creating a scan chain. TC selects between functional and
test mode operations. An example of a scan chain is shown in Figure 6.4. TC
controls each multiplexer, choosing between the normal mode input of the FF or the
output of the previous SFF in the chain.
Figure 6.4: Scan chain structure. [Diagram: four scan flip-flops SFF1 to SFF4 chained from SI to SO around the combinational logic with PI and PO, sharing Clk and TC.]
The FF registers make up the I/O to the combinational logic blocks in the chip,
so test engineers are able to manipulate the values that are input (controllability) and
view the output (observability) of each block. This is performed by multiplexing one
primary input pin and one primary output pin as the scan-in (SI) pin and scan-out
(SO) pin, respectively.
Using the SI pin while the TC is enabled, a test pattern is scanned into the scan
chain as dictated by the system clock. When the entire pattern is scanned in, the TC
is disabled, and the chip is run in normal mode for one cycle storing the responses
back in the SFFs. TC is again enabled to scan out the response, while at the same
time, scanning in a new test pattern to check for new faults previous patterns were
not able to detect.
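The shift, capture, shift sequence described above can be modelled in a few lines. This is a simplified sketch under stated assumptions: the class `ScanChain`, the function `apply_pattern`, and the bit-inverting `logic` function are hypothetical stand-ins, and zeros are shifted in during scan-out instead of the next test pattern.

```python
class ScanChain:
    """Toy model of a single scan chain wrapped around some combinational logic."""

    def __init__(self, length, logic):
        self.ffs = [0] * length   # SFF1 ... SFFn
        self.logic = logic        # hypothetical function: list of bits -> list of bits

    def shift(self, scan_in):
        """Test mode (TC enabled): shift one bit in at SI, return the bit leaving at SO."""
        scan_out = self.ffs[-1]
        self.ffs = [scan_in] + self.ffs[:-1]
        return scan_out

    def capture(self):
        """Normal mode (TC disabled): one clock cycle stores the logic response."""
        self.ffs = self.logic(self.ffs)


def apply_pattern(chain, pattern):
    """Scan a pattern in, run one functional cycle, and scan the response out."""
    for bit in pattern:
        chain.shift(bit)
    chain.capture()
    return [chain.shift(0) for _ in pattern]


# Example: the "logic" simply inverts every flip-flop value.
chain = ScanChain(4, lambda bits: [b ^ 1 for b in bits])
print(apply_pattern(chain, [1, 0, 1, 1]))
```

The same mechanism that gives the test engineer observability gives an attacker access to the round register after one functional cycle.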
The ﬁrst scan-based attack on AES appears in [YWK06] in which a two-step
approach employing chosen plaintexts for attacking the round register of AES is pre-
sented. In the ﬁrst step, the bits in the scan-chain output corresponding to the round
register are determined. This proceeds as follows: the chip is run in functional mode
for one clock cycle. The result of XORing the key with the plaintext and applying the
first round is stored in the round register. Then the chip is switched to test mode and the
contents of the bit stream are scanned out. This step is repeated for another plaintext
input diﬀering in only one byte. The obtained pattern is saved and the process is run
for all 256 possible values of the chosen byte.
The next step consists of ﬁnding the round key in a byte-by-byte manner. A
chosen plaintext containing a particular byte is applied, and the corresponding word
is observed on the round register. The corresponding ciphertext byte is determined
and XORed with the plaintext byte to determine one byte of the key. This process is
repeated until all the bytes of the key are determined.
6.1.4 Differential Scan Attack on AES
This attack exploits the fact that two particular inputs to the round function of AES
can transform into output vectors with a unique Hamming distance in between after
one round of encryption. For instance, if two plaintexts with an XOR diﬀerence of
0x01 in their least signiﬁcant byte (LSB), are encrypted using only one round of AES,
the Hamming distance between the one round output vectors can only have 18 values.
The distribution of these values is given in Figure 6.5. The y-axis denotes the number
of pairs of plaintexts, while the x-axis represents the observed Hamming distances at
the output.
Figure 6.5: Distribution of Hamming distances, for 0x01 XOR difference. [Plot: number of plaintext pairs (y-axis, 0 to 20) against observed Hamming distance (x-axis, 4 to 25).]
Note that a one byte difference in the plaintext will transform into a four byte
difference due to the structure of the MixColumns operation. In fact, analysing the
distribution of the Hamming distances for all $2^7$ pairs generated with the byte
difference 0x01 in their LSB, one can easily verify that there are four Hamming distance
values (9, 12, 23 and 24) which can only be generated by a unique pair of inputs.
Therefore, whenever such a Hamming distance is observed between the output vec-
tors, one can XOR the corresponding plaintext byte with the pre-computed values
to recover a byte of the encryption key. As an example, suppose the attacker
encrypts two plaintexts which have a difference of 0x01 in their LSB and observes a
Hamming distance of 9 between the one round output vectors. Since the attacker has
pre-computed the possible Hamming distances for all possible input pairs
with 0x01 difference in their LSB, she finds out that the inputs to the S-box must
be 0xE2 and 0xE3. Hence, the only thing that remains is to XOR these values with
the corresponding byte of the plaintexts and obtain two possible keys, one of which
is definitely the correct key byte. Proceeding in this way, the attacker can reduce the
search space for the key from $2^{128}$ to $2^{16}$, and eventually recover the 128-bit encryption
key used in that AES implementation in negligible time. Here, the number of
encryptions the attacker needs depends on the byte diﬀerence used for attacking the
system. The more unique values in the Hamming distance distribution, the better
the chances are for the attacker to be able to reduce the search space for that key
byte.
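The pre-computation and the key-guessing step described above can be sketched as follows. This is an illustrative model rather than the attack code used in this work: the function names are hypothetical, the S-box is generated from its algebraic definition instead of a lookup table, and the Hamming distance is taken over the 32-bit column that a difference in one byte can reach after one round.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo the AES polynomial x^8 + x^4 + x^3 + x + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def sbox(x):
    """AES S-box: multiplicative inverse (0 maps to 0) followed by the affine map."""
    inv = 0
    if x:
        inv = 1
        for _ in range(254):              # x^254 is the inverse of x in GF(2^8)
            inv = gf_mul(inv, x)
    out = inv
    for shift in range(1, 5):
        out ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
    return out ^ 0x63


SBOX = [sbox(x) for x in range(256)]


def hamming_distance_after_round(a, delta):
    """Hamming distance between the one-round outputs for S-box inputs a and a ^ delta."""
    d = SBOX[a] ^ SBOX[a ^ delta]
    column = (gf_mul(d, 2), d, d, gf_mul(d, 3))   # MixColumns spreads d over the column
    return sum(bin(byte).count("1") for byte in column)


def unique_pairs(delta):
    """Map each Hamming distance produced by exactly one input pair to that pair."""
    by_hd = {}
    for a in range(256):
        if a < (a ^ delta):               # visit each unordered pair once
            hd = hamming_distance_after_round(a, delta)
            by_hd.setdefault(hd, []).append((a, a ^ delta))
    return {hd: pairs[0] for hd, pairs in by_hd.items() if len(pairs) == 1}


def key_candidates(plaintext_byte, observed_hd, delta=0x01):
    """Return the two key-byte candidates for a unique Hamming distance, else None."""
    pair = unique_pairs(delta).get(observed_hd)
    return None if pair is None else {plaintext_byte ^ x for x in pair}


print(sorted(unique_pairs(0x01)))         # distances produced by a single pair
print(key_candidates(0xB8, 9))            # plaintext byte 0xB8, observed distance 9
```

For instance, if the (unknown) key byte were 0x5A and the plaintext byte 0xB8, the S-box input would be 0xE2, the observed distance would be 9, and the two candidates returned would include 0x5A.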
The simplicity of the attack makes it applicable to designs with a simple
compaction scheme such as an XOR tree compactor. The XOR tree compactor is a
simple scheme which XORs the outputs of all scan chains in every clock cycle, thereby
saving the space required to keep the test outputs. However, this XOR operation can
aﬀect the observed Hamming distances depending on the distribution of KFFs over
the scan chain structure. In [DRDNFR11b], Da Rolt et al. analyse the eﬀect of an
XOR compaction on the basic attack and show that the attack is still applicable to
these systems but some additional work is required to successfully recover the key.
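How an XOR tree changes the observed Hamming distance can be seen in a small model. The sketch assumes a single-output XOR tree that emits one compacted bit per slice; `compacted_hd` and the example placements are hypothetical.

```python
def compacted_hd(diff_bits, slice_of):
    """Observed Hamming distance after a single-output XOR-tree compactor.

    diff_bits[i] is 1 if KFF i differs between the two runs, and slice_of[i]
    is the slice holding KFF i (a hypothetical placement). Differences that
    share a slice are XORed together, so pairs of them cancel.
    """
    parity = {}
    for bit, s in zip(diff_bits, slice_of):
        parity[s] = parity.get(s, 0) ^ bit
    return sum(parity.values())


diff = [1, 1, 1, 0]                          # actual Hamming distance is 3
print(compacted_hd(diff, [0, 1, 2, 3]))      # one KFF per slice
print(compacted_hd(diff, [0, 0, 1, 2]))      # two differing KFFs share slice 0
```

When every KFF sits on its own slice the full distance is observed. When two differing KFFs share a slice their contributions cancel, so the observed distance is never larger than the actual one and depends on the KFF distribution.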
In this chapter, we improve upon this approach and extend the scan-based dif-
ferential attack [DRDNFR11a,DRDNFR11b] to test compression schemes which use
X-masking logic as well as response compactors. For comparability with the previous
work, expected and observed results are given in the next section for a collection of
distinct distributions of KFFs over the scan chain structure of a generic test compression
scheme.
6.2 Motivation
As described in Section 6.1, commercial EDA tools provide several variants of space
and time compaction techniques. The X-tolerant/ X-masking space compaction and
MISR-based time compaction logic present in these tools are much more complex than
simple XOR trees assumed in some of the previous works. These complex test struc-
tures are very often used in practical cryptographic circuits to meet current market de-
mands of reducing test time, achieving lower product cost, and faster time-to-market.
In this chapter, these advanced test compression techniques are targeted to demonstrate
their vulnerability to the DSA introduced in [YWK06] and improved in [DRDNFR11a]
and [EDGV12].
Scan attack countermeasures, such as partial scan [IYHF09] and scan chain scram-
bling [HFB+04] have been proposed to increase confusion on the scan out data.
In this work, the vulnerability of these countermeasures to DSA is investigated.
The countermeasures are emulated in software, and DSA is performed to illustrate
their susceptibility to scan attacks. We enhance the scan attack principles outlined
in [YWK06,DRDNFR11a] by combining multiple attacks with distinct input diﬀer-
ences to improve the attack success rate when applied to commercial test compres-
sion schemes and scan attack countermeasures. Moreover, we present comprehensive
attack results for the popular space and time compaction techniques and countermea-
sures for diﬀerent distributions of key information on the scan infrastructure, which
was not extensively considered earlier and is addressed in detail in this work. Another
unique feature of the current work is the investigation of the variation of success rates
with respect to the number of test inputs.
6.3 Attack Strategy
Building upon the DSA basics given in Section 6.1.4, the following assumptions are
made in the DSA presented in this work:
• The scan enable pin can be controlled by the attacker.
• The cryptographic algorithm is known to the attacker.
• The time required to execute the target operation is known to the attacker.
• The test structure type (space compaction, time compaction, X-tolerance, X-
masking) is known to the attacker. However, the attacker may not be aware of
the exact details of the test scheme (such as compression ratio, number of scan
chains).
The ﬁrst paper that discusses the model for the attacker is [YWK04], whereas the
attacker model speciﬁc to AES is discussed in [YWK06]. In addition to this, an
implicit assumption in DSA is that all FFs (except the KFFs) have the same value
after one encryption round. Thus the diﬀerential process eliminates the eﬀect of these
FFs.
Diﬀerent from the previously published works, evaluating test compression schemes
from diﬀerent EDA vendors requires modiﬁcations to the attack strategy in compar-
ison to the attacks against generic test compression algorithms. In fact, when X-
masking or X-tolerant logic is considered, one has to use a modiﬁed version of the
key guessing strategy outlined in Section 6.1.4. The main differences between the attack
strategy used in this work and the previous works are that multiple byte differences
are used for attacking designs, and that the attacks are repeated for random test inputs
depending on the testing scheme under security analysis.
The DSA outlined in this paper is performed on the software emulation of the DfT
structures. The approach taken in our attack is divided into two phases: an online
phase and an oﬄine phase. In the online phase, testing structure of industrial DfT
solutions is derived by inserting DfT functionality to an AES design using the DfT
tools. Also, possible inputs to the round function of AES are derived for corresponding
input diﬀerences in one byte. In the oﬄine phase, the scan attack is performed by
emulating the DfT structures in a C program, making use of the XOR diﬀerences
and possible inputs derived in the online phase. An attack is deemed to be successful
whenever the correct key byte is suggested as the most likely key byte after the attack.
Success rates are computed over 10 000 random permutations of KFFs.
All the attack successes presented in the following sections are obtained employing
attack codes written in C customized for the speciﬁc test compression structure or
scan attack countermeasure. Simulations are performed on a 64-bit x86-64 Intel Core
i7-2600 CPU running at 3.4 GHz having 7 virtual processors and 8GB of RAM.
Through software simulations, we made further observations on the eﬀects of vari-
ous diﬀerences, other than the ones given in previous works [DRDNFR11b,DRDNFR11a].
We implemented the diﬀerential scan-attack described in [DRDNFR11b] on a C im-
plementation of AES-128, and observed that there are seven more one-byte diﬀerences
(0x4A, 0x69, 0x6B, 0x89, 0x92, 0xE7 and 0xF5) which result in having four unique
Hamming distance values after one round of encryption. We further observed that
using XOR difference 0xD1 results in five unique Hamming distances, which proves
useful especially if the KFF distribution is unknown. We additionally observed that
diﬀerent XOR diﬀerences between inputs can lead to having the same unique Ham-
ming distance values. The unique Hamming distance values corresponding to each
XOR diﬀerence are given in Table 6.1.
Table 6.1: XOR differences and corresponding unique Hamming distances after one
round AES encryption
XOR difference            Unique Hamming distances
0xD1                      {5, 9, 12, 23, 24}
0x01, 0x89, 0x92, 0xE7    {9, 12, 23, 24}
0x4A, 0x69, 0x6B, 0xF5    {8, 9, 12, 24}
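A search over all one-byte XOR differences, in the spirit of the one that produced Table 6.1, could look like the sketch below. It is not the C code used in this work: the helper names are hypothetical, the S-box is computed rather than tabulated, and the distance is again taken over the 32-bit column reached after one round.

```python
from collections import Counter


def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo the AES polynomial 0x11B."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def sbox(x):
    """AES S-box: inverse in GF(2^8) (0 maps to 0), then the affine transformation."""
    inv = 0
    if x:
        inv = 1
        for _ in range(254):
            inv = gf_mul(inv, x)
    out = inv
    for shift in range(1, 5):
        out ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
    return out ^ 0x63


SBOX = [sbox(x) for x in range(256)]


def round_hd(a, delta):
    """Hamming distance after one round for S-box inputs a and a ^ delta."""
    d = SBOX[a] ^ SBOX[a ^ delta]
    return sum(bin(b).count("1") for b in (gf_mul(d, 2), d, d, gf_mul(d, 3)))


def unique_hamming_distances(delta):
    """Sorted Hamming distances that are produced by exactly one input pair."""
    counts = Counter(round_hd(a, delta) for a in range(256) if a < (a ^ delta))
    return sorted(hd for hd, n in counts.items() if n == 1)


table = {delta: unique_hamming_distances(delta) for delta in range(1, 256)}

# Print the rows that can be compared against Table 6.1.
for delta in (0xD1, 0x01, 0x4A):
    print(f"0x{delta:02X}: {table[delta]}")
```

Sorting `table` by the number of unique distances per difference is one way to pick the differences that give the attacker the most key guesses per set of encryptions.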
While the attacks on test compression schemes with X-masking are dependent on
the distributions of KFFs in the active slices, attacks on test compression schemes
with X-tolerant logic are dependent on the distributions of KFFs in the active scan-
chains. This eﬀect has not been considered in previous works, but forms the basis of
our analysis strategy.
6.3.1 Distributions Considered
The AES round register can be spread in the scan structure in several ways unknown
to the attacker. This work considers the following distributions of active slices and
active scan chains to cover the most typical scenarios:
• 32 active scan chains with active slices varying from 16 to 32.
• 32 active slices with active scan chains varying from 16 to 32.
• 24 active scan chains with 24 active slices.
In AES, one byte of input difference can affect up to 32 bits of the round register
due to the MixColumns operation after one round. Therefore, there can be at most
32 KFFs in the scan structure. This indicates that there will be at most 32
active scan chains when each scan chain contains one KFF. Similarly, there will be at
most 32 active slices when each slice contains one KFF. These represent two extreme
cases. To consider other practical scenarios, the number of active scan chains is
varied from 32 to 16 keeping active slices ﬁxed at 32, and vice-versa. An intermediate
scenario of 24 active scan chains and 24 active slices is also considered to show the
eﬀect of diﬀerent KFF distributions when the number of active slices and active
scan chains is ﬁxed. Table 6.2 shows the distributions used in the paper. Even if AES
forms a small part of an industrial SoC consisting of millions of gates, the distributions
considered in this paper are still applicable. Independent of the size of the design, the
maximum number of active slices or active scan chains is always equal to 32 (as only
32 FFs would be key-dependent and the rest unrelated when subjected to DSAs).
In the table, each digit represents the number of KFFs in one slice. There are
24 KFFs in total in each distribution. For instance, the ﬁrst distribution has two
KFFs on 8 slices, and one KFF each on another 8 slices. Similarly, for the 22nd
distribution, there are 9 KFFs on one slice and one KFF each on 15 slices. Since there
are 22 possible choices for distributing active slices, there are a total of $22^2 = 484$
distributions with 24 active slices and 24 active scan chains. These distributions are
in fact all possible choices to distribute 32 KFFs over 24 active slices / scan chains.
Since in these distributions, there are multiple FFs on each active slice, we consider
cases where one KFF would mask another on the same slice, as is possible in a realistic
scenario.
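The count of 22 can be reproduced by enumerating the ways of splitting the KFFs over the active slices. The sketch below follows the statement above that 32 KFFs are distributed over 24 active slices, each holding at least one KFF; the generator `partitions` is a hypothetical helper, not code from this work.

```python
def partitions(total, parts, largest=None):
    """Yield non-increasing tuples of `parts` positive integers summing to `total`."""
    if largest is None:
        largest = total
    if parts == 0:
        if total == 0:
            yield ()
        return
    for first in range(min(largest, total - (parts - 1)), 0, -1):
        if first * parts < total:         # remaining parts cannot exceed `first`
            break
        for rest in partitions(total - first, parts - 1, first):
            yield (first,) + rest


# 32 KFFs over 24 active slices, at least one KFF per slice.
distributions = sorted(partitions(32, 24))

print(len(distributions))                 # choices for the slices
print(len(distributions) ** 2)            # slices and scan chains combined
for index, dist in enumerate(distributions, start=1):
    print(index, [n for n in dist if n > 1], "+ ones")
```

Sorting the tuples lexicographically gives the same ordering as the numbering used in Table 6.2, starting with eight slices holding two KFFs and ending with one slice holding nine.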
Table 6.2: Distributions for 24 active slices / scan chains
 #   Distribution                             #   Distribution
 1   {2, 2, 2, 2, 2, 2, 2, 2, 1, ..., 1}      12  {5, 3, 2, 2, 1, ..., 1}
 2   {3, 2, 2, 2, 2, 2, 2, 1, ..., 1}         13  {5, 3, 3, 1, ..., 1}
 3   {3, 3, 2, 2, 2, 2, 1, ..., 1}            14  {5, 4, 2, 1, ..., 1}
 4   {3, 3, 3, 2, 2, 1, ..., 1}               15  {5, 5, 1, ..., 1}
 5   {3, 3, 3, 3, 1, ..., 1}                  16  {6, 2, 2, 2, 1, ..., 1}
 6   {4, 2, 2, 2, 2, 2, 1, ..., 1}            17  {6, 3, 2, 1, ..., 1}
 7   {4, 3, 2, 2, 2, 1, ..., 1}               18  {6, 4, 1, ..., 1}
 8   {4, 3, 3, 2, 1, ..., 1}                  19  {7, 2, 2, 1, ..., 1}
 9   {4, 4, 2, 2, 1, ..., 1}                  20  {7, 3, 1, ..., 1}
 10  {4, 4, 3, 1, ..., 1}                     21  {8, 2, 1, ..., 1}
 11  {5, 2, 2, 2, 2, 1, ..., 1}               22  {9, 1, ..., 1}
6.4 Scan Attacks on Industrial Test Compression
Algorithms
6.4.1 Scan Attack on Synopsys Adaptive Scan
Adaptive scan is the test compression architecture used in the Synopsys DFTMAX
test tool. At the input side, there are multiplexers to enable testing of multiple scan
chains using a reduced number of scan inputs. At the output side, there is an XOR
network to connect multiple scan chains to reduce the number of test outputs. This
way compression is achieved without compromising on testability. The output side
XOR compactor network is also known as the unload compressor which provides an
X-tolerant design. This helps in diagnosing high volume of scan pattern failures which
can be observed on the tester [WWW06]. The X-tolerant compressor [WWK+07] is
based on an algorithm derived from Steiner systems, and provides a good balance
between scan compression, and silicon area. To prevent aliasing, cancellation of si-
multaneous faults on two or more chains, a unique combination of sub-scan chains
connects to each compactor output. This combination depends on the compaction
structure employed. Sometimes, the outputs are computed by XORing disjoint sub-
sets of scan outputs.
Since the X-tolerant XOR compressor used in Adaptive Scan has a diﬀerent struc-
ture than an XOR-tree, DSA success rates are expected to be diﬀerent than the success
rates for the attacks available in the literature. Observed Hamming distances vary
depending on the structure of the XOR network, therefore providing some security
through obscurity as long as the structure of the XOR connections is not known. How-
ever, this compressor also leaks information on the Hamming distance (HD) between
outputs as it consists of linear operations.
Description of the Attack
Similar to the initial attack in [YWK06], our scan attack consists of two main steps.
First, all 256 possible values are given to the ﬁrst byte of the plaintext and corre-
sponding ﬁrst round outputs are collected. In the second step, these outputs are paired
depending on the selected input diﬀerence and the Hamming distance between them
is computed. Unlike previous works, we use ﬁve diﬀerent XOR diﬀerences (namely
0xD1,0x01,0x89,0x4A,0x69) to amplify the visibility of the correct key among other
key guesses. Note that all test outputs are XORed together to obtain a single value
for the Hamming distance. This is important since the structure of the X-tolerant
logic used can vary depending on the number of scan chains and the number of test
outputs in the design. The X-tolerant logic used in this work has been derived from
an actual test compression DfT insertion with a 32:8 compressor on the AES design
by Synopsys DFT Compiler. It combines the scan chain outputs in the following way:
out0 = s2 ⊕ s5 ⊕ s24 ⊕ s26 ⊕ s19 ⊕ s13 ⊕ s8 ⊕ s29 ⊕ s16 ⊙ s22 ⊕ s0 ⊕ s11
out1 = s3 ⊕ s8 ⊕ s5 ⊕ s27 ⊕ s16 ⊕ s18 ⊕ s13 ⊕ s21 ⊕ s23 ⊕ s29 ⊕ s0 ⊕ s10
out2 = s28 ⊕ s25 ⊕ s17 ⊕ s20 ⊕ s22 ⊕ s14 ⊕ s9 ⊕ s3 ⊕ s31 ⊙ s6 ⊕ s0 ⊙ s11
out3 = ¬s30 ⊕ s27 ⊕ s14 ⊕ s19 ⊕ s21 ⊕ s1 ⊕ s6 ⊕ s22 ⊕ s3 ⊕ s8 ⊕ s11 ⊕ s17
out4 = s1 ⊙ s9 ⊕ s12 ⊕ s15 ⊕ s18 ⊕ s4 ⊕ s26 ⊕ s20 ⊕ s31 ⊙ s6 ⊕ s23 ⊕ s29
out5 = ¬s30 ⊕ s27 ⊕ s14 ⊕ s19 ⊕ s25 ⊕ s12 ⊕ s7 ⊕ s4 ⊕ s1 ⊕ s9 ⊕ s16 ⊙ s22
out6 = s10 ⊕ s7 ⊕ s13 ⊕ s21 ⊕ s28 ⊕ s15 ⊕ s31 ⊕ s2 ⊕ s24 ⊕ s26 ⊕ s18 ⊕ s4
out7 = s28 ⊕ s25 ⊕ s17 ⊕ s20 ⊕ ¬s30 ⊙ s23 ⊙ s12 ⊕ s15 ⊕ s2 ⊕ s5 ⊕ s10 ⊕ s7
where si, i ∈ {0, . . . , 31} are the scan chain outputs, ⊕ is the XOR and ⊙ is the
XNOR operation.
In the compactor structure above, each scan chain occurs at least twice at the
compactor outputs to satisfy the X-Tolerant requirement of canceling X-states oc-
curring on the scan chains. Since all the scan chains are included more than once,
evaluating compactor outputs separately can result in an incorrect estimation of the
actual Hamming distance between the corresponding scan designs. Therefore, the test
outputs are XORed to make sure that the observed Hamming distance is smaller than
or equal to the actual Hamming distance. This is due to the fact that without the
knowledge of the exact structure of the output compactor, it would not be possible to
tell which scan chains contribute to the diﬀerences observed in particular compactor
outputs. In some cases that we observed for diﬀerent X-tolerant logic designs gener-
ated by Synopsys, a scan chain is included in an even number of compactor outputs.
In this case, XORing the compactor outputs will cancel out these scan chain outputs,
therefore degrading the observability of the actual Hamming distance between two
scan designs. When the case in the example above is considered, one can easily see
that s22 and s24 are the only ones that are included in an even number of times in the
compactor output. Therefore, when performing the analysis, they will be canceled
out and therefore information about two particular scan chains s22 and s24 will be
lost. Hence, a decrease in the success rate for a random distribution of KFFs is to be
expected when compared to an XOR tree.
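The cancellation argument above can be checked directly from the compactor equations. The sketch below transcribes the scan chain indices of each output from the listing in this section; the names `COMPACTOR` and `cancelled` are hypothetical. The XNOR gates and inverters are ignored because they only add a constant, which drops out when the outputs of two runs are XORed.

```python
from collections import Counter

# Scan chain indices feeding each output of the 32:8 X-tolerant compactor.
COMPACTOR = {
    0: [2, 5, 24, 26, 19, 13, 8, 29, 16, 22, 0, 11],
    1: [3, 8, 5, 27, 16, 18, 13, 21, 23, 29, 0, 10],
    2: [28, 25, 17, 20, 22, 14, 9, 3, 31, 6, 0, 11],
    3: [30, 27, 14, 19, 21, 1, 6, 22, 3, 8, 11, 17],
    4: [1, 9, 12, 15, 18, 4, 26, 20, 31, 6, 23, 29],
    5: [30, 27, 14, 19, 25, 12, 7, 4, 1, 9, 16, 22],
    6: [10, 7, 13, 21, 28, 15, 31, 2, 24, 26, 18, 4],
    7: [28, 25, 17, 20, 30, 23, 12, 15, 2, 5, 10, 7],
}

# How many compactor outputs each scan chain contributes to.
counts = Counter(chain for chains in COMPACTOR.values() for chain in chains)

# A chain that appears an even number of times vanishes when all outputs are XORed.
cancelled = sorted(chain for chain in range(32) if counts[chain] % 2 == 0)

print(cancelled)
print(min(counts.values()))
```

Every scan chain feeds at least two outputs, and the only chains with an even count are the two named in the text, so XORing all eight outputs behaves like an XOR tree over the remaining 30 chains.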
A key guess is performed if the observed Hamming distance is one of the extreme
cases. For example for the XOR diﬀerence 0xD1, we make a key guess if the observed
Hamming distance is less than 5 or 9, and if it is exactly equal to 23 or 24. Our
experiments show that this approach improves the overall success rate of the attack.
Attack results
Table 6.3 summarizes the attack success for diﬀerent number of active slices and active
scan chains. The results suggest that the attack success decreases with decrease in
the number of active slices. However, it seems that the success rate is aﬀected to a
much smaller extent with variations in the number of active scan chains.
Table 6.3: DSA success rates for X-Tolerant logic for different distributions
#Active scan chains = 32           #Active slices = 32
#Active slices   Success rate      #Active scan chains   Success rate
32               74.94%            32                    74.94%
31               71.22%            31                    74.83%
30               70.24%            30                    74.24%
29               66.35%            29                    74.11%
28               62.65%            28                    74.28%
27               60.60%            27                    74.45%
26               59.16%            26                    74.22%
25               57.93%            25                    74.13%
24               57.26%            24                    74.06%
23               55.90%            23                    74.19%
22               54.38%            22                    74.56%
21               49.65%            21                    73.58%
20               50.66%            20                    61.98%
19               49.79%            19                    56.65%
18               44.62%            18                    56.78%
17               37.59%            17                    57.34%
16               30.26%            16                    56.09%
Figure 6.6 tells another interesting story. Although the number of active slices
affects the success rate quite significantly (see Table 6.3), the effect of the
KFF distribution over 24 active slices appears to be minimal. When the number of active slices and
active scan chains are both fixed at 24, the average success rate of the attack is
around 76.01% for a random permutation of KFFs. As stated in Section 6.4.1, for the
32:8 X-tolerant structure used in this work, the information on only two scan chains
(namely s22 and s24) is lost after XORing the test outputs. Although the loss of
information is minimal, some distributions of scan chains lead to further reductions in
the observed Hamming distance value. Hence, the success rate is inevitably degraded.
Experiments are repeated for 10000 random distributions of KFFs to have reliable
statistics on the attack success. Performing the attack once takes an average of 1.28
milliseconds using the conﬁguration mentioned in Section 6.3.
Figure 6.6: Success rate of the attack on Adaptive Scan for 24 active scan chains and
24 active slices (Distributions from Table 6.2)
6.4.2 Scan Attack on Cadence OPMISR
OPMISR (On-Product MISR) is one of the main features included in the Cadence
Encounter Test toolkit [WWW06]. The essential part of OPMISR is space compaction
employing XOR trees against which DSA in [DRDNFR11a] is eﬀective. OPMISR has
an optional feature of X-state handling using static X-masking. Although X-masking
is used for testing purposes, it can enhance security by reducing the observability of
the internal registers in a design. Time compaction with Multiple Input Signature
Registers (MISRs) is another optional feature provided by OPMISR. We also provide
results of scan attack on combined XOR-tree, static X-masking, and MISR structures.
Description of the Attack
As in Section 6.4.1, the attack consists of two stages. First, all 256 possible values
are given to the ﬁrst byte of the plaintext and the corresponding test outputs are
collected. In the second stage, the test outputs are paired depending on the chosen
XOR diﬀerence and a key guess is made depending on the Hamming distance between
output pairs. This two-stage attack is repeated multiple times for diﬀerent mask
values, so that the correct key guess eventually overwhelms the incorrect key guesses,
and becomes the top key candidate. Therefore, an attack is deemed successful only
if the top key candidate is the correct one. A ﬁnal note on the attack is that since it
is repeated for a number of random test inputs, only 3 diﬀerent XOR diﬀerences are
used in analysis to reduce the execution time of the attack.
In Figure 6.7, the trend in success rate in comparison to the number of test inputs
used is presented. It can be observed from the ﬁgure that the attacks on masking
schemes are more successful if a larger collection of random test inputs are used to
mount an attack.
Figure 6.7: Change in success rate with respect to the number of test inputs (or
masks) used for the attack. [Plot: success rate (0 to 1) against the number of test inputs (10^3 to 10^6), with curves for dynamic masking and static masking.]
DSA on XOR Compaction with Static X-Masking
The attack success not only depends on the number of active slices, but also on the
number of active scan chains. This is due to the fact that a smaller number of scan
chains can be covered more frequently since a static mask with a lower Hamming
weight would be sufficient to include all of them. For instance, let there be c active
scan chains in the design. Statistically, once in $2^c$ different masks, all KFFs will affect
the test outputs, therefore giving a better chance of mounting a successful attack.
Hence, the smaller the number of active scan chains present in the design, the better
the success rates will be for the attack. Also it should be noted that even when the
number of active scan chains and active slices are ﬁxed, the attack success will be
aﬀected by the distribution of KFFs over the scan structure.
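The once-in-$2^c$ figure can be confirmed by brute force for a small design. The sketch assumes that every mask bit is chosen independently and uniformly at random; `coverage_probability` and the six-chain example are hypothetical.

```python
from itertools import product


def coverage_probability(n_chains, active_chains):
    """Fraction of all static masks that let every active scan chain through."""
    masks = list(product((0, 1), repeat=n_chains))
    hits = sum(all(mask[c] for c in active_chains) for mask in masks)
    return hits / len(masks)


print(coverage_probability(6, [3]))          # all KFFs on one active chain
print(coverage_probability(6, [0, 2, 5]))    # the same KFFs spread over three chains
```

With c active chains the fraction is 1 / 2**c, so concentrating the KFFs on fewer scan chains means fewer random masks are needed before one of them exposes all KFFs at once.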
Table 6.4 shows DSA success rates in percentages for varying active scan chains
and active slices. The results illustrate that there is a substantial fall in the success
rates with decreasing number of active slices when the number of active scan chains
is kept ﬁxed. However, when the number of active slices is ﬁxed, there is a small
increase in success rates as the number of active scan chains is reduced. This shows
that the dependency of DSA is more on active slices than on active scan chains. This
observed behaviour is due to KFFs on the same slice being XORed together and
therefore reducing the observed Hamming distance value in the test output.
Figure 6.8 shows the change in the success rate of the attack with diﬀerent dis-
tributions having the same number of active scan chains and active slices. There are
two important observations which can be made from Figure 6.8. Firstly, distribution
of KFFs over active slices will aﬀect the success rate as it determines how fast the
information is processed by the compactor. For instance, when the distribution 22 in
Table 6.2 is considered for the active slices, it is clear that 9 KFFs will be processed
in one test clock. However, for distribution 4, it would take 3 test clocks to process
the same amount of information, which means less information is lost and eventually
the observed Hamming distance after the compactor is more likely to be successfully
attacked. Hence, it is realistic to expect the success rate to fall as the number
of KFFs on a single slice increases. Similarly, when 9 KFFs
are grouped on a single scan chain, information on those KFFs will be included with
probability 1/2. If we again compare it to distribution 4 for the active scan chains, the
same KFFs would be included in the output with probability 1/8. Therefore, the wider
the distribution of KFFs over the active scan chains, the lower the success rates one will
get when mounting the attack.

Table 6.4: DSA success rates for static masking for different distributions

  #Active Scan Chains = 32         #Active Slices = 32
  #Active Slices  Success Rate     #Active Scan Chains  Success Rate
  32              81.91%           32                   81.91%
  31              77.48%           31                   83.26%
  30              72.02%           30                   83.76%
  29              66.79%           29                   86.07%
  28              63.21%           28                   86.14%
  27              56.88%           27                   88.19%
  26              53.24%           26                   88.03%
  25              49.21%           25                   88.30%
  24              44.39%           24                   89.82%
  23              41.25%           23                   91.09%
  22              37.81%           22                   91.77%
  21              33.19%           21                   92.28%
  20              30.39%           20                   92.60%
  19              27.94%           19                   93.13%
  18              25.29%           18                   93.78%
  17              22.63%           17                   94.49%
  16              20.75%           16                   94.22%

Although the above arguments can provide a designer with options towards design-
ing more secure scan chain structures with the same tools that she is using, Figure 6.7
should always be kept in mind when claiming security. It should be noted that it
is possible to make the attacker’s job harder, but it is only a matter of resources for
the attacker to recover the key when masking schemes are considered. Applying the
attack once with 1000 inputs on a design, with a random distribution of KFFs, takes
around 0.96 seconds using the conﬁguration mentioned in Section 6.3.
DSA on XOR Compaction, Static X-Masking and OPMISR
For the sake of completeness, the same attack is applied to a design which uses both
static X-Masking and MISRs, therefore having both time and space compaction at
the same time. When the attack is applied using the distribution (32 active slices
and 16 active scan chains) on which the attack described in Section 6.4.2 is most
successful, the success rate is reduced from 94.22% to 3.55%. Although this gives an idea
about the security of combined time and space compaction, it should be noted that
the attack does not include any method to exploit the MISR structure.
Figure 6.8: Success rate of the attack on OPMISR (space compaction only) for 24
active scan chains and 24 active slices (Distributions from Table 6.2)
Though the attack in [DRDNF12] is claimed to be successful against AES in the
presence of MISR-based time compaction, it relies on the assumption that the MISR
register is observable after each scan clock. In a real-life case, it may be possible to
make the parallel outputs of a MISR visible during testing; however, this would raise
two major issues. Firstly, the gain from using MISRs after the scan structure diminishes
as they are supposed to compress the golden value to make the testing procedure
more eﬃcient. Secondly, if MISR content is available at all times, then this implies
that the complete scan chain contents are available to the attacker. Therefore using
the method proposed in [YWK06] would suﬃce to recover the key. In this work, we
focus only on the output signatures of MISRs, which we believe to be a more realistic
assumption, and observe that attacking such a system would not be possible by using
simple DSA techniques.
6.4.3 Scan Attack on Mentor Graphics Embedded Deterministic Test (EDT)
Mentor Graphics' test compression tool Tessent TestKompress employs Embedded
Deterministic Test (EDT) [RTK+02] [RTKM04]. Similar to the test compression tools
from Synopsys and Cadence, it uses XOR trees for space compaction. However, it
deals with X-states differently, through the use of a dynamically changing
mask. This provides more flexibility and ease of application, since knowledge of
the scan chains with a higher probability of X-states (as required for static
X-masking) is not needed. This method makes use of a ring generator (similar
to a circular linear feedback shift register) together with a phase shifter to produce
independent inputs for each scan chain. The scan outputs are compacted together
with an XOR tree which is used right after an X-masking operation. X-masking is
done with AND gates where the enabling inputs are generated on-the-ﬂy through a
pattern mask decoder. The masking logic varies based on a special EDT clock and
test inputs.
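The order of operations described above (AND-gate masking followed by an XOR tree) can be sketched as follows. This is a simplified model under stated assumptions: the masks are supplied directly as lists, whereas in EDT they come from a pattern mask decoder that is not modelled here, and the function name `edt_compact` and all bit values are hypothetical.

```python
def edt_compact(scan_state, masks):
    """AND each slice with its mask, then XOR the surviving bits together.

    scan_state[t][c]: bit shifted out of chain c at test clock t.
    masks[t][c]: mask bit for chain c at test clock t (1 = pass, 0 = block).
    """
    out = []
    for slice_bits, mask_bits in zip(scan_state, masks):
        parity = 0
        for bit, enable in zip(slice_bits, mask_bits):
            parity ^= bit & enable
        out.append(parity)
    return out


state = [[1, 0, 1], [0, 1, 1], [1, 1, 1]]
static_masks = [[1, 1, 0]] * 3                      # same mask on every clock
dynamic_masks = [[1, 1, 0], [0, 1, 1], [1, 0, 1]]   # mask changes every clock
print(edt_compact(state, static_masks), edt_compact(state, dynamic_masks))
```

Passing the same mask for every clock reproduces the static X-masking case, so the two schemes differ only in how the `masks` argument is generated.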
In [Men10], it is claimed that Tessent TestKompress provides inherent security
as the scan inputs and outputs are compressed and rendered useless for an attacker.
A theoretical security analysis of EDT is provided in [LH07]. However, the security
claim in that work is based on the assumption that a scan attack requires knowledge
of the internal test structure and secret registers, which may not always be valid.
DSA on Dynamic X-Masking
As in the attack described in Section 6.4.2, the scan attack principle remains the
same. The only difference is that the mask used in the emulation of the attack
changes dynamically depending on the test input. First, all 256 possible values are
given to the first byte of the plaintext and the first round outputs are collected.
Then, the outputs are paired depending on the selected input XOR difference and a
key guess is made. The attack is again repeated a number of times to be able to
distinguish the correct key candidate from the others.
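The pairing and key-guessing steps can be made concrete with a small sketch. It is not the implementation used for the experiments: it assumes an unprotected scan chain in which the 32 affected round-register bits are observed directly (no compaction or masking), it filters key guesses by exact match rather than by a statistical distinguisher, and the secret byte, the chosen differences and all function names are hypothetical.

```python
def gf_mul(a, b):
    """Multiply in GF(2^8) modulo the AES polynomial x^8 + x^4 + x^3 + x + 1."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r


def build_sbox():
    """Generate the AES S-box: field inversion followed by the affine map."""
    sbox = []
    for x in range(256):
        inv = next((y for y in range(1, 256) if gf_mul(x, y) == 1), 0)
        s = inv
        for i in range(1, 5):
            s ^= ((inv << i) | (inv >> (8 - i))) & 0xFF
        sbox.append(s ^ 0x63)
    return sbox


SBOX = build_sbox()


def observed_hd(a, diff, key):
    """Hamming distance between round-1 states for plaintext bytes a and a^diff.

    Only one plaintext byte changes, so after MixColumns the difference
    spreads to four bytes: (2*d, d, d, 3*d). The round key cancels out.
    """
    d = SBOX[a ^ key] ^ SBOX[a ^ diff ^ key]
    column = (gf_mul(2, d), d, d, gf_mul(3, d))
    return sum(bin(byte).count("1") for byte in column)


def key_candidates(diff, measurements):
    """Keep every key-byte guess that explains all measured distances."""
    return [k for k in range(256)
            if all(observed_hd(a, diff, k) == hd
                   for a, hd in enumerate(measurements))]


secret = 0x3C                      # hypothetical key byte, unknown to the attacker
survivors = set(range(256))
for diff in (0x01, 0x02, 0x04):    # three input XOR differences (arbitrary choice)
    trace = [observed_hd(a, diff, secret) for a in range(256)]
    survivors &= set(key_candidates(diff, trace))
print(sorted(hex(k) for k in survivors))
```

A single difference cannot separate the guesses k and k XOR diff, because both produce the same set of plaintext pairs; this is one reason for combining several input differences.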
Table 6.5 shows DSA success rates in percentages for varying active scan chains
and active slices. Similar to the static X-masking case presented in Section 6.4.2, the
results illustrate that there is a substantial fall in the success rates with decreasing
number of active slices when the number of active scan chains is kept ﬁxed. However,
when the number of active slices is fixed, the success rate stays almost the same. This
is because the mask is assumed to be updated at each clock, and therefore the probability
of all KFFs affecting the output depends only on the total number of KFFs, as
each KFF in a different slice must be selected individually at each clock. This shows
that, for dynamic X-Masking, DSA depends strongly on the number of active slices
and hardly at all on the number of active scan chains. This
can act as a guideline for the security design engineer to place the KFFs on specific
slices during DfT insertion.
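The difference between the two masking schemes can be checked exhaustively on a toy scan structure. The sketch below assumes that every mask value is equally likely and that a dynamic mask is redrawn independently on every test clock; the 3 x 3 array, the KFF positions and the function name `coverage` are invented for illustration.

```python
from fractions import Fraction
from itertools import product


def coverage(kff_positions, n_slices, n_chains, dynamic):
    """Exact probability that every KFF passes the X-mask.

    kff_positions: list of (slice, chain) pairs.
    dynamic=False: one mask bit per chain, reused for every slice.
    dynamic=True: an independent mask bit per (slice, chain) cell,
    modelling a mask that is refreshed on every test clock.
    Assumption: all mask values are equally likely.
    """
    n_bits = n_slices * n_chains if dynamic else n_chains
    passing = 0
    for bits in product((0, 1), repeat=n_bits):
        def mask(s, c):
            return bits[s * n_chains + c] if dynamic else bits[c]
        if all(mask(s, c) for s, c in kff_positions):
            passing += 1
    return Fraction(passing, 2 ** n_bits)


grouped = [(0, 0), (1, 0), (2, 0)]   # three KFFs on one scan chain
spread = [(0, 0), (1, 1), (2, 2)]    # three KFFs on three scan chains

for name, kffs in (("grouped", grouped), ("spread", spread)):
    print(name,
          coverage(kffs, 3, 3, dynamic=False),
          coverage(kffs, 3, 3, dynamic=True))
```

In this model grouping the KFFs on one chain helps only under a static mask (1/2 against 1/8); with a per-clock mask both placements give 1/8.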
Table 6.5: DSA success rates for dynamic masking for different distributions

  #Active Scan Chains = 32         #Active Slices = 32
  #Active Slices  Success Rate     #Active Scan Chains  Success Rate
  32              82.18%           32                   82.18%
  31              77.00%           31                   81.87%
  30              71.73%           30                   81.28%
  29              66.66%           29                   81.37%
  28              62.37%           28                   82.52%
  27              57.49%           27                   81.77%
  26              54.10%           26                   81.60%
  25              49.99%           25                   81.74%
  24              44.14%           24                   81.75%
  23              40.00%           23                   81.48%
  22              37.62%           22                   81.45%
  21              32.49%           21                   82.58%
  20              29.88%           20                   81.64%
  19              27.97%           19                   81.19%
  18              24.78%           18                   80.85%
  17              22.19%           17                   81.88%
  16              20.53%           16                   81.90%

From Figure 6.9, it can be inferred that the argument in Section 6.4.2 regarding the
distribution of KFFs over active slices is also valid for the case of dynamic masking.
However, the KFF distribution over the active scan chains will have no eﬀect since
the mask is assumed to be updated at each test clock. Therefore, even if the KFFs
are grouped in the same active scan chain, it is equally diﬃcult for all of them to be
picked as in distribution 1 in Table 6.2. However, if the mask update clock is slower
than the test clock, then we would see a slight increase in success rate as more and
more KFFs are grouped on the same scan chain (as in distribution 22 in Table 6.2).
Therefore, the cases presented in this work cover the two extreme possibilities, in which
the mask is never updated and in which the mask update clock is the same as the test clock,
giving an idea of the cases in between. Applying this attack once with 1000 inputs
on a design with a random distribution of KFFs takes around 0.88 seconds using the
conﬁguration mentioned in Section 6.3.
6.5 Scan Attack Countermeasures
In Section 6.4.2, we show that combined space and time compaction can act as a
suitable scan attack countermeasure implicitly achieved through the DfT structure.
To the best of our knowledge, the latest version of Cadence Encounter Test, which
includes OPMISR+, is the only DfT tool which provides a combination of time compaction
(MISRs) and space compaction (XOR-trees). Other EDA companies (Mentor, Synopsys,
SynTest) have the capability to add a MISR to a design by means of scripting
to get the OPMISR+ effect. However, in our work, we have only considered the standard
features provided in the commercial test tools, and not any custom extensions
that can be added.
Figure 6.9: Success rate of the attack on EDT for 24 active scan chains and 24 active
slices (Distributions from Table 6.2)
An explicit scan attack countermeasure therefore needs to be integrated with the DfT
structure generated by these tools. In this section, some of the existing explicit scan
attack countermeasures are evaluated, and a new countermeasure is proposed.
To have a fair basis for comparing the countermeasures considered in this section, all
simulations are run without X-masking or X-tolerant logic. Experiments are repeated
10000 times with three distinct XOR diﬀerences to get good statistics and only one
test input is used per attack since there are no test input dependent elements in the
proposed countermeasures.
6.5.1 Insertion of Inverters in the Scan Path
This technique is also known as the flipped scan tree architecture [SMC07]. It involves
dividing the scan chains into a number of sub-chains in the form of a scan tree. Sub-chains
are parts of complete scan chains which may be connected in a random order to
make extraction of useful information from the scan outputs difficult for an
attacker. Inverters are inserted in front of the scan flip-flops in some secret locations.
These locations are known only to the designer and the tester, but not to an attacker.
However, as the position of the inverters in the scan path is fixed, the countermeasure
does not protect against DSAs and the same attack principle is applicable.
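Why fixed inverters do not disturb a differential attack can be seen from a few lines of code. The sketch assumes a simplified model in which each scanned-out bit position is either always inverted or never inverted; the inverter locations and the two states are made-up values.

```python
def hamming_distance(a, b):
    return sum(x != y for x, y in zip(a, b))


def scan_out(state, inverted_positions):
    """Shift-out model: bits at the secret inverter positions appear flipped."""
    return [bit ^ 1 if i in inverted_positions else bit
            for i, bit in enumerate(state)]


inverters = {1, 4, 6}                  # hypothetical secret inverter locations
state_a = [0, 1, 1, 0, 1, 0, 0, 1]
state_b = [1, 1, 0, 0, 1, 1, 0, 1]

plain_hd = hamming_distance(state_a, state_b)
flipped_hd = hamming_distance(scan_out(state_a, inverters),
                              scan_out(state_b, inverters))
print(plain_hd, flipped_hd)
```

Because the same positions are flipped in both responses, the inversion cancels in the XOR difference and the Hamming distance is unchanged.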
6.5.2 Partial Scan
This approach (also referred to as balanced secure scan) aims to protect non-scan
registers by employing a test controller that enables the test mode only when an au-
thentication succeeds [IYHF09]. Only a few ﬂip-ﬂops belonging to the secret registers
are included in the scan chains. Further confusion is added to the kernel wherever a
secret register is inserted in the scan chain.
To emulate the eﬀect of this countermeasure in our software implementation of
the attack, we removed some of the KFFs from the two-dimensional array used to
represent the scan chains. Then DSA is performed together with the test compression
structures. The distributions in our experiments have 25%, 50% and 75% of the
KFFs removed from the scan chains. The average scan attack success results that we
obtained for these distributions are 24.46%, 3.01% and 0.82% respectively. The attack
is repeated 10 000 times, with different KFFs being blocked at each try. Therefore,
the success rates presented here show the effect of the partial scan countermeasure when
a randomly selected group of KFFs is blocked in a design.
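The emulation step of removing KFFs from the scan structure might look like the following sketch. The representation of KFFs as (slice, chain) pairs, the diagonal layout, the seed and the helper name `remove_kffs` are assumptions made for the example; the attack itself is not reproduced here.

```python
import random


def remove_kffs(kff_positions, fraction, rng):
    """Drop a random fraction of the KFFs from the scan structure."""
    n_removed = round(len(kff_positions) * fraction)
    removed = set(rng.sample(sorted(kff_positions), n_removed))
    return [p for p in kff_positions if p not in removed]


rng = random.Random(2016)  # fixed seed so the sketch is reproducible
# Hypothetical layout: 32 KFFs on the diagonal of a 32 x 32 scan array.
all_kffs = [(i, i) for i in range(32)]

for fraction in (0.25, 0.50, 0.75):
    visible = remove_kffs(all_kffs, fraction, rng)
    print(f"{fraction:.0%} removed -> {len(visible)} KFFs still scannable")
```

Drawing a fresh random subset for every trial corresponds to blocking different KFFs at each try.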
Although the results suggest that the partial scan countermeasure is a good way to
mitigate DSA on an AES design, the cost of the countermeasure should also be taken
into account. We implemented the partial scan countermeasure following the structure
of Figure 3 in [IYHF09]. The test controller along with the other modules were
implemented in HDL and a 32-bit LFSR is used to derive the signals which are used
for blocking a group of KFFs. Implemented this way, the partial scan countermeasure
takes 341.5 GE when synthesized using a Faraday 130 nm technology library. This
corresponds to an area overhead of 1.95% for the partial scan countermeasure when
added to the AES design in [Gez], which occupies an area of 17 484.25 GEs.
6.5.3 Scan Chain Scrambling
This countermeasure proposes to divide each scan chain into multiple scan elements
and the order of connections of the scan elements is controlled through the scan chain
scrambler. When the scan mode has been reached securely, the scan chain elements
are arranged in a predetermined order [HFB+04]. However, in insecure mode, the
order of the scan chain elements keeps changing at a certain frequency.
As the resulting KFF distribution in insecure mode is expected to be randomised
by this countermeasure, 10000 random distributions of KFFs are used to simulate the
behaviour of the countermeasure. The same scan attack principle is applicable. Full
information is still visible as all KFFs contribute to the test outputs even after
randomising the KFF distributions. Hence, for each distribution, the attack successfully
recovers the correct key.
6.5.4 Masking
In [DRDNFR11b], two masking countermeasures for protecting AES against scan at-
tacks were proposed. These masking schemes are similar to the countermeasures used
to protect against diﬀerential power analysis (DPA) side channel attacks. The ﬁrst
method is based on masking the round-register data. The mask can be added to all the
128 round-register FFs, and then removed from the encrypted value before executing
the next AES round. The masking is eﬀective only during testing and is completely
transparent in functional mode. Alternatively, to reduce area requirements, the mask
can be applied on a single FF per slice to be protected. In this manner, the parity
of the whole slice is aﬀected by the mask and its eﬀects cannot be eliminated by
the attacker. The second method works by modifying the response compactor. This
is implemented by masking the parity bitstream on the output of the test response
compactor instead of the data before being captured in the scan chain. It makes use
of an enhanced LFSR (eLFSR) that can function either as a simple register or as
an LFSR. In functional mode, the eLFSR is loaded with a value matching with the
round-register (a 128 bit value unknown to the attacker), or with any other value
depending on the input message and part of the secret key. In test mode, the eLFSR
provides the mask to the observed stream.
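The area-saving variant, in which a single flip-flop per slice carries the mask, can be sketched as below. This only illustrates the parity argument; it assumes the mask is one random bit applied to the first flip-flop of the slice, and `secrets.randbits` merely stands in for the secret test-mode mask of [DRDNFR11b].

```python
import secrets


def slice_parity(bits):
    parity = 0
    for b in bits:
        parity ^= b
    return parity


def masked_slice_parity(bits, mask_bit):
    """Parity seen at the compactor when one FF of the slice carries a mask bit."""
    masked = [bits[0] ^ mask_bit] + list(bits[1:])
    return slice_parity(masked)


slice_bits = [1, 0, 1, 1]
mask_bit = secrets.randbits(1)      # stands in for the secret test-mode mask
print(slice_parity(slice_bits), masked_slice_parity(slice_bits, mask_bit))
```

Masking one bit is enough to flip the parity of the whole slice whenever the mask bit is 1, which is why the attacker cannot remove its effect after XOR compaction.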
The mask register countermeasure has an area overhead of 5.9%, while the coun-
termeasure of modifying the response compactor has an overhead of 4.5% for the AES
design used in the paper [DRDNFR11b] and involves storing a secret key. The mask-
ing countermeasure can also be extended to ciphers other than AES, for instance RSA
or ECC, by masking the intermediate registers storing the results of modular square
and multiply operations or point doubling and point addition results respectively.
There are also other scan attack countermeasures, such as the Lock and Key Technique
[LTP07], Design for Secure Test [STYO07] and resetting the crypto chip in test
mode [HFB+04], that were not evaluated experimentally in this work. These schemes
are brieﬂy discussed qualitatively here. The Lock and Key technique [LTP07] uses a
plaintext key for comparison to unlock the ﬁnite state machine used for randomizing
the order in which the scan elements are connected. In case the communication link
is not secure, an attacker can observe this key. The Design for Secure Test [STYO07]
which checks the parity of AES rounds is an ad-hoc solution for AES designs with a
completely unrolled structure (having high area requirements), limiting its applica-
bility to other designs. Though the scheme involving resetting the crypto chip and
removing all traces of cryptographic execution in test mode [HFB+04] might provide
a high level of security, it is not applicable for implementations where some secret
data needs to be stored on-chip.
6.5.5 Our Proposal
The existing scan attack countermeasures aim at securing uncompacted scan chain
structures. However, as described in the previous sections, practical scan structures
include compaction techniques which lead to a loss of observability of the internal scan
chains. Hence, they may be extended into a countermeasure. In this regard, we
propose a new scan attack countermeasure based on randomization of the compactor
outputs.
Figure 6.10 shows the proposed noise injector countermeasure. It consists of a
Linear Feedback Shift Register (LFSR), a True Random Number Generator (TRNG)
and some basic logic gates. For the particular case of Noise Injector depicted in the
ﬁgure, two OR gates, one NOR gate, one AND gate, and one XOR gate are required.
The LFSR makes a pseudo-random selection of the test cycles when random noise
is injected into the compactor outputs. More speciﬁcally, the compactor output is
ﬂipped if ‘A’ in Figure 6.10 is 1. The signal ‘A’ is generated through a combination of
TRNG and the LFSR outputs, making it unpredictable for an attacker. Hence, the
compactor outputs become random and cannot be exploited by DSA.
Figure 6.10: Noise Injector countermeasure for an Injection Frequency Factor of 16
Injection frequency factor (IFF) refers to the rate at which noise is injected into
the compactor outputs, expressed as the factor by which the test clock frequency is
divided. As an example, consider the LFSR structure in Figure 6.10, which has an
IFF of 16. Assume that the state bits of the LFSR are completely random, with
1s and 0s occurring with equal probability (1/2). An OR gate then gives an output of
0 with probability 1/4. Hence the probability that both inputs of the NOR gate are
0 (giving an output of 1) is 1/16 (as the events are independent of each other). Random
noise is injected into the compactor output only when the output ‘B’ of the NOR gate
is 1, as the output of the TRNG and ‘B’ are connected through the AND gate.
Similarly, to obtain an injection frequency factor of 4, one of the OR gates may be
removed and the corresponding input of the NOR gate connected to ground (logic 0), resulting
in a probability of 1/4 of obtaining a 0 at both inputs of the NOR gate. Similar simple
structures can be designed for all the other possible cases. The gate combination needed to
obtain an injection frequency factor which is not a power of 2 is also straightforward,
though it requires more gates. For instance, to obtain an injection factor of
5, we can construct a 4-input truth table which gives five 1s and eleven 0s at its output.
One such possible expression would be (¬I1∧¬I2∧I3)∨(I1∧I2∧¬I3)∨(I0∧¬I1∧I2∧I3),
where I0, I1, I2 and I3 represent the states at the four tap points of the LFSR.

Table 6.6: Change in success rate vs. how frequently a random bit is XORed to the
compactor output

  Injection Freq. Factor  Success Rate1  Success Rate2
  16                      63.33%         81.37%
  15                      59.17%         78.70%
  14                      54.37%         75.53%
  13                      49.34%         72.79%
  12                      43.78%         68.93%
  11                      37.71%         66.31%
  10                      32.76%         60.57%
  9                       28.52%         56.99%
  8                       23.55%         52.53%
  7                       18.77%         47.70%
  6                       15.05%         40.62%
  5                       11.91%         33.02%
  4                       8.32%          23.04%
  3                       5.51%          12.85%
  2                       2.93%          3.38%
  1                       0.90%          0.00%

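The gate structures of the noise injector can be checked by enumerating the LFSR tap states. The sketch below is an illustration only: the function names are hypothetical, the tap bits are treated as independent and uniformly distributed as assumed in the text, and the gate-level timing of a real implementation is ignored.

```python
from itertools import product


def b_iff16(i0, i1, i2, i3):
    """NOR of two 2-input OR gates: 1 only when all four taps are 0."""
    return int(not ((i0 | i1) | (i2 | i3)))


def b_iff4(i0, i1):
    """One OR gate removed, the freed NOR input tied to logic 0."""
    return int(not ((i0 | i1) | 0))


def b_expr(i0, i1, i2, i3):
    """The three-term expression quoted in the text."""
    return int((not i1 and not i2 and i3)
               or (i1 and i2 and not i3)
               or (i0 and not i1 and i2 and i3))


def inject(compactor_bit, b, trng_bit):
    """A = B AND C; the compactor output is XORed with A."""
    return compactor_bit ^ (b & trng_bit)


ones_16 = sum(b_iff16(*taps) for taps in product((0, 1), repeat=4))
ones_4 = sum(b_iff4(*taps) for taps in product((0, 1), repeat=2))
ones_expr = sum(b_expr(*taps) for taps in product((0, 1), repeat=4))
print(ones_16, ones_4, ones_expr)
```

The enumeration gives one active state out of 16 for the structure of Figure 6.10, one out of 4 for the reduced structure, and five 1s among the 16 rows of the quoted expression.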
We have also analyzed two ways of attacking this system to determine its security.
The ﬁrst attack does not use any information about the countermeasure, therefore
taking a black-box approach. The attack is applied just as it would be applied to any
other countermeasure. Results for this scan attack on the noise injector are provided
in the second column (Success Rate1) of Table 6.6. However, there is another way
of attacking this countermeasure. An attacker can first give the same input to the
crypto algorithm and the same test input to the test circuit. After collecting the
outputs by repeating this procedure a sufficient number of times, the attacker can figure
out the points where the TRNG corrupts the output. Later, DSA can be applied to the
circuit with the same test input and these bits can be removed from the test output
as they have the potential to corrupt the test output. We simulated the attack in
software and the probability of such an attack being successful for a random design
is presented in the third column (Success Rate2) of Table 6.6. As is evident from
the table, the success rate of the attack decreases drastically when noise is injected at
a higher frequency. Attack success reduces to only 0.9% when noise is injected at the
same frequency as the test clock.
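The first phase of the second attack, locating the corrupted output positions by repetition, can be sketched as follows. The model is an assumption for illustration: the injection cycles are fixed because the LFSR restarts identically on every run, a seeded software generator stands in for the TRNG, and the stream length, cycle positions and function names are invented.

```python
import random


def run_test(clean_output, injection_cycles, rng):
    """One scan-out: a fresh TRNG bit is XORed in at every injection cycle."""
    return [bit ^ rng.getrandbits(1) if t in injection_cycles else bit
            for t, bit in enumerate(clean_output)]


def find_noisy_cycles(runs):
    """Cycles whose value is not identical across all repeated runs."""
    return {t for t, column in enumerate(zip(*runs)) if len(set(column)) > 1}


rng = random.Random(7)                        # stands in for the on-chip TRNG
clean = [rng.getrandbits(1) for _ in range(64)]
injection_cycles = {3, 17, 18, 40, 59}        # hypothetical, fixed by the LFSR

runs = [run_test(clean, injection_cycles, rng) for _ in range(20)]
noisy = find_noisy_cycles(runs)
usable = [t for t in range(len(clean)) if t not in noisy]
print(sorted(noisy), len(usable))
```

Every position flagged this way is a genuine injection cycle; an injection cycle can escape detection only if the random bit happens to be identical in all repetitions, which becomes less likely as more runs are collected.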
Test coverage of the circuit is not affected by the proposed countermeasure, provided that
the following conditions hold:
• The LFSR structure (feedback polynomial, output points and seed value) should
be known to the tester.
• The test cycles should be ignored when signal ‘B’ is 1 (Figure 6.10).
• For achieving complete test coverage, the test patterns for ignored test cycles
should be repeated until signal ‘B’ becomes 0.
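The three conditions above can be turned into a small tester-side schedule. This is a sketch under assumptions: the 16-bit feedback polynomial, the tap points, the seed and the function names are hypothetical choices, since the text does not fix them, and the seed must be non-zero for the loop to terminate.

```python
def lfsr_step(state):
    """One step of a 16-bit Fibonacci LFSR (x^16 + x^14 + x^13 + x^11 + 1)."""
    bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
    return (state >> 1) | (bit << 15)


def signal_b(state):
    """B = NOR(OR(s0, s1), OR(s2, s3)) on four hypothetical tap points."""
    return int((state & 0xF) == 0)


def schedule(patterns, seed):
    """Return the (cycle, pattern) pairs whose responses the tester keeps.

    A pattern applied while B = 1 is discarded and applied again.
    """
    state = seed
    kept = []
    cycle = 0
    for pattern in patterns:
        while signal_b(state):          # response would be corrupted: retry
            state = lfsr_step(state)
            cycle += 1
        kept.append((cycle, pattern))
        state = lfsr_step(state)
        cycle += 1
    return kept, cycle


kept, total_cycles = schedule(list(range(100)), seed=0xACE1)
print(len(kept), total_cycles)
```

Since the tester can replay the LFSR from its known seed, it needs no knowledge of the TRNG output to decide which cycles to discard.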
The knowledge of the TRNG outputs is not required by the tester, as the test
cycles when signal ‘B’ = 1 are always ignored irrespective of signal ‘C’. We suggest
using dedicated test outputs when incorporating our proposed countermeasure. In
pin-constrained applications, the test outputs may be multiplexed with some of the
primary outputs. To make the proposed scheme compatible with such applications, an
additional control circuit is required which conﬁgures the whole cryptographic circuit
based on its mode of operation. In normal mode, the noise injector is not connected
to the compactor outputs, while in test mode it is connected. This could be realized
by multiplexers controlled by the mode selection input pin.
A TRNG is used instead of an LFSR for generating the random input ‘C’ because an
LFSR has a linear structure which is prone to cryptanalysis if the seed and feedback
polynomial are known or if the LFSR is not sufficiently long (and a long LFSR incurs
a high area overhead). A TRNG is far less predictable. An implementation
based on Fibonacci and Galois Ring Oscillators is presented in [Gol06].
Compared to some of the countermeasures mentioned in the previous section,
our proposed scheme has lower area overhead, as we are utilizing the existing test
compression infrastructure. Our noise injector countermeasure requires an area of
106.75 GEs incurring an overhead of 0.61% over the table-lookup based S-box AES
implementation in [Gez] (which needs 17 484.25 GEs) with implemented DfT, using
a Faraday 130nm library and synthesized using Synopsys Design Compiler version
C-2009.06-SP3. The area requirement is less than one-third of that required for the
partial scan countermeasure presented in Section 6.5.2. A representative area requirement
for Ring Oscillator based TRNGs is 16 × 10^-4 sqmm (or 200 GEs) as presented
in [BGL+03] (where for the UMC 180 nm technology employed in the paper, 125 KGEs are
generally contained in one sqmm). Hence, the overhead of the noise injector countermeasure
with an actual TRNG will be around 1%. The TRNG may also be part of a Cryptographic
SoC as an on-chip random number generator. In such a case, it would
not require any additional hardware resources. Test application time will however
increase by a factor of 1/IFF, for a given noise injection frequency factor IFF.
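The area figures quoted for the countermeasures can be recomputed directly from the numbers in the text. The constant and function names below are hypothetical; only the numerical values are taken from this chapter.

```python
AES_AREA_GE = 17484.25          # AES design with DfT, from the text
PARTIAL_SCAN_GE = 341.5         # partial scan countermeasure (Section 6.5.2)
NOISE_INJECTOR_GE = 106.75      # noise injector without the TRNG
TRNG_AREA_SQMM = 16e-4          # ring-oscillator TRNG area
GE_PER_SQMM = 125_000           # UMC 180 nm density quoted in the text


def overhead_percent(extra_ge, base_ge=AES_AREA_GE):
    """Area overhead of a countermeasure relative to the base design."""
    return 100.0 * extra_ge / base_ge


trng_ge = TRNG_AREA_SQMM * GE_PER_SQMM
print(f"partial scan:   {overhead_percent(PARTIAL_SCAN_GE):.2f}%")
print(f"noise injector: {overhead_percent(NOISE_INJECTOR_GE):.2f}%")
print(f"TRNG area:      {trng_ge:.0f} GE")
print(f"area ratio:     {NOISE_INJECTOR_GE / PARTIAL_SCAN_GE:.3f}")
```

The ratio printed last is the noise injector area divided by the partial scan area, which supports the statement that the former is less than one-third of the latter.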
We present here a brief comparison with the weighted (biased) pseudo-random
(WPR) test pattern generation schemes [AK94] employed in older Logic BIST solutions
before the advent of test data compression. The WPR approach is based on two key
steps. Firstly, bit-fixing is used, in which certain idler register bit positions in the scan
chains are assigned to hold fixed values for certain portions of the test time. This is
followed by biased pseudo-random (PR) testing, which consists of applying biased PR test
patterns in the variable idler register bits. Our noise injection technique is different
from this approach as we do not have fixed bit positions in the LFSR deciding the test
cycles where random noise from the TRNG is injected. We only use certain portions
of the LFSR state to derive the noise injection point.
6.6 Conclusions
In this work, security of industrial test compression schemes against diﬀerential scan
attacks has been evaluated in detail for the ﬁrst time. Scan attack results for all three
major DfT tools (from Synopsys, Cadence and Mentor Graphics) are presented. It is
demonstrated that space compaction with X-handling logic is vulnerable to scan attacks,
whereas time compaction acts as a strong countermeasure against scan attacks.
The most well-known scan attack countermeasures are investigated for their security
vulnerabilities and attack success results are presented. A new noise injector countermeasure
is proposed and its security properties are analyzed. Future work would be to
investigate the scan attack susceptibility of other time compression techniques, such as
SynTest Virtual Scan and Ultra Scan, which use a Time Division Demultiplexer (TDDM)
and Multiplexer (TDM).
Bibliography
[ADN+10] M. Agoyan, J.-M. Dutertre, D. Naccache, B. Robisson, and A. Tria.
When Clocks Fail: On Critical Paths and Clock Faults. In Smart Card
Research and Advanced Application - CARDIS 2010, pages 182–193.
Springer, 2010.
[AK94] M.F. AlShaibi and C. R. Kime. Fixed-biased pseudorandom built-in
self-test for random pattern resistant circuits. in Proc. IEEE ITC,
pages 929–938, 1994.
[And08] R. Anderson. Security Engineering: A Guide to Building Dependable
Distributed Systems. Wiley, 2008.
https://books.google.nl/books?id=ILaY4jBWXfcC.
[ARR03] D. Agrawal, J. R. Rao, and P. Rohatgi. Multi-channel attacks. In
Cryptographic Hardware and Embedded Systems - CHES 2003, vol-
ume 2779 of Lecture Notes in Computer Science, pages 2–16. Springer
Berlin Heidelberg, 2003. http://dx.doi.org/10.1007/978-3-540-45238-6_2.
[Atm03] Atmel Corporation. ATmega 162/v Datasheet, 2003.
http://www.atmel.com/Images/Atmel-2513-8-bit-AVR-Microntroller-ATmega162_Datasheet.pdf.
[BBB+14] P. Belgarric, S. Bhasin, N. Bruneau, J.-L. Danger, N. Debande,
S. Guilley, A. Heuser, Z. Najm, and O. Rioul. Time-frequency analysis
for second-order attacks. In Aurélien Francillon and Pankaj Rohatgi,
editors, Smart Card Research and Advanced Applications - CARDIS
2013, Lecture Notes in Computer Science, pages 108–122. Springer International
Publishing, 2014. http://dx.doi.org/10.1007/978-3-319-08302-5_8.
[BBD+14] S. Bhasin, N. Bruneau, J.-L. Danger, S. Guilley, and Z. Najm.
Analysis and Improvements of the DPA Contest v4 Implementation.
In Rajat Subhra Chakraborty, Vashek Matyas, and Patrick Schaumont,
editors, Security, Privacy, and Applied Cryptography Engineering,
volume 8804 of Lecture Notes in Computer Science, pages
201–218. Springer International Publishing, 2014.
http://dx.doi.org/10.1007/978-3-319-12060-7_14.
[BBS99] E. Biham, A. Biryukov, and A. Shamir. Cryptanalysis of skipjack
reduced to 31 rounds using impossible diﬀerentials. In Advances
in Cryptology — EUROCRYPT 1999, pages 12–23, Berlin, Heidelberg,
1999. Springer Berlin Heidelberg. http://dx.doi.org/10.1007/3-540-48910-X_2.
[BCG+12] J. Borghoff, A. Canteaut, T. Güneysu, E. B. Kavun, M. Knezevic,
L. R. Knudsen, G. Leander, V. Nikov, C. Paar, C. Rechberger,
P. Rombouts, S. S. Thomsen, and T. Yalçın. PRINCE: A Low-Latency
Block Cipher for Pervasive Computing Applications. In Advances
in Cryptology: ASIACRYPT 2012, volume 7658 of Lecture
Notes in Computer Science, pages 208–225. Springer Berlin Heidel-
berg, 2012.
[BCG13] S. Bhasin, C. Carlet, and S. Guilley. Theory of masking with code-
words in hardware: low-weight dth-order correlation-immune Boolean
functions. Cryptology ePrint Archive, Report 2013/303, 2013.
http://eprint.iacr.org/.
[BCO04] E. Brier, C. Clavier, and F. Olivier. Correlation power analysis
with a leakage model. In Marc Joye and Jean-Jacques Quisquater,
editors, Cryptographic Hardware and Embedded Systems - CHES
2004, volume 3156 of Lecture Notes in Computer Science, pages
16–29. Springer Berlin Heidelberg, 2004.
http://dx.doi.org/10.1007/978-3-540-28632-5_2.
[BDE+13] L. Batina, A. Das, B. Ege, E. B. Kavun, N. Mentens, C. Paar, I. Verbauwhede,
and T. Yalçın. Dietary recommendations for lightweight
block ciphers: Power, energy and area analysis of recently developed
architectures. In Michael Hutter and Jörn-Marc Schmidt, editors,
Radio Frequency Identification, volume 8262 of Lecture Notes in
Computer Science, pages 103–112. Springer Berlin Heidelberg, 2013.
http://dx.doi.org/10.1007/978-3-642-41332-2_7.
[BDK01] E. Biham, O. Dunkelman, and N. Keller. The rectangle attack —
rectangling the serpent. In Advances in Cryptology — EUROCRYPT
2001, pages 340–357, Berlin, Heidelberg, 2001. Springer Berlin Heidelberg.
http://dx.doi.org/10.1007/3-540-44987-6_21.
[BDPA11] G. Bertoni, J. Daemen, M. Peeters, and G. Van Assche. The Keccak
SHA-3 submission. Submission to NIST (Round 3), 2011.
http://keccak.noekeon.org/Keccak-submission-3.pdf.
[BECN+06] H. Bar-El, H. Choukri, D. Naccache, M. Tunstall, and C. Whelan.
The sorcerer’s apprentice guide to fault attacks. Proceedings of the
IEEE, 94(2):370–382, 2006.
[BEE+13] J. Balasch, B. Ege, T. Eisenbarth, B. Gérard, Z. Gong, T. Güneysu,
S. Heyse, S. Kerckhof, F. Koeune, T. Plos, T. Pöppelmann, F. Regazzoni,
F.-X. Standaert, G. Van Assche, R. Van Keer, L. van Oldeneel
tot Oldenzeel, and I. von Maurich. Compact implementation
and performance evaluation of hash functions in ATtiny devices. In
Stefan Mangard, editor, Smart Card Research and Advanced Applications
- CARDIS 2013, volume 7771 of Lecture Notes in Computer
Science, pages 158–172. Springer Berlin Heidelberg, 2013.
http://dx.doi.org/10.1007/978-3-642-37288-9_11.
[BGL+03] M. Bucci, L. Germani, R. Luzzi, A. Trifiletti, and M. Varanonuovo.
A high-speed oscillator-based truly random number source for cryptographic
applications on a smart card IC. IEEE Trans. Comput.,
52(4):403–409, 2003.
[BGLR09] L. Batina, B. Gierlichs, and K. Lemke-Rust. Diﬀerential Cluster
Analysis. In Christophe Clavier and Kris Gaj, editors, Cryptographic
Hardware and Embedded Systems - CHES 2009, volume 5747 of Lecture
Notes in Computer Science, pages 112–127. Springer Berlin Heidelberg,
2009. http://dx.doi.org/10.1007/978-3-642-04138-9_9.
[BGN+14] B. Bilgin, B. Gierlichs, S. Nikova, V. Nikov, and V. Rijmen. Higher-order
threshold implementations. In Advances in Cryptology – ASIACRYPT
2014: 20th International Conference on the Theory and
Application of Cryptology and Information Security, Kaoshiung, Taiwan,
R.O.C., December 7-11, 2014, Proceedings, Part II, pages 326–343,
Berlin, Heidelberg, 2014. Springer Berlin Heidelberg.
http://dx.doi.org/10.1007/978-3-662-45608-8_18.
[BGP+10] L. Batina, B. Gierlichs, E. Prouﬀ, M. Rivain, F.-X. Standaert, and
N. Veyrat-Charvillon. Mutual information analysis: a comprehensive
study. Journal of Cryptology, 24(2):269–291, 2010.
http://dx.doi.org/10.1007/s00145-010-9084-8.
[BGV11] J. Balasch, B. Gierlichs, and I. Verbauwhede. An In-depth and
Black-box Characterization of the Effects of Clock Glitches on 8-bit
MCUs. In Workshop on Fault Diagnosis and Tolerance in Cryptography,
FDTC 2011, pages 105–114. IEEE, 2011.
[BKL+07] A. Bogdanov, L. R. Knudsen, G. Leander, C. Paar, A. Poschmann,
M. J. Robshaw, Y. Seurin, and C. Vikkelsoe. PRESENT: An Ultra-
Lightweight Block Cipher. In Cryptographic Hardware and Embedded
Systems CHES 2007, Lecture Notes in Computer Science, pages 450–
466, Berlin, Heidelberg, 2007. Springer-Verlag.
[Bog07] A. Bogdanov. Improved side-channel collision attacks on AES. In
Selected Areas in Cryptography – SAC 2007, pages 84–95, Berlin,
Heidelberg, 2007. Springer Berlin Heidelberg.
http://dx.doi.org/10.1007/978-3-540-77360-3_6.
[Bog08] A. Bogdanov. Multiple-Differential Side-Channel Collision Attacks
on AES. In Elisabeth Oswald and Pankaj Rohatgi, editors, Cryptographic
Hardware and Embedded Systems – CHES 2008, volume 5154
of Lecture Notes in Computer Science, pages 30–44. Springer Berlin
Heidelberg, 2008. http://dx.doi.org/10.1007/978-3-540-85053-3_3.
[BS91] E. Biham and A. Shamir. Differential Cryptanalysis of DES-like
Cryptosystems. In Advances in Cryptology – CRYPTO 1990, Lecture
Notes in Computer Science, pages 2–21, London, UK, 1991.
Springer-Verlag.
[BS97] E. Biham and A. Shamir. Differential Fault Analysis of Secret Key
Cryptosystems. In Burton S. Kaliski Jr., editor, Advances in Cryptology
- CRYPTO 1997, volume 1294 of Lecture Notes in Computer
Science, pages 513–525. Springer, 1997.
[BS03] J. Blömer and J.-P. Seifert. Fault Based Cryptanalysis of the Advanced
Encryption Standard (AES). In Rebecca N. Wright, editor,
Financial Cryptography, 7th International Conference, FC 2003,
Guadeloupe, French West Indies, January 27-30, 2003, Revised Papers,
volume 2742 of Lecture Notes in Computer Science, pages 162–181.
Springer, 2003.
[Car05] C. Carlet. On highly nonlinear S-boxes and their inability to thwart
DPA attacks. In Progress in Cryptology - INDOCRYPT 2005, Lecture
Notes in Computer Science, pages 49–62, Berlin, Heidelberg, 2005.
Springer-Verlag.
[CDGM12] C. Carlet, J.-L. Danger, S. Guilley, and H. Maghrebi. Leakage Squeezing
of Order Two. In Steven Galbraith and Mridul Nandi, editors,
Progress in Cryptology - INDOCRYPT 2012, volume 7668 of Lecture
Notes in Computer Science, pages 120–139. Springer Berlin Heidelberg,
2012. http://dx.doi.org/10.1007/978-3-642-34931-7_8.
[CFG+11] C. Clavier, B. Feix, G. Gagnerot, M. Roussellet, and V. Verneuil.
Improved Collision-Correlation Power Analysis on First Order Protected
AES. In Bart Preneel and Tsuyoshi Takagi, editors, Cryptographic
Hardware and Embedded Systems – CHES 2011, volume 6917
of Lecture Notes in Computer Science, pages 49–62. Springer Berlin
Heidelberg, 2011. http://dx.doi.org/10.1007/978-3-642-23951-9_4.
[CH10] Y. Crama and P. L. Hammer. Boolean Models and Methods in Math-
ematics, Computer Science, and Engineering. Cambridge University
Press, New York, USA, 1st edition, 2010.
[CK10] J.-S. Coron and I. Kizhvatov. Analysis and Improvement of the Random
Delay Countermeasure of CHES 2009. In Stefan Mangard and
François-Xavier Standaert, editors, Cryptographic Hardware and Embedded
Systems, CHES 2010, volume 6225 of Lecture Notes in Computer
Science, pages 95–109. Springer Berlin Heidelberg, 2010.
http://dx.doi.org/10.1007/978-3-642-15031-9_7.
[Com12] Common Criteria. Part 3: Security assurance components. Technical
report, Common Criteria for Information Technology Security
Evaluation, 2012.
[CSM+14] K. Chakraborty, S. Sarkar, S. Maitra, B. Mazumdar, D. Mukhopadhyay,
and E. Prouff. Redefining the transparency order. Cryptology
ePrint Archive, Report 2014/367, 2014. http://eprint.iacr.org/.
[CT05] H. Choukri and M. Tunstall. Round Reduction Using Faults. Work-
shop on Fault Diagnosis and Tolerance in Cryptography, FDTC 2005,
5:13–24, 2005.
[CTMD15] M. Carbone, Y. Teglia, P. Maurine, and G. R. Ducharme. Interest of
MIA in frequency domain? In Proceedings of the Second Workshop on
Cryptography and Security in Computing Systems, CS2 2015, pages
35:35–35:38, New York, NY, USA, 2015. ACM.
http://doi.acm.org/10.1145/2694805.2694812.
[DEG+13] A. Das, B. Ege, S. Ghosh, L. Batina, and I. Verbauwhede. Secu-
rity analysis of industrial test compression schemes. Computer-Aided
Design of Integrated Circuits and Systems, IEEE Transactions on,
32(12):1966–1977, 2013.
[DH76] W. Diffie and M. E. Hellman. New directions in cryptography.
Information Theory, IEEE Transactions on, 22(6):644–654, 1976.
[DMM+13] A. Dehbaoui, A.-P. Mirbaha, N. Moro, J.-M. Dutertre, and A. Tria.
Electromagnetic Glitch on the AES Round Counter. In COSADE
2013, Paris, France, pages 17–31. Springer, 2013.
[DPRS11] J. Doget, E. Prouff, M. Rivain, and F.-X. Standaert. Univariate side
channel attacks and leakage modeling. Journal of Cryptographic Engineering,
1(2):123–144, 2011. http://dx.doi.org/10.1007/s13389-011-0010-2.
[DR02] J. Daemen and V. Rijmen. The Design of Rijndael. Springer-Verlag
New York, Inc., Secaucus, NJ, USA, 2002.
[DRDDN+12a] J. Da Rolt, A. Das, G. Di Natale, M.-L. Flottes, B. Rouzeyre, and
I. Verbauwhede. A new scan-attack on RSA in presence of industrial
countermeasures. In Proc. COSADE, 7275:89–104, 2012.
[DRDDN+12b] J. Da Rolt, A. Das, G. Di Natale, M.-L. Flottes, B. Rouzeyre, and
I. Verbauwhede. A new scan attack on elliptic curve cryptosystems in
presence of industrial design-for-testability structures. In Proc. DFT,
pages 43–48, 2012.
[DRDNF12] J. Da Rolt, G. Di Natale, M.-L. Flottes, and B. Rouzeyre. Are advanced
DFT structures sufficient for preventing scan-attacks? In Proc.
IEEE VTS, pages 325–336, 2012.
[DRDNFR11a] J. Da Rolt, G. Di Natale, M.-L. Flottes, and B. Rouzeyre. New Security
Threats Against Chips Containing Scan Chain Structures. In
Proc. HOST, pages 105–110, 2011.
[DRDNFR11b] J. Da Rolt, G. Di Natale, M.-L. Flottes, and B. Rouzeyre. Scan attacks
and countermeasures in presence of scan response compactors. In
Proc. IEEE ETS, pages 19–24, 2011.
[DSEA+12] N. Debande, Y. Souissi, M. A. El Aabid, S. Guilley, and J.-L. Danger.
Wavelet transform based pre-processing for side channel analysis. In
Proceedings of the 2012 45th Annual IEEE/ACM International Symposium
on Microarchitecture Workshops, MICROW 2012, pages 32–38,
Washington, DC, USA, 2012. IEEE Computer Society.
http://dx.doi.org/10.1109/MICROW.2012.15.
[DZFL14] A. A. Ding, L. Zhang, Y. Fei, and P. Luo. A Statistical Model for
Higher Order DPA on Masked Devices. IACR Cryptology ePrint
Archive, 2014:433, 2014.
[EDGV12] B. Ege, A. Das, S. Ghosh, and I. Verbauwhede. Differential Scan
Attack on AES with X-Tolerant and X-Masked Test Response Compactor.
In Proc. IEEE Euromicro Conf. DSD, pages 545–552, 2012.
[EEB16] B. Ege, T. Eisenbarth, and L. Batina. Near collision side channel at-
tacks. In Selected Areas in Cryptography - SAC 2015, Lecture Notes
in Computer Science, pages 277–292. Springer International Publish-
ing, 2016.
[EPBP15] B. Ege, K. Papagiannopoulos, L. Batina, and S. Picek. Improving DPA
resistance of S-boxes: How far can we go? In Circuits and Systems
(ISCAS), 2015 IEEE International Symposium on, pages 2013–2016,
2015.
[ESH+11] S. Endo, T. Sugawara, N. Homma, T. Aoki, and A. Satoh. An
on-chip glitchy-clock generator and its application to safe-error attack.
In COSADE 2011, Darmstadt, Germany, Workshop Proceedings
COSADE 2011, pages 175–182, 2011.
[FDLZ14] Y. Fei, A. A. Ding, J. Lao, and L. Zhang. A Statistics-based Fun-
damental Model for Side-channel Attack Analysis. IACR Cryptology
ePrint Archive, 2014:152, 2014.
[FLD12] Y. Fei, Q. Luo, and A. A. Ding. A Statistical Model for DPA with
Novel Algorithmic Confusion Analysis. In Cryptographic Hardware
and Embedded Systems CHES 2012, Lecture Notes in Computer Sci-
ence, pages 233–250, 2012.
[FT09] T. Fukunaga and J. Takahashi. Practical Fault Attack on a Cryptographic
LSI with ISO/IEC 18033-3 Block Ciphers. In Workshop on
Fault Diagnosis and Tolerance in Cryptography, FDTC 2009, pages
84–92. IEEE, 2009.
[GA03] S. Govindavajhala and A. W. Appel. Using Memory Errors to Attack
a Virtual Machine. In IEEE Symposium on Security and Privacy,
Proceedings of the 2003, pages 154–165, 2003.
http://dl.acm.org/citation.cfm?id=830563.
[GBTP08] B. Gierlichs, L. Batina, P. Tuyls, and B. Preneel. Mutual Information
Analysis. In Elisabeth Oswald and Pankaj Rohatgi, editors, Cryptographic
Hardware and Embedded Systems – CHES 2008, volume 5154
of Lecture Notes in Computer Science, pages 426–442. Springer Berlin
Heidelberg, 2008. http://dx.doi.org/10.1007/978-3-540-85053-3_27.
[Gez] http://rijndael.ece.vt.edu/gezel2/examples.html#aes. Gezel Hard-
ware/Software codesign Environment - Advanced Encryption Stan-
dard GEZEL Code.
[GHPS07] S. Guilley, P. Hoogvorst, R. Pacalet, and J. Schmidt. Improving
Side-Channel Attacks by Exploiting Substitution Boxes Properties.
In International Workshop on Boolean Functions: Cryptography and
Applications, BFCA 2007, pages 1–25, 2007.
[GHT05] C. H. Gebotys, S. Ho, and C. C. Tiu. EM analysis of Rijndael and
ECC on a wireless java-based PDA. In Cryptographic Hardware and
Embedded Systems - CHES 2005, Lecture Notes in Computer Science,
pages 250–264, Berlin, Heidelberg, 2005. Springer-Verlag.
http://dx.doi.org/10.1007/11545262_19.
[GKLD11] S. Guilley, K. Khalfallah, V. Lomne, and J.-L. Danger. Formal framework
for the evaluation of waveform resynchronization algorithms. In
Claudio A. Ardagna and Jianying Zhou, editors, Information Security
Theory and Practice. Security and Privacy of Mobile Devices in
Wireless Communication, volume 6633 of Lecture Notes in Computer
Science, pages 100–115. Springer Berlin Heidelberg, 2011.
http://dx.doi.org/10.1007/978-3-642-21040-2_7.
[GMO01] K. Gandolfi, C. Mourtel, and F. Olivier. Electromagnetic analysis:
Concrete results. In Ç. K. Koç, D. Naccache, and C. Paar, editors,
Cryptographic Hardware and Embedded Systems – CHES 2001,
pages 251–261, Berlin, Heidelberg, 2001. Springer Berlin Heidelberg.
http://dx.doi.org/10.1007/3-540-44709-1_21.
[Gol06] J. D. Golic. New methods for digital generation and postprocessing
of random data. IEEE Trans. Comput., 55(10):1217–1229, 2006.
[GP04] S. Guilley and R. Pacalet. Differential Power Analysis Model and
Some Results. In Proceedings of CARDIS 2004, pages 127–142.
Kluwer Academic Publishers, 2004.
[GS12] B. Gérard and F.-X. Standaert. Unified and Optimized Linear Collision
Attacks and Their Application in a Non-profiled Setting. In Emmanuel
Prouff and Patrick Schaumont, editors, Cryptographic Hardware
and Embedded Systems – CHES 2012, volume 7428 of Lecture
Notes in Computer Science, pages 175–192. Springer Berlin Heidelberg,
2012. http://dx.doi.org/10.1007/978-3-642-33027-8_11.
[GSP14] V. Grosso, F.-X. Standaert, and E. Prouff. Low entropy masking
schemes, revisited. In Aurélien Francillon and Pankaj Rohatgi, editors,
Smart Card Research and Advanced Applications - CARDIS
2013, pages 33–43, Cham, 2014. Springer International Publishing.
http://dx.doi.org/10.1007/978-3-319-08302-5_3.
[Hem04] L. Hemme. A Differential Fault Attack Against Early Rounds of
(Triple-) DES. In Cryptographic Hardware and Embedded Systems -
CHES 2004, pages 254–267. Springer, 2004.
[HFB+04] D. Hely, M.-L. Flottes, F. Bancel, B. Rouzeyre, N. Berard, and
M. Renovell. Scan design and secure chip [secure ic testing]. In
On-Line Testing Symposium, 2004. IOLTS 2004. Proceedings. 10th
IEEE International, pages 219–224, 2004.
[HRG14] A. Heuser, O. Rioul, and S. Guilley. A theoretical study of
Kolmogorov-Smirnov distinguishers. In Emmanuel Prouff, editor,
Constructive Side-Channel Analysis and Secure Design, volume 8622
of Lecture Notes in Computer Science, pages 9–28. Springer International
Publishing, 2014. http://dx.doi.org/10.1007/978-3-319-10175-0_2.
[HS13] M. Hutter and J.-M. Schmidt. The Temperature Side-Channel and
Heating Fault Attacks. In Smart Card Research and Advanced Appli-
cations - CARDIS 2013, Lecture Notes in Computer Science, 2013.
in press.
[HSH+08] J. A. Halderman, S. D. Schoen, N. Heninger, W. Clarkson, W. Paul,
J. A. Calandrino, A. J. Feldman, J. Appelbaum, and E. W. Felten.
Lest We Remember: Cold Boot Attacks on Encryption Keys. In 17th
USENIX Security Symposium, San Jose, CA, July 2008, pages 45–60,
2008.
[ISO12] ISO/IEC 29192-2:2012. Information technology – Security techniques
– Lightweight cryptography – Part 2: Block ciphers, 2012.
[IYHF09] M. Inoue, T. Yoneda, M. Hasegawa, and H. Fujiwara. Partial scan
approach for secret information protection. in Proc. IEEE ETS, pages
143–148, 2009.
[Kae08] H. Kaeslin. Digital Integrated Circuit Design – From VLSI Architec-
tures to CMOS Fabrication. Cambridge University Press, 2008. ISBN
978-0-521-88267-5.
[KHEB14] T. Korak, M. Hutter, B. Ege, and L. Batina. Clock Glitch Attacks
in the Presence of Heating. In Fault Diagnosis and Tolerance in
Cryptography (FDTC), 2014 Workshop on, pages 104–114, 2014.
[KJJ99] P. C. Kocher, J. Jaffe, and B. Jun. Differential Power Analysis. In
Advances in Cryptology – CRYPTO 1999, pages 388–397, London,
UK, 1999. Springer-Verlag.
[KK99] O. Kömmerling and M. G. Kuhn. Design Principles for Tamper-Resistant
Smartcard Processors. In Proceedings of the 1st USENIX
Workshop on Smartcard Technology (Smartcard 1999), Chicago, Illinois,
USA, May 10-11, 1999, pages 9–20, McCormick Place South,
1999. USENIX Association. ISBN 1-880446-34-0.
[KNdA13] R. Korkikian, D. Naccache, and G. O. de Almeida. Instantaneous
frequency analysis. IACR Cryptology ePrint Archive, 2013:320, 2013.
http://eprint.iacr.org/2013/320.
[Kob87] N. Koblitz. Elliptic Curve Cryptosystems. Mathematics of Compu-
tation, 48(177):203–209, 1987. http://www.jstor.org/stable/2007884.
[KR11] L. R. Knudsen and M. Robshaw. The Block Cipher Companion.
Information Security and Cryptography. Springer-Verlag Berlin Heidelberg,
2011. https://books.google.nl/books?id=YiZKt_FcmYQC.
[LH07] C. Liu and Y. Huang. Effects of embedded decompression and compaction
architectures on side-channel attack resistance. In Proc.
IEEE VTS, pages 461–468, 2007.
[LP07] G. Leander and A. Poschmann. On the Classification of 4 Bit S-Boxes.
In Arithmetic of Finite Fields, volume 4547 of Lecture Notes in
Computer Science, pages 159–176. Springer Berlin Heidelberg, 2007.
[LTP07] J. Lee, M. Tehranipoor, and J. Plusquellic. Securing designs against
scan-based side-channel attacks. IEEE Trans. Depend. and Secure
Comput., 4(4):325–336, 2007.
[Luo10] Q. Luo. Enhance multi-bit spectral analysis on hiding in temporal
dimension. In Smart Card Research and Advanced Applications
- CARDIS 2010, Lecture Notes in Computer Science, pages 13–23,
Berlin, Heidelberg, 2010. Springer-Verlag.
http://dx.doi.org/10.1007/978-3-642-12510-2_2.
[Man04] S. Mangard. Hardware countermeasures against DPA – a statistical
analysis of their effectiveness. In Tatsuaki Okamoto, editor, Topics
in Cryptology – CT-RSA 2004, volume 2964 of Lecture Notes in
Computer Science, pages 222–235. Springer Berlin Heidelberg, 2004.
http://dx.doi.org/10.1007/978-3-540-24660-2_18.
[Mas94] J. L. Massey. Guessing and entropy. In Information Theory, 1994.
Proceedings., 1994 IEEE International Symposium on, pages 204–,
1994.
[Mat94] M. Matsui. Linear cryptanalysis method for DES cipher. In Advances
in Cryptology – EUROCRYPT 1993, pages 386–397, Berlin,
Heidelberg, 1994. Springer Berlin Heidelberg.
http://dx.doi.org/10.1007/3-540-48285-7_33.
[MBLM12] D. Mavroeidis, L. Batina, T. Laarhoven, and E. Marchiori. PCA,
eigenvector localization and clustering for side-channel attacks on
cryptographic hardware devices. In Machine Learning and Knowledge
Discovery in Databases: European Conference, ECML PKDD
2012, Bristol, UK, September 24-28, 2012. Proceedings, Part I, pages
253–268, Berlin, Heidelberg, 2012. Springer Berlin Heidelberg.
http://dx.doi.org/10.1007/978-3-642-33460-3_22.
[Men10] Mentor Graphics. Silicon test and yield analysis whitepaper - high
quality test solutions for secure applications. 2010.
[MG10] E. Mateos and C. H. Gebotys. A new correlation frequency analysis
of the side channel. In Proceedings of the 5th Workshop on Embedded
Systems Security, WESS 2010, pages 4:1–4:8, New York, NY, USA,
2010. ACM. http://doi.acm.org/10.1145/1873548.1873552.
[MGD11] H. Maghrebi, S. Guilley, and J.-L. Danger. Leakage squeezing countermeasure
against high-order attacks. In Information Security Theory
and Practice. Security and Privacy of Mobile Devices in Wireless
Communication: 5th IFIP WG 11.2 International Workshop,
WISTP 2011, Heraklion, Crete, Greece, June 1-3, 2011. Proceedings,
pages 208–223, Berlin, Heidelberg, 2011.
http://dx.doi.org/10.1007/978-3-642-21040-2_14.
[MGH14] A. Moradi, S. Guilley, and A. Heuser. Detecting Hidden Leakages. In
Ioana Boureanu, Philippe Owesarski, and Serge Vaudenay, editors,
Applied Cryptography and Network Security, volume 8479 of Lecture
Notes in Computer Science, pages 324–342. Springer International
Publishing, 2014. http://dx.doi.org/10.1007/978-3-319-07536-5_20.
[Mil86] V. S. Miller. Use of elliptic curves in cryptography. In H. C. Williams,
editor, Advances in Cryptology – CRYPTO 1985, pages 417–426,
Berlin, Heidelberg, 1986. Springer Berlin Heidelberg.
http://dx.doi.org/10.1007/3-540-39799-X_31.
[MME10] A. Moradi, O. Mischke, and T. Eisenbarth. Correlation-Enhanced
Power Analysis Collision Attack. In Stefan Mangard and François-Xavier
Standaert, editors, Cryptographic Hardware and Embedded
Systems, CHES 2010, volume 6225 of Lecture Notes in Computer
Science, pages 125–139. Springer Berlin Heidelberg, 2010.
http://dx.doi.org/10.1007/978-3-642-15031-9_9.
[MMS13a] B. Mazumdar, D. Mukhopadhyay, and I. Sengupta. Constrained
Search for a Class of Good Bijective S-Boxes with Improved DPA
Resistivity. Information Forensics and Security, IEEE Transactions
on, PP(99):1–1, 2013.
[MMS13b] B. Mazumdar, D. Mukhopadhyay, and I. Sengupta. Design and im-
plementation of rotation symmetric S-boxes with high nonlinearity
and high DPA resilience. In Hardware-Oriented Security and Trust
(HOST), 2013 IEEE International Symposium on, pages 87–92, 2013.
[MOP07] S. Mangard, E. Oswald, and T. Popp. Power Analysis Attacks: Re-
vealing the Secrets of Smart Cards (Advances in Information Secu-
rity). Springer-Verlag New York, Inc., Secaucus, NJ, USA, 2007.
[Mor12] A. Moradi. Statistical Tools Flavor Side-Channel Collision Attacks.
In David Pointcheval and Thomas Johansson, editors, Advances in
Cryptology – EUROCRYPT 2012, volume 7237 of Lecture Notes in
Computer Science, pages 428–445. Springer Berlin Heidelberg, 2012.
http://dx.doi.org/10.1007/978-3-642-29011-4_26.
[MS11] T. Müller and M. Spreitzenbarth. FROST - Forensic Recovery of
Scrambled Telephones. In Michael Jacobson, Michael Locasto, Payman
Mohassel, and Reihaneh Safavi-Naini, editors, ACNS 2013,
Banff, AB, Canada, volume 7954, pages 373–388, 2011.
http://link.springer.com/chapter/10.1007%2F978-3-642-38980-1_23.
[MY93] M. Matsui and A. Yamagishi. A new method for known plaintext
attack of FEAL cipher. In Advances in Cryptology - EUROCRYPT
1992, Lecture Notes in Computer Science, pages 81–91, Berlin, Hei-
delberg, 1993. Springer-Verlag.
[NGD11] M. Nassar, S. Guilley, and J.-L. Danger. Formal Analysis of the Entropy
/ Security Trade-off in First-Order Masking Countermeasures
against Side-Channel Attacks. In Daniel J. Bernstein and Sanjit Chatterjee,
editors, Progress in Cryptology – INDOCRYPT 2011, volume
7107 of Lecture Notes in Computer Science, pages 22–39. Springer
Berlin Heidelberg, 2011. http://dx.doi.org/10.1007/978-3-642-25578-6_4.
[NRR06] S. Nikova, C. Rechberger, and V. Rijmen. Threshold implementations
against side-channel attacks and glitches. In Information and Communications
Security: 8th International Conference, ICICS 2006,
Raleigh, NC, USA, December 4-7, 2006. Proceedings, pages 529–545,
Berlin, Heidelberg, 2006. http://dx.doi.org/10.1007/11935308_38.
[NSGD12] M. Nassar, Y. Souissi, S. Guilley, and J.-L. Danger. RSM: A Small
and Fast Countermeasure for AES, Secure Against 1st and 2nd-order
Zero-offset SCAs. In Proceedings of the Conference on Design, Automation
and Test in Europe, DATE 2012, pages 1173–1178, San
Jose, CA, USA, 2012. EDA Consortium.
http://dl.acm.org/citation.cfm?id=2492708.2492999.
[NSY+10a] R. Nara, K. Satoh, M. Yanagisawa, T. Ohtsuki, and N. Togawa. Scan-
based attack against elliptic curve cryptosystems. in Proc. ASPDAC,
pages 407–412, 2010.
[NSY+10b] R. Nara, K. Satoh, M. Yanagisawa, T. Ohtsuki, and N. Togawa.
Side-channel attack against RSA cryptosystems using scan signatures.
IEICE Trans. Fund. Elec. Comm. and Comp. Sc., (E 93A(12)):2481–2489,
2010.
[NSZW09] J. Nakahara, P. Sepehrdad, B. Zhang, and M. Wang. Linear (hull)
and algebraic cryptanalysis of the block cipher PRESENT. In Cryptology
and Network Security: 8th International Conference, CANS 2009,
Kanazawa, Japan, December 12-14, 2009. Proceedings, pages 58–75,
Berlin, Heidelberg, 2009. http://dx.doi.org/10.1007/978-3-642-10433-6_5.
[Nyb91] K. Nyberg. Perfect Nonlinear S-Boxes. In Advances in Cryptology -
EUROCRYPT 1991, volume 547 of Lecture Notes in Computer Sci-
ence, pages 378–386. Springer, 1991.
[OP13] D. Oswald and C. Paar. Improving side-channel analysis with optimal
linear transforms. In Stefan Mangard, editor, Smart Card Research
and Advanced Applications - CARDIS 2013, volume 7771 of Lecture
Notes in Computer Science, pages 219–233. Springer Berlin Heidelberg,
2013. http://dx.doi.org/10.1007/978-3-642-37288-9_15.
[PBJ+14] S. Picek, L. Batina, D. Jakobović, B. Ege, and M. Golub. S-box,
set, match: A toolbox for S-box analysis. In David Naccache and
Damien Sauveron, editors, Information Security Theory and Practice.
Securing the Internet of Things, volume 8501 of Lecture Notes in
Computer Science, pages 140–149. Springer Berlin Heidelberg, 2014.
http://dx.doi.org/10.1007/978-3-662-43826-8_10.
[PEB+14] S. Picek, B. Ege, L. Batina, D. Jakobović, L. Chmielewski, and
M. Golub. On Using Genetic Algorithms for Intrinsic Side-channel
Resistance: The Case of AES S-box. In Proceedings of the First
Workshop on Cryptography and Security in Computing Systems, CS2
2014, pages 13–18, New York, USA, 2014. ACM.
[PEP+14] S. Picek, B. Ege, K. Papagiannopoulos, L. Batina, and D. Jakobovic.
Optimality and beyond: The case of 4x4 s-boxes. In 2014 IEEE
International Symposium on Hardware-Oriented Security and Trust,
HOST 2014, Arlington, VA, USA, May 6-7, 2014, pages 80–83, 2014.
[Pic15] S. Picek. Applications of Evolutionary Computation to Cryptography.
PhD thesis, Radboud University, 2015.
[PMMB15] S. Picek, B. Mazumdar, D. Mukhopadhyay, and L. Batina. Modified
transparency order property: Solution or just another attempt.
In Rajat Subhra Chakraborty, Peter Schwabe, and Jon Solworth,
editors, Security, Privacy, and Applied Cryptography Engineering,
volume 9354 of Lecture Notes in Computer Science, pages 210–227.
Springer International Publishing, 2015.
http://dx.doi.org/10.1007/978-3-319-24126-5_13.
[PPE+14] S. Picek, K. Papagiannopoulos, B. Ege, L. Batina, and D. Jakobovic.
Confused by confusion: Systematic evaluation of DPA resistance of
various S-boxes. In Willi Meier and Debdeep Mukhopadhyay, editors,
Progress in Cryptology – INDOCRYPT 2014, Lecture Notes in
Computer Science, pages 374–390. Springer International Publishing,
2014.
[PR13] E. Prouff and M. Rivain. Masking against side-channel attacks: A
formal security proof. In Advances in Cryptology – EUROCRYPT
2013, pages 142–159, Berlin, Heidelberg, 2013.
http://dx.doi.org/10.1007/978-3-642-38348-9_9.
[Pro05] E. Prouff. DPA Attacks and S-Boxes. In Fast Software Encryption –
FSE 2005, volume 3557 of Lecture Notes in Computer Science, pages
424–441. Springer, 2005.
[QS02] J.-J. Quisquater and D. Samyde. Eddy Current for Magnetic Analysis
with Active Sensor. In Conference on Research in SmartCards (E-
Smart’02), Nice, France., pages 185–194. UCL, 2002.
[RB08] M. Robshaw and O. Billet, editors. New Stream Cipher Designs: The
eSTREAM Finalists. Springer-Verlag, Berlin, Heidelberg, 2008.
[RSA78] R. L. Rivest, A. Shamir, and L. Adleman. A method for obtaining
digital signatures and public-key cryptosystems. Commun. ACM,
21(2):120–126, 1978. http://doi.acm.org/10.1145/359340.359342.
[RTK+02] J. Rajski, J. Tyszer, M. Kassab, N. Mukherjee, R. Thompson,
K.-H. Tsai, A. Hertwig, N. Tamarapalli, and G. Mrugalski. Embedded
deterministic test for low cost manufacturing test. In Proc. ITC,
pages 301–310, 2002.
[RTKM04] J. Rajski, J. Tyszer, M. Kassab, and N. Mukherjee. Embedded de-
terministic test. IEEE Trans. Comput.-Aided Des., 23(5):776–792,
2004.
[SGD08] N. Selmane, S. Guilley, and J.-L. Danger. Practical Setup Time Vio-
lation Attacks on AES. In Dependable Computing Conference, 2008.
EDCC 2008. Seventh European, pages 91–96. IEEE, 2008.
[SH07] J.-M. Schmidt and M. Hutter. Optical and EM Fault-Attacks on
CRT-based RSA: Concrete Results. In Johannes Wolkerstorfer and Karl
C. Posch, editors, Austrochip 2007, 15th Austrian Workshop on Microelectronics,
11 October 2007, Graz, Austria, Proceedings, pages 61–67.
Verlag der Technischen Universität Graz, 2007.
[SH08] J.-M. Schmidt and C. Herbst. A Practical Fault Attack on Square
and Multiply. In Workshop on Fault Diagnosis and Tolerance in
Cryptography, FDTC 2008, pages 53–58. IEEE, 2008.
[Sha49] C.E. Shannon. Communication theory of secrecy systems. Bell Sys-
tem Technical Journal, The, 28(4):656–715, 1949.
[Sko02] S. Skorobogatov. Low temperature data remanence in static RAM.
Technical report, University of Cambridge Computer Laboratory,
2002. http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-536.pdf.
[SLFP04] K. Schramm, G. Leander, P. Felke, and C. Paar. A Collision-Attack
on AES. In Cryptographic Hardware and Embedded Systems - CHES
2004. Springer Berlin / Heidelberg, 2004.
[SLP05] W. Schindler, K. Lemke, and C. Paar. A Stochastic Model for Differential
Side Channel Cryptanalysis. In Josyula R. Rao and Berk Sunar,
editors, Cryptographic Hardware and Embedded Systems – CHES
2005, volume 3659 of Lecture Notes in Computer Science, pages 30–46.
Springer Berlin Heidelberg, 2005.
http://dx.doi.org/10.1007/11545262_3.
[SMC07] G. Sengar, D. Mukhopadhyay, and D. R. Chowdhury. An efficient
approach to develop secure scan tree for crypto-hardware. In Proc.
ADCOM, pages 21–26, 2007.
[SMY09a] F.-X. Standaert, T. G. Malkin, and M. Yung. A Unified Framework
for the Analysis of Side-Channel Key Recovery Attacks. In Antoine
Joux, editor, Advances in Cryptology - EUROCRYPT 2009, volume
5479 of Lecture Notes in Computer Science, pages 443–461. Springer
Berlin Heidelberg, 2009. http://dx.doi.org/10.1007/978-3-642-01001-9_26.
[SMY09b] F.-X. Standaert, T. G. Malkin, and M. Yung. A unified framework
for the analysis of side-channel key recovery attacks. In Antoine Joux,
editor, Advances in Cryptology - EUROCRYPT 2009, volume 5479 of
Lecture Notes in Computer Science, pages 443–461. Springer Berlin
Heidelberg, 2009. http://dx.doi.org/10.1007/978-3-642-01001-9_26.
[SSAQ02] D. Samyde, S. P. Skorobogatov, R. J. Anderson, and J.-J. Quisquater.
On a New Way to Read Data from Memory. In IEEE Security in
Storage Workshop (SISW02), pages 65–69. IEEE Computer Society,
2002.
[Sto15] K. Stoffelen. Intrinsic Side-Channel Analysis Resistance and Efficient
Masking. Master’s thesis, Radboud University, 2015.
[STYO07] Y. Shi, N. Togawa, M. Yanagisawa, and T. Ohtsuki. Design for secure
test - a case study on pipelined advanced encryption standard. In
Proc. IEEE ISCAS, pages 149–152, 2007.
[SWP03] K. Schramm, T. Wollinger, and C. Paar. A New Class of Collision
Attacks and Its Application to DES. In Thomas Johansson, editor,
Fast Software Encryption, volume 2887 of Lecture Notes in Computer
Science, pages 206–222. Springer Berlin Heidelberg, 2003.
http://dx.doi.org/10.1007/978-3-540-39887-5_16.
[THM07] S. Tillich, C. Herbst, and S. Mangard. Protecting AES software
implementations on 32-bit processors against power analysis. In
J. Katz and M. Yung, editors, Applied Cryptography and Network
Security, 5th International Conference, ACNS 2007, Zhuhai, China,
June 5-8, 2007, Proceedings, volume 4521 of Lecture Notes in Computer
Science, pages 141–157. Springer, 2007.
http://dx.doi.org/10.1007/978-3-540-72738-5_10.
[TM13] S. Tiran and P. Maurine. SCA with magnitude squared coherence. In
Stefan Mangard, editor, Smart Card Research and Advanced Applications
- CARDIS 2012, volume 7771 of Lecture Notes in Computer
Science, pages 234–247. Springer Berlin Heidelberg, 2013.
http://dx.doi.org/10.1007/978-3-642-37288-9_16.
[TOT+14] S. Tiran, S. Ordas, Y. Teglia, M. Agoyan, and P. Maurine. A model
of the leakage in the frequency domain and its application to CPA
and DPA. Journal of Cryptographic Engineering, pages 1–16, 2014.
http://dx.doi.org/10.1007/s13389-014-0074-x.
[VGE13] R. Verdult, F. D. Garcia, and B. Ege. Dismantling Megamos Crypto:
Wirelessly lockpicking a vehicle immobilizer. In 22nd USENIX Security
Symposium (USENIX Security 13), Washington, D.C., 2013.
USENIX Association.
https://www.usenix.org/conference/usenixsecurity13/dismantling-megamos-crypto-wirelessly-lockpicking-vehicle-immobilizer.
[VKS11] I. Verbauwhede, D. Karaklajic, and J.-M. Schmidt. The Fault Attack
Jungle - A Classification Model to Guide You. In Luca Breveglieri,
Sylvain Guilley, Israel Koren, David Naccache, and Junko Takahashi,
editors, Workshop on Fault Diagnosis and Tolerance in Cryptography,
FDTC 2011, pages 3–8. IEEE, 2011.
[WO11] C. Whitnall and E. Oswald. A fair evaluation framework for compar-
ing side-channel distinguishers. Journal of Cryptographic Engineer-
ing, 1(2):145–160, 2011. http://dx.doi.org/10.1007/s13389-011-0011-1.
[WOM11] C. Whitnall, E. Oswald, and L. Mather. An exploration of the
Kolmogorov-Smirnov test as a competitor to mutual information analysis.
In Emmanuel Prouff, editor, Smart Card Research and Advanced
Applications - CARDIS 2011, volume 7079 of Lecture Notes in
Computer Science, pages 234–251. Springer Berlin Heidelberg, 2011.
http://dx.doi.org/10.1007/978-3-642-27257-8_15.
[WWK+07] P. Wohl, J. A. Waicukauski, R. Kapur, S. Ramnath, E. Gizdarski,
T. W. Williams, and P. Jaini. Minimizing the impact of scan compression.
In Proc. IEEE VTS, pages 67–74, 2007.
[WWPA03] P. Wohl, J. A. Waicukauski, S. Patel, and M. B. Amin. X-tolerant
compression and application of scan-ATPG patterns in a BIST architecture.
In Proc. IEEE ITC, pages 727–736, 2003.
[WWW06] L-T. Wang, C-W. Wu, and X. Wen. VLSI Test Principles and Archi-
tectures: Design for Testability (Systems on Silicon). Morgan Kauf-
mann Publishers Inc., San Francisco, CA, USA, 2006.
[YCE14] X. Ye, C. Chen, and T. Eisenbarth. Non-Linear Collision Analysis.
In Nitesh Saxena and Ahmad-Reza Sadeghi, editors, Radio Frequency
Identification: Security and Privacy Issues, Lecture Notes in
Computer Science, pages 198–214. Springer International Publishing,
2014. http://dx.doi.org/10.1007/978-3-319-13066-8_13.
[YE14] X. Ye and T. Eisenbarth. On the Vulnerability of Low Entropy Masking
Schemes. In Aurélien Francillon and Pankaj Rohatgi, editors,
Smart Card Research and Advanced Applications - CARDIS 2013,
Lecture Notes in Computer Science, pages 44–60. Springer International
Publishing, 2014. http://dx.doi.org/10.1007/978-3-319-08302-5_4.
[YWK04] B. Yang, K. Wu, and R. Karri. Scan based side channel attack on
dedicated hardware implementations of data encryption standard. In
International Test Conference, 2004. Proceedings. ITC 2004., pages
339–344, 2004.
[YWK06] B. Yang, K. Wu, and R. Karri. Secure scan: A design-for-test archi-
tecture for crypto chips. Computer-Aided Design of Integrated Cir-
cuits and Systems, IEEE Transactions on, 25(10):2287–2293, 2006.
[ZDZC09] P. Zhang, G. Deng, Q. Zhao, and K. Chen. EM frequency domain
correlation analysis on cipher chips. In Proceedings of the 2009 First
IEEE International Conference on Information Science and Engi-
neering, ICISE 2009, pages 1729–1732, Washington, DC, USA, 2009.
IEEE Computer Society. http://dx.doi.org/10.1109/ICISE.2009.542.

Curriculum vitae
Barış Ege
1984:
Born in Ankara, Turkey
2002 - 2007:
Bachelor of Science (Honours)
Mathematics
Middle East Technical University
2007 - 2010:
Master of Science (High Honours)
Cryptography
Middle East Technical University, Institute of Applied Mathematics
2009 - 2011:
Research Assistant
Institute of Applied Mathematics, Middle East Technical University
2012 - 2016:
PhD
Digital Security Group
Radboud University
Titles in the IPA Dissertation Series since 2013
H. Beohar. Refinement of Communi-
cation and States in Models of Embedded
Systems. Faculty of Mathematics and
Computer Science, TU/e. 2013-01
G. Igna. Performance Analysis of Real-
Time Task Systems using Timed Au-
tomata. Faculty of Science, Mathemat-
ics and Computer Science, RU. 2013-02
E. Zambon. Abstract Graph Transfor-
mation – Theory and Practice. Faculty
of Electrical Engineering, Mathematics
& Computer Science, UT. 2013-03
B. Lijnse. TOP to the Rescue – Task-
Oriented Programming for Incident Re-
sponse Applications. Faculty of Sci-
ence, Mathematics and Computer Sci-
ence, RU. 2013-04
G.T. de Koning Gans. Outsmart-
ing Smart Cards. Faculty of Science,
Mathematics and Computer Science,
RU. 2013-05
M.S. Greiler. Test Suite Compre-
hension for Modular and Dynamic Sys-
tems. Faculty of Electrical Engineer-
ing, Mathematics, and Computer Sci-
ence, TUD. 2013-06
L.E. Mamane. Interactive mathemat-
ical documents: creation and presenta-
tion. Faculty of Science, Mathematics
and Computer Science, RU. 2013-07
M.M.H.P. van den Heuvel. Compo-
sition and synchronization of real-time
components upon one processor. Faculty
of Mathematics and Computer Science,
TU/e. 2013-08
J. Businge. Co-evolution of the Eclipse
Framework and its Third-party Plug-ins.
Faculty of Mathematics and Computer
Science, TU/e. 2013-09
S. van der Burg. A Reference Archi-
tecture for Distributed Software Deploy-
ment. Faculty of Electrical Engineer-
ing, Mathematics, and Computer Sci-
ence, TUD. 2013-10
J.J.A. Keiren. Advanced Reduction
Techniques for Model Checking. Faculty
of Mathematics and Computer Science,
TU/e. 2013-11
D.H.P. Gerrits. Pushing and Pulling:
Computing push plans for disk-shaped
robots, and dynamic labelings for mov-
ing points. Faculty of Mathematics and
Computer Science, TU/e. 2013-12
M. Timmer. Efficient Modelling, Gen-
eration and Analysis of Markov Au-
tomata. Faculty of Electrical Engineer-
ing, Mathematics & Computer Science,
UT. 2013-13
M.J.M. Roeloffzen. Kinetic Data
Structures in the Black-Box Model. Fac-
ulty of Mathematics and Computer Sci-
ence, TU/e. 2013-14
L. Lensink. Applying Formal Methods
in Software Development. Faculty of Sci-
ence, Mathematics and Computer Sci-
ence, RU. 2013-15
C. Tankink. Documentation and For-
mal Mathematics — Web Technology
meets Proof Assistants. Faculty of Sci-
ence, Mathematics and Computer Sci-
ence, RU. 2013-16
C. de Gouw. Combining Monitoring
with Run-time Assertion Checking. Fac-
ulty of Mathematics and Natural Sci-
ences, UL. 2013-17
J. van den Bos. Gathering Evidence:
Model-Driven Software Engineering in
Automated Digital Forensics. Faculty of
Science, UvA. 2014-01
D. Hadziosmanovic. The Process
Matters: Cyber Security in Industrial
Control Systems. Faculty of Electrical
Engineering, Mathematics & Computer
Science, UT. 2014-02
A.J.P. Jeckmans. Cryptographically-
Enhanced Privacy for Recommender
Systems. Faculty of Electrical Engineer-
ing, Mathematics & Computer Science,
UT. 2014-03
C.-P. Bezemer. Performance Opti-
mization of Multi-Tenant Software Sys-
tems. Faculty of Electrical Engineer-
ing, Mathematics, and Computer Sci-
ence, TUD. 2014-04
T.M. Ngo. Qualitative and Quan-
titative Information Flow Analysis for
Multi-threaded Programs. Faculty of
Electrical Engineering, Mathematics &
Computer Science, UT. 2014-05
A.W. Laarman. Scalable Multi-Core
Model Checking. Faculty of Electrical
Engineering, Mathematics & Computer
Science, UT. 2014-06
J. Winter. Coalgebraic Characteri-
zations of Automata-Theoretic Classes.
Faculty of Science, Mathematics and
Computer Science, RU. 2014-07
W. Meulemans. Similarity Mea-
sures and Algorithms for Cartographic
Schematization. Faculty of Mathematics
and Computer Science, TU/e. 2014-08
A.F.E. Belinfante. JTorX: Exploring
Model-Based Testing. Faculty of Electri-
cal Engineering, Mathematics & Com-
puter Science, UT. 2014-09
A.P. van der Meer. Domain Specific
Languages and their Type Systems. Fac-
ulty of Mathematics and Computer Sci-
ence, TU/e. 2014-10
B.N. Vasilescu. Social Aspects of Col-
laboration in Online Software Communi-
ties. Faculty of Mathematics and Com-
puter Science, TU/e. 2014-11
F.D. Aarts. Tomte: Bridging the Gap
between Active Learning and Real-World
Systems. Faculty of Science, Mathemat-
ics and Computer Science, RU. 2014-12
N. Noroozi. Improving Input-Output
Conformance Testing Theories. Faculty
of Mathematics and Computer Science,
TU/e. 2014-13
M. Helvensteijn. Abstract Delta Mod-
eling: Software Product Lines and Be-
yond. Faculty of Mathematics and Nat-
ural Sciences, UL. 2014-14
P. Vullers. Efficient Implementations
of Attribute-based Credentials on Smart
Cards. Faculty of Science, Mathematics
and Computer Science, RU. 2014-15
F.W. Takes. Algorithms for Analyzing
and Mining Real-World Graphs. Faculty
of Mathematics and Natural Sciences,
UL. 2014-16
M.P. Schraagen. Aspects of Record
Linkage. Faculty of Mathematics and
Natural Sciences, UL. 2014-17
G. Alpár. Attribute-Based Identity
Management: Bridging the Cryptographic
Design of ABCs with the Real
World. Faculty of Science, Mathematics
and Computer Science, RU. 2015-01
A.J. van der Ploeg. Efficient Abstrac-
tions for Visualization and Interaction.
Faculty of Science, UvA. 2015-02
R.J.M. Theunissen. Supervisory
Control in Health Care Systems.
Faculty of Mechanical Engineering,
TU/e. 2015-03
T.V. Bui. A Software Architecture
for Body Area Sensor Networks: Flex-
ibility and Trustworthiness. Faculty
of Mathematics and Computer Science,
TU/e. 2015-04
A. Guzzi. Supporting Developers’
Teamwork from within the IDE. Faculty
of Electrical Engineering, Mathematics,
and Computer Science, TUD. 2015-05
T. Espinha. Web Service Grow-
ing Pains: Understanding Services and
Their Clients. Faculty of Electrical En-
gineering, Mathematics, and Computer
Science, TUD. 2015-06
S. Dietzel. Resilient In-network Aggre-
gation for Vehicular Networks. Faculty
of Electrical Engineering, Mathematics
& Computer Science, UT. 2015-07
E. Costante. Privacy throughout the
Data Cycle. Faculty of Mathematics and
Computer Science, TU/e. 2015-08
S. Cranen. Getting the point — Ob-
taining and understanding fixpoints in
model checking. Faculty of Mathematics
and Computer Science, TU/e. 2015-09
R. Verdult. The (in)security of pro-
prietary cryptography. Faculty of Sci-
ence, Mathematics and Computer Sci-
ence, RU. 2015-10
J.E.J. de Ruiter. Lessons learned
in the analysis of the EMV and TLS
security protocols. Faculty of Science,
Mathematics and Computer Science,
RU. 2015-11
Y. Dajsuren. On the Design of an Ar-
chitecture Framework and Quality Eval-
uation for Automotive Software Systems.
Faculty of Mathematics and Computer
Science, TU/e. 2015-12
J. Bransen. On the Incremental Eval-
uation of Higher-Order Attribute Gram-
mars. Faculty of Science, UU. 2015-13
S. Picek. Applications of Evolution-
ary Computation to Cryptology. Faculty
of Science, Mathematics and Computer
Science, RU. 2015-14
C. Chen. Automated Fault Localiza-
tion for Service-Oriented Software Sys-
tems. Faculty of Electrical Engineer-
ing, Mathematics, and Computer Sci-
ence, TUD. 2015-15
S. te Brinke. Developing Energy-
Aware Software. Faculty of Electrical
Engineering, Mathematics & Computer
Science, UT. 2015-16
R.W.J. Kersten. Software Analysis
Methods for Resource-Sensitive Systems.
Faculty of Science, Mathematics and
Computer Science, RU. 2015-17
J.C. Rot. Enhanced coinduction. Fac-
ulty of Mathematics and Natural Sci-
ences, UL. 2015-18
M. Stolikj. Building Blocks for
the Internet of Things. Faculty of
Mathematics and Computer Science,
TU/e. 2015-19
D. Gebler. Robust SOS Specifications
of Probabilistic Processes. Faculty of Sci-
ences, Department of Computer Science,
VUA. 2015-20
M. Zaharieva-Stojanovski. Closer
to Reliable Software: Verifying func-
tional behaviour of concurrent pro-
grams. Faculty of Electrical Engineer-
ing, Mathematics & Computer Science,
UT. 2015-21
R.J. Krebbers. The C standard for-
malized in Coq. Faculty of Science,
Mathematics and Computer Science,
RU. 2015-22
R. van Vliet. DNA Expressions –
A Formal Notation for DNA. Faculty
of Mathematics and Natural Sciences,
UL. 2015-23
S.-S.T.Q. Jongmans. Automata-
Theoretic Protocol Programming. Fac-
ulty of Mathematics and Natural Sci-
ences, UL. 2016-01
S.J.C. Joosten. Verification of Inter-
connects. Faculty of Mathematics and
Computer Science, TU/e. 2016-02
M.W. Gazda. Fixpoint Logic, Games,
and Relations of Consequence. Faculty
of Mathematics and Computer Science,
TU/e. 2016-03
S. Keshishzadeh. Formal Analysis
and Verification of Embedded Systems
for Healthcare. Faculty of Mathematics
and Computer Science, TU/e. 2016-04
P.M. Heck. Quality of Just-in-Time
Requirements: Just-Enough and Just-in-
Time. Faculty of Electrical Engineer-
ing, Mathematics, and Computer Sci-
ence, TUD. 2016-05
Y. Luo. From Conceptual Models
to Safety Assurance – Applying Model-
Based Techniques to Support Safety As-
surance. Faculty of Mathematics and
Computer Science, TU/e. 2016-06
B. Ege. Physical Security Analysis
of Embedded Devices. Faculty of Sci-
ence, Mathematics and Computer Sci-
ence, RU. 2016-07
