Brigham Young University

BYU ScholarsArchive
Theses and Dissertations
2021-05-17

Exploring the Performance Impacts of Harmful FPGA
Configurations
Tanner Gaskin
Brigham Young University


BYU ScholarsArchive Citation
Gaskin, Tanner, "Exploring the Performance Impacts of Harmful FPGA Configurations" (2021). Theses and
Dissertations. 9006.
https://scholarsarchive.byu.edu/etd/9006


Exploring the Performance Impacts of Harmful FPGA Configurations

Tanner Gaskin

A thesis submitted to the faculty of
Brigham Young University
in partial fulfillment of the requirements for the degree of
Master of Science

Brad L. Hutchings, Chair
Jeffery B. Goeders
Brent E. Nelson

Department of Electrical and Computer Engineering
Brigham Young University

Copyright © 2021 Tanner Gaskin
All Rights Reserved

ABSTRACT
Exploring the Performance Impacts of Harmful FPGA Configurations
Tanner Gaskin
Department of Electrical and Computer Engineering, BYU
Master of Science
In this work a new technique for accelerating the aging of FPGA devices is proposed and
demonstrated. The proposed technique uses harmful configurations (short circuits) to accelerate
the aging process on targeted portions of an FPGA chip. A testbed is developed for the purpose of
measuring FPGA degradation. Using this testbed it is shown that implementing thousands of short
circuits in FPGA fabric generates enough heat to cause significant damage to the chip, reducing
switching speeds by up to 8%. It is also demonstrated that different parts of the FPGA fabric can
be aged at different rates, with some parts of the chip only slowing down 2% while other parts slow
down as much as 8%.

Keywords: fpga, cmos aging, security

ACKNOWLEDGMENTS
I would first like to thank Dr. Brad Hutchings and Dr. Jeff Goeders, whose guidance has
greatly impacted both the direction of this work and my life. Two years prior to the start of this
work they took a chance, hiring me for a position I was grossly under-qualified for and undeserving
of. Under their tutelage I expanded my engineering abilities, learned how to communicate complex
ideas using both verbal and written mediums and gained an understanding of how to ask the right
questions at the right time, as well as how to answer them. Without their guidance and influence I
would not be the same person I am today.
Even more impactful has been my family. Even though none of them contributed technically to this work, they all played an instrumental role in its completion. Without their encouragement and support none of this would have been possible. Of special note are my mother and father,
who have been encouraging me from before I could walk and who have played an instrumental
role in helping me develop the fortitude necessary to complete this degree and the work presented
here.
I would also like to recognize the role that my fellow researchers played. In particular,
Hayden Cook has played an instrumental role in the development of this work, making many of
the key discoveries that made everything presented here possible. Hayden possesses a rare intellect
that has made him a pleasure to work with. I wish him the best in all his future endeavours.
In addition to Hayden there are many friends and co-workers that I have been blessed to
know and who have made this journey enjoyable. They are too many to name, but the list certainly
includes Benjamin James, Ben Alexander, Clark Green, Bobby Hale, Joan Magalhaes, Jennings
Leavitt, Wesley Stirk, Sean Jensen, Adam Hastings, Jacob Arscott and Brent George. Thank you
all for your friendship and support throughout some of the most formative years of my life.

CONTENTS
Title Page
Abstract
Acknowledgements
Table of Contents
List of Tables
List of Figures
Chapter 1 Introduction
  1.1 Contributions
  1.2 Thesis Outline
Chapter 2 Background
  2.1 FPGAs
    2.1.1 FPGA Internals
    2.1.2 Design Flows
    2.1.3 RapidWright
  2.2 Aging mechanisms
  2.3 Related Work
Chapter 3 Hardware Design
  3.1 RapidWright
  3.2 Short Circuits
Chapter 4 Testbed
  4.1 Measurement Technique
    4.1.1 Ring Oscillators
    4.1.2 RO Creation
    4.1.3 RO Performance
    4.1.4 Limitations
    4.1.5 Static Region
  4.2 Environmental Control and Monitoring
    4.2.1 Voltage
    4.2.2 Temperature
  4.3 Experiment Management
Chapter 5 Initial Experiments
  5.1 Low Current Experiment
    5.1.1 Methodology
    5.1.2 Results
  5.2 Temperature Experiments
Chapter 6 Long Term Experiment
  6.1 Methodology
    6.1.1 Short Circuit Layout
    6.1.2 Characterization Design
    6.1.3 Recovery Design
    6.1.4 Experiment Design
  6.2 Results
    6.2.1 Initial Burn Period
    6.2.2 Recovery Period
    6.2.3 Second Burn Period
Chapter 7 Discussion
  7.1 Localized Degradation
  7.2 Aging Effects
    7.2.1 Bias Temperature Inversion
    7.2.2 Hot Carrier Injection
    7.2.3 Electromigration
    7.2.4 Time-Dependent Dielectric Breakdown
  7.3 Other Notable Results
    7.3.1 Current over Time
    7.3.2 Damage Pattern
    7.3.3 Recovery Phase
  7.4 Limitations
Chapter 8 Conclusion
  8.1 Future Work
  8.2 Contributions

LIST OF TABLES
6.1 Results for the long term experiment.

LIST OF FIGURES
2.1 An example of a slice. There are 4 LUTs and 8 FFs in every slice.
2.2 An example of a switchbox. All of the potential connections for a single PIP have been highlighted.
2.3 An example of a synthesized design.
2.4 An example of a fully implemented design.
3.1 An example of how a short circuit is created inside an FPGA.
3.2 The three different short circuit options created by varying the primitive type used.
3.3 This graph shows the differences in current draw between short circuits created using different PIP junction groups. Each group was shorted four times. The error bars on the plot represent the min and max current draw out of the four tests, while the dot shows the average current draw.
3.4 Eight fully routed short circuits.
4.1 The logic function of a ring oscillator (RO). The AND gate is able to turn the RO on and off. When the RO is on a signal propagates through the inverters, generating a clock signal on ro clk that corresponds to the intrinsic speed of the resources involved.
4.2 A logical representation of the flow used to create RO and short circuit bitstreams. The static design is created in Vivado while the RO and short circuit designs are created in RapidWright. RO designs are combined with the static design to create RO bitstreams while short bitstreams are generated using only the RapidWright checkpoint.
4.3 A histogram showing the data collected during a single characterization. The distribution of the data can be seen to be normal.
4.4 A block diagram showing the static design used when characterizing the FPGA.
4.5 The experiment is kept in a controlled environment by a thermal chamber and a high precision power supply. There is also an Intel NUC mini PC and Labjack ADC for monitoring and controlling the experiments.
4.6 The input voltage to the FPGA over the course of five days. The four lines are the result of the quantization of the power supply. The input voltage fluctuates between two quantization values for the duration of the experiment. During characterization (when the load is less) the voltage still only fluctuates between two values, but two quantizations up from where it is while short circuits are programmed.
4.7 The temperature inside the thermal chamber over the course of five days. Each day a dip can be seen when a burn stops and a characterization takes place, showing that changes to the ambient temperature are due to the programmed short circuits and not external stimulus. Note: The stripes in the graph are due to the quantization of the temperature sensor.
4.8 A diagram showing a logical view of the testbed. The Intel NUC mini PC is the brain that controls and monitors all other parts of the system.
5.1 The floorplan of the short circuit bitstream used in the initial low current experiments.
5.2 A histogram showing the data collected during four different characterizations. The data distribution for each characterization can be seen to be normal. Short circuits were programmed in between each of these characterizations and it can be seen that this resulted in a slowdown of the intrinsic frequency of the board.
5.3 A picture of the board modifications that enable powering the FPGA directly with an external power supply. Multiple entry points were chosen to decrease the resistance introduced by the new solder joints.
5.4 A set of plots showing the correlation of RO frequency and chip temperature. This data was collected as part of a normal characterization and so occurred right after short circuits had been programmed to the board, which was the cause of the increased temperature.
5.5 A set of plots showing the relationship between RO frequency and chip temperature. This data was collected over a six minute period where the chip was heated and cooled externally. It can be seen that the RO frequency is proportional to the chip temperature.
6.1 Three ring oscillators are used to measure slowdowns across the chip, during the “characterization” phase of the experiment. 20,798 shorts are used during the “burn” phase of the experiment, covering the lower two-thirds of the part. While both the short circuit and RO configurations are shown in this figure, they are configured at different times.
6.2 The results of the long term, high current experiment. After each 24 hour period the Ring Oscillators were re-characterized. Those results are shown above, which track the slowdown of each RO as time progressed. It is clearly shown that the Top-RO (outside of the burn region), detected far less slowdown than the two ROs inside the burn region.
7.1 VINT input current during the first burn phase.

CHAPTER 1. INTRODUCTION

Through normal operation of semiconductor devices, the performance characteristics of
transistors gradually decline, resulting in decreased maximum clock speeds. This performance
degradation, referred to as transistor aging, is the result of several physical mechanisms (bias
temperature instability (BTI), electromigration (EM), and more), and is generally a greater concern
when scaling to smaller technology nodes [1]. Field-programmable gate arrays (FPGAs) are not
immune to this effect, and several studies have measured how FPGA performance is affected by
transistor aging [2]–[4].
Understanding FPGA aging is critical. Of highest importance, FPGA design tools must
account for aging mechanisms to ensure the reported fmax is achievable over the entire lifetime
of a part. However, aging also plays a role in other aspects of FPGA tools. Dogan et al. show
that FPGA aging models can be used as an effective predictor of recycled parts [4], and Maiti et
al. demonstrate how aging disrupts the reliability of physically unclonable functions (PUFs) [5].
Several works have shown how aging can be rapidly induced, inflicting years of wear-out on parts
in a short period of time [2], [5]. This is typically accomplished by raising the supply voltage
above normal levels and baking the part in high temperatures.
In traditional digital design the presence of a short circuit is considered harmful and CAD
tools work to prevent such configurations [6]. This work explores the impact these harmful configurations have on FPGA performance. A testbed capable of precisely measuring differences in
circuit delay is introduced and then used in a set of experiments that program thousands of short
circuits to an FPGA. These experiments demonstrate that such configurations can result in circuit
delay increasing by more than 8%. Over the course of some experiments the FPGA device sinks
current in excess of 7.9 A and self-heats to temperatures over 170 °C. The FPGA is typically left
in this stressed state for an extended period of time, up to 24 hours. Afterwards, the testbed is
used to determine how much additional delay has been induced by the excessive heat and current.
The testbed uses ring oscillators (ROs) to measure relative increases in delay induced by the invalid configuration. Experiments repeatedly load the shorting and characterization bitstreams to
measure the accelerated aging process over time. Over the course of several weeks these aging
experiments increase circuit delay by over 8%. This increase in delay
appears permanent; i.e., the degradation remains even after the FPGA is allowed time to recover.
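
As a rough illustration of how an RO measurement maps onto the delay figures quoted above, the following sketch converts a baseline frequency and a post-burn frequency into a relative delay increase. The function name and the example frequencies are hypothetical and are not measurements from this work; the only assumption is that the RO period is proportional to the delay of the resources in the loop.

```python
def delay_increase_percent(f_before_hz: float, f_after_hz: float) -> float:
    """Relative increase in loop delay implied by a drop in RO frequency.

    The RO period T = 1/f is proportional to the propagation delay around
    the loop, so the delay ratio is T_after / T_before = f_before / f_after.
    """
    if f_before_hz <= 0 or f_after_hz <= 0:
        raise ValueError("frequencies must be positive")
    return (f_before_hz / f_after_hz - 1.0) * 100.0


# Hypothetical example values (not data from this thesis).
baseline_hz = 100.0e6
aged_hz = 92.5e6
print(f"delay increase: {delay_increase_percent(baseline_hz, aged_hz):.2f}%")
```

Because only the ratio of the two frequencies matters, a measurement of this kind is relative: it reports how much slower a given RO has become, not the absolute delay of any single resource.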
Using harmful configurations (short circuits) as a method for degrading performance is
unique in that it does not require modifications to the supply voltage or an external heat source.
Potentially, this technique could be used to perform aging remotely, without physical control of the
FPGA; however, this would require that the system be equipped with a relatively high-amperage
power supply. Of perhaps more interest, this technique can apply aging non-uniformly across
the die, something that is not possible with previously published techniques. For example, our
experiments demonstrated that the fabric outside of the region targeted with short circuits experienced only one-fourth the slowdown of the targeted region. To our knowledge this
is the first work that has demonstrated targeted, nonuniform accelerated degradation on any commodity semiconductor device, not just FPGAs. We believe that as FPGAs continue to be used in
more applications, including safety-critical IoT devices and environments where users can deploy
FPGA bitstreams on remote machines (e.g., Amazon EC2), understanding and planning for these aging
techniques is of high importance.

1.1 Contributions
The primary contributions of this work are as follows:
• A technique for implementing short circuits on Xilinx 7-Series FPGAs.
• An approach for measuring FPGA performance as well as a testbed implementing this approach.
• A demonstration of an 8% permanent slowdown of the FPGA fabric using only configuration.
• A demonstration of significant non-uniformity in aging; FPGA fabric outside of the targeted
region exhibited only one-fourth the slowdown of the targeted region.

1.2 Thesis Outline
This thesis is outlined as follows. Chapter 2 introduces several background topics relating
to FPGAs (such as design flows and aging mechanisms) and discusses related work. Chapter 3 describes a technique for implementing short circuits on a Xilinx 7-Series FPGA. Chapter 4 discusses
measuring FPGA degradation and the implementation of a testbed for this purpose. Chapter 5
presents a set of initial experiments performed with short circuits and their impact on future explorations. Chapter 6 describes a long term experiment that was performed and Chapter 7 discusses
the results of that experiment. Chapter 8 concludes this thesis and discusses future work.

CHAPTER 2. BACKGROUND

This chapter introduces several background topics needed for understanding the work presented in this thesis. Section 2.1 discusses several topics relating to FPGAs, such as routing and
design flows. Section 2.2 discusses various aging mechanisms for CMOS devices.

2.1 FPGAs
Field programmable gate arrays (FPGAs) are integrated circuits that can perform the logic
function of any digital circuit. FPGAs are unique among integrated circuits because they are
designed to replicate the functionality of other digital circuits, namely Boolean logic functions.
They can be configured in the field to implement arbitrary logic functions, whereas most integrated
circuits have their logic functionality hardened into the silicon and cannot be changed once
fabricated. The configurable nature of FPGAs makes them ideal candidates for many applications,
such as high speed networking, satellite operations, accelerating internet searches and more [7]–[10].
Because FPGAs are configurable it is possible to create configurations that cause harm to
the FPGA. CAD tools treat such configurations as invalid and work to prevent them from occurring [6]. Since this work studies the effects of such harmful configurations it is important to first
understand certain terminology and the FPGA design process. This section provides insight into
the terminology and tools used in this work.

2.1.1 FPGA Internals
From a design perspective FPGAs consist of different types of resources that can be routed
together in different patterns. This routing is configurable and is changed to implement different
logic functions. FPGAs typically have many different types of resources that can be routed
together. This work focuses on only two, Look Up Tables (LUTs) and Flip Flops (FFs).

LUTs
A Look Up Table (LUT) consists of several inputs and a single output. The LUT can be
configured to implement the truth table of any arbitrary logic function that has the same number of
inputs. For example, if a LUT has two inputs then the output could be configured to be the AND
of those two inputs, or the OR of those two inputs, or the XOR, etc. If the LUT has three inputs
then the output can be configured such that it implements any logic function that has three inputs
and one output. In this work we use an Artix-7 FPGA, which uses six-input LUTs [11].
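
The truth-table behaviour described above can be sketched in a few lines of Python. This is a toy model for illustration only: the class name, the method names and the bit ordering of the packed truth table are our own assumptions and do not reflect the vendor's INIT encoding.

```python
from itertools import product


class LUT:
    """Toy model of a k-input look-up table (hypothetical helper class)."""

    def __init__(self, num_inputs: int, init: int):
        # `init` packs the truth table into an integer: bit i is the
        # output for the input combination whose address is i.
        self.num_inputs = num_inputs
        self.init = init & ((1 << (1 << num_inputs)) - 1)

    @staticmethod
    def _address(bits) -> int:
        # The first input is treated as the most significant address bit.
        address = 0
        for bit in bits:
            address = (address << 1) | (bit & 1)
        return address

    @classmethod
    def from_function(cls, num_inputs: int, func) -> "LUT":
        """Build the truth table by evaluating `func` on every input."""
        init = 0
        for bits in product((0, 1), repeat=num_inputs):
            if func(*bits):
                init |= 1 << cls._address(bits)
        return cls(num_inputs, init)

    def evaluate(self, *inputs: int) -> int:
        if len(inputs) != self.num_inputs:
            raise ValueError("wrong number of inputs")
        return (self.init >> self._address(inputs)) & 1


# The same structure implements AND, XOR, or any other function.
and2 = LUT.from_function(2, lambda a, b: a & b)
xor2 = LUT.from_function(2, lambda a, b: a ^ b)
parity6 = LUT.from_function(6, lambda *x: sum(x) % 2)
print(and2.evaluate(1, 1), xor2.evaluate(1, 1), parity6.evaluate(1, 0, 1, 1, 0, 0))
```

Configuring a LUT therefore amounts to choosing the 2^k stored bits; the lookup logic itself never changes.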

FFs
A flip flop (FF) is a logic element that holds state. It consists of an input, an output and a
clock signal. On the active clock edge the output of the FF changes to match the input; between
edges the output keeps the value it captured at the last edge, regardless of what
the input does. This means that the FF remembers the state of the input from the prior clock cycle,
making it a core building block in sequential circuit design. In this work we use FFs to output a
constant value and so do not use the clock input.

Routing
Almost all designs of interest consist of logic more complex than a single LUT or FF. As
a result FPGAs need a mechanism that allows for the inputs and outputs of LUTs and FFs to be
connected together, allowing for the implementation of large and complex logic functions. The
mechanism that allows for this is called routing, which consists of different levels of multiplexers,
allowing the FPGA to be configured in an almost infinite number of ways.
In Xilinx 7-series FPGAs the LUT and FF resources are grouped together in what is known
as a slice, seen in Figure 2.1 [11]. Each slice contains four LUTs, labeled A through D, and eight
FFs. Four of the FFs are primary FFs and four are secondary. The four primary FFs are also labeled
A through D.

Figure 2.1: An example of a slice. There are 4 LUTs and 8 FFs in every slice.

Each input and output of a slice is connected to many routing resources, known as programmable interconnect points (PIPs). Each PIP is also connected to many resources, including
the inputs and outputs of slices and other PIPs. As a result the output of a slice can be connected
to almost any other resource on the chip, simply by being routed through a series of PIPs.
To make routing easier PIPs usually live in dedicated routing logic called switchboxes,
which contain many PIPs and act as an intersection point. An example of a switchbox can be seen
in Figure 2.2. In this switchbox all the inputs to a single PIP have been highlighted to demonstrate
the number of possible sources a PIP might have. Switchboxes can contain hundreds of PIPs and
an FPGA may contain hundreds of thousands of switchboxes, enabling the highly reconfigurable
nature of FPGAs. Even though a PIP has many inputs, CAD tools ensure that only one is
active at a time. If multiple inputs are active then a bus conflict occurs, which results in undefined
behaviour. Such configurations are considered invalid.
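
The one-active-input rule can be pictured with a small Python model. This is a minimal sketch under our own assumptions: the class, the function and the node names are hypothetical, and the model follows the simplified description above rather than the actual device database.

```python
class PIP:
    """Toy model of a routing multiplexer with independently enabled inputs."""

    def __init__(self, name, sources):
        self.name = name
        self.sources = list(sources)
        self.enabled = set()

    def enable(self, source):
        if source not in self.sources:
            raise ValueError(f"{source} is not a source of {self.name}")
        self.enabled.add(source)

    def has_conflict(self) -> bool:
        # More than one active driver means the output is contended.
        return len(self.enabled) > 1


def check_single_driver(pips):
    """Return the names of PIPs that violate the one-active-input rule."""
    return [pip.name for pip in pips if pip.has_conflict()]


# Hypothetical node names, for illustration only.
valid = PIP("pip_a", ["node_1", "node_2", "node_3"])
valid.enable("node_1")

shorted = PIP("pip_b", ["node_1", "node_2", "node_3"])
shorted.enable("node_1")
shorted.enable("node_2")

print(check_single_driver([valid, shorted]))
```

A check of this shape is what prevents conventional flows from producing the bus conflicts described above; a configuration that fails it is one in which two sources drive the same wire.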

Figure 2.2: An example of a switchbox. All of the potential connections for a single PIP have been
highlighted.

2.1.2 Design Flows
To configure an FPGA a bitstream is needed. A bitstream is simply a stream of ones and
zeros that tells the FPGA what logic function each LUT should implement, which inputs to connect
to what PIPs, and so forth. Due to the proprietary nature of FPGA layouts only vendor-provided
CAD tools are capable of generating bitstreams. The vendor of the FPGA we targeted, Xilinx,
provides a tool called Vivado, which allows a user to design digital circuits and convert these designs to bitstreams. The tool flow is usually split into four steps: design, synthesis, implementation
and bitstream generation. As a part of each step a set of design rule checks (DRCs) is run to
ensure the validity of the design. Some DRCs can be downgraded to warnings, but some always result in errors. Different stages of the design flow may run different DRCs or have different rules for
the same DRC. Any error generated during these DRCs results in no bitstream being generated.
In this way the CAD tools ensure that bitstreams don’t contain invalid configurations.

Figure 2.3: An example of a synthesized design.

Design
The design step is relatively standard for all digital circuit design. A designer will describe
a set of functionality in a hardware description language (HDL), or using a product that
auto-generates HDL code (such as high-level synthesis products). The designer then provides this
hardware description to Vivado for the next step.

Synthesis
Once Vivado has received a design its first step is to synthesize it. This is the process
of taking the HDL description and converting it to a set of logical operations that implement the
described functionality. A synthesized design will first contain only a logical representation of the
design, using Boolean logic gates like AND, NOT and OR. Synthesis will then convert this
representation to a logical representation using only hardened FPGA resources, such as LUTs and
FFs. Figure 2.3 shows an example of what a synthesized design might look like after it has been
changed to use only FPGA resources.

Figure 2.4: An example of a fully implemented design.

Implementation
Once a design has been synthesized the next step is to implement the design, also known as
place and route. During this step the tool decides what resources to use on the chip to implement the
design (place) and which PIPs to use when connecting those resources together (route). During this
stage DRCs are run which check that each PIP only has a single input enabled. If this check fails
it will always result in an error (i.e. it cannot be downgraded to a warning) as such configurations
are considered harmful. An example of a fully implemented design can be found in Figure 2.4.
This design is largely constrained to the top left clock region, but has routing logic implemented
in every clock region (highlighted in green). In this design two Pblocks can also be seen, outlined
in purple. Designers use Pblocks to control the physical placement of logic. For example, in the
design shown in Figure 2.4 the logic block called static region has been assigned to a Pblock in
the upper left of the chip, constraining placement of static region to that Pblock.

Bitstream Generation
The last step is to generate the bitstream. The exact nature of this process is highly proprietary, but it is known that DRCs are run during this step. The DRC that checks for multiple inputs
to a PIP (described in the prior section) is also run here. Unlike in the implementation stage,
during bitstream generation this DRC can be downgraded from an error to a warning, creating a loophole that
makes it possible to generate bitstreams with invalid configurations.
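
The effect of this per-stage difference in severity can be sketched with a toy model. Nothing here is Vivado's actual interface: the rule name, the stage names and the data structures are hypothetical and serve only to show why a design with a multiple-input violation is blocked at implementation but can pass bitstream generation once the check is downgraded.

```python
from enum import Enum


class Severity(Enum):
    WARNING = 1
    ERROR = 2


def stage_passes(stage, violations, severity_for) -> bool:
    """Return True if no violation is an error in the given stage.

    `violations` lists the rule names found in the design and
    `severity_for` maps (stage, rule) to the severity in effect.
    """
    return all(
        severity_for[(stage, rule)] is not Severity.ERROR
        for rule in violations
    )


RULE = "multiple_pip_inputs"  # hypothetical rule name

# Default behaviour: the check is an error in both stages.
default = {
    ("implementation", RULE): Severity.ERROR,
    ("bitstream", RULE): Severity.ERROR,
}

# Only the bitstream-stage check can be downgraded to a warning.
downgraded = dict(default)
downgraded[("bitstream", RULE)] = Severity.WARNING

design_violations = [RULE]
print(stage_passes("implementation", design_violations, downgraded))
print(stage_passes("bitstream", design_violations, downgraded))
```

In this model a shorted design can only be built by a flow that skips the implementation stage and enters directly at bitstream generation.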

2.1.3 RapidWright
RapidWright is a Xilinx-provided CAD tool that allows for low-level manipulation of designs [12], [13]. Unlike Vivado, RapidWright allows a user to specify exactly what resources and
routing PIPs to use in a design. Furthermore, once it has performed this placement and routing it
has the ability to export the design as a fully implemented Vivado design checkpoint. Because the
design is already fully implemented, Vivado only needs to run bitstream generation, which has a
less restrictive set of DRCs than the implementation step, as described in Section 2.1.2. By implementing short circuits in RapidWright the loophole described in Section 2.1.2 can be exploited to
generate bitstreams that contain routing which would normally be considered invalid.

2.2 Aging mechanisms
Since this work is focused on presenting an approach for accelerating FPGA aging, it is
important to have a basic understanding of standard aging mechanisms. Prior work has established
several aging mechanisms pertinent to CMOS technologies: Bias Temperature Instability (BTI),
electromigration (EM), Hot Carrier Injection (HCI) and Time-Dependent Dielectric Breakdown
(TDDB) [14]–[18]. Each of these is discussed in more detail below.

Bias Temperature Instability
Bias Temperature Instability (BTI) is one of the key reliability issues with CMOS technology, and has become more prevalent as technology nodes have gotten smaller [1], [19]. There
are two distinct types of BTI: Negative Bias Temperature Instability (NBTI), which affects PMOS
transistors in CMOS, and Positive Bias Temperature Instability (PBTI), which affects NMOS transistors in
CMOS. In older CMOS technology nodes NBTI was the primary concern, since the PBTI effects
were negligible. However, since the introduction of high-k dielectrics in sub-45 nm technologies
PBTI has become as much of a concern as NBTI [19]–[22].
BTI occurs when an electric field is applied across the gate oxide of MOSFET transistors.
This can create interface traps and fill in preexisting traps with carriers in the dielectric. This
phenomenon causes the threshold voltage to slowly increase over time [14], [17]. As the threshold
voltage increases, the switching speed of the transistors decreases, thus degrading performance.
This effect is accelerated with higher temperatures and higher supply voltages [18], [23].
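
To give a feel for how a threshold-voltage shift translates into slower switching, the sketch below uses the widely used alpha-power-law delay model, in which gate delay scales as V_DD / (V_DD - V_th)^alpha. This model is not taken from this work; it is a standard first-order approximation, and the supply voltage, threshold voltage, shift and exponent used here are illustrative assumptions rather than parameters of the device under test.

```python
def relative_delay(vdd: float, vth: float, alpha: float = 1.3) -> float:
    """Gate delay up to a constant factor, per the alpha-power law."""
    if vdd <= vth:
        raise ValueError("supply voltage must exceed threshold voltage")
    return vdd / (vdd - vth) ** alpha


def slowdown_percent(vdd: float, vth_fresh: float, delta_vth: float,
                     alpha: float = 1.3) -> float:
    """Percent delay increase caused by a threshold shift of delta_vth."""
    fresh = relative_delay(vdd, vth_fresh, alpha)
    aged = relative_delay(vdd, vth_fresh + delta_vth, alpha)
    return (aged / fresh - 1.0) * 100.0


# Illustrative values only: 1.0 V supply, 0.30 V threshold, 20 mV shift.
print(f"{slowdown_percent(1.0, 0.30, 0.020):.2f}% slower")
```

Under these assumptions even a shift of a few tens of millivolts produces a slowdown of several percent, which is why small threshold changes are visible in RO frequency.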

Hot Carrier Injection
Hot Carrier Injection (HCI) occurs when carriers gain sufficient kinetic energy to overcome
a potential barrier and are injected into the gate oxide layer of a transistor. As hot carriers are injected into the oxide the chemical bonds of the oxide break down, creating interface traps, which
raises the threshold voltage and lowers switching speeds [14], [15]. This aging effect is accelerated
by increased voltages and high substrate current [17], [18], [24]. The high substrate current is typically induced by dynamic activity, i.e., frequent switching [2], [18]. While the temperature
dependence of HCI is not yet conclusive, it has been seen that HCI effects decrease slightly with
increased temperature [25], [26].

Electromigration
Electromigration (EM) is the gradual migration of metal atoms over time due to electric
currents. As this migration occurs the metal atoms slowly accumulate at one end of the channel,
thus narrowing the channel or wire [14], [15], [17]. This results in slower transistor switching
speeds due to increased resistances, which in turn results in degraded performance. Advanced
cases of EM can cause wires to break, resulting in circuit failure [14], [15], [17]. EM is accelerated
by both high DC current and high temperatures [18], [27], [28].

TDDB
Time-Dependent Dielectric Breakdown (TDDB) occurs when defects created in the gate
oxide cause traps to accumulate, creating a conductive path between the substrate and gate [15]–
[18]. This increases the leakage current of the transistors, which raises the overall power consumption and lowers switching speeds [23]. Eventually TDDB can cause a hard breakdown of
the dielectric and lead to device failure [14]. Defects that contribute to TDDB occur under constant stress, thus increasing device voltage accelerates the effects of TDDB. Like HCI, degradation
caused by TDDB has been seen to decrease with increased temperature [29].

2.3 Related Work

Our work is not the first to introduce short circuits in the FPGA bitstream, nor is it the first
to induce FPGA aging; however, it is the first published work to combine the two ideas, resulting
in the ability to perform localized aging via configuration.
Beckhoff et al. demonstrate how early partial reconfiguration tools could result in short
circuits that increase current and power consumption [6]. Hadzid et al. discuss how short circuits
in FPGAs could cause them to operate in an unsafe operating region, potentially exceeding the
power supply capability and disrupting operation and perhaps even damaging the part, though
actual damage was not shown [30]. Hutchings et al. show that short circuits can intentionally be
inserted into FPGAs and that the relationship between the number of short circuits and power
consumption is highly linear [31].
Previous aging studies have primarily used a combination of over-voltage and high temperatures to induce NBTI and HCI and accelerate aging [5], [18], [32]. Stott et al. raise the voltage and
temperature by 83% and 120% of their nominal values respectively to accelerate aging on an Altera Cyclone III device. They observe a 15% slowdown over 75 days with this method [18]. Maiti
et al. demonstrate a slowdown on a Xilinx Spartan 3e device using two stress phases. The first
phase increases voltage and temperature by 25% and 180% respectively, producing a slowdown
of approximately 5.0% after 200 hours. The second phase increases the voltage and temperature
by 50% and 220% over nominal respectively, resulting in a total slowdown of approximately 6.7%
after an additional 200 hours. In between phases they provide a day long recovery period where
the slowdown recovers 0.5%, from 5.0% to 4.5% [5]. Slimani et al. demonstrate aging on a Xilinx
Artix-7 part. This is the same FPGA used in this work. They were able to achieve a 1.8% slowdown by increasing the temperature by 400% of the nominal value for 14 days. They have an eight
day recovery period where the slowdown recovers 1.0%, from 1.8% to 0.8% [32].
As far as we are aware, only one other previous work has discussed using configuration to
produce a slowdown. Chakraborty et al. discuss the idea of using ring oscillators to heat up the
board and cause a slowdown, but do not validate their model through experimentation [33].

CHAPTER 3. HARDWARE DESIGN

The key ingredient in our accelerated aging technique is short circuits. In theory, a short
circuit is an uninhibited electrical path between two points that contain a difference in electrical
potential. The presence of a short circuit results in an infinite amount of current passing through
this path, continuing until the two points reach the same electrical potential. In practice, there is no
such thing as an uninhibited electrical path. As such, a short circuit is usually used to describe the
circumstance when an unintended low impedance path is created between two points that contain
a difference in electrical potential. When this occurs the current is limited only by the physical
characteristics of the component parts involved, and since such a path is usually unintended this
often results in currents high enough to cause damage to electrical components, potentially leading
to device failure.
In this work a short circuit is defined as a single PIP having two inputs enabled and where
those two inputs contain opposing logic levels. An example of this can be seen in Figure 3.1, where
the PIP has two inputs enabled, one from a LUT and another from a FF. The LUT is configured
to output a logic-1 while the FF is configured to output a logic-0. The connecting of these two
outputs to the same PIP results in a low impedance path from VCC to GND, thus creating a short
circuit. As described in chapter 2, enabling two inputs on a single PIP is considered an invalid
configuration, and CAD tool designers implement design rule checks to ensure that it doesn’t occur.
This is intended as a protection for digital designers, as such configurations are theorized to draw
large amounts of current and can theoretically damage the FPGA [6]. In this chapter a technique
is put forth for overcoming these protections. This chapter also describes various short circuit
configurations and evaluates which configuration has the greatest potential to damage the FPGA
fabric.

Figure 3.1: An example of how a short circuit is created inside an FPGA.

3.1 RapidWright

The primary issue with creating short circuits on an FPGA is the design rule that specifies
that a single PIP cannot have more than one input enabled. There are a few ways of overcoming
this restriction. One option is to reverse engineer the bitstream of an FPGA to the extent that it is
known which bits determine which inputs are enabled on every PIP. Once that is known,
it would be feasible to modify the bitstream to create as many short circuits as desired. This
approach has two drawbacks: first, it requires someone to reverse engineer the bitstream, which
is a time-consuming task; second, it requires low-level manipulation of bits in a bitstream,
which makes it impossible to design at a higher abstraction level, a benefit typically provided by
CAD tools.
In this work, reverse engineering the bitstream was the initial approach. However, the
above difficulties were quickly recognized and so other options were explored. The target FPGA,
the Artix-7, is built by Xilinx and their CAD tool Vivado is the industry standard for creating
designs for their FPGAs. Furthermore, Vivado is the only commercially available tool that is
capable of generating a bitstream that can be programmed onto the Artix-7 part. However, Xilinx
provides another tool, RapidWright, that allows for low level manipulation of FPGA designs at the
implementation level. In particular, RapidWright allows users to control the exact placement and
routing of their designs, something that Vivado does not allow to the same degree. Additionally,
RapidWright does not enforce the same design rule checks as Vivado during the implementation
stage, allowing a user to circumvent the Vivado implementation design rule checks and generate
bitstreams using only the design rules enforced by the bitstream generation step in Vivado.
This work takes advantage of this by using RapidWright to implement short circuits, which
consists of choosing which FPGA primitives and routing resources to use. The design is then exported from RapidWright in the form of a design checkpoint, which is then imported into Vivado.
Vivado then generates a bitstream from the implementation information provided by the design
checkpoint. Since RapidWright handles the entire implementation stage the design rule checks
Vivado typically runs for implementation are not run, circumventing the rule that checks for multiple inputs on the PIPs. It should be noted that the bitstream generation step has a design rule
that checks for this condition, but it can be downgraded from an error to just a warning, allowing
bitstreams to still be generated.
This approach for short circuit creation has several benefits. First, the bitstream doesn’t
need to be reverse engineered, as Vivado is generating the bitstream. This significantly reduces the
time and complexity involved in creating short circuits. Second, the design abstractions provided
by RapidWright can be used, resulting in faster design times, relative to creating short circuits
by manipulating bits in a bitstream. Third, Vivado guarantees that the circuit described in the
implementation step is the same circuit described in the bitstream, a guarantee that would be lost
if the bitstream needed to be modified manually. Fourth, this approach allows for characterization
methods to be designed that use as many of the same resources as the short circuits as possible.
These methods are described in more detail in chapter 4.

3.2 Short Circuits

Given our definition of a short circuit, where the outputs of two primitives holding opposite
values are connected together, there are several different configurations that qualify as a short
circuit. As such, some work needs to be done to determine which short is ideal, i.e., which draws the
highest current. There are three variables that can be changed to create new types of short circuits
and so need to be considered: the type of primitives used, the sink direction of the short and the
PIP group the primitives are connected by.
In this work the discussion will be limited to only consider LUT and FF primitives, as they
are the most prevalent resources on the target FPGA and the easiest to configure as a short circuit.
As such, three different types of short circuits are possible: LUT → FF, LUT → LUT, and FF → FF
(which can be seen in Figure 3.2). In order to determine which short produced the highest
current three bitstreams were created, each containing shorts of one type. The current draw of each
was measured, with the LUT → FF configuration drawing the most. Based on this the LUT → FF
configuration was chosen.

(a) LUT → FF Short

(b) LUT → LUT Short

(c) FF → FF Short

Figure 3.2: The three different short circuit options created by varying the primitive type used.

The next variable is the sink direction. Should the LUT be configured to logic-1, or
should the FF? To answer this two different bitstreams were created that contained the exact same
shorts, differing only in that one bitstream had all the LUTs configured as logic-1 and in the other
the FFs were configured as logic-1. In comparing the current draw of both it was determined
that there was no appreciable difference between the two. As such, it was arbitrarily decided to
configure the LUT as logic-1.
The remaining variable is determining which PIP group to use when connecting the outputs.
The layout of each logic slice and switchbox in the FPGA is very regular, meaning that the mapping
between slice outputs and the PIPs in the nearest switchbox is the same for every slice. This results
in each PIP belonging to a PIP group, where every PIP in the group belongs to a different switchbox
but has the same relative inputs from its nearest logic slice. Each primitive in a slice connects to
multiple PIPs. Primitives of the same type in a slice tend to not connect to any of the same PIPs,
but primitives of different types do. As such, all PIP groups that every LUT-FF pair shares need
to be evaluated in order to determine which configuration draws the most current. We determined
that for each LUT-FF pair there were 24 different PIP groups that could be used to create a short
circuit. To determine which PIP group drew the most current 96 different bitstreams were created,
24 of which were bitstreams that only used the top LUT-FF pair in each slice (labeled as ALUT
and AFF, respectively). Each bitstream in this group of 24 used a different PIP group. Another
24 used only the second LUT-FF pair (BLUT-BFF) in a slice to create the short circuits, while
another 24 used only the third LUT-FF pair (CLUT-CFF) and the remaining 24 used only the last
LUT-FF pair (DLUT-DFF). This tested whether or not the different LUT-FF pairs in a given slice
gave different results, as well as the different PIP groups.
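The test matrix described above (four LUT-FF pairs, each tried with 24 PIP groups) can be sketched as a simple enumeration. This is only an illustrative sketch: the PIP groups are represented by placeholder indices, since the text names only a few of the 24 groups, and the dictionary layout is an assumption rather than the format used in this work.

```python
from itertools import product

# The four LUT-FF pairs in a slice, as named in the text.
LUT_FF_PAIRS = [("ALUT", "AFF"), ("BLUT", "BFF"), ("CLUT", "CFF"), ("DLUT", "DFF")]

# Placeholder indices standing in for the 24 shared PIP groups (assumption:
# the real group names are not all listed in the text).
PIP_GROUPS = list(range(24))


def bitstream_matrix():
    """Return one configuration record per bitstream in the experiment."""
    return [
        {"lut": lut, "ff": ff, "pip_group": group}
        for (lut, ff), group in product(LUT_FF_PAIRS, PIP_GROUPS)
    ]


configs = bitstream_matrix()
```

Each record corresponds to one bitstream in which every short uses the same LUT-FF pair and the same PIP group.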

Figure 3.3: This graph shows the differences in current draw between short circuits created using
different PIP junction groups. Each group was shorted four times. The error bars on the plot represent
the min and max current draw out of the four tests, while the dot shows the average current draw.

Each bitstream was then programmed. After letting the current and temperature settle, the
steady state current draw was measured. The results of these experiments can be seen in Figure 3.3.
It was determined that the current didn’t vary significantly between the different LUT-FF pairs in

Figure 3.4: Eight fully routed short circuits.

a given slice and so the results of bitstreams that used the same PIP group were averaged together
in the plot. The error bars indicate the range that the 4 values were spread over.
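The averaging described above can be sketched as follows. This is a minimal illustration, not the analysis code used in this work: the tuple layout, the group name `group_0` and the current values are made up for the example.

```python
from collections import defaultdict
from statistics import mean


def summarize_by_pip_group(measurements):
    """Collapse per-bitstream results into mean/min/max per PIP group.

    measurements: iterable of (pip_group, lut_ff_pair, current_per_short_uA).
    """
    by_group = defaultdict(list)
    for group, _pair, current in measurements:
        by_group[group].append(current)
    return {
        group: {"mean": mean(vals), "min": min(vals), "max": max(vals)}
        for group, vals in by_group.items()
    }


# Hypothetical values for one PIP group measured with all four LUT-FF pairs.
example = [
    ("group_0", "A", 100.0),
    ("group_0", "B", 102.0),
    ("group_0", "C", 98.0),
    ("group_0", "D", 100.0),
]
summary = summarize_by_pip_group(example)
```

The `mean` entry corresponds to the dot in the plot, while `min` and `max` correspond to the ends of the error bar.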
As can be seen in Figure 3.3 the clear winner is PIP group NN2BEG, with the configuration
drawing over 600 µA per short. However, due to some configuration mistakes in the testbed the
experiments presented later in this work ended up not using the NN2BEG PIP group. Instead they
use a mix of the NE2BEG and SW2BEG PIP groups. Both of these groups draw slightly less than
600 µA per short. It should be noted that the bitstreams on which these results are based each had
6,000 short circuits. As the number of short circuits in a bitstream increases the current per short
decreases, and so the current per short value shown in Figure 3.3 only holds true when there are 6,000
shorts present. However, the relative ranking of the different PIP groups does not change as the
number of short circuits increases.
As a result, the short circuit configuration used in the experiments presented in this work
is a LUT-FF combination, where the LUT is driven with a logic-1, the FF with a logic-0 and they
are both connected to either the NE2BEG or SW2BEG PIP group. Using RapidWright these short
circuits can be placed in any logic slice on the chip. A floorplan representation of these short
circuits can be seen in Figure 3.4. This figure shows eight fully routed short circuits, highlighted
in red.

CHAPTER 4. TESTBED

In order to determine the efficacy of using invalid configurations to accelerate aging it is
critical to have a way of precisely measuring FPGA performance and a way of controlling the environment during the aging process. This chapter describes a testbed that was built to accomplish
these tasks. Section 4.1 describes a method for measuring the intrinsic delay of FPGA resources,
Section 4.2 describes methods for controlling and monitoring various aspects about the environment of the test boards and Section 4.3 describes the software environment needed to manage the
testbed. The testbed described in this chapter is one of the main contributions of this work, as it not
only enabled the novel results presented later in this work but also serves as a platform for future
exploration.

4.1 Measurement Technique

One of the most important tasks when testing a new technique for accelerating FPGA aging
is developing a technique for measuring performance degradation. This is difficult
because there is no direct way to measure the switching performance of any individual transistor
in the chip. As a result there is no way of determining the effect of an aging technique on any
given transistor. However, what can be measured is the performance of small groups of resources
in the fabric, including the routing associated with connecting the resources together. This section
describes our methodology for performing these measurements.

4.1.1 Ring Oscillators

Ring Oscillators (ROs) were our chosen method for measuring the intrinsic speed of the
FPGA fabric. ROs are purely combinational circuits that are built using a ring of an odd number of
inverters, causing a pulse to travel around the ring. Since these are purely combinational circuits the
speed with which this pulse travels is limited only by the physical characteristics of the transistors
and wires involved, making it the perfect candidate for measuring the intrinsic delay of FPGA
resources. In order to measure the damage caused by short circuits ROs can be created that use
some of the same resources as short circuits. RO delay can then be measured before and after short
circuits are programmed to the FPGA, enabling the detection of any increases in delay caused by
the presence of short circuits. Using this methodology characterization designs can be created that
test the performance of different parts of the FPGA fabric.

Figure 4.1: The logic function of a ring oscillator (RO). The AND gate is able to turn the RO on
and off. When the RO is on a signal propagates through the inverters, generating a clock signal
on ro_clk that corresponds to the intrinsic speed of the resources involved.

A logical representation of a RO can be seen in Figure 4.1. As shown, ROs contain an odd
number of inverters (implemented using LUTs) and a single AND gate (also implemented with
a LUT). These inverters are routed together sequentially. The AND gate is present to allow us
to enable and disable the oscillation of the RO. When the ro_en signal is set low the circuit is in
equilibrium and nothing changes. When it is set high the output of the AND gate will go high,
causing a chain reaction through the inverters that will eventually set one of the inputs to the AND
gate low, resulting in the output of the AND gate going low and a chain reaction that results in the
AND gate going high again. This pattern continues for as long as ro_en is set high, resulting in a
periodic signal being output on ro_clk. The speed at which ro_clk oscillates is dictated solely
by the physical characteristics of the transistors and wires through which the RO "pulse" travels.
Because of this the ro_clk oscillation frequency can be used as a measurement of the intrinsic speed
of the elements the RO uses in the FPGA fabric.
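The oscillation described above can be illustrated with a small unit-delay simulation. This is a sketch under simplifying assumptions: every gate is given the same delay of one time step, routing delay is ignored, and ro_clk is assumed to be tapped at the output of the last inverter. The function names are hypothetical.

```python
def simulate_ro_clk(n_inverters, steps):
    """Unit-delay simulation of an enabled RO; returns the ro_clk trace."""
    if n_inverters < 1 or n_inverters % 2 == 0:
        raise ValueError("the RO needs an odd number of inverters")
    # nodes[0] is the AND gate output, nodes[1:] are the inverter outputs.
    # Start from the equilibrium reached while ro_en is low: AND = 0 and
    # the inverter outputs alternate 1, 0, 1, ...
    nodes = [0]
    for _ in range(n_inverters):
        nodes.append(1 - nodes[-1])
    ro_en = 1  # the oscillator stays enabled for the whole simulation
    trace = []
    for _ in range(steps):
        prev = list(nodes)
        nodes[0] = ro_en & prev[-1]      # AND gate: enable AND feedback
        for i in range(1, len(nodes)):
            nodes[i] = 1 - prev[i - 1]   # each inverter negates its input
        trace.append(nodes[-1])          # ro_clk sampled at the last inverter
    return trace


def period_in_steps(trace):
    """Distance between the first two rising edges of the trace."""
    rising = [i for i in range(1, len(trace))
              if trace[i - 1] == 0 and trace[i] == 1]
    if len(rising) < 2:
        raise ValueError("trace too short to contain two rising edges")
    return rising[1] - rising[0]
```

Under these assumptions the period is twice the total delay around the ring, so an RO with three inverters and one AND gate (four unit delays) has a period of eight steps. Any slowdown in a stage lengthens the period, which is why the ro_clk frequency tracks the speed of the resources in the ring.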

In order to calculate the oscillation frequency, ro_clk is fed into a counter. ro_en is then set
high at the same time a different counter is started, running at a known frequency. Once this second
counter has reached a predetermined value the RO is disabled and the value of the counter that was
being fed by ro_clk is checked. Because the frequency and count of the second counter are known,
the frequency of the RO can be calculated. In this way the intrinsic speed of FPGA resources can
be measured.
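The two-counter calculation described above reduces to a single ratio. The following is a minimal sketch; the function and parameter names are hypothetical, and the numbers in the usage example are illustrative rather than measured values.

```python
def ro_frequency_hz(ro_count, ref_count, ref_freq_hz):
    """Estimate the RO frequency from the two counter values.

    ro_count:    edges counted on ro_clk while the RO was enabled
    ref_count:   predetermined terminal value of the reference counter
    ref_freq_hz: known frequency of the reference counter's clock
    """
    if ref_count <= 0 or ref_freq_hz <= 0:
        raise ValueError("reference count and frequency must be positive")
    # The measurement window lasts ref_count / ref_freq_hz seconds, so
    # f_ro = ro_count / window = ro_count * ref_freq_hz / ref_count.
    return ro_count * ref_freq_hz / ref_count


# Illustrative numbers: a 100 MHz reference counted to 1,000,000 (10 ms),
# during which the RO counter reached 250,000.
example_freq = ro_frequency_hz(250_000, 1_000_000, 100e6)
```

A longer measurement window (a larger reference count) reduces the relative effect of the one-count uncertainty at the start and end of the window.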

4.1.2 RO Creation

A combination of Vivado and RapidWright is used in the creation of these ROs. A graphical
representation of this flow can be seen in Figure 4.2. At a high level, each RO is synthesized and
placed using RapidWright, while Vivado performs routing and generates the bitstream. Each RO
is also a part of a larger design that performs other functions necessary for characterization. This
larger design is designed completely in Vivado.
When creating a RO, a design is first created in RapidWright, consisting of a single Pblock.
Inside this Pblock a RO is placed but not routed. The ROs are designed in RapidWright to consist of an odd number of inverters and a single AND gate. These nodes are assigned to specific
LUTs (placed) using RapidWright. The design is then exported from RapidWright as a design
checkpoint.
Vivado is then used to design and implement a system that allows for the measurement and
communication of on-board information, such as RO frequencies. This design is referred to as the
static region and is described in more depth in Section 4.1.5. As a part of this design a black box is
instantiated. Vivado then synthesizes and implements the design. During the implementation step
the design checkpoint for the RO created by RapidWright gets read in and the Pblock in the RO
design replaces the black box in the Vivado design. This Pblock is already placed but gets routed
along with the rest of the design. After this implementation step a bitstream is generated.
During this process the black box in the Vivado design gets labeled as partially reconfigurable. This labeling is part of the process that enables the reading in of the partially implemented
design checkpoint during the implementation stage. It also means that if RapidWright is used to
implement multiple different ROs then they all can be implemented in the same implementation
run, without needing to rerun implementation on the static design. This flow also results in a set
of partial bitstreams, allowing new ROs to be programmed in the system without interrupting the
processes occurring in the static design.

Figure 4.2: A logical representation of the flow used to create RO and short circuit bitstreams. The
static design is created in Vivado while the RO and short circuit designs are created in RapidWright.
RO designs are combined with the static design to create RO bitstreams while short bitstreams are
generated using only the RapidWright checkpoint.

4.1.3 RO Performance

In order to evaluate the efficacy of any measurement technique it is important to understand
how precise it is. Precision is defined as how close multiple measurements of the same thing are
to each other and is very important when measuring changes in circuit delay. Since slowdown is
measured by taking two measurements (with damage being performed in between) and calculating
the difference one first has to know what the error bounds on a single measurement are. If the
difference between two measurements is within these error bounds then determining how much
damage occurred becomes far more difficult. If the error bounds aren’t known then they must be
assumed to be infinite, resulting in all the measurements becoming meaningless.

To this end, a series of experiments were designed for evaluating the precision of our RO
measurements. A series of 177 bitstreams were created, each with a single RO that varied in size
and shape between different bitstreams. Sizes varied from 3 inverters to 799 inverters and shapes
varied from spanning multiple columns, to spanning multiple rows, to spanning both multiple
columns and rows.
Each bitstream was then programmed on a board and measurements were collected for 20
minutes, with a single measurement occurring once every second. This same process was then repeated on a different board. This allowed for the evaluation of the precision between measurements
of the same RO on the same board and between the same RO on different boards. The variation in
RO type across the bitstreams also allowed us to gain insight into whether or not the RO size and
shape mattered in regards to precision.
The results of this experiment were encouraging. It was found that in all cases the single
board results had relatively small error bounds. Variance was calculated by finding the difference
between the highest and lowest measured values and then dividing that by the average of all 1200
measurements. The highest variance found out of all 354 tests was ±0.139% and the lowest
variance was ±0.062%, meaning that any of the ROs were precise enough to measure differences
in delay greater than ±0.150%.
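The spread calculation described above can be sketched as follows. This follows the description literally (highest minus lowest, divided by the average); whether the reported ± figures correspond to this full range or to half of it is not stated, so the sketch returns the full range. The function name is hypothetical.

```python
def percent_spread(samples):
    """Range of the samples as a percentage of their average."""
    if not samples:
        raise ValueError("at least one sample is required")
    average = sum(samples) / len(samples)
    return (max(samples) - min(samples)) / average * 100.0


# Illustrative frequencies (arbitrary units), not measured data.
example_spread = percent_spread([99.0, 100.0, 101.0])
```

Note that this quantity is a normalized range rather than a statistical variance, so it is sensitive to single outliers among the 1200 measurements.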
It was determined that the differences between the different boards were much greater than
the differences between measurements on a single board. The largest percent difference found
between the same RO on two different boards was 1.422%, and the smallest percent difference
was 0.101%. It was determined that this amount of difference was too large to be able to
effectively compare measurements taken on different boards, even if the bitstreams are the same.
We theorize that inter board differences were greater than intra board differences due to process
differences between dies, introduced by the manufacturing process.
In collecting this data it was also discovered that the distribution of measurements for
a single RO is normal. This can be seen in Figure 4.3, where the measurements taken from a
single run of one of the bitstreams is plotted in a histogram. It can clearly be seen that the noise
in the measurements follows a normal distribution, which allows us to measure slowdown even
more precisely than the ±0.139% variance described earlier. This is because, even if portions of
the measurements overlap, the peaks of the distribution will not (assuming an increase in delay
occurred), thus allowing for confirmation that differences are present. This will be explored more
in chapter 5.

Figure 4.3: A histogram showing the data collected during a single characterization. The distribution of the data can be seen to be normal.
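Because the measurement noise is approximately normal, a slowdown can be estimated by comparing the means of the two characterizations rather than individual samples. The sketch below is one possible way to do this and is not the analysis used in this work; it assumes the samples are independent, and the function name and example values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def slowdown_percent(before, after):
    """Return (slowdown %, standard error of the slowdown in %).

    before, after: RO frequency samples from the two characterizations.
    """
    mean_before, mean_after = mean(before), mean(after)
    slowdown = (mean_before - mean_after) / mean_before * 100.0
    # Standard error of the difference between two independent means.
    se_diff = sqrt(stdev(before) ** 2 / len(before)
                   + stdev(after) ** 2 / len(after))
    return slowdown, se_diff / mean_before * 100.0


# Illustrative samples (arbitrary units), not measured data.
example_slowdown, example_error = slowdown_percent(
    [100.0, 101.0, 99.0, 100.0],
    [98.0, 99.0, 97.0, 98.0],
)
```

With many samples per characterization the standard error of each mean shrinks, which is why a shift in the peak of the distribution can be resolved even when the two sets of measurements overlap.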

4.1.4 Limitations

One of the characteristics of using ROs to measure the intrinsic speed of FPGA resources
is that it only measures the resources used by the RO. This means that in order to best measure
damage caused by short circuits the ROs need to use the same resources as the short circuits.
Unfortunately, this is quite difficult.
The first problem is that, as described in chapter 3, our short circuits use both LUTs and
FFs. However, the key feature of ROs that allows them to be used to measure intrinsic
FPGA speed is that they are made up of purely combinational logic, which means that they are
built entirely out of LUTs and use no FFs. As a result, any damage that might have been done to
the FFs isn’t being measured.
The second problem is that when routing the ROs it is far easier and more efficient to allow
Vivado to perform the routing, as compared to doing it manually in RapidWright. This is because
the ROs connect multiple slices together, which results in needing to use multiple levels of PIPs,
which quickly becomes far more complex than the routing needed for the short circuits. As a result
the ROs can’t be forced to use the same PIP groups that the short circuits do, meaning that damage
to the PIPs also isn’t being measured as very few of the PIPs the ROs use overlap with the PIPs the
short circuits use.
Because of these limiting factors it can only be guaranteed that the ROs use the same LUTs
as the short circuits and that the resources used by the ROs are physically close to the rest of the
resources used by the short circuits (i.e. PIPs are in the same switchbox, FFs are in the same
slice). This in turn means that this characterization method is largely limited to measuring damage
done by aging mechanisms that impact the area around a shorted transistor, as compared to aging
mechanisms that impact only the transistor itself. These would largely be effects accelerated by
heat, as abnormally high temperatures are one of the primary side effects of short circuits. More
information on aging effects can be found in chapter 2. These limitations are addressed more fully
in Chapter 7.

4.1.5 Static Region

In addition to the RO logic, the characterization bitstream also needs logic that (1)
controls the RO, (2) measures the RO frequency and (3) communicates that frequency to the
rest of the system, so that it can be stored for later analysis. There also needs to be logic that
periodically measures the on-chip temperature and communicates it to an off-board system that
stores the information. To this end a system was designed to accomplish all of these tasks. As
can be seen in Figure 4.4, this system contains a Xilinx MicroBlaze microprocessor, ADC module,
UART, AXI Timer and a custom RO control module. The MicroBlaze is running software that
periodically measures the chip temperature through the on-chip Xilinx ADC and the RO frequency
using the AXI Timer and RO controller. It then transmits these values over the UART where they
are read by an external system.

Figure 4.4: A block diagram showing the static design used when characterizing the FPGA.

4.2 Environmental Control and Monitoring

One of the difficulties of evaluating whether or not a specific technique causes performance
degradation is determining whether or not differences in measurements are due to measurement
noise, environmental factors or the technique being evaluated. Section 4.1 described our measurement technique and how the noise of its measurements was evaluated. This section describes how
environmental factors are controlled.
The two environmental factors that most heavily influence CMOS switching performance
are voltage and temperature [2], [5], [22], [34]–[36]. To control and monitor these factors a test
setup was created, which can be seen in Figure 4.5. In this setup a thermal chamber, power supply, Labjack and Intel NUC mini PC can be seen. The thermal chamber is used to control the
environmental temperature during the experiments and the power supplies are used to control the
input voltage to the FPGA. The Labjack is an ADC measurement device that allows us to monitor
the input voltage, input current and ambient temperature in the thermal chamber. The NUC is a
mini PC that is running control software, which controls and records information from the power
supplies, Labjack and the test boards themselves.

4.2.1 Voltage

Changes in supply voltage have a great impact on transistor switching speeds [34]–[36].
Since our measurement of degradation depends upon measuring changes in switching speed (or
intrinsic circuit delay) it is important that the input voltage is controlled, so that it can be guaranteed
that changes in voltage aren’t the cause of any measured changes in FPGA performance.
In order to do this the test boards are powered with an external power supply. For some
of the experiments this power was supplied through the provided power connector on the Arty-7
board. In other experiments the input voltage was supplied by soldering directly to points on the
board that connected to the FPGA core power. In both of these cases external power supplies
provided constant voltage to the board. The sense feature of the power supply was used to account
for voltage drops across the voltage supply lines and Labjack monitoring equipment. This was
accomplished by running very low impedance wires from the board (where the input connects)

Figure 4.5: The experiment is kept in a controlled environment by a thermal chamber and a high
precision power supply. There is also an Intel NUC mini PC and Labjack ADC for monitoring and
controlling the experiments.

back to the power supply, allowing the power supply to automatically account for any voltage
drops that occurred and keep the actual input voltage constant.
In addition to powering the boards with the power supply the voltage information from the
power supply was recorded and stored in a database several times a second. Furthermore, the input
voltage was monitored using a shunt resistor and Labjack ADC and was also recorded several times
a second. This provides two independent records of the voltage going into the FPGA, which can
then be used during analysis to confirm that changes in voltage aren’t the cause of any measured
changes in FPGA performance.
A graph of the supply voltage (collected from the power supply) over the period of several
days can be seen in Figure 4.6. Four distinct lines in the graph can be seen, which correspond
to four different quantization values of the power supply ADC. This graph shows that the input
voltage is kept within two quantization values for the duration of the experiment. During characterization the supply voltage increases slightly (due to a smaller load being applied) but is still kept
within two quantization values. This demonstrates that the input voltage is held constant and has
no impact on measured differences in delay.
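The quantization check described above can be expressed as a short script. The following is a minimal sketch, not the analysis code used in this work: the 1 mV step size, the sample readings and the helper name `levels_used` are all assumptions chosen for illustration.

```python
def levels_used(samples_v, step_v):
    """Return the sorted ADC quantization levels (as integer codes) seen in a log."""
    return sorted({round(v / step_v) for v in samples_v})


# Hypothetical values: a 1 mV step is assumed, not taken from any datasheet.
STEP_V = 0.001
burn_log = [0.949, 0.950, 0.950, 0.949, 0.950]   # while short circuits are programmed
char_log = [0.951, 0.952, 0.951, 0.952]          # during characterization (lighter load)

burn_levels = levels_used(burn_log, STEP_V)
char_levels = levels_used(char_log, STEP_V)

# Each phase should occupy at most two adjacent quantization levels.
print("burn levels:", burn_levels)
print("characterization levels:", char_levels)
```

With these made-up readings each phase occupies two levels, and the characterization levels sit two steps above the burn levels, which is the pattern the figure describes.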

Figure 4.6: The input voltage to the FPGA over the course of five days. The four lines are the result
of the quantization of the power supply. The input voltage fluctuates between two quantization
values for the duration of the experiment. During characterization (when the load is less) the
voltage still only fluctuates between two values, but two quantizations up from where it is while
short circuits are programmed.

4.2.2 Temperature

Another well documented cause of changes in transistor switching speeds is temperature [2], [5], [22], [36]. To ensure that ambient temperature is not the cause of any measured
degradation all experiments are run inside a thermal chamber, which maintains an ambient temperature of 35 °C. The temperature is then monitored and recorded both inside the thermal chamber
and on the FPGA. By measuring both temperatures it can be shown that the on-chip temperatures
reached while short circuits are programmed are not caused by an external heat source. Furthermore, this setup allows us to show that both the ambient temperature and the on-chip temperature are
the same during all characterizations, ruling out temperature as a possible cause for differences in
measurements.
The ambient temperature inside the thermal chamber over the course of five days can be
seen in Figure 4.7. The stripes in the graph are due to the different quantization values of the ADC
being used to read the temperature sensor. It can be seen that the temperature is held at a constant
35 °C during all five days. Every time a characterization occurs the temperature dips slightly because a
heat source (the short circuits) is removed from the thermal chamber’s environment.

Figure 4.7: The temperature inside the thermal chamber over the course of five days. Each day
a dip can be seen when a burn stops and a characterization takes place, showing that changes to
the ambient temperature are due to the programmed short circuits and not external stimulus. Note:
The stripes in the graph are due to the quantization of the temperature sensor.

4.3 Experiment Management

Now that the individual components of the testbed have been described it is important to
understand how everything fits together. Figure 4.8 gives a conceptual view of the system
and shows the connections between its different parts. The Artix-7 FPGAs
under test live inside the thermal chamber, as does the temperature sensor being used to measure
the ambient temperature in the thermal chamber. The test boards receive power from the power
supply and communicate data to the NUC. The Labjack monitors the voltage and current being
provided by the power supply as well as communicates with the temperature sensor in the thermal
chamber. It then communicates all of this information to the NUC via USB. The power supply
is also connected to the NUC via Ethernet, which is how it reports information to and receives
commands from the NUC.

[Figure 4.8 diagram labels. Components: Data Server (MySQL), Data Visualization, Intel NUC mini PC, Modified Arty A7-35T, Power Supply (0.95 V, 7.9 A, voltage sense), LabJack ADC, Temperature Sensor, Thermal Chamber (temperature control). Links: Network, USB, UART, Ethernet, I2C.]

Figure 4.8: A diagram showing a logical view of the testbed. The Intel NUC mini PC is the brain
that controls and monitors all other parts of the system.

As can be seen, the NUC is the central hub of the experiments. It communicates with and
controls all aspects of the system. It dictates when the test boards are characterized, burned or put
in recovery, as well as records all of the information generated during any phase of an experiment
from all parts of the system. However, the NUC does not perform any analysis function in the
system. Every time the NUC receives a new piece of data it commits that data to a MySQL
database that lives on a well provisioned remote server.
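The collect-and-commit pattern described above can be sketched as follows. This is only an illustration under stated assumptions: the standard library `sqlite3` module stands in for the MySQL server used in the testbed, and the table name, column names and the `commit_sample` helper are hypothetical rather than taken from the actual software.

```python
import sqlite3
import time


def open_store():
    """Create an in-memory database with one generic measurement table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE measurement ("
        " ts REAL NOT NULL,"       # time of the sample, in seconds
        " source TEXT NOT NULL,"   # e.g. 'labjack', 'power_supply', 'board'
        " metric TEXT NOT NULL,"   # e.g. 'input_voltage_v'
        " value REAL NOT NULL)"
    )
    return conn


def commit_sample(conn, source, metric, value, ts=None):
    """Store one sample as soon as it arrives; no analysis happens here."""
    ts = time.time() if ts is None else ts
    conn.execute(
        "INSERT INTO measurement (ts, source, metric, value) VALUES (?, ?, ?, ?)",
        (ts, source, metric, value),
    )
    conn.commit()


conn = open_store()
commit_sample(conn, "power_supply", "input_voltage_v", 0.95, ts=0.0)
commit_sample(conn, "labjack", "ambient_temp_c", 35.0, ts=0.5)
rows = conn.execute(
    "SELECT source, metric, value FROM measurement ORDER BY ts"
).fetchall()
print(rows)
```

Keeping the collecting machine limited to inserts leaves all querying and plotting to the machine that hosts the database, which matches the division of labor described in this section.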
This remote server is where all data analysis occurs. An extensive software package was
designed and built that is capable of querying the data, calculating metrics about the data and
creating data visualizations. All of the data metrics and graphs presented in this work are a product
of these data analysis tools.
In summary this testbed provides the platform needed to effectively test, measure and understand the effects short circuits have on FPGA performance. All experiments and results presented in this work are a direct result of the capabilities provided by this testbed.

CHAPTER 5. INITIAL EXPERIMENTS

This chapter presents the results of two experiments that helped define the approach taken
in the long term experiment described in Chapter 6. These experiments also helped validate the
effectiveness of the testbed described in Chapter 4. Section 5.1 discusses initial experiments that
attempted to cause damage with short circuits and Section 5.2 discusses experiments performed in
response to some unexpected results.

5.1 Low Current Experiment

In an initial attempt to cause damage using short circuits the FPGA was filled with as many
short circuits as possible. A bitstream was created that configured every LUT/FF pair on the chip
to a short circuit. However, it was quickly discovered that the targeted FPGA (an Artix-7 35T) is
software limited by Xilinx, meaning that not all resources can be used at the same time. As a result
only around 20,000 short circuits could be instantiated.
With this reduced number of shorts a bitstream was generated and then loaded onto a board.
Upon uploading the bitstream the board would immediately enter a reset cycle, where it would turn
on, load its configuration, reset, turn back on, reload its configuration and then reset again,
repeatedly. After some investigation it was determined that the 20,000 short circuits drew too much current,
causing the on-board power regulator to reset the board.
To solve this it was experimentally determined that the most current the board could draw
was approximately 3 A. We were then able to experimentally discover that 6,000 short circuits was
the most that could be programmed at one time and remain under the current limits of the on-board
power supply.

5.1.1 Methodology

Using this knowledge a bitstream was created with 6,000 shorts spread out across two clock
regions, as seen in Figure 5.1. Using this single burn bitstream and four characterization bitstreams
(each with an RO in a different place and the static logic needed to perform a characterization) an
experiment configuration was created. For the purposes of this experiment a single characterization
consisted of two steps: (1) programming each RO bitstream in turn for 20 minutes, and (2)
collecting the RO frequency once a second while each bitstream was loaded.
To begin the experiment the board was first characterized three times, each six hours apart.
This provided extra confidence in the initial performance of the board. A burn cycle (where the
short circuit bitstream was loaded) was then started, which lasted six hours. A characterization
would then be performed and the process would repeat.

5.1.2 Results

After nine days of repeating the burn/characterize cycle there was a characterization where
the error bounds didn’t overlap with the error bounds of the original characterization, providing
high confidence that the short circuits were indeed causing damage. This can be seen in Figure 5.2.
The blue data is one of the original characterizations and the green data shows the characterization
that took place after 212 hours of burn. It can be seen that these two characterizations are nonoverlapping. Some of the intermediate characterizations are also shown (taken at 140 hours and
164 hours of burn), which do overlap with the original characterization.
These results are compelling for a few reasons. First, they prove that short circuits do have
the potential to cause measurable damage to the intrinsic speed of the FPGA, validating the initial
hypothesis. Second, these results demonstrate and validate the precision of the testbed. Even prior
to the results that didn’t have overlapping data, slowdown was still detected. In Figure 5.2 the
characterizations that occurred at 140 and 164 hours both clearly show that slowdown is occurring.
This is because the data distribution of the RO measurements is normal, allowing for even small
amounts of slowdown to be detected and measured.
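The overlap test used in this argument can be sketched in a few lines. The exact definition of the error bounds is not restated in this section, so the sketch below assumes bounds of the mean plus or minus three standard deviations; the frequency samples and function names are made up for illustration.

```python
import statistics


def bounds(samples, k=3.0):
    """Return (low, high) as the mean plus or minus k standard deviations."""
    mean = statistics.fmean(samples)
    spread = k * statistics.stdev(samples)
    return mean - spread, mean + spread


def overlaps(a, b):
    """True if two (low, high) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Hypothetical RO frequency samples in MHz (not measured data).
baseline = [100.00, 100.02, 99.98, 100.01, 99.99]
intermediate = [99.97, 99.99, 99.95, 99.98, 99.96]
after_burn = [99.20, 99.22, 99.18, 99.21, 99.19]

still_overlapping = overlaps(bounds(baseline), bounds(intermediate))
clearly_separated = not overlaps(bounds(baseline), bounds(after_burn))
print(still_overlapping, clearly_separated)
```

In this toy data the intermediate characterization has a lower mean but still overlaps the baseline, while the later characterization is fully separated, mirroring the two situations discussed above.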
The results from this initial experiment also show that the damage caused by 6,000 short
circuits is minimal. This experiment was ultimately run for over a month and the maximum

Figure 5.1: The floorplan of the short circuit bitstream used in the initial low current experiments.

measured slowdown was 0.8%. Because of this it was determined that an increase in the number of
short circuits was needed. As described in Section 5.1, the limiting factor was the amount of current
the on-board power regulator could supply. As a result it was decided to supply the FPGA core
power directly from an external power supply by soldering to specific points on the board, shown
in Figure 5.3. This modification allows for as much current as is needed to be supplied, which
in turn allows for more short circuits to be programmed at the same time. This modification led
directly to the experiment described in Chapter 6.

Figure 5.2: A histogram showing the data collected during four different characterizations. The
data distribution for each characterization can be seen to be normal. Short circuits were programmed in between each of these characterizations and it can be seen that this resulted in a
slowdown of the intrinsic frequency of the board.

5.2 Temperature Experiments

Once these board modifications had been made experiment configurations were then created
with bitstreams that had much higher current draws, which in turn resulted in greater amounts
of heat being produced. This then led to concerns about needing a cool down period, where the
FPGA is allowed time to cool off after a burn has ended before characterization occurs. To evaluate whether or not this was necessary a bitstream with more than 20,000 short circuits was programmed. After it sat for several minutes (until the hardware manager in Vivado showed that the
internal temperature sensor on the board had reached steady state) a characterization bitstream was

Figure 5.3: A picture of the board modifications that enable powering the FPGA directly with an
external power supply. Multiple entry points were chosen to decrease the resistance introduced by
the new solder joints.

then programmed and immediately started taking measurements. It should be noted that, because
there is no circuitry other than short circuits in the burn bitstream, there is no temperature data
available during the burn, and the data that is shown starts several seconds after the burn ended, as
it takes some time for a bitstream to be programmed and for the characterization software to begin
running.

The results of this test are depicted in Figure 5.4, where it can be seen that the RO frequency is directly correlated to the temperature of the chip. This tight correlation was expected,
but the direction of the correlation was not. It was anticipated that the RO frequency would decrease at higher temperatures, but the data shows that instead the RO frequency increased at higher
temperatures.
(a) RO Frequency vs Time    (b) Chip Temperature vs Time

Figure 5.4: A set of plots showing the correlation of RO frequency and chip temperature. This
data was collected as part of a normal characterization and so occurred right after short circuits
had been programmed to the board, which was the cause of the increased temperature.

This unexpected behavior led to concerns about there being a flaw in the way degradation
was being measured. Is delay really being measured if the RO frequency doesn’t react to temperature as expected? In order to gain greater visibility into the issue a new experiment was created.
In the original data only the relationship between falling temperature and RO frequency could
be observed. Thus in the new experiment the temperature of the chip was controlled externally,
allowing for observation of the issue as the temperature fluctuated in both directions.
To do this a characterization bitstream was programmed that began collecting both RO
frequency and chip temperature data. A heat gun was then used to increase chip temperature and
freon was used to decrease temperature. The experiment ran for 6 minutes, with the results being
depicted in Figure 5.5. During the first minute the board simply ran with no external
stimulus applied. During the second minute the board was heated from approximately 2 feet away using
the heat gun. For the third minute the heat gun was moved closer, to a distance of a few inches.
During the fourth minute no stimulus was applied, allowing for a natural cool down. During the
fifth minute freon was periodically sprayed on the board, causing it to cool down rapidly. This
wasn’t very consistent and so resulted in a lot more noise in the graph. The last minute was spent
with no stimulus, allowing the board to naturally warm up from the heat generated by the ROs.
As seen in Figure 5.5, the results agreed with the original findings: the RO frequency was directly
proportional to the chip temperature. This held true both when the chip was heating up and as the
chip cooled down.
(a) RO Frequency vs Time

(b) Chip Temperature vs Time

Figure 5.5: A set of plots showing the relationship between RO frequency and chip temperature.
This data was collected over a six minute period where the chip was heated and cooled externally.
It can be seen that the RO frequency is proportional to the chip temperature.
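The direction of the relationship between chip temperature and RO frequency can be checked numerically with a correlation coefficient. The sketch below is an illustration only: the paired samples are made up, and Pearson correlation is one reasonable choice rather than the metric used in this work.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical paired samples: chip temperature (deg C) and RO frequency (MHz).
temp_c = [40.0, 45.0, 55.0, 50.0, 38.0, 42.0]
ro_mhz = [100.00, 100.05, 100.16, 100.10, 99.97, 100.02]

r = pearson(temp_c, ro_mhz)
# A value near +1 means frequency rises with temperature (the behavior observed here);
# a value near -1 would mean frequency falls as temperature rises.
print(f"r = {r:.3f}")
```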

After later consulting with others regarding these results we learned that this effect is likely
due to a phenomenon called temperature inversion. When temperature increases the threshold voltage of the MOSFETs
inside the chip decreases, which increases drive current and switching speed. However, this effect is typically dominated by the reduced carrier mobility (extra resistance) that the heat introduces, causing a
net effect of decreased performance at higher temperatures. At smaller process nodes, which operate at lower supply voltages,
the threshold voltage effect plays a bigger role, resulting in a net effect
of faster switching speeds at higher temperatures [36], [37].
These results are important for two reasons. First, by providing an unexpected but true
result it was again confirmed that the testbed was working properly and is capable of precisely
measuring changes in FPGA performance. Second, it was determined that in order to have a valid
characterization a wait period was needed for the chip to cool down, as the chip temperature has a
clear and measurable effect on the characterization process. To this end a seven minute cool down
period was implemented, where the experiments wouldn’t use any data collected for the first seven
minutes after a characterization bitstream is loaded. Based on the graphs in Figure 5.4 this seven
minute period is enough to ensure that the results are not impacted by the lingering heat caused by
programming large amounts of short circuits. This decision directly impacted the structure of the
long term experiment presented in Chapter 6.

CHAPTER 6. LONG TERM EXPERIMENT

Based on the results from the initial experiments a new, long term experiment was developed, examining the impact of using large amounts of short circuits (20,000+). This chapter
describes this long term experiment and presents its results. In Section 6.1 the setup of the experiment is explained and in Section 6.2 the results of the experiment are put forth. These results are
then discussed in more depth in Chapter 7.

6.1 Methodology

For this experiment the testbed and hardware design techniques described in Chapter 3 and
Chapter 4 are used. This experiment also builds off of the results found in Chapter 5 and is a
natural extension of the work presented there.

6.1.1 Short Circuit Layout

As described in Chapter 5, the Artix-7 35T has two features that limit the number of short
circuits that can be placed on the board at the same time: current limits on the on-board power
supply and software limits on resource utilization. This experiment used the technique described
in Chapter 5 to overcome the power supply limits. The second limitation was not addressed,
resulting in only two-thirds of the chip containing short circuits (20,798 short circuits total).
In Figure 6.1 the location of the short circuits in the design is identified by the presence of
light blue squares. It can be seen that the short circuits are concentrated in the bottom two thirds
of the chip, while the top third of the chip is completely empty. This allows for localized damage
to be studied. If the top third experiences as much damage as the bottom two thirds, then there is
no localization. If the top third experiences less damage than the bottom two thirds then the short
circuits do cause localized damage, at least to some degree.

Figure 6.1: Three ring oscillators are used to measure slowdowns across the chip, during the
“characterization” phase of the experiment. 20,798 shorts are used during the “burn” phase of
the experiment, covering the lower two-thirds of the part. While both the short circuit and RO
configurations are shown in this figure, they are configured at different times.

6.1.2 Characterization Design

The characterization design is responsible for measuring the intrinsic delay of the FPGA
fabric before and after short-circuit burns. A more detailed description of how this is done is given
in Chapter 3. For this experiment, three different characterizations were used, each testing the
damage done to a different location on the chip. The locations of these three ROs can be seen
in Figure 6.1 and will be referred to as Top-RO, Mid-RO and Bottom-RO. Note, even though
Figure 6.1 shows the short circuits and characterization ROs on the same floorplan, in reality they
are part of different bitstreams, configured on the FPGA at separate times. The ROs were each 159
inverters long, covering the same area as 40 shorted slices.

6.1.3 Recovery Design

In order to determine the permanence of damage to the FPGA, a blank bitstream was also
created. This bitstream was used during the recovery period in place of a burn bitstream. This
blank bitstream leaves the FPGA powered on but with no circuit activity.

6.1.4 Experiment Design

The experiment iteratively sequences through the following steps: (1) load the short-circuit
configuration into the FPGA, (2) allow the FPGA to “burn” in this configuration for 24 hours, and
(3) load the characterization bitstream to determine increases in circuit delay.
An initial characterization consisted of measuring the frequency of the three ROs at the top,
middle and bottom of the chip (see Figure 6.1). Each of the three ROs was characterized three
times, six hours apart, to ensure that repeated measurements produced the same frequency results,
and to establish a baseline for measuring relative slowdown.
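To make the notion of relative slowdown concrete, the sketch below computes it as the percentage drop in mean RO frequency with respect to the baseline mean. This formula is an assumption consistent with how slowdown is reported in this chapter (the precise metric is defined elsewhere in the thesis), and the frequencies are made up.

```python
import statistics


def slowdown_percent(baseline_hz, current_hz):
    """Relative drop in mean RO frequency, in percent of the baseline mean."""
    f0 = statistics.fmean(baseline_hz)
    f = statistics.fmean(current_hz)
    return 100.0 * (f0 - f) / f0


# Hypothetical samples in Hz (not measured data).
baseline = [200.0e6, 200.1e6, 199.9e6]
aged = [194.9e6, 195.0e6, 195.1e6]

result = slowdown_percent(baseline, aged)
print(f"{result:.2f} % slowdown")
```

A positive value means the RO now oscillates more slowly than it did at baseline; a value of zero means no measurable change.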
After the initial characterization, the burn period was started, which consisted of 36 burn
cycles. At the beginning of each burn cycle the bitstream containing 20,798 short circuits was
configured onto the FPGA. After a period of 24 hours the shorts were removed from the board
and a one hour characterization was performed. Over the course of the hour, each RO was programmed onto the board for 20 minutes. The first seven minutes were allotted to allow temperature
fluctuations to settle, and then the RO frequency was measured once per second for 13 minutes.
Following this, another 24 hour burn and one hour characterization cycle would begin.
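The measurement window described above amounts to discarding the first seven minutes of each 20 minute RO run. A minimal sketch of that filtering step is shown below; the log format (time in seconds, frequency) and the function name are assumptions for illustration.

```python
SETTLE_S = 7 * 60    # cool down period before data is used
WINDOW_S = 20 * 60   # time each RO bitstream stays loaded


def usable_samples(samples):
    """Keep only (t_seconds, freq) samples taken after the settle time."""
    return [(t, f) for t, f in samples if SETTLE_S <= t < WINDOW_S]


# Hypothetical 1 Hz log covering the whole 20 minute window.
log = [(t, 100.0) for t in range(WINDOW_S)]
kept = usable_samples(log)

# 20 minutes minus 7 minutes leaves 13 minutes, i.e. 780 one-second samples.
print(len(kept))
```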
Once 36 of these burn/characterize cycles had taken place, a 16 day recovery period was
begun that investigated whether the aging effects were temporary or long lasting. During the
recovery period a blank bitstream would be repeatedly left on the FPGA for 24 hours and then the
same one-hour characterization process was performed.
After the 16 day recovery period, another burn period was started. It was identical to the first one,
except that it lasted for 90 days instead of 36.
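The overall sequence of phases can be summarized programmatically. The sketch below only restates the schedule described in this section (36 burn cycles, a 16 day recovery period, then 90 more burn cycles, each followed by a one hour characterization); the phase labels, bitstream labels and function name are hypothetical, and the initial baseline characterizations are omitted.

```python
def build_schedule(first_burn_days=36, recovery_days=16, second_burn_days=90):
    """Return a list of (phase, bitstream, hours) steps for the experiment."""
    phases = [
        ("first_burn", "short_circuits", first_burn_days),
        ("recovery", "blank", recovery_days),
        ("second_burn", "short_circuits", second_burn_days),
    ]
    steps = []
    for phase, bitstream, days in phases:
        for _ in range(days):
            steps.append((phase, bitstream, 24))          # 24 hour burn or rest
            steps.append((phase, "characterization", 1))  # three ROs, 20 minutes each
    return steps


schedule = build_schedule()
burn_hours = sum(hours for _, bitstream, hours in schedule if bitstream == "short_circuits")
print(len(schedule), "steps,", burn_hours, "hours of burn")
```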

6.2 Results

The results of this experiment will be presented in three sections. Section 6.2.1 will present
the results of the initial burn period, Section 6.2.2 will present the results of the recovery period,
and Section 6.2.3 will present the results of the second burn period. The impact of these results
will be discussed in Chapter 7.
The results of the experiment are summarized in Table 6.1 and the slowdown of the ROs
is plotted in Figure 6.2. It can be seen from Table 6.1 that for all stages of the experiment the
ambient temperature was held steady at around 35 °C and
the input voltage was held at a steady 0.95 V. This provides
confidence that the slowdown discussed below is due to damage caused by only the short circuits
and not due to the impact of outside influences such as temperature or voltage fluctuations.

6.2.1 Initial Burn Period

The maximum slowdown measured was by the Mid-RO (red), located in the middle of the
region of short circuits (see Figure 6.1). This RO measured a slowdown of 5.13% after the initial
burn. The Bottom-RO (green), also in the short circuit region but at the corner of the chip, detected
a slowdown of 4.81%. The Top-RO (blue), located far from the short circuit region, had the least
amount of degradation, with a slowdown of only 1.35%.
It can also be seen in Figure 6.2 that the degradation effects are heavily weighted towards
the beginning of the burn schedule. After the first burn cycle the Mid-RO experienced a 1.25%
decrease in frequency and by the end of the sixth cycle the slowdown had reached 2.56%, half of
the total slowdown seen during the initial burn period. This early drop-off in delay accumulation
is also seen in previous FPGA aging work [2], [5], [16], [18], [32].

Figure 6.2: The results of the long term, high current experiment. After each 24 hour period the
Ring Oscillators were re-characterized. Those results are shown above, which track the slowdown
of each RO as time progressed. It is clearly shown that the Top-RO (outside of the burn region),
detected far less slowdown than the two ROs inside the burn region.

6.2.2 Recovery Period

Another key part of accelerated aging techniques is understanding the permanence of the
aging effects. Prior work has shown that up to 56% of the slowdown can be lost during a recovery
period [32], showing that recovery periods are a critical part of understanding accelerated aging
mechanisms.
We tested the permanency of our aging technique with a 16 day recovery period, during
which time characterization took place once a day and no circuit activity occurred the rest of the
day. As can be seen from Table 6.1 and Figure 6.2, the FPGA did not experience any recovery,

Table 6.1: Results for the long term experiment.

Time      Top-RO Slowdown    Mid-RO Slowdown    Bottom-RO Slowdown   Input Current            Input Voltage               Ambient Temp.
                                                                     (During Burn)            (During Characterization)   (During Characterization)

First Burn Period
0 Days    0.00 % ± 0.016 %   0.00 % ± 0.023 %   0.00 % ± 0.021 %     N/A                      N/A                         N/A
5 Days    0.84 % ± 0.025 %   2.39 % ± 0.020 %   2.43 % ± 0.017 %     8.01 ± 8.88 × 10^-3 A    0.95 ± 5.15 × 10^-4 V       34.86 ± 0.031 °C
10 Days   0.98 % ± 0.026 %   3.13 % ± 0.023 %   3.04 % ± 0.018 %     8.01 ± 9.67 × 10^-3 A    0.95 ± 5.15 × 10^-4 V       34.93 ± 0.043 °C
15 Days   1.11 % ± 0.021 %   3.71 % ± 0.020 %   3.53 % ± 0.023 %     8.00 ± 8.96 × 10^-3 A    0.95 ± 5.15 × 10^-4 V       34.94 ± 0.059 °C
20 Days   1.22 % ± 0.022 %   4.16 % ± 0.018 %   3.92 % ± 0.018 %     7.99 ± 9.12 × 10^-3 A    0.95 ± 5.15 × 10^-4 V       34.94 ± 0.039 °C
25 Days   1.23 % ± 0.028 %   4.49 % ± 0.017 %   4.20 % ± 0.018 %     8.00 ± 1.066 × 10^-2 A   0.95 ± 5.15 × 10^-4 V       34.97 ± 0.062 °C
30 Days   1.34 % ± 0.020 %   4.86 % ± 0.018 %   4.56 % ± 0.018 %     7.98 ± 1.065 × 10^-2 A   0.95 ± 5.15 × 10^-4 V       34.93 ± 0.031 °C
35 Days   1.38 % ± 0.028 %   5.13 % ± 0.016 %   4.81 % ± 0.019 %     7.97 ± 9.51 × 10^-3 A    0.95 ± 5.15 × 10^-4 V       34.96 ± 0.055 °C

Recovery Period
0 Days    1.38 % ± 0.017 %   5.14 % ± 0.017 %   4.81 % ± 0.018 %     0.05 ± 1.18 × 10^-3 A    0.95 ± 5.15 × 10^-4 V       34.94 ± 0.066 °C
5 Days    1.37 % ± 0.018 %   5.15 % ± 0.018 %   4.79 % ± 0.016 %     0.05 ± 1.26 × 10^-3 A    0.95 ± 5.15 × 10^-4 V       34.94 ± 0.066 °C
10 Days   1.39 % ± 0.020 %   5.17 % ± 0.020 %   4.80 % ± 0.016 %     0.05 ± 1.22 × 10^-3 A    0.95 ± 5.15 × 10^-4 V       34.94 ± 0.059 °C
15 Days   1.38 % ± 0.019 %   5.16 % ± 0.019 %   4.80 % ± 0.020 %     0.05 ± 1.18 × 10^-3 A    0.95 ± 5.15 × 10^-4 V       34.94 ± 0.031 °C
20 Days   1.38 % ± 0.017 %   5.17 % ± 0.016 %   4.80 % ± 0.018 %     0.05 ± 1.22 × 10^-3 A    0.95 ± 5.15 × 10^-4 V       34.90 ± 0.035 °C
25 Days   1.39 % ± 0.021 %   5.18 % ± 0.019 %   4.81 % ± 0.017 %     0.05 ± 1.22 × 10^-3 A    0.95 ± 5.15 × 10^-4 V       34.92 ± 0.047 °C

Second Burn Period
0 Days    1.52 % ± 0.021 %   6.13 % ± 0.017 %   5.66 % ± 0.055 %     7.95 ± 1.042 × 10^-2 A   N/A                         35.12 ± 0.043 °C
12 Days   1.66 % ± 0.015 %   6.69 % ± 0.024 %   6.23 % ± 0.046 %     7.94 ± 1.090 × 10^-2 A   0.95 ± 5.15 × 10^-4 V       35.13 ± 0.039 °C
24 Days   1.76 % ± 0.016 %   7.18 % ± 0.025 %   6.72 % ± 0.051 %     7.95 ± 9.16 × 10^-3 A    0.95 ± 5.15 × 10^-4 V       35.01 ± 0.094 °C
36 Days   1.82 % ± 0.019 %   7.53 % ± 0.023 %   7.07 % ± 0.050 %     7.95 ± 1.019 × 10^-2 A   N/A                         35.16 ± 0.039 °C
48 Days   1.92 % ± 0.023 %   7.90 % ± 0.018 %   7.48 % ± 0.043 %     7.93 ± 1.050 × 10^-2 A   0.95 ± 5.15 × 10^-4 V       35.01 ± 0.055 °C
60 Days   2.02 % ± 0.015 %   8.24 % ± 0.023 %   7.82 % ± 0.043 %     7.92 ± 1.089 × 10^-2 A   0.95 ± 5.15 × 10^-4 V       35.01 ± 0.043 °C
72 Days   2.10 % ± 0.022 %   8.52 % ± 0.021 %   8.12 % ± 0.043 %     7.92 ± 1.523 × 10^-2 A   0.95 ± 5.15 × 10^-4 V       35.03 ± 0.039 °C
73 Days   2.11 % ± 0.016 %   8.52 % ± 0.013 %   8.14 % ± 0.043 %     7.92 ± 1.477 × 10^-2 A   0.95 ± 5.15 × 10^-4 V       35.06 ± 0.043 °C

leading to the conclusion that the damage is permanent. The implications of this are discussed in
greater detail in Chapter 7.

6.2.3 Second Burn Period

After the recovery period another burn period was started, identical to the first burn period. During the first 16 days some technical issues were encountered with the testbed, resulting
in corrupted data. The FPGA was continuing to burn during that time, even though there is no
characterization data.
After 90 days of additional burn the maximum slowdown seen was again by the Mid-RO, with a free running frequency 8.52% slower than it originally started at. The next greatest
slowdown was by the Bottom-RO, with 8.14% degradation, while the Top-RO experienced 2.11%
slowdown. It is notable that the slowdown for all three ROs is nearly linear for this entire burn
period, suggesting that after enough time the damage becomes linear instead of stopping. It is also
notable that the Top-RO, which lives outside the burn region, slows down at a much slower rate
than the ROs inside the burn region. The implications of these results will be discussed in greater
detail in Chapter 7.
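One way to quantify the near-linear trend is an ordinary least squares fit over the second burn period rows of Table 6.1. The sketch below uses the Top-RO and Mid-RO values and the time column exactly as listed in that table; the fitting helper is illustrative and is not the analysis software used in this work.

```python
def linear_fit(xs, ys):
    """Least squares fit y = slope * x + intercept; returns (slope, intercept, r_squared)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx, (sxy * sxy) / (sxx * syy)


# Second burn period rows of Table 6.1 (time in days, slowdown in percent).
days = [0, 12, 24, 36, 48, 60, 72, 73]
mid_ro = [6.13, 6.69, 7.18, 7.53, 7.90, 8.24, 8.52, 8.52]
top_ro = [1.52, 1.66, 1.76, 1.82, 1.92, 2.02, 2.10, 2.11]

mid_slope, _, mid_r2 = linear_fit(days, mid_ro)
top_slope, _, top_r2 = linear_fit(days, top_ro)
print(f"Mid-RO: {mid_slope:.4f} %/day, R^2 = {mid_r2:.3f}")
print(f"Top-RO: {top_slope:.4f} %/day, R^2 = {top_r2:.3f}")
```

An R^2 close to 1 supports describing the trend as nearly linear, and the ratio of the two slopes gives a rough measure of how much faster the RO inside the burn region degrades.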

CHAPTER 7. DISCUSSION

In Chapter 6 a high current, long running experiment was described and the results put
forth. In this chapter the implications and impact of that work will be discussed. Section 7.1 discusses the localized nature of the damage caused by short circuits, Section 7.2 discusses different
aging effects that may play a role in the damage done, Section 7.3 discusses other notable results
and Section 7.4 discusses the limitations of this experiment.

7.1 Localized Degradation

One of the unique aspects of the results presented in Chapter 6 is the localized pattern of
damage. In Figure 6.2 it can clearly be seen that the two ROs inside the burn region (Mid-RO
and Bottom-RO) are heavily degraded due to the short circuits. However, the Top-RO shows a
dramatically different aging profile. While the Mid-RO and Bottom-RO slowed down by 8.52%
and 8.14% respectively, the Top-RO reached a maximum slowdown of 2.11%, only one-fourth of
the maximum slowdown. This clearly shows that the damage caused by short circuits is localized
in nature, meaning that not all parts of the chip are affected equally.
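The degree of localization can be checked with simple arithmetic on the final slowdown values reported in Table 6.1. The snippet below is a worked example only; the variable names are illustrative.

```python
# Final slowdown of each RO after the second burn period, in percent (Table 6.1).
final_slowdown = {"Top-RO": 2.11, "Mid-RO": 8.52, "Bottom-RO": 8.14}
inside_burn_region = ["Mid-RO", "Bottom-RO"]

worst = max(final_slowdown.values())
ratio = final_slowdown["Top-RO"] / worst
inside_mean = sum(final_slowdown[name] for name in inside_burn_region) / len(inside_burn_region)

# The RO outside the burn region slowed by roughly a quarter of the worst case.
print(f"Top-RO / worst case = {ratio:.3f}")
print(f"Mean slowdown inside burn region = {inside_mean:.2f} %")
```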
This localized effect is a significant breakthrough. Previous FPGA aging approaches have
only tried to age chips uniformly [2], [5], [16], [18]. This seems to be primarily due to the methods
that are used to induce aging (increasing the supply voltage and ambient temperature of the chip),
which affect the entire FPGA equally. However, by using harmful configurations to age the chip
targeted aging can be performed, where one section of the die ages more than other sections, as
demonstrated in Chapter 6. This ability is one of the novel aspects of a configuration based aging
approach and opens the door to some very interesting possibilities.
For example, accelerated FPGA aging is traditionally performed and studied so that chip
designers can better understand the effects that time will have on their devices, which in turn allows
them to generate age-tolerant designs [2], [3], [16]. However, since current aging techniques affect
all the transistors equally this doesn’t allow for the testing of scenarios where one part fails earlier
or faster than another part of the chip. The ability to perform localized aging has the potential to
make such an analysis feasible.
Furthermore, localized aging also has application in the FPGA security domain. If targeted
sections of a chip could be slowed down more than their surrounding regions then it could be
possible to create a physical watermark on the chip. Such a watermark would be based on the
manipulation of the physical characteristics of targeted transistors (namely their switching speed),
making it difficult to remove or spoof.
Another security application is the potential to clone an RO-based physical unclonable function (PUF). RO-PUFs are used
to uniquely identify individual chips (much like a watermark) and are based on differences in speed
between different ROs [16], [17]. With targeted aging it may be possible to target individual ROs,
resulting in a change of the relative ranking of the ROs. If this could be controlled then existing
PUFs could be cloned.
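The idea of changing the relative ranking of ROs can be illustrated with a toy model. The sketch below assumes a simple pairwise-comparison scheme, in which each response bit records which RO of a pair is faster; this is a common textbook construction and not a scheme described or evaluated in this work. The frequencies, function names and the 2% slowdown are made up.

```python
def puf_bits(freqs_mhz):
    """One response bit per adjacent RO pair: 1 if the first RO of the pair is faster."""
    return [
        1 if freqs_mhz[i] > freqs_mhz[i + 1] else 0
        for i in range(0, len(freqs_mhz) - 1, 2)
    ]


def age(freqs_mhz, index, slowdown_percent):
    """Return a copy of the frequencies with one RO slowed by the given percentage."""
    aged = list(freqs_mhz)
    aged[index] *= 1.0 - slowdown_percent / 100.0
    return aged


# Hypothetical free running frequencies of four ROs, in MHz.
fresh = [201.0, 200.0, 198.5, 199.0]

before = puf_bits(fresh)
# Slowing RO 0 by 2% drops it below RO 1, so the first bit flips.
after = puf_bits(age(fresh, 0, 2.0))
print(before, after)
```

Whether real hardware can be aged with this level of selectivity is exactly the open question raised in this section.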
While such applications show promise, further work would first need to be done to understand and develop this localized effect. Exactly how localized the damage is, what the statistical
significance of the damage is and how time will impact and interact with an unevenly aged chip
(i.e. will the damage catch up) are all questions that need to be answered. These questions, among
others, are potential sources of future work in this area.

7.2 Aging Effects

Chapter 6 clearly indicates that damage is being done to the FPGA. However, determining
the underlying mechanisms is very difficult to do. There are several reasons for this. First, the
layout and design of the FPGA fabric are highly proprietary. This lack of access to the exact transistor layout means that it is impossible to know exactly what paths in the FPGA are being shorted,
what the physical properties of the shorted and characterized transistors are, and the proximity between the shorted transistors and the characterized transistors. Additionally, little is known about
the implementation of the power delivery network, resulting in little to no insight about how the
current is traveling through the chip. Second, even if all the above information was known, there is
no visibility into the damage itself. Physical inspection of the chip would be destructive in nature
(resulting in an end of testing for that chip) and requires specialized equipment that is beyond the
scope of this work. Third, there is no way to inspect or characterize the transistors while they are
being damaged. The best that can be done is to perform a characterization, try to cause damage,
and then characterize again to see if it worked. Very little is known about what is happening during
the burn phase.
As a result it is very difficult to give any definitive answers as to what caused the damage
measured in Chapter 6. However, there are several well established CMOS aging mechanisms that
most accelerated aging techniques try to exacerbate. This section will explore these mechanisms
and discuss how they relate to this work.

7.2.1

Bias Temperature Instability
BTI, as presented in Chapter 2, becomes more pronounced in the presence of increased
temperature and increased voltage. For the duration of the experiment the voltage is held at a
constant 0.95 V, which is the normal operating voltage for the Artix-7 part [23]. As such, there
is very little concern of increased voltage inducing a greater than normal BTI effect. However,
short circuits do produce significant heat during the burn phase. During the burn period the steady
state temperature of the chip, according to the on-chip temperature sensor, was 177.7 °C, more
than 70 °C higher than the maximum temperature rating of 100 °C for the device on the Arty board
[23]. Furthermore, due to the heat spreading capabilities of ICs and the distance between the on-chip temperature sensor and the transistors in the burn region, this value represents only a lower
bound of the temperature actually experienced by the transistors in the burn region. The actual
junction temperature is likely to be much higher. Additionally, since heat affects transistors around
the shorted transistor it is likely the characterization methods described in Chapter 4 are capable
of detecting BTI damage.
As a result, we theorize that accelerated BTI is very likely at least one cause
of the observed damage. This claim is further supported by the pronounced initial slowdown.
Similar results, where the initial degradation is the most pronounced, are reported in [2], [5], [16],
[18], [32], all of which examined the effects of BTI on FPGAs.
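The temperature dependence of thermally activated aging is often approximated with an Arrhenius acceleration factor. The Python sketch below evaluates that factor between the 100 °C rating and the 177.7 °C sensor reading reported above. The Arrhenius form and the activation energy of 0.5 eV are assumptions made purely for illustration; this work does not measure or fit an activation energy.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def arrhenius_acceleration(t_ref_c, t_stress_c, ea_ev):
    """Acceleration factor of t_stress_c relative to t_ref_c (both in Celsius)."""
    t_ref_k = t_ref_c + 273.15
    t_stress_k = t_stress_c + 273.15
    return math.exp((ea_ev / BOLTZMANN_EV_PER_K) * (1.0 / t_ref_k - 1.0 / t_stress_k))


EA_EV = 0.5  # hypothetical activation energy, not a measured value
excess_c = 177.7 - 100.0  # how far the sensor reading exceeds the rating
factor = arrhenius_acceleration(100.0, 177.7, EA_EV)
print(f"{excess_c:.1f} C above rating, acceleration factor ~{factor:.1f}x")
```

Because the sensor reading is only a lower bound on the temperature in the burn region, a factor computed this way would likewise be a lower bound under the same assumptions.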

7.2.2

Hot Carrier Injection
HCI, as presented in Chapter 2, becomes more pronounced in the presence of increased
voltage and increased substrate current (on sub-70 nm technology nodes). In the experiment the
voltage is held at its nominal value. However, higher than typical currents are induced by short
circuits, and so it is very plausible that HCI effects may have caused damage. Unfortunately, in
the papers reviewed for this work HCI was most commonly studied by inducing high currents
with high switching activity, as that allowed for separation from other effects, such as BTI [2],
[18]. Since short circuits cause no switching activity it is impossible for us to distinguish BTI
effects from HCI effects. Furthermore, HCI would only affect the shorted transistors and not the
surrounding transistors.
As a result we theorize that HCI is likely occurring but is unlikely to be the
cause of the observed damage, because the characterization method probably shares few
transistors with the short circuits (see Chapter 4 for more details).

7.2.3

Electromigration
EM, as presented in Chapter 2, becomes more pronounced in the presence of increased
temperature and high DC current. In our experiment currents in excess of 7.9 A and temperatures
in excess of 170 °C were observed, providing the environmental conditions needed for accelerated EM.
Unfortunately, similar to HCI, the characterization method used to measure the damage is unlikely
to have used the same wires as the short circuits, so EM is unlikely to be the cause
of any of the measured damage. We theorize that EM is occurring, since the conditions are present
for it, but is not being measured.
It should be noted that EM affects wires, not transistors, and so any EM effects that have
occurred would be on the routing in the chip and not on the transistors themselves. This is in
contrast to HCI (also induced by high current) and BTI (induced by high temperatures), which
affect the transistors themselves.
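EM lifetime is commonly modelled with Black's equation, MTTF = A * J^(-n) * exp(Ea / (k * T)), where J is current density and T is absolute temperature. The Python sketch below computes the ratio of modelled lifetimes between two operating points. The exponent n = 2, the activation energy of 0.9 eV, and the doubling of current density are hypothetical inputs; the actual current density in the shorted routing is not known.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def black_mttf_ratio(j_ratio, t_ref_c, t_stress_c, n=2.0, ea_ev=0.9):
    """Return MTTF_stress / MTTF_ref under Black's equation.

    j_ratio is J_stress / J_ref; n and ea_ev are assumed model parameters.
    """
    t_ref_k = t_ref_c + 273.15
    t_stress_k = t_stress_c + 273.15
    thermal = math.exp((ea_ev / BOLTZMANN_EV_PER_K) * (1.0 / t_stress_k - 1.0 / t_ref_k))
    return (j_ratio ** -n) * thermal


# Hypothetical case: twice the reference current density at 170 C instead of 100 C.
ratio = black_mttf_ratio(j_ratio=2.0, t_ref_c=100.0, t_stress_c=170.0)
print(f"Modelled lifetime shrinks to {ratio:.4%} of the reference lifetime")
```

The sketch only shows how temperature and current density combine in the model; it says nothing about whether the characterization method in this work would observe the resulting wire damage.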

7.2.4

Time-Dependent Dielectric Breakdown
TDDB, as presented in Chapter 2, becomes more pronounced in the presence of increased
voltage. Also, similar to HCI, the effects of TDDB have been observed to become less pronounced
at higher temperatures. Since the experiment holds the voltage at its nominal level and induces
high temperatures (both of which help mitigate TDDB effects) we theorize that TDDB does not
play a significant role in the observed damage [14], [29].

7.3

Other Notable Results
This section discusses several other noteworthy aspects of the Chapter 6 results.

7.3.1

Current over Time
One of the primary effects of the short circuits is the high current they induce, resulting
in a steady state burn current exceeding 7.9 A. An interesting trend in the data is that the steady
state current draw decreased over the length of the burn period. As shown in both Table 6.1 and
Figure 7.1, the average current dropped from 7.94 A at the beginning of the experiment to 7.90 A
at the end of the first burn, a 0.04 A drop. While not large, the decrease was a consistent trend.
Furthermore, this current drop is the only indication of damage occurring that can be measured
while the FPGA is in a burn state, as compared to needing to stop a burn to take measurements,
making it noteworthy. Note: the spikes shown in Figure 7.1 occur at the beginning of each burn
cycle as the board is programmed with shorts and the current settles.
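The size of this drop relative to the total current follows directly from the two reported averages. A short Python check of the arithmetic:

```python
start_a = 7.94  # average current at the beginning of the experiment (A)
end_a = 7.90    # average current at the end of the first burn (A)

drop_a = start_a - end_a
drop_pct = 100.0 * drop_a / start_a  # drop as a percentage of the starting current
print(f"drop = {drop_a:.2f} A ({drop_pct:.2f}% of the starting current)")
```

The drop is therefore roughly half a percent of the starting current, which is consistent with describing it as small but consistent.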

7.3.2

Damage Pattern
When evaluating techniques for accelerating aging it is important to examine how the damage
changes with time. Will the technique stop working and the damage plateau, or will damage
continue to occur? If so, will the damage occur at the same rate as it does initially, or will it back off
with time?

Figure 7.1: VINT input current during the first burn phase.

In Figure 6.2 it can be seen that the damage caused by our short circuits is most pronounced
in the beginning, and then tapers off during the first burn phase. The damage then ceases during
the recovery phase and then starts again during the second burn phase. By the end of the first burn
phase and through the second burn phase the damage profile is linear, meaning that the amount of
slowdown achieved during each burn cycle is roughly equal. This is important because it indicates
that more damage can be done simply by allowing for more time to pass, potentially until device
failure. Characterizing these damage patterns more fully is a potential avenue for future work.
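One way to check whether a damage profile is roughly linear is to compute the slowdown at each characterization relative to the first measurement and compare the increments between consecutive burn cycles. The Python sketch below does this, assuming the characterization yields a frequency-like speed metric. The frequency values and the 0.05 percentage-point tolerance are hypothetical and only illustrate the calculation; they are not data from Figure 6.2.

```python
def slowdown_pct(f_initial, f_now):
    """Percent reduction in oscillation frequency relative to the initial value."""
    return 100.0 * (f_initial - f_now) / f_initial


def per_cycle_increments(freqs):
    """Additional slowdown (percentage points) gained in each burn cycle."""
    totals = [slowdown_pct(freqs[0], f) for f in freqs]
    return [later - earlier for earlier, later in zip(totals, totals[1:])]


# Hypothetical frequencies (MHz) measured after successive burn cycles.
freqs = [200.0, 196.0, 195.0, 194.0, 193.0]
steps = per_cycle_increments(freqs)

# Ignore the pronounced first step and test whether the remaining steps are equal.
tail = steps[1:]
is_linear = max(tail) - min(tail) < 0.05
print(steps, is_linear)
```

With these made-up numbers the first cycle contributes most of the slowdown and the later cycles contribute equal increments, mirroring the shape described above.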

7.3.3

Recovery Phase
It is also worth noting that no recovery occurred during the recovery phase. This is
abnormal, as prior work indicates that at least some recovery is expected following damage done by
BTI (which is the most likely cause of the damage we have measured) [16]. This indicates that
something other than BTI may be causing damage, or it may imply that there is a critical amount
of damage beyond which no recovery occurs. Exploring this more fully is
another topic left to future work.

7.4

Limitations
The primary limitations in this experiment are with the chosen characterization method.
First, only three locations on the chip are characterized. As a result nothing is known about what
is happening everywhere else. This in turn means that the localization effects can’t be measured
very precisely, which presents a large limitation when considering possible applications.
Furthermore, the chosen characterization method doesn’t use very many of the same resources as the short circuits, with only the LUT being guaranteed to be the same. Thus, damage
done by HCI and EM is unlikely to be measured using this technique. Improving this has the
potential to greatly impact the characterization’s ability to provide understanding of what is happening inside the chip.
It is recommended that future work in this area develop a more thorough characterization
methodology that overcomes these limitations. For example, characterizing all slices on the chip
and using the same PIPs as the short circuits are potential ways to mitigate these limitations.

CHAPTER 8.

CONCLUSION

This thesis introduces a new technique for accelerating aging on FPGAs using harmful
configurations. A definition for short circuits in the context of FPGAs is presented and a technique
for creating designs that contain short circuits is described. The technique utilizes various design
tools made available by Xilinx such as Vivado and RapidWright. The technique is implemented
and tested on real Xilinx 7-series parts, specifically the Artix-7 35T.
This work also describes several experiments that demonstrate the degradation impacts
that short circuits have on FPGA performance. These experiments are enabled by a testbed that
is also described. During the formulation of this testbed the challenges associated with measuring
degradation in FPGA performance are described, as well as the methods employed to combat them.
This testbed is then used in several experiments that explore different short circuit configurations.
This ultimately culminates in a long running experiment, which is also presented. In this long
running experiment an FPGA is shown to degrade by 8.52% after 126 days of having short circuits
programmed. Furthermore, this same experiment shows that short circuits are capable of producing
localized degradation, as one part of the chip only experienced 2.11% slowdown after the same
126 days of burn. Currently this is the only method known to the author that is capable of causing
nonuniform aging effects on any commodity semiconductor device.

8.1

Future Work
In addition to the future work presented in Chapter 7 there are several aspects of this work
that could use more exploration. In Chapter 3 a description for how to implement short circuits in
a Xilinx 7-series FPGA is given. Migrating those techniques to newer FPGA families (such as UltraScale) would make the work more applicable to applications that use those families. Additionally, the
fundamental idea of short circuits could be revisited. For example, a short that connects to multiple PIPs (as compared to just one) has the potential to produce more current and heat, which could
greatly impact the results presented here. Another avenue for exploration is reverse engineering the bitstream. More knowledge regarding bitstream details may lead to new insights about how
short circuits are affecting the FPGA or perhaps into other harmful configurations that might cause
even more damage. A fourth option for future work could be to mix this aging technique with
other, more traditional aging techniques, such as increasing the supply voltage or baking the part
at high temperatures, which may result in new and interesting insights.

8.2

Contributions
This work resulted from the effort of several individuals. The genesis and direction of the
project were developed under the guidance of Professors Brad Hutchings and Jeffery Goeders.
They also provided invaluable insight regarding the interpretation of the results presented. Hayden
Cook, Wesley Stirk, Maximillian Warner and Robert Lucas all contributed to various aspects of
the testbed, short circuit creation and experimental design. Each also participated in interpreting
the data collected from the experiments. To help clarify my specific contributions to this work I
have listed below several parts of the project that I was mostly or entirely responsible for.
• Data collection procedures and methods.
• The software package that manages experiments and interacts with multiple aspects of the
system.
• The database system that stores all experimental data.
• The data analytics platform that enables detailed data analysis.
• The design of various experiments.
• The overall architecture and design of the testbed.

BIBLIOGRAPHY
[1] S. Novak, C. Parker, D. Becher, M. Liu, M. Agostinelli, M. Chahal, P. Packan, P. Nayak,
S. Ramey, and S. Natarajan, “Transistor aging and reliability in 14nm tri-gate technology”,
in Reliability Physics Symposium, Apr. 2015, 2F.2.1–2F.2.5.
[2] A. Amouri, F. Bruguier, S. Kiamehr, P. Benoit, L. Torres, and M. Tahoori, “Aging effects
in FPGAs: An experimental analysis”, in Conference on Field Programmable Logic and
Applications (FPL), Sep. 2014, pp. 1–4.
[3] Z. Ghaderi and E. Bozorgzadeh, “Aging-aware high-level physical planning for reconfigurable systems”, in Asia and South Pacific Design Automation Conference (ASP-DAC), Jan.
2016, pp. 631–636.
[4] H. Dogan, D. Forte, and M. M. Tehranipoor, “Aging analysis for recycled FPGA detection”,
in Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems (DFT),
Oct. 2014, pp. 171–176.
[5] A. Maiti, L. McDougall, and P. Schaumont, “The impact of aging on an FPGA-based physical unclonable function”, in Conference on Field Programmable Logic and Applications
(FPL), Sep. 2011, pp. 151–156.
[6] C. Beckhoff, D. Koch, and J. Torresen, “Short-circuits on FPGAs caused by partial runtime reconfiguration”, in Conference on Field Programmable Logic and Applications (FPL),
Aug. 2010, pp. 596–601.
[7] A. Putnam, A. M. Caulfield, E. S. Chung, D. Chiou, K. Constantinides, J. Demme, H. Esmaeilzadeh, and J. Fowers, “A reconfigurable fabric for accelerating large-scale datacenter
services”, p. 12,
[8] C. R. Clark and D. E. Schimmel, “Scalable pattern matching for high speed networks”, in
12th Annual IEEE Symposium on Field-Programmable Custom Computing Machines, Apr.
2004, pp. 249–257.
[9] M. Caffrey, “A space-based reconfigurable radio”, p. 6,

[10] M. Wirthlin, “High-reliability FPGA-based systems: Space, high-energy physics, and beyond”, Proceedings of the IEEE, vol. 103, no. 3, pp. 379–389, Mar. 2015.
[11] N. Mehta, “Xilinx 7 series FPGAs: The logical advantage”, p. 9, 2012.
[12] C. Lavin and A. Kaviani, “RapidWright: Enabling custom crafted implementations for FPGAs”, in Symposium on Field-Programmable Custom Computing Machines (FCCM), Apr.
2018, pp. 133–140.
[13] ——, “Build your own domain-specific solutions with RapidWright: Invited tutorial”, in
Proceedings of the 2019 ACM/SIGDA International Symposium on Field-Programmable
Gate Arrays, Seaside CA USA: ACM, Feb. 20, 2019, pp. 14–22.
[14] N. Weste and D. Harris, “CMOS VLSI design: A circuits and systems perspective”, Jan. 1,
2010.
[15] JEDEC Solid State Technology Association and others, JEP122c, failure mechanisms and
models for semiconductor devices, Mar. 2006.
[16] S. Gehrer, S. Leger, and G. Sigl, “Aging effects on ring-oscillator-based physical unclonable
functions on FPGAs”, in Conference on ReConFigurable Computing and FPGAs (ReConFig), Dec. 2015, pp. 1–6.
[17] S. Gehrer, “Highly efficient implementation of physical unclonable functions on FPGAs”,
2017.
[18] E. A. Stott, J. S. Wong, P. Sedcole, and P. Y. Cheung, “Degradation in FPGAs: Measurement
and modelling”, in Symposium on Field Programmable Gate Arrays (FPGA), ser. FPGA ’10,
Monterey, California, USA: Association for Computing Machinery, Feb. 21, 2010, pp. 229–
238.
[19] J. Kim, R. M. Rao, J. Schaub, A. Ghosh, A. Bansal, K. Zhao, B. P. Linder, and J. Stathis,
“PBTI/NBTI monitoring ring oscillator circuits with on-chip vt characterization and high
frequency AC stress capability”, in 2011 Symposium on VLSI Circuits - Digest of Technical
Papers, ISSN: 2158-5636, Jun. 2011, pp. 224–225.

[20] B. B and C. Ts, “Mitigating the impact of NBTI and PBTI degradation”, Global Journal of
Technology and Optimization, vol. 7, no. 2, 2016.
[21] S. Zafar, Y. Kim, V. Narayanan, C. Cabral, V. Paruchuri, B. Doris, J. Stathis, A. Callegari,
and M. Chudzik, “A comparative study of NBTI and PBTI (charge trapping) in SiO2/HfO2
stacks with FUSI, TiN, re gates”, in 2006 Symposium on VLSI Technology, 2006. Digest of
Technical Papers., ISSN: 2158-9682, Jun. 2006, pp. 23–25.
[22] A. Amouri and M. Tahoori, “High-level aging estimation for FPGA-mapped designs”, in International Conference on Field Programmable Logic and Applications (FPL), Aug. 2012,
pp. 284–291.
[23] Xilinx, “Artix-7 FPGAs data sheet: DC and AC switching characteristics (DS181)”, p. 64,
2018.
[24] M. Naouss and F. Marc, “FPGA LUT delay degradation due to HCI: Experiment and simulation results”, Microelectronics Reliability, Proceedings of the 27th European Symposium
on Reliability of Electron Devices, Failure Physics and Analysis, vol. 64, pp. 31–35, Sep. 1,
2016.
[25] X. Wang, Q. Tang, P. Jain, D. Jiao, and C. H. Kim, “The dependence of BTI and HCI-induced frequency degradation on interconnect length and its circuit level implications”,
IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 23, no. 2, pp. 280–
291, Feb. 2015.
[26] J. Keane, X. Wang, D. Persaud, and C. H. Kim, “An all-in-one silicon odometer for separately monitoring HCI, BTI, and TDDB”, IEEE Journal of Solid-State Circuits, vol. 45,
no. 4, pp. 817–829, Apr. 2010.
[27] S. Sadiqbatcha, Z. Sun, and S. X. Tan, “Accelerating electromigration aging: Fast failure
detection for nanometer ICs”, IEEE Transactions on Computer-Aided Design of Integrated
Circuits and Systems, vol. 39, no. 4, pp. 885–894, Apr. 2020.

[28] O. Aubel, W. Hasse, and M. Hommel, “Highly accelerated electromigration lifetime test
(HALT) of copper”, IEEE Transactions on Device and Materials Reliability, vol. 3, no. 4,
pp. 213–217, Dec. 2003.
[29] M. Arabi, X. Federspiel, F. Cacho, M. Rafik, A.-P. Nguyen, X. Garros, and G. Ghibaudo,
“Temperature dependence of TDDB at high frequency in 28fdsoi”, Microelectronics Reliability, 30th European Symposium on Reliability of Electron Devices, Failure Physics and
Analysis, vol. 100-101, p. 113 422, Sep. 1, 2019.
[30] I. Hadžić, S. Udani, and J. M. Smith, “FPGA viruses”, in Conference on Field Programmable
Logic and Applications (FPL), P. Lysaght, J. Irvine, and R. Hartenstein, Eds., ser. Lecture
Notes in Computer Science, Berlin, Heidelberg: Springer, 1999, pp. 291–300.
[31] B. L. Hutchings, J. Monson, D. Savory, and J. Keeley, “A power side-channel-based digital
to analog converter for xilinx FPGAs”, in Symposium on Field-Programmable Gate Arrays
(FPGA), ser. FPGA ’14, Monterey, California, USA: Association for Computing Machinery,
Feb. 26, 2014, pp. 113–116.
[32] M. Slimani, K. Benkalaia, and L. Naviner, “Analysis of ageing effects on ARTIX7 XILINX
FPGA”, Microelectronics Reliability, vol. 76-77, Jul. 1, 2017.
[33] R. S. Chakraborty, I. Saha, A. Palchaudhuri, and G. K. Naik, “Hardware trojan insertion
by direct modification of FPGA configuration bitstream”, IEEE Design Test, vol. 30, no. 2,
pp. 45–54, Apr. 2013.
[34] J. Ke, P. Sun, X. Zhang, Z. Zhao, and X. Cui, “Experimental study of the factors affecting
on SiC MOSFET switching performance”, in PCIM Asia 2017; International Exhibition
and Conference for Power Electronics, Intelligent Motion, Renewable Energy and Energy
Management, Jun. 2017, pp. 1–8.
[35] I. Ahmed, L. L. Shen, and V. Betz, “Becoming more tolerant: Designing FPGAs for variable
supply voltage”, in 2019 29th International Conference on Field Programmable Logic and
Applications (FPL), ISSN: 1946-1488, Sep. 2019, pp. 1–8.

[36] S. Kalra, “An insight into temperature inversion using α-power MOSFET model for ultradeep
submicron digital CMOS technologies”, AEU - International Journal of Electronics and
Communications, vol. 125, p. 153 349, Oct. 1, 2020.
[37] A. Sassone, A. Calimera, A. Macii, E. Macii, M. Poncino, R. Goldman, V. Melikyan, E.
Babayan, and S. Rinaudo, “Investigating the effects of inverted temperature dependence
(ITD) on clock distribution networks”, in 2012 Design, Automation Test in Europe Conference Exhibition (DATE), ISSN: 1558-1101, Mar. 2012, pp. 165–166.


