Design methods for self-repairing digital logic by Yang, Jingchi







of the Requirements for the Degree
Doctor of Philosophy in the
School of Electrical and Computer Engineering
Georgia Institute of Technology
December 2019
Copyright c© Jingchi Yang 2019
DESIGN METHODS FOR SELF-REPAIRING DIGITAL LOGIC
Approved by:
Dr. David C Keezer, Advisor
School of Electrical and Computer
Engineering
Georgia Institute of Technology
Dr. Abhijit Chatterjee
School of Electrical and Computer
Engineering
Georgia Institute of Technology
Dr. Linda S Milor
School of Electrical and Computer
Engineering
Georgia Institute of Technology
Dr. Vijay Madisetti
School of Electrical and Computer
Engineering
Georgia Institute of Technology
Dr. Sundaresan Jayaraman
School of Materials Science and
Engineering
Georgia Institute of Technology
Date Approved: November 1, 2019
ACKNOWLEDGEMENTS
Foremost, I would like to express my sincere gratitude to my advisor Prof. David
C Keezer for the continuous support of my Ph.D. study and research, for his patience,
motivation, enthusiasm, and immense knowledge. His guidance helped me in all the time
of research and writing of this thesis. I could not have imagined having a better advisor
and mentor for my Ph.D. study. Besides my advisor, I would like to thank my fellow lab
mate in research Group: Te-Hui Chen, for his guidance in Verilog programming, testing
equipment, and Xilinx products. Last but not least, I would like to thank my family: my
parents Jian Yang and Zhenguang Wang, for supporting me spiritually throughout my life.
iii
TABLE OF CONTENTS
Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iii
List of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viii
List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix
Chapter 1: Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.3 Thesis Organization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
Chapter 2: Background and history . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.1 Faults and the Cause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.1.1 Permanent Faults . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.1.2 Transient Faults . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.1.3 Intermittent Faults . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.2 Fault Detection Methodologies . . . . . . . . . . . . . . . . . . . . . . . . 14
2.2.1 Hardware redundancy . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.2.2 Information redundancy . . . . . . . . . . . . . . . . . . . . . . . 15
2.2.3 FPGA test methods . . . . . . . . . . . . . . . . . . . . . . . . . . 16
iv
2.3 Fault Repair Methodologies . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.4 Fault-Tolerant Techniques . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
Chapter 3: Self-repairing Methodologies . . . . . . . . . . . . . . . . . . . . . . 21
3.1 Enhanced-DMR (with BER-measurement and error-reporting) . . . . . . . 21
3.2 Enhanced-TMR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
3.3 Enhanced-QMR (TMR + local spare) . . . . . . . . . . . . . . . . . . . . . 27
3.4 State Synchronization techniques . . . . . . . . . . . . . . . . . . . . . . . 32
Chapter 4: Healing Controller . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
4.1 FPGA working principle . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
4.2 Partial Reconfiguration . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
4.3 Healing controller program . . . . . . . . . . . . . . . . . . . . . . . . . . 47
Chapter 5: Software Toolchain . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
5.2 Typical Compiler . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
5.3 HDL converter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
5.3.1 Lexical Analyzer . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
5.3.2 Syntax Analyzer . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
5.3.3 Intermediate Code Modifier . . . . . . . . . . . . . . . . . . . . . 65
5.3.4 HDL Code Generator . . . . . . . . . . . . . . . . . . . . . . . . . 68
Chapter 6: Experiment Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
6.1 Case Study: ITC benchmark designs . . . . . . . . . . . . . . . . . . . . . 71
v
6.1.1 Implementation Cost Estimation . . . . . . . . . . . . . . . . . . . 72
6.1.2 Reliability Model . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
6.2 VLSI Case Study: Handwritten Digit Recognition . . . . . . . . . . . . . . 90
6.2.1 Experiment System Setup . . . . . . . . . . . . . . . . . . . . . . 90
6.2.2 ANN . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
6.2.3 Error Injection . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
6.2.4 Fault Tolerance Analysis . . . . . . . . . . . . . . . . . . . . . . . 98
6.2.5 Implementation Report . . . . . . . . . . . . . . . . . . . . . . . . 100
Chapter 7: Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
7.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
7.2 Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
7.2.1 Bit Error Rate (BER) Measurement . . . . . . . . . . . . . . . . . 105
7.2.2 Enhancement for Fault Tolerance and Fault Isolation . . . . . . . . 105
7.2.3 Self-Repairing Architecture . . . . . . . . . . . . . . . . . . . . . . 106
7.2.4 State Synchronization . . . . . . . . . . . . . . . . . . . . . . . . . 106
7.2.5 HDL converter . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
7.2.6 Self-Repairing Design Framework . . . . . . . . . . . . . . . . . . 107
7.3 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
7.4 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
7.4.1 Built-in Healing Controller . . . . . . . . . . . . . . . . . . . . . . 108
7.4.2 Safety-Critical Application . . . . . . . . . . . . . . . . . . . . . . 108
7.4.3 Power supply noise testing . . . . . . . . . . . . . . . . . . . . . . 108
vi
7.4.4 Elevated temperature testing . . . . . . . . . . . . . . . . . . . . . 109
7.4.5 Radiation Testing . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
Appendix A: self-repairing system and benchmark design verilog code . . . . . 111
Appendix B: hdl converter code . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 214
Vita . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 215
vii
LIST OF TABLES
3.1 Truth Table of enhance voter logic with TMR . . . . . . . . . . . . . . . . 26
3.2 Fault Management Unit state table 1. . . . . . . . . . . . . . . . . . . . . . 29
3.3 Fault Management Unit state table 2 (continued). . . . . . . . . . . . . . . 30
4.1 Relationship between the faulty module ID, redundant module name and
the partially reconfigurable region. . . . . . . . . . . . . . . . . . . . . . . 49
5.1 Example of using regular expression to match strings patterns . . . . . . . . 54
6.1 ITC99 BENCHMARK DESIGNS . . . . . . . . . . . . . . . . . . . . . . 71
6.2 ITC99 benchmark designs implementation sizes from layout . . . . . . . . 79
6.3 ITC99 benchmark designs implementation sizes from formulas . . . . . . . 80
6.4 ITC99 benchmark designs maximum operating frequencies (in MHz) . . . . 80
6.5 System Accuracy Results . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
6.6 System Accuracy Results (continued) . . . . . . . . . . . . . . . . . . . . 100
6.7 FPGA Implementation Report . . . . . . . . . . . . . . . . . . . . . . . . 100
viii
LIST OF FIGURES
1.1 Development work flow. . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.1 Charge generation and collection phases in a reverse-biased junction [8] . . 12
2.2 Current pulse caused by the high-energy ion [8] . . . . . . . . . . . . . . . 12
2.3 Dual module redundancy diagram. . . . . . . . . . . . . . . . . . . . . . . 14
2.4 Traditional TMR diagram. . . . . . . . . . . . . . . . . . . . . . . . . . . 19
3.1 BER enhanced-DMR diagram. . . . . . . . . . . . . . . . . . . . . . . . . 22
3.2 BER enhanced-DMR timing diagram. . . . . . . . . . . . . . . . . . . . . 24
3.3 Enhanced-TMR with Voter and BER logic. . . . . . . . . . . . . . . . . . . 26
3.4 Enhanced-QMR with Voter and BER logic. . . . . . . . . . . . . . . . . . 28
3.5 Traditional method for TMR state synchronization. . . . . . . . . . . . . . 33
3.6 State synchronization technique for self-repairing system. The intercon-
nection between the combinational logic and the multiplexer for the rest
modules is similar to the highlighted first module and is not shown in this
diagram. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
3.7 Fault Management Unit diagram. . . . . . . . . . . . . . . . . . . . . . . . 35
4.1 FPGA system architecture [48]. . . . . . . . . . . . . . . . . . . . . . . . 38
4.2 Configurable Logic Block architecture [49]. . . . . . . . . . . . . . . . . . 38
4.3 FPGA architecture with configuration memory layer and hardware logic
layer [48]. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
ix
4.4 FPGA architecture with multi-configuration layer [48]. . . . . . . . . . . . 41
4.5 How Partial Reconfiguration works [47]. . . . . . . . . . . . . . . . . . . . 41
4.6 Partially Reconfigurable regions in Xilinx Virtex XCV50 FPGA. . . . . . . 43
4.7 A bus macro showing the connectivity between the static region and a re-
configurable region [48]. . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
4.8 Xilinx 7-Series architecture [48]. . . . . . . . . . . . . . . . . . . . . . . . 45
4.9 Partial Reconfiguration design flow. . . . . . . . . . . . . . . . . . . . . . 46
4.10 Error message format. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
4.11 Example of the interaction between the healing controller (in a PC) and the
FPGA when a soft error is detected. . . . . . . . . . . . . . . . . . . . . . 48
4.12 Example of the pre-defined partial reconfigurable regions. . . . . . . . . . . 50
4.13 Healing Controller flowchart. . . . . . . . . . . . . . . . . . . . . . . . . . 51
5.1 HDL converter. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
5.2 Original HDL design file example: full adder . . . . . . . . . . . . . . . . 53
5.3 Converted HDL file with self-repairing architecture. . . . . . . . . . . . . . 53
5.4 Example Verilog code of a 8-bit counter . . . . . . . . . . . . . . . . . . . 55
5.5 Self-repairing version of the 8-bit counter with the state synchronization
technique . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
5.6 Structure of a classical compiler [60]. . . . . . . . . . . . . . . . . . . . . 57
5.7 Example of a syntax-correct but semantic-wrong code . . . . . . . . . . . . 59
5.8 Structure of a HDL converter. . . . . . . . . . . . . . . . . . . . . . . . . . 59
5.9 Lexical Analyzer. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
5.10 Lexical analyzer result of scanning the counter.v . . . . . . . . . . . . . . . 61
5.11 The Finite Automaton of a floating number. . . . . . . . . . . . . . . . . . 62
x
5.12 Syntax analyzer result of processing the counter.v . . . . . . . . . . . . . . 64
5.13 Tree Traversal Diagram. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
5.14 Tree Traversal Pseudocode . . . . . . . . . . . . . . . . . . . . . . . . . . 66
5.15 Example of creating a state register hash table. The hash table vardict stores
the state register name and the corresponding width . . . . . . . . . . . . . 67
5.16 The state register hash table of the counter. . . . . . . . . . . . . . . . . . . 67
5.17 Example of an intermediate code modifier function. This function inserts
the signal port for controlling the multiplexer, this signal is the “sync” sig-
nal passed from outside; the ‘neighbor in ′+k and the ‘neighbor out ′+k
port is the state synchronization data port from/to the neighbor’s module.
The “k” in the name field will be replaced with the information from the
state synchronization hash table. . . . . . . . . . . . . . . . . . . . . . . . 67
5.18 Example of the python source code creats an always statement using HDL
code generator. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
5.19 Example of the generated Verilog code of an always statement. . . . . . . . 69
6.1 BER measurement logic block schematic . . . . . . . . . . . . . . . . . . . 73
6.2 Enhanced voting logic block schematic. . . . . . . . . . . . . . . . . . . . 74
6.3 Enhanced diagnostic logic block schematic. . . . . . . . . . . . . . . . . . 75
6.4 Enhanced DMR area overhead vs. the function module size. . . . . . . . . 77
6.5 Enhanced TMR area overhead vs. the function module size. . . . . . . . . . 78
6.6 Enhanced QMR area overhead vs. the function module size. . . . . . . . . 78
6.7 QMR enhancement logic block schematic. . . . . . . . . . . . . . . . . . . 79
6.8 Maximum operating frequencies for ITC99 benchmark designs . . . . . . . 81
6.9 Layout floorplan for enhanced QMR version of b12 . . . . . . . . . . . . . 82
6.10 Markov model for a two state system . . . . . . . . . . . . . . . . . . . . . 83
6.11 Markov model of the original TMR system. . . . . . . . . . . . . . . . . . 85
xi
6.12 Markov model of the enhanced TMR system. . . . . . . . . . . . . . . . . 86
6.13 Markov model of the enhanced QMR system. . . . . . . . . . . . . . . . . 87
6.14 The reliability changes of a 204k LUT system in 50 years of operation. . . . 88
6.15 The reliability changes of a 204k LUT system in 5000 years of operation. . 89
6.16 Experiment setup system overview. . . . . . . . . . . . . . . . . . . . . . . 90
6.17 ANN architecture. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
6.18 Neuron model. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
6.19 MNIST data example. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
6.20 Error injection data flow. . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
6.21 Error generator module. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
6.22 Example of a linear feedback shift register. . . . . . . . . . . . . . . . . . . 95
6.23 Simulation waveform of the enhancement module . . . . . . . . . . . . . . 97
6.24 ANN accuracy vs Injected error rate. . . . . . . . . . . . . . . . . . . . . . 98
6.25 FPGA Layout of the original system. . . . . . . . . . . . . . . . . . . . . . 101
6.26 FPGA Layout of the self-repairing system. Each color represents one single
neural network module. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
xii
SUMMARY
The objective of this research is to establish a systematic approach for the design of self-
testable, self-correcting, self-repairing and self-healing digital systems. This self-test/self-
repair methodology is accomplished autonomously, in a distributed fashion, so that it can
scale as the size of the system grows. Modular redundancy and re-programmability are
exploited to accomplish generic self-test (applicable to nearly any application) and to en-
able self-repair in general, and self-healing in FPGAs. Error rates are measured throughout
the design to distinguish between transient errors and permanent or semi-permanent logic
faults. Original logic design is automatically transformed to this self-healing architecture
via customizable software. Logic reconfiguration occurs automatically, replacing failing
logic modules so that the system continues to operate error-free while partial dynamic re-




The objective of this research is to establish a systematic approach for the design of self-
testable, self-correcting, self-repairing and self-healing digital systems. Safety-critical
hardware systems play a important role in many applications such as military systems,
space applications, data centers, medical devices, and self-driving cars. These applica-
tions, where significant loss of life or destruction of capital equipment could result from
failing electronics, should be designed with high reliability in mind. This research presents
methods to improve digital system reliability and a technique for automatically deploying
those methods.
1.1 Motivation
Computers have already become an integral part of our everyday life and most people
assume that their operation will last forever and never fail. Clearly this is too optimistic. In
fact, even back to the 1940s when the first generation computer is built upon vacuum tubes,
errors and failures are already inevitable.
Benefiting from significant technological improvements, especially the invention of the
integrated circuits in the 1960s, that fundamentally reshape the information technology in-
dustry, today’s computer systems become more compact, less expensive and more powerful
machines.
As Gordon Moore discussed in [1], the number of transistors in a dense integrated
circuit doubles about every two years and this rate of growth would continue for decades.
In fact, nowadays, even a tiny piece of silicon chip is sufficient to operate complicated
task that the whole moon launch system can not achieve in the 1960s. This leads to a
significant increase in the use of computers. Moreover, besides the standalone desktop
1
computers, at the present time, the majority of the computers are in the form of embedded
systems. Although most of them are invisible to us, these devices cover a vast spectrum of
applications, from sophisticated space shuttle control systems to coffee machines, and from
nuclear reactor shutdown systems to automotive. The failure of many digital systems can
result in direct, and possibly very serious, harm to one or more people. In some extreme
instances the fault of the operation should be avoided at all costs, for example the failure
of the control system of a nuclear system jeopardize millions lives. On a smaller scale,
failures in an automobile brake system could potentially kill dozens of people. Even if no
human lives are under threat, some failures in digital systems could lead to catastrophic
financial loss.
Moreover, reliability in hardware systems is facing the challenge of the increasing com-
plexities of modern digital systems as well as the decreasing dimension of semiconductor
features. As a result, in the near future, even non-safety-critical designs might need to
utilize safety-critical techniques in order to meet minimum reliability requirements.
As can be seen from the above discussion, the operation of many digital systems has
a direct influence on the user safety and user property. And to achieve such a highly re-
liable and self-repairing electronic system, engineers must acquire expertise not only in
domain-specific application design but also safety-critical design. Alternatively, we present
a wrapper system that automatically upgrades non-safety-critical design into a more reli-
able version according to the specific system-level requirements, such as self-testing, fault
tolerant and self-healing for Field Programmable Gate Array (FPGA) based design.
1.2 Approach
Biological systems have evolved through billions of years and thrive largely due to their
ability to tolerate, repair, and heal defects. Most of the energy expended in living systems
is devoted to their top priority, which is to stay alive. In our work, we follow nature’s
example and put our priority on survival mechanisms in electronics, starting with self-test,
2
repair, and healing.
At a low-level, the DNA code present in all cells is made up of only four types of small
molecules arranged as complimentary base-pairs. This low-level complimentary paring
represents a first level of redundancy, somewhat analogous to the complementary operation
of CMOS logic. Defects in a base-pair are recognized by repair enzymes that sense that a
base molecule is either missing or miss-paired, and initiate repair mechanisms to replace
the defect with its proper base molecule.
Self-test is enabled within biological systems by the complementary pairing of these
bases. It provides a natural way to identify single-point defects. In this sense, the double-
helix duplication of the DNA code is analogous to dual-modular redundancy (DMR) in
electronic systems. It provides an automated way of detecting faults and serves as the
starting point for repair. At an even higher level, nature duplicates entire organs within the
body. In most cases the individual can survive (although in a degraded condition) with the
loss of one of the duplicate organs.
In this research, self-testing is achieved through an enhanced modular redundancy tech-
nique that not only compares two or more identical modules, but also distinguishes in-
termittent errors from temporary transient errors. This distinction is critical for the fault
repairing process since transient errors usually only last for one clock cycle, thus by the
time the repairing process starts the error might already disappear. Therefore, correcting
the corrupted data caused by the transient errors is recommended and the traditional error
detection method, which flags every error, is not efficient for identifying modules for repair.
By filtering out transient errors from intermittent errors, the repairing procedure required
by the fault management unit is effectively reduced by 90% to 95%, which significantly
reduced the average single module offline time, and thus increases the system availability
overall. Moreover, the hardware implementation cost of this enhanced logic is negligible.
Therefore, this minimal transistors or gates requirement can largely increase the enhanced
module scalability.
3
To achieve fault tolerance, we introduce an enhanced Triple Modular Redundancy
(TMR) technique that automatically corrects transient errors and identifies permanent error
for requesting repairing procedure more efficiently. To maintain the fault tolerance during
fault repair process and to tolerant more than one irreversible hardware defects, an en-
hanced Quadruple-Modular Redundancy (QMR) technique is presented. On the one hand,
combining all these techniques, a self-repairing architecture that satisfies high reliability
requirement is achieved. On the other hand, depending on the reliability requirement, the
individual design technique can be selected or composed to achieve desired safety perfor-
mance.
This self-repairing architecture is easy to scale and can be applied on different hierarchy
of the circuit. Furthermore, this highly reliable system architecture can be implemented
in both application-specific integrated circuit (ASIC) or Field Programmable Gate Array
(FPGA). However, if this system is implemented in FPGA, it can also benefit from a healing
process based on the FPGA partial reconfiguration technique. A healing controller that
manages this reconfiguration process is also introduced.
In addition to this self-repairing architecture, the deployment of such an architecture
should require minimal extra effort. Therefore, the transformation from a non-safety-
critical design to a reliable version is accomplished automatically with a software program.
This is extremely useful for converting extremely large design that contains of thousands
of modules. This method also eliminates human interaction, therefore increases the pro-
ductivity while avoiding the human error.
Combining all these methods and techniques, a framework for the design of self-testable,
self-correcting, self-repairing and self-healing digital systems is introduced. As presented
in Figure 1.1, on the software side, this framework contains an HDL converter for auto-
matically modifying the original Verilog source code. From a hardware point of view, this
framework utilizes FPGA and an external healing controller for deploying the self-repairing




















repairing hardware system are discussed as follows:
First, the HDL converter reads the original HDL file and modifies it to deploy the state
synchronization technique. Then it creates a higher level module with four redundant user
logic instances, one enhancement module and one interface module.
Second, the synthesis and implementation software generates a main bit-stream file and
several partial reconfiguration bit-stream files from those HDL files. The main bit-stream
is used to program the FPGA and the partial reconfiguration bit files are used as needed for
healing the damaged modules.
Finally, when an error is reported, the healing controller will reprogram the faulty area
with the corresponding partial bit stream.
To summarize, this research provides an universal solution to the design of highly re-
liable digital system. The conversion from a non-safety critical design to self-repairable
and fault tolerant design is achieved automatically. This updated design is autonomous and
can be easily scaled up. Various digital designs of different scales are implemented and
examined under random error injection. The test results demonstrate the effectiveness of
this design.
The methods and techniques presented in this thesis offer six distinct contributions as
presented below:
• Bit Error Rate (BER) Measurement
A bit error rate enhanced testing logic filters out transient errors from soft errors,
leading to a 90% to 95% reduction in the repairing request.
• Enhancement for Fault Tolerance and Fault Isolation
An enhanced voting logic not only corrects the transient error generated by the faulty
module, but also automatically identifies the failing unit for the repairing procedure.
6
• Self-Repairing Architecture
An architecture with the ability of self-testing, fault-tolerance and immediate context
switching when a permanent fault are detected. It is easy to scale and can be applied
on different hierarchy of the circuit.
• State Synchronization
A method to synchronize the sequential state of the newly repaired module with
the nearby health module. Without this technique, the entire system must be re-
initialized in order to reuse the repaired module.
• HDL converter
A Verilog compiler based software that automatically modifies the user logic and up-
grades it into a more reliable version. This method eliminates the human interaction,
therefore increases the productivity while avoiding the human error.
• Self-Repairing Design Framework
A framework for quickly prototyping a self-testable, self-correcting, self-repairing
and self-healing digital systems.
1.3 Thesis Organization
This thesis is organized as follows: In Chapter 2 “Background and History”, self-testing
and self-repair techniques are introduced. In Chapter 3 “Self-repairing Methodologies”,
details about the self-repairing system architecture is presented. In addition, a method of
synchronizing the state register between different modules is introduced. In Chapter 4
“Healing Controller”, details of using dynamic partial reconfiguration to repair failed logic
blocks in FPGA application are revealed.
The remaining research presented in this thesis uses the methods and techniques dis-
cussed in Chapter 3 to automatically upgrade non-safety-critical design into a more reli-
7
able version. In Chapter 5 “Software Toolchain”, mechanism of autonomously applying
the self-repairing architecture to any given digital design is explained. In Chapter 6, “Ex-
periment results” are presented to demonstrate the reliability improvement and the costs
of various example design ranging from SSI (small scale integration) to VLSI (very large
scale integration). In Chapter 7, a brief summary of the thesis work is presented. In addi-





No hardware system is entirely error-free [2]. This is first due to the imperfection of the
system design, and secondly the hardware component failures caused by wear, ageing or
other effects such as noisy power supply, high environment temperature or an alpha parti-
cle [3].
For the past few decades, varies methods and techniques aimed for the digital system
safety requirement have been developed under the assumption that design and component




Approaches to eliminate faults in digital system are related to the cause of those faults.
Therefore, it is relevant to discuss the origin of those faults to fully understand the logic
behind the safety-critical design techniques.
Due to this reason, the cause of faults in digital systems are discussed first in this chap-
ter. To achieve a high reliable system, we must be able to identify the fault. Therefore, after
presenting the source of errors in digital system, a brief review of studies in fault detection
is elaborated. After the fault is detected, we need to repair it to stop the negative effect.
Thus the discussion about the fault repair techniques is presented.
In consideration of the down time caused by the repairing process, systems with high
availability requirement should also be able to produce the correct result despite the exis-
tence of errors. This leads to the concept of the fault-tolerant system [4] and is exhibited in
the last section of this chapter.
9
2.1 Faults and the Cause
Different reference sources categorize faults based on different standards. For example,
faults can be divided into two separate classes based on their locations [3]. The first of
these is random faults. As the name suggests, these type of faults can occur anywhere
(and also anytime) in the circuit. They are primarily related to the individual hardware
component failures. The second type is termed systematic faults. These faults usually stay
in one area of the system since they are mainly due to the imperfect design.
Another commonly viewed categorization is based on the fault duration [5]. They are
permanent faults and transient faults. Sometimes intermittent faults are also included.
2.1.1 Permanent Faults
Permanent faults are typically caused by an irreversible physical change. In fact, the oc-
currence of permanent fault is directly associated with the semiconductor design and the
manufacturing techniques [3]. For convenience, we use permanent fault and hard fault
interchangeably in later discussion.
2.1.2 Transient Faults
Transient faults are faults caused by temporary anomaly on the hardware device. There are
several non standard conditions induce transient faults [5]:
• power supply noise
• neutron and alpha particles
Fault Source 1: Power Supply Noise
In large circuits, power is distributed in the circuit through wires, which contain parasitic
devices. The power supply noise is mainly caused by resistive and inductive parasitic
10
elements in the power supply lines. The relationship between the power supply noise and
the parasitic elements is illustrated in the following equation [6]:
∆V = IR + L
dI
dt
where R and L are the wire resistance and inductance respectively. When the logic gates
are switching, the current which flows through the power supply lines will cause the power
supply voltage to drop. This voltage drop increases the gate sensitivity to noise spike,
which consequently increases the circuit error rate [7].
Fault Source 2: Radiation
The cause of radiation-induced transient faults, e.g. when the neutron or alpha particles
strike the semiconductor chip, is related to the metal-oxide-semiconductor field-effect tran-
sistor (MOSFET) working principle [8]. High energy electrons such as alpha particles can
ionize the atoms and generate enormous electron-hole pairs [8]. As illustrated in Figure 2.1
and Figure 2.2, (a) High energy particles penetrate into the semiconductor substrate while
producing enormous electron-hole pairs along the path. (b) Those generated electrons close
to the depletion region are attracted by the depletion region electric field; similarly the gen-
erated holes are driven away from the depletion region. This behavior further increases
the size of the depletion region and force more electrons and holes to drift in the opposite
direction, which creates a large spike of current. (c) Similar to the mechanism of a PN
junction, the difference in electrons and holes concentration leads to a diffusion into the
opposite side, which counteracts the previous movement, re-balances the electrons/holes
concentration and returns the depletion region to its original size. If the current pulse or the
charges accumulated during this event reaches certain threshold, it will generate an error.
However the circuit itself is not damaged [9].
11
Figure 2.1: Charge generation and collection phases in a reverse-biased junction [8]
Figure 2.2: Current pulse caused by the high-energy ion [8]
12
Similarly, collisions induced by high energy particles can also occur in the oxide layer.
As a result, charges build up in the oxide can also cause device failure. The current leakage
during the switch ”Off” state is induced by the positive charge trapping in the gate oxide,
which increases the static power consumption and may also cause transient failure. Con-
versely, positive charge build-up in the field oxide can also reduce the current when the
switch is at ”ON” state. In fact, for modern integrated circuits with ultra-thin gate oxides,
the majority of the radiation-induced degradation is caused by charges build-up in field
oxides [10].
Neutron impacts are similar except that they do not generate the electron-hole pairs
directly. Instead, collisions cause by neutrons striking the silicon crystal produce alpha and
other high-energy particles. These particles can generate sufficient electron-hole pairs to
produce a transistor error.
Even though alpha particles and neutrons are both classified as radiation, they derive
from different sources. Alpha particles arise from chip packaging material while the dom-
inant source of neutrons comes from the cosmic rays [11].
2.1.3 Intermittent Faults
Intermittent faults are similar to transient faults, therefore it is difficult to distinguish one
from the other. Constantinescu and Cristian provide three main principles to judge the
class of fault that caused an error [5]. First, an intermittent fault tends to be more static than
transient errors, which means it occurs repeatedly at the same location. Second, intermittent
errors are more likely to occur in bursts when the fault is activated. Third, replacing the
faulty logic removes the intermittent fault, while transient errors can not be eliminated in
this way. In Chapter 4 Self-repairing Methodologies of this thesis, we provide a method to
distinguish intermittent faults from transient errors. For convenience, we use intermittent
and soft error interchangeably in later discussion.
13
2.2 Fault Detection Methodologies
Fault detection methodologies can be categorized into two approaches: on-line and off-
line [12]. On-line testing is executed during system run time while the off-line approach
is executed when the system is not operating. Since off-line test results do not guarantee
that on-line operation will be error-free, the discussion here will be restricted to on-line
testing techniques. On-line testing schemes usually depend on redundancy techniques.
For hardware error detection, there are three forms of redundancy techniques: hardware,
information and time [2][12].
2.2.1 Hardware redundancy
Hardware redundancy schemes utilize extra hardware to generate a reference module and
compare the reference results with the module under test. The easiest way to implement a
reference hardware module is to duplicate it. This method is also known as Dual Module
Redundancy (DMR). As illustrated in Figure 2.3, by executing the same task on two iden-
tical modules and comparing their outputs, as long as both modules do not make the same
mistake at the same time, a DMR system can detect any error whenever a disagreement in
the outputs occur.
Figure 2.3: Dual module redundancy diagram.
14
Besides the classic dual modular redundancy (DMR) test methodology, Burress et al
propose a look-up table (LUT) based logic cell that operates on the premise of the two-
rail checker. Firstly, according to the sum of products, the function will be realized by a
series of product and sum logic blocks. Each block has two outputs and no more than four
inputs. The block is designed in such a way that it can generate complementary outputs
when the cell works well and identical outputs in the event of faults [13]. A script in
[14] is used for simplifying and decomposing the Boolean expression. This methodology
requires less area overhead (78%) compared with the classical DMR technique. However,
any modification in the redundant module may vary the data path delay, and thus violate
the timing requirement of the original module. Therefore, this approach requires extra
development time and effort to solve the potential synchronization issue.
2.2.2 Information redundancy
Information redundancy techniques utilize extra check bits to test the original data bits
[15][16]. This is also known as the error detection coding (EDC) which includes parity
code, m-of-n code and Borden’s code [15]. Instead of checking every single bit from
the output, only the encoded check bits are compared to detect the error. Although these
methods are mostly used in data communication, studies have applied these approaches in
modular function testing. For example parity check has been used in adder and arithmetic
and logic unit (ALU) error detection by implementing a parity predictor [16].
A parity predictor is an optimized implementation of the original module plus the parity
bit encoder [17]. Since the number of parity bits is generally smaller than data bits, parity
prediction usually requires less area overhead compared with the DMR system [16].
The optimization of the implementation of parity prediction in ASIC has been studied
for years. Sobeeh et al present a parity-tree-selection method based on the relation between
the entropy of a function and the area and power-consumption overhead incurred by its
implementation [18]. The idea is to use entropy theory to find the optimal number of
15
parity bits. The author compares the entropy-driven methodology and the minimum parity
bits strategy and proves that the entropy driven parity-tree-selection method costs less area
overhead. On average this technique costs 60% overhead. However, it should be noticed
that in some cases, the parity check system cost even more area than DMR.
Mitra et al summarize three different concurrent error detection (CED) schemes [19]:
duplex system, parity prediction and unidirectional error detecting code [15]. The conclu-
sion is that the parity prediction technique has the lowest area overhead while the diverse
duplex technique is the most reliable approach. Although error detection coding has an
advantage in hardware overhead, some techniques are stricted to specific types of errors
[15]. Moreover, similar to the diverse DMR approach, synchronization with the unit under
test is a potential issue.
2.2.3 FPGA test methods
VLSI testability has been summarized to controllability and observability, yet FPGAs have
unique properties, and thus meet a new challenge in testing [20]. Due to the fact that area
consumption in FPGAs is related to the number of inputs of a function rather than the
complexity of the function itself [21], Seok et al propose a new method for implementing
the parity prediction function in FPGAs [22]. The idea is to decompose the original par-
ity function into several small-input-number functions in order to fit in with the standard
lookup table (LUT). However, all parity prediction methods have the same synchronization
issue as discussed previously.
Mehdi et al present a technique for FPGA testing by reconfiguring the interconnection
routing logic and the configurable logic block (CLB) [23]. When testing the routing logic,
the configuration of the interconnection routing logic stays unchanged. But the CLBs will
be reconfigured to implement a single-term function in order to test the interconnections
and vice versa.
16
Miron et al present an FPGA test strategy called Roving STARs [24]. In order to test
the chip without interrupting the functionality, the chip is divided into many blocks. Only a
small part of the chip will be tested each time while the other parts keep functioning. Once
the test is completed, the test target moves to the next area and repeat the previous process.
The circuit blocks in the test zone are switched between three roles: test pattern generator
(TPG), output response analyzer (ORA) and Block under test (BUT). To be more precise,
two identically block are tested by a test pattern generator (TPG) and those outputs are then
sent to an output response analyzer (ORA). The system was designed in such a way that
each block will be tested twice and each time compared to a different block. Eventually,
all the chip will be tested.
Nathan has a similar technique for bus-based FPGA [13]. The drawback of the above
methods is that due to the configuration time cost, the speed of switching testing area is
very slow. Therefore, the developer has to make a trade-off between the operating clock
frequency and the system availability.
2.3 Fault Repair Methodologies
Although the above strategies can detect errors, they cannot repair errors. Thanks to the
flexibility of FPGA, single event upset (SEU) in configuration memory can be repaired by
reprogramming the configuration bits [25]. However, permanent physical damage in FPGA
cannot be repaired, but the faulty module can be replaced with the redundant module.
For example, Herrera-Alzu et al present a background configuration memory refresh
technique which allows the FPGA to check and correct the configuration error without in-
terrupting the application operation [26]. Stoddard et al introduce a hybrid configuration
scrubbing approach with an external controller [27]. M Berg et al compared two configura-
tion scrubbing techniques and present the experimental results under a radiation test [28].
Dumitriu et al propose a novel architecture for transient and hardware faults online
recovery [29]. Transient errors or SEUs can be repaired by reconfiguring the faulty area
17
of the device. Logic circuitry containing hardware faults will be reallocated to spare area
through dynamic partial reconfiguration. The hardware rerouting process is replaced by
changing the broadcast mode in the large central bus interface in order to isolate the faulty
module and connect the relocated module to the system. Although the area overhead seems
little at first glance, the cost of the central bus is actually significant, nearly 50% of the
total area. Moreover, due to the high area cost, such a safety-critical central bus is more
vulnerable to defects compared with the subsystem module.
Kim et al. propose a novel self-repairing architecture inspired by paralogous genes [30].
This design has a built-in self-test module. Once a fault is detected in the working cell, the
input will be redirected to a redundant cell with the same functionality. The faulty cell will
be self-tested. For transit errors, the faulty cell will become a redundant cell after self-test.
In the case of permanent error, the faulty cell will be programmed to be a “death cell” and
a “stem cell” will be programmed as a new redundant cell through partial reconfiguration.
However, these self-repair methods all face two main problems. First, the speed of
the configuration memory scanning is extremely slow compared with the clock frequency.
Meanwhile, errors could propagate to other units long before being detected. Second, the
repairing process is not efficient when the transient error rate is significantly higher than
other permanent errors. Since the transient errors last briefly, by the time the repairing
process is finished, they are already disappeared.
2.4 Fault-Tolerant Techniques
Similar to the error detection techniques, fault-tolerant designs also depend on redundancy
techniques. In general, hardware redundancy schemes cost more area overhead while infor-
mation and time redundancy schemes have a larger performance penalty [2]. Since in most
cases, logic functions that implemented in FPGA do not use the entire logic resource [31],
small designs could take advantages of the unutilized area and implement the redundant
modules with those resources. In addition, the cost of the hardware resources is expected
18
to keep decreasing [32], thus the disadvantage of the hardware redundancy becomes less
important in the long-term. Therefore, the discussion here focuses on hardware redundancy
techniques over information and time redundancy for fault-tolerant design methodology.
Triple Modular Redundancy (TMR) is one of the most widely used fault-tolerant strate-
gies [33]. As presented in Figure 2.4, by tripling the original design and voting the result
with a majority gate, as long as two out of three modules work correctly, errors generated
by the faulty module will be masked by the majority vote. Xilinx provided a method for
the TMR design in their Virtex FPGAs [34]. These approaches are effective against tran-
sient errors however they ignore the consequence when one module continuously produces
errors [35].
Figure 2.4: Traditional TMR diagram.
Mathur et al proposed a TMR system with self-repair techniques integrated [36]. This
system replaces the faulty module with a spare module. However, without the distinction
between a transient error and permanent error, spare modules will soon be exhausted.
Zhang et al developed an algorithm to generate a minimal set of configurations for
19
one functional module, each configuration in that set has at least one spare CLB [37].
This minimal set covers all possible locations for the single CLB-faults and thus achieves
single CLB-faults tolerant. Moreover, alternating the spare CLB with the heavy-duty CLB
balances the system stress thus increases the overall system lifetime.
Baig et al introduce a hierarchical fault-tolerant architecture [38]. At the top level,
the system contains multiple computation tiles, each tile contains multiple computation
blocks. Inside the block, there are various computational cells and preserved spare logic
resource called stem cell. Each computational cell is designed to be fault-tolerant with pre-
computed error detection code and spare look-up tables. Those spare resource, either stem
cell or spare look-up table is designed for function relocation at a different level in case of
permanent error occurs. However, these methods cannot restore the sequential logic state
in the faulty logic, the data stored in the flip-flop will be lost when switching to a spare
logic.
DeMara et al. proposed a TMR based fault-tolerant architecture for image processing
application [39]. In order to minimize the power consumption, the third module will not
be active until the discrepancy is detected by DMR. The faulty module is repaired through
intrinsic Genetic Algorithm.
Oreifej et al. also provide an example of using GA for reconfiguration [40]. An en-
hanced version of GA called Combinatorial Group Testing (CGT) is proposed. Refurbish-
ing partially-functional configurations are demonstrated to be more efficient than designing
the configurations when using genetic algorithms. However, even with the advanced algo-
rithm, the the process of generating a valid configuration bitstream from the individual




As introduced in chapter 2, various self-testing and self-repair strategies have been uti-
lized in the past. However, they all have certain limitations. Our proposed methodology
recognizes that the optimal level of fault-tolerance and repairability will be application-
specific. Therefore we propose the following strategies that can be deployed, depending
upon system fault-tolerance requirements and the Mean Time Between Failure (MTBF)
of the components (logic blocks). Enhanced-DMR can distinguish different kinds of errors
but not correct them. Enhanced-TMR can correct transient errors but not repair faulty mod-
ules with permanent faults. To overcome these limitations, we develop the enhanced-QMR
technique which can autonomously replace the soft/hard-failing unit in order to maintain
TMR operation. These strategies not only can be applied independently to the existing error
detection and error correction technologies but also can be integrated together to achieve
very high reliability. Combined with the autonomous partial reconfiguration technique
[41], FPGA applications can extend the system lifetime by centuries [42]. The enhanced-
QMR approach is particularly applicable to extremely large-scale systems where errors are
expected to be frequent.
3.1 Enhanced-DMR (with BER-measurement and error-reporting)
As presented in the previous chapter, transient errors by their nature appear randomly and
sporadically [2]. Moreover, transient errors in configuration memory do not necessarily
lead to a soft functional error. The erroneous bit has to be one that is critical to the function
in order for a soft functional error to be observed. The number of unused bits and non-
critical errors reduces the typical soft error rate to 5% to 10% [43] (only one in 10 to 20
upsets, on average, cause a functional soft error). Therefore the traditional DMR method,
21
which flags every error, is not efficient. A better solution would be to detect functional
errors but ignore or correct non-recurring transient errors so that higher-level repair mecha-
nisms can be invoked at the system level only when there is a high likelihood of a structural
fault.
Beyond sporadic transient errors, soft or hard errors occur when a fault causes a change
in the logic behavior. In FPGAs, most soft errors result from faults in the configuration
memory, which can be corrected by reprogramming [25].
In order to distinguish the transient errors and the soft errors, a bit error rate measure-
ment logic is introduced. In Figure 3.1, each error is not only recognized, but is counted,
and compared to a threshold BER to distinguish between common transient errors (isolated,
single-bit errors) and soft/hard errors (resulting from changes in logic structure).
Figure 3.1: BER enhanced-DMR diagram.
In an FPGA, a change in the configuration memory will often cause a permanent change
in the logic behavior, resulting in a significant increase in error rate. A local 2-bit counter
is used to count the errors between global resets and stopped counting at a count of 2 (or
optionally 3). The reset signal is produced by a shared global counter and resets the local
2-bit counters periodically. This periodic global reset signal is designed to eliminate the
potential false alarm caused by the accumulation of transient errors during a long period
operation.The number of bits in the global counter is chosen based upon the expected ac-
22
ceptable BER and an unacceptable rate that is twice as high. As illustrated in Figure 3.2,
the 2-bit counter is designed to output only when the higher value is reached, signaling
the system using the Master Error Flag signal. Therefore, random transient errors can be
distinguished from soft errors which cause faults at much higher rates. As can be seen in
the later section, this BER enhancement can also be applied to higher level methods.
In conclusion, the enhanced-DMR features and benefits are presented as follows:
• Provides basic self-test capability, repair requires system-level operations
• Most soft faults can be corrected by re-programming the failed functional cell and
tester
• Distinguishes transients from soft defects
• Low overhead cost (compared to TMR, QMR)
On the contrary, the enhanced-DMR limitations are summarized as follows:
• Assumes very low probability of errors
• Does NOT correct transient errors
• Normally simple-DMR does not provide fault-isolation or repairability
• Normally simple-DMR does not distinguish between transient, soft, or hard errors
• Assumes that the system can tolerate the time needed for repair of soft-defects (by
re-programming the failed cell)
• If time-multiplexing is not used, then the entire cell must be reprogrammed and it
will be temporarily non-functional during the reprogramming step
• Suitable only for non-critical applications




























Although the enhanced-DMR approach can identify critical intermittent errors, it cannot
correct these or transient errors. Even though the bit error rate (BER) for transient errors
is low, for example, the BER for Xilinx 7 series FPGA 7k325T is 5.69 × 10−15 [30], it is
still unacceptable for safety critical applications. Moreover, certain circumstances such as
radiation will tremendously increase the BER [44] and cause SEU in configuration memory
[45][46]. Therefore, for applications with higher level reliability requirements, an error
recovery or fault tolerant mechanism is necessary. The traditional TMR method is capable
of masking one error as long as the other two outputs are error free. However, studies have
proven that once a module in the TMR system is failed, the reliability of such a system is
inferior to that of a single module system [35]. Therefore, in order to maintain the benefit of
a highly reliable TMR based fault tolerant system, a fault identification and self-repairing
mechanism is required.
In our enhanced-TMR approach, fault-isolation for repair of intermittent defects is
added to the traditional TMR. Like our enhancement to DMR, we add the ability to set
a BER threshold, above which the unit is suspected of having a soft or even a hard fault.
As shown in Figure 3.3, we use three copies of the logic block (like traditional TMR) and
enhanced voter logic to not only carry out the voting, but also to count the BER and identify
which of the three blocks has failed. A 2-bit error code is generated that identifies the most
recent failing block, as presented in Table. 3.1. For example, on the second row, when F0
and F1 produce a logic “1” while the F2 module generates a logic “0”, the enhanced voter
identifies the disagreement between the three modules and activates the error flag.
25
Figure 3.3: Enhanced-TMR with Voter and BER logic.
Meanwhile, it also generates the error code “10”, indicating the error is produced by F2.
If no error is caught, as shown in the first row and last row, then the error code is not valid.
The “xx” in the table represents “don’t care value”. This error code is used by the higher-
level system maintenance hardware to determine which block to attempt to repair (by repro-
gramming). The higher-level system maintenance hardware could be implemented using
Table 3.1: Truth Table of enhance voter logic with TMR
F0 F1 F2 out err code error flag
0 0 0 0 xx 0
0 0 1 0 10 1
0 1 0 0 01 1
0 1 1 1 00 1
1 0 0 0 00 1
1 0 1 1 01 1
1 1 0 1 10 1
1 1 1 1 xx 0
the autonomous partial reconfiguration technique [41]. During the repair attempt, we al-
low the two remaining non-failing units to continue to function (in DMR mode). While in
26
DMR mode, the frequent transients generated by the failed/re-programming block are ig-
nored by the voter logic, so the system is temporarily reduced to DMR mode while filtering
out ignoring the transients from the failed unit.
In summary, the enhanced-TMR features and benefits are presented as follows:
• Automatically corrects the vast majority of transient faults (like normal TMR)
• Uses BER to distinguish transients from soft defects (enhancement)
• Automatically identifies the failing unit and reports it for eventual repair (enhance-
ment)
On the other hand, the enhanced-TMR limitations are shown as follows:
• Temporarily drops into DMR mode while a soft defect is repaired/healed
• Repair/healing requires assistance from higher levels of the system
• Overhead is 200% (plus the cost of BER and error-reporting logic)
3.3 Enhanced-QMR (TMR + local spare)
The enhanced-TMR approach described above exhibits degraded fault-tolerance and BER
(effectively DMR) between the onset of failure and completion of the reprogramming step.
This degradation may not be tolerable by some very critical systems. Worse, if a hard error
is encountered (not repairable by reprogramming the failing logic block), then the system
has no convenient way to recover without globally reconfiguring (rerouting) a major portion
of the system. Even if this is possible, it will result in a long down-time.
In our enhanced-QMR approach, we add a fourth logic block to the enhanced-TMR
approach and modify the voter/BER/Error-code logic to account for the fourth block. This
scheme is shown in Figure 3.4. Initially, the voter logic produces a 2-bit internal ignore
code that effectively masks one of the four logic blocks outputs (the backup spare). It then
27
treats the remaining three logic blocks like a TMR configuration. As in our enhanced-TMR
scheme (above), transient errors are counted and compared to a threshold BER.
Figure 3.4: Enhanced-QMR with Voter and BER logic.
If the BER remains below a predetermined threshold, then the spare unit is ignored and
the system proceeds to correct transients in TMR fashion. If the BER exceeds the threshold,
then the last failing bit identifies the suspected failing unit. That 2-bit code is latched and
internally fed-back to signal the voter to ignore the suspected failed unit while activating
the spare (fourth) unit, thereby seamlessly replacing the failed unit and bringing it offline
for reprogramming. Table. 3.2, Table. 3.3 and Table. ?? present the enahnced-QMR voter
logic truth table.
This ability of maintaining TMR with no down time guarantees that no transient error
will pass to the output. Moreover, this repair mechanism does not require external con-
trol and consequently provides good scalability. However, external control may be used
to attempt healing of the failed block by reprogramming the configuration memory. For
example, recent research has demonstrated the FPGA self-healing process using the au-
tonomous partial reconfiguration technique [41]. It is capable of detecting and correcting
upsets in configuration memory by scanning and correcting the configuration frame. After
28
reprogramming, the system is reset back to the original configuration and the reconfigured
unit is reactivated in enhanced-TMR fashion (the fourth unit is ignored).
Table 3.2: Fault Management Unit state table 1.
err code (input) F0 F1 F2 F3 out err code (output) error flag
11 0 0 0 0 0 xx 0
11 0 0 0 1 0 xx 0
11 0 0 1 0 0 10 1
11 0 0 1 1 0 10 1
11 0 1 0 0 0 01 1
11 0 1 0 1 0 01 1
11 0 1 1 0 1 00 1
11 0 1 1 1 1 00 1
11 1 0 0 0 0 00 1
11 1 0 0 1 0 00 1
11 1 0 1 0 1 01 1
11 1 0 1 1 1 01 1
11 1 1 0 0 1 10 1
11 1 1 0 1 1 10 1
11 1 1 1 0 1 xx 0
11 1 1 1 1 1 xx 0
00 0 0 0 0 0 xx 0
00 0 0 0 1 0 11 1
00 0 0 1 0 0 10 1
00 0 0 1 1 1 01 1
00 0 1 0 0 0 01 1
00 0 1 0 1 1 10 1
00 0 1 1 0 1 11 1
00 0 1 1 1 1 xx 0
00 1 0 0 0 0 xx 0
00 1 0 0 1 0 11 1
00 1 0 1 0 0 10 1
00 1 0 1 1 1 01 1
00 1 1 0 0 0 01 1
29
Table 3.3: Fault Management Unit state table 2 (continued).
err code (input) F0 F1 F2 F3 out err code (output) error flag
00 1 1 0 1 1 10 1
00 1 1 1 0 1 11 1
00 1 1 1 1 1 xx 0
01 0 0 0 0 0 xx 0
01 0 0 0 1 0 11 1
01 0 0 1 0 0 10 1
01 0 0 1 1 1 00 1
01 0 1 0 0 0 xx 0
01 0 1 0 1 0 11 1
01 0 1 1 0 0 10 1
01 0 1 1 1 1 00 1
01 1 0 0 0 0 00 1
01 1 0 0 1 1 10 1
01 1 0 1 0 1 11 1
01 1 0 1 1 1 xx 0
01 1 1 0 0 0 00 1
01 1 1 0 1 1 10 1
01 1 1 1 0 1 11 1
01 1 1 1 1 1 xx 0
10 0 0 0 0 0 xx 0
10 0 0 0 1 0 11 1
10 0 0 1 0 0 xx 0
10 0 0 1 1 0 11 1
10 0 1 0 0 0 01 1
10 0 1 0 1 1 00 1
10 0 1 1 0 0 01 1
10 0 1 1 1 1 00 1
10 1 0 0 0 0 00 1
10 1 0 0 1 1 01 1
10 1 0 1 0 0 00 1
10 1 0 1 1 1 01 1
10 1 1 0 0 1 11 1
10 1 1 0 1 1 xx 0
10 1 1 1 0 1 11 1
10 1 1 1 1 1 xx 0
30
If repeated attempts at reconfiguring fail, then the fault is determined to be a hard fault.
Then the fourth unit is brought into service until the system can find a higher-level correc-
tion. Even if no further repair is possible, this approach will autonomously repair up to one
failing logic block per cell. Therefore, it can tolerate many (up to the number of cells in the
system) simultaneous soft/hard faults (assuming no more than one failing block per cell).
To summarize, the enhanced-QMR benefits are presented as follows:
• Automatically corrects the vast majority of transient faults (like TMR)
• Uses BER measurement to distinguish transients from soft or hard defects
• Automatically identifies the soft/hard-failing unit and reports it for eventual repair
• Autonomously replaces the soft/hard-failing unit in order to maintain TMR operation
• Continues to run in TMR mode while a soft defect is repaired/healed
• Automatically tolerates up to one hard or soft defect per cell
Conversely, the enhanced-QMR limitations is discussed as follows:
• Repair/healing of multiple hard faults requires assistance from higher levels of the
system (not fully automated)
• Overhead is 300% (plus the cost of error-reporting logic)
At the lowest level, redundant logic blocks within a cell are programmed for self-test and
for isolating faults to a single failing logic block. The redundant blocks autonomously cor-
rect transient errors in real-time, without system interruption (online self- test/repair). The
approach can tolerate a large number of independent, simultaneous such transient errors. In
addition, soft errors are autonomously detected and repaired by identifying the failing logic
block and reprogramming just it using partial reconfiguration techniques. This dynamic
partial reconfiguration also can be accomplished online, without functional interruption of
31
the system. For the very infrequent case of a detected hard error (i.e. one that resists repair
by reprogramming), we resort to the isolated reconfiguration of the failing cell, using a
preprogrammed spare logic block judiciously located adjacent to each functional cell in the
layout.
Last but not least, it is worth mentioning that the above methodologies can also be
applied to ASIC designs. However, the healing feature which requires reprogramming/re-
configuring is restricted to FPGA applications. These strategies can be deployed at the
block level, module level, and system level. In most cases, as we will see in the experimen-
tal results, the enhancements overhead is negligible compared with that of the redundant
modules, and the performance penalty is also minimal. This low level design strategy
combined with high level autonomous partial reconfiguration tremendously increases the
system reliability.
3.4 State Synchronization techniques
Traditional redundancy based methods cannot restore the sequential logic state in the faulty
logic, the data stored in the flip-flops and registers will be lost when switching to a spare
logic block. Similarly when the repaired module is switched back online, the internal
sequential states are not initialized. Therefore, for a modular redundancy technique based
self-repairing architecture, a state synchronization mechanism between the repaired module
and it’s neighbors is required.
One way to ensure the correctness of a sequential state is to use majority voters at
the input to the register, as illustrated in Figure 3.5. Data stored in the sequential state
register is not directly assigned by the combinational logic block of that module. Instead,
combinational logic blocks of all three redundant modules must first vote to generate the
output that represents the majority of the redundant logic blocks. This voted result can then
be assigned to each sequential state register. Since the register state is the majority of three
logic blocks, data stored in the register is correct as long as no more than one module is
32
Figure 3.5: Traditional method for TMR state synchronization.
failing. However, the voter only works for an odd number of modules. An even number
of modules could potential create a scenario where exactly one half of the modules are
different from the other half, thus no result can represent the majority.
Alternatively, we propose a method to synchronize the registers by selectively rerouting
every state register inputs. As presented in Figure 3.6, a multiplexer with one-bit control
signal is added to switch the inputs of the state register. This one-bit control signal, labeled
“sync” in the diagram, is generated by the fault management unit, which is discussed in
the next section. One input of that multiplexer, named “default input”, is the output of
this module’s combinational logic. This input will be selected during the normal operating
mode (sync signal is zero). The other multiplexer input is the output of the nearby redun-
dant module’s combinational logic. This input is selected when the state synchronization
is started (sync signal is one). This approach works for any number of redundant modules.
In addition, it also costs less area overhead compared with traditional state synchronization
technique since each module requires only one extra input, one extra output and one multi-
33
Figure 3.6: State synchronization technique for self-repairing system. The interconnection
between the combinational logic and the multiplexer for the rest modules is similar to the
highlighted first module and is not shown in this diagram.
plexer with one-bit control signal. All the logic states of the recently activated module will
be synchronized to the nearby module in one cycle.
Figure 3.7 presents the structure of the fault management unit. As shown in the figure,
the fault management unit is primarily consisted of an enhanced QMR voter with the error
code register, a BER counter, a decoder and some flip-flops (4-bit register). The 2-bit
latched error code is connected to a decoder and the latched outputs are the four “sync”
control signals, labeled sync0 to sync3. Each of them controls the corresponding state
synchronization multiplexer in the redundant module. By default, the outputs of the 4-bit
register are all zeros, indicating that all multiplexer select the default input. However, if a
34
soft or hard error is detected, the error code will activate the corresponding output. In this
example, F1 is the faulty module, thus the second output of the decoder is marked as 1 on
the output. As a result, the Sync1 is high while the remainder are still zero. Consequently,
the multiplexer in module F1 will select the neighbor’s input at the rising edge of the unlock
signal. The unlock signal is generated by the error reporting interface. It is used to indicate
that the healing procedure is finished. Details about the unlock signal, error reporting
interface and the healing controller are discussed in the later chapter.
Figure 3.7: Fault Management Unit diagram.
The state synchronization technique can also be used as a healing method for failures
35
caused by state-bit corruption. In that case, partial reconfiguration is unnecessary since
the reconfiguration is primarily focus on healing the permanent defects in combinational
logic (configuring memory in FPGAs). First applying a state synchronization technique
and then checking if the module is healed is more efficient than directly applying partial
reconfiguration. The healing process reuses the same self-testing hardware therefore no




The healing controller that manages the repairing procedure is introduced in this chapter.
The FPGA dynamic partial reconfiguration technology [47] is used to heal the faulty mod-
ule. In order to understand the dynamic partial reconfiguration technique, we need to first
understand how the FPGA works. In the first section, an introduction about FPGA work-
ing principles is presented. The next section discusses the details of the dynamic partial
reconfiguration technique. The last section presents the design and implementation of the
healing control system.
4.1 FPGA working principle
FPGA architecture
The typical architecture of an FPGA [48] is illustrated in Figure 4.1. An FPGA essen-
tially is an array of logic gates that can be arbitrarily interconnected together to make a
customized circuit. The term “gate” in Field Programmable Gate Array is not strictly accu-
rate. The basic cells of FPGA are not the basic NAND or NOR gates but more sophisticated
digital sub-circuits called configurable logic block (CLB). As presented in Figure 4.2, the
CLB consists of several lookup tables (LUTs), flip-flops, and routing controls such as mul-
tiplexers. The lookup tables are programmed to implement the user combinational logic
functions. The flip-flops complete the functionality of CLBs and are used for sequential
state storage. The multiplexers are applied to customize the CLB internal routing.
37
Figure 4.1: FPGA system architecture [48].
Figure 4.2: Configurable Logic Block architecture [49].
38
Each CLB only has a limited number of resources. Therefore, in order to implement
a large digital design, many CLBs are interconnected. For this reason, a matrix of inter-
connect switches is used in FPGA. As presented in Figure 4.1, the switch matrix contains
the transistors to turn on/off connections between different routing lines. Thus not only the
CLBs but also the interconnections can be “programmed”.
In order to connect the integrated circuit to external circuits, Input/Output pads or IO
banks are used. These consist of pull-up/pull-down resistors, buffers, and inverters, facil-
itating the communication between the FPGA internal logic function and the peripheral
components on the board.
Besides the basic elements discussed above, modern FPGAs also have built-in hard
blocks such as Memory controllers, digital signal processing (DSP) blocks, high-speed
communication transceivers, PCIe Endpoints, etc. All these hardware components estab-
lish a hardware logic layer that provides the necessary elements to form a circuit. The
reprogrammability is supported by the configuration layer which stores the FPGA configu-
ration information through a binary file called bitstream or bit file, as shown in Figure 4.3.
This binary file contains all the information that determines the implemented circuit,
such as the values or the truth table stored in the lookup tables, the initial values for the
flip-flops and memories, the routing information for the switch matrix, and the voltage
standards for the IO pins. Therefore, the function implemented by the hardware logic layer
is programmed by the values stored in the configuration memory. Most modern FPGA
configuration memory is based on Static Random Access Memory (SRAM) and are hence
volatile. This facilitates the implementation of FPGA design by simply downloading a bit
file. Similarly, FPGA design modification only requires changing the contents of the con-
figuration memory or rewriting a new bit file. This operation is called FPGA configuration
and can be performed through external FPGA interfaces such as Joint Test Action Group
(JTAG), SelectMap [50] or internal interface such as Internal Configuration Access Port
(ICAP) [51].
39
Figure 4.3: FPGA architecture with configuration memory layer and hardware logic
layer [48].
FPGA programming
The process of designing and implementing an integrated circuit design on FPGA is called
FPGA programming. The complete process includes building the design using HDL (Hard-
ware Description Language) code such as Verilog or VHDL, converting the HDL code to
a basic FPGA components formed circuit and generating this output file in binary format
that FPGAs can understand, and programming the output file to the physical FPGA device
using programming tools.
4.2 Partial Reconfiguration
As presented in the previous section, unlike a fixed application specific integrated circuit
(ASIC), FPGA technology provides the flexibility of reprogramming without going through
re-fabrication process. Partial Reconfiguration (PR) takes this flexibility one step further.
As presented in Figure 4.4, this architecture has multiple configuration layer. Since each
40
configuration layer is independent, modification of one or more configuration layers would
not affect the content stored in other layers. Therefore, it allows a predefined portion of an
FPGA, also known as the reconfigurable region, to be reprogrammed while the remainder
of the device continues to operate. The file to configure the FPGA is usually in the format
of a binary file. [47]. We use “configuration file” or “BIT file” interchangeably in the fol-
lowing discussion. Figure.4.5 provides a high-level overview of the mechanism behind
Figure 4.4: FPGA architecture with multi-configuration layer [48].
Figure 4.5: How Partial Reconfiguration works [47].
41
Partial Reconfiguration. In the original design, the FPGA blocks are divided into two dif-
ferent classes. The gray area of the FPGA block represents static logic and the black area
represents reconfigurable logic. The granularity of the reconfiguration region is related to
the configuration memory frame. Details about the reconfiguration region and the configu-
ration memory frame is discussed in the later section of this chapter. In this example, block
A is marked as a reconfigurable region. The function implemented in this region can be
overwritten by downloading one of several partial BIT files, e.g. A1.bit, A2.bit, A3.bit,
or A4.bit while The static logic remains functioning and is unaffected by the loading of a
partial BIT file.
PR Architecture
Xilinx has been supported PR for many years. Early age products, such as the XC6200
series [52] FPGA, contain only a single configurable memory layer. Later products, such
as Virte-II, Virtex-4, Zynq and Ultrascale, achieved partial reconfiguration based on multi
configuration memory layers [53][54][47].
For Virtex-II [53] series, Xilinx organizes the FPGA resources in a columnar fashion.
These primitives include configurable logic blocks (CLBs), Block RAMs, and multipliers.
The configuration memory is thus also organised in column, in fact each frame is 1-bit wide
and whole FPGA device height and configures a narrow vertical slice of many physical
resources [55]. Xilinx groups them into several different configuration columns, such as
Input/Output Block (IOB), CLB and BlockRAM, depending upon their role. Combine
several frames from one or more classes construct a partially reconfigurable region. Since
the height of each frame is the full height of the FPGA, each reconfiguration region is also
restricted to the full height of the FPGA, as shown in Figure 4.6. Therefore it is not efficient
in hardware utilization but relatively easy for floor-planning. This architecture simplifies
the routing process, thus the run time circuit relocation is not an issue.
42
Figure 4.6: Partially Reconfigurable regions in Xilinx Virtex XCV50 FPGA.
In order to support context switching between different partial configuration memory
files, every circuit targeted for the same partially reconfigurable region must have the same
interface to the static (non-PR) region. In Virtex-II devices, this is achieved by fixing the
routing between the static and PR regions and managing the connectivity with internal tri-
state buffers (TBUFs). To support runtime circuit relocation, the relative positions of these
TBUFs also need to match among different PRRs. The limited number and fixed position
of TBUFs available on the chip further restricts the size and positions of PRRs.
For Virtex-4 family of FPGAs [54], TBUFs were replaced by bus macros [56], as shown
in Figure 4.7. These bus macros are constructed by lookup tables. Since the number of
lookup tables are large and distributed throughout the FPGA, as opposed to the limited
number and fixed locations of TBUFs, this increases the flexibility of connectivity arrange-
ment. Moreover, The size of frames was also reduced in the Virtex-4. Unlike the Virtex-II,
where frame size is dependent on device size, it is 1 bit wide and 16 CLBs high and con-
tains 41 32-bit words (1312 bits). The reconfigurable region thus also no longer has the
restriction of being full height of the device, but rather must be a height that is a multiple of
16 CLBs. Because of this modified architecture, the floorplanning task becomes a two di-
mensional problem instead of one dimensional problem, which also increase the difficulty
for runtime relocation.
43
Figure 4.7: A bus macro showing the connectivity between the static region and a recon-
figurable region [48].
Start with the Virtex-5 series, Xilinx divided the device into several rows and columns
as shown in Figure 4.8.
Each row represents a clock region and each column, also called block, contains a sin-
gle class of FPGA resource as presented in the previous section. The intersection between
the row and the collum is called tile. Depending on the primitives of the block, the tile is
categorized into CLB tiles, DSP tiles, BRAM tiles etc. They are the basic unit for con-
figuration frame and partially reconfigurable region. One CLB tile contains 20 CLBs, one
DSP tile contains 8 DSP slices, and one BRAM tile contains 4 Block RAMs. Virtex-6 and
Xilinx 7-series FPGAs (Artix, Kintex, and Virtex-7) also have a similar tile architecture,
the main difference is the number of resources each tile contains.
These advanced architecture improves the efficiency of hardware utilization. It also
increases the flexibility of FPGA implementation and enables multiple PRRs with varying
sizes with different kinds of resources. However, all these architectural improvements also
44
Figure 4.8: Xilinx 7-Series architecture [48].
dramatically increase the difficulty of runtime circuit relocation.
PR Flow
The design and execution of partial reconfiguration can be achieved by Xilinx Vivado De-
sign Suite [47]. The final hardware design is composed of two parts, the static region and
one or more reconfigurable regions (PRRs).
The static region is the portion of the design, which does not change its functionality
during system operation. PRRs implement the reconfigurable modules, and can be recon-
figured at runtime. A single reconfigurable region can implement many modules in a time
multiplexed fashion; all reconfigurable modules implemented in the same PRR constitute a
reconfigurable partition. The design flow is illustrated in Figure 4.9, the first step is to syn-
45
thesise the static and reconfigurable modules separately using Xilinx tools or third-party
synthesis tools. The second step is to define the reconfigurable regions and allocate the
corresponding module to them (partitioning). Notice that it is the designer’s responsibility
Figure 4.9: Partial Reconfiguration design flow.
to verify the floorplanning and ensure that there is no violation against the reconfigurable
regions requirement in the FPGA fabric. For example, the height of the regions must be
that of a multiple of 16 CLBs and should be aligned to clock region boundaries as explained
46
in the partial reconfiguration architecture section.
The next step is to implement the static design. This is achieved by first using a re-
configurable partition as a placeholder and implementing the complete system. Then the
reconfigurable regions are removed, leading to a placement and routing file only for the
static region. After the static design implementation is finished, it preservers that area for
all other configurations.
The following step is to implement the reconfigurable design. Since the static region
is locked, adding the reconfigurable partition to the static design and implementing this
whole system would only generates the valid placement and routing file for the partially re-
configurable region. If multiple reconfigurable partition exist, each reconfigurable partition
needs to be added to the static design and implemented. This step would be repeated until
all the reconfigurable partitions are implemented. Designers do not need to worry about
the interface between the static region and partially reconfigurable regions, since Vivado
would automatically implement the interface logic.
Finally, the tool generates a full configuration file as well as partial bitstreams for each
PRR and each configuration. The FPGA is initially configured using one of the full bit-
streams and the PRR can be reprogrammed using a any corresponding partial bitstream at
run time without interrupting the remainder systems operation.
4.3 Healing controller program
The healing controller is a program executed on a personal computer. This program utilizes
the Vivado Design Suite to manage the partial reconfiguration bitstream downloading pro-
cedure. When the healing controller receives the repairing request and the faulty module
identity from the error reporting module of the FPGA, it generates the partial reconfigura-
tion command in order to repair the faulty area.
The error message format is presented in Figure 4.10. The total length for each error
message is two bytes. The first bit represents the error flag, followed by a two-bit faulty
47
module identity. The next five bits represent the output identity of a cell. The last eight
bits indicate the cell ID. When the error flag is active, an intermittent error is detected.
Therefore, the system requests for a partial reconfiguration.
Figure 4.10: Error message format.
For example, assume an FPGA design contains two cells, cell A and cell B. Cell A
has two outputs while cell B has three outputs as illustrated in Figure 4.11. The faulty
module ID, output ID, cell ID, the corresponding redundant module name and the partially
reconfigurable region is listed in Table. 4.1.
Figure 4.11: Example of the interaction between the healing controller (in a PC) and the
FPGA when a soft error is detected.
48
Table 4.1: Relationship between the faulty module ID, redundant module name and the
partially reconfigurable region.
Module Name Output ID Module ID Cell ID PRR
RMA 0 x 0 0 PRR A0
RMA 1 x 1 0 PRR A1
RMA 2 x 2 0 PRR A2
RMA 3 x 3 0 PRR A3
RMB 0 x 0 1 PRR B0
RMB 1 x 1 1 PRR B1
RMB 2 x 2 1 PRR B2
RMB 3 x 3 1 PRR B3
Figure 4.12 presents the pre-defined partial reconfigurable regions. If the RMA2 mod-
ule collapses and output 1 reaches the error threshold, the error message is generated as
1100000100000000 in binary (or C100 in hexadecimal). The first 1 indicates an error flag,
the following 10 and the last eight bits 00000000 identify the faulty module, which is the
“redundant module 2” of cell A. The output ID 00001 presents the error message is cre-
ated by output 1. Even though this output ID does not make a difference in this example,
when multiple fault management units of different outputs catch errors, this output ID is
required to choose which module should be repaired first. In other words, when more mul-
tiple modules of the same cell fail simultaneously, the output ID determines the priority of
reparation. According to Table. 4.1, the partial bitstream PRR A2 will be downloaded to
program the partial reconfiguable region PRR A2 in Figure 4.12.
49
Figure 4.12: Example of the pre-defined partial reconfigurable regions.
Figure 4.13 presents the flowchart of the healing controller program. First, the program
reads the faulty module identification from the serial port (a serial communication interface
through which information transfers in or out one bit at a time in contrast to a parallel port).
Second, depending on the value or the id, the healing controller downloads the correspond-
ing partial reconfiguration bit file, which are generated from the partial reconfiguration
project development process discussed above, to the FPGA. Finally, it transmits a “unlock”
signal through the serial port and waits for the next serial message.
50




In the previous chapters we described how the self-repairing system architecture is capa-
ble of detecting, identifying and correcting errors autonomously, yet the transformation
from the original source file to the self-repairing version was not discussed. This chapter
provides a method to automatically complete this transformation.
5.1 Introduction
The goal of the software stack is to provide an autonomous tool for deploying the self-
repairing architecture on any digital design. Since Verilog HDL is a widely used design
language to express the fabric of a hardware structure for both ASIC and FPGA-based cir-
cuits, the challenge of autonomously deploying the self-repairing architecture is equivalent
to automatically change the original Verilog code to the self-repairing version. For this
reason, we develop a HDL converter, as illustrated in Figure 5.1.
Figure 5.1: HDL converter.
For example, considering the Verilog implementation of a full-adder presented in Fig-
ure 5.2, by replicating the user logic block and inserting the enhanced fault management
unit described in Chapters 3, the equivalent circuit implemented in Verilog language is
presented in Figure 5.3.
52
module fulladd(s, c_o, a_i, b_i, c_i);
output s, c_o;
input a_i, b_i, c_i;
assign s = (a_iˆb_i)ˆc_i; // sum bit
assign c_o = (a_i & b_i) | (b_i & c_i) | (c_i & a_i); //carry
bit
endmodule
Figure 5.2: Original HDL design file example: full adder
module fulladd_conv (








wire [3:0] s_qmr, c_o_qmr;
wire [1:0] err_code0, err_code1;
wire err_flag0, err_flag1;
// Instantiate redundant module
fulladd rm0 (s_qmr[0], c_o_qmr[0], a_i, b_i, c_i);
fulladd rm1 (s_qmr[1], c_o_qmr[1], a_i, b_i, c_i);
fulladd rm2 (s_qmr[2], c_o_qmr[2], a_i, b_i, c_i);
fulladd rm2 (s_qmr[3], c_o_qmr[3], a_i, b_i, c_i);
// Fault Management Unit
qmr_enh inst0 (s_qmr, s, err_code0, err_flag0);
qmr_enh inst1 (c_o_qmr, c_o, err_code1, err_flag1);
endmodule
Figure 5.3: Converted HDL file with self-repairing architecture.
53
Note that this example does not include the error reporting module. If the healing
process is desired, it can also be included into the converted HDL file by instantiating the
error reporting module. Details implementation of the the enhanced modules, such as the
qmr enh module in this example, are presented in Appendix.
Recall from the self-repairing architecture, the fault management unit and the error-
reporting module must adapt to the user logic. Fortunately, most of the structure remains
unchanged while only the ports related logic requires a modification. Therefore, the first
important task is to identify the ports of the original design. Since the port declaration is
usually on the top of the module and begins with either the keyword “input”, “output” or
“inout”, this particular pattern can be easily searched using the Regular Expression [57].
A regular expression, also known as a “regex”, is a sequence of characters that specifies
a pattern. Table. 5.1 presents some example of using regular expression to match certain
strings patterns. It is extremely useful in extracting and replacing information from text
that takes a defined format, such as dates, urls, phone number, email addresses and even
code. Therefore, it is possible to perform advanced text manipulation.
Table 5.1: Example of using regular expression to match strings patterns
Regex Matches any string that
hello contains {hello}
ˆThe starts with {The}
dog$ ends with {dog}
ˆreg$ is exactly {reg}
a(b|c) has {a} followed by {b} or {c}
go+gle contains{gogle,google,gooogle,goooogle,...}
Since the previous full-adder example is a combinational circuit, no state synchroniza-
tion technique is implemented. To further evaluate the regular expression, consider the
example of a simple sequential circuit, a counter, presented in Figure 5.4.
54
module counter (
output reg [7:0] count, // Output of the counter
input enable, // enable for counter
input clk, // clock Input
input reset // reset Input
);
always @(posedge clk) begin
if (reset) count <= 8’b0;
else if (enable) count <= count + 1;
end
endmodule
Figure 5.4: Example Verilog code of a 8-bit counter
Note that when applying the state synchronization method, only the second statement of
the “out” register, e.g. out <= out+1; requires a modification. The first line out <= 8′b0;
should be unaffected since the state synchronization should not overwrite the reset behavior.
In other word, the state synchronization should be applied after the repairing process, not
during the system reset event.
Recall the state synchronization diagram in Figure 3.6 from Chapter 3, the equivalent
design in Verilog for this up counter is presented in Figure 5.5. Even though there is only
one reset statement in this example, in a real world application it can be any integer number
of statements. The lack of a separate memory in the finite automata limits the pattern depth
it can detect. As a result, the pattern for the reset statement would be too complex to be
described in the regular expression [58]. Therefore, using the regular expression to locate












always @(posedge clk) begin




if(enable) count <= count + 1;
end
end
assign neighbor_out_count = count + 1;
endmodule
Figure 5.5: Self-repairing version of the 8-bit counter with the state synchronization tech-
nique
Moreover, identifying the target register is only the first step of applying the state syn-
chronization method. The second step is to modify the original code by inserting the related
IO ports, registers and the corresponding statements without violating the grammar and the
original functionality. This could not be achieved without the knowledge of the code struc-
ture. Therefore, we need a more powerful tool that not only understands the meaning, or
semantics, of the code but also its form, or syntax. This tool is called a compiler.
5.2 Typical Compiler
A compiler is a computer program that translates a program written in one language, for
example C or Java, into anther language such as the machine language (usually in a bi-
56
nary format). First, it needs to understand the syntax and meaning of the source language.
Second, it must comprehend the rules that govern the form and content in the target lan-
guage. Finally, it requires a scheme to map content from the source language to the target
language [59].
In order to understand the structure and meaning of the source program, a compiler first
pulls it apart and analysis each basic component. Consequently, in order to generate the
target program, it puts the pieces together in a different way. As presented in Figure 5.6,
the front end of the compiler performs analysis, focusing on understanding the source-
language program; the back end does synthesis, aiming to map the program to the target
machine.
Figure 5.6: Structure of a classical compiler [60].
In order to bridge the front end and the back end, a compiler generates a special data
structure for representing the program in an intermediate form whose meaning is largely
57
independent of the source language or the target language. Furthermore, this intermediate
representation includes the knowledge of the source program obtained by the front end. To
improve the translation, an optimizer is often included to analyze and rewrite the interme-
diate form [60].
5.3 HDL converter
Since the back end of the compiler is usually responsible for mapping a human readable
code into the target machine code, it is not required by the HDL converter. Instead, we need
to automatically modify the source code without violating the source language grammar.
The intermediate representation is a formal data structure designed to be independent of
the source languages. Therefore, it is much more convenient to be modified compared
with the source program which is governed by the source language grammar. In fact, this
intermediate representation allows us to inject the required IO ports, registers (for state
synchronization) and the corresponding activation conditions without concerning about the
grammar. Moreover, the modified intermediate representation can be transformed reversely
to the source program written in Verilog. Therefore, the back end of a compiler should be
replaced with an intermediate code modifier and source code generator. Details of how to
generate a new program based on the existing intermediate representation is described in
the intermediate representation modifier section in this chapter (section 5.3.3).
For the front end, the semantic analyzer, which focus on connecting variable definitions
to their uses and checking that each expression has a correct type, can also be skipped. This
is because the semantics of the language only determines what its programs mean. In other
words, semantic analysis judges whether the syntax structure constructed in the source
program derives any meaning or not.
For example, consider the pseudo code presented in Figure 5.7. This code is lexically
and structurally correct. Therefore, it should not issue an error in the lexical and syntax
analysis phase. However, it should generate a semantic error as the type of the assignment
58
differs. The “value” on the right hand side of the equal sign is a string, but it is assigned
to an integer variable. These rules are set by the grammar of the language and evaluated in
semantic analysis.
int a = ” v a l u e ”;
Figure 5.7: Example of a syntax-correct but semantic-wrong code
Since we assume that the original HDL file is already a valid design, thus checking
the meaning of the program is unnecessary. Furthermore, it is not the HDL converter’s
responsibility to verify the original design. Even if the original HDL file is not valid, the
electronic design automation software or the Verilog compiler will perform the semantic
analysis when the Verilog code is compiled to a netlist or bitstream. Therefore, the semantic
analyzer of the front end compiler is not required for the HDL converter.
As a result, compared with a typical compiler, the structure of the HDL converter is
shown in Figure 5.8. First, the lexical analyzer and the syntax analyzer read the original
code and generate the intermediate representation, which is an Abstract Syntax Tree (AST)
in our case. Second, the AST modifier upgrades the intermediate code. Finally, the HDL
code generator create a new Verilog program based on the upgraded AST.
Figure 5.8: Structure of a HDL converter.
59
5.3.1 Lexical Analyzer
Similar to a compiler, before the HDL converter can transform one HDL file to another
design, it must understand the original source program. Assuming that a computer program
is a set of instructions or strings defined by a finite set of rules, called a grammar; a string
is a finite sequence of symbols; and the symbols themselves are consisted of a set of finite
characters. To analysis the original source codes, we start with the smallest constituent unit
of the program.
Therefore, the first phase of understanding a program is to group individual characters
into distinct words or symbols and classify each word with a part of speech or type. This
step is called lexical analysis or scanning. The lexical analyzer reads the stream of char-
acters and groups the characters into meaningful sequences called lexemes. As illustrated
in Figure 5.9, for each lexeme, the lexical analyzer produces a token as an object with two
keys: token-name and attribute-value. The token-name is an abstract symbol that is used
Figure 5.9: Lexical Analyzer.
during syntax analysis, and the attribute-value points to an entry in the symbol table for
this token. In other words, the lexical analyzer creates a connection between the individual
symbols of the source program and the predefined symbol types of the source language.
As a result, each symbol is categorized into an associated type. The tokens are then passed
on to the subsequent phase as a basic knowledge of the symbols. Figure 5.10 presents the
output of the lexical analyzer as applied to the counter example.
60
module MODULE 1 0
counter ID 1 7
( LPAREN 1 18
output OUTPUT 2 24
reg REG 2 31
[ LBRACKET 2 35
7 INTNUMBER_DEC 2 36
: COLON 2 37
0 INTNUMBER_DEC 2 38
] RBRACKET 2 39
out ID 2 41
, COMMA 2 44
input INPUT 3 76
enable ID 3 82
, COMMA 3 88
input INPUT 4 117
clk ID 4 123
, COMMA 4 126
input INPUT 5 148
reset ID 5 154
) RPAREN 6 180
; SEMICOLON 6 181
always ALWAYS 7 183
@ AT 7 190
( LPAREN 7 191
posedge POSEDGE 7 192
clk ID 7 200
) RPAREN 7 203
if IF 8 209
( LPAREN 8 212
reset ID 8 213
) RPAREN 8 218
out ID 8 220
<= LE 8 224
8’b0 INTNUMBER_BIN 8 227
; SEMICOLON 8 231
else ELSE 9 237
if IF 9 242
( LPAREN 9 245
enable ID 9 246
) RPAREN 9 252
out ID 9 254
<= LE 9 258
out ID 9 261
+ PLUS 9 265
1 INTNUMBER_DEC 9 267
; SEMICOLON 9 268
endmodule ENDMODULE 10 270
Figure 5.10: Lexical analyzer result of scanning the counter.v61
In order to identify the lexical token, multiple regular expressions are used. As previ-
ously discussed, the regular expression is a convenient way to match a string pattern. For
example, if a pattern is described as a predefined type, then whatever string matches that
pattern is a member of that type.
However, a regular expression is simply an abstract formulation. In order to implement
it as a computer program, finite automata are used. A finite automaton is a finite state
machine that accepts or rejects strings of a language which it defines. It has a finite set
of states; edges lead from one state to another, and each edge is labeled with a symbol.
One state is the start state, and certain of the states are distinguished as final states. For
example, a finite automata that can detect a floating number is illustrated in Figure 5.11.
The initial state is labeled as state 0. If the first character is any thing from number zero to
number nine (a digit), it goes to state 1. If the next charter is still a digit, it stays at state 1.
Otherwise, if it is a dot, it leads to state 2. If the following character is a digit, it reaches
state 3 and stays at this stat as long as the following character is still a digit. The state 3
represents that a floating number is matched.
Figure 5.11: The Finite Automaton of a floating number.
5.3.2 Syntax Analyzer
After the stream of characters are grouped into words, a compiler fits the words into a
grammatical model of the source programming language. This model is called context-free
grammars and this second phase of the compiler is called syntax analysis or parsing.
62
Although grouping characters into words may seem similar to fitting words into a sen-
tence, a regular expression or a finite automaton cannot recognize the context-free grammar
due to the limited states available.
For example, it is impossible for a finite automaton to recognize an arbitrarily long
balanced parentheses, because a machine with N states cannot remember a parenthesis-
nesting depth greater than N.
Another example is the recursive abbreviation. Consider the following grammar that
describes an “expression”:
digits = [0− 9]+
sum = expr“ + ”expr
expr = “(”sum“)”|digits
this expression is designed for defining forms as follows:
(1 + (2 + 3))
If we substitute sum into expr, we get the following:
expr = “(”expr“ + ”expr“)”|digits
And if we substitute expr into itself, we get the following:
expr = “(”“(”expr“ + ”expr“)”|digits“ + ”expr“)”|digits
Obviously, the occurrences of expr could be any number. However, a finite state machine
or finite automaton cannot require an arbitrary amount of memory. Therefore, the recursive
abbreviation cannot be recognized by a regular expression.
Therefore, instead of using the finite automaton, the parser uses the tokens produced
by the lexical analyzer to create a tree-like intermediate representation, called parse tree,
that describes the grammatical structure of the token stream. A typical representation is
a syntax tree in which each interior node represents an operation and the children of the
node represent the arguments of the operation. Figure 5.12 presents the output of the syntaz








Output: count, False (at 2)
Width: (at 2)
IntConst: 7 (at 2)
IntConst: 0 (at 2)
Reg: count, False (at 2)
Width: (at 2)
IntConst: 7 (at 2)
IntConst: 0 (at 2)
Ioport: (at 3)
Input: enable, False (at 3)
Ioport: (at 4)
Input: clk, False (at 4)
Ioport: (at 5)
Input: reset, False (at 5)
Always: (at 8)
SensList: (at 8)
Sens: posedge (at 8)
Identifier: clk (at 8)
Block: None (at 8)
IfStatement: (at 9)
Identifier: reset (at 9)
NonblockingSubstitution: (at 9)
Lvalue: (at 9)
Identifier: count (at 9)
Rvalue: (at 9)
IntConst: 8’b0 (at 9)
IfStatement: (at 10)
Identifier: enable (at 10)
NonblockingSubstitution: (at 10)
Lvalue: (at 10)
Identifier: count (at 10)
Rvalue: (at 10)
Plus: (at 10)
Identifier: count (at 10)
IntConst: 1 (at 10)
Figure 5.12: Syntax analyzer result of processing the counter.v
64
Source code of the lexical analyzer and the syntax analyzer are listed in the appendix.
Instead of writing a compiler front end from sketch, the open source library Python Lex
Yacc (PLY) [61] and Pyverilog [62] are used to implement the scanner and the parser.
Since the main objective of this chapter is to introduce the HDL converter which upgrades
a Verilog design to a more reliable version rather than to design a compiler front end from
sketch. The discussion here is focuses on the general mechanism behind these tools. Details
of the design and implementation of the lexical analyzer and the syntax analyzer are beyond
the scope of this thesis. The readers can refer to [59][58][60] for a full description.
5.3.3 Intermediate Code Modifier
After the intermediate representation of an abstract syntax tree is generated, the next phase
is to modify the AST data structure.
A human being modify a code in three steps. First we read the code and try to under-
stand the intent of each line of code. Then we locate the part that needs to be changed.
Finally,we make modifications based on the requirements and language grammar.
The lexical analyzer and the syntax analyzer introduced in the previous section com-
plete the first task and create a data structure for later usage. In order to locate the part that
needs to be changed, a tree traversals function is required [60]. This tree traversals will be
used for describing attribute evaluation and for specifying the execution of code fragments
in a translation scheme. As introduced in Figure 5.13, a traversal of a tree starts at the root
and visits each node of the tree in some order. Figure 5.14 illustrates the pseudocode of the
visit function.
65
Figure 5.13: Tree Traversal Diagram.
procedure visit(node N) {
for ( each child C of N, from left to right ) {
visit (C);
}
Actions at node N;
}
Figure 5.14: Tree Traversal Pseudocode
For example, visit Reg, presented in Figure 5.15, is a visit function that traverses the
abstract syntax tree and searches all the nodes that represent a register. As a result, all the
registers inside a module are located and stored in a hash table. This function is used to
target all the internal sequential state registers. The hash table in Figure 5.16 is an example
of applying the visit Reg on the counter abstract syntax tree. This hash table is then passed
to the visit Portlist function to insert the new input/output ports as illustrated in Figure 5.17.
First, the existing ports are converted into a list and stored in the new ports. Then
we append new ports to the existing port list by creating new Ioport item. The first
Ioport item inserted is the input port of the multiplexer control signal, here it is repre-
sented by the self.signal variable which can be easily renamed. The following ports are
66
the state synchronization data port from the nearby modules. One is an input port starts
with neighbor in, the other one is an output port starts with neighbor out. Both of their
names are followed by the state register’s name and the corresponding width of that reg-
ister. This extended new ports list will then be used to generate the update Verilog code
using the HDL code generator introduced later.
def visit_Reg(self, node):
self.vardict[node.name] = [node.width.msb, node.width.lsb
]
return self.generic_visit(node)
Figure 5.15: Example of creating a state register hash table. The hash table vardict stores
the state register name and the corresponding width
{’count’: [7, 0]}













Figure 5.17: Example of an intermediate code modifier function. This function inserts
the signal port for controlling the multiplexer, this signal is the “sync” signal passed from
outside; the ‘neighbor in ′+k and the ‘neighbor out ′+k port is the state synchronization
data port from/to the neighbor’s module. The “k” in the name field will be replaced with
the information from the state synchronization hash table.
67
5.3.4 HDL Code Generator
The HDL code generator generates a source code in Verilog HDL from the intermediate
representation of an modified AST. Figure 5.18 shows an example python code of gener-
ating a Verilog file from the AST. In this example, an always block is generated, and the
AST object is built from scratch only for demonstrating the functionality of the HDL code
generator. However, when the HDL code generator is integrated to the HDL converter, it
generates the Verilog code based on the upgraded syntax tree created by the intermediate
code modifier. As presented in the code, each word is reconstruct into a “sentence” based
on their syntactic category, and the “sentence” is organized into the source program based
on the Verilog grammar. Figure 5.19 illustrates the generated source code by the Python
script in Figure 5.18.
sens = vast.Sens(vast.Identifier( ’CLK ’), type= ’ posedge ’)
senslist = vast.SensList([ sens ])
assign_count_true = vast.NonblockingSubstitution(
vast.Lvalue(vast.Identifier( ’ c o u n t ’)),
vast.Rvalue(vast.IntConst( ’ 0 ’)))
if0_true = vast.Block([ assign_count_true ])
# count + 1
count_plus_1 = vast.Plus(vast.Identifier( ’ c o u n t ’), vast.IntConst(
’ 1 ’))
assign_count_false = vast.NonblockingSubstitution(
vast.Lvalue(vast.Identifier( ’ c o u n t ’)),
vast.Rvalue(count_plus_1))
if0_false = vast.Block([ assign_count_false ])
if0 = vast.IfStatement(vast.Identifier( ’RST ’), if0_true,
if0_false)
statement = vast.Block([ if0 ])
always = vast.Always(senslist, statement)
Figure 5.18: Example of the python source code creats an always statement using HDL
code generator.
68




count <= count + 1;
end
end
Figure 5.19: Example of the generated Verilog code of an always statement.
In order to build the always block presented in Figure 5.19, we first create the sensitive
list, which is the content inside the parenthesis after the @ symbol. Then we build the if
statement by first developing the expression for the case when the if condition is true. This
expression is essentially a non blocking substitution which is consited of a Lvalue and a
Rvalue. Similarly we build the expression for the false condition. This expression is a
little bit complicated since the Rvalue is no longer a simple constant. Instead, it is a PLUS
expression, which is count+1 in this example. Next, we combine the condition, the if ture
statement and the if false statement to complete the if statement. Finally, we group the
sensitive list and the if statement to finish the always statement.
In summary, the HDL converter consists of four parts: a lexical analyzer, a syntax
analyzer, an intermediate code modifier and a HDL code generator.
The lexical analyzer reads individual characters, groups them into words, and categories
each word to their parts of speech. The syntax analyzer reads distinct words and construct
an abstract syntax tree based on the grammar. The lexical analyzer and the syntax analyzer
composite the front end of the HDL converter, which identifies the internal sequential state
registers and the input/output ports of the original design file.
The intermediate code modifier reads the parsed syntax tree and upgrades the user logic
to allow state synchronization and creates top-level design for deploying self-repairing ar-
chitecture. For the user logic modification, this is achieved by first adding new input ports
and output ports to the current syntax tree. Second, every pin that connects to the user
logic state signal are reassigned. Third, A multiplexer, a “Sync” port are added to switch
69
the source signal to that state register. An if statement with the synchronization condition
is inserted to the original always block. Consequently, all the logic states of the recently
activated module will be synchronized to the nearby module in one cycle.
For the self-repairing architecture deployment, this is achieved by first creating a top-
level self-repair module with the same IO ports of the target module. Second, inside of
this top-level module, multiple copies of the original module which serve as the redundant
modules for the target are instantiated. Third, The enhancement module and the repair
interface module are also included. Next, enhancements to detect errors and correct errors
and repairing communication interface module are included. Therefore transient faults will
be ignored but soft errors and hard errors will be detected using a BER counter. Upon
detection of a soft fault, partial reconfiguration will be initiated.





6.1 Case Study: ITC benchmark designs
In this section, we implement our enhanced-DMR/TMR/QMR approaches using standard
benchmark logic designs [63] and the Xilinx 28nm Kintex-7 FPGA development platform.
The studied benchmarks are listed in Table 6.1 and range in size from MSI (25-172 gate)
to LSI (1,000 gate) complexity. We use these as representative of small-to-medium grain
logic blocks that make up a much larger VLSI system. The specific target FPGA is the
XC7K325T-2FFG900C, containing a total of 50,950 slices (203,800 LUTs, 407,600 flip
flops). In the table, the first column is the ITC99 benchmark design reference [63]. The
next two show the block size in gates and LUTs respectively. The fourth column lists the
number of flip-flops in each block. The fifth column lists the number of primary inputs and
outputs per block. The sixth column shows the number of lines of VHDL code, and the last
column shows the ratio N/M where N is the number of primary outputs and M is the block
size as measured in LUTs.
The benchmarks are described in VHDL and converted to the target FPGA layout using
the Vivado development suite from Xilinx. Prior to layout, we replicate the VHDL code for
Table 6.1: ITC99 BENCHMARK DESIGNS
ITC99# gates (LUTs) FFs I,Os VHDL N/M
B01 45(5) 5 4,2 110 0.4
B02 25(4) 4 3,1 70 0.25
B03 150(12) 30 6,4 141 0.333
B09 131(25) 28 3,1 103 0.04
B10 172(29) 17 13,6 167 0.207
B12 1000(207) 121 7,6 567 0.029
71
each block 2, 3, or 4 times to construct the enhanced DMR, enhanced TMR, or enhanced
QMR versions.
6.1.1 Implementation Cost Estimation
For enhanced DMR, the bit error rate measurement logic block is shown in Figure 6.1, we
use a portion of a LUT to implement the XOR function that recognizes when the two units
disagree. A 2-bit counter is implemented using two available flip-flops within the same
slice as the LUT. The counter is initialized (reset) by a global BER-reset signal every 2K
clock cycles, where K is the number of bits in a shared global counter. The time period
between BER resets serves as the denominator in the BER ratio. The 2-bit counter is
designed to signal an error upon receiving the SECOND error within that time period (the
first error is ignored since it represents an acceptable BER). In the case of a soft or hard
fault in either unit, the BER will reach the 2-error threshold very quickly and signal that
the cell has failed and needs repair.
For enhanced TMR, we add the TMR enhanced voting logic to the tri-modular redun-
dant configuration. Here we again use a 2-bit counter to distinguish between an acceptably-
low BER from transients and the higher rates from soft or hard errors. One LUT is used
to implement the voter logic and error signal. A second LUT generates a 2-bit error code
that identifies the failing unit (one of three, with the fourth code used to indicate error-free
operation). When the BER reaches the threshold of TWO errors within 2K clock cycles,
the system is notified and it reads the 2-bit error code in order to determine which unit to
repair.
For enhanced QMR, we add the logic shown in Figure 6.2 and Figure 6.3 to the quadruple-
modular redundant configuration. When implemented in FPGA, these two logic and com-
bined together to fit into multiple LUTs. As with enhanced DMR and enhanced TMR we
use a 2-bit counter to determine when the BER exceeds the threshold of TWO errors per


















































































voter and error-generating LUTs so that the last failing unit is ignored, while allowing the
remaining three units to function in TMR mode. A single soft or hard error is immediately
repaired by substituting the fourth spare unit for the failing unit in the remaining enhanced
TMR configuration. Again, as with enhanced TMR, if the BER ever exceeds the threshold,
then the system will get a Master Error Flag signal, and it can then read the Error Code
to identify the unit to repair. The failed unit is therefore isolated for re-programming to
attempt healing of a soft fault.
To summarize, the added test logic requires 1, 3, or 4 LUTs for Enhanced DMR, En-
hanced TMR, Enhanced QMR respectively. Therefore, the total size (in terms of LUTs) of
the fault-tolerant logic blocks and the overhead costs can be approximated as follows:
Enhanced DMR Area = 2M +N (6.1)
Enhanced DMR Overhead =
2M +N
M
− 1 = 1 + N
M
(6.2)
Enhanced TMR Area = 3M + 3N (6.3)
Enhanced TMR Overhead =
3M + 3N
M
− 1 = 2 + 3N
M
(6.4)
Enhanced QMR Area = 4M + 4N (6.5)
Enhanced QMR Overhead =
4M + 4N
M
− 1 = 3 + 4N
M
(6.6)
Where M is the number of LUTs in the original function and N is the number of primary
outputs from the logic block. In the formulas the overhead values are expressed in terms
76
of the original function size. For example, an overhead of 2.15 corresponds to 215% over-
head, or a little more than twice the original block size. The total area in that case is 3.15
times the original block size. The formulas for Enhanced DMR, Enhanced TMR, and En-
hanced QMR overhead are plotted as a function of block size in Figure 6.4, Figure 6.5 and
Figure 6.6. In each case, the overhead asymptotically approaches the limits determined by
the degree of redundancy, namely 100%, 200%, and 300% for Enhanced DMR, Enhanced
TMR, Enhanced QMR respectively as N/M approaches zero.
Figure 6.4: Enhanced DMR area overhead vs. the function module size.
77
Figure 6.5: Enhanced TMR area overhead vs. the function module size.
Figure 6.6: Enhanced QMR area overhead vs. the function module size.
78
The QMR enhancement logic block is presented in Figure 6.7. Since these enhanced
modules are required for each output of the user logic block, it is best to partition the logic
into large functional blocks (>100 LUTs) with few (<10) primary outputs, if possible.
Also notice that these formulas are approximate in that they do not completely reflect all
the design constraints for routing in an FPGA platform.
Figure 6.7: QMR enhancement logic block schematic.
To see how accurate these approximations are, we used the Xilinx Vivado design tools
to layout enhanced DMR, enhanced TMR, and enhanced QMR versions of the six ITC99
benchmarks. The actual number of LUTs used in each design is shown in Table 6.2. The
estimated sizes are shown in Table 6.3 for comparison. The difference between the actual
layout areas and estimated results is due to the Vivado synthesis optimization.
Table 6.2: ITC99 benchmark designs implementation sizes from layout
Benchmark Origin Enhanced DMR Enhanced TMR Enhanced QMR
b01 5 11 17 27
b02 4 9 13 22
b03 12 32 47 82
b09 25 49 73 101
b10 29 78 108 176
b12 207 449 659 939
79
Table 6.3: ITC99 benchmark designs implementation sizes from formulas
Benchmark Origin Enhanced DMR Enhanced TMR Enhanced QMR
b01 5 12 21 28
b02 4 9 15 20
b03 12 28 48 64
b09 25 51 78 104
b10 29 64 105 140
b12 207 420 639 852
In addition, we wanted to make sure that the fault-tolerant designs did not significantly
impact the overall system performance (i.e. maximum frequency). Therefore, for all the
Vivado layouts we specified reasonable timing constraints. As a result, we were able to get
fault-tolerant designs with nearly the same performance as the originals. This is presented
in Table 6.4 and Figure 6.8, where the maximum operating frequencies are given for these
same layouts. The typical loss of performance is 1-2%, with a maximum loss of 6% for the
most complex design, b12.
Table 6.4: ITC99 benchmark designs maximum operating frequencies (in MHz)
Benchmark Origin Enhanced DMR Enhanced TMR Enhanced QMR
b01 108.613 108.026 103.961 107.411
b02 109.565 105.02 105.363 108.483
b03 106.112 99.226 99.088 103.659
b09 109.565 104.91 104.778 109.493
b10 104.373 104.221 99.522 102.062
b12 108.483 99.334 97.809 101.368
80
Figure 6.8: Maximum operating frequencies for ITC99 benchmark designs
Figure 6.9 shows an example layout floorplan (generated by Vivado) for the enhanced
QMR version of b12. This drawing represents only about 0.5% of the entire Xilinx FPGA,
so up to 200 of these LSI cells could fit on the chip. The four copies of the original b12
function are shown as clusters of red-colored logic slices. Each cluster is about 207 LUTs,
but varies slightly across the four versions, presumably as a result of small optimizations
made by Vivado to meet other design constraints. The small group of blue-colored squares
represents the logic slices used to implement the voter and other test logic for all six primary
outputs.
81
Figure 6.9: Layout floorplan for enhanced QMR version of b12
6.1.2 Reliability Model
Combinatorial and Markov models are the two fundamental approaches to modeling hard-
ware reliability [64].
Combinatorial models use probabilistic techniques to enumerate the ways in which a
system can remain operational. The reliability of a system is generally derived in terms
of the reliabilities of each individual component [65]. Although this model is effective for
evaluating simple systems, complex systems, which involves fault recovery and module
repairing, are often difficult to be incorporated. [64].
Markov models can overcome the recovery problems. A Markov chain, or more pre-
cisely a first-order Markov chain, is a stochastic process whose dynamic behavior is such
82
that probability distributions for its future development depend only on the present state
and not on how the process arrived in that state [66].
Figure 6.10 presents an example of modeling the reliability of a computing system with
a two-state Markov chain. State 0 represents the system is functional and State 1 means the
Figure 6.10: Markov model for a two state system
system is failed. The system transitions from functional to failed at a rate of λ , as labeled
on the arc from State 0 to State 1. In addition, we denote by Pj(t) the probability of the
system being in State j at time t. Therefore the incremental change dP0 in probability of
State 0 at increment of time dt can be expressed as:




Moreover, since the device is known to be healthy at initial time t = 0, with the initial
condition P0(0) = 1, the solution for this differential equation is presented as follows:
P0(t) = e
−λt (6.9)
Since only State 0 is the functional state, the system reliability R is:
R(t) = P0(t) = e
−λt (6.10)
This result can be verified by [4].
Xilinx provides a detailed study that describes the measured reliability characteristics of
its FPGA technology [43]. For the 50,000 LUT, 28nm Kintex-7 chip, the relevant reliability
83
numbers are:
• Transient failure rate (λt) = 73728 FIT
• Soft failure rate (λs) = 7373 FIT
• Hard failure rate (λh)= 11 FIT
The FIT stands for Failures In Time. In this report, the number represents the number of





For a non-redundant design, any type of fault will lead to a system failure. Therefore,
the reliability of a non-redundant system is calculated as follows:
R(t) = P0(t) = e
−(λt+λs+λh)t (6.12)
Based on the study in [67][68], we present the Markov model of the transitional TMR
system. As illustrated in Figure 6.11, State 3 represents a state when all three modules are
functional. State 2t means one of the three modules has a transient fault. Similarly, State
2p stands for a single permanent fault (soft fault or hard fault). State F means more than
one error occur thus the system is failed. λp is the permanent failure rate and λp = λs +λh.
Since single transient error do not collapse a TMR system, thus the transient error recovery
rate u is included. The TMR system reliability is calculated as follows:
dP3
dt
= −3λtP3 − 3λpP3 + uP2t (6.13)
dP2t
dt
= 3λtP3 − λpP2t − 2(λp + λt)P2t − uP2t (6.14)
dP2p
dt
= 3λpP3 + λpP2t − 2(λp + λt)P2p (6.15)
R(t) = P3(t) + P2t(t) + P2p(t) (6.16)
84
Figure 6.11: Markov model of the original TMR system.
The enhanced TMR system allows the soft faults to be identified and repaired by the
healing controller. Therefore, the recovery of a single soft error must be included to the
Markov model. Based on the research in [69], the Markov model of the enhanced TMR
system is presented in in Figure 6.12. v is the soft error repair rate. In addition, the state
that represents one permanent fault is replaced with two distinct states. State 2s and State
2h represent a single soft fault and a single hard fault scenario. The enhanced TMR system
reliability is calculated as follows:
dP3
dt
= −3λtP3 − 3λsP3 − 3λhP3 + uP2t + vP2s (6.17)
dP2t
dt
= 3λtP3 − λhP2t − uP2t − 2λaP2t (6.18)
dP2s
dt
= 3λpP3 − vP2s − λhP2s − 2λaP2s (6.19)
dP2h
dt
= 3λpP3 + λhP2t + λhP2s − 2λaP2h (6.20)
R(t) = P3(t) + P2t(t) + P2s(t) + P2h(t) (6.21)
where λa is the sum of all types of failure rates (λa = λt + λs + λh).
85
Figure 6.12: Markov model of the enhanced TMR system.
The Markov model of the enhanced QMR system based on studies in [70][64] is pre-
sented in Figure 6.13. The i and j in State Pi,j stands for the number of the healthy working
modules and the error-free spare module. For example, State 3, 1 means all three functional
blocks and the back-up module are free of faults. State 3, 0h means there is one hard fault
in the system. If the fault is in the functional block, it is automatically replaced by the spare
module. State 2, 0ht means there is one hard error and one transient error in the system.
The enhanced QMR system reliability is calculated as follows:
dP3,1
dt
= −4λtP3,1 − 4λsP3,1 − 3λhP3,1 + uP3,0t + vP3,0s (6.22)
dP3,0s
dt
= 4λtP3,1 − 3λsP3,0s + vP2,0ss − vP3,0s (6.23)
dP3,0t
dt
= 4λtP3,1 − 3λtP3,0t + uP2,0tt − uP3,0t (6.24)
dP3,0h
dt
= 4λtP3,1 − 3λhP3,0h − 3λsP3,0h − 3λtP3,0h + vP2,0hs + uP2,0ht (6.25)
dP2,0ss
dt
= 3λsP3,0s − vP2,0ss − λhP2,0ss − 2λaP2,0ss (6.26)
dP2,0tt
dt




= 3λsP3,0h + λhP2,0ss − vP2,0hs − λhP2,0hs − 2λaP2,0hs (6.28)
dP2,0ht
dt
= 3λtP3,0h + λhP2,0tt − vP2,0ht − λhP2,0ht − 2λaP2,0ht (6.29)
dP2,0hh
dt
= 3λtP3,0h + λhP2,0hs + λhP2,0ht − 2λaP2,0hh (6.30)
R(t) = P3,1(t) + P3,0t(t) + P3,0s(t) + P3,0h(t)+
P2,0ss(t) + P2,0hs(t) + P2,0hh(t) + P2,0ht(t) + P2,0tt(t)
(6.31)
Figure 6.13: Markov model of the enhanced QMR system.
87
Based on the failure rate provided by Xilinx and these formulas presented above, we
calculated the expected reliability as a function of time (years of operation) for a 204k
LUT system (the FPGA chip), for different levels of fault-tolerance. These are plotted in
Figure 6.14 and Figure 6.15 for NR, TMR, enhanced TMR, and enhanced QMR.
Because the non-redundant (NR) design is vulnerable to transient errors, its reliability
function is dominated by the transient failure rate. It shows significant degradation within
a year. However, TMR corrects almost all transient errors, and so its reliability is domi-
nated by the soft failure rate. enhanced QMR self-corrects most soft and hard errors, and
fails mostly from multiple hard errors within a single cell. The time scale for significant
degradation of enhanced QMR is therefore measured in centuries.
Figure 6.14: The reliability changes of a 204k LUT system in 50 years of operation.
88
Figure 6.15: The reliability changes of a 204k LUT system in 5000 years of operation.
89
6.2 VLSI Case Study: Handwritten Digit Recognition
New trends in machine learning algorithm, such as using compact data types (e.g., 1-
2 bit) in deep neural network, provide a great opportunity for FPGA application over
traditional GPU based implementations [71]. Moreover, FPGAs outperform GPUs and
CPUs in latency-sensitive applications [72]. In this section we implement our proposed
self-repairing architecture using an artificial neural network (ANN) based handwritten
digit classification design and the Xilinx KC705 Evaluation Board with 28nm XC7K325T-
2FFG900C FPGA.
6.2.1 Experiment System Setup
As presented in Figure 6.16, the top level design contains two UART (Universal Asyn-
chronous Receiver/Transmitter) modules, two FIFO (first in first out) modules and the neu-
ral network module. The UART RX module receives the image data stream from PC and
Figure 6.16: Experiment setup system overview.
the UART TX module transmits the predicted result back to PC. Since the UART protocol
transforms the data between serial to parallel interface and the sampling frequency of the
communication should be at least two times faster than that of the data stream. the UART
90
module and the ANN module work in two different clock domains. Therefore two asyn-
chronous FIFO modules are included to eliminate the clock domain crossing issue. The
trained weights are stored in the block RAM during the initial FPGA configuration.
6.2.2 ANN
An artificial neural network is a computational model based on the structure and behav-
ior of biological neural networks, wherein each neuron sums the weighted inputs from the
preceding neurons. In this application, it takes a 28x28 image and generates a 4 bit binary
number representing the predicted digit. The architecture of our handwritten digit recog-
nition neural network, is presented in Figure 6.17. The first layer is the input layer where
Figure 6.17: ANN architecture.
each node simply passes the image pixel to the second layer. The second layer is a layer
of neurons where each neuron multiplies the pixel matrix and the corresponding weight
matrix.
91
The general neuron behavior is presented in Figure 6.18. Notice that since our neural
network model only contains a single hidden layer, the activation function is unnecessary
and not implemented in this design. The output of the neuron represents the probability
of the given image to be a specific digit. In our implementation, ten neurons are used
representing the ten digits from zero to nine. The last layer is the output layer. It searches
the index for the maximum probability from the preceding neurons. In other word it selects
the digit with the highest probability.
Figure 6.18: Neuron model.
The MNIST database (Modified National Institute of Standards and Technology database)
of handwritten digits [73] is used to train and test our neural network model. Figure 6.19
presents an example of the handwritten digit.
The weights are generated from the software during the training process using 60,000
training image examples. These weights are pre-stored in FPGA block RAM while the
handwritten digit images are streamed from PC to FPGA at run time. We use 10,000 image
92
examples from the MNIST testing set to evaluate the neural network accuracy under various
error rates and demonstrate the robustness of our self-repairing architecture.
Figure 6.19: MNIST data example.
6.2.3 Error Injection
Random errors are injected to test the robustness of the self-repairing system. Since the
original design is primarily consisted of a hidden layer and a output layer, two error gen-
erators are applied, as illustrated in Figure 6.20. One error generator is attached to the
Figure 6.20: Error injection data flow.
hidden layer, mimicking the faults occur inside the hidden layer. The errors and the orig-
93
inal hidden layer outputs are processed through an XOR gate. For each bit of the hidden
layer outputs, if the corresponding error bit is a logic one, it will flip the the hidden layer
output. Otherwise, the hidden layer outputs stay unchanged. Even though the errors are not
directly injected to the internal logic of the hidden layer block, from the output layer point
of view, the results are the same. These corrupt results are then transported to the output
layer. Similarly, the second error generator is attached to the output layer, representing the
internal errors of the output layer.
The error generator module diagram is introduced in Figure 6.21. The input port, la-
Figure 6.21: Error generator module.
beled “i” in this figure, is connected to the on-board switches to control the injected error
value. If the random number generated from the Linear Feedback Shift Registers (LFSR)
is less or equal to the inject error value, a logic one is generated by the comparator. Oth-
erwise, the comparator will generate a logic zero. This output of the comparator is used to
94
switch the multiplexer to push an error bit (zero or one) to the filpflop chain. And this error
bit will keep shifting for the following cycles. Therefore, the number of ones inside the
flipflop chain depends on the probability of whether the random number generated from
the LFSR is less or equal to the inject error value. For example, if the random number is
evenly distributed between 1 and 10, and the inject error value is 5, then there should be
50% of ones inside the flipflop chain.
In order to obtain a random sequence with uniform probability distribution, a LFSR is
used. The LFSR is a shift register whose input bit is a linear function of its previous state.
Figure 6.22 presents an example of an 11-bit LFSR, the bits at position 9 and 11 perform an
exclusive or operation and feed back the output to position 1. In the next cycle, the LFSR
Figure 6.22: Example of a linear feedback shift register.
shifts the bits one position to the left and updates the bit at position 1 with the XOR result.
The positions to perform the exclusive or operation is called a tap. For a n-bit LFSR, with
well-chosen taps, this logic is capable of producing a random sequence of up to 2n− 1 bits
without repeating itself. The mathematical theory behind this design can be found in [74].
Since the LFSR is guaranteed to generated a random sequence with constant density,
the relationship between the bit error rate (probability of ones occur in the flipflop chain)
95





where e is the error rate, i is the error value from the input pin and j is the number of bits of
the LFSR. For example, if the LFSR is 8 bits, and input error number is 1, the error rate is
0.392%. The error injection module output port width is designed to be the same width as
the target layer result, therefore a bitwise xor operation can be performed with the correct
layer results.
Figure 6.23 illustrates the enhancement module behavior. A four bit number is injected
to four modules, the most significant bit corresponds to the error in module 3 and least
significant bit corresponds to the error in module 1. When the first error is injected to
module 1, it is detected but not asserts the error flag since the counter is not reach the
threshold, the error code indicates that the faulty module is module 1 as where the error is
injected and the faulty module is replaced by the backup module (module 3). Further errors
from this module are temporarily masked by the enhanced voter and would not trigger
the error detection. The system output “pred” as highlighted in the figure, continuously
generates the correct output. When another error is injected at module 3, the error code
indicates the faulty module is module 3. It switches this faulty module with module 1 and
































6.2.4 Fault Tolerance Analysis
We use this design as our benchmark and apply our self-repairing architecture to exam the
robustness of this system under error injection. The predicted results are compared with the
ten thousand reference outputs from MNIST data set, summarized in Figure 6.24. Without
injecting any error, the original system accuracy is 91.52%. When the error rate is relative
low, the system works correctly. When the error rate increases, the original system starts
Figure 6.24: ANN accuracy vs Injected error rate.
to fail but the self-repairing system continuously produces the correct output. When the
error rate elevates drastically, the accuracy of the self-repaired system starts to reduce due
to to the fact that there are more than two modules affected simultaneously. The original
experiment data is listed in Table 6.5 and Table 6.6.
98
Table 6.5: System Accuracy Results








































Table 6.6: System Accuracy Results (continued)









Table 6.7 summaries the FPGA resource utilization, power consumption and timing infor-
mation. The self-repairing system requires almost four times the logic resources (required
for QMR) and consumes twice the power compared with the original design. The maxi-
mum frequency reduces by 10%. This QMR approach contributes the majority utilization
of the system. Even though the area utilization is almost 100%, the difficulty for hardware
routing does not increase significantly. As a result, the maximum frequency only drops only
10%. The blank spaces are reserved for neural network weights storage (block memory).
Table 6.7: FPGA Implementation Report
Original Self-Repair
LUTs (number and percentage) 39019, (19.15%) 155530, (76.32%)
Registers (number and percentage) 35770, (8.78%) 142129, (34.87%)
Slice (number and percentage) 12240, (24.02%) 48989, (96.15%)
Total On-chip Power (W) 0.334 0.505
Dynamic (W) 0.174 0.344
Static (W) 0.160 0.161
system clock frequency (MHz) 200 200
ANN clock frequency (MHz) 12.5 12.5
ANN worst negative slack (ns) 48.790 45.186
ANN maximum frequency (MHz) 32.041 28.724
100
The implementation layouts of the original system and the self-repairing system are
displayed in Figure 6.25 and Figure 6.26. Each redundant module is presented with a
unique color in Figure 6.26.
Figure 6.25: FPGA Layout of the original system.
101






The objective of this research was to establish a systematic approach for the design of self-
testable, self-correcting, self-repairing and self-healing digital systems. In Chapter 1, the
motivation for a designing a highly reliable digital system and the purpose of this research
was established.
Faults in the hardware system and their causes were reviewed in Chapter 2. This chapter
also examined recent technologies to detect and repair faults, and advanced methodologies
to achieve fault tolerance.
Chapter 3 presented the self-testing, self-repairing and fault-tolerant system architec-
ture. This research mainly consisted of the development of several enhanced modules to
the existing self-testing, self-repairing and fault-tolerant techniques presented in Chapter 2,
aiming to improve the fault repairing efficiency, system availability and system reliability.
The fault repairing efficiency improvement was achieved by using a self-testing enhance-
ment module using a the bit error rate counter. The bit error rate counter distinguished
intermittent errors from transient errors based on their effect on bit error rate: intermittent
faults are more likely to occur repeatedly at the same location and occur in bursts when the
fault is activated. Therefore, the transient errors are filtered and only the intermittent errors
are recorded. The enhanced voting module combined with modular redundancy techniques
allowed the faulty module to be identified and isolated from the healthy modules, which
facilitated the repairing process. Moreover, with one extra module worked as a backup
module, the system is capable of maintaining fault-tolerance even if one module is under-
going repair, which dramatically increased the system system availability and reliability.
103
In addition, a novel state synchronization technique that suits the self-repairing system
is also presented in Chapter 3. The state synchronization technique allows the out-of-state
module (due to the repairing procedure) to be synchronized with the remainder system.
However, the traditional state synchronization technique only works for redundant modules
of odd number. Therefore a novel state synchronization technique based on multiplexer was
discussed.
Chapter 4 explained the mechanism of partial reconfiguration and the related architec-
ture of FPGA. It also introduced the development flow of a partial reconfiguration project
based on Vivado Design Suite, a Xilinx software for FPGA design, synthesis and imple-
mentation. The last section of this chapter presented the error reporting message format
and the software program behind the healing controller.
In Chapter 5, A software tool, called HDL converter, that autonomously deploys the
self-repairing architect and upgrades the non-safety-critical design into a more reliable ver-
sion is discussed. First, the limitation of manipulating the user HDL code with the regular
expression is presented. Second, the similarity and difference between the classic software
compiler and the HDL converter is described. Next, the details of development of the HDL
converter is presented.
Finally, various experiment results are presented in Chapter 6. First, an area cost esti-
mation method is introduced and verified by implementing the ITC99 benchmark designs.
In addition to the small area cost of the enhanced logic, we prove that the fault-tolerant
designs did not significantly impact the overall system maximum frequency. Moreover,
multiple Markov chain based reliability models are developed to support the evaluation
theory of each enhanced self-repairing architecture. In addition, an image classification
application is implemented to demonstrate the reliability improvement. In conclusion, we
demonstrate the reliability improvement from both the theories and experiments, and we
prove that the costs of various example design ranging from SSI (small scale integration)
to VLSI (very large scale integration) are predictable and did not significantly impact the
104
overall system maximum frequency.
7.2 Contributions
The discussion given above highlights the works that has been done in this research. The
major contributions of this thesis are illustrated below:
7.2.1 Bit Error Rate (BER) Measurement
A bit error rate enhanced testing logic is introduced in this thesis. This bit error rate counter
based self-testing module detects functional errors (from changes in static logic) but ig-
nores low BER transient errors so that higher-level repair mechanisms can be invoked at
the system level only when there is a high likelihood of a structural fault. This distinction
is critical for the fault repairing process since transient errors very often do not lead to a
permanent functional failure. Therefore the traditional error detection method, which flags
every error, is not suitable for identifying modules for repair. By filtering out transient
errors from intermittent errors, the repairing procedure required by the fault management
unit is effectively reduced by 90% to 95%, which significantly reduced the average single
module offline time, and thus increases the system availability overall. Moreover, the hard-
ware implementation cost of this bit error rate measurement logic is negligible. Therefore,
this minimal transistors or gates requirement can largely increase the enhanced module
scalability.
7.2.2 Enhancement for Fault Tolerance and Fault Isolation
An enhanced voting logic is presented in this thesis. The enhanced voting logic not only




A self-repairing architecture based on modular redundancy technique is introduced in this
thesis. This architecture integrates the bit error rate measurement logic and the enhanced
voting logic and achieves self-testing, fault-tolerance and immediate context switching
when a permanent fault are detected.
This self-repairing architecture is easy to scale and can be applied on different hierarchy
of the circuit. For example, it can be applied to the entire system, a critical subsystem, one
or more modules which are more vulnerable to certain type of errors or a single data path.
7.2.4 State Synchronization
A state synchronization method is presented in this thesis. When the damaged module is
bought back to the system after repairing, the internal sequential state information can be
obtained from nearby operating modules within one clock cycle. This method is not re-
stricted by the number of redundant modules whereas the traditional state synchronization
method can only be applied a system with an odd number of modules.
7.2.5 HDL converter
A Verilog compiler based software for HDL file modification is introduced in this thesis.
This software is capable of automatically modifying the user logic and upgrading it into a
more reliable version by applying the self-repairing architecture. This method eliminates
the human interaction, therefore increases the productivity while avoiding the human error.
Although the HDL converter presented in this thesis is designed for only deploying the
predefined self-repairing architecture, this compiler based code modification software has
potential in various applications. For example, by customizing the AST modification rules,
users can easily insert related logic for increasing the circuit testability.
106
7.2.6 Self-Repairing Design Framework
This thesis presents a framework for designing and monitoring a self-testable, self-correcting,
self-repairing and self-healing digital systems. The self-repairing architecture significantly
increases the system reliability without compromising the system availability. In addition,
this architecture is automatically deployed by the HDL converter. Furthermore, by ignoring
the transient errors, the healing controller can effectively repair the soft permanent errors.
7.3 Conclusion
The objective of this research was to establish a systematic approach for the design of
self-testable, self-correcting, self-repairing and self-healing digital systems. Experimental
results presented in this thesis demonstrate the improvement in system reliability for a
variety of designs under random error injection.
The methods and techniques presented in this thesis offer six distinct contributions dis-
cussed in the previous section. Theses contributions provides a fully autonomous approach
for the design of self-repairing systems. Furthermore, the HDL converter presented in this
thesis, even though was originally designed for generating a self-repairing system, is not
limited to this research objective. Customization is allowed for future applications.
7.4 Future Work
This thesis demonstrated a systematic approach for designing self-repairing systems. There
are some future works that can be done to further improve the system integration. More
applications, especially the safety critical applications, tests and reliability tests are also
worth investigating to further examining the performance of the system.
107
7.4.1 Built-in Healing Controller
The healing controller presented in Chapter 4 relies on an external computer for executing
the reconfiguration command. However, this external healing controller can be integrated
into the same chip based on the types of FPGAs.
For system on chip FPGA, for example Xilinx Zynq FPGA and Intel Cyclone V FPGA,
there is an embedded processor which is capable of running the Linux operating system.
This allows the device to execute the partial reconfiguration task the same way as an exter-
nal desktop does.
For FPGAs without an embedded processor, the same task can be achieved by pro-
gramming the FPGA to implement a soft processor IP core. This soft processor would not
achieve the same performance compared with a dedicated hardware processor, but it is still
capable of running an embedded operating system, thus producing the same result as the
hard processor.
For FPGAs with very limited resources, the implementation of the soft processor can be
expensive or even impossible. In this case, a lightweight partial reconfiguration controller
can be applied. This controller reads the partial bitstream in flash and programs the target
area in a similar way as the FPGA booting from flash memory.
7.4.2 Safety-Critical Application
To further examine the self-repairing system performance, more safety-critical applications
can be implemented. Example applications include, but are not limited to, automotive
control systems, nuclear reactor control systems, flight craft control systems and financial
transaction processing systems.
7.4.3 Power supply noise testing
The power supply is a critical element in digital circuit. Noise on the power supply can
cause malfunction in all digital and analog systems. To further demonstrate the robust-
108
ness of our system, power supply noise testing is required. In this test, noise with differ-
ent strength could be injected to the power supply, the FPGA operation results would be
recorded and compare with the device under normal power supply.
7.4.4 Elevated temperature testing
The temperature of the device working environment affects the circuit behavior in many
ways. Elevated temperature could harm the semiconductor component, leading to accel-
erated aging and unexpected operating errors. Therefore, examining the circuit function
under elevated temperature and comparing the results from normal temperature could help
us evaluate the system reliability improvement with the self-repairing architecture.
7.4.5 Radiation Testing
Radiation induced error is the dominating source of modern semiconductor devices. More-
over, applications such as aerospace exploration forces the device to be exposed to high
radiation environment. Therefore, a radiation test, where the FPGA is operated under high





SELF-REPAIRING SYSTEM AND BENCHMARK DESIGN VERILOG CODE





// Create Date: 03/13/2019 02:46:19 PM
// Design Name:














parameter NUM = 256,
parameter DATA_WIDTH = 16,
parameter OUT_WIDTH = 32,
parameter DEPTH = $clog2(NUM),
parameter COUNT = 3,
parameter SUM_WIDTH = DATA_WIDTH+DEPTH,
parameter EXTRA = SUM_WIDTH - OUT_WIDTH,





input [DATA_WIDTH * NUM - 1 : 0] terms_flat,
input ld_fin,
output reg valid,
output reg signed [OUT_WIDTH - 1 : 0] sum
);
reg signed [SUM_WIDTH - 1 : 0] pipeline [2*NUM - 1 : 0]; // Pipeline array
reg [COUNT:0] count;
genvar i;// Pack flat terms
generate
for (i = NUM; i < 2 * NUM; i = i + 1) begin
always @ (posedge clk) begin
//pipeline[i] <= { { DEPTH{tems_flat} } ,terms_flat[(i-NUM) * DATA_WIDTH +: DATA_WIDTH]};




// Add terms logarithmically
generate
for (i = 1; i < NUM; i = i + 1) begin
always @ (posedge clk) begin




reg dly_0, dly_1, dly_2, dly_3, dly_4, dly_5, dly_6, dly_7, dly_8;































if (EXTRA>0) begin // reduce result size to fit output
always @(posedge clk or negedge rst_n)
begin
if (!rst_n) sum <= { SUM_WIDTH{1’b0} };
else if (valid==1’b1)
begin
if (pipeline[1][SUM_WIDTH - 1] == 1’b0)
begin
if ( pipeline[1][SUM_WIDTH - 1 -: EXTRA] == {EXTRA{1’b0}} ) //no overflow
sum <= pipeline[1][OUT_WIDTH-1:0];
else // overflow, return maximum positive number




if ( pipeline[1][SUM_WIDTH - 1 -: EXTRA] == {EXTRA{1’b1}} ) //no underflow
sum <= pipeline[1][OUT_WIDTH-1:0];
else // underflow, return minimum negative number





else begin // no reduction
always @(posedge clk or negedge rst_n)
begin
if (!rst_n) sum <= { SUM_WIDTH{1’b0} };










// Create Date: 03/09/2017 09:54:51 AM
// Design Name:




























else if(errors && ber_en)
begin




























// Create Date: 03/06/2017 04:24:46 PM
// Design Name:



























































































































































































































































































// Create Date: 04/01/2019 03:27:54 PM
// Design Name:













// Bit Errors Rate Case: BERC
// ber = 1 / (2 ** (8-BERC) )
// for example BERC==0: ber = 1/256; 7: ber = 1/2
module errInject






output reg [3:0] output_layer_error,
output reg [319:0] hidden_layer_error
);
reg [30:0] sh;
always @(posedge clk or negedge rst_n)
begin
if (!rst_n) sh <= 31’h7fff_ffff;




always @(posedge clk or negedge rst_n)
begin
if (!rst_n) hidden_layer_error <= 320’h0;













//hl_threshold <= 31’h1 << sh;
// hl_threshold <= {8’h0,error_rate,19’h0};
hl_threshold <= {4’h0,error_rate,23’h0}; // for simulatoin waveform
if (lfsr_reg_o < hl_threshold) random <= 1’b1;
else random <= 1’b0;
end
end
always @(posedge clk or negedge rst_n)
begin
if (!rst_n) output_layer_error <= 4’h0;




.clk(clk), // input clock









// Create Date: 03/18/2019 09:50:23 PM
// Design Name:














parameter UNIT_NUM = 49,
parameter NODE_NUM = 10,
parameter OUT_WIDTH = 32,
































































































































































// Create Date: 04/01/2019 02:39:01 PM
// Design Name:













module lfsr #(parameter SEED = 31’h55_5555)(
input clk, // input clock





always @(posedge clk or negedge rst_n)
begin
if (!rst_n) lfsr_reg_r <= SEED;
else lfsr_reg_r <= next_lfsr_reg;//next_lfsr_reg;
end
assign lfsr_reg_o = lfsr_reg_r;
assign next_lfsr_reg[30] = lfsr_reg_r[28];
assign next_lfsr_reg[29] = lfsr_reg_r[27];
assign next_lfsr_reg[28] = lfsr_reg_r[26];
assign next_lfsr_reg[27] = lfsr_reg_r[25];
assign next_lfsr_reg[26] = lfsr_reg_r[24];
assign next_lfsr_reg[25] = lfsr_reg_r[23];
assign next_lfsr_reg[24] = lfsr_reg_r[22];
assign next_lfsr_reg[23] = lfsr_reg_r[21];
assign next_lfsr_reg[22] = lfsr_reg_r[20];
assign next_lfsr_reg[21] = lfsr_reg_r[19];
assign next_lfsr_reg[20] = lfsr_reg_r[18];
assign next_lfsr_reg[19] = lfsr_reg_r[17];
assign next_lfsr_reg[18] = lfsr_reg_r[16];
assign next_lfsr_reg[17] = lfsr_reg_r[15];
assign next_lfsr_reg[16] = lfsr_reg_r[14];
assign next_lfsr_reg[15] = lfsr_reg_r[13];
assign next_lfsr_reg[14] = lfsr_reg_r[12];
assign next_lfsr_reg[13] = lfsr_reg_r[11];
assign next_lfsr_reg[12] = lfsr_reg_r[10];
assign next_lfsr_reg[11] = lfsr_reg_r[9];
assign next_lfsr_reg[10] = lfsr_reg_r[8];
assign next_lfsr_reg[9] = lfsr_reg_r[7];
assign next_lfsr_reg[8] = lfsr_reg_r[6];
assign next_lfsr_reg[7] = lfsr_reg_r[5];
assign next_lfsr_reg[6] = lfsr_reg_r[4];
assign next_lfsr_reg[5] = lfsr_reg_r[3];
assign next_lfsr_reg[4] = lfsr_reg_r[2];
assign next_lfsr_reg[3] = lfsr_reg_r[1];
assign next_lfsr_reg[2] = lfsr_reg_r[0];
assign next_lfsr_reg[1] = ˜(lfsr_reg_r[30] ˆ lfsr_reg_r[27]);
assign next_lfsr_reg[0] = ˜(lfsr_reg_r[29] ˆ lfsr_reg_r[26]);
endmodule





// Create Date: 03/11/2019 04:45:14 PM
// Design Name:
121

















input signed [31:0] a,
input signed [31:0] b,
output reg signed [31:0] out,

















// Create Date: 03/11/2019 04:45:14 PM
// Design Name:


















output reg [3:0] pred
);
wire signed [31:0] a0, a1, a2, a3, a4, a5, a6, a7, a8, a9;
wire signed [31:0] m01, m23, m45, m67, m89, mL10, mL11, mL20, mL30;
wire en_m01, en_m23, en_m45, en_m67, en_m89, en_mL10, en_mL11, en_mL20, en_mL30;






























if (valid) pred <= maxid_L30;
end
reg [3:0] index0 = 4’h0;
reg [3:0] index1 = 4’h1;
reg [3:0] index2 = 4’h2;
reg [3:0] index3 = 4’h3;
reg [3:0] index4 = 4’h4;
reg [3:0] index5 = 4’h5;
reg [3:0] index6 = 4’h6;
reg [3:0] index7 = 4’h7;
reg [3:0] index8 = 4’h8;









































































// Create Date: 03/11/2019 04:45:14 PM
// Design Name:














parameter DATA_WIDTH = 8,




input signed [DATA_WIDTH-1:0] data,
input signed [7:0] weight,





if (!rst_n) product <= 32’h0;
else product <= data * weight;
end
endmodule





// Create Date: 03/19/2019 12:43:18 AM
// Design Name:



















































































































if (inj_mode[0] == 1’b1) L1_results_with_err <= L1_results ˆ hidden_layer_error;




if (inj_mode[1] == 1’b1) pred <= output_layer_result ˆ output_layer_error;















always @(posedge clk or negedge rst_n)
begin
if (!rst_n) en_ld_dly <= 1’b0;
else en_ld_dly <= en_ld;
end
always @(posedge clk or negedge rst_n)
begin
if (!rst_n) new <= 1’b0;
else if (en_ld == 1’b0 && en_ld_dly == 1’b1) new <= en_ld;




if (en_ld | en_ld_dly)// && ˜load_weights_finished)
begin
temp_weights_0 <= {node_weight_0, temp_weights_0[‘UNIT_NUM*8-1 : 8] };
temp_weights_1 <= {node_weight_1, temp_weights_1[‘UNIT_NUM*8-1 : 8] };
temp_weights_2 <= {node_weight_2, temp_weights_2[‘UNIT_NUM*8-1 : 8] };
temp_weights_3 <= {node_weight_3, temp_weights_3[‘UNIT_NUM*8-1 : 8] };
temp_weights_4 <= {node_weight_4, temp_weights_4[‘UNIT_NUM*8-1 : 8] };
temp_weights_5 <= {node_weight_5, temp_weights_5[‘UNIT_NUM*8-1 : 8] };
temp_weights_6 <= {node_weight_6, temp_weights_6[‘UNIT_NUM*8-1 : 8] };
temp_weights_7 <= {node_weight_7, temp_weights_7[‘UNIT_NUM*8-1 : 8] };
temp_weights_8 <= {node_weight_8, temp_weights_8[‘UNIT_NUM*8-1 : 8] };





if (en_ld | en_ld_dly)//&& ˜load_data_finished)
begin

















































always @(posedge clk or negedge rst_n)
begin
if (!rst_n) i <= 32’h0;
else if ( en_ld ) i <= 32’hffff_ffff;
else i <= i + 1’b1;
end
reg loading;
always @(posedge clk or negedge rst_n)
begin
if (!rst_n) loading <= 1’b0;
else if ( en_ld ) loading <= 1’b1;
else if (i == 32’d783) loading <= 1’b0;
end
*/
always @(posedge clk or negedge rst_n)
begin
if (!rst_n) i <= 32’hffff_ffff;
else if (en_ld | en_ld_dly) i <= i + 1’b1;
else i <= 32’hffff_ffff;
end
endmodule





// Create Date: 03/18/2019 10:51:22 PM
// Design Name:














parameter UNIT_NUM = 49,
parameter OUT_WIDTH = 32,










output reg signed [OUT_WIDTH-1:0] node_result
);
reg signed [OUT_WIDTH-1:0] result;
wire [OUT_WIDTH-1:0] partial_sum;
wire unit_valid;




























always @(posedge clk or negedge rst_n)
begin
if (!rst_n) count <= ’b0;
else if (unit_valid_dly4) count <= count + 1’b1;
else if (node_valid) count <= ’b0;
end
always @(posedge clk or negedge rst_n)
begin
if (!rst_n) result <= { OUT_WIDTH{1’b0} };
else if (new) result <= { OUT_WIDTH{1’b0} };
else if (unit_valid_dly4) result <= result + partial_sum;
end
always @(posedge clk or negedge rst_n)
begin
if (!rst_n) node_valid <= 1’b0;
else if (count == 5’h10) node_valid <= 1’b1;
else node_valid <= 1’b0;
end
always @(posedge clk or negedge rst_n)
begin
if (!rst_n) node_result <= { OUT_WIDTH{1’b0} };









// Create Date: 03/18/2019 10:15:50 PM
// Design Name:














parameter NUM = 49,
parameter DATA_WIDTH = 8,
parameter OUT_WIDTH = 32,
parameter EXTRA = $clog2(NUM),
//parameter DEPTH = $clog2(EXTRA),
//parameter SUM_WIDTH = DATA_WIDTH+8+EXTRA
//parameter N1_DATA_WIDTH =8,








output signed [OUT_WIDTH-1:0] result
);
wire [ADDER_WIDTH-1:0] product [0:NUM];







//wire [SUM_WIDTH:0] temp [EXTRA-1:0];
//reg signed [SUM_WIDTH:0] sum;
genvar gi;
generate












.data(data[ DATA_WIDTH*(gi+1) - 1 -: DATA_WIDTH ]),
.weight(weight[8*(gi+1) - 1 -: 8]),
.product(product[gi])
);























// Create Date: 03/08/2017 09:58:10 PM
// Design Name:

















































// Create Date: 04/04/2019 12:58:43 AM
// Design Name:










































assign sys_rst = GPIO_SW_N;
assign sys_rst_n = ˜ sys_rst;
wire rst_n;
assign rst_n = ˜ sys_rst;
/*
reg rst_n;







.DIFF_TERM(”FALSE”), // Differential Termination
.IBUF_LOW_PWR(”TRUE”), // Low power="TRUE", Highest performance="FALSE"
.IOSTANDARD(”DEFAULT”) // Specify the input I/O standard
) IBUFGDS_inst (
.O(uart_clk), // Clock buffer output
.I(SYSCLK_P), // Diff_p clock buffer input (connect directly to top-level port)




// Clock out ports
.clk_out1(cell_clk), // output clk_out1
// Status and control signals
.resetn(sys_rst_n), // input resetn
.locked(GPIO_LED_3_LS), // output locked
132





assign GPIO_LED_6_LS = rx_fifo_empty;
assign GPIO_LED_4_LS = tx_fifo_empty;
////////// debug end /////////////
wire [7:0] rx_byte;
wire [7:0] tx_byte;
















reg signed [7:0] node_mem_weights_0 [0:783];
reg signed [7:0] node_mem_weights_1 [0:783];
reg signed [7:0] node_mem_weights_2 [0:783];
reg signed [7:0] node_mem_weights_3 [0:783];
reg signed [7:0] node_mem_weights_4 [0:783];
reg signed [7:0] node_mem_weights_5 [0:783];
reg signed [7:0] node_mem_weights_6 [0:783];
reg signed [7:0] node_mem_weights_7 [0:783];
reg signed [7:0] node_mem_weights_8 [0:783];
reg signed [7:0] node_mem_weights_9 [0:783];
initial begin
$readmemb(”C : / Use r s / jyang356 / Documents / p h d r e s e a r c h / i t c 2 0 1 9 / m n i s t n n / node 0 .mem”, node_mem_weights_0, 0, 783);
$readmemb(”C : / Use r s / jyang356 / Documents / p h d r e s e a r c h / i t c 2 0 1 9 / m n i s t n n / node 1 .mem”, node_mem_weights_1, 0, 783);
$readmemb(”C : / Use r s / jyang356 / Documents / p h d r e s e a r c h / i t c 2 0 1 9 / m n i s t n n / node 2 .mem”, node_mem_weights_2, 0, 783);
$readmemb(”C : / Use r s / jyang356 / Documents / p h d r e s e a r c h / i t c 2 0 1 9 / m n i s t n n / node 3 .mem”, node_mem_weights_3, 0, 783);
$readmemb(”C : / Use r s / jyang356 / Documents / p h d r e s e a r c h / i t c 2 0 1 9 / m n i s t n n / node 4 .mem”, node_mem_weights_4, 0, 783);
$readmemb(”C : / Use r s / jyang356 / Documents / p h d r e s e a r c h / i t c 2 0 1 9 / m n i s t n n / node 5 .mem”, node_mem_weights_5, 0, 783);
$readmemb(”C : / Use r s / jyang356 / Documents / p h d r e s e a r c h / i t c 2 0 1 9 / m n i s t n n / node 6 .mem”, node_mem_weights_6, 0, 783);
$readmemb(”C : / Use r s / jyang356 / Documents / p h d r e s e a r c h / i t c 2 0 1 9 / m n i s t n n / node 7 .mem”, node_mem_weights_7, 0, 783);
$readmemb(”C : / Use r s / jyang356 / Documents / p h d r e s e a r c h / i t c 2 0 1 9 / m n i s t n n / node 8 .mem”, node_mem_weights_8, 0, 783);








if (prog_full) delay_count <= 8’b0;




if (prog_full) prog_full_ext <= 1’b1;
else if (delay_count == 8’hff) prog_full_ext <= 1’b0;
end







reg start_count, pulse, pulse_dly;










if (prog_full_syn2 == 1’b1) pulse <= 1’b1;
else pulse <= 1’b0;
pulse_dly <= pulse;
if (pulse == 1’b1 && pulse_dly == 1’b0) start_count <= 1’b1;
else start_count <= 1’b0;
end
end
always @(posedge cell_clk or negedge rst_n)
begin
if (!rst_n) i <= 32’h0;
else if ( start_count ) i <= 32’h0;





if ( start_count ) loading <= 1’b1;
else if (i == 32’d783) loading <= 1’b0;
//else loading <= 1’b0;
end
reg [9:0] j;
always @ (posedge uart_clk or negedge sys_rst_n)
begin
if (!sys_rst_n) j <= 10’b0;
else if ( uart_tx_done ) j <= 10’b0;
else if ( reseting ) j <= j + 1’b1;
else j <= 10’b0;
end
always @ (posedge uart_clk or negedge sys_rst_n)
begin
if (!sys_rst_n) reseting <= 1’b1;
else if ( uart_tx_done ) reseting <= 1’b1;
























always @(posedge cell_clk or negedge rst_n)
begin
if (!rst_n) loading_dly <= 1’b0;
else loading_dly <= loading;
end











if (nn_valid_flag_dly == 1’b1 && nn_valid_flag_dly2 == 1’b0) fifo_tx_wr_en <= 1’b1;
else fifo_tx_wr_en <= 1’b0;
end
wire [3:0] error_rate;
wire [319:0] hidden_layer_error0, hidden_layer_error1, hidden_layer_error2, hidden_layer_error3;
wire [3:0] output_layer_error0,output_layer_error1, output_layer_error2, output_layer_error3;





if (uart_tx_done) repeat_delay_count <= 8’b0;




if (uart_tx_done) uart_tx_done_ext <= 1’b1;
else if (repeat_delay_count == 8’hff) uart_tx_done_ext <= 1’b0;
end








reg [1:0] inj_mode, rebuf;
//assign GPIO_LED_1_LS = rebuf[1];
//assign GPIO_LED_0_LS = rebuf[0];
assign GPIO_LED_1_LS = inj_mode[1];
assign GPIO_LED_0_LS = inj_mode[0];
//always @(posedge cell_clk)
//begin
















if (inj_sw == 1’b1 && inj_sw_dly == 1’b0) inj_sw_pulse <= 1’b1;
else inj_sw_pulse <= 1’b0;
end
always @(posedge cell_clk or negedge rst_n)
begin
if (!rst_n) inj_mode <= 2’b0;














































assign error_rate = {GPIO_DIP_SW3, GPIO_DIP_SW2, GPIO_DIP_SW1, GPIO_DIP_SW0};
wire [3:0] pred0, pred1,pred2,pred3;
wire [3:0] in0_qmr, in1_qmr, in2_qmr, in3_qmr;
assign in0_qmr = {pred0[0], pred1[0], pred2[0], pred3[0]};
assign in1_qmr = {pred0[1], pred1[1], pred2[1], pred3[1]};
assign in2_qmr = {pred0[2], pred1[2], pred2[2], pred3[2]};
assign in3_qmr = {pred0[3], pred1[3], pred2[3], pred3[3]};







































































































































.wr_clk(uart_clk), // input wire wr_clk
.rd_clk(cell_clk), // input wire rd_clk
.din(rx_byte), // input wire [7 : 0] din
.wr_en(valid_rx), // input wire wr_en
.rd_en(loading), // input wire rd_en
.dout(rx_data), // output wire [7 : 0] dout
.full(GPIO_LED_7_LS),// cts?rts? // output wire full
.prog_full(prog_full),









always @(posedge cell_clk or negedge rst_n)
begin
if (!rst_n) fifo_tx_wr_done <= 1’b0;
else if (fifo_tx_wr_ack==1’b0 && fifo_tx_wr_ack_dly == 1’b1) fifo_tx_wr_done <= 1’b1;
else fifo_tx_wr_done <= 1’b0;
end
reg tx_fifo_rd_en_in, tx_fifo_rd_en_dly, tx_fifo_rd_en_dly2, tx_fifo_rd_en_dly3, tx_fifo_rd_en;
138
















if (tx_fifo_rd_en_dly3 == 1’b1 && tx_fifo_rd_en_dly2 == 1’b0) tx_fifo_rd_en <= 1’b1;




.wr_clk(cell_clk), // input wire wr_clk
.rd_clk(uart_clk), // input wire rd_clk
.din(pred_byte), // input wire [7 : 0] din
.wr_en(fifo_tx_wr_en), // input wire wr_en
.rd_en(tx_fifo_rd_en), // input wire rd_en
.dout(tx_data), // output wire [7 : 0] dout
.full(GPIO_LED_5_LS), // output wire full
.wr_ack(fifo_tx_wr_ack), // output wire wr_ack
.valid(fifo_tx_rd_valid), // output wire valid


















// dubug module ============================================
/*
load_debug your_instance_name (
.clk(cell_clk), // input wire clk
.probe0(loading), // input wire [0:0] probe0
.probe1(pulse), // input wire [0:0] probe1
.probe2( pred_byte), // input wire [0:0] probe2
.probe3(temp_weight_2), // input wire [0:0] probe3
.probe4(i) // input wire [31:0] probe4
);
nn_debug NN_INS (
.clk(cell_clk), // input wire clk
.probe0(temp_weight_0), // input wire [7:0] probe0
.probe1(temp_weight_1), // input wire [7:0] probe1
.probe2(temp_weight_2), // input wire [7:0] probe2
.probe3(temp_weight_3), // input wire [7:0] probe3
.probe4(temp_weight_4), // input wire [7:0] probe4
.probe5(temp_weight_5), // input wire [7:0] probe5
139
.probe6(temp_weight_6), // input wire [7:0] probe6
.probe7(temp_weight_7), // input wire [7:0] probe7
.probe8(temp_weight_8), // input wire [7:0] probe8
.probe9(temp_weight_9), // input wire [7:0] probe9
.probe10(rst_n), // input wire [0:0] probe10
.probe11(rx_data), // input wire [7:0] probe11
.probe12(loading), // input wire [0:0] probe12
.probe13(nn_valid), // input wire [0:0] probe13









// Create Date: 03/17/2019 03:26:40 PM
// Design Name:














#(parameter CLKS_PER_BIT = 1736,
parameter WIDTH = $clog2(CLKS_PER_BIT),
parameter s_IDLE = 3’b000,
parameter s_RX_START_BIT = 3’b001,
parameter s_RX_DATA_BITS = 3’b010,
parameter s_RX_STOP_BIT = 3’b011,







//parameter CLKS_PER_BIT = 87 ;
reg r_Rx_Data_R = 1’b1;
reg r_Rx_Data = 1’b1;
reg [WIDTH:0] r_Clock_Count = 0;
reg [2:0] r_Bit_Index = 0; //8 bits total
reg [7:0] r_Rx_Byte = 0;
reg r_Rx_DV = 0;
reg [2:0] r_SM_Main = 0;
// Purpose: Double-register the incoming data.
// This allows it to be used in the UART RX Clock Domain.





















// Check middle of start bit to make sure it’s still low
s_RX_START_BIT :
begin
if (r_Clock_Count == (CLKS_PER_BIT-1)/2)
begin
if (r_Rx_Data == 1’b0)
begin








r_Clock_Count <= r_Clock_Count + 1;
r_SM_Main <= s_RX_START_BIT;
end
end // case: s_RX_START_BIT
// Wait CLKS_PER_BIT-1 clock cycles to sample serial data
s_RX_DATA_BITS :
begin
if (r_Clock_Count < CLKS_PER_BIT-1)
begin







// Check if we have received all bits
if (r_Bit_Index < 7)
begin









end // case: s_RX_DATA_BITS
// Receive Stop bit. Stop bit = 1
s_RX_STOP_BIT :
begin
// Wait CLKS_PER_BIT-1 clock cycles for Stop bit to finish
if (r_Clock_Count < CLKS_PER_BIT-1)
begin










end // case: s_RX_STOP_BIT










assign o_Rx_DV = r_Rx_DV;
assign o_Rx_Byte = r_Rx_Byte;
endmodule // uart_rx





// Create Date: 03/17/2019 03:26:39 PM
// Design Name:














#(parameter CLKS_PER_BIT = 1736,
parameter WIDTH = $clog2(CLKS_PER_BIT),
parameter s_IDLE = 3’b000,
parameter s_TX_START_BIT = 3’b001,
parameter s_TX_DATA_BITS = 3’b010,
parameter s_TX_STOP_BIT = 3’b011,









reg [2:0] r_SM_Main = 0;
reg [WIDTH:0] r_Clock_Count = 0;
reg [2:0] r_Bit_Index = 0;
reg [7:0] r_Tx_Data = 0;
reg r_Tx_Done = 0;



















end // case: s_IDLE




// Wait CLKS_PER_BIT-1 clock cycles for start bit to finish
if (r_Clock_Count < CLKS_PER_BIT-1)
begin








end // case: s_TX_START_BIT




if (r_Clock_Count < CLKS_PER_BIT-1)
begin






// Check if we have sent out all bits
if (r_Bit_Index < 7)
begin









end // case: s_TX_DATA_BITS




// Wait CLKS_PER_BIT-1 clock cycles for Stop bit to finish
if (r_Clock_Count < CLKS_PER_BIT-1)
begin











end // case: s_Tx_STOP_BIT










assign o_Tx_Active = r_Tx_Active;
assign o_Tx_Done = r_Tx_Done;
endmodule





// Create Date: 03/06/2017 04:24:05 PM
// Design Name:




















assign a = inputs[2];
assign b = inputs[1];
assign c = inputs[0];
assign d = inputs[3];
reg x,y,z;













































from __future__ import absolute_import




""" Abstact class for every element in parser """
def children(self):
pass
def show(self, buf=sys.stdout, offset=0, attrnames=False, showlineno=True):
indent = 2
lead = ’ ’ * offset
buf.write(lead + self.__class__.__name__ + ’ : ’)
if self.attr_names:
if attrnames:
nvlist = [(n, getattr(self, n)) for n in self.attr_names]
attrstr = ’ , ’.join( ’%s=%s ’ % (n, v) for (n, v) in nvlist)
else:
vlist = [getattr(self, n) for n in self.attr_names]
attrstr = ’ , ’.join( ’%s ’ % v for v in vlist)
buf.write(attrstr)
if showlineno:
buf.write( ’ ( a t %s ) ’ % self.lineno)
buf.write( ’\n ’)
for c in self.children():
c.show(buf, offset + indent, attrnames, showlineno)
def __eq__(self, other):
if type(self) != type(other):
return False
self_attrs = tuple([getattr(self, a) for a in self.attr_names])
other_attrs = tuple([getattr(other, a) for a in other.attr_names])
if self_attrs != other_attrs:
return False
other_children = other.children()
for i, c in enumerate(self.children()):











attr_names = ( ’ name ’,)





















attr_names = ( ’ name ’,)





































attr_names = ( ’ name ’, ’ t y p e ’,)



























attr_names = ( ’ name ’,)










if self.scope is None:
return self.name
return self.scope.__repr__() + ’ . ’ + self.name
class Value(Node):
attr_names = ()









attr_names = ( ’ v a l u e ’,)
















attr_names = ( ’ name ’, ’ s i g n e d ’)























attr_names = ( ’ name ’, ’ s i g n e d ’)














attr_names = ( ’ name ’, ’ s i g n e d ’)


































attr_names = ( ’ name ’, ’ s i g n e d ’)






















































































































ret = ’ ( ’ + self.__class__.__name__
for c in self.children():
ret += ’ ’ + c.__repr__()





























































































































































attr_names = ( ’ t y p e ’,)
def __init__(self, sig, type= ’ posedge ’, lineno=0):
self.lineno = lineno
self.sig = sig














































































































attr_names = ( ’ scope ’,)
































































attr_names = ( ’ module ’,)













attr_names = ( ’ name ’, ’ module ’)
158

















attr_names = ( ’ paramname ’,)










attr_names = ( ’ por tname ’,)










attr_names = ( ’ name ’,)































attr_names = ( ’ name ’,)

































attr_names = ( ’ s y s c a l l ’,)











ret.append( ’ ( ’)
ret.append( ’ $ ’)
ret.append(self.syscall)
160
for a in self.args:
ret.append( ’ ’)
ret.append(str(a))
ret.append( ’ ) ’)
return ’ ’.join(ret)
class IdentifierScopeLabel(Node):
attr_names = ( ’ name ’, ’ l oop ’)




























attr_names = ( ’ name ’, )










attr_names = ( ’ d e s t ’,)







attr_names = ( ’ s cope ’,)
161




















attr_names = ( ’ code ’,)





from __future__ import absolute_import






from jinja2 import Environment, FileSystemLoader
from pyverilog.vparser.ast import *
from pyverilog.utils.op2mark import op2mark
from pyverilog.utils.op2mark import op2order






def indent(text, prefix, predicate=None):
if predicate is None:
def predicate(x): return x and not x.isspace()
ret = []








texts = text.split( ’\n ’)
if len(texts) <= 1:
return text
try:











method = ’ v i s i t ’ + node.__class__.__name__








return node.__class__.__name__.lower() + ’ . t x t ’
def escape(s):
if s.startswith( ’\\’):
return s + ’ ’
return s
def del_paren(s):








self.indent = functools.partial(indent, prefix= ’ ’ * indentsize)
self.template_cache = {}
def get_template(self, filename):
























paramlist = self.indent(self.visit(node.paramlist)) if node.paramlist is not None else ’ ’
portlist = self.indent(self.visit(node.portlist)) if node.portlist is not None else ’ ’
template_dict = {
163
’ modulename ’: escape(node.name),
’ p a r a m l i s t ’: paramlist,
’ p o r t l i s t ’: portlist,







params = [self.visit(param).replace( ’ ; ’, ’ ’) for param in node.params]
template_dict = {
’ params ’: params,







ports = [self.visit(port) for port in node.ports]
template_dict = {
’ p o r t s ’: ports,


































’ name ’: escape(node.name),

















































’ name ’: escape(node.name),
’ w id th ’: ’ ’ if node.width is None else self.visit(node.width),








’ name ’: escape(node.name),
’ w id th ’: ’ ’ if node.width is None else self.visit(node.width),








’ name ’: escape(node.name),
’ w id th ’: ’ ’ if node.width is None else self.visit(node.width),








’ name ’: escape(node.name),
’ w id th ’: ’ ’ if node.width is None else self.visit(node.width),









’ name ’: escape(node.name),
’ w id th ’: ’ ’ if node.width is None else self.visit(node.width),








’ name ’: escape(node.name),
’ w id th ’: ’ ’ if node.width is None else self.visit(node.width),








’ name ’: escape(node.name),
’ w id th ’: ’ ’ if node.width is None else self.visit(node.width),








’ name ’: escape(node.name),
’ w id th ’: ’ ’ if node.width is None else self.visit(node.width),
’ l e n g t h ’: self.visit(node.length),








’ name ’: escape(node.name),
’ w id th ’: ’ ’ if node.width is None else self.visit(node.width),
’ l e n g t h ’: self.visit(node.length),








’ name ’: escape(node.name),

























’ f i r s t ’: node.first.__class__.__name__.lower(),
’ second ’: ’ ’ if node.second is None else node.second.__class__.__name__.lower(),
’ name ’: escape(node.first.name),
’ w id th ’: ’ ’ if node.first.width is None else self.visit(node.first.width),









’ name ’: escape(node.name),
’ w id th ’: ’ ’ if node.width is None or (value.startswith( ’ ” ’) and value.endswith( ’ ” ’)) else self.visit(node.width),
’ v a l u e ’: value,









’ name ’: escape(node.name),
’ w id th ’: ’ ’ if node.width is None or (value.startswith( ’ ” ’) and value.endswith( ’ ” ’)) else self.visit(node.width),
’ v a l u e ’: value,















items = [del_paren(self.visit(item)) for item in node.list]
template_dict = {
’ i t e m s ’: items,







items = [del_paren(self.visit(item)) for item in node.list]
template_dict = {
’ i t e m s ’: items,









’ v a l u e ’: del_paren(self.visit(node.value)),








’ v a r ’: self.visit(node.var),
’msb ’: del_space(del_paren(self.visit(node.msb))),








’ v a r ’: self.visit(node.var),




























if ((not isinstance(node.left, (Sll, Srl, Sra,
LessThan, GreaterThan, LessEq, GreaterEq,
Eq, NotEq, Eql, NotEql))) and
(lorder is not None and lorder <= order)):
left = del_paren(left)
if ((not isinstance(node.right, (Sll, Srl, Sra,
LessThan, GreaterThan, LessEq, GreaterEq,
Eq, NotEq, Eql, NotEql))) and
(rorder is not None and order > rorder)):
right = del_paren(right)
template_dict = {
’ l e f t ’: left,
’ r i g h t ’: right,










’ r i g h t ’: right,













































































false_value = ’ ’.join([ ’\n ’, false_value])
template_dict = {
’ cond ’: del_paren(self.visit(node.cond)),
’ t r u e v a l u e ’: true_value,








’ l e f t ’: self.visit(node.left),









’ s e n s l i s t ’: self.visit(node.sens_list),







items = [self.visit(item) for item in node.list]
template_dict = {
’ i t e m s ’: items,









’ s i g ’: ’∗ ’ if node.type == ’ a l l ’ else self.visit(node.sig),








’ l e f t ’: self.visit(node.left),
’ r i g h t ’: self.visit(node.right),
’ l d e l a y ’: ’ ’ if node.ldelay is None else self.visit(node.ldelay),









’ l e f t ’: self.visit(node.left),
’ r i g h t ’: self.visit(node.right),
’ l d e l a y ’: ’ ’ if node.ldelay is None else self.visit(node.ldelay),









’ l e f t ’: self.visit(node.left),
’ r i g h t ’: self.visit(node.right),
’ l d e l a y ’: ’ ’ if node.ldelay is None else self.visit(node.ldelay),








true_statement = ’ ’ if node.true_statement is None else self.visit(node.true_statement)
false_statement = ’ ’ if node.false_statement is None else self.visit(node.false_statement)
template_dict = {
’ cond ’: del_paren(self.visit(node.cond)),
’ t r u e s t a t e m e n t ’: true_statement,








’ p r e ’: ’ ’ if node.pre is None else del_space(self.visit(node.pre)),
’ cond ’: ’ ’ if node.cond is None else del_space(del_paren(self.visit(node.cond))),
’ p o s t ’: ’ ’ if node.post is None else del_space(self.visit(node.post).replace( ’ ; ’, ’ ’)),








’ cond ’: ’ ’ if node.cond is None else del_paren(self.visit(node.cond)),
171








’ comp ’: del_paren(self.visit(node.comp)),








’ comp ’: del_paren(self.visit(node.comp)),







condlist = [ ’ d e f a u l t ’] if node.cond is None else [
del_paren(self.visit(c)) for c in node.cond]
cond = []
for c in condlist:
cond.append(c)
cond.append( ’ , ’)
template_dict = {
’ cond ’: ’ ’.join(cond[:-1]),








’ s cope ’: ’ ’ if node.scope is None else escape(node.scope),
























’ cond ’: del_paren(self.visit(node.cond)),
























parameterlist = [self.indent(self.visit(param)) for param in node.parameterlist]
instances = [self.visit(instance) for instance in node.instances]
template_dict = {
’ module ’: escape(node.module),
’ p a r a m e t e r l i s t ’: parameterlist,
’ l e n p a r a m e t e r l i s t ’: len(parameterlist),
’ i n s t a n c e s ’: instances,







array = ’ ’ if node.array is None else self.visit(node.array)
portlist = [self.indent(self.visit(port)) for port in node.portlist]
template_dict = {
’ name ’: escape(node.name),
’ a r r a y ’: array,
’ p o r t l i s t ’: portlist,








’ paramname ’: ’ ’ if node.paramname is None else escape(node.paramname),








’ por tname ’: ’ ’ if node.portname is None else escape(node.portname),







statement = [self.indent(self.visit(s)) for s in node.statement]
template_dict = {
’ name ’: escape(node.name),
’ r e t w i d t h ’: self.visit(node.retwidth),








args = [self.visit(arg) for arg in node.args]
template_dict = {
’ name ’: self.visit(node.name),
’ a r g s ’: args,







statement = [self.indent(self.visit(s)) for s in node.statement]
template_dict = {
’ name ’: escape(node.name),




# def visit_TaskCall(self, node):
# filename = getfilename(node)
# template = self.get_template(filename)
# args = [ self.visit(arg) for arg in node.args ]
# template_dict = {
# ’name’ : self.visit(node.name),
# ’args’ : args,
# ’len_args’ : len(args),
# }













args = [self.visit(arg) for arg in node.args]
template_dict = {
’ s y s c a l l ’: escape(node.syscall),
’ a r g s ’: args,








’ name ’: escape(node.name),







scopes = [self.visit(scope) for scope in node.labellist]
template_dict = {

















’ name ’: escape(node.name),
















’ s cope ’: ’ ’ if node.scope is None else escape(node.scope),














from __future__ import absolute_import




from pyverilog.vparser.ply.lex import *
class VerilogLexer(object):
""" Verilog HDL Lexical Analayzer """
def __init__(self, error_func):
self.filename = ’ ’
self.error_func = error_func
self.directives = []
self.default_nettype = ’ w i r e ’
def build(self, **kwargs):













’MODULE’, ’ENDMODULE’, ’BEGIN ’, ’END’, ’GENERATE’, ’ENDGENERATE’, ’GENVAR’,
’FUNCTION ’, ’ENDFUNCTION ’, ’TASK ’, ’ENDTASK’,
’INPUT ’, ’INOUT ’, ’OUTPUT ’, ’ TRI ’, ’REG ’, ’LOGIC ’, ’WIRE ’, ’INTEGER ’, ’REAL ’, ’SIGNED ’,
’PARAMETER’, ’LOCALPARAM’, ’SUPPLY0 ’, ’SUPPLY1 ’,
’ASSIGN ’, ’ALWAYS’, ’ALWAYS FF ’, ’ALWAYS COMB’, ’ALWAYS LATCH’, ’SENS OR ’, ’POSEDGE ’, ’NEGEDGE’, ’ INITIAL ’,
’ IF ’, ’ELSE ’, ’FOR ’, ’WHILE ’, ’CASE ’, ’CASEX ’, ’UNIQUE ’, ’ENDCASE ’, ’DEFAULT ’,
’WAIT ’, ’FOREVER ’, ’DISABLE ’, ’FORK’, ’ JOIN ’,
)
reserved = {}
for keyword in keywords:
if keyword == ’SENS OR ’:




’PLUS ’, ’MINUS ’, ’POWER’, ’TIMES ’, ’DIVIDE ’, ’MOD’,
’NOT’, ’OR’, ’NOR’, ’AND’, ’NAND’, ’XOR’, ’XNOR’,
’LOR ’, ’LAND’, ’LNOT’,
’LSHIFTA ’, ’RSHIFTA ’, ’LSHIFT ’, ’RSHIFT ’,




tokens = keywords + operators + (
’ ID ’,
’AT ’, ’COMMA’, ’COLON’, ’SEMICOLON ’, ’DOT’,
’PLUSCOLON ’, ’MINUSCOLON’,
’FLOATNUMBER’, ’STRING LITERAL ’,
’INTNUMBER DEC ’, ’SIGNED INTNUMBER DEC ’,
’INTNUMBER HEX ’, ’SIGNED INTNUMBER HEX ’,
’INTNUMBER OCT ’, ’SIGNED INTNUMBER OCT ’,
’INTNUMBER BIN ’, ’SIGNED INTNUMBER BIN ’,




’COMMENTOUT’, ’LINECOMMENT’, ’DIRECTIVE ’,
)
# Ignore
t_ignore = ’ \ t ’
# Directive




t.lexer.lineno += t.value.count(”\n ”)





linecomment = r ””” / / .∗?\ n ”””
commentout = r ””” /\∗(.|\n )∗?\∗/ ”””
@TOKEN(linecomment)
def t_LINECOMMENT(self, t):





t.lexer.lineno += t.value.count(”\n ”)
pass
# Operator
t_LOR = r ’\|\| ’
t_LAND = r ’\&\&’
t_NOR = r ’ ˜\| ’
t_NAND = r ’˜\& ’
t_XNOR = r ’ ˜\ˆ ’
t_OR = r ’\| ’
t_AND = r ’\&’
t_XOR = r ’\ˆ ’
t_LNOT = r ’ ! ’
t_NOT = r ’ ˜ ’
t_LSHIFTA = r ’<<<’
t_RSHIFTA = r ’>>>’
t_LSHIFT = r ’<<’
t_RSHIFT = r ’>>’
t_EQL = r ’=== ’
t_NEL = r ’ !== ’
t_EQ = r ’== ’
t_NE = r ’ != ’
t_LE = r ’<=’
t_GE = r ’>=’
t_LT = r ’<’
t_GT = r ’>’
t_POWER = r ’\∗\∗’
t_PLUS = r ’\+ ’
t_MINUS = r ’−’
t_TIMES = r ’\∗’
t_DIVIDE = r ’ / ’
t_MOD = r ’%’
t_COND = r ’\? ’
t_EQUALS = r ’= ’
t_PLUSCOLON = r ’\+: ’
t_MINUSCOLON = r ’−: ’
t_AT = r ’@’
t_COMMA = r ’ , ’
t_SEMICOLON = r ’ ; ’
t_COLON = r ’ : ’
t_DOT = r ’\. ’
t_LPAREN = r ’\( ’
t_RPAREN = r ’\) ’
t_LBRACKET = r ’\[ ’
t_RBRACKET = r ’\] ’
t_LBRACE = r ’\{’
t_RBRACE = r ’\}’
t_DELAY = r ’\# ’
t_DOLLER = r ’\$ ’
bin_number = ’ [0−9]∗\’[bB][0−1xXzZ?][0−1xXzZ? ]∗ ’
signed_bin_number = ’ [0−9]∗\’[ sS ] [ bB][0−1xZzZ?][0−1xXzZ? ]∗ ’
octal_number = ’ [0−9]∗\’[oO][0−7xXzZ?][0−7xXzZ? ]∗ ’
signed_octal_number = ’ [0−9]∗\’[ sS ] [ oO][0−7xXzZ?][0−7xXzZ? ]∗ ’
hex_number = ’ [0−9]∗\’[hH][0−9a−fA−FxXzZ?][0−9a−fA−FxXzZ? ]∗ ’
signed_hex_number = ’ [0−9]∗\’[ sS ] [ hH][0−9a−fA−FxXzZ?][0−9a−fA−FxXzZ? ]∗ ’
decimal_number = ’ ([0−9]∗\ ’[dD][0−9xXzZ?][0−9xXzZ? ]∗) |([0−9][0−9 ]∗) ’
signed_decimal_number = ’ [0−9]∗\’[ sS ] [ dD][0−9xXzZ?][0−9xXzZ? ]∗ ’
exponent_part = r ””” ( [ eE][−+]?[0−9]+) ”””
fractional_constant = r ””” ([0−9]∗\.[0−9]+) |([0−9]+\ . ) ”””
float_number = ’ ( ( ( ( ’ + fractional_constant + ’ ) ’ + \
exponent_part + ’ ? ) |([0−9]+ ’ + exponent_part + ’ ) ) ) ’
simple_escape = r ””” ( [ a−zA−Z\\? ’”]) ”””
octal_escape = r ””” ([0−7]{1 ,3}) ”””
hex_escape = r ””” ( x[0−9a−fA−F ] + ) ”””
177
escape_sequence = r ””” (\\( ””” + simple_escape + ’ | ’ + octal_escape + ’ | ’ + hex_escape + ’ ) ) ’
string_char = r ””” ([ˆ”\\\n ]| ””” + escape_sequence + ’ ) ’
string_literal = ’ ” ’ + string_char + ’∗” ’





































t.lexer.lineno += t.value.count(”\n ”)
pass
def t_error(self, t):
msg = ’ I l l e g a l c h a r a c t e r %s ’ % repr(t.value[0])
self._error(msg, t)






while i > 0:
if self.lexer.lexdata[i] == ’\n ’:
break
i -= 1





def my_error_func(msg, a, b):










break # No more input
ret.append(”%s %s %d %s %d\n ” %
(tok.value, tok.type, tok.lineno, lexer.filename, tok.lexpos))
return ’ ’.join(ret)
from __future__ import absolute_import
from __future__ import print_function
import sys
import os
from pyverilog.vparser.ply.yacc import yacc
from pyverilog.vparser.plyparser import PLYParser, Coord, ParseError
from pyverilog.vparser.preprocessor import VerilogPreprocessor
from pyverilog.vparser.lexer import VerilogLexer
from pyverilog.vparser.ast import *
class VerilogParser(PLYParser):





( ’ l e f t ’, ’LOR ’),
( ’ l e f t ’, ’LAND’),
( ’ l e f t ’, ’OR’),
( ’ l e f t ’, ’AND’, ’XOR’, ’XNOR’),
( ’ l e f t ’, ’EQ ’, ’NE ’, ’EQL ’, ’NEL ’),
( ’ l e f t ’, ’LT ’, ’GT ’, ’LE ’, ’GE ’),
( ’ l e f t ’, ’LSHIFT ’, ’RSHIFT ’, ’LSHIFTA ’, ’RSHIFTA ’),
( ’ l e f t ’, ’PLUS ’, ’MINUS ’),
( ’ l e f t ’, ’TIMES ’, ’DIVIDE ’, ’MOD’),
( ’ l e f t ’, ’POWER’),
( ’ r i g h t ’, ’UMINUS ’, ’UPLUS ’, ’ULNOT’, ’UNOT’,








# Use this if you want to build the parser using LALR(1) instead of SLR
self.parser = yacc(module=self, method=”LALR”)







def parse(self, text, debug=0):
return self.parser.parse(text, lexer=self.lexer, debug=debug)
# --------------------------------------------------------------------------




’ s o u r c e t e x t : d e s c r i p t i o n ’
p[0] = Source(name= ’ ’, description=p[1], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_description(self, p):
’ d e s c r i p t i o n : d e f i n i t i o n s ’
p[0] = Description(definitions=p[1], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_definitions(self, p):
’ d e f i n i t i o n s : d e f i n i t i o n s d e f i n i t i o n ’
p[0] = p[1] + (p[2],)
p.set_lineno(0, p.lineno(1))
def p_definitions_one(self, p):













’ pragma : LPAREN TIMES ID EQUALS e x p r e s s i o n TIMES RPAREN ’




’ pragma : LPAREN TIMES ID TIMES RPAREN ’





’ moduledef : MODULE modulename p a r a m l i s t p o r t l i s t i t e m s ENDMODULE’













’ p a r a m l i s t : DELAY LPAREN params RPAREN ’
p[0] = Paramlist(params=p[3], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_paramlist_empty(self, p):
’ p a r a m l i s t : empty ’
p[0] = Paramlist(params=())
def p_params(self, p):
’ params : p a r a m s b e g i n param end ’
p[0] = p[1] + (p[2],)
p.set_lineno(0, p.lineno(1))
def p_params_begin(self, p):
’ p a r a m s b e g i n : p a r a m s b e g i n param ’
180
p[0] = p[1] + (p[2],)
p.set_lineno(0, p.lineno(1))
def p_params_begin_one(self, p):








’ param : PARAMETER p a r a m s u b s t i t u t i o n l i s t COMMA’
paramlist = [Parameter(rname, rvalue, lineno=p.lineno(2))
for rname, rvalue in p[2]]
p[0] = Decl(tuple(paramlist), lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_param_signed(self, p):
’ param : PARAMETER SIGNED p a r a m s u b s t i t u t i o n l i s t COMMA’
paramlist = [Parameter(rname, rvalue, signed=True, lineno=p.lineno(2))
for rname, rvalue in p[3]]
p[0] = Decl(tuple(paramlist), lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_param_width(self, p):
’ param : PARAMETER wid th p a r a m s u b s t i t u t i o n l i s t COMMA’
paramlist = [Parameter(rname, rvalue, p[2], lineno=p.lineno(3))
for rname, rvalue in p[3]]
p[0] = Decl(tuple(paramlist), lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_param_signed_width(self, p):
’ param : PARAMETER SIGNED wid th p a r a m s u b s t i t u t i o n l i s t COMMA’
paramlist = [Parameter(rname, rvalue, p[3], signed=True, lineno=p.lineno(3))
for rname, rvalue in p[4]]
p[0] = Decl(tuple(paramlist), lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_param_integer(self, p):
’ param : PARAMETER INTEGER p a r a m s u b s t i t u t i o n l i s t COMMA’
paramlist = [Parameter(rname, rvalue, lineno=p.lineno(3))
for rname, rvalue in p[3]]
p[0] = Decl(tuple(paramlist), lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_param_end(self, p):
’ param end : PARAMETER p a r a m s u b s t i t u t i o n l i s t ’
paramlist = [Parameter(rname, rvalue, lineno=p.lineno(2))
for rname, rvalue in p[2]]
p[0] = Decl(tuple(paramlist), lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_param_end_signed(self, p):
’ param end : PARAMETER SIGNED p a r a m s u b s t i t u t i o n l i s t ’
paramlist = [Parameter(rname, rvalue, signed=True, lineno=p.lineno(2))
for rname, rvalue in p[3]]
p[0] = Decl(tuple(paramlist), lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_param_end_width(self, p):
’ param end : PARAMETER wid th p a r a m s u b s t i t u t i o n l i s t ’
paramlist = [Parameter(rname, rvalue, p[2], lineno=p.lineno(3))
for rname, rvalue in p[3]]
p[0] = Decl(tuple(paramlist), lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_param_end_signed_width(self, p):
’ param end : PARAMETER SIGNED wid th p a r a m s u b s t i t u t i o n l i s t ’
paramlist = [Parameter(rname, rvalue, p[3], signed=True, lineno=p.lineno(3))
for rname, rvalue in p[4]]
p[0] = Decl(tuple(paramlist), lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_param_end_integer(self, p):
’ param end : PARAMETER INTEGER p a r a m s u b s t i t u t i o n l i s t ’
181
paramlist = [Parameter(rname, rvalue, lineno=p.lineno(3))
for rname, rvalue in p[3]]
p[0] = Decl(tuple(paramlist), lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_portlist(self, p):
’ p o r t l i s t : LPAREN p o r t s RPAREN SEMICOLON ’
p[0] = Portlist(ports=p[2], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_portlist_io(self, p):
’ p o r t l i s t : LPAREN i o p o r t s RPAREN SEMICOLON ’
p[0] = Portlist(ports=p[2], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_portlist_paren_empty(self, p):
’ p o r t l i s t : LPAREN RPAREN SEMICOLON ’
p[0] = Portlist(ports=(), lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_portlist_empty(self, p):
’ p o r t l i s t : SEMICOLON ’
p[0] = Portlist(ports=(), lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_ports(self, p):
’ p o r t s : p o r t s COMMA por tname ’
wid = None
port = Port(name=p[3], width=wid, type=None, lineno=p.lineno(1))
p[0] = p[1] + (port,)
p.set_lineno(0, p.lineno(1))
def p_ports_one(self, p):
’ p o r t s : por tname ’
wid = None








’ s i g t y p e s : s i g t y p e s s i g t y p e ’
p[0] = p[1] + (p[2],)
p.set_lineno(0, p.lineno(1))
def p_sigtypes_one(self, p):













































’ i o p o r t s : i o p o r t s COMMA i o p o r t ’
if isinstance(p[3], str):
t = None
for r in reversed(p[1]):
if isinstance(r.first, Input):
t = Ioport(Input(name=p[3], width=r.first.width, lineno=p.lineno(3)),
lineno=p.lineno(3))
break
if isinstance(r.first, Output) and r.second is None:
t = Ioport(Output(name=p[3], width=r.first.width, lineno=p.lineno(3)),
lineno=p.lineno(3))
break
if isinstance(r.first, Output) and isinstance(r.second, Reg):






t = Ioport(Inout(name=p[3], width=r.first.width, lineno=p.lineno(3)),
lineno=p.lineno(3))
break
p[0] = p[1] + (t,)
else:
p[0] = p[1] + (p[3],)
p.set_lineno(0, p.lineno(1))
def p_ioports_one(self, p):
’ i o p o r t s : i o p o r t h e a d ’
p[0] = (p[1],)
p.set_lineno(0, p.lineno(1))





if ’ s i g n e d ’ in sigtypes:
signed = True
if ’ i n p u t ’ in sigtypes:
first = Input(name=name, width=width, signed=signed, lineno=lineno)
if ’ o u t p u t ’ in sigtypes:
first = Output(name=name, width=width,
signed=signed, lineno=lineno)
if ’ i n o u t ’ in sigtypes:
first = Inout(name=name, width=width, signed=signed, lineno=lineno)
if ’ w i r e ’ in sigtypes:
second = Wire(name=name, width=width, signed=signed, lineno=lineno)
if ’ r e g ’ in sigtypes:
second = Reg(name=name, width=width, signed=signed, lineno=lineno)
if ’ t r i ’ in sigtypes:
183
second = Tri(name=name, width=width, signed=signed, lineno=lineno)
return Ioport(first, second, lineno=lineno)
def typecheck_ioport(self, sigtypes):
if ’ i n p u t ’ not in sigtypes and ’ o u t p u t ’ not in sigtypes and ’ i n o u t ’ not in sigtypes:
raise ParseError(” Syn tax E r r o r ”)
if ’ i n p u t ’ in sigtypes and ’ o u t p u t ’ in sigtypes:
raise ParseError(” Syn tax E r r o r ”)
if ’ i n o u t ’ in sigtypes and ’ o u t p u t ’ in sigtypes:
raise ParseError(” Syn tax E r r o r ”)
if ’ i n o u t ’ in sigtypes and ’ i n p u t ’ in sigtypes:
raise ParseError(” Syn tax E r r o r ”)
if ’ i n p u t ’ in sigtypes and ’ r e g ’ in sigtypes:
raise ParseError(” Syn tax E r r o r ”)
if ’ i n o u t ’ in sigtypes and ’ r e g ’ in sigtypes:
raise ParseError(” Syn tax E r r o r ”)
if ’ i n p u t ’ in sigtypes and ’ t r i ’ in sigtypes:
raise ParseError(” Syn tax E r r o r ”)
if ’ o u t p u t ’ in sigtypes and ’ t r i ’ in sigtypes:
raise ParseError(” Syn tax E r r o r ”)
def p_ioport(self, p):
’ i o p o r t : s i g t y p e s por tname ’
p[0] = self.create_ioport(p[1], p[2], lineno=p.lineno(2))
p.set_lineno(0, p.lineno(1))
def p_ioport_width(self, p):
’ i o p o r t : s i g t y p e s wid th por tname ’
p[0] = self.create_ioport(p[1], p[3], width=p[2], lineno=p.lineno(3))
p.set_lineno(0, p.lineno(1))
def p_ioport_head(self, p):
’ i o p o r t h e a d : s i g t y p e s por tname ’
p[0] = self.create_ioport(p[1], p[2], lineno=p.lineno(2))
p.set_lineno(0, p.lineno(1))
def p_ioport_head_width(self, p):
’ i o p o r t h e a d : s i g t y p e s wid th por tname ’
p[0] = self.create_ioport(p[1], p[3], width=p[2], lineno=p.lineno(3))
p.set_lineno(0, p.lineno(1))
def p_ioport_portname(self, p):




’ w id th : LBRACKET e x p r e s s i o n COLON e x p r e s s i o n RBRACKET’
p[0] = Width(p[2], p[4], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_length(self, p):
’ l e n g t h : LBRACKET e x p r e s s i o n COLON e x p r e s s i o n RBRACKET’
p[0] = Length(p[2], p[4], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_items(self, p):
’ i t e m s : i t e m s i t em ’
p[0] = p[1] + (p[2],)
p.set_lineno(0, p.lineno(1))
def p_items_one(self, p):







































if ’ s i g n e d ’ in sigtypes:
signed = True
if ’ i n p u t ’ in sigtypes:
decls.append(Input(name=name, width=width,
signed=signed, lineno=lineno))
if ’ o u t p u t ’ in sigtypes:
decls.append(Output(name=name, width=width,
signed=signed, lineno=lineno))
if ’ i n o u t ’ in sigtypes:
decls.append(Inout(name=name, width=width,
signed=signed, lineno=lineno))














if ’ t r i ’ in sigtypes:
decls.append(Tri(name=name, width=width,
signed=signed, lineno=lineno))
if ’ s u p p l y 0 ’ in sigtypes:
decls.append(Supply(name=name, value=IntConst( ’ 0 ’, lineno=lineno),
width=width, signed=signed, lineno=lineno))
if ’ s u p p l y 1 ’ in sigtypes:
decls.append(Supply(name=name, value=IntConst( ’ 1 ’, lineno=lineno),
width=width, signed=signed, lineno=lineno))
return decls
def typecheck_decl(self, sigtypes, length=None):
if length and ’ i n p u t ’ in sigtypes:
raise ParseError(” Syn tax E r r o r ”)
if length and ’ o u t p u t ’ in sigtypes:
raise ParseError(” Syn tax E r r o r ”)
if length and ’ i n o u t ’ in sigtypes:
raise ParseError(” Syn tax E r r o r ”)
if len(sigtypes) == 1 and ’ s i g n e d ’ in sigtypes:
raise ParseError(” Syn tax E r r o r ”)
if ’ i n p u t ’ in sigtypes and ’ o u t p u t ’ in sigtypes:
raise ParseError(” Syn tax E r r o r ”)
if ’ i n o u t ’ in sigtypes and ’ o u t p u t ’ in sigtypes:
raise ParseError(” Syn tax E r r o r ”)
if ’ i n o u t ’ in sigtypes and ’ i n p u t ’ in sigtypes:
raise ParseError(” Syn tax E r r o r ”)
if ’ i n p u t ’ in sigtypes and ’ r e g ’ in sigtypes:
185
raise ParseError(” Syn tax E r r o r ”)
if ’ i n o u t ’ in sigtypes and ’ r e g ’ in sigtypes:
raise ParseError(” Syn tax E r r o r ”)
if ’ i n p u t ’ in sigtypes and ’ t r i ’ in sigtypes:
raise ParseError(” Syn tax E r r o r ”)
if ’ o u t p u t ’ in sigtypes and ’ t r i ’ in sigtypes:
raise ParseError(” Syn tax E r r o r ”)
def p_decl(self, p):
’ d e c l : s i g t y p e s d e c l n a m e l i s t SEMICOLON ’
decllist = []
for rname, rlength in p[2]:
decllist.extend(self.create_decl(p[1], rname, length=rlength,
lineno=p.lineno(2)))
p[0] = Decl(tuple(decllist), lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_decl_width(self, p):
’ d e c l : s i g t y p e s wid th d e c l n a m e l i s t SEMICOLON ’
decllist = []
for rname, rlength in p[3]:
decllist.extend(self.create_decl(p[1], rname, width=p[2], length=rlength,
lineno=p.lineno(3)))
p[0] = Decl(tuple(decllist), lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_declnamelist(self, p):
’ d e c l n a m e l i s t : d e c l n a m e l i s t COMMA declname ’
p[0] = p[1] + (p[3],)
p.set_lineno(0, p.lineno(1))
def p_declnamelist_one(self, p):




’ declname : ID ’
p[0] = (p[1], None)
p.set_lineno(0, p.lineno(1))
def p_declarray(self, p):
’ declname : ID l e n g t h ’
p[0] = (p[1], p[2])
p.set_lineno(0, p.lineno(1))
# Decl and Assign




if ’ s i g n e d ’ in sigtypes:
signed = True
if ’ i n p u t ’ in sigtypes:
decls.append(Input(name=name, width=width,
signed=signed, lineno=lineno))
if ’ o u t p u t ’ in sigtypes:
decls.append(Output(name=name, width=width,
signed=signed, lineno=lineno))
if ’ i n o u t ’ in sigtypes:
decls.append(Inout(name=name, width=width,
signed=signed, lineno=lineno))
if ’ w i r e ’ in sigtypes:
decls.append(Wire(name=name, width=width,
signed=signed, lineno=lineno))






if len(sigtypes) == 1 and ’ s i g n e d ’ in sigtypes:
raise ParseError(” Syn tax E r r o r ”)
if ’ r e g ’ not in sigtypes and ’ w i r e ’ not in sigtypes:
raise ParseError(” Syn tax E r r o r ”)
if ’ i n p u t ’ in sigtypes and ’ o u t p u t ’ in sigtypes:
raise ParseError(” Syn tax E r r o r ”)
186
if ’ i n o u t ’ in sigtypes and ’ o u t p u t ’ in sigtypes:
raise ParseError(” Syn tax E r r o r ”)
if ’ i n o u t ’ in sigtypes and ’ i n p u t ’ in sigtypes:
raise ParseError(” Syn tax E r r o r ”)
if ’ i n p u t ’ in sigtypes and ’ r e g ’ in sigtypes:
raise ParseError(” Syn tax E r r o r ”)
if ’ i n o u t ’ in sigtypes and ’ r e g ’ in sigtypes:
raise ParseError(” Syn tax E r r o r ”)
if ’ s u p p l y 0 ’ in sigtypes and len(sigtypes) != 1:
raise ParseError(” Syn tax E r r o r ”)
if ’ s u p p l y 1 ’ in sigtypes and len(sigtypes) != 1:
raise ParseError(” Syn tax E r r o r ”)
def p_declassign(self, p):
’ d e c l a s s i g n : s i g t y p e s d e c l a s s i g n e l e m e n t SEMICOLON ’
decllist = self.create_declassign(
p[1], p[2][0], p[2][1], lineno=p.lineno(2))
p[0] = Decl(decllist, lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_declassign_width(self, p):
’ d e c l a s s i g n : s i g t y p e s wid th d e c l a s s i g n e l e m e n t SEMICOLON ’
decllist = self.create_declassign(
p[1], p[3][0], p[3][1], width=p[2], lineno=p.lineno(3))
p[0] = Decl(tuple(decllist), lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_declassign_element(self, p):
’ d e c l a s s i g n e l e m e n t : ID EQUALS r v a l u e ’
assign = Assign(Lvalue(Identifier(p[1], lineno=p.lineno(1)), lineno=p.lineno(1)),
p[3], lineno=p.lineno(1))
p[0] = (p[1], assign)
p.set_lineno(0, p.lineno(1))
def p_declassign_element_delay(self, p):
’ d e c l a s s i g n e l e m e n t : d e l a y s ID EQUALS d e l a y s r v a l u e ’
assign = Assign(Lvalue(Identifier(p[2], lineno=p.lineno(1)), lineno=p.lineno(2)),
p[5], p[1], p[4], lineno=p.lineno(2))




’ i n t e g e r d e c l : INTEGER i n t e g e r n a m e l i s t SEMICOLON ’
intlist = [Integer(r,
Width(msb=IntConst( ’ 31 ’, lineno=p.lineno(2)),
lsb=IntConst( ’ 0 ’, lineno=p.lineno(2)),
lineno=p.lineno(2)),
signed=True, lineno=p.lineno(2)) for r in p[2]]
p[0] = Decl(tuple(intlist), lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_integerdecl_signed(self, p):
’ i n t e g e r d e c l : INTEGER SIGNED i n t e g e r n a m e l i s t SEMICOLON ’
intlist = [Integer(r,
Width(msb=IntConst( ’ 31 ’, lineno=p.lineno(3)),
lsb=IntConst( ’ 0 ’, lineno=p.lineno(3)),
lineno=p.lineno(3)),
signed=True, lineno=p.lineno(3)) for r in p[2]]
p[0] = Decl(tuple(intlist), lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_integernamelist(self, p):
’ i n t e g e r n a m e l i s t : i n t e g e r n a m e l i s t COMMA i n t e g e r n a m e ’
p[0] = p[1] + (p[3],)
p.set_lineno(0, p.lineno(1))
def p_integernamelist_one(self, p):










’ r e a l d e c l : REAL r e a l n a m e l i s t SEMICOLON ’
reallist = [Real(p[1],
Width(msb=IntConst( ’ 31 ’, lineno=p.lineno(2)),
lsb=IntConst( ’ 0 ’, lineno=p.lineno(2)),
lineno=p.lineno(2)),
lineno=p.lineno(2)) for r in p[2]]
p[0] = Decl(tuple(reallist), lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_realnamelist(self, p):
’ r e a l n a m e l i s t : r e a l n a m e l i s t COMMA r e a l n a m e ’
p[0] = p[1] + (p[3],)
p.set_lineno(0, p.lineno(1))
def p_realnamelist_one(self, p):









’ p a r a m e t e r d e c l : PARAMETER p a r a m s u b s t i t u t i o n l i s t SEMICOLON ’
paramlist = [Parameter(rname, rvalue, lineno=p.lineno(2))
for rname, rvalue in p[2]]
p[0] = Decl(tuple(paramlist), lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_parameterdecl_signed(self, p):
’ p a r a m e t e r d e c l : PARAMETER SIGNED p a r a m s u b s t i t u t i o n l i s t SEMICOLON ’
paramlist = [Parameter(rname, rvalue, signed=True, lineno=p.lineno(2))
for rname, rvalue in p[3]]
p[0] = Decl(tuple(paramlist), lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_parameterdecl_width(self, p):
’ p a r a m e t e r d e c l : PARAMETER wid th p a r a m s u b s t i t u t i o n l i s t SEMICOLON ’
paramlist = [Parameter(rname, rvalue, p[2], lineno=p.lineno(3))
for rname, rvalue in p[3]]
p[0] = Decl(tuple(paramlist), lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_parameterdecl_signed_width(self, p):
’ p a r a m e t e r d e c l : PARAMETER SIGNED wid th p a r a m s u b s t i t u t i o n l i s t SEMICOLON ’
paramlist = [Parameter(rname, rvalue, p[3], signed=True, lineno=p.lineno(3))
for rname, rvalue in p[4]]
p[0] = Decl(tuple(paramlist), lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_parameterdecl_integer(self, p):
’ p a r a m e t e r d e c l : PARAMETER INTEGER p a r a m s u b s t i t u t i o n l i s t SEMICOLON ’
paramlist = [Parameter(rname, rvalue, lineno=p.lineno(3))
for rname, rvalue in p[3]]
p[0] = Decl(tuple(paramlist), lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_localparamdecl(self, p):
’ l o c a l p a r a m d e c l : LOCALPARAM p a r a m s u b s t i t u t i o n l i s t SEMICOLON ’
paramlist = [Localparam(rname, rvalue, lineno=p.lineno(2))
for rname, rvalue in p[2]]
p[0] = Decl(tuple(paramlist), lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_localparamdecl_signed(self, p):
’ l o c a l p a r a m d e c l : LOCALPARAM SIGNED p a r a m s u b s t i t u t i o n l i s t SEMICOLON ’
paramlist = [Localparam(rname, rvalue, signed=True, lineno=p.lineno(2))
for rname, rvalue in p[3]]
p[0] = Decl(tuple(paramlist), lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_localparamdecl_width(self, p):
’ l o c a l p a r a m d e c l : LOCALPARAM wid th p a r a m s u b s t i t u t i o n l i s t SEMICOLON ’
paramlist = [Localparam(rname, rvalue, p[2], lineno=p.lineno(3))
188
for rname, rvalue in p[3]]
p[0] = Decl(tuple(paramlist), lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_localparamdecl_signed_width(self, p):
’ l o c a l p a r a m d e c l : LOCALPARAM SIGNED wid th p a r a m s u b s t i t u t i o n l i s t SEMICOLON ’
paramlist = [Localparam(rname, rvalue, p[3], signed=True, lineno=p.lineno(3))
for rname, rvalue in p[4]]
p[0] = Decl(tuple(paramlist), lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_localparamdecl_integer(self, p):
’ l o c a l p a r a m d e c l : LOCALPARAM INTEGER p a r a m s u b s t i t u t i o n l i s t SEMICOLON ’
paramlist = [Localparam(rname, rvalue, lineno=p.lineno(3))
for rname, rvalue in p[3]]
p[0] = Decl(tuple(paramlist), lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_param_substitution_list(self, p):
’ p a r a m s u b s t i t u t i o n l i s t : p a r a m s u b s t i t u t i o n l i s t COMMA p a r a m s u b s t i t u t i o n ’
p[0] = p[1] + (p[3],)
p.set_lineno(0, p.lineno(1))
def p_param_substitution_list_one(self, p):




’ p a r a m s u b s t i t u t i o n : ID EQUALS r v a l u e ’
p[0] = (p[1], p[3])
p.set_lineno(0, p.lineno(1))
def p_assignment(self, p):
’ a s s i g n m e n t : ASSIGN l v a l u e EQUALS r v a l u e SEMICOLON ’
p[0] = Assign(p[2], p[4], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_assignment_delay(self, p):
’ a s s i g n m e n t : ASSIGN d e l a y s l v a l u e EQUALS d e l a y s r v a l u e SEMICOLON ’




’ l p a r t s e l e c t : p o i n t e r LBRACKET e x p r e s s i o n COLON e x p r e s s i o n RBRACKET’
p[0] = Partselect(p[1], p[3], p[5], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_lpartselect_lpointer_plus(self, p):
’ l p a r t s e l e c t : p o i n t e r LBRACKET e x p r e s s i o n PLUSCOLON e x p r e s s i o n RBRACKET’
p[0] = Partselect(p[1], p[3], Plus(p[3], p[5]), lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_lpartselect_lpointer_minus(self, p):
’ l p a r t s e l e c t : p o i n t e r LBRACKET e x p r e s s i o n MINUSCOLON e x p r e s s i o n RBRACKET’
p[0] = Partselect(p[1], p[3], Minus(p[3], p[5]), lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_lpartselect(self, p):
’ l p a r t s e l e c t : i d e n t i f i e r LBRACKET e x p r e s s i o n COLON e x p r e s s i o n RBRACKET’
p[0] = Partselect(p[1], p[3], p[5], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_lpartselect_plus(self, p):
’ l p a r t s e l e c t : i d e n t i f i e r LBRACKET e x p r e s s i o n PLUSCOLON e x p r e s s i o n RBRACKET’
p[0] = Partselect(p[1], p[3], Plus(p[3], p[5]), lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_lpartselect_minus(self, p):
’ l p a r t s e l e c t : i d e n t i f i e r LBRACKET e x p r e s s i o n MINUSCOLON e x p r e s s i o n RBRACKET’
p[0] = Partselect(p[1], p[3], Minus(p[3], p[5]), lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_lpointer(self, p):





’ l c o n c a t : LBRACE l c o n c a t l i s t RBRACE’
p[0] = LConcat(p[2], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_lconcatlist(self, p):
’ l c o n c a t l i s t : l c o n c a t l i s t COMMA l c o n c a t o n e ’
p[0] = p[1] + (p[3],)
p.set_lineno(0, p.lineno(1))
def p_lconcatlist_one(self, p):




















’ l v a l u e : l p a r t s e l e c t ’
p[0] = Lvalue(p[1], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_lvalue_pointer(self, p):
’ l v a l u e : l p o i n t e r ’
p[0] = Lvalue(p[1], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_lvalue_concat(self, p):
’ l v a l u e : l c o n c a t ’
p[0] = Lvalue(p[1], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_lvalue_one(self, p):
’ l v a l u e : i d e n t i f i e r ’
p[0] = Lvalue(p[1], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_rvalue(self, p):
’ r v a l u e : e x p r e s s i o n ’
p[0] = Rvalue(p[1], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
# --------------------------------------------------------------------------
# Level 1 (Highest Priority)
def p_expression_uminus(self, p):
’ e x p r e s s i o n : MINUS e x p r e s s i o n %p r e c UMINUS ’
p[0] = Uminus(p[2], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_expression_uplus(self, p):




’ e x p r e s s i o n : LNOT e x p r e s s i o n %p r e c ULNOT’




’ e x p r e s s i o n : NOT e x p r e s s i o n %p r e c UNOT’
p[0] = Unot(p[2], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_expression_uand(self, p):
’ e x p r e s s i o n : AND e x p r e s s i o n %p r e c UAND’
p[0] = Uand(p[2], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_expression_unand(self, p):
’ e x p r e s s i o n : NAND e x p r e s s i o n %p r e c UNAND’
p[0] = Unand(p[2], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_expression_unor(self, p):
’ e x p r e s s i o n : NOR e x p r e s s i o n %p r e c UNOR’
p[0] = Unor(p[2], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_expression_uor(self, p):
’ e x p r e s s i o n : OR e x p r e s s i o n %p r e c UOR’
p[0] = Uor(p[2], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_expression_uxor(self, p):
’ e x p r e s s i o n : XOR e x p r e s s i o n %p r e c UXOR’
p[0] = Uxor(p[2], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_expression_uxnor(self, p):
’ e x p r e s s i o n : XNOR e x p r e s s i o n %p r e c UXNOR’





’ e x p r e s s i o n : e x p r e s s i o n POWER e x p r e s s i o n ’





’ e x p r e s s i o n : e x p r e s s i o n TIMES e x p r e s s i o n ’
p[0] = Times(p[1], p[3], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_expression_div(self, p):
’ e x p r e s s i o n : e x p r e s s i o n DIVIDE e x p r e s s i o n ’
p[0] = Divide(p[1], p[3], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_expression_mod(self, p):
’ e x p r e s s i o n : e x p r e s s i o n MOD e x p r e s s i o n ’





’ e x p r e s s i o n : e x p r e s s i o n PLUS e x p r e s s i o n ’
p[0] = Plus(p[1], p[3], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_expression_minus(self, p):
’ e x p r e s s i o n : e x p r e s s i o n MINUS e x p r e s s i o n ’





’ e x p r e s s i o n : e x p r e s s i o n LSHIFT e x p r e s s i o n ’




’ e x p r e s s i o n : e x p r e s s i o n RSHIFT e x p r e s s i o n ’
p[0] = Srl(p[1], p[3], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_expression_sla(self, p):
’ e x p r e s s i o n : e x p r e s s i o n LSHIFTA e x p r e s s i o n ’
p[0] = Sll(p[1], p[3], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_expression_sra(self, p):
’ e x p r e s s i o n : e x p r e s s i o n RSHIFTA e x p r e s s i o n ’





’ e x p r e s s i o n : e x p r e s s i o n LT e x p r e s s i o n ’
p[0] = LessThan(p[1], p[3], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_expression_greaterthan(self, p):
’ e x p r e s s i o n : e x p r e s s i o n GT e x p r e s s i o n ’
p[0] = GreaterThan(p[1], p[3], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_expression_lesseq(self, p):
’ e x p r e s s i o n : e x p r e s s i o n LE e x p r e s s i o n ’
p[0] = LessEq(p[1], p[3], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_expression_greatereq(self, p):
’ e x p r e s s i o n : e x p r e s s i o n GE e x p r e s s i o n ’





’ e x p r e s s i o n : e x p r e s s i o n EQ e x p r e s s i o n ’
p[0] = Eq(p[1], p[3], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_expression_noteq(self, p):
’ e x p r e s s i o n : e x p r e s s i o n NE e x p r e s s i o n ’
p[0] = NotEq(p[1], p[3], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_expression_eql(self, p):
’ e x p r e s s i o n : e x p r e s s i o n EQL e x p r e s s i o n ’
p[0] = Eql(p[1], p[3], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_expression_noteql(self, p):
’ e x p r e s s i o n : e x p r e s s i o n NEL e x p r e s s i o n ’





’ e x p r e s s i o n : e x p r e s s i o n AND e x p r e s s i o n ’
p[0] = And(p[1], p[3], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_expression_Xor(self, p):
’ e x p r e s s i o n : e x p r e s s i o n XOR e x p r e s s i o n ’
p[0] = Xor(p[1], p[3], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_expression_Xnor(self, p):
’ e x p r e s s i o n : e x p r e s s i o n XNOR e x p r e s s i o n ’






’ e x p r e s s i o n : e x p r e s s i o n OR e x p r e s s i o n ’





’ e x p r e s s i o n : e x p r e s s i o n LAND e x p r e s s i o n ’





’ e x p r e s s i o n : e x p r e s s i o n LOR e x p r e s s i o n ’





’ e x p r e s s i o n : e x p r e s s i o n COND e x p r e s s i o n COLON e x p r e s s i o n ’









































’ c o n c a t : LBRACE c o n c a t l i s t RBRACE’




’ c o n c a t l i s t : c o n c a t l i s t COMMA e x p r e s s i o n ’
p[0] = p[1] + (p[3],)
p.set_lineno(0, p.lineno(1))
def p_concatlist_one(self, p):




’ r e p e a t : LBRACE e x p r e s s i o n c o n c a t RBRACE’
p[0] = Repeat(p[3], p[2], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_partselect(self, p):
’ p a r t s e l e c t : i d e n t i f i e r LBRACKET e x p r e s s i o n COLON e x p r e s s i o n RBRACKET’
p[0] = Partselect(p[1], p[3], p[5], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_partselect_plus(self, p):
’ p a r t s e l e c t : i d e n t i f i e r LBRACKET e x p r e s s i o n PLUSCOLON e x p r e s s i o n RBRACKET’
p[0] = Partselect(p[1], p[3], Plus(
p[3], p[5], lineno=p.lineno(1)), lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_partselect_minus(self, p):
’ p a r t s e l e c t : i d e n t i f i e r LBRACKET e x p r e s s i o n MINUSCOLON e x p r e s s i o n RBRACKET’
p[0] = Partselect(p[1], p[3], Minus(
p[3], p[5], lineno=p.lineno(1)), lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_partselect_pointer(self, p):
’ p a r t s e l e c t : p o i n t e r LBRACKET e x p r e s s i o n COLON e x p r e s s i o n RBRACKET’
p[0] = Partselect(p[1], p[3], p[5], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_partselect_pointer_plus(self, p):
’ p a r t s e l e c t : p o i n t e r LBRACKET e x p r e s s i o n PLUSCOLON e x p r e s s i o n RBRACKET’
p[0] = Partselect(p[1], p[3], Plus(
p[3], p[5], lineno=p.lineno(1)), lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_partselect_pointer_minus(self, p):
’ p a r t s e l e c t : p o i n t e r LBRACKET e x p r e s s i o n MINUSCOLON e x p r e s s i o n RBRACKET’
p[0] = Partselect(p[1], p[3], Minus(
p[3], p[5], lineno=p.lineno(1)), lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_pointer(self, p):
’ p o i n t e r : i d e n t i f i e r LBRACKET e x p r e s s i o n RBRACKET’
p[0] = Pointer(p[1], p[3], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_pointer_pointer(self, p):
’ p o i n t e r : p o i n t e r LBRACKET e x p r e s s i o n RBRACKET’




’ c o n s t e x p r e s s i o n : i n t n u m b e r ’
p[0] = IntConst(p[1], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_const_expression_floatnum(self, p):
’ c o n s t e x p r e s s i o n : f l o a t n u m b e r ’
p[0] = FloatConst(p[1], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_const_expression_stringliteral(self, p):
’ c o n s t e x p r e s s i o n : s t r i n g l i t e r a l ’
p[0] = StringConst(p[1], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_floatnumber(self, p):



















’ s t r i n g l i t e r a l : STRING LITERAL ’





’ a lways : ALWAYS s e n s l i s t a l w a y s s t a t e m e n t ’
p[0] = Always(p[2], p[3], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_always_ff(self, p):
’ a l w a y s f f : ALWAYS FF s e n s l i s t a l w a y s s t a t e m e n t ’
p[0] = AlwaysFF(p[2], p[3], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_always_comb(self, p):
’ a lways comb : ALWAYS COMB s e n s l i s t a l w a y s s t a t e m e n t ’
p[0] = AlwaysComb(p[2], p[3], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_always_latch(self, p):
’ a l w a y s l a t c h : ALWAYS LATCH s e n s l i s t a l w a y s s t a t e m e n t ’
p[0] = AlwaysLatch(p[2], p[3], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_sens_egde_paren(self, p):
’ s e n s l i s t : AT LPAREN e d g e s i g s RPAREN ’
p[0] = SensList(p[3], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_posedgesig(self, p):
’ e d g e s i g : POSEDGE e d g e s i g b a s e ’
p[0] = Sens(p[2], ’ posedge ’, lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_negedgesig(self, p):
’ e d g e s i g : NEGEDGE e d g e s i g b a s e ’
p[0] = Sens(p[2], ’ negedge ’, lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_edgesig_base_identifier(self, p):








’ e d g e s i g s : e d g e s i g s SENS OR e d g e s i g ’
p[0] = p[1] + (p[3],)
p.set_lineno(0, p.lineno(1))
def p_edgesigs_comma(self, p):
’ e d g e s i g s : e d g e s i g s COMMA e d g e s i g ’








’ s e n s l i s t : empty ’
p[0] = SensList(
(Sens(None, ’ a l l ’, lineno=p.lineno(1)),), lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_sens_level(self, p):
’ s e n s l i s t : AT l e v e l s i g ’
p[0] = SensList((p[2],), lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_sens_level_paren(self, p):
’ s e n s l i s t : AT LPAREN l e v e l s i g s RPAREN ’
p[0] = SensList(p[3], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_levelsig(self, p):
’ l e v e l s i g : l e v e l s i g b a s e ’
p[0] = Sens(p[1], ’ l e v e l ’, lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_levelsig_base_identifier(self, p):












’ l e v e l s i g s : l e v e l s i g s SENS OR l e v e l s i g ’
p[0] = p[1] + (p[3],)
p.set_lineno(0, p.lineno(1))
def p_levelsigs_comma(self, p):
’ l e v e l s i g s : l e v e l s i g s COMMA l e v e l s i g ’
p[0] = p[1] + (p[3],)
p.set_lineno(0, p.lineno(1))
def p_levelsigs_one(self, p):




’ s e n s l i s t : AT TIMES ’
p[0] = SensList(
(Sens(None, ’ a l l ’, lineno=p.lineno(1)),), lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_sens_all_paren(self, p):
’ s e n s l i s t : AT LPAREN TIMES RPAREN ’
p[0] = SensList(




























’ b l o c k i n g s u b s t i t u t i o n : d e l a y s l v a l u e EQUALS d e l a y s r v a l u e SEMICOLON ’
p[0] = BlockingSubstitution(p[2], p[5], p[1], p[4], lineno=p.lineno(2))
p.set_lineno(0, p.lineno(2))
def p_blocking_substitution_base(self, p):
’ b l o c k i n g s u b s t i t u t i o n b a s e : d e l a y s l v a l u e EQUALS d e l a y s r v a l u e ’
p[0] = BlockingSubstitution(p[2], p[5], p[1], p[4], lineno=p.lineno(2))
p.set_lineno(0, p.lineno(2))
def p_nonblocking_substitution(self, p):
’ n o n b l o c k i n g s u b s t i t u t i o n : d e l a y s l v a l u e LE d e l a y s r v a l u e SEMICOLON ’
p[0] = NonblockingSubstitution(




’ d e l a y s : DELAY LPAREN e x p r e s s i o n RPAREN ’
p[0] = DelayStatement(p[3], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_delays_identifier(self, p):
’ d e l a y s : DELAY i d e n t i f i e r ’
p[0] = DelayStatement(p[2], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_delays_intnumber(self, p):














’ b l o c k : BEGIN b l o c k s t a t e m e n t s END’
p[0] = Block(p[2], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_block_empty(self, p):
’ b l o c k : BEGIN END’
p[0] = Block((), lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_block_statements(self, p):
’ b l o c k s t a t e m e n t s : b l o c k s t a t e m e n t s b l o c k s t a t e m e n t ’
p[0] = p[1] + (p[2],)
p.set_lineno(0, p.lineno(1))
def p_block_statements_one(self, p):










’ namedblock : BEGIN COLON ID n a m e d b l o c k s t a t e m e n t s END’
p[0] = Block(p[4], p[3], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_namedblock_empty(self, p):
’ namedblock : BEGIN COLON ID END’
p[0] = Block((), p[3], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_namedblock_statements(self, p):
’ n a m e d b l o c k s t a t e m e n t s : n a m e d b l o c k s t a t e m e n t s n a m e d b l o c k s t a t e m e n t ’
p[0] = p[1] + (p[2],)
p.set_lineno(0, p.lineno(1))
def p_namedblock_statements_one(self, p):












for r in p[1].list:
if (not isinstance(r, Reg) and not isinstance(r, Wire)
and not isinstance(r, Integer) and not isinstance(r, Real)
and not isinstance(r, Parameter) and not isinstance(r, Localparam)):





’ p a r a l l e l b l o c k : FORK b l o c k s t a t e m e n t s JOIN ’
p[0] = ParallelBlock(p[2], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_parallelblock_empty(self, p):
’ p a r a l l e l b l o c k : FORK JOIN ’




’ i f s t a t e m e n t : IF LPAREN cond RPAREN t r u e s t a t e m e n t ELSE e l s e s t a t e m e n t ’
p[0] = IfStatement(p[3], p[5], p[7], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_if_statement_woelse(self, p):
’ i f s t a t e m e n t : IF LPAREN cond RPAREN t r u e s t a t e m e n t ’
p[0] = IfStatement(p[3], p[5], None, lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_if_statement_delay(self, p):
’ i f s t a t e m e n t : d e l a y s IF LPAREN cond RPAREN t r u e s t a t e m e n t ELSE e l s e s t a t e m e n t ’
p[0] = IfStatement(p[4], p[6], p[8], lineno=p.lineno(2))
p.set_lineno(0, p.lineno(2))
def p_if_statement_woelse_delay(self, p):
’ i f s t a t e m e n t : d e l a y s IF LPAREN cond RPAREN t r u e s t a t e m e n t ’





















’ f o r s t a t e m e n t : FOR LPAREN f o r p r e f o r c o n d f o r p o s t RPAREN f o r c o n t e n t s t a t e m e n t ’
p[0] = ForStatement(p[3], p[4], p[5], p[7], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_forpre(self, p):




















’ f o r p o s t : empty ’
p[0] = None
def p_forcontent_statement(self, p):





’ w h i l e s t a t e m e n t : WHILE LPAREN cond RPAREN w h i l e c o n t e n t s t a t e m e n t ’
p[0] = WhileStatement(p[3], p[5], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_whilecontent_statement(self, p):





’ c a s e s t a t e m e n t : CASE LPAREN case comp RPAREN c a s e c o n t e n t s t a t e m e n t s ENDCASE ’




’ c a s e x s t a t e m e n t : CASEX LPAREN case comp RPAREN c a s e c o n t e n t s t a t e m e n t s ENDCASE ’
p[0] = CasexStatement(p[3], p[5], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_unique_case_statement(self, p):
’ u n i q u e c a s e s t a t e m e n t : UNIQUE CASE LPAREN case comp RPAREN c a s e c o n t e n t s t a t e m e n t s ENDCASE ’
p[0] = UniqueCaseStatement(p[3], p[5], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_case_comp(self, p):




’ c a s e c o n t e n t s t a t e m e n t s : c a s e c o n t e n t s t a t e m e n t s c a s e c o n t e n t s t a t e m e n t ’
p[0] = p[1] + (p[2],)
p.set_lineno(0, p.lineno(1))
def p_casecontent_statements_one(self, p):




’ c a s e c o n t e n t s t a t e m e n t : c a s e c o n t e n t c o n d i t i o n COLON b a s i c s t a t e m e n t ’
p[0] = Case(p[1], p[3], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_casecontent_condition_single(self, p):
’ c a s e c o n t e n t c o n d i t i o n : c a s e c o n t e n t c o n d i t i o n COMMA e x p r e s s i o n ’
p[0] = p[1] + (p[3],)
p.set_lineno(0, p.lineno(1))
def p_casecontent_condition_one(self, p):




’ c a s e c o n t e n t s t a t e m e n t : DEFAULT COLON b a s i c s t a t e m e n t ’




’ i n i t i a l : INITIAL i n i t i a l s t a t e m e n t ’
p[0] = Initial(p[2], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_initial_statement(self, p):





’ e v e n t s t a t e m e n t : s e n s l i s t SEMICOLON ’




’ w a i t s t a t e m e n t : WAIT LPAREN cond RPAREN w a i t c o n t e n t s t a t e m e n t ’
p[0] = WaitStatement(p[3], p[5], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_waitcontent_statement(self, p):










’ f o r e v e r s t a t e m e n t : FOREVER b a s i c s t a t e m e n t ’




’ i n s t a n c e : ID p a r a m e t e r l i s t i n s t a n c e b o d y l i s t SEMICOLON ’
instancelist = []
for instance_name, instance_ports, instance_array in p[3]:
instancelist.append(Instance(p[1], instance_name, instance_ports,
p[2], instance_array, lineno=p.lineno(1)))




’ i n s t a n c e : SENS OR p a r a m e t e r l i s t i n s t a n c e b o d y l i s t SEMICOLON ’
instancelist = []
for instance_name, instance_ports, instance_array in p[3]:
instancelist.append(Instance(p[1], instance_name, instance_ports,
p[2], instance_array, lineno=p.lineno(1)))




’ i n s t a n c e b o d y l i s t : i n s t a n c e b o d y l i s t COMMA i n s t a n c e b o d y ’
p[0] = p[1] + (p[3],)
p.set_lineno(0, p.lineno(1))
def p_instance_bodylist_one(self, p):




’ i n s t a n c e b o d y : ID LPAREN i n s t a n c e p o r t s RPAREN ’
p[0] = (p[1], p[3], None)
p.set_lineno(0, p.lineno(1))
def p_instance_body_array(self, p):
’ i n s t a n c e b o d y : ID wid th LPAREN i n s t a n c e p o r t s RPAREN ’
p[0] = (p[1], p[4], p[2])
p.set_lineno(0, p.lineno(1))
def p_instance_noname(self, p):
’ i n s t a n c e : ID i n s t a n c e b o d y l i s t n o n a m e SEMICOLON ’
instancelist = []
for instance_name, instance_ports, instance_array in p[2]:
instancelist.append(Instance(p[1], instance_name, instance_ports,
(), instance_array, lineno=p.lineno(1)))
p[0] = InstanceList(p[1], (), tuple(instancelist), lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_instance_or_noname(self, p):
’ i n s t a n c e : SENS OR i n s t a n c e b o d y l i s t n o n a m e SEMICOLON ’
instancelist = []
for instance_name, instance_ports, instance_array in p[2]:
instancelist.append(Instance(p[1], instance_name, instance_ports,
(), instance_array, lineno=p.lineno(1)))
p[0] = InstanceList(p[1], (), tuple(instancelist), lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_instance_bodylist_noname(self, p):
’ i n s t a n c e b o d y l i s t n o n a m e : i n s t a n c e b o d y l i s t n o n a m e COMMA i n s t a n c e b o d y n o n a m e ’
p[0] = p[1] + (p[3],)
p.set_lineno(0, p.lineno(1))
def p_instance_bodylist_one_noname(self, p):




’ i n s t a n c e b o d y n o n a m e : LPAREN i n s t a n c e p o r t s RPAREN ’












’ p a r a m e t e r l i s t : empty ’
p[0] = ()
def p_param_args_noname(self, p):
’ pa ram args noname : pa ram args noname COMMA param arg noname ’
p[0] = p[1] + (p[3],)
p.set_lineno(0, p.lineno(1))
def p_param_args_noname_one(self, p):




’ p a r a m a r g s : p a r a m a r g s COMMA p a r a m a r g ’
p[0] = p[1] + (p[3],)
p.set_lineno(0, p.lineno(1))
def p_param_args_one(self, p):




’ p a r a m a r g s : empty ’
p[0] = ()
def p_param_arg_noname_exp(self, p):
’ param arg noname : e x p r e s s i o n ’
p[0] = ParamArg(None, p[1], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_param_arg_exp(self, p):
’ p a r a m a r g : DOT ID LPAREN e x p r e s s i o n RPAREN ’









’ i n s t a n c e p o r t s l i s t : i n s t a n c e p o r t s l i s t COMMA i n s t a n c e p o r t l i s t ’
p[0] = p[1] + (p[3],)
p.set_lineno(0, p.lineno(1))
def p_instance_ports_list_one(self, p):








’ i n s t a n c e p o r t l i s t : e x p r e s s i o n ’




’ i n s t a n c e p o r t s a r g : i n s t a n c e p o r t s a r g COMMA i n s t a n c e p o r t a r g ’
p[0] = p[1] + (p[3],)
p.set_lineno(0, p.lineno(1))
def p_instance_ports_arg_one(self, p):




’ i n s t a n c e p o r t a r g : DOT ID LPAREN i d e n t i f i e r RPAREN ’
p[0] = PortArg(p[2], p[4], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_instance_port_arg_exp(self, p):
’ i n s t a n c e p o r t a r g : DOT ID LPAREN e x p r e s s i o n RPAREN ’
p[0] = PortArg(p[2], p[4], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_instance_port_arg_none(self, p):
’ i n s t a n c e p o r t a r g : DOT ID LPAREN RPAREN ’




’ g e n v a r d e c l : GENVAR g e n v a r l i s t SEMICOLON ’
p[0] = Decl(p[2], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_genvarlist(self, p):
’ g e n v a r l i s t : g e n v a r l i s t COMMA g en va r ’
p[0] = p[1] + (p[3],)
p.set_lineno(0, p.lineno(1))
def p_genvarlist_one(self, p):




’ ge nv a r : ID ’
p[0] = Genvar(name=p[1],
width=Width(msb=IntConst( ’ 31 ’, lineno=p.lineno(1)),





’ g e n e r a t e : GENERATE g e n e r a t e i t e m s ENDGENERATE’
p[0] = GenerateStatement(p[2], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_generate_items_empty(self, p):




’ g e n e r a t e i t e m s : g e n e r a t e i t e m s g e n e r a t e i t e m ’
p[0] = p[1] + (p[2],)
p.set_lineno(0, p.lineno(1))
def p_generate_items_one(self, p):












’ g e n e r a t e b l o c k : BEGIN g e n e r a t e i t e m s END’
p[0] = Block(p[2], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_generate_named_block(self, p):
’ g e n e r a t e b l o c k : BEGIN COLON ID g e n e r a t e i t e m s END’
p[0] = Block(p[4], p[3], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_generate_if(self, p):
’ g e n e r a t e i f : IF LPAREN cond RPAREN g i f t r u e i t e m ELSE g i f f a l s e i t e m ’
p[0] = IfStatement(p[3], p[5], p[7], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_generate_if_woelse(self, p):
’ g e n e r a t e i f : IF LPAREN cond RPAREN g i f t r u e i t e m ’















’ g e n e r a t e f o r : FOR LPAREN f o r p r e f o r c o n d f o r p o s t RPAREN g e n e r a t e f o r c o n t e n t ’










’ s y s t e m c a l l : DOLLER ID ’
p[0] = SystemCall(p[2], (), lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_systemcall(self, p):
’ s y s t e m c a l l : DOLLER ID LPAREN s y s a r g s RPAREN ’
p[0] = SystemCall(p[2], p[4], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_systemcall_signed(self, p): # for $signed system task
’ s y s t e m c a l l : DOLLER SIGNED LPAREN s y s a r g s RPAREN ’
p[0] = SystemCall(p[2], p[4], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_sysargs(self, p):
’ s y s a r g s : s y s a r g s COMMA s y s a r g ’
p[0] = p[1] + (p[3],)
p.set_lineno(0, p.lineno(1))
def p_sysargs_one(self, p):




’ s y s a r g s : empty ’
p[0] = ()
def p_sysarg(self, p):






’ f u n c t i o n : FUNCTION wid th ID SEMICOLON f u n c t i o n s t a t e m e n t ENDFUNCTION ’
p[0] = Function(p[3], p[2], p[5], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_function_nowidth(self, p):
’ f u n c t i o n : FUNCTION ID SEMICOLON f u n c t i o n s t a t e m e n t ENDFUNCTION ’
p[0] = Function(p[2],
Width(IntConst( ’ 0 ’, lineno=p.lineno(1)),





’ f u n c t i o n s t a t e m e n t : f u n c v a r d e c l s f u n c t i o n c a l c ’
p[0] = p[1] + (p[2],)
p.set_lineno(0, p.lineno(1))
def p_funcvardecls(self, p):
’ f u n c v a r d e c l s : f u n c v a r d e c l s f u n c v a r d e c l ’
p[0] = p[1] + (p[2],)
p.set_lineno(0, p.lineno(1))
def p_funcvardecls_one(self, p):








for r in p[1].list:
if (not isinstance(r, Input) and not isinstance(r, Reg)
and not isinstance(r, Integer)):
















’ f u n c t i o n c a l l : i d e n t i f i e r LPAREN f u n c a r g s RPAREN ’
p[0] = FunctionCall(p[1], p[3], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_func_args(self, p):
’ f u n c a r g s : f u n c a r g s COMMA e x p r e s s i o n ’
p[0] = p[1] + (p[3],)
p.set_lineno(0, p.lineno(1))
def p_func_args_one(self, p):









’ t a s k : TASK ID SEMICOLON t a s k s t a t e m e n t ENDTASK’
p[0] = Task(p[2], p[4], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_task_statement(self, p):
’ t a s k s t a t e m e n t : t a s k v a r d e c l s t a s k c a l c ’
p[0] = p[1] + (p[2],)
p.set_lineno(0, p.lineno(1))
def p_taskvardecls(self, p):
’ t a s k v a r d e c l s : t a s k v a r d e c l s t a s k v a r d e c l ’
p[0] = p[1] + (p[2],)
p.set_lineno(0, p.lineno(1))
def p_taskvardecls_one(self, p):











for r in p[1].list:
if (not isinstance(r, Input) and not isinstance(r, Reg)
and not isinstance(r, Integer)):

















’ i d e n t i f i e r : ID ’
p[0] = Identifier(p[1], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_scope_identifier(self, p):
’ i d e n t i f i e r : s cope ID ’




’ scope : i d e n t i f i e r DOT’
scope = () if p[1].scope is None else p[1].scope.labellist
p[0] = IdentifierScope(
scope + (IdentifierScopeLabel(p[1].name, lineno=p.lineno(1)),), lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_scope_pointer(self, p):
’ scope : p o i n t e r DOT’
scope = () if p[1].var.scope is None else p[1].var.scope.labellist





’ d i s a b l e : DISABLE ID ’










’ s i n g l e s t a t e m e n t : s y s t e m c a l l SEMICOLON ’
p[0] = SingleStatement(p[1], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
def p_single_statement_disable(self, p):
’ s i n g l e s t a t e m e n t : d i s a b l e SEMICOLON ’
p[0] = SingleStatement(p[1], lineno=p.lineno(1))
p.set_lineno(0, p.lineno(1))
# fix me: to support task-call-statement
# def p_single_statement_taskcall(self, p):
# ’single_statement : functioncall SEMICOLON’
# p[0] = SingleStatement(p[1], lineno=p.lineno(1))
# p.set_lineno(0, p.lineno(1))
# def p_single_statement_taskcall_empty(self, p):
# ’single_statement : taskcall SEMICOLON’
# p[0] = SingleStatement(p[1], lineno=p.lineno(1))
# p.set_lineno(0, p.lineno(1))
# def p_taskcall_empty(self, p):
# ’taskcall : identifier’








print(” Syn tax e r r o r ”)
if p:
self._parse_error(
’ b e f o r e : %s ’ % p.value,
self._coord(p.lineno))
else:
self._parse_error( ’ At end of i n p u t ’, ’ ’)
class VerilogCodeParser(object):














def parse(self, preprocess_output= ’ p r e p r o c e s s . o u t p u t ’, debug=0):
text = self.preprocess()















[1] G. E. Moore et al., Cramming more components onto integrated circuits, 1965.
[2] I. Koren and C. M. Krishna, Fault-Tolerant Systems. Morgan Kaufmann Publishers
Inc., 2007, p. 400.
[3] N. R. Storey, Safety critical computer systems. Addison-Wesley Longman Publish-
ing Co., Inc., 1996.
[4] M. L. Shooman, “Reliability of computer systems and networks: Fault tolerance,”
Analysis, and Design, Wiley-Interscience, 2001.
[5] C. Constantinescu, “Trends and challenges in vlsi circuit reliability,” IEEE Micro,
vol. 23, no. 4, pp. 14–19, 2003.
[6] M. Nourani and A. Radhakrishnan, “Power-supply noise in socs: Atpg, estimation
and control,” in IEEE International Conference on Test, 2005., IEEE, 2005, 10–pp.
[7] M. Elgamel and M. Bayoumi, “Noise analysis and design in deep submicron tech-
nology,” in The Electrical Engineering Handbook, Academic Press, 2005, pp. 299–
310.
[8] R. C. Baumann, “Radiation-induced soft errors in advanced semiconductor technolo-
gies,” IEEE Transactions on Device and materials reliability, vol. 5, no. 3, pp. 305–
316, 2005.
[9] V. Narayanan and Y. Xie, “Reliability concerns in embedded system designs,” Com-
puter, vol. 39, no. 1, pp. 118–120, 2006.
[10] J. Schwank, M. Shaneyfelt, D. Fleetwood, J. Felix, P. Dodd, P. Paillet, and V. Ferlet-
Cavrois, “Radiation effects in mos oxides,” Nuclear Science, IEEE Transactions on,
vol. 55, pp. 1833 –1853, Sep. 2008.
[11] S. Mukherjee, Architecture Design for Soft Errors. San Francisco, CA, USA: Mor-
gan Kaufmann Publishers Inc., 2008, ISBN: 9780080558325, 9780123695291.
[12] H. Fujiwara, Logic Testing and Design for Testability. MIT Press, 1985.
[13] N. R. Shnidman, W. H. Mangione-Smith, and M. Potkonjak, “On-line fault detection
for bus-based field programmable gate arrays,” IEEE Transactions on Very Large
Scale Integration (VLSI) Systems, vol. 6, no. 4, pp. 656–666, 1998.
209
[14] E. Sentovich, SIS: A System for Sequential Circuit Synthesis. Electronics Research
Laboratory, College of Engineering, University of California, 1992.
[15] A. Pal, “Cellular realization of tsc checkers for error detecting codes,” in Computer
and Communication Systems, 1990. IEEE TENCON’90., 1990 IEEE Region 10 Con-
ference on, 1990, 687–691 vol.2.
[16] M. Nicolaidis, “Carry checking/parity prediction adders and alus,” Ieee Transactions
on Very Large Scale Integration (Vlsi) Systems, vol. 11, no. 1, pp. 121–128, 2003.
[17] M. Gssel, V. Ocheretny, E. Sogomonyan, and D. Marienfeld, New Methods of Con-
current Checking. Springer Netherlands, 2008.
[18] S. Almukhaizim, P. Drineas, and Y. Makris, “Entropy-driven parity-tree selection
for low-overhead concurrent error detection in finite state machines,” IEEE Transac-
tions on Computer-Aided Design of Integrated Circuits and Systems, vol. 25, no. 8,
pp. 1547–1554, 2006.
[19] S. Mitra and E. J. McCluskey, “Which concurrent error detection scheme to choose?”
In Proceedings International Test Conference 2000 (IEEE Cat. No.00CH37159),
2000, pp. 985–994.
[20] S. Toutounchi and A. Lai, “Fpga test and coverage,” in Proceedings. International
Test Conference, 2002, pp. 599–607.
[21] J. O. Hamblen and M. D. Furman, Rapid Prototyping of Digital Systems: A Tutorial
Approach. Kluwer Academic Publishers, 2001, p. 270.
[22] S.-B. Ko and J.-C. Lo, “Efficient realization of parity prediction functions in fpgas,”
Journal of Electronic Testing, vol. 20, no. 5, pp. 489–499, 2004.
[23] M. Tahoori, “Application-dependent testing of fpgas,” IEEE Transactions on Very
Large Scale Integration (VLSI) Systems, vol. 14, no. 9, pp. 1024–1033, 2006.
[24] M. Abramovici, C. Strond, C. Hamilton, S. Wijesuriya, and V. Verma, “Using roving
stars for on-line testing and diagnosis of fpgas in fault-tolerant applications,” in In-
ternational Test Conference 1999. Proceedings (IEEE Cat. No.99CH37034), 1999,
pp. 973–982.
[25] S. Mitra, W.-J. Huang, N. R. Saxena, S.-Y. Yu, and E. J. McCluskey, “Reconfigurable
architecture for autonomous self-repair,” IEEE Design & Test of Computers, vol. 21,
no. 3, pp. 228–240, 2004.
210
[26] I. Herrera-Alzu and M. Lopez-Vallejo, “Design techniques for xilinx virtex fpga
configuration memory scrubbers,” IEEE transactions on Nuclear Science, vol. 60,
no. 1, pp. 376–385, 2013.
[27] A. Stoddard, A. Gruwell, P. Zabriskie, and M. J. Wirthlin, “A hybrid approach to
fpga configuration scrubbing,” IEEE Transactions on Nuclear Science, vol. 64, no. 1,
pp. 497–503, 2017.
[28] M. Berg, C Poivey, D Petrick, D Espinosa, A. Lesea, K. LaBel, M Friendlich, H
Kim, and A. Phan, “Effectiveness of internal versus external seu scrubbing mitiga-
tion strategies in a xilinx fpga: Design, test, and analysis,” IEEE Transactions on
Nuclear Science, vol. 55, no. 4, pp. 2259–2266, 2008.
[29] V. Dumitriu, L. Kirischian, and V. Kirischian, “Run-time recovery mechanism for
transient and permanent hardware faults based on distributed, self-organized dy-
namic partially reconfigurable systems,” IEEE Transactions on Computers, vol. 65,
no. 9, pp. 2835–2847, 2016.
[30] S. Kim, H. Chu, I. Yang, S. Hong, S. H. Jung, and K.-H. Cho, “A hierarchical self-
repairing architecture for fast fault recovery of digital systems inspired from paral-
ogous gene regulatory circuits,” IEEE Transactions on Very Large Scale Integration
(VLSI) Systems, vol. 20, no. 12, pp. 2315–2328, 2012.
[31] R. Tessier and H. Giza, “Balancing logic utilization and area efficiency in fpgas,” in
Field-Programmable Logic and Applications: The Roadmap to Reconfigurable Com-
puting, R. W. Hartenstein and H. Grünbacher, Eds., Berlin, Heidelberg: Springer
Berlin Heidelberg, 2000, pp. 535–544.
[32] G. E. Moore, “Cramming more components onto integrated circuits (reprinted from
electronics, pg 114-117, april 19, 1965),” Proceedings of the IEEE, vol. 86, no. 1,
pp. 82–85, 1998.
[33] R. E. Lyons and W. Vanderkulk, “The use of triple-modular redundancy to im-
prove computer reliability,” Ibm Journal of Research and Development, vol. 6, no. 2,
pp. 200–209, 1962.
[34] C. Carmichael, “Triple module redundancy design techniques for virtex fpgas,” Xil-
inx Application Note XAPP197, vol. 1, 2001.
[35] J. Han, E. R. Boykin, H. Chen, J. H. Liang, and J. A. B. Fortes, “On the reliability
of computational structures using majority logic,” IEEE Transactions on Nanotech-
nology, vol. 10, no. 5, pp. 1099–1112, 2011.
211
[36] F. P. Mathur and A. Avizienis, “Reliability analysis and architecture of a hybrid-
redundant digital system: Generalized triple modular redundancy with self-repair,”
in AFIPS Spring Joint Computing Conference, 1970.
[37] H. Zhang, L. Bauer, M. A. Kochte, E. Schneider, C. Braun, M. E. Imhof, H. Wunder-
lich, and J. Henkel, “Module diversification: Fault tolerance and aging mitigation for
runtime reconfigurable architectures,” in 2013 IEEE International Test Conference
(ITC), 2013, pp. 1–10.
[38] H. Baig and J. Lee, “An island-style-routing compatible fault-tolerant fpga archi-
tecture with self-repairing capabilities,” in 2012 International Conference on Field-
Programmable Technology, 2012, pp. 301–304.
[39] R. F. DeMara, J. Lee, R. Al-Haddad, R. S. Oreifej, R. Ashraf, B. Stensrud, and M.
Quist, Dynamic Partial Reconfiguration Approach to the Design of Sustainable Edge
Detectors. 2010, pp. 49–58.
[40] R. S. Oreifej, C. A. Sharma, and R. F. DeMara, “Expediting ga-based evolution
using group testing techniques for reconfigurable hardware,” in 2006 IEEE Inter-
national Conference on Reconfigurable Computing and FPGA’s (ReConFig 2006),
2006, pp. 1–8.
[41] R. Giordano, D. Barbieri, S. Perrella, R. Catalano, and G. Milluzzo, “Configuration
self-repair in xilinx fpgas,” IEEE Transactions on Nuclear Science, 2018.
[42] D. C. Keezer and J. Yang, “Biologically inspired hierarchical structure for self-
repairing fpgas,” in 2017 International Conference on ReConFigurable Computing
and FPGAs (ReConFig), IEEE, 2017, pp. 1–8.
[43] Xilinx, “Device reliability report,” Tech. Rep., 2016.
[44] H. Quinn, “Radiation effects in reconfigurable fpgas,” Semiconductor Science and
Technology, vol. 32, no. 4, 2017.
[45] A. Ceschia, A. Violante, M. S. Reorda, A. Paccagnella, P. Bernardi, M. Rebaudengo,
D. Bortolato, M. Bellato, P. Zambolin, and A. Candelori, “Identification and classi-
fication of single-event upsets in the configuration memory of sram-based fpgas (vol
50, pg 2088, 2003),” IEEE Transactions on Nuclear Science, vol. 51, no. 2, pp. 328–
328, 2004.
[46] P. Adell and G. Allen, “Assessing and mitigating radiation effects in xilinx fpgas,”
Pasadena, CA: Jet Propulsion Laboratory, California Institute of Technology, 2008,
Tech. Rep., 2008.
[47] Xilinx, “Vivado design suite user guide - partial reconfiguration,” 2017.
212
[48] K. Vipin and S. A. Fahmy, “Fpga dynamic and partial reconfiguration: A survey of
architectures, methods, and applications,” ACM Computing Surveys (CSUR), vol. 51,
no. 4, p. 72, 2018.
[49] Z. Seifoori, B. Khaleghi, and H. Asadi, “Introduction to emerging sram-based fpga
architectures in dark silicon era,” in. Jan. 2018.
[50] N. Mark and M. Peattie, “Using a microprocessor to configure xilinx fpgas via slave
serial or selectmap mode,” Jan. 2001.
[51] V. Lai and O. Diessel, “Icap-i: A reusable interface for the internal reconfiguration
of xilinx fpgas,” Jan. 2010, pp. 357 –360.
[52] Xilinx, “Xc6200 field programmable gate arrays product description,” 1997.
[53] ——, “Xilinx ds031 virtex-ii platform fpgas: Complete data sheet,” 2003.
[54] ——, “Ug070: Virtex-4 fpga user guide,” 2008.
[55] ——, “Xapp151: Virtex series configuration architecture user guide,” 2004.
[56] P. Lysaght, B. Blodget, J. Mason, J. Young, and B. Bridgford, “Enhanced archi-
tectures, design methodologies and cad tools for dynamic reconfiguration of xilinx
fpgas,” in 2006 International Conference on Field Programmable Logic and Appli-
cations, IEEE, 2006, pp. 1–6.
[57] M. Fitzgerald, Introducing Regular Expressions: Unraveling Regular Expressions,
Step-by-Step. O’Reilly Media, 2012.
[58] A. W. Appel and J. Palsberg, Modern Compiler Implementation in Java, 2nd. New
York, NY, USA: Cambridge University Press, 2003, ISBN: 052182060X.
[59] K. Cooper and L. Torczon, Engineering a compiler. Elsevier, 2011.
[60] A. V. Aho, M. S. Lam, R. Sethi, and J. D. Ullman, Compilers: Principles, Tech-
niques, and Tools (2Nd Edition). Boston, MA, USA: Addison-Wesley Longman
Publishing Co., Inc., 2006, ISBN: 0321486811.
[61] D. Beazley, “Ply (python lex-yacc),” See http://www. dabeaz. com/ply, 2001.
[62] S. Takamaeda-Yamazaki, “Pyverilog: A python-based hardware design processing
toolkit for verilog hdl,” in Applied Reconfigurable Computing, ser. Lecture Notes in
Computer Science, vol. 9040, Springer International Publishing, 2015, pp. 451–460.
213
[63] F. Corno, M. S. Reorda, and G. Squillero, “Rt-level itc’99 benchmarks and first atpg
results,” IEEE Design Test of Computers, vol. 17, no. 3, pp. 44–53, 2000.
[64] S. R. Welke, B. W. Johnson, and J. H. Aylor, “Reliability modeling of hardware/-
software systems,” IEEE Transactions on Reliability, vol. 44, no. 3, pp. 413–418,
1995.
[65] M. Rausand, “Reliability of safety-critical systems,” John Wiley&Sons, 2014.
[66] D. P. Siewiorek and R. S. Swarz, Reliable computer systems: design and evaluation.
AK Peters/CRC Press, 1998.
[67] S. Scharoba, M. Schölzel, T. Koal, and H. T. Vierhaus, “On reliability estimation
for combined transient and permanent fault handling,” in 2014 14th Biennial Baltic
Electronic Conference (BEC), IEEE, 2014, pp. 73–76.
[68] I. Koren and S. Y. H. Su, “Reliability analysis of n-modular redundancy systems
with intermittent and permanent faults,” IEEE Transactions on Computers, no. 7,
pp. 514–520, 1979.
[69] D. McMurtrey, K. S. Morgan, B. Pratt, and M. J. Wirthlin, “Estimating tmr reliability
on fpgas using markov models,” 2008.
[70] K. S. Trivedi, Probability & Statistics with Reliability, Queuing and Computer Sci-
ence Applications. PHI Learning Pvt. Limited, 2011.
[71] E. Nurvitadhi, G. Venkatesh, J. Sim, D. Marr, R. Huang, J. Ong Gee Hock, Y. T.
Liew, K. Srivatsan, D. Moss, S. Subhaschandra, et al., “Can fpgas beat gpus in
accelerating next-generation deep neural networks?” In Proceedings of the 2017
ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, ACM,
2017, pp. 5–14.
[72] S. A. Che, J. Li, J. W. Sheaffer, K. Skadron, and J. Lach, “Accelerating compute-
intensive applications with gpus and fpgas,” 2008 Symposium on Application Spe-
cific Processors, pp. 101–+, 2008.
[73] Y. LeCun, L. Bottou, Y. Bengio, P. Haffner, et al., “Gradient-based learning applied
to document recognition,” Proceedings of the IEEE, vol. 86, no. 11, pp. 2278–2324,
1998.




Jingchi Yang was born in 1990 in China. He joined a dual degree program and received one
B.S. degree in electrical engineering from the Hohai University in China and one B.S. de-
gree from the University of Lille 1 in France in 2012. He continued on to receive the M.S.
degree in microelectronics and nanotechnology from the University of Lille 1 in 2014, and
the M.S. degree in electrical and computer engineering from Georgia Institute of Technol-
ogy in 2015.
In the Spring of 2015, Dr. Yang joined the High-Speed Digital Test Lab at Georgia Tech
under the guidance of Dr. David Keezer in order to purse the Ph.D. degree in electrical
and computer engineering. His primary areas of expertise are fault-tolerant and self-repair
digital systems design. He has over five years of experience in register-transfer level (RTL)
design and FPGA development. In the Fall of 2019, Dr. Yang received the Ph.D. degree in
electrical and computer engineering from Georgia Tech.
215
