Multi-objective evolutionary design of selective triple modular redundancy systems against SEUs  by Yao, Rui et al.
Chinese Journal of Aeronautics, (2015), 28(3): 804–813Chinese Society of Aeronautics and Astronautics
& Beihang University
Chinese Journal of Aeronautics
cja@buaa.edu.cn
www.sciencedirect.com
* Corresponding author. Tel.: +86 25 84892352.
E-mail address: yaorui@nuaa.edu.cn (R. Yao).
Peer review under responsibility of Editorial Committee of CJA.
Production and hosting by Elsevier
http://dx.doi.org/10.1016/j.cja.2015.03.005
1000-9361 © 2015 The Authors. Production and hosting by Elsevier Ltd. on behalf of CSAA & BUAA.
This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Yao Rui *, Chen Qinqin, Li Zengwu, Sun Yanmei
College of Automation and Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China
Received 17 June 2014; revised 5 September 2014; accepted 21 February 2015
Available online 8 April 2015
KEYWORDS
Evolvable hardware;
Field programmable gate array;
Multi-objective approach;
Selective triple modular redundancy;
Single event upset
Abstract To improve the reliability of spaceborne electronic systems, this paper presents a
fault-tolerant strategy of selective triple modular redundancy (STMR), based on multi-objective
optimization and evolvable hardware (EHW), against single-event upsets (SEUs) for circuits
implemented on field programmable gate arrays (FPGAs) based on static random access memory
(SRAM). First, various circuit topologies with the same functionality are evolved using EHW.
Then the SEU-sensitive gates of each circuit are identified using the signal probabilities of all
its lines, and each circuit is hardened against SEUs by selectively applying triple modular
redundancy (TMR) to these SEU-sensitive gates. Afterward, each hardened circuit is evaluated by
SEU simulation, and multi-objective optimization is introduced to optimize the area overhead and
the number of functional errors of all the circuits. The proposed fault-tolerant strategy is
tested on four circuits from the Microelectronics Center of North Carolina (MCNC) benchmark
suite. The experimental results show that it can generate innovative trade-off solutions that
compromise between hardware resource consumption and system reliability. The maximum saving in
the area overhead of the STMR circuit over the full TMR design is 58% with the same SEU immunity.
1. Introduction
With the fast development of science and technology, the reliability of electronics in space
and avionics has become crucial due to the increased complexity of their architecture and
function. Field programmable gate arrays (FPGAs) based on static random access memory (SRAM)
have gained steadily increasing interest for such applications because of their short
time to market, good reconfigurability and low cost.
Unfortunately, along with these advantages this technology
has a high susceptibility to the so-called single event upsets
(SEUs).1 An SEU stands for the inversion of a memory bit
caused by heavy ions, protons and/or ground level radiation.
SEU is the largest contributor to device soft failure,2 which
may even lead to failure in the mission. Hence, aerospace
industry will beneﬁt signiﬁcantly from SEU mitigation tech-
nologies for SRAM-based FPGAs.
Triple modular redundancy (TMR)3,4 is the most widely adopted technique for hardening
circuits implemented on SRAM-based FPGAs. For digital circuits mapped on FPGAs, not only the
flip-flops (FFs) that form the feedback path of sequential circuits, but also the logic gates
in combinational and sequential circuits, need to be hardened. The reason is that the logic
gates are mapped on the FPGA using look-up tables (LUTs), which consist of SRAM cells. Even
the interconnection is controlled by data stored in SRAM cells.
TMR can be applied at different granularities, such as device redundancy, system redundancy,
module redundancy or logic element redundancy. The finer the granularity is, the higher the
probability of masking faults is. For example, in the system level TMR system, the original
system is replicated three
times and the output extracted from a majority voter. Each
replica of the system works independently and is named
domain. If an SEU occurs in one domain, TMR masks the
fault by majority voting and thus propagates the correct out-
put. This method provides the TMR system with resistance
against SEUs, and can harden the system without affecting
its normal operation. However, TMR system can withstand
only single upset at any instant of time. If two out of three
domains give faulty results, the system will produce wrong
answers. To enable the system to mask multiple faults, logic
element level TMR can be applied. In the logic element level
TMR system, each logic element, including the logic gate
and the FF, is hardened by TMR, so it allows every logic ele-
ment to tolerate one failure. Obviously, the anti-SEU ability of
logic element level TMR is better than system level TMR, but
the area overhead of voter insertion is also signiﬁcant.
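The masking behavior described above can be illustrated with a minimal sketch of a bitwise 2-of-3 majority voter. The function name and the example bit patterns are illustrative assumptions, not taken from the paper.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority over the outputs of the three TMR domains."""
    return (a & b) | (a & c) | (b & c)


golden = 0b1011            # fault-free output of one domain (illustrative value)
flipped = golden ^ 0b0100  # the same output with one bit upset

# One faulty domain: the voter still delivers the fault-free value.
single_upset = majority_vote(flipped, golden, golden)

# Two domains faulty in the same way: the wrong value wins the vote.
double_upset = majority_vote(flipped, flipped, golden)
```

The second case corresponds to the limitation noted in the text: once two of the three domains are faulty, the voter can no longer recover the correct output.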
To reduce the area overhead of the TMR system, a special kind of TMR named reduced triple
modular redundancy (RTMR) for specific very long instruction word (VLIW) processors is
proposed in Ref. 5. The key idea is to employ the redundancy of operators in the data path of
a VLIW processor, i.e., every operation is executed twice by two different operators during
normal program execution. Only when the two computed results mismatch is the operation
executed by a third operator, whose result is used for voting. Therefore, during most of the
execution time, the area overhead of RTMR is only 100%. However, RTMR is essentially a system
level TMR, and it is only suitable for specific VLIW processors. Moreover, the VLIW
architecture must be modified in order to detect a mismatch in computed results, and the
necessary program transformations must be introduced to obtain an internal representation of
fault-tolerant programs that can be scheduled on the proposed VLIW architecture. If it were
used for logic element level TMR, much additional hardware logic and a complex scheduling
mechanism would have to be added.
To reduce the hardware resource consumption of logic ele-
ment level full TMR system, a fault-tolerant method of selec-
tive TMR (STMR) for circuit mapped on FPGAs is proposed
in Ref. 6. In this method, only the SEU sensitive gates, i.e.,
gates that are prone to upset in case of SEU, in the circuits
are detected using the signal probabilities of the line and are
further hardened with TMR; while those non-sensitive to
SEU are not hardened. Because only some of the gates are selectively hardened by TMR, the
STMR method can significantly reduce the area overhead of the hardened circuit compared to
full TMR; moreover, since the gates left unhardened are not prone to upset by SEU, the loss
of SEU immunity is small. However, the area overhead and the SEU immunity (namely
reliability, which is inversely related to the number of functional errors in case of SEU) of
a circuit conflict with each other: improving one worsens the other. If we want to increase
the reliability of a circuit, more gates need to be hardened by TMR, so the area overhead
increases too; conversely, if we want to decrease the area overhead, the number of gates
hardened by TMR must decrease, so the reliability decreases too in the case of SEU.
Therefore, a compromise between the area overhead and the number of functional errors is
required. Moreover, faulty domains in the STMR system cannot be repaired.
In this paper, the multi-objective evolutionary design of
STMR system against SEUs is presented, which combines
the novel design and self-repairing capabilities of evolvable
hardware (EHW) with the lower area overhead of the STMR
technique. Moreover, this strategy can result in a tradeoff
between reliability and resource consumption by using the
multi-objective optimization technology. In general, the proce-
dure of this strategy can be divided into three steps. Firstly,
various topologies of circuit with the same functionality are
evolved using EHW. Then these circuits are hardened against
SEUs by introducing the STMR technique to greatly reduce
the area overhead with a slight loss of reliability. Lastly,
through introducing multi-objective optimization algorithm7
based on weighted summation, the number of functional errors
and the area overhead of the circuits with different topologies
are optimized simultaneously. The proposed fault-tolerant
strategy is tested on four circuits taken from microelectronics
center of North Carolina (MCNC) benchmark library. Not
only is the area overhead of the STMR circuit decreased signif-
icantly over that required for the full TMR design of the same
circuit with a small loss of reliability, but also a tradeoff
between the number of functional errors and the area overhead
is achieved.
2. STMR method based on multi-objective optimization and EHW
2.1. Evolvable hardware
EHW is a novel kind of bio-inspired smart hardware, which is
capable of self-assembly, self-repairing and self-adaptation. It
is an integration of evolutionary computation and reconﬁg-
urable hardware devices.8,9 It applies evolutionary algorithms,
particularly genetic algorithm (GA), as the global search
engine, and in-situ conﬁgurable devices as the physical med-
ium. The goal of EHW is to obtain expected circuits and
topologies through evolution without human intervention or
designers’ knowledge,10 and then adapt to the new environ-
ment by reconﬁguring its own internal structure dynamically
and autonomously according to changing environment.
Extrinsic or intrinsic evolution11 is applied to EHW: the circuit architecture and property
parameters are encoded into chromosomes, and each candidate circuit is either simulated or
implemented physically on reconfigurable devices and then scored by an evaluation function.
This evaluation function, known as the fitness function, measures how good a solution each
chromosome is to the problem and is the optimization objective of the GA.
Offspring individuals are thus derived from operators like
selection, crossover, and mutation according to their ﬁtness.
The evolution cycle is then repeated until a satisfying solution
(a circuit providing the desired behavior) is found.
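The evolution cycle can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the platform used in the paper: the chromosome is taken to be a plain bit vector compared against a target output vector, selection is by tournament, offspring are produced by bit-flip mutation only, and one elite individual is carried over each generation. All names and parameter values are hypothetical.

```python
import random


def fitness(chromosome, target):
    """Number of positions where the candidate matches the desired response."""
    return sum(c == t for c, t in zip(chromosome, target))


def tournament(population, target, k, rng):
    """Pick the fittest of k randomly sampled individuals."""
    return max(rng.sample(population, k), key=lambda c: fitness(c, target))


def mutate(chromosome, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [bit ^ 1 if rng.random() < rate else bit for bit in chromosome]


def evolve(target, pop_size=32, tournament_size=5, rate=0.05,
           max_generations=200, seed=0):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in target] for _ in range(pop_size)]
    history = []  # best fitness seen in each evaluated generation
    for _ in range(max_generations):
        best = max(population, key=lambda c: fitness(c, target))
        history.append(fitness(best, target))
        if history[-1] == len(target):  # satisfying solution found
            break
        offspring = [
            mutate(tournament(population, target, tournament_size, rng), rate, rng)
            for _ in range(pop_size - 1)
        ]
        population = [best] + offspring  # elitism keeps the best individual
    return best, history


target = [1, 0, 1, 1, 0, 0, 1, 0]
best, history = evolve(target)
```

Because the elite individual is always retained, the best fitness recorded in `history` can never decrease from one generation to the next.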
Table 1 Signal probabilities computation at output of a Boolean gate.
Gate type    Output probability of the gate
AND          $\prod_i P_i$
Extrinsic evolution is performed using a circuit simulator.
So it is suitable for discussing new evolution mechanisms
and exploring new architecture models for EHW. For exam-
ple, Carlos et al.12 performed a comparative study of several
heuristics with respect to a traditional GA in the design of
combinational logic circuits. Sushil13 combined GA with a
case-based memory to improve the performance of a series
of similar design problems. Phillip and Ganesh14 presented
an algorithm inspired from quantum evolution and particle
swarm to evolve combinational logic circuits; a multi-objective
ﬁtness function is used to obtain feasible circuits with minimal
number of gates. Vijayakumari and Mythili15 proposed a faster two-dimensional (2D) technique
for the design of combinational digital circuits using GA, in which the combinational
digital circuits are represented as 2D chromosomes, and suitable 2D crossover and mutation
techniques are proposed in order to increase the convergence speed of GA. However,
extrinsic evolution is computation-intensive and time-consum-
ing, and there is considerable difference between the perfor-
mance obtained by simulation and that of the real-world
application.
In intrinsic evolution, all the candidate circuits are evalu-
ated in a physical reconﬁgurable device. So it can not only
accelerate the evaluation speed, but also make full use of the
real characteristic (e.g. temperature, power consumption, local
failure and so on) of the device. Intrinsic evolution is capable
of online self-adaption and self-repairing, and is the founda-
tion for EHW’s real-world application. For example, Isamu
Kajitani et al.16 proposed the variable-length chromosome
GA for the evolution of pattern recognition system to recog-
nize hand-written symbols on an EHW-board including four
Xilinx LCX XC 4025 FPGAs. Zhang et al.17 presented a
reconﬁgurable architecture inspired by Cartesian genetic pro-
gramming and dedicated for implementing high-performance
digital image ﬁlters on a custom Xilinx Virtex FPGA
xcv1000. L. Sekanina and S. Friedl18 introduced a complete implementation of a digital
evolvable hardware system in which a virtual reconfigurable circuit and an evolutionary
algorithm were implemented independently as soft IP cores, and the COMBO6 card was employed
for the evolutionary design of small combinational circuits, such as 3 × 3-bit multipliers.
Intrinsic EHW can be classiﬁed into three different classes
according to the location of the reconﬁgurable hardware and
the evolutionary algorithm (EA), i.e., PC-based inter-board
intrinsic EHW that consists of one or more boards hosting
reconﬁgurable devices and a PC on which EA is run, inter-chip
intrinsic EHW in which the reconﬁgurable devices and the
embedded controller that runs EA are located on the same
board, and complete intrinsic EHW in which the reconﬁg-
urable hardware and EA are implemented on the same chip.
Since complete intrinsic EHW yields the best performance as
the communication delays are due to intra-chip wires, in this
paper, a complete intrinsic EHW is implemented on a
Xilinx Virtex-5 FPGA evaluation platform, and STMR is
introduced to different topologies of a given circuit evolved
by using EHW.
NAND 1 iPi
OR
P
iPi 
Q
iPi
NOR 1 PiPi 
Q
iPi
 
XOR
P
i;jPið1 PjÞ
XNOR 1 Pi;jPið1 PjÞ
 
2.2. STMR for combinational circuits
With a small loss of SEU immunity, the STMR strategy can greatly reduce the area overhead of
the hardened circuit compared to full TMR by selectively applying TMR to SEU sensitive gates.
The basic concept is as follows: (1) a set of primary input probabilities is generated and
propagated through the combinational circuit; (2) the output signal probabilities of all the
lines in the circuit are calculated; (3) SEU sensitive gates are identified; (4) TMR is
applied to these sensitive gates.
2.2.1. Output signal probabilities computation of a Boolean gate
Typically, the input environment can be summarized either in terms of input signal
probabilities or in the form of “representative” input sequences.19,20 Following Ref. 6, in
this paper we characterize the input environment in terms of input signal probabilities and
model an SEU as a single event transient (SET) on a given combinational circuit to design the
STMR system. The signal probability of the output of an n-input gate is calculated as shown
in Table 1; it is highly dependent on the nature of the circuit and the environment it will
be subject to. In other words, given the input signal probabilities of a combinational
circuit, the signal probabilities of all the nets in the circuit can be determined according
to Table 1 by propagating the signal probabilities of the primary inputs level by level until
the primary outputs of the circuit are reached.
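The level-by-level propagation can be sketched as follows. This is a minimal illustration under stated assumptions: inputs of a gate are treated as statistically independent (so correlations from reconvergent fan-out are ignored), the netlist is given in topological order, and the netlist format and function names are hypothetical. For OR and NOR the sketch uses the exact independent-input form $1 - \prod_i (1 - P_i)$, which coincides with the two-input expression $P_1 + P_2 - P_1 P_2$; multi-input XOR is treated as odd parity.

```python
from math import prod


def gate_output_probability(gate_type, input_probs):
    """Probability that the gate output is 1, assuming independent inputs."""
    p_and = prod(input_probs)
    p_or = 1.0 - prod(1.0 - p for p in input_probs)
    if gate_type == "AND":
        return p_and
    if gate_type == "NAND":
        return 1.0 - p_and
    if gate_type == "OR":
        return p_or
    if gate_type == "NOR":
        return 1.0 - p_or
    if gate_type == "NOT":
        return 1.0 - input_probs[0]
    if gate_type in ("XOR", "XNOR"):
        p_odd = 0.0  # probability that an odd number of inputs are 1
        for p in input_probs:
            p_odd = p_odd * (1.0 - p) + p * (1.0 - p_odd)
        return p_odd if gate_type == "XOR" else 1.0 - p_odd
    raise ValueError(f"unknown gate type: {gate_type}")


def propagate(netlist, primary_input_probs):
    """netlist: list of (output_net, gate_type, input_nets) in topological order."""
    probs = dict(primary_input_probs)
    for out_net, gate_type, in_nets in netlist:
        probs[out_net] = gate_output_probability(
            gate_type, [probs[n] for n in in_nets]
        )
    return probs


# Hypothetical two-gate circuit: n1 = AND(a, b); out = OR(n1, c)
netlist = [("n1", "AND", ["a", "b"]), ("out", "OR", ["n1", "c"])]
probs = propagate(netlist, {"a": 0.5, "b": 0.5, "c": 0.2})
```

For these inputs the AND gate yields 0.25 and the OR gate yields 1 - 0.75 × 0.8 = 0.4.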
2.2.2. Deﬁnition and identiﬁcation of SEU-sensitive gates
If a fault on one of the inputs of a gate can propagate to the
output of the gate, we say that the gate is sensitive to SEUs.
In order to detect the SEU sensitive gates of the circuit after the input and output signal
probabilities are calculated, a sensitive input, a concept introduced by the critical path
tracing algorithm,21 is defined as follows:
Definition 1. The input of a gate is considered to be sensitive only if a change of its value
results in a change of the gate output value.
The input sensitivity of a gate with two or more inputs is
identiﬁed as follows:
(1) The input is considered to be sensitive only if its value is
dominant over other inputs.
(2) All inputs are considered to be sensitive if none of the
inputs has dominant value.
According to Deﬁnition 1, the dominant value for AND
gate as well as NAND gate is ‘‘0’’ and the dominant value
for OR as well as NOR gate is ‘‘1’’.
In order to use Definition 1, a threshold probability is identified as follows:
Definition 2. If the signal probability of a line is more than the threshold probability,
then its logic value is assumed to be “1”; otherwise its logic value is assumed to be “0”.
Thus, given a threshold probability and a set of input
probabilities, we can assign the input logic values of a gate
and then detect the gate’s SEU sensitivity according to
Deﬁnitions 1–3, as shown below.
Definition 3. If a gate has one or more sensitive inputs, then the gate is sensitive to SEUs.
We take the circuit in Fig. 1 as an example to illustrate how to determine a gate's SEU
sensitivity. Let the threshold probability be 0.5. Consider the 3-input OR gate whose inputs
A, B, and C have signal probabilities 0.7, 0.2, and 0.2, as shown in Fig. 1(a). Then input
“A” is at logic “1”, and inputs “B” and “C” are at logic “0”. Assume a fault due to an SEU on
input “A” at some instant of time; the fault will then propagate to the output of the gate,
because all other signals are at non-dominant values at that instant. In other words, a fault
on one of the inputs can upset the output only if this input has the dominant value whereas
all the other inputs have non-dominant values. In terms of probabilities, for this OR gate an
SEU on one of the inputs may propagate through the gate only when the signal probabilities of
all other inputs are well below the threshold probability. Therefore the gate is considered
an SEU sensitive gate. As shown in Fig. 1(b), when the input signal probabilities of the
3-input OR gate are 0.7, 0.2, and 0.8, a fault on the input signal “A” would not upset its
output, because input A is not the only input that has the dominant value at that time.
Hence, this gate is insensitive to SEUs on its inputs.
2.2.3. STMR method
In real applications, a combinational circuit consists of multi-
ple Boolean gates. The design ﬂow of applying STMR to a
combinational circuit is as follows:
(1) Signal probabilities assignment. Assign a set of ran-
domly generated probabilities to all the inputs of the
given combinational circuit.
(2) Input logic value ascertainment. Propagate the primary
input probabilities through the circuit, calculate the
inputs and outputs probability of each gate in the circuit
and determine the input logic values of each gate.
(3) Gate sensitivity identification. Propagate the input logic values of each gate through
the circuit from the first level to the last level. If at most one input of the gate has the
dominant value, then the gate is sensitive. XOR, XNOR, and NOT gates propagate faults no
matter what the signal probabilities of the inputs are, so these gates are always assumed to
be sensitive. The gates in the last level of the circuit are also considered SEU sensitive
because a heavy ion striking such a gate has a higher probability of upsetting the final
output.
(4) STMR insertion. Apply TMR to all the SEU sensitive gates in the circuit.
Fig. 1 Single-event upset.
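The sensitivity rule in the design flow above can be sketched as a small function. It is an illustration under stated assumptions rather than the authors' implementation: logic values follow Definition 2 (probability strictly above the threshold maps to “1”), and the function and table names are hypothetical.

```python
# Dominant input values as stated in the text: 0 for AND/NAND, 1 for OR/NOR.
DOMINANT_VALUE = {"AND": 0, "NAND": 0, "OR": 1, "NOR": 1}
ALWAYS_SENSITIVE = {"XOR", "XNOR", "NOT"}


def logic_value(probability, threshold):
    """Definition 2: a line is assumed to be 1 if its probability exceeds the threshold."""
    return 1 if probability > threshold else 0


def is_sensitive(gate_type, input_probs, threshold, last_level=False):
    """Step (3): a gate is sensitive if at most one input holds the dominant value."""
    if last_level or gate_type in ALWAYS_SENSITIVE:
        return True
    dominant = DOMINANT_VALUE[gate_type]
    n_dominant = sum(logic_value(p, threshold) == dominant for p in input_probs)
    return n_dominant <= 1


# The two 3-input OR gate cases from the Fig. 1 example, threshold 0.5.
case_a = is_sensitive("OR", [0.7, 0.2, 0.2], 0.5)  # only A is dominant
case_b = is_sensitive("OR", [0.7, 0.2, 0.8], 0.5)  # A and C are both dominant
```

With these inputs the first gate is classified as sensitive and the second as insensitive, matching the discussion of Fig. 1(a) and Fig. 1(b).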
If the output of a sensitive gate is connected to only sensi-
tive gates, then the outputs of the triplicates can be directly
connected to the inputs of the triplicates of the next level.
This is illustrated in Fig. 2(a). Both of the two gates, Gate 1
and Gate 2, are assumed to be sensitive to SEUs (marked by
dotted circles), and Gate 1 is only connected to Gate 2.
Therefore, as presented in Fig. 2(b), the triplicated structure
of the circuit does not need to have a voter.
If the output of the sensitive gate is connected to an insen-
sitive gate, then a voter should be used between the sensitive
gate and the insensitive gate. For example, as shown in
Fig. 3(a), Gate 1 is assumed to be sensitive, whereas Gate 2
is insensitive. Hence, the triplicated structure for the circuit is shown in Fig. 3(b), in
which the outputs of the triplicated structure, D_1, D_2, and D_3, are combined by a voter
into a single output, D. Then D is fed to Gate 2.
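The two connection rules can be summarized in a short sketch that decides where voters are needed. The data layout (a fan-out dictionary with one key per gate) and the function name are assumptions made for illustration; only gate-to-gate connections are considered, so primary outputs are not modeled.

```python
def stmr_structure(fanout, sensitive):
    """Return (number of gate instances, set of gates followed by a voter).

    fanout:    dict mapping every gate to the list of gates it drives
    sensitive: set of gates classified as SEU sensitive
    """
    # A voter is needed only where a triplicated gate drives a non-triplicated one.
    voters = {
        gate
        for gate in sensitive
        if any(successor not in sensitive for successor in fanout.get(gate, []))
    }
    # Sensitive gates are triplicated; insensitive gates are kept as single copies.
    gate_instances = sum(3 if gate in sensitive else 1 for gate in fanout)
    return gate_instances, voters


fanout = {"Gate1": ["Gate2"], "Gate2": []}

# Fig. 2 situation: both gates sensitive, triplicates connect directly.
both_sensitive = stmr_structure(fanout, {"Gate1", "Gate2"})

# Fig. 3 situation: Gate 1 sensitive, Gate 2 insensitive, one voter in between.
one_sensitive = stmr_structure(fanout, {"Gate1"})
```

In the first case six gate instances and no voter result; in the second, four gate instances and one voter after Gate 1.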
The normal full TMR structure of the original circuit in Figs. 2 and 3 is the same, as shown
in Fig. 2(c). It can be seen from Figs. 2 and 3 that the STMR structure obtains a significant
reduction of area overhead over the normal full TMR structure: for each insensitive gate,
there is a saving of two gates and one voter; even if all the gates are sensitive, there is
still a saving in voters equal to the number of gates.
Fig. 2 No fault for connections between two triplicated modules.
Fig. 3 Connections between triplicated module and non-triplicated module which needs to
introduce a voter.
2.3. STMR for sequential circuits
It is well-known that a sequential circuit can be viewed as a
combinational circuit with a feedback path of FFs.
Combinational circuits can be hardened by applying STMR
as mentioned before. Since FFs are crucial to the entire circuit,
they can be hardened by full TMR or be replaced with any
SEU hardened latches reported in the literature.
2.4. Multi-objective optimization
In science and engineering, most problems have two or more objectives to be optimized, and
these objectives usually cannot be compared directly and conflict with each other. A
particular optimization solution with respect to a single objective
can lead to unacceptable results with respect to other objec-
tives. Therefore, an appropriate solution to a multi-objective
problem is to investigate a set of solutions which satisfy the
objectives at an acceptable level without being dominated by
any other solutions.22 This is so-called multi-objective opti-
mization problem (MOP) (also called multiple objectives and
vector optimization)23 in which the goal is to minimize or max-
imize several conﬂicting objective functions simultaneously.
In general, problem of minimizing multiple objectives can
be described as follows:
$$\min\ y = [f_1(x), f_2(x), \ldots, f_q(x)] \qquad (1)$$
$$\text{s.t.}\ g(x) = [g_1(x), g_2(x), \ldots, g_m(x)] \qquad (2)$$
where $x = [x_1, x_2, \ldots, x_n] \in X$ indicates an n-dimensional decision vector composed
of decision variables $x_1, x_2, \ldots, x_n$ in the decision space; $y$ represents the
objective space composed of conflicting objective functions $y_1, y_2, \ldots, y_q$; $g(x)$
represents the constraint condition determining the domain of feasible solutions in the
decision space. The set of all decision vectors $x$ fulfilling the constraint condition is
named the feasible solution set $x_f$. If $x$ in $x_f$ is non-inferior, then $x$ is a Pareto
optimal solution or non-dominated solution. In multi-objective optimization, there are
usually several optimal solutions with different tradeoffs, called Pareto optimal solutions.
The set of Pareto optimal solutions in the objective space is known as the Pareto frontier.
A traditional approach for solving an MOP is to translate it
into a single-objective problem and use the solutions of a single
objective optimization problem to approximate Pareto optimal
solutions of MOP.24 To do so, a group of weights must be
assigned to each normalized objective function so as to trans-
late the MOP to a single objective problem with a scalar objec-
tive function. The problem must be solved multiple times with
different weighted vectors to obtain multiple Pareto optimal
solutions, resulting in Pareto optimal solution set.
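The weighted-sum translation can be sketched for two objectives as follows. The sketch rests on several assumptions that are not stated in the text: min-max normalization of each objective, weights swept uniformly in steps of 0.1 with $w_1 + w_2 = 1$, and a small set of hypothetical (area, functional errors) candidates. A direct non-dominance filter is included for comparison.

```python
def normalize(values):
    """Min-max normalization to [0, 1]; an assumed normalization scheme."""
    lo, hi = min(values), max(values)
    return [0.0 if hi == lo else (v - lo) / (hi - lo) for v in values]


def weighted_sum_selection(candidates, steps=10):
    """Collect the minimizers of w1*f1 + w2*f2 over a uniform sweep of weights."""
    n1 = normalize([f1 for f1, _ in candidates])
    n2 = normalize([f2 for _, f2 in candidates])
    selected = set()
    for k in range(steps + 1):
        w1 = k / steps
        w2 = 1.0 - w1
        scores = [w1 * a + w2 * b for a, b in zip(n1, n2)]
        best = min(scores)
        selected.update(c for c, s in zip(candidates, scores) if s == best)
    return selected


def non_dominated(candidates):
    """Candidates that no other candidate matches or beats in both objectives."""
    def dominates(p, q):
        return p[0] <= q[0] and p[1] <= q[1] and p != q
    return {c for c in candidates if not any(dominates(o, c) for o in candidates)}


# Hypothetical (area in gates, number of functional errors) pairs.
candidates = [(65, 5), (47, 8), (43, 13), (70, 9)]
selected = weighted_sum_selection(candidates)
front = non_dominated(candidates)
```

In this example the weight sweep returns the same three candidates as the non-dominance filter, and the dominated candidate (70, 9) is never selected.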
A weighted summation approach based on uniform weights is then introduced, after applying
STMR to the different topologies of a given circuit evolved by EHW. This summation optimizes
the two objectives in the system, i.e., the area overhead and the number of times the circuit
is functionally upset. Here the former (in terms of the number of gates used) and the latter
are denoted by $f_1$ and $f_2$ respectively. Then the function to be optimized is as follows:
$$f = w_1 f_1 + w_2 f_2 \qquad (3)$$
where $w_1$ and $w_2$ are the weights of $f_1$ and $f_2$ respectively, which range from 0 to
1 uniformly with a certain step size.25 In this paper, the step size is set to 0.1, so there
are 11 weight vectors, i.e., 11 pairs of [w1, w2], namely [0, 1], [0.1, 0.9], [0.2, 0.8],
[0.3, 0.7], [0.4, 0.6], [0.5, 0.5], [0.6, 0.4], [0.7, 0.3], [0.8, 0.2], [0.9, 0.1], [1, 0].
The aim of multi-objective optimization is to minimize f in Eq. (3). For every weight vector,
each feasible solution, i.e. each pair of (f1, f2), is examined to find the minimum value(s)
as a Pareto solution or a Pareto solution set. The Pareto frontier is obtained from the
optimal solutions of all the weight vectors.
3. Experimental setup and results
3.1. Experimental environment
The Xilinx Virtex-5 evaluation platform, ML507, is used as the
hardware platform in this paper. The experimental ﬂows are as
follows:
(1) Obtaining the required topologies using EHW. The
complete intrinsic EHW platform is built on ML507 as
follows: ﬁrstly, virtual reconﬁgurable circuits (VRCs)
modeled in very-high-speed integrated circuit hardware
description language (VHDL) using ISE design suite
are customized to an evolvable IP core attached to the
processor local bus (PLB) bus of a MicroBlaze processor
system created using embedded development kit (EDK)
to form the hardware platform; secondly, the cells in the
VRC are encoded into chromosome, the matching
degree of a circuit’s actual output response to its ideal
output response is used as the fitness function, and a
standard GA modeled in the C language (to reduce the computational
complexity, it has only an even variation (mutation) operator and no
crossover operator; the population size is 128, the mutation
rate is 2, the tournament selection size is 5, and the maximum
number of generations is 20,000) is implemented on the MicroBlaze processor using
software development kit (SDK); ﬁnally, the construc-
tion of the EHW platform is ﬁnished, the host computer
is connected to the target board, and the initial bitstream
of the EHW platform is conﬁgured and downloaded to
the FPGA. Then different topologies of a given combinational circuit can be evolved by mapping its truth
table, test vectors and ﬁtness function in the standard
GA implemented on the MicroBlaze processor, and
the optimal chromosomes displayed on the hyper termi-
nal are decoded to obtain novel topologies.
(2) Constructing a new platform for STMR circuit design.
Each optimal chromosome evolved in (1) is translated
into VHDL format to design a new VRC structure cap-
able of implementing STMR. The new VRC is cus-
tomized to construct the new hardware platforms.
Then the STMR algorithm described in Section 2.2 is
implemented on the MicroBlaze processor using SDK.
(3) Designing STMR circuits. Sensitive gates are identiﬁed
using the STMR algorithm running on the MicroBlaze
processor as follows: a group of primary input probabil-
ities are generated randomly, then the signal probabili-
ties of each line in the circuit are calculated, and the
SEU sensitive gates are identiﬁed. After that, the gates
to be hardened are fed into IP core to reconﬁgure the
corresponding VRCs into their TMR version dynami-
cally and autonomously. Thus, an STMR circuit has
been designed completely. Multiple alternatives of
STMR circuits can be obtained by repeating this process
as different groups of primary input probabilities may
result in different STMR circuits.
(4) Testing the reliability of each STMR circuit. Each
STMR circuit is faulted by injecting a fault on one gate
at a time in simulation until all the gates in the circuit are
examined. The number of functional errors can be calcu-
lated by XOR-ing the outputs of the faulted STMR cir-
cuit and the un-faulted STMR circuit. When these
outputs are disparate, it indicates that the fault injected
in the STMR circuit upsets its output and leads to a
functional failure consequently.
(5) Optimizing multiple alternatives of STMR circuits. The initial weights of the area
overhead and the number of functional errors are set to 0 and 1 respectively and the step
size is set to 0.1. Afterward, under each weight vector, different STMR circuits of a given
combinational circuit are simulated with the same set of 200 test vectors to obtain the
Pareto optimal solutions as well as the Pareto frontier.
Table 2 Experimental results obtained from STMR and full TMR
Circuit name   Threshold probability   STMR N   STMR FR (%)
Alu4           0.3                     0        0
Alu4           0.4                     0        0
Alu4           0.5                     0        0
Parity16e      0.3                     0        0
Parity16e      0.4                     0        0
Parity16e      0.5                     0        0
Add03          0.3                     5        7.6
Add03          0.4                     13       30.2
Add03          0.5                     11       22.4
Con1           0.3                     1        3.8
Con1           0.4                     1        3.8
Con1           0.5                     1        4.2
Because STMR is adopted only on logic gates of both com-
binational and sequential circuits, whereas the FFs of sequen-
tial circuits are hardened by full TMR, all the instances in this
section are combinational circuits.
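The fault-injection test of step (4) can be sketched on a small gate-level model. This is an illustration under stated assumptions: a fault is modeled as an inversion of one gate's output, a fault in a hardened (triplicated) gate is assumed to be fully masked, a gate counts as one functional error if any test vector exposes the fault, and the test vectors are exhaustive rather than 200 random ones. The netlist is a hypothetical full adder built from AND, OR and XOR gates.

```python
from itertools import product

GATE_FUNCS = {
    "AND": lambda v: int(all(v)),
    "OR": lambda v: int(any(v)),
    "XOR": lambda v: sum(v) % 2,
    "NOT": lambda v: 1 - v[0],
}


def simulate(netlist, outputs, inputs, faulty_gate=None):
    """Evaluate the circuit; optionally invert the output of one gate."""
    values = dict(inputs)
    for net, gate_type, in_nets in netlist:
        value = GATE_FUNCS[gate_type]([values[n] for n in in_nets])
        if net == faulty_gate:
            value ^= 1  # injected upset
        values[net] = value
    return [values[o] for o in outputs]


def count_functional_errors(netlist, outputs, input_names, hardened, vectors):
    """Count gates whose single fault changes at least one primary output."""
    errors = 0
    for net, _, _ in netlist:
        if net in hardened:
            continue  # assumed masked by the triplicated structure
        for vector in vectors:
            inputs = dict(zip(input_names, vector))
            good = simulate(netlist, outputs, inputs)
            bad = simulate(netlist, outputs, inputs, faulty_gate=net)
            if any(g ^ b for g, b in zip(good, bad)):  # XOR of the outputs
                errors += 1
                break
    return errors


# Hypothetical full adder: sum = a ^ b ^ cin, cout = (a & b) | ((a ^ b) & cin)
netlist = [
    ("s1", "XOR", ["a", "b"]),
    ("sum", "XOR", ["s1", "cin"]),
    ("c1", "AND", ["a", "b"]),
    ("c2", "AND", ["s1", "cin"]),
    ("cout", "OR", ["c1", "c2"]),
]
outputs = ["sum", "cout"]
input_names = ["a", "b", "cin"]
vectors = list(product([0, 1], repeat=3))

unhardened = count_functional_errors(netlist, outputs, input_names, set(), vectors)
partly_hardened = count_functional_errors(
    netlist, outputs, input_names, {"s1", "sum", "cout"}, vectors
)
```

With no hardening every one of the five gates can upset an output; hardening the XOR gates and the last-level gate leaves the two AND gates as the remaining sources of functional errors.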
3.2. Comparison between STMR and TMR techniques
STMR technique and full TMR technique were tested on
Alu4, Parity16e, Add03, and Con1, respectively. The experi-
mental results obtained from the two methods for three sets
of threshold probabilities (0.3, 0.4, and 0.5) are shown in
Table 2. The circuits Alu4, Parity16e, Add03 and Con1 are
taken from MCNC benchmark library. The columns corre-
sponding to ‘‘STMR’’ represent the synthesized STMR cir-
cuits; whereas the ones marked by ‘‘Full TMR’’ are the full
TMR design of the same circuit.
The column marked as ‘‘S’’ denotes the average percentage
savings in the area overhead of the STMR circuit over the full
TMR design. The columns marked as ‘‘N’’ show the average
number of times that an induced SEU affects the correct oper-
ation of the circuit. The columns marked as ‘‘A’’ represent the
average area of the circuit in terms of gates. The columns
marked as ‘‘FR’’ show the average fault rate (i.e., the ratio
of ‘‘N’’ to A) of the circuit. The STMR circuits are simulated
with the set of 200 test vectors for each set of threshold prob-
ability. The input test vectors randomly generated adhere to
the appropriate probabilities of the inputs that is employed
in generating the corresponding STMR circuits.
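The column definitions can be checked numerically against the values reported in Table 2. The formula for “S” below, $(A_{TMR} - A_{STMR}) / A_{TMR}$, is an assumption inferred from the description of “S” as a percentage saving; “FR” is the ratio of “N” to “A” as stated. Since the reported figures are averages, the recomputed values are expected to agree only approximately.

```python
# (circuit, threshold, STMR N, STMR FR %, STMR A, S %, full TMR A) from Table 2
TABLE_2 = [
    ("Alu4", 0.3, 0, 0.0, 1485, 17, 1779),
    ("Alu4", 0.4, 0, 0.0, 1128, 37, 1779),
    ("Alu4", 0.5, 0, 0.0, 741, 58, 1779),
    ("Parity16e", 0.3, 0, 0.0, 106, 21, 135),
    ("Parity16e", 0.4, 0, 0.0, 107, 21, 135),
    ("Parity16e", 0.5, 0, 0.0, 108, 20, 135),
    ("Add03", 0.3, 5, 7.6, 65, 19, 81),
    ("Add03", 0.4, 13, 30.2, 43, 46, 81),
    ("Add03", 0.5, 11, 22.4, 49, 39, 81),
    ("Con1", 0.3, 1, 3.8, 26, 21, 33),
    ("Con1", 0.4, 1, 3.8, 26, 21, 33),
    ("Con1", 0.5, 1, 4.2, 24, 27, 33),
]


def area_saving_percent(area_stmr, area_full_tmr):
    """Assumed definition of S: relative area saved with respect to full TMR."""
    return 100.0 * (area_full_tmr - area_stmr) / area_full_tmr


def fault_rate_percent(n_errors, area):
    """FR as defined in the text: ratio of N to A, expressed in percent."""
    return 100.0 * n_errors / area


recomputed = [
    (name, thr, area_saving_percent(a, a_full), fault_rate_percent(n, a))
    for name, thr, n, _fr, a, _s, a_full in TABLE_2
]
```

For example, the Alu4 row at threshold 0.5 gives a saving of about 58.3% and the Add03 row at threshold 0.4 gives a fault rate of about 30.2%, in line with the reported values.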
As seen from Table 2 (the threshold probabilities are 0.3, 0.4, and 0.5), the maximum average
percentage savings in the area overhead of the STMR circuit over the full TMR design of Alu4
and Parity16e are 58% and 21% with the same SEU immunity. The reason is that all the gates of
the TMR circuit use the triplicated structure, whereas the STMR circuits selectively apply
TMR to the gates that upset one or more primary outputs of the circuit. This shows that the
area of the STMR design is significantly less than that required for the full TMR design
while still guaranteeing 100% immunity of these circuits against SEUs. The reason why Alu4
and Parity16e have zero FRs may be that both circuits consist of many sub-circuits that are
similar or symmetric. So after the SEU-sensitive gates, as well as all the gates in the first
and last levels of the circuits, have been hardened by triplication, the outputs of the
circuits are not affected when only one SEU affects one gate at a time.
Table 2 (continued)
Circuit name   Threshold probability   STMR A   S (%)   Full TMR N   Full TMR FR (%)   Full TMR A
Alu4           0.3                     1485     17      0            0                 1779
Alu4           0.4                     1128     37      0            0                 1779
Alu4           0.5                     741      58      0            0                 1779
Parity16e      0.3                     106      21      0            0                 135
Parity16e      0.4                     107      21      0            0                 135
Parity16e      0.5                     108      20      0            0                 135
Add03          0.3                     65       19      0            0                 81
Add03          0.4                     43       46      0            0                 81
Add03          0.5                     49       39      0            0                 81
Con1           0.3                     26       21      0            0                 33
Con1           0.4                     26       21      0            0                 33
Con1           0.5                     24       27      0            0                 33
Fig. 5 Average area overhead and the number of average functional errors in STMR circuit of
full adder made of only AND gates.
As for Con1, when the threshold probabilities are 0.3, 0.4, and 0.5, at the cost of a slight loss of SEU immunity (the average fault rates are about 3.8%, 3.8%, and 4.2%, respectively), the average area savings of STMR over the full TMR design are 21%, 21%, and 27%, respectively. As for Add03, when the threshold probabilities are 0.3, 0.4, and 0.5, the average area savings of STMR over the full TMR design are 19%, 46%, and 39%; the numbers of times an induced SEU affects the correct operation of the circuit are 5, 13, and 11; and the fault rates are about 7.6%, 30.2%, and 22.4%, respectively. Therefore, although the STMR circuit may propagate some errors to the final output of the circuit, the STMR method achieves a balance between the area overhead and the number of functional errors.
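The two derived quantities, S and FR, can be recomputed from the reported areas and error counts. The sketch below assumes that S is defined as 1 − A(STMR)/A(full TMR) and that FR is N/A(STMR), both expressed in percent; the function names are illustrative. Because the reported figures are averages over several runs, the recomputed values agree with the reported ones only to within rounding.

```python
def area_saving_percent(stmr_area, tmr_area):
    """Assumed definition of S: relative area saved by STMR over full TMR."""
    return 100.0 * (1.0 - stmr_area / tmr_area)


def fault_rate_percent(n_errors, stmr_area):
    """FR as described in the text: the ratio of N to A, in percent."""
    return 100.0 * n_errors / stmr_area


# Add03 figures quoted in the text and in Table 2:
# threshold probability, STMR area A, number of functional errors N.
ADD03_TMR_AREA = 81
add03 = [
    {"threshold": 0.3, "area": 65, "errors": 5},
    {"threshold": 0.4, "area": 43, "errors": 13},
    {"threshold": 0.5, "area": 49, "errors": 11},
]

results = [
    (
        row["threshold"],
        area_saving_percent(row["area"], ADD03_TMR_AREA),
        fault_rate_percent(row["errors"], row["area"]),
    )
    for row in add03
]

for threshold, saving, fr in results:
    print(f"threshold={threshold}: S = {saving:.1f}%, FR = {fr:.1f}%")
```

Reading the two columns side by side makes the trade-off explicit: for Add03, the threshold that saves the most area is also the one with the highest fault rate.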
3.3. Various topologies of full adder and results of its STMR
circuits
It is well known that various topologies of a combinational circuit can be evolved using EHW technology. In general, each topology is quite different from the others. Thus, multiple diverse solutions can be obtained by applying STMR to each topology, and the most suitable one can be chosen for a given application. Two typical topologies of the full adder are shown in Fig. 4, in which Fig. 4(a) represents a full adder consisting of only NAND gates, whereas Fig. 4(b) shows a full adder that contains AND gates, OR gates, and XOR gates.
Fig. 4 Two typical topologies of full adder.
Fig. 5 illustrates the average area overhead and the average number of functional errors in the STMR circuit of the full adder made of only AND gates. Fig. 5(a) shows how the area overhead changes with the threshold probability, and Fig. 5(b) shows the variation of functional errors with the threshold probability.
As seen from Fig. 5, for a circuit consisting of only AND and/or NAND gates, as the threshold probability increases, the area overhead of the STMR circuits decreases and the number of errors increases. This is because, as mentioned before, the non-dominant value for AND and NAND gates is “1”. Thus, as the threshold probability increases, fewer lines are considered to be at logic “1”, and hence the number of sensitive gates and the area of the STMR circuits decrease. Consequently, more functional errors in the STMR circuit affect the final output.
Similarly, if the circuit consists of only OR and/or NOR gates, as the threshold probability increases, more gates are marked as sensitive and the area overhead of the STMR circuits increases. This leads to a decrease in the number of errors upsetting the outputs.
If, however, a circuit contains not only AND and/or NAND gates but also OR and/or NOR gates, then a change in the threshold probability may not directly cause a decrease or increase in the area overhead and functional errors of the STMR circuit. That is, the area overhead and the number of functional errors of the STMR circuits may be independent of the threshold probability.
Fig. 6 Results of multi-objective optimization for Alu4, Parity16e, Add03 and Con1.
XOR, XNOR, and NOT gates propagate faults no matter what the signal probabilities of the inputs are, so these gates are always considered SEU sensitive. Therefore, the area of the STMR circuit is highly dependent on the number of XOR, NOT, and XNOR gates. Likewise, because the last-level output gates are also considered sensitive no matter what the signal probabilities of the primary inputs are, the area overhead of the STMR circuit increases as the number of last-level gates increases.
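The dependence of gate sensitivity on the threshold probability described in this subsection can be sketched in a few lines. The rule coded here is an assumption reconstructed from the discussion above, not a transcription of the authors' implementation: a line is treated as logic “1” when its probability of being 1 is at least the threshold and as “0” otherwise, and a gate is marked sensitive when, for at least one input, all the other inputs sit at the gate's non-dominant value. The names `assumed_logic_value` and `is_sensitive` are illustrative.

```python
# Non-dominant input values: 1 for AND/NAND, 0 for OR/NOR.
NON_DOMINANT = {"AND": 1, "NAND": 1, "OR": 0, "NOR": 0}
# Gates that propagate a fault regardless of the signal probabilities.
ALWAYS_SENSITIVE = {"XOR", "XNOR", "NOT"}


def assumed_logic_value(p_one, threshold):
    """Assumed rule: a line counts as 1 if P(line = 1) >= threshold."""
    return 1 if p_one >= threshold else 0


def is_sensitive(gate_type, input_probs, threshold, is_last_level=False):
    """Return True if the gate would be marked SEU sensitive."""
    if is_last_level or gate_type in ALWAYS_SENSITIVE:
        return True
    nd = NON_DOMINANT[gate_type]
    values = [assumed_logic_value(p, threshold) for p in input_probs]
    # An upset on input i propagates if every other input is non-dominant.
    for i in range(len(values)):
        others = values[:i] + values[i + 1:]
        if all(v == nd for v in others):
            return True
    return False


# Hypothetical signal probabilities on the two inputs of a gate.
probs = [0.35, 0.45]
for threshold in (0.3, 0.4, 0.5):
    print(
        threshold,
        "AND:", is_sensitive("AND", probs, threshold),
        "OR:", is_sensitive("OR", probs, threshold),
    )
```

With these example probabilities the AND gate stops being marked sensitive at the highest threshold while the OR gate starts being marked sensitive as the threshold rises, which mirrors the opposite area trends described for AND/NAND-only and OR/NOR-only circuits.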
3.4. Results of multi-objective optimization in fault-tolerant
system of STMR
Fig. 6 shows the results of multi-objective optimization for the STMR circuits of Alu4, Parity16e, Add03 and Con1.
As seen from Fig. 6, for circuits Alu4 and Parity16e, the Pareto-optimal solutions for the 11 weight vectors are all located on the vertical axis. The reason is that when the STMR strategy is applied to Alu4 and Parity16e, the numbers of functional errors are almost 0, so the solutions with the minimum area overhead in terms of gates are the best solutions.
As for Add03 and Con1, it can be seen from Fig. 6(c) and (d) that multiple alternatives are offered to decision-makers: if a small number of functional upsets is required, data points on the upper left of the graph can be selected; if a small area overhead is preferred, data points on the lower right of the graph can be selected; and if both are of concern, the middle ones can be selected. Therefore, decision-makers can select satisfactory solutions depending on their own requirements. Consequently, applying multi-objective optimization algorithms to the STMR fault-tolerant system provides the flexibility to meet different application requirements.
Note that in Fig. 6(d), some suboptimal solutions, such as the data points (1, 29), (2, 25) and (3, 21), have been optimized away because of the existence of the optimal solution, data point (0, 21); therefore, relatively fewer alternatives are offered.
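The elimination of suboptimal points noted for Fig. 6(d) follows directly from Pareto dominance, which the sketch below illustrates. It assumes that each data point is read as (number of functional errors, area in gates) and that both objectives are minimized. The weighted-sum scan with 11 evenly spaced weights is likewise an assumption about how the 11 weight vectors might be formed, not the authors' stated algorithm.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(
        x < y for x, y in zip(a, b)
    )


def pareto_front(points):
    """Keep only the points that no other point dominates."""
    return [
        p for p in points
        if not any(dominates(q, p) for q in points if q != p)
    ]


def weighted_sum_choices(points, n_weights=11):
    """Pick the best point for each weight w in {0, 0.1, ..., 1} (assumed)."""
    choices = []
    for k in range(n_weights):
        w = k / (n_weights - 1)
        choices.append(min(points, key=lambda p: w * p[0] + (1 - w) * p[1]))
    return choices


# Data points quoted in the text for Fig. 6(d), read as (errors, area).
candidates = [(0, 21), (1, 29), (2, 25), (3, 21)]
front = pareto_front(candidates)
print(front)

# Hypothetical points with a genuine trade-off: none dominates another.
tradeoff = [(0, 30), (2, 25), (5, 20)]
print(pareto_front(tradeoff))
print(sorted(set(weighted_sum_choices(tradeoff))))
```

In the first set, (0, 21) has both fewer errors and no more area than each of the other three points, so it is the only survivor. In the second, hypothetical set every point remains on the front, which is the situation in which decision-makers have several alternatives to choose from.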
4. Conclusions
(1) To improve the reliability of electronics in space, the multi-objective evolutionary design of STMR systems against SEUs is proposed in this paper. In our method, the EHW technique is employed to synthesize circuits with various topologies and the same functionality. Then the SEU-sensitive gates in the circuits are detected and hardened against SEUs by applying the STMR technique. Afterwards, multi-objective optimization techniques are brought in to optimize the number of functional errors and the area overhead of the STMR circuits simultaneously. To the best of our knowledge, the combination of EHW, STMR and multi-objective optimization techniques, as well as the design flow, has not been reported yet.
(2) In our method, the lower area overhead of STMR, the novel design capability of EHW and the tradeoff advantages between reliability and resource consumption of the multi-objective optimization technique are combined. Compared with the TMR method, the STMR method offers a substantial area reduction with a small loss of reliability. EHW techniques may find innovative topologies of a given circuit that differ from the traditional topologies; by then applying STMR to each topology, a diverse set of alternative solutions can be obtained, among which the best solution may be found. The multi-objective optimization technique provides plenty of alternative options for users to choose from in particular situations. In addition, owing to the self-repairing potential of EHW, the proposed fault-tolerant strategy can effectively deal with faulty modules in the STMR circuits.
(3) In the future, we will investigate how to use abundant tri-state buffers to construct SEU-immune majority voter circuits, as well as find out how to repair faults using the online self-repairing capacity of EHW.
Acknowledgements
This study was supported by the National Natural Science Foundation of China (No. 61402226) and the Fundamental Research Funds for the Central Universities of China (No. NS2014036).
Yao Rui is currently an associate professor at Nanjing University of Aeronautics and Astronautics in 2008. Her main research interests are bio-inspired hardware and intelligent circuits, computer control systems and intelligent information processing.
Chen Qinqin is a Master Degree Candidate at Nanjing University of Aeronautics and Astronautics. Her research interest includes computer control systems.
Li Zengwu is a Master Degree Candidate at Nanjing University of Aeronautics and Astronautics. Her research interest is computer control systems.
Sun Yanmei is a Master Degree Candidate at Nanjing University of Aeronautics and Astronautics. Her research interest is computer control systems.
