S-SETA: Selective Software-Only Error-Detection Technique Using Assertions by Chielle, Eduardo et al.
This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.
IEEE TRANSACTIONS ON NUCLEAR SCIENCE 1
S-SETA: Selective Software-Only Error-Detection
Technique Using Assertions
Eduardo Chielle, Gennaro S. Rodrigues, Fernanda L. Kastensmidt, Sergio Cuenca-Asensi,
Lucas A. Tambara, Paolo Rech, and Heather Quinn
Abstract—Software-based techniques offer several advantages
to increase the reliability of processor-based systems at very low
cost, but they cause performance degradation and an increase of
the code size. To meet constraints in performance and memory,
we propose SETA, a new control-ﬂow software-only technique
that uses assertions to detect errors affecting the program ﬂow.
SETA is an independent technique, but it was conceived to work
together with previously proposed data-ﬂow techniques that aim
at reducing performance and memory overheads. Thus, SETA is
combined with such data-ﬂow techniques and submitted to a fault
injection campaign. Simulation and neutron induced SEE tests
show high fault coverage at performance and memory overheads
lower than those of the state-of-the-art.
Index Terms—Aerospace applications, control-ﬂow, energy con-
straints, error detection, fault coverage, fault tolerance, memory
overhead, performance degradation, processors, reliability, soft er-
rors, software-based techniques.
I. INTRODUCTION
THE advances in the semiconductor industry have allowed the
fabrication of high density integrated circuits (ICs).
Nowadays, we are reaching the physical limits of a couple of atoms
to form the transistor's gate [1][2]. However, the higher quantity
of transistors per die combined with reduced threshold voltages
and increased operating frequencies have made ICs more sensitive
to faults caused by radiation [3]. Such faults can be caused
by energized particles present in space or secondary particles
such as alpha particles, generated by the interaction of neutrons
with materials at ground level [4].
Transient ionization may occur when a single ionizing
particle strikes the silicon, creating a transient voltage
pulse, or a Single Event Effect (SEE). This effect affects processors
by modifying values stored in the sequential logic, known
as Single Event Upset (SEU), or by changing the function of a
circuit in the combinational logic, known as Single Event Transient
(SET). Such faults may lead the system to incorrectly execute
an application. In critical processor-based systems, an error is
unacceptable. Thus, to ensure reliability for the microprocessor
against SEEs, the use of fault tolerance techniques is mandatory.
Manuscript received July 09, 2015; revised September 08, 2015; accepted
September 28, 2015. This work was supported in part by CNPq and CAPES,
Brazilian agencies.
E. Chielle, G. S. Rodrigues, F. L. Kastensmidt, L. A. Tambara and P. Rech
are with the Instituto de Informática, PGMICRO, UFRGS, Porto Alegre, Brazil
(e-mail: echielle@inf.ufrgs.br; gsrodrigues@inf.ufrgs.br; fglima@inf.ufrgs.br;
latambara@inf.ufrgs.br; prech@inf.ufrgs.br).
S. Cuenca-Asensi is with the Computer Technology Department, University
of Alicante, Alicante, Spain (e-mail: sergio@dtic.ua.es).
H. Quinn is with the Space Data System Group, Los Alamos National Laboratory,
Los Alamos, NM, USA (e-mail: hquinn@lanl.gov).
Color versions of one or more of the figures in this paper are available online
at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TNS.2015.2484842
There are two main types of fault tolerance techniques that
aim to harden processors under neutron induced soft errors,
which are hardware-based and software-based techniques. The
ﬁrst one relies on replicating or adding hardware modules. And
the second one relies on the replication of information and in-
structions in the program code [5]. Hardware-based techniques
usually change the original processor architecture by adding
logic redundancy, error correcting codes and majority voters.
They can also be based on hardware monitoring devices, called
watchdog processors [6], to monitor memory accesses. How-
ever, hardware-based techniques present signiﬁcant overheads,
like increase in area and power consumption, and high design
and manufacturing costs [7]. Software-based techniques are a
well-known approach to protect systems against SEEs by mod-
ifying the program code, without having to change the under-
lying hardware. They rely on adding instruction redundancy and
comparison to detect or correct errors. These techniques pro-
vide high ﬂexibility and low development time and cost. In ad-
dition, they allow the use of commercial off-the-shelf (COTS)
processors since no modification to the hardware is required. Although
software redundancy brings reliability to the system, it
requires extra processing time since more instructions are being
executed, and more memory, since redundancy is inserted. As a
consequence, the energy consumption is increased [8][9]. In a
previous paper, we reduced those overheads for data-ﬂow tech-
niques [10].
In this paper, we propose a new control-ﬂow technique
called SETA (Software-only Error-detection Technique using
Assertions) and a new method for selective hardening, called
tunnel effect. SETA is a technique to detect control-ﬂow errors
using assertions with lower overheads designed to be used
together with data-ﬂow techniques. Then, the selective hard-
ening method is implemented with SETA, creating S-SETA
(Selective SETA). S-SETA increases the ﬂexibility of the
original technique and allows optimizing the trade-off between
overheads and reliability. The techniques are evaluated in
terms of execution time, code size and fault coverage, and
compared to the literature. Experiments were performed with the
miniMIPS and ARM Cortex-A9 processors. The fault coverage
was obtained by simulation and also neutron induced SEE
tests. Results from fault injection show around 98% fault
coverage at overheads in performance and memory inferior to
0018-9499 © 2015 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
state-of-the-art techniques. Results from the neutron radiation
test conﬁrm the high fault coverage obtained by simulation.
II. FAULT TOLERANCE IN SOFTWARE
Software-based fault tolerance techniques, also referred to in the
literature as Software-Implemented Hardware Fault Tolerance
(SIHFT) [11], are techniques implemented in software to protect
processors against soft errors that may affect the program flow or
the data stored in registers or memory. The techniques that
aim to protect the data are called data-flow techniques, and the
ones to protect the control-ﬂow are the control-ﬂow techniques.
There are also techniques that combine features of both data and
control-ﬂow techniques. They consist of transformation rules of
the code and can be understood as a data-ﬂow and control-ﬂow
technique applied together. Table I summarizes both types of
software-based techniques.
A. Data-Flow Techniques
Data-ﬂow techniques are designed to protect the data stored
in registers or memory. These techniques replicate the regis-
ters, assigning copies to the original ones. When the aim is
error detection, registers are duplicated, and when correction is
included, registers are triplicated. Checkers (voters, in the case of
correction) are inserted in the code to compare the registers with their
copies. An example is shown in Fig. 1. The points where the
checkers are inserted depend on the technique. Since error de-
tection presents lower overheads than correction due to duplica-
tion instead of triplication, this paper focuses on error detection.
Some well-known data-ﬂow techniques present in the literature
are EDDI [12] and Variables [13].
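As a rough illustration of duplication with checkers (not the code of EDDI or Variables), the following Python sketch keeps a replica of every "register" and compares the two before a value is used. The `DuplicatedRegisters` class, its method names and the register names are hypothetical and serve only to show the principle.

```python
class ErrorDetected(Exception):
    """Raised when a register and its replica disagree."""


class DuplicatedRegisters:
    """Toy register file in which every register has a replica (hypothetical model)."""

    def __init__(self):
        self.main = {}
        self.replica = {}

    def write(self, reg, value):
        # Duplication: the same write is performed on the original and on the copy.
        self.main[reg] = value
        self.replica[reg] = value

    def read_checked(self, reg):
        # Checker: compare the register with its replica before using the value.
        if self.main[reg] != self.replica[reg]:
            raise ErrorDetected(f"mismatch in register {reg}")
        return self.main[reg]

    def flip_bit(self, reg, bit):
        # Emulate a single event upset that hits the original register only.
        self.main[reg] ^= 1 << bit


regs = DuplicatedRegisters()
regs.write("r1", 40)
regs.write("r2", 2)
total = regs.read_checked("r1") + regs.read_checked("r2")  # fault-free execution

regs.flip_bit("r1", 3)  # upset in r1: 40 becomes 32
try:
    regs.read_checked("r1")
    detected = False
except ErrorDetected:
    detected = True
```

With duplication only, the mismatch can be reported but not repaired; a third copy and a majority vote would be needed for correction, which is the extra cost the text refers to.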
B. Control-Flow Techniques
Control-ﬂow techniques are designed to protect the pro-
gram’s ﬂow, i.e., to protect against incorrect branches. Such
techniques divide the code into basic blocks. A basic block
(BB) is a branch-free sequence of instructions, i.e., a portion
of code that is always executed in sequence. A branch instruction
can only appear at the end of the basic block. Further,
no branch may target the basic block, except at its first
instruction. For each basic block, a signature is assigned. The
signature is attributed to a global register at the beginning of
the basic block. Checkers are inserted in the code to verify if
the signature register contains the expected value. If it does not,
it means there was an incorrect branch and an error is reported.
The main control-ﬂow techniques present in the literature are
CFCSS [14], YACCA [15] and CEDA [16]. Table II shows the
execution time and fault coverage of these techniques. As one
can see, CEDA is the one with the best trade-off between
fault coverage and performance, and that is why it is used as
the baseline technique of this paper.
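The division of code into basic blocks described above can be sketched with the usual "leader" rule: the first instruction, every branch target, and every instruction following a branch start a new block. The program encoding below, a list of `(opcode, target)` pairs, is an assumption made for this example and is not taken from any of the cited techniques.

```python
def split_into_basic_blocks(program):
    """Return basic blocks as lists of instruction indices.

    `program` is a list of (opcode, target) pairs; `target` is the index of the
    branch destination, or None for non-branch instructions (toy encoding).
    """
    leaders = {0}
    for index, (_opcode, target) in enumerate(program):
        if target is not None:
            leaders.add(target)            # a branch target starts a block
            if index + 1 < len(program):
                leaders.add(index + 1)     # so does the instruction after a branch
    ordered = sorted(leaders)
    bounds = ordered + [len(program)]
    return [list(range(bounds[i], bounds[i + 1])) for i in range(len(ordered))]


program = [
    ("li", None),    # 0
    ("add", None),   # 1
    ("bne", 5),      # 2: conditional branch to instruction 5
    ("sub", None),   # 3
    ("j", 6),        # 4: jump to instruction 6
    ("mul", None),   # 5
    ("sw", None),    # 6
]
blocks = split_into_basic_blocks(program)
```

Each resulting block has a branch only as its last instruction and can be entered only at its first instruction, which matches the definition in the text.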
TABLE I
TYPES OF SOFTWARE-BASED FAULT TOLERANCE TECHNIQUES
TABLE II
STATE-OF-THE-ART CONTROL-FLOW TECHNIQUES [16]
UF: UNDETECTED FAULTS
ET: EXECUTION TIME
III. PROPOSED TECHNIQUES
A. VAR3+ Data-Flow Technique
In [10], seventeen data-flow techniques that aim at reducing
the overheads in performance, memory and energy consumption
were presented and validated by fault injection. They consist of
three different types of rules: global, duplicating and checking
rules, as one can see in Table III. There is only a global rule,
and it is applied to all techniques. It states that every register
used by the program has a spare register assigned as its replica.
The duplicating rules regard how the instructions will be dupli-
cated. They are only applied when write operations in a register
or memory are performed. Therefore, branch instructions are
not considered in this case. There are two types of duplicating
rules. Each technique can only have one duplicating rule. D1
duplicates all instructions, including stores, which allows the use
of unprotected memories, since the original value and its replica
can be stored in different positions in the memory. D2 duplicates
all instructions, except stores. The latter is adequate when the
memory is hardened because the data in memory do not need to
be duplicated. Thus, the duplication overhead and the number of
memory accesses are reduced. The checking rules indicate when
a register and its replica must be compared aiming at verifying if
an error has occurred (when they present different values). The
techniques can have more than one checking rule. Theoretically,
the more checkers are included in one technique, the more reli-
ability is achieved.
Of the seventeen proposed data-flow techniques that
were evaluated by fault injection, we selected the one with the
lowest overheads among those with the highest error detection
rate. This technique is named VAR3+ and the rules it uses are
presented in Table IV. As one can notice, VAR3+ uses dupli-
cating rule D2 and checking rules C3, C4, C5, and C6. Table V
shows an example of a code hardened by VAR3+. The original
code is presented as normal text, the code inserted by the du-
plicating rule is formatted as italic (lines 3, 6 and 8), and the
checkers are bold (lines 1, 4, 9 and 10).
TABLE III
RULES FOR DATA-FLOW TECHNIQUES [10]
This paper proposes a new control-ﬂow technique. However,
we will use VAR3+ data-ﬂow technique to complement the pro-
posed control-ﬂow technique once it has been compared to the
state-of-the-art.
B. SETA Control-Flow Technique
To complement the data-ﬂow techniques and, thus, promote
protection of both data and control-ﬂow, we propose a new tech-
nique based on HETA [17] and CEDA. All of them use runtime
signatures to detect faults in the control-ﬂow. However, HETA
makes use of hardware to help in the detection, which requires
extra power and also cannot be applied to most COTS devices.
Furthermore, both CEDA and HETA are concerned about the
fault coverage, but not about the overheads they cause. Aiming
at providing similar error detection rate as CEDA at lower over-
heads, SETA is proposed. The technique makes use of signa-
tures, calculated a priori and processed at runtime. The program
code is divided into basic blocks, and the correlation among
them based on the program flow is used to calculate the signatures
in such a manner as to guarantee detection of control-flow
errors.
Two Basic Block Types (BBT) are deﬁned: A and X. A basic
block is of type A if it has multiple predecessors and at least
one of its predecessors has multiple successors, and it is of
type X if it is not of type A. After that, the basic blocks are
grouped into networks. Basic blocks with common predecessors
belong to the same network. Then, SETA calculates the signa-
tures. Two different Basic Block Signatures (BBS) are used: a
Node Ingress Signature (NIS) and a Node Exit Signature (NES).
To verify those signatures during runtime, a Signature Register
(S) is used. The signature values are divided into two parts: an
upper half and a lower half, as shown in Table VI. Each part is
calculated differently, and their sizes may vary according to the
program code requirements to avoid aliasing. The upper half is
used to identify the network that the basic block belongs to (in
the case of the NIS) and the network of the successor basic block
(in the case of the NES). The lower half is used to differentiate
the basic blocks inside of a network. An example can be seen
in Fig. 2. At runtime, S is updated according to the following
rules by using an XOR or AND operation with an invariant. The
invariant is a constant that updates the signature to its new value.
Equations (1) and (2) are the two possible ways to update S. The
updating of S follows the rules presented by Table VII. The
operation will be an AND when BBT is A and S is updating to
NIS. Otherwise, S will be updated by an XOR
$$S = S \oplus \mathit{invariant} \qquad (1)$$
$$S = S \wedge \mathit{invariant} \qquad (2)$$
TABLE IV
RULES FOR VAR3+ DATA-FLOW TECHNIQUES
TABLE V
EXAMPLE OF VAR3+ DATA-FLOW TECHNIQUE FOR MINIMIPS PROCESSOR
TABLE VI
SIGNATURE DIVISION
Fig. 1. Example of a data-flow technique.
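The block classification and network grouping defined above can be sketched in Python as follows. The control-flow graph used here is invented for illustration and is not the one of Fig. 2. Two further assumptions of this sketch are that "common predecessors" is applied transitively (via union-find) and that a block sharing no predecessor with any other block forms a network of its own.

```python
from collections import defaultdict


def classify_blocks(successors):
    """Return {block: 'A' or 'X'} using the definition given in the text."""
    predecessors = defaultdict(set)
    for block, succs in successors.items():
        for s in succs:
            predecessors[s].add(block)
    types = {}
    for block in successors:
        preds = predecessors[block]
        # Type A: multiple predecessors, at least one of which has multiple successors.
        is_a = len(preds) > 1 and any(len(successors[p]) > 1 for p in preds)
        types[block] = "A" if is_a else "X"
    return types


def group_into_networks(successors):
    """Group blocks that share a predecessor into the same network (union-find)."""
    parent = {block: block for block in successors}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for succs in successors.values():
        succs = sorted(succs)
        for other in succs[1:]:
            # All successors of one block have that block as a common predecessor.
            parent[find(other)] = find(succs[0])

    networks = defaultdict(set)
    for block in successors:
        networks[find(block)].add(block)
    return sorted(sorted(n) for n in networks.values())


# Hypothetical control-flow graph: block -> list of successor blocks.
cfg = {0: [1, 2], 1: [3], 2: [3, 4], 3: [5], 4: [5], 5: []}
types = classify_blocks(cfg)
networks = group_into_networks(cfg)
```

In this graph only block 3 is of type A: it has two predecessors (1 and 2), and predecessor 2 has two successors. Block 5 also has two predecessors, but each of them has a single successor, so it remains of type X.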
XOR is a preferable operation to update S because it does not
mask bits. Thus, an error affecting the signature will propagate
to the next basic block, where it can be detected later. However,
when a basic block is of type A, i.e., when it has multiple
predecessors and at least one of its predecessors has multiple
successors, it is necessary to mask some bits to keep the signature
consistent. An XOR cannot be used in this case because it
does not mask bits. The AND, on the other hand, can do that,
masking only the bits that need to be masked.
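The difference between the two update operations can be checked with a few lines of Python. All signature and invariant values below are arbitrary 8-bit constants chosen for this example; they are not values produced by SETA.

```python
S_CURRENT = 0b1010_0110      # hypothetical signature currently held in S
TARGET = 0b0101_0011         # hypothetical signature S must hold next
ERROR = 0b0000_1000          # single bit-flip affecting S

# XOR update: the invariant is chosen so that S_CURRENT ^ invariant == TARGET.
xor_invariant = S_CURRENT ^ TARGET
clean = S_CURRENT ^ xor_invariant
faulty = (S_CURRENT ^ ERROR) ^ xor_invariant
# The flipped bit is still wrong after the update, so a later checker can see it.
xor_error_survives = (clean ^ faulty) == ERROR

# AND update: bits cleared in the invariant are forced to 0 whatever S held.
and_invariant = 0b1111_0000
clean_masked = S_CURRENT & and_invariant
faulty_masked = (S_CURRENT ^ ERROR) & and_invariant
# The error sat in a masked bit, so it disappears from the signature.
and_error_masked = faulty_masked == clean_masked
```

The sketch shows why XOR is preferred wherever possible: it never hides a corrupted bit, whereas an AND can erase the evidence of an error in the bits it clears.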
To detect incorrect branches, checkers can be inserted before
exiting transitions. The more checkers, the lower the latency
to detect errors, but the higher the overhead.
The maximum number of checkers matches the number of
basic blocks since only one checker is needed per basic block.
Table VIII shows an example of SETA for miniMIPS processor.
Fig. 2. Representation of a program ﬂow. Basic blocks (circles) classiﬁed as
type A or X, and grouped into networks. The arrows indicate the possible direc-
tions that a basic block may take.
TABLE VII
SIGNATURE UPDATE
TABLE VIII
EXAMPLE OF SETA CONTROL-FLOW TECHNIQUE FOR MINIMIPS PROCESSOR
At the left, there is a portion of an unhardened code and, at the
right, the same code hardened by SETA. The checkers inserted
by SETA are in bold and the signature updates are formatted as
italic. The ﬁrst XOR (xori) is to update the signature to the basic
block’s NIS. The instructions li and bne are used to compare the
signature register $7 with the expected signature for that basic
block. Finally, the last XOR is used to update the signature to
the expected NES. Since new instructions are inserted, it is clear
that the execution time and the code size will increase.
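The update, check, update sequence described for Table VIII can be mimicked in Python. This is a simplified sketch for a straight-line chain of three blocks only: the (NIS, NES) values, the initial value of S and the way invariants are derived from neighbouring signatures are assumptions made for illustration, not the signature assignment actually used by SETA.

```python
class ControlFlowError(Exception):
    """Raised by a checker when S does not hold the expected signature."""


# Hypothetical (NIS, NES) pairs for a straight-line chain B0 -> B1 -> B2.
SIGNATURES = {0: (0x11, 0x2A), 1: (0x37, 0x4C), 2: (0x59, 0x6E)}
INITIAL_S = 0x00


def build_invariants(signatures, initial):
    """Pre-compute, per block, the constants XORed into S on entry and on exit."""
    invariants = {}
    previous_exit = initial
    for block in sorted(signatures):
        nis, nes = signatures[block]
        invariants[block] = (previous_exit ^ nis, nis ^ nes)
        previous_exit = nes
    return invariants


def run(path, signatures=SIGNATURES, initial=INITIAL_S):
    """Execute the blocks in `path`, checking S inside every block."""
    invariants = build_invariants(signatures, initial)
    s = initial
    for block in path:
        entry_inv, exit_inv = invariants[block]
        s ^= entry_inv                    # like the first xori: S should become NIS
        if s != signatures[block][0]:     # like li + bne: compare with expected NIS
            raise ControlFlowError(f"unexpected signature in block {block}")
        s ^= exit_inv                     # like the last xori: S becomes NES
    return s


final_signature = run([0, 1, 2])          # legal path

try:
    run([0, 2])                           # illegal jump that skips block 1
    illegal_jump_detected = False
except ControlFlowError:
    illegal_jump_detected = True
```

Because the invariants are constants fixed before execution, an illegal jump leaves S with a value that the entry invariant of the destination block cannot turn into that block's NIS, and the checker reports the error.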
The main differences from CEDA to SETA are:
• Removed inverted branches check. CEDA inserts branches
at both possible targets of each branch to check it was taken
correctly. SETA does not implement it because the fault
coverage it provides is negligible in relation to the over-
heads it causes. It only detects errors affecting the decision
of a branch when the registers and the comparison are correct,
but the branch takes the wrong direction.
• Removed extra instructions used to avoid aliasing. SETA
does not need to insert instructions to “clear” the signature,
as it is done in CEDA, because the signature values are as-
signed in a different way. The upper half is deterministic
and the lower half is randomly determined. Thus, the signa-
ture register can be always directly updated, which reduces
the overheads. SETA avoids aliasing by varying the size of
the “halves”, trying to maximize the size of the lower half.
C. Evaluation Methodology
The following parameters were used to evaluate the quality
of the proposal and to compare with the state-of-the-art:
• Execution time: it expresses the amount of time an appli-
cation takes to execute. The execution time of a hardened
application is presented normalized, i.e., it is divided by
the execution time of the unhardened application;
• Code size: it refers to the amount of memory a program
occupies on disk. The code size of a hardened application
is also normalized by the unhardened application;
• Fault coverage: the fault coverage is the amount of de-
tected and masked faults. It is expressed in percentage, and
it is given by the equation (3). The fault coverage is the sum
of detected faults with masked faults, divided by the total
of faults. It can also be expressed as one minus the total of
faults that caused undetected errors, divided by the total of
faults
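$$\text{fault coverage} = \frac{\text{detected faults} + \text{masked faults}}{\text{total faults}} \times 100\% = \left(1 - \frac{\text{faults causing undetected errors}}{\text{total faults}}\right) \times 100\% \qquad (3)$$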
• Mean Work To Failure (MWTF)[18]: the MWTF, given
by equation (4), was used as an overall quality metric. It
captures the tradeoff between reliability and performance,
since the more time an application needs to run, the higher
the probability of being hit by a particle and, consequently,
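affected by a fault. AVF (Architectural Vulnerability Factor)
is used to measure a microarchitectural structure's susceptibility
to transient faults [19]. The raw error rate is the
percentage of errors not detected. In this paper, the MWTF of
a hardened application is normalized by the MWTF of the
unhardened application
$$\mathrm{MWTF} = \frac{\text{amount of work completed}}{\text{number of errors encountered}} = \left(\text{raw error rate} \times \mathrm{AVF} \times \text{execution time}\right)^{-1} \qquad (4)$$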
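The two equivalent ways of expressing fault coverage stated in the list above can be written as small Python helpers. The fault counts used in the example are invented for illustration and are not results from the paper.

```python
def fault_coverage(detected, masked, total):
    """Fault coverage in percent: (detected + masked) / total."""
    return 100.0 * (detected + masked) / total


def fault_coverage_from_undetected(undetected, total):
    """Same quantity, expressed as one minus the share of undetected errors."""
    return 100.0 * (1.0 - undetected / total)


# Hypothetical campaign: every fault is either detected, masked or undetected.
TOTAL = 10_000
DETECTED = 3_000
MASKED = 6_400
UNDETECTED = TOTAL - DETECTED - MASKED

coverage_a = fault_coverage(DETECTED, MASKED, TOTAL)
coverage_b = fault_coverage_from_undetected(UNDETECTED, TOTAL)
```

Both helpers return the same value because every injected fault falls into exactly one of the three classes.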
Large fault injection campaigns were performed over a set of
benchmarks. Most of them were done by means of simulation
tools because obtaining the amount of data needed for all the
tests by radiation testing alone is infeasible. Nevertheless,
radiation tests were performed to check whether their results match
the ones obtained by simulation and, thus, validate the fault
injections by simulation.
Faults were injected by forcing a bit-ﬂip at RTL level in the
miniMIPS [20] processor’s internal signals using ModelSim
[21], a simulation tool. Every signal is considered. A total
of 10,000 faults is injected per application. Only one fault is
injected per execution. The fault duration is set to one clock
cycle in order to force their effect to hit the clock barrier of the
ﬂip-ﬂops and, therefore, increase the probability of an error. A
golden execution (with no injected faults) is executed. Then,
the program is submitted to faults, and the memory results of
the program under test are compared to the golden results. The
error is signaled when the result stored in the memory differs
from the expected one.
Fig. 3. Comparison between CEDA and SETA for miniMIPS processor. Both
techniques present similar fault coverage. However, SETA presents significantly
lower overheads, which explains its higher MWTF.
Fig. 4. Comparison between SETA and VAR3+,SETA for miniMIPS processor.
Despite the increase of the overheads when both techniques are applied together
(VAR3+,SETA), there is a significant increase of the MWTF due to the higher
fault coverage. [Plot not reproduced. Series: execution time, code size and MWTF,
in multiples of the original value (axis from 1 to 6), and fault coverage (axis
from 0% to 100%). Groups: SETA and VAR3+,SETA (data and control-flow techniques).]
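The golden-run comparison described above can be sketched at a much higher level than the RTL injection used in the paper. The following Python example flips one bit of the input data of a toy workload per run and classifies the outcome by comparing with the golden result. The workload, the data and the two outcome labels are assumptions made for illustration; no detection technique is modelled here.

```python
def workload(data):
    """Toy program under test: returns the largest element."""
    return max(data)


def inject_and_classify(data, index, bit):
    """Run once with a single bit-flip and compare with the golden result."""
    golden = workload(data)                 # golden execution, no fault
    faulty_data = list(data)
    faulty_data[index] ^= 1 << bit          # single bit-flip, one fault per run
    return "masked" if workload(faulty_data) == golden else "error"


DATA = [3, 12, 7, 5]
# Exhaustive campaign over the 4 low-order bits of every element.
outcomes = [
    inject_and_classify(DATA, index, bit)
    for index in range(len(DATA))
    for bit in range(4)
]
masked = outcomes.count("masked")
errors = outcomes.count("error")
```

Many of the injected faults are masked because they hit values that do not influence the final result, which is why masked faults count towards fault coverage alongside detected ones.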
D. Simulation Results on MiniMIPS Processor
Nine case-study applications are hardened with SETA
and compared to CEDA. They are Bubble Sort, sequential
Depth-First Search, recursive Depth-First Search, Dijkstra’s
Algorithm, Matrix Multiplication, Run Length Encoding,
Summation, TETRA Encryption Algorithm (TEA2) and Tower
of Hanoi.
Fig. 3 shows the average results for each technique. When
CEDA and SETA have one checker per basic block, the fault
coverage of both techniques is similar, around 94%. The
advantages of SETA are due to its reduced overheads.
CEDA presents an execution time of 1.54x and a code size of 1.45x,
while SETA achieves an execution time of 1.38x and a code size
of 1.29x. That is the reason why SETA’s MWTF is 1.73x while
CEDA’s is 1.64x.
SETA was combined with VAR3+ in order to protect both
data and control-flow of the application. Fig. 4 shows the result and
compares it with SETA only. Despite the increase of the execution
time from 1.38x to 2.11x, and of the code size from 1.29x
to 1.89x, there was a significant increase of the MWTF, from
1.73x to 5.23x, due to the higher fault coverage of almost 99%.
E. Neutron Experiment on ARM Cortex-A9 Processor
Experiments were performed at Los Alamos National
Laboratory’s (LANL) Los Alamos Neutron Science Center
(LANSCE) Irradiation of Chips and Electronics House II,
Los Alamos, US, in December 2014 in order to validate the
fault injection campaign by simulation. As mentioned in [22],
LANSCE provides a white neutron source that emulates the en-
ergy spectrum of the atmospheric neutron ﬂux. The relationship
between neutron energy and modern devices cross section is
still an open question. Nevertheless, LANSCE beam has been
empirically demonstrated to be suitable to mimic terrestrial
radiation environment [22].
The setup, shown in Fig. 5, consists of a board, computer,
USB net switch, cables for communication, and cables for
power supply. The computer is connected to the board by two
USB cables. One is used to program the board, and the other
is used to receive the output from the board. The power supply
of the board is connected to the USB net switch, which is con-
nected by USB to the computer. It is used to control when the
power supply is available to the board. The board utilized in the
tests is the ZedBoard. It is a low-cost development board for the
Xilinx Zynq-7000 All Programmable SoC, XC7Z020-CLG484
part, that has embedded a Dual-core ARM Cortex-A9 pro-
cessor [23]. Only one core was utilized during the test and both
caches were enabled. It executed a target application that sends
the output by UART to the computer and, then, restarts its
execution. The computer was running a monitoring application
that listens to the COM port connected to the board’s
UART and classiﬁes the output. In case of error in the ARM
processor, the processor is reset.
The neutron ﬂux was approximately n cm s
for energies above 10 MeV. The beam was focused on a spot
with a diameter of 2 inches plus 1 inch of penumbra, which
provided uniform irradiation of the device without directly af-
fecting nearby board power control circuitry. Irradiation was
performed at room temperature with normal incidence and nom-
inal voltages.
Two versions of the Tower of Hanoi case study have been tested,
one unhardened and the other hardened by the VAR3+ and SETA
techniques. Table IX summarizes the data from the neutron ex-
periment. The unhardened version was executed for 100 min-
utes under the beam, receiving a total ﬂuence of n/cm
in average. The hardened version was executed for 730 minutes
under the beam, receiving a total ﬂuence of n/cm
in average. We observed 6 incorrect executions out of 1557,
which results in a SER of and a cross-section
of cm for the unhardened application. In the
hardened version, we observed 5 undetected errors that lead to
incorrect output on a total of 4872 executions, which results in a
SER of and a cross-section of cm .
The detection techniques were capable of detecting 90.9% of the
errors affecting the processor. That is the reason why we can see
a reduction of the SER by a factor of 3.76 and of the cross-section by one
order of magnitude when hardening using VAR3+ and SETA.
However, the execution time of the hardened case-study application
used in ARM is 2.33x, and the code size is 2.13x compared
to the unhardened application. That results in a MWTF of
1.61x for the hardened application.
Fig. 5. Setup for radiation test. The computer is connected to the board by two USB cables. One is used to program the board and the other is used to get the
output from the board by UART. The computer is also connected by USB to a switch that controls the power supply of the board.
TABLE IX
SUMMARY OF NEUTRON EXPERIMENT ON ARM CORTEX-A9 PROCESSOR
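The reported figures can be cross-checked with a short calculation. The sketch below assumes that the SER reduction is the ratio of incorrect executions per execution between the two versions, and that the AVF is the same for both, so that the normalized MWTF reduces to the error-rate reduction divided by the normalized execution time. Both are assumptions of this example, not statements from the paper.

```python
# Counts reported in the text for the neutron experiment.
unhardened_errors, unhardened_runs = 6, 1557
hardened_errors, hardened_runs = 5, 4872
normalized_execution_time = 2.33

# Incorrect executions per execution (assumed basis for the SER comparison).
error_rate_unhardened = unhardened_errors / unhardened_runs
error_rate_hardened = hardened_errors / hardened_runs

# How many times less often the hardened version produces an incorrect output.
error_rate_reduction = error_rate_unhardened / error_rate_hardened

# Normalized MWTF under the equal-AVF assumption.
normalized_mwtf = error_rate_reduction / normalized_execution_time

print(f"error-rate reduction: {error_rate_reduction:.2f}")
print(f"normalized MWTF:      {normalized_mwtf:.2f}")
```

Under these assumptions the reduction comes out close to the reported factor of 3.76, and the normalized MWTF rounds to the reported 1.61x.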
The MWTF obtained by simulation on the miniMIPS pro-
cessor for the Tower of Hanoi hardened by VAR3+ and SETA
was 2.68x. The same benchmark, but running on the ARM
Cortex-A9 processor and tested under neutrons reached a
MWTF of 1.61x. One factor behind this difference is that
the two tests used different processors. Thus, the final code
and the processor architecture are not the same. Nevertheless, an
increase of the MWTF from the unhardened to
the hardened version is noticeable.
IV. SELECTIVE HARDENING
A recent approach to reduce overheads caused by soft-
ware-based techniques consists of applying the techniques
selectively. Only selected portions of the application will be
protected, not the entire application. A few works based on selective
hardening aim to guarantee application-level correctness in
multimedia applications [24][25]. For multimedia applications,
some errors can be tolerated since they will not be noticed by
the user [26]. However, in critical systems, correctness is required.
Recent work in this field has been proposed in [27].
In that work, subsets of the registers used by the application
were protected by data-flow techniques and evaluated.
With regards to control-ﬂow techniques, the selective hard-
ening can be applied to basic blocks. In this paper, we do it using
the SETA technique. The selective hardening of SETA is imple-
mented in two different ways, as summarized in Table X.
• SETA-C (SETA minus Checkers): consists of removing
checkers from the basic blocks, as stated in [16]. All the
basic blocks are protected by SETA with signatures. How-
ever, not all of them receive a checker. Larger basic blocks
have higher priority to receive a checker. If an error oc-
curs in a basic block with no checker, it can be detected in
a subsequent basic block since the error will propagate. It
presents lower overheads than the standard SETA.
• S-SETA (Selective SETA): is a new selective method. It
consists of completely ignoring some basic blocks. The ig-
nored basic blocks receive no signatures or checkers. Thus,
it is possible to provide overheads even lower than only
removing checkers. This selective hardening method of
SETA is better explained below.
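Both selective approaches prioritize larger basic blocks. A minimal Python sketch of such a size-based selection is given below. The block sizes, the rounding of the requested fraction and the tie-breaking by block identifier are assumptions of this example.

```python
def select_blocks_by_size(block_sizes, fraction):
    """Return the identifiers of the largest blocks, covering `fraction` of all blocks.

    `block_sizes` maps a block identifier to its number of instructions.
    """
    count = round(fraction * len(block_sizes))
    # Largest blocks first; ties are broken by the smaller identifier (assumption).
    by_size = sorted(block_sizes, key=lambda block: (-block_sizes[block], block))
    return sorted(by_size[:count])


# Hypothetical sizes (in instructions) of ten basic blocks.
SIZES = {0: 12, 1: 2, 2: 9, 3: 7, 4: 1, 5: 6, 6: 5, 7: 8, 8: 3, 9: 4}

selected_70 = select_blocks_by_size(SIZES, 0.7)
selected_30 = select_blocks_by_size(SIZES, 0.3)
```

For SETA-C the selected blocks would be the ones that receive a checker, while for S-SETA they would be the ones that receive signatures and checkers at all.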
A. S-SETA
S-SETA ignores some basic blocks in order to reduce costs.
This method is named tunnel effect. It creates the effect of
a tunnel between the predecessors and the successors of ignored
basic blocks. Thus, S-SETA does not see ignored BBs and does
not protect them. As in SETA-C, larger basic blocks have higher
priority to be selected and, thus, protected. Size was selected
as the criterion because very small basic blocks are quickly executed
and, therefore, are less vulnerable. If they are executed just
a few times, they are not very sensitive, so their protection is
not very important. And if they are frequently executed, the insertion
of protection in such small basic blocks would cause significant
performance degradation. Fig. 6 shows how the tunnel
effect is applied to a program. Fig. 6(a) presents the default pro-
gram ﬂow where all the basic blocks have been protected. If the
protection is reduced to 70%, as shown in Fig. 6(c), basic blocks
1, 4, 8 and 9 are removed. The successors of BB 1 are attributed
to its predecessor, BB 0. The successors of BB 2 now are BBs
3, 5 and 6, since BB 4 was removed. BBs 5 and 6 now point to
BBs 2 and 7 instead of BB 1. Furthermore, BB 8 was removed.
Therefore, BB 9 has no longer a successor. Following the same
idea, Fig. 6(b), Fig. 6(d) and Fig. 6(e) show how S-SETA sees
the program flow when protecting 80%, 30% and 20% of basic
blocks, respectively.
TABLE X
APPROACHES FOR SELECTIVE HARDENING IN CONTROL-FLOW TECHNIQUES
Fig. 6. Example of tunnel effect (S-SETA) (a) protecting 100% of BBs, equivalent
to SETA, (b) protecting 80%, (c) protecting 70%, (d) protecting 30% and
(e) protecting 20% of BBs.
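The tunnel effect can be sketched as a graph transformation in which the successors of an ignored block are attributed to its predecessors. The control-flow graph below is invented for this example and is not the one of Fig. 6; the function name and the handling of chains of ignored blocks are assumptions of this sketch.

```python
def apply_tunnel_effect(successors, ignored):
    """Return the control-flow graph seen once `ignored` blocks are tunnelled through."""

    def visible_successors(block, seen):
        result = set()
        for s in successors[block]:
            if s not in ignored:
                result.add(s)
            elif s not in seen:
                seen.add(s)
                # Pass through the ignored block and inherit its successors.
                result |= visible_successors(s, seen)
        return result

    return {
        block: sorted(visible_successors(block, set()))
        for block in successors
        if block not in ignored
    }


# Hypothetical control-flow graph: block -> list of successor blocks.
cfg = {0: [1], 1: [2, 3], 2: [4], 3: [4], 4: [1, 5], 5: []}
reduced = apply_tunnel_effect(cfg, ignored={1, 3})
```

In this example block 0 inherits the successors of the ignored block 1, and since block 3 is ignored as well, the tunnel continues through it to block 4. Signatures would then be computed on the reduced graph only.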
B. Results
The same nine case-study applications used to compare SETA
with CEDA were used to evaluate the two approaches for selective
hardening. We have tested all possible percentages of
protected BBs. Fig. 7 shows the results for S-SETA and SETA-C
at the percentage of protected BBs that provides the highest
MWTF. That shows the maximum improvement we can get
from selective hardening. CEDA and SETA with no selective
hardening are also included for comparison. As one can notice,
both selective approaches reduce the overheads while keeping
a similar fault coverage. While SETA with no selective hardening
presents execution time of 1.38x and code size of 1.29x,
SETA-C presents 1.28x and 1.21x of execution time and code
size, respectively. S-SETA reduces the overheads even more; it
presents both execution time and code size of 1.17x. Due to its
lower overheads, S-SETA achieves the highest MWTF, 1.94x,
while SETA-C presents 1.87x, and SETA with no selective hardening
reaches 1.73x. Comparing S-SETA, the combination
of the proposed control-flow technique with the proposed
selective hardening method, with CEDA, we can see a reduction in
the execution time from 1.54x to 1.17x, in code size from 1.45x to
1.17x, and an increase of the MWTF from 1.64x to 1.94x.
We can see the same behavior when VAR3+ is included
(Fig. 8): the overheads are reduced when compared to
VAR3+,SETA (no selective hardening), the fault coverage is similar
and, consequently, the MWTF is higher. VAR3+,SETA presents
2.11x of execution time, 1.89x of code size and 5.23x of
MWTF. SETA-C reduces the execution time to 2.05x and the
code size to 1.82x and increases the MWTF to 5.69x. S-SETA
goes even further: it reduces the execution time to 2.01x and
code size to 1.84x and achieves a MWTF of 5.96x.
Fig. 7. Comparison between CEDA, SETA, S-SETA and SETA-C for miniMIPS
processor. Both selective approaches present similar fault coverage to
SETA with no selective hardening. And both reduce overheads. However, the
overhead reduction is higher in S-SETA, which justifies its higher MWTF.
Fig. 8. Comparison between selective control-flow techniques applied together
with VAR3+ data-flow technique for miniMIPS processor. The highest MWTF
is presented by the proposed VAR3+,S-SETA due to its lower overheads.
By only removing checkers from the basic blocks (SETA-C),
the reduction in the overheads is less significant than with
S-SETA, which clearly presents better gains in the MWTF.
Thus, S-SETA is the better option, as shown by its higher
MWTF. An interesting direction for future work would be to
apply both solutions together: using S-SETA and then
inserting checkers in only some of the basic blocks protected
by S-SETA.
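One way to explore the combination suggested above is to sweep both knobs, the fraction of protected basic blocks and the fraction of those blocks that receive checkers, and keep the configuration with the highest MWTF. This mirrors the exhaustive search over protection percentages used for Fig. 7. The sketch is hypothetical throughout: `best_configuration` and `evaluate` are illustrative names, `evaluate` stands in for a full fault-injection campaign, and the toy model used in the example is synthetic, not measured data.

```python
from itertools import product
from typing import Callable, Iterable, Tuple

def best_configuration(
    bb_fractions: Iterable[float],
    checker_fractions: Iterable[float],
    evaluate: Callable[[float, float], float],
) -> Tuple[float, float, float]:
    """Return (bb_fraction, checker_fraction, mwtf) with the highest MWTF.

    `evaluate` is a hypothetical stand-in for a fault-injection campaign
    that returns the normalized MWTF of one configuration.
    """
    best = None
    for bb, ck in product(bb_fractions, checker_fractions):
        mwtf = evaluate(bb, ck)
        # Keep the first configuration that reaches the highest MWTF.
        if best is None or mwtf > best[2]:
            best = (bb, ck, mwtf)
    if best is None:
        raise ValueError("no configurations to evaluate")
    return best

# Toy model only (NOT measured data): MWTF peaks at partial protection.
def toy_mwtf(bb: float, ck: float) -> float:
    return 1.0 + bb * (1.0 - bb) + 0.5 * ck * (1.0 - ck)

levels = [i / 10 for i in range(11)]  # 0%, 10%, ..., 100%
print(best_configuration(levels, levels, toy_mwtf))
```

In practice each call to `evaluate` would be expensive, so a coarser grid or a search guided by per-block criticality may be preferable to a full sweep.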
V. CONCLUSIONS AND FUTURE WORK
In this paper, we introduced SETA, a new control-flow
technique. The aims were to maintain fault coverage similar to
that of state-of-the-art techniques while reducing the
overheads. SETA does exactly that: it keeps the fault coverage
and reduces the overheads, which results in an increase of the
MWTF. Thus, it is possible to say that SETA is also more
reliable, since it provides the same fault coverage while the
application is exposed for a shorter time. The execution time
and code size were reduced from 1.54x
to 1.38x, and from 1.45x to 1.29x, respectively, when compared
to the state-of-the-art. SETA also increases the MWTF from
1.64x to 1.73x.
To further reduce the overheads, selective hardening was
applied to SETA using an approach from the literature
(SETA-C) and a newly proposed approach (S-SETA). Both
approaches for selective hardening reduce the overheads and
keep similar fault coverage, which increases the MWTF. This
effect is more noticeable in the proposed S-SETA. While
SETA-C reduces the execution time from 1.38x to 1.28x and
the code size from 1.29x to 1.21x, S-SETA reduces both the
execution time and code size to 1.17x. That explains S-SETA’s
MWTF of 1.94x, while SETA-C presents 1.87x.
As future work, we intend to apply selective hardening to
the VAR3+ data-flow technique by selecting the registers that
will be hardened. Thus, we can also reduce the overheads caused
by the data-flow technique.
REFERENCES
[1] N. Kim, T. Austin, D. Blaauw, T. Mudge, K. Flautner, J. Hu, M. Irwin,
M. Kandemir, and V. Narayanan, “Leakage current: Moore’s law meets
static power,” IEEE Computer, vol. 36, no. 12, pp. 68–75, Dec. 2003.
[2] S. Thompson et al., “In search of forever: Continued transistor scaling
one new material at a time,” IEEE Trans. Semicond. Manuf., vol. 18,
no. 1, pp. 26–36, Mar. 2005.
[3] R. Baumann, “Soft errors in advanced semiconductor devices-part I:
The three radiation sources,” IEEE Trans. Device Mater. Rel., vol. 1,
no. 1, pp. 17–22, 2001, Los Alamitos, USA.
[4] “Design,” in International Technology Roadmap for Semiconduc-
tors. Washington DC: Semiconductor Industry Assoc., 2005, pp.
6–7.
[5] O. S. Unsal, I. Koren, and C. M. Krishna, “Towards energy-aware
software-based fault tolerance in real-time systems,” in Proc. Int. Symp.
Low Power Electronics and Design, 2002, pp. 124–129.
[6] A. Mahmood and E. McCluskey, “Concurrent error detection using
watchdog processors–a survey,” IEEE Trans. Computers, vol. 37, no.
2, pp. 160–174, Feb. 1988.
[7] S. C. Asensi, A. M. Alvarez, F. R. Calle, F. R. Palomo, H. G. Miranda,
and M. A. Aguirre, “A novel co-design approach for soft errors miti-
gation in embedded systems,” IEEE Trans. Nucl. Sci., vol. 58, no. 3,
pp. 1059–1065, Jun. 2011.
[8] T. Yao, H. Zhou, M. Fang, and H. Hu, “Low power consumption
scheduling based on software fault-tolerance,” in Proc. 9th Int. Conf.
Natural Computation, 2013, pp. 1788–1793.
[9] I. Assayad, A. Girault, and H. Kalla, “Tradeoff exploration between
reliability, power consumption and execution time,” presented at the
30th Int. Conf. Computer Safety, Reliability and Security, 2011.
[10] E. Chielle, F. L. Kastensmidt, and S. Cuenca-Asensi, “A set of rules
for overhead reduction in data-ﬂow software-based fault-tolerant tech-
niques,” in FPGAs and Parallel Architectures for Aerospace Applica-
tions, F. Kastensmidt and P. Rech, Eds. Berlin, Germany: Springer,
2015.
[11] O. Goloubeva, M. Rebaudengo, M. S. Reorda, and M. Violante,
Software-Implemented Hardware Fault Tolerance. Berlin, Germany:
Springer, 2006.
[12] N. Oh, P. P. Shirvani, and E. J. McCluskey, “Error detection by dupli-
cated instructions in super-scalar processors,” IEEE Trans. Rel., vol.
51, no. 1, pp. 63–75, Mar. 2002.
[13] J. R. Azambuja, A. Lapolli, M. Altieri, and F. L. Kastensmidt,
“Evaluating the efficiency of software-only techniques to detect SEU and
SET in microprocessors,” in Proc. IEEE Latin Am. Symp. Circuits and
Systems, 2011, doi: 10.1109/LATW.2011.5985914.
[14] N. Oh, E. Shirvani, and E. McCluskey, “Control-flow checking by
software signatures,” IEEE Trans. Rel., vol. 51, no. 2, pp. 111–122, Mar.
2002.
[15] O. Goloubeva, M. Rebaudengo, M. S. Reorda, and M. Violante, “Soft-
error detection using control ﬂow assertions,” in Proc. IEEE Int. Symp.
Defect and Fault Tolerance in VLSI Systems, 2003, pp. 581–588.
[16] R. Vemu and J. A. Abraham, “CEDA: Control-ﬂow error detec-
tion using assertions,” IEEE Trans. Computers, vol. 60, no. 9, pp.
1233–1245, Sep. 2011.
[17] J. R. Azambuja, M. Altieri, J. Becker, and F. L. Kastensmidt, “HETA:
Hybrid error-detection technique using assertions,” IEEE Trans. Nucl.
Sci., vol. 60, no. 4, pp. 2805–2812, Aug. 2013.
[18] G. A. Reis, J. Chang, N. Vachharajani, S. S. Mukherjee, R. Rangan,
and D. I. August, “Design and evaluation of hybrid fault-detection
systems,” in Proc. Int. Symp. Computer Architecture, Jun. 2005, pp.
148–159.
[19] S. S. Mukherjee, C. Weaver, J. Emer, S. K. Reinhardt, and T. Austin,
“A systematic methodology to compute the architectural vulnerability
factors for a high-performance microprocessor,” in Proc. Annu.
IEEE/ACM Int. Symp. Microarchitecture, 2003, pp. 29–40.
[20] L. M. O. S. S. Hangout and S. Jan, in The minimips project 2009, Oct.
2010 [Online]. Available:
http://www.opencores.org/projects.cgi/web/minimips/overview
[21] Mentor Graphics. ModelSim, 2010 [Online]. Available:
http://www.model.com/content/modelsim-support
[22] M. Violante, L. Sterpone, A. Manuzzato, S. Gerardin, P. Rech, M.
Bagatin, A. Paccagnella, C. Andreani, G. Gorini, A. Pietropaolo, G.
Cardarilli, S. Pontarelli, and C. Frost, “A new hardware/software
platform and a new 1/E neutron source for soft error studies: Testing
FPGAs at the ISIS facility,” IEEE Trans. Nucl. Sci., vol. 54, no. 4, pp.
1184–1189, Aug. 2007.
[23] A. ZedBoard, in featuring the Zynq-7000 All Programmable SoC 2015,
May 2015 [Online]. Available:
http://www.em.avnet.com/en-us/design/drc/Pages/Zedboard.aspx
[24] J. Cong and K. Gururaj, “Assuring application-level correctness
against soft errors,” in Proc. IEEE/ACM Int. Conf. Computer-Aided
Design (ICCAD), 2011, pp. 150–157.
[25] A. Sudaram, A. Akael, D. Lockhart, D. Thaker, and D. Franklin, “Ef-
ﬁcient fault tolerance in multi-media applications through selective in-
struction replication,” in Proc. Workshop Radiation Effects and Fault
Tolerance in Nanometer Technologies, 2008, pp. 339–346.
[26] T. Y. Yeh, G. Reinman, S. J. Patel, and P. Faloutsos, “Fool me twice:
Exploring and exploiting error tolerance in physics-based animation,”
ACM Trans. Graphics, vol. 29, no. 5, pp. 1–11, 2009.
[27] F. Restrepo-Calle, A. Martinez-Alvarez, S. Cuenca-Asensi, and A. Ji-
meno-Morenilla, “Selective SWIFT-R,” J. Electron. Test, vol. 29, pp.
825–838, 2013.
