Configuration Redundancy for Enhanced Reliability in SRAM-based FPGAs by Giordano, Raffaele et al.
arXiv:1806.10676v1 [physics.ins-det] 25 Jun 2018
Configuration Redundancy for Enhanced Reliability
in SRAM-based FPGAs
Raffaele Giordano, Sabrina Perrella, Dario Barbieri, Vincenzo Izzo, and Alberto Aloisio
Abstract—Digital off-detector electronics in trigger and data
acquisition systems of High-Energy Physics experiments is often
implemented by means of SRAM-based FPGAs, which make
it possible to achieve reconfigurable, real-time processing and
multi-gigabit serial data transfers. On-detector usage of such
devices is mostly limited by their configuration sensitivity to
radiation-induced upsets, which may alter the programmed
routing paths and configurable elements.
In this work, we show a new technique for enhancing the
usage of SRAM-based FPGAs also for on-detector applications.
We show a demonstrator of our solution on benchmark designs,
including a triple modular redundant design and a serial link
(without redundancy) running at 5 Gbps, implemented in a Xilinx
Kintex-7 FPGA. We performed irradiation tests at Laboratori
Nazionali del Sud (Catania, Italy) with a 62-MeV proton beam.
The results show that our scrubbing technique made it possible
to detect and correct all the radiation-induced upsets after a
total fluence higher than $10^{12}$ cm$^{-2}$. For both the redundant
benchmark design and the serial link, the correct functionality
was always restored after scrubbing the corrupted configuration
bits and resetting the circuit.
Static RAM-based Field Programmable Gate Arrays
(SRAM-based FPGAs) [1], [2] are widely used in trigger
and data acquisition (TDAQ) systems of High-Energy Physics
(HEP) experiments for implementing fast logic due to their
re-configurability, large real-time processing capabilities and
embedded high-speed serial IOs. However, these devices are
sensitive to radiation effects such as single event upsets
(SEUs) [3], [4] in the configuration memory, which may
alter the functionality of the implemented circuit. There is a
strong interest in finding solutions for enhancing the usage
of standard SRAM-based FPGAs also on-detector. Methods
based on triple modular redundancy (TMR) [5] and periodic
correction of the configuration, i.e. scrubbing [6], [7], [8],
are used in order to correct single and multiple bit upsets
(SBUs and MBUs) per memory location (i.e. frame). The
reason for coupling scrubbing to modular redundancy is that
fault masking techniques, such as TMR, remain effective only if
errors do not accumulate in the FPGA [9]. In classical scrubbing
approaches, the golden configuration is stored in external
radiation-hardened memories and it is possible to correct any
number of MBUs per frame.
"Accesso Aperto MIUR". This work is part of the ROAL project
(CINECA grant no. RBSI14JOUV) funded by the Scientific Independence of
Young Researchers (SIR) 2014 program of the Italian Ministry of Education,
University and Research (MIUR). The institutions which contributed to the
results reported in this work are listed below as affiliations of the authors.
Corresponding author: R. Giordano (email: rgiordano@na.infn.it)
R. Giordano, S. Perrella, D. Barbieri, and A. Aloisio are with INFN and
Università degli Studi di Napoli "Federico II", I-80126, Napoli, Italy
V. Izzo is with INFN Sezione di Napoli, I-80126, Napoli, Italy

Recently, scrubbing techniques based on redundant configuration
have been developed in order to avoid external
memories and to correct MBUs at the same time. A brief
discussion of the literature on this topic follows.
Although it cannot be classified as a scrubber, the patent
disclosed in [10] shows a very interesting FPGA architecture
and design implementation flow aimed at generating redundant
configuration at the bit level. Unfortunately, since it requires
a dedicated FPGA architecture, this approach can be pursued
only by vendors fabricating devices and it cannot be implemented
at the user level. Moreover, it does not allow upsets to be
detected, but only masked; therefore it is not effective
against the accumulation of upsets and it does not allow them
to be logged.
The patent described in [11] and the work presented in
[12] are both based on configuration redundancy at the device
level. Three identical FPGAs implement the same design and
therefore host the same configuration. The main limitations of
this approach are the need for three devices, the increase of
the power consumption by a factor of three and the need for an
additional device to perform scrubbing and majority voting of
the outputs.
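
The majority voting mentioned above can be illustrated with a minimal Python sketch. It is not taken from [11] or [12]; the function name and the 8-bit test words are hypothetical. It shows that a bitwise 2-out-of-3 vote masks a fault in one replica, but fails once the same bit is corrupted in two replicas, which is why upsets must not be allowed to accumulate.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-out-of-3 majority of three equally wide words."""
    return (a & b) | (a & c) | (b & c)


golden = 0b10110010          # hypothetical fault-free output word
flip = 0b00001000            # a single upset bit

# One corrupted replica: the vote still returns the golden value.
masked = majority_vote(golden, golden ^ flip, golden)
print(masked == golden)      # True

# The same bit corrupted in two replicas: the vote returns the wrong value.
defeated = majority_vote(golden ^ flip, golden ^ flip, golden)
print(defeated == golden)    # False
```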
An interesting, and effective, approach based on configu-
ration redundancy at the frame level is shown in [13] for a
Virtex-5 FPGA. The technique requires a custom design flow,
which is based on the legacy Xilinx ISE tool, its ability to
export layouts in the Xilinx Design Language (XDL) format,
and the RapidSmith [14] academic CAD tool. The RapidSmith
tool is used for replicating the layout of a module three
times, generating three identical subsets of the configuration;
the authors thus exploit modular redundancy to
generate configuration redundancy. In this implementation
the scrubbing logic and a voter for the three modules are
implemented in the fabric. This solution leads to a power
consumption increase related to the additional programmable
resources used. However, the impact on power consumption
is milder than in solutions based on redundant devices
[15], where the device quiescent power is also tripled.
Unfortunately, newer FPGA families, such as the 7-Series,
the UltraScale or the UltraScale+, are not supported by the
RapidSmith tool, and FPGAs of the latest generation are not
supported by ISE either. In addition, the new Xilinx
CAD tool Vivado, recommended for designs based on 7-Series
onwards, does not support XDL. New initiatives, such as
[16], have been launched for enabling custom design flows
also with the Vivado tool. However, the usage of third-party
layout tools adds to the complexity of the design flow
and it usually does not allow the designer to choose devices
of the latest generation, since dedicated support must be
implemented for each new FPGA family.
A new method for generating redundant configuration at the
frame level has been described in [17]. In this case the content
of the used configuration frames is copied to unused frames after
the initial configuration of the device. There is no dependency
on the layout tools and the solution is exportable to any
other device, provided it includes an interface for reading
and writing the internal configuration. The impact on power
consumption is minimal compared with the previously cited
solutions and, in general, with TMR-based solutions. A common
aspect of the methods described in [13] and [17] is that they
both require finding identical subsets of the FPGA device for
hosting the redundant configuration (or modules).
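
The idea of frame-level configuration redundancy can be sketched in a few lines of Python. This is only an illustrative model, not the implementation of [13] or [17]: the configuration memory is represented as a dictionary of frames, each used frame is assumed to have two spare copies, and repair is assumed to be a bitwise majority vote. The source does not state the number of copies or the repair rule, and all names and addresses below are hypothetical.

```python
from typing import Dict, List, Tuple

Frame = List[int]  # a frame modelled as a list of 32-bit words (assumption)


def vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-out-of-3 majority."""
    return (a & b) | (a & c) | (b & c)


def replicate(memory: Dict[int, Frame], mapping: Dict[int, Tuple[int, int]]) -> None:
    """Copy each used frame into its two (otherwise unused) spare frames."""
    for used, spares in mapping.items():
        for spare in spares:
            memory[spare] = list(memory[used])


def scrub(memory: Dict[int, Frame], mapping: Dict[int, Tuple[int, int]]) -> List[int]:
    """Vote over the three copies and rewrite any copy that disagrees.

    Returns the addresses of the frames that were repaired.
    """
    repaired = []
    for used, (spare_a, spare_b) in mapping.items():
        voted = [vote(x, y, z)
                 for x, y, z in zip(memory[used], memory[spare_a], memory[spare_b])]
        for address in (used, spare_a, spare_b):
            if memory[address] != voted:
                memory[address] = list(voted)
                repaired.append(address)
    return repaired


# Hypothetical two-word frame at address 0 with spares at 100 and 200.
config = {0: [0xDEADBEEF, 0x12345678], 100: [0, 0], 200: [0, 0]}
layout = {0: (100, 200)}
replicate(config, layout)

config[0][1] ^= 0b101            # multiple bit upset in the used frame
print(scrub(config, layout))     # [0]
print(config[0] == config[100])  # True
```

In this model any number of flipped bits within one copy of a frame is repaired, as long as the other two copies agree on those bits.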
This work advances the state of the art in multiple ways.
We show an enhanced version of the configuration redun-
dancy generation and scrubbing technique for SRAM-based
FPGAs published in [17]. In fact, our new version relaxes
the requirement of allocating the redundant configuration
in identical subsets of the FPGA. Moreover, the presented
technique is capable of protecting the configuration pertaining
to basic blocks such as flip-flops, look-up-tables and routing,
but it can also address complex hard macros, such as high-
speed IO transceivers (e.g. the Xilinx GTX).
We show a demonstrator of our solution on two benchmark
designs implemented in a Xilinx Kintex-7 FPGA: an array
of 32-bit counters emulating a classical mixture of sequential
and combinational logic and a serial link running at 5 Gbps
without modular redundancy. For the first benchmark design
we implemented the TMR scheme by means of the Synopsys
Synplify Premier tool [18].
We present results from irradiation tests performed at Labo-
ratori Nazionali del Sud (Catania, Italy) with a 62-MeV proton
beam with a total fluence of $3.5 \cdot 10^{11}$ cm$^{-2}$ for the
TMR-protected design and of $8.7 \cdot 10^{11}$ cm$^{-2}$ for the serial link. We
split our test into several runs, each corresponding to a fluence
in the range from $2.9 \cdot 10^{10}$ cm$^{-2}$ to $2.9 \cdot 10^{11}$ cm$^{-2}$. We used
a software implementation of the scrubber running on a small
form factor personal computer interfaced to the device under
test via JTAG. During irradiation of the FPGA the scrubber
was active in order to remove radiation-induced upsets as
soon as they were detected. For each upset we logged a time
stamp, the frame address, the bit offset and the upset polarity
(0 → 1 or 1 → 0). We also monitored the power
consumption of the FPGA, separately for each of its power domains,
and the functionality of the selected benchmark design.
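
The per-upset log described above (time stamp, frame address, bit offset, polarity) could be produced along the following lines. This is a hedged sketch, not the authors' scrubber: it assumes the reference and readback frames are available as lists of 32-bit words, and it assumes a bit-offset convention in which bit 0 is the least significant bit of the first word. The record type, function name and frame address are hypothetical.

```python
import time
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class UpsetRecord:
    timestamp: float
    frame_address: int
    bit_offset: int   # offset within the frame (assumed LSB-first per word)
    polarity: str     # "0->1" or "1->0"


def find_upsets(frame_address: int, reference: List[int], readback: List[int],
                word_bits: int = 32, now: Optional[float] = None) -> List[UpsetRecord]:
    """Compare a readback frame with its reference and list every flipped bit."""
    stamp = time.time() if now is None else now
    records = []
    for word_index, (ref, got) in enumerate(zip(reference, readback)):
        diff = ref ^ got                      # set bits mark the upsets
        for bit in range(word_bits):
            if (diff >> bit) & 1:
                polarity = "0->1" if (got >> bit) & 1 else "1->0"
                records.append(UpsetRecord(stamp, frame_address,
                                           word_index * word_bits + bit, polarity))
    return records


# Hypothetical frame with one 0->1 upset in word 0 and one 1->0 upset in word 1.
log = find_upsets(0x00400000, [0b0000, 0b1111], [0b0100, 0b1110], now=0.0)
for record in log:
    print(hex(record.frame_address), record.bit_offset, record.polarity)
```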
Our results show that our scrubbing technique made it
possible to detect and correct all the radiation-induced upsets.
For both the TMR-protected firmware and the serial link, the
correct functionality was always restored after scrubbing the
corrupted configuration bits and resetting the circuit. However,
the TMR-protected firmware showed significantly fewer
failures than the serial link. We observed
some single event functional interrupts (SEFIs) related to the
readback of block RAMs (BRAMs), where 128 bits at the
same bit offset in logically adjacent frames were flipped.
The proton-to-upset cross section for the entire device has
been measured to be $\sigma_{dev} = 1.0 \cdot 10^{-7}$ cm$^{2}$, while the
proton-to-upset cross section per bit has been measured to be
$\sigma_{bit} = 4.4 \cdot 10^{-15}$ cm$^{2}$ and the proton-to-BRAM-SEFI cross
section has been measured to be $\sigma_{BRAM} = 4.6 \cdot 10^{-11}$ cm$^{2}$. It
is important to note that $\sigma_{BRAM}$ is more than three orders of
magnitude lower than $\sigma_{dev}$, i.e. SEFIs are very rare events.
SEFIs could not be removed by scrubbing, as they require
power cycling the device, but they did not impact the operation
of our designs.
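
The relations between the quoted figures can be checked with a short Python calculation. It uses the standard definition of a cross section as the number of events divided by the fluence; the helper name is hypothetical. The last quantity, the ratio of the device and per-bit cross sections, is only the bit count implied by the quoted values under the assumption that the device cross section is the per-bit cross section times the number of sensitive bits; it is not a figure stated in the text.

```python
def cross_section(events: float, fluence_cm2: float) -> float:
    """Cross section in cm^2: observed events divided by fluence in cm^-2."""
    return events / fluence_cm2


# Values quoted in the text.
sigma_dev = 1.0e-7      # cm^2, whole device
sigma_bit = 4.4e-15     # cm^2, per configuration bit
sigma_bram = 4.6e-11    # cm^2, BRAM readback SEFI
fluence_tmr = 3.5e11    # cm^-2, TMR-protected design
fluence_link = 8.7e11   # cm^-2, serial link

# Sum of the two fluences: consistent with "higher than 10^12 cm^-2".
total_fluence = fluence_tmr + fluence_link
print(f"total fluence = {total_fluence:.3g} cm^-2")

# SEFIs are rarer than upsets by more than three orders of magnitude.
print(f"sigma_dev / sigma_bram = {sigma_dev / sigma_bram:.0f}")

# Implied number of sensitive bits (assumption: sigma_dev = N_bits * sigma_bit).
print(f"implied bits = {sigma_dev / sigma_bit:.3g}")
```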
REFERENCES
[1] Virtex UltraScale FPGAs Data Sheet: DC and AC Switching Character-
istics. Xilinx Inc. DS893 (v1.7.1) April 4, 2016
[2] Stratix 10 Device Overview S10-OVERVIEW. Altera Corp. 2015.12.04
[3] M. Wirthlin, “High-Reliability FPGA-Based Systems: Space, High-
Energy Physics, and Beyond,” in Proc. of the IEEE, vol. 103, no. 3,
pp. 379-389, Mar. 2015. doi: 10.1109/JPROC.2015.2404212
[4] H. Quinn, “Radiation effects in reconfigurable FPGAs.” Semi-
cond. Sci. Technol., vol. 32, no. 4 (8pp), Mar. 2017 doi:
https://doi.org/10.1088/1361-6641/aa57f6
[5] L. Sterpone and M. Violante, “Analysis of the robustness of the TMR
architecture in SRAM-based FPGAs,” IEEE Trans. on Nucl. Sci., vol.
52, no. 5, pp. 1545-1549, Oct. 2005. doi: 10.1109/TNS.2005.856543
[6] I. Herrera-Alzu and M. Lopez-Vallejo, “Design techniques for xilinx
virtex FPGA configuration memory scrubbers,” IEEE Trans. Nucl. Sci.,
vol. 60, no. 1, pp. 376–385, Feb. 2013.
[7] A. Stoddard, A. Gruwell, P. Zabriskie and M. J. Wirthlin, “A Hybrid
Approach to FPGA Configuration Scrubbing,” IEEE Trans. on Nucl. Sci.,
vol. 64, no. 1, pp. 497-503, Jan. 2017. doi: 10.1109/TNS.2016.2636666
[8] M. Berg, C. Poivey, D. Petrick, D. Espinosa, A. Lesea, K.A. LaBel, M.
Friendlich, H. Kim, A. Phan, “Effectiveness of Internal Versus External
SEU Scrubbing Mitigation Strategies in a Xilinx FPGA: Design, Test,
and Analysis,” IEEE Trans. Nucl. Sci., vol. 55, no. 4, pp. 2259-2266,
Aug. 2008. doi: 10.1109/TNS.2008.2001422
[9] P. S. Ostler, M. P. Caffrey, D. S. Gibelyou, P. S. Graham, K. S. Morgan,
B. H. Pratt, H. M. Quinn, and M. J. Wirthlin, “SRAM FPGA reliability
analysis for harsh radiation environments,” IEEE Trans. Nucl. Sci., vol.
56, no. 6, pp. 3519–3526, Dec. 2009.
[10] G.C. Steiner, “Method And Apparatus For Error Mitigation Of Pro-
grammable Logic Devices Configuration Memory,” U.S. patent no.
7,298,168B1, Nov. 20, 2007
[11] P. H. Alfke, “System for preventing radiation failures in programmable
logic devices,” U.S. patent no. 6,104,211A, Aug. 15, 2000
[12] I. Herrera-Alzu and M. López-Vallejo, “Self-reference scrubber for TMR
systems based on xilinx virtex FPGAs,” Lecture Notes Comput. Sci., vol.
6951, pp. 133–142, 2011, LNCS.
[13] J. Tonfat, F. L. Kastensmidt, P. Rech, R. Reis, and H. M. Quinn,
“Analyzing the Effectiveness of a Frame-Level Redundancy Scrubbing
Technique for SRAM-based FPGAs,” IEEE Trans. Nucl. Sci., vol. 62,
no. 6, pp. 3080–3087, Dec. 2015.
[14] C. Lavin, M. Padilla, J. Lamprecht, P. Lundrigan, B. Nelson, and B.
Hutchings, “RapidSmith: Do-it-yourself CAD tools for xilinx FPGAs,”
in Proc. 2011 21st Int. Conf. F. Program. Log. Appl., Sep. 2011, pp.
349–355.
[15] R. Giordano, A. Aloisio, V. Bocci, M. Capodiferro, V. Izzo, L. Sterpone,
M. Violante, “Layout and Radiation Tolerance Issues in High-Speed
Links,” IEEE Trans. Nucl. Sci., vol. 62, no. 6, pp. 3177-3185, Dec.
2015. doi: 10.1109/TNS.2015.2498307
[16] RAPIDSMITH 2 A Library for Low-level Manipulation of Vivado Designs
at the Cell/BEL Level, B. Nelson, T. Haroldsen, T. Townsen.
[Online]. Available:
https://github.com/byuccl/RapidSmith2/blob/master/doc/TechReport.pdf
[17] R. Giordano, S. Perrella, V. Izzo, G. Milluzzo, A. Aloisio, “Redundant-
Configuration Scrubbing of SRAM-Based FPGAs,” in IEEE Trans. on
Nucl. Sci., vol. 64, no. 9, pp. 2497-2504, Sept. 2017. Open Access
[18] Synplify Pro and Premier. Synopsys, Inc. Mountain View, Calif. (USA),
2015 [Online]. Available:
https://www.synopsys.com/content/dam/synopsys/implementation&signoff/datasheets/synplify-pro-premier.pdf
