SEE Test and Data Analysis for Complex FPGA Systems by Campola, Michael & Berg, Melanie
Melanie Berg1, Michael Campola2
Melanie.D.Berg@NASA.gov; Melanie.Berg@SSAI.com
1.SSAI in support of NASA/GSFC
2. NASA/GSFC
SEE Test and Data Analysis 
for Complex FPGA Systems
To be presented by Melanie Berg at the Microelectronics Reliability & Qualification Working Meeting (MRQW), El Segundo, CA February 6, 2020
https://ntrs.nasa.gov/search.jsp?R=20200000820 2020-03-28T18:45:12+00:00Z
Acronyms
• Application specific integrated circuit (ASIC)
• Block random access memory (BRAM)
• Combinatorial logic (CL)
• Configurable Logic Block (CLB)
• Device under test (DUT)
• Edge-triggered flip-flops (DFFs)
• Field programmable gate array (FPGA)
• High speed serial interface (GTX)
• Input – output (I/O)
• Intellectual property (IP)
• INV (inverter)
• Linear energy transfer (LET)
• Look up table (LUT)
• Mean fluence to failure (MFTF)
• One time programmable (OTP)
• Operational frequency (fs)
• Power on reset (POR)
• Place and Route (PR)
• Representative Tactical Design (RTD)
• Reprogrammable (RP)
• Single event functional interrupt (SEFI)
• Single event effects (SEEs)
• Single event failure (SEF)
• Single event latch-up (SEL)
• Single event transient (SET)
• Single event upset (SEU)
• Single event upset cross-section (σSEU)
• Static random access memory (SRAM)
• Static timing analysis (STA)
• System on a chip (SOC)
Problem Statement
For SEU analysis, common practice is to use simple 
test structures that focus on discrete components:
• Data are extrapolated into survivability calculators.
• Generic SEU data are used across all designs. 
• Assumption: the need for testing is reduced.
• However, the fidelity of generic SEU data 
extrapolation to tactical designs is questionable.
How do we provide SEU data for 
survivability calculations of tactical 
systems while reducing the need to test 
every design? Generic testing versus 
Test-As-You-Fly.
Better to use representative tactical 
designs (RTD) for SEU analysis:
• Data are a better fit for 
characterizing tactical behavior.
• However, requires SEU testing for 
every design!
Single event upset (SEU)
Field programmable gate array (FPGA)
FPGA SEU Cross Section Model
CLBs
BRAM GR ControlHardIP
Configurable logic block: (CLB) 
Block random access memory: (BRAM)
Intellectual property: (IP); e.g., micro processors, digital signal processor blocks (DSP), embedded state machines, etc.
Global Routes: (GR)
Analog circuits
$\sigma_{SEU} = f(\sigma_{\text{configuration}}, \sigma_{\text{BRAM}}, \sigma_{\text{functionalLogic}}, \sigma_{\text{HiddenLogic}})$
Dominant mechanisms of failure will drive $\sigma_{SEU}$.
SEU cross-sections for a mapped design ($\sigma_{SEU}$) are based on the 
FPGA’s internal elements and the mapped design’s topology.
Embedded View of Mapped Logic
Designs only map 
into a portion of 
the configuration 
and only use a 
portion of the user 
fabric logic gates.
FPGA configuration and user 
logic are different types of 
embedded components.
Modern FPGAs have hundreds of 
millions of configuration bits and 
hundreds of thousands of logic cells.
ASIC implementation
Generic Xilinx Implementation
(LUT can differ by family)
Generic Test Structures: 
Shift Register
With an SRAM-based FPGA, each 
design uses more logic than 
assumed. This makes extrapolation of 
SEU data (from simple test structures 
to tactical designs) unreliable.
[Figure: shift-register test structure built from four-input LUTs (inputs I1 to I4) and DFFs (D, Q, SET, CLR).]
User logic: Lookup Table (LUT)
User logic: Flip-Flop(DFF)
Configuration bits
LUTs and DFFs are contained in 
configurable logic blocks (CLBs)
[Figure: shift register of four-input LUTs (inputs I1 to I4) and DFFs (D, Q, SET, CLR), connected through the routing matrix.]
Hidden Logic: Routing matrix 
inserted during place and route 
phase.  Adds to the overall 
design susceptibility.
Closer Look: 
Shift Register with Manufacturer 
Inserted Routing Matrix (Hidden Logic)
[Figure label: ROUTING MATRIX]
Simple test structures will not capture the impact of a tactical design’s 
hidden logic (data are not extrapolatable).  Hence the drive towards 
testing RTD structures.
• RTDs are based on tactical designs and might contain the following:
– Embedded processors
– Highspeed serial (GTX)
– Embedded SRAM (BRAM)
– Global routes
• Obey tactical design strategy:
– Synchronous design
– Routing/floorplanning specifics
• Piecemeal tests, yet use complex structures:
– Increases visibility
– Study trends
– Have at least one full RTD (close as possible to tactical)
• MFTF testing requires an increase in the number of experiments (statistics).
• MFTF testing will be driven by dominant mechanisms of failure in the 
design (given proper testing and visibility into failure).
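The point about statistics can be illustrated with a minimal sketch. It assumes each experiment records the fluence accumulated before the first observed failure; the run values, the function name `mftf_summary`, and the choice of the standard error as the spread metric are illustrative assumptions, not part of the presented test procedure.

```python
import math
import statistics


def mftf_summary(fluences_to_failure):
    """Return (MFTF, standard error) in particles/cm^2 for a set of runs."""
    if len(fluences_to_failure) < 2:
        raise ValueError("need at least two runs to estimate the spread")
    mftf = statistics.mean(fluences_to_failure)
    # Standard error of the mean: sample standard deviation / sqrt(run count).
    std_err = statistics.stdev(fluences_to_failure) / math.sqrt(len(fluences_to_failure))
    return mftf, std_err


# Hypothetical fluences (particles/cm^2) accumulated before each run's first failure.
runs = [1.0e5, 2.0e5, 3.0e5, 2.0e5]
mftf, std_err = mftf_summary(runs)
print(f"MFTF = {mftf:.3e} +/- {std_err:.1e} particles/cm^2 over {len(runs)} runs")
```

Because the standard error shrinks only with the square root of the run count, halving the uncertainty takes roughly four times as many experiments.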
Representative Tactical Design (RTD) Test 
Structures and MFTF Test Strategies
Mean fluence to failure (MFTF)
The following 
expertise is required: 
• Professional design 
techniques
• Complex test system 
development
• The ability to create 
visibility into test 
structures for proper 
MFTF measurement
• Knowledge of test 
facilities
RTD Test Structures and MFTF 
Strategies: Not a Simple Task
Data Analysis: Easing the process of SEU test 
and analysis for tactical-design survivability 
prediction.
The following slides only apply to Xilinx SRAM-
based FPGA devices with no embedded or user 
inserted mitigation.
Configuration, Mask, and Essential 
Bits
            Essential bits   Total bits    Masked bits   Unmasked bits
Design A    13326446         115522848     8853590       147819850
Design B    10334231         115522848     27727958      128945482
Design C    6515993          115522848     8857942       147815498
Configuration Total (fixed per each FPGA type)
Masked Total (calculated by the manufacturer and is not 
under user control… design and device dependent)
Essential Bit Total: number of configuration bits used by the design mapping 
(calculated by the manufacturer upon user directive… design and device 
dependent)
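As a rough illustration of how small a share of the configuration a design occupies, the sketch below divides each design's essential-bit count by the configuration total listed in the table. The function name `essential_fraction` is hypothetical; only the bit counts come from the slide.

```python
# Bit counts copied from the table on this slide.
TOTAL_CONFIG_BITS = 115_522_848
ESSENTIAL_BITS = {
    "Design A": 13_326_446,
    "Design B": 10_334_231,
    "Design C": 6_515_993,
}


def essential_fraction(essential_bits, total_bits):
    """Fraction of the configuration bits that the mapped design uses."""
    if not 0 <= essential_bits <= total_bits:
        raise ValueError("essential bits must lie between 0 and the total")
    return essential_bits / total_bits


fractions = {
    name: essential_fraction(bits, TOTAL_CONFIG_BITS)
    for name, bits in ESSENTIAL_BITS.items()
}
for name, fraction in fractions.items():
    print(f"{name}: {fraction:.1%} of configuration bits are essential")
```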
SEU Cross-Sections
• Cross-section Categorization:
• Across all configuration cells (device)
• Per configuration cell (device-bit)
• Across essential-bits (Design + device)
• Design specific
$\sigma(LET)_{\text{configuration\_Device}} = \dfrac{\#\text{errors}}{\#\text{Particles}/\text{cm}^2}$
$\sigma(LET)_{\text{configuration\_bit}} = \dfrac{\#\text{errors}}{(\#\text{Particles}/\text{cm}^2) \times (\#\text{unmaskedConfigurationBits})}$
$\sigma(LET)_{\text{Essential\_bit}} = N_{\text{Essential\_bits}} \times \sigma(LET)_{\text{configuration\_bit}}$
Which cross-sections do we use for survivability analysis?  
Must consider mission requirements.
$\sigma(LET)_{\text{SEF}} = \dfrac{1}{\text{MFTF}} = \dfrac{1}{(\text{FailureTime} - \text{BeginningTime}) \times \text{AverageFlux}}$
Generally, configuration cross-sections are readily 
available from generic device investigations. 
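The four cross-section categories on this slide can be sketched as small functions. This is a minimal illustration: all function names and every numeric input are hypothetical, fluence is assumed to be in particles/cm^2, time in seconds, and flux in particles/cm^2/s.

```python
import math


def sigma_configuration_device(errors, fluence):
    """Device-level configuration cross-section (cm^2)."""
    return errors / fluence


def sigma_configuration_bit(errors, fluence, unmasked_bits):
    """Per-bit configuration cross-section (cm^2/bit)."""
    return errors / (fluence * unmasked_bits)


def sigma_essential_bit(n_essential_bits, sigma_config_bit):
    """Essential-bit cross-section for one mapped design (cm^2)."""
    return n_essential_bits * sigma_config_bit


def sigma_sef(failure_time, beginning_time, average_flux):
    """Design-specific cross-section (cm^2) as 1/MFTF for one run."""
    mftf = (failure_time - beginning_time) * average_flux
    return 1.0 / mftf


# Hypothetical inputs for a single LET value.
device = sigma_configuration_device(errors=1000, fluence=1.0e6)
per_bit = sigma_configuration_bit(errors=1000, fluence=1.0e6, unmasked_bits=1.0e8)
essential = sigma_essential_bit(n_essential_bits=1.0e7, sigma_config_bit=per_bit)
sef = sigma_sef(failure_time=110.0, beginning_time=10.0, average_flux=1.0e3)
print(device, per_bit, essential, sef)
```

With these made-up inputs the ordering is device > essential-bit > SEF, which is the ordering the upper-bound argument relies on.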
Mission Driven Data Analysis
• Assuming configuration SEU cross-sections are strict upper-bounds: 
Does the survivability prediction using the configuration SEU cross-
sections per device satisfy mission requirements?  
– Can I stop here? If mission requirements are satisfied, then readily 
available configuration SEU cross-sections can be used.
– Additional testing might be required to investigate device 
anomalies.
• Assuming essential-bit SEU cross-sections are strict upper-bounds: 
Will the essential bit SEU cross-sections satisfy mission requirements?
– In most cases, this will still be a strict upper-bound of a design’s 
SEU susceptibility… however … should test to verify the 
assumption.
– Requires configuration read-back tests.
– Requires RTD-MFTF testing.
• If MFTF SEU results are not mission compliant, is mitigation 
necessary?
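The questions above can be read as a decision cascade. The sketch below is one possible encoding, under the assumption that the mission requirement can be expressed as a maximum tolerable cross-section at a given LET; `sigma_required_max`, the `Decision` names, and the function `choose_data_set` are all hypothetical.

```python
from enum import Enum


class Decision(Enum):
    USE_CONFIGURATION_DEVICE = "configuration-device data suffice"
    RUN_RTD_MFTF_TEST = "RTD-MFTF test needed to verify the bound"
    USE_ESSENTIAL_BIT = "essential-bit data suffice"
    USE_MEASURED_SEF = "measured SEF data are compliant"
    CONSIDER_MITIGATION = "consider mitigation"


def choose_data_set(sigma_config_device, sigma_essential_bit,
                    sigma_required_max, sigma_sef=None):
    """Pick the least costly data set that meets the (assumed) requirement."""
    # Step 1: the readily available device-level upper bound.
    if sigma_config_device <= sigma_required_max:
        return Decision.USE_CONFIGURATION_DEVICE
    # Step 2: the essential-bit bound, valid only if it really bounds sigma_SEF.
    if sigma_essential_bit <= sigma_required_max:
        if sigma_sef is None:
            return Decision.RUN_RTD_MFTF_TEST
        if sigma_essential_bit >= sigma_sef:
            return Decision.USE_ESSENTIAL_BIT
    # Step 3: fall back on the measured design-specific cross-section.
    if sigma_sef is None:
        return Decision.RUN_RTD_MFTF_TEST
    if sigma_sef <= sigma_required_max:
        return Decision.USE_MEASURED_SEF
    return Decision.CONSIDER_MITIGATION


print(choose_data_set(1e-3, 1e-4, 5e-4, sigma_sef=1e-5))
```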
If Upper-bounds Satisfy Mission 
Reliability/Survivability Requirements, 
Then No Mitigation is Required.
[Figure: SEU cross-section (cm²) versus LET (MeV∙cm²/mg); log scale from 1.0E-08 to 1.0E+00, LET from 0 to 12. Series: Design; Essential bit - Design; Configuration-Device.]
Do configuration-device cross-sections satisfy mission requirements?
Do essential-bit cross-sections satisfy mission requirements?
Do essential-bit cross-sections upper-bound MFTF σSEF?
Is mitigation required?
Single event failure Cross-section (σSEF) 
• Goal is to determine if generic data can be extrapolated to characterize 
complex tactical designs.
• Generic DFF, CLB, and LUT test data cannot be extrapolated.
– Topology effects are non-linear, and such data do not include hidden logic.
• An alternative is to prove $\sigma(LET)_{\text{Essential\_bit}}$ is an upper-bound to $\sigma(LET)_{\text{SEF}}$.
Xilinx SEU Test and Analysis: What Can the 
Manufacturer Provide? 
Front-end Proof of Concept
$\sigma(LET)_{\text{Essential\_bit}} = N_{\text{Essential\_bits}} \times \sigma(LET)_{\text{configuration\_bit}}$
Manufacturer provides generic data: configuration, 
BRAM, and embedded logic cross-sections.
Manufacturer performs a variety of tests (benchmarks) to 
compare $\sigma(LET)_{\text{Essential\_bit}}$ to $\sigma(LET)_{\text{SEF}}$.
Manufacturer performs additional testing to investigate 
potential SEFIs and other device SEU susceptibilities 
(global routes and SEL).
• If $\sigma(LET)_{\text{Essential\_bit}}$ proves to be a satisfactory upper-bound, the
$\sigma(LET)_{\text{configuration\_bit}}$ data and the tactical design’s calculated essential-bits can 
be used by development teams for survivability analysis.
• In the past, $\sigma(LET)_{\text{Essential\_bit}}$ has been assumed (by some) to be adequate for 
survivability prediction. However, as technology shrinks, the need for RTD-MFTF 
testing and proof of concept is growing:
– Mixed-signal circuitry, global-routes, and hidden logic (embedded IP cores) will 
have more impact on $\sigma(LET)_{\text{SEF}}$ at low LETs.
Xilinx SEU Test and Analysis: What Does 
The End-User Do with The Data? 
Application of Concept
Compare your design to manufacturer benchmark tests. Use 
$\sigma(LET)_{\text{Essential\_bit}}$ for survivability calculations if $\sigma(LET)_{\text{Essential\_bit}} > \sigma(LET)_{\text{SEF}}$.
If manufacturer data show anomalies or your tactical design has 
untested complexities, additional RTD testing will be needed.
The end-user should not piece together data from fine-grained 
components (e.g., CLBs) for survivability analysis, because 
hidden logic and topological non-linearities make the result unreliable.
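Checking the upper-bound condition against benchmark data amounts to a point-by-point comparison over LET. The sketch below assumes both curves are available as mappings from LET (MeV∙cm²/mg) to cross-section (cm²); the function name and all data values are hypothetical.

```python
def bound_violations(essential_by_let, sef_by_let):
    """Return the LET values at which the essential-bit value fails to exceed sigma_SEF."""
    shared_lets = sorted(set(essential_by_let) & set(sef_by_let))
    if not shared_lets:
        raise ValueError("the two data sets share no LET values")
    return [let for let in shared_lets
            if essential_by_let[let] <= sef_by_let[let]]


# Hypothetical benchmark curves: LET -> cross-section in cm^2.
essential_curve = {1.0: 2.0e-5, 4.0: 3.0e-4, 8.0: 9.0e-4}
sef_curve = {1.0: 1.0e-6, 4.0: 2.0e-5, 8.0: 7.0e-5}

violations = bound_violations(essential_curve, sef_curve)
if violations:
    print("Upper bound fails at LET:", violations)
else:
    print("Essential-bit cross-section bounds sigma_SEF at every tested LET")
```

An empty result only covers the LET values and designs that were actually tested; it says nothing about untested complexities.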
Intellectual property (IP)
Kintex UltraScale SEU Cross-Sections
[Figure, panel 1: SEU cross-section (cm²) versus LET (MeV∙cm²/mg); log scale from 1.0E-08 to 1.0E+00, LET from 0 to 12. Series: Counter Array SEF TAMU 2017; Essential bit - Design; Configuration-Device.]
[Figure, panel 2: Cross section (cm²/design) versus LET (MeV∙cm²/mg); log scale from 1.00E-08 to 1.00E+00, LET from 0 to 15. Series: DSP SEF; Essential bit - design.]
[Figure, panel 3: SEU cross section (cm²) versus LET (MeV∙cm²/mg); log scale from 1.0E-08 to 1.0E+00, LET from 0 to 12. Series: Channel or Lane SEF; Soft or Hard Error SEF; Frame Error SEF; Configuration - Device.]
σEssential_bit > σSEF
Implies σEssential_bit can be used to 
predict survivability (non-mitigated 
design).
More testing will be performed to 
investigate whether there are SEFIs and whether the 
upper-bound holds across complex 
designs (e.g., embedded 
processors) and at higher LET.
100 DSP48 multiply-accumulate DSP blocks @ 100 MHz. Includes stage coefficients.
1 GTX channel with Aurora protocol @ 3.125 GHz
200 counters @ 50 MHz
• If the survivability analysis proves the design implementation does not satisfy 
mission requirements, user-inserted mitigation might be necessary.
– This will change the design and its essential-bit count.
– Essential-bit upper-bounds cannot be used to measure the survivability of 
applications with embedded mitigation.
• Mitigation requires additional logic
• Additional logic will increase the essential-bit count and consequently 
increase the estimated σSEF.
– RTD-MFTF testing is required to measure the efficacy of the inserted 
mitigation.  Can’t assume mitigation performs as expected.
– Requires the development team to perform SEU testing.
• Should analyze the design with-mitigation and without-mitigation (when 
possible)… used as another metric for the fidelity of the inserted mitigation.
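A small numerical sketch shows why the essential-bit estimate cannot measure mitigation: added logic raises the estimate even when the measured failure cross-section falls. All values are hypothetical, and using the ratio of measured cross-sections as the with/without metric is an assumption made for illustration.

```python
import math

SIGMA_CONFIG_BIT = 1.0e-11  # hypothetical per-bit cross-section, cm^2/bit


def essential_bit_estimate(n_essential_bits, sigma_config_bit):
    """Essential-bit cross-section estimate (cm^2) for a mapped design."""
    return n_essential_bits * sigma_config_bit


def improvement_factor(sigma_sef_unmitigated, sigma_sef_mitigated):
    """Ratio of measured cross-sections without and with mitigation."""
    if sigma_sef_mitigated <= 0:
        raise ValueError("mitigated cross-section must be positive")
    return sigma_sef_unmitigated / sigma_sef_mitigated


# Hypothetical design: mitigation triples the essential-bit count.
estimate_plain = essential_bit_estimate(10_000_000, SIGMA_CONFIG_BIT)
estimate_mitigated = essential_bit_estimate(30_000_000, SIGMA_CONFIG_BIT)

# Hypothetical RTD-MFTF measurements of the same design.
factor = improvement_factor(sigma_sef_unmitigated=1.0e-5,
                            sigma_sef_mitigated=1.0e-7)

print(f"essential-bit estimate: {estimate_plain:.1e} -> {estimate_mitigated:.1e} cm^2")
print(f"measured improvement from mitigation: about {factor:.0f}x")
```

In this made-up case the estimate rises by a factor of three while the measured cross-section improves by about two orders of magnitude, so only the RTD-MFTF measurement reflects the mitigation's effect.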
Mitigation Analysis
• Purpose of the work is to improve SEU data-sets used for survivability analysis.
• Generic SEU data obtained from testing simple structures (e.g., shift registers) 
are no longer adequate for SEU characterization of FPGA designs.
• An approach is presented that combines investigating simple and complex test 
structures:
– Investigates the efficacy of using configuration SEU data with design 
specific information for survivability analysis.
– Goal is to reduce the necessity of performing SEU testing on every design.
– MFTF testing of complex structures is required to validate the approach (per 
SRAM-based FPGA family of devices).
• Xilinx Kintex UltraScale data are presented:
– Data suggest that essential-bit SEU cross-section might be a reliable data-
set for survivability analysis.  
– Additional testing by Xilinx is required and will be performed… yet initial 
results are promising.
– Eventually, this approach can reduce the need for testing by the end-user.
• If mitigation is required, $\sigma(LET)_{\text{SEF}}$ RTD-MFTF testing is required to be 
performed/orchestrated by the end-user.
Summary
Single event functional interrupt (SEFI)
Single event latch-up (SEL)
Single event transient (SET)
