Mitigation Techniques of Soft-Error Rates in Network Routers Validated in Accelerated Neutron Irradiation Test by Toba  T. et al.

III. 3.  Mitigation Techniques of Soft-Error Rates in Network Routers 




Toba T.1, Shimbo K.1, Nishii K.2, Ibe E.1, and Yahagi Y.1 
 
1Production Engineering Research Laboratory, Hitachi, Ltd. 




Scaling semiconductor devices down to sub-100 nm technology raises a wide variety of
technical challenges, such as Vth variation1) and short-channel effects.
Terrestrial neutron-induced single event upset (SEU) is one such key issue that can
result in major setbacks in scaling2).  MCUs (multi-cell upsets) in memory devices are
currently regarded as one of the most crucial issues, since they can result in MBUs
(multi-bit upsets, defined as MCUs occurring within the same word), a major cause of
system downtime in network components and microprocessors.  These problems
eventually become system-level issues.  For this reason, methods for irradiation
experiments at the system level are significant.  While full chip irradiation tests have been
performed with microprocessors, servers, and routers, this report describes the first partial
board irradiation test involving routers.
CYRIC (No.3 TR32 line), a quasi-monoenergetic neutron facility, is used for 
irradiation tests with a neutron peak energy of 65 MeV.  Figure 1 shows the neutron 
energy spectrum.  High energy protons are used to bombard a thin Li target, generating 
neutrons from the Li nuclei with energy nearly equal to that of the protons.  Neutron 
beams are collimated by a concrete collimator into a 10 cm×10 cm square cross section. 
In the quasi-monoenergetic neutron test, the SEU cross section σseu (En) is generally
measured as a function of neutron energy.  The measured data are approximated by the
Weibull fit function

\[ \sigma_{seu}(E_n) = \sigma_{\infty} \left\{ 1 - \exp\left[ -\left( \frac{E_n - E_{th}}{W} \right)^{S} \right] \right\} \qquad (2) \]

where,
σ∞: saturation value of SEU cross section (cm2);
En: neutron energy at the flux maximum (MeV);
Eth: threshold energy (MeV);    W: width factor (MeV);   S: shape factor (-).
 
Figure 2 shows a typical example of this type of excitation curve.  The curve starts 
from Eth and increases gradually to saturation value σ∞. 
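
The shape of such an excitation curve can be sketched numerically. The snippet below is a minimal illustration that assumes the commonly used four-parameter Weibull form, σ∞{1 - exp[-((En - Eth)/W)^S]}, with zero cross section below Eth. The function name and all parameter values are hypothetical placeholders chosen only to show the curve's behavior; they are not the values fitted in this work.

```python
import math


def weibull_cross_section(e_n, sigma_inf, e_th, width, shape):
    """Return the SEU cross section (cm^2) at neutron energy e_n (MeV).

    Assumes the four-parameter Weibull form; below the threshold energy
    the cross section is taken to be zero.
    """
    if e_n <= e_th:
        return 0.0
    return sigma_inf * (1.0 - math.exp(-(((e_n - e_th) / width) ** shape)))


# Hypothetical parameters for illustration only (not from the paper).
SIGMA_INF = 1.0e-13  # saturation cross section, cm^2
E_TH = 2.0           # threshold energy, MeV
WIDTH = 20.0         # width factor W, MeV
SHAPE = 1.5          # shape factor S, dimensionless

for energy in (1.0, 10.0, 65.0, 500.0):
    sigma = weibull_cross_section(energy, SIGMA_INF, E_TH, WIDTH, SHAPE)
    print(f"En = {energy:6.1f} MeV -> sigma = {sigma:.3e} cm^2")
```

With these placeholder values the curve is zero below Eth, rises monotonically, and approaches σ∞ at high energy, matching the qualitative description of Figure 2.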
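The SER at any location on Earth can be obtained by integrating the Weibull fit
function over the differential neutron flux spectrum at that location:

\[ SER = 10^{9} \int_{E_{th}}^{\infty} \sigma_{seu}(E_n) \, \frac{\partial \phi(E_n)}{\partial E_n} \, dE_n \qquad (3) \]

where,
SER: soft error rate (FIT);     φ (En): neutron flux (n/cm2/h).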
 
Neutron beams with other peak energies are not used because of limited machine time.
The parameters in Eq. (2) are estimated from our test results for SRAMs.
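
As a rough numerical sketch of this folding step, the snippet below integrates a Weibull cross section against a differential neutron flux and scales the result to FIT (failures per 10^9 hours). Everything specific in it is an assumption made for illustration: the Weibull parameters, the power-law `placeholder_differential_flux` (which is not a measured terrestrial spectrum), the 1000 MeV upper cutoff, and the function names.

```python
import math


def weibull_cross_section(e_n, sigma_inf, e_th, width, shape):
    """SEU cross section (cm^2) at energy e_n (MeV), four-parameter Weibull form."""
    if e_n <= e_th:
        return 0.0
    return sigma_inf * (1.0 - math.exp(-(((e_n - e_th) / width) ** shape)))


def placeholder_differential_flux(e_n):
    """Hypothetical d(phi)/dE in n/cm^2/h/MeV; NOT a measured terrestrial spectrum."""
    return 10.0 * e_n ** -1.5


def soft_error_rate_fit(cross_section, differential_flux, e_min, e_max, steps=2000):
    """Trapezoidal estimate of 1e9 * integral of sigma(E) * dphi/dE over [e_min, e_max].

    The energy grid is logarithmically spaced because both factors vary
    over several decades of energy.
    """
    log_min, log_max = math.log(e_min), math.log(e_max)
    step = (log_max - log_min) / steps
    total = 0.0
    prev_e = e_min
    prev_f = cross_section(prev_e) * differential_flux(prev_e)
    for k in range(1, steps + 1):
        e = math.exp(log_min + k * step)
        f = cross_section(e) * differential_flux(e)
        total += 0.5 * (prev_f + f) * (e - prev_e)
        prev_e, prev_f = e, f
    return 1.0e9 * total  # errors per hour scaled to FIT


def sigma(e_n):
    # Hypothetical Weibull parameters, for illustration only.
    return weibull_cross_section(e_n, 1.0e-13, 2.0, 20.0, 1.5)


ser = soft_error_rate_fit(sigma, placeholder_differential_flux, e_min=2.0, e_max=1000.0)
print(f"Illustrative SER: {ser:.3e} FIT")
```

The printed value has no physical meaning; a real estimate would substitute the fitted SRAM parameters and a reference spectrum for the location of interest.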
Figure 3 illustrates the test equipment layout.  The BUT (Board Under Test) is set 
up perpendicular to the neutron beam, centered 125 cm above the floor surface and 40 cm 
from the aperture of the neutron collimator.  The position on the table at which the BUT is 
set up is altered depending on the part to be irradiated.  An FPGA chip, a CPU chip, and a
memory chip (all SRAMs, or SRAMs partially replaced by DRAMs) are chosen as partial
irradiation components on the board, since they are regarded as the most vulnerable to
neutron-induced soft errors, which are recognized by a system reboot.  Figure 4 shows the
two types of memory architecture (Set A and Set B).  In the memory chip, all 48 MB of
memory cells are SRAMs for Set A; in Set B, 12.5 MB of memory cells for timer information,
which does not require high-speed operation, are replaced by DRAMs.  The remaining 25.5 MB
of SRAM memory cells are unchanged, since they handle session information and require
high-speed operation.  Overall performance of Set A and Set B is nearly equivalent.  The
actual sizes of the SRAMs used in the test application are 14.5 MB and 2.7 MB for Set A
and Set B, respectively.  The contribution of DRAMs to overall failures is
believed to be negligible compared to that of the SRAMs and is disregarded3).
The system reboot procedure is repeated about 10 times to obtain the average cross
section for each chip.  Figure 5 illustrates the process of data acquisition.  The neutron
flux is roughly stable, as shown at the bottom.  The neutron beam is shut off when rebooting
takes place (for example, due to parity errors in SRAMs), and the fluence Φi in the i-th
irradiation period is estimated from the total proton charge directed toward the Li target
in the i-th period.  If root cause analysis determines rebooting to be attributable to other
components (for example, FPGA bus errors, as shown in Fig. 5) not caused by direct
irradiation, the data are eliminated from the data analysis for the chip in question.
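Since each irradiation cycle corresponds to a single failure, σiseu for the i-th cycle is
given by

\[ \sigma^{i}_{seu} = \frac{1}{\Phi_i}. \qquad (4) \]

Or,

\[ \sigma_{seu} = \frac{n}{\sum_{i=1}^{n} \Phi_i} \qquad (5) \]

where,
n: total number of cycles for the chip in question.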
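
The data handling described above (one failure per irradiation cycle, fluence derived from proton charge, and exclusion of cycles whose reboot is traced to another component) can be sketched as follows. This is an illustrative outline only: the `Cycle` record, the calibration constant `NEUTRONS_PER_CM2_PER_UC`, and the sample numbers are hypothetical and not taken from the experiment. The pooled estimate (failures divided by total fluence) is shown alongside the arithmetic mean of the per-cycle values for comparison.

```python
from dataclasses import dataclass


@dataclass
class Cycle:
    proton_charge_uc: float   # proton charge on the Li target in this cycle (microcoulombs)
    attributed_to_chip: bool  # False if root-cause analysis blames another component


# Hypothetical calibration: neutron fluence at the board per unit proton charge.
NEUTRONS_PER_CM2_PER_UC = 1.0e6


def fluence(cycle):
    """Estimated neutron fluence (n/cm^2) for one irradiation cycle."""
    return cycle.proton_charge_uc * NEUTRONS_PER_CM2_PER_UC


def per_cycle_cross_sections(cycles):
    """One failure per cycle, so each retained cycle gives 1 / fluence."""
    return [1.0 / fluence(c) for c in cycles if c.attributed_to_chip]


def mean_cross_section(cycles):
    """Arithmetic mean of the per-cycle cross sections."""
    values = per_cycle_cross_sections(cycles)
    return sum(values) / len(values)


def pooled_cross_section(cycles):
    """Number of retained failures divided by their total fluence."""
    kept = [c for c in cycles if c.attributed_to_chip]
    return len(kept) / sum(fluence(c) for c in kept)


# Hypothetical cycles: the third is excluded (reboot traced to another component).
cycles = [Cycle(1.0, True), Cycle(3.0, True), Cycle(2.0, False)]
print(f"mean of per-cycle values: {mean_cross_section(cycles):.3e} cm^2")
print(f"pooled estimate:          {pooled_cross_section(cycles):.3e} cm^2")
```

The two estimates coincide only when every cycle receives the same fluence; when cycle lengths vary, the mean of per-cycle values is weighted toward short cycles.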
 
Table 1 summarizes the estimated SER at Tokyo sea level for the total board, SRAM, CPU,
and FPGA in Set A and Set B.  The table also gives the incidence of system reboots
measured in the field at Tokyo sea level over about one year for Sets A and B.
The architectural mitigation method is effective and can reduce SERs in
the BUT by a factor of about 8 to 9.  As Figure 6 also shows, this reduction ratio is
consistent for field and accelerator tests.  Based on the partial board irradiation tests,
the most vulnerable of the three component types is SRAM (accounting for approximately
95% of all reboot events).  While CPU vulnerability appears to be low, care is required, since
vulnerability depends on the CPU and the number of FPGAs actually operating.  In the
present BUT, the number and operating ratio of FPGAs are relatively small.  The total
irradiation time during the accelerated test is a mere 18 hours, showing the efficacy of a
partial irradiation test compared to a one-year field test.  The results indicate the
architectural mitigation method can reduce SERs in both field and accelerator tests by a
factor of about 10.  The absolute values, however, are not consistent with each other:
SERs in the field are higher than in the accelerator test by a factor of 6 to 7.
Three factors potentially giving rise to this discrepancy are evaluated below.
(i) SER estimation errors from the accelerator test due to potential oversimplification in
the Weibull fit methodology.  Given the maximum error in the estimated SER of
approximately 15%, this mechanism would not appear to explain the discrepancy.
(ii) The application used in the accelerator test is a simple write-once-read-many
operation.  The start of the irradiation may be premature relative to the
time at which the critical data (leading to rebooting) in the SRAMs stabilize.
Additionally, the number of SRAMs with critical data may be less than expected.
(iii) The contributions of low energy neutrons (1-10 MeV) to SER are unexpectedly 
enhanced as device scaling proceeds from 90 nm to 22 nm2).  This would call for a 
revision of the Weibull fit.  Experimental data for low energy protons and theoretical work 
supporting this prediction are accumulating rapidly4). 
We have introduced a novel chip-level to board-level SER evaluation method for
network routers using 65 MeV quasi-monoenergetic neutron beams and demonstrated an
architectural mitigation technique against terrestrial neutron SERs.  By replacing
SRAMs with DRAMs in a chip for which speed is not the first priority, in
light of operating ratios, the method achieved an approximately 10-fold reduction in
board-level SERs.  This reduction ratio is consistent with field data for commercial
operation over a period of approximately one year.  However, the absolute SER level
estimated from the accelerated test is 6 to 7 times lower than the field data.  This suggests
that a study of the effects of low energy neutrons is in order as the first step in future
investigations.
We have also proposed a generic strategy for low-cost, low-power
mitigation of chip- and board-level SERs, based on stepwise upper bound reductions.  The





1)  Sugii N., Tsuchiya R., Ishigaki T., Morita Y., Yoshimoto H., Torii K., Kimura S., IEDM, San
Francisco, Dec. 15-17 (2008) 249.
2)  Ibe E., Taniguchi H., Yahagi Y., Shimbo K., Toba T., IEEE Trans. Electron Devices, in press (2010).
3)  Borucki L., Schindlbeck G., Slayman C., IRPS 2008, Phoenix, Arizona, April 27-May 1,
Phoenix Convention Center, No. 5A.4 (2008).
4)  Heidel D.F., Marshall P.W., Pellish J.A., Rodbell K.P., LaBel K.A., Schwank J.R., Rauch S.E.,
Hakey M.C., Berg M.D., Castaneda C.M., Dodd P.E., Friendlich M.R., Phan A.D., Seidleck C.M.,

Figure 1.  Neutron spectrum used for the partial irradiation test in CYRIC. Peak flux is obtained at about 65 MeV. [© IEEE].
Figure 3.  Board setup and conceptual layout of experimental components [© IEEE].
Table 1.  Test results of SER normalized at Tokyo sea level for Sets A and B in accelerated and field tests [© IEEE].
Figure 2.  Typical conventional Weibull

Figure 4.  Board casing and CPU access memory map for Set A and Set B [© IEEE].
Figure 5.  Image of data acquisition and handling [© IEEE].
Figure 6.  Comparison of estimated SER in accelerator test and measured SER in field for Sets A and B [© IEEE].
 
