SEU Tolerance Design And Implementation of A Space High Reliability Microprocessor  by Bin, Bao et al.
Procedia Engineering 23 (2011) 525 – 530
1877-7058 © 2011 Published by Elsevier Ltd.
doi:10.1016/j.proeng.2011.11.2542
Available online at www.sciencedirect.com
SEU Tolerance Design And Implementation of A Space High 
Reliability Microprocessor 
BAO Bin a, WAN Min a, WANG YunLong a,*
a Beijing Institute of Space Mechanics & Electricity of CAST, 9201 Mailbox, Beijing 100076, China
Abstract 
This paper studies a high reliability space microprocessor designed for space platforms such as
on-board computer systems. The space radiation environment is a special factor that must be considered
when designing such a processor. The architecture of the Leon3 microprocessor, which conforms to the
IEEE-1754 (SPARC V8) architecture, is investigated, and the Single-Event Effects on Leon3 are analyzed.
Based on Leon3, a fault-tolerant microprocessor is designed to verify Single Event Upset (SEU) tolerance
measures. An extended Hamming code is adopted for the IU and a CRC for the cache to mitigate SEU errors.
Fault injection in the ModelSim simulator is used to verify the fault tolerance design. An FPGA platform
is built to evaluate the cost and compatibility of this fault tolerance design.
Keywords: Space Radiation; Single-Event Effects; High Reliability Microprocessor; Leon3
1. Introduction
High reliability microprocessors are used in space missions, including low Earth orbit (LEO) satellites,
geosynchronous orbit (GEO) satellites and planetary exploration computer systems. They serve as the data
processing centers of spacecraft, instructing the on-board equipment to work in coordination. Radiation
effects are one of the most critical issues for high reliability microprocessors used in space, because of
the high-energy protons, neutrons and alpha particles from galactic cosmic rays. The reliability of these
microprocessors has an important effect on the reliability of the on-board computer systems.
As China's aerospace industry develops rapidly, the demand for microprocessors with high performance and
reliability is increasing. The industry needs the support of computers and other devices, among which
space grade chips and integrated circuits are especially important. Designing a fault-tolerant, high
reliability aerospace microprocessor has therefore become urgent.
* BAO Bin. Tel.: 0-86-010-68780595; fax: 0-86-010-88530760.
E-mail address: baobin4012@yahoo.com.cn.
Open access under CC BY-NC-ND license.
This work designed a fault-tolerant SPARC processor and implemented it on a development board. Tests on
the development board show that the uClinux operating system runs normally on the fault-tolerant
microprocessor.
This paper is organized as follows. Section 2 presents the SPARC V8 architecture and the Leon3 processor.
Section 3 presents the fault tolerance design of Leon3. Section 4 gives test results and conclusions.
2. Description of Microprocessor 
2.1. SPARC Architecture 
SPARC is a CPU instruction set architecture (ISA) derived from a reduced instruction set computer
(RISC) lineage. SPARC, formulated at Sun Microsystems in 1985, is based on the RISC I & II designs
engineered at the University of California at Berkeley from 1980 through 1982. [1]
A SPARC processor logically comprises an integer unit (IU), a floating-point unit (FPU), and an 
optional coprocessor (CP). 
The IU contains the general-purpose registers and controls the overall operation of the processor. The
IU executes the integer arithmetic instructions and computes memory addresses for loads and stores. It
also maintains the program counters and controls instruction execution for the FPU and the CP.
Most microprocessors contain a small internal cache in the size range of 2K to 64K bytes. A Harvard
architecture with separate instruction and data caches is used so that instruction fetch and data access
can proceed simultaneously.
2.2. Leon3 Review 
LEON3 is a 32-bit processor core conforming to SPARC V8 architecture. It is designed for embedded 
applications, combining high performance with low complexity and low power consumption.[2] 
The LEON3 core has the following main features: 7-stage pipeline with Harvard architecture, hardware
multiplier and divider, on-chip debug support and multiprocessor extensions.
The LEON3 integer unit implements the full SPARC V8 standard, including hardware multiply and 
divide instructions. The number of register windows is configurable within the limit of  the SPARC 
standard (2 - 32), with a default setting of 8. The pipeline consists of 7 stages with a separate instruction 
and data cache interface (Harvard architecture). 
The LEON3 integer unit uses a single instruction issue pipeline with 7 stages:
1. FE (Instruction Fetch)
2. DE (Decode)
3. RA (Register Access)
4. EX (Execute)
5. ME (Memory)
6. XC (Exception)
7. WR (Write)
3. Fault Tolerance Design of  Leon3 
Storage units (data memory, program memory and registers) occupy 40%-70% of the chip area, and all kinds
of storage units are sensitive to SEU. An error in program memory, for example in the target address of a
JMPL instruction, causes the program to jump to a wrong address and derails execution completely.
Therefore, it is important to design a fault-tolerant processor that mitigates SEU effects.
Leon3 was enhanced with fault tolerance capabilities to mitigate SEU effects. Two kinds of redundancy
were adopted. One is applied to the integer unit and can detect two-bit errors and correct one-bit
errors. The other is used by the cache system and can detect two-bit errors. With this SEU fault
tolerance design, Leon3 is suited to running in the space radiation environment.
We used a 32K instruction cache and a 32K data cache for cache memory, and 8 register windows for the
register file. We chose these values because they are typical of processors such as the RAD750 and ARM.
3.1. High Reliability Design of Integer Unit 
The integer unit is the core of the processor and has a 7-stage instruction pipeline.
The current running state of Leon3 is stored in the register file. A corrupted register can severely
damage the system, so fault tolerance design is crucial here. In this paper a 7-bit extended Hamming
code is added to the Leon3 register file for fault tolerance. This code can detect 2-bit errors and
correct 1-bit errors.
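To make the detect-two, correct-one behaviour concrete, the following is a minimal sketch of an extended Hamming code with 7 check bits over a 32-bit word (39 bits in total). The bit layout, the function names and the status strings are assumptions chosen for illustration; the paper does not specify how its check bits are arranged.

```python
# Sketch of a (39,32) extended Hamming (SEC-DED) code. Layout is assumed:
# index 0 holds the overall parity bit, indices 1..38 form a Hamming code
# with parity bits at the power-of-two positions.
PARITY_POS = (1, 2, 4, 8, 16, 32)
DATA_POS = [p for p in range(1, 39) if p not in PARITY_POS]  # 32 positions


def encode(data):
    """Encode a 32-bit integer into a list of 39 bits."""
    bits = [0] * 39
    for i, pos in enumerate(DATA_POS):
        bits[pos] = (data >> i) & 1
    for p in PARITY_POS:
        # Parity bit p covers every position whose index has bit p set.
        bits[p] = sum(bits[q] for q in range(1, 39) if q & p and q != p) % 2
    bits[0] = sum(bits[1:]) % 2  # overall parity enables double-error detection
    return bits


def decode(bits):
    """Return (status, data); status is no_error, corrected or uncorrectable."""
    syndrome = 0
    for pos in range(1, 39):
        if bits[pos]:
            syndrome ^= pos
    odd_parity = sum(bits) % 2 == 1
    fixed = list(bits)
    if syndrome == 0 and not odd_parity:
        status = "no_error"
    elif odd_parity and syndrome < 39:
        fixed[syndrome] ^= 1  # syndrome 0 means the overall parity bit flipped
        status = "corrected"
    else:
        status = "uncorrectable"  # even number of flips, or invalid syndrome
    data = sum(fixed[pos] << i for i, pos in enumerate(DATA_POS))
    return status, data
```

Flipping any single bit of `encode(w)` makes `decode` report `corrected` and return `w`, while flipping two bits yields `uncorrectable`, which corresponds to the case the paper handles with a register access error trap.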
The SEU fault tolerance design for the register file is similar to speculative execution. The basic
idea is as follows: data read from the register file are processed immediately, even though they may
have been corrupted. Checking and correction are done in parallel, and both the corrected data and
the result computed from the raw data are passed to the registers of the next pipeline stage.
The structure is shown in Figure 1. The register file has dual read ports, so two sets of check and
correct units process the output data simultaneously. An instruction reads the register file at most
four times, although most instructions read it twice. An instruction that reads the register file
four times could therefore restart the pipeline twice to correct errors in the register file. The
register file has only one write port, so the pipeline must be scheduled carefully to write the
corrected data back. Pipeline performance is improved significantly, at the cost of more complex
logic, by writing two corrected data words with a single pipeline restart. The two corrected data
words are therefore stored in the registers of the ME stage.
Fig. 1. LEON3 integer unit datapath diagram (blocks: register file, check units, correct units, EX-stage registers, ALU, ME-stage registers)
The registers of the ME stage store the checking result, which is one of: no error, correctable
error, or uncorrectable error.
If there is no error, all instructions continue, and store instructions in the ME stage write the
data from the register file to memory. If the detected error can be corrected, store instructions
still write the data to memory; otherwise they do not. If the error is uncorrectable, the register
access error bit (TT = 0x20) is set.
If there is no error, the data are written to the register file together with the extended Hamming
code generated by the coding circuit in the XC stage. If the error is correctable, the system
selects the corrected data, writes them back to the register address affected by the SEU error,
cancels the follow-up instructions, and restarts the pipeline automatically. The restarted pipeline
resumes execution from the instruction where the SEU error occurred. The restart process is the same
as the trap process: a pipeline cancel signal is issued in the XC stage and the address of the
instruction with the SEU error is passed to the FE stage. If the register access error bit is set, a
register access error trap occurs in the XC stage.
As in the Leon3 trap process, a pipeline restart triggered by an instruction with an SEU error
completes in two cycles, as shown in Table 1 and Table 2.
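Table 1. Pipeline with an SEU error
Cycle  1   2   3   4   5   6   7   8       9   10  11
FE     I1  I2  I3  I4  -   -   -   I2      I3  I4  -
DE     -   I1  I2  I3  I4  -   -   -       I2  I3  I4
RA     -   -   I1  I2  I3  I4  -   -       -   I2  I3
EX     -   -   -   I1  I2  I3  I4  -       -   -   I2
ME     -   -   -   -   I1  I2  I3  -       -   -   -
XC     -   -   -   -   -   I1  I2  -       -   -   -
WR     -   -   -   -   -   -   I1  update  -   -   -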
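Table 2. Pipeline with two SEU errors
Cycle  1   2   3   4   5   6   7   8       9       10  11
FE     I1  I2  I3  I4  -   -   -   I2      I3      I4  -
DE     -   I1  I2  I3  I4  -   -   -       I2      I3  I4
RA     -   -   I1  I2  I3  I4  -   -       -       I2  I3
EX     -   -   -   I1  I2  I3  I4  -       -       -   I2
ME     -   -   -   -   I1  I2  I3  -       -       -   -
XC     -   -   -   -   -   I1  I2  -       -       -   -
WR     -   -   -   -   -   -   I1  update  update  -   -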
Cycle 1:
a. Cancel the instructions in the pipeline.
b. Write the corrected data to the register file.
c. Record the next valid PC value.
d. Pass the PC value of the current instruction to the FE stage.
Cycle 2:
a. Pass the next valid PC value to the FE stage.
b. If more data remain to be updated, continue to occupy the write port of the register file.
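The occupancy pattern of Table 1 and Table 2 can be reproduced with a small model. This is only a sketch of the timing, not of the RTL: the function name, its parameters, and the assumptions that instruction k enters FE in cycle k and that the refetch starts in the cycle after the error is seen in XC are all illustrative choices.

```python
STAGES = ("FE", "DE", "RA", "EX", "ME", "XC", "WR")


def pipeline_diagram(n_instr=4, faulty=2, n_updates=1, n_cycles=11):
    """Return {stage: [cell for cycle 1..n_cycles]} for one pipeline restart.

    Assumptions (not taken from the paper's RTL): instruction k enters FE
    in cycle k, the error of instruction `faulty` is flagged while it is
    in XC, and the refetch starts in the following cycle.
    """
    table = {stage: ["-"] * n_cycles for stage in STAGES}
    detect = faulty + STAGES.index("XC")  # cycle in which the error is seen

    def put(stage, cycle, text):
        if 1 <= cycle <= n_cycles:
            table[stage][cycle - 1] = text

    for k in range(1, n_instr + 1):
        for s, stage in enumerate(STAGES):
            # First pass: older instructions finish, the rest are cancelled
            # once the error has been detected.
            if k < faulty or k + s <= detect:
                put(stage, k + s, f"I{k}")
            # Second pass: the faulty instruction and its successors refetch.
            if k >= faulty:
                put(stage, detect + 1 + (k - faulty) + s, f"I{k}")

    for u in range(n_updates):
        # Corrected operands occupy the single register-file write port.
        put("WR", detect + 1 + u, "update")
    return table


for stage, cells in pipeline_diagram(n_updates=2).items():
    print(f"{stage:3}", " ".join(f"{c:6}" for c in cells))
```

With `n_updates=1` the WR row holds one `update` slot after I1, and with `n_updates=2` it holds two, which is the only difference between the two tables.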
The correctness of this design was verified by injecting errors in the ModelSim simulator, which shows
that the fault-tolerant IU can handle SEU errors.
3.2. High Reliability Design of Cache system
Data in the cache normally have backup copies in external memory, and the Leon3 processor keeps cache
data consistent with external memory. When an error in cache data is detected, the IU reads the data
directly from external memory instead of from the cache, regardless of whether the access hits, and at
the same time the fetched instructions or data are written into the cache. This corrects and recovers
the erroneous data in the cache. The cache system therefore only needs strong error detection, and in
this paper a cyclic redundancy check (CRC) is applied for the fault tolerance design.
The structure of the CRC-protected cache system is shown in Figure 2. A 3-bit check code is added to
each 32-bit data word to detect 2-bit errors, and likewise to each 25-bit tag.
Fig. 2. Cache system with CRC (blocks: address, cache data, CRC, valid bit, check code generation, compare, hit)
The fault-tolerant cache controller can handle errors of up to 2 bits in cache memory: on detection it
forces a cache miss and reads the instructions or data from external memory.
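The forced-miss mechanism can be sketched as follows. The paper does not state the CRC generator polynomial, so x^3 + x + 1 is assumed here, and the `CheckedCache` class, its fields and the address-keyed line store (with no tag check) are simplifications made for illustration only.

```python
CRC_POLY = 0b1011  # assumed generator x^3 + x + 1; not given in the paper


def crc3(value, width):
    """Return the 3-bit remainder of value * x^3 divided by CRC_POLY."""
    reg = value << 3
    for bit in range(width + 2, 2, -1):
        if reg & (1 << bit):
            reg ^= CRC_POLY << (bit - 3)
    return reg & 0b111


class CheckedCache:
    """Toy cache: each line stores a 32-bit word plus its 3-bit check code."""

    def __init__(self, memory):
        self.memory = memory      # backing store: address -> 32-bit word
        self.lines = {}           # address -> (data, check code)
        self.forced_misses = 0

    def read(self, addr):
        entry = self.lines.get(addr)
        if entry is not None:
            data, check = entry
            if crc3(data, 32) == check:
                return data           # hit with a valid check code
            self.forced_misses += 1   # check failed: treat the hit as a miss
        # Miss (real or forced): refetch from memory and refill the line.
        data = self.memory[addr]
        self.lines[addr] = (data, crc3(data, 32))
        return data
```

A single flipped bit in a stored word always changes the remainder for this polynomial, so the next read falls through to memory and rewrites the line. The sketch makes no claim about which multi-bit patterns are caught.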
4. Results and Conclusions 
A development board was used to verify the correctness of the fault tolerance design. The original
design was synthesized with Synplify Pro, and the maximum frequency obtained was 62 MHz. Table 3 shows
the device utilization reported by the EDA tool after place-and-route.
Table 3. Device utilization of the original design
Logic utilization   Used    Available   Utilization
Flip Flops          2769    67584       4%
4 input LUTs        9684    67584       14%
Block RAM           44      144         30%
The fault-tolerant processor can operate at 60.9 MHz, 1.77% lower than the original design, so the
fault tolerance design has little effect on the achievable frequency. The register file becomes more
reliable at a cost of only 12.6% of the flip flops, and the D-cache and I-cache are protected at a cost
of only 12% of the Block RAM. A further 7.24% of the LUTs are used to encode and decode the check codes.
These three percentages are measured against the fault-tolerant totals in Table 4. Table 4 shows the
device utilization of the fault tolerance design reported by the EDA tool after place-and-route.
Table 4. Device utilization of the fault tolerance design
Logic utilization   Used    Available   Utilization
Flip Flops          3170    67584       4%
4 input LUTs        10440   67584       15%
Block RAM           50      144         34%
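The overhead percentages quoted in this section can be recomputed from Table 3 and Table 4. The short check below is a sketch with illustrative names; it shows that the quoted resource figures match the added resources divided by the fault-tolerant totals, while the frequency loss is relative to the original 62 MHz.

```python
# Figures copied from Table 3 (original) and Table 4 (fault tolerant).
original = {"flip_flops": 2769, "luts": 9684, "bram": 44, "fmax_mhz": 62.0}
tolerant = {"flip_flops": 3170, "luts": 10440, "bram": 50, "fmax_mhz": 60.9}


def share_of_tolerant(key):
    """Added resources as a percentage of the fault-tolerant total."""
    return 100.0 * (tolerant[key] - original[key]) / tolerant[key]


def fmax_loss():
    """Frequency reduction as a percentage of the original frequency."""
    return 100.0 * (original["fmax_mhz"] - tolerant["fmax_mhz"]) / original["fmax_mhz"]


for key in ("flip_flops", "luts", "bram"):
    print(f"{key}: {share_of_tolerant(key):.2f}%")
print(f"fmax loss: {fmax_loss():.2f}%")
```

Measured against the original design instead, the same increases would be somewhat larger (for example 401 extra flip flops on a base of 2769), so the choice of denominator matters when comparing with other designs.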
The development board with the high reliability processor can run uClinux, as shown in Fig. 3. [3]
Fig. 3. Booting uClinux
The performance and resource usage of the original and fault-tolerant designs were compared and the
results analyzed. uClinux ran successfully on the fault-tolerant processor. An important point is that
the fault-tolerant processor is fully compatible with the original processor, at only a small cost in
performance and resources.
References 
[1] The SPARC Architecture Manual, Version 8. SPARC International Inc.
[2] Jiri Gaisler, Edvin Catovic, Marko Isomäki, et al. GRLIB IP Core User's Manual. Gaisler Research, 2006.
[3] Jiri Gaisler, Edvin Catovic, Marko Isomäki, et al. GRMON User's Manual. Gaisler Research, 2007.
