Development and Qualification of an FPGA-Based Multi-Processor System-on-Chip On-Board Computer for LEO Satellites by Mohamed Mahmoud Mohamed Ibrahim
Development and Qualification of an FPGA-Based
Multi-Processor System-on-Chip On-Board
Computer for LEO Satellites
著者 Mohamed Mahmoud Mohamed Ibrahim
year 2014
学位授与年度 平成26年度
学位授与番号 17104甲工第374号
URL http://hdl.handle.net/10228/5315
  
 
Kyushu Institute of Technology 
Dept. of Applied Science for Integrated 
System Engineering 
Graduate School of Engineering 
 
 
Development and Qualification  
of an FPGA-Based Multi-Processor  
System-on-Chip On-Board Computer  
for LEO Satellites 
 
 
 
By 
Mohamed Mahmoud Mohamed Ibrahim 
 
 
A THESIS SUBMITTED TO THE  
GRADUATE SCHOOL OF ENGINEERING  
AT KYUSHU INSTITUTE OF TECHNOLOGY 
IN PARTIAL FULFILLMENT OF  
THE REQUIREMENTS FOR THE DEGREE OF  
DOCTOR OF ENGINEERING 
IN APPLIED SCIENCE FOR INTEGRATED SYSTEM 
ENGINEERRING 
 
 
 
Graduate School of Engineering, Kyushu Institute of Technology 
Kitakyushu, Japan 
2014 
  
 
Kyushu Institute of Technology 
Dept. of Applied Science for Integrated 
System Engineering 
Graduate School of Engineering 
 
 
Development and Qualification  
of an FPGA-Based Multi-Processor  
System-on-Chip On-Board Computer  
for LEO Satellites 
 
 
 
By 
Mohamed Mahmoud Mohamed Ibrahim 
 
 
A THESIS SUBMITTED TO THE  
GRADUATE SCHOOL OF ENGINEERING  
AT KYUSHU INSTITUTE OF TECHNOLOGY 
IN PARTIAL FULFILLMENT OF  
THE REQUIREMENTS FOR THE DEGREE OF  
DOCTOR OF ENGINEERING 
IN APPLIED SCIENCE FOR INTEGRATED SYSTEM 
ENGINEERRING 
 
 
 
Thesis Supervisor 
Assoc. Prof. Kenichi Asami 
Dept. of Integrated Systems Engineering 
Kyushu Institute of Technology 
  
 
  
I 
 
Acknowledgment 
All of the thanks go to the almighty ALLAH for giving me the strength and 
ability to finish this work. It is all out of his willingness and endowments upon us. 
May Allah reward us for the time and effort spent in producing this work. I ask for 
forgiveness in case you find any pitfalls in it and for prayer in case you find it 
beneficial.  
When I joined the Egyptian Space Program (ESP), more than 10 years ago, I was 
deeply interested in learning more about this field of high fidelity and ultimately 
critical engineering, namely space engineering. I tried to do all what I could in order 
to learn and become a real expert in that field. During my struggle to learn, I 
travelled to many places, knew a lot of things and dealt with many nationalities from 
the west to the east of this planet. In that pervasive period of intensive education and 
exposure to different cultures, I learnt that all of us are alike even though cultural 
differences do exist among us. We are alike in our feelings, in our willingness, in our 
ambitions, and in our love for peace and justice.  
In that journey of education, I did learn the true meaning of working for the sake of 
the bigger image instead of the narrow field of interests. Work for humanity, in 
general and without discrimination, to make the life of people better, more affordable 
and happier. I truly knew the value and meaning of the statement “science has no 
borders”. It is true because science is the resemblance of justice, truth, and equity and 
those meanings has no borders to cross, they are just intrinsic in good spirits and 
souls. 
The work in this thesis would not have been possible without the support of many 
staff, organizations and individuals. My words are not sufficiently eloquent to 
express my gratitude feelings towards them. First, I would like to express my deep 
gratitude to the ESP and the National Authority for Remote Sensing and Space 
Sciences (NARSS) at Egypt for giving me the chance to work in the field of space 
engineering. Special thanks to Prof. Ayman El Dessouki, former chairman of 
NARSS, for approving and recommending my application to this fellowship through 
which I was granted my doctoral degree. I do appreciate the efforts and work of all of 
  
II 
 
the employees at ESP and NARSS and wish them a brilliant future. Thanks to my 
master degree supervisor, the late Prof. Yousri El Nahas, and to the late Prof. Matar 
Ali Matar, may their souls rest in peace.  
I would like to thank my supervisor, Assoc. Prof. Kenichi Asami, for supervising me 
during my study at Kyuhsu Institute of Technology (KIT). Words cannot express my 
gratitude for his kindness, understanding, support and deep guidance. My 
appreciation and thanks to Prof. Mengu Cho, the scientist, the innovator, the teacher, 
the motivator and above all the mentor of our success. Deep appreciation for Dr. 
Werner Balogh at the United Nations Office for Outer Space Affairs (UNOOSA) and 
all the staff there for the extraordinary efforts in inaugurating the Doctorate in Nano-
Satellite Technology (DNST) program in collaboration with KIT. 
My appreciation and thanks to the professors at KIT for teaching us and giving us 
from their effort and time. Thanks to Prof. Keiichi Okuyama, Assoc. Prof. Kazuhiro 
Toyoda, Assoc. Prof. Motoki Miura and Assoc. Prof. Sozo Inoue. I do appreciate the 
efforts of the administrative staff at KIT for providing support and help during our 
study. I would like to thank Kanae Iyama, Kumiko Shirakawa, Michiko Kubota and 
Seiji Kawano.  
Special thanks to my brothers and colleagues at KIT for making my stay in Japan 
more knowledgeable and much friendly. Thanks to Dr. Yasser Qudaih for all the help 
and support he gave to me. My great appreciation to the DNST, PNST and SEIC 
fellows, Amgalanbat Batsuren, Ammarin Pimnoo, Bianca Szasz, Bui Nam Duong, 
Chen Shiyi Danny, Hala Almubarak, Kateryna Aheieva, Mohamed Alakli, Mohamed 
Nori Ait, Mohamed Yehia Edris, Pauline Faure, Tejumola Taiwo and Trinh Thang 
Long. Thanks to my Japanese colleagues for their fruitful cooperation during my 
study. My thanks to Shinya Kuramoto at NEC, Yuki Seri at Hitachi for building the 
geometric thermal model, Yuta Okumura at Hitachi for helping with the setup of 
radiation testing, and Dr. Masui Hirokazu for his guidance in the thermal vacuum 
and radiation tests. 
My mother and father supported me with every possible way. They gave me more 
than I deserve, all over my life. I respect them, love them, and owe them my success 
and all what I achieve in my life. My lord, bestow on them your mercy as they did 
  
III 
 
bring me up when I was young. Thanks to my brother and sisters for supporting me 
and praying for me to finish this work. My deep thanks, appreciation, love and 
respect for my wife. She did a lot to support me to finish this work. She struggled as 
much as I did during my study. Sometimes we had difficult times as much stress was 
over there in my experiments and tests, but she was extremely smart to help us pass 
all of that smoothly and peacefully. Taking care of three children while I am 
spending sleepless nights in the lab was a real challenging task. Thank you very 
much. I dedicate this work to my children Yousof, Jody and Mahmoud, my nephew 
Yassin and nieces Basmalla and Iten. I wish when they grow up that they would 
follow the way of science and engineering, May ALLAH bless them all and grant 
them happiness, success and health.  
Thanks to Japan and to Japanese people for the generosity to give me the chance to 
study here. I do respect and love Japan and Japanese people so much. I wish I can 
help my country to reach the same level of perfection.   
  
IV 
 
Abstract 
Developing small satellites for scientific and commercial purposes is emerging 
rapidly in the last decade. The future is still expected to carry more challenging 
services and designs to fulfill the growing needs for space based services. 
Nevertheless, there exists a big challenge in developing cost effective and highly 
efficient small satellites yet with accepted reliability and power consumption that is 
adequate to the mission capabilities. This challenge mandates the use of the recent 
developments in digital design techniques and technologies to strike the required 
balance between the four basic parameters: 1) Cost, 2) Performance, 3) Reliability 
and 4) Power consumption. This balance becomes even more stringent and harder to 
reach when the satellite mass reduces significantly. Mass reduction puts strict 
constraints on the power system in terms of the solar panels and the batteries. That 
fact creates the need to miniaturize the design of the subsystems as much as possible 
which can be viewed as the fifth parameter in the design balance dilemma. 
At Kyuhsu Institute of Technology-Japan we are investigating the use of SRAM-
based Field Programmable Gate Arrays (FPGA) in building: 1) High performance, 2) 
Low cost, 3) Moderate power consumption and 4) Highly reliable Muti-Processor 
System-on-Chip (MPSoC) On-Board Computers (OBC) for future space missions 
and applications. This research tries to investigate how commercial grade SRAM-
based FPGAs would perform in space and how to mitigate them against the space 
environment. Our methodology to answer that question depended on following 
formal design procedure for the OBC according to the space environment 
requirements then qualifying the design through extensive testing. We developed the 
MPSoC OBC with 4 complete embedded processor systems. The Inter Processor 
Communication (IPC) takes place through hardware First-In-First-Out (FIFO) 
mailboxes. One processor acts as the system master controller which monitors the 
operation and controls the reset and restore of the system in case of faults and the 
other three processors form Triple Modular Redundancy (TMR) fault tolerance 
architecture with each other. We used Dynamic Partial Reconfiguration (DPR) in 
scrubbing the configuration memory frames and correcting the faults that might 
exist. The system is implemented using a Virtex-5 LX50 commercial grade FPGA 
from Xilinx. The research also qualifies the design in the ground-simulated space 
  
V 
 
environment conditions. We tested the implemented MPSoC OBC in Thermal 
Vacuum Chambers (TVC) at the Center of Nano-Satellite Testing (CeNT) at Kyushu 
Institute of Technology. Also we irradiated the design with proton accelerated beam 
at 65 MeV with fluxes of 10e06 and 3e06 particle/cm2/sec at the Takasaki Advanced 
Radiation Research Institute (TARRI). 
The TVC test results showed that the FPGA design exceeded the limits of normal 
operation for the commercial grade package at about 105 C˚. Therefore, we mitigated 
the package using: 1) heat sink, 2) dynamic temperature management through 
operating frequency reduction from 100 MHz to 50 MHz and 3) reconfiguration to 
reduce the number of working processors to 2 instead of 4 by replacing the space-
redundancy TMR with time-redundancy TMR during the sunlight section of the 
orbit. The mitigation proved to be efficient and it even reduced the temperature from 
105 C˚ to about 66 C˚ when the heat sink, frequency reduction, and reconfiguration 
techniques were used together.  
The radiation and the fault injection tests showed that mitigating the FPGA 
configuration frames through scrubbing are efficient when Single Bit Upsets (SBU) 
are recorded. Multiple Bit Upsets (MBU) are not well mitigated using the scrubbing 
with Single Error Correction Double Error Detection (SECDED) technique and the 
FPGA needs to be totally reset and reloaded when MBUs are detected in its 
configuration frames. However, as MBUs occurrence in space is very seldom and 
rare compared to SBUs, we consider that SECDED scrubbing is very efficient in 
decreasing the soft error rate and increasing the reliability of having error-free 
bitstreams. The reliability was proven to be at 0.9999 when the scrubbing rate was 
continuous at a period of 7.1 msec between complete scans of the FPGA bitstream. 
In the proton radiation tests we managed to develop a new technique to estimate the 
static cross section using internal scrubbing only without using external monitoring, 
control and scrubbing device. Fault injection was used to estimate the dynamic cross 
section in a cost effective alternative for estimating it through radiation test.  
The research proved through detailed testing that the 65 nm commercial grade 
SRAM-based FPGA can be used in future space missions. The MPSoC OBC design 
achieved an adequate balance between the performance, power, mass, and reliability 
requirements. Extensive testing and applying carefully crafted mitigation techniques 
  
VI 
 
were the key points to verify and validate the MPSoC OBC design. In-orbit 
validation through a scientific demonstration mission would be the next step for the 
future research. 
 
  
Tab
Ackn
Abstr
List o
List o
Acron
Chap
1.1
1.2
1.3
1.4
1.5
1.6
Chap
2.1
2.2
2.3
2.4
2.5
Chap
3.1
3.2
3
3
3
3.3
le of C
owledgmen
act .............
f Figures ...
f Tables ....
yms .........
 : Introter 1
 Research O
 Research M
 Research O
 Research M
 Research O
 Thesis Ove
 : Backter 2
 Space Env
 Satellite O
 Radiation-
 SRAM-ba
 Previous W
 Multi-ter 3
 OBC Arch
 FPGA Rec
.2.1 Sched
.2.2 Emerg
.2.3 Runni
 MPSoC O
ontents
t ................
..................
..................
..................
..................
duction ......
utline .......
otivation .
bjectives ..
ethodolog
riginality a
rview .......
ground and
ironment ...
n Board Co
Induced Fa
sed FPGAs
ork ..........
Processor S
itecture Alt
onfiguratio
uled Mainte
ency Main
ng Repairs
BC System
V
 
..................
..................
..................
..................
..................
..................
..................
..................
..................
y ...............
nd Contrib
..................
 Literature 
..................
mputers ....
ult Mitigati
 for Space A
..................
ystem-on-C
ernatives ...
n Strategie
nance .......
tenance ......
..................
 Design .....
 
II 
..................
..................
..................
..................
..................
..................
..................
..................
..................
..................
utions .......
..................
Review .....
..................
..................
on Approac
pplication
..................
hip  On-B
..................
s ................
..................
..................
..................
..................
..................
..................
..................
..................
..................
..................
..................
..................
..................
..................
..................
..................
..................
..................
..................
hes ...........
s ................
..................
oard Comp
..................
..................
..................
..................
..................
..................
.................
.................
.................
.................
.................
.................
.................
.................
.................
.................
.................
.................
.................
.................
.................
.................
.................
.................
uter Design
.................
.................
.................
.................
.................
.................
............ I 
......... IV 
......... XI 
........XV 
...... XVI 
........... 1 
........... 3 
........... 8 
........... 8 
......... 10 
......... 11 
......... 12 
......... 14 
......... 14 
......... 19 
......... 21 
......... 23 
......... 26 
 ....... 31 
......... 31 
......... 44 
......... 46 
......... 48 
......... 49 
......... 50 
  
Chap
4.1
4.2
4.3
4
4
4
4
4
Chap
5.1
5.2
5.3
5
5
5
5
5.4
5.5
5.6
Chap
6.1
6
6
6
6
6
 : Spacter 4
 Single Pro
 Time and S
 Estimation
.3.1 Target
.3.2 SEU R
.3.3 Design
.3.4 System
.3.5 Reliab
 : Raditer 5
 Radiation 
 Accelerate
 Radiation 
.3.1 Safety
.3.2 Accele
.3.3 Embed
.3.4 Upset 
 Fault Injec
 Fault Injec
 SEU Fault
 : Therter 6
 Space The
.1.1 Condu
.1.2 Conve
.1.3 Radiat
.1.4 Radiat
.1.5 Equili
e and Time
cessor Desi
pace Redu
 of Soft-Er
 Orbit Defi
ate Estima
 Specific S
 Failure R
ility Estima
ation and F
Test Object
d Proton B
Test proced
 Precaution
rator Param
ded Softw
Rate Estim
tion Test O
tion Conce
 Injector ....
mal Vacuum
rmal Enviro
ction Heat 
ction Heat 
ion Heat T
ion View F
brium Tem
V
 Redundanc
gn .............
ndancy Ch
ror Reliabil
nition ........
tion ...........
EU Rate ...
ate .............
tion ..........
ault Injectio
ives ...........
eam Radiat
ure ............
s................
eters ........
are .............
ation .........
bjectives ...
pt ...............
..................
 Testing ..
nment ......
Transfer ....
Transfer ....
ransfer .......
actors .......
perature .....
 
III 
y Trade-of
..................
eck Bench .
ity .............
..................
..................
..................
..................
..................
n Testing ..
..................
ion Test Se
..................
..................
..................
..................
..................
..................
..................
..................
..................
..................
..................
..................
..................
..................
..................
fs ...............
..................
..................
..................
..................
..................
..................
..................
..................
..................
..................
tup .............
..................
..................
..................
..................
..................
..................
..................
..................
..................
..................
..................
..................
..................
..................
..................
.................
.................
.................
.................
.................
.................
.................
.................
.................
.................
.................
.................
.................
.................
.................
.................
.................
.................
.................
.................
.................
.................
.................
.................
.................
.................
.................
......... 65 
......... 66 
......... 68 
......... 72 
......... 73 
......... 76 
......... 78 
......... 78 
......... 79 
......... 86 
......... 86 
......... 87 
......... 93 
......... 96 
......... 97 
....... 101 
....... 101 
....... 103 
....... 104 
....... 106 
....... 111 
....... 111 
....... 113 
....... 115 
....... 115 
....... 116 
....... 117 
  
6.2
6.3
6.4
6.5
6.6
6.7
Chap
7.1
7.2
7.3
7.4
7.5
Chap
8.1
8.2
Refer
Appe
Appe
Appe
C.1
C.2
C.3
C.4
Appe
Vacu
D.1
D.2
 Thermal E
 Thermal C
 Geometric
 Thermal V
 Thermal V
 Thermal V
 : Resuter 7
 Time and S
 Static Cros
 Fault Injec
 Dynamic C
 Thermal V
 : Concter 8
 Conclusion
 Future Per
ences ........
ndix A M
ndix B S
ndix C S
 Introd
 System
 Result
 Conclu
ndix D S
um and Rad
 Micro
 MIcro
ffects on 65
ontrol Tech
 Thermal M
acuum Tes
acuum Tes
acuum Tes
lts and Disc
pace Redu
s Section f
tion versus
ross Sectio
acuum Tes
lusion and 
 ................
spectives ...
..................
icroproces
EU Control
EU Screeni
uction ........
 Design an
s and Discu
sion .........
ample Co
iation Test
Blaze – 0 C
Blaze 1 – C
 nm Virtex
niques ......
odel..........
t Objective
t Setup ......
t Procedure
ussion ......
ndancy Per
rom Accele
 Radiation T
n through F
t Results ....
Future Pers
..................
..................
..................
sor Hardwa
ler Connec
ng of Spart
..................
d Test Setu
ssion ........
..................
de – MP
s ................
ode ...........
ode ...........
 
IX 
-5 FPGA ...
..................
..................
s ................
..................
 .................
..................
formance ..
rated Proto
ests ..........
ault Inject
..................
pectives ....
..................
..................
..................
re Specific
tion Schem
an-6 FPGA
..................
p ...............
..................
..................
SoC Embe
..................
..................
..................
..................
..................
..................
..................
..................
..................
..................
..................
n Beam Te
..................
ion Test .....
..................
..................
..................
..................
..................
ation of the
atic ............
 Using Cf-2
..................
..................
..................
..................
dded Sof
..................
..................
..................
.................
.................
.................
.................
.................
.................
.................
.................
st ..............
.................
.................
.................
.................
.................
.................
.................
 MPSoC ...
.................
52 ...........
.................
.................
.................
.................
tware in 
.................
.................
.................
....... 119 
....... 122 
....... 123 
....... 125 
....... 126 
....... 132 
....... 137 
....... 137 
....... 141 
....... 149 
....... 150 
....... 154 
....... 160 
....... 160 
....... 164 
....... 169 
....... 176 
....... 192 
....... 193 
....... 193 
....... 194 
....... 195 
....... 198 
Thermal 
....... 200 
....... 200 
....... 203 
  
X 
 
D.3 MicroBlaze 2 – Code ....................................................................... 206 
D.4 MicroBlaze 3 (MSCP) – Code ......................................................... 209 
D.5 Temperature IC Addresses ............................................................... 219 
Appendix E Frame Addresses Generation in the Virtex-5 FPGA ................ 221 
Appendix F Sample Code - Fault Injection of Single Bit Upsets ................. 224 
Appendix G Iterations effect on the Cost, Schedule and Reliability of Nano-
Satellites Projects .............................................................................................. 232 
G.1 Introduction ...................................................................................... 232 
G.2 Dependency Structure Matrix Background...................................... 234 
G.3 Nano-satellite Project Case Study .................................................... 236 
G.4 The Simulation Algorithm ............................................................... 242 
G.5 Results and Discussion .................................................................... 248 
G.6 Conclusion ....................................................................................... 256 
 
  
XI 
 
List of Figures 
FIGURE 2-1. HAZARDS IN THE SPACE ENVIRONMENT. SOURCE: EUROPEAN SPACE AGENCY, SPACE ENVIRONMENT AND 
EFFECTS ANALYSIS SECTION © ESA. .................................................................................................. 15 
FIGURE 2-2. CORRESPONDENCE BETWEEN RADIATION SOURCES AND EFFECTS ON COMPONENTS [18] [23] © 2013 
IEEE. ......................................................................................................................... ................ 16 
FIGURE 2-3. PERCENTAGES OF SPACECRAFT ANOMALIES DUE TO THE SPACE ENVIRONMENT [18] [23] © 2013 IEEE. 17 
FIGURE 2-4. EARTH MAGENTOSPHERE AND RADIATION BELTS.SOURCE: SPACE ENVIRONMENTS & EFFECTS PROGRAM, 
NASA MARSHALL SPACE FLIGHT CENTER © NASA. ............................................................................. 18 
FIGURE 2-5. MOTION OF THE TRAPPED CHARGED PARTICLES. SOURCE: EUROPEAN SPACE AGENCY, SPACE 
ENVIRONMENTS AND EFFECTS ANALYSIS SECTION © ESA. ...................................................................... 18 
FIGURE 2-6. TYPICAL OBC BLOCK DIAGRAM. ............................................................................................... 21 
FIGURE 2-7. XILINX FPGA TYPICAL CONTENTS BREAKDOWN © XILINX. ............................................................. 24 
FIGURE 2-8. VIRTEX-5 CONFIGURATION FRAME FORMAT © XILINX. .................................................................. 25 
FIGURE 2-9. SATELLITE PROCESSING TRENDS [14] © NASA – AMES RESEARCH CENTER. ...................................... 28 
FIGURE 2-10. SPACECRAFT COMPUTERS COMPARISON BASED ON PERFORMANCE/POWER CONSUMPTION. ............... 28 
FIGURE 3-1. GENERAL EMBEDDED OBC SYSTEM ARCHITECTURE. ..................................................................... 32 
FIGURE 3-2. ABU ZAID TAXONOMY OF FAULT TOLERANT PROCESSOR-MEMORY INTERFACES [56]. .......................... 35 
FIGURE 3-3. MPSM WITH NON-SHARED MEMORY. ....................................................................................... 38 
FIGURE 3-4. MPMM NON-SHARED MEMORY. ............................................................................................. 38 
FIGURE 3-5. MPSM NON-SHARED MEMORY RELIABILITY CURVE. ..................................................................... 39 
FIGURE 3-6. MPMM NON-SHARED MEMORY RELIABILITY CURVE. ................................................................... 41 
FIGURE 3-7. MPMM SHARED MEMORY RELIABILITY CURVE. .......................................................................... 42 
FIGURE 3-8. MPSM SHARED MEMORY RELIABILITY CURVE. ............................................................................ 43 
FIGURE 3-9. SOFT ERROR RELIABILITY FOR MP ARCHITECTURES. ...................................................................... 44 
FIGURE 3-10. SCHEDULED MAINTENANCE – RELOAD FROM EXTERNAL FLASH. ..................................................... 47 
FIGURE 3-11. SCHEDULED MAINTENANCE – RELOAD FROM PLATFORM CONFIGURATION FLASH. ............................. 47 
FIGURE 3-12. EMERGENCY MAINTENANCE. ................................................................................................. 48 
FIGURE 3-13. RUNNING REPAIRS – SELECTION OF AN ACTIVE MODULE. ............................................................. 49 
FIGURE 3-14. RUNNING REPAIRS – ACTIVE MODULES WITH VOTING. ................................................................ 50 
FIGURE 3-15. MPSOC OBC BLOCK DIAGRAM. ............................................................................................ 52 
FIGURE 3-16. PING/ACK WATCHDOG TIMER [59]. ....................................................................................... 54 
FIGURE 3-17. INTERFACE WITH TEMPERATURE MEASUREMENT IC. ................................................................... 54 
FIGURE 3-18. TMR WITH FEEDBACK AMONG THE DATA OF THE MPSOC NODES. ................................................. 56 
FIGURE 3-19. DISTRIBUTED VOTING FUNCTION AT REDUNDANT PROCESSORS [59]. .............................................. 57 
FIGURE 3-20. FUNCTIONAL BLOCKS FOR CORRECTING SBUS. ........................................................................... 58 
FIGURE 3-21. SCRUBBING ALGORITHM. ...................................................................................................... 59 
  
XII 
 
FIGURE 3-22. SCHEMATIC OF THE MPSOC DESIGN IN THE XC5VLX50-1FFG676C FPGA. ................................... 61 
FIGURE 3-23. FPGA FLOOR PLAN OF THE MPSOC OBC. ............................................................................... 63 
FIGURE 4-1. SINGLE PROCESSOR SYSTEM. .................................................................................................... 67 
FIGURE 4-2. TIME AND SPACE REDUDNACY TEST CONFIGURATION. .................................................................... 69 
FIGURE 4-3. TIME REDUNDANCY - SINGLE PROCESSOR. .................................................................................. 70 
FIGURE 4-4. SPACE REDUNDANCY - TRIPLE PROCESSORS. ................................................................................ 71 
FIGURE 4-5. AE-8 MODEL TRAPPED ELECTRONS FLUX IN THE ORBIT. ................................................................ 74 
FIGURE 4-6. AP-8 MODEL TRAPPED PROTONS FLUX IN THE ORBIT. ................................................................... 74 
FIGURE 4-7. TRAPPED ELECTRONS ENERGY SPECTRA VERSUS THE FLUX. ............................................................. 75 
FIGURE 4-8. TRAPPED PROTONS ENERGY SPECTRA VERSUS THE FLUX. ................................................................ 76 
FIGURE 4-9. IN-ORBIT SEU RATES ESTIMATED BY CREME-96 AND BASED ON REAL RADIATION TEST RESULTS OF THE 
XILINX XC5VLX50 FPGA. .............................................................................................................. 77 
FIGURE 4-10. SINGLE VERSUS TRIPLE PROCESSOR SYSTEMS SOFT-ERROR RELIABILITY BASED ON IN-ORBIT ESTIMATION. 82 
FIGURE 4-11. SINGLE VERSUS TRIPLE PROCESSOR SYSTEMS SOFT-ERROR RELIABILITY BASED ON IN-ORBIT ESTIMATION, 
TIME BETWEEN SCANS ≤ 100 SEC. .................................................................................................... 83 
FIGURE 4-12. SINGLE VERSUS TRIPLE PROCESSOR SYSTEMS SOFT-ERROR RELIABILITY BASED ON IN-ORBIT ESTIMATION, 
TIME BETWEEN SCANS ≤ 10 MSEC. .................................................................................................... 84 
FIGURE 4-13. TOTAL SYSTEM RELIABILITY OF TMR SYSTEM BASED ON IN-ORBIT SEU RATES AND ASSUMING AN MRAM 
SOFT FAILURE RATE OF 0.05 FAILURES/YEAR. ...................................................................................... 85 
FIGURE 5-1. BAKING OF BOARDS BEFORE RADIATION TESTING. ........................................................................ 87 
FIGURE 5-2. RADIATION TEST CHAMBER USED IN ACCELERATED PROTON BEAM TEST. ........................................... 88 
FIGURE 5-3. PROTON BEAM LINE CONNECTED TO TEST CHAMBER. ................................................................... 89 
FIGURE 5-4. BEAM CENTERING GRID. ......................................................................................................... 89 
FIGURE 5-5. INTERFACE FLANGE WITH RADIATION TEST CHAMBER. ................................................................... 90 
FIGURE 5-6. COPPER SHIELD IN RADIATION TEST. .......................................................................................... 91 
FIGURE 5-7. TOP-VIEW DUT IN RADIATION TEST CHAMBER WITH RS-232 INTERFACE. ......................................... 91 
FIGURE 5-8. OUTGASSING DUE TO EXTERNAL RS-422 CONVERTERS. ................................................................. 92 
FIGURE 5-9. ACCELERATED PROTON BEAM TEST SETUP. ................................................................................. 93 
FIGURE 5-10. VIRTEX 5 LAYERS © XILINX. ................................................................................................... 94 
FIGURE 5-11. X-RAY OF THE XC5VLX50-1FFG676C FPGA. ......................................................................... 95 
FIGURE 5-12. FPGA BEFORE AND AFTER DE-CAPSULATION. ............................................................................ 95 
FIGURE 5-13. PROTON ENERGY SPECTRA VERSUS FLUX LEVEL AT HORYU-2 ORBIT. .............................................. 99 
FIGURE 5-14. SAMPLE OUTPUT FROM THE SEU CONTROLLER DURING RADIATION TEST. ..................................... 102 
FIGURE 5-15. FAULT INJECTION CYCLE. ...................................................................................................... 104 
FIGURE 5-16. FAULT INJECTOR SYSTEM ARCHITECTURE. ............................................................................... 106 
FIGURE 5-17. FAULT INJECTION TEST FLOW CHART. .................................................................................... 108 
FIGURE 5-18. TMR AMONG N BITS VECTORS. ............................................................................................ 109 
FIGURE 5-19. PART OF THE MATLAB SCRIPT TO SIMULATE TMR ERRORS. ...................................................... 110 
  
XIII 
 
FIGURE 6-1. SPACE THERMAL ENVIRONMENT. ............................................................................................ 111 
FIGURE 6-2. A) HEAT FLOW IN SERIES AND B) HEAT FLOW IN PARALLEL. .......................................................... 114 
FIGURE 6-3. VIEW FACTOR CALCULATION GEOMETRY [72]. .......................................................................... 117 
FIGURE 6-4. THERMAL EQUILIBRIUM ANALYSIS THROUGH MESH NODES [68]. .................................................. 118 
FIGURE 6-5. CHANGE OF STATIC POWER WITH PROCESS, VOLTAGE AND JUNCTION TEMPERATURE IN 65 NM SRAM-
BASED FPGA XILINX VIRTEX-5 [79]. ............................................................................................... 120 
FIGURE 6-6. MPSOC OBC GEOMETRIC THERMAL MODEL. ........................................................................... 124 
FIGURE 6-7. THERMAL MATHEMATICAL MODEL SIMULATION RESULTS. ........................................................... 125 
FIGURE 6-8. SMALL THERMAL VACUUM CHAMBER USED DURING THE TESTS. ...................................................... 126 
FIGURE 6-9. THERMAL VACUUM CHAMBER SHROUD DIMENSIONS. ................................................................. 127 
FIGURE 6-10. THERMOCOUPLES POSITIONS DURING THERMAL VACUUM TESTS - TOP POSITIONS. ......................... 128 
FIGURE 6-11. THERMOCOUPLES POSITIONS DURING THERMAL VACUUM TESTS - BOTTOM POSITIONS. ................... 129 
FIGURE 6-12. THERMOCOUPLES POSITIONS DURING THERMAL VACUUM TESTS - TOP POSITIONS - WITH HEAT SINK. . 129 
FIGURE 6-13. ASSEMBLED SANDWICH STRUCTURE BEFORE THERMAL VACUUM TESTING. .................................... 130 
FIGURE 6-14. THERMAL VACUUM TEST SETUP SCHEMATIC. .......................................................................... 130 
FIGURE 6-15. ALGORITHM EXECUTED AT THE SLAVE PROCESSOR AT THERMAL VACUUM TEST. .............................. 133 
FIGURE 6-16. ALGORITHM EXECUTED AT THE MASTER PROCESSOR AT THERMAL VACUUM TEST. ........................... 134 
FIGURE 6-17. INITIAL SCREENING THERMAL VACUUM TEST SENSORS ATTACHEMNT. .......................................... 135 
FIGURE 6-18. INITIAL SCREENING THERMAL VACUUM TEST RESULTS [67] ©IEEE 2013. .................................... 136 
FIGURE 7-1. NON-FAULT-TOLERANCE EXECUTION TIME HISTOGRAM. ............................................................. 137 
FIGURE 7-2. TIME REDUNDANCY FAULT-TOLERANCE EXECUTION TIME HISTOGRAM. .......................................... 138 
FIGURE 7-3. SPACE REDUNDANCY FAULT-TOLERANCE EXECUTION TIME HISTOGRAM. ......................................... 138 
FIGURE 7-4. ACCUMULATED NO. OF FRAME UPSETS VERSUS TIME AT FLUX LEVELS OF 10E06 AND 3E06 (CM-2.S-1) – 
ENERGY LEVEL OF 65 MEV. ........................................................................................................... 142 
FIGURE 7-5. ACCUMULATED NO. OF FRAME UPSETS VERSUS TIME AT FLUX LEVEL OF 10E06 (CM-2.S-1) – ENERGY LEVEL 
OF 65 MEV – 1ST TRIAL. .............................................................................................................. 143 
FIGURE 7-6. ACCUMULATED NO. OF FRAME UPSETS VERSUS TIME AT FLUX LEVEL OF 10E06 (CM-2.S-1) – ENERGY LEVEL 
OF 65 MEV – 2ND TRIAL. ............................................................................................................. 145 
FIGURE 7-7. ACCUMULATED NO. OF FRAME UPSETS VERSUS TIME AT FLUX LEVEL OF 10E06 (CM-2.S-1) – ENERGY LEVEL 
OF 65 MEV – 3RD TRIAL. ............................................................................................................. 146 
FIGURE 7-8. STATIC CROSS SECTION CALCULATIONS AT FLUX LEVEL OF 3E06 (CM-2.S-1) – ENERGY LEVEL OF 65 MEV.
 .............................................................................................................................. ................. 147 
FIGURE 7-9. RADIATION ANALYSIS CHART. ................................................................................................. 148 
FIGURE 7-10. DE-CAPSULATED SPARTAN-6 AND VIRTEX-5 FPGAS FOR CF-252 RADIATION TEST. .......................... 150 
FIGURE 7-11. FAULT INJECTION RESULTS. .................................................................................................. 151 
FIGURE 7-12. TIME DISTRIBUTION BETWEEN INJECTED FAULTS. ...................................................................... 152 
FIGURE 7-13. TMR FAILURE RATE AT DATA VECTOR OF SIZE 2048 BYTES. ....................................................... 153 
FIGURE 7-14. FPGA AUTOMATIC THERMAL PROTECTION. ............................................................................ 155 
  
XIV 
 
FIGURE 7-15. FPGA DURABILITY TEST. ..................................................................................................... 155 
FIGURE 7-16. THERMAL VACUUM TEST, NO-HEAT SINK, 100 MHZ, 4 PROCESSORS – FPGA INTERNAL DIE 
TEMPERATURE. ........................................................................................................................... 156 
FIGURE 7-17. THERMAL VACUUM TEST, NO-HEAT SINK, 100 MHZ, 4 PROCESSORS – THERMOCOUPLES. ............... 156 
FIGURE 7-18. THERMAL VACUUM TEST, HEAT SINK, 50 MHZ, 2 PROCESSORS – FPGA INTERNAL DIE TEMPERATURE.
 .............................................................................................................................. ................. 157 
FIGURE 7-19. THERMAL VACUUM TEST, HEAT SINK, 50 MHZ, 2 PROCESSORS – THERMOCOUPLES. ...................... 157 
FIGURE 7-20. THERMAL MITIGATION EFFECTIVENESS. .................................................................................. 159 
 
  
XV 
 
List of Tables 
TABLE 2 -1 MAIN SOURCES OF THE NATURAL SPACE RADIATION ENVIRONMENT [18] ........................................... 14 
TABLE 3-1. SCOPE PROTECTED AND THE FAULT TOLERANCE PROTECTION TECHNIQUES. ......................................... 33 
TABLE 3-2. ARCHITECTURE ALTERNATIVES FOR PROCESSOR-MEMORY INTERFACES. .............................................. 37 
TABLE 3-3. FPGA USED RESOURCES IN THE MPSOC DESIGN. ........................................................................... 59 
TABLE 3-4. POWER ESTIMATION OF THE MPSOC DESIGN. ............................................................................... 60 
TABLE 3-5. FPGA BASED MPSOC OBC SPECIFICATIONS. ............................................................................... 62 
TABLE 4-1. SINGLE CORE AND MULTI-CORE POWER CONSUMPTIONS. ............................................................... 67 
TABLE 4-2. SINGLE CORE RESOURCES UTILIZATION. ....................................................................................... 68 
TABLE 4-3. MULTI-CORE RESOURCES UTILIZATION. ....................................................................................... 68 
TABLE 4-4. HORYU-2 SATELLITE ORBITAL PARAMETERS. ................................................................................. 73 
TABLE 4-5. DESIGN SIZE IN BITS AND THE UPSET RATES. .................................................................................. 78 
TABLE 4-6. SYSTEM FAILURE RATES. ........................................................................................................... 79 
TABLE 5-1. HORYU-2 SATELLITE ORBIT PARAMETERS. .................................................................................... 98 
TABLE 5-2. ACCELERATOR PARAMETERS SETTINGS. ...................................................................................... 101 
TABLE 6-1. LEO THERMAL ENVIRONMENT [68]. ......................................................................................... 112 
TABLE 7-1. STATISTICAL ANALYSIS OF EXECUTION TIMES HISTOGRAMS IN SECONDS. ........................................... 139 
TABLE 7-2. SUMMARY OF TIME AND SPACE REDUNDANCY ARCHITECTURES COMPARISON. ................................... 141 
TABLE 7-3. STATIC CROSS SECTION CALCULATIONS AT FLUX LEVEL OF 10E06 (CM-2.S-1) – ENERGY LEVEL OF 65 MEV – 
1ST TRIAL. ............................................................................................................................. .... 144 
TABLE 7-4. STATIC CROSS SECTION CALCULATIONS AT FLUX LEVEL OF 10E06 (CM-2.S-1) – ENERGY LEVEL OF 65 MEV – 
2ND TRIAL. ............................................................................................................................. ... 145 
TABLE 7-5. STATIC CROSS SECTION CALCULATIONS AT FLUX LEVEL OF 10E06 (CM-2.S-1) – ENERGY LEVEL OF 65 MEV – 
3RD TRIAL. ............................................................................................................................. ... 146 
TABLE 7-6. STATIC CROSS SECTION CALCULATIONS AT FLUX LEVEL OF 3E06 (CM-2.S-1) – ENERGY LEVEL OF 65 MEV. 147 
TABLE 7-7. THERMAL VACUUM TEST RESULTS WITH AND WITHOUT MITIGATION TECHNIQUES .............................. 158 
  
XVI 
 
Acronyms 
 
ABFT Architecture Based Fault Tolerance 
AES Advanced Encryption Standard 
AMFT Adaptive Middleware for Fault Tolerance 
ASIC Application Specific Integrated Circuits 
AU Astronomical Unit 
BB Black Box 
BGA Ball Grid Array 
BPI Byte Peripheral Interface 
BRAM Block Random Access Memory 
BSP Board Support Package 
BSTI Basic Space Technology Iniative 
CAN Controller Area Network  
CAD Computer Aided Design 
CeNT Center for Nano satellite Testing 
CMOS Complementary Metal- Oxide- Semiconductor 
CFC Clock Frequency Control  
Cf-252 Californium-252  
CLB Configuration Logic Block 
COTS Commercial Off The Shelf 
CRC Cyclic Redundancy Check 
DAQ Data Acquisition  
DCM Digital Clock Manager 
DDD Displacement Damage Dose 
DIO Digital Input Output 
DNST Doctorate in Nano-Satellite Technologies 
DPR Dynamic Partial Reconfiguration 
DTM Dynamic Temperature Management  
  
XVII 
 
DUT Device or Design Under Test 
ECC Error Correcting Code 
EDAC Error Detection And Correction 
EDC Error Detection Coding 
EDK Embedded Development Kit 
ESA European Space Agency 
FAR Frame Address Register 
FDR Frame Data Register 
FIT Failure In Time 
FPGA Field Programmable Gate Array 
GPIO General Purpose Input Output 
HPEC High Performance Embedded Computing 
IC Integrated Circuit 
ICAP Internal Configuration Access Port 
IDE Integrated Development Environment 
IIC Inter-Integrated Circuit 
IO Input/output 
IOB Input Output Block 
IP Intellectual Property soft cores 
IR Infrared Radiation 
ISR Interrupt Service Routine 
JAEA Japan Atomic Energy Agency 
JTAG Joint Test Action Group 
KURRI Kyoto University Research Reactor Institute  
KIT Kyushu Institute of Technology 
LMB Local Memory Bus 
LEO Low Earth Orbit 
LET Linear Energy Transfer 
LUT Look Up Table 
MBU Multiple Bit Upset 
  
XVIII 
 
MDM MicroBlaze Debug Module 
MOSFET Metal-Oxide Semiconductor Field-Effect Transistor 
MPI Message Passing Interface 
MPSoC Multi-Processor System-on-Chip 
MRAM Magneto-resistive Random Access Memory 
MSCP Master System Control Processor 
MTBF Mean Time Between Failures 
NASA National Aeronautics and Space Administration 
NETS Nano-satellite Environment Test Standardization 
NVP N Version Programming 
OBC On-Board Computer 
PCB Printed Circuit Board 
PLB Processor Local Bus 
PNST Post-graduate study on Nano-Satellite Technologies 
RAM Random Access Memory 
RAAN Right Ascension of Ascending Node  
RB Recovery Block 
RISC Reduced Instruction Set Architecture   
RM Reconfigurable Module 
RP Reconfigurable Partition 
RS232 Recommended Standard 232 
RS422 Recommended Standard 422 
SAA South Atlantic Anomaly 
SBU Single Bit Upset 
SDK Software Development Kit 
SECDED Single Error Correction Double Error Detection 
SEB Single Event Burnout 
SEE Single Event Effect 
SEFI Single Event Function Interrupt 
SEGR Single Event Gate Rupture 
SEL Single Event Latch-up 
  
XIX 
 
SET Single Event Transient 
SEnT Space Environment Testing 
SEU Single Event Upset 
SFT Software Fault Tolerance 
SIHFT Software Implemented Hardware Fault Tolerance 
SMC Space and Missile systems Center standard 
SoC System on Chip 
SOI Silicon On Insulator 
SPI Serial Peripheral Interface 
SPSoC Single Processor System-on-Chip 
SRAM Static Read Access Memory 
TARRI Takasaki Advanced Radiation Research Institute - (JAEA) 
TID Total Ionizing Dose 
TMM Thermal Mathematical Model 
TO Triple Oxide 
TMR Triple Modular Redundancy 
TRL Technology Readiness Level 
TT&C Telemetry, Tracking and Command 
UART Universal Asynchronous Receiver and Transmitter 
UNOOSA United Nations Office for Outer Space Affairs 
USB Universal Serial Bus 
WDT Watch Dog Timer 
XPS Xilinx Platform Studio 
 Chapt
artificial s
the open s
the space
progressed
At the ea
number of
of a natio
and opera
expensive
age, we s
knowledg
The space
provided 
that can b
even mann
offs that e
well soun
across the
resources 
high spee
application
However, 
enabled th
with even
developed
investigate
M
 : Iner 1
ore tha
space [
was use
atellite mar
kies and fo
 age and 
 and advan
rly beginn
 countries.
nal security
te their ow
 and compl
till cannot
e for startin
 race led 
from the sp
e launched
ed space s
merged fro
ded [5] [6
 globe, e
exploration
d satellite-
s were use
advances 
e developin
 better ef
 for other
d to develo
troduc
n 50 years 
1]. Sputnik
d to send a
ked the sta
rward to the
the embar
ced concep
ings, space
 The techno
 issue. On
n satellites
icated field
 find so m
g a real full
to unleashi
ace-borne 
 to space. 
tations [4].
m using sp
]. Nowada
arth obser
, homeland
based inter
d to be pro
in microe
g of small
ficiencies 
 space act
p cost effe
Chapter 1
tion 
passed sin
-1 carried
 beep to its
rt of a new
 unlimited 
kation of 
ts as well a
 was most
logy of bu
ly several c
. The spac
. Even afte
any countr
-cycle, from
ng differen
devices. Sa
Other devic
 The applic
ace-borne d
ys, the wo
vation con
 security s
net links a
vided by lar
lectronics 
er and ligh
[7] [8] [9]
ivities, suc
ctive small 
 Introductio
ce the laun
 a simple 
 ground sta
 age. Hum
opportuniti
the space
s applicatio
ly a scope
ilding sate
ountries w
e technolo
r more than
ies that ca
 launch to
t applicati
tellites are
es include 
ations, scie
evices and
rld enjoys 
stellations 
ervices, ter
nd media 
ge scale, h
technologi
ter satellite
 [10]. Eve
h as hum
satellites [1
n 
ching of th
communica
tion. The la
ans started
es form the
 technolog
ns were int
 for contri
llites and la
ere able to 
gy has alw
 50 years o
n afford t
 operation, 
ons and se
 just one f
space prob
ntific achie
 stations w
real time 
for disaste
restrial and
broadcasti
eavy and hi
es and m
s to provid
n the reu
an spacefl
1]. 
e first satel
tion system
unching of
 looking up
m. Since th
y had pro
roduced [2
bution by 
unchers w
fully build
ays used t
f starting th
he cost an
space indus
rvices that
orm of the
es, telesco
vements, a
ere tremend
telecommu
r monitor
 flight nav
ng services
gh power s
aterial eng
e the same 
se of tech
ight system
1 
lite into 
 which 
 the first 
ward to 
at event 
foundly 
] [3].  
a small 
as a sort 
, launch 
o be an 
e space 
d/or the 
try [3]. 
 can be 
 devices 
pes, and 
nd spin-
ous and 
nication 
ing and 
igation, 
. These 
atellites. 
ineering 
services 
nologies 
s, was 
Chapter 1 Introduction 
2 
 
The advancement in building satellites was associated with a paradigm shift. It was a 
fact that till the end of the nineties of the previous millennium, satellites and space-
borne devices were used to be developed by space agencies and organizations. The 
facilities needed in developing, assembling, integrating and testing the space-borne 
devices had to meet sophisticated requirements. The technical knowledge, as well, 
was not available outside of these organizations. However, these conditions prevailed 
until the launching of university-made satellites in the eighties, such as the UoSAT-1 
in 1981 [12]. Since then, a new start was remarked through a paradigm shift in the 
traditional concepts used in building satellites by universities. 
The key point that enabled the universities to indulge in the space activities was the 
use of Commercial-Of-The-Shelf (COTS) components. COTS enabled building low 
cost satellites while maintaining reliability through design techniques. Nowadays, 
there are multiples of satellites that were successfully developed by universities and 
are already rotating in space. Universities helped in developing new techniques to 
shrink the satellites sizes and make them much more efficient.  
International organizations and space agencies became interested in what universities 
were doing and offered to sponsor their programs. A remarkable example is the 
Basic Space Technology Initiative (BSTI) that was initiated by the United Nations 
Office for Outer Space Affairs (UNOOSA). The BSTI is focusing on providing space 
engineering education especially for developing countries which cannot afford the 
costs of such an expensive industry. It has a goal of defining a space engineering 
curriculum that can be used as a guide by academic institutes all over the world to 
provide standardized space engineering degrees. The BSTI also inaugurated, in 
collaboration with Kyushu Institute of Technology at Japan, a space education 
program named the Doctorate in Nano-Satellite Technologies (DNST). The program 
aims at hosting students from the developing countries and granting them the 
doctorate degree upon pursuing research in the space engineering field. The program 
was later promoted to handle master students as well and its name was changed to 
the Post-graduate study on Nano-Satellite Technologies (PNST). By the intake of 
2013, 9 participants from 8 countries are already attending the program and more 
students would participate in the coming years as it gains more international 
recognition.  
Chapter 1 Introduction 
3 
 
My research is part of the DNST program. I belong to the first batch of the students 
who were accepted in 2010. In the next few sections, I would discuss the outline of 
the research and its expected impacts. It might be necessary to point out that the 
major purpose of this research is to develop and qualify a low cost, high performance, 
high reliability, and low power on-board computer for in-orbit processing of 
scientific data and commands. This on-board computer would be used in developing 
state-of-art small spacecraft technology for future space missions by using Field 
Programmable Gate Arrays (FPGAs). It would start a new era in developing 
dynamically reconfigurable space systems to maintain the power requirements and 
adapt to the space environment conditions. 
The research was conducted at the premises of the Kyushu Institute of Technology. 
Testing was conducted at CeNT, Takasaki Advanced Radiation Research Institute 
(TARRI) – Japan Atomic Energy Agency (JAEA), and Kyoto University Research 
Reactor Institute (KURRI). The results of the research recommend the developed on-
board computer for in-orbit validation in a technology demonstration mission.  
1.1 Research Outline 
The challenge in developing small satellites is mainly concerned with keeping the 
satellite size and mass as minimum as possible. We should take into consideration 
developing satellites with an area-to-mass (A/M) ratio that would enable re-entrance 
to the earth atmosphere in less than 25 years. The area-to-mass ratio increment 
enables more drag force and hence better satellite deorbiting to reduce the problem of 
space debris. However, minimizing the subsystems mass, area, size and power 
consumption needs innovative technologies to be applied when developing the 
satellite subsystems. The density of the material used in developing the subsystem 
structure as well as the subsystem components have to be carefully selected. The 
reduction in the subsystem mass would affect the whole satellite. Another aspect is to 
reduce the power consumed by each of the satellite subsystems to reduce the surface 
area, and hence the mass, of the solar panels and the sizing of the batteries. This 
requires using innovative semiconductor manufacturing technologies and lower 
transistor feature sizes for the electronic components to reduce their static and 
dynamic power consumptions.  
Chapter 1 Introduction 
4 
 
On the other hand reliability is a big concern for all satellite developers. Design 
techniques as well as subsystems’ components should be carefully selected in order 
to develop high reliability satellites that can serve for the required mission lifetime. 
The design techniques usually depend on redundancies whether cold or hot backups. 
The components are better to be selected from the space approved category that has 
flown on-board previous space missions. The space approved category might include 
space grade and non-space grade components. Space grade components are usually 
of low performance compared to COTS components. Depending on the mission 
requirements, the reliability threshold through the mission lifetime is set. Using 
redundancies in hot backup regime requires higher power consumption. Therefore, it 
is always important to design the reliability carefully to match with the power budget 
as well as the mass budget and size of the satellite.  
We need to strike a balance between the reliability, power consumption, mass, size 
and performance of the space mission. The required balance needs innovative design 
solutions in order not to sacrifice any of the aforementioned parameters. To achieve 
this balance, design of digital electronic circuits should make use of three aspects: 1) 
Advancement in the semiconductor industry through the reduced transistor feature 
size, 2) Re-programmability of digital designs, 3) On-the-Fly DPR. 
Moore’s law suggested that the number of transistors in a chip would double every 
two years. The duplication of the transistors count means that their feature size 
would reduce in almost the same proportion. Nowadays we have FPGAs and 
Application Specific Integrated Circuits (ASIC) at the 28 nm scale. Chip 
manufacturers are working on 7 nm designs. This nano-meter scale enables high 
density designs to be developed. However, it sets a challenge to reduce the power 
consumption and the radiation immunity of the overall design. Miniaturization of 
satellite subsystems should be achieved through using the available nano-meter scale 
feature size technology. Power consumption, both dynamic and static, are reduced 
through applying manufacturing and design techniques. Immunity to space radiation 
is enhanced through fault tolerance and shielding techniques. Reducing the power 
consumption would reduce the operating temperature of the electronic chips which 
would enable them to withstand higher thermal conditions while operating in space.  
Chapter 1 Introduction 
5 
 
Digital designs of satellite subsystems are usually implemented using components 
that cannot be changed after launching the satellites. The designs are fixed and 
whether they contain a fault or not should be fully investigated before launching. 
Some designs might contain a mistake that is not apparent during ground tests and 
might cause mission failures. Correcting the design mistakes of digital designs in-
orbit used to be an impossible task. The most effort that can be done is to reprogram 
the code in case an unwanted software performance was noticed. This can be done 
only if the satellite has the feature to reprogram its code. A major advancement in 
digital designs that appeared in the last decade is to reprogram the hardware. 
Reprogramming the hardware can be either by loading alternative design modules 
from an on-board local memory or uploading new designs from the ground stations. 
Design diversity concept can thus be achieved in a much smaller area, size and mass. 
However, the re-programmability feature is applicable only through using Static 
Random Access Memory (SRAM)-based FPGAs. This type of the FPGAs has its 
own merits and demerits. The major merit is providing re-programmability. While 
the major demerit is susceptibility to design alteration while operating in space due to 
Single Event Upset (SEU) induced errors from the space environment radiation.  
DPR is a new technique that is introduced by SRAM-based FPGA manufacturers to 
overcome the problem of switching off the logic whenever a reconfiguration of the 
FPGA is needed. The technique depends on making part of the logic as static while 
defining Reconfigurable Partitions (RP) that can hold different Reconfiguration 
Modules (RM). Whenever a reconfiguration is needed for one of the RPs with any of 
its associated RMs, the static logic would reload the RP from the external flash 
memory that holds the RMs hardware designs in the form of partial bitstreams. This 
technique enables loading and unloading of digital designs when needed thus it can 
be used to save the power consumption of the satellite. As a matter of fact, not all of 
the satellite subsystems need all of their components for 100% of the operation time. 
Some components might be switched off until they are needed to perform a specific 
function. Switching off using the DPR technique means simply to load a Black Box 
(BB) design in the RP.  
DPR can also be used in developing adaptive subsystems. Adaptation can take place 
in response to the changes in the space environment, power and thermal operating 
conditions. For example loading extra redundant modules could be done to provide 
Chapter 1 Introduction 
6 
 
higher reliability in case of increased induced soft errors due to radiation. Also, 
unloading some of the design modules when they are not needed in the current 
operations might be done to save the power consumption especially if the batteries 
are on the critical storage limits. The thermal conditions in the sunlight and eclipse 
regions of the orbit vary drastically. Some modules might be unloaded to reduce the 
heat emissions while in the sunlight region. Heavy processing designs might defer 
their processing to the eclipse region by loading the processing modules there. Thus 
heat emissions might not introduce a problem to the operating temperature of the 
FPGA and other components in the satellite. This technique provides an adaptive 
protection through partial reconfiguration. Another approach is to use the DPR to 
correct the errors in the bitstream frames of the hardware in the FPGA configuration 
memory. As the satellite operates in space its SRAM-based FPGAs are subjected to 
soft errors induced through charge accumulation at the memory cells (circuit) nodes. 
The charge accumulation, if it crosses a certain limit, will flip the logic values stored 
in the internal SRAM cells of the FPGA which is known as the local configuration 
memory. The internal SRAM holds the design in the form of binary bitstream and if 
any bit is corrupted then that might lead to serious design failure issues. Therefore 
DPR can be used to continuously correct the induced flips through scrubbing in a 
read-correct-write back process. 
If we consider the three illustrated aspects mentioned above, we would recommend 
SRAM-based FPGAs as a suitable candidate to achieve the required balance between 
the satellite reliability, power consumption, mass, size and performance. However 
there still another major issue which is the cost. If we consider space grade SRAM-
based FPGAs then it would be very expensive to develop satellites at moderate 
budgets up to 5 M USD. However, if we use commercial grade then the cost would 
be reduced ultimately. In fact the price of a space grade component is about 100 
times the price of the commercial grade one. Reducing the cost is a great benefit but 
there is not enough data about the suitability of these components when used in space. 
Especially if the components are of latest generations which did not fly yet on many 
space missions thus not enough information about their performance in space is 
available. Therefore, we need to provide a qualification method for such components 
that would reveal their suitability and provide sufficient data as well as establishes 
the mitigation techniques to make them fitting for space missions.  
Chapter 1 Introduction 
7 
 
The main theme of the research is to investigate the suitability of using COTS 
SRAM FPGAs of the 65 nm scale in developing miniaturized High Performance 
Embedded Computers (HPECs). The direct application of the research is to develop 
an OBC suitable for small and nano-satellites. Qualification is an essential part of the 
research as we need to highlight the mitigation techniques needed to support COTS 
SRAM-based FPGAs while operating in space. In the research we investigated the 
concept of fault injection to simulate space radiation. Radiation testing and 
mitigation against soft error through DPR as well as thermal vacuum testing and 
mitigation through passive and active techniques were thoroughly studied. The 
research focuses on presenting practical achievements besides theoretical bases. 
Studying mitigation techniques for space-based applications needs real developments 
and qualifications to assess their suitability. Therefore, Space Environment Testing 
(SEnT) was a fundamental constituent in the researched topics. We focused on the 
development followed by real qualification through SEnT. Qualification is the most 
crucial aspect in developing working space systems and proving the validity of new 
concepts and technologies. This research is trying to start a new epoch in using 
SRAM-based FPGAs to strike the needed balance between cost, reliability, power 
consumption and performance in addition to flexibility in reprogramming the whole 
system. Some designs might require ultimate reduction in its power consumption 
hence reducing its total thermal power while not sacrificing its reliability. That 
feature was thoroughly studied in the research by investigating the time and space 
redundancy trade-offs when developing MPSoC and Single Processor System-on-
Chip (SPSoC) using 65 nm FPGAs.  
Usually satellites cannot be restored after launching. However, in some rare cases, it 
might be necessary to restore a satellite from its orbit for repair purposes. 
Nevertheless, such missions are very specific, expensive and not regular and are only 
performed for very peculiar reasons such as repairing a military satellite or a space 
telescope. Therefore, as a general rule, satellites should be fully qualified in a 
thorough approach as much as possible. SEnT is the real manifestation of suitability 
of designs to operate in space. SEnT is the main feature and differentiator of this 
research from other designs which are solely validated and approved through 
simulations. 
Chapter 1 Introduction 
8 
 
1.2 Research Motivation  
Developing space qualified low cost satellites while maintaining high performance 
and reliability is the main motivation of this research. Both the academia and 
industry are welling to achieve good results on this way. Universities and research 
institutes are developing new missions to help better understanding the space 
environment and its possible effects on space missions whether manned or unmanned. 
Innovative profit generating missions are being offered by space companies around 
the world. Developers of small satellites are struggling to make the systems smaller 
and better. In addition to that, some countries, especially developing countries, are 
welling to start making use of space for the benefit of their welfare. We wish to 
provide accessibility to these countries through low cost and clean technology. As 
miniaturizing the satellite subsystems would reduce the required number of 
components to be manufactured and hence reduce the emissions that accompany any 
manufacturing process. This means we are trying to provide access to space while 
supporting the concept of green technology on Earth. 
Through this research we are providing a platform for In-orbit processing of the huge 
amounts of scientific data which is collected on-board. This would ultimately reduce 
the communication link bandwidths and required power on-board the satellites. This 
powerful processing platform is a true demonstration of the HPEC concept using 
COTS in space. It would enable future mission to develop further applications using 
the concept of evolutionary hardware. The hardware that can cure and repair itself 
from malfunctions while in operation in its orbit. That concept would have multitude 
of applications in space as well as in other fields such as robotics and medical 
equipment. Giving the satellites high processing capabilities while being protected 
against the space environment would certainly boost the research towards developing 
silicon brains that would enable building autonomous satellites that can take full 
control on the mission with minimum or no human interaction.  
1.3 Research Objectives 
In designing devices that would operate in space, we usually have to design the 
devices from one of two perspectives: 1) Either we design the device to avoid the 
space environment induced effects, or 2) we design the device to tolerate the space 
Chapter 1 Introduction 
9 
 
environment induced effects. The difference between avoidance and tolerance is the 
type of the used components and their manufacturing grade. Space grade components 
provide good protection against radiation induced Single Event Effects (SEE), Total 
Ionizing Doses (TID), and Displacement Damage Doses (DDD). Also their operating 
temperature range can withstand the thermal vacuum conditions in space. However, 
our research mainly aims at providing protection to COTS components through 
applying design mitigation techniques instead of using space grade ones.  
The following points would summarize the main objectives of the research: 
 Design a well-protected platform that would be used as a HPEC for OBC 
applications. 
 Fully develop the system and qualify its operation and mitigation techniques in 
SEnT conditions. 
 Use the state-of-art low cost reprogrammable COTS SRAM-based FPGAs to 
provide maximum performance and flexibility.  
 Thorough investigation of the best techniques to mitigate the system against 
radiation induced transient faults as well as provisioned permanent failures.  
 Thorough investigation of the best techniques to mitigate the system against the 
thermal vacuum conditions. 
 Start a new epoch in using 65 nm COTS SRAM-based FPGAs, instead of the 
expensive anti-fused and space grade variants, in developing reliable high 
performance space devices and subsystems through mitigating all possible space 
environment effects on their operation.  
 Make use of the compactness and re-programmability features of SRAM-based 
FPGAs in developing miniaturized and flexible systems as well as facilitate the 
way to develop future satellite-on-chips. 
 Investigate the best fault tolerant approach from time and space redundancies to 
be applied on 65 nm COTS SRAM-based FPGAs.   
 Fully characterize the suitability of using 65 nm COTS SRAM-based FPGAs in 
space applications by revealing the SEnT data. 
 Investigate the different methods in performing SEnT and the pros and cons of 
each method in order to provide testing guidelines and procedures to efficiently 
reduce the test costs and complexity while maintaining the test quality i.e. 
radiation test with Californium-252 or accelerated proton beam or fault injection. 
Chapter 1 Introduction 
10 
 
1.4 Research Methodology 
The applied methodology started by thoroughly investigating and studying the 
theoretical concepts of fault tolerance and modern digital design approaches. The 
theoretical study was followed by real development and qualification through 
performing SEnT on the system. Qualification through SEnT is considered as a 
mandatory step to prove that the design can really work in space. SEnT results were 
used to practically demonstrate the design operability and suitability for space 
missions as well as the applicability of the design concepts.  
FPGAs provide the capabilities to develop high speed digital designs. Nowadays, 
computer design packages can be used in designing complete multi-processor 
embedded systems inside FPGAs. Simulating functional tests as well as developing 
software is also available through hardware/software integrated simulators and 
software development environments. The complete MPSoC design and simulation 
was performed using Computer Aided Design (CAD) tools.  
The research methodology proceeded according to the following steps: 
1) Study of the different time and space redundancy approaches in fault tolerance 
design. 
2) Study of hardware fault tolerance mitigation techniques for SRAM-based FPGAs.  
3) Developing the mitigated hardware of a SPSoC to demonstrate time redundancy.  
4) Developing the mitigated hardware of an MPSoC to demonstrate space 
redundancy.  
5) Developing the software application that would be used in running the hardware 
during the tests as well as managing the fault tolerant features in the designs.  
6) Comparison between both the SPSoC and the MPSoC to select the best fault 
tolerance approach based on power consumption, computation throughput, area 
utilization, reliability and heat emission.  
7) Performing radiation tests under proton accelerated beam at the nuclear reactor to 
estimate the static cross section. 
8) Specifying a target operation orbit for the system and using that orbit to estimate 
the charged particles energy spectrum and flux of the protons, electrons and 
Chapter 1 Introduction 
11 
 
heavy ions. The SEU rate would be estimated using the static cross section and 
the charged particles spectrum through space radiation models.  
9) Developing a fault injection bench to flexibly estimate the dynamic cross section 
of the system at any time and under any changes of the design without having to 
repeat the expensive radiation shots. 
10) Performing fault injection tests to assess the mitigation against radiation induced 
soft errors. 
11) Analysis of the system reliability in accordance to radiation induced soft errors. 
12) Studying thermal vacuum mitigation techniques through active control via 
Dynamic Temperature Management (DTM) as well as passive control through 
heat sinks. 
13) Performing thermal vacuum analysis of the system and running thermal vacuum 
tests with and without the mitigation techniques to assess their suitability. 
14) Generalizing the research results to be used as concepts in different design 
approaches. 
15) Results dissemination. 
1.5 Research Originality and Contributions 
The following points summarize the major research contributions and originalities: 
1) Developing and qualifying a reprogrammable MPSoC using 4 embedded systems 
in an SRAM-based FPGA with up to 400 MIPS and power consumption of max 
5 Watt, soft error reliability 99.99%, mass about 200 gm and ultra-low cost 
(<2000 USD). The OBC cost can be considered as about 10 % of the total 
mission cost as estimated in [13]. Therefore our cost reduction would contribute 
to about 10% of the total budget. However, if our design is used in the payload 
computer system then we would contribute to more reduction in the total budget 
as the payload forms about 21% of the mission cost. The more use of our design 
in other system bus subsystems, the more cost reduction we would achieve 
compared to the use of the space approved subsystems. 
2) Using a novel technique for estimating the static cross section during radiation 
tests through internal scrubbing of the FPGA configuration memory instead of 
using the traditional external monitoring and configuration module. 
Chapter 1 Introduction 
12 
 
3) Developing a fault injection bench for flexibility of estimating dynamic cross 
section in the lab through fault injections to the configuration memory bitstream.  
4) Applying thermal mitigation through DTM of FPGA operating temperature 
through controlling the operating frequency and reconfiguring the operating 
hardware.  
5) Clarifying the procedural details of space environment testing, radiation and 
thermal-vacuum, and publishing the results for the class of 65 nm SRAM-based 
FPGAs. 
6) Clarifying the analytical details of reliability estimation procedure for induced 
soft errors based on the static cross section and target orbit. 
7) Comparative study between suitability of time and space redundancies in 
developing embedded processor systems on 65 nm FPGAs.  
8) One of the research publications was cited at NASA’s Small Spacecraft 
Technology. State of Art Report – 2014 [14]. 
1.6 Thesis Overview 
Chapter 1 is the introduction where the research objectives and detailed problem 
definition are investigated as well as presenting the research contributions and 
originalities. 
Chapter 2 presents the background related to the space environment and possible 
mitigation techniques of the radiation effects. It overviews the FPGA technology and 
refers to detailed textbooks and references for the interested reader to grasp the full 
detailed picture from the rich available literature. It covers the previous work as 
much as possible, although not exhaustive, to present the trends in developing fault 
tolerant and new concept OBCs. 
Chapter 3 presents the details of the MPSoC design and operation concept as well as 
the fault tolerance features applied for mitigation against the induced soft errors. It 
discusses the operation of the different blocks and the concept of network-on-chip 
communication through mailboxes to achieve cross-voting through the score-card 
concept.  
Chapter 4 presents the comparison between space and time redundancy in 65 nm 
SRAM-based FPGAs to select the best technique in terms of power consumption, 
Chapter 1 Introduction 
13 
 
resources utilization, performance and reliability. The analytical approach for 
estimating the soft error reliability due to induced SEUs bit flips in the configuration 
bitstream of the FPGA is described.  
Chapter 5 presents the experimental radiation test procedure at the Takasaki nuclear 
reactor. The analytical estimation of the system reliability based on the configuration 
memory scrubbing rate and the expected SEU rates is presented.  
Chapter 6 presents the experimental thermal vacuum test procedure at CeNT. The 
thermal analysis model and the test procedure as well as the mitigation techniques 
are presented.  
Chapter 7 presents the results of evaluating the design performance and discusses the 
research contribution in developing a well mitigated MPSoC based on the results 
from the SEnT. 
Chapter 8 presents the conclusion and future work. It discusses the idea of satellite-
on-chip and its feasibility for developing future high performance and ultra-compact 
space missions. We call that concept the ultra-space in a reference to currently 
available ultra-low-size computers. 
 Chapt
at
or
w
radiation t
guaranteed
grade com
to radiatio
their effec
the actual 
satellite to
handles. F
developin
FPGA is 
building b
fault tolera
for small s
2.1 Spa
The space
plasma, m
conditions
a major th
electronic 
The sourc
Tabl
Rad
Sola
Gala
S 
 : Ber 2
ellites are 
bits. In thi
hich affect
hreats as th
 to be radi
ponents wh
n threats, t
ts can be f
in-orbit op
 provide a
ault mitig
g a design
used in d
locks and l
nt OBC de
pacecraft’s
ce Envir
 environme
eteoroids a
 in vacuum
reat for the
component
es of natura
e 2 -1 Main
iation Belts
r Flares 
ctic Cosmi
Chapter 2 
ackgro
subjected t
s chapter w
 the satelli
ey are of 
ation-harde
ich provid
hermal thre
ully mitigat
eration. We
 sufficient 
ation is dis
 that can 
eveloping 
ogic comp
signs using
 OBCs in c
onment
nt is main
nd space d
 [15] [16].
 correct op
s is perfor
l radiation 
 Sources of
 ElPr
Pr
Io
c Rays Pr
Backgroun
und an
o the hars
e provide
te electroni
a probabilis
ned unless 
e radiation 
ats are con
ed and ver
 illustrate t
backgroun
cussed to 
survive rad
the MPSoC
onents are 
 SRAM-ba
omparison 
 
ly characte
ebris, atm
The charge
eration of s
med to cha
in space are
 the Natura
ectrons 
otons 
otons 
ns 
otons and I
d and Liter
d Liter
h space en
 an overvie
cs includin
tic nature 
it would ha
hardening b
trollable an
ified in the
he major fu
d about the
understand
iation-indu
-OBC, a 
illustrated. 
sed FPGAs
to our desig
rized by it
osphere, so
d particles 
pacecraft e
racterize th
 summariz
l Space Ra
1 eV 
1 KeV
1 KeV
1 to a
ons Max 
ature Revie
ature R
vironment
w about th
g the OBC
and a syste
ve been de
y manufac
d are well 
 design and
nctions of 
 nature of
 the differ
ced faults
simple bac
Previous w
 are present
n is discus
s specific n
lar, geoma
in the space
lectronics. R
e space rad
ed in Table
diation Env
– 10 MeV
 – 500 Me
 – 500 Me
 few 10 Me
flux at abou
w 
eview 
conditions 
e radiation
. We focu
m cannot b
signed usin
turing. On 
determined
 test phase
the OBC w
 the tasks 
ent approa
. As SRAM
kground a
ork for dev
ed. The sta
sed.  
ature of ra
gnetic and 
 environm
adiation te
iation effe
 2 -1 r from
ironment [1
V 
V 
V/n 
t 300 MeV
14 
in their 
 threats 
s on the 
e 100% 
g space-
contrary 
 so that 
s before 
ithin the 
which it 
ches for 
-based 
bout its 
eloping 
te-of-art 
diation, 
thermal 
ent form 
sting of 
cts [17]. 
 [18]. 
8] 
/n
Chapter 2 Background and Literature Review 
15 
 
The different types of hazards resulting from the space environment elements are 
shown in Figure 2-1 from the European Space Agency (ESA). 
 
Figure 2-1. Hazards in the Space Environment. Source: European Space Agency, 
Space Environment and Effects Analysis Section © ESA. 
Radiation has three effects on electronic components: 1) Displacement Damage Dose 
(DDD), 2) Total Ionizing Dose (TID) and 3) Single Event Effects (SEE). DDD and 
TID are accumulated effects while SEE is caused by single particles. Radiation 
effects on electronic components are well presented in the literature [19] [20]. 
DDD is an accumulative radiation effect that causes the change of the semiconductor 
atomic lattice due to the displacement of atoms from their locations as a result of a 
strike by an external charged particle such as heavy ions. It is mostly seen in the 
degradation of the satellite solar panels [21].  
 TID is an
the oxide 
which the
switch and
type as P-
SEE is ch
some dest
Burnout (
SEU, Sing
Destructiv
and guid
correspon
electronic 
introduced
Figure 2-
According
space env
radiation e
other accum
of the MO
 device is s
 it would 
MOSFET o
aracterized
ructive for
SEB) and S
le Event T
e and non-
eline to te
dence betw
component
 in [23].  
2. Correspo
 to Ecoffet
iromnet can
ffects with
Chapter 2 
ulative rad
SFET dev
witched on
have a perm
r N-MOSF
 as an effec
ms such as
ingle Even
ransient (SE
destructive
st data in
een the radi
s is shown 
ndence bet
[1
 in [18] [23
 be illustra
 80% due to
Backgroun
iation effe
ice leading
/off from i
anent on o
ET [21]. 
t that is ca
 Single Ev
t Latchup 
T) and Sin
 radiation t
terpretatio
ation sourc
in Figure 2
ween Radia
8] [23] © 2
], the perec
ted in Figu
 SEUs. 
d and Liter
ct which c
 to change
ts gate. Ov
r permane
used by a 
ent Gate R
(SEL) and 
gle Event F
esting and 
ns are giv
es and the 
-2 referenc
tion Source
013 IEEE.
ntages of s
re 2-3. 45%
ature Revie
auses the c
s in the th
er the time
nt off failu
single char
upture (SE
non-destruc
unction In
the needed
en in [22
types of eff
ed from [1
s and Effec
pacecraft a
 of the an
w 
ollection of
reshold vo
 the device
re dependin
ged particl
GR), Singl
tive forms
terrupt (SE
 setups, pro
]. An int
ects they in
8] and was
ts on Com
nomalies du
omolies ar
16 
 ions in 
ltage at 
 fails to 
g on its 
e. It has 
e Event 
 such as 
FI) [22]. 
cedures 
eresting 
duce on 
 initially 
 
ponents 
e to the 
e due to 
 Figure 2-3
Earth ma
However, 
Atlantic A
of SEE [2
spiral mot
the equato
magnetosp
Marshall S
. Percentag
gnetic field
there are tw
nomaly (SA
4]. The tra
ion along t
r as shown
here and ra
pace Fligh
Chapter 2 
es of Spac
 provides 
o regions o
A). Satell
pped proto
he magneti
 in Figure 
diation bel
t Center. 
Backgroun
ecraft Anom
[23] © 201
a protecti
f high radi
ites passing
ns and elec
c field line
2-5 referen
ts is shown
d and Liter
alies due t
3 IEEE. 
on against
ation: 1) th
 over these
trons in th
s or bounce
ced from E
 in Figure 2
ature Revie
o the Space
 charged 
e polar cusp
 locations s
e radiation
 back and 
SA. An ill
-4 referenc
w 
 Environm
particles in
s, and 2) th
uffer highe
 belts can 
forth or dri
ustration of
ed from the
17 
 
ent [18] 
 space. 
e South 
r events 
move in 
ft along 
 Earth’s 
 NASA 
Chapter 2 Background and Literature Review 
18 
 
 
Figure 2-4. Earth Magentosphere and Radiation Belts.Source: Space Environments & 
Effects Program, NASA Marshall Space Flight Center © NASA. 
 
Figure 2-5. Motion of the Trapped Charged Particles. Source: European Space 
Agency, Space Environments and effects analysis section © ESA. 
Chapter 2 Background and Literature Review 
19 
 
2.2 Satellite On Board Computers 
Satellites are launched to carry out specific mission(s). The mission(s) is usually 
called the satellite payload. All satellites have a fixed set of subsystems that must 
exist for the satellite to operate. The power subsystem is responsible for power 
generation, storage, regulation and distribution. The communication subsystem is 
responsible for establishing a communication link in between the satellite and the 
ground station when the satellite is in the visibility region of the ground station. It 
receives the commands uploaded from the ground station and downloads the 
scientific data and telemetry information to it. The attitude orientation and control 
subsystem is responsible for maintaining the satellite orientation in reference to an 
inertial axis within specified margins. It handles the maneuvers which the satellite 
performs to orient itself to a specific position. The navigation unit is responsible for 
providing accurate navigation data about the satellite position in space. It can provide 
time steps to synchronize the satellite real time clock. The OBC is the subsystem 
which is responsible for receiving the commands from the communication 
subsystem, sorting the commands according to their execution times, switching 
on/off the subsystems and missions, sending the commands to the relevant 
subsystems and missions, receiving the telemetry from the subsystems and missions 
and managing the record of the scientific data and telemetry data in storage 
memories.  
The OBC is the satellite controller which synchronizes the operation of the whole 
satellite. Figure 2-6 shows the typical OBC architecture. It consists of a 
microprocessor based system which interfaces to a program memory and storage 
memory. The storage memory should be of a non-volatile type with low access time. 
Thus the read and write access times should be low to reduce the delay in reading 
and writing data in order to have a match between the data generation and play back 
rates. Data generation comes from the scientific mission and data playback goes to 
the communication subsystem. The commands sent from the ground station are 
usually encoded in packet streams and frames. The commands encoding facilitates 
the retransmission in case part of the command stream is lost during transmission. 
The OBC receives the decoded command stream from the communication 
subsystem. In some architectures, a specific subsystem handles the functions of 
Chapter 2 Background and Literature Review 
20 
 
telemetry storage and encoding as well as tracking of the microwave communication 
signal and decoding of the received commands, it is usually known as the Telemetry, 
Tracking and Command subsystem (TT&C).  
The OBC receives the mission data from the payload of the satellite to store it in its 
temporary storage memory for later playback to the communication subsystem 
during the communication session. It might also be possible to directly send the data 
from the payload to the communication subsystem without storing it in the OBC. The 
OBC interfaces with the satellite subsystems through the Input/Output (I/O) unit. The 
I/O unit transmits the interface commands and data arrays from the OBC to the 
satellite subsystems and receives their missions’ data and or telemetry status. It 
handles the interface of each subsystem to match it with the OBC bus. For example it 
can handle the IIC interface from one side with the subsystems and connect to the 
parallel local bus of the OBC on the other side. Sending the commands to the 
satellite subsystems might be serialized over a line such as current loop interfaces. 
Also the switch on/off pulses to start and stop the different subsystems and missions 
can be done through on-board solid state relays. For these specific functions a 
command issuing unit is needed.  
 
 When bui
units and 
transfer b
malfunctio
architectur
handles th
operation 
emergency
approache
2.3 Rad
Fault miti
[25]. Faul
fault toler
wide defin
lding satell
subsystems
etween th
ns. It shou
e. This un
e swappin
of the satel
 cases. In
s that migh
iation-I
gation can 
t avoidance
ance is thro
ition mean
Chapter 2 
Figure 2-6
ites, reliabi
 is used. Th
e redunda
ld be of hig
it is called
g between
lite when it
 the next
t be used in
nduced 
be achieve
 is mainly 
ugh radiat
s the availa
Backgroun
. Typical O
lity is the b
ere is a spe
nt compon
h reliabilit
 the config
 main and
s OBC fails
 section w
 developing
Fault M
d either th
through rad
ion harden
bility of tec
d and Liter
BC Block D
ig concern
cific unit in
ents upon
y and it is 
uration an
 reserve bl
. It acts as 
e illustra
 OBCs. 
itigation
rough faul
iation hard
ing by desi
hniques to 
ature Revie
iagram. 
. Usually a
 the satelli
 detection
preferred to
d failure m
ocks. It ca
the upper le
te the maj
 Appro
t avoidance
ening by m
gn [26]. Fa
mask the fa
w 
 redundanc
te that man
 of failu
 apply TM
anagement
n also han
vel control
or fault t
aches 
 or fault t
anufacturin
ult toleran
ults whene
21 
 
y of the 
ages the 
res and 
R on its 
 unit. It 
dle the 
 logic in 
olerance 
olerance 
g while 
ce in its 
ver they 
Chapter 2 Background and Literature Review 
22 
 
happen so that the system keeps on working even when faults exist. The techniques 
that can be used mainly focuses on adding redundancy to the system in space, time 
and data [27]. Space redundancy depends on adding additional computation blocks 
that provide redundant operation to the main working set. A consensus might be 
formed among the results or a switch to a backup function might be done if the main 
block failed to produce valid data. For time redundancy, execution is performed for 
the first round and the result is tested for validity. In case the result is not accepted, 
as it is not valid, execution repetition would take place and results are recalculated 
until a valid result is obtained or a system error signal takes place. Time redundancy 
can be executed even if the result is valid to make sure that the test module itself is 
working correctly. The data redundancy depends on adding additional bits known as 
code-words to the data to facilitate the detection of faults whenever they exist.  
Usually data faults take place in the memory contents and processor registers. They 
might be of no effect or have severe system defects. Many techniques were 
developed based on the taxonomy of fault tolerance into space, time and data. 
Implementation of fault tolerance techniques can take place in hardware, software or 
both. There exists a group of techniques known as Software Implemented Hardware 
Fault Tolerance (SIHFT) that handles the tolerance of software to projected faults 
[28] [29]. It mainly covers the control flow protection and data flow protection of the 
running software code and emulates the hardware protection techniques using 
efficient implementation of software modules. Also there are pure software-based 
fault tolerance techniques [30]. Among the Software Fault Tolerance (SFT) 
techniques are the N Version Programming (NVP), Recovery Block (RB) and Check 
Pointing, and Architecture Based Fault Tolerance (ABFT). Techniques for protecting 
the SRAM-based FPGAs are developed based on scrubbing and full/partial TMR to 
reduce power and area utilization by applying TMR to the critical sections [31] [32] 
[33]. 
When applying a fault tolerance strategy for protecting a system against the expected 
faults, a hybrid methodology is usually adopted. Where a combination between time, 
space and data redundancy is applied and implemented partially in hardware and 
software together. The fault tolerance strategy is usually thought of in the form of 
layers where each layer focuses on the protection of a certain scope of the system 
Chapter 2 Background and Literature Review 
23 
 
design. In chapter 3 we illustrate our taxonomy of the OBC architecture alternatives. 
This taxonomy can be used as a basis for selecting the most fitting fault tolerant 
approach to develop the OBC subsystem according to the mission requirements and 
the available power resources. Also, it provides the analytical formulas for 
calculating the OBC system reliability. 
2.4 SRAM-based FPGAs for Space Applications 
Typically satellites used to depend on discrete components such as microprocessors 
and microcontrollers to achieve the software processing tasks. Hardware designs 
were implemented using digital and analog components. However, with the 
advancement in the technology, FPGAs were introduced as programmable devices 
that can accommodate digital designs as well as complete embedded systems in one 
package. FPGAs are either anti-fused or SRAM-based. Anti-fused FPGAs can be 
programmed once and they provide robust performance against radiation on the 
expense of high incurred development costs. SRAM-based FPGAs are 
reprogrammable as they store the configuration data which represents the design in 
their local on-chip SRAM memories. The problem with such FPGAs, although they 
are more flexible in the design and development than anti-fused FPGAs, is the 
possibility of altering the design by radiation induced faults in the SRAM. Anti-fused 
FPGAs are usually of lower capacity in terms of the area that can hold the design 
when compared to SRAM-based FPGAs.  
The typical contents of an SRAM-based FPGA from Xilinx are broken down in 
Figure 2-7 . The FPGA contains Configuration Logic Blocks (CLB) which holds flip 
flops for implementation of sequential logic as well as Look-Up Tables (LUT) for 
implementation of combinational logic. The number of slices and flip flops vary 
from family to another and even within the same FPGA family. Xilinx FPGAs also 
contain on-board Block Random Access Memory (BRAM) for storing of binary data 
as well as program codes, dedicated multipliers for fast multiplication and Digital 
Clock Managers (DCM).  
The use of commercial grade SRAM-based FPGAs in space applications is an on-
going research topic which is state of the art [14]. Developing reconfigurable fast 
processing systems in miniaturized spaces and lower power consumptions with high 
 reliability 
Mitigation
memory 
configurat
configurat
[35]. The 
Xilinx sho
used and s
Scrub
based FP
connection
design. It 
Whenever
changed a
implement
the bitstre
of binary 
word cons
LX50 [37
is possibl
 of the SR
scrubbing 
ion access
ion tiles w
first report
wed 99.9%
crubbed ex
Figure 2-7.
bing is the 
GA’s bitst
s between 
is stored in
 an upset h
ccordingly
ed through
am and cor
frames of 
ists of 32 
]. Each fram
Chapter 2 
e using th
AM-based
either thr
 port [34
hich can b
ed flight ex
 SEU imm
ternally usi
 Xilinx FPG
principle f
ream in a
the differen
 the form 
appens, suc
 from 1 to
 an internal
recting the 
configurati
bits. Thus 
e is prote
Backgroun
e technolo
 FPGA des
ough exte
]. Other t
e reconfigu
perience fo
unity, wher
ng an anti-f
A Typical
ault toleran
n error f
t resources
of 1’s and 
h as an SE
 0 or from
 or externa
errors foun
on bits. Ea
one config
cted by an 
d and Liter
gy afforde
ign is carr
rnal contr
echniques
red to rep
r an SRAM
e two FPG
used Actel
 Contents B
ce techniqu
ree state. 
 inside the 
0’s in the 
U, to the s
 0 to 1. T
l scrubber, 
d in it. The 
ch frame c
uration fra
Error Corr
ature Revie
d by SRA
ied out thr
oller or 
depend on
lace faulty
-based V
As of LX a
 RTAX FPG
reakdown 
e used in m
The bistre
FPGA to im
internal SR
tored bits, 
he scrubbi
is responsib
bitstream i
onsists of 4
me is 1312
ecting Cod
w 
M-based 
ough confi
from the 
 using ho
 tiles upon
irtex-4 FPG
nd SX type
A [36]. 
© Xilinx. 
aintaining 
am descri
plement a
AM of the
its values m
ng function
le for read
s stored in t
1 words a
 bits in th
e (ECC). T
24 
FPGAs. 
guration 
internal 
t-spares 
 needed 
A from 
s where 
 
SRAM-
bes the 
 specific 
 FPGA. 
ight be 
, either 
ing back 
he form 
nd each 
e Xilinx 
he ECC 
 has the cap
such as S
more than
would onl
is protecte
check the 
Figure 2-8
which pr
configurat
bitstream 
Frame Da
16 bits ar
configurat
ability of S
EU, then th
 one bit e
y detect the
d by a rea
integrity of
 shows the 
esents the 
ion metho
is done thro
ta Register 
e unused, 1
ions bits. 
Figure 2
Chapter 2 
ECDED. I
e ECC che
rror occurr
 error and s
dback Cyc
 the overall
configurati
details o
ds. It sho
ugh readin
(FDR). In 
2 bits are 
-8. Virtex-5
Backgroun
f a single e
cker circui
ed, such as
ignal an er
lic Redund
 bitstream a
on frame o
f the con
uld be no
g a comple
each frame
used for E
 Configura
d and Liter
rror occurs 
try can det
 MBU, th
ror conditio
ancy Chec
nd whether
f the Virtex
figuration 
ticed that 
te frame th
 1280 bits 
CC and 4 b
tion Frame
ature Revie
to any of th
ect the erro
en the ECC
n. The who
k (CRC) c
 it contains
-5 FPGA re
procedure 
accessing 
rough shift
are used as
its are use
 Format © X
w 
e bits in th
r and corr
 checker 
le FPGA b
ode that is 
 errors or n
ferenced fr
and the 
the confi
ing its word
 configurat
d as misce
ilinx. 
25 
e frame, 
ect it. If 
circuitry 
itstream 
used to 
ot.  
om [37] 
possible 
guration 
s in the 
ion bits, 
llaneous 
 
Chapter 2 Background and Literature Review 
26 
 
2.5 Previous Work 
Developing OBCs for space missions has been concerned with the amount of 
functions which can be performed per unit volume and per unit power. For example, 
the strategy of NASA to advance the OBC technology was to develop OBCs with 
higher FLOPS/Watt/cm3 and higher functions/cm3. This strategy led the evolution 
from the Cassini EFC in 1992 to the Mars Path Finder MFC in 1994 till the X2000 in 
2003 [38].  
As space computers are expensive to develop because of the space grade components 
used in them, COTS were thought as a cost-effective solution as well as a higher 
performance one. The problems of SEUs in COTS microprocessors are investigated 
and fault tolerance techniques keep evolving due to the increasing threat as a result 
of shrinking the transistor feature sizes for higher performances and capacities [39]. 
However, the use of COTS should not jeopardize the safety of the mission. In high-
safety missions, radiation hardened space approved processors should still be the 
ultimate choice such as the case of manned missions while COTS can be used for 
low-cost scientific missions [40].  
Some space agencies started, a long time ago, ambitious programs to promote the 
microsatellite engineering and to make space more affordable and accessible through 
applying advanced system engineering concepts and the use of COTS in space, such 
as the Myriade program by CNES [41]. Some missions already proved to fly 
successfully with COTS computers applying redundancy concepts [42] [43]. 
The next step was to work with multicore technology and introduce processors that 
can achieve acceptable high performance within tight mass, power and cost 
constraints. An example is NASA’s multicore OPERA Maestro processor which is 
based on the Tilera TILE 64 architecture [44]. NASA objective is to get high 
performance within error rate equivalent to the currently working uniprocessor 
systems.  
The concept of distributed OBC is introduced through Adaptive Middleware for 
Fault Tolerance (AMFT) design. Tasks can be located among the OBC processing 
Chapter 2 Background and Literature Review 
27 
 
elements, which are located at different positions, in case one of the processing 
elements was hit by debris as an example [45]. 
The use of redundant parallel processing FPGAs interfaced to processing nodes to 
implement a fast computational platform was utilized in the design of the X-Sat 
microsatellite [46]. Some architectures use a mix between COTS microprocessors 
and radiation hardened FPGAs with reconfigurable processing resources for 
achieving high performance in communication satellites such as the Iridium 
constellation [47]. Satellite sensor networks based on distributed nodes computing 
are investigated in [48] [49]. NASA New Millennium Program, COTS processors 
with (10x-100x) the processing power of radiation hardened processors is 
investigated in [50] [51] [52] [53]. A reconfigurable OBC architecture using FPGAs 
for next generation space missions is under development by DLR [54]. 
The state-of-art in OBCs is to develop systems using FPGAs rather than 
microprocessors and ASICs for higher logic capacities and achieving the required 
performance, cost, and power constraints [14]. Figure 2-9 referenced from [14] 
shows the processor throughputs in comparison to their power consumption. Our 
design is targeting the 400 MIPS at a power consumption of less than 5 Watt 
maximum. The RAD750 achieves 260 MIPS with higher power consumption, almost 
2X our target. Also, the price of such computers, although official quotation is not 
available to us, might be up to 500X our design cost. It has been stated in [54] that 
the unit price of QML-V grade microcontrollers and FPGAs vary between US$ 
20,000 and US$ 70,000 while their COTS counterparts are less than US$ 100. The 
price of our system is targeting less than US$ 2000. The main difference between our 
design and all of the reviewed work is that it presents an integrated cycle to design a 
low cost COTS-based MPSoC OBC using SRAM-based FPGA and thoroughly 
qualify it. We collected the pieces of the puzzle together in this research.  
 Figure 
The worst
OBC pow
mitigation
Figure
5
10
15
20
25
30
35
40
45
M
IP
S/
W
at
t
2-9. Satellit
 case MIP
er consump
 technique 
 2-10. Spac
0.20
0
0
0
0
0
0
0
0
0
0
Chapter 2 
e Processin
S/Watt for
tion is est
as shown la
ecraft Com
10.40
Backgroun
g Trends [
 each comp
imated as 4
ter in Table
puters Com
Consum
12.00
86.
Spacecr
d and Liter
14] © NAS
uter is sh
 Watt usin
 7-7. 
parison ba
ption. 
00 100.00
aft Comp
ature Revie
A – Ames R
own in Fig
g heat sink
sed on Perf
120.00
uters
w 
esearch C
ure 2-10. 
 only as a
ormance/Po
150.00
400
28 
 
enter. 
Kyutech 
 thermal 
 
wer 
.00
Chapter 2 Background and Literature Review 
29 
 
The MIPS/Watt was estimated for the FPGA only without adding the power 
consumption of the external MRAMs. This gives meaningful comparison with other 
processors, which does not include external memories, such as the SPARC V8. 
It would be more realistic to compare our design, Kyutech OBC, with computers 
which have closer MIPS. The RAD750 has 260 MIPS at worst power consumption 
of 25 Watt. If we consider that our design has 400 MIPS, total maximum throughput 
of the 4 embedded processors, then the MIPS/Watt for Kyutech OBC is far much 
better than the RAD750. We might increase the estimated power consumption of 
Kyutech OBC by counting for the external MRAMs. For example we consider 4 
memories, one for each processor, each of which is 2 Mbytes and consumes about 
0.6 Watt, and then we would have 68.97 MIPS/Watt which is still much better than 
the RAD750. 
The MIPS/US$ can be estimated based on an assumption of the price of the space 
grade computers which is about 500 K USD. Kyutech OBC total price is less than 
2,000 USD. Therefore the MIPS/US$ of Kyutech OBC is about 0.2 while for the 
RAD750 is about 0.00052. However, we have to state that the RAD750 is well 
designed and tested in about 28 space missions including the Mars Reconnaissance 
Orbiter mission [55]. RAD750 uses the IBM PowerPC 750 in its radiation hardened 
version. It is mostly suitable for missions requiring high level of reliability against 
radiation induced faults and accumulated doses. Our comparison with the RAD750 
tries to present an estimation of how good our system is. However, our Kyutech 
OBC is designed for missions that can accommodate some sort of risk as we use 
COTS components. Manned missions which need ultimate reliability should use 
space grade and space approved components. 
The innovation in this research is that it provides complete mitigation and systematic 
analysis as well as ground testing of the radiation and thermal conditions to develop 
a working system that can be used in real space missions. We judge that our design 
passed Technology Readiness Level (TRL) 6, as defined by NASA, which is 
concerned with testing the functionality of the engineering model/prototype in the 
relevant environment [13]. It would be raised to 8 after successfully demonstrating 
the operation in real flight conditions. 
Chapter 2 Background and Literature Review 
30 
 
In the next chapter we introduce the MPSoC OBC design in details and illustrate the 
architecture alternatives for developing fault tolerant OBC systems using SRAM-
based FPGA as well as the different scrubbing methodologies that can be applied. 
C 
Chapt
On-Bo
n thi
MPS
tasks
alternative
adequate 
reconfigur
The impl
Developm
sections o
3.1 OB
Typically,
for an emb
added to m
of any em
system bu
modules, 
controllers
programm
I
hapter 3 M
 Muer 3
ard Co
s chapter w
oC can act 
 as well as
s to develo
architecture
ation for d
ementation 
ent Kit (E
f this chapt
C Archi
 an OBC co
edded syst
atch with 
bedded sy
ilding bloc
timers, mem
/arbiters, 
able logic. 
ulti-Proces
lti-Pro
mputer
e present t
as the avio
 central dat
p a fault t
, we then
eveloping 
of the M
Dk) and I
er were part
tecture A
nsists of a
em to oper
the interfac
stem are of
ks typicall
ories eith
Input/Outp
sor System-
cessor 
 Design
he design 
nics system
a processin
olerant sys
 introduce
fault tolera
PSoC usin
SE Design
 of the auth
lternat
 processor 
ate. Special
e requireme
 common 
y comprise
er volatile 
ut interfa
on-Chip O
System
 
of the MPS
 for small 
g for other
tem are inv
 the conce
nt MPSoC
g Xilinx t
 Suite is 
or publicat
ives 
system with
 Input/Outp
nts of the 
type as sho
 micropro
or non-vola
ce logic,
n-Board Co
-on-Ch
oC On-Bo
satellites as
 subsystem
estigated. 
pt of usin
 using SR
ools such 
thoroughly
ions at [56]
 necessary
ut interface
system but 
wn in Figu
cessors/mic
tile, interru
 watch 
mputer De
ip  
ard Compu
 it can hand
s. The arch
Having sel
g dynamic
AM-based 
as the Em
 illustrated
 [57] [58]. 
 peripherals
 circuitry m
still the maj
re 3-1. Em
rocontrolle
pt controll
dog time
sign 
31 
ter. The 
le OBC 
itecture 
ected an 
 partial 
FPGAs. 
bedded 
. Some 
 needed 
ight be 
or parts 
bedded 
rs, reset 
ers, bus 
rs and 
C 
Before thi
embedded
areas at fi
upsets is 
mask the e
protected 
and softwa
 
hapter 3 M
Figure 3
nking of th
 OBC syste
rst. Also a 
necessary t
ffect of su
in an embe
re fault tol
ulti-Proces
-1. Genera
e fault tole
m, the des
careful stu
o select co
ch faults. T
dded OBC
erance prot
sor System-
l Embedded
rance techn
ign should 
dy and exp
mponents 
able 3-1 pre
 system to
ection techn
on-Chip O
 OBC Syst
iques that 
be analyze
ectation of
and techni
sents a sum
gether with
iques that 
n-Board Co
em Archite
can be app
d to determ
 the type o
ques that c
mary of th
 the most 
can be appl
mputer De
cture. 
lied to pro
ine the err
f faults and
an mitigate
e target are
common h
ied.  
sign 
32 
 
tect any 
or prone 
 rate of 
 and/or 
as to be 
ardware 
Chapter 3 Multi-Processor System-on-Chip On-Board Computer Design 
33 
 
We present a new taxonomy for classifying redundancy while implementing satellite 
OBCs. The areas that are most prone to SEUs are those carrying data that can be 
corrupted. The system memory carrying the program code, program related variables 
and data in addition to the processor registers are the areas of the highest risk to be 
affected by the SEUs. When an FPGA system is used, then if it was an SRAM-based 
system, the configuration data should be protected. The techniques used for 
protecting the FPGA configuration data depend on reading back the configuration 
data and detecting if the data contained an error or not. The corrected data would be 
written back in case an error was detected. A complete reconfiguration every period 
of time can be initiated to overcome the effect of accumulated errors and as a 
precautionary action.  
Processors and memories in an embedded system can be replicated or used as single 
modules. If a processor is used alone without repetition we call it Single Processor 
(SP), while if a processor is replicated we call it Multiple Processor (MP). The same 
taxonomy is valid for memory, either Single Memory (SM) or Multiple Memory 
Table 3-1. Scope Protected and the Fault Tolerance Protection Techniques. 
Scope Hardware Technique Software Technique 
Processor Registers 
Processor redundancy with 
cross voting 
Re-execution in extra 
time 
Watch dog processor 
Data and control flow 
protection 
Dual processors with 
comparator 
RB, NVP, Check 
pointing 
Memory 
Error Detection and 
Correction (EDAC) 
Software EDAC 
Multiple blocks Memory Scrubbing 
FPGA configuration 
Scheduled repair 
- Emergency repair 
Running repair 
Input/Output Minority voters Repetition of execution 
Reset and Clock Redundancy - 
Chapter 3 Multi-Processor System-on-Chip On-Board Computer Design 
34 
 
(MM). Combining these architectural possibilities we obtain four architectures: 
SPSM, SPMM, MPSM, and MPMM as shown in Figure 3-2. We call that taxonomy, 
Abu Zaid’s Taxonomy of fault tolerant processor-memory interfaces [56]. Abu Zaid 
is the author’s family name. 
The architectures vary in the number of used blocks and their connection schemes. 
The number of used blocks proportionally affects the system complexity. If the 
number of used blocks increases then the complexity increases and vice versa. If 
some of the components are redundant then the overall system reliability would 
increase. Another dimension might be added to the proposed taxonomy which is the 
use of EDAC. It can be either implemented in hardware logic or in software routine. 
The system then can be classified as (SP,SM,E), (SP,SM,NE), (SP,MM,E), 
(SP,MM,NE), (MP,SM,E), (MP,SM,NE), (MP, MM,E), and (MP,MM,NE). Where 
the (E) denotes the existence of EDAC and (NE) denotes the absence of the EDAC.  
The system data reliability can be estimated by estimating the reliability of each of 
the processor and memory modules. In our design we use an FPGA to implement the 
processor system(s). The FPGA that is used is the Xilinx XC5VLX50, which 
contains a configuration array size of 392,124 words, each word consists of 32 bits. 
Thus the (full) configuration data size is around 11.97 Mbit. 
We would give a numerical example and plot of the OBC system reliability, based on 
our proposed fault tolerance taxonomy of the processor-memory interface. The 
calculations would make use of the estimated Failure-In-Time (FIT) for the Virtex-5 
FPGAs which is 165 FIT/Mbit. The FIT corresponds to the number of expected 
failures per 109 hours. Thus the Mean Time Between Failures (MTBF) is about 
57.814 years, in case the FPGA is fully configured. The FIT figure is estimated per 
Mbit of configuration data therefore the total FIT for an FPGA should be calculated 
based on the actual number of slices occupied by a design (the design size in 
configuration bits). If we calculate for the full configuration data size which is about 
11.97 Mbit for the XC5VLX50, then the FIT value would be in the order of 1975 
FIT. Thus the failure rate would be estimated as (ߣ௉= 0.017297 failures/year). We 
select the memory to be used in the design as a Magneto-resistive Random Access 
Memory (MRAM). The EVERSPIN MR4A16B memory is a good example as it can 
Chapter 3 Multi-Processor System-on-Chip On-Board Computer Design 
35 
 
retain data up to 20 years. Thus its failure rate is estimated to be (ߣ௠ = 0.05 
failures/year).  
 
Figure 3-2. Abu Zaid Taxonomy of Fault Tolerant Processor-Memory Interfaces [56]. 
The system can have different architectures as indicated in Table 3-2. The reliability 
calculation formula is indicated for the different cases. The architectures make use of 
redundancies to provide fault tolerant performance. Nevertheless, the redundancies 
were limited to a practical number of modules that can be included due to size, area 
and power constraints that must be considered when designing satellite systems. 
Maximum of 3 redundant units can exist at any architecture. The use of the TMR 
scheme is well known in designing reliable systems. The reliability formulas apply a 
(2 out of 3) reliability calculation. If EDAC is used it would certainly have an effect 
on the overall reliability. It would raise the reliability values of the proposed 
architectures. 
TMR systems use voters to produce the final output value. The voters themselves 
have reliability calculations related to them. If the voter and EDAC are implemented 
inside the FPGA then their reliabilities can be estimated by first calculating their 
expected FIT then estimating the MTBF from it. The MTBF would be used to 
calculate the failures per year which is the failure rate for that design portion. To 
calculate the FIT of a design or even part of it, then the number of configuration bits 
occupied by that design would be multiplied by the FIT/Mb for that type of FPGA 
technology. The FIT for the concerned design would then be estimated. Therefore if 
we would like to include the EDAC and voter effect on reliability then we have to 
estimate their own reliabilities and then include them in the reliability calculation 
formula according to their connection in the design structure.  
Chapter 3 Multi-Processor System-on-Chip On-Board Computer Design 
36 
 
In assessing the reliability of each of the proposed architectures we have to 
differentiate between assessing the reliability to understand SEU susceptibility and 
assessing the reliability related to the device lifetime. The SEU susceptibility would 
indicate the vulnerability of the design towards SEUs and its ability to withstand 
faults induced due to them. On the other side, device reliability is an issue that would 
indicate the device ability to function properly during its planned lifetime. In order to 
improve the system data reliability or trustworthy, variety of techniques can be used.  
TMR is one of the techniques that is popular and well established. It is used to 
improve data reliability through 2 out of 3 voting. On the other side, building a 
system that would live longer in terms of operational hours and robustness in the 
different operating conditions can be achieved through manufacturing and design 
approaches. Manufacturing should focus on selecting durable materials of 
characteristics suitable to the expected operating conditions. While the design should 
focus on defining architectures that consider redundancy on different levels such as 
modules, subsystems and even complete systems. 
Redundancy can be active where the redundant units operate simultaneously with the 
main units or it can be inactive where the redundant units are only activated when the 
main units halt or malfunction. Switching between redundant units can be done 
through a system level controller. Systems can have their redundant units running at 
the same time where the main units are running. Other systems might have the 
redundant units switched off and they are only switched on when an error in the main 
set takes place such as switching from main OBC to reserve OBC. 
  
Chapter 3 Multi-Processor System-on-Chip On-Board Computer Design 
37 
 
P1 
P2 
P3 
Table 3-2. Architecture Alternatives for Processor-Memory Interfaces. 
Fault Tolerance 
Type 
Architecture 
Processor-Memory 
Reliability Formula 
SPSM 
 ܀ = ࡾ࢖ࡾ࢓ 
(3-1)
SPMM 
 
܀ = ࡾ࢖	(૜ࡾ࢓૛ − ૛ࡾ࢓૜ ) 
(3-2)
MPSM 
 
܀ = ࡾ࢓	(૜ࡾ࢖૛ − ૛ࡾ࢖૜) 
(3-3)
MPMM 
 
܀ = (૜ࡾ࢖૛ − ૛ࡾ࢖૜) × 
(૜ࡾ࢓૛ − ૛ࡾ࢓૜ ) 
(3-4) 
When memories are interfaced to more than one processor on the same bus it is said 
to be shared. If memories are interfaced to processors on separate buses they are said 
to be non-shared or distributed. Shared memory systems are known as tightly 
coupled systems while non-shared memory systems are known as loosely coupled 
systems. In the case of single processor the memories are only accessible to that 
processor so they are always considered as shared. It is meaningless to think of a 
P1 
P1 
M1 
M2 
M3 
M1 
P1 
P2 
P3 
M1 
M1 
M2 
M3 
Chapter 3 Multi-Processor System-on-Chip On-Board Computer Design 
38 
 
non-shared memory system in case of single processor. Therefore, in the architecture 
presented in Table 3-2, the MPSM is a shared memory system, it can be 
implemented as a non-shared memory system as shown in Figure 3-3. Multi-
Processors will have 2 out of 3 voting on their results, while each of them has got its 
own separate memory interface. 
 
Figure 3-3. MPSM with non-shared Memory. 
The same concept applies in the case of the MPMM where in Table 3-2, it is shown 
as a shared memory system, while in Figure 3-4 it is shown as a non-shared memory 
system. Each processor will have its own interface with multiple memory units. 
Voting (2 out of 3) will take place in two levels, among the memory units attached to 
each processor then among the processors themselves. 
 
Figure 3-4. MPMM non-shared memory. 
It is necessary to notice the different architectures and their effects on the reliabilities 
formulas of the system. The following graphs show the system soft error reliability 
for selected architectures using a very pessimistic assumption. We used the FIT value 
P2 M2
P1 M1
P3 M3
P1
M1
M2
M3
P2
M1
M2
M3
P3
M1
M2
M3
Chapter 3 Multi-Processor System-on-Chip On-Board Computer Design 
39 
 
for a fully configured FPGA as if it is the FIT value for any single processor system 
from those indicated in the architectures. In fact this is a very tough approximation, 
as the real FIT for the processor system will be much smaller due to the fact that it 
occupies only a portion of the FPGA configuration bitstream.  
When analyzing reliability, it would be necessary to consider that some designs 
might not only have redundant processors inside one FPGA but also they might have 
complete redundant FPGA devices. The FPGAs might be of different types and thus 
of different FIT values. The FPGAs digital designs, which might have several 
embedded systems, can still form a (2 out of 3) redundancy by communicating data 
in between them. Voting would then be carried out inside each FPGA among the data 
generated internally and the data received from the other two external FPGAs. In the 
case of the processor modules inside the same FPGA, data exchange is easier as it 
takes place through mailboxes via a sort of Message Passing Interface (MPI).  
0.5
0.6
0.7
0.8
0.9
1
0 2 4 6 8 10
R
el
ia
bi
lit
y
Time, Years  
Figure 3-5. MPSM non-shared Memory Reliability Curve. 
Chapter 3 Multi-Processor System-on-Chip On-Board Computer Design 
40 
 
Figure 3-5 shows the soft error reliability of the MPSM non-shared memory model. 
It is calculated from Eq. (3-5): 
R = 3R୮ୱଶ − 2R୮ୱଷ  
where: R୔ୗ = R୮R୫.  
(3-5) 
R୔ୗ is the processor system reliability which consists of both the processor and the 
memory, R୮  is the processor reliability, and R୫  is the memory reliability. The 
reliability of the MPMM architecture with non-shared memory is shown in  
Figure 3-6. It can be calculated by estimating the reliability of each processor system 
at first then estimating the reliability among the processor systems. The processor 
system in this case means the processing set which consists of 1 processor and 3 
memories. The overall system consists of 3 processor systems (processing sets). The 
reliability of each processing set is the multiplication of the processor reliability and 
the (2 out of 3) memory reliability. Calculations are shown in Eq. (3-6): 
R = ܴ௣௦ଷ + 3ܴ௣௦ଶ ൫1 − ܴ௣௦൯ 
Where: ܴ௣௦ = 	ܴ௣	(ܴ௠ଷ + 3ܴ௠ଶ (1 − ܴ௠)). 
(3-6) 
The MPMM architecture with non-shared memory shows higher reliability values 
than the MPSM architecture with non-shared memory for the same service period. 
However, the MPMM consumes high power relative to the MPSM. A compromise 
should be made between the gain in reliability and the losses in consumed power.  
Chapter 3 Multi-Processor System-on-Chip On-Board Computer Design 
41 
 
0.65
0.7
0.75
0.8
0.85
0.9
0.95
1
0 2 4 6 8 10
R
el
ia
bi
lit
y
Time, Years
 
Figure 3-6. MPMM non-shared Memory Reliability Curve. 
Two other schemes are considered in our architecture evaluation for the OBC design. 
The MPMM with shared memory and the MPSM with shared memory as shown in 
Table 3-2. The reliability curves for both schemes are shown in Figure 3-7 and 
Figure 3-8 respectively.  
It is interesting to notice that the power consumption profile of the MPMM with non-
shared memory would be the highest as it contains more components, at the same 
time it is of the highest data reliability among the Multi-Processor schemes. The 
lowest power consumption profile would be the MPSM with shared memory which 
would be of the lowest reliability as well. Both the MPMM with shared memory and 
the MPSM with non-shared memory are almost of the same power consumption 
profile and close reliability values. In our selection of the architecture we have to 
Chapter 3 Multi-Processor System-on-Chip On-Board Computer Design 
42 
 
consider the mission reliability requirements, service conditions and satellite power 
budget. In our MPSoC design, the MPMM scheme is not suitable for the mission 
power budget. We selected the MPSM with non-shared memory as it has a moderate 
power consumption and reliability while less complexity in managing memory 
interface than the MPMM with shared memory. 
0.6
0.65
0.7
0.75
0.8
0.85
0.9
0.95
1
0 2 4 6 8 10
R
el
ia
bi
lit
y
Time , Years
 
Figure 3-7. MPMM shared Memory Reliability Curve. 
Chapter 3 Multi-Processor System-on-Chip On-Board Computer Design 
43 
 
0.5
0.6
0.7
0.8
0.9
1
0 2 4 6 8 10
R
el
ia
bi
lit
y
Time, Years  
Figure 3-8. MPSM shared Memory Reliability Curve. 
By analyzing the MP curves from Figure 3-5 to Figure 3-8, we can estimate the 
degradation in soft error reliability over the mission service period for the Multi-
Processor architectures. Therefore, for example, if a satellite mission was planned to 
serve for 4 years in space then from the MP reliability curves we can estimate the 
soft error reliability values as: 1) MPMM with non-shared memory about 96%, 2) 
MPMM with shared memory about 90%, 3) MPSM with non-shared memory about 
86%, and 4) MPSM with shared memory about 81%. These results are summarized 
in Figure 3-9. 
Chapter 3 Multi-Processor System-on-Chip On-Board Computer Design 
44 
 
 
Figure 3-9. Soft Error Reliability for MP Architectures. 
In the abovementioned analysis, we focused on analyzing the MP cases as we are 
implementing an MPSoC. However, in case of implementing an SP system then it 
should be straight forward to estimate the soft error reliability from the formulas in 
Table 3-2. Next we present the use of the dynamic partial reconfiguration concept to 
scrub the configuration frames in the SRAM-based FPGA and provide fault tolerance 
against SEU induced faults. 
3.2 FPGA Reconfiguration Strategies 
Several strategies have been proposed to deal with the reconfiguration of FPGA 
bitstream in case of induced errors. The typical form of errors that might take place 
in the bit stream is the SEU, it is known as a flip in the configuration data. Thus the 
value of binary stored information will be negated, (1) changes to (0) and (0) changes 
to (1). The SEU is one form of what is globally known as SEE which includes other 
forms of faults caused due to single events of radiation effects on electronic 
components. The effect can be either in a single bit known as SBU or it can take 
place in multiple bits known as MBU.  
Reconfiguration of the bit stream is meant to overcome the effects of changing the 
bitstream values stored in the FPGA SRAM. The bitstream is stored in the form of 
96
90
86
81
70
75
80
85
90
95
100
MPMM non-
shared
MPMM shared MPSM non-
shared
MPSM shared
So
ft
 E
rr
or
 R
el
ia
bi
lit
y 
%
MP Architecture
Chapter 3 Multi-Processor System-on-Chip On-Board Computer Design 
45 
 
configuration frames. Each frame contains a group of configuration bits, for Virtex 5 
the frame consists of 1312 bits. The configuration bits are grouped into words of size 
32 bits. Thus each frame consists of 41 words, each of which consists of 32 bits that 
is a total of 1312 bits. 
The frames contain the bits that are related to the CLB, BRAM, Input/Output Blocks 
(IOB) and other configuration data for FPGA interconnects routing and modules. 
Each frame contains 12 bits for ECC such as Hamming code. It can be used for 
SECDED. The configuration time for the FPGA is based on the configuration clock 
rate and the configuration array size. The array size consists of words and the total 
bit stream can be viewed roughly as the total number of words times 32. Each word 
of 32 bits would take one clock cycle to be programmed into the FPGA. Thus an 
FPGA such as the XC5VLX50 would approximately take 7.42 mS to be fully 
configured at a configuration clock of 50 MHz. 
A group of registers are used to manage the access to the configuration data through 
a sequence of control commands. The configuration data can be read or written as 
one complete frame which is the smallest unit to be accessed for configuration. The 
frame data can be accessed by setting the Frame Address Register (FAR) which 
denotes the type of the block being accessed and the FPGA top or bottom half, 
details are described in [37].  
The Xilinx XC5VLX50 has two versions, the LX50T and the LX50. Both versions 
are of identical technology except for some extra logic slices and hardwired modules 
in the LX50T. In our research we performed the design on both FPGAs and we used 
them interchangeably. The LX50T contains a total of 12,547,968 configuration bits 
but about 11.37 Mbit are the actual available bits for configuration. The Virtex 5 
FPGAs can load an encrypted bit stream using Advanced Encryption Standard (AES) 
with a key length of 256 bit. The FPGA contains an AES decryption module that can 
be used only to decrypt the encrypted bitstream. The decryption module cannot be 
used in the design being loaded in the FPGA. It is dedicated only to the bitstream 
decryption. The FPGA configuration can take place through several modes that can 
be controlled through configuration pins. The modes are either in Master or Slave 
operation. In the Master operation, the configuration clock is driven from an internal 
oscillator. While in the slave operation the clock is an input. The configuration data 
Chapter 3 Multi-Processor System-on-Chip On-Board Computer Design 
46 
 
can be loaded in parallel or serial streams. Other modes are available such as the 
JTAG/Boundary-scan mode and the SelectMap interface mode.  
The sequence of configuring the FPGA proceeds in two phases, the setup phase and 
the bitstream loading phase. In the setup phase the FPGA clears the internal 
configuration memory just after power-up, it then detects the configuration mode to 
be used such as Master or Slave and parallel or serial transfer. In the bitstream 
loading four stages are executed, synchronization, device ID check, loading of 
configuration data, and the CRC check. A 32 bit CRC is used to denote any errors in 
the configuration data loaded to the FPGA. Details of the Xilinx XC5VLX50 
characteristics and operation can be found in user guides, manuals and data sheet. 
The configuration stream can be read-back after being loaded to the FPGA internal 
SRAM. The read-back can take place through the external SlelectMap interface or 
the Internal Configuration Access Port (ICAP) primitive. In case errors exist in the 
bit stream due to SEU, the read-back technique can be used for mitigation. There are 
three main strategies when dealing with the SEU problem in the configuration data: 
1) Scheduled maintenance  
2) Emergency maintenance 
3) Running repairs 
3.2.1 Scheduled Maintenance 
The scheduled maintenance depends on triggering a complete reconfiguration for the 
FPGA every specified period of time. It does not try to detect whether bitstream 
errors exist due to SEU or not. In that technique a wash out is performed through 
rewriting the whole configuration data once more thus allowing to wipe the transient 
errors formed due to SEUs, if any. That strategy can be used when the time between 
successive reconfigurations is smaller than the mean time between bit upsets. 
The main advantage of the scheduled maintenance is that it performs a complete 
washout which removes all accumulated configuration errors whether SBUs or 
MBUs. Scheduled maintenance can take place by reading data from an external flash 
memory. That flash memory is different from the original platform configuration 
flash. However, it holds an exact bitstream copy, golden image, identical to the one 
Chapter 3 Multi-Processor System-on-Chip On-Board Computer Design 
47 
 
that was initially loaded to the FPGA from the platform configuration flash. The 
scheduled maintenance procedure is shown in Figure 3-10. A configuration timer 
would trigger timeouts to the configuration controller which would read the 
configuration frames from an external flash memory and write it back to the FPGA 
through the internal ICAP. 
 
Figure 3-10. Scheduled Maintenance – Reload from External Flash. 
Another approach is to trigger a complete reboot for the FPGA system to reload the 
whole configuration stream from the platform configuration flash as in Figure 3-11. 
 
Figure 3-11. Scheduled Maintenance – Reload from Platform Configuration Flash. 
Chapter 3 Multi-Processor System-on-Chip On-Board Computer Design 
48 
 
3.2.2 Emergency Maintenance  
Emergency maintenance as shown in Figure 3-12 depends on being more responsive 
than the scheduled maintenance. Any SEU in the configuration frames would be 
detected by performing a scan on the whole configuration frames. The read-back of 
the frames is done by a dedicated logic which runs independent of the FPGA design. 
This technique makes use of the read-back CRC logic primitive available for Virtex 
5. The read-back CRC logic reads the configuration frames through the ICAP 
primitive and the CRC values for the read frames are calculated through the 
frame_ECC primitive. The CRC syndrome is used to locate the error in in the frame 
bitstream, if any.  
SBU can be detected and located for correction by simply flipping their values. 
Double bit errors can be only detected and a complete reconfiguration would be 
initiated. In case an MBU exists then the frame_ECC cannot identify their existence 
reliably and the whole bitstream would be invalid. The bitstream integrity can be 
judged through the bitstream CRC value while the individual configuration frames 
integrity is judged through the frame ECC value. In case of uncorrectable errors in 
the bitstrem then a full reconfiguration might be needed if the errors lead to serious 
design failures. 
 
Figure 3-12. Emergency Maintenance. 
Chapter 3 Multi-Processor System-on-Chip On-Board Computer Design 
49 
 
The scan time of the total configuration frames depend on the FPGA configuration 
array size and the scan clock rate. The clock can be an external clock or the FPGA 
can make use of its internal oscillator. The use of the internal oscillator is meant to 
avoid the external circuitry and connections that might be a source of erroneous 
operation due to design, routing, wiring mistakes or even components wear out. The 
internal ICAP can operate at a maximum frequency of 100 MHz. The SBU, once 
detected in any frame, would be corrected by flipping the faulty bit, while if an MBU 
is detected then it would not be corrected. The correction takes place by writing back 
the correct frame through the ICAP interface. The control signals of the frame_ECC 
shows the interpretation of the error condition using the syndrome valid, ECC error, 
CRC error and syndrome signals. 
3.2.3 Running Repairs  
In case errors cannot be permitted for the maximum detection time, when the 
scheduled scans would run, then the running repairs strategy can be used. The 
running repairs depend on the availability of redundant modules that would run 
either simultaneously or consecutively to provide redundant results as shown in 
Figure 3-13 and Figure 3-14.  
S
el
ec
tio
n/
A
ct
iv
at
io
n 
of
 
R
ed
un
da
nt
 M
od
ul
e
 
Figure 3-13. Running Repairs – Selection of an Active Module. 
The time between the error occurrence in configuration bits and the detection of that 
error can be sufficient for fault propagation and hence the initiation of a system wide 
failure. The application and the design details would define whether the system is 
Chapter 3 Multi-Processor System-on-Chip On-Board Computer Design 
50 
 
susceptible to enter any of its failure modes due to delayed detection of bit upsets or 
not. If the system should be repaired immediately, in order not to have a system wide 
failure, and if the repair should be executed without resetting the system, in order not 
to lose critical data, then redundant modules are the perfect choice. The use of the 
redundant modules would allow overcoming the problem that might happen in the 
configuration bits of any of the individual modules. As it is very unlikely to have bit 
upsets taking place at the redundant modules configuration frames at the same design 
locations on the same time.  
 
Figure 3-14. Running Repairs – Active Modules with Voting. 
In real avionics systems, combination of the aforementioned strategies is typically 
implemented. Scheduled maintenance takes place regularly over larger time intervals. 
In standby modes emergency maintenance through read-back CRC can be used to 
avoid error accumulation. While during critical operation of the system, where any 
disturbance to the system or halt to reconfigure is not allowed, the running repairs 
strategy can be used. Xilinx recommends using a hybrid mitigation strategy and it 
ensures that SEU induced errors can be actively mitigated through configuration 
frames scrubbing. In the next section we present the details of the MPSoC design and 
implementation. 
3.3 MPSoC OBC System Design 
The MPSoC technology is used to develop high performance embedded systems in a 
miniaturized form. Our design for the MPSoC OBC was conducted using Xilinx 
Embedded Development Kit (EDK) software package. We designed the hardware 
using the EDK - Xilinx Platform Studio (XPS) while we designed the software using 
Chapter 3 Multi-Processor System-on-Chip On-Board Computer Design 
51 
 
the EDK - Software Development Kit (SDK). Also, we used the Xilinx ISE design 
suite to design a top level schematic to integrate the MPSoC design from EDK with 
the SEU controller IP core. The top level schematic was developed to achieve design 
fault tolerance through mitigating the FPGA configuration frames via internal 
scrubbing. 
EDK provides variety of IP cores that can be used to develop complete embedded 
systems with different architectures to fit for the specific design requirements. The 
block diagram of the OBC implemented in the Xilinx Virtex5 LX50 FPGA is shown 
in Figure 3-15. The system consists of 4 MicroBlaze soft IP core processors. Each 
processor is interfaced to a BRAM where the initial boot loader, system initialization 
program code, and application code are stored. The BRAM size is 32 Kbytes and can 
be increased up to 64 Kbytes or more in larger size FPGAs. The LX50 is pin 
compatible with larger size FPGAs such as the LX85. This gives the advantage of 
replacing the FPGA without changing the design to get larger resources when needed. 
All the microprocessors are connected to the MicroBlaze Debug Module (MDM) IP 
core for software debugging. The debugging interface with the external software 
Integrated Development Environment (IDE) is of JTAG type. The Xilinx USB/JTAG 
cable is used to connect to the system during debugging and configuration flash 
programming.  
The configuration bitstream generated by the Xilinx EDK package, using the ISE 
Design Suite tools to compile, synthesize, map, place and route the design, is stored 
in the configuration flash memory. The configuration flash memory might have 
Serial Peripheral Interface (SPI) or Byte Peripheral Interface (BPI) with the FPGA. 
Whenever the FPGA is restarted, its bitstream will be reloaded from the 
configuration flash memory.  
Each processor has got its own Processor Local Bus (PLB). The processor uses the 
PLB to communicate with its peripherals which are connected directly as slave 
terminals on the bus. The processor is the master terminal on the PLB. The 
arbitration between the slave peripherals, requesting the bus, is granted through the 
PLB bus arbiter. The arbitration policy is configurable. For example it can be set to 
daisy chain or priority based arbitration. 
C 
The 4 Mi
size of ea
asynchron
through th
processors
Three Mic
user code
processor 
system m
processors
based on 
hapter 3 M
croBlaze pr
ch mailbo
ous read/w
e PLB bus
 together. 
roBlaze pr
. These ar
is labeled 
aster which
. Each pro
an interrupt
ulti-Proces
ocessors c
x is 512 b
rite operati
 of each pr
Figure 3-15
ocessors ar
e the proc
as the Mas
 handles t
cessor of th
 received b
sor System-
ommunicat
ytes. Each
ons. The m
ocessor. Th
. MPSoC O
e considere
essors labe
ter System
he monito
e three us
y a Watch
on-Chip O
e together 
 mailbox i
ailboxes co
ere are 6 m
BC Block 
d as applic
led MB1, 
 Control P
ring functi
er processo
dog Timer 
n-Board Co
through FIF
s bidirectio
nnect betw
ailboxes co
Diagram. 
ation proce
MB2, and
rocessor (M
on on the 
rs generate
(WDT) con
mputer De
O mailbox
nal and c
een the pr
nnecting a
ssors which
 MB3. Th
SCP), Th
operation 
 a heartbea
nected to 
sign 
52 
es. The 
an have 
ocessors 
ll of the 
  
 run the 
e fourth 
is is the 
of other 
t signal 
its PLB. 
Chapter 3 Multi-Processor System-on-Chip On-Board Computer Design 
53 
 
When the WDT times out, it sends a signal to an Interrupt Controller (INTC) to wake 
up the processor in concern. The WDT timer interval is set at the design time and can 
be calculated from Eq. (3-7). 
WDT୍ ୬୲ୣ୰୴ୟ୪ = 2୬ × Clock_Period	
Where the Clock_Period is:  
Clock୔ୣ୰୧୭ୢ = 	
1
System_Clock_Frequency	
(3-7) 
The WDT interval can be controlled by setting the value of (݊) at design time. The 
MSCP implements software WDTs to monitor the three user processors. These 
software-based WDTs at the MSCP are reset based on receiving the heart beat signals 
labeled HB(1), HB(2) and HB(3). These signals are sent through the parallel General 
Purpose Input/Output IP core (XGPIO) of each MicroBlaze processor to the XGPIO 
of the MSCP. In order to provide different paths to report the heart beat signals, as 
they are crucial to the operation of the system, a backup path is provided. The 
MicroBlaze processors send an Acknowledge (ACK) response based on a request 
issued by the MSCP as shown in Figure 3-16. The MSCP regularly invokes the slave 
processors using the (Ping) request. The slave processors should respond to the 
(Ping) request by issuing an (Ack) response. The response should be received by the 
MSCP within a specified time interval. If the time interval exceeds the limits set 
without receiving the (Ack) response, the MSCP resets the slave processor by 
sending a reset signal to the slave processor reset module. 
The MSCP itself reports its heart beat signal to an external WDT, The MAX6369. 
The external WDT sends a reset signal to the Crydom MPDCD Solid State Relay 
(SSR). The reset signal is an active low signal. It is pulled up and kept high via a 
pull-up resistance connected to the supply voltage. Whenever the MSCP fails to reset 
the external WDT (MAX6369), the reset signal will be set to low and thus 
disconnecting the power supply from the FPGA board leading to power recycling.  
Chapter 3 Multi-Processor System-on-Chip On-Board Computer Design 
54 
 
 
Figure 3-16. Ping/Ack Watchdog Timer [59]. 
The XC5VLX50 FPGA has an internal thermal diode that can be used to read the die 
temperature. Reading the die temperature can be done through the system-monitor IP 
core or through an external temperature measurement IC over IIC interface as shown 
in Figure 3-17 [60] [61]. The measured temperature is used to facilitate DTM to 
control the operating temperature of the FPGA as will be described in chapter 6. 
When using the MAX6655 IC, It is important to notice that the standby pin should be 
pulled up high otherwise there will be no temperature readings. 
 
 
Figure 3-17. Interface with Temperature Measurement IC. 
The MSCP can reset any of the slave processors when it does not receive the heart 
beat signal from it. The slave processors can also reset themselves through their local 
Chapter 3 Multi-Processor System-on-Chip On-Board Computer Design 
55 
 
watchdog timers as a backup to the reset by the MSCP. Whenever any of the slave 
processors is reset a (RESET) message is sent through the mailbox after boot-up of 
the slave processor to the MSCP. The slave processor which was just reset would 
copy the current state of the other processors from the MSCP which saves a copy 
regularly of the slave processors essential data. The code written on each processor is 
recommended to be in a state machine form where the state code, input vector, output 
vector and variables’ values can be stored in one task stack and exchanged with the 
MSCP to store it as a system restoration point. To store the recovery data, the MSCP 
has an external MRAM which acts as a flash memory and SRAM at the same time. 
The interface with the external MRAM is carried out through the External Memory 
Controller IP core (EMC).  
The processors exchange data with each other to make sure that their calculations are 
correct. They make use of the TMR with feedback concept. Figure 3-18 shows the 
concept of data exchange among the processor nodes in the MPSoC OBC to execute 
TMR with feedback. Whenever a slave processor generates a data vector, it 
exchanges it with the other user slave processors, to execute voting according to Eq. 
(3-8) . 
Data୴୭୲ୣ = Dataଵ	&	Dataଶ +	Dataଶ	&	Dataଷ +	Dataଵ	&	Dataଷ	
(3-8) 
The voting is bitwise so that the logic operations of the (AND) and (OR) in Eq. (3-8) 
are carried out on the corresponding bit locations in the three (Data) vectors.  
 
C 
Figu
The copie
the MSCP
restoration
process is
master pro
on each o
implemen
Figure 3-1
The voted
guarantee 
consensus
consensus
data store
case of fa
regularly 
processor 
processor 
processors
scores. T
hapter 3 M
re 3-18. TM
s of the sou
 to vote o
 point for
 carried ou
cessor. Thu
f the system
ted on the d
9.  
 data vec
correct op
 comes fro
 comes fro
d at the M
ilures [30]
is called c
at the MS
generated 
. The proc
he process
ulti-Proces
R with Fe
rce data w
n it. The 
 slave pro
t through s
s the voter
 processo
ata genera
tor is used
erations ba
m voting 
m voting o
SCP MRAM
. Creating 
heck-pointi
CP. The sc
data that h
essors are 
or with th
sor System-
edback am
hich were 
MSCP sav
cessor(s) re
oftware co
 software fu
rs. In gene
ted by any 
 instead o
sed on res
on the red
n the resul
 is used a
a restorati
ng [27]. A
ore card c
ad consen
classified 
e highest 
on-Chip O
ong the Da
voted at ea
es the voti
covery in 
de at each 
nction is lo
ral, the vot
TMR-based
f the loca
ults consen
undant un
ts of repea
s an RB to
on point a
 score-card
ounts the 
sus with a
as being t
score reco
n-Board Co
ta of the MP
ch slave pr
ng result i
case of fa
slave proce
cated with
ing process
 processin
lly genera
sus. In spa
its. In tim
ted executi
 roll back
t the MSC
 can be k
numbers at
t least one
rustworthy 
rds the h
mputer De
SoC Node
ocessor are
n its MRA
ilures. The
ssor as we
in the code
 is preferr
g logic as s
ted data v
ce redunda
e redundan
ons [27] [2
ward the sy
P and upd
ept for eac
 which eac
 of the ot
according 
ighest trus
sign 
56 
 
s. 
 sent to 
M as a 
 voting 
ll as the 
 running 
ed to be 
hown in 
ector to 
ncy the 
cy, the 
9]. The 
stem in 
ating it 
h slave 
h slave 
her two 
to their 
tworthy. 
Chapter 3 Multi-Processor System-on-Chip On-Board Computer Design 
57 
 
Therefore, in case of absence of consensus among the three processors, the data from 
the processor with the highest trustworthy can be used by all of the processors. The 
best way in case of absence of consensus is to roll back the system via the MSCP. 
The process of check-pointing was studied thoroughly to define the basis of 
theoretical analysis for estimating the optimum check-pointing rate [27]. In our 
MPSoC design, check-pointing depends on the user applications running on the slave 
processors. Check-pointing rate changes according to the type of the processes 
running on each of the slave processors as well as the rate of generating critical new 
data from these processes. Therefore, the check-pointing is an application dependent 
process that needs to be optimized based on the specifics of each application.  
 
Figure 3-19. Distributed Voting Function at Redundant Processors [59]. 
The configuration memory of the SRAM based FPGA is scrubbed continuously 
using the SEU controller IP core provided by Xilinx [17][18]. The SEU controller 
can be used to inject faults in the configuration bitstream for the purpose of testing as 
well as scrubbing for correcting SBUs. It cannot handle the correction of MBUs in a 
single configuration frame. 
The configuration frame of the Xilinx Virtex5 LX50 consists of 1312 bits of 
configuration bitstream [37]. The soft IP core PicoBlaze microcontroller is used 
within the SEU controller IP core to handle the reading of the configuration frames 
through the Internal ICAP IP core. The ICAP can directly access the configuration 
frames SRAM in the FPGA by reading and writing to them. The read frames are sent 
to an ECC decoder for error checking and correction. The ECC calculates the error 
C 
syndrome 
can be co
blocks use
The SEU
recommen
The frame
any confi
notificatio
MBUs are
critical. T
tools. It is
Xilinx les
critical an
The whole
which gen
necessary 
were corre
algorithm 
hapter 3 M
to detect th
rrected by
d in detect
 controller
ded by Xil
 ECC has t
guration f
n and it a
 detected, 
he critical 
 important 
s than 20%
d might lea
Figur
 bitstream 
erates an e
to check th
cted when
to correct S
ulti-Proces
e location 
 toggling i
ing and cor
 IP core o
inx. 
he capabili
rame, the 
utomaticall
the FPGA
bit inform
to notice th
 and typic
d to design 
e 3-20. Fun
stored in th
rror syndro
e integrity 
ever they a
BUs.  
sor System-
of the error
ts value. F
recting the 
perates at
ty of SECD
SEU cont
y stops the
 can be le
ation can b
at not all b
ally less th
failure. 
ctional Blo
e SRAM o
me whenev
of the bitstr
re detected
on-Chip O
 bit. Once t
igure 3-20
SBUs in th
 50 MHz 
ED. In ca
roller gene
 Auto Cor
ft without 
e analyzed
its are critic
an 10% of
cks for Cor
f the FPGA
er the bitst
eam as a w
 or not. Fig
n-Board Co
he error loc
, shows th
e FPGA co
external c
se of doubl
rates an u
rection Mo
resetting if
 at design
al to the de
 the upsets
recting SB
 is protecte
ream conta
hole and to
ure 3-21 sh
mputer De
ation is det
e major fu
nfiguration
lock frequ
e errors det
ncorrectab
de (ACM)
 the bits w
 time using
sign. Acco
 are consid
Us. 
d by the CR
ins an error
 know if th
ows the sc
sign 
58 
ected, it 
nctional 
 frames. 
ency as 
ected in 
le error 
. When 
ere not 
 Xilinx 
rding to 
ered as 
 
C code 
. This is 
e errors 
rubbing 
C 
The system
3-3. The O
70%. Mor
are used a
Slice Log
Number 
Number 
Number 
Number 
Total Me
 
The powe
MPSoC d
considerab
technology
BRAMs a
hapter 3 M
 resources
ccupation 
e than half 
s well as th
Table 
ic Utilizati
of Slice Re
of Slice LU
of Occupie
of BRAM/
mory Used
r consump
esign by th
le amount 
 devices fr
nd IOs is q
ulti-Proces
Figure 
 are estima
of the slice
of the LUT
e registers i
3-3. FPGA
on 
gisters 
Ts 
d Slices 
FIFO 
 (KB) 
tion, for e
e XPower 
of the pow
om Xilinx.
uite few. T
sor System-
3-21. Scrub
ted based o
s in the FP
s which imp
nside the lo
 used resou
Used
16,36
18,76
7,07
42
1,51
xample w
analysis to
er is dissip
 The dynam
he second 
on-Chip O
bing Algor
n the Xilin
GA is up to
lement the
gic slices. 
rces in the 
 A
2 
3 
6 
2 
hen using 
ol from Xil
ated in the 
ic power u
source of h
n-Board Co
ithm. 
x EDK too
 98%. Mem
 combinatio
 
MPSoC des
vailable 
28,800 
28,800 
7,200 
60 
2,160 
LX50T, is
inx as show
form of le
sed in the 
igh power 
mputer De
l as shown 
ory usage
nal logic f
ign. 
Utiliz
56
65
98
70
70
 estimated 
n in Tabl
akage in th
logic gates,
consumptio
sign 
59 
 
in Table 
 is up to 
unctions 
ation 
% 
% 
% 
% 
% 
for the 
e 3-4. A 
e 65 nm 
 signals, 
n is the 
Chapter 3 Multi-Processor System-on-Chip On-Board Computer Design 
60 
 
clocks generated in the FPGA then the Phase Locked Loop (PLL) circuitry. The 
leakage, clocks and PLLs consume about 85% of the total consumed power. Each 
processor runs at 100 MHz, while if the clock is reduced to smaller value the 
dynamic power consumption would reduce as well. The total power consumption of 
the OBC system would be higher than the shown hereinafter as it would include 
other components rather than the FPGA. Also it would change according to the 
operating temperature and FPGA size. 
Table 3-4. Power Estimation of the MPSoC design. 
On-Chip Multi-core Power(W) 
Clocks 0.347 
Logic 0.007 
Signals 0.014 
BRAMs 0.096
DSPs 0.000 
PLLs 0.263 
DCMs 0.068 
IOs 0.002 
Leakage 0.460 
Total 1.256 
 
The schematic output of the EDK package is shown in Figure 3-22. Four processors 
are represented in the schematic with the four large rectangles. The rest of 
components are the peripherals connected to each of the processors. The design in 
EDK is implemented through 3 steps: 1) Selecting the IP cores to be used from the IP 
catalog, 2) Connecting the IP cores to the PLB interface, 3) Connecting the IP cores 
through their ports interface to other peripherals, and 4) Defining the address map of 
the IP cores in the system. The compilation of the system to generate the bitstream 
files are done through the Xilinx ISE Design Suite tools.  
Chapter 3 Multi-Processor System-on-Chip On-Board Computer Design 
61 
 
 
Figure 3-22. Schematic of the MPSoC Design in the XC5VLX50-1FFG676C FPGA. 
The hardware bitsteram is exported to the SDK tool for software development. 
Software is developed for each processor using its Board Support Package (BSP) and 
software libraries. The BSP contains the device drivers of the peripherals used in the 
design. The executable files are either programmed to the BRAM or the external 
memory depending on their sizes. The file generated after the hardware design, 
which is called the (system.bit), is used by the SDK together with BRAM memory 
map file (system.bmm) and the executable code files (*.elf) to generate the download 
file (download.bit) file. Programing of the FPGA can be done through SDK, EDK, 
ISE or Impact. 
The technical specifications of the MPSoC OBC design are shown in Table 3-5. The 
interfaces can be changed according to the satellite system bus requirements. 
Typically RS232, IIC and GPIO are used. Other interfaces include Controller Area 
Network (CAN), MIL-1553 bus and Spacewire. Application memory can be 
extended however the power consumption would increase as well. We use the 
MPSM architecture therefore we recommend an external separate MRAM for each 
Chapter 3 Multi-Processor System-on-Chip On-Board Computer Design 
62 
 
of the slave processors. They can use an internal BRAM for booting-up, such as 
using it for the boot strap purpose. The MSCP can use the internal BRAM as it does 
not execute large code however it needs an external MRAM for storing the recovery 
data. The overall power consumption should be less than 5 Watt to be adequate for 
use in small as well as nano-satellites with mass up to 50 kg and 100 Watt of 
generated power. Collected scientific data from the satellite payload can be stored via 
the OBC in the storage memory. The memory size is customizable according to the 
satellite mission. The power consumed by the storage memory should be considered 
when estimating the total power consumption of the OBC system. The OBC 
operating temperature is defined at the interface, attachment point, between the OBC 
and the satellite chassis. This temperature is used as a basis in the thermal vacuum 
testing in chapter 6. It should be respected in the thermal design of the whole 
satellite.  
Table 3-5. FPGA Based MPSoC OBC Specifications. 
Weight < 0.5 kg 
Power 
Total < 5 Watt 
FPGA 1.256 Watt, MRAM 0.6 Watt x 4, 
Others 0.3 Watt 
Size 10 cm x 10 cm x 0.5 cm 
Interfaces RS232, IIC, GPIO 
Application Memory Int. 32 Kbytes, Ext. MRAM 2 Mbytes 
Throughput 100 MIPS x 4 cores 
Storage Memory for Scientific Data Customizable 
Number of Processor Cores 4 cores 
FPGA Type SRAM Based 
Operating Temperature -24˚C to +60˚C 
Target Orbits Low Earth Orbit 
The total throughput is concerned with the total number of operating cores. The 
MicroBlaze processor uses Reduced Instruction Set Computer (RISC) architecture. 
The throughput of RISC processors is almost equivalent to its operating frequency. 
MicroBlaze processor operating at 100 MHz would have an approximate 100 MIPS 
throughput. If the processor cores are 4 then the total throughput would be 100 
MIPS. This throughput is reached if the processors work cooperatively through load 
balancing to share the tasks execution. If the processors work individually to achieve 
Chapter 3 Multi-Processor System-on-Chip On-Board Computer Design 
63 
 
TMR based fault tolerance among three of them and a master control as the fourth 
processor then the effective throughput would be 100 MIPS. The target orbit was 
selected to be LEO due to the fact that we are designing the system using commercial 
grade FPGA. Our radiation analysis and tests in chapter 5 are based on the 
simulations for a satellite at LEO. Higher altitude orbits require different radiation 
analysis and higher level of protection against SEEs as well as accumulated doses. 
These orbits might require a different grade FPGA, typically space grade. 
The floor plan of the FPGA as generated by PlanAhead tool from Xilinx is shown in 
Figure 3-23. The design is using 98% of the FPGA resources. This percentage of 
utilization would be considered good if there is no intention to put additional logic to 
implement extra functions in the same package. In case a larger package is needed 
then the LX85 FPGA can be used to hold the extra design functions but take into 
consideration that the static and dynamic power consumptions would increase as 
well. The LX85 contains about double the resources of the LX50. 
 
Figure 3-23. FPGA Floor Plan of the MPSoC OBC. 
Chapter 3 Multi-Processor System-on-Chip On-Board Computer Design 
64 
 
The design Microprocessor Hardware Specification file (MHS) is presented at 
Appendix (A) and the top level schematic at Appendix (B).  
In this chapter we presented the design of the MPSoC OBC. We introduced the 
taxonomy of processor-memory interface which we call it Abu Zaid Taxonomy of 
fault tolerant processor-memory interfaces. It was cited at the small spacecraft 
technology state-of-art report by NASA [14]. We selected the MPSM with non-
shared memories architecture and carried out the design based on it.  
In the next chapter we introduce a detailed analysis and comparison between the time 
and space redundancy approaches which can be used in single and multi-processor 
systems.  
 
 Chapt
ev
re
th
consensus
parallel at
time redu
However, 
the other h
the case o
In order t
carry out 
to be used
to sort the
dispatchin
Bubble so
exerted in
we guaran
to get the 
It is impo
slave pro
processor,
The test w
system an
1) Perform
2) Power c
3) FPGA a
S 
C
 : Ser 4
eral techni
dundancy 
ree times 
. The space
 the same t
ndancy wo
it should h
and, reliab
f time and s
o understan
a comparati
 as the benc
 command
g to other
rt techniqu
 sorting the
tee to have
statistical d
rtant to not
cessors to 
 time redun
hich we co
d the triple 
ance in ex
onsumptio
rea utilizat
hapter 4 Sp
pace an
ques can b
technique u
to compar
 redundanc
ime and fin
uld have 
ave smaller
ility of the
pace redun
d the trade
ve analysis
hmark for 
s accordin
 subsystem
e would h
 data. Ther
 random ex
istribution.
ice that our
implemen
dancy is us
nducted w
processors 
ecuting the 
n both stati
ion. 
ace and Ti
d Time
e used to i
ses a sing
e between
y technique
ds consens
longer tim
 power con
 FPGA bits
dancy, depe
offs in usin
 based on a
the perform
g to their 
s, a sortin
ave execut
efore, by ge
ecution tim
 MPSoC d
t TMR w
ed as there 
as compari
space redun
bubble sort
c and dynam
me Redunda
 Redun
mplement 
le process
 the result
 uses three
us among 
e in execu
sumption a
tream again
nds on the 
g space an
 benchmar
ance. As th
execution t
g algorithm
ion times
nerating ra
es. The exe
esign uses 
ith feedba
is only one
ng the sing
dancy syst
 algorithm.
ic. 
ncy Trade
dancy
fault toleran
or and exe
s of execu
 hardware u
their result
tion than 
nd smaller 
st induced
scrubbing r
d time red
k. We selec
e main fun
imes for fu
 would fi
that vary b
ndom vecto
cution tim
space redun
ck. Howev
 instance of
le processo
em from fo
 
-offs 
 Trade-
t OBCs. T
cutes the c
tion to ch
nits which
s. It is obvi
space redu
area utiliza
 soft errors
ate. 
undancy w
ted the bub
ction of the
rther exec
t as a ben
ased on th
rs of unsor
es are then 
dancy betw
er, in the
 the MSCP
r time red
ur perspect
65 
offs 
he time 
ode for 
eck for 
 work in 
ous that 
ndancy. 
tion. On 
, both in 
e had to 
ble sort 
 OBC is 
ution or 
chmark. 
e effort 
ted data 
grouped 
een the 
 master 
. 
undancy 
ives:  
Chapter 4 Space and Time Redundancy Trade-offs 
66 
 
4) Soft error reliability. 
The Xilinx Virtex-5 XC5VLX50T FPGA in the Gensys kit from digilent was used in 
the space and time redundancy evaluation as the development kit had multiples of 
interface options. However, The LX50T and LX50, which is used in the actual 
MPSoC, are of the same speed and both are of commercial grade. They have the 
same number of CLB arrays, 6-input LUTs, distributed RAMs, shift registers and flip 
flops. In this chapter we study the details of time and space redundancy trade-offs 
and analytically estimate the in-orbit soft error reliability. That study would provide 
an understanding of the differences between space and time redundancy when 
implemented in 65 nm SRAM-based FPGAs. It would help in selecting the suitable 
fault tolerance approach according to the operating conditions. Some sections of this 
chapter were part of the author publications at [62]. 
4.1 Single Processor Design 
Two separate systems were implemented to test the trade-offs of using time and 
space redundancy in the Virtex- 5 LX50T FPGA: 1) single processor system, and 2) 
Multi-Processor system. The Multi-Processor system design is identical to the 
MPSoC previously described in section 3.5. In the single processor system, as shown 
in  
Figure 4-1, a MicroBlaze Processor is connected to a Local Memory Bus (LMB) 
where a local BRAM is attached. The processor is connected to other peripherals 
through the PLB. A WDT is used to reset the system in case the processor stopped 
working. The processor receives an interrupt from the INTC that the WDT finished 
counting and should be reinitialized. If the processor was working and did not hang 
up, it will respond to the WDT interrupt and will reset it. In case the processor 
stopped working for any reason, it will not respond to the WDT interrupt. The WDT 
will send a reset request to the reset module. The reset module will then reset the 
whole system including the processor.  
 
Chapter 4 Space and Time Redundancy Trade-offs 
67 
 
 
Figure 4-1. Single Processor System. 
The system contains a system control processor to handle the check pointing of the 
current status, and restoring the system operation to the last known good status in 
case of reset. The communication between the system control processor and the 
MicroBlaze processor takes place through the mailbox IP core. Two Timers exist in 
the same timer IP core, one of them is used for measuring the execution time of the 
code in this experiment. The MDM port is used to debug the software application 
running on the MicroBlaze processor and the system control processor. Xilinx 
General Purpose Input Output (XGPIO) IP core is used to input and output digital 
signals. It provides a control interface to the outside world. UART is used to send 
and receive serial streams to and from the processor.  
Table 4-1. Single Core and Multi-Core Power Consumptions. 
On-Chip Single Core Power (W) Multi-core Power (W) 
Clocks 0.170 0.347 
Logic 0.003 0.007 
Signals 0.005 0.014 
BRAMs 0.010 0.096 
DSPs 0.000 0.000 
PLLs 0.263 0.263 
DCMs 0.068 0.068 
IOs 0.001 0.002 
Leakage 0.458 0.460 
Total 0.978 1.257 
INTC
MicroBlaze
Processor
MB(1)
Timer XGPIO
BRAM
 Watch Dog 
Timer
Mailbox
UART
MDM Port
 Reset 
Module
 PLB Bus 
Arbiter 
 
 
 
 
JTAG
PLB Bus
System Control
Processor
BRAM
  ICAP
PLB Bus
Chapter 4 Space and Time Redundancy Trade-offs 
68 
 
Table 4-2. Single Core Resources Utilization. 
Slice Logic Utilization Used Available Utilization 
No. of Slice Registers 5,932 28,800 20% 
No. of Slice LUTs 6,866 28,800 23% 
No. of Occupied Slices 3,337 7,200 46% 
Number of BRAM/FIFO 18 60 30% 
Total Memory Used (KB) 648 2,160 30% 
Table 4-3. Multi-Core Resources Utilization. 
Slice Logic Utilization Used Available Utilization 
No. of Slice Registers 16,362 28,800 56% 
No. of Slice LUTs 18,763 28,800 65% 
No. of Occupied Slices 7,076 7,200 98% 
Number of BRAM/FIFO 42 60 70% 
Total Memory Used (KB) 1,512 2,160 70% 
Table 4-1 shows the power consumption of single and multi-core designs. Table 4-2 
shows the resources utilization of the single core system. Table 4-3 shows the 
resources utilization of the multi-processor system. 
4.2 Time and Space Redundancy Check Bench 
The test configuration is shown in Figure 4-2, a random number stream of specified 
length, in this case it is 100 numbers, is generated by random number generation in 
MATLAB. It is then sent to the processors through the serial port RS232 interface. 
The processors save the received vector and wait for a signal to start the bubble sort 
algorithm. When the start signal is issued by the MATLAB script, the processors 
start their local timers to calculate their execution speeds and then initiate the Bubble 
sort algorithm execution. 
 After the 
the value
mailboxes
processor 
is sent to t
trials are e
In case o
memory. T
in three d
stored in 
voting pro
times of e
the mean 
the flowch
C
Figure 4
bubble sort
s of the s
. Cross vo
is stopped 
he MATLA
xecuted.  
f single pr
he bubble 
ifferent me
the local m
cess, are s
ach process
and the var
art of the te
hapter 4 Sp
-2. Time an
 finishes ex
orted vecto
ting takes 
after the cr
B script. T
ocessor, it 
sort functio
mory loca
emory. Th
ent from th
or are stati
iance of th
st algorithm
ace and Ti
d Space Re
ecution on
rs among 
place at e
oss-voting 
he test cyc
stores thre
n runs thre
tions. The 
e values of
e processo
stically ana
e execution
 in case o
me Redunda
dudnacy Te
 each proce
each othe
ach proces
and the exe
le continues
e copies o
e times an
voting tak
 the execu
r to the M
lyzed by th
 times. Fig
f single pro
ncy Trade
st Configu
ssor, the p
r through 
sor. The l
cution tim
 until the r
f the data 
d the sorted
es place b
tion times,
ATLAB sc
e MATLAB
ure 4-3 an
cessor and 
-offs 
 
ration. 
rocessors e
the inter-p
ocal timer 
e of each p
equired num
vector in 
 vectors ar
etween the
 after finis
ript. The ex
 script to c
d Figure 4-
triple proce
69 
xchange 
rocessor 
at each 
rocessor 
bers of 
its local 
e stored 
 vectors 
hing the 
ecution 
alculate 
4, show 
ssors. 
 C
Fig
hapter 4 Sp
ure 4-3. Tim
ace and Ti
e Redund
me Redunda
ancy - Sing
ncy Trade
 
le Processo
-offs 
r. 
70 
Chapter 4 Space and Time Redundancy Trade-offs 
71 
 
 
Figure 4-4. Space Redundancy - Triple Processors. 
The results of execution and comparison between the single and triple processors 
systems in terms of time and space redundancy are presented in chapter 7. Next we 
present how to evaluate the soft error reliability for both systems. 
Initialize Platform
Receive Random Stream 
Start Timer
Bubble Sort
Send Sorted Vector to other 
Processors through Mailboxes
Receive Sorted Vectors of Others 
through Mailboxes
Majority Voting
Stop Timer
Send Timing Vector to MATLAB
Repeat 
Until no 
other 
Vectors
Chapter 4 Space and Time Redundancy Trade-offs 
72 
 
4.3 Estimation of Soft-Error Reliability 
Investigation of in-orbit SEU rates should be used when estimating the reliability of 
the design against soft errors in the configuration bitstream. SEU rates can be 
accurately investigated using accelerator radiation testing however it needs complex 
test procedure and setup. The Xilinx Virtex-5 XC5VLX50T FPGA in the Gensys kit 
from digilent was used as an example to estimate the soft error reliability. It consists 
of approximately 11.37 Mbits of effective configuration cells. The device contains 
355,190 configuration words and each word contains 32 bits (total of 11,366,080 
bits). The procedure hereinafter is general and can be used for any SRAM-based 
FPGA device. In order to approximately estimate the reliability of the design, in the 
worst case, we multiply the percentage of used resources by the device size in Mbits. 
In calculating the effective configuration bits size of the XC5VLX50T FPGA device, 
only relevant configuration cells which would remain unchanged are included. No 
block memory contents are included in the calculation as their data might change 
during the design operation. Therefore the (effective configuration image size) is 
11.37 Mbits without the block memory contents. The full configuration image size 
which contains the block memory contents is larger than 11.37 Mbits.  
To estimate the expected reliability of the Virtex-5 XC5VLX50T device, as an 
example, while operating in space, we apply the following steps: 
1) Define a target orbit upon which the estimations would take place. Use the target 
orbit data to calculate the trapped proton and electrons spectrum using the AP-8 
and AE-8 models. 
2) Use the static proton and heavy ion test results of the device, if they exist, or use 
the results of another device from the same technology, to estimate the (in-orbit 
SEU rates) using the Cosmic Ray Effects on Microelectronics (CREME-96) 
model within the Space Environment Information System (SPENVIS) online 
package.  
3) Calculate the design specific SEU rate, soft error rate in design data, based on the 
design size in bits. 
4) Calculate the system failure rate based on the design specific SEU rate. Assume 
Chapter 4 Space and Time Redundancy Trade-offs 
73 
 
that each SEU would cause a failure in the system operation. According to 
Xilinx, this is a very pessimistic assumption as less than 20%, and typically less 
than 10%, of the configuration bits would cause design failures if they are 
subjected to upsets. 
5) Reliability estimation based on the soft error failure rate (λ) and the time between 
scans of the FPGA configuration memory (T). 
4.3.1 Target Orbit Definition 
We select an orbit that is similar to the missions launched by Kyushu Institute of 
Technology to demonstrate the estimation procedure of the expected system 
reliability. Any other orbit can be selected depending on the target mission 
requirements. The orbit definition affects the expected charged particles fluxes. Table 
4-4 shows the orbital parameters of the Horyu-2 satellite which is used as an example 
to carry out the reliability calculations. 
Table 4-4. Horyu-2 Satellite Orbital Parameters. 
Orbit Type general 
Perigee Altitude 651 km 
Apogee Altitude 671.6 km
Inclination 98.17° 
RAAN 223.04° 
Argument of Perigee 31.95° 
True Anomaly 328.25° 
We use SPENVIS online package [63] to calculate the fluxes of the trapped protons 
and electrons, solar particles, and galactic cosmic ray in the chosen mission orbit. 
The output of the AE-8 and AP-8 models with solar maximum condition is shown in 
Figure 4-5 and Figure 4-6. It is clear that higher fluxes are focused at the South 
Atlantic region in what is known as the SAA. The spectra of the trapped electrons 
and protons energies versus the flux are shown in Figure 4-7 and Figure 4-8. 
Chapter 4 Space and Time Redundancy Trade-offs 
74 
 
 
Figure 4-5. AE-8 Model Trapped Electrons Flux in the Orbit. 
 
Figure 4-6. AP-8 Model Trapped Protons Flux in the Orbit. 
Chapter 4 Space and Time Redundancy Trade-offs 
75 
 
The flux of the trapped electrons energy above 6 MeV is very low. It is almost less 
than 1 cm-2sec-1. The flux of the trapped protons energy at about 300 MeV is about 
10 cm-2sec-1. Therefore, the expected effect of the trapped protons in causing bit 
upsets is higher than the effect of the trapped electrons. The threshold energy level at 
which the bit upsets are noticed depends on the specific device technology and can 
be found through radiation experimental results, static cross section estimation. 
 
Figure 4-7. Trapped Electrons Energy Spectra Versus the Flux. 
Chapter 4 Space and Time Redundancy Trade-offs 
76 
 
 
Figure 4-8. Trapped Protons Energy Spectra Versus the Flux. 
4.3.2 SEU Rate Estimation 
After defining the target orbit and estimating the trapped protons and trapped 
electrons fluxes at different energies using the AP-8 and AE-8, we use the output to 
estimate the SEU rate using CREME-96 model in SPENVIS online package. When 
running the simulations, we assumed that the satellite has an Aluminum shielding 
thickness of 0.5 g/cm2. We used the real experimental results of the static proton and 
heavy ion testing of the XC5VLX50 FPGA device by NASA as inputs to the 
CREME-96 model, in order to estimate the device specific upset rates. 
The system was built on the XC5VLXT50 FPGA device. However, both the 
XC5VLX50 and the XC5VLXT50 FPGA devices are identical in the number of 
slices except that the XC5VLXT50 FPGA contains some extra hard cores for 
advanced serial connectivity. The XC5VLX50 FPGA proton event bit cross-section 
is 8.61x10−14 ± 4.90x10−16 cm2/bit for 200 MeV and 6.37x10−14 ± 1.17x10−15 cm2/bit 
for 65 MeV [64]. 
The heavy ion event bit cross-section test results were fit using weibull curve. It is 
Chapter 4 Space and Time Redundancy Trade-offs 
77 
 
calculated from the number of single ionized particle-induced events in the data set 
(i.e., each MBU counts as one event) [64]. The four parameters of the weibull curve 
are the saturation cross-section or limit (σlim), the threshold or onset energy (Eo), the 
width of the rising portion of the curve (W) and the power that determines the shape 
of the curve (S). From [64] the values of the four parameters are (σlim) or (L=5.73E-
8) cm2/bit, (Eo) or (Lo=0.5) MeV.cm2/mg, (W=15) MeV.cm2/mg and (S=1.5). The 
heavy ion event bit cross-section for the highest tested LET of 68.3 MeV.cm2/mg is 
5.73x10−8cm2/bit [64]. 
The SEU rate along the first 14 orbits, about 23 hours of flight, is shown in Figure 
4-9. The peaks represent the rate at the SAA region. We calculate the worst case 
reliability based on the SEU rate at the SAA regions of the orbit (highest peak value 
corresponds to worst case) as well as the non-SAA regions (rest of the orbit). The 
highest SAA SEU rate is 3.73E-10 bit-1sec-1 and the non-SAA SEU rate is 2.07E-10 
bit-1sec-1. 
 
Figure 4-9. In-orbit SEU Rates Estimated by CREME-96 and based on Real 
Radiation Test Results of the Xilinx XC5VLX50 FPGA. 
Chapter 4 Space and Time Redundancy Trade-offs 
78 
 
4.3.3 Design Specific SEU Rate 
The single core and multicore embedded systems slices utilization, design sizes and 
upset rates are shown in Table 4-5. Each MicroBlaze system occupies 26% of the 
FPGA slices. The system control logic occupies 20% of the FPGA slices.  
Table 4-5. Design Size in bits and the Upset Rates. 
  
% of 
Occupied 
Slices 
SEU rate at 
SAA 
sec-1 
SEU rate at 
non-SAA 
sec-1 
Single 
core 
System control 20% 0.0008479 0.00047056 
MicroBlaze 
system 26% 0.0011023 0.00061172 
Total occupied 
slices 46% 0.0019502 0.00108228 
Multi 
core 
System control 20% 0.0008479 0.00047056 
3 MicroBlaze 
systems 3 x 26% 0.0033068 0.00183517 
Total occupied 
slices 98% 0.0041548 0.00230572 
The design specific SEU rate is calculated by multiplying the in-orbit SEU rate at the 
SAA and non-SAA regions (bit-1sec-1) by the design size (bits) which is estimated 
from the (% of occupied slices × FPGA bitstream size (11.37 Mbits)). 
4.3.4 System Failure Rate 
Assuming that each upset event would cause a system failure, which is a very 
pessimistic approach, the system failure rate per day can be estimated from the SEU 
upset rate per sec as in Eq.(4-1):  
ܵݕݏݐ݁݉_ܨ݈ܽ݅ݑݎ݁_ܴܽݐ݁_݌݁ݎ_݀ܽݕ = ܵܧܷ_ܴܽݐ݁ × 86400 
(4-1) 
Typically less than 20% of the design bits would be sensitive to the design operation. 
Table 4-6 shows the system failure rate calculated for the system control processor 
and each MicroBlaze processor system at the SAA and non-SAA regions. The failure 
rates are identical for the system control logic and the MicroBlaze processor system 
in the single and multicore designs as they occupy the same size of the bitstream in 
both designs. Calculations are carried out for two cases: 1) when considering that any 
Chapter 4 Space and Time Redundancy Trade-offs 
79 
 
bit upset would cause a failure and 2) when assuming that only 20% of the bit upsets 
would cause a failure.  
Table 4-6. System Failure Rates. 
 
Failure 
rate at 
SAA 
day-1 
(20% Sensitive 
Bits) Failure 
rate at SAA 
day-1 
Failure rate at
non-SAA 
day-1 
(20% Sensitive 
Bits) Failure 
rate at non-SAA
day-1 
System 
control 73.26 14.65 40.66 8.13 
MicroBlaze 
system 95.24 19.05 52.85 10.57 
4.3.5 Reliability Estimation 
Configuration memory scanning is performed to make sure that the memory which 
contains the design bitstream is free of errors. Configuration frames are readback 
through the ICAP. Each frame has 12 bits of ECC which is capable of SECDED. The 
frame data is checked through an ECC calculator to detect if an error exists or not. In 
case a single bit error exists the error syndrome would locate the error position. The 
error can be corrected by toggling the bit value at the position identified by the 
syndrome. If double errors or more exist in a single frame (i.e., MBU) then the error 
cannot be corrected. The FPGA should be reset if the bit is sensitive to the design. 
Xilinx also provides a CRC primitive to check the integrity of the whole bitstream 
image as well as an SEU controller IP core for bitstream scrubbing. The Scrubbing 
process consists of (scanning) reading back the frames, calculating the ECC, 
correcting single bit errors, and writing back the correct frames. The scan period is 
calculated from Eq.(4-2): 
ܵܿܽ݊_ܲ݁ݎ݅݋݀ = ܶ݋ݐ݈ܽ_ܥ݋݂݊݅݃ݑݎܽݐ݅݋݊_ܹ݋ݎ݀ݏ × 1ܥ݈݇_݂ݎ݁ݍ 
(4-2) 
The maximum scanning rate per day is calculated from Eq. (4-3): 
Chapter 4 Space and Time Redundancy Trade-offs 
80 
 
ܯܽݔ_ܵܿܽ݊_ܴܽݐ݁_݌݁ݎ_݀ܽݕ = 86400ܵܿܽ݊_ܲ݁ݎ݅݋݀_݅݊_ݏ݁ܿ݋݊݀ݏ 
(4-3) 
The XC5VLX50T FPGA contains 355,190 configuration words. When operating at 
50 MHz scanning clock then the scanning period according to Eq. (4-2) is 
(scan_period = 7.1038 msec). The maximum scan rate per day is 12,162,504.58 
scans. This rate is for continuous scanning where a new scan is started just after the 
previous scan finishes. According to Xilinx, the scanning rate is recommended to be 
at least 10 times the upset rate. As scrubbing takes place, we assume the maximum 
lifetime of any failure to be equal to the time between complete scans, call it the scan 
cycle time. The reason for this assumption is due to the fact that the errors in the 
configuration stream would be discovered during the scanning process and corrective 
action would be taken, either by resetting the FPGA or correcting the upset if 
possible. Therefore the lifetime of any error is limited by the scan cycle time. The 
basic formula to estimate the reliability is shown in Eq. (4-4): 
ܴ = ݁ିఒ் 
(4-4) 
Where (λ) is the soft error failure rate in unit of time and (T) is the maximum soft 
error lifetime which is equal to the time between scans. The reliability of the single 
core system is estimated based on Eq. (4-5): 
ܴ௦௜௡௚௟௘_௖௢௥௘ = ܴ௦௬௦௧௘௠_௖௢௡௧௥௢௟_௟௢௚௜௖ × ܴெ௜௖௥௢஻௟௔௭௘_௦௬௦௧௘௠ 
(4-5) 
The reliability of the multicore system is estimated from Eq.(4-6) and Eq.(4-7): 
ܴ௠௨௟௧௜௖௢௥௘_௦௬௦௧௘௠ = ܴ௦௬௦௧௘௠_௖௢௡௧௥௢௟_௟௢௚௜௖ × ்ܴெோ_ெ௜௖௥௢஻௟௔௭௘ 
(4-6) 
்ܴெோ_ெ௜௖௥௢௕௟௔௭௘ = 3ܴெ௜௖௥௢௕௟௔௭௘_௦௬௦௧௘௠ଶ − 2ܴெ௜௖௥௢஻௟௔௭௘_௦௬௦௧௘௠ଷ  
(4-7) 
From Table 4-6, the highest failure rates are for the system control logic and the 
MicroBlaze system at SAA which are 73.26 and 95.24 failures/day respectively. 
These failure rates correspond to the worst case reliability estimation. The lowest 
Chapter 4 Space and Time Redundancy Trade-offs 
81 
 
failure rates for the system control logic and the MicroBlaze system at the non-SAA 
region, when only 20% of the design bits are considered to be sensitive for bit upsets, 
are 8.13 and 10.57 failures/day respectively. These failure rates correspond to the 
best case reliability estimation. In Figure 4-10, Figure 4-11, and Figure 4-12 we plot 
the reliability curve of the system at the worst case estimations.  
The configuration memory scanning rates are selectable and can be designed to 
achieve the required system soft error reliability. If we use the SEU controller IP core 
from Xilinx, then we would have continuous scanning, thus the time between scans 
would be 7.1038 msec as calculated from Eq.(4-2). At this scanning rate the 
reliability of the single core and multicore systems is equal to 0.999999 as shown in 
Figure 4-12. If a custom configuration stream scanning technique was used through 
the internal ICAP or the external SelectMAP interface then we can estimate the 
reliability of the single and multicore systems based on Figure 4-11. For example 
when the time between scans equals 20 sec, then the single core system reliability is 
about 0.96 while the multicore system reliability is about 0.98. It is important to 
notice that the multicore TMR system provides better reliability than the single core 
system till a certain lifetime (time between scans) , as shown in Figure 4-10 which is 
equal to about 606 sec in this design. After that time the single core reliability 
becomes worse than the multicore TMR reliability. Therefore, it is always desired to 
select the time between scans to be lower than that value.  
Chapter 4 Space and Time Redundancy Trade-offs 
82 
 
0
0.2
0.4
0.6
0.8
1
0
0.2
0.4
0.6
0.8
1
400 800 1200 1600 2000 2400
Single and Multicore Systems Worst Case Reliabilities at SAA  
Single Core Multicore
Si
ng
le
 C
or
e 
Sy
st
em
R
el
ia
bi
lit
y
M
ulticore System
 (TM
R
)
R
eliability
Time Between Scans (sec)
 
Figure 4-10. Single Versus Triple Processor Systems Soft-Error Reliability based on 
in-orbit Estimation. 
Chapter 4 Space and Time Redundancy Trade-offs 
83 
 
0.8
0.85
0.9
0.95
1
0.8
0.85
0.9
0.95
1
20 40 60 80 100
Single and Multicore Systems Worst Case Reliabilities at SAA
Single Core Multicore
Si
ng
le
 C
or
e 
Sy
st
em
R
el
ia
bi
lit
y
M
ulticore System
 (TM
R
)
R
eliability
Time Between Scans (sec)
 
Figure 4-11. Single Versus Triple Processor Systems Soft-Error Reliability based on 
in-orbit Estimation, Time between Scans ≤ 100 sec. 
If an external MRAM is used for storing code or as a platform configuration flash 
then the overall system reliability is estimated from Eq.(4-8). 
ܴ௦௬௦௧௘௠_௦௢௙௧_௘௥௥௢௥ = ܴெோ஺ெ_௦௢௙௧_௘௥௥௢௥ × ܴி௉ீ஺_ௗ௘௦௜௚௡_௦௢௙௧_௘௥௥௢௥ 
(4-8) 
Where, RMRAM_soft_rate is the external MRAM memory soft error reliability while 
RFPGA_Design_soft_error is the soft error reliability of the FPGA digital design. From the 
previous analysis, it is clear that increasing the scrubbing rate of the FPGA 
configuration stream leads to higher system reliabilities. Also, we noticed that the 
TMR multicore system provides better reliabilities than the single core system, in 
case non-continuous scanning is used, as long as it is below a certain value for the 
Chapter 4 Space and Time Redundancy Trade-offs 
84 
 
time between scans. As an example, if we assume the time between scans to be 20 
sec, and MRAM soft error rate to be 0.05 failure/year then we can plot the total TMR 
system reliability as in Figure 4-13. 
0.97
0.975
0.98
0.985
0.99
0.995
1
0.97
0.975
0.98
0.985
0.99
0.995
1
0.007 0.0075 0.008 0.0085 0.009 0.0095 0.01
Single and Multicore Systems Worst Case Reliabilities at SAA
Single Core Multicore
Si
ng
le
 C
or
e 
Sy
st
em
R
el
ia
bi
lit
y
M
ulticore System
 (TM
R
)
R
eliability
Time Between Scans (sec)  
Figure 4-12. Single Versus Triple Processor Systems Soft-Error Reliability based on 
in-orbit Estimation, Time between Scans ≤ 10 msec. 
Chapter 4 Space and Time Redundancy Trade-offs 
85 
 
0.6
0.65
0.7
0.75
0.8
0.85
0.9
0.95
2 4 6 8 10
TMR System Total Reliability
To
ta
l R
el
ia
bi
lit
y
Time, Years  
Figure 4-13. Total System Reliability of TMR System based on in-orbit SEU Rates 
and assuming an MRAM Soft Failure Rate of 0.05 failures/year. 
In this chapter we conducted a tradeoff comparison between the time and space 
redundancy fault tolerance architectures. We compared the architectures in terms of 
the performance, power consumption, utilization and reliability. The results are 
presented in chapter 7.  
In The next chapter, we handle the radiation test at the nuclear accelerator to estimate 
the static cross section using a new method through internal scrubbing. Also, we 
present the fault injection method to estimate the design dynamic cross section.
 Chapt
n thi
estim
dyna
using prot
test at Ky
radioactiv
procedure
de-capsula
to use the 
to estimat
configurat
and fault i
a given o
publicatio
5.1 Rad
Radiation 
based test
proton and
is the main
be taken 
while radi
which ma
objectives
1) Measu
chip’s 
2) Measu
to radi
3) Measu
curren
I
C
 : Rer 5
s chapter w
ate its sta
mic cross s
on beam at
oto Unive
e isotopic 
s, test setup
tion of the 
proton beam
e the dyna
ion memor
njection tes
rbit of cho
ns at [65] [6
iation T
tests are ex
s. If the te
 heavy ion
 issue whe
to countera
ation testin
ke it worth
 of radiatio
rement of 
wafer to ra
rement of t
ation induc
rement of t
t. 
hapter 5 R
adiatio
e introduce
tic cross s
ection. The
 65 MeV. A
rsity Resea
radiation 
, and less c
Virtex-5 ch
 for radiat
mic cross 
y through D
t results are
ice. Some
6]. 
est Obj
pensive an
st was to ta
 radiation, t
n conductin
ct any po
g is still an
 to condu
n test: 
the static c
diation indu
he dynami
ed SEUs. 
he SEL inc
adiation an
n and F
 the radiat
ection and 
 Radiation
t the begin
rch Reacto
source. Us
omplicated
ip was not
ion evaluat
section of
PR as will
 used in an
 sections o
ectives 
d need a l
ke place a
hen the com
g radiation
ssibility fo
 expensive
ct wheneve
ross sectio
ced SEUs.
c cross sect
idences an
d Fault Inj
ault In
ion tests pe
the fault 
 tests were
ning, we de
r Institute
ing Cf-252
 than the pr
 successful
ion. We de
 the design
 be describe
alyzing the
f this cha
ot of prepa
t a nuclear
plications
 tests. Ther
r radiation 
 and compl
r possible
n which in
  
ion which 
d the increm
ection Test
jection
rformed on
injection te
 performed
cided to pe
(KURRI) 
 is much 
oton beam
. Therefore,
veloped a f
 using fau
d in the ch
 system sof
pter were 
ration com
 accelerato
 would eve
efore, a lot 
contamina
ex test, it h
. The follo
dicates the
indicates th
ent in the
ing 
 Testin
 the FPGA
sts to esti
 at TARRI 
rform the r
using Cf-2
easier in 
 test.  Howe
 we finally
ault injectio
lt injection
apter. The r
t error relia
part of the
pared to la
r, as in the
n be greate
of measure
tion. Neve
as some ob
wing are t
 sensitivity
e design se
 level of co
86 
g 
 chip to 
mate its 
- JAEA 
adiation 
52 as a 
the test 
ver, the 
 decided 
n bench 
 in the 
adiation 
bility in 
 author 
boratory 
 case of 
r. Safety 
s should 
rtheless, 
jectives 
he main 
 of the 
nsitivity 
nsumed 
Chapter 5 Radiation and Fault Injection Testing 
87 
 
4) Accumulated dose, TID and DDD, identification and measurement.  
5) Testing of the effectiveness of the applied mitigation techniques against SELs, 
SEUs, TID and DDD. 
6) Estimation of the electronic components Soft Error Reliability (SER) and their 
lifetimes based on SEU rate, TID and DDD. 
5.2 Accelerated Proton Beam Radiation Test Setup 
Performing radiation test using accelerated proton beam requires special test setup. 
The Device Under Test (DUT) should be baked before placing it in the radiation 
chamber in order to reduce the outgassing emissions and hence reach the required 
pressure to start the radiation test in shorter time. Figure 5-1 shows the baking of the 
boards before radiation test at CeNT – Kyushu Institute of Technology (KIT). The 
boards are then kept in nitrogen sealed containers to avoid any water vapor 
condensation on their surfaces, which would require repeating the baking step, as 
well as isolating them from any contamination from the surrounding environment.   
 
Figure 5-1. Baking of Boards before Radiation Testing. 
During the radiation test no one is allowed to exist near the test chamber due to 
safety conditions. The test chamber, shown in Figure 5-2, is vacuumed at a pressure 
Chapter 5 Radiation and Fault Injection Testing 
88 
 
of 10-5 Pa. The DUT is placed in a vertical orientation to face the proton beam. The 
pressure level of 10-5 Pa guarantees that no molecular interaction would take place 
between the proton beam and any molecules that might exist inside the chamber as 
they would be sucked out. The chamber is located away from the control room and 
protected with a concrete wall surrounding it from all sides. The concrete wall has a 
thickness of about 4 m. A stainless steel door of the same thickness is used to close 
the test location, where the test chamber is installed, before starting the radiation 
beam.   
 
Figure 5-2. Radiation Test Chamber used in Accelerated Proton Beam Test. 
Geiger counters are used to check the location before and after the test. The personal 
in charge of the test are checked at a stand-up facility to measure any radiation 
contamination in their bodies and/or clothes. Usually the level of radiation measured 
at the stand-up checking should be less than 0.001 mSv. 
The beam line, shown in Figure 5-3, runs from the accelerator core to the vacuum 
test chamber where the DUT is installed. The proton particles are accelerated and 
kept in a straight beam using electromagnetic field.  
Chapter 5 Radiation and Fault Injection Testing 
89 
 
 
Figure 5-3. Proton Beam Line Connected to Test Chamber. 
The beam is blocked from hitting the DUT using a shielding frame which can be 
removed remotely from the control room. It can be centered on the target through a 
setting mesh as shown in Figure 5-4. 
 
Figure 5-4. Beam Centering Grid. 
A flange with two 25 pins male connections is used to carry the digital interface 
signals between the DUT inside the test chamber and the devices outside of it. The 
flange is used to guarantee that tight isolation is available in order not to have any air 
leakage from outside the chamber to its internal vessel while operating in vacuum. 
The flange is shown in Figure 5-5. 
Chapter 5 Radiation and Fault Injection Testing 
90 
 
 
Figure 5-5. Interface Flange with Radiation Test Chamber. 
The DUT is installed on a metal board facing the radiation beam. A copper plate of 1 
cm thickness is installed above the DUT. The copper plate helps in shielding the 
components in the DUT that should not be subjected to radiation during the test, such 
as DC/DC converter and flash configuration memory. These components are: 1) 
either not used at all in the final design and they are just being used during the 
radiation test to provide supportive functions, 2) or they are used but they should be 
replaced with different types in the final design. An example of a supportive function, 
when there is an external microcontroller installed on the DUT to perform external 
scrubbing of the FPGA SRAM configuration memory. That microcontroller should 
not be the subject for radiation testing unless it would be used in the final design. In 
our test we focused on the FPGA to calculate its cross section using internal 
scrubbing.  
The copper plate contains a hole to allow the radiation to pass through it and hit the 
surface of the FPGA in the DUT. Figure 5-6 shows the copper shield plate installed 
above the DUT and the hole just centered above the FPGA.  
Chapter 5 Radiation and Fault Injection Testing 
91 
 
 
Figure 5-6. Copper Shield In Radiation Test. 
The board is installed inside the test chamber with the minimum number of cables 
connecting it to the computers and power sources outside the test chamber through 
the flange interface. The cables are recommended to be as few as possible to avoid 
outgassing during the preparation stage when the test chamber is being vacuumed. If 
the cables are too many then it might not be possible to reach the required pressure 
level of 10-5 Pa which is mandatory to start the radiation test.  
 
Figure 5-7. Top-view DUT in Radiation Test Chamber with RS-232 Interface. 
Even if baking is performed on the cables and they were quite many then the 
outgassing is still possible to occur. We had a similar experience at the beginning of 
Proton Beam
Chapter 5 Radiation and Fault Injection Testing 
92 
 
the radiation tests when we used RS-422 converters to extend the communication 
distance and avoid the electromagnetic noise. RS-422 is a differential serial 
communication technology which can reach up to 1000 meters with twisted pair 
cables at a speed of 10 Mbit/s. However, we could not reach the required pressure at 
the radiation test site as we were using bulk external RS-422 converters instead of 
compact ICs as shown in Figure 5-8. We had to redesign the communication board to 
be compact and of very small size and reduce the number of cables as shown in 
Figure 5-7. 
 
Figure 5-8. Outgassing due to External RS-422 Converters. 
The overall test setup can be viewed in Figure 5-9. The proton beam runs across a 4 
m concrete wall directly from the accelerator core. Coils are surrounding the beam 
tube to generate magnetic field in addition to the electric field to guide the beam and 
accelerate the particles. The DUT is installed on a handler and a copper plate is 
installed above it with a hole to pass the proton beam. Data is collected from the 
board through RS-232 interface which proved to work perfectly at low data rate of 9 
Kbps. The interface cables as well as the power cables passed through the interface 
flange. A computer system with serial interfaces, Digital Input/Output (DIO), and 
 Data Acqu
Network 
existed an
All the m
connection
In summa
1) Baking
2) Install
3) Testin
4) Vacuu
5) Startin
5.3 Rad
Before con
Cf-252 rad
in terms o
require a f
order to h
by the sp
before the
C
isition (DA
(LAN) con
d the test s
onitoring 
 to the com
Fig
ry the test s
 of the DU
ation at the 
g of interfa
ming of the
g the radiat
iation T
ducting ac
iation test 
f the neede
ull health c
ave the per
ecialists at
 test to reco
hapter 5 R
Q) cards m
nection wa
ite. During
and contro
puter syste
ure 5-9. Ac
etup can be
T for outga
test chamb
ce, control, 
 test chamb
ion beam a
est proc
celerated p
at KURRI 
d setup an
heck as we
mission to 
 KURRI t
rd the occu
adiation an
anaged th
s provided
 the test, n
l actions 
m at the te
celerated P
 arranged in
ssing. 
er. 
DUT and L
er.  
t specific fl
edure 
roton beam
as it is muc
d necessary
ll as radiati
attend and 
o de-capsu
rrence of S
d Fault Inj
e data logg
 between t
o one was 
were perfo
st site.  
roton Beam
 the follow
AN.  
ux and ene
 test at TAR
h easier tha
 procedure
on test lice
participate 
late the X
EUs. Acco
ection Test
ing and con
he room w
staying nea
rmed usin
 Test Setup
ing steps:
rgy.  
RI, we tho
n accelerat
s. Howeve
nse to be ob
in the tests
C5VLX50
rding to th
ing 
trol. A Lo
here the o
r the test c
g remote 
. 
ught to con
ed proton b
r, both test
tained offi
. We were
-1FFG676C
e nuclear ph
93 
cal Area 
perators 
hamber. 
desktop 
duct the 
eam test 
s would 
cially in 
 advised 
 FPGA 
ysicists 
 
 at KURR
capsulatio
252, whic
from trav
record an
Figure 5-1
be remove
We perfor
company 
on the FPG
flip-chip m
Solder bu
the board
provides a
thermal co
Thermal E
C
I, Cf-252 
n. This is d
h is the ret
ersing thro
y SEU eve
. The lid-h
d out to rea
med de-cap
in Japan. B
A to get a
ethod, wh
mps locatio
 is filled w
 mechanic
mpensation
xpansion (
hapter 5 R
cannot pen
ue to the f
arding forc
ugh the FP
nts. The X
eat spreade
ch the die.
Figure 5-
sulation fo
efore the d
 clear unde
ere there a
ns can be 
ith an el
al support 
 for the di
CTE). 
adiation an
etrate throu
act that the
e acting a
GA mater
C5VLX50
r together w
10. Virtex 5
r the XC5V
e-capsulatio
rstanding o
re no wires
seen in Fig
ectric insul
to fix the d
fference bet
d Fault Inj
gh the pa
 stopping p
gainst the c
ial, is high
-1FFG676C
ith the the
 Layers © 
LX50-1FF
n process,
f the die co
 bonding w
ure 5-11. T
ator adhes
ie well in 
ween the d
ection Test
ckage’s m
ower (−ௗாௗ௫
harged par
 and there
 FPGA la
rmal interf
Xilinx. 
G676C FP
 an X-ray s
nnection to
ith the ext
he space b
ion. This 
its place as
ie and subs
ing 
aterial with
)	(MeV/cm
ticles to pr
fore we m
yers are sh
ace materia
GA at a spe
can was co
 the substr
ernal pins, 
etween the
under-fill a
 well as p
trate Coeff
94 
out de-
) of Cf-
event it 
ight not 
own in 
l should 
 
cialized 
nducted 
ate. The 
is used. 
 die and 
dhesion 
roviding 
icient of 
 The FPG
capsulatio
company 
have a hig
we decide
high amou
 
Instead of
required to
Neverthel
We had s
DUT to p
C
Figure 
A was de
n the FPGA
reported th
h failure ra
d not to re
nt of FPGA
Figu
 de-capsula
 de-capsul
ess, we did
uccessful d
erform Cf-2
hapter 5 R
5-11. X-ray
-capsulated
 was not 
at the de-c
te if we cho
peat the pr
s which is 
re 5-12. FP
tion, we ch
ate the Virt
 not stop th
e-capsulati
52 radiatio
adiation an
 of the XC5
 as shown
functional 
apsulation 
se to repea
ocess of d
fairly expe
GA before 
ose the acc
ex5 FPGA 
e experime
on of the 
n testing a
d Fault Inj
VLX50-1F
 in Figur
and the pro
of Virtex F
t the proces
e-capsulatio
nsive.  
and after D
elerated pro
to be able t
nts for usin
Spartan-6 F
t KURRI a
ection Test
FG676C F
e 5-12. H
cess was n
PGAs is d
s. In order 
n as it mi
e-capsulatio
ton beam t
o conduct th
g Cf-252 
PGA whic
s shown in 
ing 
PGA. 
owever, a
ot success
ifficult an
to be cost e
ght lead to
n. 
esting as it
e test.  
in radiation
h was use
Appendix 
FP
95 
fter de-
ful. The 
d would 
ffective, 
 loosing 
 was not 
 testing. 
d as the 
(C). The 
GA Silicon
 
 
 
Chapter 5 Radiation and Fault Injection Testing 
96 
 
test showed the Cf-252 capabilities in inducing SEUs in the Spartan-6 die. We 
estimated the static and dynamic cross sections of the Spartan-6 based on the mean 
energy emitted by the Cf-252 which is estimated at 43 MeV. 
Back to the Virtex5 FPGA, we chose the energy of the proton beam to be at the 
saturation level. In [64], it was shown that 65 MeV is a saturation energy for the 
XC5VLX50-1FFG676C FPGA. Although accelerated proton beam testing is difficult 
and needs a lot of complex setup, it does have an advantage when compared to Cf-
252 testing. Proton beam has a consistent energy and flux while Cf-252 has a 
distribution of energies and fluxes. This means that when performing calculations of 
the cross section, proton beam would be easier, accurate and straight forward than 
Cf-252. In the following sections we illustrate the detailed procedure for the 
accelerated proton beam testing. 
5.3.1 Safety Precautions  
Radiation test at nuclear accelerators is an ultimately serious test. Although the test 
site is well prepared at both TARRI and KURRI, there are still some safety rules to 
follow. To be able to attend the test, a full health check should be made at an official 
governmental health check center to make sure that there are no health conditions 
that might lead to complex health problems during or after the test.  
A basic training about radiation types, nuclear reactors operation, safety procedures 
at radiation sites and contingency behavior in case of radioisotope leakage and/or 
contamination. Upon finishing the training a license is issued for one year to conduct 
radiation testing at the JAEA based facilities or university based facilities. Both 
facilities are checked and inspected by JAEA and other official authorities in Japan. 
The license should be renewed annually to keep it valid. Renewal is done by 
attending refreshment training. 
At the test site, radiation level should be checked for each participant before starting 
the test. The participants check the radiation level by standing on a device that 
measure the contamination, if any, by placing the hands in a Geiger scanner. While 
existing inside the nuclear reactor, all the participants wear a Geiger counter bracelet 
to continuously measure the radiation level in mSv. Participants wear a special coat 
Chapter 5 Radiation and Fault Injection Testing 
97 
 
and shoes while existing in the nuclear reactor for protection against radiation 
contamination.  
The DUT is located at the radiation test location inside the test chamber. The 
participants are located at a different room. The participants can control the 
experiment and monitor the results in real time via remote desktop application over 
the LAN between the test site and their room. Before the experiment is started, the 
test commander locks the radiation site and activates the alarm that radiation test is in 
progress. The door is locked during radiation and no one is allowed to exist at that 
area.  
Switching the DUT on/off should be considered carefully. As there is no possibility 
to interrupt the test by stopping the radiation beam and reopening the lock door, there 
should be a method to provide control and reconfiguration, if needed, remotely. In 
our test we designed an application using LabVIEW to control the switching of the 
DUT using relays and DIO cards. We controlled the LabVIEW application from our 
computers at the control room via remote desktop application over the LAN. In that 
way we kept controlling and monitoring the experiment without having to stop the 
beam and reentering the test site.  
At the end of the radiation, the proton beam is stopped and then the radiation test 
alarm is deactivated. After opening the lock door, TARRI specialists start to scan the 
test site using handheld Geiger counters. All the devices used during the test are 
checked using the Geiger counters to make sure that they are not contaminated. The 
DUT itself is kept at TARRI for 3 months as it became contaminated. Radiation 
testing is very serious and should be handled very carefully and with great care as it 
might lead to catastrophic crises in case of carelessness or inaccuracy. Next we 
illustrate the accelerator parameters that should be controlled during the test, the 
software that was running on the MPSoC and how the SEU controller was used to 
calculate the upset rate.  
5.3.2 Accelerator Parameters 
The radiation test starts by defining the particles flux level, energy level and 
exposure time. The flux level (# of particles/cm2/s), proton particles energy (MeV) 
and radiation exposure time (s) are the basic parameters used in calculating the 
Chapter 5 Radiation and Fault Injection Testing 
98 
 
radiation effects. Generally speaking, a flux level that generates sufficient 
statistically meaningful data in reasonable exposure time at the specified energy level 
is required. The radiation flux (# of particles/cm2/s) is related to the radiation fluence 
(# of particles/cm2) and radiation exposure time (s) as shown in Eq. (5-1) & Eq. 
(5-2) : 
݂݈ݑ݁݊ܿ݁ = 	#	݋݂	ܲܽݎݐݎ݈݅ܿ݁ݏܣݎ݁ܽ  
(5-1) 
ܨ݈ݑݔ = ݂݈ݑ݁݊ܿ݁݁ݔ݌݋ݏݑݎ݁	ݐ݅݉݁ 
(5-2) 
We recommend that the selection of the flux level be determined based on the target 
orbit where the system would operate. In our case we selected a polar orbit at a LEO 
altitude. The orbit was selected to match that of a previously launched satellite by 
KIT named HORYU-2. The orbit is also typical for earth observation missions. 
Table 5-1 shows the Horyu-2 satellite orbit parameters.  
Table 5-1. Horyu-2 Satellite Orbit Parameters. 
Orbit Type general 
Perigee Altitude 651 km 
Apogee Altitude 671.6 km
Inclination 98.17° 
RAAN 223.04° 
Argument of Perigee 31.95° 
True Anomaly 328.25° 
 
Figure 5-13 shows the expected energy spectra versus the proton flux levels at the 
Horyu-2 orbit. The graph was produced by running simulation using the AP-8 model 
under SPENVIS.  
Chapter 5 Radiation and Fault Injection Testing 
99 
 
 
Figure 5-13. Proton Energy Spectra Versus Flux Level at Horyu-2 Orbit. 
From the graph, we expect that a flux level of about 30 (protons/cm2/s) is suitable to 
test at an energy of 65 MeV. However, at that flux level the exposure time might 
need to be very long to collect sufficient upset events. The exposure time to generate 
sufficient errors at that flux and energy level depends on the static-cross section of 
the DUT. The static cross section is defined in Eq. (5-3) : 
ߪ௦௧௔௧௜௖ =
#	݋݂	ݑ݌ݏ݁ݐݏ
݂݈ݑ݁݊ܿ݁  
(5-3) 
If the test purpose was to determine the static cross section then several exposure 
times/flux levels should be tried. If the test purpose was to determine the dynamic 
cross section then the test flux level should be close to the in-orbit expected flux 
level at the same energy. The dynamic cross section is defined in Eq. (5-4) : 
ߪௗ௬௡௔௠௜௖ =
#	݋݂	ݏݕݏݐ݁݉	݂݈ܽ݅ݑݎ݁ݏ
#	݋݂	ݑ݌ݏ݁ݐݏ  
(5-4) 
Chapter 5 Radiation and Fault Injection Testing 
100 
 
In our case, we wanted to check the static cross section, so we decided to have a very 
high flux to reduce the exposure time. In fact, beam time is very expensive and it 
should be paid for each minute of irradiation. Therefore a cost effective way would 
be to use a high flux level at the selected test energy level when performing static 
cross section measurement to reduce the exposure time of irradiation. In the case of 
dynamic cross section measurement, the test flux level should be comparable to the 
in-orbit flux level at the chosen particle energy. An increment test margin of two 
orders of magnitude would be fine to work with. However, very high flux levels, 
above two orders of magnitude, would be impractical to use as they would not 
represent the target environment. On the other side, taking into consideration that 
radiation occurs at different energies and fluxes in space as shown in Figure 5-13, a 
test flux level with an increment of two orders of magnitude above the expected in-
orbit flux level would compensate for the absence of other particles at different 
energies. However, if the design would be subjected to very high unrealistic flux 
levels then the test would not represent the target environment and the capabilities of 
the used fault tolerance mitigation techniques would not be effectively evaluated.  
We performed the radiation test at two flux levels, 3e06 and 10e06 (cm-2.s-1). Three 
times of testing were conducted at flux level of 3e06 (cm-2.s-1). One test was 
conducted at flux level of 10e06 (cm-2.s-1). The reason for test repetition was to get 
different data sets to avoid the effect of single time errors thus we could calculate a 
fairly accurate average value of the static cross section. We did not specify the 
exposure time for each test run. Instead of that, we were running the tests until the 
maximum number of FPGA configuration frame errors was reached. The embedded 
SEU controller was responsible for counting the number of frame errors. A total of 
128 frame errors was the maximum record that the SEU controller could count. 
When 128 frame errors were reached we knew that we had to stop the test run and 
restart again as will be explained later. The energy level was set to 65 MeV during 
the tests and the corresponding exposure times are shown in Table 5-2: 
 
 
 
Chapter 5 Radiation and Fault Injection Testing 
101 
 
Table 5-2. Accelerator Parameters Settings. 
Flux Level (cm-2.s-1) Energy (MeV) Exposure Time (s) 
3e06 
65 
25.16 
23.4 
21.7 
10e06 82.4 
5.3.3 Embedded Software 
The MPSoC consists of four embedded processors as was explained earlier. Three 
processors implement TMR mitigation and the fourth processor acts as the master 
controller for handling system level tasks. During the radiation test the three 
processors run an Interrupt Service Routine (ISR) that enables them to print their 
status to a serial terminal via their UART ports and communicate an 
acknowledgement message to the master controller through the mailbox IP core.  
The master control processor was responsible for receiving the messages from the 
slave processors to make sure that they are alive and working as well as resetting the 
external WDT. An SEU controller IP core was used to estimate the number of errors 
in the bitstream configuration frames stored in the internal FPGA SRAM. The 
embedded software of radiation and thermal vacuum testing are similar and will be 
described in details in chapter 6. Code Excerpt is presented in Appendix (D). 
5.3.4 Upset Rate Estimation 
The Xilinx SEU controller IP core was explained earlier in the MPSoC system 
design chapter. However, we would like to focus on how to use it in calculating the 
static cross section. The SEU controller operates in one of two modes, the Detection 
Only Mode (DOM) or the Auto Correction Mode (ACM). At the beginning of the 
test, the SEU controller initially starts in the ACM mode. Due to the very high flux 
level, there is a high probability that MBUs would occur in a single frame. When the 
first MBU occurs in a frame, the SEU controller switches to the DOM mode as it 
cannot correct MBUs. Data is logged through the SEU controller UART and 
displayed on the remote monitoring computers via remote desktop connection over 
the LAN. The SEU controller can count up to 128 frame errors. The upset rate can be 
estimated by calculating the time since the beginning of the DOM mode, which 
matches the start of an MBU occurrence, till reaching the 128 frames’ errors.  
Chapter 5 Radiation and Fault Injection Testing 
102 
 
Figure 5-14 is a sample of the output from the SEU controller IP core as a result of 
scrubbing the FPGA configuration frames during the radiation test at a flux level of 
10e06 (cm-2.s-1).  
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
Figure 5-14. Sample Output from the SEU Controller During Radiation Test. 
The static cross section can be calculated in terms of the event bit cross section or the 
upset bit cross section. The event bit cross section means that we count the number 
of events where we had upsets regardless of how many bits were actually upset 
during the event. For example if two bits had an upset at the same time, we would 
consider this as a single event upset. The upset bit cross section means that we count 
the number of bits that were upset either simultaneously or at different times. The 
two bits in the previous example would denote an upset bit of two. The calculation of 
the static event cross section is shown in Eq. (5-5), the static bit cross section is 
shown in Eq. (5-6), and the average static cross section is shown in Eq. (5-7): 
#of complete scans    Error Flag  #of frame errors #of corr. errors   
  
 000080   1    04    00 
 000100   1    08    00 
 000180   1    0B    00 
 000200   1    0C    00 
 000280   1    0E    00 
 000300   1    11    00 
 000380   1    13    00 
 000400   1    18    00 
 000480   1    1E    00 
 000500   1    20    00 
 000580   1    25    00 
 000600   1    28    00 
 000680   1    29    00 
 000700   1    2B    00 
 000780   1    33    00 
 000800   1    36    00 
 000880   1    3B    00 
 000900   1    40    00 
 000980   1    46    00 
 000A00   1    4B    00 
 000A80   1    52    00 
 000B00   1    58    00 
 000B80   1    61    00 
 000C00   1    67    00 
 000C80   1    6C    00 
 000D00   1    6E    00 
 000D80   1    75    00 
 000E00   1    7C    00 
 000E80   1    80    00 
Chapter 5 Radiation and Fault Injection Testing 
103 
 
ܵݐܽݐ݅ܿ	ܧݒ݁݊ݐ	ܥݎ݋ݏݏ	ܵ݁ܿݐ݅݋݊	݌݁ݎ	ܾ݅ݐ
= 	
#݋݂	݁ݒ݁݊ݐ	݁ݎݎ݋ݎݏ
(݂݈ݑݔ	 × ݁ݔ݌݋ݏݑݎ݁	ݐ݅݉݁)൘
݀݁ݒ݅ܿ݁	ݏ݅ݖ݁	݅݊	ܾ݅ݐݏ  
(5-5) 
 
ܵݐܽݐ݅ܿ	ܤ݅ݐ	ܥݎ݋ݏݏ	ܵ݁ܿݐ݅݋݊	݌݁ݎ	ܾ݅ݐ = 	
#݋݂	ܾ݅ݐ	݁ݎݎ݋ݎݏ
(݂݈ݑݔ	 × ݁ݔ݌݋ݏݑݎ݁	ݐ݅݉݁)൘
݀݁ݒ݅ܿ݁	ݏ݅ݖ݁	݅݊	ܾ݅ݐݏ  
(5-6) 
 
ܣݒ݁ݎܽ݃݁	ܵݐܽݐ݅ܿ	ܥݎ݋ݏݏ	ܵ݁ܿݐ݅݋݊	݌݁ݎ	ܾ݅ݐ
= 	 ܵݐܽݐ݅ܿ	ܧݒ݁݊ݐ	ܥݎ݋ݏݏ	ܵ݁ܿݐ݅݋݊	݌݁ݎ	ܾ݅ݐ + ܵݐܽݐ݅ܿ	ܾ݅ݐ	ܥݎ݋ݏݏ	ܵ݁ܿݐ݅݋݊	݌݁ݎ	ܾ݅ݐ2  
(5-7) 
The results of the radaition test to estimate the static event cross section are detailed 
in chapter 7. So far we investigated the use of radiation test with high flux, low 
exposure time, to estimate the static cross section. In the following sections we 
investigate the use of fault injection to estimate the dynamic cross section as an 
alternative for low flux, high exposure time and hence high cost, radiation testing.  
5.4 Fault Injection Test Objectives 
Fault Injection test is used to estimate the effects of SEUs on the designs within the 
lab. In a well-designed fault injection test, dynamic cross section could be estimated 
accurately. Fault injection test is an inexpensive alternative for radiation test in 
estimating the dynamic cross section of a design. 
The following are the main objectives of Fault Injection test:  
1) Performance evaluation of designs by emulating fault events through injecting 
faults at random positions. 
Chapter 5 Radiation and Fault Injection Testing 
104 
 
2) Estimation of design dynamic cross section as an alternative for expensive and 
complicated radiation tests.  
3) Evaluation of the effectiveness of the fault tolerance mitigation techniques used 
to protect the design against SEUs. 
4) Provide a fast, inexpensive and easy to use tool to test the designs within the labs. 
5) The test can be run at different upset rates to simulate different orbits; the upset 
rates can be used as an input for estimating the adequate flux level and energy if 
the radiation test is already planned. 
5.5 Fault Injection Concept 
Fault injection in functioning systems is a technique used to insert deliberate faults at 
selected and/or random units of the design to assess its sensitivities. This technique is 
implemented by adding additional hardware and software to the system to handle the 
insertion of faults, monitoring of performance and collection of results. Figure 5-15, 
shows the architecture of a fault injection system. The faults are injected to the DUT 
and statistical results are issued as a feedback to the injecting machine for test 
vectors adjustment. 
 
Figure 5-15. Fault injection cycle.  
The DUT is interfaced to a faults insertion unit which has access to the design units 
where faults are to be injected. The faults vector calculation and generation unit 
prepares faults vectors that match the required test objectives. The fault insertion unit 
could be a combination of hardware and software. It handles the overriding of the 
normal operation into a faulty one. For example, the fault insertion unit could be a 
Chapter 5 Radiation and Fault Injection Testing 
105 
 
code that reads back a previously calculated value by the normal DUT code and then 
overwrites it with a faulty value to simulate a specific condition. The insertion could 
be done without stopping the main operation. In some designs it might be inevitable 
to interrupt the normal operation flow by suspending it and then resuming after the 
injection takes place. The function monitoring and control unit takes care of 
monitoring the operation of the DUT. It stops the DUT operation in case of noticing 
an emergency and provides a control path to set the DUT in specific operating modes 
and operation settings. The performance of the DUT is statistically analyzed to detect 
anomalies in normal operation as faults are injected. The feedback about how the 
DUT behaves while in fault injection mode is provided to the faults vector 
calculation and generation unit. It uses that information in generating new fault 
vectors. For example, the feedback statistical information might show that there is a 
repetitive pattern in the output when certain fault sequence is followed. The faults 
vector generation and calculation unit might repeat the vectors with different 
variations to study the statistical dependence between injected faults and output 
vectors. Fault monitoring and control unit also feedback the faults vector calculation 
and generation with information about the behavior of the DUT during the fault 
injection process. For example, it might be necessary to feedback the faults vector 
calculation and generation unit with the moments where the system completely 
stopped working and needed a deep reset. This information is used in detecting the 
types of faults that lead to total failure. 
The SEUs which occur in space are probabilistic. Poisson distribution is used to 
estimate the expected number of upsets (k) which happens in the time interval (T) 
with an average number of upsets (μ) according to the probability density function 
shown in Eq. (5-8). The exponential distribution is used to estimate the expected time 
between upsets (τ) with an average number of upsets in unit time interval (λ) as 
shown in Eq. (5-9). The relationship between both distributions can be set as (μ = 
λT). 
 
ܲ(ݔ|ߤ) = (ߤ௫/ݔ!)	݁ିఒ 
(5-8) 
 
 5.6 SEU
The fault 
internal ha
It is an IP
FPGA to p
The contr
channel to
system co
computer 
Poisson d
faults will
Another c
contains th
C
 Fault I
Injector sys
rdware uni
 core that i
roduce fau
ol of the SE
 send com
ntains three
which runs
istribution o
 be injected
omputer is
e hardware
Fig
hapter 5 R
njector
tem archite
t that can r
s provided 
lts in the fo
U controll
mands to 
 external c
 MATLAB
f the SEU
 which foll
 used for 
 design.  
ure 5-16. F
adiation an
ܲ(߬|ߣ) =
cture is sh
econfigure 
by Xilinx w
rm of bit f
er is throu
it and rece
omputers t
 script to g
s in the targ
ows the exp
configurin
ault Injecto
d Fault Inj
	ߣ݁ିఒఛ	
own in Fig
the FPGA 
hich can b
lipping in t
gh serial co
ive respons
o support i
enerate ran
et orbit. It
onential d
g the FPG
r System A
ection Test
ure 5-16. T
bit stream, 
e controlle
he FPGA c
mmunicati
es from it
ts function
dom fault
 generates t
istribution a
A with the
rchitecture
ing 
he injector
the SEU co
d from out
onfiguratio
on over the
. The fault 
. The fault 
s lists based
he timing a
s described
 bit-stream
. 
106 
(5-9) 
 uses an 
ntroller. 
side the 
n frame. 
 RS232 
injector 
injector 
 on the 
t which 
 earlier. 
 which 
 
Chapter 5 Radiation and Fault Injection Testing 
107 
 
The function monitoring of the DUT, the MPSoC, is done through sending the 
processors status and results of executing a simple counter program to the UART 
interfaces which are monitored by an external computer to collect the results and 
analyze them. The system runs the simulation for number of times and it generates a 
new fault injection vector at each time. The fault injection vector contains the bit 
location that will be flipped which is a random number from (0 to 1311) and the 
frame number where flipping will take place which is a random number from (1 to 
8662), these numbers are device specific to the Virtex5 LX50. Appendix (E) shows a 
script used for the generation of the frame addresses in the format specific to the 
Virtex-5 through invoking the SEU controller. Appendix (F) shows a sample of the 
single bit upsets injection script. 
The faults are accumulated and their effects are watched as they are injected. At the 
end of the injection cycle, ACM is enabled to recover the injected faults and restore 
the operation of the cores. The flow chart in Figure 5-17, shows the test flow.  
 DOM is u
only moni
C
F
sed during
tors and rep
hapter 5 R
igure 5-17.
 the accum
orts faults 
adiation an
 Fault Injec
ulated fau
when opera
d Fault Inj
tion Test F
lt injection
ting in the 
ection Test
 
low Chart.
 process. T
DOM with
ing 
he SEU co
out perform
108 
ntroller 
ing any 
Chapter 5 Radiation and Fault Injection Testing 
109 
 
correction of the upsets. The faults are injected using specific commands over serial 
interface with the SEU controller. The result of fault injection is observed through 
monitoring the MPSoC operation over the serial interfaces.  
The function of the TMR fault mitigation approach is tested through injecting faults 
in the data carried by the redundant modules. Each module carries the same set of 
data which might represent operation state code, software variables, communication 
message or any other form of application specific information. The TMR concept 
depends on voting among the data as shown in Figure 5-18. 
 
Figure 5-18. TMR among N bits Vectors. 
The TMR approach is used in protecting the data by comparing among the data 
values of three functionally identical modules. In case differences among the data 
values exist then consensus among the values would be used to propose the most 
accurate value. If no consensus is found then an error is signaled. In simulating faults 
in the TMR data vectors of (N bits), a random fault list is generated to indicate the 
locations of the bits to be flipped. The data sets are then compared to each other and 
errors are calculated based on the comparison results as shown in the script excerpt 
in Figure 5-19. The results of fault injection are detailed in chapter 7. 
Data 
Vector 
(1) 
N bits 
Data 
Vector 
(2) 
N bits 
Data 
Vector 
(3) 
N bits 
Voter 
 
Chapter 5 Radiation and Fault Injection Testing 
110 
 
 
Figure 5-19. Part of the MATLAB Script to Simulate TMR Errors. 
In this chapter we presented the radiation test using accelerated proton beam at 
TARRI. We estimated the static cross section using a new technique through internal 
scrubbing with the SEU controller core. Traditional techniques, which use a 
monitoring and control device to scrub the FPGA externally, give accurate results on 
the expense of complicated test setups. Our technique, gives accurate results with the 
advantage of reducing the number of connections and components inside the test 
chamber which is already of limited volume. Reducing the number of connections 
and components helps in quickly reaching the 10-5 Pa vacuum pressure which is 
necessary to start the radiation test. Also, we presented the fault injection test method 
to estimate the dynamic cross section. This method is a cost effective alternative to 
expensive radiation tests which might take long times to estimate the dynamic cross 
section depending on the static cross section, flux level, energy level and exposure 
time. Both the dynamic and static tests showed that the SEU controller is robust as it 
showed no halts during the tests and it effectively scrubbed mitigated the FPGA 
against induced upsets. In the next chapter we introduce the thermal vacuum tests 
and the mitigation techniques to reduce the thermal power through reducing the static 
and dynamic power consumption of the FPGA. 
% Fault Simulation   
for i = 1:Total_Upsets        
         Byte = ceil(random_bits(i,1)/8); 
         Bit = rem(random_bits(i,1),8); 
         Buffer_1(Byte) = bitxor(Buffer_1(Byte),2^(Bit)); 
         Byte = ceil(random_bits(i,2)/8); 
         Bit = rem(random_bits(i,2),8); 
         Buffer_2(Byte) = bitxor(Buffer_2(Byte),2^(Bit)); 
         Byte = ceil(random_bits(i,3)/8); 
         Bit = rem(random_bits(i,3),8); 
         Buffer_3(Byte) = bitxor(Buffer_3(Byte),2^(Bit));              
end 
%calculation of TMR system failure rates         
for i = 1 : Buffer_Size(BS_i)           
if (Buffer_1(i)~= Buffer_2(i)) && (Buffer_1(i)~= Buffer_3(i)) && (Buffer_2(i)~= Buffer_3(i)) 
Unsim_Failures(j) = Unsim_Failures(j) + 1;     
end   
end 
 Chapt
at
en
en
FPGAs. In
operating 
testing to 
during the
and proce
approache
through R
technique 
control. So
6.1 Spa
The heat s
Earth Infr
sources in
S 
 : Ter 6
ellites’ su
vironment
vironment 
 this chap
temperatur
reveal wh
 mission lif
dures as w
s to therm
econfigurat
through he
me section
ce Ther
ources in L
a-Red (IR) 
 LEO space
Chap
hermal
bsystems 
. As was
has severe
ter we han
e and pow
ether the F
etime or no
ell as the a
ally mitiga
ion and Clo
at sink. DT
s of this ch
mal Env
EO are the
emission. F
. 
Figure 6-1
ter 6 Therm
 Vacuu
suffer from
 illustrated
 effects on
dle the eff
er consum
PGA pack
t. In the fo
pplied ther
te the FPG
ck Frequen
M techniqu
apter were 
ironmen
 direct sola
igure 6-1
. Space The
al Vacuum
m Test
 the har
 in the 
 the binary
ects of the
ption. We
age would
llowing sec
mal mitiga
A design:
cy Control
e was used
part of the a
t  
r radiation,
shows the 
rmal Envir
 Testing 
ing 
sh conditi
previous 
 design da
 thermal e
 performe
 fit and su
tions we pr
tion techniq
 1) Active
 (CFC) and
 to achieve
uthor publ
 solar refle
schematic o
onment. 
ons in th
chapter, r
ta in SRAM
nvironmen
d thermal 
stain its o
esent the te
ues. We u
 control te
 2) Passive
 the active
ications at [
ction (Albe
f the exter
111 
e space 
adiation 
-based 
t on the 
vacuum 
peration 
st setup 
sed two 
chnique 
 control 
 thermal 
67]. 
do), and 
nal heat 
 
Chapter 6 Thermal Vacuum Testing 
112 
 
Satellites can have very high temperature at one side, which directly faces the sun, 
while very low temperature at the other side, which faces the deep space. The 
amount of temperature difference between surfaces is inversely proportional to the 
satellite specific heat capacity ࢉ࢖	(J/kg.K) and mass ܕ (kg). The amount of heat 
transfer per unit time ࡽ (W) used to change the temperature by ∆܂ (K) in time  ∆ܜ (s) 
is calculated by Eq. (6-1) : 
ܳ = ݉ܿ௣
∆ܶ
∆ݐ  
(6-1) 
Where: ࡽ	is the heat transfer per unit time (W), ࢓	is the satellite mass (kg), ࢉ࢖	is the 
specific heat capacity (J/kg.K), ∆܂ is the temperature difference (K) and ∆ܜ is the 
duration (s). 
The direct solar radiation flux is about 1371 W/m2 at 1 AU from the sun, the average 
reflected solar radiation (albedo) is 0.3 of the direct solar radiation, and the planetary 
IR flux for planet Earth is 234 W/m2 as shown in [68], which is adapted from [69] 
[70] [71] . 
Table 6-1. LEO Thermal Environment [68]. 
 Aphelion Perihelion Sun Light Eclipse 
Direct Solar 1414 W/m2 1323 W/m2 1371 W/m2 0 
Albedo 
(average) 
0.30 ±0.01 0.30 ±0.01 0.30 ±0.01 0.25 
Planetary IR 
(average) 
234 ±7 W/m2 234 ±7 W/m2 234 ±7 W/m2 220 W/m2 
 
The thermal balance equation is represented in Eq. (6-2): 
ܳா௫௧௘௥௡௔௟ + ܳூ௡௧௘௥௡௔௟ = 	ܳோ௔ௗ௜௔௧௘ௗ 
(6-2) 
Where: Q୉୶୲ୣ୰୬ୟ୪  is the external heat transferred to the satellite from the thermal 
space environment (W), Q୍୬୲ୣ୰୬ୟ୪ is the internally generated heat due to the power 
consumption of the subsystems’ electronic components (W), and Qୖୟୢ୧ୟ୲ୣୢ  is the 
emitted heat from the satellite to space in the form of electromagnetic radiations (W). 
Eq. (6-3) expands the external heat transfer term: 
 
Chapter 6 Thermal Vacuum Testing 
113 
 
ܳௌ௨௡ + ܳ஺௟௕௘ௗ௢ + ܳா௔௥௧௛ + ܳூ௡௧௘௥௡௔௟ = ܳோ௔ௗ௜௔௧௘ௗ 
(6-3) 
Where: Qୗ୳୬ is the heat transferred due to the direct solar radiation (W), Q୅୪ୠୣୢ୭ is 
the heat transferred due to the reflected solar radiation from the Earth surface (W), 
and Q୉ୟ୰୲୦ is the planetary infrared of planet Earth (W).  
Generally, heat is transferred in three ways: 1) Conduction, 2) Convection, and 3) 
Radiation. Conduction heat transfer is due to the flow of heat inside the same body or 
the contact between two surfaces. Convection heat transfer is due to the flow of a 
fluid on a surface. Radiation heat transfer is due to the propagation of heat in the 
form of electromagnetic waves such as IR waves. 
6.1.1 Conduction Heat Transfer  
Heat flow through conduction is due to the interactions that take place inside the 
materials. The conduction heat transfer is described by the Fourier’s Law as in Eq. 
(6-4): 
ܳ௫ = −ܭ
݀ܶ
݀ݔ 
(6-4) 
Where: Q୶ is the local heat flux density, it represents the amount of energy that flows 
per unit area in unit time (W/m2), K  is the thermal conductivity of the material 
(W/m.K), ୢ୘ୢ୶ is the temperature difference over the length of the material (K/m).  
An analogy between current flow through a resistance in electrical circuits and heat 
flow via conduction can be represented through the thermal resistance in Eq. (6-5).  
ܴ௧௛ =
ܮ
ܭܣ 
(6-5) 
Where: R୲୦ is the thermal resistance (K/W), L is the length of the material (m), K is 
the thermal conductivity (W/m.K), and A is the cross sectional area through which 
the heat flows (m2). 
The conduction heat transfer between two arbitrary nodes (i) and (j) can be rewritten 
in terms of the thermal resistance as in Eq. (6-6): 
Chapter 6 Thermal Vacuum Testing 
114 
 
ܳ௜௝ = ௜ܶ
− ௝ܶ
ܴ௜௝  
(6-6) 
Where: Q୧୨ is the heat flow between nodes (i) and (j) (W), T୧ is the temperature of 
node (i) (K), T୨ is the temperature of node (j) (K), and R୧୨ is the thermal resistance 
between nodes (i) and (j) (K/W). 
Conduction heat transfer takes place between the different materials when they have 
contact surfaces. The contact conductance Cୡ୭୬ୢ (W/K), is used to characterize the 
amount of heat transfer per temperature unit, it is the inverse of the thermal 
resistance R୲୦ in Eq. (6-5). 
The contact conductance depends on the type of materials in contact, the surfaces’ 
finishing, pressure exerted to keep the surfaces in contact and the pressure uniformity 
[72] [73].  
 
 
 
 
 
(a)      (b) 
Figure 6-2. a) Heat Flow in Series and b) Heat Flow in Parallel. 
If different materials touch each other as shown in Figure 6-2, such that heat flows 
either in series as in (a) or in parallel as in (b), then the overall equivalent thermal 
resistance can be calculated from Eq. (6-7) and Eq. (6-8): 
ܴ௦௘௥௜௘௦ =෍ܴ௜
௡
௜ୀଵ
 
(6-7) 
ܴ௣௔௥௔௟௟௘௟ = ෍
1
ܴ௜
௡
௜ୀଵ
 
(6-8) 
Where: ܴ௜ is the thermal resistance of material (݅).  
Chapter 6 Thermal Vacuum Testing 
115 
 
6.1.2 Convection Heat Transfer 
Space is a vacuumed environment, where air density from the LEO altitudes and 
above is negligible. Therefore, heat is mainly transferred in space through radiation 
and conduction. However, in some thermal control techniques, such as heat pipes, a 
fluid is used to carry the heat from hot regions to cold regions in a thermal cycle of 
evaporation and condensation. Usually Fluids that can evaporate at low temperatures 
such as Ammonia are used in such methods [72] [73] [13]. Convection heat transfer 
is used in the design of heat pipes in conjunction with conduction heat transfer. It is 
used in calculating heat transfer in fluid fuel tanks or pressurized biological 
experiments on a spacecraft [72]. The convection heat transfer between two arbitrary 
nodes (i) and (j) can be rewritten in terms of the thermal resistance as in Eq. (6-6). 
The thermal resistance in convection heat transfer is shown in Eq. (6-9): 
ܴ௧௛ =
1
ℎܣ 
(6-9) 
Where: h is the convection heat transfer coefficient (W/m2.K), and A is the cross 
sectional area through which heat flows (m2). 
6.1.3 Radiation Heat Transfer  
The radiation heat transfer is significant when the temperature difference between the 
radiating surfaces is high. On the other side, low temperature differences between the 
radiating surfaces causes the value of the radiation heat transfer to be very low such 
that it can be totally neglected. In case the satellite electronics have close power 
consumption values, then the mutual radiation heat transfer among them would be 
very small and likely to be neglected.  
The radiation heat transfer for an ideal black body with emissivity ( ε = 1) is 
described by Stefan-Boltzmann’s law as shown in Eq. (6-10):  
E = σTସ 
(6-10) 
Where: E is the radiated heat flux (W/m2), σ is the Stefan-Boltzmann constant 
(5.67x10-8 W/m2K4), and T is the radiating surface temperature (K). The amount of 
heat transfer between two radiating surfaces can be described by Eq. (6-11) [70]: 
Chapter 6 Thermal Vacuum Testing 
116 
 
q୰ = εσFଵ,ଶA(Tଵସ − Tଶସ) 
(6-11) 
Where: q୰ is the amount of heat transfer between two radiating surfaces (W), ߝ is the 
emissivity of the radiating surface material which ranges from (0) for reflective 
surface to (1) for absorptive surface, ܨଵ,ଶ	is the view factor between surface (1) and 
surface (2) (≤1.0). 
It is important to note that the summation of the solar radiation amount that is 
absorbed, transmitted and reflected over a surface equals to 1 as shown in Eq. (6-12): 
ߙ + ߩ + ߬ = 1 
(6-12) 
Where: α is the absorbitivity, ρ is the reflectivity, and τ is the transmissivity [74]. 
When the surface is in thermal equilibrium then Kirchhoff’s law implies that 
absorptivity equals to emissivity as shown in Eq. (6-13): 
α = ϵ 
(6-13) 
6.1.4 Radiation View Factors 
Radiation heat transfer depends on the orientation of the surfaces with respect to each 
other. The view factor F୧୨	between surfaces (i)  and 	(j), also known as the shape 
factor, is used to define the fraction of radiation emitted by surface (i)  and 
intercepted by surface	(j) [75]. The fraction of radiation emitted by surface (j)	and 
intercepted by surface 	(i)  is the view factor F୨୧	between surfaces (j)  and(i) . The 
relationship between F୧୨ and F୨୧ is shown in Eq. (6-14): 
A୧F୧୨ = A୨F୨୧ 
(6-14) 
Where: A୧ and A୨	are the surface areas (m2).  
Chapter 6 Thermal Vacuum Testing 
117 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
Figure 6-3. View Factor Calculation Geometry [72]. 
The calculation of the view factors can be estimated using Figure 6-3. Two 
differential areas dA୧	and dA୧	 are shown at surfaces T୧  andT୨ . The normal to the 
surfaces are n୧ and	n୨. The angles θ୧ and θ୨	are located between the normal to the 
surfaces and the shortest vector (S) connecting between both surfaces. 
The view factor between two differential areas is calculated in [76] and is shown in 
Eq. (6-15): 
ܨௗ௜ିௗ௝ =
cos ߠ௜ cos ߠ௝
ߨܵଶ ݀ܣ௜݀ܣ௝ 
(6-15) 
The exact solution of the view factor can be found by integrating over the areas as 
shown in Eq. (6-16): 
ܨ௜௝ =
1
ܣ௜ නන
cos ߠ௜ cos ߠ௝
ߨܵଶ ݀ܣ௜݀ܣ௝ 
(6-16) 
The view factors of common geometric configurations used in space crafts are 
previously calculated [70] [76]. Computer Aided Design (CAD) tools can perform 
numerical calculation of view factors for arbitrary geometric configurations. 
6.1.5 Equilibrium Temperature  
When we perform thermal analysis of a system then we divide the system into nodes 
where we calculate the thermal conditions at each of them. The thermal equilibrium 
or thermal balance is achieved at any node when the total amount of heat flow input 
ߠ௜
ߠ௝ 
݊௜
௝݊ 
݀ܣ௜ at ௜ܶ
݀ܣ௝ at ௝ܶ 
 to the nod
(6-17): 
Where: Q୧
from node
It should b
analysis, t
To illustra
Fi
The therm
reference 
written as 
Where: m
∆T୧ tempe
heat transf
e equal to 
୬ି୧ is the h
 i (W). 
e noted th
he more acc
te the conc
gure 6-4. T
al equilibr
to node j an
in Eq. (6-1
݉ܿ௣
∆
∆
 is the mass
rature chan
er term fro
Chap
the total a
෍
eat flow inp
at the highe
urate the a
ept, we con
hermal Equ
ium equatio
d node z w
8): 
௜ܶ
ݐ =
௝ܶ − ܶ
ܴ௧௛ି௝
 (kg) at nod
ge at node
m node j to
ter 6 Therm
mount of h
ܳ௜௡ି௜ =
ut to node 
r the numb
nalysis resu
struct a me
ilibrium An
n at node
hich have 
௜
௜
+ ߝߪܨ௭௜ܣ
e	i, c୮is th
 i (K), ∆t is
 node i (W
al Vacuum
eat flow ou
෍ܳ௢௨௧ି௜
i (W) while
er of the n
lts would b
sh of nodes
alysis thro
i, assuming
higher tem
௭( ௭ܶସ − ௜ܶସ)
e specific h
 the durati
), εσF୸୧A୸(
 Testing 
tput from 
 Q୭୳୲ି୧ is t
odes at wh
e.  
 as shown i
ugh Mesh N
 that analy
peratures t
+ ܳ௘௫௧ +
eat capacity
on (s), 
୘ౠି୘
ୖ౪౞ష
T୸ସ − T୧ସ) i
it as show
he heat flow
ich we perf
n Figure 6-
odes [68].
sis is perfo
han node i
ܳ௜௡௧ 
 (J/kg.K) a
౟
ౠ౟
 is the con
s the radiat
118 
n in Eq. 
(6-17) 
 output 
orm the 
4 : 
 
 
rmed in 
, can be 
(6-18) 
t node	i, 
duction 
ion heat 
Chapter 6 Thermal Vacuum Testing 
119 
 
transfer term from node z to node i, Qୣ୶୲ other external heat input to node i (W), Q୧୬୲ 
internal heat dissipation input to node i (W). 
The importance of understanding the heat balance or thermal equilibrium of a system 
is that it helps in constructing accurate Thermal Mathematical Model (TMM) for the 
system. The thermal balance test is used in measuring the heat balance conditions; it 
can be combined with the thermal vacuum test, in one step, if the thermal control 
subsystem was not the subject of the test. TMM also defines the maximum and 
minimum limits for conducting the thermal vacuum tests. Thermal balance tests 
should be conducted in vacuum condition where the pressure is preferred to be less 
than 10-4 Pa. 
6.2 Thermal Effects on 65 nm Virtex-5 FPGA 
The effect of thermal conditions on electronic components operation is mainly 
characterized by changing the power consumption. In Xilinx FPGAs there are 6 
power modes: 1) Power on, 2) Configuration, 3) Standby, 4) Active, 5) Suspend and 
6) Hibernate [77]. The active power consumption is the power consumed during the 
design operation and it is equal to the summation of the standby power and the 
design dynamic power. The standby power is the summation of the device static and 
design static power consumptions. The active power is estimated from Eq. (6-19) and 
Eq.  (6-20) [77]: 
ܵݐܾܽ݊݀ݕ	݌݋ݓ݁ݎ = ܦ݁ݒ݅ܿ݁	ܵݐܽ݅ܿ	݌݋ݓ݁ݎ + ܦ݁ݏ݅݃݊	ܵݐܽݐ݅ܿ	݌݋ݓ݁ݎ 
(6-19) 
ܣܿݐ݅ݒ݁	ܲ݋ݓ݁ݎ = ܵݐܾܽ݊݀ݕ	݌݋ݓ݁ݎ + ܦ݁ݏ݅݃݊	ܦݕ݊ܽ݉݅ܿ	݌݋ݓ݁ݎ 
(6-20) 
The device static power depends on the transistor feature size, applied voltage, 
junction temperature and the manufacturing process [77]. The junction temperature 
itself depends on the ambient temperature, applied voltage and total current supplied. 
However, the total current supplied depends on the device static power. It is a 
circular dependency and tools used to estimate the power consumption use 
successive iterations to converge to an approximation of the static power [77]. The 
device static power is known as the transistor leakage power and it is the power that 
is consumed when the FPGA is switched on but not configured [78]. 
 The desig
represents
configured
The desig
and the co
and on-ch
which alre
DCM cloc
proportion
proportion
The stati
semicondu
excitation 
energy is 
which ena
the conduc
fact that t
another at
Figure 6-5
n static po
 the static 
 but no clo
n static pow
nfiguration
ip blocks 
ady exist a
k generator
al to the cu
al to the ju
c power 
ctor physic
received b
used to inc
bles them t
tion energy
he current 
om until the
. Change o
in 65
Chap
wer depen
power tha
ck and no
er also de
 details. Th
would be. 
nd can be 
s, I/O term
be of the ap
nction temp
increases 
s due to P
y an elect
rease the 
o reach the 
 band, high
starts to f
y rest dow
f Static Pow
 nm SRAM
ter 6 Therm
ds on the d
t is consu
 inputs are
pends on t
e larger the
On-chip b
accessed b
inators and
plied volta
erature and
ௌܲ௧௔௧௜௖
with the 
oole-Frenk
ron which 
energy of t
conduction
er current 
low by the
n after losin
er with Pr
-based FP
al Vacuum
esign whi
med by th
 applied he
he junction
 design is, 
locks are c
y the desig
 transceive
ge as show
 transistor 
∝ ௖ܸ௖ଷ  
junction t
el effect. T
exists in a
he electron
 energy ban
consumptio
 electrons 
g their ther
ocess, Volt
GA Xilinx 
 Testing 
ch is loade
e design w
nce no out
 temperatu
the larger it
omponents
n such as B
rs. The stat
n in Eq. (6-
speeds as sh
emperature
he effect e
n electric 
s at the va
d. When m
n is noticed
movement 
mal excitat
age and Jun
Virtex-5 [7
d in the F
hen the F
puts are ge
re, applied
s use of log
 inside the
RAMs, P
ic power is
21) [79]. A
own in Fig
 as expla
xplains the 
field. The 
lance ener
ore electro
. This is du
from one 
ion energy
ction Temp
9]. 
120 
PGA. It 
PGA is 
nerated. 
 voltage 
ic gates 
 FPGA 
LLs and 
 directly 
lso, it is 
ure 6-5.  
(6-21) 
ined in 
thermal 
thermal 
gy band 
ns reach 
e to the 
atom to 
.  
 
erature 
Chapter 6 Thermal Vacuum Testing 
121 
 
The design dynamic power is the power consumed by the design due to switching. It 
depends on the number of nodes, nodes’ capacitances, applied voltage, and switching 
frequency. The dynamic power can be calculated from Eq. (6-22) [79]. 
ௗܲ௬௡௔௠௜௖ = ݊ܥܸଶ݂ 
(6-22) 
Where: n is the number of toggling nodes, C is the node capacitance, V is the voltage 
swing, and f is the toggling frequency. 
It is clear from the power consumption profiles of the SRAM-based FPGA that 
reducing the thermal power would take place by reducing the static and/or dynamic 
power. Reducing the static power can be done through reducing the temperature 
using heat sink or reducing the applied voltage by 5% to be 0.95 V. However, we 
prefer to use intolerant regulators that keep the power at 1 V, in order not to have 
power fluctuations. Reducing the dynamic power takes place through reducing the 
number of toggling nodes by reducing the configured logic such as using two 
processors instead of 4. Also, we can reduce the toggling frequency to reduce the 
dynamic power.  
Xilinx introduced a new approach in manufacturing its FPGAs called triple oxide. 
That technique is used to reduce the static device power consumption increment, due 
to increase in the leakage current, which was expected due to reducing the transistor 
feature size. Xilinx have different gate oxides thickness in the transistors inside the 
FPGA: 1) Thin-oxide is used in the small. low threshold voltage, and fast transistors 
in the FPGA core, 2) Thick-oxide is used in higher voltage, swing transistors for I/O. 
and 3) Middle-oxide which is middle thickness transistors used in configuration 
memory cells and interconnect transistors which operate at higher threshold voltage 
than the thin-oxide transistors. Xilinx made use of the fact that the design uses 
hundreds of millions of transistors as configuration cells. It is meaningless to use the 
higher leakage thin-oxide fast transistors for storing the configuration data while they 
can be of lower leakage if middle-thickness transistors are used. That technique 
greatly reduces the leakage current, compared to the case if triple-oxide was not 
applied, while keeping the performance of the FPGA as required [80].  
It is important to notice that reducing the junction temperature would increase the 
reliability and according to Xilinx it increases the lifetime by 2X [81]. 
Chapter 6 Thermal Vacuum Testing 
122 
 
6.3 Thermal Control Techniques  
Thermal control is used to preserve the operating temperature limits for the different 
components inside the satellite. All electronic components have a temperature range 
within which they can operate. Thermal control tries to keep the lowest and highest 
operating temperatures, which a component might face, within the limits specified in 
the component datasheet. Thermal control techniques can be grouped into two main 
categories: 
1) Active control 
2) Passive control  
Active thermal control is used to keep the thermal conditions within the desired 
operating ranges by using techniques that consumes electrical power. Heaters and 
Louvers are examples of active thermal control. 
Passive thermal control uses techniques which do not consume electrical power to 
preserve the operating temperature limits. Multi-Layer Insulators, Heat sinks, Heat 
shields, thermal blankets, heat pipes, surface finishing and color, and radiators are 
examples of passive control techniques [73]. 
In the design of the MPSoC, we used DTM to actively control the junction 
temperature by reducing the dynamic power consumption through two methods: 
1) Design reconfiguration by changing number of processor systems. 
2) Operating frequency control by changing from 100 MHz to 50 MHz. 
Also we used a passive technique to reduce the static power consumption through a 
low thermal resistance black anodized heat sink, thermal resistance was about 7 
C˚/Watt. Combining all three techniques we get the following possibilities: 
1) Heat Sink, 4 Processors, 100 MHz. 
2) Heat Sink, 4 Processors, 50 MHz. 
3) Heat Sink, 2 Processors, 100 MHz. 
4) Heat Sink, 2 Processors, 50 MHz. 
5) No-Heat Sink, 4 Processors, 100 MHz. 
6) No-Heat Sink, 4 Processors, 50 MHz. 
Chapter 6 Thermal Vacuum Testing 
123 
 
7) No-Heat Sink, 2 Processors, 100 MHz. 
8) No-Heat Sink, 2 Processors, 50 MHz. 
6.4 Geometric Thermal Model 
The TMM is used to estimate the expected equilibrium temperatures of the system 
components before the actual thermal vacuum tests. The model is generated after 
defining the geometric thermal model of the system which is drawn in accordance to 
the real test chamber dimensions and boundary conditions. It can be used in 
extrapolating the design performance under different operating conditions as well as 
giving an initial estimation about the expected FPGA temperature during the tests.  
It should be noted that the model has a limited accuracy within ±10 ˚C. The accuracy 
depends on the geometric modelling, estimation of the materials emissivity (ɛ), 
power consumed by the components, and thermal resistances models at conductions. 
It is difficult to estimate all of the conductance thermal resistances accurately, 
therefore testing is performed and the parameters are back annotated to the model to 
achieve better accuracy. 
To establish the TMM the following steps are accomplished: 
1) Build the geometric model of the system. 
2) Set the thermal parameters of all the system components.  
3) Define the mesh nodes. 
4) Run simulation and perform results rehearsals. 
The system geometric model was built using AutoCAD-2104. The correct 
dimensions of the system and its surrounding shroud during the thermal vacuum tests 
were defined. The thermal parameters were set through Thermal Desktop. The 
system consisted of a Printed Circuit Board (PCB) from glass-epoxy FR-4 with 
components attached to it. The PCB was placed in a sandwich-like structure between 
top and bottom Aluminum sheets. Iron bolts were used to attach the top sheet (ceil) 
to the bottom sheet (floor). The emissivity values for all of the materials were set 
based on their types. Twelve nodes were selected for running the simulations. The 
geometric model and the simulation results are shown in Figure 6-6 and Figure 6-7 
Chapter 6 Thermal Vacuum Testing 
124 
 
respectively. The estimated FPGA temperature was about 97.74 ˚C while the actual 
test data showed values up to 105 ˚C. 
 
 
Figure 6-6. MPSoC OBC Geometric Thermal Model. 
Chapter 6 Thermal Vacuum Testing 
125 
 
 
 
Figure 6-7. Thermal Mathematical Model Simulation Results. 
6.5 Thermal Vacuum Test Objectives  
The major thermal vacuum test objectives can be summarized in the following points 
[82]: 
1) Verify the thermal margins of the system through which it can operate. 
2) Test the effectiveness of the thermal mitigation techniques.  
3) Judge the performance of the system in thermal vacuum environment at high 
and low cycles. 
4) Validate the TMM through back annotation with the test results and 
rerunning of the simulation. 
5) Identify anomalies in the system design, PCB manufacturing and/or 
assembling. 
6) Testing the thermal stress effects on the PCB, soldering pads and soldering 
materials. Performing X-ray scan after the test is useful to identify any cracks 
in the soldering pads due to thermal stress. The Xilinx Virtex-5 FPGA 
package uses Ball Grid Array (BGA). The soldering material is Sn/Ag/Cu 
Chapter 6 Thermal Vacuum Testing 
126 
 
solder. The Coefficient of Thermal Expansion (CTE) for the solder material 
and the glass epoxy FR-4 PCB are about 21 and 50 ppm/˚C respectively. 
These values of CTEs are good enough to keep the pads without cracks. 
6.6 Thermal Vacuum Test Setup 
Thermal vacuum testing of the MPSoC OBC took place at CeNT, Kyushu Institute 
of Technology, Japan. The center provides facilities for space environment testing of 
50 kg satellites with sizes around 50 cm × 50 cm × 50 cm. Also, it contains facilities 
for testing of smaller satellite subsystems. We used a small thermal vacuum chamber, 
shown in Figure 6-8, during the tests to save the consumption of liquid nitrogen 
which was about 0.1 kg/hour. 
 
Figure 6-8. Small thermal vacuum chamber used during the tests. 
The chamber outer diameter is about 40 cm, inner shroud diameter is 24.5 cm and the 
chamber depth is about 80 cm. Figure 6-9 shows the inner shroud dimensions in mm. 
Chapter 6 Thermal Vacuum Testing 
127 
 
 
Figure 6-9. Thermal Vacuum Chamber Shroud Dimensions. 
The pressure during the test reached 7.8	×	10-5 Pa. To reach that pressure we used a 
Turbo Molecular Pump. Liquid nitrogen was injected to maintain the shroud 
temperature in the range of -170 °C to -180 °C. The shroud acts as a Cryogenic pump, 
due to its cold surface, which helps in condensing the evaporating outgases from the 
device under test. A data acquisition system running on LabVIEW was used to 
collect the measurements from 10 temperature sensors, thermocouples, at a rate of 20 
readings per second. 
Figure 6-10, Figure 6-11, and Figure 6-12 show the temperature sensors attachments 
during the thermal vacuum test. Three temperature thermocouple sensors labeled (2), 
(3) and (4) were located on the FPGA top surface. These sensors are attached to the 
heat sink side walls when the heat sink was used to reduce the temperature as shown 
in Figure 6-12. Two sensors were located on the PCB board labeled (5) and (6). 
Sensor (7) was located on the edge of the board close to the mounting point. This 
sensor is used as the control point for achieving the test temperature profile. Sensor 
(8) was located at the bottom side of the PCB underneath the FPGA. Sensors (9) and 
(10) were used for the shroud top and bottom respectively. Sensor (1) was used to 
１９0
Φ２４5
１５5
 monitor th
system tem
the sun, al
The test te
the sensor
The tempe
C° during 
[83]. The 
cold temp
reduced t
mitigation
assembled
placed in t
 
Figur
 
e polyimid
perature a
bedo and re
mperature 
 at point (7
rature of po
the cold so
soaking tim
erature with
o one hou
 techniques
 in a sandw
he test cham
e 6-10. The
Chap
e heater tem
nd simulate
flection fro
profile was
). The clos
int (7) was
aking time 
e was init
out applyi
r while te
. Two test 
ich like s
ber. The s
rmocouple
ter 6 Therm
perature. T
 the total e
m the earth
 achieved 
ed loop co
 kept at 60 
to match w
ially two ho
ng the miti
sting the p
cycles wer
tructure, F
ystem setup
s Positions 
Positi
al Vacuum
he polyimi
ffect of the
 surface.  
by heating 
ntrol was i
C° during t
ith the SMC
urs in hot 
gation tech
reviously 
e conducted
igure 6-13
 schematic
During The
ons. 
 Testing 
de heater w
 heat input 
up the syst
mplemente
he hot soak
 acceptanc
temperatur
niques. Th
mentioned 
 at each te
shows the 
 is shown i
rmal Vacuu
as used to 
to the syste
em and mo
d using La
ing time an
e test requi
e and two 
e soaking t
8 thermal 
st. The sys
assembly t
n Figure 6-
m Tests - T
128 
raise the 
m from 
nitoring 
bVIEW. 
d at -24 
rements 
hours in 
ime was 
control 
tem was 
hat was 
14.  
 
op 
 Figure 
 
Figur
6-11. Therm
e 6-12. The
Chap
ocouples P
rmocouple
Pos
ter 6 Therm
ositions D
Positi
s Positions 
itions - wit
al Vacuum
uring Therm
ons. 
During The
h Heat Sink
 Testing 
al Vacuum
rmal Vacuu
. 
 Tests - Bo
m Tests - T
129 
 
ttom 
 
op 
 Figure 
 
 
6-13. Asse
Figur
Chap
mbled Sand
e 6-14. The
ter 6 Therm
wich Struc
rmal Vacuu
al Vacuum
ture Before
m Test Set
 Testing 
 Thermal V
up Schema
acuum Tes
tic. 
130 
 
ting. 
 
Chapter 6 Thermal Vacuum Testing 
131 
 
The schematic in Figure 6-14 shows the setup of the system inside the chamber and 
the outside connections as well. The system is placed in a sandwich structure where 
it was surrounded by upper and lower Aluminum metal sheets of A4 size. The 
thermocouples where attached to different locations as mentioned before. The data 
from the thermocouples was collected by LabVIEW through a DAQ module. The 
chamber control computer maintained the shroud temperature at around -170 °C by 
controlling the opening of an electric valve to supply liquid nitrogen inside the 
chamber.  
The control was carried out using LabVIEW through a DIO module. Also the 
chamber control computer controlled the power supply to the heater in order to 
maintain the required target profile temperature. The feedback for the shroud 
temperature was provided to the control computer through the sensors attached to the 
top and bottom of the shroud surfaces. The feedback for the target point temperature 
was provided through the sensor labeled (7) in Figure 6-10 and Figure 6-12. An 
additional heater was attached to the lower Aluminum sheet as an optional heating 
element but we did not need to use it during the tests. Monitoring computer was 
collecting the data from the software running on the processor systems during the 
thermal vacuum test to judge their correct operation. The interface between the 
FPGA, DUT, and the monitoring computer used serial communication.  
A circuit was implemented to measure the internal die temperature of the FPGA 
through the internal FPGA thermal diode. The circuit used the temperature 
measurement IC MAX6655 which was interfaced to the MSCP through I2C link. 
Two external WDTs, MAX6369, timer were used to detect the failure of the FPGA. 
The MSCP was resetting the WDTs which were connected in parallel to the Crydom 
MPDCD relay. Whenever any of the WDTs fails to reset its internal timer, based on 
a reset signal received from the MSCP, it asserts a low signal on its output which 
switches off the relay control side. Power is recycled on the relay output side and the 
FPGA is restarted. WDT asserts the low signal for one clock cycle. The WDT timer 
timeout period was set to 15 sec. 
Chapter 6 Thermal Vacuum Testing 
132 
 
6.7 Thermal Vacuum Test Procedure 
The software that was running during the tests was developed to achieve a full test 
cycle of all the processor systems in the MPSoC design. The algorithms of the 
software that run on the master processor and the slave processors is shown in Figure 
6-15 and Figure 6-16. The internal WDT in the FPGA which are connected to the 
master and slave processors kept generating an interrupt every 5.4 sec. When the 
slave processors receive the interrupt signals through their INTC, they invoke an ISR. 
The routine starts by sending an acknowledgement message of four characters 
(ACK0) to the master processor to inform it that the slave processor is alive and to 
test the mailbox connection between the master and slave processors. The slave 
processor starts an internal timer to count the interval between the successive 
interrupts received from the WDT. The internal timer value is captured by the next 
new WDT interrupt and the internal timer is restarted. The slave processor reports its 
status to the external monitoring computer together with the value of its internal 
timer through the UART interface. The last step is that the slave processor restarts 
the WDT to allow for a new interrupt to take place. If the slave processor fails to 
restart the watch dog timer it would be restarted by a reset signal to be sent from the 
WDT to the reset module of the processor.  
Chapter 6 Thermal Vacuum Testing 
133 
 
WDT Interrupt 
Received 
Start ISR
Send ACK0 to master 
processor through 
mailbox IP core
Capture Local Timer 
Value and restart the 
Timer
Send Data through 
UART to the external 
monitoring computer
Restart the local WDT
Return from ISR
 
Figure 6-15. Algorithm Executed at the Slave Processor at Thermal Vacuum Test. 
The master processor receives interrupts from its local WDT every 5.4 sec. The 
master and slave processors work on a master system clock so the interrupts timings 
are synchronized. The master processor ISR starts by reading the mailboxes FIFOs to 
receive the (ACK0) messages sent from the slave processors. If no (ACK0) is found 
from any processor then the whole system is reset and FPGA is reloaded. This takes 
place by not resetting the external WDT which leads to power recycling of the 
system through the signal sent from the external WDT to the SSR. Data is sent 
through UART to the monitoring computer. 
Chapter 6 Thermal Vacuum Testing 
134 
 
WDT Interrupt 
Received 
Start ISR
Receive ACK0 from 
slave processors 
through mailbox IP core
Restart the external 
WDT
Send Data through 
UART to the external 
monitoring computer
Restart the local WDT
Return from ISR
 
Figure 6-16. Algorithm Executed at the Master Processor at Thermal Vacuum Test. 
Before conducting the full thermal vacuum test, we conducted a check test to make 
sure that the FPGA would perform an automatic shutdown if the junction 
temperature reaches 125 ˚C. We conducted durability test to make sure that the 
FPGA would withstand the thermal cycling under vacuum condition. However, we 
noticed that the FPGA operating temperature was exceeding the recommended 
commercial grade upper limit of 85 ˚C during the operation of the 4 processor cores. 
We had an initial test cycle to better characterize the FPGA temperature profile in 
thermal vacuum testing. Sensors were attached as shown in Figure 6-17. The results 
Chapter 6 Thermal Vacuum Testing 
135 
 
shown in. Figure 6-18 were obtained. Thus, based on the results of the FPGA 
temperature profile, we decided to mitigate the design using the 8 previously 
mentioned thermal control techniques and repeated the tests again. Thermal 
mitigation test results are discussed in chapter 7. 
 
 
Figure 6-17. Initial Screening Thermal Vacuum Test Sensors Attachemnt. 
 
5 
4
1
3 
 Figure 
 
In this cha
screening 
the 4 proc
the operat
and dynam
the tests’ r
 
6-18. Initia
pter we pre
of the FPG
essor cores 
ing tempera
ic power c
esults and d
Chap
l Screening
sented the 
A temperat
were opera
ture were s
onsumptio
iscussion a
ter 6 Therm
 Thermal V
thermal vac
ure profile
ting simult
uggested b
ns. In the n
bout their s
al Vacuum
acuum Tes
uum test se
 showed an
aneously. M
ased on the
ext the cha
ignificance
 Testing 
t Results [6
tup and pr
 over limit
itigation te
 concept of
pter we pr
. 
7] ©IEEE 2
ocedure. Th
 performan
chniques to
 reducing t
esent the d
136 
 
013. 
e initial 
ce when 
 reduce 
he static 
etails of 
 Chapt
h
r
T
7.1 Tim
The time 
algorithm 
The time 
redundant
7-3 shows
data were
curve fittin
in the figu
the groupe
T
 : Rer 7
e results o
adiation te
he discuss
e and S
and space 
using vecto
span of exe
 system and
 the execu
 collected 
g on the g
res show th
d data histo
Figure 7-
Chap
esults a
f the spac
st, and the
ion about th
pace Re
redundanc
rs of rando
cution was
 the space 
tion times 
and groupe
rouped data
e red line o
grams. 
1. Non-Fau
ter 7 Resu
nd Dis
e and tim
 thermal va
e meaning
dundan
y systems 
mly genera
 collected f
redundant s
histograms
d to depic
 of each ca
f the norm
lt-Toleranc
lts and Disc
cussion
e redundan
cuum test 
 and signifi
cy Perfo
were teste
ted data w
or the non-
ystem. Figu
 of the thre
t their stati
se using n
al distribut
e Execution
ussion 
 
cy system
are presen
cance of th
rmance
d to execu
ords as des
fault tolera
re 7-1, Fig
e cases. T
stical distr
ormal distri
ion fitting a
 Time Hist
s performa
ted in this 
e results is 
  
te the bub
cribed in ch
nt system, 
ure 7-2, an
he executio
ibutions. W
bution. Th
nd the blue
ogram. 
137 
nce, the 
chapter. 
covered.  
ble sort 
apter 4. 
the time 
d Figure 
n times 
e made 
e graphs 
 bars of 
 
Chapter 7 Results and Discussion 
138 
 
 
Figure 7-2. Time Redundancy Fault-Tolerance Execution Time Histogram. 
 
Figure 7-3. Space Redundancy Fault-Tolerance Execution Time Histogram. 
Chapter 7 Results and Discussion 
139 
 
Each time a random sequence vector was generated, based on the clock seed of the 
computer system running MATLAB, to avoid repeating patterns of random 
sequences. The length of each vector was 100 words. The numbers of vectors applied 
during the test were 100 vectors. The statistical parameters of the normal distribution 
fitting of each histogram are shown in. Table 7-1. 
Table 7-1. Statistical Analysis of Execution Times Histograms in seconds. 
Parameter No-Fault 
Tolerance 
Time Redundancy Space Redundancy 
Min 0.002354 0.007171 0.002463 
Max 0.002739 0.008326 0.002847 
Mean  0.002547 0.007749 0.002655 
Median  0.002547 0.007749 0.002655 
Mode  0.002354 0.007171 0.002463 
Std 0.0001129 0.0003386 0.0001127 
Range  0.0003852 0.001156 0.0003847 
 
The mean execution time of the single processor without software TMR is about 2.55 
ms. It is almost the same as the mean execution time of the hardware TMR, space 
redundancy, which is 2.65 ms. The mean execution time of the single processor with 
software TMR, time redundancy, is 7.75 ms. Almost 3 times higher than the non-
fault tolerant and the space redundancy fault tolerance cases. This means that 
hardware redundancy is better in terms of execution time as it is as fast as the non-
fault tolerant system which contains no overheads. However, we have to carefully 
notice that adding complicated data exchange protocols between the processors will 
add an execution overhead hence increasing the execution time significantly. If too 
much time is spent in handling the communication between the processors then the 
execution time might be close to the time redundancy case. 
The power consumption varies between the single core and the multicore systems as 
was shown in chapter 4, Table 4-1. The power consumption was estimated using the 
XPower Analyzer Tool from the Xilinx-ISE toolset. The total power consumed by 
the multi-core system is about 1.26 Watt which is only about 28% higher than the 
power consumed by the single core system.  
Chapter 7 Results and Discussion 
140 
 
The total power consumption consists from: 1) dynamic power, and 2) static power. 
The dynamic power consumption is affected by the size of the design and its 
operating frequency. The static power consumption is mainly affected by the 
transistor feature size of the FPGA, junction temperature and the operating voltage. 
The static power consumption in the FPGA transistors is considered as leakage. The 
leakage current and hence the leakage power increase as the feature size of the FPGA 
transistors is minimized. This leakage power is consumed even if the FPGA is not 
configured with any bitstream. About 0.46 Watt is dissipated in the form of leakage 
power in the multicore and single core designs.  
Table 4-1 in chapter 4, represents the dynamic power consumed by the design 
resources: The on-chip clock trees, the logic gates, the design signals, the Block 
Random Access Memories (BRAM), the Digital Signal Processors (DSP) which has 
a value of zero as we did not use DSPs in the design, the Phase Locked Loops (PLL) 
and Digital Clock Managers (DCM) for generating the required operating frequency 
and the Input/Outputs (IO) logic. The difference in the dynamic power consumption 
between the two designs is about 0.278 Watt. This difference is due to the use of 
more cores, which means the use of more logic, signals, BRAMs, IOs and mostly 
more clock-trees, in implementing the multicore system. From Table 4-1 in chapter 
4, we conclude that the dynamic power consumption is the main responsible for the 
difference in the total power consumption. The FPGA resources are almost utilized 
by the multi-core system while about less than one-third is utilized by the single core 
system as shown in Table 4-2 and Table 4-3 in chapter 4. 
This comparative study investigated the effects of using hardware redundancy and 
software redundancy on the resources utilization, power consumption, execution 
speeds and reliability in single and multi-core designs of the Virtex-5 FPGA. The 
multi-core system makes better use of the resources and it executes at almost triple 
the speed of the single core system while its power consumption is only 28% higher 
than it. Thus, the multi-core system which uses space redundancy for implementing 
fault tolerance in the 65 nm Virtex 5 FPGA through repeating redundant hardware 
processor cores is more efficient and effective than the single core design. In terms of 
reliability the TMR design is better than the single core design as long as the time 
between scans is kept below a specific threshold which is 606 sec for this design as 
Chapter 7 Results and Discussion 
141 
 
was explained in chapter 4. Table 7-2 summarizes the time and space redundancy 
comparative study results. 
Table 7-2. Summary of Time and Space Redundancy Architectures Comparison. 
Parameter Time Redundancy Space Redundancy Evaluation 
Performance X 3X Good 
Power X 1.3X Moderate 
Reliability  
(at T scan = 100sec) 
X 1.1X Good 
Utilization X 2.1X Good 
 
It is recommended according to the obtained results to make use of space redundancy 
approaches when designing digital systems using the Xilinx Virtex5 (65nm) FPGA. 
This is valid due to the fact that considerable portion of the power consumed is 
dissipated in the form of leakage power. Adding extra logic did not add much to the 
total power consumed. The reliability of the space redundant system is higher or at 
least equal to the time redundant system below the critical threshold of the time 
between scans, and its execution speed is better as far as the communication protocol 
between the cores does not add much overhead. Resources are better to be utilized in 
the FPGA device rather than wasting them. The space redundant system makes 
higher utilization of the FPGA resources. That is an advantage for the space 
redundant over the time redundant systems in case no additional design modules are 
needed to be added in the design. In case larger packages are needed then it would be 
straight forward to use the larger Virtex5-LX85 which is pin compatible with 
virtex5-LX50. However, we should note that larger package means higher device and 
design static power consumptions.  
7.2 Static Cross Section from Accelerated Proton Beam Test 
The radiation test of the XC5VLX50-1FFG676C FPGA, which has a 65 nm 
technology, was conducted at TARRI to estimate the static cross section using a new 
method. We designed the method to make use of a single FPGA. Traditional 
Chapter 7 Results and Discussion 
142 
 
radiation test setup contains two main sections: 1) The DUT, and 2) the monitoring 
and control system. In our approach we used the SEU controller IP core to count the 
number of frames containing errors in every 128 full scans of the configuration 
FPGA frames. We approximated the problem by calculating the event bit cross 
section instead of the upset bit cross section. This means that the frame errors are 
counted as a single event therefore there would be no need to know the actual 
number of upset bits. Nevertheless, we should always consider that the actual 
situation in-orbit is a combination of both single and double errors. Triple and more 
bit errors are rare to occur [36]. The TARRI test results are compared to the test 
results previously obtained in the experiment of the Los Alamos National Laboratory 
(LANL) and NASA Goddard Space Flight Center (GSFC) at Indiana University 
Cyclotron Facility (IUCF) [64]. That experiment was conducted on the same 
commercial grade FPGA, Virtex 5 LX50, at an energy level of 65 MeV of 
accelerated proton beam. The event bit static cross section obtained in NASA’s 
experiment was (6.37x10−14 ± 1.17x10−15 cm2/bit). 
Figure 7-4 shows all of the radiation test results at TARRI in one graph. The results 
are collected during 4 radiation sessions. Three sessions were at a flux level of 10e06 
(cm-2.s-1) and one session at a level of 3e06 (cm-2.s-1). All the sessions continued to 
collect data until the SEU controller recorded 128 configuration frame errors.  
 
Figure 7-4. Accumulated No. of Frame Upsets versus Time at Flux Levels of 10e06 
and 3e06 (cm-2.s-1) – Energy Level of 65 MeV. 
0
20
40
60
80
100
120
140
0 50 100
N
o 
of
 F
ra
m
es
 U
ps
te
s  
Time (sec)
3e06 frame errors
10e06 frame errors - 1
10e06 frame errors - 2
10e06 frame errors - 3
Chapter 7 Results and Discussion 
143 
 
Figure 7-5 to Figure 7-8, show the individual results of conducting radiation tests 
using flux levels of 10e06 (cm-2.s-1) and 3e06 (cm-2.s-1). The graphs show the number 
of configuration frames that suffered multiple bit upsets which could not be corrected 
by the SEU controller core. The graphs are linearly curve fitted with the equation 
representing the curve shown as y(x). The R2 value is the coefficient of 
determination which is calculated to judge the goodness of the curve fitting to the 
plotted data.  
 
Figure 7-5. Accumulated No. of Frame Upsets versus Time at Flux Level of 10e06 
(cm-2.s-1) – Energy Level of 65 MeV – 1st Trial. 
Our average static cross section value over the 4 radiation sessions is (6.34e-14 
cm2/bit) while NASA’s value is (6.37e-14 ± 1.17e-15cm2/bit). If we consider the 
differences in devices accuracies between TARRI and NASA then we judge our 
results as valid and accepted. Our proposed technique is new and simple when 
compared to the traditional techniques for calculating the static cross section [19]. 
We used the DUT with an embedded SEU controller IP core, which performed read-
back of the configuration frames through the ICAP, instead of the traditional external 
y = 5.204x - 10.468
R² = 0.9778
0
20
40
60
80
100
120
140
0 5 10 15 20 25 30
N
o 
of
 F
ra
m
es
 U
ps
te
s  
Time (sec)
Chapter 7 Results and Discussion 
144 
 
scrubbing through JTAG or Select-MAP interface. The SEU controller core design is 
ultimately very small therefore no failures were detected in its operation during the 
short exposure times to radiation. The traditional methods are complex and have 
more connections as they perform external scrubbing, monitoring and control 
through a specific FPGA or microcontroller. More complex setups are not welcomed 
in radiation test experiments especially inside the radiation chamber.  
In order to calculate the static event cross section as in Eq. (5-5), the actual device 
size in bits has to be estimated. The actual device size depends on the number of bits 
being exposed to radiation. The opening in the copper shield, through which the 
proton beam was projected to the FPGA, was about 12 mm in diameter. This gives 
an exposed area of approximately 1.13 cm2. However, the FPGA die area is 1.44 cm2 
(12 mm × 12 mm). The amount of configuration words which can be read back in 
the scanning process initiated by the SEU controller are 339,200 words [84]. Each 
word contains 32 bit, therefore a total of 10854400 configuration bits can be read 
back by the SEU controller. By multiplying the total number of configuration bits by 
the area ratio (1.13/1.44), we get the total number of bits in the area exposed to 
radiation which is equal to 8525025.826 bits. By applying Eq. (5-5), we get the static 
event cross sections shown in Table 7-3 to Table 7-6. The average static event cross 
section from the four measurements is 6.34e-14 cm2/bit. 
Table 7-3. Static Cross Section Calculations at Flux Level of 10e06 (cm-2.s-1) – 
Energy Level of 65 MeV – 1st Trial. 
Flux level (cm-2.s-1) 10e06 
Exposure time (s) 25.16736 
No. of frame errors (events) 128 
Device size (bits) 8525025.826
Static event cross section 5.97e-14 
 
Chapter 7 Results and Discussion 
145 
 
 
Figure 7-6. Accumulated No. of Frame Upsets versus Time at Flux Level of 10e06 
(cm-2.s-1) – Energy Level of 65 MeV – 2nd Trial. 
 
Table 7-4. Static Cross Section Calculations at Flux Level of 10e06 (cm-2.s-1) – 
Energy Level of 65 MeV – 2nd Trial. 
Flux level (cm-2.s-1) 10e06 
Exposure time (s) 23.43168 
No. of frame errors (events) 128 
Device size (bits) 8525025.826
Static event cross section 6.41e-14 
 
 
y = 5.4266x - 0.0057
R² = 0.9985
0
20
40
60
80
100
120
140
0 5 10 15 20 25
N
o 
of
 F
ra
m
es
 U
ps
te
s  
Time (sec)
Chapter 7 Results and Discussion 
146 
 
 
Figure 7-7. Accumulated No. of Frame Upsets versus Time at Flux Level of 10e06 
(cm-2.s-1) – Energy Level of 65 MeV – 3rd Trial. 
 
Table 7-5. Static Cross Section Calculations at Flux Level of 10e06 (cm-2.s-1) – 
Energy Level of 65 MeV – 3rd Trial. 
Flux level (cm-2.s-1) 10e06 
Exposure time (s) 21.696 
No. of frame errors (events) 128 
Device size (bits) 8525025.826
Static event cross section 6.92e-14 
 
y = 5.671x + 2.9
R² = 0.9974
0
20
40
60
80
100
120
140
0 5 10 15 20 25
N
o 
of
 F
ra
m
es
 U
ps
te
s  
Time (sec)
Chapter 7 Results and Discussion 
147 
 
 
Figure 7-8. Static Cross Section Calculations at Flux Level of 3e06 (cm-2.s-1) – 
Energy Level of 65 MeV. 
 
Table 7-6. Static Cross Section Calculations at Flux Level of 3e06 (cm-2.s-1) – 
Energy Level of 65 MeV. 
Flux level (cm-2.s-1) 3e06 
Exposure time (s) 82.4448 
No. of frame errors (events) 128 
Device size (bits) 8525025.826
Static event cross section 6.07e-14 
 
The measured static cross section value gives an indication about the device 
sensitivity to the charged particles radiation. The sensitivity depends on the used 
manufacturing technology such as epitaxial CMOS, Silicon-on-Insulator (SOI) or 
Triple Oxide (TO). The static cross section is further used in estimating the soft error 
reliability as indicated in the chart in Figure 7-9. 
y = 1.4935x + 0.2492
R² = 0.9955
0
20
40
60
80
100
120
140
0 20 40 60 80 100
N
o 
of
 F
ra
m
es
 U
ps
te
s  
Time (sec)
Chapter 7 Results and Discussion 
148 
 
 
Figure 7-9. Radiation Analysis Chart. 
The static cross section ߪ௦௧௔௧௜௖	is used by the models which run under SPENVIS to 
estimate the SEU rate based on the target orbit parameters. The AP-8 and AE-8 
models are used to estimate the trapped protons and electrons energy spectra versus 
flux levels. Their output is used by the Cosmic Ray Effects on Micro-Electronics 
(CREME-96) model. The CREME-96 uses the rectangular parallelepiped method to 
estimate the accumulated charge and compare it with the critical charge at which the 
upset happens. The output of the CREME-96 is the SEU rate profile at the specified 
orbit. The design specific SEU rate is estimated based on the design bitstream size. 
The data is used to analyze the mitigated and non-mitigated cases. The mitigated 
case uses the configuration frames scrubbing method to wash out the configuration 
memory from SEU induced upsets. The soft error reliability is then calculated and 
estimated as a function in the time between scans. The non-mitigated case makes use 
Static cross section 
measurement by radiation test 
(ߪ௦௧௔௧௜௖) 
Use SPENVIS to simulate the 
orbit and estimate the SEU rate 
through running AP-8, AE-8, 
and CREME-96 models  
Estimate design specific in-orbit 
SEU rate based on the design 
bitstream size 
Estimate the soft error reliability 
versus scrubbing time  
(Mitigation Analysis) 
Target orbit 
definition  
Estimation of dynamic cross 
section through fault 
injection 
(Non-Mitigation Analysis) 
Orbit 
parameters
Worst case 
upset at 
SAA 
Chapter 7 Results and Discussion 
149 
 
of the fault injection concept to estimate the system dynamic cross section. The fault 
injection takes place at an upset rate which matches the worst upsets at the SAA 
region.  
7.3 Fault Injection versus Radiation Tests 
Fault injection is used in dynamic cross section estimation. It cannot be used in static 
cross section estimation. Radiation test can be used for both purposes. However, 
when using fault injection and/or radiation test, some points should be considered: 
1) Fault injection implementation almost would need additional hardware and 
software to be able to inject the faults. 
2) It might require stopping the operation of the design to inject the faults. However, 
if injection is on-the-fly as in the dynamic partial reconfiguration technique by 
Xilinx then there would be no need to stop the design operation. 
3) Fault injection using SEU controller IP core covers injection in the FPGA 
configuration memory but what about the bits in the control and data registers of 
the design such as the processor registers?. Our main goal when we used the SEU 
controller IP core was to evaluate the fault injection effect in the configuration 
frames as they carry the hardware design. On the other side, system level faults 
such as register upsets are well studied and can be mitigated using variety of 
techniques such as TMR, EDAC, and SIHFT. We performed TMR effectiveness 
analysis using simulations as will be shown later. The system can still be 
recovered from SEUs in case of its complete failure using WDTs to reset it.  
4) In the case of fault injection there is a non-simultaneous fault injection situation 
where faults are injected in the FPGA configuration on a one-by-one basis. Fault 
injection using SEU controller cannot inject faults in multiple frames 
simultaneously as in the real radiation case.  
5) In radiation test we usually conduct the test at a specific energy level of the 
charged particles. However, in space, charged particles exist at all energies. 
Therefore, an extrapolation of the performance during the test should be 
considered to be able to plot the system failure rate versus the different upset 
rates. In radiation test we usually control the exposure time and flux as they are 
easier than changing the energy level. In fault injection test, we can easily test at 
 differe
unit tim
6) In rad
capsul
packag
 
Figure 7-
7.4 Dyn
Figure 7-1
with two 
Each fault
the accum
types of c
reconfigur
place whe
the design
operation.
working b
without st
resynchron
be resynch
and then r
nt upset ra
e.  
iation test, 
ation. For e
e to get SE
10. de-caps
amic Cr
1, shows th
fault lists s
 list was ge
ulation of 
orrection t
ation and 
n the FPGA
 should be
 The partia
ut not the e
opping the 
ization tak
ronized to 
esumed aga
Chap
tes as we ju
if it was n
xample, C
U events a
ulated Spar
oss Sec
e results of
izes: 1) 50 
nerated ran
the faults 
hat can tak
3) Softwar
 stops wor
 reloaded t
l reconfigu
ntire three 
working pr
es place w
the same op
in. 
ter 7 Resu
st need to 
ot at a part
f-252 test n
s shown in 
tan-6 and V
tion thro
 running th
accumulate
domly for 1
at each tim
e place: 1
e resynchro
king due to
o the intern
ration take
processors.
ocessor(s) 
hen the sof
eration aft
lts and Disc
change the
icle accele
eeds to de-
Figure 7-10
irtex-5 FP
ugh Fa
e fault inje
d faults, a
0 times an
e. In the M
) Full FPG
nization. T
 fault injec
al SRAM 
s place whe
 The system
to restore t
tware of th
er one or tw
ussion 
 amount of
rator, we n
capsulate t
. 
GAs for Cf
ult Injec
ction campa
nd 2) 100 
d the result
PSoC des
A reconfig
he full re
tions. The 
in order to
n one or t
 can be pa
he correct o
e working 
o processo
 injected fa
eed to perf
he top cove
-252 radiati
tion Tes
ign on the 
accumulate
s were colle
ign there a
uration, 2
configuratio
entire bit-st
 restore the
wo process
rtially reco
peration. S
processors
rs stopped 
150 
ults per 
orm de-
r of the 
 
on test. 
t 
MPSoC 
d faults. 
cted for 
re three 
) Partial 
n takes 
ream of 
 correct 
or stops 
nfigured 
oftware 
 need to 
working 
Chapter 7 Results and Discussion 
151 
 
 
Figure 7-11. Fault Injection Results. 
The results show that about 10% of the injected faults in the 50 faults batch and 10% 
of the 100 faults batch needed full reconfiguration. Another 10% of the faults in the 
50 faults batch needed partial reconfiguration while 30% of the faults injected in the 
100 faults batch needed partial reconfiguration. This means that in the 50 faults batch, 
only 80% of the injected faults where totally recovered through the ACM of the SEU 
controller without the need for partial or full reconfiguration. In the case of the 100 
faults batch, 60% of the injected faults were fully recovered with no need of any 
reconfiguration. The software resynchronization takes place whenever a partial 
reconfiguration is initiated or a processor stops operation then resumes after the auto-
correction mode has been enabled. The obtained results of fault injection coincides 
with the concepts mentioned by Xilinx that about 10% or less of the faults would 
lead to total system failure. This result can be viewed as a validation of the fault 
injector operation. Figure 7-12, verifies the fault injector function as it shows the plot 
of 50 injected faults versus the times between injections in seconds. The times 
between injections follow an exponential distribution. The number of faults 
themselves follows Poisson distribution. 
0
1
2
3
4
5
6
7
8
9
10
Full
Reconfiguration
Partial
Reconfiguration
SW Resynch
Fa
ilu
re
 R
at
es
Recovery Type
SEU 50
SEU 100
Chapter 7 Results and Discussion 
152 
 
 
Figure 7-12. Time Distribution between Injected Faults. 
 
The TMR system was simulated at different data vector sizes. We use Eq. (7-1) 
deduced in [85] [86], to evaluate the TMR simulator behavior. 
ܴ = 1
௖ܶ
− 1
௖ܶ
ෑሾ3݁ିଶே೔௥ ೎் − 2݁ିଷே೔௥ ೎்ሿ
ெ
௜ୀଵ
 
(7-1) 
Where (R) is the TMR system failure rate, (Tc) is the cycle operation time during 
which faults are monitored also called the scrubbing time and we chose it as 1 sec, 
(N) is the number of bits in each TMR data vector and we set as 8 bits, (M) is the 
number of TMR groups for example we set it at 2048. Figure 7-13, shows the results 
of applying the fault injection over a packet size of 2048 bytes in a TMR operation. 
The packet contained a random vector of data and the vector is compared between 
three of the operating cores after faults were injected randomly in it. The vectors are 
compared value by value in a TMR operation through a voter. The upper curve 
(green) shows the failure rate as estimated by Eq. (7-1), while the lower curve (red) 
shows the simulation results. 
0 10 20 30 40 50 600
1
2
3
4
5
No. of Injected Faults
Ti
m
es
 b
et
w
ee
n 
In
je
ct
io
ns
 (s
ec
)
 F
From the 
faults in th
10% of th
case of da
than 50% 
To estimat
We estima
system fai
the system
correspond
the accura
The syste
simplicity
trial size. B
igure 7-13
results we 
e FPGA bi
e injected f
ta fault inj
of the data 
e the dynam
ܦݕ݊ܽ
te the dyn
lure at any 
 fails whe
s to 50 fai
te number 
m might h
 and as a w
y experim
Chap
. TMR Fail
conclude th
t-stream to 
aults in the
ection at an
had residua
ic cross se
݉݅ܿ	ܥݎ݋ݏݏ
amic cross 
trial means
n injecting 
lures. The r
of bits afte
ave stoppe
orst case w
enting we f
ter 7 Resu
ure Rate at 
at fault inj
simulate th
 hardware 
 upset rate
l failures. 
ction we u
	ܵ݁ܿݐ݅݋݊ =
section in 
 the failure 
50 accumu
eason for t
r which the
d after 10
e conside
ound that 1
lts and Disc
Data Vecto
ection can
e effects of
bit-stream n
 of 0.1 up
se Eq. (7-2)
	 #	݋݂	ݏݕݏ#	݋݂
a pessimis
at all of the
lated fault
hat assump
 system sto
 or 20 or 
r that the to
 out of 10 t
ussion 
r of Size 20
 be used fo
 the space e
eeded full 
sets per bit
 : 
ݐ݁݉	݂݈ܽ݅ݑݎ
ܾ݅ݐ	ݑ݌ݏ݁ݐݏ
tic way. W
 bits in tha
s we would
tion is that
pped withi
40 errors 
tal failures
rials had to
48 bytes. 
r injecting 
nvironmen
reconfigur
s per secon
݁ݏ
 
e consider 
t trial. Ther
 consider 
 we cannot 
n the 50 inj
were injec
 correspon
tal failure i
153 
random 
t. About 
ation. In 
d, more 
(7-2) 
that the 
efore, if 
that this 
identify 
ections. 
ted. For 
d to the 
n the 50 
 
Chapter 7 Results and Discussion 
154 
 
and 100 bits experiments. This corresponds to a dynamic cross section of (50/500) or 
(100/1000) which is 0.1 failures per bits upsets. So if there are 1000 upsets the 
system would fail by 100 times which is too much pessimistic consideration. To 
calculate the dynamic cross section per device we would divide by the number of bits 
in the device (0.1/11.37e6) = 8.79507e-09 bit-1. As the injection elapsed about 60 sec 
for each trial. The Dynamic cross section per bit per sec can be calculated as = 
8.79507e-09 bit-1 /60 = 1.46585e-10 sec-1 bit-1. This value of the dynamic cross 
section is in the case of a fully configured FPGA. The expected failure rate depends 
on the actual design size. Therefore, the MicroBlaze system failure rate would be 
(0.26 ×11.37e06 ×  1.46585e-10 = 4.33e-04 sec-1). The simulation of the system 
failure rate in SPENVIS showed that each MicroBlaze system failure rate would be 
0.00183517 sec-1 (at non-SAA)/3 = 6.117233e-04 sec-1. The results show that 
SPENVIS calculations overestimated the failure rate by 30%. The actual in-orbit 
SEU rate as reported by a Virtex-4 flight experience was found to be four to five 
times less than the rate estimated by the CREME96 model which is used in 
SPENVIS [36]. This means that our estimation of the failure rate based on the fault 
injection approximation, complete 50 failures or 100 failures whenever a trial fails, 
has enough calculation margins. 
7.5 Thermal Vacuum Test Results  
Thermal vacuum test was repeated after the thermal vacuum screening test showed 
that the FPGA temperature reached about 105 ̊C. In the repeated test we used the 
combination of the thermal control techniques mentioned in chapter 6. We thought of 
these techniques as a method that might reduce the FPGA core temperature through 
controlling the thermal power. At first we had to assure the auto protection feature of 
the Xilinx Virtex-5 FPGA. It is well designed to shut down when its core 
temperature exceeds 125 ˚C. It keeps off, until the silicon cools down and then it 
restarts again as shown in Figure 7-14. The FPGA cools to less than 114 ˚C. 
However, for reliability and long lifetime we should keep the FPGA around 85 ˚C as 
recommended by Xilinx for the commercial grade Virtex-5. After the auto protection 
test we performed a durability test at the 105 ̊C which was recorded during the initial 
temperature profile screening. The durability test, result shown in Figure 7-15, was a 
form of stress testing to make sure that the FPGA can withstand the over limit 
temperature stress even if it was not mitigated.  
  
Figure 7-1
the built t
explained 
Figu
6 shows th
hermal dio
in chapter 6
Chap
re 7-14. FP
Figure 7
e worst cas
de connec
. 
ter 7 Resu
GA Autom
-15. FPGA
e temperatu
ted to an e
lts and Disc
atic Therm
 Durability
re record fr
xternal tem
ussion 
al Protectio
 Test. 
om inside t
perature m
n. 
he FPGA d
easuremen
155 
 
 
ie using 
t IC as 
 Figure 7-
The extern
are the the
 
Figure
16. Therma
al tempera
rmocouples
 7-17. The
Chap
l Vacuum T
Int
ture sensor
 previously
rmal Vacuu
ter 7 Resu
est, No-He
ernal Die T
s readings 
 shown in 
m Test, No
Thermoc
lts and Disc
at Sink, 10
emperature
are shown 
Figure 6-10
-Heat Sink
ouples. 
ussion 
0 MHz, 4 P
. 
in Figure 7
 and Figure
, 100 MHz,
rocessors –
-17, theses
 6-11. 
 4 Processo
156 
 
 FPGA 
 sensors 
 
rs – 
 Figure 7-
mitigation
FPGA die
Figure 
The exter
Figure 7-1
 
Fig
18 shows 
 techgnique
. 
7-18. Therm
nal temper
9. These ar
ure 7-19. T
Chap
the best c
s both pas
al Vacuum
Int
ature senso
e the therm
hermal Vac
ter 7 Resu
ase temper
sive and ac
 Test, Hea
ernal Die T
rs readings
ocouples at
uum Test, 
Thermoc
lts and Disc
ature recor
tive. The m
t Sink, 50 M
emperature
, external 
tached in F
Heat Sink, 
ouples. 
ussion 
d when a
easuremen
Hz, 2 Pro
. 
thermocoup
igure 6-11.
50 MHz, 2
pplying all
t is from is
cessors – F
les, are sh
and Figure 
Processors 
157 
 of the 
nide the 
 
PGA 
own in 
6-12. 
 
– 
 Table 7-
Table 7-7 
mitigation
using heat
the static 
an effecti
technique.
number o
maximum
combined 
maximum
of the tech
temperatu
case.  
7. Thermal 
summarize
 technique 
 sink is an 
power. Red
ve approac
 Using re
f nodes n
 reduction i
together th
 temperatur
niques whe
re reduction
Chap
Vacuum T
s the mitiga
would exc
effective p
ucing the o
h either a
duced num
eed to be 
n the opera
us reaching
e without 
n applied a
 in referen
ter 7 Resu
est Results 
tion techni
eed the ope
assive solu
perating fr
lone or w
ber of pro
toggled th
ting temper
 66 ̊C whic
any mitigat
lone or com
ce to the m
lts and Disc
with and w
ques test re
rating limi
tion to redu
equency to
hen comb
cessor sys
us reducin
ature is ach
h correspon
ion. Figure
bined toge
aximum tem
ussion 
ithout Mitig
sults. The 
ts up to 10
ce the tem
 reduce the
ined with 
tems woul
g the dyn
ieved whe
ds to -37%
 7-20 show
ther throug
perature in
ation Tech
system with
5 C̊. It is c
perature an
 dynamic p
another m
d also red
amic pow
n all techni
 reduction 
s the effec
h the perce
 the non-m
158 
niques 
 
out any 
lear that 
d hence 
ower is 
itigation 
uce the 
er. The 
ques are 
from the 
tiveness 
ntage of 
itigated 
Chapter 7 Results and Discussion 
159 
 
 
Figure 7-20. Thermal Mitigation Effectiveness. 
In this chapter we presented the detailed results and discussion of the MPSoC OBC 
performance evaluation, time and space redundancy approaches effectiveness 
comparison, static event cross section estimation through proton beam radiation test, 
dynamic cross section estimation through fault injection and thermal vacuum 
mitigation test results. In the next chapter we present the conclusion and future work. 
 
 
-40.00
-35.00
-30.00
-25.00
-20.00
-15.00
-10.00
-5.00
0.00
Scale-down Heat Sink Freq. Red. Scale-down
& Heat Sink
Scale-down
& Freq.
Red.
Heat Sink &
Freq. Red.
All
Techniques
Re
du
ct
io
n 
Pe
re
cn
ta
ge
 Chapt
h
b
i
technology
conclusive
8.1 Con
In future 
scientific 
than send
satellite m
orbit proc
session tim
To downlo
high band
means we
required 
incurring 
afforded 
performan
satellite w
board pro
and hence
cost drasti
The resea
affordable
The delive
can be use
to provide
within the
T
 : Cer 8
e thesis pr
e used in f
n the imple
 by the tim
 outcome o
clusion 
space miss
instruments
ing it to th
eans that t
essing. As
e with them
ad large vo
width for 
 have to de
communica
higher recu
by scienti
ce processi
ithout the n
cessing wo
 the satellite
cally when 
rch tried to
 cost and r
rable was 
d in scienti
 sufficient 
 tight pow
Chapter 8 C
onclus
esented the 
uture space
mentation 
e of writin
f the thesis
 
ions a lot 
. This data
e ground. 
he results a
 scientific 
 is limited
lumes of s
higher bitra
velop satel
tion system
rring and n
fic institut
ng comput
eed to dow
uld reduce 
 size, mass
compared t
 provide th
eliability b
defined to 
fic satellite
reliability s
er budget 
onclusion 
ion and
design and
 missions. 
which is co
g the thesi
 as well as 
of data w
 would be 
This techni
re sent to t
satellites m
 to 10 min 
cientific da
tes and hi
lites with l
 requirem
on-recurrin
es. There
ers that wo
nload the 
the require
 and cost. A
o space app
e needed h
y making u
be a low c
s as well as
uitable for 
limits. As 
and Future
 Future
 qualificatio
The design
nsidered a 
s as in [14
future resea
ould be co
analyzed a
que of pro
he ground 
ostly fly 
at maximum
ta and satel
gh power 
arger sizes
ents. Buil
g engineer
fore we t
uld process
raw data t
d commun
lso using C
roved com
igh proces
se of the s
ost high pe
 commerci
small satell
low cost d
 Perspectiv
 Persp
n of an MP
 used an S
state-of-art
]. In this ch
rch perspec
llected thr
nd process
cessing the
station whi
in LEO, t
 and for 3
lite telemet
for correct 
 and masse
ding large
ing costs w
hought of
 the data lo
o the groun
ication sys
OTS comp
ponents. 
sing capab
tate of art 
rformance
al ones. The
ites mission
oes not me
es 
ectives 
SoC-OBC 
RAM-base
 in small sp
apter we c
tives. 
ough the o
ed on-boar
 data on-b
ch is know
he commu
 to 4 times 
ry, we need
transmissio
s to suffice
r satellites
hich migh
 developin
cally on-b
d station. T
tem charac
onents red
ility platfor
COTS tech
 MPSoC O
 MPSoC O
s as well a
an that the
160 
that can 
d FPGA 
acecraft 
over the 
n-board 
d rather 
oard the 
n as in-
nication 
per day. 
 to have 
n. This 
 for the 
 means 
t not be 
g high 
oard the 
his on-
teristics 
uces the 
m at an 
nology. 
BC that 
BC had 
s fitting 
 system 
Chapter 8 Conclusion and Future Perspectives 
161 
 
should be of low reliability, we had to propose mitigation techniques in the design 
and perform exhaustive testing on them to make sure that the system would work 
flawlessly in the space environment.  
The design was performed using the Xilinx EDK tool. A total utilization of 98% of 
the FPGA resources was recorded. That utilization is occupied by 4 complete 
embedded systems. If there is no need to expand the design after deploying it by 
adding extra logic then the higher the utilization would be the better as long as the 
power constraints are respected.  
Performance test was conducted to judge the performance of 65 nm FPGAs in the 
case of time and space redundancies. The execution time of the bubble sort algorithm 
on randomly generated data was almost three times larger in the case of time 
redundancy compared to space redundancy. Space redundancy execution time was 
slightly higher than the execution time of non-fault tolerant system. If more complex 
data communication protocol was used among the processors while exchanging their 
data for voting then the communication overhead would be higher. That would lead 
to significantly increasing the space redundancy execution time compared to the non-
fault tolerant case. Therefore, we should be careful in choosing simple yet reliable 
data communication protocols between the processors in order to reduce the 
communication time overhead. 
Reliability of the system was studied both in the case of time and space redundancies. 
It was found that the time between complete scans (scrubs) of the configuration 
memory is significant in determining the immunity of the configuration bitstream to 
the induced soft errors. The lower the time between the complete scans the faster the 
detection and hence correction of errors. Continuous scanning using the SEU 
controller was proven to assure a reliability of 0.999999. It should be noted that in 
case of using the TMR regime then the time between scans should not exceed a 
certain limit otherwise the reliability of the single core system would be higher than 
the TMR system. In TMR systems that limit is usually the 7 tenth (0.69) of the 
operating time. 
Power consumption was noticed to be higher in the case of the triple core space 
redundancy when compared to the single core time redundancy. The dynamic power 
consumption was increased due to the extra logic used in the space redundancy by 
Chapter 8 Conclusion and Future Perspectives 
162 
 
about 30% compared to the time redundancy case. By evaluating the four parameters 
of utilization, performance, power consumption and reliability, the space redundancy 
proved to be the best choice for 65 nm SRAM-based Virtex-5 FPGA. That result can 
be generalized to other FPGAs of the same technology feature size. It should be 
noted that the lower the technology feature size the higher the static power 
consumption and SEU induced soft errors would be. 
Radiation testing was performed to judge the static and dynamic cross sections. The 
static cross section was evaluated using a new method. Conventionally two systems 
are used to evaluate the static cross section: 1) The DUT, and 2) Monitoring and 
control system. The monitoring and control system is responsible for controlling the 
test operation as well as scrubbing the FPGA for induced errors due to irradiation by 
the proton or heavy ion beam. In our case we used only one device, the DUT. We 
replaced the functions of the external monitoring and control system with the internal 
SEU controller. It performed the scrubbing function and reported the errors found 
through the UART interface to an external monitoring computer. We calculated the 
frame errors during exposure to radiation at 4 trials. The frame errors enabled us to 
calculate the event bit cross section which was found to be 6.34e-14 cm2/bit and 
NASA’s value is 6.37e-14 cm2/bit at a proton beam energy level of 65 MeV. 
The dynamic cross section was evaluated through fault injection. The performance of 
the system was monitored while injecting randomly generated 50 bit flips in the 
configuration frames for 10 times. The injection was repeated with randomly 
generated 100 bit flips for 10 times as well. The MicroBlaze system failure rate was 
estimated to be 4.33e-04 sec-1 using fault injection while it is estimated to be 
6.117233e-04 sec-1 using SPENVIS at non-SAA. SPENVIS shows 30% higher 
estimation. The calculations of the static and dynamic cross sections are necessary to 
characterize the device and hence expect its performance under different radiation 
conditions. 
Thermal vacuum testing was conducted to estimate the commercial grade FPGA 
ability to operate flawlessly within the expected in-orbit temperature ranges. The 
SMC standard, Air Force Space Command/Space and Missile Systems Center 
Standard - Test Requirements for Launch, Upper-Stage and Space Vehicles, was 
chosen to define the test limits.  
Chapter 8 Conclusion and Future Perspectives 
163 
 
Thermal vacuum testing was conducted in between -24 ˚C and + 60 ˚C. The FPGA 
temperature reached about 105 ˚C which is outside the recommended upper limit of 
85 ˚C for the commercial grade. The FPGA was tested to check its auto-protection 
mechanism to switch off itself if the temperature reaches 125 ˚C. The mechanism 
worked well and then several mitigation techniques were proposed to put the 
operating temperature close to or less than the 85 ˚C limit. Heat sink, decreasing the 
operating frequency, and decreasing number of processors in the FPGA were tested. 
The temperature was proven to reach 66 ˚C when applying all of three techniques 
together. We recommend to use the black anodized heat sink with high emissivity 
(>0.8) at 50 MHz to keep the system temperature at about 77 ˚C. We should carefully 
consider that power consumption varies with the operating temperature. Static power 
consumption increases when temperature increases or core voltage changes. 
Dynamic power consumption varies with the change in the number of nodes being 
switched or switching frequency or core voltage. Reducing the operating temperature 
reduces the consumed power and reducing the power consumption reduces the total 
thermal power. This is a recursive relation that is used to achieve the required 
operating temperature and power limits. 
This research is crucially important to the satellite development community as it 
would enable the satellite developers to build satellites with higher processing 
capabilities at much lower costs. Using COTS in space applications is an on-going 
state-of-art research that would open the horizon for developing new test techniques, 
algorithms and architectures for fault tolerant operation in the harsh space 
environment. Even in ground-based applications, neutrons induced SEUs were 
recorded. Therefore, mitigation against such effects is mandatory for safety critical 
applications and high quality services. The research also focused on the importance 
of providing effective thermal vacuum mitigation techniques for the FPGA. The 
results help in providing the best route to utilize the correct combination of 
mitigation techniques against soft errors and thermal vacuum conditions. These 
results can be applied to terrestrial based applications as well as space based 
applications.  
The major contributions of this research can be summarized as : 1) The use of FPGA 
technology to develop a new reprogrammable design for low cost, high performance, 
moderate power and high reliability OBCs, 2) Formalization of the procedure to 
Chapter 8 Conclusion and Future Perspectives 
164 
 
analyze in-orbit radiation effects and the efficiency of scrubbing to mitigate them, 3) 
New test method for static cross section calculation at accelerated proton radiation 
test, and 4) proposing effective thermal vacuum mitigation techniques for protecting 
commercial grade FPGAs while in-orbit as well as pinpointing the effect of the 
different techniques on power consumption.  
8.2 Future Perspectives 
In this research we introduced methods that can be generalized on different types of 
SRAM-based FPGAs with different technology feature sizes. Nowadays 28 nm 
FPGAs are available. It would be good to extend the research results on other types 
of SRAM-based FPGAs which provide higher performances with low power 
consumption and high reliability.  
The essence of this research was to enable the commercial grade SRAM-based 
FPGAs to be used in serious space applications. The research should continue in the 
stream of enabling COTS SRAM-based FPGAs for variety of space applications. 
That would give higher access to universities, research institutes and developing 
countries to be actively participant in the dynamic and on-going progresses in space 
exploration. 
Enabling modern FPGAs with feature sizes less than 65 nm to be used in space 
depends on two pillars: 1) developing new mitigation techniques, and 2) developing 
new test standards and procedures. As the feature size is reduced more challenges are 
introduced. Radiation effects, mainly SEUs, would be highly manifested as more soft 
errors are likely to be induced. This is due to the fact that the decrease in the node 
capacitances corresponds to decrease in the node critical charges needed to induce an 
upset. It is likely, then, to expect MBUs in the same configuration frame. Therefore, 
the current SECDED technique would not be suitable anymore. Dynamic partial 
reconfiguration through reading the full correct frame from an external golden flash 
memory and writing it back to the FPGA would solve the MBU problem. 
As the technology feature decreases the device static power consumption increases as 
more current is likely to leak. Xilinx introduced the triple oxide concept to reduce 
that problem. However, on the system designer side, active thermal control might be 
needed to handle the increase in temperature due to the increase in static power 
Chapter 8 Conclusion and Future Perspectives 
165 
 
consumption. Traditional passive thermal control might not be sufficient for 
achieving the required temperature operation limits.  
Some tests are necessary to conduct for small feature sized FPGAs such as Total 
Ionization Dose and Displacement Damage Dose. X-ray scan before and after the 
thermal vacuum test might also be useful to study whether there are cracks in the 
Ball Grid Area (BGA) solder pads or not. Conducting thermal vacuum tests in the 
presence of infra-red camera would be useful to detect any hot-spots in the FPGA 
and/or PCB. Dynamic cross section evaluation through proton beam irradiation at 
fluxes close to the in-orbit case would be useful to assure the fault injection results. 
However, it would definitely need long exposure time which would increase the test 
costs. 
Test procedures and setup are getting increasingly complex as more connections 
might be needed during the test. Fixation of the DUT becomes difficult as well. All 
of this introduces real challenges in performing correct testing which might give 
meaningful results [22]. Therefore, it would be useful to develop specific test 
procedures, communication protocols, test platforms, measurement and analysis 
software packages and even mechanical fixations with the concept of reusability in 
mind. That would reduce test costs and lead to faster and more accurate testing. It is 
always better to use a test device which we are sure of its accuracy than to use a new 
device for the first time during testing. Standards should be modified, or even a new 
standard should be developed, to fit with the nature of the small satellite missions. 
Accepting to accommodate more risk in small satellite means that test procedures 
would became less complex and less demanding compared to the currently available 
professional standards. The Nano-satellite Environment Test Standardization (NETS) 
project at Kyushu Institute of Technology is an example for the endeavors which are 
currently undertaken to develop more fitting standards [87].  
A complete Satellite-on-Chip would be feasible to realize by now [88]. This idea is a 
fairly challenging task to implement the hardware and software of the different 
subsystems in a single FPGA package. However, implementing such a single chip 
satellite would be a revolutionary breakthrough in the satellite technology. As we 
were able to accommodate 4 complete embedded systems inside a single FPGA 
package then it would be feasible to assign the processors specific tasks to execute as 
Chapter 8 Conclusion and Future Perspectives 
166 
 
if they represent the satellite subsystems’ processors. Sophisticated failure 
management procedures as well as task distributions, dynamic resources allocation, 
interfaces and perhaps a larger size FPGA to accommodate specific digital design for 
communication modems are needed. The Satellite-on-Chip would mainly execute the 
digital signal processing as well as the algorithms of the different satellite 
subsystems. However, careful consideration for the problem of redundancy should be 
given. 
Redundancy either in the form of cold backup or hot backup is a well-known 
technique in satellite engineering which is used to increase the system reliability. The 
use of redundancy adds a problem in terms of the mass, volume, and/or power 
consumption. If a cold backup subsystem is used then it would need its own volume 
in the satellite and it would add to the total mass. If hot backup redundancy is used 
then it would add to the consumed power, mass and volume. Nano-satellites are a 
domain to get rid of redundancy, taking the risk accompanying that decision. They 
do not have enough mass, power, and volume budgets to accommodate extra units.  
By introducing miniaturized designs in modern FPGAs, redundancy can migrate 
from the outside to the inside. Thus it can move from being large bulk devices into 
small logic gates implemented inside a single FPGA. This means that redundant 
satellite-on-chips can be built on the same FPGA package to increase the overall 
system reliability. Moreover, redundancy can be selective. Modern SRAM-based 
FPGAs provide the facility to implement Dynamic Partial Reconfiguration of their 
bitstreams. This means that they can change some modules in design on-the-fly 
without stopping operation of the main static logic. Reconfigurable partitions are 
defined to carry reconfigurable modules. Whenever a redundancy is needed, its 
reconfigurable module is called and loaded in the reconfigurable partition to 
selectively provide high reliability depending on the operation conditions. If the 
redundancy is not needed, its reconfigurable partition is freed by loading a black box 
reconfigurable module. In this method power can be saved as well as area, mass, and 
volume as we call the function whenever needed in contrast to the previously 
permanent redundancy. We call that new concept dynamic redundancy.  
One direct application of dynamic redundancy is to load TMR modules at the SAA 
region of the orbit while unloading them in the rest of the orbit. That loading and 
Chapter 8 Conclusion and Future Perspectives 
167 
 
unloading could save power up to 60% in comparison to the permanent TMR system 
[58]. This means that hardware designs are no longer static. The design is dynamic 
and can be changed in-orbit either remotely by uploading new hardware or locally by 
loading hardware files from the memory. Also it means that the hardware became 
scalable in such a way that it can grow up to contain multiple modules or shrink 
down to the basic limits.  
As the processing power of computational platforms is increasing tremendously, it 
would be a matter of time before we see a fully smart system which could cure itself. 
The concepts of evolution and adaptation are very important in the human lives. 
Biological inspirations in the field of fault tolerances and artificial intelligence are 
well established. However, there is the concept of evolvable hardware. That is the 
hardware which can evolve to adapt to changes in its environment in order not to die 
out. That adaptation might be through partially or fully changing its design. The 
change however need to be initiated based on well-established semantic to define 
what is good and what is bad for the system. Obviously, it would be some sort of an 
advanced silicon brain that would cross the limits of the current autonomous systems 
which can perform smart failure recovery and in-orbit reconfigurations in the 
absence of ground control. For example, IBM is building neurosynaptic chips at 
ultra-low power consumption. Being inspired with the human brain, IBM develops 
brain-inspired computer architecture chips. The recent chip built by IBM, by the time 
of writing this thesis, contains 1 million neuron and 256 million synapses with 5.4 
billion transistors. It contains an on-chip network of 4096 neurosynaptic cores at 70 
mWatt. This means that supercomputing and cognitive systems would ultimately be 
migrated to space borne devices, it is just a matter of time. 
The difference between the current traditional and future evolutionary hardware 
designs is that the current systems are fully programmed with possible scenarios 
prior to their in-field operation. Even if they have the chance to implement some sort 
of smartness then it would be based on a predefined heuristic by the developers. 
While the evolutionary systems just have the starting point from which they start to 
develop just like human beings and then they can take decisions out of their own 
even without the discretion of their own developers. The merge between powerful 
design tools, computational platforms and advances in artificial intelligence would 
make satellites much smarter than we expect. Reaching that point might enforce the 
Chapter 8 Conclusion and Future Perspectives 
168 
 
developers to negotiate with their evolving satellite-organisms for the best sake of 
both sides. Satellites might be able to decide what the best for their operation in a 
better way than the developers themselves. 
Finally, we would like to stress on the fact that developing nano-satellites is an 
intrinsically challenging task as they do incur risks due to their design philosophy. 
Reiterating the previously accomplished processes is likely to occur during the 
development and qualification phases. More focus should be given to study the 
effects of those iterations on the schedule, cost and reliability of the nano-satellite 
projects. Appendix (G) introduces our study on that issue. 
The future is out there and this thesis tried to push forward towards it. State of the art 
technology will always emerge with new products and outcomes. One way to push 
forward is to try to present the capable platforms to developers to realize their ideas 
and make them feasible. We tried to push the capabilities of the current satellites 
forward. There is still a lot to do and to achieve however it should always be 
considered that the value of the work is better appreciated whenever it touches real 
problems and open new doors for imagination and innovation. 
 169 
 
References 
 
[1]  A. Siddiqi, "Sputnik 50 years later: New evidence on its origins," Acta Astronautica, vol. 
63, no. 1-4, p. 529 – 539, July–August 2008.  
[2]  L. Summerer, "Thinking tomorrows' space – Research trends of the ESA advanced 
concepts team 2002–2012," Acta Astronautica, vol. 95, no. 0, p. 242 – 259, February–
March 2014.  
[3]  E. E. Weeks and A. A. Faiyetole, "Science, technology and imaginable social and 
behavioral impacts as outer space develops," Acta Astronautica, vol. 95, no. 0, p. 166 –
173, February–March 2014.  
[4]  R. D. Launius, "Space stations for the United States: An idea whose time has come—
and gone?," Acta Astronautica, vol. 62, no. 10–11, p. 539 – 555, May–June 2008.  
[5]  T. Thumm, J. A. Robinson, C. Alleyne, P. Hasbrook, S. Mayo, N. Buckley, P. Johnson-
Green, G. Karabadzhak, S. Kamigaichi, S. Umemura, I. V. Sorokin, M. Zell, E. Istasse, J. 
Sabbagh and S. Pignataro, "International space station accomplishments update: 
Scientific discovery, advancing future exploration, and benefits brought home to 
earth," Acta Astronautica, vol. 103, no. 0, p. 235 – 242, October–November 2014.  
[6]  B. Detsis and E. Detsis, "The benefits brought by space—General public versus space 
agencies perspectives," Acta Astronautica, vol. 88, no. 0, p. 129 – 137, July–August 
2013.  
[7]  S. P. Neeck, T. J. Magner and G. E. Paules, "NASA's small satellite missions for Earth 
observation," Acta Astronautica, vol. 56, no. 1–2, p. 187 – 192, January 2005.  
[8]  G. W. Ousley, "NASA/International small satellite program," Acta Astronautica, vol. 53, 
no. 4–10, p. 771 – 777, August–November 2003.  
[9]  L. Alkalai, "Advanced microelectronics technologies for future small satellite systems," 
Acta Astronautica, vol. 46, no. 2–6, p. 233 – 239, January–March 2000.  
[10] J. Esper, P. V. Panetta, D. M. Ryschkewitsch, D. W. Wiscombe and S. Neeck, "NASA-
GSFC nano-satellite technology for Earth science missions," Acta Astronautica, vol. 46, 
no. 2–6, p. 287 – 296, January–March 2000.  
[11] P. A. Curto and R. S. Hornstein, "Injection of new technology into space systems," Acta 
Astronautica, vol. 57, no. 2–8, p. 490 – 497, July–October 2005.  
[12] M. Sweeting, "25 Years of space at SURREY— Pioneering modern microsatellites," Acta 
 170 
 
Astronautica, vol. 49, no. 12, pp. 681 - 691, December 2001. 
[13] D. G. Gilmore, B. E. Hardt, R. C. Prager, E. W. Grob and W. Ousley, Space Mission 
Engineering: The New SMAD, vol. 28, J. R. Wertz, . D. F. Everett and J. J. Puschell, Eds., 
Microcosm Press 2011, Space Technology Library, 2011.  
[14] Mission Design Division Staff Ames Research Center, "Small Spacecraft Technology -
State of the Art," NASA/TP–2014–216648, pp. 1 - 208, Feb. 2014. 
[15] D. Hastings and H. Garret, Spacecraft Environment Intercations, Cambridge University 
Press, 1996.  
[16] K. L. Bedingfield and R. D. Leach, Spacecraft System Failures and Anomalies Attributed 
to the Natural Space Environment, M. B. Alexander, Ed., NASA Reference Publication 
1390, pp. 1 - 51, 1996.  
[17] M. O'Bryan, K. A. Label, R. A. Reed, J. Barth, C. Seidleck, P. Marshall, C. Marshall and M. 
Carts, "Single Event Effect and Radiation Damage Results For Candidate Spacecraft," in 
Proceedings of IEEE NSREC'98, pp. 1 - 12, 1998.  
[18] R. Ecoffet, "Overview of In-Orbit Radiation Induced Sapcecarft Anomalies," IEEE 
Transactions On Nuclear Science, vol. 60, no. 3, pp. 1791-1815, June 2013.  
[19] A. Holmes-Sidele and L. Adams, Eds., Handbook of Radiation Effects, 2nd edition, 
Oxford University Press, 2007.  
[20] R. Velazco, P. Fouilat and R. Reis, Radiation Effects on Embedded Systems, Springer, 
2007.  
[21] T. R. Oldham, "Basic Mechanisms of TID and DDD Response in MOS and Bipolar 
Microelectronics," in IEEE NSREC 2011 Short Course Notes, Section II, pp. 1 - 95, 2011.  
[22] H. Quinn, "Challenges in Testing Complex Systems," IEEE Transactions On Nuclear 
Science, vol. 61, no. 2, pp. 766 - 786, April 2014.  
[23] R. Ecoffet, "On-Orbit Anomalies: Investigations and Root Cause Determination," in IEEE 
NSREC 2011 Short Course Notes, Section IV, pp. 1 - 25, 2011.  
[24] G. C. Messenger and M. S. Ash, Single Event Phenomena, Springer US, 1997.  
[25] P. P. Shirvani, "Fault-Tolerant Computing for Radiation Environments," Ph.D. Thesis, 
Center for Reliable Computing, Stanford University, pp. 1 - 86, June 2001. 
[26] F. L. Kastensmidt, R. Reis and L. Carro, Fault Tolerance Techniques for SRAM-based 
FPGAs, Springer, 2006.  
 171 
 
[27] I. Koren and M. Krishna, Fault Tolerant systems, ElSEVIER, 2007. 
[28] N. Oh, "Software implemented hardware fault tolerance," Ph.D. Thesis, Center for 
Reliable Computing, Stanford University, pp. 1 - 54, Dec. 2000. 
[29] O. Goloubeva, M. Rebaudengo, M. S. Reorda and M. Violante, Software Implemented 
Hardware Fault Tolerance, Springer, 2006.  
[30] L. L. Pullum, Software Fault Tolerance Techniques and Implementation, London: Artech 
House, 2001.  
[31] F. G. d. Lima, "Single Event Upset Mitigation Techniques for Programmable Devices," 
Ph.D. Thesis, INSTITUTO DE INFORMÁTICA, UNIVERSIDADE FEDERAL DO RIO GRANDE 
DO SUL, pp. 1 - 102, Dec. 2000. 
[32] B. Pratt, M. Caffrey, J. F. Carroll, P. Graham, K. Morgan and M. Wirthlin, "Fine-Grain 
SEU Mitigation for FPGAs Using Partial TMR," IEEE Transactions On Nuclear Science,
vol. 55, no. 4, pp. 2274 - 2280, August 2008.  
[33] I. Herrera-Alzu and M. López-Vallejo, "Design Techniques for Xilinx Virtex FPGA 
Configuration Memory Scrubbers," IEEE Transactions On Nuclear Science, vol. 60, no. 1, 
pp. 376 - 385, Feb. 2013.  
[34] I. Herrera-Alzu and M. López-Vallejo, "System Design Framework and Methodology for 
Xilinx Virtex FPGA Configuration Scrubbers," IEEE Transactions On Nuclear Science, vol. 
61, no. 1, pp. 619 - 629, Feb. 2014.  
[35] H. Baig, J.-A. Lee and Z. A. Siddiqui, "A Low-Overhead Multiple-SEU Mitigation 
Approach for SRAM-based FPGAs with Increased Reliability," IEEE Transactions On 
Nuclear Science, vol. 61, no. 3, pp. 1389 - 1399, June 2014.  
[36] H. Quinn, P. Graham, K. Morgan, Z. Baker, M. Caffrey, D. Smith, M. Wirthlin and R. Bell, 
"Flight Experience of the Xilinx Virtex-4," IEEE Transactions On Nuclear Science, vol. 60, 
no. 4, pp. 2682 - 2690, August 2013.  
[37] "Virtex-5 FPGA Configuration User Guide, XILINX UG191 (v3.10)," XILINX, pp. 1 - 166, 
2011. 
[38] L. Alkali, "An overview of flight computer technologies for future NASA space 
exploration missions," Acta Astronautica, vol. 52, no. 9, pp. 857 - 867, 2003.  
[39] M. PIGNOL, "COTS-based applications in space avionics," in Proceedings of the 
Conference on Design, Automation and Test in Europe DATE ’10, pp. 1213 – 1219, 2010. 
[40] R. Ginosar, "A survey of processors for space," in Proceedings of the Data Systems in 
Aerospace (DASIA), Eurospace, pp. 1 - 5, May 2012.  
 172 
 
[41] M. H. Thoby, "MYRIADE: CNES micro-satellite program," in Proceedings of the 15th 
Annual/USU Conference on Small Satellites, pp. 1 - 14, 2001.  
[42] S. Montenegro and W. Barwald, "Powerbird-modern spacecraft bus controller," Acta 
Astronautica, vol. 52, no. 9 - 12, pp. 957 - 963, 2003.  
[43] P. Behr, W. Bärwald, K. Brieß and S. Montenegro, "Fault tolerance and COTS: Next 
generation of high performance satellite computers," in Proceedings of the DASIA (ESA 
SP-532), pp. 1 - 12, June 2003.  
[44] C. Villalpando, D. Rennels, R. Some and M. Cabanas-Holmen, "Reliable multicore 
processors for nasa space missions," in Proceedings of the 2011 IEEE Aerospace 
Conference, pp. 1 – 12, 2011.  
[45] M. Fayyaz, T. Vladimirova and J.-M. Caujolle, "Adaptive middleware design for satellite 
fault-tolerant distributed computing," in Proceedings of the 2012 NASA/ESA 
Conference on Adaptive Hardware and Systems (AHS-2012), pp. 1 - 8, June 2012.  
[46] I. V. McLoughlin and T. Bretschneider, "Achieving lowcost high-reliability computation 
through redundant parallel processing," in Proceedings of the IEEE International 
Conference on Computing Informatics ICOCI 2006, pp. 1 - 6, June 2006.  
[47] P. Murray, T. Randolph, D. V. Buren, D. Anderson and I. Troxel, "High performance, 
high volume reconfigurable processor architecture," in Proceedings of the 2012 IEEE 
Aerospace Conference, pp. 1 - 8, 2012.  
[48] T. Vladimirova, W. Xiaofeng and C. Bridges, "Development of a Satellite Sensor 
Network for Future Space Missions," in Proceedings of the 2008 IEEE Aerospace 
Conference, pp. 1 - 10, 2008.  
[49] H. Yashiro, T. Takahashi, T. Fujiwara and K. Mori, "A High Assurance Space On-Board 
Distributed Computer System," in Proceedings of the 2003 IEEE Aerospace Conference, 
pp.5_2501 - 5_2509, 2003.  
[50] J. Samson, G. Gardner, D. Lupia, M. Patel, P. Davis, V. Aggarwal, A. George, Z. 
Kalbarcyzk and R. Some, "High Performance Dependable Multiprocessor II," in 
Proceedings of the 2007 IEEE Aerospace Conference, pp. 1 - 22, 2007.  
[51] J. Samson and E. Grobelny, "NMP ST8 Dependable Multiprocessor: TRL6 Validation –
Preliminary Results," in Proceedings of the 2009 IEEE Aerospace Conference, pp. 1 - 23, 
2009.  
[52] J. J. Samson, "Update on Dependable Multiprocessor Cubesat Technology 
Development," in Proceedings of the 2012 IEEE Aerospace Conference, pp. 1 - 12, 2012. 
[53] J. J. Samson, E. Grobelny, S. Driesse-Bunn, M. Clark and S. Van Portfliet, "Post-TRL6 
Dependable Multiprocessor Technology Developments," in Proceedings of the 2010 
 173 
 
IEEE Aerospace Conference, pp. 1 - 21, 2010. 
[54] D. Lüdtke, K. Westerdorff, K. Stohlmann, A. Börner, O. Maibaum, T. Peng, B. Weps, G. 
Fey and A. Gerndt, "OBC-NG: Towards a Reconfigurable On-board Computing 
Architecture for Spacecraft," in Proceedings of the 2014 IEEE Aerospace Conference, 
pp. 1 - 13, 2014.  
[55] "RAD750® radiation-hardened PowerPC microprocessor," BAE Systems, pp. 1 - 2, 2008.
[56] M. Ibrahim, K. Asami and M. Cho, "Fault Tolerant Architecture Alternatives for 
Developing Nano-Satellites Embedded Computers," in Proceedings of the AIAA Space 
2012 Conference & Exposition, pp. 1 - 8, Sept. 2012.  
[57] M. Ibrahim, K. Asami and M. Cho, "The Use of Reconfigurable FPGAs in Developing 
Reliable Satellite On-Board Computers," in Proceedings of the AIAA Space 2012 
Conference & Exposition, pp. 1 - 8, 2012.  
[58] M. Ibrahim, K. Asami and M. Cho, "Reconfigurable Fault Tolerant Avionics System," in 
Proceedings of the 2013 IEEE Aerospace Conference, pp. 1 - 12, 2013.  
[59] M. M. Ibrahim, "Model Based Design of a Smart Embedded System for Space 
Applications," M.Sc. Thesis, Computer & Systems Engineering Department, AL AZHAR 
University, pp. 1 - 113, 2011. 
[60] "Virtex-5 FPGA System Monitor - UG 192," XILINX, pp. 1 - 66, Feb. 2011. 
[61] "MAX6655- Dual Remote/Local Temperature Sensors and Four-Channel Voltage 
Monitors," MAXIM, datasheet, pp. 1 - 18, 2006. 
[62] M. Ibrahim, K. Asami and M. Cho, "Time and Space Redundancy Fault Tolerance Trade-
offs for FPGA Based Single and Multicore Designs," Transactions of the Japan Society 
for Aeronautical and Space Sciences, vol. Article in Press, pp. 1 - 10, 2014.  
[63] "SPENVIS: The Space Environment Information System," ESA, 1997-2014. [Online]. 
Available: https://www.spenvis.oma.be/. 
[64] H. Quinn, K. Morgan, P. Graham, J. Krone and M. Caffrey, "Static Proton and Heavy Ion 
Testing of the Xilinx Virtex-5 Device," in 2007 IEEE Radiation Effects Data Workshop, 
pp. 177 - 184, July 2007.  
[65] M. Ibrahim, K. Asami and M. Cho, "LEO Single Event Upset Emulator for Validation of 
FPGA Based Avionics Systems," Transactions of the Japan Society for Aeronautical and 
Space Sciences, Aerospace Technology Japan, vol. 12, no. ists29, pp. Tf_19-Tf_25, 2014. 
[66] M. Ibrahim, K. Asami and M. Cho, "Evaluation of SRAM Based FPGA Performance by 
Simulating SEU through Fault Injection," in Proceedings of the 2013 6th International 
 174 
 
Conference on Recent Advances in Space Technologies (RAST), pp. 1 - 6, 2013.  
[67] M. Ibrahim, K. Asami and M. Cho, "Thermal Vacuum Test Results for Virtex-5 FPGA 
Based Multi-core On-Board Computer," in Proceedings of the 2014 IEEE Aerospace 
Conference, pp. 1 - 9, 2014.  
[68] D. Q. Dinh, "Thermal Modelling of Nanosat," M.Sc., Department of Mechanical and 
Aerospace Engineering, San José State University, pp. 1 - 47, 2012. 
[69] G. Cook, "Satellite Drag Coefficients," Planetary and Space Science ELSEVIER, vol. 13, 
no. 10, pp. 929 - 946, 1965.  
[70] F. P. Incropera, Fundamentals of Heat and Mass Transfer, New York: John Wiley & 
Sons, 2006.  
[71] P. Fortescue, G. Swinerd and J. Stark, Spacecraft Systems Engineering, 4th Edition, 
Chichester, West Sussex: John Wiley & Sons, Ltd, 2011.  
[72] C. B. VanOutryve, "A Thermal Analysis and Design Tool for Small Spacecraft," San Jose 
State University, 2008. 
[73] D. F. Gluck and V. Baturkin, Spacecraft Thermal Control Handbook. Vol I: Fundamental 
Technologies., vol. I, D. G. Gilmore, Ed., El Segundo, California: The Aerospace Press, 
2002, pp. 247 - 329, Mounting and Interfaces. 
[74] Y. A. Cengel, Heat Transfer: A Practical Approach, San Francisco: McGraw Hill, 1998.  
[75] D. G. Gilmore and R. L. Collins, Spacecraft Thermal Control Handbook. Vol I: 
Fundamental Technologies, D. G. Gilmore, Ed., El Segundo, CA: The Aerospace Press, 
2002.  
[76] J. R. Howell, R. Siegel and M. P. Menguc, Thermal Radiation Heat Transfer, 5th Edition, 
CRC Press, 2010.  
[77] "Power Methodology Guide UG786 (v13.1)," XILINX, pp. 1 - 54, March 2011. 
[78] "XPower Estimator User Guide UG440 (v13.4)," XILINX, pp. 1 - 64, Jan. 2012. 
[79] P. Abusaidi, M. Klein and B. Philofsky, "Virtex-5 FPGA System Power Design 
Considerations WP285 (v1.0)," XILINX, pp. 1 - 23, Feb. 2008. 
[80] M. Klein, "Static Power and the Importance of Realistic Junction Temperature Analysis 
WP221 (v1.0)," XILINX, pp. 1 - 10, March 2005. 
[81] D. Curd, "Power Consumption in 65 nm FPGAs WP246 (v1.2)," XILINX, pp. 1 - 12, Feb. 
2007. 
 175 
 
[82] M. Cho, Space Environment Testing, Kyushu Institute of Technology, Lecture Notes, 
Thermal test part 2, pp. 1 - 64, 2014.  
[83] "SMC Standard SMC-S-016 TEST REQUIREMENTS FOR LAUNCH, UPPER-STAGE AND 
SPACE VEHICLES," pp. 1 - 112, 13 June 2008. 
[84] K. Chapman, "SEU Strategies for Virtex-5 Devices XAPP 864 (V2.0)," XILINX, pp. 1 - 16, 
April 2010. 
[85] G. Allen, L. Edmonds, G. Swift, C. Carmichael, C. W. Tseng, K. Heldt, S. Anderson and M. 
Coe, "Single Event Test Methodologies and System Error Rate Analysis for Triple 
Modular Redundant Field Programmable Gate Arrays," IEEE Transactions on Nuclear 
Science, vol. 58, no. 3, pp. 1040 - 1046, June 2011.  
[86] L. D. Edmonds, "Analysis of SEU Rates in TMR Devices," JPL Publication 09-6, pp. 1 - 18, 
Feb. 2009. 
[87] "NETS Web," Center for Nanosatellite Testing - Kyushu Institute of Technology, 
[Online]. Available: http://cent.ele.kyutech.ac.jp/nets_web/nets_web.html . 
[88] M. Peck, "Exploring Space with Chip-sized Satellites," July 2011. [Online]. Available: 
http://spectrum.ieee.org/aerospace/satellites/exploring-space-with-chipsized-
satellites. 
 
 
 176 
 
Appendix A   
Microprocessor Hardware Specification of the MPSoC  
 
PARAMETER VERSION = 2.1.0 
 
PORT reset_mb0_Ext_Reset_In_pin = net_reset_mb0_Ext_Reset_In_pin, DIR = I, SIGIS = RST 
PORT reset_mb1_Ext_Reset_In_pin = net_reset_mb1_Ext_Reset_In_pin, DIR = I, SIGIS = RST 
PORT reset_mb2_Ext_Reset_In_pin = net_reset_mb2_Ext_Reset_In_pin, DIR = I, SIGIS = RST 
PORT xps_uartlite_0_RX_pin = net_xps_uartlite_0_RX_pin, DIR = I 
PORT xps_uartlite_0_TX_pin = xps_uartlite_0_TX, DIR = O 
PORT xps_uartlite_1_RX_pin = net_xps_uartlite_1_RX_pin, DIR = I 
PORT xps_uartlite_1_TX_pin = xps_uartlite_1_TX, DIR = O 
PORT xps_uartlite_2_RX_pin = net_xps_uartlite_2_RX_pin, DIR = I 
PORT xps_uartlite_2_TX_pin = xps_uartlite_2_TX, DIR = O 
PORT xps_uartlite_3_RX_pin = net_xps_uartlite_3_RX_pin, DIR = I 
PORT xps_uartlite_3_TX_pin = xps_uartlite_3_TX, DIR = O 
PORT xps_iic_3_Sda_pin = xps_iic_3_Sda, DIR = IO 
PORT xps_iic_3_Scl_pin = xps_iic_3_Scl, DIR = IO 
PORT xps_gpio_0_GPIO_IO_pin = xps_gpio_0_GPIO_IO, DIR = IO, VEC = [0:31] 
PORT xps_gpio_1_GPIO_IO_pin = xps_gpio_1_GPIO_IO, DIR = IO, VEC = [0:31] 
PORT xps_gpio_2_GPIO_IO_pin = xps_gpio_2_GPIO_IO, DIR = IO, VEC = [0:31] 
PORT xps_gpio_3_GPIO_IO_pin = xps_gpio_3_GPIO_IO, DIR = IO, VEC = [0:31] 
PORT xps_mch_emc_3_Mem_DQ_pin = xps_mch_emc_3_Mem_DQ, DIR = IO, VEC = [0:15] 
PORT xps_mch_emc_3_Mem_A_pin = xps_mch_emc_3_Mem_A, DIR = O, VEC = [0:31] 
PORT xps_mch_emc_3_Mem_CEN_pin = xps_mch_emc_3_Mem_CEN, DIR = O 
PORT xps_mch_emc_3_Mem_OEN_pin = xps_mch_emc_3_Mem_OEN, DIR = O 
PORT xps_mch_emc_3_Mem_WEN_pin = xps_mch_emc_3_Mem_WEN, DIR = O 
PORT xps_mch_emc_3_Mem_BEN_pin = xps_mch_emc_3_Mem_BEN, DIR = O, VEC = [0:1] 
PORT MB3_reset_Ext_Reset_In_pin = net_MB3_reset_Ext_Reset_In_pin, DIR = I, SIGIS = RST 
PORT sys_clk_pin = CLK_S, DIR = I, SIGIS = CLK, CLK_FREQ = 50000000 
PORT clock_generator_0_RST_pin = net_clock_generator_0_RST_pin, DIR = I, SIGIS = RST 
PORT clock_generator_0_CLKOUT4_50MHz_pin = clock_generator_0_CLKOUT4_50MHz, DIR = 
O, SIGIS = CLK, CLK_FREQ = 50000000 
 
 
BEGIN MicroBlaze 
 PARAMETER INSTANCE = MicroBlaze_0 
 PARAMETER HW_VER = 8.20.a 
 PARAMETER C_FSL_LINKS = 3 
 PARAMETER C_DEBUG_ENABLED = 1 
 PARAMETER C_FPU_EXCEPTION = 1 
 PARAMETER C_DIV_ZERO_EXCEPTION = 1 
 PARAMETER C_IPLB_BUS_EXCEPTION = 1 
 PARAMETER C_DPLB_BUS_EXCEPTION = 1 
 PARAMETER C_ILL_OPCODE_EXCEPTION = 1 
 PARAMETER C_OPCODE_0x0_ILLEGAL = 1 
 PARAMETER C_USE_STACK_PROTECTION = 1 
 PARAMETER C_UNALIGNED_EXCEPTIONS = 1 
 PARAMETER C_FSL_EXCEPTION = 1 
 PARAMETER C_USE_BARREL = 1 
 PARAMETER C_USE_FPU = 2 
 PARAMETER C_USE_DIV = 1 
 PARAMETER C_USE_EXTENDED_FSL_INSTR = 1 
 PARAMETER C_AREA_OPTIMIZED = 1 
 PARAMETER C_NUMBER_OF_PC_BRK = 1 
 PARAMETER C_NUMBER_OF_RD_ADDR_BRK = 0 
 177 
 
 PARAMETER C_NUMBER_OF_WR_ADDR_BRK = 0 
 BUS_INTERFACE DPLB = plb_v46_mb0 
 BUS_INTERFACE IPLB = plb_v46_mb0 
 BUS_INTERFACE DEBUG = debug_module_0_MBDEBUG_0 
 BUS_INTERFACE SFSL1 = fsl_v20_mb1_mb0 
 BUS_INTERFACE SFSL2 = fsl_v20_mb2_mb0 
 BUS_INTERFACE DLMB = Data_lmb_v10_mb0 
 BUS_INTERFACE ILMB = Inst_lmb_v10_mb0 
 BUS_INTERFACE MFSL0 = fsl_v20_mb0 
 BUS_INTERFACE MFSL1 = fsl_v20_mb0_mb1 
 BUS_INTERFACE MFSL2 = fsl_v20_mb0_mb2 
 PORT INTERRUPT = xps_intc_0_Irq 
 PORT DBG_STOP = Ext_BRK 
 PORT MB_RESET = reset_mb0_MB_Reset 
END 
 
BEGIN MicroBlaze 
 PARAMETER INSTANCE = MicroBlaze_1 
 PARAMETER HW_VER = 8.20.a 
 PARAMETER C_FSL_LINKS = 3 
 PARAMETER C_DEBUG_ENABLED = 1 
 PARAMETER C_FPU_EXCEPTION = 1 
 PARAMETER C_DIV_ZERO_EXCEPTION = 1 
 PARAMETER C_IPLB_BUS_EXCEPTION = 1 
 PARAMETER C_DPLB_BUS_EXCEPTION = 1 
 PARAMETER C_ILL_OPCODE_EXCEPTION = 1 
 PARAMETER C_OPCODE_0x0_ILLEGAL = 1 
 PARAMETER C_USE_STACK_PROTECTION = 1 
 PARAMETER C_UNALIGNED_EXCEPTIONS = 1 
 PARAMETER C_FSL_EXCEPTION = 1 
 PARAMETER C_USE_BARREL = 1 
 PARAMETER C_USE_FPU = 2 
 PARAMETER C_USE_DIV = 1 
 PARAMETER C_USE_EXTENDED_FSL_INSTR = 1 
 PARAMETER C_AREA_OPTIMIZED = 1 
 BUS_INTERFACE DPLB = plb_v46_mb1 
 BUS_INTERFACE IPLB = plb_v46_mb1 
 BUS_INTERFACE DEBUG = debug_module_0_MBDEBUG_1 
 BUS_INTERFACE SFSL1 = fsl_v20_mb0_mb1 
 BUS_INTERFACE SFSL2 = fsl_v20_mb2_mb1 
 BUS_INTERFACE DLMB = Data_lmb_v10_mb1 
 BUS_INTERFACE ILMB = Inst_lmb_v10_mb1 
 BUS_INTERFACE MFSL0 = fsl_v20_mb1 
 BUS_INTERFACE MFSL1 = fsl_v20_mb1_mb0 
 BUS_INTERFACE MFSL2 = fsl_v20_mb1_mb2 
 PORT INTERRUPT = xps_intc_1_Irq 
 PORT DBG_STOP = Ext_BRK 
 PORT MB_RESET = reset_mb1_MB_Reset 
END 
 
BEGIN MicroBlaze 
 PARAMETER INSTANCE = MicroBlaze_2 
 PARAMETER HW_VER = 8.20.a 
 PARAMETER C_FSL_LINKS = 3 
 PARAMETER C_DEBUG_ENABLED = 1 
 PARAMETER C_FPU_EXCEPTION = 1 
 PARAMETER C_DIV_ZERO_EXCEPTION = 1 
 PARAMETER C_IPLB_BUS_EXCEPTION = 1 
 PARAMETER C_DPLB_BUS_EXCEPTION = 1 
 PARAMETER C_ILL_OPCODE_EXCEPTION = 1 
 178 
 
 PARAMETER C_OPCODE_0x0_ILLEGAL = 1 
 PARAMETER C_USE_STACK_PROTECTION = 1 
 PARAMETER C_UNALIGNED_EXCEPTIONS = 1 
 PARAMETER C_FSL_EXCEPTION = 1 
 PARAMETER C_USE_BARREL = 1 
 PARAMETER C_USE_FPU = 2 
 PARAMETER C_USE_DIV = 1 
 PARAMETER C_USE_EXTENDED_FSL_INSTR = 1 
 PARAMETER C_AREA_OPTIMIZED = 1 
 BUS_INTERFACE DPLB = plb_v46_mb2 
 BUS_INTERFACE IPLB = plb_v46_mb2 
 BUS_INTERFACE DEBUG = debug_module_0_MBDEBUG_2 
 BUS_INTERFACE SFSL1 = fsl_v20_mb0_mb2 
 BUS_INTERFACE SFSL2 = fsl_v20_mb1_mb2 
 BUS_INTERFACE DLMB = Data_lmb_v10_mb2 
 BUS_INTERFACE ILMB = Inst_lmb_v10_mb2 
 BUS_INTERFACE MFSL0 = fsl_v20_mb2 
 BUS_INTERFACE MFSL1 = fsl_v20_mb2_mb0 
 BUS_INTERFACE MFSL2 = fsl_v20_mb2_mb1 
 PORT INTERRUPT = xps_intc_2_Irq 
 PORT DBG_STOP = Ext_BRK 
 PORT MB_RESET = reset_mb2_MB_Reset 
END 
 
BEGIN fsl_v20 
 PARAMETER INSTANCE = fsl_v20_mb0 
 PARAMETER HW_VER = 2.11.e 
 PARAMETER C_FSL_DEPTH = 128 
 PARAMETER C_ASYNC_CLKS = 0 
 PARAMETER C_IMPL_STYLE = 1 
 PORT FSL_CLK = clock_generator_0_CLKOUT2_100MHz 
END 
 
BEGIN fsl_v20 
 PARAMETER INSTANCE = fsl_v20_mb2 
 PARAMETER HW_VER = 2.11.e 
 PARAMETER C_FSL_DEPTH = 128 
 PARAMETER C_ASYNC_CLKS = 0 
 PARAMETER C_IMPL_STYLE = 1 
 PORT FSL_CLK = clock_generator_0_CLKOUT0_100MHz 
END 
 
BEGIN fsl_v20 
 PARAMETER INSTANCE = fsl_v20_mb1 
 PARAMETER HW_VER = 2.11.e 
 PARAMETER C_FSL_DEPTH = 128 
 PARAMETER C_ASYNC_CLKS = 0 
 PARAMETER C_IMPL_STYLE = 1 
 PORT FSL_CLK = clock_generator_0_CLKOUT3_100MHz 
END 
 
BEGIN lmb_v10 
 PARAMETER INSTANCE = Data_lmb_v10_mb0 
 PARAMETER HW_VER = 2.00.b 
 PORT SYS_Rst = reset_mb0_Bus_Struct_Reset 
 PORT LMB_Clk = clock_generator_0_CLKOUT0_100MHz 
END 
 
BEGIN lmb_v10 
 PARAMETER INSTANCE = Data_lmb_v10_mb1 
 179 
 
 PARAMETER HW_VER = 2.00.b 
 PORT SYS_Rst = reset_mb1_Bus_Struct_Reset 
 PORT LMB_Clk = clock_generator_0_CLKOUT0_100MHz 
END 
 
BEGIN lmb_v10 
 PARAMETER INSTANCE = Data_lmb_v10_mb2 
 PARAMETER HW_VER = 2.00.b 
 PORT SYS_Rst = reset_mb2_Bus_Struct_Reset 
 PORT LMB_Clk = clock_generator_0_CLKOUT0_100MHz 
END 
 
BEGIN lmb_v10 
 PARAMETER INSTANCE = Inst_lmb_v10_mb0 
 PARAMETER HW_VER = 2.00.b 
 PORT SYS_Rst = reset_mb0_Bus_Struct_Reset 
 PORT LMB_Clk = clock_generator_0_CLKOUT0_100MHz 
END 
 
BEGIN lmb_v10 
 PARAMETER INSTANCE = Inst_lmb_v10_mb1 
 PARAMETER HW_VER = 2.00.b 
 PORT SYS_Rst = reset_mb1_Bus_Struct_Reset 
 PORT LMB_Clk = clock_generator_0_CLKOUT0_100MHz 
END 
 
BEGIN lmb_v10 
 PARAMETER INSTANCE = Inst_lmb_v10_mb2 
 PARAMETER HW_VER = 2.00.b 
 PORT SYS_Rst = reset_mb2_Bus_Struct_Reset 
 PORT LMB_Clk = clock_generator_0_CLKOUT0_100MHz 
END 
 
BEGIN plb_v46 
 PARAMETER INSTANCE = plb_v46_mb0 
 PARAMETER HW_VER = 1.05.a 
 PORT SYS_Rst = reset_mb0_Bus_Struct_Reset 
 PORT PLB_Clk = clock_generator_0_CLKOUT0_100MHz 
END 
 
BEGIN plb_v46 
 PARAMETER INSTANCE = plb_v46_mb1 
 PARAMETER HW_VER = 1.05.a 
 PORT SYS_Rst = reset_mb1_Bus_Struct_Reset 
 PORT PLB_Clk = clock_generator_0_CLKOUT0_100MHz 
END 
 
BEGIN plb_v46 
 PARAMETER INSTANCE = plb_v46_mb2 
 PARAMETER HW_VER = 1.05.a 
 PORT SYS_Rst = reset_mb2_Bus_Struct_Reset 
 PORT PLB_Clk = clock_generator_0_CLKOUT0_100MHz 
END 
 
BEGIN xps_uartlite 
 PARAMETER INSTANCE = xps_uartlite_0 
 PARAMETER HW_VER = 1.02.a 
 PARAMETER C_BAUDRATE = 9600 
 PARAMETER C_USE_PARITY = 0 
 PARAMETER C_BASEADDR = 0x80440000 
 180 
 
 PARAMETER C_HIGHADDR = 0x8044007f 
 BUS_INTERFACE SPLB = plb_v46_mb0 
 PORT RX = net_xps_uartlite_0_RX_pin 
 PORT TX = xps_uartlite_0_TX 
END 
 
BEGIN xps_uartlite 
 PARAMETER INSTANCE = xps_uartlite_1 
 PARAMETER HW_VER = 1.02.a 
 PARAMETER C_BAUDRATE = 9600 
 PARAMETER C_USE_PARITY = 0 
 PARAMETER C_BASEADDR = 0x80440000 
 PARAMETER C_HIGHADDR = 0x8044007f 
 BUS_INTERFACE SPLB = plb_v46_mb1 
 PORT RX = net_xps_uartlite_1_RX_pin 
 PORT TX = xps_uartlite_1_TX 
END 
 
BEGIN xps_uartlite 
 PARAMETER INSTANCE = xps_uartlite_2 
 PARAMETER HW_VER = 1.02.a 
 PARAMETER C_BAUDRATE = 9600 
 PARAMETER C_USE_PARITY = 0 
 PARAMETER C_BASEADDR = 0x80440000 
 PARAMETER C_HIGHADDR = 0x8044007f 
 BUS_INTERFACE SPLB = plb_v46_mb2 
 PORT RX = net_xps_uartlite_2_RX_pin 
 PORT TX = xps_uartlite_2_TX 
END 
 
BEGIN xps_iic 
 PARAMETER INSTANCE = xps_iic_0 
 PARAMETER HW_VER = 2.03.a 
 PARAMETER C_BASEADDR = 0x80500000 
 PARAMETER C_HIGHADDR = 0x805001ff 
 BUS_INTERFACE SPLB = plb_v46_mb0 
 PORT IIC2INTC_Irpt = xps_iic_0_IIC2INTC_Irpt 
END 
 
BEGIN xps_iic 
 PARAMETER INSTANCE = xps_iic_1 
 PARAMETER HW_VER = 2.03.a 
 PARAMETER C_BASEADDR = 0x80500000 
 PARAMETER C_HIGHADDR = 0x805001ff 
 BUS_INTERFACE SPLB = plb_v46_mb1 
 PORT IIC2INTC_Irpt = xps_iic_1_IIC2INTC_Irpt 
END 
 
BEGIN xps_iic 
 PARAMETER INSTANCE = xps_iic_2 
 PARAMETER HW_VER = 2.03.a 
 PARAMETER C_BASEADDR = 0x80500000 
 PARAMETER C_HIGHADDR = 0x805001ff 
 BUS_INTERFACE SPLB = plb_v46_mb2 
 PORT IIC2INTC_Irpt = xps_iic_2_IIC2INTC_Irpt 
END 
 
BEGIN xps_timer 
 PARAMETER INSTANCE = xps_timer_0 
 PARAMETER HW_VER = 1.02.a 
 181 
 
 PARAMETER C_BASEADDR = 0x80580100 
 PARAMETER C_HIGHADDR = 0x805801ff 
 BUS_INTERFACE SPLB = plb_v46_mb0 
 PORT Interrupt = xps_timer_0_Interrupt 
END 
 
BEGIN xps_timer 
 PARAMETER INSTANCE = xps_timer_1 
 PARAMETER HW_VER = 1.02.a 
 PARAMETER C_BASEADDR = 0x80580100 
 PARAMETER C_HIGHADDR = 0x805801ff 
 BUS_INTERFACE SPLB = plb_v46_mb1 
 PORT Interrupt = xps_timer_1_Interrupt 
END 
 
BEGIN xps_timer 
 PARAMETER INSTANCE = xps_timer_2 
 PARAMETER HW_VER = 1.02.a 
 PARAMETER C_BASEADDR = 0x80580100 
 PARAMETER C_HIGHADDR = 0x805801ff 
 BUS_INTERFACE SPLB = plb_v46_mb2 
 PORT Interrupt = xps_timer_2_Interrupt 
END 
 
BEGIN xps_timebase_wdt 
 PARAMETER INSTANCE = xps_timebase_wdt_0 
 PARAMETER HW_VER = 1.02.a 
 PARAMETER C_BASEADDR = 0x80540180 
 PARAMETER C_HIGHADDR = 0x805401ff 
 PARAMETER C_WDT_INTERVAL = 29 
 BUS_INTERFACE SPLB = plb_v46_mb0 
 PORT WDT_Interrupt = xps_timebase_wdt_0_WDT_Interrupt 
 PORT Timebase_Interrupt = xps_timebase_wdt_0_Timebase_Interrupt 
 PORT WDT_Reset = xps_timebase_wdt_0_WDT_Reset 
END 
 
BEGIN xps_timebase_wdt 
 PARAMETER INSTANCE = xps_timebase_wdt_1 
 PARAMETER HW_VER = 1.02.a 
 PARAMETER C_BASEADDR = 0x80540180 
 PARAMETER C_HIGHADDR = 0x805401ff 
 PARAMETER C_WDT_INTERVAL = 29 
 BUS_INTERFACE SPLB = plb_v46_mb1 
 PORT WDT_Interrupt = xps_timebase_wdt_1_WDT_Interrupt 
 PORT Timebase_Interrupt = xps_timebase_wdt_1_Timebase_Interrupt 
 PORT WDT_Reset = xps_timebase_wdt_1_WDT_Reset 
END 
 
BEGIN xps_timebase_wdt 
 PARAMETER INSTANCE = xps_timebase_wdt_2 
 PARAMETER HW_VER = 1.02.a 
 PARAMETER C_BASEADDR = 0x80540180 
 PARAMETER C_HIGHADDR = 0x805401ff 
 PARAMETER C_WDT_INTERVAL = 29 
 BUS_INTERFACE SPLB = plb_v46_mb2 
 PORT WDT_Interrupt = xps_timebase_wdt_2_WDT_Interrupt 
 PORT Timebase_Interrupt = xps_timebase_wdt_2_Timebase_Interrupt 
 PORT WDT_Reset = xps_timebase_wdt_2_WDT_Reset 
END 
 
 182 
 
BEGIN mdm 
 PARAMETER INSTANCE = debug_module_0 
 PARAMETER HW_VER = 2.00.b 
 PARAMETER C_USE_UART = 0 
 PARAMETER C_MB_DBG_PORTS = 4 
 BUS_INTERFACE MBDEBUG_0 = debug_module_0_MBDEBUG_0 
 BUS_INTERFACE MBDEBUG_1 = debug_module_0_MBDEBUG_1 
 BUS_INTERFACE MBDEBUG_2 = debug_module_0_MBDEBUG_2 
 BUS_INTERFACE MBDEBUG_3 = debug_module_0_MBDEBUG_3 
 PORT Interrupt = debug_module_0_Interrupt 
 PORT Debug_SYS_Rst = debug_module_0_Debug_SYS_Reset 
END 
 
BEGIN xps_gpio 
 PARAMETER INSTANCE = xps_gpio_0 
 PARAMETER HW_VER = 2.00.a 
 PARAMETER C_INTERRUPT_PRESENT = 1 
 PARAMETER C_ALL_INPUTS = 0 
 PARAMETER C_BASEADDR = 0x80480000 
 PARAMETER C_HIGHADDR = 0x804801ff 
 PARAMETER C_IS_DUAL = 1 
 BUS_INTERFACE SPLB = plb_v46_mb0 
 PORT IP2INTC_Irpt = xps_gpio_0_IP2INTC_Irpt 
 PORT GPIO_IO = xps_gpio_0_GPIO_IO 
END 
 
BEGIN xps_gpio 
 PARAMETER INSTANCE = xps_gpio_1 
 PARAMETER HW_VER = 2.00.a 
 PARAMETER C_INTERRUPT_PRESENT = 1 
 PARAMETER C_ALL_INPUTS = 0 
 PARAMETER C_BASEADDR = 0x80480000 
 PARAMETER C_HIGHADDR = 0x804801ff 
 PARAMETER C_IS_DUAL = 1 
 BUS_INTERFACE SPLB = plb_v46_mb1 
 PORT IP2INTC_Irpt = xps_gpio_1_IP2INTC_Irpt 
 PORT GPIO_IO = xps_gpio_1_GPIO_IO 
END 
 
BEGIN xps_gpio 
 PARAMETER INSTANCE = xps_gpio_2 
 PARAMETER HW_VER = 2.00.a 
 PARAMETER C_INTERRUPT_PRESENT = 1 
 PARAMETER C_ALL_INPUTS = 1 
 PARAMETER C_BASEADDR = 0x80480000 
 PARAMETER C_HIGHADDR = 0x804801ff 
 PARAMETER C_IS_DUAL = 1 
 BUS_INTERFACE SPLB = plb_v46_mb2 
 PORT IP2INTC_Irpt = xps_gpio_2_IP2INTC_Irpt 
 PORT GPIO_IO = xps_gpio_2_GPIO_IO 
END 
 
BEGIN xps_intc 
 PARAMETER INSTANCE = xps_intc_0 
 PARAMETER HW_VER = 2.01.a 
 PARAMETER C_IRQ_IS_LEVEL = 0 
 PARAMETER C_BASEADDR = 0x804c0100 
 PARAMETER C_HIGHADDR = 0x804c017f 
 BUS_INTERFACE SPLB = plb_v46_mb0 
 PORT Irq = xps_intc_0_Irq 
 183 
 
 PORT Intr = xps_timebase_wdt_0_WDT_Interrupt & xps_timebase_wdt_0_Timebase_Interrupt & 
xps_timer_0_Interrupt & xps_gpio_0_IP2INTC_Irpt & xps_iic_0_IIC2INTC_Irpt 
END 
 
BEGIN xps_intc 
 PARAMETER INSTANCE = xps_intc_1 
 PARAMETER HW_VER = 2.01.a 
 PARAMETER C_IRQ_IS_LEVEL = 0 
 PARAMETER C_BASEADDR = 0x804c0100 
 PARAMETER C_HIGHADDR = 0x804c017f 
 BUS_INTERFACE SPLB = plb_v46_mb1 
 PORT Irq = xps_intc_1_Irq 
 PORT Intr = xps_timebase_wdt_1_WDT_Interrupt & xps_timebase_wdt_1_Timebase_Interrupt & 
xps_timer_1_Interrupt & xps_gpio_1_IP2INTC_Irpt & xps_iic_1_IIC2INTC_Irpt 
END 
 
BEGIN xps_intc 
 PARAMETER INSTANCE = xps_intc_2 
 PARAMETER HW_VER = 2.01.a 
 PARAMETER C_IRQ_IS_LEVEL = 0 
 PARAMETER C_BASEADDR = 0x804c0100 
 PARAMETER C_HIGHADDR = 0x804c017f 
 BUS_INTERFACE SPLB = plb_v46_mb2 
 PORT Irq = xps_intc_2_Irq 
 PORT Intr = xps_timebase_wdt_2_WDT_Interrupt & xps_timebase_wdt_2_Timebase_Interrupt & 
xps_timer_2_Interrupt & xps_gpio_2_IP2INTC_Irpt & xps_iic_2_IIC2INTC_Irpt 
END 
 
BEGIN fsl_v20 
 PARAMETER INSTANCE = fsl_v20_mb0_mb1 
 PARAMETER HW_VER = 2.11.e 
 PARAMETER C_IMPL_STYLE = 1 
 PARAMETER C_FSL_DEPTH = 512 
 PARAMETER C_ASYNC_CLKS = 1 
 PORT FSL_M_CLK = clock_generator_0_CLKOUT2_100MHz 
 PORT FSL_S_CLK = clock_generator_0_CLKOUT3_100MHz 
END 
 
BEGIN fsl_v20 
 PARAMETER INSTANCE = fsl_v20_mb0_mb2 
 PARAMETER HW_VER = 2.11.e 
 PARAMETER C_IMPL_STYLE = 1 
 PARAMETER C_FSL_DEPTH = 512 
 PARAMETER C_ASYNC_CLKS = 1 
 PORT FSL_M_CLK = clock_generator_0_CLKOUT2_100MHz 
 PORT FSL_S_CLK = clock_generator_0_CLKOUT0_100MHz 
END 
 
BEGIN fsl_v20 
 PARAMETER INSTANCE = fsl_v20_mb1_mb0 
 PARAMETER HW_VER = 2.11.e 
 PARAMETER C_IMPL_STYLE = 1 
 PARAMETER C_FSL_DEPTH = 512 
 PARAMETER C_ASYNC_CLKS = 1 
 PORT FSL_M_CLK = clock_generator_0_CLKOUT3_100MHz 
 PORT FSL_S_CLK = clock_generator_0_CLKOUT2_100MHz 
END 
 
BEGIN fsl_v20 
 PARAMETER INSTANCE = fsl_v20_mb1_mb2 
 184 
 
 PARAMETER HW_VER = 2.11.e 
 PARAMETER C_IMPL_STYLE = 1 
 PARAMETER C_FSL_DEPTH = 512 
 PARAMETER C_ASYNC_CLKS = 1 
 PORT FSL_M_CLK = clock_generator_0_CLKOUT3_100MHz 
 PORT FSL_S_CLK = clock_generator_0_CLKOUT0_100MHz 
END 
 
BEGIN fsl_v20 
 PARAMETER INSTANCE = fsl_v20_mb2_mb0 
 PARAMETER HW_VER = 2.11.e 
 PARAMETER C_IMPL_STYLE = 1 
 PARAMETER C_FSL_DEPTH = 512 
 PARAMETER C_ASYNC_CLKS = 1 
 PORT FSL_M_CLK = clock_generator_0_CLKOUT0_100MHz 
 PORT FSL_S_CLK = clock_generator_0_CLKOUT2_100MHz 
END 
 
BEGIN fsl_v20 
 PARAMETER INSTANCE = fsl_v20_mb2_mb1 
 PARAMETER HW_VER = 2.11.e 
 PARAMETER C_IMPL_STYLE = 1 
 PARAMETER C_FSL_DEPTH = 512 
 PARAMETER C_ASYNC_CLKS = 1 
 PORT FSL_M_CLK = clock_generator_0_CLKOUT0_100MHz 
 PORT FSL_S_CLK = clock_generator_0_CLKOUT3_100MHz 
END 
 
BEGIN proc_sys_reset 
 PARAMETER INSTANCE = reset_mb0 
 PARAMETER HW_VER = 3.00.a 
 PARAMETER C_EXT_RST_WIDTH = 2 
 PARAMETER C_AUX_RST_WIDTH = 2 
 PORT Bus_Struct_Reset = reset_mb0_Bus_Struct_Reset 
 PORT MB_Debug_Sys_Rst = debug_module_0_Debug_SYS_Reset 
 PORT Ext_Reset_In = net_reset_mb0_Ext_Reset_In_pin 
 PORT Aux_Reset_In = xps_timebase_wdt_0_WDT_Reset 
 PORT MB_Reset = reset_mb0_MB_Reset 
 PORT Slowest_sync_clk = clock_generator_0_CLKOUT3_100MHz 
 PORT Dcm_locked = Dcm_all_locked 
END 
 
BEGIN proc_sys_reset 
 PARAMETER INSTANCE = reset_mb1 
 PARAMETER HW_VER = 3.00.a 
 PARAMETER C_EXT_RST_WIDTH = 2 
 PARAMETER C_AUX_RST_WIDTH = 2 
 PORT Bus_Struct_Reset = reset_mb1_Bus_Struct_Reset 
 PORT MB_Debug_Sys_Rst = debug_module_0_Debug_SYS_Reset 
 PORT Ext_Reset_In = net_reset_mb1_Ext_Reset_In_pin 
 PORT Aux_Reset_In = xps_timebase_wdt_1_WDT_Reset 
 PORT MB_Reset = reset_mb1_MB_Reset 
 PORT Slowest_sync_clk = clock_generator_0_CLKOUT3_100MHz 
 PORT Dcm_locked = Dcm_all_locked 
END 
 
BEGIN proc_sys_reset 
 PARAMETER INSTANCE = reset_mb2 
 PARAMETER HW_VER = 3.00.a 
 PARAMETER C_EXT_RST_WIDTH = 2 
 185 
 
 PARAMETER C_AUX_RST_WIDTH = 2 
 PORT Bus_Struct_Reset = reset_mb2_Bus_Struct_Reset 
 PORT MB_Debug_Sys_Rst = debug_module_0_Debug_SYS_Reset 
 PORT Ext_Reset_In = net_reset_mb2_Ext_Reset_In_pin 
 PORT Aux_Reset_In = xps_timebase_wdt_2_WDT_Reset 
 PORT MB_Reset = reset_mb2_MB_Reset 
 PORT Slowest_sync_clk = clock_generator_0_CLKOUT3_100MHz 
 PORT Dcm_locked = Dcm_all_locked 
END 
 
BEGIN bram_block 
 PARAMETER INSTANCE = bram_block_mb0 
 PARAMETER HW_VER = 1.00.a 
 BUS_INTERFACE PORTA = bram_cntlr_Inst_mb0_BRAM_PORT 
 BUS_INTERFACE PORTB = bram_cntlr_Data_mb0_BRAM_PORT 
END 
 
BEGIN bram_block 
 PARAMETER INSTANCE = bram_block_mb1 
 PARAMETER HW_VER = 1.00.a 
 BUS_INTERFACE PORTA = bram_cntlr_Inst_mb1_BRAM_PORT 
 BUS_INTERFACE PORTB = bram_cntlr_Data_mb1_BRAM_PORT 
END 
 
BEGIN bram_block 
 PARAMETER INSTANCE = bram_block_mb2 
 PARAMETER HW_VER = 1.00.a 
 BUS_INTERFACE PORTA = bram_cntlr_Inst_mb2_BRAM_PORT 
 BUS_INTERFACE PORTB = bram_cntlr_Data_mb2_BRAM_PORT 
END 
 
BEGIN lmb_bram_if_cntlr 
 PARAMETER INSTANCE = bram_cntlr_Inst_mb0 
 PARAMETER HW_VER = 3.00.b 
 PARAMETER C_BASEADDR = 0x00000000 
 PARAMETER C_HIGHADDR = 0x00007fff 
 BUS_INTERFACE SLMB = Inst_lmb_v10_mb0 
 BUS_INTERFACE BRAM_PORT = bram_cntlr_Inst_mb0_BRAM_PORT 
END 
 
BEGIN lmb_bram_if_cntlr 
 PARAMETER INSTANCE = bram_cntlr_Inst_mb1 
 PARAMETER HW_VER = 3.00.b 
 PARAMETER C_BASEADDR = 0x00000000 
 PARAMETER C_HIGHADDR = 0x00007fff 
 BUS_INTERFACE SLMB = Inst_lmb_v10_mb1 
 BUS_INTERFACE BRAM_PORT = bram_cntlr_Inst_mb1_BRAM_PORT 
END 
 
BEGIN lmb_bram_if_cntlr 
 PARAMETER INSTANCE = bram_cntlr_Inst_mb2 
 PARAMETER HW_VER = 3.00.b 
 PARAMETER C_BASEADDR = 0x00000000 
 PARAMETER C_HIGHADDR = 0x00007fff 
 BUS_INTERFACE SLMB = Inst_lmb_v10_mb2 
 BUS_INTERFACE BRAM_PORT = bram_cntlr_Inst_mb2_BRAM_PORT 
END 
 
BEGIN lmb_bram_if_cntlr 
 PARAMETER INSTANCE = bram_cntlr_Data_mb0 
 186 
 
 PARAMETER HW_VER = 3.00.b 
 PARAMETER C_BASEADDR = 0x00000000 
 PARAMETER C_HIGHADDR = 0x00007fff 
 BUS_INTERFACE SLMB = Data_lmb_v10_mb0 
 BUS_INTERFACE BRAM_PORT = bram_cntlr_Data_mb0_BRAM_PORT 
END 
 
BEGIN lmb_bram_if_cntlr 
 PARAMETER INSTANCE = bram_cntlr_Data_mb1 
 PARAMETER HW_VER = 3.00.b 
 PARAMETER C_BASEADDR = 0x00000000 
 PARAMETER C_HIGHADDR = 0x00007fff 
 BUS_INTERFACE SLMB = Data_lmb_v10_mb1 
 BUS_INTERFACE BRAM_PORT = bram_cntlr_Data_mb1_BRAM_PORT 
END 
 
BEGIN lmb_bram_if_cntlr 
 PARAMETER INSTANCE = bram_cntlr_Data_mb2 
 PARAMETER HW_VER = 3.00.b 
 PARAMETER C_BASEADDR = 0x00000000 
 PARAMETER C_HIGHADDR = 0x00007fff 
 BUS_INTERFACE SLMB = Data_lmb_v10_mb2 
 BUS_INTERFACE BRAM_PORT = bram_cntlr_Data_mb2_BRAM_PORT 
END 
 
BEGIN fsl_v20_voter 
 PARAMETER INSTANCE = fsl_voter_0 
 PARAMETER HW_VER = 1.00.a 
 BUS_INTERFACE SFSL = fsl_v20_mb0 
END 
 
BEGIN fsl_v20_voter 
 PARAMETER INSTANCE = fsl_voter_1 
 PARAMETER HW_VER = 1.00.a 
 BUS_INTERFACE SFSL = fsl_v20_mb1 
END 
 
BEGIN fsl_v20_voter 
 PARAMETER INSTANCE = fsl_voter_2 
 PARAMETER HW_VER = 1.00.a 
 BUS_INTERFACE SFSL = fsl_v20_mb2 
END 
 
BEGIN plb_v46 
 PARAMETER INSTANCE = plb_v46_0_HWICAP 
 PARAMETER HW_VER = 1.05.a 
 PORT SYS_Rst = MB3_reset_Bus_Struct_Reset 
 PORT PLB_Clk = clock_generator_0_CLKOUT1_100MHz 
END 
 
BEGIN MicroBlaze 
 PARAMETER INSTANCE = MicroBlaze_3 
 PARAMETER HW_VER = 8.20.a 
 PARAMETER C_DEBUG_ENABLED = 1 
 PARAMETER C_AREA_OPTIMIZED = 1 
 BUS_INTERFACE DPLB = plb_v46_0_HWICAP 
 BUS_INTERFACE IPLB = plb_v46_0_HWICAP 
 BUS_INTERFACE DEBUG = debug_module_0_MBDEBUG_3 
 BUS_INTERFACE DLMB = MB3_lmb_v10_Data 
 BUS_INTERFACE ILMB = MB3_lmb_v10_Inst 
 187 
 
 PORT MB_RESET = MB3_reset_MB_Reset 
 PORT INTERRUPT = xps_intc_3_Irq 
END 
 
BEGIN proc_sys_reset 
 PARAMETER INSTANCE = MB3_reset 
 PARAMETER HW_VER = 3.00.a 
 PARAMETER C_EXT_RST_WIDTH = 2 
 PARAMETER C_AUX_RST_WIDTH = 2 
 PORT Bus_Struct_Reset = MB3_reset_Bus_Struct_Reset 
 PORT MB_Debug_Sys_Rst = debug_module_0_Debug_SYS_Reset 
 PORT Ext_Reset_In = net_MB3_reset_Ext_Reset_In_pin 
 PORT Aux_Reset_In = xps_timebase_wdt_3_WDT_Reset 
 PORT MB_Reset = MB3_reset_MB_Reset 
 PORT Slowest_sync_clk = clock_generator_0_CLKOUT1_100MHz 
 PORT Dcm_locked = Dcm_all_locked 
END 
 
BEGIN bram_block 
 PARAMETER INSTANCE = bram_block_0 
 PARAMETER HW_VER = 1.00.a 
 BUS_INTERFACE PORTA = MB3_Inst_BRAM_Cntlr_BRAM_PORT 
 BUS_INTERFACE PORTB = MB3_Data_BRAM_Cntlr_BRAM_PORT 
END 
 
BEGIN lmb_bram_if_cntlr 
 PARAMETER INSTANCE = MB3_Inst_BRAM_Cntlr 
 PARAMETER HW_VER = 3.00.b 
 PARAMETER C_BASEADDR = 0x00000000 
 PARAMETER C_HIGHADDR = 0x00007fff 
 BUS_INTERFACE SLMB = MB3_lmb_v10_Inst 
 BUS_INTERFACE BRAM_PORT = MB3_Inst_BRAM_Cntlr_BRAM_PORT 
END 
 
BEGIN lmb_bram_if_cntlr 
 PARAMETER INSTANCE = MB3_Data_BRAM_Cntlr 
 PARAMETER HW_VER = 3.00.b 
 PARAMETER C_BASEADDR = 0x00000000 
 PARAMETER C_HIGHADDR = 0x00007fff 
 BUS_INTERFACE SLMB = MB3_lmb_v10_Data 
 BUS_INTERFACE BRAM_PORT = MB3_Data_BRAM_Cntlr_BRAM_PORT 
END 
 
BEGIN lmb_v10 
 PARAMETER INSTANCE = MB3_lmb_v10_Data 
 PARAMETER HW_VER = 2.00.b 
 PORT SYS_Rst = MB3_reset_Bus_Struct_Reset 
 PORT LMB_Clk = clock_generator_0_CLKOUT1_100MHz 
END 
 
BEGIN lmb_v10 
 PARAMETER INSTANCE = MB3_lmb_v10_Inst 
 PARAMETER HW_VER = 2.00.b 
 PORT SYS_Rst = MB3_reset_Bus_Struct_Reset 
 PORT LMB_Clk = clock_generator_0_CLKOUT1_100MHz 
END 
 
BEGIN xps_timebase_wdt 
 PARAMETER INSTANCE = xps_timebase_wdt_3 
 PARAMETER HW_VER = 1.02.a 
 188 
 
 PARAMETER C_BASEADDR = 0x80004800 
 PARAMETER C_HIGHADDR = 0x8000487f 
 PARAMETER C_WDT_INTERVAL = 29 
 BUS_INTERFACE SPLB = plb_v46_0_HWICAP 
 PORT Timebase_Interrupt = xps_timebase_wdt_3_Timebase_Interrupt 
 PORT WDT_Interrupt = xps_timebase_wdt_3_WDT_Interrupt 
 PORT WDT_Reset = xps_timebase_wdt_3_WDT_Reset 
END 
 
BEGIN xps_intc 
 PARAMETER INSTANCE = xps_intc_3 
 PARAMETER HW_VER = 2.01.a 
 PARAMETER C_BASEADDR = 0x80004400 
 PARAMETER C_HIGHADDR = 0x8000447f 
 BUS_INTERFACE SPLB = plb_v46_0_HWICAP 
 PORT Irq = xps_intc_3_Irq 
 PORT Intr = xps_timebase_wdt_3_WDT_Interrupt & xps_timebase_wdt_3_Timebase_Interrupt & 
xps_gpio_3_IP2INTC_Irpt & xps_iic_3_IIC2INTC_Irpt 
END 
 
BEGIN xps_mch_emc 
 PARAMETER INSTANCE = xps_mch_emc_0 
 PARAMETER HW_VER = 3.01.a 
 PARAMETER C_NUM_CHANNELS = 0 
 PARAMETER C_MEM0_BASEADDR = 0x80600000 
 PARAMETER C_MEM0_HIGHADDR = 0x807fffff 
 BUS_INTERFACE SPLB = plb_v46_mb0 
 PORT RdClk = clock_generator_0_CLKOUT0_100MHz 
END 
 
BEGIN xps_mch_emc 
 PARAMETER INSTANCE = xps_mch_emc_1 
 PARAMETER HW_VER = 3.01.a 
 PARAMETER C_NUM_CHANNELS = 0 
 PARAMETER C_MEM0_BASEADDR = 0x80600000 
 PARAMETER C_MEM0_HIGHADDR = 0x807fffff 
 BUS_INTERFACE SPLB = plb_v46_mb1 
 PORT RdClk = clock_generator_0_CLKOUT0_100MHz 
END 
 
BEGIN xps_mch_emc 
 PARAMETER INSTANCE = xps_mch_emc_2 
 PARAMETER HW_VER = 3.01.a 
 PARAMETER C_NUM_CHANNELS = 0 
 PARAMETER C_MEM0_BASEADDR = 0x80600000 
 PARAMETER C_MEM0_HIGHADDR = 0x807fffff 
 BUS_INTERFACE SPLB = plb_v46_mb2 
 PORT RdClk = clock_generator_0_CLKOUT0_100MHz 
END 
 
BEGIN xps_mch_emc 
 PARAMETER INSTANCE = xps_mch_emc_3 
 PARAMETER HW_VER = 3.01.a 
 PARAMETER C_NUM_CHANNELS = 0 
 PARAMETER C_MEM0_BASEADDR = 0x80200000 
 PARAMETER C_MEM0_HIGHADDR = 0x8023FFFF 
 PARAMETER C_TCEDV_PS_MEM_0 = 35000 
 PARAMETER C_TAVDV_PS_MEM_0 = 35000 
 PARAMETER C_THZCE_PS_MEM_0 = 15000 
 PARAMETER C_THZOE_PS_MEM_0 = 10000 
 189 
 
 PARAMETER C_TWC_PS_MEM_0 = 35000 
 PARAMETER C_TWP_PS_MEM_0 = 15000 
 PARAMETER C_TLZWE_PS_MEM_0 = 3000 
 PARAMETER C_MAX_MEM_WIDTH = 16 
 PARAMETER C_MEM0_WIDTH = 16 
 PARAMETER C_INCLUDE_DATAWIDTH_MATCHING_0 = 1 
 BUS_INTERFACE SPLB = plb_v46_0_HWICAP 
 PORT RdClk = clock_generator_0_CLKOUT1_100MHz 
 PORT Mem_DQ = xps_mch_emc_3_Mem_DQ 
 PORT Mem_CKEN = xps_mch_emc_3_Mem_CKEN 
 PORT Mem_A = xps_mch_emc_3_Mem_A 
 PORT Mem_RPN = xps_mch_emc_3_Mem_RPN 
 PORT Mem_CEN = xps_mch_emc_3_Mem_CEN 
 PORT Mem_OEN = xps_mch_emc_3_Mem_OEN 
 PORT Mem_WEN = xps_mch_emc_3_Mem_WEN 
 PORT Mem_BEN = xps_mch_emc_3_Mem_BEN 
 PORT Mem_CE = xps_mch_emc_3_Mem_CE 
 PORT Mem_ADV_LDN = xps_mch_emc_3_Mem_ADV_LDN 
END 
 
BEGIN clock_generator 
 PARAMETER INSTANCE = clock_generator_0 
 PARAMETER HW_VER = 4.03.a 
 PARAMETER C_CLKIN_FREQ = 50000000 
 PARAMETER C_CLKOUT0_FREQ = 100000000 
 PARAMETER C_CLKOUT0_PHASE = 0 
 PARAMETER C_CLKOUT0_GROUP = NONE 
 PARAMETER C_CLKOUT0_BUF = TRUE 
 PARAMETER C_CLKOUT1_FREQ = 100000000 
 PARAMETER C_CLKOUT1_PHASE = 0 
 PARAMETER C_CLKOUT1_GROUP = NONE 
 PARAMETER C_CLKOUT1_BUF = TRUE 
 PARAMETER C_CLKOUT2_FREQ = 100000000 
 PARAMETER C_CLKOUT2_PHASE = 0 
 PARAMETER C_CLKOUT2_GROUP = NONE 
 PARAMETER C_CLKOUT2_BUF = TRUE 
 PARAMETER C_CLKOUT3_FREQ = 100000000 
 PARAMETER C_CLKOUT3_PHASE = 0 
 PARAMETER C_CLKOUT3_GROUP = NONE 
 PARAMETER C_CLKOUT3_BUF = TRUE 
 PARAMETER C_CLKOUT4_FREQ = 50000000 
 PORT CLKIN = CLK_S 
 PORT CLKOUT0 = clock_generator_0_CLKOUT0_100MHz 
 PORT CLKOUT1 = clock_generator_0_CLKOUT1_100MHz 
 PORT CLKOUT2 = clock_generator_0_CLKOUT2_100MHz 
 PORT CLKOUT3 = clock_generator_0_CLKOUT3_100MHz 
 PORT LOCKED = Dcm_all_locked 
 PORT RST = net_clock_generator_0_RST_pin 
 PORT CLKOUT4 = clock_generator_0_CLKOUT4_50MHz 
END 
 
BEGIN mailbox 
 PARAMETER INSTANCE = mailbox_mb0_mb1 
 PARAMETER HW_VER = 1.00.a 
 PARAMETER C_ASYNC_CLKS = 1 
 PARAMETER C_IMPL_STYLE = 1 
 PARAMETER C_MAILBOX_DEPTH = 512 
 PARAMETER C_SPLB0_BASEADDR = 0x80440400 
 PARAMETER C_SPLB0_HIGHADDR = 0x804407ff 
 PARAMETER C_SPLB1_BASEADDR = 0x80440400 
 190 
 
 PARAMETER C_SPLB1_HIGHADDR = 0x804407ff 
 BUS_INTERFACE SPLB0 = plb_v46_mb0 
 BUS_INTERFACE SPLB1 = plb_v46_mb1 
END 
 
BEGIN mailbox 
 PARAMETER INSTANCE = mailbox_mb0_mb2 
 PARAMETER HW_VER = 1.00.a 
 PARAMETER C_ASYNC_CLKS = 1 
 PARAMETER C_IMPL_STYLE = 1 
 PARAMETER C_MAILBOX_DEPTH = 512 
 PARAMETER C_SPLB0_BASEADDR = 0x80440800 
 PARAMETER C_SPLB0_HIGHADDR = 0x80440bff 
 PARAMETER C_SPLB1_BASEADDR = 0x80440400 
 PARAMETER C_SPLB1_HIGHADDR = 0x804407ff 
 BUS_INTERFACE SPLB0 = plb_v46_mb0 
 BUS_INTERFACE SPLB1 = plb_v46_mb2 
END 
 
BEGIN mailbox 
 PARAMETER INSTANCE = mailbox_mb0_mb3 
 PARAMETER HW_VER = 1.00.a 
 PARAMETER C_ASYNC_CLKS = 1 
 PARAMETER C_IMPL_STYLE = 1 
 PARAMETER C_MAILBOX_DEPTH = 512 
 PARAMETER C_SPLB0_BASEADDR = 0x80440c00 
 PARAMETER C_SPLB0_HIGHADDR = 0x80440fff 
 PARAMETER C_SPLB1_BASEADDR = 0x80007c00 
 PARAMETER C_SPLB1_HIGHADDR = 0x80007fff 
 BUS_INTERFACE SPLB0 = plb_v46_mb0 
 BUS_INTERFACE SPLB1 = plb_v46_0_HWICAP 
END 
 
BEGIN mailbox 
 PARAMETER INSTANCE = mailbox_mb1_mb2 
 PARAMETER HW_VER = 1.00.a 
 PARAMETER C_ASYNC_CLKS = 1 
 PARAMETER C_IMPL_STYLE = 1 
 PARAMETER C_MAILBOX_DEPTH = 512 
 PARAMETER C_SPLB0_BASEADDR = 0x80440800 
 PARAMETER C_SPLB0_HIGHADDR = 0x80440bff 
 PARAMETER C_SPLB1_BASEADDR = 0x80440800 
 PARAMETER C_SPLB1_HIGHADDR = 0x80440bff 
 BUS_INTERFACE SPLB0 = plb_v46_mb1 
 BUS_INTERFACE SPLB1 = plb_v46_mb2 
END 
 
BEGIN mailbox 
 PARAMETER INSTANCE = mailbox_mb1_mb3 
 PARAMETER HW_VER = 1.00.a 
 PARAMETER C_ASYNC_CLKS = 1 
 PARAMETER C_IMPL_STYLE = 1 
 PARAMETER C_MAILBOX_DEPTH = 512 
 PARAMETER C_SPLB0_BASEADDR = 0x80440c00 
 PARAMETER C_SPLB0_HIGHADDR = 0x80440fff 
 PARAMETER C_SPLB1_BASEADDR = 0x80006800 
 PARAMETER C_SPLB1_HIGHADDR = 0x80006bff 
 BUS_INTERFACE SPLB0 = plb_v46_mb1 
 BUS_INTERFACE SPLB1 = plb_v46_0_HWICAP 
END 
 191 
 
 
BEGIN mailbox 
 PARAMETER INSTANCE = mailbox_mb2_mb3 
 PARAMETER HW_VER = 1.00.a 
 PARAMETER C_ASYNC_CLKS = 1 
 PARAMETER C_IMPL_STYLE = 1 
 PARAMETER C_MAILBOX_DEPTH = 512 
 PARAMETER C_SPLB0_BASEADDR = 0x80440c00 
 PARAMETER C_SPLB0_HIGHADDR = 0x80440fff 
 PARAMETER C_SPLB1_BASEADDR = 0x80007000 
 PARAMETER C_SPLB1_HIGHADDR = 0x800073ff 
 BUS_INTERFACE SPLB0 = plb_v46_mb2 
 BUS_INTERFACE SPLB1 = plb_v46_0_HWICAP 
END 
 
BEGIN xps_gpio 
 PARAMETER INSTANCE = xps_gpio_3 
 PARAMETER HW_VER = 2.00.a 
 PARAMETER C_INTERRUPT_PRESENT = 1 
 PARAMETER C_BASEADDR = 0x80004a00 
 PARAMETER C_HIGHADDR = 0x80004bff 
 BUS_INTERFACE SPLB = plb_v46_0_HWICAP 
 PORT IP2INTC_Irpt = xps_gpio_3_IP2INTC_Irpt 
 PORT GPIO_IO = xps_gpio_3_GPIO_IO[0:31] 
END 
 
BEGIN xps_uartlite 
 PARAMETER INSTANCE = xps_uartlite_3 
 PARAMETER HW_VER = 1.02.a 
 PARAMETER C_BAUDRATE = 9600 
 PARAMETER C_USE_PARITY = 0 
 PARAMETER C_BASEADDR = 0x80001800 
 PARAMETER C_HIGHADDR = 0x8000187f 
 BUS_INTERFACE SPLB = plb_v46_0_HWICAP 
 PORT RX = net_xps_uartlite_3_RX_pin 
 PORT TX = xps_uartlite_3_TX 
END 
 
BEGIN xps_iic 
 PARAMETER INSTANCE = xps_iic_3 
 PARAMETER HW_VER = 2.03.a 
 PARAMETER C_BASEADDR = 0x81600000 
 PARAMETER C_HIGHADDR = 0x8160ffff 
 PARAMETER C_IIC_FREQ = 50000 
 PARAMETER C_SCL_INERTIAL_DELAY = 0 
 PARAMETER C_SDA_INERTIAL_DELAY = 0 
 BUS_INTERFACE SPLB = plb_v46_0_HWICAP 
 PORT Sda = xps_iic_3_Sda 
 PORT Scl = xps_iic_3_Scl 
 PORT IIC2INTC_Irpt = xps_iic_3_IIC2INTC_Irpt 
END 
 
 192 
 
Appendix B   
SEU Controller Connection Schematic 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
Clock Divider 
MPSoC Block
 193 
 
Appendix C   
SEU Screening of Spartan-6 FPGA Using Cf-252 
 
C.1  Introduction 
Building satellites for scientific missions at Universities requires the use of low cost 
and high performance components to maintain the mission budget and performance. 
This requirement drives the researchers to make use of the available state-of-art 
COTS components in developing satellite subsystems. COTS have higher 
performance and extremely low cost when compared to the space approved 
components however they are not guaranteed for use in space applications. 
Therefore, researchers are working on the development of new design techniques to 
make COTS components suitable for use in space. At Kyushu Institute of 
Technology in Japan, we have been carrying research on the fitness of commercial 
grade SRAM based FPGAs for use in scientific satellite missions. SRAM based 
FPGAs are widely used in developing digital systems for different applications. 
However, their use in space applications is still under investigation due to three 
reasons: 1) their configuration bitstream can be corrupted by SEUs, 2) they consume 
high power and 3) their cost is high when compared to discrete components. FPGA 
manufacturers managed to solve the problems of power consumption and cost. They 
introduced low cost and low power consumption FPGAs such as the Spartan-6 with 
45nm technology from Xilinx. To solve the problem of corrupting the configuration 
memory by the space radiation, Xilinx developed a technique named DPR. In that 
technique, the configuration memory can be read and checked for errors in a frame 
by frame basis. If a single bit error is found then the frame would be corrected and 
written back to the configuration memory. Xilinx introduced an IP core, the Single 
Event Mitigation (SEM) core, to scrub the configuration memory of its devices using 
the DPR technique. The purpose of this research is to evaluate the performance of 
using the SEM with DPR in mitigating soft errors in the configuration memory of the 
Sparatn-6 FPGA using Cf-252. The radiation tests were carried out at the Kyoto 
University/Research Reactor Institute using a decapsulated Sapatn-6 FPGA. The use 
of Cf-252 in assessing the radiation effects is easier and has lower cost when 
compared to the proton beam and heavy ion tests. Nevertheless, when using Cf-252 
 194 
 
we cannot control the flux and energy of the radiated ions. Therefore, the Cf-252 is 
suitable to give an estimate of the design and fabric performance under radiation 
without having detailed accurate measurements of the dynamic and static cross 
sections as in the proton and heavy ion tests. In fact for developing low cost missions 
with acceptable reliability, Cf-252 test results are adequate. Cf-252 is a very suitable 
alternative to the expensive and complicated variant of proton and heavy ion tests. 
We do believe that if more research is carried on qualifying COTS SRAM based 
FPGAs for use in space then more advanced missions would be launched and more 
capabilities would be acquired. In the next sections we present the tested system 
design, the test setup, the test results and the conclusion. 
C.2  System Design and Test Setup 
An embedded processor system was developed using the EDK package from Xilinx.. 
Figure 1 shows the architecture and external interface of the used design during the 
Cf-252 tests. The system consisted of a MicroBlaze processor interfaced to an 
Interrupt Controller (INTC), Timer, WDT and Reset Module through the PLB. The 
MicroBlaze run the code form its Block RAM (BRAM). MDM IP core was used for 
debugging through JTAG interface. Xilinx General Peripheral Input/Output (XGPIO) 
core was used for external Input/Output from/to status leds and switches. A UART is 
used for data logging from the MicroBlaze to an external PC. The SEM core 
continuously scrubbed the configuration memory which contained 11,939,296 
configuration bits [1]. It sent the results of scrubbing indicating any single or 
multiple bit upsets found in the entire device bitstream to the external data logging 
PC. During the test, software ISR was initiated based on the interrupt received from 
the WDT via the INTC. The WDT issued the interrupt whenever it reached its time 
out value (every 5.34 sec). Upon reception of the interrupt and initiation of the ISR, 
the MicroBalze reset the WDT. If the MicroBlaze was in a hang up situation then the 
WDT would not be reset. At that moment the WDT sends a reset signal to the reset 
module to reset the whole system. An external WDT with longer time out interval 
(every 15sec) resets and reconfigures the whole FPGA whenever the MicroBlaze 
fails to restart after the internal WDT reset. The Sparatn-6 was decapsualted as 
shown in figure 2. During the test, The Cf-252 was placed at different distances from 
the FPGA top. The number of system failures and configuration frames bit upsets 
were recorded. The test was conducted in two cases: with enabling the SEM core and 
  
without en
and 1cm d
2 cm and 
vacuumed
C.3  Re
The SEU 
between th
LET of 4
changes w
the distan
increases 
distance to
 
 
 
 
 
 
 
 
 
abling the 
istances. Sy
3 cm.  Eac
 chamber.  
sults and
and MBU 
e Cf-252 a
3 (MeV cm
ith the dist
ce increase
because the
 the FPGA
Figure 2 
SEM core
stem Failu
h test was
 Discussi
count durin
nd the FPG
2/mg). Th
ance from 
s. MBU r
 stopping e
 silicon inc
Decapsulat
. Static cro
re rate tests
 run for 20
on 
g 20 min o
A top is sh
e flux is a
the FPGA.
atio to the
nergy caus
reases. 
ed Spartan 
ss section e
 were cond
 min. All t
f tests at 1
own in fig
bout 22 ion
 The total n
 total ups
ing the adj
 
6 FPGA. 
valuation w
ucted at thr
he tests we
cm, 2 cm 
ure 3. The 
s/cm2/sec 
umber of 
ets decreas
acent bit up
as done at
ee distance
re carried 
and 3 cm d
Cf-252 has
[2][3]. The
upsets decr
es as the 
sets reduce
195 
 0.5 cm 
s: 1 cm, 
out in a 
 
istances 
 a mean 
 energy 
eases as 
distance 
s as the 
  
To estima
consider th
252 is 22 
upsets acc
 
To
 
te the bit 
at the devi
 (ions/cm2/
ording to E
tal	Bit	Ups
Figu
Figure 3 S
upset rate 
ce has 11,9
sec).The M
q.1: 
ets	Count
re 4 Bit Up
EU and M
and static
39,296 con
BUs are co
= MBU Co
set Rate for
BU count V
 cross sect
figuration b
nsidered a
unt × 2 +
 the Entire 
s. Distance
ion for the
its and tha
s double. W
SEU Coun
Spartan 6 D
. 
 entire de
t the flux o
e calculat
t 
evice. 
196 
 
vice we 
f the Cf-
e the bit 
(1) 
 
  
The data 
curve is ab
 
 
The syste
reloading.
disabled. W
system fa
failure rate
is fit in fig
out 0.94. T
Figur
m failure 
 It is show
hen the S
ilures were
s depend o
ure 4using
he cross se
σ	 = ݊ݑ݂݈ݑ݁
e 5 Bit Ups
rate was e
n in two c
EM core i
 reduced b
n the select
 a power e
ction is cal
ܾ݉݁ݎ ݋݂ ݑ
݊ܿ݁ (݌ܽݎݐ݅
et Cross Se
stimated b
ases when 
s used for 
y about 32
ed orbit.  
quation, th
culated from
݌ݏ݁ݐݏ
݈ܿ݁ ܿ݉ଶ)⁄
ction for th
ased on fu
the SEM c
scrubbing t
% at 0.5 c
e R-Square
 Eq.2 and 
e Sparatn 6
ll system 
ore is enab
he configu
m and 52.
d value fo
shown in fi
 Device. 
failure and
led and wh
ration mem
9% at 1cm
197 
r the fit 
gure5 : 
(2)
 
 FPGA 
en it is 
ory, the 
. Actual 
  
C.4  Co
Decapsula
better perf
cross secti
F
Fig
nclusion
ted Spartan
ormance in
on curve is
igure 6 Sys
ure 7 Syste
  
 6 was te
 terms of r
 presented. 
tem Failure
m Failure R
sted using
educed fail
 
 Rate with 
ate withou
 Cf-252 irr
ure rates w
Mitigation.
t Mitigation
adiation. T
hen SEM 
 
. 
he FPGA 
core was u
198 
 
 
showed 
sed. The 
 199 
 
References 
[1] Xilinx, Spartan-6 FPGA Configuration User Guide, UG380 (v2.5) January 23, 
2013. 
[2] Gaisler Research, LEON3-FT-RTAX SEU test results, Issue 1, December 7, 
2005.  
[3] Stephen, J. H.; Sanderson, T. K.; Mapper, D.; Farren, J.; Harboe-Sorensen, R.; 
Adams, L., "A Comparison of Heavy Ion Sources Used in Cosmic Ray 
Simulation Studies of VLSI Circuits," Nuclear Science, IEEE Transactions on , 
vol.31, no.6, pp.1069,1072, Dec. 1984. 
 
  
 200 
 
Appendix D   
Sample Code – MPSoC Embedded Software in Thermal 
Vacuum and Radiation Tests 
D.1  MicroBlaze – 0 Code 
 
 
#include <stdio.h> 
#include "xparameters.h" 
#include "xuartlite.h" 
#include "xuartlite_l.h" 
#include "platform.h" 
#include "fsl.h" 
#include "xwdttb.h" 
#include "xintc.h" 
#include "xmbox.h" 
#include "xtmrctr.h" 
  
  
  
//comment this line to stop resetting the Microblaze through its WDT 
//#define reset_WDT 
  
void print(char *str); 
  
XWdtTb WDT; 
XIntc INTC; 
XMbox MB0_MB3; 
XMbox_Config CFG1; 
XTmrCtr Timer; 
  
  
union ACK_mes 
{ 
    char ACK_char[4]; 
    unsigned long ACK_long; 
}; 
  
union ACK_mes DATA; 
  
  
//volatile unsigned long int i = 0; 
  
#define reset_WDT 
static unsigned long Time=0; 
  
unsigned char Timer_start_flag=0; 
unsigned long Timer_Value; 
  
  
void WDT_INT_Handler(void* base_p) 
{ 
 201 
 
  
//Interrupt Handling Routine, restart watch dog timer 
    //unsigned int long j; 
  
#ifdef reset_WDT 
    *(&Time) = *(&Time) + 1; 
  
    //xil_printf("\r\n WDT ISR started....resetting watch dog 
timer...\r\n"); 
  
    XMbox_WriteBlocking(&MB0_MB3, &DATA.ACK_long, 4); 
  
  
    if (Timer_start_flag==1) 
        { 
         XTmrCtr_Stop  (&Timer, 0); 
         Timer_start_flag=0; 
         Timer_Value = XTmrCtr_GetValue (&Timer, 0); 
         xil_printf("\r\n Timer is Stopped... Value = %d \r\n",  
Timer_Value); 
        } 
    if (Timer_start_flag==0) 
    { 
        Timer_start_flag=1; 
        XTmrCtr_Reset (&Timer, 0); 
        XTmrCtr_Start  (&Timer, 0); 
        xil_printf("\r\n Timer is Started... \r\n"); 
    } 
  
  
    xil_printf("\r\n MB0\t Time Step: %d\tStatus: OK \r\n", Time); 
    xil_printf("\r\n ACk sent to MB3 \r\n"); 
  
    //for (j=0;j<100000;j++); 
  
  
    XWdtTb_RestartWdt  (&WDT) ; 
#else 
        //if WDT interrupt is set in hardware as level then you will 
see this message until system reset, make it edge to enter ISR once 
        xil_printf("\r\n WDT ISR started....watch dog timer will not 
be reset...\r\n"); 
  
        for (j=0;j<100000;j++); 
  
  
#endif 
  
  
    //  return; 
  
} 
  
  
int main() 
{ 
    DATA.ACK_char[0] = 'A'; 
    DATA.ACK_char[1] = 'C'; 
 202 
 
    DATA.ACK_char[2] = 'K'; 
    DATA.ACK_char[3] = '1'; 
  
    init_platform(); 
    XUartLite Com_Port; 
  
    //Initialize UART 
    XUartLite_Initialize  ( &Com_Port,  
XPAR_XPS_UARTLITE_0_DEVICE_ID); 
  
    xil_printf("\r\n WELCOME to MB0\r\n"); 
  
    //Initialize Mailbox 
     XMbox_CfgInitialize (&MB0_MB3, &CFG1, 
XPAR_MAILBOX_MB0_MB3_IF_0_BASEADDR); 
  
  
     //initialize counter 
      XTmrCtr_Initialize ( &Timer, XPAR_XPS_TIMER_0_DEVICE_ID); 
  
     //reset value 
     XTmrCtr_SetResetValue  ( &Timer, 0,  0); 
  
  
     //Initialize Watch Dog Timer 
  
    XWdtTb_Initialize  (&WDT, XPAR_XPS_TIMEBASE_WDT_0_DEVICE_ID  ); 
  
  
     //check if there was a previous reset... 
     if (XWdtTb_IsWdtExpired  (&WDT)) 
         xil_printf("\r\n MB0 was reset...\r\n"); 
     else 
     xil_printf("\r\n System is power recycled...\r\n"); 
  
     xil_printf("\r\n MB0\t Time Step: %d\tStatus: OK \r\n", Time); 
  
    /* Enable MicroBlaze interrupts */ 
    microblaze_enable_interrupts(); 
  
    /* Register Watch Dog Timer interrupt handler */ 
    XIntc_RegisterHandler(XPAR_XPS_INTC_0_BASEADDR , 
XPAR_XPS_INTC_0_XPS_TIMEBASE_WDT_0_WDT_INTERRUPT_INTR, 
WDT_INT_Handler, (void*)XPAR_XPS_TIMEBASE_WDT_0_BASEADDR); 
  
    /* Start the interrupt controller */ 
    XIntc_MasterEnable (XPAR_XPS_INTC_0_BASEADDR); 
  
    // very important, if you remove it then interrupt will not be 
caught....enable interrupt for the WDT 
  
    XIntc_EnableIntr  (XPAR_XPS_INTC_0_BASEADDR, 
XPAR_XPS_TIMEBASE_WDT_0_WDT_INTERRUPT_MASK); 
  
  
  
 203 
 
////////////////////////////////////////////////////////////////////
//////////////////////////////////////// 
  
    //Start Watch Dog Timer 
  
     XWdtTb_Start  (&WDT); 
  
     for(;;) 
     { 
         //xil_printf("\r\n MB0 is working"); 
         //for (i = 0; i<1000000; i++); 
     } 
  
    cleanup_platform(); 
  
    return 0; 
} 
  
  
D.2  MIcroBlaze 1 – Code  
 
 
#include <stdio.h> 
#include "xparameters.h" 
#include "xuartlite.h" 
#include "xuartlite_l.h" 
#include "platform.h" 
#include "fsl.h" 
#include "xwdttb.h" 
#include "xintc.h" 
#include "xmbox.h" 
#include "xtmrctr.h" 
  
//comment this line to stop resetting the Microblaze through its WDT 
//#define reset_WDT 
  
void print(char *str); 
  
XWdtTb WDT; 
XIntc INTC; 
XMbox MB1_MB3; 
XMbox_Config CFG1; 
XTmrCtr Timer; 
  
union ACK_mes 
{ 
    char ACK_char[4]; 
    unsigned long ACK_long; 
}; 
  
union ACK_mes DATA; 
//volatile unsigned long int i = 0; 
  
#define reset_WDT 
static unsigned long Time=0; 
  
 204 
 
unsigned char Timer_start_flag=0; 
unsigned long Timer_Value; 
  
  
  
void WDT_INT_Handler(void* base_p) 
{ 
  
//Interrupt Handling Routine, restart watch dog timer 
//  unsigned int long j; 
  
#ifdef reset_WDT 
    *(&Time) = *(&Time) + 1; 
    //xil_printf("\r\n WDT ISR started....resetting watch dog 
timer...\r\n"); 
  
    XMbox_WriteBlocking(&MB1_MB3, &DATA.ACK_long, 4); 
  
    if (Timer_start_flag==1) 
            { 
             XTmrCtr_Stop  (&Timer, 0); 
             Timer_start_flag=0; 
             Timer_Value = XTmrCtr_GetValue (&Timer, 0); 
             xil_printf("\r\n Timer is Stopped... Value = %d \r\n",  
Timer_Value); 
            } 
        if (Timer_start_flag==0) 
        { 
            Timer_start_flag=1; 
            XTmrCtr_Reset (&Timer, 0); 
            XTmrCtr_Start  (&Timer, 0); 
            xil_printf("\r\n Timer is Started... \r\n"); 
        } 
  
  
    xil_printf("\r\n MB1\t Time Step: %d\tStatus: OK \r\n", Time); 
    xil_printf("\r\n ACk sent to MB3 \r\n"); 
    //for (j=0;j<100000;j++); 
  
  
    XWdtTb_RestartWdt  (&WDT) ; 
#else 
        //if WDT interrupt is set in hardware as level then you will 
see this message until system reset, make it edge to enter ISR once 
        xil_printf("\r\n WDT ISR started....watch dog timer will not 
be reset...\r\n"); 
  
        for (j=0;j<100000;j++); 
  
  
#endif 
  
  
    //  return; 
  
} 
  
  
 205 
 
int main() 
{ 
    DATA.ACK_char[0] = 'A'; 
    DATA.ACK_char[1] = 'C'; 
    DATA.ACK_char[2] = 'K'; 
    DATA.ACK_char[3] = '1'; 
  
    init_platform(); 
    XUartLite Com_Port; 
    XUartLite_Initialize  ( &Com_Port,  
XPAR_XPS_UARTLITE_1_DEVICE_ID); 
  
    xil_printf("\r\n WELCOME to MB1\r\n"); 
  
    //Initialize Mailbox 
     XMbox_CfgInitialize (&MB1_MB3, &CFG1, 
XPAR_MAILBOX_MB1_MB3_IF_0_BASEADDR); 
  
     //initialize counter 
     XTmrCtr_Initialize ( &Timer, XPAR_XPS_TIMER_1_DEVICE_ID); 
  
    //reset value 
    XTmrCtr_SetResetValue  ( &Timer, 0,  0); 
  
     //Initialize Watch Dog Timer 
  
    XWdtTb_Initialize  (&WDT, XPAR_XPS_TIMEBASE_WDT_1_DEVICE_ID  ); 
  
  
     //check if there was a previous reset... 
     if (XWdtTb_IsWdtExpired  (&WDT)) 
         xil_printf("\r\n MB1 was reset...\r\n"); 
     else 
     xil_printf("\r\n System is power recycled...\r\n"); 
     xil_printf("\r\n MB1\t Time Step: %d\tStatus: OK \r\n", Time); 
  
    /* Enable MicroBlaze interrupts */ 
    microblaze_enable_interrupts(); 
  
    /* Register Watch Dog Timer interrupt handler */ 
    XIntc_RegisterHandler(XPAR_XPS_INTC_1_BASEADDR , 
XPAR_XPS_INTC_1_XPS_TIMEBASE_WDT_1_WDT_INTERRUPT_INTR, 
WDT_INT_Handler, (void*)XPAR_XPS_TIMEBASE_WDT_1_BASEADDR); 
  
    /* Start the interrupt controller */ 
    XIntc_MasterEnable (XPAR_INTC_0_BASEADDR); 
  
    // very important, if you remove it then interrupt will not be 
caught....enable interrupt for the WDT 
  
    XIntc_EnableIntr  
(XPAR_XPS_INTC_1_BASEADDR,XPAR_XPS_TIMEBASE_WDT_1_WDT_INTERRUPT_MASK
); 
  
  
  
////////////////////////////////////////////////////////////////////
//////////////////////////////////////// 
 206 
 
  
    //Start Watch Dog Timer 
  
     XWdtTb_Start  (&WDT); 
  
     for(;;) 
     { 
         //xil_printf("\r\n MB1 is working"); 
         //for (i = 0; i<1000000; i++); 
     } 
  
    cleanup_platform(); 
  
    return 0; 
} 
  
  
D.3  MicroBlaze 2 – Code  
 
 
#include <stdio.h> 
#include "xparameters.h" 
#include "xuartlite.h" 
#include "xuartlite_l.h" 
#include "platform.h" 
#include "fsl.h" 
#include "xwdttb.h" 
#include "xintc.h" 
#include "xmbox.h" 
#include "xtmrctr.h" 
  
  
//comment this line to stop resetting the Microblaze through its WDT 
//#define reset_WDT 
  
void print(char *str); 
  
XWdtTb WDT; 
XIntc INTC; 
XMbox MB2_MB3; 
XMbox_Config CFG1; 
XTmrCtr Timer; 
  
union ACK_mes 
{ 
    char ACK_char[4]; 
    unsigned long ACK_long; 
}; 
  
union ACK_mes DATA; 
  
//volatile unsigned long int i = 0; 
  
  
#define reset_WDT 
static unsigned long Time=0; 
 207 
 
  
unsigned char Timer_start_flag=0; 
unsigned long Timer_Value; 
  
  
  
void WDT_INT_Handler(void* base_p) 
{ 
  
//Interrupt Handling Routine, restart watch dog timer 
//  unsigned int long j; 
  
#ifdef reset_WDT 
    *(&Time) = *(&Time) + 1; 
    //xil_printf("\r\n WDT ISR started....resetting watch dog 
timer...\r\n"); 
  
    XMbox_WriteBlocking(&MB2_MB3, &DATA.ACK_long, 4); 
  
    if (Timer_start_flag==1) 
                { 
                 XTmrCtr_Stop  (&Timer, 0); 
                 Timer_start_flag=0; 
                 Timer_Value = XTmrCtr_GetValue (&Timer, 0); 
                 xil_printf("\r\n Timer is Stopped... Value = %d 
\r\n",  Timer_Value); 
                } 
            if (Timer_start_flag==0) 
            { 
                Timer_start_flag=1; 
                XTmrCtr_Reset (&Timer, 0); 
                XTmrCtr_Start  (&Timer, 0); 
                xil_printf("\r\n Timer is Started... \r\n"); 
            } 
  
  
    xil_printf("\r\n MB2\t Time Step: %d\tStatus: OK \r\n", Time); 
    xil_printf("\r\n ACk sent to MB3 \r\n"); 
    //for (j=0;j<100000;j++); 
  
    XWdtTb_RestartWdt  (&WDT) ; 
#else 
        //if WDT interrupt is set in hardware as level then you will 
see this message until system reset, make it edge to enter ISR once 
        xil_printf("\r\n WDT ISR started....watch dog timer will not 
be reset...\r\n"); 
  
        for (j=0;j<100000;j++); 
  
  
#endif 
  
  
    //  return; 
  
} 
  
  
 208 
 
int main() 
{ 
    DATA.ACK_char[0] = 'A'; 
    DATA.ACK_char[1] = 'C'; 
    DATA.ACK_char[2] = 'K'; 
    DATA.ACK_char[3] = '1'; 
  
    init_platform(); 
    XUartLite Com_Port; 
    XUartLite_Initialize  ( &Com_Port,  
XPAR_XPS_UARTLITE_2_DEVICE_ID); 
  
    xil_printf("\r\n WELCOME to MB2\r\n"); 
  
  
    //Initialize Mailbox 
    XMbox_CfgInitialize (&MB2_MB3, &CFG1, 
XPAR_MAILBOX_MB2_MB3_IF_0_BASEADDR); 
  
     //initialize counter 
         XTmrCtr_Initialize ( &Timer, XPAR_XPS_TIMER_2_DEVICE_ID); 
  
        //reset value 
        XTmrCtr_SetResetValue  ( &Timer, 0,  0); 
  
  
     //Initialize Watch Dog Timer 
  
    XWdtTb_Initialize  (&WDT, XPAR_XPS_TIMEBASE_WDT_2_DEVICE_ID  ); 
  
  
     //check if there was a previous reset... 
     if (XWdtTb_IsWdtExpired  (&WDT)) 
         xil_printf("\r\n MB2 was reset...\r\n"); 
     else 
     xil_printf("\r\n System is power recycled...\r\n"); 
     xil_printf("\r\n MB2\t Time Step: %d\tStatus: OK \r\n", Time); 
  
    /* Enable MicroBlaze interrupts */ 
    microblaze_enable_interrupts(); 
  
    /* Register Watch Dog Timer interrupt handler */ 
    XIntc_RegisterHandler(XPAR_XPS_INTC_2_BASEADDR , 
XPAR_XPS_INTC_2_XPS_TIMEBASE_WDT_2_WDT_INTERRUPT_INTR, 
WDT_INT_Handler, (void*)XPAR_XPS_TIMEBASE_WDT_2_BASEADDR); 
  
    /* Start the interrupt controller */ 
    XIntc_MasterEnable (XPAR_XPS_INTC_2_BASEADDR); 
  
    // very important, if you remove it then interrupt will not be 
caught....enable interrupt for the WDT 
  
    XIntc_EnableIntr  
(XPAR_XPS_INTC_2_BASEADDR,XPAR_XPS_TIMEBASE_WDT_2_WDT_INTERRUPT_MASK
); 
  
  
  
 209 
 
////////////////////////////////////////////////////////////////////
//////////////////////////////////////// 
  
    //Start Watch Dog Timer 
  
     XWdtTb_Start  (&WDT); 
  
     for(;;) 
     { 
  
        // xil_printf("\r\n MB2 is working"); 
        // for (i = 0; i<1000000; i++); 
  
     } 
  
    cleanup_platform(); 
  
    return 0; 
} 
  
  
D.4  MicroBlaze 3 (MSCP) – Code 
 
 
#include <stdio.h> 
#include "xparameters.h" 
#include "xuartlite.h" 
#include "xuartlite_l.h" 
#include "platform.h" 
#include "fsl.h" 
#include "xwdttb.h" 
#include "xintc.h" 
#include "xiic_l.h" 
#include "xiic.h" 
#include "xmbox.h" 
#include "xgpio.h" 
#include "MAX6656.h" 
#include "xutil.h" 
  
#define MEM_UNDER_TEST_BASE_ADDR  XPAR_EMC_0_MEM0_BASEADDR 
#define print_MAX6656_messages 
#define reset_WDT 
#define thermal_data 
#define Board_MRAM //This is for the 256K MRAM 
  
//comment this line to stop resetting the Microblaze through its WDT 
//#define reset_WDT 
  
int initialize_MAX6656(void); 
void print(char *str); 
  
int STATUS; 
XStatus status; 
unsigned long Time=0; 
  
  
 210 
 
int flag = 1; 
int flag2 = 0; 
int FPGA_temp; 
int sensor_temp; 
//volatile unsigned long int i = 0; 
  
XIic IIC; 
XWdtTb WDT; 
XIntc INTC; 
XUartLite Com_Port; 
XMbox MB3_MB0; 
XMbox_Config CFG0; 
XMbox MB3_MB1; 
XMbox_Config CFG1; 
XMbox MB3_MB2; 
XMbox_Config CFG2; 
XGpio gpio; 
  
  
  
union ACK_mes 
{ 
    char ACK_char[4]; 
    unsigned long ACK_long; 
}; 
  
union ACK_mes DATA; 
  
unsigned long r0; 
unsigned long r1; 
unsigned long r2; 
  
unsigned long WDI; 
  
u8 TEMP_SENSOR_ADDRESS = 0x18; //originally it is 0x30 but alligned 
for 7 bit addressing 
  
  
void WDT_INT_Handler(void* base_p) 
{ 
  
    //Interrupt Handling Routine, restart watch dog timer 
    //  unsigned int long j; 
  
#ifdef reset_WDT 
    //*(&Time) = *(&Time) + 1; 
    Time++; 
    //xil_printf("\r\n WDT ISR started....resetting watch dog 
timer...\r\n"); 
  
    XMbox_ReadBlocking(&MB3_MB0, &r0, 4); 
    XMbox_ReadBlocking(&MB3_MB1, &r1, 4); 
    XMbox_ReadBlocking(&MB3_MB2, &r2, 4); 
  
    WDI = WDI^0x80000000; 
    XGpio_DiscreteWrite (&gpio,1,WDI); 
  
  
 211 
 
    flag = 1; 
  
    if ((r0 == DATA.ACK_long) && (r1 == DATA.ACK_long) && (r2 == 
DATA.ACK_long)) 
    { 
        XWdtTb_RestartWdt  (&WDT) ; 
        flag2 = 1; 
    } 
  
  
    //for (j=0;j<100000;j++); 
  
    
////////////////////////////////////////////////////////////////////
////////////////////////// 
        xil_printf("\r\n Starting MemoryTest for EMC_MRAM:\r\n"); 
  
        xil_printf("\r\n Running 32-bit test...\r\n"); 
        status = 
XUtil_MemoryTest32((Xuint32*)MEM_UNDER_TEST_BASE_ADDR, 65536, 
0xAAAA5555, XUT_ALLMEMTESTS); 
        if (status == XST_SUCCESS) 
        { 
           print("\r\n PASSED!\r\n"); 
        } 
        else 
        { 
           print("\r\n FAILED!\r\n"); 
        } 
        xil_printf("\r\n Running 16-bit MRAM test...\r\n"); 
        status = 
XUtil_MemoryTest16((Xuint16*)MEM_UNDER_TEST_BASE_ADDR, 131072, 
0xAA55, XUT_ALLMEMTESTS); 
        if (status == XST_SUCCESS) 
        { 
           xil_printf("\r\n PASSED!\r\n"); 
        } 
        else 
        { 
           xil_printf("\r\n FAILED!\r\n"); 
        } 
    
////////////////////////////////////////////////////////////////////
///////////////////////// 
  
  
#else 
    //if WDT interrupt is set in hardware as level then you will see 
this message until system reset, make it edge to enter ISR once 
    xil_printf("\r\n WDT ISR started....watch dog timer will not be 
reset...\r\n"); 
  
    for (j=0;j<100000;j++); 
  
  
#endif 
  
  
    //  return; 
 212 
 
  
} 
  
  
int main() 
{ 
  
    DATA.ACK_char[0] = 'A'; 
    DATA.ACK_char[1] = 'C'; 
    DATA.ACK_char[2] = 'K'; 
    DATA.ACK_char[3] = '1'; 
  
    u8 temp_u8; 
    int temp_int; 
  
  
    u8 reg_add; 
    //unsigned long i; 
  
    init_platform(); 
    XUartLite Com_Port; 
    XUartLite_Initialize  ( &Com_Port,  
XPAR_XPS_UARTLITE_3_DEVICE_ID); 
  
    //Initialize Mailbox 
    XMbox_CfgInitialize (&MB3_MB0, &CFG0, 
XPAR_MAILBOX_MB0_MB3_IF_1_BASEADDR); 
    XMbox_CfgInitialize (&MB3_MB1, &CFG1, 
XPAR_MAILBOX_MB1_MB3_IF_1_BASEADDR); 
    XMbox_CfgInitialize (&MB3_MB2, &CFG2, 
XPAR_MAILBOX_MB2_MB3_IF_1_BASEADDR); 
  
    //Initialize XGpio 
    XGpio_Initialize(&gpio, XPAR_XPS_GPIO_3_DEVICE_ID); 
    XGpio_SetDataDirection  (&gpio,1, 0x55ff5500); 
    WDI = 0x80000000; 
    XGpio_DiscreteWrite (&gpio,1,WDI); 
  
    //initialize IIC and MAX6656 
    XIic_Initialize(&IIC,XPAR_INTC_0_DEVICE_ID); 
    XIic_Start(&IIC); 
    // if the MAX6656 is not working then this will lead to stopping 
the system!! 
    //comment it to get the system to work... 
    initialize_MAX6656(); 
  
  
    xil_printf("\r\n WELCOME to MB3\r\n"); 
  
  
  
    //Initialize Watch Dog Timer 
  
    XWdtTb_Initialize  (&WDT, XPAR_XPS_TIMEBASE_WDT_3_DEVICE_ID  ); 
  
  
    //check if there was a previous reset... 
    if (XWdtTb_IsWdtExpired  (&WDT)) 
 213 
 
        xil_printf("\r\n MB3 was reset...\r\n"); 
    else 
        xil_printf("\r\n System is power recycled...\r\n"); 
  
    //xil_printf("\r\n MB3\t Time Step: %d\tStatus: OK FPGA 
Temp: %d\tSensor Temp: %d\r\n", Time, FPGA_temp, sensor_temp); 
  
    /* Enable MicroBlaze interrupts */ 
    microblaze_enable_interrupts(); 
  
    /* Register Watch Dog Timer interrupt handler */ 
    XIntc_RegisterHandler(XPAR_XPS_INTC_3_BASEADDR , 
XPAR_XPS_INTC_3_XPS_TIMEBASE_WDT_3_WDT_INTERRUPT_INTR, 
WDT_INT_Handler, (void*)XPAR_XPS_TIMEBASE_WDT_3_BASEADDR); 
  
    /* Start the interrupt controller */ 
    XIntc_MasterEnable (XPAR_XPS_INTC_3_BASEADDR); 
  
    // very important, if you remove it then interrupt will not be 
caught....enable interrupt for the WDT 
  
    XIntc_EnableIntr  
(XPAR_XPS_INTC_3_BASEADDR,XPAR_XPS_TIMEBASE_WDT_3_WDT_INTERRUPT_MASK
); 
  
  
  
    
////////////////////////////////////////////////////////////////////
//////////////////////////////////////// 
  
  
    //Start Watch Dog Timer 
  
    XWdtTb_Start  (&WDT); 
  
  
    for(;;) 
    { 
  
        //reading external temperature 
        //reading the unsigned byte part 
        if (flag == 1) 
        { 
            reg_add = (u8) RRTE; 
  
            XIic_DynSend(XPAR_IIC_0_BASEADDR, TEMP_SENSOR_ADDRESS, 
&reg_add, 1, XIIC_REPEATED_START); 
  
            XIic_DynRecv(XPAR_IIC_0_BASEADDR, TEMP_SENSOR_ADDRESS, 
&temp_u8, 1); 
  
            temp_int = (temp_u8 & 0x80)*(-1); 
  
            temp_int = temp_int + temp_u8; 
  
 214 
 
            //I disabled the fraction measurement part as it leads 
to stopping the loop 
  
  
            FPGA_temp = temp_int; 
  
#ifdef thermal_test 
            xil_printf("\r\n MB3\t Time Step: %d\tStatus: OK\tFPGA 
Temp: %d\t  WDI: %d\r\n", Time, FPGA_temp, WDI>>31); 
#endif 
  
#ifdef thermal_data 
            //to display the values read we have to switch on the 
system from the board I made... 
            xil_printf("\r\n%d\t%d\t%d", Time, FPGA_temp, WDI>>31); 
#endif 
            if (flag2 == 1) 
                                    { 
#ifdef thermal_test 
                xil_printf("\r\n ACK received from all processors 
\r\n"); 
#endif 
                flag2 = 0; 
                                    } 
  
  
            flag = 0; 
        } //if 
  
  
    } //for 
    cleanup_platform(); 
  
    return 0; 
} 
  
  
  
  
////////////////////////////////////////////////////////////////////
/// 
  
  
int initialize_MAX6656(void) 
{ 
  
    u8 tx_data[2]; //register address[0] and transmit data [1] 
    u8 rx_data; 
  
    //write Data to conversion rate register 
  
    tx_data[0] = WCRW; 
    tx_data[1] = 0x01;  //measurements every 0.5 sec 
  
    XIic_DynSend(XPAR_IIC_0_BASEADDR, TEMP_SENSOR_ADDRESS, tx_data, 
2, XIIC_REPEATED_START); 
  
 215 
 
    //readback to verify 
  
    //read Data from conversion rate register 
    tx_data[0] = RCRA; 
  
    XIic_DynSend(XPAR_IIC_0_BASEADDR,  TEMP_SENSOR_ADDRESS, tx_data, 
1, XIIC_REPEATED_START); 
  
    XIic_DynRecv(XPAR_IIC_0_BASEADDR, TEMP_SENSOR_ADDRESS, &rx_data , 
1); 
  
#ifdef print_MAX6656_messages 
    xil_printf("\r\n Value written to register address [%x] is 
[%x] ... value read from register address [%x] is [%x] \r\n", WCRW, 
tx_data[1], RCRA,rx_data); 
#endif 
    if (rx_data != tx_data[1]) 
    { 
#ifdef print_MAX6656_messages 
        xil_printf("\r\n Readback error: Value written to register 
address [%x] is different from value read \r\n", tx_data[0]); 
#endif 
        return(-3); 
    } 
  
  
    //write Data to internal ALERT high limit 
  
    tx_data[0] = WLHO; 
    tx_data[1] = 0x55;  //+80 C 
  
    XIic_DynSend(XPAR_IIC_0_BASEADDR, TEMP_SENSOR_ADDRESS, tx_data, 
2, XIIC_REPEATED_START); 
  
    //readback to verify 
  
    //read Data from internal ALERT high limit 
    tx_data[0] = RLHN; 
  
    XIic_DynSend(XPAR_IIC_0_BASEADDR,  TEMP_SENSOR_ADDRESS, tx_data, 
1, XIIC_REPEATED_START); 
  
    XIic_DynRecv(XPAR_IIC_0_BASEADDR, TEMP_SENSOR_ADDRESS, &rx_data , 
1); 
  
#ifdef print_MAX6656_messages 
    xil_printf("\r\n Value written to register address [%x] is 
[%x] ... value read from register address [%x] is [%x] \r\n", WLHO, 
tx_data[1], RLHN,rx_data); 
#endif 
    if (rx_data != tx_data[1]) 
    { 
#ifdef print_MAX6656_messages 
        xil_printf("\r\n Readback error: Value written to register 
address [%x] is different from value read \r\n", tx_data[0]); 
#endif 
        return(-4); 
    } 
  
 216 
 
  
    //write Data to internal ALERT low limit 
    tx_data[0] = WLLM; 
    tx_data[1] = 0xEC;  //-20 C 
  
    XIic_DynSend(XPAR_IIC_0_BASEADDR, TEMP_SENSOR_ADDRESS, tx_data, 
2, XIIC_REPEATED_START); 
  
    //readback to verify 
  
    //read Data from internal ALERT low limit 
    tx_data[0] = RLLI; 
  
    XIic_DynSend(XPAR_IIC_0_BASEADDR,  TEMP_SENSOR_ADDRESS, tx_data, 
1, XIIC_REPEATED_START); 
  
    XIic_DynRecv(XPAR_IIC_0_BASEADDR, TEMP_SENSOR_ADDRESS, &rx_data , 
1); 
  
#ifdef print_MAX6656_messages 
    xil_printf("\r\n Value written to register address [%x] is 
[%x] ... value read from register address [%x] is [%x] \r\n", WLLM, 
tx_data[1], RLLI,rx_data); 
#endif 
  
    if (rx_data != tx_data[1]) 
    { 
#ifdef print_MAX6656_messages 
        xil_printf("\r\n Readback error: Value written to register 
address [%x] is different from value read \r\n", tx_data[0]); 
#endif 
        return(-5); 
    } 
  
  
    //write Data to external ALERT high limit 
    tx_data[0] = WRHA; 
    tx_data[1] = 0x55;  //+80 C 
  
    XIic_DynSend(XPAR_IIC_0_BASEADDR, TEMP_SENSOR_ADDRESS, tx_data, 
2, XIIC_REPEATED_START); 
  
    //readback to verify 
  
    //read Data from external ALERT high limit 
    tx_data[0] = RRHI; 
  
    XIic_DynSend(XPAR_IIC_0_BASEADDR,  TEMP_SENSOR_ADDRESS, tx_data, 
1, XIIC_REPEATED_START); 
  
    XIic_DynRecv(XPAR_IIC_0_BASEADDR, TEMP_SENSOR_ADDRESS, &rx_data , 
1); 
  
#ifdef print_MAX6656_messages 
    xil_printf("\r\n Value written to register address [%x] is 
[%x] ... value read from register address [%x] is [%x] \r\n", WRHA, 
tx_data[1], RRHI,rx_data); 
#endif 
 217 
 
    if (rx_data != tx_data[1]) 
    { 
#ifdef print_MAX6656_messages 
        xil_printf("\r\n Readback error: Value written to register 
address [%x] is different from value read \r\n", tx_data[0]); 
#endif 
        return(-6); 
    } 
  
    //write Data to external ALERT low limit 
    tx_data[0] = WRLN; 
    tx_data[1] = 0xEC;  //-20 C 
  
    XIic_DynSend(XPAR_IIC_0_BASEADDR, TEMP_SENSOR_ADDRESS, tx_data, 
2, XIIC_REPEATED_START); 
  
    //readback to verify 
  
    //read Data from external ALERT low limit 
    tx_data[0] = RRLS; 
  
    XIic_DynSend(XPAR_IIC_0_BASEADDR,  TEMP_SENSOR_ADDRESS, tx_data, 
1, XIIC_REPEATED_START); 
  
    XIic_DynRecv(XPAR_IIC_0_BASEADDR, TEMP_SENSOR_ADDRESS, &rx_data , 
1); 
  
#ifdef print_MAX6656_messages 
    xil_printf("\r\n Value written to register address [%x] is 
[%x] ... value read from register address [%x] is [%x] \r\n", WRLN, 
tx_data[1], RRLS,rx_data); 
#endif 
    if (rx_data != tx_data[1]) 
    { 
#ifdef print_MAX6656_messages 
        xil_printf("\r\n Readback error: Value written to register 
address [%x] is different from value read \r\n", tx_data[0]); 
#endif 
        return(-7); 
    } 
  
  
    //write Data to internal OVERT limit 
    tx_data[0] = WLOL; 
    tx_data[1] = 0x64;  //+100 C 
  
    XIic_DynSend(XPAR_IIC_0_BASEADDR, TEMP_SENSOR_ADDRESS, tx_data, 
2, XIIC_REPEATED_START); 
  
    //readback to verify 
  
    //read Data from internal OVERT limit 
    tx_data[0] = RLOL; 
  
    XIic_DynSend(XPAR_IIC_0_BASEADDR,  TEMP_SENSOR_ADDRESS, tx_data, 
1, XIIC_REPEATED_START); 
  
    XIic_DynRecv(XPAR_IIC_0_BASEADDR, TEMP_SENSOR_ADDRESS, &rx_data , 
1); 
 218 
 
  
#ifdef print_MAX6656_messages 
    xil_printf("\r\n Value written to register address [%x] is 
[%x] ... value read from register address [%x] is [%x] \r\n", WLOL, 
tx_data[1], RLOL,rx_data); 
#endif 
  
    if (rx_data != tx_data[1]) 
    { 
#ifdef print_MAX6656_messages 
        xil_printf("\r\n Readback error: Value written to register 
address [%x] is different from value read \r\n", tx_data[0]); 
#endif 
        return(-8); 
    } 
  
    //write Data to external OVERT limit 
    tx_data[0] = WROL1; 
    tx_data[1] = 0x64;  //+100 C 
  
    XIic_DynSend(XPAR_IIC_0_BASEADDR, TEMP_SENSOR_ADDRESS, tx_data, 
2, XIIC_REPEATED_START); 
  
    //readback to verify 
  
    //read Data from external OVERT limit 
    tx_data[0] = RLOL1; 
  
    XIic_DynSend(XPAR_IIC_0_BASEADDR,  TEMP_SENSOR_ADDRESS, tx_data, 
1, XIIC_REPEATED_START); 
  
    XIic_DynRecv(XPAR_IIC_0_BASEADDR, TEMP_SENSOR_ADDRESS, &rx_data , 
1); 
  
#ifdef print_MAX6656_messages 
    xil_printf("\r\n Value written to register address [%x] is 
[%x] ... value read from register address [%x] is [%x] \r\n", WROL1, 
tx_data[1], RLOL1,rx_data); 
#endif 
  
    if (rx_data != tx_data[1]) 
    { 
#ifdef print_MAX6656_messages 
        xil_printf("\r\n Readback error: Value written to register 
address [%x] is different from value read \r\n", tx_data[0]); 
#endif 
        return(-9); 
    } 
  
    xil_printf("\r\n CONFIGURATION ENDED SUCCESSFULLY \r\n"); 
    return 1; 
} 
  
  
  
  
  
 
 219 
 
D.5  Temperature IC Addresses 
/* 
 * MAX6656.h 
 * 
 *  Created on: Oct 16, 2013 
 *      Author: eslab 
 */ 
  
#ifndef MAX6656_H_ 
#define MAX6656_H_ 
  
  
#endif /* MAX6656_H_ */ 
  
/* 
 * MAX6656.h 
 * 
 *  Created on: Sep 6, 2013 
 *      Author: eslab 
 */ 
  
#ifndef MAX6656_H_ 
#define MAX6656_H_ 
  
  
#endif /* MAX6656_H_ */ 
  
#define RLTS 0x00 //0000 0000 Read Internal Temperature 
#define RRTE 0x01 //0000 0000 Read External Temperature 1 
#define RSL 0x02 //0000 0000 Read Status Byte; Note 1 
#define RCL 0x03 //0000 0000 Read Configuration Byte 
#define RCRA 0x04 //0000 0010 Read Conversion Rate Byte 
#define RLHN 0x05 //0111 1111 Read Internal ALERT High Limit 
#define RLLI 0x06 //1100 1001 Read Internal ALERT Low Limit 
#define RRHI 0x07 //0111 1111 Read External Temperature 1 ALERT High 
Limit 
#define RRLS 0x08 //1100 1001 Read External Temperature 1 ALERT Low 
Limit 
#define WCA 0x09 //N/A Write Configuration Byte 
#define WCRW 0x0A //N/A Write Conversion Rate Control Byte 
#define WLHO 0x0B //N/A Write Internal ALERT High Limit 
#define WLLM 0x0C //N/A Write Internal ALERT Low Limit 
#define WRHA 0x0D //N/A Write External Temperature 1 ALERT High 
Limit 
#define WRLN 0x0E //N/A Write External Temperature 1 ALERT Low Limit 
#define RRET1 0x10 //0000 0000 Read External 1 Extended Temperature 
#define RRET2 0x11 //0000 0000 Read External 2 Extended Temperature 
#define RLET 0x12 //0000 0000 Read Internal Extended Temperature 
#define RRT2 0x13 //0000 0000 Read External Temperature 2 
#define RRHL2 0x14 //0111 1111 Read External Temperature 2 ALERT 
High Limit 
#define RRLL2 0x15 //1100 1001 Read External Temperature 2 ALERT Low 
Limit 
#define RLOL 0x16 //0111 1111 Read Internal OVERT Limit 
#define RLOL1 0x17 //0111 1111 Read External 1 OVERT Limit 
#define RLOL2 0x18 //0111 1111 Read External 2 OVERT Limit 
#define WLOL 0x19 //N/A Write Internal OVERT Limit 
#define WROL1 0x1A //N/A Write External 1 OVERT Limit 
#define WROL2 0x1B //N/A Write External 2 OVERT Limit 
 220 
 
#define WRH2 0x1C //N/A Write External Temperature 2 ALERT High 
Limit 
#define WRL2 0x1D //N/A Write External Temperature 2 ALERT Low Limit 
#define WV0HL 0x1E //N/A Write VCC (VIN2) ALERT High Limit for 
MAX6655 (MAX6656) 
#define WV0LL 0x1F //N/A Write VCC (VIN2) ALERT Low Limit for 
MAX6655 (MAX6656) 
#define WV1HL 0x20 //N/A Write VIN1 ALERT High Limit 
#define WV1LL 0x21 //N/A Write VIN1 ALERT Low Limit 
#define WV2HL 0x22 //N/A Write VIN2 (VCC) ALERT High Limit for 
MAX6655 (MAX6656) 
#define WV2LL 0x23 //N/A Write VIN2 (VCC) ALERT Low Limit for 
MAX6655 (MAX6656) 
#define WV3HL 0x24 //N/A Write VIN3 ALERT High Limit 
#define WV3LL 0x25 //N/A Write VIN3 ALERT Low Limit 
#define RV0HL 0x26 //1101 0011 Read VCC (VIN2) ALERT High Limit for 
MAX6655 (MAX6656) 
#define RV0LL 0x27 //1010 1101 Read VCC (VIN2) ALERT Low Limit for 
MAX6655 (MAX6656) 
#define RV1HL 0x28 //1101 0011 Read VIN1 ALERT High Limit 
#define RV1LL 0x29 //1010 1101 Read VIN1 ALERT Low Limit 
#define RV2HL 0x2A //1101 0011 Read VIN2 (VCC) ALERT High Limit for 
MAX6655 (MAX6656) 
#define RV2LL 0x2B //1010 1101 Read VIN2 (VCC) ALERT Low Limit for 
MAX6655 (MAX6656) 
#define RV3HL 0x2C //1101 0011 Read VIN3 ALERT High Limit 
#define RV3LL 0x2D //1010 1101 Read VIN3 ALERT Low Limit 
#define RV0 0x2E //0000 0000 Read VCC (VIN2) for MAX6655 (MAX6656) 
#define RV1 0x2F //0000 0000 Read VIN1 
#define RV2 0x30 //0000 0000 Read VIN2 (VCC) for MAX6655 (MAX6656) 
#define RV3 0x31 //0000 0000 Read VIN3 
#define RSL2 0x32 //0000 0000 Read Status Byte 2 
#define RCL2 0x33 //0000 0000 Read Configuration Byte 2 
#define WCA2 0x34 //N/A Write Configuration Byte 2 
#define RDID 0xFE //0000 1010 Read Device ID 
#define RDRV 0xFF //0100 1101 Read Manufacture ID 
 
 
  
 221 
 
Appendix E   
Frame Addresses Generation in the Virtex-5 FPGA 
 
%this file is used to query all the configuration frame addresses of 
the  
%Virtex5Lx50. It depends on issuing the (q)command to the embedded 
SEU  
%controller macro for each of the(21d6)frames and receiving the  
%corresponding 6 digit hex address. The linear addresses together 
with the  
%converted addresses are stored in a file for later retrieval by the 
random 
%fault injector. 
  
%file for storing the addresses after conversion 
  
fid = fopen('C:\Users\mohamed 
mahmoud\Documents\MATLAB\SEU_MODEL\frame_add.txt', 'w'); 
  
%Serial Port Settings 
s1 = serial('COM2'); 
  
%Serial Port specific properties: 
s1.BaudRate = 115200; 
s1.DataBits = 8; 
s1.DataTerminalReady = 'off'; 
s1.FlowControl = 'none'; 
s1.Parity = 'none'; 
s1.ReadAsyncMode = 'continuous'; 
s1.StopBits = 1; 
s1.Terminator = 'CR'; 
  
% Serial Port (additional) Properties 
s1.ByteOrder = 'littleEndian'; 
%s1.BytesAvailableFcn = {'dispcallback'};  
%s1.BytesAvailableFcnCount = 16; 
%s1.BytesAvailableFcnMode = 'bytes'; 
%s1.ErrorFcn =  
s1.InputBufferSize = 512; 
s1.ObjectVisibility = 'on'; 
s1.OutputBufferSize = 512; 
%s1.OutputEmptyFcn =  
%s1.RecordDetail = 'compact'; 
%s1.RecordMode = 'overwrite'; 
%s1.RecordName = 'record.txt'; 
%s1.RecordStatus = 'off'; 
  
s1.Timeout = 10; 
%s1.TimerFcn =  
%s1.TimerPeriod = 1; 
  
%response to * command when sent for second time 
res_ast_2 = [42;13;63;13;13;62;13;13;63;13;13;62]; 
  
total_no_of_frames = 8663; %21D7h 
  
 222 
 
%frame numbers vector 
Frame_No_Hex_Converter = zeros(total_no_of_frames-1,1); 
  
fopen(s1); 
%normally input and output buffers are flushed after fopen, flushing 
here 
%is for more robustness. 
  
flushinput(s1); 
flushoutput(s1); 
  
%Set the system in UART control mode  
  
fprintf(s1,'*'); 
pause(0.5); 
flushinput(s1); 
  
fprintf(s1,'*'); 
pause(0.5); 
len = s1.bytesavailable; 
input_buffer = fread(s1,len); 
  
count=0; 
for i = 1:len 
  
    if(input_buffer(i) == res_ast_2(i)) 
        count = count +1; 
    end 
end 
  
if(count == 12)  
    fprintf('response received for Asterik command\r\n'); 
else 
    fprintf('Error - incorrect response to Asterik command\r\n'); 
    exit(); 
end 
  
%flush input buffer for receiving new data, it should be clean even 
without 
%it. 
flushinput(s1); 
flushoutput(s1); 
  
%query of frame numbers 
%for j = 2602:total_no_of_frames-1, I made like that because it 
stopped 
%working in the middle of operation at a20 hex, which is 2601 
decimal so I 
%decided to resume from the next frame in order not to waste time.  
  
for j = 1:total_no_of_frames-1 
    %convert the frame index into hex then string then concatenate 
it to 
    %the q command then send it to the SEU controller 
    frame_no_hex = dec2hex(j,4); 
    frame_no_str = num2str(frame_no_hex,'%x');  
     
    %same as the above two lines  
    %frame_no_str = num2str(frame_no_hex,'%.4x') 
 223 
 
     
    q_command = strcat('q',frame_no_str); 
     
    %flush output buffer for safety, it should be clean even without 
it. 
    flushoutput(s1); 
    fprintf(s1, q_command);  
    pause(0.5); 
    %read response to issuing the q-command 
    q_command_response = fread(s1,s1.bytesavailable); 
     
    %read the address and convert it into characters, hex string 
    frame_add_hex = sscanf(char(q_command_response(21:26)),'%x'); 
     
    %store it in the vector that will be indexed later to retrive 
the 
    %converted frame addresses. 
    Frame_No_Hex_Converter(j) =  frame_add_hex; 
     
    %write data in the text file 
    %important to keep the format specifiers for the width of the 
printed 
    %word to be 4 and 6 as this is needed by the controller, 6 
digits of 
    %hex values corresponding to 24 bits address. 
    fprintf(fid,'%.4x\t%.6x\r\n',j,Frame_No_Hex_Converter(j)); 
    fprintf('%.4x\t%.6x\r\n',j,Frame_No_Hex_Converter(j)); 
    flushinput(s1); 
end 
  
  
  
%close file 
fclose(fid); 
%close, delete, clear object  
fclose(s1); 
delete(s1); 
clear s1; 
  
 
 
  
 224 
 
Appendix F   
Sample Code - Fault Injection of Single Bit Upsets 
 
%random seed 
  
RandStream.setDefaultStream(RandStream('mt19937ar','seed',sum(100*cl
ock))); 
  
%file to read the addresses of FPGA frames. 
  
fid = fopen('C:\Users\mohamed 
mahmoud\Documents\MATLAB\SEU_MODEL\frame_add.txt','r'); 
  
%constants 
  
no_of_SEU = 100; 
test_cycles = 10; 
total_no_of_frames = 8663;%this is 21D7 in hex, but we can generate 
from 1 to 21D6 as in the SEU controller data. frame 0 is unused. 
total_no_of_bits = 1312;%we can generate from 0 up to 1311 this is 
in Dec, in hex it is from 0 till 51F. 
  
%response to * command when sent for second time 
res_ast_2 = [42;13;63;13;13;62;13;13;63;13;13;62]; 
  
%response to 'd' command  
d_command_res = [100;13;68;79;77;13;13;62;13;13;63;13;13;62]; 
  
%read the farmes addresses from the addresses file 
  
frames_addresses = fscanf(fid,'%4c%*c%6c\r\n',[10 8662]);%8663-1 = 
8662 frames. frame numbers start from 1 till 21D6h.reading is  
%in column order that is why we swap columns and rows indeices then 
we 
%transpoase the resulting array 
frames_addresses = frames_addresses'; 
%when accessing the farmes_addresses array to get the farme hex 
address we 
%can have to index it with the frame linear address then obtain the 
hex 
%address from column 5 till 10, i.e. frame_addresses(8662, 5:10) = 
710100 
  
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%% 
  
  
%1- start serial communication to put the SEU controller in the uart 
mode 
%and start commanding it 
  
%Serial Port Settings 
%serial port can be different than com2 
s1 = serial('COM1'); 
  
 225 
 
%Serial Port specific properties: 
s1.BaudRate = 115200; 
s1.DataBits = 8; 
s1.DataTerminalReady = 'off'; 
s1.FlowControl = 'none'; 
s1.Parity = 'none'; 
s1.ReadAsyncMode = 'continuous'; 
s1.StopBits = 1; 
s1.Terminator = 'CR'; 
  
% Serial Port (additional) Properties 
s1.ByteOrder = 'littleEndian'; 
%s1.BytesAvailableFcn = {'dispcallback'};  
%s1.BytesAvailableFcnCount = 16; 
%s1.BytesAvailableFcnMode = 'bytes'; 
%s1.ErrorFcn =  
s1.InputBufferSize = 512; 
s1.ObjectVisibility = 'on'; 
s1.OutputBufferSize = 512; 
%s1.OutputEmptyFcn =  
%s1.RecordDetail = 'compact'; 
%s1.RecordMode = 'overwrite'; 
%s1.RecordName = 'record.txt'; 
%s1.RecordStatus = 'off'; 
  
s1.Timeout = 10; 
%s1.TimerFcn =  
%s1.TimerPeriod = 1; 
     
  
fopen(s1); 
  
flushinput(s1); 
flushoutput(s1); 
  
%reading functions are blocking the Matlab command line.  
%fread for binary read. the read operation ends when  
% timeout occurs or teh specificed number of values is read (a vlaue 
might 
%consist of more than one byte)or the buffer is filled. 
%fscanf read data and format it as character. it blocks the command 
line 
%until one of the following conditions takes place (terminator is 
received,  
%input buffer is filled, time out reached, specified number of 
values is read) 
%scantext  
%fgetl read one line of text data without terminator. read operation 
blocks 
%the Matlab command until one of teh following conditions takes 
place: the 
%terminator is read, timeout is reached, the buffer is filled. 
%fgets read one line of text data with terminator. same conditions 
as fgetl 
%to stop the blocking by the reading process. 
  
%in the buffer full call back function: query the number of bytes 
available 
%and then read them if they are more than 0. 
  
 226 
 
  
%writing text data through fprintf, by default the stream will be 
followed 
%by the terminator. also '\n' will be replaced with the terminator 
%character. The binary write can be realized by using the fwrite 
function. 
%The write operation finsihes when the all the data is written or 
the timer  
%timeout condition is reached. 
  
%in the empty call back function: query the number of bytes to 
output and  
%then write them if more they are more than 0. 
  
for test_loop = 1: test_cycles 
     
     
    fprintf('Test Loop No. = %d\r\n',test_loop); 
  
     
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%% 
%just in case of reprogamming the * should be issued again other 
wise the 
%SEU will not respond. 
  
%normally input and output buffers are flushed after fopen, flushing 
here 
%is for more robustness. 
  
flushinput(s1); 
flushoutput(s1); 
  
%Set the system in UART control mode  
  
fprintf(s1,'*'); 
pause(0.5); 
flushinput(s1); 
  
fprintf(s1,'*'); 
pause(0.5); 
  
  
len = s1.bytesavailable; 
input_buffer = fread(s1,len); 
  
count=0; 
for i = 1:len 
  
    if(input_buffer(i) == res_ast_2(i)) 
        count = count +1; 
    end 
end 
  
if(count == 12)  
    fprintf('response received for Asterik command, SEU controller 
in UART control mode\r\n'); 
else 
 227 
 
    fprintf('Error - incorrect response to Asterik command\r\n'); 
    exit(); 
end 
  
  
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%% 
     
  
  
  
%flush input buffer for receiving new data, it should be clean even 
without 
%FLUSHING but we do it for safety. 
flushinput(s1); 
flushoutput(s1); 
     
  %Always work in detection mode and not auto correction mode to 
    %accumulate errors... 
    %to work in auto correction mode then this block down to the 
marker 
    %should be commented  
     
    %first we send the 'd' command to put the system in detection 
only mode 
    %in order to test the effects of error accumulation.  
  
fprintf(s1,'d'); 
pause(1); 
  
%test the detection only mode response 
len = s1.bytesavailable; 
input_buffer = fread(s1,len); 
  
count=0; 
for i = 1:len 
  
    if(input_buffer(i) == d_command_res(i)) 
        count = count +1; 
    end 
end 
  
if(count == 14)  
    fprintf('response received for detection_mode command\r\n'); 
else 
    fprintf('Error - incorrect response for detection_mode 
command\r\n'); 
    exit(); 
end 
  
  
flushinput(s1); 
flushoutput(s1); 
  
%marker of autodetection mode 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
     
 228 
 
    %Matlab said use it once, but I use it at each test cycle to 
generate 
    %as much as possible random values.. 
  
    
RandStream.setDefaultStream(RandStream('mt19937ar','seed',sum(100*cl
ock))); 
     
     
    %Fault Injection 
  
%Generation of random locations 
  
%2- start by generating random values for the frame numbers and bit 
numbres 
%number of frames = (21D7)H, (8663)D, No of bits in each frame = 
(41words x 32 bits = 1312bit) (520)h 
  
loop_flag = 1; 
  
while loop_flag %enable this loop to make sure that all generated 
errors are SBE only without MBE. 
     
SBE_count = 0; 
MBE_count = 0; 
  
Frame_No = unidrnd(total_no_of_frames-1,1,no_of_SEU); 
  
[value F H]=unique(Frame_No); 
rep=diff(find(diff([-Inf sort(H) Inf]))); 
  
%Test if Frame_No contains repeated bits or not.. if it contains 
repeated bits 
%then this means we will have to reset the FPGA at the end of this 
%simulation run due to having MBE (more than one SBE in the same 
frame) 
  
len =  length(unique(Frame_No)); 
  
if ( len < no_of_SEU) 
    disp('The Random Frame Numbers contain a repetition, MBE is 
expected, Reset is needed at end of simulation'); 
    disp('repeating random number generation to get only SBEs'); 
         
    for i = 1:size(rep,2) 
    
        if(rep(1,i) > 1) 
            fprintf('Frame Value = %d  Numbers of Repetition = %d\n', 
value(1,i), rep(1,i)); 
            MBE_count = MBE_count + 1; 
        else 
         
            SBE_count = SBE_count + 1; 
        end 
    end 
  
 229 
 
    fprintf('Total number of error injected Frames = %d\n Number of 
frames having SBE = %d\n Number of Frames having MBE = %d\n', len, 
SBE_count, MBE_count);     
  
     
else 
    disp('The Random Frame Numbers contain no repetition, SBE is 
expected'); 
end 
  
%indicate which frames have repetitions, if any. (unique) return a 
vector 
%without repetitions, (diff) calculate diff between successive 
elements, 
%(find)gets the indices and values of non-zero elements, (all) 
determines 
%if all elements of array are non-zero ot true. 
  
if MBE_count > 0 
    loop_flag = 1; 
else 
    loop_flag = 0; 
end  
  
end %while loop  
  
% plotting of random generated farmes histogram 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
% colormap(hsv); 
%  
% x = 1:total_no_of_frames; 
% hist(Frame_No,x); 
%  
% figure; 
% scatter(value,rep,10,rep); 
%  
% [freq_count, bin_loc] = hist(Frame_No,x); 
% figure; 
% bar(bin_loc, freq_count); 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
%consider plotting a bar plot that represents the repetition in 
groups of 
%frames, to show number of 1 bit error trials, 2 bit error trials... 
etc 
%also use a color map 
  
  
%3- generate the random bit to be inverted for each frame occurance 
in the frames addresses vector. 
Bit_No = unidrnd(total_no_of_bits-1,1,no_of_SEU); 
  
%4- actual Injection 
  
for i = 1 : no_of_SEU 
  
    %4.1 form strings to be sent to the port "t frame address 
bit_no"  
     
 230 
 
t_command = 
strcat('t',frames_addresses(Frame_No(i),5:10),num2str(Bit_No(i),'%.3
x')); 
fprintf('Simulation Step = %d  Command = %s\r\n', i, t_command); 
  
  
%4.2 writing the toggle command 
fprintf(s1,t_command); 
%random delay between events in the interval 0.1 till 1 sec. this is 
a 
%uniform random distribution should be poisson... 
random_t = 0.1 + (1-0.1)*rand(1); 
%pause(1); 
pause(random_t); 
flushoutput(s1); 
  
end 
  
%flush the input buffer 
%issue the status command 
%check mode, total no of SBE that have been corrected since 
initialize, 
%number of frames errors during last scan, CRC error flag, 24 bit 
count of 
%completed scans, record all this to a file. 
  
%ask if system is still working? 
  
%issue acm mode command 
  
%issue status command 
  
%analyze status data and notice the difference 
  
%ask run simulation again? 
  
  
  
fprintf('press a button before starting ACM \r\n'); 
  
    h = waitforbuttonpress; 
     
flushinput(s1); 
flushoutput(s1); 
%fprintf(s1,'s'); 
%pause(1); 
fprintf(s1,'a'); 
pause(1); 
%6- analyze port response after each insertion and the total system 
%response at the end. how many cores stopped? did system stop or 
not? 
  
%redo simulation  
  
%7- calculate SEU Rate versus .. System Failure Rate and draw the 
curve of 
%system failure rate versus SEU rate 
  
 231 
 
  
    fprintf('press a button when ready to start the next cycle, you 
might need to reset the FPGA \r\n'); 
  
    w = waitforbuttonpress; 
         
  
  
end %test loop 
%close, delete, clear object 
fclose(fid); 
fclose(s1); 
delete(s1); 
clear s1; 
  
  
  
 
  
 232 
 
Appendix G   
Iterations effect on the Cost, Schedule and Reliability of 
Nano-Satellites Projects  
G.1  Introduction 
In the process of developing complex engineering projects, which consist of many 
cross-dependent processes, iterations might take place as the work progresses. The 
project complexity is mainly associated with two dimensions: the number of 
processes and the effort level entitled within the project phases. A third dimension 
that adds to the project complexity is the level of dependency among the different 
processes. Dependency can be classified into control and data. Control dependency 
takes place when a specific process controls the initiation and termination of another 
process. Data dependency is a form of exchanging information between an 
information-generating process and an information-consuming process at certain 
point(s) in the execution flow of both processes. In modern engineering systems, 
complexity is increasing tremendously with the increment in the number of 
diversified modules used in building larger systems. The processes used in 
developing, qualifying and integrating these modules usually encompass multitude of 
control and data interdependencies among them, which means that high level of 
complexity would exist. 
We introduce a typical nano-satellite project as a case study of a complex 
engineering project with many control and data interdependencies among its 
processes. The case study is used to develop an estimation procedure for the three 
crucial parameters in any project: The schedule, the cost, and the expected product 
reliability. Currently, nano-satellite development activities spread all over the world. 
Unlike traditional large/medium satellites, nano-satellites try to achieve low-cost and 
fast-delivery with less emphasis on the reliability. However, this does not mean that 
the reliability can be ignored for nano-satellites. As long as nano-satellites are 
utilized for commercial purpose, a certain amount of reliability must be secured. It is 
a very important and difficult problem as how to balance the low-cost and fast-
delivery requirements of nano-satellite projects while securing accepted reliability. 
Discussion on such a topic is often based on individuals experiences gained through 
 233 
 
their involvement in satellite development projects. This is due to the fact that the 
number of satellites, especially nano-satellites, produced so far, is few when 
compared to other industrial products, such as automobiles, airplanes, consumer 
electronics. Therefore, quantitative discussion based on the field data is extremely 
difficult.  
The purpose of this study is to stimulate the discussion on nano-satellite cost, 
schedule and reliability estimations by providing the basis of quantitative discussion. 
We run Monte-Carlo simulations to estimate the probabilistic values of the cost and 
schedule of the processes which are found in a real nano-satellite project. The 
reliability of the satellite is assumed to be affected by the number of iterations 
performed. We estimate the reliability as a function in processes rework. In this way, 
we can evaluate cost, schedule and reliability of many hypothetic projects.  
Another purpose is to develop a complexity management tool for satellite projects, 
not only for nano-satellites but also for larger classes as well. The motivation behind 
this work is driven by the fact that the satellite developers want to predict, 
beforehand, a reasonable estimate of the project budget, duration and product quality. 
The more accurate the prediction is, the more feasible the project control would be. 
Our proposed procedure helps in facilitating that prediction as well as providing a 
better chance for informative decision making. It enables the project manager to 
decide when to cut certain loops of iterations and estimate the risk on the total cost, 
schedule and reliability. 
In this study, the project processes are represented using the Dependency Structure 
Matrix (DSM), a technique used for representing interdependent processes in 
complex projects. It facilitates visualizing the interdependency among processes and 
the modeling of project cost and schedule. This work builds upon the case study and 
technique that was adopted in [1] as it presented an Unmanned Aerial Vehicle 
(UAV) manufacturing project and calculated the cost and schedule estimates based 
on Monte-Carlo simulations. In this work we extend the problem studied in [1] by 
covering nano-satellite projects as a case study. We propose a new procedure to 
study the effect of the project processes on the reliability through defining the 
concept of Process Importance Factor (PIF). Also we handle the calculations of the 
 234 
 
expected reliability based on an empirical function in the number of processes 
iterations. 
The study is organized as follows: Section 2 reviews the background of the DSM 
concept. Section 3 handles the nano-satellite case study through defining the concept 
and typical architecture of a nano-satellite and illustrating the phases of a nano-
satellite development project followed by the DSM representation of the project 
processes. Section 4 shows the simulation algorithm details. Section 5 presents the 
results and the discussion followed by the study conclusion in section 6. 
G.2  Dependency Structure Matrix Background 
The DSM is used to represent the work flow among the project processes. 
Researchers expanded its use in modeling complex systems and handling complexity 
management. It was applied in many industries and services such as automotive, 
electronics, construction, and marketing [2][3]. The DSM has three application 
domains: processes interactions, product architectures, and organization structures. It 
is possible to represent a combination of these domains in a multi-domain DSM 
where the relations among the teams, products and processes can be illustrated in the 
same matrix.  
The basic idea of the DSM depends on representing the project processes in the form 
of an N2 matrix. The matrix consists of rows and columns representing the project 
processes according to their sequence of occurrence. As shown in figure 1, a sample 
project consisting of 6 tasks is represented in its DSM form. The activities listed in 
the rows represent the processes of the project which receive inputs from other 
processes. The activities listed in the columns represent the processes of the project 
which give outputs to other processes. Reading along a row tells the sources of inputs 
to that specific row process. While reading along a column tells the processes to 
which the output of the current column process will be directed to. The feedback 
from any current process, such as process 5 in figure 1, to an earlier process, such as 
process 4, is represented with a cross mark at the location above the intersection of 
the current process row and column in the matrix. This form of the DSM is called 
Inputs in Rows\Feedback Above the Diagonal (IR\FAD). Another form of the DSM 
is the Inputs In Columns with Feedback Below the Diagonal (IC\FBD). The DSM 
can be analyzed using several algorithms such as clustering, tearing, banding and 
  
partitionin
algorithm.
of executi
The data 
input, is r
valued for
absence o
represent d
 
Figur
In sequent
identify th
waits for a
predecesso
number 2
finishes ex
processes 
forward i
number 5
process nu
process nu
This paral
resources 
such as h
simulation
g. In the 
 The proce
on which d
exchanged,
epresented 
m. The bi
f informati
ifferent log
e 1, DSM r
ial process
e possibili
n input fro
r. For exam
. This mea
ecution to 
cannot run 
nput from 
, we would
mber 4 tak
mber 3 wo
lel executio
among the 
uman reso
s we consi
case of pro
sses listed 
enotes the c
 at the int
by the cros
nary DSM
on flow. In
ic notions.
epresenting
rows while
es simulati
ties of hav
m its prede
ple, in fig
ns that pro
provide the
in parallel. 
process nu
 discover 
es no input
uld be exe
n is limited
parallel run
urces, fina
der the reso
ject proce
in the DSM
ontrol flow
erface from
s-marks. T
 contains 0
 a multiva
  
 
 a sample p
 feedback is
on using th
ing process
cessor pro
ure 1, proc
cess numbe
 needed inf
If we consi
mber 2 an
that it is i
s from proc
cuted, proc
 by the ab
ning proce
ncial resou
urces as be
T1
T1
T2 X
T3
T4
T5 X
T6
sses repres
 are ordere
 of the proc
 one proc
he DSM ca
s and 1s w
lued DSM 
roject cons
 above the 
 
e DSM, the
es that can
cess then it
ess number
r 3 should
ormation to
der process
d a feed-b
ndependent
ess numbe
ess number
ility of the 
sses. Resou
rces, and 
ing unlimit
T2 T3
X
X
X
X
entation, w
d accordin
esses exec
ess output 
n be of bin
hich deno
several val
isting of 6 
diagonal. 
 banding a
 run in pa
 cannot run
 3 takes an
 wait until
 process nu
 number 4 
ackward i
 from proc
r 3 then at 
 4 can start
system to d
rces can be
equipment.
ed therefor
T4 T5
X
X
e use the 
g to their s
ution in the
to another 
ary form o
te the pres
ues can be
tasks, input
lgorithm is
rallel. If a
 in parallel
 input from
 process nu
mber 3. Th
which take
nput from 
ess numbe
the time ste
 execution 
ispatch the
 of differe
 For math
e parallel p
235 
T6
banding 
equence 
 project. 
process 
r multi-
ence or 
 used to 
s are in 
 used to 
 process 
 with its 
 process 
mber 2 
ese two 
s a feed-
process 
r 3. As 
p where 
as well. 
 needed 
nt forms 
ematical 
rocesses 
  
can start e
by the con
of the mat
are locate
step after 
and proce
running. I
process, s
postpone 
means tha
place then
finishes. 
 
G.3  Na
In the sim
case study
50cm) vo
satellites. 
expectatio
typical m
subsystem
resolution
xecution to
strained nu
rix shown i
d in the sam
finishing pr
ss 3 shoul
f an earlier
uch as pro
its executio
t it might o
 the earli
Figure 2,
no-satell
ulations th
 of a typic
lume. Usua
Typically th
n of failur
issions for
s and\or tec
s, store and
gether and 
mber of av
n Figure 1
e band as
ocess 3. H
d wait for
 process, s
cess 3, the
n until the
r might not 
er process 
 applying th
ite Proje
at we run t
al nano-sa
lly nano-sa
ey have no
e that is m
 nano-sate
hnologies,
 forward sy
at once. In
ailable reso
into bands 
 they can s
owever, pro
 process 2 
uch as pro
n it does n
 feedback i
lead to rew
will be re
e banding 
ct Case S
o present t
tellite. It is
tellites are
 redundant
uch higher
llites inclu
 earth obser
stems, and 
 reality this
urces. Figu
of parallel p
tart runnin
cess 2 shou
to finish 
cess 1, rec
ot mean t
s ready. Fe
ork of the 
-executed 
algorithm t
tudy 
he results i
 a 50 kg s
 characteriz
 subsystem
 than prof
de techno
vation with
scientific m
 assumptio
re 2, show
rocesses. P
g in parall
ld wait for
before any
eives a fee
hat the ear
edback is 
earlier proc
again after
o the DSM 
n this study
atellite in a
ed as low 
s as they ar
essional la
logy demo
 moderate 
issions.  
n should be
s the catego
rocesses 4,
el at the sa
 process 1 
 of them c
dback from
lier proces
probabilisti
ess. If rewo
 the later 
 
matrix. 
 we used 
 (50cm ×	
cost and h
e produced
rger satelli
nstration f
spectral an
236 
 limited 
rization 
 5 and 6 
me time 
to finish 
an start 
 a later 
s has to 
c which 
rk takes 
process 
a simple 
50cm × 
igh risk 
 with an 
tes. The 
or new 
d spatial 
 237 
 
Figure 3, shows the architecture of a generic non-redundant nano-satellite 
configuration. The On-Board Computer (OBC) controls the switch on/off of 
subsystems and change of operation modes. It also sends the received command 
stream to the relevant subsystem after decoding it. The telemetry information is 
collected by the Telemetry Module (TM) and transferred to the OBC for encoding 
and further transfer to the ground station via the Communication Subsystem (CSS). 
The CSS is the interface between the satellite bus and the ground station and it 
receives the encoded command stream that is forwarded to the OBC for decoding 
then executing or distributing it to the relevant subsystems. The Power Subsystem 
(PSS) is responsible for controlling the charging of the batteries through the solar 
panels and distributing the power over the power bus to the satellite subsystems. 
Power can be distributed in regulated or non-regulated form. The Attitude 
Determination and Control Subsystem (ADCS) is responsible for detecting the 
current satellite orientation and controlling it to reach a target orientation and 
stabilize within certain angular velocities ranges as required by the mission. It needs 
to be updated with the timing information from the navigation subsystem such as the 
Global Positioning System receiver (GPS) for the attitude prediction algorithms to 
work properly. The GPS also provides time scale corrections to the OBC. 
 
 
Figure3, Simple Nano-Satellite Architecture. 
 
a) Project Life Cycle 
Building a satellite follows a lifecycle that is defined by the project plan. The 
lifecycle is divided into group of activities which are carried out along the project 
phases. The activities define the inputs, outputs and processes which take place 
Data Bus
Control Bus
Time Marker
 238 
 
within the phases. Figure 4, shows the main phases used in developing a nano-
satellite project. The technical requirements and specification phase starts with a 
general statement of the required mission. In that phase, detailed technical 
requirements and specifications on the system and subsystem levels are specified. 
The Structure and Thermal Model (STM) phase handles the activities related to 
producing the nano-satellite model which is used in vibration and thermal vacuum 
tests. The Qualification Test Model (QTM) phase handles the activities related to 
producing the functional nano-satellite subsystems and performing the integrated 
environmental tests. The Flight Model (FM) phase handles the production of the 
flight approved model of the nano-satellite and performing the end-to-end (E2E) 
communication tests. In the following section, the DSM of the project phases and 
activities is described in details. 
 
 
Figure 4, Main phases in a nano-satellite project. 
 
b) Project DSM Model 
The nano-satellite project full DSM model is shown in Figure. 5. The complete set of 
processes used in developing the nano-satellite is illustrated. A total of 177 processes 
are presented in this matrix. The color bands are used to group the processes into 
phases of related activities. Each phase ends by a review process which forms a 
project milestone. The black bounded boxes on the matrix diagonal represent the 
processes within each phase.  
If we consider any of the black bounded boxes and look at its left then this would 
denote the inputs this box of processes is taking from predecessor processes in a 
feed-forward fashion. If we look at the right of the box then this denotes the inputs 
from successor processes in a feed- backward fashion. If we look above the box then 
Technical 
Requiremenst 
and 
Specifications
STM QTM FM
 239 
 
this denotes the feed-backward outputs which are given by the box processes to the 
upstream, predecessor, processes. If we look below the box then these are the feed-
forward outputs given by the box processes to the subsequent, coming, processes. So, 
for each black bounded box of processes in the DSM matrix in figure 5, group of 
collected processes, we have four cases: Left which denotes input from/feed-forward, 
right which denotes input from/feed-backward, up which denotes output to/feed-
backward and down which denotes output to/feed-forward.  
The DSM in figure 6, allows us to understand the complexity of the satellite 
development processes. We can spot the locations which enjoy high level of feed-
back. The yellow marks represent the information flow. Feedback is mainly located 
at three locations in the matrix: the beginning, near the middle and at the end of the 
project.  
At the beginning of the project where the specifications of the satellite system, 
subsystems and components are being settled, many iterations take place as the 
parameters needed to be specified are interdependent. Usually we start with a set of 
parameters for the mission and then we deduce the rest of the parameters iteratively 
by fine tuning until reaching the final set of parameters and specifications of the 
satellite. 
As the work in the project progresses, subsystems start to be produced. Subsystems 
are developed separately based on the specifications which were set at the beginning 
and then integrated together for mutual functional testing. At the integration phase, 
electrical, mechanical and thermal interfacing problems usually start to show up. 
Iterations are performed to correct the interfaces, and in some cases the functionality, 
of each subsystem to perform as it should be. This step is affected by the level of 
accuracy of the interface control documents used to relay the information between 
the subsystem developers. It is also affected by the design that is adopted to develop 
the subsystems and the components quality. Therefore most of the feedback in this 
band, iterations, is affected by the previous two bands which indicate the design, unit 
testing and components procurement of the satellite subsystems.  
The flight model testing is the final step in developing a satellite. This step requires 
great focus on reviewing the details of all specifications to verify that they were 
  
accurately
componen
The proce
figure 5. 
represente
two proce
from 10 to
DSM mat
simulation
 satisfied. 
ts quality a
Figure
sses in fig
The group
d by the nu
sses groups
 28 in figu
rix in figur
s on the fu
Figure 6,
Iterations m
nd acceptan
 5, Full act
ure 6, repr
s of proce
mbers in t
 in figure 6
re 5, are pr
e 6 to pro
ll and simpl
 Main proje
ight take 
ce tests wh
ivities - DS
esent the 1
sses in eac
he parenth
, which co
esented in d
ve the vali
e (scaled d
ct phases -
place due 
ich covers 
M of the na
0 color ba
h of the b
eses in figu
rrespond to
etails in fi
dity of the
own) DSM
DSM of th
to the desi
the previou
no-satellite
nds, projec
ands show
re 6. For i
 the proces
gure 7. We
 model res
. 
e nano-sate
gn, manufa
s two steps
 project. 
t phases, s
n in figur
llustration, 
ses from 1 
 will use th
ults as we
llite project
240 
cturing, 
. 
 
hown in 
e 5 are 
the first 
to 9 and 
e simple 
 run the 
 
. 
 241 
 
1 
Determination 
of system 
specification 
details 
                             
2 
Determination 
of sub-system 
specification 
details 
Structure ○  ○ ○  ○ ○ ○ ○                    
3 Thermal ○ ○  ○ ○ ○ ○  ○                    
4 Communication ○ ○   ○ ○ ○ ○ ○                    
5  C&DH ○   ○  ○ ○ ○ ○                    
6  AOCS ○ ○  ○ ○  ○ ○ ○                    
7 Power ○ ○ ○ ○ ○ ○  ○ ○                    
8 Harness ○  ○ ○ ○ ○ ○  ○                    
9 Mission ○ ○  ○ ○ ○ ○ ○                     
10 
Determination 
of component 
specification 
details 
Structural body  ○         ○ ○ ○ ○ ○ ○  ○ ○ ○ ○ ○ ○ ○ ○ ○ ○  
11 Sep. system  ○        ○                   
12 Radiator /heater   ○       ○            ○ ○ ○     
13 TM/Com TX/RX   ○ ○      ○     ○ ○ ○       ○     
14 Downlink TX   ○ ○      ○     ○ ○ ○       ○     
15 Antenna    ○  ○    ○   ○ ○               
16 OBC hardware   ○  ○ ○    ○  ○ ○ ○   ○ ○ ○ ○ ○   ○   ○  
17 C&DH SW     ○       ○ ○ ○  ○  ○ ○ ○ ○   ○   ○  
18 AOCS SW   ○   ○    ○      ○   ○ ○ ○   ○     
19 Att. actuator   ○   ○    ○      ○  ○  ○    ○     
20 Att. Sensor   ○   ○    ○      ○  ○ ○     ○     
21 GPS   ○   ○    ○      ○  ○      ○     
22 Solar panel   ○    ○   ○  ○           ○ ○     
23 Battery   ○    ○   ○  ○          ○  ○     
24 PCU+PDU   ○    ○   ○  ○ ○ ○  ○ ○ ○ ○ ○ ○ ○ ○    ○  
25 Cables/connectors        ○  ○  ○ ○ ○ ○ ○  ○ ○ ○ ○ ○ ○ ○   ○  
26 Lens   ○      ○ ○                 ○  
27 Camera   ○      ○ ○      ○ ○       ○  ○   
28 Review  ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○ ○  
 
Figure 7, First 28 processes concerned with specification details IR/FAD.
  
G.4  Th
The Nano
execution 
located at
number 2 
parallel if
section 2.
intersectio
For each p
worst, mo
TRIPDF 
Review T
obtain a s
TRIPDF d
the TRIPD
and sched
distributio
The mathe
the execut
Whenever
e Simula
-satellite p
of the proc
 row numb
and before
 they have
 The excha
ns of a row
rocess ther
st likely a
function [1
echnique (
pecific val
istribution
F distribu
ule pairs 
n. 
matical mo
ion of the 
 a process 
tion Algo
roject pro
esses match
er 3 in the
 the proce
 no interd
nge of inf
 and a colu
e is a varia
nd best va
]. The sam
PERT) wh
ue for the 
. The mode
tion for eac
for the d
Figu
del is dev
project pro
is executed
rithm 
cesses are 
es their ord
 DSM wou
ss in row 
ependency
ormation b
mn in the D
ble cost an
lues for ea
e concept 
en develop
cost and s
l starts by 
h process. 
ifferent p
re 8, TriPD
eloped usin
cesses in th
 for some 
represente
er in the D
ld be exec
number 4. 
in exchan
etween the
SM.  
d schedule 
ch proces
is used in
ing project
chedule of
generating
The rando
rocess. Fig
F distributi
g the C lan
e order of 
time ΔT, it
d using D
SM. For ex
uted after 
The proces
ging outpu
 processes 
range. The
s. This not
 the Progr
 schedules
 any proce
 random va
m variables
ure 8, sh
on. 
guage.  Th
their appea
s duration 
SM. The o
ample, the
the process
ses can ex
ts as desc
is presente
 range spec
ation is ca
am Evaluat
. The likeli
ss depends
riables tha
 represent 
ows the T
e model s
rance in th
is reduced 
242 
rder of 
 process 
 in row 
ecute in 
ribed in 
d at the 
ifies the 
lled the 
ion and 
hood to 
 on the 
t follow 
the cost 
RIPDF 
 
imulates 
e DSM. 
with the 
 243 
 
same amount. This indicates that the process already run and would finish when its 
duration equals to (0). The cost is increased in a linear relation with the elapsed 
duration. As the process spends more time in execution its duration is decreased and 
its cost is increased.  
The DSM matrix is analyzed by the banding algorithm to detect the set of processes 
that can run in parallel during the same simulation step. The step itself is set to a 
value that is lower than the lowest duration change in any process. For example, if 
the lowest duration for any activity is 1 day then the simulation step should be set to 
less than 1 day. The smaller the simulation step the higher the accuracy of the model 
outputs. Simulation can run by progressing the time in terms of small steps of ΔT and 
calculating the change in the processes durations and costs based on that change or 
through discrete events of change [1]. 
The DSM representation shows the possible iterations in the form of feedback marks 
above the diagonal in the (IR/FAD) form. This feedback from the downstream 
activities to the upstream might lead to probable rework of downstream activities in 
between the process that received the feedback and the process that issued the 
feedback [1].  
This rework is not certain as it depends on finding a necessity to go back and modify 
something in a previously iterated process. In some cases of projects, feedback 
represents an input that must be taken into consideration but it is not available at the 
moment of running the activity. The feedback value is then estimated as a 
preliminary value which would be modified when output from the upstream activity 
becomes available. The rework in that situation is certain and not probable as it 
represents an input that was roughly estimated beforehand. 
The rework probability is estimated in the model based on the fact that it is less 
probable to rework earlier upstream processes due to the change that takes place in 
later downstream processes. Thus the rework probability decreases as we go towards 
the end of the project when we traverse through rows which represent the affected 
processes. To compensate for the probable modification of results that might happen 
later due to the availability of information in a more accurate form, the rework 
probability increases slightly as we traverse through column processes. This 
 244 
 
represents an increase in the probability of rework for the affecting processes. These 
two conditions seem to be contradictory, increase as we traverse through columns 
and decrease as we traverse through rows, however this is the physical conditions 
that do really exist. Figure.9, shows the rework probability over the rows and 
columns of the DSM matrix. 
 
Figure 9, Rework probability over the rows and columns of the DSM. 
The satellite subsystems depend in their development on some of the processes that 
are listed in the DSM. The processes are relevant to each subsystem depending on 
the project phase and the subsystem itself. The effect of each process on the 
successfulness of developing the subsystem and its ability to meet the design 
requirements is represented by a weight function. It can be seen as a function which 
defines the importance of each process to each subsystem. Three weight functions 
are being studied: Equal weight, increasing weight and decreasing weight. Figure 10, 
shows the different weight functions plots. In equal weight all the processes have the 
same importance all over the project phases. In increasing weight processes have 
more importance as we move towards the end of the project. In decreasing weight 
processes have less importance as we move towards the end of the project. The 
importance of a process is a heuristic for the effect that the process might have on the 
project in case its outputs did change. 
The importance of the processes and their effect on the project outcomes is affected 
by the management approach. Some process managers tend to focus on the project at 
its beginning to start correctly and build upon the success of each stage. Others like 
  
to focus o
of others a
the projec
The weigh
number to
to 1. We u
quadratic 
Sometime
decreases 
function m
tries to 
leading\ex
doing the 
of work i
useful in r
tendency t
(after the p
F
The simul
batch cons
total cost 
when to s
between th
running fu
n the test st
s long as c
t processes 
t function 
 a weightin
se simple 
weight fu
s processes
importanc
ight be the
mimic th
ecuting pro
tasks gradu
s mostly ne
epresenting
o focus on
eak).  
igure 10, W
ation proce
ists of 100
and total sc
top the sim
eir mean a
rther batch
age and thi
ost of chan
as equi-imp
is a simpl
g factor. In
linear funct
nction mi
 have high
e as in th
 saw tooth
e behavio
jects. Usua
ally then th
ar the mid
 that situat
 the first ha
eight func
eds by rep
0 runs, then
hedule bet
ulation. I
nd variance
es. In each
nk that test
ge will not
ortant.  
e first degr
 figure 10,
ions to rep
ght also b
er importa
e shape of
 or even n-
r of pro
lly people 
ey saturate
dle of the 
ion. It can b
lf (before t
tions plots,
decreasing
eating the 
 comparin
ween the co
n case the 
 then we c
 run we ca
ing is suffic
 blow up. T
ee polynom
 the area u
resent the w
e used to
nce at the 
 a trapezo
degree poly
cess man
tend to star
 then start
project. Th
e skewed i
he peak) or
 
 equal weig
 weight. 
runs in the
g the mean 
nsecutive 
last two b
onsider the
lculate the
ient to ove
he majorit
ial which 
nder the we
eight effe
 have sm
beginning 
id functio
nomial.  T
agers and
t building e
 to lose it g
e quadratic
n either dir
 the second
ht, increasi
 form of b
value and
batches, a d
atches hav
 simulation
 total reliab
rcome the m
y tend to d
relates the
ight curve 
cts. Neverth
oother tra
then satur
n. Another
he weight 
 executers
nthusiasm 
radually. T
 function m
ection to re
 half of the
ng weight, 
atches wh
the varianc
ecision is 
e an error 
 stable and 
ility of the
245 
istakes 
eal with 
 process 
is equal 
eless, a 
nsitions. 
ate then 
 weight 
function 
 when 
towards 
he peak 
ight be 
veal the 
 project 
 
and 
ere each 
e for the 
made as 
of 0.01 
we stop 
 project 
 246 
 
taking into account the weight of the processes. Figure 11, shows the simulation 
cycle of the model. It starts with loading the simulation parameters and the DSM 
matrices. A distribution generator generates the random pools of the PDF and CDF 
functions for the cost and schedule TRIPDF of each process. A random selection is 
carried out from the pools of random variables to select the cost and schedule of each 
process randomly. The simulator core is responsible for traversing the DSM matrix 
starting from the first process and going in order, process after process. Whenever a 
probable rework is found then a random number is generated to evaluate whether to 
do that rework or not. The rework is based on the rework probability matrix that is 
defined for each process in the DSM as shown in figure 9. If rework takes place then 
all the interim processes between the reworked process and the original process are 
evaluated for probable rework using the same concept of random number generation 
[1]. The model then starts to estimate the reliability based on a complete run of the 
model. Each run traverses the DSM matrix from the beginning till the end in 
increments of time step that is used to progress a duration counter for the processes 
as if they were executing. The cost is linearly dependent on the schedule progress 
and increases as portions of the schedule are counted. The system then produces the 
simulations graphs using a MATLAB script. 
 
Figure 11, the simulation cycle block diagram. 
LOADER
(DSM, Rework probabilities, Rework 
Impact, Learning Curve, TriPDF limits)
GENERATOR
TriPDF generator  
For each process  
(PDF and CDF)
Random Variable
Generation of Random Cost and 
Schedule pairs for each activity 
based on their TriPDFs
SIMULATOR
Simulation of activity running 
through propagating time step, 
summation of cost and schedule, 
and rework evaluation
RELIABILITY 
Estimator
GRAPHER
 247 
 
Equations (1 to 7) show the formula used in the schedule, cost and reliability 
calculations. The schedule duration in Eq. (1) is a summation of all the schedule 
durations of the project processes. The duration for each process is calculated based 
on the initial random value that was generated from the TRIPDF distribution and 
modified by accumulating the extra durations due to reworks for that process. The 
cost in Eq. (2) is calculated in the same way as the schedule. The reliability in Eq. (3) 
is a multiplication of the subsystems reliabilities. Where the reliability of each 
subsystem is calculated as in Eq. (4) using the probability of failure of the subsystem 
itself. The probability of failure of any subsystem is affected with the rework of the 
processes used to develop it. It changes after rework due to the volume of reworked 
processes. This is represented in Eq. (5) and Eq. (6) where we used the logarithm to 
be effective when there is high volume of rework as it would lead to reasonable 
values of the probability of failure. If we consider using the exponential function 
instead of the logarithm when we have high volume of reworks then we would have 
very small values, almost zeros, for the probability of failure. The use of the 
logarithm has two cases, when the total number of rework is less than 1 and when it 
is greater than or equal to 1. This is to account for the negative values produced by 
the logarithm function for fractions, the less than one case, and also when there is no 
rework at all, total rework equal to 0.  
The total rework number of each process is calculated as in Eq (7) by summing the 
product of the rework count for each process and the process weight effect. This 
gives a weighted average for each process. The weighted average value for each of 
the processes belonging to a specific subsystem is then summed together to form the 
total weighted average for that subsystem processes. We call this total value the 
rework index. We emphasize here on the concept of reliability improvement in this 
model. The reliability is improved by multiplying the probability of failure of each 
subsystem by its rework index logarithm. 
ܥ = ෍ܿ௜
௡
௜ୀଵ
 (1.) 
ܵ =෍ݏ௜
௡
௜ୀଵ
 (2.) 
 248 
 
ܴ௦௬௦௧௘௠ =ෑܴ௜
௠
௜ୀଵ
 (3.) 
ܴ௜ = 1 − ܲ(݂݈ܽ݅ݑݎ݁ ݂ܽݐ݁ݎ ݎ݁ݓ݋ݎ݇)௜
Where (m) is the total number of subsystems 
(4.) 
 
ܲ(݂݈ܽ݅ݑݎ݁	݂ܽݐ݁ݎ	ݎ݁ݓ݋ݎ݇)௜ = ܲ(݂݈ܽ݅ݑݎ݁ ܾ݂݁݋ݎ݁ ݎ݁ݓ݋ݎ݇)௜/ log(ݔ) 
Provided that [x > 1] 
(5.) 
ܲ(݂݈ܽ݅ݑݎ݁	݂ܽݐ݁ݎ ݎ݁ݓ݋ݎ݇)௠ = ܲ(݂݈ܽ݅ݑݎ݁ ܾ݂݁݋ݎ݁ ݎ݁ݓ݋ݎ݇)௠ 
Provided that [x ≤ 1] 
(6.) 
Where	(x)	is	the	total	number	of	rework	for	subsystem	processes	given	by:	
ݔ = ∑ process	rework count୦ × Weight Effect of process(h)௠௛ୀଵ x =∑ processreworkcount୦ × WeightEffectofprocess(h)୫୦ୀଵ   
(7.) 
G.5  Results and Discussion 
The simulation was run on two models. The full scale DSM of the nano-satellite 
project and the scaled down or we call it the grouped DSM. The purpose is to prove 
that the model results are valid and to assure the conclusions we reached from 
running the full scale model.  Table 1 and Table 2 show the simulation parameters 
used for both models. 
Table 1 Simulation Parameters of the full DSM model. 
Parameter  Value 
Number of processes  177 
Number of subsystems 8 
Number of runs per (Batch)  1000 
Subsystem probability of failure 0.1 
Mean error threshold  0.01 
Variance error threshold 0.01 
 
Table 2 Simulation parameters of the grouped DSM model. 
Parameter  Value 
Number of processes  10 
Number of subsystems  8 
Number of runs per (Batch)  10000 
Subsystem probability of failure  0.1 
Mean error threshold  0.01 
Variance error threshold  0.01 
 
 249 
 
The main difference is in the number of the processes being simulated and the 
number of runs per batch.  
The simulation in the grouped DSM makes use of the TRIPDF distribution shown in 
Table 3. The best case is 10% below the most likely case. The worst case is 10% 
above the most likely case. The estimations of the cost and schedule for each process 
were deduced from the full scale DSM TRIPDF estimations. 
Table 3 TRIPDF distribution of the grouped DSM processes. 
Schedule (Days) Cost (K JPY) 
BC ML WC BC ML WC 
99 100 101 13860 14000 14140 
173.25 175 176.75 10890 11000 11110 
181.17 183 184.83 26284.5 26550 26815.5 
38.61 39 39.39 8068.5 8150 8231.5 
60.39 61 61.61 46530 47000 47470 
172.26 174 175.74 39501 39900 40299 
105.93 107 108.07 25542 25800 26058 
133.65 135 136.35 108652.5 109750 110847.5 
110.88 112 113.12 20889 21100 21311 
104.94 106 107.06 27423 27700 27977 
 
The grouped DSM simulation, with equal weight function applied, generated the 
schedule, cost and reliability distributions shown in Figure 12 and Figure 13. 
Figure 12, Schedule and Cost distribution of the grouped DSM (10 processes), with 
equal weights of the processes. 
 250 
 
 
Figure 13, Reliability distribution of the grouped DSM (10 processes), with equal 
weights of the processes. 
 
The statistical data of the distributions is shown in Table 4. Reliability reached up to 
80.63% with iterations.  
Table 4 Statistical data of the schedule, cost and reliability distributions of the 
grouped DSM simulation. 
 Min Max Mean Std Range 
Cost (K JPY) 3.329e+05 3.469e+05 3.399e+05 4472 1.4e+04 
Schedule 
(Days) 
1064 1115 1089 16.02 51 
Reliability 0.5863 0.8063 0.6963 0.0678 0.22 
 
The 3D plot of the cost schedule and reliability is shown in Figure 14 and the joint 
PDF between the cost and schedule is shown in Figure 15. It is clear that the 
reliability and cost increase with the schedule . Also from Figure 15, there is a peak 
at which the cost and schedule would have the highest joint probability value. The 
corresponding total schedule and total cost are the most probable to occur in the 
project. This is useful in planning as to expect the probability of finishing within 
certain target budget and time. 
 251 
 
 
Figure 14, 3D plot of the cost, schedule and reliability for the grouped DSM. 
 
 
Figure 15, 3D plot of the joint cost and schedule PDF for the grouped DSM. 
 
The grouped DSM model, 10 processes, shows a trend of increasing cost versus 
schedule in a linear function as illustrated in Figure 16. This is logical as the cost 
keeps rising as long as the project schedule keeps increasing, due to the incurred cost 
in operations and resources. However, Figure 17 shows a trend of saturation in 
between the cost and reliability. So as cost keeps rising, reliability tends to saturate. 
Which means that, at some point in the project, regardless of the amount of processes 
being iterated, which obviously leads to increment in the schedule and cost, the 
 252 
 
reliability will not improve significantly. More incurred cost due to more iteration 
does not mean endless improvement in the reliability. This fact is also clear from 
Figure 18, where the reliability tends to saturate even when more time is spent in 
iterating previously accomplished processes. The reason for having an upper 
reliability limit depends on the intrinsic goodness of the subsystems which is initially 
defined by the failure rates.  
 
Figure 16, Cost and schedule relation in the grouped DSM. 
 
Figure 17, Cost and reliability relation in the grouped DSM. 
 
 253 
 
 
Figure 18, Schedule and reliability relation in the grouped DSM. 
 
The simulation was run on the full scale DSM, 177 processes, with equal, increasing 
and decreasing weights of the processes. Figure 19 and Figure 20 show the 
distribution of the cost, schedule and reliability after running the simulations. 
 
Figure 19, Schedule and Cost distribution of the full DSM (177 processes), with 
equal weights of the processes. 
 254 
 
 
Figure 20, Reliability distribution of the full DSM (177 processes), with equal 
weights of the processes. 
The statistical data of the distributions is depicted in Table 5. The reliability can 
reach a maximum of 86.67%. However, the schedule and cost would increase 
drastically. With the minimum reliability of 80.17% and the maximum of 86.67%, a 
1% of reliability improvement corresponds to 38461.5 K JPY increase in cost and 
24.62 days increase in schedule. 
Table 5 Statistical data of the schedule, cost and reliability distributions of the full 
DSM simulation. 
 Min Max Mean Std Range 
Cost (K JPY) 3.99e+05 6.49e+05 5.24e+05 7.649e+04 2.5e+05 
Schedule 
(Days) 
655 815 735 50.5 160 
Reliability 0.8017 0.8667 0.8342 0.01984 0.065 
 
Figure 21, shows the 3D plot of the joint PDF function between the cost and the 
schedule. The multimodals reprent the effect of processes iterations. Several peaks 
exist due to variety of iteration probabilities and paths. 
 255 
 
 
Figure 21, 3D plot of the joint cost and schedule PDF for the full DSM. 
 
Figure 22, is a top view of Figure 21. It can be used in project planning. The highest 
joint probability of the cost and schedule corresponds to a range from 5.26 K JPY to 
5.4 K JPY for the cost and from 746 days to 761 days in the schedule. A target 
should be defined, the black lines, and the project manager should calculate the 
probability of finishing the project at the required target based on the joint PDF 
values. 
 
Figure 22, Top projection of the joint cost and schedule PDF for the grouped 
DSM. 
 256 
 
 
 
 
Figure 23, Comparison between the cost schedule and reliability in equal, 
increasing and decreasing weight processes for the full DSM. 
 
The simulation was run on the three cases, equal, increasing and decreasing 
processes weights. Figure 23 shows the results in terms of the mean values. The 
lowest cost and schedule, while achieving high reliability, was at the decreasing 
process weight model. In that model the cost to improve reliability with a certain 
percentage is lower than the cost needed in the case of equal and increasing models. 
The same fact applies to the case of schedule. This means that the project reliability 
can be improved while not pushing the incurred cost and schedule too high. 
G.6  Conclusion  
This study presented the effects of rework on the Nanosatellite project key 
parameters. The effect of rework on schedule, cost and reliability, is random in its 
735
724.6
713.6
700
710
720
730
740
Equal Increasing Decreasing
Schedule (Days)
Schedule
524,000
515,400 514,700
510,000
515,000
520,000
525,000
Equal Increasing Decreasing
Cost (K JPY)
Cost
0.8342
0.7819
0.8224
0.75
0.8
0.85
Equal Increasing Decreasing
Reliability
Reliability
 257 
 
nature. Therefore a model based on Monte-Carlo simulation was developed to study 
the effects. The simulations showed that rework do affect the project reliability. The 
more rework we have the higher the reliability would be. Rework has an adverse 
effect as well. It leads to increasing the schedule of the project as more time is 
needed to re-do the previously done activities. The re-doing affects the project cost. 
As the cost is tied to the schedule in a direct forward proportionality, the increment 
in the schedule leads to increment in cost as well. The statistical results showed that 
the increment in schedule and cost due to rework corresponds to increment in the 
project reliability. The seriousness by which we handle the project processes was 
modeled using a weight function. The simulations showed that, although the 
probable rework was distributed in three main phases in the DSM, the beginning, the 
middle and the end, still higher reliabilities can be achieved with the decreasing 
weight function. The decreasing weight means we focus seriously on the early 
stages. This result is proven by the numbers in the statistical tables of the 
simulations.  If we focus on the early specification stages then all of the project 
would progress with higher reliability. The project cost saturates with the reliability 
as we reach the project end. It means that the processes at the end of the project, 
which are of higher cost than the processes at the beginning, due to its nature such as 
building of the expensive flight model, cannot afford to raise the project reliability. 
We cannot correct all of the mistakes at the last moment. So it is of limited effect on 
the project. The bending in the cost-reliability curves is due to the overlapping in 
executing the tasks and rework effects on the schedule and the cost of the project. 
The simulation model presented can be very useful in project planning. It can be used 
to estimate the expected schedule, cost, and reliability of a Nano-satellite project 
with a set of iterative processes. The model can be used to assess the project risk 
management by calculating the profit-lost values. The significance of the study is 
that it merges the DSM approach with the project planning concepts and applies this 
to space based projects. 
 
References 
1. Tyson Rodgers Browning, Modeling and Analyzing Cost, Schedule, and Performance 
in Complex System Product Development, Doctor of Philosophy in Technology, 
Management, and Policy, MIT, 1998. 
 258 
 
2. Ali A.Yassine, “An Introduction to Modeling and Analyzing Complex Product 
Development Processes Using the Design Structure Matrix (DSM) Method“, 
Quaderni di Management (Italian Management Review), No.9, 2004. 
3. Steven D. Eppinger and Tyson R. Browning, “Design Structure Matrix Methods and 
Applications”, MIT Press, June 2012, USA. 
4. Chun-Hisen Chen, Shih Fu Ling, and Wei Chen, “Project Scheduling for Collaborative 
Product Development Using DSM“, International Journal of Project Management, 
Vol.21, 2003. 
 
