Built-in Self Repair by Reconfiguration of FPGAs 
S. Habermann, R. Kothe, H. T. Vierhaus 
Brandenburg University of Technology Cottbus 
Computer Engineering Group 
{rek|htv}@informatik.tu-cottbus.de
Abstract 
Systems on a chip (SoCs) in safety-critical applications 
need features such as built-in self-test (BIST), on-line 
self-test and error compensation for transient faults. 
With ever-shrinking feature sizes, built-in self-repair 
(BISR) may also become a must. While BIST and BISR are 
well understood and frequently implemented for embedded 
memory blocks, BISR for random logic is still a largely 
unsolved problem. Logic circuits based on 
field-programmable gate arrays (FPGAs) are a technology 
base that allows for functional reconfiguration in the 
field of application. In this paper we investigate the 
possibilities and limitations of logic BISR for FPGAs. 
1. Introduction 
Several authors predict that sub-100-nm IC technologies 
will lead to increased numbers of defective devices 
because of inevitable device parameter fluctuations [1]. 
Furthermore, there is an increasing sensitivity to 
transient fault effects induced by particle radiation and 
electromagnetic coupling. A rising share of 
stress-induced device failures in the field is also 
likely. Therefore future SoCs need to be designed for 
efficient self-repair. While built-in self-repair is 
known to work on regular structures like embedded memory 
blocks [3], BISR for random logic blocks is a more 
challenging problem.  
An FPGA-based self-repairing system was developed and 
implemented at Stanford University [2] for applications 
in long-term space missions. The suggested method of 
reconfiguration uses a repair function which is based on 
shifting columns in the FPGA structure. Unfortunately, 
such an approach is relatively inefficient since it 
discards multiple functional CLBs in the repair 
procedure. In this paper, we describe a more efficient 
FPGA-based embedded repair approach which avoids the 
loss of functional CLBs. 
2. FPGA Architecture 
An FPGA can be seen as a regular array of configurable 
logic blocks (CLBs) and routing resources. For detailed 
investigations we focused on the Virtex-II Pro series 
from Xilinx Inc. In that case a CLB consists of the 
following elements: 
• 4 slices incorporating function generators, 
logic gates, function multiplexers and storage 
devices 
• 3-state-elements, 
• 1 switching matrix. 
Every CLB contains 4 flexible slices with a broad 
application range. The complexity of such a core cell 
evidently exceeds 1000 transistors. 
Routing on the FPGA is organized hierarchically. 
Every CLB can directly communicate only with a 
small subset of CLBs on the FPGA. For transmitting 
signals to CLBs that are not directly reachable, a CLB 
hopping scheme has to be used.  
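The hopping idea can be illustrated as a shortest-path search over a direct-connectivity relation. The following is only a minimal sketch under a simplified, hypothetical connectivity model (a CLB directly reaches CLBs up to `reach` positions away in a straight line); the function names and the model are assumptions for illustration and do not describe the actual Virtex-II Pro routing fabric.

```python
from collections import deque

def direct_neighbors(clb, cols, rows, reach=1):
    """Hypothetical model: CLBs reachable directly from `clb`, i.e. at most
    `reach` positions away along a row or a column of the array."""
    x, y = clb
    for d in range(1, reach + 1):
        for nx, ny in ((x + d, y), (x - d, y), (x, y + d), (x, y - d)):
            if 0 <= nx < cols and 0 <= ny < rows:
                yield (nx, ny)

def hop_path(src, dst, cols, rows, reach=1):
    """Breadth-first search for a path with the fewest hops from src to dst."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        cur = queue.popleft()
        if cur == dst:
            path = []
            while cur is not None:      # walk back to the source
                path.append(cur)
                cur = parent[cur]
            return path[::-1]
        for nxt in direct_neighbors(cur, cols, rows, reach):
            if nxt not in parent:       # not visited yet
                parent[nxt] = cur
                queue.append(nxt)
    return None                         # dst is not reachable
```

With a larger `reach`, fewer intermediate CLBs are needed for the same source and destination, which mirrors the trade-off between direct connections and hopping.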
3. Granularity of Reconfiguration 
There are two different kinds of FPGA reconfiguration 
concepts: full and partial. The main advantage of the 
full reconfiguration approach is its simplicity. 
Normally no additional devices are needed, only memory 
for storing the necessary configuration data. But this 
memory demand is also the big disadvantage of this type 
of reconfiguration. A set of configuration data for the 
Virtex-II Pro FPGA has a size of 550 kbytes for a 
complexity of about 1 million equivalent gates. For just 
25 relevant fault cases, a database of 13.4 Mbytes has 
to be stored. With data compression techniques the 
required storage could be reduced further [2].  
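The storage figure can be checked with a short calculation. This sketch assumes one complete set of configuration data per fault case and 1 Mbyte = 1024 kbytes; the function name is hypothetical.

```python
def full_reconfig_storage_mbyte(fault_cases, config_kbyte=550):
    """Storage for one full configuration per fault case, in Mbytes
    (assumption: 1 Mbyte = 1024 kbytes)."""
    return fault_cases * config_kbyte / 1024

# 25 fault cases with 550 kbytes each
print(round(full_reconfig_storage_mbyte(25), 1))  # 13.4
```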
Another approach is partial reconfiguration, where only 
a (small) part of the FPGA device is reconfigured. As 
mentioned, a column-based reconfiguration scheme was 
developed at Stanford University [2]. This scheme has 
the advantage of getting along with only relatively 
simple re-routing functions. On the other hand, the 
usage of resources is not very efficient, since multiple 
functional CLBs in the affected column are lost. We 
therefore investigated the possibilities of the Xilinx 
synthesis software. Typically, FPGA designs are 
synthesized into a clustered distribution of CLBs to 
minimize signal delays, leaving unused CLBs that may 
serve as backup resources in peripheral positions. Such 
a structure, however, is not very favorable for 
repairing by CLB substitution, since re-routing the FPGA 
tends to become a very complex task involving local and 
global re-routing [4]. Therefore the basic approach was 
to start from a repair-friendly allocation of CLBs and 
routing resources and to perform repair functions by 
shifting CLBs with a minimized amount of re-routing.  
4. Repair Functions 
Our investigations were based on a Virtex-II Pro 
FPGA (type XC2VP7) which contains a 32-bit 
PowerPC core (PPC405). This is a realistic architecture 
for a practical system design. The embedded processor is 
powerful enough to support the reconfiguration process 
and also some lean tasks of embedded design automation. 
A necessary precondition for a local repair function of 
limited complexity is a favorable allocation of spare 
CLBs. The repair procedure requires the local shifting 
of a CLB function plus re-routing of local and, to a 
limited degree, global connections.  
The repair function consists of the following steps: 
1. Examine whether spare CLB resources are available 
and can be used. For example, some CLBs are used for 
the process of reconfiguration itself. For such CLBs, 
additional efforts in fault-tolerant design may 
become necessary in order to prevent a procedural 
deadlock in the repair process.  
2. Extract signal source and drain locations for all 
signal pins of the CLB that needs replacement.  
3. Route the new signal connections. 
4. Shift the CLB function by programming the 
target CLB. 
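The four steps above can be sketched on a simple abstract model. This is a minimal illustration, not the implemented repair function: the data structures (`placement`, `nets`, `free_clbs`, `reserved`) are hypothetical, the spare CLB is chosen by smallest Manhattan distance as an assumption, and step 3 only updates connection endpoints instead of allocating real routing resources.

```python
def repair(faulty, placement, nets, free_clbs, reserved):
    """Shift the function of a faulty CLB to a spare CLB.

    placement: dict mapping CLB position (x, y) -> function name
    nets:      list of (source position, drain position) pairs
    free_clbs: set of unused CLB positions
    reserved:  positions needed by the reconfiguration system itself
    """
    # Step 1: check that a usable spare CLB exists.
    candidates = sorted(free_clbs - reserved - {faulty})
    if not candidates:
        raise RuntimeError("no spare CLB available")
    # Assumption: take the nearest spare (Manhattan distance).
    target = min(candidates,
                 key=lambda c: abs(c[0] - faulty[0]) + abs(c[1] - faulty[1]))

    # Step 2: extract all connections that start or end at the faulty CLB.
    touched = [(s, d) for s, d in nets if faulty in (s, d)]
    untouched = [(s, d) for s, d in nets if faulty not in (s, d)]

    # Step 3: "route" the new connections (endpoints only in this sketch).
    def move(pos):
        return target if pos == faulty else pos
    rerouted = [(move(s), move(d)) for s, d in touched]

    # Step 4: shift the CLB function to the target CLB.
    new_placement = dict(placement)
    new_placement[target] = new_placement.pop(faulty)

    # The faulty CLB is not returned to the pool of free CLBs.
    return target, new_placement, untouched + rerouted, free_clbs - {target}
```

In this sketch the CLBs listed in `reserved` are never selected as targets, which reflects the concern in step 1 that the repair process must not disturb the resources it runs on.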
The embedded tool that performs the local re-routing is 
a challenge in itself. It has to deal with the 
opportunities and the complexity of the hierarchical 
routing scheme of Xilinx FPGAs. Experiments showed that 
a restricted routing scheme which prefers one routing 
direction (horizontal or vertical) leads to acceptable 
performance for the re-routing task in most cases while 
using only limited memory space. 
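The idea of a direction-preferring router can be sketched as a greedy walk on a grid. This is only an illustration under strong assumptions: routing resources are modelled as grid positions, `blocked` is a hypothetical set of occupied positions, and the router gives up instead of searching for detours. It stores nothing but the path itself, which hints at why such a scheme needs little memory.

```python
def route_preferring_horizontal(src, dst, blocked):
    """Greedy route from src to dst that tries the horizontal step first and
    falls back to the vertical step if the horizontal one is blocked.
    Returns the list of visited positions, or None if the router is stuck."""
    x, y = src
    path = [src]
    while (x, y) != dst:
        # Candidate steps that bring us closer to the destination.
        step_x = (x + (1 if dst[0] > x else -1), y) if x != dst[0] else None
        step_y = (x, y + (1 if dst[1] > y else -1)) if y != dst[1] else None
        for nxt in (step_x, step_y):        # preferred direction first
            if nxt is not None and nxt not in blocked:
                x, y = nxt
                path.append(nxt)
                break
        else:
            return None                     # no free step towards dst
    return path
```

Every accepted step reduces the distance to the destination by one, so the walk always terminates. A vertical-first variant only needs the two candidates in the inner loop swapped.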
5. Results 
Repair functions were implemented on a Xilinx Virtex-II 
Pro FPGA that contains an embedded PPC405 processor 
(350 MHz), the whole reconfiguration system and the 
device under test (DUT), which consisted of some 
combinational functions. All CLBs of the DUT were 
successfully repaired by shifting the function to a more 
or less randomly chosen free CLB. The program code of 
the repair procedure needs only about 230 kbytes. 
Embedded repair functions typically required up to a few 
seconds on the PPC405. The re-routing time, however, 
depends mainly on the distance between the sources and 
drains of the CLBs involved and on the utilization of 
the FPGA's CLBs and routing resources.  
6. Summary and Conclusions 
Fault repair on FPGAs through local CLB substitution is 
feasible, even under embedded conditions in the field of 
application. Both an adequate allocation of redundancy 
and a reasonably powerful embedded processor are 
necessary. The available level of repair granularity 
(CLB) is on the order of a thousand transistors. The 
problem of identifying a faulty CLB by diagnostic tests 
in the field still needs a lot of attention. We assume 
that our proposed method is useful and realistic for 
safety-critical applications which are not 
time-critical. 
7. Acknowledgement 
This work was supported by the German Research 
Foundation (Deutsche Forschungsgemeinschaft, DFG) 
within the HITSOC project under grant no. VI 185/5-1.
8. References 
[1]  M. A. Breuer, S. Gupta, T. M. Mak, “Defect and Error 
Tolerance in the Presence of Massive Numbers of Defects”, 
IEEE Design and Test of Computers, May/Jun 2004, 
pp. 216-227 
[2]  S. Mitra, W.-J. Huang, N. R. Saxena, S.-Y. Yu, 
E. J. McCluskey, “Reconfigurable Architecture for 
Autonomous Self Repair”, IEEE Design and Test of 
Computers, May/Jun 2004, pp. 228-240 
[3]  S. Shoukourian, V. Vardanian, Y. Zorian, “SoC Yield 
Optimization via an Embedded-Memory Test and Repair 
Infrastructure”, IEEE Design and Test of Computers, 
May/Jun 2004, pp. 200-207 
[4]  A. La Rosa, L. Lavagno, C. Passerone, “Software 
Development for High-Performance, Reconfigurable, 
Embedded Multimedia Systems”, IEEE Design and Test of 
Computers, Jan/Feb 2005, pp. 28-38 
[5]  R. Lysecky, F. Vahid, S. X.-D. Tan, “Dynamic FPGA 
Routing for Just-in-Time FPGA Compilation”, ACM-IEEE 
Design Automation Conference, June 2004
Proceedings of the 12th IEEE International On-Line Testing Symposium (IOLTS'06) 
0-7695-2620-9/06 $20.00 © 2006 IEEE 