Partial and Dynamic reconfiguration of FPGAs: a top down design methodology for an automatic implementation by Berthelot, Florent et al.
Florent Berthelot, Fabienne Nouvel, Dominique Houzet
To cite this version:
Florent Berthelot, Fabienne Nouvel, Dominique Houzet. Partial and Dynamic reconfiguration of FPGAs: a top down design methodology for an automatic implementation. 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), Apr 2006, Rhodes, Greece. pp.1-4, 2006. <hal-00083979>
HAL Id: hal-00083979
https://hal.archives-ouvertes.fr/hal-00083979
Submitted on 5 Jul 2006
Partial and Dynamic reconfiguration of FPGAs: a top down design
methodology for an automatic implementation
Florent Berthelot1, Fabienne Nouvel, and Dominique Houzet
1CNRS UMR 6164 IETR-INSA
Rennes 20 av des Buttes de Coesmes
35043 Rennes, France
{florent.berthelot, fabienne.nouvel,dominique.houzet}@insa-rennes.fr
Abstract
Dynamic reconfiguration of FPGAs enables systems to adapt to changing demands. This paper concentrates on how to take into account the specificities of partially reconfigurable components during the high-level Adequation Algorithm Architecture process. We present a method which automatically generates the design for both the partially reconfigurable and the fixed parts of FPGAs. The runtime reconfiguration manager, which monitors dynamic reconfigurations, uses a prefetching technique to minimize reconfiguration latency. We demonstrate the benefits of this approach through the design of a dynamically reconfigurable MC-CDMA transmitter implemented on a Xilinx Virtex2.
1. Introduction
Recent and next-generation wireless systems are being designed to provide a wide variety of multimedia services and to switch seamlessly between different wireless standards. Multiple physical layer algorithms must be implemented, such as coding, decoding and equalization. The need for Software Defined Radio (SDR) [4] is motivated by the wide range of configuration parameters and the required flexibility, and leads to the concept of reconfigurability. Harvard-based architectures (DSPs) are reconfigurable at system level and use a temporal implementation scheme for operations, but they suffer from poor power efficiency for repetitive, high-throughput computations. Reconfigurable devices, including FPGAs, can fill the gap between hardwired and software technology. Recently, runtime reconfiguration (RTR) of FPGA parts has led to the concept of virtual hardware. By dynamically changing the functionality performed by the FPGA over time, we can address SDR constraints and obtain a scalable system. However, reconfiguration latency is a major drawback of runtime reconfiguration on commercial devices and must be considered during the HW/SW partitioning process. The objective is to obtain a near-optimal scheduling of tasks in time over a heterogeneous architecture [5].
In this paper, we first briefly introduce the different reconfiguration levels and describe the impact of using runtime reconfigurable components on algorithm-architecture adequation. Next, we present how a partially runtime reconfigurable part of an FPGA is modeled with SynDEx. We then discuss how the management of runtime reconfiguration over time can be generated automatically with SynDEx [6]. A case study based on a runtime reconfigurable MC-CDMA transmitter is presented in the last section, followed by the conclusion.
2. Reconfiguration Levels
In the case of mobile communications, three main constraints have to be combined: high performance, low power consumption and flexibility. The grain of computation and the reconfiguration schemes are open research topics. Three levels of reconfiguration can be considered:

In the System Level Reconfiguration case, the application is very often supported by a heterogeneous architecture (DSPs, FPGAs). The functionalities are distributed onto these components according to their performance. However, at this level it is difficult to address a specific function of the application.

With Functional Level Reconfiguration, most reconfigurable architectures are based on FPGAs which support partial dynamic runtime reconfiguration and allow flexible hardware. A runtime reconfiguration manager controls, monitors and executes the dynamic reconfiguration.

In the Logic and RTL Level Reconfiguration case, the majority of FPGAs are fine grain since they can be configured at the bit level. However, this level depends on the architecture's manufacturer and does not allow the designer to adopt an open methodology.

Considering the SDR constraints, the functional reconfiguration level seems to be the best level for the partial and dynamic reconfiguration of our systems.
3. Design flow for dynamic reconfiguration: AAA approach
Several partitioning methodologies based on various approaches are reported in the literature [2]. They are characterized by the granularity level of the partitioning, the metrics, the target hardware, the support of runtime reconfiguration, the flow automation and the on-line / off-line scheduling policies. The choice of candidates for a dynamic hardware implementation can be guided by metrics such as execution time, memory constraints, power efficiency, reconfiguration time, configuration prefetching capabilities and area constraints.
Among these methods we have chosen the AAA approach with its tool SynDEx. The AAA methodology aims at finding the best match between an algorithm and an architecture while satisfying time constraints. The reader is referred to our previous paper [1], which describes all the steps of the methodology and its extensions to FPGA modeling.
The application algorithm is represented by a data flow graph that exhibits the potential parallelism between operations. An operation is executed as soon as its inputs are available, and is infinitely repeated.
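The firing rule of such a data flow graph can be sketched with a small token-based simulation. This is only an illustration: the operation names `a`, `b`, `c`, the function `simulate` and the step-by-step execution model are assumptions, not part of SynDEx.

```python
def simulate(edges, operations, steps):
    """Fire every operation whose input edges all hold a token, step by step."""
    tokens = {edge: 0 for edge in edges}
    inputs = {op: [e for e in edges if e[1] == op] for op in operations}
    outputs = {op: [e for e in edges if e[0] == op] for op in operations}
    trace = []
    for _ in range(steps):
        # An operation without inputs (a source) is always ready.
        ready = [op for op in operations
                 if all(tokens[e] > 0 for e in inputs[op])]
        for op in ready:                 # consume one token per input edge
            for e in inputs[op]:
                tokens[e] -= 1
        for op in ready:                 # produce one token per output edge
            for e in outputs[op]:
                tokens[e] += 1
        trace.append(ready)
    return trace


# Hypothetical graph: a and b are independent, c needs both results.
trace = simulate([("a", "c"), ("b", "c")], ["a", "b", "c"], steps=3)
for step, fired in enumerate(trace):
    print(step, fired)
```

In this sketch `a` and `b` fire in the same step, which is the potential parallelism the graph exposes, and every operation keeps firing on later steps as long as its inputs are available.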
The architecture is also modeled by a graph whose vertices are operators (e.g. processors, DSPs, FPGAs) or media, and whose edges are the connections between them. Operators offer no internal computation parallelism, but the architecture as a whole exhibits the potential parallelism.
Adequation consists in performing the mapping and scheduling of the operations and data transfers onto the operators and the communication media. It is carried out by a heuristic which takes into account the durations of computations and inter-component communications. The result is a synchronized executive represented by a macro-code for each vertex of the architecture.
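To make the idea of adequation concrete, the following is a minimal greedy list-scheduling sketch. It is not the SynDEx heuristic: the graph, the durations, the operator names and the single communication cost `COMM_COST` are hypothetical values chosen for illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical data flow graph: operation -> set of predecessor operations.
PREDS = {
    "src": set(),
    "coding": {"src"},
    "modulation": {"coding"},
    "spreading": {"modulation"},
    "ifft": {"spreading"},
}

# Hypothetical durations (time units) of each operation on each operator.
DURATION = {
    "src":        {"DSP": 2, "FPGA": 2},
    "coding":     {"DSP": 6, "FPGA": 2},
    "modulation": {"DSP": 4, "FPGA": 1},
    "spreading":  {"DSP": 8, "FPGA": 2},
    "ifft":       {"DSP": 20, "FPGA": 5},
}

COMM_COST = 3  # assumed cost of moving data between two different operators


def adequation(preds, duration, comm_cost):
    """Map each operation to the operator on which it finishes earliest."""
    operator_free = {op: 0 for op in next(iter(duration.values()))}
    placed = {}  # operation -> (operator, start, end)
    for task in TopologicalSorter(preds).static_order():
        best = None
        for operator, length in duration[task].items():
            # Inputs are ready once every predecessor has finished, plus a
            # transfer delay when the predecessor ran on another operator.
            ready = max(
                (placed[p][2] + (comm_cost if placed[p][0] != operator else 0)
                 for p in preds[task]),
                default=0,
            )
            # An operator has no internal parallelism: wait until it is free.
            start = max(ready, operator_free[operator])
            end = start + length
            if best is None or end < best[2]:
                best = (operator, start, end)
        placed[task] = best
        operator_free[best[0]] = best[2]
    return placed


schedule = adequation(PREDS, DURATION, COMM_COST)
for task, (operator, start, end) in schedule.items():
    print(f"{task:<11} on {operator:<4} from {start:>2} to {end:>2}")
```

With these made-up numbers the source operation stays on the DSP and the rest of the chain moves to the FPGA, because the faster FPGA execution outweighs the one-off transfer cost.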
4. Run time reconfiguration considerations for adequacy
To reduce the difficulty of managing such a dynamically reconfigurable application and to provide a reliable implementation, the following issues must be addressed: automatic or manual partitioning of the application, specification of the dynamic constraints, and automatic generation of the C or VHDL core controller.
To support these features in SynDEx, the runtime reconfigurable parts of a component must be considered as vertices in the architecture graph. As shown in the example of Figure 1, the runtime reconfigurable parts of an FPGA (D1 and D2) and the fixed part (F1) can be represented as hardware operators of the architecture. D1 and D2 integrate both the dynamic modules and the control needed to manage the reconfiguration. An internal medium (IL) allows data exchanges between those parts.
Figure 1. Model of runtime reconfigurable parts of an FPGA with SynDEx
By identifying the dynamic parts of the application at a high level, we make the dynamic specification flexible and independent of the application coding style. A constraints file contains the definition of each dynamic module and its associated constraints (loading, unloading, area sharing, dynamic relations, exclusion).
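As an illustration only, the architecture model of Figure 1 and a set of dynamic constraints could be captured as plain data structures. The paper does not give the syntax of the constraints file, so the classes, field names and module names (`qpsk`, `qam16`) below are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Operator:
    name: str
    dynamic: bool  # True for a runtime reconfigurable part


@dataclass
class DynamicModule:
    name: str
    region: str                                  # dynamic operator hosting it
    excludes: set = field(default_factory=set)   # modules never loaded with it


# Architecture graph of Figure 1: F1 is fixed, D1 and D2 are dynamic, and
# the internal medium IL connects the three operators.
operators = {o.name: o for o in (
    Operator("F1", False), Operator("D1", True), Operator("D2", True))}
media = {"IL": {"F1", "D1", "D2"}}

# Hypothetical constraints: two modulations share D1 and exclude each other.
modules = [
    DynamicModule("qpsk", "D1", excludes={"qam16"}),
    DynamicModule("qam16", "D1", excludes={"qpsk"}),
]


def check_constraints(operators, modules):
    """Return a list of human-readable constraint violations."""
    errors = []
    for m in modules:
        op = operators.get(m.region)
        if op is None or not op.dynamic:
            errors.append(f"{m.name}: {m.region} is not a dynamic operator")
    # Modules sharing an area cannot be loaded together, so they must
    # declare a mutual exclusion.
    for i, a in enumerate(modules):
        for b in modules[i + 1:]:
            if a.region == b.region and not (
                    b.name in a.excludes and a.name in b.excludes):
                errors.append(f"{a.name}/{b.name}: share {a.region} "
                              "without mutual exclusion")
    return errors


print(check_constraints(operators, modules))
```

Checking such constraints before code generation is one way a tool could reject an inconsistent specification early, for example a dynamic module assigned to the fixed part F1.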
5. Automatic design generation: General FPGA synthesis scheme
Once the mapping and scheduling of the algorithm are performed, a macro-code is automatically generated for each vertex of the architecture, and each macro-code must be translated. The translation generates the VHDL code for both the static and the dynamic parts of an FPGA. The final FPGA design is based on several dedicated processes which control:
- the sequencing of communications,
- the sequencing of computations,
- the operator behaviour,
- the activation of the reading and writing phases of buffers.
To perform the reconfiguration of the dynamic part, we have chosen to divide this process into two sub-parts: a configuration manager and a protocol configuration builder. The configuration manager is in charge of selecting the configuration bitstream which must be loaded on the reconfigurable part, and does so by sending configuration requests. These requests are sent to the protocol configuration builder, which is in charge of constructing a valid reconfiguration stream in agreement with the protocol mode used (e.g. SelectMAP). Figure 2 shows different architectural solutions for partially reconfiguring an FPGA.
Figure 2. Different ways to reconfigure dynamic parts of an FPGA
Labels M and P show where the functionalities 'Configuration manager' and 'Protocol configuration builder', respectively, are implemented. The locations of these functionalities have a direct impact on the reconfiguration latency. Case a) shows standalone self-reconfiguration, where the fixed part of the FPGA reconfigures the dynamic area. Case b) shows the use of a processor to perform the reconfiguration. In this case the FPGA sends reconfiguration requests to the processor through hardware interrupts.
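The split between the two functionalities M and P can be sketched in software as follows. This is a simplified model under stated assumptions: the class names, the in-memory bitstream store and the byte-wide framing are hypothetical, and the framing does not reproduce the real SelectMAP protocol.

```python
class ProtocolBuilder:
    """P: turns a raw bitstream into words for the configuration port."""

    def __init__(self, word_bytes=1):
        self.word_bytes = word_bytes  # assumed port width in bytes

    def build(self, bitstream):
        step = self.word_bytes
        for i in range(0, len(bitstream), step):
            yield bitstream[i:i + step]


class ConfigurationManager:
    """M: decides which bitstream must be loaded and issues the request."""

    def __init__(self, bitstreams, builder, port):
        self.bitstreams = bitstreams  # module name -> partial bitstream
        self.builder = builder
        self.port = port              # callable that receives each word
        self.loaded = None

    def request(self, module):
        """Load `module` if needed and return the number of words written."""
        if module == self.loaded:
            return 0                  # already configured, nothing to do
        count = 0
        for word in self.builder.build(self.bitstreams[module]):
            self.port(word)
            count += 1
        self.loaded = module
        return count


# Hypothetical partial bitstreams, a few bytes long for the example.
written = []
manager = ConfigurationManager(
    {"qpsk": b"\x01\x02\x03", "qam16": b"\x0a\x0b"},
    ProtocolBuilder(),
    written.append,
)
print(manager.request("qpsk"), manager.request("qpsk"), manager.request("qam16"))
```

Because the manager only talks to the builder through `build`, either part could be moved to the fixed logic or to a processor, as in cases a) and b), without changing the other.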
The outputs of the SynDEx tool are the VHDL entities and processes corresponding to the dynamic and static modules of the final implementation. We synthesize the VHDL code of the static part and of each dynamic part separately in order to obtain separate netlists. The Xilinx Modular back-end flow is used to place and route each module and to generate the associated bitstream, resulting in a typical floorplan. Concerning the place and route constraints, reconfigurable modules have the following properties: the height of a module is always the full height of the device, and its width is at least four slices. The communications between static and dynamic parts use a special bus macro. This bus is a fixed, pre-routed routing bridge between the two sides. The current implementation of the bus macro uses eight 3-state buffers, whose position exactly straddles the dividing line between designs. All these constraints are fixed in a constraints file used during placement and routing. By using the SynDEx tool and the Xilinx Modular Design flow, we define a validated top-down methodology, depicted in Figure 3, which addresses the complete design flow.
Figure 3. Complete design flow: SynDEx tool and Modular Design. (Blocks: modelisation of static and reconfigurable parts, partitioning and reconfiguration manager; SynDEx VHDL generation and constraints file; Xilinx Modular Design synthesis, place and route, bitstreams; dynamic verification; final synthesis.)
6. Implementation example
Our top-down design flow has been tested on a complete application: a transmitter system for future wireless networks with a 4G air interface [3]. This transmitter is based on the MC-CDMA modulation scheme. The SynDEx algorithm graph, depicted in Figure ??, shows the basic numeric computation blocks of this transmitter. The block modulation performs either a QPSK or a QAM-16 modulation. This adaptive modulation is selected by the conditional entry Select, which defines the modulation of each OFDM symbol according to the signal-to-noise ratio. All other blocks are implemented in the static part of the FPGA.
We implement this transmitter on a prototyping board from Sundance Technology. This board is composed of one C6201 DSP and one Xilinx XC2V2000 FPGA.
Figure 4 shows the resulting design of the reconfigurable transmitter. The FPGA is divided into two parts. The first one is static and implements the non-reconfigurable logic; the second one takes 8% of the FPGA and is dedicated to the dynamic operator. As the modulation functionality is runtime reconfigurable, the design generated by SynDEx for this block is represented by the operator Op Dyn.
  #
(  
 % ) * + 	
       	 

    
   
       	 

     
   
     
    
       	 
 
    
     
    
 	   
    ,   
    ,  





	














	

















          !    
           	 
    
   
 
   
!  " 	 #          
  " "   # $ %
  " "   # & ' (
    -     
$ 
 
 "
	
 
#
$ %
#
&
'
(

)
 
	
*
 
+
$ 
 
 
 	
! 

+
$ 

(
$  %
$  


*

 	
 


 ,
-
. 
	


 
 )
	
 
 


 

 /
  &
    	 "
   ' (  #  #
  ) !     
    
       	 
  0  	 *   
    	 " (    	   
*    "  
  
 
 
 
 
 
 . 
   / 
  !    
        
              
    
    
            
             
   
 
 
 . 
 0   ' 
  !    
        
              
    
    
     
            
    
 . 
         
 %             
       
 #   
Figure 4. Reconfigurable MC-CDMA transmit-
ter architecture
The DSP can select the modulation performed by the dynamic part by sending the selection value to the module Interface IN OUT. This value is sent to the block modulation through the communication link LIO. The Interface IN OUT module receives DSP data from the SHB bus. The receiving process can be locked during partial reconfigurations thanks to the signal In Reconf. If the Select register value changes, the block modulation sends a reconfiguration request to the protocol builder component, which is then in charge of addressing the external memory and driving the ICAP. Reconfiguring Op Dyn takes about 4 ms.
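The effect of this latency, and of prefetching, can be estimated with a short calculation. The sketch below is a simplified model: the sequence of Select values and the lead time `lead_ms` are hypothetical, and it assumes that loading a configuration can fully overlap with other work once the next modulation is known.

```python
RECONF_MS = 4.0  # approximate reconfiguration time of Op Dyn reported above


def count_switches(selects):
    """Number of times the requested modulation differs from the previous one."""
    return sum(1 for prev, cur in zip(selects, selects[1:]) if prev != cur)


def stall_ms(selects, lead_ms=0.0, reconf_ms=RECONF_MS):
    """Total time spent waiting for reconfigurations.

    lead_ms is how long before it is needed that the next modulation is
    known, so that loading can start early (prefetching).
    """
    exposed = max(0.0, reconf_ms - lead_ms)  # latency that is not hidden
    return count_switches(selects) * exposed


# Hypothetical Select values for five successive OFDM symbols.
selects = ["QPSK", "QPSK", "QAM16", "QAM16", "QPSK"]
print(stall_ms(selects))               # no prefetching
print(stall_ms(selects, lead_ms=3.0))  # request issued 3 ms ahead
print(stall_ms(selects, lead_ms=5.0))  # latency fully hidden
```

Under these assumptions the two modulation changes cost 8 ms without prefetching, 2 ms when the request is issued 3 ms ahead, and nothing once the lead time exceeds the reconfiguration time.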
As shown in Table 1, the FPGA resource utilization needed to implement the QPSK and QAM-16 modulations is higher with a dynamic reconfiguration scheme. This overhead is due to the generation of a generic VHDL structure, based on the macro-code description, which is better suited to data path control of medium complexity. However, this gap decreases as the number of different reconfigurations needed grows, and with the ability of runtime reconfiguration to provide virtual hardware. The flexibility given by this methodology and the automatic VHDL generation can outweigh the hardware resource overhead.
Table 1. Fixed-Dynamic modulation implementation comparison

7. Conclusion
We have described a methodology flow to automatically manage the partially reconfigurable parts of an FPGA. It fully exploits the advantages offered by partially reconfigurable components. The AAA methodology and its associated tool SynDEx have been used to perform mapping and code generation for the fixed and dynamic parts of the FPGA. However, SynDEx's heuristic needs additional developments to optimize reconfiguration time. Furthermore, complex designs and architectures can support more than one dynamic part. The main advantage of this design flow is that it targets software components as well as hardware components to implement complex applications from a high-level functional description. This methodology can easily be used to introduce dynamic reconfiguration into an already developed fixed design, as well as for IP block integration.
References
[1] F. Berthelot, F. Nouvel, and D. Houzet. Design methodology for dynamically reconfigurable systems. JFAAA, Dijon, France, pages 47–52, January 2005.
[2] J. Harkin, T. McGinnity, and L. Maguire. Partitioning methodology for dynamically reconfigurable embedded systems. IEE Proc-Comput. Digit. Tech, 147, November 2000.
[3] S. Lenours, F. Nouvel, and J.-F. Helard. Design and implementation of mc-cdma systems for future wireless networks. EURASIP JASP, pages 1604–1615, August 2004.
[4] J. Mitola. Software radio architecture evolution: Foundations, technology tradeoffs, and architecture implications. IEICE Trans. Commn., E83-B(6):1165–1172, June 2000.
[5] J. Noguera and R. Badia. A hw/sw partitioning algorithm for dynamically reconfigurable architectures. Proc. Design Automation and Test in Europe, pp. 729–34, 2001.
[6] C. Sorel and Y. Lavarenne. From algorithm and architecture specifications to automatic generation of distributed real-time executives: a seamless flow of graphs transformations. Formal Methods and Models for Codesign Conference, France, pages 123–132, June 2003.
