Minimizing FPGA Reconfiguration Data at Logic Level by Raghuraman, Krishna et al.
Southern Illinois University Carbondale
OpenSIUC
Conference Proceedings Department of Electrical and Computer Engineering
3-2006
Minimizing FPGA Reconfiguration Data at Logic
Level
Krishna Raghuraman
Southern Illinois University Carbondale
Haibo Wang
Southern Illinois University Carbondale, haibo@engr.siu.edu
Spyros Tragoudas
Southern Illinois University Carbondale
Published in Raghuraman, K., Wang, H., & Tragoudas, S. (2006). Minimizing FPGA reconfiguration
data at logic level. Proceedings of the 7th International Symposium on Quality Electronic Design
(ISQED’06), 224. doi: 10.1109/ISQED.2006.87 ©2006 IEEE. Personal use of this material is
permitted. However, permission to reprint/republish this material for advertising or promotional
purposes or for creating new collective works for resale or redistribution to servers or lists, or to
reuse any copyrighted component of this work in other works must be obtained from the IEEE. This
material is presented to ensure timely dissemination of scholarly and technical work. Copyright and
all rights therein are retained by authors or by other copyright holders. All persons copying this
information are expected to adhere to the terms and constraints invoked by each author's copyright.
In most cases, these works may not be reposted without the explicit permission of the copyright
holder.
Recommended Citation
Raghuraman, Krishna; Wang, Haibo; and Tragoudas, Spyros, "Minimizing FPGA Reconfiguration Data at Logic Level" (2006).
Conference Proceedings. Paper 43.
http://opensiuc.lib.siu.edu/ece_confs/43
Minimizing FPGA Reconﬁguration Data at Logic Level
Krishna Raghuraman, Haibo Wang, and Spyros Tragoudas
Southern Illinois University, Carbondale, IL 62901
Abstract
A framework that relates the size of FPGA reconﬁgu-
ration data to the number of minterms of a specially con-
structed function is presented. Three techniques, variable
mapping optimization, circuit don’t-care modiﬁcation, and
look-up table input permutation, are developed to minimize
minterms of the special function. The method to integrate
the proposed techniques into FPGA design automation ﬂow
is discussed and experimental results are presented.
1. Introduction
Reconfigurable systems offer a number of advantages and
are steadily gaining popularity in various applications.
Currently, most reconfigurable systems are implemented on
FPGA platforms. For such systems, an important design
concern is to minimize the size of FPGA reconfiguration
bitstreams, and this problem has been widely investigated
at the high-level design stage. Studies in [1, 2, 3, 4, 5] present
different algorithms to perform temporal partitions with the
objective of reusing function units in different temporal par-
titions. Meanwhile, the reuse of FPGA routing patterns is
investigated in [6]. Relocation and defragmentation tech-
niques are presented in [7, 8]. The work in [9] minimizes
reconﬁguration cost by both using coarse-grain logic blocks
and optimizing scheduling and allocation schemes. Ad-
ditionally, other techniques proposed in literature include
conﬁguration caching [10], conﬁguration compression [11],
and column-based conﬁguration method [12].
Differing from previous approaches, this work addresses
the problem of minimizing reconﬁguration data at the logic
level. Techniques developed in this work take advantage of
two facts. First, FPGA conﬁguration data are partitioned
into frames, which are the smallest data units that can be in-
dividually accessed by conﬁguration commands [13]. Sec-
ond, a frame contains conﬁguration data for identical hard-
ware located in an FPGA column. To conveniently track
the size of reconﬁguration data, we introduce a framework
that links reconﬁguration frames to minterms of a specially
constructed function, which is referred to as the difference
function of a look-up table (LUT) column. Based on this
framework, three techniques, variable mapping optimiza-
tion, circuit don’t-care modiﬁcation, and LUT input order
permutation, are proposed to minimize minterms of LUT-
column difference functions.
The rest of the paper is organized as follows. Section 2
explains FPGA conﬁguration frames and describes how to
link reconﬁguration frames to minterms of LUT-column
difference functions. Motivational examples are also given
in this section to elucidate the proposed techniques. Sec-
tion 3 develops procedures to efﬁciently implement the pro-
posed techniques. Section 4 illustrates how to integrate the
proposed techniques into FPGA design automation ﬂow and
reports experimental results. The paper is concluded in Sec-
tion 5.
2. Preliminaries
In many LUT-based FPGAs, conﬁguration data are par-
titioned into frames [13, 14]. A frame contains conﬁgura-
tion data for hardware located in an FPGA column. The
structure of frames is explained using an FPGA LUT col-
umn shown in Figure 1. Assume that there are N LUTs in
the column and each LUT has 16 memory locations. The
16 memory locations of any LUT in the column belong to
16 different frames. In addition, each frame contains N
bits, corresponding to the same memory locations in the N
LUTs of the column. Since a frame is the smallest block
of conﬁguration data that can be accessed by conﬁguration
commands, the entire frame has to be written into the FPGA
even if we just want to change a single bit of an LUT during
partial reconﬁguration. This arrangement lessens the bur-
den of addressing LUT locations, consequently simplifying
hardware design and reducing the size of conﬁguration bit-
streams.
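The frame organization described above can be sketched in a few lines of Python. This is only an illustrative model, not the actual Virtex bitstream layout: the helper names `column_frames` and `frames_to_rewrite` are hypothetical, each LUT is modeled as a list of 16 bits, and frames and memory locations are indexed from 0.

```python
def column_frames(luts):
    """Transpose a LUT column: frame k collects bit k of every LUT."""
    locations = len(luts[0])
    return [[lut[k] for lut in luts] for k in range(locations)]


def frames_to_rewrite(before, after):
    """Indices of frames whose contents differ between two configurations."""
    old, new = column_frames(before), column_frames(after)
    return [k for k, (fo, fn) in enumerate(zip(old, new)) if fo != fn]


# A column of N = 4 LUTs, each with 16 memory locations (all zeros here).
before = [[0] * 16 for _ in range(4)]
after = [row[:] for row in before]
after[2][5] = 1  # change a single bit: memory location 5 of the third LUT

changed = frames_to_rewrite(before, after)
frame_width = len(column_frames(after)[5])
print(changed, frame_width)
```

Changing one LUT bit marks exactly one frame as modified, but that frame still carries one bit for each of the N LUTs in the column, which is why the whole frame has to be written.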
As frames are the primitive units of FPGA reconfiguration
data, reducing the size of FPGA reconfiguration bitstreams
is equivalent to minimizing the number of reconfiguration
frames. The latter minimization problem can be addressed
from two perspectives. First, it is desirable to have
each LUT require fewer frames during reconfiguration.
This leads to minimizing the difference between
data stored in each LUT before and after reconfiguration.
This problem can be tackled by both optimizing variable
Proceedings of the 7th International Symposium on Quality Electronic Design (ISQED’06) 
0-7695-2523-7/06 $20.00 © 2006 IEEE 
Authorized licensed use limited to: Southern Illinois University Carbondale. Downloaded on May 29, 2009 at 11:09 from IEEE Xplore.  Restrictions apply.
Figure 1. Virtex configuration frames. (Diagram: an LUT column containing
LUT1, LUT2, ..., LUT N, each with memory locations 1 to 16, next to its
frames of configuration data, Frame 1 to Frame 16. Frame k holds the
configuration bit for memory location k of every LUT in the column.)
mapping and modifying LUT don’t-care locations. Before
and after reconﬁguration, an LUT may implement two dif-
ferent functions that depend on two sets of logic variables.
Variable mapping refers to the rule that dictates which two
variables (one is an input of the ﬁrst function and the other
is an input of the second function) should be mapped to the
same LUT address input. Meanwhile, LUT don’t-care loca-
tions are memory locations whose addresses correspond to
circuit don’t-cares. Data stored in don’t-care locations can
be altered without changing circuit functionality. The second
perspective on minimizing reconfiguration frames is to
maximize the efficiency of each frame, which is measured
by how many bits of the frame contain data that actually
update LUT locations. For a given number of LUT locations
that need to be updated, higher frame efficiencies result
in fewer frames. The efficiencies of frames can be
improved by permuting LUT input orders, which relocates
LUT locations that need to be updated into common frames.
We ﬁrst introduce notations used in the paper. We refer
to logic functions implemented on an LUT before and af-
ter reconﬁguration as its initial and ﬁnal functions, respec-
tively. For a given LUT, denoted as LUTi, we use fi and
hi to represent its initial and ﬁnal functions. When it is
not necessary to distinctively identify LUTs, subscripts of
fi and hi are omitted for the sake of conciseness. Further-
more, for any given logic function l, we use lon, ldc, loff
to represent its on, don’t-care, and off sets, respectively.
Three examples will be given to illustrate how variable
mapping (Example 1), don’t-care locations (Example 2),
and LUT input orders (Example 3) can be utilized to reduce
reconfiguration frames. Without loss of generality,
three-input LUTs are used.
Example 1: For an LUT, assume f = a·b+c and h = x+y·z.
If the variable mapping is selected as {a ↔ x, b ↔ y, c ↔ z}
(the symbol ↔ indicates which two variables are mapped to the same
LUT address), two frames (indicated by asterisks) are needed for
this LUT as shown in Figure 2. However, if the variable mapping
is changed to {a ↔ y, b ↔ z, c ↔ x}, no frames are needed.
Address (A3 A2 A1)   000 001 010 011 100 101 110 111
Initial               0   1   0   1   0   1   1   1
Final                 0   0   0   1   1   1   1   1
Frames needed             *           *
Figure 2. LUT data without variable mapping optimization. (LUT1 inputs:
a (x) → A3, b (y) → A2, c (z) → A1; asterisks mark the frames needed.)

Example 2: For an LUT, assume f = a · b and h = a · c. As
shown in Figure 3, four frames are needed for this LUT. However,
if both functions have don’t-care sets fdc = a · c + a · b and
hdc = a · b + a · c respectively, then the initial and final functions
can be modified as fnew = hnew = a · b + a · c. No frames are
needed after the modification. In this example, both the initial and
final functions depend on the same set of logic variables. After the
variable mapping is fixed, f and h can have either the same or
different support sets.

Figure 3. LUT data without don’t-care modification. (Diagram: contents of
LUT1 at addresses 000 to 111 before and after reconfiguration, with inputs
a → A3, b → A2, c → A1; asterisks mark the four frames needed.)
Example 3: Assume LUT1 and LUT2 are in the same column
and f1 = a·(b+c), h1 = a+b, f2 = a·b+c, h2 = (a+b)·c. If the
input orders for both LUTs are {a → A3, b → A2, c → A1},
ﬁve frames are needed as shown in Figure 4(a). However, if the
input order for LUT2 is changed to {c → A3, a → A2, b →
A1}, only three frames are required as shown in Figure 4(b). Note
that LUT input order permutation is performed with ﬁxed variable
mappings. During the permutation, LUT input orders for both
initial and ﬁnal functions are changed in the same way.
Address (A3 A2 A1)   000 001 010 011 100 101 110 111

(a) Before LUT input permutation.
    LUT1 and LUT2: a → A3, b → A2, c → A1
LUT1 Initial          0   0   0   0   0   1   1   1
LUT1 Final            0   0   1   1   1   1   1   1
LUT2 Initial          0   1   0   1   0   1   1   1
LUT2 Final            0   0   0   1   0   1   0   1
Frames needed             *   *   *   *       *

(b) After LUT input permutation.
    LUT1: a → A3, b → A2, c → A1; LUT2: c → A3, a → A2, b → A1
LUT1 Initial          0   0   0   0   0   1   1   1
LUT1 Final            0   0   1   1   1   1   1   1
LUT2 Initial          0   0   0   1   1   1   1   1
LUT2 Final            0   0   0   0   0   1   1   1
Frames needed                 *   *   *

Figure 4. LUT data with input permutation.
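To make the notion of frame efficiency concrete, the following Python sketch recomputes the frame counts of Example 3 and an aggregate efficiency figure. The aggregate definition used here (updated LUT bits divided by the total number of bits in the frames that are written) is an assumption made for illustration, and the helper names are hypothetical. LUT contents are written as strings listing addresses 000 to 111 from left to right.

```python
def diff_positions(initial, final):
    """Addresses whose stored bit changes between two LUT content strings."""
    return {k for k, (x, y) in enumerate(zip(initial, final)) if x != y}


def frame_stats(column):
    """column: list of (initial, final) content strings, one pair per LUT."""
    per_lut = [diff_positions(i, f) for i, f in column]
    frames = set().union(*per_lut)            # frames that must be written
    updated_bits = sum(len(d) for d in per_lut)
    # Assumed aggregate efficiency: useful bits / bits carried by the frames.
    efficiency = updated_bits / (len(frames) * len(column)) if frames else 1.0
    return len(frames), efficiency


# Example 3 before and after permuting the inputs of LUT2.
before = [("00000111", "00111111"), ("01010111", "00010101")]
after = [("00000111", "00111111"), ("00011111", "00000111")]

frames_before, eff_before = frame_stats(before)
frames_after, eff_after = frame_stats(after)
print(frames_before, eff_before)
print(frames_after, eff_after)
```

The number of LUT bits that change is the same in both cases (five); the permutation only moves them into fewer, better-filled frames.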
In the search for solutions to the proposed minimization
problem, we are more interested in how logic values (0 or
1) are stored in LUTs than in which functions are actually
implemented on the LUTs. For this reason, we introduce
the concept of the LUT mapping function. LUT mapping
functions are LUT functions expressed in terms of LUT ad-
dress variables. For an LUT whose implemented logic func-
tion is given, we can obtain its mapping function through
substituting logic variables by their associated address vari-
ables. For example, the initial function of LUT1 in Figure 4
is a · (b + c). Substituting logic variables by their associ-
ated LUT address variables, we have its mapping function
as A3 · (A2 + A1). The mapping function of an LUT rep-
resents all the LUT locations that store logic 1. Since each
LUT is associated with two logic functions (f and h), there
are two mapping functions for each LUT as well. Due to
the close relation between LUT logic functions and their
corresponding mapping functions, we also use f and h to
represent the initial and ﬁnal mapping functions of an LUT,
respectively.
Based on LUT mapping functions, we deﬁne the LUT
difference function as:
D = f ⊕ h (1)
In addition, the difference function of an LUT column is
deﬁned as:
\[ D = \bigcup_{i=1}^{N} D_i \tag{2} \]
where, N is the total number of LUTs in the given column
and Di is the LUT difference function of LUTi. In the com-
putation of D, address variables with the same name but lo-
cated in different LUTs (e.g. A1 of LUTi and LUTj) are
treated as the same variable, since they function as coordi-
nates to indicate LUT locations containing logic 1. There-
fore, function D depends on only p variables: Ap, Ap−1,
· · ·, A1, where p is the number of inputs of the LUTs in the
column. The number of minterms in D is equal to the number
of frames required for reconfiguring the entire LUT column,
because each minterm of D identifies one LUT address, and
hence one frame, at which at least one LUT in the column
changes. For this reason, the phrase "minimizing LUT
difference functions" is used in the rest of the paper as
shorthand for minimizing the number of minterms in LUT
difference functions.
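As an illustration of Equations 1 and 2, the sketch below builds the mapping functions of Example 3 (with the input orders of Figure 4(a)) as truth-table bitmasks and counts the minterms of the column difference function. The bitmask encoding and the helper names `mapping_table` and `column_difference` are assumptions chosen for this example.

```python
def mapping_table(func, p):
    """Truth table of a mapping function; bit `addr` is the LUT content there."""
    table = 0
    for addr in range(1 << p):
        # Address bits ordered (A_p, ..., A_1), most significant first.
        bits = [(addr >> k) & 1 for k in reversed(range(p))]
        if func(*bits):
            table |= 1 << addr
    return table


def column_difference(pairs, p):
    """Union over all LUTs of D_i = f_i XOR h_i (Equations 1 and 2)."""
    d = 0
    for f, h in pairs:
        d |= mapping_table(f, p) ^ mapping_table(h, p)
    return d


# Initial and final mapping functions of LUT1 and LUT2 in Example 3.
pairs = [
    (lambda A3, A2, A1: A3 and (A2 or A1), lambda A3, A2, A1: A3 or A2),
    (lambda A3, A2, A1: (A3 and A2) or A1, lambda A3, A2, A1: (A3 or A2) and A1),
]

D = column_difference(pairs, 3)
frames = [format(addr, "03b") for addr in range(8) if (D >> addr) & 1]
print(frames, len(frames))
```

Each listed address is one minterm of D and therefore one frame to be written, which reproduces the five frames of Figure 4(a).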
3. Proposed Techniques
As discussed earlier, FPGA reconfiguration data can be
minimized by optimizing variable mapping, modifying
LUT don’t-care locations, and permuting LUT input orders.
The problem of ﬁnding optimal variable mappings is easy
since it can be solved separately for each LUT. Techniques
to perform the other two optimization procedures are dis-
cussed in the following.
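Because variable mapping can be decided independently for each LUT, a simple exhaustive search over the p! possible mappings is sufficient for small p. The following Python sketch applies such a search to Example 1. It is a minimal illustration rather than the authors' implementation: it assumes both functions use all p inputs, keeps the wiring of f fixed, and uses hypothetical helper names.

```python
from itertools import permutations


def contents(func, wiring, p):
    """LUT contents when func's k-th variable is wired to address input wiring[k].

    Address inputs are indexed 0..p-1 for (A_p, ..., A_1).
    """
    rows = []
    for addr in range(1 << p):
        bits = [(addr >> k) & 1 for k in reversed(range(p))]
        rows.append(int(bool(func(*(bits[w] for w in wiring)))))
    return rows


def mapping_cost(f, h, wiring, p):
    """Number of LUT locations that differ when h uses the given wiring."""
    f_rows = contents(f, tuple(range(p)), p)  # f stays wired a->A3, b->A2, c->A1
    return sum(x != y for x, y in zip(f_rows, contents(h, wiring, p)))


def best_variable_mapping(f, h, p):
    """Try every wiring of h and keep the one with the fewest differences."""
    best = min(permutations(range(p)), key=lambda w: mapping_cost(f, h, w, p))
    return best, mapping_cost(f, h, best, p)


f = lambda a, b, c: (a and b) or c   # initial function of Example 1
h = lambda x, y, z: x or (y and z)   # final function of Example 1

identity_cost = mapping_cost(f, h, (0, 1, 2), 3)   # {a<->x, b<->y, c<->z}
best, best_cost = best_variable_mapping(f, h, 3)
print(identity_cost, best, best_cost)
```

The wiring returned for Example 1 sends x to A1, y to A3, and z to A2, which is the mapping {a ↔ y, b ↔ z, c ↔ x} from the example.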
3.1 Modifying LUT don’t-care locations
In general, expressions for f and h of an LUT contain
their entire on sets (fon and hon) and portions of their don’t-
care sets (fdc and hdc). We use fdc and fdc† to distinguish
don’t-cares of f that are included and excluded in the ex-
pression of f . Similar notations apply to function h. Then,
we have f = fon + fdc and h = hon + hdc. The LUT
difference function can be written as:
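\[
\begin{aligned}
D &= f \cdot \overline{h} + \overline{f} \cdot h \\
  &= f_{on} \cdot h_{off} + f_{on} \cdot h_{dc\dagger} + f_{dc} \cdot \overline{h} \\
  &\quad + h_{on} \cdot f_{off} + h_{on} \cdot f_{dc\dagger} + h_{dc} \cdot \overline{f}
\end{aligned}
\tag{3}
\]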
Obviously, fon · hoff + hon · foff constitutes the lower
bound of the difference between f and h. The other terms
on the right-hand-side of Equation 3 can be eliminated by
assigning proper values to LUT don’t-care locations. This
is formally stated by the following corollary.
Corollary 1 The number of minterms of an LUT difference
function is minimized if the initial and ﬁnal functions of the
LUT are modiﬁed as follows:
\[ f_{new} = f + f_{dc} \cdot h - f_{dc} \cdot \overline{h} - f_{dc} \cdot h_{dc} \tag{4} \]
\[ h_{new} = h + h_{dc} \cdot f - h_{dc} \cdot \overline{f} - f_{dc} \cdot h_{dc} \tag{5} \]
In the above equations, the symbols +, ·, and − represent set
union, intersection, and subtraction operations. For an LUT,
adding a minterm to its function implies changing the value
stored in the LUT location that corresponds to the minterm
to logic 1. Meanwhile, subtracting a minterm is the same as
putting logic 0 to the corresponding LUT location. It is easy
to show fnew⊕hnew = fon ·hoff +hon ·foff and, hence,
prove the corollary. By performing function modiﬁcation
according to the above corollary, minterms added to f are:
\[ \mu^{+} = f_{dc} \cdot h - f_{dc} \cdot h_{dc} - f \tag{6} \]
Similarly, minterms that are subtracted from f can be ex-
pressed as:
\[ \mu^{-} = f_{dc} \cdot \overline{h} + f_{dc} \cdot h_{dc} - \overline{f} \tag{7} \]
The total LUT locations that are altered can be expressed by
their corresponding minterms as:
μ = μ+ + μ− (8)
Note that a similar set of equations apply to function h.
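Corollary 1 can be checked mechanically on small cases. The Python sketch below does so under explicitly stated assumptions that are not spelled out in the extracted text: sets of LUT locations are encoded as integer bitmasks, the don't-care sets in Equations 4 and 5 are taken to be the complete don't-care sets of f and h, the operations are applied from left to right, and the second occurrences of h and f in those equations are complemented. The function names are hypothetical.

```python
from itertools import product


def modify(f, h, f_dc, h_dc, size):
    """Apply Equations 4 and 5 to bitmask sets over `size` LUT locations."""
    full = (1 << size) - 1
    not_f, not_h = full & ~f, full & ~h
    both_dc = f_dc & h_dc
    # f + f_dc.h - f_dc.not(h) - f_dc.h_dc, evaluated left to right.
    f_new = (f | (f_dc & h)) & ~(f_dc & not_h) & ~both_dc
    h_new = (h | (h_dc & f)) & ~(h_dc & not_f) & ~both_dc
    return f_new, h_new


def lower_bound(f, h, f_dc, h_dc, size):
    """f_on.h_off + h_on.f_off, the part of the difference that cannot be removed."""
    full = (1 << size) - 1
    f_on, f_off = f & ~f_dc, full & ~f & ~f_dc
    h_on, h_off = h & ~h_dc, full & ~h & ~h_dc
    return (f_on & h_off) | (h_on & f_off)


# Exhaustive check over all assignments for a toy LUT with two locations.
size = 2
all_match = all(
    (lambda fn, hn: fn ^ hn)(*modify(f, h, f_dc, h_dc, size))
    == lower_bound(f, h, f_dc, h_dc, size)
    for f, h, f_dc, h_dc in product(range(1 << size), repeat=4)
)
print(all_match)
```

Under these assumptions, the modified functions differ exactly on the locations where one function is in its on set and the other is in its off set.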
For an LUT, its don’t-cares consist of controllability
don’t-cares (CDCs) and observability don’t-cares (ODCs).
CDCs are signal patterns that never appear at the LUT in-
puts. Meanwhile, ODCs are deﬁned as LUT input patterns
representing scenarios in which the LUT output cannot be
observed at the circuit primary outputs. Because CDC sets of
different LUTs are independent of each other, modifying LUT
locations addressed by CDC patterns can be performed
individually for each LUT. This simple process always leads
to the globally optimal solution when only CDCs are under
consideration. In contrast, modifying ODC locations is a
complicated process. When ODC locations of an LUT are
modified, ODCs of other LUTs may change. Although it is
theoretically possible to re-compute ODCs for the rest of
the LUTs after each LUT is modified, this approach is
impractical because of its computational complexity. To
avoid repeated re-computation of LUT ODCs, this section
presents an efficient method to compute LUT ODCs that can
be modified simultaneously, which are referred to as
compatible ODCs (CODCs). Several techniques [15, 16, 17, 18]
have been proposed to address a similar problem in logic
synthesis. The method presented here is similar to the
approaches discussed in [16, 17] in that it computes CODC
upper bounds. However, it differs from the previous
approaches in two respects. First, ODCs covered by their
upper bounds are further restricted according to Equation 8.
Second, a heuristic method is used to determine the order in
which LUTs are processed.
The simultaneous optimization for multiple vertices
(gates or LUTs), denoted as y1, y2, · · · , yn, can be modeled
by n perturbation variables δ1, δ2, · · · , δn [15]. In this ap-
plication, δi represents ODCs that are added or subtracted
from the function of LUTi. Let DCext represent external
don’t-cares, ODCyi denote ODCs at vertex yi, and sym-
bol | represent generalized cofactor operations. A sufﬁ-
cient condition for the equivalence between the perturbed
and original circuits is [16]:
\[ \delta_i \mathbf{1} \subseteq \mathbf{DC}_{ext} + \mathbf{ODC}^{y_i}\big|_{\delta'_1,\cdots,\delta'_{i-1}}, \quad i = 1, 2, \cdots, n \tag{9} \]
In the above expression, don’t-cares with respect to differ-
ent primary outputs are represented in the vector format and
1 = (1, 1, · · · , 1). The above condition gives a series of
upper bounds (with respect to different primary outputs)
for δi, which depend on ODCyi and previous perturbations.
Let m denote the number of circuit primary outputs, and let
$DC_{ext_j}$ and $ODC^{y_i}_j$ denote the external and observability
don’t-care sets at vertex yi with respect to primary output j,
respectively. The global upper bound, which is in the scalar
format, can be obtained as:
\[ \zeta_i(\delta_1, \cdots, \delta_{i-1}) = \bigcap_{j=1}^{m} \left( DC_{ext_j} + ODC^{y_i}_j\big|_{\delta'_1,\cdots,\delta'_{i-1}} \right), \quad i = 1, 2, \cdots, n \tag{10} \]
As FPGA reconfiguration data for an FPGA column depend on
all the LUT functions of the column, it is imperative to
optimize all LUT functions of a column simultaneously.
In addition, LUT difference functions with large numbers
of minterms are the most likely to determine the overall
number of reconfiguration frames. Therefore, such LUTs
should be given high priority during the optimization.
Based on this observation, the proposed procedure first
ranks all the LUTs according to the number of minterms in
their difference functions. LUTs whose difference functions
contain more minterms are given higher ranks. Following the
descending order of LUT ranks, ODCs are pruned in accordance
with two constraints. The first constraint is Equation 8,
which eliminates ODCs that do not help minimize LUT
difference functions. The second constraint is the upper
bound given in Equation 10, which is used to guarantee the
correctness of the resulting circuit.
The proposed procedure is further elaborated as follows.
For the convenience of description, we re-label LUTs after
ranking such that LUTs with higher ranks are given smaller
index numbers. For example, N LUTs arranged in the de-
scending order of their ranks will be listed with their new
labels as LUT1, LUT2, · · ·, LUTN . Thus, LUT1 is the ﬁrst
LUT to be processed. When the initial function of LUT1
is under consideration, LUT locations whose values are de-
sired to be altered are:
\[ \delta^{f}_{1} = \mu^{f}_{1} \tag{11} \]
In the above and following equations, we use superscripts
to indicate the function on which δ and μ are defined, and
subscripts to indicate the LUT that δ and μ are associated
with. Since LUT1 is the first LUT to be processed,
$\delta^{f}_{1}$ is not subject to the second constraint. However, when
LUTk (k ≠ 1) is processed, we have to apply both constraints.
This leads to:
\[ \delta^{f}_{k} = \mu^{f}_{k} \cdot \zeta^{f}_{k}(\delta_1, \cdots, \delta_{k-1}) \tag{12} \]
The pseudo-code of the proposed CODC computation pro-
cedure is given in Figure 5. Note that CODCs for both LUT
initial and ﬁnal functions are computed simultaneously in
the procedure.
ODC_OPT( LUTs ) {
    Compute ODCs for all LUTs regarding their initial and final functions
    Rank all LUTs and re-label them according to the descending order
        of their ranking
    δ^f_1 = μ^f_1 ;  δ^h_1 = μ^h_1
    for k = 2 to N
        δ^f_k = μ^f_k · ζ^f_k(δ^f_1, · · ·, δ^f_{k−1})
        δ^h_k = μ^h_k · ζ^h_k(δ^h_1, · · ·, δ^h_{k−1})
}
Figure 5. CODC computation procedure.
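The control flow of the procedure in Figure 5 can be sketched in Python as follows. This skeleton covers only the ranking and pruning steps for one function (the same loop would be run for the initial and the final functions); computing ODCs and the upper bound of Equation 10 is outside its scope and is represented by a caller-supplied function. All names (`codc_opt`, `upper_bound`, `toy_bound`) and the bitmask encoding are hypothetical, and the toy bound is a stand-in with no circuit meaning.

```python
def codc_opt(mu, diff_minterms, upper_bound):
    """Rank LUTs and prune their desired modifications.

    mu[i]            -- bitmask of locations LUT i would like to alter (Equation 8)
    diff_minterms[i] -- minterm count of the difference function of LUT i
    upper_bound      -- callable(i, earlier) standing in for Equation 10,
                        given the perturbations already fixed for earlier LUTs
    """
    # LUTs whose difference functions have more minterms are processed first.
    order = sorted(range(len(mu)), key=lambda i: diff_minterms[i], reverse=True)
    delta, earlier = {}, []
    for rank, i in enumerate(order):
        if rank == 0:
            delta[i] = mu[i]                            # Equation 11
        else:
            delta[i] = mu[i] & upper_bound(i, earlier)  # Equation 12
        earlier.append(delta[i])
    return order, delta


def toy_bound(i, earlier):
    """Toy stand-in: forbid locations already perturbed in earlier LUTs."""
    used = 0
    for d in earlier:
        used |= d
    return 0b1111 & ~used


order, delta = codc_opt([0b0110, 0b0011, 0b1100], [2, 5, 3], toy_bound)
print(order, delta)
```

Separating the bound computation from the loop keeps the sketch independent of any particular don't-care engine.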
3.2 Permuting LUT input orders
By deﬁning LUT-column difference function D, we re-
late the number of reconﬁguration frames to the number of
minterms in D. Thus, the optimal LUT input orders should
minimize the number of minterms in the corresponding column
difference function. Although it is possible to solve this
problem through exhaustive enumeration, the large search
space, up to (p!)^N combinations of input orders for a
column of N LUTs with p inputs each, makes such an approach
impractical. This paper presents a search procedure based on
a greedy algorithm. Assuming that each LUT has p inputs and
that N LUTs are in the given column, the major steps of the
procedure are described in Figure 6. It first constructs LUT
difference functions (line 3) and, concurrently, finds the
LUT that requires the fewest reconfiguration frames (lines
4 ∼ 5). The input order of that LUT will not be permuted,
and is used as a reference when permuting other LUT input
orders. Function MintermCount, used in line 3, counts
the number of minterms of its operand. After the reference
LUT is selected, the algorithm sequentially picks an
unprocessed LUT and permutes its inputs. The permutation
procedure is sketched in lines 9 to 18. It exhaustively tries
all the possible permutations of that LUT and picks the one
that results in the smallest increase in the number of
minterms of the newly constructed union function (Dtmp).
The proposed procedure evaluates (p!) · (N − 1) permutations
in total, which is significantly fewer than the exhaustive
enumeration method requires.
1  min_tmp = 2^p + 1
2  for i = 1 to N
3      D[i] = f_i ⊕ h_i; min = MintermCount(D[i])
4      if min < min_tmp
5          min_tmp = min; min_index = i; D = D[i]
6  for i = 1 to N
7      if i ≠ min_index
8          D = permute(D, D[i])
9  permute( D, D[i] ) {
10     min_tmp = 2^p + 1
11     for each permutation order of LUT_i
12         derive new function D′[i] according to the new input order
13         D_tmp = D ∪ D′[i]
14         min = MintermCount( D_tmp )
15         if min < min_tmp
16             min_tmp = min; D_min = D_tmp
17             Order[LUT_i] = current permutation order
18     return D_min }
Figure 6. LUT input permutation procedure.
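A compact Python rendering of the greedy procedure in Figure 6 is given below, applied to the data of Example 3. It is a sketch under assumptions: difference functions are encoded as integer bitmasks (bit `addr` set means the frame at that address is needed), a permutation is a tuple mapping old address bit k to position `perm[k]`, and all function names are hypothetical.

```python
from itertools import permutations


def table_from_row(row):
    """Bitmask whose bit `addr` is the LUT content at address `addr`."""
    return sum(1 << addr for addr, bit in enumerate(row) if bit == "1")


def minterm_count(table):
    return bin(table).count("1")


def permute_table(table, perm, p):
    """Move old address bit k to position perm[k] in every minterm."""
    out = 0
    for addr in range(1 << p):
        if (table >> addr) & 1:
            new_addr = 0
            for k in range(p):
                new_addr |= ((addr >> k) & 1) << perm[k]
            out |= 1 << new_addr
    return out


def greedy_input_permutation(diffs, p):
    """Fix the LUT with the fewest difference minterms, then add the others."""
    ref = min(range(len(diffs)), key=lambda i: minterm_count(diffs[i]))
    column, orders = diffs[ref], {ref: tuple(range(p))}
    for i, d in enumerate(diffs):
        if i == ref:
            continue
        # Try all p! input orders of LUT i and keep the smallest union.
        best = min(permutations(range(p)),
                   key=lambda perm: minterm_count(column | permute_table(d, perm, p)))
        orders[i] = best
        column |= permute_table(d, best, p)
    return column, orders


# LUT contents of Example 3, addresses 000..111 from left to right.
rows = [("00000111", "00111111"), ("01010111", "00010101")]
diffs = [table_from_row(f) ^ table_from_row(h) for f, h in rows]

unpermuted = minterm_count(diffs[0] | diffs[1])
column, orders = greedy_input_permutation(diffs, 3)
print(unpermuted, minterm_count(column), orders)
```

On this data the sketch picks LUT2 as the reference, since its difference function has fewer minterms, and therefore permutes LUT1 instead of LUT2 as Figure 4(b) does. It still reduces the column from five frames to three.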
4. Experimental Results
This section describes how the proposed techniques can
be integrated into FPGA design automation ﬂow, and re-
ports experimental results. The current FPGA design au-
tomation ﬂow is sketched by the solid arrows in Figure 7(a).
For reconﬁguration applications, FPGA implementations of
both initial and ﬁnal circuits are generated following the
same ﬂow. The reconﬁguration bitstreams, which change
FPGA hardware from the initial circuit to its ﬁnal circuit,
are produced by comparing the initial and ﬁnal FPGA im-
plementations. The proposed optimization procedures can
be added into the design ﬂow between placement and rout-
ing steps as shown in Figure 7(b). After the placement
phases of both the initial and ﬁnal circuits, the initial and
ﬁnal functions of all the LUTs become available. Hence,
the proposed techniques can be applied to optimize variable
mappings, modify LUT don’t-care locations, and ﬁnd op-
timal LUT input orders. After this, FPGA routing can be
performed accordingly.
Figure 7. Integrating the proposed techniques into FPGA design flow.
(Diagram: (a) circuit description → logic synthesis & technology mapping →
placement & routing → generating bitstreams → FPGA, with the proposed
optimization attached to the placement & routing step by dashed arrows;
(b) logic synthesis & technology mapping → placement → proposed
optimization → routing → generating bitstreams.)
It is often difficult to have direct access to results
produced by the FPGA placement procedure. In this case,
our method can be integrated as indicated by the dashed
arrows in Figure 7(a). After the placement and routing (P&R)
phases of both the initial and ﬁnal circuits, we let the FPGA
tool write P&R results into structural VHDL ﬁles. The ba-
sic components in these VHDL ﬁles are LUTs. In addi-
tion, we let the FPGA tool generate location constraints for
each LUT in VHDL ﬁles according to P&R results. The
VHDL ﬁles along with the constraint ﬁles provide informa-
tion about LUTs in the same column and their initial and
ﬁnal functions. After applying the proposed optimization
procedures, LUT init values (that represent LUT locations
storing logic 1) are updated and new constraints regarding
LUT input orders are added into constraint ﬁles. The up-
dated VHDL and constraint ﬁles are fed to the P&R module
in the FPGA tool to re-route FPGA circuits.
We experimented with the latter integration scenario.
Due to the lack of suitable partial reconﬁguration bench-
mark circuits, we use ISCAS85 benchmark circuits as initial
FPGA circuits. We derive ﬁnal FPGA circuits by perform-
ing random function modiﬁcation on the initial circuits. In
this process, we ﬁrst deﬁne a set of functions, denoted as
g1, g2, · · · gi, which depend on variables A4, A3, A2, A1
(since four-input LUTs are used in our experiments). Then,
we derive final LUT functions by performing either a COMPOSE
or an INTERSECT operation, using the original
LUT function and one function selected from g1, g2, · · · gi
as operands. COMPOSE and INTERSECT are function
manipulation operations defined in the CUDD package, which is
used in the implementation of our optimization procedures.
The selection of the operation (COMPOSE or INTERSECT)
and of the operand function (g1, g2, · · · gi) is fully randomized.
The experiments are conducted on a Xilinx Virtex 1000
platform. The obtained results are summarized in Table 1.
The second column of the table lists the number of LUTs
assigned to each column. Several column configurations are
investigated in the experiment. The third column records
the number of frames required without any of the proposed
optimizations. The fourth column gives the number of frames
contained in the reconfiguration data when only the LUT input
order permutation technique is applied, and the fifth column
gives the corresponding percentage of frame reduction. With
both don’t-care modification and LUT input order permutation
applied, the resulting numbers of reconfiguration frames and
the corresponding savings (in percent) are given in the sixth
and seventh columns, respectively. The results show that the
proposed techniques can reduce reconfiguration frames by more
than 20% on average.
Table 1. Comparing reconfiguration frames.

                       W/o opt.   Inp. perm. only   DC opt. & inp. perm.
Circuit   #LUTs/col.   #Frm.      #Frm.   R(%)      #Frm.   R(%)
C432       3            274        244    11         236    14
C432       4            238        212    11         208    13
C432       8            166        136    18         136    18
C1355      3            142        137     4         117    18
C1355      6            124        111    10          99    20
C1355      9            106         95    10          83    22
C1908      3            255        239     6         143    44
C1908      6            198        175    12         119    40
C1908      9            172        141    18          91    47
C2670      3            430        389    10         322    25
C2670      6            334        286    14         251    25
C2670      9            276        232    16         204    26
C3540      6            771        659    15         580    25
C3540      9            632        506    20         452    28
C3540     12            567        409    28         377    34
C5315      9            769        617    20         529    31
C5315     12            626        574     8         505    19
C5315     15            542        440    19         402    26
C6288     12           1168        964    17         849    27
C6288     15            986        826    16         786    20
C6288     18            852        712    16         672    21
C7552     12            967        780    19         686    29
C7552     15            814        660    19         611    25
C7552     18            693        570    18         539    22
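The reduction percentages in Table 1 follow directly from the frame counts, and the average can be recomputed with a short Python script. The frame counts below are copied from the table; the variable names and the choice of an unweighted average over the 24 configurations are assumptions of this sketch.

```python
# (circuit, LUTs per column, frames without optimization,
#  frames with input permutation only,
#  frames with don't-care modification and input permutation)
rows = [
    ("C432", 3, 274, 244, 236), ("C432", 4, 238, 212, 208),
    ("C432", 8, 166, 136, 136), ("C1355", 3, 142, 137, 117),
    ("C1355", 6, 124, 111, 99), ("C1355", 9, 106, 95, 83),
    ("C1908", 3, 255, 239, 143), ("C1908", 6, 198, 175, 119),
    ("C1908", 9, 172, 141, 91), ("C2670", 3, 430, 389, 322),
    ("C2670", 6, 334, 286, 251), ("C2670", 9, 276, 232, 204),
    ("C3540", 6, 771, 659, 580), ("C3540", 9, 632, 506, 452),
    ("C3540", 12, 567, 409, 377), ("C5315", 9, 769, 617, 529),
    ("C5315", 12, 626, 574, 505), ("C5315", 15, 542, 440, 402),
    ("C6288", 12, 1168, 964, 849), ("C6288", 15, 986, 826, 786),
    ("C6288", 18, 852, 712, 672), ("C7552", 12, 967, 780, 686),
    ("C7552", 15, 814, 660, 611), ("C7552", 18, 693, 570, 539),
]


def reduction(base, new):
    """Frame reduction in percent relative to the unoptimized count."""
    return 100.0 * (base - new) / base


perm_only = [reduction(base, perm) for _, _, base, perm, _ in rows]
combined = [reduction(base, both) for _, _, base, _, both in rows]

avg_perm = sum(perm_only) / len(perm_only)
avg_combined = sum(combined) / len(combined)
print(f"input permutation only: {avg_perm:.1f}%")
print(f"don't-care modification + permutation: {avg_combined:.1f}%")
```

With this unweighted average, the combined techniques come out at roughly 26%, consistent with the "more than 20%" stated in the text, while input permutation alone averages roughly 15%.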
5. Concluding Remarks
This paper presents a methodology to minimize FPGA
reconfiguration data at the logic level. The methodology is
based on a framework that links the size of reconfiguration
data to the number of minterms contained in LUT-column
difference functions. It comprises three techniques: variable
mapping optimization, don’t-care location modification, and
LUT input order permutation. To implement the proposed
techniques efficiently, two heuristic algorithms are developed
for computing compatible don’t-care locations and for
searching a large space of LUT input orders. The developed
techniques can be combined with other methods that minimize
FPGA reconfiguration data at high levels to further reduce
FPGA reconfiguration cost.
References
[1] J. M. Cardoso, “On Combining Temporal Partitioning and Shar-
ing of Function Units in Compilation for Reconﬁgurable Architec-
tures,” IEEE Trans. on Computers, vol. 52, no. 10, pp. 1362–1375,
2003.
[2] M. Meribout and M. Motomura, “Efﬁcient Metrics and High-Level
Synthesis for Dynamically Reconﬁgurable Logic,” IEEE Trans. on
VLSI, vol. 12, no. 6, 2004.
[3] M. Kaul and R. Vemuri, “Temporal Partitioning Combined with De-
sign Space Exploration for Latency Minimization of Run-Time Re-
conﬁgured Designs,” in Proc. DATE, pp. 202–209, 1999.
[4] M. Kaul and R. Vemuri, “An Automated Temporal Partitioning and
Loop Fission Approach for FPGA Based Reconﬁgurable Synthesis
of DSP Applications,” in Proc. DAC, pp. 616–622, 1999.
[5] K. M. GajjalaPurna and D. Bhatia, “Partitioning in time: a paradigm
for reconﬁgurable computing,” in Proc. ICCD, pp. 340–345, 1998.
[6] D. Rakhmatov and S. B.K. Vrudhula, “Minimizing routing conﬁgu-
ration cost in dynamically reconﬁgurable FPGAs,” in Proc. Parallel
and Distributed Processing Symp., pp. 1481–1488, 2001.
[7] K. Compton, J. Cooley, and S. Knol, “Configuration relocation and
defragmentation for reconfigurable computing,” in Proc. IEEE
Symp. FPGA Custom Computing Machines, pp. 79–80, 2000.
[8] K. Compton, Z. Li, S. Knol, and S. Hauck, “Configuration relocation
and defragmentation for reconfigurable computing,” IEEE Trans.
on VLSI, vol. 10, pp. 209–220, 2002.
[9] Z. Huang and S. Malik, “Managing dynamic reconﬁguration over-
head in SoC design using reconﬁgurable datapaths and optimized
interconnect networks,” in Proc. DATE, pp. 13–16, 2001.
[10] Z. Li, K. Compton, and S. Hauck, “Conﬁguration Caching for FP-
GAs,” in Proc. IEEE Symp. FPGA Custom Computing Machines,
pp. 22–36, 2000.
[11] S. Hauck, Z. Li, and E. Schwabe, “Conﬁguration Compression for
the Xilinx XC6200 FPGA,” in Proc. FPGA Custom Computing Ma-
chines, 1998.
[12] S. Mitra, W. Huang, N. Saxena, S. Yu, and E. J. McCluskey,
“Reconfigurable Architecture for Autonomous Self-Repair,” IEEE
Design and Test of Computers, vol. 21, no. 2, pp. 228–240, 2004.
[13] XILINX Inc., Virtex Series Conﬁguration Architecture User Guide,
2003.
[14] XILINX Inc., Two Flows for Partial Reconﬁguration:Module Based
or Small Bit Manipulations, 2002.
[15] G. De Micheli, Synthesis and Optimization of Digital Circuits.
McGraw-Hill, Inc., 1994.
[16] M. Damiani and G. De Micheli, “Don’t Care set Speciﬁcations in
Combinational and Synchronous Logic Circuits,” IEEE Trans. on
CAD, vol. 12, no. 3, pp. 365–388, 1993.
[17] H. Savoj and R. Brayton, “The use of Observability and External
Don’t cares for the Simplification of Multi-Level Networks,” in
Proc. DAC, pp. 297–301, 1990.
[18] S. Yamashita, H. Sawada, and A. Nagoya, “SPFD: A New Method
to Express Functional Flexibility,” IEEE Trans. on CAD, vol. 19,
no. 8, pp. 840–849, 2000.
