The Partition into Hypercontexts Problem for Hyperreconfigurable Architectures by Lange, Sebastian & Middendorf, Martin
The Partition into Hypercontexts Problem for
Hyperreconfigurable Architectures
Sebastian Lange and Martin Middendorf
Parallel Computing and Complex Systems Group
Department of Computer Science, University of Leipzig
Augustusplatz 10/11, D-04109 Leipzig, Germany
{langes,middendorf}@informatik.uni-leipzig.de
Abstract. Hyperreconfigurable architectures adapt their reconfiguration abilities
during run time in order to achieve fast dynamic reconfiguration. Models for such
architectures have been proposed that change their ability for reconfiguration
during hyperreconfiguration steps and, in ordinary reconfiguration steps, reconfigure
the actual contexts for a computation within the limits that have been set
by the last hyperreconfiguration step. In this paper we study algorithmic aspects
of how to optimally decide which hyperreconfiguration steps should be done during
a computation in order to minimize the total time necessary for hyperreconfiguration
and ordinary reconfiguration. It is shown that the general problem is
NP-hard, but fast polynomial time algorithms are given to solve this problem on
different types of hyperreconfigurable architectures. These include newly introduced
architectures that use a cache to store hypercontexts. We define an example
hyperreconfigurable architecture and illustrate the introduced concepts for three
application problems.
1 Introduction
The increasingly higher integration and flexibility of dynamically reconfigurable hardware
lead to a large amount of information which has to be transferred onto the hardware
for reconfiguration to define the new state of the system. This large amount of data
transfer makes run time reconfigurations time critical operations, especially for
computations which exploit the full capacity of dynamically reconfigurable architectures by
frequent reconfigurations. Different approaches have been proposed in the literature to
cope with this problem, e.g., compression methods for the stream of reconfiguration bits
([4,6]), multi-context architectures ([1,12]), self-reconfigurability ([8,15,17]) and
hyperreconfiguration ([9]), which means that the reconfiguration potential of an architecture
itself is reconfigurable.
In this paper we study algorithmic aspects of single task hyperreconfigurable
architectures as they have been proposed in [9] (algorithmic aspects of multi-task
hyperreconfigurable architectures are studied in [10]). Such architectures use two types of
reconfiguration steps: i) reconfiguration steps where the reconfiguration potential of the
architecture is defined, ii) standard reconfiguration steps which are used to reconfigure
the actual contexts which are used by the algorithm. Reconfiguration steps of the first
type are called hyperreconfiguration steps. Moreover, we extend hyperreconfigurable
architectures by introducing a cache for storing hypercontexts.
A central problem that emerges on hyperreconfigurable architectures is to determine
when hyperreconfiguration steps should be taken and how the reconfiguration
potential should be defined in these steps in order to minimize the total time necessary
for (hyper)reconfiguration of a computation. We call this problem the Partition into
Hypercontexts (PHC) problem and show that it is NP-hard. We also describe polynomial time
algorithms for several variants of PHC on the so called Switch model of
hyperreconfigurable architectures ([9]). Unfortunately, it is also shown that the introduction of a
cache for hypercontexts makes the PHC problem NP-hard even for the Switch model.
To illustrate the ideas in this paper we present an example for the PHC problem on the
Switch model. An optimal solution for the PHC problem is provided when the example
architecture has no cache and a heuristic solution when a cache for hypercontexts is
used.
The paper is organized as follows. In Section 2 we describe hyperreconfigurable
architectures and introduce the Partition into Hypercontexts (PHC) problem.
Section 3 states the NP-hardness of the general PHC problem.
In Section 4 we discuss polynomial time solvable cases of the PHC problem. A variant
of the PHC problem with changeover costs is studied in Section 5. In Section 6 we
introduce hyperreconfigurable architectures with a cache for hypercontexts and study
PHC for these architectures. Experimental results for a test architecture are presented
in Section 7. The paper ends with a conclusion in Section 8.
2 The Partition into Hypercontexts Problem
Hyperreconfigurable architectures allow the reconfiguration potential to be altered during run
time and use two types of reconfiguration steps ([9]). The ordinary reconfiguration steps
are used to actually define a new configuration of the system. The actual state of the
system that can be changed by reconfiguration is called the context of a computation.
Hyperreconfiguration steps are used for defining the actual reconfiguration potential
of the architecture that is activated for reconfiguration in the ordinary reconfiguration
steps. Thus, a hyperreconfiguration step defines the set of contexts that can potentially
be reconfigured in (ordinary) reconfiguration steps. Such a set of possible contexts is
called a hypercontext. A reconfiguration into a new context might be dependent on
external and internal parameters of the computation and can be characterized by the set
of all possible contexts that it defines depending on the data. Hence, a reconfiguration
can in general only be executed during run time when the machine is in a hypercontext
that contains this set of possible contexts. A set of possible contexts is called a context
requirement, and a hypercontext that contains it satisfies the corresponding context
requirement. It is assumed that a reconfiguration step requires reconfiguration information
for all activated resources (even when the information is that an activated resource
is not used in the corresponding context). Formal models for hyperreconfigurable
architectures where the cost (e.g., the time or the amount of bits necessary to be loaded
onto the architecture) of a reconfiguration step depends on the actual hypercontext have
been given in [9] and are described in the following.
Let $\mathcal{C}$ be the set of possible context requirements for a reconfigurable machine and
let $C = c_1 \ldots c_m$, $c_i \in \mathcal{C}$, be the sequence of context requirements that characterizes an
algorithm/computation. A hypercontext is a state of the machine which is characterized
by the subset of the context requirements in $\mathcal{C}$ that are satisfied when the machine is in this
state. At any time exactly one hypercontext is realized on the machine. Let $\mathcal{H}$ be the
set of possible hypercontexts. For a hypercontext $h \in \mathcal{H}$ let $h(\mathcal{C}) \subset \mathcal{C}$ be the subset of
context requirements that are satisfied by $h$. The set $h(\mathcal{C})$ is called the context set of $h$.
For a sequence $c_1 \ldots c_k$ of context requirements and a hypercontext $h$ let $c_1 \ldots c_k \subset h(\mathcal{C})$
denote the fact that $c_i \in h(\mathcal{C})$ holds for each context requirement $c_i$, $i \in [1:k]$. In order
to change the machine's current hypercontext a hyperreconfiguration step is necessary.
For each hypercontext $h \in \mathcal{H}$ two cost measures are defined: i) $\mathrm{init}(h)$ is the
cost of performing a hyperreconfiguration that brings the machine into hypercontext $h$,
ii) $\mathrm{cost}(h)$ denotes the cost of an ordinary reconfiguration step when the machine is in
hypercontext $h$. Then a computation is characterized by a partition of $C$ into substrings
$S_1, \ldots, S_r$ (i.e. $C = S_1 \ldots S_r$) and hypercontexts $h_1, \ldots, h_r$, $r \geq 1$, such that $S_i \subset h_i(\mathcal{C})$. Its costs are
$\sum_{i=1}^{r} (\mathrm{init}(h_i) + \mathrm{cost}(h_i) \cdot |S_i|)$, where $|S_i|$ is the length of $S_i$, i.e., the number
of context requirements in $S_i$. When the algorithm/computation is executed the machine
performs the following reconfiguration operations: $h_1 S_1 \ldots h_r S_r$, where $S_i$ stands
for a sequence of $|S_i|$ reconfigurations which use only those parts of the machine which
are available within the hypercontext $h_i$. It is assumed that a hyperreconfiguration is
always performed before the first reconfiguration step.
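The cost measure above can be evaluated mechanically for any candidate partition. The following is a minimal sketch under stated assumptions: the function name `phc_cost`, the `satisfies` predicate, and the toy machine with the hypercontexts "small" and "large" are hypothetical and serve only as illustration; they are not taken from [9].

```python
from typing import Callable, Hashable, Sequence


def phc_cost(
    parts: Sequence[Sequence[Hashable]],
    hypercontexts: Sequence[Hashable],
    init: Callable[[Hashable], int],
    cost: Callable[[Hashable], int],
    satisfies: Callable[[Hashable, Hashable], bool],
) -> int:
    """Return sum_i (init(h_i) + cost(h_i) * |S_i|) for substrings S_i and hypercontexts h_i."""
    if not parts or len(parts) != len(hypercontexts):
        raise ValueError("need r >= 1 substrings and one hypercontext per substring")
    total = 0
    for S, h in zip(parts, hypercontexts):
        # Every context requirement of S_i must lie in the context set of h_i.
        if not all(satisfies(h, c) for c in S):
            raise ValueError(f"hypercontext {h!r} does not satisfy all of {S!r}")
        total += init(h) + cost(h) * len(S)
    return total


# Hypothetical toy machine: two hypercontexts with made-up costs.
CONTEXT_SET = {"small": {"a"}, "large": {"a", "b"}}
INIT = {"small": 2, "large": 5}
COST = {"small": 1, "large": 3}


def toy_cost(parts, hypercontexts):
    """Evaluate phc_cost on the toy machine defined above."""
    return phc_cost(
        parts,
        hypercontexts,
        INIT.__getitem__,
        COST.__getitem__,
        lambda h, c: c in CONTEXT_SET[h],
    )
```

For the sequence a a b a on this toy machine, using three hyperreconfigurations (small, large, small) costs 15, whereas staying in the hypercontext "large" for the whole sequence costs 17.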
An important problem that emerges for a hyperreconfigurable machine and a given
algorithm (i.e. a sequence of context requirements) is to decide when hyperreconfigurations
are done and how the corresponding hypercontexts are defined such that the context
requirements of the algorithm are satisfied and the total costs for the hyperreconfiguration
steps and the ordinary reconfiguration steps are minimized. Formally we define:
Partition into Hypercontexts (PHC) problem: Given a hyperreconfigurable machine
(as described above) and a sequence $C = c_1 \ldots c_m$ of context requirements. Find a partition
of $C$ into substrings $S_1, \ldots, S_r$ (i.e. $C = S_1 \ldots S_r$) and hypercontexts $h_1, \ldots, h_r$, $r \geq 1$,
with $S_i \subset h_i(\mathcal{C})$ and minimal total (hyper)reconfiguration costs.
Two variants of the model for hyperreconfigurable architectures have been introduced
in [9]. The DAG model is for coarse grained reconfigurable machines where different
reconfigurable submachines (hypercontexts) can be defined that can be ordered
with respect to their computational power (this model is not considered in this paper
due to space limitations). The second variant, called Switch model, is for fine grained
machines where a set of small (similar) reconfigurable units (also called switches) exists.
The reconfigurable machine that is available during a hypercontext is defined by
the subset of available units. For reconfiguration the state of each available switch has
to be defined. Thus the cost for reconfiguration is the number of available units plus
some overhead cost. Formally, let $X = \{x_1, \ldots, x_n\}$ be a set of switches and define
$\mathcal{C} = \mathcal{H} = 2^X$, i.e., the set of possible context requirements $\mathcal{C}$ and the set of possible
hypercontexts $\mathcal{H}$ equal the set of all subsets of $X$. For a context requirement $x \subseteq X$ the relation
$x \in h(\mathcal{C})$ holds when $x \subseteq h$. Let $\mathrm{cost}(h) = |h|$, where $|h|$ is the size of $h$, i.e., the number
of switches available in $h$. Let $\mathrm{init}(h) = n$ for $h \in \mathcal{H}$, which reflects the fact that for
each switch it has to be defined during hyperreconfiguration whether it is available in
the new hypercontext. A computation is characterized by a partition of $C$ into substrings
$S_1, \ldots, S_r$, $r \geq 1$ (i.e. $C = S_1 \ldots S_r$) and hypercontexts $h_1, \ldots, h_r$ such that $S_i \subset h_i(\mathcal{C})$, and
the total (hyper)reconfiguration costs are $r \cdot n + \sum_{i=1}^{r} |h_i| \cdot |S_i|$.
PHC-Switch problem: Given a hyperreconfigurable machine in the Switch model
with the set of switches $X = \{x_1, \ldots, x_n\}$ and a sequence of context requirements $C =
c_1 \ldots c_m$. Find a partition of $C$ into substrings $S_1, \ldots, S_r$, $r \geq 1$ (i.e. $C = S_1 \ldots S_r$) and
hypercontexts $h_1, \ldots, h_r$ such that $S_i \subset h_i(\mathcal{C})$ and the total (hyper)reconfiguration costs
are minimal. Note that for the PHC-Switch problem there exist $2^n$ hypercontexts but
this number is not part of the size of the problem instance, which is $n + m$.
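As an illustration of the Switch model costs, the sketch below evaluates $r \cdot n + \sum_{i=1}^{r} |h_i| \cdot |S_i|$ for a given partition. It assumes that each substring is served by its cheapest satisfying hypercontext, i.e. the union of its context requirements; the function names and the example instance are hypothetical.

```python
from typing import FrozenSet, Iterable, Sequence


def cheapest_hypercontext(S: Iterable[Iterable[int]]) -> FrozenSet[int]:
    """Smallest set of switches that contains every context requirement in S."""
    h: FrozenSet[int] = frozenset()
    for c in S:
        h = h | frozenset(c)
    return h


def switch_cost(n: int, parts: Sequence[Sequence[Iterable[int]]]) -> int:
    """Total cost r*n + sum_i |h_i| * |S_i| with h_i the union of the requirements in S_i."""
    return sum(n + len(cheapest_hypercontext(S)) * len(S) for S in parts)


# Example instance (made up): n = 4 switches, m = 4 context requirements.
C = [{1}, {1, 2}, {3}, {3, 4}]
one_part = switch_cost(4, [C])                 # a single hypercontext {1, 2, 3, 4}
two_parts = switch_cost(4, [C[:2], C[2:]])     # hypercontexts {1, 2} and {3, 4}
four_parts = switch_cost(4, [[c] for c in C])  # one hypercontext per requirement
```

In this example the three partitions cost 20, 16 and 22, which shows the trade-off between few large hypercontexts and many small ones.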
3 NP-Hardness
In this section we show that the general PHC problem is NP-hard which means it is
unlikely that the problem can be solved in polynomial time.
Theorem 1. The PHC problem is NP-hard.
We only give the proof idea. For a proof one can encode an instance of an NP-hard
problem, say 3-SAT, in a sequence of contexts $C$. Then a cost function and a set of
hypercontexts can be defined such that there exists a cheap partition into hypercontexts
of $C$ if and only if the partition consists of a single hypercontext and the contexts in $C$
encode an instance of 3-SAT that is solvable such that there exists no partition of $C$ into
substrings which can be covered by hypercontexts in a cheap way.
4 Polynomial Time Algorithm for PHC-Switch
In this section we describe a dynamic programming solution for the PHC-Switch problem.
The algorithm computes a table $M = (M_{k,j})_{k \in [1:m],\, j \in [k:m]}$ where $M_{k,j}$ is the minimal
cost for the prefix of length $j$ of the sequence of context requirements $c_1 \ldots c_m$
when using $k$ hypercontexts. The optimal solution for PHC-Switch can then be derived
from this matrix. The algorithm is designed such that each row of the matrix can be
determined in time $O(n \cdot m)$ so that the total run time is $O(n \cdot m^2)$.
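To make the meaning of the table concrete, the following sketch fills $M$ with a naive recurrence, $M_{k,j} = \min_i (M_{k-1,i-1} + n + |c_i \cup \ldots \cup c_j| \cdot (j-i+1))$. This is not the pointer-based algorithm of this section: it needs cubic time in $m$ and is meant only as a reference against which a faster implementation could be checked. The function names are hypothetical.

```python
from math import inf
from typing import List, Sequence, Set


def phc_switch_table(n: int, C: Sequence[Set[int]]) -> List[List[float]]:
    """M[k][j] = minimal cost of serving c_1..c_j with exactly k hypercontexts (inf if impossible)."""
    m = len(C)
    M = [[inf] * (m + 1) for _ in range(m + 1)]
    M[0][0] = 0
    for k in range(1, m + 1):
        for j in range(k, m + 1):
            h: Set[int] = set()
            best = inf
            # The last substring is c_i..c_j; grow it to the left so that
            # the union of its requirements is updated incrementally.
            for i in range(j, k - 1, -1):
                h |= C[i - 1]
                best = min(best, M[k - 1][i - 1] + n + len(h) * (j - i + 1))
            M[k][j] = best
    return M


def phc_switch_optimum(n: int, C: Sequence[Set[int]]) -> float:
    """Minimal total (hyper)reconfiguration cost over all numbers k of hypercontexts (m >= 1)."""
    M = phc_switch_table(n, C)
    m = len(C)
    return min(M[k][m] for k in range(1, m + 1))


example = [{1}, {1, 2}, {3}, {3, 4}]
table = phc_switch_table(4, example)
```

For the made-up instance `example` with $n = 4$, the optimum is 16 and is attained with two hypercontexts; the diagonal entry `table[4][4]` equals $4 \cdot 4 + \sum |c_i| = 22$.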
In the following let $h_{i,j}$ be a cheapest hypercontext that satisfies the context requirements
$c_i, \ldots, c_j$. First, we need some facts and definitions. It is not hard to show for each
$k \in [1:m]$: i) the value of $M_{k,p}$ is monotone increasing in $p$, ii) for $j \in [k:m]$ the value
of $\mathrm{cost}(h_{i,j})$ is monotone decreasing in $i$. Let $j \in [k:m]$. It follows from the stated facts
that there exists a partition $T_1, \ldots, T_h$ of the sequence of context requirements $c_k \ldots c_j$
such that $c_k \ldots c_j = T_1 \ldots T_h$ and for each string of contexts $T_s$, $s \in [1:h]$, the following holds: for all
contexts $c_t \in T_s$ the hypercontexts $h_{t,j}$ and therefore the costs $\mathrm{cost}(h_{t,j})$ are the same. Recall
that $h_{t,j}$ for the PHC-Switch problem is defined as the hypercontext that consists
of all switches that are elements of at least one of the context requirements $c_t, \ldots, c_j$,
i.e., $h_{t,j} = \bigcup_{i=t}^{j} c_i$. We call the partition $T_1, \ldots, T_h$ the equal cost partition of $[k:j]$ and
the corresponding intervals of indices of the contexts the equal cost intervals.
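The equal cost partition can be computed directly from its definition by scanning the indices from $j$ down to $k$ and grouping consecutive indices $t$ whose unions $h_{t,j}$ coincide. The sketch below does this from scratch for one pair $(k, j)$; it is an illustration only and does not maintain the partition incrementally as the algorithm of this section does. The function name is hypothetical.

```python
from typing import FrozenSet, List, Sequence, Set, Tuple


def equal_cost_partition(
    C: Sequence[Set[int]], k: int, j: int
) -> List[Tuple[int, int, FrozenSet[int]]]:
    """Return the equal cost intervals of [k:j] as triples (first, last, h),
    where h = c_t | ... | c_j is the same for every t in [first:last] (1-based indices)."""
    h: FrozenSet[int] = frozenset()
    groups: List[list] = []  # built from right to left
    for t in range(j, k - 1, -1):
        new_h = h | C[t - 1]
        if groups and new_h == h:
            groups[-1][0] = t  # same hypercontext: extend the current interval to the left
        else:
            groups.append([t, t, new_h])
        h = new_h
    groups.reverse()
    return [(first, last, hc) for first, last, hc in groups]


example = [{1, 2}, {2}, {2, 3}, {3}, {3}]
intervals = equal_cost_partition(example, 1, 5)
```

For the made-up sequence `example`, the equal cost intervals of $[1:5]$ are $[1:1]$, $[2:3]$ and $[4:5]$ with costs 3, 2 and 1, which also illustrates that $\mathrm{cost}(h_{t,j})$ does not increase with $t$.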
Let $[s:t]$ be an equal cost interval. For each index $x \in [s:t]$ the values $\delta \in [1:n]$ are
determined for which $M_{k,x-1} + \delta \cdot (t-(x-1)) = \min\{M_{k,y-1} + \delta \cdot (t-(y-1)) \mid y \in [s:t]\}$
holds. Clearly, for each index $x \in [s:t]$ the corresponding $\delta$ values form a subinterval
of $[1:n]$. This interval is called the minimum cost interval of index $x$ (within the equal
cost interval $[s:t]$) and is denoted by $I_x$. It is not hard to show that $I_s, \ldots, I_t$ is a partition
of $[1:n]$ where all elements in $I_i$ are smaller than all elements in $I_{i+1}$ for $i \in [s:t-1]$.
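The definition of the minimum cost intervals can be checked by brute force on small examples. In the sketch below, `M_prev[x - 1]` stands for the table value $M_{k,x-1}$ that appears in the formula; ties between indices are broken in favour of the smaller index, which is an assumption made here so that the result is a partition. The function name and the numbers are hypothetical.

```python
from typing import Dict, List


def minimum_cost_intervals(
    M_prev: Dict[int, int], s: int, t: int, n: int
) -> Dict[int, List[int]]:
    """For each x in [s:t] list the values delta in [1:n] for which
    M_prev[x-1] + delta*(t-(x-1)) is minimal over [s:t] (ties go to the smaller index)."""
    intervals: Dict[int, List[int]] = {x: [] for x in range(s, t + 1)}
    for delta in range(1, n + 1):
        best_x = min(
            range(s, t + 1),
            key=lambda y: (M_prev[y - 1] + delta * (t - (y - 1)), y),
        )
        intervals[best_x].append(delta)
    return intervals


# Made-up table values for the indices s-1 .. t-1 with s = 2, t = 4.
M_prev = {1: 10, 2: 13, 3: 19}
result = minimum_cost_intervals(M_prev, s=2, t=4, n=8)
```

With these numbers the result is $I_2 = [1:3]$, $I_3 = [4:6]$ and $I_4 = [7:8]$: each set is a subinterval of $[1:n]$ and the intervals are ordered by index, as stated above.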
In the following we describe the computation of a single matrix element in the main
step of the algorithm. We assume that all elements in row 1 of $M$ and all elements
$M_{k,k} = k \cdot n + \sum_{i=1}^{k} |c_i|$, $k \in [1:m]$, have been computed during initialization. It is enough
to consider the computation of an element $M_{k,j+1}$ for $k > 1$ and $j \in [1:m-1]$ assuming
that the elements in row $k-1$ and the element $M_{k,j}$ have already been computed.
In order to search efficiently for possible good places to introduce the $k$th
hyperreconfiguration we introduce a pointer structure over parts of the sequence of context
requirements $c_1 \ldots c_m$. First we describe the pointer structure over the sequence $c_k \ldots c_j$
for the computation of $M_{k,j}$ and then show how it can be extended to a pointer structure
over the sequence $c_k \ldots c_{j+1}$ for the computation of $M_{k,j+1}$.
The first context requirements in each of the sequences of context requirements
$T_h, \ldots, T_1$ are linked by so called equal cost pointers, i.e. there is a pointer to the first
context requirement in $T_h$, from there to the first context requirement in $T_{h-1}$ and so
forth. Moreover, within each equal cost interval the indices $x$ with a minimal cost
interval that is empty or contains only values that are smaller than the actual costs
$\mathrm{cost}(h_{x,j+1})$ are linked in order of increasing value by so called minimum cost pointers.
In addition, there is a pointer from the first context requirement of the interval to the
last useful index in the interval. This pointer is called the end pointer of the equal cost
interval. All indices within an equal cost interval that are linked by minimal cost pointers
are called useful. All other indices are called useless and will be marked as useless by
the algorithm. The following two facts, which are not hard to show, are used for the run time
analysis and to show the correctness of the algorithm (omitted due to space limitations).
Fact 1: It is easy to obtain from the equal cost partition $T_1, \ldots, T_h$ of $[k:j]$ and its
corresponding pointers the equal cost partition $U_1, \ldots, U_g$ of $[k:j+1]$ (i.e. of $c_k \ldots c_{j+1}$) and
the corresponding pointers in time $O(n)$.
To see that this is true observe that each string in $U_1, \ldots, U_g$ can be obtained by
merging (or copying) neighboring strings from $T_1, \ldots, T_h$, and $U_g$ contains in addition the
context requirement $c_{j+1}$.
Fact 2: Consider an element $T_s$ of the equal cost partition $T_1, \ldots, T_h$ of $[k:j]$. Let $c_x$
($c_y$) be the context in $T_s$ (respectively in the element of the equal cost partition of
$[k:j+1]$ that contains $T_s$) for which $M_{k,x-1} + \mathrm{cost}(h_{x,j})$ (respectively $M_{k,y-1} + \mathrm{cost}(h_{y,j+1})$)
is minimal. Then it follows that $x \leq y$.
To compute $M_{k,j+1}$ the algorithm performs the following steps:
i) Extend the equal cost partition of $[k:j]$ by appending the (preliminary) equal cost
interval $c_{j+1}$ and let $[1:n]$ be the (preliminary) minimal cost interval for $j+1$.
ii) Compute the equal cost partition of $[k:j+1]$ from the extended equal cost partition
of $[k:j]$ by merging neighboring intervals when they have the same cost with
respect to $j+1$.
iii) For each index within a merged interval the new equal cost interval is determined
together with its minimal cost pointers and its end pointer. During this process
all indices that have become useless are marked.
Clearly step (i) can be done in time $O(1)$. The determination of the intervals that
have the same costs in step (ii) is done in time $O(n)$ by following the pointers that connect
the intervals. To determine the time for step (iii) consider an equal cost interval
$[s_0 : s_h]$, $k \leq s_1 \leq s_h \leq j+1$, that was merged from $h \leq n$ old intervals $[s_0 : s_1]$, $[s_1+1 :
s_2]$, $\ldots$, $[s_{h-1}+1 : s_h]$. We now show that the computation of new pointers and the marking
of useless indices takes time $O(h+q)$ where $q$ is the number of marked indices.
a) For each of the $h$ intervals consider the minimum cost interval of the index to
which the first minimum cost pointer points. If the minimum cost interval does not
contain a value that is at least as large as $\mathrm{cost}(h_{s,j+1})$ then the index is marked as useless
and the first pointer is merged with the next pointer. This process proceeds until every
first minimum cost pointer points to a useful index.
b) Now it remains to update the minimum cost intervals by selecting for each cost
value only the best index from the $h$ merged intervals. This can be done in a left to
right manner starting with the smaller cost values, always comparing the corresponding
minimum cost intervals of indices between two neighboring intervals of the $h$ merged
intervals, say $[s_{i-1}+1 : s_i]$ and $[s_i+1 : s_{i+1}]$, $i \in [1:h-1]$. For ease of description
we assume here that all values in one minimal cost interval are better than all values
in the other interval. If this is not the case both minimum cost intervals are split so
that each contains only the values for which it is better. Observe that the split value
can be computed in constant time. When the minimum cost interval in the left interval
$[s_{i-1}+1 : s_i]$ is better, the corresponding index in the right interval is marked useless
and the next minimum cost intervals are compared. When the minimum cost interval in
the right interval $[s_i+1 : s_{i+1}]$ is better, the index in the left interval is marked useless.
Then the minimum cost interval in the old right interval (now the new left interval) is
compared with the corresponding minimum cost interval of its right neighbor interval
$[s_{i+1}+1 : s_{i+2}]$. During the search for the corresponding minimum cost interval all indices
that are passed are marked useless. The process stops when the best minimum
cost interval with value $n$ is found. During the search a pointer is set from the rightmost
useful index of an interval to the first useful index in its right neighbor. Thereby it might
be necessary to jump over intervals that have no useful index left. The end pointer of
the first interval is set to point to the last useful index of the merged intervals.
Since the total number of intervals in the equal cost partition for $[k:j+1]$ is at most
$n$ minus the number of merged intervals, the time to compute $M_{k,j+1}$ is at most $O(n+q)$
where $q$ is the total number of indices that are marked useless. Since at most $m-k$
indices exist in row $k$ of matrix $M$ it follows that the total computation time of all steps (iii)
for computing the elements in this row is $O(n \cdot m + m)$.
Theorem 2. The PHC-Switch problem can be solved in time $O(n \cdot m^2)$.
5 PHC with Changeover Costs
In this section we study a variant of the PHC problem where the cost for a
hyperreconfiguration depends not only on the new hypercontext but also on its preceding
hypercontext. Parts of the hyperreconfiguration costs can then be considered as changeover
costs and therefore we call this problem the PHC problem with changeover costs. This
problem is used to model architectures where during hyperreconfiguration it is not
necessary to specify the new hypercontext from scratch but where it is possible to define
the new hypercontext through its difference to the old hypercontext. In the following we
consider the problem only for the Switch model. For this problem the changeover costs
between two hypercontexts are defined as the number of switches for which the state
has to be changed for the new hypercontext (i.e., the state is changed from available to
not available or vice versa). Formally, the problem can be stated as follows.
PHC-Switch problem with changeover costs: Given an instance of the PHC-Switch
problem, where $\mathrm{init}(h) = w$ for $h \in \mathcal{H}$, $w > 0$, the cost function changeover on $\mathcal{H} \times \mathcal{H}$ is
defined by $\mathrm{changeover}(h_1, h_2) := |h_1 \triangle h_2|$ where $\triangle$ denotes the symmetric difference,
and an initial hypercontext $h_0 \in \mathcal{H}$. Find a partition of $C$ into substrings $S_1, \ldots, S_r$,
$r \geq 1$ (i.e. $C = S_1 \ldots S_r$) and hypercontexts $h_1, \ldots, h_r$ such that $S_i \subset h_i(\mathcal{C})$ and
$r \cdot w + \sum_{i=1}^{r} (|h_{i-1} \triangle h_i| + |h_i| \cdot |S_i|)$ is minimized.
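A small sketch may help to see how the changeover term enters the total. It assumes that each hyperreconfiguration is charged $w$ plus the symmetric difference between the previously active hypercontext (starting with $h_0$) and the new one; the function name and the example values are hypothetical.

```python
from typing import Iterable, Sequence, Set


def changeover_cost(
    w: int,
    h0: Iterable[int],
    parts: Sequence[Sequence[Set[int]]],
    hypercontexts: Sequence[Iterable[int]],
) -> int:
    """Sum over all steps of w + |previous hypercontext (symmetric difference) h_i| + |h_i| * |S_i|."""
    total = 0
    prev = frozenset(h0)
    for S, h in zip(parts, hypercontexts):
        h = frozenset(h)
        # In the Switch model a requirement c is satisfied by h when c is a subset of h.
        if not all(c <= h for c in S):
            raise ValueError("hypercontext does not satisfy its substring")
        total += w + len(prev ^ h) + len(h) * len(S)
        prev = h
    return total


C = [{1}, {1, 2}, {3}, {3, 4}]
split = changeover_cost(1, set(), [C[:2], C[2:]], [{1, 2}, {3, 4}])
single = changeover_cost(1, set(), [C], [{1, 2, 3, 4}])
```

With $w = 1$ and an empty initial hypercontext, the split solution costs $(1 + 2 + 4) + (1 + 4 + 4) = 16$ and the single hypercontext costs $1 + 4 + 16 = 21$.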
The next result shows that PHC-Switch with changeover costs is polynomially solvable
(the algorithm is too involved for the available space and omitted).
Theorem 3. The PHC-Switch problem with changeover costs can be solved in time
$O(m^4 \cdot n)$.
6 Caches for Hypercontext and PHC
Multi-context devices can store the reconfiguration data that are necessary to specify
a set of contexts. Such context caching on the device can lead to a significant speedup
compared to single context devices where the reconfiguration bits have to be loaded
onto the device from a host computer for every reconfiguration. In this section we
introduce multi-hypercontext hyperreconfigurable architectures, which have a cache for
hypercontexts so that they can switch between hypercontexts very rapidly. The concept
of reconfigurable devices with context switching was introduced a decade ago
(e.g. the dynamically programmable gate array (DPGA) [1] or WASMII [12]). In [14] the
reconfigurable computing module board (RCM) has been investigated, which contains
two context-switching FPGAs, called CSRC, where the context switching device can
store four contexts.
A typical cache problem for many reconfigurable architectures is that the sequence
of contexts for a computation is known in advance and the problem is then to find the
best replacement strategies for the contexts that are stored in the cache. On a run time
reconfigurable machine the problem is that the actual contexts might not be known in
advance because they can depend on the actual results of a computation. But what might
be known in advance are general requirements on the contexts, e.g. whether few or many
routing resources are needed. The actual context, e.g. the exact routing, is then defined at
a reconfiguration step. Therefore, it seems a promising concept for hyperreconfigurable
architectures to introduce a cache for storing hypercontexts.
What makes the problem of using a cache for hypercontexts particularly interesting
on a hyperreconfigurable machine is that different sequences of hypercontexts are possible
which can satisfy the sequence of context requirements of a computation. Hence,
the algorithm that computes the best sequence of hypercontexts should take the use of
the cache into account. In general, it can be advantageous to use fewer but more
comprehensive hypercontexts in order to increase the chances that a hypercontext which
is to be used already exists in the cache and can therefore be loaded very fast. Thus,
there is a trade-off between the increasing reconfiguration costs when fewer but more
comprehensive hypercontexts are used and the shrinking costs for loading these
hypercontexts.
Here we consider a hyperreconfigurable machine with a cache for hypercontexts
that can store a fixed maximal number of hypercontexts. It is assumed that a hyper-
context has to be loaded from the host only when the hypercontext is not stored in the
cache. Hence, the cost for loading a hypercontexth depends on whether it is in the
cache or not. The value ofinit (h) is smaller when the hypercontext is in the cache. For
a machine with cache we define the PHC-Switch problem as follows.
PHC-Switch problem (for hyperreconfigurable machines with a cache for hypercontexts):
Given a cache capacity 2n, a set of switches $X = \{x_1, \ldots, x_n\}$, a set of context
requirements $\mathcal{C}$ and a set of hypercontexts $\mathcal{H}$ defined as $\mathcal{C} = \mathcal{H} = 2^X$, i.e., $\mathcal{C}$ and $\mathcal{H}$ equal
the set of all subsets of $X$. For a given sequence of context requirements $C = c_1 \ldots c_m$
find a partition of $C$ into substrings $S_1, \ldots, S_r$, $r \geq 1$ (i.e. $C = S_1 \ldots S_r$) and hypercontexts
$h_1, \ldots, h_r$ such that $S_i \subset h_i(\mathcal{C})$ and $r_1 \cdot n + r_2 \cdot c + \sum_{i=1}^{r} |h_i| \cdot |S_i|$ is minimized, where $r_2$ is
the number of hypercontexts that can be loaded from the cache, $r_1 := r - r_2$ is the number
that have to be loaded from the host, and $c$ is the
cost to load a hypercontext from the cache.
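The cost with a cache depends on which hypercontexts are present in the cache at each hyperreconfiguration. The sketch below evaluates $r_1 \cdot n + r_2 \cdot c + \sum_{i=1}^{r} |h_i| \cdot |S_i|$ for a given solution. The replacement policy (least recently used), the initially empty cache, and the explicit `capacity` parameter are assumptions made for this illustration; the problem statement above does not fix them.

```python
from collections import OrderedDict
from typing import Iterable, Sequence, Set, Tuple


def cached_switch_cost(
    n: int,
    c: int,
    capacity: int,
    parts: Sequence[Sequence[Set[int]]],
    hypercontexts: Sequence[Iterable[int]],
) -> Tuple[int, int, int]:
    """Return (total cost, r1, r2) with r1 loads from the host and r2 loads from the cache."""
    cache: "OrderedDict[frozenset, bool]" = OrderedDict()  # assumed LRU order, oldest first
    r1 = r2 = 0
    reconfiguration = 0
    for S, h in zip(parts, hypercontexts):
        h = frozenset(h)
        if not all(ctx <= h for ctx in S):
            raise ValueError("hypercontext does not satisfy its substring")
        if h in cache:
            r2 += 1
            cache.move_to_end(h)  # mark as most recently used
        else:
            r1 += 1
            cache[h] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used entry
        reconfiguration += len(h) * len(S)
    return r1 * n + r2 * c + reconfiguration, r1, r2


parts = [[{1, 2}], [{3}], [{1, 2}], [{3}]]
hcs = [{1, 2}, {3}, {1, 2}, {3}]
with_two_lines = cached_switch_cost(4, 1, 2, parts, hcs)
with_one_line = cached_switch_cost(4, 1, 1, parts, hcs)
```

In this made-up example with $n = 4$ and $c = 1$, a cache with two lines turns the last two loads into cache hits (total cost 16), while a cache with one line yields no hits under the assumed policy (total cost 22).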
We can show the following theorem by a reduction from 3-SAT (the proof is some-
what technical and therefore omitted).
Theorem 4. The PHC-Switch problem is NP-hard on a hyperreconfigurable machine
with a cache for hypercontexts.
7 Experiments and Results
We define a Simple HYperReconfigurable Architecture (SHyRA) as an example of a
minimalistic model of a rapidly reconfiguring machine in order to illustrate our concepts.
As depicted in Figure 1 it features 18 reconfigurable Look-Up Tables (LUTs), each with
three inputs and one output. For storing signals a file of 73 registers is used. The registers
are reconfigurably connected to the LUTs by a 73:54 multiplexer and an 18:73 demultiplexer.
The inability of the architecture to directly chain the LUTs for computation
poses a bottleneck for the test applications we run on SHyRA and forces them to make
extensive use of reconfigurations. The test applications therefore naturally lend themselves
to profit from the use of hyperreconfigurations. This, however, does not limit the
general validity of the experimental results, because although SHyRA implicitly imposes
reconfiguration, every reconfigurable application follows the same basic design,
i.e. having a calculation phase (LUTs), transferring the information to some registers
(DeMUX) and then having it reinjected into the next calculation phase (MUX). In order to
evaluate the caching model, each reconfigurable component was equipped with a cache
of up to 14 cache lines. Two sample applications (a 4 bit adder and a primitive ALU)

Fig. 1. Simple HYperReconfigurable Architecture: Principal System Design
After mapping the design onto the reconfigurable resources (LUT contents, MUX
switching information) a heuristic was employed to determine appropriate hypercontexts
using the same costs as in the Switch model. For the case of not using caches
the optimal hypercontexts were determined with the algorithm described in Section 4.
For the case with caches for hypercontexts we used a greedy strategy which takes the
optimal solution for the PHC-Switch problem without caches as starting point and
subsequently improves this solution by randomly applying one of three operations:
1. Two randomly chosen hypercontexts are merged.
2. Two hypercontexts are chosen randomly. For each context $c_j$ a penalty cost
$\mathrm{cost}(c_j) = \sum_{k \in [0,n],\, c_{jk}=1} |\{c_i \mid i \neq j,\ c_{ik} = 0\}|$ is determined and the most
expensive context is exchanged (this is repeated as long as the total costs become smaller).
3. One randomly chosen hypercontext is split into two hypercontexts and the same
exchange procedure as in (2) is applied.
Fig. 2. Relative Costs of the Test Case Designs With Cache Size From 1 to 14 Lines
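The penalty cost used in operation 2 counts, for every switch that a context uses, how many of the other contexts do not use that switch. The sketch below computes it for contexts represented as sets of switch indices. It assumes that $i$ ranges over the contexts of one group under consideration, which the text does not spell out; the function names and the example are hypothetical.

```python
from typing import Sequence, Set


def penalty(contexts: Sequence[Set[int]], j: int) -> int:
    """Sum over the switches k used by contexts[j] of the number of other contexts not using k."""
    return sum(
        sum(1 for i, ci in enumerate(contexts) if i != j and k not in ci)
        for k in contexts[j]
    )


def most_expensive(contexts: Sequence[Set[int]]) -> int:
    """Index of a context with maximal penalty (the first one in case of ties)."""
    return max(range(len(contexts)), key=lambda j: penalty(contexts, j))


group = [{1, 2}, {1}, {1, 3, 4}]
penalties = [penalty(group, j) for j in range(len(group))]
```

For the made-up group above the penalties are 2, 0 and 4, so the third context, which needs two switches that no other context uses, would be the candidate for the exchange.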
Figure 2 shows the resulting total hyperreconfiguration costs for the test designs
without cache and with caches of sizes from one to 14 cache lines. For the test applications
it can be observed that small caches for hypercontexts can significantly decrease
the total hyperreconfiguration costs.
8 Conclusion
We have investigated a central algorithmic problem for hyperreconfigurable architectures,
namely the Partition into Hypercontexts (PHC) problem. It was shown that the
problem is NP-hard in general but can be solved in polynomial time for the Switch
model under different cost measures. We have also introduced hyperreconfigurable
architectures that use a cache to store hypercontexts and have shown that PHC becomes
NP-hard even for the Switch model for these architectures. Applications of the PHC
problem on an example architecture have been given. For the case when caches for
hypercontexts are used a heuristic for solving the PHC problem was introduced.
References
1. M. Bolotski, A. DeHon, and T.F. Knight, Jr.: Unifying FPGAs and SIMD Arrays. Proc. FPGA
’94 – 2nd International ACM/SIGDA Workshop on FPGAs, 1–10, (1994).
2. K. Bondalapati, V.K. Prasanna: Reconfigurable Computing: Architectures, Models and
Algorithms. In Proc. Reconfigurable Architectures Workshop, IPPS, (1997).
3. K. Compton, S. Hauck: Configurable Computing: A Survey of Systems and Software. ACM
Computing Surveys, 34(2): 171–210, (2002).
4. A. Dandalis and V. K. Prasanna: Configuration Compression for FPGA-based Embedded Sys-
tems. In Proc. ACM Int. Symposium on Field-Programmable Gate Arrays, 173–182, (2001).
5. C. Haubelt, J. Teich, K. Richter, and R. Ernst: System Design for Flexibility. In Proc. 2002
Design, Automation and Test in Europe, 854–861, (2002).
6. S. Hauck, Z. Li, and J.D.P. Rolim: Configuration Compression for the Xilinx XC6200 FPGA.
IEEE Trans. on CAD of Integrated Circuits and Systems, 8:1107–1113, (1999).
7. P. Kannan, S. Balachandran, D. Bhatia: On Metrics for Comparing Routability Estimation
Methods for FPGAs. In Proc. 39th Design Automation Conference, 70–75, (2002).
8. M. Koester and J. Teich: (Self-)reconfigurable Finite State Machines: Theory and
Implementation. In Proc. 2002 Design, Automation and Test in Europe, 559–566, (2002).
9. S. Lange and M. Middendorf: Hyperreconfigurable Architectures for Fast Runtime
Reconfiguration. To appear in Proceedings of 2004 IEEE Symposium on Field-Programmable Custom
Computing Machines (FCCM04), Napa Valley, USA, 2004.
10. S. Lange and M. Middendorf: Models and Reconfiguration Problems for Multi Task Hyper-
reconfigurable Architectures. To appear in Proc. RAW 2004, Santa Fe, 2004.
11. K.K. Lee and D.F. Wong: Incremental Reconfiguration of Multi-FPGA Systems. In Proc.
Tenth ACM International Symposium on Field Programmable Gate Arrays, 206–213, (2002).
12. X. P. Ling, and H. Amano: WASMII: a Data Driven Computer on a Virtual Hardware. Proc.
of the IEEE Workshop on FPGAs for Custom Computing Machines, 33-42, (1993).
13. T.-M. Lee, J. Henkel, and W. Wolf: Dynamic Runtime Re-Scheduling Allowing Multiple
Implementations of a Task for Platform-Based Designs. In Proc. 2002 Design, Automation
and Test in Europe, 296–301, (2002).
14. K. Puttegowda, D.I. Lehn, J.H. Park, P. Athanas, and M. Jones: Context Switching in a
Run-Time Reconfigurable System. The Journal of Supercomputing, 26(3): 239–257, (2003).
15. R.P.S. Sidhu, S. Wadhwa, A. Mei, V.K. Prasanna: A Self-Reconfigurable Gate Array Archi-
tecture. Proc. FPL (2000) 106-120.
16. M. Teich, S. Fekete, and J. Schepers: Compile-Time Optimization of Dynamic Hardware
Reconfigurations. Proc. Int. Conf. on Parallel and Distributed Processing Techniques and Ap-
plications (PDPTA’99), Las Vegas, U.S.A., 1999.
17. S. Wadhwa, A. Dandalis: Efficient Self-Reconfigurable Implementations Using On-chip
Memory. Proc. FPL, (2000) 443-448.
