Novel Switch Block Architecture Using Non-Volatile Functional Pass-gate
for Multi-Context FPGAs
Masanori Hariyama, Weisheng Chong, Sho Ogata and Michitaka Kameyama
Graduate School of Information Sciences
Tohoku University
Aoba 6-6-05, Aramaki, Aoba, Sendai, 980-8579, Japan
hariyama@ecei.tohoku.ac.jp
Abstract
Dynamically programmable gate arrays (DPGAs) promise lower-cost implementations than conventional FPGAs because they efficiently reuse limited hardware resources over time. A typical DPGA architecture is the multi-context FPGA (MC-FPGA), which has multiple memory bits per configuration bit, forming configuration planes for fast switching between contexts. The additional memory planes, however, cause significant overhead in area and power consumption. To overcome this overhead, a fine-grained reconfigurable architecture called reconfigurable context memory (RCM) is presented, based on the fact that configuration bits exhibit redundancy and regularity across contexts. A floating-gate MOS functional pass-gate, in which the storage and switch functions are merged, is used to construct the RCM area-efficiently.
1 Introduction
Dynamically programmable gate arrays (DPGAs) provide more cost-effective implementations than conventional FPGAs, in which hardware resources are dedicated to a single context [1],[2]. A DPGA can be sequentially configured as different processors in real time, and thus efficiently reuses its limited hardware resources over time. A typical DPGA architecture is the multi-context FPGA (MC-FPGA), which has multiple memory bits per configuration bit, forming configuration planes for fast switching between contexts. However, the additional memory planes cause significant overhead in area and power consumption [3]. In particular, switch blocks require a much larger memory capacity than look-up tables.
Figure 1 shows the overall structure of an MC-FPGA. Each cell consists of a programmable logic block and a programmable switch block. Figure 2 shows the structure of a conventional multi-context switch. The switch has multiple memory bits, one per context, and the active configuration bit is selected from these memory bits according to a context ID. In the conventional approach, each switch therefore requires n memory bits to store n contexts. Most previous work on DPGAs reduces this overhead using only device-level solutions; that is, compact memory devices such as DRAM are used to store the configuration data [1].

Figure 1. Overall structure of an MC-FPGA. (Each cell contains a logic block and a switch block; the switch block consists of multi-context switches G1 to G9 controlled by configuration data.)
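The conventional multi-context switch described above can be sketched behaviorally as follows. This is only an illustrative model, not the circuit of Fig. 2: the function names `conventional_switch` and `memory_bits_required` are hypothetical, and the context numbering (S1 as the upper bit of the context ID) follows Table 2.

```python
from typing import Sequence


def conventional_switch(memory_bits: Sequence[int], s1: int, s0: int) -> int:
    """Return the configuration bit selected by the 2-bit context ID (S1, S0).

    memory_bits[k] is the bit stored for context k (four contexts assumed).
    """
    if len(memory_bits) != 4:
        raise ValueError("a four-context switch stores exactly four memory bits")
    context = 2 * s1 + s0  # context number 0..3, with S1 as the upper ID bit
    return memory_bits[context]


def memory_bits_required(num_switches: int, num_contexts: int) -> int:
    """Conventional approach: every switch stores one memory bit per context."""
    return num_switches * num_contexts
```

For example, a switch block with nine switches (G1 to G9) and four contexts needs `memory_bits_required(9, 4)` = 36 memory bits in this model, regardless of how similar the stored contexts are.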
To reduce the overhead of configuration memory in MC-FPGAs, this paper proposes an architecture-level solution based on the fact that configuration bits exhibit redundancy and regularity across contexts. To illustrate this redundancy and regularity, Table 1 shows an example of configuration data for the switch block shown in Fig. 1. Each row gives the configuration data of one switch. The configuration data of G3 and G9 are redundant in themselves; that is, their configuration bits do not change from context to context. It has been reported that less than 3% of configuration data change when contexts are switched [4]. There is another type of redundancy, between the configuration data of different switches. For example, G2 and G4 have the same configuration data. Moreover, there is regularity in configuration data such as those of G2 and G4, which can be represented by repeating the bits (0,1). To exploit the
Proceedings of the IEEE Computer Society Annual Symposium on VLSI 
New Frontiers in VLSI Design 
0-7695-2365-X/05 $20.00 © 2005 IEEE
Authorized licensed use limited to: TOHOKU UNIVERSITY. Downloaded on February 5, 2009 at 01:51 from IEEE Xplore.  Restrictions apply.
Figure 2. Conventional multi-context switch (four contexts). (Four memory bits M, one per context C0 to C3, are selected by the context-ID bits S1 and S0 to give the configuration bit of the switch, G9 in this example.)
Table 1. Redundancy and regularity in configuration data

Switch   Context 0 (C0)   Context 1 (C1)   Context 2 (C2)   Context 3 (C3)
G9       1                1                1                1
G4       0                1                0                1
G3       0                0                0                0
G2       0                1                0                1
G1       ?                0                ?                0

(G1 holds a single 1, in either C0 or C2.)
redundancy and regularity, a reconfigurable context memory (RCM) is proposed based on a floating-gate MOS functional pass-gate (FGFP). The FGFP is a device that merges a logic operation and storage in a single floating-gate MOS transistor [5]. An arbitrary switch function can be decomposed into switch functions called "window literals", each of which is efficiently implemented using FGFPs. The number of window literals corresponds to the number of configuration bits for the switch function. In the FGFP-based RCM, the number of window literals can be changed flexibly by reconfiguring the FGFP network connections. With the FGFP-based RCM, the number of transistors in a switch block is reduced to about 10% in comparison with a conventional SRAM-based switch block. FGFPs are also expected to reduce static power in comparison with the SRAM-based implementation, because no supply voltage is required to retain the stored data.
2 Switch Block Architecture Using the Reconfigurable Context Memory
Redundancy and regularity in configuration data can be used to reduce the area of the context memory. In this paper, an architecture with four contexts is considered as an example, although our approach is also applicable to architectures with other numbers of contexts. Contexts are switched by a 2-bit context ID (bits S1 and S0), as shown in Table 2.

Table 2. Relations between contexts and context ID bits

      Context 0   Context 1   Context 2   Context 3
S0    0           1           0           1
S1    0           0           1           1

Figure 3 shows configuration-bit patterns that are independent of the context ID because the switch is programmed to be always on or always off. A single memory bit is sufficient to control such a switch, whereas four memory bits are required for the conventional switch shown in Fig. 2. Figure 4 shows configuration-bit patterns that depend on a single context-ID bit. Note that each of these bit patterns is the same as the bit pattern of S1 (or its complement $\overline{S1}$) or S0 (or its complement $\overline{S0}$) shown in Table 2. A switch using a single context-ID bit is smaller than the conventional switch, which uses two context-ID bits. The other configuration-bit patterns depend on both S1 and S0. Each such bit pattern can be generated using a 2-to-1 multiplexer. The multiplexer is slightly larger than the hardware shown in Figs. 3 and 4. However, these bit patterns are not frequently used in a multi-context architecture, since less than 3% of configuration data change when contexts are switched [4].

Figure 3. Configuration-bit patterns that are independent of a context ID. (The patterns 0000 and 1111 over contexts C0 to C3 are each generated from a single memory bit M, set to 0 or 1.)
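The three classes of configuration-bit patterns discussed above (independent of the context ID, dependent on a single ID bit, dependent on both ID bits) can be illustrated with a small classifier. This is a minimal sketch for four contexts; the names `classify_pattern` and `SINGLE_BIT_PATTERNS` and the returned labels are hypothetical, and the S0 and S1 columns are taken from Table 2.

```python
from itertools import product
from typing import Sequence, Tuple

# Values of the context-ID bits in contexts 0..3 (Table 2).
S0: Tuple[int, ...] = (0, 1, 0, 1)
S1: Tuple[int, ...] = (0, 0, 1, 1)


def invert(bits: Sequence[int]) -> Tuple[int, ...]:
    """Bitwise complement of a pattern."""
    return tuple(1 - b for b in bits)


# Patterns that equal one context-ID bit or its complement.
SINGLE_BIT_PATTERNS = {
    S1: "S1",
    invert(S1): "not S1",
    S0: "S0",
    invert(S0): "not S0",
}


def classify_pattern(pattern: Sequence[int]) -> Tuple[str, str]:
    """Classify a 4-context configuration-bit pattern (C0, C1, C2, C3)."""
    p = tuple(pattern)
    if len(p) != 4 or any(b not in (0, 1) for b in p):
        raise ValueError("expected four bits, one per context")
    if len(set(p)) == 1:
        return ("constant", f"M={p[0]}")      # one memory bit is enough
    if p in SINGLE_BIT_PATTERNS:
        return ("single-bit", SINGLE_BIT_PATTERNS[p])
    return ("two-bit", "f(S1, S0)")            # needs both context-ID bits


if __name__ == "__main__":
    for p in product((0, 1), repeat=4):
        print(p, classify_pattern(p))
```

Of the 16 possible four-context patterns, this classification yields 2 constant patterns, 4 single-bit patterns and 10 patterns that need both ID bits. The repeating (0,1) pattern attributed to G2 and G4 in Section 1 is classified as following S0.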
Figure 5 shows an MC-FPGA architecture that uses reconfigurable context memories (RCMs) as switch blocks. Each logic block (LB) is connected to an RCM block. RCMs are connected by two types of interconnections, single-length lines and double-length lines, to achieve both flexibility and high-speed data transfer. Single-length lines connect neighboring RCMs. Double-length lines connect every second RCM. To simplify the figure, only single-length lines are illustrated. The single-length lines allow flexible connections between RCMs, but they may decrease the speed of data transfer between distant LBs. The double-length lines allow high-speed data transfer between distant LBs because signals pass through fewer RCMs.
Figure 4. Configuration-bit patterns that depend on a single context-ID bit. (Over contexts C0 to C3: G = S1 gives 0011, G = S0 gives 0101, G = $\overline{S1}$ gives 1100, and G = $\overline{S0}$ gives 1010.)
Figure 5. MC-FPGA architecture using the reconfigurable context memory as switch blocks. (Each logic block, LB, is attached to a reconfigurable context memory, RCM.)
3 Implementation of an RCM using an FGFP
Let us consider the function F shown in Fig. 6(a). Here S denotes the context ID regarded as a single multiple-valued variable (S = 0, ..., N-1 for N contexts). The function is given by OR-ing the functions $F_{WL1}$ (Fig. 6(b)) and $F_{WL2}$ (Fig. 6(c)), called window literals. Given the bounds $S_1$ and $S_2$ ($S_1 \le S_2$), a window literal is defined as follows.

$$F_{WL}(S, S_1, S_2) = \begin{cases} 1 & S_1 \le S < S_2 \\ 0 & \text{otherwise} \end{cases}$$

The function F is given by OR-ing two window literals as follows:

$$F(S) = F_{WL}(S, 0, 1) + F_{WL}(S, 2, 3) \tag{1}$$

For the case of N contexts, the function of an MC switch can be given by OR-ing at most N/2 window literals.

Let us consider the window literal $F_{WL2}$ shown in Fig. 7(a). The window literal is given by AND-ing the functions $F_{UL}$ and $F_{DL}$, called the "up-literal" and the "down-literal", respectively.
Figure 6. Function of the multi-context switch (4 contexts).
An up-literal is a monotone increasing function, as shown in Fig. 7(b). Given the threshold value T, an up-literal $F_{UL}(S, T)$ is given by

$$F_{UL}(S, T) = \begin{cases} 1 & T \le S \\ 0 & \text{otherwise} \end{cases}$$

A down-literal is a monotone decreasing function, as shown in Fig. 7(c). Given the threshold value T, a down-literal $F_{DL}(S, T)$ is given by

$$F_{DL}(S, T) = \begin{cases} 1 & S \le T \\ 0 & \text{otherwise} \end{cases}$$

Hence, the window literal $F_{WL2}$ is expressed as

$$F_{WL}(S, 2, 3) = F_{UL}(S, 2) \cdot F_{DL}(S, 2)$$

Finally, Eq. (1) can be rewritten as

$$F(S) = F_{UL}(S, 0) \cdot F_{DL}(S, 0) + F_{UL}(S, 2) \cdot F_{DL}(S, 2)$$
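The decomposition above can be checked numerically. The following sketch assumes that a switch function is given as a list of bits indexed by the context value S, and splits it into one window per run of consecutive 1s; this run-based procedure and the names `decompose` and `evaluate` are illustrative assumptions, not a method stated in the paper. Each window $(S_1, S_2)$ is evaluated as $F_{UL}(S, S_1) \cdot F_{DL}(S, S_2 - 1)$, which agrees with the definitions of the literals given above.

```python
from itertools import product
from typing import List, Sequence, Tuple


def up_literal(s: int, t: int) -> int:
    """F_UL(S, T): 1 when T <= S."""
    return 1 if t <= s else 0


def down_literal(s: int, t: int) -> int:
    """F_DL(S, T): 1 when S <= T."""
    return 1 if s <= t else 0


def window_literal(s: int, s1: int, s2: int) -> int:
    """F_WL(S, S1, S2): 1 when S1 <= S < S2."""
    return 1 if s1 <= s < s2 else 0


def decompose(pattern: Sequence[int]) -> List[Tuple[int, int]]:
    """Return windows (S1, S2), one per run of consecutive 1s in the pattern."""
    windows: List[Tuple[int, int]] = []
    start = None
    for s, bit in enumerate(pattern):
        if bit and start is None:
            start = s                       # a run of 1s begins
        elif not bit and start is not None:
            windows.append((start, s))      # the run ended just before s
            start = None
    if start is not None:
        windows.append((start, len(pattern)))
    return windows


def evaluate(windows: Sequence[Tuple[int, int]], s: int) -> int:
    """OR over windows of the AND of an up-literal and a down-literal."""
    return int(any(up_literal(s, s1) and down_literal(s, s2 - 1)
                   for s1, s2 in windows))


if __name__ == "__main__":
    f = [1, 0, 1, 0]                        # F(S) of Eq. (1) for S = 0..3
    print(decompose(f))                     # windows (0, 1) and (2, 3)
    print([evaluate(decompose(f), s) for s in range(4)])
```

For the function of Eq. (1), the sketch returns the windows (0, 1) and (2, 3). For four contexts, no pattern needs more than two windows, in line with the bound of N/2 window literals.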
From this equation, the circuit of the MC switch for N contexts is obtained as shown in Fig. 8. The function of the 4-context MC switch is generated by wired-OR-ing the outputs of two window literals. In general, the function of the N-context MC switch is generated by wired-OR-ing the outputs of at most N/2 window literals. The output of each window literal is generated by wired-AND-ing the outputs of an up-literal and a down-literal. If we use the floating-gate MOS (FGMOS) transistor as a 4-valued device, each up-literal and each down-literal is implemented by a single FGFP [5], in which an FGMOS transistor is used not only as a storage device but also as a pass transistor. The threshold value of an up-literal or a down-literal is programmed by injecting a controlled amount of electrons into the floating gate. Figure 9 shows the symbol of the FGFP, where the threshold voltage is denoted by $V_{th}$, bounded by a broken line. Figure 10 shows the implementation of the switch function F shown in Fig. 6, where $V_S$ and $V_{\bar{S}}$ denote the control-gate voltages corresponding to S and $\bar{S}$. In N-valued logic, $\bar{S}$ is defined as $N - S - 1$. Hence, $\bar{S} = 3 - S$ for N = 4. Note that the down-literal $F_{DL}$ for S (Fig. 7) can be implemented by an up-literal for $\bar{S}$, since $S \le T$ is equivalent to $\bar{S} \ge N - 1 - T$.

Based on this observation, we propose the RCM shown in Fig. 11. To simplify the figure, only four tracks are illustrated. The RCM consists of three types of FGFPs. The FGFPs with the control-gate voltages $V_S$ and $V_{\bar{S}}$ are used to implement up-literals and down-literals, respectively. The FGFPs denoted by small squares are used to connect (or disconnect) horizontal and vertical tracks. This type of FGFP can implement the configuration-bit patterns with high redundancy (Fig. 3) area-efficiently. For example, the implementation of the MC switches shown in Fig. 12 is denoted by thick lines in Fig. 11. For the case of four contexts and four tracks, the number of transistors is reduced to about 10% and 60% in comparison with the SRAM-based implementation and the FGFP-based implementation that does not exploit redundancy, respectively.
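The mapping from literals to FGFP thresholds can be sketched with an idealized device model. The model below is an assumption made for illustration only: an FGFP is taken to conduct when its control-gate level exceeds its programmed threshold, the logic levels 0 to 3 are used directly as control-gate levels, and thresholds are placed half a level below the first conducting value. The names `fgfp_conducts`, `window_thresholds` and `mc_switch` are hypothetical. Under these assumptions the thresholds obtained for the function F of Fig. 6 are -0.5, 2.5, 1.5 and 0.5, the same four values that appear in Fig. 10.

```python
from typing import Sequence, Tuple

N = 4  # number of contexts; S is treated as a 4-valued signal


def complement(s: int, n: int = N) -> int:
    """N-valued complement: S_bar = N - S - 1."""
    return n - 1 - s


def fgfp_conducts(control_level: float, vth: float) -> bool:
    """Idealized FGFP (assumption): on when the control level exceeds Vth."""
    return control_level > vth


def window_thresholds(s1: int, s2: int, n: int = N) -> Tuple[float, float]:
    """Thresholds for the window S1 <= S < S2.

    Returns (Vth of the FGFP driven by V_S, Vth of the FGFP driven by V_Sbar).
    """
    vth_up = s1 - 0.5                    # up-literal F_UL(S, S1)
    t_down = s2 - 1                      # down-literal F_DL(S, S2 - 1)
    vth_down = (n - 1 - t_down) - 0.5    # same literal as an up-literal on S_bar
    return vth_up, vth_down


def mc_switch(s: int, windows: Sequence[Tuple[int, int]], n: int = N) -> int:
    """Wired-OR over windows of the wired-AND of two FGFPs."""
    s_bar = complement(s, n)
    for s1, s2 in windows:
        vth_up, vth_down = window_thresholds(s1, s2, n)
        if fgfp_conducts(s, vth_up) and fgfp_conducts(s_bar, vth_down):
            return 1
    return 0


if __name__ == "__main__":
    windows = [(0, 1), (2, 3)]           # F_WL1 and F_WL2 of Eq. (1)
    for w in windows:
        print(w, window_thresholds(*w))
    print([mc_switch(s, windows) for s in range(N)])
```

In this sketch, changing the switch function only changes the programmed thresholds, which mirrors the statement above that the threshold of each literal is set by the charge injected into the floating gate.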
4 Conclusion
This paper presents a novel switch block architecture for multi-context FPGAs. The key technologies are an architecture that exploits redundancy between context data and FGFPs that implement the configuration circuits area-efficiently. FGFPs are also expected to reduce static power in comparison with the SRAM-based implementation, because no supply voltage is required to retain the stored data. A test chip is currently being implemented in a 0.35 µm EPROM technology.
References
[1] A. DeHon, "Dynamically Programmable Gate Arrays: A Step Toward Increased Computational Density", in Proc. Fourth Canadian Workshop on Field-Programmable Devices, pp. 47-54 (1996)
[2] S. M. Scalera and J. R. Vazquez, "The design and implementation of a context switching FPGA", in Proc. IEEE Symposium on FPGAs for Custom Computing Machines, pp. 78-85 (1998)
[3] S. Trimberger, et al., "A Time-Multiplexed FPGA", in Proc. of FCCM, pp. 22-28 (1997)
[4] I. Kennedy, "Exploiting Redundancy To Speedup Reconfiguration of An FPGA", in Proc. FPL, pp. 262-271 (2003)
[5] T. Hanyu and M. Kameyama, "Multiple-Valued Logic-in-Memory VLSI Architecture Based on Floating-Gate-MOS Pass-Transistor Logic", IEICE Trans. Electron., Vol. E82-C, No. 9 (1999)
[6] Katarzyna Leijten-Nowak, "An FPGA Architecture with Enhanced Datapath Functionality", in Proc. FPGA03, pp. 195-204 (2003)
Figure 7. Decomposition of a window literal into an up-literal and a down-literal (4 contexts).
Figure 8. Circuit of an MC switch (4 contexts).
Figure 9. Floating-gate MOS transistor.
Figure 10. Implementation of the MC switch using FGFPs. (The window literals FWL1 and FWL2 are built from FGFPs with control-gate voltages $V_S$ and $V_{\bar{S}}$ and threshold values -0.5, 0.5, 1.5 and 2.5.)
Figure 11. RCM using FGFPs.
Figure 12. Example of MC switches.
