A low-power reconfigurable logic array based on double-gate transistors by Beckett, P
IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 16, NO. 2, FEBRUARY 2008 115
A Low-Power Reconfigurable Logic Array Based
on Double-Gate Transistors
Paul Beckett, Member, IEEE
Abstract—A fine-grained reconfigurable architecture based on
double gate technology is proposed and analyzed. The logic func-
tion operating on the first gate of a double-gate (DG) transistor is
reconfigured by altering the charge on its second gate. Each cell
in the array can act as logic or interconnect, or both, contrasting
with current field-programmable gate array structures in which
logic and interconnect are built and configured separately. Simu-
lation results are presented for a fully depleted SOI DG-MOSFET
implementation and contrasted with two other proposals from the
literature based on directed self-assembly.
Index Terms—CMOS integrated circuits, double-gate (DG)
transistors, logic circuits, nanotechnology, reconfigurable archi-
tectures.
I. INTRODUCTION
RECONFIGURABLE architectures are of great interest to system designers because they offer a way of achieving
power and performance efficiency by matching specific algo-
rithmic constructs with an appropriate architecture [1]. The tra-
ditional approach to developing reconfigurable systems, in field-
programmable gate arrays (FPGAs), for example, has been to
build separate regions of programmable logic gates and inter-
connection and to manage these two resources more-or-less sep-
arately during the synthesis process. Therefore, much of the
work on reconfigurable platforms has been directed towards
answering questions such as “how much of each and in what
form?” (e.g., [2]).
The reduced fan-out, power handling capacity, gain, and relia-
bility of deep-submicrometer (DSM) and nanoscale devices will
have a number of consequences for reconfigurable systems. As
device dimensions shrink, it will become increasingly difficult
to manufacture the complex heterogeneous layouts that have un-
derpinned field-programmable technology to date. Physical is-
sues such as the increasing difficulty in achieving alignment be-
tween process layers [3] as well as the prospect of poor perfor-
mance of FETs at reduced gate lengths [4] have already forced
designers to look towards alternative manufacturing techniques
on which to base programmable architectures. Ideas such as
chemically-assembled molecular electronics [5], nanotube and
nanowire devices [6], quantum dot techniques [7], and magnetic
spin-tunneling devices [8] have all been proposed as the basis of
future reconfigurable systems.
Manuscript received May 10, 2006; revised March 19, 2007.
The author is with the School of Electrical and Computer Engineering, RMIT
University, Melbourne 3000, Australia (e-mail: pbeckett@rmit.edu.au).
Digital Object Identifier 10.1109/TVLSI.2007.912024
This research is motivated by two related questions: what
types of simple (regular) CMOS structures can be exploited
to create future reconfigurable architectures at nanoscale dimensions,
and how might heterogeneous functionality emerge
from an essentially homogeneous array of simple devices? Remaining
in the CMOS domain offers a number of advantages,
including the availability of three terminal switching devices
with intrinsic gain, a stable and well characterized manufac-
turing base plus compatibility with existing design tools. The
disadvantage is that the design is constrained by lithographic
patterning and alignment issues. While it is forecast that feature
sizes (for logic) will reduce below 20 nm by 2016–2018 [9],
it is not clear at the moment how this might be achieved. Our
premise here is that simplified, regular structures with a min-
imal number of interconnection layers will have a better chance
of achieving sub-20 nm feature sizes than the complex, hetero-
geneous layouts that characterize most current micro-architec-
tures. It is likely that a “wish-list” of features for future recon-
figurable architectures would include at least some of the fol-
lowing:
• a simplified processing technology;
• a highly regular layout style;
• small logic and interconnect footprints, supporting high
component densities;
• configuration flexibility supporting efficient routing as
well as allowing a continuous tradeoff between routing
and logic;
• an organization that minimizes reconfiguration overheads.
The double-gate (DG) transistor is a promising device ap-
plicable to DSM due in particular to its inherent resistance to
short-channel effects and potentially ideal subthreshold perfor-
mance. Typically, the two gates would be operated together as
this offers the best switching performance. However, accessing
the two gates separately creates opportunities for innovative cir-
cuit design [10]. This paper proposes a reconfigurable architec-
ture based on double gate transistor circuits where the operating
point of the circuit can be set via one gate while the other gate
is used to form the logic array. In this way, the overheads im-
posed by reconfigurability can be reduced or hidden to an extent
where it becomes possible to support complex datapath architectures
with a homogeneous fine-grained organization.
The remainder of this paper proceeds as follows. In Section II,
the operation of the DG device is introduced and its application to
a reconfigurable logic cell is described. Section III then shows
how these cells can be assembled to form a homogeneous reconfigurable
processing mesh capable of flexible configuration
into logic and/or interconnect. We also present some prelimi-
nary performance estimates. Finally, this paper is summarized
in Section IV.
II. RECONFIGURABLE DG CELL
The many problems associated with scaling MOS transistors
are likely to result in the DG transistor becoming a preferred
circuit element. They are predicted by the ITRS [9] to appear as
early as 2011. Theoretically, these devices do not need channel
doping and, therefore, can be scaled to dimensions below 10
1063-8210/$25.00 © 2007 IEEE
Authorized licensed use limited to: RMIT University. Downloaded on August 5, 2009 at 21:15 from IEEE Xplore.  Restrictions apply. 
Fig. 1. Generalized DG transistor characteristics, showing drain current versus
voltage at one gate over a range of bias voltages on the other gate.
nm without running into problems of uncontrollable parameter
variations due to the random distribution of dopant atoms [11].
This section outlines the basic operation of the DG transistor and
shows how it supports the proposed reconfigurable logic array.
We first look at the generic behavior of these devices and then
at a specific implementation using silicide source/drain that may
result in simplified processing and superior performance as an
“end-of-roadmap” device.
A. DG Transistor Operation
It appears likely that in typical applications DG devices will
be set up to use the front and back gates together as this leads
to the best performance as a switching device [12]. However, if
the two gates can be accessed independently, one can be used
to set the operating point of the transistor thus affecting the
threshold behavior of the other. This is the basis for the opera-
tion of the proposed cell. Fig. 1 shows the basic idea for a generic
double-gate n-channel device simulated in Spice-3 using a fully
depleted SOI model developed at the University of Florida [13].
For this example, the transistor models were approximately cali-
brated to the idealized SOI double-gate device described in [14].
The impact of shifting the gate-source bias on one gate
is to move the effective threshold voltage seen
at the other. Similar behavior occurs in the p-channel devices.
The shift in threshold is accompanied by a change in the sub-
threshold slope, which will be discussed later in this section. It
should be noted that this effect is entirely symmetrical, and of
equal magnitude if both gate oxide thicknesses are the same. In
the following, we will tend to refer to one gate as the control
gate and the other as the logic gate.
As most of the important operating parameters of a logic gate
(e.g., switching threshold, gain, noise margin, etc.) depend on
the threshold voltage, the control-gate bias may be used to vary these during operation. The shift
in the logic switching threshold is illustrated in Fig. 2 for a
simple inverter circuit under five operating conditions (
and 1 V, no load). At the two
extremes, the output remains high ( 0.8 V) or low (
0.1 V) for all input values whereas for 0 V, the output
switches symmetrically. The resulting transfer curves are sim-
ilar to the characteristics of the planar “ground plane” (GP)
CMOS inverter presented by Ieong et al. in [15]. One objec-
tive of that investigation was how to use the bias on the second
gate to improve noise margins and gain for the inverter circuit
so those experiments focused only on part of the bias range
. In the case of the reconfigurable cell proposed in
Fig. 2. Switching characteristics of a configurable inverter over a range of con-
trol voltages V .
this paper, the offset is used at least partly in the opposite direc-
tion: to reduce the gain to as close to zero as possible.
Several DG topologies have been described in both MOS
and heterojunction technologies (e.g., [16]–[18]). Of these, the
back-gate planar configuration [18] may result in the densest
cell layout for this particular application. Although achieving
close alignment between the gates still represents a major pro-
cessing hurdle, process flows for planar DG-MOSFET with self-
aligned top and bottom gates have already been demonstrated
[12], [18], and there is evidence to suggest that adequate perfor-
mance might be achieved without precise gate alignment [19].
Although these devices have focussed on conventionally doped
source/drain devices, an alternative using silicide source/drain
regions and a mid-band metal gate structure may offer some ad-
vantages and this is briefly explored in Section II-B.
B. Silicide Source/Drain Devices
As previously mentioned, performance fluctuations due
to random dopant distribution in the channel will become
a major problem in CMOS at reduced gate lengths. An al-
ternative approach, employing an undoped channel region
and using Schottky barriers at the source and drain has been
demonstrated by a number of researchers. Metal silicides form
natural Schottky barriers to silicon substrates, acting to confine
carriers and reducing or eliminating the need for impurities in
the channel to prevent current flow in the off condition [20].
Schottky barrier devices were first described more than 30 years
ago [21] and have been investigated for many years. Although
they generally exhibit significantly lower drive currents than
conventional devices, this gap may close as devices shrink. For
example, there is evidence that quantum confinement effects
in thin, narrow silicon wires (i.e., nanowires) may result in
quasi-ballistic operation leading to greater mobility values
than in bulk silicon [14], although increased scattering due to
edge roughness and other effects may ultimately prevent fully
ballistic behavior [23].
As part of this paper, we have simulated a number of DG
thin-body Schottky devices of the form shown in Fig. 3 using
a commercial simulator with classical transport models [24].
The results were then compared with the characteristics of pre-
viously reported devices (in [25] and [26], for example). All of
the physically implemented devices reported to date have been
either single-gate or planar. Thus, the objective of this simula-
tion work was to determine the likely extent of the threshold
shift in intrinsic channel, DG, silicide S/D SOI devices.
The source/drain regions of the n-type transistors were as-
sumed to use ErSi (barrier height, 0.28 eV above Si)
although in [27] it is shown that ytterbium silicide (YbSi )
on silicon exhibits an electron barrier of 0.27 eV and may be
BECKETT: LOW-POWER RECONFIGURABLE LOGIC ARRAY BASED ON DG TRANSISTORS 117
Fig. 3. Basic topology of the thin-body DG transistor with silicide source/drain.
Fig. 4. Simulated drain current versus logic-gate voltage characteristics of the transistors of
Fig. 3: n-type (dots), p-type (crosses), with p and n control gate biases
of −2, 0, and +2 V as labeled.
easier to fabricate in thin films than ErSi . PtSi was assumed
for the p-type devices ( 0.23 eV). In both cases the gate
material was assumed to be Au/Cr with a work function of about
4.7 eV. The width and length were both fixed at 20 nm and the
gate oxide thickness at 1.5 nm.
Although we used classical diffusion models without
quantum correction, these have been shown to be sufficiently
accurate to the limit of this work around 5 nm [28].
However, the models appeared to underestimate the drain
current density at these dimensions. For example, the value
of 0.1 A derived here contrasts with actual devices
measured in [25] which exhibited current drives in the order of
2 A for the 20-nm thick channel in their study. We, therefore,
consider our drive current results to be a worst-case prediction.
In addition, it has been shown that quantum confinement effects
can produce a large threshold voltage shift in DG devices below
30 nm gate lengths [29], an effect that is not accounted for here.
Neither of these effects was important to this part of the study
and in any case, the quantum shift is most pronounced
below 10 nm.
Fig. 4 shows the simulated versus performance for
these devices with the control gate voltages set between
2 V, indicating that values of can be
achieved using these devices and that a significant threshold
shift can be expected as the control gate is modulated (e.g.,
0.375 V). This is the basis of the operation
of the proposed devices that can be exploited to create recon-
figurable lookup table (LUT) structures from simple arrays, as
will be discussed in Section II-C.
Fig. 5 illustrates the relative sensitivity of the threshold
voltage seen at the logic gate to the control gate bias with
various values of channel thickness between 5 and 30 nm.
These plots are for the n-type transistor; those for the p-type
Fig. 5. Threshold voltage change ($\Delta V_T$) versus control gate voltage
at various channel thicknesses for the n-type device of Fig. 3. The shape and magnitude of the
p-type device characteristics are similar.
have a similar form and magnitude. Here the threshold values
have been normalized such that at . As the
channel thickness is reduced the threshold sensitivity increases
to a point where at 5 nm, setting 1 V can
produce 0.45-V shift in threshold voltage with a shift of similar
magnitude observed for the p-type device at .
In the ground-plane mode (i.e., with one gate at a fixed potential),
the behavior of the subthreshold slope (S) is similar to
that of planar devices and is given by [30]
$$S = \frac{kT}{q}\ln(10)\left(1 + \frac{C_b}{C_{ox1}}\right) \qquad (1)$$
where $C_{ox1}$ is the oxide capacitance of gate 1 and $C_b$
is the effective body capacitance between the inversion layer
and gate 2 (see Fig. 3), so that $C_b = C_{si}$ if the back surface
is in accumulation, and $C_b = C_{si}C_{ox2}/(C_{si} + C_{ox2})$ in depletion.
Substituting $C_{ox} = \varepsilon_{ox}/t_{ox}$ and $C_{si} = \varepsilon_{si}/t_{si}$ ($\varepsilon_{si}/\varepsilon_{ox} \approx 3$ for
SiO$_2$ dielectric), the subthreshold slope becomes
$$S \approx 60\left(1 + \frac{3t_{ox1}}{t_{si} + 3t_{ox2}}\right)\ \text{mV/decade} \qquad (2)$$
The $3t_{ox2}$ term becomes zero when that surface is in accumulation.
This implies that scaling the silicon body thickness
to 5 nm will require the oxide thickness to shrink to less than
1 nm to maintain S below 100 mV/decade (which, in turn, will restrict
the supply range for reasons related to gate tunneling and
oxide breakdown). While it is theoretically possible for DG-SOI
transistors to approach the ideal subthreshold slope for MOS
(60 mV/decade) when used in DG mode (both gates driven together),
none of the nMOS devices reported to date has reached
this figure, although the YbSi S/D bulk device described in
[27] achieves 75 mV/decade. Previous values (e.g., [31])
have ranged between 100 and 150 mV/decade, i.e., $C_b/C_{ox1}$
between 0.6 and 1.5.
While the threshold sensitivity increases
with reducing body thickness, the subthreshold slope degrades.
It is suggested in [28] that this limits the tuning
range. However, in our application
the subthreshold shift is still useful well outside this range,
regardless of the final value of S. It can also be seen from (4)
that moving to high-κ gate dielectrics can significantly improve
S in this case (as well as serving to reduce gate leakage). For
example, using HfSiO the permittivity ratio in (2) becomes less
than 1.0 and the worst-case slope will reduce to approximately
78 mV/decade, although it is likely that this will be at the expense
of reduced channel mobility and increased short channel
effects [32].
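The dependence of the slope on geometry can be checked numerically. The sketch below is a minimal illustration, assuming the ground-plane expression takes the standard fully depleted SOI form S ≈ 60(1 + r·t_ox1/(t_si + r·t_ox2)) mV/decade, where r is the silicon-to-dielectric permittivity ratio (about 3 for SiO2) and the r·t_ox2 term is dropped when the back surface is in accumulation. The function and parameter names are hypothetical, and 60 mV/decade is the rounded room-temperature ideal.

```python
def subthreshold_slope(t_si_nm, t_ox1_nm, t_ox2_nm=None, eps_ratio=3.0, ideal=60.0):
    """Approximate ground-plane subthreshold slope in mV/decade.

    Assumed form: S = ideal * (1 + eps_ratio*t_ox1 / (t_si + eps_ratio*t_ox2)).
    Pass t_ox2_nm=None for a back surface in accumulation (term dropped).
    """
    back_term = 0.0 if t_ox2_nm is None else eps_ratio * t_ox2_nm
    return ideal * (1.0 + eps_ratio * t_ox1_nm / (t_si_nm + back_term))


# 5-nm body, 1-nm SiO2, back surface in accumulation
s_sio2 = subthreshold_slope(5.0, 1.0)
# Same body, 1.5-nm dielectric with an assumed permittivity ratio of 1 (high-k)
s_highk = subthreshold_slope(5.0, 1.5, eps_ratio=1.0)
print(round(s_sio2, 1), round(s_highk, 1))
```

Under these assumptions the two cases evaluate to roughly 96 and 78 mV/decade, in line with the figures quoted in the text.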
Fig. 6. Simplified view of the Schottky source/drain transistor pair.
Fig. 7. Configurable cells based on DG transistors. (a) Configurable 2-NOR
gate. (b) Configurable pass gate/inverting buffer structure.
C. Assembling the Array
Having described the operation of its components, we
can now show how the array can be assembled. The objective
here is to achieve a simple, regular organization that can then be
configured using the control-gates of the complementary tran-
sistors. One possible layout for such an array is based on pairs of
p and n-type transistors formed in undoped silicon by depositing
(and annealing) materials such as erbium and platinum silicide
in alternate rows (see Fig. 6). Conventional flash memory tech-
niques would be applicable to set the top gate charge in a similar
way to the SOI Flash cell described in [33]. Other more spec-
ulative configuration mechanisms have been suggested in [34]
and [35].
In Fig. 7, these complementary transistors have been or-
ganized into simple circuits exploiting this basic mechanism.
Fig. 7(a) represents a configurable two-input NOR gate in which
the threshold on each transistor is controlled via its gate bias
voltage (shown as black squares on the diagram) so that the
circuit can develop an enhanced set of logic functions shown
in the accompanying table. The four-transistor cell in Fig. 7(b)
can be configured to behave as either an inverting buffer or as a
non-inverting pass gate. It can also be put into a high impedance
mode to isolate its input and output signal. As will be shown
below, together these two cell types are sufficient to create a
“sea-of-gates” style configurable layout.
To analyze the characteristics of the 2-NOR gate, we can start
with the equation for saturation drain current in DSM tech-
nology developed in [36]
(3)
where is the physical gate length, the effective gate oxide
thickness, including any depletion layer effects, and is
a function of the source/drain resistance for a particular tech-
nology. The exponent describes the degree of velocity satura-
tion. It is given a value of 1.25 for DSM in [36], and will tend
towards 1 in short-channel, ballistic devices [37]. Assuming that
the term is constant for a given technology
and circuit design style, we can equate the saturation drain cur-
rents in the conventional way [38] to derive an approximate for-
mula for the switching threshold of the NOR circuit
(4)
where is the threshold voltage for the devices,
is the number of transistors in the stack [e.g., 2 in Fig. 7(a)] and
. Thus, will be a function of the transistor
gain ratio-given by and the relative mo-
bilities (replaced in (1) by the term). It can be noted
here that the effective mobility of the Schottky barrier pMOS
device is greater than that of the nMOS due to the lower barrier
height of PtSi (0.23 V versus 0.28 V for ErSi , see Fig. 3).
From (4), with the switching threshold
becomes when and thus
(5)
It can also be seen from (4) that when
(6)
and V when
(7)
which, together, describe the operation of the configurable NOR
in Fig. 7(a). Analogously to a conventional ROM, changing the bias
at the control gate shifts the n- and p-device thresholds for a given input sufficiently
that the switching threshold seen at that input moves outside
the voltage range of the logic signal. This is illustrated in
Table I, which lists the two device thresholds for one input of Fig. 7(a)
as well as the overall switching threshold seen at that input, derived
from (4). “Normal operation” means that the switching
threshold is about half the supply voltage and the gate is sensitive to that input.
Setting large but symmetrical threshold values puts the array in
a low standby power mode. While the switching threshold is still approximately
half the supply voltage, the propagation delay is severely affected so that the
cell would not normally be operated in this mode. Shifting its
switching threshold to greater than the supply voltage effectively makes the
gate insensitive to that input. Similarly, when its threshold is less
than ground, the n-device will always be conducting. As this is
a NOR configuration, any transistor pair in this condition results
in an output low.
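The input conditions described above can be captured in a purely logical sketch. The following is a behavioral model only, not a circuit simulation; the mode names NORMAL, IGNORE, and FORCE_LOW and the function `configurable_nor` are hypothetical labels for the cases where the switching threshold at an input sits inside the logic swing, above the supply, or below ground. The low-standby-power setting is not modeled.

```python
from itertools import product

NORMAL, IGNORE, FORCE_LOW = "normal", "ignore", "force_low"


def configurable_nor(inputs, modes):
    """Behavioral model of a NOR gate whose inputs are individually configured.

    NORMAL:    the input takes part in the NOR function.
    IGNORE:    switching threshold above the supply; the input is never seen as high.
    FORCE_LOW: switching threshold below ground; the n-device always conducts,
               so the output is pulled low regardless of the inputs.
    """
    if len(inputs) != len(modes):
        raise ValueError("one mode per input is required")
    pulled_low = False
    for value, mode in zip(inputs, modes):
        if mode == FORCE_LOW:
            pulled_low = True
        elif mode == NORMAL and value:
            pulled_low = True
        elif mode not in (NORMAL, IGNORE):
            raise ValueError(f"unknown mode: {mode}")
    return 0 if pulled_low else 1


# With input B ignored, the 2-NOR collapses to an inverter on input A.
for a, b in product((0, 1), repeat=2):
    print(a, b, configurable_nor((a, b), (NORMAL, IGNORE)))
```

In this model, leaving both inputs in NORMAL mode gives the ordinary NOR truth table, while any input in FORCE_LOW holds the output at 0.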
Equations (4)–(7) also illustrate two further points about this
organization. First, as the supply reduces with scaling, the range
of threshold shifts that will be required to configure the array
will also scale down. Second, although the optimum value of
TABLE I
OPERATING CONDITIONS AT INPUT A OF THE CONFIGURABLE 2-NOR
GATE OF Fig. 7(a), BASED ON THE DEVICE CHARACTERISTICS
IN Fig. 4 (n = 2; K = 3; V = 1 V)
(i.e., such that ) is related to , in common
with all static gates we can adjust over a fairly wide range
with a minimal effect on (but with an effect on perfor-
mance [38]). For example, with (at 0.8 V,
0.3 V), setting will result in
0.37 V (i.e., 4% shift from its optimum value). We
can also achieve a shift in switching between and 0 V with
0.45 V, which would require the front gate to be
modulated by approximately 1.3 V. This is consistent with the
results for the simulated device shown in Fig. 5 at 5 nm.
Up to this point, we have ignored the impact of threshold
voltage variability, which will increase the range of gate biases
necessary to achieve the desired switching thresholds. For
example, substituting into (4) a worst-case value with
25% of the nominal for both the p and
n devices (twice the ITRS figure of 12%), and assuming
the same as above ( 0.45 V), results in the maximum
bias increasing from 1.3 V to approximately 2 V. Thus,
we expect that gate biases in the range of 2 V (that are also
compatible with oxide reliability [39]) will be sufficient to
configure this array.
We will demonstrate the operation of this reconfigurable cell
using the organization shown in Fig. 8 which is set up as a
6-input, 6-output NOR-based LUT with each output line termi-
nated in a configurable inverter/3-state driver as described in
Fig. 7(b). Alternative organizations (e.g., NAND-LUT) would
be equally possible with a slight rearrangement of the internal
connections. These cells are then organized with adjacent con-
nections in the vertical and horizontal directions plus two local
feedback connections (see Fig. 9), making an 8 × 8 reconfiguration
block at each cell position. Arranged in this way, each
pair of adjacent cells contains sufficient resources for either a
small combinational logic circuit such as a 3-LUT, more com-
plex synchronous state machine elements such as latches and
flip-flops or simple asynchronous circuits, as will be described
in Section III.
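The cell organization can be illustrated with a small behavioral sketch, assuming each NOR row simply ignores inputs whose crosspoints are configured as "don't care" and each output passes through an interface gate set to pass, invert, or high impedance. The names `nor_row`, `driver`, and `evaluate_cell` and the example configuration are hypothetical, and the use of an unused output line for routing is not modeled.

```python
from typing import List, Optional, Sequence


def nor_row(inputs: Sequence[int], enabled: Sequence[bool]) -> int:
    """NOR over the inputs whose crosspoints are enabled (normal threshold).
    Disabled crosspoints model inputs set to "don't care"."""
    return 0 if any(v for v, e in zip(inputs, enabled) if e) else 1


def driver(value: int, mode: str) -> Optional[int]:
    """Output interface gate: 'pass', 'invert', or 'hiz' (returns None)."""
    if mode == "pass":
        return value
    if mode == "invert":
        return 1 - value
    if mode == "hiz":
        return None
    raise ValueError(f"unknown driver mode: {mode}")


def evaluate_cell(inputs, crosspoints, driver_modes) -> List[Optional[int]]:
    """Evaluate one cell: crosspoints[row][col] enables input col on NOR row."""
    return [driver(nor_row(inputs, row), mode)
            for row, mode in zip(crosspoints, driver_modes)]


# Hypothetical configuration: row 0 computes NOR(a, b, c) and is inverted to
# give OR(a, b, c); the remaining rows are isolated (high impedance).
N = 6
crosspoints = [[False] * N for _ in range(N)]
crosspoints[0][0] = crosspoints[0][1] = crosspoints[0][2] = True
modes = ["invert"] + ["hiz"] * (N - 1)
print(evaluate_cell([0, 1, 0, 0, 0, 0], crosspoints, modes))
```

Representing the configuration as a 6 × 6 Boolean matrix mirrors the one-programming-node-per-crosspoint layout of the cell, although the real configuration is an analog bias rather than a single bit.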
III. RECONFIGURABLE COMPUTING CIRCUITS
The reconfigurable array described in Section II could be said
to be “polymorphic” [34], [35] in that it may be arbitrarily con-
figured as logic and/or interconnect or as combinations of both.
Although there are many ways to implement such an array, the
thin-body, fully-depleted, DG MOSFET devices described in
Section II at least have the advantage that they represent a plau-
sible evolutionary path from conventional CMOS technology.
In this section, we demonstrate that the simple array structure
proposed in Section II can support a range of complex datapath
architectures.
A. Programmable Interconnect
A primary characteristic of this proposed array is that there
is little intrinsic difference between logic and routing and each
Fig. 8. Single reconfigurable cell based on a 6 × 6 NOR organization with pass/invert
interface gates. The black squares are the top gate programming nodes.
Fig. 9. Data flow in adjacent cells. The open circles are the pass/invert inter-
face cells between 6-LUTs. The corner space between groups of four cells is
occupied by additional sets of pass/invert cells for local interconnect.
cell can be used for both simultaneously. Fig. 10 illustrates the
range of options available to merge logic fan-in, fan-out, and
routing as follows.
1) Logic Fan-in and Fan-out: The basic layout of each cell
comprises six programmable 6-NOR gates [see Fig. 7(a)].
Input lines that are not part of a particular logic mapping on a
6-NOR are set to “don’t care” on that gate by shifting the effective
switching threshold seen at that input. In the example of Fig. 10,
the nine inputs are partitioned across cells 1, 3, and 5.
Cell 1 creates the logic function (normal threshold
for these three inputs, high for the remaining three). The
term is then routed across cell 2 and 3 to merge with the terms
and generated by cells 3 and 5, respec-
tively. The resulting function appears on line “Y” at the output
of cell 4 and is then transferred to cell 9. In a similar manner,
this term is distributed horizontally to cells 8 and 11 where it is
combined with inputs routed through 2, 6, 7, and 11. The partial
logic terms developed in cells 8 and 10 are then transferred via
12 and 14 to be recombined in cell 13. (Note that this example
is not intended to represent any particular logic function).
Fig. 10. Programmable cell styles: each cell can be set up as configurable NOR
logic and/or routing block. Linking two adjacent cells via a (bidirectional) pass
gate allows signals to be routed across a cell to the next logic gate.
2) Routing: With a few restrictions, signal routing can occur
simultaneously in the two orthogonal directions within each
cell. First, any input line automatically carries that signal across
the cell so that it is available to be connected to an adjacent cell
via a pass transistor. The partial directionality imposed by the
interface gates restricts this option to noninverting (pass) logic.
At the same time, any output line not involved in logic, for which
all transistors are biased “off” (i.e., high ) can also be used
to route signals across the cell. As an example, cell 4, which is
used to form the intermediate logic terms previously outlined, is
also shown transferring the signal “X” vertically to cell 9, from
where it continues through to cell 13. Finally, each cell is effectively
equivalent to a 6 × 6 universal routing block (e.g., cells 7
and 11 in Fig. 10). These corner routing cells can be inverting or
noninverting and can be easily cascaded. The result is a flexible
organization that supports fine-grained tradeoffs between logic
and routing. Ironically, this level of flexibility is also likely to
make the job of automatic place-and-route difficult and as a re-
sult it might not be possible to maximize the use of every indi-
vidual cell.
B. Combinational and Sequential Logic
As an example of combinational and sequential logic syn-
thesis, a simple D-type flip-flop and a full-adder circuit were
simulated using Spice3 level 10 SOI models available at the
Nanotechnology Simulation Hub [40]. The thin-film double
gate transistor models of the D-type were tuned approximately
to the characteristics of the devices fabricated in [25] (with a
particular focus on the subthreshold slope) while the full-adder
used parameters derived from the simulated devices outlined
in Section II. Figs. 11 and 12 show the results for the D-type
flip-flop and full-adder, respectively. Table IV summarizes
the performance of these circuits and, as a comparison, also
includes simulation results from [41] and [42] that are further
discussed in Section III-D.
C. Layout Area Comparisons
In this section, we estimate the area of circuits on our array
and compare it in general terms with the fixed channel island-
style routing of a conventional FPGA. The primary objective
here was to explore whether complex logic can be mapped effi-
ciently onto a flat, undifferentiated array of the sort proposed in
Fig. 11. Simulated waveforms for D-type FF circuit (V = 1.0 V).
this paper and whether this fine-grained array style can support
larger datapath components.
We start with an analysis of the netlists of a number of cir-
cuits drawn from the LGSynth93 archive [43]. Each of these
circuits had already been partitioned onto a basic logic element
(BLE) comprising a 4-LUT and a flip flop [see Fig. 13(a)]. The
component values listed in Table II represent the number of
cells mapped to a particular basic block configuration, i.e., LUT
only (bypassing the D flip-flop), latch only (bypassing the LUT)
and both LUT and flip-flop simultaneously. Table II also shows
the estimated area required by these circuits. We did not actually
place and route these circuits, but based the estimates on
the following conservative assumptions.
1) In the PMA case, a 4-LUT and D-type can each be config-
ured using a pair of adjacent cells and interconnected by
abutment as shown in Fig. 13(b), incurring no additional
local routing overhead. As they are approximately the same
complexity, one cell-pair and a BLE are assumed to occupy
equal area in the same technology.
2) We assumed the impact of the configuration mechanisms to
be approximately the same in each case so that it could be
ignored. Each PMA block requires 128 configuration bits.
Conventional FPGA devices use sparse encodings, typi-
cally using at least a factor of 2–4 more configuration bits
than necessary, so are likely to exhibit a similar number of
configuration bits as the PMA for an equivalent function.
3) The width of the routing channels was estimated to be four
times the original logic block width based on data from
[44] with the LUT input size (K) within the range 3–6. The
overall cost of using a BLE was therefore fixed at 25 units.
This is an underestimate for FPGA devices.
4) The routing overhead for the PMA layout was estimated
using the stochastic wire length model of [45] with the
Rent parameter set to 0.8. This is an overestimate, more
typical of random logic blocks. The true parameter will be
smaller for all these circuits. The wire length estimates (in
units of cell pitches) were then scaled up by to allow
for the rectangular routing constraints and doubled again to
cover routing congestion and placement inefficiencies. The
overall cost of using a polymorphic cell-pair is therefore
taken to be units, where is the average
interconnect length. Routing is confined to the adjacent
regions between merged cells as shown in Fig. 10.
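The bookkeeping behind these assumptions can be sketched as follows. This is an illustration only: the BLE cost of 25 units is read here as a square tile whose side is one logic block plus a routing channel four blocks wide, and the per-pair cost of the array is passed in as a parameter rather than computed, since it depends on the wire length model of [45]. All function and parameter names are hypothetical, and the numbers in the usage lines are invented for illustration, not taken from Table II.

```python
def ble_tile_cost(channel_width_ratio: float = 4.0) -> float:
    """Area of one FPGA tile in units of logic-block area, assuming a square
    tile whose side is the block width plus the routing channel width."""
    return (1.0 + channel_width_ratio) ** 2


def area_ratio(n_ble: int, n_cell_pairs: int, cost_per_pair: float,
               channel_width_ratio: float = 4.0) -> float:
    """Ratio of array area to FPGA area; values below 1 favor the array."""
    if n_ble <= 0:
        raise ValueError("n_ble must be positive")
    fpga_area = n_ble * ble_tile_cost(channel_width_ratio)
    array_area = n_cell_pairs * cost_per_pair
    return array_area / fpga_area


print(ble_tile_cost())  # 25.0
print(area_ratio(100, 120, cost_per_pair=15.0))
```

Keeping the per-pair cost as an input makes it easy to see how the ratio moves toward unity as the average interconnect length, and hence the routing overhead per cell-pair, grows.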
Unlike an FPGA, this array is not constrained by a fixed
channel width set by the number of pins on a fixed logic block.
The ability to configure routing only where necessary and to
collapse low fan-out cells into blocks connected by abutment,
results in area ratios less than one for all of the smaller bench-
marks. The ratio of the areas reaches (and slightly exceeds) unity
Fig. 12. Simulated sum and carry-out waveforms for full-adder (V =
0.8 V).
Fig. 13. (a) BLE structure. (b) Group of configured logic cells forming merged
3-LUT and flip-flop elements. A key difference from (a) is that both the LUT
and register outputs are available and it is not necessary to instantiate the mul-
tiplexor.
TABLE II
AREA COMPARISON BETWEEN BLE AND THE PMA FOR
LGSYNTH93 CIRCUITS [43]
in the larger circuits with longer average wire lengths and in the
case where a greater number of flip-flops increases the overall
number of cells compared to the FPGA. Even though these re-
sults are only approximate, we observe that the array is likely to
be no worse that the fixed channel FPGA and will typically be
better due to its greater flexibility in trading logic for intercon-
nect area.
TABLE III
RELATIVE AREA FOR ARITHMETIC CIRCUITS
Fig. 14. Example floorplan of a duplicated datapath from [46].
The general trend of these results is reinforced in Table III,
which shows a similar analysis of a number of arithmetic circuits
written in VHDL, originally targeting an FPGA. In this
case, wire lengths are estimated using a Rent exponent of 0.4
to reflect the regularity of these arithmetic circuits and their
low fan-out. These arithmetic circuits can therefore be readily
grouped to form larger blocks, mainly by abutment and with
a small ratio of internal routing. Table III shows that, under the
same assumptions as before, this regularity and locality will
support compact layouts, up to three times more compact than on
conventional organizations with fixed routing.
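The effect of the Rent exponent can be illustrated with Rent's rule, T = k N^p, which relates the number of external terminals T of a block to the number of cells N it contains. The sketch below is illustrative only. It is not the stochastic wire-length model of [45], and the value of k = 4 terminals per cell and the block size of 256 cells are assumptions chosen for the example.

```python
def rent_terminals(n_cells: int, rent_exponent: float,
                   terminals_per_cell: float = 4.0) -> float:
    """Rent's rule: external terminals T = k * N**p for a block of N cells.

    `terminals_per_cell` (k) is an assumed value, not taken from the paper.
    """
    if n_cells < 1:
        raise ValueError("n_cells must be at least 1")
    return terminals_per_cell * n_cells ** rent_exponent


N_CELLS = 256  # hypothetical block size

random_logic = rent_terminals(N_CELLS, 0.8)  # exponent used for Table II
arithmetic = rent_terminals(N_CELLS, 0.4)    # exponent used for Table III

print(f"p = 0.8: {random_logic:.1f} terminals")
print(f"p = 0.4: {arithmetic:.1f} terminals")
print(f"ratio:   {random_logic / arithmetic:.2f}")
```

For a block of this size, the lower exponent reduces the external connection count by a factor of about nine, which is consistent with the paper's observation that regular arithmetic circuits can be grouped largely by abutment.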
As an example, a duplicated 8-bit data path was set up with
the general floorplan shown in Fig. 14. Here, the two data path
blocks are formed from replicated 1-bit add/compare/register
slices (cf. [46]) mirrored across a common central intercon-
nection bus. Adjacent data path blocks are connected by abut-
ment and interconnect channels are shared where appropriate,
thereby incurring little additional interconnection area or delay
penalty apart from a small increase in the capacitive load on the
“corner” routing cells in the central channel. This takes advan-
tage of a characteristic of the particular organization chosen for
this array, i.e., that the dataflow direction can be reversed simply
by shifting the slice mapping by one cell position in any direc-
tion (see Fig. 9). Of course, the tight rectangular packing exhib-
ited here is a particular characteristic of this example and would
not be universal. However, this simple example does serve to
illustrate how the “polymorphic” behavior of these cells allows a
fine-grained organization to efficiently support regular datapath
structures.
TABLE IV
SIMULATED PERFORMANCE FIGURES FOR THE ARRAY
(WITH FIGURES FROM [41] AND [42] FOR COMPARISON)
D. Static Power-Delay Performance
In this section, we briefly examine the static power and per-
formance results for the proposed array and compare it against
two proposals for nanoscale systems that rely on a partial self-
assembly approach. The results are shown in Table IV. The
crossbar full-adder circuit of [41] is based on rectifying junc-
tions formed from rotaxane molecules such as those described
in [47], whereas the programmable logic array (PLA) circuits in
[42] are based on (hypothetical) stochastic self-assembly tech-
niques that would form orthogonal arrays of silicon nanowires
randomly connected to a micro-scale address decoder.
Although the nanowire and the molecular crossbar circuits
are similar in both functional complexity and in overall size,
we have to exercise care when making direct comparisons be-
tween them as all of the figures listed depend on specific de-
sign details that are difficult to quantify in general terms. The
thin-body MOSFETs described in [25] exhibit low values of
threshold (particularly the n-type devices) due to the partic-
ular gate workfunction chosen. As a result, the static power
of the D-type is more than 60 times that of the
molecular design and is certainly too high to support the levels
of integration envisaged. For example, if we assume that the
target is a low-power portable system with a static power target
of 0.01 W/cm², the density would be power-limited to fewer
than 300 LUT blocks/cm².
In a fixed threshold system, ultimate scaling is likely to re-
quire separate gate work functions set above and below the mid-
band in order to “tune” and balance the transistor thresholds
[17]. The silicide devices presented in Section II exhibit much
lower values of (in the order of 10 A), thus their static
power performance is better than even the molecular case, but
at the expense of performance. It is worth noting that
even at its quoted figure of 0.55 µW, the molecular crossbar
element analyzed in [41] would reach a static power density of
0.01 W/cm² at fewer than 2 × 10⁴ LUT blocks/cm², a figure
that could be easily exceeded by current low-power CMOS
techniques. Once the drivers/sense amplifiers are factored into the
molecular crossbar case, its static power is almost double that
of the high speed CMOS circuits with approximately the same
propagation delay. It is only at the maximum figure for passive
cooling of 100 W/cm² that we start to approach the sort of
densities promised by the device roadmaps [e.g.,
blocks/cm²]. It should be remembered that this does not include the
effect of dynamic power, which would apply additional
constraints. No static power figures are presented in [42] (and the
proposed manufacturing methodology is purely speculative in
any case). The 0.1 W figure in Table IV is from [6] and is
based on the estimated worst-case pull-down resistance for the
“nMOS-style” organization and with a supply voltage of 1 V.
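The power-limited densities quoted above follow from dividing the static power budget per unit area by the static power of one LUT block. The sketch below reproduces that arithmetic. It assumes that the quoted molecular figure is 0.55 µW per block (the micro prefix is inferred from the stated ratio of 60 and the limit of 300 blocks/cm²); the function names are illustrative.

```python
def power_limited_density(budget_w_per_cm2: float,
                          static_w_per_block: float) -> float:
    """Maximum blocks per cm^2 before static power alone exhausts the budget."""
    if static_w_per_block <= 0:
        raise ValueError("static power per block must be positive")
    return budget_w_per_cm2 / static_w_per_block


def implied_block_power(budget_w_per_cm2: float,
                        blocks_per_cm2: float) -> float:
    """Static power per block implied by a given power-limited density."""
    if blocks_per_cm2 <= 0:
        raise ValueError("density must be positive")
    return budget_w_per_cm2 / blocks_per_cm2


PORTABLE_BUDGET = 0.01      # W/cm^2, low-power portable target
PASSIVE_COOLING = 100.0     # W/cm^2, passive cooling limit
MOLECULAR_BLOCK = 0.55e-6   # W per block; assumed to be 0.55 microwatts

molecular_portable = power_limited_density(PORTABLE_BUDGET, MOLECULAR_BLOCK)
molecular_cooled = power_limited_density(PASSIVE_COOLING, MOLECULAR_BLOCK)
thin_body_block = implied_block_power(PORTABLE_BUDGET, 300.0)

print(f"molecular, 0.01 W/cm^2: {molecular_portable:.3g} blocks/cm^2")
print(f"molecular, 100 W/cm^2:  {molecular_cooled:.3g} blocks/cm^2")
print(f"thin-body block power:  {thin_body_block * 1e6:.1f} uW")
print(f"ratio to molecular:     {thin_body_block / MOLECULAR_BLOCK:.1f}")
```

Under these assumptions the molecular element is limited to about 1.8 × 10⁴ blocks/cm² at 0.01 W/cm², and the limit of 300 blocks/cm² implies roughly 33 µW per LUT block, about 60 times the molecular figure.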
Nanowire FETs are constrained by the same , and
subthreshold slope considerations as conventional devices. As
a result, the more complete proposal presented in [42] uses dy-
namic logic as a way to maintain performance while minimizing
overall system power, although at the expense of introducing ad-
ditional multiphase clock distribution overheads, and with little
likely impact on static power.
As a final observation, it is worth noting that by the final ITRS
technology nodes (16–22 nm), all of these proposals appear to
result in similar device densities and are limited by power/energy
rather than layout area. The area comparison presented in
Section III-C implies that there are likely to be minimal differences
in mapping efficiency when each is applied to real circuits,
so it will be power density that sets an ultimate scaling
limit, supporting perhaps a density factor of 2 at best. It could
be argued that these modest returns would not be worth the sig-
nificant increase in complexity that would result from mixing
conventional and self-assembled manufacturing techniques.
IV. CONCLUSION
A fine-grained reconfigurable platform based on complemen-
tary, DG, fully depleted SOI transistors has been proposed and
analyzed. DG-SOI technology is of interest as it offers a way of
avoiding short channel effects and substrate leakage as channel
lengths are scaled down. Given that the second gate can be jus-
tified on these grounds, the ability to configure the transistors’
operating region can be added with little additional overhead.
In the fabric developed here, the DG devices are arranged
such that the top gates may be programmed to affect the
switching threshold of the device and therefore its logic
function. Organizing these into simple arrays supports the de-
velopment of more complex computing functions. While there
are still many technical challenges to be overcome, such de-
vices offer a number of tangible benefits, not the least of which
is a plausible migration path from conventional (micro-scale)
CMOS.
Extending CMOS into the nanoscale domain will require a
careful balance between power and performance. The SOI array
offers low-power, complementary operation, without the addi-
tional overheads of either level restoration or gain blocks (as
required, for example, by molecular logic circuits). By com-
paring the performance of our CMOS array with others based on
self-assembly methods, we can see that moving to non-CMOS
technologies will not guarantee, per se, higher densities or im-
proved performance. Many of the gains to be made towards the
end of the roadmap will be achieved by exploiting innovative
architectures to ameliorate the impact of poor device perfor-
mance and reliability, increasing variability and, in particular,
power/energy.
REFERENCES
[1] S. Hauck, “The roles of FPGAs in reprogrammable systems,” Proc.
IEEE, vol. 86, no. 4, pp. 615–638, Apr. 1998.
[2] V. Betz and J. Rose, “How much logic should go in an FPGA logic
block?,” IEEE Des. Test Comput., vol. 15, no. 1, pp. 10–15, Jan./Mar.
1998.
[3] R. Compano, “Technology roadmap for nanoelectronics,” European
Commission IST Programme-Future and Emerging Technologies,
2000 [Online]. Available: http://cordis.europa.eu/ist/fet/nidqf.htm
[4] R. Ronen, A. Mendelson, K. Lai, S.-L. Lu, F. Pollack, and J. P. Shen,
“Coming challenges in microarchitecture and architecture,” Proc.
IEEE, vol. 89, no. 3, pp. 325–340, Mar. 2001.
[5] S. C. Goldstein and M. Budiu, “Nanofabrics: Spatial computing using
molecular electronics,” in Proc. 28th Int. Symp. Comput. Arch., 2001,
pp. 178–189.
[6] A. DeHon, “Array-based architecture for FET-based, nanoscale elec-
tronics,” IEEE Trans. Nanotechnol., vol. 2, no. 1, pp. 23–32, Mar. 2003.
[7] C. S. Lent, P. D. Tougaw, W. Porod, and G. H. Bernstein, “Quantum
cellular automata,” Nanotechnol., vol. 4, no. 1, pp. 49–57, 1993.
[8] R. Richter, H. Boeve, L. Bär, J. Bangert, G. Rupp, G. Reiss, and
J. Wecker, “Field programmable spin-logic realized with tun-
nelling-magnetoresistance devices,” Solid-State Electron., vol. 46, no.
1–3, pp. 639–643, 2002.
[9] Semiconductor Industry Association, “International Technology
Roadmap for Semiconductors,” (2005). [Online]. Available:
http://public.itrs.net/Common/2005ITRS/Home2005.htm
[10] V. V. Rakitin and E. I. Filippov, “Logical elements based on dual MOS
transistors,” (1996). [Online]. Available: http://www.niifp.ru/english/
nano/lebdmos.html
[11] A. R. Brown, J. R. Watling, and A. Asenov, “A 3-D atomistic study of
archetypal double gate MOSFET structures,” J. Computational Elec-
tron., vol. 1, no. 1–2, pp. 165–169, 2002.
[12] K. W. Guarini, P. M. Solomon, Y. Zhang, K. K. Chan, E. C. Jones, G.
M. Cohen, A. Krasnoperova, M. Ronay, O. Dokumaci, J. J. Bucchig-
nano, J. C. , Jr, C. Lavoie, V. Ku, D. C. Boyd, K. S. Petrarca, I. V.
Babich, J. Treichler, and P. M. Kozlowski, “Triple-self-aligned, planar
double-gate MOSFETs: Devices and circuits,” in Proc. Int. New Elec-
tron Devices Meet., 2001, pp. 19.2.1–19.2.4.
[13] J. G. Fossum, M.-H. C. L. Ge, V. P. Trivedi, M. M. Chowd-
hury, L. Mathew, G. O. Workman, and B.-Y. Nguyen, “A
process/physics-based compact model for nonclassical CMOS
device and circuit design,” Solid-State Electron., vol. 48, no. 6, pp.
919–926, Jun. 2004.
[14] Z. Ren, R. Venugopal, S. Datta, M. Lundstrom, D. Jovanovic, and J.
Fossum, “The ballistic nanotransistor: A simulation study,” in Proc.
Int. Electron Devices Meet., 2000, pp. 715–18.
[15] M. Ieong, E. C. Jones, T. Kanarsky, Z. Ren, O. Dokumaci, R. A. Roy,
L. Shi, T. Furukawa, Y. Taur, R. J. Miller, and H.-S. P. Wong, “Experi-
mental evaluation of carrier transport and device design for planar sym-
metric/asymmetric double-gate/ground-plane CMOSFETs,” in Proc.
Int. Electron Devices Meet., 2001, pp. 6.1–6.4.
[16] N. J. Collier and J. R. A. Cleaver, “Novel dual-gate HEMT utilising
multiple split gates,” Microelectron. Eng., vol. 41–42, pp. 457–460,
1998.
[17] L. Chang, S. Tang, T.-J. King, J. Bokor, and C. Hu, “Gate length scaling
and threshold voltage control of double-gate MOSFETs,” in Proc. Int.
Electron Devices Meet., 2000, pp. 719–722.
[18] T. Schulz, W. Rosner, E. Landgraf, L. Risch, and U. Langmann,
“Planar and vertical double gate concepts,” Solid-State Electron., vol.
46, no. 7, pp. 985–989, 2002.
[19] F. Allibert, A. Zaslavsky, J. Pretet, and S. Cristoloveanu, “Double-gate
MOSFETs: Is gate alignment mandatory?,” in Proc. Euro. Solid-State
Device Res. Conf., 2001, pp. 267–270.
[20] J. R. Tucker, “Schottky barrier MOSFETs for silicon nanoelectronics,”
in Proc. Adv. Workshop Frontiers Electron. (WOFE), 1997, pp. 97–100.
[21] T. Lepselter and S. M. Sze, “SB-IGFET: An insulated-gate field-effect
transistor using Schottky barrier contacts for source and drain,” Proc.
IEEE, vol. 56, no. 8, pp. 1400–1401, Aug. 1968.
[22] Z. Ren, R. Venugopal, S. Datta, and M. Lundstrom, “The ballistic
nanotransistor: A simulation study,” in Proc. Int. Electron Devices
Meeting, 2002, pp. 715–718.
[23] W. Chen, L. F. Register, and S. K. Banerjee, “Simulation of quantum
and scattering effects along the channel of ultra-scaled Si-based
MOSFETs,” in Proc. 60th Device Res. Conf. Dig., 2002, pp. 109–110.
[24] Silvaco, Santa Clara, CA, “Atlas device simulation software,” 2007
[Online]. Available: http://www.silvaco.com
[25] J. Kedzierski, P. Xuan, E. H. Anderson, J. Bokor, T.-J. King, and C.
Hu, “Complementary silicide source/drain thin-body MOSFETs for the
20 nm gate length regime,” in Proc. Int. Electron Dev. Meet. (IEDM),
2000, pp. 57–60.
[26] Q.-T. Zhao, P. Kluth, S. Winnerl, and S. Mantl, “Fabrication of
Schottky barrier MOSFETs on SOI by a self-assembly CoSi2-patterning
method,” Solid-State Electron., vol. 47, pp. 1183–1186, 2003.
[27] S. Zhu, J. Chen, M.-F. Li, S. J. Lee, J. Singh, C. X. Zhu, A. Du, C. H.
Tung, A. Chin, and D. L. Kwong, “N-type Schottky barrier source/drain
MOSFET using ytterbium silicide,” IEEE Electron Device Lett., vol.
25, no. 8, pp. 565–567, Aug. 2004.
[28] H.-S. P. Wong, D. J. Frank, and P. M. Solomon, “Device design con-
siderations for double-gate, ground-plane, and single-gated ultra-thin
SOI MOSFET’s at the 25 nm channel length generation,” in Proc. Int.
Electron Devices Meet. (IEDM), 1998, pp. 407–410.
[29] J. R. Watling, A. R. Brown, and A. Asenov, “Can the density gradient
approach describe the source-drain tunnelling in decanano double-gate
MOSFETs?,” J. Comput. Electron., vol. 1, pp. 289–293, 2002.
[30] D. J. Wouters, J.-P. Colinge, and H. E. Maes, “Subthreshold slope in
thin-film SOI MOSFETs,” IEEE Trans. Electron Devices, vol. 37, no.
9, pp. 2022–2033, Sep. 1990.
[31] J. Kedzierski, E. Nowak, T. Kanarsky, Y. Zhang, D. Boyd, R. Car-
ruthers, C. Cabral, R. Amos, C. Lavoie, R. Roy, J. Newbury, E. Sullivan,
J. Benedict, P. Saunders, K. Wong, D. Canaperi, M. Krishnan, K.-L.
Lee, B. A. Rainey, D. Fried, P. Cottrell, H.-S. P. Wong, M. Ieong, and
W. Haensch, “Metal-gate FinFET and fully-depleted SOI devices using
total gate silicidation,” in Proc. Int. Electron Devices Meet. (IEDM),
2002, pp. 247–250.
[32] Q. Chen, L. Wang, and J. D. Meindl, “Impact of high-k dielectrics on
undoped double-gate MOSFET scaling,” in Proc. IEEE Int. SOI Conf.,
2002, pp. 115–116.
[33] X. Lin, M. Chan, and H. Wang, “Opposite side floating gate SOI flash
memory cell,” in Proc. IEEE Electron Devices Meet., 2000, pp. 12–15.
[34] P. Beckett, “A fine-grained reconfigurable logic array based on double
gate transistors,” in Proc. IEEE Int. Conf. Field-Program. Technol.
(FPT), 2002, pp. 260–267.
[35] P. Beckett, “A polymorphic hardware platform,” in Proc. 10th Recon-
figurable Arch. Workshop (RAW), 2003, p. 175.
[36] K. Chen, C. Hu, P. Fang, M. R. Lin, and D. L. Wollesen, “Predicting
CMOS speed with gate oxide and voltage scaling and interconnect
loading effects,” IEEE Trans. Electron Devices, vol. 44, no. 11, pp.
1951–1957, Nov. 1997.
[37] T. Sakurai and A. Newton, “Alpha-power law MOSFET model and
its applications to CMOS inverter delay and other formulas,” IEEE J.
Solid-State Circuits, vol. 25, no. 2, pp. 584–594, Feb. 1990.
[38] S.-M. Kang and Y. Leblebici, CMOS Digital Integrated Circuits: Anal-
ysis and Design. New York: McGraw-Hill, 1996.
[39] D. J. Frank, R. H. Dennard, E. Nowak, P. M. Solomon, Y. Taur, and
H.-S. P. Wong, “Device scaling limits of Si MOSFETs and their
application dependencies,” Proc. IEEE, vol. 89, no. 3, pp. 259–288, Mar.
2001.
[40] “The nanotechnology simulation hub,” [Online]. Available:
http://www.nanohub.org/
[41] M. M. Ziegler and M. R. Stan, “The CMOS/nano interface from a cir-
cuits perspective,” in Proc. Int. Symp. Circuits Syst. (ISCAS), 2003, pp.
904–907.
[42] A. DeHon and M. J. Wilson, “Nanowire-based sublithographic pro-
grammable logic arrays,” in Proc. ACM/SIGDA 12th Int. Symp. Field-
Program. Gate Arrays (FPGA), 2004, pp. 123–132.
[43] “The FPGA place-and-route challenge,” [Online]. Available:
http://www.eecg.toronto.edu/~vaughn/challenge/challenge.html
[44] J. Rose, R. J. Francis, D. Lewis, and P. Chow, “Architectures of field-
programmable gate arrays: The effect of logic functionality on area
efficiency,” IEEE J. Solid-State Circuits, vol. 25, no. 5, pp. 1217–1225,
May 1990.
[45] J. A. Davis, V. K. De, and J. D. Meindl, “A stochastic wire-length
distribution for gigascale integration (GSI). II. Applications to clock
frequency, power dissipation, and chip size estimation,” IEEE Trans.
Electron Devices, vol. 45, no. 3, pp. 590–597, Mar. 1998.
[46] A. P. Chandrakasan, S. Sheng, and R. W. Brodersen, “Low-power
CMOS digital design,” IEEE J. Solid-State Circuits, vol. 27, no. 4, pp.
473–484, Apr. 1992.
[47] C. P. Collier, E. W. Wong, M. Belohradský, F. M. Raymo, J. F. Stod-
dart, P. J. Kuekes, R. S. Williams, and J. R. Heath, “Electronically con-
figurable molecular-based logic gates,” Science, vol. 285, pp. 391–394,
1999.
Paul Beckett (M’06) was born in Melbourne,
Australia, in 1953. He received the B.Eng. (Comm.),
M.Eng., and Ph.D. degrees from the Royal Mel-
bourne Institute of Technology (now RMIT
University), Melbourne, in 1975, 1984, and 2007,
respectively.
He is currently a Senior Lecturer with the School of
Electrical and Computer Engineering, RMIT Univer-
sity, where he teaches undergraduate and postgrad-
uate courses in embedded computer architecture, dig-
ital logic, and VLSI design. His research interests in-
clude the design and simulation of nanoscale devices and the mixed-signal mod-
eling of reconfigurable circuits and architectures.
