Novel Configurable Logic Block Architecture Exploiting Controllable-Polarity Transistors, invited paper by Gaillardon, Pierre-Emmanuel et al.
Novel Configurable Logic Block Architecture Exploiting 
Controllable-Polarity Transistors 
Pierre-Emmanuel Gaillardon, Xifan Tang, Giovanni De Micheli 
EPFL, Lausanne, Switzerland 
pierre-emmanuel.gaillardon@epfl.ch 
 
Abstract—Controllable-polarity transistors exhibit a device-level con-
figurability. Indeed, they can be dynamically configured between n-
type and p-type. Such property can be exploited in Field Programmable 
Gate Arrays (FPGAs) to replace traditional Look-Up Tables (LUTs) by 
more powerful configurable units. We report here on a new FPGA logic 
block architecture, called MCluster, that takes direct advantage of 
configurable transistors. The performance of the approach is evaluated 
and compared to its traditional Complementary Metal-Oxide-
Semiconductor (CMOS) counterpart at 22-nm technology node. We 
note an average saving of 64% in area×delay×power product. 
Keywords—Controllable-polarity transistors; ultra-fine grain 
logic; Reconfigurable logic 
I.  INTRODUCTION 
In modern semiconductor technologies, Schottky contacts are often 
used to reduce the access resistances at the source and drain interfaces of 
Metal-Oxide-Semiconductor Field Effect Transistors (MOSFETs). However, 
Schottky barriers can also lead to ambipolarity: both hole and electron 
conduction are observed in the same device, which limits the performance 
of conventional transistors. Nevertheless, this phenomenon can be 
exploited to enrich the logic capabilities of elementary transistors by 
creating double-independent-gate structures. More precisely, the 
additional gate electrode can be used to control the device polarity 
dynamically. Recently demonstrated using Silicon Nanowire FETs 
(SiNWFETs) [1], this property defines a new class of emerging devices 
that inherently implement a two-input comparator rather than a simple 
switch. Controllable-polarity devices can be realized in diverse 
technologies, such as silicon nanowires [1,2], carbon nanotubes [3], 
graphene [4] and nanorelays [5]. 
The intrinsic device configurability enables many innovations from a 
design perspective, ranging from arithmetic logic [6] down to compact 
reconfigurable logic [7]. Compact reconfigurable logic cells make it 
appealing to rethink standard reconfigurable architectures, such as the 
traditional Field Programmable Gate Array (FPGA) architectural scheme. 
In this paper, we present a novel logic block architecture that ex-
ploits the polarity control at the device level. The logic blocks use ultra-
fine grain logic gates grouped in 2×2 matrices in order to realize more 
efficient combinational logic. The new logic blocks then replace the 
LUTs in the standard FPGA structure. System-level performance evalua-
tions conclude that such an approach leads to an average saving of 64% 
in area×delay×power figure-of-merit as compared to its standard coun-
terpart architecture at 22-nm node. 
The remainder of the paper is organized as follows. In Section II, we 
discuss the opportunities brought by controllable-polarity devices to 
realize compact reconfigurable cells and novel FPGA structures. In 
Section III, we briefly report on the performance improvements given 
by the approach. In Section IV, we conclude the paper.  
II. LEVERAGING DEVICE POLARITY CONTROL IN 
RECONFIGURABLE SYSTEMS 
In this section, we discuss the opportunities given by controllable-
polarity devices to realize efficient reconfigurable logic blocks. 
A. Ultra-Fine Grain Reconfigurable Logic 
We first review the behavior of controllable-polarity transistors, then 
we report on their use in compact logic cell design. 
1) Transistors with Controllable Polarity 
Controllable-polarity transistors typically control the height of the 
Schottky barriers formed at the source and drain contacts. This control is 
enabled by double-independent-gate structures. In such devices, one gate 
electrode, called the Control Gate (CG), acts conventionally by turning 
on and off the device. The other electrode, called the Polarity Gate (PG), 
acts on the side regions of the device, in proximity of the Source/Drain 
(S/D) Schottky junctions, and switches the device polarity dynamically 
between n- and p-type, as illustrated in Fig. 1. The input and output 
voltage levels are compatible, resulting in directly-cascadable logic gates 
[1]. For a complete review on the design opportunities brought by these 
transistors, we refer the reader to [6]. 
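[Figure 1: transistor symbol with PG and CG terminals, and its n-type and 
p-type configurations selected by PG=0 or PG=1.] 
Fig. 1. Operating principle of a transistor with polarity control. 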
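The comparator-like behavior described above can be illustrated with a minimal behavioral sketch. It is an idealized switch model, not a device model from this work: the function names are hypothetical, and the mapping of PG=1 to n-type and PG=0 to p-type is an assumed convention.

```python
from enum import Enum


class Polarity(Enum):
    N_TYPE = "n"
    P_TYPE = "p"


def polarity(pg: int) -> Polarity:
    """Assumed convention: PG=1 selects n-type, PG=0 selects p-type."""
    return Polarity.N_TYPE if pg == 1 else Polarity.P_TYPE


def conducts(cg: int, pg: int) -> bool:
    """Idealized switch: an n-type device is on for CG=1, a p-type for CG=0."""
    if polarity(pg) is Polarity.N_TYPE:
        return cg == 1
    return cg == 0


def truth_table():
    """List (CG, PG, on/off) for all four gate combinations."""
    return [(cg, pg, conducts(cg, pg)) for pg in (0, 1) for cg in (0, 1)]


if __name__ == "__main__":
    for cg, pg, on in truth_table():
        print(f"CG={cg} PG={pg} -> {'on' if on else 'off'}")
```

Under this assumed convention the device conducts exactly when CG and PG are equal, which is why it can be viewed as a two-input comparator rather than a simple switch.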
2) Ultra-Fine Grain Reconfigurable Logic Gates 
The property of in-field reconfigurability has been used in [7] to 
build a compact reconfigurable cell. The cell, shown in Fig. 2-a, can 
realize any of the 8 Boolean logic functions Y of the two inputs A and B 
listed in Fig. 2-b. 
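[Figure 2: (a) schematic of the cell with inputs A and B, polarity-gate 
signals VBA, VBB and VBC, precharge and evaluation signals PC1, EV1, PC2 
and EV2, supplies VDD and Gnd, and output Y; (b) table giving the function 
Y obtained for each of the 8 combinations of VBA, VBB and VBC.] 
Fig. 2. Ultra-fine grain reconfigurable cell schematic [7] (a) and associated 
configurations (b). 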
978-1-4799-5810-8/14/$31.00 ©2014 IEEE 
The cell is built with only 7 transistors arranged in two dynamic log-
ic stages: logic function and follower/inverter. Signals PC1, EV1, PC2 
and EV2 are respectively the global precharge and evaluation signals of 
the two stages. The reconfiguration of the cell depends on the signals 
applied to the polarity gates VBA, VBB and VBC. Each of these signals is 
biased with either VDD or Gnd. This results in configuring the related 
transistors to either n- or p-type, thereby customizing the gate internal 
circuit. For a detailed description of the circuit operation, we refer the 
reader to [7]. 
B. Matrix Cluster Logic Blocks 
In this work, we replace the traditional LUTs by ultra-fine grain logic 
gates. However, a one-to-one replacement would result in a large 
overhead in terms of programmable connections and would worsen the 
already significant imbalance between routing and logic resources in 
FPGAs. Indeed, in conventional FPGA systems, less than 15% of the 
area is dedicated to logic computation, while the remaining resources 
provide the reconfigurability of the structure [8]. Therefore, to increase 
the logic coverage of the structure, we group the logic gates in layered 
2×2 matrices. This arrangement is called Matrix Cluster (MCluster). For 
the intra-matrix interconnect, we use a butterfly pattern between the two 
layers of logic cells. MClusters perform combinational logic functions 
and replace the original LUTs. Fig. 3 shows the organization of a logic 
block. Each logic block consists of a collection of N MCluster-based 
Basic Logic Elements (BLEs). 
[Figure 3: logic block with I inputs, a clock and 2N outputs, containing 
N MCluster-based BLEs; each BLE comprises four cells (f00, f01, f10, f11) 
followed by two flip-flops (FF).] 
Fig. 3. MCluster-based logic block organization. 
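The 2×2 arrangement with a butterfly interconnect can be sketched functionally as follows. This is an illustration under stated assumptions, not the authors' implementation: the cell library GATES is hypothetical (it does not reproduce the function set of Fig. 2-b), the first index of each cell name is assumed to denote its layer, and each second-layer cell is assumed to receive both first-layer outputs.

```python
from typing import Callable, Dict, Tuple

Gate = Callable[[int, int], int]

# Hypothetical two-input cell library, for illustration only.
GATES: Dict[str, Gate] = {
    "and": lambda a, b: a & b,
    "or": lambda a, b: a | b,
    "nand": lambda a, b: 1 - (a & b),
    "nor": lambda a, b: 1 - (a | b),
}


def mcluster(config: Tuple[str, str, str, str],
             inputs: Tuple[int, int, int, int]) -> Tuple[int, int]:
    """Evaluate a 2x2 matrix of cells (f00, f01, f10, f11) on 4 inputs."""
    f00, f01, f10, f11 = (GATES[name] for name in config)
    a, b, c, d = inputs
    # First layer: each cell consumes two of the four external inputs.
    x0 = f00(a, b)
    x1 = f01(c, d)
    # Butterfly: each second-layer cell sees both first-layer outputs.
    y0 = f10(x0, x1)
    y1 = f11(x0, x1)
    return y0, y1


if __name__ == "__main__":
    # y0 = (a AND b) AND (c AND d), y1 = (a AND b) OR (c AND d)
    print(mcluster(("and", "and", "and", "or"), (1, 1, 0, 0)))
```

With this configuration the cluster yields two different functions of the same four inputs, which is the property that distinguishes it from a single-output 4-input LUT.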
III. EXPERIMENTAL RESULTS 
The impact of replacing standard LUTs by MCluster-based structures 
is evaluated through system-level benchmarking of a set of logic 
circuits taken from MCNC and ISCAS’89. Our reference is a fully 
homogeneous CMOS FPGA architecture with 4-input non-fractionable 
LUTs arranged in logic blocks of N=10 BLEs and I=22 external inputs. 
This architecture is reported as optimal for homogeneous FPGAs [10]. 
Our novel structure follows the organization depicted in Fig. 3 with 
N=10 BLEs of 2×2 MClusters. The evaluations are performed using the 
VTR benchmarking flow [9]. To handle MCluster-based architectures, we 
use a specific packer, called MPack [12]. The physical parameters of the 
different architectures are extracted for a 22-nm technology node, while 
the electrical performance figures, i.e., delay and power consumption, of 
the elementary MCluster and LUT are characterized using HSPICE. 
The architectural evaluation considers, as metrics, the area, the 
critical path delay, the dynamic power consumption and the leakage 
power. These metrics are computed during the place and route iterations 
of the flow. The area corresponds to the sum of the logic area, i.e., the 
area of the used Configurable Logic Blocks (CLBs), and the routing area, 
i.e., the area of the used routing resources. The critical path delay 
corresponds to the most constrained delay through the implemented 
structures. Finally, the power figures include both the contribution of 
logic blocks and the contribution of routing structures. All the metrics 
are normalized with respect to the most constrained CMOS design. 
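The normalization and the area×delay×power figure of merit can be illustrated with a short sketch. The function names and numeric values below are hypothetical and chosen only to show the arithmetic; they are not measurements from this work. Power is assumed here to be the sum of the dynamic and leakage contributions.

```python
def normalized(value: float, reference: float) -> float:
    """Express a metric relative to the reference (CMOS) design."""
    return value / reference


def adp(area: float, delay: float, power: float) -> float:
    """Area x delay x power product of one design."""
    return area * delay * power


def adp_saving(candidate_adp: float, reference_adp: float) -> float:
    """Fractional reduction of the product versus the reference."""
    return 1.0 - candidate_adp / reference_adp


if __name__ == "__main__":
    # Hypothetical normalized metrics, NOT data from the paper.
    reference = adp(1.0, 1.0, 1.0)
    candidate = adp(0.9, 0.7, 0.8)
    print(f"normalized product: {normalized(candidate, reference):.3f}")
    print(f"saving: {adp_saving(candidate, reference):.1%}")
```

Because each metric is normalized to the reference, the reference product equals 1, and a saving of 64% corresponds to a normalized product of 0.36.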
Fig. 4 depicts the area×delay×power estimation for the MCluster-based 
FPGA and compares it to its CMOS counterpart. The benchmarks show 
an improvement of 64% on average. This can be attributed (i) to the 
performance of a logic-gate-based computation (as compared to the LUT 
approach) and (ii) to the low area impact of ultra-fine grain logic cells, 
compared to the larger area required by a CMOS LUT. At the same 22-nm 
technology node, a 2×2 cluster is more than 2× smaller than a 4-input 
LUT (2.22 µm² vs. 5.45 µm², respectively) and offers richer 
functionality. A 4-input LUT computes a single output signal depending 
on 4 inputs. While MClusters can only realize a subset of the functions 
reachable by a LUT, they are capable of producing up to 2 outputs that 
are functions of the same 4 inputs. Thus, MClusters can potentially 
output 2× more results for less than half the area. Combined with the 
efficiency of the packing tool for matrix clustering, this demonstrates a 
clear advantage of our proposal as compared to the CMOS approach.  
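The area argument can be checked with a few lines of arithmetic. The two areas and the output counts are taken from the text above; the constant and function names are illustrative.

```python
MCLUSTER_AREA_UM2 = 2.22   # 2x2 cluster area at 22 nm, from the text
LUT4_AREA_UM2 = 5.45       # 4-input LUT area at 22 nm, from the text
MCLUSTER_OUTPUTS = 2       # up to 2 outputs per cluster
LUT4_OUTPUTS = 1           # single output per LUT


def area_ratio() -> float:
    """How many times larger the LUT is than the cluster."""
    return LUT4_AREA_UM2 / MCLUSTER_AREA_UM2


def output_density_gain() -> float:
    """Ratio of outputs per unit area, cluster versus LUT."""
    cluster_density = MCLUSTER_OUTPUTS / MCLUSTER_AREA_UM2
    lut_density = LUT4_OUTPUTS / LUT4_AREA_UM2
    return cluster_density / lut_density


if __name__ == "__main__":
    print(f"LUT / cluster area ratio: {area_ratio():.2f}")
    print(f"outputs-per-area gain:    {output_density_gain():.2f}")
```

The area ratio comes out at about 2.45, and the best-case outputs-per-area gain at about 4.9. The latter is an upper bound, since a cluster only reaches two useful outputs when the packed functions fit the subset it can realize.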
[Figure 4: bar chart of the normalized area×delay×power product (a.u., 
axis from 0.0 to 1.0) for the benchmarks alu4, C3540, C5315, des, ex5p, 
i10, i8, misex3, pdc and spla; series: CMOS and MClusters_2_2.] 
Fig. 4. Area×Delay×Power product comparisons between CMOS-based and 
MClusters-based FPGAs. 
IV. CONCLUSION 
In this paper, we report on a novel FPGA logic block architecture 
that leverages the additional degree of freedom given by controllable-
polarity transistors. In particular, we use controllable-polarity transistors 
to create compact reconfigurable logic cells. These cells, once arranged 
in 2×2 clusters, replace the traditional LUTs to perform basic 
combinational operations. Thanks to the increased logic capabilities of 
these cells, it is possible to improve the performance of FPGA structures, 
with an average reduction of 64% in area×delay×power product as 
compared to their standard counterpart at the 22-nm technology node. 
ACKNOWLEDGMENTS 
This work was supported by the grant ERC-2009-AdG-246810. 
REFERENCES 
[1] M. De Marchi et al., “Polarity control in Double-Gate, Gate-All-Around 
Vertically Stacked Silicon Nanowire FETs,” IEDM Tech. Dig., 2012. 
[2] A. Heinzig, S. Slesazeck, F. Kreupl, T. Mikolajick and W. M. Weber, 
“Reconfigurable Silicon Nanowire Transistors,” Nano Letters, vol. 12, pp. 
119-124, 2011. 
[3] Y.-M. Lin, J. Appenzeller, J. Knoch and P. Avouris, “High-Performance 
Carbon Nanotube Field-Effect Transistor With Tunable Polarities,” IEEE 
Trans. Nanotechnology, vol. 4, pp. 481-489, 2005. 
[4] N. Harada, K. Yagi, S. Sato and N. Yokoyama, “A polarity-controllable 
graphene inverter,” Applied Physics Letters, vol. 96, pp. 12102, 2010. 
[5] D. Lee et al., “Combinational Logic Design Using Six-Terminal NEM 
Relays,” IEEE Trans. on CAD, 32(5):653-666, 2013. 
[6] P.-E. Gaillardon, S. Bobba, M. De Marchi, D. Sacchetto and G. De 
Micheli, “NanoWire Systems: Technology and Design,” Philosophical 
Transactions of the Royal Society of London A, vol. 372, no. 2012, March 
2014. 
[7] I. O'Connor et al., “CNTFET Modeling and Reconfigurable Logic-Circuit 
Design,” IEEE Transactions on Circuits and Systems, vol. 54, no. 11, pp. 
2365-2379, November 2007. 
[8] M. Lin, A. El Gamal, Y.-C. Lu and S. Wong, “Performance Benefits of 
Monolithically Stacked 3-D FPGA,” IEEE Transactions on Computer-
Aided Design, vol. 26, no. 2, 2007. 
[9] J. Rose et al., “The VTR Project: Architecture and CAD for FPGAs from 
Verilog to Routing,” International Symposium on Field Programmable Gate 
Arrays, pp. 77-86, 2012. 
[10] E. Ahmed, “The effect of logic block granularity on deep-submicron FPGA 
performance and density,” Master thesis, U. of Toronto, 2001. 
[12] The Matrix Packer (MPack) tool available online at: 
https://sites.google.com/site/pegaillardon/downloads 
 
