VCTA: A Via-Configurable Transistor Array regular fabric by Pons Solé, Marc et al.
VCTA: A Via-Configurable Transistor Array
Regular Fabric
Marc Pons∗, Francesc Moll∗, Antonio Rubio∗, Jaume Abella†, Xavier Vera‡ and Antonio González‡
∗Universitat Politècnica de Catalunya, Electronic Engineering, {mpons,moll,rubio}@eel.upc.edu
†Barcelona Supercomputing Center (BSC-CNS), jaume.abella@bsc.es
‡Intel Barcelona Research Center, Intel Labs - UPC, {xavier.vera,antonio.gonzalez}@intel.com
Abstract—Layout regularity is introduced progressively by
integrated circuit manufacturers to reduce the increasing system-
atic process variations in the deep sub-micron era. In this paper
we focus on a scenario where layout regularity must be pushed
to the limit to deal with severe systematic process variations
in future technology nodes. With this objective, we propose
and evaluate a new regular layout style called Via-Configurable
Transistor Array (VCTA) that maximizes regularity at device
and interconnect levels. In order to assess VCTA maximum
layout regularity tradeoffs, we implement 32-bit adders in the
90 nm technology node for VCTA and compare them with
implementations that make use of standard cells. For this purpose
we study the impact of photolithography proximity and coma
effects on channel length variations, and the impact of shallow
trench isolation mechanical stress on threshold voltage variations.
We demonstrate that both variations, which are important sources
of circuit energy and delay variability, are minimized through
VCTA regularity.
I. INTRODUCTION
As we enter the deep sub-micron era, integrated circuit
manufacturers are facing the increasing systematic process
variations that arise from the optical lithography manufac-
turing process. 193 nm light sources are still used to print
critical dimensions of 65 nm, 45 nm and 32 nm, resulting in
geometrical layout variability that leads to variations on the
electrical characteristics of devices and interconnections.
Important sources of these variations are lithography imperfections
such as proximity and coma effects, which cause MOS
channel length variations. Their impact on delay and leakage
current has been demonstrated in [1], where a methodology
is described to include layout-dependent variations in static
timing analysis and to reduce manufacturing risk.
Resolution Enhancement Techniques (RETs) such as phase
shift mask, optical proximity correction and off-axis illu-
mination have been used to greatly improve layout print-
ability and to correct lithography imperfections. However,
these techniques are computationally expensive and very time-
consuming for large integrated circuits with arbitrary layout
patterns. Thus, layouts with a reduced number of patterns are
desirable.
Another source of variation is mechanical stress due to
shallow trench isolation. The shape of the oxide diffusion
area as well as the location of the MOS device inside this
area impact threshold voltage and produce circuit performance
variation [2].
New design for manufacturability regularity-based tech-
niques with fewer layout patterns are emerging as a possible
solution for manufacturers [3]–[5]. Examples of the use of
dummy features to increase layout regularity can be found
in [6] showing that regularity is being progressively introduced
by Intel or AMD. Regular structures for transistors that also
use dummies reduce the stress-induced performance variations
[7]. Other works at Tela Innovations using gridded
design rules have been shown to reduce gate critical dimension
variability by 4x to 16x by improving polysilicon regularity
[8]. However, the resulting layouts for these cases are not
completely regular. As performance is usually worsened by
regularity, layouts are tuned to include only the regularity
needed to reduce process variations enough for current
technology to achieve acceptable yields. In the future, more
comprehensive regularity-based techniques will be required to
deal with the increasing process variations.
In this paper, we propose and study a new regular design
style called VCTA, that stands for Via-Configurable Transistor
Array, whose purpose is to push to the limit layout regularity
for devices and interconnects to minimize the amount of
systematic process variations. Tradeoffs involved in the design
of VCTA-based circuits are carefully evaluated.
The structure of the paper is as follows. In section II we
briefly describe existing regularity-based techniques and their
trade-offs. In section III we detail our VCTA proposal and
describe the VCTA basic cell. In section IV we explain VCTA
complex circuit layout generation. In section V we present
electrical simulations for two 32-bit adders to evaluate the
overheads introduced by regularity in area and performance.
In section VI we study the benefits of regularity on process
variations. Finally, in section VII we provide the conclusions.
II. REGULARITY-BASED TECHNIQUES OVERVIEW
Regular designs are based on the repetition of a small set of
basic blocks and layout patterns. The main benefit of layout
regularity is the reduction of the amount of systematic process
variations by allowing RETs to more effectively mitigate
lithography printability issues. This translates into a reduction
of design cost achieved through (i) a reduction of the yield loss
associated to circuit energy and delay unpredictability due to
the reduction of process variations, and (ii) a reduction of the
time-to-market by accelerating RETs and also due to the lower
number of basic cells or layout patterns and neighborhoods to
be optimized.
Fig. 1. Regularity vs Efficiency for different layout techniques. STD =
Standard Cell, LB = Logic Bricks, VC = Via Configurable Blocks, FPGA =
Field-Programmable Gate Array, and our proposal VCTA = Via-Configurable
Transistor Array.
Among these comprehensive regularity-based techniques
there are already some proposals like Logic Bricks (LB),
structured ASIC Via-Configurable Logic Blocks (VC) and also
Gate Arrays (GA).
The LB design technique is an evolution of the standard
cell approach [9]. The basic idea is to find the reduced set of
standard cells that are needed for the function to be implemented
and to optimize them by reducing the large number of
neighborhood configurations in a standard cell library.
The emerging structured ASICs are constructed using an
array of identical basic tiles that contain the logic [10]. The
different types of structured ASICs can be classified depending
on their regularity granularity defined by the elements that
compose the tile. For instance, the tile can be composed of
gates, multiplexers, lookup tables, buffers, etc. The condition
that has to be ensured is that all functions can be synthesized
with the elements included in the tile. The VC proposal is to
consider a tile composed of via-configurable logic blocks [11].
It consists of two types of blocks: a via-configurable functional
cell, containing the combinational logic needed for functions,
and two via-configurable inverter arrays, containing inverters
to perform the buffer connections with the surrounding tiles.
Among GA, the most evolved design technique is Field-
Programmable Gate Arrays (FPGAs). The basic structure is
formed by logic blocks, including lookup tables and flip-flops
plus the programming overhead, that are interconnected
using routing blocks, consisting of connection boxes and
switch boxes [12].
Each of the previous techniques shows different degrees of
layout regularity but none of them explores the possibility of
maximizing layout regularity both at transistor and intercon-
nect levels.
III. VCTA PROPOSAL AND BASIC CELL
The objective of our Via-Configurable Transistor Array
proposal (VCTA) is maximizing regularity for devices and
interconnects. VCTA uses a single basic cell which is repeated
along the circuit. Based on the observation that regular designs
such as SRAMs get to the market long before irregular
conventional logic designs such as microprocessors [13], [14],
we expect VCTA regularity to reduce the time-to-market,
providing a reduction of the amount of systematic process
variations with the associated yield loss reduction. However,
in general, more regularity implies less efficiency in terms of
energy, delay and area.
Fig. 2. Transistor Array structure (PO = polysilicon, OD = oxide diffusion).
Fig. 3. (a) Metal grid structure for inter-cell routing (b) Placement and local
power supply network of VCTA basic cells (6 cells in the picture).
To summarize the pros and cons of the different existing regular
design techniques and to illustrate the expected behavior
of our VCTA proposal, we depict in Figure 1 the tradeoff for
these designs between design efficiency and regularity.
VCTA is a very fine-grain regular structure that maximizes
layout regularity by setting up regular interconnects and
enforcing the same dimensions for all transistors.
A. Maximizing regularity at transistor level: the Transistor
Array
The Transistor Array is composed of 2 vertically aligned
blocks of a number T of PMOS and T NMOS transistors
(Figure 2 where T = 6). Note that the T transistors in each case
share the same oxide diffusion in order to increase transistor
density and thus reduce the area of the basic cell. With this
constraint VCTA transistors are connected in series by default.
However, we can implement parallel connections by properly
setting up vias, as explained in the next subsections.
Fig. 4. NAND-NOR: (a) Schematic (b) VCTA schematic (c) VCTA layout.
In order to force maximum transistor layout regularity, all
transistors have the same width and the minimum channel
length. To further reduce process variations, we add 2 dummy
transistors (the ones on the upper and lower extremes). In this
way we avoid possible differences between drains/sources placed
between two polysilicon gates and drains/sources at the edges,
which would otherwise have a gate on only one side.
B. Maximizing regularity at interconnect level: the Via-
Configurable choice
In order to ensure interconnect regularity VCTA uses a
regular interconnect grid of parallel metal lines. The lines
alternate from horizontal to vertical direction from one layer
to the next. Intra-cell routing is performed by means of a via-
configurable structure where all contacts and vias are placed
depending on the function to be synthesized. Inter-cell routing
between VCTA basic cells is achieved by the extension of the
metal lines across the borders of the VCTA cells. Contacts,
vias and inter-cell metal interconnections are the only source
of layout irregularity of our VCTA design.
C. Implementation of VCTA
In order to study VCTA designs we will use a particular
implementation with T = 6 in the rest of the paper. The choice
of having 6 PMOS and 6 NMOS transistors in the basic cell
is related to the possibility of implementing 2 logic branches
of transistors with a maximum length of 3 serial transistors
to avoid body effect and excessive serial resistance issues. All
transistors have minimum channel length and 440 nm width
ensuring enough transistor strength to drive the associated
parasitics of the dense interconnect metal grid structure.
The via-configurable interconnection scheme uses three
metal levels, M1 to M3, forming a regular routing grid where
M1 and M3 wires are vertical and M2 wires are horizontal.
The choice is based on using the lowest metal levels.
Fig. 5. Layouts of CLA32 VCTA (65.5µm x 40µm, top) and STD (80µm
x 17.4µm, bottom)
Fig. 6. Layouts of KS32 VCTA (110µm x 64µm, top) and STD (91µm x
29.5µm, bottom)
Regarding the intra-cell connections, we use PO-M1 con-
tacts to configure the transistor gate inputs, M1-M2 vias to
configure the basic cell inputs and outputs and finally M2-M3
vias to configure the parallel transistor connections. Figure 4
shows an example of intra-cell connections using contacts and
vias (CO-VIA) to implement a simple NAND-NOR gate inside
of a basic cell. Note that we still have spare transistors (whose
drains and sources are connected to the power supply in order to
avoid undesirable energy consumption) that we can use to
implement other functions if necessary.
We use M1 and M3 layers for vertical inter-cell connections
and M2 for horizontal inter-cell connections (Figure 3a).
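The layer assignment described above can be summarized as a small configuration table. The following Python sketch is only an illustration of the text: the dictionary and function names are hypothetical and do not come from any VCTA tool.

```python
# Routing directions for the T = 6 implementation, as stated in the text.
LAYER_DIRECTION = {"M1": "vertical", "M2": "horizontal", "M3": "vertical"}

# Contact/via levels and the configuration role the text assigns to each.
VIA_ROLE = {
    ("PO", "M1"): "transistor gate inputs",
    ("M1", "M2"): "basic cell inputs and outputs",
    ("M2", "M3"): "parallel transistor connections",
}


def directions_alternate(stack):
    """Return True if adjacent metal layers in `stack` run in different directions."""
    return all(
        LAYER_DIRECTION[lower] != LAYER_DIRECTION[upper]
        for lower, upper in zip(stack, stack[1:])
    )


def inter_cell_layers(direction):
    """Metal layers that can carry inter-cell routing in the given direction."""
    return sorted(m for m, d in LAYER_DIRECTION.items() if d == direction)


if __name__ == "__main__":
    print(directions_alternate(["M1", "M2", "M3"]))
    print(inter_cell_layers("vertical"), inter_cell_layers("horizontal"))
```

Keeping the via roles in one table mirrors the idea that only contacts and vias change from one configured cell to another, while the metal grid itself stays fixed.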
Note that we can consider many other VCTA implementa-
tions with different number of transistors, metal layers, etc.
However, such a study is out of the scope of this paper.
IV. COMPLEX CIRCUITS WITH VCTA
In order to illustrate that our VCTA regular design technique
allows the implementation of complex circuits we have imple-
mented binary adders, a common block in integrated circuit
designs. In particular, we have developed complete layouts
in the 90 nm technology node for a 32-bit Carry-Lookahead
adder (CLA32) and for a 32-bit Kogge-Stone adder (KS32)
using the VCTA structure and also the Standard Cell approach
(STD) to evaluate the area, energy and delay overheads in
those commonly used circuits.
Fig. 7. STD CLA32 Layout Layer Masks (a) PO (b) OD (c) M1.
Fig. 8. VCTA CLA32 Layout Layer Masks (a) PO (b) OD (c) M1.
The resulting complete layouts are presented for
the CLA32 in Figure 5 and for the KS32 in Figure 6.
The steps that we have followed for VCTA layout generation
are: (1) find out the logic functions needed to implement
the structure of the circuit, (2) map the transistors of these
functions into the VCTA basic cell as we have shown in the
previous section for the NAND-NOR gate, (3) manually place
and route them to obtain the complete layout. The automation
of the whole VCTA design flow is part of our future work. For
the STD layout generation, we have used the public standard
cell layouts provided in [15], which offers a complete set of
portable CMOS libraries.
The binary adder circuits studied require 6 different types
of logic functions: an inverter, an XOR, a 2-input NAND, a
4-input NAND, an AND-OR and an OR-AND [16]. We have
mapped these functions into the VCTA basic cells. In some
cases we were able to implement 2 functions into a single
VCTA basic cell. This can be done when the functions are
next to each other in the circuit (e.g., the output of one of the
functions is the input of the other one, or they share the same
inputs).
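As a rough illustration of when two functions might share one basic cell, the Python sketch below checks only a necessary condition: the total number of PMOS and NMOS devices must not exceed T = 6 per type. The device counts are textbook static CMOS counts (an assumption, not taken from the paper), and the check ignores the shared-diffusion and via-configuration constraints that a real mapping must also satisfy. All names are hypothetical.

```python
from dataclasses import dataclass

T = 6  # configurable PMOS (and NMOS) transistors per basic cell, as in the paper


@dataclass(frozen=True)
class Function:
    name: str
    pmos: int  # number of PMOS devices (assumed static CMOS implementation)
    nmos: int  # number of NMOS devices (assumed static CMOS implementation)


def fits_in_one_cell(functions, t=T):
    """Necessary, not sufficient: device totals must not exceed t per type."""
    return (sum(f.pmos for f in functions) <= t
            and sum(f.nmos for f in functions) <= t)


def spare_transistors(functions, t=T):
    """Unused (PMOS, NMOS) devices left in the cell after mapping `functions`."""
    return (t - sum(f.pmos for f in functions),
            t - sum(f.nmos for f in functions))


INV = Function("inverter", pmos=1, nmos=1)
NAND2 = Function("nand2", pmos=2, nmos=2)

if __name__ == "__main__":
    print(fits_in_one_cell([NAND2, INV]), spare_transistors([NAND2, INV]))
```

A pair that passes this check is only a candidate for sharing a cell; whether it can actually be routed depends on how the series-connected transistors and the via grid are configured.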
As a consequence, the VCTA layouts can be composed of
fewer cells than the STD layouts. For instance, the complete
CLA32 finally required 228 standard cells and only 160 VCTA
basic cells. We have manually placed and routed those VCTA
cells trying to minimize the interconnect distances, as we
also did for the STD cells.
For the placement of our basic cells we have also considered
the local power supply network symmetries. We depict the
general placement of our basic cells in Figure 3b for 6 basic
cells. In the VCTA basic cell we reserve M1 and M3 wires in
each metal layer for VDD and GND. These wires are shared
across neighbor cells. In this way, we can reduce the area when
implementing a full circuit with multiple basic cells. We use
the same design criterion sharing polarization contacts.
TABLE I
RESULTS WITHOUT PROCESS VARIATIONS
(WCD = WORST-CASE DELAY, AVGE = AVERAGE ENERGY)
             WCD (ns)  AVGE (pJ)  Area (µm²)
CLA32 STD    1.11      0.21       1394
CLA32 VCTA   2.15      0.47       2620
Ratio        1.94x     2.24x      1.88x
KS32 STD     0.84      0.33       2684
KS32 VCTA    1.69      0.79       7046
Ratio        2.00x     2.39x      2.63x
To illustrate the layouts that we have generated, Fig-
ures 7 and 8 present CLA32 layout masks for STD and
VCTA designs for polysilicon (PO), oxide diffusion (OD) and
metal 1 (M1) layers. By visual inspection we can see how our
VCTA design is much more regular than the STD approach.
V. IMPACT ON AREA AND PERFORMANCE
We have performed complete electrical simulations of the
extracted layouts of CLA32 and KS32 in the 90 nm technology
node using the HSPICE simulator. We have evaluated both
the adders designed with our VCTA regular design as well as
those based on standard cells in terms of delay and energy for
10400 inputs that we have sampled from all 26 programs in
the SPEC2000 benchmark suite [17]. We have measured the
delay from input variation to the associated output transition
considering the crossing at 90% of the voltage rise or fall swings.
We have also measured energy for each input combination
integrating the current demand at the power supply source dur-
ing the addition. Finally, we have measured the area directly
from the layout. We show measurement results for worst-case
delay (WCD) and average energy dissipation (AVGE) for all
the inputs in Table I.
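The two measurements described above can be sketched on sampled waveforms as follows. This is a minimal Python illustration, not the HSPICE setup used for the results: the function names are hypothetical, the waveforms are assumed to be given as lists of samples, and the supply voltage passed to `supply_energy` is an assumed parameter.

```python
def swing_level(v_start, v_end, fraction=0.9):
    """Voltage reached after `fraction` of the transition from v_start to v_end."""
    return v_start + fraction * (v_end - v_start)


def crossing_time(times, values, level):
    """First time the sampled waveform crosses `level` (linear interpolation)."""
    for k in range(len(times) - 1):
        v0, v1 = values[k], values[k + 1]
        if v0 != v1 and min(v0, v1) <= level <= max(v0, v1):
            t0, t1 = times[k], times[k + 1]
            return t0 + (level - v0) * (t1 - t0) / (v1 - v0)
    raise ValueError("waveform never crosses the requested level")


def delay_90(times, vout, t_input, v_start, v_end):
    """Delay from the input change to 90% of the output rise or fall swing."""
    return crossing_time(times, vout, swing_level(v_start, v_end)) - t_input


def supply_energy(times, currents, vdd):
    """Energy from the supply: vdd times the trapezoidal integral of i(t)."""
    charge = sum(
        0.5 * (i0 + i1) * (t1 - t0)
        for t0, t1, i0, i1 in zip(times, times[1:], currents, currents[1:])
    )
    return vdd * charge
```

For a falling output, `delay_90` is called with `v_start` at the high level and `v_end` at the low level, so the threshold is the point where 90% of the fall has been completed.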
First, this particular choice for VCTA regular design implies
an increase around 2x in area (1.88x for CLA32 and 2.63x
for KS32) when compared to the STD approach. The area
increase is basically due to the regularity requirements and
redundancy, because all possible configurations of devices and
interconnects are in place in the VCTA basic cell. The basic
cell includes dummy transistors, spare transistors and also
spare interconnects which increase the total area.
In terms of WCD and AVGE, both CLA32 and KS32
present more than a 2x energy ratio but around a 2x delay
ratio. In fact overheads introduced by VCTA when compared
to STD are very much dependent on the function to implement.
STD uses different standard cells depending on the circuit op-
timization but VCTA always uses the same basic cell. Energy
and delay overheads are due to the parasitics introduced by
our VCTA metal grid.
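The overhead ratios can be recomputed directly from the rounded entries of Table I. The short Python sketch below does so; the dictionary layout and function name are illustrative only. Recomputing from the rounded values gives about 2.01 for the KS32 delay ratio, slightly different from the 2.00x printed in the table, presumably because the published ratio was derived from unrounded data.

```python
# (WCD in ns, AVGE in pJ, area in um^2), copied from Table I.
TABLE_I = {
    "CLA32": {"STD": (1.11, 0.21, 1394), "VCTA": (2.15, 0.47, 2620)},
    "KS32": {"STD": (0.84, 0.33, 2684), "VCTA": (1.69, 0.79, 7046)},
}


def overhead_ratios(adder):
    """VCTA/STD ratios for (delay, energy, area) of the given adder."""
    std = TABLE_I[adder]["STD"]
    vcta = TABLE_I[adder]["VCTA"]
    return tuple(v / s for v, s in zip(vcta, std))


if __name__ == "__main__":
    for adder in TABLE_I:
        delay, energy, area = overhead_ratios(adder)
        print(f"{adder}: delay {delay:.2f}x, energy {energy:.2f}x, area {area:.2f}x")
```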
Another VCTA parameter to optimize is transistor sizing
(all the transistors have the same dimensions). With our
present choice of 440 nm for width, by connecting in parallel
transistors, we can only emulate wider transistors of 880 nm,
1320 nm, etc., with a width multiple of the basic transistor,
and this is not always optimal.
Fig. 9. Proximity and coma effect model measurements
TABLE II
CHANNEL LENGTH VARIATIONS
            Proximity Effect L 3σ/µ  Coma Effect L 3σ/µ
CLA32 STD   5.31%                    6.19%
KS32 STD    5.16%                    6.48%
Note also that logic functions implemented such as NAND,
XOR, etc. are particularly suitable for STD, but may be
suboptimal for VCTA.
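The sizing constraint discussed in this section, where wider devices can only be emulated as integer multiples of the 440 nm basic transistor, can be illustrated with a small Python sketch. The function names and the 600 nm target in the usage example are hypothetical; the sketch assumes that the designer rounds up to the next available multiple.

```python
import math

BASE_WIDTH_NM = 440  # width of every VCTA transistor in this implementation


def emulated_width(target_nm, base_nm=BASE_WIDTH_NM):
    """Return (parallel device count, total width) of the smallest multiple >= target."""
    count = max(1, math.ceil(target_nm / base_nm))
    return count, count * base_nm


def oversize_fraction(target_nm, base_nm=BASE_WIDTH_NM):
    """Relative excess width caused by rounding up to a multiple of base_nm."""
    _, width = emulated_width(target_nm, base_nm)
    return (width - target_nm) / target_nm


if __name__ == "__main__":
    print(emulated_width(600), oversize_fraction(600))
```

In this example a 600 nm target needs two parallel devices (880 nm), which is noticeably wider than requested and hints at why a fixed width is not always optimal.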
VI. IMPACT ON PROCESS VARIATIONS
Evaluating the impact of layout regularity on systematic
process variations is key to demonstrate the usefulness of
maximizing layout regularity using VCTA. As variability
in printed features depends on their neighborhood, the link
between the different shapes in the layout and the amount of
process variations has to be studied.
Channel length variations and the impact of mechanical
stress on threshold voltage have been evaluated as they are
major sources of circuit performance variations.
A. Channel length variations: proximity and coma effect
Models for systematic channel length (L) variations
can be found in [18], taking into account proximity
and coma effects. Basically, the proximity and coma effect
models associate to each channel a percentage of L variation
depending on the layout neighborhood on both sides of the
feature to be printed. The models are based on the inspection
of the layout to the left and to the right of the feature in
order to define the kind of neighborhood that the channel has.
They measure the distances to the first polysilicon line in each
direction. Figure 9 depicts an example of distances n1 to the
left and n2 to the right. The difference between the two models
is that for the proximity effect left-side and right-side distances
are equivalent in their impact on variations, but for the coma
effect they are not.
The models include tables with the nominal amount of process
variations for each case and the final percentage variation for
L can be obtained by setting the maximum percentage range
of variations. The final result is the expected L for each of
the transistors on the layout. The entire circuit L distribution
can then be characterized by its mean µ and its standard
deviation σ.
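The neighborhood classification and the 3σ/µ statistic described above can be sketched in Python as follows. This is an illustration under stated assumptions: the per-neighborhood variation tables of [18] are not reproduced, the function names are hypothetical, each transistor is represented only by its (n1, n2) distance pair, and σ is taken as the population standard deviation.

```python
from statistics import mean, pstdev


def proximity_key(n1, n2):
    """Proximity effect: left and right distances are interchangeable."""
    return tuple(sorted((n1, n2)))


def coma_key(n1, n2):
    """Coma effect: left and right distances are distinguished."""
    return (n1, n2)


def count_neighborhoods(pairs, key):
    """Number of distinct layout neighborhoods among (n1, n2) pairs."""
    return len({key(n1, n2) for n1, n2 in pairs})


def three_sigma_over_mu(lengths):
    """3*sigma/mu of a channel length population, as a fraction."""
    return 3 * pstdev(lengths) / mean(lengths)


if __name__ == "__main__":
    # Hypothetical distances in nm for four transistors of an irregular layout.
    pairs = [(100, 100), (100, 200), (200, 100), (100, 100)]
    print(count_neighborhoods(pairs, proximity_key))
    print(count_neighborhoods(pairs, coma_key))
```

Because the coma key keeps the side information, it can never yield fewer neighborhoods than the proximity key. When every transistor sees the same pair of distances there is a single neighborhood, all expected lengths coincide, and `three_sigma_over_mu` returns zero.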
Using those proximity and coma effect models we have
measured the L systematic process variations of the adder
layouts for VCTA and STD for the different sources of
systematic variability considering 10% maximum L variations.
As all the transistors in the VCTA basic cell have the
same layout neighborhood, with two polysilicon lines at the
same distance, they are all affected by the same systematic
L variations, thus showing no σ in the L distribution. This
is achieved by the use of the dummy polysilicon lines at the
edges of the PMOS and NMOS transistor arrays.
On the other hand, STD adders that use different cells
with different placements present a higher number of layout
neighborhoods. The L statistics in terms of 3σ/µ are presented
in Table II. For proximity effect, CLA32 and KS32 transistors
see 7 and 8 neighborhoods respectively. For coma effect,
which differentiates the sides for the distances measured, there
are 9 and 10. That is why coma effect variability is higher than
proximity effect variability.
The final result is that all VCTA transistors are affected
by the same L systematic variation and therefore all have the
same L, whereas the L variability between transistors is around
5-6% for STD. Therefore, we can conclude that L variations
for proximity effect and coma effect are minimized through
VCTA regular layout designs.
Note that these results show the regularity of VCTA at two
levels. First, the L variations are the same for both CLA32
and KS32 for the VCTA design whereas they depend on
the particular circuit for the STD design. This is because
VCTA uses the same basic cell for both adders and STD uses
different cells. This is VCTA regularity at cell level. Second,
VCTA maximizes regularity inside the basic cell and shows
only one neighborhood for all transistors whereas STD shows
different neighborhoods inside each of the cells. This is VCTA
regularity at transistor level.
B. Threshold voltage variations: mechanical stress
Models for silicon mechanical stress due to Shallow Trench
Isolation (STI) are included in the BSIM4 transistor models
[19]. Transistor performance is affected depending on the
shape of the oxide diffusion area and on the position of the
device inside this area. In particular, threshold voltage (Vth)
varies depending on the distances from the channel to the edge
of the diffusion (where the STI begins). Figure 10 shows an
example for the measurement of these d1 and d2 distances.
The relative impact also depends on the dimensions of the
transistor. Transistors with wider channels will be less affected.
By extracting these data from the layout and using the
models supplied for the 90 nm technology node, we have
calculated the Vth variations for PMOS and NMOS transistors
in the CLA32 and the KS32 adders. The results for the VCTA
and STD designs are shown in Table III.
Fig. 10. STI Stress model measurements
TABLE III
THRESHOLD VOLTAGE VARIATIONS
             PMOS Vth 3σ/µ  NMOS Vth 3σ/µ
CLA32 STD    4.85%          6.25%
CLA32 VCTA   0.82%          1.24%
Ratio        0.17x          0.20x
KS32 STD     4.07%          5.83%
KS32 VCTA    0.82%          1.24%
Ratio        0.20x          0.21x
For VCTA transistors, there are only three different cases.
From Figure 2 it can be seen that transistors 1 and 6 will
have the same STI stress because the VCTA basic cell is
symmetric. The same occurs for transistors 2 and 5 and finally
for transistors 3 and 4. Furthermore, the VCTA transistors
all have the same channel width and therefore will be affected
similarly.
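The symmetry argument can be checked with a short Python sketch that computes the (d1, d2) distances for evenly spaced channels on a shared diffusion and counts the distinct cases. The gate pitch and edge margin in the example are hypothetical values in nm, distances are measured to the channel centre, and mirrored positions are treated as equivalent, as the text states for the symmetric basic cell.

```python
def sti_distances(t, pitch, edge):
    """(d1, d2) from each of t evenly spaced channels to the two diffusion edges."""
    span = 2 * edge + (t - 1) * pitch  # total extent of the shared diffusion
    return [(edge + i * pitch, span - (edge + i * pitch)) for i in range(t)]


def distinct_stress_cases(t, pitch, edge):
    """Number of distinct cases when mirrored (d1, d2) pairs are equivalent."""
    return len({tuple(sorted(d)) for d in sti_distances(t, pitch, edge)})


if __name__ == "__main__":
    # Hypothetical geometry: 300 nm gate pitch, 450 nm from edge to first channel.
    print(sti_distances(6, 300, 450))
    print(distinct_stress_cases(6, 300, 450))
```

With T = 6 the six channels collapse into three cases (1 and 6, 2 and 5, 3 and 4), independently of the particular pitch and margin chosen.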
On the other hand, for STD, there is a higher number of
cases related to the different transistor neighborhoods and to
the different transistor sizings. That is why for VCTA the
Vth variability is around 1%, whereas for STD it is between 4%
and 5% for PMOS and around 6% for NMOS. The ratios for the
reduction of Vth variability due to VCTA regularity are close
to 0.20x.
Again, the results show VCTA regularity at two different
levels. First, at cell level we can see how VCTA shows the
same Vth variations independently of the circuit considered.
Second, at transistor level, the number of cases for STI stress
is also reduced because of transistor array regularity.
VII. CONCLUSION
This paper proposes and evaluates the VCTA design tech-
nique to explore the impact of maximizing layout regularity
in future technologies that will have to deal with increasing
systematic process variations.
CLA32 and KS32 adder layouts using a particular imple-
mentation of the VCTA basic cell have been developed to
illustrate the VCTA methodology.
The VCTA proposal maximizes regularity at cell level using
a single basic cell, at transistor level with the Transistor
Array structure, and finally at interconnect level with the Via-
Configurable choice.
Lithography models for proximity and coma effects and STI
mechanical stress show how VCTA regularity minimizes the
channel length and threshold voltage MOS systematic process
variations that are affecting STD designs.
ACKNOWLEDGMENT
This research work has been supported by Intel Corporation,
Feder Funds, the Spanish Ministry of Education and Science
under grants TIN2007-61763, TEC2008-01856 and FPU
AP2007-04125 and the Generalitat de Catalunya under grant
2009SGR1250.
REFERENCES
[1] M. Choi and L. Milor, “Impact on circuit performance of deterministic
within-die variation in nanoscale semiconductor manufacturing,” IEEE
TCAD, vol. 25, no. 7, pp. 1350–1367, 2006.
[2] V. Moroz et al., “Stress-aware design methodology,” in ISQED, 2006,
pp. 807–812.
[3] B. Wong et al., Nano-CMOS Design for Manufacturability: Robust
Circuit and Physical Design for Sub-65 nm Technology Nodes. John
Wiley & Sons, 2009.
[4] M. Orshansky et al., Design for Manufacturability and Statistical Design:
A Constructive Approach. Springer, 2008.
[5] C. Chiang and J. Kawa, Design for Manufacturability and Yield for
Nano-Scale CMOS. Springer, 2007.
[6] J. Dick, “Design-for-manufacturing features in nanometer processes - a
reverse engineering perspective,” in ASMC, 2009, pp. 56–61.
[7] P. G. Drennan et al., “Implications of proximity effects for analog
design,” CICC, pp. 169–176, 2006.
[8] M. Smayling et al., “Low k1 logic design using gridded design rules,”
vol. 6925, no. 1. SPIE, 2008.
[9] V. Kheterpal et al., “Design methodology for IC manufacturability based
on regular logic-bricks,” in DAC, 2005, pp. 353–358.
[10] B. Zahiri, “Structured ASICs: opportunities and challenges,” in ICCD,
2003, pp. 404–409.
[11] Y. Ran and M. Marek-Sadowska, “Designing via-configurable logic
blocks for regular fabric,” IEEE Transactions on VLSI Systems, vol. 14,
no. 1, pp. 1–14, 2006.
[12] M. Lin and A. El Gamal, “A routing fabric for monolithically stacked
3D-FPGA,” in FPGA, 2007, pp. 3–12.
[13] http://www.intel.com/pressroom/.
[14] http://www.tcmagazine.com/comments.php?shownews=24545.
[15] G. Petley, VLSI and ASIC Technology Standard Cell Library Design,
http://www.vlsitechnology.org.
[16] N. H. E. Weste and D. Harris, CMOS VLSI Design: A Circuits and Systems Perspective.
Pearson, 2005.
[17] http://www.spec.org/cpu2000.
[18] M. Choi and L. Milor, “Diagnosis of optical lithography faults with
product test sets,” IEEE TCAD, vol. 27, no. 9, pp. 1657–1669, 2008.
[19] http://www-device.eecs.berkeley.edu/~bsim3/bsim4.html.
