Cost Modeling and Projection for Stacked Nanowire Fabric by Macha, Naveen Kumar & Rahman, Mostafizur
 Naveen Kumar Macha, Mostafizur Rahman 
Department of Computer Science & Electrical Engineering, University of Missouri Kansas City, MO, USA 
E-mail: nmhw9@mail.umkc.edu, rahmanmo@umkc.edu 
 
Abstract— To continue scaling beyond 2-D CMOS with 3-D
integration, any new 3-D IC technology has to be comparable to or
better than 2-D CMOS in terms of scalability, enhanced
functionality, density, power, performance, cost, and reliability.
Transistor-level 3-D integration carries the most potential in this
regard. Recently, we proposed a stacked horizontal nanowire
based transistor-level 3-D integration approach, called SN3D
[1][2], that addresses scaling challenges and achieves significant
benefits over 2-D CMOS while keeping a manageable thermal
profile. In this paper, we present a cost analysis of SN3D and
compare it with 2-D CMOS, conventional TSV based 3-D (T3-D),
and Monolithic 3-D (M3-D) integration. Our cost model captures
the implications of manufacturing, circuit density, interconnects,
bonding, and heat in determining die cost, and evaluates how cost
scales as transistor count increases. Since SN3D is a new 3-D IC
fabric, we represent the complexity of its fabrication steps, based
on our proposed manufacturing pathway [2], as proportionality
constants in the cost estimation model. Our analysis shows that,
in comparison to 2-D CMOS and M3-D respectively, SN3D
achieves 86% and 74% reduction in area; 55% and 43% reduction
in interconnect distribution and total interconnect length
required; a reduced metal layer requirement; and 70% and 68%
reduction in total cost.
 
Index Terms—3-D IC, 3-D CMOS, Fine-Grained 3-D, SN3D,
3-D Manufacturing, 3-D Cost, SN3D Cost
I. INTRODUCTION 
Transistor-level 3-D integration is considered the most
promising direction for replacing 2-D CMOS due to its density
and performance benefits. Our proposal for transistor-level 3-D
integration, the stacked horizontal nanowire based 3-D IC fabric
(SN3D), uses architected features for circuit functionality,
interconnection, and thermal management. We previously
reported large gains with SN3D designs [1-3]. In this paper, we
focus on cost and present a detailed comparison with 2-D
CMOS, TSV based 3-D CMOS (T3-D), and monolithic 3-D
(M3-D) integration.
Fig. 1 shows the core components of the fabric and an overview
of a circuit mapped onto the SN3D fabric. Stacked, suspended
horizontal nanowires (Fig. 1A), which are prefabricated and
pre-doped, are the building blocks. The architected fabric
components, namely Gate-All-Around junctionless nanowire
FETs (Fig. 1B), Common Contacts (CC, Fig. 1C.i), Common
Gates (CG, Fig. 1C.ii), Horizontal Bridges (HB, Fig. 1C.iii),
Horizontal Insulation (HI, Fig. 1D), and Fabric Vias (FV,
Fig. 1E), are formed onto these nanowires through material
deposition techniques [4][5]. The active devices are junctionless
nanowire transistors, which do not require any doping variation
between the drain, source, and channel regions. We previously
demonstrated junctionless device behavior experimentally
[4][5]. The CG feature allows multiple junctionless devices to
be gated by a single input in the vertical and/or horizontal
directions. This enables optimal placement of commonly gated
devices and minimizes inter-device connectivity requirements.
The CCs provide a common contact for stacked NWFETs and
serve as an interconnection between the source and drain
contacts of adjacent transistors. Functionally, CG and CC carry
common signals; physically, however, they are composed of
different material stacks, chosen according to channel-gate
work-function and Ohmic contact requirements respectively.
The HI (Fig. 1D) isolates adjacent gate, source, and drain
contacts in the vertical direction. The HBs [4][5] serve a dual
purpose: connectivity and heat extraction. Together with CG
and CC, the bridges allow routing in 3-D, which is very
different from 2-D CMOS and from through-silicon-via based
3-D CMOS approaches, where routing mostly takes place in the
2-D plane and through vias that connect two layers/dies.
Finally, fabric vias (Fig. 1E) are used for input/output signals
and to carry excess routing to the top metal layers, which
maximizes routability in the SN3D fabric.
Fig. 1: (A) Stacked Nanowires; (B) Junctionless Gate-All-Around Nanowire Transistor; (C) Fabric Interconnects: (i) Common Contact (CC), (ii) Common Gate
(CG), (iii) Horizontal Bridge (HB); (D) Horizontal Insulation (HI); (E) Fabric Vias; (F) SN3D Fabric Overview. Material labels in the figure: Si nanowire, HfO2
gate oxide, Ti/TiN gate, Si3N4 spacer, Ni and Al contacts, W nano-vias, SU8/SiO2 horizontal isolation.
Fig. 2 presents a conceptual view of the four integration
approaches (2-D, T3-D, M3-D, and SN3D) and illustrates how
their interconnection hierarchies differ. In 2-D CMOS, active
devices are on the substrate at the bottom and all routing is
carried out on the metal layers above (Fig. 2A). The device
layers and metal routing are similar in the conventional 3-D
integration approaches (T3-D and M3-D), where the 3-D
package is formed by stacking prefabricated dies and 3-D
routing passes through thick inter-die vias (TSVs for T3-D and
MIVs for M3-D) (Fig. 2B and 2C). Here, die stacking is limited
(typically to two layers), and relatively few vertical
interconnects replace the global interconnects. In contrast,
SN3D is an integrated device-interconnect fabric (Fig. 2D and
1F), in which active devices are implemented in a stacked array
of NWs and 3-D interconnects are routed through architected
fabric interconnect features. These features replace metal
interconnects across the local to global range of interconnect
lengths (Fig. 2D). SN3D thus offers a dense, fine-grained
package of both devices and interconnects, yielding large
reductions in footprint and interconnect length. As a result,
circuits implemented in the SN3D fabric offer increased
functionality and improved performance at reduced power
consumption [1-3]. However, for an IC fabrication approach to
be adopted, such benefits ultimately have to translate into cost
benefits, or at least into an acceptable cost trade-off. This paper
quantifies the benefits of SN3D implementation in terms of
cost. The following section presents the cost model used to
analyze the cost of an SN3D IC.
  
II. COST MODEL  
The cost model used for 2-D CMOS, T3-D, M3-D, and SN3D
is illustrated in Fig. 3. Because SN3D ICs follow a new
fabrication approach with no prior cost data, the early cost
estimation model shown in Fig. 3 is built on estimates of the
major cost-dictating aspects: interconnects, die area, complexity
of process steps, and metal layer requirement. First, the die area
and the interconnect requirement are estimated from the SN3D
design rules, guidelines, and circuit design principles (Fig. 3).
The 3-D interconnects that replace metal interconnects are also
expressed quantitatively. Next, the estimated interconnects and
area are input (Fig. 3) to the metal layer estimation algorithm,
which calculates the number of metal layers required for the
circuit system considered. Finally, the estimated die area and
metal layer count serve as the independent variables of the cost
function, in which the fabrication complexity, i.e., the total
number of different process steps required, is captured by
proportionality parameters. Note that the cost model applies to
all four fabrication approaches considered, because the
underlying fabrication processes are the same for all of them;
the model therefore enables a fair relative comparison of the
SN3D cost estimates with the rest. The next sections present the
schemes for interconnect, die area, and metal layer estimation.
Fig. 2. Interconnect Hierarchy: (A) 2-D CMOS, (B) Through Silicon Via
(TSV) based 3-D IC, (C) Monolithic 3-D (M3-D), (D) SN3D.
LI: Local Interconnect, SGI: Semi-global Interconnect, GI: Global Interconnect,
MIV: Monolithic Inter Via, TSV: Through Silicon Via, NW: Silicon Nanowire,
CG: Common Gate, CC: Common Contact, HI: Horizontal Insulation, HB: Horizontal Bridge.
  
Fig. 3. Cost Estimation Model. [Flow diagram: SN3D design rules,
guidelines, and circuit design principles feed Area Estimation and
Interconnect Estimation, the latter including 3-D interconnects (TSVs,
MIVs, FIs); both feed Metal Layer Estimation; Cost Estimation combines
these with Cooling Cost (2D, 3D, M3D, SN3D) and Bonding Cost (3D, M3D).]
 A. Interconnect Estimation for SN3D                                                
Interconnect projection for a placement of logic gates is an
important input for early estimation of wiring space
requirement, metal layer requirement, delays, power
dissipation, and cost. This section describes the hierarchical
partitioning and placement strategy considered for the SN3D
fabric and estimates the interconnect density of circuits
implemented in SN3D. The interconnect estimates obtained are
used in later sections to estimate metal layers and cost (Fig. 3).
For a hierarchically partitioned circuit system (as shown in
Fig. 4) implemented in SN3D, the number of metal interconnect
terminals ($T$) required by a module containing $N$ blocks is
given by Rent's empirical expression [6], which applies at all
levels of the hierarchy:
    $$T = k N^{p} \qquad (1)$$
where $k$ is the average number of terminals (input and output)
per block (Fig. 4), and $p$ is Rent's exponent, which reflects
the complexity of the design considered. From this, the total
number of interconnects ($I_{total}$) required for a circuit
system is proportional to the number of terminals and is given
by [7]
    $$I_{total} = \alpha k N (1 - N^{p-1}) \qquad (2)$$
where $\alpha$ is the fraction of the on-chip terminals that are
sink terminals; it is related to the average fanout $f.o.$ as
$\alpha = f.o./(f.o. + 1)$. In the SN3D approach, a significant
number of these interconnects are replaced by fabric
interconnects, which reduces interconnect routing on the top
metal layers. This reduction is expressed quantitatively next: a
modified Rent's expression [9] and an interconnect density
function [8] are formulated for SN3D. Consider a circuit design
with $N_g$ gates, as represented in Fig. 4, equally distributed
into $n$ nanowire transistor layers of the SN3D fabric (we
consider $n = 10$ in this paper). This partitioning scheme is
shown in Fig. 5A, where each layer incorporates $N_g/n$ gates.
The number of terminals emanating from each layer ($T_i$) due
to such a partition is given by Rent's expression as
    $$T_i = k \left(\frac{N_g}{n}\right)^{p} \qquad (3)$$
where $i$ is the layer number and $n$ is the total number of
layers. Summing $T_i$ over all layers (Fig. 5A) gives the total
number of terminals $T_{SN3D}$ available in SN3D:
    $$T_{SN3D} = \sum_{i=1}^{n} T_i = n k \left(\frac{N_g}{n}\right)^{p} \qquad (4)$$
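As an illustration, eqs. (1) to (4) can be sketched in Python as
follows. The function names and the sample values of k, p, and
fanout are assumptions made for this sketch; they are not taken
from the paper.

```python
def rent_terminals(k, n_blocks, p):
    """Eq. (1): terminals of a module with n_blocks blocks, T = k * N**p."""
    return k * n_blocks ** p


def total_interconnects(k, n_blocks, p, fanout):
    """Eq. (2): I_total = alpha * k * N * (1 - N**(p - 1))."""
    alpha = fanout / (fanout + 1.0)  # fraction of terminals that are sinks
    return alpha * k * n_blocks * (1.0 - n_blocks ** (p - 1.0))


def sn3d_layer_terminals(k, n_gates, n_layers, p):
    """Eq. (3): terminals emanating from one layer, T_i = k * (N_g / n)**p."""
    return rent_terminals(k, n_gates / n_layers, p)


def sn3d_total_terminals(k, n_gates, n_layers, p):
    """Eq. (4): T_SN3D is the sum of T_i over the n equally loaded layers."""
    return sum(sn3d_layer_terminals(k, n_gates, n_layers, p)
               for _ in range(n_layers))


# Hypothetical sample values (assumptions, not from the paper).
K, P, FANOUT = 4.0, 0.6, 3.0
N_GATES, N_LAYERS = 10_000_000, 10  # n = 10 as considered in the text

print("T (whole design)  :", rent_terminals(K, N_GATES, P))
print("I_total           :", total_interconnects(K, N_GATES, P, FANOUT))
print("T_i (per layer)   :", sn3d_layer_terminals(K, N_GATES, N_LAYERS, P))
print("T_SN3D (all layers):", sn3d_total_terminals(K, N_GATES, N_LAYERS, P))
```

Because $p < 1$, the summed layer terminals $T_{SN3D}$ exceed the
terminals $T$ of the unpartitioned design; this surplus is what
the fabric interconnect features absorb in the derivation that
follows.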
Next, Fig. 5B shows the partitioning and placement scheme in
the SN3D fabric, where the total emanating terminals
($T_{SN3D}$) are utilized in two distinct ways: $T_{feat}$
terminals for the 3-D interconnect features specific to SN3D,
such as CG, CC, and HB (Fig. 1 and 2D), and $T_m$ terminals
for metal layer interconnects. Hence, $T_{SN3D}$ is comprised
of (Fig. 5B)
    $$T_{SN3D} = T_m + T_{feat} \qquad (5)$$
Now, from eq. (1), eq. (4), and eq. (5), with $T_m = k N_g^{p}$
from eq. (1),
    $$T_{feat} = n k \left(\frac{N_g}{n}\right)^{p} - k N_g^{p}$$
    $$T_{feat} = (1 - n^{p-1})\, n k \left(\frac{N_g}{n}\right)^{p}$$
From this, the average number of terminals consumed by the
fabric's 3-D interconnect features per nanowire layer is
    $$T_{feat,i} = \frac{T_{feat}}{n} = (1 - n^{p-1})\, k \left(\frac{N_g}{n}\right)^{p}$$
Similarly, the number of terminals contributing to top metal
layer interconnects per nanowire layer ($T_{m,i}$) is the
difference between the total number of terminals per layer
$T_i$ (eq. (3)) and the layer's fabric interconnect terminals
($T_{feat,i}$):
    $$T_{m,i} = T_i - T_{feat,i}$$
    $$T_{m,i} = k \left(\frac{N_g}{n}\right)^{p} - (1 - n^{p-1})\, k \left(\frac{N_g}{n}\right)^{p}$$
    $$T_{m,i} = n^{p-1}\, k \left(\frac{N_g}{n}\right)^{p}$$
Fig. 4. Rent's Correlation for Hierarchically Partitioned Random Logic
Implemented in SN3D. [Diagram: a module of $N_G$ sub-modules with $T$ module
terminals; each sub-module has $k$ pins.]
Fig. 5. (A) Partitioning scheme for SN3D, where $N_g$ gates are equally distributed into $n$ layers; (B) Partition and placement scheme in the SN3D fabric, leading to two
distinctly utilized sets of interconnect terminals ($T_m$ and $T_{feat}$) for the logic blocks; (C) Regular square-arrayed placement of logic blocks in the SN3D fabric for deriving
the interconnect distribution.
 
Comparing $T_{feat,i}$ and $T_{m,i}$ with the Rent's
expression for each layer (eq. (3)), the effective average
number of terminals per block consumed by fabric
interconnects ($k_{SN3D,feat}$) and by metal interconnects
($k_{SN3D,m}$) are
    $$k_{SN3D,feat} = (1 - n^{p-1})\, k \quad \text{and} \quad k_{SN3D,m} = n^{p-1}\, k \qquad (6)$$
The total number of fabric interconnects can now be expressed
quantitatively by replacing $k$ with $k_{SN3D,feat}$ in eq. (2):
    $$I_{total,feat} = \alpha k (1 - n^{p-1}) N_g (1 - N_g^{p-1}) \qquad (7)$$
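The split of Rent's coefficient in eqs. (5) to (7) can be checked
numerically with a short Python sketch. The function names and
sample values below are assumptions chosen for illustration only.

```python
def split_rent_coefficient(k, n_layers, p):
    """Eq. (6): return (k_feat, k_m) for an n-layer SN3D fabric."""
    metal_share = n_layers ** (p - 1.0)   # n**(p-1), less than 1 for p < 1
    return (1.0 - metal_share) * k, metal_share * k


def fabric_interconnects(k, n_gates, n_layers, p, fanout):
    """Eq. (7): interconnects absorbed by the 3-D fabric features."""
    alpha = fanout / (fanout + 1.0)
    k_feat, _ = split_rent_coefficient(k, n_layers, p)
    return alpha * k_feat * n_gates * (1.0 - n_gates ** (p - 1.0))


# Hypothetical sample values (assumptions, not from the paper).
K, P, FANOUT = 4.0, 0.6, 3.0
N_GATES, N_LAYERS = 10_000_000, 10

k_feat, k_m = split_rent_coefficient(K, N_LAYERS, P)
per_layer = (N_GATES / N_LAYERS) ** P

# Eq. (5): metal terminals plus feature terminals recover T_SN3D of eq. (4).
t_m = N_LAYERS * k_m * per_layer        # equals k * N_g**p, i.e. eq. (1)
t_feat = N_LAYERS * k_feat * per_layer
t_sn3d = N_LAYERS * K * per_layer

print("k_feat, k_m    :", k_feat, k_m)
print("T_m + T_feat   :", t_m + t_feat)
print("T_SN3D         :", t_sn3d)
print("I_total,feat   :", fabric_interconnects(K, N_GATES, N_LAYERS, P, FANOUT))
```

The two effective coefficients always sum to $k$, so the model
redistributes terminals between fabric features and metal
layers without creating or removing any.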
Similarly, the metal interconnects of different lengths can be
calculated by replacing $k$ with $k_{SN3D,m}$ in an
interconnect estimation model. Fig. 5C shows a regular
square-arrayed placement of a hierarchically partitioned system
in SN3D, where each cube represents a logic block and the
spacing between adjacent blocks is the unit distance, called the
gate pitch ($u$). All interconnects between logic blocks are
assumed to follow their Manhattan grid distances [7][8] in the
square array. For example, the elements numbered 2 in block B
(Fig. 5C) are at a distance of 2 units ($2u$) from block A;
similarly, elements in block C are at a distance of $l$ (i.e.,
$x + y$) from elements in block A. For such a placement,
ref. [8] gives a stochastic wire length distribution function
$i(l)$, a continuous interconnect density function that gives the
number of interconnects available at a given length $l$. The
closed-form analytical expression for $i(l)$ is a function of the
variables $N_g$, $l$, $p$, and $k$:
    $$i(l) = f(N_g, l, p, k)$$
These variables capture the characteristics of the circuit system
considered and of the silicon floor architecture. $N_g$ is the
size of the logic design, i.e., the number of logic blocks in the
circuit system; $p$ is Rent's exponent, which reflects the
complexity of the circuit; $l$ is the Manhattan distance in
multiples of $u$ ($u$ depends on the architecture of the silicon
floor, i.e., it is different for 2-D, T3-D, M3-D, and SN3D); and
$k$ is the average number of terminals per block in the circuit
system. As discussed for eqs. (5) and (6), the reduction in metal
interconnects due to fabric interconnects is expressed through
the reduced coefficient $k_{SN3D,m}$. Thus, employing
$k_{SN3D,m}$ for $k$ [9], the modified expression for $i(l)$ [8]
in SN3D is
Region I: $1 \le l \le \sqrt{N_g}$
    $$i_{SN3D}(l) = \frac{\alpha\, k\, n^{p-1}}{2}\, \Gamma \left( \frac{l^{3}}{3} - 2\sqrt{N_g}\, l^{2} + 2 N_g l \right) l^{2p-4}$$
Region II: $\sqrt{N_g} \le l \le 2\sqrt{N_g}$
    $$i_{SN3D}(l) = \frac{\alpha\, k\, n^{p-1}}{6}\, \Gamma \left( 2\sqrt{N_g} - l \right)^{3} l^{2p-4} \qquad (8)$$
where $\alpha$ is the fraction of the on-chip terminals that are
sink terminals, related to the average fanout $f.o.$ as
    $$\alpha = \frac{f.o.}{f.o. + 1}$$
The normalization factor $\Gamma$ is given as [8]
    $$\Gamma = \frac{2 N_g \left(1 - N_g^{p-1}\right)}{-N_g^{p}\, \dfrac{1 + 2p - 2^{2p-1}}{p(2p-1)(p-1)(2p-3)} - \dfrac{1}{6p} + \dfrac{2\sqrt{N_g}}{2p-1} - \dfrac{N_g}{p-1}}$$
Integrating $i(l)$ over a range of interconnect lengths gives the
total number of interconnects in that range; this is the
cumulative interconnect density function $I_{SN3D}(l)$ [8]:
    $$I_{SN3D}(l) = \int_{1}^{l} i_{SN3D}(x)\, dx \qquad (9)$$
Fig. 6 shows the interconnect distribution $i(l)$ and cumulative
interconnect distribution $I(l)$ functions evaluated for SN3D,
M3-D, and 2-D designs. SN3D shows a large reduction in the
interconnect distributions (Fig. 6) across the local to global
interconnect range, due to the availability of fine-grained 3-D
fabric routing features (Fig. 1 and 2D) from device to system
level. This impact is depicted pictorially in Fig. 7: the quantity
of 3-D interconnects is given by $I_{total,feat}$ (eq. (7)),
represented in the figure by the interconnect lines among the
blocks, and the reduced number of metal layer interconnects is
given by $I_{SN3D}(2\sqrt{N_g})$ (eq. (9)), represented as metal
lines on top. The reduction in total interconnects
$I_{SN3D}(2\sqrt{N_g})$ relative to $I_{2D}(2\sqrt{N_g})$ is 54%
(this can be further enhanced by exploring efficient routing
options in the SN3D fabric). This reduction lowers the average
interconnect length and the total wiring requirement of the
chip, and consequently alleviates the metal layer requirement,
wire delays, power loss, and cost.
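A minimal numerical sketch of eqs. (8) and (9) is given below. It
evaluates the stochastic wire-length distribution of [8] with an
effective Rent coefficient and integrates it with composite
Simpson's rule. The integration scheme, the function names, and
the sample values are assumptions of this sketch, not the
authors' implementation; $p$ must differ from 0.5, 1, and 1.5,
where the normalization factor is singular.

```python
import math


def davis_gamma(n_gates, p):
    """Normalization factor Gamma of the wire-length distribution [8]."""
    n = float(n_gates)
    denom = (
        -(n ** p) * (1 + 2 * p - 2 ** (2 * p - 1))
        / (p * (2 * p - 1) * (p - 1) * (2 * p - 3))
        - 1.0 / (6 * p)
        + 2.0 * math.sqrt(n) / (2 * p - 1)
        - n / (p - 1)
    )
    return 2.0 * n * (1.0 - n ** (p - 1)) / denom


def wire_density(l, n_gates, p, k_eff, fanout):
    """Eq. (8): interconnect density i(l); k_eff is k (2-D) or n**(p-1)*k (SN3D)."""
    alpha = fanout / (fanout + 1.0)
    root = math.sqrt(n_gates)
    if l < 1.0 or l > 2.0 * root:
        return 0.0
    scale = alpha * k_eff * davis_gamma(n_gates, p) * l ** (2 * p - 4)
    if l <= root:  # Region I
        return scale / 2.0 * (l ** 3 / 3.0 - 2.0 * root * l ** 2 + 2.0 * n_gates * l)
    return scale / 6.0 * (2.0 * root - l) ** 3  # Region II


def _simpson(f, a, b, steps):
    """Composite Simpson's rule on [a, b] with an even number of steps."""
    if b <= a:
        return 0.0
    steps += steps % 2
    h = (b - a) / steps
    total = f(a) + f(b)
    for j in range(1, steps):
        total += (4 if j % 2 else 2) * f(a + j * h)
    return total * h / 3.0


def cumulative_wires(l, n_gates, p, k_eff, fanout, steps=2000):
    """Eq. (9): I(l), the integral of i(x) from 1 to l, split at sqrt(N_g)."""
    root = math.sqrt(n_gates)
    upper = min(l, 2.0 * root)

    def f(x):
        return wire_density(x, n_gates, p, k_eff, fanout)

    total = _simpson(f, 1.0, min(upper, root), steps)
    if upper > root:
        total += _simpson(f, root, upper, steps)
    return total


# Hypothetical sample values (assumptions, not from the paper).
N_GATES, P, K, FANOUT, N_LAYERS = 10_000, 0.6, 4.0, 3.0, 10
l_max = 2.0 * math.sqrt(N_GATES)
i_2d = cumulative_wires(l_max, N_GATES, P, K, FANOUT)
i_sn3d = cumulative_wires(l_max, N_GATES, P, N_LAYERS ** (P - 1) * K, FANOUT)
print("metal interconnects, 2-D :", i_2d)
print("metal interconnects, SN3D:", i_sn3d)
print("reduction                :", 1.0 - i_sn3d / i_2d)
```

Since $i(l)$ is linear in the Rent coefficient, the ratio of the
two cumulative counts in this sketch is simply $n^{p-1}$; the
reduction it prints therefore depends only on the assumed $n$
and $p$.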
Fig. 9. Estimated Area Comparisons
Fig. 6. Interconnect distributions calculated for 2-D, 3-D, and SN3D integration approaches: (A) Interconnect Density Function $i(l)$; (B) Cumulative
Interconnect Density Function $I(l)$. Horizontal axes: interconnect length $l$ [gate pitches].
 B. Die Area Estimation 
Die area for gate-limited designs is proportional to the total
number of gates in the design [10]. Accordingly, it can be
formulated for 2-D, T3-D, and M3-D as
    $$A_{2D} = N_g A_{g,2D}$$
    $$A_{3D} = N_g A_{g,3D} + N_{TSV} A_{TSV}$$
    $$A_{M3D} = N_g A_{g,M3D} + N_{MIV} A_{MIV}$$
where $N_g$ is the total number of gates in the design and
$A_g$ is the average area of each gate. The average gate area in
2-D CMOS ($A_{g,2D}$) is taken from [10] as 3125 λ².
Assuming an efficient two-layer 3-D integration, the average
gate area for T3-D and M3-D is taken as half the 2-D CMOS
gate area (i.e., $A_{g,3D} = A_{g,M3D} = A_{g,2D}/2$). As the
expressions above show, T3-D and M3-D additionally suffer
from area overhead due to TSVs and MIVs respectively. Here,
$N_{TSV}$ and $N_{MIV}$ are the numbers of TSVs and MIVs
required, and $A_{TSV}$ and $A_{MIV}$ are the block-out areas
of each TSV and MIV respectively. $N_{TSV}$ is estimated from
[10], and $N_{MIV}$ for a transistor-level monolithic 3-D design
can be expressed from eq. (6) and eq. (2) as (M3-D is limited to
a two-tier design)
    $$N_{MIV} = \alpha k (1 - 2^{p-1}) N_g (1 - N_g^{p-1})$$
Similarly, the die area for SN3D is given by
    $$A_{die,SN3D} = N_g A_{g,SN3D}$$
Based on our previous SN3D layout designs [1-3], the average
gate area $A_{g,SN3D}$ calculated for SN3D is 432 λ². Fig. 8
shows a single-gate layout [2] in SN3D, which consumes the
footprint of only one NW transistor (the other transistors are
implemented in subsequent nanowires on the layers below)
plus the footprint of the fabric vias and HBs. Note that the
block-out area for fabric vias is included in the SN3D gate area.
Evaluating these die area expressions, Fig. 9 shows the die area
estimates for the 2-D CMOS, T3-D, M3-D, and SN3D
approaches for 5, 10, and 20 million logic gate designs. SN3D
shows a large reduction in area compared to the other
approaches: 86% and 74% with respect to 2-D CMOS and M3-D
respectively. Subsequent sections use these area estimates for
metal layer estimation and cost projection.
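The die area expressions above can be evaluated with a few lines
of Python. The gate areas (3125 λ² and 432 λ²) come from the
text; the function names are assumptions of this sketch, and the
via counts and via block-out areas for the stacked approaches
are left as caller-supplied parameters because the text does not
list their values.

```python
A_G_2D = 3125.0    # average 2-D CMOS gate area in lambda^2 (from the text)
A_G_SN3D = 432.0   # average SN3D gate area in lambda^2 (from the text)


def die_area_2d(n_gates):
    """A_2D = N_g * A_g,2D."""
    return n_gates * A_G_2D


def die_area_stacked(n_gates, n_vias, via_area):
    """T3-D or M3-D: half the 2-D gate area per gate plus via block-out area.

    n_vias and via_area (TSVs or MIVs) must be supplied by the caller.
    """
    return n_gates * (A_G_2D / 2.0) + n_vias * via_area


def die_area_sn3d(n_gates):
    """A_die,SN3D = N_g * A_g,SN3D (fabric via area is already included)."""
    return n_gates * A_G_SN3D


def area_reduction(new_area, reference_area):
    """Fractional area reduction of new_area relative to reference_area."""
    return 1.0 - new_area / reference_area


for n_gates in (5_000_000, 10_000_000, 20_000_000):
    reduction = area_reduction(die_area_sn3d(n_gates), die_area_2d(n_gates))
    print(n_gates, "gates: SN3D area reduction vs 2-D =", round(reduction, 4))
```

Because both areas scale linearly with $N_g$, the SN3D reduction
relative to 2-D CMOS is independent of the gate count in this
model and follows directly from the ratio of the two gate areas.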
 
C. Metal layer Estimation  
The metal layer estimation algorithm [11][10] is based on
iterative bottom-up placement of the interconnect distribution
onto the routing area available in successive metal layers. The
total interconnect routing length available on a metal layer is
given by [11]
    $$L_{av,i} = \eta\, \frac{A_{die,SN3D} - A_{vias,i}}{w_i}$$
where $A_{die,SN3D}$ is the SN3D die area, $i$ is the metal
layer index, $\eta$ is the layer's routing efficiency,
$A_{vias,i}$ is the layer's total via block-out area, and $w_i$ is
the layer's metal wire pitch. $A_{vias,i}$ is estimated as [11]
    $$A_{vias,i} = 2 A_{v,i} \left( N_g\, f.o. - I(l) \right)$$
where $A_{v,i}$ is the block-out area of a single via on layer
$i$, $N_g$ is the total gate count, $f.o.$ is the average fanout
per gate, and $I(l)$ is the cumulative interconnect density,
which gives the total number of interconnects routed up to the
current layer. Here, $N_g\, f.o.$ is the total number of fanout
interconnects in the design; thus, $(N_g\, f.o. - I(l))$ gives the
total number of fanout interconnects routed above the current
metal layer $i$ [11], and twice this number gives the
approximate number of vias passing through the current metal
layer [11]. Fig. 9 illustrates the metal layer estimation
algorithm. Starting from the bottom metal layer, the
interconnects given by the interconnect distribution function
are routed iteratively until the layer's available routing length
$L_{av,i}$ is depleted. The subsequent interconnects are routed
similarly on the next metal layers, as depicted in Fig. 9. The
upper bound condition for iterative routing of interconnects in
a metal layer is
    $$\chi L(l_i) - \chi L(l_{i-1}) \le L_{av,i}$$
Fig. 8. Layout of SN3D Standard Cell (24 λ × 18 λ).
Fig. 9. Metal Layer Estimation Algorithm. [Plot: interconnect density function $i(l)$ versus interconnect length $l$ (gate pitches),
partitioned from Metal 1 to Metal n with available routing lengths $L_{av,1}$, $L_{av,2}$, ..., $L_{av,i}$.]
Fig. 7. Potentials of SN3D fabric: dense placement strategy for
devices/logic blocks; increase in interconnect resources due to the fabric's 3-D
interconnects ($I_{total,feat}$); reduction in metal interconnect requirement.
Where $L(l_i)$ is the cumulative interconnect wire length up to
the $i$th layer, and $\chi = 4/(f.o. + 3)$ is a factor that
accounts for the fraction of interconnects shared within a
common net [11]. Once all the interconnects in the interconnect
distribution have been routed, the routing algorithm terminates,
and the final value of the layer index $i$ gives the estimated
number of metal layers required for the circuit system
considered. Table 1 presents the results of metal layer
estimation for the 2-D, T3-D, M3-D, and SN3D approaches for
5, 10, and 20 million logic gate designs. SN3D shows 2.5x
reduction in metal layer requirement, which reduces the cost of
metal layers, as discussed in the next section.
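The bottom-up routing procedure described in this section can be
sketched in Python as follows. This is a simplified, discretized
illustration under stated assumptions, not the algorithm of [11]
as implemented by the authors: the interconnect distribution is
passed in as hypothetical (length, count) bins, the via
block-out area is the same on every layer, and the via blockage
of a layer is computed from the wires routed before that layer
starts. All names and sample values are assumptions.

```python
def estimate_metal_layers(wire_bins, die_area, gate_pitch, wire_pitches,
                          via_area, n_gates, fanout, efficiency=0.5):
    """Greedy bottom-up estimate of the number of metal layers.

    wire_bins    -- iterable of (length_in_gate_pitches, wire_count)
    wire_pitches -- metal wire pitch of each available layer, bottom first
    Lengths, pitches, and areas must use consistent units.
    """
    bins = sorted(wire_bins)               # shortest wires are routed first
    if not bins:
        return 0
    chi = 4.0 / (fanout + 3.0)             # net-sharing factor
    counts = [float(count) for _, count in bins]
    routed = 0.0                           # I(l): wires routed so far
    idx = 0
    for layer, pitch in enumerate(wire_pitches, start=1):
        # Vias of fanout wires still to be routed above block part of the layer.
        via_block = 2.0 * via_area * max(n_gates * fanout - routed, 0.0)
        available = efficiency * max(die_area - via_block, 0.0) / pitch
        while idx < len(bins):
            wire_len = chi * bins[idx][0] * gate_pitch
            fit = min(counts[idx], available // wire_len)
            counts[idx] -= fit
            routed += fit
            available -= fit * wire_len
            if counts[idx] > 0:
                break                      # layer is full; move to the next one
            idx += 1
        if idx == len(bins):
            return layer
    raise ValueError("wire_pitches has too few layers for this design")


# Hypothetical toy input (assumptions, not data from the paper).
toy_bins = [(1, 100), (10, 10)]
layers = estimate_metal_layers(toy_bins, die_area=100.0, gate_pitch=1.0,
                               wire_pitches=[1.0, 1.0, 1.0], via_area=0.0,
                               n_gates=50, fanout=1.0, efficiency=1.0)
print("estimated metal layers:", layers)
```

In practice the (length, count) bins would be obtained by
sampling the interconnect density function of eq. (8), and the
pitch list would reflect the wider pitches of upper metal layers.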
D. Cost Approximation 
The cost of a chip is the sum of its constituent costs: die cost,
metal layer cost, cooling cost, and bonding cost (package cost
is not considered):
    $$C = C_{die} + C_{metal} + C_{cooling} + C_{bonding} \qquad (11)$$
Projecting these constituent costs for a fabrication approach
requires both prior cost data from the foundry and the
information available at the design phase. Because SN3D is a
new fabrication approach with no previous foundry cost data,
we have developed a cost model that captures the available
design phase information and produces a forecast proportional
to the actual cost; it thus enables a faithful comparison of SN3D
cost with the 2-D, T3-D, and M3-D approaches. Design phase
information such as die area ($A_{die}$), number of metal
layers ($m$), and the fabrication process steps required serve
as the determinants. Accordingly, the general expression for
die cost and metal cost is
    $$C_{die} + C_{metal} = (c_{pd}\, A_{die}) + (c_{pm}\, m\, A_{die}) \qquad (12)$$
where $A_{die}$ and $m$ (estimated in the previous sections)
are the variables that depend on the physical design of the
circuit system, and $c_{pd}$ and $c_{pm}$ are proportionality
constants parameterized according to the process steps
involved in fabrication. $c_{pd}$ is proportional to the process
steps involved in fabricating active devices on the die, and
$c_{pm}$ is proportional to the process steps involved in
fabricating each metal layer on top of the die. Next, we
formulate these parameters for the different fabrication
approaches.
The sequential processes involved in any IC fabrication can
be classified into five major process steps: photolithography,
diffusion, deposition, etching, and implantation. We quantify
and parameterize these process steps in terms of an arbitrary
cost. Let $k_c$ be the arbitrary cost incurred by a unit area of
silicon chip when it is subjected once to each of the five
processes. This is illustrated in Fig. 10A, where a unit area of
substrate undergoes one photolithography, one diffusion, one
implantation, one deposition, and one etching step. Thus,
$k_c$ can be decomposed into the constituent cost constants of
photolithography $k_{PL}$, diffusion $k_{DF}$, deposition
$k_{DP}$, etching $k_{ET}$, and implantation $k_{IM}$:
    $$k_c = k_{PL} + k_{DF} + k_{ET} + k_{DP} + k_{IM}$$
Fig. 10C [12] depicts the relative cost of these major process
steps in the semiconductor industry. From these relative cost
statistics (Fig. 10C) [12], the constituent cost constants can be
expressed as fractions of $k_c$:
    $$k_{PL} = 0.32\,k_c;\; k_{DF} = 0.22\,k_c;\; k_{ET} = 0.18\,k_c;\; k_{DP} = 0.16\,k_c;\; k_{IM} = 0.12\,k_c$$
Next, counting the different process steps in the process
sequence [2][13-17] of a particular fabrication approach gives
the cost per unit area ($c_p$) for that approach:
    $$c_p = n_{PL} k_{PL} + n_{DF} k_{DF} + n_{ET} k_{ET} + n_{DP} k_{DP} + n_{IM} k_{IM}$$
    $$c_p = (0.32\,n_{PL} + 0.22\,n_{DF} + 0.18\,n_{ET} + 0.16\,n_{DP} + 0.12\,n_{IM})\, k_c \qquad (13)$$
where $n_{PL}$, $n_{DF}$, $n_{ET}$, $n_{DP}$, and $n_{IM}$ are
the numbers of photolithography, diffusion, etching,
deposition, and implantation steps required respectively. For
example, Fig. 10B shows the number of different process steps
required for an SN3D die (Table 2) [2][13]; $c_p$ for the SN3D
die ($c_{pd}$) is therefore calculated from eq. (13) as
$26.54\,k_c$. Similarly, Table 2 presents the number of different
process steps required per unit area of 2-D CMOS [14], T3-D
[15], M3-D [16], and SN3D [2][13] dies, and per unit area of a
metal layer [14]. Note that T3-D and M3-D require more than
twice the 2-D process steps, because the two stacked dies incur
equal but separate process steps, and the steps in excess of
twice the 2-D CMOS count are needed to drill TSVs and MIVs
through the dies.
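As a worked illustration of eq. (13), the following Python sketch
computes the process parameter from a table of step counts. The
cost fractions are those stated above; the function and variable
names are assumptions of this sketch. The example uses the
metal layer column of Table 2 (2 photolithography, 4 deposition,
and 4 etching steps).

```python
# Relative cost of one step of each process, as a fraction of k_c.
COST_FRACTION = {
    "photolithography": 0.32,
    "diffusion": 0.22,
    "etching": 0.18,
    "deposition": 0.16,
    "implantation": 0.12,
}


def process_parameter(step_counts):
    """Eq. (13): cost per unit area, in units of k_c, for given step counts.

    step_counts maps a process name to the number of times it is applied;
    processes that are not used may simply be omitted.
    """
    unknown = set(step_counts) - set(COST_FRACTION)
    if unknown:
        raise KeyError("unknown process step(s): " + ", ".join(sorted(unknown)))
    return sum(COST_FRACTION[name] * count
               for name, count in step_counts.items())


# Metal layer column of Table 2.
metal_layer_steps = {"photolithography": 2, "deposition": 4, "etching": 4}
print("c_pm in units of k_c:", round(process_parameter(metal_layer_steps), 2))
```

For the metal layer counts this evaluates to 2 (in units of
$k_c$), which is the metal layer process parameter used in the
final cost expressions.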
Substituting these process step counts in eq. (13), and using
eq. (12) and eq. (11), the final cost expressions for 2-D, T3-D,
M3-D, and SN3D are
    $$C_{2D} = 6.26\,k_c A_{2D} + 2\,m\,k_c A_{2D} + C_{cooling}$$
    $$C_{3D} = 7.26\,k_c A_{3D} + 2\,m\,k_c A_{3D} + C_{bonding} + C_{cooling}$$
    $$C_{M3D} = 7.26\,k_c A_{M3D} + 2\,m\,k_c A_{M3D} + C_{bonding} + C_{cooling}$$
    $$C_{SN3D} = 26.54\,k_c A_{SN3D} + 2\,m\,k_c A_{SN3D} + C_{cooling}$$
where 6.26, 7.26, 7.26, and 26.54 are the die process
parameters ($c_{pd}$, in units of $k_c$) calculated for 2-D,
T3-D, M3-D, and SN3D respectively, whereas the metal layer
process parameter $c_{pm}$ is the same (i.e., 2) for all the
approaches, because the metal layer process steps are identical.
$A_{2D}$, $A_{3D}$, $A_{M3D}$, and $A_{SN3D}$ are the die
areas estimated in Section B for 2-D, T3-D, M3-D, and SN3D
respectively, and the metal layer estimates ($m$) for all the
approaches are taken from Table 1. Evaluating these expressions
gives the SN3D cost estimate and its comparison with the
conventional fabrication approaches.

TABLE 1
METAL LAYER ESTIMATION
Ng      2-D CMOS   TSV 3-D   M3-D   SN3D
5 M     5          5         3      3
10 M    6          5         4      3
20 M    7          6         5      4

Fig. 10. Parameterizing Process Steps: (A) Unit process parameter for cost
($n_{PL} = n_{IM} = n_{DF} = n_{ET} = n_{DP} = 1$, $c_{PD} = k_c$);
(B) SN3D process parameter for cost ($n_{PL} = 2$, $n_{IM} = 0$, $n_{DP} = 40$,
$n_{ET} = 51$, $n_{DF} = 2$, $c_{PD} = 26.54\,k_c$); (C) Relative cost of the
process steps (photolithography 32%, diffusion 22%, etching 18%,
deposition 16%, implantation 12%).

TABLE 2
PROCESS STEPS
Process            2D [14]   3D/M3D [15-17]   SN3D [2][13]   Metal [14]
Photolithography   9         19               2              2
Diffusion          4         8                2              -
Implantation       7         14               -              -
Deposition         4         10               40             4
Etching            5         13               51             4
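The cost expressions above can be evaluated with a short Python
sketch. The die process parameters, gate areas, and metal layer
counts are taken from the text and Table 1; the function name is
an assumption, and the cooling and bonding terms are left at
zero here because the text gives no closed-form values for them.
The printed ratio therefore covers die and metal cost only and
is not the total cost comparison of Fig. 11.

```python
def chip_cost(die_param, die_area, metal_layers, metal_param=2.0,
              cooling=0.0, bonding=0.0):
    """Eqs. (11) and (12): chip cost in units of k_c.

    die_param    -- c_pd, die process parameter in units of k_c
    metal_param  -- c_pm, metal layer process parameter (2 for all approaches)
    cooling, bonding -- additive terms, also in units of k_c (assumed inputs)
    """
    die_cost = die_param * die_area
    metal_cost = metal_param * metal_layers * die_area
    return die_cost + metal_cost + cooling + bonding


N_GATES = 10_000_000                 # 10 M gate design
area_2d = N_GATES * 3125.0           # lambda^2, gate area from the text
area_sn3d = N_GATES * 432.0          # lambda^2, gate area from the text

# Metal layer counts for 10 M gates from Table 1: 2-D needs 6, SN3D needs 3.
cost_2d = chip_cost(6.26, area_2d, metal_layers=6)
cost_sn3d = chip_cost(26.54, area_sn3d, metal_layers=3)

print("2-D  die + metal cost [k_c * lambda^2]:", cost_2d)
print("SN3D die + metal cost [k_c * lambda^2]:", cost_sn3d)
print("SN3D / 2-D ratio                      :", round(cost_sn3d / cost_2d, 4))
```

The sketch makes the trade-off explicit: the SN3D die process
parameter is roughly four times that of 2-D CMOS, but the much
smaller die area and lower metal layer count outweigh it.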
Fig. 11 shows the cost estimates evaluated for the four
integration approaches, along with their component costs. The
SN3D cost estimate shows large savings for the following
reasons. First, although the number of process steps increases
in SN3D due to the dense volumetric packing of devices
(reflected in the process constant 26.54), the large reduction in
die area reduces the die cost. Moreover, as the SN3D process
counts in Table 2 and the process cost statistics in Fig. 10C
show, SN3D shifts processing toward relatively cheaper
deposition and etching steps, compared to the extreme
lithography steps needed at nanoscale dimensions in
conventional approaches. Second, the reduction in metal
interconnects and in the metal layer requirement lowers the
metal layer cost for SN3D. Third, SN3D incurs no bonding cost
(Fig. 11), whereas T3-D and M3-D incur significant bonding
cost. The bonding costs presented are estimated from relative
cost data in [10]. Fourth, the cooling cost is linearly
proportional to the temperature of the chip [10]; T3-D and
M3-D require relatively high cooling cost because of extreme
working temperatures caused by the limited heat escape paths
in the stacked dies. SN3D, owing to its fine-grained thermal
management scheme, heats up less [18] than the other three and
therefore incurs a low cooling cost.
III.  CONCLUSION  
We have presented in detail an approach for relative cost
estimation of different fabrication methodologies, based on
quantitative estimation of the major cost-dictating aspects, and
compared the cost of an SN3D IC with 2-D CMOS, T3-D, and
M3-D costs (Fig. 11). This cost evaluation methodology is
generic and applicable to relative cost estimation of any
fabrication methodology; only the parameterized constants
change, according to the process complexity (number of
different process steps), and these can be derived from eq. (13).
Such cost estimation enables a fair comparison of cost
estimates at the design phase. Because it is expressed in units
of an arbitrary cost ($k_c$), which serves as a common base for
consistent comparison between different fabrication
approaches, the estimated cost is proportional to the actual
cost. Our results show that SN3D costs 70%, 67%, and 68% less
than 2-D, T3-D, and M3-D respectively, making it a promising
3-D integration direction for research.
IV. REFERENCES 
[1] N. K. Macha, et al., "Fine-grained 3-D CMOS concept using 
stacked horizontal nanowire," 2016 NANOARCH IEEE/ACM 
International Symposium on Nanoscale Architectures 
(NANOARCH), Beijing, 2016, pp. 151-152. 
[2] N. K. Macha, M. A. Iqbal, and M. Rahman, “New 3-D CMOS
Fabric with Stacked Horizontal Nanowires,” submitted for
publication, 2017.
[3] N. Macha, S. Geedipally, and M. Rahman, “Ultra high density
3D SRAM cell design in Stacked Horizontal Nanowire (SN3D)
fabric,” 2017 IEEE/ACM International Symposium on
Nanoscale Architectures (NANOARCH), pp. 155–161, 2017.
[4] M. Rahman, P. Narayanan, S. Khasanvis, J. Nicholson, and C. 
A. Moritz, “Experimental prototyping of beyond-CMOS 
nanowire computing fabrics,” Proc. 2013 IEEE/ACM Int. 
Symp. Nanoscale Archit. NANOARCH 2013, pp. 134–139, 
2013. 
[5] M. Rahman, J. Shi, M. Li, S. Khasanvis, and C. A. Moritz, 
“Manufacturing pathway and experimental demonstration for 
nanoscale fine-grained 3-D integrated circuit fabric,” IEEE-
NANO 2015 - 15th Int. Conf. Nanotechnol., pp. 1214–1217, 
2016. 
[6] B. S. Landman and R. L. Russo, “On a Pin Versus Block 
Relationship For Partitions of Logic Graphs,” IEEE Trans. 
Comput., vol. C-20, no. 12, pp. 1469–1479, 1971. 
[7] W. E. Donath, “Placement and Average Interconnection 
Lengths of Computer Logic,” IEEE Trans. Circuits Syst., vol. 
26, no. 4, pp. 272–277, 1979.    
[8]   J. A. Davis, et al., “A stochastic wire-length distribution for 
gigascale integration (GSI). Part I: Derivation and validation,” 
IEEE Trans. Electron Devices, vol. 45, no. 3, pp. 580–589, 
Mar. 1998. 
[9] Shukri J Souri, “3D ICs Interconnect Performance Modeling 
and Analysis,” Ph.D. dissertation, pp. 29–41, 2002. 
[10] X. Dong, J. Zhao, and Y. Xie, “Fabrication cost analysis and 
cost-aware design space exploration for 3-D ICs,” IEEE 
Trans. Comput. Des. Integr. Circuits Syst., vol. 29, no. 12, pp. 
1959–1972, 2010. 
[11] P. Chong and R. K. Brayton, “Estimating and optimizing 
routing utilization in DSM design,” in Proc. Workshop Syst.-
Level Interconnect Prediction, 1999, pp. 97–102. 
[12] Y. Lai, “Cost Per Wafer,” Imid 2009, pp. 1069–1072, 2009. 
[13] R. M. Y. Ng, T. Wang, F. Liu, X. Zuo, J. He, and M. Chan, 
“Vertically stacked silicon nanowire transistors fabricated by 
inductive plasma etching and stress-limited oxidation,” IEEE 
Electron Device Lett., vol. 30, no. 5, pp. 520–522, 2009. 
[14] J. D. Plummer, et al., “Modern CMOS Technology,” in
Silicon VLSI Technology: Fundamentals, Practice and
Modeling. New Jersey: Prentice-Hall, 2000, ch. 2, pp. 49–
92.
Fig. 11. Estimated Cost Comparisons. [Stacked bar chart: price in units of
$k_c$ versus $N_g$, the number of gates in the design, for 2D, 3D, M3D, and
SN3D; components: die cost, cooling cost, bonding cost, metal cost.]
[15] V. Pangracious, Z. Marrakchi, and H. Mehrez, “Three-
Dimensional Integration: A More Than Moore Technology,”
vol. 350, pp. 13–41, 2015, doi: 10.1007/978-3-319-19174-4_2.
[16] C. Liu and S. K. Lim, “A design tradeoff study with 
monolithic 3D integration,” Proc. - Int. Symp. Qual. Electron. 
Des. ISQED, no. 404, pp. 529–536, 2012. 
[17] Y. Lee, P. Morrow, and S. K. Lim, “Ultra High Density Logic 
Designs Using Transistor-Level Monolithic 3D Integration,” 
ICCAD '12 Proceedings of the International Conference on 
Computer-Aided Design, pp. 539–546. 
[18] M. A. Iqbal and M. Rahman, “New Thermal Management
Approach for Transistor-level 3-D Integration,” IEEE SOI-
3D-Subthreshold Microelectronics Technology Unified
Conference (S3S), 2017.
 
 
 
 
  
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
