Skybridge: A New Nanoscale 3-D Computing Framework for Future Integrated Circuits by Rahman, Mostafizur
University of Massachusetts Amherst 
ScholarWorks@UMass Amherst 
Doctoral Dissertations Dissertations and Theses 
November 2015 
Skybridge: A New Nanoscale 3-D Computing Framework for 
Future Integrated Circuits 
Mostafizur Rahman 
Department of Electrical and Computer Engineering 
Recommended Citation 
Rahman, Mostafizur, "Skybridge: A New Nanoscale 3-D Computing Framework for Future Integrated 
Circuits" (2015). Doctoral Dissertations. 524. 
https://scholarworks.umass.edu/dissertations_2/524 
  
 
 
 
 
 
SKYBRIDGE: A NEW NANOSCALE 3-D COMPUTING FRAMEWORK FOR 
FUTURE INTEGRATED CIRCUITS  
 
 
 
 
 
 
 
 
 
 
A Thesis Presented 
 
 
by 
 
MOSTAFIZUR RAHMAN 
 
 
 
 
 
 
 
Submitted to the Graduate School of the 
University of Massachusetts Amherst in partial fulfillment 
of the requirements for the degree of 
 
DOCTOR OF PHILOSOPHY 
 
September 2015 
 
Department of Electrical and Computer Engineering 
 
 
 
 
 
 
 
  
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
© Copyright by Mostafizur Rahman 2015 
All Rights Reserved 
  
  
SKYBRIDGE: A NEW NANOSCALE 3-D COMPUTING FRAMEWORK FOR 
FUTURE INTEGRATED CIRCUITS 
 
 
 
 
 
A Thesis Presented 
 
by 
 
MOSTAFIZUR RAHMAN 
 
 
 
 
 
Approved as to style and content by: 
 
 
_______________________________________ 
Csaba Andras Moritz, Chair 
 
 
_______________________________________ 
Israel Koren, Member 
 
 
_______________________________________ 
C. Mani Krishna, Member 
 
 
_______________________________________ 
Charles Weems, Member 
 
 
 
 
 
 
________________________________ 
Christopher V. Hollot, Department Head 
Electrical and Computer Engineering 
 
 
 
ACKNOWLEDGEMENTS 
 
 First and foremost, I praise and thank God Almighty for this accomplishment. My 
efforts were very insignificant compared to His blessings.    
 I would also like to take this opportunity to thank my advisor, PhD committee
members, teachers, colleagues, friends and family. My advisor, Prof. Csaba Andras
Moritz, has been instrumental to my growth as a researcher. He has been an excellent
mentor and a constant source of guidance and inspiration. I thank him wholeheartedly, and
will keep his ideals close to my heart as I progress to a new career as an educator. I am
grateful to my dissertation committee members Prof. Koren, Prof. Krishna, and Prof.
Weems for their valuable feedback and suggestions throughout the course of my PhD. I
am thankful to my teachers at UMass Amherst, who helped me develop as a researcher
through coursework. I am indebted to my colleagues, who were not just great
collaborators but also close friends. I am grateful to Dr. Pritish Narayanan for his 
guidance and mentoring during my initial years. I am thankful to Santosh Khasanvis for 
his support and kind cooperation throughout my PhD. I would like to also thank Pavan 
Panchapakeshan, Priyamvada Vijayakumar, Prasad Shabadi, Md. Muwyid Khan, Sankara 
Narayanan Rajapandian, Jianfeng Zhang, Jiajun Shi and Mingyu Li. I am thankful to Dr. 
John Nicholson for his suggestions and advice on experimental work. Finally, I would 
like to express my sincere gratitude to all my family and friends for their continued love 
and support through all these years.   
 
ABSTRACT 
 
SKYBRIDGE: A NEW NANOSCALE 3-D COMPUTING FRAMEWORK FOR 
FUTURE INTEGRATED CIRCUITS  
 
September 2015 
 
B.Sc., NORTH SOUTH UNIVERSITY, DHAKA, BANGLADESH 
 
Ph.D., UNIVERSITY OF MASSACHUSETTS, AMHERST 
 
Directed by: Professor Csaba Andras Moritz 
 
 
Continuous scaling of CMOS has been the major catalyst in the miniaturization of
integrated circuits (ICs) and has been crucial for global socio-economic progress. However,
continuing the traditional way of scaling to sub-20nm technologies is proving to be very
difficult, as MOSFETs are reaching their fundamental performance limits [1] and the
interconnection bottleneck is dominating IC operational power and performance [2].
Migrating to 3-D, as a way to advance scaling, has been elusive due to inherent
customization and manufacturing requirements in CMOS architecture that are
incompatible with 3-D organization. Partial attempts with die-die [3] and layer-layer [4]
stacking have their own limitations [5]. We propose a new 3-D IC fabric technology,
Skybridge [6], which offers a paradigm shift in technology scaling as well as design. We
co-architect Skybridge’s core aspects, from device to circuit style, connectivity, thermal
management, and manufacturing pathway, in a 3-D fabric-centric manner, building on a
uniform 3-D template. Our extensive bottom-up simulations, accounting for detailed
material system structures, manufacturing process, device, and circuit parasitics, carried
through for several designs including a designed microprocessor, reveal 30-60x density
and 3.5x performance/watt benefits, and a 10x reduction in interconnect lengths vs. scaled
16-nm CMOS [6]. Fabric-level heat extraction features are found to be effective in managing
IC thermal profiles in 3-D. This 3-D integrated fabric proposal overcomes the current
impasse of CMOS in a manner that can be immediately adopted, and offers a unique
solution to continue technology scaling in the 21st century.
  
 
TABLE OF CONTENTS 
ACKNOWLEDGEMENTS
ABSTRACT
LIST OF TABLES
LIST OF FIGURES
CHAPTER

1. INTRODUCTION
2. SKYBRIDGE FABRIC OVERVIEW
2.1 Core Fabric Components
2.1.1 Vertical Silicon Nanowires
2.1.2 Vertical Gate-All-Around Junctionless Nanowire Transistors
2.1.3 Bridges
2.1.4 Coaxial Routing Structures
2.1.5 Heat Extraction Junctions
2.1.6 Heat Dissipating Power Pillars
2.2 Logic Implementation Example in Skybridge Fabric
2.3 Chapter Summary
3. 3-D DEVICE, CIRCUIT STYLE AND MEMORY
3.1 Vertical Gate-All-Around Junctionless Transistor
3.2 Skybridge’s Circuit Style
3.2.1 High Fan-In Support
3.2.2 Noise Mitigation
3.2.3 Mitigation of Performance Impact Due to Long Interconnects
3.3 Skybridge’s Volatile Memory
3.4 Section Summary
4. ARITHMETIC CIRCUIT DESIGN EXAMPLES AND SCALABILITY STUDY
4.1 Circuit Design Examples and Scalability Aspects
4.1.1 Basic Arithmetic Circuits
4.1.2 High Bit-Width Arithmetic Circuits
4.2 Section Summary
5. SKYBRIDGE MICROPROCESSOR DESIGN
5.1 WISP-4 Architecture
5.2 Section Summary
6. FABRIC EVALUATION METHODOLOGIES, 3-D CIRCUIT DESIGN RULES AND GUIDELINES
6.1 Fabric Evaluation Methodologies
6.1.1 Methodology for 3-D Circuit Evaluation
6.1.2 Methodology for 3-D Interconnect Modeling, Wire Length Estimation and Repeater Count Distribution
6.1.3 Methodology for 3-D Thermal Analysis
6.2 3-D Circuit Design Rules and Layout Guidelines
6.2.1 Design Rules
6.2.2 Additional Guidelines
6.3 Section Summary
7. BENCHMARKING RESULTS
7.1 Benchmarking of Arithmetic Circuits
7.2 Benchmarking of Volatile Memory
7.3 Benchmarking of Processor Design in Skybridge
7.4 Connectivity Implications of Skybridge Designs
7.5 Section Summary
8. FINE-GRAINED 3-D THERMAL MANAGEMENT
8.1 Thermal Modeling and Analysis
8.1.1 V-GAA Junctionless Transistor
8.1.2 Thermal Model & Analysis of Skybridge Circuits
8.2 Skybridge’s Heat Extraction Features
8.2.1 Heat Dissipation Power Pillars (HDPPs)
8.2.2 Heat Extraction Junctions (HEJs)
8.3 Section Summary
9. ENVISIONED WAFER-SCALE MANUFACTURING PATHWAY
9.1 Envisioned Wafer-scale Manufacturing Pathway
9.1.1 Starting Wafer
9.1.2 Nanowire Patterning
9.1.3 Contact Formation
9.1.4 VDD/GND/Output Signal Carrying Bridges
9.1.5 Planarization, Interlayer Dielectric Deposition
9.1.6 Gate Stack Deposition
9.1.7 Input Signal Carrying Bridges
9.1.8 Alignment
9.2 Section Summary
10. EXPERIMENTAL PROTOTYPING
10.1 Experimental Validation of Horizontal Junctionless Nanowire Transistor
10.1.1 Process and Device Simulations
10.1.2 Experimental Process Flow
10.1.3 Device Characterization Results
10.2 Experimental Demonstration of Skybridge’s Key Manufacturing Steps
10.2.1 Formation of Vertical Nanowires
10.2.2 Photoresist Planarization, Alignment and Deposition
10.2.3 Interlayer Dielectric Deposition and Planarization
10.2.4 Multi-layer Material Deposition
10.3 Section Summary
BIBLIOGRAPHY
 
  
 
LIST OF TABLES 
Table 6.1. Design rules
Table 7.1. Scalability potential of Skybridge designs
Table 7.2. Memory comparison: Skybridge 8T-NWRAM vs. CMOS 6T-RAM
Table 7.3. Skybridge vs. CMOS comparison for microprocessor
Table 8.1. Properties of materials used in transistor modeling
Table 9.1. Manufacturing requirements and challenges: CMOS vs. Stacked CMOS vs. Skybridge
 
  
 
LIST OF FIGURES 
Fig. 2.1. Core fabric components.
Fig. 2.2. Skybridge Full Adder.
Fig. 3.1. 3-D TCAD simulation results.
Fig. 3.2. Cascaded NAND-NAND and Compound dynamic circuit styles for XOR gate.
Fig. 3.3. Dual rail vs Single rail logic for Skybridge circuits.
Fig. 3.4. Comparative analysis of high fan-in implications.
Fig. 3.5. Analysis of coupling noise.
Fig. 3.6. Volatile memory design in Skybridge.
Fig. 4.1. 4-bit carry look-ahead adder (CLA).
Fig. 4.2. 4-bit Array Multiplier.
Fig. 4.3. High bit-width arithmetic examples: 8-bit and 16-bit CLAs.
Fig. 5.1. Skybridge 4-Bit Wire Streaming Processor (WISP-4).
Fig. 5.2. Block diagram of each pipeline stages.
Fig. 5.3. 2-bit ROM, 2:1 decoder and a latch.
Fig. 6.1. Skybridge Circuit Evaluation methodology.
Fig. 6.2. 3-D Interconnect modeling methodology.
Fig. 6.3. Thermal evaluation methodology.
Fig. 6.4. Design rule illustration.
Fig. 6.5. View of Skybridge fabric.
Fig. 7.1. Comparison of interconnect distribution and estimated repeater count in Skybridge and CMOS, for an integrated circuit consisting of 10 million gates.
Fig. 8.1. Thermal modeling and simulations of V-GAA junctionless transistor.
Fig. 8.2. Heat dissipation paths in circuits.
Fig. 8.3. Thermal modeling of circuits.
Fig. 8.4. Thermal simulation results of Skybridge circuits without heat extraction features.
Fig. 8.5. Incorporation of Heat Dissipating Power Pillar (HDPP).
Fig. 8.6. Impact of HDPPs for Heat Extraction.
Fig. 8.7. Heat Extraction Junctions (HEJs).
Fig. 8.8. Impact of HEJs, Bridges and HDPPs for heat extraction.
Fig. 9.1. Starting wafer and nanowire patterning.
Fig. 9.2. Contact formation.
Fig. 9.3. Formation of VDD/GND/Output signal carrying Bridges.
Fig. 9.4. Planarization and interlayer dielectric deposition.
Fig. 9.5. Gate stack deposition.
Fig. 9.6. Formation of input signal carrying bridges.
Fig. 9.7. Alignment.
Fig. 10.1. Ion Implantation simulations.
Fig. 10.2. Process and Device simulation results.
Fig. 10.3. Experimental process flow.
Fig. 10.4. Experimental results.
Fig. 10.5. Vertical Nanowire Patterning.
Fig. 10.6. Photoresist Planarization.
Fig. 10.7. Demonstration of Material Depositions.
 
 
 
CHAPTER 1
 
INTRODUCTION 
 
 
 Tremendous progress in the miniaturization of integrated circuits (ICs) has been crucial
for socio-economic development in the last century. So far, this miniaturization has been
mainly enabled by the ability to continuously scale CMOS technology. However, as
we reach sub-20nm technology nodes, maintaining the traditional way of scaling is
becoming very challenging. This is mainly because CMOS scaling follows a device-centric
mindset, where shrinking device dimensions is the primary scaling factor, and all
circuits and interconnections are designed as afterthoughts to accommodate scaled
devices. Scaling MOSFET channel lengths below 20nm results in minimal to no
performance benefits regardless of channel optimizations [1]; moreover, device
performance starts to degrade due to secondary scattering effects [1]. Furthermore,
customized sizing, doping and placement requirements of scaled devices for CMOS
circuits result in reduced noise margins [7], connectivity bottlenecks [2] and a huge
escalation of manufacturing complexity [8].
 To continue the historical Moore's law scaling trend for higher density, reduced
power and improved performance, 3-D integration of CMOS has long been sought,
since it could provide a possible pathway without extensively relying on ultra-scaled
transistors. Until now, however, the migration of CMOS to 3-D has been
unattainable. CMOS architecture uses complementary MOSFETs (C-MOSFETs) in inverting
logic, where both pull-up and pull-down transistors share the same input. The
complementary MOSFETs have opposite doping profiles, and each MOSFET contains multiple
doping regions. In order to achieve correct circuit operation, these MOSFETs have to be
carefully sized and precisely doped in a 3-D stack. In terms of connectivity, 3-D
implementation of CMOS circuits would imply that each input signal has to be vertically
routed twice for C-MOSFETs. Mapping such connectivity in 3-D even for logic with a fan-in
of 4, where pull-down transistors are stacked and pull-up transistors are isolated or vice
versa, would yield connectivity bottlenecks; for a large circuit these complexities would
explode. In terms of manufacturing, CMOS in 3-D would imply extreme lithography to create
the various vertical shapes required for C-MOSFETs in 3-D, and each MOSFET would have to be
doped precisely in isolated 3-D regions, which is impractical. In addition, there is no heat
extraction capability inherent to CMOS to prevent hotspot development. To the best of our
knowledge, since the inception of vertical devices in 2000 [9], there has been no
demonstration of 3-D CMOS despite a significant industrial push, which is indicative of
the above-mentioned challenges.
 Partial attempts at 3-D organization with CMOS die-die [3] and layer-layer [4]
stacking have so far failed to become mainstream technologies. Die-die stacking
offers density benefits that scale linearly with the number of dies stacked, but suffers
from several critical challenges, such as connectivity limitations between dies due to
large-area vias or peripheral wiring, lack of heat dissipation and increased assembly
cost [3][5]. Recently, sequential CMOS integration with multiple silicon layers was
proposed [4]. Although this approach alleviates some of the challenges of die-die stacking
with fine-grained vias, new complexities emerge, such as an increased thermal budget to
crystallize the top silicon layers, layer-to-layer device variations, and reliability
concerns due to thermo-mechanical stress. Both of these approaches are additive and
inherit the scaling challenges that are intrinsic to 2-D CMOS.
 In contrast to CMOS and CMOS stacking approaches, we propose a truly fine-grained
3-D nanofabric alternative, called Skybridge [6], which offers a paradigm shift in
technology scaling. Starting from a template of uniformly doped vertical nanowire arrays
functionalized with nanostructures, this fabric is envisioned to address device, circuit,
connectivity, thermal management, and manufacturability aspects in an integrated, 3-D
compatible manner. The integrated approach is essential in achieving this
compatibility. Our extensive theoretical and experimental work demonstrates its
feasibility and potential. If realized, Skybridge can lay the foundation for orders of
magnitude area and power/performance benefits vs. projected, scaled CMOS, and pave
the way for advancing charge-based integrated circuits beyond 2-D CMOS for many
years to come.
 In this dissertation proposal, we show core aspects of the fabric design including (i)
fabric nanostructures, (ii) 3-D vertical integration of devices with limited customization,
(iii) associated 3-D circuit style for arbitrary logic and volatile memory, (iv) 3-D
connectivity schemes, and (v) fabric-level heat management support. Our bottom-up
simulations, accounting for detailed material system structures, device, circuit and
assembly, carried through for several designs including a 4-bit microprocessor, show
more than 30x density and 3.5x performance/watt benefits vs. projected scaled 16-nm
CMOS. Higher bit-widths show increasing benefits: our 16-bit carry look-ahead adder (CLA)
design achieves 60.5x density and 16.5x performance/watt benefits. Our analytical
projections for 10M-transistor designs indicate 10x reductions in interconnect lengths.
Detailed thermal modeling shows that Skybridge’s fabric-level heat extraction features
address 3-D heat management requirements. The envisioned manufacturing pathway for
large-scale assembly follows established foundry processes, and does not add any new
manufacturing constraints. The doping and lithographic precision requirements for fabric
assembly are significantly lower, and apply only at the beginning; all device, contact
and interconnect formation is primarily through material deposition, which is lower cost
and can be controlled to a precision of a few Angstroms. We have experimentally validated
the core device concept [10] and performed several of the steps required in the
manufacturing pathway. Key contributions of this proposal include:
(i) 3-D Nanoscale Fabric Design: Starting from a template of uniformly doped
vertical nanowire arrays, nanostructures are architected to jointly address device,
circuit, connectivity, thermal management and manufacturing challenges while
maintaining 3-D compatibility.
(ii) 3-D Circuit Designs: Various 3-D circuit styles and placement and routing schemes
specific to the Skybridge fabric are devised. Fabric-level optimizations for high fan-in
circuits and noise mitigation are shown. Logic, arithmetic and volatile memory
circuit examples using Skybridge circuit styles are demonstrated.
(iii) Bottom-up Fabric Evaluation Methodology and Detailed Benchmarking: An
extensive bottom-up evaluation methodology is developed that includes detailed material
considerations, 3-D TCAD process and device simulations with experimental data,
and circuit-level simulations using the device models and 3-D parasitics.
Detailed design rules and guidelines for 3-D circuits are derived that
conform to manufacturing requirements. HSPICE circuit-level simulations are
carried out using this methodology, and benchmarking is done against projected
scaled CMOS designs for high bit-width arithmetic circuits and a microprocessor
design. Analytical modeling using parameters from the Skybridge processor design
is used to estimate interconnect length and to predict repeater requirements;
a comparison is made with CMOS.
(iv) Intrinsic Heat Management: Degrading circuit reliability due to lack of heat
dissipation paths is a key concern for nanoscale circuits [19] and is critical in 3-D.
Skybridge introduces fabric-intrinsic heat extraction mechanisms to ensure heat
management in 3-D, making it an integral part of the design mindset and a new dimension
in physical design. Detailed analysis of thermal profiles in Skybridge circuits is
shown through fine-grained modeling and simulations.
(v) Manufacturing Pathway: A manufacturing pathway for large-scale assembly is
proposed that uses established foundry processes.
(vi) Experimental Prototyping: Small-scale experimental prototyping is carried out
to demonstrate key manufacturing steps and to validate the device concept. A
detailed process and device simulation framework is developed to determine
process parameters for the experiments.
  
 The rest of this dissertation proposal is organized as follows: Chapter 2 presents an
overview of the Skybridge fabric and details its core components. Chapter 3 discusses the
3-D device, circuit style and memory elements. Chapters 4 and 5 detail high bit-width
arithmetic circuit examples and a microprocessor design in Skybridge. Chapter 6
introduces fabric evaluation methodologies, and Chapter 7 presents benchmarking
results. Details about thermal management and modeling results are presented in Chapter
8. The envisioned manufacturing pathway for large-scale assembly is discussed in Chapter 9,
and experimental prototyping results are shown in Chapter 10.
 
  
 
CHAPTER 2
SKYBRIDGE FABRIC OVERVIEW 
  
 Skybridge fabric design follows a fabric-centric mindset, assembling structures on a
uniform 3-D template of single-crystal vertical nanowires, with 3-D requirements,
compatibility, and overall efficiency as its central goals. All active components and fabric
features are formed on these nanowires through material depositions. In this fabric, 3-D
device, circuit, connectivity, and thermal management issues are solved by carefully
architecting towards 3-D organization. From an architectural perspective, this is in stark
contrast to the CMOS component-centric mindset, where transistors are the primary
design components and the main technology scaling factor, and circuits, the
interconnection network, power and system-level heat-management schemes are
engineered to accommodate these transistors.
 Beyond the Skybridge template of uniform, single-doped vertical silicon
nanowires, the key functionalized components include vertical Gate-All-Around (V-GAA)
Junctionless transistors, Bridges, Coaxial routing structures, Heat Extraction
Junctions (HEJs) and large-area Heat Dissipating Power Pillars (HDPPs). V-GAA
Junctionless transistors are stacked on the vertical nanowires and are interconnected to
realize 3-D circuits. Local interconnection is primarily through unique routing features:
Bridges and Coaxial routing structures. The heat management features, HEJs and HDPPs,
are used in conjunction with Bridges to extract and dissipate heat from heated regions in
the logic-implementing nanowires. In this chapter, we discuss the core fabric components
and show how they are used in unison to achieve desired functionality.
2.1 Core Fabric Components 
2.1.1 Vertical Silicon Nanowires  
 Regular arrays of single-crystal vertical silicon nanowires are the fundamental building
blocks of the Skybridge fabric. All logic and memory functionalities are achieved in these
nanowires. The nanowires are classified as either (i) logic nanowires, which
accommodate logic gates, with each gate consisting of a stack of vertical
transistors, or (ii) signal nanowires, which carry Input/Output/Global signals themselves
and facilitate routing of other signals for logic gates. All the nanowires are heavily doped;
this is necessary for the V-GAA Junctionless transistors employed and for metal silicidation.
The nanowires that are used for Input/Output/Global signal routing are silicided to reduce
their electrical resistance.
 Fig. 2.1A shows arrays of regular vertical silicon nanowires that are patterned from a
highly doped silicon substrate with discrete SiO2 islands (details about wafer preparation
and nanowire patterning can be found in Chapter 9). The SiO2 islands isolate
signal-carrying nanowires from the bulk silicon substrate.
2.1.2 Vertical Gate-All-Around Junctionless Nanowire Transistors 
 Active devices in this fabric are n-type vertical Gate-All-Around (V-GAA) 
Junctionless nanowire transistors. Junctionless transistors are well-suited for Skybridge’s 
3-D implementation, since they eliminate the requirement of precision doping in 3-D. 
Junctionless transistors have uniform doping across Drain, Channel and Source regions;  
their behavior is modulated by the workfunction difference between the gate and the 
heavily doped channel. In addition, there is no requirement for raised Source/Drain 
structure for Contact formation: contacting the low workfunction metal with heavily n-
doped Source and Drain regions can form a good Ohmic contact. In Chapter 3.1, we 
present more details of V-GAA device characteristics through 3-D TCAD process and 
device simulations. Previously, we have also experimentally validated the Junctionless 
device concept ‎[10]. 
 In Skybridge, the structural simplicity of Junctionless transistors is exploited to easily 
form devices in the vertical direction. As shown in Fig. 2.1B, V-GAA Junctionless transistors 
are formed by just depositing materials: first the Drain contact metal (Ti) layer is 
deposited, followed by spacer (Si3N4), Gate oxide (HfO2), Gate electrode (TiN), 
spacer (Si3N4) and Source metal (Ti) layer deposition. Since depositing materials forms 
the devices, there is no requirement for lithographic or doping precision. A wafer/IC level 
a priori doping is sufficient for devices and contacts (See Chapter 9 for the envisioned 
manufacturing pathway). 
2.1.3 Bridges 
 Bridges are unique to the Skybridge fabric; they enable a high degree of connectivity in 
3-D with minimum area overhead, and also play a key role in heat extraction. Based on 
their roles, Bridges can be classified into two categories: signal carrying Bridges and 
heat extraction Bridges.  
  
  
 
Fig. ‎2.1. Core fabric components. A) Arrays of regular single crystal vertical Si 
nanowires, B) vertical Gate-All-Around Junctionless nanowire transistor, C) nanowire 
linking Bridges, D) Coaxial routing structures, E) sparse large area Heat Dissipating 
Power Pillars, F) Heat Extraction Junctions 
 
 The primary role of signal-carrying Bridges is to form links between two adjacent 
nanowires, and carry Input/Output/Global signals (Fig. ‎2.1C). Depending on the circuit 
implementation, Bridges can be placed at different nanowire heights, and can propagate 
relatively long distances in the layout by hopping nanowires; Coaxial routing structures 
are used in conjunction with Bridges to facilitate this nanowire hopping. These routing 
features provide flexibility, and allow dense 3-D interconnection minimizing interconnect 
congestion.   
  In addition to their usage as signal carrying links, the Bridges also facilitate heat 
extraction. Heat extraction Bridges provide thermally conductive paths for heat transfer 
from the heat source. They are used in conjunction with Heat Extraction Junctions 
(HEJs) and large area Heat Dissipating Power Pillars (HDPPs) to maximize heat 
extraction and dissipation. Subject to the thermal profile of the nanowires, HEJs and 
Bridges can be connected to any heated region in the logic-nanowire. Fig. ‎2.1F shows an 
example of a Bridge connected to a HEJ in the logic gate output region (see Chapter 8 for 
thermal modeling and heat extraction results for 3-D circuits).  
2.1.4 Coaxial Routing Structures 
 Coaxial routing refers to a routing scheme in which one signal is routed coaxially around 
another, inner signal without the two affecting each other. This routing is unique to 
Skybridge, and is enabled by the vertical integration approach. Fig. 2.1D shows an example: 
signal ‘A’ is carried by the vertical nanowire, whereas signal ‘B’ is routed by Bridges; the 
Coaxial routing structure allows signal ‘B’ to hop the nanowire and continue its propagation. 
This coaxial routing is achieved by specially configuring material structures, insulating oxide 
and contact metal. By controlling the thickness of the insulating oxide, and by choosing 
low workfunction metal as Contact Metal, proper signal isolation can be achieved. A 
thick layer of SiO2 as insulating oxide and Titanium (Ti) as Contact metal is well suited 
for this purpose. Workfunction difference between Ti and n-doped Si is such that there is 
no carrier depletion; moreover a thick layer of SiO2 ensures no electron tunneling 
between the Contact metal and silicon nanowire.  
 Using multiple coaxial layers can provide noise isolation and route multiple signals. 
Coupling noise in dense interconnect networks and in dynamic circuits is a well-known 
phenomenon. By configuring the Coaxial routing structure to incorporate a GND signal 
for noise shielding, coupling noise can be mitigated. Fig. 2.1D also illustrates this 
concept; the GND signal in between signals A and B acts as a noise shield, and prevents 
coupling between these two signals. More details on noise mitigation can be found in 
Chapter 3.2.2.  
2.1.5 Heat Extraction Junctions 
 Heat Extraction Junction (HEJ) is an architected feature (Fig. ‎2.1F) used to extract 
heat from a heated region in logic-nanowire without affecting the underlying logic 
operation. An HEJ is a thermally conductive but electrically isolated junction. When 
combined with Bridges, the HEJs provide flexibility to be connected to any heated region 
in the logic-nanowire to prevent hotspot development.   
 These junction properties of an HEJ are achieved by carefully architecting material 
requirements. A sufficiently thick layer (6nm) of Al2O3 is used for this purpose; Al2O3 is a 
good insulator with excellent thermal conduction property (thermal conductivity 39.18 
W m^-1 K^-1 [20]).  
2.1.6 Heat Dissipating Power Pillars 
 Large area Heat Dissipating Power Pillars (HDPPs) serve both the purpose of 
reliable power supply and heat dissipation. Depending on electrical and thermal 
requirements, these pillars are placed intermittently throughout the layout and are 
connected by Bridges. They occupy large area, and are specially designed to have low 
electrical resistance, and maximum heat conduction. As shown in Fig. ‎2.1E, HDPPs 
occupy a 2 x 2 nanowire pitch and would typically be placed on the periphery of circuit 
layouts. The 4 nanowires used in HDPPs are all metal silicided, and the region is filled 
with Tungsten (W) to maximize thermal conductance and minimize electrical resistance.  
 HDPPs that carry GND signals are connected to Bulk silicon at the bottom, whereas 
HDPPs carrying VDD signals are isolated from the bulk with SiO2 islands (Fig. ‎2.1E). 
For heat extraction purposes, Bridges connect to HDPPs (GND) on one end and to HEJs 
on the other; this configuration ensures that the heat extraction Bridges are at reference 
temperature for maximum heat extraction. Details on HDPPs, and thermal analysis can be 
found in Chapter 8.  
2.2 Logic Implementation Example in Skybridge Fabric 
Fig. ‎2.2 shows a logic implementation example in Skybridge fabric; a full adder logic 
is implemented using core fabric components. As shown in Fig. ‎2.2, logic nanowires are 
used to stack V-GAA Junctionless transistors, and signal nanowires are used to facilitate 
input/output signal propagation. All interconnections for the full-adder logic are through 
the fabric’s routing features: Bridges and Coaxial routing structures. The full-adder logic is 
implemented using a compound dynamic circuit style that is specific to the Skybridge fabric 
(more details about the circuit style can be found in Chapter 3). The density benefits of 
Skybridge’s vertical integration are evident from Fig. 2.2; only four transistor-carrying 
nanowires are necessary to implement the full-adder logic, which utilizes 32 transistors. 

Fig. 2.2. Skybridge Full Adder. Full-Adder logic implementation in Skybridge fabric 
utilizing core fabric components. Four logic nanowires are used for this implementation; 
peripheral signal-carrying nanowires are shared with other logic.  

2.3 Chapter Summary 
 In this chapter an overview of the Skybridge fabric was presented; its core 
components were detailed and an example logic implementation utilizing these core 
components was shown. The 3-D integration of the Skybridge fabric is enabled by 
following a template approach with vertical nanowires and by architecting fabric 
components to address device, circuit, connectivity, heat, and manufacturing 
requirements in unison.  
  
3. CHAPTER 3 
3-D DEVICE, CIRCUIT STYLE AND MEMORY 
 
 The manufacturing compatibility and the ability to efficiently implement logic and 
memory functionalities in 3-D without incurring detrimental connectivity overhead are 
key requirements for realizing circuits in 3-D. The CMOS circuit style is not suitable for 
this purpose, since it requires customizations in complementary device doping, sizing and 
placements for functionality; such an implementation in 3-D would result in a significant 
connectivity bottleneck and escalate manufacturing complexities. 
 In Skybridge, 3-D circuit and connectivity requirements are met by synergistically 
exploring device, circuit and architectural aspects without compromising on 
manufacturability. A dynamic circuit style that is amenable to implementations in 3-D is 
chosen for realizing arbitrary logic and volatile memory circuits. This dynamic circuit 
style uses only a single type of uniformly sized Junctionless transistor. It is easily mapped 
onto arrays of regular vertical nanowires without requiring any customizations in terms of 
doping, sizing or incompatible routing; formation of active components is primarily by 
layer-by-layer material depositions. As discussed before, to meet 3-D inter-circuit 
connectivity requirements, Skybridge has intrinsic routing features: signal nanowires, 
Bridges and Coaxial structures.  
 The dynamic circuit style, along with the 3-D integration scheme allows various 
choices to design for either high performance or low power, or a balance of both, at a 
very high density. The tuning knobs for Skybridge circuit implementations are cascading 
choices and compound gates, dual rail vs. single rail implementations, and fan-in. In the 
following, we present more on these choices, and discuss trade-offs with example 
circuits. We also show how coupling noise due to ultra-dense 3-D integration is 
mitigated by optimizing the circuit clocking scheme and architecting fabric features. 
The discussion begins with an analysis of active device components, followed by details 
on logic circuit styles and volatile memory design. 
3.1 Vertical Gate-All-Around Junctionless Transistor 
 N-type vertical Gate-all-around (V-GAA) Junctionless nanowire transistors were 
chosen as active devices in the Skybridge fabric. V-GAA Junctionless transistors do not 
require abrupt doping variations within the device; as a result complexities related to 
precision doping in 3-D and high temperature annealing are eliminated. Stacking of 
transistors for circuit implementation requires only material deposition steps on pre-
patterned vertical nanowires.  
 In V-GAA Junctionless transistors, channel conduction is modulated by the 
workfunction difference between the heavily doped channel and the gate. Due to this 
workfunction difference, the n-type devices used in Skybridge are normally OFF, and the 
channel carriers are depleted (note, p-type Skybridge fabrics would follow similar 
mindset as our n-type version). With the application of gate voltage, carriers start to 
accumulate and the channel conducts. Source/Drain contact formation is done by metal-
Si Ohmic contacts; there is no need for raised Source/Drain structures ‎[21]. We have 
carried out extensive process and device simulations to characterize the V-GAA 
Junctionless devices based on specific material and sizing in Skybridge. We have also 
experimentally demonstrated the Junctionless device concept; a p-type horizontal Tri-gated 
Junctionless nanowire device was fabricated and characterized recently [10] in our group. 
 The 3-D Synopsys Sentaurus Process simulator [11] was used to create the device structure 
emulating the actual process flow. In the process simulation, the substrate was initially 
doped to a concentration of 1e19 dopants/cm^3; the doping step was followed by vertical 
nanowire patterning using 
anisotropic etching, followed by sequential anisotropic material deposition steps to 
complete the V-GAA Junctionless transistor formation. The resulting device structure 
had a 16nm long Si channel, 2nm of HfO2 as gate oxide, 10nm thick TiN as gate electrode, 
10nm thick and 5nm long Si3N4 as spacer material, and 10nm thick, 10nm long Ti as 
contact material (Chapter 2, Fig. 2.1B). 3-D Sentaurus Device simulations [12] were 
performed on this device to characterize its behavior, while taking nanoscale effects into 
account. Silicon bandstructure was calculated using the OldSlotboom model [12]; charge 
transport was modeled using the hydrodynamic transport model [12]; and quantum confinement 
effects were taken into account using the density gradient quantum correction model [12]. 
 
Fig. ‎3.1. 3-D TCAD simulation results. Id-Vgs 
characteristics in log (left) and linear (right) 
scale for V-GAA Junctionless transistor. 16nm 
channel length, width and thickness; doping: As 
dopant, 1e19 dopants/cm3; 2nm HfO2 gate 
dielectric; 10nm thick TiN gate electrode. 
Simulation shows 27µA Ion, 0.1nA Ioff, SS 
78mV/dec. 
Electron mobility was modeled taking into account effects due to high doping, surface 
scattering, and high-k scattering. The simulated device characteristics are shown in 
Fig. ‎3.1. This device had an On current of 27µA, Off current 0.1nA; subthreshold slope 
was found to be 78mV/dec, and threshold voltage (Vth) was 0.35V. These simulated 
device characteristics were used to generate a behavioral device model for HSPICE 
circuit simulations. 
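
As a quick check on the figures reported above, the following sketch derives the on/off current ratio and the gate-voltage swing that the subthreshold slope would imply over that current range. The variable names are illustrative, and the last step assumes, as an idealization, that the current follows the subthreshold slope over the whole range; it is plain arithmetic on the reported values, not part of the TCAD flow.

```python
import math

# Values reported for the simulated V-GAA Junctionless transistor (Fig. 3.1).
I_ON = 27e-6     # On current, in amperes
I_OFF = 0.1e-9   # Off current, in amperes
SS = 78e-3       # subthreshold slope, in volts per decade

# Ratio between On and Off currents.
on_off_ratio = I_ON / I_OFF

# Number of decades of current separating the Off and On states.
decades = math.log10(on_off_ratio)

# Idealized gate swing: decades of current times volts per decade.
# Assumption: the slope is constant over the full range, which a real
# device only approximates below threshold.
ideal_swing = decades * SS

print(f"On/Off ratio: {on_off_ratio:.2e}")
print(f"Decades of current: {decades:.2f}")
print(f"Idealized gate swing: {ideal_swing:.2f} V")
```

The ratio of roughly 2.7e5 spans about 5.4 decades of current, which is what makes a floating precharged node in a dynamic circuit hold its value between clock phases.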
3.2 Skybridge’s Circuit Style 
 As outlined before, Skybridge circuits follow a dynamic circuit style that is 
compatible with 3-D integration requirements. The circuit style allows various design 
choices including cascaded NAND-NAND or single stage AND-of-NAND compound 
implementations for logic gates with dual rail or single rail inputs; these can be also 
combined in a hybrid logic style with high fan-in support. These design choices are 
generic and can realize any arbitrary logic; moreover, they provide flexibility to optimize 
Skybridge circuit designs for power or performance, or a balance of both at a very high 
density. In the following discussions we analyze each circuit style supported, and discuss 
their trade-offs. Other circuit implementations may be possible. 

Fig. 3.2.  Cascaded NAND-NAND and Compound dynamic circuit styles for XOR 
gate. A) Cascaded circuit style with two logic stages; each stage is controlled by 
separate PRE and EVA clock signals; B) HSPICE simulated waveforms for the XOR 
in (A); C) compound dynamic circuit style; logic computation in one stage; two NAND 
gate outputs are combined in AND-of-NAND logic; D) HSPICE validations; E) 
physical layout of cascaded XOR in (A), occupying 3 logic nanowires and 6 signal 
nanowires; F) physical layout of XOR gate in (C); only one logic nanowire is occupied 
for circuit implementation; 4 peripheral nanowires are used for signal routing, which are 
shared with other circuits.   

 Fig. 3.2 illustrates the cascaded NAND-NAND and compound dynamic logic gate 
implementations. An example of cascaded dynamic logic is shown through the XOR gate 
design in Fig. 3.2A; the corresponding HSPICE simulated behavior and physical layout are 
shown in Fig. 3.2B and Fig. 3.2E. In the cascaded dynamic logic style, complex logic is 
implemented in two stages using NAND-NAND logic. The output of one NAND stage is 
propagated to another NAND stage to complete logic behavior; both stages are micro 
pipelined for seamless signal propagation. The dynamic NAND gates in Fig. ‎3.2A 
operate with only n-type uniform V-GAA Junctionless transistors; dynamic circuit 
behavior is controlled by precharge (PRE1, PRE2), evaluate (EVA1, EVA2) and hold 
(HOLD1, HOLD2) clock phases. During precharge, the output node is pulled to VDD, 
and during evaluate period it is either pulled to GND or remains at VDD depending on 
the input pattern. During the hold phase, the output of the current stage is propagated to the next 
stage. In order to have full voltage swing at the output node, the pull-up transistor’s gate 
voltage is regulated to be higher than VDD. Cascaded dynamic logic has the 
potential to achieve high performance, since the load capacitance at output is small for 
each NAND stage. More details on other types of cascaded dynamic circuits and their 
analysis can be found in our previous work ‎[22]‎[23]‎[26].   
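
The precharge/evaluate/hold behavior described above can be illustrated with a small behavioral model. The sketch below is a logic-level abstraction only (no timing, voltages or clock overlap), and the particular NAND decomposition of XOR is an assumption made for illustration; the function names are hypothetical and are not taken from the HSPICE setup.

```python
from itertools import product

def dynamic_nand(inputs):
    """Model one precharge/evaluate cycle of an n-type dynamic NAND gate."""
    out = 1              # precharge phase: output node is pulled to VDD
    if all(inputs):      # evaluate phase: the series stack conducts only if every input is 1
        out = 0          # output node is pulled to GND
    return out           # hold phase: this value is presented to the next stage

def cascaded_xor(a, b):
    """XOR built from two cascaded NAND stages (three NAND gates in total)."""
    na, nb = 1 - a, 1 - b            # complementary inputs, assumed to be available
    n1 = dynamic_nand([a, nb])       # stage 1, evaluated under PRE1/EVA1/HOLD1
    n2 = dynamic_nand([na, b])       # stage 1, evaluated under PRE1/EVA1/HOLD1
    return dynamic_nand([n1, n2])    # stage 2, evaluated under PRE2/EVA2/HOLD2

# Exhaustive check over both inputs.
truth_table = {(a, b): cascaded_xor(a, b) for a, b in product((0, 1), repeat=2)}
print(truth_table)
```

In this decomposition the XOR needs three NAND gates across two clocked stages, which is why the cascaded style carries extra signal and clock routing compared with a single-stage gate.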
 Compound dynamic logic is another variation of the dynamic logic style that is unique 
to the Skybridge fabric. The compound circuit style is designed such that maximum 
density benefits can be achieved in 3-D implementations. This also alleviates fine-grained 
clocking requirements. In a single stage, complex logic gates such as XOR, AND-of-
NAND gates, etc. can be realized. An example of compound dynamic logic is shown in 
Fig. ‎3.2C, Fig. ‎3.2D and Fig. ‎3.2F. As shown in Fig. ‎3.2C, circuit operation is controlled 
by precharge (PRE), evaluate (EVA) control signals, and there is no need for cascading of 
stages; outputs of NAND gates are shorted to achieve AND-of-NANDs logic behavior. 
Fig. ‎3.2D shows HSPICE simulated waveforms that validate the compound logic 
behavior. Like cascaded NAND-NAND designs, this compound logic style is also 
generic for any logic function. 
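
The AND-of-NANDs behavior obtained by shorting outputs can be sketched at the logic level as follows. This is a minimal model under stated assumptions: the shared node is precharged once and any conducting pull-down stack discharges it. The choice of input pairs for the XOR is an illustrative assumption, and the function names are hypothetical.

```python
from itertools import product

def compound_gate(stacks):
    """Single-stage compound gate: several NAND pull-down stacks share one output node."""
    out = 1                  # precharge phase: the shared output node is pulled to VDD
    for stack in stacks:     # evaluate phase: check every pull-down stack
        if all(stack):       # a stack conducts only if all of its inputs are 1
            out = 0          # any conducting stack discharges the shared node
    return out               # result is the AND of the individual NAND outputs

def compound_xor(a, b):
    """XOR as NAND(a, b) AND NAND(a', b'), computed in one clocked stage."""
    na, nb = 1 - a, 1 - b    # complementary inputs, assumed to be available
    return compound_gate([[a, b], [na, nb]])

compound_table = {(a, b): compound_xor(a, b) for a, b in product((0, 1), repeat=2)}
print(compound_table)
```

Because both stacks hang off the same precharged node, one PRE/EVA pair is enough, at the cost of a larger capacitance on the shared output.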
 As evident from the physical layouts in Fig. 3.2E and Fig. 3.2F, Skybridge’s 3-D 
implementation achieves tremendous density benefits. Cascaded NAND-NAND logic 
based XOR implementations require three logic nanowires (Fig. ‎3.2E), whereas a 
compound XOR implementation uses only one logic nanowire (Fig. ‎3.2F); the signal 
nanowires are shared with other logic gates. The compound dynamic style achieves 
maximum density by eliminating the signal and clock routing overheads of cascaded logic, 
but is slightly slower than cascaded logic since the load capacitance 
is higher due to output sharing. Our Skybridge designs for arithmetic circuits and 
microprocessor (Chapters 4, 5) typically follow a hybrid logic style, in which the 
benefits of both cascaded NAND-NAND and AND-of-NAND compound logic are combined 
for maximum density and performance.  
 The above circuit styles support both dual-rail and single-rail implementations, and 
thus allow flexible design choices for logic. In dual-rail logic, all true and complementary 
signals are used as inputs, and the circuit is configured to generate both true and 
complementary outputs at the same stage (Fig. 3.3A, Fig. 3.3B). In contrast, 
single-rail logic uses only the combination of inputs required to generate the true or 
complementary output; a separate inverter stage is used to generate the opposite signal. 
Fig. 3.3C illustrates a single-rail implementation, and Fig. 3.3D shows HSPICE simulation results. 
The clocking schemes are different for single-rail and dual-rail circuit styles. Single-rail 
logic uses two overlapping clock sequences, PRE1, EVA1, HOLD1 and PRE2, EVA2, 
HOLD2 (Fig. 3.3D). In dual-rail logic, only one sequence of clock phases is used: PRE, 
EVA, HOLD (Fig. 3.3B), since all operations are performed in one stage. Single-rail logic 
is suitable to be used in Cascaded NAND-NAND circuit style, whereas dual-rail logic is 
more suitable for Compound AND-of-NAND circuit style.  
 Both dual-rail and single-rail designs have associated trade-offs: dual-rail logic is used 
to optimize circuit performance, whereas single-rail logic results in lower 
power and higher density. In addition to the aforementioned choices, Skybridge’s unique 
dynamic circuit styles and fabric integration provide opportunities for more compact 
circuit implementations with high fan-in to maximize density. In the following we 
elaborate on fan-in choices for Skybridge circuits. 

Fig. 3.3. Dual rail vs Single rail logic for Skybridge circuits. A) Example of dual 
rail logic using 2 input NAND gate; both true and complementary signals are 
generated at the same stage; B) Simulated waveform of the NAND gate in (A); C) 
Single rail implementation of the same 2 input NAND gate using two clock stages; 
complementary output is generated in the second stage NAND gate; D) HSPICE 
validations of the single rail circuit in (C).  

3.2.1 High Fan-In Support 
 High fan-in logic is a well-known driver for compact circuit designs: such circuits have 
fewer transistors and interconnects, and are therefore advantageous for improving both 
density and power consumption. However, high fan-in circuits are not widely used due 
to their detrimental impact on performance compared to low fan-in cascaded designs. The 
performance degradation is particularly severe in CMOS, where the circuit style requires 
complementary devices, and the devices have to be differently sized, which adds to load 
capacitance, and thus lowers the performance. Generally, CMOS circuits are limited to 
only 4 or 2 fan-in based designs. In contrast, Skybridge’s circuit style, with only single 
type uniform transistors and 3-D layout implementation, allows high fan-in logic without 
the corresponding typical performance degradation. 

Fig. 3.4. Comparative analysis of high fan-in implications. A) Skybridge NAND gate 
with ‘m’ number of fan-ins; B) CMOS NAND gate with ‘m’ number of fan-ins; C) fan-in 
sensitivity: CMOS delay increases sharply with increasing fan-in, Skybridge’s delay 
increases almost linearly with high fan-in; the difference is primarily due to the higher 
load capacitance of CMOS circuit; CMOS uses complementary devices, higher fan-in 
results in higher parasitic capacitances.  

 To evaluate the feasibility of high fan-in logic in Skybridge, we have carried out fan-
in sensitivity analysis using a NAND gate as an example circuit. For Skybridge HSPICE 
simulations, TCAD generated V-GAA Junctionless device characteristics (Fig. ‎3.1) were 
used. Equivalent CMOS designs were simulated for comparison using 16nm tri-gated 
high-performance PTM device models [25]. The outputs of both Skybridge and CMOS 
NAND gates were connected to load capacitances that are equivalent to fan-out to 4 
inverters in respective designs. The worst-case delay was captured during the falling edge 
of the output node.  
 As shown in Fig. 3.4A and Fig. 3.4B, Skybridge’s NAND gate uses all n-type 
transistors, whereas the CMOS NAND gate uses both n- and p-type transistors. The total 
capacitance at the output node of Skybridge’s NAND gate is from two adjacent 
transistors and from 4 inverter fan-out load capacitance. Inverter implementation in 
Skybridge is equivalent to one fan-in NAND gate with three transistors; one transistor is 
gated with input signal, and other two are gated with control clock signals. As a result, 
the load capacitance at the output node in Fig. ‎3.4A is from 4 n-type transistor gate 
capacitances and interconnects. On the other hand, the total capacitance at the output 
node of CMOS NAND gate in Fig. ‎3.4B is from adjacent transistors, which increases 
with fan-in, and from 4 inverter fan-out load capacitance. In a CMOS inverter, the same 
input drives both n- and p-type devices; in addition, p-type devices are sized to be 
twice the size of n-type devices. Hence the load capacitance in CMOS is from 4 n-type and 
4 double-sized p-type transistors, and interconnects.  
 The impact of higher capacitance at output node is evident from results in Fig. ‎3.4C. 
These results are normalized to one fan-in delay for respective designs. As shown in 
Fig. ‎3.4C, CMOS delay increases rapidly with higher fan-in, as more transistor parasitic 
capacitances are added to the total capacitance. In contrast, Skybridge’s delay 
increases almost linearly and the impact is less prominent, since the load capacitance 
remains the same; the linear increase in delay is mainly due to the increased resistance of 
additional transistors in the discharge path. By optimizing V-GAA Junctionless device 
characteristics, this delay can be improved further.  
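
The qualitative trend above can be reproduced with a first-order RC sketch. This is not the HSPICE model used for Fig. 3.4C: it assumes unit resistance per transistor, one evaluate transistor in series with the Skybridge input stack, and a hypothetical per-input parasitic term `c_par` for CMOS. The fan-out load counts (4 gate units for Skybridge, 4 plus 4 double-sized units for CMOS) follow the text; interconnect capacitance is ignored.

```python
def skybridge_delay(m, r=1.0, c_gate=1.0):
    """First-order delay of an m-input Skybridge dynamic NAND gate."""
    c_load = 4 * c_gate                   # four inverter inputs, one n-type gate each
    return (m + 1) * r * c_load           # m inputs plus one evaluate transistor in series

def cmos_delay(m, r=1.0, c_gate=1.0, c_par=2.0):
    """First-order delay of an m-input CMOS NAND gate (c_par is a hypothetical value)."""
    c_load = 4 * c_gate + 4 * 2 * c_gate  # 4 n-type plus 4 double-sized p-type gates
    return m * r * (c_load + m * c_par)   # each extra input adds parasitics at the output

def normalized(delay_fn, fan_ins):
    """Normalize delays to the one fan-in case, as done for Fig. 3.4C."""
    base = delay_fn(1)
    return [delay_fn(m) / base for m in fan_ins]

fan_ins = [1, 2, 4, 8]
sky = normalized(skybridge_delay, fan_ins)
cmos = normalized(cmos_delay, fan_ins)
for m, s, c in zip(fan_ins, sky, cmos):
    print(f"fan-in {m}: Skybridge {s:.2f}x, CMOS {c:.2f}x")
```

With a fixed load, the Skybridge term grows only through the series resistance, so it is linear in fan-in; the CMOS term multiplies a growing resistance by a growing capacitance, so it grows faster than linearly.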
  In Chapter 4, we show high fan-in circuit implementations for large-scale designs. 
The benchmarking results indicate significant benefits can be obtained for Skybridge 
designs compared to CMOS. 
3.2.2 Noise Mitigation 
 While the dynamic circuit style provides opportunities for efficient circuit 
implementations in 3-D, it is not immune to coupling noise. In dynamic circuits, the 
output is not driven during the hold phase; hence it is susceptible to coupling noise due to 
‘1’ to ‘0’ and ‘0’ to ‘1’ transitions in cascaded logic [26]. In dense 3-D integration, 
coupling noise from interconnects can also affect the circuit functionality.  

Fig. 3.5. Analysis of coupling noise. A) Worst case noise scenario; a victim signal is 
carried through outer metal shell in the middle nanowire, signals in inner nanowire, and 
in adjacent metal layers are transitioning from ‘1’ to ‘0’ while the victim signal is 
floating at ‘1’; B) layout with GND shielding layer to protect against coupling noise; C) 
the circuit depicting worst case scenario; D) the circuit schematic when GND shielding 
layers are incorporated; E) when one aggressor is active (Agg 1 switching); F) when 
two aggressors are active (Agg 1 and 2 switching); G) when three aggressors are active 
(Agg 1, 2 and 3 switching) 

 In order to mitigate coupling noise effects, Skybridge has intrinsic architected 
features that provide noise shielding. The coaxial routing capability (Chapter 2.1.4), which 
is normally used for signal routing, is specially configured to incorporate a noise-shielding 
layer. A GND signal is routed in between inner nanowire and outer metal2 
shell. The key concept of noise shielding using GND signal is to increase the overall 
capacitance at the floating nodes, thereby reducing the impact of coupling capacitance.  
This approach ensures coupling noise mitigation during logic cascading, and signal 
propagation in dense interconnect network. In addition to the noise shielding layer, the 
Skybridge circuit style uses a clocking control scheme that is known to provide noise 
resilience ‎[26]. 
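
The shielding principle stated above, that a larger total capacitance at the floating node reduces the impact of coupling capacitance, corresponds to a simple capacitive divider. The sketch below uses that divider with hypothetical, normalized capacitance values; it is not derived from the extracted layout parasitics, and it keeps the coupling capacitance fixed even though a shield between the conductors would in practice also reduce it.

```python
def victim_droop(v_swing, c_couple, c_ground):
    """Voltage disturbance on a floating node caused by one switching aggressor.

    Capacitive divider: the aggressor swing is split between the coupling
    capacitance and the capacitance from the victim to fixed potentials.
    """
    return v_swing * c_couple / (c_couple + c_ground)

V_SWING = 0.8    # aggressor swing in volts (illustrative)
C_COUPLE = 1.0   # victim-to-aggressor coupling capacitance (hypothetical units)
C_GROUND = 2.0   # victim capacitance to fixed potentials without shield (hypothetical)
C_SHIELD = 6.0   # extra capacitance to GND added by the shielding layer (hypothetical)

unshielded = victim_droop(V_SWING, C_COUPLE, C_GROUND)
shielded = victim_droop(V_SWING, C_COUPLE, C_GROUND + C_SHIELD)

print(f"droop without shield: {unshielded:.3f} V")
print(f"droop with shield:    {shielded:.3f} V")
```

Any capacitance added between the floating node and GND enlarges the denominator, so the same aggressor transition produces a smaller disturbance on the victim.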
 To evaluate the effectiveness of Skybridge’s noise shielding approach, we have 
performed detailed simulations accounting for worst-case scenarios. The scenarios 
considered are depicted in Fig. 3.5A. Worst-case scenario 1 considers the case when a 
signal carried through the outer metal layer is floating, and is affected by a driven signal that 
is routed through the inner nanowire; the nanowire signal in this case is aggressor 1. 
Worst-case scenarios 2 and 3 consider coupling from adjacent metal2 layers that carry 
driven signals; they are denoted as aggressor 2 and aggressor 3 (Fig. 3.5A). In all 
scenarios the victim signal is input to another NAND gate with a single input; the 
switching activity of this NAND gate degrades the floating node’s stability even further. The 
corresponding circuit that emulates these worst-case scenarios is shown in Fig. 3.5C. The 
modified circuit schematic after incorporation of GND shielding layer is shown in 
Fig. ‎3.5D, and its physical representation is shown in Fig. ‎3.5B. Simulation results are 
shown in Fig. 3.5E-G. Skybridge simulations use 3-D TCAD simulated V-GAA 
Junctionless device characteristics for HSPICE simulations, and take into account 
interconnect parasitics from the actual 3-D layout. Capacitance calculations for Coaxial 
routing structures use the methodology in ‎[27] and assume average routing lengths from a 
Skybridge microprocessor design (Chapter 5). 
 In all scenarios, the victim signal (carried through metal2) is kept floating at ‘1’, and 
the aggressor signals (carried through inner nanowire, and adjacent metal2 lines) are 
transitioning from ‘1’ to ‘0’. For clarity, only the results during transitions are shown in 
Fig. 3.5E-G. As shown in Fig. 3.5E, for scenario 1, due to interconnect coupling from 
aggressor 1, the floating voltage drops from 0.8V to 0.58V; during the evaluation phase 
of the cascaded stage, it drops further to 0.39V. The situation worsens for scenarios 2 and 3, 
and in the worst case the voltage drops to 0.39V. The performance degradation due to 
low input voltage is evident; in the worst case, performance degrades by 416% (Fig. 3.5G). The 
GND shielding approach increases the noise margin significantly with little to no 
degradation in performance. For scenario 1, the GND shielding recovers the noise margin 
completely and there is no performance degradation; for scenarios 2 and 3 the noise 
impact is minimal: in the worst case the voltage drops by 0.08V, and the performance 
degradation from nominal is 12%. 
3.2.3 Mitigation of Performance Impact Due to Long Interconnects 
 Long interconnect RC delays are critical factors that impact overall performance of 
nanoscale integrated circuits. Typically in CMOS, this issue is addressed by custom 
sizing of transistors to increase signal drive strength. In Skybridge, the 3-D circuit style 
and the fabric integration scheme provide several options to minimize this performance 
impact without any device customization. One such option is the insertion of dynamic 
buffers; dynamic buffers allow partitioning of a long interconnect into small segments, 
and allow seamless signal propagation in a pipelined design, without impacting the 
overall throughput.  Dynamic buffers are one fan-in NAND gates that are gated by 
complementary inputs. All Skybridge circuit designs are such that both true and 
complementary values are present in the output. These dynamic buffers were used 
extensively in our arithmetic circuits and microprocessor designs (Chapters 4, 5). Other 
choices for performance improvement are through fan-in optimization and logic 
replication. Both these choices can be used to boost drive current, and as a result to 
reduce long interconnect delay. By reducing the fan-in of the driver circuit, the total 
resistance at the output node can be reduced, which in turn can increase the drive current 
at the output. Similarly, by replicating the driver logic in neighboring nanowires and by 
shorting the outputs, the drive current in a long interconnect can be increased to reduce 
delay. In addition to these choices, CMOS-like repeaters can be employed to reduce the 
delay for very long interconnects that are used for semi-global and global signals. These 
repeaters can be placed in dedicated locations of the die similar to other mixed-signal 
analog power and clock generation circuits. Repeater requirements for large-scale 
Skybridge designs are up to 100x lower than in CMOS (see Chapter 7).  
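
The benefit of partitioning a long interconnect with dynamic buffers can be illustrated with the Elmore delay of a distributed RC line, which grows with the square of its length. The sketch below is a minimal model with hypothetical, normalized parameter values; the buffer delay `t_buf` and the per-length resistance and capacitance are assumptions, not values from the Skybridge layouts.

```python
def wire_delay(r_per_um, c_per_um, length_um):
    """Elmore delay of a distributed RC line: r * c * L^2 / 2."""
    return 0.5 * r_per_um * c_per_um * length_um ** 2

def buffered(r_per_um, c_per_um, length_um, segments, t_buf):
    """Return (per-stage delay, end-to-end latency) for a line split by buffers."""
    stage = wire_delay(r_per_um, c_per_um, length_um / segments) + t_buf
    return stage, segments * stage

# Hypothetical normalized values, chosen only to show the trend.
R, C, LENGTH, T_BUF = 1.0, 1.0, 100.0, 10.0

unbuffered = wire_delay(R, C, LENGTH)
stage, latency = buffered(R, C, LENGTH, segments=4, t_buf=T_BUF)

print(f"unbuffered wire delay: {unbuffered}")
print(f"per-stage delay with 4 segments: {stage}")
print(f"end-to-end latency with 4 segments: {latency}")
```

In a pipelined design the clock period is set by the per-stage delay rather than the end-to-end latency, which is why segmenting the wire need not reduce throughput even though it adds buffer stages.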
3.3  Skybridge’s Volatile Memory 
 In addition to logic, ability to incorporate high performance volatile memory is a key 
requirement in integrated circuits.  In Skybridge, the volatile memory implementation 
conforms to the 3-D integration requirements, and follows the aforementioned dynamic 
circuit styles. In this memory, two cross-coupled dynamic NAND gates are used to store 
true and complementary values, and separate read logic is employed to perform reads, 
similar to our previous design for 2-D fabrics [15]. The 8T-NWRAM schematic and 
HSPICE validations are shown in Fig. ‎3.6A-B.  
 As shown in Fig. 3.6A-B, the memory operation is synchronized with the input 
clocking scheme and the control signals. In order to write ‘1’ or ‘0’, the clock signals 
(xpre, xeva, ypre, yeva) are selectively turned ON. For example, to write ‘1’ in node out, 
xpre and xeva signals are turned ON, and this is followed by ypre, yeva signals. Once the 
node out is pulled to ‘1’, the complementary node gets pulled to ‘0’ during the ypre, yeva 
clock phases. A gated read logic is employed for memory read, and the operation is 
synchronized with the read signal. During the read operation, bl is initially precharged, 
and is subsequently discharged or remains at precharged voltage depending on the nout 
state, when the read signal is ON.  
 
Fig. 3.6. Volatile memory design in Skybridge. A) 8T-NWRAM circuit schematic; 
volatile memory implementation with two cross-coupled dynamic NAND gates, a 
separate read logic for read operation; B) HSPICE results showing write and read 
operations; C) 8T-NWRAM’s physical layout. 

 A key feature of this NWRAM is that it is not dependent on precise sizing of 
complementary transistors for memory operations as it is in the CMOS SRAM; as a 
result, device sizing-related noise concerns prevalent at nanoscale are mitigated. 
Furthermore, the read logic is separated from the write logic mitigating bit-flipping 
concerns during read operations. In addition, during periods of inactivity, all control 
signals are switched OFF, which reduces leakage power. At certain intervals, the clock 
signals are switched ON again to restore the stored values but there is no need for read-
back and write for this periodic restoration.  
 The layout of this volatile memory is shown in Fig. ‎3.6C; noticeably, all 8 transistors 
required for memory operation are stacked in only one nanowire, whereas two adjacent 
nanowires are used for signal propagation, which can be shared by other memory cells. 
The ultra-dense implementation with reduced interconnections has huge implications on 
reducing active power and improving performance. Moreover, the Coaxial routing 
structures used for intra-cell routing provide additional storage capacitance, which is 
beneficial for prolonging bit storage without restoration, and thus help in reducing 
leakage power consumption. Benchmarking results are shown in Chapter 7. 
3.4 Section Summary 
 In this section Skybridge's device, circuit style and volatile memory elements were 
detailed. The Vertical Gate-All-Around Junctionless transistor geometry and TCAD 
simulated device characteristics were shown. We presented the 3-D compatible circuit 
style, and showed different approaches to design for high performance and low power at 
ultra-high density. We also introduced Skybridge’s volatile memory approach, which is 
equivalent to the CMOS SRAM.   

4. CHAPTER 4 
ARITHMETIC CIRCUIT DESIGN EXAMPLES AND SCALABILITY STUDY  
 
 In this chapter we detail arithmetic circuit implementations using carry look-ahead 
adders and array multiplier circuits. These arithmetic circuits combine compound and 
cascaded dynamic logic styles in dual rail logic for optimum performance at low power 
and ultra-high density. The density benefits are maximized by using high fan-in logic.  
Connectivity requirements are met by utilizing the fabric’s routing features. The effect of 
coupling noise due to dynamic circuit style and dense interconnections is mitigated 
through the noise shielding approach introduced in Chapter 3.2.2.  
 In order to study the scalability aspects of Skybridge designs, we have implemented 
arithmetic circuits at 4, 8 and 16-bit-widths, and benchmarked against CMOS designs at 
16nm. In the following we present various circuit design examples and show our 
scalability study. 
4.1 Circuit Design Examples and Scalability Aspects  
4.1.1 Basic Arithmetic Circuits 
 Adders and multipliers are core arithmetic computing blocks in ALUs, and are often 
extended to implement other arithmetic operations such as complement, subtraction and 
division. Some of the circuits presented here are also used for the Skybridge 
microprocessor design (Chapter 5). 
4.1.1.1  Carry Look-Ahead Adder 
 The carry look-ahead adder (CLA) is a well-known parallel adder for fast computation. 
A block diagram of a 4-bit CLA is shown in Fig. 4.1A; it consists of propagate-and-generate, 
carry, buffer and summation blocks. The propagate-and-generate block is used to produce 
intermediate signals $P_i$ and $G_i$ (where $i$ = 0 to 3), which are used for calculating 
Sum and Carry respectively; the logic expressions used are $P_i = A_i \oplus B_i$ and 
$G_i = A_i B_i$. The carry block is used to compute intermediate carry signals and the 
final carry output. The logic expression for carry generation is 
$C_i = G_{i-1} + P_{i-1} C_{i-1}$, where $i$ is from 1 to 4. The buffer block is used to 
buffer a signal and maintain signal integrity. The sum block generates the final sum output 
using the intermediate $P_i$ and $C_i$ signals; the logic expression is 
$S_i = A_i \oplus B_i \oplus C_i = P_i \oplus C_i$.  
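The propagate, generate, carry and sum expressions above can be checked with a short bit-level model. This is a minimal functional sketch, not the dual-rail dynamic circuit: the function name `cla_add` and its arguments are illustrative, and the carries are evaluated with the recurrence in a loop, whereas the hardware carry block computes them in parallel.

```python
def cla_add(a, b, c0=0, width=4):
    """Add two unsigned integers using the P/G/C/S expressions of a CLA.

    Returns (sum_bits_as_int, carry_out).
    """
    a_bits = [(a >> i) & 1 for i in range(width)]
    b_bits = [(b >> i) & 1 for i in range(width)]

    # Propagate-and-generate block: P_i = A_i xor B_i, G_i = A_i and B_i
    p = [x ^ y for x, y in zip(a_bits, b_bits)]
    g = [x & y for x, y in zip(a_bits, b_bits)]

    # Carry block: C_i = G_{i-1} + P_{i-1} C_{i-1}, for i = 1..width
    c = [c0]
    for i in range(1, width + 1):
        c.append(g[i - 1] | (p[i - 1] & c[i - 1]))

    # Sum block: S_i = P_i xor C_i
    s = [p[i] ^ c[i] for i in range(width)]

    total = sum(bit << i for i, bit in enumerate(s))
    return total, c[width]


print(cla_add(0b0101, 0b0110))
```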
 The Skybridge specific implementations of these logic blocks use both compound and 
cascaded dual-rail dynamic logic styles (see Section 2 for details). The circuit schematics 
are shown in Fig. ‎4.1B-D. As shown in Fig. ‎4.1B, and 4.1D, the XOR logic for 
computing Pi and Si, and their complementary signals, is done using compound dynamic 
gates. The Ci and ~Ci computations also use dynamic compound gates in AND-of-
NANDs logic, as shown in Fig. ‎4.1C. The generated intermediate signals are propagated 
to the next stage of compound gates through cascading.  HSPICE simulation results 
validating the CLA circuit behavior are shown in Fig. ‎4.1E.  
 The physical implementation of a CLA is shown in Fig. 4.1F. The circuit mapping 
into Skybridge follows the guidelines summarized in Chapter 6. 


 
Fig. 4.1. 4-bit carry look-ahead adder (CLA). A) Overall block diagram of 4-bit 
CLA; it contains propagate and generate (PG), carry, buffer and sum blocks; B) 
circuit schematic of PG block; both true and complementary values are generated in 
the compound dual rail logic; C) schematic for carry block using the same circuit 
style, inputs from PG block are used; D) schematic of sum block, inputs from both PG 
and carry blocks are used; E) HSPICE simulated waveforms validating the expected 
adder behavior; F) physical layout of a CLA in the Skybridge fabric. 
 
4.1.1.2 Array Multiplier 
 Array based multipliers are widely used for fast parallel multiplications. The core 
concept is illustrated in Fig. ‎4.2A: multiplication is achieved by a series of additions. The 
hardware implementation of the algorithm uses adder units for these iterative additions. 
The block diagram for the multiplier is shown in Fig. ‎4.2B. As illustrated, the 
multiplication is performed with the help of AND logic, half adder and full adders. AND 
operation is performed simply by using a compound gate with two inverted inputs (to 
perform AND-of-NANDs).  The half adder and full adder implementations follow ripple 
carry logic, and are implemented using XOR and NAND gates. These logic units use 
compound circuit implementations similar to those in the CLA. The result of each 
addition is cascaded to other adder units to generate the total multiplication output. 
HSPICE simulated waveforms for this multiplier circuit are shown in Fig. ‎4.2C; the two 
operands illustrated for the 4-bit multiplication are 0011 and 0111, yielding 00010101. 
The physical layout of this multiplier can be seen in Fig. ‎4.2D.   
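The series-of-additions idea can be sketched functionally as follows. This is an illustrative model only: partial products come from AND operations and are accumulated row by row with full adders, which reproduces the arithmetic but not the exact 9-stage arrangement of half and full adders in Fig. 4.2B. The function names are hypothetical.

```python
def full_adder(a, b, cin):
    """One-bit full adder built from XOR/AND/OR operations."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def array_multiply(x, y, width=4):
    """Multiply two unsigned integers by accumulating AND partial products."""
    x_bits = [(x >> i) & 1 for i in range(width)]
    y_bits = [(y >> j) & 1 for j in range(width)]
    acc = [0] * (2 * width)          # running result, LSB first

    for j in range(width):           # one row of adders per multiplier bit
        carry = 0
        for i in range(width):
            partial = x_bits[i] & y_bits[j]          # AND gate
            acc[i + j], carry = full_adder(acc[i + j], partial, carry)
        acc[j + width] = carry       # this position is still untouched

    return sum(bit << k for k, bit in enumerate(acc))


print(format(array_multiply(0b0011, 0b0111), "08b"))
```

For the operands used in the HSPICE validation (0011 and 0111), the model returns 00010101, matching the product reported above.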
4.1.2 High Bit-Width Arithmetic Circuits 
 In order to evaluate the potential of Skybridge designs at higher bit-widths, we have 
extended the 4-bit CLA designs to 8- and 16-bit CLAs. An additional objective was to 
evaluate the impact of high fan-in on key design metrics such as density, power and 
performance.  
 8-bit and 16-bit CLA block diagrams are shown in Fig. 4.3. Both designs use 4-bit PG 
and Sum blocks as core building blocks. The implementations of these 4-bit blocks 
remain the same irrespective of the bit-width choices. However, the carry block’s 
complexity increases with bit-width, since $C_i$ is calculated using the logic expression 
$C_i = G_{i-1} + P_{i-1} C_{i-1}$. For higher orders of Cout, the complexity increases 
exponentially. As a result, two carry blocks cannot be used in the same clock stage without 
cascading in the 8-bit CLA design; such partitioning of the carry block will result in 
throughput degradation.  

Fig. 4.2. 4-bit Array Multiplier. A) 4-bit array multiplication algorithm; B) block 
diagram of the array multiplier; in order to do iterative additions half adder and full adders 
are used, the multiplication is completed in 9 stages; in this figure, the flow is from the top 
towards bottom; C) HSPICE validations of the multiplier; multiplication between 0011 
and 0111 results in 00010101; the final result is generated at the 9th clock phase; D) 
physical layout of a 4-bit multiplier in Skybridge. 

 However, for a 16-bit CLA design (Fig. ‎4.3B), two 8-bit carry blocks were used. A 
single 16-bit carry block in a single clock stage would result in 17 fan-in circuits, which 
would cause severe degradation of overall performance (details on fan-in sensitivity can 
be found in Chapter 3.2.1). The maximum fan-ins assumed are 4, 9 and 9 for 4-bit, 8-bit 
and 16-bit CLAs respectively.  
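To make the fan-in growth concrete, the carry recurrence $C_i = G_{i-1} + P_{i-1} C_{i-1}$ can be unrolled so that every carry depends only on the $P$, $G$ signals and $C_0$. The expansion below is standard look-ahead algebra and is offered only as an illustration; it does not describe how the carry blocks were actually partitioned in these designs.

```latex
\begin{align}
C_1 &= G_0 + P_0 C_0 \\
C_2 &= G_1 + P_1 G_0 + P_1 P_0 C_0 \\
C_3 &= G_2 + P_2 G_1 + P_2 P_1 G_0 + P_2 P_1 P_0 C_0 \\
C_i &= \sum_{j=0}^{i-1} \Bigl( \prod_{k=j+1}^{i-1} P_k \Bigr) G_j
      + \Bigl( \prod_{k=0}^{i-1} P_k \Bigr) C_0
\end{align}
```

The widest product term of $C_i$ contains $i+1$ signals, so a single-stage carry block for 16 bits needs a 17-input term, in line with the 17 fan-in figure quoted above.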
  

 
Fig. 4.3. High bit-width arithmetic examples: 8-bit and 16-bit CLAs. A) 8-bit CLA 
block diagram; it consists of 4-bit propagate and generate (PG), 4-bit buffer, 8-bit carry and 
2 4-bit sum units. PG blocks generate intermediate signals for parallel addition, buffer is 
used for signal synchronization, and for signal propagation; sum and carry blocks 
generate sum and carry respectively; B) 16-bit CLA block diagram; it consists of 4 4-bit 
PG, 4 4-bit buffer, 2 8-bit carry and 4 4-bit sum blocks. 
4.2 Section Summary 
 This section presented various circuit design examples in the Skybridge fabric. We 
presented detailed designs of arithmetic circuits such as adders and multipliers at different 
bit-widths. Scalability aspects were investigated through high bit-width CLA designs. 
Benchmarking results against projected scaled CMOS designs for these arithmetic 
circuits are provided in Chapter 7.  
 
  
5. CHAPTER 5 
SKYBRIDGE MICROPROCESSOR DESIGN   
 
 In this chapter, a Skybridge processor design is shown. A 4-bit WIre Streaming 
Processor (WISP-4) was built at the transistor level, and functionally verified at the 
circuit level. The WISP-4 processor design uses a load-store architecture, which is 
common in modern RISC processor designs. It is composed of blocks such as program 
counter (PC), read-only memory (ROM), register file, buffers, decoders, multiplexers and 
arithmetic logic unit (ALU), and is capable of performing memory access and arithmetic 
operations. WISP-4 was designed with five stages of pipeline, and each stage is 
micro-pipelined with internal clock signals driving Skybridge dynamic circuits. The design 
of all logic and memory circuits for the processor follows Skybridge’s circuit styles (see 
Chapter 3). Circuit placements and layouts are in accordance with the Skybridge fabric 
design rules and guidelines (see Chapter 6).  
 Using the bottom-up evaluation and benchmarking methodology discussed in Chapter 
6.1, extensive simulations were carried out to validate the WISP-4 design, and to evaluate 
its potential against equivalent CMOS implementation. Benchmarking results are shown 
in Chapter 7.  
5.1 WISP-4 Architecture  
 The architecture of WISP-4 is shown in Fig. 5.1. It has five pipeline stages: 
Instruction Fetch, Decode, Register Access, Execute and Write Back. During Instruction 
Fetch, an instruction is fetched from ROM and is fed to the instruction decoder. In 
Instruction Decode, the fetched instruction is decoded to generate control signals, and to 
buffer the register addresses and data. In the next stage, buffered data is stored in the 
register file and prepared for sequential execution in the Execute stage. After ALU 
operations in the Execute stage, results are stored in the register file during Write Back. 
The synchronization of pipeline stages is maintained through micro-pipelining of logic 
blocks at each stage; this is possible, since all logic block implementation is through the 
Skybridge logic style, which uses clock signals as control inputs.  

Fig. 5.1. Skybridge 4-Bit Wire Streaming Processor (WISP-4). Block diagrams showing 
the WISP-4 organization; it has 5 pipelined stages: Instruction Fetch (IF), Instruction 
Decode (ID), Register Access, Execute and Write Back. 5 instructions are supported: move 
(MOV), move immediate (MOVI), addition (ADD), multiplication (MULT) and stall (NOP). 

 The instruction fetch unit consists of a program counter (PC) and a ROM (Fig. 5.2A). 
The PC is a 4-bit binary up counter that is used to continuously increment the instruction 
address every clock cycle. This implementation uses a 4-bit CLA; one of its inputs is 
constant ‘1’, and another is the result of the previous calculation. The result of PC is fed 
to a 4:16 decoder to select one of the 16 rows from the instruction ROM. The ROM stores a 
set of instructions to be executed and has a total capacity of 16x9 bits in this prototype. 
The output of ROM is a 9-bit instruction and contains 3-bit operation instruction 
(opcode), two 2-bit source/destination register addresses or 4-bit data (see Fig. 5.1).  
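The instruction fetch path described above (4-bit up counter, 4:16 decoder, 16-row ROM) can be modeled behaviorally in a few lines. This is a sketch under stated assumptions: the function names are illustrative, and the ROM contents are arbitrary 9-bit placeholder words, not WISP-4 instructions.

```python
def pc_next(pc):
    """4-bit up counter: add the constant 1 and keep the low four bits."""
    return (pc + 1) & 0xF


def decode_4_to_16(addr):
    """One-hot row select: exactly one of the 16 output lines is 1."""
    return [1 if row == addr else 0 for row in range(16)]


def fetch(rom, pc):
    """Return the 9-bit word on the ROM row selected by the decoder."""
    select = decode_4_to_16(pc)
    return rom[select.index(1)]


# Placeholder ROM image: 16 words of 9 bits each (values are arbitrary).
rom = [(i * 37) & 0x1FF for i in range(16)]

pc = 14
for _ in range(3):
    print(pc, format(fetch(rom, pc), "09b"))
    pc = pc_next(pc)      # wraps from 15 back to 0
```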
  As shown in Fig. 5.2B, the instruction decode unit consists of a 3:8 decoder and 
buffers to decode operation type in an instruction (opcode), and to buffer the address and 
data. Five operations are supported in the current design: MOV, MOVI, ADD, MULT, 
NOP. MOV (move) and MOVI (move-immediate) opcodes are used to move or store 
data in registers. ADD and MULT opcodes are used for addition and multiplications 
respectively. NOP stands for no operation, and is used for stalling the pipeline.  

Fig. 5.2. Block diagrams of the pipeline stages. A) Instruction Fetch stage contains 
4-bit CLA for program counter, 4:16 decoder to decode ROM address and 16*9 ROM 
to store instructions; B) Instruction Decode stage contains a 3:8 decoder to decode 
opcode and two 2-bit buffers for buffering address and data; C) Register Access stage 
has four 4-bit registers to store operands, two 4:1 multiplexers and one 2:1 multiplexer 
for operand selection; D) Execute stage contains arithmetic units: 4-bit CLA and 
multiplier for addition and multiplications, a buffer for data buffering, and two 2:1 
multiplexers for result selection.  

 The Register file (Fig.  5.2C) consists of registers, 2:1 and 4:1 multiplexers, and 
buffers. Registers are used to store operands, and multiplexers are used to generate 
control signals for ALU. Buffers are necessary for synchronization of data between 
stages.  
 The ALU in WISP-4 consists of a CLA, array multiplier, buffer, and 2:1 multiplexers. 
The block diagram of ALU is shown in Fig.  5.2D. 4-bit CLA and multiplier units are 
used for addition and multiplication on 4-bit operands. The buffer unit is used for data 
buffering and to write back in the next stage. 2:1 multiplexers select the output of ALU, 
which is stored in the register file during Write Back stage.  
 Circuit-level implementation of these processor units follows the Skybridge circuit 
style. Both Compound and cascaded dynamic logic styles are combined for efficient 
implementations. 4-bit CLA and multiplier circuits and HSPICE validations were shown 
in Chapter 4; in this section we show the core supporting circuits.  
Fig. 5.3. 2-bit ROM, 2:4 decoder and a latch. A) 2-bit ROM implementation using 
Skybridge’s circuit style. The circuit is preconfigured to produce ‘0’ or ‘1’ output at 
selected locations; the schematic (top) is configured to produce ‘1’ at bit1 location 
when W1 is selected, ‘0’ at bit2 when W2 is selected. HSPICE results are shown in 
the bottom figure; B) 2:4 decoder schematic and HSPICE results are shown; 
cascaded logic style is used for this; output of first stage is propagated to the second 
stage for inversion operation; C) A latch implementation; latch operation is 
controlled by Sel0 and Data inputs; HSPICE simulation results are shown in the 
bottom subfigure. 

 Fig. 5.3 shows a 2-bit ROM, a 2:4 decoder, and a latch. The ROM is pre-configured to 
generate either ‘1’ or ‘0’ output at selected locations. For example, to emulate permanent 
storage of ‘1’ and ‘0’ in word1, bit1 and word2, bit2 locations, three dynamic one-input 
NAND gates are used. As shown in Fig. 5.3A, the bit1 location is associated with 
a NAND gate that has only word2 (W2) as input; whereas, the bit2 location is associated 
with shorted outputs of two NAND gates, whose inputs are word1 (W1) and word2 (W2) 
respectively. All NAND gates shown in Fig. 5.3A are controlled by the same PRE, EVA 
control signals. During W1 select, W2 is ‘0’, therefore bit1 read-out value is ‘1’, and 
during W2 select both bit1 and bit2 read-out values are ‘0’ as expected. Fig. 5.3B shows 
the HSPICE simulated waveform validating ROM behavior.  
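The ROM configuration described above can be captured in a small logic-level model. Two assumptions are made here and are not taken from the source: a one-input dynamic NAND is treated as an inverter of its input after evaluation, and shorted dynamic outputs are treated as a shared precharged node that goes low if any connected gate discharges it (a wired AND). Function names are illustrative.

```python
def dyn_nand1(x):
    """One-input dynamic NAND after evaluation: output is the inverse of x."""
    return 0 if x else 1


def shorted(*outputs):
    """Assumed wired-AND: the shared node stays high only if no gate
    connected to it has discharged."""
    return int(all(outputs))


def rom_read(w1, w2):
    """Read-out of (bit1, bit2) for the word-select lines W1 and W2."""
    bit1 = dyn_nand1(w2)                          # NAND with only W2 as input
    bit2 = shorted(dyn_nand1(w1), dyn_nand1(w2))  # two NANDs, outputs shorted
    return bit1, bit2


print(rom_read(1, 0))   # W1 selected
print(rom_read(0, 1))   # W2 selected
```

Under these assumptions the model gives bit1 = ‘1’ when W1 is selected and both bits ‘0’ when W2 is selected, consistent with the read-out values stated above.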
 The 2:4 decoder implementation uses a cascaded dynamic logic style; the output of the 
first stage is propagated to the second stage for inverted final output. Fig. 5.3C-D shows the 
circuit schematic and related HSPICE simulation results. The dynamic latch 
implementation is shown in Fig.  5.3E-F. It uses a 2:1 multiplexer and a NAND gate for 
required functionality; depending on the input (Data) and select signal (Sel0), either new 
data is latched or old data (out) is retained through the feedback logic. Fig.  5.3F shows 
the HSPICE simulations for this latch, validating circuit operation.   
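The decoder and latch behavior can be summarized with a functional sketch. The two-stage structure (a NAND stage followed by an inverting stage) follows the description above, but the exact gate inputs are an assumption, as are the function names; the latch is modeled only as a 2:1 multiplexer with feedback.

```python
def nand(*inputs):
    """NAND of any number of one-bit inputs."""
    return 0 if all(inputs) else 1


def decoder_2_to_4(a1, a0):
    """2:4 decoder: the first stage forms NANDs of true/complement inputs,
    the second stage inverts them to give one-hot outputs."""
    na1, na0 = 1 - a1, 1 - a0
    stage1 = [nand(na1, na0), nand(na1, a0), nand(a1, na0), nand(a1, a0)]
    return [nand(s) for s in stage1]   # one-input NAND acts as an inverter


def latch_next(data, sel0, out):
    """Latch as a 2:1 multiplexer with feedback: take new data when Sel0
    is 1, otherwise keep the old output."""
    return data if sel0 else out


print(decoder_2_to_4(1, 0))
print(latch_next(data=1, sel0=1, out=0), latch_next(data=1, sel0=0, out=0))
```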
5.2  Section Summary 
 A 4-bit Skybridge microprocessor (WISP-4) was presented; details of the microprocessor 
architecture and its core elements were shown. The WISP-4 design lays the foundation 
for processor implementations in the Skybridge fabric. This design can be easily extended 
to higher bit-widths using arithmetic circuits similar to the ones shown in Chapter 4. 
In addition, Skybridge’s volatile RAM can be used to realize high performance on-chip 
caches. 
  
6. CHAPTER 6 
FABRIC EVALUATION METHODOLOGIES, 3-D CIRCUIT DESIGN RULES 
AND GUIDELINES 
   
 Comprehensive methodologies, from the material layer to system, were developed to 
evaluate the potential of Skybridge vs. CMOS. All circuit simulations followed a bottom-
up simulation methodology that included detailed effects of material choices, confined 
dimensions, nanoscale device physics, 3-D circuit style, 3-D interconnect parasitics, and 
3-D coupling noise. For benchmarking purposes, equivalent CMOS designs were 
implemented using state-of-the-art CAD tools, and were scaled to 16nm using standard 
scaling rules. 
 All circuit designs and layouts in Skybridge adhere to 3-D specific design rules and 
guidelines. The design rules ensure conformity to necessary material structure 
requirements and manufacturing assumptions, as presented earlier. The guidelines allow 
efficient mapping of circuits in this 3-D fabric without routing congestion, help in 
mitigating coupling noise, and ensure thermal management and manufacturability.  
 3-D connectivity implications for large-scale designs in Skybridge were analyzed 
using a detailed methodology. 3-D interconnect modeling was done for a 10 million logic 
gate based design with Skybridge specific parameters; equivalent estimation was done for 
CMOS designs at 16nm technology node for comparison. Thermal analysis of Skybridge 
circuits was carried out using fine-grained model accounting for thermal properties of 
materials, nanoscale dimensions and 3-D layout. 
6.1 Fabric Evaluation Methodologies 
6.1.1 Methodology for 3-D Circuit Evaluation   
 As mentioned earlier, Skybridge circuit evaluation followed a bottom-up simulation 
methodology. Detailed simulations were done at device, core circuit and system levels. 
V-GAA Junctionless device behavior was characterized using 3-D TCAD Process and 
device simulations.  Process simulation was done to create the device structure emulating 
the actual process flow; process parameters (e.g., implantation dosage, anneal 
temperature, etc.) used in this simulation were taken from our experimental work on 
Junctionless transistor ‎[10]. Process simulated structure was then used in Device 
simulations to characterize device behavior. Detailed considerations were taken to 
account for confined device geometry, nanoscale channel length, surface and secondary 
scattering effects (see Chapter 3.1 Process and Device simulation results).  
 For circuit simulations, the TCAD simulated device characteristics were used to 
generate an HSPICE compatible behavioral device model (Fig. 6.1). Regression analysis 
was performed on the device characteristics, and multivariate polynomial fits were 
extracted using DataFit software [26]. Mathematical expressions were derived to express 
the Drain current as a function of two independent variables, Gate-Source (VGS) and 
Drain-Source (VDS) voltages. These expressions were then incorporated into sub-circuit 
definitions for voltage-controlled resistors in HSPICE [24]. Capacitance data from TCAD 
simulations was directly integrated into HSPICE using voltage-controlled capacitance 
(VCCAP) elements and a piece-wise linear approximation. The regression fits for current 
together with the piece-wise linear model for capacitances and sub-circuits define the 
behavioral HSPICE model for the V-GAA Junctionless transistor. This modeling 
methodology is similar to our prior work on horizontal nanowire device modeling [26].  

Fig. 6.1. Skybridge Circuit Evaluation methodology. The bottom-up approach uses 
TCAD Process and Device simulated device characteristics in HSPICE simulations. 
Interconnect parasitics and noise effects from 3-D layout are also captured in these 
simulations.  

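The regression step for the drain current can be illustrated with an ordinary least-squares fit of a bivariate polynomial in VGS and VDS. This is a stand-in sketch, not the DataFit procedure used in this work: the second-order basis, the normal-equation solver and the synthetic sample data are all assumptions chosen to keep the example self-contained.

```python
def basis(vgs, vds):
    """Second-order bivariate polynomial terms (the order is an assumption)."""
    return [1.0, vgs, vds, vgs * vgs, vgs * vds, vds * vds]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def fit_drain_current(samples):
    """Least-squares fit via the normal equations; samples are
    (vgs, vds, drain_current) tuples."""
    rows = [basis(vgs, vds) for vgs, vds, _ in samples]
    k = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    aty = [sum(r[i] * s[2] for r, s in zip(rows, samples)) for i in range(k)]
    return solve(ata, aty)


def drain_current(coeffs, vgs, vds):
    """Evaluate the fitted polynomial at one bias point."""
    return sum(c * t for c, t in zip(coeffs, basis(vgs, vds)))


# Synthetic "device data" generated from made-up coefficients.
true_coeffs = [0.0, 1e-6, 2e-6, 3e-6, 4e-6, -1e-6]
grid = [i * 0.2 for i in range(6)]
samples = [(g, d, drain_current(true_coeffs, g, d)) for g in grid for d in grid]
coeffs = fit_drain_current(samples)
print(coeffs)
```

In a flow like the one described, the fitted expression would then be written into a sub-circuit definition; that step is simulator-specific and is not shown here.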
 In addition to accurate device characteristics, Skybridge circuit simulations also 
accounted for 3-D layout specific interconnect parasitics and coupling noise effects 
(Fig. 6.1) considering actual dimensions and material choices. Circuit mapping into 
Skybridge fabric and interconnection were according to manufacturing assumptions and 
followed the fabric’s design rules and guidelines. Coupling noise considered was due to 
cascading of logic stages, and signal propagation through dense 3-D interconnect network. 
V-GAA Junctionless transistors used for fabric evaluation had 16nm channel length. All 
manufacturing assumptions and design rules followed ITRS guidelines for 16nm 
technology node [48]. Capacitance calculations for Coaxial routing structures were 
according to the methodology in [27], and resistance calculations were according to the 
PTM interconnect model [35]. The PTM model [35] was also used for metal routing RC 
and coupling capacitance calculations.  
 For benchmarking CMOS implementations of arithmetic circuits and a 
microprocessor, state-of-the-art CAD simulation tools (Synopsys Design Compiler, 
Cadence Encounter, and Synopsys HSPICE) were used. Behavioral design, physical 
layout, placement, interconnect extraction, and HSPICE simulations were performed at 
45nm technology node. Extracted results were then scaled to 16nm technology using 
standard scaling rules ‎[13]‎[14].   
6.1.2 Methodology for 3-D Interconnect Modeling, Wire Length Estimation and 
Repeater Count Distribution 
 Predictive models [16][17] for estimation of interconnect distribution in 2-D and 3-D 
fabrics were employed. Parameters for these models such as Rent’s parameters, average 
fan-out and gate-pitch were extracted from the microprocessor and arithmetic circuits 
designed for Skybridge and CMOS. In addition, typical CMOS parameters from 
literature [16] were also considered for another level of comparison. This resulted in the 
full interconnect distribution for Skybridge and 2-D CMOS. In order to identify the 
boundaries between interconnect hierarchical levels, a delay criterion was used [6]. The 
number of repeaters for each hierarchical level was then estimated based on the optimal 
interconnect segment length for repeater insertion and the number of interconnects for a 
given length (from the interconnect length distribution). The optimal segment length for a 
given hierarchical level was determined based on interconnect resistance and 
capacitance parameters. Fig. 6.2 provides an overview; details on the predictive models 
used can be found in [6].  
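The repeater estimate described above combines a length distribution with an optimal segment length. The sketch below shows one plausible way to do this counting; the rule that a wire needs just enough repeaters so that no segment exceeds the optimal length is an assumption, and the distribution values are hypothetical. The actual predictive models are those in [6].

```python
import math


def repeaters_for_wire(length, l_opt):
    """Repeaters needed so that no segment is longer than l_opt
    (assumed counting rule)."""
    return max(0, math.ceil(length / l_opt) - 1)


def repeater_count(distribution, l_opt):
    """Total repeaters for one hierarchical level.

    distribution maps wire length -> number of wires of that length,
    in the same length unit as l_opt.
    """
    return sum(count * repeaters_for_wire(length, l_opt)
               for length, count in distribution.items())


# Hypothetical length distribution (length in gate pitches -> wire count).
example_distribution = {50: 1000, 200: 100, 1000: 10}
print(repeater_count(example_distribution, l_opt=100))
```

With these placeholder numbers, the short wires need no repeaters, each 200-pitch wire needs one, and each 1000-pitch wire needs nine.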
 
 
Fig. 6.2. 3-D Interconnect modeling methodology. Methodology for predicting the 
interconnect length distribution in Skybridge and 2-D CMOS; Skybridge parameters 
were taken from WISP-4 microprocessor design, CMOS parameters were taken from 
[16][17]. 
6.1.3 Methodology for 3-D Thermal Analysis 
 To analyze the thermal profile of 3-D circuits, and to quantify the effectiveness of 
Skybridge’s heat extraction features, we have done circuit-level thermal evaluation using 
detailed modeling and simulation for the worst-case static heat scenario. The thermal 
modeling was done at transistor level granularity, and was extended for Skybridge 
circuits. In this model, each heat conducting region (e.g., Channel, Drain/Source, 
Contacts etc.) is represented with equivalent thermal resistance, and the thermal 
resistance value is determined from the actual thermal conductivity of material used, and 
material dimensions (see Chapter 8 for material properties). The effect of nanoscale 
confined dimensions on thermal conductivity is captured in thermal resistance 
calculations. For Skybridge circuits the same model was used to calculate thermal 
resistance of all active circuit components, accurately reflecting material dimensions and 
3-D layout. HSPICE thermal simulations were done by analogous representation of 
thermal resistance and heat source in electrical domain. Worst case static heat scenario 
was considered for these simulations. Analysis was done on 8 fan-in based Skybridge 
circuits. Several conditions were simulated including heat conduction with and without 
Skybridge’s heat extraction features at different gate temperatures. Fig. 6.3 illustrates the 
methodology used for thermal modeling. More details about thermal modeling and 
analysis can be found in Chapter 8.  

Fig. 6.3. Thermal evaluation methodology. Methodology for worst-case static heat 
scenario. Thermal resistance modeling is done for each circuit component, such as device, 
interconnect, power rail, etc. and combined to assemble thermal resistance network for 
the 3-D circuit; electrical equivalent circuit model is then used for HSPICE evaluations.  
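The thermal-to-electrical analogy used in this methodology can be sketched with a lumped model: each region contributes a thermal resistance R = L / (k·A), resistances combine like electrical resistors, and the temperature rise is the heat flow times the resistance. The helper names and all numeric values below are hypothetical; the actual network and material data are those referred to in Chapter 8.

```python
def thermal_resistance(length_m, conductivity_w_per_mk, area_m2):
    """Thermal resistance of a uniform region: R = L / (k * A), in K/W."""
    return length_m / (conductivity_w_per_mk * area_m2)


def series(*resistances):
    """Regions crossed one after another by the same heat flow."""
    return sum(resistances)


def parallel(*resistances):
    """Alternative heat paths between the same two nodes."""
    return 1.0 / sum(1.0 / r for r in resistances)


def temperature_rise(power_w, resistance_k_per_w):
    """Electrical analogy: heat flow ~ current, temperature ~ voltage."""
    return power_w * resistance_k_per_w


# Hypothetical example: a 16 nm long region with a 16 nm x 16 nm cross-section
# and an assumed conductivity of 10 W/(m K), in parallel with an extraction path.
r_channel = thermal_resistance(16e-9, 10.0, 16e-9 * 16e-9)
r_extract = 2.0e6   # assumed K/W for an added heat extraction path
print(r_channel, parallel(r_channel, r_extract))
```

Adding a parallel path always lowers the effective resistance, which is the mechanism by which heat extraction features reduce the temperature rise in such a network.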
6.2 3-D Circuit Design Rules and Layout Guidelines 
 The design rules are a set of numerical rules for circuit layout derived from TCAD 
simulations and envisioned manufacturing pathway. These design rules set the standard 
for minimum length, width, thickness, and spacing of nanowires, transistors, and metal 
layers. The guidelines for 3-D circuit mapping and layout are based on Skybridge's 
circuit style, global and intermediate signal routing, heat extraction, and 
manufacturability. Ease of implementations of dynamic circuits in 3-D is emphasized in 
these guidelines; careful considerations are taken to enable high fan-in logic 
implementations and to prevent long intra-logic interconnections that are detrimental to 
performance. Basic guidelines are discussed for routing signals using intrinsic features in 
Skybridge (signal carrying nanowires, Bridges and Coaxial routing structures) and 
considerations are taken to mitigate coupling noise through incorporating GND shielding 
layers on signal routing paths. Circuit design guidelines also take into account 3-D heat 
extraction requirements. Heat extraction features are used synergistically with other 
active components to prevent hotspot development in 3-D. Ensuring fabric 
manufacturability is a precursor to all these guidelines. 
6.2.1 Design Rules 
 Design rules used for behavioral and thermal simulations of Skybridge circuits were 
derived from material requirements and the manufacturing pathway presented in Chapter 
9. Materials required and their dimensions are specific to design choices, and are 
validated by simulations; for example: choice of 2nm thick HfO2 as gate-dielectric for 
vertical J-GAA device was validated by detailed 3-D TCAD Sentaurus based modeling  
and simulations (see Chapter 3.1). Similarly material dimensions were selected for 
spacer, contact formation, inter-layer dielectric, and interconnect and heat junctions. 
Fig. 6.4 shows a cross-section of a routing nanowire and a logic nanowire, and illustrates 
dimensions and spacing of different material regions. These dimensions are based on 
their core requirements and manufacturability. For example, as shown in Fig. 6.4, the 
11.5nm thickness of the TiN layer (Gate electrode for vertical J-GAA devices) is 
determined both by the minimum gate electrode thickness requirement for device 
functionality and the lithographic alignment precision (± 3.3nm at 16nm node [48]) 
required for UV exposure (Chapter 9.1.6).  
 Table 6.1 lists design rules that are specific to each fabric component. Since Skybridge 
is a 3-D fabric, design rules are required in all X, Y and Z directions as presented in 
Table 6.1. Some choices are customizable to individual circuit designs, such as Coaxial 
routing layer length, heat junction spacing etc.; these are not listed in Table 6.1.   
6.2.2 Additional Guidelines 
 An abstract view of the Skybridge fabric with key aspects is shown in Fig. 6.5. As 
illustrated, local interconnections for input, output and power rails are through Bridges 
and Coaxial structures. Intermittent Heat dissipating power pillars are also shown on the 
periphery of logic blocks. 

Fig. 6.4. Design rule illustration. A pair of nanowires are shown: one logic nanowire and 
another signal nanowire. Transistors are stacked in logic nanowires, whereas signal 
nanowires are primarily used for signal routing. The figure depicts different materials and 
dimensions. Logic nanowires outer dimensions are determined by transistor gate electrode 
thickness, gate contact requirements; signal nanowires outer dimensions are specified by 
ILD and different metal layer thicknesses. 

 Circuit mapping into the Skybridge fabric involves placement of device, contacts 
and power rails, and local, semi-global and global interconnections. This 3-D circuit 
mapping is made compatible with heat extraction and manufacturing requirements.  
 For circuit mapping, arrays of regular vertical nanowires are partitioned into logic and
signal routing nanowires. Logic nanowires are dedicated to containing transistor stacks,
and signal nanowires are primarily used for signal routing. Logic and signal nanowires are
placed periodically and are interleaved with each other. All nanowires are assumed to
have a fixed height of 886nm. Each logic nanowire is partitioned to hold at most two
logic stages, each with a maximum fan-in of 9 and occupying half of the maximum
nanowire height. Interconnection between logic stages is through Bridges and Coaxial
routing structures, and utilizes signal nanowires. Bridges form links between nanowires,
and Coaxial routing structures placed on signal nanowires allow signal hopping and
provide noise shielding. In current designs, three signals can be routed with one signal
nanowire and its surrounding metal shells; one of the three is dedicated to the GND
signal to provide noise shielding. In addition to these routing requirements, logic stages
that are used in the same logic block are placed in close proximity to avoid long
intra-block connections, and thus to reduce delay.

Table 6.1. Design rules

                              Width (nm)   Length (nm)   Thickness (nm)   Spacing (nm)
                              X            Z             Y
Bridge (X,Y,Z)                16n-58n      16n           16n-58n          16n-37n
Transistor Channel (X,Y,X)    16n          16n           16n              58n
Transistor Spacing (Z)        -            -             -                16n
Gate Electrode (Z)            29n          16n           11.5n            -
Contact (X,Y,Z)               26n          16n           16n              39
Heat Junction (X,Y,Z)         22n          16n           6n               -
Coaxial (Si-M1) (X,Y)         37n          -             37n              4n (Si-M1)
Coaxial (M1-M2) (X,Y)         58n          -             58n              4n (M1-M2)

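The partitioning limits stated above (at most two logic stages per logic nanowire, a maximum fan-in of 9 per stage, and half of the 886nm nanowire height per stage) can be expressed as a simple mapping check. The following is a minimal sketch under those stated limits only; the class and function names are hypothetical and are not part of any Skybridge tool flow.

```python
from dataclasses import dataclass

# Limits quoted in the text; all identifiers below are illustrative assumptions.
NANOWIRE_HEIGHT_NM = 886
MAX_STAGES_PER_NANOWIRE = 2
MAX_FAN_IN = 9


@dataclass
class LogicStage:
    name: str
    fan_in: int  # number of stacked inputs in this stage


def check_logic_nanowire(stages):
    """Return a list of rule violations for one logic nanowire (empty if none)."""
    problems = []
    if len(stages) > MAX_STAGES_PER_NANOWIRE:
        problems.append(
            f"{len(stages)} stages exceed the limit of {MAX_STAGES_PER_NANOWIRE}"
        )
    for stage in stages:
        if stage.fan_in > MAX_FAN_IN:
            problems.append(
                f"{stage.name}: fan-in {stage.fan_in} exceeds {MAX_FAN_IN}"
            )
    return problems


def stage_height_budget_nm():
    """Each logic stage occupies half of the nanowire height."""
    return NANOWIRE_HEIGHT_NM / 2


if __name__ == "__main__":
    nanowire = [LogicStage("nand_top", 8), LogicStage("nand_bottom", 8)]
    print(check_logic_nanowire(nanowire), stage_height_budget_nm())
```

A check of this kind covers only the per-nanowire limits; placement periodicity, routing and heat extraction constraints would need separate rules.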
 Global signals in Skybridge are primarily clock and power signals. Power signal
contacts (VDD, GND) are made at the top, middle, and bottom of the logic nanowires.
GND contacts are made at the top and bottom, and VDD contacts are made in the middle;
this configuration allows heat to flow from the top of the nanowire towards the bulk at
the bottom (details on thermal management are given in Chapter 8). Clock signals are
routed in parallel to power signals.

Fig. 6.5. View of Skybridge fabric. The figure shows an abstract layout of the Skybridge
fabric incorporating all fabric components. Logic and signal nanowires are separated,
and are interleaved with each other. Logic nanowires contain transistor stacks, and
have power rail contacts at the top, middle and bottom. Signal nanowires carry signals
themselves and also facilitate routing through Coaxial routing structures and Bridges.
Coaxial routing structures have a dedicated GND signal layer for noise shielding. Heat
Extraction features ensure thermal management. As illustrated, Heat Extraction
Junctions are placed at selected places on logic nanowires; extracted heat is dissipated
through Heat Extraction Bridges and Heat Dissipating Power Pillars.

 Heat Extraction Junctions are placed at the output of every logic stage or one per
logic nanowire, depending on the requirements. One input out of a fan-in of 9 is reserved
in every logic stage for the Heat Junction. Extracted heat is dissipated through Bridges
and Heat Dissipating Power Pillars. The large-area Heat Pillars are placed on the
periphery of logic blocks, and are separated from each other by an average distance of
10 nanowire pitches. Circuit mapping in the fabric takes the placement of these pillars
into consideration.
6.3 Section Summary 
 In this section, an overview of the methodologies used for interconnect estimation,
thermal analysis, 3-D circuit functionality verification and benchmarking was presented.
Numerical design rules for 3-D circuits, derived from TCAD simulations and manufacturing
assumptions, were elaborated. Guidelines for circuit mapping into the physical fabric were
shown that take into account manufacturability, connectivity, noise mitigation and
thermal management.
  
 
7. CHAPTER 7 
BENCHMARKING RESULTS 
   
 We have extensively evaluated core aspects of the Skybridge fabric and benchmarked them
against projected scaled CMOS. The benefits of 3-D circuit implementation were evaluated
through a 4-bit array multiplier, 4-, 8- and 16-bit CLAs, a 1-bit volatile memory cell, and
a 4-bit microprocessor design. The benchmarking accounted for detailed effects of material
structures, nanoscale device physics, circuit style, 3-D circuit layout, interconnect
parasitics and noise coupling, and followed the methodology, design rules and guidelines
described in Chapter 6. CMOS equivalent implementations were completed using
state-of-the-art CAD tools, and scaling to 16nm was done using standard design rules
[13][14] as discussed in Chapter 6.1.1. In addition, we have evaluated the connectivity
implications of Skybridge’s ultra-dense implementations and compared them with
equivalent CMOS following the methodology described in Chapter 6.1.2. The effectiveness
of Skybridge’s heat extraction features is shown in Chapter 8.
 The benchmarking results show tremendous benefits can be obtained for Skybridge
designs; for example, the 16-bit CLA design achieves 60x density, 10x power and 54%
performance benefits over equivalent CMOS designs, and Skybridge’s estimated total
interconnection length is 10x less compared to CMOS.
7.1 Benchmarking of Arithmetic Circuits  
 The benchmarking results for arithmetic circuits are shown in Table 7.1; these circuit
designs were detailed in Chapter 4. As evident from the results, Skybridge designs
achieve significant benefits across all metrics. Table 7.1 shows that the 4-bit array
multiplier Skybridge design has a 39.3x density and 4x power advantage at comparable
performance vs. the CMOS multiplier. The 4-bit Skybridge CLA is 24.6x denser and
consumes 12x less power, whereas the 8- and 16-bit CLA designs, which use a fan-in of 8,
are 48x and 60.5x denser and consume 12x and 10x less power, respectively, in
comparison to equivalent 16-nm CMOS designs. The active power results show an almost
linear dependence on throughput. The 16-bit Skybridge design has 54% higher performance
than the CMOS version. Due to the Skybridge fabric and circuit style, the load capacitance
that each gate output sees is reduced, and as a result high fan-in designs are possible and
beneficial in Skybridge circuits. Our results show that the overall benefits over CMOS
improve at higher bit-widths, which indicates the high bit-width scalability potential of
Skybridge designs.
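The CLA ratios quoted above follow directly from the tabulated values. The following is a minimal sketch that recomputes them from the CLA rows of Table 7.1; the dictionary layout and the function name are illustrative assumptions, and throughput is used as the performance metric.

```python
# Values copied from the CLA rows of Table 7.1:
# (throughput in 1/s, power in uW, area in um^2).
# The data layout and function name are illustrative assumptions.
TABLE_7_1 = {
    "4-bit CLA": {"cmos": (9.9e9, 235.0, 18.7), "skybridge": (10.4e9, 19.4, 0.76)},
    "8-bit CLA": {"cmos": (4.5e9, 287.0, 64.7), "skybridge": (5.7e9, 23.5, 1.34)},
    "16-bit CLA": {"cmos": (2.4e9, 297.0, 130.2), "skybridge": (3.7e9, 27.8, 2.15)},
}


def benefits(cmos, skybridge):
    """Return (density gain, power reduction, throughput gain in percent)."""
    c_tp, c_pw, c_area = cmos
    s_tp, s_pw, s_area = skybridge
    density = c_area / s_area          # how many times denser Skybridge is
    power = c_pw / s_pw                # how many times less power Skybridge uses
    perf_pct = (s_tp / c_tp - 1.0) * 100.0
    return density, power, perf_pct


for name, row in TABLE_7_1.items():
    d, p, t = benefits(row["cmos"], row["skybridge"])
    print(f"{name}: {d:.1f}x density, {p:.1f}x power, {t:.0f}% throughput")
```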
Table 7.1. Scalability potential of Skybridge designs

                    Throughput (s⁻¹)      Power (μW)        Area (μm²)
Circuit             CMOS      SB          CMOS     SB       CMOS     SB
4-Bit Multiplier    5.0e9     5.1e9       42.3     172      50       1.27
4-Bit CLA           9.9e9     10.4e9      235      19.4     18.7     0.76
8-Bit CLA           4.5e9     5.7e9       287      23.5     64.7     1.34
16-Bit CLA          2.4e9     3.7e9       297      27.8     130.2    2.15

7.2 Benchmarking of Volatile Memory
 Cell-level evaluation of Skybridge volatile RAM vs. scaled 16nm high-performance
6T-SRAM is shown in Table 7.2. The Skybridge RAM has 4.6x density, 4.24x active
power and 50x leakage power benefits, and operates at a similar frequency to the
high-performance SRAM. These benefits are achieved due to 3-D integration and an
innovative circuit style. The density benefits are evident from the Skybridge RAM 3-D
layout (Fig. 3.6C), since only one logic nanowire is used for the memory implementation,
which is equivalent to one transistor area. The dense implementation also implies less
intra-cell routing, which helps reduce active power. The active power in this RAM is
further reduced compared to SRAM due to its fundamental operating style. The write
operation in Skybridge RAM is synchronized with the clock, and only the true or the
complementary value is written at a given time, as opposed to SRAM, where both values
transition at the same time, leading to higher switching activity and, as a result, more
active power. The leakage power in Skybridge RAM is significantly lower, since the RAM
design uses a dynamic circuit style with multiple transistors stacked in series, forming a
high-resistance path from the storage node to GND. Moreover, the Skybridge RAM’s
restoration scheme ensures that during periods of inactivity all control signals can be
switched off, which reduces leakage power further (details on Skybridge RAM operation
can be found in Section 3.3). Despite the reduced intra-cell routing of Skybridge RAM,
its delay is comparable to that of the SRAM (20.2 ps vs. 20 ps, Table 7.2).
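As a check on the quoted memory benefits, the ratios can be worked out from the Table 7.2 entries (CMOS 6T-SRAM value divided by Skybridge 8T-NWRAM value, except for the delay ratio). This is plain arithmetic on the tabulated numbers; the row labels are descriptive only.

```latex
% Ratios implied by Table 7.2; labels are descriptive, not terms from the thesis.
\begin{align}
\text{density gain} &= \frac{0.065\,\mu\mathrm{m}^2}{0.014\,\mu\mathrm{m}^2} \approx 4.6 \\
\text{active power gain} &= \frac{1.4\,\mu\mathrm{W}}{0.33\,\mu\mathrm{W}} \approx 4.24 \\
\text{leakage power gain} &= \frac{8.2\,\mathrm{nW}}{0.164\,\mathrm{nW}} = 50 \\
\text{delay ratio (Skybridge/CMOS)} &= \frac{20.2\,\mathrm{ps}}{20\,\mathrm{ps}} = 1.01
\end{align}
```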
Table 7.2. Memory comparison: Skybridge 8T-NWRAM vs. CMOS 6T-SRAM

                      Delay (ps)   Active Power (μW)   Leakage Power (nW)   Area (μm²)
CMOS 6T-SRAM          20           1.4                 8.2                  0.065
Skybridge 8T-NWRAM    20.2         0.33                0.164                0.014

7.3 Benchmarking of Processor Design in Skybridge
 Benchmarking results for the WISP-4 microprocessor are shown in Table 7.3. The WISP-4
architecture and its core design components were presented in Chapter 5. As shown in
Table 7.3, the Skybridge WISP-4 design significantly outperforms the equivalent CMOS
version: at least 30x density, 2.94x power and 18.6% performance benefits are obtained.
Higher benefits are expected for higher bit-width implementations. The scalability of
Skybridge circuits was shown through arithmetic circuits in Section 7.1.
Table 7.3. Skybridge vs. CMOS comparison for microprocessor

WISP-4 Processor   Throughput (Operations/sec)   Power (μW)   Area (μm²)
CMOS               4.3x10⁹                       886          289
Skybridge          5.1x10⁹                       301          9.52

7.4 Connectivity Implications of Skybridge Designs
 Skybridge’s unique routing features, such as Bridges and Coaxial routing structures,
allow Input/Output/Global signals to be routed from any arbitrary position in the 3-D
layout to another, and thus ensure a high degree of connectivity with a limited footprint.
Additional routing is achieved through traditional metal layers. We have quantified the
connectivity implications of Skybridge designs using predictive models based on Rent’s
rule [16][17]. Rent’s parameters for Skybridge were extracted from actual designed
circuits, and CMOS parameters were taken from the literature [16]. For a 10M logic-gate
design, our results indicate that interconnect lengths for Skybridge are significantly
shorter than for CMOS at each hierarchical level (Fig. 7.1A); e.g., the longest Global
interconnect is ~10x shorter, with Semi-global and Local interconnects being dominant.
This considerably reduces the number of repeaters required in Skybridge (Fig. 7.1B); in
the best case, the repeater count was found to be 100x lower than in CMOS designs. This
has huge implications for the overall area, power consumption, and performance of large
Skybridge-based circuit architectures.
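Rent's rule relates the number of terminals T of a block to its gate count N as T = k N^p. The following is a minimal sketch that evaluates this relation with the (k, p) pairs quoted in the caption of Fig. 7.1; it illustrates only the basic rule and does not reproduce the interconnect-length distribution or repeater-count models of [16][17]. The function and dictionary names are illustrative assumptions.

```python
def rent_terminals(n_gates, k, p):
    """Rent's rule: expected terminal count T = k * N**p for a block of N gates."""
    return k * n_gates ** p


# (k, p) pairs as quoted in the caption of Fig. 7.1; the keys are labels of convenience.
RENT_PARAMETERS = {
    "Skybridge": (5.39, 0.577),
    "CMOS, Parameter Set 1": (4.0, 0.6),
    "CMOS, Parameter Set 2": (3.416, 0.473),
}

N_GATES = 10_000_000  # the 10M logic-gate design discussed in the text

for label, (k, p) in RENT_PARAMETERS.items():
    print(f"{label}: T = {rent_terminals(N_GATES, k, p):,.0f} terminals")
```

Because p is an exponent, small differences in p dominate the terminal count at large N, which is why the parameter extraction from designed circuits matters for the comparison.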
 
7.5 Section Summary 
 In this section, benchmarking results for the Skybridge fabric were presented. We
presented results for arithmetic circuits at different bit-widths and showed how they
scale. We have also shown benchmarking results for a microprocessor. The benefits of
Skybridge designs were tremendous across all metrics (area, power and performance), and
more benefits are projected at higher bit-widths. The implications of 3-D connectivity
were also evaluated; interconnect requirements for Skybridge were found to be an order
of magnitude lower.
 
Fig. 7.1. Comparison of interconnect distribution and estimated repeater count
in Skybridge and CMOS, for an integrated circuit consisting of 10 million
gates. A) Interconnect distribution estimating the number of interconnects of a
given length (in gate-pitches). Skybridge reduces the length of interconnects
significantly, by almost 10x for the longest interconnect. B) Estimated count of
repeaters based on the interconnect distribution in (A). Parameters for Skybridge:
k = 5.39, p = 0.577 (Rent’s parameters), average fan-out = 2.018. For CMOS,
Parameter Set 1: k = 4, p = 0.6, average fan-out = 3; and Parameter Set 2: k = 3.416,
p = 0.473, average fan-out = 1.7.
 
8. CHAPTER 8 
FINE-GRAINED 3-D THERMAL MANAGEMENT 
   
 Thermal management is a crucial issue at the nanoscale. As transistors reach
ultra-scaled dimensions, heat dissipation paths are reduced, giving rise to self-heating in
transistors. The situation worsens for 3-D designs, where multiple transistors are stacked
vertically and the thermal resistance from heat source to sink increases. In Skybridge,
nanoscale thermal issues are addressed through architected heat extraction features that
are built in as core fabric components. This integrated mindset is a significant departure
from traditional CMOS approaches, where heat extraction from the active circuit is
addressed only as an afterthought (i.e., during operation, and at the system level).
 The intrinsic heat extraction features of the Skybridge fabric are: (i) selective placement
of power rails (i.e., VDD and GND) to control the heat flow direction, (ii) Heat Extraction
Junctions (HEJs) to extract heat from a heated region in a circuit, and (iii) sparsely placed
large-area Heat Dissipating Power Pillars (HDPPs) for heat dissipation to the sink.
(i) In Skybridge, logic and memory functionality is achieved in vertical nanowires,
where transistors are stacked and metal contacts are established at selected places in
the nanowires for outputs and power rails (i.e., VDD and GND). The placement of power
rail contacts has huge thermal implications, since it determines the current and heat
flow direction in a vertically implemented fabric. For example, in a vertically
implemented dynamic NAND gate, if VDD is placed at the top and GND at the bottom,
electrons will flow from GND towards VDD and generate heat along their path. In turn,
the generated heat will flow from the top (i.e., hot region) to the bottom (i.e., cool
region) towards the reference temperature. In this fabric, the power rails are positioned
vertically such that heat flow towards the substrate is maximized. Since each logic
nanowire pillar accommodates two dynamic NAND gates, and one power rail can be
shared between the two gates, the VDD contact is positioned in the middle and GND
contacts are made at the top and at the bottom. This configuration allows heat transfer
from VDD to the bottom GND and towards the heat sink in the bulk, and allows the
bottom of the nanowires to be at the same temperature as the substrate.
(ii) HEJs are specialized junctions that are used to extract heat from a logic nanowire 
without perturbing its operation. HEJs are connected with Bridges to transfer heat to 
the bulk through HDPPs. The Bridges that carry heat are different from other generic 
signal carrying Bridges, since these always carry only one type of electrical signal 
(GND) and serve the purpose of heat extraction only. HEJs in conjunction with 
Bridges allow flexibility to selectively extract heat from a 3-D circuit layout without 
any loss of functionality or performance. 
(iii) HDPPs are intrinsic to the Skybridge fabric, and are used for both power supply
(i.e., VDD and GND signals) and heat dissipation. These pillars are large in area
(2 nanowire pitches x 2 nanowire pitches) and have a specialized configuration, with
metal silicidation and fillings specifically to facilitate heat transfer. The top GND and
middle VDD contacts in each logic nanowire connect to these large-area pillars through
Bridges. The power pillars differ in terms of dimension, layout and material
configuration from signal pillars, which carry input/output/clock signals from
different logic/clock stages.
In the following, we present details on the thermal characteristics of the Skybridge fabric
and show the effectiveness of its architectural features. A fine-grained thermal modeling
approach for 3-D circuits is presented, followed by a detailed evaluation.
8.1 Thermal Modeling and Analysis 
 In order to characterize the thermal profile under operating conditions, heat modeling
was done for circuits at transistor-level granularity, as outlined in Chapter 6.2. This
fine-grained modeling is especially important due to the nanoscale dimensions of active
devices; at this scale, confined dimensions and scattering effects drastically reduce the
thermal conductivity of the silicon channel, which leads to rapid self-heating. From a
circuit perspective, such fine-grained modeling allows a detailed understanding of heat
generation in circuits, and of the implications of material and architectural choices for
heat dissipation.
8.1.1 V-GAA Junctionless Transistor  
 In this section we show thermal modeling of a single n-type GAA Junctionless
transistor. Material and geometry considerations of this device are reassessed from a
thermal perspective. Fig. 8.1A shows the cross-section of an n-type GAA Junctionless
transistor, where heat generation is mainly due to electron-phonon interaction in the
Drain region. During the ON state, free electrons accelerate from the Source region
towards the Drain. There they scatter due to interactions with other electrons, phonons,
and impurity atoms, causing the lattice temperature to increase [19]. Depending on the
materials and geometry of the transistor, the resulting temperature gradient can either
dissipate quickly without any impact, or dissipate slowly and degrade the transistor ON
current.
 In order to estimate the temperature gradient within the transistor region, an electrical
analogy of the thermal model can be used [18]. The generated heat Q (in Watts) can be
approximated as:
\[ Q = I_{ds} \cdot V_{ds} \tag{8.1} \]
 In eq. (8.1), Ids is the Drain-source current and Vds is the Drain-source voltage. The
relationship between heat (Q) and temperature gradient (ΔT) is:
\[ \Delta T = \frac{L}{k \cdot A} \cdot Q \tag{8.2} \]
 In eq. (8.2), L is the length of the heat conduction path, k is the thermal conductivity
and A is the cross-sectional area of the heat conduction path. Q and ΔT are analogous to
current (I) and voltage (V), respectively, in the electrical domain, and the thermal
resistance L/(k·A) is analogous to electrical resistance. This allows us to model the
thermal circuit as an equivalent electrical circuit for analysis under various operating
conditions.
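Eqs. (8.1) and (8.2) can be exercised numerically for a single uniform slab. The following is a minimal sketch, assuming a 16nm x 16nm x 16nm doped-silicon channel with k = 13 Wm⁻¹K⁻¹ and the device values VDD = 0.8V and ON current = 3.2x10⁻⁵ A quoted later in this chapter; the function names are illustrative. It treats the channel as the only conduction path, so it does not reproduce the multi-path resistance network or the HSPICE results.

```python
def thermal_resistance(length_m, conductivity_w_per_mk, area_m2):
    """Thermal resistance of a uniform slab, R = L / (k * A), in K/W (eq. 8.2)."""
    return length_m / (conductivity_w_per_mk * area_m2)


def temperature_rise(heat_w, resistance_k_per_w):
    """Eq. (8.2): delta_T = R * Q, the thermal counterpart of V = I * R."""
    return heat_w * resistance_k_per_w


NM = 1e-9  # metres per nanometre

# Illustrative inputs (assumptions for this sketch): channel slab from Table 8.1,
# heat from eq. (8.1) with Ids = 3.2e-5 A and Vds = 0.8 V.
q = 3.2e-5 * 0.8  # generated heat in W
r_channel = thermal_resistance(16 * NM, 13.0, (16 * NM) ** 2)

print(f"Q = {q:.3e} W")
print(f"R = {r_channel:.3e} K/W")
print(f"dT = {temperature_rise(q, r_channel):.1f} K")
```

The large resistance of a single nanoscale slab is what makes the additional dissipation paths discussed in the rest of this chapter significant.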
 Material considerations and nanoscale effects are captured in the thermal conductivity
parameter k, whereas geometry considerations are accounted for in the (L/A) portion of
eq. (8.2). Surface scattering, trap states and confinement effects reduce the thermal
conductivity of the channel significantly at the nanoscale. Pop et al. reported [19] the
thermal conductivity of a 10nm thin silicon layer to be as small as 13 Wm⁻¹K⁻¹, which is
one order of magnitude less than that of bulk silicon (147 Wm⁻¹K⁻¹). Table 8.1 lists the
different materials used in GAA Junctionless transistor and circuit thermal modeling.
Material specifications (i.e., dimensions and thermal conductivity) in the heat flow path
are also given in Table 8.1; the heat flow path is depicted in Fig. 8.1B.
Table 8.1. Properties of materials used in transistor modeling

Region            Material        Dimension (L x W x T) nm   Thermal Conductivity (Wm⁻¹K⁻¹)
Drain Electrode   Ti              10 x 16 x 12               21 [38]
Drain-Si          Silicide        10 x 16 x 16               45.9 [39]
Spacer            Si3N4           5 x 16 x 18.5              1.5 [41]
Channel           Doped Si        16 x 16 x 16               13 [19]
Gate Oxide        HfO2            16 x 18 x 2                0.52 [42]
Gate Electrode    TiN             10 x 16 x 6                1.9 [45]
Heat Junction     Al2O3           4 x 16 x 18.5              30 [20]
Interlayer        C doped SiO2    -                          0.6 [40]
Bridge            W               43.5 x 58 x 16             167 [43]

 The thermal model of the GAA Junctionless transistor was developed using an equivalent
thermal resistance network that considers the heat conduction paths and device geometry,
based on the methodology discussed in [18] for multigated transistors. The resistance
network built from the thermal conduction paths in Fig. 8.1B and the corresponding
material parameters (Table 8.1) is shown in Fig. 8.1C. As illustrated, there are three
paths to the reference temperature, through contacts at the Drain, Gate and Source
regions. Following the transistor’s underlying self-heating principle, the heat source is
placed on the Drain side of the channel. From the heat source, heat travels either through
the silicide, spacer and contact at the Drain, or through the channel towards the Gate
contact, or through the channel towards the Source contact. Heat flow depends on the
least-resistance path to the reference temperature. This resistance network model and
device characteristics from TCAD simulations (VDD = 0.8V and ON current = 3.2x10⁻⁵ A;
Section 2.1) were used for HSPICE simulations. Fig. 8.1D shows the simulation result for
a single isolated transistor. For this simulation, the routing resistance from contact to
bulk was considered to be negligible. The reference temperature was assumed to be 350K.
As shown in Fig. 8.1D, the temperature is highest at the Drain side and gradually
decreases towards the Source; the trend is the same for varying Drain voltages. However,
the slope of the temperature change differs between regions because of the effective
thermal resistance of each dissipation path.

Fig. 8.1. Thermal modeling and simulations of V-GAA Junctionless transistor.
A) V-GAA Junctionless transistor cross-section is shown with material dimensions;
B) heat dissipation paths are shown, the heat source being the Drain region; C) heat
resistance model for a single transistor; the Drain side of the channel acts as the heat
source, and heat is dissipated through the contacts at the Drain, Source and Gate; D)
thermal simulation results for a single transistor; temperature profile at various
transistor regions with increasing Drain voltage.
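The statement that heat flow follows the least-resistance path can be illustrated with the parallel-path rules implied by the electrical analogy. The following is a minimal sketch, not the network of Fig. 8.1C: the three path resistances are hypothetical values chosen only to show the behaviour, and the function names are illustrative. The heat value uses the quoted VDD and ON current, and the reference temperature is the 350K assumed in the text.

```python
import math

T_REF_K = 350.0      # reference temperature assumed in the text
Q_W = 3.2e-5 * 0.8   # eq. (8.1) with the quoted ON current and VDD


def node_temperature(q_w, path_resistances, t_ref=T_REF_K):
    """Temperature of a heat-source node tied to t_ref through parallel thermal paths."""
    conductance = sum(1.0 / r for r in path_resistances.values())
    return t_ref + q_w / conductance


def heat_split(q_w, path_resistances):
    """Fraction of the heat leaving through each path (current-divider rule)."""
    conductance = sum(1.0 / r for r in path_resistances.values())
    return {name: (1.0 / r) / conductance for name, r in path_resistances.items()}


# Hypothetical path resistances in K/W; these are NOT values from the thesis.
paths = {"drain": 5.0e6, "gate": 2.0e6, "source": 8.0e6}
print(node_temperature(Q_W, paths), heat_split(Q_W, paths))

# Blocking one path (infinite thermal resistance) raises the node temperature.
blocked = dict(paths, gate=math.inf)
print(node_temperature(Q_W, blocked))
```

The path with the lowest resistance carries the largest share of the heat, and removing it forces the same heat through the remaining, more resistive paths.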
8.1.2 Thermal Model & Analysis of Skybridge Circuits  
 In order to understand the thermal constraints present in realistic scenarios and to
validate the heat extraction capabilities of Skybridge, we have performed detailed thermal
circuit modeling using thermal resistance networks. HSPICE simulations were carried out
to characterize the static thermal behavior of the circuit under the worst-case operating
condition.
 Fig. 8.2 shows example sub-circuits with two independent 8-input dynamic NAND gates
implemented in a single nanowire. GND contacts are at the top and bottom of the
nanowire and VDD is in the middle. The placement of these power rail contacts dictates
the dissipation paths. Additional heat dissipation paths are through the transistor Gate
regions, through the interlayer dielectric, and through the doped silicon nanowire (see
Fig. 8.2). Gate input Bridges along with Gate contacts contribute significantly to heat
extraction if the contact itself (i.e., the source of the Gate input) is at the reference
temperature. If the Gate input is at a different temperature, heat dissipation through the
Gate may vary.

Fig. 8.2. Heat dissipation paths in circuits. Two dynamic NAND gates (each with a
fan-in of 8 plus Pre and Eva transistors) are implemented in a vertical nanowire; the
NAND gates share the VDD contact in the middle; heat dissipation is through the
nanowire, the power rail contacts (VDD and GND), the gate electrodes and the interlayer
dielectric. A signal nanowire is shown. Bridges carry signals from the signal nanowire to
the inputs; heat flows opposite to the direction of the incoming signal through the gates,
depending on the temperature of the gate input Bridges and signal nanowires.

 The 3-D thermal resistance network for the nanowire in Fig. 8.2 is shown in Fig. 8.3.
As depicted, metal contacts, silicided nanowire, transistors, Bridges, and signal and power
pillars are all represented by thermal resistances. The modeling of thermal resistance
follows a methodology similar to that described in Section 8.1.1. Design rules for the 3-D
circuit layout and transistors are the same as in Chapter 6.4, 6.5 and Chapter 3.1.

Fig. 8.3. Thermal modeling of circuits. A representation of two sub-circuits in a single
nanowire is shown; the thermal resistance network is built based on the vertical GAA
Junctionless transistor model (Fig. 8.1C) and the nanowire transistor stack schematic
(Fig. 8.2). Each Ohmic contact to the nanowire is represented by the nanowire silicidation
resistance, the Ohmic contact resistance and the routing resistance. The average routing
distance from each metal electrode (i.e., Gate electrode, Ohmic contact, power rail
contact) to the heat sink was assumed from the 8-bit Skybridge carry look-ahead adder
circuit.
 HSPICE simulations were carried out for the worst-case thermal profile. For the
sub-circuits in Fig. 8.2, the worst-case scenario is during the EVA phase of operation,
when all the transistors are ‘On’ and each of them acts as a static heat source. The heat
source (i.e., power in the electrical analogy) at the Drain side of each transistor in the
NAND gate was determined by dividing the maximum heat (Ion x VDD) by the number of ON
transistors. This is overly pessimistic, since in a dynamic circuit multiple transistors are
stacked, and the state of each transistor's Drain/Source diffusion capacitances determines
the current flow. As a result, the current in the Drain regions is much lower than in this
worst static case.
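The worst-case heat budget described above can be written out with the device values quoted in Section 8.1.1 (ON current = 3.2x10⁻⁵ A, VDD = 0.8V). The number of ON transistors in the stack is left symbolic as N_on, since its value depends on the sub-circuit; the symbols Q_max and Q_i are introduced here for illustration only.

```latex
% Worst-case static heat budget for one NAND stack, using values quoted in this chapter.
% Q_max, Q_i and N_on are illustrative symbols; N_on is the number of ON transistors.
\begin{align}
Q_{\max} &= I_{on} \cdot V_{DD}
          = (3.2 \times 10^{-5}\,\mathrm{A})(0.8\,\mathrm{V})
          = 2.56 \times 10^{-5}\,\mathrm{W} = 25.6\,\mu\mathrm{W} \\
Q_{i} &= \frac{Q_{\max}}{N_{on}}, \qquad i = 1, \dots, N_{on}
\end{align}
```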
 
Fig. 8.4. Thermal simulation results of Skybridge circuits without heat extraction
features. The temperature profile of each transistor in the logic nanowire of Fig. 8.2 is
shown. The thermal profile shows the importance of heat dissipation paths: when no heat
extraction through the Gate is considered, the temperature is as high as 4307K in the
EVA transistor. When heat extraction through the Gate contact is considered, the
temperature reduces drastically, to 667K and 480K for 50% and 100% Gate extraction,
respectively.
 
 As mentioned earlier, the Gate contact plays an important part in heat dissipation. In
our HSPICE simulations, we model different scenarios for the Gate input temperature: (i)
at maximum, (ii) at half of the maximum, and (iii) at reference. Maximum temperature at
the Gate contact represents the scenario in which there is no heat conduction through the
Gate (i.e., the thermal resistance of the Gate is infinite); the half-of-the-maximum
scenario refers to the condition in which heat conduction through the Gate is half that of
the best-case scenario, in which the Gate is at the reference temperature and contributes
fully as a major heat dissipation path. Simulation results are shown in Fig. 8.4. The best
results are obtained for scenario (iii), when there are multiple heat dissipation paths.
For the top-most transistor, the temperature in the Drain region is as high as 4307K in
scenario (i); however, with more heat dissipation through the Gate, the temperature
reduces drastically to 667K (scenario (ii)) and to 480K (scenario (iii)). Fig. 8.4 also
shows the trend that temperature decreases towards the bottom of the transistor stack.
8.2 Skybridge’s Heat Extraction Features  
8.2.1 Heat Dissipation Power Pillars (HDPPs)  
 Skybridge’s heat extraction features maximize heat dissipation by providing
thermally conductive paths. HDPPs, when connected to power rails, provide such paths.
The HDPPs are intermittent power pillars that serve the purposes of both local power
supply and heat dissipation. These pillars are specially designed to maximize heat
conduction; in our current fabric design they occupy an area of 2x2 nanowire pitches
(132nm x 132nm), within which there are 4 silicided pillars (16nm x 16nm each). The rest
of the volume is filled with Tungsten (W) to maximize heat conductance (Fig. 8.5).
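To give a sense of scale for the HDPP geometry, the quoted footprint can be compared with the silicided pillars it contains. This is a rough cross-sectional estimate only: it assumes that the four pillars and the Tungsten fill both span the full pillar height, and the area symbols are introduced here for illustration.

```latex
% Cross-sectional shares within one HDPP, from the quoted 132 nm x 132 nm footprint.
% Assumes the four silicided pillars and the W fill span the full pillar height.
\begin{align}
A_{HDPP} &= 132\,\mathrm{nm} \times 132\,\mathrm{nm} = 17424\,\mathrm{nm}^2 \\
A_{Si} &= 4 \times (16\,\mathrm{nm} \times 16\,\mathrm{nm}) = 1024\,\mathrm{nm}^2 \\
\frac{A_{W}}{A_{HDPP}} &= \frac{17424 - 1024}{17424} \approx 0.94 \\
\frac{A_{HDPP}}{16\,\mathrm{nm} \times 16\,\mathrm{nm}} &= \frac{17424}{256} \approx 68
\end{align}
```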
 
 For the example sub-circuits in Fig. 8.5, we have connected the power rail contacts at
the top, middle and bottom to HDPPs and characterized the thermal effects. The
configuration is depicted in Fig. 8.5. The average routing distance was assumed to be 10
nanowire pitches, which is half the width of an 8-bit carry look-ahead adder (CLA) layout
in Skybridge; the 8-bit CLA is representative of large-scale circuit design. Simulation
results are shown in Fig. 8.6. Clearly, for scenario (i), large-area power pillars have a
huge impact on heat dissipation, since they provide extra heat conduction paths to the
reference temperature other than the silicon nanowire; the temperature reduces to 2433K
in scenario (i), which is a 43% reduction from 4307K. For scenarios (ii) and (iii) the
change in temperature is less pronounced, since the Gate contacts constitute the major
heat dissipation paths. Noticeably, the trend of temperature change across the
transistors is different in this case. The peak temperature at the top of the transistor
stack gradually decays towards the middle, where contacts are made to VDD pillars; there
is then a slight increase, and ultimately the temperature decays to the reference
temperature. In the middle of the nanowire, contacts to the VDD pillar provide a path of
lower thermal resistance, and as a result the temperature drops sharply; further down the
nanowire, away from the power rail contacts, the temperature increases slightly. These
results indicate that HDPPs play a prominent role in heat extraction from circuits. Based
on this understanding, we have added new architectural features to maximize heat
extraction from logic nanowire pillars and to dissipate it through HDPPs.

Fig. 8.5. Incorporation of Heat Dissipating Power Pillar (HDPP): an intrinsic feature of
the Skybridge fabric, mainly to facilitate heat extraction. HDPPs are connected to the
logic nanowire through Bridges at the top (GND) and the middle (VDD) of the nanowire.
HDPPs are configured (132nm x 132nm area, 4 silicided nanowire pillars, metal (W)
filling) to maximize heat dissipation.
 
 
Fig. 8.6. Impact of HDPPs on heat extraction: HDPPs provide a low-resistance path
to the reference temperature; as a result, the temperature profile drops sharply. In the
simulations with no gate extraction, the temperature of the topmost Eva transistor
decreases by 43%, from 4307K to 2433K; another sharp drop can be observed in the middle
of the nanowire, for the Eva transistor in the bottom stack, where the temperature drops
from 2909K to 828K, nearly 71%. The impact of HDPPs is less prominent in the cases
where heat is also dissipated through the gate contacts.
 
 
8.2.2 Heat Extraction Junctions (HEJs)  
 Heat Extraction Junctions (HEJs) are specialized junctions that are used solely for
heat extraction from a logic nanowire without perturbing its electrical operation. HEJs
facilitate heat transfer to Bridges and HDPPs. The heat extracting Bridge connects to an
HEJ on one side and to an HDPP (GND) pillar on the other; this ensures that the heat
extraction Bridges are initially at the reference temperature, which facilitates heat
transfer from the hot region towards the cool region. Fig. 8.7 illustrates this concept.
Al2O3 meets the material requirements for such an HEJ since it has excellent thermal
conductivity (39.18 Wm⁻¹K⁻¹ [20]) and is a good electrical insulator. The thickness of
Al2O3 was chosen to be 6nm, which is sufficient to prevent any electrostatic control from
Bridge contacts over the silicided silicon. The HEJs can be placed at any point on the
logic nanowire and can be connected with Bridges for heat extraction; this allows a
certain degree of freedom and enables custom design choices for hotspot mitigation.

Fig. 8.7. Heat Extraction Junctions (HEJs): HEJs for heat extraction and dissipation
through Bridges and an HDPP are shown. HEJs are placed at selected places in the logic
nanowire; they extract heat without perturbing the electrical signal. Al2O3 is used as the
Junction material for its excellent thermal conduction and electrical insulation.

 Fig. 8.8 shows simulation results that indicate the effectiveness of the HEJs when
combined with Bridges and HDPPs. Two conditions are illustrated: (a) one HEJ
connected to the Drain region of the topmost transistor in the logic nanowire, and (b) two
HEJs connected to the two most heated regions in the logic nanowire (the topmost
transistors of the two NAND gates). In these simulations, power rail contacts were
assumed to be connected to HDPPs in the same way as discussed in the previous
sub-section. The routing distances for Bridges were assumed to be 10 nanowire pitches.

Fig. 8.8. Impact of HEJs, Bridges and HDPPs on heat extraction. Two cases are
simulated: 1 HEJ and 2 HEJs per logic nanowire, connected to Bridges and HDPPs for
heat management. In the case of 2 HEJs per nanowire, they are connected to the two
output regions of the dynamic NAND gates. For the case with no heat dissipation through
the gate, the temperature decreases from 4307K to 400K in the topmost Eva transistor
when 1 HEJ is used, and from 2909K to 426K in the bottom Eva transistor for 2 HEJs.
Improvements are also observed when the gate electrode is at half of the maximum
temperature (1 HEJ: from 667K to 376K in the topmost Eva transistor; 2 HEJs: from
479K to 398K in the middle Eva transistor); when the gate electrode is at the reference
temperature, the temperature drops from 479K to 367K for 1 HEJ at the topmost Eva
transistor, and from 422K to 389K for 2 HEJs at the middle Eva transistor.

 As illustrated in Fig. 8.8, a radical improvement in the temperature profile is achieved
when all the fabric heat extraction features are active. Up to 90% reduction in
temperature is achieved when only one HEJ is used in the logic nanowire. For the
scenario with no heat extraction through gate contacts, the HEJ, Bridges and HDPPs
jointly reduce the temperature from 4307K to 400K in the topmost transistor, and the
average temperature drops from 2977K to 793K, a 73% reduction. The average
temperature reduces further, by 78%, when two HEJs are used in conjunction with Bridges
and HDPPs. Substantial improvements are also observed when gate contacts contribute to
heat dissipation. For the scenarios when gate contacts are at half of the maximum
temperature and at reference temperature, the average temperature reduces by 12% and
4.5% with one HEJ, and by 15.4% and 6.5% with two HEJs, respectively.
These results validate the effectiveness of Skybridge’s heat extraction features. The
simulation results indicate that even with 1 HEJ per logic nanowire, the average
temperature for the worst-case heat generation can be reduced to acceptable temperatures
below the breakdown limit of Junctionless transistors. These transistors were shown to
operate even at temperatures as high as 500K [47]. In addition, depending on design
requirements, the placement of HDPPs and the number of HEJs in circuits can be modified
to reduce the average temperature even further.
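
The percentage reductions quoted above follow directly from the before/after temperatures. The short sketch below recomputes two of them from the values given in the text; the function name is illustrative only and is not part of the thesis methodology.

```python
def percent_reduction(before_k: float, after_k: float) -> float:
    """Relative temperature reduction, in percent, between two Kelvin values."""
    return 100.0 * (before_k - after_k) / before_k


# Values quoted in the text for the case with no heat extraction through gate contacts.
topmost = percent_reduction(4307.0, 400.0)  # topmost transistor, 1 HEJ
average = percent_reduction(2977.0, 793.0)  # average temperature, 1 HEJ

print(f"topmost transistor: {topmost:.1f}% reduction")
print(f"average:            {average:.1f}% reduction")
```

The first value is consistent with the "up to 90%" figure and the second with the 73% average reduction.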
 
8.3 Section Summary
 In this section, thermal management details of the Skybridge fabric were presented.
Through transistor-level modeling we analyzed thermal profiles in Skybridge circuits, and
showed the effectiveness of Skybridge’s intrinsic heat extraction features. In the best
case, Skybridge features reduced the average temperature in 3-D circuits
by 78%.
  
 
9. CHAPTER 9 
ENVISIONED WAFER-SCALE MANUFACTURING PATHWAY 
   
 For more than the past two decades, CMOS technology scaling has been determined
mainly by the ability to shrink transistor channel lengths using UV lithography. However,
as transistors are scaled to sub-20 nm dimensions, lithographic aberrations are becoming a
major concern, along with fundamental performance limitations of ultra-scaled transistors.
Moreover, the CMOS fabric requires precise sizing and doping of complementary
transistors, and needs them to be placed and interconnected in a complex layout to meet
density, power and performance requirements, all of which add to the already stringent
requirements of lithography at nanoscale.
 In contrast to CMOS, Skybridge offers a paradigm shift in technology scaling: here
scaling is primarily achieved by 3-D integration and is no longer limited to shrinking
transistor dimensions alone. In this fabric, transistors are integrated vertically; 3-D circuit
implementation, connectivity and thermal management requirements are carefully
architected in the fabric to reduce manufacturing complexities. Lithographic precision in
Skybridge is required only for the uniform nanowire array pattern definition; transistor
channel length is determined by gate material deposition, which is lower cost and known
to be controllable to a precision of a few Angstroms.
 In addition, the manufacturing pathway for Skybridge is envisioned such that only a
single layer of crystalline silicon is used for vertical transistor channels, and the same
alignment markers are employed for all the mask registration steps; these alleviate the
challenges associated with the high temperature crystallization of amorphous silicon [4]
and inter-layer misalignments [3][5], which are critical for stacked CMOS
approaches [3][4][5].
 The manufacturing steps for Skybridge's bottom-up assembly include: wafer
preparation, active silicon layer doping, patterning of arrays of regular vertical nanowires,
Ohmic contact and formation of Bridges for power rail, planarization using
self-planarizing materials, spacer formation, interlayer dielectric deposition, Gate oxide and
Gate metal deposition using 3-D Photoresist structures, and formation of input-signal
carrying Bridges. Although these steps were demonstrated individually in the literature
and in our group [10], the overall integration has not yet been shown, and the process
itself can likely be refined further from what we show. Just as CMOS has been perfected
over several decades, Skybridge requirements could fuel new manufacturing research
and establish a roadmap with vertical integration. Material choices may be refined, and
other manufacturing-compatible device types, potentially based on spin,
could be employed.
 In Table 9.1, we show key manufacturing requirements and challenges for Skybridge
and compare them with both CMOS and stacked CMOS approaches.
 
 
 
Table 9.1. Manufacturing requirements and challenges: CMOS vs. Stacked CMOS vs. Skybridge

| Process step | CMOS: Requirements | CMOS: Challenges | Stacked CMOS [3][4][5]: Requirements | Stacked CMOS [3][4][5]: Challenges | Skybridge: Requirements | Skybridge: Challenges |
|---|---|---|---|---|---|---|
| Lithography | Determining factor for scaling; defines channel length, contact, interconnect, and via | Light source aberrations; variation prone; design rule explosion; costly | Same as CMOS | Same as CMOS | Precision only for nanowires; interconnect definition relaxed | Prone to variations during nanowire pattern definition |
| Doping | High precision for complementary dopings | Uniform doping difficulties across die | Same as CMOS | Same as CMOS | Doping required only once; single type uniform across the die | Maintaining uniformity at various depths |
| Patterning | Complex shapes: zigzag patterns and different dimensions | Increasing variation | Same as CMOS | Same as CMOS | High aspect ratio nanowires | Patterning dense, high aspect ratio nanowires |
| Deposition | Interconnect, Via material filling | Processing temperature in gate-first process | Same as CMOS | Same as CMOS | Transistor, contact, and interconnect definition | Such multi-layer deposition is not shown yet experimentally |
| 3-D Photoresist Structures | --- | --- | --- | --- | Used for selective deposition | Precision required for small feature sizes |
| Planarization | CMP after each deposition layer | Corrosions in metal; rigidity | Same as CMOS | Same as CMOS | Etch-back or novel material [54] | Relatively new process |
| Alignment and Registration | Layer-by-layer alignment, and registration offset at different layers | Litho-precision dependent | Same as CMOS | Same as CMOS | Same alignment and registration across all layers | Lithography dependent; new Marker design |
| Thermal Annealing | --- | --- | For crystallizing each deposited Silicon layer [4] | High temperature affects material structures | --- | --- |
| Through Silicon Vias | --- | --- | Coarse grain [3] die-die TSVs; fine grain layer-layer TSVs [4] | Misalignment; uniform material filling; relatively new process | --- | --- |
| Thinning and Bonding | --- | --- | Processed Wafer/Die thinning for bonding | Die-bond issues [3]; stress in Dies, crack formation, misalignment | --- | --- |
  
 
9.1 Envisioned Wafer-scale Manufacturing Pathway 
 This section details the envisioned manufacturing pathway for Skybridge fabric, and 
presents how established processes can be engineered towards meeting its requirements.  
9.1.1 Starting Wafer  
 The starting wafer is a customized highly doped silicon wafer. As shown in Fig. 9.1A,
at the bottom of the wafer is bulk silicon, which can be connected to the package heat
sink through backside metallization and a bonding substrate; on top of the bulk silicon are
islands of SiO2, which electrically isolate the silicon nanowire
pillars from the bulk; a layer of crystalline silicon is deposited on top and doped
(concentration ~10^19 dopants/cm^3; see Chapter 3.1 for doping requirements), which
completes the wafer preparation process. Notably, doping is required only once, prior
to any processing steps.
9.1.2 Nanowire Patterning  
 Patterning of arrays of high aspect ratio vertical nanowires is the next step in the
manufacturing flow. All the nanowires have similar aspect ratios, and they maintain
uniform distances between each other. The nanowire patterning is done such that
alternate nanowires are patterned on top of horizontal SiO2 islands, and a group of
nanowires is patterned on top of vertical SiO2 islands at sparse intervals (Fig. 9.1B). This
is done to isolate input/output signal carrying pillars (through horizontal SiO2 islands)
and large area VDD signal carrying pillars (through sparse vertical SiO2 islands) from
shorting to the bulk silicon and creating undesired latch-up conditions.
 
 High aspect ratio uniform vertical nanowires with smooth surfaces can be achieved
through different processes such as patterning with oxidation and etch back
technique [50], Inductively Coupled Plasma (ICP) etching [49], etc. Yang et al. in [50]
have demonstrated 20nm wide, 1µm tall (1:50) nanowires using oxidation and etch back
techniques, while in [49], Mirza et al. demonstrated nanowires of various widths
ranging from 30nm to 5nm with very high aspect ratios, the highest aspect ratio being
1:50. In addition, these nanowires were shown to withstand processing conditions for
Gate-All-Around (GAA) vertical transistor formation [50].
 For the circuits described in this paper, the nanowire aspect ratio was 1:54 (16nm
width, 868nm height), accommodating two 8 fan-in logic gates in each nanowire.
Although this configuration was assumed for benchmarking purposes, it is not a
requirement and other aspect ratios can be supported. For example, reducing either the
number of gates per vertical nanowire or the fan-in per gate can reduce
aspect ratio requirements. An aspect ratio of 1:28 allows a single high fan-in gate to be
vertically integrated. This would keep performance and power benefits similar
to what was presented (since the underlying design is identical and increased local
interconnections have a minimal impact, Chapter 7). Density benefits are expected to
scale close to linearly with nanowire aspect ratios: a fabric with a 1:28 ratio would have a
2X lower density vs. our 1:54 benchmarked design. Nevertheless, it would still have
considerable die area benefits vs. CMOS.

Fig. 9.1. Starting wafer and nanowire patterning. A) Bulk silicon wafer with SiO2
islands and doped silicon layer on top; B) high aspect ratio nanowire patterning with
lithography; signal-nanowire pillars are isolated from bulk silicon by SiO2 islands,
whereas logic-nanowires connect directly with the bottom bulk.

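
The aspect-ratio arithmetic above can be checked with a short sketch. It assumes, as the text states, that density scales approximately linearly with aspect ratio; the function names are illustrative and not taken from the thesis.

```python
def aspect_ratio(width_nm: float, height_nm: float) -> float:
    """Height-to-width ratio of a vertical nanowire (the N in '1:N')."""
    return height_nm / width_nm


def relative_density(ratio: float, reference_ratio: float) -> float:
    """Density relative to a reference design, assuming linear scaling with aspect ratio."""
    return ratio / reference_ratio


benchmark = aspect_ratio(16.0, 868.0)         # benchmarked design, quoted as 1:54
reduced = relative_density(28.0, benchmark)   # 1:28 fabric relative to the benchmark

print(f"benchmark aspect ratio: 1:{benchmark:.2f}")
print(f"relative density of a 1:28 fabric: {reduced:.2f}x")
```

Under this linear assumption a 1:28 fabric retains roughly half the density of the benchmarked design, in line with the 2X figure in the text.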
9.1.3 Contact Formation 
 Nanowire patterning is followed by a contact formation step for connecting the
nanowire with the power rail at the bottom. Ohmic contacts at different heights are also
formed for input/output and power rail (VDD, GND) connections. To make an
Ohmic contact, first a region surrounding the nanowire is exposed using UV lithography
(Fig. 9.2A-B); the region of exposure is determined by the minimum material dimension
requirements for the Ohmic contact. Ti, a widely used material for Ohmic contacts to
heavily-doped n-silicon, is chosen for this purpose. The required Ti thickness and length
are derived from 3-D TCAD simulations (see Chapter 3.1). The UV exposure step is
followed by anisotropic Ti deposition (i.e., no step coverage on the side of the nanowire,
see Fig. 9.2C). Next, a layer of sacrificial polymer [57] is deposited or spun on top of the Ti
layer, followed by a Lift-Off process (Fig. 9.2D). During Lift-Off, the Photoresist is
removed along with the material deposited on top.

Fig. 9.2. Contact formation: Ohmic contact for power rail (i.e., GND) is made using A)
photoresist spinning and UV exposure; B) Photoresist development in developer
solution; C) Ti deposition for Ohmic contact, followed by sacrificial polymer deposition;
D) metal Lift-Off.

9.1.4  VDD/GND/Output Signal Carrying Bridges 
 In Skybridge, signals are carried from one nanowire to another through Bridges. 
Bridges may be of different lengths and may be placed at different heights as per the 
circuit requirements. The manufacturing flow for these Bridges differs depending on their 
placement (e.g., input signal carrying Bridges connect to transistor gates while 
output/power signal Bridges connect to logic gate output/power rail contacts).  
 Fig. 9.3 shows the manufacturing steps required to form Bridges that connect to
Ohmic contacts. After Photoresist spinning, the lithographic pattern for interconnection is
created by UV exposure (Fig. 9.3A) and resist development (Fig. 9.3B). Notably, the
exposure is such that it overlaps previously created Ohmic contacts (Fig. 9.3D) by a small
portion; this is done to ensure proper metal-metal contact. After exposure and Photoresist
development, Tungsten (W) is deposited anisotropically (Fig. 9.3C) using CVD [52].
Tungsten has excellent electrical and thermal properties, and is widely used in industry
today as Metal1 and Via filling material. This step is followed by a Lift-Off process
(Fig. 9.3D) and a polymer removal step (Fig. 9.3E), removing excess material.
9.1.5 Planarization, Interlayer Dielectric Deposition 
 Planarization after depositions is an important step since non-planar surfaces cause
lithographic focus imbalance and alignment errors, which can easily distort
printed features. Planarization with chemical mechanical polishing is
avoided in this Skybridge manufacturing flow to prevent structural damage to standing
single crystal vertical nanowires. Alternative planarization techniques such as etch back
planarization [55] or self-planarization materials [54] can potentially be used to achieve the
same purpose. In this manufacturing flow we describe the usage of self-planarization
materials. These are special materials that planarize themselves regardless of the
underlying topology. For example, Fig. 9.4A shows the resultant planarized surface after
a self-planarization material is applied; the top surface is planar and smooth even though
there is variation in height in the underlying features. This step is followed by spacer
(Fig. 9.4B) and interlayer dielectric (C-SiO2, dielectric constant 2.2 [56]) deposition
(Fig. 9.4C). After these steps, the surface is expected to be planarized as shown in
Fig. 9.4D.

Fig. 9.3.  Formation of VDD/GND/Output signal carrying Bridges: A) Photoresist
spinning and UV exposure to define regions for Bridges; B) Photoresist development;
C) anisotropic deposition of Tungsten (W) using CVD; D) W Lift-Off; E) sacrificial
polymer removal to get rid of excess metal.

9.1.6 Gate Stack Deposition 
 Gate stack deposition involves steps for Gate oxide and Gate electrode deposition.
Both deposition steps use the same lithographically defined pattern. Two types of
Photoresists are used in this step: a standard resist (e.g., PMMA) that dissolves easily in
developer solution, and a lower resolution resist (e.g., Lift-Off Resist (LOR) [58]) that
dissolves slowly in the same developer solution. The idea is to create 3-D shapes using
these Photoresists to selectively deposit Gate stack materials. First, a 16 nm
thick layer of standard Photoresist (as required by the 16-nm J-GAA transistor channel
length) is spun, followed by UV exposure (Fig. 9.5A-B) to create the desired pattern for
selective deposition. Next, a thicker layer of low resolution Photoresist is spun on top
(Fig. 9.5C) and UV exposure is done (Fig. 9.5D). During the subsequent Photoresist
development step (Fig. 9.5E), the standard resist develops faster than the other, and by
controlling the resist development time, 3-D Photoresist shapes can be formed. After
creating 3-D structures with Photoresist, the Gate stack is deposited. HfO2 is deposited
(Fig. 9.5F) using Atomic Layer Deposition (ALD); in this step, HfO2 deposits only on the
uncovered Si surface. TiN is deposited next (Fig. 9.5G), anisotropically, using CVD [51].
The gate stack material choices are specific to J-GAA devices, and are derived from 3-D
TCAD simulations (see Chapter 3.1). The last step in this process is Lift-Off (Fig. 9.5H) to
remove the excess material on top of the Photoresist.

Fig. 9.4. Planarization and interlayer dielectric deposition: A) Self-planarization
material deposition to planarize surface; B) Spacer deposition using UV exposure (like
Fig. S2); C) ILD (i.e., C-SiO2) deposition (like Fig. S3); D) Self-planarization material
deposition.

Fig. 9.5.  Gate stack deposition: A) Photoresist spinning and UV exposure; B) resist
development; C) low resolution Lift-Off Resist deposition; D) second UV exposure; E)
controlled resist development to remove first Photoresist; F) HfO2 deposition using
ALD; G) TiN deposition using CVD; H) metal Lift-Off.

9.1.7  Input Signal Carrying Bridges 
 Manufacturing steps for input signal carrying Bridges begin with Photoresist spinning
and lithographic exposure (Fig. 9.6A-B). Next, TiN from the exposed region is etched
away using dry etch (Fig. 9.6C) with Photoresist as the etch-mask. Afterwards, Tungsten (W)
is deposited anisotropically on the exposed region (Fig. 9.6D). This step is followed by a
W Lift-Off process (Fig. 9.6E).
 Other Bridge structures, such as Heat Extraction Bridges and routing Bridges, follow a
similar methodology for fabrication.

Fig. 9.6.  Formation of input signal carrying bridges: A) Photoresist spinning and
UV exposure for Bridges; B) Photoresist development; C) TiN dry etch using
photoresist as etch-mask; D) anisotropic deposition of W using CVD; E) metal Lift-Off.

9.1.8 Alignment  
 Maintaining alignment precision across multiple layers of processing is a critical
requirement, and the approach is different from the CMOS alignment methodology. In
CMOS, new alignment markers are created after each layer of processing; these new
markers are larger in dimensions compared to previous ones to accommodate Mask
Registration offset. In contrast, the same alignment markers can be used in all layers of
processing for Skybridge; they are created at the very first step, during nanowire
patterning. Different Mask Registrations with respect to the same alignment markers allow
features to be built with the same alignment precision across multiple layers. The approach
is illustrated in Fig. 9.7, where alignment markers on the periphery of a die are shown to
have the same height as the nanowires. This alignment methodology is unique to
Skybridge, and is enabled by the aforementioned manufacturing flow, which does not
require mechanical planarization processes.
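
The difference between the two alignment schemes can be illustrated with a simple worst-case model. This is a sketch under stated assumptions and not an analysis from the thesis: it assumes each registration step contributes at most a fixed offset, that offsets add in the worst case when each layer registers to markers created in the previous layer, and that the per-step offset of 2 nm is a purely hypothetical value.

```python
def chained_worst_case_offset_nm(per_step_offset_nm: float, layer: int) -> float:
    """Worst-case offset of `layer` relative to the first pattern when each layer
    registers to markers made in the previous layer (offsets add up)."""
    return per_step_offset_nm * layer


def common_marker_worst_case_offset_nm(per_step_offset_nm: float, layer: int) -> float:
    """Worst-case offset of `layer` relative to the first pattern when every layer
    registers to the same original markers (independent of layer index)."""
    return per_step_offset_nm


PER_STEP_OFFSET_NM = 2.0  # hypothetical value, for illustration only

for layer in (1, 5, 10):
    chained = chained_worst_case_offset_nm(PER_STEP_OFFSET_NM, layer)
    common = common_marker_worst_case_offset_nm(PER_STEP_OFFSET_NM, layer)
    print(f"layer {layer:2d}: chained {chained:4.1f} nm, common markers {common:3.1f} nm")
```

In this simplified model the offset relative to the nanowire pattern grows with the layer count only in the chained scheme, which is the intuition behind reusing the markers created during nanowire patterning.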
9.2 Section Summary 
 In this section the envisioned manufacturing pathway for the Skybridge fabric was
detailed. We presented material requirements for the devices, contacts, interconnects and
interlayer dielectric, and discussed their usage in established process technologies. We
showed a step-by-step manufacturing pathway including wafer preparation, nanowire
patterning, contact formation, planarization, spacer formation, interlayer dielectric
deposition, and gate stack deposition. The contrast with CMOS manufacturing was also
elaborated.

Fig. 9.7. Alignment. Skybridge alignment step using the same alignment markers for
Mask Registration across all layers of processing.
  
 
10. CHAPTER 10 
EXPERIMENTAL PROTOTYPING  
  
 In order to validate the core device concept and to demonstrate key manufacturing
steps, we have carried out experimental prototyping in a clean room. This work involved
co-exploration of process/device simulations and experimental metrology to optimize
process steps. Initial process parameters were derived by emulating the actual process
flow in simulations; SRIM and the Synopsys Sentaurus Process and Device simulators
were used for this purpose. Direct patterning with Electron-beam lithography (EBL) was
used for experimental prototyping.
 Significant progress was made in the experimental prototyping direction. We have
successfully fabricated nanostructures below 30nm dimensions, demonstrated key
process steps for Skybridge assembly such as substrate doping and nanowire patterning,
photoresist planarization, anisotropic deposition, interlayer dielectric planarization,
multi-layer alignment and depositions, and have validated the Junctionless device concept.
The fabricated horizontal tri-gated p-type Junctionless device was shown to have good
Id-Vg characteristics: the ON current was found to be 1.5µA/µm, the ION/IOFF was ~10^3,
and the Vth was -0.3V.
10.1  Experimental Validation of Horizontal Junctionless Nanowire Transistor 
10.1.1 Process and Device Simulations 
 A combination of three simulation tools (SRIM, Synopsys Sentaurus Process and
Synopsys Sentaurus Device) was used to simulate process and device characteristics.
SRIM (Stopping and Range of Ions in Matter) [59] was used to extract ion implantation
parameters, Sentaurus Process [11] was used to create device structures emulating the
actual process flow, and Sentaurus Device [12] was used to simulate carrier transport in
these device structures. These simulations provided realistic insight into the implications
of materials, and process and device parameter choices for fabric prototyping.
 Since Junctionless device behavior is modulated by the workfunction difference
between the channel and the gate, the nanoscale dimension of the channel is fundamental
for its operation. In the V-GAA Junctionless transistor, maximum gate-to-channel
electrostatic control is achieved through the surround gate structure and the 16nm
diameter vertical nanowire channel. To achieve similar device operation in 2-D, we have
used an SOI wafer, and the top device silicon layer was thinned to 15nm. The buried Oxide
layer in the SOI wafer ensured that there were no leakage paths, and that maximum gate
control was achieved over the horizontal nanowire channel.
 
Fig. 10.1. Ion Implantation simulations. A) SRIM simulation plot showing ion
distribution in SOI wafer for 28 keV implant; B) Sentaurus Process simulation plot
showing ion distribution in SOI wafer before and after thermal annealing at 1000° C.
 
 The same SOI wafer configuration was used in Process and Device simulations. The
SOI wafer had a 100nm thick top device layer (Si), a 378nm middle buried oxide (SiO2)
layer and a 500µm bottom handle layer (Si). Fig. 10.1A shows the Ion (B+) distribution plot
obtained from SRIM on this SOI wafer. The acceleration voltage (28 keV) used in SRIM
simulations, obtained from the stopping range table for Boron dopants and silicon substrate,
was chosen such that the bottom 20nm of the top Si layer had maximum doping
concentration. In order to identify the annealing temperature for substrate
recrystallization and to create the device structure for simulations with realistic process
assumptions, ion implantation parameters (acceleration voltage 28 keV, implant dosage
1e14 atoms/cm^2) obtained from SRIM were used in Sentaurus Process simulation to
emulate the implantation step. Several process conditions were simulated to identify
parameters for implant annealing. Substrate annealing at 1000° C for 60 minutes in N2
ambient was found to be adequate for substrate recrystallization, diffusion and activation
of dopants. Fig. 10.1B shows uniform dopant distribution in the top silicon layer after
annealing. The ion implantation process was modeled using the Monte Carlo (TRIM)
simulation model. Diffusion and activation processes were modeled using the Charged
Cluster model [11].
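
As a rough consistency check on the implant parameters above, the average dopant concentration can be estimated from the dose and the device layer thickness. The sketch below is a back-of-the-envelope estimate, not a substitute for the SRIM or Sentaurus simulations: it assumes that all implanted dopants stay in the 100nm device layer and redistribute uniformly after annealing, and the function name is illustrative.

```python
import math

NM_TO_CM = 1e-7  # 1 nm = 1e-7 cm


def average_concentration_per_cm3(dose_per_cm2: float, layer_thickness_nm: float) -> float:
    """Average volume concentration if the whole dose is spread uniformly over the layer."""
    return dose_per_cm2 / (layer_thickness_nm * NM_TO_CM)


# Implant dose and device layer thickness as quoted in the text.
avg = average_concentration_per_cm3(1e14, 100.0)
print(f"average concentration: {avg:.2e} dopants/cm^3")
```

Under these assumptions the estimate is on the order of 1e19 dopants/cm^3, the same order as the doping concentration used in the device simulations.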
 The doped substrate was then used to create horizontal tri-gated junctionless
nanowire FET device structures in Sentaurus Process. The device creation process
involved the following steps, which are very similar to the experimental process flow: i)
substrate thinning from 100nm to 15nm, ii) nanowire patterning, iii) masking to define
the gate region, iv) HfO2 gate oxide deposition, v) gate material (Ti) deposition, and vi) Al
source and drain contact formation. Fig. 10.2A shows the device structure obtained from
Sentaurus Process emulating these process steps.
 The device structure was then used to simulate electrical properties of the junctionless
nanowire transistor using the Sentaurus Device simulator. Carrier transport was modeled
using the Hydrodynamic charge transport model with density gradient quantum
corrections [12] to take into account quantum effects at nanoscale. Secondary scattering
effects were also taken into account. Simulations were done for various device
configurations; Gate Oxide, channel width and channel length were varied; doping
concentration and channel thickness were kept the same at 1e19 dopants/cm^3 and 15nm
respectively. Fig. 10.2B shows Id-Vgs characteristics for different gate oxides; 1nm HfO2
shows superior characteristics with Ion/Ioff ~ 10^7 compared to 3nm SiO2, 1nm SiO2 and
3nm HfO2, which is primarily due to the stronger electric field resulting from the thinner
HfO2 high-k dielectric. Fig. 10.2C and Fig. 10.2D show simulated Id-Vgs characteristics for
different channel widths and channel lengths. Clearly, nanowire FETs with narrower
channels and longer gate lengths show better characteristics (ION ~ 30uA, IOFF ~ 5pA) due
to stronger electrostatic control of the metal gate over the channel. These simulation results
provide a premise for expected junctionless nanowire FET behavior, as well as initial
process parameters for device fabrication.

Fig. 10.2. Process and Device simulation results. A) Horizontal tri-gated junctionless
nanowire FET device structure from Sentaurus Process, B) Id-Vgs curve showing
variations due to gate oxide choice, C) Id-Vgs curve showing impact of nanowire
channel width, D) Id-Vgs showing the effect of gate length on drain current.

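
The ON/OFF ratio implied by the quoted simulated currents can be computed directly. The sketch below uses only the approximate ION and IOFF values given in the text; the function name is illustrative.

```python
import math


def on_off_ratio(i_on_a: float, i_off_a: float) -> float:
    """Ratio of ON current to OFF current (both in amperes)."""
    return i_on_a / i_off_a


# Approximate simulated values quoted in the text: ION ~ 30 uA, IOFF ~ 5 pA.
ratio = on_off_ratio(30e-6, 5e-12)
decades = math.log10(ratio)

print(f"Ion/Ioff = {ratio:.1e} ({decades:.2f} decades)")
```

The result is a ratio of several million, i.e. close to seven decades, which is of the same order as the ~10^7 figure reported for the 1nm HfO2 gate oxide.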
10.1.2  Experimental Process Flow 
 An end-to-end process flow for device fabrication was developed and individual steps
were optimized. This experimental pathway was based on direct patterning of silicon
nanowires from Silicon-on-Insulator (SOI) substrates using Electron-Beam Lithography
(EBL). The prototyping approach used is shown schematically in Fig. 10.3. The starting
material is an SOI wafer (Fig. 10.3A) where the top device layer is doped with p+
dopants. The ion implantation and annealing steps for uniform doping of the Si device
layer were carried out using simulated process parameters (Acceleration voltage: 28 keV,
Surface dosage: 1e14 dopants/cm^2, Implant tilt: 7 degrees, Annealing Temperature: 1000° C,
Annealing Duration: 60min, Annealing Ambient: N2). The implantation was such that
initially the bottom 20nm of the top Si layer had maximum doping concentration, on the
order of 1e19 dopants/cm^3 (Fig. 10.3B). The substrate was thinned down to 15nm with
anisotropic RIE using an SF6+CHF3 etch recipe (Fig. 10.3C). Using EBL and PMMA resist,
contact pads and alignment markers were patterned, followed by Ti (5nm) and
Au (25nm) deposition using an E-beam Evaporator (Fig. 10.3D). Using these alignment
markers, sub-30nm nanowire features were patterned in between contact pad extensions,
followed by Ni evaporation and liftoff steps to define Ni features on top of the
substrate (Fig. 10.3E). The Ni features acted as an etch mask for defining nanowires on
the SOI. Anisotropic RIE using an SF6 + CHF3 mixture was then used to etch the
surrounding Si, followed by Piranha (3:1 H2SO4:H2O2) treatment to remove the Ni etch
mask. This resulted in Silicon nanowires directly patterned on the SOI substrate
(Fig. 10.3F). Nanowires with widths of 30nm, 20nm and 15nm were demonstrated
using this approach. Atomic layer deposition was used for Hafnium oxide
(HfO2) deposition (Fig. 10.3G), followed by alignment, patterning, evaporation and
liftoff to define the metal gate (Fig. 10.3H). Material selection and thickness parameters for
gate oxide and gate metal were as derived from process and device simulations.

Fig. 10.3. Experimental process flow. A) SOI wafer as starting wafer; 100nm Si
device layer (top), 378nm buried Oxide layer (middle), and 500µm Si handle layer
(bottom). B) Ion implantation and annealing. C) Substrate thinning to 15nm using RIE.
D) Contact pad and alignment marker formation. E) Patterning of Nickel feature. F)
Nanowire pattern transfer. G) ALD HfO2 deposition. H) Gate formation and gate
material depositions.

10.1.3  Device Characterization Results 
 Extensive metrology was done after each process step to verify expected results. Four
point probe measurements were carried out to determine the doping concentration in the
Silicon substrate after ion implantation; it was found to be ~8 x 10^18 dopants/cm^3, which
was almost equal to the expected concentration (10^19 dopants/cm^3). Atomic Force
Microscopy (AFM) measurements were done to determine surface roughness and Silicon
thickness after the substrate thinning and pattern transfer steps. Substrate thinning and
nanowire patterning results are shown in Fig. 10.4A and Fig. 10.4B. As shown in Fig. 10.4A,
the thinned Si substrate had less than 1nm of surface roughness variation after anisotropic
etching of the top SOI layer from 100nm to 15nm. Fig. 10.4B shows an AFM image of a 15nm
thick patterned Silicon nanowire on top of the SiO2 substrate.
 I-V measurements were carried out on individual junctionless nanowire FETs to
characterize electrical properties. To determine the ON current and contact resistivity
in junctionless FETs, two point probe I-V measurements were done on nanowire
channels, which were patterned in between source and drain contacts. Excellent Ohmic
behavior was achieved from Source/Drain contacts (contact metal stack: 5nm Ti + 30nm
Au) since the underlying substrate was heavily doped. Fig. 10.4C shows I-V
characteristics of heavily doped nanowires with Source/Drain contacts; the gate voltage
was varied from -10V to +10V and a linear increase in current was observed. Ellipsometry
measurements were done to determine HfO2 thickness after atomic layer deposition at
150° C. We were able to deposit and measure HfO2 films down to 1nm, and the thickness
was found to be uniform across the die.
 Three point probe measurements were done on junctionless nanowire FETs.
The fabricated devices had a 30nm wide and 15nm thick nanowire channel,
2nm thick HfO2 gate dielectric, 200nm long gate and 50nm thick gate metal stack. A
stack of a 30nm thick Titanium layer and a 20nm thick Gold layer served as the gate metal
stack. Fig. 10.4D shows Id-Vgs characteristics of p-type junctionless nanowire FETs when a
metal gate stack was put on top of the silicon nanowire channel. The Id-Vgs characteristics in
Fig. 10.4D accurately depict junctionless device behavior, where the workfunction
difference between the Titanium/Au gate and the P+ doped Silicon nanowire channel depletes
the channel and the device is normally OFF at 0V Vgs. With the application of negative
gate voltages (Vgs < Vth), the carriers accumulated and the channel conduction was
maximum. These devices had an Ion/Ioff ~ 1000 and threshold voltage ~ -0.3 V.
Characterization was done using the Keithley 4200 parametric analyzer and Wentworth
probe station.

Fig. 10.4. Experimental results. A) AFM results: less than 1nm surface roughness after
RIE thinning, B) 15nm thick Si nanowire on top of SiO2 substrate. C) I-V measurements
of nanowire channel showing linear increase in current for wide range of voltages. D)
Id-Vgs characteristics of fabricated p-type junctionless xnwFET; the device is normally OFF
at 0Vgs and turns ON fully at -1Vgs.

10.2 Experimental Demonstration of Skybridge’s Key Manufacturing Steps 
We have experimentally demonstrated key steps necessary for Skybridge’s
assembly. These demonstrations, along with the Junctionless device validation, further
support the feasibility of realizing the Skybridge fabric.
10.2.1 Formation of Vertical Nanowires 
We have demonstrated high aspect ratio vertical nanowires. Both isolated nanowires 
and nanowire arrays of different height and width were fabricated. Similar to the process 
steps described in Section 10.1.2, a metal etch mask was used and deep RIE etching was 
done to form these nanowires. An optimized etch recipe was used that had intermediate 
surface passivation stages. A combination of three gases (SF6, CHF3, and Ar) was used 
for etching and surface smoothening, while O2 was used in interleaved stages for surface 
passivation. Fig. 10.5 shows vertical nanowire fabrication results. A range of nanowires 
with different height and width were fabricated. Fig. ‎10.5A shows 360nm tall nanowires 
of different width, the smallest width being 26nm at the top. Fig. 10.5B shows a nanowire array with 
each nanowire having 1100nm height and 197nm mostly uniform width. The nanowire 
width can be further reduced to achieve higher aspect ratios by oxidation and removal 
techniques similar to the ones presented in ‎[50].    
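
For reference, the aspect ratios implied by the dimensions in the Fig. 10.5 caption can be worked out directly. The short sketch below only divides height by width; the helper name and the dictionary labels are illustrative, and the dimensions are the ones quoted in the caption.

```python
def aspect_ratio(height_nm, width_nm):
    """Height-to-width ratio of a vertical nanowire."""
    return height_nm / width_nm

# (height, width) in nm, taken from the Fig. 10.5 caption.
wires = {
    "Fig. 10.5A, 26nm top width": (360, 26),
    "Fig. 10.5A inset, 55nm bottom width": (360, 55),
    "Fig. 10.5B array, 197nm width": (1100, 197),
}

for label, (height_nm, width_nm) in wires.items():
    print(f"{label}: {aspect_ratio(height_nm, width_nm):.1f}:1")
```

Measured at the top width, the narrowest 360nm wire is roughly 14:1, while the array wires are closer to 6:1, which is why further width reduction by oxidation and removal is of interest.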
10.2.2 Photoresist Planarization, Alignment and Deposition 
 Photoresist planarization is a key step in Skybridge assembly. Spinning a thin layer of 
photoresist on a substrate with existing high aspect ratio features usually results in 
non-uniformities due to surface tension of the liquid. The non-uniformities in the photoresist layer 
(Fig. ‎10.6A) are detrimental to exposure/writing steps. To overcome this challenge and to 
planarize photoresist layer, we have developed a technique using photoresist over-fill and 
etch-back. During the over-fill process, several layers of photoresist were coated to 
completely cover the nanowire features. Subsequently, photoresist was etched back using 
an optimized recipe with O2 plasma to obtain a thin planarized photoresist layer at the 
bottom of nanowires (Fig. 10.6B).  

Fig. 10.5. Vertical Nanowire Patterning. A) 360nm tall vertical nanowires with 
varying widths (26nm-250nm). Inset shows 360nm tall nanowire with 26nm top width 
and 55nm bottom width. B) Nanowire Array: 1100nm height, 197nm mostly uniform 
width, 2µm spacing. 

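
The over-fill and etch-back sequence can be planned with simple arithmetic, sketched below. Apart from the 360nm feature height (Fig. 10.5A), every number here is a hypothetical placeholder: the per-coat thickness, margin, target thickness and O2 plasma etch rate are not given in this work, and the sketch assumes that coats add linearly and that the etch rate is constant.

```python
import math

def overfill_coats(feature_height_nm, per_coat_nm, margin_nm=0.0):
    """Number of spin coats needed to cover features of the given height plus a margin."""
    if per_coat_nm <= 0:
        raise ValueError("per_coat_nm must be positive")
    return math.ceil((feature_height_nm + margin_nm) / per_coat_nm)

def etch_back_time_s(coated_nm, target_nm, rate_nm_per_s):
    """Etch time needed to thin a resist layer from coated_nm down to target_nm."""
    if rate_nm_per_s <= 0:
        raise ValueError("rate_nm_per_s must be positive")
    if target_nm > coated_nm:
        raise ValueError("target thickness exceeds coated thickness")
    return (coated_nm - target_nm) / rate_nm_per_s

# Hypothetical recipe values (assumptions, not from the experiments).
PER_COAT_NM = 150.0      # resist thickness added per coat
MARGIN_NM = 100.0        # extra resist above the nanowire tips
TARGET_NM = 50.0         # desired planarized layer at the nanowire base
RATE_NM_PER_S = 2.0      # assumed O2 plasma etch rate

coats = overfill_coats(360.0, PER_COAT_NM, MARGIN_NM)
coated_nm = coats * PER_COAT_NM
time_s = etch_back_time_s(coated_nm, TARGET_NM, RATE_NM_PER_S)
print(f"{coats} coats, {coated_nm:.0f}nm coated, etch back for {time_s:.0f}s")
```

In practice the etch rate would be calibrated on a blank wafer first, since the remaining thickness at the nanowire base is set entirely by the etch time.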
 After photoresist planarization, E-beam exposure was done selectively on nanowire 
surrounding regions to deposit materials for source/drain contact formation. E-beam 
alignment and exposure were done following the same alignment methodology described 
in Section 10.1.2. After E-beam exposure and photoresist development, contact material 
(Ti) was deposited using an E-beam evaporator. Fig. 10.7A shows an example of selective 
anisotropic material deposition following aforementioned steps.  
10.2.3 Interlayer Dielectric Deposition and Planarization 
 Interlayer dielectric provides isolation between electrical components, and is 
essential in nanofabrication processes. Both self-planarizing low-k materials and 
low-k oxides can be used for this purpose. For our experiments, we used SU-8 as the 
self-planarizing interlayer dielectric material. Similar to the photoresist planarization process 
discussed earlier, SU-8 was overfilled and etched back to obtain a planarized interlayer. 
 
Fig. ‎10.6. Photoresist Planarization. A) Non-uniformity after photoresist spinning, 
B)After over-fill and etch-back; planarized photoresist layer at the bottom of the 
nanowires. 
 
SU-8 has self-planarizing capabilities; once the vertical nanowires are covered with SU-
8, the top layer planarizes itself. SU-8 is also suitable for our experiments for its 
structural rigidity; once hardened, SU-8 is very difficult to remove with wet etchants, and 
remains unperturbed throughout subsequent processing steps. SU-8 can be hardened both 
by over-baking and plasma exposure. Fig. ‎10.7B demonstrates application of SU-8 as 
interlayer dielectric. 
 
Fig. ‎10.7. Demonstration of Material Depositions. A) Anisotropic material deposition 
only at the bottom of nanowires for contact formation; these depositions are selective 
and done after E-beam alignment and exposure steps. B) After interlayer dielectric 
deposition; SU-8 is used as the self-planarizing interlayer dielectric material. It was 
overfilled and etched-back to achieve desired thickness. C) Demonstration of multi-layer 
selective material deposition; two contact regions are formed with SU-8 in-between. 
 
10.2.4 Multi-layer Material Deposition  
 Following the aforementioned steps, and using the same set of alignment markers, E-beam 
exposure and deposition can be done to develop a multi-layer material stack as shown in 
Fig. 10.7C. Similar process steps with controlled etching can also be used for gate-oxide 
deposition.  
10.3  Section Summary 
 In this section, the experimental prototyping progress was shown. A Process/Device 
simulation framework was developed to determine process parameters and to understand 
implications of material choices on device characteristics. The junctionless device 
concept was validated, and key manufacturing steps essential for Skybridge assembly 
were demonstrated experimentally.  
  
BIBLIOGRAPHY 
 
 
[1] Fischetti, M. V., et al. Scaling MOSFETs to 10 nm: Coulomb Effects, Source 
Starvation, and Virtual Source. International Workshop on Computational 
Electronics. 1. 2009 
[2] Puri, R. & Kung, D.S. The dawn of 22nm era: Design and CAD challenges. 
Proceedings of 23rd International Conference on VLSI Design. 429-433. (2010) 
[3] Black, B., et al. Die Stacking (3D) Microarchitecture. 39th Annual IEEE/ACM 
International Symposium on Microarchitecture. 469-479 (2006) 
[4] Batude, P., et al. Advances in 3D CMOS sequential integration. IEEE 
International Electron Devices Meeting. 1.7-9 (2009) 
[5] Farrens, S. Wafer-Bonding Technologies and Strategies for 3D ICs. Wafer Level 
3-D ICs Process Technology. 49–85. (Springer, New York, 2008) 
[6] Rahman, M., Khasanvis, S., Shi, J. J., Li, M. Y & Andras, C. A. Skybridge: 3-D 
Integrated Circuit Technology Alternative to CMOS. Nature. Under Review. 
(2014) 
[7] Abu-Rahman, M. H. & Anis, M.  Variability in Nanometer Technologies and 
Impact on SRAM. Nanometer Variation-Tolerant SRAM. (Springer, New York, 
2013) 
[8] Greenway, R. T., et al. Interference assisted lithography for patterning of 1D 
gridded design.  Proceedings of SPIE. 7271. (2009) 
[9] Plummer, J. D. Silicon MOSFETs (conventional and non-traditional) at the 
scaling  limit. Device Research Conference. 3-6 (2000) 
[10] Rahman, M., Narayanan, P, Khasanvis, S., Nicholson, J. & Moritz, C. A. 
Experimental Prototyping of Beyond-CMOS Nanowire Computing Fabrics. 
Proceedings of IEEE/ACM International Symposium on Nanoscale Architectures. 
In press. (2013) 
[11] Synopsys. Synopsys Sentaurus Process. Software. Version C-2009.06. 
<http://www.synopsys.com/tools/tcad/processsimulation/pages/sentaurusprocess.aspx> (2009) 
[12] Synopsys. Synopsys Sentaurus Device. Software. Version C-2009.06. 
<http://www.synopsys.com/tools/tcad/processsimulation/pages/sentaurusprocess.aspx> (2009) 
[13] Kim, D. H., Kim, S. & Lim, S. K. Impact of Nano-scale Through-Silicon Vias on 
the Quality of Today and Future 3D IC Designs. ACM/IEEE International 
Workshop on System Level Interconnect Prediction. 1-8 (2011) 
[14]  Yang, K., Kim, D. H. & Lim, S-K. Design quality tradeoff studies for 3D ICs 
built with nano-scale TSVs and devices. 13th International Symposium on Quality 
Electronic Design. 740-746 (2012) 
[15] Suresh, V., et al. Design of 8T-Nanowire RAM Array.  Proceedings of 
IEEE/ACM International Symposium on Nanoscale Architectures. In press (2013) 
[16] Rahman, A. & Reif, R. System-level performance evaluation of three-dimensional 
integrated circuits. IEEE Transactions on Very Large Scale Integration Systems. 
8. 671-678 (2000) 
[17] Davis, J. A., De, V. K. & Meindl, J. A stochastic wire-length distribution for 
gigascale integration (GSI)—Part I: Derivation and validation. IEEE Trans. 
Electron Devices. 45. 580–589 (1998) 
[18] Swahn, B. & Hassoun, S. Electro-Thermal Analysis of Multi-Fin Devices. IEEE 
Transactions on Very Large Scale Integration Systems. 16. 816-829 (2008) 
[19] Pop, E. Energy dissipation and transport in nanoscale devices. Nano Research. 3. 
147-169 (2010) 
[20] Dinash. K, Mutharasu, D. & Lee, Y. T. Paper study on thermal conductivity of 
Al2O3 thin film of different thicknesses on copper substrate under different 
contact pressures. IEEE Symposium on Industrial Electronics and Applications. 
620. 25-28 (2011) 
[21] Rios, R., et al. Comparison of Junctionless and Conventional Trigate Transistors 
With Lg Down to 26 nm. IEEE Electron Device Letters. 32. 1170-1172 (2011) 
[22] Moritz, C. A., Narayanan, P. & Chui, C. O. Nanoscale Application Specific 
Integrated Circuits. Nanoelectronic Circuit Design (Springer, New York, 2011) 
[23] Narayanan, P., Leuchtenburg, M., Wang, T., & Moritz, C. A. CMOS Control 
Enabled Single-Type FET NASIC.  IEEE Computer Society International 
Symposium on VLSI. 191-196 (2008) 
[24] Synopsys. HSPICE user guide: simulation and analysis. Version C-2009.09 
(2009)  
[25] Oakdale Engineering. DataFit Software. Version 9.0. 
<http://www.oakdaleengr.com/download.htm> (2013) 
[26] Narayanan, P., Kina, J., Panchapakeshan, P., Chui, C. O. & Moritz, C. A. 
Integrated Device-Fabric Explorations and Noise Mitigation in Nanoscale 
Fabrics. IEEE Transactions on Nanotechnology. 11. 687 -700 (2012) 
[27] Milovanovic, A. & Koprivica, B. Analysis of square coaxial lines by using 
Equivalent Electrodes Method. Nonlinear Dynamics and Synchronization (INDS) 
& 16th Int'l Symposium on Theoretical Electrical Engineering. 1-6 (2011) 
[28] Arizona State University. PTM-MG device models for 16nm node. 
<http://ptm.asu.edu/> (2011)  
[29] Donath, W. Placement and average interconnection lengths of computer logic. 
IEEE Transactions on Circuits and Systems. 26. 272-277 (1979) 
[30] Christie, P. & Stroobandt, D. The interpretation and application of Rent's rule. 
IEEE Transactions on Very Large Scale Integration Systems. 8. 639-648 (2000) 
[31] Bakoglu, H. B. Circuits, Interconnects and Packaging for VLSI. (Addison-
Wesley, Boston, 1990). 
[32] Otten, R. H. J. M. & Brayton, R. K. Planning for performance. Proceedings of 
35th Annual Design Automation Conference. 122–127. (1998) 
[33] Davis, J. A., De, V. K.  & Meindl, J. D. A stochastic wire-length distribution for 
gigascale integration (GSI)—Part II: Applications to clock frequency, power 
dissipation, and chip size estimation. IEEE Transactions on Electron Devices. 45. 
(1998) 
[34] Sinha, S., Yeric, G., Chandra, V., Cline, B. & Cao, Y. Exploring sub-20nm 
FinFET design with Predictive Technology Models. Proceedings of 49th 
ACM/EDAC/IEEE Design Automation Conference. 283-288 (2012) 
[35] Arizona State University. PTM R-C Interconnect models. <http://ptm.asu.edu/> 
(2012)  
[36] ITRS. ITRS 2012 Interconnect Tables. <http://itrs.net/> (2012)  
[37] Sai-Halasz, G. A. Performance trends in high-end processors. Proceedings of  
IEEE. 83. 20–36 (1995) 
[38] Wang, H. & Porter, W. D. Thermal Conductivity 27: Thermal Expansion 15. 500. 
(DEStech Publication Inc, Knoxville, 2003) 
[39] Neshpor, V.S. The thermal conductivity of the silicides of transition metals. 
Journal of Engineering Physics. 15. 750-752 (1968) 
[40] Tritt, T. M. Thermal Conductivity: Theory, Properties, and Applications. 172. 
(Kluwer Academic, New York, 2004) 
[41] Griffin, A. J., Brotzen, F. R. & Loos, P. J. The effective transverse thermal 
conductivity of amorphous Si3N4 thin films. Journal of Applied Physics. 76. 
4007-4011 (1994)  
[42] Panzer, M. et al. Thermal Properties of Ultrathin Hafnium Oxide Gate Dielectric 
Films. IEEE Electron Device Letters. 30. 1269-1271 (2009) 
[43] Thermal Conductivity: Tungsten.  
<http://www.efunda.com/materials/elements/TC_Table.cfm?Element_ID=W> 
(2010) 
[44] Thermal Conductivity: Titanium. 
<http://www.efunda.com/materials/elements/TC_Table.cfm?Element_ID=Ti> 
(2010) 
[45] Pierson, H. O. Handbook of Refractory Carbides and Nitrides: Properties, 
Characteristics, Processing, and Applications. 223-247. (Noyes Publications, Park 
Ridge, 1996) 
[46] Lu, X. Thermal conductivity modeling of copper and tungsten damascene 
structures. Journal of Applied Physics. 105. 1-12 (2009) 
[47] Das, S. et al. Performance of 22 nm Tri-Gate Junctionless Nanowire Transistors at 
Elevated Temperatures. ECS Solid State Letters. 2. (2013) 
[48] ITRS. ITRS 2012 Lithography Tables. <http://itrs.net/> (2012)  
[49] Mirza, M. M., et al. Nanofabrication of high aspect ratio (50:1) sub-10 nm silicon 
nanowires using inductively coupled plasma etching. Journal of Vacuum Science 
& Technology. 30. (2012) 
[50] Yang, B., et al. Vertical Silicon-Nanowire Formation and Gate-All-Around 
MOSFET. IEEE Electron Device Letters. 29. 791-794 (2008) 
[51] Na, J., Yanqing, Y.,  Xian, L., & Zhenhai, X. Development of CVD Ti-containing 
films. Progress in Materials Science. 58. 1490-1533 (2013) 
[52] Rosler, R. S., Mendonca, J. & Rice, M. J. Tungsten chemical vapor deposition 
characteristics using SiH4 in a single wafer system. Journal of Vacuum Science & 
Technology B: Microelectronics and Nanometer Structures. 6. 1721-1727 (1988) 
[53] Conley, J. F., Ono, Y., Zhuang, W., Stecker, L. & Stecker, G. Electrical properties 
and reliability of HfO2 deposited via ALD using Hf(NO3)4 precursor. IEEE 
International Integrated Reliability Workshop. 108. 21-24 (2002) 
[54] Bai, D., Fowler, M., Planje, C. & Shao, X. Planarization of Deep structures Using 
Self-Leveling Materials. International Microelectronics Assembly and Packaging 
Society. (2012)  
[55] Ting, C.H., Pai, P.L. & Sobczack, Z. An improved etchback planarization process 
using a super planarizing spin-on sacrificial layer. IEEE International VLSI 
Multilevel Interconnection Conference. 491 (1989) 
[56] Gupta, T. K. Dielectric Materials. Copper Interconnect Technology. 67-100 
(Springer, New York, 2009) 
[57] Linder, V.,  Gates, B. D., Ryan, D., Parviz, B. A.& Whitesides, G. M. Water-
Soluble Sacrificial Layers for Surface Micromachining. SMALL.  1. 730-736 
(2005) 
[58] Yun, K-S. & Yoon, E. Microfabrication of 3-dimensional photoresist structures 
using selective patterning and development on two types of specific resists and its 
application to microfluidic components. IEEE International Conference on Micro 
Electro Mechanical Systems. 757-760 (2004) 
[59] Ziegler, J. Stopping Range of Ions in Matter. Software. (2012) 
<http://www.srim.org/>. 
 
 
