Architecting SkyBridge-CMOS by Li, Mingyu
University of Massachusetts Amherst 
ScholarWorks@UMass Amherst 
Masters Theses Dissertations and Theses 
March 2015 
Architecting SkyBridge-CMOS 
Mingyu Li 
University of Massachusetts Amherst 
Follow this and additional works at: https://scholarworks.umass.edu/masters_theses_2 
 Part of the Electrical and Electronics Commons, Electronic Devices and Semiconductor Manufacturing 
Commons, and the VLSI and Circuits, Embedded and Hardware Systems Commons 
Recommended Citation 
Li, Mingyu, "Architecting SkyBridge-CMOS" (2015). Masters Theses. 157. 
https://scholarworks.umass.edu/masters_theses_2/157 
This Open Access Thesis is brought to you for free and open access by the Dissertations and Theses at 
ScholarWorks@UMass Amherst. It has been accepted for inclusion in Masters Theses by an authorized 
administrator of ScholarWorks@UMass Amherst. For more information, please contact 
scholarworks@library.umass.edu. 
  
 
 
 
 
 
ARCHITECTING SKYBRIDGE-CMOS 
 
 
 
 
 
 
A Thesis Presented 
by 
MINGYU LI 
 
 
 
 
Submitted to the Graduate School of the  
University of Massachusetts Amherst in partial fulfillment 
of the requirements for the degree of 
MASTER OF SCIENCE IN ELECTRICAL AND COMPUTER ENGINEERING 
February 2015 
Electrical and Computer Engineering 
  
 
 
 
 
 
 
 
 
 
© Copyright by Mingyu Li 2015 
All Rights Reserved 
  
  
 
 
 
ARCHITECTING SKYBRIDGE-CMOS 
 
 
A Thesis Presented 
by 
MINGYU LI 
 
 
 
 
Approved as to style and content by: 
 
_____________________________________________ 
Csaba Andras Moritz, Chair 
 
_____________________________________________ 
Israel Koren, Member 
 
_____________________________________________ 
C. Mani Krishna, Member 
 
______________________________________________ 
Christopher V. Hollot, Department Chair 
Electrical and Computer Engineering
 
 
 
ACKNOWLEDGEMENTS 
 
I take this opportunity to express my gratitude to my advisor, Professor Csaba Andras Moritz.
You have constantly encouraged and guided me throughout my Master's study. I would like
to thank you for mentoring me to grow as a learner, researcher and Ph.D. candidate. Your
advice on my research will forever guide my future work and career. I would also
like to express my appreciation to my dissertation committee members, Professor Krishna and
Professor Koren, for their invaluable advice on my thesis work. Your feedback has inspired me to
achieve more during my research.
I have also benefited from the invaluable guidance and mentorship of my colleagues, in no
particular order, Santosh Khasanvis, Mostafizur Rahman, Jiajun Shi, Xiayuan Shi and Xiangyun
Zeng, who have been not only helpful in our research work, but also good friends in my life.
I would also like to thank my mother and father for their continuous encouragement for more
than twenty years. Their love has accompanied me through my life and study, supporting me
in pursuing further achievement.
  
 
 
 
ABSTRACT 
 
ARCHITECTING SKYBRIDGE-CMOS 
 
FEBRUARY 2015 
 
MINGYU LI 
B.ENG., SHANDONG UNIVERSITY, JINAN, CHINA 
M.S.E.C.E., UNIVERSITY OF MASSACHUSETTS AMHERST 
 
Directed by: Professor Csaba Andras Moritz 
 
As the scaling of CMOS approaches fundamental limits, revolutionary technology beyond the 
end of the CMOS roadmap is essential to continue the progress and miniaturization of integrated
circuits. Recent research efforts in 3-D circuit integration explore pathways of continuing the 
scaling by co-designing for device, circuit, connectivity, heat and manufacturing challenges in a 
3-D fabric-centric manner. SkyBridge fabric is one such approach that addresses fine-grained 
integration in 3-D, achieves orders of magnitude benefits over projected scaled 2-D CMOS, and 
provides a pathway for continuing scaling beyond 2-D CMOS. 
However, SkyBridge fabric uses only a single transistor type in order to reduce manufacturing
complexity, which limits its circuit implementation to dynamic logic. This design choice
introduces multiple challenges for SkyBridge such as high switching power consumption,
susceptibility to noise, and increased clocking complexity. In this thesis we propose a new
3-D fabric, similar in mindset to SkyBridge, but with static logic circuit implementation in order
to mitigate the aforementioned challenges. We present an integrated framework to realize static
circuits with vertical nanowires, and co-design it across all layers spanning fundamental fabric
structures to large circuits. The new fabric, named SkyBridge-CMOS, introduces new
technology, structures and circuit designs to meet the additional requirements for implementing
static circuits. One of the critical challenges addressed here is integrating both n-type and p-type
nanowires. A molecular bonding process allows precise control between different doping regions,
and novel fabric components are proposed to achieve 3-D routing between various doping
regions.
Core fabric components are designed, optimized and modeled with their physical level 
information taken into account. Based on these basic structures we design and evaluate various 
logic gates, arithmetic circuits and SRAM in terms of power, area footprint and delay. A 
comprehensive evaluation methodology spanning material/device level to circuit level is 
followed. Benchmarking against 16nm 2-D CMOS shows significant improvement of up to 50X 
in area footprint and 9.3X in total power efficiency for low power applications, and 3X in 
throughput for high performance applications. Also, better noise resilience and better power 
efficiency can be guaranteed when compared with original SkyBridge fabrics.  
 
TABLE OF CONTENTS
ACKNOWLEDGEMENTS
ABSTRACT
LIST OF TABLES
LIST OF FIGURES
CHAPTER
1. INTRODUCTION
2. OVERVIEW OF SKYBRIDGE FABRIC
2.1 Motivation and Overview of SkyBridge
2.2 Challenges in SkyBridge Fabric
3. OVERVIEW OF SKYBRIDGE-CMOS
3.1 Motivation
3.2 Overview of SkyBridge-CMOS
3.3 Core Components
3.3.1 Vertical Nanowires
3.3.2 Transistors
3.3.3 Contacts
3.3.4 Bridges and Coaxial Routing
3.3.5 SkyBridge-Interlayer-Connections
4. SKYBRIDGE-CMOS CIRCUIT IMPLEMENTATIONS
4.1 Overview of SkyBridge-CMOS Circuit Style
4.2 Elementary Logic Gates
4.2.1 Inverters
4.2.2 NAND Gate
4.2.3 AOI21 Compound Gate
4.3 Full Adder
4.4 Flip-flop
4.5 6T-SRAM
4.5.1 Read Operation
4.5.2 Write Operation
4.5.3 Noise Margin and Writability
5. EVALUATION OF SKYBRIDGE-CMOS FABRIC
5.1 Fabric Evaluation Methodology
5.1.1 Device and Material Level Methodology
5.1.2 Circuit and Layout Design
5.1.3 RC Extraction
5.1.4 HSPICE Simulation
5.1.5 CMOS Baseline Evaluation
5.2 Evaluation Results
5.2.1 Noise Resilience Evaluation
5.2.2 Initial Benchmarking
5.2.3 Performance Optimization
5.2.4 Large Benchmarking: WISP-4 and 16-bit Multiplier
6. CONCLUSION
BIBLIOGRAPHY
 
   
  
 
LIST OF TABLES 
 
Table
5.1. Initial Benchmarking Results
5.2. WISP-4 Benchmarking Results
5.3. 16-bit Multiplier Benchmarking Results
 
  
 
LIST OF FIGURES 
 
Figure
1.1. Lithography challenge with scaling
1.2. Performance trend
2.1. SkyBridge Inverter Implementation
2.2. SkyBridge Fabric Representation
2.3. SkyBridge Control Scheme
2.4. Switch in static and dynamic circuits when output=0
3.1. Substrate with Layered Doping Regions
3.2. Nanowires with p-, n- doped regions and SB-ILD
3.3. Gate Material Choice
3.4. N-type Transistor in SkyBridge-CMOS
3.5. P-type Transistor in SkyBridge-CMOS
3.6. N- and P-type Contact
3.7. Interconnections: Coaxial Routing and Bridge
3.8. SkyBridge-Interlayer-Connection
4.1. SkyBridge-CMOS Circuit Style
4.2. Physical Layout of SkyBridge-CMOS Inverter
4.3. Cascaded Inverters
4.4. 3-1 NAND Gate
4.5. AOI21 Compound Gate
4.6. 1-bit Full Adder
4.7. 1-bit Negative Edge Flip-flop
4.8. SRAM Cell Schematic and Layout
4.9. SB-CMOS SRAM read operation
4.10. SB-CMOS SRAM write operation
4.11. 6T SB-CMOS SRAM Hold Margin Measurement Circuit and Results
4.12. SB-CMOS SRAM Read Margin Measurement Circuit and Results
4.13. 6T SB-CMOS SRAM Writability Measurement Circuit and Results
5.1. SkyBridge-CMOS Fabric Evaluation Methodology
5.2. P-type Transistor Evaluation Method
5.3. Transistor IDS-VDS Characteristics from TCAD Simulation
5.4. SkyBridge-Interlayer-Connection I-V Characteristics
5.5. SkyBridge-Interlayer-Connection Modeling
5.6. Coupling noise scenario (with GND shielding for SkyBridge)
5.7. Victim Signals for SkyBridge Noise Evaluations for Scenarios in Section
5.8. Noise Resilience Evaluation for SkyBridge with GND Shielding and SkyBridge-CMOS
5.9. 4-bit Array-based Multiplier
5.10. Layout of One Cell in Multiplier
5.11. Pipelined Multiplier Evaluation Results
5.12. WISP-4 Architecture and Instruction Set
5.13. Block Diagram of WISP-4 Stages
5.14. 16-bit Multiplier Design
 
 
 
CHAPTER 1 
1. INTRODUCTION 
CMOS-based integrated circuits have advanced steadily thanks to the
continuous scaling of MOSFETs. As lithography and process techniques improve, feature
sizes shrink, driving transistors to become smaller and cheaper, switch faster, and
consume less power. These device advancements have led the progress and
miniaturization of integrated circuits.
However, as the channel length of MOSFETs approaches the nanoscale domain, many
challenges appear in fabrication and device behavior, which prevent further
scaling of CMOS transistors. Moreover, focusing solely on device improvement
has become less effective in the nanoscale regime since interconnection costs are dominating
[1]. These challenges are introduced in detail in the following paragraphs.
First of all, CMOS fabrication and manufacturing have become more difficult than ever
before due to the extremely demanding requirements of building MOSFETs at the nanoscale. In
terms of lithography precision, it has been extremely challenging to define
feature sizes of tens of nanometers, as can be seen in Figure 1.1 [2]. In order to reduce
variations and ensure reliability, very precise lithography techniques need to be
developed. As for doping control, conventional nanoscale MOSFETs with ultra-sharp
abrupt doping junctions require the doping concentration to vary by orders of magnitude
across several nanometers [3], which is difficult to control and may
thus lead to large variations.
 
 
 
Figure 1.1. Lithography challenge with scaling 
Second, non-ideal characteristics become more severe as CMOS devices scale. On one
hand, subthreshold leakage current becomes more difficult to control in
short-channel transistors. The decreased on-off current ratio leads to more significant
static power dissipation. This scaling trend in static power is contrary to that of
active power consumption, making static power consumption more critical, particularly in
on-chip memories [4]. On the other hand, threshold voltage scaling has also slowed
down dramatically to prevent subthreshold leakage power from rising too fast in the
nanoscale regime, which keeps dynamic power consumption high. Thus it is no
longer possible to obtain both performance and power benefits at the same time [5]. The
more usual way of dealing with the trade-off is to limit performance from scaling up too
fast, as shown in Figure 1.2 [2].
 
 
 
Figure 1.2. Performance trend 
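
The power trade-off discussed above can be illustrated with the standard first-order CMOS
power models, in which switching power is alpha * C * Vdd^2 * f and leakage power is
I_off * Vdd. The sketch below is only an illustration under these textbook approximations and
is not the evaluation method used in this thesis; every numeric value is a hypothetical
placeholder.

```python
def dynamic_power(activity, capacitance_f, vdd_v, freq_hz):
    """First-order switching power: alpha * C * Vdd^2 * f (watts)."""
    return activity * capacitance_f * vdd_v ** 2 * freq_hz


def static_power(i_off_a, vdd_v):
    """First-order leakage power: I_off * Vdd (watts)."""
    return i_off_a * vdd_v


# Hypothetical values chosen only for illustration.
ACTIVITY = 0.1        # switching activity factor
CAPACITANCE = 1e-15   # switched capacitance in farads
FREQUENCY = 1e9       # clock frequency in hertz

p_high = dynamic_power(ACTIVITY, CAPACITANCE, 1.0, FREQUENCY)
p_low = dynamic_power(ACTIVITY, CAPACITANCE, 0.8, FREQUENCY)

# Switching power scales with Vdd squared, so supply scaling is the main lever.
# When threshold voltage (and hence Vdd) stops scaling, this lever is lost.
print(f"dynamic power ratio (0.8 V vs 1.0 V): {p_low / p_high:.2f}")
print(f"static power at 1 nA leakage, 0.8 V: {static_power(1e-9, 0.8):.2e} W")
```

Under these assumptions, a lower on-off current ratio raises I_off and therefore static power,
while a supply voltage that no longer scales keeps the quadratic switching term high.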
Moreover, as the integration scale and gate density increase, interconnection cost is
gradually dominating power consumption and performance in recent technology nodes.
Contrary to the device-scaling trend, interconnections exhibit larger parasitic resistance and
capacitance per unit length and consume more power. What is more, die size
has been gradually increasing, making global wiring overhead larger than before.
Confronted with the aforementioned challenges, it has become more difficult and less
beneficial to sustain the progress of CMOS-based integrated circuits by continuing to scale
the current MOSFET design. Consequently, revolutionary technology beyond the end of the
CMOS roadmap is essential for the development of charge-based electronics.
  
 
 
CHAPTER 2 
2. OVERVIEW OF SKYBRIDGE FABRIC 
In this chapter, a recently proposed 3-D integration technology named
SkyBridge [6] is briefly introduced as a solution for further CMOS technology
scaling. Afterwards, we introduce the challenges in SkyBridge, which motivate us to
explore possible extensions based on this new fabric.
2.1 Motivation and Overview of SkyBridge 
The last chapter explained that further CMOS technology scaling
has encountered several challenges in fabrication, device and connectivity. These
problems are not easily solvable if we continue conventional scaling by simply
shrinking the channel length and engineering the design parameters and structure. It is thus
necessary to achieve a revolutionary technology by co-designing across all aspects,
including material, device, circuit, interconnection, heat management and fabrication.
Following this mindset, a new fabric named SkyBridge has recently been proposed.
This technology follows a 3-D fabric-centric approach to address fine-grained 3-D
integration and mitigate the aforementioned CMOS scaling problems [6]. At the same
time, with true 3-D integration, SkyBridge achieves orders of magnitude benefits in
density and power over projected scaled 2-D CMOS technology.
 
 
 
Figure 2.1. SkyBridge Inverter Implementation 
 
Figure 2.2. SkyBridge Fabric Representation 
The 3-D fabric-centric approach in SkyBridge is realized by introducing innovative
features that replace the planar MOSFETs and interconnection structures of CMOS. First,
as shown in Figure 2.1, SkyBridge circuits are built and stacked on vertical
nanowires, which makes the channel length dependent on material
deposition instead of lithography limits. Second, SkyBridge fabric implements logic
with Gate-All-Around junctionless transistors to better suppress leakage current and
reduce doping complexity. Third, the noise mitigation mechanism in SkyBridge allows
faster dynamic circuit implementation. Finally, interconnections are redesigned in
SkyBridge. The better connection structures, together with the higher gate density,
provide better connectivity beyond the bottleneck in traditional 2-D CMOS technology.
All these features, together with the routing and heat management strategies, are shown
in Figure 2.2 and realize true 3-D integration in a “fabric-centric” manner.
2.2 Challenges in SkyBridge Fabric 
In order to reduce the fabrication complexity, one uniform n-type doping is applied to
the wafer, limiting the transistors to n-type only [6]. This design choice restricts the
implementation to the dynamic circuit style and incurs several challenges.
First of all, dynamic SkyBridge circuits are more susceptible to noise. The control
scheme, as shown in Figure 2.3, enables fast gate transitions but also leaves the output signals
floating during the “HOLD” phase. The floating output is held only by the
capacitance attached to the gate output and is thus vulnerable to noise injected through
coupling capacitances.
 
 
 
Figure 2.3. SkyBridge Control Scheme 
Second, under this control scheme, SkyBridge is not efficient in some scenarios.
For example, in the scenario shown in Figure 2.4, when consecutive “zero” results are
expected, the gate output is repeatedly precharged and discharged during the “Precharge”
and “Evaluate” phases. These transitions are redundant and consume unnecessary power.
 
Figure 2.4. Switch in static and dynamic circuits when output=0 
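
The redundant switching described above can be illustrated by counting output-node
transitions for a stream of logic results. The sketch below is a simplified behavioral model
rather than the SkyBridge control scheme itself: it assumes a dynamic gate whose output is
precharged high and discharged when the result is zero, and it ignores the “HOLD” phase. The
function names and the initial node value are hypothetical.

```python
def dynamic_transitions(results, initial=1):
    """Count output-node transitions of a precharge-high dynamic gate."""
    node = initial
    count = 0
    for value in results:
        # Precharge phase: the output node is pulled high if it was discharged.
        if node == 0:
            node = 1
            count += 1
        # Evaluate phase: the node is discharged when the logic result is 0.
        if value == 0:
            node = 0
            count += 1
    return count


def static_transitions(results, initial=1):
    """Count output-node transitions of a static gate (switches only on change)."""
    node = initial
    count = 0
    for value in results:
        if value != node:
            node = value
            count += 1
    return count


consecutive_zeros = [0, 0, 0, 0]
print("dynamic:", dynamic_transitions(consecutive_zeros))
print("static: ", static_transitions(consecutive_zeros))
```

In this model, four consecutive zero results cost the dynamic gate seven transitions, while
the static gate switches only once.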
 
 
Finally, clock routing for dynamic circuits brings large overhead. SkyBridge circuits
require three or four kinds of clock signals depending on whether a single-rail or dual-rail
implementation is used. Moreover, these clock signals need to be routed
to every gate in dynamic SkyBridge circuits, whereas in static circuits they are routed only
to the sequential elements such as flip-flops and register files. The more complex
clocking leads to overhead in performance, power and area footprint.
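
A rough way to see the clock routing overhead is to count clock connections in each style. The
sketch below is a back-of-the-envelope estimate only: the function, the number of clock
connections per dynamic gate, and the design size are hypothetical, and it assumes a single
clock connection per sequential element in the static case.

```python
def clock_connections(num_gates, num_sequential, clocks_per_gate, dynamic):
    """Estimate the number of clock-to-cell connections in a design."""
    if dynamic:
        # Dynamic style: every gate receives clock signals.
        return num_gates * clocks_per_gate
    # Static style: only sequential elements are clocked (one clock assumed).
    return num_sequential


# Hypothetical design: 1000 gates, 100 of which feed flip-flops.
dynamic_count = clock_connections(1000, 100, clocks_per_gate=2, dynamic=True)
static_count = clock_connections(1000, 100, clocks_per_gate=2, dynamic=False)
print("dynamic clock connections:", dynamic_count)
print("static clock connections: ", static_count)
```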
Confronted with all these challenges caused by the dynamic circuit implementation in
SkyBridge fabric, we are motivated to explore the feasibility and benefits of extending
SkyBridge to static circuits. This thinking has led us to the new SkyBridge-CMOS fabric
that is introduced in the following chapters.
  
 
 
CHAPTER 3 
3. OVERVIEW OF SKYBRIDGE-CMOS 
This chapter introduces the motivation for applying 3-D integration technology with
static CMOS-style circuit implementations. An overview of the new SkyBridge-CMOS
fabric is then provided. Finally, the core components
of SkyBridge-CMOS fabric are presented in detail.
3.1 Motivation 
The previous chapter introduced several challenges in SkyBridge fabric: outputs
float during their “Hold” phases and are thus susceptible to noise; redundant transitions
occur between consecutive “zero” outputs due to the dynamic control scheme; and complex
clock routing leads to overhead in area, power and performance. All these limitations are
related to the dynamic circuit style in SkyBridge fabric and can possibly be mitigated
with static circuit implementations. Consequently, we are motivated to explore
implementing static circuits in a similar 3-D integration approach.
3.2 Overview of SkyBridge-CMOS 
To address the challenges of dynamic implementations, we propose a new fabric, similar
in mindset to SkyBridge but with static logic circuit implementation. The
new integrated framework is co-designed across all layers spanning fundamental fabric
structures to large circuits to realize efficient static circuit implementations with vertical
nanowires. The new fabric, named SkyBridge-CMOS, introduces new technology,
structures and circuit designs to meet the additional requirements for implementing static
circuits. One of the critical challenges addressed here is integrating both n-type and p-type
nanowires. A molecular bonding process [7] allows precise control between different
doping regions, and novel fabric components are proposed to achieve complementary
devices and 3-D routing between various doping regions.
3.3 Core Components 
In this section, the core components of SkyBridge-CMOS fabric are introduced. On
one hand, some elements are inherited from the original SkyBridge fabric to realize the
same mindset of 3-D integration based on vertical nanowires; on the other hand, some
of these SkyBridge elements are re-engineered and new components are designed to meet
the doping and routing requirements of the static circuit style. All these core
components are introduced in the remainder of this chapter.
3.3.1 Vertical Nanowires 
An array of vertical single-crystalline silicon nanowires acts as the fundamental building
block of SkyBridge-CMOS fabric [6]. All elements, including active devices,
contacts and interconnections, are based on nanowires. The nanowires are classified into
two kinds: logic nanowires, on which transistors are built to implement
logic, and routing nanowires, in which the silicon layer is used for signal
propagation. These nanowires need to be heavily doped to meet the doping concentration
requirement of junctionless transistor channels [3]. What is more, silicidation of the
surface of the silicon nanowires is necessary to improve conductivity.
In SkyBridge-CMOS fabric, since two types of transistors are needed, precisely
controlled regions with different doping on the nanowire arrays are essential. However, good
lateral doping control is hard to achieve when a high doping aspect ratio is also desired.
Thus doping separate regions n-type and p-type for the two transistor
types may be challenging. We therefore turn to another approach, which obtains a
to-be-patterned substrate with multiple doping layers by bonding several substrates with
different doping profiles together. A technology named molecular bonding has been
proposed and proved to be successful [7]. With molecular bonding, a wafer with several
active layers and dielectric layers in between is obtained. When preparing a
SkyBridge-CMOS substrate, an n-doped wafer and a p-doped wafer are bonded together to
obtain a substrate with different doping regions as well as a dielectric layer in between, as
shown in Figure 3.1. The dielectric layer, named SkyBridge-Interlayer-Dielectric,
provides isolation between p- and n-doped regions where a connection is not desired.
This layer should not be too thick because of the nanowire aspect ratio
limitation. To this end, a SkyBridge-Interlayer-Dielectric layer as thin as
23nm has been proposed [8]. What is more, by repeating the process iteratively, more layers
with different doping can be stacked.
 
Figure 3.1. Substrate with Layered Doping Regions 
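
The layered substrate and the aspect ratio concern can be sketched as a simple data
structure. The example below is only an illustration: the 23nm SkyBridge-Interlayer-Dielectric
thickness is taken from the text, while the doped layer thicknesses, the stacking order, the
nanowire width, and all names are hypothetical assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    name: str
    kind: str            # "n", "p", or "dielectric"
    thickness_nm: float


def bond(stack, doped_layer, ild_thickness_nm=23.0):
    """Bond a doped layer onto the stack, separated by an SB-ILD layer."""
    if not stack:
        return [doped_layer]
    ild = Layer("SB-ILD", "dielectric", ild_thickness_nm)
    return stack + [ild, doped_layer]


def aspect_ratio(stack, nanowire_width_nm):
    """Height-to-width ratio of a nanowire etched through the whole stack."""
    height_nm = sum(layer.thickness_nm for layer in stack)
    return height_nm / nanowire_width_nm


# Hypothetical thicknesses and stacking order, for illustration only.
stack = bond([], Layer("n-doped silicon", "n", 200.0))
stack = bond(stack, Layer("p-doped silicon", "p", 200.0))

for layer in stack:
    print(f"{layer.name:16s} {layer.thickness_nm:6.1f} nm")
print("aspect ratio at 16 nm width:", aspect_ratio(stack, 16.0))
```

Calling `bond` again models the iterative stacking mentioned above; each additional layer
adds its own thickness plus one dielectric layer to the nanowire height, which is why a thin
dielectric helps keep the aspect ratio manageable.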
 
 
After the silicon substrate with doping layers is prepared, highly anisotropic silicon
etching is required in order to obtain nanowires with high aspect ratio and uniform width.
In this way, we finally obtain nanowires with different doping regions and
SkyBridge-Interlayer-Dielectric in between, as shown in Figure 3.2. This top-down method
ensures the silicon quality, surface profile and good control of the geometric parameters of
the nanowires [9].
 
Figure 3.2. Nanowires with p-, n- doped regions and SB-ILD 
3.3.2 Transistors 
The active devices used in SkyBridge-CMOS fabric are vertical Gate-All-Around
junctionless nanowire transistors. Junctionless transistors avoid the abrupt junctions of
conventional MOSFETs and thus reduce doping complexity. Moreover, due to the
vertical structure, the channel lengths of SkyBridge-CMOS transistors are defined by the
thickness of material deposition instead of lithography accuracy, which allows
continued channel length scaling beyond lithography limits.
The electrical behavior is modulated by the work function difference between the gate
material and the channel, so a proper gate electrode work function is important. The transistor
channel should be depleted by the work function difference when no gate voltage is
applied and conduct when the proper gate voltage is applied. Thus the gate material is
chosen based on the work function range, which is shown in Figure 3.3, and on its
electrical characteristics. The previous design for the n-type transistor, shown in Figure 3.4,
uses Titanium Nitride as the gate material and Boron-doped silicon with a high
concentration of 1e19 cm^-3, resulting in an on current of 1.63e-5A and an on-off ratio of
1.72e5. For p-type transistors, we have chosen Tungsten Nitride as the gate electrode
material, with the design shown in Figure 3.5. In order to obtain an on current similar to that
of n-type transistors and a decent on-off ratio, the gate electrode
work function and channel doping concentration were further refined with Sentaurus TCAD
simulation [10], leading to a work function of 4.54 eV, achieved by WN0.6
[11], and an Arsenic doping concentration of 0.8e5 cm^-3. With these optimizations, the
p-type transistor achieves an on current of 1.6e-5A and an on-off ratio of 2.1e4, which shows
driving ability similar to that of n-type transistors and a good on-off ratio.
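
As a quick check on the figures quoted above, the off current implied by each transistor
follows from dividing the on current by the on-off ratio. The sketch below uses only the on
currents and on-off ratios stated in this section; the dictionary layout and function names
are hypothetical, and the derived off currents are arithmetic consequences rather than
values reported from simulation.

```python
def off_current(i_on_a, on_off_ratio):
    """Off current implied by an on current and an on-off ratio (amperes)."""
    return i_on_a / on_off_ratio


# On currents (A) and on-off ratios as quoted in the text.
N_TYPE = {"i_on": 1.63e-5, "on_off": 1.72e5}
P_TYPE = {"i_on": 1.6e-5, "on_off": 2.1e4}

n_off = off_current(N_TYPE["i_on"], N_TYPE["on_off"])
p_off = off_current(P_TYPE["i_on"], P_TYPE["on_off"])

# Relative difference in drive current between the two device types.
drive_mismatch = abs(N_TYPE["i_on"] - P_TYPE["i_on"]) / N_TYPE["i_on"]

print(f"n-type implied off current: {n_off:.3e} A")
print(f"p-type implied off current: {p_off:.3e} A")
print(f"on-current mismatch: {drive_mismatch:.1%}")
```

The on currents differ by under two percent, which is consistent with the matched driving
ability described above, while the lower on-off ratio of the p-type device implies a higher
off current.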
 
 
 
Figure 3.3. Gate Material Choice 
 
Figure 3.4. N-type Transistor in SkyBridge-CMOS 
 
 
 
Figure 3.5. P-type Transistor in SkyBridge-CMOS 
3.3.3 Contacts 
Contact materials also need to be chosen based on the work function difference to form a
good Ohmic contact. Consequently, different contact materials are required for
differently doped regions. In SkyBridge, silicon is always n-doped, so only one type of
contact needs to be designed. Titanium provides good Ohmic contact with n-doped
silicon as well as good conductivity, and thus it is chosen as the contact electrode material. As
for the p-type contact in SkyBridge-CMOS fabric, the work function of the contact material
needs to be similar to or slightly higher than that of the p-doped silicon. Among the commonly
used materials, nickel is chosen based on this work function requirement. Contact designs
based on these material choices are shown in Figure 3.6. Using
two different contact materials does not add fabrication complexity since the n-type
and p-type contacts are always on different layers. A silicided layer between the contact
material and silicon is also necessary to decrease the contact resistance.
 
Figure 3.6. N- and P-type Contact 
3.3.4 Bridges and Coaxial Routing 
Two kinds of routing structures are used in the SkyBridge-CMOS fabric: bridges provide
connections between adjacent nanowires, while coaxial routing structures offer
connectivity in the vertical direction to allow more interconnection flexibility.
To improve connectivity, bridges can be built at different nanowire heights so that
multiple bridges are available on one nanowire. Additionally, at most two metal layers and
one silicon layer are allowed for coaxial routing to increase the vertical connectivity.
Figure 3.7 shows an example of integrating multiple bridges and coaxial routing layers to
achieve high connectivity.
Both kinds of metal routing structures are built with tungsten, a material widely used
for interconnect in conventional 2-D CMOS technology. The material is chosen not only for
its good electrical characteristics as an interconnect, but also for its compatibility
with the other materials. The connection between tungsten and the n-type contact and gate
materials has been proven feasible in the original SkyBridge fabric. To allow tungsten to
connect to nickel, a titanium nitride barrier layer is essential to prevent
inter-diffusion [12].
 
Figure 3.7. Interconnections: Coaxial Routing and Bridge 
3.3.5 SkyBridge-Interlayer-Connections 
One important challenge in realizing static circuits on vertical nanowires in 3-D
integration is to effectively connect n- and p-doped silicon regions. This connection is
necessary in two cases: connecting pull-up and pull-down networks to implement static
logic gates, and allowing vertical signal routing to bypass the
SkyBridge-Interlayer-Dielectric.
The structure of the SkyBridge-Interlayer-Connection is shown in Figure 3.8. It includes
connections through p-doped silicon, nickel, titanium nitride, tungsten, titanium and
n-doped silicon. These materials are chosen so that good Ohmic contact and material
bonding are achieved at every interface in the SkyBridge-Interlayer-Connection. Because of
its complicated structure, its electrical characteristics should be verified to ensure
good connectivity between the two terminals. We used Sentaurus TCAD tools to simulate both
the process for building the SkyBridge-Interlayer-Connection structure and the device
characteristics at the physical level. The electrical characteristics of the
SkyBridge-Interlayer-Connection are plotted in the evaluation in Chapter 5. A linear I-V
characteristic shows that the SkyBridge-Interlayer-Connection provides good Ohmic contact
between two terminals in differently doped regions.
 
Figure 3.8. SkyBridge-Interlayer-Connection  
 
CHAPTER 4 
4. SKYBRIDGE-CMOS CIRCUIT IMPLEMENTATIONS 
This chapter provides an overview of the SkyBridge-CMOS circuit style. Circuit design
examples, including logic gates, arithmetic circuits and static RAM, are then shown as
transistor-level schematics and 3-D physical-level layouts.
4.1 Overview of SkyBridge-CMOS Circuit Style 
Limitations of the dynamic circuit style in the SkyBridge fabric motivate us to explore
the possibility of implementing static circuits in 3-D integration. To this end, the new
core components introduced earlier are designed for SkyBridge-CMOS: p-type transistors
build the pull-up network without generating a degraded “1”; the
SkyBridge-Interlayer-Dielectric provides reliable isolation between p- and n-doped
nanowire regions; and SkyBridge-Interlayer-Connections connect p- and n-doped silicon
where a connection is desired. Together, these elements make it possible to achieve static
circuits in a 3-D integration approach similar to that of the SkyBridge fabric.
The SkyBridge-CMOS fabric follows the conventional circuit style used to implement static
circuits in planar CMOS technologies. In this style, each gate consists of a pull-down
network to ground built with n-type transistors and a pull-up network to VDD built with
p-type transistors, as shown in Figure 4.1 [13]. The static circuit style brings important
improvements in power consumption and noise resilience: exactly one of the pull-up and
pull-down networks is ON in steady state, so the gate output is always driven to a strong
“1” or “0”, which ensures good noise robustness; static logic is more energy-efficient
because it avoids the redundant switching of the precharge-evaluation mechanism used in
dynamic circuits; and static CMOS-style circuits greatly reduce clocking overhead, since
clock signals no longer need to be routed to every gate as in dynamic circuits.
 
Figure 4.1. SkyBridge-CMOS Circuit Style 
Another feature of the SkyBridge-CMOS fabric is compound logic. As in conventional CMOS
circuits, a combination of series and parallel transistors can implement functions beyond
“NAND” and “NOR”. As we will see in Section 4.2.3, a compound gate can perform the “AOI21”
logic in a single stage with a compact and efficient gate design.
The SkyBridge-CMOS fabric thus shares its transistor-level circuit design methods, in
terms of static circuit style and compound logic, with conventional CMOS. This similarity
provides better compatibility with state-of-the-art CAD tools and makes large-scale
circuit design easier than in the original SkyBridge.
4.2 Elementary Logic Gates 
Following the circuit style introduced above, we can now design elementary logic gates in
the SkyBridge-CMOS fabric. They are described in the following sections.
4.2.1 Inverters 
First, the SkyBridge-CMOS implementation of the inverter, the simplest logic gate, is
presented. The physical-level layout of several cascaded inverters is then used to show
the routing strategy between gates.
The layout of a single inverter is shown in Figure 4.2. The Ohmic contacts to n- and
p-doped silicon connect the output node to the power supply through the pull-up network
and to GND through the pull-down network. The input signal is routed to the gate
electrodes surrounding the nanowire channels and controls the on-off state of the
corresponding transistors. Between the two doping regions, the
SkyBridge-Interlayer-Connection structure connects the pull-up and pull-down networks to
generate the output. The figure shows that all contacts and gates are stacked vertically,
so the gate occupies a very small area footprint.
 
  
Figure 4.2. Physical Layout of SkyBridge-CMOS Inverter 
The layout of four cascaded inverters is shown in Figure 4.3. This design focuses on how
the routing between cascaded inverters is done. First, the primary input signal “In” is
connected to both the pull-up and pull-down networks by the coaxial routing structure on a
nanowire dedicated to signal routing. However, this routing strategy leads to a large
overhead, since the routing nanowire is no longer available for implementing logic. One
way to avoid a routing nanowire is to customize the nanowire heights of outputs and inputs
so that they can be connected directly by bridges. The signals “Int0”, “Int1” and “Int2”
in Figure 4.3 follow this routing strategy.
 
Figure 4.3. Cascaded Inverters 
4.2.2 NAND Gate 
In this section, a 3-input NAND gate design in the SkyBridge-CMOS fabric is shown to
introduce gates that span multiple nanowires. As Figure 4.4 shows, the 3-input NAND gate
has a pull-up network with three parallel p-type transistors and a pull-down network with
three series n-type transistors. The three series n-type transistors are implemented
vertically on one nanowire, while the three parallel p-type transistors have to be built
on three nanowires connected by SkyBridge-Interlayer-Connections and bridges.
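The complementary behavior of the two networks can be illustrated with a small
switch-level sketch. It models only the logic of the series pull-down and parallel pull-up
networks described above, not the physical nanowire layout; the function names are
illustrative and do not come from the thesis.

```python
from itertools import product


def pull_down_conducts(a: int, b: int, c: int) -> bool:
    """Three n-type transistors in series: the path to GND closes only if all inputs are 1."""
    return a == 1 and b == 1 and c == 1


def pull_up_conducts(a: int, b: int, c: int) -> bool:
    """Three p-type transistors in parallel: any input at 0 closes a path to VDD."""
    return a == 0 or b == 0 or c == 0


def nand3(a: int, b: int, c: int) -> int:
    """Output of the static gate; exactly one network must conduct for every input."""
    up = pull_up_conducts(a, b, c)
    down = pull_down_conducts(a, b, c)
    if up == down:
        raise RuntimeError("networks are not complementary")
    return 1 if up else 0


if __name__ == "__main__":
    for a, b, c in product((0, 1), repeat=3):
        print(a, b, c, "->", nand3(a, b, c))
```

Because exactly one network conducts for each of the eight input combinations, the output
is always actively driven, which is the property the static circuit style relies on.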
 
Figure 4.4. 3-1 NAND Gate 
4.2.3 AOI21 Compound Gate 
As discussed in Section 4.1, SkyBridge-CMOS circuits can use compound gates to perform
complex logic functions in addition to NAND and NOR. An AOI21 gate is shown as an example
because the AOI and OAI functions can be implemented efficiently with compound logic. As
shown in Figure 4.5, with the same transistor-level design as in CMOS technology, a
3-input AOI21 function is performed with only two nanowires.
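To make the series and parallel composition concrete, the sketch below builds the AOI21
function, Out = NOT(A·B + C), from small network combinators. The input names A, B, C and
the helper functions are assumptions made for illustration; the pull-up network is written
as the series/parallel dual of the pull-down network, as in a conventional CMOS compound
gate.

```python
from itertools import product


def nmos(name):
    """n-type switch: conducts when its gate input is 1."""
    return lambda inputs: inputs[name] == 1


def pmos(name):
    """p-type switch: conducts when its gate input is 0."""
    return lambda inputs: inputs[name] == 0


def series(*nets):
    """A series connection conducts only if every element conducts."""
    return lambda inputs: all(net(inputs) for net in nets)


def parallel(*nets):
    """A parallel connection conducts if any element conducts."""
    return lambda inputs: any(net(inputs) for net in nets)


# Pull-down: (A in series with B) in parallel with C.
PULL_DOWN = parallel(series(nmos("A"), nmos("B")), nmos("C"))
# Pull-up: the dual network, (A in parallel with B) in series with C.
PULL_UP = series(parallel(pmos("A"), pmos("B")), pmos("C"))


def aoi21(a: int, b: int, c: int) -> int:
    inputs = {"A": a, "B": b, "C": c}
    up, down = PULL_UP(inputs), PULL_DOWN(inputs)
    if up == down:
        raise RuntimeError("networks are not complementary")
    return 1 if up else 0


if __name__ == "__main__":
    for a, b, c in product((0, 1), repeat=3):
        print(a, b, c, "->", aoi21(a, b, c))
```

The whole function is evaluated in a single stage with six transistors, which is why a
compound gate is more compact than an equivalent chain of NAND and NOR gates.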
 
Figure 4.5. AOI21 Compound Gate 
4.3 Full Adder 
The SkyBridge-CMOS one-bit full adder design in Figure 4.6 serves as an example of a
simple functional circuit consisting of several gates. The transistor-level design shown
in the schematic follows a conventional full adder design in planar CMOS technologies.
The physical-level layout is designed in a full-custom way to optimize performance and
density in both the 2-D and 3-D technologies. The layout shows a large density benefit
over the conventional CMOS implementation: the SkyBridge-CMOS full adder occupies eleven
nanowires, leading to an area footprint of only 0.06 μm^2, which is 28X denser than the
16nm 2-D CMOS implementation. This small benchmark gives an initial idea of the density
benefit achieved in the SkyBridge-CMOS fabric.
 
Figure 4.6. 1-bit Full Adder 
4.4 Flip-flop 
The flip-flop is a basic element for storing state information and is thus essential for
sequential circuits. Most conventional flip-flop designs in 2-D CMOS technology are
implemented in pass-transistor logic style. However, pass-transistor logic is not
desirable in the SkyBridge-CMOS fabric because of the larger drain-to-source voltage drop
of the junctionless transistors. Consequently, we employ the design shown in Figure 4.7,
which is realized by two cascaded 2-to-1-multiplexer-style latches with complementary
clock signals. The physical layout, also shown in Figure 4.7, is customized so that an
area footprint of only 0.04 μm^2 is necessary.
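The behavior of two cascaded multiplexer-style latches can be sketched at the logic level
as follows. This is a minimal behavioral model, not the transistor-level design: it
assumes the first (master) latch is transparent while the clock is high and the second
(slave) latch while the clock is low, which matches the negative-edge label of Figure 4.7.
The class and method names are illustrative.

```python
def mux2(select: int, in0: int, in1: int) -> int:
    """2-to-1 multiplexer: returns in1 when select is 1, otherwise in0."""
    return in1 if select else in0


class MuxLatch:
    """Level-sensitive latch: a 2-to-1 mux whose output is fed back to its hold input."""

    def __init__(self) -> None:
        self.q = 0

    def update(self, enable: int, d: int) -> int:
        # enable = 1: transparent (q follows d); enable = 0: hold the previous q.
        self.q = mux2(enable, self.q, d)
        return self.q


class NegEdgeFlipFlop:
    """Master latch clocked by clk, slave latch clocked by the complement of clk."""

    def __init__(self) -> None:
        self.master = MuxLatch()
        self.slave = MuxLatch()

    def step(self, clk: int, d: int) -> int:
        m = self.master.update(clk, d)          # master transparent while clk = 1
        return self.slave.update(1 - clk, m)    # slave transparent while clk = 0


if __name__ == "__main__":
    ff = NegEdgeFlipFlop()
    stimulus = [(1, 1), (0, 1), (0, 0), (1, 0), (0, 0)]  # (clk, d) pairs
    for clk, d in stimulus:
        print(f"clk={clk} d={d} -> q={ff.step(clk, d)}")
```

In this model the output changes only when the clock falls, and a change on the data input
while the clock is low has no effect until the next falling edge.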
 
Figure 4.7. 1-bit Negative Edge Flip-flop 
4.5 6T-SRAM 
SRAM is an important circuit application since it is widely used in fast on-chip storage
such as caches and register files. A new technology therefore needs an efficient SRAM
implementation. However, it is challenging to realize the conventional 2-D CMOS SRAM
design in SkyBridge-CMOS technology because transistor sizing is inefficient in this
fabric. Consequently, a new design suited to the SkyBridge-CMOS fabric is needed.
To ensure SRAM cell stability and writability without transistor sizing, a SkyBridge-CMOS
SRAM cell is designed as shown in Figure 4.8. As in a traditional 2-D CMOS SRAM, the
SkyBridge-CMOS SRAM cell stores its value in cross-coupled inverters and controls read and
write access with pass transistors. However, the way transistor strength is customized
differs. To ensure writability, the write-access transistor needs to be stronger than any
of the transistors in the cross-coupled inverters; to ensure read stability, the
read-access transistor has to be weaker than any of the transistors in the cross-coupled
inverters. Conventional customization of transistor strength relies on geometry
parameters, which is not possible in the SkyBridge-CMOS fabric. In SkyBridge-CMOS,
transistor strength is instead customized by applying different gate voltage levels, as
described in Sections 4.5.1 and 4.5.2.
  
Figure 4.8. SRAM Cell Schematic and Layout 
4.5.1 Read Operation 
A read of the 6T SkyBridge-CMOS SRAM cell operates in two steps. First, with the
read-access transistor off, the Read-Bit-Line is initialized to “0” by the bit-line
conditioning circuits. Then the p-type read-access transistor is turned on by pulling down
the Read-Word-Line. Only when the value stored in the SRAM cell is “1” is the floating
Read-Bit-Line pulled up from “0”; otherwise it remains at “0”. This completes the read
operation.
To ensure read stability, the read-access transistor should be weaker than any of the
transistors in the cross-coupled inverters. Conventional CMOS 6T-SRAM satisfies this
requirement through transistor sizing, which is not feasible in the SkyBridge-CMOS fabric.
Since the gate voltage level influences the driving ability of a transistor, we instead
customize the Read-Word-Line signal, which is the gate voltage of the read-access
transistor, to have a weaker “0” of 0.1V, so that the read-access transistor is only
weakly ON and read stability is achieved.
 
Figure 4.9. SB-CMOS SRAM read operation 
4.5.2 Write Operation 
A write of the 6T SkyBridge-CMOS SRAM cell also operates in two steps. First, with the
write-access transistor off, the Write-Bit-Line is driven to a strong “0” or “1” by the
write driver. Then the n-type write-access transistor is turned on by pulling up the
Write-Word-Line. The node inside the SRAM cell is thereby connected to the Write-Bit-Line
through the ON write-access transistor, and the Write-Bit-Line value is written into the
cell.
To ensure writability, the write-access transistor should be stronger than any of the
transistors in the cross-coupled inverters. Conventional CMOS 6T-SRAM design again relies
on transistor sizing, whereas we customize the Write-Word-Line signal, which is the gate
voltage of the write-access transistor, to have a strong “1” of 1.2V. With the strongly ON
write-access transistor driving the cell, writability is ensured.
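The two-step read and write sequences of Sections 4.5.1 and 4.5.2 can be summarized in a
purely behavioral sketch. It captures only the ordering of the steps and the logical
outcome; the electrical conditions (a weakly ON read-access transistor with a 0.1V
Read-Word-Line, a strongly ON write-access transistor with a 1.2V Write-Word-Line) are
assumed to hold and are not simulated. The class and attribute names are illustrative.

```python
from dataclasses import dataclass


@dataclass
class SramCell:
    """Behavioral stand-in for the cross-coupled inverters of one cell."""

    q: int = 0  # stored value

    def read(self) -> int:
        # Step 1: access transistor off, Read-Bit-Line conditioned to "0".
        read_bit_line = 0
        # Step 2: Read-Word-Line pulled low (weak "0"); the p-type access
        # transistor pulls the floating bit line up only if the cell stores 1.
        if self.q == 1:
            read_bit_line = 1
        # Read stability is assumed: the stored value is left unchanged.
        return read_bit_line

    def write(self, value: int) -> None:
        if value not in (0, 1):
            raise ValueError("value must be 0 or 1")
        # Step 1: access transistor off, write driver sets Write-Bit-Line.
        write_bit_line = value
        # Step 2: Write-Word-Line pulled high (strong "1"); the n-type access
        # transistor overpowers the cell and the bit-line value is stored.
        self.q = write_bit_line


if __name__ == "__main__":
    cell = SramCell()
    for value in (1, 0, 1):
        cell.write(value)
        print(f"wrote {value}, read back {cell.read()}")
```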
 
Figure 4.10. SB-CMOS SRAM write operation 
4.5.3 Noise Margin and Writability 
To verify the new SRAM design, its noise margin and writability must be measured. We
follow the common method for measuring SRAM noise margin [14]; the methods and results are
described in the following paragraphs.
For the hold margin measurement, the circuit shown in Figure 4.11 is built, and we plot
V2 against V1 and V1 against V2 from the HSPICE [15] simulation results; the resulting
plot is known as the “butterfly curve”. The two curves intersect at three points (two
stable states and one metastable point), and the hold noise margin is determined by the
side length of the largest square that fits between the curves. Similarly, the read margin
is measured with the circuit in Figure 4.12. With the read-access transistor trying to
pull down the node “Q”, the butterfly curve shifts to the plot shown in Figure 4.12. Based
on these measurements, we see a good hold noise margin of 0.3V and a read noise margin of
0.15V. For the writability measurement, shown in Figure 4.13, with the ON write-access
transistor trying to write “1” or “0” into the cross-coupled inverters, we again plot V2
against V1 and V1 against V2 and fit the largest possible square between the two curves,
which now have only one stable state, for both scenarios of writing “0” and “1” [16].
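The largest-square construction can be sketched numerically. The example below does not
use the HSPICE data: it substitutes an idealized, symmetric tanh-shaped inverter transfer
curve, and the supply voltage and gain values are hypothetical. For each candidate
bottom-left corner on the curve V1 = f(V2), it finds by bisection the square whose
top-right corner lands on the curve V2 = f(V1) and keeps the largest side.

```python
import math


def inverter_vtc(v_in: float, vdd: float, gain: float) -> float:
    """Idealized symmetric inverter transfer curve (stand-in for simulated data)."""
    return 0.5 * vdd * (1.0 - math.tanh(gain * (v_in - 0.5 * vdd)))


def largest_square_side(vtc, vdd: float, samples: int = 400, iterations: int = 60) -> float:
    """Side of the largest square inscribed in the upper-left lobe of the butterfly curve.

    vtc must be a monotonically decreasing function of its input voltage.
    """
    best = 0.0
    for i in range(samples + 1):
        y0 = 0.5 * vdd + 0.5 * vdd * i / samples  # sweep V2 across the upper lobe
        x0 = vtc(y0)                              # bottom-left corner on V1 = f(V2)

        def gap(side: float) -> float:
            # Zero when the top-right corner (x0 + side, y0 + side) is on V2 = f(V1).
            return vtc(x0 + side) - (y0 + side)

        if gap(0.0) <= 0.0:
            continue  # this corner is not inside the lobe
        lo, hi = 0.0, vdd  # gap(lo) > 0 and gap(hi) < 0, gap is decreasing
        for _ in range(iterations):
            mid = 0.5 * (lo + hi)
            if gap(mid) > 0.0:
                lo = mid
            else:
                hi = mid
        best = max(best, lo)
    return best


if __name__ == "__main__":
    VDD = 0.8  # hypothetical supply voltage, not taken from the thesis
    for gain in (10.0, 40.0):  # hypothetical inverter gains (per volt)
        snm = largest_square_side(lambda v: inverter_vtc(v, VDD, gain), VDD)
        print(f"gain={gain:>4}: noise margin = {snm:.3f} V")
```

A sharper transfer curve opens the lobes of the butterfly curve and enlarges the square,
while the margin stays below half of the supply voltage. With measured curves, the same
search would be applied to interpolated simulation data instead of the tanh model.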
 
Figure 4.11. 6T SB-CMOS SRAM Hold Margin Measurement Circuit and Results 
 
Figure 4.12. SB-CMOS SRAM Read Margin Measurement Circuit and Results 
  
Figure 4.13. 6T SB-CMOS SRAM Writability Measurement Circuit and Results 
 
CHAPTER 5 
5. EVALUATION OF SKYBRIDGE-CMOS FABRIC 
With the circuit designs presented in the last chapter, we can now evaluate the
SkyBridge-CMOS fabric. In this chapter, the evaluation methodology is described first.
The evaluation results obtained with this methodology are then shown and analyzed.
5.1 Fabric Evaluation Methodology 
To obtain credible results in terms of area footprint, power consumption, performance and
noise resilience, a comprehensive evaluation methodology has been established. The
evaluation covers the material / device level, the schematic level and the physical
layout level, as shown in the flowchart in Figure 5.1. The methods used in each step are
introduced in the following sections.
 
Figure 5.1. SkyBridge-CMOS Fabric Evaluation Methodology 
5.1.1 Device and Material Level Methodology 
The designs of the basic fabric elements were shown in Chapter 3. To capture the effects
of these component designs at the circuit level, they have to be simulated and modeled so
that they can be included in circuit simulation. Two kinds of core components mainly need
to be evaluated: p-type transistors and the SkyBridge-Interlayer-Connection.
We first describe the evaluation methodology for p-type transistor modeling. The method is
similar to the one used previously for horizontal nanowire device modeling [17] and is
presented in the flowchart in Figure 5.2. First, we develop the process for building the
p-type transistors based on the material-level information and geometry parameters of our
component design. Second, we run process simulation in Sentaurus Process to obtain the
transistor structure, which provides the necessary input for the subsequent device
simulation. Third, Sentaurus Device takes the process simulation results and performs
physical-level simulation to generate the device I-V and C-V characteristics. After that,
the mathematical tool DataFit [18] performs regression analysis, applying polynomial fits
to the device characteristics to obtain mathematical expressions. Finally, we build a
behavioral HSPICE model based on the expressions describing the device characteristics. In
this way, we obtain from our physical-level design a device model that is used in the
HSPICE circuit simulations. This device model has been verified with HSPICE simulation,
and the result of the DC analysis is shown in Figure 5.3.
 
Figure 5.2. P-type Transistor Evaluation Method 
 
Figure 5.3. Transistor IDS-VDS Characteristics from TCAD Simulation 
The SkyBridge-Interlayer-Connection is evaluated in a similar way, acquiring the device
characteristics with Sentaurus Process and Sentaurus Device. The characteristics plotted
in Figure 5.4 show a nearly perfect Ohmic I-V behavior, which is what is desired for the
connection between p- and n-doped regions. In addition, the coupling capacitance between
the two terminals is negligible. The detailed model of the SkyBridge-Interlayer-Connection
with resistors and capacitors is shown in Figure 5.5 and is used in the HSPICE circuit
simulations.
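One simple way to quantify how close an I-V characteristic is to Ohmic behavior is to fit
a straight line and inspect the coefficient of determination. The sketch below is an
illustration only: the two data sets are synthetic (an ideal resistor with a hypothetical
10 kΩ resistance and, for contrast, an exponential diode-like curve) and are not the TCAD
results of Figure 5.4.

```python
import math


def fit_line(voltages, currents):
    """Least-squares line I = slope * V + intercept; returns (slope, intercept, r_squared)."""
    n = len(voltages)
    mean_v = sum(voltages) / n
    mean_i = sum(currents) / n
    s_vv = sum((v - mean_v) ** 2 for v in voltages)
    s_vi = sum((v - mean_v) * (i - mean_i) for v, i in zip(voltages, currents))
    slope = s_vi / s_vv
    intercept = mean_i - slope * mean_v
    ss_res = sum((i - (slope * v + intercept)) ** 2 for v, i in zip(voltages, currents))
    ss_tot = sum((i - mean_i) ** 2 for i in currents)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Synthetic sweep from -0.5 V to 0.5 V (hypothetical values, not simulation data).
VOLTS = [-0.5 + 0.05 * k for k in range(21)]
OHMIC_AMPS = [v / 1.0e4 for v in VOLTS]                                   # ideal 10 kOhm resistor
RECTIFYING_AMPS = [1e-12 * (math.exp(v / 0.026) - 1.0) for v in VOLTS]    # diode-like contrast

if __name__ == "__main__":
    slope, _, r2 = fit_line(VOLTS, OHMIC_AMPS)
    print(f"ohmic:      R = {1.0 / slope:.1f} Ohm, R^2 = {r2:.6f}")
    _, _, r2_rect = fit_line(VOLTS, RECTIFYING_AMPS)
    print(f"rectifying: R^2 = {r2_rect:.3f}")
```

For an Ohmic connection the fit is essentially exact and the inverse of the slope gives
the resistance that would enter a resistor-capacitor model; a rectifying junction fails
the same test clearly.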
 
Figure 5.4. SkyBridge-Interlayer-Connection I-V Characteristics 
 
Figure 5.5. SkyBridge-Interlayer-Connection Modeling 
5.1.2 Circuit and Layout Design 
With evaluation results for all the SkyBridge-CMOS components ready, we can run
circuit-level HSPICE simulations using the models of these elements. Benchmark circuits
are first designed at the schematic level following the circuit style introduced in
Chapter 4. These schematics are then turned into physical-level layouts, from which we
estimate the area footprint. After that, we build an HSPICE netlist that describes the
layout with all the transistor and interconnection models. Finally, test vectors are
applied to the netlist for functional verification.
5.1.3 RC Extraction 
The HSPICE simulation results so far are only good enough for validating combinational
functionality. To obtain better results in terms of signal integrity, performance, power
consumption and noise robustness, the parasitic resistances and capacitances need to be
considered. Because no CAD tools exist for SkyBridge-CMOS, RC extraction has to be done
manually by inspecting the layout to determine the parasitic resistances and capacitances.
These parasitic elements are modeled following the Predictive Technology Model (PTM) [19].
We then attach the extraction results to the original schematic-level HSPICE netlist.
5.1.4 HSPICE Simulation 
With the physical-level HSPICE netlists, various simulations can be run to evaluate noise
resilience, performance and power consumption. The detailed methods and assumptions are
introduced in the following paragraphs.
To compare the noise resilience of circuits built in SkyBridge and SkyBridge-CMOS, we
build scenarios in which noise attacks signals in these fabrics, run HSPICE simulations
and observe how the signals are affected. As shown in Figure 5.6, two kinds of coupling
noise exist in circuits built with vertical nanowires: coupling noise between different
layers of one nanowire and coupling noise between adjacent nanowires. We assume that one
victim of coupling noise has at most one aggressor on the same nanowire and two aggressors
on adjacent nanowires, based on the common way of routing in real circuits. Consequently,
we build physical-level HSPICE netlists for three scenarios: one aggressor on the same
nanowire; one aggressor on the same nanowire and one on an adjacent nanowire; and one
aggressor on the same nanowire and two on adjacent nanowires. To make the scenarios
realistic, the victim and all the aggressors are generated by real gates, and all these
signals drive average loads, namely a fan-out of four minimum-sized inverters. Finally,
noise resilience is evaluated by checking the noise margin of the victims in the different
fabrics.
 
Figure 5.6. Coupling noise scenario (with GND shielding for SkyBridge) 
For the performance evaluation, two metrics are considered: throughput (in operations /
second) and delay (in nanoseconds). Throughput shows how many operations can be completed
per unit time at most; delay shows how soon one operation can be finished. Both are
important metrics. To evaluate delay, we analyze the circuits-under-test to find the
critical paths and set up the input test vectors accordingly. By applying all the input
combinations that cause transitions to propagate through the critical paths, we find the
maximum propagation delay from the input crossing 50% to the output crossing 50%. Again,
all the inputs to the circuits-under-test are generated by gates and the outputs are
loaded with four inverters. Once the critical delay is determined, the throughput of a
static circuit without pipelining follows directly as the reciprocal of the delay. For a
static circuit with pipelining, throughput is instead determined by the maximum stage
delay. The same holds for dynamic SkyBridge circuits, since they are pipelined by the
implicit latching between stages.
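The relation between delay and throughput described above can be written as a short
helper. The sketch treats a non-pipelined circuit as a single stage. The two latencies in
the usage example are the non-pipelined values reported in Table 5.1; the three-stage
split is hypothetical and only illustrates that the slowest stage sets the rate.

```python
def throughput_ops_per_s(stage_delays_s):
    """Throughput is the reciprocal of the slowest stage delay.

    Pass a single-element list for a non-pipelined circuit.
    """
    if not stage_delays_s or min(stage_delays_s) <= 0:
        raise ValueError("need at least one positive stage delay")
    return 1.0 / max(stage_delays_s)


if __name__ == "__main__":
    # Non-pipelined latencies from Table 5.1.
    print(f"SkyBridge-CMOS: {throughput_ops_per_s([501e-12]):.3g} ops/s")
    print(f"16nm CMOS:      {throughput_ops_per_s([201e-12]):.3g} ops/s")
    # Hypothetical three-stage pipeline: the 190 ps stage limits the rate.
    print(f"3-stage sketch: {throughput_ops_per_s([180e-12, 170e-12, 190e-12]):.3g} ops/s")
```

Pipelining leaves the total latency roughly unchanged or slightly worse, but raises
throughput because only the longest stage has to fit in one clock period.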
For the power evaluation, two metrics are again necessary: power consumption (in Watts)
and power efficiency (in operations / J). To measure the largest possible power
consumption, each circuit-under-test is operated at its highest frequency. We ensure that
the input test vectors are randomly generated and numerous enough to make the result
credible. Once the power consumption is known, power efficiency is obtained by computing
the performance per watt. As before, all the circuits are simulated with a fan-out of four
inverters.
5.1.5 CMOS Baseline Evaluation 
To serve as the baseline for the SkyBridge-CMOS evaluation results, identical
benchmarking needs to be done for CMOS technology. First, state-of-the-art CAD tools are
used to build the benchmark circuits in 45nm CMOS technology: the benchmark circuits are
expressed in Verilog HDL and synthesized by Synopsys Design Compiler to generate the
gate-level netlist; Cadence Encounter performs automatic placement and routing of the
gate-level design; Cadence Virtuoso translates the gate-level netlist into a
transistor-level netlist, performs layout-versus-schematic verification and generates the
physical-level netlist. Afterwards, scaling factors are used to project the results to
16nm CMOS technology [20] [21].
5.2 Evaluation Results 
In this section, following the methodology and assumptions introduced above, a
comprehensive evaluation of the SkyBridge-CMOS fabric is carried out. The results are
shown and analyzed to understand the advantages and disadvantages of enabling static
circuits in a 3-D integration concept similar to SkyBridge.
5.2.1 Noise Resilience Evaluation 
In the original SkyBridge fabric, the inner metal layers in coaxial routing structures are
always connected to ground instead of carrying signals. This provides ground shielding
that removes the coupling noise between the outer metal and inner silicon layers. The
reason ground shielding is essential can be seen from the noise evaluations of SkyBridge
circuits with and without the inner metal layer connected to ground. The evaluation is
done for the condition in which a floating victim signal at “1” is pulled down by falling
aggressors during the “HOLD” phase of the victim signal. The noise resilience results in
Figure 5.7 show that ground shielding by the inner metal layer is essential for the
circuits to function correctly [6].
 
Figure 5.7. Victim Signals for SkyBridge Noise Evaluations for Scenarios in Section 
However, this interconnection engineering method in the SkyBridge fabric attaches large
ground capacitances to signals, which degrades performance and increases power
consumption. It is therefore beneficial to avoid the ground shielding mechanism in
SkyBridge-CMOS while still keeping the circuits functioning correctly. Because static
circuits are more robust against noise, SkyBridge-CMOS circuits may be able to function
correctly without ground shielding. Consequently, we apply the noise resilience evaluation
methods to SkyBridge-CMOS in the experimental scenario with no ground shielding and obtain
the results shown in Figure 5.8.
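The trade-off behind ground shielding can be illustrated with the standard capacitive
divider for a floating victim node: an aggressor swing couples onto the victim in
proportion to the coupling capacitance relative to the total capacitance on the node. The
sketch below is a first-order model, not the HSPICE evaluation, and all capacitance and
voltage values are hypothetical.

```python
def coupled_noise(delta_v_aggressor: float, c_coupling: float, c_ground: float) -> float:
    """Voltage disturbance on a floating victim from one aggressor transition.

    First-order charge-sharing model: dV_victim = dV_aggressor * Cc / (Cc + Cg).
    """
    if c_coupling < 0 or c_ground < 0 or (c_coupling + c_ground) == 0:
        raise ValueError("capacitances must be non-negative and not both zero")
    return delta_v_aggressor * c_coupling / (c_coupling + c_ground)


if __name__ == "__main__":
    SWING = 0.8            # hypothetical aggressor swing in volts
    C_COUPLING = 20e-18    # hypothetical coupling capacitance (20 aF)
    C_GROUND_BARE = 20e-18       # hypothetical ground capacitance without shielding
    C_GROUND_SHIELDED = 180e-18  # hypothetical ground capacitance with shielding

    for label, c_ground in (("unshielded", C_GROUND_BARE), ("shielded", C_GROUND_SHIELDED)):
        noise = coupled_noise(SWING, C_COUPLING, c_ground)
        total = C_COUPLING + c_ground
        print(f"{label:>10}: noise = {noise:.2f} V, node capacitance = {total * 1e18:.0f} aF")
```

Adding ground capacitance suppresses the disturbance on a floating dynamic node, but the
same capacitance must be charged on every transition, which is the performance and power
cost noted above. A statically driven node does not depend on this divider in steady
state, because the conducting pull-up or pull-down network restores the output level.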
 
Figure 5.8. Noise Resilience Evaluation for SkyBridge with GND Shielding and 
SkyBridge-CMOS 
The evaluation results for the SkyBridge and SkyBridge-CMOS fabrics show that
SkyBridge-CMOS circuits are far more noise resilient. First, the output signals of
SkyBridge-CMOS circuits have a better noise margin than those of SkyBridge circuits.
Second, a SkyBridge-CMOS gate automatically restores its output voltage after a noise
event because of the static circuit style, whereas noise on SkyBridge gates persists.
Finally, SkyBridge-CMOS circuits incur no performance or power overhead from an additional
noise mitigation mechanism.
5.2.2 Initial Benchmarking 
For the evaluation of performance and power consumption, an initial benchmark with a 4-bit
array-based multiplier is implemented. The 4-bit multiplier design, shown in Figure 5.9,
performs the partial product additions with 4-bit carry-save adders followed by one 4-bit
carry-propagate adder. The multiplier is designed and built at the physical level in
SkyBridge-CMOS, SkyBridge and conventional CMOS to allow comparison between the fabrics.
The physical design of one carry-save adder cell in SkyBridge-CMOS is shown in
Figure 5.10.
 
Figure 5.9. 4-bit Array-based Multiplier 
 
Figure 5.10. Layout of One Cell in Multiplier 
Following the methodology introduced above, the initial benchmarking results for the
4-bit array multiplier are obtained and shown in Table 5.1. Compared with the SkyBridge
results, SkyBridge-CMOS shows better latency but lower throughput. The lower throughput is
due to the less deeply pipelined static circuit implementation in SkyBridge-CMOS. In terms
of power efficiency, SkyBridge-CMOS is much better than SkyBridge for the following
reasons: less complex clocking due to the static circuit style; no redundant switching
between consecutive zeros; and no noise mitigation overhead from ground shielding. The
better power efficiency and lower operating frequency of SkyBridge-CMOS also lead to a
large reduction in power consumption. The density of SkyBridge-CMOS is also good: at
1.09 μm^2 it is better than the dual-rail SkyBridge 4-bit multiplier (1.27 μm^2) and only
slightly worse than the single-rail SkyBridge result (1.06 μm^2).
Compared with conventional CMOS, we see a significant improvement in density from the 3-D
stacked transistors and routing structures. SkyBridge-CMOS also dominates in power
efficiency and power consumption because of the lower-power junctionless devices and the
smaller interconnection overhead in the denser designs. However, SkyBridge-CMOS loses
considerably in both performance metrics. The main reason for the performance loss is the
low-performance device in SkyBridge-CMOS. The intrinsic delay CV/I is an important
indicator of the performance of a technology [22]. In SkyBridge-CMOS, due to the device
structure and junctionless operation, the transistors have larger intrinsic delays.
Moreover, the device driving ability is also weaker, making the devices slower when
driving interconnections.
 
Table 5.1. Initial Benchmarking Results

                          Latency   Throughput      Power   Performance / Watt   Area
                          (ps)      (ops. / sec.)   (μW)    (ops. / J)           (μm^2)
SkyBridge (dual-rail)     524       5.09E+9         41.3    1.23E+14             1.27
SkyBridge (single-rail)   923       4.07E+9         27.9    1.46E+14             1.06
SkyBridge-CMOS            501       2E+9            10.1    1.98E+14             1.09
16nm CMOS                 201       4.97E+9         172     2.89E+13             50.1
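The performance-per-watt column can be cross-checked from the throughput and power
columns, and the headline ratios against the CMOS baseline follow from the same rows. The
sketch below uses only the values in Table 5.1; the ratios it prints are derived here
rather than quoted from the evaluation, and the dictionary layout is an illustrative
choice.

```python
# Rows of Table 5.1: throughput (ops/s), power (uW), reported ops/J, area (um^2).
TABLE_5_1 = {
    "SkyBridge (dual-rail)":   {"throughput": 5.09e9, "power_uw": 41.3, "ops_per_j": 1.23e14, "area_um2": 1.27},
    "SkyBridge (single-rail)": {"throughput": 4.07e9, "power_uw": 27.9, "ops_per_j": 1.46e14, "area_um2": 1.06},
    "SkyBridge-CMOS":          {"throughput": 2e9,    "power_uw": 10.1, "ops_per_j": 1.98e14, "area_um2": 1.09},
    "16nm CMOS":               {"throughput": 4.97e9, "power_uw": 172,  "ops_per_j": 2.89e13, "area_um2": 50.1},
}


def ops_per_joule(throughput_ops_s: float, power_uw: float) -> float:
    """Performance per watt: operations per second divided by watts."""
    return throughput_ops_s / (power_uw * 1e-6)


def ratio_vs(baseline: str, candidate: str, key: str) -> float:
    """How many times larger the baseline value is than the candidate value."""
    return TABLE_5_1[baseline][key] / TABLE_5_1[candidate][key]


if __name__ == "__main__":
    for name, row in TABLE_5_1.items():
        derived = ops_per_joule(row["throughput"], row["power_uw"])
        print(f"{name:<24} derived {derived:.3g} ops/J, reported {row['ops_per_j']:.3g}")
    print(f"area ratio, CMOS / SkyBridge-CMOS:  {ratio_vs('16nm CMOS', 'SkyBridge-CMOS', 'area_um2'):.1f}x")
    print(f"power ratio, CMOS / SkyBridge-CMOS: {ratio_vs('16nm CMOS', 'SkyBridge-CMOS', 'power_uw'):.1f}x")
```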
 
5.2.3 Performance Optimization 
As shown in the last section, SkyBridge-CMOS circuits are not advantageous in throughput
because of the lower device performance and the static circuit style. One solution is a
new device design for high-performance applications at the cost of higher power
consumption. However, circuit-level optimization by pipelining can also solve the
throughput problem, even without device engineering.
Using the flip-flop design introduced earlier, we can build a pipelined design by
inserting flip-flops between stages. For the 4-bit array multiplier, experiments with up
to three stages are implemented, with the results shown in Figure 5.11. After pipelining,
the throughput of the SkyBridge-CMOS benchmark is similar to the other results. At the
same time, the power and density of SkyBridge-CMOS are still far better than those of
CMOS. Compared with SkyBridge, the power efficiency lies between the two SkyBridge
implementations and the density is similar to that of the dual-rail implementation.
Consequently, we conclude that circuit-level performance optimization yields throughput
comparable with the other technologies while SkyBridge-CMOS still outperforms them in
power, area or noise resilience.
 
Figure 5.11. Pipelined Multiplier Evaluation Results 
5.2.4 Large Benchmarking: WISP-4 and 16-bit Multiplier 
With the experience in circuit design and optimization gained from the 4-bit multiplier,
further evaluations of performance, power and area are carried out with larger-scale
benchmarks. First, a 4-bit WIre Streaming Processor (WISP-4) benchmark is included as a
comprehensive exercise covering logic and arithmetic circuits, memories and inter-circuit
connections. Second, a 16-bit array-based multiplier benchmark is presented to evaluate
larger-scale circuits.
a) WISP-4 Benchmarking 
In the WISP-4 benchmark, the simple 4-bit WIre Streaming Processor is built at the
transistor level in the SkyBridge-CMOS fabric, functionally verified and evaluated against
the baselines in SkyBridge and conventional CMOS technologies. As shown in Figure 5.12,
the WISP-4 microprocessor uses a load-store architecture and consists of five functional
stages: Instruction Fetch, Instruction Decode, Register File, Arithmetic Logic Unit and
Write Back. We build the entire processor following the circuit design guidelines
presented in Chapter 3.
 
Figure 5.12. WISP-4 Architecture and Instruction Set 
During the first stage, the program counter generates the instruction address, which is 
then decoded into the word line of the instruction ROM. In WISP-4, one instruction consists 
of nine bits, as shown in Figure 5.13, and five kinds of operations are supported: 
move, move immediate, addition, multiplication and stall. In the second stage, the 
instruction is decoded into the word lines of the register files and control signals. After 
that, the Register File stage loads the operands, which are fed into the ALU stage for 
addition or multiplication. Finally, the results are written back 
into the register files. The block diagram of each stage is shown in Figure 5.13. 
 
Figure 5.13. Block Diagram of WISP-4 Stages 
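
The stage-by-stage flow described above can be sketched as a small behavioral model. The thesis states only that an instruction has nine bits and that five operations exist, so everything else here is an assumption for illustration: the field layout, the opcode values, the four-register file, the truncation of results to four bits, and the names `encode`, `decode` and `run`. The model also executes instructions one at a time and does not represent pipeline overlap.

```python
# Hypothetical 9-bit layout (an assumption, not the thesis's encoding):
#   bits [8:6] opcode, bits [5:4] destination register, bits [3:0] operand
#   (a 4-bit immediate for MOVI, otherwise a source register in bits [1:0]).
OPCODES = {"STALL": 0, "MOV": 1, "MOVI": 2, "ADD": 3, "MUL": 4}
NAMES = {code: name for name, code in OPCODES.items()}
MASK4 = 0xF  # 4-bit datapath; wider results are truncated (assumption)


def encode(op, dst=0, operand=0):
    """Pack one instruction into nine bits using the hypothetical layout."""
    return (OPCODES[op] << 6) | ((dst & 0x3) << 4) | (operand & 0xF)


def decode(word):
    """Instruction Decode: split a 9-bit word into (operation, dst, operand)."""
    if not 0 <= word < 512:
        raise ValueError("instruction must fit in nine bits")
    code = word >> 6
    if code not in NAMES:
        raise ValueError(f"unknown opcode {code}")
    return NAMES[code], (word >> 4) & 0x3, word & 0xF


def run(rom):
    """Execute a program sequentially and return the final register values."""
    regs = [0, 0, 0, 0]  # four registers assumed
    for pc in range(len(rom)):
        word = rom[pc]                           # Instruction Fetch
        op, dst, operand = decode(word)          # Instruction Decode
        a, b = regs[dst], regs[operand & 0x3]    # Register File read
        if op == "STALL":
            continue                             # no state change
        if op == "MOV":                          # ALU stage
            result = b
        elif op == "MOVI":
            result = operand
        elif op == "ADD":
            result = a + b
        else:                                    # MUL
            result = a * b
        regs[dst] = result & MASK4               # Write Back
    return regs


program = [
    encode("MOVI", 0, 3),   # r0 = 3
    encode("MOVI", 1, 2),   # r1 = 2
    encode("ADD", 0, 1),    # r0 = r0 + r1
    encode("MUL", 1, 0),    # r1 = r1 * r0
    encode("STALL"),
    encode("MOV", 2, 1),    # r2 = r1
]
print(run(program))  # [5, 10, 10, 0]
```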
The previous section showed that SkyBridge-CMOS circuits lose in 
throughput against the conventional CMOS and SkyBridge baselines when no deeper 
pipelining is applied. We therefore apply further circuit-level optimizations to the 
microprocessor, breaking the five functional blocks into more pipeline stages to improve 
the throughput. To achieve throughput comparable with the CMOS and SkyBridge 
baselines, a thirteen-stage design is necessary for the SkyBridge-CMOS WISP-4. 
Following the aforementioned methodology, we evaluate performance, power 
and area footprint; the results are shown in Table 5.2.  
 
 
 
 
Table 5.2. WISP-4 Benchmarking Results 
 
Technology      Throughput  Power  Performance / Watt  Area 
                (ops/sec)   (μW)   (ops/J)             (μm²) 
SkyBridge-CMOS  4.55E+9     186    2.45E+13            10.6 
SkyBridge       5.09E+9     301    1.69E+13            9.52 
16nm CMOS       4.31E+9     886    4.86E+12            289 
 
The results show, first of all, that SkyBridge-CMOS does better than conventional CMOS 
technology in all the metrics. These results indicate that as the circuit scale grows 
and more interconnections and memories are taken into account, SkyBridge-CMOS keeps its 
advantage over CMOS. Compared with the SkyBridge fabric, the power efficiency is about 
1.45X better (2.45E+13 versus 1.69E+13 ops/J), while the throughput is about 11% lower 
and the area footprint about 11% larger. 
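
The Performance / Watt column and the comparisons above follow directly from the throughput and power figures. The short sketch below recomputes them from the values in Table 5.2; the helper names `perf_per_watt` and `ratios_vs` are introduced here for illustration only.

```python
# Values transcribed from Table 5.2:
# (throughput in ops/sec, power in microwatts, area in square micrometers).
TABLE_5_2 = {
    "SkyBridge-CMOS": (4.55e9, 186.0, 10.6),
    "SkyBridge": (5.09e9, 301.0, 9.52),
    "16nm CMOS": (4.31e9, 886.0, 289.0),
}


def perf_per_watt(throughput_ops, power_uw):
    """Operations per joule: throughput divided by power in watts."""
    return throughput_ops / (power_uw * 1e-6)


def ratios_vs(baseline, target="SkyBridge-CMOS"):
    """Ratios of the target to a baseline; values above 1 favor the target."""
    t_tp, t_pw, t_area = TABLE_5_2[target]
    b_tp, b_pw, b_area = TABLE_5_2[baseline]
    return {
        "throughput": t_tp / b_tp,
        "power_efficiency": perf_per_watt(t_tp, t_pw) / perf_per_watt(b_tp, b_pw),
        "density": b_area / t_area,  # smaller area means higher density
    }


if __name__ == "__main__":
    for name, (tp, pw, _area) in TABLE_5_2.items():
        print(f"{name}: {perf_per_watt(tp, pw):.2e} ops/J")
    for baseline in ("16nm CMOS", "SkyBridge"):
        print(baseline, ratios_vs(baseline))
```

With these table values, SkyBridge-CMOS comes out at roughly 5X the power efficiency and 27X the density of 16nm CMOS.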
b) 16-bit Array Multiplier 
The 16-bit array-based multiplier is a larger benchmark circuit, in which 
the influence of interconnections is more visible. The design of the 16-bit 
array-based multiplier, as shown in Figure 5.14, consists of sixteen 16-bit carry-save 
adders that sum up all the partial products generated by the AND gates in each carry-save 
adder cell. The sum and carry bits from the last iteration of addition are added by a 16-bit 
2-level carry-lookahead adder at the end of the multiplication. At the physical level, 
we reuse the original CSA cell design presented in the 
4-bit multiplier benchmarking. In SkyBridge-CMOS, two versions of the 16-bit multiplier 
are implemented: one unpipelined and one with a 10-stage pipeline. 
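
The arithmetic performed by this structure can be checked with a behavioral model. The sketch below is word-level Python, not the transistor-level design: it keeps the sum and carry vectors as full-width integers instead of 16-bit rows, and it models the final carry-lookahead adder with ordinary integer addition. The function name `array_multiply` is an assumption for illustration.

```python
import random

WIDTH = 16


def array_multiply(a, b, width=WIDTH):
    """Multiply two unsigned width-bit integers using carry-save accumulation."""
    if not (0 <= a < (1 << width) and 0 <= b < (1 << width)):
        raise ValueError("operands must be unsigned width-bit integers")
    sum_vec, carry_vec = 0, 0
    for i in range(width):  # one carry-save adder row per multiplier bit
        # AND gates: the partial product is `a` gated by bit i of `b`,
        # weighted by 2**i.
        partial = (a << i) if (b >> i) & 1 else 0
        # Bitwise full adders: three inputs reduce to a sum vector and a
        # carry vector, with no carry propagating along the row.
        new_sum = sum_vec ^ carry_vec ^ partial
        new_carry = ((sum_vec & carry_vec)
                     | (sum_vec & partial)
                     | (carry_vec & partial)) << 1
        sum_vec, carry_vec = new_sum, new_carry
    # Final carry-propagate addition of the remaining sum and carry vectors
    # (a carry-lookahead adder in the design; plain addition in this model).
    return sum_vec + carry_vec


if __name__ == "__main__":
    for _ in range(1000):
        x, y = random.getrandbits(WIDTH), random.getrandbits(WIDTH)
        assert array_multiply(x, y) == x * y
    print("carry-save model matches integer multiplication")
```

Each row preserves the invariant that the sum and carry vectors together equal the running total of partial products, which is why only the last addition needs carry propagation.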
 
 
We built the complete designs in the different technologies and simulated them in HSPICE; 
the results are shown in Table 5.3. For the unpipelined version, we observe 
significant advantages in power and area, which again implies that the 
SkyBridge-CMOS fabric has intrinsic benefits in power consumption and 
area. For the pipelined version, SkyBridge-CMOS loses some of its area benefit 
relative to the other technologies. The area gap narrows because the flip-flop 
count needed for pipelining increases sharply as the operand bit width goes up. On the 
other hand, the pipelined version achieves roughly 3X the throughput of 
conventional CMOS technology (4.57E+9 versus 1.4E+9 ops/sec). Against the SkyBridge 
baseline, SkyBridge-CMOS shows advantages in throughput and power efficiency, while its 
density is worse. 
  
 
Figure 5.14. 16-bit Multiplier Design 
 
 
 
 
Table 5.3. 16-bit Multiplier Benchmarking Results 
 
Technology                           Delay  Throughput  Power  Performance / Watt  Area 
                                     (ns)   (ops/sec)   (μW)   (ops/J)             (μm²) 
SkyBridge-CMOS (Unpipelined)         1.72   5.81E+8     115    5.05E+12            14.5 
SkyBridge-CMOS (10-stage Pipelined)  2.19   4.57E+9     1290   3.55E+12            36.3 
16nm CMOS                            0.713  1.4E+9      2580   5.42E+11            721 
SkyBridge (Dual-rail)                1.79   3.73E+9     1020   3.13E+12            22.3 
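
The delay and throughput columns of Table 5.3 for the two SkyBridge-CMOS designs are related through the pipeline depth: the clock period is the reciprocal of the throughput, and the delay divided by that period gives the number of stages. The sketch below checks this relation on the table values; the helper name `implied_stages` is an assumption for illustration, and only the SkyBridge-CMOS rows are checked because their stage counts are stated in the text.

```python
def implied_stages(delay_ns, throughput_ops):
    """Pipeline depth implied by latency and throughput: delay / clock period."""
    period_ns = 1e9 / throughput_ops
    return delay_ns / period_ns


# Values transcribed from Table 5.3: (delay in ns, throughput in ops/sec, area in um^2).
UNPIPELINED = (1.72, 5.81e8, 14.5)
PIPELINED = (2.19, 4.57e9, 36.3)
CMOS = (0.713, 1.4e9, 721.0)

if __name__ == "__main__":
    print(f"unpipelined: {implied_stages(*UNPIPELINED[:2]):.2f} stage(s)")
    print(f"pipelined:   {implied_stages(*PIPELINED[:2]):.2f} stage(s)")
    print(f"throughput vs CMOS: {PIPELINED[1] / CMOS[1]:.2f}X")
    print(f"area vs CMOS: {CMOS[2] / PIPELINED[2]:.1f}X smaller")
```

The implied depths come out close to 1 and 10, consistent with the unpipelined and 10-stage designs described in the text.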
 
  
 
 
CHAPTER 6 
6. CONCLUSION 
In this thesis, a new fabric named SkyBridge-CMOS has been proposed and 
evaluated. In response to the challenges of dynamic circuit implementations in the 
SkyBridge fabric, this new fabric extends static circuit implementation to 
3-D integration similar to that of the original SkyBridge fabric. 
During this research, innovations at the device and material level have been proposed. 
Molecular technology helps to achieve nanowires with well-controlled regions of different 
doping. Corresponding p-type core components, including p-type transistors and contacts, 
are designed and engineered. The SkyBridge-Interlayer-Connection solves the connection 
problem between different doping regions. All these new designs contribute to the 
realization of various kinds of static circuits, including logic gates, arithmetic circuits, 
flip-flops, memories and microprocessors. Finally, a comprehensive evaluation 
methodology that takes information from all levels into account is developed and 
followed to obtain evaluation results for noise resilience, power, area and performance. 
Several benchmarks have been built for the evaluation, including 4-bit and 16-bit 
array-based multipliers and the WISP-4 microprocessor. 
We have analyzed all the evaluation results to understand the 
benefits and shortcomings of the SkyBridge-CMOS fabric. Compared with CMOS 
technology, we see benefits in all aspects, including performance, power and area, for 
large-scale circuits. Compared with the original SkyBridge fabric, on one hand 
SkyBridge-CMOS shows better noise resilience due to its static circuit 
implementation; on the other hand, it has intrinsic benefits in power 
consumption and area, which allow further performance optimization at the 
cost of some of the benefits in resource consumption. 
  
 55 
 
 
 
BIBLIOGRAPHY 
 
[1]  Puri, R. and Kung, D. The dawn of 22nm era: Design and CAD challenges. 
Proceedings of 23rd International Conference on VLSI Design, pp. 429-433, 2010.  
[2]  Warnock, J. Circuit Design Challenges at the 14nm Technology Node. in Design 
Automation Conference (DAC), New York, 2011.  
[3]  Lee, C. W. Junctionless multigate field-effect transistor. Applied Physics Letters, 
vol. 94, no. 5, pp. 053511 - 053511-2, 2009.  
[4]  Kim, N. et al. Leakage current: Moore's law meets static power. Computer, pp. 68 - 
75, 2003.  
[5]  Muller, M. Embedded Processing at the Heart of Life and Style. in Solid-State 
Circuits Conference. ISSCC 2008. Digest of Technical Papers. IEEE International, 
San Francisco, 2008.  
[6]  Rahman, M., Khasanvis, S., Shi, J., Li, M., and Moritz C. A. Skybridge: 3-D 
Integrated Circuit Technology Alternative to CMOS. 
http://arxiv.org/abs/1404.0607, 2014. 
[7]  Batude, P. et al. Advances in 3D CMOS sequential integration. Electron Devices 
Meeting (IEDM), pp. 1-4, 2009.  
[8]  Batude, P. et al. Demonstration of low temperature 3-D sequential FDSOI 
integration down to 50 nm gate length. VLSI Technology (VLSIT), Symposium on, 
 56 
 
 
pp. 158 - 159, 2011.  
[9]  Yang, B. et al. Vertical Silicon-Nanowire Formation and Gate-All-Around 
MOSFET. IEEE Electron Device Letters, vol. 29, no. 7, pp. 791-794, 2008.  
[10]  Sentaurus TCAD, http://www.synopsys.com/tools/tcad/Pages/default.aspx, 
Synopsys, Inc., 2014.  
[11]  Jiang, P., Lai, Y., and Chen, J. S. Dependence of crystal structure and work 
function of WNx films on the nitrogen content. Applied Physics Letters, vol. 89, no. 
12, pp. 122107-122107-3, 2006.  
[12]  Nowak, W., Keukelaar, R., Wang, W., and Nyaiesh A. Diffusion of nickel through 
titanium nitride films. Journal of VacuumScience & Technology A: Vacuum, 
Surfaces, and Films, vol. 3, no. 6, p. 2242 –2245, 1985.  
[13]  Weste, N., and Harris D. CMOS VLSI Design: A Circuits and Systems Perspective, 
Addison Wesley, 2011.  
[14]  Lohstroh, J., Seevinck, E., and de Groot, J. Worst-case static noise margin criteria 
for logic circuits and their mathematical equivalence. Solid-State Circuits, IEEE 
Journal of, vol. 18, no. 6, pp. 803-807, 1983.  
[15]  HSPICE, http://www.synopsys.com/Tools/Verification/AMSVerification 
/CircuitSimulation/HSPICE/Pages/default.aspx," Synopsys, Inc., 2014. 
[16]  Bhavnagarwala, A. et al. Fluctuation limits & scaling opportunities for CMOS 
SRAM cells. in Electron Devices Meeting, Washington, DC, 2005.  
[17]  Narayanan, P., Kina, J., Panchapakeshan, P., Chui, C. O., and Moritz C. A. 
 57 
 
 
Integrated Device-Fabric Explorations and Noise Mitigation in Nanoscale Fabrics. 
IEEE Transactions on Nanotechnology, vol. 11, pp. 687-700, 2012.  
[18]  DataFit, http://www.oakdaleengr.com/datafit.htm, Oakdale Engineering, 2013 
[19]  PTM R-C Interconnect Models. http://ptm.asu.edu, Arizona State University, 2012  
[20]  Kim, D. H., Kim, S., and Lim, S. K. Impact of Nano-scale Through-Silicon Vias on 
the Quality of Today and Future 3D IC Designs. ACM/IEEE International 
Workshop on System Level Interconnect Prediction, pp. 1-8, 2011.  
[21]  Yang, K., Kim, D. H., and Lim, S.-K. Design quality tradeoff studies for 3D ICs 
built with nano-scale TSVs and devices. 13th International Symposium on Quality 
Electronic Design, pp. 740-746, 2012.  
[22]  Chau, R. "Benchmarking nanotechnology for high-performance and low-power 
logic transistor applications," Nanotechnology, IEEE Transactions on, vol. 4, no. 2, 
pp. 153-158, 2005.  
 
 
