University of Massachusetts Amherst

ScholarWorks@UMass Amherst
Open Access Dissertations
2-2013

NASICs: A ‘Fabric-Centric’ Approach Towards Integrated
Nanosystems
Pritish Narayanan
University of Massachusetts Amherst

Part of the Electrical and Computer Engineering Commons

Recommended Citation
Narayanan, Pritish, "NASICs: A ‘Fabric-Centric’ Approach Towards Integrated Nanosystems" (2013). Open
Access Dissertations. 698.
https://doi.org/10.7275/gs30-qz30 https://scholarworks.umass.edu/open_access_dissertations/698


NASICS: A ‘FABRIC-CENTRIC’ APPROACH TOWARDS
INTEGRATED NANOSYSTEMS

A Thesis Presented
by
PRITISH NARAYANAN

Submitted to the Graduate School of the
University of Massachusetts Amherst in partial fulfillment
of the requirements for the degree of
DOCTOR OF PHILOSOPHY
February 2013
Electrical and Computer Engineering

© Copyright by Pritish Narayanan 2013
All Rights Reserved

Approved as to style and content by:

Csaba Andras Moritz, Chair

Israel Koren, Member

C. Mani Krishna, Member

Chi On Chui, Member

Christopher V. Hollot, Department Chair
Electrical and Computer Engineering

ACKNOWLEDGEMENTS

I am forever indebted to my advisor Prof. Csaba Andras Moritz for his constant
encouragement, guidance and mentorship. This dissertation would simply not have
been possible without his leadership, vision and direction which were instrumental
in developing my skills as a researcher. I am grateful to my dissertation committee
members Prof. Chui, Prof. Koren and Prof. Krishna for their valuable feedback
and suggestions throughout the course of my PhD. I have benefited immensely from
working with several creative, intelligent and dedicated colleagues. Foremost is Dr.
Teng Wang, who was a guide and mentor during my initial years. I would also
like to thank, in no particular order, Pavan Panchapakeshan, Priyamvada Vijayakumar, Prasad Shabadi, Prachi Joshi, Mostafizur Rahman, Santosh Khasanvis and Md.
Muwyid Khan, who have been not just great colleagues, but also great friends who
enriched my years of graduate study. I am thankful to Dr. John Nicholson, whose
diligent efforts keep the CHM cleanroom functional and who is always ready with
suggestions and advice on experimental work; Jorge Kina, who was a key collaborator on nanowire device and manufacturing aspects; and Stefan Dickert and Huajie
Ke, who taught me Electron-Beam Lithography. Finally, I would like to express my
sincere gratitude to my parents Dr. Rama Rajaram and Dr. A. Rajaram, my brother
Vageeswar, my dear wife Rachita and my entire family for their continued love and
support through all these years.


ABSTRACT

NASICS: A ‘FABRIC-CENTRIC’ APPROACH TOWARDS
INTEGRATED NANOSYSTEMS
FEBRUARY 2013
PRITISH NARAYANAN
B.E. (Hons) Electrical and Electronics Engineering,
BIRLA INSTITUTE OF TECHNOLOGY AND SCIENCE, PILANI, INDIA
M.Sc. (Hons) Chemistry,
BIRLA INSTITUTE OF TECHNOLOGY AND SCIENCE, PILANI, INDIA
Ph.D., UNIVERSITY OF MASSACHUSETTS AMHERST
Directed by: Professor Csaba Andras Moritz

This dissertation addresses the fundamental problem of how to build computing systems for the nanoscale. With CMOS reaching fundamental limits, emerging
nanomaterials such as semiconductor nanowires, carbon nanotubes, graphene etc.
have been proposed as promising alternatives. However, nanoelectronics research has
largely focused on a ‘device-first’ mindset without adequately addressing system-level
capabilities, challenges for integration and scalable assembly.
In this dissertation, we propose to develop an integrated nano-fabric (broadly
defined as nanostructures/devices in conjunction with paradigms for assembly, interconnection and circuit styles), as opposed to approaches that focus on MOSFET replacement devices as the ultimate goal. In the ‘fabric-centric’ mindset, design choices
at individual levels are made compatible with the fabric as a whole and minimize
challenges for nanomanufacturing while achieving system-level benefits vs. scaled
CMOS.
We present semiconductor nanowire based nano-fabrics incorporating these fabric-centric principles, called NASICs and N³ASICs, and discuss how we have taken them
from initial design to experimental prototype. Manufacturing challenges are mitigated
through careful design choices at multiple levels of abstraction. Regular fabrics with
limited customization mitigate overlay alignment requirements. Cross-nanowire FET
devices and interconnect are assembled together as part of the uniform regular fabric
without the need for arbitrary fine-grain interconnection at the nanoscale, routing
or device sizing. Unconventional circuit styles are devised that are compatible with
regular fabric layouts and eliminate the requirement for using complementary devices.
Core fabric concepts are introduced and validated. Detailed analyses on device-circuit co-design and optimization, cascading, noise and parameter variation are presented. Benchmarking of nanowire processor designs vs. equivalent scaled 16nm
CMOS shows up to 22X area and 30X power benefits at comparable performance, and
with overlay precision that is achievable with present-day technology. Building on the
extensive manufacturing-friendly fabric framework, we present recent experimental efforts and key milestones that have been attained towards realizing a proof-of-concept
prototype at dimensions of 30nm and below.


TABLE OF CONTENTS

ACKNOWLEDGEMENTS
ABSTRACT
LIST OF TABLES
LIST OF FIGURES

CHAPTER

1. INTRODUCTION

2. NANOSCALE APPLICATION SPECIFIC INTEGRATED CIRCUITS (NASICS): AN OVERVIEW
   2.1 NASIC Building Blocks: Semiconductor Nanowires and Crossed Nanowire Field Effect Transistors
   2.2 NASIC Circuits
   2.3 WISP-0 Architecture
   2.4 Chapter Summary

3. CMOS CONTROL ENABLED SINGLE-TYPE FET NASIC
   3.1 Modifications to the Control Scheme
   3.2 NASIC Logic Implementation with One Type of Devices
       3.2.1 WISP-0 Program Counter
       3.2.2 WISP-0 Arithmetic Logic Unit
   3.3 Cascading and Noise Considerations for Single-Type FET Designs
   3.4 Chapter Summary

4. INTEGRATED DEVICE-FABRIC EXPLORATION AND NOISE MITIGATION IN NANOSCALE FABRICS
   4.1 Methodology for Integrated Device-Fabric Exploration
   4.2 Physical Layer and Device Explorations
       4.2.1 Devices Explored
       4.2.2 Methodology
       4.2.3 Simulation Results
       4.2.4 Device Comparisons
   4.3 Circuit level simulation and noise evaluation in-fabric
       4.3.1 Sequencing schemes for the NASIC fabric
       4.3.2 Circuit Simulation and Analysis
   4.4 Noise Resilient Sequencing Scheme for the NASIC Fabric
   4.5 Discussion
       4.5.1 Performance Optimization and Evaluation
       4.5.2 Impact of Power Supply Droop on NASIC Fabric Functionality
       4.5.3 Manufacturing Considerations
   4.6 Chapter Summary

5. PARAMETER VARIATION IN NANOSCALE COMPUTING FABRICS: BOTTOM-UP INTEGRATED EXPLORATION
   5.1 Methodology for Addressing Variability in Nanoscale Systems
   5.2 Device Parameter Variability
   5.3 Circuit-level Simulations
   5.4 Architectural Simulations
   5.5 Variability Impact on xnwFET Devices
   5.6 Variability Impact on Circuit Level Delay and System Performance
       5.6.1 Circuit Level Delay Characterization
       5.6.2 System Level Performance
   5.7 Chapter Summary

6. MANUFACTURING PATHWAYS AND ASSOCIATED CHALLENGES FOR NASICS
   6.1 Fabric Choices Targeting Manufacturability
   6.2 Manufacturing Pathway
       6.2.1 Nanowire Growth and Alignment
       6.2.2 Functionalization
   6.3 Chapter Summary

7. N³ASICS: DESIGNING NANOFABRICS WITH REDUCED MANUFACTURING REQUIREMENTS
   7.1 Physical Fabric Vision
   7.2 Assembly Sequence
   7.3 Overlay Requirement
   7.4 N³ASICs Device, Circuits and Architectural Exploration
       7.4.1 Device Simulations
       7.4.2 Circuit and System Evaluation
   7.5 Reducing Doping Requirements with Metal-Gated Junctionless xnwFETs
   7.6 Assembly Sequence for N³ASICs with MJNFETs
   7.7 Chapter Summary

8. EXPERIMENTAL PROTOTYPE DEVELOPMENT
   8.1 Fabrication - Preliminaries
   8.2 Ion Implantation of SOI Wafers
   8.3 Experimental Process Flow
       8.3.1 Electron Beam Lithography
       8.3.2 Reactive Ion Etch
       8.3.3 Oxide Deposition
       8.3.4 I-V Measurements
   8.4 Experimental Results
   8.5 Chapter Summary

9. CONCLUSIONS

APPENDIX: LIST OF PUBLICATIONS

BIBLIOGRAPHY

LIST OF TABLES

Table

4.1 Parameters used for xnwFET device simulations
4.2 xnwFET Device simulation results
5.1 xnwFET Device parameters and extent of variation
5.2 Impact of physical parameter variation on device on-current
7.1 Device simulation parameters for 2C-xnwFET
7.2 2C-xnwFET Device simulation results
7.3 Comparison of key system-level metrics for WISP-0

LIST OF FIGURES

Figure

2.1 NASIC 1-bit Full adder using AND-OR 2-level logic
2.2 Dynamic control scheme for NASIC 2-level AND-OR logic
2.3 WISP-0 Nanoprocessor layout
3.1 NASIC circuit with two dynamic NAND stages - Output of one stage is cascaded to the next.
3.2 Control scheme for NAND-NAND NASIC logic with implicit latching
3.3 NASIC 1-bit full adder with NAND-NAND logic (a) 3-D physical fabric view with nanowire grid, xnwFETs, oxide and peripheral control. (b) circuit-equivalent implementation
3.4 WISP-0 Program Counter implemented using NAND-NAND logic
3.5 WISP-0 Arithmetic Logic Unit implemented using NAND-NAND logic
4.1 Methodology for Integrated Device-Fabric Exploration and Noise Evaluation
4.2 Three xnwFET devices simulated (a) Si gate xnwFET (b) NiSi gate xnwFET (c) Omega-gated xnwFET.
4.3 Device simulation outputs: (a) ID − VGS curves (b) CG − VGS curves
4.4 Circuit simulations of single NASIC dynamic stage
4.5 Three-phase timing scheme for the NASIC fabric
4.6 Test circuit used for cascading evaluations - output integrity of stages 2 and 3 are affected by switching events in their neighborhood
4.7 Cascading evaluations for NiSi 0.2 Device - Due to poor driving voltage at the input transistor and slow device, output node do31 does not properly discharge leading to loss of signal integrity.
4.8 Cascading evaluations for Omega 0.2 Device - Despite poor driving voltage, signal integrity is preserved owing to faster device.
4.9 Cascading evaluations for NiSi 0.2 and Omega 0.2 devices using 3-phase sequencing scheme - Logic ‘1’ glitching effects are reduced in this scheme, and NiSi 0.2 device shows expected behavior. However, logic ‘0’ glitching is critical for faster devices. Upward glitch on do21 during eva3 causes loss of signal integrity at do31 node.
4.10 Noise resilient 4-phase sequencing scheme for the NASIC fabric - Additional hold phase (H2) inserted to separate evaluation from noise event. Green arrow shows do21 glitches only after eva3 has completed. Signals repeat every four stages.
4.11 Cascading evaluations for NiSi (solid) and Omega (Dotted) devices using the noise resilient 4-phase control scheme - Results show signal integrity and sufficient noise margins for logic ‘1’ glitches for both devices
4.12 Capacitance engineering of input gates: adding gate capacitance at outputs of Stage 1 increases gate-drive voltages of Stage 2 xnwFETs.
4.13 Test circuit for performance evaluation as a function of fan-in - The time taken to discharge do21 through a xnwFET stack consisting of N inputs is measured
4.14 Graph showing frequency and drive voltage improvements against capacitive loading for fan-in 4 NASIC dynamic circuits - 5X improvement in operating frequency compared to no cap-loading is demonstrated.
4.15 Maximum operating frequency with and without capacitance loading vs. fan-in: a consistent 4.5X to 6X improvement in performance is seen for all fan-ins.
4.16 Impact of VDD drooping in conjunction with internal noise on cascaded NASIC fabrics - Slower NiSi devices (left) do not discharge effectively and signal integrity is lost for a 20% droop in VDD. Circuits using faster Omega 0.2 devices (right) are resilient to VDD drooping.
4.17 Front view of the xnwFET during the formation of the source and drain underlap
5.1 Methodology for evaluation of parameter variation integrating device, circuit and architectural levels
5.2 Crossed Nanowire Field Effect Transistor (xnwFET) structure
5.3 N-input dynamic NAND circuits characterized for delay distribution
5.4 Delay distributions for physical parameters with maximum impact on on-current for (a) 15 input and (b) 30 input NASIC dynamic NAND gates
5.5 Delay distribution for 15 input gate with all parameters simultaneously varied: Nominal value is 174ps
5.6 Distribution of WISP-0 operating frequencies showing impact of parameter variations: (a) With no built-in fault tolerance incorporated, 67% of chips operate at frequency below nominal due to variations in device parameters (b) PDF for 2-way and 3-way redundancy schemes, showing a majority of samples operating at better-than-nominal frequencies (normalized frequency > 1).
6.1 Scalable manufacturing pathway for NASICs
6.2 Manufacturing pathway for NASICs with 3-step nanowire transfer
7.1 Nano-CMOS integrated N³ASICs fabric
7.2 CMOS Design rules applied to N³ASICs
7.3 Assembly Sequence for N³ASICs fabric: A) Patterned Nanowires B) Creation of Lithographic contacts and dynamic control rails C) Metal gate deposition followed by self-aligned ion-implantation to define high-conductivity interconnect D) Metal 1 vias and interconnects, and E) Creation of Lithographic contacts and dynamic control rails.
7.4 Mask overlay limited Yield vs. Overlay for 3D integrated fabric
7.5 3D structure of N³ASICs device (2C-xnwFET)
7.6 Metal-gated Junctionless Nanowire FET (MJNFET) A) Structure, B) Simulated IDS − VGS (log) plot C) Simulated IDS − VDS curve for different VGS showing linear and saturation regimes of operation
7.7 Assembly Sequence for N³ASICs fabric with MJNFETs: A) SOI wafer with wafer-wide top Silicon doping, B) Direct patterning of nanowires, C) MJNFET creation by gate oxide + gate metal deposition, D) Power rail and via placement, E) Metal1 for gate inputs and control signals, F) M2 for routing.
8.1 Simulations for Ion Implantation A) SRIM simulation plot showing ion distribution in SOI wafer for 28keV implant B) Sentaurus process simulation plot showing ion distribution in SOI wafer before and after thermal annealing at 1000°C.
8.2 End-to-end prototyping process flow for N³ASICs fabric
8.3 AFM Images post-RIE A) Successful thinning of top Silicon to ~15nm with less than 1nm RMS deviation in surface roughness B) Successful pattern transfer to Si followed by Nickel removal, showing anisotropic profile and smooth top surface.
8.4 Experimental MJNFET Device Characterization: A) Fabricated Device Structure and B) IDS − VGS characteristics for normally off MJNFETs.
8.5 Fabricated N³ASICs Tile

CHAPTER 1
INTRODUCTION

As MOSFET critical dimensions have progressed through the deep-submicron to
nanoscale regimes, CMOS technology is facing new challenges in both device
characteristics and manufacturability, and may in the near future reach fundamental barriers. With aggressive scaling, it becomes increasingly difficult to achieve
the manufacturing precision required to keep variability and defect rates within acceptable limits. For example, ITRS 2011 [26] projects 3σ = ±1.3nm Critical Dimension (CD) control and 3σ = ±3nm overlay control for 16nm CMOS, precisions for
which manufacturing solutions are not known. Optical system complexity and design rules for manufacturing are becoming increasingly intractable. These extremely
stringent manufacturing requirements are a consequence of the general fabric architecture of CMOS, which requires a high degree of customization, including arbitrary
placement and complex interconnectivity schemes. Furthermore, at the device level,
challenges include the need for complementary devices with precise sizing and ultra-sharp doping profiles (1nm-2nm lateral doping abruptness [10]) for source-channel
and drain-channel junctions. Arbitrary layouts with complex interconnection also
imply increased power consumption, because of the need to switch large interconnect
load capacitances.
Emerging nanomaterials such as semiconductor nanowires [11, 12], carbon nanotubes [23, 24], graphene [22, 42], molecular devices [7], spintronic/spin-wave devices [64] etc. have been suggested as alternatives to conventional CMOS technology.
However, research in the field of emerging electronics has largely concentrated on a
‘device-first’ approach, i.e. discovering and optimizing MOSFET replacement devices
with the assumption that CMOS circuit styles and paradigms for interconnection to
build logic gates/integrated systems will be preserved intact. However, at the present
time, there are no immediately obvious replacement devices/state-variables available [15]. Furthermore, the device-first approach does not address the aforementioned
challenges. This implies that given several levels of logic and routing, system-level
capabilities may not necessarily scale in proportion to individual device performance.
By contrast, at the nanoscale, it would be desirable to minimize lithographic customization requirements on devices and layouts (e.g. eliminating arbitrary sizing and
placement of devices and arbitrary routing between them) by moving towards simple device structures and regular layouts such as parallel arrays and grids that are
more easily realizable with both unconventional and photolithography-based manufacturing approaches. Furthermore, ultra-dense nanosystems could be realized if
both devices and local interconnects could be formed at the same time as part of
these regular layouts, as opposed to requiring fine-grain arbitrary interconnections of
individual devices at nanoscale dimensions. This integrated approach across multiple
design levels including manufacturing, devices, circuits and architecture is called the
‘fabric-centric’ mindset. This mindset is anchored in a belief that, at the nanoscale, developing a complete fabric framework, rather than focusing on devices alone, is how
significant progress could be made. The twin objectives of this fabric-centric approach
are reducing manufacturing requirements while concurrently improving system-level
capabilities.
The goal of this dissertation is to develop integrated nanoscale fabrics based on
the principles of the fabric-centric approach, validate core concepts at all fabric design
levels and explore manufacturing solutions towards demonstrating a proof-of-concept
prototype. Nanoscale Application Specific Integrated Circuits (NASICs) is a semiconductor nanowire-based fabric that implements logic and memory functionalities on
regular 2-D semiconductor nanowire grids with crossed-nanowire field effect transistors (xnwFETs) at certain crosspoints. Previous work on NASICs has focused more
on architectural and defect tolerance aspects. A simple general purpose processor
called WIre Streaming Processor version-0 (WISP-0) [35, 56] was built on the NASIC fabric. Built-in defect tolerance schemes [35, 57] were also developed to provide
resilience against high levels of manufacturing defects. In this dissertation, we discuss key physical layer aspects, device-circuit co-design and experimental directions
towards realizing a nanowire fabric. The main technical contributions are:
• We show how manufacturing and device-level requirements can be significantly
alleviated through design optimizations including novel circuit style and control
schemes.
• We develop an integrated device-circuit methodology for evaluating nanodevice
behavior in-fabric and validating core physical fabric concepts. This methodology is generic and ties physical layer assumptions and accurate 3-D physics
based simulations of device structures with extensive circuit-level simulation and
validation. Using the device-circuit methodology, we evaluate noise, cascading
and functionality aspects that are apparent only when considering interacting
devices and associated control schemes.
• We extend the generic integrated device-fabric methodology to address parameter variability and present evaluations at device, circuit and system levels.
• We discuss scalable manufacturing pathways for NASICs and identify remaining
manufacturability challenges for integration.
• We present N³ASICs, a new 3D integrated nanoscale computing fabric which
combines unconventional manufacturing with CMOS design rules and can be
assembled with no special manufacturing requirements. We also present benchmarking of NASICs and N³ASICs WISP-0 processor designs vs. equivalent
16nm scaled CMOS.
• Building on this extensive theoretical framework and manufacturing-friendly
fabric design, we present recent progress towards fabrication and experimental
demonstration of N³ASICs in cleanroom settings.
The rest of this dissertation is organized as follows: Chapter 2 presents
a brief overview of the NASIC fabric. Chapter 3 discusses single-type FET NASICs.
Chapter 4 describes integrated device-fabric methodologies for validating cascading,
noise mitigation and functionality. The methodology for handling device parameter variation and detailed evaluations at all fabric levels are presented in Chapter 5.
Chapter 6 discusses scalable manufacturing pathways. Chapter 7 presents a new 3-D
integrated fabric with no special manufacturing constraints. Chapter 8 describes recent experimental efforts and demonstrations targeting a proof-of-concept prototype.
Chapter 9 concludes this dissertation.


CHAPTER 2
NANOSCALE APPLICATION SPECIFIC INTEGRATED
CIRCUITS (NASICS): AN OVERVIEW

This chapter provides a brief overview of NASICs. NASICs are targeted as a
CMOS replacement technology for general purpose computing as well as specialized applications such as image processing. NASICs rely on regular 2-D grids of
semiconductor nanowires, motivated by the need for simpler manufacturability at
the nanoscale without arbitrary layouts or extensive nanoscale customization. Logic
and interconnect are achieved as part of the grid itself, without the need for arbitrary connections at nanoscale dimensions post device-formation. Devices and circuit
styles amenable to implementation on these grids are used. Computational streaming/cascading is supported from external reliable circuitry. Built-in fault tolerance is
used to provide resilience against permanent manufacturing defects, transient faults
and parameter variations. More details are presented below.

2.1 NASIC Building Blocks: Semiconductor Nanowires and Crossed Nanowire Field Effect Transistors

Semiconductor nanowires are nanostructures made of semiconductor material with
diameters typically between 2nm and 100nm. Nanowires can be grown up to a
few microns in length and have been shown with a variety of materials including
Silicon [11, 31], Germanium [17, 61], Zinc Oxide [29], Indium Phosphide [13] and
Indium Antimonide [28]. By using non-conventional or self-assembly techniques [16,
21, 54], it may be possible to assemble these materials into regular arrays and grids.


The NASIC fabric is built on these types of 2-D semiconductor nanowire grids
with crossed nanowire field-effect transistors (xnwFETs) at certain crosspoints. The
channel of a xnwFET is aligned along one NW while the perpendicular NW above it
acts as the gate. Typical xnwFET behavior has been reported for silicon NWs in [20].

2.2

NASIC Circuits

Fig. 2.1 shows an example of a NASIC circuit that implements a 1-bit full adder.
This consists of a semiconductor nanowire grid with peripheral microwires (MWs)
that carry VDD, VSS and dynamic control signals. Both n- and p-type xnwFETs
are shown at certain crosspoints in the diagram. Channels of xnwFETs are oriented
horizontally on the left plane, and vertically on the right. Inputs are received from
vertical nanowires in the left plane. These act as gates to n-type horizontal nanowire
FETs implementing an AND stage of a two-level logic. The outputs of the horizontal
AND plane act as gates to p-type xnwFETs whose channels are aligned in the vertical
direction (right OR plane). Multiple such NASIC tiles are cascaded together to form
more complex circuitry such as processors [56]. In keeping with the fabric-centric
mindset, all crossed nanowire devices used in one logic stage of the circuit are identical
with no arbitrary doping or sizing requirement. Customization of the grid is limited
to defining the positions of transistors and interconnect, which determines the logic
function implemented without arbitrary placement or routing.
NASICs use dynamic signals driven from external reliable CMOS circuitry. Control signals coordinate the flow of data through NASIC tiles: horizontal and vertical
signals are different, supporting cascading. Fig. 2.2 shows a typical NASIC control
scheme but other schemes are also possible. Horizontal nanowire outputs are initially
discharged to logic ‘0’ by asserting hdis. hdis is then switched off and heva is asserted
to evaluate inputs. If all inputs are ‘1’ an output of ‘1’ is achieved, realizing AND
logic. In the next phase, both hdis and heva are switched off, and the horizontal

Figure 2.1. NASIC 1-bit Full adder using AND-OR 2-level logic

Figure 2.2. Dynamic control scheme for NASIC 2-level AND-OR logic

nanowires are in hold phase, during which time vertical nanowires are precharged
(vpre is asserted) to ‘1’. veva is then asserted and outputs from the tile are evaluated. The staggered evaluation of dynamic stages with inserted hold phases is critical
to NASIC operation. The hold phase enables implicit latching of the nanowire output
after evaluation without the need for expensive flip-flops, and is essential for cascading
multiple nanowire stages [37].
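
The control sequence described above can be illustrated with a small behavioral sketch. It is not a circuit simulation. It assumes ideal switches, models each horizontal (AND plane) nanowire as a series n-type network that is predischarged and then pulled to ‘1’ only when all of its gate inputs are ‘1’, and models each vertical (OR plane) nanowire as a series p-type network that is precharged and then pulled to ‘0’ only when all of its gate inputs are ‘0’. The series-network assumption, the availability of complemented inputs, and all function names are illustrative and are not taken from Fig. 2.1.

```python
from itertools import product


def and_plane_wire(inputs):
    """Horizontal nanowire: predischarge (hdis), evaluate (heva), then hold."""
    out = 0                  # hdis asserted: output predischarged to logic '0'
    if all(inputs):          # heva asserted: assumed series n-FETs all conduct
        out = 1
    return out               # hdis and heva off: value is held on the wire


def or_plane_wire(inputs):
    """Vertical nanowire: precharge (vpre), then evaluate (veva)."""
    out = 1                  # vpre asserted: output precharged to logic '1'
    if not any(inputs):      # veva asserted: assumed series p-FETs conduct only if all gates are '0'
        out = 0
    return out


def full_adder(a, b, cin):
    """1-bit full adder as two-level AND-OR logic (illustrative product terms)."""
    na, nb, ncin = 1 - a, 1 - b, 1 - cin
    sum_terms = [(a, nb, ncin), (na, b, ncin), (na, nb, cin), (a, b, cin)]
    carry_terms = [(a, b), (a, cin), (b, cin)]
    s = or_plane_wire([and_plane_wire(t) for t in sum_terms])
    c = or_plane_wire([and_plane_wire(t) for t in carry_terms])
    return s, c


# Exhaustive check against integer addition
for a, b, cin in product((0, 1), repeat=3):
    s, c = full_adder(a, b, cin)
    assert 2 * c + s == a + b + cin
```

The hold phase appears here only as the returned value of `and_plane_wire`, which the OR plane reads after the horizontal control signals are switched off.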

2.3

WISP-0 Architecture

NASIC tiles may be cascaded together to form large scale systems/architectures.
WIre Streaming Processor version 0 (WISP-0) is a stream processor that implements a
5-stage microprocessor pipeline architecture including fetch, decode, register file, execute and write back stages. WISP-0 consists of five nanotiles: Program Counter (PC),
ROM, Decoder (DEC), Register File (RF) and Arithmetic Logic Unit (ALU). Fig. 2.3
shows its layout. In WISP designs, in order to preserve the density advantages of the
fabric, data is streamed through the fabric with minimal control/feedback paths. All
hazards are exposed to the compiler. WISP-0 uses dynamic circuits and pipelining on the
wires to eliminate the need for explicit flip-flops and therefore improve the density
considerably. WISP-0 is used as a design prototype for evaluating key metrics such
as area and performance as well as the impact of various fault-tolerance techniques
on chip yield and process variation mitigation. A NASIC WISP-0 processor has been
shown to be up to 33X denser than an equivalent 16nm CMOS implementation.

2.4

Chapter Summary

This chapter presents a brief overview of the NASIC fabric. The rest of the dissertation addresses key physical fabric issues in building integrated nanosystems in
general, and NASICs in particular, including integrated device-circuit explorations,
fabric-friendly design optimizations for functionality and reducing manufacturing requirements, parameter variation, fabrication etc.

Figure 2.3. WISP-0 Nanoprocessor layout

CHAPTER 3
CMOS CONTROL ENABLED SINGLE-TYPE FET NASIC

In nano-systems based on semiconductor nanowires, it may be difficult to build
both p- and n-FETs using the same material. While complementary FETs have been
demonstrated in zinc oxide [41], silicon [11], and germanium [17] nanowires, in all cases
large differences in transport properties were found between the two types of FETs,
sometimes much greater than those seen in today’s traditional CMOS transistors. Because
the transistor characteristics will not be symmetric between n-FETs and
p-FETs, timing closure becomes more complicated, making it harder
to manufacture systems reliably. Consequently, when designing at the nanoscale, it
would be advantageous if only one type of device were required.
However, in general, conventional logic systems designed using mostly one type of
FETs, such as pseudo-NMOS, suffer from major power and performance drawbacks
as compared to CMOS [45]. This is one reason why such designs have not found
widespread applicability.
In NASIC designs, instead of using a scheme such as pseudo-NMOS, the dynamic
control from external CMOS can be modified such that the associated nanoscale
circuits could function with only one type of FET. The static power consumption can
be eliminated by ensuring that the control scheme never causes direct paths between
ground and the power supply voltage.
This chapter introduces new types of dynamic circuit styles utilizing only one type
of xnwFETs in the logic portions of the design. In keeping with the fabric-centric
mindset, this approach has the following key advantages:

• It eases manufacturing requirements by eliminating one doping type.
• It reduces device design and optimization requirements, since characteristics of
dissimilar devices need not be balanced.
• By eliminating slower p-type devices, it enables system-level performance improvement.
Furthermore, as will be shown in this chapter, these benefits come with no loss in
the overall density of the fabric.

3.1

Modifications to the Control Scheme

It has been found that altering the CMOS control scheme obviates the need for two
types of devices to implement arbitrary logic functions on the nanogrid. The scheme
may thus be used with manufacturing processes where complementary devices are
difficult or impossible to achieve. A design using only n-type FETs will implement
a NAND-NAND cascaded logic whereas a design using p-type FETs will implement
a NOR-NOR logic. Fundamentally, these are equivalent to the original AND-OR logic
(Fig. 2.1).
Fig. 3.1 shows two NASIC stages implementing NAND-NAND functionality with
only n-type xnwFETs. Outputs from the horizontal NAND stage (do1a and do1b)
become gate inputs for the vertical NAND stage as part of the crossbar grid structure
without additional routing requirement. The associated timing scheme is shown in
Fig. 3.2. On comparison with the AND-OR timing scheme (Fig. 2.2), it is seen that
the dynamic scheme of precharge, evaluate and hold is still in place. However the
behaviour of the control signals has been modified. There is no predischarge phase;
all planes are precharged. Outputs are initially at logic ‘1’, and if all inputs are ‘1’,
they are evaluated to ‘0’, achieving NAND functionality. Also, all control signals are

Figure 3.1. NASIC circuit with two dynamic NAND stages - Output of one stage
is cascaded to the next.

active high, since they gate only n-type FETs. Similar to the previous case, hold
phases are inserted for implicit latching and correct cascading.
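
The precharge, evaluate and hold behavior of a single n-type dynamic NAND stage can be sketched as a small state machine. This is a minimal behavioral model assuming ideal switches; the class name, the `step` interface and the `cascade` helper are hypothetical and serve only to show how the hold phase lets one stage drive the next.

```python
class DynamicNandStage:
    """Behavioral model of one dynamic NAND stage (ideal switches assumed)."""

    def __init__(self):
        self.out = None  # unknown until the first precharge

    def step(self, pre, eva, inputs=()):
        if pre and eva:
            # Would create a direct supply-to-ground path; the control scheme avoids this.
            raise ValueError("pre and eva must not be asserted together")
        if pre:
            self.out = 1          # precharge: output pulled to logic '1'
        elif eva and all(inputs):
            self.out = 0          # evaluate: all series n-FETs on, output discharges
        # otherwise hold: the output keeps its value (implicit latching)
        return self.out


def cascade(a, b):
    """Two cascaded NAND stages; stage 1 holds while stage 2 precharges and evaluates."""
    s1, s2 = DynamicNandStage(), DynamicNandStage()
    s1.step(pre=1, eva=0)                        # pre1
    s1.step(pre=0, eva=1, inputs=(a, b))         # eva1
    held = s1.step(pre=0, eva=0)                 # stage 1 in hold phase
    s2.step(pre=1, eva=0)                        # pre2
    return s2.step(pre=0, eva=1, inputs=(held,))  # eva2
```

With a single input on the second stage, `cascade(a, b)` returns NAND(NAND(a, b)), that is, AND(a, b).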

3.2

NASIC Logic Implementation with One Type of Devices

Fig. 3.3 shows a 1-bit full adder built using only n-type devices. Fig. 3.3(a) shows
a 3-D physical fabric diagram with the crossed nanowire grid, xnwFET channels (blue
regions) and peripheral microwires for power rails and dynamic control. Fig. 3.3(b)

Figure 3.2. Control scheme for NAND-NAND NASIC logic with implicit latching

shows the circuit equivalent implementing NAND-NAND logic. In comparison with
the previous implementation (Fig. 2.1) it may be noted that the relative positions
of the transistors in the NAND-NAND example are identical to those in the AND-OR implementation. The only change from AND to NAND is the swapping of the control
signals, VDD and VSS. The output node is precharged rather than predischarged and
evaluated to ground as opposed to logic ‘1’, which results in the inversion of the
function. On the second plane, the change is more significant: from OR to NAND.
Both the type of the transistor and the polarity of the control scheme have been changed.
Also, the inputs to the vertical NW are now inverted from their values in the AND-OR scheme. The inversion of the inputs in conjunction with the change from OR
to NAND results in a transformation of the logic function. By De Morgan’s laws,
this transformation produces the same result as the AND-OR scheme.
This allows us to maintain the transistors in their original positions, even though the
logic functions used have changed. Consequently, there is no
impact on the area of the overall design.
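
The equivalence between the two schemes can be checked exhaustively for a small example. The sketch below compares an AND-OR evaluation with a NAND-NAND evaluation of the same product terms, using the carry output of a full adder as the test function. The helper names are illustrative; the check itself is a direct application of De Morgan’s laws.

```python
from itertools import product


def nand(bits):
    return 0 if all(bits) else 1


def and_or(terms):
    """First plane ANDs each term, second plane ORs the results."""
    return 1 if any(all(t) for t in terms) else 0


def nand_nand(terms):
    """First plane NANDs each term (inverted outputs), second plane NANDs those."""
    return nand([nand(t) for t in terms])


def carry_terms(a, b, cin):
    # Same transistor positions (product terms) in both schemes
    return [(a, b), (a, cin), (b, cin)]


def equivalent_for_all_inputs(term_builder, n_inputs):
    return all(
        and_or(term_builder(*bits)) == nand_nand(term_builder(*bits))
        for bits in product((0, 1), repeat=n_inputs)
    )


assert equivalent_for_all_inputs(carry_terms, 3)
```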
All WISP-0 tiles were implemented using the new control scheme and n-type
xnwFETs. Two examples are shown below.

3.2.1

WISP-0 Program Counter

The WISP-0 program counter is implemented as a 4-bit accumulator. Its output
is a 4-bit address that acts as input to the ROM. The address is incremented each
cycle and fed back using a nano-latch. Fig. 3.4 shows implementation of the Program
Counter using NAND-NAND. Diagonal FETs on upper NAND planes delay output
by one cycle and allow signals to turn the corner.

3.2.2

WISP-0 Arithmetic Logic Unit

Fig. 3.5 shows the layout of the WISP-0 ALU that implements both addition
and multiplication functions. The arithmetic unit integrates an adder and multiplier

Figure 3.3. NASIC 1-bit full adder with NAND-NAND logic (a) 3-D physical fabric view with nanowire grid, xnwFETs, oxide and peripheral control. (b) Circuit-equivalent implementation

Figure 3.4. WISP-0 Program Counter implemented using NAND-NAND logic

Figure 3.5. WISP-0 Arithmetic Logic Unit implemented using NAND-NAND logic

together to save area and ease routing constraints. It takes the inputs (at the bottom)
from the register file and produces the write-back result. At the same time, the write-back address is decoded by the 2-4 decoder on the top and transmitted to the register
file along with the result. The result is written to the corresponding register in the
next cycle.

3.3

Cascading and Noise Considerations for Single-Type FET
Designs

With single-type xnwFET schemes, n-type precharge devices are used to pull up
output nodes. If the gate voltage were VDD, this would lead to output potentials
below VDD, typically around (VDD - VTH) [45]. One important question is whether
cascading of multiple dynamic stages will lead to accumulation of VTH drops, causing
incorrect functionality. The NASIC logic style is designed such that this catastrophic
noise build-up scenario never occurs.
Firstly, the gate voltage for the precharge device is controlled from external CMOS
circuitry and not from logic. This implies that the driving voltage can be higher than
VDD, leading to a full voltage swing at the output node. Furthermore, NASICs use a
NAND-NAND logic style which, in addition to being able to implement any arbitrary
logic function, is also inverting in nature. Output nodes at any stage are always
cascaded to a xnwFET in the next stage that is part of a pull-down network. In other
words, the logic style is such that logic ‘1’ inputs when evaluated will cause logic ‘0’
output at the next stage. Output signals at any stage do not gate xnwFETs in pull-up networks; the pull-up is accomplished entirely by precharge signals. Therefore, a
combination of circuit and inverting logic style prevents noise accumulation in NASIC
designs. Extensive device-fabric noise simulations have shown that there is no noise
accumulation in cascaded dynamic circuits 40 stages deep. Detailed evaluations of
noise and cascading issues are presented in the next chapter.
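
The argument above can be made concrete with a first-order sketch. It assumes the usual approximation that a node pulled up through an n-type device stops rising at (gate voltage - VTH), and it uses illustrative values of VDD = 0.8 V and VTH = 0.2 V. The second function is a hypothetical counter-example in which pull-up gates are driven by the previous stage’s output; it does not describe NASIC circuits and is included only for contrast.

```python
VDD = 0.8  # V, illustrative
VTH = 0.2  # V, illustrative


def nfet_pullup_level(v_gate, v_supply, v_th):
    """First-order level reached by a node pulled up through an n-FET."""
    return max(0.0, min(v_supply, v_gate - v_th))


def precharged_levels(n_stages, v_pre):
    """Every stage is pulled up by its own externally driven precharge device."""
    return [nfet_pullup_level(v_pre, VDD, VTH) for _ in range(n_stages)]


def logic_driven_levels(n_stages):
    """Hypothetical contrast: each pull-up gate is driven by the previous output."""
    levels, v = [], VDD
    for _ in range(n_stages):
        v = nfet_pullup_level(v, VDD, VTH)
        levels.append(v)
    return levels


full_swing = precharged_levels(40, VDD + VTH)   # boosted precharge signal
reduced_swing = precharged_levels(40, VDD)      # precharge signal at VDD
accumulating = logic_driven_levels(3)           # drops add up stage by stage
```

In this simplified picture the precharged level is the same at every stage regardless of depth, whereas the hypothetical logic-driven chain loses one VTH per stage.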

3.4

Chapter Summary

A fabric-friendly approach towards elimination of dissimilar devices in the NASIC fabric was presented. The new circuit style is enabled by changes to the external
CMOS control scheme, is achieved with no loss of density, and has benefits at the manufacturing, device, and architectural levels. The use of single-type FETs for NASIC
designs implies that manufacturing requirements are considerably eased since complementary doped nanowire devices are not needed. Device-design and optimization
effort may be reduced since balancing device characteristics across dissimilar devices
through sizing or other approaches is not required. System-level performance benefits are also achieved due to the elimination of slower devices. The next chapter will
discuss detailed device and circuit level evaluations of the NAND-NAND circuit style
including noise implications and further improvements to the control scheme.

CHAPTER 4
INTEGRATED DEVICE-FABRIC EXPLORATION AND
NOISE MITIGATION IN NANOSCALE FABRICS

Integration of nanofabrics requires extensive validation of device interactions.
Charge sharing and capacitance effects that cause glitches are apparent only when
considering multiple devices together and could cause loss of functionality and/or
performance. Therefore, while device choices and optimizations must target key electrical parameters such as threshold voltage and intrinsic delay, in keeping with the
fabric-centric mindset, they should also i) be fully validated at the circuit/fabric level
for noise implications and functionality and ii) not impose insurmountable challenges
for the fabric manufacturing sequence.
In this chapter we present an integrated device-fabric exploration with simulations
at the circuit level built on accurate 3-D physics based simulations of nanodevice
electrostatics and operations. Using an integrated approach, co-design of devices and
circuits can be accomplished with accurate physics-based device models. We extract
device I-V characteristics, parasitic capacitances and key electrical parameters such
as threshold voltage and on/off current ratios for different xnwFETs. We then create
behavioral models of the data for a circuit simulator and use these to evaluate devices
in-fabric for noise resilience, signal integrity and validation of worst-case test circuits.
We also discuss implications of device and fabric choices for manufacturing. While
the work is focused on xnwFETs and NASICs, the approach and methodology are
fairly generic and may be applicable to other nanoscale fabrics.

4.1

Methodology for Integrated Device-Fabric Exploration

The methodology for bottom-up integrated device-fabric explorations is detailed in
this section. It encompasses physical layer assumptions, device level explorations and
implications at higher design levels and is summarized in a flow diagram (Fig. 4.1).
A variety of physical layer assumptions such as choice of gate material and the
structure of devices can be made targeting device metrics such as the threshold voltage, on-currents and intrinsic delay. For example, the gate material used in NASIC
crossed nanowire field effect transistors (xnwFETs) could be composed of crystalline
silicon, nickel silicide or metals. Similarly, the structure of the device may be a top
nanowire gate or an Omega gated structure for tighter electrostatics. In accordance
with the fabric-centric mindset, these assumptions need to be evaluated in terms of
implications for manufacturing as well as for other design levels.
The electrical properties of individual xnwFETs may be characterized using accurate 3-D physics based simulation of the nanostructures using Synopsys Sentaurus
Device [3]. Calibration of the tool against experimental data at similar dimensions
is required to account for nanoscale effects such as increased surface roughness and
interface trap states. These device-level simulations provide three sets of data: i) current
data for different values of drain-source (VDS) and gate-source (VGS) voltages, ii) device capacitances at different values of VGS, and iii) key device parameters/metrics
that determine noise margins and performance of the devices such as the on-currents
(ION), threshold voltage (VTH) and the intrinsic delays of the devices. These device parameters may be adjusted by changing underlying physical layer assumptions
as well as the substrate bias (e.g. a higher threshold voltage may be obtained by
modifying the metal work function or using a more negative back gate bias).
The current data is fitted as a function of VGS and VDS using regression analysis
and curve fitting. This step expresses the current as a mathematical function of
VGS and VDS. The expression for the current, in conjunction with a piecewise linear

Figure 4.1. Methodology for Integrated Device-Fabric Exploration and Noise Evaluation

approximation for the device capacitances forms a behavioral model of the xnwFET,
which may be incorporated into a standard circuit simulator such as HSPICE to carry
out circuit level evaluations.
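
The structure of such a behavioral model can be sketched as follows. The fitted functional form for the current is not given here, so the sketch assumes a simple hypothetical I-V shape with a single scale factor fitted by least squares, together with a piecewise-linear capacitance table. The sample data are synthetic stand-ins for device-simulator output, and the names `shape`, `fit_scale` and `piecewise_linear` are illustrative.

```python
import math
from bisect import bisect_right

VTH = 0.2  # V, illustrative


def shape(vgs, vds):
    """Assumed I-V shape: square-law overdrive with smooth saturation in VDS."""
    return max(vgs - VTH, 0.0) ** 2 * math.tanh(vds / 0.2)


def fit_scale(samples):
    """Closed-form least squares for k in IDS ~ k * shape(VGS, VDS)."""
    num = sum(ids * shape(vgs, vds) for vgs, vds, ids in samples)
    den = sum(shape(vgs, vds) ** 2 for vgs, vds, _ in samples)
    return num / den


def piecewise_linear(points):
    """Return f(x) that interpolates linearly between sorted (x, y) points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]

    def f(x):
        if x <= xs[0]:
            return ys[0]
        if x >= xs[-1]:
            return ys[-1]
        i = bisect_right(xs, x) - 1
        t = (x - xs[i]) / (xs[i + 1] - xs[i])
        return ys[i] + t * (ys[i + 1] - ys[i])

    return f


# Synthetic samples on a VGS/VDS grid (not data from this work)
K_TRUE = 2.0e-5
grid = [i / 10 for i in range(9)]  # 0.0 V to 0.8 V
samples = [(vgs, vds, K_TRUE * shape(vgs, vds)) for vgs in grid for vds in grid]

k = fit_scale(samples)
cg = piecewise_linear([(0.0, 2.0e-18), (0.4, 3.0e-18), (0.8, 4.0e-18)])  # hypothetical table


def ids_model(vgs, vds):
    """Current branch of the behavioral model."""
    return k * shape(vgs, vds)
```

In a circuit simulator the same two pieces would typically be expressed as a voltage-controlled current source and voltage-dependent capacitors.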
The circuit-level simulations take as inputs the behavioral models for individual
devices, circuit netlists with worst-case noise scenarios as well as fabric-specific control and sequencing schemes. As will be shown, different sequencing schemes have
different implications while considering noise margins and signal integrity; they control the flow of data and influence capacitive interactions and glitching in between
successive cascaded stages. Different cascading and noise scenarios are evaluated and
output waveforms are checked for signal integrity. Circuit-level delay and fabric performance implications are also quantified from these simulations. The methodology
thus explores implications of physical layer and device assumptions on the fabric as a
whole. While it has been explored extensively for the NASIC fabric, this integrated
methodology is fairly generic and is applicable to other nano-fabrics as well.

4.2

Physical Layer and Device Explorations¹

4.2.1

Devices Explored

We have considered three different xnwFET structures. Fig. 4.2 shows an image of
each nanowire transistor structure used for this study. The first structure considered
is the silicon gate xnwFET. This transistor consists of a bottom nanowire that acts
as the channel and a top nanowire, orthogonal to the bottom nanowire, which acts
as the gate electrode. These two nanowires are separated by a thin dielectric, which
acts as the gate insulator.
The second structure considered is the fully silicided (FUSI) gate xnwFET. This
structure is similar to the previous one, except that the gate nanowire has been fully
silicided. This eliminates some undesired effects such as gate depletion, and reduces
the resistance of the gate nanowire, as needed for fast evaluation of the previous logic
stage. Also, NiSi gives a smaller gate-substrate workfunction difference; therefore, there is no need to apply large substrate biases or use large source/drain
underlaps to achieve the desired threshold voltage.
The third structure considered is the Omega-gated xnwFET structure with a metal
gate. This structure was chosen because it has better gate-to-channel coupling than
the two previous structures. Therefore it should have a better ON current (ION) as
well as a higher on-to-off current ratio (ION/IOFF).

Figure 4.2. Three xnwFET devices simulated (a) Si gate xnwFET (b) NiSi gate
xnwFET (c) Omega-gated xnwFET.

¹ Work done at UCLA by Prof. Chui’s group, included for completeness.

4.2.2

Methodology

Due to the complex structure of xnwFETs, 3-D simulation is required. To study
the behavior of xnwFETs, the Synopsys Sentaurus Device simulator was used. Before any
relevant simulation can be done, the simulation models have to be calibrated. To do
this, experimental data from well-characterized nanowire channel FETs with similar
dimensions was employed [50, 47]. The calibrated models and parameters include the

Table 4.1. Parameters used for xnwFET device simulations

Device                         Si 0.2   Si 0.3   NiSi 0.2   NiSi 0.3   Omega 0.2   Omega 0.3
Gate material                  Si       Si       NiSi       NiSi       Metal       Metal
Gate workfunction (eV)         n+ Si    n+ Si    4.5        4.5        4.5         4.6
Gate NW diameter (nm)          10       10       10         10         -           -
Channel NW diameter (nm)       10       10       10         10         10          10
Channel doping (cm^-3)         10^18    10^18    10^18      10^18      10^18       10^18
Gate oxide material            HfO2     HfO2     HfO2       HfO2       HfO2        HfO2
Gate oxide thickness (nm)      3        3        3          3          3           3
Bottom oxide material          SiO2     SiO2     SiO2       SiO2       SiO2        SiO2
Bottom oxide thickness (nm)    10       10       10         10         10          10
Source/drain underlap (nm)     3        3        0          0          0           0
Back gate bias (V)             -4       -5       -3         -4         -3          -3

Note: a single nanowire diameter (10 nm) is listed for the Omega-gated devices; it is shown in the channel row.

drift-diffusion transport models, to include effects such as carrier scattering due to
surface roughness, and dielectric/channel interface trapped charges.

4.2.3

Simulation Results

For this study, six different devices have been simulated. For each of the structures
mentioned before, we simulated a device with a threshold voltage of around 0.2 V and
another device with a threshold voltage of around 0.3 V. The 0.2 V and 0.3 V values
for VTH were chosen for the purposes of the noise resilience study. A lower value for VTH is
expected to improve logic ‘1’ noise resilience, but lower the logic ‘0’ noise resilience,
whereas a higher value for VTH will do the opposite. To achieve the desired VTH
values, a source/drain underlap, as well as a back gate bias can be applied. Table 4.1
summarizes the basic device parameters used to achieve the desired VTH values.
Drain current vs. gate voltage (IDS-VGS), drain current vs. drain voltage (IDS-VDS) and capacitance vs. gate voltage characteristics were simulated and important
electrical parameters such as on current (ION) and on-to-off current ratio (ION/IOFF)
were extracted. Fig. 4.3(a) shows IDS-VGS curves for the 6 devices simulated and

Figure 4.3. Device simulation outputs: (a) ID-VGS curves (b) CG-VGS curves

Table 4.2. xnwFET device simulation results

                       Si gate xnwFET     NiSi gate xnwFET    Omega-gated xnwFET
VTH (V)                0.21     0.32      0.22     0.31       0.21     0.31
ION (µA)               1.31     0.69      5.37     3.95       18.5     12.9
ION/IOFF               6798     29831     1773     12046      10782    77875
Intrinsic delay (ps)   2.38     4.43      1.13     1.49       0.59     0.81

Fig. 4.3(b) shows capacitance vs. VGS curves for the 6 devices simulated at VDS =
0.8 V (VDD). Similarly, data was obtained for other values of VDS and VGS to cover
the operating regions of the devices. Table 4.2 summarizes key parameters such as
ION, ION/IOFF and intrinsic delay for the different devices.
ION is defined as the current level when the gate to source voltage (VGS) and the
drain to source voltage (VDS) are both equal to 0.8 V (VDD). The off-state current
is defined as the current level when VGS is equal to 0 V and VDS is equal to 0.8 V.
Various techniques are available for VTH extraction. We have chosen the square-root
IDS extrapolation method. To calculate the intrinsic delay, we computed the CV/I
ratio, where C is the total capacitance seen from the gate, V is VDD and I is the current IDS
at VGS = VDS = VDD.
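
The two extraction steps described above can be sketched in a few lines. The threshold extraction fits a straight line to the square root of IDS versus VGS and takes its intercept with the VGS axis; here the fit is applied to synthetic square-law data, whereas in practice the fitting window would be restricted to the region where the curve is linear. The intrinsic delay is the CV/I ratio. The last lines back out the gate capacitance implied by the first Si gate column of Table 4.2, under the assumption that the tabulated on-currents are in microamperes. Function names are illustrative.

```python
import math


def vth_sqrt_extrapolation(vgs, ids):
    """Fit sqrt(IDS) vs VGS with a line and return its VGS-axis intercept."""
    y = [math.sqrt(i) for i in ids]
    n = len(vgs)
    mean_x = sum(vgs) / n
    mean_y = sum(y) / n
    slope = (
        sum((x - mean_x) * (v - mean_y) for x, v in zip(vgs, y))
        / sum((x - mean_x) ** 2 for x in vgs)
    )
    return mean_x - mean_y / slope  # VGS at which the fitted line reaches zero


def intrinsic_delay(c, v, i):
    """CV/I delay in seconds for C in farads, V in volts, I in amperes."""
    return c * v / i


# Synthetic square-law device with a 0.2 V threshold (not data from this work)
vgs = [0.5, 0.6, 0.7, 0.8]
ids = [1.0e-5 * (v - 0.2) ** 2 for v in vgs]
vth = vth_sqrt_extrapolation(vgs, ids)

# Capacitance implied by Table 4.2 (Si gate, VTH = 0.21 V), assuming ION is in microamperes
VDD = 0.8
implied_c = 2.38e-12 * 1.31e-6 / VDD   # farads
delay = intrinsic_delay(implied_c, VDD, 1.31e-6)
```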
4.2.4

Device Comparisons

The characteristics of the three nanowire transistor structures are compared as
follows. For a given threshold voltage, the silicon gate xnwFET has the smallest ION,
followed by the NiSi gate xnwFET, and the Omega-gated xnwFET has the highest
ION, as expected. First, the NiSi structure has a higher ION than the Si gate structure
because the ΦMS value is lower in the NiSi case. Therefore a smaller source/drain
underlap is needed to achieve the same VTH, which in turn reduces the effective
channel length, raising the drain current level. For the Omega-gated xnwFET, the
higher current level is due to the increased ability of the gate to modulate the channel
conductivity. In the Si gate or NiSi gate xnwFET structure, the inversion layer needed
to turn on the device is formed mostly on the top part of the channel nanowire, near
the gate nanowire, whereas in the Omega-gated xnwFET, the inversion layer can be
formed almost all around the channel nanowire; this can be thought of as
increasing the effective channel width at the same gate voltage.

Another figure of merit for these three devices is the on-to-off current ratio. For
a given threshold voltage, the Si gate xnwFET and the NiSi gate xnwFET devices
have similar ION/IOFF but the Omega-gated xnwFET has a higher ION/IOFF value,
as expected. This is because the Omega-gated xnwFET has better gate-to-channel
electrostatic control than either of the other two structures. In other words, the Omega-gated xnwFET is more effective at turning the device on and off than the
other xnwFET structures. The Omega-gated xnwFET, therefore, should have a better
subthreshold slope than either of the other two devices, leading to a higher ION/IOFF.
We can also compare the capacitances of these three devices. For a given VTH
specification, it can be seen that the capacitance values are usually highest for the
Omega-gated xnwFET, followed by the NiSi gate device, and the Si gate xnwFET
has the lowest values. For example, the NiSi gate device has higher gate-to-source
and gate-to-drain capacitance values than the Si gate device because the former has
a smaller junction underlap, which increases the gate coupling to the source
and drain. In addition, the NiSi gate device does not have the gate semiconductor
depletion issue near the oxide interface, further increasing its capacitance values. For
the Omega-gated xnwFET, since the gate is wrapped around the channel, the gate is
located closer to the source and the drain regions than
in the other two xnwFET devices. This in turn increases the gate-to-source and
gate-to-drain coupling and thus the respective capacitances.

4.3

Circuit level simulation and noise evaluation in-fabric

Behavioral models for the devices examined in the previous section were created
using the methodology described in Fig. 4.1. This section describes a variety of circuit
level simulations carried out to identify and fully evaluate the impact of internal noise
and validate cascaded nanowire fabrics utilizing xnwFETs.

DC sweep analysis was done to verify that behavioral models accurately abstract
device data. For all devices, it was found that behavioral models accurately track
Sentaurus current data within 5% error for the voltage ranges considered.
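
A check of this kind can be sketched as a comparison of model and reference currents over the swept bias points. The values below are hypothetical placeholders rather than simulation results, and the function name is illustrative.

```python
def max_relative_error(reference, model):
    """Largest |model - reference| / |reference| over points with nonzero reference."""
    return max(abs(m - r) / abs(r) for r, m in zip(reference, model) if r != 0)


# Hypothetical currents (A) at three bias points
reference = [1.0e-6, 2.0e-6, 4.0e-6]   # stand-in for device-simulator data
fitted = [1.03e-6, 1.96e-6, 4.1e-6]    # stand-in for behavioral-model output

err = max_relative_error(reference, fitted)
within_spec = err <= 0.05              # 5% tolerance as stated in the text
```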
A single NASIC NAND stage was simulated using HSPICE to verify expected
functionality. Representative results are shown in Fig. 4.4 for the Omega 0.2 device.
Other devices exhibit similar behavior. From the signal waveforms we make the
following key observations: 1) The output precharges to logic ‘1’ when the pre signal
is asserted. Typically a value greater than VDD is used for pre to achieve rail-to-rail voltage swing at the output node. 2) The output goes to ‘0’ only when all
inputs are ‘1’, achieving the required NAND logic. 3) Current dissipation occurs only
when the capacitances are charged or discharged, and there is no static current in
NASIC designs as one of pre or eva is always off. 4) During the hold phase, the
output does not change. However, during this time the output node has high output
impedance, which makes it susceptible to switching events in its neighborhood in
cascaded NASIC designs. In the next set of circuit simulation experiments
these internal noise sources and switching events will be investigated in detail for the
different xnwFETs and two baseline control schemes.

4.3.1

Sequencing schemes for the NASIC fabric

Fig. 3.2 showed one possible sequencing scheme for cascaded NASIC designs. In
this baseline scheme, one stage is precharged and evaluated before the next stage with
signals repeating every two stages, i.e. stages 1, 3 and 5 may use the same control
signals (say pre1 and eva1) whereas stages 2 and 4 would use pre2 and eva2. While any
one stage is being precharged or evaluated, its neighbors are in the hold phase, with
outputs implicitly latched on the nanowire for correct cascading and pipelining of
datapaths.

Figure 4.4. Circuit simulations of single NASIC dynamic stage

In general, since control signals are not driven from logic but from reliable external circuitry, they may be optimized to achieve specific targets. One example of
this is driving precharge signals to voltages greater than VDD, thereby achieving a
full VDD voltage swing at the output node of a nanowire for maximum logic ‘1’ noise
margin. In keeping with the fabric-centric mindset, modifying the control schemes
does not impose any new challenges at the physical layer or in terms of manufacturing requirements, since there is no additional customization requirement at the
nanoscale. Furthermore, noise implications and signal integrity considerations may
be very different depending on the sequencing scheme used, since the scheme decides
how logic nodes are switching relative to one another.
Another sequencing scheme used for the NASIC fabric is shown in Fig. 4.5. This
is a 3-phase sequencing scheme where signals repeat every 3 stages. In a large
scale design, this would imply that stages 1, 4, 7, etc. would use identical control signals.
In this scheme, the evaluate phase of one stage is overlapped with the precharge phase of the
next. This scheme carries performance benefits in a pipelined design as compared
to the scheme described in Fig. 3.2, since output evaluation events occur at a higher
frequency.
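
The two sequencing schemes can be compared with a simple phase table. The sketch below assumes idealized, equal-length time slots: in the baseline scheme the slots are pre1, eva1, pre2, eva2 (period of four slots), and in the three-phase scheme each stage cycles through precharge, evaluate and hold, offset by one slot from its predecessor (period of three slots). The slot model and function names are illustrative simplifications of Fig. 3.2 and Fig. 4.5.

```python
def two_phase_schedule(n_stages, n_steps):
    """Baseline scheme: control signals repeat every two stages."""
    table = []
    for t in range(n_steps):
        slot = t % 4                          # pre1, eva1, pre2, eva2
        row = []
        for stage in range(1, n_stages + 1):
            group = (stage - 1) % 2           # 0 for stages 1, 3, 5; 1 for stages 2, 4
            if slot == 2 * group:
                row.append("pre")
            elif slot == 2 * group + 1:
                row.append("eva")
            else:
                row.append("hold")
        table.append(row)
    return table


def three_phase_schedule(n_stages, n_steps):
    """Three-phase scheme: evaluate of stage k overlaps precharge of stage k+1."""
    phases = ("pre", "eva", "hold")
    return [
        [phases[(t - (stage - 1)) % 3] for stage in range(1, n_stages + 1)]
        for t in range(n_steps)
    ]


two = two_phase_schedule(4, 4)
three = three_phase_schedule(4, 3)
```

In this slot model a given stage evaluates once every four slots in the baseline scheme and once every three slots in the three-phase scheme, which reflects the higher evaluation frequency noted above.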

4.3.2

Circuit Simulation and Analysis

The six devices described in Section 4.2 were simulated in a worst-case circuit to
evaluate noise implications and functionality. Both baseline timing schemes described
in the previous sub-section were considered in this analysis.
The three-stage cascaded test circuit used in these noise evaluations is shown in
Fig. 4.6. Stage 1 generates imperfect outputs that drive input xnwFETs of stage
2. Output integrity is checked at output nodes do21 and do31. Due to high output impedance during the hold phase, the output nodes at various stages may be
susceptible to noise effects across device parasitic capacitances.

Figure 4.5. Three-phase timing scheme for the NASIC fabric. Note that signals
repeat every three stages, with pre4 and eva4 identical to pre1 and eva1 respectively.

For example, key sources of noise for the do21 node include the Miller capacitances
between this node and do11 and do31 nodes. If do11 evaluates to ‘0’ it might cause
a downward glitch (degradation of logic ‘1’) at do21 due to the CGD capacitance
between do11 and do21. Similarly, if eva3 is asserted, a downward glitch may occur
at do21 due to the CSG parasitic capacitance. Precharging of do31 could cause an
upward glitch at the do21 node. Other similar parasitic effects exist between outputs
and intermediate nodes in the design, leading to glitching and internal noise events.
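
The size of such a glitch can be estimated to first order with a capacitive divider: a voltage step on a neighboring node couples onto a floating (held) node in proportion to the coupling capacitance relative to the total capacitance on that node. The sketch below uses hypothetical capacitance values and illustrative values for VDD and VTH; it ignores device conduction and is not a substitute for the circuit simulations discussed in this section.

```python
VDD = 0.8  # V, illustrative
VTH = 0.2  # V, illustrative


def coupled_glitch(delta_v_aggressor, c_couple, c_node):
    """Voltage step on a floating node caused by a step on a coupled node.

    c_couple: capacitance between the switching node and the floating node.
    c_node:   remaining capacitance from the floating node to fixed potentials.
    """
    return delta_v_aggressor * c_couple / (c_couple + c_node)


# Hypothetical example: do11 discharges from VDD to 0 while do21 holds logic '1'
glitch = coupled_glitch(-VDD, c_couple=1.0e-18, c_node=3.0e-18)
v_do21 = VDD + glitch              # level left on the held node
still_above_vth = v_do21 > VTH     # whether the next stage input is still above VTH
```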
Fig. 4.7 and Fig. 4.8 show the output waveforms for the NiSi 0.2 and Omega 0.2
devices for the basic sequencing scheme described in Fig. 3.2. Logic ‘1’ glitching is a
very serious problem in this timing scheme. Due to parasitic coupling between the
pre2 signal and do21 through the CGS capacitor (see Fig. 4.6), there is a drop in the
do21 output when pre2 is deasserted. Furthermore, while do21 is holding logic ‘1’,
it may be severely affected by two sources of noise: the CGD capacitance between
do11 and do21 as well as the CSG capacitance of the input transistor of stage 3. If
eva1 is asserted and do11 simultaneously discharges, a severe downward glitch may
be experienced at the do21 node due to these capacitances. This implies that when

Figure 4.6. Test circuit used for cascading evaluations - output integrity of stages 2
and 3 is affected by switching events in their neighborhood. The circuit represents
a worst-case scenario for noise since stage 3 has a single input, corresponding to the
least effective resistance and capacitance between its output node and VSS.


Figure 4.7. Cascading evaluations for NiSi 0.2 Device - Due to poor driving voltage
at the input transistor and slow device, output node do31 does not properly discharge
leading to loss of signal integrity.

stage 3 is evaluated, the driving voltage at the do21 node could be significantly below
VDD.
Two scenarios may then be considered: the voltage of do21 may be below or above
VTH. In the former case the signal integrity test fails at do21, since it is effectively
at a logic ‘0’ voltage level. In the latter case, the circuit functionality depends on the
characteristics of the device. A fast device may be able to effectively switch even with
a low driving voltage, leading to a correct logic ‘0’ evaluation of node do31, whereas
a slower device may not be able to effectively discharge do31, leading to an erroneous
logic ‘1’ value on the node. As seen in Fig. 4.7, circuits with the slower NiSi gated
devices fail in this scenario despite the input voltage being within the logic ‘1’ noise
margin (i.e. > VTH). However, the circuit with Omega 0.2 devices, which is the
fastest of the 6 devices considered in terms of intrinsic delay, is able to effectively


Figure 4.8. Cascading evaluations for Omega 0.2 Device - Despite poor driving
voltage, signal integrity is preserved owing to faster device.


discharge the output node even with a significantly degraded input voltage. In other
words, faster devices are more resilient to logic ‘1’ glitching effects. Of the 6 devices
considered for these simulations, only the fastest Omega 0.2 device achieves expected
behavior; the 5 slower devices do not work.
Fig. 4.9 shows output waveforms for the NiSi 0.2 (left) and Omega 0.2 (right)
devices for the 3-phase control scheme described in Fig. 4.5. In this control scheme,
logic ‘1’ glitching effects are not as severe as in the previous scheme. This is because
both neighboring stages are not simultaneously discharging during the stage 2 hold
phase. While there can be some downward glitching due to CSG between do21 and
do32, in this scheme the parasitic capacitance CGD to do11 does not hurt logic ‘1’
integrity, since do11 is actually precharging during the stage 2 hold phase. Therefore
the NiSi 0.2 device (Fig. 4.9 - left) is able to effectively discharge the do31 output
node, leading to correct functionality. As expected, the Omega 0.2 device works
correctly in the presence of logic ‘1’ glitches.
However, in this sequencing scheme, logic ‘0’ glitching is an important consideration. Due to precharging of node do11, the output node do21 might have an upward
glitch from logic ‘0’ during its hold phase. For the Omega 0.2 device this upward
glitch might cause a logic ‘0’ value to reach above the threshold voltage of the device.
Given that this device has the lowest intrinsic delay of all devices considered, the
glitch may be sufficient to cause the stage 3 input xnwFET to operate in the linear
region, leading to loss of signal integrity (Fig. 4.9 – right). In other words, faster
devices are less resilient to logic ‘0’ glitching effects. Of the 6 devices considered, the
slowest NiSi 0.3 and Si 0.3 devices fail due to logic ‘1’ glitching effects, whereas the
Omega 0.2 fails due to the logic ‘0’ glitching. NiSi 0.2, Si 0.2 and Omega 0.3, which
are middle-of-the-road devices in terms of intrinsic delay, pass all signal integrity tests
and are correctly evaluated.


Figure 4.9. Cascading evaluations for NiSi 0.2 and Omega 0.2 devices using 3-phase
sequencing scheme - Logic ‘1’ glitching effects are reduced in this scheme, and NiSi
0.2 device shows expected behavior. However, logic ‘0’ glitching is critical for faster
devices. Upward glitch on do21 during eva3 causes loss of signal integrity at do31
node.


As seen from these results, both sequencing schemes and device properties have
strong implications on noise. Glitching occurs due to switching events in the neighborhood, which are influenced by the external control sequence. Therefore, while
device parameters such as VTH and intrinsic delay need to be adjusted for noise resilience, additional noise optimizations could be done at the fabric level by altering
the sequencing schemes and eliminating or isolating glitching events. For example,
the 3-phase scheme is resilient to logic ‘1’ glitching for 4 out of 6 devices owing to the
higher driving voltage at the input nodes, whereas the other baseline scheme works
only for 1 of 6 devices. We could then potentially design a new noise resilient timing
scheme that preserves the logic ‘1’ advantages of the 3-phase timing scheme while
providing tolerance against logic ‘0’ glitching such that the fastest devices may be
leveraged in NASIC designs.

4.4 Noise Resilient Sequencing Scheme for the NASIC Fabric

In this section, we present and evaluate a new noise-resilient dynamic control
scheme that provides resilience against both logic ‘1’ and logic ‘0’ glitches across a
variety of devices. The scheme is described and all devices are evaluated against it
for the test circuit (Fig. 4.6).
Fig. 4.10 shows the new noise resilient sequencing scheme. Similar to the 3-phase scheme, the eva phase of any stage overlaps with the pre phase of the next stage. Also,
since both neighboring stages do not simultaneously discharge, logic ‘1’ glitching is
less severe than in the first scheme. However, the key difference for the noise resilient
scheme is the introduction of a second hold stage (labeled H2 in Fig. 4.10) to separate
evaluation events from noise events. For example, in the 3-phase scheme (Fig. 4.5),
do11 precharging can cause an upward glitch at do21, which affects logic ‘0’ integrity.
However, with the new scheme do21 has already been ’used’ as input for the next
stage, i.e. eva3 has completed before the noise event (i.e. pre1) occurs (shown by


Figure 4.10. Noise resilient 4-phase sequencing scheme for the NASIC fabric - Additional hold phase (H2) inserted to separate evaluation from noise event. Green
arrow shows do21 glitches only after eva3 has completed. Signals repeat every four
stages.

the green arrow in Fig. 4.10). In this new control scheme, signals repeat every four
stages.
Fig. 4.11 shows the output waveforms for the Omega 0.2 device with the new
noise resilient scheme. As expected, the logic ‘0’ at do21 is already consumed before
the glitching event occurs and does not affect do31. During eva3, stage 1 is in the
new H2 phase, which essentially isolates the noise event from the propagation event,
preserving signal integrity. Thus, using the new noise resilient timing scheme, devices
with lower intrinsic delays may be made functional in the NASIC fabric.
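The difference between the 3-phase and 4-phase schemes can be checked with a small schedule model. This is a sketch under stated assumptions rather than the control circuitry itself: time is discretized into equal steps, each stage lags the previous one by one step (so that eva of one stage overlaps pre of the next), and the names `phase_of` and `stage1_during_eva3` are hypothetical.

```python
# Phase order within one control period; hold phases follow evaluation.
THREE_PHASE = ["pre", "eva", "hold"]
FOUR_PHASE = ["pre", "eva", "H1", "H2"]


def phase_of(stage, t, scheme):
    """Phase of a 1-indexed stage at time step t (stage k lags stage k-1 by one step)."""
    return scheme[(t - (stage - 1)) % len(scheme)]


def stage1_during_eva3(scheme):
    """Return what stage 1 is doing while stage 3 evaluates for the first time."""
    t = 3  # stage 3 precharges at t = 2 and evaluates at t = 3
    assert phase_of(3, t, scheme) == "eva"
    return phase_of(1, t, scheme)


three = stage1_during_eva3(THREE_PHASE)  # stage 1 is precharging: noise overlaps eva3
four = stage1_during_eva3(FOUR_PHASE)    # stage 1 is in H2: pre1 occurs only after eva3
```

Under these assumptions the 3-phase schedule places pre1 in the same step as eva3, whereas the 4-phase schedule places stage 1 in H2 during eva3, which matches the isolation argument made above.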

4.5 Discussion

This section discusses implications of the 4-phase noise resilient timing scheme on
fabric performance, the effect of external noise sources (e.g. power supply droops)
and manufacturing implications.


Figure 4.11. Cascading evaluations for NiSi (solid) and Omega (Dotted) devices
using the noise resilient 4-phase control scheme - Results show signal integrity and
sufficient noise margins for logic ‘1’ glitches for both devices. Logic ‘0’ glitches have
been isolated from evaluation events and are therefore not propagated. The new
sequencing scheme achieves noise resilience and correct functionality for 4 out of 6
devices.


4.5.1 Performance Optimization and Evaluation

In general, it may be expected that the noise resilient 4-phase sequencing scheme
would run at slower frequencies than the 3-phase and basic schemes since additional
hold phases are inserted for noise resilience. However, since the 4-phase scheme
provides better logic ‘1’ values and isolates logic ‘0’ glitches, faster devices could be
leveraged with this scheme leading to significant performance improvements at the
system level.
However, even with faster devices, NASIC dynamic circuits need to be optimized for performance. Specifically, due to noise cascading effects and high output
impedance, charge at driving nodes and the associated gate-drive voltages are typically expected to be lower than VDD. Since ION is strongly dependent on VGS, this
implies that even devices with low intrinsic delays (e.g. Omega 0.2) may be operating
at sub-optimal points, leading to large evaluation delays and poor circuit performance.
Therefore, circuits need to be optimized ‘in-fabric’ to improve VGS and performance.
CMOS dynamic circuits typically use keeper devices or domino logic [45] for
achieving low output impedance. A keeper device is part of a feedback network,
which is turned ON when the output node is ‘1’, and OFF when it is ‘0’. Keeper
configurations are typically achieved with an inverter and a PMOSFET. However,
this may be hard to achieve on a regular NW based fabric without a large density
impact, since it requires nanoscale customization and feedback, in addition to p-type
FETs and static inverters for every NASIC dynamic gate. Similarly, domino logic
would need insertion of static CMOS stages between tiles. These approaches cannot
be directly integrated into the NASIC fabric.
One promising technique for increasing charge at the driving nodes is capacitance
engineering. The key idea is to increase the overall capacitance (and consequently
the charge stored) at input nodes, thereby reducing the magnitude of noise glitching and leading to higher gate voltages. While increased load capacitance at a


Figure 4.12. Capacitance engineering of input gates: adding gate capacitance at
outputs of Stage 1 increases gate-drive voltages of Stage 2 xnwFETs.

node will have a linear impact on performance, the expectation is that a net benefit will be achieved due to the better-than-linear relationship between ION and VGS.
Importantly, this technique does not impose new manufacturing challenges. A capacitance trench may be created at an input stage, increasing the net capacitance of
all input nodes in that stage (Fig. 4.12). This would be done at the granularity of a
NASIC stage (typically 10s – 100s of nm) using conventional photolithography steps
and would be easier to achieve than in a conventional DRAM process, which requires
isolated capacitors for every memory bit.
The test circuit used for performance evaluation with capacitance engineering is
shown in Fig. 4.13. Stage 1 generates imperfect outputs and is subject to noise effects
previously discussed. The time taken to fully discharge the output node of stage 2 is
measured as a function of fan-in. Stage 3 loads stage 2. Capacitors shown in green
are inserted at output nodes and improve drive voltages. It must be noted that these
capacitances improve logic ‘1’ noise margins, since more charge is stored on the nodes
and magnitude of downward glitching is reduced.
Experiments were done to characterize the evaluation delay of NASIC dynamic circuits as a function of fan-in. Maximum operating frequency is defined as 1/(N × delay),
where N is the number of distinct evaluate phases in the control scheme (explicitly, N


Figure 4.13. Test circuit for performance evaluation as a function of fan-in - The
time taken to discharge do21 through a xnwFET stack consisting of N inputs is
measured. Stage 3 provides constant capacitive loading.


Figure 4.14. Graph showing frequency and drive voltage improvements against
capacitive loading for fan-in 4 NASIC dynamic circuits - 5X improvement in operating
frequency compared to no cap-loading is demonstrated.

is 4 for 4-phase). The reasoning is that the minimum duration of any single evaluate
phase has to be at least equal to the delay for completely discharging the output node
through the pull-down network.
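The frequency definition above can be written as a pair of small helper functions. This is a minimal sketch; the function names are hypothetical and the 100 ps delay is an assumed value used only to show the arithmetic.

```python
def max_frequency_hz(eval_delay_s, n_phases):
    """Maximum operating frequency when every phase lasts one evaluation delay."""
    return 1.0 / (n_phases * eval_delay_s)


def eval_delay_s(frequency_hz, n_phases):
    """Inverse relation: evaluation delay implied by a given maximum frequency."""
    return 1.0 / (n_phases * frequency_hz)


# Assumed example: a 100 ps evaluation delay under the 4-phase scheme (N = 4).
f_max = max_frequency_hz(100e-12, 4)  # 2.5 GHz
```

The same relation can be inverted to recover the evaluation delay implied by a reported maximum frequency for a given control scheme.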
Fig. 4.14 shows drive voltage and maximum operating frequency vs. capacitance
for fan-in 4 NASIC dynamic gates. Without any capacitive loading, a maximum
frequency of 1.68 GHz is obtained. However, increasing the capacitance leads to a
5X improvement in performance. A key observation is that for smaller drive voltages,
significant improvements in performance are seen. However, at higher drive voltages,
the ION vs. VGS relationship becomes more linear, and the effect of better driving
voltages due to capacitance at the input node is negated by the linear impact of the
output load capacitance.
For capacitance loading between 9 aF and 30 aF, only a 5% standard deviation in performance is
observed, implying that performance is not very sensitive to variations in the capacitance values. Also, new techniques to mitigate the impact of variability in nanoscale
fabrics [39] may be leveraged to improve the performance further. Similar trends are
seen at other fan-ins.


Figure 4.15. Maximum operating frequency with and without capacitance loading
vs. fan-in: a consistent 4.5X to 6X improvement in performance is seen for all fan-ins.

Fig. 4.15 shows the maximum operating frequency vs. maximum fan-in for the
Omega 0.2 device with and without capacitance engineering. A consistent 4.5-6X
performance improvement is seen for all fan-ins with capacitance engineering (e.g.
for fan-in 10, maximum operating frequency increases from 798 MHz to 3.34 GHz).
These results attest to the importance of achieving high drive voltages at input nodes.

4.5.2 Impact of Power Supply Droop on NASIC Fabric Functionality

The previous sections dealt exclusively with internal noise sources such as those arising
from parasitic capacitances. Fundamentally, fabric design and optimizations have to
be validated for functionality by mitigating internal noise. However, external effects
such as power supply variation, clock skew, thermal vibrations and soft errors can also
be detrimental to nanoscale fabric functionality. The latter two effects may partially
be dealt with through built-in fault tolerance techniques incorporated in the NASIC
fabric [35, 57]. With regard to clock skew, NASIC designs employ local interconnections between neighboring dynamic stages. The control signals that ’clock’ NASIC


stages are expected to be propagated on common rails from a Phase-Locked-Loop
with local phase shifters generating the four-phase clock. Given the local interactions
and the prescribed clocking structure, appreciable skew is not expected on control
signals. However, systematic effects such as fluctuations in VDD could still disrupt
functionality, especially when considered in conjunction with internal noise sources.
In this section, we examine how VDD changes may affect fabric functionality. The
test circuit in Fig. 4.6 was used and the four devices examined were: Si 0.2, NiSi 0.2,
Omega 0.3 and Omega 0.2. These devices were found to work correctly under nominal
VDD with the 4-phase noise resilient control scheme. VDD was varied systematically
for all the stages in the test design, because while across chip variation in VDD could
be large, little local variation is expected for smaller circuits using the same supply
rails. Up to 20% variation on either side of nominal (0.8 V) was considered.
Supply voltage spiking can be detrimental to logic ‘0’ outputs. However, these
upward glitches can be isolated using the 4-phase noise resilient scheme and our
simulations showed circuits with all four devices working correctly for up to a 20%
spike in VDD. Droops in supply voltage, on the other hand, affect logic ‘1’s. The
following results highlight the impact of power supply drooping.
The results are shown in Fig. 4.16 for the NiSi 0.2 (left) and Omega 0.2 (right)
devices. The trends for Si 0.2 and Omega 0.3 are very similar to NiSi 0.2. Omega 0.2
is extremely resilient to VDD noise (Fig. 4.16 - right) due to its smaller intrinsic delay.
Even when VDD drops to 0.65 V (∼20% droop), the logic ‘1’ values are evaluated
correctly and a strong ‘0’ is obtained at the do21 node. For NiSi 0.2, we see that for
VDD = 0.65 V, the stage 2 input devices are not fully turned on and do21 is not
fully discharged. An ambiguous signal ≈ VTH is obtained and loss of signal integrity
occurs at do31. While the voltage at do21 for VDD = 0.65 V is only slightly higher
than for VDD = 0.7 V, the stage 3 xnwFET is much more strongly turned on, leading
to incorrect discharge at the do31 node.


Figure 4.16. Impact of VDD drooping in conjunction with internal noise on cascaded
NASIC fabrics - Slower NiSi devices (left) do not discharge effectively and signal
integrity is lost for a 20% droop in VDD. Circuits using faster Omega 0.2 devices
(right) are resilient to VDD drooping.


These results highlight that devices with smaller intrinsic delays are resilient to
logic ‘1’ glitching caused by both internal and external noise sources. In conjunction
with fabric level noise resilient sequencing schemes and capacitance engineering, faster
devices may be leveraged for noise tolerant, high performance computational fabrics
and systems.

4.5.3 Manufacturing Considerations

Reliable and scalable assembly of nanostructures and manufacturing pathways
towards integrated systems continue to pose significant challenges. Therefore two
objectives must be concurrently achieved: i) Device design and optimizations at device/circuit levels must target circuit functionality and fabric noise mitigation, and ii)
In keeping with the fabric-centric mindset, physical layer assumptions targeting device
structures must not pose insurmountable challenges to the manufacturing sequence.
Silicidation of VLS grown nanowires with nickel for improved conductivity has
been shown in [60]. A similar silicidation process may be used to achieve NiSi gate
material as well as interconnect regions between xnwFETs. Since a final nickel silicidation step can be carried out after all ion implantation steps, thermal stability issues
for NiSi material do not arise.
Omega-gated structures could be achieved by nanolithography or other pattern
and etch techniques. For example, Superlattice Nanowire Pattern Transfer [34, 55] has
shown metal nanowires at sub-15nm pitches. Snider et al. [48] have shown nanoimprint lithography based copper nanowires.
Two device engineering techniques discussed include the back-gate bias and the
underlap. The substrate bias is applied to all devices in the fabric and therefore does
not impose new manufacturing constraints. The underlap is envisioned to be created
using a self-aligned process without any masking and is described below.


Self-aligned Underlap Formation: Source and drain junction underlap regions self-aligned to the gate nanowire are formed using spacer technology (Fig. 4.17). This
process is similar to what is used to form highly doped drain and source (HDD) in
CMOS devices and does not need any extra lithographic masking or overlay. During
the anisotropic etch step (Fig. 4.17c), deposited material on nanowire sidewalls is not
completely etched owing to higher thickness (Fig. 4.17b).

Figure 4.17. Front view of the xnwFET during the formation of the source and
drain underlap. (a) Initial structure right after channel nanowire, gate dielectric and
gate nanowire have been placed into position. (b) A thin layer of the spacer material
(oxide or nitride) is conformally deposited. (c) The spacer material is anisotropically
etched. (d) Ion implantation is performed to dope the source, drain and gate regions.

We believe that these physical layer choices carefully addressing manufacturing
considerations, in conjunction with manufacturing-friendly device and fabric optimizations for noise and functionality may pave the way for future nanowire-based
integrated nano-fabrics.

4.6 Chapter Summary

A methodology for integrated device-circuit explorations of nanodevice based systems was presented. This methodology provides a fast and accurate way to create
behavioral models for circuit simulations from device data using regression analysis.
Furthermore, this approach is very generic, and can be applied to any nanodevice
based computing system.


Cascaded crossbar dynamic circuits were validated using this integrated approach
that combines circuit simulations, regression analysis, and accurate 3-D physics based
device models. Three different xnwFETs were investigated; a xnwFET with 10 nm
gate, 10 nm channel, underlap of 7 nm and a substrate bias of -1 V was found to meet
circuit requirements including sufficiently high on/off ratios and a VTH of +0.23 V.
Circuit simulations show that this device combined with NASIC circuit and logic
styles can achieve correct cascading with adequate noise margins. Future work will
address implications for optimized devices such as those based on cylindrical nanowires,
fully-silicided gates, omega-gated structures etc. as well as new noise mitigation and
performance enhancement techniques.


CHAPTER 5
PARAMETER VARIATION IN NANOSCALE
COMPUTING FABRICS: BOTTOM-UP INTEGRATED
EXPLORATION

Reliable and deterministic manufacturing of integrated nanosystems continues
to be challenging. Self-assembly based approaches as well as photolithography at
feature sizes of a few tens of nanometers and below are expected to introduce significant
levels of permanent defects as well as large variations in physical parameters. While
permanent defects have been extensively analyzed at circuit and system levels through
approaches such as built-in defect tolerance [57, 35] and reconfiguration [49, 48], there
is little understanding of the impact of parameter variability for emerging nanoscale
fabrics.
Parameter variations arise due to imprecision in the manufacturing process as well
as fundamental atomic scale randomness. At nanometer dimensions where structures
typically consist of tens of atoms/molecules, even a small absolute variation in the
number of atoms causes a large shift in the electrical characteristics (e.g., random
dopant fluctuation and VTH [59]). This could potentially lead to performance deterioration and/or yield loss.
In this chapter, we explore a methodology for evaluating the impact of variability on a nanoscale fabric. This methodology is integrative across device, circuit and
architectural layers. It builds on the core concepts of the device-circuit exploration
methodology described in the previous chapter including physics-based simulation of
device structures and regression-based behavioral models, but incorporates sources of
variation for xnwFETs as well as architectural-level evaluations using a custom simulator.
We identify key sources of variability at the physical layer, such as channel and
gate dimensions of transistors and incorporate them into unified behavioral models
for circuit simulation. Extensive HSPICE based characterization of circuits may be
done and a library of gate delays incorporated into a high-level architectural simulator to evaluate system-level variability impact. While there has been some previous
work in characterizing properties of nanomaterials (e.g., distributions of nanowire
diameters for a particular manufacturing setup [31, 12]), devices (e.g. on-current
variation [33]) or architecture, this is the first time that an integrated bottom-up approach evaluating implications of variability across multiple fabric levels is presented.
The variability framework, while discussed in the context of NASICs, is fully generic
and can be adapted to other nanofabrics as well.

5.1 Methodology for Addressing Variability in Nanoscale Systems

In this section we present the methodology for achieving integrated device-circuit-architectural explorations considering parameter variability. This methodology, while
discussed in the context of the NASIC fabric, is fully generic and can be applied
to other emerging nanoscale computational fabrics for which analytical models of
device behavior considering variations are not available. This integrated approach
ties physical layer variability to circuit and system level metrics such as delay and
performance.
The overall methodology for integrated exploration is presented in the flowchart
in Fig. 5.1. The methodology for parameter variation builds on the integrated
device-fabric exploration methodology presented in the previous chapter but includes
sources of physical parameter variation (e.g. channel diameter, oxide thickness) as
independent variables in addition to gate-source and gate-drain voltages. Devices
are characterized extensively using Synopsys Sentaurus to extract current-voltage


Figure 5.1. Methodology for evaluation of parameter variation integrating device,
circuit and architectural levels


Figure 5.2. Crossed Nanowire Field Effect Transistor (xnwFET) structure

and capacitance-voltage information. Different device configurations are investigated
based on values of physical parameters and their behavior quantified. If the device
does not meet circuit requirements for correct functionality, device design may be
iteratively carried out. Otherwise, the current and capacitance data are fitted using a
standard curve-fit tool to obtain mathematical expressions for the data. Using these,
a unified behavioral model is created for a circuit simulator such as HSPICE. The
unified behavioral model accurately describes the behavior of a single device across
a range of input voltages and physical parameter values. Circuit level simulations
incorporating Monte Carlo sampling of individual parameters may then be carried
out to obtain distributions of circuit delays with parameter variation. This information is then used to create a library of delays and incorporated into a custom
nano-architectural simulator to quantify the critical path delays and performance
of large-scale designs. To the best of our knowledge, this framework is the first of its kind.
Subsequent sections describe each phase in more detail.


5.2 Device Parameter Variability

Key sources of variability for a single xnwFET device were identified based on device structure (Fig. 5.2) and manufacturing sequence. These include channel diameter
and doping, gate oxide thickness, gate diameter as well as source-drain doping. Table 5.1 summarizes all parameters and their extent of variability. Variations in these
parameters are dependent on the specific fabrication process used. For example, if a
Vapor-Liquid-Solid (VLS) growth method [31] is assumed for nanowire growth, the
gate and channel diameter parameters would be very strongly correlated to variations
in the catalyst nanoparticles used as seeds. The standard deviation in wire diameter
has been shown to be less than 10% in [31, 12]. Similar deviation is seen for Silicon
nanowires with SNAP [18]. Atomic Layer Deposition for gate oxide formation has
been shown to have spatial variability as low as σ=1% [32].
xnwFETs need to be engineered to meet NASIC circuit requirements (e.g., threshold voltage, on-off current ratios [40]). Device level techniques such as gate underlap
and substrate bias were applied in conjunction to achieve these targets. However,
these techniques can be sources of additional variability. For example, variation in
the length of the underlap can significantly affect I-V characteristics. Since this process step is identical to conventional spacer technology, the ITRS spacer requirements
table [25] defines the extent of variability allowed for underlap. For a 16nm CMOS
technology node this value is 3σ=±0.6nm which is 50% of the extent of variability
assumed in our work.
Large-scale integrated manufacturing of nanoscale computing systems is still in
its infancy, and for NASIC system fabrication, different approaches are currently
being investigated. Therefore, for our initial variability modeling, we conservatively
model 10% standard deviation (3σ=±30%) for all parameters¹. Random variation

¹ For doping levels, each device simulation assumes a discrete number of dopants. 10% standard
deviation represents the average deviation over multiple device simulations.


Table 5.1. xnwFET device parameters and extent of variation

Parameter                      Nominal Value          Standard Deviation
Channel diameter (Cdiam)       10 nm                  10%
Gate diameter (Gdiam)          10 nm                  10%
Underlap (Ulap)                4 nm                   10%
Gate oxide thickness (Gox)     3 nm                   10%
Bottom oxide (Box)             10 nm                  10%
Channel doping (Cdop)          10^18 dopants/cm^3     10%
Source-drain doping (Sddop)    10^20 dopants/cm^3     10%

in all parameters is assumed. Furthermore, physical parameters are expected to be
uncorrelated since they would be influenced by separate process steps. For example,
the gate oxide may be created using Atomic Layer Deposition (ALD) [46, 32]. There
is no dependence of this parameter on any other process step. Similarly, variation in
the underlap is purely dependent on the spacers used, and not on any other step.
As more experimental data on device characterization becomes available and detailed process models developed, the modes and extent of variation can be suitably
altered.
Accurate 3D-physics-based simulations using Synopsys Sentaurus were carried
out to characterize the electrical behavior of the xnwFET device structures. Depending on extent of variability in individual parameters, multiple device configurations
were explored. Simulations were calibrated against published experimental data for
nanowire FETs at similar dimensions to account for effects such as carrier scattering
due to surface roughness and dielectric/channel interface trapped charges. Since parameters are assumed to be uncorrelated, in these simulations each parameter was
varied one at a time for ±3σ and the I-V and C-V data were obtained for all device
configurations. This data was then used to construct unified behavioral models for
circuit simulations.
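The one-at-a-time sweep described above can be sketched as a generator of device configurations. The nominal values follow Table 5.1 and the 10% standard deviation follows the text; the dictionary keys, the function name and the choice of intermediate sample points (±1σ, ±2σ) are assumptions made for illustration, since the exact sampling grid is not stated here.

```python
# Nominal values from Table 5.1 (lengths in nm, doping in dopants/cm^3).
NOMINAL = {
    "Cdiam": 10.0, "Gdiam": 10.0, "Ulap": 4.0, "Gox": 3.0, "Box": 10.0,
    "Cdop": 1e18, "Sddop": 1e20,
}
SIGMA_FRACTION = 0.10  # 10% standard deviation for every parameter


def one_at_a_time_configs(nominal, sigma_fraction, steps=(-3, -2, -1, 1, 2, 3)):
    """Return the nominal device plus configs with one parameter shifted by k sigma."""
    configs = [dict(nominal)]
    for name, value in nominal.items():
        for k in steps:
            cfg = dict(nominal)  # all other parameters stay at nominal
            cfg[name] = value * (1 + k * sigma_fraction)
            configs.append(cfg)
    return configs


configs = one_at_a_time_configs(NOMINAL, SIGMA_FRACTION)
```

Each configuration in the resulting list would correspond to one device simulation from which I-V and C-V data are extracted.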


5.3 Circuit-level Simulations

In order to represent the behavior of the device accurately in a circuit simulator
such as HSPICE [2], curve-fitting of the raw data obtained from device simulations
needs to be done. In this step, the current (and various parasitic capacitances)
are fitted as a function of independent variables, i.e., input voltages (drain-source
(VDS) and gate-source (VGS) voltages) as well as the physical parameters described
in Table 5.1. This step was accomplished using the statistical computing tool R [1].
Mathematical expressions describing the current (and capacitances) as functions of
the independent variables are then obtained for various regions (see Fig. 5.1 for flow).
An equivalent circuit for the xnwFET was then built into HSPICE incorporating
the current source and the parasitic capacitances using sub-circuit definitions. The
current and capacitance are calculated on-the-fly during simulations using the fitted
mathematical expressions. The subcircuit definition in conjunction with the expressions for individual elements forms the unified behavioral model for the xnwFET
device.
NASIC dynamic circuits were extensively characterized for delay using these models. A typical NASIC dynamic circuit is shown in Fig. 5.3. It has N inputs, as well
as control xnwFET devices for precharge and evaluate. The output node is first
precharged to logic ‘1’, and then the pre signal is switched off and eva is enabled.
If all inputs are logic ‘1’, the output node will discharge to logic ‘0’ accomplishing
NAND gate functionality. The NAND gate is the universal building block for large
scale designs, and its delay behavior needs to be extensively characterized and a
library of delay distributions constructed for use in an architectural level simulator.
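The logical behavior of the dynamic gate described above can be captured in a few lines. This is a purely logical sketch, assuming ideal precharge and evaluation with no noise or delay; the function name is hypothetical.

```python
def dynamic_nand(inputs):
    """Logic value at the output node after one precharge/evaluate sequence."""
    output = 1            # precharge: the output node is charged to logic '1'
    if all(inputs):       # evaluate: the series stack conducts only if every input is '1'
        output = 0        # the output node discharges to logic '0'
    return output
```

For example, `dynamic_nand([1, 1, 1])` evaluates to 0, while any input at logic ‘0’ leaves the precharged ‘1’ in place.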
Delay characterization was done using NASIC dynamic NAND gates with number
of inputs varying from 1 to 30. The Monte Carlo simulation framework available
with HSPICE was used to vary parameter values and the delay to precharge and
evaluate the output node was obtained. Parameters are assumed to follow a Gaussian


Figure 5.3. N-input dynamic NAND circuits characterized for delay distribution

distribution, with the mean and standard deviation values specified in Table 5.1.
They are varied independently for each device, except for the channel diameter which
is assumed to be the same across all devices, since all devices are along the same
nanowire. Since it may be very hard to do detailed circuit-level simulations on a
larger design such as the WISP-0 processor, the delay information is abstracted and
used in a higher level architectural simulator.
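
The sampling scheme described above (one channel diameter per nanowire, all other parameters drawn independently per device) can be sketched as follows. This is a minimal illustration in Python, not the HSPICE Monte Carlo setup itself. The function name, the parameter names, the underlap value in the usage note, and the 10% relative standard deviation default are assumptions; the actual means and standard deviations are those of Table 5.1.

```python
import random


def sample_gate_parameters(n_devices, nominal, rel_sigma=0.10,
                           shared=("channel_diameter",), rng=None):
    """Draw Gaussian parameter values for the devices of one dynamic NAND gate.

    nominal   : dict mapping parameter name -> nominal (mean) value
    rel_sigma : standard deviation as a fraction of the nominal (assumed 10%)
    shared    : parameters sampled once and reused for every device, since all
                devices of the gate sit on the same nanowire
    """
    rng = rng or random.Random()
    # One draw per shared parameter for the whole gate.
    shared_values = {
        name: rng.gauss(nominal[name], rel_sigma * nominal[name])
        for name in shared
    }
    devices = []
    for _ in range(n_devices):
        device = {}
        for name, mean in nominal.items():
            if name in shared_values:
                device[name] = shared_values[name]
            else:
                # Independent draw for each device.
                device[name] = rng.gauss(mean, rel_sigma * mean)
        devices.append(device)
    return devices
```

A call such as `sample_gate_parameters(15, {"channel_diameter": 10.0, "underlap": 4.0})` would return 15 device records that share one diameter value but differ in underlap; the nominal values shown here are placeholders.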

5.4 Architectural Simulations

The architectural simulations take as input the gate delay characterizations as
shown in Fig. 5.1. We use a custom-written simulator called FTSIM. FTSIM takes as
input a NASIC circuit definition, gate timing characterizations, and parameters for
defects and simulates the operation of the circuit on a cycle-by-cycle basis, tracking
values within the circuit logically.
FTSIM handles both parameter variations and permanent defects. For permanent
defects, the user specifies the type of defects (e.g. stuck-on, stuck-off devices, broken
nanowires) and individual defect rates. A Monte Carlo system is used for defect
injection and multiple trials carried out. Clustered defects may also be handled.
Additional information on defect tolerance can be found in [35, 57].
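
As an illustration of Monte Carlo defect injection, the sketch below assigns each device a state according to user-specified defect rates. It is not the FTSIM implementation, which is not reproduced here. The function name and the defect-type labels are assumptions, and defects are treated as independent (no clustering).

```python
import random


def inject_defects(n_devices, defect_rates, rng=None):
    """Assign each device the state 'ok' or one of the given defect types.

    defect_rates : dict mapping defect type -> per-device probability,
                   e.g. {"stuck_on": 0.03, "stuck_off": 0.03} (example values)
    """
    rng = rng or random.Random()
    total = sum(defect_rates.values())
    if not 0.0 <= total <= 1.0:
        raise ValueError("defect rates must sum to a value in [0, 1]")
    states = []
    for _ in range(n_devices):
        u = rng.random()  # uniform in [0, 1)
        state = "ok"
        cumulative = 0.0
        # Walk the cumulative distribution to pick at most one defect type.
        for kind, rate in defect_rates.items():
            cumulative += rate
            if u < cumulative:
                state = kind
                break
        states.append(state)
    return states
```

Repeating this draw once per trial gives a different defect map for every simulated chip.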
For parameter variations, timing characterization information for NAND gates
from HSPICE is used. For each trial, the gate delay of every stage is sampled from the
distribution of delays obtained from circuit simulation, and the maximum frequency
at which correct outputs are obtained is determined.
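
A much simplified sketch of this sampling step is shown below. It is not FTSIM: it skips the cycle-by-cycle logical simulation and simply assumes that the cycle time is set by the slowest sampled stage. The function name, the fan-in list, and the delay samples in the usage note are illustrative assumptions.

```python
import random


def max_frequency_trials(stage_fanins, delay_samples_ps, n_trials=1000, rng=None):
    """Estimate a maximum operating frequency (GHz) for each Monte Carlo trial.

    stage_fanins     : fan-in of every stage in the design
    delay_samples_ps : dict mapping fan-in -> list of gate delays (ps) obtained
                       from circuit-level Monte Carlo simulation
    """
    rng = rng or random.Random()
    freqs_ghz = []
    for _ in range(n_trials):
        # Sample one delay per stage from the distribution for its fan-in.
        stage_delays = [rng.choice(delay_samples_ps[f]) for f in stage_fanins]
        # Simplifying assumption: the slowest stage limits the clock period.
        slowest_ps = max(stage_delays)
        freqs_ghz.append(1000.0 / slowest_ps)  # 1 / (x ps) = 1000 / x GHz
    return freqs_ghz
```

For instance, `max_frequency_trials([15, 30, 30], {15: [150.0, 174.0, 210.0], 30: [300.0, 340.0]})` returns 1,000 frequency samples from which a performance distribution can be built; the delay lists here are placeholders, not measured data.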

In this work, we ran 1,000 trials, which produce sufficient working circuits to
give a sound idea of the performance distributions. The output of this stage is the
performance distributions for the test architectures considered.

5.5 Variability Impact on xnwFET Devices²

At the device level, variation in physical parameters affects the on-current (ION) of
the device³ and the capacitances of xnwFETs. This implies variation in the on-resistance,
leading to variations in delay and performance at higher levels.
In this study, physical parameters from Table 5.1 are varied one at a time, and the
sensitivity of ION to parameter variation is measured. Parameters are varied across a
±3σ range, assuming 10% standard deviation (i.e., parameters are varied from 70%
to 130% of their nominal value).
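
The sweep range follows directly from the stated assumptions: with a 10% standard deviation, ±3σ corresponds to ±30% of the nominal value. A small sketch of how the sweep points could be generated is given below; the function name and the number of points are assumptions, and the 10nm nominal diameter in the usage note is inferred from the 7nm to 13nm range quoted in this section.

```python
def sweep_points(nominal, rel_sigma=0.10, n_sigma=3, n_points=7):
    """Evenly spaced values covering nominal * (1 ± n_sigma * rel_sigma)."""
    low = nominal * (1 - n_sigma * rel_sigma)
    high = nominal * (1 + n_sigma * rel_sigma)
    step = (high - low) / (n_points - 1)
    return [low + i * step for i in range(n_points)]
```

For a nominal channel diameter of 10nm, `sweep_points(10.0)` spans approximately 7nm to 13nm in 1nm steps.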
Not all parameters have equal impact on ION. The percentage change in on-current between the lowest and highest sampled value for each physical parameter is
shown in Table 5.2. Channel diameter has the largest impact, with ION varying by
3.5X over a 7nm to 13nm range.
For four parameters, a positive correlation exists between the parameter value and
ION. For example, as bottom oxide thickness increases, ION increases. The substrate
bias is used to deplete carriers in the channel for reducing leakage and improving
threshold voltage. However, the substrate bias also reduces ION due to a shift in
the threshold voltage. As the bottom oxide is made thicker, the electrostatic control
exerted by the back gate bias is reduced, producing a smaller positive VTH shift than
expected, leading to larger ION. As channel diameter increases, the channel resistance
² Work done at UCLA by Prof. Chui’s group, included for completeness.

³ Off-currents are also affected, but this is primarily a leakage issue. While variation in the off-currents is captured in device simulations and in the circuit level model, it is not expected to affect
the delay and performance of NASIC designs, which are the focus of this chapter.

Table 5.2. Impact of physical parameter variation (3σ = ±30%) on device on-current

Parameter                 % Change in ION   Correlation
Channel diameter          352.0             Positive
Underlap                  181.2             Negative
Bottom oxide thickness    147.2             Positive
Gate oxide thickness      58.2              Negative
Source/drain doping       23.8              Positive
Gate diameter             16.2              Negative
Channel doping            11.7              Positive

decreases due to an increase in the cross-sectional area, leading to an increase in ION.
Increasing the source and drain doping reduces the series resistance. Lastly, as channel
doping increases, the short channel effects (SCE) are somewhat alleviated, leading to
larger ION. The other parameters all correlate negatively with on-current. Increasing
the underlap increases the effective channel length, resulting in a decrease in ION.
Similarly, increasing the gate oxide thickness decreases the gate capacitance and hence
how effectively the gate can turn on the channel. Increasing gate diameter increases the length
of the channel underneath, decreasing ION.

5.6 Variability Impact on Circuit Level Delay and System Performance

5.6.1 Circuit Level Delay Characterization

NASIC N-input dynamic NAND gates (Fig. 5.3) were simulated in HSPICE using
unified behavioral models derived from device data. Delay characterization was done
for fan-in varying between 1 and 30, which is the maximum fan-in for the NASIC
WISP-0 processor, using the HSPICE Monte Carlo framework and Gaussian sampling
of individual parameters. A single channel diameter value was sampled per Monte
Carlo simulation for all devices, since all xnwFETs are on the same nanowire. Lengthwise variation has been shown to be negligible for the nanowire lengths considered [43]

Figure 5.4. Delay distributions for physical parameters with maximum impact on
on-current for (a) 15 input and (b) 30 input NASIC dynamic NAND gates. Black
line represents nominal.

for a process such as VLS growth. All other parameters were varied independently
for each device.
The delay sensitivity of NASIC N-input dynamic gates to individual parameters
was studied. We show the impact on delay for the four parameters that have maximum
impact on ION at the device level. Representative results for fan-in of 15 and 30 are
shown. Other fan-in gates were investigated and found to show similar trends.
Fig. 5.4(a) and (b) show the delay distributions for 15 input and 30 input NASIC
dynamic NAND gates. The delay distributions due to channel diameter, underlap,
bottom oxide and gate oxide thickness are studied. The following key observations are
made:

Channel diameter has the maximum impact on delay distribution: 81% (71%)
change in delay with respect to nominal for the 15 (30) input gate. This is due to the
high sensitivity of ION at the device level, and also due to the correlation of channel
diameter across all devices for a single NASIC dynamic NAND circuit. These effects
also imply a large percentage standard deviation, 18% (15%) for 15 (30) input gates,
leading to a wide spread of delay values.

Underlap is negatively correlated with ION. This implies that delays will be less
than nominal for shorter underlaps. Furthermore, device level sensitivity analysis shows that ION variation is asymmetrical with underlap: a 30% negative (positive) deviation
causes a +74% (-43%) change in ION. This would imply that in a circuit simulation, where underlap values for individual devices are independently sampled, the
delay distribution should be left-shifted (majority of devices operating better than
nominal). However, the opposite trend is observed. This is because the increasing trend in
ION with decreasing underlap is dominated by an increasing trend in the various
capacitances as distances between terminals shrink.
The evaluation delays for gate oxide and bottom oxide are tightly distributed
around the nominal, with mean values within 2% of nominal and standard deviation
of 3% for the 30 input gate. Since these parameters are sampled independently, and
there exist no appreciable asymmetries as compared to the underlap, variations in
the delays of individual devices tend to cancel out, especially in higher fan-in designs.
Fig. 5.5 shows delay distributions for the 15 input NASIC dynamic NAND gate
with all parameters varied simultaneously with 3σ=±30%. The mean is 20% higher
than the nominal due to the underlap asymmetry effect that skews the distribution
to the right. The same trend is observed in other fan-in gates as well. A 118% spread
with respect to the nominal is observed for 15 input gates. The relative spread was
found to be decreasing with increasing fan-in, as expected.
The gate delay distributions with all parameters varying for different fan-ins were
modeled as gamma distributions and used in an architectural simulator to evaluate
the process variation impact on a larger design.
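
One simple way to obtain gamma parameters from a set of simulated delays is a method-of-moments fit, sketched below. The fitting procedure actually used in this work is not stated, so this choice, the function names, and the example delays in the usage note are assumptions made purely for illustration.

```python
import random
import statistics


def fit_gamma_moments(delays_ps):
    """Method-of-moments gamma fit: returns (shape, scale).

    For a gamma distribution, mean = shape * scale and
    variance = shape * scale**2.
    """
    mean = statistics.fmean(delays_ps)
    var = statistics.pvariance(delays_ps)
    if var <= 0:
        raise ValueError("delays must not all be identical")
    shape = mean * mean / var
    scale = var / mean
    return shape, scale


def sample_delay(shape, scale, rng=None):
    """Draw one gate delay (ps) from the fitted gamma distribution."""
    rng = rng or random.Random()
    return rng.gammavariate(shape, scale)
```

Calling `fit_gamma_moments([180.0, 200.0, 220.0])` yields a distribution with a mean of 200ps; an architectural simulator could then call `sample_delay` once per gate and trial.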

5.6.2 System Level Performance

Architectural simulations of the NASIC WISP-0 processor [56, 58] were carried
out using the architectural simulation framework described in Fig. 5.1 and Section 5.4.

Figure 5.5. Delay distribution for 15 input gate with all parameters simultaneously varied: Nominal value is 174ps. Distribution is right-shifted due to asymmetric
underlap effect

Gate delay distributions obtained from Monte Carlo simulations of NASIC dynamic
NAND gates were sampled for each gate in the design and the maximum operating
frequency at which the processor functioned without missed deadlines was estimated.
The probability density function of operating frequencies obtained is plotted in
Fig. 5.6(a). Also shown in the diagram is the nominal frequency for WISP-0 without
any process variation. From the diagram, parameter variation causes performance
deterioration in 67% of the samples investigated.
WISP-0 is not fully balanced with respect to timing and delay. The frequency
is therefore determined entirely by a small number of high fan-in data-paths. If the
delays sampled from these paths are lower than nominal then the performance of
the entire design is not affected or may even improve. However, in designs balanced
for timing, such as commercial processors where a lot of emphasis is typically put on
timing path optimizations, there will be a large number of paths with similar nominal
delay. The slowest path among these would determine the operating frequency. This
implies that for balanced designs with process variation, a much larger fraction of
chips will be slower than nominal, since data speed-up along some high fan-in paths
will be entirely offset by others.
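
The argument for balanced designs can be made concrete with a small probability sketch. It assumes, purely for illustration, that each near-critical path is slower than nominal independently with the same probability; neither the independence assumption nor the example probability is taken from the simulations in this chapter.

```python
def fraction_slower_than_nominal(p_path_slow, n_critical_paths):
    """Probability that at least one of n independent near-critical paths
    is slower than nominal, so that the chip as a whole runs below nominal."""
    return 1.0 - (1.0 - p_path_slow) ** n_critical_paths
```

With a hypothetical per-path probability of 0.5, a design limited by a single path is slower than nominal in half of the samples, whereas a design with ten equally critical paths is slower than nominal in more than 99.9% of the samples.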

Figure 5.6. Distribution of WISP-0 operating frequencies showing impact of parameter variations: (a) With no built-in fault tolerance incorporated, 67% of chips
operate at frequency below nominal due to variations in device parameters (b) PDF
for 2-way and 3-way redundancy schemes, showing a majority of samples operating
at better-than-nominal frequencies (normalized frequency > 1).

Results in Fig. 5.6(a) are for designs with no built-in fault tolerance. However,
nanoscale fabrics based on self-assembly manufacturing processes tend to have very
high defect rates (in NASICs we assume 10 orders of magnitude higher than CMOS,
or 100s of millions to billions of defective devices per cm²), which necessitate the use
of built-in fault tolerance for achieving acceptable effective yield. These techniques
may also provide resilience against parameter variation related timing faults, since the
fault tolerance is agnostic to the source of the fault (permanent defects or parameter
variation).
Fig. 5.6(b) plots the distribution of maximum operating frequencies obtained for
2-way and 3-way redundant WISP-0 designs at a 6% device level defect rate. The x-axis is normalized to the respective nominal frequencies (no parameter variation). In
these cases, timing faults due to slower data-paths are masked by redundant fast data-paths, which implies that a majority of samples (75% for 2-way redundancy) operate
at frequencies better than nominal, showing that built-in fault tolerance can provide
resilience against parameter variations in conjunction with manufacturing defects. A
variety of new techniques based on FastTrack and biased voting schemes that carefully
manage yield and performance tradeoffs and are optimized for parameter variation
as opposed to permanent defects have been developed for nanoscale fabrics [39].

5.7 Chapter Summary

A novel methodology for bottom-up integrated device-circuit-architectural explorations for analyzing the impact of parameter variability in nano-device based
computing systems was developed. The methodology builds on accurate 3D physics
based simulations of device structure to capture variations in on-current as a function of physical parameters. Circuit and architectural simulations can then be done
to evaluate the impact of this variability on gate delay and system level performance
respectively.
The methodology was evaluated on the NASIC computational fabric with xnwFETs, NASIC dynamic NAND gates and a processor design. Key sources of variation at the device level such as channel diameter were identified and sensitivity of
ION was evaluated. ION may vary by up to 3.5X with variations in the channel diameter and by up to 1.5X with gate underlap. Circuit level simulations identified the
evaluate time in NASIC designs as the dominant component of the gate delay with
parameter variation incorporated. Gate delay simulations varying a single parameter
show up to ±40% variation from nominal gate delay.
For a processor with no fault tolerance, 67% of chips were found to operate at
frequencies below nominal due to parameter variation. However, given the high defect
rates of nanomanufacturing, nanoscale computing fabrics would incorporate built-in
fault tolerance that could also provide resilience against timing faults.

CHAPTER 6
MANUFACTURING PATHWAYS AND ASSOCIATED
CHALLENGES FOR NASICS

Reliable manufacturing of large-scale nanodevice-based systems continues to be
challenging. Self-assembly based approaches, while essential for the synthesis and
scalable assembly of nano-materials and structures at very small dimensions, lack the
specificity and long-range control shown by conventional photolithography. Other
non-conventional approaches such as electron-beam lithography (EBL) provide the
necessary precision and control and are pivotal in characterization studies; but these
are not scalable to large scale systems. Examples of small nanoscale prototypes
include a carbon nanotube FET based ring oscillator [8] and an XOR gate using SNAP
assembled semiconductor nanowires and electron-beam lithography [54]. In all these
cases, the focus has been on creation of devices followed by arbitrary interconnections
to build logic gates, an approach that is not scalable to large-scale systems.
In general, a manufacturing pathway for integrated nanosystems needs to achieve
three important criteria:
• Scalability: Large scale simultaneous assembly of nanostructures/devices on a
substrate must be possible.
• Interconnect: Nanodevices must be interconnected in a prescribed fashion for
signal propagation and achieving requisite circuit functionality. While it may
be possible to integrate individual devices together after assembly, an approach
that simultaneously creates nanodevices and interconnections poses fewer challenges and is expected to achieve better density.
• Interfacing: The nanosystem must be effectively interfaced with the external
world.
In this chapter we explore a manufacturing pathway for NASICs that realizes
the fabric as a whole including devices, interconnect and interfacing. This pathway
employs self-assembly/unconventional patterning-based approaches for scalable assembly of semiconductor nanowires, and conventional lithography based techniques
for parallel and specific functionalization of nanodevices and interconnects. While
individual steps have been demonstrated in laboratory settings, challenges exist in
terms of meeting specific fabric requirements and integration of disparate process
steps.

6.1 Fabric Choices Targeting Manufacturability

Before delving into the details of the manufacturing pathway, it is instructive to
look at certain aspects of the NASIC fabric that significantly mitigate requirements
on manufacturing. Design choices have been made at the device, circuit, and architectural levels targeting feasible manufacturability while carefully managing constraints.
This is in direct contrast to other technologies such as CMOS which optimize designs
for performance and area, but place stringent requirements on the manufacturing
process.
• NASIC designs use regular semiconductor nanowire crossbars without any requirement for arbitrary sizing, placement or doping. Regular nanostructures
with limited customization are more easily realizable with unconventional nanofabrication approaches.
• NASIC circuits require only one type of xnwFET in logic portions of the design.

• Local interconnection between individual devices as well as between adjacent
crossbars is achieved entirely on nanowires; interconnection of devices does not
introduce new manufacturing requirements.
• NASICs use dynamic circuit styles with implicit latching on nanowires. Implicit
latching reduces the need for complex latch/flip-flop components that require
local feedback.
• Tuning xnwFET devices to meet circuit requirements is done in a fabric-friendly
fashion; techniques such as gate underlap and substrate biasing do not impose
new manufacturing constraints.
• NASICs use built-in fault tolerance techniques to protect against manufacturing
defects and timing faults caused by process variation. Built-in fault tolerance
techniques do not need reconfigurable devices, extraction of defect maps, or
complex micro-nano interfacing as required by reconfiguration based fabrics.
All fault tolerance is added at nanoscale and made part of the design.
These fabric choices reduce manufacturing requirements down to two key issues:
assembling nanowire grids on to a substrate and defining the positions of xnwFET
transistors and interconnect. The latter step, also called functionalization, is the price
paid for manufacturing-time customization. The manufacturing pathway and associated challenges are discussed in the next sub-section. Note that by adjusting the
nanowire pitch any manufacturing issue can be managed, but the goal is to achieve
the smallest possible pitch.

6.2 Manufacturing Pathway

Key steps in the NASIC manufacturing pathway are shown in Fig. 6.1. Fig. 6.1(A)
shows a NASIC 1-bit full adder circuit. Horizontal nanowires are grown and aligned

Figure 6.1. Scalable manufacturing pathway for NASICs

on a substrate (B). In general, nanowire alignment can be in-situ, ex-situ, or direct-patterned. In-situ refers to techniques where nanowires are aligned in parallel arrays
during the synthesis phase itself. On the other hand, ex-situ refers to techniques where
nanowire synthesis and alignment are carried out separately. Lithographic contacts
for VDD and VSS as well as some control signals are created (B). A photolithography
step is used to protect regions where transistors will be formed while creating high
conductivity regions using ion implantation elsewhere (C, D). Ion implantation creates n+/p/n+ regions along the nanowires, which under suitable electrical fields act
as inversion mode source/channel/drain regions.
A gate dielectric layer is then deposited (or oxide is grown) (E), followed by alignment
of vertical nanowires. The above steps are now repeated for the vertical nanowire layer
(F-H). During ion implantation on vertical nanowires (H), channels along horizontal
nanowires are self-aligned against the vertical gates.
Key individual steps and challenges are discussed in detail in the following subsections.

6.2.1 Nanowire Growth and Alignment

The ideal technique to form aligned nanowire arrays should guarantee an intrinsic
and concurrent control over three key parameters:
• the number of nanowires,
• the inter-nanowire pitch, and
• the nanowire diameter within the array.
State-of-the-art semiconductor nanowire array formation with alignment techniques can be broadly classified into three categories:
• In-situ nanowire growth and alignment: In in-situ nanowire growth and alignment, nanowires are directly synthesized in an aligned fashion on a substrate. For
example, [28] has shown MOCVD growth of InSb nanowires from gold precursors in-plane using an InSb (111) substrate. Other representative techniques for
in-situ growth include gas-flow guiding [30] and electrical field guiding [52, 14].
This family of techniques is dependent on catalyst engineering and patterning as
well as compatibility of nanowires with the substrate. One approach to pattern
gold catalysts at sub-lithographic features is using oriented block-copolymer
films [51] as templates. The key advantage is that a separate transfer step for
nanowires is not required.
• Ex-situ nanowire alignment: In ex-situ nanowire alignment, nanowires are grown
separately and then transferred to the substrate. Representative techniques include
fluidic alignment [62] and organic self-assembly etch [19, 27]. The key advantages of ex-situ techniques are the wide variety of material choices and nanowire
synthesis processes that are available. It is also possible to achieve a tighter distribution of nanowire diameters since the growth process can be separately controlled. However, an effective transfer step is required to attach each nanowire
to a pre-defined location and to control the orientation of the transferred nanowires.
• Nanolithography-based pattern and etch techniques: In these approaches, a
semiconductor material layer pre-formed on the target substrate surface is first
patterned by nanolithography and then anisotropically etched to create a periodic nanowire array. While the etching process is rather standard, there are
two very promising nanoscale patterning techniques: nanoimprint
lithography (NIL) [9] and superlattice nanowire pattern transfer (SNAP) [54,
18]. These approaches in principle meet the aforementioned criteria in terms
of number, diameter and pitch of nanowires (e.g., since the reusable transfer
pattern can be precisely controlled) but raise some subtle practicality concerns. Since the surfaces of these nanowires are usually damaged during the
etching process, caution should be exercised to prevent significant degradation
in the resultant device performance. Also the choice of semiconductor nanowire
material is more limited compared to either the in-situ or ex-situ approach.
The construction of the 2D nanowire fabric for NASIC circuit applications consists
of two aligned nanowire array formation steps. The first (and bottom) semiconductor
nanowire array can be formed by either the nanolithography-based patterning-and-etching technique or the ex-situ aligned assembly method. The former selection is
primarily driven by the choice of silicon as the material. Since silicon-on-insulator (SOI) substrates are readily available, the patterning-and-etching technique could be considered
due to its capability of achieving aligned parallel nanowires with long-range order as
long as the nanowire surface damage could be minimized. Alternatively, the ex-situ
method remains an attractive solution with the advantages and challenges discussed
above.
The second (and top) array is preferentially formed by transferring a pre-aligned nanowire array assembled using either the ex-situ or in-situ approach. The
choice of a particular technique would depend on its ability to accomplish the key
specifications outlined above. Since the same material (silicon) with roughly the same
nanowire diameter and pitch is required in both arrays, it is beneficial to
employ the same method and repeat it.

6.2.2 Functionalization

In an n-type xnwFET, the gate, drain and source terminals are doped n+, whereas
the channel uses p-type doping for inversion mode operation. Similar to conventional
FET devices, the potential applied at the gate controls the flow of electrons between
the source and drain terminals. Customization of nanowire arrays is required to
define the positions of transistors on the grid for achieving arbitrary logic functions,
and create high conductivity interconnects elsewhere.
Nanowires assembled on the substrate are initially doped uniformly along their
lengths. The doping type corresponds to the channel doping of the inversion mode
FET devices (for example, if n-type FETs are needed, the nanowires transferred to
the substrate will originally contain p-type dopants).
We propose to use ion implantation, a well controlled technique used in the semiconductor industry, to create: a) high conductivity regions on nanowires where transistors do not exist, b) gate material of NWs, and c) gate self-aligned FET channels.
The minimum feature size is calculated to be (2 × pitch – width) squares (Fig. 6.1(C)
and 6.1(G)). Simulations of overlay requirements for NASICs were carried out, sampling overlay imprecision for successive lithographic masks as Gaussian random variables. Results of overlay simulations show that 100% overlay-limited yield can be
obtained for a mask misalignment of 3σ = ±5.7nm [53], which is a considerable
improvement over 16nm CMOS (3σ = ±3nm).
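
The idea behind such an overlay simulation can be sketched as a small Monte Carlo experiment. This is not the simulator of [53]: the pass/fail criterion (every mask offset must stay within a fixed tolerance), the tolerance value, the number of masks, and all function names are assumptions introduced for illustration. Only the minimum feature size expression and the Gaussian sampling of mask offsets come from the text.

```python
import random


def min_feature_nm(pitch_nm, width_nm):
    """Side of the minimum square feature: 2 x pitch - width."""
    return 2 * pitch_nm - width_nm


def overlay_limited_yield(three_sigma_nm, tolerance_nm, n_masks=2,
                          n_trials=10000, rng=None):
    """Fraction of trials in which every mask offset stays within tolerance.

    three_sigma_nm : 3-sigma overlay imprecision of each lithographic mask
    tolerance_nm   : largest offset the layout can absorb (assumed criterion)
    """
    rng = rng or random.Random()
    sigma = three_sigma_nm / 3.0
    good = 0
    for _ in range(n_trials):
        # One Gaussian misalignment per successive mask.
        offsets = [rng.gauss(0.0, sigma) for _ in range(n_masks)]
        if all(abs(x) <= tolerance_nm for x in offsets):
            good += 1
    return good / n_trials
```

Sweeping `three_sigma_nm` for a fixed tolerance shows how yield falls off as overlay imprecision grows, which is the kind of relationship the reported ±5.7nm figure summarizes.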
Ion implantation of horizontal nanowires is shown in Fig. 6.1(C), (D). These steps
create high conductivity regions along the assembled horizontal nanowires. n+/p/n+
regions are formed on the left side of Fig. 6.1(D); these act as source/channel/drain
regions. The n+ regions on the right of Fig. 6.1(D) are gates for vertical nanowires
that will be assembled in subsequent steps. An additional optional silicidation step
could be done to further improve the conductivity of the n+ regions defined in this
step.
Fig. 6.1(G), (H) show ion implantation steps applied to vertical nanowires. The
six vertical nanowires on the left are doped n+ and act as gates for underlying horizontal nanowire channels. The four vertical nanowires to the right contain n+/p/n+
source/channel/drain regions and are gated by underlying horizontal nanowires. Furthermore, this ion implantation step self-aligns the horizontal channels on the left
side of the figure against the vertical nanowire gates.
It must be noted that lithography is used to protect regions where FETs will
be formed, and not for complex patterning. In conjunction with self-alignment, this
implies that precise shapes with sharp edges are not needed. NASIC built-in defect
tolerance techniques [35, 57] further ameliorate requirements on lithography. Fewer
masks and NASICs' built-in fault tolerance imply that it may be possible to build
NASICs at a much lower manufacturing cost and finer resolution than scaled CMOS.
The manufacturing pathway in Fig. 6.1 needs nanowires to be transferred on to
the substrate twice, once each for horizontal and vertical nanowires. This implies
that vertical xnwFET channels are not on the substrate, but placed above layers of
horizontal nanowires and oxide; this poses some challenges in terms of self-alignment
of these channels against horizontal nanowire gates.
This concern may be overcome by using an approach with three separate nanowire
transfers, as shown in Fig. 6.2: A) vertical NWs in the output plane are first transferred, and ion implantation with lithographic masks is carried out; B) horizontal
NWs are transferred after gate dielectric deposition, and these NWs are self-aligned with
the previously transferred verticals; and C) input plane verticals are transferred and
ion implantation self-aligns these with the underlying horizontals. This approach,
however, poses challenges in terms of alignment of input vertical nanowires in one
Figure 6.2. Manufacturing pathway for NASICs with 3-step nanowire transfer

NASIC tile against output nanowires of the previous tile, as well as the requirement
of physical interconnections between nanowires assembled in separate steps.

6.3 Chapter Summary

One manufacturing pathway for the NASIC fabrics was discussed. The pathway
realizes the fabric as a whole with devices and interconnect formed as part of a regular
grid, as opposed to approaches focused on arbitrary interconnection of individual
nano-devices. Key challenges including nanowire alignment and functionalization
requirements were discussed.

CHAPTER 7
N3 ASICS: DESIGNING NANOFABRICS WITH
REDUCED MANUFACTURING REQUIREMENTS

In this chapter, we present N3 ASICs, a new nanoscale computing fabric that
eliminates the remaining manufacturing challenges for NASICs and can be built with
manufacturing solutions that are known today. In keeping with the fabric-centric
mindset, this reduction in manufacturing complexity is enabled by design choices at
multiple levels. While this new fabric trades off some of the density advantages of
NASICs, it still achieves considerable improvements in area/power/performance over
scaled CMOS with reduced manufacturing requirements.
As discussed in the previous chapter, unconventional direct-patterning based manufacturing techniques such as Nano Imprint Lithography (NIL) [38] and Superlattice
Nanowire Pattern transfer (SNAP) [55, 58] are able to produce ultra-high density nanostructures. For example, it has been shown that nanowires of 7nm width and 13nm pitch
can be patterned with SNAP [18] on an SOI substrate. However, these
and other unconventional techniques have poor overlay with respect to previously
formed patterns. The overlay imprecision reported for NIL was 3σ = ±105nm [44].
On the other hand, photolithography has excellent overlay and alignment precision. According to the International Technology Roadmap for Semiconductors (ITRS)
[26], state-of-the-art photolithography has an overlay imprecision of 3σ = ±6.4nm.
However, overlay alignment is expected to become much more challenging with further CMOS scaling (e.g., 16nm CMOS would require 3σ = ±3nm, manufacturing

Our goal in this chapter is to develop an approach by which we can combine unconventional and conventional manufacturing approaches while retaining the benefits
of both. Unconventional nanomanufacturing is used in conjunction with conventional
CMOS lithography and design rules to build a new class of 3-D integrated nanofabrics
(N3 ASICs: Nanoscale 3D Application Specific Integrated Circuits) with careful consideration of manufacturing and overlay requirements. We present the overall fabric
design and show a layer-by-layer assembly sequence for N3 ASICs depicting how the
complete fabric (including devices, interconnect and interfacing) may be realized on
a single Silicon-on-Insulator (SOI) wafer. We show how fine-grained integration between nanoscale and CMOS features can be achieved using standard area distributed
pins/vias and design rules. We also evaluate key system-level metrics such as density,
performance and power for N3 ASICs and compare them against both NASICs and an
equivalent 16nm CMOS design.

7.1 Physical Fabric Vision

We propose a new physical fabric that consists of nanowire arrays at the bottom
(built using unconventional manufacturing) with a conventional CMOS metal stack
for interconnect (built using photolithography) on top. All active devices and logic
implementation are realized on the ultra-dense nanowire arrays, which can be direct-patterned on an ultra-thin Silicon-On-Insulator (SOI) wafer. The patterning can be
achieved using techniques like NIL or SNAP that provide excellent control over the
number of nanowires, width and spacing. There is no second nanowire transfer step.
In this approach, patterning of high-density nanostructures is carried out prior
to all lithography steps without any overlay requirement. Furthermore if the defined
nanostructure pattern is regular (e.g. parallel arrays), the first lithographic mask
has overlay tolerance, i.e. it may be offset over the array without yield loss. Subsequent steps make use of conventional photolithography. The a priori assembly/direct-patterning of sub-lithographic features on the densest NW layer before any conventional lithographic step (e.g., for contacts/vias) means 3D overlay alignment requirements exist only between subsequent lithographic masks.
Fig. 7.1 shows the envisioned N3 ASICs fabric built on a standard Silicon-on-Insulator (SOI) wafer. It consists of uniform parallel semiconductor nanowire arrays
on which logic/memory is implemented. Active devices in N3 ASICs are single type,
doped dual channel crossed nanowire transistors (2C-xnwFETs). Area-distributed interfaces or vias are used to connect outputs of nanowire stages to a standard CMOS
metal stack. Metal interconnections between vias achieve arbitrary routing. The
nanowire logic plane is surrounded by CMOS circuitry. The peripheral CMOS circuitry can be used for control logic, dynamic clocking, mixed signal etc. N3 ASICs use
the same circuit styles as NASICs and previous explorations on device requirements,
cascading/noise issues and parameter variation are equally relevant.
To enable full and fine-grained integration with the CMOS metal stack without new
manufacturing or functionalization requirements, lithographic design rules need to be
followed. Standard lithography design rules are used for lithographic functionalization
steps including defining positions of transistors, power and control rails, vias,
interconnect etc. Lithographically defined vias or area-distributed interfaces connect
the nanowire arrays through a CMOS metal stack. Metal interconnects are used for
routing the signals in 3D. Adherence to design rules implies that functionalization
requirements are mitigated.
Fig. 7.2 shows representative λ design rules applied to the N3 ASICs fabric.
All design rule requirements like Metal-Metal spacing, Metal-via spacing and
Via-overhang are followed. C. Bencher et al. [5] project that the metal 1 (M1) pitch
for the 16nm technology node is 40nm. This is equal to 5λ, where λ=8nm for the 16nm
technology node.
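The λ arithmetic above can be checked with a few lines of Python. This is only an illustrative sketch: the helper names `to_lambda` and `to_nm` are hypothetical, and the only numbers used are the λ=8nm scale length and the 40nm M1 pitch quoted in the text.

```python
# Lambda-based design-rule arithmetic for the 16nm node (values from the text).
LAMBDA_NM = 8       # scale length: lambda = 8nm at the 16nm technology node
M1_PITCH_NM = 40    # projected Metal 1 pitch at the 16nm node [5]


def to_lambda(length_nm, lambda_nm=LAMBDA_NM):
    """Express a physical length as a multiple of lambda."""
    return length_nm / lambda_nm


def to_nm(multiple, lambda_nm=LAMBDA_NM):
    """Convert a lambda multiple back to nanometres."""
    return multiple * lambda_nm


print(to_lambda(M1_PITCH_NM))   # 5.0, i.e. the M1 pitch is 5 lambda
print(to_nm(2))                 # 16, i.e. 2 lambda is the 16nm feature size
```

Expressing every spacing as a λ multiple in this way is what lets the same rule set be re-evaluated for another node by changing a single constant.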


Figure 7.1. Nano-CMOS integrated N3 ASICs fabric


Figure 7.2. CMOS Design rules applied to N3 ASICs


Since metal vias are used to contact nanowires, the nanowire spacing should adhere
to CMOS design rules. Given that nanowires can have much smaller dimensions than
vias, multiple sub-lithographically patterned nanowires may be bundled within the same
via dimension without any density impact. Having more than one nanowire per
via allows for better contact, performance and inherent defect resilience (e.g. against
stuck-open channels).
Fig. 7.2 shows how a bundled pair of nanowires is contacted using a via. Metal 1
interconnects are used to connect the inputs of the transistors. Metal 2 interconnects
are used to connect the outputs on the nanowires to the subsequent stages.

7.2 Assembly Sequence

We present a simplified assembly sequence followed in building the N3 ASICs fabric.
At the bottom of the fabric is a uniform semiconductor nanowire array. This can
be direct-patterned on ultra-thin Silicon-on-Insulator. Nanowires can be bundled in
pairs in order to achieve better contact with the vias. Fig. 7.3A shows the uniform
dense nanowire array created prior to any lithographic step.
Fig. 7.3B shows the contact creation for VDD and GND, precharge and evaluate.
This diagram depicts the scenario of two stages cascaded next to each other. This
can be treated as two logic planes as shown in the figure. We can use interconnects
to route signals across the logic planes. Logic plane 1 is on the left and logic plane 2
is on the right.
Fig. 7.3C shows the metal gate deposition step. Metal gates (shown in green) are
deposited at certain positions to define 2C-xnwFETs using conventional lithography
and masks. Initially the nanowires are doped p-type. A self-aligning ion implantation
is then used to create n+/p/n+ source/channel/drain structures. This creates
enhancement mode 2C-xnwFETs similar to conventional MOSFETs in CMOS. All
device channels are oriented along the same direction and lie on the substrate itself.


Figure 7.3. Assembly Sequence for N3 ASICs fabric: A) Patterned Nanowires, B)
Creation of Lithographic contacts and dynamic control rails, C) Metal gate deposition
followed by self-aligned ion-implantation to define high-conductivity interconnect, D)
Metal 1 vias and interconnects, and E) Metal 2 interconnects cascading the outputs of
one logic plane to the inputs of the next.


Fig. 7.3D shows the Metal 1 vias and interconnects. Metal lines and vias are laid
down for interconnection. Inputs are received through an M1 array (light blue lines)
and vias are dropped on to the nanowires to tap the outputs (blue dots).
As shown in Fig. 7.3E, outputs from the left logic plane are cascaded to the inputs
of the right plane using M2 (orange lines). The output of the second logic plane can
be routed to other tiles using higher metal layers in the metal stack. This allows us to
achieve arbitrary routing between two different tiles. All local routing within a single
stage is achieved on the nanowires themselves. This helps in reducing the routing
overhead of the design.

7.3 Overlay Requirement

As discussed previously, the initial nanowire patterning step with unconventional
manufacturing does not have any overlay requirement. In this section, the impact
of mask overlay misalignment for subsequent lithographic masks is addressed. The
WISP-0 [56] nanoscale processor design was mapped onto the N3 ASIC fabric. Overlay
misalignments between successive masks were modelled as Gaussian random variables,
and Monte Carlo simulations were carried out in a custom simulator to determine the
number of functioning chips. The simulations were carried out for several 3σ overlay
misalignment values projected by ITRS 2011.
The contact creation and metal gate deposition steps involve alignment to the
smallest features, and hence they are most critical to mask overlay and contribute
significantly to the yield loss. Yield loss due to mask overlay during metal stack
creation is minimal (identical to conventional CMOS). Hence metal stacks higher
than M2 layer have not been considered in these simulations.
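The Monte Carlo procedure described above can be sketched in a few lines of Python. This is not the custom simulator used in this work, which evaluated the full WISP-0 layout. It is a minimal illustration under loudly stated assumptions: each critical mask receives an independent zero-mean 1-D Gaussian offset, a chip is counted as functional only if every offset stays within a single hypothetical tolerance `tolerance_nm`, and two critical masks (contacts and gates) are considered. The function name and all default values are illustrative, and the printed yields are not expected to reproduce Fig. 7.4.

```python
import random


def overlay_limited_yield(three_sigma_nm, tolerance_nm, n_masks=2,
                          n_chips=20000, seed=0):
    """Fraction of simulated chips whose critical masks all land in tolerance.

    Assumptions (not from the source): independent zero-mean 1-D Gaussian
    offsets per mask, and a chip fails as soon as any single offset exceeds
    tolerance_nm in magnitude.
    """
    rng = random.Random(seed)
    sigma = three_sigma_nm / 3.0
    good = 0
    for _ in range(n_chips):
        offsets = (rng.gauss(0.0, sigma) for _ in range(n_masks))
        if all(abs(dx) <= tolerance_nm for dx in offsets):
            good += 1
    return good / n_chips


# Sweep a few 3-sigma overlay values against a hypothetical 8nm tolerance.
for three_sigma in (4, 8, 12, 16):
    y = overlay_limited_yield(three_sigma, tolerance_nm=8.0)
    print(f"3-sigma = +/-{three_sigma}nm -> yield {y:.3f}")
```

A layout-aware simulator would replace the single tolerance with per-feature checks (e.g. whether each gate still fully crosses its nanowire bundle), which is where the regularity of the array relaxes the requirement.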
Figure 7.4. Mask overlay limited Yield vs. Overlay for 3D integrated fabric

The results in Fig. 7.4 show that 100% mask overlay limited yield may be obtained
for 3σ = ±8nm overlay (manufacturing solutions known as per ITRS 2011) when
constructing a uniform nanowire bundle with λ=8nm (16nm technology node) in the
3D integrated fabric. Within a bundle the width of nanowires is 5nm each, with 6nm
spacing to accommodate 16nm vias. Fig. 7.4 shows that even with a pessimistic mask
overlay projection of 3σ=±16nm a mask overlay limited yield of 83% can be observed.
These numbers are a significant improvement over overlay precision requirement for
NASICs, where the equivalent number is 3σ = ±5.7nm for 100% overlay-limited yield.
It is evident from the results that the use of a regular structure (like the nanowire
arrays in N3 ASICs) does not impose stringent constraints on overlay precision.
Further, fewer masks are required to manufacture this fabric compared
to a CMOS design, which is beneficial from both yield and cost perspectives. By contrast,
irregular structures would have more stringent mask overlay requirements. For
example, the proposed approach has considerably greater tolerance to overlay
imprecision than 16nm CMOS, which requires 3nm precision at the 16nm node as per
ITRS 2011.
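The bundle dimensions quoted above (two 5nm nanowires separated by 6nm under a 16nm via) can be verified with a short geometric check. The sketch below is illustrative only; the function name `bundle_span_nm` is hypothetical, and the pair count comes from the bundled pairs described in Section 7.1.

```python
# Geometry of a two-nanowire bundle under a single via (values from the text).
NW_WIDTH_NM = 5      # width of each nanowire in the bundle
NW_SPACING_NM = 6    # spacing between the nanowires of a bundle
VIA_WIDTH_NM = 16    # via dimension at the 16nm node (2 lambda, lambda = 8nm)


def bundle_span_nm(n_wires, width_nm, spacing_nm):
    """Outer-edge to outer-edge extent of n parallel wires."""
    return n_wires * width_nm + (n_wires - 1) * spacing_nm


span = bundle_span_nm(2, NW_WIDTH_NM, NW_SPACING_NM)
print(span, span <= VIA_WIDTH_NM)   # 16 True
```

The pair spans exactly the via width, so bundling adds the second channel without consuming any additional lithographic pitch.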


7.4 N3 ASICs Device, Circuits and Architectural Exploration

N3 ASICs evaluations were carried out at device, circuit and architecture level. The
integrated device-fabric exploration methodology proposed for NASIC was adopted.
Physical fabric choices impact the structure and properties of N3 ASICs devices.
For example, if SNAP is used to pattern the bottom-most ultra-dense nanowire layer,
nanowires with square cross section will be obtained. Further, use of CMOS design
rules facilitates bundling of nanowires because of the larger via dimension compared
to nanowires. Hence, dual-channel devices can be used in N3 ASICs. For this device
structure the electrical properties are obtained from Synopsys Sentaurus Device [3].
Using this data, a behavioral model compatible with HSPICE [2] is created. This
behavioral model is used to carry out circuit and system level evaluations.

7.4.1 Device Simulations¹

Dual-Channel Crossed Nanowire FETs (2C-xnwFETs, Fig. 7.5A) employ metal
Omega gate structures for tighter electrostatic control. The gate material work function
is 4.6 eV. 16nm channel devices were simulated given that this is the minimum feature
size for lithographically defined gates. The notation N3 ASICs-16 represents N3 ASICs
constructed with 16nm CMOS design rules, which implies that the scale length λ is equal
to 8nm. The channels are doped p-type of the order of 10¹⁸ cm⁻³ and the source/drain
regions are doped n-type of the order of 10²⁰ cm⁻³. A substrate bias of -3V was
assumed to deplete the channel and adjust device parameters such as threshold voltage
and on/off current ratios for correct cascading. A high-κ HfO2 material is used for
the gate oxide, with a thickness of 3nm. Table 7.1 summarizes the parameters
used for device simulations.
Drain current vs. drain voltage (I_DS-V_DS), drain current vs. gate voltage (I_DS-V_GS),
and different parasitic capacitances vs. gate voltage (C vs. V_GS) were simulated.

¹ Device simulations were done by other students in the group, but are included for completeness.


Figure 7.5. 3D structure of N3 ASICs device (2C-xnwFET)

Table 7.1. Device simulation parameters for 2C-xnwFET
Parameter                       Value
Gate Material                   Metal
Gate Workfunction (eV)          4.6
Channel Doping (cm⁻³)           10¹⁸
Gate Oxide Material             HfO2
Gate oxide thickness (nm)       3
Bottom oxide material           SiO2
Bottom oxide thickness (nm)     10
Back Gate bias (V)              -3
Source/Drain doping (cm⁻³)      10²⁰

Table 7.2. 2C-xnwFET Device simulation results
Parameter       N3 ASICs-16 2C-xnwFET
V_TH (V)        0.27
I_ON            39.6µA
I_ON/I_OFF      26218

On-current (I_ON) and on/off current ratio (I_ON/I_OFF) were extracted. Fig. 7.5B
shows the I_DS-V_DS curve for different V_GS values. These simulations verify inversion
mode behavior for 2C-xnwFETs with a positive threshold voltage.
Table 7.2 shows key device simulation results for the N3 ASICs-16 2C-xnwFET. With a
high on-current, V_TH > 0.2V, and I_ON/I_OFF > 10⁴, the devices meet circuit requirements
for correct functionality and noise.
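As a quick consistency check on the reported figures of merit, the sketch below derives the off-current implied by Table 7.2 and tests the two thresholds quoted above. The function names are hypothetical; the numbers are those of Table 7.2 and the surrounding text, with no additional device data assumed.

```python
# 2C-xnwFET figures of merit reported in Table 7.2.
V_TH = 0.27            # threshold voltage, V
I_ON = 39.6e-6         # on-current, A
ON_OFF_RATIO = 26218   # I_ON / I_OFF

# Circuit-level thresholds quoted in the text.
MIN_V_TH = 0.2
MIN_ON_OFF_RATIO = 1e4


def implied_off_current(i_on, on_off_ratio):
    """Off-current implied by the on-current and the on/off ratio."""
    return i_on / on_off_ratio


def meets_circuit_requirements(v_th, on_off_ratio):
    """True if both quoted thresholds are exceeded."""
    return v_th > MIN_V_TH and on_off_ratio > MIN_ON_OFF_RATIO


i_off = implied_off_current(I_ON, ON_OFF_RATIO)
print(f"I_OFF ~ {i_off * 1e9:.2f} nA")                  # I_OFF ~ 1.51 nA
print(meets_circuit_requirements(V_TH, ON_OFF_RATIO))   # True
```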

7.4.2 Circuit and System Evaluation

Detailed system-level evaluations were carried out using the WISP-0 nanoprocessor as
the test case. A 16nm CMOS equivalent of WISP-0 was developed in order to compare
area, power and performance. NASICs are 22× and N3 ASICs are 3× denser than the
16nm CMOS equivalent design. Both fabrics achieve comparable performance at 30×
and 5× lower power consumption, respectively. The density
advantage is due to the dense nanowire array at the bottom (implying the use of
devices with smaller dimensions when compared to conventional CMOS FETs), the use
of single-type FETs to realize logic, implicit latching on the nanowires (which ensures
that there is no need for area-expensive latches and flip-flops) and finally a reduced
transistor count compared to CMOS.
N3 ASICs trades off some density benefits, since CMOS design rules are used for
pitch and spacing, but achieves ease of manufacturability. As the nanowire layer
conforms to CMOS design rules, the spacing between the nanowires is greater compared
to a 2-D grid based NASIC fabric. The use of design rules, while alleviating
manufacturing requirements, reduces the density advantage of N3 ASICs to 3×. The evaluation
results are summarized in Table 7.3.

Table 7.3. Comparison of key system-level metrics for WISP-0
                        Area (µm²)    Performance (GHz)    Power (µW)
CMOS Baseline (16nm)    66.24         6.25                 77.90
NASICs                  2.90          4.66                 2.60
N3 ASICs-16             22            6.32                 14.36
Relative Improvement    22×, 3×       0.75×, 1×            30×, 5.42×

Power and performance comparisons are shown in Table 7.3. We notice that
the performance of N3 ASICs-16 is comparable to that of the 16nm CMOS equivalent
WISP-0. These simulations do not consider key optimizations for xnwFETs and
2C-xnwFETs, making the comparisons pessimistic. For example, while the PTM models
employ strained silicon, no straining was assumed for nanowire FETs. It is expected
that better mobility, and hence better performance, could be obtained when straining
techniques are employed in NASICs and N3 ASICs.
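The relative-improvement row of Table 7.3 follows directly from the three rows above it. The sketch below recomputes those ratios from the tabulated values; the dictionary keys and the function name are hypothetical, and area and power are inverted so that a ratio above 1 always favors the nanoscale fabric.

```python
# System-level metrics for WISP-0, transcribed from Table 7.3.
METRICS = {
    "CMOS-16nm":  {"area_um2": 66.24, "freq_ghz": 6.25, "power_uw": 77.90},
    "NASICs":     {"area_um2": 2.90,  "freq_ghz": 4.66, "power_uw": 2.60},
    "N3ASICs-16": {"area_um2": 22.0,  "freq_ghz": 6.32, "power_uw": 14.36},
}


def relative_improvement(fabric, baseline="CMOS-16nm"):
    """Ratios where a value above 1 means the fabric beats the baseline."""
    f, b = METRICS[fabric], METRICS[baseline]
    return {
        "density": b["area_um2"] / f["area_um2"],      # smaller area is better
        "performance": f["freq_ghz"] / b["freq_ghz"],  # higher frequency is better
        "power": b["power_uw"] / f["power_uw"],        # lower power is better
    }


for name in ("NASICs", "N3ASICs-16"):
    r = relative_improvement(name)
    print(name, {k: round(v, 2) for k, v in r.items()})
```

The computed ratios (about 22.8×, 0.75× and 30× for NASICs; about 3×, 1× and 5.42× for N3 ASICs-16) agree with the rounded entries in the table.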

7.5 Reducing Doping Requirements with Metal-Gated Junctionless xnwFETs

Both conventional inversion-mode CMOS devices and 2C-xnwFETs for N3 ASICs
require ultra-sharp source-channel and drain-channel junctions, with dopant concentrations
changing by several orders of magnitude within a span of 1nm-2nm. Achieving
this requires extremely precise control of spacer techniques and high-temperature
annealing processes. Design choices can be further optimized to eliminate this requirement
in N3 ASICs. In this section, we propose and describe Metal-gated Junctionless
Nanowire FETs (MJNFETs) that are fully compatible with the N3 ASICs fabric and
provide significantly reduced manufacturing complexity.


The device structure is shown in Fig. 7.6A. It consists of a uniformly doped channel
nanowire without drain or source junctions, a high-κ dielectric material, and an
orthogonal metal gate. This junctionless channel scheme considerably simplifies
manufacturing by eliminating complex fabrication steps such as ultra-low energy impurity
implantation and high thermal budget defect annihilation/dopant incorporation for
achieving extremely sharp lateral doping abruptness, both of which are increasingly
prohibitive, especially for non-planar semiconductor nanostructures. The principle of
operation is not based on inversion but on accumulation/depletion. Channel depletion is induced by work-function difference between the metal gate and the doped
channel. n+ Silicon channels can be depleted by metals with higher work-function
than the Si-channel (e.g. Nickel), whereas p+ channels are depleted by materials with
lower workfunction (e.g. Titanium). Given the nanoscale dimensions of the channel
cross-section, the channel region can be completely depleted of carriers at zero gate
voltage, leading to normally OFF devices (necessary for cascading in NASICs and
N3 ASICs). Applying a voltage bias on the metal gate eliminates the work-function
difference, turning ON the device.
MJNFET device behavior was validated through detailed 3-D Synopsys Sentaurus
process and device simulations (Fig. 7.6B, C). For an n+ device with 16nm (gate
length) × 10nm (channel width) × 10nm (channel thickness) dimensions, an HfO2 gate
dielectric with 2nm thickness, and a channel doping of 2 × 10¹⁹ dopants/cm³, a threshold
voltage of ∼0.3V is achieved. Above the threshold voltage a conducting path is
established and the device is considered ON. Accumulation increases up to the flat-band
condition, when the channel concentration reaches the initial doping concentration.
The ON-current for this device was found to be 14µA.
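To give a feel for why a 10nm channel can be depleted at this doping level, the sketch below evaluates the textbook one-sided depletion width W = sqrt(2·ε_Si·ψ / (q·N)). This is a back-of-the-envelope estimate and not a substitute for the Sentaurus simulations: it ignores the voltage drop across the gate oxide and the 3-D gate geometry, and the band bending ψ = 1.0V is an assumed illustrative value that does not come from the text. Only the doping and channel dimensions are taken from the device described above.

```python
import math

Q = 1.602e-19            # elementary charge, C
EPS0 = 8.854e-14         # vacuum permittivity, F/cm
EPS_SI = 11.7 * EPS0     # permittivity of silicon, F/cm


def depletion_width_nm(doping_cm3, band_bending_v):
    """One-sided depletion width W = sqrt(2*eps*psi / (q*N)), returned in nm."""
    w_cm = math.sqrt(2.0 * EPS_SI * band_bending_v / (Q * doping_cm3))
    return w_cm * 1e7    # cm -> nm


N_D = 2e19               # channel doping from the text, cm^-3
PSI = 1.0                # ASSUMED band bending, V (illustrative only)
CHANNEL_NM = 10.0        # channel width and thickness from the text, nm

w = depletion_width_nm(N_D, PSI)
print(f"W ~ {w:.1f} nm vs. channel dimension {CHANNEL_NM:.0f} nm")
```

Under these assumptions the estimate is of the same order as the channel cross-section, which is consistent with the qualitative argument that nanoscale channels can be fully depleted at zero gate bias. Because W scales as 1/sqrt(N), heavier doping or a wider channel makes full depletion harder.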


Figure 7.6. Metal-gated Junctionless Nanowire FET (MJNFET): A) Structure, B)
Simulated I_DS-V_GS (log) plot, C) Simulated I_DS-V_DS curves for different V_GS showing
linear and saturation regimes of operation

7.6 Assembly Sequence for N3 ASICs with MJNFETs

Fig. 7.7 shows the layer-by-layer assembly sequence for N3 ASICs with MJNFETs.
Similar to the enhancement-mode device, the unconventional patterning step is carried
out prior to all lithographic steps. However, a key distinction is the mitigation of
doping requirements. Given that the circuit style uses single-type FETs, and that
individual devices do not have complex or dissimilar doping profiles, the only doping
step required is a single initial wafer-wide doping before any patterning.
Functionalization of MJNFET crosspoints is achieved by depositing metal gates with the
appropriate workfunction to achieve channel depletion in the required channel segments
without additional alignment/processing. Neither self-aligned ion implantation nor
lateral doping abruptness along the nanowire length is needed.

7.7 Chapter Summary

A 3-D integrated nanofabric, N3 ASICs, was presented. A physical fabric vision
was developed to enable the self-assembly/unconventional manufacturing approach
and conventional photolithography to be employed in conjunction while retaining
the benefits of both approaches. To facilitate the use of photolithography, CMOS
design rules were followed at all levels. No special manufacturing constraints were
introduced. A detailed layer-by-layer assembly sequence of the fabric was presented.


Figure 7.7. Assembly Sequence for N3 ASICs fabric with MJNFETs: A) SOI wafer
with wafer-wide top Silicon doping, B) Direct patterning of nanowires, C) MJNFET
creation by gate oxide + gate metal deposition, D) Power rail and via placement, E)
Metal1 for gate inputs and control signals, F) M2 for routing.


Fabric evaluations were carried out at device, circuit and system levels. A nanoprocessor
implemented using the proposed N3 ASIC fabric was shown to be 3X denser
than an equivalent CMOS design and 5X more power efficient at comparable performance.
Systematic yield implications due to mask overlay misalignment were analyzed. Results
show that a yield of 100% was obtained with an overlay misalignment of 3σ =
±8nm (manufacturing solutions known and optimized). A yield of 83% was obtained
even for a pessimistic overlay misalignment of 3σ = ±16nm.
Junctionless xnwFETs with metal gates were discussed to further reduce manufacturing
requirements by eliminating complex doping profiles and high thermal
budgets. Sentaurus simulations show these devices to have the requisite I-V
characteristics to be made functional in NASIC and N3 ASICs circuits. An assembly
sequence for N3 ASICs was developed with these MJNFETs, where the only doping
requirement is an initial wafer-wide doping step of the top silicon.


CHAPTER 8
EXPERIMENTAL PROTOTYPE DEVELOPMENT

A comprehensive theoretical framework for nanowire fabrics spanning device
characteristics, circuit behavior, architecture, fault-tolerance and assembly sequences was
explored. Through careful design choices at multiple levels, manufacturability
requirements were mitigated. Building on these fabric-centric explorations, a new research
effort was undertaken with the goal of experimentally validating core fabric
concepts and demonstrating MJNFET devices and an N3 ASICs prototype at sub-35nm
dimensions in cleanroom settings.

8.1 Fabrication - Preliminaries

The starting material for prototyping is a Separation by Implantation of Oxygen (SIMOX)
Silicon-on-Insulator (SOI) wafer. The SOI has a 100nm top Silicon layer and a 378nm buried
oxide layer. The initial doping is p-type, 10¹⁵ dopants/cm³. A wafer-wide ion
implantation step is used to increase the doping to achieve conducting channels with
sufficient on-currents. For the purpose of prototyping, all patterning steps are done
with Electron-Beam Lithography (EBL), which can achieve the requisite nanowire
channel and gate dimensions. EBL steps can be replaced by unconventional patterning
or photolithography steps to achieve scalable manufacturing of the fabric with the
assembly sequences shown in previous chapters. Standard processing steps such as
Evaporation, Reactive Ion Etch (RIE), Wet Chemical Etches, Atomic Layer Deposition
(ALD), Sputtering etc. are used. Where appropriate, process simulations are
used to determine critical process parameters for experiments.

The key milestones for this effort are:
• Demonstrate successful ion implantation of top SOI substrate
• Develop end-to-end process flow for the N3 ASIC fabric and optimize individual
process steps
• Demonstrate individual conducting nanowires after EBL patterning and RIE
pattern transfer
• Show MJNFET devices that are normally OFF (fully depleted at zero gate bias
- required for cascading) with appropriate choice of gate material, gate oxide
and device dimensions
• Demonstrate small-scale N3 ASICs tile

8.2 Ion Implantation of SOI Wafers

Ion implantation is required to achieve a sufficiently high doping concentration that
can ensure high on-currents as well as high-conductivity drain and source regions in
junctionless xnwFET devices. This is a two-step process: the first step is the dopant
implant, which is followed by thermal annealing to diffuse and activate the dopants
in the lattice.
A combination of two simulation tools (SRIM and Sentaurus Process) is used to
simulate process characteristics and extract process parameters. SRIM (Stopping
and Range of Ions in Matter) [63] simulations are used to extract ion implantation
parameters such as acceleration voltage and implant dosage. Sentaurus Process [4] is
used to determine annealing temperature and annealing time.
SRIM simulations are carried out for an SOI wafer with a 100nm thick top device
layer (Si), a 378nm middle buried oxide (SiO2) layer and a 500µm bottom handle
layer (Si). The acceleration voltage (28 keV) used in SRIM simulations is obtained
from the stopping range table for Boron dopants and a silicon substrate. The ion implantation
process is modeled using the Monte Carlo (TRIM) simulation model. Fig. 8.1A shows the ion
(B+) distribution plot obtained. Ion implantation parameters (acceleration voltage 28
keV, implant dosage 10¹⁴ atoms/cm²) obtained from SRIM are used in Sentaurus Process
[4] simulations to implant the SOI substrate. Diffusion and activation processes
are modeled using the Charged Cluster model. Simulations show that annealing
ion-implanted substrates at 1000°C for 60 minutes in an N2 ambient diffuses and activates
the dopants. Fig. 8.1B shows the process simulation with uniform dopant distribution in the
top silicon layer after annealing.
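The implant dose and the top silicon thickness together set the average doping after annealing. The sketch below works through this arithmetic under a simplifying assumption that is not stated in the text: the entire implanted dose is retained in the 100nm top silicon layer and redistributes uniformly. The function name is hypothetical.

```python
DOSE_CM2 = 1e14      # implant dose from the text, atoms/cm^2
TOP_SI_NM = 100.0    # top silicon thickness of the SOI wafer, nm


def mean_concentration_cm3(dose_cm2, thickness_nm):
    """Average dopant concentration if the full dose is spread uniformly
    through a layer of the given thickness."""
    thickness_cm = thickness_nm * 1e-7   # nm -> cm
    return dose_cm2 / thickness_cm


n_avg = mean_concentration_cm3(DOSE_CM2, TOP_SI_NM)
print(f"{n_avg:.1e} dopants/cm^3")   # 1.0e+19 dopants/cm^3
```

The result, 10¹⁹ dopants/cm³, matches the targeted concentration quoted with the experimental results in Section 8.4. Any dose lost to the buried oxide or left electrically inactive would lower the measured value.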

Figure 8.1. Simulations for Ion Implantation: A) SRIM simulation plot showing ion
distribution in SOI wafer for 28keV implant, B) Sentaurus Process simulation plot
showing ion distribution in SOI wafer before and after thermal annealing at 1000°C.

Process simulations were also used to construct the targeted junctionless xnwFET
structure. Combined with device-level simulations of charge transport, this approach
helps identify several key parameters including gate oxide thickness, impact of different gate oxide materials, metal gate workfunction to achieve normally OFF devices,
impact of channel/gate geometry on device characteristics etc.


8.3 Experimental Process Flow

An end-to-end process flow for a small-scale fabric prototype was developed and
individual steps optimized. This pathway is based on direct patterning of silicon
nanowires from Silicon-on-Insulator (SOI) substrates with thin top silicon layers using
Electron-Beam Lithography (EBL). As previously mentioned, a key feature of the
fabric is that given an initial SOI wafer with the correct doping concentration, no
additional doping steps are necessary for realizing individual devices and functional
blocks. A scalable pathway for integrated systems can be envisioned along the same
lines as this prototyping approach, but using parallel processes for assembly and
functionalization.
The prototyping approach is shown schematically in Fig. 8.2. The starting material
is an SOI wafer where the top device layer is uniformly doped with p+ dopants.
The ion implantation and annealing steps for uniform doping of the Si device layer were
carried out using simulated process parameters (Acceleration voltage: 28keV, Area
dosage: 10¹⁴ dopants/cm², Implant tilt: 7 degrees, Annealing Temperature: 1000°C,
Annealing Duration: 60min, Annealing Ambient: N2). The substrate was thinned
down to 15nm with anisotropic RIE using an SF6 + CHF3 etch recipe (Fig. 8.2B). Using
EBL and PMMA resist, sub-30nm features are patterned and a Nickel evaporation
and liftoff step is used to define Ni features on top of the substrate (Fig. 8.2C). The Ni
features act as an etch mask for defining nanowires on the SOI. Anisotropic RIE using
an SF6 + CHF3 mixture is used to etch the surrounding Si, followed by Piranha (3:1
H2SO4:H2O2) treatment to remove the Ni etch mask. This leaves Silicon nanowires
directly patterned on the SOI substrate (Fig. 8.2D and E). Nanowires at widths as small
as 30nm, 20nm and 15nm have been successfully demonstrated using this approach.
Smaller dimensions imply better depletion, leading to normally off devices with higher
on/off current ratios. Atomic layer deposition is used for Hafnium oxide
(HfO2) deposition (Fig. 8.2F), followed by alignment, patterning, evaporation and
liftoff to define metal gate nanowires (Fig. 8.2G). Additional details are presented
below.

Figure 8.2. End-to-end prototyping process flow for N3 ASICs fabric

8.3.1 Electron Beam Lithography

EBL is used for all patterning steps including defining contacts and alignment
markers, patterning nanowires and orthogonal gates. For all steps a positive resist
process is used with Poly-Methyl-MethAcrylate (PMMA), with a commercially available formulation in Anisole designated A2. The resist has excellent adhesion to Silicon
and fairly low thicknesses (less than 60nm) are achievable for small feature sizes. Exposure to an electron-beam causes breakdown of polymer chains in PMMA, which
can be dissolved in a ketone developer solution (Methyl Iso-Butyl Ketone, MIBK).
Alignment routines available as part of the patterning system are used for locating
previously defined features (e.g. in the creation of metal gates over previously defined
channels).

8.3.2 Reactive Ion Etch

RIE is used in two steps of the process flow: i) to thin down the top silicon
layer from 100nm to ∼20nm and ii) to transfer EBL-defined nanowire patterns to the
substrate to achieve Silicon nanowires. The recipe used to etch Silicon is adapted
from [6]. A combination of SF6 and CHF3 gases is used. SF6 achieves the actual
etching of Silicon; however, the process is isotropic. To improve the anisotropy, CHF3
is used. Radicals from this gas passivate any exposed Silicon sidewalls,
so that etching proceeds entirely top-down from exposed Silicon surfaces.
This ensures smooth thinning of the Silicon substrate in Fig. 8.2B as well as successful
pattern transfer in Fig. 8.2D. Nickel is used as a metal etch mask since it is completely
unreactive to this gas mixture, and can be easily removed using a piranha wet-etch
process that does not affect the substrate, channel or contacts/markers.

8.3.3 Oxide Deposition

Silicon dioxide, Aluminum oxide and Hafnium dioxide were considered as possible
gate oxide materials. Silicon oxide was deposited using a standard PECVD
process with Silane gas and Oxygen, Aluminum oxide was sputtered, and Hafnium
oxide was deposited using ALD. The former two approaches were found to be
unsuitable for MJNFETs: dielectric constants were lower than that of HfO2 to begin with, and
oxide thicknesses could not be controlled to atomic precision. Characterization of
FET structures showed poor gate control, with dielectric breakdown occurring well
before full channel depletion. The ALD HfO2 process at 150°C was optimized to achieve
thicknesses between 1nm and 2nm. Characterization of oxide thickness was done using
ellipsometry.
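The effect of dielectric constant on gate control can be illustrated with a parallel-plate estimate of gate capacitance per unit area and the corresponding equivalent oxide thickness (EOT). The dielectric constants below are typical textbook bulk values and are assumptions, not measurements from this work; thin deposited films, ALD HfO2 in particular, can show noticeably lower values. The function names are hypothetical.

```python
EPS0_F_PER_M = 8.854e-12   # vacuum permittivity, F/m

# ASSUMED typical bulk dielectric constants (textbook values, not from the source).
KAPPA = {"SiO2": 3.9, "Al2O3": 9.0, "HfO2": 25.0}


def cap_per_area_uf_cm2(kappa, thickness_nm):
    """Parallel-plate capacitance per unit area, in uF/cm^2."""
    c_f_per_m2 = kappa * EPS0_F_PER_M / (thickness_nm * 1e-9)
    return c_f_per_m2 * 100.0    # 1 F/m^2 = 100 uF/cm^2


def equivalent_oxide_thickness_nm(kappa, thickness_nm):
    """SiO2 thickness that would give the same capacitance."""
    return thickness_nm * KAPPA["SiO2"] / kappa


T_NM = 2.0   # upper end of the optimized ALD thickness range in the text
for name, k in KAPPA.items():
    c = cap_per_area_uf_cm2(k, T_NM)
    eot = equivalent_oxide_thickness_nm(k, T_NM)
    print(f"{name}: C = {c:.2f} uF/cm^2, EOT = {eot:.2f} nm")
```

For the same physical thickness, the higher-κ film gives proportionally more gate capacitance, which is why HfO2 offers stronger channel control at a thickness that still limits leakage and breakdown.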

8.3.4 I-V Measurements

I-V measurements are done at various stages of the process flow. 4-pt probe
measurements of the substrate are used for determining whether it has been successfully
doped and the dopants activated. 2-pt probe measurements are done after nanowire
patterning to determine whether patterned nanowires conduct. 3-pt FET characterization
is done after creating MJNFET structures to determine I_D-V_GS and I_D-V_DS
characteristics. A Keithley 4200 Semiconductor Parametric Analyzer was used for
these experiments.

8.4 Experimental Results

The aforementioned process steps and process simulations were used in fabricating
xnwFET structures and a logic stage of the nanowire fabric. Extensive metrology was
done after each process step to verify expected results. Four point probe measurements
were carried out to determine the doping concentration in the Silicon substrate after
ion implantation. This was found to be ∼8 × 10¹⁸ dopants/cm³, which is close
to the targeted concentration from simulations (10¹⁹ dopants/cm³).
Atomic Force Microscopy (AFM) measurements were done to determine surface
roughness and Silicon thickness after RIE substrate thinning and pattern transfer
steps. As shown in Fig. 8.3A, a thinned Silicon substrate has less than 1nm
root-mean-square (RMS) variation in surface roughness after anisotropic etching of the top SOI layer
from 100nm to 15nm. Fig. 8.3B shows an AFM image of a 15nm thick patterned
Silicon nanowire on top of the SiO2 buried oxide.
I-V measurements were carried out on individual junctionless xnwFETs to characterize
electrical properties. In order to determine on-current and contact resistivity
in junctionless xnwFETs, two point probe I-V measurements were done on nanowire
channels, which were patterned in between source and drain contacts. Excellent
Ohmic behavior was achieved through these nanowires (contact metal stack: 5nm Ti
+ 30nm Au) since the substrate from which they are patterned was heavily doped.
Ellipsometry measurements were done to determine HfO2 thickness after atomic
layer deposition at 150°C. We were able to deposit and measure HfO2 films down to
1nm, and the thickness was found to be uniform across the die.
Figure 8.3. AFM Images post-RIE A) Successful thinning of top Silicon to ∼15nm
with less than 1nm RMS deviation in surface roughness B) Successful pattern transfer
to Si followed by Nickel removal, showing anisotropic profile and smooth top surface.

Figure 8.4. Experimental MJNFET Device Characterization: A) Fabricated Device
Structure and B) I_DS-V_GS characteristics for normally off MJNFETs.

Three point probe measurements were done on junctionless xnwFETs. Dimensions
for fabricated devices were a 30nm wide and 15nm thick nanowire channel, a 1.2nm
thick HfO2 gate dielectric, a 200nm long gate and a 50nm thick gate metal stack. A
stack of a 35nm Titanium layer and a 15nm thick Gold layer served as the gate metal stack;
Titanium provides the necessary work-function difference for depleting the p+ doped
Silicon channel, and Gold is used for reducing the series resistance of the gate. Fig. 8.4
shows I_DS-V_GS characteristics of p-type junctionless xnwFETs when a metal gate
stack was put on top of the silicon nanowire channel. The I_DS-V_GS characteristics in
Fig. 8.4 show the expected junctionless device behavior, where the workfunction
difference between the Titanium/Au gate and the p+ doped Silicon nanowire channel
depletes the channel and the device is normally OFF at V_GS = 0V. As negative
gate voltages (V_GS < 0) are applied, carriers are accumulated and the channel
conducts. These devices have an I_ON/I_OFF > 10³ and a threshold voltage of ∼−0.3V.
These characteristics imply that MJNFET devices can be made functional in NASIC
and N3 ASICs circuits, with sufficient noise margins and cascading capability.
We have also demonstrated a single logic stage of the nanowire fabric. As shown in
Fig. 8.5, a nanowire grid with functional cross-points was fabricated using the process
flow described before. The bottom (horizontal) Si nanowires in the grid were 30nm
wide, 15nm thick and 100nm apart from each other; the top metal nanowires (vertical)
were 30nm wide, 50nm thick and 200nm apart; vias were placed at the output of each
horizontal nanowire. While demonstration of a fully functional N3 ASICs fabric will
require further effort, this work shows the feasibility of the approach and validates the
process flow.

8.5 Chapter Summary

A prototyping process flow for demonstration of N3 ASICs was presented. This
process flow uses EBL steps for patterning in conjunction with standard semiconductor
processing steps including ion implant, RIE, evaporation etc. No special
manufacturing requirements exist. The experimental approach was supported by process
simulations to determine key parameters for fabrication (e.g. ion implant dosage,
annealing time/temperature etc). Key milestones such as successful ion implantation,
optimization of individual process steps, successful nanowire pattern transfer,
and demonstration of requisite MJNFET behavior with normally OFF p-type devices
(V_TH ∼ −0.3V) and three orders of magnitude ON/OFF current ratios were achieved.
An N3 ASICs tile was also demonstrated. Further optimization of process and devices
will enable a fully functional prototype.

Figure 8.5. Fabricated N3 ASICs Tile


CHAPTER 9
CONCLUSIONS

A fabric-centric approach towards building integrated nanosystems was presented.
Through careful design choices across device, circuit and architecture levels,
manufacturing requirements are reduced: regular arrays with limited customization imply
mitigated overlay precision requirements, novel circuit styles eliminate the need for
arbitrary fine-grain sizing and complementary doping, simple device structures are
used and device optimizations are done in a fabric-friendly manner. The fabric is
validated through an integrated bottom-up methodology with careful consideration
of physical layer assumptions and their implications for noise and parameter variation
at circuit and system levels. It is shown to have a 22× density benefit and a 30× power
benefit vs. CMOS, with improved overlay imprecision tolerance (3σ = ±5.7nm).
A new 3D integrated fabric, N³ASICs, was proposed that combines unconventional
manufacturing with lithography and design rules for reduced manufacturing
requirements vs. scaled CMOS. This fabric achieves 3× area and 5× power benefits at comparable
performance vs. 16nm CMOS for a processor design. Furthermore, these benefits may
be achieved with an overlay imprecision of ±8nm, for which manufacturing solutions are
known today (vs. ±3nm for 16nm CMOS, for which manufacturing solutions are unknown).
Experimental efforts towards building an N³ASICs prototype were discussed. An
end-to-end process flow was developed and individual steps optimized. Successful
doping of SOI substrates, pattern transfer to create nanowires at dimensions between
15nm and 30nm, and metal-gated junctionless nanowire FET structures were
demonstrated. I-V characterization of MJNFET devices shows normally OFF behavior
(through the gate-channel workfunction difference) and three orders of magnitude
on/off current ratios, implying that these devices meet circuit requirements for
cascading and noise, as per circuit evaluations. N³ASICs tiles with MJNFETs at the
crosspoints were also demonstrated. Thus, through a combination of fabric design,
theoretical exploration and cleanroom fabrication, new nano-fabrics were developed
and shown to achieve the concurrent objectives of improved system-level benefits and
improved manufacturability.


APPENDIX
LIST OF PUBLICATIONS

C. A. Moritz, P. Narayanan and C. O. Chui, “Nanoscale Application Specific
Integrated Circuits,” Book chapter for Nanoelectronic Circuit Design (Eds. Niraj Jha
and Deming Chen), Springer 2011.
P. Narayanan, J. Kina, P. Panchapakeshan, C. O. Chui and C. A. Moritz, “Integrated Device-Fabric Explorations and Noise Mitigation in Nanoscale Fabrics,” IEEE
Transactions on Nanotechnology, vol. 11, no. 4, pp. 687-700, 2012.
P. Narayanan, M. Leuchtenburg, J. Kina, P. Joshi, P. Panchapakeshan, C. O.
Chui and C. A. Moritz, “Variability in Nanoscale Fabrics: Bottom-up Integrated
Analysis and Mitigation,” Accepted for publication by ACM Journal on Emerging
Technologies in Computing Systems, 2013.
M. M. U. Khan, P. Narayanan, P. Joshi, P. Panchapakeshan, and C. A. Moritz,
“FastTrack: Toward Nanoscale Fault Masking With High Performance,” IEEE Transactions on Nanotechnology, vol. 11, no. 4, pp. 720-730, 2012.
C. O. Chui, K.-S. Shin, J. Kina, K.-H. Shih, P. Narayanan, and C. A. Moritz,
“Heterogeneous Integration of Epitaxial Nanostructures: Strategies and Application
Drivers,” Invited paper in Proc. SPIE 8467, Nanoepitaxy: Materials and Devices IV,
vol. 8467, pp. R1 - R15, 2012.
J. Zhang, P. Narayanan, S. Khasanvis, J. Kina, C. O. Chui, and C. A. Moritz,
“On-Chip Variation Sensor for Systematic Variation Estimation in Nanoscale Fabrics,”
Proceedings of 12th IEEE Conference on Nanotechnology (IEEE-NANO), 2012.
Y. Guo, P. Narayanan, M. A. Bennaser, S. Chheda and C. A. Moritz, “Energy-Efficient
Hardware Data Prefetching,” IEEE Trans. on VLSI, vol. 19, no. 2, pp.
250-263, 2011.
S. Khasanvis, K. M. M. Habib, M. Rahman, P. Narayanan, R. K. Lake and C. A.
Moritz, “Ternary Volatile Random Access Memory based on Heterogeneous Graphene-CMOS
Fabric,” to appear in Proceedings of IEEE/ACM International Symposium on
Nanoscale Architectures (NanoArch), 2012.
P. Narayanan, et al., “Nanoscale Application Specific Integrated Circuits,” in
Proc. of IEEE/ACM Intl. Symposium on Nanoscale Architectures (NanoArch’11),
pp. 99-106, 2011.
P. Vijayakumar, P. Narayanan, et al., “Impact of Nanomanufacturing Flow on
Systematic Yield Loss in Nanoscale Fabrics,” in Proc. of IEEE/ACM Intl. Symposium on Nanoscale Architectures (NanoArch’11), pp. 181-188, 2011.
P. Panchapakeshan, P. Narayanan, and C. A. Moritz, “N³ASICs: Designing
Nanofabrics with Fine-grained CMOS Integration,” in Proc. of IEEE/ACM Intl.
Symposium on Nanoscale Architectures (NanoArch’11), pp. 196-202, 2011.
P. Narayanan, P. Panchapakeshan, J. Kina, C. O. Chui and C. A. Moritz, “Integrated Nanosystems with Junctionless Crossed Nanowire Transistors,” IEEE International Conference on Nanotechnology (IEEE NANO 2011), pp. 845-848, 2011.
P. Panchapakeshan, P. Vijayakumar, P. Narayanan, C. O. Chui and C. A.
Moritz, “3-D Integration Requirements for Hybrid Nanoscale-CMOS Fabrics,” IEEE
International Conference on Nanotechnology (IEEE NANO 2011), pp. 849-853, 2011.
M. Rahman, P. Narayanan, and C. A. Moritz, “N³ASIC-based Volatile Nanowire
RAM,” in Proc. IEEE Conference on Nanotechnology (NANO’09), pp. 1097-1101,
2011.
P. Narayanan, M. Leuchtenburg, J. Kina, P. Joshi, P. Panchapakeshan, C. O.
Chui and C. A. Moritz, “Variability in Nanoscale Fabrics: Bottom-up Integrated
Analysis,” in Proc. of IEEE Intl. Symposium on Defect and Fault Tolerance in VLSI
Systems (DFTS’10), Best Student Paper Award, pp. 24-31, 2010.
P. Narayanan, T. Wang, and C. A. Moritz, “Programmable Cellular Architectures at the Nanoscale,” Elsevier Nano Communications Networks (NANOCOMNET), vol. 1, no. 2, pp. 77-85, 2010.
P. Vijayakumar, P. Narayanan, I. Koren, C. M. Krishna and C. A. Moritz,
“Incorporating Heterogeneous Redundancy in a Nanoprocessor for Improved Yield
and Performance,” in Proc. of IEEE Intl. Symposium on Defect and Fault Tolerance
in VLSI Systems (DFTS’10), pp. 273-279, 2010.
P. Shabadi, A. Khitun, P. Narayanan, M. Bao, I. Koren, K. Wang and C. A.
Moritz, “Towards Logic Functions as the Device,” in Proc. IEEE/ACM Symposium
on Nanoscale Architectures (NanoArch’10), pp. 11-16, 2010.
P. Narayanan, K. W. Park, C. O. Chui and C. A. Moritz, “Validating Cascading
of Crossbar Circuits through an Integrated Device-Circuit Exploration,” in Proc. of
IEEE/ACM Intl. Symposium on Nanoscale Architectures (NanoArch’09), pp. 37-42,
2009.
P. Narayanan, K. W. Park, C. O. Chui, and C. A. Moritz, “Manufacturing
Pathway and Associated Challenges for Nanoscale Computational Systems,” in Proc.
IEEE Conference on Nanotechnology (NANO’09), pp. 119-122, 2009.
T. Wang, P. Narayanan, and C. A. Moritz, “Heterogeneous 2-level Logic and
its Density and Fault Tolerance Implications in Nanoscale Fabrics,” IEEE Trans. on
Nanotechnology, vol. 8, no. 1, pp. 22-30, January 2009.
M. Leuchtenburg, P. Narayanan, T. Wang and C. A. Moritz, “Impact of process variation on NASIC nanoprocessors with 2-way redundancy,” in Proc. of IEEE
Conference on Nanotechnology (NANO’09), pp. 737-739, 2009.
P. Narayanan, T. Wang, M. Leuchtenburg, and C. A. Moritz, “CMOS Control
Enabled Single-Type FET NASIC,” in Proc. of IEEE Symposium on VLSI, Best
Paper Award, pp. 191-196, 2008.
P. Narayanan, T. Wang, M. Leuchtenburg and C. A. Moritz, “Image Processing
Architecture for Semiconductor Nanowire Based Fabrics,” Invited paper in Proc. of
IEEE Conference on Nanotechnology (NANO’08), pp. 677-680, 2008.
P. Narayanan, T. Wang, M. Leuchtenburg and C. A. Moritz, “Comparison of
analog and digital nanosystems: Issues for the nano-architect,” in Proc. of IEEE Intl.
Nanoelectronics Conference (INEC’08), pp. 1003-1008, 2008.
T. Wang, P. Narayanan, M. Leuchtenburg and C. A. Moritz, “NASICs: A
nanoscale fabric for nanoscale microprocessors,” in Proc. of IEEE Intl. Nanoelectronics Conference (INEC’08), pp. 989-994, 2008.
C. A. Moritz, T. Wang, P. Narayanan, M. Leuchtenburg, Y. Guo, C. Dezan,
and M. Bennaser, “Fault-Tolerant Nanoscale Processors on Semiconductor Nanowire
Grids,” IEEE Trans. on Circuits and Systems I, special issue on Nanoelectronic
Circuits and Nanoarchitectures, vol. 54, no. 11, pp. 2422-2437, 2007.
T. Wang, P. Narayanan, and C. A. Moritz, “Combining 2-level Logic Families
in Grid-based Nanoscale Fabrics,” in Proc. of IEEE/ACM Symposium on Nanoscale
Architectures (NanoArch’07), pp. 101-108, 2007.
C. Dezan, L. Lagadec, M. Leuchtenburg, T. Wang, P. Narayanan, C. A. Moritz,
“Building CAD Prototyping Tool for Emerging Nanoscale Fabrics,” in Proc. of European Nano Systems Conference, pp. 25-30, 2007.


BIBLIOGRAPHY

[1] The R project for statistical computing. http://r-project.org.
[2] HSPICE, software.
http://www.synopsys.com/Tools/Verification/
AMSVerification/CircuitSimulation/HSPICE/Pages/default.aspx, 2009.
Synopsys, Inc.
[3] Sentaurus device, software.
http://www.synopsys.com/tools/tcad/
devicesimulation/pages/sentaurusdevice.aspx, 2009. Synopsys, Inc.
[4] Sentaurus process, software.
http://www.synopsys.com/tools/tcad/
processsimulation/pages/sentaurusprocess.aspx, 2009. Synopsys, Inc.
[5] Bencher, Christopher, Dai, Huixiong, and Chen, Yongmei. Gridded design rule
scaling: taking the CPU toward the 16nm node. In Proceedings of SPIE (San
Jose, CA, USA, 2009), pp. 72740G–72740G–10.
[6] Chang, Y.-F., Chou, Q.-R., Lin, J.-Y., and Lee, C.-H. Fabrication of highaspect-ratio silicon nanopillar arrays with the conventional reactive ion etching
technique. App. Phys. A: Materials Science & Processing 86 (2007), 193–196.
[7] Chen, Yong, Ohlberg, Douglas A. A., Li, Xuema, Stewart, Duncan R., Williams,
R. Stanley, Jeppesen, Jan O., Nielsen, Kent A., Stoddart, J. Fraser, Olynick,
Deirdre L., and Anderson, Erik. Nanoscale molecular-switch devices fabricated
by imprint lithography. Applied Physics Letters 82, 10 (2003), 1610.
[8] Chen, Zhihong, Appenzeller, Joerg, Lin, Yu-Ming, Sippel-Oakley, Jennifer, Rinzler, Andrew G., Tang, Jinyao, Wind, Shalom J., Solomon, Paul M., and Avouris,
Phaedon. An integrated logic circuit assembled on a single carbon nanotube. Science 311, 5768 (Mar. 2006), 1735.
[9] Chou, Stephen Y, Krauss, Peter R, and Renstrom, Preston J. Nanoimprint
lithography. Journal of Vacuum Science & Technology B: Microelectronics and
Nanometer Structures 14, 6 (Nov. 1996), 4129–4133.
[10] Colinge, Jean-Pierre, Lee, Chi-Woo, Afzalian, Aryan, Akhavan, Nima Dehdashti,
Yan, Ran, Ferain, Isabelle, Razavi, Pedram, O’Neill, Brendan, Blake, Alan,
White, Mary, Kelleher, Anne-Marie, McCarthy, Brendan, and Murphy, Richard.
Nanowire transistors without junctions. Nat Nano 5, 3 (Mar. 2010), 225–229.

107

[11] Cui, Yi, Duan, Xiangfeng, Hu, Jiangtao, and Lieber, Charles M. Doping and
electrical transport in silicon nanowires. The Journal of Physical Chemistry B
104, 22 (June 2000), 5213–5216.
[12] Cui, Yi, Lauhon, Lincoln J., Gudiksen, Mark S., Wang, Jianfang, and Lieber,
Charles M. Diameter-controlled synthesis of single-crystal silicon nanowires. Applied Physics Letters 78, 15 (2001), 2214.
[13] Duan, Xiangfeng, Huang, Yu, Cui, Yi, Wang, Jianfang, and Lieber, Charles M.
Indium phosphide nanowires as building blocks for nanoscale electronic and optoelectronic devices. Nature 409, 6816 (Jan. 2001), 66–69.
[14] Englander, Ongi, Christensen, Dane, Kim, Jongbaeg, Lin, Liwei, and Morris,
Stephen J. S. Electric-Field assisted growth and Self-Assembly of intrinsic silicon
nanowires. Nano Letters 5, 4 (Apr. 2005), 705–708.
[15] Galatsis, K., Khitun, A., Ostroumov, R., Wang, K.L., Dichtel, W.R., Plummer,
E., Stoddart, J.F., Zink, J.I., Lee, Jae Young, Xie, Ya-Hong, and Kim, Ki Wook.
Alternate state variables for emerging nanoelectronic devices. Nanotechnology,
IEEE Transactions on 8, 1 (jan. 2009), 66 –75.
[16] Gates, Byron D., Xu, Qiaobing, Love, J. Christopher, Wolfe, Daniel B., and
Whitesides, George M. UNCONVENTIONAL NANOFABRICATION. Annual
Review of Materials Research 34, 1 (2004), 339–372.
[17] Greytak, Andrew B., Lauhon, Lincoln J., Gudiksen, Mark S., and Lieber,
Charles M. Growth and transport properties of complementary germanium
nanowire field-effect transistors. Applied Physics Letters 84, 21 (2004), 4176.
[18] Heath, James R. Superlattice nanowire pattern transfer (SNAP). Accounts of
Chemical Research 41, 12 (Dec. 2008), 1609–1617.
[19] Heo, Kwang, Cho, Eunhee, Yang, Jee-Eun, Kim, Myoung-Ha, Lee, Minbaek,
Lee, Byung Yang, Kwon, Soon Gu, Lee, Moon-Sook, Jo, Moon-Ho, Choi, HeonJin, Hyeon, Taeghwan, and Hong, Seunghun. Large-Scale assembly of silicon
nanowire Network-Based devices using conventional microfabrication facilities.
Nano Letters 8, 12 (Dec. 2008), 4523–4527.
[20] Huang, Yu, Duan, Xiangfeng, Cui, Yi, Lauhon, Lincoln J., Kim, Kyoung-Ha,
and Lieber, Charles M. Logic gates and computation from assembled nanowire
building blocks. Science 294, 5545 (Nov. 2001), 1313 –1317.
[21] Huang, Yu, Duan, Xiangfeng, Wei, Qingqiao, and Lieber, Charles M. Directed
assembly of One-Dimensional nanostructures into functional networks. Science
291, 5504 (Jan. 2001), 630 –633.
[22] Huard, B., Sulpizio, J. A., Stander, N., Todd, K., Yang, B., and GoldhaberGordon, D. Transport measurements across a tunable potential barrier in
graphene. Physical Review Letters 98, 23 (June 2007), 236803.
108

[23] Iijima, Sumio. Helical microtubules of graphitic carbon. Nature 354 (Nov. 1991),
56–58.
[24] Iijima, Sumio, and Ichihashi, Toshinari. Single-shell carbon nanotubes of 1-nm
diameter. Nature 363, 6430 (June 1993), 603–605.
[25] ITRS. International technology roadmap for semiconductors - table lith5b.
[26] ITRS. International technology roadmap for semiconductors (ITRS) - Table
LITH2. 2011.
[27] Jordan, Brian J, Ofir, Yuval, Patra, Debabrata, Caldwell, Stuart T, Kennedy,
Andrew, Joubanian, Steven, Rabani, Gouher, Cooke, Graeme, and Rotello, Vincent M. Controlled self-assembly of organic nanowires and platelets using dipolar
and hydrogen-bonding interactions. Small (Weinheim an Der Bergstrasse, Germany) 4, 11 (Nov. 2008), 2074–2078. PMID: 18855971.
[28] Khan, M. Ibrahim, Wang, Xu, Bozhilov, Krassimir N., and Ozkan, Cengiz S.
Templated fabrication of InSb nanowires for nanoelectronics. Journal of Nanomaterials 2008 (2008), 1–5.
[29] Li, Y., Meng, G. W., Zhang, L. D., and Phillipp, F. Ordered semiconductor
ZnO nanowire arrays and their photoluminescence properties. Applied Physics
Letters 76, 15 (2000), 2011.
[30] Liu, Yi-Tao, Xie, Xu-Ming, Gao, Yan-Fang, Feng, Qing-Ping, Guo, Lin-Rui,
Wang, Xiao-Hao, and Ye, Xiong-Ying. Gas flow directed assembly of carbon
nanotubes into horizontal arrays. Materials Letters 61, 2 (Jan. 2007), 334–338.
[31] Lu, Wei, and Lieber, Charles M. Semiconductor nanowires. Journal of Physics
D: Applied Physics 39, 21 (2006), R387–R406.
[32] McNeill, D. W., Bhattacharya, S., Wadsworth, H., Ruddell, F. H., Mitchell, S.
J. N., Armstrong, B. M., and Gamble, H. S. Atomic layer deposition of hafnium
oxide dielectrics on silicon and germanium substrates. Journal of Materials Science: Materials in Electronics 19, 2 (2007), 119–123.
[33] Mehrotra, Saumitra R, and Roenker, K.P. Process variation study for silicon
nanowire transistors. Microelectronics and Electron Devices, 2007. WMED 2007.
IEEE Workshop on (2007), 40–41.
[34] Melosh, Nicholas A., Boukai, Akram, Diana, Frederic, Gerardot, Brian,
Badolato, Antonio, Petroff, Pierre M., and Heath, James R. Ultrahigh-Density
nanowire lattices and circuits. Science 300, 5616 (2003), 112 –115.
[35] Moritz, C.A., Wang, Teng, Narayanan, P., Leuchtenburg, M., Guo, Yao, Dezan,
C., and Bennaser, M. Fault-tolerant nanoscale processors on semiconductor
nanowire grids. Circuits and Systems I: Regular Papers, IEEE Transactions on
54, 11 (nov. 2007), 2422 –2437.
109

[36] Moritz, Csaba Andras, Narayanan, Pritish, and Chui, Chi On. Nanoscale
application-specific integrated circuits. In Nanoelectronic Circuit Design, Niraj K. Jha and Deming Chen, Eds. Springer New York, 2011, pp. 215–275.
[37] Moritz, Csaba Andras, and Wang, Teng. Latching on the wire and pipelining in
nanoscale designs. In IN NSC-3 (2004), pp. 1–7.
[38] Mrtensson, Thomas, Carlberg, Patrick, Borgstrm, Magnus, Montelius, Lars,
Seifert, Werner, and Samuelson, Lars. Nanowire arrays defined by nanoimprint
lithography. Nano Letters 4, 4 (Apr. 2004), 699–702.
[39] Narayanan, P., Leuchtenburg, M., Kina, J., Joshi, P., Panchapakeshan, P., Chui,
C.O., and Moritz, C.A. Parameter variability in nanoscale fabrics: Bottom-Up
integrated analysis and mitigation. To appear in ACM Journal of Emerging
Technologies in Computing Systems (2011).
[40] Narayanan, Pritish, Moritz, Csaba Andras, Park, Kyoung Won, and Chui,
Chi On. Validating cascading of crossbar circuits with an integrated devicecircuit exploration. In Nanoscale Architectures, IEEE International Symposium
on (2009), IEEE Computer Society, pp. 37–42.
[41] Ng, Hou T., Han, J., Yamada, Toshishige, Nguyen, P., Chen, Yi P., and Meyyappan, M. Single crystal nanowire vertical Surround-Gate Field-Effect transistor.
Nano Letters 4, 7 (July 2004), 1247–1252.
[42] Novoselov, K. S., Geim, A. K., Morozov, S. V., Jiang, D., Katsnelson, M. I.,
Grigorieva, I. V., Dubonos, S. V., and Firsov, A. A. Two-dimensional gas of
massless dirac fermions in graphene. Nature 438, 7065 (Nov. 2005), 197–200.
[43] Park, Won Il, Zheng, Gengfeng, Jiang, Xiaocheng, Tian, Bozhi, and Lieber,
Charles M. Controlled synthesis of Millimeter-Long silicon nanowires with uniform electronic properties. Nano letters 8, 9 (Sept. 2008), 3004–3009.
[44] Picciotto, Carl, Gao, Jun, Yu, Zhaoning, and Wu, Wei. Alignment for imprint lithography using nDSE and shallow molds. Nanotechnology 20, 25 (2009),
255304.
[45] Rabaey, Jan M., Chandrakasan, Anantha, and Nikolic, Borivoje. Digital Integrated Circuits. Prentice-Hall, New Jersey, 2011.
[46] Ritala, M., and Leskela, M. Atomic layer deposition. High-K Gate Dielecrics
(2004), 17–64.
[47] Singh, N., Agarwal, A., Bera, L.K., Liow, T.Y., Yang, R., Rustagi, S.C., Tung,
C.H., Kumar, R., Lo, G.Q., Balasubramanian, N., and Kwong, D.-L. Highperformance fully depleted silicon nanowire (diameter 5 nm) gate-all-around
CMOS devices. Electron Device Letters, IEEE 27, 5 (2006), 383–386.

110

[48] Snider, Gregory S, and Williams, R Stanley. Nano/CMOS architectures using a
field-programmable nanowire interconnect. Nanotechnology 18, 3 (2007), 035204.
[49] Strukov, Dmitri B., and Likharev, Konstantin K. Reconfigurable hybrid
CMOS/Nanodevice circuits for image processing. IEEE Transactions on Nanotechnology 6 (Nov. 2007), 696–710.
[50] Suk, Sung Dae, Lee, Sung-Young, Kim, Sung-Min, Yoon, Eun-Jung, Kim, MinSang, Li, Ming, Oh, Chang Woo, Yeo, Kyoung Hwan, Kim, Sung Hwan, Shin,
Dong-Suk, Lee, Kwan-Heum, Park, Heung Sik, Han, Jeorig Nam, Park, C.J.,
Park, Jong-Bong, Kim, Dong-Won, Park, Donggun, and Ryu, Byung-Il. High
performance 5nm radius twin silicon nanowire MOSFET (TSNWFET) : fabrication on bulk si wafer, characteristics, and reliability. In Electron Devices Meeting,
2005. IEDM Technical Digest. IEEE International (2005), pp. 717–720.
[51] Thurn-Albrecht, T., Steiner, R., DeRouchey, J., Stafford, C. M, Huang, E., Bal,
M., Tuominen, M., Hawker, C. J, and Russell, T. P. Nanoscopic templates from
oriented block copolymer films. Advanced Materials 12, 11 (June 2000), 787–791.
[52] Ural, Ant, Li, Yiming, and Dai, Hongjie. Electric-field-aligned growth of singlewalled carbon nanotubes on surfaces. Applied Physics Letters 81, 18 (2002),
3464.
[53] Vijayakumar, Priyamvada. Impact of manufacturing flow on yield losses in
nanoscale fabrics. Masters Theses, University of Massachusetts - Amherst (Feb.
2012).
[54] Wang, Dunwei, Sheriff, Bonnie, McAlpine, Michael, and Heath, James. Development of ultra-high density silicon nanowire arrays for electronics applications.
Nano Research 1, 1 (July 2008), 9–21.
[55] Wang, Dunwei, Sheriff, Bonnie A., McAlpine, Michael, and Heath, James R.
Development of ultra-high density silicon nanowire arrays for electronics applications. Nano Research 1, 1 (2008), 9–21.
[56] Wang, Teng, Ben-naser, Mahmoud, Guo, Yao, and Moritz, Csaba Andras. Wirestreaming processors on 2-D nanowire fabrics. NANOTECH 2005, NANO SCIENCE AND TECHNOLOGY INSTITUTE (2005).
[57] Wang, Teng, Narayanan, P., and Moritz, C. Andras. Heterogeneous Two-Level
logic and its density and fault tolerance implications in nanoscale fabrics. Nanotechnology, IEEE Transactions on 8, 1 (2009), 22–30.
[58] Wang, Teng, Narayanan, Pritish, and Moritz, Csaba Andras. Combining 2-level
logic families in grid-based nanoscale fabrics. In Proceedings of the 2007 IEEE
International Symposium on Nanoscale Architectures (2007), IEEE Computer
Society, pp. 101–108.

111

[59] Wong, Hon-Sum Philip, Taur, Yuan, and Frank, David J. Discrete random
dopant distribution effects in nanometer-scale MOSFETs. Microelectronics and
Reliability 38, 9 (Sept. 1998), 1447–1456.
[60] Wu, Yue, Xiang, Jie, Yang, Chen, Lu, Wei, and Lieber, Charles M. Single-crystal
metallic nanowires and metal/semiconductor nanowire heterostructures. Nature
430, 6995 (July 2004), 61–65.
[61] Xiang, Jie, Lu, Wei, Hu, Yongjie, Wu, Yue, Yan, Hao, and Lieber, Charles M.
Ge/Si nanowire heterostructures as high-performance field-effect transistors. Nature 441, 7092 (May 2006), 489–493.
[62] Xiong, Xugang, Jaberansari, Laila, Hahm, Myung Gwan, Busnaina, Ahmed,
and Jung, Yung Joon. Building highly organized Single Walled CarbonNanotube
networks using Template-Guided fluidic assembly. Small 3, 12 (Dec. 2007), 2006–
2010.
[63] Ziegler, James. Stopping range of ions in matter, software. http://www.srim.
org/, 2012.
[64] Zutic, Igor, Fabian, Jaroslav, and Sarma, S. Das. Spintronics: Fundamentals
and applications. Reviews of Modern Physics 76, 2 (2004), 323.

112

