On The Design Of Low-Complexity High-Speed Arithmetic Circuits In Quantum-Dot Cellular Automata Nanotechnology by Almatrood, Amjad
Wayne State University
Wayne State University Dissertations
1-1-2017
Part of the Electrical and Computer Engineering Commons
Recommended Citation
Almatrood, Amjad, "On The Design Of Low-Complexity High-Speed Arithmetic Circuits In Quantum-Dot Cellular Automata
Nanotechnology" (2017). Wayne State University Dissertations. 1775.
https://digitalcommons.wayne.edu/oa_dissertations/1775
ON THE DESIGN OF LOW-COMPLEXITY HIGH-SPEED ARITHMETIC
CIRCUITS IN QUANTUM-DOT CELLULAR AUTOMATA
NANOTECHNOLOGY
by
AMJAD ALMATROOD
DISSERTATION
Submitted to the Graduate School
of Wayne State University,
Detroit, Michigan
in partial fulfillment of the requirements
for the degree of
DOCTOR OF PHILOSOPHY
2017
MAJOR: ELECTRICAL ENGINEERING
© COPYRIGHT BY
AMJAD ALMATROOD
2017
All Rights Reserved
DEDICATION
To my family
ACKNOWLEDGMENTS
I would like to express my appreciation and gratitude to my advisor, Prof. Harpreet
Singh for his guidance, encouragement and support during my research work. I thank him
for motivating me and helping me to improve my research skills every step of the way. The
time and careful attention he gave to my research were indispensable.
I would also like to thank my dissertation committee: Prof. Mumtaz Usmen, Prof. Feng
Lin and Prof. Lubna Alazzawi for their support and encouragement. Their careful review
and insightful comments greatly improved the quality of my research.
In addition, I would like to thank my colleagues and friends. I am lucky to have spent
time in close proximity with such great people. Especially, I would like to thank Aby George,
Otman Ali and Ishak O K, with whom I discussed different aspects related to my work which
really helped me during the research.
I would like to give special thanks to my parents, sister and brothers for their love,
patience, encouragement, and support during the time I spent away from them as a Ph.D.
student at Wayne State University. Lastly, I would like to thank my brother Ahmad
Almatrood for assisting and supporting me during my research work.
TABLE OF CONTENTS
DEDICATION
ACKNOWLEDGMENTS
LIST OF FIGURES
LIST OF TABLES
CHAPTER 1: INTRODUCTION
1.1 Introduction
1.2 Motivation
1.3 Research Objectives
1.4 Dissertation Outline
CHAPTER 2: LITERATURE REVIEW
2.1 Introduction
2.2 Majority/Minority-Based Post-CMOS Nanotechnologies
2.2.1 Quantum-dot Cellular Automata Technology
2.2.2 Single Electron Tunneling Technology
2.2.3 Tunneling Phase Logic Technology
2.2.4 Spintronic Majority Gate
2.2.5 All Spin Logic
2.2.6 Spin Torque Oscillator Logic
2.2.7 Spin Wave Device Technology
2.2.8 Nanomagnetic Logic
2.2.9 DNA Technology
2.3 Majority/Minority Logic Circuit Synthesis Methods
2.3.1 Majority Logic Synthesis (MALS)
2.3.2 Kong’s Synthesis
2.3.3 Majority Expression Lookup Table (MLUT)-Based Synthesis
2.4 Comparison and Discussion
2.4.1 Preprocessing
2.4.2 Decomposition
2.4.3 Converting Boolean Functions into Majority Expressions
2.4.4 Optimization Targets
2.4.5 Comparison of Experimental Results
2.5 Conclusion
CHAPTER 3: DESIGNS OF VARIOUS FUNDAMENTAL ARITHMETIC CELLS IN QCA
3.1 Introduction
3.2 Fundamental Arithmetic Cells
3.2.1 Adder
3.2.2 Multiplier
3.2.3 Multiplier (High-Speed Arithmetic Array)
3.2.4 Restoring Divider
3.2.5 Non-Restoring Divider
3.2.6 Divider (High-Speed Arithmetic Array)
3.2.7 Squarer
3.2.8 Square-Rooting Cell
3.2.9 Square/Square-Rooting Cell (High-Speed Arithmetic Array)
3.3 Design and Implementation
3.3.1 Majority-Based Circuits
3.3.2 Majority/XOR-Based Circuits
3.3.3 QCA Designs
3.4 Results and Comparison
3.5 Conclusion
CHAPTER 4: DESIGN OF ARRAY MULTIPLIER IN QCA
4.1 Introduction
4.2 Design and Implementation
4.2.1 Basic Multiplier Cell
4.2.2 Array Multiplier
4.3 Simulation Results and Comparison
4.4 Conclusion
CHAPTER 5: QCA DESIGN OF NON-RESTORING BINARY ARRAY DIVIDER
5.1 Introduction
5.2 Non-Restoring Binary Array Divider
5.2.1 Complement Adder/Subtractor Cell
5.2.2 Non-Restoring Binary Array Divider
5.3 Design and Implementation
5.4 Results and Comparison
5.5 Conclusion
CHAPTER 6: SQUARING AND SQUARE-ROOTING CIRCUITS IN QCA
6.1 Introduction
6.2 Design and Implementation
6.2.1 Squaring and Square-Rooting Cells
6.2.2 Squaring and Square-Rooting Arrays
6.3 Simulation Results and Comparisons
6.4 Conclusion
CHAPTER 7: A METHODOLOGY FOR MAJORITY/MINORITY LOGIC NETWORK SYNTHESIS
7.1 Introduction
7.2 Methodology
7.2.1 Overview of the Synthesis Method
7.2.2 Constructing Standard Majority Logic Structures
7.2.3 Applying Admissible Combinations
7.2.4 Finding Equivalent Majority Functions
7.2.5 Redundancy Removal
7.2.6 Choosing the Optimal Majority Logic Network
7.3 Results and Comparison
7.4 Conclusion
CHAPTER 8: DESIGN OF GENERALIZED PIPELINE CELLULAR ARRAY IN QCA
8.1 Introduction
8.2 Generalized Pipeline Cellular Array
8.3 Design and Implementation
8.3.1 Arithmetic and Control Cells
8.3.2 Generalized Pipeline Cellular Array
8.3.3 Generalized Pipeline Cellular Array for Specified Input/Output Pins
8.4 Simulation Results and Comparison
8.5 Conclusion
CHAPTER 9: CONCLUSION
9.1 Introduction
9.2 Summary and Conclusion
9.3 Future Work
APPENDIX: PROOF OF MINIMAL MAJORITY NETWORK
PUBLICATIONS
BIBLIOGRAPHY
ABSTRACT
AUTOBIOGRAPHICAL STATEMENT
LIST OF FIGURES
Figure 2.1 The possible electron configurations of a QCA cell
Figure 2.2 QCA wire
Figure 2.3 Two different structures of QCA inverters
Figure 2.4 QCA majority gate
Figure 2.5 QCA clocks and its 4-phase with difference of 90°
Figure 2.6 Four-phase of a QCA clock
Figure 2.7 Multi-layer crossover layout [17]
Figure 2.8 (a) SET minority gate. (b) SET majority gate
Figure 2.9 TPL minority gate
Figure 2.10 (a) SMG device [19]. (b) Top view of SMG [19]
Figure 2.11 ASL devices: (a) ASL majority gate [23]. (b) ASL inverter [23]
Figure 2.12 STO logic majority gate [12]
Figure 2.13 SWD devices: (a) SWD majority gate [31]. (b) SWD inverter [31]
Figure 2.14 (a) The possible stable magnetization. (b) NML majority gate
Figure 2.15 Different designs of DNA majority gates: (a) Four-way junction-driven DNA majority gate [33]. (b) DNA majority gate given in [34]. (c) Spatially localised DNA majority gate [35]. (Each color represents a particular domain in the strand)
Figure 2.16 Preprocessing script used in [55] and [60]
Figure 2.17 Four decomposition methods scripts used in [55] and [60]
Figure 3.1 (a) QCA majority gate. (b) QCA inverter.
Figure 3.2 Majority circuits of (a) Adder (level priority). (b) Adder (gate priority). (c) Multiplier. (d) Multiplier (high-speed). (e) Restoring divider. (f) Non-restoring divider. (g) Divider (high-speed). (h) Squarer.
Figure 3.3 Majority circuits of (a) Square rooting. (b) Square/square-root (high-speed)
Figure 3.4 QCA structure of a three-input XOR gate.
Figure 3.5 Majority-XOR circuits of (a) Adder. (b) Multiplier. (c) Multiplier (high-speed). (d) Restoring divider. (e) Non-restoring divider. (f) Divider (high-speed). (g) Squarer. (h) Square rooting.
Figure 3.6 Majority-XOR circuits of (a) Square/square-root (high-speed).
Figure 3.7 QCA circuit designs of (a) Adder. (b) Multiplier. (c) Multiplier (high-speed). (d) Restoring divider. (e) Non-restoring divider. (f) Divider (high-speed). (g) Squarer. (h) Square rooting. (i) Square/square-root (high-speed).
Figure 4.1 (a) Basic multiplier cell (b) Logic diagram of the multiplier cell [70].
Figure 4.2 Majority-based circuit of multiplier cell
Figure 4.3 Majority/XOR-based circuit of multiplier cell
Figure 4.4 QCA layout of the multiplier cell
Figure 4.5 QCA layers for multiplier cell: (a) Main layer (b) Layer 1 (c) Layer 2
Figure 4.6 A 4-bit multiplier
Figure 4.7 QCA layout of 3-bit multiplier
Figure 4.8 QCA layout of 4-bit multiplier
Figure 4.9 Simulation result for the multiplier cell
Figure 4.10 Simulation results for multiplication of (a) 10 and 101 (b) 110 and 101
Figure 5.1 Complement adder/subtractor cell
Figure 5.2 5 × 5 non-restoring binary array divider
Figure 5.3 CAS cell: (a) Logic diagram (b) QCA layout
Figure 5.4 QCA layers for CAS cell: (a) Main layer (b) Layer 1 (c) Layer 2
Figure 5.5 QCA layout of 3 × 3 NRD
Figure 5.6 The patterns of CAS cells for different sizes of NRD.
Figure 5.7 QCA layout of 4 × 4 NRD.
Figure 5.8 QCA layout of 5 × 5 NRD.
Figure 5.9 Latency of RD, NRD, and the proposed NRD.
Figure 5.10 Simulation results for the CAS cell
Figure 5.11 Simulation results for division of (a) 111 by 10 (b) 1111 and 10
Figure 6.1 QCA layout of squaring cell
Figure 6.2 QCA layout of square-rooting cell
Figure 6.3 QCA layout of 4-bit squaring circuit
Figure 6.4 QCA layout of 4-bit square-rooting circuit
Figure 6.5 Simulation results of squaring cell
Figure 6.6 Simulation results of square-rooting cell
Figure 6.7 Simulation results for squaring of (a) 101 (b) 1100
Figure 6.8 Simulation results for square rooting of (a) 1001 (b) 1101
Figure 7.1 Flowchart for the proposed synthesis method.
Figure 7.2 The possible levels of standard majority logic structures based on the number of majority gates.
Figure 7.3 The possible gates of standard majority logic structures based on the number of levels.
Figure 7.4 Three structures for 5 majority gates and 3 levels.
Figure 7.5 Three structures for 5 majority gates and 4 levels.
Figure 7.6 The structure for 5 majority gates and 5 levels.
Figure 7.7 Obtained majority expression for the output {d}.
Figure 7.8 Obtained majority expression for the output {g}.
Figure 7.9 Obtained majority expressions for the output {e}.
Figure 7.10 Obtained majority expressions for the output {f}.
Figure 7.11 Final majority network for b1.
Figure 8.1 Basic cells: (a) Controlled adder-subtractor cell (b) Control cell
Figure 8.2 Generalized pipeline cellular array
Figure 8.3 Majority circuits of the arithmetic cell
Figure 8.4 QCA design for the arithmetic cell
Figure 8.5 Majority circuits of the control cell
Figure 8.6 QCA design for the control cell
Figure 8.7 QCA design of a generalized pipeline array for n = 2
Figure 8.8 QCA design of a generalized pipeline array for n = 5
Figure 8.9 Generalized pipeline cellular array for specified input/output pins
Figure 8.10 QCA design of generalized pipeline array for specified input/output pins for n = 5
Figure 8.11 Simulation results for (a) Squaring of 111 (b) Square rooting of 1000000 (c) Multiplication of 1110 and 1001 (d) Division of 101101 by 101
LIST OF TABLES
Table 2.1 Majority function truth table
Table 2.2 Comparison between the best comprehensive synthesis methods
Table 2.3 Comparison of 8 standard three-variable Boolean functions using different majority synthesis methods
Table 2.4 Comparison of 40 benchmarks using the best comprehensive synthesis methods
Table 2.5 Optimization capability analysis of the best comprehensive synthesis methods
Table 3.1 Specifications of majority circuits of the fundamental arithmetic units
Table 3.2 Specifications of majority/XOR-based circuits of the fundamental arithmetic units
Table 3.3 Comparison of the proposed QCA designs of the fundamental arithmetic units and the best existing designs
Table 4.1 Comparison of different QCA designs of array multipliers
Table 5.1 Comparison of QCA designs for a CAS cell
Table 5.2 Comparison of different QCA designs of restoring and non-restoring array dividers
Table 6.1 Comparison of the proposed squaring and square-rooting designs
Table 7.1 Comparison of 15 MCNC benchmarks using two existing methods and proposed method “Gate Priority”
Table 7.2 Comparison of 15 MCNC benchmarks using two existing methods and proposed method “Level Priority”
Table 8.1 Conditions of generalized pipeline array inputs for arithmetic operations
Table 8.2 Conditions of generalized pipeline array (specified pins) inputs for arithmetic operations
Table 8.3 Comparison of the proposed GPCA and different QCA designs
CHAPTER 1: INTRODUCTION
1.1: Introduction
For the last four decades, the implementation of high-density, high-speed and low-power
very large scale integrated systems has largely been based on complementary metal-oxide
semiconductor (CMOS) technology. Further scaling down of feature sizes has faced many
difficulties due to the fundamental physical limits of CMOS [1]. Many other nanotechnologies,
such as quantum-dot cellular automata (QCA) [2–7], single electron tunneling (SET) [8,9], and
tunneling phase logic (TPL) [10], have been proposed and considered as possible replacements
for CMOS. These technologies are expected to achieve high density, high-speed switching and
low power consumption. In addition, there has been increasing interest in the development
of nanotechnologies in various areas of research because of their future applications in
fields such as medicine, energy, and industry. CMOS technology uses NAND, NOR and NOT
gates as the basic units to implement circuits. However, in the emerging nanotechnologies,
the fundamental logic units are majority/minority gates and inverters.
The development of low-complexity and high-speed circuits in different disciplines has
always been a topic of interest. In order to achieve low-cost and high-efficiency applications, it
is important to develop algorithms for the basic arithmetic circuits such as adder, subtractor,
multiplier, divider, squarer, and square-rooting circuits. The purpose of this research is to
develop low-complexity and high-speed QCA circuits of various single- and multi-operation
arithmetic circuits.
1.2: Motivation
In traditional Boolean logic design, AND, OR, and NOT gates have been the basic units
for realization of Boolean functions. These functions are usually produced using conventional
reduction methods, which result in simplified expressions in one of two standard
representation forms: sum of products (SOP) and product of sums (POS). However, in post-CMOS
technologies, logic circuits are based on majority or minority gates. Therefore, in order to
implement a Boolean function in these nanotechnologies, the function has to be converted
into its equivalent majority or minority logic network. Using traditional reduction methods
to produce circuits based on majority/minority gates is not efficient due to the complexity
of the resulting circuits. Thus, there is a clear need to develop an algorithm that can
efficiently produce QCA-based circuits.
Basic arithmetic circuits are the backbone of all meaningful applications. In promising
nanotechnologies, majority and/or minority logic gates are used as the basic units to implement
such circuits. Optimal QCA designs of different arithmetic circuits such as adder, subtractor,
multiplier, divider, squarer, and square-rooting arrays have been a major area of research for
many researchers. With optimized basic arithmetic units, QCA computing systems can
be closer to optimal, especially when considering large-scale circuits. The majority/minority
synthesis techniques, which are in general proposed for different nanotechnologies, may not
result in the optimum QCA designs. For instance, the full adder circuit consists of a
three-input XOR and a three-input majority logic gate. In order to realize a three-input XOR
gate we need at least three majority gates and more than one level. In QCA, it could be
possible to construct three-input XOR logic as a basic unit. Hence, using both XOR and
majority logic, basic arithmetic operations can be realized in QCA with more efficient designs
compared to majority-only circuits.
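The full adder remark can be checked with a short truth-table sketch. The majority-only sum below uses a widely known three-gate decomposition; it is an illustrative assumption and not necessarily the circuit adopted later in this dissertation. The function names are hypothetical.

```python
from itertools import product

def maj(a, b, c):
    """Three-input majority: 1 when at least two inputs are 1."""
    return int(a + b + c >= 2)

def inv(a):
    """Inverter."""
    return 1 - a

def full_adder_majority_only(a, b, cin):
    # Carry: one majority gate. Sum: two further majority gates on a
    # second and third level (three gates in total, plus inverters).
    carry = maj(a, b, cin)
    total = maj(inv(carry), maj(a, b, inv(cin)), cin)
    return total, carry

def full_adder_majority_xor(a, b, cin):
    # Sum: one three-input XOR treated as a basic unit.
    # Carry: one majority gate. Both sit on a single level.
    return a ^ b ^ cin, maj(a, b, cin)

# Exhaustive check of both realizations against binary addition.
for a, b, cin in product((0, 1), repeat=3):
    expected = ((a + b + cin) % 2, (a + b + cin) // 2)
    assert full_adder_majority_only(a, b, cin) == expected
    assert full_adder_majority_xor(a, b, cin) == expected
```

Under these assumptions the majority/XOR version needs two basic units on one level, while the majority-only version needs three gates across more than one level.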
A simplified Boolean function expressed in terms of logic AND, OR and NOT can be
directly mapped one-to-one onto majority AND/OR gates to obtain its equivalent majority
expression. However, this procedure does not result in an optimal majority network: the
number of majority gates and the number of levels used in the majority circuit are not
minimal. For example, consider the Boolean function
$f = x_1 x_2' + x_1 x_3 + x_2' x_3$. The AND/OR mapping method requires five majority gates
and three levels: $n_1 = x_1 x_2'$, $n_2 = x_1 x_3$, $n_3 = x_2' x_3$, $n_4 = n_1 + n_2$, and
$f = n_3 + n_4$. However, this function can be realized with only one majority gate in one
level, i.e., $f = M(x_1, x_2', x_3)$. In a majority circuit, the numbers of gates and levels
are the most essential factors in improving performance since they determine the latency and
the size of the circuit. Therefore, there is a strong need to develop an efficient method for
synthesizing majority logic networks with better results in terms of these factors in order
to design efficient QCA circuits.
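The gate counts in this example can be reproduced with a small sketch. It assumes the usual convention that a majority gate with one input tied to 0 acts as AND and with one input tied to 1 acts as OR; the `GateCounter` helper and function names are hypothetical.

```python
from itertools import product

class GateCounter:
    """Evaluates majority gates and counts how many were used."""
    def __init__(self):
        self.count = 0

    def maj(self, a, b, c):
        self.count += 1
        return int(a + b + c >= 2)

def and_or_mapping(g, x1, x2, x3):
    """f = x1 x2' + x1 x3 + x2' x3 with each AND/OR mapped to one gate."""
    nx2 = 1 - x2
    n1 = g.maj(x1, nx2, 0)   # level 1: AND
    n2 = g.maj(x1, x3, 0)    # level 1: AND
    n3 = g.maj(nx2, x3, 0)   # level 1: AND
    n4 = g.maj(n1, n2, 1)    # level 2: OR
    return g.maj(n3, n4, 1)  # level 3: OR

def single_gate(g, x1, x2, x3):
    """The same function as one majority gate: f = M(x1, x2', x3)."""
    return g.maj(x1, 1 - x2, x3)

def gates_used(network):
    g = GateCounter()
    network(g, 0, 0, 0)
    return g.count

# Both networks agree on every input combination.
for x1, x2, x3 in product((0, 1), repeat=3):
    assert and_or_mapping(GateCounter(), x1, x2, x3) == \
        single_gate(GateCounter(), x1, x2, x3)

print(gates_used(and_or_mapping), gates_used(single_gate))  # prints "5 1"
```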
1.3: Research Objectives
In this dissertation, algorithms for QCA design of different arithmetic circuits are proposed.
These designs include multiplier, divider, squarer, square-rooting circuit, and multi-operation
arrays. The QCA designs of the basic cells of these arrays are developed based on the majority
gate, which is the fundamental logic device in QCA, and a QCA structure of the three-input
XOR function. This process can provide further reductions in the number of gates, levels,
inverters, and gate inputs, which leads to QCA-based arithmetic circuits with better results in
terms of area, latency, and thus cost, compared to the existing designs. The proposed arrays
are developed in a pipeline manner to perform the arithmetic operations for any number of
bits.
In the second portion of this research, a comprehensive methodology for
majority/minority-based circuit synthesis is proposed. This method is capable of processing
n-feasible networks and synthesizing their equivalent majority logic circuits with optimization
priority given to either gates or levels. We develop a process for constructing standard majority
logic structures and their corresponding majority expressions starting from the minimum
number of gates and levels. The concept of parallel processing is used to apply combinations
of inputs to the produced standard majority structures. To the best of the authors’ knowledge,
the concepts of constructing standard majority logic circuits and parallel processing have not
been used in majority/minority logic synthesis methods. A simplification method is also
used to simplify the primary resulting majority network by removing redundancies and
optimizing the use of inverters. This leads to a lower number of majority gates and levels, lower
latency, and smaller area, which improve the overall performance of the circuits.
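The idea of applying input combinations to a fixed majority structure can be illustrated on the smallest possible structure, a single gate. The sketch below is a plain brute-force search written for illustration only; it is not the synthesis procedure proposed in this dissertation, and the names `LITERALS`, `single_gate_matches`, and `target` are hypothetical.

```python
from itertools import combinations_with_replacement, product

def maj(a, b, c):
    """Three-input majority: 1 when at least two inputs are 1."""
    return int(a + b + c >= 2)

# Candidate gate inputs: each variable, its complement, and the constants.
LITERALS = {
    "x1": lambda x: x[0], "x1'": lambda x: 1 - x[0],
    "x2": lambda x: x[1], "x2'": lambda x: 1 - x[1],
    "x3": lambda x: x[2], "x3'": lambda x: 1 - x[2],
    "0": lambda x: 0, "1": lambda x: 1,
}
ASSIGNMENTS = list(product((0, 1), repeat=3))

def truth_table(fn):
    """Output column of fn over all eight input assignments."""
    return tuple(fn(x) for x in ASSIGNMENTS)

def single_gate_matches(target):
    """Return every unordered input triple whose majority equals target."""
    wanted = truth_table(target)
    matches = []
    for names in combinations_with_replacement(LITERALS, 3):
        def gate(x, names=names):
            return maj(*(LITERALS[n](x) for n in names))
        if truth_table(gate) == wanted:
            matches.append(names)
    return matches

def target(x):
    """Example function f = x1 x2' + x1 x3 + x2' x3."""
    x1, x2, x3 = x
    return (x1 & (1 - x2)) | (x1 & x3) | ((1 - x2) & x3)

print(single_gate_matches(target))
```

A full synthesis method would repeat this kind of search over larger structures with more gates and levels; that extension is not attempted here.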
Since the operation of a minority gate is just the complement of a majority function, a
minority logic network for any Boolean function can be directly obtained from
the equivalent majority logic network by using De Morgan’s theorem. This operation results
in a minority logic network with the same structure as its equivalent majority network. In
other words, both the majority logic network and its equivalent minority network have the same
number of gates and levels. With an efficient majority logic synthesis method, a Boolean
function can therefore be converted into its equivalent majority and minority logic networks.
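As a minimal sketch of this correspondence, the following check uses the self-duality of the majority function, $M(a', b', c') = M(a, b, c)'$, to rewrite a two-gate majority network as a two-gate minority network. The example function $f = x_1 x_2 + x_3$ and the function names are hypothetical choices for illustration.

```python
from itertools import product

def maj(a, b, c):
    """Three-input majority gate."""
    return int(a + b + c >= 2)

def minority(a, b, c):
    """Three-input minority gate: complement of the majority."""
    return 1 - maj(a, b, c)

def inv(a):
    return 1 - a

def f_majority(x1, x2, x3):
    """f = x1 x2 + x3 with two majority gates on two levels."""
    return maj(maj(x1, x2, 0), x3, 1)

def f_minority(x1, x2, x3):
    """Same structure with minority gates: two gates on two levels."""
    return minority(minority(x1, x2, 0), inv(x3), 0)

for a, b, c in product((0, 1), repeat=3):
    # A minority gate fed with complemented inputs behaves as a majority gate.
    assert minority(inv(a), inv(b), inv(c)) == maj(a, b, c)
    # Both networks realize the same function.
    assert f_minority(a, b, c) == f_majority(a, b, c)
```

Only the inverter placement and the constant inputs change between the two networks; the gate and level counts stay the same.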
1.4: Dissertation Outline
This dissertation is organized as follows:
In Chapter 1, the introduction, motivations and objectives of this research are given.
Chapter 2 gives background information about emerging nanoscale technologies. In
addition, more details of QCA technology are given. This includes QCA cells, devices,
clocks, and crossovers. This chapter also discusses the existing majority/minority logic
synthesis methods and analyzes their advantages and disadvantages.
Chapter 3 proposes QCA designs of various fundamental arithmetic cells using a QCA
structure of the three-input XOR function. These designs include the basic cell of adder,
multiplier, divider, squarer, square-rooting circuit, and multi-operation array.
Chapter 4 presents a low-complexity and high-speed QCA design of a multiplier array. The
proposed design is also compared with the best existing designs.
In Chapter 5, a QCA design of an n-bit non-restoring binary array divider is discussed. A
comparison with various QCA dividers is also given.
Chapter 6 presents QCA designs of squaring and square-rooting circuits. The chapter
also compares the proposed circuits with their counterparts.
Chapter 7 explains the procedure of the proposed majority/minority logic synthesis
method. A comparison of 15 MCNC benchmarks using the proposed synthesis method
and the existing majority logic synthesis methods is given.
Chapter 8 introduces a QCA design of a generalized pipeline cellular array which can
perform all the basic arithmetic operations such as addition, subtraction, multiplication,
division, squaring, and square-rooting. A comparison of arithmetic operations and design
specifications is also given.
In Chapter 9, a brief summary, concluding remarks and future directions are given.
CHAPTER 2: LITERATURE REVIEW
Portions of this chapter were reprinted or adapted from: Amjad Almatrood and Harpreet
Singh, “A Comparative Study of Majority/Minority Logic Circuit Synthesis Methods for
Post-CMOS Nanotechnologies,” Engineering, 9(10): 890, 2017. [11]
2.1: Introduction
Complementary metal-oxide semiconductor (CMOS) technology has played a vital role in
constructing integrated systems for the past four decades. This technology has met the
requirements of implementing high-density, high-speed and low-power very large scale
integrated systems. The fundamental physical limits of this technology have been reached [1].
Many researchers have introduced different nanotechnologies such as quantum-dot cellular
automata (QCA) [2–7], single electron tunneling (SET) [8,9], tunneling phase logic (TPL) [10],
spintronic devices [12], and many others. These nanotechnologies are being
considered as possible replacements for CMOS technology and are expected to provide further
scaling down of feature sizes and other features of integrated systems. For instance, the
MPU/ASIC high performance 4t NAND gate size in 7-nm CMOS technology is approximately
0.099 µm² [1]. On the other hand, the basic logic unit in QCA has an area of
0.0034 µm². In addition, the applications of nanotechnology are far reaching and can
provide great achievements in different fields. For example, one of the greatest achievements of
nanotechnology to date is the invention of automated nano molecular devices which can be
controlled to perform a given task [13].
In CMOS technology, logic NAND, NOR and NOT gates are the basic units used to
implement circuits. The post-CMOS nanotechnologies use logic majority and/or minority
gates. In this chapter, we give the background information about QCA technology and
review some majority/minority-based post-CMOS nanotechnologies and the implementation
of their logic devices. We also discuss the best existing majority/minority logic synthesis
methods and compare these methods based on different optimization factors.
2.2: Majority/Minority-Based Post-CMOS Nanotechnologies
2.2.1: Quantum-dot Cellular Automata Technology
Quantum-dot cellular automata (QCA) technology is one of the nanotechnologies that provide
a new technique for computation and information transformation. This technology uses a QCA
majority gate as the basic device along with QCA wire and QCA inverter to implement logic
circuits. The following sections explain the QCA cell, devices and clocks in detail.
QCA Cell: A QCA cell contains four quantum dots that are located at the corners of
a square. When a cell is charged with two free electrons, which can tunnel between dots, only
two electron-pair configurations are energetically stable due to Coulombic interactions.
The two configurations of electrical charges in a cell encode binary information. Each of
these configurations has a different cell polarization. These polarizations are P = +1 and
P = −1, which represent logic 1 and 0, respectively. Fig. 2.1 shows a QCA cell and its
two possible electron configurations.
Figure 2.1: The possible electron configurations of a QCA cell: P = +1 (logic 1) and P = −1 (logic 0)
In QCA, a logic circuit is implemented using three primitive devices: the QCA wire, the
QCA inverter, and the QCA majority logic gate. The construction of these devices is based on
the QCA cell, which is the fundamental unit in QCA.
QCA Wire: A QCA wire can be constructed by placing a group of cells next to each
other as shown in Fig. 2.2. The binary signal propagates from the leftmost cell, which is
the input, along the wire to the rightmost cell, which is the output [14].
Figure 2.2: QCA wire (a logic 1 at the input propagates along the wire to the output)
QCA Inverter: A QCA inverter produces logic 1 at the output if the input is logic 0, and
logic 0 if the input is logic 1. When cells are placed diagonally to each other, their
polarizations are reversed. Based on this characteristic, the QCA inverter
can be constructed in two different ways. The first QCA inverter is shown in Fig. 2.3(a).
This inverter requires more cells. However, it is more robust because it is not affected by
surrounding cells. The second inverter, shown in Fig. 2.3(b), requires fewer cells.
However, it cannot be used in all circuits because its operation can be affected
by other cells in the circuit.
Figure 2.3: Two different structures of QCA inverters
QCA Majority Gate: A QCA majority gate implements the three-input majority
logic function. It produces an output of logic 1 if two or more of the inputs are
1. Otherwise, it produces an output of logic 0, as given in Table 2.1. This function can
be expressed by
M(x1, x2, x3) = x1x2 + x1x3 + x2x3 (2.1)
Table 2.1: Majority function truth table
x1 x2 x3 M(x1, x2, x3)
0 0 0 0
0 0 1 0
0 1 0 0
0 1 1 1
1 0 0 0
1 0 1 1
1 1 0 1
1 1 1 1
The layout of the QCA majority gate is shown in Fig. 2.4. As seen in the figure, a QCA
majority gate is constructed of one cell surrounded by four cells, one on each side. Three of
these cells are the gate inputs, which are the upper, leftmost and lower cells. The
polarization of the middle cell is determined by the polarizations of the three input cells
because the majority polarization represents the lowest energy state. Then, the signal
propagates to the rightmost cell, which is the output cell.
Figure 2.4: QCA majority gate (example shown: X1 = 0, X2 = 1, X3 = 1, Output = 1)
By forcing one of the three inputs in a three-input majority gate to logic 0 or 1, the gate
will perform as a two-input logic AND or a two-input logic OR function as given in (2.2)
and (2.3), respectively.
M(x1, x2, 0) = x1x2 (2.2)
M(x1, x2, 1) = x1 + x2 (2.3)
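Equations (2.1) to (2.3) can be checked exhaustively. The following is a minimal sketch in Python; the function name `majority` and the 0/1 integer encoding of logic values are illustrative choices, not part of the original text.

```python
from itertools import product

def majority(x1: int, x2: int, x3: int) -> int:
    """Three-input majority function of Eq. (2.1): 1 when two or more inputs are 1."""
    return (x1 & x2) | (x1 & x3) | (x2 & x3)

# Rebuild Table 2.1 as a dictionary from input triple to output.
truth_table = {bits: majority(*bits) for bits in product((0, 1), repeat=3)}

# Fixing the third input to 0 or 1 gives AND and OR, as in Eqs. (2.2) and (2.3).
for a, b in product((0, 1), repeat=2):
    assert majority(a, b, 0) == (a & b)
    assert majority(a, b, 1) == (a | b)

for bits, out in truth_table.items():
    print(*bits, out)
```

The printed rows follow the same input order as Table 2.1.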
QCA Clocks: In QCA, four different clocks are used to determine the direction of signal
flow. These are clock 1, clock 2, clock 3, and clock 4. Each of these clocks has four
phases, and consecutive clocks differ in phase by 90° [15], as shown in Fig. 2.5. The
phases are switch, hold, release, and relax, as depicted in Fig. 2.6.
The first purpose of using these clocks is to power the automaton. The second purpose
is to control the direction of data flow. The data flow direction is determined by the state
of a cell and the states of its neighbors. In the first phase, which is switch, the cell is
polarized based on the states of its neighbors. During the hold phase, the cell retains its
polarization. Finally, in the release and relax phases, the cell is unpolarized [16].
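The four-phase clocking scheme can be illustrated with a small timing model. This is only a sketch under stated assumptions: time is counted in quarter periods, zones are indexed 0 to 3, and each zone is assumed to lag the previous one by one quarter period (90°). The names `phase_of` and `schedule` are hypothetical.

```python
PHASES = ("switch", "hold", "release", "relax")

def phase_of(zone: int, step: int) -> str:
    """Phase of clock zone `zone` at quarter-period `step`.

    Assumption: zone k lags zone k-1 by one quarter period (90 degrees).
    """
    return PHASES[(step - zone) % 4]

def schedule(num_zones: int = 4, num_steps: int = 8):
    """Table of phases: one row per time step, one column per clock zone."""
    return [[phase_of(z, t) for z in range(num_zones)] for t in range(num_steps)]

for t, row in enumerate(schedule()):
    print(t, row)
```

In this model, whenever a zone is in the switch phase, the preceding zone is in the hold phase, so a switching cell always sees a stable neighbor on its input side. This is one way to picture how the clocks fix the direction of data flow.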
Figure 2.5: QCA clocks and their four phases with a difference of 90° (clock zones 0 to 3; clock signal versus time)
Figure 2.6: The four phases of a QCA clock: switch, hold, release, and relax (voltage versus time)
Crossover in QCA: Crossover in QCA is a technique that is used to avoid changes
in cell polarization where wires intersect in a circuit design. This technique is based on a
multi-layer layout, as shown in Fig. 2.7.
As shown in the figure, the crossover technique is applied to the logic 1 wire, while the logic 0
wire is connected using a normal QCA wire. Connecting the logic 1 wire vertically transfers
logic 1 to a new layer and allows it to cross over the logic 0 wire without affecting
either wire. Then logic 1 can be transferred back to the base layer [17,18].
Figure 2.7: Multi-layer crossover layout [17]
2.2.2: Single Electron Tunneling Technology
In single electron tunneling (SET) technology, both majority and minority gates are used
to implement logic circuits. A SET minority gate implements the three-input logic function
given in (2.4). Since the minority function is the complement of the majority function,
it produces an output of 0 if two or more of its inputs are 1. Otherwise, it produces an
output of 1.
m(x1, x2, x3) = x1′x2′ + x1′x3′ + x2′x3′ (2.4)
Fig. 2.8(a) shows a basic SET minority gate. It consists of three input capacitors,
single-electron boxes (SEBs), and an output capacitor. The inputs of the minority gate (V1,
V2, and V3) are applied through the input capacitors, which form a voltage summing network.
These capacitors produce the mean voltage of their inputs at node A. Based on the value of
the mean voltage, an electron will tunnel through the SEBs and make the voltage at node A
negative. Otherwise, the voltage will remain positive. The negative and positive values
represent logic 0 and 1, respectively.
Figure 2.8: (a) SET minority gate. (b) SET majority gate
By setting one of the three inputs of the minority gate as a logic 0 or 1, the gate imple-
ments a two-input logic NAND or two-input logic NOR gate, respectively [8]. The obtained
functions are given by
m(x1, x2, 0) = x1′ + x2′ = (x1x2)′ (2.5)
m(x1, x2, 1) = x1′x2′ = (x1 + x2)′ (2.6)
A SET majority gate is constructed of three input capacitors, a balanced pair of SEBs, and
three output capacitors as shown in Fig. 2.8(b). When the bias voltage (Vdd) increases,
electron tunneling occurs and results in either a (0, 1) or a (1, 0) stable voltage state. The
(0, 1) state occurs and produces a positive value at node B if the majority of the inputs are
1. Otherwise, the (1, 0) state occurs and produces a negative value at node B [9].
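The minority function of Eq. (2.4) and its NAND/NOR special cases in Eqs. (2.5) and (2.6) can be verified by enumeration. The sketch below models only the Boolean behavior, not the SET circuit; the names `minority` and `majority` and the 0/1 encoding are illustrative.

```python
from itertools import product

def majority(x1: int, x2: int, x3: int) -> int:
    return (x1 & x2) | (x1 & x3) | (x2 & x3)

def minority(x1: int, x2: int, x3: int) -> int:
    """Eq. (2.4): 1 when two or more inputs are 0."""
    n1, n2, n3 = 1 - x1, 1 - x2, 1 - x3
    return (n1 & n2) | (n1 & n3) | (n2 & n3)

# The minority function is the complement of the majority function.
for bits in product((0, 1), repeat=3):
    assert minority(*bits) == 1 - majority(*bits)

# Fixing one input to 0 or 1 gives NAND and NOR, as in Eqs. (2.5) and (2.6).
for a, b in product((0, 1), repeat=2):
    assert minority(a, b, 0) == 1 - (a & b)
    assert minority(a, b, 1) == 1 - (a | b)
```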
2.2.3: Tunneling Phase Logic Technology
A tunneling phase logic (TPL) minority gate is the basic unit used in TPL technology
to implement logic circuits. As shown in Fig. 2.9, the inputs of a TPL minority gate are
three waveforms (W1, W2, and W3). The phase of a waveform is used to represent logic
0 or 1. The phase of the output waveform is determined by the input waveforms.
If two of the three input waveforms have different phases, they neutralize each other and
the reverse of the third waveform is the output. However, if all input waveforms have
the same phase, the output is the reverse of that phase.
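The phase rule described above can be checked against the minority function. The following sketch applies the rule literally, assuming the two waveform phases are encoded as logic 0 and 1; the function names are hypothetical.

```python
from itertools import product

def tpl_output(w1: int, w2: int, w3: int) -> int:
    """Apply the TPL phase rule; each argument is a phase encoded as 0 or 1."""
    waves = (w1, w2, w3)
    # Look for a pair of inputs with different phases; they cancel each other,
    # and the output is the reverse of the remaining (third) waveform.
    for i, j, k in ((0, 1, 2), (0, 2, 1), (1, 2, 0)):
        if waves[i] != waves[j]:
            return 1 - waves[k]
    # All three phases are equal: the output is the reverse of that phase.
    return 1 - w1

def minority(x1: int, x2: int, x3: int) -> int:
    return 1 - ((x1 & x2) | (x1 & x3) | (x2 & x3))

assert all(tpl_output(*w) == minority(*w) for w in product((0, 1), repeat=3))
```

When two inputs differ, the third input necessarily agrees with one of them and is therefore the majority value, so reversing it gives the minority output.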
Figure 2.9: TPL minority gate
2.2.4: Spintronic Majority Gate
A spintronic majority gate (SMG) is a device that performs a three-input majority
function. This device is implemented with a cross of ferromagnetic wires with a size of
140 × 140 nm [19, 20]. Over the four ends of the cross, the three input (A, B, C) and
output (Out) terminals are formed as nanopillars (20 × 20 nm each) with a separate
ferromagnetic layer as shown in Fig. 2.10(a). Based on the sign of the current received from
each nanopillar, the current exerts spin torque in order to switch the magnetization of the
common layer to a certain direction. The final direction of the magnetization is determined
by the majority direction of the inputs and sensed via the tunneling magnetoresistance
(TMR) effect using a sense amplifier. Fig. 2.10(b) shows the width and length of the arms,
the size of the pillars, and the distance between them (a = 20 nm).
Figure 2.10: (a) SMG device [19]. (b) Top view of SMG [19]
2.2.5: All Spin Logic
An all spin logic (ASL) device is also a spin-based device [21,22]. It is constructed of copper
wires and nanomagnets. To implement an ASL device that performs a three-input majority
function, four nanomagnets, which represent the three inputs and one output, are placed
over the ends of the copper wires as shown in Fig. 2.11(a). The input and output sides of
each of these nanomagnets are separated by an insulator. Due to the current driven to the
ground terminal from the voltage supplied to the top of each nanomagnet, spin polarized
electrons accumulate in the two sides of each nanomagnet with different concentrations.
This difference causes a diffusion spin current, which exerts torque on a nanomagnet and is
able to switch its polarization. Based on the majority of input polarizations, the output is
determined and delivered via the output nanomagnet as a logic value. Inverters can also
be implemented based on the same properties of polarization changes, as shown in Fig. 2.11(b).
Figure 2.11: ASL devices: (a) ASL majority gate [23]. (b) ASL inverter [23]
2.2.6: Spin Torque Oscillator Logic
A spin torque oscillator (STO) logic device can perform a three-input majority
function [24]. This device consists of four nanopillars (three inputs and one output) with
their own layers. Similar to the SMG, the oscillators have a common ferromagnetic layer as
shown in Fig. 2.12. The input currents pass through the nanopillars and exert spin torques
that drive the oscillators. Because of the driven oscillators, spin waves propagate in the
common layer and couple the oscillators’ signals. Based on the majority of the inputs, the
frequency of the output oscillator is determined. This can be sensed via the effect of giant
magnetoresistance (GMR) or TMR. The frequency of the output serves in the circuit as the
logic signal.
Figure 2.12: STO logic majority gate [12]
2.2.7: Spin Wave Device Technology
In spin wave device (SWD) technology, computation and information transformation occur
via spin waves [12, 25, 26]. SWD technology uses the majority gate as the logic primitive.
An SWD majority gate is constructed of the symmetric merging of three waveguides [27,
28]. Its operation is based on the interference of the input spin waves. The output is
determined by the interference of the phases of the three input spin waves and is read via
magnetoelectric (ME) cells [29]. Another logic device in SWD technology is the SWD inverter,
which is implemented by a waveguide that delivers the inverse of the spin wave signal to
the output ME cell [28, 30]. Fig. 2.13(a) and (b) show the areas and designs for an SWD
majority gate and inverter, respectively. It can be seen that the length of the waveguide in
the inverter is 1.5× the spin wavelength (λSW), while the length of each waveguide in the
majority gate is 1.0× the spin wavelength.
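The role of the waveguide lengths can be illustrated with an idealized interference model. This is a sketch under assumptions that are not stated in the original: waves have unit amplitude and no attenuation, logic 0 and logic 1 are encoded as phases 0 and π, and a waveguide of length L adds a phase of 2πL/λSW. All function names are hypothetical.

```python
import cmath
import math
from itertools import product

def wave(bit: int) -> complex:
    """Unit-amplitude wave; assumed encoding: logic 0 -> phase 0, logic 1 -> phase pi."""
    return cmath.exp(1j * math.pi * bit)

def propagate(w: complex, length_in_wavelengths: float) -> complex:
    """Phase accumulated along a waveguide whose length is given in wavelengths."""
    return w * cmath.exp(1j * 2 * math.pi * length_in_wavelengths)

def read_bit(w: complex) -> int:
    """Idealized readout: a negative real part corresponds to phase pi (logic 1)."""
    return 1 if w.real < 0 else 0

def swd_majority(a: int, b: int, c: int) -> int:
    # Three arms of length 1.0 wavelength merge; the waves interfere by superposition.
    total = sum(propagate(wave(x), 1.0) for x in (a, b, c))
    return read_bit(total)

def swd_inverter(a: int) -> int:
    # A 1.5-wavelength waveguide adds an odd multiple of pi, which flips the phase.
    return read_bit(propagate(wave(a), 1.5))

for a, b, c in product((0, 1), repeat=3):
    assert swd_majority(a, b, c) == int(a + b + c >= 2)
```

In this model a length of 1.0 wavelength leaves the phase unchanged, so the superposition takes the phase of the majority, while the extra half wavelength in the inverter reverses the phase.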
Figure 2.13: SWD devices: (a) SWD majority gate [31]. (b) SWD inverter [31]
2.2.8: Nanomagnetic Logic
The process of computation and information transformation in nanomagnetic logic (NML)
[32] is based on the magnetization of a patterned array of elongated nanomagnets. In NML,
there are two stable magnetization states of the magnets that are used to represent binary
information. These states are commonly referred to as “up” and “down”, which represent
logic 1 and 0, respectively, as shown in Fig. 2.14(a). The fundamental logic element in NML
technology is a three-input majority gate. This gate is constructed of a cross of five dots:
one central dot surrounded by four dots that represent the inputs (A, B, C) and the output,
as shown in Fig. 2.14(b). Based on the majority of the magnetizations of the three inputs,
the output is calculated via magnetic interactions.
Figure 2.14: (a) The possible stable magnetization. (b) NML majority gate
2.2.9: DNA Technology
DNA technology is being considered as a possible alternative to silicon-based technologies,
especially for implantable medical devices. Its small size, light weight, and compatibility
with bio-signals make it suitable for implementing logic circuits. Several
researchers have introduced different designs of DNA majority gates [33–35] based on differ-
ent techniques such as the four-way junction-driven DNA majority gate, spatially localised
DNA majority gate, etc. The basic operation associated with such majority gates is the
DNA strand displacement mechanism. Different designs of DNA majority gates are shown
in Fig. 2.15.
Figure 2.15: Different designs of DNA majority gates: (a) Four-way junction-driven DNA
majority gate [33]. (b) DNA majority gate given in [34]. (c) Spatially localised DNA majority
gate [35]. (Each color represents a particular domain in the strand)
In addition to the nanotechnologies discussed in this paper, other nanotechnologies such
as graphene [36,37] reconfigurable gate [38], resistive RAM [39–41], carbon nanotube [42–45],
etc., use logic majority and/or minority gates as circuit primitives. Hence, in order to im-
plement an efficient logic circuit in any of these nanotechnologies including all the nanotech-
nologies discussed earlier, the circuit has to be converted into its equivalent majority- or
minority-based logic circuit. In this research, QCA technology is considered for the imple-
mentation of various arithmetic circuits obtained from the proposed synthesis method.
Since the minority function is the complement of the majority function, De Morgan’s theorem
can be used to derive a minority logic network from its equivalent majority network. This process
results in a minority network with the same number of gates and levels as in its
equivalent majority network. This means that an efficient majority logic network synthesis
method can be used to obtain both majority and minority networks. Simplified Boolean
functions expressed in the standard SOP and POS forms can be directly converted into majority
or minority logic networks by the majority AND/OR mapping method, which maps each
logic gate in the simplified Boolean functions to a majority AND/OR
gate. However, in most cases, this method does not result in optimal majority/minority
expressions. In other words, the number of gates, levels, etc., in the majority/minority
expressions obtained from the AND/OR mapping method is not optimal. For
example, consider the majority function f = x1x2 + x1x3 + x2x3. The AND/OR
mapping method requires five majority gates in three levels: n1 = x1x2, n2 = x1x3,
n3 = x2x3, n4 = n1 + n2, and f = n3 + n4, whereas the function can be realized with only one
majority gate in one level, i.e., f = M(x1, x2, x3). Therefore, an efficient majority/minority
logic network synthesis method is needed in order to generate optimal majority/minority logic
networks. In the next section, we review the best existing majority/minority logic synthesis
methods in detail.
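The five-gate AND/OR mapping in the example above, and the De Morgan conversion between majority and minority gates, can both be checked by enumeration. The sketch below is illustrative only; the names `M`, `m`, and `and_or_mapped` are not from the original, and logic values are encoded as 0/1 integers.

```python
from itertools import product

def M(a: int, b: int, c: int) -> int:
    """Three-input majority gate."""
    return (a & b) | (a & c) | (b & c)

def m(a: int, b: int, c: int) -> int:
    """Three-input minority gate (complement of the majority gate)."""
    return 1 - M(a, b, c)

def and_or_mapped(x1: int, x2: int, x3: int):
    """f = x1x2 + x1x3 + x2x3 built with majority AND/OR gates; returns (value, gate count)."""
    n1 = M(x1, x2, 0)   # AND
    n2 = M(x1, x3, 0)   # AND
    n3 = M(x2, x3, 0)   # AND
    n4 = M(n1, n2, 1)   # OR
    f = M(n3, n4, 1)    # OR
    return f, 5

for x1, x2, x3 in product((0, 1), repeat=3):
    value, gates = and_or_mapped(x1, x2, x3)
    # Five mapped gates compute the same function as one majority gate.
    assert value == M(x1, x2, x3)
    # De Morgan: a majority gate equals a minority gate with all inputs inverted,
    # so the gate count and level count are unchanged by the conversion.
    assert M(x1, x2, x3) == m(1 - x1, 1 - x2, 1 - x3)
```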
2.3: Majority/Minority Logic Circuit Synthesis Methods
The history of research in majority logic synthesis dates back to the 1960s. Karnaugh-map
(K-map) [46], reduced-unitized-table [47], and Shannon’s decomposition principle [48]
are some of the methods that were developed to synthesize majority logic networks.
However, these methods are suitable only for small networks because they synthesize
majority networks manually. Other majority synthesis methods were introduced based on a
geometric interpretation of three-variable Boolean functions to convert sum-of-products
expressions into optimal majority logic networks [49, 50]. However, these methods can
synthesize Boolean functions of only up to three variables. For synthesizing majority logic
networks with more than three variables, several approaches have been proposed based on
different concepts [51–56]. The methods in [51, 52] are based on a genetic algorithm [57, 58]
and the concept of Boolean disjointness, respectively. Other approaches described in [53–56]
use a standard logic synthesis tool, sequential interactive synthesis (SIS) [59], to decompose
Boolean functions into three-feasible or four-feasible networks. The decomposed networks
are then converted into their equivalent majority expressions based on different techniques.
Recently, Wang et al. [60] proposed a new comprehensive majority/minority synthesis
method that can synthesize majority networks with optimization priority given
to either the number of gates or the number of levels of the generated network. This method
also uses the SIS tool. However, the decomposition methods used in this approach are developed
to decompose input Boolean functions into both three-feasible and four-feasible networks.
Based on a developed table that contains optimal equivalent majority expressions for all
four-variable Boolean functions, the decomposed networks are then converted into their
corresponding majority expressions. Other majority logic synthesis methods are proposed in
[61, 62]. These methods use a new logic representation called the majority-inverter graph
(MIG), which is a directed acyclic graph consisting of three-input majority nodes and
regular/complemented edges. MIG optimization can minimize the number of complemented
edges, which affects circuit performance [63]. A new paradigm for majority synthesis that is
also based on the majority-inverter graph is introduced in [64]. However, this approach uses a
new Boolean algebra in order to optimize MIGs. As the purpose of this chapter is to provide a
review of the best synthesis methods for post-CMOS nanotechnologies, we concentrate only on
the majority/minority logic network synthesis methods that are capable of synthesizing
multi-input multi-output Boolean functions, as follows.
2.3.1: Majority Logic Synthesis (MALS)
MALS is the first proposed comprehensive majority/minority logic network synthesis
method that is capable of synthesizing multi-level multi-output majority/minority logic net-
works [54]. The input to MALS is a minimized algebraically factored multi-output combina-
tional network, and the output is an equivalent majority logic network. The method starts
by preprocessing and decomposing the input network such that each node in the decomposed
network has at most three input variables. This process is done by using preprocessing and
decomposition methods in SIS. The next step is to check each decomposed node to see
whether it is a majority function. If so, the node is converted and the process moves
to the next node. Otherwise, the node function is checked for a common
literal. If one exists, the literal is factored out and an AND/OR mapping is then
performed on the factored function. If the node function has no common literal and it can
be realized with fewer than four majority AND/OR gates, an AND/OR mapping is
performed. Otherwise, the node is converted into its equivalent majority expression
with at most four majority gates in two levels using K-map. This procedure is accomplished
by first getting the K-map of the logic function of the node. Next, the first majority function
f1 is determined by finding the admissible pattern from the K-map of the node. Based on
the K-map of the node and the first admissible pattern, the second admissible pattern is then
found which gives the second majority function f2. Lastly, the third admissible pattern is
found based on the K-map of the node and the first and the second admissible patterns. The
third admissible pattern gives the third majority function f3. These three majority functions
are determined such that the original node can be replaced with the majority function of
these three functions as M(f1, f2, f3).
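The first two checks in this flow can be sketched for a three-input node. This is not the MALS implementation; it is a brute-force illustration in which a node is given as a Python function of three 0/1 inputs, a "majority function" is taken to mean a majority of three literals, and a "common literal" is interpreted semantically as a literal l with f = l·g. The helper names are hypothetical.

```python
from itertools import product

INPUTS = list(product((0, 1), repeat=3))

def M(a: int, b: int, c: int) -> int:
    return (a & b) | (a & c) | (b & c)

def as_majority_of_literals(f):
    """Return an inversion pattern (p1, p2, p3) with f = M(x1^p1, x2^p2, x3^p3), or None."""
    for pattern in product((0, 1), repeat=3):
        if all(f(*x) == M(*(xi ^ pi for xi, pi in zip(x, pattern))) for x in INPUTS):
            return pattern
    return None

def common_literals(f):
    """Literals l such that f is 0 whenever l is 0, i.e. f = l * g for some g."""
    found = []
    for i in range(3):
        for value, name in ((1, f"x{i + 1}"), (0, f"x{i + 1}'")):
            if all(f(*x) == 0 for x in INPUTS if x[i] != value):
                found.append(name)
    return found

# x1x2' + x1x3 + x2'x3 is the majority of x1, x2', x3: pattern (0, 1, 0).
node_a = lambda x1, x2, x3: (x1 & (1 - x2)) | (x1 & x3) | ((1 - x2) & x3)
# x1x2 + x1x3' is not a majority of literals, but x1 can be factored out.
node_b = lambda x1, x2, x3: (x1 & x2) | (x1 & (1 - x3))

print(as_majority_of_literals(node_a))
print(as_majority_of_literals(node_b), common_literals(node_b))
```

A node that fails both checks would go on to the AND/OR mapping or to the K-map based search for f1, f2, and f3 described above.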
2.3.2: Kong’s Synthesis
Another comprehensive majority/minority logic network synthesis method was introduced
by Kong et al. [55]. The input to this methodology is an arbitrary multi-output Boolean
function, and its output is an equivalent majority logic network. The method begins by
preprocessing the input network and checking its correctness using SIS. If the input function
is correct, the preprocessing script given in Fig. 2.16 is applied to simplify and
factor it algebraically, where all “(x)” are replaced with “3”. Otherwise, error information
is shown and the process ends. After preprocessing, the factored functions
are decomposed using SIS such that each node has at most three input variables. For
decomposition, four different methods given in Fig. 2.17 are performed in order to obtain
the minimum number of three-feasible nodes. In these decomposition methods, all “(x)”
are replaced with “3” to produce three-feasible decomposed networks. After decomposition,
all nodes in the decomposed network are then checked to see if there is any node that can
be collapsed into its fanout while retaining feasibility. This process can reduce further the
number of nodes. In the next step, each node in the decomposed network is then checked to
see if it is a majority function. If so, the function is then converted into its corresponding
majority expression based on forty primitive functions, which are the three-variable
Boolean functions realizable with at most one majority gate. Otherwise, all admissible
expression groups are found from
the forty majority expressions such that each group consists of three majority expressions
(f1, f2, f3) where the node is the majority function of these expressions, i.e., M(f1, f2, f3).
Then, all the majority functions that consist of expression groups with a minimum number of
majority gates are selected. Next, the selected functions are checked to select the functions
with a minimum number of gate inputs. Then, the selected functions are checked again to
select the functions with a minimum number of inverters. The same steps are repeated for
the complement of the node function. The next step is to select the majority function with
a minimum number of majority gates, gate inputs, and inverters from the selected majority
functions that consist of expression groups and their complements. The last step is to check
obtained majority expressions and see if there are repeated nodes. If so, these nodes will
be removed and the majority network will be updated. This process keeps running until no
repeated nodes exist.
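The search for an expression group (f1, f2, f3) can be sketched by brute force over truth tables. This is an illustration, not Kong's implementation: it assumes the primitives are the functions obtainable from a single majority gate whose inputs are constants or literals (counting these gives forty for three variables), it minimizes only the number of majority gates, and it omits the tie-breaking on gate inputs and inverters as well as the complement pass. Truth tables are stored as 8-bit integers, and all names are hypothetical.

```python
from itertools import combinations_with_replacement

MASK = 0xFF  # eight rows of a three-variable truth table, one bit per row
X = {"x1": 0b11110000, "x2": 0b11001100, "x3": 0b10101010}

def maj(a: int, b: int, c: int) -> int:
    """Bitwise majority of three truth tables."""
    return (a & b) | (a & c) | (b & c)

def primitives():
    """Map truth table -> (expression, gate count) for functions needing at most one gate."""
    leaves = {"0": 0, "1": MASK}
    for name, bits in X.items():
        leaves[name] = bits
        leaves[name + "'"] = bits ^ MASK
    table = {bits: (name, 0) for name, bits in leaves.items()}
    for (na, a), (nb, b), (nc, c) in combinations_with_replacement(leaves.items(), 3):
        # Keep the zero-gate entry when a gate merely reproduces a constant or literal.
        table.setdefault(maj(a, b, c), (f"M({na},{nb},{nc})", 1))
    return table

def synthesize(target: int, table):
    """Return (expression, gate count) with the fewest gates, as a primitive or M(f1,f2,f3)."""
    if target in table:
        return table[target]
    best = None
    items = list(table.items())
    for (a, (ea, ga)), (b, (eb, gb)), (c, (ec, gc)) in combinations_with_replacement(items, 3):
        if maj(a, b, c) == target:
            cost = 1 + ga + gb + gc
            if best is None or cost < best[1]:
                best = (f"M({ea},{eb},{ec})", cost)
    return best

table = primitives()
print(len(table))  # 40 primitives for three variables
print(synthesize(X["x1"] ^ X["x2"] ^ X["x3"], table))
```

For the three-input XOR used in the last line, the search returns a three-gate expression, which is consistent with the bound of at most four gates in two levels.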
collapse
sweep
eliminate 5
simplify -m nocomp -d
resub -a -d
gkx -abt (x)0
resub -a -d
sweep
gcx -bt (x)0
resub -a -d
sweep
gkx -abt 10
resub -a -d
sweep
gcx -abt 10
resub -a -d
sweep
gkx -ab
resub -a -d
sweep
gcx -b
resub -a -d
sweep
eliminate 0
Figure 2.16: Preprocessing script used in [55] and [60]
2.3.3: Majority Expression Lookup Table (MLUT)-Based Synthesis
Method 1:  xl_split -n (x)
Method 2:  xl_imp -n (x)
Method 3:  xl_part_coll -n (x) -m -g 2
           xl_coll_ck -n (x)
           xl_partition -n (x) -m
           full_simplify
           xl_imp -n (x)
           xl_partition -n (x) -t
           xl_cover -n (x) -e (x)0 -u 200
           xl_coll_ck -n (x) -k
Method 4:  xl_part_coll -n (x) -m -g 2
           xl_coll_ck -n (x)
           xl_partition -n (x) -m
           sweep; eliminate -1
           simplify -m nocomp
           eliminate -1
           sweep; eliminate 5
           simplify -m nocomp
           resub -a
           fx
           resub -a; sweep
           eliminate -1; sweep
           full_simplify -m nocomp
           xl_imp -n (x)
           xl_partition -n (x) -t
           xl_cover -n (x) -e (x)0 -u 200
           xl_coll_ck -n (x) -k
Figure 2.17: Four decomposition methods scripts used in [55] and [60]
One of the majority/minority logic network synthesis methods is the MLUT-based
method [60]. The input to this method is an arbitrary network of Boolean functions, and
the output is an equivalent majority logic network. This method also starts by preprocessing
and decomposing the input network using SIS, as in Kong’s method. However, the
preprocessing and decomposition methods used here are able to preprocess and decompose
the input network of Boolean functions into up to four-feasible networks. In preprocessing, the
input Boolean functions are simplified by algebraically factoring the common terms out and
removing the repeated terms by applying the preprocessing script given in Fig. 2.16, where
all “(x)” are replaced with “4”. For decomposition, the same four methods used in Kong’s
method are implemented in order to find the minimum number of decomposed networks.
However, these four decomposition methods decompose the network into two-feasible,
three-feasible and four-feasible networks by replacing all “(x)” in Fig. 2.17 with “2”, “3”,
and “4” in order to find the best solution. In this method, a majority expression lookup
table is developed. This table is developed by generating equivalent majority expressions
for all possible four-variable Boolean functions. This results in a table that contains ninety
four-variable Boolean functions and their equivalent majority expressions. Based on this
table, each decomposed node, which consists of four or fewer variables, is then converted
into its corresponding majority expression. A redundancy removal method is also used.
This process can provide further simplification by implementing several steps. It starts by
checking all nodes in the obtained majority network and removing the repeated nodes. All
nodes with duplicated inputs are then simplified. The next step is to sweep all nodes without
majority gates. The last step in the redundancy removal method is to minimize the number
of inverters. This step is implemented if the majority network has two cascaded inverters
which can cancel each other out. Another case would be if a majority gate in the network
has two or three internal inverters which can be factored out to have only one external
inverter. The redundancy removal method may require more than one iteration until no
further simplification is possible.
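The Boolean identities behind these redundancy removal steps can be confirmed by enumeration. The sketch below only checks the identities; it does not implement the network traversal of [60], and the names `M` and `inv` are illustrative. Note that in this simple model the two-inverter case leaves one inverter on the remaining input in addition to the external one, so the way inverters are counted in the cited method may differ.

```python
from itertools import product

def M(a: int, b: int, c: int) -> int:
    return (a & b) | (a & c) | (b & c)

def inv(a: int) -> int:
    return 1 - a

for a, b, c in product((0, 1), repeat=3):
    # Two cascaded inverters cancel each other out.
    assert inv(inv(a)) == a
    # A gate with duplicated inputs reduces to a single signal.
    assert M(a, a, b) == a
    assert M(a, inv(a), b) == b
    # Three inverted inputs can be replaced by one inverter at the output.
    assert M(inv(a), inv(b), inv(c)) == inv(M(a, b, c))
    # Two inverted inputs: one output inverter plus an inverter on the third input.
    assert M(inv(a), inv(b), c) == inv(M(a, b, inv(c)))
```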
2.4: Comparison and Discussion
The three majority/minority synthesis methods discussed in this chapter differ from each
other in their preprocessing methods, decomposition methods, conversion techniques, and
optimization targets: gates, levels, inverters, and gate inputs. Table 2.2 gives a summary of
these differences. These differences are discussed in detail as follows.
Table 2.2: Comparison between the best comprehensive synthesis methods
          Preprocessing   Decomposition                          Conversion      Optimization targets
Method    r-feasible      Methods       r-feasible   Reduction   technique       Gates   Levels   Inverters   Inputs
MALS      r = 3           1             r = 3        No          K-map           Yes     No       No          No
Kong’s    r = 3           1, 2, 3, 4    r = 3        Yes         40 Primitives   Yes     No       Yes         Yes
MLUT      r = 4           1, 2, 3, 4    r = 2, 3, 4  Yes         90 Primitives   Yes     Yes      Yes         Yes
2.4.1: Preprocessing
The first step in all three methods is preprocessing. This process is used to simplify the
input Boolean functions by removing the redundant terms and algebraically factoring the
common terms out. For example, consider the Boolean function
F = x1x2′ + x1x3 + x2′x3 + x1x2′x3′. This function is first simplified to
F = x1x2′ + x1x3 + x2′x3. Then, the common terms are factored out and the function is
simplified to F = (x2′ + x3)x1 + x2′x3. In all algorithms,
this process is done by using the simplification and factorization methods in SIS. However,
the preprocessing method used in MLUT is improved by performing the operations of kernel
and cube extraction for four-feasible networks instead of three-feasible networks as used in
MALS and Kong’s method. Although the preprocessing method provides simplified Boolean
functions in terms of logic AND, OR and NOT, in some cases these functions are not expressed
in a form that converts well into their equivalent majority expressions. To demonstrate
this point, consider the same function that we used for simplification. After removing the
redundant term, the function is expressed by F = x1x2′ + x1x3 + x2′x3. It can be seen that
this function is expressed as a majority function, which can be realized with only one majority
gate in one level, i.e., F = M(x1, x2′, x3). However, if the common terms are algebraically
factored out, i.e., F = (x2′ + x3)x1 + x2′x3, the function will have a different expression, which
can result in an equivalent majority expression with more than one majority gate and one
level. This specific example may not fall in this category due to its simplicity. However, this
case can occur especially while processing large circuits.
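The equivalence of the three forms of F in this example, and of the single-gate expression M(x1, x2′, x3), can be confirmed by enumeration. The sketch below is illustrative; the function names and the 0/1 encoding are not from the original.

```python
from itertools import product

def M(a: int, b: int, c: int) -> int:
    return (a & b) | (a & c) | (b & c)

def n(a: int) -> int:
    """Complement of a 0/1 value."""
    return 1 - a

def original(x1, x2, x3):
    return (x1 & n(x2)) | (x1 & x3) | (n(x2) & x3) | (x1 & n(x2) & n(x3))

def simplified(x1, x2, x3):
    return (x1 & n(x2)) | (x1 & x3) | (n(x2) & x3)

def factored(x1, x2, x3):
    return ((n(x2) | x3) & x1) | (n(x2) & x3)

for x1, x2, x3 in product((0, 1), repeat=3):
    single_gate = M(x1, n(x2), x3)
    assert original(x1, x2, x3) == simplified(x1, x2, x3) == factored(x1, x2, x3) == single_gate
```

All three forms describe the same function; only the simplified sum-of-products form exposes the majority structure directly.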
2.4.2: Decomposition
For decomposition, MALS uses method 1 in Fig. 2.17 to decompose the input network
into smaller nodes such that each node in the network has at most three input variables
and can be easily converted into its equivalent majority expression. In Kong’s method,
the decomposition process is also used to decompose the input networks into three-feasible
networks. However, this method uses four different decomposition methods as given in
Fig. 2.17. Any function with three input variables can be realized with at most four majority
gates in two levels [48, 65]. Thus, the total number of majority gates in the synthesized
majority network is between the number of nodes and the number of nodes multiplied by 4.
Therefore, in order to reduce the number of majority gates in a synthesized majority network,
the number of nodes must be reduced. None of the four decomposition methods used in
Kong’s method gives the minimum number of nodes for all cases. Therefore, all four methods
are applied to find the best result. In MLUT, the same four decomposition methods
are used. However, these methods are improved to decompose the input network into two-
feasible, three-feasible, and four-feasible networks. Based on the obtained networks from the
four decomposition methods, the best solution is then chosen. The obtained decomposed
networks from the four methods are not guaranteed to be optimal. However, they provide a
fundamental library of heuristic techniques for decomposition. In both Kong’s method and
MLUT, all nodes in the obtained decomposed networks are then checked to see if there are
nodes that can be collapsed into their fanouts while retaining feasibility. This can provide
further reduction in the number of gates, number of levels, number of inverters, and gate
inputs. However, this process is not considered in MALS.
2.4.3: Converting Boolean Functions into Majority Expressions
For converting the decomposed networks into their equivalent majority expressions, each
method uses a different technique. The MALS method uses K-map to obtain one-level
majority functions f1, f2, and f3 for each node, such that the function can be represented
as M(f1, f2, f3). This method can generate only one admissible majority expression for a
given Boolean function. This is considered as a drawback for this method. Therefore, this
technique does not guarantee that it results in optimal majority expressions. In Kong’s
method, the process of converting the function of a node is based on forty optimal majority
expressions. If the Boolean function belongs to these forty expressions, it is converted into
its corresponding majority expression. Otherwise, all admissible three-expression groups
from the forty expressions are found such that the function of the node can be represented
as a majority function of the three expressions. This conversion technique is also used in
MLUT, but MLUT is based on ninety primitive functions instead of the forty used
in Kong’s method. These primitives are the equivalent majority expressions for all possible
four-variable Boolean functions. Each node in the decomposed network is replaced with a
majority function if it has a corresponding expression. Otherwise, a combination of three
majority expressions is chosen from the ninety expressions such that the function of the node
can be represented as the majority function of the chosen three expressions.
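The conversion step described above can be sketched as a truth-table search. This is a minimal illustration under stated assumptions, not the procedure of Kong’s method or MLUT: the primitive library built here (every function realizable with at most one majority gate over constants and literals of three variables) is a stand-in for the forty or ninety primitives, and the names `build_primitives` and `convert` are hypothetical.

```python
from itertools import combinations_with_replacement, product

# All eight input rows (x1, x2, x3) of a three-variable truth table.
ROWS = list(product((0, 1), repeat=3))


def maj(a, b, c):
    """Three-input majority function."""
    return (a & b) | (a & c) | (b & c)


def build_primitives():
    """Map truth table -> expression for every function realizable with at
    most one majority gate over constants and literals (illustrative library,
    an assumption of this sketch)."""
    leaves = {"0": tuple(0 for _ in ROWS), "1": tuple(1 for _ in ROWS)}
    for i in range(3):
        leaves[f"x{i + 1}"] = tuple(r[i] for r in ROWS)
        leaves[f"x{i + 1}'"] = tuple(1 - r[i] for r in ROWS)
    prims = {tt: name for name, tt in leaves.items()}
    for (n1, t1), (n2, t2), (n3, t3) in combinations_with_replacement(leaves.items(), 3):
        tt = tuple(maj(a, b, c) for a, b, c in zip(t1, t2, t3))
        prims.setdefault(tt, f"M({n1}, {n2}, {n3})")
    return prims


def convert(target, prims):
    """Return a majority expression for `target`: the primitive itself if the
    function is in the library, otherwise the first admissible group of three
    primitives whose majority equals `target`, or None if there is none."""
    if target in prims:
        return prims[target]
    for (t1, e1), (t2, e2), (t3, e3) in combinations_with_replacement(prims.items(), 3):
        if tuple(maj(a, b, c) for a, b, c in zip(t1, t2, t3)) == target:
            return f"M({e1}, {e2}, {e3})"
    return None


prims = build_primitives()
xor3 = tuple(a ^ b ^ c for a, b, c in ROWS)  # not realizable with a single gate
expr = convert(xor3, prims)
print(expr)
```

The sketch returns the first admissible group it finds; a synthesis method would instead enumerate all admissible groups and rank them by gates, levels, inverters, and gate inputs.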
2.4.4: Optimization Targets
Since the gate count and level count determine the size and the latency of a
majority/minority circuit, they are the most important factors in enhancing
performance. Therefore, by reducing the number of gates and the number of levels,
the performance can be improved. In the three comprehensive synthesis methods (MALS,
Kong’s, and MLUT), the optimization is targeted to reduce either the number of gates or
levels. In MALS and Kong’s method, the gate count reduction is taken as the first priority
for optimization. However, in MLUT, either the number of gates or levels can be taken
as the first priority. In addition to the number of gates and level count, there are other
factors that can play an essential role in providing further scaling down of feature sizes of a
generated majority circuit. One of these factors is inverter count. In some nanotechnologies,
the implementation of an inverter requires a larger area than a majority gate. For example,
in QCA technology, the implementation of an inverter requires seven QCA cells, whereas
five QCA cells are required to implement a majority gate. Another factor that can lead to
additional optimization is the number of gate inputs. By reducing the number of gate inputs
of a circuit, the routing complexity can be reduced. For example, consider the majority
functions F1 = M(x3, 1,M(x1, x2, x3)) and F2 = M(x3, 1,M(x1, x2, 0)). Both F1 and F2 are
equivalent majority expressions to F = x1x2 + x3. However, F1 has four gate inputs and
F2 has three gate inputs. Since the logic 0 and 1 in QCA can be generated from external
sources at their positions, the best expression to implement this circuit in QCA is F2. The
optimization of inverters and gate inputs is considered only in Kong’s method and MLUT.
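The equivalence of F1 and F2 can be checked exhaustively. The sketch below encodes each majority expression as a nested tuple (an encoding chosen here for illustration, not taken from the original) and counts only the inputs that come from variables, which reproduces the four and three gate inputs quoted above.

```python
from itertools import product


def maj(a, b, c):
    """Three-input majority function."""
    return (a & b) | (a & c) | (b & c)


def evaluate(expr, env):
    """Evaluate a nested-tuple majority expression; a tuple is one gate."""
    if isinstance(expr, tuple):
        return maj(*(evaluate(e, env) for e in expr))
    if isinstance(expr, str):
        return env[expr]
    return expr  # constant 0 or 1


def variable_inputs(expr):
    """Count gate inputs driven by variables (constants are excluded)."""
    if isinstance(expr, tuple):
        return sum(variable_inputs(e) for e in expr)
    return 1 if isinstance(expr, str) else 0


F1 = ("x3", 1, ("x1", "x2", "x3"))  # M(x3, 1, M(x1, x2, x3))
F2 = ("x3", 1, ("x1", "x2", 0))     # M(x3, 1, M(x1, x2, 0))

envs = [dict(zip(("x1", "x2", "x3"), bits)) for bits in product((0, 1), repeat=3)]
equivalent = all(
    evaluate(F1, e) == evaluate(F2, e) == ((e["x1"] & e["x2"]) | e["x3"])
    for e in envs
)
print(equivalent, variable_inputs(F1), variable_inputs(F2))  # True 4 3
```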
2.4.5: Comparison of Experimental Results
In this section, we present an overall comparison of the results obtained
from the existing synthesis methods. Table 2.3 gives the equivalent majority
expressions obtained for eight standard three-variable Boolean functions [49] using the
comprehensive synthesis methods discussed in this chapter and five other three-variable
synthesis methods [49–53]. The same table also gives the numbers of majority gates, levels,
inverters, and gate inputs used in each majority expression. From the table, it
can be seen that MALS results in an optimal solution for some functions, whereas Kong’s
method and MLUT give the optimal expressions in terms of gates, levels, and inverters
for all eight functions. However, none of these methods results in the minimum
number of gate inputs for all functions. For example, the equivalent majority expression for
the Boolean function F = x1x2 + x1′x2′x3 obtained from Kong’s method and MLUT is
F = M(M(x1, 0, x2), M(x1, 1, x2)′, M(x1, 1, x3)). This expression requires four gates, two
levels, one inverter, and nine gate inputs. From the AND/OR mapping method and the methods
in [49], [52], [54], the obtained majority expression is
F = M(M(M(x1′, 0, x2′), 0, x3), 1, M(x1, 0, x2)). As listed in Table 2.3, this expression
requires four gates, three levels, two inverters, and eight gate inputs.
For Boolean functions with more than three variables, we compare the results of 40
Microelectronics Center of North Carolina (MCNC) benchmark circuits [66] using the comprehensive
synthesis methods in Table 2.4. The results obtained from each of the three methods are
compared with the majority AND/OR mapping method. In this table, only the number of
majority gates and levels are considered for comparison. As shown in the table, when the
MLUT method is targeted to reduce the number of gates, there is an average reduction of
36.1% in gate counts as well as 6.9% in level counts, whereas Kong’s method and MALS
have an average reduction of 31.8% and 22.0% in the number of gates, respectively. When
the MLUT method is targeted to optimize the level counts, there is an average reduction
of 11.1% in the number of levels as well as 34.3% in the number of gates, whereas Kong’s
method and MALS have an average reduction of 5.7% and -7.5% in level counts, respectively.
On average, all methods reduce both the gate and level counts relative to the AND/OR
mapping method, except MALS, which increases the level count on average. Even though
the MLUT method gives the highest average reduction in gate and level counts, it does
not produce the optimal majority network for every circuit. For example, the
majority network obtained for the benchmark circuit cm152a using MLUT, whether targeted
at majority gates or at levels, requires six levels, whereas Table 2.4 shows that it can be
realized with five levels, as obtained from MALS and the AND/OR mapping method.
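The reduction percentages in Table 2.4 are consistent with a simple relative difference against the AND/OR mapping baseline, where a negative value means the method is worse than the baseline. A minimal sketch, using the cm82a row for MALS as a check (the function name `reduction_percent` is an illustrative choice):

```python
def reduction_percent(baseline, value):
    """Percentage reduction of `value` relative to `baseline`.

    A negative result means `value` is larger (worse) than the baseline.
    """
    return 100.0 * (baseline - value) / baseline


# cm82a: AND/OR mapping uses 50 gates and 7 levels; MALS uses 16 gates and 8 levels.
gates = reduction_percent(50, 16)
levels = reduction_percent(7, 8)
print(f"{gates:.1f}% {levels:.1f}%")  # 68.0% -14.3%
```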
Table 2.3: Comparison of 8 standard three-variable Boolean functions using different majority synthesis methods
Method | Majority expression | Gates | Levels | Inverters | Gate inputs
Standard function: F = x1x2 + x1′x2′x3
AND/OR mapping | M(M(M(x1′, 0, x2′), 0, x3), 1, M(x1, 0, x2)) | 4 | 3 | 2 | 8
[49] [52] [54] | M(M(M(x1′, 0, x2′), 0, x3), 1, M(x1, 0, x2)) | 4 | 3 | 2 | 8
[50] | M(M(x1′, 1, x2), M(x1, 0, x2), M(x2′, 0, x3)) | 4 | 2 | 2 | 9
[51] | M(M(x1, 0, x2), M(x1, 1, x2)′, M(x1, 1, x3)) | 4 | 2 | 1 | 9
[53] | M(M(x1′, 1, x2), M(x1, x2′, x3), M(x1, 0, x2)) | 4 | 2 | 2 | 10
[55] [60] | M(M(x1, 0, x2), M(x1, 1, x2)′, M(x1, 1, x3)) | 4 | 2 | 1 | 9
Standard function: F = x1x2x3 + x1′x2x3′ + x1x2′x3′
AND/OR mapping | M(M(M(M(x1, 0, x2), 0, x3), 0, M(M(x1′, 0, x2), 0, x3′)), 0, M(M(x1, 0, x2′), 0, x3′)) | 8 | 4 | 4 | 16
[49] [50] | M(M(x1, 0, x3), M(x1, x2, x3′), M(x1′, x2′, x3′)) | 4 | 2 | 4 | 11
[51] | M(M(M(x1, x2, x3), 1, x1)′, x3, M(x1, x2, x3′)) | 4 | 3 | 2 | 11
[52] [54] | M(M(x1, x2, x3′), M(x1′, 0, x3′), M(x1, x2′, x3)) | 4 | 2 | 4 | 11
[53] | M(M(x1, x2, x3′), M(x1, x2′, x3), M(x1′, 0, x2)) | 4 | 2 | 3 | 11
[55] [60] | M(M(x1, 0, x3), M(x1, x2, x3)′, M(x1, x2, x3′)) | 4 | 2 | 2 | 11
Standard function: F = x1 + x2x3
AND/OR mapping | M(M(x2, 0, x3), 1, x1) | 2 | 2 | 0 | 4
[49] [50] [52] [53] | M(M(x2, 0, x3), 1, x1) | 2 | 2 | 0 | 4
[51] | M(M(x1, 1, x3), x1, x2) | 2 | 2 | 0 | 5
[54] [55] [60] | M(M(x2, 0, x3), 1, x1) | 2 | 2 | 0 | 4
Standard function: F = x1x2 + x2′x3
AND/OR mapping | M(M(x1, 0, x2), 1, M(x2′, 0, x3)) | 3 | 2 | 1 | 6
[49] [50] [52] [53] | M(M(x1, 0, x2), 1, M(x2′, 0, x3)) | 3 | 2 | 1 | 6
[51] | M(M(x1, 1, x2)′, x1, M(x2, 1, x3)) | 3 | 2 | 1 | 7
[54] [55] [60] | M(M(x1, 0, x2), 1, M(x2′, 0, x3)) | 3 | 2 | 1 | 6
Standard function: F = x1x2x3 + x1′x2′x3′
AND/OR mapping | M(M(M(x1, 0, x2), 0, x3), 1, M(M(x1′, 0, x2′), 0, x3′)) | 5 | 3 | 3 | 10
[49] [52] [54] | M(M(M(x1, 0, x2), 0, x3), 1, M(M(x1′, 0, x2′), 0, x3′)) | 5 | 3 | 3 | 10
[50] [53] | M(M(x1′, 1, x2), M(x2′, 0, x3′), M(x1, 0, x3)) | 4 | 2 | 3 | 9
[51] | M(M(x1, 1, x2)′, M(x2, 0, x3), M(x1, 1, x3′)) | 4 | 2 | 2 | 9
[55] [60] | M(M(x1, 0, x2), M(x2, 1, x3)′, M(x1′, 1, x3)) | 4 | 2 | 2 | 9
Standard function: F = x1x2 + x2x3 + x1′x2′x3′
AND/OR mapping | M(M(M(x1, 0, x2), 1, M(x2, 0, x3)), 1, M(M(x1′, 0, x2′), 0, x3′)) | 6 | 3 | 3 | 12
[49] [50] [53] | M(M(M(x1, 1, x3), 0, x2), 1, M(x1′, 0, M(x2′, 0, x3′))) | 5 | 3 | 3 | 10
[51] | M(M(x1, x2, x3), M(x1′, 1, x2′), M(x2′, 0, x3′)) | 4 | 2 | 3 | 10
[52] [54] | M(M(x1, x2, x3), M(x1′, 0, x2′), M(x2, 1, x3′)) | 4 | 2 | 3 | 10
[55] [60] | M(M(x1′, 1, x2), M(x2, 1, x3)′, M(x1, x2, x3)) | 4 | 2 | 2 | 10
Standard function: F = x1x2x3 + x1′x2′x3 + x1x2′x3′ + x1′x2x3′
AND/OR mapping | M(M(M(M(x1, 0, x2), 0, x3), 1, M(M(x1′, 0, x2′), 0, x3)), 1, M(M(M(x1, 0, x2′), 0, x3′), 1, M(M(x1′, 0, x2), 0, x3′))) | 11 | 5 | 6 | 22
[49] [50] [53] | M(M(x1′, x2, x3), x3′, M(x1, x2′, x3)) | 3 | 2 | 3 | 9
[51] | M(M(x1, x2, x3)′, x3, M(x1, x2, x3′)) | 3 | 2 | 2 | 9
[52] [54] | M(M(x1′, x2, x3), M(x1, x2, x3′), M(x1, x2′, x3)) | 4 | 2 | 3 | 12
[55] [60] | M(M(x1, x2, x3)′, x1, M(x1′, x2, x3)) | 3 | 2 | 2 | 9
Standard function: F = x1x2x3 + x1x2′x3′
AND/OR mapping | M(M(M(x1, 0, x2), 0, x3), 1, M(M(x1, 0, x2′), 0, x3′)) | 5 | 3 | 2 | 10
[49] [50] [53] | M(M(x1, x2, x3′), 0, M(x1, x2′, x3)) | 3 | 2 | 2 | 8
[51] | M(M(x2, 0, x3), x1, M(x2, 1, x3)′) | 3 | 2 | 1 | 7
[52] [54] | M(M(M(x2′, 0, x3′), 1, M(x2, 0, x3)), 0, x1) | 4 | 3 | 2 | 8
[55] [60] | M(M(x2, 0, x3), x1, M(x2, 1, x3)′) | 3 | 2 | 1 | 7
Table 2.4: Comparison of 40 benchmarks using the best comprehensive synthesis methods
Benchmark | AND/OR mapping: Gates Levels | MALS: Gates Levels | Kong’s: Gates Levels | MLUT (gate priority): Gates Levels | MLUT (level priority): Levels Gates | Reduction% MALS: Gates Levels | Reduction% Kong’s: Gates Levels | Reduction% MLUT (Gate): Gates Levels | Reduction% MLUT (Level): Levels Gates
b1 | 9 3 | 9 3 | 7 2 | 6 2 | 2 6 | 0.0% 0.0% | 22.2% 33.3% | 33.3% 33.3% | 33.3% 33.3%
cm42a | 21 2 | 21 2 | 18 2 | 18 2 | 2 18 | 0.0% 0.0% | 14.3% 0.0% | 14.3% 0.0% | 0.0% 14.3%
decod | 28 3 | 28 3 | 28 3 | 28 3 | 3 28 | 0.0% 0.0% | 0.0% 0.0% | 0.0% 0.0% | 0.0% 0.0%
cm82a | 50 7 | 16 8 | 7 3 | 6 3 | 3 6 | 68.0% -14.3% | 86.0% 57.1% | 88.0% 57.1% | 57.1% 88.0%
majority | 12 5 | 6 5 | 6 4 | 5 4 | 3 6 | 50.0% 0.0% | 50.0% 20.0% | 58.3% 20.0% | 40.0% 50.0%
z4ml | 71 10 | 27 8 | 9 4 | 9 4 | 4 9 | 62.0% 20.0% | 87.3% 60.0% | 87.3% 60.0% | 60.0% 87.3%
9symml | 276 15 | 216 12 | 47 10 | 45 12 | 10 47 | 21.7% 20.0% | 83.0% 33.3% | 84.0% 20.0% | 33.3% 73.2%
ldd | 91 9 | 73 13 | 67 7 | 67 7 | 7 67 | 19.8% -44.4% | 26.4% 22.2% | 26.4% 22.2% | 22.2% 26.4%
alu2 | 495 15 | 354 18 | 340 18 | 329 18 | 16 347 | 28.5% -20.0% | 31.3% -20.0% | 33.5% -20.0% | -6.7% 29.9%
x2 | 49 6 | 42 8 | 37 7 | 34 7 | 6 36 | 14.3% -33.3% | 24.5% -16.7% | 30.6% -16.6% | 0.0% 26.5%
cm152a | 31 5 | 21 5 | 21 6 | 17 6 | 6 17 | 32.3% 0.0% | 32.3% -20.0% | 45.2% -20.0% | -20.0% 45.2%
cm85a | 80 10 | 34 10 | 26 6 | 19 6 | 6 19 | 57.5% 0.0% | 67.5% 40.0% | 76.3% 40.0% | 40.0% 76.3%
cm151a | 56 8 | 42 8 | 23 7 | 20 7 | 7 20 | 25.0% 0.0% | 58.9% 12.5% | 64.3% 12.5% | 12.5% 64.3%
cm162a | 57 7 | 46 9 | 41 7 | 36 9 | 7 41 | 19.3% -28.6% | 28.1% 0.0% | 36.8% -28.6% | 0.0% 28.1%
cu | 61 8 | 46 7 | 40 7 | 39 7 | 6 40 | 24.6% 12.5% | 34.4% 12.5% | 36.1% 12.5% | 25.0% 34.4%
cm163a | 52 7 | 42 9 | 38 7 | 32 7 | 7 32 | 19.2% -28.6% | 26.9% 0.0% | 38.5% 0.0% | 0.0% 38.5%
cmb | 44 4 | 44 5 | 28 4 | 26 4 | 4 26 | 0.0% -25.0% | 36.4% 0.0% | 40.1% 0.0% | 0.0% 40.1%
pm1 | 49 6 | 45 7 | 35 6 | 32 6 | 6 32 | 8.2% -16.7% | 28.6% 0.0% | 34.7% 0.0% | 0.0% 34.7%
tcon | 24 2 | 24 2 | 24 2 | 24 2 | 2 24 | 0.0% 0.0% | 0.0% 0.0% | 0.0% 0.0% | 0.0% 0.0%
vda | 856 13 | 738 14 | 700 15 | 670 14 | 14 670 | 13.8% -7.7% | 18.2% -15.4% | 21.7% -7.7% | -7.7% 21.7%
pcle | 78 9 | 67 8 | 62 8 | 62 8 | 8 62 | 14.1% 11.1% | 20.5% 11.1% | 20.5% 11.1% | 11.1% 20.5%
sct | 86 7 | 72 10 | 65 6 | 65 6 | 6 65 | 16.3% -42.9% | 24.4% 14.3% | 24.4% 14.3% | 14.3% 24.4%
cc | 49 5 | 44 5 | 43 5 | 43 5 | 5 43 | 10.2% 0.0% | 12.2% 0.0% | 12.2% 0.0% | 0.0% 12.2%
cm150a | 54 8 | 46 8 | 46 9 | 37 6 | 6 37 | 14.8% 0.0% | 14.8% -12.5% | 31.5% 25.0% | 25.0% 31.5%
mux | 55 7 | 46 7 | 46 9 | 37 6 | 6 37 | 16.4% 0.0% | 16.6% -28.6% | 32.7% 14.3% | 14.3% 32.7%
ttt2 | 187 11 | 154 11 | 145 11 | 144 10 | 10 144 | 17.6% 0.0% | 17.6% 0.0% | 22.5% 9.0% | 9.0% 23.0%
i1 | 54 7 | 41 8 | 36 6 | 35 6 | 6 35 | 24.0% -14.3% | 33.3% 14.3% | 35.2% 14.3% | 14.3% 35.2%
lal | 123 7 | 95 9 | 82 8 | 64 9 | 8 82 | 22.8% -28.6% | 33.3% -14.3% | 48.0% -28.6% | -14.3% 33.3%
pcler8 | 107 11 | 90 8 | 80 9 | 80 9 | 8 90 | 15.9% 27.3% | 25.2% 18.2% | 25.2% 18.2% | 27.3% 15.9%
frg1 | 196 17 | 111 23 | 105 18 | 102 17 | 17 102 | 43.4% -35.3% | 46.4% -5.9% | 48.0% 0.0% | 0.0% 48.0%
c8 | 124 8 | 115 8 | 112 7 | 108 8 | 7 112 | 10.2% 0.0% | 12.5% 12.5% | 15.6% 0.0% | 12.5% 12.5%
term1 | 352 12 | 174 16 | 106 11 | 89 10 | 10 89 | 50.1% -33.3% | 69.9% 8.3% | 74.7% 16.7% | 16.7% 74.7%
unreg | 84 5 | 84 4 | 84 5 | 84 5 | 5 84 | 0.0% 20.0% | 0.0% 0.0% | 0.0% 0.0% | 0.0% 0.0%
k2 | 1602 18 | 1313 19 | 1301 19 | 1193 19 | 19 1193 | 18.0% -5.6% | 18.8% -5.6% | 25.5% -5.6% | -5.6% 25.5%
cht | 121 4 | 120 4 | 120 4 | 120 4 | 4 120 | 0.8% 0.0% | 0.8% 0.0% | 0.8% 0.0% | 0.0% 0.8%
x1 | 573 12 | 320 13 | 264 11 | 253 11 | 11 253 | 44.2% -8.3% | 53.9% 8.3% | 55.8% 8.3% | 8.3% 55.8%
example2 | 285 9 | 259 9 | 247 10 | 241 9 | 9 241 | 9.1% 0.0% | 13.3% -11.1% | 15.4% 0.0% | 0.0% 15.4%
apex6 | 984 16 | 701 14 | 662 17 | 662 17 | 15 677 | 28.8% 12.5% | 32.7% -6.3% | 32.7% -6.3% | 6.3% 31.2%
frg2 | 759 14 | 672 15 | 582 14 | 568 15 | 13 600 | 11.5% -7.1% | 23.3% 0.0% | 25.2% -7.1% | 7.1% 20.9%
i2 | 395 14 | 209 18 | 209 13 | 209 13 | 13 209 | 47.0% -28.6% | 47.0% 7.1% | 47.0% 7.1% | 7.1% 47.0%
Average reduction% | | | | | | 22.0% -7.5% | 31.8% 5.7% | 36.1% 6.9% | 11.1% 34.3%
As a result, it can be observed from Tables 2.3 and 2.4 that none of the comprehensive
synthesis methods can generate the optimal majority/minority logic networks in terms of
all optimization factors for all cases. However, some of these methods can result in best
solutions in terms of some optimization factors for three-variable or multi-variable Boolean
functions. Table 2.5 shows the capability of each synthesis method to optimize gates, levels,
inverters and gate inputs for all cases of three-variable and multi-variable Boolean functions.
From the table, it can be seen that Kong’s method and MLUT can generate the optimal
majority networks in terms of gates, levels, and inverters for all cases of three-feasible networks.
However, none of these methods can generate the optimal majority networks in terms of gate
inputs for all cases. For Boolean functions with more than three variables, only the MLUT
method can synthesize the optimized majority networks in terms of gates and inverters for
all cases. Although these results are the best compared to other methods, they are not
guaranteed to be optimal. For levels and gate inputs, none of the synthesis methods can
give the optimal solutions in terms of these factors for all cases of multi-variable Boolean
functions.
Table 2.5: Optimization capability analysis of the best comprehensive synthesis methods
Method | Three-variable: Gates | Levels | Inverters | Gate inputs | Multi-variable: Gates | Levels | Inverters | Gate inputs
MALS [54] | No | No | No | No | No | No | No | No
Kong’s [55] | Yes | Yes | Yes | No | No | No | No | No
MLUT [60] | Yes | Yes | Yes | No | Yes | No | Yes | No
Even though these methods result in the best majority networks in terms of some
optimization factors for all cases, these networks are not guaranteed to be optimal, especially
when synthesizing multi-output Boolean functions. None of the three synthesis methods
considers the selection of the optimal majority network for multi-output Boolean functions,
which is a serious drawback. All three methods synthesize the equivalent majority
expression for each output of a multi-output Boolean function separately, so the expression
obtained for each output can be optimal in terms of all optimization factors for that output.
However, the final majority network realized from these expressions is guaranteed to be
optimal only in terms of levels, since its level count is the maximum number of levels
used in these expressions. The final network is not always optimal in terms of the number
of majority gates, inverters, and gate inputs. In other words, the
number of gates, inverters, and gate inputs used in a majority network obtained from one of
these methods for a multi-output Boolean network can be further reduced. To clarify this
point, consider a Boolean network N with two outputs, i.e., F = x1x2x3 + x1′x2x3′ + x1x2′x3′
and G = x1x2x3 + x1′x2′x3 + x1x2′x3′ + x1′x2x3′ (the functions listed with these
expressions in Table 2.3). For the output F, one of its equivalent optimal majority
expressions is F = M(M(x1, 0, x3), M(x1, x2, x3)′, M(x1, x2, x3′)). For the output G, two of its
equivalent optimal majority expressions are G1 = M(M(x1, x2, x3)′, x3, M(x1, x2, x3′)) and
G2 = M(M(x1, x2, x3)′, x1, M(x1′, x2, x3)). It can be noticed that both majority expressions
for the output G have the same number of gates, levels, inverters, and gate inputs as 3, 2, 2,
and 9, respectively. The final majority network for N can be realized by selecting either
the pair (F, G1) or the pair (F, G2). However, these networks differ in some
optimization factors because G1 shares two gates with F, whereas G2 shares only one.
The network (F, G1) has 5 gates, 2 levels, 2 inverters,
and 14 gate inputs, whereas the network (F, G2) has 6 gates, 2 levels, 3 inverters,
and 16 gate inputs. The number of levels is the only factor that does not change, and
the first network (F, G1) has the minimum
number of gates, inverters, and gate inputs. Therefore, the best solution for network N is
(F, G1). Consequently, selecting among equivalent expressions is an important step that can
provide further reduction and give better results in terms of different optimization factors.
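The effect of sharing gates between outputs can be sketched by collecting the distinct gates of a multi-output network. The code below is an illustration under stated assumptions, not part of any of the three synthesis methods: expressions are encoded as nested tuples, structurally identical gates are merged, and an inverter is counted once per distinct inverted signal. Gate inputs are left out of the sketch. The helper names (`M`, `NOT`, `network_cost`) are hypothetical.

```python
def M(a, b, c):
    """Majority gate node."""
    return ("M", a, b, c)


def NOT(e):
    """Inverter node."""
    return ("NOT", e)


def collect(expr, gates, inverted):
    """Gather the distinct gates and distinct inverted signals of `expr`."""
    if isinstance(expr, tuple):
        if expr[0] == "NOT":
            inverted.add(expr[1])
            collect(expr[1], gates, inverted)
        else:
            gates.add(expr)
            for child in expr[1:]:
                collect(child, gates, inverted)


def levels(expr):
    """Number of majority-gate levels on the longest path of `expr`."""
    if not isinstance(expr, tuple):
        return 0
    if expr[0] == "NOT":
        return levels(expr[1])
    return 1 + max(levels(child) for child in expr[1:])


def network_cost(outputs):
    """Return (gates, levels, inverters) for a multi-output network."""
    gates, inverted = set(), set()
    for expr in outputs:
        collect(expr, gates, inverted)
    return len(gates), max(levels(e) for e in outputs), len(inverted)


F = M(M("x1", 0, "x3"), NOT(M("x1", "x2", "x3")), M("x1", "x2", NOT("x3")))
G1 = M(NOT(M("x1", "x2", "x3")), "x3", M("x1", "x2", NOT("x3")))
G2 = M(NOT(M("x1", "x2", "x3")), "x1", M(NOT("x1"), "x2", "x3"))

print(network_cost([F, G1]))  # (5, 2, 2)
print(network_cost([F, G2]))  # (6, 2, 3)
```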
As discussed earlier, the characteristics of each nanotechnology and the implementation of
its logic devices affect the optimization priorities given to different factors such
as gates, levels, and inverters. A majority/minority logic network generated by the
existing synthesis methods is therefore not guaranteed to be the best solution for all
nanotechnologies, and there is a strong need for an efficient majority/minority logic synthesis
method that can result in optimal networks in terms of all optimization factors for any
majority/minority-based nanotechnology.
2.5: Conclusion
Due to the physical limitations of CMOS technology, many emerging nanoscale technologies
such as quantum-dot cellular automata (QCA), single electron tunneling (SET), tunneling
phase logic (TPL), and spintronic devices have been proposed and considered as possible
replacements for CMOS. CMOS technology uses NAND, NOR, and NOT logic
gates to implement circuits. However, in post-CMOS nanotechnologies, majority and/or
minority gates are the fundamental logic units used to implement Boolean functions. In
this chapter, we gave the background information about QCA technology and its circuit
primitives, together with an overview of the majority/minority-based nanotechnologies and
the implementation of their logic devices. We also gave a comprehensive review
of majority/minority logic network synthesis methods that are capable of synthesizing multi-
input multi-output Boolean functions. Each of these methods is discussed in detail. We also
compared and discussed the results of eight standard three-variable Boolean functions and
40 MCNC benchmarks obtained from these methods based on different optimization factors
such as the number of gates, the number of levels, etc. From this comparison, we observed
that the existing techniques can give sub-optimal solutions. However, none of these methods
results in the optimal majority/minority logic networks in terms of all optimization factors
for all cases.
CHAPTER 3: DESIGNS OF VARIOUS FUNDAMENTAL ARITHMETIC CELLS IN QCA
3.1: Introduction
Considerable interest has been shown during the last decade in the design of various QCA
arithmetic circuits. Several researchers have proposed different QCA arithmetic circuits such
as adder [67, 68], multiplier [69–72], divider [73–78], squarer [79], square-rooting [80], and
multi-operation arrays [81–83]. These QCA circuits are developed based on the equivalent
majority-based logic circuits of their basic cells. In this chapter, we propose QCA designs
of nine basic arithmetic cells based on their equivalent majority/XOR-based circuits. These
circuits are developed using the fundamental logic devices in QCA together with a QCA
structure of the three-input XOR function. This approach reduces several characteristics
of the QCA designs, which can improve the performance of the circuits. The proposed
circuits give better results in cell count, area, and latency than
the circuits designed using only majority gates. In addition, the circuits designed using
both majority and XOR gates can provide a further reduction in cost. The circuits discussed
in this chapter include the basic cell of adder, multiplier, restoring and non-restoring divider,
squarer, square-rooting, and multi-operation arrays. A detailed comparison of these designs
with respect to different characteristics is also presented.
3.2: Fundamental Arithmetic Cells
In this section, we discuss the logic functions of the nine basic arithmetic cells.
3.2.1: Adder
The basic cell in the adder array is a full adder circuit. This cell consists of three inputs
(a_i, b_i, c_i) and two outputs: s_i for the sum and c_o for the carry [67, 84]. The functions of
these two outputs can be expressed by
s_i = a_i \oplus b_i \oplus c_i
c_o = a_i b_i + a_i c_i + b_i c_i        (3.1)
3.2.2: Multiplier
The basic multiplier cell consists of four inputs (s_{i,j-1}, a_i, b_j, c_{i-1,j}) and two outputs
(s_{i,j}, c_{i,j}) [70]. The structure of this cell is basically a full adder. The inputs of this full adder
are s_{i,j-1}, c_{i-1,j}, and the output of the AND operation of a_i and b_j. The outputs of the adder
are s_{i,j} and c_{i,j} for the sum and carry, respectively. The functions of this cell are expressed as
s_{i,j} = s_{i,j-1} \oplus (a_i b_j) \oplus c_{i-1,j}
c_{i,j} = s_{i,j-1} (a_i b_j) + s_{i,j-1} c_{i-1,j} + (a_i b_j) c_{i-1,j}        (3.2)
3.2.3: Multiplier (High-Speed Arithmetic Array)
Another multiplier array is proposed in [85]. This array is based on a multiplier cell that
also consists of four inputs (a, b, p, c_1) and two outputs (s, c_2). The functions of this cell are
similar to those given in (3.2). The output s is the sum of a, c_1, and the output of the AND
function of b and p, and the output c_2 is the carry of these three inputs. These functions
can be expressed as
s = a \oplus c_1 \oplus (bp)
c_2 = a c_1 + (a + c_1) bp        (3.3)
3.2.4: Restoring Divider
The basic elements in the restoring divider cell are a full subtractor and a two-to-one
multiplexer. This cell consists of four inputs (z, d, c_{in}, P) and two outputs (cell_{out}, b_{out}).
Based on the control input P, the output cell_{out} is the difference output of the
subtractor if P = 1, or the input z if P = 0 [74]. This can be expressed by
cell_{out} = \begin{cases} z \oplus d \oplus c_{in} & \text{if } P = 1 \\ z & \text{otherwise} \end{cases}        (3.4)
For the output b_{out}, the signal is delivered directly as the borrow of the full subtractor [74].
This is expressed as
b_{out} = z'd + z'c_{in} + d c_{in}        (3.5)
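The behavior of the restoring divider cell in (3.4) and (3.5) can be checked with a small bit-level model. This is a sketch for illustration; the function name `restoring_divider_cell` is not from the original. The check uses the fact that a full subtractor satisfies z - d - c_in = difference - 2 * borrow.

```python
from itertools import product


def restoring_divider_cell(z, d, c_in, p):
    """Bit-level model of (3.4) and (3.5): full subtractor plus 2-to-1 mux."""
    difference = z ^ d ^ c_in
    cell_out = difference if p == 1 else z          # (3.4)
    not_z = 1 - z
    b_out = (not_z & d) | (not_z & c_in) | (d & c_in)  # (3.5)
    return cell_out, b_out


# With P = 1 the cell outputs the subtractor difference; with P = 0 it passes z.
subtract_ok = all(
    restoring_divider_cell(z, d, c, 1)[0] - 2 * restoring_divider_cell(z, d, c, 1)[1]
    == z - d - c
    for z, d, c in product((0, 1), repeat=3)
)
restore_ok = all(
    restoring_divider_cell(z, d, c, 0)[0] == z
    for z, d, c in product((0, 1), repeat=3)
)
print(subtract_ok, restore_ok)  # True True
```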
3.2.5: Non-Restoring Divider
Another technique for the division operation is non-restoring division [86]. In the non-restoring
array divider, the basic unit is a complement adder/subtractor (CAS) cell. This cell consists
of four inputs (A_i, B_i, P, and C_i) and two outputs (S_i and C_o). The Boolean expressions of
the CAS cell can be given as
S_i = A_i \oplus (B_i \oplus P) \oplus C_i
C_o = A_i (B_i \oplus P) + A_i C_i + (B_i \oplus P) C_i        (3.6)
From the expressions in (3.6), it can be noticed that the CAS cell is basically an XOR
function and a one-bit full adder, where the inputs of the full adder are A_i, C_i, and the
output of B_i \oplus P, and the outputs are S_i for the sum and C_o for the carry [76, 86]. If
the control signal P = 0, the output S_i is the sum of A_i, B_i, and C_i,
whereas the cell performs the subtraction of these inputs if the control input P = 1.
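A bit-level model makes the add/subtract behavior of the CAS cell concrete. The sketch below implements (3.6) directly; the ripple arrangement in `cas_ripple`, where P is also used as the carry into the least significant cell, is an assumption made here to show two's complement subtraction and is not a description of the array wiring in [86].

```python
def cas_cell(a, b, p, c):
    """One CAS cell as in (3.6): XOR on b, followed by a one-bit full adder."""
    t = b ^ p
    s = a ^ t ^ c
    c_out = (a & t) | (a & c) | (t & c)
    return s, c_out


def cas_ripple(a, b, p, width=4):
    """Chain `width` CAS cells; P doubles as the initial carry (assumption)."""
    carry, result = p, 0
    for i in range(width):
        s, carry = cas_cell((a >> i) & 1, (b >> i) & 1, p, carry)
        result |= s << i
    return result


print(cas_ripple(9, 3, 0))  # 12, i.e. (9 + 3) mod 16
print(cas_ripple(9, 3, 1))  # 6, i.e. (9 - 3) mod 16
```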
3.2.6: Divider (High-Speed Arithmetic Array)
Another divider array is introduced in [85]. In this array, the basic unit consists of five
inputs (a_i, b_i, c_i, q_j, and e_i) and five outputs (s_i, c_{i+1}, e_{i+1}, G_{i+1}, and P_{i+1}). In this cell, when
the signal q_j is 1, the input b_i will be added to the summation of a_i and c_i. The input c_i is
always included while computing the carry out (e_{i+1}) and the carry look-ahead values G_{i+1} and
P_{i+1}. The operations of this cell can be expressed by
s_i = (a_i \oplus c_i) \oplus (q_j b_i)
c_{i+1} = a_i c_i + (a_i + c_i) q_j b_i
e_{i+1} = a_i c_i + (a_i + c_i) b_i
G_{i+1} = (a_i \oplus b_i \oplus c_i) e_i
P_{i+1} = a_i \oplus b_i \oplus c_i + e_i
(3.7)
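The two carry expressions in (3.7) are three-input majority functions, which is consistent with each of them mapping onto a single majority gate. A short derivation, where t is a placeholder symbol introduced here and M denotes the majority function:

```latex
\begin{aligned}
a_i c_i + (a_i + c_i)\,t &= a_i c_i + a_i t + c_i t = M(a_i, c_i, t), \\
c_{i+1} &= M(a_i, c_i, q_j b_i) \quad (t = q_j b_i), \\
e_{i+1} &= M(a_i, c_i, b_i) \quad (t = b_i).
\end{aligned}
```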
3.2.7: Squarer
In the squarer array, the basic cell is a full adder [79]. This cell has three inputs (A_i, B_i, C_i)
and two outputs (S_i, C_o), where the output S_i is the sum and C_o is the carry. These functions
are expressed by
S_i = A_i \oplus B_i \oplus C_i
C_o = A_i B_i + A_i C_i + B_i C_i        (3.8)
3.2.8: Square-Rooting Cell
In the square-rooting array, the basic unit is a complement adder/subtractor (CAS) cell.
This cell is the same as that used in the non-restoring array divider. The inputs of this cell
are X, D, P, and C_{in}, and the outputs are R and C_{out}. The inputs of the full adder are X, C_{in},
and the output of D \oplus P, and the outputs are R for the sum and C_{out} for the carry [76, 86].
The cell performs the addition or subtraction of the three inputs X, D, and
C_{in} if the control input P = 0 or 1, respectively. These operations can be expressed as
R = X \oplus (D \oplus P) \oplus C_{in}
C_{out} = X (D \oplus P) + X C_{in} + (D \oplus P) C_{in}        (3.9)
3.2.9: Square/Square-Rooting Cell (High-Speed Arithmetic Array)
In [85], an array that can perform both squaring and square-rooting operations using the
same basic unit is proposed. This cell is similar
to the divider cell (high-speed arithmetic array). However, it has two more outputs, i.e., g_{i-1}
and h_{i-1}. This cell can perform three functions: addition, complement addition, and transfer
(or restore) based on the values of inputs b and d. The control signal x determines which
operation is performed, i.e., 0 for squaring and 1 for square-rooting. The logic expressions
of this cell can be given as
s_i = a_i \oplus c_i \oplus [r_j (b_i \oplus x)]
c_{i+1} = a_i c_i + (a_i + c_i) r_j (b_i \oplus x)
e_{i+1} = a_i c_i + (a_i + c_i)(b_i \oplus x)
G_{i+1} = (a_i \oplus b_i \oplus c_i \oplus x) e_i
P_{i+1} = a_i \oplus b_i \oplus c_i \oplus x + e_i
g_{i-1} = d_i (b_i + r_j)
h_{i-1} = b_i + d_i r_j
(3.10)
3.3: Design and Implementation
3.3.1: Majority-Based Circuits
As discussed earlier, the fundamental logic device in QCA is a majority gate as shown
in Fig. 3.1(a). Therefore, in order to implement a logic function in QCA, it has to be
converted into its equivalent majority-based circuit. The simplest way to convert a logic circuit
into its equivalent majority circuit is to express its Boolean function in one of
the two standard forms, Sum of Products (SOP) or Product of Sums (POS), and then map each
two-input AND gate and two-input OR gate to a majority gate by fixing the third input to 0 or
1, respectively. However, this technique is not efficient because the obtained majority
circuit is not guaranteed to have the minimum number of gates and levels. In QCA,
the numbers of gates and levels are the most important factors in
enhancing the performance of the circuit since they determine the area and latency.
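The AND/OR mapping described above can be sketched in a few lines: a two-input AND is a majority gate with the third input fixed to 0, and a two-input OR is a majority gate with the third input fixed to 1. The example function F = x1 + x2x3 and its mapped form M(M(x2, 0, x3), 1, x1) are taken from Table 2.3; the helper names are illustrative.

```python
from itertools import product


def maj(a, b, c):
    """Three-input majority function."""
    return (a & b) | (a & c) | (b & c)


def and2(a, b):
    """Two-input AND realized as M(a, b, 0)."""
    return maj(a, b, 0)


def or2(a, b):
    """Two-input OR realized as M(a, b, 1)."""
    return maj(a, b, 1)


def f_mapped(x1, x2, x3):
    """SOP form F = x1 + x2x3 mapped gate by gate: M(M(x2, 0, x3), 1, x1)."""
    return or2(and2(x2, x3), x1)


mapping_ok = all(
    f_mapped(x1, x2, x3) == (x1 | (x2 & x3))
    for x1, x2, x3 in product((0, 1), repeat=3)
)
print(mapping_ok)  # True
```

The mapping always works, but it uses one majority gate per AND or OR gate, which is why the resulting circuit is generally not minimal.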
Figure 3.1: (a) QCA majority gate. (b) QCA inverter.
When synthesizing the majority circuit for a Boolean function targeting one of the two
factors (gates or levels), the second factor may not be fully optimized. In other words, if
a majority circuit has the minimum number of gates, the number of levels in this circuit
may not be the minimum. Conversely, if the first priority is given to the number of
levels, the number of gates used in the circuit may not be the minimum. For example,
for the adder cell, when targeting to minimize the number of levels, the majority circuit
requires two levels, three majority gates, and two inverters as shown in Fig. 3.2(a). On
the other hand, when the priority is to minimize the number of gates, the circuit requires
three levels, three majority gates, and one inverter as shown in Fig. 3.2(b). In QCA, the
inverter requires seven cells as shown in Fig. 3.1(b), whereas the majority gate requires five
cells. In addition, the QCA inverter has a larger area than the majority structure.
Therefore, inverter optimization also plays a vital role in the synthesis process in
reducing the size and latency of the circuit. The majority circuits of all the fundamental
arithmetic units discussed in Section 3.2 have been synthesized by giving the first priority to
the number of gates and to the number of levels. However, we give only the minimal majority
circuit, i.e., the one that requires the fewest levels, gates, and inverters, for each
arithmetic cell. Figs. 3.2 and 3.3
show the majority-based circuits for the nine fundamental arithmetic units. Table 3.1 gives
the number of gates, levels, and inverters used in the equivalent majority-based circuits of
the fundamental arithmetic units.
Table 3.1: Specifications of majority circuits of the fundamental arithmetic units
Array cell | Gates | Levels | Inverters
Adder cell (level priority) | 3 | 2 | 2
Adder cell (gate priority) | 3 | 3 | 1
Multiplier cell | 4 | 3 | 2
Multiplier cell (High-speed) | 4 | 3 | 2
Restoring divider cell | 6 | 4 | 3
Non-restoring divider cell | 6 | 4 | 3
Divider cell (High-speed) | 9 | 3 | 3
Squarer cell | 3 | 2 | 2
Square-rooting cell | 6 | 4 | 3
Square/square-rooting cell (High-speed) | 19 | 5 | 5
Figure 3.2: Majority circuits of (a) Adder (level priority). (b) Adder (gate priority). (c)
Multiplier. (d) Multiplier (high-speed). (e) Restoring divider. (f) Non-restoring divider. (g)
Divider (high-speed). (h) Squarer.
Figure 3.3: Majority circuits of (a) Square rooting. (b) Square/square-root (high-speed).
3.3.2: Majority/XOR-Based Circuits
The design of QCA circuits has largely been based on the majority gate since it is the basic
logic device in QCA, and most of the QCA computing circuits proposed in the literature
are developed based on the QCA majority gate. Recently, a QCA structure of the
three-input XOR function has been proposed [87].
This structure is constructed using ten cells in a single layer as shown in Fig. 3.4. This
device can lead to better QCA circuits in terms of gates and levels compared to the
majority-based circuits. To clarify this point, consider the full adder cell discussed in the previous
section. The equivalent majority-based circuit of this cell requires two levels, three majority
gates, and two inverters when targeting to minimize the number of levels, and three levels,
three majority gates, and one inverter when the number of gates is taken as the first priority.
However, by designing this cell using both majority and the three-input XOR gate, the circuit
requires only one level and two gates as shown in Fig. 3.5(a). In addition, it can be seen that
the circuit does not require any inverter. Therefore, we have designed all other cells using
both majority and XOR gates as shown in Fig. 3.5 and 3.6. Table 3.2 gives the number
of majority gates, XOR gates, levels, and inverters for each equivalent majority/XOR-based
circuit of the arithmetic cells. From the table, it can be noted that when using both majority
and XOR gates, all the basic arithmetic cells can be realized with fewer gates,
levels, and inverters compared to the circuits designed using only majority gates.
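The full adder comparison above can be reproduced with a small model. The majority-only sum used here, M(M(a, b, c)', c, M(a, b, c')), is one known two-level realization with three gates and two inverters taken from Table 2.3; it is an assumption that it matches the circuit drawn in Fig. 3.2(a). The majority/XOR version uses one three-input XOR and one majority gate in a single level.

```python
from itertools import product


def maj(a, b, c):
    """Three-input majority function."""
    return (a & b) | (a & c) | (b & c)


def full_adder_majority_only(a, b, c):
    """Three majority gates, two levels, two inverters (carry' and c')."""
    carry = maj(a, b, c)
    total = maj(1 - carry, c, maj(a, b, 1 - c))
    return total, carry


def full_adder_majority_xor(a, b, c):
    """One three-input XOR gate and one majority gate, one level, no inverter."""
    return a ^ b ^ c, maj(a, b, c)


adders_agree = all(
    full_adder_majority_only(a, b, c) == full_adder_majority_xor(a, b, c)
    and sum(full_adder_majority_xor(a, b, c)[i] << i for i in range(2)) == a + b + c
    for a, b, c in product((0, 1), repeat=3)
)
print(adders_agree)  # True
```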
Figure 3.4: QCA structure of a three-input XOR gate.
Table 3.2: Specifications of majority/XOR-based circuits of the fundamental arithmetic units
Array cell | Majority gates | XOR gates | Levels | Inverters
Adder cell | 1 | 1 | 1 | 0
Multiplier cell | 2 | 1 | 2 | 0
Multiplier cell (High-speed) | 2 | 1 | 2 | 0
Restoring divider cell | 4 | 1 | 3 | 2
Non-restoring divider cell | 1 | 2 | 2 | 0
Divider cell (High-speed) | 5 | 2 | 2 | 0
Squarer cell | 1 | 1 | 1 | 0
Square-rooting cell | 1 | 2 | 2 | 0
Square/square-rooting cell (High-speed) | 8 | 4 | 3 | 0
3.3.3: QCA Designs
For the QCA designs of the fundamental arithmetic units, the equivalent majority-based
circuits given in Figs. 3.2 and 3.3 can be implemented directly in QCA. However,
the equivalent majority/XOR-based circuits require fewer gates, levels,
and inverters than the circuits developed using only majority gates, and therefore
give better QCA designs in terms of complexity and speed. In addition, the cost of these
Figure 3.5: Majority-XOR circuits of (a) Adder. (b) Multiplier. (c) Multiplier (high-speed). (d) Restoring divider. (e) Non-restoring divider. (f) Divider (high-speed). (g) Squarer. (h) Square rooting.
circuits will be less than that of the majority-based circuits, since the implementation cost of a
QCA circuit is determined by its area and latency [68]. The cost can be determined by
$$\text{Cost} = \text{area} \times \text{latency}^{2} \tag{3.11}$$
Figure 3.6: Majority-XOR circuit of the square/square-root cell (high-speed).
Therefore, the designs of the QCA arithmetic units are developed based on their equivalent
majority/XOR circuits given in Figs. 3.5 and 3.6.
In QCA, a single-layer design of a logic circuit uses, in most cases, fewer QCA cells than
a multi-layer design. However, because of the routing complexity of single-layer QCA
circuits, multi-layer designs can provide better delay and area. In addition, for large
circuits, multi-layer designs can require fewer QCA cells than single-layer circuits.
Therefore, the QCA designs in this research are developed using multiple layers. Fig. 3.7
shows the QCA designs of the nine fundamental arithmetic cells.
3.4: Results and Comparison
The QCA circuits of the fundamental arithmetic units are designed using the QCADesigner
tool, version 2.0.3 [15]. The layer properties used in the designs are as follows: the cell
size is 18 nm × 18 nm, and the dot diameter is 5 nm. The designs are verified using the
Figure 3.7: QCA circuit designs of (a) Adder. (b) Multiplier. (c) Multiplier (high-speed).
(d) Restoring divider. (e) Non-restoring divider. (f) Divider (high-speed). (g) Squarer. (h)
Square rooting. (i) Square/square-root (high-speed).
bistable approximation engine. The parameters used are as follows: the number of samples
is 12800, the convergence tolerance is 0.001, the radius of effect is 65 nm, the relative
permittivity is 12.9, clock high and clock low are $9.8 \times 10^{-22}$ J and
$3.8 \times 10^{-23}$ J, respectively, the clock amplitude factor is 2, the layer separation
is 11.5 nm, and the maximum number of iterations per sample is 100.
In the literature, several papers have introduced different QCA designs of each arithmetic
cell discussed in this chapter. In this section, however, we compare the proposed
QCA arithmetic cells only with the best existing designs, as given in Table 3.3. This comparison
covers the number of QCA cells, area, latency, and the type of layer used in each unit. The
same table also compares the cost of each arithmetic unit.
Table 3.3: Comparison of the proposed QCA designs of the fundamental arithmetic units and the best existing designs

Array cell                                         Cell count  Area (µm²)  Latency (clock cycles)  Layer type    Cost
Adder cell [68]                                    63          0.05        0.75                    Single-layer  0.028
Proposed adder cell                                61          0.06        0.5                     Multi-layer   0.015
Multiplier cell [70]                               240         0.25        3.0                     Single-layer  2.25
Proposed multiplier cell                           72          0.07        0.75                    Multi-layer   0.039
Proposed multiplier cell (High-speed)              72          0.07        0.75                    Multi-layer   0.039
Restoring divider cell [75]                        183         0.25        1.25                    Multi-layer   0.391
Proposed restoring divider cell                    118         0.13        1.0                     Multi-layer   0.13
Non-restoring divider cell [77]                    60          0.08        0.75                    Single-layer  0.045
Proposed non-restoring divider cell                74          0.07        0.75                    Multi-layer   0.039
Proposed divider cell (High-speed)                 317         0.34        0.75                    Multi-layer   0.191
Squarer cell [79]                                  62          0.06        0.75                    Multi-layer   0.034
Proposed squarer cell                              61          0.06        0.5                     Multi-layer   0.015
Square-rooting cell [80]                           311         0.45        2.25                    Single-layer  2.28
Proposed square-rooting cell                       74          0.07        0.75                    Multi-layer   0.039
Proposed square/square-rooting cell (High-speed)   561         0.52        1.0                     Multi-layer   0.52
From the table, it can be observed that the proposed designs, which are developed based
on the equivalent majority/XOR circuits of these units, give better or equal results in terms
of cell count, area, and latency for all units, except for the area of the adder and the cell
count of the non-restoring divider unit. The proposed QCA adder cell has an area of 0.06 µm²,
whereas the best existing design given in [68] has an area of 0.05 µm². However, our adder has a
latency of 0.5 clock cycle, while the best existing adder has a latency of 0.75 clock cycle.
Therefore, by (3.11), the proposed design achieves a cost reduction of 46.4%. For the
non-restoring divider unit, the proposed design requires 14 more QCA cells than the design
given in [77] because our circuit is designed using multiple layers. However, the proposed
non-restoring divider cell has a smaller area and the same latency as the best existing
design, which leads to a cost reduction of about 13.3%. For the remaining units, the proposed
designs are better than or equal to the best existing designs in all categories. In addition,
the cost of every proposed design is lower than that of the best existing design.
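The cost figures discussed above follow directly from (3.11). The sketch below recomputes them from the area and latency values listed in Table 3.3; the function names are hypothetical, and the reductions are computed from costs rounded to three decimals, which is an assumption that appears to reproduce the quoted percentages.

```python
def qca_cost(area_um2, latency_cycles):
    """Cost metric of (3.11): area multiplied by latency squared."""
    return area_um2 * latency_cycles ** 2

def reduction_percent(reference, proposed):
    """Relative reduction of `proposed` with respect to `reference`, in percent."""
    return 100.0 * (reference - proposed) / reference

# (area in square micrometres, latency in clock cycles), taken from Table 3.3
adder_ref, adder_new = (0.05, 0.75), (0.06, 0.5)
nrd_ref, nrd_new = (0.08, 0.75), (0.07, 0.75)

# Costs rounded to three decimals, as they are listed in the table
adder_ref_cost = round(qca_cost(*adder_ref), 3)
adder_new_cost = round(qca_cost(*adder_new), 3)
nrd_ref_cost = round(qca_cost(*nrd_ref), 3)
nrd_new_cost = round(qca_cost(*nrd_new), 3)

adder_reduction = round(reduction_percent(adder_ref_cost, adder_new_cost), 1)
nrd_reduction = round(reduction_percent(nrd_ref_cost, nrd_new_cost), 1)

if __name__ == "__main__":
    print("adder cost:", adder_ref_cost, "->", adder_new_cost, f"({adder_reduction}% lower)")
    print("NRD cell cost:", nrd_ref_cost, "->", nrd_new_cost, f"({nrd_reduction}% lower)")
```

Because latency enters the metric quadratically, the adder's shorter latency outweighs its slightly larger area.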
3.5: Conclusion
In this chapter, QCA designs of various fundamental arithmetic cells are given. These
designs are developed using both majority gates and the QCA structure of the three-input
XOR function. From the comparison, we found that the proposed multi-layer QCA designs
result in lower complexity, higher speed, and lower cost compared to the designs developed
based on the equivalent majority-based circuits of these units. The proposed QCA cells can
be designed and extended in a pipeline manner to form arithmetic arrays that can perform
the operations for any number of bits.
CHAPTER 4: DESIGN OF ARRAY MULTIPLIER IN QCA
4.1: Introduction
There has always been an interest in developing different arithmetic units that can
be used in various computer arithmetic circuits. Recently, several papers have introduced
different QCA arithmetic circuits such as multiplication [70], division [73], squaring [79],
square rooting [80], and multi-operation arrays [83]. Multipliers have always been of interest
to researchers because of their applications in a variety of fields. Several papers have
proposed different designs of QCA multipliers based on different techniques. In [70], a QCA
design of pipelined parallel array multiplier is proposed. This design is developed using only
majority gates and a coplanar layer. In [71], QCA designs of Wallace and Dadda multipliers
are introduced. Other types of QCA multipliers are proposed in [72]. These multipliers are
also designed using only majority gates and a coplanar layer. The proposed array multiplier
is designed using both majority and XOR functions. It is developed using multi layers which
can provide better latency, area, and cell count, compared to the existing designs.
4.2: Design and Implementation
4.2.1: Basic Multiplier Cell
The basic cell in the multiplier array consists of four inputs ($s_{i,j-1}$, $a_i$, $b_j$, $c_{i-1,j}$) and two
outputs ($s_{i,j}$, $c_{i,j}$) as shown in Fig. 4.1(a). The Boolean functions of the two outputs in the
basic multiplier cell can be expressed by
$$\begin{aligned} s_{i,j} &= s_{i,j-1} \oplus (a_i b_j) \oplus c_{i-1,j} \\ c_{i,j} &= s_{i,j-1}(a_i b_j) + s_{i,j-1}\,c_{i-1,j} + (a_i b_j)\,c_{i-1,j} \end{aligned} \tag{4.1}$$
It can be noticed that the cell is basically a full adder and an AND gate. The inputs of the adder are
$s_{i,j-1}$, $c_{i-1,j}$, and the output of the AND operation of $a_i$ and $b_j$, and the outputs of the adder
Figure 4.1: (a) Basic multiplier cell (b) Logic diagram of the multiplier cell [70].
are $s_{i,j}$ for the sum and $c_{i,j}$ for the carry. Fig. 4.1(b) shows the logic diagram of the basic
multiplier cell.
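As a minimal behavioural sketch of (4.1), the cell can be modelled with one three-input XOR and two majority gates, where the AND of $a_i$ and $b_j$ is obtained from a majority gate with one input fixed to 0. The function and argument names below are hypothetical and only mirror the symbols of (4.1); the model says nothing about the QCA layout itself.

```python
from itertools import product

def maj(a, b, c):
    """Three-input majority gate."""
    return (a & b) | (a & c) | (b & c)

def xor3(a, b, c):
    """Three-input XOR gate."""
    return a ^ b ^ c

def multiplier_cell(s_prev, a_i, b_j, c_prev):
    """Return (s_ij, c_ij) of the basic multiplier cell of (4.1)."""
    partial = maj(a_i, b_j, 0)               # AND gate realized as M(a, b, 0)
    s_ij = xor3(s_prev, partial, c_prev)     # sum output
    c_ij = maj(s_prev, partial, c_prev)      # carry output
    return s_ij, c_ij

def matches_full_adder_with_and():
    """Check the cell against integer addition of s_prev + a_i*b_j + c_prev."""
    for s_prev, a_i, b_j, c_prev in product((0, 1), repeat=4):
        total = s_prev + (a_i & b_j) + c_prev
        if multiplier_cell(s_prev, a_i, b_j, c_prev) != (total & 1, total >> 1):
            return False
    return True
```

The check over all sixteen input combinations confirms that the sum-of-products carry in (4.1) is the majority function of the three adder inputs.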
Due to the principles of QCA technology, the design of logic circuits in QCA has
previously been based only on majority gates. However, the single-layer QCA structure of the
three-input XOR function provides better QCA characteristics and improves the performance
of the designed circuit compared to its equivalent majority-based circuit. For the multiplier
cell, when using only majority gates, the circuit requires four gates, three levels, and two
inverters as shown in Fig. 4.2. However, by using both majority and XOR gates, the cell
requires only three gates and two levels as shown in Fig. 4.3. This circuit can be directly
implemented in QCA using the same number of gates and levels as shown in Fig. 4.4. This
design uses 72 QCA cells and has an area of 0.07 µm² and a delay of 0.75 clock cycle, consistent with Table 3.3.
The design of the multiplier cell is developed using a multi-layer structure. Because of the routing
complexity of single-layer designs, using multiple layers can give better results in terms of latency
Figure 4.2: Majority-based circuit of multiplier cell
Figure 4.3: Majority/XOR-based circuit of multiplier cell
Figure 4.4: QCA layout of the multiplier cell
and area. In addition, a multi-layer design can require fewer QCA cells than
a single-layer design, especially for large circuits. The design of the multiplier
circuit is developed using three layers: the main layer, layer 1, and layer 2. Fig. 4.5(a), (b),
and (c) show these layers, respectively.
Figure 4.5: QCA layers for multiplier cell: (a) Main layer (b) Layer 1 (c) Layer 2
4.2.2: Array Multiplier
A multiplier is formed by a two-dimensional array of pipelined multiplier cells. For an n-bit
multiplier, the array requires $n^2$ cells. Fig. 4.6 shows the pipelined array for a 4-bit multiplier.
This array can be used to multiply a 4-bit number ($a_3a_2a_1a_0$) by a 4-bit
number ($b_3b_2b_1b_0$). The result of the multiplication is produced in ($m_7m_6m_5m_4m_3m_2m_1m_0$).
From the figure, it can be noticed that the inputs $s_{i,j-1}$ of the cells in the first row (top
row) and the inputs $c_{i-1,j}$ in the first column (rightmost column) are set to 0.
Figure 4.6: A 4-bit multiplier
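The behaviour of such an array can be sketched in software. The wiring below is an assumption based on the usual ripple-carry array multiplier (each row adds one shifted partial product, the carry ripples from the rightmost cell to the left, and the carry out of the leftmost cell feeds the next row), since the exact interconnections of Fig. 4.6 are not reproduced in the text. Pipelining and clocking are ignored, and all names are hypothetical.

```python
def maj(a, b, c):
    """Three-input majority gate."""
    return (a & b) | (a & c) | (b & c)

def multiplier_cell(s_prev, a_i, b_j, c_prev):
    """Basic cell of (4.1): full adder fed by s_prev, a_i AND b_j, and c_prev."""
    partial = a_i & b_j
    return s_prev ^ partial ^ c_prev, maj(s_prev, partial, c_prev)

def to_bits(value, n):
    """Little-endian list of the n least significant bits of value."""
    return [(value >> k) & 1 for k in range(n)]

def from_bits(bits):
    """Inverse of to_bits."""
    return sum(bit << k for k, bit in enumerate(bits))

def array_multiply(a_bits, b_bits):
    """Multiply two n-bit numbers (little-endian bit lists) with n*n cells."""
    n = len(a_bits)
    sums_in = [0] * n      # s inputs of the first row are tied to 0
    product_bits = []
    for j in range(n):     # one row per multiplier bit b_j
        carry = 0          # c input of the rightmost cell is tied to 0
        row = []
        for i in range(n):  # carry ripples from the rightmost cell leftwards
            s, carry = multiplier_cell(sums_in[i], a_bits[i], b_bits[j], carry)
            row.append(s)
        product_bits.append(row[0])      # rightmost sum is product bit m_j
        sums_in = row[1:] + [carry]      # remaining sums shift to the next row
    return product_bits + sums_in        # last row supplies the upper n bits

if __name__ == "__main__":
    result = array_multiply(to_bits(0b110, 3), to_bits(0b101, 3))
    print(bin(from_bits(result)))
```

For the 3-bit example 110 × 101 used later in this chapter, this model returns the same product as the reported simulation.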
For illustration, we have considered the design of 3- and 4-bit QCA multipliers. These
arrays are built from QCA multiplier cells designed with two different clock-zone
assignments in order to minimize the delay. To illustrate this point, consider
the multiplier cell given in Fig. 4.4. The inputs of this cell are applied
in the first quarter of the clock cycle (clock 0) and the outputs are received in the third
quarter (clock 2). Therefore, connecting this cell to another cell with the same
clock zones requires an additional 0.25 clock cycle. However, if the
next cell is designed to receive its inputs in the third quarter, the output of the previous
cell can be connected directly to the next cell. Therefore, the array multipliers are designed
using multiplier cells with two different clock-zone assignments such that no cell is connected
to a next cell with the same clock zones. The QCA designs of the 3- and 4-bit multipliers are
shown in Figs. 4.7 and 4.8, respectively.
Figure 4.7: QCA layout of 3-bit multiplier
The delay of the proposed QCA multiplier for an n-bit array can be determined by
$$\text{Delay} = 0.75\,(2n - 1) \tag{4.2}$$
Thus, the delay of the proposed multipliers is 3.75 clock cycles for the 3-bit array
and 5.25 clock cycles for the 4-bit array.
Figure 4.8: QCA layout of 4-bit multiplier
4.3: Simulation Results and Comparison
The designs and simulations were done using QCADesigner version 2.0.3 [15]. The
layer properties used in the designs are as follows: the cell size is 18 nm × 18 nm, and the
dot diameter is 5 nm. The parameters used for the bistable approximation simulation
engine are as follows: the number of samples is 12800, the convergence tolerance is
0.001, the radius of effect is 65 nm, the relative permittivity is 12.9, clock high and clock low are
$9.8 \times 10^{-22}$ J and $3.8 \times 10^{-23}$ J, respectively, the clock amplitude factor is 2,
the layer separation is 11.5 nm, and the maximum number of iterations per sample is 100.
Fig. 4.9 shows the simulation results for the multiplier cell. It can be seen that the
outputs are received after 0.75 clock cycle.
Figure 4.9: Simulation result for the multiplier cell
The simulation results, after several iterations, for two multiplication examples
using the proposed 3-bit multiplier are shown in Fig. 4.10. Fig. 4.10(a) shows the result for
the multiplication of 10 and 101 (binary). From the figure, it can be seen that the output is
produced as 1010. Fig. 4.10(b) shows the multiplication of 110 and 101, where the output is
generated as 11110.
Figure 4.10: Simulation results for multiplication of (a) 10 and 101 (b) 110 and 101
In Table 4.1, a comparison between the proposed QCA design of the 4-bit array multiplier
and the existing designs is given.
Table 4.1: Comparison of different QCA designs of array multipliers

Multiplier array                 Cell count  Area (µm²)  Latency
4-bit multiplier [70]            -           7.04        11
4-bit Wallace multiplier [71]    3295        7.39        10
4-bit Dadda multiplier [71]      3384        7.51        12
4-bit multiplier I [72]          2956        5.18        14
4-bit multiplier II [72]         3738        6.02        14
Proposed 3-bit multiplier        1343        1.31        3.75
Proposed 4-bit multiplier        2504        2.40        5.25
Reduction % (4-bit multiplier)   15.3%       53.7%       47.5%
From the table, it can be observed that the proposed design gives better results for
all QCA characteristics compared to the existing designs. The proposed multiplier achieves
15.3% and 53.7% reductions in cell count and area, respectively, compared to the type I multiplier
given in [72]. In addition, our multiplier achieves a reduction of 47.5% in latency compared
to the Wallace multiplier in [71]. In the same table, the results obtained for the proposed 3-bit
multiplier are also given.
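The delay formula (4.2) and the reduction percentages in Table 4.1 can be reproduced with a few lines of arithmetic. The sketch below uses only values quoted in this chapter; the function names are hypothetical.

```python
def multiplier_delay(n_bits):
    """Delay of the proposed n-bit array multiplier in clock cycles, from (4.2)."""
    return 0.75 * (2 * n_bits - 1)

def reduction_percent(reference, proposed):
    """Relative reduction of `proposed` with respect to `reference`, in percent."""
    return 100.0 * (reference - proposed) / reference

# Proposed 4-bit multiplier versus the reference designs named in the text
cell_count_reduction = round(reduction_percent(2956, 2504), 1)   # vs multiplier I [72]
area_reduction = round(reduction_percent(5.18, 2.40), 1)         # vs multiplier I [72]
latency_reduction = round(reduction_percent(10, multiplier_delay(4)), 1)  # vs Wallace [71]

if __name__ == "__main__":
    print(multiplier_delay(3), multiplier_delay(4))
    print(cell_count_reduction, area_reduction, latency_reduction)
```

Note that the three percentages are taken against different reference designs, namely the best existing value in each column.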
4.4: Conclusion
This chapter presented a QCA design of an n-bit binary array multiplier. The basic cell
in the proposed multiplier is designed using both majority and XOR functions. Based on
this cell, 3- and 4-bit QCA multipliers are designed. The designs of the basic cell and the
array multiplier are developed using multi-layer structures. Compared with the previously
proposed designs, our multiplier provides better results for QCA characteristics
such as delay, cell count, and area.
CHAPTER 5: QCA DESIGN OF NON-RESTORING BINARY
ARRAY DIVIDER
5.1: Introduction
To date, several papers have proposed QCA designs of different arithmetic units
such as multipliers [69], dividers [73], squarers [79], and square-rooting circuits [80]. Among
these, divider circuits are the most complicated units and have a major impact on the overall
performance of any arithmetic processor. Different QCA designs of divider circuits such
as the restoring binary array divider (RD) [74, 75, 83] and the non-restoring divider (NRD) [75–78]
have been proposed. NRD designs perform better than RDs
because of the drawbacks of RDs, such as the restoring process and the required control logic, which
cause more delay, unnecessary power dissipation, and larger sizes.
In the literature, several QCA-based circuits for the non-restoring binary array divider
have been introduced based on different techniques [75–78]. These dividers were designed
using a single layer as in [76] or multi layers as in [75, 78] using only majority gates and
inverters. Recently, a QCA design for NRD was proposed based on the QCA structure of
the XOR function [77]. In this design, the Boolean functions are realized using both majority
and XOR gates. However, this divider is designed using a single layer. In this chapter, a
QCA design of an n-bit NRD using multi-layer structures is proposed. This leads to an NRD
with better cell count, latency, and area than the existing designs.
5.2: Non-Restoring Binary Array Divider
5.2.1: Complement Adder/Subtractor Cell
The basic unit in the non-restoring binary array divider is the complement adder/subtractor
(CAS) cell. This cell has four inputs, $A_i$, $B_i$, $P$, and $C_i$, and two outputs, $S_i$ and $C_o$,
as shown in Fig. 5.1. The functions of this cell can be expressed as follows.
$$\begin{aligned} S_i &= A_i \oplus (B_i \oplus P) \oplus C_i \\ C_o &= A_i (B_i \oplus P) + A_i C_i + (B_i \oplus P)\, C_i \end{aligned} \tag{5.1}$$
Figure 5.1: Complement adder/subtractor cell
From these expressions, it can be noted that the CAS unit is basically an XOR function
and a one-bit full adder. The first two inputs of the full adder are $A_i$ and $C_i$, and the third
input is the output of $B_i \oplus P$. The outputs of the CAS cell are $S_i$ for the sum and $C_o$
for the carry.
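A behavioural sketch of (5.1) is given below. It models one CAS cell and, as an illustration, a row of cells with the carry input of the least significant cell tied to $P$; that row wiring is an assumption (it is the usual two's-complement arrangement) and is not taken from the figures. All names are hypothetical.

```python
def maj(a, b, c):
    """Three-input majority gate."""
    return (a & b) | (a & c) | (b & c)

def xor3(a, b, c):
    """Three-input XOR gate."""
    return a ^ b ^ c

def cas_cell(a_i, b_i, p, c_i):
    """Return (S_i, C_o) of the CAS cell of (5.1)."""
    t = xor3(b_i, p, 0)        # two-input XOR from a three-input gate with one input at 0
    return xor3(a_i, t, c_i), maj(a_i, t, c_i)

def cas_row(a_bits, b_bits, p):
    """Ripple a row of CAS cells (little-endian bits); carry-in tied to P (assumed)."""
    carry = p
    sums = []
    for a_i, b_i in zip(a_bits, b_bits):
        s, carry = cas_cell(a_i, b_i, p, carry)
        sums.append(s)
    return sums, carry

def to_bits(value, n):
    return [(value >> k) & 1 for k in range(n)]

def from_bits(bits):
    return sum(bit << k for k, bit in enumerate(bits))

if __name__ == "__main__":
    sums, _ = cas_row(to_bits(9, 4), to_bits(3, 4), 1)   # P = 1: 9 - 3
    print(from_bits(sums))
    sums, _ = cas_row(to_bits(9, 4), to_bits(3, 4), 0)   # P = 0: 9 + 3
    print(from_bits(sums))
```

With $P = 0$ the row adds its operands, and with $P = 1$ it adds the complement of $B$ plus one, which is subtraction modulo $2^n$.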
5.2.2: Non-Restoring Binary Array Divider
The division process of the non-restoring binary array is carried out by calculating the partial
remainders, subtracting right-shifted versions of the divisor from the dividend or adding them to it.
The quotient bit is determined by the sign of the calculated partial remainder. The
sign of the partial remainder is also used to decide whether the shifted divisor has to be added
or subtracted in the next cycle. The process of NRD is defined in [76,86] as follows.
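$$q_{i+1} = \begin{cases} 1, & \text{if } R_i > 0 \\ 0, & \text{if } R_i < 0 \end{cases} \tag{5.2}$$
$$R_{i+1} = \begin{cases} 2R_i - Y, & \text{if } R_i > 0 \\ 2R_i + Y, & \text{if } R_i < 0 \end{cases} \tag{5.3}$$
$$r = \begin{cases} 2^{-n} R_n, & \text{if } R_i > 0 \\ 2^{-n}(R_n + Y), & \text{if } R_i < 0 \end{cases} \tag{5.4}$$
where $n$ is the number of bits, $i$ is the iteration index ($i = 1, 2, \ldots, n-1$), $q_i$ are the quotient bits,
$R_i$ is the partial remainder after the $i$th iteration, $Y$ is the divisor, and $r$ is the final remainder.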
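The recurrence can be illustrated with a small integer sketch. It uses the textbook integer formulation of non-restoring division (shift in one dividend bit per step, then add or subtract the divisor according to the sign of the partial remainder, with a final correction), which is an assumption made for illustration and differs from the fractional scaling by $2^{-n}$ in (5.4). A non-negative partial remainder is treated as the "subtract next" case. The function name is hypothetical.

```python
def nonrestoring_divide(dividend, divisor, n_bits):
    """Divide an n_bits-wide unsigned dividend by a positive divisor.

    Returns (quotient, remainder) using add/subtract steps without restoring.
    """
    if divisor <= 0:
        raise ValueError("divisor must be positive")
    if dividend < 0 or dividend >= (1 << n_bits):
        raise ValueError("dividend does not fit in n_bits")
    remainder = 0
    quotient = 0
    for k in range(n_bits - 1, -1, -1):
        bit = (dividend >> k) & 1
        if remainder >= 0:
            remainder = 2 * remainder + bit - divisor   # subtract, as in 2R - Y
        else:
            remainder = 2 * remainder + bit + divisor   # add, as in 2R + Y
        # The quotient bit is the sign of the new partial remainder
        quotient = (quotient << 1) | (1 if remainder >= 0 else 0)
    if remainder < 0:
        remainder += divisor                            # final correction step
    return quotient, remainder

if __name__ == "__main__":
    print(nonrestoring_divide(0b111, 0b10, 3))
    print(nonrestoring_divide(0b1111, 0b10, 4))
```

A negative partial remainder is never restored; the next step simply adds the divisor instead of subtracting it, and only the last remainder is corrected.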
Using a two-dimensional array of pipelined CAS cells, the non-restoring divider can be
designed to perform the division operation for any number of bits. For an n-bit NRD, the
array requires $n^2$ CAS cells. Fig. 5.2 shows a 5 × 5 non-restoring binary divider. This array
can divide an 8-bit number ($x_1x_2x_3x_4x_5x_6x_7x_8$) by a 4-bit number ($y_1y_2y_3y_4$). For the result,
a 5-bit quotient ($q_0q_1q_2q_3q_4$) is produced at the left side of the array and a 5-bit remainder
($r_4r_5r_6r_7r_8$) is produced at the bottom of the array.
5.3: Design and Implementation
As mentioned earlier, the basic unit in QCA technology is a majority gate. Therefore,
in order to implement a logic circuit in QCA, the Boolean function has to be converted
into its equivalent majority logic circuit. Different majority logic circuit synthesis methods
have been introduced. However, for some cases, when converting a Boolean function into
its equivalent majority-based circuit, the circuit requires more gates and levels compared to
its original Boolean function. In any QCA circuit, the number of gates and levels are the
most important factors that affect the performance of the circuit since they determine the
complexity and latency. Therefore, in order to have an efficient QCA design of non-restoring
binary divider, the numbers of gates and levels have to be reduced. The proposed QCA
Figure 5.2: 5 × 5 non-restoring binary array divider
design of the non-restoring divider is developed using a QCA structure of the three-input
XOR function. Using this structure together with majority gates to design a QCA circuit can
provide a further reduction in cell count, latency, and area.
As given in the previous section, a CAS cell is essentially an XOR function and a one-bit
full adder. Since the carry of the full adder is a majority function, the CAS cell can be
realized using two XOR gates and one majority gate as shown in Fig. 5.3(a). This circuit
can be directly implemented in QCA using the same number of gates and levels. It can be
noted that the XOR function of inputs $B_i$ and $P$ can be realized using the three-input XOR
gate by fixing the third input to logic 0. Fig. 5.3(b) shows the QCA design of the CAS cell.
In the CAS circuit, the longest path runs from inputs $B_i$ and $P$ to output $C_o$. This path
Figure 5.3: CAS cell: (a) Logic diagram (b) QCA layout
is basically the majority function of $A_i$, $C_i$, and the output of the XOR function of $B_i$ and $P$,
i.e., $M(A_i, C_i, (B_i \oplus P))$. From Fig. 5.3(b), it can be noted that this path requires 0.5 clock
cycle for the XOR gate and 0.25 clock cycle for the majority gate. Therefore, the overall delay
of this circuit is 0.75 clock cycle.
In this chapter, the proposed designs for the CAS cell and the divider are developed using
multi-layer QCA structures. In most cases, circuits designed in QCA using multiple layers require
more QCA cells than those designed with a single layer. However, multi-layer circuits
can provide better area and latency. In addition, because of the routing
complexity of a single layer, multi-layer designs can also require fewer cells,
especially for large circuits. The QCA design of the CAS cell given in Fig.
5.3(b) is developed using three different layers, i.e., the main layer, layer 1, and layer 2. These
layers are shown in Fig. 5.4(a), (b), and (c), respectively.
Figure 5.4: QCA layers for CAS cell: (a) Main layer (b) Layer 1 (c) Layer 2
Based on the QCA design of the CAS cell, the non-restoring binary array divider can be
implemented to perform the division operation for any number of bits. Fig. 5.5 shows the
proposed QCA design of the 3 × 3 NRD. From the design, it can be noted that two CAS cells,
named “Red” and “White”, are used, as shown in the red rectangles. These cells are designed
with different clock zones in order to reduce the number of clock cycles of the divider. To
illustrate this point, consider the QCA design of the CAS cell given in Fig. 5.3(b). The inputs
of this cell are applied in the first clock zone (clock 0) and the outputs are received in the
third clock zone (clock 2). Therefore, connecting this cell to another cell with the
same clock zones requires an additional 0.25 clock cycle. However, if the
next cell is designed to receive its inputs in the third clock zone, the output of the cell can
be connected directly to the next cell. Based on that, the divider is designed by placing and
connecting these cells such that no cell is connected to another cell with the
same clock zones. In addition, the first cell in each row (the rightmost cell) should have
the same clocks as the last cell of the previous row (the leftmost cell), whereas the first cell
in the first row is always “Red”. This provides 0.5 clock cycle for connecting $q_i$ to the next
row, whereas one full cycle would be required if these two cells had different clocks arranged
otherwise. Fig. 5.6 shows the patterns of CAS cells for 3-, 4-, 5-, and 8-bit non-restoring
dividers. To illustrate the implementation of the proposed divider, we have considered the
implementation of 4 × 4 and 5 × 5 NRDs as shown in Figs. 5.7 and 5.8, respectively.
The latency of an n-bit restoring and non-restoring binary array divider can be determined
by $4n^2 + 1$ [74] and $3n^2 - 0.75$, respectively. Recently, a QCA design of a non-restoring
divider was proposed with a latency of $3n^2/4$ [77]. In the proposed non-restoring binary
array divider, the QCA design of an n-bit array achieves an overall latency of
$$\text{Delay} = \frac{2n(n+1) - 1}{4}. \tag{5.5}$$
In this chapter, the proposed divider is implemented for n = 3, n = 4, and n = 5. Thus,
the overall delays of these dividers are 5.75, 9.75, and 14.75 clock cycles, respectively. Fig.
5.9 shows the latency of the restoring divider, the non-restoring divider, and the proposed divider.
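The four latency expressions can be compared numerically. The sketch below simply evaluates the formulas quoted above for the sizes considered in this chapter; the function names are hypothetical.

```python
def latency_restoring(n):
    """Restoring array divider latency [74]: 4n^2 + 1 clock cycles."""
    return 4 * n ** 2 + 1

def latency_nonrestoring(n):
    """Non-restoring array divider latency: 3n^2 - 0.75 clock cycles."""
    return 3 * n ** 2 - 0.75

def latency_nrd_xor(n):
    """Non-restoring divider of [77]: 3n^2 / 4 clock cycles."""
    return 3 * n ** 2 / 4

def latency_proposed(n):
    """Proposed non-restoring divider, from (5.5): (2n(n + 1) - 1) / 4."""
    return (2 * n * (n + 1) - 1) / 4

if __name__ == "__main__":
    for n in (3, 4, 5, 8):
        print(n, latency_restoring(n), latency_nonrestoring(n),
              latency_nrd_xor(n), latency_proposed(n))
```

All four expressions grow quadratically in n; the leading coefficient of the proposed design is 1/2, compared with 3/4 for [77], so the gap widens with the word length.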
5.4: Results and Comparison
In this chapter, the designs and simulations were done using QCADesigner version
2.0.3 [15]. The layer properties used in the designs are as follows: the cell size is
18 nm × 18 nm, and the dot diameter is 5 nm. The parameters used for the bistable
approximation simulation engine are as follows: the number of samples is 12800, the
convergence tolerance is 0.001, the radius of effect is 65 nm, the relative permittivity is 12.9,
clock high and clock low are $9.8 \times 10^{-22}$ J and $3.8 \times 10^{-23}$ J, respectively,
the clock amplitude factor is 2, the layer separation is 11.5 nm, and the maximum number of
iterations per sample is 100.
Figure 5.5: QCA layout of 3× 3 NRD
Fig. 5.10 shows the simulation results for the CAS cell. From the figure, it can be
observed that the signals are received at the output cells after 0.75 clock cycle. Compared
to the existing CAS cell, the delay of the proposed design is the same as the design in [77].
The proposed QCA design of CAS cell has a total of 74 cells and an area of 0.07µm2. A
comparison of QCA specifications between the proposed design and the best existing designs
Figure 5.6: The patterns of CAS cells for different sizes of NRD (3 × 3, 4 × 4, 5 × 5, and 8 × 8).
Figure 5.7: QCA layout of 4× 4 NRD.
for the CAS unit is given in Table 5.1. From the table, it can be noted that since the
proposed CAS unit is designed using multi-layer structure, the QCA circuit has a total of
74 cells, whereas the best existing design has 60 cells. However, the proposed design has a
smaller area, i.e., 0.07µm2.
Figure 5.8: QCA layout of 5× 5 NRD.
The simulation results, after several iterations, for two division examples using the proposed
3-bit divider are shown in Fig. 5.11. Fig. 5.11(a) shows the result for the division
of 111 by 10. From the figure, it can be seen that the outputs are generated as 11 for the
Figure 5.9: Latency of RD, NRD, and the proposed NRD.
Table 5.1: Comparison of QCA designs for a CAS cell

CAS cell                    Cell count  Latency  Area (µm²)  Layer type
[75]                        235         1.75     0.35        multi-layer
[76]                        147         2.25     0.27        coplanar
[77]                        60          0.75     0.08        coplanar
Proposed                    74          0.75     0.07        multi-layer
Reduction % (w.r.t. [77])   -23.3%      0.0%     12.5%       –
quotient and 1 for the remainder. Fig. 5.11(b) shows the division of 1111 by 10, where the
outputs are produced as 111 and 1 for the quotient and remainder, respectively.
Even though the proposed CAS unit requires more QCA cells than the single-layer
design, the overall design of the non-restoring binary array divider requires fewer QCA
cells, fewer clock zones, and a smaller area. The proposed designs of the 3 × 3 and 4 × 4 dividers
have a total of 1436 and 2791 QCA cells and areas of 1.53 µm² and 2.81 µm², respectively. In Table
Figure 5.10: Simulation results for the CAS cell
5.2, a comparison between the proposed non-restoring binary dividers and the best existing
designs is given. From the table, it can be observed that the proposed design has better
results in terms of cell count, latency, and area. Compared to the designs in [77] and [78], the
proposed 3 × 3 divider achieves reductions of at least 14.8%, 14.8%, and 20.3% in cell count,
latency, and area, respectively. In addition, the proposed 4 × 4 divider achieves reductions of
at least 5.5%, 18.8%, and 33.1% in cell count, latency, and area, respectively. In the same
table, the QCA specifications for the 5 × 5 divider are also given.
Figure 5.11: Simulation results for division of (a) 111 by 10 (b) 1111 by 10
Table 5.2: Comparison of different QCA designs of restoring and non-restoring array dividers.

Divider array               Cell count  Latency  Area (µm²)  Layer type
3 × 3 RD [74]               6451        37       15.05       coplanar
3 × 3 NRD [76]              3742        26.25    6.22        coplanar
3 × 3 NRD [77]              1686        6.75     3.4         coplanar
3 × 3 NRD [78]              1852        31       1.92        multi-layer
Proposed 3 × 3              1436        5.75     1.53        multi-layer
Reduction % (w.r.t. [77])   14.8%       14.8%    55.0%       –
Reduction % (w.r.t. [78])   22.5%       81.5%    20.3%       –
4 × 4 RD [75]               5351        16.5     15.51       multi-layer
4 × 4 NRD [75]              5124        15.25    9.99        multi-layer
4 × 4 NRD [76]              6865        47.25    10.95       coplanar
4 × 4 NRD [77]              3180        12       6.5         coplanar
4 × 4 NRD [78]              2954        44       4.2         multi-layer
Proposed 4 × 4              2791        9.75     2.81        multi-layer
Reduction % (w.r.t. [77])   12.2%       18.8%    56.8%       –
Reduction % (w.r.t. [78])   5.5%        77.8%    33.1%       –
Proposed 5 × 5              4313        14.75    4.58        multi-layer
5.5: Conclusion
In this chapter, a QCA design of n-bit non-restoring binary divider is presented. This
divider is developed based on a QCA structure of the three-input XOR function in order
to design the equivalent circuits with the minimum numbers of gates, levels, and inverters.
Unlike existing designs, which were developed using single-layer structures, the proposed
design is developed using multi layers. In addition, the proposed divider is developed using
the QCA structure of the basic cell designed with different clock zones. These cells are placed
and connected based on the number of bits in order to have the minimum number of clock
cycles. This results in a better QCA design for the non-restoring divider compared to the
previously proposed designs in view of cell count, clock frequency, and area.
CHAPTER 6: SQUARING AND SQUARE-ROOTING CIRCUITS
IN QCA
6.1: Introduction
Several papers have introduced QCA designs of different arithmetic circuits such as
adder [67, 68], subtractor [68], multiplier [69], divider [76–78], squarer [79], square-rooting
circuit [80], and multi-operation arrays [81–83] based on different techniques. The existing
QCA designs of squaring and square-rooting were developed using only majority gates. In
this chapter, efficient QCA designs of n-bit squaring and square-rooting circuits are proposed.
These circuits are developed using majority gates and a single-layer QCA structure of the
three-input XOR function. This structure can provide better QCA designs in view of different
parameters such as cell count, area, and latency, compared to the majority-based circuits.
This leads to lower-complexity and higher-speed designs of squaring and square-rooting
circuits, compared to the best existing designs.
6.2: Design and Implementation
6.2.1: Squaring and Square-Rooting Cells
The basic unit in the squaring array is a full adder [79]. This cell consists of three inputs
($A_i$, $B_i$, $C_i$) and two outputs: $S_i$ for the sum and $C_o$ for the carry. The functions of this cell
can be expressed by
$$\begin{aligned} S_i &= A_i \oplus B_i \oplus C_i \\ C_o &= A_i B_i + A_i C_i + B_i C_i \end{aligned} \tag{6.1}$$
From the functions in (6.1), it can be seen that the cell can be realized with only
two gates: a three-input XOR gate for the sum and a three-input majority gate for the
carry. In addition, this circuit requires only one level. In contrast, a design that uses
only majority gates requires three majority gates, two inverters, and two
levels. Therefore, the basic cell of the squaring circuit is designed in QCA based on its
equivalent majority/XOR-based circuit. The QCA layout of the proposed squaring cell is
shown in Fig. 6.1. The figure shows that the design is developed using multiple
layers. Even though single-layer circuits require fewer QCA cells than
multi-layer circuits, multi-layer circuits can occupy a smaller area. In addition, because of the
routing complexity of single-layer designs, multi-layer circuits can also require fewer QCA
cells, especially in large circuits. Therefore, the QCA designs of the squaring and
square-rooting circuits are developed using multiple layers.
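The two-gate realization described above can be checked exhaustively. The following is a minimal Python sketch, not part of the original design flow. The function names (`maj3`, `xor3`, `squaring_cell`, `sum_majority_only`) are illustrative, and the majority-only sum is one well-known three-gate, two-inverter form that is assumed here for comparison rather than taken from [79].

```python
from itertools import product


def maj3(a, b, c):
    """Three-input majority: 1 when at least two inputs are 1."""
    return (a & b) | (a & c) | (b & c)


def xor3(a, b, c):
    """Three-input XOR (odd parity)."""
    return a ^ b ^ c


def squaring_cell(a, b, c):
    """Full adder as in (6.1): one XOR3 for the sum, one MAJ3 for the carry."""
    return xor3(a, b, c), maj3(a, b, c)


def sum_majority_only(a, b, c):
    """Assumed majority-only sum: 3 majority gates, 2 inverters, 2 levels."""
    carry = maj3(a, b, c)               # level 1
    partial = maj3(a, b, 1 - c)         # level 1, inverter on c
    return maj3(1 - carry, partial, c)  # level 2, inverter on the carry


for a, b, c in product((0, 1), repeat=3):
    s, co = squaring_cell(a, b, c)
    assert 2 * co + s == a + b + c          # behaves as a full adder
    assert sum_majority_only(a, b, c) == s  # both realizations agree
```

The loop covers all eight input cases, so passing it confirms that the XOR/majority pair and the assumed majority-only form compute the same sum.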
Figure 6.1: QCA layout of squaring cell
For the square-rooting array, the fundamental unit is a complement adder/subtractor (CAS)
cell [76]. The cell has four inputs (X, D, P, Cin) and two outputs (R, Cout). This cell is
essentially a full adder combined with a two-input XOR function. The inputs of the full adder are
X, Cin, and the output of the XOR function of D and P, and the outputs are R and Cout for
the sum and carry, respectively. In this cell, the control signal P determines the
operation performed on the inputs X, D, and Cin. If P is equal
to 0, the cell performs addition; if P is equal to 1, it performs subtraction.
The operations of this cell can be expressed as
\[
\begin{aligned}
R &= X \oplus (D \oplus P) \oplus C_{in} \\
C_{out} &= X (D \oplus P) + X C_{in} + (D \oplus P) C_{in}
\end{aligned}
\tag{6.2}
\]
To implement the CAS cell in QCA, the full adder of this cell can be implemented in the same
way as the squaring cell. For the two-input XOR function, we can use the QCA structure of the
three-input XOR gate by fixing the third input to logic 0. Hence, the CAS cell can be directly
implemented using one majority gate and two XOR gates in two levels as shown in Fig. 6.2.
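As a quick check of (6.2), the CAS cell can be modeled behaviorally. This is a hedged Python sketch under the assumption that subtraction is carried out by adding the complement of D (so the cell adds D when P = 0 and the complement of D when P = 1); the names `cas_cell` and `maj3` are illustrative.

```python
from itertools import product


def maj3(a, b, c):
    """Three-input majority gate."""
    return (a & b) | (a & c) | (b & c)


def cas_cell(x, d, p, cin):
    """CAS cell of (6.2): two XOR3 gates and one majority gate."""
    t = d ^ p ^ 0          # two-input XOR built from XOR3 with one input tied to 0
    r = x ^ t ^ cin        # sum output R
    cout = maj3(x, t, cin) # carry output Cout
    return r, cout


for x, d, p, cin in product((0, 1), repeat=4):
    r, cout = cas_cell(x, d, p, cin)
    operand = d if p == 0 else 1 - d   # P = 1 complements D
    assert 2 * cout + r == x + operand + cin
```

With P = 1 and the carry-in of the least significant cell set to 1, a row of such cells adds the two's complement of D, which is the usual way a complement-based subtraction is formed.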
Figure 6.2: QCA layout of square-rooting cell
6.2.2: Squaring and Square-Rooting Arrays
The squaring and square-rooting circuits can be designed to perform the arithmetic
operations for n-bit numbers by extending their basic cells in a pipelined manner. For the
squaring circuit, the inputs of the basic cell given in Fig. 6.1 are received
in the first clock zone (clock 0), and the outputs are generated in the second zone (clock 1).
Connecting this cell to another cell with the same clock zones would require an additional
0.5 clock cycle. Therefore, the squaring cell is designed with four different clock zones in
order to connect the output of the current cell to the next cell directly without any additional
clock cycle. For illustration, we consider the implementation of a 4-bit squaring circuit.
This array requires four CAS cells. Fig. 6.3 shows the QCA layout of the proposed 4-bit
squaring circuit.
Figure 6.3: QCA layout of 4-bit squaring circuit
For the square-rooting circuit, the design can be implemented using multiple rows and
columns of CAS cells. In this chapter, we also consider the implementation of a 4-bit
square-rooting circuit. This array requires six CAS cells arranged in two rows, where the first
row consists of two CAS cells and the second row consists of four cells. Since there is a
difference of 0.25 clock cycle between the inputs and the outputs of the CAS cell, as shown
in Fig. 6.2, the CAS cell has been designed with four different clock zones in order to obtain
the minimum number of clock cycles. The QCA layout of the proposed 4-bit square-rooting
array is shown in Fig. 6.4.
Figure 6.4: QCA layout of 4-bit square-rooting circuit
6.3: Simulation Results and Comparisons
The designs and simulations were carried out using QCADesigner version 2.0.3 [15]. The
layer properties used in the designs are as follows: the cell area is 18 nm × 18 nm, and the
dot diameter is 5 nm. The parameters used for the bistable approximation simulation
engine are as follows: the number of samples is 12800, convergence tolerance is
0.001, radius of effect is 65 nm, relative permittivity is 12.9, clock high and clock low are
9.8 × 10^-22 J and 3.8 × 10^-23 J, respectively, clock amplitude factor is 2, layer separation is 11.5 nm,
and maximum iterations per sample is 100.
Figs. 6.5 and 6.6 show the simulation results of the proposed squaring and square-rooting
cells, respectively. From Fig. 6.5, it can be seen that the outputs of the squaring cell are
received after 0.5 clock cycle. The design of this circuit requires 61 QCA cells and has an
area of 0.06 µm². For the square-rooting cell, the outputs are received after 0.75 clock cycle, as
shown in Fig. 6.6. This circuit requires 74 cells and has an area of 0.07 µm².
Figure 6.5: Simulation results of squaring cell
83
Figure 6.6: Simulation results of square-rooting cell
The simulation results for the proposed squaring and square-rooting circuits are shown
in Figs. 6.7 and 6.8, respectively. Fig. 6.7 shows the simulation results for the squaring operation
of 101 and 1100. It can be seen that the outputs are produced as 11001 and 10010000,
respectively. For square rooting, Fig. 6.8 shows the simulation results after several iterations
for 1001 and 1101. The obtained results for these numbers are 11 and 11 for the quotients,
and 0 and 100 for the remainders, respectively.
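The binary values quoted above can be confirmed with ordinary integer arithmetic. The sketch below is only an arithmetic cross-check in Python and does not model the QCA arrays; the helper names `square_bits` and `sqrt_bits` are illustrative.

```python
from math import isqrt


def square_bits(bits):
    """Square a binary string and return the result as a binary string."""
    value = int(bits, 2)
    return format(value * value, "b")


def sqrt_bits(bits):
    """Return (integer square root, remainder) of a binary string, in binary."""
    value = int(bits, 2)
    root = isqrt(value)
    return format(root, "b"), format(value - root * root, "b")


print(square_bits("101"), square_bits("1100"))  # 5^2 = 25 and 12^2 = 144
print(sqrt_bits("1001"), sqrt_bits("1101"))     # 9 = 3^2 + 0 and 13 = 3^2 + 4
```

In decimal terms, the four cases are 5² = 25, 12² = 144, 9 = 3² + 0, and 13 = 3² + 4, which agree with the values read from the simulations.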
Figure 6.7: Simulation results for squaring of (a) 101 (b) 1100
Figure 6.8: Simulation results for square rooting of (a) 1001 (b) 1101
Table 6.1 gives a comparison of the proposed designs of squaring and square-rooting cells
and the best existing designs. The same table also compares the proposed 4-bit
squaring and square-rooting arrays. The table shows that the proposed 4-bit
squaring array has fewer QCA cells than the design given in [79]. In addition, the
proposed squarer achieves a latency reduction of 14.3%. For the square-rooting array, our design
achieves a latency reduction of 41.2%. Overall, the proposed designs of
squaring and square-rooting cells and arrays match or improve on the best existing designs
in every reported metric.
Table 6.1: Comparison of the proposed squaring and square-rooting designs
QCA unit                            Cell count   Area (µm²)   Latency (clock cycles)   Layer type
Squaring cell [79]                  62           0.06         0.75                     Multi-layer
Proposed squarer cell               61           0.06         0.5                      Multi-layer
Square-rooting cell [80]            311          0.45         2.25                     Single-layer
Proposed square-rooting cell        74           0.07         0.75                     Multi-layer
4-bit square array [79]             552          0.53         1.75                     Multi-layer
Proposed 4-bit square array         546          0.53         1.5                      Multi-layer
4-bit square-root array [80]        –            –            4.25                     Single-layer
Proposed 4-bit square-root array    733          0.84         2.5                      Multi-layer
Reduction % (w.r.t. [79])           1.1%         0.0%         14.3%                    –
Reduction % (w.r.t. [80])           –            –            41.2%                    –
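The reduction rows of Table 6.1 follow from the 4-bit array entries. A minimal Python sketch is given below, assuming the reduction is computed as (reference − proposed) / reference and rounded to one decimal place; the function name `reduction_percent` is illustrative.

```python
def reduction_percent(reference, proposed):
    """Relative reduction of `proposed` with respect to `reference`, in percent."""
    return round(100.0 * (reference - proposed) / reference, 1)


# 4-bit square array: [79] versus proposed (cell count, area, latency).
print(reduction_percent(552, 546), reduction_percent(0.53, 0.53), reduction_percent(1.75, 1.5))

# 4-bit square-root array: [80] versus proposed (latency only).
print(reduction_percent(4.25, 2.5))
```

Under this assumption the computed values reproduce the 1.1%, 0.0%, 14.3%, and 41.2% entries of the table.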
6.4: Conclusion
In this chapter, QCA designs of squaring and square-rooting arrays are proposed. These
designs are developed and implemented using majority gates and the QCA structure of the
three-input XOR function. The basic cells of these arrays are designed with different clock
zones. This leads to QCA squaring and square-rooting circuits with better results in terms of
cell count, area, and latency, compared to their best counterparts. In addition, the proposed
squaring and square-rooting arrays can be extended in a pipelined manner to perform the
arithmetic operations for any number of bits.
CHAPTER 7: A METHODOLOGY FOR MAJORITY/MINORITY LOGIC NETWORK SYNTHESIS
7.1: Introduction
In this chapter, a comprehensive methodology for majority/minority logic network
synthesis is proposed. Unlike existing methods, which are oriented toward processing three- and
four-feasible decomposed networks, this method is capable of processing n-feasible networks and
synthesizing any Boolean function to produce its equivalent majority logic network. The
method can be used to obtain an optimized majority logic network using either the number
of majority gates or the number of levels as the first priority. The obtained majority networks are then
further simplified, which can further reduce the number of gates and inverters.
In the proposed method, more than one equivalent majority expression can be obtained
for an input Boolean function. However, the method selects the most optimized network as the final
solution. Compared to the results obtained from the existing methods in [55, 60, 64], the
proposed method results in fewer majority gates and levels, which gives better latency and
size, and enhances the performance of the circuit.
7.2: Methodology
7.2.1: Overview of the Synthesis Method
The input to the proposed method is an arbitrary network of Boolean functions, and
the output is an equivalent majority logic network. The method begins by generating and
storing the output terms for each Boolean function of the input network. The second step is
to construct standard majority logic structures starting from either the minimum number
of majority gates or the minimum number of levels, depending on the selected priority. After each majority
structure is produced, the admissible input combinations are applied in parallel and checked to
determine whether one or
more of them are solutions for any of the input Boolean functions. This process results
in one or more majority expressions for each input Boolean function. These
expressions are then simplified by removing all redundancies and reducing the number of
inverters. The last step is to select the most optimized majority network among all solutions
based on the number of majority gates and inverters of the selected majority expressions. An
overview of the main steps of the proposed synthesis method is shown in Fig. 7.1. The terms
used in the flowchart are defined as follows:
n number of input Boolean functions;
Ti output terms for ith input Boolean function;
G number of majority gates;
L number of levels;
Gmin minimum value of G;
Gmax maximum value of G;
Lmin minimum value of L;
Lmax maximum value of L;
SG,L set of standard majority logic structures for given values of G and L;
Mi set of equivalent majority expressions for ith input Boolean function;
N obtained majority logic network;
7.2.2: Constructing Standard Majority Logic Structures
In this synthesis method, the main step is to construct all standard majority logic
structures (denoted by SG,L) depending on the number of majority gates G and the number
of levels L. These structures differ in how the majority gates are
divided between levels; the inputs of the majority gates are not yet determined at any level of these
structures. Every developed standard structure must satisfy the following requirements.
[Flowchart: start from an arbitrary Boolean functions network; generate and store Ti; select Gate Priority (Y1 = G, Y2 = L) or Level Priority (Y1 = L, Y2 = G); construct each standard majority logic structure; apply the combinations in parallel; store matching majority expressions in Mi; remove redundancies; optimize inverters; choose the optimal majority logic network N.]
Figure 7.1: Flowchart for the proposed synthesis method.
1. The number of majority gates in any level of a constructed majority structure must be
less than or equal to the number of gates in the next level multiplied by 3. This can
be represented by g_j ≤ 3 × g_{j+1}, where g_j is the number of gates in the jth level.
2. The number of majority gates in the last level of a constructed majority structure is
always 1.
Based on these requirements, two ways are developed to construct standard majority
logic structures, starting from the minimum number of majority gates or the minimum number of
levels; they are called “Gate Priority” and “Level Priority”, respectively.
Gate Priority: This process starts with a number of majority gates equal to 1 as an
initial value. This value increases by 1 after realizing all standard majority structures for all
admissible levels. The majority structures for a number of majority gates are constructed
with all their admissible number of levels, starting from the minimum number, and ending
with the maximum number of levels. The minimum number of levels can be determined by
\[
L_{\min} = \left\lceil \frac{\log(2G + 1)}{\log(3)} \right\rceil \tag{7.1}
\]
For the maximum number of levels, it can be determined directly from the majority
gate count. By having only one majority gate in each level, which gives the maximum
possible number of levels, the number of levels is equal to the number of majority gates.
Thus, the maximum number of levels Lmax is equal to the number of majority gates G. The
graph given in Fig. 7.2 shows the minimum and maximum numbers of levels of standard
majority structures for each number of majority gates. For example, for 50 majority gates,
the minimum and maximum numbers of levels are 5 and 50, respectively. This can be seen
in Fig. 7.2.
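The level bounds used by Gate Priority can be computed directly. The following Python sketch is an illustration only; it evaluates (7.1) with integer arithmetic (finding the smallest L whose full ternary tree holds at least G gates) rather than with floating-point logarithms, which is an implementation choice assumed here. The function names are illustrative.

```python
def min_levels(gates):
    """Smallest L with 1 + 3 + ... + 3^(L-1) >= gates, equivalent to (7.1)."""
    levels, capacity = 0, 0
    while capacity < gates:
        capacity += 3 ** levels  # a full level adds 3^levels gates
        levels += 1
    return levels


def max_levels(gates):
    """One gate per level gives the largest number of levels."""
    return gates


print(min_levels(50), max_levels(50))  # bounds for the 50-gate example
```

The two formulations agree because a structure with L levels holds at most (3^L − 1)/2 gates, so the condition (3^L − 1)/2 ≥ G is the same as L ≥ log(2G + 1)/log(3).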
[Plot with axes Number of Levels (L) and Number of Majority Gates (G), showing the Lmin and Lmax curves; marked points: G = 50, L = 5 and G = 50, L = 50.]
Figure 7.2: The possible levels of standard majority logic structures based on the number of
majority gates.
Level Priority: This process constructs standard majority logic structures with respect to
the number of levels as the first priority. The process starts with a level count
initialized to 1. The number of levels increases by 1 after realizing all admissible structures
starting from the minimum number up to the maximum number of majority gates. For each
number of levels, the minimum number of gates can be determined by having only one majority
gate in each level. Thus, the minimum number of gates Gmin is equal to the number
of levels L. For the maximum number of gates, based on the requirements for constructing
standard majority structures, the maximum number of gates can be determined by
\[
G_{\max} =
\begin{cases}
\sum_{n=0}^{L-1} 3^{n} & \text{if } L \geq 1 \\
0 & \text{otherwise}
\end{cases}
\tag{7.2}
\]
Thus, the minimum and maximum numbers of majority gates for any number of levels
can be determined. The graph given in Fig. 7.3 shows the minimum and maximum numbers
of gates of standard majority structures based on the number of levels. For example, for a
number of levels equal to 5, the minimum and maximum numbers of gates are 5 and 121,
respectively.
[Plot with axes Number of Majority Gates (G) and Number of Levels (L), showing the Gmin and Gmax curves; marked points: L = 5, G = 5 and L = 5, G = 121.]
Figure 7.3: The possible gates of standard majority logic structures based on the number of
levels.
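The gate bounds used by Level Priority can be sketched in the same way. This Python fragment assumes that the condition in (7.2) applies for every L of at least 1, which is consistent with the worked example for 5 levels; the function names are illustrative.

```python
def min_gates(levels):
    """One gate per level gives the smallest number of gates."""
    return levels


def max_gates(levels):
    """Full ternary tree of majority gates: 1 + 3 + ... + 3^(levels - 1)."""
    if levels < 1:
        return 0
    return sum(3 ** n for n in range(levels))


print(min_gates(5), max_gates(5))  # bounds for the 5-level example
```

The sum is a geometric series, so `max_gates(levels)` equals (3^levels − 1)/2 for any positive number of levels.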
In most cases, for each number of majority gates and levels, there are several admissible
structures of majority logic. For example, for 5 majority gates, the admissible numbers of levels of standard
structures are 3, 4, and 5. For 3 levels, three admissible structures can be
obtained in S5,3. The first structure contains 3 gates in level 1, 1 gate in level 2, and 1
gate in level 3. The second structure contains 2 gates in level 1, 2 gates in level 2, and 1 gate
in level 3. The third structure contains 1 gate in level 1, 3 gates in level 2, and 1 gate in
level 3. These structures are shown in Fig. 7.4. For 4 levels there are also three admissible
structures in S5,4, as shown in Fig. 7.5. The first structure consists of 2 gates in level 1, 1
gate in level 2, 1 gate in level 3, and 1 gate in level 4. The second structure consists of
1 gate in level 1, 2 gates in level 2, 1 gate in level 3, and 1 gate in level 4. Lastly, the third
structure consists of 1 gate in level 1, 1 gate in level 2, 2 gates in level 3, and 1 gate in level
4. However, for 5 levels there is only one majority logic structure in S5,5, which consists of
only 1 gate in each level, as shown in Fig. 7.6.
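The enumeration in this example can be reproduced by listing every distribution of G gates over L levels that satisfies the two structural requirements (at most three times as many gates as the next level, and exactly one gate in the last level). The Python sketch below is an illustrative reconstruction, not the dissertation's Java implementation; `structures` is a hypothetical helper name.

```python
def structures(gates, levels):
    """Return all tuples (g_1, ..., g_L) with sum == gates, g_L == 1,
    every g_j >= 1 and g_j <= 3 * g_{j+1}."""
    results = []

    def extend(suffix, remaining):
        # `suffix` holds the gate counts already chosen for the levels
        # closest to the output; new levels are added in front of it.
        if len(suffix) == levels:
            if remaining == 0:
                results.append(tuple(suffix))
            return
        levels_left = levels - len(suffix)
        # Leave at least one gate for each level still to be filled.
        upper = min(3 * suffix[0], remaining - (levels_left - 1))
        for g in range(1, upper + 1):
            extend([g] + suffix, remaining - g)

    if levels >= 1 and gates >= levels:
        extend([1], gates - 1)  # the last level always has one gate
    return sorted(results, reverse=True)


for levels in (3, 4, 5):
    print(levels, structures(5, levels))
```

For 5 gates this yields three structures for 3 levels, three for 4 levels, and one for 5 levels, listed in the same order as in the text.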
Figure 7.4: Three structures for 5 majority gates and 3 levels.
Figure 7.5: Three structures for 5 majority gates and 4 levels.
Figure 7.6: The structure for 5 majority gates and 5 levels.
7.2.3: Applying Admissible Combinations
In the previous step, the standard majority logic structures are developed. However, the
inputs of the majority gates in these structures are not determined. Therefore, this process
determines the inputs for each majority gate in all levels of the majority logic structures
developed in the previous step. The process starts by determining the set of input elements, called
V. For k inputs, V consists of the variable inputs (x1, x2, .., xk), the complements
of the variable inputs (x′1, x′2, .., x′k), and logic 0 and 1. For example, for two inputs,
V can be represented by V = {x1, x2, x′1, x′2, 0, 1}. Based on V, all possible combinations
are then applied to each developed standard majority structure, subject to the following
requirements.
1. The inputs of each majority gate in the first level must be a combination of three
inputs from V.
2. The output of each majority gate in any level except the last level must be an input
of at least one majority gate in the next level.
3. The rest of inputs of each majority gate in the second level up to the last level must be
a single input, a combination of two inputs from V, or a combination of three inputs
from V depending on how many inputs of a gate are not determined.
4. Each majority gate in a structure must satisfy the following requirements.
(a) It must not have a duplicated input.
(b) It must not have both an input and its complement. This includes the case of
having both logic 0 and 1.
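For a first-level gate, requirements 1 and 4 can be illustrated by enumerating the admissible three-input combinations drawn from V. The following Python sketch is an assumption-laden illustration: it treats a combination as an unordered set of three distinct elements, and the helper names (`input_elements`, `complement`, `admissible_triples`) are hypothetical.

```python
from itertools import combinations


def input_elements(k):
    """Build V for k variables: x1..xk, their complements, and 0 and 1."""
    variables = [f"x{i}" for i in range(1, k + 1)]
    return variables + [v + "'" for v in variables] + ["0", "1"]


def complement(element):
    """Complement of an element of V (0 and 1 complement each other)."""
    if element == "0":
        return "1"
    if element == "1":
        return "0"
    return element[:-1] if element.endswith("'") else element + "'"


def admissible_triples(k):
    """Triples without duplicated inputs and without an input/complement pair."""
    return [
        triple
        for triple in combinations(input_elements(k), 3)  # no duplicates
        if not any(complement(e) in triple for e in triple)
    ]


print(input_elements(2))
print(len(admissible_triples(2)), len(admissible_triples(3)))
```

Under these assumptions a gate picks one element from each of three different complementary pairs, so there are 8 admissible triples for two variables and 32 for three.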
Following these requirements, all the admissible input cases are applied to the developed
majority structure in parallel as follows. After receiving a specific majority structure,
the first thread applies all combinations for the first majority gate at the first level. For each
combination assigned, a new thread is spawned that works similarly by applying all combinations
for the remaining majority gates in the current structure. The process continues
until the number of threads reaches its limit. Once the limit is reached, each
thread continues on its own to apply all combinations for the remaining majority
gates. For example, consider a structure with a total of 4 majority gates in which each
gate has a total of 12 combinations to be applied. The first thread assigns
a combination to the first gate and then hands it to a new thread to work on, so thread 1
creates 12 more threads for the 12 combinations. Similarly, the new threads repeat the same
process for the three remaining gates until all combinations are applied.
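The division of work among threads can be sketched with a bounded thread pool. This Python fragment is a simplified illustration and not the Java implementation referred to in this chapter: it fans out only over the first gate, lets each worker enumerate the remaining gates sequentially, and merely counts complete assignments instead of evaluating them. The name `count_assignments` and the `thread_limit` parameter are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def count_assignments(combos_per_gate, thread_limit=4):
    """Count complete input assignments for one structure.

    combos_per_gate[i] is the list of admissible combinations for gate i.
    """
    first_gate, remaining_gates = combos_per_gate[0], combos_per_gate[1:]

    def worker(first_choice):
        # Each worker keeps one choice for the first gate fixed and walks
        # through every combination of the remaining gates on its own.
        return sum(1 for _ in product(*remaining_gates))

    # The pool size plays the role of the thread limit mentioned in the text.
    with ThreadPoolExecutor(max_workers=thread_limit) as pool:
        return sum(pool.map(worker, first_gate))


# Example from the text: 4 gates with 12 combinations each.
gates = [list(range(12)) for _ in range(4)]
total = count_assignments(gates)
print(total)
```

For the 4-gate example the count is 12^4 = 20736 complete assignments; in an actual synthesis run each worker would evaluate the majority function of every assignment rather than just count it.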
7.2.4: Finding Equivalent Majority Functions
After each input case is applied to the developed majority structure, the next step is
to verify whether it is a solution for any of the input Boolean functions. This is done by
comparing the output terms of the resulting majority function with the output
terms of each Boolean function Ti : {i → 1, 2, .., n} to see whether they match any of these Ti’s.
The verified majority functions for the ith Boolean function are then stored in
Mi. After all admissible input cases have been applied to a majority structure, if one or more of the
input Boolean functions do not have solutions, the process moves to the next structure and
repeats the steps. A Boolean function that already has an equivalent majority expression can obtain
more solutions from the remaining structures. However, this Boolean function
must be removed from verification once all structures in SG,L have been covered. In other words,
a Boolean function can have another solution from the next structure only if the numbers of
majority gates and levels in the current and next structures are the same. Otherwise,
the process stops looking for more solutions for that function. The process continues until it has
covered all structures of a set SG,L in which every input Boolean function has at least one
equivalent majority expression.
7.2.5: Redundancy Removal
Since the output of a majority gate in all levels except the last level of a generated
majority network must be an input for at least one majority gate in the next level, the
initial obtained majority expressions are not optimal. This is because of the redundancies
that must be removed. The process of redundancy removal given in [60] is used. This
process is done in several steps to optimize and simplify the complexity of all obtained
majority expressions.
The process starts by removing repeated nodes, which is done by comparing the original
form of each node and its complementary form with the rest of nodes in the majority ex-
pression. The second step is to simplify all nodes in the majority network with duplicated
inputs. The next step is to eliminate nodes without any majority gate. However, this step is
only applied on internal nodes. In other words, it is not allowed to eliminate primary output
nodes of the majority network. The last step is to reduce the number of inverters. This step
is used in case of having two cascaded inverters, which can cancel each other out. Another
case is to have a majority gate with two or more internal inverters, which can be factored to
have only one external inverter.
The process of redundancy removal may need more than one iteration to obtain the
optimal majority logic network especially while processing a large circuit. Therefore, the
process must be repeated until no further simplifications and reductions are possible. Further
details of the redundancy removal process can be found in [60].
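The inverter reductions mentioned above rest on simple identities of the majority function, which can be checked exhaustively. The Python sketch below illustrates only these identities and is not the redundancy removal procedure of [60]; the function names are illustrative.

```python
from itertools import product


def maj3(a, b, c):
    """Three-input majority gate."""
    return (a & b) | (a & c) | (b & c)


def inv(x):
    """Inverter on a single bit."""
    return 1 - x


def check_inverter_identities():
    """Verify the identities used to cancel or factor out inverters."""
    for a, b, c in product((0, 1), repeat=3):
        # Two cascaded inverters cancel each other out.
        if inv(inv(a)) != a:
            return False
        # Three internal inverters equal one external inverter.
        if maj3(inv(a), inv(b), inv(c)) != inv(maj3(a, b, c)):
            return False
        # Two internal inverters and a constant: complement the constant
        # and move a single inverter to the output.
        if maj3(inv(a), inv(b), 0) != inv(maj3(a, b, 1)):
            return False
        if maj3(inv(a), inv(b), 1) != inv(maj3(a, b, 0)):
            return False
    return True


print(check_inverter_identities())
```

All of these follow from the self-duality of the majority function, that is, complementing every input complements the output.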
7.2.6: Choosing the Optimal Majority Logic Network
In most cases, every input Boolean function has more than one equivalent simplified
majority expression. Therefore, the last process in the proposed synthesis method is to
choose the most optimized majority expression for each input Boolean function from the
corresponding sets of equivalent majority expressions Mi : {i → 1, 2, .., n}. The final majority
expressions are chosen such that they have the minimum total number of majority gates and
the minimum total number of inverters. By completing this process, the optimal majority
logic network N is generated. As a result, the equivalent optimal majority logic
network for any Boolean functions network can be obtained from the proposed method. The
proof of the optimality of the obtained majority networks is given in the Appendix.
The proposed methodology has been implemented in Java. The package contains approximately
2500 lines of Java code [88]. A simple example is presented to illustrate the steps of
the proposed method. Consider the benchmark circuit b1 from the Microelectronics Center of
North Carolina (MCNC) benchmark suite. This circuit has three primary inputs a, b,
and c and four primary outputs {d}, {e}, {f}, and {g}. The functionality of this circuit can
be expressed by
\[
\begin{aligned}
\{d\} &= c \\
\{e\} &= a'b + ab' \\
\{f\} &= abc' + a'b'c \\
\{g\} &= c'
\end{aligned}
\tag{7.3}
\]
Since the outputs {d} and {g} each consist of a single input variable, there is no need to
use majority gates for their realization. The obtained results for these outputs are shown in
Figs. 7.7 and 7.8, respectively.
Figure 7.7: Obtained majority expression for the output {d}.
Figure 7.8: Obtained majority expression for the output {g}.
For outputs {e} and {f}, we first generate their output terms, which are T1 = (0, 0, 1, 1, 1,
1, 0, 0) and T2 = (0, 1, 0, 0, 0, 0, 1, 0), respectively. In the second step, the standard majority
logic structures are constructed starting from the minimum number of majority gates and
levels by considering either gates or levels as the first priority. In this example, both gate and
level priorities result in the same solution. The set of standard majority structures for the
Boolean function {e} is constructed with 3 gates and 2 levels, which gives only one element
(denoted by S3,2). For the Boolean function {f}, the set of standard majority structures
also contains only one element, which consists of 4 gates and 2 levels (denoted by S4,2). After
each majority structure is constructed, all possible input combinations based on V are
applied. The three inputs, the complements of the three inputs, and logic 0 and 1 determine
the elements of V, i.e., V = {a, b, c, a′, b′, c′, 0, 1}. After applying the input combinations, sets
of majority expressions M1 and M2 are obtained for the outputs {e} and {f}, respectively.
For M1, six equivalent majority expressions are obtained, i.e., M1 = {e0, e1, e2, e3, e4, e5}, as
shown in Fig. 7.9. These functions can be expressed by
\[
\begin{aligned}
e_0 &= M(a', M(a, b', 0), M(a, b, 1)) \\
e_1 &= M(b', M(a', b, 0), M(a, b, 1)) \\
e_2 &= M(0, M(a, b, 0)', M(a, b, 1)) \\
e_3 &= M(1, M(a, b', 0), M(a', b, 0)) \\
e_4 &= M(a, M(a, b, 0)', M(a', b, 0)) \\
e_5 &= M(b, M(a, b', 0), M(a, b, 0)')
\end{aligned}
\tag{7.4}
\]
Figure 7.9: Obtained majority expressions for the output {e}.
It can be seen that all the expressions consist of three majority gates and two levels. For
M2, six majority expressions are also obtained, i.e., M2 = {f0, f1, f2, f3, f4, f5}, as shown
in Fig. 7.10. These expressions are given by
\[
\begin{aligned}
f_0 &= M(M(b, c, 0)', M(a, b, 0), M(a', c, 0)) \\
f_1 &= M(M(a, c, 0)', M(a, b, 0), M(b', c, 0)) \\
f_2 &= M(M(b, c', 0), M(a, b', 1), M(a', c, 0)) \\
f_3 &= M(M(b, c', 0), M(a, c, 1), M(a, b, 1)') \\
f_4 &= M(M(b, c, 1), M(a, c', 0), M(a, b, 1)') \\
f_5 &= M(M(b', c, 0), M(a', b, 1), M(a, c', 0))
\end{aligned}
\tag{7.5}
\]
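The equivalence check described in Section 7.2.4 can be illustrated on this example by evaluating each expression of (7.4) and (7.5) over all input cases and comparing the result with T1 and T2. The Python sketch below is an illustration only; it assumes the output terms are ordered with a as the most significant input (abc = 000, 001, ..., 111), and the names `M`, `n`, `E`, `F`, and `output_terms` are chosen here for convenience.

```python
from itertools import product


def M(a, b, c):
    """Three-input majority gate."""
    return (a & b) | (a & c) | (b & c)


def n(x):
    """Complement of a single bit."""
    return 1 - x


E = {  # expressions of (7.4)
    "e0": lambda a, b, c: M(n(a), M(a, n(b), 0), M(a, b, 1)),
    "e1": lambda a, b, c: M(n(b), M(n(a), b, 0), M(a, b, 1)),
    "e2": lambda a, b, c: M(0, n(M(a, b, 0)), M(a, b, 1)),
    "e3": lambda a, b, c: M(1, M(a, n(b), 0), M(n(a), b, 0)),
    "e4": lambda a, b, c: M(a, n(M(a, b, 0)), M(n(a), b, 0)),
    "e5": lambda a, b, c: M(b, M(a, n(b), 0), n(M(a, b, 0))),
}
F = {  # expressions of (7.5)
    "f0": lambda a, b, c: M(n(M(b, c, 0)), M(a, b, 0), M(n(a), c, 0)),
    "f1": lambda a, b, c: M(n(M(a, c, 0)), M(a, b, 0), M(n(b), c, 0)),
    "f2": lambda a, b, c: M(M(b, n(c), 0), M(a, n(b), 1), M(n(a), c, 0)),
    "f3": lambda a, b, c: M(M(b, n(c), 0), M(a, c, 1), n(M(a, b, 1))),
    "f4": lambda a, b, c: M(M(b, c, 1), M(a, n(c), 0), n(M(a, b, 1))),
    "f5": lambda a, b, c: M(M(n(b), c, 0), M(n(a), b, 1), M(a, n(c), 0)),
}


def output_terms(fn):
    """Truth-table column of fn for abc = 000, 001, ..., 111."""
    return tuple(fn(a, b, c) for a, b, c in product((0, 1), repeat=3))


T1 = (0, 0, 1, 1, 1, 1, 0, 0)  # output terms of {e}
T2 = (0, 1, 0, 0, 0, 0, 1, 0)  # output terms of {f}

assert all(output_terms(fn) == T1 for fn in E.values())
assert all(output_terms(fn) == T2 for fn in F.values())
```

Every one of the twelve expressions reproduces its target output terms, so each is an admissible member of M1 or M2 under the stated input ordering.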
Figure 7.10: Obtained majority expressions for the output {f}.
It can be seen that each of these expressions consists of four majority gates and two
levels. From M1 and M2, if we choose any two expressions the number of levels will be the
same. However, the number of majority gates and the number of inverters are not the same
in all choices. There are five choices that have the minimum number of majority gates and
inverters which are (e2, f0), (e2, f1), (e2, f3), (e4, f0), (e4, f1) and (e5, f1). Each of these
choices has six gates and three inverters. However, since the output {g} is the complement
of c and the only choice that has the complement of c is (e2, f3), these expressions will be
chosen as the final solution as shown in Fig. 7.11. This network can be expressed as given
in (7.6), where n0 is an internal node.
\[
\begin{aligned}
\{d\} &= c \\
\{e\} &= M(0, M(a, b, 0)', n_0) \\
\{f\} &= M(M(b, c', 0), M(a, c, 1), n_0') \\
\{g\} &= c' \\
n_0 &= M(a, b, 1)
\end{aligned}
\tag{7.6}
\]
Figure 7.11: Final majority network for b1.
Thus, the obtained majority network for the benchmark circuit b1 requires six majority
gates, two levels, and three inverters.
7.3: Results and Comparison
In this section, the results obtained for 15 MCNC benchmarks [66] using the proposed
method are compared with those of the best existing majority synthesis methods in [55, 60].
The results obtained using “Gate Priority” and “Level Priority” are
shown in Tables 7.1 and 7.2, respectively.
In Tables 7.1 and 7.2, the first column lists the names of the 15 benchmarks. The columns
under the title “AND/OR mapping” show the results obtained using the one-to-one majority
AND/OR mapping method. The columns titled “Method [60]” show the benchmark
results obtained from [60] when targeted to optimize either the number of majority gates or
levels. The columns under the title “Method [55]” show the results obtained from [55].
The columns under the title “Proposed method” show the results obtained from the proposed
synthesis method using either “Gate Priority” or “Level Priority”. The percentage reductions
in gate and level counts obtained by the best existing methods and the proposed method,
compared to the majority AND/OR mapping method, are given in the columns under the
title “Reduction %”.
Table 7.1: Comparison of 15 MCNC benchmarks using two existing methods and proposed
method “Gate Priority”
Column groups, each giving Gate then Level: AND/OR mapping | Method [60] (gate priority) | Method [55] | Proposed method (gate priority) | Reduction % for Method [60] | Reduction % for Method [55] | Reduction % for Proposed Method
Benchmark Gate Level Gate Level Gate Level Gate Level Gate Level Gate Level Gate Level
b1 9 3 6 2 7 3 6 2 33.3% 33.3% 22.2% 33.3% 33.3% 33.3%
cm42a 21 2 18 2 18 2 18 2 14.3% 0.0% 14.3% 0.0% 14.3% 0.0%
decod 28 3 28 3 28 3 28 3 0.0% 0.0% 0.0% 0.0% 0.0% 0.0%
cm82a 50 7 6 3 7 3 6 3 88.0% 57.1% 86.0% 57.1% 88.0% 57.1%
majority 12 5 5 4 6 4 5 4 58.3% 20.0% 50.0% 20.0% 58.3% 20.0%
x2 49 6 34 7 37 7 34 6 30.6% -16.7% 24.5% -16.7% 30.6% 0.0%
cm152a 31 5 17 6 21 6 15 5 45.2% -20.0% 32.3% -20.0% 51.6% 0.0%
cm85a 80 10 19 6 26 6 14 6 76.3% 40.0% 67.5% 40.0% 82.5% 40.0%
cm151a 56 8 20 7 23 7 15 5 64.3% 12.5% 58.9% 12.5% 73.2% 37.5%
cm162a 57 7 36 9 41 7 32 7 36.7% -28.6% 28.1% 0.0% 43.9% 0.0%
cu 61 8 39 7 40 7 37 6 36.1% 12.5% 34.4% 12.5% 41.0% 25.0%
cm163a 52 7 32 7 38 7 29 7 38.5% 0.0% 19.2% -28.6% 46.2% 0.0%
cmb 44 4 26 4 28 4 26 4 40.9% 0.0% 36.4% 0.0% 40.9% 0.0%
pm1 49 6 32 6 35 6 30 6 34.7% 0.0% 28.6% 0.0% 38.8% 0.0%
mux 55 7 37 6 46 9 35 6 32.7% 14.3% 16.4% -28.6% 36.4% 14.3%
Average reduction% 42.0% 8.3% 34.6% 5.4% 45.3% 15.1%
In Table 7.1, it can be seen that when the priority is to reduce the number of majority
gates, the proposed method produces better results in terms of gate counts for 8 benchmarks.
It also produces better results in terms of level counts for 5 benchmarks. For the remaining
benchmarks, the obtained results are the same as those obtained from the best existing
method. There is an average reduction of 45.3% in the number of gates and 15.1% in
the number of levels, compared to the majority AND/OR mapping method, whereas the best
existing method has an average reduction of 42.0% in the number of gates and 8.3% in the
number of levels.
Table 7.2: Comparison of 15 MCNC benchmarks using two existing methods and proposed
method “Level Priority”
Column groups, each giving Level then Gate: AND/OR mapping | Method [60] (level priority) | Method [55] | Proposed method (level priority) | Reduction % for Method [60] | Reduction % for Method [55] | Reduction % for Proposed Method
Benchmark Level Gate Level Gate Level Gate Level Gate Level Gate Level Gate Level Gate
b1 3 9 2 6 3 7 2 6 33.3% 33.3% 33.3% 22.2% 33.3% 33.3%
cm42a 2 21 2 18 2 18 2 18 0.0% 14.3% 0.0% 14.3% 0.0% 14.3%
decod 3 28 3 28 3 28 3 28 0.0% 0.0% 0.0% 0.0% 0.0% 0.0%
cm82a 7 50 3 6 3 7 3 6 57.1% 88.0% 57.1% 86.0% 57.1% 88.0%
majority 5 12 3 6 4 6 3 6 40.0% 50.0% 20.0% 50.0% 40.0% 50.0%
x2 6 49 6 36 7 37 5 36 0.0% 26.5% -16.7% 24.5% 16.7% 26.5%
cm152a 5 31 6 17 6 21 4 16 -20.0% 45.2% -20.0% 32.3% 20.0% 48.4%
cm85a 10 80 6 19 6 26 6 14 40.0% 76.3% 40.0% 67.5% 40.0% 82.5%
cm151a 8 56 7 20 7 23 4 16 12.5% 64.3% 12.5% 58.9% 50.0% 71.4%
cm162a 7 57 7 41 7 41 6 36 0.0% 28.1% 0.0% 28.1% 14.3% 36.8%
cu 8 61 6 40 7 40 5 36 25.0% 34.4% 12.5% 34.4% 37.5% 39.3%
cm163a 7 52 7 32 7 38 6 28 0.0% 38.5% -28.6% 19.2% 14.3% 44.2%
cmb 4 44 4 26 4 28 4 26 0.0% 40.9% 0.0% 36.4% 0.0% 40.9%
pm1 6 49 6 32 6 35 6 30 0.0% 34.7% 0.0% 28.6% 0.0% 38.8%
mux 7 55 6 37 9 46 5 37 14.3% 32.7% -28.6% 16.4% 28.6% 32.7%
Average reduction% 13.5% 40.5% 5.4% 34.6% 23.5% 43.1%
From Table 7.2, when targeted to optimize the level counts, the obtained results show
that the proposed method produces better results in terms of level counts for 7 benchmarks
and for 7 benchmarks in terms of gate counts as well. The results show that there is an
average reduction of 23.5% in the number of levels and an average reduction of 43.1% in
the number of gates, as compared to majority AND/OR mapping method, while the best
existing method has an average reduction of 13.5% and 40.5% in the number of levels and
gates, respectively.
The results given in Tables 7.1 and 7.2 indicate that the capability of the
proposed method to process n-feasible networks and generate the equivalent majority
networks based on the process of constructing standard majority logic structures can result
in better majority networks in terms of gates and/or levels compared to the best existing
majority/minority logic synthesis methods.
7.4: Conclusion
In this chapter, we proposed a comprehensive method to generate an optimal majority
logic network for any arbitrary multi-output Boolean function network. The main step in
this method is the construction of standard majority logic structures, starting from
either the minimum number of majority gates or the minimum number of levels, depending on
the selected priority. Each input case is then applied to the developed majority structure
and checked to see whether it is a solution. A redundancy removal method is also used to
simplify the obtained majority expressions. Since the proposed method yields more than one
equivalent majority network for an input Boolean function network, a method to choose the
best solution is developed. The results for 15 MCNC benchmarks showed that the proposed
method gives results that are better than or, in some cases, equal to those of the best
existing techniques in terms of majority gates and/or levels. Since the proposed method
generates majority networks by applying all admissible combinations, the time complexity
becomes an issue, especially when synthesizing large circuits. By using De Morgan's
theorem, any majority logic network can be easily converted into its equivalent minority
logic network. Therefore, the proposed method can be used for any majority- or
minority-based technologies such as QCA, SET and TPL.
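The majority-to-minority conversion mentioned above can be illustrated with a small exhaustive check. This is only a sketch: the text says the conversion uses De Morgan's theorem, and the specific identity tested here, M(a, b, c) = m(a', b', c') with m denoting the minority function, is the standard self-duality property of the majority function rather than a formula quoted from this chapter. The names `maj` and `minority` are hypothetical.

```python
from itertools import product


def maj(a: int, b: int, c: int) -> int:
    """Three-input majority function on 0/1 values."""
    return (a & b) | (a & c) | (b & c)


def minority(a: int, b: int, c: int) -> int:
    """Three-input minority function: the complement of majority."""
    return 1 - maj(a, b, c)


# A majority gate equals a minority gate fed with complemented inputs,
# so each majority gate maps to exactly one minority gate.
identity_holds = all(
    maj(a, b, c) == minority(1 - a, 1 - b, 1 - c)
    for a, b, c in product((0, 1), repeat=3)
)
print(identity_holds)
```

Because the replacement is gate for gate, the gate and level counts of the converted network stay the same; only the inverter placement changes.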
CHAPTER 8: DESIGN OF GENERALIZED PIPELINE CELLULAR ARRAY IN QCA
Portions of this chapter were reprinted or adapted from: Amjad Almatrood and Harpreet
Singh, “Design of generalized pipeline cellular array in quantum-dot cellular automata,”
IEEE Computer Architecture Letters, 2017, DOI 10.1109/LCA.2017.2719021. [83]
8.1: Introduction
Computer arithmetic will always be a topic of interest to the computer architecture
community. With the dawn of VLSI, cellular arrays are becoming more and more important.
The implementation of various arithmetic units in QCA has been investigated, and several
papers have introduced QCA architectures for different arithmetic circuits such as a
squarer [79], a square rooting circuit [80], a divider [76], and a multiplier [89].
However, to the best of the authors' knowledge, there has been no prior work on the
implementation of a universal arithmetic unit that can perform all the basic operations.
In this chapter, a QCA design for a generalized pipeline cellular array (GPCA) is
presented. Unlike existing QCA arithmetic units, which were developed to perform a limited
set of arithmetic operations, the proposed QCA array can perform all the basic arithmetic
operations such as squaring, square rooting, division, multiplication, etc., using a
single type of arithmetic cell. In addition, the proposed QCA design can be extended in a
pipeline manner to perform the operations for any number of bits.
8.2: Generalized Pipeline Cellular Array
A generalized pipeline cellular array is an arithmetic processor that can perform all the
basic arithmetic operations such as multiplication, division, squaring, and square rooting
[90, 91]. In this array, a controlled adder-subtractor arithmetic cell and a control logic
cell are used as the basic cells. The arithmetic cell has six inputs, i.e., X, A, B, C, C1
and Fi, and four outputs, i.e., S, C0, D and E, as shown in Fig. 8.1(a). The Boolean
expressions for the arithmetic cell as given in [90, 91] can be defined by
\[
\begin{aligned}
S   &= [A \oplus (B \oplus X) \oplus C_1]\,F_i + A F_i' \\
C_0 &= (B \oplus X)(A + C_1) + A C_1 \\
D   &= BC + C F_i = C(B + F_i) \\
E   &= B + C F_i = (B + C)(B + F_i)
\end{aligned}
\tag{8.1}
\]
Figure 8.1: Basic cells: (a) Controlled adder-subtractor cell (b) Control cell
The control cell has three inputs, i.e., X, C0 and Pi, and one output, i.e., Fi, as
shown in Fig. 8.1(b). The Boolean function for the control cell can be defined by
\[
F_i = C_0 X + P_i X' \tag{8.2}
\]
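The two cell definitions in (8.1) and (8.2) can be transcribed directly into code. The following is a minimal behavioral sketch, not the QCA implementation; the function names `arithmetic_cell` and `control_cell` and the argument order are assumptions made for illustration.

```python
def arithmetic_cell(x: int, a: int, b: int, c: int, c1: int, f: int):
    """Controlled adder-subtractor cell of Eq. (8.1); all arguments are 0 or 1.

    Returns the tuple (S, C0, D, E).
    """
    bx = b ^ x                               # B xor X: X = 1 complements B
    s = ((a ^ bx ^ c1) & f) | (a & (1 - f))  # sum when F = 1, pass A when F = 0
    c0 = (bx & (a | c1)) | (a & c1)          # carry out
    d = c & (b | f)
    e = b | (c & f)
    return s, c0, d, e


def control_cell(x: int, c0: int, p: int) -> int:
    """Control cell of Eq. (8.2): selects C0 when X = 1 and P when X = 0."""
    return (c0 & x) | (p & (1 - x))


# With X = 0 and F = 1 the S and C0 outputs behave as a full adder: 1 + 1 + 0 = 10b.
adder_outputs = arithmetic_cell(0, 1, 1, 0, 0, 1)
# With F = 0 the cell passes A through to S unchanged.
passthrough_s = arithmetic_cell(0, 1, 1, 0, 0, 0)[0]
print(adder_outputs, passthrough_s)
```

Reading (8.2) this way makes the role of X visible: the control cell forwards the carry C0 when X = 1 and the external input Pi when X = 0.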
The layout of the generalized pipeline array is shown in Fig. 8.2. Based on the logic values
given to the inputs, the array can perform different operations. For squaring, the
array gives the square of a five-bit number. The input number is given in (P1, P2, .., P5),
and the output is produced in (S1, S2, .., S10). The inputs X and (A1, A2, .., A10) are set
to zero, whereas inputs (B1, B2, .., B7) and (C1, C2, .., C7) are given as (0, 0, 1, 1, 1, 1, 1)
and (0, 1, 0, 0, 0, 0, 0), respectively. For square rooting, the array can process a ten-bit
number. In this operation, the input number is given to (A1, A2, .., A10), and the outputs
are generated in (F1, F2, .., F5) and (S4, S5, .., S10) for the result and remainder, respectively.
The input X is given as 1, whereas inputs (P1, P2, .., P5) are given as 0's, as listed in
Table 8.1. The B and C inputs are given as in the squaring operation. The array can also
multiply two four-bit numbers given in (P2, P3, .., P5) and (B1, B2, .., B4), the
multiplier and multiplicand, respectively. The output of the multiplication is produced
in (S0, S1, .., S10). The inputs X and (A1, A2, .., A10) are set to zero, whereas the C inputs
are made equal to their corresponding B values. For division, the array can be
used to divide a seven-bit number (A1, A2, .., A7) by a four-bit number (B1, B2, .., B4). The
outputs are generated in (F1, F2, .., F5) for the result and (S4, S5, .., S7) for the remainder.
The inputs X and (P1, P2, .., P5) are given as 1 and 0's, respectively. The C inputs are made
equal to the corresponding B values. Table 8.1 gives a summary of the logic values that
should be given to the inputs for each operation. It also shows how the results are produced.
Table 8.1: Conditions of generalized pipeline array inputs for arithmetic operations
Arithmetic operation | X (input) | P (input) | A (input) | B (input) | C (input) | S (output) | F (output)
Squaring | 0 | operand | A1 = ... = A10 = 0 | B1 = B2 = 0, B3 = ... = B7 = 1 | C1 = 0, C2 = 1, C3 = ... = C7 = 0 | result | -
Square rooting | 1 | P1 = ... = P5 = 0 | operand | B1 = B2 = 0, B3 = ... = B7 = 1 | C1 = 0, C2 = 1, C3 = ... = C7 = 0 | remainder | result
Multiplication | 0 | multiplier | A1 = ... = A10 = 0 | multiplicand | C1 = B1, .., C7 = B7 | result | -
Division | 1 | P1 = ... = P5 = 0 | dividend | divisor | C1 = B1, .., C7 = B7 | remainder | result
Figure 8.2: Generalized pipeline cellular array
The generalized pipeline array can be designed to perform any arithmetic operation for
n-bit numbers by increasing the number of arithmetic and control cells. For a pipeline array
designed for squaring of an n-bit number, square rooting of a 2n-bit number, multiplication
of (n+3)/2-bit and (n+3)/2-bit numbers if n is odd, and of (n+2)/2-bit and (n+4)/2-bit
numbers if n is even, and division of an (n+2)-bit number by an (n+3)/2-bit number if n
is odd or an (n+2)/2-bit number if n is even, the array requires n(n+2) arithmetic cells
and n control cells [92, 93].
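The sizing rules above can be collected into a small helper. This is an illustrative sketch only; the function name `gpca_size` and the dictionary keys are hypothetical, and the formulas are taken verbatim from the preceding paragraph.

```python
def gpca_size(n: int) -> dict:
    """Cell counts and operand widths of a generalized pipeline array of order n."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    if n % 2 == 1:
        # n odd: both multiplication operands and the divisor are (n + 3) / 2 bits wide.
        multiply_bits = ((n + 3) // 2, (n + 3) // 2)
        divisor_bits = (n + 3) // 2
    else:
        # n even: operands are (n + 2) / 2 and (n + 4) / 2 bits; divisor is (n + 2) / 2 bits.
        multiply_bits = ((n + 2) // 2, (n + 4) // 2)
        divisor_bits = (n + 2) // 2
    return {
        "arithmetic_cells": n * (n + 2),
        "control_cells": n,
        "square_bits": n,
        "sqrt_bits": 2 * n,
        "multiply_bits": multiply_bits,
        "divide_bits": (n + 2, divisor_bits),
    }


size_n5 = gpca_size(5)
size_n2 = gpca_size(2)
print(size_n5)
print(size_n2)
```

For n = 5 this gives 35 arithmetic cells, 5 control cells, 4-bit by 4-bit multiplication, and 7-bit by 4-bit division, which matches the five-bit array described for Fig. 8.2.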
8.3: Design and Implementation
Since the fundamental logic devices in QCA are majority gates and inverters, a logic
circuit has to be realized using only majority gates and inverters in order to be
implemented in QCA. A direct method to generate the majority logic network for any Boolean
function is the majority AND/OR mapping method. This process first simplifies the Boolean
functions using traditional methods such as the Karnaugh map (K-map), reduced-unitized-table,
Shannon's decomposition principle, etc., to obtain functions expressed in one of two
standard forms: sum of products (SOP) or product of sums (POS). These expressions are then
converted into their equivalent majority networks by mapping each AND and OR operation to
a majority AND/OR gate. Even though this process is a straightforward way to produce
a majority network, it usually does not result in optimized solutions in terms of
optimization factors such as the number of gates, levels, inverters, etc. For this reason,
several methodologies for majority logic network synthesis have been investigated and
introduced. Some of these methods were developed for three-variable Boolean functions and
others for multi-variable functions. However, none of these methods can generate the
optimal solutions in terms of all optimization factors for all cases. In the previous
chapter, we introduced a new methodology for majority logic network synthesis. This method
can be used to synthesize the optimal majority logic network for any Boolean function in
terms of different optimization factors, which leads to efficient implementation of QCA
circuits. In this chapter, this synthesis method is used to produce the majority network
of the GPCA.
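The AND/OR mapping step described above can be sketched in a few lines. A two-input AND is a majority gate with one input fixed to 0, and a two-input OR is a majority gate with one input fixed to 1. The example function f = ab + a'c is a hypothetical SOP expression chosen for illustration, and the helper names are assumptions.

```python
from itertools import product


def maj(a: int, b: int, c: int) -> int:
    """Three-input majority gate on 0/1 values."""
    return (a & b) | (a & c) | (b & c)


def and_gate(a: int, b: int) -> int:
    """AND realized as a majority gate with a constant 0 input."""
    return maj(a, b, 0)


def or_gate(a: int, b: int) -> int:
    """OR realized as a majority gate with a constant 1 input."""
    return maj(a, b, 1)


def f_mapped(a: int, b: int, c: int) -> int:
    """f = ab + a'c mapped gate by gate: 3 majority gates, 2 levels, 1 inverter."""
    return or_gate(and_gate(a, b), and_gate(1 - a, c))


# Exhaustive comparison of the mapped network against the original SOP expression.
mapping_is_correct = all(
    f_mapped(a, b, c) == ((a & b) | ((1 - a) & c))
    for a, b, c in product((0, 1), repeat=3)
)
print(mapping_is_correct)
```

Each AND or OR costs one full majority gate with a wasted constant input, which is why the direct mapping tends to use more gates than a synthesis method that exploits all three inputs.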
8.3.1: Arithmetic and Control Cells
As discussed earlier, the controlled adder-subtractor arithmetic cell has six inputs, i.e.,
X, A, B, C, C1 and Fi, and four outputs, i.e., S, C0, D and E. The obtained majority network
for the arithmetic cell is shown in Fig. 8.3. This network can be expressed as given in (8.3),
where n0, n1, n2, and n3 are internal nodes.
\[
\begin{aligned}
S   &= M(M(n_0, F_i, 0), M(A, F_i', 0), 1) \\
C_0 &= n_1 \\
D   &= M(C, n_3, 0) \\
E   &= M(B, C, n_3) \\
n_0 &= M(n_1', M(n_2, A', C_1), A) \\
n_1 &= M(n_2, A, C_1) \\
n_2 &= M(M(X', B, 0), M(X, B', 0), 1) \\
n_3 &= M(B, F_i, 1)
\end{aligned}
\tag{8.3}
\]
Figure 8.3: Majority circuits of the arithmetic cell
It can be seen that the majority network for the arithmetic cell requires twelve majority
gates, six levels, and five inverters. The QCA design for the arithmetic cell is shown in Fig.
8.4. This design has a total of 712 cells, a delay of 4 clock cycles, and an area of 0.69 µm².
Figure 8.4: QCA design for the arithmetic cell
The control cell has three inputs, i.e., X, C0 and Pi, and one output Fi. The
obtained majority network is shown in Fig. 8.5. This can be expressed as
\[
F_i = M(M(X, C_0, 0), M(X', P_i, 0), 1) \tag{8.4}
\]
From the expression, it can be seen that the network requires three majority gates, two
levels, and one inverter. The delay of this circuit is 1.5 clock cycles as shown in Fig. 8.6.
The design has 79 cells and an area of 0.12 µm².
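The majority networks in (8.3) and (8.4) can be checked against the Boolean definitions in (8.1) and (8.2) by exhaustive enumeration. The sketch below is a functional check only and says nothing about QCA layout or timing; all function names are hypothetical.

```python
from itertools import product


def M(a: int, b: int, c: int) -> int:
    """Three-input majority gate on 0/1 values."""
    return (a & b) | (a & c) | (b & c)


def inv(a: int) -> int:
    """Inverter."""
    return 1 - a


def cell_boolean(x, a, b, c, c1, f):
    """Arithmetic cell outputs (S, C0, D, E) from Eq. (8.1)."""
    bx = b ^ x
    s = ((a ^ bx ^ c1) & f) | (a & inv(f))
    c0 = (bx & (a | c1)) | (a & c1)
    return s, c0, c & (b | f), b | (c & f)


def cell_majority(x, a, b, c, c1, f):
    """Arithmetic cell outputs (S, C0, D, E) from the majority network of Eq. (8.3)."""
    n2 = M(M(inv(x), b, 0), M(x, inv(b), 0), 1)  # B xor X
    n1 = M(n2, a, c1)                            # carry out
    n0 = M(inv(n1), M(n2, inv(a), c1), a)        # three-input XOR of n2, A, C1
    n3 = M(b, f, 1)                              # B + F
    s = M(M(n0, f, 0), M(a, inv(f), 0), 1)
    return s, n1, M(c, n3, 0), M(b, c, n3)


def control_boolean(x, c0, p):
    """Control cell output from Eq. (8.2)."""
    return (c0 & x) | (p & inv(x))


def control_majority(x, c0, p):
    """Control cell output from the majority expression of Eq. (8.4)."""
    return M(M(x, c0, 0), M(inv(x), p, 0), 1)


arithmetic_mismatches = [
    v for v in product((0, 1), repeat=6) if cell_boolean(*v) != cell_majority(*v)
]
control_mismatches = [
    v for v in product((0, 1), repeat=3) if control_boolean(*v) != control_majority(*v)
]
print(len(arithmetic_mismatches), len(control_mismatches))
```

Counting the `M` calls in `cell_majority` gives twelve gates and counting the `inv` calls gives five inverters, in agreement with the figures quoted for the arithmetic cell.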
Figure 8.5: Majority circuits of the control cell
Figure 8.6: QCA design for the control cell
8.3.2: Generalized Pipeline Cellular Array
Based on the QCA designs of the arithmetic and control cells, a generalized pipeline
cellular array can be designed. Fig. 8.7 shows the QCA design of a generalized pipeline
array that consists of eight arithmetic cells and two control cells. This array can perform
squaring of a 2-bit number, square rooting of a 4-bit number, multiplication of a 2-bit
number and a 3-bit number, and division of a 4-bit number by a 2-bit number. In this array,
the total number of QCA cells used is 7,521 and the area is 10.63 µm².
Figure 8.7: QCA design of a generalized pipeline array for n = 2
In the implementation of a generalized pipeline array, the delay in any level of the array
is equal to the delay in the previous level plus 3 clock cycles: 2 clock cycles for the additional
two arithmetic cells and 1 clock cycle for the control signal (X) and the output of the control
cell (Fi). The delay of the first level is 8 clock cycles. As a result, the overall delay for a
pipeline array designed for squaring of an n-bit number, square rooting of a 2n-bit number,
multiplication of (n+3)/2-bit and (n+3)/2-bit numbers if n is odd, and of (n+2)/2-bit
and (n+4)/2-bit numbers if n is even, and division of an (n+2)-bit number by an (n+3)/2-bit
number if n is odd or an (n+2)/2-bit number if n is even, can be determined by
\[
\text{Delay} = \sum_{i=1}^{n} \bigl[ 8 + 3(i - 1) \bigr] \tag{8.5}
\]
In the implemented pipeline array shown in Fig. 8.7, n is equal to 2. Thus, the overall
delay is Delay(n=2) = 19 clock cycles.
As mentioned earlier, the array can be extended to perform the arithmetic operations
for any number of bits. The QCA design for a generalized pipeline array that can perform
the squaring of a 5-bit number, square rooting of a 10-bit number, multiplication of two 4-bit
numbers, and division of a 7-bit number by a 4-bit number is shown in Fig. 8.8. This array
consists of thirty-five arithmetic cells and five control cells. The design has a total of
41,361 QCA cells and an area of 57.76 µm². In this array, n is equal to 5. Thus, the overall
delay is Delay(n=5) = 70 clock cycles. The red rectangles in Fig. 8.8 show the delay
increment in each level.
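As a supplementary check, the sum in (8.5) can be written in closed form as an arithmetic series. The closed form below is not stated in the source; it is derived here only from (8.5).

```latex
\[
\text{Delay}
  = \sum_{i=1}^{n} \bigl[ 8 + 3(i - 1) \bigr]
  = 8n + \frac{3n(n - 1)}{2}
  = \frac{n(3n + 13)}{2}
\]
```

Substituting n = 2 gives 2(19)/2 = 19 and n = 5 gives 5(28)/2 = 70, in agreement with the two delays quoted above.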
8.3.3: Generalized Pipeline Cellular Array for Specified Input/Output Pins
In this section, we propose the same generalized pipeline cellular array with an
additional circuit that leads to a reduction in the number of inputs. Fig. 8.9 shows the
generalized pipeline cellular array in the lower dashed rectangle and the additional circuit in
the upper dashed rectangle. The additional circuit is used to determine inputs (C1, C2, .., C7)
based on the value of inputs (B1, B2, .., B7) and the new input W. This can be expressed by
\[
C_i =
\begin{cases}
W' + B_i & \text{if } i = 2 \\
W B_i & \text{otherwise}
\end{cases}
\tag{8.6}
\]
Figure 8.8: QCA design of a generalized pipeline array for n = 5
From Fig. 8.2, it can be seen that the total number of inputs of the original array is thirty,
whereas the number of inputs required in the improved array is reduced to twenty-four. In
this array, the inputs for each operation are given as in the original array. However, the new
inputs (W and X) are given as 00 for squaring, 01 for square rooting, 10 for multiplication,
and 11 for division, respectively. Table 8.2 gives a summary of the input conditions for each
operation.
Table 8.2: Conditions of generalized pipeline array (specified pins) inputs for arithmetic operations
Arithmetic operation | WX (input) | P (input) | A (input) | B (input) | S (output) | F (output)
Squaring | 00 | operand | A1 = ... = A10 = 0 | B1 = B2 = 0, B3 = ... = B7 = 1 | result | -
Square rooting | 01 | P1 = ... = P5 = 0 | operand | B1 = B2 = 0, B3 = ... = B7 = 1 | remainder | result
Multiplication | 10 | multiplier | A1 = ... = A10 = 0 | multiplicand | result | -
Division | 11 | P1 = ... = P5 = 0 | dividend | divisor | remainder | result
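The input-reduction rule in (8.6) and the mode encoding of Table 8.2 can be sketched as follows. This is a behavioral illustration under the assumption that the list index 0 holds B1; the names `c_inputs` and `MODES` are hypothetical.

```python
# (W, X) encoding of each operation, as listed in Table 8.2.
MODES = {
    "squaring": (0, 0),
    "square_rooting": (0, 1),
    "multiplication": (1, 0),
    "division": (1, 1),
}


def c_inputs(w: int, b: list) -> list:
    """Derive (C1, .., C7) from W and (B1, .., B7) following Eq. (8.6).

    b[0] is B1. For i = 2 the rule is C2 = W' + B2; otherwise Ci = W * Bi.
    """
    return [
        ((1 - w) | bi) if i == 2 else (w & bi)
        for i, bi in enumerate(b, start=1)
    ]


# Squaring and square rooting use W = 0 with B = (0, 0, 1, 1, 1, 1, 1).
b_fixed = [0, 0, 1, 1, 1, 1, 1]
c_for_squaring = c_inputs(MODES["squaring"][0], b_fixed)

# Multiplication and division use W = 1, which makes every Ci equal to Bi.
b_operand = [1, 0, 1, 1, 0, 0, 0]  # hypothetical operand pattern
c_for_division = c_inputs(MODES["division"][0], b_operand)

print(c_for_squaring, c_for_division)
```

With W = 0 the circuit reproduces the fixed pattern C = (0, 1, 0, 0, 0, 0, 0) required by Table 8.1, and with W = 1 it copies B to C, so the seven C pins can be replaced by the single pin W.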
Figure 8.9: Generalized pipeline cellular array for specified input/output pins
Based on the QCA arithmetic and control cells designed previously, the generalized
pipeline array for specified input/output pins can be designed. Fig. 8.10 shows the QCA
design of the generalized pipeline array for n = 5, which can perform squaring
of a 5-bit number, square rooting of a 10-bit number, multiplication of two 4-bit numbers, and
division of a 7-bit number by a 4-bit number. This array is similar to the design shown
in Fig. 8.8. However, it has an additional circuit that leads to a reduction in the number
of inputs, as shown in the red rectangle in Fig. 8.10. This design has a total of
42,174 QCA cells and an area of 59.47 µm².
In the generalized pipeline array for specified input/output pins, the delay of the array
itself is the same as in the original design. However, a full clock cycle is required for
the additional circuit that determines the values of inputs (C1, C2, .., Ci) based on the
value of inputs (B1, B2, .., Bi) and W. As a result, the overall delay for a pipeline array
designed for squaring of an n-bit number, square rooting of a 2n-bit number, multiplication
of (n+3)/2-bit and (n+3)/2-bit numbers if n is odd, and of (n+2)/2-bit and (n+4)/2-bit
numbers if n is even, and division of an (n+2)-bit number by an (n+3)/2-bit number if n is
odd or an (n+2)/2-bit number if n is even, can be determined by
\[
\text{Delay} = 1 + \sum_{i=1}^{n} \bigl[ 8 + 3(i - 1) \bigr] \tag{8.7}
\]
Thus, the overall delay for the generalized pipeline cellular array with n = 5 shown in
Fig. 8.10 is Delay(n=5) = 71 clock cycles.
Figure 8.10: QCA design of generalized pipeline array for specified input/output pins for n = 5
8.4: Simulation Results and Comparison
In this section, the simulation results for the proposed design of the generalized pipeline
array are presented. The designs and simulations were done using QCADesigner version
2.0.3 [15]. The layer properties used in the design are as follows: the cell area is
18 nm × 18 nm, and the diameter of the dots is 5 nm. The parameters used for the bistable
approximation simulation engine are as follows: the number of samples is 12800, convergence
tolerance is 0.001, radius of effect is 65 nm, relative permittivity is 12.9, clock high and clock
low are 9.8 × 10^-22 J and 3.8 × 10^-23 J, respectively, clock amplitude factor is 2, layer
separation is 11.5 nm, and maximum iterations per sample is 100.
In Fig. 8.11, the simulation results after several iterations for squaring, square-rooting,
multiplication, and division operations using the proposed generalized pipeline array for
n = 5 are given.
Figure 8.11: Simulation results for (a) Squaring of 111 (b) Square rooting of 1000000 (c)
Multiplication of 1110 and 1001 (d) Division of 101101 by 101
For the squaring operation, the input is given in the P's as 111 and the result is obtained
in the S's as 110001, as shown in Fig. 8.11(a). For square rooting, the input is given in
the A's as 1000000 and the outputs are obtained in the F's as 1000 for the root and in the
S's as 0 for the remainder, as shown in Fig. 8.11(b). For multiplication, Fig. 8.11(c) shows
the result for the inputs P's and B's given as 1110 and 1001, respectively; the product is
obtained in the S's as 1111110. For division, the inputs are given in the A's as 101101 for
the dividend and in the B's as 101 for the divisor, and the outputs are produced in the F's
as 1001 and in the S's as 0 for the quotient and remainder, respectively, as shown in
Fig. 8.11(d).
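The four simulated cases can be sanity-checked with ordinary integer arithmetic. The sketch below only confirms that the reported binary outputs are arithmetically correct; it does not model the QCA array, and the variable names are hypothetical.

```python
from math import isqrt


def to_int(bits: str) -> int:
    """Interpret a string of 0s and 1s as an unsigned binary number."""
    return int(bits, 2)


# (a) Squaring of 111
square = to_int("111") ** 2

# (b) Square rooting of 1000000: integer root and remainder
radicand = to_int("1000000")
root = isqrt(radicand)
root_remainder = radicand - root * root

# (c) Multiplication of 1110 and 1001
product_value = to_int("1110") * to_int("1001")

# (d) Division of 101101 by 101: quotient and remainder
quotient, div_remainder = divmod(to_int("101101"), to_int("101"))

print(bin(square), bin(root), root_remainder)
print(bin(product_value), bin(quotient), div_remainder)
```

In decimal the cases are 7² = 49, √64 = 8 with remainder 0, 14 × 9 = 126, and 45 ÷ 5 = 9 with remainder 0.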
Several QCA designs of different arithmetic circuits have been introduced in the literature
[76, 79, 80, 89]. However, these basic circuits can generally perform only a single
operation. Table 8.3 gives a functional and specification comparison of these circuits and
the proposed QCA design of the generalized pipeline array.
Table 8.3: Comparison of the proposed GPCA and different QCA designs
Array | ADD | SUB | SQ | SQR | DIV | MUL | Latency (clock cycles) | Area (µm²) | Cell count
4-bit squarer [79] | no | no | yes | no | no | no | 7 | 0.53 | 552
8-bit square root [80] | no | no | no | yes | no | no | 6.25 | 14.85 | 5,272
3 × 3 divider [89] | no | no | no | no | yes | no | 37 | 15 | 6,451
4 × 4 divider [76] | no | no | no | no | yes | no | 47.25 | 10.95 | 6,865
4 × 4 multiplier [89] | no | no | no | no | no | yes | 14 | 5.15 | 2,956
4-bit processor (ALU) [81] | yes | yes | no | no | no | no | 11 | 14.27 | -
4-bit ALFG [82] | yes | yes | no | no | no | yes | 9 | 11.37 | 35,596
Proposed GPCA (n = 2) | yes | yes | yes | yes | yes | yes | 20 | 11.43 | 7,905
Proposed GPCA (n = 3) | yes | yes | yes | yes | yes | yes | 34 | 23.35 | 15,098
Proposed GPCA (n = 4) | yes | yes | yes | yes | yes | yes | 51 | 39.44 | 26,698
Proposed GPCA (n = 5) | yes | yes | yes | yes | yes | yes | 71 | 59.47 | 42,174
From the table, it can be observed that the latency and area of the proposed array are
comparable to those of the other basic circuits. As shown in the table, the 3 × 3 divider
[89] has a latency of 37 clock cycles, whereas the proposed array for n = 3 can perform the
division of 5 × 3 with a latency of 34 clock cycles. We have also compared the array with an
arithmetic logic unit (ALU) [81] and an arithmetic and logical function generator (ALFG)
[82]. It can be noted that the proposed array is the only unit that can perform all the
basic arithmetic operations. For comparison with other arithmetic units, the proposed
arrays for n = 2, n = 3, n = 4 and n = 5 are included in the table.
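The latencies listed for the proposed GPCA rows can be cross-checked against the delay expression for the specified-pin design. The sketch assumes that those rows report the design of Section 8.3.3, which is suggested by the n = 5 row matching its cell count and area; the function name and the dictionary of table values are illustrative.

```python
def latency_specified_pins(n: int) -> int:
    """Overall delay in clock cycles: 1 cycle for the input circuit plus the array delay.

    Level i of the array contributes 8 + 3 * (i - 1) clock cycles.
    """
    return 1 + sum(8 + 3 * (i - 1) for i in range(1, n + 1))


# Latency column of Table 8.3 for the proposed GPCA rows.
table_latency = {2: 20, 3: 34, 4: 51, 5: 71}

computed_latency = {n: latency_specified_pins(n) for n in table_latency}
print(computed_latency)
```

All four table entries agree with the formula, which supports reading the proposed rows as the specified-pin variant rather than the original array.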
8.5: Conclusion
In this chapter, a QCA design of a generalized pipeline cellular array is presented. The
equivalent majority-based networks of the basic cells in the array have been realized using
the majority logic synthesis method given in the previous chapter in order to find the optimal
networks with the minimum numbers of gates, levels, and inverters. Unlike existing QCA
arithmetic units which can perform limited operations, the proposed array can perform all
the basic operations such as squaring, square rooting, multiplication, division, etc., using
only one arithmetic cell.
CHAPTER 9: CONCLUSION
9.1: Introduction
The physical limitations of complementary metal-oxide semiconductor (CMOS) technology
have led many researchers to consider alternative technologies. Quantum-dot cellular
automata (QCA), single electron tunneling (SET), tunneling phase logic (TPL), spintronic
devices, etc., are some of the nanotechnologies that are being considered as possible
replacements for CMOS. In these nanotechnologies, the basic logic units used to implement
circuits are majority and/or minority gates. In this dissertation, we proposed algorithms for
designing various arithmetic circuits for QCA implementation. These arrays can perform
multiplication, division, squaring, square-rooting, and multi-operation. The arrays are
developed based on the fundamental logic devices of QCA and a structure of the three-input
XOR function. We have also introduced a comprehensive majority/minority logic synthesis
technique in view of majority- and/or minority-based nanotechnologies.
9.2: Summary and Conclusion
In this research, QCA designs of various single- and multi-operation arithmetic circuits
are proposed. In Chapter 2, the background information about QCA technology including
its logic devices, clocks, and crossover is given. We also reviewed majority/minority-based
nanotechnologies and the implementation of their basic logic devices. A comprehensive
review of majority/minority logic synthesis methods that can process multi-input multi-
output Boolean functions is presented. These synthesis methods are compared based on
different optimization factors.
QCA designs of various fundamental arithmetic cells are given in Chapter 3. An algorithm
for the design of these cells is described first. The algorithm consists of taking the Boolean
functions of these cells and converting them into their equivalent majority/XOR-based
expressions. The designs of these fundamental cells are used to develop QCA multiplier,
divider, squarer and square-rooting circuits as given in Chapters 4, 5, and 6, respectively.
These designs are developed to perform the arithmetic operations for any number of bits
by forming them in a pipeline manner. The basic cells of these arrays are designed based
on the fundamental logic devices in QCA and a single-layer structure of the three-input
XOR function. The designs are developed using multi-layer architecture. The proposed
arrays outperformed their counterparts in all aspects such as cell count, area, and latency.
The proposed multiplier and divider achieved reductions of 15.3% and 5.5% in cell count,
53.7% and 33.1% in area, and 47.5% and 18.8% in latency, respectively. The proposed
squaring circuits achieved a reduction of 1.1% in cell count and 41.2% in latency, whereas
the square-rooting circuits achieved a latency reduction of 41.2%, compared to the existing
designs.
In Chapter 7, a comprehensive methodology for majority/minority logic network synthesis
is proposed. This method is capable of processing any arbitrary multi-output Boolean
function to find its equivalent optimal majority logic network. This method can be used to
obtain majority logic networks with either the number of gates or the number of levels as
the first optimization priority. After generating the primary solutions, a process for
removing redundancies is then applied. In the proposed method, more than one equivalent
majority network can be obtained. However, the most optimized network is produced as the
final solution. The proposed method has been implemented in Java and the package contains
approximately 2500 lines of code. The obtained results for 15 MCNC benchmark circuits
showed that when the number of majority gates is the first priority, there is an average
reduction of 45.3% in the number of gates and 15.1% in the number of levels. They also
showed that when the number of levels is the first priority, an average reduction of 23.5% in
the number of levels and an average reduction of 43.1% in the number of gates is possible,
compared to the majority AND/OR mapping method. These results are better than
those obtained from the existing methods. The proposed synthesis method can also be used
to realize minority logic networks. A majority network can be easily converted into its
equivalent minority network by using De Morgan's theorem, which results in a minority network
with the same number of gates and levels as its equivalent majority network. Therefore,
the proposed method can be used for any majority- or minority-based technologies such as
QCA, SET and TPL.
A QCA design for a generalized pipeline cellular array that can perform all the basic
arithmetic operations such as multiplication, division, squaring, and square rooting is pre-
sented in Chapter 8. This array can also perform A+(B×P ). By choosing P appropriately,
we can determine addition and subtraction operations. The equivalent majority logic net-
works of the arithmetic cell and control cell used in the pipeline array are generated using
the synthesis method proposed in Chapter 7. The designs of this array and its basic cells
are developed using multi-layer structures. The proposed arrays can perform all the basic
arithmetic operations for any number of bits which could be quite valuable in considering
future design of large-scale QCA circuits.
9.3: Future Work
In this dissertation, we developed algorithms which are suitable for the implementation
of QCA circuits. With these algorithms, we designed a number of arithmetic arrays which
could be implemented on QCA technology. The study has opened a number of problems for
future work.
In Chapter 2, an extensive review of majority/minority-based nanotechnologies and
synthesis methods is given. We found that other nanotechnologies are being worked out and
considered for future nano circuits. In particular, it is important to review emerging
technologies in the post-CMOS era, such as neuromorphic computing with magneto-metallic, and
compare them with the other existing technologies.
In Chapter 3, we developed an algorithm for QCA design of various single- and
multi-operation arithmetic circuits using a single-layer structure of the three-input XOR
function. We compared this algorithm with previous research works and found that it results
in lower-complexity, higher-speed, and lower-cost circuits. However, there is ample scope to
explore different approaches and study the possible implementation of other logic functions,
such as XNOR, as basic devices. In addition, it is worthwhile making an effort to improve the
algorithm so that it can optimize the number of QCA cells.
In Chapters 4, 5, and 6, we designed multiplier, divider, squarer and square-rooting
circuits in QCA using some of the existing Boolean functions. It is important to study
additional basic cells of these arrays designed by other researchers. Addition and
subtraction are the basic operations for multiplication, division, squaring, and square
rooting. In the literature, researchers are still designing better and better adders. A
rigorous study of various adders is needed along with the design of higher-speed and
lower-complexity adders and subtractors.
The majority/minority logic synthesis method proposed in Chapter 7 can realize any
Boolean function and result in an optimum number of gates and levels. This method gave
better results for 15 MCNC benchmarks in terms of gates and levels compared to the existing
methods. This method is specific to majority/minority-based nanotechnologies. However,
with the advancements in research, logic functions beyond majority/minority could possibly
be implemented as basic devices. Hence, a synthesis tool which can be used for any selected
emerging nanotechnology approach such as SET, TPL, etc., could be an efficient solution for
future nano circuit design.
In Chapter 8, we designed a generalized pipeline cellular array in QCA. This array
can perform all the basic operations such as addition, subtraction, multiplication, division,
squaring, and square rooting. In addition, the array can be extended to any number of bits for
any operation. The basic cells of this array are developed based on their equivalent majority
logic networks. It is worthwhile considering other logic functions such as XOR for the QCA
design of the basic cells used in the array.
In this research, we considered area and latency for the QCA design of different arithmetic
arrays. Recently, there has been an interest in developing low-power QCA circuits [94–98].
With the recent works, it is worthwhile considering low-power, low-complexity, and very
high-speed QCA designs of arithmetic arrays.
The implementation of QCA circuits will have a large number of applications, especially
in medical sciences. The hardware implementation of QCA circuits is a major challenge for
nanotechnologists as it needs major government funding. To date, QCA is considered
an unrealized technology and the implementation of its circuits is in its infancy.
However, several experimental basic devices such as the QCA cell, majority gate, inverter, and
wire have been fabricated [6, 99–102]. These implemented devices attempt to realize the
required bistable interacting behavior of QCA. The implementations of these devices are
classified into
• Metal-island [103–106].
• Semiconductor [107,108].
• Magnetic [32,109,110].
• Molecular [111–116].
In this research, we focused mainly on the development of algorithms for better
designs. Therefore, expanding the investigation of QCA implementations and analyzing the
different characteristics of these classes would make a worthwhile contribution to future QCA
implementation.
In conclusion, algorithms for QCA designs of various arithmetic circuits have been
developed. The study is by no means complete. Extensive research is needed in order to achieve
the objectives of future QCA nano circuit applications.
APPENDIX: PROOF OF MINIMAL MAJORITY NETWORK
The symbols that are used in our proof are defined as follows:
f input Boolean function;
s standard majority structure;
m majority expression for f obtained from s;
MG set of all majority expressions for f obtained from SG,L by “Gate Priority”;
ML set of all majority expressions for f obtained from SG,L by “Level Priority”;
M set of all majority expressions for f ;
The definitions of G,L, and SG,L are the same as given in the previous sections.
Theorem 1 For any $f$, every $m \in M_G$ is a majority expression for $f$ with the minimum
possible number of gates among all elements of $M$, and among the majority expressions
with that number of gates, $m$ has the minimum possible number of levels.
Proof: Since every element of $M_G$ is obtained from at least one element of $S_{G,L}$, every
$m \in M_G$ has the same number of gates and the same number of levels as every $s \in S_{G,L}$.
Therefore, $m$ is a minimal majority expression for $f$ if $s$ is a minimal majority structure.
According to our algorithm, when the optimization target is the number of gates, the
standard majority structures are first sorted by the number of gates, and structures with
the same number of gates are then sorted by the number of levels. The first structure in
this ordering that results in at least one majority expression in $M_G$ is then chosen.
This process gives the minimal set of structures $S_{G,L}$, since the set $M_G$ corresponding
to $S_{G^*,L^*}$ is empty whenever $G^* < G$, or $G^* = G$ and $L^* < L$. Thus, every $m \in M_G$
is a minimal majority expression for $f$ with the minimum number of gates and the minimum
number of levels for that number of gates. ∎
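Theorem 1 can also be read as a lexicographic minimality statement. As a sketch, let $g(m)$ and $l(m)$ denote the number of gates and the number of levels of a majority expression $m$; these two symbols are introduced here only for illustration and are not part of the notation defined above.

```latex
\[
  \forall m \in M_G,\ \forall m' \in M:\quad
  g(m) < g(m')
  \ \ \text{or}\ \
  \bigl( g(m) = g(m') \ \text{and}\ l(m) \le l(m') \bigr).
\]
```

Exchanging the roles of $g$ and $l$, and replacing $M_G$ with $M_L$, gives the corresponding reading of Theorem 2.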
Theorem 2 For any $f$, every $m \in M_L$ is a majority expression for $f$ with the minimum
possible number of levels among all elements of $M$, and among the majority expressions
with that number of levels, $m$ has the minimum possible number of gates.
Proof: Since every element of $M_L$ is obtained from at least one element of $S_{G,L}$, every
$m \in M_L$ has the same number of gates and the same number of levels as every $s \in S_{G,L}$.
Therefore, $m$ is a minimal majority expression for $f$ if $s$ is a minimal majority structure.
In the proposed algorithm, when the main optimization target is the number of levels, the
standard majority structures are first sorted by the number of levels, and structures with
the same number of levels are then sorted by the number of gates. The first structure in
this ordering that results in at least one majority expression in $M_L$ is then chosen.
Since the set $M_L$ corresponding to $S_{G^*,L^*}$ is empty whenever $L^* < L$, or $L^* = L$ and
$G^* < G$, every $s \in S_{G,L}$ is a minimal structure. Thus, every $m \in M_L$ is a minimal
majority expression for $f$ with the minimum number of levels and the minimum number of
gates for that number of levels. ∎
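The selection step that both proofs rely on can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the synthesis algorithm itself: the `Structure` record, the `select_structure` function, and the `yields_expression` predicate are hypothetical names introduced here, and the predicate merely stands in for the much more involved check of whether a standard majority structure results in a majority expression for f.

```python
from typing import Callable, Iterable, NamedTuple, Optional


class Structure(NamedTuple):
    """Hypothetical stand-in for a standard majority structure s."""
    name: str
    gates: int
    levels: int


def select_structure(
    structures: Iterable[Structure],
    yields_expression: Callable[[Structure], bool],
    priority: str = "gate",
) -> Optional[Structure]:
    """Return the first structure, in priority order, that realizes f."""
    if priority == "gate":
        key = lambda s: (s.gates, s.levels)   # gates first, then levels
    elif priority == "level":
        key = lambda s: (s.levels, s.gates)   # levels first, then gates
    else:
        raise ValueError("priority must be 'gate' or 'level'")
    # Every structure that sorts earlier has already been rejected, so the
    # first feasible one is minimal with respect to the chosen ordering.
    for s in sorted(structures, key=key):
        if yields_expression(s):
            return s
    return None


# Toy candidates (assumed data): the cheapest structure cannot realize f.
candidates = [
    Structure("a", gates=3, levels=3),
    Structure("b", gates=4, levels=2),
    Structure("c", gates=2, levels=2),
]
feasible = {"a", "b"}  # assumed set of structures that yield an expression


def realizes(s: Structure) -> bool:
    return s.name in feasible


best_by_gates = select_structure(candidates, realizes, "gate")
best_by_levels = select_structure(candidates, realizes, "level")
print(best_by_gates.name, best_by_levels.name)  # a b
```

In this toy data, "Gate Priority" selects structure a (3 gates, 3 levels), while "Level Priority" selects structure b (4 gates, 2 levels), which shows that the two orderings can lead to different minimal networks for the same function.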
PUBLICATIONS
1. Amjad Almatrood and Harpreet Singh, “Design of Generalized Pipeline Cellular Array
in Quantum-dot Cellular Automata”, IEEE Computer Architecture Letters, 2017, DOI
10.1109/LCA.2017.2719021.
2. Amjad Almatrood and Harpreet Singh, “A Comparative Study of Majority/Minority
Logic Circuit Synthesis Methods for Post-CMOS nanotechnologies”, Engineering, 9(10):
890, 2017.
3. Amjad Almatrood and Harpreet Singh, “QCA Circuit Design of n-Bit Non-Restoring
Binary Array Divider”, submitted for publication.
4. Amjad Almatrood and Harpreet Singh, “A Methodology for Majority/Minority Logic
Networks Synthesis”, submitted for publication.
5. Amjad Almatrood and Harpreet Singh, “Designs of Various Arithmetic Fundamental
Units in QCA”, submitted for publication.
6. Amjad Almatrood and Harpreet Singh, “Low-Complexity High-Speed QCA Array Mul-
tiplier”, submitted for publication.
7. Amjad Almatrood and Harpreet Singh, “Squaring and Square-Rooting Circuits in
Quantum-dot Cellular Automata”, submitted for publication.
8. Amjad Almatrood and Harpreet Singh, “A New Approach to the Development of Nano Digital
Circuits”, Poster presented at IEEE Humanitarian Technology Conference, July 17,
2015. (Best poster award)
9. Amjad Almatrood and Harpreet Singh, “A New Approach to the Development of Nano Digital
Circuits and Its Applications in Molecular Medicine”, 9th IEEE International
Conference on Nano/Molecular Medicine and Engineering, November 15-18, 2015:
978-1-4673-9671-4.
10. Aby K George, Amjad Almatrood, Harpreet Singh, “Design of Arithmetic and Con-
trol Cells for a DNA Binary Processor”, Proceedings on International conference on
Computational Science and Computational Intelligence (CSCI 15), pp 7-12, Dec 2015.
11. Amjad Almatrood, Aby K George, Harpreet Singh, “On the Development of Multi-
Input Multi-Output Nano Digital Circuits for Molecular Medicine”, Proceedings on
IEEE International conference on Computational Science and Computational Intelli-
gence (CSCI 15), pp 873-878, Dec 2015.
12. Twal, Rula, Harpreet Singh, Amjad Almatrood, and Aby K. George. “On the de-
velopment of Boolean algebra approach to terminal-electromagnetic-compatibility of
networks,” In Electromagnetic Compatibility (EMC), 2016 IEEE International Sym-
posium on, pp. 67-72, 2016.
13. Twal, Rula, Amjad F. Almatrood, and Harpreet Singh. “On the Development of Models
and Metrics for Safety of Soldiers,” In Proceedings of the International Conference
on Modeling, Simulation and Visualization Methods (MSV), p. 108. The Steering
Committee of The World Congress in Computer Science, Computer Engineering and
Applied Computing (WorldComp), 2016.
14. Anubhav Sharma, Shashank Kamthan, Aby K George, Amjad Almatrood, Harpreet
Singh, Harinder Pal Singh, “On the Development of Chip to Control Laser Time for
Cell-selective Arrhythmia Ablation of Heart”, Proceedings on International Conference
on Innovative Trends in Electronics Engineering (ICITEE-2016), Jan 2016.
15. Sairam Gowrisankar, Amjad Almatrood, Aby K George, Harpreet Singh, Harinder Pal
Singh, “On the development of Arithmetic Processors”, Proceedings on International
Conference on Innovative Trends in Electronics Engineering (ICITEE-2016), Jan 2016.
BIBLIOGRAPHY
[1] 2013 International Technology Roadmap for Semiconductors (ITRS), [Online]. Avail-
able: www.semiconductors.org.
[2] Craig S Lent, P Douglas Tougaw, Wolfgang Porod, and Gary H Bernstein. Quantum
cellular automata. Nanotechnology, 4(1):49, 1993.
[3] P Douglas Tougaw and Craig S Lent. Logical devices implemented using quantum
cellular automata. Journal of Applied physics, 75(3):1818–1825, 1994.
[4] C. S. Lent and P. D. Tougaw. A device architecture for computing with quantum dots.
Proceedings of the IEEE, 85(4):541–557, Apr 1997.
[5] Wolfgang Porod. Quantum-dot devices and quantum-dot cellular automata. Interna-
tional Journal of Bifurcation and Chaos, 7(10):2199–2218, 1997.
[6] GL Snider, AO Orlov, I Amlani, X Zuo, GH Bernstein, CS Lent, JL Merz, and
W Porod. Quantum-dot cellular automata: Review and recent experiments. Jour-
nal of Applied Physics, 85(8):4283–4285, 1999.
[7] K. Walus, G. A. Jullien, and V. S. Dimitrov. Computer arithmetic structures for
quantum cellular automata. In Signals, Systems and Computers, 2004. Conference
Record of the Thirty-Seventh Asilomar Conference on, volume 2, pages 1435–1439
Vol.2, Nov 2003.
[8] Takahide Oya, Tetsuya Asai, Takashi Fukui, and Yoshihito Amemiya. A majority-logic
nanodevice using a balanced pair of single-electron boxes. Journal of nanoscience and
nanotechnology, 2(3-4):333–342, 2002.
[9] T. Oya, T. Asai, T. Fukui, and Y. Amemiya. A majority-logic device using an irre-
versible single-electron box. IEEE Trans. on Nanotechnology, 2(1):15–22, Mar 2003.
[10] H. A. H. Fahmy and R. A. Kiehl. Complete logic family using tunneling-phase-logic
devices. In Microelectronics, 1999. ICM ’99. The Eleventh International Conference
on, pages 153–156, Nov 2000.
[11] Amjad Almatrood and Harpreet Singh. A comparative study of majority/minority
logic circuit synthesis methods for post-cmos nanotechnologies. Engineering, 9(10):890,
2017.
[12] Dmitri E Nikonov and Ian A Young. Overview of beyond-cmos devices and a uniform
methodology for their benchmarking. Proceedings of the IEEE, 101(12):2498–2533,
2013.
[13] Ben L. Feringa. The art of building small: From molecular switches to motors (nobel
lecture). Angewandte Chemie International Edition, 56(37):11060–11078, 2017.
[14] Alexei O Orlov, Islamshah Amlani, Geza Toth, Craig S Lent, Gary H Bernstein, and
Gregory L Snider. Experimental demonstration of a binary wire for quantum-dot
cellular automata. Applied physics letters, 74(19):2875–2877, 1999.
[15] Konrad Walus, Timothy J Dysart, Graham A Jullien, and R Arief Budiman. Qcade-
signer: A rapid design and simulation tool for quantum-dot cellular automata. IEEE
transactions on nanotechnology, 3(1):26–31, 2004.
[16] Craig S Lent and P Douglas Tougaw. A device architecture for computing with quan-
tum dots. Proceedings of the IEEE, 85(4):541–557, 1997.
[17] Gabriel Schulhof, Konrad Walus, and Graham A Jullien. Simulation of random cell
displacements in qca. ACM Journal on Emerging Technologies in Computing Systems
(JETC), 3(1):2, 2007.
[18] Aaron Gin, P Douglas Tougaw, and Sara Williams. An alternative geometry for
quantum-dot cellular automata. Journal of Applied Physics, 85(12):8281–8286, 1999.
[19] Dmitri E Nikonov, George I Bourianoff, and Tahir Ghani. Proposal of a spin torque
majority gate logic. IEEE Electron Device Letters, 32(8):1128–1130, 2011.
[20] George Bourianoff and Dmitri Nikonov. (keynote) progress, opportunities and chal-
lenges for beyond cmos information processing technologies. ECS Transactions,
35(2):43–53, 2011.
[21] Behtash Behin-Aein, Deepanjan Datta, Sayeef Salahuddin, and Supriyo Datta. Pro-
posal for an all-spin logic device with built-in memory. Nature nanotechnology,
5(4):266–270, 2010.
[22] Behtash Behin-Aein, Angik Sarkar, Srikant Srinivasan, and Supriyo Datta. Switching
energy-delay of all spin logic devices. Applied Physics Letters, 98(12):123510, 2011.
[23] Meghna G Mankalale and Sachin S Sapatnekar. Optimized standard cells for all-
spin logic. ACM Journal on Emerging Technologies in Computing Systems (JETC),
13(2):21, 2016.
[24] I. Krivorotov and D. Markovic. STT oscillators/memory. Presented at the MIND Annual
Review and NRI Benchmarking Workshop, Notre Dame, IN, USA, pages 16–18, 2017.
[25] Alexander Khitun and Kang L Wang. Nano scale computational architectures with
spin wave bus. Superlattices and Microstructures, 38(3):184–200, 2005.
[26] Kerry Bernstein, Ralph K Cavin, Wolfgang Porod, Alan Seabaugh, and Jeff Welser.
Device and architecture outlook for beyond cmos switches. Proceedings of the IEEE,
98(12):2169–2184, 2010.
[27] Stefan Klingler, Philipp Pirro, Thomas Brächer, Britta Leven, Burkard Hillebrands,
and Andrii V Chumak. Design of a spin-wave majority gate employing mode selection.
Applied Physics Letters, 105(15):152410, 2014.
[28] Odysseas Zografos, Praveen Raghavan, Luca Amarù, Bart Sorée, Rudy Lauwereins,
Iuliana Radu, Diederik Verkest, and Aaron Thean. System-level assessment and area
evaluation of spin wave logic circuits. In Nanoscale Architectures (NANOARCH), 2014
IEEE/ACM International Symposium on, pages 25–30. IEEE, 2014.
[29] Alexander Khitun and Kang L Wang. Non-volatile magnonic logic circuits engineering.
Journal of Applied Physics, 110(3):034306, 2011.
[30] MP Kostylev, AA Serga, T Schneider, B Leven, and B Hillebrands. Spin-wave logical
gates. Applied Physics Letters, 87(15):153501, 2005.
[31] Luca Amarù, Pierre-Emmanuel Gaillardon, Subhasish Mitra, and Giovanni De Micheli.
New logic synthesis as nanotechnology enabler. Proceedings of the IEEE, 103(11):2168–
2195, 2015.
[32] RP Cowburn and ME Welland. Room temperature magnetic quantum cellular au-
tomata. Science, 287(5457):1466–1468, 2000.
[33] Jinbo Zhu, Libing Zhang, Shaojun Dong, and Erkang Wang. Four-way junction-driven
dna strand displacement and its application in building majority logic circuit. ACS
nano, 7(11):10211–10217, 2013.
[34] Wei Li, Yang Yang, Hao Yan, and Yan Liu. Three-input majority logic gate and
multiple input logic circuit based on dna strand displacement. Nano letters, 13(6):2980–
2988, 2013.
[35] Aby K George and Harpreet Singh. Three-input majority gate using spatially localised
dna hairpins. Micro & Nano Letters, 12(3):143–146, 2017.
[36] Frank Schwierz. Graphene transistors. Nature nanotechnology, 5(7):487–496, 2010.
[37] Heejun Yang, Jinseong Heo, Seongjun Park, Hyun Jae Song, David H Seo, Kyung-Eun
Byun, Philip Kim, InKyeong Yoo, Hyun-Jong Chung, and Kinam Kim. Graphene barristor,
a triode device with a gate-controlled schottky barrier. Science, 336(6085):1140–
1143, 2012.
[38] Sandeep Miryala, Valerio Tenace, Andrea Calimera, Enrico Macii, Massimo Poncino,
Luca Amarù, Giovanni De Micheli, and Pierre-Emmanuel Gaillardon. Exploiting the
expressive power of graphene reconfigurable gates via post-synthesis optimization. In
Proceedings of the 25th edition on Great Lakes Symposium on VLSI, pages 39–44.
ACM, 2015.
[39] Eike Linn, Roland Rosezin, Carsten Kügeler, and Rainer Waser. Complementary
resistive switches for passive nanocrossbar memories. Nature materials, 9(5):403–406,
2010.
[40] Richard Fackenthal, Makoto Kitagawa, Wataru Otsuka, Kirk Prall, Duane Mills, Kei-
ichi Tsutsui, Jahanshir Javanifard, Kerry Tedrow, Tomohito Tsushima, Yoshiyuki
Shibahara, et al. 19.7 a 16gb reram with 200mb/s write and 1gb/s read in 27nm
technology. In Solid-State Circuits Conference Digest of Technical Papers (ISSCC),
2014 IEEE International, pages 338–339. IEEE, 2014.
[41] Shyh-Shyuan Sheu, Meng-Fan Chang, Ku-Feng Lin, Che-Wei Wu, Yu-Sheng Chen,
Pi-Feng Chiu, Chia-Chen Kuo, Yih-Shan Yang, Pei-Chia Chiang, Wen-Pin Lin, et al.
A 4mb embedded slc resistive-ram macro with 7.2 ns read-write random-access time
and 160ns mlc-access capability. In Solid-State Circuits Conference Digest of Technical
Papers (ISSCC), 2011 IEEE International, pages 200–202. IEEE, 2011.
[42] Yu-Ming Lin, Joerg Appenzeller, Joachim Knoch, and Phaedon Avouris. High-
performance carbon nanotube field-effect transistor with tunable polarities. IEEE
Transactions on Nanotechnology, 4(5):481–489, 2005.
[43] Joerg Appenzeller. Carbon nanotubes for high-performance electronics—progress and
prospect. Proceedings of the IEEE, 96(2):201–211, 2008.
[44] Keivan Navi, Amir Momeni, Fazel Sharifi, and Peiman Keshavarzian. Two novel ultra
high speed carbon nanotube full-adder cells. IEICE Electronics Express, 6(19):1395–
1401, 2009.
[45] Keivan Navi, Fazel Sharifi, Amir Momeni, and Peiman Keshavarzian. Ultra high
speed cnfet full-adder cell based on majority gates. IEICE transactions on electronics,
93(6):932–934, 2010.
[46] H. S. Miller and R. O. Winder. Majority-logic synthesis by geometric methods. IRE
Transactions on Electronic Computers, EC-11(1):89–90, Feb 1962.
[47] S. B. Akers. Synthesis of combinational logic using three-input majority gates. In
Switching Circuit Theory and Logical Design, 1962. SWCT 1962. Proceedings of the
Third Annual Symposium on, pages 149–158, Oct 1962.
[48] Saburo Muroga. Threshold logic and its applications. Wiley, New York, NY, USA,
1971.
[49] Rumi Zhang, K. Walus, Wei Wang, and G. A. Jullien. A method of majority logic
reduction for quantum cellular automata. IEEE Transactions on Nanotechnology,
3(4):443–450, Dec 2004.
[50] K. Walus, G. Schulhof, G. A. Jullien, R. Zhang, and W. Wang. Circuit design based
on majority gates for applications with quantum-dot cellular automata. In Signals,
Systems and Computers, 2004. Conference Record of the Thirty-Eighth Asilomar Con-
ference on, volume 2, pages 1354–1357 Vol.2, Nov 2004.
[51] M. R. Bonyadi, S. M. R. Azghadi, N. M. Rad, K. Navi, and E. Afjei. Logic optimization
for majority gate-based nanoelectronic circuits based on genetic algorithm. In Electrical
Engineering, 2007. ICEE ’07. International Conference on, pages 1–5, April 2007.
[52] Suresh Rai. Majority gate based design for combinational quantum cellular automata
(QCA) circuits. In 2008 40th Southeastern Symposium on System Theory (SSST),
pages 222–224. IEEE, 2008.
[53] Zhi Huo, Qishan Zhang, S. Haruehanroengra, and Wei Wang. Logic optimization for
majority gate-based nanoelectronic circuits. In 2006 IEEE International Symposium
on Circuits and Systems, pages 4 pp.–1310, May 2006.
[54] R. Zhang, P. Gupta, and N. K. Jha. Majority and minority network synthesis with
application to QCA-, SET-, and TPL-based nanotechnologies. IEEE Transactions
on Computer-Aided Design of Integrated Circuits and Systems, 26(7):1233–1245, July
2007.
[55] K. Kong, Y. Shang, and R. Lu. An optimized majority logic synthesis methodology for
quantum-dot cellular automata. IEEE Transactions on Nanotechnology, 9(2):170–183,
March 2010.
[56] P Wang, M Niamat, and S Vemuru. Minimal majority gate mapping of four-variable
functions for quantum-dot cellular automata. In Nanoelectronic Device Applications
Handbook, chapter 20, pages 263–280. CRC Press, 1st ed. Boca Raton, FL, USA, 2013.
[57] Melanie Mitchell. An introduction to genetic algorithms. MIT press, London, UK,
1998.
[58] John H Holland. Adaptation in natural and artificial systems: an introductory analysis
with applications to biology, control, and artificial intelligence. U Michigan Press, Ann
Arbor, MI, USA, 1975.
[59] Ellen M Sentovich, Kanwar Jit Singh, Luciano Lavagno, Cho Moon, Rajeev Murgai,
Alexander Saldanha, Hamid Savoj, Paul R Stephan, Robert K Brayton, and Alberto
Sangiovanni-Vincentelli. SIS: A system for sequential circuit synthesis. 1992.
[60] P. Wang, M. Y. Niamat, S. R. Vemuru, M. Alam, and T. Killian. Synthesis of ma-
jority/minority logic networks. IEEE Transactions on Nanotechnology, 14(3):473–483,
May 2015.
[61] Luca Amarù, Pierre-Emmanuel Gaillardon, and Giovanni De Micheli. Majority-
inverter graph: A novel data-structure and algorithms for efficient logic optimization.
In Proceedings of the 51st Annual Design Automation Conference, pages 1–6. ACM,
2014.
[62] Luca Amarù, Pierre-Emmanuel Gaillardon, and Giovanni De Micheli. Boolean logic
optimization in majority-inverter graphs. In Design Automation Conference (DAC),
2015 52nd ACM/EDAC/IEEE, pages 1–6. IEEE, 2015.
[63] Eleonora Testa, Mathias Soeken, Odysseas Zografos, Luca Amarù, Praveen Raghavan,
Rudy Lauwereins, Pierre-Emmanuel Gaillardon, and Giovanni De Micheli. Inversion
optimization in majority-inverter graphs. In Nanoscale Architectures (NANOARCH),
2016 IEEE/ACM International Symposium on, pages 15–20. IEEE, 2016.
[64] Luca Amarù, Pierre-Emmanuel Gaillardon, and Giovanni De Micheli. Majority-
inverter graph: A new paradigm for logic optimization. IEEE Transactions on
Computer-Aided Design of Integrated Circuits and Systems, 35(5):806–819, 2016.
[65] L Dadda. Information processing: Proceedings of the ifip congress. In Information
Processing: Proceedings of the IFIP Congress, The Netherlands: North Holland, 1963.
[66] Robert Lisanke. Logic synthesis and optimization benchmarks user guide version 2.0.
Microelectronics Center North Carolina, Tech. Rep, 1988.
[67] Vikramkumar Pudi and K Sridharan. Low complexity design of ripple carry and brent–
kung adders in qca. IEEE Transactions on Nanotechnology, 11(1):105–119, 2012.
[68] Carson Labrado and Himanshu Thapliyal. Design of adder and subtractor circuits in
majority logic-based field-coupled qca nanocomputing. Electronics Letters, 52(6):464–
466, 2016.
[69] Heumpil Cho and Earl E Swartzlander Jr. Adder and multiplier design in quantum-dot
cellular automata. IEEE Transactions on Computers, 58(6):721–727, 2009.
[70] Ismo Hanninen and Jarmo Takala. Pipelined array multiplier based on quantum-dot
cellular automata. In Circuit Theory and Design, 2007. ECCTD 2007. 18th European
Conference on, pages 938–941. IEEE, 2007.
[71] Seong-Wan Kim and Earl E Swartzlander. Parallel multipliers for quantum-dot cellular
automata. In Nanotechnology Materials and Devices Conference, 2009. NMDC’09.
IEEE, pages 68–72. IEEE, 2009.
[72] Seong-Wan Kim and Earl E Swartzlander. Multipliers with coplanar crossings for
quantum-dot cellular automata. In Nanotechnology (IEEE-NANO), 2010 10th IEEE
Conference on, pages 953–957. IEEE, 2010.
[73] Inwook Kong, Seong-Wan Kim, and Earl E Swartzlander. Design of goldschmidt
dividers with quantum-dot cellular automata. IEEE Transactions on Computers,
63(10):2620–2625, 2014.
[74] Seong-Wan Kim and Earl E Swartzlander. Restoring divider design for quantum-dot
cellular automata. In Nanotechnology (IEEE-NANO), 2011 11th IEEE Conference on,
pages 1295–1300. IEEE, 2011.
[75] Samira Sayedsalehi, Mostafa Rahimi Azghadi, Shaahin Angizi, and Keivan Navi.
Restoring and non-restoring array divider designs in quantum-dot cellular automata.
Information sciences, 311:86–101, 2015.
[76] Huanqing Cui, Li Cai, Xiaokuo Yang, Chaowen Feng, and Tao Qin. Design of non-
restoring binary array divider in quantum-dot cellular automata. Micro & Nano Let-
ters, 9(7):464–467, 2014.
[77] TN Sasamal, AK Singh, and U Ghanekar. Design of non-restoring binary array divider
in majority logic-based qca. Electronics Letters, 52(24):2001–2003, 2016.
[78] Mohammad Mohammadi, Saeid Gorgin, and Majid Mohammadi. Design of non-
restoring divider in quantum-dot cellular automata technology. IET Circuits, Devices
& Systems, 11(2):135–141, 2017.
[79] O Giannou, HT Vergos, and D Bakalis. Squarers in qca nanotechnology. In Nanotech-
nology (IEEE-NANO), 2012 12th IEEE Conference on, pages 1–6. IEEE, 2012.
[80] Mohammad Reza Jahangir, Shadi Sheikhfaal, Shaahin Angizi, Keivan Navi, and
Firdous Ahmad. Designing nanoelectronic-compatible 8-bit square root circuit by
quantum-dot cellular automata. In Nanoelectronic and Information Systems (iNIS),
2015 IEEE International Symposium on, pages 23–28. IEEE, 2015.
[81] Konrad Walus, Mike Mazur, Gabriel Schulhof, and Graham A Jullien. Simple 4-bit
processor based on quantum-dot cellular automata (qca). In Application-Specific Sys-
tems, Architecture Processors, 2005. ASAP 2005. 16th IEEE International Conference
on, pages 288–293. IEEE, 2005.
[82] Vishnu C Teja, Satish Polisetti, and Santhosh Kasavajjala. Qca based multiplexing of
16 arithmetic & logical subsystems-a paradigm for nano computing. In Nano/Micro
Engineered and Molecular Systems, 2008. NEMS 2008. 3rd IEEE International Con-
ference on, pages 758–763. IEEE, 2008.
[83] Amjad Almatrood and Harpreet Singh. Design of generalized pipeline cellular array
in quantum-dot cellular automata. IEEE Computer Architecture Letters, 2017, DOI
10.1109/LCA.2017.2719021.
[84] Luigi Dadda and Vincenzo Piuri. Pipelined adders. IEEE Transactions on Computers,
45(3):348–356, 1996.
[85] Dharma P. Agrawal. High-speed arithmetic arrays. IEEE Transactions on Computers,
C-28(3):215–224, 1979.
[86] Maurus Cappa and V Carl Hamacher. An augmented iterative array for high-speed
binary division. IEEE Transactions on Computers, C-22(2):172–175, 1973.
[87] Firdous Ahmad, Ghulam Mohiuddin Bhat, Hossein Khademolhosseini, Saeid Azimi,
Shaahin Angizi, and Keivan Navi. Towards single layer quantum-dot cellular automata
adders based on explicit interaction of cells. Journal of Computational Science, 16:8–
15, 2016.
[88] https://waynestateprod-my.sharepoint.com/personal/fo5221 wayne edu/ layouts/
15/guestaccess.aspx?folderid=048e535a8e95d4ce0984b162690a1f088&authkey=Ab
nnz GJs9FvNSV-IaJvf20.
[89] Weiqiang Liu, Earl E Swartzlander Jr, and Máire O’Neill. Design of semiconductor
QCA systems. Artech House, 2013.
[90] AK Kamal, Harpreet Singh, and DP Agrawal. A generalized pipeline array. IEEE
Transactions on Computers, C-23(5):533–536, 1974.
[91] Harpreet Singh, Dharma P Agrawal, Shashank Kamthan, and Lubna Alazzawi. On
simulation and design implementation of generalized pipeline cellular array. In Infor-
mation Science, Electronics and Electrical Engineering (ISEEE), 2014 International
Conference on, volume 3, pages 1761–1765. IEEE, 2014.
[92] Peter M Kogge. The architecture of pipelined computers. CRC Press, 1981.
[93] Kai Hwang. Computer arithmetic principles, architecture, and design. 1979.
[94] Marco Ottavi, Salvatore Pontarelli, Erik P DeBenedictis, Adelio Salsano, Sarah Frost-
Murphy, Peter M Kogge, and Fabrizio Lombardi. Partially reversible pipelined qca
circuits: combining low power with high throughput. IEEE Transactions on Nan-
otechnology, 10(6):1383–1393, 2011.
[95] X Yang, L Cai, and X Zhao. Low power dual-edge triggered flip-flop structure in
quantum dot cellular automata. Electronics letters, 46(12):825–826, 2010.
[96] Shadi Sheikhfaal, Shaahin Angizi, Soheil Sarmadi, Mohammad Hossein Moaiyeri, and
Samira Sayedsalehi. Designing efficient qca logical circuits with power dissipation
analysis. Microelectronics Journal, 46(6):462–471, 2015.
[97] Jayita Das, Syed M Alam, and Sanjukta Bhanja. Low power magnetic quantum cellular
automata realization using magnetic multi-layer structures. IEEE Journal on Emerging
and Selected Topics in Circuits and Systems, 1(3):267–276, 2011.
[98] Saket Srivastava, Sudeep Sarkar, and Sanjukta Bhanja. Estimation of upper bound of
power dissipation in qca circuits. IEEE transactions on nanotechnology, 8(1):116–127,
2009.
[99] Islamshah Amlani, Alexei O Orlov, Gregory L Snider, Craig S Lent, and Gary H
Bernstein. Demonstration of a functional quantum-dot cellular automata cell. Jour-
nal of Vacuum Science & Technology B: Microelectronics and Nanometer Structures
Processing, Measurement, and Phenomena, 16(6):3795–3799, 1998.
[100] Gary H Bernstein, Alexandra Imre, V Metlushko, A Orlov, L Zhou, L Ji, György
Csaba, and Wolfgang Porod. Magnetic qca systems. Microelectronics Journal,
36(7):619–624, 2005.
[101] Hua Qi, Sharad Sharma, Zhaohui Li, Gregory L Snider, Alexei O Orlov, Craig S
Lent, and Thomas P Fehlner. Molecular quantum cellular automata cells. electric
field driven switching of a silicon surface bound array of vertically oriented two-dot
molecular quantum cellular automata. Journal of the American Chemical Society,
125(49):15250–15259, 2003.
[102] Konrad Walus and Graham A Jullien. Design tools for an emerging soc technology:
Quantum-dot cellular automata. Proceedings of the IEEE, 94(6):1225–1244, 2006.
[103] Ravi K Kummamuru, Alexei O Orlov, Rajagopal Ramasubramaniam, Craig S Lent,
Gary H Bernstein, and Gregory L Snider. Operation of a quantum-dot cellular au-
tomata (qca) shift register and analysis of errors. IEEE Transactions on electron
devices, 50(9):1906–1913, 2003.
[104] Gary H Bernstein, Islamshah Amlani, Alexei O Orlov, Craig S Lent, and Gregory L
Snider. Observation of switching in a quantum-dot cellular automata cell. Nanotech-
nology, 10(2):166, 1999.
[105] Islamshah Amlani, Alexei O Orlov, Ravi K Kummamuru, Gary H Bernstein, Craig S
Lent, and Gregory L Snider. Experimental demonstration of a leadless quantum-dot
cellular automata cell. Applied Physics Letters, 77(5):738–740, 2000.
[106] Alexei O Orlov, Ravi K Kummamuru, Rajagopal Ramasubramaniam, Geza Toth,
Craig S Lent, Gary H Bernstein, and Gregory L Snider. Experimental demonstra-
tion of a latch in clocked quantum-dot cellular automata. Applied Physics Letters,
78(11):1625–1627, 2001.
[107] C Ungarelli, Sebastiano Francaviglia, Massimo Macucci, and G Iannaccone. Ther-
mal behavior of quantum cellular automaton wires. Journal of Applied Physics,
87(10):7320–7325, 2000.
[108] Massimo Macucci, Giuseppe Iannaccone, Sebastiano Francaviglia, and Bruno Pelle-
grini. Semiclassical simulation of quantum cellular automaton circuits. International
Journal of Circuit Theory and Applications, 29(1):37–47, 2001.
[109] György Csaba, Alexandra Imre, Gary H Bernstein, Wolfgang Porod, and Vitali
Metlushko. Nanocomputing by field-coupled nanomagnets. IEEE Transactions on
Nanotechnology, 1(4):209–213, 2002.
[110] Alexandra Imre, G Csaba, L Ji, A Orlov, GH Bernstein, and W Porod. Majority logic
gate for magnetic quantum-dot cellular automata. Science, 311(5758):205–208, 2006.
[111] Kevin Hennessy and Craig S Lent. Clocking of molecular quantum-dot cellular au-
tomata. Journal of Vacuum Science & Technology B: Microelectronics and Nanometer
Structures Processing, Measurement, and Phenomena, 19(5):1752–1755, 2001.
[112] Craig S Lent, Beth Isaksen, and Marya Lieberman. Molecular quantum-dot cellular
automata. Journal of the American Chemical Society, 125(4):1056–1063, 2003.
[113] Craig S Lent and Beth Isaksen. Clocked molecular quantum-dot cellular automata.
IEEE Transactions on Electron Devices, 50(9):1890–1896, 2003.
[114] Zhaohui Li and Thomas P Fehlner. Molecular qca cells. 2. characterization of an
unsymmetrical dinuclear mixed-valence complex bound to a au surface by an organic
linker. Inorganic Chemistry, 42(18):5715–5721, 2003.
[115] Jieying Jiao, Gary J Long, Fernande Grandjean, Alicia M Beatty, and Thomas P
Fehlner. Building blocks for the molecular expression of quantum cellular automata.
isolation and characterization of a covalently bonded square array of two ferrocenium
and two ferrocene complexes. Journal of the American Chemical Society, 125(25):7522–
7523, 2003.
[116] Yuliang Wang and Marya Lieberman. Thermodynamic behavior of molecular-scale
quantum-dot cellular automata (qca) wires and logic devices. IEEE Transactions on
Nanotechnology, 3(3):368–376, 2004.
ABSTRACT
ON THE DESIGN OF LOW-COMPLEXITY HIGH-SPEED ARITHMETIC
CIRCUITS IN QUANTUM-DOT CELLULAR AUTOMATA
NANOTECHNOLOGY
by
AMJAD ALMATROOD
December 2017
Advisor: Dr. Harpreet Singh
Major: Electrical Engineering
Degree: Doctor of Philosophy
Nanoscale arithmetic circuits are important components and they play a vital role in
future processing applications. By lowering the complexity and increasing the speed of the
arithmetic circuits, the performance of the overall system can be improved. For the last four
decades, the implementation of circuits has largely been based on complementary metal-oxide
semiconductor (CMOS) technology. However, this technology has reached its physical limi-
tations. Different emerging nanotechnologies such as quantum-dot cellular automata (QCA),
single electron tunneling (SET), and tunneling phase logic (TPL), have been considered as
major candidates for possible replacement of CMOS. In this research, our approach is to
exploit QCA technology because of its capability to implement high-density, high-speed and
low-power arithmetic circuits. QCA technology is more amenable to digital circuit design
due to its binary processing paradigm. In particular, we have developed algorithms for the
QCA designs of various single- and multi-operation arithmetic arrays. These designs include
multiplier, divider, squarer, and square-rooting arrays. Majority and/or minority logic are
the basic units used to implement circuits in promising nanotechnologies. However, an XOR
function can be constructed in QCA as a single device. The basic cells of the proposed arrays
are developed based on the fundamental logic devices in QCA and a single-layer structure
of the three-input XOR function. The proposed QCA arithmetic circuits outperform their
counterparts in view of different aspects such as cell count, area, and latency.
In this research, a comprehensive methodology for majority/minority logic synthesis is
also developed. This method is capable of processing any arbitrary multi-output Boolean
function to find its equivalent optimal majority logic network. This method is developed
to achieve different optimization goals and results in better networks in view of gates and
levels, compared to the existing methods. Based on the synthesis method, a QCA design
of generalized pipeline cellular array that can perform all the basic arithmetic operations is
proposed. All the proposed single- and multi-operation arithmetic arrays can be designed
in a pipelined manner to perform the operations for any number of bits, which could be quite
valuable when considering the future design of large-scale QCA circuits. It is hoped that the
results obtained in this study may find significant applications in a large number of hitherto
unexplored areas.
AUTOBIOGRAPHICAL STATEMENT
AMJAD ALMATROOD
EDUCATION
2017 Doctor of Philosophy in Electrical Engineering
Wayne State University
Detroit, MI, United States
2013 Master of Science in Electrical Engineering
Gannon University
Erie, PA, United States
2011 Bachelor of Science in Electrical Engineering-Electrical Power
Al Jouf University
Al Jouf, Saudi Arabia
RESEARCH AREAS
Nanotechnology, nano circuit design, low-complexity, high-speed and low-power circuit
design, quantum-dot cellular automata technology.
