VLSI Implementation of Olfactory Cortex Model by Patil, Sanjay B.




Bachelor of Science 
College of Engineering 
Poona, India 
1987 
Submitted to the Faculty of the 
Graduate College of the 
Oklahoma State University 
in partial fulfillment of 
the requirements for 
the Degree of 
MASTER OF SCIENCE 
May, 1993 
OKLAHOMA STATE UNIVERSITY 
VLSI IMPLEMENTATION OF OLFACTORY 
CORTEX MODEL 
Thesis Approved: 
Dean of the Graduate College 
PREFACE 
This thesis attempts to implement the building blocks required to realize a
biologically motivated olfactory neural model in silicon as special-purpose
hardware. The olfactory model was originally developed by R. Granger, G. Lynch, and
Ambros-Ingerson. CMOS analog integrated circuits were used for this purpose. All of
the building blocks were fabricated using the MOSIS service and tested at our site. The
results of this study can be used to realize a system-level integration of the olfactory
model.
I wish to express my gratitude to my major advisor, Dr. Chriswell Hutchens, for his
guidance, inspiration, invaluable counsel, and financial support. I appreciate the endless
time and effort he put into this work. I am also thankful to Dr. Louis Johnson, Dr. Teague,
and Dr. Richard Cummins for serving on my committee.
I wish to thank the office of Naval Ocean System Center, San Diego, for the
computing resources and financial support they provided for the project. I am also
thankful to Dr. Ramesh Sharda for his guidance and support.
My special thanks go to Dr. Patrick Shoemaker for his invaluable assistance in
this work. I would also like to acknowledge MOSIS fabrication services for fabricating
our circuits. I extend my thanks to my friends, David Born and Subbaraju Gadhiraju, for
proofreading.
Finally, my deepest appreciation is extended to my parents, brother, and sister for 
their love, support, moral encouragement, and understanding. This work is dedicated 




TABLE OF CONTENTS
Chapter
I. OLFACTION AND ELECTRONIC NEURAL NETWORKS
    Olfaction
    About Neural Networks
    "Abstract" Versus "Tightly Coupled" Neural Network Paradigms
    Hardware Implementation of "Tightly Coupled" Computational Models: A Review
    Proposal for Hardware Implementation of GLA Olfactory Model
II. OLFACTORY MODEL AND ITS HARDWARE IMPLEMENTATION
    The Bulbar-Cortical Model
        Olfactory Bulb
        Piriform Cortex
        Learning
        Multi-Sampling
    Hardware Implementation
III. SYSTEM BUILDING BLOCKS
    Glomerulus Normalization
        AGC and Offset Combined Normalizing Function
            Transconductance Multiplier
                Simulations
                Testing
            Offset Circuit
                Simulations
                Testing
        Linear Limiter with AGC Normalization Function
        Square Law Bulb Normalization Function
            Approximate Sigmoidal Function
                Simulations
                Testing
    Mitral Patch
        Simulations
        Testing
    Bi-directional Voltage/Current Buffers
        Simulations
        Testing
    Weight Matrix
        Floating Gate Avalanche Injection MOS Memory
        Metal Nitride Oxide Silicon Memory
        Dual Injector Floating Gate MOS Memory
        Floating Gate Analog Memory in Standard CMOS Process
            Memory Structure
            Field Enhanced Fowler-Nordheim Tunneling
            Programming
    Winner Take All
        Simulations
        Testing
    Tie Resolver
        Testing
    Dynamic Current Copier Integrator
        Background
        The Operating Principle of the Current Copier Integrator
        Circuit Design
            Upper Integration Limit
            Maximum Switching Frequency
            Minimum Switching Frequency
        Mechanisms of Errors
            Charge Injection
            Switch Feedthrough
            Cascade Configurations
        Simulations
        Testing
IV. CONCLUSIONS AND FUTURE PROSPECTS
REFERENCES
LIST OF TABLES
Table
I. Injector Structures
II. Truth Table
LIST OF FIGURES
Figure
1. Block Diagram of the Olfactory System
2. Flowchart of the Multi-Sampling Process
3. AGC and Offset Combined Linear Normalization Function
4. Linear Limiter with AGC Normalization Function
5. Square Law Normalization Function
6. Transconductance Multiplier
7. Demonstration of Class AB Principle with Two Transistors
8. CMOS Equivalent of Single MOS Transistor
9. Linear MOS Transconductor Principle
10. Linear MOS Transconductor Using the MOS Equivalent Pair
11. Double Pair Implementation of a Floating Voltage Source
12. DC Transfer Characteristics of a Transconductance Multiplier
13. Multiplier Output Voltage Obtained from Simulations
14. Multiplier Output Voltage Obtained from Test Results
15. Offset Summer Circuit
16. Maximum Function Circuit
17. Offset Summer DC Characteristics for Four Branches
18. Approximate Sigmoidal Function
19. AC Response of Squashing Function
20. DC Transfer Characteristics of Sigmoidal Function (Simulations)
21. DC Transfer Characteristics of Sigmoidal Function (Test Results)
22. Mitral Patch
23. DC Response of Mitral Cells
24. Transient Response of Mitral Cells
25. Bi-directional Voltage/Current Buffers
26. Block Diagram of the CC-II±
27. Bi-directional Voltage/Current Conveyors
28. Transient Response of the Current Conveyor
29. Current Conveyor AC Response
30. DC Transfer Characteristics of the CC Obtained from Simulations
31. DC Transfer Characteristics of CC Obtained from Test Results
32. Weight Matrix
33. Cross-section of the FAMOS Structure
34. Cross-section of the DIFMOS Structure
35. Electrical Equivalent Schematic of the Layout
36. Test Setup for Testing Memory Cell
37. Threshold Voltages of Un-programmed Devices
38. Threshold Voltages of Programmed Devices
39. Threshold Voltage Retention After 3 Hours
40. Threshold Voltage Retention After 130 Hours
41. Winner Take All Circuit
42. Demonstration of the Resolving WTA Inputs
43. Ties in the WTA Outputs
44. Effect of Mean on the Settling Time
45. Effect of Difference Current on the Settling Time
46. Tie Resolver
47. Basic Current Copier
48. Current Copier Integrator
49. The P-Cell
50. Transient Operation of the CCI
NOMENCLATURE 
Width to length ratio of subscripted MOSFET x 
a Coupling ratio 
Transconductance parameter of subscripted MOSFET x 
Synaptic increment of wijkl per training episode
Piriform refractory frequency facilitation threshold
θ1 Threshold to periglomerular to eliminate inhibition noise floor
θMj Threshold of the jth mitral cell
Piriform cell threshold
Channel length modulation of subscripted MOSFET x
System clock active in forward phase
System clock active in backward phase
Sub-phase of φ1 to initialize WTA
Sub-phase of φ1 to latch winning piriform cells into tie resolver
Sub-phase of φ2 to store feedback inhibition
φ22 Sub-phase of φ2 to update feedback inhibition
Voltage gain of subscripted operational amplifier x 
Bootstrap capacitor 
Gate to body capacitance 
















Gate to source capacitance 
Gate capacitance of subscripted MOSFET 
Injector capacitor 
Per unit oxide capacitance 
Subscripted diode 
Number of glomeruli in bulb patch 
Automatic Gain Controlled signal 
Maximum value of element in G'i
Glomerulus input (un-normalized) 
Maximum value of element in G*i 
Small signal drain transconductance 
Normalized glomerulus output; mitral patch input 
Small signal channel transconductance of subscripted MOSFET 
Nonlinear mapping function that maps glomeruli activity 
# of piriform cells per piriform patch 
Aggregate un-thresholded inhibition to glomerulus i 
Counting index for g 
Weighted inhibition on LOT line ij in backward direction 
Drain current of subscripted MOSFET x 
Full scale current 
Integrated and thresholded inhibition signal into glomerulus i 
Counting index for m 
Counting index for p 
Constants of non-linear function 
KG Constant to set percentage activation of the glomerulus 
Counting index for h 
m Number of mitral cells per glomerulus 
Mij LOT from jth mitral cell of the ith glomerulus
Mx Subscripted MOSFET
Oi Olfactory sensor output; olfactory system input
p Number of piriform patches in cortex patch
P*kl Output of the weight matrix, cortex patch input to lth piriform cell of kth
piriform patch
Pkl Output of cortex patch from lth piriform cell in kth piriform patch
PWkl Winning cortex output; olfactory output
VDD Positive supply voltage
VDSx Drain to source voltage of subscripted MOSFET x
VE Erasing voltage
VGSx Gate to source voltage of subscripted MOSFET x
VK Normalization scaling constant
VP Programming voltage
VSS Negative supply voltage
Vtun Tunneling voltage
VTx Threshold voltage of subscripted MOSFET x
W Weight matrix
wijkl Synaptic weight from LOT Mij to the piriform cell Pkl
wmax Maximum value of the synaptic weight wijkl
WT Transpose of the weight matrix
CHAPTER I 
OLFACTION AND ELECTRONIC 
NEURAL NETWORKS 
Olfaction 
The current research surge in neural networks (NN) falls into essentially three broad
categories. The first category is the mathematical description and analysis of the
learning properties of neural networks, often working from biological and physiological
exemplars [1,2]. The second, and perhaps the largest, research effort uses computer
simulations to verify the validity of neural network models and to demonstrate their
applications [3,4]. Since the publication of John Hopfield's paper [5] on the prospect of
compact and dense hardware implementation of neural networks in analog integrated
circuit form, a third group of research topics, into which this thesis falls, has emerged.
The researchers in this category attempt to implement neural networks in
LSI/VLSI hardware [6,7,8,9,10,11].
The theory of the biological neuron and the actual neural processing within the brain are
complex and involved [12]. The physical and chemical processes in the nerve cells that
are responsible for learning and memory are beginning to yield to experimental study by
physiologists and anatomists. By and large, biological neural nets exhibit massive
parallelism. The modulation of synaptic junctions has long been
regarded as the likely mechanism for learning and memory [13]. The long term
potentiation (LTP) that is observed in the hippocampus, limbic system, and in some
cortical structures of the brain is believed to be similar to the mechanism used for
learning [14]. The changes in synaptic strength due to LTP are rather coarse
compared to the precise and graded weight changes offered by artificial neural
networks. How a nervous system might respond to the computationally limited neural
learning and neural processing that artificial neural networks use because of their two
dimensional connectivity [15] is an open question. Extensive research is being carried out
using computer simulations of such abstract neural network models to understand the
effects of the incorporated artificiality and to elucidate the organizational
principle at the system level [1,2].
R. Granger, G. Lynch, and Ambros-Ingerson of U. C. Irvine have reported a
potentially useful model, referred to henceforth as the GLA model, for the operation of
the interacting neural networks of the olfactory bulb and piriform cortex observed
in rats [16,17,18]. Computer simulations of this model have demonstrated
interesting computational properties, such as (1) the ability to perform hierarchical
clustering of the input cues (odors) presented by the pattern of activity on the input lines
from the olfactory receptors, (2) the extensibility to unsupervised learning, and (3) the
ability to detect a weaker stimulus when it is masked by a stronger one.
A central feature of this model is the periodic sampling of stimuli at the so-called
theta rhythm, to which the network response is locked. Hierarchical clustering and
unmasking operations proceed sequentially with successive samplings (sniffs) of the
inputs. For example, if you are a serious gardener, on the first sniff you might get a
response indicating the odor of a flower, on the second a rose, and finally on the third an
"Oklahoma Orange Spirit."
The goal of this research work is to develop a simplified electronic realization of
the GLA olfactory model suitable for analog implementation in bulk CMOS circuitry. This
realization will retain the essential clustering properties of the olfactory bulb (OB) and
paleocortex. The dominance of the theta rhythm in the GLA model suggests the suitability
of a synchronous or clocked approach, but the actual computation between two
clock cycles is analog, asynchronous, and carried out in parallel. In the GLA model,
categorization in the paleocortex is done through an iterative procedure of sniffs, usually
fewer than 5, with each sniff leading to a more specific clustered solution further down the hierarchy.
About Neural Networks 
Before the problem is presented, a brief review of the basic concepts of neural
networks may be useful. Although later in this chapter we distinguish the GLA
olfactory model from the traditional "abstract" neural
networks (since the olfactory model more closely mimics the nervous system), the GLA
model still uses many of the concepts exploited by abstract artificial neural networks.
The interested reader is referred to the work of Patrick K. Simpson [19] for the history and
more details of artificial neural networks.
Artificial neural networks (ANNs) go by many different names, such as
connectionist models, parallel distributed-processing models, or neuro-computers. The
structure of artificial neural networks is based on the present understanding of the
biological nervous system. ANNs provide an alternative form of computation that
attempts to mimic neurophysiological functions. ANNs are composed of many
nonlinear computational elements. These computational elements operate in parallel and
are arranged in patterns reminiscent of biological neural networks. The elements are densely
interconnected via weights. Weights are typically adapted during use (learning) [3].
The information is held in these weights. New information is captured by changing
the strength of the connections. In contrast to a Von Neumann computer, which processes
instructions sequentially, neural network models explore many hypotheses simultaneously
using their massively parallel structures.
In its simplest form, a neuron sums weighted inputs and passes the result through
a non-linearity. The neuron is characterized by an internal threshold or offset and by the
type of non-linearity. Common non-linearities are hard and soft limiters,
sigmoidal logistic non-linearities, and hyperbolic tangents [3]. The hyperbolic tangent is
similar in shape to the logistic function and is often used by biologists as a mathematical
model of nerve-cell activation (OUT = tanh(x)). The most commonly used non-linearity
is the sigmoidal logistic, which is continuously differentiable.
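The neuron just described can be sketched in a few lines of Python. This is only an
illustration of the summing-and-squashing idea, not a circuit from this thesis: the
function names, the example inputs and weights, and the choice of -1/+1 outputs for the
hard limiter are assumptions made for the example.

```python
import math

def hard_limiter(x):
    """Step non-linearity (assumed -1/+1 output levels)."""
    return 1.0 if x >= 0.0 else -1.0

def soft_limiter(x, limit=1.0):
    """Linear between -limit and +limit, clipped outside that range."""
    return max(-limit, min(limit, x))

def logistic(x):
    """Sigmoidal logistic non-linearity, output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def neuron(inputs, weights, threshold, nonlinearity):
    """Sum the weighted inputs, subtract the threshold, apply the non-linearity."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    activation = sum(w * x for w, x in zip(weights, inputs)) - threshold
    return nonlinearity(activation)

# Hypothetical example values, chosen only for illustration.
inputs = [0.5, -1.0, 2.0]
weights = [0.8, 0.2, 0.4]
for f in (hard_limiter, soft_limiter, logistic, math.tanh):
    print(f.__name__, neuron(inputs, weights, threshold=0.5, nonlinearity=f))
```

The similarity in shape between the two smooth non-linearities is exact up to scaling:
tanh(x) = 2 * logistic(2x) - 1, so the hyperbolic tangent is a logistic curve stretched
to the range (-1, 1).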
Based on existing results, most neural networks adapt the connection weights over
time to improve network performance. Adaption or learning is a major area of interest in
neural net research. An example of such adaption is speech recognition, where training
data is limited and new speakers, words, dialects, phrases, and contexts are continuously
encountered. Traditional statistical techniques are not adaptive. They typically process
all training data simultaneously before being used with new data. Neural net classifiers
are non-parametric, and they make weaker assumptions concerning the shapes of
underlying distributions than traditional statistical classifiers do. Such neural network
adaptive systems are often described by energy functions and/or probability distributions.
The neuron and neural processing discussed above are simplified versions of the biological
neuron and its neural processing. The biological neuron consists of a cell body called the soma
and an axon or nerve fiber that connects the cells to each other [20]. The junctions
between neurons occur either on the cell body or on spine-like extensions of the cell body
called dendrites. These junctions are referred to as synapses. Nerves and dendrites can
be viewed as insulated conductors used for transmitting electrochemical signals to
neurons. In the human nervous system, about 10^11 neurons participate in perhaps 10^12
interconnections over a transmission path that may range for a meter or more [20].
Neuron processing times are much longer than the cycle times of today's advanced
computers. The cycle time is the time taken to process a single piece of information
from input to output. For the most advanced computers, the cycle time, corresponding to one
CPU clock cycle, is on the order of 1 nanosecond. The average cycle time for a
neuron in the brain is 2 milliseconds. The difference in speed is a factor of 2 x 10^6, yet
owing to its parallel nature, the brain is more time efficient than conventional computers.
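The arithmetic behind this comparison can be checked with a short calculation. The two
cycle times are the figures quoted above, and the neuron count of about 10^11 is the one
given earlier in this section. The rate comparison is a deliberately crude illustration:
it assumes that every unit, whether CPU or neuron, completes exactly one elementary
operation per cycle, which neither the thesis nor biology claims.

```python
# Figures quoted in the text.
CPU_CYCLE_S = 1e-9       # about 1 nanosecond per CPU clock cycle
NEURON_CYCLE_S = 2e-3    # about 2 milliseconds per neuron cycle
NEURON_COUNT = 1e11      # approximate number of neurons

# How many times slower a single neuron is than a single CPU.
speed_ratio = NEURON_CYCLE_S / CPU_CYCLE_S

# ASSUMPTION (illustration only): one elementary operation per unit per cycle.
serial_rate = 1.0 / CPU_CYCLE_S                 # one CPU working sequentially
parallel_rate = NEURON_COUNT / NEURON_CYCLE_S   # all neurons working at once

print(f"single-unit speed ratio: {speed_ratio:.1e}")
print(f"aggregate rate ratio:    {parallel_rate / serial_rate:.1e}")
```

Under these assumptions a single neuron is 2 x 10^6 times slower than the CPU, while the
aggregate rate of the parallel system is about 5 x 10^4 times higher, which is the sense
in which parallelism can outweigh slow individual elements.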
Neural network models offer their greatest potential in areas such as speech
processing, image recognition, and pattern classification. In such applications, many
hypotheses are pursued in parallel, high and fault tolerant computation rates are required, and
existing computer systems are far from equaling human performance. When
compared to traditional computing methods, the benefits of neural networks extend
beyond the high computation rates provided by massive parallelism. The degree of robustness
or fault tolerance provided by neural networks is greater than that provided by
Von Neumann sequential computers. Because of the many processing elements, damage
to a few neurons and synapses does not significantly impair overall performance. Like
humans, trained neural networks recognize partial input information.
"Abstract" Versus "Tightly Coupled" 
Neural Network Paradigms 
This section distinguishes biologically coupled neural network paradigms
(e.g., olfactory) from the so-called traditional "abstract" neural networks. We believe that the
subcategory of biologically mimicked or tightly coupled neural networks is necessary to
highlight behavioral features and functions. Tightly coupled neural networks are
treated significantly differently from many widely used abstract neural
networks.
Artificial neural networks may be classified according to learning algorithms,
topologies, and node characteristics. Another factor that might be of paramount
importance is the degree of biological plausibility of the network in question. Adaption
or learning is the major process in neural networks and thus forms an important criterion
for classification. Most neural networks adapt connection weights over time to improve
performance. A number of widely used neural network paradigms feasible for parallel
implementation are based on adaptations of conventional statistical and numerical
techniques [1,21,22]. These neural paradigms are non-parametric. They make weaker
assumptions concerning the shapes of underlying distributions than traditional statistical
classifiers do. Such adaptive systems are described by their energy function and
probability distribution. Examples of such networks are traditional, layered, and heavily
interconnected feed-forward architectures such as the multi-layer perceptron with back
propagation learning [1], vector quantization [21], and probabilistic neural networks [22],
i.e. the Boltzmann machine. The reciprocally and symmetrically interconnected architectures
described by Hopfield [5] and the Boltzmann machine [23] are examples of physical systems.
All of these networks [1,5,21,22,23] can be categorized as abstract neural networks.
Abstract neural networks attempt to emulate the functionality of the brain and the intelligence
within it. These networks seem to be more heavily influenced by the underlying
statistical distributions than truly inspired by a straightforward one-to-one
correspondence with biological processing. The biological neural mechanisms at the neuron and synaptic level
are considerably more involved and complex than those modeled by most widely used
"abstract" neural networks. It is difficult to decide which of the biological mechanisms
to retain in the interest of computational efficiency. This is partially due to the poor
understanding of neural theories, which is in turn due to the extreme experimental
difficulties encountered in biological network neuroscience. The answer depends upon
whether one wishes to develop artificial computational models or to understand
neurobiology. Our goal is certainly the former. However, we also believe that modeling
a considerable part of the biological machinery is helpful in creating thinking machines.
The point that we wish to emphasize is that present artificial network models are too
abstract to retain the computational efficiencies that are present in the biological world.
Therefore, in summary, we view the broad spectrum of neural network models as
spanning from the "abstract" (perceptron) to the loosely coupled (Kohonen) to the more
closely emulated (Grossberg) to the tightly coupled GLA olfactory model.
Thus, by way of contrast, the tightly coupled neural network models bear a
straightforward structural relationship to a specific neural function within a nervous
system. Tightly coupled neural networks are the subject of much current interest
[24,25,26,37]. In this class, unfortunately, understanding of the collective function of
neural networks in vertebrates is largely limited to sensory structures, i.e. early processing.
The sensory functions have been studied in the greatest depth and with the most success. It
appears that while most artificial neural networks typically comprise a densely
connected layered network of simple neurons, tightly coupled networks employ sparsely
connected networks of much more elaborate neurons in which substantial information
processing occurs within a single neuron. The GLA olfactory paradigm is most certainly
inspired by the tightly coupled neural network philosophy. However, substantial biological
complexity in these cases is also a result of constrained molecular properties (i.e. channel
membrane transport). Therefore, a certain amount of abstraction (determined
by the application) must ultimately be accepted in order to build silicon hardware models.
Hardware Implementation of "Tightly Coupled" 
Computational Models: A Review 
Now that the particular subcategory (tightly coupled neural networks) under which
the olfaction problem is to be studied has been defined, the following text reviews which types
of tightly coupled neural networks have been implemented in hardware, before the
hardware implementation of our olfaction model is proposed. Despite a computational
approach that differs substantially from that of most abstract
neural networks, the essential technologies and hardware techniques remain the same in
both cases. The interested reader is referred to the extensive literature review by
John Wagnon [27] for details on the various abstract neural systems that have been
implemented in hardware to date.
The literature search connected with the hardware implementation of tightly coupled
neural networks yielded only two papers detailing problems with the software simulations
of olfaction [16,26]. To date, no parallel hardware implementation of olfaction has been
reported. J. Bailey and D. Hammerstrom [28] have proposed a serial implementation
of the GLA olfactory model. However, some researchers, most notably Carver Mead,
have attempted to build silicon models of biologically plausible early processing
structures for sensory inputs [29,30,31].
Usually, the real time auditory signal processing burden is too high to be
handled by artificial speech recognition systems because of their computational limitations.
Computationally efficient special purpose hardware in analog VLSI circuitry
can handle this large signal processing burden and thus offers an efficient
solution to the problem. Carver Mead and co-workers have
reported a working analog VLSI chip that implements a stereausis model of biological
early auditory processing in the brain [32]. The chip is essentially an artificial cochlea
that analyzes a sound wave and detects a fundamental note missing from the harmony.
The binaural information exploited by the stereausis algorithm improves speech
intelligibility in noisy environments compared to the monaural audio signal processing
used by most artificial speech recognition systems. The chip is based on the stereausis
model of biological auditory processing,
which encodes binaural cross-correlation and spectral auto-correlation information by
deriving a two dimensional representation of binaural sound waves from two sound inputs
(ears). Their algorithm has also demonstrated the ability to naturally segment monaural
signals into distinct spectral regions. Although the responses are to some extent
sensitive to noise in the data, the output patterns have demonstrated feature extraction
capability for speech signals, specifically the spectral information of various sound waves
and their location. The chip comprises 10,000 transistors in two micron analog
CMOS technology and was fabricated through MOSIS.
An interesting real time hardware implementation of vertebrate retinal inhibitory
behavior has recently been presented by Mead and Mahowald [33]. The processing relies
on lateral inhibition to adapt the system to a wide range of viewing conditions and
to produce an output that is independent of the absolute illumination level. Such
processing is a direct result of the initial inhibitory analog stage in retinal processing. A
secondary effect of the lateral inhibition mechanism is the enhancement of spatial edges in
the image. Their silicon model implements the first stage of retinal processing on a
single chip, where the logarithm of the incident light is computed by a photoreceptor. The
output of each photoreceptor is then spatially smoothed by a resistive network (grid). The
amplitude difference between the photoreceptor output and its smoothed counterpart is
amplified to form a second order spatial filter. They performed their experiments on
a 48x48 array of silicon pixels on a chip of one quarter of a square centimeter in CMOS
technology. Although the system is realized at a very low level compared to the entire
biological visual system, it creates a true biological representation upon which higher
level processing stages can be built. A mathematical analysis of the network is
presented by J. G. Taylor [34], which allows the results to be extended to a general class
of resistive grids and inhibitory feedbacks.
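The processing chain described above (logarithmic photoreceptor, resistive smoothing,
amplified difference) can be sketched numerically in one dimension. The sketch below is
not Mead and Mahowald's circuit: the function name, the conductance values `g_lateral`
and `g_input`, the gain, and the use of a Jacobi relaxation loop in place of the physical
settling of a resistive grid are all assumptions made for illustration.

```python
import math

def retina_1d(intensity, g_lateral=1.0, g_input=0.25, gain=1.0, iterations=500):
    """One-dimensional sketch of log photoreceptors plus resistive smoothing."""
    # Logarithmic photoreceptor response.
    u = [math.log(i) for i in intensity]
    n = len(u)
    # Node voltages of the resistive line, relaxed toward the solution of
    # Kirchhoff's current law at every node (hypothetical conductances).
    v = list(u)
    for _ in range(iterations):
        new_v = []
        for i in range(n):
            neighbors = [v[j] for j in (i - 1, i + 1) if 0 <= j < n]
            new_v.append((g_input * u[i] + g_lateral * sum(neighbors))
                         / (g_input + g_lateral * len(neighbors)))
        v = new_v
    # Amplified difference between each receptor and its smoothed counterpart.
    return [gain * (ui - vi) for ui, vi in zip(u, v)]

# A step edge viewed under dim light and under light 100 times brighter.
dim = [1.0] * 8 + [4.0] * 8
bright = [100.0 * x for x in dim]
out_dim = retina_1d(dim)
out_bright = retina_1d(bright)
print(max(abs(a - b) for a, b in zip(out_dim, out_bright)))
print(max(range(len(out_dim)), key=lambda i: abs(out_dim[i])))
```

Because the logarithm turns a change in overall illumination into a constant offset,
and the smoothing stage passes a constant offset through unchanged, the difference
output is the same for both scenes. The response is also largest at the two pixels on
either side of the step, which is the edge enhancement mentioned above.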
Along similar lines, a neural network approach to the color constancy problem has
been reported [35]. Color constancy is the ability to judge the reflectance of an object
consistently even though the same object is seen under
different lighting conditions. The system is based on Land's retinex theory, which
is inspired by mammalian neurobiology and human psychophysics [35].
This algorithm models our ability to see colors and intensities as roughly constant as the light varies.
Their computer simulations have confirmed the validity of their implementation of Land's
model. They have implemented Land's algorithm in subthreshold analog CMOS VLSI
using a two micron process from MOSIS. The chip comprises about 60,000
transistors and is reported to operate at video rates.
H. C. Card and W. R. Moore report that the learning and memory behavior at neuron 
and synaptic levels can be best understood in simple invertebrate animals such as worms 
and insects [36]. To demonstrate neuron and synaptic level memory behavior, they chose 
the well-studied marine specimen, mollusc Aplysia. These small animals exhibit the same 
learning patterns as vertebrates while keeping neural processing relatively simple. They 
have also proposed analog CMOS circuitry that explores both the associative and the 
non-associative learning mechanisms of habituation and sensitization in Aplysia. 
Proposal for Hardware Implementation 
of GLA Olfactory Model 
We propose the simplified hardware implementation of GLA olfactory model in a 
two micron, p-well, double poly, double metal bulk CMOS process from MOSIS. The 
implementation will retain the essential clustering properties of the GLA olfactory model. 
The GLA model inherently possesses many favorable features that aid a simple hardware 
implementation. These features are: (1) mixed mode processing instead of pure analog 
processing, (2) current and voltage mode processing, (3) rhythmic clocking for 
synchronization, (4) discrete, coarse, and unidirectional learning leading to a simplified 
learning algorithm, and (5) single quadrant multiplication instead of four quadrant 
multiplication to obtain scaling closer to that exhibited by a synapse. 
The electronic implementation of the GLA olfactory model involves the integration 
of various mathematical functions in silicon as integrated sub-components. The system 
level GLA olfactory architecture can best be realized by developing such mathematical 
functions separately in the form of building blocks to allow for simplified testing, and 
then incorporating all such blocks onto a single substrate. This thesis specifically 
addresses the design, simulation, layout, fabrication, and testing of these basic building 
blocks. The system level realization is beyond the scope of this thesis. 
As opposed to traditional voltage mode analog signal processing, in which inherently 
current signals are transferred to the voltage domain before any analog signal processing 
takes place, the current mode analog signal processing approach is taken here. The use 
of current rather than voltage as an active parameter can result in higher gain, accuracy, 
and wider bandwidth due to the reduced voltage excursion at dynamic nodes [38]. 
Simulations are performed using SPICE on a personal computer. Layouts are 
accomplished by using MAGIC on Sun work-stations. All circuits are fabricated using 
fabrication service from MOSIS. Finally, the testing is performed. 
The following chapter, Chapter II, describes in detail our interpretation of the GLA olfactory 
model and the proposed electronic implementation. Chapter III focuses on the design, 
simulation, and testing of all building blocks. Chapter IV offers conclusions based 
on the results and suggestions for future work on this 
proposed hardware implementation of the olfactory model. 
CHAPTER II 
OLFACTORY MODEL AND ITS HARDWARE 
IMPLEMENTATION 
The Bulbar-Cortical Model 
Modeling of olfaction is a difficult task since olfaction theories are still in the 
developmental stages. On the one hand, a computer simulation of an overly detailed 
anatomical olfactory model may result in large volumes of hard-to-analyze data. On 
the other hand, too much abstraction and simplification of the anatomical olfactory model 
may cause it to lose its relevance to biology altogether, with a potential loss of the 
computational power of the anatomical model. Thus, the development of a moderately 
abstracted olfactory model is necessary. Such a model is easier to understand while 
still preserving the essential features of the anatomy. A moderately abstracted, mid level 
[17] GLA olfactory model has been proposed by Granger, Lynch, and Ambros-Ingerson. 
The interested reader is referred to the work of Granger et al. for details [16, 17, 18]. In 
this section, we focus on our interpretation of the essential features of the GLA 
model, leading to our simplified olfactory architecture suitable for a proposed hardware 
implementation. Throughout the course of the discussion, we justify the various 
assumptions and simplifications which are essential to keeping the implementation simple 
yet practical. These assumptions have resulted in a slightly modified architecture. 
Our architecture of the bulbar-cortical (BC) model is shown in Figure 1. The model 
basically consists of the olfactory bulb (OB) and the piriform cortex (PC). The olfactory 
nomenclature is given in the preliminaries. 
Olfactory Bulb 
The olfactory bulb receives its input via the olfactory nerve (ON). Olfactory nerves 
originate from the olfactory receptor sheet and project onto the periglomerular region in a 
topographic fashion. The receptor cells that are most responsive to a particular 
chemical stimulus project their axons to a delimited area of the olfactory bulb referred to 
as a glomerulus. The receptor cells fire with higher frequency for higher concentrations 
of odorant. The concentration of the odorant is modeled by the magnitude of a real 
positive number. This number reflects the aggregate firing frequency and represents the ON 
input to the corresponding glomerulus. 
The olfactory bulb is organized into a number of glomeruli g. Each glomerulus 
consists of m mitral/tufted cells. Each glomerulus receives excitatory input from an ON, 
collectively forming the system input vector $O_i$. It also receives the inhibitory input vector $I_i$ 
from the PC feedback through weights which are set during the developmental period, to 
be discussed later. The excitatory inputs are summed with the inhibitory feedback signal, 
forming the net un-normalized input activity $G_i^{*}$ to the glomerulus. This un-normalized 
glomerulus activity is given by: 
\[ G_i^{*} = \max(O_i + I_i,\; 0) \tag{1} \]







[Figure 1. Bulbar-cortical architecture and timing diagram. Index ranges: i = 1...g, j = 1...m, k = 1...p.] 


















normalization mediated by the interaction between the excitatory and inhibitory cells of 
the OB. This serves to normalize the output of the bulb by keeping the total number of 
mitral cells that are activated constant across stimuli of different intensities and 
compositions. In this normalization scheme, the sum of the normalized glomeruli activity 
is maintained at a constant level. The normalization is obtained in such a way that the 
sum of the non-linearly mapped and scaled normalized activity remains nearly constant. 
Mathematically, the normalized glomerulus activity is given by: 
(2) 
In the above equation, the scaling constant, $V_K$, is the smallest positive value that satisfies 
\[ \sum_{i=1}^{g} g_s(V_K G_i^{*}) = K_G \tag{3} \]
where $K_G$ is the glomerulus activity constant and $g_s(\cdot)$ is a non-linear mapping function 
that maps glomeruli activity into the number of activated mitral cells. Mathematically, 
(4) 
where $K_1$, $K_2$, $K_3$, and $K_4$ are circuit constants. The variable, x, is the input to the 
non-linear mapping function. 
The intensity of the normalized glomerulus activity $G_i$ is linearly reflected in the 
number of mitral cells that it activates within the particular glomerulus. The mitral cells 
have increasing thresholds, $\theta_{M_j} < \theta_{M_{j+1}}$ ($0 \le j \le m$), where $\theta_{M_j}$ is the activation threshold 
of the jth mitral cell in a glomerulus. Thus an increasing amount of normalized 
glomerulus activation results in a greater number of mitral cells being fired. The mitral 
cells are modeled as two state devices (active or inactive), or McCulloch-Pitts neurons, 
which are either high (logic 1) or low (logic 0) with glomerulus activity either above or 
below their threshold, respectively. Mathematically, 





Thus, the overall processing within the olfactory bulb in the absence of inhibitory 
feedback is as follows. The nonlinear normalization and the constraint on the total 
normalized glomeruli activity result in the accentuation of the weaker components in 
the odorant and the attenuation of the stronger components, which is intuitively pleasing. The 
normalized glomerulus activity is spatially thermometer-coded via the mitral patch, where 
an increasing amount of normalized glomerulus activity is reflected in a greater number 
of mitral cells being triggered. The constant level of glomerulus output activity serves 
to keep the total number of activated mitral cells reasonably constant across 
stimuli with different concentrations and compositions. This means that even though the 
same odorant at different concentrations results in different activations of the same 
glomeruli, the normalization can result in an identical thermometer code. This makes the 
GLA model insensitive to odor concentration, i.e., to the $O_i$ amplitudes. 
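The concentration insensitivity described above can be illustrated with a minimal sketch. It rests on two assumptions that are not taken from the text: the non-linear mapping g_s is replaced by the identity, so that the scaling constant reduces to K_G divided by the summed activity, and the normalized activity is taken to be the scaled un-normalized activity. The function names and the threshold values are hypothetical.

```python
def normalize(g_star, k_g):
    """Scale the activities so that they sum to k_g.

    Assumption: g_s is the identity, so the scale factor is k_g / sum(g_star).
    """
    total = sum(g_star)
    if total == 0:
        return [0.0] * len(g_star)
    v_k = k_g / total
    return [v_k * g for g in g_star]


def thermometer(activity, thresholds):
    """One mitral patch: cell j is high when the activity reaches threshold j."""
    return [1 if activity >= t else 0 for t in thresholds]


def bulb_code(odor, k_g, thresholds):
    """Thermometer code of every glomerulus for one input vector."""
    return [thermometer(g, thresholds) for g in normalize(odor, k_g)]


# Illustrative increasing mitral thresholds (hypothetical values).
thresholds = [0.125, 0.25, 0.375, 0.5]

# The same odorant composition at two concentrations.
weak = bulb_code([1.0, 2.0, 1.0], 1.0, thresholds)
strong = bulb_code([4.0, 8.0, 4.0], 1.0, thresholds)
print(weak == strong)
```

Under these assumptions both concentrations normalize to the same activities, so the mitral patches produce the same code.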
Piriform Cortex 
The main features of the piriform cortex are the sparse forward projections of 
the mitral cells onto piriform cells [17] via the lateral olfactory tract (LOT), and the 
backward inhibitory feedback to the OB. 
The outputs of the mitral cells in the OB, $M_{ij}$, are projected onto the piriform cells 
in the piriform cortex via the LOT lines, forming a connection matrix between the OB 
and the PC in layer Ia of the PC. The excitatory synapses $W_{(ij)(kl)}$ in the piriform cortex are 
sparse, meaning synapses are made at random with a sparseness on the order of 10%. 
In our model, we assume a uniform distribution of synapses. However, in the GLA 
model, the sparsity decreases (tapering) as one travels from the rostral to the caudal 
region of the piriform cortex [17], i.e., from closer to the OB to further away. Further, our 
model does not consider the PC-to-PC synapses (thickening synapses) 
in layer Ib of the PC, which are present in the GLA model. This assumption was necessary 
to simplify the winner take all (WTA) structure, leading to a saving in silicon area. 
The excitatory piriform cells $P_{kl}$ are arranged into p disjoint piriform patches with h 
piriform cells per patch. The index k indicates the patch while the index l indicates the cell 
number within the piriform patch. The total input activation to the piriform cell is 
\[ P_{kl}^{*} = \sum_{i=1}^{g} \sum_{j=1}^{m} M_{ij}\, W_{(ij)(kl)} \tag{6} \]
At each operating cycle, due to the strong local inhibition, the piriform patches 
exhibit a winner take all competition within a patch, which results in only the strongest 
or a few nearly as strongly activated piriform cells firing, while the rest of the piriform cells 
remain quiescent. The winner take all competition arises from the presence of 
inhibitory interneurons within layer II (the stellate cells). Stellate cells are activated 
by the most strongly activated piriform cell, producing strong local inhibition of all 
piriform cells within the patch except the most strongly activated one. Thus, the 
strongly activated piriform cell tries to become more activated while the activation of the 
other piriform cells is suppressed. The piriform patch compartments make this event 
local, and thus competitive, through stronger local inhibition. 
The winning piriform cell is declared activated only if the corresponding input 
activation to the piriform cell is equal to or greater than a fixed piriform cell threshold 
$\theta_P$. The output of the piriform cell is given by: 
\[ PW_{kl} = \begin{cases} 1 & \text{if } P_{kl}^{*} \ge \theta_P \text{ and } P_{kl}^{*} \ge P_{kj}^{*} \text{ for all } 1 \le j \le h \\ 0 & \text{otherwise} \end{cases} \tag{7} \]
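Equations 6 and 7 can be sketched as follows. This is an illustration rather than the circuit behavior: the flat ordering of the LOT lines and piriform cells, the function names, and the numerical values are assumptions made for the example. Note that equation 7, taken literally, lets every cell that ties for the maximum win.

```python
def piriform_activation(mitral, weights):
    """Equation 6: input activation of every piriform cell.

    mitral  -- flat list of LOT line states M_ij (0 or 1)
    weights -- weights[line][cell], zero where no synapse exists
    """
    n_cells = len(weights[0])
    return [sum(m * row[c] for m, row in zip(mitral, weights))
            for c in range(n_cells)]


def winner_take_all(activation, cells_per_patch, theta_p):
    """Equation 7: a cell wins if it reaches theta_p and is maximal in its patch."""
    winners = []
    for start in range(0, len(activation), cells_per_patch):
        patch = activation[start:start + cells_per_patch]
        best = max(patch)
        winners.extend(1 if (a >= theta_p and a >= best) else 0 for a in patch)
    return winners


# Hypothetical example: 4 LOT lines, 2 patches of 2 piriform cells each.
mitral = [1, 1, 0, 1]
weights = [
    [1.0, 0.0, 0.0, 0.5],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
]
p_star = piriform_activation(mitral, weights)
pw = winner_take_all(p_star, cells_per_patch=2, theta_p=1.0)
print(p_star, pw)
```

In this example the second patch produces no winner because its strongest cell stays below the threshold.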
The GLA model states that, in addition to a fixed threshold, piriform cells also have 
frequency facilitation (ff) and refractory states. In the ff state, $ff_{kl}$ is increased or 
decreased by one every time a cell activates or remains quiescent, respectively. The refractory state 
starts when $ff_{kl}$ exceeds the threshold $\theta_{ff}$. The refractory state of a previously active 
piriform cell ($\theta_{ff} = 1$) assures a distinct piriform bulbar output code in each minor cycle. The 
non-refractory restriction on winning ensures that the piriform cells which won in the 
previous cycle will never win in the present cycle. However, our model does not 
implement ff and refractory states. 
Finally, the output pattern formed by the winning piriform cells is regarded as the 
spatially encoded output of the bulbar-cortical system. Intuitively, it is clear that these 
winning piriform cells happen to have a relatively large number of their synapses from 
the active mitral cells. 
The glomeruli in the OB are innervated by the inhibitory feedback generated by the 
winning piriform cells of the PC. This inhibition is weighted through synapses. The synapses 
are modulated over different input cues during the developmental period according to a 
correlational or Hebb rule. Feedback inhibits those glomeruli which are most responsible 
for firing the corresponding winning piriform cells, thus attempting to deactivate those 
winning piriform cells which had generated the output of the PC competitions in the previous 
cycle. The weighted inhibition on LOT line ij in the backward direction is given by: 
(8) 
Inhibition on m consecutive LOT lines in the backward direction (from where the 
respective forward LOT lines originated) is summed by grouping them together, 
forming the aggregate un-thresholded inhibition $I_i^{*}$ to the glomeruli. Inhibition is given by: 
\[ I_i^{*} = \sum_{j=1}^{m} I_{ij}^{*} \quad \text{for } i = 1, \ldots, g \tag{9} \]
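The feedback inhibitory signal into the glomerulus is obtained by thresholding on $\theta_I$ as 
\[ I_i = \sum_{y} I_i^{*}, \quad \text{where } I_i^{*} = \begin{cases} I_i^{*} & \text{if } I_i^{*} \ge \theta_I \\ 0 & \text{otherwise} \end{cases} \tag{10} \]
where y is the index of each minor clustering cycle. 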
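The grouping and thresholding of equations 9 and 10 can be sketched as below. As a simplifying assumption, inhibition is handled as a non-negative magnitude (the sign convention is left to the circuit), and the function names and values are hypothetical.

```python
def glomerulus_inhibition(line_inhibition, m):
    """Equation 9: sum the inhibition on each group of m consecutive LOT lines."""
    return [sum(line_inhibition[i:i + m])
            for i in range(0, len(line_inhibition), m)]


def accumulate(stored, raw, theta_i):
    """Equation 10: add this cycle's inhibition to the stored total,
    discarding contributions that stay below the threshold theta_i."""
    return [s + (r if r >= theta_i else 0.0) for s, r in zip(stored, raw)]


# Hypothetical example with g = 2 glomeruli and m = 2 LOT lines each.
lines = [0.5, 0.75, 0.0, 0.125]
raw = glomerulus_inhibition(lines, 2)
total = accumulate([0.0, 0.0], raw, theta_i=0.25)   # after the first minor cycle
total = accumulate(total, raw, theta_i=0.25)        # after the second minor cycle
print(raw, total)
```

The second glomerulus never accumulates inhibition here because its per-cycle contribution is below the threshold.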
Learning 
Only one type of learning mechanism has been modeled. After long term 
potentiation (LTP), the active synapses $W_{(ij)(kl)}$ project from active mitral cells in the OB onto 
the winning piriform cells in the PC. The weight matrix W consists of such sparsely placed 
synapses. The learning involved in these synapses is referred to as adult plasticity. The 
weights of these synapses are non-decremental, incremented in discrete steps $\delta W$ (~10% 
of their maximum weight), and saturated at the maximum value $W_{MAX}$ (~two to three 
times their naive weights). Mathematically, learning can be described as 
\[ W_{(ij)(kl)} = \begin{cases} \min\left[ W_{(ij)(kl)} + \delta W,\; W_{MAX} \right] & \text{if } W_{(ij)(kl)} \ne 0 \text{ and } M_{ij} > 0 \text{ and } P_{kl} > 0 \\ W_{(ij)(kl)} & \text{otherwise} \end{cases} \tag{11} \]
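A minimal sketch of the update in equation 11 follows. The matrix layout (one row per LOT line, one column per piriform cell) and the numerical values are assumptions chosen for illustration: a naive weight of 1.0, a maximum of 2.5, and a step of 10% of the maximum.

```python
def learn(weights, mitral, winners, delta_w, w_max):
    """Equation 11: increment an existing synapse only when both its LOT line
    and its piriform cell are active, and saturate the result at w_max."""
    return [
        [min(w + delta_w, w_max) if (w != 0 and m > 0 and p > 0) else w
         for w, p in zip(row, winners)]
        for row, m in zip(weights, mitral)
    ]


# Hypothetical 2 x 2 example; a zero entry means that no synapse exists.
weights = [[1.0, 0.0],
           [2.5, 1.0]]
updated = learn(weights, mitral=[1, 1], winners=[1, 0], delta_w=0.25, w_max=2.5)
print(updated)
```

The missing synapse stays at zero, the saturated weight stays at its maximum, and the weight on the inactive piriform column is unchanged.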
From the above equation it is clear that the synaptic alterations take place only in 
physically existing synapses and only if the pre- and post-synaptic sites are active. 
In our model, since we do not implement the PC-to-PC synapses 
(thickening) of layer Ib of the GLA model, we do not implement the learning associated with 
these synapses. The learning involved in these synapses is similar to the adult plasticity 
described above [13]. 
The anatomical model calls for a distinct forward path from the OB to the PC and a 
feedback path from the PC to the OB. Forward excitatory synapses (adult) are trained according 
to the rule given by equation 11, and backward inhibitory synapses are trained by a 
correlative Hebb rule [] during the developmental phase prior to use for actual 
hierarchical clustering. To facilitate an area efficient electronic implementation, at this stage 
we propose common adult and developmental plasticities. Common adult and 
developmental plasticity allows the use of a single time multiplexed weight matrix W in the 
feed forward and backward cycles. 
In the feedback path, the active synapses projected from winning piriform cells in 
the PC onto the glomeruli in the OB are strengthened over different input cues during 
the simulated developmental period. For each input sample on the olfactory nerve (ON), 
feedback synapses projected on the glomeruli and co-activated by both the ON input and 
the piriform feedback are strengthened while the remaining synapses are unchanged. 
However, since in any particular feedback path the feedback correlations arise as a direct 
consequence of the given connectivity and strength of the forward excitatory synapses in 
the corresponding column of the weight matrix W, the same effect can be obtained by 
using the transpose $W^T$ of the weight matrix W to compute bulbar inhibition. Architecturally, 
this implies that a single weight matrix with time multiplexing can be used to compute 
the weighted excitatory bulbar input to the PC in the forward phase, followed by the weighted 
inhibitory feedback from winning piriform cells to the OB in the backward phase, 
resulting in improved area efficiency. Equation 8 gives the glomerulus inhibition using 
$W^T$. 
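The shared use of one matrix in both directions can be sketched as two passes over the same stored weights, one along the columns and one along the rows. The function names and the example matrix are hypothetical, and the backward result is treated as an inhibition magnitude.

```python
def forward_phase(mitral, w):
    """Forward phase: LOT activity times W, summed down each column."""
    n_cells = len(w[0])
    return [sum(m * row[c] for m, row in zip(mitral, w)) for c in range(n_cells)]


def backward_phase(winners, w):
    """Backward phase: winner pattern times the transpose of W,
    i.e. summed along each row of the same stored matrix."""
    return [sum(weight * p for weight, p in zip(row, winners)) for row in w]


# One stored matrix: 3 LOT lines (rows) by 2 piriform cells (columns).
w = [[1.0, 0.0],
     [0.0, 2.0],
     [1.5, 0.0]]

excitation = forward_phase([1, 0, 1], w)
inhibition = backward_phase([1, 0], w)
print(excitation, inhibition)
```

Only the LOT lines that have a synapse onto the winning cell receive feedback, which is the correlation effect the transpose is meant to reproduce.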
Multi-Sampling 
The computational properties of the coordinated operation of the entire bulbar-
cortical structure can best be described by a so-called multi-sampling process. The 
flowchart of multi-sampling process is shown in Figure 2. It is observed that activity in 
the various brain regions of small mammals is synchronized to their sniffing rate at the 
so-called theta rhythm (4-5 Hz, ~200 ms) [17]. The GLA model states that the role of 
the theta rhythm is synchronization. This eliminates the potential for oscillations due 
to feedback. Such synchronization permits the entire OB to operate in rhythmic 
synchrony with the brain, whereupon, on reaching their thresholds, the mitral/tufted cells 
fire in synchrony at the theta clock. The input to the piriform cortex arises from the 
synchronous bursting of the mitral cells, yielding the cyclic activity of the reciprocal 
process of feed-forward excitation of the PC by the OB followed by feedback inhibition 
of the OB by the PC at the theta rhythm. 
[Figure 2. Flowchart of the Multisampling Process. Recoverable steps: compute G*_i(y) = max(O_i + I_i(y), 0); process according to Equations 2, 3, 4, 5, 6, and 7; generate feedback I_i according to Equations 8 and 9; learn according to Equation 11.] 
As the animal sniffs a single odor, the following sequence of events takes place in 
the naive network. After the first sniff (cycle 1), depending on the input composition, 
the OB output triggers the most active piriform cell in each patch of the PC based on the 
discussed operating rules and random connectivity. The winning piriform cells in the PC 
produce a feedback signal to the OB. Once the feedback signal from the PC crosses the 
feedback threshold $\theta_I$, the glomeruli with the most significant input components are 
strongly inhibited for the remaining cycles via the phenomenon called "long lasting inhibition" 
that is observed in the OB. In the subsequent sniffs (cycles 2, 3, ..., y), the normalized 
activity of the uninhibited glomeruli increases (according to the normalization property, 
which attempts to keep total glomerulus activity at a constant level) in order to 
compensate for the inhibition of the strongest components in the previous cycle. This 
allows weaker components in the input vector to be expressed. As a result, the 
thermometer code, or the spatial pattern of mitral cell activity, differs significantly 
from the spatial patterns in the previous cycles. Mitral cells associated with the glomeruli 
that are now inhibited do not fire, while a larger number of mitral cells fire from 
glomeruli whose normalized activity has increased. A different pattern of 
activation from the bulb at each step assures distinct bulbar-cortical output codes. 
The process of obtaining distinct cortical responses by successively inhibiting 
components of the original stimuli is referred to as multi-sampling. This multi-sampling 
process is repeated until the bulb is sufficiently inhibited to be largely quiescent, meaning 
every component in the input stimuli, no matter how weak it is, is given a chance to be 
expressed in the hierarchical clustering process. 
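The multi-sampling loop can be sketched end to end under several simplifying assumptions that are not part of the model description: the non-linear mapping g_s is taken as the identity, the whole cortex is a single piriform patch, inhibition is stored as a magnitude and subtracted from the input, and a hypothetical feedback gain scales the backward currents into input units. The connectivity in the example is hand-picked rather than random so that the result is easy to follow.

```python
def normalize(g_star, k_g):
    """Assumption: g_s is the identity, so activities are scaled to sum to k_g."""
    total = sum(g_star)
    return [k_g * g / total for g in g_star] if total > 0 else [0.0] * len(g_star)


def thermometer(g, thresholds):
    return [1 if g >= t else 0 for t in thresholds]


def multi_sample(odor, w, thresholds, theta_p, theta_i, gain, k_g=1.0, max_cycles=10):
    """Return the winner pattern of each minor cycle until the bulb is quiescent."""
    m = len(thresholds)
    inhibition = [0.0] * len(odor)            # accumulated inhibition magnitude
    codes = []
    for _ in range(max_cycles):
        g_star = [max(o - i, 0.0) for o, i in zip(odor, inhibition)]
        if sum(g_star) == 0:
            break                             # every component has been inhibited
        lot = [bit for g in normalize(g_star, k_g)
               for bit in thermometer(g, thresholds)]
        p_star = [sum(x * row[c] for x, row in zip(lot, w))
                  for c in range(len(w[0]))]
        best = max(p_star)
        winners = [1 if (p >= theta_p and p >= best) else 0 for p in p_star]
        codes.append(winners)
        # Backward phase through the transpose of the same weight matrix.
        back = [gain * sum(wt * pw for wt, pw in zip(row, winners)) for row in w]
        raw = [sum(back[i:i + m]) for i in range(0, len(back), m)]
        inhibition = [s + (r if r >= theta_i else 0.0)
                      for s, r in zip(inhibition, raw)]
    return codes


# 3 glomeruli with 2 mitral cells each (6 LOT lines) and 3 piriform cells.
w = [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0], [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0], [0.0, 0.0, 1.0]]
codes = multi_sample([4.0, 2.0, 1.0], w, thresholds=[0.2, 0.5],
                     theta_p=1.0, theta_i=1.0, gain=2.0)
print(codes)
```

With these values the strongest component wins the first cycle, is then inhibited, and the successively weaker components are expressed in the second and third cycles before the bulb falls silent.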
The naive network can be trained on a training set containing noiseless versions of 
selected dissimilar odorants (vectors) according to the learning rules discussed above. The 
flowchart of the learning process is shown in Figure 2. The effect of learning in the 
network is to cluster essentially random PC responses into nearly equivalent estimates 
of the input vectors. These vectors are sufficiently close in space to the ones used in the 
training set. Thus, learning develops the ability in the network to cluster 
sufficiently close input vectors. Similar vectors in the training set form one category while 
dissimilar vectors form distinct categories, giving rise to a new class every time an input 
vector is found dissimilar to all other vectors. Any input vector similar to some vectors 
in the training set is accommodated in that category, whereas a novel vector dissimilar to 
all vectors in the training set gives rise to a new category. 
A performance comparison study [17] of untrained and trained networks for 
dissimilar noisy odors concludes that the trained network enhances the overall overlap of 
patterns obtained for noisy instances of the same odorant. Also, it reduces the overlap 
of all pairs of patterns obtained for noisy instances of different odorants. This indicates 
an inclination towards accommodating all noisy instances of the original odor under the 
same category. It also indicates that after training, the set of cortical responses are largely 
distinct. 
After the first cycle, the overlap between the sequence of the cortical responses in 
the subsequent cycles becomes progressively lower for different cues, increasingly 
distinguishing a given input cue and thus producing a unique encoding for an individual 
odorant. During the first cycle, the network responses are nearly identical for input 
cues which are sufficiently close in space, grouping them together in a sub-cluster. At 
the same time, the network maintains extremely low overlap between two sub-clusters, such that 
during the second cycle the responses are nearly identical for the members of a sub-cluster 
while they differ for the vectors that are not members of that sub-cluster. 
The responses in the third cycle are nearly unique producing unique encoding for 
individuals. Thus, during the multi-sampling process, a hierarchical clustering takes place 
where initial output codes indicate broad class or cluster membership, and subsequent 
codes indicate sub-clusters within clusters, and finally individuals within those sub-
clusters. Cluster and sub-cluster breadth in the input vector space are influenced by the 
weight increment size, the ratio of saturated to naive weight values, and dimension of the 
input vectors in the training set. 
Hardware Implementation 
This section focuses on our hardware implementation of the GLA olfactory model. Our 
simplified olfactory architecture is suitable for a hybrid implementation in the MOSIS 
two micron, p-well, double poly, double metal bulk CMOS process. The GLA model 
inherently possesses many favorable features that aid such a simple hardware implementation. 
Out of the numerous possible architectures, one potentially feasible olfactory 
architecture is shown in Figure 1. The hierarchical clustering at the theta rhythm in the 
GLA model necessitates a synchronous or clocked approach, rather than truly analog 
continuous parallel processing. 
The input cues, analog current input vectors $O_i$, are assumed to be generated by a 
suitable sensory structure (the receptors in the anatomical model) and are sampled periodically 
at an artificial theta rhythm. For each cycle of the theta rhythm, there are two major 
non-overlapping phases: activation of the OB and feed-forward excitation of the PC, indicated 
by the forward phase $\phi_1$, followed by feedback inhibition of the OB by the PC, indicated by 
$\phi_2$. Each phase, $\phi_1$ and $\phi_2$, is further subdivided into two non-overlapping sub-phases, 
$\phi_{11}$, $\phi_{12}$ and $\phi_{21}$, $\phi_{22}$, respectively. The timing diagram of the olfactory system is shown 
in Figure 1. Prior to using the network for hierarchical clustering, the network is trained 
over a set of input cues by updating the forward (excitatory) nonvolatile weights in 
parallel according to the adult plasticity rule discussed in the learning section. Even though 
system controls are derived from the clocks, the actual computation between two clocks 
is truly analog, concurrent, and carried out in parallel. The clocks serve merely for 
multi-sampling and synchronization purposes. 
The following sections discuss the overall operation of our architecture, the different 
building blocks used by the architecture, and the top level architectural issues together 
with their relevance to the anatomical model. The essential blocks and their functions in 
the proposed architecture are: (1) the glomeruli normalizer within the OB, to normalize 
the glomerulus activity at a constant level; (2) the mitral patch within the glomeruli, to 
generate the LOT lines, i.e., to thermometer encode the net normalized input; (3) the 
sparse weight matrix, to scale and sparsely expand the LOT line activity onto the PC via 
the modifiable synapses; (4) the WTA piriform patches within the PC, to exhibit the 
winner take all competition; (5) the tie resolver, to digitally resolve the potential ties 
which occur among winning piriform cells within a piriform patch; and (6) the current 
copier integrator (CCI), to provide the thresholded, collateral, and cumulative feedback to 
the OB. 
The analog input current vectors $O_i$ ($1 \le i \le g$) generated by the receptors are sampled 
periodically at an artificial theta clock. In the OB, the net input $G_i^{*}$ to the glomerulus is 
formed by summing the real positive input vector $O_i$ point by point with the negative 
inhibitory feedback current vector $I_i$ (equation 1). 
$G_i^{*}$ is then subjected to global nonlinear normalization by a glomeruli 
normalizer. Several alternatives for normalization have been developed. Essentially all 
normalization schemes are implemented with a closed feedback loop circuit similar to that 
used in automatic gain control (AGC). $K_G$ is the constant at which the sum of the 
normalized activity (equation 3) is maintained (20%). 
Each normalized glomerulus signal $G_i$ is thermometer coded by the m mitral cells 
per mitral patch. Mitral cells have increasing, equally spaced thresholds, i.e., $\theta_{M_j} < \theta_{M_{j+1}}$ 
($0 \le j \le m$), where $\theta_{M_j}$ is the activation threshold of the jth mitral cell. Mitral thresholds are 
generated globally by a capacitive ladder. The mitral cells are modeled as two state 
devices (active or inactive), or McCulloch-Pitts neurons, by two stage comparators 
which are either high (logic 1) or low (logic 0) with glomerulus activity $G_i$ either above 
or below threshold $\theta_{M_j}$, respectively. Electronically, this is equivalent to the front end of a 
flash A/D converter. 
The binary voltage output of the mitral cells $M_{ij}$ in the OB is spatially projected onto 
the h×p piriform cells in the piriform cortex via m×g LOT lines, forming the synapses 
between the OB and the PC. The synaptic weights $W_{(ij)(kl)}$ are realized by a floating gate, 
non-volatile, analog programmable memory. The memory is used in conjunction with a 
MOS transistor operating either in the triode or saturation region. The conductance of the 
MOS transistor is modulated by the charge on the floating gate. The weights are 
non-decremental, incremented in discrete steps (~10% of their maximum weight), and 
saturated at the maximum value $W_{MAX}$ (~two to three times their naive weights). 
The excitatory synapses $W_{(ij)(kl)}$ are sparse rather than topographic, that is, they are randomly 
distributed within the PC with a sparseness on the order of 10%. The sparse weight 
matrix $W_{(m \times g)(h \times p)}$ consists of sparsely placed synapses. Synapses are randomly arranged 
in 4×5 sub-matrices. Restricting the PC random interconnections to a small local area 
is biologically unsupported; the 4×5 sub-matrix area was selected 
for fabrication convenience without any biological justification. Each sub-matrix receives 
four consecutive LOT lines (rows) and five consecutive piriform lines (columns), resulting 
in 20 cross junctions. The 10 percent sparse random connectivity within the sub-matrix 
is achieved by establishing two randomly chosen connections and placing a 
weighting transistor at each of these cross junctions. Within the sub-matrix, any LOT line may be 
interconnected with any piriform input line, with the exception that double 
interconnections between a pair of lines are excluded. During layout, the location of the 
metal contact to the weighting transistor will be derived by executing a macro that 
generates a randomized connection between a LOT line and a piriform line. Local grouping of 
interconnects minimizes interconnection and routing area, resulting in a 10 to 20 percent 
area saving [15]. 
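The sub-matrix placement described above can be sketched as a connectivity mask generator. This is not the layout macro itself, which is not reproduced here; the function name, the seeded generator, and the matrix dimensions are assumptions for illustration. Two distinct junctions are drawn in every 4×5 block, so no junction is connected twice.

```python
import random


def sparse_mask(n_lot, n_pir, rng, block_rows=4, block_cols=5, per_block=2):
    """Build a 0/1 connectivity mask with per_block random junctions
    in every block_rows x block_cols sub-matrix."""
    if n_lot % block_rows or n_pir % block_cols:
        raise ValueError("matrix dimensions must be multiples of the block size")
    mask = [[0] * n_pir for _ in range(n_lot)]
    junctions = [(r, c) for r in range(block_rows) for c in range(block_cols)]
    for r0 in range(0, n_lot, block_rows):
        for c0 in range(0, n_pir, block_cols):
            # sample() draws without replacement, so the two junctions differ.
            for r, c in rng.sample(junctions, per_block):
                mask[r0 + r][c0 + c] = 1
    return mask


# Hypothetical size: 8 LOT lines by 10 piriform lines, i.e. four sub-matrices.
mask = sparse_mask(8, 10, random.Random(0))
density = sum(map(sum, mask)) / (8 * 10)
print(density)
```

Because every block holds exactly two of its 20 junctions, the overall density is 10 percent regardless of the random draw.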
Simplification of the weight matrix (specifically the local interconnect) architecture 
results in the loss of some of the statistical independence of the connectivity exhibited in the 
anatomical model. The architecture also results in a uniform distribution of weights as 
opposed to the increasingly tapered distribution from caudal to rostral region in the 
anatomical model. Further, due to the restrictions imposed on the connectivity of the 
sub-matrix, there exists a zero probability of forming some particular patterns of connectivity 
within a sub-matrix, whereas in the absence of such restrictions the corresponding probabilities 
would have some finite values. The architecture would seem to be less prone to these 
effects in networks with a sufficiently wide input vector since, according to the central 
limit theorem, with an increasing number of LOT lines the constrained distribution in the 
sub-matrix tends to be very similar to the unconstrained interconnection patterns of the 
anatomical model. 
As discussed earlier, we use common adult and developmental plasticities, which 
allows the use of a single weight matrix W in the forward and the backward cycles. 
This requires that the weight matrix W be time multiplexed to compute the weighted 
excitatory bulbar output currents into the PC in the forward phase, and the weighted 
inhibitory feedback currents from winning piriform cells into the OB in the backward 
phase. The use of a common weight matrix results in a significant area saving, since the 
weight matrix dominates the total silicon area: the weight matrix area grows quadratically 
while the input/output dimensions grow linearly. 
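The scaling argument can be checked with simple counting. The sketch below uses hypothetical network sizes and counts cross junctions as a stand-in for matrix area, with two synapses per 20 junctions as in the 4×5 sub-matrix scheme.

```python
def matrix_counts(g, m, p, h):
    """Count I/O lines and weight matrix junctions for g glomeruli with m mitral
    cells each and p piriform patches with h cells each."""
    lot_lines = g * m
    piriform_lines = p * h
    junctions = lot_lines * piriform_lines
    return {
        "io_lines": lot_lines + piriform_lines,
        "junctions": junctions,
        "synapses": junctions * 2 // 20,   # two connections per 4x5 sub-matrix
    }


small = matrix_counts(g=4, m=4, p=2, h=5)
large = matrix_counts(g=8, m=4, p=4, h=5)   # both dimensions doubled
print(small, large)
```

Doubling the number of glomeruli and patches doubles the I/O line count but quadruples the junction count, which is why sharing one matrix between the forward and backward phases matters most for larger networks.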
Current conveyor (CC) based bi-directional voltage/current buffers (BiVI) permit bi-
directional use of W. They provide the dual functions of voltage drivers and current 
sources/sinks to isolate the W matrix in the forward and backward mode. During the feed 
forward cycles, the BiVI buffers on the mitral side and the piriform side act as the voltage 
controlled voltage sources and the current controlled current sources respectively. Their 
roles are reversed in backward cycle. The detailed, time-multiplexed BiVI buffer 
operation is described in chapter III. 
The currents produced by the inner-products between the LOT activity and the sparse 
weights are summed on the columns of W according to Kirchhoff's current law. The 
weight matrix columns are organized into p patches with h neighboring columns per patch. 
The resulting inner-product analog currents $P_{kl}^{*}$ are amplified/scaled by the BiVI buffers 
and fed into the PC. In the PC, the excitatory piriform cells $P_{kl}$ are arranged into p disjoint 
winner-take-all piriform patches with h piriform cells per patch. The indices k and l 
indicate the piriform patch and the cell number within the piriform patch, respectively. 
Each column feeds only one piriform cell. During the sub-phase $\phi_{11}$, the piriform patches 
exhibit a winner take all competition within a patch, which results in only the piriform cell 
associated with the highest input current becoming logic high while the rest of the cells 
remain at logic low. The winning piriform cell is declared activated only if the 
corresponding input current to the piriform cell is equal to or greater than the piriform cell 
threshold $\theta_P$. 
The output Pkl of the WTA should ideally have only p winners. However, due to the finite
resolution of the WTA circuit, it is not possible to avoid ties between the highest and the
few near-highest input currents. A tie resolver circuit has been added to perform post-WTA
processing during phase φ12, thereby digitally resolving the ties. The vectors Pkl and PWkl are
the unresolved input and the resolved output, respectively. During the multi-sampling process,
the resolved WTA outputs produce a distinct output code. This output code is used for
clustering as well as forming the basis for feedback inhibition.
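The patch-wise competition and tie resolution described above can be sketched behaviorally as follows. This is only an illustrative model, not the fabricated circuit: the function name, the `resolution` parameter that models the finite WTA resolution, and the lowest-index tie-breaking rule are all assumptions, since the text states only that ties are resolved digitally.

```python
import numpy as np

def wta_with_tie_resolver(currents, theta_p, resolution=0.0):
    """Behavioral sketch of one WTA pass over p patches of h cells each.

    currents   : array of shape (p, h), input current to each piriform cell
    theta_p    : piriform cell threshold
    resolution : assumed finite resolution; cells within this margin of the
                 patch maximum are indistinguishable from the winner
    Returns (unresolved, resolved) boolean arrays of shape (p, h).
    """
    currents = np.asarray(currents, dtype=float)
    peak = currents.max(axis=1, keepdims=True)
    # Unresolved WTA output: near-highest cells may all go high, but only
    # if they also reach the piriform threshold.
    unresolved = (currents >= peak - resolution) & (currents >= theta_p)
    # Tie resolver (assumed rule): keep the lowest-index active cell per patch.
    resolved = np.zeros_like(unresolved)
    rows = np.arange(currents.shape[0])
    first = unresolved.argmax(axis=1)
    resolved[rows, first] = unresolved[rows, first]
    return unresolved, resolved
```

With a resolution of 0.1, a patch with currents (1.0, 5.0, 4.95) produces two unresolved winners and a single resolved winner, while a patch whose largest current stays below the threshold produces no winner at all.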
To implement feedback inhibition to the OB by the PC during the backward phase
φ2, the binary outputs of the resolved winning piriform cells PWkl are latched and reciprocally
applied via the caudal BiVI buffers to the multiplexed W^T matrix. This generates
inhibitory currents on the respective LOT lines, which are configured for sinking the currents. The
resulting inhibitory currents are amplified/scaled by the rostral BiVI buffers. By
Kirchhoff's current law, the inhibition on m consecutive caudal BiVI or backward LOT lines
is summed by switching them together, thus forming an aggregate un-thresholded
inhibition I'i (equation 9) to the glomeruli from which the respective forward LOT lines
originate. The multiplexed operation of the weight matrix together with the BiVI buffers
is discussed in chapter III.
The CCI provides the function of accumulative collateral feedback inhibition from
the active piriform patches. If the corresponding LOT line was active in the feed-forward
phase, I'i is sampled and stored in each feedback cycle by the CCI circuit. The CCI runs
in two phases, φ21 and φ22, for storing and for updating I*i, respectively. During the
multi-sampling process of the feedback phase, the inhibition Ii is applied to the glomerulus at the
end of each minor cycle. The inhibition persists so that it can be used during the next cycle in the
forward phase. During the successive cycles, all of the inhibition currents that are
generated in the backward phase are sampled and added to the previously stored inhibition.
In this way, as the multi-sampling proceeds, the cumulative inhibition up to the present minor
cycle is applied to the glomerulus to inhibit the stronger input patterns, thus making the
remaining (weaker) patterns more significant and allowing them to take an active part in
the overall clustering process. The inhibitory threshold θI imposed on I'i (equation 10)
is necessary to eliminate the effects of floor noise on inhibition.
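The accumulation performed by the CCI over successive minor cycles can be illustrated with a short behavioral sketch. Equations 9 and 10 are not reproduced here, so the exact form of the thresholding is an assumption: the sketch simply discards any sampled inhibition below θI and accumulates the rest only on LOT lines that were active in the forward phase. All names are hypothetical.

```python
import numpy as np

def accumulate_inhibition(stored, sampled, lot_active, theta_i):
    """One feedback cycle of cumulative inhibition (behavioral sketch).

    stored     : inhibition accumulated over the previous minor cycles
    sampled    : un-thresholded inhibition sampled in this backward phase
    lot_active : boolean mask, True where the LOT line was active in the
                 preceding feed-forward phase
    theta_i    : inhibitory threshold that rejects floor noise (assumed
                 to act as a simple cut-off)
    """
    stored = np.asarray(stored, dtype=float)
    sampled = np.asarray(sampled, dtype=float)
    lot_active = np.asarray(lot_active, dtype=bool)
    # Keep only inhibition that is above the noise floor on active lines.
    gated = np.where(lot_active & (sampled >= theta_i), sampled, 0.0)
    # Add it to what was stored in earlier cycles.
    return stored + gated
```

Calling the function once per minor cycle and feeding its result back in as `stored` reproduces the behavior described above: the inhibition applied to each glomerulus grows with every cycle in which its LOT line contributes.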
CHAPTER III 
SYSTEM BUILDING BLOCKS 
This chapter focuses on the design, simulation, and testing of the basic building
blocks. Simulations were performed using SPICE. Some auxiliary circuits, such as
current sources/sinks, digital control pulses, and triggering circuits, had to be bread-boarded
before the actual blocks could be tested for their functionality. Testing was mainly
performed to check the DC response. Future tests of the transient response for speed verification
will require that the testing circuitry be fabricated on-chip alongside the functional block
being tested. This avoids the external capacitance that the test setup itself contributes to the
block. Fabrication of such on-chip testing circuitry is beyond the scope of this thesis.
However, we have reported transient responses as measured by an oscilloscope at the pad
pins. This adds a parasitic setup capacitance to the circuit nodes. Therefore,
extrapolation of the transient results is required to estimate the internal bandwidth.
Considering the vast and more important topics ahead, we leave this topic for the
future. The system level integration of the olfaction system is also beyond the scope of
this study.
Glomerulus Normalization 
In real-world artificial intelligence problems, such as pattern recognition, natural
language processing, and olfactory clustering, the signals in the input vector on the multiple
channels convey useful information both in the position and in the amplitude ratios of the
vector elements. The signal processing burden in such cases is high. The absolute level
of these signals may have minimal bearing on the final outcome of the classification or
clustering of related observations. In such applications, simple analog circuits can be
used to perform the task of signal normalization, that is, to generate an output array in
which each element is proportional to the corresponding element in the input array when
normalized by a suitably derived metric of the overall magnitude of the input, generally
the largest element or the sum of all the elements in the input array.
Several signal normalization techniques based on the translinear principle have been
reported and realized in monolithic form [39]. These circuits exhibit undesirable pattern
sensitivity because they do not scale with respect to a suitably derived metric of the overall
magnitude of the elements in the input array, i.e., the scaling becomes a function of the number
of input elements. Also, being bipolar, they cannot be used in a bulk CMOS process.
The appeal of normalization circuits lies in the prospect of performing massively parallel
and truly concurrent signal processing.
In this section, various schemes to achieve olfactory bulb normalization are
presented. The normalization scheme shown in Figure 3 consists of two feedback loops
across all of the bulb inputs. An AGC function controls the gain, and an offset function
ensures that the activity of the normalized outputs is large enough in amplitude to strongly activate
the mitral patch. The other normalizing schemes, illustrated in Figure 4 and Figure 5, do
not incorporate an offset function but use linear and nonlinear (square law) functions,
respectively, to process the bulb inputs in the feedback loop. The following sub-sections





Figure 4. Linear Limiter with AGC Normalization Function
Figure 5. Square Law Normalization Function
AGC and Offset Combined Normalizing Function 
The block diagram of the AGC and offset combined linear normalization function
is shown in Figure 3. The normalization block consists of two feedback loops across
the inhibited bulb input G*i. The basic building blocks involved are the multiplier, the offset
summer, and the operational amplifier. Off-chip operational amplifiers will be used to
simplify conceptual block testing. G'i is the signal obtained after AGC, while Gi is the
normalized offset output signal. The multiplier in the closed loop ensures a constant
level, KG percent, of mitral activity. The offset summer in the closed loop detects the
maximum element G'imax in G'i and adds the difference between the set point IFS and
G'imax to all the elements in G'i. This difference between the set point IFS and
G'imax is referred to as the offset. The offset activity can best be illustrated mathematically:
$$\text{Offset} = I_{FS} - G'_{i\max} \tag{12}$$
and the output is given as
$$G_i = G'_i \pm |\text{Offset}|, \qquad \begin{cases} + & \text{if } G'_{i\max} < I_{FS} \\ - & \text{if } G'_{i\max} > I_{FS} \end{cases} \tag{13}$$




where gm is the linear transconductance of the multiplier and the scalar VK is the smallest
value that satisfies
(15)
where KG is the glomerulus activity constant. The AGC closed-loop activity sets VK at
(16)
The multiplication in equation 14 and the offsetting of the automatic gain controlled
vector in equations 12 and 13 are accomplished by the multiplier and offset summer circuits
of Figures 6 and 15, respectively. The following sections discuss each functional building
block in detail.
Transconductance Multiplier 
The schematic diagram of the cross-coupled double quad CMOS transconductance
multiplier used in the normalization circuit is shown in Figure 6. The circuit is composed
of the linear CMOS transconductor and its biasing circuitry. The transconductor is a
crucial component of the design since it may limit the multiplier linearity, frequency
response, and noise performance. High linearity for large input signals, low noise, no
dominant internal poles, large transconductance, and low quiescent power dissipation are
the desired properties of any transconductance circuit. Several techniques for improving
the linearity of MOS transconductance elements have been proposed [40]. Most of
the differential transconductance schemes can be broadly classified into four categories:
adaptive biasing, class A-B, source degeneration, and current differencing. Some combine
two or more of these techniques to achieve linearization. Detailed information can be
found elsewhere [40].
Figure 6. Transconductance Multiplier (M20 is marked "Should be Added In Future Designs")
The proposed circuit belongs to the class A-B transconductors. Transconductors
in which the maximum output is greater than the quiescent bias current (η > 100%)
generally operate in the class A-B mode. Class A-B transconductors typically exploit the
square law characteristics of an MOS transistor in the saturation region to achieve
linearization [41]. In this section, we will discuss the fundamental principle and design
of the transconductance multiplier circuit.
The fundamental principle of the class A-B transconductors can be understood by
examining the two-transistor configuration shown in Figure 7. Assuming both transistors
are perfectly matched and operating in the saturation region, the differential output current
is given by:
(17)
The above equation states that a linear transconductance can be achieved by ensuring that the
sum of the gate-to-source voltages is constant. With the sum constant, if Vid is equal to
VGS1 − VGS2, then equation 17 reduces to:
(18)
where VCM = (VGS1 + VGS2)/2 is the common mode input level. The transconductance
gm = β(VCM − VT) is linear and may be varied electronically by adjusting the common mode
input level. From the above analysis, the fundamental principle of class A-B operation
can be defined as: "Under the conditions of a constant sum of gate to source voltages, two
matched MOS transistors operating in the saturation region display a linear relationship
between the difference of the gate-to-source voltages and the difference of the drain
currents" [40]. This applies in the operating range:
In this region the current may vary as
$$-2I_{DC} \le I_{diff} \le 2I_{DC}, \qquad I_{DC} = \beta\,(V_{CM} - V_T)^2$$
Figure 7. Demonstration of Class AB Principle With Two Transistors
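The class A-B principle quoted above can be checked numerically. The sketch below assumes the square-law model I_D = (β/2)(V_GS − V_T)^2, chosen because it reproduces the transconductance gm = β(VCM − VT) stated in the text; the function names and the numeric values are illustrative only.

```python
def drain_current(v_gs, beta, v_t):
    """Square-law saturation current, assuming I_D = (beta/2)(V_GS - V_T)^2."""
    v_ov = max(v_gs - v_t, 0.0)  # the device is cut off below threshold
    return 0.5 * beta * v_ov ** 2

def differential_current(v_id, v_cm, beta, v_t):
    """I_D1 - I_D2 for two matched devices whose V_GS sum is held at 2*V_CM."""
    i1 = drain_current(v_cm + v_id / 2.0, beta, v_t)
    i2 = drain_current(v_cm - v_id / 2.0, beta, v_t)
    return i1 - i2

if __name__ == "__main__":
    beta, v_t, v_cm = 100e-6, 0.8, 1.8      # hypothetical device values
    gm = beta * (v_cm - v_t)                # expected linear transconductance
    for v_id in (-2.0, -1.0, 0.0, 0.5, 2.0, 3.0):
        i_diff = differential_current(v_id, v_cm, beta, v_t)
        print(f"Vid={v_id:+.1f} V  Idiff={i_diff:+.3e} A  gm*Vid={gm * v_id:+.3e} A")
```

Under these assumptions the two columns agree as long as neither device is cut off, i.e., for |Vid| up to 2(VCM − VT), and diverge beyond that point because one transistor no longer contributes a cancelling square term.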




The second fundamental principle involved in the design of the cross-coupled double
quad transconductor is the replacement of the single transistor M1 or M2 with the CMOS
double pair shown in Figure 8. This overcomes the matching problems associated with
n-channel class A-B operation. Using the saturation region equation of the MOS device
results in
(21) 










$$\beta_{eq} = \frac{\beta_N\,\beta_P}{\left(\sqrt{\beta_N} + \sqrt{\beta_P}\right)^2} \tag{24}$$
This concludes that a pair of opposite polarity MOS transistors acts as a single transistor 
with an equivalent threshold voltage and transconductance given by equations 23 and 24, 
respectively. 
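The equivalence between a CMOS double pair and a single transistor can be illustrated numerically. The sketch below assumes the square-law model I_D = (β/2)(V − V_T)^2 for both devices and takes the equivalent threshold as V_Tn + |V_Tp|, which is the standard result for a series NMOS/PMOS pair; it is stated here as an assumption because equation 23 is not reproduced in the text. All names and values are hypothetical.

```python
import math

def equivalent_pair(beta_n, beta_p, vt_n, vt_p_abs):
    """Equivalent beta and threshold of a series NMOS/PMOS pair."""
    beta_eq = beta_n * beta_p / (math.sqrt(beta_n) + math.sqrt(beta_p)) ** 2
    vt_eq = vt_n + vt_p_abs  # assumed form of the equivalent threshold
    return beta_eq, vt_eq

def pair_voltage(i_d, beta_n, beta_p, vt_n, vt_p_abs):
    """Total gate-to-gate voltage needed for both devices to carry i_d."""
    v_gs_n = vt_n + math.sqrt(2.0 * i_d / beta_n)
    v_sg_p = vt_p_abs + math.sqrt(2.0 * i_d / beta_p)
    return v_gs_n + v_sg_p

if __name__ == "__main__":
    beta_n, beta_p, vt_n, vt_p_abs = 100e-6, 25e-6, 0.8, 0.9
    beta_eq, vt_eq = equivalent_pair(beta_n, beta_p, vt_n, vt_p_abs)
    v = pair_voltage(20e-6, beta_n, beta_p, vt_n, vt_p_abs)
    # The single equivalent device should carry the same 20 uA at voltage v.
    print(0.5 * beta_eq * (v - vt_eq) ** 2)
```

The check works because the overdrive voltages of the two devices add in series, so that 1/sqrt(β_eq) = 1/sqrt(β_N) + 1/sqrt(β_P), which is the same relation as the expression for β_eq used in the function.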
In equation 18, the transconductance is perfectly linear. Although they have excellent
linearity and efficiency, some of the class A-B implementations suffer from limitations
such as the requirement of fully balanced signals for non-linearity cancellation and poor
common mode rejection. The class A-B double quad circuit, which overcomes most of
these problems, is shown in Figure 9. From the fundamental principle of class A-B
transconductance and equation 18, the sum of the gate-to-source voltages of M1 and M2




In the above equation, Vid need not be a balanced input. Current source biasing can be
ratio. As a result, the common mode input level no longer affects the transconductance
or the linear range. Substituting equations 25 and 26 into equation 17 results in
(27)
The transconductance gm = 2βVB is perfectly linear and can be tuned by changing VB. It
should be noted that the differential current flows through the floating voltage sources.




The magnitude of these voltage sources should remain constant regardless of the current
flowing through them. A better solution can be realized by replacing each single transistor
(M1 and M2) with its CMOS equivalent double pair as described previously. This does not
change the circuit behavior since the CMOS double pair acts like a single transistor,
except that β in equation 27 is replaced by βeq and VT by VTeq as given by equations 23 and
24 (see Figure 10). In this configuration, the drain current no longer flows through the
floating voltage sources. Thus the required floating voltage sources can now be realized
by diode-connected CMOS pairs biased with a current sink, as shown in Figure 11.
From equation 22, the bias voltage VB is given by:
$$V_B = \sqrt{\frac{2 I_B}{\beta_{eq}}} \tag{28}$$
Combining this biasing network with Figure 10 results in the final transconductor
shown in Figure 6. The differential output current is obtained by incorporating
differential mirroring pairs consisting of M9, M14 and M10, M15. Finally, from equations
27 and 28
(29)
for the differential range
$$|V_{id}| < V_B \tag{30}$$
The transconductance of equation 29 is perfectly linear and can be tuned with the bias
current IB. Since both the quiescent current and the maximum linear output current are
4IB, the maximum efficiency is 100%. The efficiency can be increased above 100% by
decreasing the W/L ratio of the inner quad transistors with respect to the outer quads.
The differential output current is obtained by incorporating the differential mirroring stage
in series with the inner quads. From Figure 6 and equation 29
(31)
Figure 10. Linear MOS Transconductance Using the MOS Equivalent Pair
Figure 11. Double Pair Implementation of a Floating Voltage Source
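The statement that the maximum linear output current equals 4IB can be reproduced with a few lines of arithmetic. The sketch assumes a transconductance of the form gm = 2·βeq·VB (the form given above for the floating-source transconductor, with β replaced by βeq) together with a bias voltage VB = sqrt(2·IB/βeq), and it takes the linear range to end at |Vid| = VB. The function names and numeric values are hypothetical.

```python
import math

def bias_voltage(i_b, beta_eq):
    """Overdrive voltage set by the bias current, assuming V_B = sqrt(2*I_B/beta_eq)."""
    return math.sqrt(2.0 * i_b / beta_eq)

def transconductance(i_b, beta_eq):
    """Assumed transconductor gain g_m = 2 * beta_eq * V_B, tunable through I_B."""
    return 2.0 * beta_eq * bias_voltage(i_b, beta_eq)

def max_linear_output(i_b, beta_eq):
    """Output current at the edge of the linear range, |V_id| = V_B."""
    return transconductance(i_b, beta_eq) * bias_voltage(i_b, beta_eq)

if __name__ == "__main__":
    i_b, beta_eq = 10e-6, 20e-6   # hypothetical bias current and equivalent beta
    print(bias_voltage(i_b, beta_eq), transconductance(i_b, beta_eq))
    print(max_linear_output(i_b, beta_eq), 4 * i_b)
```

Because gm·VB = 2·βeq·VB^2 = 4·IB, the last two printed values coincide for any choice of IB and βeq, and gm itself grows only as the square root of IB.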
It should be clear from equations 29 and 31 that by making the bias current IB a
function of one variable, VK, the transconductance circuit can be used as a linear
multiplier. However, it is essential that both bias current sources remain in saturation
over the entire operating range so that the transconductance is a linear function of the gate
voltage VK. In order
to maintain symmetry of operation in both the quads, as well as to achieve good
efficiency, the following geometrical relations can be determined by inspection. For the
symmetrical operation of the inner quads: (W/L)2 = (W/L)3 and (W/L)6 = (W/L)7. To have
identical biasing resistance for both quads: (W/L)16 = (W/L)18 and (W/L)17 = (W/L)19; and for
proper differential current mirroring: (W/L)9/(W/L)14 = (W/L)10/(W/L)15. Moreover, to
the outer quad. The inner quad is made geometrically half of the outer quad to achieve
greater than 100% efficiency, i.e., β2 = β/2. With these simplifications, equation 31
becomes
$$I_{diff} = \sqrt{\beta\,\beta_{11}}\,(V_K - V_{T11})\,V_{id} = g_m\,(V_K - V_{T11})\,V_{id} \tag{32}$$
where gm = (β β11)^(1/2) is the linear transconductance. As pointed out earlier, the input does
not have to be fully balanced. By keeping one end (point b) of the differential input Vid at
ground potential, the differential signal can be made single-ended. But then VDS at 5 V
is not sufficient to keep M8 and M12 in saturation for higher values of VK. M12 then falls
into the triode region and the multiplier suffers from linearity degradation. Therefore, the
common mode range of the differential signal should be increased by VT or 2VT,
allowing increased VDS headroom to keep M8 and M12 in saturation over the entire
operating range of VK. This is achieved by two identical complementary linear resistors
Rin, each made of transistors M16, M17 and M18, M19 connected in a back-to-back fashion
as shown in Figure 6. Finally,
$$G'_i = I_{diff} = g_m\,(G^*_i R_{in})\,\Delta V_K \tag{33}$$
where ΔVK = VK − VT11. The resistance Rin is given by:
(34)
Clearly, the output current is a function of the input current G*i and the scalar VK.
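The multiplying action can be illustrated with a minimal behavioral model in which the output current is the product of the gain constant, the input voltage G*i·Rin, and the control overdrive VK − VT11. In this sketch Rin is treated as a constant, which is a simplification because the text describes it as nonlinear; the function name and all numeric values are hypothetical.

```python
def multiplier_output(g_star, v_k, g_m, r_in, v_t11):
    """Behavioral model of the multiplier: G'_i = g_m * (G*_i * R_in) * (V_K - V_T11).

    g_star : inhibited input current G*_i, in amperes
    v_k    : control (scaling) voltage V_K, in volts
    g_m    : gain constant, in A/V^2
    r_in   : input resistance, assumed constant here, in ohms
    v_t11  : threshold voltage of the bias transistor, in volts
    """
    v_id = g_star * r_in       # input current converted to a voltage
    delta_v_k = v_k - v_t11    # control overdrive
    return g_m * v_id * delta_v_k

if __name__ == "__main__":
    for v_k in (1.0, 1.4, 1.8, 2.2):
        print(v_k, multiplier_output(100e-6, v_k, g_m=50e-6, r_in=10e3, v_t11=0.8))
```

The model is linear in each argument separately: doubling either the input current or the control overdrive doubles the output, which is the property the closed-loop normalization relies on.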
Simulations. The SPICE simulations of the DC transfer characteristics of the
multiplier are shown in Figure 12. A family of curves is obtained by ramping the
inhibited receptor currents G*i from 0 to 250 µA for different closed-loop voltages VK,
varied over the 1 V to 2.4 V range in steps of 0.2 V. The output currents are sampled
via Ro.
The transconductance obtained from the DC transfer characteristics is approximately
linear and satisfies equation 32. The non-linearity at lower input values is due to the
non-linear Rin (equation 34). The non-zero output current at G*i = 0 is a result of the
differential offset that is present due to the finite amount of current that is required to
flow to provide Vb.
Figure 12. DC Transfer Characteristics of a Transconductance Multiplier
Testing. Due to the difficulty in obtaining the ramped DC sinking current G*i, a sawtooth
voltage Va is used instead of G*i. This requires that the bias voltage Vb (see Figure
6) be known in order to confirm the zero crossing of the output currents. The SPICE
simulations performed on the extracted file are shown in Figure 13. Vb was found to be
2.25 V. Keeping this in mind, a family of curves is obtained by ramping Va from 1 V
to 3.5 V for different values of VK varied over a 1 V to 2.5 V range. Test results are
shown in Figure 14. During the testing, two-quadrant operation of the multiplier is
exhibited because Vb is biased to a positive voltage (2.25 V) instead of zero.
Thus, the sign of the differential input voltage (Vid = Va − Vb), and with it the sign of the
output current (equation 32), changes when Va is varied from below Vb to above Vb. The output current is
sampled in terms of the voltage drop Vo across the 10 kΩ precision resistor Ro.
The test results are compared with the results obtained from the SPICE
simulations performed under identical conditions. The multiplier behaves linearly
within the operating range. The percentage difference between the output currents
obtained from simulation and testing is found to be below 25 percent. The nonlinearity
that is present is due to the nonlinear resistance given by equation 34.
Offset Circuit 
















[Plot: curves versus Va; one curve labeled VK = 2.5 V]
[Diagram: Offset Circuit and Offset Summer; input G'i, output Gi; repeated per element in input vector]






is divided into two parts: the global off-setter and the offset summer, which is repeated
per element of the input vector. The simulations consider four bit wide input and output
analog current vectors, G'i and Gi, respectively.
The off-setter is comprised of an operational amplifier (op-amp) connected in the
negative feedback loop, and the offsetting circuit (M13-16, M14A, M15A, and M m). The op-amp
may be integrated locally on-chip, or it may be connected off-chip at the expense of
the bandwidth. Simulations are performed by
considering an ideal op-amp, while for testing purposes it is replaced by a discrete off-chip
amplifier. The voltage drop across M17-18 and MB2 corresponding to the full scale
input current IFS forms the inverting input to the op-amp.
The offset summer is comprised of the compensator (M9-12), a current mirror (M1-8),
and the maximum function circuit (M19-20, MB1). The bias voltage VB2 required by the
current mirror is generated on-chip by the voltage reference circuit (MVB2P and MVB2N).
G'i, where i = 1, 2, ..., g, forms the input vector to the offset summer. Let G'imax be the
maximum value of the input current among the elements of G'i. This circuit achieves
the offsetting of the input vector G'i by adding the difference between the full scale
current and the maximum input current (IFS − G'imax) to all the elements of the input vector,
including G'imax. All of the elements in the output vector Gi are offset by an equal
amount. This is due to the fact that the compensating circuit, along with the op-amp,
generates in parallel the same global feedback +VFS and −VFS to all the individual
elements. The appeal of this circuit lies primarily in the prospect of performing massively
parallel signal normalization. Note that the bandwidth is limited by the slew rate of the
operational amplifier and the CGS load of M10,11.
Two copies, GCi and GCCi, of the output vector Gi are generated by the current
mirror. Cascodes M5-7 minimize the copying error that is present due to channel length
modulation. Gi forms the input to the mitral patch for further processing of the signal.
GCi is used in a closed loop to maintain the input activity at KG percent, while GCCi feeds
the maximum function circuit that is used to detect the maximum element
G'imax in the input vector G'i. The multiple input-single output maximum function circuit
is comprised of several single input-single output sub-circuits connected in parallel and
repeated for each element of G'i. Each sub-circuit (M19-20) is comprised of two MOS
devices connected in diode fashion. The circuit diagram (g = 4) of the maximum function
circuit is shown in Figure 16. MB1 is a long channel transistor necessary to provide
the leakage current to bias M20A, M20B, M20C, and M20D. Nodes A, B, C, and D possess
different potentials depending on the corresponding mirrored input currents that are
flowing through M19A, M19B, M19C, and M19D, respectively. The node "-" acquires
max{VA, VB, VC, VD}, corresponding to the maximum current. This voltage reverse biases
all other diodes except the diode in the branch with the maximum current, thereby detecting
the maximum potential corresponding to G'imax. The value of G'imax can be
estimated by comparing max{VA, VB, VC, VD} with the drop across an identical structure
(M17-18) due to the known full scale current. The voltage drop across the identical sub-circuit
M17-18 forms the inverting input of the op-amp. Since IFS is a single element, M18 and MB2
are not strictly necessary; they maintain symmetry in order to minimize the input offset.
The output of the op-amp Vo, together with the compensating circuit, provides global
feedback to the compensator via two buses: +VFS and −VFS. Applying KVL around the








Figure 16. Maximum Function Circuit
(35) 
Assuming M14A and M15A are operating in saturation,
a leakage current IL to keep M14A and M15A conducting in the subthreshold region.
The sign and value of the differential input determine the sign and value of Vo. As Vo
increases in the positive direction, VGS14 and VGS15 increase. M14 pulls down bus +VFS,
while PMOS M15 loses its pull-up action; therefore bus −VFS also decreases. The converse
is true if Vo decreases in the negative direction. At any point in time, the difference between
+VFS and −VFS remains constant, but the mean changes. In other words, +VFS and −VFS
vary in the same direction by an equal amount.
The operating principle of the offset summer circuit can best be illustrated by
considering the circuit operating in a closed loop configuration. Assuming that G'imax > IFS,
the following sequence of operations takes place. The current mirror generates two
copies, GCi and GCCi, of the output vector Gi. Each one is used for a specific purpose
as described earlier. Since G'imax > IFS, VD becomes greater than Vr, thus Vo increases in
the positive direction, resulting in a decrease in both +VFS and −VFS. M9 starts conducting
while M12 shuts off. If G'imax has to be normalized to IFS, then the extra current, G'imax − IFS,
must come from M9. Thus, only IFS flows through M4, which after mirroring is available
as an offset version of the input G'i. Since +VFS and −VFS are common to all the elements
in the input vector, the same amount, G'imax − IFS, is removed from every element in the input
vector. Since G'i is a constant, only G'i − (G'imax − IFS) flows through the corresponding M4,
which when mirrored is available as Gi. The identical but reverse action takes place if
G'imax < IFS. M9 shuts off and the shortfall (IFS − G'imax) flows through M12; thus the element
corresponding to G'imax gets offset to become G'imax + IFS − G'imax = IFS, and all other elements
are offset to G'i + IFS − G'imax.
Simulations. The SPICE simulations of the DC transfer characteristics of the offset
summer circuit are shown in Figure 17. For simplicity, a four bit wide input vector
G'1-4 is considered. The DC transfer curves are obtained by holding G'2-4 at the constant
levels of 44 µA, 34 µA, and 24 µA, respectively, whereas G'1 is ramped from 100 µA
to 0. The full scale current, ID(MFS), is held at 64 µA. The offset output current is
sampled via M4 in each sub-circuit which, when mirrored, is available as the normalized
output vector G1-4. Normalization is analyzed at the four discrete points A, B, C, and D.
These points are shown on the plot.
At point A, G'1 = G'imax. According to the previous discussion, when G'imax > IFS, the offset
is IFS − G'imax, i.e., −36 µA in this case, which is added to all the elements in the input vector.
Mathematically, the normalized currents G1-4 become 64 µA, 8 µA, −2 µA, and −12 µA,
respectively. The normalized currents obtained at point A in the plot are off by 4 µA
since the maximum element gets normalized to 68 µA rather than to 64 µA. This error
is shown on the plot. The error is attributed primarily to the copying fidelity of the
current mirrors.
At point B, G'1 = 64 µA. Thus the offset reduces to zero. Mathematically, the
normalized currents G1-4 should possess their original values of 64 µA, 44 µA, 34 µA, and
24 µA, respectively. The currents obtained are once again off by 4 µA because of the
previously stated reasons.
At point C, G'1 = G'2. Any further reduction in G'1 makes G'2 = G'imax. Thus, from this
point forward, a constant offset (IFS − G'2), 20 µA in this case, is added to G'1-4.
Mathematically, at point D, the normalized currents G1-4 should be 20 µA, 64 µA,
54 µA, and 44 µA, respectively. The currents obtained from the plot at point D verify
these values.
Figure 17. Offset Summer DC Characteristics for Four Branches
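The ideal values quoted for points A, B, and D can be reproduced with a short calculation of the offset function, in which the difference between the full scale current and the largest input is added to every element. The sketch models only the ideal arithmetic, not the mirror copying error observed in simulation; the function name is hypothetical, and point D is taken to be the end of the ramp, where G'1 = 0.

```python
def offset_normalize(g_prime, i_fs):
    """Ideal offset summer: add (I_FS - max input) to every element."""
    offset = i_fs - max(g_prime)   # negative when the maximum exceeds I_FS
    return [g + offset for g in g_prime]

if __name__ == "__main__":
    i_fs = 64                                          # full scale current, in uA
    print(offset_normalize([100, 44, 34, 24], i_fs))   # point A
    print(offset_normalize([64, 44, 34, 24], i_fs))    # point B
    print(offset_normalize([0, 44, 34, 24], i_fs))     # point D
```

In every case the largest element of the result equals the full scale current, while the differences between elements are left unchanged.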
Testing. The testing of the entire offset circuit connected in a closed feedback loop
did not lead to conclusive results. To locate the fault, each sub-circuit was tested
separately.
The common node formed by the gates of M14 is a high impedance node. The leakage
resistance R in Figure 15 is essential to provide the bias current to M8. Without R, even
though correctly biased by VB2, M8 fails to configure M14 in the current mirror mode; thus
no feedback in the closed loop is made available to the maximum function circuit. The
value of R is of the order of one megohm.
CMOS technology inherently does not offer an area-efficient way to realize linear, high-valued
on-chip resistances. Realizing R internally by using a poly resistor is not an area-efficient
solution. Alternatively, the common gate can be made available externally so that an
external resistance can be used. The latter needs i pins for an i bit wide input vector. Our
failure to implement R by either means prevented us from testing the entire circuit in a
closed loop. However, the effect of Vo on +VFS was observed. Vo was ramped
linearly from 0 to 3 V. From Figure 15,
+ v,s= VDD-~ CVo-vT,.,)+ VTIJ (37) 
The testing results closely follow the above equation. 
Linear Limiter with AGC Normalization Function 
The block diagram of the linear limiter with AGC normalizing function is shown in
Figure 4. It is essentially identical to the AGC and offset combined normalization
function, except that it uses different normalization parameters and does not have an
offset function. It consists of an AGC feedback loop across all the inhibited bulb inputs
to perform the task of signal normalization, that is, to generate an output array in which
each element is proportional to the corresponding element in the input array divided by
the largest element of the input array.
First, using the previously described maximum function circuit, the maximum
element in the inhibited input vector G*i is detected. From Figure 4 and equation 33
(38)
where gm and Rin have their usual meanings. Also from Figure 4
(39)
where ΔVK = VK − VT11. Substituting equation 38 into equation 39 results in
(40)
If Av1 is sufficiently large such that 1 + G*imax Av1 Rin R gm ≈ G*imax Av1 Rin R gm, then
$$\Delta V_K \approx \frac{I_{FS}}{G^*_{i\max}\,R_{in}\,g_m} \tag{41}$$
Finally, combining equation 41 and equation 42 results in
$$G_i = G^*_i\,\frac{I_{FS}}{G^*_{i\max}} \tag{43}$$
The maximum element in the input vector is always normalized to a
predetermined full scale value, while all other elements are scaled by the ratio of their
absolute values to the maximum value. In other words, this scheme represents the
normalized vector in terms of the ratios of the individual elements to the
maximum element in the input vector.
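The ratio-preserving behavior of the linear limiter can be illustrated with the ideal relation in which every element is multiplied by the full scale current divided by the largest input. The sketch below assumes an ideal loop with very large gain and a positive maximum input; the function name and values are hypothetical.

```python
def linear_limiter_normalize(g_star, i_fs):
    """Ideal linear limiter with AGC: G_i = G*_i * I_FS / max(G*)."""
    g_max = max(g_star)
    if g_max <= 0:
        raise ValueError("the maximum input must be positive to normalize")
    return [g * i_fs / g_max for g in g_star]

if __name__ == "__main__":
    print(linear_limiter_normalize([100.0, 50.0, 25.0], 64.0))
    print(linear_limiter_normalize([10.0, 5.0, 2.5], 64.0))
```

Both calls return the same normalized vector, since the two inputs differ only in absolute level. Unlike the offset scheme, which preserves differences between elements, this scheme preserves their ratios.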
Square Law Bulb Normalization Function 
The block diagram of the square law bulb normalization function is shown in Figure
5. Conceptually, this scheme is similar to the schemes described previously except for some
important features. The GLA model [17] calls for the scaling of the un-normalized glomerulus
activity G*i by a suitable scalar VK, such that the sum of the non-linearly processed and
scaled un-normalized glomerulus activity is constant (equation 2 and equation 3). The
effect of such a nonlinear normalization on the overall clustering process is discussed in the
multi-sampling section. The nonlinear sigmoid-like transfer function is mathematically
characterized by equation 4.
The normalization scheme is comprised of the multiplier, the approximate sigmoid
function gs(·), and an on-chip or off-chip operational amplifier. From Figure 5 and equation 33,
the AGC scaled vector G'i is
(44)
and the normalized activity is
(45)
where gm and Rin carry their usual meanings, gs(·) is the approximate sigmoid transfer
function given by equation 4, and the scalar VK is the smallest value that satisfies
$$\sum_{i=1}^{g} g_s(R\,G'_i) = K_G \tag{46}$$
In closed loop, VK settles at
(47)
The following sections discuss the electronic realization of the approximate sigmoidal
transfer function.
Approximate Sigmoidal Function 
The squashing cell is shown in Figure 18 [42]. It takes advantage of the inherent
nonlinear drain-to-source I-V characteristics of a MOS device to generate a continuously
differentiable and gain-programmable transfer function. The cell is versatile, is extremely
simple to design, and provides independent voltage or current programmable control of
the gain. From an analog electronic system perspective, the sigmoidal non-linearity can
be thought of as an amplifier with a nonlinear transconductance. This results in nonlinear
DC transfer characteristics. The gain is the slope of the output-input curve at a specific
input excitation level. It varies from a low value at large positive or negative excitations
(flat portions of the curve in Figure 20) to a maximum value at zero excitation. This
non-linear transconductance normalizes the input activity. In this novel cell, the inherent
nonlinear drain-to-source I-V characteristics of the MOS device are utilized to generate
high gain near the zero crossover using the triode region and low gain at high excitations
using the saturation region.
Figure 18. Approximate Sigmoidal Function
In Figure 18, G'_i R, θ_G, +V_C, and G_i are the input voltage, the threshold control
voltage, the sigmoidal gain control voltage, and the normalized output current,
respectively. G_i forms the input to the mitral patch used for further processing of the
signal, whereas GC_i in a closed loop is used to set the output activity to K_G percent. The
geometries of M5 through M8 are designed such that all of the MOS devices operate in the
saturation region. Applying KVL around the loop shown results in the following voltage
loop equation:
(48)
in which β_5 = β_7 and β_6 = β_8. Since they are all n type devices, it is assumed that the
threshold voltages of all of the devices are matched. However, because of the different
body potentials, there will be a slight mismatch in the threshold voltages. Assuming
matched V_T's, equation 48 simplifies to
(49)
Noting that G'_i R is impressed across M9, and using an accurate strong inversion model
[43] of an n-channel MOS transistor, the drain or output current of the NMOS device
operating in the triode and saturation regions is modeled as
I_{D9} = \begin{cases}
K_n \left(\frac{W}{L}\right)_9 \left[ (V_C - V_{T9})\, G'_i R - \frac{1}{2} (G'_i R)^2 \right] (1 + \lambda G'_i R), & V_C - V_{T9} \ge G'_i R \pm \theta_F \\
\frac{K_n}{2} \left(\frac{W}{L}\right)_9 (V_C - V_{T9})^2 \, (1 + \lambda G'_i R), & V_C - V_{T9} \le G'_i R \pm \theta_F
\end{cases}    (50)




It is important to note that in the transition between the triode and saturation regions, 
commonly referred to as the moderate inversion, the MOS model neither fits into the 
triode nor the saturation model. In many treatments, no moderate-inversion is defined. 
Sometimes this region is considered as the lower part of strong inversion. Such models 
can lead to large errors. 
Note that the described squashing function operates in a single (1st) quadrant. The
symmetrical two quadrant (1st and 3rd) operation can be achieved by incorporating the
complementary equivalent into the circuit in Figure 18. At any instant, only one
quadrant is operative, depending on the polarity of the input voltage V_in. Symmetry in
the two quadrants is maintained by the proper selection of device geometries in their
respective parts. The simulations and fabrication are based on the two quadrant squashing
function. Note that in Figure 18, R can be replaced by a MOS transistor operating in the
linear region.
In summary, the squashing circuit transfers the input voltage across the drain to the
source of the transistor M9. Over the supply voltage range, this MOSFET has a
continuously differentiable I-V characteristic. In the triode (|V_C - V_T9| > |G'_i R|) and
saturation (|V_C - V_T9| < |G'_i R|) regions, the conductance, g_ds, is β(V_GS - V_T - V_DS)
and λI_D, respectively. The resulting nonlinear drain current is mirrored by M11 and M12.
The current is linear at small values of input voltage and saturates as the input voltage
increases and the transistor M9 enters the saturation region.
The input voltage G'_i R is "squashed" into a nonlinear current that is made available
as an output after current mirroring. The important features are the independently
programmable control of the sigmoidal gain and of the offset θ_G. The saturation knee
point can be placed anywhere simply by a proper combination of gate voltages and the
geometry of M9. Finally, it is important to point out that this circuit achieves voltage to
current conversion (transconductance).
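The piecewise behavior summarized above can be illustrated numerically. The following is a minimal sketch, assuming the textbook square-law strong inversion model with a channel-length modulation factor; the offset term is omitted, and every parameter value (v_t, k_n, w_over_l, lam) is a hypothetical placeholder rather than a value taken from the fabricated cell. The two quadrant function assumes an ideally matched complementary half.

```python
def drain_current(v_in, v_c, v_t=0.8, k_n=50e-6, w_over_l=2.0, lam=0.02):
    """First-quadrant drain current of the squashing transistor.

    v_in is the voltage impressed across the device (G'_i R in the text),
    v_c is the gain control voltage. All default values are illustrative.
    """
    v_ov = v_c - v_t                      # gate overdrive
    if v_in <= 0.0 or v_ov <= 0.0:
        return 0.0                        # device off in this quadrant
    clm = 1.0 + lam * v_in                # channel-length modulation factor
    if v_ov >= v_in:
        # triode region: high gain near zero input
        return k_n * w_over_l * (v_ov * v_in - 0.5 * v_in ** 2) * clm
    # saturation region: current flattens, gain drops to roughly lam * I_D
    return 0.5 * k_n * w_over_l * v_ov ** 2 * clm


def squash(v_in, v_c, **params):
    """Two-quadrant squashing function, assuming a perfectly matched
    complementary (p-channel) half so that the curve is odd-symmetric."""
    if v_in >= 0.0:
        return drain_current(v_in, v_c, **params)
    return -drain_current(-v_in, v_c, **params)


if __name__ == "__main__":
    for v_c in (1.5, 2.0, 3.0):
        row = [squash(v / 2.0, v_c) * 1e6 for v in range(-6, 7)]
        print(f"Vc={v_c:.1f} V:", " ".join(f"{i:8.1f}" for i in row), "(uA)")
```

Sweeping v_in from -3 V to 3 V for several v_c values produces a family of curves that is steep near zero input and nearly flat beyond the knee at v_in = v_c - v_t, which is the qualitative shape described for the cell.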
Simulations. Figure 19 shows the AC response of the cell. The cell achieves a
bandwidth on the order of 10 MHz into a one megaohm load. The SPICE simulations of
the DC transfer characteristics of the cell are shown in Figure 20. The family of curves
is obtained by ramping the input voltage V_in from -3 V to 3 V for different sigmoidal gain
voltages, V_C. The output current G_i is sampled via R0.
Testing. The DC transfer characteristics obtained from the experimental data are
shown in Figure 21. When compared with the simulation results for the same V_in and
V_C, the current in the 1st quadrant, due to the n devices, is lower than that in the 3rd
quadrant, due to the p devices. This is due to the threshold and beta mismatches between
the n and p devices. With the V_C's set at low values, the threshold mismatch was found to
be 0.266 V. +V_C and -V_C were adjusted for the threshold mismatch before recording the
test data. The current mismatch at higher values of V_C is mainly due to the beta mismatch
and the channel length modulation of the n and p devices at large voltages. The test data
closely follow equation 50.
The small signal transient step response rise and fall times with a 10 kΩ output
resistor and 20 pF oscilloscope probe capacitance plus test fixture capacitance are found
to be 2.5 µs and 2.25 µs, respectively.
[Plot annotation: 10 kΩ load]
Figure 20. [Plot: input voltage swept from -3 V to 3 V; curves labeled by V_C, up to V_C = 3 V]
Figure 21. DC Transfer Characteristics of Sigmoidal Function (Test Results)
Mitral Patch 
The bulb simulations consider m × g projections (mitral cells). The projections are
divided into g separate groups (mitral patches). Each mitral patch is excited by the
normalized input from one group of peripheral receptors. The normalized output ensures
that the input activity of a mitral patch is sufficiently large in amplitude to strongly
activate the mitral patch and to keep the total number of activated mitral cells reasonably
constant across input vectors with different intensities and compositions. Within the
mitral patch, the intensity of a normalized input is thermometer coded by the number of
active cells. In other words, a thermometer code is an output representation in which
input activity is linearly coded by the increased number of units being triggered for the
increased input activity. Thus, depending on the input activity, each mitral patch is
spatially expanded from 1 to j thermometer coded LOT lines which project onto the
pyriform neurons. The mitral patch implements A/D conversion with a logical
thermometer code.
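The thermometer coding described above can be sketched in a few lines. This is a minimal illustration assuming ideal, equally spaced thresholds and ideal comparators; the function name and argument names are hypothetical and do not correspond to any circuit node.

```python
def thermometer_code(g_i, i_fs, m):
    """Return the m LOT line states for a normalized input current g_i.

    Threshold j (j = 1..m) is assumed to sit at j * i_fs / m, and a line
    is driven high once the input reaches its threshold.
    """
    step = i_fs / m
    return [1 if g_i >= j * step else 0 for j in range(1, m + 1)]


if __name__ == "__main__":
    # 16 mitral cells per patch and a 64 uA full scale (currents in uA)
    for current in (0, 10, 32, 64):
        code = thermometer_code(current, 64, 16)
        print(f"{current:3d} uA ->", "".join(str(bit) for bit in code))
```

The number of lines that are high grows linearly with the input, and the active lines always form a contiguous run starting at the lowest threshold, which is what distinguishes a thermometer code from a binary code.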
The circuit diagram of the mitral patch is shown in Figure 22. It is comprised of a
global capacitor reference ladder that divides the full scale current into m equidistant
global thresholds, θ_Mi. These thresholds are then compared to the input by m comparators
in each mitral patch. The output currents, G_i, of the normalizing glomerulus function are
used as the input to the mitral patch. The scheme is equivalent to the front end of an m
level flash A/D converter, generating the thermometer coded digital LOT lines.
After mirroring through the n mirror stage MR3-MR4 and the p mirror stage MR1-MR2, the
full scale current I_FS is dropped across the active load (MR5-MR6), creating a full scale
voltage reference. This voltage reference is impressed across the ladder of m identical
MOS capacitors, producing m voltage levels. The poly-1 to poly-2 unit capacitance has a
tolerance of ±6 fF/µm² with a typical value of 50 fF/µm². Looking into the comparator, if
C_GS is the gate to source capacitor of M2, then the jth mitral cell threshold voltage is
given by:
V_j \approx \frac{V_{j+1} + V_{j-1}}{2}, \quad \text{for } C_{GS} \ll 2C    (52)
Nonidentical step capacitances result in non-equidistant threshold levels. If C >> C_GS, then
(53) 



















where j is the mitral cell index and m is the total number of mitral cells per mitral patch. 
The output current of each glomerulus is dropped across an identical active load,
MR7-MR8. The resulting voltage drop is compared with the threshold voltages using a series
of comparators. The two stage comparator is shown in Figure 22 [44]. The low gain
of the differential stage is augmented by the gain of the current sink inverting stage. The
problem associated with such a comparator is a poorly predicted trip-point voltage.
Simulations 
The SPICE simulations of the DC and transient characteristics for the mitral patch
circuit are shown in Figures 23 and 24, respectively. With the full scale current set
at 64 µA, the DC characteristics are obtained by ramping the normalized bulb input
current G_i from 0 to 64 µA. V_1 through V_16 are the threshold voltages applied to the
inverting inputs of the comparators, V_301 through V_316 are the digital output voltages
of the comparators, and V_in is the non-inverting input of the comparators. As V_in crosses
a threshold voltage, the output of the corresponding comparator is driven high. In this
manner, the intensity of a normalized input is thermometer coded by the number of cells
that the normalized input activates.
The transient step response reveals that the LOT lines that are thresholded near the
full scale current are slower than those that are thresholded near ground. This is due to
the changing input differential voltage as a function of ladder position. The differential
voltage is a maximum at the lowest threshold θ_M1 and a minimum at the highest threshold
θ_Mm.















































Figure 23. DC Response of Mitral Cells
[Plot traces: V(1), V(2), V(3), V(14), V(15), V(16), V_in, V(301), V(302), V(303), V(314), V(315), V(316); x-axis: input current]































Figure 24. Transient Response of Mitral Cells
[Plot traces: V(301), V(302), V(303), V(314), V(315), V(316), V(16), V_in; x-axis: Time]
Testing 
The limited number of package pins restricted external access to only a few LOT
lines. To confirm the proper functionality of the capacitive ladder, LOT1, LOT15, and
LOT16 are connected to the pad-frame. With I_FS set to a known positive value, G_i is
varied from zero to I_FS and the state of the LOT lines is observed. The global
capacitive ladder is supposed to divide the full scale current into m (16 in this case)
equidistant global thresholds. With I_FS equal to 64 µA, the theoretical toggling levels for
LOT1, LOT15, and LOT16 are 4 µA, 60 µA, and 64 µA, respectively. The corresponding
toggling levels recorded from the test data are 4.74 µA, 40 µA, and 43 µA.
The large signal step transient response agrees with the theoretical conclusion, i.e.,
LOT lines that are thresholded near I_FS are slower than those that are thresholded
near ground. The rise time of LOT1 is 4 µs, LOT15 is 10 µs, and LOT16 is 12 µs.
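The theoretical toggling levels quoted above follow from dividing the full scale current into m equal steps. The short sketch below recomputes them and the relative deviation of the recorded levels; the measured values are the ones reported in the text, while the helper names are hypothetical.

```python
def ideal_toggle_level(j, i_fs, m):
    """Ideal toggling current of LOT line j for m equidistant thresholds."""
    return j * i_fs / m


def toggle_report(measured, i_fs=64.0, m=16):
    """Return (line, ideal, measured, percent deviation) tuples, currents in uA."""
    report = []
    for line, value in sorted(measured.items()):
        ideal = ideal_toggle_level(line, i_fs, m)
        deviation = 100.0 * (value - ideal) / ideal
        report.append((line, ideal, value, deviation))
    return report


if __name__ == "__main__":
    # Toggling levels recorded for LOT1, LOT15, and LOT16 (in uA)
    measured_levels = {1: 4.74, 15: 40.0, 16: 43.0}
    for line, ideal, value, deviation in toggle_report(measured_levels):
        print(f"LOT{line:<2d} ideal {ideal:5.1f} uA  "
              f"measured {value:5.2f} uA  deviation {deviation:+6.1f} %")
```

The comparison makes the compression of the upper thresholds explicit: the lowest line toggles slightly above its ideal level, whereas the two highest lines toggle at roughly two thirds of theirs.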
Bi-directional Voltage/Current Buffers 
The bi-directional voltage/current (BiVI) buffers that are based on the current 
conveyor concept are shown in Figure 25. These buffers provide the dual functions of 
voltage drivers and current sources/sinks to isolate the W matrix in the forward and 
backward mode. During the feed forward cycle, the BiVI buffers on the mitral side are 
configured as voltage controlled voltage sources, and the buffers on the pyriform side are
configured as current controlled current sources. During the inhibition in the backward 
cycle, their roles are reversed. Bi-directional operation is achieved by switching S1 and 



















Figure 25. Bi-directional Voltage/Current Buffers
[Diagram labels: From LOT; Weight Matrix]
ground potential. When the Y input of one buffer is at ground, the other may be either
at ground or at V_ref, causing a current proportional to the charge on the floating gate
and to V_ref to flow.
Current conveyor circuits began to emerge as an important class of circuits during 
the early 70's. They have proven to be functionally flexible and versatile, gaining 
acceptance as both a theoretical and a practical building block that offers an alternative 
way of abstracting complex functions. Current conveyors offer several advantages over 
conventional operational amplifiers. They provide higher gain over a greater signal 
bandwidth [ 46]. 
The block diagram of a CC is shown in Figure 26. Class-I (CCI±) and class-II 
(CCII±) conveyors have defined properties [45]. A CCII± can be expressed in the 
following hybrid equations: 
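\begin{bmatrix} I_Y \\ V_X \\ I_Z \end{bmatrix} =
\begin{bmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & \pm 1 & 0 \end{bmatrix}
\begin{bmatrix} V_Y \\ I_X \\ V_Z \end{bmatrix}    (55)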
The above equation states that no current flows into terminal Y; thus terminal Y exhibits
an infinite input impedance. If a voltage is applied to input terminal Y, an equal
voltage appears on the input terminal X; thus X exhibits a zero input impedance. Finally,
an input current I_X on terminal X is conveyed to the high impedance output terminal Z.
The positive sign denotes that at any instant both I_X and I_Z flow into or away from the
conveyor, signifying a CCII+, while the minus sign denotes opposite directions of the
currents, signifying a CCII-.
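The terminal relations of the ideal second generation conveyor can be checked with a small behavioral model. The sketch below simply evaluates the hybrid matrix of equation 55 for given port excitations; it is an idealization with no parasitics, and the function name is a hypothetical choice.

```python
def ccii(v_y, i_x, v_z, sign=+1):
    """Ideal CCII+ (sign=+1) or CCII- (sign=-1) behavioral model.

    Inputs are the voltage at Y, the current into X, and the voltage at Z.
    Returns (i_y, v_x, i_z) from the hybrid matrix of the conveyor.
    """
    hybrid = [
        [0, 0, 0],      # I_Y = 0: terminal Y draws no current
        [1, 0, 0],      # V_X = V_Y: terminal X follows terminal Y
        [0, sign, 0],   # I_Z = +/- I_X: current conveyed to terminal Z
    ]
    excitation = [v_y, i_x, v_z]
    i_y, v_x, i_z = (
        sum(hybrid[row][col] * excitation[col] for col in range(3))
        for row in range(3)
    )
    return i_y, v_x, i_z


if __name__ == "__main__":
    print(ccii(1.5, 2e-3, 0.3))           # CCII+
    print(ccii(1.5, 2e-3, 0.3, sign=-1))  # CCII-
```

Note that the output current does not depend on v_z in this model, which is the ideal high output impedance assumption at terminal Z.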








Figure 26. Block Diagram of the CCII±
NMOS (MFN in Figure 27) transistor can be achieved by incorporating the transistor in the
negative feedback loop of the operational amplifier, in which case the current is
restricted to flowing away from the X terminal. Similarly, with the PMOS (MFP)
transistor incorporated in the feedback loop, current is restricted to flowing into the X
terminal. Bi-directional current flow can be achieved by using a complementary pair of
MOS transistors (MFN and MFP) in the op-amp feedback loop. When mirrored by
complementary mirrors, this current can be made available on the output node Z. Thus
the input current I_X is conveyed to the output current I_Z (assuming I_D(M7-M8) = 0). The
scaling of the input current can be obtained by designing a proper mirror ratio or by
providing an alternate parallel path for the current via branch M7-M8, thus allowing only
a portion of the input current to flow through the mirrors. This is a CCII+ realization
since both I_X and I_Z simultaneously flow into or away from the conveyor.
The CMOS folded cascode op amp shown in Figure 27 has been integrated on-chip
to be used as the CCII+ op-amp. In the design of the op-amp, the locations of the
dominant poles are decided by the high impedance nodes that are responsible for
deteriorating the phase margin. In a simple two stage op amp, Miller compensation
attempts to drive the pole at the output beyond the GB, while making the internal pole
dominant. This scheme does not completely eliminate the output pole problem, since for
large load capacitances the output pole has a tendency to shift back toward the origin,
resulting in unstable operation [46]. Since the input resistance of a folded cascode stage
is very low (1/gm), the folded cascode eliminates the internal high impedance nodes and
thus only one dominant pole exists, at the output. In contrast to the two stage Miller
compensated op





Figure 27. Bi-directional Voltage/Current Conveyor 
compensation, resulting in a further increase in the phase margin [ 46]. 
In Figure 25, switch S1 is activated by the forward digital LOT signal and switch S2
is activated by the backward digital signal from the tie resolver. During the forward
cycle, switch S1 is switched to V_ref while switch S2 is switched to ground. The voltage
controlled voltage source configured CC ensures that V_X1 equals V_Y1. S2 is switched to
ground, forcing X2 to the ground reference. If the LOT line is digitally high, then the
potential difference between X1 and X2 (V_ref) causes a current I_X1 to flow in the
forward direction, proportional to the charge on the floating gate and to the value of
V_ref. The current controlled current source configured CC ensures that I_Z2 equals I_X2.
I_Z2 is then processed further by the winner take all circuit.
The BiVI buffers must be able to supply the total weight current in one column of
the weight matrix. To accomplish this, the source/sink transistors, M_IPA, M_FN, M_INA, and
M_FP, must be sized appropriately. The current sources are sized to source currents in the
voltage controlled voltage source mode, while the current sinks are sized to sink weighted
currents in the current controlled current source mode.
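The current that a buffer must handle during the forward cycle can be estimated with a simple behavioral model. The sketch below assumes that each synapse behaves as a fixed conductance set by its floating gate charge, that only rows whose LOT line is high are driven to V_ref, and that every column node is held at ground by its buffer. The conductance values and the function name are hypothetical.

```python
def column_currents(lot, conductance, v_ref):
    """Total weighted current collected by each column of the weight matrix.

    lot         -- list of 0/1 LOT line states, one per row
    conductance -- conductance[row][col] of each synapse (0.0 if absent)
    v_ref       -- reference voltage applied to the active rows
    """
    n_rows = len(conductance)
    n_cols = len(conductance[0])
    totals = []
    for col in range(n_cols):
        total = 0.0
        for row in range(n_rows):
            if lot[row]:
                # an active row contributes V_ref times the synapse conductance
                total += v_ref * conductance[row][col]
        totals.append(total)
    return totals


if __name__ == "__main__":
    # Conductances in uS and V_ref in volts give currents in uA (illustrative)
    weights = [[10.0, 0.0], [20.0, 5.0], [0.0, 40.0]]
    print(column_currents([1, 0, 1], weights, 1.0))
```

The worst case for sizing is the column total with every LOT line high, which bounds the current the sink transistors on the receiving side have to absorb.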
During the backward phase, S2 is activated by the lines from the tie resolver, switching
Y1 either to the reference voltage or to the ground potential, depending on the state of
the corresponding resolver line. A winning state results in Y1 being switched to V_ref. The
voltage controlled voltage source configured CC ensures that V_X2 equals V_Y2. S1 is
switched to ground, forcing X1 to be a virtual ground. If a WTA line is a logic high, then
the potential difference between X2 and X1 causes a current I_X2 to flow during the
backward phase, proportional to the charge on the floating gate and to the value of V_ref.
The current controlled current source configured CC ensures that I_Z1 equals I_X1. The
resulting I_Z1 is processed further by the current copier integrator circuit.
Simulations 
Figure 28 shows the transient response and Figure 29 shows the AC response of the
CCII+ circuit. The CCII+ circuit is capable of sourcing or sinking 1 mA of current while
slewing a single weight current (40 µA) in less than 400 ns into a 1 kΩ load. The small
signal bandwidth is greater than 10 MHz.
The SPICE simulations of the DC transfer characteristics of the CCII+ conveyor are
shown in Figure 30. The characteristics are obtained by ramping the input current I_X
from a negative to a positive value. Over the range -2 mA to 2 mA, the output current
I_Z is a linear function of the input. The CCII+ loses its linearity as the internal
transistors MFN and MFP begin to fall out of saturation.
Testing 
The DC transfer characteristics are obtained by ramping voltage V_Y from -2.5 V to
2.5 V. According to equation 55, V_X = V_Y. The test data indicate that V_X exactly tracks
V_Y. The resistance connected between terminal X and ground thus draws a current I_X
proportional to V_Y. I_X is conveyed to the output as I_Z via the CC. The I_X-I_Z transfer
curve is shown in Figure 31. The test results are comparable to the simulations, except
that the CC is linear over the I_X range of -2 mA to 1.75 mA, compared to ±2 mA for the
simulations.
The small signal transient step response rise and fall times are both found to be 2.5
µs. The transient response times are limited by the parasitic capacitances at nodes X, Y,
and Z.























































Figure 28. Transient Response of the Current Conveyor 
Figure 29. Current Conveyor AC Response
[SPICE plot, run 07/31/91, temperature 27.0; traces: DB(IM(Iz)/IM(Ix)), IP(Iz); x-axis: Frequency]






































































Figure 30. DC Transfer Characteristics of the CC Obtained From Simulations
Figure 31. DC Transfer Characteristics of the CC Obtained
[Plot: x-axis: Ix]




One of the most onerous requirements facing the designers of a neural network
integrated circuit (NNIC) is the appropriate selection of technology and circuit
configuration to produce a memory with suitable characteristics. In general, from the
electronic neural network perspective, a memory element can be characterized by: (1)
nature of memory, analog or digital, (2) location, on-chip or off-chip, (3) volatility,
volatile or nonvolatile, (4) programming/erasing method, electrical or non-electrical, and
(5) the precision in bits. More often than not, the technology selection is restricted by
factors such as the cost and availability of a particular process from commercial vendors.
Most of the research reported to date requires special processes such as an ultrathin
window, nitride oxide, and textured polysilicon.
Knowledge in analog artificial neural networks is stored in the form of variable
weights. Neural networks adapt themselves by modifying the strength of the connecting
weights according to specific learning algorithms. This requires that the weights be
easily altered in order to take on a wide range of positive values. These weights must
allow long term storage and must be locally stored to allow easy and rapid access. Storage
of analog weights necessitates analog memories that are (1) truly non-volatile, for long
term retention of the stored knowledge, (2) on-chip and rapidly programmable, to expedite
the network learning by minimizing read and write times, and (3) application specific yet
simple, for ease of fabrication. Strictly speaking, due to factors such as the learning rate
in an ENN, discrete programming of true analog memories results in finite resolution,
usually specified in bits. The electronic implementations of the most widely used
networks, including back propagation, typically require resolution on the order of 5 bits
or greater [47].
The favorable learning features of the GLA model are that the weights require only
low precision, on the order of three to five bits. The learning in the network comprises
coarse, unidirectional, and parallel real time weight updates which take place according
to a simple Hebb-type co-active based update rule. The inherently slow multi-sampling
process at theta rhythm (200 ms) can tolerate long programming times, although fast
updates are preferred. Due to the coarse learning, a retentivity of 3-5 bits over 10 years
at room temperature is allowed. Thus, in summary, implementing network learning with
sparse synaptic weights requires a coarsely analog, non-volatile, electrically
programmable/erasable memory with a programming time on the order of 200 ms. Each
memory element should be configured with a variable conductance synapse whose
conductance can be modulated by the nonvolatile weight. The sparse weight matrix W
consists of sparsely placed electrically erasable/programmable transistors, randomly
arranged in a 4x5 sub-matrix as shown in Figure 32.
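What a coarse, unidirectional, co-active update means for a low precision weight can be illustrated with a toy model. The sketch below is only an assumption-laden illustration, not the GLA learning rule itself: weights are held as integer levels, a weight is stepped up by one level when its input line and its output cell are both active, and it saturates at the top level for the chosen precision. The function name and the one-level step size are hypothetical.

```python
def hebb_update(weights, pre, post, bits=3):
    """Apply one coarse, increase-only Hebb-type update to a weight matrix.

    weights -- weights[i][j] as integer levels in 0 .. 2**bits - 1
    pre     -- list of 0/1 activities of the input (row) lines
    post    -- list of 0/1 activities of the output (column) cells
    bits    -- weight precision; 3 to 5 bits gives 8 to 32 levels
    """
    max_level = 2 ** bits - 1
    updated = []
    for i, row in enumerate(weights):
        new_row = []
        for j, level in enumerate(row):
            if pre[i] and post[j]:
                # co-active pair: step up by one level, saturating at the top
                level = min(level + 1, max_level)
            new_row.append(level)
        updated.append(new_row)
    return updated


if __name__ == "__main__":
    w = [[0, 7], [3, 3]]
    print(hebb_update(w, pre=[1, 0], post=[1, 1], bits=3))
```

Because every weight is updated from purely local information (its own row and column activity), all updates can proceed in parallel, and the increase-only step means no erase operation is needed during learning in this model.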
In the past, attempts to build neural weights have resulted in simplified non-
adaptable or discontinuously adaptable synaptic weights [48]. Some provide a continuous 
true analog nature, but do not store the weights locally on chip. This limits the 
computational capability of the NNIC or neural systems because the read and write 
become input/output limited resulting in very large developmental time. The numerous 
possibilities to build a memory element can be broadly classified as: digital semiconductor 











Figure 32. Weight Matrix
[Diagram labels: LOT lines; To Piriform Patch]
Semiconductor memories (e.g. SRAMs) are volatile in nature, that is, the data content
is lost when power is removed. This problem can be solved by using fixed
programmable memories or mask programmable read only memories (ROMs), where the
data content is placed in the memory during the manufacturing process. This makes them
non-adaptable. Also, these memories require a large manufacturing volume of a particular
program to recover the high fabrication cost. Programmable read only memories
(PROMs) allow programming prior to use. These memories can be built using either
bipolar technology (fusible link) or MOS technology, e.g., the floating gate avalanche
injection MOS (FAMOS) [49]. Bipolar devices are non-adaptable because they cannot
be erased once programmed. However, a FAMOS can be erased by exposing it to
ultraviolet rays. Unfortunately, none of the above memories truly satisfies the need for
an electrically programmable/erasable analog memory.
In digital semiconductor memories, a MOS capacitor holds data which is
dynamically refreshed to preserve the data content. Weights can be stored in digital
form and then converted into analog form by D/A converters (for example, an M-DAC).
This technique relies on the fact that the conductance or transconductance of a MOS
transistor can be modulated by changing the transistor gate voltage. The transistor is
operated in the triode region, where the non-linearity of the synapse is fairly low.
Multiplexing and routing complexities make the parallel updating of weights in such
architectures slow and complex. A proper trade off between quantization error and silicon
area (RAM memory) is necessary. Along similar lines, another technique is suggested
by Y. Tsividis and S. Satyanarayana [50], where analog voltages are stored on the gate
capacitance of the synaptic MOS transistor itself. They suggest canceling the inherent
non-linearity of a transistor by using complementary input voltages through matched
weighing transistors, or by passing the same voltages through complementary weighing
transistors: the n-channel and the p-channel. Learning takes place by addressing the
proper capacitors and charging them according to a specified learning algorithm. Once
the weights are settled (RC time constant), the capacitors are periodically accessed for
reading, charging, and refreshing. This scheme suffers from a relatively short retentivity,
resulting in decreased accuracy. As a result, the network becomes "absent minded",
forgetting information shortly after learning.
Floating-gate analog semiconductor memories have been proposed and studied by
a number of researchers [51] as a suitable analog medium for the long-term storage of the
weights. They serve a dual purpose: they provide local on-chip weight storage on the
floating gate of the synaptic transistor, and the transistor in turn can be used as the
variable synapse. The strength of the synaptic weight depends upon the stored charge on
the floating gate. This type of memory element exhibits long term retention because no
discharge path is available, since the gate is surrounded by the dielectric material SiO2.
The charge transport mechanisms used by floating gate memories can broadly be
classified as the avalanche injection of electrons [52] and the Fowler-Nordheim tunneling
of electrons [53]. Some devices use a combination of these two. There are four basic
categories of avalanche injection [52]. In the avalanche injection of electrons, high
energy electrons are generated within the substrate to surmount the SiO2 barrier and to be
injected onto the conductive floating silicon gate. In the Fowler-Nordheim tunneling of
electrons, a high voltage is placed across a thin oxide, typically a window across the
floating gate. This imparts sufficient energy to the electrons within the substrate to
tunnel through the SiO2 barrier. However, the process by which the stored charge may be
altered is highly nonlinear, sensitive to geometric and processing parameters, and can
require high programming voltages (greater than 5 V). In general, it is a function of the
applied electric field intensity, programming duration, and back emf. It is difficult to
conceive of precise modification of analog weights without feedback control. The most
obvious solution is the use of coarse weights. A few researchers have proposed
modification of established algorithms by using very coarse quantization of weight
updates [54].
One well known solution for an adaptable weight is the metal nitride oxide
semiconductor (MNOS) technology [48]. An MNOS device has a variable threshold which
can be electrically changed by tunneling charge into an interfacial layer in the gate
dielectric. By reversing the polarizing field, the charge can be tunneled out of the
interfacial layer, thus making the device electrically writable/erasable. The MNOS
fabrication process is complicated because the control of the silicon nitride-tunneling
gate oxide is difficult. With some modifications to the FAMOS structure, it is possible to
have an electrically programmable/erasable non-volatile memory. The most recent
development is the dual injector floating gate MOS (DIFMOS). In the DIFMOS, like the
FAMOS, data are stored on the floating gate, which is charged by the avalanche injection
of electrons. But unlike the FAMOS, erasure is achieved by the avalanche injection of
holes. However, hole injection is an order of magnitude slower than electron injection
[49].
The following sections briefly review the widely used floating gate semiconductor 
technologies. 
Floating Gate Avalanche Injection MOS Memory 
The concept of an insulated gate field effect transistor with a floating gate as a
nonvolatile memory element was first advanced by Kahng and Sze [55]. The operation
of the proposed structure is based on charge transport from the silicon substrate across
a thin insulator layer (≈ 50 Å) to a floating metal electrode which is covered by a second
insulator and the upper metal gate. The charge is stored in the floating metal gate in
response to the voltage applied between the upper metal and the substrate. The formation
of the metal gate over a very thin dielectric layer is the major obstacle to the practical
realization of the proposed structure. A similar concept is involved in the MNOS
structure, in which the floating metal gate is replaced by a layer of traps in the silicon
nitride. MNOS technology will be discussed in detail in the next section.
Nicollian et al. [56] reported that high electron current densities can be achieved
in MOS capacitors by avalanche injection from the P type substrate at considerably
lower current density than the hole injection from the N type substrate. The FAMOS
structure uses this principle to avoid the basic drawback of Kahng and Sze's structure.
FAMOS combines the floating gate concept with avalanche injection of electrons to
yield a nonvolatile memory element [57,58].
The cross section of a FAMOS structure is shown in Figure 33. It is essentially a
p-channel device in which no electrical contact is made with the silicon gate. The
floating gate is formed by depositing a polysilicon layer over a gate oxide of 1000 Å or
thinner. The gate is isolated from the top by a 1 µm thick oxide. Initially all the
terminals are








Figure 33. Cross-section of the FAMOS Structure 
negative drain to the source voltage is applied. As the voltage increases, a positive drop
appears across the overlap region between the floating gate and the P+ drain region. This
drop tries to invert the heavily doped drain region. As a result, depletion takes place at
the drain end near the SiO2 interface. Eventually, the electric field induced in the surface
depletion region reaches a point at which avalanche multiplication occurs. The generated
high energy electrons acquire sufficient energy to surmount the SiO2 barrier and to be
swept towards the conductive floating silicon gate. This charge is responsible for the
inversion layer underneath the Si-SiO2 boundary. The amount of charge transferred
to the floating gate is a function of the amplitude and the duration of the applied p-n
junction potential. The amount of transferred charge can be determined by measuring
the drain to source conduction. The accumulation of charge changes the threshold of the
MOS structure [57]. The change in the threshold voltage is given by:
(56)
where Q_G is the final stored charge, Q_G(0) is the initial charge (if any), and C_G is the
oxide capacitance. In general, the threshold voltage is given by:
(57)
where V_FB is the flat-band voltage, φ_SS is the polysilicon work function, Q_SS is the fixed
charge at the Si-SiO2 interface, φ_F is the Fermi potential, and Q_B is the charge within
the substrate.
The I_D-V_DS characteristics of a charged and an uncharged FAMOS device reveal that
the device conducts even when there is no charge on the floating gate. This is due to the
capacitive feedthrough voltage from drain to gate. The feedback voltage is given by:
(58)
where C_GB is the series combination of C_G and C_B. The I_D-V_DS characteristics of the
FAMOS device and of an ordinary MOS device with its gate voltage equivalent to the
amount of charge transferred onto the floating gate of the FAMOS are not the same. This
is mainly due to the capacitive feedback to the floating gate. The amount of the feedback
voltage depends on the value of the drain voltage. The variation in the feedback factor
(∂V_G/∂V_DS) stems from the variation of the inter-electrode capacitance as a function of
the drain voltage. When V_D > V_GS - V_T (triode region), the inversion layer extends from
source to drain. Thus, C_G is split equally between drain and source. This increases the
numerator of equation 58, thus increasing the feedback factor. At higher values of drain
voltage (saturation), due to the pinch off of the channel, C_G is diverted to the source.
This decreases the numerator of equation 58, thus reducing the feedback factor.
Charge accumulation in a FAMOS is identical to that of a MOS whose gate is kept
floating. A MOS transistor with a gate oxide thickness of 1000 Å takes approximately
80 V across drain to source before any appreciable gate current can be observed. In the
same structure, avalanche-junction breakdown can occur at 30 V. Had this gate been
floating, the avalanche injection would have resulted in the transfer of an equivalent
amount of charge to the gate. This charge, divided by the oxide capacitance, gives the
change in the threshold voltage. The amount of charge transferred is a function of the
applied junction voltage, programming duration, and the charge stored on the floating
gate.
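As a rough numerical illustration of the statement that the transferred charge divided by the
oxide capacitance gives the threshold shift, the following Python sketch evaluates the
magnitude of the shift for the 1000 Å oxide mentioned above. The stored charge density of
1e11 electrons/cm^2, the parallel-plate capacitance model, and all function names are
assumptions made for illustration only; they are not values taken from [57].

```python
# Sketch under stated assumptions: parallel-plate oxide capacitance and a
# hypothetical stored charge density. Only the 1000 angstrom oxide thickness
# comes from the text.
Q_E = 1.602e-19      # electron charge in coulombs
EPS_0 = 8.854e-14    # vacuum permittivity in F/cm
EPS_R_SIO2 = 3.9     # relative permittivity of SiO2


def oxide_capacitance_per_area(t_ox_cm):
    """Oxide capacitance per unit area in F/cm^2 for a thickness given in cm."""
    return EPS_0 * EPS_R_SIO2 / t_ox_cm


def threshold_shift(electrons_per_cm2, t_ox_cm):
    """Magnitude of the threshold shift in volts: stored charge / oxide capacitance."""
    charge_per_area = Q_E * electrons_per_cm2
    return charge_per_area / oxide_capacitance_per_area(t_ox_cm)


t_ox = 1000e-8                        # 1000 angstrom expressed in cm
shift = threshold_shift(1e11, t_ox)   # hypothetical charge density
print(f"threshold shift magnitude: {shift:.2f} V")
```

Under these assumptions the shift is a little under half a volt; the sign of the shift depends
on the polarity of the stored charge and is not modeled here.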
A stored charge of 4x10^6 electrons/cm^2 results in an electric field intensity of
approximately 2x10^6 V/cm across the thermal oxide [57]. If the polysilicon-SiO2 barrier
is assumed to be 3.2 eV, then the discharge current due to oxide leakage will be of the
order of 10^-40 A/cm^2 at 300 °C. Retentivity plots at different initial charges and
temperatures reveal a drastic initial decay and thereafter a logarithmic decay [57]. Initially,
negatively charged electrons counterbalance the positive charge accumulated at the
Si-SiO2 interface (due to the high dielectric field in the oxide created by the floating gate
at elevated temperatures). The logarithmic retentivity is due to leakage through the oxide.
Since the gate is surrounded by a dielectric, it is not accessible. Thus the FAMOS is
not electrically erasable. Due to a lack of evidence of substantial hole conduction through
the oxide, the possibility of neutralizing electrons by the injection of holes from the
substrate is doubtful. However, Tarui et al. [59] have reported that hole injection is possible.
With a slight modification of the basic FAMOS structure, electrical erasure is theoretically
possible. In the modified FAMOS, as in Kahng and Sze's structure, a top gate is added
to facilitate electrical programming and erasure. The device is held at a high positive voltage
and programmed in the same way as the FAMOS structure. Erasure takes place with the top
gate at ground or negative potential to favor hole injection into the floating gate. Classically,
the device is restored to its neutral condition by exposing it to ultraviolet or X-ray
radiation. Rays of suitable wavelength excite electrons to overcome the oxide barrier
of approximately 4.3 eV. Erasure by X-ray radiation involves the generation of
hole-electron pairs in the oxide.
An interesting problem occurs when the device is in the read mode. Generally, the
memory cell is read by sensing the drain current. This is done by applying a low negative
voltage, around -15 V, to the drain of the FAMOS. This raises the question of whether
an uncharged memory cell can be slowly charged by repeatedly selecting it in the read
mode, which is of course undesirable. Experiments demonstrate that such
parasitic charging does not present a potential programming problem in memory cell
operation [57].
Metal Nitride Oxide Silicon Memory
The process limitation in the formation of a metal layer over a very thin dielectric
in Kahng and Sze's structure has led to the invention of the MNOS structure. It is
typically used as a digital memory element in EEPROMs. The structure is the same as the
modified FAMOS, except that in lieu of a metal gate, a nitride layer is laid on the thin
oxide. The top gate is made of polysilicon. For an n-channel MNOS device, a high gate
voltage causes electrons to be injected from the substrate into the insulating silicon nitride
layer. The injection uses modified Fowler-Nordheim tunneling and other mechanisms
[60,61]. The oxide thickness must be less than 50 Å. Trapped electrons in the dielectric
nitride layer result in a positive shift of the threshold. During electrical erasure, high
negative gate voltages repel or drive electrons from the nitride trap layer to the substrate.
The threshold window (the minimum and maximum amplitudes of the threshold swing) is
limited by the number of write/erase cycles. The degradation in swing is caused by the
creation of surface states and surface charges due to the high field across the oxide layer
applied during the first few programming/erasing pulses. The increased number of states
results in a loss of stored charge from the oxide-nitride surface, short term retention of
weights, and a reduction in threshold swing. The retention time in MNOS devices ranges
from one to ten years, depending on the permittivity of the silicon nitride [48].
Dual Injector Floating Gate MOS Memory 
From the perspective of neural network integrated circuits, one of the problems with the
memories discussed above is the learning time. Any basic weight memory cell operates in
two modes: read and write. In the memories discussed above, reading and programming cannot
be done simultaneously since the terminals are common to the read and write operations. In
order to achieve both operations simultaneously, separate read and write terminals are
necessary. The DIFMOS is a four terminal device [49]. Two of the terminals, the drain
and source, function as a built-in electrometer for measuring the charge stored on the 
floating gate. The other two electrodes belong to the electron and hole injector diodes. 
The DIFMOS structure is shown in Figure 34. When reverse biased into avalanche 
breakdown, these injectors inject electrons and holes into the floating gate. Both injectors 
are excited by negative current sources. As programming proceeds, the charge on floating 
gate retards further accumulation of electrons due to back emf but encourages the 
injection of positively charged holes. The level of the drain current indicates the state of 
the device. 
The DIFMOS basically consists of a sensing transistor, a floating gate, an electron 
injector, and a hole injector. The bootstrap capacitor functions as a part of the hole 
injector by providing a favorable electric field for hole injection. Because greater current 




Figure 34. DIFMOS Structure 
uses p+p junctions for its electron injection and p-n+ junctions for hole injectors.
Ideally, the hole injector should discharge the floating gate to the cutoff voltage. But
it is not capable of discharging the floating gate below the threshold V_T of the hole
injector. This problem can be overcome by using a bootstrap capacitor. The capacitor
is formed between the floating gate and the p diffusion. During normal operation, V_B is
held at the substrate potential. During the erase operation, a sufficiently negative voltage
is applied to V_B, which capacitively couples a voltage to the floating gate equivalent to
the minimum discharge threshold voltage. Erasing action occurs only when both the
bootstrap capacitor and the hole injector are operated simultaneously. The capacitively
coupled voltage is given by:
(59)
where C_B is the bootstrap capacitor, C is the total floating gate capacitance, and C_B/C is
referred to as the coupling ratio. The minimum bootstrap voltage required is given by:
(60)
where V_T is the threshold voltage.
Performance measurements were reported by M. Gosney [48]. Programming pulses
of 500 µA and 50 µs duration were used for the write and erase operations. The
bootstrap voltage pulse was -40 V for 100 ms. The bootstrap voltage is applied just
before the hole injector avalanche is turned on. After the avalanche is over, the bootstrap is
removed. Timing of the bootstrap and avalanche is not critical, but both must be present
for the erase operation. The device suffers from write/erase time limitations: write and
erase are several orders of magnitude slower than the read time. Therefore, the DIFMOS will
generally be limited to read-mostly applications. The device suffers from the trapping of
holes and electrons in the oxide, as all other memories do. As traps are filled, the
charging and discharging times become longer. For a given voltage configuration, the
decay in gate voltage is approximately linear with the logarithm of the number of
write/erase cycles. Endurance (life) is a function of the cumulative trapped charge. Trapped
charge reduces the gate voltage window. At room temperature, retention is measured
at 0.06 percent per decade, while at an elevated temperature of 80 °C, it is approximately 1
percent per decade [48].
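To give a feel for the quoted retention figures, the sketch below converts a loss rate in
percent per decade of time into a cumulative loss. It assumes that the loss is linear in
log10(time) and that decades are counted from a reference time of one hour. The reference
time, the ten-year horizon, and the function name are hypothetical choices made for
illustration; they are not reported in [48].

```python
import math


def retention_loss_percent(rate_percent_per_decade, t_hours, t_ref_hours=1.0):
    """Cumulative loss in percent, assuming loss is linear in log10(time).

    t_ref_hours is an assumed reference time from which decades are counted.
    """
    decades = math.log10(t_hours / t_ref_hours)
    return rate_percent_per_decade * decades


ten_years_hours = 10 * 365 * 24    # hypothetical retention horizon

loss_room = retention_loss_percent(0.06, ten_years_hours)   # room temperature rate
loss_80c = retention_loss_percent(1.0, ten_years_hours)     # 80 C rate

print(f"room temperature: {loss_room:.2f} % after ten years")
print(f"80 C:             {loss_80c:.2f} % after ten years")
```

Because the loss grows only with the logarithm of time, even the higher 80 °C rate amounts
to a few percent over this assumed horizon.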
From a fabrication perspective, the DIFMOS and the CMOS are nearly equal in
process complexity. The FAMOS is much simpler but has no electrical erasure ability.
A process comparison among the PMOS, FAMOS, CMOS, DIFMOS, and MNOS is
reported by Gosney [48].
Floating Gate Analog Memory in 
Standard CMOS Process 
The memories discussed above require a special fabrication process such as an ultrathin
window, a nitride trap oxide, or a textured polysilicon. These processes are
not yet mature. Usually, these special processes are expensive and simply not available
in many design environments, especially universities. In order to fulfill the need of
analog neural network designers for programmable memories, an existing standard CMOS
process without modifications must be able to provide a way to realize floating gate
memories. Recently several such implementations have been reported [62,63].
Based on the limitations discussed above, we propose that the sparse weight matrix
W be implemented in a standard CMOS process. This memory takes advantage of the
mask geometries to cause field-enhanced Fowler-Nordheim tunneling of electrons
from the substrate through a standard gate oxide of thickness 40 nm at relatively low
programming voltages. Unlike the existing methods for the tunneling of electrons through
a thick oxide by field enhancement, this method does not require a special process for
textured-surface polysilicon, nor does it require an ultrathin gate oxide. Instead, the
geometric factors induced by the physical shape of the gate are used to enhance the
electric field strength at the SiO2 interface. The following section discusses this weight
memory in detail.
Memory Structure 
The test structure designed to understand the charge transport mechanism in the 
floating gate memory is fabricated in the two micron, p-well, double poly, double metal 
CMOS process with a gate-oxide thickness of 40 nm. The electrical equivalent schematic 
of the layout is shown in Figure 35. 
There are four basic test cells. Each test cell consists of the following: (1) a current
injector C_inj, for injecting and removing electrons to and from the floating gate, (2) a
PMOS sense transistor M, for sensing charge on the floating gate, and finally (3) a
bootstrap capacitor C_B, to allow external control and programming of the floating gate
voltages without actually having an electrical connection between the programming gate
and the floating gate. All four cells are identical except for their injector structures and






Figure 35. Electrical Equivalent Schematic of the Layout
chosen to assist in determining the effect of the injector structure on the tunneled charge.
C_B is sized to maintain an approximately constant C_B/C_inj ratio among all the test cells.
The injector structure details are summarized in Table I.
All four cells with the different injector structures are intended to be programmed or
erased simultaneously in order to compare the geometrically dependent behavior of the
charge injection at various points during programming. This arrangement removes effects
due to the variation in amplitude and duration of the programming pulses
as well as the variation due to different drain to source voltages of the sense transistors.
These effects are present if the devices are tested separately. The sources of all of the
identical sense transistors are connected together and the same drain voltage is
simultaneously impressed across them. This ensures an equal drain to source voltage
across each memory cell and thus removes the effect of channel length modulation on the
drain current.
In the injector structure, the self alignment process results in a lateral diffusion of the
n+ region under the floating gate by a lateral diffusion factor W_D. The floating polysilicon
gate ends up with its peripheral edge and corners over the n+ diffusion. Theoretically,
the electric field due to the floating gate voltage is concentrated locally at the corners and
possibly along the peripheral edge. The exact distribution of the electric field
is complex and believed to be a function of the geometry of the injector. Experimental
results indicate that a field enhancement factor of 2 to 4 can be obtained [62]. In order
to determine the I-V curves experimentally, different combinations of corners and periphery as
given in Table I have been selected. We theorize that injector area does not play an








                        Injector        Injector
Corners                 Perimeter       Area
Internal   External     (µm)            (µm^2)
4          2            30              54
6          4            24              14
10         8            40              38

C_B = 0.5 fF/µm^2
C_inj = 0.84 fF/µm^2
programming voltages is not sufficiently high to cause a significant amount of tunneling 
from the p well to the floating gate. However, the tunneling may be present along the 
edges of the injector.
The bootstrap capacitor is formed between poly-1 and poly-2. Poly-1 serves as the
floating gate as well as the lower plate of the capacitor, while poly-2 acts as the upper
plate of the capacitor. Thus, poly-1 is floated, i.e., electrically isolated from all the
nodes. The poly-1 to poly-2 oxide thickness is 50 nm. Since the floating gate is
surrounded by insulating SiO2 on all sides, charge leakage will be insignificant.
During the programming and erasing, the voltage difference between the floating 
gate and the n+ diffusion is responsible for the Fowler Nordheim tunneling of the 
electrons. The bootstrap capacitor is necessary to control and isolate the floating gate 
voltage. Figure 35 also shows the different parasitic capacitances associated with a 
memory cell. The percentage of the programming voltage that appears on the floating 
gate depends on the capacitive coupling ratio α. This ratio is given by:
$$\alpha = \frac{C_B}{C_B + C_{GS} + C_{GB} + C_{GD} + C_{inj}} \qquad (61)$$
where C_B is the bootstrap capacitor between the floating gate and the control gate across
the poly-1 to poly-2 oxide, C_GS is the floating gate to source capacitance, C_GB is the
capacitance between the floating gate and the bulk, C_GD is the capacitance between the
floating gate and the drain, and C_inj is the injector capacitance across the gate oxide.
The voltage responsible for the tunneling is thus given by: 
(62) 
Clearly, for a given tunneling voltage, tighter coupling minimizes the required
programming voltage V_P. For this reason, the bootstrap capacitor should be at least one
order of magnitude larger than the sum of C_GS, C_GB, C_GD, and C_inj. Taking into account
the circuit area, proper trade-offs between the size of the bootstrap capacitor and the
programming voltage have to be made. Using typical assumptions, the approximate
bootstrap coupling ratio for all four cells in this case is 10/11.
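The 10/11 figure follows directly from equation 61 when the bootstrap capacitor is ten
times the sum of the remaining capacitances. The short Python sketch below checks this
with exact fractions. The individual parasitic values are hypothetical (only their sum
relative to the bootstrap capacitor matters here), and the function name is an
illustrative choice.

```python
from fractions import Fraction


def coupling_ratio(c_b, c_gs, c_gb, c_gd, c_inj):
    """Capacitive coupling ratio of equation 61.

    Returns C_B divided by the total capacitance seen by the floating gate.
    """
    return c_b / (c_b + c_gs + c_gb + c_gd + c_inj)


# Hypothetical values in arbitrary units: the four remaining capacitances
# sum to 1 unit, and the bootstrap capacitor is one order of magnitude larger.
parasitics = [Fraction(1, 4)] * 4
alpha = coupling_ratio(Fraction(10), *parasitics)

print(alpha)          # 10/11
print(float(alpha))
```

Making the bootstrap capacitor larger pushes the ratio closer to 1 at the cost of circuit
area, which is the trade-off noted above.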
The bootstrap capacitor C_B formed between poly-1 and poly-2 does not impose a
significant limitation on the highest value of the programming voltage. A diffusion-poly
capacitor, on the other hand, would have limited the maximum signal peak to
approximately ±14 V (for the Orbit process) to save the device from avalanche breakdown
either between the diffusion and the well or between the well and the substrate. However,
for the chosen process, the capacitance per unit area formed between the diffusion and the
poly is more area efficient than that formed between the two polys.
The sense transistor and the charge on its floating gate represent the synapse in the
weight matrix W and the value of the weight, respectively. As the network learns, the
strength of the synapse increases. This is the electrical equivalent of dumping more
charge on the floating gate, i.e., programming. Programming modulates the electrical
conductivity of the synapse (PMOS) device. Thus during programming, the electrical
conductivity of the synapse is expected to increase. The PMOS sense transistor was
specifically chosen to achieve this operation. During programming, the floating gate
acquires electrons. Trapped electrons develop a negative potential on the floating gate
of the PMOS sense transistor. The floating gate voltage tends to become more negative
as programming proceeds. Therefore, the drain current through the device increases, i.e.,
conductivity increases. This would not have been possible with an NMOS because the
conductivity of an NMOS decreases with a decrease in gate voltage. To avoid the problem
associated with the NMOS as a sense transistor, the synapses would initially have to be
driven to the cutoff region by programming. Then, by removing electrons in the erasing
mode and superimposing a fixed bias voltage on the controlling gate, weights would be
loaded. Another reason for using the PMOS sense transistor is to avoid an erroneous
change in the gate voltage due to the generation of hot electrons near the floating gate.
NMOS transistors operating at higher values of V_DS are more prone to such effects [62].
Field Enhanced Fowler-Nordheim Tunneling 
A simplified explanation of Fowler-Nordheim tunneling is as follows [62]. There
exists an energy barrier of approximately 3.2 eV that prevents the escape of electrons from
the substrate into the SiO2. At room temperature, the kinetic energy of the electrons allows
them to tunnel through an oxide barrier whose thickness is approximately 5 nm. If the
energy supplied by the favorable electric field (generated by an external potential within
this 5 nm range near the oxide-silicon interface) is less than 3.2 eV, then the electrons are
pulled back into the silicon. However, if the energy supplied by the external field in this
region is greater than 3.2 eV, a percentage of the electrons continue to travel in the
direction of the external field, and thus a small current flows from the Si surface.
Increasing the electric field increases
the electron flow and thus the electron current. Keeping these numbers in mind, it takes
approximately 25 V to tunnel electrons across a thickness of 40 nm. This voltage should
be well below the gate-oxide breakdown voltage, which is about 28 V for the MOSIS
process.
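The figure of approximately 25 V can be reproduced with a simple scaling argument: if
3.2 V must be dropped across the first 5 nm of oxide and the field is uniform, the voltage
across the full 40 nm oxide scales with the thickness. The Python sketch below works this
out. The uniform-field assumption and the function name are illustrative simplifications,
not a model taken from [62].

```python
def uniform_field_tunnel_voltage(t_ox_nm, barrier_v=3.2, tunnel_nm=5.0):
    """Voltage across the whole oxide so that barrier_v is dropped over tunnel_nm.

    Assumes a uniform field across the oxide (no field enhancement).
    """
    return barrier_v * t_ox_nm / tunnel_nm


BREAKDOWN_V = 28.0   # approximate gate-oxide breakdown quoted in the text

v_needed = uniform_field_tunnel_voltage(40.0)
print(f"required voltage for 40 nm oxide: {v_needed:.1f} V")
print(f"margin to breakdown: {BREAKDOWN_V - v_needed:.1f} V")
```

The small margin between the two numbers is the motivation for seeking local field
enhancement rather than simply raising the applied voltage.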
According to this theory, the electric field within 5 nm of the SiO2 interface plays
an important role in the tunneling process. The electron emitting surface can be
structured in order to increase the local electric field at the SiO2 interface, thus allowing
electron currents to be induced at much lower external voltages. Commercial EEPROMs
use the same concept by deliberately introducing spikes or other nonuniformities, such
as surface textures, on the Si-SiO2 interface. Enhancement of the electric field in such
cases is reported to be by a factor of 4 to 5. In the present case, instead of special processing
such as textured polysilicon, the lithographic features have been used to enhance the local
field intensity. The field enhancement factor obtained by lithographic features (2 to 4)
[62] is less than the field enhancement factor obtained by the textured polysilicon injector
(4 to 5) [62].
The theorized area of a gate that is influenced by sufficient field strength is very
small (probably only the corners). Thus programming and erasing are extremely slow.
However, this is not critical for the implementation of plasticity in the electronic
olfactory system.
Programming 
The test setup is shown in Figure 36. The setup is configured to measure the
threshold voltage of the sense transistors M1 to M4 before and after every programming
attempt. Programming results in the tunneling of electrons onto the floating gate,
which according to equation 56 produces a negative shift in the threshold voltage of the
sense transistors. The shift in the threshold voltage is used to confirm the presence of
tunneling.
Figure 36. Test Setup for Testing Weighting Cell
To measure the threshold voltage, switches a and b are switched over to the test
mode while c and d are closed. The plot of the square root of the drain current versus V_GS
for the unprogrammed devices is shown in Figure 37. The threshold voltage of the cells
is found to be approximately -0.8 V.
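Reading a threshold voltage off a plot of the square root of the drain current versus
gate voltage amounts to fitting a straight line and extrapolating it to zero current. The
Python sketch below illustrates that procedure on synthetic data. The square-law device
model, the value of BETA, the chosen gate voltages, and the function names are assumptions
for illustration; they are not the measured data of Figure 37.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def extract_threshold(v_gs, i_d):
    """Extrapolate sqrt(|I_D|) versus V_GS to zero and return the intercept voltage."""
    roots = [math.sqrt(abs(i)) for i in i_d]
    slope, intercept = fit_line(v_gs, roots)
    return -intercept / slope


# Synthetic PMOS data in saturation (hypothetical parameters).
BETA = 50e-6       # A/V^2, assumed
VT_TRUE = -0.8     # V, set equal to the value quoted in the text
v_gs = [-1.5, -2.0, -2.5, -3.0, -3.5, -4.0]
i_d = [0.5 * BETA * (v - VT_TRUE) ** 2 for v in v_gs]

vt_est = extract_threshold(v_gs, i_d)
print(f"extracted threshold: {vt_est:.3f} V")
```

With measured data, only the points that lie on the straight portion of the curve should
be passed to the fit, since subthreshold conduction bends the curve near the threshold.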
To program the memory cells, switches a and b are switched over to the program
mode while c and d are left open. With V_E set at 0 V, -5 V, and -10 V, programming
pulses (V_P) of amplitude ranging from 5 V to 16 V in steps of 1 V were applied.
The same programming voltage was applied across all of the cells. Thus, any difference
in the electron current flowing onto the floating gate could be attributed to the differences in
the injector structures. The duration of the pulses was varied from 2 ms to 40 ms in steps
of 5 ms. The rise and fall times of V_P were controlled, since they determine the peak
capacitive current that flows through the injector. A sufficiently long rise time [62]
was used to prevent sharp capacitive current pulses that can result in gate oxide
breakdown. After every programming attempt, the threshold voltage of the sense
transistors was measured by switching the devices to the test mode to observe any shift in
the threshold voltage due to the programming. Over numerous such attempts, no significant
shift in the threshold voltage was observed. However, a significant shift in the threshold
voltage was observed in the last set, with V_E at -10 V and a V_P pulse amplitude of 16
V. In this case, V_DD was left floating instead of being connected to the power supply. The
resulting shift in the threshold voltages is shown in Figure 38.


















Figure 38. Threshold Voltages of Programmed Devices
(Horizontal axis: V_GS.)
between injector structure and programming level. For a bootstrap ratio of 10/11, the
tunneling voltage of 27 V is higher than that reported by L. R. Carley (18 V
to 19 V) [62]. Note that for the same programming voltages, no tunneling was observed
when V_DD was connected. This raises the question of whether powering V_DD (see Figure
35) adds extra capacitance to the floating gate, thereby reducing the effective bootstrap
coupling ratio. A decrease in the bootstrap ratio leads to higher programming voltages.
This hypothesis has not yet been verified.
Retentivity plots taken at room temperature (26 °C) after 3 and 130 hours are shown in
Figures 39 and 40, respectively. A comparison of Figures 38 and 39 demonstrates
excellent short term retentivity. However, a comparison of Figures 37 and 40















Figure 40. Threshold Voltage Retention After 130 Hours
(Vertical axis: (I_D)^(1/2); horizontal axis: V_GS.)
Winner Take All 
Winner take all (WTA) competition of the piriform cortex is accommodated in the
p identical piriform patches. One (k=1) such WTA piriform patch within a PC is shown
in Figure 41. The patch consists of h identical piriform cells connected in parallel.
Within a patch, the cells share a common comparison node C. Node C serves as a strong
local inhibitory feedback, similar to the circuit designed by Lazzaro et al. [64]. However,
the circuit is designed for improved sensitivity. Each piriform cell receives an input current,
P*_1l (1 ≤ l ≤ h), from the piriform BiVI buffer. A single piriform cell is shown by the dotted
box. The gate of M1 is the node where the comparison takes place. This node is common to
the other cells. M7 is a cascode device provided to minimize the current mirroring error in
M1 that is present due to channel length modulation. M3 provides the leakage current
that is present on the common gate. M5 provides a source for the shortfall in the
mirrored current.
The circuit is reset at the beginning of each sniff by pre-charging the common
mirroring node C to V_SS through the switch M9, which is actually distributed across the
piriform cells. During the winner take all competition, M9 is shut off and the circuit is
allowed to seek a stable equilibrium. Depending on the time constant at node C, the
common gate voltage starts rising due to the incoming currents and finally settles to the
voltage corresponding to the highest value of input current, P*_1l,max. Since this voltage is
common to all h piriform cells, the highest input current gets mirrored in the rest of the
h-1 cells by the M1 transistors in the other cells. At this stage, all comparing transistors




Figure 41. Winner Take All Circuit 
input current, sinking currents exceed the input currents P*_1l. The shortfall, the
difference between the maximum current and the corresponding input current in the cells
(P*_1l,max - P*_1l), is supplied by the diode connected transistor M5 connected at the cell input.
The differential current results in a drop across M5. The drop biases M3 to conduct and
M5 to shut off in the branch associated with P*_1l,max, making the corresponding M1 the main
controlling device and all other (h-1) M1's mirroring devices. At this transition, the
voltage at the input drops from a threshold above ground to a threshold below ground
at all of the input nodes, except the branch with the highest current, since the shortfall in
that branch is zero. The resulting change in the diode voltage (approximately 2 V_T) is
amplified by the inverter M17,18 and level shifted by the inverter M19,20. Thus, the maximum
current results in a logic high at the output of the piriform cell, signifying the winner,
while all other piriform cells remain low, signifying the losers.
In Figure 41, if I_W is the winner's input current, I_L is the loser's input current,
I_DW is the drain current via M1 of the winner, I_DL is the drain current via M1 of the loser, I_S is




The common node C attains a gate voltage of
(65)
Theoretically, current mirroring should result in I_DL being equal to I_DW. However, due
to the beta and threshold mismatch and the channel length modulation associated with the
M1's, I_DL is equal to I_DW ± ΔI, where ΔI is given by:
$$\Delta I = \Delta\beta_1 (V_{GS1} - V_{T1}) \pm \beta_1 \Delta V_{T1} \pm \lambda \Delta V_{DS1} \beta_1 \approx \Delta\beta_1 (V_{GS1} - V_{T1}) \pm \beta_1 \Delta V_{T1} \qquad (66)$$
Subtracting equation 64 from equation 63 results in
(67)
I_S is responsible for the exhibition of WTA competition. To be able to resolve the
winner and the loser, I_S should be greater than the resolution capacity of the WTA
circuit.
The circuit has limited resolution, due to the mirroring error associated with M1.
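The effect of finite resolution can be illustrated with a purely behavioral Python sketch.
It is not a circuit simulation: each cell output is taken to be high when its shortfall
(the peak input minus its own input) does not exceed an assumed resolution, so that inputs
too close to the peak appear as tied winners. The resolution values, the input currents,
and the function name are hypothetical.

```python
def wta_outputs(inputs_ua, resolution_ua):
    """Behavioral winner-take-all with finite resolution.

    A cell reads high when its shortfall relative to the peak input is not
    larger than resolution_ua, i.e. when the circuit cannot tell it from the
    true winner. All currents are in microamperes.
    """
    if resolution_ua < 0:
        raise ValueError("resolution must be non-negative")
    peak = max(inputs_ua)
    return [(peak - current) <= resolution_ua for current in inputs_ua]


# Low current level: the assumed resolution is finer than the differential.
low_level = wta_outputs([5.0, 8.0, 3.0, 1.0], resolution_ua=2.0)
print(low_level)     # [False, True, False, False]

# High current level: two inputs fall within the assumed resolution and tie.
high_level = wta_outputs([254.0, 255.0, 250.0, 240.0], resolution_ua=2.0)
print(high_level)    # [True, True, False, False]
```

In this simplified picture the true winner always reads high, and a tie arises whenever a
loser's shortfall is not greater than the resolution capacity of the circuit.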
Simulations 
The SPICE simulations are shown in Figures 42 and 43. Figure 42 demonstrates
the ability to resolve the winner between inputs that differ in amplitude by 1 µA at
low levels of input current, while Figure 43 demonstrates the circuit's inability to resolve
the same differential at high levels of input current. The settling time is a complex function of
both the magnitude of all the currents and the differential between the winning and
losing currents. In general, the settling time τ_S is an inverse function of the
differential. The worst case time is derived from a pair of closely matched low
amplitude input currents. With all identical losers, it is found to be typically 1 µs.
In Figure 43, the loser's output becomes high even though the inputs have a differential of 1
µA, whereby the circuit fails to resolve a winner. However, since the settling time is
an inverse function of the differential, the settling time of the true winner is always less
than that of the other winners. This fact may be used to advantage in future applications
for separating the true winner from many winners.
Figure 42. The Demonstration of Resolving WTA Inputs
(SPICE run of 07/24/91, temperature 27.0; a single winning output plotted against time, 0 to 5.0 µs.)
Figure 43. Ties in the WTA Outputs
(SPICE run of 07/30/91, temperature 27.0; tied outputs for high input currents near 254 µA to 255 µA, plotted against time.)
Simulations have been performed on as many as 250 WTA cells operating in parallel
within a single piriform patch. It is observed that the number of active WTA cells has a
limited influence on the timing performance of the circuit.
Testing 
Due to the pin limitation, only four WTA cells were fabricated. The cell inputs P*_11,
P*_12, P*_13, and P*_14 are supplied through high resolution current sinks. While testing,
P*_12, P*_13, and P*_14 are grouped together and supplied by a common current sink. The
function generator is used to reset the circuit; when φ12 is pulled to a logic low, the
circuit is allowed to seek the stable equilibrium.
Experiments are carried out keeping in mind the effect of the mean value of the
input currents, and of the differential current between the winner and the loser, on the
settling time of the circuit. Figure 44 shows the settling time as a function of the input
current level with the difference current (5 µA) as a constant parameter. For the same
differential, any increase in the mean level of the input current beyond the shown current
range results in failure to resolve the inputs. It can be seen that with an increase in the
current level, the settling time of the winner τ_W increases while the settling time of the
loser τ_L decreases. For the present circuit geometries, the current level for the minimum
settling time is found to be approximately 40 µA.
Figure 45 shows the effect of the current difference when the current level is set as a
constant parameter. As expected, the graph reveals the discrepancy between winning and
losing currents.
Figure 44. Effect of Mean on the Settling Time
(Horizontal axis: Current Mean.)
Figure 45. Effect of Difference Current on the Settling Time
(Horizontal axis: Current Differential; curves: τ_L and τ_W.)
Testing results taken at a low level of the input currents (near 5 µA) show that the
circuit is capable of resolving difference currents as small as 2 µA. As the current level
goes higher, the resolution degrades. Testing results taken at a current mean of 70
µA show that the circuit is capable of resolving a difference current of about 5 µA.
Tie Resolver 
The shortcoming of the WTA circuit, i.e., its finite resolution, was discussed in the
previous section. To ensure only one winner in a single piriform patch, a resolver circuit
is required to post-process the WTA circuit output.
The tie resolver element is shown in Figure 46. This element digitally resolves
the ties among the winners. In the circuit, inputs and outputs are defined as follows: L
is the learn signal, TI_l is the control input, TO_l is the control output, P_kl is the unresolved input,
and PW_kl is the resolved output. The l-bit resolver is formed by connecting l resolver
cells in a chained fashion, where TI_1 is propagated across the entire input vector P_kl from
left to right. The control output of the preceding resolver element forms the control input
to the next element. That is, TO_(l) = TI_(l+1), except for TI_1.
The truth table II for the resolving logic function states that, with learn high, the
high TI_1 is propagated from left to right until it encounters the first winner, making PW_kl
of the corresponding element high and negating its control output TO_l. For PW_kl to be






















L    TI_l    P_kl    PW_kl    TO_l
1    1       0       0        1
1    1       1       1        0
1    0       X       0        0
0    X       0       0        X
0    X       1       1        X
only the one with the lowest settling time, corresponding to the highest input, will be
transferred to the output. Since TI_1 is propagated from left to right, the leftmost winner
is selected and declared as the final winner.




The hardware implementation of the above Boolean equations is shown in Figure 46.
Standard CMOS gates from a standard library have been used to fabricate the resolver
circuit. The layout is done using the VLSI tool LAGER. Logic simulations are
carried out with the built-in circuit simulator IRSIM.
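The chained behavior stated by the truth table can be checked with a short behavioral sketch. This is only a software illustration, not the fabricated gate-level circuit: the function names `resolver_cell` and `resolve` are hypothetical, TI_1 is assumed to be tied high, and the don't-care TO entries (learn low) are arbitrarily returned as 0.

```python
def resolver_cell(learn, ti, p):
    """One resolver cell following the truth table; returns (pw, to)."""
    if learn:
        pw = ti & p          # first winner seen while TI is still high
        to = ti & (1 - p)    # TI passes on only if this cell is not a winner
    else:
        pw = p               # learn low: input passes straight through
        to = 0               # don't-care in the table; 0 is an arbitrary choice
    return pw, to


def resolve(p_vector, learn=1):
    """Chain the cells left to right with TI_1 assumed tied high."""
    ti = 1
    resolved = []
    for p in p_vector:
        pw, ti = resolver_cell(learn, ti, p)
        resolved.append(pw)
    return resolved


print(resolve([0, 1, 0, 1]))  # [0, 1, 0, 0]
```

With learn high, only the leftmost asserted input survives; with learn low, the input vector passes through unchanged.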
Testing 
The testing results agreed with the Boolean equation 68. However, with the 5 V
supply voltage, the high logic level on TO_l is found to be only 1.6 V. We attribute
this fault to a possible defect in the mask, since standard LAGER cells were used
to build the circuit. The rise and fall times are found to be 0.56 µs and 0.5 µs,
respectively.
Dynamic Current Copier Integrator 
The current copier integrator (CCI) provides collateral feedback inhibition from the 
active piriform patches to glomeruli. Winning piriform neurons are applied to the WT 
matrix generating feedback currents. Feedback currents are sampled, stored, and 
integrated in the CCI. During the backward phase and at the end of each minor cycle, 
inhibition is applied to the glomerulus. This inhibition persists to be used during the 
forward phase of the next cycle. During successive cycles, all of the inhibition currents 
that are generated in the backward phase are sampled and summed with previously stored 
inhibition. In this way, according to the GLA olfactory model, as the multi-sampling
proceeds, the cumulative inhibition up to that cycle is applied to the glomerulus to inhibit
the stronger components in the input vector. This allows weaker components to become
comparatively significant, thus taking an active part in the overall clustering process. The
CCI is a dynamic, yet discrete, analog memory element that computes and stores the
accumulation of the sampled analog feedback currents. The circuit is based on the dynamic
current copier principle. The following text describes the electronic implementation of
the CCI.
Background 
The standard current mirror is the most widely used block in analog integrated 
circuits. The current mirror concept was originally applied in bipolar technology. It is 
now extensively used in the CMOS process to duplicate, multiply or divide the currents. 
Current error due to threshold mismatch and 1/f flicker noise is the most significant
limitation of standard MOS current mirrors when used in high-precision analog
circuits. In spite of the various circuit design techniques reported, these errors typically
could not be reduced below 1% [65]. The dynamic current copier, also referred to by
other names such as current copier, current self-calibrating circuit, and dynamic
current mirror, is a recent innovation. It overcomes these limitations of
the standard current mirrors and moves the achievable precision to tighter limits [65]. The
circuit is essentially a sample-and-hold cell that supplies current by storing a voltage at
the gate of a MOS transistor through which the current flows. Current copiers can replace
the standard current mirrors to achieve multiple copies of a reference current with an
accuracy of several ppm, as compared to the typical one percent accuracy of standard
current mirrors. This advantage led to the invention of the dynamic current copying
techniques.
Since the gate of the MOS device has practically infinite input impedance, it can be
used to store information on the gate capacitor for a short time period, i.e., for a few
ms. Figure 47 shows the basic N-copier cell. To copy the current I0 into the cell,
switches S1 and S2 are closed (sample phase). The capacitor is charged to the gate
voltage required by the transistor to achieve the drain current I0. If M1 is in saturation,
the gate voltage is given by:
    V_{GS} = \sqrt{2 I_0 / \beta} + V_T        (69)
The capacitor C1 will be charged to the voltage V_GS. The switches may then be opened (S1
must be opened before S2 to avoid the discharge of C1 via M1). Ideally, the cell is
capable of sinking I0 when connected to a load via S3 (hold phase). Several cells can
be loaded sequentially from the same source. Note that a P-copier cell can be obtained
by replacing the N-MOS transistor with its equivalent P-MOS transistor, and by reversing
the supply polarities and the direction of currents. In such a case, the cell sources I0
when connected to the load. The cells need not be accurately matched with respect to the
transistor dimensions or the capacitor values, since the current copying operation in each
case results in the appropriate transistor gate voltage being stored on the gate capacitor
of the selected transistor. Since the same transistor is used for sampling and holding, beta
and threshold mismatches are completely eliminated. However, inevitable circuit flaws
result in an error current, causing the I0 retrieved from a cell in the hold phase to differ from
the I0 of the sample phase. This error current is denoted by ΔI. The mechanisms of the
original-to-copy error include: (1) switch charge feedthrough, limiting the initial accuracy
of the current sample, (2) channel length modulation, producing a change in the retrieved
current as the voltage V_DS changes (as with standard current sources), (3) junction leakage
associated with S1, causing a steady discharge of the storage capacitor, (4) channel charge
injection associated with switch S1, causing a change in V_GS when S1 is opened, and other
flaws in the circuit.
Figure 47. Basic Current Copier
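The sample and hold phases of the basic copier can be illustrated numerically. The sketch below assumes the saturation square law I_D = (β/2)(V_GS − V_T)², which is the form implied by equation 69; the values of β, V_T, and the 5 mV gate-voltage disturbance are hypothetical and chosen only for illustration.

```python
import math


def gate_voltage(i0, beta, vt):
    """Sample phase: gate voltage needed to carry i0 in saturation."""
    return math.sqrt(2.0 * i0 / beta) + vt


def drain_current(vgs, beta, vt):
    """Hold phase: square-law drain current for a stored gate voltage."""
    return 0.5 * beta * max(vgs - vt, 0.0) ** 2


beta, vt = 100e-6, 0.8      # hypothetical device values (A/V^2, V)
i0 = 20e-6                  # sampled current
vgs = gate_voltage(i0, beta, vt)

i_ideal = drain_current(vgs, beta, vt)          # perfect hold
i_hold = drain_current(vgs + 5e-3, beta, vt)    # hold with 5 mV gate error
delta_i = i_hold - i0                           # error current, ΔI

print(f"V_GS = {vgs:.4f} V, relative error = {delta_i / i0:.2%}")
```

Under these assumed numbers, a gate disturbance of a few millivolts already produces a copy error above one percent, which is why the error mechanisms listed above dominate the achievable accuracy.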
The Operating Principle of the Current Copier Integrator 
In Figure 47, integer multiplication of I0 by a variable n can be achieved by making
n copies of I0. These copies can be added together through a common load. This would
require n identical current copier cells, which when added together would give
a load current of n×I0. However, serial discrete integration (Σ) of I0 can be obtained by
using a pair of complementary (N and P) current copying cells connected in a circular
fashion where, at any instant, one of them acts as a temporary memory. Figure 48
shows such a current copier integrator.
Figure 48. Current Copier Integrator
The N-cell acts as a temporary memory while the
P-cell acts as the sampler and summer. The circuit operates in two phases requiring two
non-overlapping switching clocks of the same frequency, φ21 and φ22. During phase 1,
S1 is closed and S2 is opened, while during phase 2, S2 is closed and S1 is opened. During
phase 1, S1 is closed on phase φ21 and the steady state input current I*_i is sampled into
the P-cell. Capacitor C_H1 is charged to the gate voltage V_CH1 corresponding to the drain
current I*_i that is flowing through M1P. During phase 2, V_CH1 is transferred and
memorized on C_H2 by closing switch S2 on φ22. This completes one cycle. Note that S1
should be opened before the closure of S2, and vice versa, to avoid improper operation
of the circuit. At this stage, transistor M1N is capable of sinking exactly I*_i. During the
next cycle when S1 is closed again, C_H1 is charged to the gate voltage corresponding to
the drain current 2I*_i (I*_i from the input plus I*_i from M1N) that is flowing through M1P. For
a steady state input current, over n cycles, a total current of n×I*_i flows through M1P,
which when mirrored by M0 is available as an output current I_i. However, if the input
is a time varying analog signal I*_i(t), then the output over n cycles is given by:
    I_i(n) = \sum^{n} I^*_i(n) ;    n = 0, 1, 2, ...        (70)
where 
(71) 
The parameter T is the time period of the switching frequency φ2. I*_i(t) is assumed
constant during sampling. From the above equation, the output is clearly a discrete
integration of the time varying input current. The initialization of the integrator is
essential in order to restart the inhibition for different sets of inputs. The minimum
dimension switches MSP and MSN are used to reset the gate voltages on the hold capacitors.
The circuit is initialized by resetting V_CH1 and V_CH2 to zero on RESET φ2. To reduce the
error due to the channel length modulation, cascode devices M2P and M2N are added.
Dynamic biasing of these cascodes gives improved cascoding. This is achieved by using
additional dynamic biasing circuitry consisting of M0P, MIP, MIN, and MDN. Cascode M0C
serves the same purpose of reducing error due to channel length modulation. Switches
S1 and S2 are made of transmission gates to cancel the effects of the feedthrough and
channel charge injection. The precision MOS capacitors C_H1 and C_H2 are realized between
poly-1 and poly-2.
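The two-phase operation described above amounts to a running sum of the sampled input. The following behavioral sketch models only that arithmetic, not the transistor circuit. The function name `cci_integrate` is hypothetical, and the optional constant `error_per_cycle` is an assumed, simplified stand-in for the error current added in each cycle.

```python
def cci_integrate(samples, error_per_cycle=0.0):
    """Behavioral model of the current copier integrator.

    Each cycle the P-cell samples the new input plus the current held by
    the N-cell; the N-cell then memorizes the updated sum.
    """
    held = 0.0                      # current stored in the N-cell (reset state)
    outputs = []
    for sample in samples:
        summed = sample + held + error_per_cycle   # phase 1: sample and sum
        held = summed                              # phase 2: transfer to N-cell
        outputs.append(summed)                     # mirrored output current
    return outputs


ideal = cci_integrate([30e-6] * 5)
with_error = cci_integrate([30e-6] * 5, error_per_cycle=2e-6)
print([round(i * 1e6, 1) for i in ideal])        # output in µA per cycle
print([round(i * 1e6, 1) for i in with_error])
```

Because the held value is fed back every cycle, any per-cycle error is itself integrated, so the deviation from the ideal staircase grows with the number of cycles.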
Circuit Design 
This section addresses the CCI design, including the limitations imposed on the
maximum integration level, the maximum switching frequency, and the minimum
switching frequency.
Upper Integration Limit 
For Equation 70 to be accurate within 5%, the transistors MFN, M1N, M1P, and MFP must
stay in saturation over the entire dynamic range. As the integration progresses, for a
unidirectional input current, the current (integrand) in the circuit rises. For the selected
geometries, let I_max be the maximum attainable current that can be delivered without any
of the transistors slipping out of the saturation region. Thus I_max determines the upper
limit on integration. For any current above I_max, the circuit loses its accuracy as
one or more of the transistors fall into the triode region. This forms the design criterion for
I_max. Assuming β_M1P = β_M2P = β_M2N = β_M1N = β and assuming all corresponding transistors
in the P-copier leg are operating in saturation, the maximum current that can be pushed
through the P-copier leg is given by:
(72) 
The sampled current in the sampler has to be exactly transferred to the hold cell. This
requires β_MFP to equal β_M1P, and β_MFN to equal β_M1N. Finally, mirroring the integrand
in the circuit to the output with unity ratio requires β_0 to equal β_M1P.
The bias voltages V5 and V6 of the cascode transistors should be maintained as low
as possible to maximize the full-scale current range. Dynamic cascoding is essential
for an optimized cascoding effect at all integrated levels of the input current. To achieve
this, the bias voltages V5 and V6 must be a function of the present level of the current in
the circuit. M0P is a 1:1 biasing current mirror that copies the present current level and
feeds it into the biasing circuitry. Considering the worst case that occurs when the





MIN is an active resistor used to dynamically bias M2N. The bias voltage V5 at the current
level I_max requires the geometry of MIN to be:
    \beta_{MIN} = \frac{2 I_{max}}{(V_5 - V_T)^2}        (75)
Similarly, if β_MIN equals β_MDN, then the geometry of MIP is given by:
(76)
Maximum Switching Frequency 
In order to calculate the bandwidth of the CCI circuit, it is essential to know how
fast the circuit can be run without adding excessive error to the integration. The transient
response of the CCI is limited by the settling time of the RC network formed by the
switching elements and the sample and hold capacitors.
As the integration progresses, the cumulative sum of the sampled current
(integrand) in the circuit continuously changes its value. During both phases, it is
essential to update the last stored voltages V_CH1 and V_CH2 to a new voltage
corresponding to the latest sum. This stores the integrand up to that point in the copier
cell and allows the variation to be followed by the output current I_i. If switch S1
remains closed during phase 1 for a duration t1, then correct updating is only possible if
t1 is longer than the settling time of the sample and hold formed by M1P, S1,
and C_H1. Assuming small perturbations, the settling behavior of this circuit can be
examined by means of the P-cell of Figure 49. Opening the loop between capacitor C_H1





Figure 49. The P-Cell 
(78) 
Here g_m and g_x are the transconductances of M1 and the switch, respectively. The two poles of
the closed loop circuit are the roots of the equation 1 - G(s) = 0:
    s_{1,2} = -\frac{1}{2\tau_2} \pm \sqrt{\frac{1}{4\tau_2^2} - \frac{1}{\tau_1 \tau_2}}        (79)
For 4τ2 > τ1, the response is a damped oscillation with an envelope time constant of 2τ2.
For τ1 >> 4τ2, it settles exponentially with the time constant τ1. The global settling time
constant may be reasonably approximated by:
(80)
τs must be 5 to 7 times (depending on the desired accuracy) smaller than t1 to ensure that
equilibrium is reached. Applying a similar treatment during phase 2 for the N-cell results
in t2. Therefore, the maximum switching frequency is:
    \phi_{2max} \propto \frac{1}{t_1 + t_2}        (81)
These conditions place an upper limit on the operational frequency and an upper limit on the
values of C_H1 and C_H2. They also place a lower limit on g_x and g_m.
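The "5 to 7 times" rule can be related to a residual settling error with a short calculation. The sketch below assumes purely exponential, single-time-constant settling (the overdamped case) and takes the clock period as exactly t1 + t2 with the non-overlap gaps ignored; the value of `tau_s` is hypothetical.

```python
import math


def residual_error(n_time_constants):
    """Fraction of a step still unsettled after n time constants."""
    return math.exp(-n_time_constants)


def max_clock_frequency(t1, t2):
    """Clock frequency if one period is just the two closure times t1 + t2."""
    return 1.0 / (t1 + t2)


tau_s = 5e-9                      # hypothetical settling time constant (s)
for n in (5, 7):
    t_closed = n * tau_s          # switch closure time for each phase
    f_max = max_clock_frequency(t_closed, t_closed)
    print(f"{n} tau: residual {residual_error(n):.3%}, f_max {f_max / 1e6:.1f} MHz")
```

Under these assumptions, waiting 5 time constants leaves roughly 0.7% of a step unsettled and 7 time constants leaves roughly 0.1%, so tighter accuracy directly lowers the attainable clock rate.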
Minimum Switching Frequency 
During normal operation, S1 is opened, followed by the closure of S2, and vice versa.
The time intervals t12 and t21 between these two instants determine the minimum
possible switching frequency. During these intervals, the circuit is idle: both
switches are off, so neither capacitor is connected to its respective drain.
Thus, the gate voltages V_CH1 and V_CH2 float at their precharged
voltages. The voltages stored on the MOS capacitors at the gates of MFN and M1P are
affected by the leakage current that flows from the gate. The peak to peak variation
caused by the leakage current is given by:
    \Delta V_{pp} = I_{leak} \frac{t_{12}}{C_{H1}}        (82)
Variation in the gate voltage produces a variation in the drain current
    \Delta I = g_m \Delta V_{pp}        (83)
The leakage current is due to the reverse biased diode current associated with the
transmission gates. Longer t12 and t21 result in larger drain current errors. These relations
impose an upper limit on t12 and t21 for a given tolerance in drain current error.
These times may be referred to as the circuit idle time. The idle time is given by:
    t_{12} = \frac{\Delta I}{g_m} \frac{C_{H1}}{I_{leak}}        (84)
Note that t21 equals t12. They set the minimum allowable operational frequency at:
(85)
Switching frequencies below φ2min introduce unacceptable errors in the output current.
Increasing C_H1 and C_H2 allows a lower operational frequency but increases the settling
time. Hence, a proper trade-off has to be made.
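The leakage relations can be exercised with representative numbers. The sketch assumes the forms ΔV_pp = I_leak·t/C_H for the gate droop and t12 = (ΔI/g_m)·(C_H/I_leak) for the allowed idle time; all component values below are hypothetical and serve only to show the orders of magnitude involved.

```python
def gate_droop(i_leak, t_idle, c_hold):
    """Peak to peak gate voltage change caused by leakage during idle time."""
    return i_leak * t_idle / c_hold


def max_idle_time(delta_i, gm, c_hold, i_leak):
    """Longest idle interval that keeps the drain current error below delta_i."""
    return (delta_i / gm) * (c_hold / i_leak)


i_leak = 1e-12      # hypothetical junction leakage (A)
c_hold = 1e-12      # hypothetical hold capacitance (F)
gm = 50e-6          # hypothetical transconductance (S)
delta_i = 0.1e-6    # tolerated drain current error (A)

t12 = max_idle_time(delta_i, gm, c_hold, i_leak)
droop = gate_droop(i_leak, t12, c_hold)
print(f"t12 = {t12 * 1e3:.2f} ms, gate droop = {droop * 1e3:.2f} mV")
```

With these assumed values the idle time comes out in the millisecond range, and a larger hold capacitance lengthens it proportionally, at the cost of slower settling.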
Mechanisms of Errors 
This section addresses the errors that are present in the output current of the CCI circuit.
These errors are the result of charge injection, switch feedthrough, channel length modulation,
and leakage current. The error due to the leakage current was discussed previously.
Charge Injection 
A significant limitation to the precision of the current copiers is due to the
realization of the various switches by means of transistors. To close the switch, the
switching transistor is made conductive by mobile carriers that are attracted into the
channel by the gate voltage. For charge equilibrium, the total charge of the mobile
carriers in the channel must be equal to the total charge stored on the gate. The charge
q stored on the gate in strong inversion is given by:
(86) 
When the switch is opened, these carriers are released from the channel in order to block
the transistor. The channel charge flows into the source and drain. Thus, in the N-copier,
when switch S1 opens, a fraction bq of q is dumped on the capacitor C_H2. The factor b
determines the amount of charge that is dumped on the source of the MOS transistor. In
some literature, it is specified to be 0.5 [66]. This causes a gate voltage error given by
    \Delta V = \frac{b q}{C_{H2}}        (87)
in the stored voltage V_CH2. This voltage error in turn creates a relative error in the output
current of the copier as
    \frac{\Delta I_{DMFN}}{I_{DMFN}} = \frac{g_{mMFN} \Delta V}{I_{DMFN}} = \frac{g_{mMFN}}{I_{DMFN}} \frac{b (W L C_{ox})_{S1} (V_{GS} - V_T)_{S1}}{C_{H2}}        (88)
ΔV can be decreased by making the gate oxide capacitance of the switch a small percentage
of C_H2, where one limit is given by the area of C_H2. It can also be decreased by
reducing the total charge q in the channel, which in turn reduces the fraction Δq that flows
onto C_H2. This can be achieved by minimizing the gate area W×L and/or by controlling
the gate voltages of the switch. The percentage error also tends to be low at higher
values of V_CH2. A similar treatment applies to the P-copier cell for determination of the
error due to the charge injection.
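The dependence of the injection error on the hold capacitance can be seen with a small calculation. The sketch assumes the strong inversion channel charge q = W·L·C_ox·(V_GS − V_T) for the switch and the voltage step ΔV = b·q/C_H; the switch dimensions, oxide capacitance, and voltages are hypothetical, while the 200 pF figure mirrors the external capacitance used during testing.

```python
def channel_charge(w, l, cox, vgs, vt):
    """Mobile channel charge of the switch in strong inversion (C)."""
    return w * l * cox * (vgs - vt)


def injection_error_voltage(q, c_hold, b=0.5):
    """Gate voltage step when a fraction b of q lands on the hold capacitor."""
    return b * q / c_hold


# Hypothetical minimum-size switch: 3 µm x 3 µm, C_ox = 0.8 fF/µm^2
q = channel_charge(w=3e-6, l=3e-6, cox=0.8e-3, vgs=3.8, vt=0.8)

dv_internal = injection_error_voltage(q, c_hold=1e-12)            # 1 pF on chip
dv_external = injection_error_voltage(q, c_hold=1e-12 + 200e-12)  # plus 200 pF

print(f"dV (1 pF)   = {dv_internal * 1e3:.2f} mV")
print(f"dV (201 pF) = {dv_external * 1e3:.4f} mV")
```

Under these assumed values, the added capacitance reduces the injected voltage step by the ratio of the capacitances, about a factor of 200.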
Switch Feedthrough 
The switch feedthrough contribution is due to the clock voltage that is coupled to the
gate via C_GD. The clock voltage is partially transferred to the gate via the capacitive
network. The transferred voltage is given by
(89)
where V_g is the gate voltage of the switch transistor and C_GD is the gate to drain capacitance of
the transmission gate. The change in the gate voltage multiplied by the transconductance
reflects an error in the drain current.
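As a rough illustration of clock feedthrough, the sketch below models the coupling as a simple capacitive divider between the switch overlap capacitance and the hold capacitor. This divider form is an assumption made for the example and is not taken from the text; the capacitance, clock swing, and transconductance values are hypothetical.

```python
def feedthrough_voltage(v_clock, c_gd, c_hold):
    """Clock step coupled onto the hold node through a capacitive divider."""
    return v_clock * c_gd / (c_gd + c_hold)


def drain_current_error(delta_v, gm):
    """Drain current error caused by a gate voltage disturbance."""
    return gm * delta_v


v_clock = 5.0      # clock swing (V), matching a 5 V supply
c_gd = 1e-15       # hypothetical gate to drain overlap capacitance (F)
c_hold = 1e-12     # hypothetical hold capacitance (F)
gm = 50e-6         # hypothetical transconductance of the copier device (S)

dv = feedthrough_voltage(v_clock, c_gd, c_hold)
di = drain_current_error(dv, gm)
print(f"dV = {dv * 1e3:.3f} mV, dI = {di * 1e6:.3f} µA")
```

In this simplified picture, the coupled step shrinks as the hold capacitance grows relative to the overlap capacitance.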
Cascode Configurations
Consider the structure illustrated in Figure 48 without the cascode devices M2P and
M2N. For any cycle, during phases 1 and 2, currents are sampled on C_H1 and C_H2,
respectively, while during the remainder of the cycle time, the copiers hold these sampled
currents on their gates. Considering the P-cell, let V_7S and V_7H be the voltages attained
by node 7 during the sample phase and the hold phase, respectively. V7 must return
to the value V_7S, equal to V_CH1, during sample phase 1. During the hold phase, S1 is open
and V7 jumps to the voltage V_7H, imposed by the relative impedances of M1N, M1P, and
the input current sink. Since V_7S is not equal to V_7H, the difference in drain voltages
during the two phases produces additional contributions to the inaccuracy of the
integration.
The first contribution is due to the channel length modulation producing a change in
the drain current as the drain to source voltage changes. Mathematically, this can be
represented in terms of the effective output conductance g_o, where g_o is the combined
output conductance of the cascoded M1P and M2P. Thus, the relative error in the output
current of the copier can be written as
    \frac{\Delta I_D}{I_D} = \frac{g_o (V_{7H} - V_{7S})}{I_D}        (90)
The second contribution is due to the drain voltage transferred to the gate via C_GD.




where V_Dsample and V_Dhold are the drain voltages attained by M1P in the sample and hold
phases, respectively. The change in the gate voltage multiplied by the transconductance
reflects an error in the drain current.
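The benefit of cascoding on the first contribution (equation 90) can be estimated with first-order device models. The sketch assumes g_o ≈ λ·I_D for a single transistor and a reduction of g_o by the cascode device's intrinsic gain; both are textbook approximations rather than results from this work, and λ, the gain, and the node voltages are hypothetical.

```python
def clm_relative_error(g_o, v_hold, v_sample, i_d):
    """Relative current error from a drain voltage change: g_o * dV / I_D."""
    return g_o * (v_hold - v_sample) / i_d


i_d = 30e-6            # copied current (A)
lam = 0.05             # hypothetical channel length modulation parameter (1/V)
cascode_gain = 50.0    # hypothetical intrinsic gain of the cascode device

g_o_single = lam * i_d                     # single transistor output conductance
g_o_cascoded = g_o_single / cascode_gain   # cascoded output conductance

v_sample, v_hold = 1.0, 2.0                # hypothetical node 7 voltages (V)
err_single = clm_relative_error(g_o_single, v_hold, v_sample, i_d)
err_cascoded = clm_relative_error(g_o_cascoded, v_hold, v_sample, i_d)
print(f"single: {err_single:.2%}, cascoded: {err_cascoded:.2%}")
```

With these assumed numbers, a 1 V difference between the sample and hold drain voltages costs about 5% without the cascode and about 0.1% with it.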
Simulations 
The transient simulation of the CCI is shown in Figure 50. With a 30 µA steady
state input current applied, integration over 5 cycles is observed. φ21 and φ22 are the
switching clocks. The output current is sampled via R_O.
Initially, the circuit is reset to the initial conditions. The first output sample is found
to be approximately 35 µA. A successive increase in the step size of the output current is
attributed to the cumulative integration of an error term that is present due to the previously
described factors, specifically channel length modulation effects and channel charge
injection. Over 5 cycles, the error is found to be 33%. For the designed geometries, the
circuit saturates above 300 µA.
This simulation demonstrates maximum switching frequencies in excess of 10 MHz.
Figure 50. Transient Operation of the CCI
Testing
The test setup consisted of two variable duty cycle non-overlapping clocks, φ21 and
φ22, derived from the pulse generator and applied to the CCI. An auxiliary breadboarded
circuit driven by φ21 was used to generate complementary reset pulses after
every 8 clock cycles. Thus, throughout the testing, integration is performed over 8 clock
cycles by periodically resetting V_CH1 and V_CH2 with reset pulses. An auxiliary circuit
was also breadboarded to produce a precision current sink I*_i, whose current was controlled
in steps. The output current I_i was sampled across a precision 10 kΩ sampling resistor.
The performance was observed and recorded under two conditions: with and without
external gate capacitances added to the internal MOS capacitances C_H1 and C_H2. With
an external capacitance of 200 pF added to each of C_H1 and C_H2, the clock speed is
set at a low value (1 kHz). In this case, for I*_i equal to 20 µA, during the first clock
cycle, I_i is found to be 20 µA, which is in exact agreement with the integrator theory.
But, during the second clock cycle, I_i rose to 70 µA instead of the theoretical 40 µA
value, a 75% error in the integrand. During the subsequent cycles, the output
current is observed to deviate increasingly from its expected theoretical values.
Clearly, this is due to the cumulative integration of an error term, which is added
during every cycle along with the information signal. Thus, as the integration progresses,
a larger error accumulates, leading to a substantial error term during the later part of the
prolonged integration cycles (200%). Hence, in subsequent designs the following
factors should be considered: (1) an error compensation scheme added to the basic circuitry of
Figure 48, in order to cancel the error in the integrand before it is processed further, and
(2) additional cascodes for MFN and MFP.
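The quoted 75% figure follows from comparing the measured output with the ideal staircase n × I*_i. The short check below uses only the two measured values reported above (20 µA and 70 µA for a 20 µA input); the helper name `relative_error` is hypothetical.

```python
def relative_error(measured, ideal):
    """Deviation of the measured output from the ideal value, as a fraction."""
    return (measured - ideal) / ideal


i_in = 20e-6                       # input current per cycle (A)
measured = [20e-6, 70e-6]          # outputs reported for cycles 1 and 2 (A)

# Ideal integrator output after cycle k is k times the input current
ideal = [i_in * (k + 1) for k in range(len(measured))]
errors = [relative_error(m, i) for m, i in zip(measured, ideal)]

for cycle, err in enumerate(errors, start=1):
    print(f"cycle {cycle}: error = {err:.0%}")
```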
The circuit conditions without the external capacitance added to the circuit are
identical to the earlier case except that I*_i is set at 5 µA. In this case, a step in the output
current due to channel charge injection and/or feedthrough is observed. It occurs when
S1 and S2 are opened. From equation 87, the channel charge injection error is a function
of the ratio of the switch oxide capacitance to the hold capacitance. It has been suggested
previously that such an error can be reduced by making the switch oxide capacitance a small
percentage of C_H1 and C_H2. Therefore, during the previous experiment, the off-chip 200
pF capacitances were added to the internal MOS capacitances C_H1 and C_H2. With the
external capacitances removed, ΔV due to the channel charge injection or switch
feedthrough is comparatively high, resulting in a false step in the output current in every
cycle. The resultant error in the output current due to the channel charge injection or
switch feedthrough was found to be as high as 40%.
Summary 
From the above discussion, it is clear that dynamic current copying techniques are
potentially superior to the normal current mirroring techniques, due to the complete
elimination of threshold mismatch errors and the potential removal of flicker noise. However,
proper selection of switching frequencies, incorporation of proper compensating schemes,
and proper circuit design techniques cannot be overlooked when attempting to minimize
errors. Appropriate device geometries, switches, and gate capacitances are important
factors in the design of the CCI. Increasing the gate capacitances C_H1 and C_H2 improves
accuracy but lowers the operating frequencies. Therefore, a proper trade-off between space
and accuracy is required. In summary, charge injection, switch feedthrough, and other
errors due to a variation in drain voltage are the main sources of error that occur
during the integration. Complementary clocks, which are suggested for driving
complementary transmission switches to cancel the effect of switch feedthrough, appear
to be of little value. Therefore, future designs will make use of dummy switches in
conjunction with single channel transistor switches.
CHAPTER IV 
CONCLUSIONS AND FUTURE PROSPECTS 
Analog circuits are often criticized for their functionality when compared to their 
digital counterparts. Usually, analog integrated circuits designed by even the most 
experienced designers require multiple attempts to achieve desired results. Our experience 
in this regard is the other way around. Seven of nine blocks showed satisfactory DC 
behavior, due to the utmost care in design, simulation, and layout. However, the 
electronic olfaction topic is still open to many improvements, both in the olfactory model 
and in the refinement of electronic building blocks. The following suggestions provide 
the future scope that will help in realizing the system level integration of the GLA 
olfactory model. 
The GLA olfactory model described in chapter II is most definitely biologically 
inspired, but the basic idea in the minds of the original investigators initially may not 
have been its hardware implementation. In other words, the model may need additional 
simplifications that favor a simple electronic implementation while retaining the model's 
essential clustering properties. The ongoing simulation efforts on the simplified model
by our group at Oklahoma State University and by researchers elsewhere [47] will
hopefully lead to a further simplified but computationally efficient model in the near future.
In spite of the number of favorable features that make the GLA model suitable for direct 




The system level integration of the olfactory model will require
additional knowledge of specific model parameter values (g, m, p, h). The primary task
of selecting the best set of implementation strategies for an olfactory architecture is a
rather difficult issue since olfaction is poorly understood. Extensive computer simulations
will be required to analyze the effect of various model parameters, such as the number of
glomeruli and mitral patches, on the clustering properties. This will assist in
selecting the optimal parameters, thus providing efficient use of the silicon area.
These parameters will have a direct impact on the transistor level design.
Two-dimensional connectivity may form a bottleneck. In this regard, techniques like
multiplexing, inherently sparse and spatially local interconnect, or shared wires will help
to reduce routing complexity.
The problems of communication, weight representation, and learning will also be of
particular importance. To achieve effective communication on the memory front, local
storage of the weight in close proximity to the multiplier hardware is the preferred
solution. The task of weight updates is complex since it involves issues related to high
voltage non-linear programming, the learning algorithm, weight storage, on/off chip learning,
etc. In other words, local optimization will dominate the design and will remain a key focus
in any olfactory system design.
From an electronic perspective, the future prospects for electronic olfaction are
unlimited. Adhering to the sequence presented in chapter III, the multiplier testing
results closely match the simulation and theoretical results. However, the present
multiplier circuit is area consuming. The possibility of an alternative area-efficient single
quadrant multiplier needs to be investigated. The mitral patch circuit needs thorough
analysis. The idea of incorporating the sigmoidal function within the mitral patch by
arranging thresholds in a nonlinear fashion certainly deserves some attention. The offset
circuit associated with the multiplier needs special attention. An area-efficient way must
be found to realize an internal high leakage resistance. MOSFETs operating in the
subthreshold region should be investigated for this purpose.
In electronic neural networks, the problem of realizing a trainable analog medium
is a current subject of high interest. Floating gate memories provide the best answer to
electrically programmable/erasable non-volatile semiconductor memories. Out of the
numerous possibilities, the concept of standard CMOS floating gate memory, based on
the field enhancement due to mask geometries, is relatively new and poorly understood.
These memories may not ever be suitable due to their heavy dependence on the
manufacturing process. Precise control of the weight needs extensive experimentation to
mathematically model and understand the programming and erasing behaviors. This
will assist in uncovering the basic physical principles behind the field
enhancement due to mask geometries and the retention of charge.
System level integration will require a suitable programming scheme. An algorithm 
has to be devised to convert the inherently complex and non-linear programming into a 
relatively simplified and hopefully linearized learning algorithm. 
The on-chip generation of high voltage poses a real challenge. However, the
tunneling physics and high voltage pulse generation are two separate issues and initially
should be handled separately for conceptual testing and understanding, and then should
be combined together. Other issues relating to the weight matrix are cell layout,
placement, and signal routing. Cell layout will have a direct impact on both the silicon
area and the cell performance. Significant expertise is needed to arrive at the
optimal design. A suitable signal routing scheme is required since the weight matrix is
expected to be dominated by routing wires. In this regard, high voltage concerns such
as field threshold, reverse breakdown, etc. need special attention.
The testing of the WTA circuit reveals a limited operating range (0-70 µA). Device
geometries have to be pushed to achieve a higher dynamic range. Further, a creative on-chip
testing structure must be developed to measure the bandwidth. The CCI circuit has
to be modified [65] to incorporate the error compensation scheme, improved dynamic
cascoding, and the dummy switches. This will bring down the errors in the output current.
Finally, another milestone of this research, the system level integration of the
olfactory model on a single substrate, will require a serious effort. Each factor
(simplification of the model, suitable programming scheme, on-chip high voltage
generation, weight cell characterization, etc.) by itself can be significant enough to be
another thesis. By no means does the author imply that the above list of problems is
complete. But as we delve into the area, hopefully we will come up with many more
opportunities for improvements.
REFERENCES 
1. D. E. Rumelhart, G. E. Hinton, and R. J. Williams, "Learning Internal 
Representations by Error Propagation," in Parallel Distributed Processing, 
Explorations in the Microstructure of Cognition, D. E. Rumelhart & J. L. 
McClelland, Eds. MIT Press, Cambridge, MA, Vol. 1, pp. 318-362, 1986. 
2. S. Grossberg, "Nonlinear Neural Networks: Principles, Mechanisms, and 
Architectures," Neural Networks, Vol. 1, pp. 17-61, 1988. 
3. R. P. Lippmann, "An Introduction to Computing with Neural Nets," IEEE ASSP 
Magazine, pp. 4-22, April 1987. 
4. R. Sharda, and R. Patil, "Neural Networks as Forecasting Experts: An Empirical 
Test," International Joint Conference on the Neural Networks, Washington, 
D.C., Vol. II, pp. 491-494, 1990. 
5. J. J. Hopfield, "Neural Networks and Physical Systems with Emergent Collective 
Computational Abilities," Proceedings of the National Academy of Sciences, 
Vol. 79, pp. 2554-2558, 1982. 
6. M. Holler, S. Tam, H. Castro, and R. Benson, "An Electrically Trainable 
Artificial Neural Network (ETANN) with 10240 Floating Gate Synapses," 
in Proceedings, International Joint Conferences on the Neural Networks, 
Washington, DC, Vol. II, pp. 177-182, 1989. 
7. B. Furman, and A. Abidi, "CMOS Analog IC Implementing the Back Propagation 
Algorithm," Neural Networks, Vol. 1, Sup. 1, p. 381, 1988. 
8. L. D. Jackel, H. P. Graf, and R. E. Howard, "Electronic Neural Network Chips," 
Applied Optics, Vol. 26, pp. 5077-5080, 1987. 
9. P. Mueller, J. Van der Spiegel, D. Blackman, T. Chiu, T. Clare, J. Dao, C. 
Donham, T. Hsieh, and M. Loinaz, "A General Purpose Analog 
Neurocomputer," Proceedings of the International Joint Conference on 
Neural Networks, Vol. II, pp. 191-196, 1989. 
10. A. F. Murray, and A. Smith, "Asynchronous VLSI Neural Networks Using Pulse-Stream Arithmetic." 
11. D. B. Schwartz, R. E. Howard, and W. E. Hubbard, "A Programmable Analog 
Neural Network Chip," IEEE Journal of Solid-State Circuits, Vol. 24, pp. 
313-319, 1989. 
12. G. Lynch, and R. Granger, "Simulation and Analysis of a Simple Cortical 
Network," Psychology of Learning and Motivation, Vol. 22, pp. 1-87, 1988. 
13. D. O. Hebb, The Organization of Behavior, Wiley, New York, 1949. 
14. G. Lynch, Synapses, Circuits, and the Beginnings of Memory, MIT Press, 
Cambridge, MA, 1986. 
15. D. Hammerstrom, and E. Means, "System Design for a Second Generation 
Neurocomputer," In Proceedings, International Conference on Neural 
Networks, Washington, Vol. II, pp. 80-83, 1990. 
16. J. Ambros-Ingerson, R. Granger, and G. Lynch, "Simulation of Paleocortex 
Performs Hierarchical Clustering," Science, Vol. 247, pp. 1344-1348, 1990. 
17. J. Ambros-Ingerson, Computational Properties and Behavioral Expression of 
Cortex-Peripheral Interactions Suggested by a Model of the Olfactory Bulb 
and Cortex, Ph.D. Dissertation, University of California, Irvine, 1990. 
18. G. Lynch, and R. Granger, "Simulation and Analysis of a Simple Cortical 
Network," Psychology of Learning and Motivation, Vol. 23, pp. 205-241, 
1989. 
19. P. K. Simpson, "A Survey of Artificial Neural Systems," 
20. P. D. Wasserman, Neural Computing Theory and Practice 
21. T. Kohonen, Self-Organization and Associative Memory, 2nd Edition, 
Springer-Verlag, Berlin, 1988. 
22. D. Specht, "Probabilistic Neural Networks," Neural Networks, Vol. 3, pp. 109-118, 
1990. 
23. G. E. Hinton, and T. J. Sejnowski, "Learning and Relearning in Boltzmann 
Machines," in Parallel Distributed Processing, Explorations in the 
Microstructure of Cognition, D. E. Rumelhart & J. L. McClelland, Eds. MIT 
Press, Cambridge, MA, Vol. 1, pp. 282-317, 1986. 
24. J. Daugman, "Networks for Image Analysis: Motion and Texture," in Proceedings, 
International Joint Conference on Neural Networks, Washington DC, Vol. 
I, pp. 189-194, 1989. 
25. N. Suga, "Cortical Computational Maps for Auditory Imaging," Neural Networks, 
Vol. 3, pp. 3-22, 1990. 
26. Y. Yao and W. J. Freeman, "Model of Biological Pattern Recognition with 
Spatially Chaotic Dynamics," Neural Networks, Vol. 3, pp. 153-170, 1990. 
27. J. P. Wagnon, A Proposal for An Analog CMOS Median Filter System Based On 
Neural Network Architectural Principles, Master's Thesis, Oklahoma State 
University, Stillwater, 1988. 
28. J. Bailey, D. Hammerstrom, J. Mates, and M. Rudnick, "Silicon Association 
Cortex," in S. F. Zornetzer, J. L. Davis, and C. Lau, editors, An Introduction 
to Neural and Electronic Networks, Academic Press, August 1989. 
29. C. A. Mead, Analog VLSI and Neural Systems, Addison-Wesley, MA, 1989. 
30. C. A. Mead, "Neuromorphic Electronic Systems," Proceedings of the IEEE, Vol. 
78, pp. 1629-1636, 1990. 
31. S. P. DeWeerth, and C. A. Mead, "An Analog VLSI Model of Adaptation in the 
Vestibulo-Ocular Reflex," in Advances in Neural Information Processing 
Systems 2, D. Touretzky, Ed. Morgan-Kaufmann, San Mateo, CA, 
pp. 742-749, 1990. 
32. C. A. Mead, X. Arreguit, and J. Lazzaro, "Analog VLSI Model of Binaural 
Hearing," IEEE Transactions on Neural Networks, Vol. 2, pp. 230-236, 
1991. 
33. C. A. Mead, and M. A. Mahowald, "A Silicon Model of Early Visual Processing," Neural 
Networks, Vol. 1, pp. 91-97, 1988. 
34. J. G. Taylor, "A Silicon Model of Vertebrate Retinal Processing," Neural 
Networks, Vol. 3, pp. 171-178, 1990. 
35. A. Moore, J. Allman, and R. M. Goodman, "A Real-Time Neural System for Color 
Constancy," IEEE Transactions on Neural Networks, Vol. 2, pp. 237-247, 
1991. 
36. H. C. Card, and W. R. Moore, "Silicon Models of Associative Learning In 
Aplysia," Neural Networks, Vol. 3, pp. 333-346, 1990. 
37. R. Braham, and J. O. Hamblen, "The Design of a Neural Network with a 
Biologically Motivated Architecture," IEEE Transactions on Neural 
Networks, Vol. 1, No. 3, pp. 251-262, 1990. 
38. C. Toumazou, J. Lidgey, and D. Haigh, "Introduction," Ch. 1 in Analogue IC 
Design: The Current-Mode Approach, C. Toumazou, F. J. Lidgey, and D. 
G. Haigh, Eds., Peregrinus, London, 1990. 
39. B. Gilbert, "Current-Mode Circuits From A Translinear Viewpoint: A Tutorial," 
Ch. 2 in Analogue IC Design: The Current-Mode Approach, C. Toumazou, 
F. J. Lidgey, and D. G. Haigh, Eds., Peregrinus, London, 1990. 
40. S. T. Dupuie, and M. Ismail, "High Frequency CMOS Transconductors," Ch. 5 in 
Analogue IC Design: The Current-Mode Approach, C. Toumazou, F. J. 
Lidgey, and D. G. Haigh, Eds., Peregrinus, London, 1990. 
41. K. Bult, and H. Wallinga, "A Class of Analog CMOS Circuits Based on the 
Square-Law Characteristics of an MOS Transistor in Saturation," IEEE J. 
Solid-State Circuits, Vol. SC-22, pp. 357-365, June 1987. 
42. S. B. Patil, and C. G. Hutchens, "A Novel Squashing Function for Electronic 
Implementation of Neural Networks," 5th Oklahoma Symposium on 
Artificial Intelligence, 1991. 
43. Y. Tsividis, Operation and Modeling of MOS Transistor, 
44. P. E. Allen, and D. R. Holberg, "Two Stage Comparators," Ch. 7 in CMOS 
Analog Circuit Design, HRW Inc., 1987. 
45. A. S. Sedra, and G. W. Roberts, "Current Conveyor Theory and Practice," Ch. 3 
in Analogue IC Design: The Current-Mode Approach, C. Toumazou, F. J. 
Lidgey, and D. G. Haigh, Eds., Peregrinus, London, 1990. 
46. VLSI Design Techniques For Analog and Digital Circuits 
47. P. A. Shoemaker, C. G. Hutchens, and S. B. Patil, "A Hierarchical Clustering 
Network Based on a Model of Olfactory Processing," Submitted, 1992. 
48. T. H. Borgstrom, M. Ismail, and S. B. Bibyk, "Programmable Current-Mode 
Neural Network for Implementation in Analogue MOS VLSI," IEE 
Proceedings, Vol. 137, Pt. G, No. 2, pp. 175-183, April 1990. 
49. W. M. Gosney, "DIFMOS-A Floating-Gate Electrically Erasable Nonvolatile 
Semiconductor Memory Technology," IEEE Transactions on Electron 
Devices, Vol. ED-24, No. 5, pp. 594-599, May 1977. 
50. Y. Tsividis, and S. Satyanarayana, "Analog Circuits for Variable-Synapse 
Electronic Neural Networks," Electronics Letters, Vol. 23, No. 24, 
pp. 1313-1314, November 1987. 
51. R. L. Shimabukuro, and P. A. Shoemaker, "Circuitry for Artificial Neural 
Networks with Nonvolatile Analog Memories," Proceedings, IEEE 
International Symposium on Circuits and Systems, pp. 1217-1220, 1989. 
52. C. Bulucea, "Avalanche Injection into the Oxide In Silicon Gate Controlled 
Devices-I. Theory," Solid State Electronics, Vol. 18, pp. 363-374, 1975. 
53. S. M. Sze, Physics of Semiconductor Devices, Wiley, New York, 1981. 
54. P. A. Shoemaker, M. J. Carlin, and R. L. Shimabukuro, "Back-propagation 
Learning with Trinary Quantization of Weight Updates," Neural Networks, 
Vol. 4, pp. 231-241, 1991. 
55. D. Kahng and S. M. Sze, Bell System Tech. J., Vol. 46, p. 1288, 1967. 
56. E. H. Nicollian, A. Goetzberger, and C. N. Berglund, Applied Physics Letters, 
Vol. 15, p. 174, 1969. 
57. D. Frohman-Bentchkowsky, "Memory Behavior In a Floating-Gate 
Avalanche-Injection MOS (FAMOS) Structure," Applied Physics Letters, 
Vol. 18, Number 8, pp. 332-334, April 1971. 
58. D. Frohman-Bentchkowsky, "FAMOS - A New Semiconductor Charge Storage 
Device," Solid State Electronics, Vol. 17, pp. 517-529, 1974. 
59. Y. Tarui, Y. Hayashi, and K. Nagai, "Electrically Reprogrammable Non-volatile 
Semiconductor Memories," IEEE J., SC-7, pp. 369-375, 1972. 
60. T. G. Carlstedt, and C. M. Svensson, "MNOS Memory Transistor In Simple 
Memory Arrays," IEEE J., SC-7, pp. 382-388, 1972. 
61. R. A. Williams, and M. M. E. Begueala, "The Effect of Electrical Conduction of 
Si3N4 On the Discharge of MNOS Memory Transistor," IEEE Transaction, 
ED-25, 8, pp. 1019-1022, 1978. 
62. L. R. Carley, "Trimming Analog Circuits Using Floating-Gate Analog MOS 
Memory," IEEE Journal of Solid-State Circuits, Vol. 24, No. 6, pp. 1569-1575, December 1989. 
63. B. W. Lee, B. J. Sheu, and H. Yang, "Analog Floating-Gate Synapses for 
General-Purpose VLSI Neural Network Computation," IEEE Transactions on 
Circuits and Systems, Vol. 38, No. 6, June 1991. 
64. J. Lazzaro, S. Ryckebusch, M. A. Mahowald, and C. A. Mead, "Winner-Take-All 
Networks of O(N) Complexity," California Institute of Technology 
Technical Report Caltech-CS-TR-21-88, 1989. 
65. E. A. Vittoz, and G. Wegmann, "Dynamic Current Mirrors," Ch. 7 in Analogue 
IC Design: The Current-Mode Approach, C. Toumazou, F. J. Lidgey, and 
D. G. Haigh, Eds., Peregrinus, London, 1990. 
66. S. J. Daubert, and D. Vallancourt, "Operation and Analysis of Current Copier 
Circuits," IEE Proceedings, Vol. 137, Pt. G, No. 2, pp. 109-115, April 1990. 
67. C. Hutchens, A. Hill, and S. B. Patil, "Simulation of an Olfactory Neural Paradigm 
Suitable for Electronic Clustering," 5th Oklahoma Symposium on Artificial 
Intelligence, Nov. 1991. 
68. J. L. Wyatt, D. L. Standley, and W. Yang, "The MIT Vision Chip Project: Analog 
VLSI Systems for Fast Image Acquisition and Early Vision Processing," 
Proceedings of the IEEE, International Conference on Robotics and 
Automation, pp. 1330-1335, April 1991. 
69. M. A. Sivilotti, M. A. Mahowald, and C. A. Mead, "Real-Time Visual 
Computation Using Analog CMOS Processing Arrays," 1987. 
70. K. Goser, U. Hilleringmann, U. Rueckert, and K. Schumacher, "VLSI 
Technologies for Artificial Neural Networks," IEEE Micro, pp. 28-44, 
December 1989. 
71. J. P. Sage, K. Thompson, and R. S. Withers, "An Artificial Neural Network 
Integrated Circuit Based on MNOS/CCD Principles," American Institute of 
Physics, pp. 381-385, 1986. 
72. A. J. Agranat, C. F. Neugebauer, and A. Yariv, "A CCD Based Neural Network 
Integrated Circuit With 64K Analog Programmable Synapses," IJCNN, 
pp. II-552-555. 
73. Y. P. Tsividis, and D. Anastassiou, "Switched-Capacitor Neural Networks," 
Electronics Letters, Vol. 23, No. 18, pp. 958-959, August 1987. 
74. M. Stanford Tomlinson Jr., D. J. Walker, and M. A. Sivilotti, "A Digital Neural 
Network Architecture for VLSI," IJCNN, pp. II-545-550. 
75. A. F. Murray, and Anthony V. W. Smith, "Asynchronous VLSI Neural Networks 
Using Pulse-Stream Arithmetic," IEEE, pp. 688-697, 1988. 
76. W. Wike, D. Van den, and T. Miller III, "The VLSI Implementation of STONN," 
IJCNN, pp. II-593-598. 
77. S. Satyanarayana, and Y. Tsividis, "Analog Neural Networks with Distributed 
Neurons," Electronics Letters, Vol. 25, No. 5, pp. 302-303, March 1989. 
78. J. Alspector and R. B. Allen, "A Neuromorphic VLSI Learning System," Bell 
Communication Research, pp. 314-345. 
79. H. P. Graf, and P. de Vegvar, "A CMOS Implementation of a Neural Network 
Model," AT&T Bell Laboratories, Holmdel. 
80. D. Hammerstrom, "A VLSI Architecture for High-Performance, Low-Cost, On-chip 
Learning," Adaptive Solutions Inc., Beaverton, Oregon, pp. II-537-544, 
February 28, 1990. 
81. Y. Hirai, K. Kamada, M. Yamada, and M. Ooyama, "A Digital Neuro-Chip With 
Unlimited Connectability for Large Scale Neural Networks," Institute of 
Information Sciences and Electronics, University of Tsukuba, Japan, 
pp. II-163-169. 
82. P. W. Hollis, "Artificial Neural Network Using MOS Analog Multipliers," IEEE 
Journal of Solid-State Circuits, Vol. 25, No. 3, pp. 849-855, June 1990. 
83. N. I. Khachab, and M. Ismail, "MOS Multiplier/Divider Cell For Analogue VLSI," 
Electronics Letters, Vol. 25, No. 2, pp. 1550-1553, November 1989. 
84. B. Hochet, "Multivalued MOS Memory for Variable-Synapse Neural Networks," 
Electronics Letters, Vol. 25, No. 10, pp. 669-670, May 1989. 
85. J. Alspector, R. B. Allen, V. Hu, and S. Satyanarayana, "Stochastic Learning 
Networks and Their Implementations," In D. Z. Anderson (Ed.), Proceedings 
of IEEE Conference on Neural Information Processing Systems-Natural and 
Synthetic, pp. 9-21, 1988. 
VITA 
Sanjay B. Patil 
Candidate for the Degree of 
Master of Science 
Thesis: VLSI IMPLEMENTATION OF OLFACTORY CORTEX MODEL 
Major Field: Electrical Engineering 
Biographical: 
Personal Data: Born in Maharashtra, India, August 28, 1966, son of Dr. Bhagawan 
Patil and Mrs. Vatsala Patil. 
Educational: Graduated from Shri Shivaji Secondary School, Navapur, India, 1981; 
received a Diploma in Electrical Engineering from Government Polytechnic, 
Yeotmal in July 1984; received Bachelor of Electrical Engineering from 
College of Engineering Poona, in July 1987; completed requirements for the 
Master of Science degree at Oklahoma State University in May, 1993. 
Professional Experience: Research Assistant (1991-Present), Dept. of Electrical 
Engg., OSU; Teaching Assistant (Fall 1991), Dept. of Electrical Engg., 
OSU; Research Assistant (1990-1991), College of Business Administration, 
OSU. 
Design Executive, Switchgears (1988-89), Siemens Ltd., Bombay, India. 
Production Engineer, Switch Boards (1987-88), Siemens Ltd., Bombay, India. 
