Implementing Homeostatic Plasticity in Analog VLSI by Rovere, Giovanni
A.A. 2012/2013
Università degli Studi di Padova
- School of Engineering -
Electrical Engineering Master's Degree
Master Thesis
Implementing Homeostatic Plasticity in Analog VLSI
And how to obtain ultra-long time constants on compact silicon substrates.
Candidate:
Giovanni Rovere
1033455-IL
giovanni.rovere@gmail.com
ETH-UZH Supervisor:
Prof. Giacomo Indiveri
UniPD Supervisor:
Prof. Andrea Gerosa

Abstract
Neuromorphic engineering systems are electronic devices that emulate the
spike-based computational paradigm observed in biological neural networks.
The neuromorphic computing power originates from the number of artificial
neurons and from the interconnections among them. Hence, high-density
CMOS analog VLSI circuits are an optimal implementation medium for
neuromorphic hardware systems.
However, high chip integration and CMOS process scaling give rise to
mismatch and non-ideality phenomena that limit the performance of the
device. A neuromorphic approach to address this problem is to implement
Synaptic Homeostatic Plasticity (SHP) in silicon. SHP is a property observed
in real neurons that modifies the synaptic gain in order to stabilize the
neuronal firing rate in the face of instance variations and stimulus changes.
In engineering terms, SHP is equivalent to an Automatic Gain Control (AGC)
loop comprising a Low Pass Filter (LPF). Such an LPF must have a cut-off
frequency several orders of magnitude lower than the neuron dynamics in
order not to interfere with the learning mechanisms. However, for
integration reasons, long time constants must be obtained by exploiting low
currents rather than by increasing the capacitor area. State-of-the-art
homeostatic plasticity implementations exploit floating-gate devices or
off-chip workstation control systems, which require additional circuitry or
prevent their use in low-power portable applications.
Given such challenging LPF specifications, I developed a compact CMOS
filter architecture based on leakage currents in a pMOS device. I carried
out and report simulations that show AGC time constants on the order of
minutes with a 1 pF capacitor.
To Mom and Dad, for their immense affection and support.
To my family and my dearest friends.
To F.A., G.M. and E.C., to whom I truly owe a great deal.
CONTENTS
1 Introduction
  1.1 Neurons
  1.2 Neural Networks
  1.3 Neuromorphic Engineering
  1.4 The Homeostatic Principle
  1.5 The Thesis Aim
2 Concept and Circuits of Neuromorphic Engineering
  2.1 The Subthreshold MOS
  2.2 A Low Leakage Cell
  2.3 The Translinear Principle and log-domain Filters
  2.4 The Differential Pair Integrator Circuit
  2.5 The Winner-Take-All Circuit
3 The Filter Design
  3.1 The Automatic Gain Control
  3.2 The Core Idea
  3.3 Filter Design 1 - The DPI
  3.4 Filter Design 2 - The Classical log-domain
  3.5 Filter Design 3 - The Unbalanced Structure
4 The Final Design and Simulations
  4.1 The Loop Design
  4.2 A low offset amplifier
5 Conclusions
  5.1 Discussion
  5.2 Future Works
A Masks Layout
Preface
This master thesis work was carried out in the Neuromorphic Cognitive Systems (NCS) group
under the supervision of Prof. Giacomo Indiveri of UZH|ETH at the Institute of Neuroinformatics
(INI), Zürich. The research and writing were conducted from late February 2013 to September
2013 as part of the Erasmus framework. The aim of this thesis report is to document the work
carried out, the results I found and the whole design process in which I was involved.
Analogue VLSI design is an iterative process that requires countless changes in the topology,
sizes and architecture of the design layout in order to get from the original idea to
the application. Hence, in this thesis, in addition to the final working circuit, I also report
other circuits that I simulated but that showed insurmountable issues and did not meet the
specifications. However, I report here only two such attempts, the ones I believed were most
meaningful because of their initial appeal. Countless other solutions, both taken from the
literature and designed from scratch, were part of my research.
I would like to emphasize that with this work I had the invaluable opportunity to deal with
the classical analogue design workflow (idea, design, simulation, layout). This work required
considerable effort and concentration, especially in order to find a robust and reliable circuit
that could be integrated in a bigger chip (MN256r1). In the end, the circuit was successfully
taped out and the chip is expected back in Fall 2013. Unfortunately, for this reason I cannot
include measurements of the real circuit in this thesis report; its test and characterization
will be carried out afterwards.
In addition to this main work, in early March 2013 I designed a rail-to-rail OPAMP
that will be used as a replacement for the old "WidePAD", a buffer for monitoring chip
signals. It will be used in the unity-gain configuration, so true rail-to-rail input and output
were key specifications. Hence, a two-stage OPAMP was designed and integrated in the same
chip (MN256r1).
This thesis report is structured as follows: Chapter one gives the context and background
of neural networks and defines the thesis aim. Chapter two briefly analyses meaningful
neuromorphic circuits and concepts exploited later in the design. Chapter three focuses on the
circuit design, comparing different solutions. Chapter four finally presents the on-chip
implementation of the circuit with Cadence simulations. Chapter five closes this report with
conclusions and further considerations.
Except where explicitly mentioned otherwise, the circuits, insights and arrangements found in
Chapter three and Chapter four are, as far as I know, original results that I obtained. Given
the novelty of the solutions, the new concepts and circuits have been condensed in a paper that
will be submitted to the IEEE International Symposium on Circuits and Systems (ISCAS) 2014.
In addition to Prof. Giacomo Indiveri for his invaluable support, I would also like to thank
Qiao Ning (INI, ETH) and Chiara Bartolozzi (IIT, Genova), who really helped me a lot, and of
course the whole Institute of Neuroinformatics.
CHAPTER 1
INTRODUCTION
This chapter provides an overview of basic concepts that will be used later in the discussion.
These sections are not complete or thorough and are intended only to contextualize this thesis
work. The introduction starts from the biological neuron and ends with a brief discussion
of neuromorphic hardware and circuits.
1.1 Neurons
Neurons are electrically excitable cells that are specialized in intercellular communication.
They are responsible for processing information over long distances through electrical
and chemical mechanisms.
A typical neuron possesses a cell body called the soma, several dendrites and an axon.
The internal volume of the neuron is separated from the surrounding environment by the
cellular membrane.
The soma is a compact section which contains the nucleus and several other organelles,
which, among other things, provide the energy to the neuron.
The dendrites are thin projections that arise from the cell body, forming a complex branched
structure. These projections are generally thin and extend far away from the soma; the
structure is often called the dendritic tree because its shape resembles the roots of a tree.
The dimensions and shape of the dendritic tree can vary widely even within the same organism,
according to the purpose of the neuron and its location in the body.
The axon is another projection that also originates from the soma but is much longer than the
dendrites. Its diameter is roughly constant and it eventually ends with multiple terminations.
A stylised depiction of a biological neuron is given in Figure 1.1.
Figure 1.1: The biological neuron (labelled parts: dendrites, soma, axon, axon terminals).
Adapted from Nicolas Rougier, 2007 (CC BY-SA).
The neuron is identified as the primary functional unit of the brain and of the nervous
system in general. From the functional point of view, the dendrites are the neuron inputs,
whereas the axon terminals are the output gateways of the neuron. The input signals gathered
by the dendrites are summed, conveyed into the soma and then integrated over time by it.
Next, the resulting value is compared with a threshold. If the signal exceeds that
threshold, an action potential is generated and the neuron fires.
An action potential is a narrow spike generated in the soma that travels along the whole
axon. From the mathematical point of view, a biological neuron can be modelled as an
integrate-and-fire system.
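The integrate-and-fire description above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a model taken from this thesis: the leak term, the function name `simulate_lif` and every parameter value (time step, time constant, threshold, input amplitudes) are hypothetical and chosen only to make the behaviour visible.

```python
def simulate_lif(inputs, dt=1e-3, tau=20e-3, threshold=1.0, v_reset=0.0):
    """Leaky integrate-and-fire: integrate the input, fire at the threshold.

    All parameter values are illustrative assumptions, not thesis values.
    """
    v = v_reset
    spike_times = []
    for step, i_in in enumerate(inputs):
        # Forward-Euler step of dv/dt = (-v + i_in) / tau
        v += dt * (-v + i_in) / tau
        if v >= threshold:
            spike_times.append(step * dt)  # the neuron fires
            v = v_reset                    # and its state is reset
    return spike_times


# A constant input above the threshold gives a regular spike train,
# while a sub-threshold input never makes the neuron fire.
strong = simulate_lif([1.5] * 1000)
weak = simulate_lif([0.5] * 1000)
print(len(strong), len(weak))
```

The reset after each spike is what makes the output "all or nothing": the input amplitude changes the spike rate, not the spike itself.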
Note that signals coming from the dendrites can have different weights. This is because
the path from the dendrite to the soma is passive. The strength of the signal at the
soma (before summation and integration) can be computed using passive cable theory,
whose parameters depend on the physical dimensions of the path from the dendrite
to the soma. A block representation with n inputs and one output (y) is given in Figure
1.2.
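The block representation of Figure 1.2 (weighted inputs, a sum, and a sigmoid output function y = f(u)) can be written down directly. The sketch below assumes the logistic function as the sigmoid and uses made-up input and weight values; neither is specified in the text.

```python
import math


def neuron_output(inputs, weights):
    """Rate model of Figure 1.2: weighted sum followed by a sigmoid."""
    # u is the synaptic drive: each input X_i is scaled by its weight W_i
    u = sum(w * x for w, x in zip(weights, inputs))
    # ASSUMPTION: the sigmoid f is the logistic function
    return 1.0 / (1.0 + math.exp(-u))


# Hypothetical inputs and weights (one inhibitory, i.e. negative, weight)
y = neuron_output([1.0, 0.5, 0.0], [0.8, -0.4, 0.3])
print(y)
```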
But how are these signals created in biological neurons? Neuronal signals are represented
by the voltages set up by the ion gradients across the neuronal membranes. These
membranes, made of a lipid bilayer with embedded proteins, are selectively permeable to
ions and separate the interior of the neuron from the outside environment. The embedded
proteins act as active ion pumps and as ion channels. These peculiar abilities of the
membrane are responsible for most of the neuronal properties.
Figure 1.2: The mathematical model of the biological neuron, here depicted with n synapses and one axonal
termination. The inputs X1 ... Xn are weighted by W1 ... Wn (synapses, dendrites) and summed (soma);
u is the synaptic drive while y = f(u) is the output firing rate, with f a sigmoid function.
Ion               Intracellular [mM]   Extracellular [mM]
Potassium (K+)    140                  5
Sodium (Na+)      10                   145
Chloride (Cl-)    20                   110
Calcium (Ca2+)    0.0001               2
Table 1.1: Intracellular and extracellular concentrations of ions in mammalian neurons [mM]. Taken from
[19].
The ion pumps actively move ions against concentration gradients, whereas ion channels
allow certain kinds of ions to diffuse along their concentration gradient. Thus, two
opposite flows of ions can be identified, and they alter the internal ion concentration of the
neuron. The external concentration is assumed to be constant because the external volume is
large compared to the volume of the neuron body. Even though this ion flow never stops, it
reaches an equilibrium point at which the two ion flow rates are equal. Since this equilibrium
is reached with different ion concentrations inside and outside the neuron, a voltage is set
up across the membrane.
There are four main ion types inside and outside neurons, namely sodium (Na+),
potassium (K+), chloride (Cl-) and calcium (Ca2+). At the equilibrium point the intracellular
concentration of potassium is greater than that of the other ions (see Table 1.1) and
the membrane potential can be calculated with the Goldman equation (1.1).
$$V_{mem} = \frac{RT}{F} \ln\left( \frac{P_{Na^+}[Na^+]_{out} + P_{K^+}[K^+]_{out} + P_{Cl^-}[Cl^-]_{in}}{P_{Na^+}[Na^+]_{in} + P_{K^+}[K^+]_{in} + P_{Cl^-}[Cl^-]_{out}} \right) \tag{1.1}$$
Here P is the permeability of the membrane to a given ion, and the values in brackets
are the ion concentrations. R and F are the ideal gas constant and Faraday's
constant, while T is the absolute temperature in kelvin.
Hence, if the membrane potential is computed with the concentration values shown in Table
1.1, the voltage turns out to be around Vmem = −70 mV (measured with respect to the outside
of the cell).
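As a rough numerical check of Equation (1.1), the sketch below evaluates it with the concentrations of Table 1.1. The permeabilities are not given in the text, so the relative values used here (K : Na : Cl = 1 : 0.04 : 0.45, typical textbook resting values) and the body temperature of 310 K are assumptions; calcium is left out because it does not appear in Equation (1.1). With these assumptions the result is negative and a few tens of millivolts in magnitude, in the neighbourhood of the resting value quoted above; the exact figure depends on the assumed permeability ratios.

```python
import math

R = 8.314    # ideal gas constant [J/(mol K)]
F = 96485.0  # Faraday's constant [C/mol]

# Concentrations from Table 1.1 [mM]: (intracellular, extracellular)
CONC = {"K": (140.0, 5.0), "Na": (10.0, 145.0), "Cl": (20.0, 110.0)}

# ASSUMPTION: relative resting permeabilities, not given in the text
PERM = {"K": 1.0, "Na": 0.04, "Cl": 0.45}


def goldman_potential(conc, perm, temperature=310.0):
    """Membrane potential in volts according to Equation (1.1)."""
    # Cations contribute their outside concentration to the numerator,
    # the anion (Cl-) contributes its inside concentration.
    num = (perm["Na"] * conc["Na"][1] + perm["K"] * conc["K"][1]
           + perm["Cl"] * conc["Cl"][0])
    den = (perm["Na"] * conc["Na"][0] + perm["K"] * conc["K"][0]
           + perm["Cl"] * conc["Cl"][1])
    return (R * temperature / F) * math.log(num / den)


v_mem = goldman_potential(CONC, PERM)
print(f"Vmem = {v_mem * 1e3:.1f} mV")
```

Setting the sodium and chloride permeabilities to zero reduces the expression to the Nernst potential of potassium alone, which is a convenient sanity check.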
What happens across the cell membrane is a common mechanism in physics
that has several analogies. To bridge it to the electronics world, I give
here an example based on p-n junctions. A p-n junction is the interface
between n-type silicon and p-type silicon. The two materials have different
densities of holes and electrons because of the doping process. When they
are in contact, a carrier gradient is established and carriers flow from the
material with higher density to the one with lower density. This motion
creates a flux of carriers and leaves behind ionized atoms that carry an
electrical charge. These atoms are static, in the sense that they cannot move
from their position in the silicon lattice, and their charge is constant. Hence,
these static ions set up an electric field that produces an opposite carrier
flux and counterbalances the diffusion flux. At equilibrium the two flows are
equal and the net carrier flow is zero. However, in this condition, a voltage
is established across the silicon interface. It can be calculated from the
Boltzmann distribution, which is a special case of the Goldman equation.
When the signals coming from the dendrites are summed and integrated, they affect
the intracellular ion concentrations and thus alter the membrane potential in the soma.
Hence, if a threshold voltage is reached, a spike called an action potential is generated.
But how can an action potential be generated by the neuron? This can happen because the
permeability of the ion channels is modulated by the ion concentrations in the neuron body.
In fact, ion channels are sensitive to the membrane potential, and their ion conductivity
changes accordingly, increasing for some channels and decreasing for others. This modulation
in conductivity results in a change of the ion concentration gradients and hence in a change
of the potential difference across the neuronal membrane.
Usually there are several types of ion channels that exhibit different dynamics,
but two of them are dominant. When the threshold is reached, Na+ channels
quickly open, increasing the flow of Na+ ions and thus driving the membrane potential up to
a positive value of about +40 mV. Subsequently, the K+ channels open, increasing their
conductivity, counterbalancing the previous Na+ effect and bringing the membrane potential
back to its resting state.
These and other dynamics add up over time and, if measured with an oscilloscope,
result in a voltage spike (see Figure 1.3). In this spike we can recognize different phases,
such as the depolarization of the membrane, the repolarization of the membrane and a
refractory period that resets the neuron to the resting state. These phases are the direct
result of the ion channel dynamics.
Note that the amplitude and shape of these spikes do not depend on the stimuli that
generate them (as long as the threshold voltage is reached). This is what is commonly
referred to as the "all or nothing" principle.
After being generated, the spikes travel through the whole axon and reach its end without
losing their intensity. In fact, along the axon the signal is restored at regular
distances in order to prevent signal degradation. This resembles a serial digital
communication over a single line.
Actually this is a very simple model that does not take into account several other dynamics.
For example, neurons exhibit spike-frequency adaptation, meaning that the interval between
spikes in response to a sustained train of stimuli gets longer and longer. Another example is
the homeostatic property, the subject of this thesis work, which will be explained later.
However, all these dynamics arise from the modulation of the membrane ion permeability.
Figure 1.3: The qualitative action potential of a generic neuron. Vertical axis: membrane potential [mV],
with ticks at +40, 0, -55 and -70; horizontal axis: time [ms], from 0 to 5. Labels in the plot: stimulus,
threshold, depolarization, action potential, repolarization, refractory period, resting state.
Of course, neurons do not work in isolation but gather together to form a complex network.
In this network, the neural output signals generated by the soma travel through the
whole axon up to its end. There, an electrochemical connection called a synapse connects
the terminal part of the axon to the dendrite of another neuron, establishing a connection
between the two neurons. This happens countless times in a complex nervous system, resulting
in a tangled net in which information propagates from one neuron to another and so on.
Synapses are hence biological structures that allow a neuron to communicate with
another neuron. Synapses can be either excitatory or inhibitory. This means that they
determine whether the signal received from the axon of the previous neuron contributes in
a positive (or negative) way, making the neuron more (or less) likely to fire an action potential.
1.2 Neural Networks
An artificial neural network is a mathematical model inspired by biological neural networks
that consists of interconnected groups of artificial neurons. It is based on simple processing
elements (neurons) and it exhibits a complex behaviour determined by the connections between
the elements and their relative connection weights. Neurons in neural networks can be
grouped in clusters called layers. The layers differ from each other in the function they
perform in the network. For example, the feedforward neural network shown in Figure 1.4
consists of three layers of neurons. The input layer is responsible for gathering the stimuli
from the external environment, the hidden layer is responsible for processing the information,
and the output layer provides the processed signals to the external world.
Figure 1.4: A schematic of a feedforward neural network with three layers (input, hidden and output).
These layers consist of 3, 4 and 2 neurons respectively. Every neuron in a layer is connected to all the
neurons in the previous layer.
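The 3-4-2 feedforward structure of Figure 1.4 can be sketched as two fully connected layers applied one after the other. This is only an illustration: the logistic activation, the random weights, the seed and the input pattern are all assumptions, and the input layer is treated as simply passing the stimuli on to the hidden layer.

```python
import math
import random


def sigmoid(u):
    return 1.0 / (1.0 + math.exp(-u))


def layer_forward(inputs, weights):
    """Each row of `weights` holds the synaptic weights of one neuron."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs))) for row in weights]


def random_weights(n_out, n_in, rng):
    # Weights drawn uniformly between -1 and +1 (illustrative choice)
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]


rng = random.Random(0)                 # arbitrary fixed seed
w_hidden = random_weights(4, 3, rng)   # 3 inputs  -> 4 hidden neurons
w_output = random_weights(2, 4, rng)   # 4 hidden  -> 2 output neurons

stimulus = [0.2, 0.7, 0.1]             # hypothetical input pattern
hidden = layer_forward(stimulus, w_hidden)
output = layer_forward(hidden, w_output)
print(output)
```

Every neuron of a layer reads all the outputs of the previous layer, which is why each weight matrix has one row per neuron and one column per input.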
One of the main characteristics of biological neural networks is their intrinsic
plasticity. A neural network is said to be plastic if the strength of the connections among
neurons can be modified. In the artificial neural network model these strengths
are stored in parameters called synaptic weights. These parameters can be updated according
to the system input and to other events. Generally the synaptic weight ranges from
-1 to +1. If it is negative the synapse is called inhibitory; if it is positive the
synapse is excitatory (recall the end of Section 1.1). These weights shape how the data are
processed and are responsible for the different behaviours of a neural network. One of the
most interesting and promising applications of neural networks is the development of machine
learning algorithms.
Learning means, given a specific task and a class of functions, using a set of observations
in order to find the function in the given class that solves the problem in an optimal way. An
example of learning in neural networks is the capability of these systems to recognise letters
and numbers in real time, even against difficult backgrounds. This task is performed without
explicitly programming the neural network, but simply by providing it with sufficient input
statistics. Other appealing applications of neural networks range from speech recognition
and classification to robotic sensing and data processing.
The great capabilities of these systems do not lie at the neuron level but, on the contrary,
at the system level. This computational power is strictly related to the number of connections
in the system and to the plasticity of the network.
The ability of a plastic system to update its weights is governed by this simple rule: the
connection between two neurons is reinforced if they fire at the same time, and weakened otherwise.
This property is called synaptic plasticity and was first proposed by Donald Hebb in 1949,
hence the name Hebbian theory. We have Long Term Potentiation (LTP) if the synaptic strength
increases over time, or Long Term Depression (LTD) if it weakens over time. The weights
are modified by the network itself according to its input statistics (unsupervised learning), and
at the end of this learning period the weights of the system are set and their values should not
vary appreciably over time. Hence, the relative weights among synapses are critical and
contain information.
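The rule stated above (reinforce on coincident firing, weaken otherwise) can be turned into a toy weight update. The sketch below follows that wording literally; the fixed step size `lr`, the function name and the clipping of the weight to the range -1 to +1 are illustrative assumptions, and real Hebbian learning rules are usually more refined.

```python
def hebbian_update(weight, pre_fired, post_fired, lr=0.1, w_min=-1.0, w_max=1.0):
    """One update of a synaptic weight following the rule in the text."""
    if pre_fired and post_fired:
        weight += lr   # coincident firing: potentiation (LTP)
    else:
        weight -= lr   # otherwise: depression (LTD)
    # Keep the weight inside the allowed range
    return max(w_min, min(w_max, weight))


# Hypothetical firing history of a pre/post neuron pair
w = 0.0
for pre, post in [(True, True), (True, True), (True, False), (True, True)]:
    w = hebbian_update(w, pre, post)
print(w)
```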
Synaptic plasticity is probably the most important feature of neuronal networks. This
peculiar characteristic of neural networks is the basis of memory and of their ability to
learn without being explicitly programmed.
In biological neural systems these connection weights depend both on the physical
distance of the synapses from the soma (see Section 1.1) and on the chemical properties of the
synapses. But these weights are not fixed: synaptic strengths change according to the
received stimuli.
After this short introduction it should be clear that, unlike computation in the von Neumann
model, artificial neural networks process signals in a highly parallel way. An independent
program memory does not really exist: process flow and memory are both embodied in the
synaptic weights of the neural network.
1.3 Neuromorphic Engineering
Neuromorphic engineering is the branch of electrical engineering that studies how
to mimic the neuro-biological architectures present in the nervous system and reliably build
them in silicon. The original concept was developed by Carver Mead at Caltech in the late
1980s and condensed in [1].
In particular, neuromorphic engineers build circuits that emulate the behaviour
observed in biological neurons, retinas, cochleas and so on. These basic circuits are then
used as building blocks for networks of such instances in order to process information.
However, one of the main goals of neuromorphic engineering is that not only should these
building blocks behave as observed in nature, but the same philosophy should also be applied
at the system level. Hence, neuromorphic circuits should be connected as closely as possible
to the way nature arranges neuronal cells in the nervous system. This close similarity in
operation allows engineers to build systems that are efficient in the way nature's are.
On the other hand, neuromorphic hardware is a useful tool for neuroscientists too.
In fact, they can perform experiments that would otherwise be impossible to carry out with
real neurons. For instance, they can define exactly how many neurons are in their population
and how they are connected, perform experiments and evaluate their hypotheses. It is a kind
of symbiotic interaction between engineers and neuroscientists in which both benefit from the
collaboration, and this is one of the reasons this emerging field has become so relevant.
Figure 1.5: Biology and Engineering symbiotic interaction (diagram labels: BIOLOGY, ENGINEERING;
BIO-INSPIRED; ANALYSIS, INSTRUMENTATION, DESIGN, REPAIR). Rahul Sarpeshkar.
Here I report what I believe is a meaningful example of how nature can inspire the solution
of technological issues. Moreover, this particular example is closely related to the
focus of my thesis work.
In present-day silicon processes the MOS dimensions are approaching their
minimum physical value. This means that the geometrical dimensions of a
regular MOS, namely W and L, are shrinking. Currently the minimum MOS
size is just a few orders of magnitude larger than the silicon atom.
This is an advantage for the number of devices that can be integrated
into a single die, but it is a serious problem when silicon inhomogeneities
arise. In fact, imperfections such as impurities, flaws in the silicon crystal
lattice and asymmetries in process layers generally degrade circuit
performance. Indeed, identical instances of designed elements (MOS transistors,
capacitors, resistors) behave differently from each other because of material
variance. This is the well-known mismatch problem. These differences must be
taken seriously into account by designers because they usually add up and hence
lower the performance of the whole system.
Since these problems arise with minimum-size transistors, one possible
solution is to make the area of sensitive devices bigger. For instance, for a
MOS you can multiply both W and L by the same factor, say α. This
is effective, even though it has some drawbacks, but it is not how Nature
would fix the problem. In fact, in the real world it is also hard to have
a perfect material down to the atomic scale, and this is true of our brains
too. Our neurons are far from being exact copies of each other. But
Nature doesn't fix the problem by simply making bigger neurons, because
that would be a waste of volume. Neuroscientists speculate that Nature
has designed our brain with countless connections between neurons also in
order to overcome this mismatch issue.
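The trade-off behind scaling W and L by α can be made concrete with the widely used Pelgrom area law, in which the standard deviation of the threshold-voltage mismatch is inversely proportional to the square root of the gate area. This model is not stated in the text and is brought in here only as an illustration; the matching coefficient `A_VT` and the device sizes are hypothetical values.

```python
import math

# ASSUMPTION: matching coefficient of 5 mV*um, expressed in V*m
A_VT = 5e-3 * 1e-6


def sigma_vth(width, length, a_vt=A_VT):
    """Standard deviation of the Vth mismatch (Pelgrom area law) in volts."""
    return a_vt / math.sqrt(width * length)


w, l = 0.5e-6, 0.5e-6   # hypothetical small device [m]
for alpha in (1, 2, 4):
    s = sigma_vth(alpha * w, alpha * l)
    area = (alpha * w) * (alpha * l)
    print(f"alpha={alpha}: sigma_Vth={s * 1e3:.2f} mV, area={area * 1e12:.2f} um^2")
```

Under this model the mismatch improves only linearly with α while the area grows as α squared, which is the drawback the paragraph above alludes to.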
The computational power of our brain doesn't lie at the neuron level; on the contrary, it must
be sought at the network level. These countless connections are the key to the wonderful
behaviour and capabilities of the brain.
Returning to our example, neuromorphic designers should fix the mismatch of silicon devices not
by increasing their size but, on the contrary, by keeping the area compact and massively
interconnecting the neuronal instances.
Another mechanism of Nature that helps to fix the mismatch problem is network plasticity.
Plasticity in the brain is not only related to the learning mechanisms, but also modifies
the strength of neurons in order to keep the neuronal network in working conditions in the face
of system variations, such as mismatch. Implementing this plasticity in hardware would hence
be another great way to apply bio-inspired mechanisms in silicon design. This type of
plasticity is called homeostatic plasticity and will be further discussed in Section 1.4.
However, the main reason that drove early scientists towards neuromorphic
engineering is that, even with the most powerful supercomputers available, we can't perform
tasks such as pattern recognition, classification, motor control and so on as efficiently and
effectively as animals do. The reason for this huge difference is that computers and brains
are based on different principles and paradigms.
In the next section I will first highlight the differences between the brain and
non-neuromorphic computers (supercomputers), then I will give a quick overview of the different
approaches and philosophies that drive neuromorphic engineers. The further discussion of
neuromorphic hardware will focus on neuromorphic cognitive systems rather
than neuromorphic sensors or interfaces, i.e. on architectures that can reason about the actions
to take in response to combinations of external stimuli, internal states and behavioural
objectives. This kind of hardware is directly inspired by the brain.
1.3.1 Supercomputer vs neuromorphic hardware
As already shown in the previous section, the brain processes information mainly in
an analogue fashion. The key entities that carry information are the spikes, and those signals
are processed throughout a 3D (volumetric) neural network.
On the contrary, information in supercomputers is always associated with digital values, and
their spatial domain is a flat piece of silicon, which reduces the networking capabilities.
These two systems, neural networks and supercomputers, rely on completely different architectures.
Supercomputers are based on the von Neumann paradigm, i.e. there is a digital CPU
that processes input data according to a stored program located in a memory next to the CPU.
This architecture is hard-wired and modifications of the topology are not usually expected.
The instruction fetch-and-execute flow is heavily sequential and no more than one operation
can be performed at once (with one core). Even though these machines could theoretically
be built in the analogue domain, their natural implementation is the clocked digital
domain. Digital because the information is represented by sequences of zeros and ones, and
clocked because the timing of the operations is precisely set by a high-frequency reference
(in the GHz range in modern devices). The strength of this paradigm is that it is very robust to
noise and can easily handle large amounts of data. But it is very power hungry, and for some
particular tasks it is very inefficient. Usually, supercomputers are excellent at providing
exact solutions to well-defined problems.
Neuromorphic hardware is completely different. It is based on analogue devices,
and signals are digitized, in a sense, only to transmit information into and out of the chip.
Digital signals are not involved in the computational process itself. The architecture is
based on neural networks that are massively parallel, meaning that all the artificial neurons
work simultaneously. The timing of the computation is no longer set by a clock; computation is
performed in an event-driven manner.
Another big difference is that on a supercomputer you have to write very precise and flawless
code in order to get the proper behaviour. On neuromorphic cognitive hardware you don't
have a program in the proper sense; you use the learning paradigm to make it perform useful tasks.
Another great consequence of how Nature builds brains is that they can still work even
if a neuron is damaged, whereas a supercomputer could become useless even if a single transistor
fails. This nice characteristic can be observed even in large artificial neural networks
and is really useful for coping with unavoidable fabrication failures or inhomogeneities.
From this discussion it is clear that we can model the dynamics of a single neuron quite well.
Thus, simulating a neural network by solving its equations on a supercomputer would give
exact results and the same behaviour as the modelled neural network. But the power
consumption and the computation time would be excessive, preventing real-time and
practical computation. On the contrary, neuromorphic hardware emulates rather than
simulates.
Weak Inversion Vs Strong Inversion
Voltage-mode Vs Current-mode
Non-clocked Vs Switched Capacitor
Biophysical model Vs Phenomenological model
Real-time Vs Accelerated-time
Table 1.2: Silicon neurons design styles.
Emulation and simulation are two different concepts. A simulation is a conceptual model
that represents reality and only reproduces a certain input-output black-box behaviour.
An ordinary supercomputer can simulate neuronal networks in the sense that it behaves like
them, even though the underlying mechanisms are totally different. On the contrary, an
emulation aims to mimic the behaviour of a system exactly at a lower level, even though it is
based on different means.
1.3.2 Silicon Neurons
Several artificial neurons have been developed since the very beginning of the field. Of course, as
in every branch of engineering, trade-offs between specifications are unavoidable, hence different
design styles were developed. The following comparison is taken from [2] and summarizes
the main neuromorphic techniques used to build silicon neurons.
From the structural point of view, silicon neurons can all be described by the schematic
of Figure 1.2. The synapses receive the spikes, integrate them over time and sum their
output currents. The soma block receives the summed current from the synapses and, if
this current crosses a threshold, an event, namely a spike, is generated at the output of the
neuron.
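The synapse/soma structure described above can be sketched behaviourally in a few lines of Python. This is only an illustrative model, not the circuit of Figure 1.2: the first-order synapses, the leaky soma, the reset to zero and all parameter values (`tau_syn`, `tau_mem`, `threshold`, `weights`) are assumptions chosen for the example.

```python
def simulate_neuron(spike_trains, weights, tau_syn=0.010, tau_mem=0.020,
                    threshold=1.0, dt=0.001):
    """Return the output spike train (list of 0/1) of one neuron-like unit.

    spike_trains: one list of 0/1 samples per synapse, all of the same length.
    weights:      one gain per synapse (hypothetical, dimensionless).
    """
    n_steps = len(spike_trains[0])
    syn_currents = [0.0] * len(spike_trains)  # one integrator state per synapse
    v_mem = 0.0                               # state of the soma
    output = []
    for t in range(n_steps):
        # Synapses: each one integrates (low-pass filters) its own input spikes.
        for i, train in enumerate(spike_trains):
            syn_currents[i] += -syn_currents[i] * dt / tau_syn + weights[i] * train[t]
        # The synaptic output currents are summed before reaching the soma.
        i_total = sum(syn_currents)
        # Soma: leaky integration of the summed current, then threshold and reset.
        v_mem += (i_total - v_mem) * dt / tau_mem
        if v_mem >= threshold:
            output.append(1)
            v_mem = 0.0
        else:
            output.append(0)
    return output


strong = simulate_neuron([[1] * 200], weights=[1.0])   # strongly driven neuron
weak = simulate_neuron([[1] * 200], weights=[0.05])    # drive stays below threshold
silent = simulate_neuron([[0] * 200], weights=[1.0])   # no input spikes at all
```

With these assumed values the strongly driven neuron emits output spikes, while the weakly driven and the silent ones never cross the threshold.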
As already emphasized, neurons exhibit complex dynamics over time. To mimic
such complex behaviour an artificial neuron requires more circuitry and a relatively large area.
This reduces the number of neurons that can be packed into a chip, and hence the size of the
neural networks that can be emulated. Several neuronal models were therefore developed: some
of them are cumbersome but behave very realistically, while others are compact but have less
faithful dynamics.
Some opposing techniques that are common among neuromorphic engineers are summarized in
Table 1.2. Each of the computational blocks of Figure 1.2 can be implemented in any of these
design styles.
Weak and strong inversion refer to the operating point of the MOS transistors.
Strong inversion is the usual above-threshold bias point. Weak inversion is also called
subthreshold; in this condition the channel is not completely formed. Strong
inversion usually requires more current to bias the MOS, is much less sensitive to mismatch, and
its Ids vs Vgs characteristic is a square law. Weak inversion is more suitable for low-power
operation, and its Ids vs Vgs characteristic is an exponential curve.
Voltage-mode or current-mode circuits are those in which the meaningful variable is
represented, respectively, by a difference in potential or by a flow of electrons. Current-mode
circuits are more appropriate for low-power operation because they are less sensitive to the
trend of decreasing power supply voltages.
Non-clocked signals are continuous in time; conversely, processing can be done in the analogue
domain but in quantized, clocked time. This happens, for instance, in switched-capacitor
circuits. These circuits work with a two-phase clock. In the first phase, the charges that
carry the information are stored on a capacitor and (ideally) cannot move. Then, in the other
phase, switches are closed and the charges can flow freely according to the circuit dynamics set
by the topology. This technique is more complex but allows better matching and precision.
Biophysical and phenomenological models refer to the level of detail at which
circuits emulate neuronal behaviour. As already pointed out, this trade-off is strictly related
to the area used and hence to the number of neurons that can be integrated on a single die.
Circuits that operate on time scales comparable to biological ones (on the order
of milliseconds) are called real-time. Conversely, circuits that run on time
scales at least 10 times faster are called accelerated-time. This distinction is
important: even though simulations of neural behaviour can be run even in accelerated
time, interaction with the real world through sensors can be performed efficiently
only on real-time scales.
An example of a silicon neuron is shown in Figure 1.6. This is the conceptual
schematic of the neurons integrated in the MN256r1 chip developed by the NCS group.
[Figure 1.6 schematic: labelled nodes include the input current Iin, the membrane capacitor Cmem, the voltages Vlk, Vthr, Vlkahp, Vthrahp, Vahp and Vref, and the handshake signals /REQ and /ACK.]
Figure 1.6: An integrate & fire artificial neuron based on the DPI [5].
This artificial neuron consists of four blocks. The yellow block is the synapse of the neuron
and is based on the Differential Pair Integrator (DPI) circuit (a detailed explanation of this
circuit is provided in Section 2.4). This DPI takes the input current, integrates it and charges
the capacitor Cmem accordingly. The DPI acts as the neuronal input synapse, while the voltage
across Cmem is the membrane potential. Note that a single neuron can have several input
synapses, hence several DPIs in parallel.
The red block generates the spike events. The blue block is in charge of resetting
the neuron and of providing the digital request and acknowledge signals for the inter-chip
neuronal communication (AER). The green block is another DPI (in this case it does not act
as a synapse) in charge of setting the temporal dynamics of the neuron.
According to Table 1.2, this circuit can be categorized as a weak-inversion, current-mode,
non-clocked, phenomenological circuit. Real-time or accelerated-time operation can be chosen
through the capacitor values.
1.3.3 MN256r1 chip
The MN256r1 is the chip developed by the NCS group at INI. Its main purpose is to
emulate a medium-size neuronal network with plastic and cognitive capabilities.
It consists of 256 neurons, each of which has 256 plastic synapses (DPIs) and 256 non-plastic
synapses (basically a simpler version of a DPI in which the weights are not settable).
In order to form a neural network, all these neurons are connected to each other. This is a
challenging task due to the large number of connections in the chip. The routing is implemented
via the AER (Address-Event Representation) protocol, a sort of "virtual wiring"
technique for interconnecting spiking circuits in neuromorphic systems. In the AER protocol,
whenever a neuron in the chip generates a spike, its "address" is written on a high-speed
digital bus and sent to the receiving neuron(s). Note that here the digital part
does not interfere with the signal processing.
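The "virtual wiring" idea can be illustrated with a small software model. The sketch below is only a conceptual analogy of address-event routing: the class `AERBus`, its methods and the lookup-table routing are hypothetical names introduced for this example and do not describe the actual bus, handshake or address format of the MN256r1.

```python
from collections import defaultdict, deque


class AERBus:
    """Toy model of address-event routing: spikes travel as (time, address) events."""

    def __init__(self):
        self.routes = defaultdict(list)  # source address -> list of destination addresses
        self.queue = deque()             # events waiting on the shared digital bus

    def connect(self, src, dst):
        """Create a 'virtual wire' from neuron src to neuron dst."""
        self.routes[src].append(dst)

    def emit(self, timestamp, src):
        """A neuron spikes: only its address (and time) is written on the bus."""
        self.queue.append((timestamp, src))

    def deliver(self):
        """Route every pending event to all the destinations of its source."""
        delivered = []
        while self.queue:
            timestamp, src = self.queue.popleft()
            for dst in self.routes[src]:
                delivered.append((timestamp, src, dst))
        return delivered


bus = AERBus()
bus.connect(3, 7)
bus.connect(3, 9)
bus.emit(0.001, 3)   # neuron 3 spikes and reaches neurons 7 and 9
bus.emit(0.002, 5)   # neuron 5 spikes but has no outgoing virtual wires
events = bus.deliver()
```

A single shared bus thus replaces the dedicated wires between neurons, and the connectivity is defined entirely by the routing table.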
The chip has a programmable bias generator with 64 independent outputs,
whose currents range from 50 fA to 20 µA. The chip die size is 9 mm × 5 mm.
In order to create a cognitive neural network, a full understanding of real neurons at the
cellular level and of their complex connections is required. Technological aspects are also
important, and they were one of the bottlenecks that delayed the development of neuromorphic
hardware. It is clear, however, that this emerging field of science is a beautiful merge of
different topics and requires a convergence of knowledge in order to progress further.
[Figure 1.7 block diagram: AER input, bias generator, neural core, multiplexer and AER output. Legend: inhibitory synapse, excitatory synapse, homeostatic plasticity, I&F neuron.]
Figure 1.7: High level conceptual schematic of the MN256r1 chip.
1.4 The Homeostatic Principle
The word "homeostasis" originates from the union of two Greek words: hómoios = similar
and stásis = standing still. Homeostasis is the property of a system that tends to hold its own
behaviour in a stable condition in the face of environmental fluctuations or input changes. It is
a different concept from "dynamic equilibrium", which is not regulated, and from "steady
states", which are stable but can jump into different states if energy is applied. Indeed,
homeostasis forces the system to its only equilibrium point, which is asymptotically stable.
The homeostatic principle is a general mechanism of physiology that was first described
by Cannon in the late 1920s. Much more recently, Turrigiano et al. [4] showed that it also
applies to populations of plastic neurons.
In [4] it is shown that this principle is fundamental for maintaining the stability of a neural
network and letting the whole system work properly. As already emphasized, the learning
mechanism of neural networks based on the Hebbian principle updates the synaptic weights. It is
basically a positive feedback, in the sense that it tends to destabilize the system by forcing the
synaptic weights towards their maximum or minimum value (±1). Electronically speaking, this
means that the synapses approach a high or a low gain, respectively. Hence, these plasticity
mechanisms lead the system towards runaway excitation or quiescence.
One way to visualize this problem is to consider a neural network made of five layers
connected in a feedforward way, as shown in Figure 1.8. There you can see that if the gain between
two adjacent layers is too high then, for a certain input, the network output will saturate
and lose its information. This means that the neurons of the last layer will always fire
at a high rate, almost independently of the input signals. Vice versa, if the
gain is too low, the system will be pushed into a high-attenuation state and the output firing
rate will approach zero over time.
This problem arises directly because the synapses are highly plastic and their weights can
change remarkably according to their input signals. Hence, setting the initial conditions in the
right state would not prevent the system from falling into saturation.
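The saturation and attenuation effects can be reproduced with a toy model of the five-layer chain. This is a minimal sketch under simplifying assumptions that are not part of the original analysis: each layer is reduced to a single firing rate, the per-layer gain is constant, and the rate is clipped at a hypothetical maximum `max_rate`.

```python
def propagate(rate_in, gain, n_layers=5, max_rate=100.0):
    """Return the firing rate at the output of each layer of a feedforward chain."""
    rates = []
    rate = rate_in
    for _ in range(n_layers):
        # Each layer multiplies the incoming rate by the gain; rates cannot be
        # negative and cannot exceed the (assumed) maximum firing rate.
        rate = min(max(gain * rate, 0.0), max_rate)
        rates.append(rate)
    return rates


saturated_a = propagate(10.0, gain=2.0)   # ends at max_rate
saturated_b = propagate(50.0, gain=2.0)   # different input, same final rate
attenuated = propagate(10.0, gain=0.5)    # the rate shrinks layer after layer
preserved = propagate(10.0, gain=1.0)     # unity gain keeps the information
```

With a gain of 2 both inputs end up at the same saturated rate, so the last layer no longer distinguishes them, while with a gain of 0.5 the rate decays towards zero.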
Figure 1.8: This illustrates how a gain higher or lower than 1 can distort the input signals if processed by a
chain of neurons. (a) Neural network (b) Output ﬁring rate from each layer. Saturation. (c) Output ﬁring
rate from each layer. Attenuation. Image taken from [4].
On the contrary, the homeostatic principle is a negative feedback that tends to stabilize the
system in the face of input changes or environmental variations. It senses the output of the
neuron and changes its synaptic gain accordingly. This principle is represented in Figure
1.9.
If we interpret the plot at a fixed time, it describes the steady-state (static) behaviour of
the system. If the synaptic drive is low then the firing rate is low, and vice versa. The static
dependence between these two variables is not linear, and where the curve gets flat
information is distorted. This happens when the synaptic gains are very high or very low, which is
the condition towards which the Hebbian mechanism would pull.
On the contrary, the homeostatic principle forces the synaptic drive to lie in the inner region
of the plot, where the behaviour is linear. The highlighted zone is the target firing rate, and the
two arrows pointing towards each other show the dynamics of the homeostatic principle.
Even though there could be several types of homeostatic principle (additive, multiplicative),
the multiplicative one is the one I will consider from now on. Multiplicative homeostasis
changes the synaptic drive by multiplying or dividing all the synaptic
weights of one neuron by a common factor. Hence, the relative synaptic strengths are kept
constant, while the absolute strengths are increased or decreased. This is really important,
otherwise the "program" learnt by the network and stored in the synaptic weights would be
distorted or lost.
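Multiplicative scaling can be sketched as a slow gain-control loop. The update rule, the `step` size and the linear neuron model (firing rate proportional to the sum of the weights) are assumptions made only for this illustration; they are not the rule implemented by the circuit described in this thesis.

```python
def scale_weights(weights, firing_rate, target_rate, step=0.1):
    """Scale all the weights of one neuron by the same factor (multiplicative)."""
    # Factor > 1 when the neuron fires too little, < 1 when it fires too much.
    factor = 1.0 + step * (target_rate - firing_rate) / target_rate
    return [w * factor for w in weights]


def run_homeostasis(weights, input_rate, target_rate, n_iter=200):
    """Iterate the scaling with a hypothetical linear neuron model."""
    for _ in range(n_iter):
        firing_rate = sum(weights) * input_rate  # assumed neuron: rate = gain * input
        weights = scale_weights(weights, firing_rate, target_rate)
    return weights


initial = [0.2, 0.4, 0.8]
final = run_homeostasis(initial, input_rate=50.0, target_rate=10.0)
final_rate = sum(final) * 50.0
```

In this toy loop the firing rate settles at the target, while the ratios between the weights (1 : 2 : 4) are unchanged, which is the property that preserves the learnt "program".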
Drawing again a parallel with the electronics world, there is an evident
analogy between the saturation of Figure 1.8 and analogue electronic filters. In fact, high-order
analogue filters consist of a cascade of second-order cells. The designer's first concern is the
position of the poles, but this is not sufficient to design a reliable filter. The gain
of each of those cells must also be set as close to unity as possible in order to prevent
distortion and to ensure a high SNR. The conceptual problem is exactly the one explained before
and relates to the attenuation and saturation of the stages.
However, the order of an analogue filter is seldom higher than 7-8, hence the constraints on the
gain precision are relaxed and the analogue designer does not need to build a gain control
circuit. On the contrary, in neural networks the cascade of neurons is remarkably longer and this
gain problem is relevant.
Figure 1.9: The synaptic input vs firing rate plot of a neuron. Image taken from [4].
In addition, the homeostatic principle is not only fundamental for holding the system in a
stable condition, but is also very useful for coping with environmental changes or imperfections
(see Section 1.3 about mismatch).
In the brain, for instance, this could happen when the temperature increases (for example
when you have the flu), or when ion concentrations change due to illness or other causes. In
silicon, for example, stability can be affected by mismatch or by variations in the references
(power supply variations due to loads being connected to or disconnected from the chip). As long
as these changes are not excessive, the homeostatic principle compensates for them.
In summary, the homeostatic principle is a global negative feedback that tends to keep the
system stable; it can be seen as an Automatic Gain Control. It has basically two benefits:
it keeps the system stable and it lets the system work even in real conditions (mismatch
and changes in the environment).
1.4.1 The state-of-the-art
As far as I know, only a few attempts to implement the homeostatic principle have been
made: [5], [11] and [12]. In the first article, the approach is to implement the homeostatic
principle via software that controls the hardware on the chip. This method makes it possible to
obtain arbitrarily long time constants ($10^4$) and is useful for performing generic experiments
on homeostasis in order to validate the model. Of course, such setups have limited capabilities
if the final goal is to develop very compact, low-power neuromorphic hardware.
The authors of [11] proposed an actual "on-chip" solution. However, the time constants of their
homeostatic loop (a few seconds) are not even comparable to those obtained with the mixed HW/SW
setup of [5]. Long time constants are crucial in order not to interfere with the
learning mechanisms and to exploit the behaviour of neural networks. A further analysis of why
their circuit cannot achieve high performance is given in Section 3.3. In [12] the authors
implemented homeostatic plasticity in silicon using the floating-gate MOS technique. Although
the performance of this implementation is really high (time constants of minutes),
floating gates require special processes and more circuitry.
As already emphasized, the challenging task is to obtain really long time constants in silicon.
This is difficult to achieve because, in a standard CMOS process, leakage currents and
second-order non-idealities must be taken into account. Hence, modified topologies and
structures must be examined. In addition, a careful layout must be drawn in order to
guarantee high performance.
1.5 The Thesis Aim
The aim of this thesis is to report the design of a circuit that implements the
homeostatic principle in silicon. The circuit will be integrated in the MN256r1 chip and
fabricated in a standard AMS 0.18 µm CMOS process.
In the brief introduction above I tried to give some insight into the trade-offs that are
dominant in neuromorphic engineering. Below I report the circuit specifications
that my design has to meet.
Desired Circuit Specifications:
• Ultra long time constant
To be obtained with a capacitor value of 1 pF. The exact value of the time
constant is not important, as long as it is at least several seconds. Variation of the time
constant across chip instances is not critical.
• Small silicon area
The maximum available circuit area, capacitor area excluded, is 16.5 µm × 50 µm.
• Low power consumption
No explicit upper bound was given, but recall that this circuit must be instantiated
for each neuron (currently 256). So, the lower, the better.
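To get a feeling for what the first specification implies, the following sketch estimates the bias current needed for a given time constant. It assumes the relation τ = C·UT/(κ·I) that is commonly used for subthreshold log-domain integrators; this formula, the value UT ≈ 25 mV and κ ≈ 0.7 are assumptions introduced here for illustration and are not part of the specification list above.

```python
def time_constant(current_a, cap_f=1e-12, u_t=0.025, kappa=0.7):
    """Time constant in seconds, assuming tau = C * U_T / (kappa * I)."""
    return cap_f * u_t / (kappa * current_a)


def required_current(tau_s, cap_f=1e-12, u_t=0.025, kappa=0.7):
    """Bias current in amperes needed to reach tau_s under the same assumption."""
    return cap_f * u_t / (kappa * tau_s)


# Smallest current of the on-chip bias generator quoted in Section 1.3.3 (50 fA).
tau_at_min_bias = time_constant(50e-15)

# Current needed for a 10 s time constant with the 1 pF capacitor.
current_for_10s = required_current(10.0)
```

Under these assumptions, 50 fA gives a time constant below one second, while 10 s would require a current of a few femtoamperes, which suggests why currents below the range of the regular bias generator are of interest.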
CHAPTER 2
CONCEPT AND CIRCUITS OF NEUROMORPHIC ENGINEERING
The purpose of this chapter is to recall the engineering concepts
that are essential in order to fully understand the design and the phenomena on which my
circuit is based. A more detailed and comprehensive analysis of these topics is given
in several books and scientific papers, some of which are listed in the bibliography. I
start with a review of semiconductor physics and of the MOS device, which is the key element
of modern integrated circuits. A special operating point of this device is analysed; it will be
the key to the low-current generator of my design. Then the log-domain filter technique is
presented, along with a few meaningful examples and circuits. Many derivations are given only
for nMOS transistors; the extension to pMOS is straightforward.
2.1 The Subthreshold MOS
The MOSFET is a unipolar active electronic device made of silicon that can be used
both as a digital switch and as an analogue amplifier. MOSFET is the acronym of
Metal-Oxide-Semiconductor Field-Effect Transistor, which recalls its physical structure. MOSFETs
are usually divided into two groups according to the type of carriers: nMOS if the current
consists of electrons, or pMOS if the current consists of holes.
The MOS consists of different layers of doped silicon. The starting point, however, is
intrinsic silicon, i.e. the pure form of silicon. Silicon is a semiconductor material and,
compared to metals, has low conductivity due to its low number of free carriers, namely electrons
(e) and holes (h). The concentration of these carriers (ni(e) and ni(h)) in intrinsic silicon
depends on the silicon energy gap (Eg) and on the absolute temperature (T) according
to the following law:
$$ n_i(e) = n_i(h) = n_i = C\,(k_b T)^{3/2}\, e^{-\frac{E_g}{2 k_b T}} \tag{2.1} $$
where C is a material dependent constant and kb is the Boltzmann constant.
To dope silicon means to add donor or acceptor atoms (usually phosphorus or boron, respectively)
to the intrinsic silicon lattice. The silicon is then no longer pure and is called
extrinsic silicon, since its electrical properties are modified. In fact, these two kinds of atoms
have respectively one more and one less valence electron than the silicon
atom. Hence n-type silicon, doped with phosphorus atoms, has additional free electrons. On the
contrary, p-type silicon is a silicon lattice doped with boron, and has more holes than
electrons. Both n-type and p-type silicon show an increase in conductivity proportional to
the number of added atoms.
The majority carrier concentration in extrinsic silicon is known, being approximately the number
of added impurity atoms, while the minority carrier concentration follows from the mass action law:
$$ n_e(e) \cdot n_e(h) = n_i^2 \tag{2.2} $$
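As a rough worked example of Equation 2.2, consider p-type silicon with an acceptor concentration of $N_A = 10^{17}\,\mathrm{cm^{-3}}$ and take $n_i \approx 10^{10}\,\mathrm{cm^{-3}}$ as an approximate room-temperature value for silicon. Both numbers are illustrative assumptions and are not taken from the process used in this thesis.

$$ n_e(h) \approx N_A = 10^{17}\,\mathrm{cm^{-3}}, \qquad n_e(e) = \frac{n_i^2}{n_e(h)} \approx \frac{\left(10^{10}\right)^2}{10^{17}}\,\mathrm{cm^{-3}} = 10^{3}\,\mathrm{cm^{-3}} $$

Under these assumptions the minority electrons are fourteen orders of magnitude fewer than the majority holes.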
But what is the physical structure of a MOS? Let us consider an nMOS.
The nMOS is a planar device built on a p-type silicon bulk (B) in which, just below
the surface, two small regions of n-type silicon are implanted, namely the Drain (D) and
the Source (S). The region between these two terminals is covered by silicon oxide (an insulator)
and then by polysilicon (a moderately good conductor), which forms the Gate (G) terminal.
See Figure 2.1. Because the device is symmetric, the source and drain terminals are
interchangeable: they are defined by their potentials rather than by their physical location.
In an nMOS device, the source terminal is the one at the lower potential, while the drain is at
the higher one. Hence, the current always flows from the drain to the source in an nMOS.
Vice versa, in pMOS devices the current flows from the source to the drain.
I start here with an intuitive description of how the nMOS works; equations are derived later,
only for the subthreshold region.
To begin with, let us consider an nMOS without the source and drain terminals. The
result is just a three-layer structure that consists of a polysilicon gate terminal,
an insulating layer and the bulk layer (p-type silicon). This structure acts as a
parallel-plate capacitor: the gate and the bulk are the two capacitor plates, while the
oxide is the dielectric.
Let us now apply a positive voltage between the gate and bulk terminals. Positive
charges then accumulate on the gate terminal. Since charges of the same sign repel each other,
most of the majority carriers in the p-type silicon (holes) are pushed away from
the bulk-oxide interface beneath the gate. While these majority carriers can move freely
in the silicon, the acceptor atoms to which they belong are fixed in the lattice structure. Thus,
these atoms become negatively charged ions, since the holes that neutralized them have moved away.
[Figure 2.1 drawing: p-type silicon bulk with a p+ bulk contact and n+ source and drain regions, oxide and polysilicon gate on top, depletion layer ions and electron channel beneath the oxide, channel length L, axes x, y, z.]
Figure 2.1: Lateral section of the nMOS in strong inversion. You can see the depletion layer and a thin
electron channel beneath the oxide. The distinction between source and drain terminals is arbitrary since no
potentials are applied.
These negative ions counterbalance the positive charges on the gate until an equilibrium is
reached, forming a layer of ions called the depletion layer (depleted of majority carriers).
At the beginning, the depth of this depletion layer is modulated by the gate
voltage, in the sense that the larger Vgb is, the larger the depletion layer depth Ldepletion.
But when this potential difference becomes strong enough, even the minority carriers of the
p-type silicon (electrons) are attracted by the gate and, since they are free to move, they
accumulate just under the oxide layer. If the gate voltage goes beyond a certain threshold voltage
Vth, the number of these minority free electrons becomes considerable, and they start to act
as an additional plate that shields the depletion layer from the electric field created by the
gate. Hence, beyond the threshold voltage, the depletion layer no longer grows,
while the layer of minority carriers does.
The threshold voltage Vth is set by the foundry according to its own process. It is usually
scaled down along with the power supply in order to keep the MOS able to switch on and
off. For instance, in the AMS 0.18 µm process the regular Vth of an nMOS is around 300 mV. This
threshold can be increased up to 550 mV by using a High Threshold Voltage (HVT) nMOS. This
consists of a regular nMOS with an additional mask in the process that adjusts the ion implants
in order to modify the threshold voltage.
Even though the threshold voltage is not adjustable by the IC designer, it is not
constant and exhibits second-order dependencies. The most important one is the so-called
body effect, which consists of a modulation of the threshold by the source-bulk voltage.
This important relation is given by the following equation:
$$ V_{th} = V_{th0} + \gamma \left( \sqrt{|V_{sb} + 2\phi_F|} - \sqrt{|2\phi_F|} \right) \tag{2.3} $$
where Vth0 is the threshold voltage at Vsb = 0, and γ and φF are process dependent
parameters.
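Equation 2.3 can be evaluated numerically to see how strongly the body effect shifts the threshold. In the sketch below only Vth0 = 0.30 V comes from the text (the regular nMOS threshold quoted above); the values of `gamma` and `phi_f` are hypothetical placeholders, since the real ones are process parameters that are not given here.

```python
import math


def threshold_voltage(v_sb, v_th0=0.30, gamma=0.4, phi_f=0.35):
    """Body-effect threshold of Equation 2.3 (volts).

    gamma [V^0.5] and phi_f [V] are illustrative, assumed values.
    """
    return v_th0 + gamma * (math.sqrt(abs(v_sb + 2.0 * phi_f))
                            - math.sqrt(abs(2.0 * phi_f)))


vth_at_0 = threshold_voltage(0.0)   # no body effect: equals v_th0
vth_at_1 = threshold_voltage(1.0)   # source 1 V above the bulk
```

With these assumed parameters, raising the source 1 V above the bulk increases the threshold by almost 0.19 V.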
Now let us add back the source and drain terminals, obtaining the usual MOS device. The
aforementioned layer of minority carriers under the gate is called the channel and effectively
connects the drain and source terminals. If this channel is well established (i.e. if
Vgs > Vth), a current, mainly driven by drift, can flow from the drain to the source.
To a first approximation, the intensity of this current depends on Vds.
On the contrary, if the channel is not yet formed (that is, Vgs < Vth), a small
current can still flow from the drain to the source, but now it is mainly driven
by diffusion, since there is a gradient in the electron concentration between the source and
drain terminals.
Let us now clarify what exactly diffusion and drift mean.
Single carriers (electrons or holes) in the silicon lattice have complex dynamics, mainly due
to thermal motion and modified by multiple collisions with atoms and by atomic forces. These
trajectories are very difficult to analyse, but can be seen as a realization of a stochastic
process called Brownian motion. Fortunately, currents in electronic devices involve fluxes
of large numbers of these particles, hence only their statistical average trajectories are
needed in order to describe the device behaviour. Therefore, the meaningful quantities in
the MOS analysis are current densities rather than the motion of single carriers.
The MOS behaviour is basically governed by two kinds of carrier motion in silicon.
They are both results of Brownian motion, but they are studied separately [1].
The drift motion happens when external forces, such as an electric field (E), are applied
to free charges in silicon. The carrier velocity is then given by:
$$ v_{drift} = \mu E \tag{2.4} $$
in which µ is the mobility of the particle in a given doped silicon lattice.
The diffusion motion arises when there is a gradient dn(·)/dx of the concentration n(·) in
the medium. Particles flow from the high-density region to the low-density region. The
diffusion velocity can be written in terms of the gradient of the particle distribution along
the x axis:
$$ v_{diff} = -\frac{1}{2\,n(\cdot)} \frac{dn(\cdot)}{dx}\, t_f\, v_T^2 \tag{2.5} $$
in which the negative sign arises because the net flow of particles is from the higher
density to the lower density, n(·) is the particle concentration, tf is the mean time between
particle collisions and vT is the average thermal velocity. Recall that at thermal equilibrium
a particle with mass m has a kinetic energy given by:
$$ \frac{1}{2} m v_T^2 = \frac{1}{2} k T \tag{2.6} $$
Hence, substituting Equation 2.6 in Equation 2.5 yields:
$$ v_{diff} = -\frac{1}{2\,n(\cdot)} \frac{dn(\cdot)}{dx} \frac{k T\, t_f}{m} = -D\, \frac{1}{n(\cdot)} \frac{dn(\cdot)}{dx} \tag{2.7} $$
in which D = kT tf/2m is called the diffusion constant.
Since the drift and diffusion mechanisms are different manifestations of the same phenomenon,
they are related by the Einstein relation:
$$ D = \frac{k T\, t_f}{2m} = \frac{k T}{q}\, \mu \tag{2.8} $$
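As a quick numerical check of the Einstein relation, the sketch below computes the factor kT/q (the thermal voltage) and the resulting diffusion constant. The physical constants are the exact SI values; the mobility of 0.135 m²/(V·s) is only an assumed, typical order of magnitude for electrons in lightly doped bulk silicon, not a parameter of the process used in this thesis.

```python
K_B = 1.380649e-23       # Boltzmann constant [J/K]
Q_E = 1.602176634e-19    # elementary charge [C]


def thermal_voltage(temp_k):
    """kT/q in volts."""
    return K_B * temp_k / Q_E


def diffusion_constant(mobility, temp_k=300.0):
    """Einstein relation D = (kT/q) * mu, in m^2/s for mobility in m^2/(V*s)."""
    return thermal_voltage(temp_k) * mobility


u_t_300 = thermal_voltage(300.0)       # about 25.9 mV at 300 K
d_n = diffusion_constant(0.135)        # assumed electron mobility
```

At 300 K the thermal voltage is about 25.9 mV, so with the assumed mobility the diffusion constant is about 3.5e-3 m²/s (roughly 35 cm²/s).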
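From these velocities, the current densities through a surface Asurface can be derived by
multiplying by the carrier charge q and by the particle densities, according to:
$$ J_{diff} = q\, n(\cdot)\, v_{diff} \tag{2.9a} $$
$$ J_{drift} = q\, n(\cdot)\, v_{drift} \tag{2.9b} $$
and the currents are then obtained from:
$$ I_{diff} = J_{diff} \cdot A_{surface} \tag{2.10a} $$
$$ I_{drift} = J_{drift} \cdot A_{surface} \tag{2.10b} $$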
Subthreshold Region
As already mentioned, currents in the subthreshold MOS are dominated by diffusion.
Let us show this and compute the current. The p-type silicon surface beneath the gate has a
potential ψs that is affected by the voltage applied to the gate terminal. If the gate voltage
increases but stays below the threshold voltage Vth, the electron channel does not shield the
bulk and the depletion thickness Ldepletion increases in order to balance the charges on the gate.
Assuming small variations around the operating point, it holds that [18]:
$$ \psi_s = \psi_0 + \kappa V_g \tag{2.11} $$
where ψ0 is just an arbitrary reference for the potential and κ is the gate coupling
coefficient, defined as
$$ \kappa = \frac{C_{oxide}}{C_{oxide} + C_{depletion}} $$
which represents the coupling of the gate to the surface
potential. Since Coxide is fixed by the physical dimensions and the depletion capacitance
Cdepletion is fairly constant in the subthreshold region, κ is considered constant;
typical values in modern processes are around κ ≈ 0.7.
Now additionally assume that the potential of the channel, denoted by ψs, is constant
along the whole x axis [15]. No potential difference along the channel means no drift current.
However, since the drain and source terminals are accessible from the outside through physical
connections, their potentials can be imposed. If these potentials Vs and Vd differ from the
channel potential, the source and drain electrons face a potential barrier with respect to the
channel. The barrier height is given by:
$$ \begin{aligned} \theta_s &= \theta_0 - q(\psi_s - V_s) \\ \theta_d &= \theta_0 - q(\psi_s - V_d) \end{aligned} \tag{2.12} $$
in which θ0 = qVbi is the built-in energy barrier of the two p-n junctions and q is the
elementary charge. These barriers depend linearly on the difference between the voltages applied
to the terminals (S and D) and the channel potential.
Given the energy barriers of Equation 2.12 seen by the electrons, we can compute the electron
concentrations at the source and drain terminals via the following equations [1]:
$$ \begin{aligned} n(e)_{source} &= n(0)\, e^{-\frac{\theta_s}{kT}} = n(0)\, e^{-\frac{\theta_0 - q(\psi_s - V_s)}{kT}} \\ n(e)_{drain} &= n(0)\, e^{-\frac{\theta_d}{kT}} = n(0)\, e^{-\frac{\theta_0 - q(\psi_s - V_d)}{kT}} \end{aligned} \tag{2.13} $$
where n(0) is the electron density at the reference potential.
Since the drain terminal is, by definition, at a higher potential than the source,
its electron concentration is smaller than the source electron concentration
(n(e)drain < n(e)source).
As emphasized repeatedly, whenever there is a difference in concentration there
is a gradient and hence an electron flow by diffusion, in this case from the source to the drain.
Since we know the concentrations at the two ends of the channel and we know how the electrons
move by diffusion, we can compute the channel current in the MOS by recalling Equations
2.10, 2.9 and 2.7:
$$ I = J_{n,diff}\, W L_{depth} = q\, n(e)\, v_{diff}\, W L_{depth} = -q\, W L_{depth}\, D_n \frac{dn(e)}{dx} \tag{2.14} $$
where Jn,diff is the diffusion current density, W is the width of the channel,
Ldepth is the depth of the inversion layer, Dn is the diffusion coefficient of the electrons and
dn(e)/dx is the concentration gradient across the MOS.
Since we are analysing only this phenomenon, we implicitly assume that no other currents
flow in the bulk or in the gate. This is a reasonable assumption, since the gate is isolated from
the bulk and the drain-bulk and source-bulk junctions are usually reverse biased.
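Hence, from KCL, the current throughout the channel must be constant, so the concentration
gradient is uniform along the channel:
$$ \frac{dn(e)}{dx} = \frac{n(e)_{drain} - n(e)_{source}}{L} = \frac{n(e)_1}{L}\, e^{\frac{\psi_s}{U_T}} \left( e^{-\frac{V_d}{U_T}} - e^{-\frac{V_s}{U_T}} \right) \tag{2.15} $$
$$ I_{ds} = -q \frac{W}{L} L_{depth} D_n\, n(e)_1\, e^{\frac{\psi_s}{U_T}} \left( e^{-\frac{V_d}{U_T}} - e^{-\frac{V_s}{U_T}} \right) = I_0 \frac{W}{L}\, e^{\frac{\psi_s}{U_T}} \left( e^{-\frac{V_s}{U_T}} - e^{-\frac{V_d}{U_T}} \right) \tag{2.16} $$
where UT = kT/q is the thermal voltage, n(e)1 = n(0) e^{−θ0/kT} and I0 = q Ldepth Dn n(e)1.
Finally, substituting Equation 2.11 into Equation 2.16 and absorbing the constant factor
e^{ψ0/UT} into I0 yields:
$$ I_{ds} = I_0 \frac{W}{L}\, e^{\frac{\kappa V_g}{U_T}} \left( e^{-\frac{V_s}{U_T}} - e^{-\frac{V_d}{U_T}} \right) \tag{2.17} $$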
As in the usual above-threshold regime, in the subthreshold regime the nMOS also exhibits
different behaviours according to its bias point:
• Non-saturation region
This region arises when Vds is small (below roughly 4UT). We can rewrite Equation 2.17 as follows:
$$ I_{ds} = I_0 \frac{W}{L}\, e^{\frac{\kappa V_g - V_s}{U_T}} \left( 1 - e^{-\frac{V_{ds}}{U_T}} \right) \tag{2.18} $$
Contrary to what happens in strong inversion, here the Ids vs Vds curves are not
linear.
• Saturation region
If Vds > 4UT ≈ 100 mV, the concentration of electrons at the drain terminal is negligible
compared to the one at the source terminal, and Equation 2.17 can be approximated by
the following one:
$$ I_{ds} = I_0 \frac{W}{L}\, e^{\frac{\kappa V_g - V_s}{U_T}} \tag{2.19} $$
This independence of Ids from Vd is consistent with the intuitive analysis. In
fact, if Vds is considerable, the difference in electron concentration between source and
drain is large. Hence, it does not really matter how small the drain concentration is,
since its contribution to Ids is negligible.
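The two operating regions can be checked numerically with a small sketch of the subthreshold current. The function below uses the form of Equations 2.18 and 2.19, written so that Ids is positive for Vd > Vs; the value of `i0`, the unity W/L ratio, κ = 0.7 and UT = 25 mV are assumptions chosen only for illustration.

```python
import math


def ids_subthreshold(v_g, v_s, v_d, i0=1e-15, w_over_l=1.0,
                     kappa=0.7, u_t=0.025):
    """Subthreshold nMOS current in amperes (voltages referred to the bulk).

    i0 is a hypothetical process-dependent scale current.
    """
    return (i0 * w_over_l * math.exp(kappa * v_g / u_t)
            * (math.exp(-v_s / u_t) - math.exp(-v_d / u_t)))


# Saturation: beyond about 4*U_T (100 mV) the current barely depends on Vds.
i_at_100mv = ids_subthreshold(0.3, 0.0, 0.1)
i_at_1v = ids_subthreshold(0.3, 0.0, 1.0)
saturation_ratio = i_at_100mv / i_at_1v

# Exponential gate dependence: one decade of current every U_T*ln(10)/kappa volts.
decade_step = 0.025 * math.log(10.0) / 0.7
decade_ratio = ids_subthreshold(0.3 + decade_step, 0.0, 1.0) / i_at_1v
```

With these assumed parameters the current at Vds = 100 mV is already within 2% of its saturated value, and the current grows by one decade for every 82 mV or so of gate voltage.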
[Figure 2.2 plot: Ids [µA] (0 to 20) versus Vds [V] (0 to 2), one curve per Vgs value; labelled curves at 300, 400, 500, 550 and 600 mV.]
Figure 2.2: Simulated Ids vs Vds plot of an nMOS with W/L = 1 µm/1 µm, parametrized on Vgs up to 600 mV (50 mV steps).
2.2 A Low Leakage Cell
This section briefly summarizes and explains the concepts and results reported by M. O'Halloran
and R. Sarpeshkar in [6] and [7]. Their papers focus on how to obtain accurate analogue
storage cells with ultra low leakage currents. Natural applications of their results include
analogue memories, sample-and-hold circuits, switched-capacitor circuits, offset cancellation
and so on. However, even if they did not mention it, their novel insight into and
characterization of additional leakage mechanisms in MOS devices can be used as the basis for
very low current generators that exploit parasitic leakage mechanisms in the device. As will be
explained in Section 3.2, this insight is the key technique used in this thesis in order to
obtain ultra long time constants.
In their papers, the authors first analysed previous low leakage cells and made a fair
comparison between the designs, normalizing the meaningful quantities. They then characterized
accumulation-mode MOS devices with silicon measurements in order to develop and validate a
comprehensive model. Finally, they derived a practical rule of thumb for the design of low
leakage cells.
Usually, transistor drain leakage currents are modelled as the combination of two separate
phenomena: diode leakage and subthreshold conduction. In an nMOS the drain to bulk
junction current can be written as:
$$ I_{db} = -I_s \left[ e^{-V_{db}/U_T} - 1 \right] \tag{2.20} $$
where Is is the diode saturation current and is a process constant proportional to the
junction perimeter and area.
The subthreshold conduction equation is Equation 2.18:
$$ I_{ds} = I_0 \frac{W}{L} e^{\frac{\kappa V_{gs}}{U_T}} \left( 1 - e^{-\frac{V_{ds}}{U_T}} \right) \tag{2.21} $$
thus, the smaller Vgs is, the less current flows between the source and the drain.
Since the transistor drain leakage is a combination of these two leakage sources, subthreshold
conduction can be made negligible if a small enough Vgs is applied, resulting in Ids << Idb.
However, the Vgs at which this happens has two interesting dependencies. First, it is a
weak function of Idb: if Idb is reduced, Vgs must also decrease accordingly in
order to keep the subthreshold conduction negligible.
Second, it depends linearly on the transistor threshold voltage. The smaller Vth is, the more
negative the Vgs required to switch the nMOS off. Hence, the use of high threshold
voltage devices (n/pMOS HVT), sometimes available in processes, can be beneficial.
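The weak dependence on Idb can be made explicit by equating Equation 2.21 to Idb and solving for Vgs, which gives a logarithmic relation. The sketch below is only illustrative: the values of UT, κ, I0 and W/L, as well as the function name, are assumptions and not figures from [6].

```python
import math

U_T = 0.025     # thermal voltage [V] (assumed, room temperature)
KAPPA = 0.7     # subthreshold slope factor (assumed)
I_0 = 1e-15     # subthreshold scaling current [A] (assumed)
W_OVER_L = 1.0  # transistor aspect ratio (assumed)

def crossover_vgs(i_db, v_ds=0.150):
    """Vgs at which Ids of Equation 2.21 equals the junction leakage i_db."""
    sat = 1.0 - math.exp(-v_ds / U_T)
    return (U_T / KAPPA) * math.log(i_db / (I_0 * W_OVER_L * sat))

# Each decade of reduction in Idb lowers the crossover by (UT/kappa)*ln(10).
for i_db in (1e-17, 1e-18, 1e-19):
    print(f"Idb = {i_db:.0e} A -> Vgs = {crossover_vgs(i_db) * 1e3:.1f} mV")
```

With these assumed parameters, reducing Idb by a factor of ten moves the crossover Vgs by about 82mV only, which is the weak dependence mentioned above.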
Let's start with the first part, the review of existing methods. In order to reduce leakage
at the drain terminal, and hence memory degradation if a capacitor is connected to it, two
techniques were widely used in past literature. The first consists of an nMOS and a pMOS
in parallel forming a transmission gate, see Figure 2.3(a). Since it is made of transistors of
opposite type, the leakages of the drain-bulk junctions of the two devices have opposite directions.
If the MOS are carefully sized and matched, these drain-bulk currents can cancel each other,
giving theoretically zero leakage current from the drain terminal. In practice this is
very difficult to achieve, because the cancellation is very sensitive to device dimensions,
mismatches and, not least, to the MOS operating point.
Another technique consists instead of a single device that has the bulk and the drain
terminals tied to the same potential, for instance with a voltage buffer, see Figure 2.3(b). A
junction with no potential difference between its two sides does not draw any current from
the drain terminal. Of course, in the real world this can happen only if the voltage buffer
has zero offset; otherwise the junction is biased with the buffer offset voltage and a small
current still flows from the drain terminal.
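The effect of a residual buffer offset can be estimated with Equation 2.20. The following sketch is a rough illustration only: the saturation current Is, the thermal voltage and the offset values are assumed numbers, not data from [6].

```python
import math

U_T = 0.025  # thermal voltage [V] (assumed)
I_S = 1e-15  # diode saturation current [A] (assumed)

def junction_current(v_db):
    """Equation 2.20: drain-bulk junction current for a given Vdb."""
    # expm1(x) computes exp(x) - 1 accurately for small x.
    return -I_S * math.expm1(-v_db / U_T)

# Vdb equals the buffer offset when drain and bulk are tied through a buffer.
for v_off in (0.0, 1e-3, 5e-3):
    print(f"offset = {v_off * 1e3:.0f} mV -> Idb = {junction_current(v_off):.2e} A")
```

For offsets much smaller than UT the leakage grows almost linearly with the offset, approximately as Is·Voffset/UT.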
Figure 2.3: Leakage current minimizing techniques. (a) nMOS and pMOS in parallel, the two leakage currents
counterbalance each other. (b) An nMOS cell, here the leakage current is minimized by forcing Vout to be as close
to Vout′ as possible, via a feedback amplifier.
Technology   Lambda (λ)   Minimum junction   Scaled reverse-biased junction   Estimated net   Achieved net
                          area (30λ²)        leakage per unit area            leakage         leakage
[µm]         [µm]         [µm²]              [fA/µm²]                         [fA]            [fA]
3            1.5          67.5               4 * 0.02                         0.22            1.6
2            1            30                 3 * 0.02                         0.36            1
1.2          0.6          10.8               2 * 0.02                         0.017           0.08
Table 2.1: Comparison between estimated and measured leakage reduction performances for a given technology.
Table taken from [6].
Although these techniques are well known and widely used in industry, the authors of the
paper claim that these past low leakage designs did not achieve the minimum leakage that
would be expected from physical considerations and from the math.
In order to prove that, they first compared the measured leakages found in past papers
with the estimates for the corresponding processes. They assumed minimum-size MOS and that the
subthreshold conduction is completely eliminated (reasonable, since it is usually quite easy to
achieve). Hence, the only measurable leakage current would be the junction current between
drain and bulk. In modern processes the reverse-biased leakage per unit area at room temperature
is around 0.02fA/µm² [9]; this value has decreased by a factor of two to three in the past 15
years as technology improved. Then, they scaled this parameter according to
the process technology and the obtainable minimum dimensions. They summarize their analysis
in a table, which I partially reproduced in Table 2.1.
That comparison table shows that even the best circuit design has a leakage current at least 2.5 times
larger than what is expected from theory, hence they inferred that the optimal design has
not been achieved. The authors in [6] claim that:
"These results suggests that additional leakage mechanisms, probably due to the MOS structure,
exists and have not been compensated for in these past implementations".
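As a quick cross-check of the numbers in Table 2.1, the short sketch below recomputes the minimum junction area 30λ² and the ratio between achieved and estimated net leakage for each technology. The data are the ones reported in the table; the variable and function names are mine.

```python
# Rows of Table 2.1: (technology [um], lambda [um], estimated [fA], achieved [fA])
TABLE_2_1 = [
    (3.0, 1.5, 0.22, 1.6),
    (2.0, 1.0, 0.36, 1.0),
    (1.2, 0.6, 0.017, 0.08),
]

def min_junction_area(lam_um):
    """Minimum junction area, 30 * lambda^2, in um^2."""
    return 30.0 * lam_um ** 2

def leakage_ratio(estimated_fa, achieved_fa):
    """How many times the achieved leakage exceeds the estimate."""
    return achieved_fa / estimated_fa

for tech, lam, est, ach in TABLE_2_1:
    print(f"{tech} um: area = {min_junction_area(lam):.1f} um^2, "
          f"achieved/estimated = {leakage_ratio(est, ach):.1f}")
```

The smallest ratio is obtained for the 2µm technology and is slightly below 3, consistent with the "at least 2.5 times" statement.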
Having stated the problem, let's analyse the measurements in order to figure out which
mechanism is unaccounted for.
In modern processes, biasing the nMOS at Vgs = 0 means that it is digitally off, but if Vds
is non zero, Ids will still be larger than Idb due to subthreshold conduction.
These considerations are shown in Figure 2.4. In fact, with large Vgs the dominant leakage
current is the subthreshold conduction (exponential), while with deeply negative Vgs the current
is dominated by Idb and is almost constant. The measured Vgs at which these two currents are
equal is around −200mV. Hence, the first insight about how to get low leakage in an nMOS is
to force a deeply negative Vgs, resulting in Idb >> Ids.
This can be done either by decreasing the gate potential or by increasing the source potential,
which also raises the threshold via the body effect. Both are feasible and effective for
this purpose, but increasing Vsb has the disadvantage that it also biases the drain to bulk
diode, increasing Idb.
Figure 2.4: Five diﬀerent minimum-sized nMOS Id vs Vgs curves with Vs = Vb = 0V and Vds = 150mV . Plot
taken from [6].
Equation 2.21 is valid only for a subthreshold MOS, i.e. for Vgs higher than around −200mV
according to Figure 2.4. Below that value the equation does not hold any more, since the nMOS
leaves the subthreshold region and enters accumulation mode. In this region
Vgs is so negative that it attracts holes under the oxide surface, and the subthreshold
drain-source coupling mechanisms are hence negligible. The usual (and wrong) conclusion is then
that no source-drain coupling is present any more in the device. The authors, however, carried out
intensive measurements on MOS devices in deep accumulation showing that this is not true, and
that this inaccuracy can be a relevant source of leakage that explains the non-optimal performance of
past circuits.
Figure 2.5 shows the unexpected dependence of Id on Vsb in accumulation mode.
This Id vs Vgs plot is obtained with Vbulk = 0 and Vdb = 150mV, and is parametrized in Vsb =
−50 ÷ 150mV. If no accumulation mode coupling were present, the variation of Id with
Vsb would be negligible. On the contrary, the plot shows that Id is actually affected by the source
potential. Additionally, as Vsb approaches Vdb, the conduction current is minimized. This
sounds reasonable given the internal symmetric structure of the MOS.
Figure 2.6 shows an Id vs Vsb plot parametrized in Vdb = 150 ÷ 300mV, given Vgb = −1
and of course Vb = 0. The plot shows that, with Vsb in the range −50 ÷ +250mV, the
drain current still depends exponentially on the source potential, but is almost constant for
Vsb > 100mV. This residual leakage current still depends weakly on the drain terminal potential
due to the drain-bulk reverse leakage.
Figure 2.5: Minimum-sized nMOS accumulation-mode Id vs Vgs curves for −50mV ≤ Vsb ≤ 150mV, with
Vb = 0V and Vdb = 150mV. Plot taken from [6].
These last two plots clearly show that a source-drain coupling is still evident in a MOS in deep
accumulation mode. So, in order to take this mechanism into account, the authors
developed an empirical model of the MOS in accumulation mode. Since the coupling effect
between the drain current and the source potential is exponential, it is natural to model it with an
additional parasitic diode Dsb2 in the MOS structure that only interacts with the opposite
MOS terminal, while Dsb1 is the usual source to bulk junction diode.
MOS are symmetric devices, hence the same model can be applied to the drain terminal.
We will therefore have a Ddb2 diode modelling the accumulation mode coupling with the
source and a Ddb1 diode that only models the drain-bulk junction leakage.
The mechanism that links Dsb2 and Ddb2 is the diffusion under the accumulation layer.
This speculation is supported by the fact that in Figure 2.5 the drain current decreases as Vgs
becomes more negative: the more negative the gate voltage, the wider the accumulation
layer, and hence the smaller the source-drain coupling, due to the reduction of the cross section
area of the Dsb2 and Ddb2 junctions.
In this paper only measurements and considerations were presented; no analytical model
was developed due to the high complexity of the problem. However, a useful rule of thumb
is provided in order to design low leakage switches. Additional experiments were carried out later
by the same authors [7], achieving leakage currents on the order of 5 electrons/second with a modern
device and a differential topology.
Figure 2.6: Minimum-sized nMOS accumulation-mode Id vs Vsb curves for 150mV ≤ Vdb ≤ 300mV , with
Vb = 0V and Vgb = −1. Plot taken from [6].
Figure 2.7: The MOS accumulation region model. The drain to source coupling phenomenon is modelled by
the two additional diodes Dsb2 and Ddb2. On the contrary, Dsb1 and Ddb1 model the p-n junction leakages. Figure
taken from [6].
It is clear from the above discussion that, due to the symmetric nature of the MOS, the
source-drain accumulation coupling can be minimized if Vd = Vs. In fact, in this situation,
the current through Dsb2 is equal to the current that flows through Ddb2 but with opposite
direction, hence they sum to zero. Finally, since the subthreshold conduction is suppressed by
setting the MOS in accumulation mode, the only relevant drain leakage current would be the
one from Ddb1. This current can be reduced by minimizing the voltage across that junction
(Equation 2.20 gives Idb = 0 for Vdb = 0), yielding the following:
Low Leakage Design rule of thumb (nMOS):
• Vgs < −400mV, in order to null the subthreshold conduction and bias the MOS
in the accumulation region.
• Vdb = 0, in order to minimize the drain-bulk junction leakage.
• Vsb = Vdb, in order to minimize the drain to source coupling effects.
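The three conditions above can be turned into a small bias-point check. This is only a sketch: the function name, its interface and the matching tolerance are hypothetical choices made for the example.

```python
def low_leakage_violations(v_g, v_s, v_d, v_b, tol=1e-3):
    """Return the rule-of-thumb conditions violated by an nMOS bias point.

    All voltages are in volts; tol is an assumed matching tolerance.
    """
    violations = []
    if not (v_g - v_s) < -0.400:
        violations.append("Vgs < -400mV")
    if abs(v_d - v_b) > tol:
        violations.append("Vdb = 0")
    if abs((v_s - v_b) - (v_d - v_b)) > tol:
        violations.append("Vsb = Vdb")
    return violations

# A bias point that follows the rule, and one that breaks all three conditions.
print(low_leakage_violations(v_g=0.0, v_s=0.5, v_d=0.5, v_b=0.5))
print(low_leakage_violations(v_g=0.0, v_s=0.0, v_d=0.15, v_b=0.0))
```

The second call corresponds to the bias used in Figure 2.4 at Vgs = 0 (Vs = Vb = 0V, Vds = 150mV), which does not satisfy any of the three conditions.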
2.3 The Translinear Principle and log-domain Filters
The translinear principle was first introduced by B. Gilbert in the mid '70s, and since
then this technique has been analysed, synthesized and used by several people. From
it, a new paradigm of filters called log-domain filters has been developed and formalized
by Mulder in [17].
Log-domain filters are continuous-time filters that consist of constant current sources,
linear capacitors and translinear devices arranged so as to form a translinear loop. The
adjective log-domain relates to the fact that the variables in the circuit are log-compressed
voltages.
The power and the beauty of this technique is that it is simply based on the exponential
law of certain electronic devices, such as BJT, MOS, IGBT (even combinations of those can
be fine), and on a mathematical property of log(·). However, to be specific, in this thesis work I will
focus only on translinear circuits and log-domain filters made out of MOS devices.
The impact of circuits that rely on this principle is important, since it is extremely useful
for performing very resource-consuming functions such as multiplication, division, squaring
and square rooting in a very efficient way and with very few components. In fact, computationally
expensive operations such as exp(·) are directly implemented by the device I-V curve itself.
Other regular functions, e.g. sums and subtractions, are simply implemented by the KCL at
a node. Otherwise, the usual techniques that perform such complex functions would require
going into the digital domain and, if not part of a big system with shared resources, the
implementation would lose compactness, efficiency and speed.
The modern implementation of log-domain filters in CMOS processes is based on the
subthreshold MOS in saturation. This is because the device in this bias region
has exactly the exponential dependence that is required to act as a translinear element and
build translinear circuits. In fact, its current-voltage relation was derived in Section 2.1 and
is described by Equation 2.19, rewritten here below for convenience:
$$ I_{ds} = I_0 \frac{W}{L} e^{\frac{\kappa V_{gs}}{U_T}} \tag{2.22} $$
The term translinear refers to the fact that the transconductance of the device is
linear in its collector current (the drain current in a MOS). This is exactly what happens in the saturated
subthreshold MOS:
$$ g_m = \frac{\partial I_{ds}}{\partial V_{gs}} = I_0 \frac{W}{L} e^{\frac{\kappa V_{gs}}{U_T}} \cdot \frac{\kappa}{U_T} = \frac{\kappa I_{ds}}{U_T} \tag{2.23} $$
According to Table 1.2, translinear circuits are current-mode circuits, in the sense that
their inputs and outputs are currents. Inverting Equation 2.22, the Vgs voltages in a MOS are the
log(·) versions of their drain-source currents:
$$ V_{gs} = \frac{U_T}{\kappa} \ln\left( \frac{I_{ds}}{I_0} \frac{L}{W} \right) \tag{2.24} $$
hence, it is clear how translinear circuits compress the voltage dynamic range with a log(·)
function. Since voltages can be accommodated in a reduced range, this compression is an
additional benefit if the circuit is powered with a low supply voltage, as happens in neuromorphic
engineering.
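The log compression can be illustrated numerically. The sketch below implements Equations 2.22, 2.23 and 2.24 and computes the Vgs range needed to cover six decades of current; UT, κ, I0 and W/L are assumed values chosen only for the example.

```python
import math

U_T = 0.025     # thermal voltage [V] (assumed)
KAPPA = 0.7     # subthreshold slope factor (assumed)
I_0 = 1e-15     # scaling current [A] (assumed)
W_OVER_L = 1.0  # aspect ratio (assumed)

def ids_from_vgs(v_gs):
    """Equation 2.22: saturated subthreshold drain current."""
    return I_0 * W_OVER_L * math.exp(KAPPA * v_gs / U_T)

def vgs_from_ids(i_ds):
    """Equation 2.24: gate-source voltage as the logarithm of the current."""
    return (U_T / KAPPA) * math.log(i_ds / (I_0 * W_OVER_L))

def gm(i_ds):
    """Equation 2.23: transconductance, linear in the drain current."""
    return KAPPA * i_ds / U_T

swing = vgs_from_ids(1e-6) - vgs_from_ids(1e-12)
print(f"Six decades of current span {swing * 1e3:.0f} mV of Vgs")
```

With these assumptions, currents from 1pA to 1µA are represented by a Vgs range of roughly half a volt.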
In order to have a translinear circuit, we need a topology in which we can identify a closed
loop of n translinear elements. Each of these devices (in our case MOS) must have
its gate and its source connected to at least one gate or one source of another translinear
element. Such a topology is called a translinear loop and is shown in Figure 2.8.
Then, in order to analyse the circuit, an arbitrary direction of the loop must be set (let's
assume it is clockwise). Following the loop direction, if the encountered Vgs is positive
then the translinear element is said to have a clockwise (CW) direction. On the
contrary, if the Vgs is negative, the translinear MOS is labelled as a counter-clockwise (CCW)
element.
Travelling all around the loop in the direction of the arrow, we can apply the
KVL and hence write:
$$ \sum_{n \in CCW} V_{gs_n} = \sum_{m \in CW} V_{gs_m} \tag{2.25a} $$
$$ \sum_{n \in CCW} \frac{U_T}{\kappa} \ln\left( \frac{I_{ds_n}}{I_0} \frac{L_n}{W_n} \right) = \sum_{m \in CW} \frac{U_T}{\kappa} \ln\left( \frac{I_{ds_m}}{I_0} \frac{L_m}{W_m} \right) \tag{2.25b} $$
Additionally, given that all the devices are reasonably at the same temperature, and if the
κ of the involved devices are equal, then it reduces to:
Figure 2.8: A Translinear Loop with MOS elements. Loop in clockwise direction.
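$$ \sum_{n \in CCW} \ln\left( \frac{I_{ds_n}}{I_0} \frac{L_n}{W_n} \right) = \sum_{m \in CW} \ln\left( \frac{I_{ds_m}}{I_0} \frac{L_m}{W_m} \right) \tag{2.26} $$
Recalling that ln(ab) = ln(a) + ln(b) we have that:
$$ \ln\left[ \prod_{n \in CCW} \frac{I_{ds_n}}{I_0} \frac{L_n}{W_n} \right] = \ln\left[ \prod_{m \in CW} \frac{I_{ds_m}}{I_0} \frac{L_m}{W_m} \right] \tag{2.27a} $$
$$ \prod_{n \in CCW} I_{ds_n} \frac{L_n}{W_n} = I_0^{\#CCW - \#CW} \prod_{m \in CW} I_{ds_m} \frac{L_m}{W_m} \tag{2.27b} $$
If the numbers of CCW and CW elements are equal (N/2 each) then the term I0 cancels and
the translinear relation finally becomes:
$$ \prod_{n \in CCW} I_{ds_n} \frac{L_n}{W_n} = \prod_{m \in CW} I_{ds_m} \frac{L_m}{W_m} \tag{2.28} $$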
This remarkable result is however based on the two previous, non-obvious assumptions,
i.e. that the κ of the translinear elements are equal and that #CCW = #CW. If they hold,
the Ids currents in the translinear elements are related to each other by multiplications
and divisions, and are linearly scaled according to the MOS dimension ratios W/L.
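The translinear relation of Equation 2.28 can be verified numerically on a hypothetical four-element loop. In the sketch below the gate-source voltages, the aspect ratios and the device parameters are all assumed values; the only constraint imposed is the KVL of Equation 2.25a.

```python
import math

U_T = 0.025  # thermal voltage [V] (assumed)
KAPPA = 0.7  # subthreshold slope factor, equal for all devices (assumed)
I_0 = 1e-15  # scaling current [A] (assumed)

def ids(v_gs, w_over_l):
    """Equation 2.22 for a saturated subthreshold MOS."""
    return I_0 * w_over_l * math.exp(KAPPA * v_gs / U_T)

# Hypothetical loop: two CW and two CCW devices, each given as (Vgs [V], W/L).
cw = [(0.30, 1.0), (0.35, 2.0)]
ccw = [(0.28, 4.0)]
# KVL (Equation 2.25a) fixes the gate-source voltage of the last CCW device.
v_last = sum(v for v, _ in cw) - sum(v for v, _ in ccw)
ccw.append((v_last, 0.5))

def loop_product(elements):
    """Product of Ids * L / W over a list of (Vgs, W/L) elements."""
    return math.prod(ids(v, wl) / wl for v, wl in elements)

print(loop_product(cw), loop_product(ccw))
```

The two products coincide up to rounding, independently of the chosen W/L ratios, as long as κ is the same for all the devices and the numbers of CW and CCW elements are equal.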
Even though the second assumption is usually quite easy to satisfy, the equal-κ assumption is
generally tougher. In fact, the MOS body effect results in a κ-dependent exponential constant
for the gate and a non-κ-dependent exponential constant for the source in the subthreshold
MOS [?]. When the body effect is relevant, the log(·) properties still apply to
the translinear circuits, but they result in undesirable power laws and hence distortion. So, in
order to neglect the κ effect, the body effect must be minimized, i.e. the source and the bulk
of the MOS must be tied together.
Unfortunately, in a standard CMOS n-well process (like the AMS 0.18µm) only the pMOS
bulks are independent from each other, since they are made in isolated wells. The nMOS bulks
are shared because they are the substrate of the die, hence it can sometimes be impossible to cancel
the body effect without forward biasing the junctions. However, it is feasible and works well if
the sources of the nMOS devices are all at the same potential.
The Early effect also degrades the translinear loop performance. In fact, Equation 2.22
is just an approximation and does not take this effect into account. If we recall the original
Equation 2.17, an additional distortion appears since a weak dependence on Vd is present.
However, both the Early effect and the body effect are usually overshadowed by the
mismatch between elements. As already pointed out, mismatch in modern CMOS processes can
be huge, and it results in significant deviations from the equations if careful design and layout are
not performed. Even though the sources of MOS mismatch are countless, as countless as the MOS
parameters, the most relevant errors arise from mismatches in the threshold voltages Vth and
in the transconductances.
Figure 2.9: A simple log-domain filter made out of a current mirror and a linear capacitor.
A simple yet instructive example of a log-domain filter made out of nMOS devices is the
one shown in Figure 2.9. It is basically the well known current mirror with the addition
of a capacitor. If the two nMOS are biased in subthreshold saturation they act as translinear
elements and hence a translinear loop can be identified. For the circuit analysis let's set
the loop direction to be CW (as highlighted in the figure). Given that the first encountered
terminal of M1 is its source, M1 is labelled as a CW element. On the contrary, since
M2 is first encountered at the gate terminal, M2 is a CCW element.
It follows that #CCW = #CW, so the assumption on the number of elements is verified. Since this is a
very simple circuit, the κ assumption is verified too. In fact, the bulk terminals of the
two devices can be connected to their source terminals, cancelling the body effect and thus
giving equal κ.
Recalling the capacitor equation we can write:
$$ i_C(t) = C \frac{dV_C}{dt} = C \frac{dV_{gs2}}{dt} = \frac{C U_T}{\kappa\, i_2(t)} \frac{di_2(t)}{dt} \tag{2.29} $$
and from the KCL it holds that i1(t) = iA(t) + iC(t). Applying the translinear principle and
combining these two equations yields:
$$ i_A(t) = i_2(t) \tag{2.30a} $$
$$ i_1(t) - i_C(t) = i_2(t) \tag{2.30b} $$
$$ i_1(t) = i_2(t) + \frac{C U_T}{\kappa\, i_2(t)} \frac{di_2(t)}{dt} \tag{2.30c} $$
Then, Laplace transforming and setting τ = C UT/(κ I2), treating the i2(t) in the denominator
as a constant I2, it results:
$$ I_1(s) = I_2(s)(1 + \tau s) \longrightarrow \frac{I_2(s)}{I_1(s)} = \frac{1}{1 + \tau s} \tag{2.31a} $$
giving a first order low-pass log-domain filter.
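To get a feeling for the orders of magnitude involved, the sketch below evaluates τ = C UT/(κ I2) for a few bias currents and integrates the linear first order equation corresponding to Equation 2.31a with a forward-Euler scheme. The capacitor value, κ, UT and the function names are assumptions made for this illustration.

```python
U_T = 0.025  # thermal voltage [V] (assumed)
KAPPA = 0.7  # subthreshold slope factor (assumed)

def time_constant(c, i_2):
    """tau = C * UT / (kappa * I2) of the first order log-domain filter."""
    return c * U_T / (KAPPA * i_2)

def step_response(tau, i_in, t_end, dt):
    """Forward-Euler integration of tau * dI2/dt + I2 = I1, starting from I2 = 0."""
    i_2, t = 0.0, 0.0
    while t < t_end:
        i_2 += dt * (i_in - i_2) / tau
        t += dt
    return i_2

C = 1e-12  # 1 pF capacitor (assumed)
for i_bias in (1e-9, 1e-12, 1e-15):
    print(f"I2 = {i_bias:.0e} A -> tau = {time_constant(C, i_bias):.3g} s")
```

With a fixed capacitor, each decrease of the bias current by three decades lengthens the time constant by the same factor, which is why very low currents are attractive for obtaining long time constants without increasing the capacitor area.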
This simple yet meaningful example gives an insight into translinear loops and log-domain
filtering. Second order effects, such as threshold voltage deviations, mismatches and the Early effect,
are easily understood with this simple example. All of them degrade the circuit
performance in terms of static errors (current ratio) or dynamic errors (distortion).
In this discussion about translinear circuits we assumed that the currents carrying information
are always positive. This is not always true in real circuits, hence the problem arises of how to
represent negative signals. This issue is usually overcome by implementing differential
translinear loops; as a result the effective signal is the difference between two positive currents,
and thus can be negative. Otherwise, one can keep the single ended circuit and add an offset
current, as done in the mirror example.
In this section I gave an overview and an example of how useful the translinear principle
is for making filters. In fact, this can be done by simply adding a capacitor to one
node of the translinear loop. A further and more interesting example of a low pass filter is the
Differential Pair Integrator circuit, shown in Section 2.4.
2.4 The Diﬀerential Pair Integrator Circuit
The Differential Pair Integrator (DPI) is a log-domain circuit developed by C. Bartolozzi at INI
in 2005. The goal of this circuit is to emulate synapses in silicon while exhibiting compactness, low
power consumption and still a wide control of the dynamics. The DPI is hence one of the
basic building blocks of the neural architecture implemented in the MN256r1 chip.
The basic DPI circuit is shown in Figure 2.10; it consists of four nMOS, two pMOS and
one capacitor. As will be clear later, Mthr sets, via its gate terminal, the DPI threshold, while
Mτ and Csyn affect the circuit time constant, i.e. the dynamics. Min is the input nMOS and
acts as a current source driven by the input signal (in our case a spike). The maximum
generated input current is weighted by the Vw bias of Mw. Msyn is the output pMOS and
acts as a current generator whose current magnitude is proportional to the input current Iin
and depends on the circuit parameters.
Figure 2.10: The DPI schematic.
In steady state conditions the current Iin is constant, and it is split into two
components that go through Mthr and through Md. Since in steady state the
capacitor does not let any current flow, IC = 0 and Id of Md is equal to Iτ. Hence Ithr is equal
to Iin − Id. In order to satisfy this last equation, the Vgs of Mthr is set by the device physics
and is not a free variable. Since Vthr is externally set, its source potential Va is at:
$$ V_a = V_{thr} - \frac{U_T}{\kappa} \ln\left( \frac{I_{thr}}{I_0} \frac{L}{W} \right) \tag{2.32} $$
Note that ideally this source potential does not interfere with the current Iin, as long as the
Early effect is negligible.
Now we are in the situation where Md has both its current imposed (Iτ by Mτ) and its
source terminal potential Va imposed (due to the previous consideration on Mthr). So its gate terminal will
be set by the circuit according to the usual Equation 2.19. The diode connected nMOS Md
allows this because its drain potential is ideally independent of its current. As a result,
at steady state a certain potential Vsyn, affected by Vthr and by Iin, sets the output current Isyn.
Figure 2.11: Qualitative time plot of the meaningful voltages in the DPI schematic (Vin [V], Va [mV] and
Vsyn [µV] versus t [ms], with the instants t1 and t2 marked), drawn from simulation plots.
Note the different y-axis scales.
But what happens if a variation in the input current occurs? Let's assume we are
at steady state and suddenly increase the input current. Such additional current must
flow somewhere according to the KCL, i.e. in Mthr, in Md or in both. However, the gates of these
nMOS are fixed, one by an external generator and the other by the capacitor inertia. Hence, the only
way to allow the increase in current is to decrease their source potential Va until a local
equilibrium is reached. This quickly happens at t2 (see Figure 2.11) and results in an
increase of both currents, namely Ithr and Id.
Since Id is statically given by the fixed Iτ, the additional required current must be drained
out of the capacitor. However, if the capacitor supplies current, charge flows out of it
and its potential decreases. Hence, the Vsyn potential is not fixed
any more and goes lower. While Vsyn is lowering, Iτ is not affected, but Id is, and hence a smaller
current flows through Md. In order to satisfy the KCL Iin = Ithr + Id, the common node Va
gets slightly lower (see after t2). Since the decreases of Va and of Vsyn are set by different
dynamics, their difference, the Vgs of Md, converges until an equilibrium point is reached.
Instead, if we suddenly decrease the input current, Va increases in order to
reduce Ithr and Id and satisfy the KCL at that node. But now Id < Iτ. Hence, the current
difference at t1 flows into the capacitor, giving IC = Id − Iτ. As before, while the current goes
into the capacitor it raises Vsyn and hence increases Id, reaching the equilibrium point where
Id = Iτ. However, this charging phase is much slower than the discharging phase, in which a
bigger current flows in the capacitor. Contrary to the previous case, while Vsyn increases the
output current Isyn gets lower.
This was just an intuitive explanation meant to give a real understanding of the circuit; an
analytic derivation based on the translinear loop is given here below.
Figure 2.12: The DPI schematic with the translinear loop in evidence. Note the additional pMOS on the left
(Mp_thr) added for convenience.
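Referring to Figure 2.12, let's start with the net and the element equations:
$$ I_{syn} = I_0 e^{\frac{\kappa V_{syn}}{U_T}} \qquad I_{in} = I_{thr} + I_d \qquad I_d = I_\tau + I_C \tag{2.33a} $$
$$ I_C = C \frac{dV_{syn}}{dt} = C \frac{U_T}{\kappa I_{syn}} \frac{dI_{syn}}{dt} \tag{2.33b} $$
If the circuit elements are biased in the subthreshold saturation region, the translinear
principle applies to the loop (see dotted line in Figure 2.12). The resulting relation is then:
$$ I_{p\_thr} \cdot I_{thr} = I_d \cdot I_{syn} \tag{2.34} $$
Combining Equations 2.34 and 2.33a we can write:
$$ I_{p\_thr} \left( I_{in} - I_\tau - C \frac{U_T}{\kappa I_{syn}} \frac{dI_{syn}}{dt} \right) = I_{syn} \left( I_\tau + C \frac{U_T}{\kappa I_{syn}} \frac{dI_{syn}}{dt} \right) \tag{2.35a} $$
and dividing both terms by Iτ and defining τ = C UT/(κ Iτ):
$$ \tau \left( 1 + \frac{I_{p\_thr}}{I_{syn}} \right) \frac{dI_{syn}}{dt} + I_{syn} = \frac{I_{p\_thr} I_{in}}{I_\tau} - I_{p\_thr} \tag{2.36} $$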
This is a first order non-linear differential equation that cannot be solved analytically.
However, if Iin >> Iτ the term Ip_thr on the right side of Equation 2.36 can be neglected. And
if Isyn >> Ip_thr the term Ip_thr/Isyn can also be dropped. This last assumption is acceptable since, even
though Isyn = 0 at the beginning, it will increase monotonically and meet the Isyn >> Ip_thr
condition over time.
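Hence, we can rewrite Equation 2.36 as:
$$ \tau \frac{dI_{syn}}{dt} + I_{syn} = \frac{I_{in} I_{p\_thr}}{I_\tau} \longrightarrow \frac{I_{syn}(s)}{I_{in}(s)} = \frac{I_{p\_thr}}{I_\tau} \frac{1}{1 + s\tau} \tag{2.37} $$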
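The behaviour described by Equation 2.37 can be sketched with a few lines of code. The example below computes the time constant and the DC gain Ip_thr/Iτ, then integrates the simplified linear equation for a rectangular input pulse with a forward-Euler scheme. All bias values, the capacitor value and the function names are hypothetical and chosen only for illustration; the full non-linear Equation 2.36 is not solved here.

```python
import math

U_T = 0.025  # thermal voltage [V] (assumed)
KAPPA = 0.7  # subthreshold slope factor (assumed)

def dpi_tau(c_syn, i_tau):
    """Time constant tau = C * UT / (kappa * Itau)."""
    return c_syn * U_T / (KAPPA * i_tau)

def dpi_gain(i_p_thr, i_tau):
    """DC gain Ip_thr / Itau of Equation 2.37."""
    return i_p_thr / i_tau

def dpi_pulse_response(i_in, t_pulse, t_end, tau, gain, dt):
    """Forward-Euler solution of Equation 2.37 for a rectangular input pulse."""
    i_syn, trace = 0.0, []
    steps = int(round(t_end / dt))
    for k in range(steps):
        drive = gain * i_in if k * dt < t_pulse else 0.0
        i_syn += dt * (drive - i_syn) / tau
        trace.append(i_syn)
    return trace

tau = dpi_tau(c_syn=1e-12, i_tau=10e-12)       # hypothetical bias values
gain = dpi_gain(i_p_thr=50e-12, i_tau=10e-12)
trace = dpi_pulse_response(1e-9, tau / 10, 5 * tau, tau, gain, tau / 1000)
print(f"tau = {tau * 1e3:.2f} ms, peak Isyn = {max(trace) * 1e9:.3f} nA")
```

Since the pulse lasts only a tenth of the time constant, the output peaks well below the steady state value gain·Iin and then decays back towards zero.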
Figure 2.13: The DPI AC response (Isyn [dB] versus frequency [Hz], from 10^−3 to 10^3 Hz), parametrized
with Vthr (1.5 V and 1.6 V): Vthr changes the circuit gain but not the time
constant. All transistors are sized: W/L = 1µm/1µm.
As described in Section 1.1, the meaningful signals in neuromorphic engineering are spikes. Even
though the dynamics of the action potential spike are quite complex, it has been shown [2] that
a first order approximation is sufficient to compute and process information in large neural
networks. Once again, the computational power results from the highly interconnected neurons
rather than from faithfully reproduced dynamics.
A spike can be obtained by applying a digital pulse to the DPI input. If that pulse
is narrow enough, i.e. one or two orders of magnitude shorter than the DPI time constant,
the rise of the Isyn signal is really fast, since the exponential in a neighbourhood of the origin
can be approximated by a steep line. But, since the pulse is narrow and ends soon, the DPI
does not reach the steady state point and goes back to its resting value. Hence, given a narrow
digital pulse, the output is an analogue signal that resembles the neuron action potential.
See Figure 2.15 and Figure 2.16 for the temporal dynamics of the DPI.
As already pointed out, this circuit implements only a basic action potential. In the MN256r1
other circuits are responsible for inter-spike dependent dynamics. One of the circuits that
will be interfaced directly to the DPI is my homeostatic circuit, which implements the
homeostatic plasticity.
Figure 2.14: The DPI AC response (Isyn [dB] versus frequency [Hz]), parametrized with Vτ (1.75 V and 1.8 V):
Vτ changes both the cut-off frequency and the circuit gain.
Figure 2.15: The DPI response (Isyn [nA]) to a 50% duty cycle square wave input signal (Iin [µA]) versus
time [ms], parametrized in Vτ. All transistors sized: W/L = 1µm/1µm.
Figure 2.16: The DPI response (Isyn [pA]) to a pulse input signal (Iin [µA]) versus time [ms]. The output
shape recalls the biological spike of Figure 1.3.
2.5 The Winner-Take-All Circuit
The Winner-Take-All circuit is a continuous-time analogue signal processor invented by
Lazzaro et al. in the late '80s. It consists of cells of two MOS each and exhibits a very compact,
parallel and highly modular architecture for comparing signal magnitudes. Mathematically
speaking, it implements the max(Iin1, Iin2, ..., IinN) function among N inputs, and selects
the "winning" output by letting the whole bias current Itail flow through it. Hence the name Winner
Take All (WTA), in the sense that the strongest signal wins over the others.
A basic two-cell WTA schematic is shown in Figure 2.17. There we have a cell made of
M2 and M4, arranged as a local feedback, that connects one input Iin1 to the circuit. M3 and
M5 form the second input cell and process the Iin2 signal. Further inputs can be added by
simply inserting an additional two-nMOS cell for each of the required inputs.
M1 acts as the current generator Itail, whose current represents the maximum output current.
Itail can be seen as the "prize" for the "winner" among the competing input signals.
Additionally, it sets the circuit gain and directly affects the power consumption.
Figure 2.17: A two input Winner Take All schematic.
Except where we intentionally want an asymmetric and weighted behaviour of the inputs,
the WTA cells usually consist of nMOS of equal sizes. The only nMOS that can differ
from the others is of course M1. Indeed, given a symmetric circuit, if the two inputs Iin1 and
Iin2 are equal then the tail current set by M1 is equally split between Iout1 and Iout2.
But what happens if the two inputs are remarkably different (Iin1 >> Iin2)?
If this happens while M2 and M3 are in the saturation region, both try to satisfy Equation
2.19, and a conflict arises. In fact, these transistors would like to set their gate terminals at
different values, but since their gates belong to the shared node Vd1, this cannot happen. The value of Vd1
is set by the larger of the gate voltages required by M2 and M3, i.e. that of M2 (the one with the highest
input current).
Hence, given that M3 has its Vgs set by M2, in order to have Iin2 = Ids3 << Iin1, M3 must
change its bias point, leaving the saturation region and falling into the
triode region. In fact, recalling Figure 2.2, with fixed Vgs we see that Ids lowers considerably
if Vds decreases and lets M3 go into the triode region. Since Vd3 lowers and Vd1 is fixed by
the other input branch, the current through M5, i.e. Iout2, severely decreases. Given the KCL
at the Vd1 node, Itail = Iout1 + Iout2 holds, and then we can conclude that Iout1 ≈ Itail.
The "winning" input has hence been "rewarded" by letting it take the whole pot, i.e. Itail.
Note that an increase of Iout1 will increase Vd2, but since the Ids vs Vds slope in saturation
is negligible compared to the one in the triode region, this variation will not heavily affect Vd1 and
will not influence large signal operation at all.
But, what happens if the diﬀerence between the inputs is small? Recall Figure 2.2, we
see that in saturation, the slope of the MOS is not zero due to nonidealities and second order
eﬀects (Early voltage). That behaviour can be included in the nMOS subthreshold model by
the following formula:
$$I_{ds} = I_0 \frac{W}{L} e^{\frac{\kappa V_{gs}}{U_T}} \left(1 + \frac{V_{ds}}{V_E}\right) \qquad (2.38)$$
where VE is the Early voltage.
Starting from the equilibrium point (Iin1 = Iin2), we slightly increase the value of Iin1 by δI. Hence, its drain voltage will increase by:
$$\delta V = \frac{\delta I}{I_0 \frac{W}{L} e^{\frac{\kappa V_{gs}}{U_T}}} V_E \qquad (2.39)$$
Since Vd2 is also the gate of M4, Iout1 will be amplified by an amount proportional to $e^{\delta V}$. Hence, due to KCL at node Vd1, the output current Iout2 must decrease by the same amount. The gain of the competing mechanism between the two input signals is $\frac{\delta V}{\delta I}$; it is directly proportional to the Early voltage VE and inversely proportional to the tail current Itail.
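As a purely illustrative aid, Equations 2.38 and 2.39 can be evaluated numerically with a few lines of Python. This is a minimal sketch: the values of I0, W/L, κ, UT and VE below are assumptions chosen only to land in the nA range, and the function names are hypothetical.

```python
import math

# Illustrative parameter values (assumptions, not taken from the thesis)
I0 = 1e-15       # current pre-factor I0 [A]
W_OVER_L = 2.0   # aspect ratio W/L
KAPPA = 0.7      # subthreshold slope factor
U_T = 0.025      # thermal voltage [V]
V_E = 20.0       # Early voltage [V]


def ids_subthreshold(vgs, vds):
    """Equation 2.38: subthreshold saturation current including the Early effect."""
    i_sat = I0 * W_OVER_L * math.exp(KAPPA * vgs / U_T)
    return i_sat * (1.0 + vds / V_E)


def delta_v(delta_i, vgs):
    """Equation 2.39: drain voltage change caused by a small current change."""
    i_sat = I0 * W_OVER_L * math.exp(KAPPA * vgs / U_T)
    return delta_i / i_sat * V_E


vgs = 0.55                           # assumed gate-source voltage [V]
vds = 0.5                            # assumed drain-source voltage [V]
i_bias = ids_subthreshold(vgs, vds)
d_i = 0.01 * i_bias                  # a 1 % increase of the input current
d_v = delta_v(d_i, vgs)
print(f"Ids = {i_bias:.3e} A, delta V = {d_v * 1e3:.1f} mV")
```

Because Equation 2.38 is linear in Vds, the voltage step returned by `delta_v` reproduces exactly the requested current step, which is a quick consistency check between the two equations.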
A simulation of a fully symmetric two input WTA circuit is plotted in Figure 2.18 for two different values of Itail. The larger the bias current, the steeper the plot in the neighbourhood of zero on the x-axis. As explained before, with large differential inputs the output currents saturate at the bias current value.
A variation of the schematic shown in Figure 2.17 is depicted in Figure 2.19. The difference in this last circuit is that one output current is doubled via a properly sized mirror and forced to flow in a branch shared with a constant current source whose value is Itail. If the input signal Iin1 differs from Iin2, then 2Iout1 is either close to zero or close to 2Itail. Given such a difference between the imposed currents, the drain terminals of the output transistors M8 and M9 move to an appropriate voltage in order to satisfy KCL in the branch. Since the gain of the WTA is high, the Vout swing ranges from almost ground up to the power supply. However, a particular equilibrium condition arises when Iin1 = Iin2 and yields Vout = V*out, whose value lies between gnd and Vdd.
Figure 2.18: The WTA static plot (Iout1,2 [nA] versus Idiff [nA]). All nMOS are sized W/L = 1µm/500nm, Itail = 10÷20nA, Icomm_input = 10nA and Idifferential = ±1nA.
The equilibrium point V*out depends on the MOS sizes and on the common input current value.
This structure has a behaviour that resembles a digital current-voltage logic gate with an additional equilibrium point somewhere in the middle of the voltage swing, or it can be seen as an analogue comparator. It can be modelled as follows:
$$V_{out} = \begin{cases} 0 & \text{if } I_{in1} < I_{in2} \\ V^*_{out} & \text{if } I_{in1} = I_{in2} \\ V_{dd} & \text{if } I_{in1} > I_{in2} \end{cases}$$
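The piecewise model above translates directly into a small behavioural function. The following Python sketch is only an idealised illustration: the function name, the equilibrium value `v_star` and the tolerance `tol` are assumptions, while Vdd = 1.8V is the supply value quoted later in the text.

```python
def wta_vout(i_in1, i_in2, vdd=1.8, v_star=0.9, tol=0.0):
    """Idealised static model of the modified WTA with voltage output.

    v_star (equilibrium output) and tol (width of the equality region)
    are illustrative assumptions, not values from the thesis.
    """
    diff = i_in1 - i_in2
    if abs(diff) <= tol:
        return v_star          # balanced inputs: intermediate equilibrium point
    return vdd if diff > 0 else 0.0


for pair in ((6e-9, 5e-9), (5e-9, 5e-9), (5e-9, 6e-9)):
    print(pair, "->", wta_vout(*pair), "V")
```

A non-zero `tol` can be used to mimic the finite gain of the real circuit, where the transition between the two rails is steep but not infinitely sharp.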
The temporal dynamics of this modified WTA circuit are simulated and plotted in Figure 2.20.
Figure 2.19: A modified WTA with voltage output. All the MOS are sized Z = 2 except for M8, which is sized Z = 4.
Figure 2.20: The enhanced WTA time simulation (Iin1 [nA] and Vout [V] versus time [ms]) with Itail = 10nA = Iref. The solid line is the voltage output while the dashed curve is the current input. All MOS have Z = 2 except M8, which has Z = 4.
CHAPTER 3
THE FILTER DESIGN
In this chapter I first restate the homeostatic principle in engineering terms, providing a transition from the biological domain to the silicon domain. The key point is to adopt a standard AGC architecture. Then I move on to topology design considerations for the LPF needed in the AGC, with an emphasis on the implementation level. As already announced, I report in this thesis even some intermediate solutions that did not work well for our application but still highlight circuit limitations. In particular, I report here two intermediate designs (Design 1 and Design 2) that were not suitable to obtain very small cut-off frequencies. The final design (Design 3), which gives good results, is presented at the end of the chapter and is the adopted solution that successfully implements very long time constants on silicon.
3.1 The Automatic Gain Control
The Automatic Gain Control (AGC) is a circuit that adjusts the gain of an amplifier in order to keep a constant output amplitude in the face of variations of the amplifier input. This type of circuit is widely known in industry and is commonly used in hard disk drive read channels, CCD sensors for medical and multimedia systems, wireless receivers and so on. It is basically a negative feedback loop, and one way to implement this automatic gain correction is to sense the output signal of the amplifier and modify the gain of the amplifier accordingly. This is the feedback AGC implementation; its opposite is the feedforward implementation, which senses the input of the amplifier instead of the output. However, the key element of the AGC circuit (both feedback and feedforward) is an LPF that averages the amplifier output signal (or the input, in the feedforward realization) and provides a smooth, low frequency signal suitable to be used as a gain control command. This value is used to set the gain of the amplifier so that the gain increases if the output signal strength is too low and decreases if the sensed output strength is too high.
From this brief discussion, and recalling Section 1.4, it is clear that the AGC is the engineering implementation of the homeostatic principle present in real neurons. In fact, the neuron can be seen as an amplifier with the pre-synaptic current Iin as the input and the spike frequency rate fout as the output. This relation can be assumed to be linear, and its slope (the gain) is set by internal parameters of the circuits and by Vthr (Ip_thr) of the DPI.
Thus, to implement the homeostatic principle in silicon, the proper solution is to sense the output firing rate of the neuron, compare it with a frequency reference, process the difference with an LPF and feed this processed signal back to the synapse by changing its Vthr. This solution is depicted by a block diagram in Figure 3.1.
Figure 3.1: Block diagram of the AGC loop applied to an artificial neuron of the MN256r1 chip. Each neuronal instance consists of the cascade of a DPI synapse and an integrate and fire neuron.
However, it would be computationally expensive to extract the frequency from an analog signal consisting of spikes (such as fout) and compare it to a fixed frequency fref. One possible implementation would be to add a further filter that converts the fast spikes of fout into a low-frequency current or voltage whose value is proportional to the neuronal firing rate. This additional filter, together with the AGC LPF and the DPI integrator, results in a more complex third order loop, whose stability is not guaranteed.
These problems arise because the information is taken at the very output of the neuron. But can we do something to avoid them?
Recalling Figure 1.2, since fout = f(Isyn) is assumed to be linear for small variations around the stable point, fout is a scaled version of Isyn with constant gain, hence fout = hIsyn, where h ∈ R+. So, fout and Isyn carry the same information. Given that observation, the AGC that implements the homeostatic principle is not required to sense fout, since the information can instead be recovered from Isyn. This is a great advantage because it simplifies the AGC circuit implementation. The block diagram of the simplified AGC that implements the homeostatic principle in silicon is given in Figure 3.2.
Figure 3.2: Block diagram of the simplified AGC implementation of the homeostatic principle on an MN256r1 neuronal instance.
In this figure, Iref is the reference current to which the system state converges once the AGC dynamics are over. In fact, if Iref − Isyn is not zero, this difference is processed by the LPF, and the LPF output affects the DPI gain via Ip_thr (i.e. Vthr). This changes Isyn in a direction (negative feedback) that results in Iref = Isyn, and thus in a constant Ip_thr.
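The convergence towards Iref = Isyn can be illustrated with a toy numerical model of the loop. The Python sketch below is not the circuit: it assumes that the synapse is a simple variable-gain stage (Isyn = gain · Iin) and that the slow control path behaves as a pure integrator of the normalised error, so that the static error vanishes. The function name, the loop constant `k`, the time step and the current values are all assumptions chosen for illustration.

```python
def simulate_agc(i_in, i_ref, gain0=1.0, k=0.5, dt=0.01, steps=5000):
    """Toy AGC loop: an integrating controller trims the synaptic gain.

    All names and parameter values are illustrative assumptions.
    """
    gain = gain0
    i_syn_trace = []
    for _ in range(steps):
        i_syn = gain * i_in                 # synapse as a variable-gain stage
        error = (i_ref - i_syn) / i_ref     # normalised Iref - Isyn
        gain += k * error * dt              # slow control path (forward Euler step)
        i_syn_trace.append(i_syn)
    return gain, i_syn_trace


final_gain, trace = simulate_agc(i_in=20e-9, i_ref=10e-9)
print(f"final gain = {final_gain:.3f}, final Isyn = {trace[-1] * 1e9:.3f} nA")
```

With an input current twice as large as the reference, the gain settles to one half and Isyn settles to Iref, which is the negative feedback behaviour described in the text.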
Note that now we can't directly set the firing rate of the neuron; we can only set its synaptic current. This means that, in order to get the desired fout, Iref must be set accordingly, i.e. $I_{ref} = \frac{f_{out,ref}}{h}$. In practice, this is not an issue since Iref is a manual bias.
If the assumption fout = hIsyn is not true, the real average output firing rate differs from the desired one by an error. This error is only static if fout = h∗Isyn with h∗ ≠ h and h∗ ∈ R, or dynamic if f(Isyn) can't be approximated with a linear equation. However, high precision in fref is not required in our application, both concerning its absolute value and its variance between neuronal instances in the chip.
As already highlighted, the dynamics of the homeostatic principle must be kept several orders of magnitude slower than the dynamics of the learning mechanisms, i.e. the DPI dynamics. This point is a key specification, and almost all the effort along the design path was concentrated here. However, this practical consideration is also very important for the AGC stability. In fact, referring to Figure 3.2, the AGC loop is a second order system. One pole is associated with the DPI, and the other comes from the LPF. These two poles could make the system potentially unstable, since the phase margin can be very close to 0°. Fortunately, this is avoided in our circuit since the big difference between the two cut-off frequencies yields poles very distant from each other, which lets the system be modelled as a dominant pole system. In fact, for a given loop gain, only the dominant pole sets the bandwidth of the loop and the phase margin is around 90°. Since the influence of the second pole is negligible if the poles are spaced by decades in frequency, it doesn't degrade the phase margin.
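The dominant pole argument can be checked numerically on a generic two-pole loop. The Python sketch below assumes a loop gain of the form L(s) = k / ((1 + s/p1)(1 + s/p2)); the DC gain k and the pole frequencies are arbitrary illustrative values, not parameters of the actual AGC, and the function names are hypothetical.

```python
import math


def loop_gain_mag(w, k, p1, p2):
    """Magnitude of L(jw) = k / ((1 + jw/p1)(1 + jw/p2))."""
    return k / (math.hypot(1.0, w / p1) * math.hypot(1.0, w / p2))


def phase_margin_deg(k, p1, p2):
    """Phase margin of the two-pole loop; requires k > 1 so a crossover exists."""
    lo, hi = 1e-12 * p1, 1e12 * max(p1, p2)   # bracket of the unity-gain frequency
    for _ in range(200):                        # bisection on a logarithmic scale
        mid = math.sqrt(lo * hi)
        if loop_gain_mag(mid, k, p1, p2) > 1.0:
            lo = mid
        else:
            hi = mid
    wc = math.sqrt(lo * hi)
    phase = -math.atan(wc / p1) - math.atan(wc / p2)
    return 180.0 + math.degrees(phase)


pm_far = phase_margin_deg(k=100.0, p1=1.0, p2=1e6)    # poles six decades apart
pm_close = phase_margin_deg(k=100.0, p1=1.0, p2=10.0)  # poles one decade apart
print(f"far poles: {pm_far:.1f} deg, close poles: {pm_close:.1f} deg")
```

With the second pole far beyond the unity-gain frequency the phase margin stays close to 90°, while bringing the poles within one decade of each other erodes it markedly for the same DC gain.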
3.2 The Core Idea
In order to obtain ultra long time constants using a small area (i.e. small capacitors), a generator of tiny (femto to atto ampere) currents is required to charge and discharge the filter capacitor. Usually, a current generator consists of a MOS device with the gate and source potentials set by Equation 2.19 in order to get the desired current Id. However, this implementation has important limitations concerning the minimum obtainable current Id, due to unwanted leakage mechanisms that easily dominate in such low current regimes.
As already anticipated in Section 2.2, tiny currents can be obtained by applying the rules of thumb for low leakage cell design. In fact, the measurements in [6] and [7] show that such a technique can give a leakage, hence a current, down to 5 electrons per second in modern CMOS processes. In their papers, the authors apply their results to a sample and hold cell and to a switched capacitor filter only. Indeed, they never attempted to use their results to build a tiny current generator.
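To get a feeling for the magnitudes involved, the leakage figure quoted above can be converted into amperes and into the time needed to move a capacitor voltage. The Python sketch below is only a back-of-the-envelope illustration: the 1 pF capacitor and the 100 mV voltage change are assumed values, and the function names are hypothetical.

```python
Q_E = 1.602176634e-19   # elementary charge [C]


def electrons_per_second_to_amps(rate):
    """Convert a charge flow expressed in electrons per second into amperes."""
    return rate * Q_E


def slew_time(capacitance, delta_v, current):
    """Time needed by a constant current to change a capacitor voltage by delta_v."""
    return capacitance * delta_v / current


i_leak = electrons_per_second_to_amps(5.0)   # slightly less than 1 aA
t_slew = slew_time(1e-12, 0.1, i_leak)       # assumed: 1 pF capacitor, 100 mV change
print(f"I = {i_leak:.3e} A, time for 100 mV on 1 pF = {t_slew / 3600:.1f} h")
```

The point of the exercise is that, at such current levels, even a picofarad-sized capacitor yields dynamics on the scale of hours, which is why the leakage of every other device connected to the capacitor becomes critical.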
From the structure they presented, no straightforward derivation of a femto to atto ampere current generator with one floating terminal can be obtained. In fact, it is not easy to generate such low leakage currents and, at the same time, let those tiny currents flow into a capacitor without being affected by other device currents. This behaviour is mandatory in order to set temporal dynamics and hence build filters.
A possible implementation of a tiny current generator, which I developed on purpose for this thesis and which is based on the low leakage design rules of thumb, is shown in Figure 3.3.
That circuit comprises a pMOS biased as a low leakage current source via two ideal amplifiers A1 and A2 that actively satisfy two bullets (the 2nd and the 3rd) of the low leakage design rules of thumb. In fact, A2 forces the Vdb voltage to zero, reducing the drain-bulk leakage current Idb. Meanwhile, A1 forces the drain and source terminals to the same potential, cancelling the source-drain coupling in accumulation mode.
The gate terminal is tied to Vdd (or to the maximum voltage available in the chip). Hence, if Vc = Vd = Vb = Vs ranges from ground to Vg − 400mV = 1.8V − 0.4V = 1.4V, the pMOS is biased in accumulation mode and subthreshold conduction is avoided. In this scheme, the only current that can flow out of the drain terminal of the pMOS goes into one capacitor terminal. Since this terminal is not tied to any fixed potential, the capacitor can be charged or discharged according to the sign of the pMOS current Id.
Figure 3.3: A low current generator with one floating node. This structure is inspired by the concepts and measurements of [6] and [7].
If the pMOS is biased in accumulation mode and the other Low Leakage Cell (LLC) rules of thumb are satisfied, we see from Figures 2.5 and 2.6 that the tiny current Id is Vgs invariant and almost constant if the device experiences small variations of Vds. The fact that Id is Vgs invariant is very useful, since the source, bulk and drain potentials are imposed by the charge stored on the capacitor and will surely vary according to the filter dynamics in normal circuit operation.
However, the first bullet of the LLC pMOS requirements (Vgs > 400mV) is not guaranteed by any topological condition and can be violated accidentally if the designer does not take it into account at a higher level. If Vgs < 400mV, the Ids term increases exponentially due to subthreshold conduction, which considerably increases the Id current. This is a situation that must be avoided by any means in steady state operation.
Even though the low leakage rules of thumb hold both for nMOS and pMOS, in my circuit I chose to implement the Low Leakage Cell with a pMOS for two reasons.
The first is that I needed independent access to and control of the bulk terminal of the MOS in order to reduce the leakage. As already pointed out in Section 2.3, our AMS 0.18µm process is an n-well process, hence only pMOS can be made in a separate well. On the contrary, all nMOS share their bulk, which prevents controlling it independently. The second is that the pMOS has lower conductivity than nMOS devices, which is beneficial here. This is because hole mobility is lower than electron mobility, µh < µe.
Additionally, I want to stress that the Vc node, the one that carries the information, is really sensitive and must be handled with care. Any electrical device connected to it must exhibit a really high impedance seen from that node. This condition is mandatory in order not to insert additional leakage sources that would otherwise cancel all the LLC pMOS improvements and degrade the circuit performance in terms of cut-off frequency.
This particular care has been applied to my circuit of Figure 3.3. In fact, the node Vc is connected only to the two gates of the amplifiers. Statically, this connection can leak currents in the order of a few atto amperes (simulated), due to second order phenomena such as tunnelling into and through the gate oxide and the injection of hot carriers from the substrate into the gate oxide [8]. However, the Vc node can also be affected dynamically by capacitive coupling. This is a serious problem if the circuit contains digital-like (control) signals or fast analogue dynamics, such as spikes. Thanks to careful design, that is not our case, except for one control signal Vcontrol that will be introduced and explained later.
Given that, one should avoid connecting anything else to that sensitive node, in order not to interfere and introduce additional leakage mechanisms. If needed in order to close the AGC loop, the only thing allowed is to connect to Vc the gate of a MOS that is not topologically close to signals with fast dynamics.
However, even though dangerous signals are distant from the topology point of view, this may not be guaranteed in the layout. So, additional care must be taken during this last design phase, using shields and guard rings.
Note that the drain and source terminals of the low leakage cell are somewhat arbitrary in the scheme of Figure 3.3. In fact, by definition, they are relative to each other and depend on their voltage magnitudes. With ideal amplifiers, they are impossible to label since Vd = Vs. However, in the real world we deal with real amplifiers made of OPAMPs, and hence with all their limitations, such as finite gain, distortion, small output resistance, additional poles, and so on. In this case, important nonidealities affect the voltage difference between the noninverting terminal of the OPAMP, Vin+, and its output, Vout. This difference Vdiff = Vin+ − Vout = Vfinite_gain + Voffset is the combination of the contribution Vfinite_gain of the finite gain of the OPAMP and of the offset voltage Voffset. That offset voltage is mainly due to internal device process mismatch, Early effect and other nonidealities.
The usual low frequency model of a real OPAMP is
$$V_{out} = (V_{in+} - V_{in-}) A \qquad (3.1)$$
where the gain A is finite. If the OPAMP is connected in feedback as a unity gain voltage follower, we can write:
$$W = \frac{V_{out}}{V_{in+}} = \frac{A}{1 + A} \qquad (3.2)$$
Note that if A = ∞ we get the ideal OPAMP relation Vout = Vin+. But if A is finite (A < ∞), then Vout < Vin+.
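The size of the finite gain error predicted by Equation 3.2 is easy to tabulate. The Python sketch below evaluates Vin+ − Vout for the gain values used in the simulations of Figure 3.4; the 1 V input level is an assumption chosen for illustration, the offset voltage is ignored, and the function name is hypothetical.

```python
def follower_output(v_in, gain):
    """Equation 3.2: output of a unity gain follower with finite gain A."""
    return v_in * gain / (1.0 + gain)


V_IN = 1.0   # assumed input level [V]
errors_mv = {}
for a in (100, 500, 1_000, 5_000, 10_000):   # gain values used in Figure 3.4
    errors_mv[a] = (V_IN - follower_output(V_IN, a)) * 1e3
    print(f"A = {a:>6}: Vin+ - Vout = {errors_mv[a]:.3f} mV")
```

The error equals Vin+ / (1 + A), so it shrinks roughly in inverse proportion to the gain.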
However, despite this relation, we cannot be sure whether Vout is higher or lower than Vin+, because of the additional term Voffset. That term is a poorly controllable value (within certain boundaries) and can be either positive or negative for each instance of the OPAMP. Hence, the tiny accumulation mode current Ids can flow either upwards or downwards in the pMOS according to the sign of Vdiff of OPAMP A1. Additionally, Vdiff of OPAMP A2 sets the magnitude and the sign of the leakage current Idb. Hence, the sign and the magnitude of the final current Id, which is responsible for altering the capacitor charge, are affected by the nonidealities of both OPAMPs A1 and A2.
Previously, I stated that Vs ≈ Vd ≈ Vb must range from ground to Vdd − 0.4V = 1.4V in order to keep the pMOS in accumulation mode. Hence, this LLC pMOS consideration gives only an upper bound if the OPAMPs are ideal. However, other important limitations of OPAMPs are their input and output voltage swings. In fact, unless an advanced design has been carried out on purpose, such as [24], OPAMPs suffer from limitations on the applicable input voltage. Recall that the input stage of a simple OPAMP is usually made of a single differential pair. In the nMOS differential pair case, a minimum DC Vin+ is required in order to bias the pair in the correct region (saturation). This results in a lower bound on the OPAMP working condition. On the contrary, if the differential pair consists of pMOS, the working condition gives an upper bound.
As a last point, I also want to note that two distinct OPAMPs are redundant in this particular circuit. In fact, they share both the input terminals, Vin+A1 = Vin+A2 = Vc, and the output terminals, VoutA1 = VoutA2, hence a single OPAMP would be sufficient for the task. However, the circuit is presented in this form because in later sections A1 and A2 can no longer be merged, and two separate and independent OPAMPs are thus required.
Figure 3.4: Id vs Vc plot of the circuit of Figure 3.3, parametrized in A, the gain of both OPAMPs A1 and A2 (A = 100, 500, 1k, 5k, 10k). Note that, if Vc is high (above 1.5V), the drain current increases exponentially. This means that the subthreshold conduction is becoming dominant, as shown in the real chip measurements of Figure 2.4. Additionally, the current magnitude is modulated by the OPAMP gain. In fact, the lower the gain, the more nonideal the OPAMP and the larger the value Vin+ − Vout, which increases Id as well.
3.3 Filter Design 1 - The DPI
Since in the AGC loop a LPF is required, one straightforward solution is to combine the
DPI circuit of Section 2.4 with the Low Leakage Cell of Section 2.2.
Figure 3.5: The schematic of the first LPF design attempt. The Iτ current generator is here implemented by the LLC.
In fact, the idea is to replace Iτ with the LLC, connecting the Vc terminal to the gate and drain of Md of Figure 2.10. According to the DPI dynamics equation $\tau = \frac{C U_T}{\kappa I_\tau}$, this topology should give very long time constants. Unfortunately, this is not true, due to second order effects and nonidealities of the DPI circuit that were not taken into account in the analysis of Section 2.4. In fact, the bulk terminal of transistor Md is connected to ground and is shared among all nMOS devices in an n-well process. Hence, a positive Vdb is set, which drains a leakage current Idb in the order of 50fA. This contribution is negligible if Iτ is well above it, i.e. if Iτ >> Idb_Md, and the time constant is then set only by Iτ. But if Iτ is comparable with Idb_Md, the time constant is altered by the drain-bulk junction current, which effectively limits the DPI performance: the capacitor is no longer charged solely by Iτ but is dominated by Idb_Md, and $\tau = \frac{C U_T}{\kappa I_\tau}$ doesn't hold any more.
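The saturation of the time constant can be illustrated with a deliberately crude numerical model. The Python sketch below evaluates the ideal expression τ = C UT / (κ Iτ) and compares it with a version in which the leakage is simply added to Iτ. This additive model is my simplifying assumption for illustration, not a result derived in this work; κ and UT are assumed typical values, while C = 10pF and Idb ≈ 50fA are the values quoted for the simulations. The function names are hypothetical.

```python
KAPPA = 0.7     # subthreshold slope factor (assumed typical value)
U_T = 0.025     # thermal voltage [V] (assumed)
C = 10e-12      # filter capacitor [F]
I_DB = 50e-15   # simulated drain-bulk leakage of Md [A]


def tau_ideal(i_tau):
    """Ideal DPI time constant, tau = C * UT / (kappa * Itau)."""
    return C * U_T / (KAPPA * i_tau)


def tau_with_leakage(i_tau, i_db=I_DB):
    """Crude assumption: the leakage adds to Itau in setting the time constant."""
    return C * U_T / (KAPPA * (i_tau + i_db))


for i_tau in (100e-15, 50e-15, 10e-15, 5e-15, 1e-15):
    print(f"Itau = {i_tau * 1e15:5.0f} fA: ideal tau = {tau_ideal(i_tau):7.1f} s, "
          f"with leakage = {tau_with_leakage(i_tau):5.2f} s")
```

In this simplified picture the ideal time constant keeps growing as Iτ is reduced, whereas the value including leakage flattens out once Iτ falls below Idb_Md, which is qualitatively the behaviour described above.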
Note that, even though a two well process would allow a separate p-well for the nMOS Md, Idb can't be minimized by forcing Vd = Vb. In fact, that would forward bias the bulk to source junction of Md.
A time simulation of Filter Design 1 is shown in Figure 3.6. It is clear that the circuit time constant increases as Iτ lowers, but at a certain point it stops increasing and stays constant. This happens between the curves that correspond to Iτ = 10fA and Iτ = 5fA. In fact, below Iτ = 10fA, the charging current is no longer dominated by Iτ but by Idb_Md, which is insensitive to the value of Iτ.
Additionally, as Iτ lowers, the DPI gain lowers as well, according to the DPI transfer function of Equation 2.37.
Figure 3.6: Time simulation of the DPI Iout [µA] versus time [ks] for Iτ = 100, 50, 10, 5 and 1 fA, given an Iin square wave from 5nA to 6nA. Ip_thr = 1nA, C = 10pF. The simulated Idb_Md ≈ 50fA.
3.4 Filter Design 2 - The Classical log-domain
The problem of the previous design arose because the drain and source terminals of a critical MOS were connected to different potentials. Still remaining within log-domain filtering, a different topology based on [9] has been analysed. This architecture is shown in Figure 3.7 and comprises four pMOS, three current generators and the usual capacitor.
If all transistors M1 to M4 are biased in the subthreshold saturation region, we can analyse the circuit behaviour by applying the translinear loop technique. This topology has the pMOS M3, which is functionally equivalent to Md of the DPI. However, in this topology M3 has Vsb = 0, since the pMOS bulk is accessible and can hence be connected to the source. This connection can be made for each of the four pMOS that form the translinear loop (M1 to M4), which removes the body effect and yields the same κ parameter for all pMOS. As already emphasized, if the κ term is not constant across all translinear elements, distortion increases.
Figure 3.7: The schematic of the second design attempt.
Let's now derive its transfer function. Following the red translinear loop and applying KVL gives:
$$V_{gs1} + V_{gs2} = V_{gs3} + V_{gs4} \qquad (3.3)$$
For simplicity, assume that $\frac{W}{L} = Z$ is equal for all the pMOS. Applying Equation 2.28 gives:
$$I_{in} I_g = (I_\tau + i_C) I_{out} \qquad (3.4)$$
Recalling that $i_C = \frac{C U_T}{\kappa} \frac{1}{I_{out}} \frac{dI_{out}}{dt}$, Equation 3.4 becomes:
$$I_{in} I_g = I_\tau I_{out} + \frac{C U_T}{\kappa} \frac{dI_{out}}{dt} \qquad (3.5)$$
Defining $\tau = \frac{C U_T}{\kappa I_\tau}$ and applying the Laplace transform yields:
$$\frac{I_{out}(s)}{I_{in}(s)} = \frac{I_g}{I_\tau} \frac{1}{(1 + s\tau)} \qquad (3.6)$$
Note that Equation 3.6 is equivalent to Equation 2.37 of the DPI with Ig = Ip_thr.
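As a sanity check of the derivation, Equation 3.5 can be integrated numerically and compared with the first order step response implied by Equation 3.6. The Python sketch below uses a plain forward Euler scheme. κ, UT and Iτ = 1pA are assumptions (Iτ is chosen large enough for the ideal translinear behaviour to be plausible), while C = 10pF, Ig = 1nA and the 5nA to 6nA input step are borrowed from the simulation settings reported in the figure captions. The function names are hypothetical.

```python
import math

KAPPA = 0.7      # subthreshold slope factor (assumed)
U_T = 0.025      # thermal voltage [V] (assumed)
C = 10e-12       # filter capacitor [F]
I_G = 1e-9       # gain current Ig [A]
I_TAU = 1e-12    # Itau [A], assumed value

TAU = C * U_T / (KAPPA * I_TAU)   # time constant of Equation 3.6


def euler_step_response(i_in_before, i_in_after, t_end, dt):
    """Integrate Equation 3.5 after an input step, starting from steady state."""
    i_out = i_in_before * I_G / I_TAU
    steps = int(round(t_end / dt))
    for _ in range(steps):
        # Equation 3.5 solved for dIout/dt
        di_dt = KAPPA / (C * U_T) * (i_in_after * I_G - I_TAU * i_out)
        i_out += di_dt * dt
    return i_out, steps * dt


def analytic_step_response(i_in_before, i_in_after, t):
    """First order exponential response predicted by Equation 3.6."""
    start = i_in_before * I_G / I_TAU
    final = i_in_after * I_G / I_TAU
    return final + (start - final) * math.exp(-t / TAU)


i_num, t_sim = euler_step_response(5e-9, 6e-9, t_end=TAU, dt=1e-4)
i_exact = analytic_step_response(5e-9, 6e-9, t_sim)
print(f"tau = {TAU:.3f} s, Euler = {i_num:.6e} A, analytic = {i_exact:.6e} A")
```

The two results agree closely after one time constant, which confirms that Equations 3.5 and 3.6 describe the same first order low pass behaviour with DC gain Ig/Iτ.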
Since the source and bulk terminals of M3 are tied together (Figure 3.7), ideally no current Isb flows in that junction. However, that is not enough, since the drain to bulk junction is reverse biased and, due to KCL, the current Idb must flow out of the bulk and hence through the capacitor, altering its charge. An idea to fix this problem is presented in Figure 3.8 and consists of a unity gain OPAMP that senses the source voltage of M3 and sets the bulk of M3 accordingly. If the OPAMP is ideal, the condition Vs = Vb is once again satisfied. The difference from before is that the drain to bulk current that flows in the bulk now comes from the output stage of the OPAMP instead of from the capacitor.
So, this circuit too was a good candidate to implement the low pass filter with the desired ultra long time constants. A schematic of the simulated circuit is shown in Figure 3.8 and once again consists of the union of the schematics of Figure 3.7 and Figure 3.3.
Unfortunately, even though that problem is now fixed, another issue arises. In fact, at steady state the tiny current Iτ flows in transistor M3. Since Iτ is the current that flows both in MLLC and in M3, they must have similar operating points, i.e. for very small Iτ, M3 is in accumulation mode too, and hence the filter doesn't behave as predicted by Equation 3.6 (the assumption that all the MOS are in the saturation region doesn't hold any more).
The simulations of Figure 3.9 show that the performance of Design 2 is better than that of Design 1 in terms of time constants, but the circuit still exhibits problems and can't work if Iτ is too low. This is reasonable since the translinear loop can no longer be exploited. So, I concluded that this second design can't work properly as a filter either, due to intrinsic problems that can't be fixed with this topology.
From these attempts I see that the challenging part of the project was not only to generate really low currents, as I thought at the very beginning of my thesis work, but also to be able to exploit them in an effective low cut-off filter.
These and several other circuits were the object of my studies and simulations. However, none of the conventional filters that I took into account was suitable for ultra low currents and hence ultra long time constants. These intrinsic hindrances urged me to develop a new solution from scratch. The proposed Design 3, the one with which I got the best results, is discussed below in Section 3.5.
Figure 3.8: The schematic of the second design attempt. The Iτ current generator is here implemented by the LLC.
Figure 3.9: Time simulation of the output current Iout [µA] versus time [ks] of the circuit of Figure 3.8 for Iτ = 100, 50, 10, 5 and 1 fA. Iout is obtained by applying an Iin square wave from 5nA to 6nA. Ip_thr = 1nA, C = 10pF.
3.5 Filter Design 3 - The Unbalanced Structure
Since the previous filter designs yielded poor performance that didn't meet the specifications, a new approach to the problem was developed. I decided to obtain ultra long time constants with a different architecture that, as far as I know, has never been used before.
In fact, even though the circuit of Figure 3.3 of Section 3.2 worked fine, issues arose when I tried to connect it to other circuits in order to obtain a filter. Given this, my idea was to develop a filter-like circuit built around the circuit of Figure 3.3 instead of trying to insert it into an existing filter topology.
Recalling Figure 3.3 and the simulation plot in Figure 3.4, if the OPAMPs are completely ideal, the circuit is balanced. But if the OPAMP A1 has an offset (no matter its nature), then a small current can flow upwards or downwards in the LLC pMOS according to the magnitude and sign of the OPAMP offset. Hence, my idea was to somehow get control of that offset and use it to set the sign of the current in the LLC pMOS and hence in the capacitor.
As a result, thanks to the tiny currents, this unbalanced structure yields a circuit with ultra long time constants where the sign of $\frac{dV_C}{dt}$ is decided by a control signal (Vcontrol) that introduces an offset in the OPAMP on purpose.
Given this consideration, a circuit that implements this idea is shown in Figure 3.10. The LLC part is exactly the same as presented in Figure 3.3, but this structure has additional external circuitry that sets small voltage differences Vds across the LLC according to the value of Vcontrol.
In fact, the nMOS are sized as follows: $M_1 = \frac{1\mu m}{1\mu m}$, $M_2 = \frac{2\mu m}{1\mu m}$ and $M_3 = \frac{3\mu m}{1\mu m}$, and via the input terminal Vcontrol it is possible to decide the sign and the magnitude of the pMOS current Id. This external circuitry has the same effect as modifying the OPAMP internal offset in the relation Vdiff = Vin+ − Vout = Vfinite_gain + Voffset of Section 3.2.
Figure 3.10: Schematic of the Core Idea plus the additional circuit that gives the Unbalanced Structure required to get long time constants.
Let's now assume Vcontrol = Vdd. Hence, if Ibias >> I0, the M4 and M3 branch is negligible. Thus the following KVL holds:
$$V_B - V_C = V_{gs\_2} - V_{gs\_1} \qquad (3.7)$$
Assuming that the Ibias current generators are equal and that the nMOS sizes are matched, it is true that:
$$V_B - V_C = V_{gs\_2} - V_{gs\_1} = \frac{U_T}{\kappa} \ln\left(\frac{I_{bias}}{2 I_0}\right) - \frac{U_T}{\kappa} \ln\left(\frac{I_{bias}}{I_0}\right) = \frac{U_T}{\kappa} \ln\left(\frac{1}{2}\right) \approx -25\,\mathrm{mV} \qquad (3.8)$$
Hence, since the leakage current Idb is minimized by A2, only a tiny current flows in MLLC from VC to VB, discharging the capacitor C.
Let's now study the Vcontrol = 0V case. Now M4 is in the above threshold linear region, hence it can be modelled as a low impedance wire. Therefore, M1 and M3 are two devices in parallel and can be treated as a single device with dimensions $\frac{1\mu m}{1\mu m} + \frac{3\mu m}{1\mu m} = \frac{4\mu m}{1\mu m}$.
Inserting this result in Equation 3.7 yields:
$$V_B - V_C = V_{gs\_2} - V_{gs\_1} = \frac{U_T}{\kappa} \ln\left(\frac{I_{bias}}{2 I_0}\right) - \frac{U_T}{\kappa} \ln\left(\frac{I_{bias}}{4 I_0}\right) = \frac{U_T}{\kappa} \ln\left(\frac{2}{1}\right) \approx +25\,\mathrm{mV} \qquad (3.9)$$
This means that, if Vcontrol is low, the tiny current flows from the VB terminal to the VC terminal.
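The two extreme cases of Equations 3.8 and 3.9 reduce to the logarithm of a size ratio, which can be verified with a few lines of Python. In this minimal sketch κ = 0.7 and UT = 25mV are assumed typical values (they are not stated alongside the equations), and the function name is hypothetical.

```python
import math

KAPPA = 0.7    # subthreshold slope factor (assumed typical value)
U_T = 0.025    # thermal voltage [V] (assumed)


def vb_minus_vc(z1_effective, z2=2.0):
    """VB - VC = Vgs_2 - Vgs_1 = (UT / kappa) * ln(Z1 / Z2) for equal bias currents.

    z1_effective is the aspect ratio seen on the M1 side, z2 that of M2.
    """
    return (U_T / KAPPA) * math.log(z1_effective / z2)


v_control_high = vb_minus_vc(1.0)         # Vcontrol = Vdd: only M1 conducts
v_control_low = vb_minus_vc(1.0 + 3.0)    # Vcontrol = 0 V: M1 and M3 in parallel
print(f"Vcontrol = Vdd: {v_control_high * 1e3:+.1f} mV")
print(f"Vcontrol = 0 V: {v_control_low * 1e3:+.1f} mV")
```

With the chosen sizing the two cases give voltages of equal magnitude and opposite sign, which is what makes the structure symmetric around Vds = 0.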
I reported here only the analysis of the two extremes values of Vcontrol. However, there is
a certain value Vcontrol∗ somewhere in the middle of the Vcontrol ranges that gives the VB = VC
condition and hence Id = 0. To calculate the exact value of Vcontrol∗ at which this happens
is not straightforward. However, it's not so meaningful since its severely altered by device
nonidealities and device mismatch in the AGC loop.
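The two limiting cases of Equations 3.8 and 3.9 can be checked numerically. The following is a minimal Python sketch, assuming UT = 25mV and κ = 0.7 (illustrative values, not taken from this section); the helper names v_gs and delta_v and the value of I0 are hypothetical. Since I0 and Ibias cancel in the difference, only the W/L ratios matter.

```python
import math

U_T = 0.025    # thermal voltage [V], assumed value
KAPPA = 0.7    # subthreshold slope factor, assumed value


def v_gs(i_bias, w_over_l, i_0=1e-15):
    """Subthreshold gate-source voltage of an nMOS carrying i_bias.

    i_0 is a placeholder value; it cancels in delta_v().
    """
    return (U_T / KAPPA) * math.log(i_bias / (w_over_l * i_0))


def delta_v(w_over_l_m1_eff, w_over_l_m2=2.0, i_bias=10e-9):
    """V_B - V_C = V_gs_2 - V_gs_1 (Equation 3.7)."""
    return v_gs(i_bias, w_over_l_m2) - v_gs(i_bias, w_over_l_m1_eff)


# Vcontrol = Vdd: only M1 (W/L = 1) conducts.
print(f"Vcontrol = Vdd: {delta_v(1.0) * 1e3:+.1f} mV")
# Vcontrol = 0 V: M1 and M3 in parallel (W/L = 1 + 3 = 4).
print(f"Vcontrol = 0 V: {delta_v(4.0) * 1e3:+.1f} mV")
```

With these assumed values both cases come out at about 25mV in magnitude, with opposite signs, as in the hand analysis.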
Additionally, the MLLC absolute current is modulated by the value |VB − VC|. This modulation is hard to predict via simulations at such extreme pMOS bias conditions. However, as explained later, thanks to careful design the exact Ids vs VB − VC relation does not severely affect the behaviour of the homeostatic circuit.
Let's now refine Equation 3.7 by modelling the current generator mismatch with two current generators of different values, Ibias and Ibias + ∆Ibias, where the term ∆Ibias comprises physical size mismatch and second order effects such as the Early effect. The size mismatch of transistors M1, M2 and M3 can be modelled as a deviation of α from its ideal value, so that α is rewritten as α + ∆α. Finally, the κ term, in this topology, depends on the difference between the source terminal voltages of the MOS, i.e. on Vdiff. Given that, we can write:
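\[
V_B - V_C = \frac{U_T}{\kappa + \Delta\kappa}\ln\left(\frac{\alpha + \Delta\alpha}{2}\left(1 + \frac{\Delta I_{bias}}{I_{bias}}\right)\right) + V_{diff\_A1} \tag{3.10}
\]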
where deviations of κ from ideality result in an increase or decrease of the Vds range of the LLC pMOS. On the contrary, ∆α, ∆Ibias and Vdiff_A1 result in a shift of the Vds range. The latter can be a serious issue, especially regarding the α and Vdiff_A1 terms. In fact, if these two components are significant, Vds may always have the same sign, no matter what the Vcontrol input is. The current in the capacitor could then flow in only one direction, making the circuit useless.
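The effect of each term of Equation 3.10 on the Vds range can be explored with a short Python sketch. It assumes UT = 25mV and κ = 0.7, takes α = 1 for Vcontrol = Vdd and α = 4 for Vcontrol = 0V, and applies the same ∆α to both cases; these choices, the function names and the example offsets are illustrative assumptions, not values from the design.

```python
import math

U_T = 0.025  # thermal voltage [V], assumed value


def vds_llc(alpha, kappa=0.7, d_kappa=0.0, d_alpha=0.0,
            d_ibias_rel=0.0, v_diff_a1=0.0):
    """Equation 3.10: V_B - V_C including nonidealities.

    alpha is the effective W/L on the M1 side
    (1 for Vcontrol = Vdd, 4 for Vcontrol = 0 V);
    d_ibias_rel is the relative mismatch dIbias / Ibias.
    """
    arg = (alpha + d_alpha) / 2.0 * (1.0 + d_ibias_rel)
    return U_T / (kappa + d_kappa) * math.log(arg) + v_diff_a1


def vds_range(**deviations):
    """Return (Vds at Vcontrol = Vdd, Vds at Vcontrol = 0 V)."""
    return vds_llc(1.0, **deviations), vds_llc(4.0, **deviations)


def is_bidirectional(**deviations):
    """True if Vds can take both signs, so C can charge and discharge."""
    low, high = vds_range(**deviations)
    return low < 0.0 < high


print(vds_range())                        # ideal case
print(is_bidirectional(v_diff_a1=0.010))  # hypothetical 10 mV shift
print(is_bidirectional(v_diff_a1=0.030))  # hypothetical 30 mV shift
```

In this sketch a deviation of κ only scales the range symmetrically, while a Vdiff_A1 larger than about 25mV pushes both extremes to the same sign, which is the failure case described above.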
To be sure that the circuit exhibits both positive and negative Vds even with such deviations, the Vds range must be wide enough to handle worst case scenarios. According to simulations, Vds = ±25mV is a safe value. In addition, the Vcontrol range is expanded by the use of a comparator. This point will be explained later, but its purpose is to increase circuit reliability in the face of nonidealities.
[Plot, three windows versus time [ms]: Vcontrol [V] (between 0 V and 1.8 V), Vds_LLC_pMOS [mV] (between 24.026 mV and −26.766 mV) and Id_LLC_pMOS [aA] (between 236 aA and −271 aA).]
Figure 3.11: A simulation of Figure 3.10 with Ibias = 10nA, A = 10000, LLC = 1µm/1µm, and M4 = 1µm/1µm. The plot shows how the pMOS Id current is influenced by the voltage applied to the Vcontrol terminal.
In order to unbalance the OPAMP, an alternative approach that I attempted was to add an output or input branch in parallel to the regular one and to enable or disable it via a control signal. This technique tries to affect the Voffset of the OPAMP by modifying its internal balance and the matching of the topology.
Even though several different topologies and combinations were examined, they did not yield linear-like behaviour but, on the contrary, very irregular piecewise curves. Hence the alternative idea of setting the Vds of the LLC pMOS not by acting inside the OPAMP but through external circuitry (as explained in this section). This solution is simulated and reported in Figure 3.11. While Vcontrol ranges from ground to Vdd, the voltage across the LLC ranges from 24mV to −26mV and the current IC that flows in the capacitor ranges from 236aA to −271aA.
Furthermore, the external offset technique of this section, in addition to being effective and well behaved, is suitable for modular circuit design, in the sense that the concept can be applied to different types of OPAMP provided certain specifications are met, namely medium gain, very low offset and high input impedance.
This is beneficial in contexts where OPAMPs are already available in design libraries, since it increases the flexibility of the structure and eases the work of the IC designer.

CHAPTER 4
THE FINAL DESIGN AND SIMULATIONS
In this chapter I close the loop that implements the homeostatic principle in silicon. The loop consists of a dynamic block based on the filter Design 3, a comparator and a differential pair. Simulations that validate the circuits are extensively performed. Details on how to obtain low offset amplifiers are provided at the end of the chapter.
4.1 The Loop Design
We now have all the elements needed to implement homeostatic plasticity in silicon. The challenging part of the project was to obtain ultra long time constants, which is achieved by the purpose-built LPF Design 3. What remains is to interface the designed LPF with the synapses and close the AGC loop. However, the Filter Design 3 has voltage input and voltage output, while the artificial synapses have current input and output. Hence, conversion circuits (V-I and I-V) are needed. In addition, there are two further points that I want to illustrate.
First, the first bullet of the Low Leakage Cell pMOS rule of thumb states that Vgs > 400mV must always hold in order to bias the pMOS in the accumulation region. However, recalling Figure 3.10 and the previous analysis, nothing in the topology ensures this condition. Hence, an additional circuit must be inserted in the loop in order to satisfy this mandatory condition.
The second point results from a more practical consideration. Since we are dealing with 256 neurons in our chip, biases shared among instances are acceptable, but independent biases for tuning each of the homeostatic plasticity circuits that control the neurons are not allowed, because setting 256 separate biases is not feasible in practice.
Figure 4.1: The output differential pair circuit. It acts as interface between the LPF with ultra long time constants and the artificial synapse.
Let's start with the first point. Figure 4.1 depicts a circuit suitable for interfacing the output of the LPF to the DPI synapse. It has two purposes. The first is to convert the voltage output of the LPF (VC, the voltage across the capacitor) into a current Ip_thr suitable for use as the gain control signal of the DPI. The second is to set the DC value of VC in order to bias the pMOS in accumulation mode.
However, the value of Ip_thr is set by the feedback loop itself and cannot be directly modified. In particular, the dynamics of this current are tied to the dynamics of VC, which cannot be modified either.
Let's now analyse the differential pair of Figure 4.1. If M1 is in the subthreshold saturation region, Equation 2.19 holds and yields:
\[
I_1 = I_0 \frac{W}{L} e^{\frac{\kappa (V_{bias\_tail} - V_s)}{U_T}} \tag{4.1}
\]
the same is true for nMOS M2 and M3, giving:
\[
I_{p\_thr} = I_0 \frac{W}{L} e^{\frac{\kappa (V_C - V_s)}{U_T}} \tag{4.2}
\]
\[
I_3 = I_0 \frac{W}{L} e^{\frac{\kappa (V_{bias\_Vc} - V_s)}{U_T}} \tag{4.3}
\]
Applying KCL at the source terminal of M2 and M3 gives:
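\[
I_1 = I_{p\_thr} + I_3 = I_0 \frac{W}{L} e^{-\frac{\kappa V_s}{U_T}} \left( e^{\frac{\kappa V_C}{U_T}} + e^{\frac{\kappa V_{bias\_Vc}}{U_T}} \right) \tag{4.4}
\]
and solving for Vs and substituting in Equation 4.2 for Ip_thr finally yields:
\[
I_{p\_thr} = I_1 \frac{e^{\frac{\kappa V_C}{U_T}}}{e^{\frac{\kappa V_C}{U_T}} + e^{\frac{\kappa V_{bias\_Vc}}{U_T}}} \tag{4.5}
\]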
Since I1 and Vbias_Vc are set by biasing the circuit and Ip_thr is set by the feedback loop, the DC value of VC is a dependent variable. Its value can be chosen by properly sizing the MOS and setting the biases. In fact, from Equations 4.2 and 4.3, VC = Vbias_Vc if Ip_thr = I3 and the size of M2 is equal to the size of M3. The current Ip_thr, and hence I3, can be estimated from Equation 2.37 once the biases of the DPI synapses are given.
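How the tail current I1 is shared between the two branches can be illustrated with a small Python sketch. It follows from KCL (I1 = Ip_thr + I3) and Equations 4.2 and 4.3 with matched M2 and M3, and assumes UT = 25mV and κ = 0.7; the function name ip_thr and the numerical values of I1 and of the voltages are hypothetical.

```python
import math

U_T = 0.025   # thermal voltage [V], assumed value
KAPPA = 0.7   # subthreshold slope factor, assumed value


def ip_thr(i_1, v_c, v_bias_vc):
    """Current in the V_C branch of the subthreshold differential pair.

    Written in the numerically convenient form
    I1 / (1 + exp(kappa * (Vbias_Vc - V_C) / U_T)).
    """
    return i_1 / (1.0 + math.exp(KAPPA * (v_bias_vc - v_c) / U_T))


I_1 = 1e-9  # tail current [A], hypothetical value
for v_c in (1.10, 1.20, 1.30):
    i_out = ip_thr(I_1, v_c, 1.20)
    print(f"V_C = {v_c:.2f} V -> Ip_thr = {i_out * 1e12:.1f} pA")
```

At VC = Vbias_Vc the current splits equally (Ip_thr = I3 = I1/2), which is the condition used above to fix the DC value of VC.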
In normal operating conditions Iin changes, which results in a variation of Ip_thr and thus in a variation of VC that counterbalances the effect of the DPI input change. Quantitatively, the simulated AC signals on VC are usually small (up to ±100mV) compared to its DC value. Its swing ∆VC is inversely proportional to the gain T of the loop.
Let's now consider the voltage swing at the LPF input stage. The control signal Vcontrol has a full range input swing that goes from ground to Vdd. Inside this range there is a particular value Vcontrol∗, dependent on the MOS sizes, on the biases and on process variations, that gives the filter onset condition, i.e. VB − VC = 0 → IC = 0. For the AGC circuit to work properly, the input signal of the LPF must have a swing that crosses Vcontrol∗. This is the condition that allows the filter capacitor to charge and discharge, giving ultra long temporal dynamics. See Figure 4.4 for a visual representation of the signals involved: (a) represents a working condition since Vε− < Vcontrol∗ < Vε+, while (b) does not, because Vcontrol∗ is not in the reachable range of Vε.
However, if we implement the summer (the block that drives the LPF) as in Figure 4.2, with a simple three MOS structure, the swing of its output voltage Vε is very limited in amplitude. As emphasized before, this can make it impossible to drive the LPF properly if Vε cannot cross Vcontrol∗. Even though the output DC voltage of the summer can be set by properly sizing its transistors, the summer output range is so small (mV) that mismatch and variation among instances can still shift the range of Vε away from Vcontrol∗.
A solution is to add an amplifier (CCVS) in cascade with the summer in order to increase the range of the summer and let it cross Vcontrol∗, Figure 4.3 (b).
We actually implemented this solution in a very conservative way, by replacing the amplifier (CCVS) with a comparator (which can be thought of as an amplifier with very high gain and saturation limits), Figure 4.3 (c). This comparator has a full power supply output range Vε, which ensures that the circuit works properly even without additional biases. This architecture pays off in terms of circuit reliability, but it introduces an unwanted dependence of the time constant of Isyn on the input signal, which will be addressed later.
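The crossing condition Vε− < Vcontrol∗ < Vε+ can be written as a simple check. The Python sketch below is only illustrative: the summer swing of a few mV around 0.9V is a hypothetical example, the comparator is idealized as a rail-to-rail output with an assumed polarity, and Vcontrol∗ = 1.395V is the value found in the simulations of this chapter.

```python
V_DD = 1.8  # supply voltage [V]


def can_cross(v_eps_min, v_eps_max, v_control_star):
    """True if the LPF drive signal can reach both sides of Vcontrol*."""
    return v_eps_min < v_control_star < v_eps_max


def comparator(i_eps):
    """Ideal comparator, modelled as an output switching between 0 V and Vdd.

    The polarity (which sign of i_eps maps to Vdd) is an assumption.
    """
    return V_DD if i_eps > 0 else 0.0


v_star = 1.395  # Vcontrol* of the simulated instance [V]

# Hypothetical three MOS summer: a few mV of swing around 0.9 V.
print(can_cross(0.895, 0.905, v_star))
# Comparator: output switches between ground and Vdd.
print(can_cross(comparator(-1e-9), comparator(+1e-9), v_star))
```

Whatever value Vcontrol∗ takes between the rails because of mismatch, the comparator output range always contains it, which is why this choice needs no per-instance bias.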
Figure 4.2: A simple three MOS summer exploiting KCL. The output range of Vε doesn't extend from ground to Vdd.
Figure 4.3: Schematic of the usual feedback summer (a), the same summer followed by an amplifier (CCVS, Vε = h·Iε) in order to increase the output range (b), and the actual implemented version (c), a comparator with Vε = Vdd·sign(Iε), which is a degenerate case of solution (b). Solutions (a) and (b) are linear while solution (c) is highly non-linear.
[Diagram, panels (a) and (b): ranges of Vε (from Vε− to Vε+ around Vε_0), Vcontrol (0 V to 1.8 V, with Vcontrol∗ marked), Vds_LLC (−25 mV to 25 mV) and IC (−250 aA to 250 aA).]
Figure 4.4: Visual representation of the signal ranges in the loop chain. This representation gives a visual understanding of the problem of the limited range of Vε. (a) is the case in which the range of Vε includes Vcontrol∗, resulting in a properly working circuit. (b), on the contrary, represents a circuit that cannot work because the range of Vε does not include Vcontrol∗, hence Vds_LLC < 0 and IC < 0.
Bias                Value
Vbias_DPI_weight    1.8V
Iτ                  4.5pA
Vbias_WTA           335mV
Iref                10nA
Ibias               10nA
Vbias_tail          230mV
Vbias_Vc            1.2V
Vstart−up           0 then Vdd
Table 4.1: Biases of the schematic of Figure 4.5.
The implemented comparator is just the modiﬁed WTA circuit described in Section 2.5.
It acts as a comparator with input current signals and output voltages. The whole designed
circuit with all blocks is shown in Figure 4.5.
Since we are dealing with very small currents, a few additional remarks about simulations are needed. The first is that default simulation parameters are often not adequate to get meaningful results. In our case, with Cadence ADE, I set gmin = 10^−14 and iabstol = 2 × 10^−17. These are the lowest values that give good results while still letting the simulation converge.
MOS   W [µm]   L [µm]     MOS   W [µm]   L [µm]     MOS   W [µm]   L [µm]
M1    1        1          M8    10       0.5        M15   1        0.5
M2    2        1          M9    1        0.5        M16   1        0.5
M3    3        1          M10   1        0.5        M17   1        0.5
M4    1        1          M11   1        0.5        M18   1        1
M5    4        0.5        M12   1        0.5        M19   1        1
M6    4        0.5        M13   1        0.5        M20   1        1
M7    2        1          M14   1        0.5        M21   2        1
Table 4.2: The list of the actual MOS sizes for the design of Figure 4.5.
Figure 4.5: The final schematic of the homeostasis AGC in silicon. It comprises the WTA, the LPF and the output differential pair. At the very beginning, Vstart−up must be set to 0V so that VC = 1.8V. This is the start-up condition; Vstart−up must then be set equal to Vdd, which lets the circuit work as explained in the text, with long time constants.
The second remark is that low currents give long time constants in the homeostatic circuit loop. These dynamics are orders of magnitude slower than the DPI dynamics; hence, simulating the two systems together is usually very resource consuming and results in long lasting simulations. This happens because the simulator must perform short time step analysis (due to the fast dynamics of the DPI) over long time windows (due to the long time constants of the LPF). However, since the two time scales are so different, the fast dynamics of the DPI current Isyn can be approximated by its mean value. Hence, instead of feeding the input of the DPI with a fast digital input and obtaining the corresponding fast spiking output Isyn of the DPI, I stimulated the DPI with mean current values, i.e. Iin represents the mean input current of the synapses that stimulate one neuron.
A simulation of the final design, sized as in Table 4.2 and biased as in Table 4.1, is shown in Figure 4.6. The input signal is provided to the synaptic input Iin of the DPI. It is a current square wave that represents the mean current of all the 256 synapses that are connected to one neuron. Hence, an increase in Iin means that the synapses are stimulating the neuron more intensively. Iin varies from around 4nA to 22nA, while the homeostatic reference, i.e. the synaptic current to which the system should converge, is Iref = 10nA. Without the homeostatic circuit these conditions would result in Isyn ≠ Iref lasting forever. With the homeostatic circuit, instead, Isyn converges to Iref with long dynamics. This is exactly what can be observed in the simulation of Figure 4.6. The second window plots Isyn: after a change in Iin, Isyn suddenly changes as well, because the gain of the DPI is still the same. But, after a certain amount of time, which is proportional to C and inversely proportional to IC, Isyn is forced to converge to Iref = 10nA by the homeostatic effect. Quantitatively, this transition lasts 260s or 240s depending on the sign of IC in the capacitor, whose magnitude is not equal in the charge and discharge phases. In fact, since IC is proportional to ∆Vds_LLC_pMOS, observing window 3 of Figure 4.7 we see that in the charging phase ∆Vds_LLC_pMOS is −1.6 + 24.5 = 22.9mV, which yields ∆t = 260s. In the discharging phase, on the contrary, ∆Vds_LLC_pMOS = 22.7 + 1.6 = 24.3mV, which gives ∆t = 240s. The value of −1.6mV is the steady state condition that compensates leakage currents (most likely due to the A1 and A2 offsets and to gate leakage) in order to maintain IC = 0.
The deviation of Isyn = 10.54nA at steady state from Iref = 10nA is due to nonidealities in the comparator block (the WTA). This is not a serious issue, since it only gives an offset to Isyn that can easily be fixed by changing Iref, if needed.
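The relation between the charge and discharge times quoted above and the corresponding ∆Vds_LLC_pMOS values can be checked with a few lines of Python. The sketch only uses the numbers read from Figures 4.6 and 4.7 and assumes that IC is proportional to ∆Vds and that ∆t is inversely proportional to IC; the variable names are arbitrary.

```python
# Values quoted in the text: Vds across the LLC pMOS [mV] and settling times [s].
V_STEADY = -1.6                          # steady-state Vds (IC = 0)
V_CHARGE, T_CHARGE = -24.5, 260.0        # charging phase
V_DISCHARGE, T_DISCHARGE = 22.7, 240.0   # discharging phase

dv_charge = abs(V_CHARGE - V_STEADY)        # 22.9 mV
dv_discharge = abs(V_DISCHARGE - V_STEADY)  # 24.3 mV

# If IC is proportional to dVds and dt is proportional to 1/IC,
# the two ratios below should be close to each other.
ratio_dv = dv_discharge / dv_charge
ratio_dt = T_CHARGE / T_DISCHARGE

print(f"dVds ratio (discharge/charge): {ratio_dv:.3f}")
print(f"dt ratio   (charge/discharge): {ratio_dt:.3f}")
```

The two ratios agree within a few percent, which is consistent with IC being roughly, though not exactly, proportional to the voltage across the Low Leakage Cell.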
The shape of Isyn is not exponential, as it would be in a first order linear system. In fact, the presence of the comparator in the loop gives a highly non-linear behaviour that is responsible for this response of Isyn. This happens because with the comparator we lose the information on how close Isyn is to Iref; hence, the capacitor is always charged or discharged with constant currents. This can be observed in Figure 4.7 (central window), which shows Vcontrol, i.e. the output of the WTA comparator, whose value can be either high (1.799V), low (5µV) or the intermediate value Vcontrol∗ = 1.395V. There are no other intermediate values, which proves that the circuit can process only the sign of the difference Isyn − Iref, but not its magnitude. Finally, the simulation of Figure 4.8 plots the VC value in response to the usual square wave current input. The DC voltage of VC is around 0.995V with an AC signal of ±30mV. This condition ensures both the absence of distortion in the output differential pair of Figure 4.1 and the accumulation mode bias of the LLC pMOS (Vgs = 1.8V − VC ≈ 0.8V > 0.4V).
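The qualitative difference between a linear first order loop and a loop that only sees the sign of the error can be illustrated with a toy model in Python. This is not a model of the circuit: here the error itself is ramped at a constant rate, whereas in the circuit it is VC that ramps with constant slope; the initial error, the time constant and the function names are hypothetical.

```python
import math


def settle_linear(e0, tau, t):
    """Error of a first-order linear loop: exponential decay."""
    return e0 * math.exp(-t / tau)


def settle_bang_bang(e0, rate, t):
    """Error of a sign-only (comparator) loop: constant-slope ramp to zero."""
    remaining = abs(e0) - rate * t
    return math.copysign(max(remaining, 0.0), e0)


E0 = 12.0      # initial error Isyn - Iref [nA], hypothetical
T_END = 240.0  # time to reach zero error [s], of the order seen in Figure 4.6
RATE = E0 / T_END

for t in (0.0, 60.0, 120.0, 240.0, 300.0):
    lin = settle_linear(E0, T_END / 3.0, t)
    bb = settle_bang_bang(E0, RATE, t)
    print(f"t = {t:5.0f} s  linear: {lin:6.2f} nA  bang-bang: {bb:6.2f} nA")
```

The sign-only loop reaches zero error in a finite time that grows with the initial error, while the linear loop approaches it asymptotically with a fixed time constant.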
[Plot, two windows versus time [ks]: Iin [nA] (top) and Isyn [nA] (bottom), with annotations Iref = 10 nA, Isyn = 10.54 nA, Δt = 260 s and Δt = 240 s.]
Figure 4.6: Simulation of the Final Design of Figure 4.5. The top window represents the input signal Iin = 4nA÷22nA fed into the DPI. The reference for the homeostatic plasticity is Iref = 10nA. The lower window plots the DPI output synaptic current; its dynamics are around 4 minutes with a 1pF capacitor.
[Plot, three windows versus time [ks]: Iin [nA] (with Iref = 10 nA), Vcontrol (annotated levels 5 µV, 1.395 V and 1.779 V) and Vds_LLC_pMOS (annotated levels 22.7 mV, −1.6 mV and −24.5 mV).]
Figure 4.7: This simulation shows Vcontrol (i.e. the output of the WTA comparator) and the voltage difference across the LLC pMOS in response to the Iin stimuli.
[Plot, two windows versus time [ks]: Iin [nA] (with Iref = 10 nA) and VC [V] (annotated levels 1.024 V and 0.967 V).]
Figure 4.8: This simulation shows the voltage on the state variable capacitor. Its value is always well under 1.4V in order to satisfy the accumulation mode condition for the LLC pMOS (Vgs > 400mV).
Previously, we mentioned that the time constant of the circuit is strongly dependent on the value of Iin. To understand this behaviour it is important to note that the time constant with which the capacitor voltage is modified is directly proportional to C and to 1/IC. However, the loop gain T is proportional to the input current Iin of the DPI, see Figure 3.2. The higher the gain, the smaller the AC variation of VC that changes Ip_thr and hence the gain of the DPI synapse. Vice versa, a small loop gain results in large variations of VC in order to counterbalance the effect of the Iin variation.
This is confirmed by the circuit simulations of Figure 4.9. In fact, with Iin1 = 1nA ÷ 6nA the loop gain T1 yields ∆VC = 64mV, with Iin2 = 4nA ÷ 22nA the loop gain T2 yields ∆VC = 57mV and with Iin3 = 11nA ÷ 44nA the loop gain T3 yields ∆VC = 54mV. It is clear that the larger the input Iin, the higher the gain T and the lower ∆VC. However, the slope with which VC varies is constant and set by the ratio IC/C.
Given that, it is clear that the effective time the system takes to reach the equilibrium state is directly proportional to C, 1/IC and ∆VC, the latter being the above-mentioned difference between the capacitor voltages at the steady state points for two different inputs Iin.
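This dependence amounts to ∆t = C·∆VC/IC for a capacitor driven by a constant current. The Python sketch below evaluates it for the three ∆VC values quoted above, with C = 1pF as in the caption of Figure 4.6 and an assumed IC = 230aA, chosen only to be of the same order as the simulated currents; the function name is hypothetical.

```python
def settling_time(c, delta_vc, i_c):
    """Time for a constant current i_c to move the capacitor voltage by delta_vc."""
    return c * delta_vc / abs(i_c)


C = 1e-12      # filter capacitor [F], as in the caption of Figure 4.6
I_C = 230e-18  # capacitor current [A], assumed value

for delta_vc in (64e-3, 57e-3, 54e-3):  # values quoted for the three Iin ranges
    dt = settling_time(C, delta_vc, I_C)
    print(f"dVC = {delta_vc * 1e3:.0f} mV -> dt = {dt:.0f} s")
```

Under these assumptions the settling times fall in the range of a few hundred seconds and scale with ∆VC, so a larger input current (higher loop gain, smaller ∆VC) gives a shorter effective time.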
This variation in dynamics is a problem that did not affect the operation of our specific application, but it could be relevant in other contexts. The issue primarily results from the use of a comparator instead of a linear summer, even though a linear summer alone would not be sufficient to obtain exponential time dynamics [16]. In fact, by using the comparator we lose the information about how far Isyn is from Iref. Hence, from the above considerations, the VC slope and the difference of VC between two equilibrium points both contribute to the effective time constants.
[Plot, two windows versus time [ks]: Isyn [nA] and VC [V] for the three input ranges, with annotations ΔVC = 64 mV, ΔVC = 57 mV, ΔVC = 54 mV and Δt ≈ 240; 260; 290 s.]
Figure 4.9: This simulation of the loop shows how the Isyn dynamics vary according to ∆VC. ∆VC is inversely proportional to the loop gain T set by changing Iin.
4.2 A low oﬀset ampliﬁer
In this section I finally focus on how to design the simple low offset amplifiers that are needed in my design. The idea described in this section is presented in [14] and can be applied only to a unity gain amplifier configuration.
Unlike the auto-zero offset cancellation technique, this approach does not use any continuous auto calibration but instead relies on matching properties. Hence, the design results in a very compact and effective topology that fits nicely in my LPF.
Usually, the offset is an unwanted deviation from the ideal amplifier model that mainly results from mismatch and from second order effects in MOS devices. It is defined as the "differential voltage that has to be applied to a differential amplifier in order to cancel the DC offset output voltage". In the simple 5 MOS OTA of Figure 4.10 (tail generator plus differential pair plus current mirror), with Vin+ = Vin− at a proper Vin_common, the currents in the two differential pair branches (VINN, VINP) are different if the transistors are not matched and if the Vd voltages at the drains of the paired nMOS are not equal. The latter happens systematically, no matter how well matched the MOS are in the layout, because the OTA structure does not have a perfectly symmetric topology. In fact, the M4 pMOS is diode connected (Vd = Vg) while the corresponding MOS in the other branch, M5, is not. Hence, even with the same Vgs applied, the currents of M4 and M5 in the mirror are not perfectly copied, due to channel length modulation. Given that, a simple way to reduce the offset is to make the transistors bigger and to add MOS cascodes in order to reduce this channel length modulation issue.
Figure 4.10: A basic five MOS Operational Transconductance Amplifier (OTA).
However, even with careful design, this technique can only bring the offset down to about 5mV. This has been simulated and is shown in Figure 4.11, whose plot has four windows. The first one is the sweep of the differential input voltage of the OTA; the other three windows are the output voltages of the OTA.
The simulation shows that, even when every transistor in the schematic has ideal sizes, there is still an offset of 0.9mV resulting from the asymmetry of the voltages in the two OTA branches. However, this is not even a realistic case: in real circuits, significant deviations in the transistor ratios are experienced.
In Figure 4.11, I simulated the offset voltage of the OTA by changing the size of one MOS by 20%, namely the differential pair pMOS (M3) or the current mirror nMOS (M5). This unpaired situation gives different currents in M3 and M5, which results in Vout ≠ Vcommon and hence in an output offset.
From the simulations of Figure 4.11 it is clear that an offset in the mV range can easily occur in real circuit implementations. This cannot be tolerated in our application, since we are dealing with voltages across the Low Leakage Cell in the order of 50mV.
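A first order estimate of the offset produced by a size mismatch in the input pair can be obtained from the subthreshold exponential law. The Python sketch below assumes that the differential pair works in subthreshold, with UT = 25mV and κ = 0.7, and ignores channel length modulation and the cascodes; the function name and these values are assumptions, so the result is only an order of magnitude.

```python
import math

U_T = 0.025   # thermal voltage [V], assumed value
KAPPA = 0.7   # subthreshold slope factor, assumed value


def offset_from_size_mismatch(ratio):
    """Input-referred offset of a subthreshold differential pair.

    ratio is (W/L of one input device) / (W/L of the other).
    Equal branch currents require kappa * dVg / U_T = ln(ratio).
    """
    return (U_T / KAPPA) * math.log(ratio)


for mismatch in (0.01, 0.05, 0.20):
    v_off = offset_from_size_mismatch(1.0 + mismatch)
    print(f"{mismatch * 100:4.0f}% size mismatch -> {v_off * 1e3:.2f} mV")
```

Under these assumptions a 20% size change corresponds to an offset of several mV, the same order of magnitude as the offsets annotated in Figure 4.11.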
The solution exploited by [14] aims to cancel the offset by matching the offsets of two OTAs and subtracting one from the other. In their paper the authors claim that the obtainable offsets are in the order of µV (simulated). The underlying idea is depicted in Figure 4.12, which shows a main amplifier Amain that has some offset Voffset_main. Hence, in a voltage buffer configuration, Vout = Vin + Voffset_main.
So, if the unity gain loop path contains a DC voltage source with the same value Voffset_main, and if we are able to subtract it from Vout, then the overall offset is cancelled and Vout = Vin, resulting in an offset free signal. But, since Voffset_main is not known and is variable, the best way to implement that voltage generator is to have a copy of Amain, called Afb, that provides a good estimate of Voffset_main.
[Plot, four windows versus Vdiff [mV]: Voffset [mV] and three Vout [V] windows, with curves labelled all matched, diff pair 20% and mirror 20%, and annotated offsets −6.6 mV, 0.9 mV and 8.6 mV.]
Figure 4.11: Voffset simulation of a five transistor OTA with cascode. The circuit is loaded with a 10pF capacitor and biased with Itail = 200nA. Three simulations are plotted: one with all transistors matched, the other two with a deviation of 20% from the nominal size.
[Block diagram: the low offset buffer Alow_offset (Vin to Vout) is equivalent to Amain with Afb in its feedback path.]
Figure 4.12: A technique to reduce offset is to place two matched amplifiers in a loop. This arrangement cancels the offsets of the two devices in the signal path, resulting in an effective low offset unity gain amplifier.
The CMOS implementation of the amplifiers of Figure 4.12 is depicted in Figure 4.13. It consists of two simple five MOS OTAs with cascodes (M2, M6, M12, M13, M9, M5 and M3, M7, M10, M11, M8, M4) and with one shared tail generator M1.
Matching considerations suggest matching the corresponding transistors of the two different amplifiers rather than matching transistors within the same amplifier (as would be done in a regular design). For example, M2 should match M3 by being placed close to it in the physical layout, M4 and M5 as well, and so on. This criterion does not minimize the offset of the single amplifier Amain nor that of Afb; instead, it matches the magnitudes of the offsets of the two amplifiers. Then, since they are subtracted from each other by the topology in the loop path, this strategy minimizes the difference between the input Vin and the output Vout of Alow_offset (see Figure 4.12), even though the matched offsets of Amain and Afb are individually not minimal.
To further reduce the sources of mismatch, the bias of the amplifiers is shared and provided by M1. Hence, the OTAs are no longer independent but, since the gate and source voltages of MOS M2, M3 and M4, M5 are ideally identical, the effect on the circuit behaviour is small [14]. As a benefit, this topology forces the source voltages of those pMOS to be identical, enhancing the matching between the two amplifiers.
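The cancellation principle can be summarized by an idealized behavioural model in Python. The sketch assumes amplifiers with infinite gain, so that only the input-referred offsets matter; the function names and the offset values are hypothetical and chosen only to show that the residual error depends on the difference between the two offsets, not on their size.

```python
def buffer_out(v_in, v_offset):
    """Unity-gain buffer with an input-referred offset: Vout = Vin + Voffset."""
    return v_in + v_offset


def low_offset_out(v_in, v_offset_main, v_offset_fb):
    """Idealized two-amplifier buffer: the Afb offset is subtracted in the loop.

    Only the difference between the two offsets reaches the output.
    """
    return v_in + v_offset_main - v_offset_fb


V_IN = 0.9  # DC input [V], hypothetical

# Hypothetical offsets: large, but matched to within 10 uV.
V_OFF_MAIN = 5.00e-3
V_OFF_FB = 4.99e-3

err_single = buffer_out(V_IN, V_OFF_MAIN) - V_IN
err_pair = low_offset_out(V_IN, V_OFF_MAIN, V_OFF_FB) - V_IN
print(f"single OTA error  : {err_single * 1e6:.0f} uV")
print(f"matched pair error: {err_pair * 1e6:.0f} uV")
```

This is why the layout effort goes into matching Amain against Afb rather than into minimizing the offset of each amplifier.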
Figure 4.13: The transistor level implementation of Alow_offset. M2, M6, M12, M13, M9 and M5 realize the Amain amplifier while M3, M7, M10, M11, M8 and M4 realize the Afb amplifier. The current source M1 is shared between the two amplifiers. Matched MOS are grouped by dotted boxes.
Figure 4.14 plots the simulation of the schematic of Figure 4.13 and shows that the structure of Figure 4.12 works well and has good offset performance. The input signal is a DC value of 0.9V with a superimposed AC ±10mV, 100 Hz sinusoid. The first window in the plot is the difference between Vin and Vout of the amplifier. It contains two curves, one for all MOS matched and the other for a 20% variation of transistors M4 and M5.
[Plot, three windows versus time [ms]: Vin − Vout [µV], Voffset [mV] and Voffset [µV], with curves labelled all matched and diff pair 20%.]
Figure 4.14: The first window shows the difference between the input Vin and the output Vout of the amplifier, one curve with all transistors matched and the other with the M4 and M5 sizes increased by 20%. The difference Vin − Vout is almost the same for the two curves, even though the amplifier offsets are very different. Windows 2 and 3 show the Voffset of Amain in the two matching conditions, respectively.

CHAPTER 5
CONCLUSIONS
5.1 Discussion
The homeostatic principle is a key mechanism that allows biological neurons in large neural networks to interact with each other and work properly even with high variations of chemical concentrations and physical quantities. Given its effectiveness, it is a good idea to transpose this concept to populations of artificial neurons as well. The most straightforward implementation of homeostatic plasticity in silicon is to build an Automatic Gain Control loop around the artificial neuron. This allows the neuron input-output gain to be changed by modifying its synaptic gain, and thus keeps the output firing rate of the neuron at a reference frequency. Homeostatic plasticity is very useful for coping with process variation mismatches, temperature variation, changes in chip loads and so on. The homeostatic mechanisms can be combined with Hebbian synaptic plasticity on the same neuron in order to provide a wide range of adaptation mechanisms. In fact, it has been proven that the interaction between homeostatic and Hebbian plasticity gives rise to complex behaviours that cannot be reproduced by either mechanism alone.
The key blocks of my AGC loop implementation of homeostatic plasticity are an LPF, a comparator and a differential pair. A very important specification of the AGC loop is that it requires ultra long dynamics but must still have a compact size for chip integration reasons. These two specifications pull the design in opposite directions and usually lead to trade-offs.
Because of its appeal, some attempts have been made in the literature to implement homeostatic plasticity in artificial neural networks. Unfortunately, they required floating gate design techniques that need higher voltages and larger area. Other designs exploit workstations to generate long time constants, which prevents their use in portable applications.
An idea for obtaining ultra long time constants while still occupying a small area is to work with tiny currents, in the order of magnitude of atto-amperes. Such small currents can be generated by properly biasing a pMOS in accumulation mode and reducing all leakage mechanisms. From these insights, I first developed a femto/atto-ampere current generator (Section 3.2), then I developed a novel unbalanced architecture (Section 3.5) based on it that can charge and discharge the system state capacitor with such tiny currents and thus obtain ultra long time constants.
In this thesis work I first studied and analysed the homeostatic principle in neurons by reading the current literature. Then I modelled it and reformulated the problem in engineering terms as an AGC. The solution I propose in this thesis meets the specifications of compactness and ultra long dynamics described in Section 1.5. The results are validated by extensive software simulations (Cadence ADE) and the plots are reported in Figure 4.6, Figure 4.7, Figure 4.8 and Figure 4.9.
Here I want to mention that I am aware that I am simulating the pMOS (Low Leakage Cell) in unusual biasing conditions where, in standard analog design, high precision is usually not required. As far as I understand, the pMOS model (PSP) is quite realistic, but how closely the Cadence simulated pMOS matches the real fabricated pMOS strongly depends on the foundry device characterization that provides the parameters for the Cadence PSP model. Hence, if the characterization of the real pMOS performed by AMS is not accurate, the simulation results are quantitatively imprecise.
However, I am optimistic about the success of the circuit because the designed structure makes intuitive sense and is supported by the qualitative analysis. Even though the simulations may give inaccurate results, since my structure is based on [6] and [7], where true on chip measurements were performed, the order of magnitude of the currents involved should be comparable.
Because of these simulation limitations, true on-chip measurements must be performed in
order to validate and fully characterize the designed structure. This will be done when the
chip comes back from the foundry (Fall 2013).
As a conclusion, I briefly summarize the features and limitations of my implementation
of the homeostatic circuit in silicon. It satisfies all the specifications of the project (see
Section 1.5) while introducing constraints that are not severe for our application.
Pros
• Ultra long dynamics: ∆t ≈ 250s with realistic biases and signals (simulated up to 600s).
• Very compact design: 16 × 46 µm (LPF, amplifiers, comparator, differential pair).
• No calibration required and robust to mismatch.
• Power consumption of 100nW (comparator, LPF, amplifiers, output differential pair).
Cons
• Simulation results depend too strongly on software parameters, most likely due to the
inaccurate model fit of the pMOS in the accumulation region.
• Non-linear dynamics due to the comparator.
• Time constant dependent on the magnitude of the input Iin.
5.2 Future Works
Due to the short time available to develop the full working circuit, the design is not
optimized in every part and could be further enhanced. Given the opportunity, I would like
to improve it in the future, especially in terms of time constants and linearity. Since my
design was quite conservative, I think that even better results in terms of filter cut-off
frequency can be obtained without sacrificing robustness and reliability. Another good
improvement for the homeostatic loop would be to replace the non-linear comparator with a
purely analog summer in order to obtain linear dynamics.
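The difference between comparator-driven and linear dynamics can be illustrated with a toy
behavioural model. This is a sketch and not a model of the actual circuit: the function name,
the fixed correction rate `slew`, the time constant `tau` and the 1% settling criterion are all
assumptions made for the example. With a comparator only the sign of the error sets the
correction, so the settling time scales with the size of the initial error; with a summer the
correction is proportional to the error, so the settling time does not depend on it.

```python
def settle_time(initial_error, mode, tau=50.0, slew=0.02, dt=0.01):
    """Time (s) for the error to shrink to 1% of its initial value.

    mode == "comparator": the correction has a fixed rate `slew`, and only
    the sign of the error matters (bang-bang behaviour).
    mode == "summer": the correction rate is proportional to the error,
    i.e. a linear first-order loop with time constant `tau`.
    All parameter values are assumed, not taken from the circuit.
    """
    error = initial_error
    target = 0.01 * abs(initial_error)
    t = 0.0
    while abs(error) > target:
        if mode == "comparator":
            rate = slew if error > 0 else -slew
        else:
            rate = error / tau
        error -= rate * dt  # forward-Euler update of the loop error
        t += dt
    return t


for e0 in (1.0, 0.5):
    t_cmp = settle_time(e0, "comparator")
    t_lin = settle_time(e0, "summer")
    print(f"initial error {e0}: comparator {t_cmp:.1f} s, summer {t_lin:.1f} s")
```

In this toy model halving the initial error halves the comparator settling time, while the
settling time of the linear loop stays the same.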
If the circuit proves to work on a real chip, this unbalanced filter technique can be used
effectively wherever ultra low cut-off frequencies are required, such as pacemakers, averaging,
real-world signal conditioning interfaces and so on [20]. Table 5.1 compares the state of the
art sub-Hertz LPFs, where the Normalized Cut-off Frequency is obtained by multiplying the
Cut-off Frequency by Integrating_Capacitance/1pF. This parameter allows a fair comparison
between the cut-off frequencies of the designs, as if they all had the same 1pF capacitor.
For my work, the cut-off frequency is obtained by treating the dynamics as if they were
exponential in a linear system. Hence, taking ∆t ≈ 250s (average) from simulations and
assuming that the transient ends in 5τ, my design has τ = 250/5 = 50s, which gives
fc = 1/(2πτ) ≈ 3mHz.
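The conversion from the simulated transient duration to an equivalent cut-off frequency can be
checked numerically. The sketch below assumes, as in the text, that the transient ends after
5τ; the function name and the `n_tau` parameter are introduced here only for illustration.

```python
import math


def cutoff_from_settling(delta_t_s, n_tau=5.0):
    """Equivalent first-order time constant and cut-off frequency.

    Treats the transient as exponential and assumes it ends after n_tau
    time constants, so tau = delta_t / n_tau and fc = 1 / (2 * pi * tau).
    """
    tau = delta_t_s / n_tau
    fc = 1.0 / (2.0 * math.pi * tau)
    return tau, fc


tau, fc = cutoff_from_settling(250.0)        # average simulated transient
print(f"tau = {tau:.0f} s, fc = {fc * 1e3:.2f} mHz")          # tau = 50 s, fc = 3.18 mHz

tau_max, fc_max = cutoff_from_settling(600.0)  # longest simulated transient
print(f"tau = {tau_max:.0f} s, fc = {fc_max * 1e3:.2f} mHz")  # tau = 120 s, fc = 1.33 mHz
```

The 250 s case reproduces the 3 mHz figure used in Table 5.1; the 600 s case shows what the
same approximation would give for the longest simulated transient.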
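Reference | CMOS process [µm] | Cut-off Freq. [Hz] | Supply Voltage [V] | Area [mm2] | Integrating Capacitance [F] | Normalized Cut-off Freq. (1pF) [Hz]
----------|-------------------|--------------------|--------------------|------------|-----------------------------|------------------------------------
[9]       | 0.35              | 0.5                | -                  | -          | 100f                        | 0.035
[21]      | 0.35              | 35                 | 3.2                | 0.025      | 25f                         | 0.875
[22]      | 0.50              | 0.180              | -                  | 0.035      | 15p                         | 2.7
[23]      | 1                 | 0.075              | 5                  | 0.25       | 10p                         | 0.75
This work | 0.18              | 0.003              | 1.8                | 0.0012     | 1p                          | 0.003
Table 5.1: A state of the art comparison of sub-Hertz filters. Table partially taken from [20].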
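The normalization used in the last column of Table 5.1 can be sketched as follows. Scaling each
cut-off frequency by C/1pF reproduces the printed values for [21], [22], [23] and this work. The
row for [9] is left out of the sketch because its printed values (0.5 Hz, 100 fF, 0.035 Hz) do
not follow from the same rule. The function and variable names are illustrative.

```python
def normalized_cutoff(fc_hz, cap_f, ref_cap_f=1e-12):
    """Cut-off frequency the filter would have with a ref_cap_f capacitor.

    Assumes fc scales as 1/C for a fixed charging current or transconductance,
    so fc_norm = fc * C / C_ref.
    """
    return fc_hz * cap_f / ref_cap_f


# (cut-off frequency [Hz], integrating capacitance [F]) as printed in Table 5.1
rows = {
    "[21]": (35.0, 25e-15),
    "[22]": (0.180, 15e-12),
    "[23]": (0.075, 10e-12),
    "This work": (0.003, 1e-12),
}

normalized = {name: normalized_cutoff(fc, c) for name, (fc, c) in rows.items()}
for name, value in normalized.items():
    print(f"{name}: {value:.3f} Hz")
```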
Table 5.1 shows that my design performs very well in terms of cut-off frequency. In
particular, its normalized cut-off frequency is about one order of magnitude lower than the
best state of the art design in the table (0.003 Hz against 0.035 Hz), with far less required area.
APPENDIX
A
MASKS LAYOUT
Figure A.1: Layout of the schematic of Figure 4.5. The 2.5pF NMOSCAP is included, but the neuron
and synapses are excluded. (a) All layers. (b) Layers: N-well, Polysilicon, Metal 1, Metal 2, Vias.
(c) Layers: Metal 1, Metal 3, Metal 4, Vias.

BIBLIOGRAPHY
[1] C. A. Mead, "Analog VLSI and Neural Systems," Addison-Wesley: Reading, MA, 1989.
[2] G. Indiveri, B. Linares-Barranco, T. Hamilton, A. van Schaik, R. Etienne-Cummings, et al.,
"Neuromorphic silicon circuits," Frontiers in Neuroscience vol. 5, pp. 123, May 2011.
[3] E. Chicca, F. Stefanini and G. Indiveri, "Neuromorphic electronic circuits for building
autonomous cognitive systems," Proceedings of the IEEE, vol. X, no. x, XX. (Under review process)
[4] G. Turrigiano and S. Nelson, "Homeostatic plasticity in the developing nervous system," Nature
Reviews Neuroscience, vol. 5, pp. 97-107, February 2004.
[5] C. Bartolozzi and G. Indiveri, "Global scaling of synaptic efficacy: Homeostasis in silicon
synapses," Neurocomputing, vol. 72, no. 4-6, pp. 726-731, January 2009.
[6] M. O'Halloran and R. Sarpeshkar, "A 10-nW 12-bit Accurate Analog Storage Cell With 10-aA
Leakage," IEEE journal of solid-state circuits, vol. 39, no. 11, November 2004.
[7] M. O'Halloran and R. Sarpeshkar, "An Analog Storage Cell with 5e−/sec Leakage," IEEE In-
ternational Symposium on Circuits and Systems, pp. 560-564, May 2006.
[8] K. Roy, S. Mukhopadhyay and H. Mahmoodi-Meimand, "Leakage Current Mechanisms and
Leakage Reduction Techniques in Deep-Submicrometer CMOS Circuits," Proceedings of the IEEE,
vol. 91, no. 2, pp. 305-327, February 2003.
[9] B. Linares-Barranco and T. Serrano-Gotarredona, "On The Design and Characterization of
Femtoampere Current-Mode Circuits," IEEE journal of solid-state circuits, vol. 38, no. 8, pp.
1353-1363, August 2003.
[10] C. Bartolozzi, S. Mitra, G. Indiveri, "An ultra low power current-mode ﬁlter for neuromorphic
systems and biomedical signal processing," IEEE Biomedical Circuits and Systems Conference,
pp. 130-133, November 2006.
[11] C. Bartolozzi, O. Nikolayeva, G. Indiveri, "Implementing homeostatic plasticity in VLSI net-
works of spiking neurons," IEEE International Conference on Electronics, Circuits and Systems,
pp. 682-685, August-September 2008.
[12] S. C. Liu and B. A. Minch, "Homeostasis in a Silicon Integrate and Fire Neuron," Advances in
Neural Information Processing Systems 2001, pp. 727-733.
[13] A. G. Andreou and K. A. Boahen, "Translinear circuits in subthreshold MOS," Analog Inte-
grated Circuits and Signal Processing, vol. 9, no. 2, pp. 141-166, March 1996.
[14] R. Wang, T. J. Hamilton, J. Tapson, A. van Schaik, "An Analogue Memory for Spiking Neural
Networks with 100-aA leakage," Transactions on Biomedical Circuits and Systems, vol. XX, pp.
XX, XX. (Under review process)
[15] S. C. Liu, J. Kramer, G. Indiveri, T. Delbrück, and R. Douglas, "Analog VLSI-Circuits and
Principles," MIT Press, 2002.
[16] J. P. A. Pérez, S. C. Pueyo and B. C. López, "Automatic Gain Control," Springer New York, 2011.
[17] J. Mulder, W. A. Serdijn, A. C. van der Woerd and A. H. M. van Roermund, "Dynamic Translinear
and Log-Domain Circuits: Analysis and Synthesis," The Springer International Series in
Engineering and Computer Science, 1999.
[18] Y. Tsividis, "Operation and Modeling of the MOS Transistor," 2nd edition, Oxford University
Press, 1999.
[19] D. Purves, "Neuroscience," 4th edition, Sinauer Associates, 2007.
[20] E. Rodriguez-Villegas, A. J. Casson and P. Corbishley, "A Subhertz Nanopower Low-Pass Filter,"
IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 58, no. 6, pp. 351-355, 2011.
[21] F. Gozzini, G. Ferrari and M. Sampietro, "Linear transconductor with rail-to-rail input swing
for very large time constant applications," Electronics Letters, vol. 42, no. 19, pp. 1069-1070,
2006.
[22] A. Becker-Gomez, U. Cilingiroglu and J. Silva-Martinez, "Compact sub-Hertz OTA-C filter
design with interface-trap charge pump," IEEE Journal of Solid-State Circuits, vol. 38, no. 6,
pp. 929-934, 2003.
[23] P. Bruschi, G. Barillaro, F. Pieri and M. Piotto, "Temperature stabilised tunable Gm-C filter for
very low frequencies," Proceedings of the 30th European IEEE Solid-State Circuits Conference,
pp. 107-110, 2004.
[24] R. Hogervorst, J. T. Tero, R. G. H. Eschauzier and J. H. Huijsing, "A Compact Power-Efficient
3V CMOS Rail-to-Rail Input/Output Operational Amplifier for VLSI Cell Libraries," IEEE
Journal of Solid-State Circuits, vol. 29, no. 12, December 1994.
