All-MOS implementation of RC networks for time-controlled Gaussian spatial filtering by Fernández-Berni, J. & Carmona-Galán, R.
All-MOS implementation of RC networks for time-controlled
Gaussian spatial filtering
J. Fernández-Berni∗, R. Carmona-Galán
Institute of Microelectronics of Seville (IMSE-CNM),
Consejo Superior de Investigaciones Científicas y Universidad de Sevilla,
C/ Américo Vespucio s/n, 41092, Seville, Spain (e-mail: berni@imse-cnm.csic.es).
SUMMARY
This paper addresses the design and VLSI implementation of MOS-based RC networks capable of
performing time-controlled Gaussian ﬁltering. In these networks, all the resistors are substituted one
by one by a single MOS transistor biased in the ohmic region. The design of this elementary transistor
is carefully realized according to the value of the ideal resistor to be emulated. For a prescribed
signal range, the MOSFET in triode region delivers an interval of instantaneous resistance values. We
demonstrate that, for the elementary 2-node network, establishing the design equation at a particular
point within this interval guarantees minimum error. This equation is then corroborated for networks
of arbitrary size by analysing them from a stochastic point of view. Following the design methodology
proposed, the error committed by a MOS-based grid when compared to its equivalent ideal RC network
is, despite the intrinsic nonlinearities of the transistors, below 1% even under mismatch conditions
of 10%. In terms of image processing, this error hardly aﬀects the outcome, which is perceptually
equivalent to that of the ideal network. These results, extracted from simulation, are veriﬁed in a
prototype vision chip with QCIF resolution manufactured in the AMS 0.35µm CMOS-OPTO process.
This prototype incorporates a focal-plane MOS-based RC network which performs fully-programmable
Gaussian ﬁltering.
key words: Gaussian ﬁltering, focal-plane processing, RC networks, time-controlled diﬀusion, VLSI
implementation
1. INTRODUCTION
Gaussian filtering is a basic task for early vision. It is used for reducing the noise associated
with the image capture without affecting subsequent processing stages. In fact, image
enhancement through the Difference of Gaussians (DoG) is preferred over other image
enhancement methods as it preserves the details of interest within the scene while filtering
∗Correspondence to: berni@imse-cnm.csic.es
Contract/grant sponsor: Junta de Andalucía-CICE; Ministerio de Ciencia e Innovación; contract/grant number:
2006-TIC-2352; TEC 2009-11812, co-funded by FEDER
2 J. FERNA´NDEZ-BERNI, R. CARMONA-GALA´N
sharp random noise [1]. In this case, the width of the Gaussian ﬁlters involved will depend on
the noise nature as well as on the scale of the objects to be analysed, not usually known in
advance. Scale is indeed a key point when it comes to processing images efficiently. Its adequate
selection simpliﬁes the analysis of certain features in a scene by removing information at other
scales [2]. If several scales are to be analysed, pyramidal representations can be easily built in
order to only store the necessary information for each representation according to the spatial
frequencies involved [3]. The scale-space representation of a scene is obtained by applying
successive Gaussian ﬁlters with increasing widths over its original representation [4].
From the processing features described above, it can be seen that the utility of Gaussian
ﬁltering reaches its maximum level when the width, or smoothing degree, is under the control of
the user. Programmable digital processors [5] support this possibility by means of user-deﬁned
convolution kernels. However, this implementation is not very eﬃcient, taking into account
the necessary serialization of the raw image data along with the repeated accesses to memory
to operate over each and every pixel and its neighborhood. Besides, this energy ineﬃciency
increases with the width of the ﬁlter as more neighbors are involved in the computation for
each pixel — the kernel size must be at least 6 times the variance of the targeted Gaussian
ﬁlter [6]. This has led to alternative approaches, most of them making use of the ability of
CMOS processes to integrate pure imaging with signal processing circuitry. Thus, they achieve
massively parallel focal-plane processing concurrent with the photosensing, delivering the same
result as a digital processor but in a much more eﬃcient way.
Resistive grids are the ﬁrst alternative to be considered [7, 8]. They are passive networks
which can perform diﬀerent spatial ﬁlterings by using positive and, if necessary, negative
resistors. Their robustness to mismatch makes them especially suitable for VLSI implementation
[9,10]. In [11], a single-chip analog implementation of a resistive network for Gaussian filtering
is described. Negative resistors are mandatory in order to attain a Gaussian-like convolution
kernel, which makes the circuitry bulky due to the negative impedance converters. The variation
of the filter width is achieved by two MOSFETs in parallel biased in the triode region, whose
gate voltages are controlled to obtain a variable resistor. It is precisely the control circuit of
this variable resistor that greatly increases the power consumption of the chip. Moreover, only
Gaussians with a width variable by a factor of 2 are available. Another possibility for Gaussian
filtering in resistive grids is through MOSFETs working in the subthreshold regime [12,13]. In this
case, the main drawback is the significant influence of leakage currents and mismatch on the
fine control of the filtering width [14]. Finally, the filtering performed by a resistive grid can
be theoretically emulated by the CNN framework [15]. However, the unavoidable mismatch
present in any VLSI implementation prevents the typical transconductance-based approach
from achieving Gaussian filtering with enough accuracy, especially for large widths [16].
In [17], the physical implementation of Gaussian ﬁlters with user-deﬁned width is addressed
in a totally diﬀerent way. It introduces a capacitive network which can be considered as a
numeric solver of the spatially-discretized diﬀusion equation. The variance of the ﬁlter is
determined by a capacitor ratio, fixed by layout design, and an iteration number associated
with the implicit time discretization of the network. Each node requires four switches, two switching
capacitors and one grounded capacitor. An error of 1% is delivered when the iteration
number is higher than ten. A more recent VLSI implementation of this approach is reported
in [18].
This paper proposes to take advantage of the dynamics of an RC network in order to achieve
user-defined Gaussian filters. An RC network can also be considered as a solver of the spatially-discretized
MOS-BASED RC NETWORKS FOR TIME-CONTROLLED GAUSSIAN FILTERING 3
(a) (b)
Figure 1. RC network performing linear diffusion (a) and its MOS-based counterpart (b)
diffusion equation, but no time discretization is now performed. This means that any
ﬁlter width is ideally possible. The main drawback which can be argued against this approach
is the considerable area consumption of resistors in CMOS processes. We will demonstrate that
this problem can be solved by substituting each resistor by a single MOS transistor biased in
the ohmic region. The resistance-to-area ratio of this elementary MOS is much greater than that
of resistors made with polysilicon or diffusion strips. Besides, the dynamics of the network
can be activated or deactivated by controlling the gate voltages of the transistors. The eﬀects
of the inevitable nonlinearities introduced by the MOSFETs are alleviated by a careful design
of the elementary transistor and a very simple on-chip calibration. An error below 1% is thus
achieved, which translates into perceptually equivalent outputs. Each node requires only two
transistors, acting as switches with an ON resistance designed ad hoc, and a grounded
capacitor.
The paper is organized as follows. Section 2 defines the operation of an RC network as a
solver of the spatially-discretized diﬀusion equation, determining the corresponding ﬁltering
function. Section 3 compares a 2-node ideal RC network with its counterpart where the
resistor is substituted by a MOSFET. We thus ﬁnd the design equation for a MOS transistor
emulating a targeted resistance with minimum error. Section 4 extends the design equation
previously obtained to networks of arbitrary size. Section 5 shows some simulation results for a
64×64 network designed by making use of this equation. Finally, Section 6 describes an on-chip
fully-programmable Gaussian filtering operation which validates the proposed design
methodology.
2. GAUSSIAN FILTERING IN AN RC NETWORK
The generic RC network analysed throughout this paper is depicted in Fig. 1(a), where the
initial voltage at the capacitor of every node represents the value of the corresponding pixel.
σ = 0 σ = 0.6 σ = 1 σ = 1.3
Figure 2. Gaussian ﬁltering over the Lena image (64×64 px).
Our objective is to design the MOS-based version in Fig. 1(b) in such a way that minimum
error is committed. For the ideal RC network, the equation satisﬁed at each node is:
$$\tau \frac{dV_{ij}}{dt} = -4V_{ij} + V_{i+1,j} + V_{i-1,j} + V_{i,j+1} + V_{i,j-1} \tag{1}$$
where τ = RC. Eq. (1) represents the spatially-discretized diﬀusion equation and its solution
is formally the scale-space representation of 2-D discrete signals [19]. Indeed, what happens in
the unforced RC network is that the initial charge of the capacitors actually diﬀuses, with a
pace determined by τ . Applying the DFT to Eq. (1) we obtain:
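$$\tau \frac{d\hat{V}_{uv}}{dt} = -4\hat{V}_{uv} + e^{\frac{2\pi i u}{M}}\hat{V}_{uv} + e^{-\frac{2\pi i u}{M}}\hat{V}_{uv} + e^{\frac{2\pi i v}{N}}\hat{V}_{uv} + e^{-\frac{2\pi i v}{N}}\hat{V}_{uv} \tag{2}$$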
where we have considered an array whose size is M ×N pixels. Notice that the dynamics of
those nodes located just at the edge of the array is not aﬀected by a complete 4-connected
neighborhood but by a reduced 2- or 3-connected one. This is equivalent to imposing mirror
boundary conditions at every time instant at the edges of the array. That is, Eq. (1) can
also be used for the nodes at the boundaries, bearing in mind that the nodes falling outside
the array correspond to dummy nodes which do not aﬀect the dynamics of the network as
their value always equals that of the boundary node under consideration at every time instant.
Eq. (2) can be rewritten as:
$$\tau \frac{d\hat{V}_{uv}}{dt} = -4\left[\sin^2\left(\frac{\pi u}{M}\right) + \sin^2\left(\frac{\pi v}{N}\right)\right]\hat{V}_{uv} \tag{3}$$
and solving now in the time domain we obtain:
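$$\hat{H}_{uv}(t) = \frac{\hat{V}_{uv}(t)}{\hat{V}_{uv}(0)} = e^{-\frac{4t}{\tau}\left[\sin^2\left(\frac{\pi u}{M}\right) + \sin^2\left(\frac{\pi v}{N}\right)\right]} \tag{4}$$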
where Vˆuv(0) represents the DFT of the image deﬁned by the initial voltages at the capacitors
and Vˆuv(t) is the DFT of the image deﬁned by those same node voltages after a certain time
interval t since the network started to evolve at time instant t = 0. Thus Eq. (4) describes
the ﬁltering process undergone by the initial image as the network evolves. It corresponds to
the spatially-discretized version of the ideal Gaussian filtering with spatial width $\sigma = \sqrt{2t/\tau}$
performed by a continuous-plane diffusion process [3]. Therefore, for a certain τ, each time
Figure 3. 2-node ideal RC network (a) and its MOS-based implementation (b)
instant of the dynamics of the network will be equivalent to a diﬀerent Gaussian ﬁlter. As an
example, consider Fig. 2, where we have applied Gaussian filters with σ = 0, σ = 0.6,
σ = 1 and σ = 1.3 to the picture of ‘Lena’.
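The relation between Eq. (4) and a Gaussian of width σ = √(2t/τ) can be checked numerically. The following is a minimal sketch, not part of the original work: it samples the transfer function of Eq. (4) on the DFT grid and compares it with the continuous-plane Gaussian response exp(−σ²|ω|²/2). The function names, the 64×64 size and the choice σ = 1 are illustrative assumptions.

```python
import numpy as np

def rc_transfer(M, N, t_over_tau):
    """Eq. (4): response of the ideal RC grid after a time t, expressed in units of tau."""
    u = np.arange(M).reshape(-1, 1)
    v = np.arange(N).reshape(1, -1)
    s = np.sin(np.pi * u / M) ** 2 + np.sin(np.pi * v / N) ** 2
    return np.exp(-4.0 * t_over_tau * s)

def gaussian_transfer(M, N, sigma):
    """Continuous-plane Gaussian response exp(-sigma^2 |w|^2 / 2) sampled on the DFT grid."""
    wu = 2.0 * np.pi * np.fft.fftfreq(M).reshape(-1, 1)  # angular frequency, rad/pixel
    wv = 2.0 * np.pi * np.fft.fftfreq(N).reshape(1, -1)
    return np.exp(-0.5 * sigma ** 2 * (wu ** 2 + wv ** 2))

sigma = 1.0                       # illustrative filter width (assumption)
t_over_tau = sigma ** 2 / 2.0     # from sigma = sqrt(2 t / tau)
H_rc = rc_transfer(64, 64, t_over_tau)
H_gauss = gaussian_transfer(64, 64, sigma)

# The two responses agree closely at low spatial frequencies and drift apart
# towards the Nyquist frequency, where the spatial discretization matters.
low_freq_gap = abs(H_rc[1, 0] - H_gauss[1, 0])
print("DC gain:", H_rc[0, 0], " gap at (u, v) = (1, 0):", low_freq_gap)
```

The small-angle approximation sin²(πu/M) ≈ (πu/M)² is what links both expressions, so the agreement is expected to degrade as the frequency index grows.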
3. ANALYSIS OF A 2-NODE NETWORK
Let us compare the 2-node grids in Fig. 3 as a ﬁrst approximation to the design of MOS
resistors for RC networks. For purposes of clarity, we will conﬁne the analysis to n-channel
MOS transistors, but it applies equally well for p-type transistors. The gate voltage VG is ﬁxed
and we will assume, without loss of generality, that the initial conditions of the capacitors
fulfill $V_{10} > V_{20}$, with $V_{10} = V'_{10}$ and $V_{20} = V'_{20}$. We will also assume that the transistor is
biased in the triode region for any voltage at the drain and source terminals, that will range
from Vmin to Vmax. The evolution of the circuit in Fig. 3(a) is described by this set of ODEs:
$$\begin{cases} C\,\dfrac{dV_1}{dt} = -\dfrac{V_1(t)-V_2(t)}{R} \\[2mm] C\,\dfrac{dV_2}{dt} = \dfrac{V_1(t)-V_2(t)}{R} \end{cases} \tag{5}$$
while the behaviour of the circuit in Fig. 3(b) is described by:
$$\begin{cases} C\,\dfrac{dV'_1}{dt} = -\dfrac{V'_1(t)-V'_2(t)}{R_M(t)} \\[2mm] C\,\dfrac{dV'_2}{dt} = \dfrac{V'_1(t)-V'_2(t)}{R_M(t)} \end{cases} \tag{6}$$
with:
$$R_M(t) = \frac{1}{k_n S_n \left\{V_C - \left[V'_1(t) + V'_2(t)\right]\right\}} \tag{7}$$
where $k_n = \mu_n C'_{ox}/2$, $V_C = 2(V_G - V_{Tn})$ and $S_n = W/L$. Several key aspects must be
clariﬁed at this point about Eq. (7). Firstly, it corresponds to the instantaneous resistance
of the transistor derived from the classical first-order approximation for the drain current of an
NMOS biased in the triode region. This is admittedly a coarse approximation of the real behaviour
of the transistors. However, it allows conclusions to be drawn while keeping the equations
reasonably manageable. These conclusions are conﬁrmed not only by simulation, where the
models include a great deal of second-order eﬀects, but also in the physical implementation
realized. This means that the first-order approximation captures the essential
features of the transistors for a reliable design of MOS-based RC networks. Secondly, due
to charge conservation, Eq. (7) can be expressed as:
$$R_M = \frac{1}{k_n S_n \left[V_C - (V'_{10} + V'_{20})\right]} \tag{8}$$
which means that, neglecting second-order eﬀects, the resistance of the transistor depends on
the sum of the initial conditions and does not vary along the corresponding diﬀusion. In other
words, Eq. (8) is telling us that, if we choose a certain set of initial conditions within the
prescribed signal range whose sum coincides and make RM = R for this sum, the dynamics
for any initial conditions within this set will be perfectly emulated by the MOS network. On
the contrary, there will always be an error for any other initial conditions outside the set as
the resistance of the transistor during the diﬀusion will never match R. The question arising
here is therefore: what is the value of the sum of initial conditions which, fulﬁlling RM = R,
minimizes the maximum error committed for any other possible value of this sum? That is,
defining $V_S = V'_{10} + V'_{20} = V_{10} + V_{20}$, what is its optimum value $V_{S_{opt}}$ for which, making:
$$\frac{1}{k_n S_n (V_C - V_{S_{opt}})} = R \tag{9}$$
the maximum error committed by the MOS network for any other possible value of VS is
minimum? The design equation of the MOSFET is immediately derived from Eq. (9):
$$S_{n_{opt}} = \frac{1}{k_n R\,(V_C - V_{S_{opt}})} \tag{10}$$
where the value of VSopt must be within the interval [2Vmin, 2Vmax] according to the signal
range previously established. Note that this design equation requires knowing the exact value
of kn, which can vary significantly across the design space delimited by the corners
of the process. We will see in Section 6 how to solve this problem.
In order to determine VSopt, notice firstly that the charge extracted from one capacitor will
end up in the other in both the ideal network and the MOS-based network. We can thus define
the error in the corresponding node voltages as:
$$\begin{cases} V'_1(t) = V_1(t) + \epsilon(t) \\ V'_2(t) = V_2(t) - \epsilon(t) \end{cases} \tag{11}$$
or, equivalently:
$$\epsilon(t) = \frac{V'_1(t) - V'_2(t)}{2} - \frac{V_1(t) - V_2(t)}{2} \tag{12}$$
Since our initial assumptions were $V_{10} = V'_{10}$ and $V_{20} = V'_{20}$, we have that $\epsilon(0) = 0$. Also,
the stationary state, reached when $t \to \infty$, renders $\epsilon(\infty) = 0$, as $V_1(\infty) = V_2(\infty)$ and
$V'_1(\infty) = V'_2(\infty)$. Therefore, there must be at least one point in time, let us call it $t_{ext}$, in
which the error reaches an extreme value, either positive or negative. That is, since the time
derivative of the error can be expressed as†:
$$\tau \frac{d\epsilon}{dt} = \left[V'_1(t) - V'_2(t)\right]\frac{V_{10} + V_{20} - V_{S_{opt}}}{V_C - V_{S_{opt}}} - 2\epsilon(t) \tag{13}$$
with τ = RC, it must vanish at $t_{ext}$, resulting in an extreme error of:
$$\epsilon_{ext} = \frac{1}{2}\left[V'_1(t_{ext}) - V'_2(t_{ext})\right]\frac{V_{10} + V_{20} - V_{S_{opt}}}{V_C - V_{S_{opt}}} \tag{14}$$
where $V_C$ is a constant, $V_{S_{opt}}$ is a design parameter and $V'_1(t_{ext})$ and $V'_2(t_{ext})$ are variables
which can be referred to the initial conditions by solving Eq. (6), obtaining:
$$\epsilon_{ext} = \frac{1}{2}(V_{10} - V_{20})\frac{V_{10} + V_{20} - V_{S_{opt}}}{V_C - V_{S_{opt}}}\,e^{-\frac{2t_{ext}}{R_M C}} \tag{15}$$
Finally, $t_{ext}$ can be found by solving Eq. (5) and Eq. (6) and making use of Eq. (11):
$$t_{ext} = \frac{\tau}{2}\,\frac{\ln(r)}{r - 1} \tag{16}$$
where:
$$r = \frac{R}{R_M} = \frac{V_C - V_{10} - V_{20}}{V_C - V_{S_{opt}}} \tag{17}$$
Substituting Eq. (16) in Eq. (15), we have the following expression:
$$\epsilon_{ext} = \frac{1}{2}(V_{10} - V_{20})\frac{V_{10} + V_{20} - V_{S_{opt}}}{V_C - V_{S_{opt}}}\,r^{\frac{r}{1-r}} \tag{18}$$
from which the ﬁrst important conclusion can be extracted. The extreme error is independent
of Sn and kn. That is to say, once Sn is deﬁned by Eq. (10) for a value of kn known, the error
committed by the MOS network does not depend on these parameters and therefore is not
aﬀected by their mismatch. This robustness to mismatch is conﬁrmed both by simulation and
in the physical implementation presented in this paper.
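Eq. (18) can be cross-checked against the time evolution of the error. The sketch below is only an illustration under the first-order model: it evaluates ε(t) from the closed-form solutions of Eqs. (5) and (6), samples it on a fine time grid, and compares the sampled extreme with Eq. (18). The voltages used (VG = 3.3 V, an assumed VTn = 0.5 V, VSopt = 1.5 V and the initial conditions) are assumptions chosen for the example; note that neither kn nor Sn appears anywhere in the computation.

```python
import math

def mos_error(t_over_tau, v10, v20, vc, vs_opt):
    """Error eps(t) of Eq. (12), using the exponential solutions of Eqs. (5) and (6)."""
    r = (vc - v10 - v20) / (vc - vs_opt)           # Eq. (17), r = R / R_M
    d = v10 - v20
    ideal_diff = d * math.exp(-2.0 * t_over_tau)   # V1(t) - V2(t)
    mos_diff = d * math.exp(-2.0 * r * t_over_tau) # V1'(t) - V2'(t)
    return 0.5 * (mos_diff - ideal_diff)

def extreme_error(v10, v20, vc, vs_opt):
    """Eq. (18); returns 0 when r = 1, where the MOS matches R exactly."""
    r = (vc - v10 - v20) / (vc - vs_opt)
    if math.isclose(r, 1.0):
        return 0.0
    return 0.5 * (v10 - v20) * (1.0 - r) * r ** (r / (1.0 - r))

# Illustrative values (assumptions, not taken from Table I)
VC = 2.0 * (3.3 - 0.5)     # VC = 2 (VG - VTn)
VS_OPT = 1.5
V10, V20 = 1.5, 0.75

# Sample eps(t) for t in [0, 5 tau] and keep the value of largest magnitude
sampled = [mos_error(k * 1e-4, V10, V20, VC, VS_OPT) for k in range(50001)]
numeric_extreme = max(sampled, key=abs)
closed_form_extreme = extreme_error(V10, V20, VC, VS_OPT)
print(numeric_extreme, closed_form_extreme)
```

The factor (1 − r) used in `extreme_error` is just the quotient (V10 + V20 − VSopt)/(VC − VSopt) of Eq. (18) rewritten through Eq. (17).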
Unfortunately, it is not possible to obtain the exact analytical expression for the extremes of $\epsilon_{ext}$ from
Eq. (18). This in turn implies that the exact analytical expression for VSopt
cannot be found either. However, a good approximation for them is still possible. Let us look
at Eq. (17). The value of r depends on a quotient where VC is the dominant term in
both the numerator and the denominator for deep triode biasing. Let us therefore assume that
$\partial r/\partial V_{10} \approx 0$ and $\partial r/\partial V_{20} \approx 0$. Under these conditions, it can be demonstrated that only one
critical point, more precisely a saddle point, can be found at (VSopt/2, VSopt/2). Therefore, we
can only talk of absolute maxima or minima, which will be at the boundaries of the domain
considered for (V10, V20). It can be demonstrated‡ that the absolute extremes are located at
†See APPENDIX A
‡See APPENDIX B
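the points $(V_{10}, V_{20}) = (V_{max}, V_{S_{opt}}/2)$ and $(V_{10}, V_{20}) = (V_{S_{opt}}/2, V_{min})$, their values being, respectively:
$$
\begin{cases}
\epsilon_{ext}|_{max} = \dfrac{1}{8}\,\dfrac{(2V_{max}-V_{S_{opt}})^2}{V_C-V_{S_{opt}}}\left(\dfrac{1}{2}+\dfrac{1}{2}\,\dfrac{V_C-2V_{max}}{V_C-V_{S_{opt}}}\right)^{\frac{2(V_C-V_{max})-V_{S_{opt}}}{2V_{max}-V_{S_{opt}}}} \\[3mm]
\epsilon_{ext}|_{min} = -\dfrac{1}{8}\,\dfrac{(V_{S_{opt}}-2V_{min})^2}{V_C-V_{S_{opt}}}\left(\dfrac{1}{2}+\dfrac{1}{2}\,\dfrac{V_C-2V_{min}}{V_C-V_{S_{opt}}}\right)^{-\frac{2(V_C-V_{min})-V_{S_{opt}}}{V_{S_{opt}}-2V_{min}}}
\end{cases}
\tag{19}
$$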
The final step is to determine the value of VSopt which minimizes $\max|\epsilon_{ext}|$. To this end, let
us again assume deep triode biasing. In this way, the exponential terms in $\epsilon_{ext}|_{max}$ and in
$\epsilon_{ext}|_{min}$ can be approximated by 1 when varying VSopt within the range of its possible values
[2Vmin, 2Vmax]. Applying this approximation, it can be seen that increasing or decreasing
VSopt has antagonistic effects on the magnitudes of $\epsilon_{ext}|_{max}$ and $\epsilon_{ext}|_{min}$, so that:
$$V_{S_{opt}} = V_{min} + V_{max} \tag{20}$$
is the expression for VSopt which minimizes the magnitude of the error:
$$\min\left(\max|\epsilon_{ext}|\right) = \frac{1}{8}\,\frac{(V_{max} - V_{min})^2}{V_C - V_{min} - V_{max}} \tag{21}$$
Notice that this minimized error depends inversely on VC , that is, on VG − VTn , which was
considered ﬁxed at the beginning of the design process. Thus, VG and VTn must be chosen in
such a way that makes their diﬀerence as large as possible.
Finally, by substituting Eq. (20) in Eq. (10), the design equation for minimum error is
obtained:
$$S_{n_{opt}} = \frac{1}{k_n R\,(V_C - V_{min} - V_{max})} \tag{22}$$
This conclusion about VSopt invalidates the unfounded intuition that the optimal design could
be derived from equating the midpoint of the interval of possible values of RM to R. On
the contrary, the value of RM which matches R for optimal design is notably below such a
midpoint.
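As a worked illustration of Eq. (22), the following sketch sizes the transistor for a target resistance and then evaluates the resistance interval it delivers over the signal range. The value of kn and the threshold voltage are hypothetical placeholders (real values depend on the process corner), and the helper names are not from the original text; only the first-order model of Eq. (8) is used.

```python
def optimal_aspect_ratio(kn, r_target, vg, vtn, vmin, vmax):
    """Eq. (22): aspect ratio W/L that makes R_M = R when V1 + V2 = Vmin + Vmax."""
    vc = 2.0 * (vg - vtn)
    return 1.0 / (kn * r_target * (vc - vmin - vmax))

def mos_resistance(kn, sn, vg, vtn, v_sum):
    """Eq. (8): first-order triode resistance for a given sum of node voltages."""
    vc = 2.0 * (vg - vtn)
    return 1.0 / (kn * sn * (vc - v_sum))

KN = 85e-6               # A/V^2, hypothetical value of mu_n * C'ox / 2
VG, VTN = 3.3, 0.5       # V; VTn is an assumed typical threshold
VMIN, VMAX = 0.0, 1.5    # V, signal range
R_TARGET = 100e3         # ohm

sn_opt = optimal_aspect_ratio(KN, R_TARGET, VG, VTN, VMIN, VMAX)
r_low = mos_resistance(KN, sn_opt, VG, VTN, 2.0 * VMIN)    # both nodes at Vmin
r_high = mos_resistance(KN, sn_opt, VG, VTN, 2.0 * VMAX)   # both nodes at Vmax
r_mid = 0.5 * (r_low + r_high)                             # midpoint of the interval

print("W/L =", sn_opt)
print("R_M interval:", r_low, "to", r_high, " midpoint:", r_mid)
```

Under these assumed numbers the targeted R sits below the midpoint of the resistance interval, in line with the remark above, because RM grows faster than linearly as the sum of the node voltages approaches VC.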
It is now time to numerically corroborate the validity of the assumptions made and, along
the way, to quantify the magnitude of the error achieved by the optimum design. Table I shows some
results for typical values of the voltages involved. As mentioned above, the larger
VG the smaller the error. Thus, two usual maximum biasing voltages for the transistor gate
in current CMOS technologies are introduced in Table I. Regarding VTn, two typical related
values are used. We do not take into account the possibility of using low- or even zero-threshold
transistors currently offered by some manufacturers, which would further reduce the error. Note
that the columns labelled as max eext, min eext and VSoptnum are numerically calculated from
Eq. (18) whereas those labelled as emax, emin and VSopt are directly calculated from
Eq. (19) and Eq. (20) respectively. Several conclusions can be drawn from a careful analysis
of Table I. First of all, the deeper the ohmic biasing the smaller the error committed by the MOS
network and the better the approximations applied, as expected. Secondly, despite the large
signal swings considered and therefore the large variation of the instantaneous resistance of
the MOSFET, represented by the wide interval of possible values of r, the optimal design
keeps the error moderately small. This is due to the suitability of this design for the diffusion
dynamics. Note that in the ideal network, the maximum charge injection occurs when one of
the nodes equals Vmax whereas the other equals Vmin. This situation can only exist right at
the beginning of the diffusion. A significant error committed by the MOS-based grid at this
point would noticeably alter the rest of the dynamics. However, such a configuration
of the voltages makes their sum equal to Vmin+Vmax and therefore the MOS network does not
commit any error at all. On the contrary, the error committed by the MOS grid is maximum
when the voltages of the nodes involved coincide at Vmin or Vmax, their sum being 2Vmin or
2Vmax respectively. But there is no charge injection between the nodes in these cases as their
voltages are the same. Therefore, the maximum error is committed when the dynamics is not
affected. Finally, note that no specific value of R has been included in Table I as it is not
necessary to compute the error. Once VG, VTn and [Vmin, Vmax] are set, Snopt can ideally take,
from Eq. (22), any value according to the targeted R without affecting the magnitude of the
error.
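The kind of numerical search summarized in Table I can be sketched as follows. This is not the authors' script: it is a brute-force illustration that evaluates Eq. (18) over a grid of initial conditions with V10 ≥ V20, records the worst-case magnitude for each candidate VSopt, and picks the candidate with the smallest worst case. The bias values (VG = 3.3 V, an assumed VTn = 0.5 V) and the grid resolutions are assumptions.

```python
import numpy as np

def extreme_error_grid(vc, vs_opt, vmin, vmax, n=61):
    """Eq. (18) evaluated on an n x n grid of initial conditions (V10, V20)."""
    v = np.linspace(vmin, vmax, n)
    v10, v20 = np.meshgrid(v, v, indexing="ij")
    r = (vc - v10 - v20) / (vc - vs_opt)                 # Eq. (17)
    one_minus_r = 1.0 - r
    # Avoid 0/0 at r = 1; the error is zero there because (1 - r) multiplies it
    safe = np.where(np.abs(one_minus_r) < 1e-12, 1.0, one_minus_r)
    eps = 0.5 * (v10 - v20) * one_minus_r * r ** (r / safe)
    eps[v10 < v20] = 0.0                                 # keep only V10 >= V20
    return eps

def worst_case_error(vc, vs_opt, vmin, vmax):
    """Largest |eps_ext| over all admissible initial conditions."""
    return float(np.abs(extreme_error_grid(vc, vs_opt, vmin, vmax)).max())

VMIN, VMAX = 0.0, 1.5
VC = 2.0 * (3.3 - 0.5)          # assumed bias: VG = 3.3 V, VTn = 0.5 V

candidates = np.linspace(2.0 * VMIN, 2.0 * VMAX, 121)
worst = [worst_case_error(VC, vs, VMIN, VMAX) for vs in candidates]
vs_num = float(candidates[int(np.argmin(worst))])        # numerical optimum
bound_eq21 = (VMAX - VMIN) ** 2 / (8.0 * (VC - VMIN - VMAX))

print("numerical VSopt:", vs_num, " analytical VSopt (Eq. 20):", VMIN + VMAX)
print("worst-case error at numerical optimum:", min(worst), " Eq. (21):", bound_eq21)
```

Because Eq. (21) replaces the exponential factor by 1, the value it gives should be read as an upper estimate of the worst-case error obtained from the full Eq. (18).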
4. NETWORKS OF ARBITRARY SIZE
Taking into account the mathematical framework deployed to optimize the design of a 2-node
MOS network, it is obvious that the extension of the results to networks of arbitrary size
cannot be addressed in the same way. Our proposal for such an extension is a stochastic approach.
Let us suppose an M × N RC grid similar to that of Fig. 1(a) where every initial pixel value
can be modelled by a random variable with a uniform distribution between Vmin and Vmax,
that is, Vij(0) ∼ U(Vmin, Vmax), according to the notation in [20]. Such a distribution is depicted
in Fig. 4(a). This is a rather reasonable supposition, especially if the grid is intended to process
natural images. In this way, if we choose any two neighboring nodes of the network, namely
(i, j) and (k, l), the resulting distribution of the sum of both nodes is a triangle like the one
represented in Fig. 4(b). It can be seen that the most probable value of the sum is Vmin+Vmax.
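The triangular distribution of the sum can be reproduced with a short Monte Carlo experiment. The sketch below assumes, as the argument above does, that the two node voltages are independent and uniformly distributed; neighboring pixels of natural images are usually correlated, so this is an idealization. The sample size, seed and number of histogram bins are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
VMIN, VMAX = 0.0, 1.5
N_SAMPLES = 200_000

# Two independent node voltages, each U(Vmin, Vmax)
v_a = rng.uniform(VMIN, VMAX, N_SAMPLES)
v_b = rng.uniform(VMIN, VMAX, N_SAMPLES)
sums = v_a + v_b

# Histogram of the sum over [2 Vmin, 2 Vmax]; its shape approximates a triangle
counts, edges = np.histogram(sums, bins=30, range=(2.0 * VMIN, 2.0 * VMAX))
centers = 0.5 * (edges[:-1] + edges[1:])
mode_estimate = float(centers[np.argmax(counts)])   # most populated bin

print("estimated mode of the sum:", mode_estimate)
print("Vmin + Vmax:", VMIN + VMAX)
```

The most populated bin is expected to lie next to Vmin + Vmax, which is the sum at which the design equation makes the transistor resistance equal to R.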
Let us see what happens at the end of the processing. Consider again Eq. (4), which deﬁnes
the ﬁltering carried out by a RC network along time. Notice that the DC component is the
only one that is not aﬀected by the ﬁltering, that is, Hˆ00(t) = 1 ∀t. It means that the average
value of the pixels does not change during the processing. Thus, when the diﬀusion is running,
the higher frequencies left are progressively ﬁltered until all of them are eventually removed
except the DC component. In other words, the values of the pixels are progressively getting
closer until all of them eventually coincide at the mean value, which has never been altered.
Let $\bar{V}$ be the random variable representing the mean value of the pixels of an image. It can
be expressed as:
$$\bar{V} = \frac{V_{1,1}(0) + V_{1,2}(0) + \ldots + V_{ij}(0) + \ldots + V_{M,N}(0)}{MN} \tag{23}$$
from which, taking into account that every pixel $V_{ij}(0)$ presents a uniform distribution and by
applying the central limit theorem, we can conclude that $\bar{V}$ approaches a normal distribution
$N(\mu, \sigma^2)$ as follows:
(a) (b)
Figure 4. Probability density function for the initial voltage (a) and for the sum of any two initial
voltages (b) at the nodes of a RC network.
$$\bar{V} \sim N\left[\frac{V_{min} + V_{max}}{2},\ \frac{(V_{max} - V_{min})^2}{12}\right] \tag{24}$$
and as every pixel will reach the mean value at the end of the diffusion process, $V_{ij}(\infty)$ presents
the same distribution as defined by Eq. (24), depicted in Fig. 5(a). It implies that the most
probable value of the sum of any two voltages of the network by the end of the diffusion is
again Vmin + Vmax, as shown in Fig. 5(b). We have seen what happens at the beginning
and at the end of the diﬀusion, but what about the processing itself? We start from uniform
distributions for every pixel. When the diﬀusion is being performed, the probability of a certain
pixel to reach the mean value is constantly increasing as the probability of having ﬁltered all
the frequencies other than the DC component also increases. Note that, according to Eq. (4),
any time interval carrying out diﬀusion implies necessarily the ﬁltering of frequencies other
than the DC component. Thus, the uniform distribution represented in Fig. 4(a) is transformed
along the diﬀusion until eventually becoming that in Fig. 5(a). However, the interest for us in
this transformation falls on the fact that, on increasing the probability of a certain pixel to
reach the mean value, its most probable value during the diﬀusion is (Vmin + Vmax)/2 as this
is the most probable value of the mean value. And it in turn means that the most probable
value of the sum of any two voltages within the ideal network at any time instant during the
diﬀusion is Vmin+Vmax. From this result, and keeping in mind that the elementary transistor
at a MOS-based RC grid emulating the ideal network just described presents an instantaneous
resistance:
$$R_{M_{ij,kl}}(t) = \frac{1}{k_n S_n \left\{V_C - \left[V'_{ij}(t) + V'_{kl}(t)\right]\right\}} \tag{25}$$
it can be concluded that this value of resistance should match the resistor of the ideal network
for the most probable value of the sum of any two neighbor voltages, that is:
$$\frac{1}{k_n S_n \left[V_C - (V_{min} + V_{max})\right]} = R \tag{26}$$
which directly leads to the design equation obtained for the 2-node case:
(a) (b)
Figure 5. Probability density function for the ﬁnal voltage (a) and for the sum of any two ﬁnal voltages
(b) at the nodes of a RC network.
$$S_{n_{opt}} = \frac{1}{k_n R\,(V_C - V_{min} - V_{max})} \tag{27}$$
In this way, we make sure that the instantaneous diﬀusion performed by each elementary
MOS transistor between its drain and source terminals equals that of the corresponding ideal
resistor for most of the time instants of the dynamics, thus introducing minimum error from a
stochastic point of view. Note that we are implicitly assuming that this error is small enough
to apply to the MOS grid the same considerations extracted from the ideal grid regarding the
distributions and most probable values of the voltages during the diﬀusion. The simulation
results presented in the next section conﬁrm that this assumption can be made.
5. SIMULATION OF A 64×64 NETWORK
This section addresses the design of a 64 × 64 MOS-based RC network. The objective is to
conﬁrm the validity of the guidelines drawn in the previous Section despite having made use of
a coarse approximation for the behavior of the transistors in the triode region. Simulations have
been carried out using standard 0.35µm CMOS 3.3V process transistor models in HSPICE. The
signal range at the nodes is [0V,1.5V], wide enough to evidence the inﬂuence of the MOSFET
nonlinearities over the Gaussian ﬁltering performed by the grid. VG is established at 3.3V in
order to bias the transistor as deep in the ohmic region as possible. The design speciﬁcation
is to implement an RC network with τ = 100ns by using a resistor R = 100kΩ and a capacitor
C = 1pF. The sizing of the elementary NMOS transistor to achieve this value of R is based
on Eq. (27). But this equation does not take into account second-order effects such as
the body effect. This means that Sn does not only depend on the sum Vmin + Vmax but also on
the values of the voltages at drain and source that render that sum. Thus, for a specific value
of Sn, the instantaneous resistance implemented by the transistor can vary ±5% depending on
the drain and source voltages applied. In order to take into account these variations, we have
selected the minimum possible width W = 0.4µm. Then, we have swept L until finding the
value which makes the average resistance of the transistor, for all the possible voltages at the
drain and source terminals rendering the optimum sum, i.e. Vmin + Vmax, equal to 100kΩ.
(a) (b) (c) (d)
Figure 6. (a) Original image, (b) MOS-diﬀused image at the instant of maximum error, (c) image
diﬀused by the corresponding ideal RC network, (d) absolute error normalized to maximum individual
pixel error
The result is L = 7.54µm§.
Once the elementary transistor has been designed, let us suppose that the initial voltage at the
capacitors is proportional to the image intensity displayed in Fig. 6(a). The MOS-based RC
network runs the diffusion over these initial voltages in parallel with its ideal counterpart so
that both can be compared. The deviation is measured via the RMSE (Fig. 7(a)) and reaches a
maximum soon after the beginning of the diﬀusion process. The state of the corresponding
nodes in both networks at this point, displayed in Figs. 6(b) and 6(c), is perceptually equivalent.
The maximum observed RMSE for the complete image is 0.5%, while the maximum individual
pixel error is 1.76%. The RMSE remains below 0.6% — equivalent resolution between 6 and
7 bits — even introducing an exaggerated mismatch (10%) in the transistors’ VTn0 and µn
(Fig. 7(b)), thus confirming the robustness to mismatch predicted in Section 3.
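The comparison between the MOS-based grid and its ideal counterpart can be mimicked, at a much coarser level than the HSPICE simulations reported here, with the first-order model of Eq. (25). The sketch below is an assumption-laden illustration: it integrates both networks with an explicit Euler scheme on a random 32×32 image, uses an assumed VTn = 0.5 V, ignores second-order effects and mismatch, and normalizes the RMSE to the signal range. Its numbers are not expected to reproduce those of Fig. 7.

```python
import numpy as np

def diffuse_step(v, dt_over_tau, vc=None, vs_opt=None):
    """One explicit Euler step of the grid dynamics with open (mirror-like) edges.

    With vc=None every link behaves as the ideal resistor R (Eq. (1)).
    Otherwise each link conductance is scaled by R / R_M according to Eq. (25).
    """
    dh = v[:, 1:] - v[:, :-1]          # voltage drops across horizontal links
    dv = v[1:, :] - v[:-1, :]          # voltage drops across vertical links
    if vc is not None:
        dh = dh * (vc - (v[:, 1:] + v[:, :-1])) / (vc - vs_opt)
        dv = dv * (vc - (v[1:, :] + v[:-1, :])) / (vc - vs_opt)
    out = v.copy()
    out[:, :-1] += dt_over_tau * dh    # charge received from the right neighbor
    out[:, 1:] -= dt_over_tau * dh     # same charge removed from that neighbor
    out[:-1, :] += dt_over_tau * dv
    out[1:, :] -= dt_over_tau * dv
    return out

rng = np.random.default_rng(1)
VMIN, VMAX = 0.0, 1.5
VC = 2.0 * (3.3 - 0.5)                 # assumed bias: VG = 3.3 V, VTn = 0.5 V
VS_OPT = VMIN + VMAX                   # design point of Eq. (27)

image = rng.uniform(VMIN, VMAX, (32, 32))   # random stand-in for a real image
v_ideal, v_mos = image.copy(), image.copy()
rmse_trace = []
for _ in range(100):                   # 100 steps of 0.02 tau, i.e. t = 2 tau
    v_ideal = diffuse_step(v_ideal, 0.02)
    v_mos = diffuse_step(v_mos, 0.02, VC, VS_OPT)
    rmse = np.sqrt(np.mean((v_mos - v_ideal) ** 2)) / (VMAX - VMIN)
    rmse_trace.append(float(rmse))

rmse_max = max(rmse_trace)
print("maximum RMSE (fraction of signal range):", rmse_max)
```

Since every link moves the same charge out of one node and into the other, the mean voltage is preserved in both networks, which matches the observation that the DC component is not affected by the filtering.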
6. ON-CHIP GAUSSIAN FILTERING
This section describes the programmable focal-plane Gaussian ﬁltering operation performed
by a QCIF resolution smart CMOS imager [21] manufactured in the AMS 0.35µm CMOS-
OPTO 3.3V process. This CMOS process does not incorporate any special device for
image sensors. Indeed, it only diﬀers from the standard AMS 0.35µm process in an anti-
reﬂective coating and an EPI wafer which reduces the dark current. The chip implements a
massively parallel focal-plane processing array which can output diﬀerent kinds of simpliﬁed
representations from an image sequence at very low energy cost. Focal-plane processing is
performed on every single image frame. Only spatial information is employed. Temporal
variations between consecutive images in a sequence are not taken into account. The main
characteristics of the chip are summarized in Table II. We can compare this prototype
with other realizations of programmable bandwidth focal-plane Gaussian filtering, like those in [11]
and [18]. A figure of merit which takes into account their major features can be computed:
§This transistor length lies off the physical design grid, which fixes the minimum feature size at 0.05µm.
We are using it here to illustrate the design procedure.
[Plots (a) and (b): RMSE (%), from 0 to 0.7, versus Time (sec.), from 0 to 1 × 10^-5]
Figure 7. RMSE of the MOS-based grid state vs. ideal RC grid state: (a) w/o mismatch, (b) Monte
Carlo with 10% mismatch
FOM = (Power · Area)/(Spatial resolution · Throughput). Thus, for the prototype of this
paper, the FOM, measured in pJ·mm²/px·Sa, results in 84.5 whereas it is 1.49 × 10^6 for [11],
and 95.1 for [18].
The functionality of the array is mostly based on the fully-programmable time-controlled
Gaussian filtering carried out by an RC network. In Section 2 we defined the width of such
filtering as $\sigma = \sqrt{2t/\tau}$. Since this width depends on the quotient between the time interval
which the network is permitted to evolve and the time constant of the network, their values
must be correlated in order to achieve a high degree of programmability over the ﬁltering.
This in turn establishes several tradeoffs for the design of the corresponding circuitry. On the one
hand, the larger the value of τ, the coarser the time control of the network dynamics needed
to render a certain value of σ, which simplifies the circuitry for this control. Besides, larger
values of τ need more area for the elementary transistor and elementary
capacitor of the network, which are therefore more robust to mismatch. On the other hand, the
area consumed by these components has dramatic consequences for the size of the array, as
they must be included in each and every elementary cell of the focal plane. From this point of
view, a small value of τ is more adequate. The problem is that a finer temporal control
of the network dynamics would be mandatory in such a case, even forcing the internal, i.e. on-chip,
generation of the pulses which control the evolution of the grid. Otherwise, propagation
delays could distort the resulting filtering.
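The relation σ = √(2t/τ) can be illustrated numerically. The sketch below is not the authors' simulation setup: it assumes a one-dimensional ideal RC line with node equation τ·dV_i/dt = V_{i-1} − 2V_i + V_{i+1}, integrates it with a plain forward-Euler scheme, and measures the spread of an initial impulse. The value of τ, the line length and the function names are hypothetical.

```python
import math

def diffuse_line(v0, t, tau, steps=400):
    """Forward-Euler integration of tau*dV_i/dt = V_{i-1} - 2*V_i + V_{i+1}."""
    a = (t / steps) / tau          # Euler step in units of tau (keep well below 0.5)
    v = list(v0)
    n = len(v)
    for _ in range(steps):
        nxt = v[:]
        for i in range(n):
            # End nodes have a single neighbour, so no charge leaves the line.
            left = v[i - 1] if i > 0 else v[i]
            right = v[i + 1] if i < n - 1 else v[i]
            nxt[i] = v[i] + a * (left - 2.0 * v[i] + right)
        v = nxt
    return v

def measured_sigma(v):
    """Standard deviation (in pixels) of the voltage profile seen as a distribution."""
    total = sum(v)
    mean = sum(i * x for i, x in enumerate(v)) / total
    var = sum(x * (i - mean) ** 2 for i, x in enumerate(v)) / total
    return math.sqrt(var)

tau = 100e-9                       # hypothetical time constant (s)
t = 2.0 * tau                      # diffusion time chosen so that sigma = 2
line = [0.0] * 41
line[20] = 1.0                     # unit impulse at the centre node

sigma_sim = measured_sigma(diffuse_line(line, t, tau))
sigma_formula = math.sqrt(2.0 * t / tau)
print(sigma_sim, sigma_formula)
```

Since only the ratio t/τ enters, halving τ halves the diffusion time required for the same σ, which is the reason a small τ calls for finer time control.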
6.1. DIFFUSION DURATION CONTROL
In order to reduce as much as possible the value of τ and therefore the size of the focal-plane
processing array, we propose a method for a ﬁne control of t based on an on-chip VCO. The
ﬁrst block of the diﬀusion duration control circuit is the VCO itself (Fig. 8(a)). It consists of
a ring of pseudo-NMOS inverters in which the load current is controlled by ‘Vbias clk’, thus
modifying the propagation delay of each stage. This circuit provides an internal clock that will
be employed to time pulses that add up to the ﬁnal diﬀusion duration.
Figure 8. (a) 15-stage inverter ring VCO and (b) diﬀusion control logic
The main block of the diﬀusion control is a 12-stage shift register, Fig. 8(b). It will store
a chain of 1’s indicating how many clock cycles the diﬀusion process will run. The clock
employed for this will be either external or the already described internal VCO¶. The output
signal, diff ctrl, is a pulse with the desired duration of the diﬀusion, t, that is inverted and
delivered to the gates of the pMOS resistors. Thus, t depends on two parameters: N1,
the number of logic ‘1’s within the bit string, and fCLK, the frequency of the clock,
so that t = N1/fCLK. A minimum step of around t = 6.66ns can be achieved, which corresponds to a single clock cycle at the upper end of the internal clock frequency range (Table II).
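The timing relation t = N1/fCLK can be sketched as follows. The helper name is hypothetical, and the sketch assumes that N1 is bounded by the 12 stages of the shift register and that the maximum clock frequency is the 150MHz upper limit listed in Table II.

```python
REGISTER_STAGES = 12       # length of the shift register (Fig. 8(b))
F_CLK_MAX_HZ = 150e6       # upper end of the internal clock range (Table II)

def diffusion_time(n_ones, f_clk_hz):
    """Duration of the diffusion pulse, t = N1 / f_CLK, in seconds."""
    if not 0 <= n_ones <= REGISTER_STAGES:
        raise ValueError("N1 must fit in the shift register")
    if f_clk_hz <= 0:
        raise ValueError("clock frequency must be positive")
    return n_ones / f_clk_hz

t_min = diffusion_time(1, F_CLK_MAX_HZ)                  # single clock cycle
t_full = diffusion_time(REGISTER_STAGES, F_CLK_MAX_HZ)   # register full of 1's

print(f"minimum step: {t_min * 1e9:.2f} ns")
print(f"longest pulse at 150 MHz: {t_full * 1e9:.1f} ns")
```

Longer diffusion times are reached by lowering fCLK rather than by lengthening the bit string.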
6.2. MOS-BASED RC NETWORK
Once tmin has been set, the design of the RC network can be addressed. Our objective is to
implement a value of τ around one order of magnitude greater than tmin. This means that
Gaussian ﬁlters with widths below σ = 1 must be easily achieved. The design methodology
applied for the elementary MOS resistor is exactly the same as in Section 5. The result is
depicted in Fig. 9. The MOS-based elementary capacitor has a nominal value of 1pF whereas
the elementary MOSFET implements, for typical mean conditions (TM), a resistance of 85kΩ.
¶The aim of the internal VCO is to reach a better resolution of the diﬀusion time than an external clock. For
loading the appropriate sequence into the register, an external, and slower, clock is usually preferred.
Figure 9. On-chip RC network
A key point, briefly commented on in Section 3, is that Snopt varies, due to its dependence
on kn, across the design space delimited by the corners of the process. Or, equivalently, for a
fixed Snopt, the resistance implemented by the transistor varies according to the value of kn. In
our case, this resistance ranges from 49kΩ (WP corner) to 148kΩ (WS corner). The
problem therefore is to know the exact value of kn in a specific implementation. We propose
a calibration process to solve this. It is carried out by means of
two pairs of neighboring pixels whose initial voltage can be externally set. One pair is located at
the upper left corner of the array whereas the other one is at the upper right corner, which allows
across-die variations to be taken into account. In order to obtain the value of τ, one of the two
pixels in every pair is set to Vmin whereas its neighbor is set to Vmax. Successive
diffusion steps are programmed until the steady state is reached. After each diffusion step, the
pixel values are read out in order to be compared off-line to an ideal diffusion from the same
initial conditions. This ideal diffusion, where the sum of the drain and source voltages equals
Vmin + Vmax at every time instant, should be perfectly emulated by the MOS-based diffusion
according to Eq. (27). Obviously, second-order effects will cause deviations, but a least-squares
fitting of the pixel values read out from the chip to the ideal diffusion yields
the average τ, similarly to how the average resistance is obtained in Section 5. The
result for the upper left corner is depicted in Fig. 10, where a minimum RMSE of 2.26% is
obtained for τ = 72.4ns. In the upper right corner, a minimum RMSE of 0.58% is reached for
τ = 69.8ns. These values are consistent with the range of possible values of the
MOS resistance mentioned above, which, for the nominal 1pF capacitor, corresponds to τ between 49ns and 148ns. The final value of τ considered for the whole array is
the average of the extracted values, that is, τ = 71.1ns.
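The off-line fitting step of this calibration can be sketched as below. This is a minimal illustration, not the authors' procedure: it assumes the ideal two-node solution V1(t) − V2(t) = (V10 − V20)·exp(−2t/τ) with constant V1 + V2 (which follows from Eq. (30)), an RMSE normalized to the signal range, a brute-force search over candidate values of τ, and synthetic readings in place of chip data. All names are hypothetical.

```python
import math

V_MIN, V_MAX = 1.5, 2.5        # signal range from Table II (V)

def ideal_pair(t, tau):
    """Ideal 2-node diffusion: the sum stays constant, the difference decays."""
    mean = 0.5 * (V_MIN + V_MAX)
    half = 0.5 * (V_MAX - V_MIN) * math.exp(-2.0 * t / tau)
    return mean - half, mean + half   # pixel started at V_MIN, pixel started at V_MAX

def rmse_percent(samples, tau):
    """RMSE between readings and the ideal diffusion, in % of the signal range."""
    sq = 0.0
    for t, v_lo, v_hi in samples:
        i_lo, i_hi = ideal_pair(t, tau)
        sq += (v_lo - i_lo) ** 2 + (v_hi - i_hi) ** 2
    return 100.0 * math.sqrt(sq / (2 * len(samples))) / (V_MAX - V_MIN)

def fit_tau(samples, candidates):
    """Least-squares estimate of tau by exhaustive search over the candidates."""
    return min(candidates, key=lambda tau: rmse_percent(samples, tau))

# Synthetic, noise-free readings every 20 ns up to 500 ns (stand-in for chip data).
true_tau = 71.1e-9
samples = [(k * 20e-9, *ideal_pair(k * 20e-9, true_tau)) for k in range(26)]

# Candidate range spanning the process corners, in 0.1 ns steps.
candidates = [(40.0 + 0.1 * k) * 1e-9 for k in range(1101)]
tau_hat = fit_tau(samples, candidates)
print(f"fitted tau = {tau_hat * 1e9:.1f} ns")
```

With real readings the minimum RMSE would not vanish; its value plays the role of the 2.26% and 0.58% residuals quoted for the two corners.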
6.3. EXPERIMENTAL RESULTS
Once τ is calibrated, any on-chip Gaussian ﬁlter can be compared to its ideal counterpart
obtained by solving the spatially-discretized diﬀusion equation. Thus, a single image is
captured. This image is converted to the digital domain and delivered through the output bus.
Ideal Gaussian ﬁlters with increasing widths are applied over this image. The same ﬁlters are
Figure 10. Calibration of τ at the upper left corner (voltage in V vs. time in s, from 0 to 5×10−7 s; ideal and chip values for pixels 1 and 2)
Figure 11. Evolution of the RMSE (%) for on-chip Gaussian filtering with respect to the ideal case (time in s, from 0 to 3×10−6 s)
implemented on-chip by programming the appropriate time intervals of diffusion. After every on-chip
filtering, the resulting image is converted to digital and delivered to the test instruments to
be compared to its ideal counterpart generated by MATLAB® (Fig. 11). A total of 12 different
filters have been applied to the original captured image. Six of them are represented in Fig. 12
(first row) and compared to the ideal images (second row). The last row contains a pictorial
representation of the error, normalized in each case to the highest measured error on individual
pixels, which is 0%, 24.99%, 19.39%, 6.17%, 3.58% and 6.68%, respectively. It can be seen
how noise eventually becomes the dominant error for the largest values of σ. Keep in mind
t = 0 (σ = 0), t = 20ns (σ ≈ 0.75), t = 40ns (σ ≈ 1), t = 200ns (σ ≈ 2.4), t = 800ns (σ ≈ 4.75), t = 1500ns (σ ≈ 6.5)
Figure 12. Comparison of Gaussian filtering for different values of σ. The first row corresponds to the
images extracted from the chip, the second one to their ideal counterparts and the
third one to their normalized difference.
that this noise is added to each image coming from the chip because of the readout mechanism.
In contrast, the noise present in the initial image is filtered by the ideal Gaussian filters
applied off-chip. The key point here is that the error is kept at a reasonable level even though
no FPN post-processing is carried out. This fact, together with the efficiency of the focal-plane
operation, is crucial for artificial vision applications under strict power budgets. Furthermore,
the accuracy of the processing predicted by simulation is very close to that of the filters with
small σ, where noise is not yet dominant. This validates all the steps of the design methodology
proposed in this paper, especially the use of the classical first-order approximation for a
transistor biased in the triode region. Despite its simplicity, it has proved to be sufficient for a
robust design.
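The filter widths attached to the diffusion times of Fig. 12 can be reproduced from σ = √(2t/τ) with the calibrated τ = 71.1ns. The snippet below is only a consistency check of those labels; the variable names are illustrative.

```python
import math

TAU_S = 71.1e-9                             # calibrated time constant (Section 6.2)
times_ns = [0, 20, 40, 200, 800, 1500]      # diffusion times shown in Fig. 12

# sigma = sqrt(2 t / tau), with t converted from ns to s
sigmas = [math.sqrt(2.0 * t * 1e-9 / TAU_S) for t in times_ns]

for t, s in zip(times_ns, sigmas):
    print(f"t = {t:5d} ns -> sigma = {s:.2f}")
```

The computed widths (0, 0.75, 1.06, 2.37, 4.74 and 6.50) agree with the rounded values given in the figure.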
7. CONCLUSIONS
A methodology for the optimal design and VLSI implementation of focal-plane Gaussian
ﬁltering has been presented. It is based on the ﬁne control of the diﬀusion dynamics of a
MOS-based RC network. The inclusion of transistors instead of true resistors achieves a much
more area-efficient implementation. Besides, the control of their gate voltages allows the
diffusion to be stopped when required, thus implementing filters with programmable width. Together
with this programmability, the most remarkable features of the methodology proposed are
the robustness to mismatch and the accuracy of the ﬁltering despite the nonlinearities of the
transistors. This makes the VLSI implementation of MOS-based RC networks a very useful
and eﬃcient tool for early vision processing.
APPENDIX A
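Firstly, the time derivative is applied to Eq. (12), obtaining:
$$\frac{d\varepsilon(t)}{dt} = \frac{1}{2}\frac{d}{dt}\left[V_1'(t) - V_2'(t)\right] - \frac{1}{2}\frac{d}{dt}\left[V_1(t) - V_2(t)\right] \qquad (28)$$
Taking into account now that, from Eq. (6):
$$\frac{d}{dt}\left[V_1'(t) - V_2'(t)\right] = -2\,\frac{V_1'(t) - V_2'(t)}{R_M C} \qquad (29)$$
and from Eq. (5):
$$\frac{d}{dt}\left[V_1(t) - V_2(t)\right] = -2\,\frac{V_1(t) - V_2(t)}{RC} \qquad (30)$$
we can rewrite Eq. (28) as:
$$\frac{d\varepsilon(t)}{dt} = \frac{V_1(t) - V_2(t)}{RC} - \frac{V_1'(t) - V_2'(t)}{R_M C} \qquad (31)$$
where, defining τ = RC, we have:
$$\tau\,\frac{d\varepsilon(t)}{dt} = V_1(t) - V_2(t) - \frac{R}{R_M}\left[V_1'(t) - V_2'(t)\right] \qquad (32)$$
Finally, considering that, according to Eq. (11):
$$V_1(t) - V_2(t) = V_1'(t) - V_2'(t) - 2\varepsilon(t) \qquad (33)$$
and substituting Eq. (8) and Eq. (9) in Eq. (32), we obtain:
$$\tau\,\frac{d\varepsilon(t)}{dt} = \left[V_1'(t) - V_2'(t)\right]\frac{V_{10} + V_{20} - V_{S_{opt}}}{V_C - V_{S_{opt}}} - 2\varepsilon(t) \qquad (34)$$
where we have also applied our initial assumption regarding the initial conditions, that is,
$V_{10} = V_{10}'$ and $V_{20} = V_{20}'$.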
APPENDIX B
The absolute extremes are at the boundaries of the domain for (V10, V20). The initial
assumption V10 > V20 reduces the boundaries to be searched to only two. The first is the one
set by making V10 = Vmax, that is, according to Eq. (18):
$$\varepsilon_{ext}\big|_{V_{10}=V_{max}} = \frac{1}{2}\,(V_{max} - V_{20})\,\frac{V_{max} + V_{20} - V_{S_{opt}}}{V_C - V_{S_{opt}}}\; r^{\frac{r}{1-r}} \qquad (35)$$
from which, calculating the derivative with respect to V20 and bearing in mind our assumption
that ∂r/∂V20 ≈ 0, we have:
$$\frac{d}{dV_{20}}\,\varepsilon_{ext}\big|_{V_{10}=V_{max}} = \frac{1}{2}\,\frac{V_{S_{opt}} - 2V_{20}}{V_C - V_{S_{opt}}}\; r^{\frac{r}{1-r}} \qquad (36)$$
and finally, by making this expression equal to 0, the location of the first absolute extreme
is obtained, (V10, V20) = (Vmax, VSopt/2). The other boundary to search for absolute extremes is
the one set by making V20 = Vmin. It means, according to Eq. (18), that:
$$\varepsilon_{ext}\big|_{V_{20}=V_{min}} = \frac{1}{2}\,(V_{10} - V_{min})\,\frac{V_{10} + V_{min} - V_{S_{opt}}}{V_C - V_{S_{opt}}}\; r^{\frac{r}{1-r}} \qquad (37)$$
We calculate now the derivative with respect to V10, assuming ∂r/∂V10 ≈ 0:
$$\frac{d}{dV_{10}}\,\varepsilon_{ext}\big|_{V_{20}=V_{min}} = \frac{1}{2}\,\frac{2V_{10} - V_{S_{opt}}}{V_C - V_{S_{opt}}}\; r^{\frac{r}{1-r}} \qquad (38)$$
and again, by making this expression equal to 0, the second absolute extreme is located at
point (V10, V20) = (VSopt/2, Vmin). Finally, by substituting these two points in Eq. (18), the
absolute extremes of Eq. (19) are respectively obtained.
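The two extreme points located above can be evaluated numerically for the first row of Table I (VG = 3.3V, VTn = 0.8V, [Vmin, Vmax] = [0, 1.5]V, VSopt = 1.5V). The sketch below rests on two assumptions that are not stated in this appendix: that r = R/RM, so that comparing Eqs. (32) to (34) gives 1 − r = (V10 + V20 − VSopt)/(VC − VSopt), and that VC = 2(VG − VTn), a choice that reproduces the interval of r listed in Table I. The function name is hypothetical.

```python
def eps_ext(v10, v20, v_c, v_sopt):
    """Extreme error of the 2-node network (form of Eq. (18)), in volts.

    Assumes r = R/R_M, i.e. 1 - r = (V10 + V20 - VSopt) / (VC - VSopt).
    """
    one_minus_r = (v10 + v20 - v_sopt) / (v_c - v_sopt)
    if one_minus_r == 0.0:
        return 0.0                 # R_M equals R: no error at this operating point
    r = 1.0 - one_minus_r
    return 0.5 * (v10 - v20) * one_minus_r * r ** (r / one_minus_r)

V_G, V_TN = 3.3, 0.8
V_MIN, V_MAX = 0.0, 1.5
V_SOPT = 1.5
V_C = 2.0 * (V_G - V_TN)           # assumption, see the lead-in paragraph

# Extreme points found in this appendix
eps_max = eps_ext(V_MAX, V_SOPT / 2.0, V_C, V_SOPT)
eps_min = eps_ext(V_SOPT / 2.0, V_MIN, V_C, V_SOPT)

print(f"eps_max = {eps_max * 1e3:.2f} mV, eps_min = {eps_min * 1e3:.2f} mV")
```

Under these assumptions the two values come out at about 33.19mV and −26.74mV, matching the corresponding entries of Table I.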
REFERENCES
1. Poggio T, Voorhees H, Yuille A. A regularized solution to edge detection. J. of Complexity 1988; 4(2):106–
123.
2. Mutch J, Lowe D. Object class recognition and localization using sparse features with limited receptive
ﬁelds. Int. J. of Computer Vision 2008; 80(1):45–57.
3. Jähne BE. Multiresolutional signal representation. In Chapter 4, Handbook of Computer Vision and
Applications. Volume 2: Signal Processing and Pattern Recognition, Academic Press, 1999.
4. Babaud J, Witkin AP, Baudin M, Duda RO. Uniqueness of the Gaussian kernel for scale-space ﬁltering.
IEEE Trans. on Pattern Analysis and Machine Intelligence 1986; 8(1):26–33.
5. Zárándy Á, Rekeczky C. 2D operators on topographic and non-topographic architectures implementation,
eﬃciency analysis, and architecture selection methodology. International Journal of Circuit Theory and
Applications 2010; n/a. doi: 10.1002/cta.681
6. Sotak GE, Boyer KL. The Laplacian-of-Gaussian kernel: a formal analysis and design procedure for fast,
accurate convolution and full-frame output. Computer Vision, Graphics and Image Processing 1989;
48(2):147–189.
7. Mead C. Analog VLSI and Neural Systems. Addison-Wesley, 1989;
8. Raﬀo L, Sabatini SP, Bo GM, Bisio GM. Analog VLSI circuits as physical structures for perception in
early visual tasks. IEEE Transactions on Neural Networks 1998; 9(6):1483–1494.
9. Hui KF, Shi BE. Distortion in analog VLSI networks for image ﬁltering. IEEE Transactions on Circuits
and Systems - I 1999; 46(10):1161–1171.
10. Shi BE. The eﬀect of mismatch in current- versus voltage-mode resistive grids. International Journal of
Circuit Theory and Applications 2009; 37(1):53–65.
11. Kobayashi H, White JL, Abidi AA. An active resistor network for Gaussian ﬁltering of images. IEEE
Journal of Solid-State Circuits 1991; 26(5):738–748.
12. Vittoz EA, Arreguit X. Linear networks based on transistors. Electronic Letters 1993; 29(3):297–299.
13. Andreou AG, Boahen KA. A 590,000 transistor 48,000 pixel, contrast sensitive, edge enhancing, CMOS
imager-silicon retina. Proc. Conf. on Advanced Research in VLSI 1995; 225–240.
14. Lenero-Bardallo JA, Serrano-Gotarredona T, Linares-Barranco B. A mismatch calibrated bipolar spatial
contrast AER retina with adjustable contrast threshold. International Symposium on Circuits and Systems
2009; 1493–1496.
15. Shi BE, Chua LO. Resistive grid image ﬁltering: input/output analysis via the CNN framework. IEEE
Transactions on Circuits and Systems - I 1992; 39(7):531–548.
16. Fernandez-Berni J, Carmona-Galan R. On the implementation of linear diﬀusion in transconductance-
based cellular nonlinear networks. International Journal of Circuit Theory and Applications 2009;
37(4):543–567.
17. Ni Y, Zhu YM, Arian B, Devos F. Yet another analog 2D Gaussian convolver. IEEE Int. Symp. on Circuits
and Systems (ISCAS) 1993; 1:192–195.
18. Ni Y. Smart image sensing in CMOS technology. IEE Proc.-Circuits Devices Syst. 2005; 152(5):547–555.
19. Lindeberg T. Discrete scale-space theory and the scale-space primal sketch. Ph. D. dissertation, Royal
Institute of Technology (Stockholm, Sweden), 1991;
20. Papoulis A, Unnikrishna S. Probability, Random Variables and Stochastic Processes, McGraw Hill, 2002;
21. Fernandez-Berni J, Carmona-Galan R. Robust focal-plane analog processing hardware for dynamic texture
segmentation. Proc. IEEE Int. Workshop on Cellular Nanoscale Networks and their Applications (CNNA)
2010; 453–458.
VG (V)  VTn (V)  [Vmin,Vmax] (V)  max εext (mV)  min εext (mV)  VSopt,num (V)  εmax (mV)  εmin (mV)  VSopt (V)  r            Equiv. resol. (bits)
3.3     0.8      [0,1.5]          30.43          -30.43         1.583          33.19      -26.74     1.5        [0.57,1.43]  4.5
3.3     0.8      [0,0.75]         6.17           -6.17          0.77           6.37       -5.83      0.75       [0.82,1.18]  5.8
3.3     0.8      [0.75,1.5]       9.61           -9.61          2.28           10.10      -8.81      2.25       [0.73,1.27]  5.2
1.8     0.5      [0,1]            30.61          -30.61         1.08           34.26      -24.93     1          [0.38,1.62]  3.9
1.8     0.5      [0,0.5]          5.62           -5.62          0.51           5.82       -5.17      0.5        [0.76,1.24]  5.4
1.8     0.5      [0.5,1]          10.79          -10.79         1.53           11.82      -9.40      1.5        [0.55,1.45]  4.4
Table I. Numerical verification of the approximations realized.
Technology                        0.35µm CMOS 2P4M
Vendor (Process)                  Austria Microsystems (C35OPTO)
Die size (with pads)              7280.8µm × 5780.8µm
Cell size                         34.07µm × 29.13µm
Fill factor                       6.45%
Resolution                        QCIF: 176×144 px
Photodiode type                   n-well/p-substrate
Power supply                      3.3V
Signal range                      [1.5V,2.5V]
FPN                               0.72%
PRNU (50% signal range)           2.42%
Sensitivity                       0.15V/(lux·s)
Power consumption (worst case)    5.6mW@30fps
ADC throughput                    0.11MSa/s (9µs/Sa)
Internal clock freq. range        0.5-150MHz
Table II. Summary of prototype chip.
