A 75kb SRAM in 65nm CMOS for In-Memory Computing Based Neuromorphic
  Image Denoising by Bose, Sumon Kumar et al.
Sumon Kumar Bose
School of Electrical and
Electronic Engineering
Nanyang Technological University
Singapore
bose0003@e.ntu.edu.sg
Vivek Mohan
School of Electrical and
Electronic Engineering
Nanyang Technological University
Singapore
vivekmoh001@e.ntu.edu.sg
Arindam Basu
School of Electrical and
Electronic Engineering
Nanyang Technological University
Singapore
arindam.basu@ntu.edu.sg
Abstract—This paper presents an in-memory computing (IMC)
architecture for image denoising. The proposed SRAM based in-
memory processing framework works in tandem with approxi-
mate computing on a binary image generated from neuromorphic
vision sensors. Implemented in TSMC 65nm process, the pro-
posed architecture enables ≈ 2000X energy savings (≈ 222X
from IMC) compared to a digital implementation when tested
with the video recordings from a DAVIS sensor and achieves a
peak throughput of 1.25–1.66 frames/µs.
Index Terms—In-memory computing, SRAM, neuromorphic
vision sensors, median filter, approximate computing.
I. INTRODUCTION
Bio-inspired neuromorphic vision sensors (NVS) [1] [2] have gained
traction among researchers due to their low bandwidth and energy
requirements, as well as their recent commercial availability [3].
Unlike a traditional frame-based camera, an NVS detects changes of
contrast at a pixel and outputs an event corresponding to the (x, y)
coordinates of that pixel, a format known as Address Event
Representation (AER) [4]. Hence, AER contains less superfluous
information due to its asynchronous event-based encoding and finds
application in robotics [5], environment monitoring, traffic
surveillance [6] and object tracking [7]. However, the raw image data
is corrupted with noise, and its removal is one of the most important
pre-processing tasks for region proposal, object tracking and
classification [8]. While earlier approaches use an event-based
denoising method termed the nearest neighbour filter (NN-filt) [9],
recent hybrid frame-event approaches using median filtering outperform
NN-filt in terms of performance, memory requirement and computation
[10]. However, the traditional Von Neumann architecture is still a
bottleneck in terms of latency and energy dissipation for hardware
implementation of neuromorphic processing [11].
To address this, the in-memory computing (IMC) paradigm has been
proposed, in which processing is performed inside the memory, and it
shows unprecedented performance benefits compared to its Von Neumann
counterpart. IMC not only enables highly parallel processing through
simultaneous access to multiple cells but also eliminates the energy
cost of transferring data between memory and processor [12]. Several
works have shown IMC to be effective. For example, [13] proposed a 6T
SRAM based linear classifier using current summation and achieved 13x
energy savings on the MNIST dataset compared to a digital
implementation. Similarly, [14] implemented a 10T SRAM based
binary-weighted convolutional neural network (CNN) leveraging charge
distribution and attained a 16x energy benefit on the MNIST dataset.
While most IMC efforts target post-processing of the image, in this
paper we use IMC for efficient denoising of the event-based binary
image (EBBI), since this method has been shown to outperform purely
event-based ones [10].
Fig. 1. (a) Conventional median filter using a 3 × 3 kernel and stride = 1. (b)
Proposed NOMF using a 3 × 3 kernel and stride = 3 for image denoising.
The approximation due to NOMF reduces the memory read and write energy, and
its architecture enables in-memory computing.
Approximate computing is another avenue for energy reduction in
applications such as pattern recognition or multimedia processing,
where slight degradation in the calculation does not affect the final
outcome or the output quality remains within an acceptable range.
Approximation can be introduced at the circuit [15] [16], software
[17] or system level [18]. Since a slight change of object boundary
has little impact on region proposal, object tracking or
classification performance, we propose to use approximate computing
while filtering an image frame. The details of the algorithm and VLSI
implementation are presented in the following sections.
II. OVERVIEW: MEDIAN FILTER ALGORITHM
A median filter is a nonlinear filter that replaces the center pixel
of an n × n kernel by the median value of the n² pixels associated
with the kernel. The output of the median filter at location (i, j)
is given by Eq. (1), where i, j ∈ Z+.
$$P_{mf}(i, j) = \operatorname{median}\left(\left\{ P(i+k,\, j+l) \;\middle|\; k, l \in \mathbb{Z},\ k, l \in \left\{ -\tfrac{n-1}{2}, \cdots, \tfrac{n-1}{2} \right\} \right\}\right) \tag{1}$$
arXiv:2003.10300v1 [eess.IV] 23 Mar 2020
Implementation of the median filter for a grayscale image involves
sorting the pixel values. In contrast, median filtering of a binary
image is simple and requires only a counter that adds up the number of
occurrences of “1” in an n × n patch and assigns “1” to the middle
pixel if the number of “1”s is higher than that of “0”s, and “0”
otherwise. The whole operation can be written as
$$P_{mf}(i, j) = \begin{cases} 1, & \text{if } \sum P(i+k,\, j+l) \ge \left\lceil \frac{n^2}{2} \right\rceil \\ 0, & \text{otherwise} \end{cases} \tag{2}$$
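The counting rule of Eq. (2) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation; the function name `binary_median_filter` and the border handling (pixels outside the image are treated as "0") are assumptions made here.

```python
def binary_median_filter(img, n=3):
    """Overlapping (stride 1) binary median filter following Eq. (2).

    img is a list of rows holding 0/1 integers.
    Assumption: pixels outside the image border count as 0.
    """
    h, w = len(img), len(img[0])
    r = (n - 1) // 2
    threshold = -(-(n * n) // 2)  # ceil(n^2 / 2)
    out = [[0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            ones = 0
            # Count the "1"s in the n x n patch centred on (i, j).
            for k in range(-r, r + 1):
                for l in range(-r, r + 1):
                    y, x = i + k, j + l
                    if 0 <= y < h and 0 <= x < w:
                        ones += img[y][x]
            out[i][j] = 1 if ones >= threshold else 0
    return out
```

An isolated "1" in an otherwise empty frame is removed, because its patch contains a single "1", which is below the threshold of 5 for n = 3.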
In a traditional median filter, an n × n kernel slides over the image
in an overlapping fashion with stride s = 1, as shown in Fig. 1(a).
Hence, fetching and summing up the binary image bit by bit, followed
by a comparison in the processing unit and a write operation in the
memory, demands 2n² + 1 clock cycles and the associated energy for
each pixel. However, since adjacent pixels of an image have similar
characteristics, we can apply the decision of an n × n kernel to all
n² pixels instead of only the center one. This is equivalent to having
stride s = n (Fig. 1(b)), resulting in the non-overlap median filter
(NOMF) that we use in this work. While the proposed approach changes
the object boundary slightly (with marginal effect on tracking, as
shown later), it reduces the processing and memory read access time by
a factor of n² and enables the same memory to be used to store the
filtered image. It also enables IMC based denoising, as shown next.
However, the NOMF approach does not reduce the memory write cycles and
energy. Table I captures the resource usage of both approaches for an
image of size W × H.
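A software view of the NOMF described above, given as a hedged sketch: one majority decision is taken per n × n block (stride n) and written to every pixel of that block. The function name `nomf` and the choice to leave rows or columns that do not fill a complete block unchanged are assumptions made for illustration.

```python
def nomf(img, n=3):
    """Non-overlap median filter: one majority decision per n x n block.

    img is a list of rows holding 0/1 integers.
    Assumption: trailing rows/columns that do not fill a whole block
    are copied through unchanged.
    """
    h, w = len(img), len(img[0])
    threshold = -(-(n * n) // 2)  # ceil(n^2 / 2), as in Eq. (2)
    out = [row[:] for row in img]
    for top in range(0, h - h % n, n):
        for left in range(0, w - w % n, n):
            ones = sum(img[top + k][left + l]
                       for k in range(n) for l in range(n))
            value = 1 if ones >= threshold else 0
            # Apply the block decision to all n^2 pixels of the block.
            for k in range(n):
                for l in range(n):
                    out[top + k][left + l] = value
    return out
```

Each input pixel is read exactly once here, which reflects the factor of n² reduction in read accesses relative to the stride-1 filter.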
III. IN-MEMORY DENOISE: HARDWARE IMPLEMENTATION
A. Architecture
Figure 2 shows the architecture of a 320 × 240 SRAM array for image
denoising (QVGA or lower resolution), applicable to NVS such as [19]
[20]. It operates in two modes: (a) normal read and write mode and (b)
filter mode. Unlike in a conventional SRAM write, all the bits of a
byte or a word cannot be written simultaneously, since this memory is
targeted at event-based cameras and events are not contiguous.
Therefore, single-bit write circuitry is implemented for the normal
write mode. In order to reduce the dynamic bit-line power consumption
[21], the whole memory is divided into 22 banks of 15 × 240 cells
each, except the last one. In filter mode, the kernel can be
configured as either a 3 × 3 or a 5 × 5 patch (by enabling n
successive WLs and connecting n−1 consecutive BLs and BLBs separately,
n ∈ {3, 5}). To have almost the same WL signal delay for each cell of
a kernel, 15 columns are selected for each bank. In normal SRAM write
mode, the global (GWL) and local word-line (LWL) blocks enable one of
the word-lines (WL), and the column decoder writes the data and its
complement on the bit-line (BL) and bit-line bar (BLB) respectively.
The rest of the BLs and BLBs are charged to VDD by the half select
(HS) driver to mitigate the read disturb issue of the half-selected
cells in the selected bank (cells that are selected along the row but
not along the column). While writing a memory cell, one of the lines
(either BL or BLB) is driven to 0 V and the other is connected to VDD.
The line connected to 0 V initiates the bit-flip process in an SRAM
cell. For instance, the 6T SRAM cell in the left inset of Fig. 2
stores “0”; in order to write “1” in the cell, BL and BLB are
connected to VDD and 0 V respectively. Once WL is asserted, the
relative strength of PU2 and NA2 decides the bit-flip in the cell. If
NA2 is stronger than PU2, “1” is written in the cell. However, the
write operation can happen even when BL is connected to a potential
lower than VDD. In that case, the strength of the NA2 transistor has
to be increased further. In read mode, BL and BLB are charged to VDD,
and when the WL signal is asserted, one of the lines starts
discharging depending on the value stored in the cell.
Fig. 2. Architecture of a 320 × 240 bitcell array for noise removal from
a binary image. In filter mode for n = 3, three consecutive word-lines
(WL) are enabled together to discharge bit-line (BL) and bit-line bar (BLB)
simultaneously. BLs and BLBs of three successive columns are connected
together separately using transmission gates to implement a 3 × 3 kernel. The
IMC architecture enables highly parallel noise filtering of 320 × 3 cells in
two clock cycles.
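The bank partitioning described above (320 columns split into banks of 15 columns, with a shorter last bank) reduces to simple arithmetic. The sketch below only reproduces that count; the helper names `bank_layout` and `bank_of_column` and the zero-based column indexing are assumptions made for illustration.

```python
import math


def bank_layout(total_columns=320, columns_per_bank=15):
    """Return (number of banks, columns in the last bank)."""
    num_banks = math.ceil(total_columns / columns_per_bank)
    last_bank_columns = total_columns - (num_banks - 1) * columns_per_bank
    return num_banks, last_bank_columns


def bank_of_column(col, columns_per_bank=15):
    """Zero-based bank index of a zero-based column (hypothetical mapping)."""
    return col // columns_per_bank
```

With the default values this yields 22 banks, the last of which holds 5 columns.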
B. Implementation of NOMF
We follow the steps of an SRAM cell read and bit-flip to implement the
NOMF for noise removal in the memory. BLs and BLBs of the 3 × 3 cells
are connected separately using transmission gates, as shown in the
right inset of Fig. 2.
TABLE I
COMPARISON OF DIFFERENT FILTERS FOR AN IMAGE OF SIZE D = W × H

Filter          Input    # memory read   # memory write   # operations   # bits
NN-filt         Events   βt·γ·n²·D       βt·γ·D           γ·n²·D         βt·D
Median Filter   EBBI     n²·D            D                n²·D           2D
NOMF            EBBI     D               D                D              D
NOMF+IMC        EBBI     D/n             α·D              0              D

βt = 16, γ ≈ 15%, D = HW, α ≈ 3.6%, n ∈ {3, 5}.
Throughout the filter operation, the signal S is kept high. The
resistance of the transmission gate, R_tg, is chosen such that the
following criterion is met:
$$R_{tg}\, C_{BL} \ll \frac{C_{BL} \cdot V_{DD}}{i_s} \tag{3}$$
where i_s denotes the discharging current of each SRAM cell and C_BL
is the combination of the metal routing capacitance of BL or BLB and
the diffusion capacitance of the 240 access transistors, NA1 or NA2.
From post-layout simulation after parasitic extraction, C_BL ≈ 140 fF.
The condition in Eq. (3) is maintained so that the discharge profiles
of the three BLs of a 3 × 3 kernel follow each other with minimal
delay; the same applies to the BLB discharge. The proposed IMC
architecture takes two clock cycles to filter the noise from an n × n
patch. In the first cycle, n BLs and BLBs are charged to VDD. In the
next cycle, n successive WLs are asserted, which enables n² (n × n)
cells to discharge the BLs and BLBs simultaneously. Since the BL and
BLB discharge currents differ due to the different number of “0”s and
“1”s in an n × n kernel, one of the lines discharges and reaches 0 V
faster. This configuration of BL and BLB is similar to the write mode,
and it flips the minority pixels in the kernel. If the number of “0”s
is less than the number of “1”s in a kernel, we refer to “0” as the
minority pixel in that patch, and vice versa. In filter mode, we keep
all the bank select signals high to activate highly parallel
processing in the memory, and it filters 320 × 3 cells in one pass. We
repeat this procedure until all the rows are filtered.
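As a short worked step on Eq. (3): since C_BL appears on both sides, the criterion can be restated as a bound on the transmission-gate resistance alone. The name t_dis below is introduced here for illustration and does not appear in the paper.

```latex
R_{tg}\, C_{BL} \ll \frac{C_{BL}\, V_{DD}}{i_s}
\quad\Longleftrightarrow\quad
R_{tg} \ll \frac{V_{DD}}{i_s},
\qquad
t_{dis} = \frac{C_{BL}\, V_{DD}}{i_s}
```

Here t_dis is the time a single cell would need to discharge C_BL from V_DD to 0 V at a constant current i_s. Read this way, the criterion requires the RC time constant of the shorted bit-lines to be much smaller than a single-cell discharge time, which is why the connected lines track each other closely.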
Intuitively, the n × n kernel can be thought of as a circuit where two
latches of different strengths and stored values are connected to BL
and BLB. Their strengths are determined by the number of “0”s and “1”s
stored in the kernel. Whichever latch discharges BL or BLB faster
imposes its stored value on the other.
The voltage difference between BL and BLB at any instant
of time, t, is represented as
$$\Delta V = \left( \frac{\Sigma i_0}{C_{BL}} - \frac{\Sigma i_1}{C_{BLB}} \right) t \tag{4}$$
where Σi_0 and Σi_1 represent the discharging currents of BL and BLB
due to the stored “0”s and “1”s in the kernel, respectively. In the
best-case scenario, all the bits in the kernel are either “0” or “1”
and no bit-flip happens. In contrast, the kernel takes the longest
time to decide and flip the minority pixels when the difference
between the number of “0”s and “1”s is one. However, due to
discharging current and capacitor mismatch, the majority pixels in a
kernel may flip in the worst-case scenario. The unintended bit-flips
due to mismatch shrink the object boundary when the majority pixel is
“1” and insert a new object in the frame in the opposite scenario
(⌊n²/2⌋ “0”s and ⌈n²/2⌉ “1”s). However, the probability of ⌊n²/2⌋
noise pixels appearing inside the faulty kernel is negligible.
Nevertheless, to mitigate the mismatch effects, the width and length
of NA1, NA2, ND1, and ND2 are increased by a factor of 2 from the
minimum values supported by the process, and low-VT devices are used.
We run 200 trials of Monte-Carlo simulation, initializing the kernel
with four “1”s and five “0”s, and do not observe any unintentional
bit-flip in the 3 × 3 kernel (see Fig. 3(c)) at VDD = 1.2 V. Even
though the use of low-VT devices increases the leakage power, we can
shut down the memory once processing is done.
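The effect of current mismatch on the decision in Eq. (4) can be explored with a toy Monte-Carlo model. This is a behavioural sketch only, not the SPICE simulation used in the paper: it assumes C_BL = C_BLB, independent Gaussian cell currents, and hypothetical values for the nominal cell current `i_cell` and relative spread `sigma_rel`.

```python
import random


def unintended_flip_rate(n_ones=4, n_zeros=5, i_cell=45e-6,
                         sigma_rel=0.06, trials=10000, seed=0):
    """Fraction of trials in which the minority side wins the discharge race.

    i_cell and sigma_rel are hypothetical illustration values.
    Assumes C_BL == C_BLB, so the sign of Eq. (4) follows the
    difference of the summed currents.
    """
    rng = random.Random(seed)
    majority_is_one = n_ones > n_zeros
    flips = 0
    for _ in range(trials):
        # Stored "0"s discharge BL; stored "1"s discharge BLB.
        i_bl = sum(rng.gauss(i_cell, sigma_rel * i_cell)
                   for _ in range(n_zeros))
        i_blb = sum(rng.gauss(i_cell, sigma_rel * i_cell)
                    for _ in range(n_ones))
        blb_wins = i_blb > i_bl  # BLB reaching 0 V first writes "1"
        if blb_wins != majority_is_one:
            flips += 1
    return flips / trials
```

With zero spread the majority always wins; increasing `sigma_rel` makes the two current distributions overlap, which is the mechanism behind the unintended flips discussed above.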
C. Performance
The proposed approach has several major advantages: (a) It reduces the
dynamic BL power consumption during the SRAM read operation. BLs and
BLBs need to be charged only once to read n (3 or 5) cells along the
column, compared to n times in the conventional approach. (b) It does
not require any sense amplifier to sense the BL and BLB voltage
difference. The n × n kernel inherently acts as a sense amplifier and
takes the decision. (c) It does not consume any dynamic BL power
during the write operation, since the discharges of BL and BLB are
part of the read operation. (d) Minimal energy is required to flip the
minority pixels (only noise and boundary pixels of an object).
Table I compares the proposed NOMF implemented using IMC with
state-of-the-art denoising techniques. The nearest neighbour filter
(NN-filt) [22] stores and updates the timestamp of an incoming event
using βt (βt = 16) bits per timestamp [10], whereas the other
techniques process an event-based binary image (EBBI) frame. γ
represents the average number of events, as a fraction of D (≈ 15%),
during the frame duration (a single pixel can fire multiple times). As
discussed earlier, IMC reduces the number of memory reads by a factor
of n. α in the memory write column represents the fraction of pixels
that need to be flipped by the filter (only noise and boundary pixels
of the objects). We observed that the average value of α is 0.036 over
15000 image frames. Also, the proposed IMC approach does not require
any addition or comparison.
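To make the expressions in Table I concrete, the sketch below evaluates them for a QVGA frame with the constants quoted in the table footnote. The function name `table1_counts` and the dictionary layout are assumptions made for illustration; the formulas themselves are copied from Table I.

```python
def table1_counts(width=320, height=240, n=3,
                  beta_t=16, gamma=0.15, alpha=0.036):
    """Evaluate the Table I resource expressions for one frame."""
    d = width * height  # D = H * W
    return {
        "NN-filt": {"reads": beta_t * gamma * n * n * d,
                    "writes": beta_t * gamma * d,
                    "ops": gamma * n * n * d,
                    "bits": beta_t * d},
        "Median Filter": {"reads": n * n * d, "writes": d,
                          "ops": n * n * d, "bits": 2 * d},
        "NOMF": {"reads": d, "writes": d, "ops": d, "bits": d},
        "NOMF+IMC": {"reads": d / n, "writes": alpha * d,
                     "ops": 0, "bits": d},
    }
```

For n = 3, the conventional median filter needs 691200 reads per frame, while NOMF+IMC needs 25600 reads and, on average, about 2765 writes.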
IV. RESULTS
The circuit has been designed in 65 nm CMOS; the unit SRAM cell layout
is shown in Fig. 3(d). We initialize one of the 3 × 3 kernels of the
320 × 240 memory array with four “0”s and five “1”s to simulate the
NOMF in SPICE. Fig. 3(a) captures the transient behavior of different
nodes of the kernel. When WL goes low, BL and BLB are charged to VDD.
Initially node A stores “0”, and when WL is made high, BL and BLB
start discharging. Since the number of “1”s is higher than that of
“0”s in the kernel, BLB discharges faster and the minority cells flip
their stored values. A 1000-point Monte-Carlo DC simulation of the BL
and BLB discharging currents in the worst-case scenario at 1 V is
shown in Fig. 3(b). The overlap region in the histogram is responsible
for the unintended bit-flips at 1 V. However, MATLAB simulation using
the dataset described below shows that the probability of four noisy
pixels appearing in a kernel is 0.0006. Fig. 3(c) shows a 200-point
Monte-Carlo simulation of unintended bit-flips across VDD. At 1.2 V,
no unintended bit-flip occurs. However, due to lower overdrive and
mismatches, 2.5% unintended bit-flips occur at 1 V in the worst-case
scenario (overall probability = 0.025 × 0.0006 = 1.5 × 10⁻⁵).
Fig. 3. (a) Bit-flip of an SRAM memory cell in a 3 × 3 kernel. Since the number of “1”s is higher than that of “0”s, BLB discharges faster and the stored
value flips at node A. (b) 1000-point Monte-Carlo DC simulation of BL and BLB discharging current at VDD = 1 V in the 4 “1”s, 5 “0”s scenario (worst case).
(c) 200-point Monte-Carlo simulation: unintended bit-flips due to mismatch across VDD for different numbers of “1”s and “0”s in the kernel. (d) Unit
SRAM cell layout.
[Figure 3 panel annotations: (a) WL, BL, BLB and node A voltages versus time (ns); (b) histogram of current (µA), I_BLB: 186 µA (σ = 11.2 µA); (c) number of unintended bit-flips versus VDD (0.7 V to 1.2 V) for 4 “1”s/5 “0”s, 3 “1”s/6 “0”s and 2 “1”s/7 “0”s; (d) cell dimensions 1.8 µm × 2.03 µm.]
In order to validate the proposed NOMF and compare with
prior work, we use the same dataset as used in [10] for a fair
comparison. The dataset comprises more than 1 hour of traffic
scene recordings with different objects such as cars, buses,
trucks, bikes, humans along with the background noise. More
details are available in [10].
Fig. 4(b)-(c) show the MATLAB simulation of the median filter and the
proposed NOMF using a 3 × 3 kernel on the binary raw image. In terms
of noise removal, both filters show similar performance. We also
evaluate the performance (recall and precision) of an overlap-based
tracker (OT) [10] using both filtered images for different IoU values,
as shown in Fig. 4(d)-(e). IoU = (A_GTB ∩ A_PB)/(A_GTB ∪ A_PB), where
A_GTB and A_PB denote the area of the manually annotated ground truth
and the region proposed by the OT encapsulating an object,
respectively. If the IoU of a proposed region is greater than a
threshold, the region is taken to be a true positive region. Precision
(true positive regions / total proposed regions) and recall (true
positive regions / total ground truth regions) of the OT are
calculated using all the output frames from both filters, and the
performance is comparable, as shown in Fig. 4.
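The IoU, precision and recall definitions above can be sketched as follows for axis-aligned boxes. This is an illustrative sketch, not the evaluation code of [10]: the box format `(x_min, y_min, x_max, y_max)`, the function names, and the rule that a proposal counts as a true positive if it exceeds the threshold against any ground-truth box are assumptions.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def precision_recall(proposals, ground_truths, threshold=0.5):
    """Precision and recall as defined in the text.

    Assumption: a proposal is a true positive if its IoU with any
    ground-truth box exceeds the threshold.
    """
    true_pos = sum(1 for p in proposals
                   if any(iou(p, g) > threshold for g in ground_truths))
    precision = true_pos / len(proposals) if proposals else 0.0
    recall = true_pos / len(ground_truths) if ground_truths else 0.0
    return precision, recall
```

For example, two 2 × 2 boxes offset by one pixel in each direction overlap in a 1 × 1 region and have IoU = 1/7.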
Table II compares the performance of the proposed NOMF implemented
using IMC with a spatiotemporal filter [23] and a fully digital median
filter synthesized in the same process for a fair comparison. The
spatiotemporal filter works on the continuous events from the NVS,
whereas the proposed NOMF and the fully digital implementation process
event-based binary images following [10]. Latency and energy are
estimated at 200 MHz on the post-layout netlist. The synergy between
approximate computing and IMC reduces the execution time to
0.8 µs/frame and enables ≈ 2000X energy saving compared to the digital
counterpart, where the contributions of approximation and IMC are
≈ 9X and ≈ 222X respectively.
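The headline numbers in this paragraph can be cross-checked with a few lines of arithmetic. The sketch assumes a 3 × 3 kernel, two clock cycles per pass of three rows (as described in the NOMF implementation), the 200 MHz clock stated above, and the per-bit energies listed in Table II; the variable and function names are illustrative.

```python
CLOCK_HZ = 200e6       # clock used for the latency and energy estimates
ROWS, COLS = 240, 320  # QVGA array


def frame_time_s(n=3, cycles_per_pass=2, clock_hz=CLOCK_HZ, rows=ROWS):
    """Time to filter one frame: (rows / n) passes of two cycles each."""
    passes = rows // n
    return passes * cycles_per_pass / clock_hz


frame_time = frame_time_s()                              # seconds per frame
latency_per_bit_ns = frame_time / (ROWS * COLS) * 1e9    # ns per bit
energy_saving = 228 / 0.11   # Table II: digital median filter vs NOMF+IMC
combined_factor = 9 * 222    # approximation x IMC contributions
```

Under these assumptions the frame time comes out at 0.8 µs and the per-bit latency at about 0.01 ns, in line with Table II, while both the energy ratio (about 2073) and the product of the two contributions (1998) are consistent with the quoted ≈ 2000X saving.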
Fig. 4. (a) Raw binary frame, (b) output frame of the median filter and (c) of the proposed NOMF using a 3 × 3 kernel. (d) Precision and (e) recall of an overlap-based
tracker (OT) [10] using both filtered images for different IoU values.

TABLE II
COMPARISON WITH DIFFERENT FILTER IMPLEMENTATIONS

Filter                        Process   Area/Cell (µm²)   Latency (ns)/bit   Energy (pJ)/bit
Spatio-temporal Filter [23]   180nm     400               10                 20
Median Filter                 65nm      4.89              95                 228
Proposed NOMF+IMC             65nm      3.65              0.01               0.11

TABLE III
COMPARISON OF DIFFERENT PUBLISHED IMC WORKS

                             This Work   [14]        [24]     [25]
Technology                   65nm        65nm        65nm     55nm
Algorithm                    Filter      CNN         k-NN     CNN
SRAM size                    75kb        16kb        128kb    3.75kb
Throughput (GOPS)            85.3-153    8           10.2     -
Energy Efficiency (TOPS/W)   11.3-20     14.7-40.3   1.94     18.37-72.1

Table III compares the proposed approach with recently published IMC
works and demonstrates an order of magnitude improvement in throughput
due to the highly parallel processing. Assuming n² − 1 operations
(additions) for the calculation of an n × n kernel, the energy
efficiency is comparable with other state-of-the-art works.
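The throughput range in Table III can be reproduced approximately from the operation count stated above. This sketch assumes n² − 1 additions per kernel, two clock cycles per pass of n rows, a 200 MHz clock, and a 320 × 240 frame tiled by (possibly fractional) n × n kernels; the function name `throughput_gops` is illustrative.

```python
def throughput_gops(n, rows=240, cols=320, clock_hz=200e6, cycles_per_pass=2):
    """Equivalent throughput in GOPS for an n x n NOMF kernel."""
    kernels = rows * cols / (n * n)           # kernels per frame
    ops_per_frame = (n * n - 1) * kernels     # n^2 - 1 additions per kernel
    frame_time = (rows / n) * cycles_per_pass / clock_hz
    return ops_per_frame / frame_time / 1e9
```

Under these assumptions the 3 × 3 kernel gives about 85.3 GOPS and the 5 × 5 kernel about 153.6 GOPS, which lines up with the 85.3-153 range in Table III.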
CONCLUSION
In this work, we present an approximate and in-memory
computing framework for binary image denoising. The pro-
posed approach is tested with the binary image frames from
a DAVIS sensor setup and achieves ≈ 2000X energy saving
compared to conventional Von Neumann digital approaches.
The massively parallel architecture reduces the processing
time to 0.6–0.8 µs per frame (1.25–1.66 frames/µs) and leaves enough
time for the subsequent processing stages.
REFERENCES
[1] C. Posch, T. Serrano-Gotarredona, B. Linares-Barranco, and T. Del-
bruck, “Retinomorphic event-based vision sensors: Bioinspired cameras
with spiking output,” Proceedings of the IEEE, vol. 102, no. 10, pp.
1470–1484, 2014.
[2] R. Berner, C. Brandli, M. Yang, S. C. Liu, and T. Delbruck, “A 240x180
10mW 12us latency sparse-output vision sensor for mobile applications,”
IEEE Symposium on VLSI Circuits, Digest of Technical Papers, pp.
C186–C187, 2013.
[3] “Celepixel.” [Online]. Available: https://www.celepixel.com. [Accessed:
10- Oct- 2019].
[4] K. A. Boahen, “A burst-mode word-serial address-event link-I: trans-
mitter design,” IEEE Transactions on Circuits and Systems I: Regular
Papers, vol. 51, no. 7, pp. 1269–1280, July 2004.
[5] T. Delbruck and M. Lang, “Robotic goalie with 3 ms reaction time at
4% CPU load using event-based dynamic vision sensor,” Frontiers in
Neuroscience, vol. 7, no. 7 NOV, pp. 1–7, 2013.
[6] M. Litzenberger, A. N. Belbachir, N. Donath, G. Gritsch, H. Garn,
B. Kohn, C. Posch, and S. Schraml, “Estimation of vehicle speed
based on asynchronous data from a silicon retina optical sensor,” IEEE
Conference on Intelligent Transportation Systems, Proceedings, ITSC,
pp. 653–658, 2006.
[7] P. Lichtsteiner and T. Delbruck, “A 64x64 aer logarithmic temporal
derivative silicon retina,” in Research in Microelectronics and Electron-
ics, 2005 PhD, vol. 2, July 2005, pp. 202–205.
[8] V. Padala, A. Basu, and G. Orchard, “A noise filtering algorithm for
event-based Asynchronous change detection image sensors on TrueNorth
and its implementation on TrueNorth,” Frontiers in Neuroscience,
vol. 12, no. MAR, pp. 1–14, 2018.
[9] D. Czech and G. Orchard, “Evaluating noise filtering for event-based
asynchronous change detection image sensors,” in 2016 6th IEEE
International Conference on Biomedical Robotics and Biomechatronics
(BioRob), June 2016, pp. 19–24.
[10] J. Acharya, A. U. Caycedo, V. R. Padala, R. R. S. Singh, G. Orchard,
B. Ramesh, and A. Basu, “EBBIOT: A low-complexity tracking
algorithm for surveillance in iovt using stationary neuromorphic
vision sensors,” CoRR, vol. abs/1910.01851, 2019. [Online]. Available:
http://arxiv.org/abs/1910.01851
[11] A. Basu, J. Acharya, et al., “Low-power, adaptive neuromorphic
systems: Recent progress and future directions,” IEEE Journal on
Emerging and Selected Topics in Circuits and Systems, vol. 8, no. 1,
pp. 6–27, 2018.
[12] A. Biswas, “Energy-efficient smart embedded memory design for IoT
and AI,” Ph.D. dissertation, Dept. Elect. Eng. Comput. Sci., Mas-
sachusetts Inst. Technol., Cambridge, MA, USA, Jun. 2018.
[13] J. Zhang, Z. Wang, and N. Verma, “In-memory computation of a
machine-learning classifier in a standard 6t sram array,” IEEE Journal
of Solid-State Circuits, vol. 52, no. 4, pp. 915–924, April 2017.
[14] A. Biswas and A. P. Chandrakasan, “Conv-RAM: An energy-efficient
SRAM with embedded convolution computation for low-power CNN-
based machine learning applications,” Digest of Technical Papers - IEEE
International Solid-State Circuits Conference, vol. 61, pp. 488–490,
2018.
[15] S. L. Lu, “Speeding up processing with approximation circuits,” Com-
puter, vol. 37, no. 3, pp. 67–73, 2004.
[16] A. Gupta, S. Mandavalli, et al., “Low power probabilistic floating
point multiplier design,” in 2011 IEEE Computer Society Annual Sym-
posium on VLSI, July 2011, pp. 182–187.
[17] S. K. Bose, B. Kar, M. Roy, P. K. Gopalakrishnan, and A. Basu, “Ade-
pos: Anomaly detection based power saving for predictive maintenance
using edge computing,” in Proceedings of the 24th Asia and South
Pacific Design Automation Conference, ser. ASPDAC ’19. New York,
NY, USA: ACM, 2019, pp. 597–602.
[18] A. Raha and V. Raghunathan, “Approximating beyond the Processor:
Exploring Full-System Energy-Accuracy Tradeoffs in a Smart Camera
System,” IEEE Transactions on Very Large Scale Integration (VLSI)
Systems, vol. 26, no. 12, pp. 2884–2897, 2018.
[19] C. Posch, D. Matolin, and R. Wohlgenannt, “A qvga 143 db dynamic
range frame-free pwm image sensor with lossless pixel-level video com-
pression and time-domain cds,” IEEE Journal of Solid-State Circuits,
vol. 46, no. 1, pp. 259–275, Jan 2011.
[20] C. Brandli, R. Berner, M. Yang, S.-C. Liu, and T. Delbruck, “A 240x180
130 dB 3 µs latency global shutter spatiotemporal vision sensor,” IEEE
Journal of Solid-State Circuits, vol. 49, no. 10, pp. 2333–2341, 2014.
[21] A. P. Chandrakasan and R. W. Brodersen, Low Power Digital CMOS
Design. Norwell, MA, USA: Kluwer Academic Publishers, 1995.
[22] R. C. Gonzalez and R. E. Woods, Digital Image Processing (3rd
Edition). Upper Saddle River, NJ, USA: Prentice-Hall, Inc., 2006.
[23] H. Liu, C. Brandli, C. Li, S. Liu, and T. Delbruck, “Design of a
spatiotemporal correlation filter for event-based sensors,” in 2015 IEEE
International Symposium on Circuits and Systems (ISCAS), May 2015,
pp. 722–725.
[24] M. Kang, S. K. Gonugondla, A. Patil, and N. R. Shanbhag, “A Multi-
Functional In-Memory Inference Processor Using a Standard 6T SRAM
Array,” IEEE Journal of Solid-State Circuits, vol. 53, no. 2, pp. 642–655,
2018.
[25] X. Si, J. Chen, Y. Tu, W. Huang, J. Wang, Y. Chiu, W. Wei, S. Wu,
X. Sun, R. Liu, S. Yu, R. Liu, C. Hsieh, K. Tang, Q. Li, and M. Chang,
“24.5 a twin-8t sram computation-in-memory macro for multiple-bit
cnn-based machine learning,” in 2019 IEEE International Solid- State
Circuits Conference - (ISSCC), Feb 2019, pp. 396–398.
