Nonphotolithographic nanoscale memory density prospects
Technologies are now emerging to construct molecular-scale electronic wires and switches using bottom-up self-assembly. This opens the possibility of constructing nanoscale circuits and memories where active devices are just a few nanometers square and wire pitches may be on the order of ten nanometers. The features can be defined at this scale without using photolithography. The available assembly techniques have relatively high defect rates compared to conventional lithographic integrated circuits and can only produce very regular structures. Nonetheless, with proper memory organization, it is reasonable to expect these technologies to provide memory densities in excess of 10¹¹ b/cm² with modest active power requirements under 0.6 W/Tb/s for random read operations.
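The two headline figures in this abstract can be restated per bit with simple unit conversions. The sketch below is illustrative only: the variable and function names are hypothetical, and the per-bit footprint it derives is an average over the whole array implied by the quoted density, not a device dimension reported by the authors.

```python
# Unit conversions for the density and power figures quoted in the abstract.
# All names here are illustrative; only the two constants come from the text.
BITS_PER_CM2 = 1e11      # quoted lower bound on density, bits per cm^2
WATTS_PER_TBPS = 0.6     # quoted upper bound on active power, W per (Tb/s)


def energy_per_bit_joules(watts_per_tbps: float) -> float:
    """W/(Tb/s) equals joules per terabit; divide by 1e12 bits for J/bit."""
    return watts_per_tbps / 1e12


def area_per_bit_nm2(bits_per_cm2: float) -> float:
    """1 cm = 1e7 nm, so 1 cm^2 = 1e14 nm^2; divide by the bit count."""
    return 1e14 / bits_per_cm2


energy_pj = energy_per_bit_joules(WATTS_PER_TBPS) * 1e12  # picojoules per bit
area_nm2 = area_per_bit_nm2(BITS_PER_CM2)                 # average nm^2 per bit
side_nm = area_nm2 ** 0.5                                 # side of an equivalent square

print(f"{energy_pj:.2f} pJ/bit, {area_nm2:.0f} nm^2/bit, {side_nm:.1f} nm side")
```

Under these conversions, 0.6 W/Tb/s corresponds to at most 0.6 pJ per random read bit, and 10¹¹ b/cm² corresponds to an average footprint of at most about 1000 nm² per bit, including whatever addressing and defect-tolerance overhead the memory organization requires.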
MorphIC: A 65-nm 738k-Synapse/mm² Quad-Core Binary-Weight Digital Neuromorphic Processor with Stochastic Spike-Driven Online Learning
Recent trends in the field of neural network accelerators investigate weight
quantization as a means to increase the resource- and power-efficiency of
hardware devices. As full on-chip weight storage is necessary to avoid the high
energy cost of off-chip memory accesses, memory reduction requirements for
weight storage pushed toward the use of binary weights, which were demonstrated
to have a limited accuracy reduction on many applications when
quantization-aware training techniques are used. In parallel, spiking neural
network (SNN) architectures are explored to further reduce power when
processing sparse event-based data streams, while on-chip spike-based online
learning appears as a key feature for applications constrained in power and
resources during the training phase. However, designing power- and
area-efficient spiking neural networks still requires the development of
specific techniques in order to leverage on-chip online learning on binary
weights without compromising the synapse density. In this work, we demonstrate
MorphIC, a quad-core binary-weight digital neuromorphic processor embedding a
stochastic version of the spike-driven synaptic plasticity (S-SDSP) learning
rule and a hierarchical routing fabric for large-scale chip interconnection.
The MorphIC SNN processor embeds a total of 2k leaky integrate-and-fire (LIF)
neurons and more than two million plastic synapses for an active silicon area
of 2.86 mm² in 65 nm CMOS, achieving a high density of 738k synapses/mm².
MorphIC demonstrates an order-of-magnitude improvement in the area-accuracy
tradeoff on the MNIST classification task compared to previously-proposed SNNs,
while having no penalty in the energy-accuracy tradeoff.
Comment: This document is the paper as accepted for publication in the IEEE
Transactions on Biomedical Circuits and Systems journal (2019); the
fully-edited paper is available at
https://ieeexplore.ieee.org/document/876400
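As a quick consistency check on the MorphIC figures, the quoted synapse density multiplied by the quoted active area (taken here as 2.86 mm²) should reproduce the "more than two million" synapse count. This is a sketch under stated assumptions: the names are hypothetical, and reading "2k" neurons as 2048 is an assumption, not something the abstract states.

```python
# Consistency check on the quoted MorphIC figures (illustrative names only).
DENSITY_SYN_PER_MM2 = 738_000   # quoted: 738k synapses per mm^2
ACTIVE_AREA_MM2 = 2.86          # quoted active silicon area, in mm^2
NEURONS_ASSUMED = 2048          # ASSUMPTION: "2k" LIF neurons read as 2048

total_synapses = DENSITY_SYN_PER_MM2 * ACTIVE_AREA_MM2
synapses_per_neuron = total_synapses / NEURONS_ASSUMED

print(f"total synapses ~ {total_synapses:,.0f}")
print(f"synapses per neuron ~ {synapses_per_neuron:.0f} (under the 2048 assumption)")
```

The product comes to roughly 2.11 million synapses, which agrees with the count given in the abstract.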
Graded, Dynamically Routable Information Processing with Synfire-Gated Synfire Chains
Coherent neural spiking and local field potentials are believed to be
signatures of the binding and transfer of information in the brain. Coherent
activity has now been measured experimentally in many regions of mammalian
cortex. Synfire chains are one of the main theoretical constructs that have
been invoked to describe coherent spiking phenomena. However, for some
time, it has been known that synchronous activity in feedforward networks
asymptotically either approaches an attractor with fixed waveform and
amplitude, or fails to propagate. This has limited their ability to explain
graded neuronal responses. Recently, we have shown that pulse-gated synfire
chains are capable of propagating graded information coded in mean population
current or firing rate amplitudes. In particular, we showed that it is possible
to use one synfire chain to provide gating pulses and a second, pulse-gated
synfire chain to propagate graded information. We called these circuits
synfire-gated synfire chains (SGSCs). Here, we present SGSCs in which graded
information can rapidly cascade through a neural circuit, and show a
correspondence between this type of transfer and a mean-field model in which
gating pulses overlap in time. We show that SGSCs are robust in the presence of
variability in population size, pulse timing and synaptic strength. Finally, we
demonstrate the computational capabilities of SGSC-based information coding by
implementing a self-contained, spike-based, modular neural circuit that is
triggered by streaming input, reads it in, processes it, makes a decision
based on the processed information, and then shuts itself down.
All-optical pulse reshaping and retiming systems incorporating pulse shaping fiber Bragg grating
This paper demonstrates two optical pulse retiming and reshaping systems incorporating superstructured fiber Bragg gratings (SSFBGs) as pulse shaping elements. A rectangular switching window is implemented to avoid conversion of the timing jitter on the original data pulses into pulse amplitude noise at the output of a nonlinear optical switch. In a first configuration, the rectangular pulse generator is used at the (low power) data input to a nonlinear optical loop mirror (NOLM) to perform retiming of an incident noisy data signal using a clean local clock signal to control the switch. In a second configuration, the authors further amplify the data signal and use it to switch a (low power) clean local clock signal. The S-shaped nonlinear characteristic of the NOLM results in this instance in a reduction of both timing and amplitude jitter on the data signal. The underlying technologies required for the implementation of this technique are such that an upgrade of the scheme for the regeneration of ultrahigh bit rate signals at data rates in excess of 320 Gb/s should be achievable.
A sub-mW IoT-endnode for always-on visual monitoring and smart triggering
This work presents a fully-programmable Internet of Things (IoT) visual
sensing node that targets sub-mW power consumption in always-on monitoring
scenarios. The system features a spatial-contrast binary
pixel imager with focal-plane processing. The sensor, when working at its
lowest power mode ( at 10 fps), provides as output the number of
changed pixels. Based on this information, a dedicated camera interface,
implemented on a low-power FPGA, wakes up an ultra-low-power parallel
processing unit to extract context-aware visual information. We evaluate the
smart sensor on three always-on visual triggering application scenarios.
Triggering accuracy comparable to RGB image sensors is achieved at nominal
lighting conditions, while consuming an average power between and
, depending on context activity. The digital sub-system is extremely
flexible, thanks to a fully-programmable digital signal processing engine, but
still achieves 19x lower power consumption compared to MCU-based cameras with
significantly lower on-board computing capabilities.
Comment: 11 pages, 9 figures, submitted to IEEE IoT Journal
YodaNN: An Architecture for Ultra-Low Power Binary-Weight CNN Acceleration
Convolutional neural networks (CNNs) have revolutionized the world of
computer vision over the last few years, pushing image classification beyond
human accuracy. The computational effort of today's CNNs requires power-hungry
parallel processors or GP-GPUs. Recent developments in CNN accelerators for
system-on-chip integration have reduced energy consumption significantly.
Unfortunately, even these highly optimized devices are above the power envelope
imposed by mobile and deeply embedded applications and face hard limitations
caused by CNN weight I/O and storage. This prevents the adoption of CNNs in
future ultra-low power Internet of Things end-nodes for near-sensor analytics.
Recent algorithmic and theoretical advancements enable competitive
classification accuracy even when limiting CNNs to binary (+1/-1) weights
during training. These new findings bring major optimization opportunities in
the arithmetic core by removing the need for expensive multiplications, as well
as reducing I/O bandwidth and storage. In this work, we present an accelerator
optimized for binary-weight CNNs that achieves 1510 GOp/s at 1.2 V on a core
area of only 1.33 MGE (Million Gate Equivalent) or 0.19 mm² and with a power
dissipation of 895 µW in UMC 65 nm technology at 0.6 V. Our accelerator
significantly outperforms the state-of-the-art in terms of energy and area
efficiency, achieving 61.2 TOp/s/W at 0.6 V and 1135 GOp/s/MGE at 1.2 V, respectively.
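The claim that binary (+1/-1) weights remove the need for multiplications can be illustrated with a small software sketch: each weight only selects whether an activation is added to or subtracted from the accumulator. This is not the YodaNN datapath, just a minimal illustration with hypothetical names. The last lines also check that the quoted throughput divided by the quoted core area gives the quoted area efficiency.

```python
# Minimal illustration of a multiplication-free dot product with +1/-1 weights.
# Function and variable names are hypothetical, not taken from the paper.
def binary_weight_dot(activations, weight_signs):
    """Accumulate activations, adding for weight +1 and subtracting for -1."""
    if len(activations) != len(weight_signs):
        raise ValueError("activations and weight_signs must have equal length")
    acc = 0
    for a, s in zip(activations, weight_signs):
        if s not in (1, -1):
            raise ValueError("weights must be +1 or -1")
        acc = acc + a if s == 1 else acc - a
    return acc


acts = [3, -1, 4, 2]
signs = [1, -1, -1, 1]
result = binary_weight_dot(acts, signs)
# Reference computed with ordinary multiplications, for comparison only.
reference = sum(a * s for a, s in zip(acts, signs))

# Area-efficiency check from the quoted figures: 1510 GOp/s on 1.33 MGE.
area_efficiency = 1510 / 1.33   # GOp/s per million gate equivalents

print(result, reference, round(area_efficiency))
```

Both the add/subtract version and the multiply-based reference give the same sum, and the throughput-to-area ratio rounds to the 1135 GOp/s/MGE figure quoted at 1.2 V.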
Data-Width-Driven Power Gating of Integer Arithmetic Circuits
When performing narrow-width computations, power gating of unused arithmetic circuit portions can significantly reduce leakage power. We deploy coarse-grain power gating in 32-bit integer arithmetic circuits that will frequently operate on narrow-width data. Our contributions include a design framework that automatically implements coarse-grain power-gated arithmetic circuits considering a narrow-width input data mode, and an analysis of the impact of circuit architecture on the efficiency of this data-width-driven power gating scheme. As an example, with a performance penalty of 6.7%, coarse-grain power gating of a 45-nm 32-bit multiplier is demonstrated to yield an 11.6x static leakage energy reduction per 8x8-bit operation.