5,261 research outputs found
On Real-Time AER 2-D Convolutions Hardware for Neuromorphic Spike-Based Cortical Processing
In this paper, a chip that performs real-time image
convolutions with programmable kernels of arbitrary shape is presented.
The chip is a first experimental prototype of reduced size
to validate the implemented circuits and system level techniques.
The convolution processing is based on the address–event-representation
(AER) technique, which is a spike-based biologically
inspired image and video representation technique that favors
communication bandwidth for pixels with more information. As
a first test prototype, a pixel array of 16x16 has been implemented
with programmable kernel size of up to 16x16. The
chip has been fabricated in a standard 0.35- m complimentary
metal–oxide–semiconductor (CMOS) process. The technique also
allows to process larger size images by assembling 2-D arrays of
such chips. Pixel operation exploits low-power mixed analog–digital
circuit techniques. Because of the low currents involved (down
to nanoamperes or even picoamperes), an important amount of
pixel area is devoted to mismatch calibration. The rest of the
chip uses digital circuit techniques, both synchronous and asynchronous.
The fabricated chip has been thoroughly tested, both at
the pixel level and at the system level. Specific computer interfaces
have been developed for generating AER streams from conventional
computers and feeding them as inputs to the convolution
chip, and for grabbing AER streams coming out of the convolution
chip and storing and analyzing them on computers. Extensive
experimental results are provided. At the end of this paper, we
provide discussions and results on scaling up the approach for
larger pixel arrays and multilayer cortical AER systems.Commission of the European Communities IST-2001-34124 (CAVIAR)Commission of the European Communities 216777 (NABAB)Ministerio de Educación y Ciencia TIC-2000-0406-P4Ministerio de Educación y Ciencia TIC-2003-08164-C03-01Ministerio de Educación y Ciencia TEC2006-11730-C03-01Junta de Andalucía TIC-141
Spike Events Processing for Vision Systems
In this paper we briefly summarize the fundamental
properties of spike events processing applied to artificial
vision systems. This sensing and processing technology
is capable of very high speed throughput, because it
does not rely on sensing and processing sequences of
frames, and because it allows for complex hierarchically
structured cortical-like layers for sophisticated
processing. The paper includes a few examples that have
demonstrated the potential of this technology for highspeed
vision processing, such as a multilayer event
processing network of 5 sequential cortical-like layers,
and a recognition system capable of discriminating
propellers of different shape rotating at 5000 revolutions
per second (300000 revolutions per minute)
FPGA Implementations Comparison of Neuro-cortical Inspired Convolution Processors for Spiking Systems
Image convolution operations in digital computer systems are usually
very expensive operations in terms of resource consumption (processor
resources and processing time) for an efficient Real-Time application. In these
scenarios the visual information is divided in frames and each one has to be
completely processed before the next frame arrives. Recently a new method for
computing convolutions based on the neuro-inspired philosophy of spiking
systems (Address-Event-Representation systems, AER) is achieving high
performances. In this paper we present two FPGA implementations of AERbased
convolution processors that are able to work with 64x64 images and
programmable kernels of up to 11x11 elements. The main difference is the use
of RAM for integrators in one solution and the absence of integrators in the
second solution that is based on mapping operations. The maximum equivalent
operation rate is 163.51 MOPS for 11x11 kernels, in a Xilinx Spartan 3 400
FPGA with a 50MHz clock. Formulations, hardware architecture, operation
examples and performance comparison with frame-based convolution
processors are presented and discussed.Ministerio de Ciencia e Innovación TEC2006-11730-C03-02Junta de Andalucía P06-TIC-0141
On the AER Convolution Processors for FPGA
Image convolution operations in digital computer
systems are usually very expensive operations in terms of
resource consumption (processor resources and processing time)
for an efficient Real-Time application. In these scenarios the
visual information is divided into frames and each one has to be
completely processed before the next frame arrives in order to
warranty the real-time. A spike-based philosophy for computing
convolutions based on the neuro-inspired Address-Event-
Representation (AER) is achieving high performances. In this
paper we present two FPGA implementations of AER-based
convolution processors for relatively small Xilinx FPGAs
(Spartan-II 200 and Spartan-3 400), which process 64x64 images
with 11x11 convolution kernels. The maximum equivalent
operation rate that can be reached is 163.51 MOPS for 11x11
kernels, in a Xilinx Spartan 3 400 FPGA with a 50MHz clock.
Formulations, hardware architecture, operation examples and
performance comparison with frame-based convolution
processors are presented and discussed.Ministerio de Ciencia e Innovación TEC2006-11730-C03-02Ministerio de Ciencia e Innovación TEC2009-10639-C04-02Junta de Andalucía P06-TIC-0141
Eyeriss v2: A Flexible Accelerator for Emerging Deep Neural Networks on Mobile Devices
A recent trend in DNN development is to extend the reach of deep learning
applications to platforms that are more resource and energy constrained, e.g.,
mobile devices. These endeavors aim to reduce the DNN model size and improve
the hardware processing efficiency, and have resulted in DNNs that are much
more compact in their structures and/or have high data sparsity. These compact
or sparse models are different from the traditional large ones in that there is
much more variation in their layer shapes and sizes, and often require
specialized hardware to exploit sparsity for performance improvement. Thus,
many DNN accelerators designed for large DNNs do not perform well on these
models. In this work, we present Eyeriss v2, a DNN accelerator architecture
designed for running compact and sparse DNNs. To deal with the widely varying
layer shapes and sizes, it introduces a highly flexible on-chip network, called
hierarchical mesh, that can adapt to the different amounts of data reuse and
bandwidth requirements of different data types, which improves the utilization
of the computation resources. Furthermore, Eyeriss v2 can process sparse data
directly in the compressed domain for both weights and activations, and
therefore is able to improve both processing speed and energy efficiency with
sparse models. Overall, with sparse MobileNet, Eyeriss v2 in a 65nm CMOS
process achieves a throughput of 1470.6 inferences/sec and 2560.3 inferences/J
at a batch size of 1, which is 12.6x faster and 2.5x more energy efficient than
the original Eyeriss running MobileNet. We also present an analysis methodology
called Eyexam that provides a systematic way of understanding the performance
limits for DNN processors as a function of specific characteristics of the DNN
model and accelerator design; it applies these characteristics as sequential
steps to increasingly tighten the bound on the performance limits.Comment: accepted for publication in IEEE Journal on Emerging and Selected
Topics in Circuits and Systems. This extended version on arXiv also includes
Eyexam in the appendi
An AER Spike-Processing Filter Simulator and Automatic VHDL Generator Based on Cellular Automata
Spike-based systems are neuro-inspired circuits implementations
traditionally used for sensory systems or sensor signal processing. Address-Event-
Representation (AER) is a neuromorphic communication protocol for transferring
asynchronous events between VLSI spike-based chips. These neuro-inspired
implementations allow developing complex, multilayer, multichip neuromorphic
systems and have been used to design sensor chips, such as retinas and cochlea,
processing chips, e.g. filters, and learning chips. Furthermore, Cellular Automata
(CA) is a bio-inspired processing model for problem solving. This approach
divides the processing synchronous cells which change their states at the same time
in order to get the solution. This paper presents a software simulator able to gather
several spike-based elements into the same workspace in order to test a CA
architecture based on AER before a hardware implementation. Furthermore this
simulator produces VHDL for testing the AER-CA into the FPGA of the USBAER
AER-tool.Ministerio de Ciencia e Innovación TEC2009-10639-C04-0
Event-Driven Contrastive Divergence for Spiking Neuromorphic Systems
Restricted Boltzmann Machines (RBMs) and Deep Belief Networks have been
demonstrated to perform efficiently in a variety of applications, such as
dimensionality reduction, feature learning, and classification. Their
implementation on neuromorphic hardware platforms emulating large-scale
networks of spiking neurons can have significant advantages from the
perspectives of scalability, power dissipation and real-time interfacing with
the environment. However the traditional RBM architecture and the commonly used
training algorithm known as Contrastive Divergence (CD) are based on discrete
updates and exact arithmetics which do not directly map onto a dynamical neural
substrate. Here, we present an event-driven variation of CD to train a RBM
constructed with Integrate & Fire (I&F) neurons, that is constrained by the
limitations of existing and near future neuromorphic hardware platforms. Our
strategy is based on neural sampling, which allows us to synthesize a spiking
neural network that samples from a target Boltzmann distribution. The recurrent
activity of the network replaces the discrete steps of the CD algorithm, while
Spike Time Dependent Plasticity (STDP) carries out the weight updates in an
online, asynchronous fashion. We demonstrate our approach by training an RBM
composed of leaky I&F neurons with STDP synapses to learn a generative model of
the MNIST hand-written digit dataset, and by testing it in recognition,
generation and cue integration tasks. Our results contribute to a machine
learning-driven approach for synthesizing networks of spiking neurons capable
of carrying out practical, high-level functionality.Comment: (Under review
A Novel Optical/digital Processing System for Pattern Recognition
This paper describes two processing algorithms that can be implemented optically: the Radon transform and angular correlation. These two algorithms can be combined in one optical processor to extract all the basic geometric and amplitude features from objects embedded in video imagery. We show that the internal amplitude structure of objects is recovered by the Radon transform, which is a well-known result, but, in addition, we show simulation results that calculate angular correlation, a simple but unique algorithm that extracts object boundaries from suitably threshold images from which length, width, area, aspect ratio, and orientation can be derived. In addition to circumventing scale and rotation distortions, these simulations indicate that the features derived from the angular correlation algorithm are relatively insensitive to tracking shifts and image noise. Some optical architecture concepts, including one based on micro-optical lenslet arrays, have been developed to implement these algorithms. Simulation test and evaluation using simple synthetic object data will be described, including results of a study that uses object boundaries (derivable from angular correlation) to classify simple objects using a neural network
- …