3,417 research outputs found
A programmable VLSI filter architecture for application in real-time vision processing systems
An architecture is proposed for the realization of real-time edge-extraction filtering operation in an Address-Event-Representation (AER) vision system. Furthermore, the approach is valid for any 2D filtering operation as long as the convolutional kernel F(p,q) is decomposable into an x-axis and a y-axis component, i.e. F(p,q)=H(p)V(q), for some rotated coordinate system [p,q]. If it is possible to find a coordinate system [p,q], rotated with respect to the absolute coordinate system a certain angle, for which the above decomposition is possible, then the proposed architecture is able to perform the filtering operation for any angle we would like the kernel to be rotated. This is achieved by taking advantage of the AER and manipulating the addresses in real time. The proposed architecture, however, requires one approximation: the product operation between the horizontal component H(p) and vertical component V(q) should be able to be approximated by a signed minimum operation without significant performance degradation. It is shown that for edge-extraction applications this filter does not produce performance degradation. The proposed architecture is intended to be used in a complete vision system known as the Boundary-Contour-System and Feature-Contour-System Vision Model, proposed by Grossberg and collaborators. The present paper proposes the architecture, provides a circuit implementation using MOS transistors operated in weak inversion, and shows behavioral simulation results at the system level operation and electrical simulation and experimental results at the circuit level operation of some critical subcircuits
High throughput spatial convolution filters on FPGAs
Digital signal processing (DSP) on field- programmable gate arrays (FPGAs) has long been appealing because of the inherent parallelism in these computations that can be easily exploited to accelerate such algorithms. FPGAs have evolved significantly to further enhance the mapping of these algorithms, included additional hard blocks, such as the DSP blocks found in modern FPGAs. Although these DSP blocks can offer more efficient mapping of DSP computations, they are primarily designed for 1-D filter structures. We present a study on spatial convolutional filter implementations on FPGAs, optimizing around the structure of the DSP blocks to offer high throughput while maintaining the coefficient flexibility that other published architectures usually sacrifice. We show that it is possible to implement large filters for large 4K resolution image frames at frame rates of 30–60 FPS, while maintaining functional flexibility
The Chameleon Architecture for Streaming DSP Applications
We focus on architectures for streaming DSP applications such as wireless baseband processing and image processing. We aim at a single generic architecture that is capable of dealing with different DSP applications. This architecture has to be energy efficient and fault tolerant. We introduce a heterogeneous tiled architecture and present the details of a domain-specific reconfigurable tile processor called Montium. This reconfigurable processor has a small footprint (1.8 mm in a 130 nm process), is power efficient and exploits the locality of reference principle. Reconfiguring the device is very fast, for example, loading the coefficients for a 200 tap FIR filter is done within 80 clock cycles. The tiles on the tiled architecture are connected to a Network-on-Chip (NoC) via a network interface (NI). Two NoCs have been developed: a packet-switched and a circuit-switched version. Both provide two types of services: guaranteed throughput (GT) and best effort (BE). For both NoCs estimates of power consumption are presented. The NI synchronizes data transfers, configures and starts/stops the tile processor. For dynamically mapping applications onto the tiled architecture, we introduce a run-time mapping tool
An AER Spike-Processing Filter Simulator and Automatic VHDL Generator Based on Cellular Automata
Spike-based systems are neuro-inspired circuits implementations
traditionally used for sensory systems or sensor signal processing. Address-Event-
Representation (AER) is a neuromorphic communication protocol for transferring
asynchronous events between VLSI spike-based chips. These neuro-inspired
implementations allow developing complex, multilayer, multichip neuromorphic
systems and have been used to design sensor chips, such as retinas and cochlea,
processing chips, e.g. filters, and learning chips. Furthermore, Cellular Automata
(CA) is a bio-inspired processing model for problem solving. This approach
divides the processing synchronous cells which change their states at the same time
in order to get the solution. This paper presents a software simulator able to gather
several spike-based elements into the same workspace in order to test a CA
architecture based on AER before a hardware implementation. Furthermore this
simulator produces VHDL for testing the AER-CA into the FPGA of the USBAER
AER-tool.Ministerio de Ciencia e Innovación TEC2009-10639-C04-0
Memory and information processing in neuromorphic systems
A striking difference between brain-inspired neuromorphic processors and
current von Neumann processors architectures is the way in which memory and
processing is organized. As Information and Communication Technologies continue
to address the need for increased computational power through the increase of
cores within a digital processor, neuromorphic engineers and scientists can
complement this need by building processor architectures where memory is
distributed with the processing. In this paper we present a survey of
brain-inspired processor architectures that support models of cortical networks
and deep neural networks. These architectures range from serial clocked
implementations of multi-neuron systems to massively parallel asynchronous ones
and from purely digital systems to mixed analog/digital systems which implement
more biological-like models of neurons and synapses together with a suite of
adaptation and learning mechanisms analogous to the ones found in biological
nervous systems. We describe the advantages of the different approaches being
pursued and present the challenges that need to be addressed for building
artificial neural processing systems that can display the richness of behaviors
seen in biological systems.Comment: Submitted to Proceedings of IEEE, review of recently proposed
neuromorphic computing platforms and system
Spike-based VITE control with Dynamic Vision Sensor applied to an Arm Robot.
Spike-based motor control is very important in the
field of robotics and also for the neuromorphic engineering
community to bridge the gap between sensing / processing
devices and motor control without losing the spike philosophy
that enhances speed response and reduces power consumption.
This paper shows an accurate neuro-inspired spike-based system
composed of a DVS retina, a visual processing system that detects
and tracks objects, and a SVITE motor control, where everything
follows the spike-based philosophy. The control system is a spike
version of the neuroinspired open loop VITE control algorithm
implemented in a couple of FPGA boards: the first one runs the
algorithm and the second one drives the motors with spikes. The
robotic platform is a low cost arm with four degrees of freedom.Ministerio de Ciencia e Innovación TEC2009-10639-C04-02/01Ministerio de Economía y Competitividad TEC2012-37868-C04-02/0
An IoT Endpoint System-on-Chip for Secure and Energy-Efficient Near-Sensor Analytics
Near-sensor data analytics is a promising direction for IoT endpoints, as it
minimizes energy spent on communication and reduces network load - but it also
poses security concerns, as valuable data is stored or sent over the network at
various stages of the analytics pipeline. Using encryption to protect sensitive
data at the boundary of the on-chip analytics engine is a way to address data
security issues. To cope with the combined workload of analytics and encryption
in a tight power envelope, we propose Fulmine, a System-on-Chip based on a
tightly-coupled multi-core cluster augmented with specialized blocks for
compute-intensive data processing and encryption functions, supporting software
programmability for regular computing tasks. The Fulmine SoC, fabricated in
65nm technology, consumes less than 20mW on average at 0.8V achieving an
efficiency of up to 70pJ/B in encryption, 50pJ/px in convolution, or up to
25MIPS/mW in software. As a strong argument for real-life flexible application
of our platform, we show experimental results for three secure analytics use
cases: secure autonomous aerial surveillance with a state-of-the-art deep CNN
consuming 3.16pJ per equivalent RISC op; local CNN-based face detection with
secured remote recognition in 5.74pJ/op; and seizure detection with encrypted
data collection from EEG within 12.7pJ/op.Comment: 15 pages, 12 figures, accepted for publication to the IEEE
Transactions on Circuits and Systems - I: Regular Paper
- …