281 research outputs found
Real-Time Dense Stereo Matching With ELAS on FPGA Accelerated Embedded Devices
For many applications in low-power real-time robotics, stereo cameras are the
sensors of choice for depth perception as they are typically cheaper and more
versatile than their active counterparts. Their biggest drawback, however, is
that they do not directly sense depth maps; instead, these must be estimated
through data-intensive processes. Therefore, appropriate algorithm selection
plays an important role in achieving the desired performance characteristics.
Motivated by applications in space and mobile robotics, we implement and
evaluate a FPGA-accelerated adaptation of the ELAS algorithm. Despite offering
one of the best trade-offs between efficiency and accuracy, ELAS has only been
shown to run at 1.5-3 fps on a high-end CPU. Our system preserves all
intriguing properties of the original algorithm, such as the slanted plane
priors, but can achieve a frame rate of 47fps whilst consuming under 4W of
power. Unlike previous FPGA based designs, we take advantage of both components
on the CPU/FPGA System-on-Chip to showcase the strategy necessary to accelerate
more complex and computationally diverse algorithms for such low power,
real-time systems.Comment: 8 pages, 7 figures, 2 table
A Binaural Neuromorphic Auditory Sensor for FPGA: A Spike Signal Processing Approach
This paper presents a new architecture, design
flow, and field-programmable gate array (FPGA) implementation
analysis of a neuromorphic binaural auditory sensor, designed
completely in the spike domain. Unlike digital cochleae that
decompose audio signals using classical digital signal processing
techniques, the model presented in this paper processes information
directly encoded as spikes using pulse frequency modulation
and provides a set of frequency-decomposed audio information
using an address-event representation interface. In this case,
a systematic approach to design led to a generic process for
building, tuning, and implementing audio frequency decomposers
with different features, facilitating synthesis with custom features.
This allows researchers to implement their own parameterized
neuromorphic auditory systems in a low-cost FPGA in order to
study the audio processing and learning activity that takes place
in the brain. In this paper, we present a 64-channel binaural
neuromorphic auditory system implemented in a Virtex-5 FPGA
using a commercial development board. The system was excited
with a diverse set of audio signals in order to analyze its response
and characterize its features. The neuromorphic auditory system
response times and frequencies are reported. The experimental
results of the proposed system implementation with 64-channel
stereo are: a frequency range between 9.6 Hz and 14.6 kHz
(adjustable), a maximum output event rate of 2.19 Mevents/s,
a power consumption of 29.7 mW, the slices requirements
of 11 141, and a system clock frequency of 27 MHz.Ministerio de Economía y Competitividad TEC2012-37868-C04-02Junta de Andalucía P12-TIC-130
An Architecture for High-throughput and Improved-quality Stereo Vision Processor
This paper presents the VLSI architecture to achieve high-throughput and
improved-quality stereo vision for real applications. The stereo vision processor
generates gray-scale output images with depth information from input images taken by
two CMOS Image Sensors (CIS). The depth estimator using the sum of absolute
differences (SAD) algorithm as stereo matching technique is implemented on hardware
by exploiting pipelining and parallelism. To produce depth maps with improved-quality
at real-time, pre- and post-processing units are adopted, and to enhance the adaptability
of the system to real environments, special function registers (SFRs) are assigned to
vision parameters. The design using 0.18um standard CMOS technology can operate at
120MHz clock, achieving over 140 frames/sec depth maps with 320 by 240 image size
and 64 disparity levels. Experimental results based on images taken in real world and
the Middlebury data set will be presented. Comparison data with existing hardware
systems and hardware specifications of the proposed processor will be given
Efficient Smart CMOS Camera Based on FPGAs Oriented to Embedded Image Processing
This article describes an image processing system based on an intelligent ad-hoc camera, whose two principle elements are a high speed 1.2 megapixel Complementary Metal Oxide Semiconductor (CMOS) sensor and a Field Programmable Gate Array (FPGA). The latter is used to control the various sensor parameter configurations and, where desired, to receive and process the images captured by the CMOS sensor. The flexibility and versatility offered by the new FPGA families makes it possible to incorporate microprocessors into these reconfigurable devices, and these are normally used for highly sequential tasks unsuitable for parallelization in hardware. For the present study, we used a Xilinx XC4VFX12 FPGA, which contains an internal Power PC (PPC) microprocessor. In turn, this contains a standalone system which manages the FPGA image processing hardware and endows the system with multiple software options for processing the images captured by the CMOS sensor. The system also incorporates an Ethernet channel for sending processed and unprocessed images from the FPGA to a remote node. Consequently, it is possible to visualize and configure system operation and captured and/or processed images remotely
Mapping a guided image filter on the HARP reconfigurable architecture using OpenCL
Intel recently introduced the Heterogeneous Architecture Research Platform, HARP. In this platform, the Central Processing Unit and a Field-Programmable Gate Array are connected through a high-bandwidth, low-latency interconnect and both share DRAM memory. For this platform, Open Computing Language (OpenCL), a High-Level Synthesis (HLS) language, is made available. By making use of HLS, a faster design cycle can be achieved compared to programming in a traditional hardware description language. This, however, comes at the cost of having less control over the hardware implementation. We will investigate how OpenCL can be applied to implement a real-time guided image filter on the HARP platform. In the first phase, the performance-critical parameters of the OpenCL programming model are defined using several specialized benchmarks. In a second phase, the guided image filter algorithm is implemented using the insights gained in the first phase. Both a floating-point and a fixed-point implementation were developed for this algorithm, based on a sliding window implementation. This resulted in a maximum floating-point performance of 135 GFLOPS, a maximum fixed-point performance of 430 GOPS and a throughput of HD color images at 74 frames per second
- …