Velocity Estimation for Autonomous Underwater Vehicles using Vision-Based Systems
This dissertation presents a study of a system architecture capable of computing, in real time, the linear and angular velocity of an autonomous underwater vehicle (AUV), suitable for use in the control loop of an AUV. The velocity is estimated using computer vision algorithms, optical flow and block matching, taking into account the movement characteristics of autonomous underwater vehicles, i.e. maximum velocity and acceleration, which give these systems slow dynamics. Since these computer vision techniques are computationally intensive and are not compatible with real-time operation when implemented on microcomputers, this problem is addressed through a study of a possible implementation of these techniques on a field-programmable gate array (FPGA) and on microcomputers. The optical flow algorithms studied were Horn-Schunck and Lucas-Kanade, together with their different variations and optimizations, as well as simpler algorithms such as block matching.
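The dissertation text is not reproduced here; as a rough software illustration of the block-matching approach it studies (exhaustive sum-of-absolute-differences search; the block and search-window sizes below are illustrative, not taken from the work):

```python
import numpy as np

def block_match(prev, curr, block=16, search=4):
    """Estimate per-block displacement between two grayscale frames by
    exhaustive block matching: for each block in `prev`, find the shift
    (dx, dy) within +/-`search` that minimizes the SAD against `curr`."""
    h, w = prev.shape
    vectors = []
    for y in range(search, h - block - search, block):
        for x in range(search, w - block - search, block):
            ref = prev[y:y + block, x:x + block].astype(np.int32)
            best_sad, best_dxy = None, (0, 0)
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    cand = curr[y + dy:y + dy + block,
                                x + dx:x + dx + block].astype(np.int32)
                    sad = np.abs(ref - cand).sum()
                    if best_sad is None or sad < best_sad:
                        best_sad, best_dxy = sad, (dx, dy)
            vectors.append(best_dxy)
    return np.array(vectors)

# Velocity would then follow from the mean displacement (pixels/frame)
# multiplied by the frame rate and the metres-per-pixel scale factor.
```

The exhaustive inner search is what makes the technique attractive for FPGAs: every candidate SAD is independent and can be evaluated in parallel.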
A novel system architecture for real-time low-level vision
A novel system architecture that exploits the spatial locality in memory access that is found in most low-level vision algorithms is presented. A real-time feature selection system is used to exemplify the underlying ideas, and an implementation based on commercially available Field Programmable Gate Arrays (FPGAs) and synchronous SRAM memory devices is proposed. The peak memory access rate of a system based on this architecture is estimated at 2.88 GB/s, which represents a four- to five-fold improvement with respect to existing reconfigurable computers.
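The abstract quotes a 2.88 GB/s peak rate without listing the contributing parameters; such a figure decomposes as banks × bus width × clock. The values below are one combination that reproduces the number and are assumptions, not figures from the paper:

```python
# Hypothetical decomposition of the quoted 2.88 GB/s peak memory rate.
banks = 8              # independent synchronous SRAM banks (assumed)
width_bytes = 4        # 32-bit data bus per bank (assumed)
clock_hz = 90_000_000  # 90 MHz memory clock (assumed)

peak_bytes_per_s = banks * width_bytes * clock_hz
print(peak_bytes_per_s)  # 2880000000, i.e. 2.88 GB/s
```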
Accelerated hardware video object segmentation: From foreground detection to connected components labelling
This is the preprint version of the article - Copyright @ 2010 Elsevier.
This paper demonstrates the use of a single-chip FPGA for the segmentation of moving objects in a video sequence. The system maintains highly accurate background models, and integrates the detection of foreground pixels with the labelling of objects using a connected components algorithm. The background models are based on 24-bit RGB values and 8-bit gray scale intensity values. A multimodal background differencing algorithm is presented, using a single FPGA chip and four blocks of RAM. The real-time connected component labelling algorithm, also designed for FPGA implementation, run-length encodes the output of the background subtraction, and performs connected component analysis on this representation. The run-length encoding, together with other parts of the algorithm, is performed in parallel; sequential operations are minimized, as the number of run-lengths is typically smaller than the number of pixels. The two algorithms are pipelined together for maximum efficiency.
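The paper's FPGA pipeline is not reproduced here; a minimal software sketch of the same idea (run-length encode each row of the foreground mask, then merge runs that overlap between adjacent rows using union-find; 4-connectivity is assumed, as the abstract does not state which connectivity is used):

```python
def rle_rows(mask):
    """Run-length encode each row of a binary mask as (row, start, end)."""
    runs = []
    for r, row in enumerate(mask):
        c, n = 0, len(row)
        while c < n:
            if row[c]:
                s = c
                while c < n and row[c]:
                    c += 1
                runs.append((r, s, c - 1))
            else:
                c += 1
    return runs

def label_runs(runs):
    """Connected component labelling directly on run-length encoded rows.
    Two runs belong to one component if they sit on adjacent rows and
    their column intervals overlap (4-connectivity)."""
    parent = list(range(len(runs)))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i
    def union(a, b):
        parent[find(a)] = find(b)
    for i, (r, s, e) in enumerate(runs):
        for j, (r2, s2, e2) in enumerate(runs[:i]):
            if r2 == r - 1 and s2 <= e and s <= e2:
                union(i, j)
    labels, out = {}, []
    for i in range(len(runs)):
        out.append(labels.setdefault(find(i), len(labels)))
    return out
```

Working on runs rather than pixels is the point the abstract makes: the merge loop touches one entry per run, not one per pixel, so the sequential part of the work shrinks with the run count.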
Toolflows for Mapping Convolutional Neural Networks on FPGAs: A Survey and Future Directions
In the past decade, Convolutional Neural Networks (CNNs) have demonstrated state-of-the-art performance in various Artificial Intelligence tasks. To accelerate the experimentation and development of CNNs, several software frameworks have been released, primarily targeting power-hungry CPUs and GPUs. In this context, reconfigurable hardware in the form of FPGAs constitutes a potential alternative platform that can be integrated into the existing deep learning ecosystem to provide a tunable balance between performance, power consumption and programmability. In this paper, a survey of the existing CNN-to-FPGA toolflows is presented, comprising a comparative study of their key characteristics, which include the supported applications, architectural choices, design space exploration methods and achieved performance. Moreover, major challenges and objectives introduced by the latest trends in CNN algorithmic research are identified and presented. Finally, a uniform evaluation methodology is proposed, aiming at the comprehensive, complete and in-depth evaluation of CNN-to-FPGA toolflows.
Comment: Accepted for publication at the ACM Computing Surveys (CSUR) journal, 201
Neuromorphic LIF Row-by-Row Multiconvolution Processor for FPGA
Deep Learning algorithms have become state-of-the-art methods in multiple fields, including computer vision, speech recognition, natural language processing, and audio recognition, among others. In image vision, convolutional neural networks (CNNs) stand out. This kind of network is expensive in terms of computational resources due to the large number of operations required to process a frame. In recent years, several frame-based chip solutions for deploying CNNs in real time have been developed. Despite the good power and accuracy results of these solutions, the number of operations remains high due to the complexity of current network models. However, it is possible to reduce the number of operations using computer vision techniques other than frame-based ones, e.g., neuromorphic event-based techniques. There exist several neuromorphic vision sensors whose pixels detect changes in luminosity. Inspired by the leaky integrate-and-fire (LIF) neuron, we propose in this manuscript an event-based field-programmable gate array (FPGA) multiconvolution system. Its main novelty is the combination of a memory arbiter for efficient memory access that allows row-by-row kernel processing. This system is able to convolve 64 filters across multiple kernel sizes, from 1 × 1 to 7 × 7, with latencies of 1.3 μs and 9.01 μs, respectively, generating a continuous flow of output events. The proposed architecture readily accommodates spike-based CNNs.
Ministerio de Economía y Competitividad TEC2016-77785-
A one-transistor-synapse strategy for electrically-programmable massively-parallel analog array processors
This paper presents a linear, four-quadrant, electrically-programmable, one-transistor synapse strategy applicable to the implementation of general massively-parallel analog processors in CMOS technology. It is especially suited for translationally-invariant processing arrays with local connectivity, and results in a significant reduction in the area occupation and power dissipation of the basic processing units. This allows higher integration densities and therefore permits the integration of larger arrays on a single chip.
Comisión Interministerial de Ciencia y Tecnología TIC96-1392-C02-0