9,913 research outputs found

    Velocity Estimation for Autonomous Underwater Vehicles using Vision-Based Systems

    This dissertation presents a study of a system architecture capable of computing, in real time, the linear and angular velocity of an autonomous underwater vehicle (AUV), suitable for use in the control loop of an AUV. The velocity is estimated using computer vision algorithms, namely optical flow and block matching, taking into account the motion characteristics of autonomous underwater vehicles (maximum velocity and acceleration), which give these systems slow dynamics. Since these computer vision techniques are computationally intensive and not compatible with real-time operation when implemented on microcomputers alone, this problem is addressed by studying their possible implementation on a field-programmable gate array (FPGA) together with microcomputers. The optical flow algorithms studied were Horn-Schunck and Lucas-Kanade, with their different variations and optimizations, as well as simpler algorithms such as block matching.
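
    To make the block-matching idea concrete, the sketch below estimates a per-frame pixel displacement by exhaustively searching a small window around a central reference block and converting the best shift into an apparent velocity. It is a minimal Python/NumPy illustration only; the block size, search radius and frame rate are assumed values, not parameters taken from the dissertation.

        import numpy as np

        def block_match(prev_frame, curr_frame, block=16, search=4):
            """Exhaustive block matching: return the (dy, dx) shift that minimises the
            sum of absolute differences (SAD) between a central block of prev_frame
            and candidate blocks of curr_frame."""
            h, w = prev_frame.shape
            y0, x0 = (h - block) // 2, (w - block) // 2        # central reference block
            ref = prev_frame[y0:y0 + block, x0:x0 + block].astype(np.float32)
            best, best_shift = np.inf, (0, 0)
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    cand = curr_frame[y0 + dy:y0 + dy + block,
                                      x0 + dx:x0 + dx + block].astype(np.float32)
                    sad = np.abs(ref - cand).sum()
                    if sad < best:
                        best, best_shift = sad, (dy, dx)
            return best_shift

        # Hypothetical usage: two consecutive grayscale frames at an assumed 10 fps.
        prev = np.random.randint(0, 256, (120, 160), dtype=np.uint8)
        curr = np.roll(prev, shift=2, axis=1)                  # simulate a 2-pixel horizontal drift
        dy, dx = block_match(prev, curr)
        fps = 10.0
        print("apparent velocity: %.1f px/s horizontal, %.1f px/s vertical" % (dx * fps, dy * fps))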

    A novel system architecture for real-time low-level vision

    A novel system architecture that exploits the spatial locality in memory access found in most low-level vision algorithms is presented. A real-time feature selection system is used to exemplify the underlying ideas, and an implementation based on commercially available Field Programmable Gate Arrays (FPGAs) and synchronous SRAM memory devices is proposed. The peak memory access rate of a system based on this architecture is estimated at 2.88 GB/s, which represents a four- to five-fold improvement over existing reconfigurable computers.
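
    The quoted peak rate is the product of the number of parallel memory banks, the data width per access and the access clock. The short sketch below shows how such an estimate is formed; the bank count, word width and clock are illustrative assumptions that happen to be consistent with the 2.88 GB/s figure, not the paper's actual configuration.

        # Back-of-the-envelope peak bandwidth of a multi-bank synchronous SRAM system.
        # All parameters below are illustrative assumptions, not the paper's design.
        banks       = 4        # independent SRAM banks accessed in parallel
        width_bytes = 4        # 32-bit data bus per bank
        clock_hz    = 180e6    # one access per bank per cycle

        peak_bytes_per_s = banks * width_bytes * clock_hz
        print("peak access rate: %.2f GB/s" % (peak_bytes_per_s / 1e9))   # -> 2.88 GB/s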

    Accelerated hardware video object segmentation: From foreground detection to connected components labelling

    This paper demonstrates the use of a single-chip FPGA for the segmentation of moving objects in a video sequence. The system maintains highly accurate background models and integrates the detection of foreground pixels with the labelling of objects using a connected components algorithm. The background models are based on 24-bit RGB values and 8-bit grayscale intensity values. A multimodal background differencing algorithm is presented, using a single FPGA chip and four blocks of RAM. The real-time connected component labelling algorithm, also designed for FPGA implementation, run-length encodes the output of the background subtraction and performs connected component analysis on this representation. The run-length encoding, together with other parts of the algorithm, is performed in parallel; sequential operations are minimized, since the number of run-lengths is typically much smaller than the number of pixels. The two algorithms are pipelined together for maximum efficiency.
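
    As an illustration of the run-length stage, the following Python sketch run-length encodes each row of a binary foreground mask and links runs that overlap between consecutive rows into labelled components. It is a simplified software model (label-equivalence merging is omitted for brevity), not the hardware pipeline described in the paper.

        import numpy as np

        def row_runs(row):
            """Run-length encode one binary row as (start, end) pairs, end exclusive."""
            runs, start = [], None
            for x, v in enumerate(row):
                if v and start is None:
                    start = x
                elif not v and start is not None:
                    runs.append((start, x))
                    start = None
            if start is not None:
                runs.append((start, len(row)))
            return runs

        def label_runs(mask):
            """Connected-component labelling on run-lengths, two rows at a time
            (8-connectivity; label-equivalence merging omitted for brevity)."""
            next_label, labelled, prev = 1, [], []
            for y in range(mask.shape[0]):
                curr = []
                for s, e in row_runs(mask[y]):
                    # a run connects to any previous-row run it overlaps or touches diagonally
                    hits = [lab for ps, pe, lab in prev if s <= pe and e >= ps]
                    lab = min(hits) if hits else next_label
                    if not hits:
                        next_label += 1
                    curr.append((s, e, lab))
                labelled.append(curr)
                prev = curr
            return labelled

        mask = np.array([[0, 1, 1, 0, 0, 1],
                         [0, 0, 1, 1, 0, 1],
                         [0, 0, 0, 0, 0, 0]], dtype=np.uint8)
        for y, runs in enumerate(label_runs(mask)):
            print("row", y, runs)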

    Toolflows for Mapping Convolutional Neural Networks on FPGAs: A Survey and Future Directions

    In the past decade, Convolutional Neural Networks (CNNs) have demonstrated state-of-the-art performance in various Artificial Intelligence tasks. To accelerate the experimentation and development of CNNs, several software frameworks have been released, primarily targeting power-hungry CPUs and GPUs. In this context, reconfigurable hardware in the form of FPGAs constitutes a potential alternative platform that can be integrated into the existing deep learning ecosystem to provide a tunable balance between performance, power consumption and programmability. In this paper, a survey of the existing CNN-to-FPGA toolflows is presented, comprising a comparative study of their key characteristics, which include the supported applications, architectural choices, design space exploration methods and achieved performance. Moreover, major challenges and objectives introduced by the latest trends in CNN algorithmic research are identified and presented. Finally, a uniform evaluation methodology is proposed, aiming at the comprehensive, complete and in-depth evaluation of CNN-to-FPGA toolflows.
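
    As a flavour of the analytical design space exploration that many such toolflows perform, the sketch below enumerates tiling factors for one convolution layer, estimates cycle count and DSP usage with a crude model, and keeps the fastest design that fits a resource budget. The layer shape, DSP budget and cycle model are hypothetical and do not correspond to any specific toolflow covered by the survey.

        import math

        # Illustrative, simplified design-space exploration for one convolution layer.
        M, N, R, C, K = 128, 64, 32, 32, 3   # out/in channels, output size, kernel (assumed)
        DSP_BUDGET = 900                     # available DSP slices (assumed)

        best = None
        for Tm in range(1, M + 1):           # output-channel unroll factor
            for Tn in range(1, N + 1):       # input-channel unroll factor
                dsps = Tm * Tn               # one multiplier per (Tm, Tn) pair per cycle
                if dsps > DSP_BUDGET:
                    continue
                # cycles ~ total MACs / parallel MACs per cycle (memory stalls ignored)
                cycles = math.ceil(M / Tm) * math.ceil(N / Tn) * R * C * K * K
                if best is None or cycles < best[0]:
                    best = (cycles, Tm, Tn, dsps)

        cycles, Tm, Tn, dsps = best
        print("best tiling: Tm=%d, Tn=%d (%d DSPs), %d cycles" % (Tm, Tn, dsps, cycles))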

    Neuromorphic LIF Row-by-Row Multiconvolution Processor for FPGA

    Deep Learning algorithms have become state-of-the-art methods in multiple fields, including computer vision, speech recognition, natural language processing and audio recognition, among others. In image-based vision, convolutional neural networks (CNNs) stand out. This kind of network is expensive in terms of computational resources due to the large number of operations required to process a frame. In recent years, several frame-based chip solutions for deploying CNNs in real time have been developed. Despite the good results in power and accuracy given by these solutions, the number of operations is still high, due to the complexity of the current network models. However, it is possible to reduce the number of operations using computer vision techniques other than frame-based ones, e.g., neuromorphic event-based techniques. There exist several neuromorphic vision sensors whose pixels detect changes in luminosity. Inspired by the leaky integrate-and-fire (LIF) neuron, we propose in this manuscript an event-based field-programmable gate array (FPGA) multiconvolution system. Its main novelty is the combination of a memory arbiter for efficient memory access with row-by-row kernel processing. This system is able to convolve 64 filters across multiple kernel sizes, from 1 × 1 to 7 × 7, with latencies of 1.3 μs and 9.01 μs, respectively, generating a continuous flow of output events. The proposed architecture will easily fit spike-based CNNs.
    Ministerio de Economía y Competitividad TEC2016-77785-
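
    A behavioural sketch of event-driven LIF convolution is given below: each incoming address-event accumulates the kernel into a membrane-potential map around its coordinates, potentials decay with a leak factor, and neurons crossing threshold emit output events and reset. This illustrates the computational model only; the kernel, leak and threshold values are assumed, and the memory-arbiter/row-by-row architecture of the paper is not modelled.

        import numpy as np

        H, W, K = 64, 64, 3                               # sensor resolution and kernel size (assumed)
        kernel = np.ones((K, K), dtype=np.float32)        # illustrative kernel weights
        potential = np.zeros((H, W), dtype=np.float32)    # membrane potential per output neuron
        LEAK, THRESHOLD = 0.98, 3.5                       # illustrative LIF parameters

        def process_event(x, y):
            """Integrate one input address-event: leak, add the kernel around (x, y), fire."""
            potential[:] *= LEAK                # leak applied per processed event, for simplicity
            r = K // 2
            for ky in range(-r, r + 1):         # accumulate the kernel into the neighbourhood
                for kx in range(-r, r + 1):
                    yy, xx = y + ky, x + kx
                    if 0 <= yy < H and 0 <= xx < W:
                        potential[yy, xx] += kernel[ky + r, kx + r]
            fired = potential >= THRESHOLD      # fire-and-reset wherever the threshold is crossed
            events = [(int(xx), int(yy)) for yy, xx in np.argwhere(fired)]
            potential[fired] = 0.0
            return events

        # Hypothetical usage: a short stream of input address-events.
        for x, y in [(10, 12), (11, 12), (10, 13), (11, 13)]:
            out = process_event(x, y)
            if out:
                print("output events:", out)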

    A one-transistor-synapse strategy for electrically-programmable massively-parallel analog array processors

    This paper presents a linear, four-quadrant, electrically-programmable, one-transistor synapse strategy applicable to the implementation of general massively-parallel analog processors in CMOS technology. It is especially suited for translationally-invariant processing arrays with local connectivity, and results in a significant reduction in the area occupation and power dissipation of the basic processing units. This allows higher integration densities and therefore permits the integration of larger arrays on a single chip.
    Comisión Interministerial de Ciencia y Tecnología TIC96-1392-C02-0
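
    A behavioural model of such a translationally-invariant array with local connectivity is sketched below: every cell sums its 3x3 neighbourhood weighted by a shared template, and each weight-state product stands in for the operation that a single programmable synapse would realise. The template values and array size are illustrative; the sketch does not model the actual one-transistor circuit technique of the paper.

        import numpy as np

        # Behavioural model of a translationally-invariant analog array with local
        # 3x3 connectivity; each weight-state product below stands in for one
        # programmable synapse.  Template values and array size are illustrative.
        template = np.array([[0.0,  0.5, 0.0],
                             [0.5, -2.0, 0.5],
                             [0.0,  0.5, 0.0]], dtype=np.float32)
        state = np.random.rand(16, 16).astype(np.float32)      # analog cell states

        def array_step(state, template):
            """One settling step: every cell sums its 3x3 neighbourhood weighted by
            the shared template (zero boundary conditions)."""
            h, w = state.shape
            padded = np.pad(state, 1)
            out = np.zeros_like(state)
            for dy in range(3):
                for dx in range(3):
                    out += template[dy, dx] * padded[dy:dy + h, dx:dx + w]
            return out

        currents = array_step(state, template)
        print("mean cell input current: %.3f" % currents.mean())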