3,279 research outputs found

    Histogram of oriented gradients front end processing: an FPGA based processor approach

    Get PDF
    The Field Programmable Gate Array (FPGA) implementation of the commonly used Histogram of Oriented Gradients (HOG) algorithm is explored. The HOG algorithm is employed to extract features for object detection. A key focus has been to explore the use of a new FPGA-based processor which has been targeted at image processing. The paper gives details of the mapping and scheduling factors that influence the performance and the stages that were undertaken to allow the algorithm to be deployed on FPGA hardware, whilst taking into account the specific IPPro architecture features. We show that multi-core IPPro performance can exceed that of against state-of-the-art FPGA designs by up to 3.2 times with reduced design and implementation effort and increased flexibility all on a low cost, Zynq programmable system

    Adaptive online deployment for resource constrained mobile smart clients

    Get PDF
    Nowadays mobile devices are more and more used as a platform for applications. Contrary to prior generation handheld devices configured with a predefined set of applications, today leading edge devices provide a platform for flexible and customized application deployment. However, these applications have to deal with the limitations (e.g. CPU speed, memory) of these mobile devices and thus cannot handle complex tasks. In order to cope with the handheld limitations and the ever changing device context (e.g. network connections, remaining battery time, etc.) we present a middleware solution that dynamically offloads parts of the software to the most appropriate server. Without a priori knowledge of the application, the optimal deployment is calculated, that lowers the cpu usage at the mobile client, whilst keeping the used bandwidth minimal. The information needed to calculate this optimum is gathered on the fly from runtime information. Experimental results show that the proposed solution enables effective execution of complex applications in a constrained environment. Moreover, we demonstrate that the overhead from the middleware components is below 2%

    Enhancing Real-time Embedded Image Processing Robustness on Reconfigurable Devices for Critical Applications

    Get PDF
    Nowadays, image processing is increasingly used in several application fields, such as biomedical, aerospace, or automotive. Within these fields, image processing is used to serve both non-critical and critical tasks. As example, in automotive, cameras are becoming key sensors in increasing car safety, driving assistance and driving comfort. They have been employed for infotainment (non-critical), as well as for some driver assistance tasks (critical), such as Forward Collision Avoidance, Intelligent Speed Control, or Pedestrian Detection. The complexity of these algorithms brings a challenge in real-time image processing systems, requiring high computing capacity, usually not available in processors for embedded systems. Hardware acceleration is therefore crucial, and devices such as Field Programmable Gate Arrays (FPGAs) best fit the growing demand of computational capabilities. These devices can assist embedded processors by significantly speeding-up computationally intensive software algorithms. Moreover, critical applications introduce strict requirements not only from the real-time constraints, but also from the device reliability and algorithm robustness points of view. Technology scaling is highlighting reliability problems related to aging phenomena, and to the increasing sensitivity of digital devices to external radiation events that can cause transient or even permanent faults. These faults can lead to wrong information processed or, in the worst case, to a dangerous system failure. In this context, the reconfigurable nature of FPGA devices can be exploited to increase the system reliability and robustness by leveraging Dynamic Partial Reconfiguration features. The research work presented in this thesis focuses on the development of techniques for implementing efficient and robust real-time embedded image processing hardware accelerators and systems for mission-critical applications. Three main challenges have been faced and will be discussed, along with proposed solutions, throughout the thesis: (i) achieving real-time performances, (ii) enhancing algorithm robustness, and (iii) increasing overall system's dependability. In order to ensure real-time performances, efficient FPGA-based hardware accelerators implementing selected image processing algorithms have been developed. Functionalities offered by the target technology, and algorithm's characteristics have been constantly taken into account while designing such accelerators, in order to efficiently tailor algorithm's operations to available hardware resources. On the other hand, the key idea for increasing image processing algorithms' robustness is to introduce self-adaptivity features at algorithm level, in order to maintain constant, or improve, the quality of results for a wide range of input conditions, that are not always fully predictable at design-time (e.g., noise level variations). This has been accomplished by measuring at run-time some characteristics of the input images, and then tuning the algorithm parameters based on such estimations. Dynamic reconfiguration features of modern reconfigurable FPGA have been extensively exploited in order to integrate run-time adaptivity into the designed hardware accelerators. Tools and methodologies have been also developed in order to increase the overall system dependability during reconfiguration processes, thus providing safe run-time adaptation mechanisms. In addition, taking into account the target technology and the environments in which the developed hardware accelerators and systems may be employed, dependability issues have been analyzed, leading to the development of a platform for quickly assessing the reliability and characterizing the behavior of hardware accelerators implemented on reconfigurable FPGAs when they are affected by such faults

    Image Processing Using FPGAs

    Get PDF
    This book presents a selection of papers representing current research on using field programmable gate arrays (FPGAs) for realising image processing algorithms. These papers are reprints of papers selected for a Special Issue of the Journal of Imaging on image processing using FPGAs. A diverse range of topics is covered, including parallel soft processors, memory management, image filters, segmentation, clustering, image analysis, and image compression. Applications include traffic sign recognition for autonomous driving, cell detection for histopathology, and video compression. Collectively, they represent the current state-of-the-art on image processing using FPGAs

    Evolvable hardware system for automatic optical inspection

    Get PDF

    Optimal processor assignment for pipeline computations

    Get PDF
    The availability of large scale multitasked parallel architectures introduces the following processor assignment problem for pipelined computations. Given a set of tasks and their precedence constraints, along with their experimentally determined individual responses times for different processor sizes, find an assignment of processor to tasks. Two objectives are of interest: minimal response given a throughput requirement, and maximal throughput given a response time requirement. These assignment problems differ considerably from the classical mapping problem in which several tasks share a processor; instead, it is assumed that a large number of processors are to be assigned to a relatively small number of tasks. Efficient assignment algorithms were developed for different classes of task structures. For a p processor system and a series parallel precedence graph with n constituent tasks, an O(np2) algorithm is provided that finds the optimal assignment for the response time optimization problem; it was found that the assignment optimizing the constrained throughput in O(np2log p) time. Special cases of linear, independent, and tree graphs are also considered

    Design and implementation of an FPGA-based piecewise affine Kalman Filter for Cyber-Physical Systems

    Get PDF
    The Kalman Filter is a robust tool often employed as a process observer in Cyber-Physical Systems. However, in the general case the high computational cost, especially for large plant models or fast sample rates, makes it an impractical choice for typical low-power microcontrollers. Furthermore, although industry trends towards tighter integration are supported by powerful high-end System-on-Chip software processors, this consolidation complicates the ability for a controls engineer to verify correct behavior of the system under all conditions, which is important in safety-critical systems and systems demanding a high degree of reliability. Dedicated Field-Programmable Gate Array (FPGA) hardware can provide application speedup, design partitioning in mixed-criticality systems, and fully deterministic timing, which helps ensure a control system behaves identically to offline simulations. This dissertation presents a new design methodology which can be leveraged to yield such benefits. Although this dissertation focuses on the Kalman Filter, the method is general enough to be extended to other compute-intensive algorithms which rely on state-space modeling. For the first part, the core idea is that decomposing the Kalman Filter algorithm from a strictly linear perspective leads to a more generalized architecture with increased performance compared to approaches which focus on nonlinear filters (e.g. Extended Kalman Filter). Our contribution is a broadly-applicable hardware-software architecture for a linear Kalman Filter whose operating domain is extended through online model swapping. A supporting application-agnostic performance and resource analysis is provided. For the second part, we identify limitations of the mixed hardware-software method and demonstrate how to leverage hardware-based region identification in order to develop a strictly hardware-only Kalman Filter which maintains a large operating domain. The resulting hardware processor is partitioned from low criticality software tasks running on a supervising software processor and enables vastly simplified timing validation
    • …
    corecore