30 research outputs found

    Efficient Hardware Algorithms for FPGAs

    Hiroshima University, Doctor of Engineering, doctora

    Hardware acceleration of the trace transform for vision applications

    Computer Vision is a rapidly developing field in which machines process visual data to extract meaningful information. Digitised images, as raw pixels and bits, serve no purpose of their own. It is only by interpreting the data and extracting higher-level information that a scene can be understood. The algorithms that enable this process are often complex and data-intensive, limiting the processing rate when implemented in software. Hardware-accelerated implementations provide a significant performance boost that can enable real-time processing. The Trace transform is a newly proposed algorithm that has been proven effective in image categorisation and recognition tasks. It is flexibly defined, allowing the mathematical details to be tailored to the target application. However, it is highly computationally intensive, which limits its applications. Modern heterogeneous FPGAs provide an ideal platform for accelerating the Trace transform to real-time performance, while also allowing an element of flexibility that suits the generality of the Trace transform. This thesis details the implementation of an extensible Trace transform architecture for vision applications, before extending this architecture to a fully flexible platform suited to the exploration of Trace transform applications. As part of this work, a general set of architectures for large-windowed median and weighted median filters is presented, as required by a number of Trace transform implementations. Finally, an acceleration of Pseudo 2-Dimensional Hidden Markov Model decoding, usable in a person detection system, is presented. Such a system can be used to extract frames of interest from a video sequence, to be subsequently processed by the Trace transform. All these architectures emphasise the need for considered, platform-driven design in achieving maximum performance through hardware acceleration.

    Feature detection in an indoor environment using Hardware Accelerators for time-efficient Monocular SLAM

    In the field of Robotics, Monocular Simultaneous Localization and Mapping (Monocular SLAM) has gained immense popularity, as it replaces large and costly sensors such as laser range finders with a single cheap camera. Additionally, the well-developed area of Computer Vision provides robust image processing algorithms which aid in developing feature detection techniques for the implementation of Monocular SLAM. Similarly, in the field of digital electronics and embedded systems, hardware acceleration using FPGAs has become quite popular. Hardware acceleration is based on the idea of offloading certain iterative algorithms from the processor and implementing them on a dedicated piece of hardware such as an ASIC or FPGA, to reduce execution time and possibly the net power consumption of the system. Good strides have been taken in developing massively pipelined and resource-efficient hardware implementations of several image processing algorithms on FPGAs, which achieve a fairly decent speed-up of the processing time. In this thesis, we have developed a very simple algorithm for feature detection in an indoor environment by means of a single camera, based on the Canny Edge Detection and Hough Transform algorithms using the OpenCV library, and proposed its integration with an existing feature initialization technique for a complete Monocular SLAM implementation. Following this, we have developed hardware accelerators for Canny Edge Detection and the Hough Transform, and we have compared the timing performance of the hardware implementation (using FPGAs) with that of a software implementation (using C++ and OpenCV).

    A Flexible, Heterogeneous Image Processing Framework for Space-Based, Reconfigurable Data Processing Modules

    Scientific instruments carried as payload on current space missions are often equipped with high-resolution sensors. Camera-based instruments in particular produce a vast amount of data. To obtain the desired scientific information, this data is usually processed on the ground. Due to the large distances involved in missions within the solar system, the data rate for downlink to the ground station is strictly limited. The volume of scientifically relevant data is usually much smaller than that of the acquired raw data. Therefore, processing has to be carried out on board the spacecraft. An example of such an instrument is the Polarimetric and Helioseismic Imager (PHI) on board Solar Orbiter. For acquisition, storage and processing of images, the instrument is equipped with a Data Processing Module (DPM). It makes use of heterogeneous computing based on a dedicated LEON3 processor in combination with two reconfigurable Xilinx Virtex-4 Field-Programmable Gate Arrays (FPGAs). The thesis provides an overview of the available space-grade processing components (processors and FPGAs) which fulfill the requirements of deep-space missions. It also presents existing processing platforms which are based on a heterogeneous system combining processors and FPGAs. This includes the DPM of the PHI instrument, whose architecture is introduced in detail. As the core contribution of this thesis, a framework is presented which enables high-performance image processing on such hardware-based systems while retaining software-like flexibility. This framework mainly consists of a variety of modules for hardware acceleration which are integrated seamlessly into the data flow of the on-board software. In addition, it makes extensive use of the dynamic in-flight reconfigurability of the Virtex-4 FPGAs. The flexibility of the presented framework is demonstrated by means of multiple examples from the image processing of the PHI instrument. The framework is analyzed with respect to processing performance as well as power consumption.

    FPGA implementation of a memory-efficient Hough Parameter Space for the detection of lines

    The Line Hough Transform (LHT) is a robust and accurate line detection algorithm, useful for applications such as lane detection in Advanced Driver Assistance Systems. For real-time implementation, the LHT is demanding in terms of computation and memory, and hence Field Programmable Gate Arrays (FPGAs) are often deployed. However, many small FPGAs are incapable of implementing the LHT due to the large memory requirement of the Hough Parameter Space (HPS). This paper presents a memory-efficient architecture of the LHT named the Angular Regions - Line Hough Transform (AR-LHT). We present a suitable FPGA implementation of the AR-LHT and provide a performance and resource analysis targeting a Xilinx xc7z010-1 device. Results demonstrate that, for an image of 1024x1024 pixels, approximately 48% less memory is used than by the Standard LHT. The FPGA architecture is capable of processing a single image in 9.03 ms.
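
    To make the memory argument concrete, the following is a minimal software sketch of a standard LHT accumulator, not the AR-LHT architecture from the paper. The function names, the choice of 180 angle bins and the 1-pixel rho resolution are assumptions made for illustration; the paper's actual parameterisation is not given in the abstract.

```python
import math

def hps_cells(width, height, n_theta=180):
    """Number of accumulator cells in a standard (rho, theta) parameter space.

    Assumes 1-pixel rho resolution and n_theta angle bins (illustrative choices).
    """
    diag = math.ceil(math.hypot(width, height))
    return (2 * diag + 1) * n_theta  # rho spans [-diag, +diag]

def hough_lines(points, width, height, n_theta=180):
    """Vote each edge point into the (rho, theta) accumulator."""
    diag = math.ceil(math.hypot(width, height))
    acc = [[0] * n_theta for _ in range(2 * diag + 1)]
    thetas = [math.pi * t / n_theta for t in range(n_theta)]
    for x, y in points:
        for t, theta in enumerate(thetas):
            rho = x * math.cos(theta) + y * math.sin(theta)
            acc[round(rho) + diag][t] += 1  # shift rho so the index is non-negative
    return acc, diag

def strongest_line(acc, diag, n_theta=180):
    """Return (votes, rho, theta in degrees) of the fullest accumulator cell."""
    votes, r, t = max(
        (v, r, t) for r, row in enumerate(acc) for t, v in enumerate(row)
    )
    return votes, r - diag, 180.0 * t / n_theta

if __name__ == "__main__":
    # Edge points of a horizontal line y = 5 in a 128x128 image.
    pts = [(x, 5) for x in range(100)]
    acc, diag = hough_lines(pts, 128, 128)
    print(strongest_line(acc, diag))
    print(hps_cells(1024, 1024))
```

    Every edge pixel updates one cell per angle bin, and the accumulator size grows with the image diagonal. Both the computational cost and the memory cost noted in the abstract follow from this structure.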

    Resource-efficient dynamic partial reconfiguration on FPGAs for space instruments

    Field-Programmable Gate Arrays (FPGAs) provide highly flexible platforms to implement sophisticated data processing for scientific space instruments. The dynamic partial reconfiguration (DPR) capability of FPGAs allows HW tasks to be scheduled. While this feature adds another dimension of processing power that can be exploited without significantly increasing system complexity and power consumption, there are still several challenges to using DPR efficiently. State-of-the-art concepts concentrate either on resource-efficient implementations at design time or on flexible HW task scheduling at runtime. In this paper we propose a balanced algorithm that considers both optimization goals and is well suited for resource-limited space applications.

    Heterogeneous computing systems for vision-based multi-robot tracking

    Irwansyah A. Heterogeneous computing systems for vision-based multi-robot tracking. Bielefeld: Universität Bielefeld; 2017

    Study and Optimization of Particle Track Detection via Hough Transform Hardware Implementation for the ATLAS Phase-II Trigger Upgrade

    At CERN in Geneva, the Large Hadron Collider (LHC) will undergo several major upgrades in the coming years. The instantaneous and integrated luminosity will be increased up to $5$ to $7 \cdot 10^{34}\ \mathrm{cm^{-2}\,s^{-1}}$ and $3000\ \mathrm{fb^{-1}}$, respectively. Alongside the collider, the experiments exploiting the LHC will undergo upgrades crucial to fulfilling the HEP goals. The ATLAS upgrades are divided into phases, namely Phase-I and Phase-II. Part of the ATLAS upgrade concerns the Trigger and Data Acquisition (TDAQ) systems. In particular, a major technological update of the ATLAS trigger is planned for Phase-II. My contribution to these Phase-I and Phase-II plans has focused on the electronics update of the Trigger and Data Acquisition system. In the Phase-I upgrade I worked on the commissioning of the new FELIX readout cards FLX-712, which will be mounted on part of the TDAQ system. These cards are FPGA-based, with a bandwidth of up to 480 Gb/s, and exploit PCI Express Generation 3 technology. My work focused on the preparation and follow-up of part of the card tests for quality checks and controls. The ATLAS Phase-II trigger aims to increase its output data stream to Tier 0 by one order of magnitude. For the ATLAS Phase-II upgrade I developed an implementation of a tracking algorithm to fulfill the new trigger requirements. This algorithm, known as the Hough Transform, is used to track particle trajectories and has already been demonstrated to be suited to the ATLAS specifications. In this thesis I present the study, the simulations and the hardware implementation of a preliminary version of the Hough Transform algorithm on a Xilinx UltraScale+ FPGA device.