2,317 research outputs found

    An area-efficient 2-D convolution implementation on FPGA for space applications

    Get PDF
    The 2-D Convolution is an algorithm widely used in image and video processing. Although its computation is simple, its implementation requires a high computational power and an intensive use of memory. Field Programmable Gate Arrays (FPGA) architectures were proposed to accelerate calculations of 2-D Convolution and the use of buffers implemented on FPGAs are used to avoid direct memory access. In this paper we present an implementation of the 2-D Convolution algorithm on a FPGA architecture designed to support this operation in space applications. This proposed solution dramatically decreases the area needed keeping good performance, making it appropriate for embedded systems in critical space application

    A Scalable Architecture for Discrete Wavelet Transform on FPGA-Based System

    Get PDF

    FPGA implementations for parallel multidimensional filtering algorithms

    Get PDF
    PhD ThesisOne and multi dimensional raw data collections introduce noise and artifacts, which need to be recovered from degradations by an automated filtering system before, further machine analysis. The need for automating wide-ranged filtering applications necessitates the design of generic filtering architectures, together with the development of multidimensional and extensive convolution operators. Consequently, the aim of this thesis is to investigate the problem of automated construction of a generic parallel filtering system. Serving this goal, performance-efficient FPGA implementation architectures are developed to realize parallel one/multi-dimensional filtering algorithms. The proposed generic architectures provide a mechanism for fast FPGA prototyping of high performance computations to obtain efficiently implemented performance indices of area, speed, dynamic power, throughput and computation rates, as a complete package. These parallel filtering algorithms and their automated generic architectures tackle the major bottlenecks and limitations of existing multiprocessor systems in wordlength, input data segmentation, boundary conditions as well as inter-processor communications, in order to support high data throughput real-time applications of low-power architectures using a Xilinx Virtex-6 FPGA board. For one-dimensional raw signal filtering case, mathematical model and architectural development of the generalized parallel 1-D filtering algorithms are presented using the 1-D block filtering method. Five generic architectures are implemented on a Virtex-6 ML605 board, evaluated and compared. A complete set of results on area, speed, power, throughput and computation rates are obtained and discussed as performance indices for the 1-D convolution architectures. A successful application of parallel 1-D cross-correlation is demonstrated. For two dimensional greyscale/colour image processing cases, new parallel 2-D/3-D filtering algorithms are presented and mathematically modelled using input decimation and output image reconstruction by interpolation. Ten generic architectures are implemented on the Virtex-6 ML605 board, evaluated and compared. Key results on area, speed, power, throughput and computation rate are obtained and discussed as performance indices for the 2-D convolution architectures. 2-D image reconfigurable processors are developed and implemented using single, dual and quad MAC FIR units. 3-D Colour image processors are devised to act as 3-D colour filtering engines. A 2-D cross-correlator parallel engine is successfully developed as a parallel 2-D matched filtering algorithm for locating any MRI slice within a MRI data stack library. Twelve 3-D MRI filtering operators are plugged in and adapted to be suitable for biomedical imaging, including 3-D edge operators and 3-D noise smoothing operators. Since three dimensional greyscale/colour volumetric image applications are computationally intensive, a new parallel 3-D/4-D filtering algorithm is presented and mathematically modelled using volumetric data image segmentation by decimation and output reconstruction by interpolation, after simultaneously and independently performing 3-D filtering. Eight generic architectures are developed and implemented on the Virtex-6 board, including 3-D spatial and FFT convolution architectures. Fourteen 3-D MRI filtering operators are plugged and adapted for this particular biomedical imaging application, including 3-D edge operators and 3-D noise smoothing operators. Three successful applications are presented in 4-D colour MRI (fMRI) filtering processors, k-space MRI volume data filter and 3-D cross-correlator.IRAQI Government

    Hardware Architectures for Image Processing Acceleration

    Get PDF

    Ein flexibles, heterogenes Bildverarbeitungs-Framework fĂŒr weltraumbasierte, rekonfigurierbare Datenverarbeitungsmodule

    Get PDF
    Scientific instruments as payload of current space missions are often equipped with high-resolution sensors. Thereby, especially camera-based instruments produce a vast amount of data. To obtain the desired scientific information, this data usually is processed on ground. Due to the high distance of missions within the solar system, the data rate for downlink to the ground station is strictly limited. The volume of scientific relevant data is usually less compared to the obtained raw data. Therefore, processing already has to be carried out on-board the spacecraft. An example of such an instrument is the Polarimetric and Helioseismic Imager (PHI) on-board Solar Orbiter. For acquisition, storage and processing of images, the instrument is equipped with a Data Processing Module (DPM). It makes use of heterogeneous computing based on a dedicated LEON3 processor in combination with two reconfigurable Xilinx Virtex-4 Field-Programmable Gate Arrays (FPGAs). The thesis will provide an overview of the available space-grade processing components (processors and FPGAs) which fulfill the requirements of deepspace missions. It also presents existing processing platforms which are based upon a heterogeneous system combining processors and FPGAs. This also includes the DPM of the PHI instrument, whose architecture will be introduced in detail. As core contribution of this thesis, a framework will be presented which enables high-performance image processing on such hardware-based systems while retaining software-like flexibility. This framework mainly consists of a variety of modules for hardware acceleration which are integrated seamlessly into the data flow of the on-board software. Supplementary, it makes extensive use of the dynamic in-flight reconfigurability of the used Virtex-4 FPGAs. The flexibility of the presented framework is proven by means of multiple examples from within the image processing of the PHI instrument. The framework is analyzed with respect to processing performance as well as power consumption.Wissenschaftliche Instrumente auf aktuellen Raumfahrtmissionen sind oft mit hochauflösenden Sensoren ausgestattet. Insbesondere kamerabasierte Instrumente produzieren dabei eine große Menge an Daten. Diese werden ĂŒblicherweise nach dem Empfang auf der Erde weiterverarbeitet, um daraus wissenschaftlich relevante Informationen zu gewinnen. Aufgrund der großen Entfernung von Missionen innerhalb unseres Sonnensystems ist die Datenrate zur Übertragung an die Bodenstation oft sehr begrenzt. Das Volumen der wissenschaftlich relevanten Daten ist meist deutlich kleiner als die aufgenommenen Rohdaten. Daher ist es vorteilhaft, diese bereits an Board der Sonde zu verarbeiten. Ein Beispiel fĂŒr solch ein Instrument ist der Polarimetric and Helioseismic Imager (PHI) an Bord von Solar Orbiter. Um die Daten aufzunehmen, zu speichern und zu verarbeiten, ist das Instrument mit einem Data Processing Module (DPM) ausgestattet. Dieses nutzt ein heterogenes Rechnersystem aus einem dedizierten LEON3 Prozessor, zusammen mit zwei rekonfigurierbaren Xilinx Virtex-4 Field-Programmable Gate Arrays (FPGAs). Die folgende Arbeit gibt einen Überblick ĂŒber verfĂŒgbare Komponenten zur Datenverarbeitung (Prozessoren und FPGAs), die den Anforderungen von Raumfahrtmissionen gerecht werden, und stellt einige existierende Plattformen vor, die auf einem heterogenen System aus Prozessor und FPGA basieren. Hierzu gehört auch das Data Processing Module des PHI Instrumentes, dessen Architektur im Verlauf dieser Arbeit beschrieben wird. Als Kernelement der Dissertation wird ein Framework vorgestellt, das sowohl eine performante, als auch eine flexible Bilddatenverarbeitung auf einem solchen System ermöglicht. Dieses Framework besteht aus verschiedenen Modulen zur Hardwarebeschleunigung und bindet diese nahtlos in den Datenfluss der On-Board Software ein. Dabei wird außerdem die Möglichkeit genutzt, die eingesetzten Virtex-4 FPGAs dynamisch zur Laufzeit zu rekonfigurieren. Die FlexibilitĂ€t des vorgestellten Frameworks wird anhand mehrerer Fallbeispiele aus der Bildverarbeitung von PHI dargestellt. Das Framework wird bezĂŒglich der Verarbeitungsgeschwindigkeit und Energieeffizienz analysiert

    Dynamic reconfiguration technologies based on FPGA in software defined radio system

    Get PDF
    Partial Reconfiguration (PR) is a method for Field Programmable Gate Array (FPGA) designs which allows multiple applications to time-share a portion of an FPGA while the rest of the device continues to operate unaffected. Using this strategy, the physical layer processing architecture in Software Defined Radio (SDR) systems can benefit from reduced complexity and increased design flexibility, as different waveform applications can be grouped into one part of a single FPGA. Waveform switching often means not only changing functionality, but also changing the FPGA clock frequency. However, that is beyond the current functionality of PR processes as the clock components (such as Digital Clock Managers (DCMs)) are excluded from the process of partial reconfiguration. In this paper, we present a novel architecture that combines another reconfigurable technology, Dynamic Reconfigurable Port (DRP), with PR based on a single FPGA in order to dynamically change both functionality and also the clock frequency. The architecture is demonstrated to reduce hardware utilization significantly compared with standard, static FPGA design
    • 

    corecore