2,924 research outputs found

    High performance hardware architectures for one bit transform based motion estimation

    Get PDF
    Motion Estimation (ME) is the most computationally intensive and most power consuming part of video compression and video enhancement systems. ME is used in video compression standards such as MPEG4, H.264 and it is used in video enhancement algorithms such as frame rate conversion and de-interlacing. One bit transform (1BT) based ME algorithms have low computational complexity. Therefore, in this thesis, we propose high performance hardware architectures for 1BT based fixed block size (FBS) single reference frame (SRF) ME, variable block size (VBS) SRF ME, and multiple reference frame (MRF) ME. Constraint One Bit Transform (C-1BT) ME algorithm improves the ME performance of 1BT ME, and the early terminated C-1BT ME algorithm reduces the computational complexity of C-1BT ME. Therefore, in this thesis, we also propose a high performance early terminated C-1BT ME hardware architecture. The proposed FBS SRF ME hardware architectures perform full search ME for 4 Macroblocks in parallel and they are faster than the 1BT based ME hardware reported in the literature. In addition, they use less on-chip memory than the previous 1BT based ME hardware by using a novel data reuse scheme and memory organization. The proposed VBS SRF ME and MRF ME hardware architectures are the first 1BT based VBS ME and MRF ME hardware architectures in the literature. The proposed MRF ME hardware is designed as reconfigurable in order to statically configure the number and selection of reference frames based on the application requirements. The proposed early terminated C-1BT ME hardware architecture is the first early terminated C-1BT ME hardware architecture in the literature. All of the proposed ME hardware architectures are implemented in Verilog HDL and mapped to Xilinx FPGAs. All FPGA implementations are verified with post place & route simulations

    Statistical lossless compression of space imagery and general data in a reconfigurable architecture

    Get PDF

    Low energy HEVC and VVC video compression hardware

    Get PDF
    Video compression standards compress a digital video by reducing and removing redundancy in the digital video using computationally complex algorithms. As spatial and temporal resolutions of videos increase, compression efficiencies of video compression algorithms are also increasing. However, increased compression efficiency comes with increased computational complexity. Therefore, it is necessary to reduce computational complexities of video compression algorithms without reducing their visual quality in order to reduce area and energy consumption of their hardware implementations. In this thesis, we propose a novel technique for reducing amount of computations performed by HEVC intra prediction algorithm. We designed low energy, reconfigurable HEVC intra prediction hardware using the proposed technique. We also designed a low energy FPGA implementation of HEVC intra prediction algorithm using the proposed technique and DSP blocks. We propose a reconfigurable VVC intra prediction hardware architecture. We also propose an efficient VVC intra prediction hardware architecture using DSP blocks. We designed low energy VVC fractional interpolation hardware. We propose a novel approximate absolute difference technique. We designed low energy approximate absolute difference hardware using the proposed technique. We propose a novel approximate constant multiplication technique. We designed approximate constant multiplication hardware using the proposed technique. We quantified computation reductions achieved by the proposed techniques and video quality loss caused by the proposed approximation techniques. The proposed approximate absolute difference technique and approximate constant multiplication technique cause very small PSNR loss. The other proposed techniques cause no PSNR loss. We implemented the proposed hardware architectures in Verilog HDL. We mapped the Verilog RTL codes to Xilinx Virtex 6 or Xilinx Virtex 7 FPGAs and estimated their power consumptions using Xilinx XPower Analyzer tool. The proposed techniques significantly reduced power and energy consumptions of these FPGA implementation

    Algorithms & implementation of advanced video coding standards

    Get PDF
    Advanced video coding standards have become widely deployed coding techniques used in numerous products, such as broadcast, video conference, mobile television and blu-ray disc, etc. New compression techniques are gradually included in video coding standards so that a 50% compression rate reduction is achievable every five years. However, the trend also has brought many problems, such as, dramatically increased computational complexity, co-existing multiple standards and gradually increased development time. To solve the above problems, this thesis intends to investigate efficient algorithms for the latest video coding standard, H.264/AVC. Two aspects of H.264/AVC standard are inspected in this thesis: (1) Speeding up intra4x4 prediction with parallel architecture. (2) Applying an efficient rate control algorithm based on deviation measure to intra frame. Another aim of this thesis is to work on low-complexity algorithms for MPEG-2 to H.264/AVC transcoder. Three main mapping algorithms and a computational complexity reduction algorithm are focused by this thesis: motion vector mapping, block mapping, field-frame mapping and efficient modes ranking algorithms. Finally, a new video coding framework methodology to reduce development time is examined. This thesis explores the implementation of MPEG-4 simple profile with the RVC framework. A key technique of automatically generating variable length decoder table is solved in this thesis. Moreover, another important video coding standard, DV/DVCPRO, is further modeled by RVC framework. Consequently, besides the available MPEG-4 simple profile and China audio/video standard, a new member is therefore added into the RVC framework family. A part of the research work presented in this thesis is targeted algorithms and implementation of video coding standards. In the wide topic, three main problems are investigated. The results show that the methodologies presented in this thesis are efficient and encourage

    Enhancing Real-time Embedded Image Processing Robustness on Reconfigurable Devices for Critical Applications

    Get PDF
    Nowadays, image processing is increasingly used in several application fields, such as biomedical, aerospace, or automotive. Within these fields, image processing is used to serve both non-critical and critical tasks. As example, in automotive, cameras are becoming key sensors in increasing car safety, driving assistance and driving comfort. They have been employed for infotainment (non-critical), as well as for some driver assistance tasks (critical), such as Forward Collision Avoidance, Intelligent Speed Control, or Pedestrian Detection. The complexity of these algorithms brings a challenge in real-time image processing systems, requiring high computing capacity, usually not available in processors for embedded systems. Hardware acceleration is therefore crucial, and devices such as Field Programmable Gate Arrays (FPGAs) best fit the growing demand of computational capabilities. These devices can assist embedded processors by significantly speeding-up computationally intensive software algorithms. Moreover, critical applications introduce strict requirements not only from the real-time constraints, but also from the device reliability and algorithm robustness points of view. Technology scaling is highlighting reliability problems related to aging phenomena, and to the increasing sensitivity of digital devices to external radiation events that can cause transient or even permanent faults. These faults can lead to wrong information processed or, in the worst case, to a dangerous system failure. In this context, the reconfigurable nature of FPGA devices can be exploited to increase the system reliability and robustness by leveraging Dynamic Partial Reconfiguration features. The research work presented in this thesis focuses on the development of techniques for implementing efficient and robust real-time embedded image processing hardware accelerators and systems for mission-critical applications. Three main challenges have been faced and will be discussed, along with proposed solutions, throughout the thesis: (i) achieving real-time performances, (ii) enhancing algorithm robustness, and (iii) increasing overall system's dependability. In order to ensure real-time performances, efficient FPGA-based hardware accelerators implementing selected image processing algorithms have been developed. Functionalities offered by the target technology, and algorithm's characteristics have been constantly taken into account while designing such accelerators, in order to efficiently tailor algorithm's operations to available hardware resources. On the other hand, the key idea for increasing image processing algorithms' robustness is to introduce self-adaptivity features at algorithm level, in order to maintain constant, or improve, the quality of results for a wide range of input conditions, that are not always fully predictable at design-time (e.g., noise level variations). This has been accomplished by measuring at run-time some characteristics of the input images, and then tuning the algorithm parameters based on such estimations. Dynamic reconfiguration features of modern reconfigurable FPGA have been extensively exploited in order to integrate run-time adaptivity into the designed hardware accelerators. Tools and methodologies have been also developed in order to increase the overall system dependability during reconfiguration processes, thus providing safe run-time adaptation mechanisms. In addition, taking into account the target technology and the environments in which the developed hardware accelerators and systems may be employed, dependability issues have been analyzed, leading to the development of a platform for quickly assessing the reliability and characterizing the behavior of hardware accelerators implemented on reconfigurable FPGAs when they are affected by such faults

    Low complexity hardware oriented H.264/AVC motion estimation algorithm and related low power and low cost architecture design

    Get PDF
    制度:新 ; 報告番号:甲2999号 ; 学位の種類:博士(工学) ; 授与年月日:2010/3/15 ; 早大学位記番号:新525

    Baseband analog front-end and digital back-end for reconfigurable multi-standard terminals

    Get PDF
    Multimedia applications are driving wireless network operators to add high-speed data services such as Edge (E-GPRS), WCDMA (UMTS) and WLAN (IEEE 802.11a,b,g) to the existing GSM network. This creates the need for multi-mode cellular handsets that support a wide range of communication standards, each with a different RF frequency, signal bandwidth, modulation scheme etc. This in turn generates several design challenges for the analog and digital building blocks of the physical layer. In addition to the above-mentioned protocols, mobile devices often include Bluetooth, GPS, FM-radio and TV services that can work concurrently with data and voice communication. Multi-mode, multi-band, and multi-standard mobile terminals must satisfy all these different requirements. Sharing and/or switching transceiver building blocks in these handsets is mandatory in order to extend battery life and/or reduce cost. Only adaptive circuits that are able to reconfigure themselves within the handover time can meet the design requirements of a single receiver or transmitter covering all the different standards while ensuring seamless inter-interoperability. This paper presents analog and digital base-band circuits that are able to support GSM (with Edge), WCDMA (UMTS), WLAN and Bluetooth using reconfigurable building blocks. The blocks can trade off power consumption for performance on the fly, depending on the standard to be supported and the required QoS (Quality of Service) leve

    Video Processing Acceleration using Reconfigurable Logic and Graphics Processors

    No full text
    A vexing question is `which architecture will prevail as the core feature of the next state of the art video processing system?' This thesis examines the substitutive and collaborative use of the two alternatives of the reconfigurable logic and graphics processor architectures. A structured approach to executing architecture comparison is presented - this includes a proposed `Three Axes of Algorithm Characterisation' scheme and a formulation of perfor- mance drivers. The approach is an appealing platform for clearly defining the problem, assumptions and results of a comparison. In this work it is used to resolve the advanta- geous factors of the graphics processor and reconfigurable logic for video processing, and the conditions determining which one is superior. The comparison results prompt the exploration of the customisable options for the graphics processor architecture. To clearly define the architectural design space, the graphics processor is first identifed as part of a wider scope of homogeneous multi-processing element (HoMPE) architectures. A novel exploration tool is described which is suited to the investigation of the customisable op- tions of HoMPE architectures. The tool adopts a systematic exploration approach and a high-level parameterisable system model, and is used to explore pre- and post-fabrication customisable options for the graphics processor. A positive result of the exploration is the proposal of a reconfigurable engine for data access (REDA) to optimise graphics processor performance for video processing-specific memory access patterns. REDA demonstrates the viability of the use of reconfigurable logic as collaborative `glue logic' in the graphics processor architecture
    corecore