33 research outputs found

    Adaptive motion estimation algorithm and hardware designs for H.264 multiview video coding

    Get PDF
    Multiview Video Coding (MVC) is the process of efficiently compressing stereo (2 views) or multiview video signals. The improved compression efficiency achieved by H.264 MVC comes with a significant increase in computational complexity. Therefore, in this thesis, we propose novel techniques for significantly reducing the amount of computations performed by full search motion estimation algorithm for H.264 MVC, and therefore significantly reducing the energy consumption of full search motion estimation hardware for H.264 MVC with very small PSNR loss and bitrate increase. We also propose an adaptive fast motion estimation algorithm for reducing the amount of computations performed by H.264 MVC motion estimation, and therefore reducing the energy consumption of H.264 MVC motion estimation hardware even more with additional very small PSNR loss and bitrate increase. We also propose an adaptive H.264 MVC motion estimation hardware for implementing the proposed adaptive fast motion estimation algorithm. The proposed motion estimation hardware is implemented in Verilog HDL and mapped to a Xilinx Virtex-6 FPGA. The proposed motion estimation hardware has less energy consumption than the full search motion estimation hardware for H.264 MVC and the full search motion estimation hardware for H.264 MVC including the proposed computation reduction techniques

    High Performance Multiview Video Coding

    Get PDF
    Following the standardization of the latest video coding standard High Efficiency Video Coding in 2013, in 2014, multiview extension of HEVC (MV-HEVC) was published and brought significantly better compression performance of around 50% for multiview and 3D videos compared to multiple independent single-view HEVC coding. However, the extremely high computational complexity of MV-HEVC demands significant optimization of the encoder. To tackle this problem, this work investigates the possibilities of using modern parallel computing platforms and tools such as single-instruction-multiple-data (SIMD) instructions, multi-core CPU, massively parallel GPU, and computer cluster to significantly enhance the MVC encoder performance. The aforementioned computing tools have very different computing characteristics and misuse of the tools may result in poor performance improvement and sometimes even reduction. To achieve the best possible encoding performance from modern computing tools, different levels of parallelism inside a typical MVC encoder are identified and analyzed. Novel optimization techniques at various levels of abstraction are proposed, non-aggregation massively parallel motion estimation (ME) and disparity estimation (DE) in prediction unit (PU), fractional and bi-directional ME/DE acceleration through SIMD, quantization parameter (QP)-based early termination for coding tree unit (CTU), optimized resource-scheduled wave-front parallel processing for CTU, and workload balanced, cluster-based multiple-view parallel are proposed. The result shows proposed parallel optimization techniques, with insignificant loss to coding efficiency, significantly improves the execution time performance. This , in turn, proves modern parallel computing platforms, with appropriate platform-specific algorithm design, are valuable tools for improving the performance of computationally intensive applications

    Low energy motion estimation hardware designs for h.264 multiview video coding

    Get PDF
    Multiview Video Coding (MVC) is the process of efficiently compressing stereo (2 views) or multiview video signals. The improved compression efficiency achieved by H.264 MVC comes with a significant increase in computational complexity. Temporal prediction and inter-view prediction are the most computationally intensive parts of H.264 MVC. Therefore, in this thesis, we propose an H.264 MVC full search motion estimation hardware for implementing the temporal and inter-view predictions including several novel energy reduction techniques. The proposed motion estimation hardware is implemented in Verilog HDL and mapped to a Xilinx Virtex-6 FPGA. The FPGA implementation is capable of processing 60 frames per second of VGA size stereo view video sequence. It consumes 65% less energy than H.264 MVC full search motion estimation hardware not including the novel energy reduction techniques with very small PSNR loss and bitrate increase. We also propose a vector prediction based fast motion estimation algorithm for reducing the energy consumption of H.264 MVC motion estimation hardware with additional very small PSNR loss and bitrate increase. We also propose an H.264 MVC motion estimation hardware for implementing the proposed fast motion estimation algorithm. The proposed motion estimation hardware is implemented in Verilog HDL and mapped to a Xilinx Virtex-6 FPGA. The FPGA implementation is capable of processing 92 frames per second of VGA size three view video sequence. It consumes 91% less energy than H.264 MVC full search motion estimation hardware not including the novel energy reduction techniques with very small PSNR loss and bitrate increase

    Three-dimensional video coding on mobile platforms

    Get PDF
    Ankara : The Department of Electrical and Electronics Engineering and the Institute of Engineering and Sciences of Bilkent University, 2009.Thesis (Master's) -- Bilkent University, 2009.Includes bibliographical references leaves 83-87.With the evolution of the wireless communication technologies and the multimedia capabilities of the mobile phones, it is expected that three-dimensional (3D) video technologies will soon get adapted to the mobile phones. This raises the problem of choosing the best 3D video representation and the most efficient coding method for the selected representation for mobile platforms. Since the latest 2D video coding standard, H.264/MPEG-4 AVC, provides better coding efficiency over its predecessors, coding methods of the most common 3D video representations are based on this standard. Among the most common 3D video representations, there are multi-view video, video plus depth, multi-view video plus depth and layered depth video. For using on mobile platforms, we selected the conventional stereo video (CSV), which is a special case of multi-view video, since it is the simplest among the available representations. To determine the best coding method for CSV, we compared the simulcast coding, multi-view coding (MVC) and mixed-resolution stereoscopic coding (MRSC) without inter-view prediction, with subjective tests using simple coding schemes. From these tests, MVC is found to provide the best visual quality for the testbed we used, but MRSC without inter-view prediction still came out to be promising for some of the test sequences and especially for low bit rates. Then we adapted the Joint Video Team’s reference multi-view decoder to run on ZOOMTM OMAP34xTM Mobile Development Kit (MDK). The first decoding performance tests on the MDK resulted with around four stereo frames per second with frame resolutions of 640×352. To further improve the performance, the decoder software is profiled and the most demanding algorithms are ported to run on the embedded DSP core. Tests resulted with performance gains ranging from 25% to 60% on the DSP core. However, due to the design of the hardware platform and the structure of the reference decoder, the time spent for the communication link between the main processing unit and the DSP core is found to be high, leaving the performance gains insignificant. For this reason, it is concluded that the reference decoder should be restructured to use this communication link as infrequently as possible in order to achieve overall performance gains by using the DSP core.Bal, CanM.S

    Autonomous Recovery Of Reconfigurable Logic Devices Using Priority Escalation Of Slack

    Get PDF
    Field Programmable Gate Array (FPGA) devices offer a suitable platform for survivable hardware architectures in mission-critical systems. In this dissertation, active dynamic redundancy-based fault-handling techniques are proposed which exploit the dynamic partial reconfiguration capability of SRAM-based FPGAs. Self-adaptation is realized by employing reconfiguration in detection, diagnosis, and recovery phases. To extend these concepts to semiconductor aging and process variation in the deep submicron era, resilient adaptable processing systems are sought to maintain quality and throughput requirements despite the vulnerabilities of the underlying computational devices. A new approach to autonomous fault-handling which addresses these goals is developed using only a uniplex hardware arrangement. It operates by observing a health metric to achieve Fault Demotion using Recon- figurable Slack (FaDReS). Here an autonomous fault isolation scheme is employed which neither requires test vectors nor suspends the computational throughput, but instead observes the value of a health metric based on runtime input. The deterministic flow of the fault isolation scheme guarantees success in a bounded number of reconfigurations of the FPGA fabric. FaDReS is then extended to the Priority Using Resource Escalation (PURE) online redundancy scheme which considers fault-isolation latency and throughput trade-offs under a dynamic spare arrangement. While deep-submicron designs introduce new challenges, use of adaptive techniques are seen to provide several promising avenues for improving resilience. The scheme developed is demonstrated by hardware design of various signal processing circuits and their implementation on a Xilinx Virtex-4 FPGA device. These include a Discrete Cosine Transform (DCT) core, Motion Estimation (ME) engine, Finite Impulse Response (FIR) Filter, Support Vector Machine (SVM), and Advanced Encryption Standard (AES) blocks in addition to MCNC benchmark circuits. A iii significant reduction in power consumption is achieved ranging from 83% for low motion-activity scenes to 12.5% for high motion activity video scenes in a novel ME engine configuration. For a typical benchmark video sequence, PURE is shown to maintain a PSNR baseline near 32dB. The diagnosability, reconfiguration latency, and resource overhead of each approach is analyzed. Compared to previous alternatives, PURE maintains a PSNR within a difference of 4.02dB to 6.67dB from the fault-free baseline by escalating healthy resources to higher-priority signal processing functions. The results indicate the benefits of priority-aware resiliency over conventional redundancy approaches in terms of fault-recovery, power consumption, and resource-area requirements. Together, these provide a broad range of strategies to achieve autonomous recovery of reconfigurable logic devices under a variety of constraints, operating conditions, and optimization criteria

    Real-Time High-Resolution Multiple-Camera Depth Map Estimation Hardware and Its Applications

    Get PDF
    Depth information is used in a variety of 3D based signal processing applications such as autonomous navigation of robots and driving systems, object detection and tracking, computer games, 3D television, and free view-point synthesis. These applications require high accuracy and speed performances for depth estimation. Depth maps can be generated using disparity estimation methods, which are obtained from stereo matching between multiple images. The computational complexity of disparity estimation algorithms and the need of large size and bandwidth for the external and internal memory make the real-time processing of disparity estimation challenging, especially for high resolution images. This thesis proposes a high-resolution high-quality multiple-camera depth map estimation hardware. The proposed hardware is verified in real-time with a complete system from the initial image capture to the display and applications. The details of the complete system are presented. The proposed binocular and trinocular adaptive window size disparity estimation algorithms are carefully designed to be suitable to real-time hardware implementation by allowing efficient parallel and local processing while providing high-quality results. The proposed binocular and trinocular disparity estimation hardware implementations can process 55 frames per second on a Virtex-7 FPGA at a 1024 x 768 XGA video resolution for a 128 pixel disparity range. The proposed binocular disparity estimation hardware provides best quality compared to existing real-time high-resolution disparity estimation hardware implementations. A novel compressed-look up table based rectification algorithm and its real-time hardware implementation are presented. The low-complexity decompression process of the rectification hardware utilizes a negligible amount of LUT and DFF resources of the FPGA while it does not require the existence of external memory. The first real-time high-resolution free viewpoint synthesis hardware utilizing three-camera disparity estimation is presented. The proposed hardware generates high-quality free viewpoint video in real-time for any horizontally aligned arbitrary camera positioned between the leftmost and rightmost physical cameras. The full embedded system of the depth estimation is explained. The presented embedded system transfers disparity results together with synchronized RGB pixels to the PC for application development. Several real-time applications are developed on a PC using the obtained RGB+D results. The implemented depth estimation based real-time software applications are: depth based image thresholding, speed and distance measurement, head-hands-shoulders tracking, virtual mouse using hand tracking and face tracking integrated with free viewpoint synthesis. The proposed binocular disparity estimation hardware is implemented in an ASIC. The ASIC implementation of disparity estimation imposes additional constraints with respect to the FPGA implementation. These restrictions, their implemented efficient solutions and the ASIC implementation results are presented. In addition, a very high-resolution (82.3 MP) 360°x90° omnidirectional multiple camera system is proposed. The hemispherical camera system is able to view the target locations close to horizontal plane with more than two cameras. Therefore, it can be used in high-resolution 360° depth map estimation and its applications in the future

    Towards Computational Efficiency of Next Generation Multimedia Systems

    Get PDF
    To address throughput demands of complex applications (like Multimedia), a next-generation system designer needs to co-design and co-optimize the hardware and software layers. Hardware/software knobs must be tuned in synergy to increase the throughput efficiency. This thesis provides such algorithmic and architectural solutions, while considering the new technology challenges (power-cap and memory aging). The goal is to maximize the throughput efficiency, under timing- and hardware-constraints

    Towards a Common Software/Hardware Methodology for Future Advanced Driver Assistance Systems

    Get PDF
    The European research project DESERVE (DEvelopment platform for Safe and Efficient dRiVE, 2012-2015) had the aim of designing and developing a platform tool to cope with the continuously increasing complexity and the simultaneous need to reduce cost for future embedded Advanced Driver Assistance Systems (ADAS). For this purpose, the DESERVE platform profits from cross-domain software reuse, standardization of automotive software component interfaces, and easy but safety-compliant integration of heterogeneous modules. This enables the development of a new generation of ADAS applications, which challengingly combine different functions, sensors, actuators, hardware platforms, and Human Machine Interfaces (HMI). This book presents the different results of the DESERVE project concerning the ADAS development platform, test case functions, and validation and evaluation of different approaches. The reader is invited to substantiate the content of this book with the deliverables published during the DESERVE project. Technical topics discussed in this book include:Modern ADAS development platforms;Design space exploration;Driving modelling;Video-based and Radar-based ADAS functions;HMI for ADAS;Vehicle-hardware-in-the-loop validation system
    corecore