
    High performance hardware architecture for half-pixel accurate H.264 motion estimation

    In this paper, we present a high-performance, low-cost hardware architecture for real-time implementation of half-pel accurate variable block size motion estimation for H.264 / MPEG4 Part 10 video coding. The proposed architecture includes novel half-pel interpolation hardware that is shared by novel half-pel search hardware units, one designed for each block size. This half-pel accurate motion estimation hardware is designed to be used as part of a complete H.264 video coding system for portable applications. The proposed architecture is implemented in Verilog HDL. The Verilog RTL code is verified to work at 85 MHz in a Xilinx Virtex II FPGA. The FPGA implementation can process 30 HDTV frames (1280x720) per second.
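
For readers unfamiliar with half-pel interpolation, the following is a minimal software sketch of the 6-tap luma filter that H.264 defines for half-sample positions, with taps (1, -5, 20, 20, -5, 1), rounding, and clipping to 8 bits. It is not the paper's hardware design. The function names and the edge-replication policy at row boundaries are assumptions made for illustration only.

```python
def clip_u8(value):
    """Clamp an integer to the 8-bit sample range 0..255."""
    return max(0, min(255, value))


def half_pel_row(row):
    """Horizontal half-pel samples between row[i] and row[i + 1].

    Applies the H.264 6-tap filter (1, -5, 20, 20, -5, 1) with
    rounding (+16) and a 5-bit right shift (divide by 32).
    Assumption: samples outside the row are edge-replicated.
    """
    n = len(row)

    def px(i):
        # Hypothetical boundary policy: replicate the nearest edge sample.
        return row[max(0, min(n - 1, i))]

    out = []
    for i in range(n - 1):
        acc = (px(i - 2) - 5 * px(i - 1) + 20 * px(i)
               + 20 * px(i + 1) - 5 * px(i + 2) + px(i + 3))
        out.append(clip_u8((acc + 16) >> 5))
    return out


if __name__ == "__main__":
    print(half_pel_row([0, 0, 0, 0, 255, 255, 255, 255]))
```

A flat row is left unchanged because the taps sum to 32, while a sharp edge produces overshoot that the clip step removes. A hardware implementation would typically compute several such sums in parallel, which is presumably why sharing one interpolator across block sizes saves area.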

    An efficient hardware architecture for H.264 intra prediction algorithm

    In this paper, we present an efficient hardware architecture for real-time implementation of the intra prediction algorithm used in the H.264 / MPEG4 Part 10 video coding standard. The hardware design is based on a novel organization of the intra prediction equations. This hardware is designed to be used as part of a complete H.264 video coding system for portable applications. The proposed architecture is implemented in Verilog HDL. The Verilog RTL code is verified to work at 90 MHz in a Xilinx Virtex II FPGA. The FPGA implementation can process 27 VGA frames (640x480) per second.

    An efficient hardware architecture for H.264 adaptive deblocking filter algorithm

    This paper presents an efficient hardware architecture for real-time implementation of the adaptive deblocking filter algorithm used in the H.264 video coding standard. This hardware is designed to be used as part of a complete H.264 video coding system for portable applications. We use a novel edge filter ordering within a macroblock so that the deblocking filter hardware does not wait unnecessarily for the pixels to be filtered to become available. The proposed architecture is implemented in Verilog HDL. The Verilog RTL code is verified to work at 72 MHz in a Xilinx Virtex II FPGA. The FPGA implementation can code 30 CIF frames (352x288) per second.
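
The clock rates and frame rates quoted in the three H.264 abstracts above imply a cycle budget per 16x16 macroblock. The sketch below works through that arithmetic. It assumes that the quoted frame rate is reached at the quoted clock frequency and that every clock cycle is available to the module, neither of which the abstracts state explicitly, so the results are back-of-envelope upper bounds rather than reported figures. The function name is hypothetical.

```python
def cycles_per_macroblock(clock_hz, width, height, fps):
    """Average clock cycles available per 16x16 macroblock.

    Assumes the frame dimensions are multiples of 16, which holds
    for the three resolutions used below.
    """
    macroblocks_per_frame = (width // 16) * (height // 16)
    return clock_hz / (macroblocks_per_frame * fps)


if __name__ == "__main__":
    designs = [
        ("half-pel motion estimation", 85_000_000, 1280, 720, 30),
        ("intra prediction", 90_000_000, 640, 480, 27),
        ("deblocking filter", 72_000_000, 352, 288, 30),
    ]
    for name, clock_hz, width, height, fps in designs:
        budget = cycles_per_macroblock(clock_hz, width, height, fps)
        print(f"{name}: about {budget:.0f} cycles per macroblock")
```

Under these assumptions the budgets come to roughly 787, 2778, and 6061 cycles per macroblock respectively.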

    Dynamic reconfiguration technologies based on FPGA in software defined radio system

    Partial Reconfiguration (PR) is a method for Field Programmable Gate Array (FPGA) designs that allows multiple applications to time-share a portion of an FPGA while the rest of the device continues to operate unaffected. Using this strategy, the physical layer processing architecture in Software Defined Radio (SDR) systems can benefit from reduced complexity and increased design flexibility, as different waveform applications can be grouped into one part of a single FPGA. Waveform switching often means changing not only the functionality but also the FPGA clock frequency. However, that is beyond the current capability of PR processes, because clock components such as Digital Clock Managers (DCMs) are excluded from partial reconfiguration. In this paper, we present a novel architecture that combines PR with another reconfiguration technology, the Dynamic Reconfigurable Port (DRP), on a single FPGA in order to dynamically change both the functionality and the clock frequency. The architecture is demonstrated to reduce hardware utilization significantly compared with a standard, static FPGA design.

    Designing with RoBs for High Performance VLIW Architecture

    VLIW architecture has become widespread due to the combined benefits of simple hardware and compiler-extracted instruction level parallelism (ILP). The VLIW instruction set architecture and its hardware implementation are tightly coupled. A novel simultaneous multithreading VLIW architecture with a dynamic dispatch mechanism, which uses complex RoB logic to maximize ILP, has been proposed. Since the resulting dynamic instruction schedule of many applications seldom changes, it is reasonable to store and reuse the schedule instead of reconstructing it each time. The new VLIW architecture is shown to increase processor efficiency, which improves performance.
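
The idea of storing and reusing an instruction schedule can be illustrated with a toy software model. This is only a sketch under stated assumptions and does not reflect the paper's actual mechanism: the greedy in-order bundler, the `ScheduleCache` class, and keying the cache by a block's start address are all hypothetical choices made for illustration.

```python
def build_schedule(block, width):
    """Pack instructions into bundles of at most `width` slots.

    `block` is a list of (dest, sources) tuples. A new bundle is
    started when an instruction reads or rewrites a register that
    the current bundle already writes (toy hazard model).
    """
    bundles, current, written = [], [], set()
    for index, (dest, sources) in enumerate(block):
        hazard = dest in written or any(src in written for src in sources)
        if hazard or len(current) == width:
            bundles.append(current)
            current, written = [], set()
        current.append(index)
        written.add(dest)
    if current:
        bundles.append(current)
    return bundles


class ScheduleCache:
    """Remembers the schedule built for each block start address."""

    def __init__(self, width):
        self.width = width
        self.table = {}
        self.hits = 0
        self.misses = 0

    def lookup(self, address, block):
        if address in self.table:
            self.hits += 1  # schedule reused, no rebuild needed
        else:
            self.misses += 1  # first visit: build and store the schedule
            self.table[address] = build_schedule(block, self.width)
        return self.table[address]


if __name__ == "__main__":
    block = [("r1", ()), ("r2", ()), ("r3", ("r1", "r2")), ("r4", ("r3",))]
    cache = ScheduleCache(width=2)
    for _ in range(3):
        schedule = cache.lookup(0x100, block)
    print(schedule, cache.hits, cache.misses)
```

In this model, a loop body that is executed repeatedly pays the scheduling cost once and hits the cache on every later iteration, which is the intuition behind reusing a schedule that seldom changes.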