77 research outputs found

    Low complexity hardware oriented H.264/AVC motion estimation algorithm and related low power and low cost architecture design

    Get PDF
    制度:新 ; 報告番号:甲2999号 ; 学位の種類:博士(工学) ; 授与年月日:2010/3/15 ; 早大学位記番号:新525

    Low-power and application-specific SRAM design for energy-efficient motion estimation

    Get PDF
    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2012.Cataloged from PDF version of thesis.Includes bibliographical references (p. 181-189).Video content is expected to account for 70% of total mobile data traffic in 2015. High efficiency video coding, in this context, is crucial for lowering the transmission and storage costs for portable electronics. However, modern video coding standards impose a large hardware complexity. Hence, energy-efficiency of these hardware blocks is becoming more critical than ever before for mobile devices. SRAMs are critical components in almost all SoCs affecting the overall energy-efficiency. This thesis focuses on algorithm and architecture development as well as low-power and application-specific SRAM design targeting motion estimation. First, a motion estimation design is considered for the next generation video standard, HEVC. Hardware cost and coding efficiency trade-offs are quantified and an optimum design choice between hardware complexity and coding efficiency is proposed. Hardware-efficient search algorithm, shared search range across CU engines and pixel pre-fetching algorithms provide 4.3x area, 56x on-chip bandwidth and 151 x off-chip bandwidth reduction. Second, a highly-parallel motion estimation design targeting ultra-low voltage operation and supporting AVC/H.264 and VC-1 standards are considered. Hardware reconfigurability along with frame and macro-block parallel processing are implemented for this engine to maximize hardware sharing between multiple standards and to meet throughput constraints. Third, in the context of low-power SRAMs, a 6T and an 8T SRAM are designed in 28nm and 45nm CMOS technologies targeting low voltage operation. The 6T design achieves operation down to 0.6V and the 8T design achieves operation down to 0.5V providing ~ 2.8x and ~ 4.8x reduction in energy/access respectively. Finally, an application-specific SRAM design targeted for motion estimation is developed. Utilizing the correlation of pixel data to reduce bit-line switching activity, this SRAM achieves up to 1.9x energy savings compared to a similar conventional 8T design. These savings demonstrate that application-specific SRAM design can introduce a new dimension and can be combined with voltage scaling to maximize energy-efficiency.by Mahmut Ersin Sinangil.Ph.D

    Implementing video compression algorithms on reconfigurable devices

    Get PDF
    The increasing density offered by Field Programmable Gate Arrays(FPGA), coupled with their short design cycle, has made them a popular choice for implementing a wide range of algorithms and complete systems. In this thesis the implementation of video compression algorithms on FPGAs is studied. Two areas are specifically focused on; the integration of a video encoder into a complete system and the power consumption of FPGA based video encoders. Two FPGA based video compression systems are described, one which targets surveillance applications and one which targets video conferencing applications. The FPGA video surveillance system makes use of a novel memory format to improve the efficiency with which input video sequences can be loaded over the system bus. The power consumption of a FPGA video encoder is analyzed. The results indicating that the motion estimation encoder stage requires the most power consumption. An algorithm, which reuses the intra prediction results generated during the encoding process, is then proposed to reduce the power consumed on an FPGA video encoder’s external memory bus. Finally, the power reduction algorithm is implemented within an FPGA video encoder. Results are given showing that, in addition to reducing power on the external memory bus, the algorithm also reduces power in the motion estimation stage of a FPGA based video encoder

    Low Power Architectures for MPEG-4 AVC/H.264 Video Compression

    Get PDF

    Motion estimation for H.264/AVC on multiple GPUs using NVIDIA CUDA

    Get PDF
    To achieve the high coding efficiency the H.264/AVC standard offers, the encoding process quickly becomes computationally demanding. One of the most intensive encoding phases is motion estimation. Even modern CPUs struggle to process high-definition video sequences in real-time. While personal computers are typically equipped with powerful Graphics Processing Units (GPUs) to accelerate graphics operations, these GPUs lie dormant when encoding a video sequence. Furthermore, recent developments show more and more computer configurations come with multiple GPUs. However, no existing GPU-enabled motion estimation architectures target multiple GPUs. In addition, these architectures provide no early-out behavior nor can they enforce a specific processing order. We developed a motion search architecture, capable of executing motion estimation and partitioning for an H.264/AVC sequence entirely on the GPU using the NVIDIA CUDA (Compute Unified Device Architecture) platform. This paper describes our architecture and presents a novel job scheduling system we designed, making it possible to control the GPU in a flexible way. This job scheduling system can enforce real-time demands of the video encoder by prioritizing calculations and providing an early-out mode. Furthermore, the job scheduling system allows the use of multiple GPUs in one computer system and efficient load balancing of the motion search over these GPUs. This paper focuses on the execution speed of the novel job scheduling system on both single and multi-GPU systems. Initial results show that real-time full motion search of 720p high-definition content is possible with a 32 by 32 search window running on a system with four GPUs

    Implementation And Optimizaton Of Real-time H.264 Baseline Encoder On Tms320dm642 Dsp

    Get PDF
    Tez (Yüksek Lisans) -- İstanbul Teknik Üniversitesi, Fen Bilimleri Enstitüsü, 2007Thesis (M.Sc.) -- İstanbul Technical University, Institute of Science and Technology, 2007Günümüzde sayısal video kodlama sayısal gözetim sistemleri, video konferans, mobil uygulamalar ve video yayını gibi bir çok uygulamada zorunlu hale gelmiştir. Uluslararası bir video sıkıştırma standardı olan H.264/MPEG-4 bölüm 10, daha önceki standartlara göre kodlama verimini iyileştirmek amacıyla geliştirilmiştir. Fakat, bu kodlama geliştirmesi beraberinde kodlama karmaşıklığının da artmasına yol açmaktadır. Bu tez çalışmasında Texas Instruments TMS320DM642 sayısal sinyal işleyici üzerinde H.264 temel profil kodlayıcı gerçeklenmiştir. DM642 DSP çekirdeği üzerindeki gerçek zamanlı H.264/AVC kodlayıcı uygulaması hata esnekliği araçları ve çeyrek piksel hareket dengeleme dışında standart tüm H.264/AVC temel profil kodlama araçlarını sunmaktadır. Çeyrek piksel hareket dengelem yerine, tüm parlaklılık ve renklik bileşenleri için tam sayı ve yarım piksel pozisyonlarında hareket kestirim ve dengeleme gerçeklenmiştir. Kullanılan DM642 DSP çekirdeği platformu, 2-seviyeli bellek/önbellek aşama düzenine sahip ve VLIW içeren yüksek performanslı sayısal işlemci olarak tasarlanmıştır. Sunulan H.264 temel kodlayıcı sistemin gerçeklenmesi ve eniyilemesi bu tezin konusudur. Üstelik, algoritma bazlı, mimari ve bellek stratejilerini içeren eniyileme çalışma fazları detaylarıyla açıklanmaktadır. H.264/AVC video kodlayıcının hem geliştirme ortamında hem de DM642 EVM donanım ortamında çalışması doğrulanmıştır. Kısaca, kodlayıcı sisteme giriş olan CIF çözünürlükte sıkıştırılmamış YUV video dizisi H.264 Annex-B dosya biçiminde ve de ekrana video çıktı verilerek sıkıştırılmaktadır. Ek olarak, kodlayıcı çıktısı H.264 referans yazılımla doğruluğu kontrol edilmiş ve uyumluluğu kanıtlanmıştır.Recently, digital video coding is mandatory in many applications such as digital surveillance systems, video conferencing, mobile applications as well as video broadcasts. The H.264/MPEG-4 Part 10, an international video compression standard, is developed for improving the coding efficiency compared to previous standards. However, the coding improvement comes with an increase in coding complexity. In this thesis, an H.264 baseline profile encoder is implemented on Texas Instruments TMS320DM642 digital signal processor. The real-time implementation of the H.264/AVC encoder on DM642 DSP core offers most of the standard H.264/AVC baseline profile coding tools except error resiliency tools and quarter-pel motion estimation. Instead of quarter-pel motion compensation, integer and half pixel position motion estimation and compensation for all luminance and chrominance components are implemented. The target platform, DM64 DSP core, is designed as a high-performance digital media processor with two-level memory/cache hierarchy and VLIW architecture. The subject of the thesis is H.264 baseline encoder system realization and optimization on the target platform. Moreover, the study of optimization phases covering algorithmic, architectural and memory strategies are clarified in details. The H.264/AVC encoder system is verified both to execute on the development workstation and DM642 EVM (Evaluation Module) hardware platform. Briefly, the uncompressed input of a YUV video sequence with CIF resolution to the encoder system is compressed to H.264 Annex-B file format and displayed on screen. Additionally, the encoder output is verified with H.264 reference software and the compliancy is proven.Yüksek LisansM.Sc

    Adaptive Computationally Scalable Motion Estimation for the Hardware H.264/AVC Encoder

    Full text link
    corecore