3 research outputs found

    Fast HEVC Intramode Decision Based on Hybrid Cost Ranking

    To improve rate-distortion (R-D) performance, High Efficiency Video Coding (HEVC) increases the number of intraprediction modes at the cost of a heavy computational load, so intracoding optimization is in high demand for real-time applications. Based on the conditional probabilities of the most probable modes and the correlation of potential candidate subsets, this paper proposes a fast HEVC intramode decision scheme based on hybrid cost ranking, which includes both Hadamard cost and rate-distortion cost. The proposed scheme uses the coded results of the modified rough mode decision and the neighboring prediction units to obtain a potential candidate subset, and then conditionally selects the optimal mode through early likelihood decision and hybrid cost ranking. Following an experiment-driven methodology, the proposed scheme terminates early if the best mode from the candidate subset is equal to one or two neighboring intramodes. The experimental results demonstrate that the proposed scheme provides, on average, about 23.7% encoding speedup with just 0.82% BD-rate loss in comparison with the default fast intramode decision in HM16.0. Compared to other fast intramode decision schemes, the proposed scheme also significantly reduces intracoding time while maintaining similar R-D performance for the all-intra configuration in the HM16.0 Main profile.

    On the Effectiveness of Video Recolouring as an Uplink-model Video Coding Technique

    For decades, conventional video compression formats have advanced via incremental improvements, with each subsequent standard achieving better rate-distortion (RD) efficiency at the cost of increased encoder complexity compared to its predecessors. Design efforts have been driven by common multimedia use cases such as video-on-demand, teleconferencing, and video streaming, where the most important requirements are low bandwidth and low video playback latency. Meeting these requirements involves the use of computationally expensive block-matching algorithms which produce excellent compression rates and quick decoding times. However, emerging use cases such as Wireless Video Sensor Networks, remote surveillance, and mobile video present new technical challenges in video compression. In these scenarios, the video capture and encoding devices are often power-constrained and have limited computational resources available, while the decoder devices have abundant resources and access to a dedicated power source. To address these use cases, codecs must be power-aware and offer a reasonable trade-off between video quality, bitrate, and encoder complexity. Balancing these constraints requires a complete rethinking of video compression technology. The uplink video-coding model represents a new paradigm to address these low-power use cases, providing the ability to redistribute computational complexity by offloading the motion estimation and compensation steps from encoder to decoder. Distributed Video Coding (DVC) follows this uplink model of video codec design, and maintains high quality video reconstruction through innovative channel coding techniques. The field of DVC is still early in its development, with many open problems waiting to be solved, and no defined video compression or distribution standards.
Due to the experimental nature of the field, most DVC codecs to date have focused on encoding and decoding the Luma plane only, which produces grayscale reconstructed videos. In this thesis, a technique called “video recolouring” is examined as an alternative to DVC. Video recolouring exploits the temporal redundancies between colour planes, reducing video bitrate by removing Chroma information from specific frames and then recolouring them at the decoder. A novel video recolouring algorithm called Motion-Compensated Recolouring (MCR) is proposed, which uses block motion estimation and bi-directional weighted motion-compensation to reconstruct Chroma planes at the decoder. MCR is used to enhance a conventional base-layer codec, and is shown to reduce bitrate by up to 16% with only a slight decrease in objective quality. MCR also outperforms other video recolouring algorithms in terms of objective video quality, demonstrating up to 2 dB PSNR improvement in some cases.

    Selected topics in video coding and computer vision

    Video applications ranging from multimedia communication to computer vision have been extensively studied in the past decades. However, the emergence of new applications continues to raise questions that are only partially answered by existing techniques. This thesis studies three selected topics related to video: intra prediction in block-based video coding, pedestrian detection and tracking in infrared imagery, and multi-view video alignment. In the state-of-the-art video coding standard H.264/AVC, intra prediction is defined on the hierarchical quad-tree based block partitioning structure, which fails to exploit the geometric constraint of edges. We propose a geometry-adaptive block partitioning structure and a new intra prediction algorithm named geometry-adaptive intra prediction (GAIP). A new texture prediction algorithm named geometry-adaptive intra displacement prediction (GAIDP) is also developed by extending the original intra displacement prediction (IDP) algorithm with the geometry-adaptive block partitions. Simulations on various test sequences demonstrate that intra coding performance of H.264/AVC can be significantly improved by incorporating the proposed geometry-adaptive algorithms. In recent years, due to the decreasing cost of thermal sensors, pedestrian detection and tracking in infrared imagery has become a topic of interest for night vision and all-weather surveillance applications. We propose a novel approach for detecting and tracking pedestrians in infrared imagery based on a layered representation of infrared images. Pedestrians are detected from the foreground layer by a Principal Component Analysis (PCA) based scheme using the appearance cue. To facilitate the task of pedestrian tracking, we formulate the problem of shot segmentation and present a graph matching-based tracking algorithm.
Simulations with both the OSU Infrared Image Database and the WVU Infrared Video Database are reported to demonstrate the accuracy and robustness of our algorithms. Multi-view video alignment is a process to facilitate the fusion of non-synchronized multi-view video sequences for various applications including automatic video-based surveillance and video metrology. In this thesis, we propose an accurate multi-view video alignment algorithm that iteratively aligns two sequences in space and time. To achieve an accurate sub-frame temporal alignment, we generalize the existing phase-correlation algorithm to the 3-D case. We also present a novel method to obtain the ground-truth of the temporal alignment by using supplementary audio signals sampled at a much higher rate. The accuracy of our algorithm is verified by simulations using real-world sequences.