77 research outputs found

    Region-Based Template Matching Prediction for Intra Coding

    Copy prediction is a renowned category of prediction techniques in video coding where the current block is predicted by copying the samples from a similar block that is present somewhere in the already decoded stream of samples. Motion-compensated prediction, intra block copy, template matching prediction etc. are examples. While the displacement information of the similar block is transmitted to the decoder in the bit-stream in the first two approaches, it is derived at the decoder in the last one by repeating the same search algorithm which was carried out at the encoder. Region-based template matching is a recently developed prediction algorithm that is an advanced form of standard template matching. In this method, the reference area is partitioned into multiple regions and the region to be searched for the similar block(s) is conveyed to the decoder in the bit-stream. Further, its final prediction signal is a linear combination of already decoded similar blocks from the given region. It was demonstrated in previous publications that region-based template matching is capable of achieving coding efficiency improvements for intra as well as inter-picture coding with considerably less decoder complexity than conventional template matching. In this paper, a theoretical justification for region-based template matching prediction subject to experimental data is presented. Additionally, the test results of the aforementioned method on the latest H.266/Versatile Video Coding (VVC) test model (version VTM-14.0) yield an average Bjøntegaard-Delta (BD) bit-rate savings of −0.75% using all intra (AI) configuration with 130% encoder run-time and 104% decoder run-time for a particular parameter selection.
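
The idea in this abstract can be illustrated with a minimal sketch; it is not the authors' implementation. It assumes an L-shaped template of thickness `t`, a sum-of-absolute-differences cost, an equal-weight average as the linear combination, and a simplified causality rule (only blocks lying fully above the current block count as decoded). The names `template`, `sad`, `predict_block` and the `region` tuple are hypothetical.

```python
def template(img, y, x, b, t):
    """L-shaped template of the b-by-b block at (y, x): t rows above and t columns to the left."""
    top = [img[r][c] for r in range(y - t, y) for c in range(x - t, x + b)]
    left = [img[r][c] for r in range(y, y + b) for c in range(x - t, x)]
    return top + left


def sad(a, b):
    """Sum of absolute differences between two equally long sample lists."""
    return sum(abs(p - q) for p, q in zip(a, b))


def predict_block(img, y, x, b, t, region, k):
    """Predict the block at (y, x) from the k best template matches inside `region`.

    `region` = (y0, y1, x0, x1) bounds the candidate top-left corners and stands in
    for the region that the encoder would signal in the bit-stream (assumption).
    """
    y0, y1, x0, x1 = region
    width = len(img[0])
    target = template(img, y, x, b, t)
    scored = []
    for cy in range(max(y0, t), y1):
        if cy + b > y:  # assumption: only blocks fully above the current block are decoded
            break
        for cx in range(max(x0, t), min(x1, width - b + 1)):
            scored.append((sad(target, template(img, cy, cx, b, t)), cy, cx))
    best = sorted(scored)[:k]
    if not best:
        return None
    n = len(best)
    # Equal-weight linear combination of the matched blocks, rounded to integers.
    return [[(sum(img[cy + r][cx + c] for _, cy, cx in best) + n // 2) // n
             for c in range(b)] for r in range(b)]


# Toy picture with period 4 in both directions, so exact matches exist.
img = [[(r % 4) * 4 + (c % 4) for c in range(12)] for r in range(12)]
pred = predict_block(img, y=8, x=8, b=2, t=1, region=(0, 12, 0, 12), k=2)
actual = [row[8:10] for row in img[8:10]]
```

Because the decoder can rerun the same search on the same decoded samples, only the region choice would need to be transmitted; restricting the search to one region is also what keeps the decoder-side search small.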

    Motion compensated video coding

    The result of many years of international co-operation in video coding has been the development of algorithms that remove interframe redundancy, such that only changes in the image that occur over a given time are encoded for transmission to the recipient. The primary process used here is the derivation of pixel differences, encoded in a method referred to as Differential Pulse-Code Modulation (DPCM), and this has provided the basis of contemporary research into low-bitrate hybrid codec schemes. There are, however, instances when the DPCM technique cannot successfully code a segment of the image sequence because motion is a major cause of interframe differences. Motion Compensation (MC) can be used to improve the efficiency of the predictive coding algorithm. This thesis examines current thinking in the area of motion-compensated video compression and contrasts the application of differing algorithms to the general requirements of interframe coding. A novel technique is proposed, where the constituent features in an image are segmented, classified and their motion tracked by a local search algorithm. Although originally intended to complement the DPCM method in a predictive hybrid codec, it will be demonstrated that the evaluation of feature displacement can, in its own right, form the basis of a low-bitrate video codec of low complexity. After an extensive discussion of the issues involved, a description of laboratory simulations shows how the postulated technique is applied to standard test sequences. Measurements of image quality and the efficiency of compression are made and compared with a contemporary standard method of low-bitrate video coding.
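
The point about motion defeating plain frame differencing can be shown with a small sketch. It uses generic full-search block matching, not the feature-tracking technique proposed in the thesis; the function names and the toy frames are assumptions made for illustration.

```python
def sad_block(cur, ref, y, x, dy, dx, b):
    """SAD between the b-by-b block at (y, x) in `cur` and the block displaced by (dy, dx) in `ref`."""
    return sum(abs(cur[y + r][x + c] - ref[y + dy + r][x + dx + c])
               for r in range(b) for c in range(b))


def best_motion_vector(cur, ref, y, x, b, search):
    """Full search over displacements in [-search, search]; returns (cost, dy, dx)."""
    h, w = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            inside = 0 <= y + dy and y + dy + b <= h and 0 <= x + dx and x + dx + b <= w
            if not inside:
                continue  # skip displacements that leave the reference frame
            cost = sad_block(cur, ref, y, x, dy, dx, b)
            if best is None or cost < best[0]:
                best = (cost, dy, dx)
    return best


# Reference frame with unique sample values; the current frame is the same
# content shifted two samples to the right (columns 0 and 1 are filled with 0).
ref = [[r * 8 + c for c in range(8)] for r in range(8)]
cur = [[ref[r][c - 2] if c >= 2 else 0 for c in range(8)] for r in range(8)]

zero_mv_cost = sad_block(cur, ref, 2, 4, 0, 0, 2)        # plain frame difference
cost, dy, dx = best_motion_vector(cur, ref, 2, 4, 2, 2)  # motion-compensated
```

In this toy case the co-located difference leaves a non-zero residual for the block, while the displaced block in the reference frame matches exactly, so only the displacement would need to be coded.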

    Spatial Correlation-Based Motion-Vector Prediction for Video-Coding Efficiency Improvement

    H.265/HEVC achieves an average bitrate reduction of 50% for fixed video quality compared with the H.264/AVC standard, while computational complexity is significantly increased. The purpose of this work is to improve coding efficiency for the next-generation video-coding standards. Therefore, by developing a novel spatial neighborhood subset, efficient spatial correlation-based motion vector prediction (MVP) with the coding-unit (CU) depth-prediction algorithm is proposed to improve coding efficiency. Firstly, by exploiting the reliability of neighboring candidate motion vectors (MVs), the spatial-candidate MVs are used to determine the optimized MVP for motion-data coding. Secondly, the spatial correlation-based coding-unit depth-prediction is presented to achieve a better trade-off between coding efficiency and computational complexity for inter prediction. This approach suits applications that demand very high coding efficiency but have only modest real-time processing requirements. The simulation results demonstrate that overall bitrates can be reduced by 5.35% on average, and by up to 9.89%, compared with the H.265/HEVC reference software in terms of the Bjøntegaard metric.

    Inter-prediction methods based on linear embedding for video compression

    This paper considers the problem of temporal prediction for inter-frame coding of video sequences using locally linear embedding (LLE). LLE-based prediction, first considered for intra-frame prediction, computes the predictor as a linear combination of K nearest neighbors (K-NN) searched within one or several reference frames. The paper explores different K-NN search strategies in the context of temporal prediction, leading to several temporal predictor variants. The proposed methods are tested as extra inter-frame prediction modes in an H.264 codec, but the proposed concepts are still valid in HEVC. The results show that significant rate-distortion performance gains are obtained with respect to H.264 (up to 15.31% bit-rate saving).
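
The linear-combination step can be sketched as follows, using the standard LLE weight computation (least-squares reconstruction with weights constrained to sum to one). This is an assumption about the general formulation, not the paper's code: the K-NN search is taken as already done, the weights are assumed to be fitted on templates and then applied to the matching blocks, and `solve`, `lle_weights`, `combine` and the regularisation constant `reg` are hypothetical.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting (small square systems)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def lle_weights(target, neighbours, reg=1e-3):
    """Weights w (summing to 1) that minimise ||target - sum_k w_k * neighbours[k]||^2."""
    k = len(neighbours)
    diffs = [[t - v for t, v in zip(target, nb)] for nb in neighbours]
    gram = [[sum(p * q for p, q in zip(diffs[i], diffs[j])) for j in range(k)]
            for i in range(k)]
    trace = sum(gram[i][i] for i in range(k))
    eps = reg * trace if trace > 0 else reg
    for i in range(k):
        gram[i][i] += eps  # regularise: the local Gram matrix is often singular
    w = solve(gram, [1.0] * k)
    s = sum(w)
    return [v / s for v in w]


def combine(blocks, weights):
    """Weighted sum of equally sized blocks given as flat sample lists."""
    return [sum(w * blk[i] for w, blk in zip(weights, blocks))
            for i in range(len(blocks[0]))]


# Toy case: the target template lies midway between its two neighbours.
w = lle_weights([1.0, 1.0], [[0.0, 0.0], [2.0, 2.0]])
pred = combine([[10.0, 20.0], [30.0, 40.0]], w)
```

Since the weights are derived from samples the decoder also has, they would not need to be transmitted under this assumption; the regularisation term biases the weights slightly toward equal values.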

    Object-based video representations: shape compression and object segmentation

    Object-based video representations are considered to be useful for easing the process of multimedia content production and enhancing user interactivity in multimedia productions. Object-based video presents several new technical challenges, however. Firstly, as with conventional video representations, compression of the video data is a requirement. For object-based representations, it is necessary to compress the shape of each video object as it moves in time. This amounts to the compression of moving binary images. This is achieved by the use of a technique called context-based arithmetic encoding. The technique is applied to rectangular pixel blocks and as such it is consistent with the standard tools of video compression. The block-based application also facilitates the exploitation of temporal redundancy in the sequence of binary shapes. For the first time, context-based arithmetic encoding is used in conjunction with motion compensation to provide inter-frame compression. The method, described in this thesis, has been thoroughly tested throughout the MPEG-4 core experiment process and, due to favourable results, it has been adopted as part of the MPEG-4 video standard. The second challenge lies in the acquisition of the video objects. Under normal conditions, a video sequence is captured as a sequence of frames and there is no inherent information about what objects are in the sequence, not to mention information relating to the shape of each object. Some means for segmenting semantic objects from general video sequences is required. For this purpose, several image analysis tools may be of help and in particular, it is believed that video object tracking algorithms will be important. A new tracking algorithm is developed based on piecewise polynomial motion representations and statistical estimation tools, e.g. the expectation-maximisation method and the minimum description length principle.

    Video Analysis and Indexing

    Energy efficient enabling technologies for semantic video processing on mobile devices

    Semantic object-based processing will play an increasingly important role in future multimedia systems due to the ubiquity of digital multimedia capture/playback technologies and increasing storage capacity. Although the object-based paradigm has many undeniable benefits, numerous technical challenges remain before the applications become pervasive, particularly on computationally constrained mobile devices. A fundamental issue is the ill-posed problem of semantic object segmentation. Furthermore, on battery-powered mobile computing devices, the additional algorithmic complexity of semantic object-based processing compared to conventional video processing is highly undesirable from both a real-time operation and a battery life perspective. This thesis attempts to tackle these issues by firstly constraining the solution space and focusing on the human face as a primary semantic concept of use to users of mobile devices. A novel face detection algorithm is proposed, which from the outset was designed to be amenable to offloading from the host microprocessor to dedicated hardware, thereby providing real-time performance and reducing power consumption. The algorithm uses an Artificial Neural Network (ANN), whose topology and weights are evolved via a genetic algorithm (GA). The computational burden of the ANN evaluation is offloaded to a dedicated hardware accelerator, which is capable of processing any evolved network topology. Efficient arithmetic circuitry, which leverages modified Booth recoding, column compressors and carry-save adders, is adopted throughout the design. To tackle the increased computational costs associated with object tracking or object-based shape encoding, a novel energy-efficient binary motion estimation architecture is proposed. Energy is reduced in the proposed motion estimation architecture by minimising the redundant operations inherent in the binary data. Both architectures are shown to compare favourably with the relevant prior art.

    Information Fusion for Improved Motion Estimation

    Studentship award number 98318229. Motion Estimation is an important research field with many commercial applications including surveillance, navigation, robotics, and image compression. As a result, the field has received a great deal of attention and there exist a wide variety of Motion Estimation techniques which are often specialised for particular problems. The relative performance of these techniques, in terms of both accuracy and of computational requirements, is often found to be data dependent, and no single technique is known to outperform all others for all applications under all conditions. Information Fusion strategies seek to combine the results of different classifiers or sensors to give results of a better quality for a given problem than can be achieved by any single technique alone. Information Fusion has been shown to be of benefit to a number of applications including remote sensing, personal identity recognition, target detection, forecasting, and medical diagnosis. This thesis proposes and demonstrates that Information Fusion strategies may also be applied to combine the results of different Motion Estimation techniques in order to give more robust, more accurate and more timely motion estimates than are provided by any of the individual techniques alone. Information Fusion strategies for combining motion estimates are investigated and developed. Their usefulness is first demonstrated by combining scalar motion estimates of the frequency of rotation of spinning biological cells. Then the strategies are used to combine the results from three popular 2D Motion Estimation techniques, chosen to be representative of the main approaches in the field. Results are presented, from both real and synthetic test image sequences, which illustrate the potential benefits of Information Fusion to Motion Estimation applications. There is often a trade-off between accuracy of Motion Estimation techniques and their computational requirements. An architecture for Information Fusion that allows faster, less accurate techniques to be effectively combined with slower, more accurate techniques is described. This thesis describes a number of novel techniques for both Information Fusion and Motion Estimation which have potential scope beyond that examined here. The investigations presented in this thesis have also been reported in a number of workshop, conference and journal papers, which are listed at the end of the document.