289 research outputs found
Complexity adaptation in video encoders for power limited platforms
With the emergence of video services on power limited platforms, it is necessary to consider both performance-centric and constraint-centric signal processing techniques. Traditionally, video applications have a bandwidth or computational resources constraint or both. The recent H.264/AVC video compression standard offers significantly improved efficiency and flexibility compared to previous standards, which leads to less emphasis on bandwidth. However, its high computational complexity is a problem for codecs running on power limited plat- forms. Therefore, a technique that integrates both complexity and bandwidth issues in a single framework should be considered.
In this thesis we investigate complexity adaptation of a video coder which focuses on managing computational complexity and provides significant complexity savings when applied to recent standards. It consists of three sub functions specially designed for reducing complexity and a framework for using these sub functions; Variable Block Size (VBS) partitioning, fast motion estimation, skip macroblock detection, and complexity adaptation framework.
Firstly, the VBS partitioning algorithm based on the Walsh Hadamard Transform (WHT) is presented. The key idea is to segment regions of an image as edges or flat regions based on the fact that prediction errors are mainly affected by edges. Secondly, a fast motion estimation algorithm called Fast Walsh Boundary Search (FWBS) is presented on the VBS partitioned images. Its results outperform other commonly used fast algorithms. Thirdly, a skip macroblock detection algorithm is proposed for use prior to motion estimation by estimating the Discrete Cosine Transform (DCT) coefficients after quantisation. A new orthogonal transform called the S-transform is presented for predicting Integer DCT coefficients from Walsh Hadamard Transform coefficients. Complexity saving is achieved by deciding which macroblocks need to be processed and which can be skipped without processing. Simulation results show that the proposed algorithm achieves significant complexity savings with a negligible loss in rate-distortion performance. Finally, a complexity adaptation framework which combines all three techniques mentioned above is proposed for maximizing the perceptual quality of coded video on a complexity constrained platform
Signal processing for high-definition television
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Mathematics, 1995.Includes bibliographical references (p. 60-62).by Peter Monta.Ph.D
Platforms for handling and development of audiovisual data
EstĂĄgio realizado na MOG Solutions e orientado por VĂtor TeixeiraTese de mestrado integrado. Engenharia InformĂĄtca e Computação. Faculdade de Engenharia. Universidade do Porto. 200
Evaluation of the color image and video processing chain and visual quality management for consumer systems
With the advent of novel digital display technologies, color processing is increasingly becoming a key aspect in consumer video applications. Todayâs state-of-the-art displays require sophisticated color and image reproduction techniques in order to achieve larger screen size, higher luminance and higher resolution than ever before. However, from color science perspective, there are clearly opportunities for improvement in the color reproduction capabilities of various emerging and conventional display technologies. This research seeks to identify potential areas for improvement in color processing in a video processing chain. As part of this research, various processes involved in a typical video processing chain in consumer video applications were reviewed. Several published color and contrast enhancement algorithms were evaluated, and a novel algorithm was developed to enhance color and contrast in images and videos in an effective and coordinated manner. Further, a psychophysical technique was developed and implemented for performing visual evaluation of color image and consumer video quality. Based on the performance analysis and visual experiments involving various algorithms, guidelines were proposed for the development of an effective color and contrast enhancement method for images and video applications. It is hoped that the knowledge gained from this research will help build a better understanding of color processing and color quality management methods in consumer video
Segmentation sémantique des contenus audio-visuels
Dans ce travail, nous avons mis au point une mĂ©thode de segmentation des contenus audiovisuels applicable aux appareils de stockage domestiques pour cela nous avons expĂ©rimentĂ© un systĂšme distribuĂ© pour lâanalyse du contenu composĂ© de modules individuels dâanalyse : les Service Unit. Lâun dâentre eux a Ă©tĂ© dĂ©diĂ© Ă la caractĂ©risation des Ă©lĂ©ments hors contenu, i.e. les publicitĂ©s, et offre de bonnes performances. ParallĂšlement, nous avons testĂ© diffĂ©rents dĂ©tecteurs de changement de plans afin de retenir le meilleur dâentre eux pour la suite. Puis, nous avons proposĂ© une Ă©tude des rĂšgles de production des films, i.e. grammaire de films, qui a permis de dĂ©finir les sĂ©quences de Parallel Shot. Nous avons, ainsi, testĂ© quatre mĂ©thodes de regroupement basĂ©es similaritĂ© afin de retenir la meilleure dâentre elles pour la suite. Finalement, nous avons recherchĂ© diffĂ©rentes mĂ©thodes de dĂ©tection des frontiĂšres de scĂšnes et avons obtenu les meilleurs rĂ©sultats en combinant une mĂ©thode basĂ©e couleur avec un critĂšre de longueur de plan. Ce dernier offre des performances justifiant son intĂ©gration dans les appareils de stockage grand public.In this work we elaborated a method for semantic segmentation of audiovisual content applicable for consumer electronics storage devices. For the specific solution we researched first a service-oriented distributed multimedia content analysis framework composed of individual content analysis modules, i.e. Service Units. One of the latter was dedicated to identify non-content related inserts, i.e. commercials blocks, which reached high performance results. In a subsequent step we researched and benchmarked various Shot Boundary Detectors and implement the best performing one as Service Unit. Here after, our study of production rules, i.e. film grammar, provided insights of Parallel Shot sequences, i.e. Cross-Cuttings and Shot-Reverse-Shots. We researched and benchmarked four similarity-based clustering methods, two colour- and two feature-point-based ones, in order to retain the best one for our final solution. Finally, we researched several audiovisual Scene Boundary Detector methods and achieved best results combining a colour-based method with a shot length based criteria. This Scene Boundary Detector identified semantic scene boundaries with a robustness of 66% for movies and 80% for series, which proofed to be sufficient for our envisioned application Advanced Content Navigation
Recommended from our members
Inspection and evaluation of artifacts in digital video sources
Streaming digital video content providers such as YouTube, Amazon, Hulu, and Netflix collaborate with production teams to obtain new and old video content. These collaborations lead to an accumulation of video sources, some of which might contain unacceptable visual artifacts. Artifacts may inadvertently enter the video master at any point in the production pipeline, due to any of a number of equipment and user failures. Unfortunately, these artifacts are difficult to detect since no pristine reference exists for comparison. As of now, few automated tools exist that can effectively capture the most common forms of these artifacts. This work studies no-reference video source inspection for generalized artifact detection and subjective quality prediction, which will ultimate inform decisions related to acquisition of new content.
Automatically identifying the locations and severities of video artifacts is a difficult problem. We have developed a general method for detecting local artifacts by learning differences in the statistics between distorted and pristine video frames. Our model, which we call the Video Impairment Mapper (VID-MAP), produces a full resolution map of artifact detection probabilities based on comparisons of excitatory and inhibatory convolutional responses. Validation on a large database shows that our method outperforms the previous state-of-the-art of even distortion-specific detectors.
A variety of powerful picture quality predictors are available that rely on neuro-statistical models of distortion perception. We extend these principles to video source inspection, by coupling spatial divisive normalization with a series of filterbanks tuned for artifact detection, implemented using a common convolutional framework. We developed the Video Impairment Detection by SParse Error CapTure (VIDSPECT) model, which leverages discriminative sparse dictionaries that are tuned to detect specific artifacts. VIDSPECT is simple, highly generalizable, and yields better accuracy than competing methods.
To evaluate the perceived quality of video sources containing artifacts, we built a new digital video database, called the LIVE Video Masters Database, which contains 384 videos affected by the types of artifacts encountered in otherwise pristine digital video sources. We find that VIDSPECT delivers top performance on this database for most artifacts tested, and competitive performance otherwise, using the same basic architecture in all cases.Electrical and Computer Engineerin
Object-based video representations: shape compression and object segmentation
Object-based video representations are considered to be useful for easing the process of multimedia content production and enhancing user interactivity in multimedia productions. Object-based video presents several new technical challenges, however.
Firstly, as with conventional video representations, compression of the video data is a
requirement. For object-based representations, it is necessary to compress the shape of
each video object as it moves in time. This amounts to the compression of moving
binary images. This is achieved by the use of a technique called context-based
arithmetic encoding. The technique is utilised by applying it to rectangular pixel blocks and as such it is consistent with the standard tools of video compression. The blockbased application also facilitates well the exploitation of temporal redundancy in the sequence of binary shapes. For the first time, context-based arithmetic encoding is used in conjunction with motion compensation to provide inter-frame compression. The method, described in this thesis, has been thoroughly tested throughout the MPEG-4 core experiment process and due to favourable results, it has been adopted as part of the MPEG-4 video standard.
The second challenge lies in the acquisition of the video objects. Under normal conditions, a video sequence is captured as a sequence of frames and there is no inherent information about what objects are in the sequence, not to mention information relating to the shape of each object. Some means for segmenting semantic objects from general video sequences is required. For this purpose, several image analysis tools may be of help and in particular, it is believed that video object tracking algorithms will be important. A new tracking algorithm is developed based on piecewise polynomial motion representations and statistical estimation tools, e.g. the expectationmaximisation method and the minimum description length principle
- âŠ