33 research outputs found

    Investigating low-bitrate, low-complexity H.264 region of interest techniques in error-prone environments

    The H.264/AVC video coding standard leverages advanced compression methods to provide a significant increase in performance over previous codecs in terms of picture quality, bitrate, and flexibility. The specification itself provides several profiles and levels that allow customization through the use of various advanced features. In addition to these features, several new video coding techniques have been developed since the standard's inception. One such technique, known as Region of Interest (RoI) coding, has been in existence since before H.264's formalization, and several means of implementing RoI coding in H.264 have been proposed. Region of Interest coding operates under the assumption that one or more regions of a sequence have higher priority than the rest of the video. One goal of RoI coding is to provide a decrease in bitrate without significant loss of perceptual quality, and this is particularly applicable to low-complexity environments if the proper implementation is used. Furthermore, RoI coding may allow for enhanced error resilience in the selected regions if desired, making RoI suitable for both low-bitrate and error-prone scenarios. The goal of this thesis project was to examine H.264 Region of Interest coding as it applies to such scenarios. A modified version of the H.264 JM Reference Software was created in which all non-Baseline profile features were removed. Six low-complexity RoI coding techniques, three targeting rate control and three targeting error resilience, were selected for implementation. Error and distortion modeling tools were created to enhance the quality of experimental data. Results were gathered by varying a range of coding parameters including frame size, target bitrate, and macroblock error rates. Methods were then examined based on their rate-distortion curves, ability to achieve target bitrates accurately, and per-region distortions where applicable.

    Resource-Constrained Low-Complexity Video Coding for Wireless Transmission

    Description-driven Adaptation of Media Resources

    The current multimedia landscape is characterized by a significant diversity in terms of available media formats, network technologies, and device properties. This heterogeneity has resulted in a number of new challenges, such as providing universal access to multimedia content. A solution for this diversity is the use of scalable bit streams, as well as the deployment of a complementary system that is capable of adapting scalable bit streams to the constraints imposed by a particular usage environment (e.g., the limited screen resolution of a mobile device). This dissertation investigates the use of an XML-driven (Extensible Markup Language) framework for the format-independent adaptation of scalable bit streams. Using this approach, the structure of a bit stream is first translated into an XML description. In a next step, the resulting XML description is transformed to reflect a desired adaptation of the bit stream. Finally, the transformed XML description is used to create an adapted bit stream that is suited for playback in the targeted usage environment. The main contribution of this dissertation is BFlavor, a new tool for exposing the syntax of binary media resources as an XML description. Its development was inspired by two other technologies, i.e. MPEG-21 BSDL (Bitstream Syntax Description Language) and XFlavor (Formal Language for Audio-Visual Object Representation, extended with XML features). Although created from a different point of view, both languages offer solutions for translating the syntax of a media resource into an XML representation for further processing. BFlavor (BSDL+XFlavor) harmonizes the two technologies by combining their strengths and eliminating their weaknesses. The expressive power and performance of a BFlavor-based content adaptation chain, compared to tool chains entirely based on either BSDL or XFlavor, were investigated by several experiments. 
    One series of experiments targeted the exploitation of multi-layered temporal scalability in H.264/AVC, paying particular attention to the use of sub-sequences and hierarchical coding patterns, as well as to the use of metadata messages to communicate the bit stream structure to the adaptation logic. BFlavor was the only tool to offer an elegant and practical solution for XML-driven adaptation of H.264/AVC bit streams in the temporal domain.
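
    The three-step chain described in this abstract (describe the bit stream in XML, transform the description, regenerate an adapted bit stream) can be sketched as follows. This is only an illustration under stated assumptions: the toy unit layout, the element and attribute names, and the functions `describe`, `transform`, and `generate` are all hypothetical and are not BFlavor, BSDL, or XFlavor syntax, and the byte stream is not H.264.

```python
import xml.etree.ElementTree as ET


def describe(stream: bytes) -> ET.Element:
    """Step 1: expose the structure of a toy bit stream as an XML description.

    Assumed toy format (not H.264): each unit is
    [1 byte temporal layer][1 byte payload length][payload bytes].
    """
    root = ET.Element("bitstream")
    pos = 0
    while pos < len(stream):
        layer, length = stream[pos], stream[pos + 1]
        size = 2 + length  # header plus payload
        ET.SubElement(root, "unit", layer=str(layer),
                      offset=str(pos), size=str(size))
        pos += size
    return root


def transform(description: ET.Element, max_layer: int) -> ET.Element:
    """Step 2: keep only the units at or below the target temporal layer."""
    adapted = ET.Element("bitstream")
    for unit in description.findall("unit"):
        if int(unit.get("layer")) <= max_layer:
            ET.SubElement(adapted, "unit", dict(unit.attrib))
    return adapted


def generate(stream: bytes, description: ET.Element) -> bytes:
    """Step 3: build the adapted stream by copying the referenced byte ranges."""
    out = bytearray()
    for unit in description.findall("unit"):
        offset, size = int(unit.get("offset")), int(unit.get("size"))
        out += stream[offset:offset + size]
    return bytes(out)


# Four units on temporal layers 0, 1, 2, 0; drop everything above layer 1.
original = bytes([0, 2, 0xAA, 0xBB, 1, 1, 0xCC, 2, 1, 0xDD, 0, 1, 0xEE])
adapted_stream = generate(original, transform(describe(original), max_layer=1))
```

    The point of the design is that the adaptation logic in step 2 only ever touches the XML description, so it stays independent of the binary format; only steps 1 and 3 need to know how the bytes are laid out.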

    Coding of video with a single information plane

    Master's in Electronics and Telecommunications Engineering. Current video coding standards, such as MPEG2/4 or H.263/4, were developed for encoding color video. The color information is represented using an appropriate color space, such as YCbCr. These color spaces consist of three planes: one for luminance (in the given example, Y) and two for chrominance (in this case, Cb and Cr). However, there are applications where the information lies in a single plane that may, for example, represent shades of gray (medical imaging) or indexes into color tables (color-indexed video). The motivation for this thesis is twofold: the production of medical images in digital format is growing, which calls for efficient techniques for processing and compressing the data; and, although indexed-color models have long been used to represent images, they have not been adequately explored for video. This thesis investigates new lossless compression strategies that exploit the redundancy between consecutive images that characterizes these types of content. To that end, two video encoders for a single information plane were implemented, both based on a hybrid model. One uses Golomb codes and the other arithmetic coding, and the efficiency of each was studied for both gray-scale and color-indexed videos. In addition, for color-indexed videos, a palette reordering algorithm was implemented, which makes the encoding more efficient.
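
    As a rough illustration of how Golomb coding can be applied losslessly to differences between consecutive frames, the sketch below uses the Rice special case (Golomb parameter m = 2^k). Everything here is an assumption made for illustration: the abstract does not give the thesis's parameter choice, prediction scheme, or residual mapping, and the function names are hypothetical.

```python
def to_unsigned(e: int) -> int:
    """Map a signed residual to a non-negative integer (0, -1, 1, -2, ... -> 0, 1, 2, 3, ...)."""
    return 2 * e if e >= 0 else -2 * e - 1


def to_signed(u: int) -> int:
    """Inverse of to_unsigned."""
    return u // 2 if u % 2 == 0 else -(u + 1) // 2


def rice_encode(n: int, k: int) -> str:
    """Golomb-Rice code of n >= 0: quotient in unary, then k remainder bits."""
    q, r = n >> k, n & ((1 << k) - 1)
    bits = "1" * q + "0"  # unary quotient, terminated by a 0
    if k:
        bits += format(r, f"0{k}b")
    return bits


def rice_decode(bits: str, k: int) -> int:
    """Decode a single Golomb-Rice codeword given as a bit string."""
    q = bits.index("0")  # number of leading 1s
    r = int(bits[q + 1:q + 1 + k], 2) if k else 0
    return (q << k) | r


# Code the sample-wise difference between two consecutive (toy) frame rows.
previous_row = [10, 12, 12]
current_row = [10, 13, 11]
residuals = [c - p for c, p in zip(current_row, previous_row)]
codewords = [rice_encode(to_unsigned(e), k=1) for e in residuals]
decoded = [p + to_signed(rice_decode(w, k=1)) for p, w in zip(previous_row, codewords)]
```

    Small residuals get short codewords, which is why this family of codes suits the near-zero differences that typically arise when consecutive frames are similar.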

    Slice-Level Trading of Quality and Performance in Decoding H.264 Video: Slice-basiertes Abwägen zwischen Qualität und Leistung beim Dekodieren von H.264-Video

    When a demanding video decoding task requires more CPU resources than are available, playback today degrades ungracefully: the decoder skips frames selected arbitrarily or by simple heuristics, which the viewer notices as jerky motion in the good case or as images breaking up completely in the bad case. The latter can happen when reference frames are missing. This thesis provides a way to schedule individual decoding tasks based on a trade-off between cost and performance. To this end, I present a way to preprocess a video that generates estimates of the cost, in terms of execution time, and of the performance, in terms of perceived visual quality. The granularity of the scheduling decision is a single slice, which gives a much more fine-grained approach than dealing with entire frames. Together with an actual scheduler implementation that uses the generated estimates, this work allows for higher perceived video playback quality under CPU overload.
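
    A minimal sketch of slice-level scheduling driven by cost and benefit estimates is given below. It is not the thesis's scheduler: it swaps in a simple greedy heuristic (best estimated quality per millisecond first), ignores dependencies between slices, and assumes positive costs. The `SliceEstimate` type and `select_slices` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SliceEstimate:
    slice_id: int
    cost_ms: float       # estimated decoding time (assumed > 0)
    quality_gain: float  # estimated contribution to perceived quality


def select_slices(estimates: list[SliceEstimate], budget_ms: float) -> list[int]:
    """Greedily pick slices with the best quality-per-cost ratio that fit the budget."""
    chosen = []
    remaining = budget_ms
    ranked = sorted(estimates, key=lambda s: s.quality_gain / s.cost_ms, reverse=True)
    for s in ranked:
        if s.cost_ms <= remaining:
            chosen.append(s.slice_id)
            remaining -= s.cost_ms
    return sorted(chosen)  # return in decoding order by id


estimates = [
    SliceEstimate(0, cost_ms=4.0, quality_gain=8.0),
    SliceEstimate(1, cost_ms=2.0, quality_gain=1.0),
    SliceEstimate(2, cost_ms=3.0, quality_gain=9.0),
    SliceEstimate(3, cost_ms=5.0, quality_gain=5.0),
]
to_decode = select_slices(estimates, budget_ms=8.0)
```

    With an 8 ms budget the sketch decodes slices 0 and 2 and skips the rest; a real scheduler would also have to account for slices that later frames reference.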

    Format-independent media resource adaptation and delivery

    Motion correlation based low complexity and low power schemes for video codec

    System: new; Report number: Kō No. 3750; Degree type: Doctor of Engineering; Date conferred: 2012/11/19; Waseda University diploma number: Shin 6121. Waseda University

    Segmentation based coding of depth Information for 3D video

    Growing interest in 3D video, together with the need to transmit, broadcast, and store all the information that represents the 3D view, has made this a hot topic in recent years. Because adding depth information to the views increases the encoding bitrate considerably, we set out to find a new approach to encoding and decoding the depth information for 3D video. In this project, different approaches to encoding and decoding the depth information are tried, and a new method is implemented whose results are compared with those of the best previously developed method in terms of both bitrate and quality (PSNR).