Investigating low-bitrate, low-complexity H.264 region of interest techniques in error-prone environments
The H.264/AVC video coding standard leverages advanced compression methods to provide a significant increase in performance over previous CODECs in terms of picture quality, bitrate, and flexibility. The specification itself provides several profiles and levels that allow customization through the use of various advanced features. In addition to these features, several new video coding techniques have been developed since the standard's inception. One such technique known as Region of Interest (RoI) coding has been in existence since before H.264's formalization, and several means of implementing RoI coding in H.264 have been proposed. Region of Interest coding operates under the assumption that one or more regions of a sequence have higher priority than the rest of the video. One goal of RoI coding is to provide a decrease in bitrate without significant loss of perceptual quality, and this is particularly applicable to low complexity environments, if the proper implementation is used. Furthermore, RoI coding may allow for enhanced error resilience in the selected regions if desired, making RoI suitable for both low-bitrate and error-prone scenarios. The goal of this thesis project was to examine H.264 Region of Interest coding as it applies to such scenarios. A modified version of the H.264 JM Reference Software was created in which all non-Baseline profile features were removed. Six low-complexity RoI coding techniques, three targeting rate control and three targeting error resilience, were selected for implementation. Error and distortion modeling tools were created to enhance the quality of experimental data. Results were gathered by varying a range of coding parameters including frame size, target bitrate, and macroblock error rates. Methods were then examined based on their rate-distortion curves, ability to achieve target bitrates accurately, and per-region distortions where applicable.
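As a rough illustration of the rate-control side of RoI coding described in this abstract, the sketch below gives macroblocks that overlap a rectangular region of interest a lower quantization parameter (QP) than the background. The function name `roi_qp_map`, the rectangle representation, and the QP offsets are illustrative assumptions; they are not taken from the thesis or from the JM Reference Software. Only the 16x16 macroblock size and the 0 to 51 QP range come from H.264 itself.

```python
def roi_qp_map(width, height, roi, base_qp=30, roi_delta=-4, bg_delta=4, mb=16):
    """Build a per-macroblock QP grid for one frame.

    width, height: frame size in pixels.
    roi: hypothetical (x, y, w, h) rectangle in pixels marking the region of interest.
    Returns a list of rows, one QP value per macroblock.
    """
    cols = (width + mb - 1) // mb   # macroblocks per row, rounded up
    rows = (height + mb - 1) // mb  # macroblock rows, rounded up
    rx, ry, rw, rh = roi
    grid = []
    for r in range(rows):
        row = []
        for c in range(cols):
            x0, y0 = c * mb, r * mb
            # A macroblock belongs to the RoI if its square overlaps the rectangle.
            in_roi = (x0 < rx + rw and x0 + mb > rx and
                      y0 < ry + rh and y0 + mb > ry)
            qp = base_qp + (roi_delta if in_roi else bg_delta)
            row.append(max(0, min(51, qp)))  # clamp to the H.264 QP range
        grid.append(row)
    return grid


# QCIF frame (176x144) with a 64x64 region of interest.
qp_grid = roi_qp_map(176, 144, (48, 32, 64, 64))
```

A lower QP means finer quantization, so the region of interest receives more bits while the background is coded more coarsely.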
Description-driven Adaptation of Media Resources
The current multimedia landscape is characterized by a significant diversity in terms of available media formats, network technologies, and device properties. This heterogeneity has resulted in a number of new challenges, such as providing universal access to multimedia content. A solution for this diversity is the use of scalable bit streams, as well as the deployment of a complementary system that is capable of adapting scalable bit streams to the constraints imposed by a particular usage environment (e.g., the limited screen resolution of a mobile device). This dissertation investigates the use of an XML-driven (Extensible Markup Language) framework for the format-independent adaptation of scalable bit streams. Using this approach, the structure of a bit stream is first translated into an XML description. In a next step, the resulting XML description is transformed to reflect a desired adaptation of the bit stream. Finally, the transformed XML description is used to create an adapted bit stream that is suited for playback in the targeted usage environment. The main contribution of this dissertation is BFlavor, a new tool for exposing the syntax of binary media resources as an XML description. Its development was inspired by two other technologies, i.e. MPEG-21 BSDL (Bitstream Syntax Description Language) and XFlavor (Formal Language for Audio-Visual Object Representation, extended with XML features). Although created from a different point of view, both languages offer solutions for translating the syntax of a media resource into an XML representation for further processing. BFlavor (BSDL+XFlavor) harmonizes the two technologies by combining their strengths and eliminating their weaknesses. The expressive power and performance of a BFlavor-based content adaptation chain, compared to tool chains entirely based on either BSDL or XFlavor, were investigated by several experiments. 
One series of experiments targeted the exploitation of multi-layered temporal scalability in H.264/AVC, paying particular attention to the use of sub-sequences and hierarchical coding patterns, as well as to the use of metadata messages to communicate the bit stream structure to the adaptation logic. BFlavor was the only tool to offer an elegant and practical solution for XML-driven adaptation of H.264/AVC bit streams in the temporal domain
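The three-step chain described in this abstract (describe the bit stream in XML, transform the description, regenerate the bit stream) can be sketched as follows. This is a toy model under stated assumptions: the element name `unit` and the attributes `offset`, `length`, and `temporal_level` are hypothetical and do not reproduce BSDL, XFlavor, or BFlavor syntax, and the "bit stream" is simply a concatenation of byte payloads.

```python
import xml.etree.ElementTree as ET


def describe(units):
    """Step 1: build a bit stream and an XML description of its structure.

    units: list of (temporal_level, payload_bytes) pairs (illustrative input).
    """
    root = ET.Element("bitstream")
    data = bytearray()
    for level, payload in units:
        ET.SubElement(root, "unit", {
            "offset": str(len(data)),
            "length": str(len(payload)),
            "temporal_level": str(level),
        })
        data += payload
    return bytes(data), root


def transform(root, max_level):
    """Step 2: keep only the units at or below the requested temporal level."""
    out = ET.Element("bitstream")
    for unit in root.findall("unit"):
        if int(unit.get("temporal_level")) <= max_level:
            ET.SubElement(out, "unit", dict(unit.attrib))
    return out


def generate(bitstream, root):
    """Step 3: copy the byte ranges listed in the description into a new stream."""
    parts = []
    for unit in root.findall("unit"):
        start = int(unit.get("offset"))
        parts.append(bitstream[start:start + int(unit.get("length"))])
    return b"".join(parts)


stream, description = describe([(0, b"I0"), (2, b"b1"), (1, b"B2"), (2, b"b3")])
adapted = generate(stream, transform(description, max_level=1))
```

The adaptation logic only ever touches the XML description, which is what makes the approach format-independent: the same transform step works for any format whose structure can be described this way.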
Error control strategies in H.265|HEVC video transmission
This thesis was submitted for the award of Doctor of Philosophy and was awarded by Brunel University London. With the rapid development in video coding technologies in the last decade, high-resolution video delivery suffers from packet loss due to unreliable transmission channels with time-varying characteristics. Error resilience approaches at the channel coding level are less efficient to implement in real-time video transmission because the encoded video samples have variable code lengths. Therefore, error resilience in the video coding standard plays a vital role in reducing the effect of error propagation and improving the perceived visual quality. The main work in this thesis is to develop an efficient error resilience mechanism for the H.265|HEVC video coding standard to reduce the effects of error propagation in error-prone conditions. In this thesis, two error resilience algorithms are proposed. The first is the Adaptive Slice Encoding (ASE) error resilience algorithm. The concept of this algorithm is to extract and protect the most active slices in the coded bitstream based on an adaptive search window. This algorithm can be applied in low-delay video transmission with or without a feedback channel. It is also designed to be compatible with the reference coding software (HM16) for the H.265|HEVC coding standard. The second proposed algorithm is a joint encoder-decoder error resilience algorithm called Error Resilience based on Supplemental Enhancement Information (ERSEI). A feedback status message from the decoder notifies the encoder to start encoding a clean random-access picture adaptively, based on the decoded picture hash message status at the decoder. At the same time, the decoder is notified to start the error concealment process whilst waiting to receive correct video data. A recovery point message from the decoder feedback channel is used to update the encoder with error messages.
In this thesis, extensive experimental work, evaluation, and comparison with state-of-the-art related algorithms have been conducted to evaluate the proposed algorithms. Furthermore, the best trade-off between the coding efficiency of the proposed error resilience algorithms and their error resilience performance has been considered at the design stage. The experimental evaluation covers both encoding conditions, i.e. error-free and error-prone. The results show significant improvements in Y-PSNR and in the subjective quality of the decoded bitstream when the proposed algorithms are used in error-prone conditions with a variety of packet loss rates.
Moreover, experimental work is conducted to test the algorithms' complexity in terms of the execution time required at both the encoding and decoding stages. Additionally, the performance of both the H.264|AVC and H.265|HEVC coding standards is evaluated in error-free and error-prone environments.
For the ASE algorithm, when compared with the improved region of interest (IROI) and region of interest (ROI) algorithms, the most obvious finding was a significant improvement in visual quality at packet loss rates (PLRs) of 2-18%.
For the ERSEI algorithm, when compared with the default HM16 with pixel copy concealment and with motion compensated error concealment (MCEC) techniques, the evaluation results indicate clear visual quality enhancement at PLRs of 1, 2, 6, and 8%. The Ministry of Higher Education and Scientific Research in Ira
Coding of video with a single information plane
Master's degree in Electronics and Telecommunications Engineering. ABSTRACT: The current standards for video encoding, such as MPEG2/4 or H.263/4, were developed for encoding color video. The color information is represented using an appropriate color space, such as YCbCr. These color spaces consist of three planes: one for luminance (in the given example, Y) and two for the chrominance information (in this case, Cb and Cr). However, there are applications where the information to be encoded lies in a single information plane that may, for example, represent shades of gray (medical imaging) or indexes into color tables (color-indexed video). The motivation for this thesis rests on two facts: the production of medical images in digital format has been growing, which calls for efficient techniques for processing and compressing the data, and, although color-indexed models have long been used to represent images, they have not been adequately explored in video. This thesis investigates new lossless compression strategies that exploit the redundancy between consecutive images that characterizes these imaging modalities. To that end, two video encoders for a single information plane were implemented, both based on a hybrid model. One of them uses Golomb codes and the other arithmetic coding. The efficiency of each was studied for both grayscale and color-indexed videos. In addition, for color-indexed videos, a palette reordering algorithm was implemented, which makes the encoding more efficient.
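To make the Golomb-coding idea concrete, here is a minimal sketch of lossless coding of the difference between two consecutive single-plane frames. It uses Rice codes, the special case of Golomb codes whose parameter is a power of two (2^k), rather than general Golomb codes, and it uses a plain sample-by-sample frame difference as the predictor. Both choices, along with the function names and the zigzag mapping of signed residuals, are simplifying assumptions and not the thesis's hybrid encoder.

```python
def zigzag(v):
    """Map a signed residual to a non-negative integer: 0,-1,1,-2,... -> 0,1,2,3,..."""
    return 2 * v if v >= 0 else -2 * v - 1


def rice_encode(values, k):
    """Encode signed integers as a string of '0'/'1' characters with Rice parameter k."""
    out = []
    for v in values:
        u = zigzag(v)
        q, r = u >> k, u & ((1 << k) - 1)
        # Unary quotient (q ones, then a zero) followed by k remainder bits.
        out.append("1" * q + "0" + (format(r, f"0{k}b") if k else ""))
    return "".join(out)


def rice_decode(bits, k, count):
    """Decode `count` signed integers from a bit string produced by rice_encode."""
    values, pos = [], 0
    for _ in range(count):
        q = 0
        while bits[pos] == "1":
            q += 1
            pos += 1
        pos += 1  # skip the terminating zero of the unary part
        r = int(bits[pos:pos + k], 2) if k else 0
        pos += k
        u = (q << k) | r
        values.append(u >> 1 if u % 2 == 0 else -((u + 1) >> 1))
    return values


previous_frame = [10, 10, 12, 200]
current_frame = [10, 11, 12, 198]
residual = [c - p for c, p in zip(current_frame, previous_frame)]
encoded = rice_encode(residual, k=1)
decoded = rice_decode(encoded, k=1, count=len(residual))
restored = [p + d for p, d in zip(previous_frame, decoded)]
```

Because consecutive frames are similar, most residuals are close to zero and receive short codewords, which is where the compression gain comes from. The round trip is exact, so the scheme is lossless.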
Slice-Level Trading of Quality and Performance in Decoding H.264 Video: Slice-basiertes Abwägen zwischen Qualität und Leistung beim Dekodieren von H.264-Video
When a demanding video decoding task requires more CPU resources than are available, playback degrades ungracefully today: the decoder skips frames selected arbitrarily or by simple heuristics, which the viewer notices as jerky motion in the good case or as images completely breaking up in the bad case. The latter can happen when reference frames are missing, so that errors propagate through the decoding process into subsequent frames. This thesis provides a way to schedule individual decoding tasks based on a cost-benefit trade-off. To that end, I present a way to preprocess a video, generating estimates of the cost in terms of execution time and of the benefit in terms of perceived visual quality. The granularity of the scheduling decision is a single slice, which leads to a much more fine-grained approach than dealing with entire frames. Together with an actual scheduler implementation that uses the generated estimates, this work allows for higher perceived video quality during playback under CPU overload.
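A slice-level cost-benefit decision of the kind described in this abstract could look roughly like the sketch below: given per-slice estimates of decoding time and quality benefit, it greedily picks the slices with the best benefit per millisecond until the time budget is used up. The greedy rule, the function name `pick_slices`, and the tuple layout are assumptions made for illustration; the thesis's actual scheduler is not specified here, and this sketch ignores dependencies between slices and reference frames.

```python
def pick_slices(slices, budget_ms):
    """Choose which slices to decode within a time budget.

    slices: list of (slice_id, cost_ms, benefit) tuples, where cost_ms > 0 is the
            estimated decoding time and benefit is the estimated quality gain.
    Returns (sorted list of chosen slice ids, total estimated time used).
    """
    chosen, used = [], 0.0
    # Consider slices in order of benefit per unit of decoding time, best first.
    ranked = sorted(slices, key=lambda s: s[2] / s[1], reverse=True)
    for slice_id, cost_ms, _benefit in ranked:
        if used + cost_ms <= budget_ms:
            chosen.append(slice_id)
            used += cost_ms
    return sorted(chosen), used


estimates = [("a", 4.0, 8.0), ("b", 2.0, 1.0), ("c", 3.0, 6.0), ("d", 5.0, 2.0)]
to_decode, time_used = pick_slices(estimates, budget_ms=8.0)
```

Working at slice granularity lets the scheduler drop the least valuable parts of a frame instead of discarding the whole frame.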
Scalable and network aware video coding for advanced communications over heterogeneous networks
This thesis was submitted for the degree of Doctor of Philosophy and was awarded by Brunel University. This work addresses the issues concerned with the provision of scalable video services over heterogeneous networks, particularly with regard to dynamic adaptation and the user's acceptable quality of service.
In order to provide and sustain an adaptive and network-friendly multimedia communication service, a suite of techniques that achieve automatic scalability and adaptation is developed. These techniques are evaluated objectively and subjectively to assess the Quality of Service (QoS) provided to diverse users with variable constraints and dynamic resources. The research considered various levels of user-acceptable QoS. The techniques are further evaluated with a view to establishing their performance against state-of-the-art scalable and non-scalable techniques.
To further improve the adaptability of the designed techniques, several experiments and real time simulations are conducted with the aim of determining the optimum performance with various coding parameters and scenarios. The coding parameters and scenarios are evaluated and analyzed to determine their performance using various types of video content and formats. Several algorithms are developed to provide a dynamic adaptation of coding tools and parameters to specific video content type, format and bandwidth of transmission.
In heterogeneous networks, channel conditions, terminals, and users' capabilities and preferences change unpredictably, which limits the adaptability of any single technique. For this reason, a Dynamic Scalability Decision Making Algorithm (SADMA) is developed. The algorithm autonomously selects one of the designed scalability techniques, basing its decision on the monitored and reported channel conditions. Experiments were conducted using a purpose-built heterogeneous network simulator, and the network-aware selection of the scalability techniques is based on real-time simulation results. A technique with minimum delay, low bit-rate, low frame rate, and low quality is adopted as a reactive measure to a predicted bad channel condition. If the use of the techniques is not favoured because the reported channel conditions are deteriorating, a reduced layered stream or the base layer is used. If the network status does not allow the use of the base layer, then the stream uses parameter identifiers with high efficiency to improve the scalability and adaptation of the video service.
To further improve the flexibility and efficiency of the algorithm, a dynamic de-blocking filter and lambda value selection are analyzed and introduced in the algorithm. Various methods, interfaces and algorithms are defined for transcoding from one technique to another and extracting sub-streams when the network conditions do not allow for the transmission of the entire bit-stream
Motion correlation based low complexity and low power schemes for video codec
System: new; Report number: Kō No. 3750; Degree type: Doctor of Engineering; Date conferred: 2012/11/19; Waseda University degree record number: New 6121. Waseda Universit
Segmentation based coding of depth Information for 3D video
Growing interest in 3D artifacts, together with the need to transmit, broadcast, and store all the information that represents a 3D view, has been a hot topic in recent years. Knowing that adding the depth information to the views increases the encoding bitrate considerably, we set out to find a new approach to encoding and decoding the depth information for 3D video. In this project, different approaches to encoding and decoding the depth information are explored, and a new method is implemented whose results are compared with those of the best previously developed method in terms of both bitrate and quality (PSNR).
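For reference, the PSNR quality measure mentioned here is computed from the mean squared error (MSE) between the original and the decoded samples as PSNR = 10 * log10(peak^2 / MSE). The sketch below applies that standard formula to flat lists of 8-bit samples (for example, a depth map); the function name and the list-based input are illustrative assumptions.

```python
import math


def psnr(reference, decoded, peak=255):
    """Peak signal-to-noise ratio in dB between two equally sized sample lists."""
    if len(reference) != len(decoded) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(reference, decoded)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals: no distortion
    return 10 * math.log10(peak ** 2 / mse)


original_depth = [0, 0, 0, 0]
decoded_depth = [0, 0, 0, 255]
quality_db = psnr(original_depth, decoded_depth)
```

A higher PSNR at the same bitrate, or the same PSNR at a lower bitrate, indicates a more efficient coding method.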