6 research outputs found
Contributions to the solution of the rate-distortion optimization problem in video coding
In the last two decades, we have witnessed significant changes in the demand for video codecs. The diversity of services has increased significantly, high definition (HD) and beyond-HD resolutions have become a reality, video traffic from mobile devices and tablets is growing, video-on-demand services now play a prominent role, and so on. All of these advances have converged to demand more powerful standard video codecs, the most recent being H.264/Advanced Video Coding (H.264/AVC) and the latest High Efficiency Video Coding (HEVC), both generated by the Joint Collaborative Team on Video Coding (JCT-VC), a partnership of the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG).
These two standards (and many others, starting with ITU-T H.261) rely on a hybrid model known as the Differential Pulse Code Modulation (DPCM)/Discrete Cosine Transform (DCT) hybrid video coder, which involves a motion estimation and compensation phase followed by transformation and quantization stages and an entropy coder. Moreover, each of these main subsystems is made up of a number of interdependent, parametric modules that can be adapted to the particular video content.
The main problem arising from this approach is how to choose, as well as possible, the combination of the different parametrizations that achieves the most efficient coding of the current content. To solve this problem, one of the proposed solutions (and the one adopted in both the H.264/AVC and the HEVC reference encoder implementations) is the process referred to as rate-distortion optimization. It chooses a parametrization of the encoder by minimizing a cost function that considers the trade-off between rate and distortion, weighted by a Lagrange multiplier (λ). This multiplier has been obtained empirically for both reference encoder implementations, aiming to provide a robust solution for a variety of video contents.
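As a rough illustration of the cost minimization described above, the following sketch picks, among several candidate parametrizations, the one that minimizes J = D + λR. The candidate names, the distortion and rate figures, and the λ values are invented for illustration; they are not taken from the thesis or from any reference encoder.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One hypothetical encoder parametrization with its two cost terms."""
    name: str
    distortion: float  # e.g. sum of squared errors of the reconstructed block
    rate: float        # bits needed to code the block with this choice


def rd_cost(c: Candidate, lam: float) -> float:
    """Lagrangian cost J = D + lambda * R."""
    return c.distortion + lam * c.rate


def best_candidate(candidates, lam):
    """Return the candidate with the smallest Lagrangian cost."""
    return min(candidates, key=lambda c: rd_cost(c, lam))


# Illustrative numbers only (assumptions, not measurements).
CANDIDATES = [
    Candidate("skip", distortion=900.0, rate=2.0),
    Candidate("inter_16x16", distortion=400.0, rate=40.0),
    Candidate("intra_4x4", distortion=150.0, rate=120.0),
]

if __name__ == "__main__":
    for lam in (0.5, 5.0, 50.0):
        print(lam, best_candidate(CANDIDATES, lam).name)
```

With these toy numbers, a small λ favors the low-distortion candidate and a large λ favors the low-rate one, which is why the choice of λ matters so much for the final coding efficiency.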
In this PhD thesis, an exhaustive study of the influence of this Lagrangian parameter on different video sequences reveals that some features that appear frequently in video sequences make the adopted λ model (the reference model) ineffective. Furthermore, we have found a notable margin for improvement in the coding efficiency of both coders when a more adequate model is used for the Lagrangian parameter.
Thus, the contributions of this thesis are the following: (i) to prove that the reference Lagrangian model becomes ineffective in certain common situations; and (ii) to propose generalized solutions that improve the robustness of the reference model, for both the H.264/AVC and the HEVC standards, obtaining important improvements in coding efficiency. Both proposals take into account changes in the nature of the content along the video sequence, proposing models that adapt to the video content while minimizing the increase in computational complexity.
Official Doctoral Program in Multimedia and Communications. President: José Prades Nebot. Secretary: Carmen Peláez Moreno. Member: Julián Cabrera Quesad
Improved Method to Select the Lagrange Multiplier for Rate-Distortion Based Motion Estimation in Video Coding
The motion estimation (ME) process used in the H.264/AVC reference software is based on minimizing a cost function that involves two terms (distortion and rate) balanced through a Lagrangian parameter, usually denoted as lambda(motion). In this paper we propose an algorithm to improve the conventional way of estimating lambda(motion) and, consequently, the ME process. First, we show that the conventional estimation of lambda(motion) becomes significantly less accurate when ME-compromising events, which cause the ME process to perform poorly, occur. Second, with the aim of improving the coding efficiency in these cases, an efficient algorithm is proposed that allows the encoder to choose between three different values of lambda(motion) for the Inter 16x16 partition size. To be more precise, for this partition size, the proposed algorithm allows the encoder to additionally test lambda(motion) = 0 and lambda(motion) arbitrarily large, which correspond to the minimum-distortion and minimum-rate solutions, respectively. By testing these two extreme values, the algorithm avoids making large ME errors. The experimental results on video segments exhibiting this type of ME-compromising event reveal an average rate reduction of 2.20% for the same coding quality with respect to the JM15.1 reference software of H.264/AVC. The algorithm has also been compared with a state-of-the-art algorithm called context adaptive Lagrange multiplier. Additionally, two illustrative examples of the subjective performance improvement are provided. This work has been partially supported by the National Grant TEC2011-26807 of the Spanish Ministry of Science and Innovation.
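The idea of testing two extreme values of lambda(motion) alongside the conventional one can be sketched as follows. This is only a minimal illustration under stated assumptions: the candidate list, the numbers, and the `final_cost` callback (standing in for whatever cost the encoder uses to pick among the three tested vectors, which the abstract does not specify) are all hypothetical.

```python
from typing import NamedTuple


class MotionCandidate(NamedTuple):
    mv: tuple          # motion vector (dx, dy)
    distortion: float  # prediction error for this vector, e.g. SAD
    mv_bits: float     # bits needed to code the vector


def search(cands, lam):
    """Conventional ME: minimize distortion + lam * mv_bits."""
    return min(cands, key=lambda c: c.distortion + lam * c.mv_bits)


def min_distortion(cands):
    """lambda(motion) = 0: best prediction, ties broken by fewer bits."""
    return min(cands, key=lambda c: (c.distortion, c.mv_bits))


def min_rate(cands):
    """lambda(motion) arbitrarily large: cheapest vector, ties broken by distortion."""
    return min(cands, key=lambda c: (c.mv_bits, c.distortion))


def choose_motion(cands, lam_motion, final_cost):
    """Test the three vectors and keep the one with the lowest final cost.

    final_cost is an assumed callback; it stands in for the encoder's
    cost after actually coding the block with each tested vector.
    """
    tested = [search(cands, lam_motion), min_distortion(cands), min_rate(cands)]
    return min(tested, key=final_cost)


# Illustrative numbers only (assumptions, not measurements).
CANDS = [
    MotionCandidate((0, 0), 500.0, 1.0),
    MotionCandidate((3, -1), 300.0, 9.0),
    MotionCandidate((12, 7), 120.0, 25.0),
]
# Hypothetical cost of fully coding the block with each vector.
FULL_COST = {(0, 0): 700.0, (3, -1): 650.0, (12, 7): 610.0}

if __name__ == "__main__":
    best = choose_motion(CANDS, 20.0, lambda c: FULL_COST[c.mv])
    print(best.mv)
```

In this toy case the conventional search would settle on (3, -1), while the additional lambda(motion) = 0 test exposes a vector that is cheaper overall once the block is fully coded.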
Colección de prácticas de instrumentación acústica y control de ruido (Collection of Laboratory Exercises on Acoustic Instrumentation and Noise Control)
Bachelor's Degree in Audiovisual Systems Engineering. Course: Acoustic Instrumentation and Noise Control. The course Acoustic Instrumentation and Noise Control is part of the fourth year of the Bachelor's Degree in Audiovisual Systems Engineering at Universidad Carlos III de Madrid. It trains students in how the instrumentation used in most acoustic measurements is designed, and in how to carry out the most common measurements in environmental and building acoustics. The aim of the course is to train qualified professionals who can work as specialist technicians and technical directors in acoustics laboratories. Six laboratory exercises are defined so that students acquire the necessary knowledge
Detector de carcinomas en imágenes de fluorescencia (Carcinoma Detector in Fluorescence Images)
Technical Engineering in Telecommunications, specializing in Sound and Image
Adaptive Lagrange multiplier estimation algorithm in HEVC
The latest High Efficiency Video Coding (HEVC) standard relies on a large number of coding tools from which the encoder should choose for every coding unit. This optimization process is based on the minimization of a Lagrangian cost function that evaluates the distortion produced and the bit-rate needed to encode each coding unit. The value of the Lagrangian parameter lambda, which balances the weight of the rate and distortion terms, is related to the quantization parameter through a model that has been implemented in the HEVC reference software. Nevertheless, in this paper we show that this model can be refined, especially for static background sequences, so that the coding performance of HEVC can be improved by adaptively modifying the relation between lambda and the quantization parameter. Specifically, the proposed method (i) determines whether the background of a sequence is static or not by means of a simple classifier; and (ii) when static, evaluates an exponential regression function to estimate a proper value of the lambda parameter. In so doing, the proposed method becomes content-aware, being able to dynamically act on the lambda parameter. Experiments conducted over a large set of static and dynamic background video sequences prove that the proposed method achieves an average bit-rate saving of -6.72% (-11.07% for static background video sequences) compared with the reference HM16.0 software, notably outperforming the results of a state-of-the-art method. This work has been partially supported by the National Grant TEC2014-53390-P of the Spanish Ministry of Economy and Competitiveness.
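The two-step structure described above (a static-background classifier followed by an exponential regression on lambda) might look roughly like the sketch below. Everything specific here is an assumption made for illustration: the frame-difference feature, the thresholds, the regression coefficients `a` and `b`, the direction of the adjustment, and the simplified QP-to-lambda mapping of the form c · 2^((QP − 12) / 3) with c = 0.85. None of these values come from the paper or from the HM software.

```python
import math


def reference_lambda(qp, c=0.85):
    """Simplified QP-to-lambda mapping, c * 2^((QP - 12) / 3); c is assumed."""
    return c * 2.0 ** ((qp - 12) / 3.0)


def static_ratio(prev, curr, pixel_thr=2):
    """Fraction of pixels whose absolute change between two frames is small.

    prev and curr are equally sized 2-D lists of luma samples.
    """
    total = 0
    still = 0
    for row_p, row_c in zip(prev, curr):
        for p, c in zip(row_p, row_c):
            total += 1
            if abs(p - c) <= pixel_thr:
                still += 1
    return still / total if total else 0.0


def is_static(ratio, ratio_thr=0.9):
    """Hypothetical classifier: static if most pixels barely change."""
    return ratio >= ratio_thr


def adaptive_lambda(qp, ratio, a=1.0, b=1.5, ratio_thr=0.9):
    """Keep the reference lambda for dynamic content; otherwise scale it
    by a placeholder exponential regression a * exp(b * (ratio - ratio_thr))."""
    lam = reference_lambda(qp)
    if not is_static(ratio, ratio_thr):
        return lam
    return lam * a * math.exp(b * (ratio - ratio_thr))


if __name__ == "__main__":
    prev = [[10, 10], [10, 10]]
    curr = [[10, 11], [10, 50]]
    r = static_ratio(prev, curr)
    print(r, adaptive_lambda(30, r), adaptive_lambda(30, 1.0))
```

The point of the sketch is the control flow: dynamic content falls through to the reference model, so only sequences classified as static pay for the extra regression step.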
Two-level sliding-window VBR control algorithm for video on demand streaming
A two-level variable bit rate (VBR) control algorithm for hierarchical video coding, specifically tailored for the new High Efficiency Video Coding (HEVC) standard, is presented here. A long-term level monitors the current bit count along a sliding window of a few seconds, comprising several intra-periods (IPs) and shifted on an IP basis. This long-term view accommodates the naturally occurring rate variations at a slow pace, avoiding the annoying sharp quality changes that commonly appear when non-sliding-window approaches are used. The bit excesses or deficits observed at this level are evenly delivered to a short-term level mechanism that establishes target bit budgets for a narrower sliding window covering a single IP and shifting on a frame basis. At this level, an adequate quantization parameter is estimated to comply with the designated target bit rate. Recommended test conditions, as well as two video sequences a few minutes long with scene cuts, have been used to assess the proposed VBR controller. Comparisons with a state-of-the-art rate control algorithm have produced good results in terms of quality consistency, in exchange for moderate rate-distortion performance losses. This work has been partially supported by the National Grant TEC2011-26807 of the Spanish Ministry of Economy and Competitiveness.
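The interplay between the two sliding windows can be sketched with a toy bit-budget tracker. This is a simplified reading of the description above, not the published controller: the class and method names are hypothetical, the way the excess or deficit is spread (evenly over one long-term window of IPs) is an assumption, and the quantization parameter estimation step is left out entirely.

```python
from collections import deque


class TwoLevelBudget:
    """Toy two-level bit budget: long-term window over IPs, short-term over frames."""

    def __init__(self, target_bps, fps, frames_per_ip, ips_in_window):
        self.frames_per_ip = frames_per_ip
        self.ip_target = target_bps / fps * frames_per_ip
        # Long-term window: bits spent in the most recent intra-periods.
        self.ip_history = deque(maxlen=ips_in_window)
        # Short-term window: the frames that share one IP-long window
        # with the frame about to be coded.
        self.frame_history = deque(maxlen=frames_per_ip - 1)
        self.ip_correction = 0.0

    def end_of_ip(self, bits_spent):
        """Shift the long-term window by one IP and update the correction."""
        self.ip_history.append(bits_spent)
        surplus = self.ip_target * len(self.ip_history) - sum(self.ip_history)
        # Spread the surplus (a deficit if negative) evenly over the window.
        self.ip_correction = surplus / self.ip_history.maxlen

    def end_of_frame(self, bits_spent):
        """Shift the short-term window by one frame."""
        self.frame_history.append(bits_spent)

    def next_frame_target(self):
        """Bit target for the next frame inside the one-IP sliding window."""
        ip_budget = self.ip_target + self.ip_correction
        remaining = ip_budget - sum(self.frame_history)
        open_slots = self.frames_per_ip - len(self.frame_history)
        return max(remaining / open_slots, 0.0)


if __name__ == "__main__":
    ctrl = TwoLevelBudget(target_bps=1_000_000, fps=25,
                          frames_per_ip=25, ips_in_window=4)
    print(ctrl.next_frame_target())
    ctrl.end_of_ip(1_200_000)   # the last IP overspent by 200 000 bits
    print(ctrl.next_frame_target())
```

Because the correction is spread over several IPs rather than applied at once, an overspent IP lowers the following frame targets only slightly, which is the slow-paced behavior the long-term level is meant to provide.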