86 research outputs found

    Algorithms for complexity management in video coding

    Nowadays, applications based on video services are becoming very popular, e.g., the transmission of video sequences over the Internet or mobile networks, or the increasingly common use of High Definition (HD) video signals in television or Blu-ray systems. Owing to this popularity, video coding has become an essential tool for transmitting and storing digital video sequences. The standardization organizations have developed several video coding standards, the most recent being H.264/AVC and HEVC. Both standards achieve excellent compression of the video signal by virtue of a set of spatio-temporal predictive techniques. Nevertheless, the efficacy of these techniques comes at the cost of a considerable increase in the computational cost of the video coding process. Owing to the high complexity of these standards, a variety of algorithms have been developed that attempt to control the computational burden of video coding. The goal of these algorithms is to control the coder complexity, using a specific amount of coding resources while keeping the coding efficiency as high as possible. In this PhD thesis, we propose two algorithms devoted to controlling the complexity of the H.264/AVC and HEVC standards. Relying on the statistical properties of video sequences, we demonstrate that the developed methods are able to control the computational burden while avoiding significant losses in coding efficiency. Moreover, our proposals are designed to adapt their behavior to the video content, as well as to different target complexities. The proposed methods have been thoroughly tested and compared with other state-of-the-art proposals for a variety of video resolutions, video sequences, and coding configurations. The results show that our methods outperform other approaches and that they are suitable for practical implementations of coding standards, where computational complexity is a key factor in a proper system design.
    Doctoral Program in Multimedia and Communications. Committee: President: Fernando Jaureguizar Núñez; Secretary: Iván González Díaz; Member: Javier Ruiz Hidalgo.

    Driver distraction detection using machine vision techniques

    This article presents a system for detecting states of distraction in vehicle drivers during daylight hours using machine vision techniques. The system is based on segmenting the eyes and mouth in images of a person viewed from the front by a camera. From this segmentation, motion states of the mouth and head are established, which allow a state of distraction to be inferred. The images are extracted from short videos with a resolution of 640x480 pixels, to which image processing techniques such as color space transformation and histogram analysis are applied. The decision on the driver's state is produced by a multilayer perceptron neural network that takes the extracted features as inputs. The detection performance achieved is 90% in a controlled test environment and 86% in a real environment, with an average response time of 30 ms.
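    As a rough illustration of this pipeline's final stage, the sketch below (Python, scikit-learn) feeds a hand-crafted feature vector to a multilayer perceptron. The feature names and data are hypothetical placeholders, not the authors' actual features or code.

```python
# Minimal sketch (not the authors' system): classifying driver state from
# hand-crafted eye/mouth/head features with a multilayer perceptron.
import numpy as np
from sklearn.neural_network import MLPClassifier

# Hypothetical per-frame features extracted from the segmented face:
# [eye_openness, mouth_openness, head_yaw, head_pitch, mouth_motion, head_motion]
X_train = np.random.rand(200, 6)            # placeholder training features
y_train = np.random.randint(0, 2, 200)      # 0 = attentive, 1 = distracted

clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=0)
clf.fit(X_train, y_train)

frame_features = np.random.rand(1, 6)       # features from a new video frame
print("distracted" if clf.predict(frame_features)[0] else "attentive")
```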

    Standard compliant flicker reduction method with PSNR loss control

    Proceedings: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2013). Vancouver, Canada, May 26-31, 2013.
    Flicker is a common video coding artifact that occurs especially at low and medium bit rates. In this paper we propose a temporal filter-based method to reduce flicker. The proposed method has been designed to be compliant with conventional video coding standards, i.e., to generate a bitstream that is decodable by any standard decoder implementation. The aim of the proposed method is to make the luminance changes between consecutive frames smoother on a block-by-block basis. To this end, a selective temporal low-pass filter is proposed that smooths these luminance changes on flicker-prone blocks. Furthermore, since the low-pass filtering can introduce a noticeable blurring effect, an adaptive algorithm that limits the PSNR loss, and thus the blur, has also been designed. The proposed method has been extensively assessed on the reference software of the H.264/AVC video coding standard and compared to a state-of-the-art method. The experimental results show the effectiveness of the proposed method and prove that its performance is superior to that of the state-of-the-art method.
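    To make the block-by-block temporal filtering idea concrete, here is a minimal Python/NumPy sketch. It assumes a crude mean-luminance flicker test and a fixed blending strength; the paper's actual detector and adaptive strength control are not reproduced.

```python
# Minimal sketch of selective temporal low-pass filtering: block by block,
# pull the luminance of the current frame toward the previous reconstructed
# frame on blocks judged flicker-prone. Detector and alpha are placeholders.
import numpy as np

def filter_frame(cur_y, prev_rec_y, block=16, alpha=0.5, thresh=3.0):
    """cur_y, prev_rec_y: 2-D luminance arrays of equal size."""
    out = cur_y.astype(np.float32).copy()
    h, w = cur_y.shape
    for by in range(0, h, block):
        for bx in range(0, w, block):
            cur_blk = out[by:by+block, bx:bx+block]
            prev_blk = prev_rec_y[by:by+block, bx:bx+block].astype(np.float32)
            # Crude flicker-proneness test: a small mean luminance change
            # suggests a static region where flicker would be visible.
            if abs(cur_blk.mean() - prev_blk.mean()) < thresh:
                # Low-pass in time: blend toward the previous reconstruction.
                out[by:by+block, bx:bx+block] = (
                    alpha * cur_blk + (1.0 - alpha) * prev_blk)
    return np.clip(out, 0, 255).astype(np.uint8)

# Toy usage on two nearly identical random frames.
y0 = np.random.randint(0, 256, (64, 64)).astype(np.uint8)
y1 = np.clip(y0 + np.random.randint(-2, 3, (64, 64)), 0, 255).astype(np.uint8)
filtered = filter_frame(y1, y0)
```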

    Mode Decision-Based Algorithm for Complexity Control in H.264/AVC

    The latest H.264/AVC video coding standard achieves high compression rates in exchange for high computational complexity. Nowadays, however, many application scenarios require the encoder to meet certain complexity constraints. This paper proposes a novel complexity control method that relies on a hypothesis test and can handle time-variant content and target complexities. Specifically, it is based on a binary hypothesis test that decides, on a macroblock basis, whether to use a low- or a high-complexity coding model. Gaussian statistics are assumed so that the probability density functions involved in the hypothesis test can be easily adapted. The decision threshold is also adapted according to the deviation between the actual and the target complexities. The proposed method is implemented on the H.264/AVC reference software JM10.2 and compared with a state-of-the-art method. Our experimental results prove that the proposed method achieves a better trade-off between complexity control and coding efficiency. Furthermore, it leads to a lower deviation from the target complexity. This work has been partially supported by the National Grant TEC2011-26807 of the Spanish Ministry of Science and Innovation.
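    The following sketch illustrates the kind of binary Gaussian hypothesis test with a complexity-driven threshold that the abstract describes. The Gaussian parameters, the macroblock feature, and the threshold rule are illustrative assumptions, not the paper's values.

```python
# Illustrative binary Gaussian hypothesis test: a macroblock feature x
# (e.g., a low-complexity-mode cost) is tested against two Gaussian models;
# the decision threshold is biased by how far the complexity spent so far
# deviates from the target.
import math

def gaussian_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def use_high_complexity(x, lowc=(10.0, 3.0), highc=(25.0, 6.0),
                        spent=1.0, target=1.0):
    # Likelihood ratio: p(x | needs high complexity) / p(x | low suffices).
    lr = gaussian_pdf(x, *highc) / max(gaussian_pdf(x, *lowc), 1e-12)
    # Adaptive threshold: when over budget (spent > target), demand stronger
    # evidence before choosing the high-complexity coding model.
    threshold = spent / target
    return lr > threshold

print(use_high_complexity(16.0, spent=0.9, target=1.0))  # True: under budget
print(use_high_complexity(16.0, spent=1.5, target=1.0))  # False: over budget
```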

    Bayesian adaptive algorithm for fast coding unit decision in the High Efficiency Video Coding (HEVC) standard

    The latest High Efficiency Video Coding (HEVC) standard provides a set of new coding tools that achieve significantly higher coding efficiency than previous standards. In this standard, the pixels are first grouped into Coding Units (CU), then Prediction Units (PU), and finally Transform Units (TU). All these coding levels are organized into a quadtree-shaped arrangement that allows highly flexible data representation; however, they involve very high computational complexity. In this paper, we propose an effective early CU depth decision algorithm to reduce the encoder complexity. Our proposal follows a hierarchical approach in which a hypothesis test is designed to make a decision at every CU depth, where the algorithm either produces an early termination or decides to evaluate the subsequent depth level. Moreover, the proposed method adaptively estimates the parameters that define each hypothesis test, so that its behavior adapts to the varying content of the video sequences. The proposed method has been extensively tested, and the experimental results show that our proposal outperforms several state-of-the-art methods, achieving a significant reduction in computational complexity (36.5% and 38.2% average reductions in coding time for two different encoder configurations) in exchange for very slight losses in coding performance (1.7% and 0.8% average bit rate increments). This work has been partially supported by the National Grant TEC2014-53390-P of the Spanish Ministry of Economy and Competitiveness.
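    A conceptual sketch of such a hierarchical early-termination loop is given below. The per-depth thresholds and the cost curve are made up for illustration, whereas the actual method estimates its test parameters adaptively from the sequence.

```python
# Conceptual early-termination scheme over CU depths: at each depth the RD
# cost of the current partition is compared against a per-depth threshold;
# if the cost is low enough, deeper quadtree splits are skipped.
def decide_cu_depth(rd_cost_at_depth, thresholds, max_depth=3):
    """rd_cost_at_depth: callable depth -> RD cost of encoding at that depth.
    thresholds: per-depth early-termination thresholds (adapted online in
    the real method; fixed placeholders here)."""
    for depth in range(max_depth + 1):
        cost = rd_cost_at_depth(depth)
        # Early termination: cost is small enough that splitting further
        # is unlikely to pay off, so stop at this depth.
        if depth == max_depth or cost < thresholds[depth]:
            return depth
    return max_depth

# Toy usage with a made-up cost curve and thresholds.
costs = {0: 900.0, 1: 420.0, 2: 150.0, 3: 140.0}
print(decide_cu_depth(lambda d: costs[d], thresholds=[300, 200, 160, 0]))  # 2
```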

    Standard-Compliant Low-Pass Temporal Filter to Reduce the Perceived Flicker Artifact

    Flicker is a common video-compression-related temporal artifact. It occurs when co-located regions of consecutive frames are not encoded in a consistent manner, especially when Intra frames are periodically inserted at low and medium bit rates. In this paper we propose a flicker reduction method that aims to make the luminance changes between pixels in the same area of consecutive frames less noticeable. To this end, a temporal low-pass filter is proposed that smooths these luminance changes on a block-by-block basis. The proposed method has several advantages compared with other state-of-the-art methods. It has been designed to be compliant with conventional video coding standards, i.e., to generate a bitstream that is decodable by any standard decoder implementation. The filter strength is estimated on-the-fly to limit the PSNR loss and thus the appearance of a noticeable blurring effect. The proposed method has been implemented on the H.264/AVC reference software and thoroughly assessed in comparison to two state-of-the-art methods. The flicker reduction achieved by the proposed method (calculated using an objective measurement) is notably higher than that of the compared methods: 18.78% versus 5.32% and 31.96% versus 8.34%, in exchange for slight losses in coding efficiency. In terms of subjective quality, the proposed method is rated more than twice as highly as the compared methods. This work has been partially supported by the National Grant TEC2011-26807 of the Spanish Ministry of Science and Innovation.
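    As a worked illustration of capping the PSNR loss, the sketch below derives the weakest blending strength that keeps the extra distortion of a block within a budget. The derivation and the 45 dB floor are illustrative assumptions, not the paper's on-the-fly estimator.

```python
# Bounding filter strength to cap PSNR loss: for the blend
# out = a*cur + (1-a)*prev, the extra MSE relative to the unfiltered block
# is (1-a)^2 * mean((cur-prev)^2), so we can solve for the smallest `a`
# (strongest filtering) that stays within an MSE budget.
import numpy as np

def strength_for_max_mse(cur_blk, prev_blk, max_extra_mse):
    """Smallest blending weight `a` (a=1 means no filtering) whose extra
    distortion stays within max_extra_mse."""
    d = np.mean((cur_blk.astype(np.float64) - prev_blk.astype(np.float64)) ** 2)
    if d <= max_extra_mse:
        return 0.0  # even the strongest filter stays within the budget
    # Solve (1-a)^2 * d <= max_extra_mse for the smallest admissible a.
    return 1.0 - float(np.sqrt(max_extra_mse / d))

# Cap the extra distortion at the MSE corresponding to a 45 dB PSNR floor.
max_mse = 255.0 ** 2 / 10 ** (45.0 / 10)
cur = np.random.randint(0, 256, (16, 16))
prev = np.clip(cur + np.random.randint(-8, 9, (16, 16)), 0, 255)
print(strength_for_max_mse(cur, prev, max_mse))
```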

    Improved Method to Select the Lagrange Multiplier for Rate-Distortion Based Motion Estimation in Video Coding

    The motion estimation (ME) process used in the H.264/AVC reference software is based on minimizing a cost function that involves two terms (distortion and rate) balanced through a Lagrangian parameter, usually denoted λ_motion. In this paper we propose an algorithm to improve the conventional way of estimating λ_motion and, consequently, the ME process. First, we show that the conventional estimation of λ_motion turns out to be significantly less accurate when ME-compromising events, which make the ME process perform poorly, occur. Second, with the aim of improving the coding efficiency in these cases, an efficient algorithm is proposed that allows the encoder to choose among three different values of λ_motion for the Inter 16x16 partition size. More precisely, for this partition size, the proposed algorithm allows the encoder to additionally test λ_motion = 0 and λ_motion arbitrarily large, which correspond to the minimum-distortion and minimum-rate solutions, respectively. By testing these two extreme values, the algorithm avoids making large ME errors. The experimental results on video segments exhibiting this type of ME-compromising event reveal an average rate reduction of 2.20% for the same coding quality with respect to the JM15.1 reference software of H.264/AVC. The algorithm has also been tested in comparison with a state-of-the-art algorithm called the context-adaptive Lagrange multiplier. Additionally, two illustrative examples of the subjective performance improvement are provided. This work has been partially supported by the National Grant TEC2011-26807 of the Spanish Ministry of Science and Innovation.
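    The following toy example works through the cost J = D + λ_motion · R for three candidate motion vectors, showing how λ_motion = 0, a conventional value, and an arbitrarily large value select the minimum-distortion, balanced, and minimum-rate solutions, respectively. All candidate costs are invented for illustration.

```python
# Rate-constrained ME cost: for each candidate motion vector,
# J = D + lambda_motion * R, where D is the prediction distortion
# (e.g., SAD) and R the bits needed to code the vector.
candidates = [
    {"mv": (0, 0),  "D": 5200, "R": 2},    # cheap vector, poor prediction
    {"mv": (3, -1), "D": 3100, "R": 9},    # balanced candidate
    {"mv": (14, 7), "D": 2900, "R": 22},   # accurate but expensive vector
]

def best_mv(lambda_motion):
    return min(candidates, key=lambda c: c["D"] + lambda_motion * c["R"])

# lambda = 0 picks minimum distortion, a conventional lambda picks the
# balanced candidate, and an arbitrarily large lambda picks minimum rate.
for lam in (0.0, 40.0, 1e9):
    print(lam, best_mv(lam)["mv"])
```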

    Predicción de radiación solar mediante deep belief network

    The continued development of computational tools makes it possible to carry out processes with greater efficiency, accuracy, and precision. Among these tools is the Deep Belief Network (DBN), a neural architecture designed to support the development of prediction techniques that help study the behavior of natural phenomena, such as solar radiation. This paper presents the results obtained when using the DBN architecture for solar radiation prediction, implemented in C# with Visual Studio. It discusses the depth of this architecture, how the number of layers and of neurons per layer affects training, and the results obtained when predicting the target values for 2014, with errors close to 2% and faster training than conventional neural training methods, whose errors are around 5% and which require long training periods.
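    For readers unfamiliar with the architecture, the following minimal NumPy sketch shows generic DBN-style pretraining with stacked Restricted Boltzmann Machines trained by one-step contrastive divergence (CD-1). Layer sizes and data are placeholders; this is a generic illustration, not the authors' C# implementation.

```python
# Generic DBN pretraining sketch: greedily train a stack of RBMs with CD-1;
# each layer's hidden activations become the next layer's input.
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

def train_rbm(data, n_hidden, epochs=50, lr=0.05):
    n_visible = data.shape[1]
    W = 0.01 * rng.standard_normal((n_visible, n_hidden))
    b_v, b_h = np.zeros(n_visible), np.zeros(n_hidden)
    for _ in range(epochs):
        v0 = data
        h0 = sigmoid(v0 @ W + b_h)                      # up-pass
        v1 = sigmoid(h0 @ W.T + b_v)                    # reconstruction
        h1 = sigmoid(v1 @ W + b_h)                      # second up-pass
        W += lr * (v0.T @ h0 - v1.T @ h1) / len(data)   # CD-1 update
        b_v += lr * (v0 - v1).mean(axis=0)
        b_h += lr * (h0 - h1).mean(axis=0)
    return W, b_h

# Placeholder data: past irradiance samples scaled to [0, 1],
# e.g., 24 hourly values per day.
X = rng.random((500, 24))
layers, inp, weights = [24, 16, 8], X, []
for n_hid in layers[1:]:
    W, b_h = train_rbm(inp, n_hid)
    weights.append((W, b_h))
    inp = sigmoid(inp @ W + b_h)  # hidden activations feed the next RBM
# `weights` would then initialize a feed-forward net fine-tuned for regression.
```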

    Motion control of a humanoid robot through machine vision and human motion replica

    This paper presents the development and implementation of an anthropomorphic motion capture system using machine vision techniques based on the Kinect device, with the aim of achieving imitative motion control of a Bioloid robotic agent in the Virtual Applications Group (GAV) of the Mechatronics Engineering Program at the Universidad Militar Nueva Granada (UMNG). Given the many degrees of freedom of a human arm, the goal is a simplified control interface that allows its movements to be replicated by a humanoid robot. We present the techniques used to improve the precision of the data delivered by the Kinect, as well as the custom methods for encoding and transmitting the commands sent to the robot. The result is a system that meets the basic requirements of stability, accuracy, and speed of imitation response.
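    A simple sketch of the joint-to-servo mapping idea is shown below: three Kinect skeleton joints yield an elbow angle, which is scaled to a 10-bit servo position range. The coordinates and the servo range are assumptions; the authors' custom transmission and encoding protocol is not reproduced.

```python
# Turn three 3-D skeleton joints into an elbow angle and scale it to a
# Bioloid-style servo position. Joint coordinates are placeholders.
import numpy as np

def joint_angle(a, b, c):
    """Angle at joint b (radians) formed by 3-D points a-b-c."""
    u, v = np.asarray(a) - np.asarray(b), np.asarray(c) - np.asarray(b)
    cosang = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return np.arccos(np.clip(cosang, -1.0, 1.0))

def to_servo_units(angle_rad, lo=0, hi=1023):
    """Map [0, pi] to a typical 10-bit Dynamixel-like position range."""
    return int(round(lo + (angle_rad / np.pi) * (hi - lo)))

shoulder, elbow, wrist = (0.0, 0.4, 0.0), (0.0, 0.1, 0.05), (0.2, 0.1, 0.1)
angle = joint_angle(shoulder, elbow, wrist)
print(np.degrees(angle), to_servo_units(angle))
```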