    Two-Pass Rate Control for Improved Quality of Experience in UHDTV Delivery

    Studying Rate Control Methods for UHDTV Delivery Using HEVC

    Since the early video coding standardisation efforts, rate control has been considered essential for almost any application, and has therefore been extensively studied. With the advent of improved video coding standards, such as the current state-of-the-art High Efficiency Video Coding (HEVC) standard, and the introduction of advanced flexible coding tools, previous Rate-Distortion (RD) models used for rate control have become obsolete. To address this issue, some rate control methods have recently been proposed specifically for HEVC which introduce many useful features, such as a robust correspondence between the rate and the Lagrange multiplier. However, when these rate control methods were applied to sequences in the new Ultra High Definition Television (UHDTV) format, degraded coding performance was observed. In this paper, the state-of-the-art HEVC rate control method is analysed and two directions for its improvement are evaluated. These improvements target frame-level bit allocation and model parameter initialisation. When compared to the rate control method implemented in the HEVC reference software, these improvements result in reduced BD-rate losses of 3.1% and 2.1%, versus the 8.8% provided by the reference algorithm. Moreover, the proposed improvements increase the accuracy in hitting the target bit-rate.
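
The correspondence between rate and Lagrange multiplier mentioned above can be illustrated with a minimal sketch. It assumes the hyperbolic R-lambda form, lambda = alpha * bpp^beta, and the initial constants commonly cited for the HEVC reference software rate control; none of these values appear in the abstract itself, and the function names are hypothetical.

```python
import math

# Assumed initial model parameters, as commonly cited for the R-lambda
# rate control in the HEVC reference software (not stated in the abstract).
ALPHA_INIT = 3.2003
BETA_INIT = -1.367


def target_bpp(bitrate_bps, fps, width, height):
    """Average bits per pixel implied by a target bitrate."""
    return bitrate_bps / (fps * width * height)


def lambda_from_bpp(bpp, alpha=ALPHA_INIT, beta=BETA_INIT):
    """Map bits per pixel to a Lagrange multiplier: lambda = alpha * bpp**beta."""
    if bpp <= 0:
        raise ValueError("bpp must be positive")
    return alpha * bpp ** beta


def qp_from_lambda(lam):
    """Map lambda to a QP with the commonly cited log-linear fit, clipped to 0..51."""
    qp = 4.2005 * math.log(lam) + 13.7122
    return max(0, min(51, round(qp)))


# Example: a 3840x2160, 50 fps sequence at a 20 Mbit/s target.
bpp = target_bpp(20e6, 50, 3840, 2160)
lam = lambda_from_bpp(bpp)
qp = qp_from_lambda(lam)
```

Because beta is negative, a smaller bit budget per pixel yields a larger lambda and hence a coarser quantiser. Parameter initialisation, one of the two improvement directions named in the abstract, would correspond to choosing `alpha` and `beta` differently from these defaults.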

    Comparison of compression efficiency between HEVC/H.265 and VP9 based on subjective assessments

    The increasing effort of broadcast providers to transmit UHD (Ultra High Definition) content is likely to increase demand for ultra high definition televisions (UHDTVs). To compress UHDTV content, several alternative encoding mechanisms exist. In addition to internationally recognized standards, open access proprietary options, such as the VP9 video encoding scheme, have recently appeared and are gaining popularity. One of the main goals of these encoders is to efficiently compress video sequences beyond HDTV resolution for various scenarios, such as broadcasting or internet streaming. In this paper, a rate-distortion performance analysis in a broadcast scenario is presented, comparing one of the latest video coding standards, H.265/HEVC, with the recently released proprietary video coding scheme VP9. In addition, H.264/AVC, currently one of the most popular and widespread encoders, has been included in the evaluation to serve as a comparison baseline. The comparison is performed by means of subjective evaluations showing actual differences between encoding algorithms in terms of perceived quality. The results indicate a dominance of the HEVC-based encoding algorithm over the other alternatives when a wide range of bit-rates is considered, from very low to high bit-rates, corresponding to low quality up to transparent quality relative to the original uncompressed video. In addition, VP9 shows competitive results for synthetic content and for bit-rates that correspond to operating points for transparent or close to transparent quality video.

    Adaptive Streaming: From Bitrate Maximization to Rate-Distortion Optimization

    The fundamental conflict between the increasing consumer demand for better Quality-of-Experience (QoE) and the limited supply of network resources has become a significant challenge for modern video delivery systems. State-of-the-art adaptive bitrate (ABR) streaming algorithms are dedicated to draining the available bandwidth in the hope of improving viewers' QoE, resulting in inefficient use of network resources. In this thesis, we develop an alternative design paradigm, namely rate-distortion optimized streaming (RDOS), to balance the contrasting demands of video consumers and service providers. Distinct from the traditional bitrate maximization paradigm, RDOS must operate at any given point along the rate-distortion curve, as specified by a trade-off parameter. The new paradigm has found plausible explanations in information theory, economics, and visual perception. To instantiate the new philosophy, we decompose adaptive streaming algorithms into three mutually independent components: a throughput predictor, a reward function, and a bitrate selector. We provide a unified framework to understand the connections among all existing ABR algorithms. The new perspective also illustrates the fundamental limitations of each algorithm by examining its underlying assumptions. Based on these insights, we propose novel improvements to each of the three functional components. To alleviate a series of unrealistic assumptions behind bitrate-based QoE models, we develop a theoretically grounded objective QoE model. The new objective QoE model combines the information from subject-rated streaming videos and prior knowledge about the human visual system (HVS) in a principled way. By analyzing a corpus of psychophysical experiments, we show that the QoE function estimation can be formulated as a projection onto convex sets problem. The proposed model presents strong generalization capability over a broad range of source contents, video encoders, and viewing conditions. 
Most importantly, the QoE model disentangles bitrate from quality, making it an ideal component in the RDOS framework. In contrast to the existing throughput estimators that approximate the marginal probability distribution over all connections, we optimize the throughput predictor conditioned on each client. Although there is a lack of training data for each Internet Protocol connection, we can leverage the latest advances in meta learning to incorporate the knowledge embedded in similar tasks. With a deliberately designed objective function, the algorithm learns to identify similar structures among different network characteristics from millions of realistic throughput traces. During the test phase, the model can quickly adapt to connection-level network characteristics using only a small amount of training data from novel streaming video clients and a small number of gradient steps. The enormous space of streaming videos, constantly progressing encoding schemes, and great diversity of throughput characteristics make it extremely challenging for modern data-driven bitrate selectors that are trained with limited samples to generalize well. To this end, we propose a Bayesian bitrate selection algorithm that adaptively fuses an online, robust, and short-term optimal controller with an offline, susceptible, and long-term optimal planner. Depending on the reliability of the two controllers in certain system states, the algorithm dynamically prioritizes one of the two decision rules to obtain the optimal decision. To faithfully evaluate the performance of RDOS, we construct a large-scale streaming video dataset, the Waterloo Streaming Video database. It contains a wide variety of high quality source contents, encoders, encoding profiles, realistic throughput traces, and viewing devices. Extensive objective evaluation demonstrates that the proposed algorithm can deliver QoE identical to that of state-of-the-art ABR algorithms at a much lower cost. 
The improvement is also supported by the largest subjective video quality assessment experiment conducted so far.
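
The idea of operating at a point on the rate-distortion curve set by a trade-off parameter can be sketched as follows. This is an illustrative toy selector, not the Bayesian algorithm of the thesis: the bitrate ladder, the 0..100 quality scale, and the function name `select_rendition` are all assumptions made for the example.

```python
def select_rendition(ladder, throughput_kbps, trade_off):
    """Pick the rendition minimising distortion + trade_off * bitrate.

    ladder: list of (bitrate_kbps, quality) pairs, quality on an assumed 0..100 scale.
    Only renditions that fit the predicted throughput are considered; if none
    fit, the lowest-bitrate rendition is returned as a fallback.
    """
    feasible = [r for r in ladder if r[0] <= throughput_kbps]
    if not feasible:
        return min(ladder, key=lambda r: r[0])
    # Distortion is taken as the gap to perfect quality (an assumption).
    return min(feasible, key=lambda r: (100.0 - r[1]) + trade_off * r[0])


# Hypothetical ladder: quality saturates as bitrate grows.
LADDER = [(500, 60), (1500, 80), (3000, 90), (6000, 94)]

greedy = select_rendition(LADDER, 10000, 0.0)      # bitrate maximisation
balanced = select_rendition(LADDER, 10000, 0.002)  # moderate rate penalty
frugal = select_rendition(LADDER, 10000, 0.01)     # strong rate penalty
```

With a trade-off of zero the selector reduces to the bitrate maximization behaviour the abstract criticises; larger values move the operating point down the curve, giving up a few quality points for a large saving in bits.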

    Layered Division Multiplexing With Multi-Radio-Frequency Channel Technologies

    (c) 2015 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works.
    The Advanced Television Systems Committee (ATSC) is to release the next-generation U.S. digital terrestrial television standard, known as ATSC 3.0. Layered division multiplexing (LDM) is one of the new physical layer technologies included in the standard, which enables the efficient provision of mobile and fixed services by superposing two independent signals with different power levels. ATSC 3.0 has also adopted a novel transmission technique known as channel bonding (CB), which splits the data of a service into two sub-streams that are modulated and transmitted over two radio-frequency (RF) channels. This paper investigates the potential use cases, implementation aspects, and performance advantages of combining LDM with CB and also with the multi-RF channel technology time frequency slicing (TFS), introduced in digital video broadcasting - terrestrial second generation (DVB-T2) (as an informative annex) and digital video broadcasting - next generation handheld (DVB-NGH), which allows distributing the data of a service across two or more RF channels by means of time slicing and frequency hopping.
    Parts of this paper have been published in the Proceedings of the IEEE International Symposium on Broadband Multimedia Systems and Broadcasting, Ghent, Belgium, in 2015. This work was supported by the ICT Research and Development Program of MSIP/IITP. [R0101-15-294, Development of Service and Transmission Technology for Convergent Realistic Broadcast.]
    Garro Crevillén, E.; Gimenez Gandia, JJ.; Park, SI.; Gómez Barquero, D. (2016). Layered Division Multiplexing With Multi-Radio-Frequency Channel Technologies. IEEE Transactions on Broadcasting. 62(2):365-374. doi:10.1109/TBC.2015.2492474
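
The superposition of two signals at different power levels can be illustrated with a short sketch. It assumes unit-power input symbols, an attenuation of the lower layer given in dB (called `injection_level_db` here), and renormalisation of the sum to unit power; the abstract states only that two independent signals are superposed, so these details and the function names are assumptions.

```python
import math


def ldm_combine(core, enhanced, injection_level_db):
    """Superpose an attenuated enhanced-layer stream onto a core-layer stream.

    core, enhanced: equal-length sequences of complex symbols (assumed unit power).
    injection_level_db: how far, in dB, the enhanced layer sits below the core.
    The sum is scaled so the combined signal keeps unit average power.
    """
    if len(core) != len(enhanced):
        raise ValueError("layers must have the same number of symbols")
    alpha = 10 ** (-injection_level_db / 20)  # amplitude scaling of enhanced layer
    beta = 1 / math.sqrt(1 + alpha ** 2)      # power normalisation
    return [beta * (c + alpha * e) for c, e in zip(core, enhanced)]


def power_split(injection_level_db):
    """Fractions of total transmit power given to the (core, enhanced) layers."""
    alpha_sq = 10 ** (-injection_level_db / 10)
    return 1 / (1 + alpha_sq), alpha_sq / (1 + alpha_sq)


core_share, enhanced_share = power_split(5.0)
combined = ldm_combine([1 + 0j], [1 + 0j], 0.0)
```

A larger injection level leaves more power for the robust core layer, which would suit mobile reception, at the expense of the higher-capacity enhanced layer intended for fixed reception.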

    Improved Spectrum Usage with Multi-RF Channel Aggregation Technologies for the Next-Generation Terrestrial Broadcasting

    Next-generation terrestrial broadcasting aims to enhance spectral efficiency in order to overcome the challenges derived from the spectrum shortage that results from the progressive allocation of frequencies - the so-called Digital Dividend - to satisfy the growing demand for wireless broadband capacity. Advances in both transmission standards and video coding are paramount to enable the progressive roll-out of high video quality services such as HDTV (High Definition Television) or Ultra HDTV. The transition to the second generation European terrestrial standard DVB-T2 and the introduction of MPEG-4/AVC video coding already enable the transmission of 4-5 HDTV services per RF (Radio Frequency) channel. However, the impossibility of allocating a higher bit-rate within the remaining spectrum could jeopardize the evolution of the DTT platforms in favour of other high-capacity systems such as the satellite or cable distribution platforms. The next steps focus on the deployment of the recently released High Efficiency Video Coding (HEVC) standard, which provides more than 50% coding gain with respect to AVC, together with the next-generation terrestrial standards. This could ensure the competitiveness of DTT. This dissertation addresses the use of multi-RF channel aggregation technologies to increase the spectral efficiency of future DTT networks. At the core of the thesis are two technologies: Time Frequency Slicing (TFS) and Channel Bonding (CB). TFS and CB consist of transmitting the data of a TV service across multiple RF channels instead of using a single channel. CB spreads the data of a service over multiple classical RF channels (RF-Mux). TFS spreads the data by time-slicing (slot-by-slot) across multiple RF channels, which are sequentially recovered at the receiver by frequency hopping. Transmissions using these features can benefit from capacity and coverage gains. 
The capacity gain comes from a more efficient statistical multiplexing (StatMux) of Variable Bit Rate (VBR) services, due to a StatMux pool spanning a higher number of services. Furthermore, CB allows the service data rate to increase with the number of bonded RF channels and offers advantages when combined with SVC (Scalable Video Coding). The coverage gain comes from the increased RF performance due to the reception of the data of a service from different RF channels rather than from a single one that could, eventually, be degraded. Robustness against interference is also improved, since the received signal does not depend on a single, potentially interfered RF channel. TFS was first introduced as an informative (not normative) annex in DVB-T2 and adopted in DVB-NGH (Next Generation Handheld). TFS and CB are proposed for inclusion in ATSC 3.0. However, they have never been implemented. The investigations carried out in this dissertation employ an information-theoretical approach to obtain their upper bounds, physical layer simulations to evaluate the performance in real systems, and the analysis of field measurements that approach the realistic conditions of network deployments. The analyses report coverage gains of about 4-5 dB with 4 RF channels and high capacity gains already with 2 RF channels. This dissertation also focuses on implementation aspects. Channel bonding receivers require one tuner per bonded RF channel. The implementation of TFS with a single tuner demands the fulfilment of several timing requirements. 
However, the use of just two tuners would still allow for good performance with a cost-effective implementation, by reusing existing chipsets or sharing existing architectures with dual tuner operation such as MIMO (Multiple Input Multiple Output).
    Giménez Gandia, JJ. (2015). Improved Spectrum Usage with Multi-RF Channel Aggregation Technologies for the Next-Generation Terrestrial Broadcasting [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/52520

    Study of Compression Statistics and Prediction of Rate-Distortion Curves for Video Texture

    Encoding textural content remains a challenge for current standardised video codecs. It is therefore beneficial to understand video textures in terms of both their spatio-temporal characteristics and their encoding statistics in order to optimize encoding performance. In this paper, we analyse the spatio-temporal features and statistics of video textures, explore the rate-quality performance of different texture types and investigate models to mathematically describe them. For all considered theoretical models, we employ machine-learning regression to predict the rate-quality curves based solely on selected spatio-temporal features extracted from uncompressed content. All experiments were performed on homogeneous video textures to ensure validity of the observations. The results of the regression indicate that using an exponential model we can more accurately predict the expected rate-quality curve (with a mean Bjøntegaard Delta rate of 0.46% over the considered dataset) while maintaining a low relative complexity. This is expected to be adopted by in-the-loop processes for faster encoding decisions such as rate-distortion optimisation, adaptive quantization, partitioning, etc.
    Comment: 17 pages

    Algorithms for compression of high dynamic range images and video

    Recent advances in sensor and display technologies have brought about High Dynamic Range (HDR) imaging capability. Modern multiple exposure HDR sensors can achieve a dynamic range of 100-120 dB, and LED and OLED display devices have contrast ratios of 10^5:1 to 10^6:1. Despite these advances in technology, image/video compression algorithms and the associated hardware are still based on Standard Dynamic Range (SDR) technology, i.e. they operate within an effective dynamic range of up to 70 dB for 8 bit gamma corrected images. Furthermore, the existing infrastructure for content distribution is also designed for SDR, which creates interoperability problems with true HDR capture and display equipment. The current solutions to this problem include tone mapping the HDR content to fit SDR. However, this approach leads to image quality problems when strong dynamic range compression is applied. Even though some HDR-only solutions have been proposed in the literature, they are not interoperable with the current SDR infrastructure and are thus typically used in closed systems. Given the above observations, a research gap was identified in the need for efficient algorithms for the compression of still images and video which are capable of storing the full dynamic range and colour gamut of HDR images and are at the same time backward compatible with the existing SDR infrastructure. To improve the usability of SDR content, it is vital that any such algorithms accommodate different tone mapping operators, including those that are spatially non-uniform. In the course of the research presented in this thesis, a novel two layer CODEC architecture is introduced for both HDR image and video coding. Furthermore, a universal and computationally efficient approximation of the tone mapping operator is developed and presented. 
It is shown that the use of perceptually uniform colourspaces for the internal representation of pixel data enables improved compression efficiency of the algorithms. Furthermore, the proposed novel approaches to the compression of metadata for the tone mapping operator are shown to improve compression performance for low bitrate video content. Multiple compression algorithms are designed, implemented and compared, and quality-complexity trade-offs are identified. Finally, practical aspects of implementing the developed algorithms are explored by automating the design space exploration flow and integrating the high level systems design framework with domain specific tools for synthesis and simulation of multiprocessor systems. Directions for further work are also presented.
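
The decibel figures and contrast ratios quoted in this abstract can be related by a short worked conversion. The sketch assumes the 20·log10 convention usual for image sensors, which the abstract does not state explicitly; the function names are hypothetical.

```python
import math


def contrast_ratio_to_db(ratio):
    """Dynamic range in dB for a contrast ratio of ratio:1 (20*log10 convention)."""
    return 20 * math.log10(ratio)


def db_to_contrast_ratio(db):
    """Inverse conversion: contrast ratio (as ratio:1) for a dynamic range in dB."""
    return 10 ** (db / 20)


low_end = contrast_ratio_to_db(1e5)   # display contrast of 10^5:1
high_end = contrast_ratio_to_db(1e6)  # display contrast of 10^6:1
```

Under this convention, the quoted display contrast ratios of 10^5:1 to 10^6:1 span the same 100-120 dB range as the multiple exposure HDR sensors.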