13 research outputs found

Análise do HEVC escalável: desempenho e controlo de débito (Analysis of scalable HEVC: performance and rate control)

Author: Santiago Alexandre José Batista
Publication venue: Universidade de Aveiro
Publication date: 19/12/2016

Master's in Electronics and Telecommunications Engineering.

This dissertation presents a study of the High Efficiency Video Coding (HEVC) standard and its scalable extension, SHVC. SHVC performs better when encoding several layers simultaneously than an HEVC encoder used in a simulcast configuration. The reference encoders for both the base layer and the enhancement layer use the same rate control model, the R-λ model, which was optimized for HEVC. No optimization of bit rate allocation between layers has so far been proposed for the SHVC test model (SHM 8). We derived a new R-λ model for the enhancement layer in the spatial scalability case, which led to a BD-rate gain of 1.81% and a BD-PSNR gain of 0.025 relative to the existing rate-distortion model of the SHM. However, we also show in this dissertation that the proposed R-λ model should be used neither in the base layer of SHVC nor, consequently, in HEVC.

Repositório Institucional da Universidade de Aveiro
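
For readers unfamiliar with the R-λ model named in the abstract above, the sketch below shows the hyperbolic form usually associated with HEVC rate control, λ = α · bpp^β, followed by a logarithmic mapping from λ to QP. The constants are assumptions based on values commonly quoted for the HM reference software, not figures from this dissertation, and all function names are hypothetical. The enhancement-layer model that the dissertation derives is not reproduced here.

```python
import math

# Assumed initial parameters, commonly quoted for HM rate control.
# They are NOT taken from the dissertation listed above.
ALPHA = 3.2003
BETA = -1.367


def bits_per_pixel(target_bitrate_bps, frame_rate, width, height):
    """Average bit budget per pixel for a given target bit rate."""
    return target_bitrate_bps / (frame_rate * width * height)


def lambda_from_bpp(bpp, alpha=ALPHA, beta=BETA):
    """R-lambda model: lambda = alpha * bpp**beta, with bpp in bits per pixel."""
    if bpp <= 0:
        raise ValueError("bpp must be positive")
    return alpha * bpp ** beta


def qp_from_lambda(lam):
    """Assumed lambda-to-QP mapping, rounded and clipped to the range 0..51."""
    qp = 4.2005 * math.log(lam) + 13.7122
    return max(0, min(51, round(qp)))


if __name__ == "__main__":
    bpp = bits_per_pixel(1_000_000, 30, 1920, 1080)
    lam = lambda_from_bpp(bpp)
    print(f"bpp={bpp:.5f}, lambda={lam:.1f}, QP={qp_from_lambda(lam)}")
```

Because β is negative, a smaller bit budget per pixel gives a larger λ and therefore a coarser quantizer, which is the behaviour a rate controller relies on.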

Guided Transcoding for Next-Generation Video Coding (HEVC)

Author: Nordgren Harald
Publication venue: Lunds universitet/Institutionen för datavetenskap
Publication date: 01/01/2016

Video content is the dominant traffic type on mobile networks today, and its share is only expected to increase. In this thesis we investigate ways of reducing bit rates for adaptive streaming applications in the latest video coding standard, H.265 / High Efficiency Video Coding (HEVC). The current models for offering different-resolution versions of video content dynamically, so-called adaptive streaming, require either large amounts of storage capacity, where full encodings of the material are kept at all times, or extremely high computational power in order to regenerate content on demand. Guided transcoding aims at finding a middle ground where we can store and transmit less data, at full or near-full quality, while still keeping computational complexity low. This is achieved by shifting the computationally heavy operations to a preprocessing step in which so-called side-information is generated. The side-information can then be used to quickly reconstruct sequences on demand, even when running on generic, non-specialized hardware. Two methods for generating side-information, pruning and deflation, are compared on a varied set of standardized HEVC test sequences, and the respective upsides and downsides of each method are discussed.

By discarding some information from a compressed video and then recreating the sequence in real time, we can reduce the storage needed for adaptive video streaming by 20–30%, with picture quality either fully preserved or only slightly degraded.

Adaptive streaming: Streaming is a popular way of sending video over the internet in which a sequence is split into short segments that are sent continuously to the user. These segments can be sent at varying quality, and a model in which we automatically sense the network load and dynamically adapt the quality is called adaptive streaming. This system is used by SVT Play, TV4 Play and YouTube.

HD or Ultra HD video must be compressed before it can be sent over a network; otherwise it simply takes up too much space. Video encoded with the latest compression standard, HEVC/H.265, becomes up to 700 times smaller with minimal loss of picture quality. A ten-second segment that takes 1.5 GB to send in raw form can then be compressed to just over 2 MB. To offer the viewer a video sequence (a film or a TV programme) at varying quality, different encodings of the material are created. In general we cannot change the quality of a sequence after the fact, since transcoding even a short HD video takes hours, so for adaptive streaming to work in practice all versions are generated in advance and stored. This, however, requires a lot of storage space.

Guided transcoding: Guided transcoding offers a way to reduce the storage requirement by discarding some information and recreating it when needed at a later stage. We do this for every lower-quality sequence but keep the highest-quality version as it is. A truncated low-quality video together with the highest-quality video can then be used to recreate the sequence exactly. This process is very fast compared with ordinary transcoding, so we can generate video encodings of varying quality at short notice. We have examined two methods for removing and recreating video information: pruning and deflation. The former causes small degradations in picture quality and saves close to 30% of storage space. The latter has no effect on picture quality but saves only just over 20%.
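
The size figures quoted in the summary above (a ten-second segment of 1.5 GB in raw form, made up to 700 times smaller by HEVC/H.265) can be checked with a line of arithmetic. The sketch below is only that check; the function name is hypothetical and decimal units (1 GB = 1000 MB) are assumed.

```python
def compressed_size_mb(raw_size_gb, compression_ratio):
    """Compressed size in MB, assuming decimal units (1 GB = 1000 MB)."""
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    return raw_size_gb * 1000.0 / compression_ratio


if __name__ == "__main__":
    # Figures from the summary: 1.5 GB raw, up to 700 times smaller.
    print(f"{compressed_size_mb(1.5, 700):.2f} MB")  # prints "2.14 MB"
```

The result, about 2.14 MB, agrees with the "just over 2 MB" stated in the summary.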

Efficient algorithms for scalable video coding

Author: Lu Xin (Researcher in Computer science)

A scalable video bitstream specifically designed for the needs of various client terminals, network conditions, and user demands is much desired in current and future video transmission and storage systems. The scalable extension of the H.264/AVC standard (SVC) has been developed to satisfy the new challenges posed by heterogeneous environments, as it permits a single video stream to be decoded fully or partially with variable quality, resolution, and frame rate in order to adapt to a specific application. This thesis presents novel improved algorithms for SVC, including: 1) a fast inter-frame and inter-layer coding mode selection algorithm based on motion activity; 2) a hierarchical fast mode selection algorithm; 3) a two-part Rate Distortion (RD) model targeting the properties of different prediction modes for the SVC rate control scheme; and 4) an optimised Mean Absolute Difference (MAD) prediction model. The proposed fast inter-frame and inter-layer mode selection algorithm is based on the empirical observation that a macroblock (MB) with slow movement is more likely to be best matched by one in the same resolution layer. However, for a macroblock with fast movement, motion estimation between layers is required. Simulation results show that the algorithm can reduce the encoding time by up to 40%, with negligible degradation in RD performance. The proposed hierarchical fast mode selection scheme comprises four levels and makes full use of inter-layer, temporal and spatial correlation as well as the texture information of each macroblock. Overall, the new technique demonstrates the same coding performance in terms of picture quality and compression ratio as that of the SVC standard, yet produces a saving in encoding time of up to 84%. Compared with state-of-the-art SVC fast mode selection algorithms, the proposed algorithm achieves a superior computational time reduction under very similar RD performance conditions.
The existing SVC rate distortion model cannot accurately represent the RD properties of the prediction modes, because it is influenced by the use of inter-layer prediction. A separate RD model for inter-layer prediction coding in the enhancement layer(s) is therefore introduced. Overall, the proposed algorithms improve the average PSNR by up to 0.34 dB or produce an average saving in bit rate of up to 7.78%. Furthermore, the control accuracy is maintained to within 0.07% on average. As a MAD prediction error always exists and cannot be avoided, an optimised MAD prediction model for the spatial enhancement layers is proposed that considers the MAD from previous temporal frames and previous spatial frames together, to achieve a more accurate MAD prediction. Simulation results indicate that the proposed MAD prediction model reduces the MAD prediction error by up to 79% compared with the JVT-W043 implementation.

Warwick Research Archives Portal Repository
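
The MAD referred to in the abstract above is the mean absolute difference between a block and its prediction. The sketch below computes it for blocks given as lists of rows, then blends a temporal and a spatial MAD with a single weight. The blending function, its name, and the weight are illustrative assumptions only; the abstract does not give the thesis's actual prediction model, and this is not it.

```python
def mean_absolute_difference(block, prediction):
    """Mean absolute difference between two equally sized 2-D blocks."""
    if len(block) != len(prediction):
        raise ValueError("blocks must have the same number of rows")
    total = 0
    count = 0
    for row_a, row_b in zip(block, prediction):
        if len(row_a) != len(row_b):
            raise ValueError("rows must have the same length")
        for a, b in zip(row_a, row_b):
            total += abs(a - b)
            count += 1
    if count == 0:
        raise ValueError("blocks must not be empty")
    return total / count


def blended_mad_prediction(temporal_mad, spatial_mad, weight=0.5):
    """Hypothetical predictor: weighted mix of temporal and spatial MAD."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * temporal_mad + (1.0 - weight) * spatial_mad


if __name__ == "__main__":
    current = [[1, 2], [3, 4]]
    predicted = [[1, 1], [1, 1]]
    print(mean_absolute_difference(current, predicted))  # prints 1.5
```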

High-Level Synthesis Based VLSI Architectures for Video Coding

Author: Ahmad Waqar
Publication venue: Politecnico di Torino
Publication date: 01/01/2017

High Efficiency Video Coding (HEVC) is the state-of-the-art video coding standard. Emerging applications such as free-viewpoint video, 360-degree video, augmented reality and 3D movies require standardized extensions of HEVC. These extensions include HEVC Scalable Video Coding (SHVC), HEVC Multiview Video Coding (MV-HEVC), MV-HEVC + Depth (3D-HEVC) and HEVC Screen Content Coding. 3D-HEVC is used for applications such as view synthesis generation and free-viewpoint video. The depth maps coded and transmitted in 3D-HEVC are used for virtual view synthesis by algorithms such as Depth Image Based Rendering (DIBR). As a first step, we profiled the 3D-HEVC standard and identified its computationally intensive parts for efficient hardware implementation. One of the computationally intensive parts of 3D-HEVC, HEVC and H.264/AVC is the interpolation filtering used for Fractional Motion Estimation (FME). The hardware implementation of the interpolation filtering is carried out using High-Level Synthesis (HLS) tools. Xilinx Vivado Design Suite is used for the HLS implementation of the interpolation filters of HEVC and H.264/AVC. The complexity of digital systems has greatly increased. High-Level Synthesis is a methodology that offers great benefits: late architectural or functional changes can be made without time-consuming rewriting of RTL code, algorithms can be tested and evaluated early in the design cycle, and accurate models can be developed against which the final hardware can be verified.

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

Efficient HEVC-based video adaptation using transcoding

Author: Pham Van Luong
Publication venue: Ghent University. Faculty of Engineering and Architecture
Publication date: 01/01/2017

In a video transmission system, it is important to take into account the great diversity of network and end-user constraints. On the one hand, video content is typically streamed over a network that is characterized by different bandwidth capacities. In many cases, the bandwidth is insufficient to transfer the video at its original quality. On the other hand, a single video is often played on multiple devices such as PCs, laptops, and cell phones. Obviously, a single video would not satisfy their different constraints. This diversity in network and device capacities leads to the need for video adaptation techniques, e.g., a reduction of the bit rate or spatial resolution. Video transcoding, which modifies a property of the video without changing the coding format, is well known as an efficient adaptation solution. However, this approach comes with a high computational complexity, resulting in huge energy consumption in the network and possibly network latency. This presentation provides several optimization strategies for the transcoding process of HEVC (the latest High Efficiency Video Coding standard) video streams. First, the computational complexity of a bit rate transcoder (transrater) is reduced. We propose several techniques to speed up the encoder of a transrater, notably a machine-learning-based approach and a novel coding-mode evaluation strategy. Moreover, the motion estimation process of the encoder has been optimized with the use of decision theory and the proposed fast search patterns. Second, the issues and challenges of a spatial transcoder have been addressed by using machine-learning algorithms. Thanks to their great performance, the proposed techniques are expected to significantly help HEVC gain popularity in a wide range of modern multimedia applications.

Ghent University Academic Bibliography

Deep Video Compression

Author: Ma Di
Publication date: 24/06/2021

Explore Bristol Research

Motion hints based video coding

Author: Ahmmed Ashek
Publication venue: UNSW, Sydney
Publication date: 01/01/2015

The persistent growth of video-based applications is heavily dependent on the advancements in video coding systems. Modern video codecs use the motion model itself to describe the geometric boundaries of moving objects in video sequences and thereby spend a significant portion of their bit rate refining the motion description in regions where motion discontinuities exist. This explicit communication of motion introduces redundancy, since some aspects of the motion can at least partially be inferred from the reference frames. In this thesis, a novel bi-directional motion hints based prediction paradigm is proposed that moves away from the traditional redundant approach of careful partitioning around object boundaries by exploiting the spatial structure of the reference frames to infer appropriate boundaries for the intermediate ones. Motion hints provide a global description of motion over a specific domain. Fundamentally, this is related to the segmentation of foreground from background regions, where the foreground and background motions are the motion hints. The appeal of motion hints is that they are continuous and invertible, even though the observed motion field for a frame is discontinuous and non-invertible. Experimental results show that in low bit rate applications, the motion hints based coder achieved a rate-distortion (RD) gain of 0.81 dB, or equivalently 13.38% savings in bit rate, over the H.264/AVC reference. In a hybrid setting, this gain increased to 0.94 dB and a bit rate saving of 20.41% was obtained. If both low and high bit rate scenarios are considered, the hybrid coder showed an RD gain of 0.80 dB, or equivalently 16.57% savings in bit rate. Using motion hints with higher fractional-pixel accuracy, predictive coding of the motion hints, and a memory-based initialization for motion hint estimation improved the RD gain to 0.85 dB and the bit rate saving to 17.55%.
The prediction framework is highly flexible in the sense that the motion model order for the hints can be content adaptive, i.e. it can accommodate different motion models such as affine and elastic. Detecting motion discontinuity macroblocks (MBs) is a challenging task, and the prediction paradigm managed to detect a significant number of such MBs. If motion hints based prediction is used as a prediction mode for MBs, at low bit rates almost 50% of the motion discontinuity MBs chose the affine hint mode, and this number increased to 60% when the elastic hint was used.

UNSWorks

Error control strategies in H.265|HEVC video transmission

Author: Alfaqheri Taha Tareq
Publication venue: Brunel University London
Publication date: 01/01/2019

This thesis was submitted for the award of Doctor of Philosophy and was awarded by Brunel University London.

With the rapid development of video coding technologies over the last decade, high-resolution video delivery still suffers from packet loss caused by unreliable transmission channels with time-varying characteristics. Error resilience approaches at the channel coding level are less efficient to implement in real-time video transmission because the encoded video samples have variable code lengths. Therefore, error resilience in the video coding standard plays a vital role in reducing the effect of error propagation and improving the perceived visual quality. The main work in this thesis is to develop an efficient error resilience mechanism for the H.265|HEVC video coding standard to reduce the effects of error propagation in error-prone conditions. Two error resilience algorithms are proposed. The first is the Adaptive Slice Encoding (ASE) error resilience algorithm. The concept of this algorithm is to extract and protect the most active slices in the coded bitstream based on an adaptive search window. This algorithm can be applied in low-delay video transmission with and without a feedback channel. It is also designed to be compatible with the reference coding software manual (HM16) for the H.265|HEVC coding standard. The second proposed algorithm is a joint encoder-decoder error resilience algorithm called Error Resilience based on Supplemental Enhancement Information (ERSEI). Feedback from the decoder, based on the status of the decoded picture hash message, notifies the encoder to adaptively start encoding a clean random-access picture. At the same time, the decoder is notified to start the error concealment process whilst waiting to receive correct video data. A recovery point message from the decoder feedback channel is used to update the encoder with error messages.
In this thesis, extensive experimental work, evaluation, and comparison with state-of-the-art related algorithms have been conducted to evaluate the proposed algorithms. Furthermore, the best trade-off between the coding efficiency of the proposed error resilience algorithms and error resilience performance has been considered at the design stage. The experimental evaluation covers both encoding conditions, i.e. error-free and error-prone. The results show significant improvements in Y-PSNR and in the subjective quality of the decoded bitstream when using the proposed algorithms in error-prone conditions with a variety of packet loss rates. Moreover, experimental work is conducted to test the complexity of the algorithms in terms of required execution time at both the encoding and decoding stages. Additionally, the performance of both the H.264|AVC and H.265|HEVC coding standards is evaluated in error-free and error-prone environments. For the ASE algorithm, compared with the improved region of interest (IROI) and region of interest (ROI) algorithms, a significant improvement in visual quality was the most obvious finding at packet loss rates (PLRs) of 2-18%. For the ERSEI algorithm, compared with the default HM16 with pixel copy concealment and motion compensated error concealment (MCEC) techniques, the evaluation results indicate clear visual quality enhancement at PLRs of 1, 2, 6 and 8%.

The Ministry of Higher Education and Scientific Research in Ira

Brunel University Research Archive
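
The decoder feedback described for ERSEI in the abstract above can be pictured as a hash comparison at the decoder. In the sketch below, MD5 from Python's standard `hashlib` stands in for the decoded picture hash. The function names, the message strings, and the use of plain byte strings as pictures are all hypothetical, and the thesis's actual signalling is not reproduced.

```python
import hashlib


def picture_hash(decoded_picture: bytes) -> str:
    """Hex digest of a decoded picture (MD5 is used here as an assumption)."""
    return hashlib.md5(decoded_picture).hexdigest()


def decoder_feedback(decoded_picture: bytes, expected_hash: str) -> str:
    """Return a hypothetical feedback message for the encoder.

    A mismatch means the decoded picture differs from what the encoder
    produced, so the decoder asks for a clean random-access picture.
    """
    if picture_hash(decoded_picture) == expected_hash:
        return "OK"
    return "REQUEST_CLEAN_RANDOM_ACCESS"


if __name__ == "__main__":
    sent = b"picture-0-samples"
    signalled_hash = picture_hash(sent)          # carried alongside the picture
    corrupted = b"picture-0-sampleX"             # simulated transmission error
    print(decoder_feedback(sent, signalled_hash))
    print(decoder_feedback(corrupted, signalled_hash))
```

In this picture the encoder, on receiving the request, would start a new clean random-access picture while the decoder conceals errors in the meantime, as the abstract describes.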

Perceptual video quality and quality of experience for adaptive video streaming

Author: Bampis Christos George
Publication date: 26/01/2021

We live in a world where images and videos dominate our everyday lives. Every day, an enormous amount of video data is shared in social media and consumer applications, while video streaming is becoming a new form of digital entertainment. Large-scale video streaming on demand has become possible thanks to numerous engineering achievements in fields such as video compression, high-speed computation and display technologies. Nevertheless, the skyrocketing needs for bandwidth and network resources consumed by video applications challenge modern video content delivery. Since the available bandwidth resources are limited, streaming service providers have to mediate between operation costs, bandwidth efficiency and maximizing user quality of experience. However, these goals are inherently conflicting and require knowledge of how user quality of experience is affected by network-induced changes in video quality. Being able to understand and predict user quality of experience and to perceptually optimize rate allocation can have significant effects: better network utilization, reduced costs for service providers and improved user satisfaction. The goal of this dissertation is to study and predict user quality of experience in video streaming applications by exploiting perceptual video quality and human behavioral responses to streaming-related video impairments. To this end, I present the details of three large-scale subjective video studies which target video streaming under multiple viewing conditions, such as display device, session duration, content characteristics and network/buffer conditions. By analyzing how humans react to changes in visual quality and streaming video impairments, I also design numerous video quality and quality of experience prediction models that can be used to evaluate the overall and the continuous-time perceived video quality.
Throughout this dissertation, my goal is to perceptually optimize various stages of the video streaming pipeline, such as video encoding and video quality control as well as client-based rate adaptation. Ultimately, I envision that the outcome of this dissertation can be useful for video streaming applications at global scale.

Electrical and Computer Engineerin

Texas ScholarWorks

Visual Content Characterization Based on Encoding Rate-Distortion Analysis

Author: Li Zhuoran
Publication venue: University of Waterloo
Publication date: 26/02/2023

Visual content characterization is a fundamentally important but underexploited step in dataset construction, which is essential in solving many image processing and computer vision problems. In the era of machine learning, this has become ever more important: with the explosion of image and video content, scrutinizing all potential content is impossible, and source content selection has become increasingly difficult. In particular, in the area of image/video coding and quality assessment, it is highly desirable to characterize and select source content and subsequently construct image/video datasets that demonstrate strong representativeness and diversity of the visual world, such that the visual coding and quality assessment methods developed from and validated using such datasets exhibit strong generalizability. Encoding Rate-Distortion (RD) analysis is essential for many multimedia applications. Examples of applications that explicitly use RD analysis include image encoder RD optimization, video quality assessment (VQA), and Quality of Experience (QoE) optimization of streaming videos. However, encoding RD analysis has not been well investigated in the context of visual content characterization. This thesis focuses on applying encoding RD analysis as a visual source content characterization method with image/video coding and quality assessment applications in mind. We first conduct a subjective video quality evaluation experiment for state-of-the-art video encoder performance analysis and comparison, where our observations reveal severe problems that motivate the need for better source content characterization and selection methods. Then the effectiveness of RD analysis in visual source content characterization is demonstrated through a proposed quality control mechanism for video coding by eigen analysis in the space of General Quality Parameter (GQP) functions.
Finally, by combining encoding RD analysis with submodular set function optimization, we propose a novel method for automating the process of representative source content selection, which helps boost the RD performance of visual encoders trained with the selected visual contents.

University of Waterloo's Institutional Repository