1,227 research outputs found
State-of-the-Art and Trends in Scalable Video Compression with Wavelet Based Approaches
Scalable Video Coding (SVC) differs from traditional single-point approaches mainly because it allows several working points, corresponding to different quality, picture size and frame rate, to be encoded in a single bit stream. This work describes the current state of the art in SVC, focusing on wavelet-based motion-compensated approaches (WSVC). It reviews the individual components that have been designed to address the problem over the years and how such components are typically combined to achieve meaningful WSVC architectures. Coding schemes that differ mainly in the space-time order in which the wavelet transforms operate are compared, and the strengths and weaknesses of the resulting implementations are discussed. An evaluation of the achievable coding performance is provided, considering the reference architectures studied and developed by ISO/MPEG in its exploration of WSVC. The paper also attempts to list the major differences between wavelet-based solutions and the SVC standard jointly targeted by ITU and ISO/MPEG. Particular emphasis is given to a promising WSVC solution, named STP-tool, which presents architectural similarities to the SVC standard. The paper ends by outlining some evolution trends for WSVC systems and giving insights into video coding applications that could benefit from a wavelet-based approach. Adami, Nicola; Signoroni, Alberto; Leonardi, Riccard
A Fully Scalable Video Coder with Inter-Scale Wavelet Prediction and Morphological Coding
In this paper a new fully scalable, wavelet-based video coding architecture is proposed, in which motion-compensated temporally filtered subbands of spatially scaled versions of a video sequence can be used as a base layer for inter-scale predictions. These predictions take place between data at the same resolution level, without the need for interpolation. The prediction residuals are further transformed by spatial wavelet decompositions. The resulting multi-scale spatiotemporal wavelet subbands are coded using an embedded morphological dilation technique and context-based arithmetic coding. Dyadic spatio-temporal scalability and progressive SNR scalability are achieved. Multiple adaptation decoding can be easily implemented without knowing a predefined set of operating points. The proposed coding system compensates for some of the typical drawbacks of current wavelet-based scalable video coding architectures and shows interesting visual results even when compared with the single operating point video coding standard AVC/H.264.
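The temporal filtering step mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the simplest case, a one-level Haar transform along the time axis written as lifting steps, with no motion compensation; the function names and the flat-list frame representation are hypothetical and are not taken from the paper.

```python
def haar_temporal_split(frames):
    """One level of Haar temporal filtering via lifting (no motion compensation).

    frames: list with an even number of frames, each a flat list of floats.
    Returns (low, high): temporal low-pass and high-pass subbands.
    """
    if len(frames) % 2 != 0:
        raise ValueError("expected an even number of frames")
    low, high = [], []
    for even, odd in zip(frames[0::2], frames[1::2]):
        # Predict step: the high band is the odd frame minus the even frame.
        h = [o - e for e, o in zip(even, odd)]
        # Update step: the low band is the even frame plus half the residual,
        # which equals the average of the two frames.
        l = [e + 0.5 * d for e, d in zip(even, h)]
        high.append(h)
        low.append(l)
    return low, high


def haar_temporal_merge(low, high):
    """Invert haar_temporal_split by undoing the lifting steps in reverse order."""
    frames = []
    for l, h in zip(low, high):
        even = [a - 0.5 * d for a, d in zip(l, h)]
        odd = [e + d for e, d in zip(even, h)]
        frames.extend([even, odd])
    return frames
```

Decoding only the low band yields the sequence at half the frame rate, which is how a transform of this kind provides dyadic temporal scalability. A motion-compensated version would align the even frame to the odd one before the predict step.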
Highly parallel HEVC decoding for heterogeneous systems with CPU and GPU
The High Efficiency Video Coding (HEVC) standard provides higher compression efficiency than other video coding standards, but at the cost of an increased computational load, which makes it hard to achieve real-time encoding and decoding for ultra-high-resolution, high-quality video sequences. Graphics Processing Units (GPUs) are known to provide massive processing capability for highly parallel and regular computing kernels, but not all HEVC decoding procedures are suited to GPU execution. Furthermore, if HEVC decoding is accelerated by GPUs, energy efficiency is another concern for heterogeneous CPU+GPU decoding. In this paper, a highly parallel HEVC decoder for heterogeneous CPU+GPU systems is proposed. It exploits the available parallelism in HEVC decoding on the CPU, on the GPU, and between the CPU and GPU devices simultaneously. In addition, different workload balancing schemes can be selected according to the CPU and GPU computing resources devoted to decoding. Furthermore, an energy-optimized solution is proposed by tuning GPU clock rates. Results show that the proposed decoder achieves better performance than the state-of-the-art CPU decoder, and that the best-performing workload balancing scheme depends on the available CPU and GPU computing resources. In particular, with an NVIDIA Titan X Maxwell GPU and an Intel Xeon E5-2699v3 CPU, the proposed decoder delivers 167 frames per second (fps) for Ultra HD 4K videos when four CPU cores are used. Compared to the state-of-the-art CPU decoder using four CPU cores, the proposed decoder gains a speedup factor of . When decoding performance is bounded by the CPU, a system-wide energy reduction of up to 36% is achieved by using fixed (and lower) GPU clocks, compared to the default dynamic clock settings on the GPU. EC/H2020/688759/EU/Low-Power Parallel Computing on GPUs 2/LPGPU
Decoder Hardware Architecture for HEVC
This chapter provides an overview of the design challenges faced in the implementation of hardware HEVC decoders. These challenges can be attributed to the larger and diverse coding block sizes and transform sizes, the larger interpolation filter for motion compensation, the increased number of steps in intra prediction and the introduction of a new in-loop filter. Several solutions to address these implementation challenges are discussed. As a reference, results for an HEVC decoder test chip are also presented. Texas Instruments Incorporate
Signal processing for high-definition television
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Mathematics, 1995. Includes bibliographical references (p. 60-62). by Peter Monta. Ph.D
Deep Graph-Convolutional Image Denoising
Non-local self-similarity is well-known to be an effective prior for the
image denoising problem. However, little work has been done to incorporate it
in convolutional neural networks, which surpass non-local model-based methods
despite only exploiting local information. In this paper, we propose a novel
end-to-end trainable neural network architecture employing layers based on
graph convolution operations, thereby creating neurons with non-local receptive
fields. The graph convolution operation generalizes the classic convolution to
arbitrary graphs. In this work, the graph is dynamically computed from
similarities among the hidden features of the network, so that the powerful
representation learning capabilities of the network are exploited to uncover
self-similar patterns. We introduce a lightweight Edge-Conditioned Convolution
which addresses vanishing gradient and over-parameterization issues of this
particular graph convolution. Extensive experiments show state-of-the-art
performance with improved qualitative and quantitative results on both
synthetic Gaussian noise and real noise.
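The idea of computing a graph from feature similarities and aggregating over it can be sketched as follows. This is only an illustration under simplifying assumptions: the graph is a k-nearest-neighbour graph under squared Euclidean distance, and aggregation is a plain average rather than the lightweight Edge-Conditioned Convolution described in the abstract. The function names `knn_graph` and `aggregate` are hypothetical.

```python
def knn_graph(features, k):
    """For each node, return the indices of its k nearest other nodes.

    features: list of feature vectors (lists of floats), one per node.
    Similarity is measured with squared Euclidean distance (an assumption).
    """
    neighbors = []
    for i, fi in enumerate(features):
        dists = []
        for j, fj in enumerate(features):
            if i != j:
                d = sum((a - b) ** 2 for a, b in zip(fi, fj))
                dists.append((d, j))
        dists.sort()
        neighbors.append([j for _, j in dists[:k]])
    return neighbors


def aggregate(features, neighbors):
    """Average each node's feature vector with those of its graph neighbors."""
    out = []
    for i, nbrs in enumerate(neighbors):
        group = [features[i]] + [features[j] for j in nbrs]
        dim = len(features[i])
        out.append([sum(v[c] for v in group) / len(group) for c in range(dim)])
    return out
```

Because neighbors are chosen by feature similarity rather than spatial position, a node can draw on distant but similar patches, which is what gives such a layer a non-local receptive field. A learned layer would replace the plain average with trainable, edge-dependent weights.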
Architectures for Adaptive Low-Power Embedded Multimedia Systems
This Ph.D. thesis describes novel hardware/software architectures for adaptive low-power embedded multimedia systems. Novel techniques for run-time adaptive energy management are proposed, such that both HW and SW adapt together to react to unpredictable scenarios. A complete power-aware H.264 video encoder was developed. Comparison with the state of the art demonstrates significant energy savings while meeting the performance constraint and keeping the video quality degradation unnoticeable.
Low Bit-rate Color Video Compression using Multiwavelets in Three Dimensions
In recent years, wavelet-based video compression has become a major focus of research because of the advantages that it provides. More recently, a growing number of studies have explored the use of multiple scaling functions and multiple wavelets with desirable properties in various fields, from image de-noising to compression. In terms of data compression, multiple scaling functions and wavelets offer greater flexibility in coefficient quantization at high compression ratios than a comparable single wavelet. The purpose of this research is to investigate the possible improvement of scalable wavelet-based color video compression at low bit-rates by using three-dimensional multiwavelets. The first part of this work included the development of the spatio-temporal decomposition process for multiwavelets and the implementation of an efficient 3-D SPIHT encoder/decoder as a common platform for evaluating the performance of two well-known multiwavelet systems against a comparable single wavelet in low bit-rate color video compression. The second part involved the development of a motion-compensated 3-D compression codec and a modified SPIHT algorithm designed specifically for this codec by incorporating an advantage of the 2D SPIHT design into the 3D SPIHT coder. In an experiment that compared their performance, the 3D motion-compensated codec with unmodified 3D SPIHT had gains of 0.3dB to 4.88dB over a regular 2D wavelet-based motion-compensated codec using 2D SPIHT in the coding of 19 endoscopy sequences at 1/40 compression ratio. The effectiveness of the modified SPIHT algorithm was verified by the results of a second experiment, in which it was used to re-encode the 4 of the 19 sequences with the lowest performance gains and improved them by 0.5dB to 1.0dB. The last part of the investigation examined the effect of multiwavelet packets on 3-D video compression, as well as the effects of coding multiwavelet packets based on the frequency order and energy content of individual subbands.
Inspection and evaluation of artifacts in digital video sources
Streaming digital video content providers such as YouTube, Amazon, Hulu, and Netflix collaborate with production teams to obtain new and old video content. These collaborations lead to an accumulation of video sources, some of which might contain unacceptable visual artifacts. Artifacts may inadvertently enter the video master at any point in the production pipeline, due to any of a number of equipment and user failures. Unfortunately, these artifacts are difficult to detect since no pristine reference exists for comparison. As of now, few automated tools exist that can effectively capture the most common forms of these artifacts. This work studies no-reference video source inspection for generalized artifact detection and subjective quality prediction, which will ultimately inform decisions related to the acquisition of new content.
Automatically identifying the locations and severities of video artifacts is a difficult problem. We have developed a general method for detecting local artifacts by learning differences in the statistics between distorted and pristine video frames. Our model, which we call the Video Impairment Mapper (VID-MAP), produces a full-resolution map of artifact detection probabilities based on comparisons of excitatory and inhibitory convolutional responses. Validation on a large database shows that our method outperforms even the previous state-of-the-art distortion-specific detectors.
A variety of powerful picture quality predictors are available that rely on neuro-statistical models of distortion perception. We extend these principles to video source inspection, by coupling spatial divisive normalization with a series of filterbanks tuned for artifact detection, implemented using a common convolutional framework. We developed the Video Impairment Detection by SParse Error CapTure (VIDSPECT) model, which leverages discriminative sparse dictionaries that are tuned to detect specific artifacts. VIDSPECT is simple, highly generalizable, and yields better accuracy than competing methods.
To evaluate the perceived quality of video sources containing artifacts, we built a new digital video database, called the LIVE Video Masters Database, which contains 384 videos affected by the types of artifacts encountered in otherwise pristine digital video sources. We find that VIDSPECT delivers top performance on this database for most artifacts tested, and competitive performance otherwise, using the same basic architecture in all cases. Electrical and Computer Engineerin
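The spatial divisive normalization mentioned in the abstract above can be illustrated with a small sketch. It assumes one common form of the operation, in which each sample is divided by the square root of a constant plus the energy of its local neighborhood, applied here to a 1-D signal for brevity. The window radius, the constant `c`, and the function name are assumptions and do not come from the thesis.

```python
import math


def divisive_normalize(signal, radius=1, c=1.0):
    """Divide each sample by sqrt(c + local energy) over a sliding window.

    signal: list of floats (for example, one row of filter responses).
    radius: half-width of the neighborhood; the window is clipped at the borders.
    c: positive stabilizing constant that avoids division by zero.
    """
    n = len(signal)
    out = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        # Local energy: sum of squared samples in the window around i.
        energy = sum(v * v for v in signal[lo:hi])
        out.append(signal[i] / math.sqrt(c + energy))
    return out
```

The normalized responses depend on local contrast rather than absolute amplitude, which is why this kind of step is often placed before a statistical model of distortion. A 2-D version would use a square or Gaussian-weighted window over each frame.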