14,263 research outputs found
A hybrid low bit-rate video codec using subbands and statistical modeling
A hybrid low bit-rate video codec using subbands and statistical modeling is proposed in this thesis. The redundancy between adjacent video frames is exploited by motion estimation and compensation. The Motion Compensated Frame Difference (MCFD) signals are decomposed into 7 subbands using a 2-D dyadic tree structure and separable filters. Some of the subband signals are statistically modeled using the 2-D AR(1) technique. The model parameters provide a representation of these subbands at the receiver side with a certain level of error. The remaining subbands are compressed with a classical waveform coding technique, namely vector quantization (VQ).
It is shown that statistical modeling is a viable representation approach for the low-correlated subbands of the MCFD signal. The subbands with higher correlation are better represented with waveform coding techniques.
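As an illustration of how a 2-D dyadic tree with separable filters yields 7 subbands, here is a minimal sketch. It assumes Haar (average/difference) filters, which the abstract does not specify, and the names `haar_split_1d`, `split_2d` and `dyadic_7_subbands` are hypothetical. One separable stage gives 4 subbands; splitting only the low-low band again gives 3 + 4 = 7.

```python
def haar_split_1d(seq):
    """Split an even-length sequence into low-pass and high-pass halves (Haar, assumed)."""
    low = [(seq[i] + seq[i + 1]) / 2 for i in range(0, len(seq), 2)]
    high = [(seq[i] - seq[i + 1]) / 2 for i in range(0, len(seq), 2)]
    return low, high


def transpose(matrix):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(col) for col in zip(*matrix)]


def split_columns(matrix):
    """Apply the 1-D split down every column; return (low, high) matrices."""
    lows, highs = [], []
    for col in transpose(matrix):
        lo, hi = haar_split_1d(col)
        lows.append(lo)
        highs.append(hi)
    return transpose(lows), transpose(highs)


def split_2d(img):
    """One separable analysis stage: filter rows, then columns."""
    row_low, row_high = [], []
    for row in img:
        lo, hi = haar_split_1d(row)
        row_low.append(lo)
        row_high.append(hi)
    ll, lh = split_columns(row_low)   # row-low, then column-low / column-high
    hl, hh = split_columns(row_high)  # row-high, then column-low / column-high
    return ll, lh, hl, hh


def dyadic_7_subbands(img):
    """Two-level dyadic tree: only the LL band of level 1 is split again."""
    ll1, lh1, hl1, hh1 = split_2d(img)
    ll2, lh2, hl2, hh2 = split_2d(ll1)
    return {"LL2": ll2, "LH2": lh2, "HL2": hl2, "HH2": hh2,
            "LH1": lh1, "HL1": hl1, "HH1": hh1}
```

The band labels follow one common convention (first letter for the row filter, second for the column filter); other texts swap them. Image dimensions must be divisible by 4 for this two-level sketch.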
PEA265: Perceptual Assessment of Video Compression Artifacts
The most widely used video encoders share a common hybrid coding framework
that includes block-based motion estimation/compensation and block-based
transform coding. Despite their high coding efficiency, the encoded videos
often exhibit visually annoying artifacts, denoted as Perceivable Encoding
Artifacts (PEAs), which significantly degrade the visual Quality-of-Experience
(QoE) of end users. To monitor and improve visual QoE, it is crucial to develop
subjective and objective measures that can identify and quantify various types
of PEAs. In this work, we make the first attempt to build a large-scale
subject-labelled database composed of H.265/HEVC compressed videos containing
various PEAs. The database, namely the PEA265 database, includes 4 types of
spatial PEAs (i.e. blurring, blocking, ringing and color bleeding) and 2 types
of temporal PEAs (i.e. flickering and floating), each containing at least
60,000 image or video patches with positive and negative labels. To objectively
identify these PEAs, we train Convolutional Neural Networks (CNNs) using the
PEA265 database. It appears that the state-of-the-art ResNeXt is capable of
identifying each type of PEA with high accuracy. Furthermore, we define PEA
pattern and PEA intensity measures to quantify PEA levels of compressed video
sequence. We believe that the PEA265 database and our findings will benefit the
future development of video quality assessment methods and perceptually
motivated video encoders. Comment: 10 pages, 15 figures, 4 tables
Review And Comparative Study Of Motion Estimation Techniques To Reduce Complexity
ABSTRACT: Block matching motion estimation is a key component of video compression, and its high computational complexity has made motion estimation a bottleneck in many video applications. Typical applications include HDTV, multimedia communications, video conferencing, etc. Motion estimation is useful for estimating the motion of any object. It has conventionally been used in video encoding, but researchers from various other fields are now turning to motion estimation to solve real-life problems in their respective fields. In this paper, we present a review of block-matching-based motion estimation algorithms and reduced-complexity motion estimation techniques, together with a comparative study across the different algorithms. The aim of this study is also to give the reader a feel for the relative performance of the algorithms, with particular attention to the important trade-off between computational complexity, prediction quality, result quality and other various applications.
Keywords: Fixed size block motion estimation (FSBME), Block-based motion estimation (BMME), Peak-Signal-to-Noise-Ratio (PSNR), Hybrid block matching algorithm (HBMA).
I. INTRODUCTION
Motion compensated transform coding forms the basis of the existing video compression standards H.261/H.262 and MPEG-1/MPEG-2, where the compression algorithm tries to exploit the temporal and spatial redundancies by using some form of motion compensation followed by transform coding, respectively. The key step in removing temporal redundancy is motion estimation, where a motion vector is predicted between the current frame and a reference frame. Following motion estimation, a motion compensation stage is applied to obtain the residual image, i.e. the pixel differences between the current frame and a reference frame. Later this residual is compressed using transform coding or a combination of transform and entropy coding. The above video compression standards employ block motion estimation techniques. The main advantages of FSBME (fixed size block motion estimation) are the simplicity of the algorithm and the fact that no segmentation information needs to be transmitted. In block motion compensated video coding, image frames are first divided into square blocks of fixed size. The next step is to apply a three-step procedure, consisting of Motion Detection, Motion Estimation and Motion Compensation. Motion detection is used for classifying blocks as moving or non-moving based on a predefined distance or similarity measure. This similarity is usually evaluated with the minimum mean square error (MSE) criterion or the minimum sum of absolute differences (SAD) criterion. The output of the motion-estimation algorithm comprises the motion vector for each block, and the pixel value differences between the blocks in the current frame and the "matched" blocks in the reference frame. We call this difference signal the motion compensation error, or simply block error. Many techniques have been proposed for motion estimation in video compression so far. All of these methods pursue one or more of three aims: 1. reducing computational complexity; 2. representing true motion (providing good quality); 3. reducing bit rate (high compression ratio).
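The block-matching procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not an algorithm from the reviewed paper: it uses exhaustive (full) search with the minimum-SAD criterion, plain Python lists for frames, and hypothetical names (`sad`, `full_search`) with an arbitrary block size and search range.

```python
def sad(cur, ref, y, x, ry, rx, block):
    """Sum of absolute differences between a block of cur at (y, x)
    and a candidate block of ref at (ry, rx)."""
    return sum(
        abs(cur[y + i][x + j] - ref[ry + i][rx + j])
        for i in range(block)
        for j in range(block)
    )


def full_search(cur, ref, block=4, search=2):
    """Fixed-size block matching by exhaustive search.

    Returns a dict mapping each block origin (y, x) to its motion vector
    (dy, dx), plus the residual (block error) frame cur - matched ref.
    """
    h, w = len(cur), len(cur[0])
    vectors = {}
    residual = [[0] * w for _ in range(h)]
    for y in range(0, h - block + 1, block):
        for x in range(0, w - block + 1, block):
            best = None  # (cost, dy, dx) of the best candidate so far
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    ry, rx = y + dy, x + dx
                    # Skip candidates that fall outside the reference frame.
                    if ry < 0 or rx < 0 or ry + block > h or rx + block > w:
                        continue
                    cost = sad(cur, ref, y, x, ry, rx, block)
                    if best is None or cost < best[0]:
                        best = (cost, dy, dx)
            _, dy, dx = best
            vectors[(y, x)] = (dy, dx)
            # Motion compensation: subtract the matched reference block.
            for i in range(block):
                for j in range(block):
                    residual[y + i][x + j] = (
                        cur[y + i][x + j] - ref[y + dy + i][x + dx + j]
                    )
    return vectors, residual
```

Full search evaluates (2 * search + 1)^2 candidates per block, which is the computational cost that reduced-complexity search strategies try to cut.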
Prediction error image coding using a modified stochastic vector quantization scheme
The objective of this paper is to provide an efficient yet simple method to encode the prediction error image of video sequences, based on a stochastic vector quantization (SVQ) approach that has been modified to cope with the intrinsically decorrelated nature of the prediction error image of video signals. In the SVQ scheme, the codewords are generated by stochastic techniques instead of being generated from a training set representative of the expected input image, as is normal in VQ. The performance of the scheme is shown for the particular case of segmentation-based video coding, although the technique can also be applied to motion-compensated hybrid coding schemes. Peer Reviewed. Postprint (published version)
Motion-Compensated Coding and Frame-Rate Up-Conversion: Models and Analysis
Block-based motion estimation (ME) and compensation (MC) techniques are
widely used in modern video processing algorithms and compression systems. The
great variety of video applications and devices results in numerous compression
specifications. Specifically, there is a diversity of frame-rates and
bit-rates. In this paper, we study the effect of frame-rate and compression
bit-rate on block-based ME and MC as commonly utilized in inter-frame coding
and frame-rate up-conversion (FRUC). This joint examination yields a
comprehensive foundation for comparing MC procedures in coding and FRUC. First,
the video signal is modeled as a noisy translational motion of an image. Then,
we theoretically model the motion-compensated prediction of available and
absent frames, as in coding and FRUC applications, respectively. The theoretical
MC-prediction error is further analyzed and its autocorrelation function is
calculated for coding and FRUC applications. We show a linear relation between
the variance of the MC-prediction error and temporal-distance. While the
affecting distance in MC-coding is between the predicted and reference frames,
MC-FRUC is affected by the distance between the available frames used for the
interpolation. Moreover, the dependence on temporal distance implies an inverse
effect of the frame-rate. FRUC performance analysis considers the prediction
error variance, since it equals the mean-squared-error of the interpolation.
However, MC-coding analysis requires the entire autocorrelation function of the
error; hence, analytic simplicity is beneficial. Therefore, we propose two
constructions of a separable autocorrelation function for prediction error in
MC-coding. We conclude by comparing our estimations with experimental results.
Low complexity video compression using moving edge detection based on DCT coefficients
In this paper, we propose a new low complexity video compression method based on detecting blocks containing moving edges using only DCT coefficients. The detection, whilst being very efficient, also allows efficient motion estimation by constraining the search process to moving macro-blocks only. The encoder's PSNR is degraded by 2dB compared to H.264/AVC inter for such scenarios, whilst requiring only 5% of the execution time. The computational complexity of our approach is comparable to that of the DISCOVER codec, which is the state of the art in low complexity distributed video coding. The proposed method finds blocks with moving edges and processes only the selected blocks. The approach is particularly suited to surveillance type scenarios with a static camera.
A Convolutional Neural Network Approach for Half-Pel Interpolation in Video Coding
Motion compensation is a fundamental technology in video coding to remove the
temporal redundancy between video frames. To further improve the coding
efficiency, sub-pel motion compensation has been utilized, which requires
interpolation of fractional samples. The video coding standards usually adopt
fixed interpolation filters that are derived from the signal processing theory.
However, as the video signal is not stationary, the fixed interpolation filters may
turn out to be less efficient. Inspired by the great success of convolutional neural
networks (CNNs) in computer vision, we propose to design a CNN-based
interpolation filter (CNNIF) for video coding. Different from previous studies,
one difficulty for training CNNIF is the lack of ground-truth since the
fractional samples are actually not available. Our solution for this problem is
to derive the "ground-truth" of fractional samples by smoothing high-resolution
images, which is verified to be effective by the conducted experiments.
Compared to the fixed half-pel interpolation filter for luma in High Efficiency
Video Coding (HEVC), our proposed CNNIF achieves up to 3.2% and on average 0.9%
BD-rate reduction under low-delay P configuration. Comment: International Symposium on Circuits and Systems (ISCAS) 201
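For context, the fixed half-pel luma interpolation that the CNNIF is compared against can be sketched in one dimension. The 8-tap coefficients below are the HEVC half-sample luma filter. Everything else is a simplifying assumption for illustration: 8-bit samples, edge samples repeated at the borders, a single rounding step, and the hypothetical name `half_pel_row`. A real codec uses intermediate precision and applies the filter separably in two dimensions.

```python
# HEVC 8-tap luma filter for the half-sample position (coefficients sum to 64).
HEVC_HALF_PEL = (-1, 4, -11, 40, 40, -11, 4, -1)


def half_pel_row(row):
    """Interpolate the half-sample values between neighbouring integer samples.

    out[k] estimates the sample halfway between row[k] and row[k + 1],
    using integer samples row[k - 3] .. row[k + 4].
    """
    n = len(row)
    out = []
    for k in range(n - 1):
        acc = 0
        for t, coeff in enumerate(HEVC_HALF_PEL):
            # Clamp the index so border samples are repeated (assumed padding).
            idx = min(max(k - 3 + t, 0), n - 1)
            acc += coeff * row[idx]
        # Normalise by 64 with rounding, then clip to the 8-bit range.
        out.append(min(max((acc + 32) >> 6, 0), 255))
    return out
```

On a linear ramp the symmetric filter reproduces the midpoint, for example the value 35 between samples 30 and 40. The paper's argument is that such a fixed filter cannot adapt to non-stationary content.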