Multi-Frame Quality Enhancement for Compressed Video
The past few years have witnessed great success in applying deep learning to
enhance the quality of compressed image/video. The existing approaches mainly
focus on enhancing the quality of a single frame, ignoring the similarity
between consecutive frames. In this paper, we observe that heavy quality
fluctuation exists across compressed video frames, and thus low quality frames
can be enhanced using the neighboring high quality frames, a task we term
Multi-Frame Quality Enhancement (MFQE). Accordingly, this paper proposes an
MFQE approach for compressed video, as a first attempt in this direction. In
our approach, we first develop a Support Vector Machine (SVM) based detector to locate Peak
Quality Frames (PQFs) in compressed video. Then, a novel Multi-Frame
Convolutional Neural Network (MF-CNN) is designed to enhance the quality of
compressed video, in which the non-PQF and its two nearest PQFs serve as the
input. The MF-CNN compensates motion between the non-PQF and PQFs through the
Motion Compensation subnet (MC-subnet). Subsequently, the Quality Enhancement
subnet (QE-subnet) reduces compression artifacts of the non-PQF with the help
of its nearest PQFs. Finally, the experiments validate the effectiveness and
generality of our MFQE approach in advancing the state-of-the-art quality
enhancement of compressed video. The code of our MFQE approach is available at
https://github.com/ryangBUAA/MFQE.git
Comment: to appear in CVPR 2018
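To make the two-stage design above concrete, here is a minimal PyTorch sketch of the MF-CNN idea: an MC-subnet estimates a dense flow from a PQF to the non-PQF and warps the PQF, then a QE-subnet fuses the non-PQF with its two motion-compensated PQFs. The layer widths and depths are illustrative assumptions, not the paper's exact architecture, and the SVM-based PQF detector is assumed to have already supplied the PQFs.

```python
# Minimal PyTorch sketch (not the authors' code) of the MF-CNN idea for
# single-channel (luma) frames of shape (N, 1, H, W).
import torch
import torch.nn as nn
import torch.nn.functional as F

def warp(frame, flow):
    """Backward-warp `frame` (N,C,H,W) by a dense pixel flow (N,2,H,W)."""
    n, _, h, w = frame.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    base = torch.stack((xs, ys), dim=0).float().to(frame.device)  # (2,H,W)
    coords = base.unsqueeze(0) + flow
    # Normalize coordinates to [-1, 1] as required by grid_sample.
    gx = 2.0 * coords[:, 0] / (w - 1) - 1.0
    gy = 2.0 * coords[:, 1] / (h - 1) - 1.0
    return F.grid_sample(frame, torch.stack((gx, gy), dim=-1),
                         align_corners=True)

class MCSubnet(nn.Module):
    """Estimates a dense flow from a PQF to the non-PQF, then warps the PQF."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(2, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 2, 3, padding=1))

    def forward(self, pqf, non_pqf):
        flow = self.net(torch.cat((pqf, non_pqf), dim=1))
        return warp(pqf, flow)

class QESubnet(nn.Module):
    """Fuses the non-PQF with two motion-compensated PQFs into a residual."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 3, padding=1))

    def forward(self, non_pqf, warped_prev, warped_next):
        x = torch.cat((non_pqf, warped_prev, warped_next), dim=1)
        return non_pqf + self.net(x)  # enhanced non-PQF
```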
Quality-Gated Convolutional LSTM for Enhancing Compressed Video
The past decade has witnessed great success in applying deep learning to
enhance the quality of compressed video. However, the existing approaches aim
at enhancing the quality of a single frame, or use only a fixed set of neighboring
frames. Thus, they fail to take full advantage of the inter-frame correlation in
the video. This paper proposes the Quality-Gated Convolutional Long Short-Term
Memory (QG-ConvLSTM) network with bi-directional recurrent structure to fully
exploit the advantageous information in a large range of frames. More
importantly, due to the obvious quality fluctuation among compressed frames,
higher quality frames can provide more useful information for enhancing other
frames. Therefore, we propose learning the "forget" and "input" gates
in the ConvLSTM cell from quality-related features. As such, frames of
varying quality contribute to the ConvLSTM memory with different importance,
so that the information in each frame is used reasonably and adequately. Finally,
the experiments validate the effectiveness of our QG-ConvLSTM approach in
advancing the state-of-the-art quality enhancement of compressed video, and the
ablation study shows that our QG-ConvLSTM approach learns to trade off
quality against correlation when leveraging multi-frame
information. The project page: https://github.com/ryangchn/QG-ConvLSTM.git
Comment: Accepted to IEEE International Conference on Multimedia and Expo (ICME) 2019
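The quality-gating mechanism can be sketched as a ConvLSTM cell whose "forget" and "input" gates are computed from quality-related features rather than from the usual input/hidden pair, so that low quality frames write less into the memory. This is an illustrative reconstruction under assumed shapes and layer widths, not the released code:

```python
import torch
import torch.nn as nn

class QualityGatedConvLSTMCell(nn.Module):
    """ConvLSTM cell whose forget/input gates depend on quality features."""
    def __init__(self, in_ch, hid_ch, q_ch):
        super().__init__()
        # Candidate memory and output gate from frame + hidden features,
        # as in a standard ConvLSTM.
        self.to_g = nn.Conv2d(in_ch + hid_ch, hid_ch, 3, padding=1)
        self.to_o = nn.Conv2d(in_ch + hid_ch, hid_ch, 3, padding=1)
        # Forget and input gates learned from quality-related features.
        self.to_f = nn.Conv2d(q_ch, hid_ch, 3, padding=1)
        self.to_i = nn.Conv2d(q_ch, hid_ch, 3, padding=1)

    def forward(self, x, q, h, c):
        xh = torch.cat((x, h), dim=1)
        g = torch.tanh(self.to_g(xh))     # candidate memory
        o = torch.sigmoid(self.to_o(xh))  # output gate
        f = torch.sigmoid(self.to_f(q))   # quality-driven forget gate
        i = torch.sigmoid(self.to_i(q))   # quality-driven input gate
        c = f * c + i * g                 # low quality frames write less
        h = o * torch.tanh(c)
        return h, c
```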
Learning for Video Compression with Hierarchical Quality and Recurrent Enhancement
In this paper, we propose a Hierarchical Learned Video Compression (HLVC)
method with three hierarchical quality layers and a recurrent enhancement
network. The frames in the first layer are compressed by an image compression
method with the highest quality. Using these frames as references, we propose
the Bi-Directional Deep Compression (BDDC) network to compress the second layer
with relatively high quality. Then, the third layer frames are compressed with
the lowest quality, by the proposed Single Motion Deep Compression (SMDC)
network, which adopts a single motion map to estimate the motions of multiple
frames, thus saving bits for motion information. In our deep decoder, we
develop the Weighted Recurrent Quality Enhancement (WRQE) network, which takes
both compressed frames and the bit stream as inputs. In the recurrent cell of
WRQE, the memory and update signal are weighted by quality features to
reasonably leverage multi-frame information for enhancement. In our HLVC
approach, the hierarchical quality benefits the coding efficiency, since the
high quality information facilitates the compression and enhancement of low
quality frames at encoder and decoder sides, respectively. Finally, the
experiments validate that our HLVC approach advances the state-of-the-art of
deep video compression methods, and outperforms the "Low-Delay P (LDP) very
fast" mode of x265 in terms of both PSNR and MS-SSIM. The project page is at
https://github.com/RenYang-home/HLVC
Comment: Published in CVPR 2020; corrected a minor typo in the footnote of Table 1; corrected Figure 1
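The three-layer hierarchy can be illustrated with a toy layer-assignment function; the GOP size of 10 and the exact positions of the layer-1 and layer-2 frames below are assumptions for illustration rather than the paper's exact configuration:

```python
# Toy sketch of the hierarchical quality layers: layer 1 frames are
# image-compressed at the highest quality, the midpoint frame of each
# GOP is BDDC-coded at relatively high quality (layer 2), and the rest
# are SMDC-coded at the lowest quality (layer 3).
def layer_of(frame_idx, gop=10):
    """Map a frame index to its hierarchical quality layer (1 = highest)."""
    pos = frame_idx % gop
    if pos == 0:
        return 1  # image-compressed reference frame
    if pos == gop // 2:
        return 2  # bi-directionally compressed from two layer-1 references
    return 3      # low quality frames sharing a single motion map

print([layer_of(i) for i in range(11)])
# [1, 3, 3, 3, 3, 2, 3, 3, 3, 3, 1]
```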
Learned Quality Enhancement via Multi-Frame Priors for HEVC Compliant Low-Delay Applications
Networked video applications, e.g., video conferencing, often suffer from
poor visual quality due to unexpected network fluctuation and limited
bandwidth. In this paper, we have developed a Quality Enhancement Network
(QENet) to reduce video compression artifacts, leveraging spatial priors
generated by multi-scale convolutions and temporal priors derived from warped
temporal predictions in a recurrent fashion. We have integrated this QENet as
a stand-alone post-processing subsystem into a High Efficiency Video Coding
(HEVC) compliant decoder. Experimental results show that our QENet achieves
state-of-the-art performance against the default in-loop filters in HEVC and
other deep learning based methods, with noticeable objective gains in Peak
Signal-to-Noise Ratio (PSNR) and subjective gains in visual quality.
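A hedged sketch of how the spatial and temporal priors might be fused; the dilation rates, channel counts, and residual formulation below are our illustrative assumptions, not the authors' QENet:

```python
import torch
import torch.nn as nn

class MultiScaleFusion(nn.Module):
    """Fuses multi-scale spatial features with a warped temporal prediction."""
    def __init__(self, ch=32):
        super().__init__()
        # Parallel 3x3 convolutions with growing dilation rates serve as a
        # simple stand-in for multi-scale spatial priors.
        self.scales = nn.ModuleList(
            [nn.Conv2d(1, ch, 3, padding=d, dilation=d) for d in (1, 2, 4)])
        # Fuse spatial features with the warped previous enhanced frame
        # (one extra channel) and predict an enhancement residual.
        self.fuse = nn.Sequential(
            nn.Conv2d(3 * ch + 1, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, 1, 3, padding=1))

    def forward(self, decoded, warped_prev):
        feats = torch.cat([torch.relu(s(decoded)) for s in self.scales], dim=1)
        residual = self.fuse(torch.cat((feats, warped_prev), dim=1))
        return decoded + residual  # enhanced frame, fed to the next step
```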
Scalable video transcoding for mobile communications
Mobile multimedia contents have been introduced in the market, and their demand grows every day due to the increasing number of mobile devices and the possibility of watching them at any moment in any place. These multimedia contents are delivered over different networks and displayed on mobile terminals with heterogeneous characteristics. To ensure continuously high quality, it is desirable that this multimedia content can be adapted on-the-fly to the transmission constraints and the characteristics of the mobile devices. In general, video contents are compressed to save storage capacity and to reduce the bandwidth required for their transmission. Therefore, if these video streams were compressed using scalable video coding schemes, they would be able to adapt to those heterogeneous networks and a wide range of terminals. Since the majority of multimedia contents are compressed using H.264/AVC, they cannot benefit from that scalability. This paper proposes a technique to convert an H.264/AVC bitstream without scalability into a scalable bitstream with temporal scalability, as part of a scalable video transcoder for mobile communications. The results show that when our technique is applied, the complexity is reduced by 98% while maintaining coding efficiency.
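The essence of temporal scalability can be shown with a dyadic layer assignment: dropping the highest temporal layers still leaves a decodable, lower frame rate stream. The pattern below is a common hierarchical prediction layout, used here purely for illustration and not taken from the paper's transcoder:

```python
def temporal_layer(frame_idx, num_layers=3):
    """Dyadic temporal layering: layer 0 frames form the base layer."""
    period = 1 << (num_layers - 1)  # e.g. 4 for 3 layers
    for layer in range(num_layers):
        if frame_idx % (period >> layer) == 0:
            return layer

# Keeping only layers 0 and 1 halves the frame rate yet stays decodable.
kept = [i for i in range(16) if temporal_layer(i) <= 1]
print(kept)  # [0, 2, 4, 6, 8, 10, 12, 14]
```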
Video adaptation for mobile digital television
Mobile digital television is one of the new services recently introduced in the market by telecommunications operators. Given the possibilities for personalization and interaction it provides, together with the increasing demand for this type of portable service, it is expected to become a successful technology in the near future. Video contents stored and transmitted over the networks deployed to provide mobile digital television need to be compressed to reduce the resources required. The compression scheme chosen by the great majority of these networks is H.264/AVC. Compressed video bitstreams have to be adapted to heterogeneous networks and a wide range of terminals. To deal with this problem, scalable video coding schemes were proposed and standardized, providing temporal, spatial and quality scalability using layers within the encoded bitstream. Because existing H.264/AVC contents cannot benefit from these scalability tools, efficient techniques for migrating single-layer contents to scalable contents are desirable to support these mobile digital television systems. This paper proposes a technique to convert a single-layer H.264/AVC bitstream into a scalable bitstream with temporal scalability. Applying this approach, a reduction of 60% in coding complexity is achieved while maintaining the coding efficiency.