15 research outputs found

    DMCNN: Dual-Domain Multi-Scale Convolutional Neural Network for Compression Artifacts Removal

    JPEG is one of the most widely used lossy image compression standards. However, JPEG compression inevitably introduces various kinds of artifacts, especially at high compression rates, which can greatly degrade the Quality of Experience (QoE). Recently, convolutional neural network (CNN) based methods have shown excellent performance at removing JPEG artifacts. Many efforts have been made to deepen CNNs and extract deeper features, while relatively few works pay attention to the network's receptive field. In this paper, we show that in many cases the quality of output images can be significantly improved by enlarging the receptive field. Going one step further, we propose a Dual-domain Multi-scale CNN (DMCNN) that takes full advantage of redundancies in both the pixel and DCT domains. Experiments show that DMCNN sets a new state of the art for JPEG artifact removal.
    Comment: To appear in IEEE ICIP 2018.
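
    A minimal sketch of the dual-domain idea, not the published DMCNN architecture: one branch corrects the image in the pixel domain, a second branch works in a learned block-transform ("DCT-like") domain via an 8x8 strided convolution, and both residuals are added back. Layer counts, widths, and module names below are illustrative assumptions.

        import torch
        import torch.nn as nn

        class DualDomainSketch(nn.Module):
            """Toy dual-domain CNN: pixel branch + transform-domain branch, fused.

            An illustrative sketch of the dual-domain idea, not the published
            DMCNN; layer counts and channel widths are placeholders.
            """
            def __init__(self, channels=64):
                super().__init__()
                # Pixel-domain branch: plain convolutional stack predicting a residual.
                self.pixel = nn.Sequential(
                    nn.Conv2d(1, channels, 3, padding=1), nn.ReLU(inplace=True),
                    nn.Conv2d(channels, 1, 3, padding=1),
                )
                # Transform-domain branch: an 8x8 strided conv acts as a learned
                # block transform; a transposed conv maps back to pixel space.
                self.to_freq = nn.Conv2d(1, 64, kernel_size=8, stride=8)
                self.freq = nn.Sequential(
                    nn.Conv2d(64, 64, 1), nn.ReLU(inplace=True), nn.Conv2d(64, 64, 1),
                )
                self.from_freq = nn.ConvTranspose2d(64, 1, kernel_size=8, stride=8)

            def forward(self, x):
                pixel_residual = self.pixel(x)
                freq_residual = self.from_freq(self.freq(self.to_freq(x)))
                return x + pixel_residual + freq_residual  # artifact-corrected image

        y = DualDomainSketch()(torch.randn(1, 1, 64, 64))  # e.g. a 64x64 gray patch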

    Learning Binary Residual Representations for Domain-specific Video Streaming

    We study domain-specific video streaming. Specifically, we target a streaming setting where the videos to be streamed from a server to a client are all in the same domain and must be compressed to a small size for low-latency transmission. Several popular video streaming services, such as the video game streaming services GeForce Now and Twitch, fall into this category. While conventional video compression standards such as H.264 are commonly used for this task, we hypothesize that one can exploit the property that the videos are all in the same domain to achieve better video quality. Based on this hypothesis, we propose a novel video compression pipeline. Specifically, we first apply H.264 to compress domain-specific videos. We then train a novel binary autoencoder to encode the leftover domain-specific residual information frame by frame into binary representations. These binary representations are then compressed and sent to the client together with the H.264 stream. In our experiments, we show that our pipeline yields consistent gains over standard H.264 compression across several benchmark datasets while using the same channel bandwidth.
    Comment: Accepted at AAAI'18. Project website at https://research.nvidia.com/publication/2018-02_Learning-Binary-Residua
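
    The key trick a binary autoencoder relies on is binarizing the bottleneck while keeping training differentiable, commonly done with a straight-through estimator: the forward pass emits ±1, the backward pass copies gradients through. A hedged sketch of that mechanism (the paper's exact encoder/decoder are not reproduced; shapes and bit budget are assumptions):

        import torch
        import torch.nn as nn

        class BinarizeSTE(torch.autograd.Function):
            """Forward: sign(x) in {-1, +1}; backward: straight-through gradient."""
            @staticmethod
            def forward(ctx, x):
                return torch.sign(x)
            @staticmethod
            def backward(ctx, grad_out):
                return grad_out  # pass gradients through the non-differentiable sign

        class BinaryResidualAE(nn.Module):
            """Toy per-frame autoencoder over the H.264 residual (illustrative)."""
            def __init__(self, bits=32):
                super().__init__()
                self.enc = nn.Sequential(
                    nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.ReLU(inplace=True),
                    nn.Conv2d(64, bits, 4, stride=2, padding=1), nn.Tanh(),
                )
                self.dec = nn.Sequential(
                    nn.ConvTranspose2d(bits, 64, 4, stride=2, padding=1), nn.ReLU(inplace=True),
                    nn.ConvTranspose2d(64, 3, 4, stride=2, padding=1),
                )

            def forward(self, residual):
                code = BinarizeSTE.apply(self.enc(residual))  # binary representation
                return self.dec(code)  # reconstructed residual, added back client-side

        recon = BinaryResidualAE()(torch.randn(1, 3, 64, 64))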

    D³: Deep Dual-Domain Based Fast Restoration of JPEG-Compressed Images

    In this paper, we design a Deep Dual-Domain (D³) based fast restoration model to remove artifacts from JPEG-compressed images. It leverages the large learning capacity of deep networks as well as problem-specific expertise that past deep architectures rarely incorporated. For the latter, we take into consideration both prior knowledge of the JPEG compression scheme and the successful practice of sparsity-based dual-domain approaches. We further design a One-Step Sparse Inference (1-SI) module as an efficient, lightweight feed-forward approximation of sparse coding. Extensive experiments verify the superiority of the proposed D³ model over several state-of-the-art methods. Specifically, our best model outperforms the latest deep model by around 1 dB in PSNR and is 30 times faster.
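
    A one-step feed-forward approximation of sparse coding can be pictured as a single unrolled (L)ISTA iteration: one learned linear map followed by a soft-threshold. A hedged sketch of that building block (dimensions and learned parameters here are placeholders, not the paper's):

        import torch
        import torch.nn as nn

        def soft_threshold(x, theta):
            # proximal operator of the L1 norm: shrink magnitudes toward zero by theta
            return torch.sign(x) * torch.clamp(torch.abs(x) - theta, min=0.0)

        class OneStepSparseInference(nn.Module):
            """Single unrolled (L)ISTA step: z = soft_threshold(W x, theta).

            A light feed-forward stand-in for iterative sparse coding; W and
            theta are learned end-to-end with the rest of the network.
            """
            def __init__(self, input_dim, code_dim):
                super().__init__()
                self.W = nn.Linear(input_dim, code_dim, bias=False)
                self.theta = nn.Parameter(torch.full((code_dim,), 0.1))

            def forward(self, x):
                return soft_threshold(self.W(x), self.theta)

        z = OneStepSparseInference(64, 256)(torch.randn(8, 64))  # 8 vectorized 8x8 patches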

    Image Restoration by Estimating Frequency Distribution of Local Patches

    In this paper, we propose a method for image restoration that recovers the details of a corrupted image, particularly those lost to JPEG compression. We treat the image in the frequency domain so as to explicitly restore the frequency components lost during compression, learning the frequency-domain distribution with a cross-entropy loss. Unlike recent approaches, we reconstruct image details without adversarial training. Instead, the restoration problem is cast as a classification problem: determining the frequency coefficient for each frequency band in an image patch. We show that the proposed method effectively restores a JPEG-compressed image with more detailed high-frequency components, making the restored image more vivid.
    Comment: 9 pages, 5 figures. Accepted as a poster at CVPR 2018.
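
    Casting restoration as classification means the network emits, per frequency band, a distribution over quantized coefficient values and is trained with cross-entropy against the ground-truth bin. A minimal hedged sketch of that loss setup; the bin count, head shape, and feature sizes below are assumptions, not the paper's values:

        import torch
        import torch.nn as nn

        NUM_BANDS, NUM_BINS = 64, 101  # e.g. 8x8 DCT bands, 101 quantized levels (assumed)

        head = nn.Conv2d(128, NUM_BANDS * NUM_BINS, kernel_size=1)  # per-block logits
        criterion = nn.CrossEntropyLoss()

        features = torch.randn(4, 128, 16, 16)  # backbone features, one vector per block
        target_bins = torch.randint(NUM_BINS, (4, NUM_BANDS, 16, 16))  # true bin ids

        logits = head(features).view(4, NUM_BINS, NUM_BANDS, 16, 16)  # class dim second
        loss = criterion(logits, target_bins)  # classify the coefficient of every band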

    MemNet: A Persistent Memory Network for Image Restoration

    Recently, very deep convolutional neural networks (CNNs) have attracted considerable attention in image restoration. However, as depth grows, the long-term dependency problem is rarely addressed in these very deep models, so prior states/layers have little influence on subsequent ones. Motivated by the fact that human thoughts are persistent, we propose a very deep persistent memory network (MemNet) that introduces a memory block, consisting of a recursive unit and a gate unit, to explicitly mine persistent memory through an adaptive learning process. The recursive unit learns multi-level representations of the current state under different receptive fields. These representations, together with the outputs of the previous memory blocks, are concatenated and sent to the gate unit, which adaptively controls how much of the previous states should be reserved and decides how much of the current state should be stored. We apply MemNet to three image restoration tasks: image denoising, super-resolution, and JPEG deblocking. Comprehensive experiments demonstrate the necessity of MemNet and its consistent superiority over the state of the art on all three tasks. Code is available at https://github.com/tyshiwo/MemNet.
    Comment: Accepted by ICCV 2017 (spotlight presentation).
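
    The memory block can be sketched as follows: a recursive unit applies the same residual sub-network several times to produce multi-level features (short-term memory), and a 1x1 "gate" convolution mixes those features with the outputs of all earlier memory blocks (long-term memory). A hedged approximation; recursion depth and channel widths are illustrative, not MemNet's exact configuration:

        import torch
        import torch.nn as nn

        class MemoryBlockSketch(nn.Module):
            """Recursive unit + 1x1 gate, in the spirit of MemNet's memory block."""
            def __init__(self, channels=64, recursions=3, num_prev_blocks=0):
                super().__init__()
                self.recursive = nn.Sequential(  # shared weights, applied repeatedly
                    nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
                    nn.Conv2d(channels, channels, 3, padding=1),
                )
                self.recursions = recursions
                # gate: 1x1 conv deciding how much short- and long-term memory to keep
                in_ch = channels * (recursions + num_prev_blocks)
                self.gate = nn.Conv2d(in_ch, channels, kernel_size=1)

            def forward(self, x, prev_block_outputs=()):
                states, h = [], x
                for _ in range(self.recursions):           # short-term memory
                    h = torch.relu(h + self.recursive(h))  # residual recursion
                    states.append(h)
                memory = torch.cat(list(prev_block_outputs) + states, dim=1)
                return self.gate(memory)                   # adaptively fused output

        block = MemoryBlockSketch(num_prev_blocks=1)
        out = block(torch.randn(1, 64, 32, 32),
                    prev_block_outputs=[torch.randn(1, 64, 32, 32)])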

    End-to-End JPEG Decoding and Artifacts Suppression Using Heterogeneous Residual Convolutional Neural Network

    Existing deep learning models treat JPEG artifact suppression as a task independent of the decoding protocol. In this work, we go one step further and design a true end-to-end heterogeneous residual convolutional neural network (HR-CNN) with a spectrum decomposition and heterogeneous reconstruction mechanism. Benefiting from the fully convolutional architecture and GPU acceleration, the proposed model considerably improves reconstruction efficiency. Numerical experiments show that the overall reconstruction speed reaches the same order of magnitude as the standard CPU JPEG decoding protocol, while decoding and artifact suppression are completed together. We formulate the JPEG artifact suppression task as an interactive process of decoding and image detail reconstruction. A heterogeneous, fully convolutional mechanism is proposed to address the uncorrelated nature of the different spectral channels. Starting directly from the JPEG code in k-space, the network first extracts the spectral samples channel by channel and restores the spectral snapshots with expanded throughput. These intermediate snapshots are then heterogeneously decoded and merged into a pixel-space image. A cascaded residual learning segment is designed to further enhance image details. Experiments verify that the model achieves outstanding performance in JPEG artifact suppression, while its fully convolutional operations and elegant network structure offer higher computational efficiency for practical online use than other deep learning models on this topic.
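
    One way to read the "heterogeneous" treatment of uncorrelated spectral channels: each channel gets its own small sub-network with unshared weights before a shared merge decodes them into pixels. This is an illustrative interpretation under assumed shapes, not the published HR-CNN:

        import torch
        import torch.nn as nn

        class HeterogeneousDecoderSketch(nn.Module):
            """Per-spectral-channel branches with unshared weights, then a merge.

            Illustrates channel-by-channel restoration of spectral samples
            followed by a decode to pixel space; shapes are placeholders.
            """
            def __init__(self, num_channels=64, width=32):
                super().__init__()
                # one small, independent restorer per spectral channel
                self.branches = nn.ModuleList([
                    nn.Sequential(nn.Conv2d(1, width, 3, padding=1), nn.ReLU(inplace=True),
                                  nn.Conv2d(width, 1, 3, padding=1))
                    for _ in range(num_channels)
                ])
                self.merge = nn.Conv2d(num_channels, 1, kernel_size=1)  # decode to pixels

            def forward(self, spectra):  # spectra: (N, num_channels, H, W)
                restored = [b(spectra[:, i:i + 1]) for i, b in enumerate(self.branches)]
                return self.merge(torch.cat(restored, dim=1))

        img = HeterogeneousDecoderSketch()(torch.randn(1, 64, 16, 16))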

    Implicit Dual-domain Convolutional Network for Robust Color Image Compression Artifact Reduction

    Several dual-domain convolutional neural network based methods show outstanding performance in reducing image compression artifacts. However, they struggle with color images because the compression processes for grayscale and color images are completely different. Moreover, these methods train a specific model for each compression quality and require multiple models to cover different compression qualities. To address these problems, we propose an implicit dual-domain convolutional network (IDCN) that takes a pixel position labeling map and the quantization tables as inputs. Specifically, we propose an extractor-corrector framework based dual-domain correction unit (DCU) as the basic component of the IDCN, and introduce a dense block to improve the performance of the extractor in the DCU. The implicit dual-domain translation allows the IDCN to handle color images with discrete cosine transform (DCT) domain priors. A flexible version of IDCN (IDCN-f) is developed to handle a wide range of compression qualities. Experiments with both objective and subjective evaluations on benchmark datasets show that IDCN is superior to state-of-the-art methods, and that IDCN-f handles a wide range of compression qualities with little performance sacrifice, demonstrating great potential for practical applications.
    Comment: Accepted by IEEE Transactions on Circuits and Systems for Video Technology (T-CSVT).
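
    Conditioning one model on the quantization table is what lets a single network cover many qualities. A hedged sketch of that conditioning idea only: the 8x8 table is tiled across the image as an extra input channel so every pixel sees its quantizer. The pixel position labeling map and the dual-domain correction units of the real IDCN are omitted; all shapes are assumptions:

        import torch
        import torch.nn as nn

        class QualityConditionedSketch(nn.Module):
            """One model, many JPEG qualities: the quantization table is an input."""
            def __init__(self, width=64):
                super().__init__()
                self.body = nn.Sequential(
                    nn.Conv2d(3 + 1, width, 3, padding=1), nn.ReLU(inplace=True),
                    nn.Conv2d(width, 3, 3, padding=1),
                )

            def forward(self, rgb, qtable):  # qtable: (N, 8, 8) luma table
                n, _, h, w = rgb.shape
                # tile the 8x8 table across the image (assumes H, W divisible by 8)
                qmap = qtable.repeat(1, h // 8, w // 8).unsqueeze(1) / 255.0
                return rgb + self.body(torch.cat([rgb, qmap], dim=1))  # residual fix

        out = QualityConditionedSketch()(torch.randn(1, 3, 64, 64),
                                         torch.randint(1, 100, (1, 8, 8)).float())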

    Quality Adaptive Low-Rank Based JPEG Decoding with Applications

    Small compression noise, despite being transparent to human eyes, can adversely affect the results of many image restoration processes if left unaccounted for. In particular, compression noise is highly detrimental to inverse operators of a high-boosting (sharpening) nature, such as deblurring and super-resolution against a convolution kernel. By incorporating the non-linear DCT quantization mechanism into the image restoration formulation, we propose a new sparsity-based convex programming approach for joint compression noise removal and image restoration. Experimental results demonstrate significant performance gains of the new approach over existing image restoration methods.
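
    The quantization mechanism typically enters such formulations as a feasibility constraint: each reconstructed DCT coefficient must lie in the bin its quantized index defines. A small hedged sketch of that projection step, a standard building block in JPEG soft-decoding solvers (not this paper's full convex program):

        import numpy as np

        def project_to_quantization_bins(coeffs, quantized_indices, qtable):
            """Clip DCT coefficients back into their JPEG quantization intervals.

            JPEG stores k = round(coeff / q); any reconstruction consistent with
            the file must satisfy (k - 0.5) * q <= coeff <= (k + 0.5) * q.
            Solvers alternate a restoration step with this projection.
            """
            lower = (quantized_indices - 0.5) * qtable
            upper = (quantized_indices + 0.5) * qtable
            return np.clip(coeffs, lower, upper)

        q = np.full((8, 8), 16.0)                 # toy uniform quantization table
        k = np.round(np.random.randn(8, 8) * 4)  # quantized indices from the file
        refined = np.random.randn(8, 8) * 80     # candidate coefficients from a prior
        print(project_to_quantization_bins(refined, k, q))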

    Non-Local ConvLSTM for Video Compression Artifact Reduction

    Video compression artifact reduction aims to recover high-quality videos from low-quality compressed videos. Most existing approaches use a single neighboring frame or a pair of neighboring frames (preceding and/or following the target frame) for this task. Furthermore, since overall-high-quality frames may contain low-quality patches and overall-low-quality frames may contain high-quality patches, current methods that focus on nearby peak-quality frames (PQFs) may miss high-quality details in low-quality frames. To remedy these shortcomings, we propose a novel end-to-end deep neural network called non-local ConvLSTM (NL-ConvLSTM for short) that exploits multiple consecutive frames. An approximate non-local strategy is introduced in NL-ConvLSTM to capture global motion patterns and trace spatiotemporal dependencies in a video sequence. This approximation lets the non-local module run fast with a low memory cost. Our method uses the frames preceding and following the target frame to generate a residual, from which a higher-quality frame is reconstructed. Experiments on two datasets show that NL-ConvLSTM outperforms existing methods.
    Comment: ICCV 2019.
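
    A non-local step computes, for every position in the current frame, a similarity-weighted average over all positions of a neighboring frame; the paper's approximation exists to avoid this quadratic cost. A hedged sketch of the underlying (unapproximated) inter-frame non-local operation, with assumed feature shapes:

        import torch
        import torch.nn.functional as F

        def inter_frame_nonlocal(feat_t, feat_prev):
            """Attend from frame t to a previous frame (dense, unapproximated).

            feat_*: (N, C, H, W). Returns features for frame t aggregated from
            all positions of feat_prev, weighted by dot-product similarity.
            The paper's approximate strategy avoids this O((HW)^2) cost; this
            sketch shows only the underlying non-local idea.
            """
            n, c, h, w = feat_t.shape
            q = feat_t.flatten(2).transpose(1, 2)        # (N, HW, C) queries
            k = feat_prev.flatten(2)                     # (N, C, HW) keys
            attn = F.softmax(q @ k / c ** 0.5, dim=-1)   # (N, HW, HW) similarities
            v = feat_prev.flatten(2).transpose(1, 2)     # (N, HW, C) values
            out = (attn @ v).transpose(1, 2).reshape(n, c, h, w)
            return out  # could feed, e.g., a ConvLSTM cell along the time axis

        agg = inter_frame_nonlocal(torch.randn(1, 32, 16, 16), torch.randn(1, 32, 16, 16))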

    Random Walk Graph Laplacian based Smoothness Prior for Soft Decoding of JPEG Images

    Given the prevalence of JPEG-compressed images, optimizing image reconstruction from the compressed format remains an important problem. Instead of simply reconstructing a pixel block from the centers of the indexed DCT coefficient quantization bins (hard decoding), soft decoding reconstructs a block by selecting appropriate coefficient values within the indexed bins with the help of signal priors. The challenge thus lies in how to define suitable priors and apply them effectively. In this paper, we combine three image priors (a Laplacian prior for DCT coefficients, and sparsity and graph-signal smoothness priors for image patches) to construct an efficient JPEG soft decoding algorithm. Specifically, we first use the Laplacian prior to compute a minimum mean square error (MMSE) initial solution for each code block. Next, we show that while the sparsity prior can reduce block artifacts, limiting the size of the over-complete dictionary (to lower computation) leads to poor recovery of high DCT frequencies. To alleviate this problem, we design a new graph-signal smoothness prior (the desired signal has mainly low graph frequencies) based on the left eigenvectors of the random walk graph Laplacian matrix (LERaG). Compared to previous graph-signal smoothness priors, LERaG has desirable image filtering properties with low computational overhead. We demonstrate how LERaG facilitates recovery of the high DCT frequencies of a piecewise smooth (PWS) signal via an interpretation of low graph frequency components as relaxed solutions to the normalized cut in spectral clustering. Finally, we construct a soft decoding algorithm using the three signal priors with appropriate prior weights. Experimental results show that our proposal noticeably outperforms state-of-the-art soft decoding algorithms in both objective and subjective evaluations.
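
    A graph-signal smoothness prior scores a candidate patch x by x^T L x, which is small when x varies little across strongly connected pixels; here L is derived from the random walk graph Laplacian. A hedged sketch of the basic construction using Gaussian intensity-difference edge weights (an assumption; the paper's exact weighting and left-eigenvector machinery are not reproduced):

        import numpy as np

        def random_walk_laplacian(patch, sigma=0.2):
            """Build L_rw = I - D^{-1} W for a fully connected patch graph.

            Edge weights are Gaussian in intensity difference (an assumed,
            simplified weighting). x^T L x then serves as the smoothness
            prior: small for piecewise smooth candidates, large for noise.
            """
            x = patch.ravel().astype(float)
            diff = x[:, None] - x[None, :]
            W = np.exp(-(diff ** 2) / (2 * sigma ** 2))
            np.fill_diagonal(W, 0.0)                   # no self-loops
            D_inv = np.diag(1.0 / W.sum(axis=1))
            L_rw = np.eye(x.size) - D_inv @ W
            return L_rw, x @ (L_rw @ x)                # Laplacian and prior value

        smooth, noisy = np.ones((4, 4)), np.random.rand(4, 4)
        print(random_walk_laplacian(smooth)[1], random_walk_laplacian(noisy)[1])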