2,640 research outputs found

    Deep Algorithm Unrolling for Blind Image Deblurring

    Full text link
    Blind image deblurring remains a topic of enduring interest. Learning based approaches, especially those that employ neural networks have emerged to complement traditional model based methods and in many cases achieve vastly enhanced performance. That said, neural network approaches are generally empirically designed and the underlying structures are difficult to interpret. In recent years, a promising technique called algorithm unrolling has been developed that has helped connect iterative algorithms such as those for sparse coding to neural network architectures. However, such connections have not been made yet for blind image deblurring. In this paper, we propose a neural network architecture based on this idea. We first present an iterative algorithm that may be considered as a generalization of the traditional total-variation regularization method in the gradient domain. We then unroll the algorithm to construct a neural network for image deblurring which we refer to as Deep Unrolling for Blind Deblurring (DUBLID). Key algorithm parameters are learned with the help of training images. Our proposed deep network DUBLID achieves significant practical performance gains while enjoying interpretability at the same time. Extensive experimental results show that DUBLID outperforms many state-of-the-art methods and in addition is computationally faster

    FastDVDnet: Towards Real-Time Deep Video Denoising Without Flow Estimation

    Full text link
    In this paper, we propose a state-of-the-art video denoising algorithm based on a convolutional neural network architecture. Until recently, video denoising with neural networks had been a largely under explored domain, and existing methods could not compete with the performance of the best patch-based methods. The approach we introduce in this paper, called FastDVDnet, shows similar or better performance than other state-of-the-art competitors with significantly lower computing times. In contrast to other existing neural network denoisers, our algorithm exhibits several desirable properties such as fast runtimes, and the ability to handle a wide range of noise levels with a single network model. The characteristics of its architecture make it possible to avoid using a costly motion compensation stage while achieving excellent performance. The combination between its denoising performance and lower computational load makes this algorithm attractive for practical denoising applications. We compare our method with different state-of-art algorithms, both visually and with respect to objective quality metrics.Comment: Code for this algorithm and results can be found in https://github.com/m-tassano/fastdvdnet. arXiv admin note: text overlap with arXiv:1906.1189

    Face Recognition in Low Quality Images: A Survey

    Full text link
    Low-resolution face recognition (LRFR) has received increasing attention over the past few years. Its applications lie widely in the real-world environment when high-resolution or high-quality images are hard to capture. One of the biggest demands for LRFR technologies is video surveillance. As the the number of surveillance cameras in the city increases, the videos that captured will need to be processed automatically. However, those videos or images are usually captured with large standoffs, arbitrary illumination condition, and diverse angles of view. Faces in these images are generally small in size. Several studies addressed this problem employed techniques like super resolution, deblurring, or learning a relationship between different resolution domains. In this paper, we provide a comprehensive review of approaches to low-resolution face recognition in the past five years. First, a general problem definition is given. Later, systematically analysis of the works on this topic is presented by catogory. In addition to describing the methods, we also focus on datasets and experiment settings. We further address the related works on unconstrained low-resolution face recognition and compare them with the result that use synthetic low-resolution data. Finally, we summarized the general limitations and speculate a priorities for the future effort.Comment: There are some mistakes addressing in this paper which will be misleading to the reader and we wont have a new version in short time. We will resubmit once it is being corecte

    Real-world Underwater Enhancement: Challenges, Benchmarks, and Solutions

    Full text link
    Underwater image enhancement is such an important low-level vision task with many applications that numerous algorithms have been proposed in recent years. These algorithms developed upon various assumptions demonstrate successes from various aspects using different data sets and different metrics. In this work, we setup an undersea image capturing system, and construct a large-scale Real-world Underwater Image Enhancement (RUIE) data set divided into three subsets. The three subsets target at three challenging aspects for enhancement, i.e., image visibility quality, color casts, and higher-level detection/classification, respectively. We conduct extensive and systematic experiments on RUIE to evaluate the effectiveness and limitations of various algorithms to enhance visibility and correct color casts on images with hierarchical categories of degradation. Moreover, underwater image enhancement in practice usually serves as a preprocessing step for mid-level and high-level vision tasks. We thus exploit the object detection performance on enhanced images as a brand new task-specific evaluation criterion. The findings from these evaluations not only confirm what is commonly believed, but also suggest promising solutions and new directions for visibility enhancement, color correction, and object detection on real-world underwater images.Comment: arXiv admin note: text overlap with arXiv:1712.04143 by other author

    Connecting Image Denoising and High-Level Vision Tasks via Deep Learning

    Full text link
    Image denoising and high-level vision tasks are usually handled independently in the conventional practice of computer vision, and their connection is fragile. In this paper, we cope with the two jointly and explore the mutual influence between them with the focus on two questions, namely (1) how image denoising can help improving high-level vision tasks, and (2) how the semantic information from high-level vision tasks can be used to guide image denoising. First for image denoising we propose a convolutional neural network in which convolutions are conducted in various spatial resolutions via downsampling and upsampling operations in order to fuse and exploit contextual information on different scales. Second we propose a deep neural network solution that cascades two modules for image denoising and various high-level tasks, respectively, and use the joint loss for updating only the denoising network via back-propagation. We experimentally show that on one hand, the proposed denoiser has the generality to overcome the performance degradation of different high-level vision tasks. On the other hand, with the guidance of high-level vision information, the denoising network produces more visually appealing results. Extensive experiments demonstrate the benefit of exploiting image semantics simultaneously for image denoising and high-level vision tasks via deep learning. The code is available online: https://github.com/Ding-Liu/DeepDenoisingComment: arXiv admin note: text overlap with arXiv:1706.0428

    Blind Motion Deblurring with Cycle Generative Adversarial Networks

    Full text link
    Blind motion deblurring is one of the most basic and challenging problems in image processing and computer vision. It aims to recover a sharp image from its blurred version knowing nothing about the blur process. Many existing methods use Maximum A Posteriori (MAP) or Expectation Maximization (EM) frameworks to deal with this kind of problems, but they cannot handle well the figh frequency features of natural images. Most recently, deep neural networks have been emerging as a powerful tool for image deblurring. In this paper, we prove that encoder-decoder architecture gives better results for image deblurring tasks. In addition, we propose a novel end-to-end learning model which refines generative adversarial network by many novel training strategies so as to tackle the problem of deblurring. Experimental results show that our model can capture high frequency features well, and the results on benchmark dataset show that proposed model achieves the competitive performance

    Segmentation-Aware Image Denoising without Knowing True Segmentation

    Full text link
    Several recent works discussed application-driven image restoration neural networks, which are capable of not only removing noise in images but also preserving their semantic-aware details, making them suitable for various high-level computer vision tasks as the pre-processing step. However, such approaches require extra annotations for their high-level vision tasks, in order to train the joint pipeline using hybrid losses. The availability of those annotations is yet often limited to a few image sets, potentially restricting the general applicability of these methods to denoising more unseen and unannotated images. Motivated by that, we propose a segmentation-aware image denoising model dubbed U-SAID, based on a novel unsupervised approach with a pixel-wise uncertainty loss. U-SAID does not need any ground-truth segmentation map, and thus can be applied to any image dataset. It generates denoised images with comparable or even better quality, and the denoised results show stronger robustness for subsequent semantic segmentation tasks, when compared to either its supervised counterpart or classical "application-agnostic" denoisers. Moreover, we demonstrate the superior generalizability of U-SAID in three-folds, by plugging its "universal" denoiser without fine-tuning: (1) denoising unseen types of images; (2) denoising as pre-processing for segmenting unseen noisy images; and (3) denoising for unseen high-level tasks. Extensive experiments demonstrate the effectiveness, robustness and generalizability of the proposed U-SAID over various popular image sets

    Bridging the Gap Between Computational Photography and Visual Recognition

    Full text link
    What is the current state-of-the-art for image restoration and enhancement applied to degraded images acquired under less than ideal circumstances? Can the application of such algorithms as a pre-processing step to improve image interpretability for manual analysis or automatic visual recognition to classify scene content? While there have been important advances in the area of computational photography to restore or enhance the visual quality of an image, the capabilities of such techniques have not always translated in a useful way to visual recognition tasks. Consequently, there is a pressing need for the development of algorithms that are designed for the joint problem of improving visual appearance and recognition, which will be an enabling factor for the deployment of visual recognition tools in many real-world scenarios. To address this, we introduce the UG^2 dataset as a large-scale benchmark composed of video imagery captured under challenging conditions, and two enhancement tasks designed to test algorithmic impact on visual quality and automatic object recognition. Furthermore, we propose a set of metrics to evaluate the joint improvement of such tasks as well as individual algorithmic advances, including a novel psychophysics-based evaluation regime for human assessment and a realistic set of quantitative measures for object recognition performance. We introduce six new algorithms for image restoration or enhancement, which were created as part of the IARPA sponsored UG^2 Challenge workshop held at CVPR 2018. Under the proposed evaluation regime, we present an in-depth analysis of these algorithms and a host of deep learning-based and classic baseline approaches. From the observed results, it is evident that we are in the early days of building a bridge between computational photography and visual recognition, leaving many opportunities for innovation in this area.Comment: CVPR Prize Challenge: http://www.ug2challenge.or

    Noiseprint: a CNN-based camera model fingerprint

    Full text link
    Forensic analyses of digital images rely heavily on the traces of in-camera and out-camera processes left on the acquired images. Such traces represent a sort of camera fingerprint. If one is able to recover them, by suppressing the high-level scene content and other disturbances, a number of forensic tasks can be easily accomplished. A notable example is the PRNU pattern, which can be regarded as a device fingerprint, and has received great attention in multimedia forensics. In this paper we propose a method to extract a camera model fingerprint, called noiseprint, where the scene content is largely suppressed and model-related artifacts are enhanced. This is obtained by means of a Siamese network, which is trained with pairs of image patches coming from the same (label +1) or different (label -1) cameras. Although noiseprints can be used for a large variety of forensic tasks, here we focus on image forgery localization. Experiments on several datasets widespread in the forensic community show noiseprint-based methods to provide state-of-the-art performance

    A Deep Journey into Super-resolution: A survey

    Full text link
    Deep convolutional networks based super-resolution is a fast-growing field with numerous practical applications. In this exposition, we extensively compare 30+ state-of-the-art super-resolution Convolutional Neural Networks (CNNs) over three classical and three recently introduced challenging datasets to benchmark single image super-resolution. We introduce a taxonomy for deep-learning based super-resolution networks that groups existing methods into nine categories including linear, residual, multi-branch, recursive, progressive, attention-based and adversarial designs. We also provide comparisons between the models in terms of network complexity, memory footprint, model input and output, learning details, the type of network losses and important architectural differences (e.g., depth, skip-connections, filters). The extensive evaluation performed, shows the consistent and rapid growth in the accuracy in the past few years along with a corresponding boost in model complexity and the availability of large-scale datasets. It is also observed that the pioneering methods identified as the benchmark have been significantly outperformed by the current contenders. Despite the progress in recent years, we identify several shortcomings of existing techniques and provide future research directions towards the solution of these open problems.Comment: Accepted in ACM Computing Survey
    corecore