2,628 research outputs found

    Deep SR-ITM: Joint Learning of Super-Resolution and Inverse Tone-Mapping for 4K UHD HDR Applications

    Modern displays can render high dynamic range (HDR), high resolution (HR) videos of up to 8K UHD (Ultra High Definition). Consequently, UHD HDR broadcasting and streaming have emerged as high-quality premium services. However, due to the lack of original UHD HDR video content, appropriate conversion technologies are urgently needed to transform legacy low resolution (LR) standard dynamic range (SDR) videos into UHD HDR versions. In this paper, we propose a joint super-resolution (SR) and inverse tone-mapping (ITM) framework, called Deep SR-ITM, which learns the direct mapping from an LR SDR video to its HR HDR version. Joint SR and ITM is an intricate task: high frequency details must be restored for SR, jointly with the local contrast for ITM. Our network is able to restore fine details by decomposing the input image and focusing on the separate base (low frequency) and detail (high frequency) layers. Moreover, the proposed modulation blocks apply location-variant operations to enhance local contrast. Deep SR-ITM shows good subjective quality with increased contrast and details, outperforming the previous joint SR-ITM method.
    Comment: Accepted at ICCV 2019 (Oral)
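    The base/detail decomposition described above can be illustrated with a simple low-pass split. The sketch below uses a Gaussian blur as a stand-in for the paper's decomposition filter; the function name and the filter choice are illustrative, not the authors' implementation.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def base_detail_split(img: np.ndarray, sigma: float = 5.0):
    """Split an image into a low-frequency base layer and a
    high-frequency detail layer, so that base + detail == img."""
    # Blur the spatial axes only (sigma 0 leaves the channel axis untouched).
    base = gaussian_filter(img, sigma=(sigma, sigma, 0))
    detail = img - base
    return base, detail

# Example on a random stand-in for an LR SDR frame, values in [0, 1].
frame = np.random.rand(64, 64, 3).astype(np.float32)
base, detail = base_detail_split(frame)
assert np.allclose(base + detail, frame, atol=1e-6)
```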

    Learnable Exposure Fusion for Dynamic Scenes

    In this paper, we focus on Exposure Fusion (EF) [ExposFusi2] for dynamic scenes. The task is to fuse multiple images obtained by exposure bracketing to create an image with a high level of detail. Typically, such images cannot be obtained directly from a camera due to hardware limitations, e.g., the limited dynamic range of the sensor. A major problem in such tasks is that the images may not be spatially aligned due to scene motion or camera motion. The alignment required by image registration is known to be an ill-posed problem, and here the images to be aligned vary in their intensity range, which makes the problem even more difficult. To address these problems, we propose an end-to-end \emph{Convolutional Neural Network} (CNN) based approach that learns to estimate exposure fusion from 2 and 3 Low Dynamic Range (LDR) images depicting different scene contents. To the best of our knowledge, no efficient and robust CNN-based end-to-end approach for this kind of problem can be found in the literature. The idea is to create a dataset with perfectly aligned LDR images to obtain ground-truth exposure fusion images, and at the same time to obtain additional LDR images with some motion that share the same exposure fusion ground truth as the perfectly aligned LDR images. This way, we can train an end-to-end CNN on misaligned LDR input images with a proper ground-truth exposure fusion image. We propose a specific CNN architecture to solve this problem. In various experiments, we show that the proposed approach yields excellent results.
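    For context, classical exposure fusion weighs each bracketed frame per pixel by contrast, saturation, and well-exposedness. The sketch below is a naive single-scale version of that weighting (the classical method blends with Laplacian pyramids to avoid seams; this is a reading aid, not the paper's CNN).

```python
import numpy as np

def exposure_fusion_weights(stack):
    """Per-pixel quality weights: contrast (Laplacian magnitude),
    saturation (channel std-dev), well-exposedness (near mid-gray)."""
    weights = []
    for img in stack:  # each img: (H, W, 3), values in [0, 1]
        gray = img.mean(axis=2)
        contrast = np.abs(
            -4 * gray
            + np.roll(gray, 1, 0) + np.roll(gray, -1, 0)
            + np.roll(gray, 1, 1) + np.roll(gray, -1, 1)
        )
        saturation = img.std(axis=2)
        well_exposed = np.exp(-((img - 0.5) ** 2) / (2 * 0.2 ** 2)).prod(axis=2)
        weights.append(contrast * saturation * well_exposed + 1e-12)
    w = np.stack(weights)
    return w / w.sum(axis=0, keepdims=True)  # normalize across exposures

# Naive single-scale fusion of three bracketed LDR frames.
stack = [np.random.rand(32, 32, 3) for _ in range(3)]
w = exposure_fusion_weights(stack)
fused = sum(wi[..., None] * img for wi, img in zip(w, stack))
```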

    cvpaper.challenge in 2015 - A review of CVPR2015 and DeepSurvey

    The "cvpaper.challenge" is a group composed of members from AIST, Tokyo Denki Univ. (TDU), and Univ. of Tsukuba that aims to systematically summarize papers on computer vision, pattern recognition, and related fields. For this particular review, we focused on reading the ALL 602 conference papers presented at the CVPR2015, the premier annual computer vision event held in June 2015, in order to grasp the trends in the field. Further, we are proposing "DeepSurvey" as a mechanism embodying the entire process from the reading through all the papers, the generation of ideas, and to the writing of paper.Comment: Survey Pape

    Deep Retinex Decomposition for Low-Light Enhancement

    The Retinex model is an effective tool for low-light image enhancement. It assumes that observed images can be decomposed into reflectance and illumination. Most existing Retinex-based methods rely on carefully designed hand-crafted constraints and parameters for this highly ill-posed decomposition, which may limit model capacity when applied to varied scenes. In this paper, we collect a LOw-Light dataset (LOL) containing low/normal-light image pairs and propose a deep Retinex-Net learned on this dataset, consisting of a Decom-Net for decomposition and an Enhance-Net for illumination adjustment. In the training process for Decom-Net, there is no ground truth for the decomposed reflectance and illumination; the network is learned with only key constraints, including the consistent reflectance shared by paired low/normal-light images and the smoothness of illumination. Based on the decomposition, subsequent lightness enhancement is conducted on the illumination by Enhance-Net, and for joint denoising a denoising operation is applied to the reflectance. Retinex-Net is end-to-end trainable, so that the learned decomposition is by nature suited to lightness adjustment. Extensive experiments demonstrate that our method not only achieves visually pleasing quality for low-light enhancement but also provides a good representation of image decomposition.
    Comment: BMVC 2018 (Oral). Dataset and project page: https://daooshee.github.io/BMVC2018website
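    A minimal sketch of the training constraints described above, assuming reflectance maps R of shape (B, 3, H, W) and single-channel illumination maps I; the loss weights are illustrative, and plain total variation stands in for whatever smoothness term the released code uses.

```python
import torch
import torch.nn.functional as F

def total_variation(x):
    """Mean absolute difference between neighboring pixels."""
    return (x[..., :, 1:] - x[..., :, :-1]).abs().mean() + \
           (x[..., 1:, :] - x[..., :-1, :]).abs().mean()

def decom_loss(R_low, I_low, R_high, I_high, S_low, S_high):
    """Reconstruction (S = R * I), reflectance consistency across the
    low/normal-light pair, and illumination smoothness."""
    recon = F.l1_loss(R_low * I_low, S_low) + F.l1_loss(R_high * I_high, S_high)
    consistency = F.l1_loss(R_low, R_high)  # scene reflectance is exposure-invariant
    smooth = total_variation(I_low) + total_variation(I_high)
    return recon + 0.01 * consistency + 0.1 * smooth

S_low, S_high = torch.rand(2, 3, 64, 64), torch.rand(2, 3, 64, 64)
R_low, R_high = torch.rand_like(S_low), torch.rand_like(S_high)
I_low, I_high = torch.rand(2, 1, 64, 64), torch.rand(2, 1, 64, 64)
loss = decom_loss(R_low, I_low, R_high, I_high, S_low, S_high)
```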

    UG2+ Track 2: A Collective Benchmark Effort for Evaluating and Advancing Image Understanding in Poor Visibility Environments

    The UG2+ challenge in IEEE CVPR 2019 aims to evoke a comprehensive discussion and exploration of how low-level vision techniques can benefit high-level automatic visual recognition in various scenarios. In its second track, we focus on object or face detection in poor visibility environments caused by bad weather (haze, rain) and low-light conditions. While existing enhancement methods are empirically expected to help the high-level end task, this is observed to not always be the case in practice. To provide a more thorough examination and fair comparison, we introduce three benchmark sets collected in real-world hazy, rainy, and low-light conditions, respectively, with objects/faces annotated. To the best of our knowledge, this is the first and currently largest effort of its kind. Baseline results obtained by cascading existing enhancement and detection models are reported, indicating the highly challenging nature of our new data as well as the ample room for further technical innovation. We expect large participation from the broad research community to address these challenges together.
    Comment: A summary paper on datasets, fact sheets, baseline results, challenge results, and winning methods in the UG2+ Challenge (Track 2). More materials are provided at http://www.ug2challenge.org/index.htm
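    The baseline protocol amounts to a simple enhance-then-detect cascade, sketched below; all four callables are hypothetical placeholders, not the challenge toolkit.

```python
def cascade_baseline(images, annotations, enhance, detect, evaluate_map):
    """Run an off-the-shelf enhancer before an off-the-shelf detector,
    then score the detections against the annotated objects/faces.
    `enhance`, `detect`, and `evaluate_map` are placeholder callables."""
    detections = [detect(enhance(img)) for img in images]
    return evaluate_map(detections, annotations)
```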

    MANTIS: Model-Augmented Neural neTwork with Incoherent k-space Sampling for efficient MR T2 mapping

    Quantitative mapping of magnetic resonance (MR) parameters has been shown to be a valuable method for improved assessment of a range of diseases. Because an anatomic structure must be imaged multiple times, parameter mapping usually requires long scan times compared to conventional static imaging. Accelerated parameter mapping is therefore highly desirable and remains a topic of great interest in the MR research community. While many recent deep learning methods have focused on highly efficient image reconstruction for conventional static MR imaging, applications of deep learning to dynamic imaging, and in particular to accelerated parameter mapping, have been limited. The purpose of this work was to develop and evaluate a novel deep learning-based reconstruction framework called Model-Augmented Neural neTwork with Incoherent k-space Sampling (MANTIS) for efficient MR parameter mapping. Our approach combines end-to-end CNN mapping with k-space consistency, using the concept of a cyclic loss to further enforce data and model fidelity. Incoherent k-space sampling is used to improve reconstruction performance. A physical model is incorporated into the proposed framework, so that the parameter maps can be efficiently estimated directly from undersampled images. The performance of MANTIS was demonstrated for spin-spin relaxation time (T2) mapping of the knee joint. Compared to conventional reconstruction approaches that exploit image sparsity, MANTIS yielded lower errors and higher similarity with respect to the reference in the T2 estimation. Our study demonstrates that the proposed MANTIS framework, with its combination of end-to-end CNN mapping, signal model-augmented data consistency, and incoherent k-space sampling, represents a promising approach for efficient MR parameter mapping. MANTIS can potentially be extended to other types of parameter mapping with appropriate models.
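    A sketch of the model-augmented loss idea, assuming a mono-exponential T2 signal model S(TE) = M0 * exp(-TE / T2) and single-coil Cartesian encoding; the function names and loss weights are illustrative, not the MANTIS release.

```python
import torch
import torch.nn.functional as F

def t2_signal_model(m0, t2, tes):
    """Mono-exponential spin-echo decay: S(TE) = M0 * exp(-TE / T2).
    m0, t2: (B, H, W); tes: (E,) -> echo images of shape (B, E, H, W)."""
    return m0[:, None] * torch.exp(-tes[None, :, None, None] / t2[:, None].clamp(min=1e-3))

def mantis_style_loss(m0, t2, ref_t2, kspace, mask, tes):
    """Supervised map error plus a cyclic data-consistency term: the
    estimated maps, pushed through the signal model and FFT, must
    reproduce the acquired (undersampled) k-space samples."""
    supervised = F.mse_loss(t2, ref_t2)
    k_pred = torch.fft.fft2(t2_signal_model(m0, t2, tes))
    data_consistency = ((mask * (k_pred - kspace)).abs() ** 2).mean()
    return supervised + 0.1 * data_consistency

B, E, H, W = 1, 8, 64, 64
tes = torch.linspace(0.01, 0.08, E)                 # echo times in seconds
m0, t2 = torch.rand(B, H, W), torch.rand(B, H, W) + 0.02
kspace = torch.fft.fft2(t2_signal_model(m0, t2, tes))
mask = (torch.rand(B, E, H, W) < 0.25).float()      # incoherent random sampling
loss = mantis_style_loss(m0, t2, t2, kspace, mask, tes)
```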

    Learning Spatial-Spectral Prior for Super-Resolution of Hyperspectral Imagery

    Recently, the single gray/RGB image super-resolution reconstruction task has been extensively studied and has made significant progress by leveraging advanced machine learning techniques based on deep convolutional neural networks (DCNNs). However, there has been limited technical development focusing on single hyperspectral image super-resolution due to the high-dimensional and complex spectral patterns in hyperspectral images. In this paper, we take a step forward by investigating how to adapt state-of-the-art residual learning based single gray/RGB image super-resolution approaches for computationally efficient single hyperspectral image super-resolution, referred to as SSPSR. Specifically, we introduce a spatial-spectral prior network (SSPN) to fully exploit the spatial information and the correlation between the spectra of the hyperspectral data. Considering that hyperspectral training samples are scarce and the spectral dimension of hyperspectral image data is very high, it is nontrivial to train a stable and effective deep network. Therefore, a group convolution (with shared network parameters) and progressive upsampling framework is proposed. This not only alleviates the difficulty of feature extraction due to the high dimensionality of the hyperspectral data, but also makes the training process more stable. To exploit the spatial and spectral prior, we design a spatial-spectral block (SSB), which consists of a spatial residual module and a spectral attention residual module. Experimental results on several hyperspectral images demonstrate that the proposed SSPSR method enhances the details of the recovered high-resolution hyperspectral images and outperforms state-of-the-art methods. The source code is available at https://github.com/junjun-jiang/SSPSR
    Comment: Accepted for publication at IEEE Transactions on Computational Imaging
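    To make the grouping and progressive-upsampling ideas concrete, here is a toy module with one shared branch applied to each spectral group and two x2 PixelShuffle stages; it omits the spatial-spectral blocks and is a reading aid, not the released SSPSR architecture.

```python
import torch
import torch.nn as nn

class GroupedSpectralSR(nn.Module):
    """One branch with *shared* parameters applied to each spectral
    group, followed by two x2 PixelShuffle stages (x4 total)."""
    def __init__(self, group=4, feats=32):
        super().__init__()
        self.group = group
        self.branch = nn.Sequential(
            nn.Conv2d(group, feats, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feats, feats, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feats, group * 4, 3, padding=1), nn.PixelShuffle(2),
            nn.Conv2d(group, group * 4, 3, padding=1), nn.PixelShuffle(2),
        )

    def forward(self, x):  # x: (B, bands, h, w), with bands % group == 0
        return torch.cat([self.branch(g) for g in x.split(self.group, dim=1)], dim=1)

sr = GroupedSpectralSR()
y = sr(torch.rand(1, 32, 16, 16))  # -> torch.Size([1, 32, 64, 64])
```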

    Recurrent Generative Adversarial Networks for Proximal Learning and Automated Compressive Image Recovery

    Recovering images from undersampled linear measurements typically leads to an ill-posed linear inverse problem that calls for proper statistical priors. Building effective priors is however challenged by the low training and testing overhead dictated by real-time tasks, and by the need to retrieve visually "plausible" and physically "feasible" images with minimal hallucination. To cope with these challenges, we design a cascaded network architecture that unrolls the proximal gradient iterations, leveraging generative residual networks (ResNets) to model the proximal operator. A mixture of pixel-wise and perceptual costs is then deployed to train the proximals. The overall architecture resembles a back-and-forth projection onto the intersection of feasible and plausible images. Extensive computational experiments are examined for the global task of reconstructing MR images of pediatric patients and the more local task of super-resolving CelebA faces, which are insightful for designing efficient architectures. Our observations indicate that for MRI reconstruction, a recurrent ResNet with a single residual block effectively learns the proximal. This simple architecture appears to significantly outperform the alternative deep ResNet architecture by 2 dB SNR, and conventional compressed-sensing MRI by 4 dB SNR with 100x faster inference. For image super-resolution, our preliminary results indicate that modeling the denoising proximal demands deep ResNets.
    Comment: 11 pages, 11 figures
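    The unrolled recursion can be sketched as follows, with a generic linear forward model A and a small residual block standing in for the learned proximal; hyperparameters and the block design are placeholders, not the paper's network.

```python
import torch
import torch.nn as nn

class UnrolledProximal(nn.Module):
    """x <- prox(x - step * A^T(A x - y)), unrolled for a fixed number
    of iterations; a small residual block plays the proximal operator,
    with weights shared across iterations (i.e., recurrent)."""
    def __init__(self, n_iters=5, channels=1, feats=32):
        super().__init__()
        self.n_iters = n_iters
        self.step = nn.Parameter(torch.tensor(1.0))  # learned step size
        self.prox = nn.Sequential(
            nn.Conv2d(channels, feats, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feats, channels, 3, padding=1),
        )

    def forward(self, y, A, At, x0):
        x = x0
        for _ in range(self.n_iters):
            z = x - self.step * At(A(x) - y)  # gradient step on data fidelity
            x = z + self.prox(z)              # learned proximal, residual form
        return x

model = UnrolledProximal()
A = At = lambda v: v                          # identity forward model, smoke test only
y = torch.rand(1, 1, 32, 32)
x_hat = model(y, A, At, torch.zeros_like(y))
```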

    Deep Learning Techniques for Inverse Problems in Imaging

    Recent work in machine learning shows that deep neural networks can be used to solve a wide variety of inverse problems arising in computational imaging. We explore the central prevailing themes of this emerging area and present a taxonomy that can be used to categorize different problems and reconstruction methods. Our taxonomy is organized along two central axes: (1) whether or not a forward model is known, and to what extent it is used in training and testing; and (2) whether the learning is supervised or unsupervised, i.e., whether or not the training relies on access to matched pairs of ground truth images and measurements. We also discuss the trade-offs associated with these different reconstruction approaches, caveats and common failure modes, and open problems and avenues for future work.

    Deep learning for fast MR imaging: a review for learning reconstruction from incomplete k-space data

    Magnetic resonance imaging is a powerful imaging modality that can provide versatile information, but it suffers from a bottleneck: slow imaging speed. Reducing the number of acquired measurements can accelerate MR imaging with the aid of powerful reconstruction methods, which have evolved from linear analytic models to nonlinear iterative ones. The emerging trend in this area is to replace human-defined signal models with ones learned from data. Specifically, since 2016, deep learning has been incorporated into the fast MR imaging task, drawing valuable prior knowledge from big datasets to facilitate accurate MR image reconstruction from limited measurements. This survey reviews deep learning based MR image reconstruction works from 2016 to June 2020 and discusses the merits, limitations, and challenges associated with such methods. Last but not least, this paper provides a starting point for researchers interested in contributing to this field by pointing out good tutorial resources, state-of-the-art open-source codes, and meaningful data sources.
    Comment: Invited review submitted to Biomedical Signal Processing and Control in Jan 202
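    The acceleration setting the survey covers can be reproduced on a toy image: retrospectively undersample k-space with a mask and compare against the zero-filled baseline that learned reconstructions improve upon. A minimal numpy sketch (random data, illustrative sampling pattern):

```python
import numpy as np

img = np.random.rand(128, 128)                    # toy stand-in for an MR image
kspace = np.fft.fftshift(np.fft.fft2(img))        # low frequencies at the center
mask = np.zeros((128, 128))
mask[:, ::4] = 1                                  # keep every 4th phase-encode line
mask[:, 60:68] = 1                                # fully sample a low-frequency band
zero_filled = np.abs(np.fft.ifft2(np.fft.ifftshift(mask * kspace)))
print("acceleration factor ~", mask.size / mask.sum())
```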