New Techniques for Preserving Global Structure and Denoising with Low Information Loss in Single-Image Super-Resolution
This work identifies and addresses two important technical challenges in
single-image super-resolution: (1) how to upsample an image without magnifying
noise and (2) how to preserve large scale structure when upsampling. We
summarize the techniques we developed for our second-place entry in Track 1
(Bicubic Downsampling), seventh-place entry in Track 2 (Realistic Adverse
Conditions), and seventh-place entry in Track 3 (Realistic Difficult) in the
2018 NTIRE Super-Resolution Challenge. Furthermore, we present new neural
network architectures that specifically address the two challenges listed
above: denoising and preservation of large-scale structure.
Comment: 8 pages, CVPR workshop 2018
Learning Digital Camera Pipeline for Extreme Low-Light Imaging
In low-light conditions, a conventional camera imaging pipeline produces
sub-optimal images that are usually dark and noisy due to a low photon count
and low signal-to-noise ratio (SNR). We present a data-driven approach that
learns the desired properties of well-exposed images and reflects them in
images that are captured in extremely low ambient light environments, thereby
significantly improving the visual quality of these low-light images. We
propose a new loss function that exploits the characteristics of both
pixel-wise and perceptual metrics, enabling our deep neural network to learn
the camera processing pipeline to transform the short-exposure, low-light RAW
sensor data to well-exposed sRGB images. The results show that our method
outperforms the state-of-the-art according to psychophysical tests as well as
pixel-wise standard metrics and recent learning-based perceptual image quality
measures.
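The loss described above combines a pixel-wise term with a perceptual (feature-space) term. A minimal numpy sketch, using simple image gradients as a hypothetical stand-in for the deep perceptual features (the paper's actual feature extractor and weighting are not specified here):

```python
import numpy as np

def feature_map(img):
    # Hypothetical stand-in for a deep perceptual feature extractor:
    # horizontal and vertical image gradients.
    gx = np.diff(img, axis=1)
    gy = np.diff(img, axis=0)
    return gx, gy

def combined_loss(pred, target, alpha=0.5):
    # Pixel-wise term: mean absolute error.
    pixel = np.mean(np.abs(pred - target))
    # Perceptual term: distance between feature maps.
    pgx, pgy = feature_map(pred)
    tgx, tgy = feature_map(target)
    perceptual = np.mean((pgx - tgx) ** 2) + np.mean((pgy - tgy) ** 2)
    return alpha * pixel + (1 - alpha) * perceptual
```

Note that a uniform brightness shift is penalized only by the pixel-wise term, since gradients cancel it out; this is one reason to keep both terms.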
Handheld Multi-Frame Super-Resolution
Compared to DSLR cameras, smartphone cameras have smaller sensors, which
limits their spatial resolution; smaller apertures, which limits their light
gathering ability; and smaller pixels, which reduces their signal-to-noise
ratio. The use of color filter arrays (CFAs) requires demosaicing, which
further degrades resolution. In this paper, we supplant the use of traditional
demosaicing in single-frame and burst photography pipelines with a multi-frame
super-resolution algorithm that creates a complete RGB image directly from a
burst of CFA raw images. We harness natural hand tremor, typical in handheld
photography, to acquire a burst of raw frames with small offsets. These frames
are then aligned and merged to form a single image with red, green, and blue
values at every pixel site. This approach, which includes no explicit
demosaicing step, serves to both increase image resolution and boost signal-to-
noise ratio. Our algorithm is robust to challenging scene conditions: local
motion, occlusion, or scene changes. It runs at 100 milliseconds per
12-megapixel RAW input burst frame on mass-produced mobile phones.
Specifically, the algorithm is the basis of the Super-Res Zoom feature, as well
as the default merge method in Night Sight mode (whether zooming or not) on
Google's flagship phone.
Comment: 24 pages, accepted to SIGGRAPH 2019 Technical Papers program
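The align-and-merge step can be illustrated with a toy version. Here offsets are integer per-frame shifts; in the actual method the hand-tremor offsets are subpixel and the merge operates per CFA channel with robustness weights, none of which is modeled in this sketch:

```python
import numpy as np

def align_and_merge(frames, offsets):
    """Merge a burst of frames given per-frame (dy, dx) integer offsets.

    Toy stand-in for the paper's subpixel align-and-merge step.
    """
    acc = np.zeros_like(frames[0], dtype=np.float64)
    for frame, (dy, dx) in zip(frames, offsets):
        # Undo the offset by rolling the frame back into alignment.
        acc += np.roll(frame, shift=(-dy, -dx), axis=(0, 1))
    return acc / len(frames)
```

With perfectly known offsets and no scene motion, averaging the aligned frames recovers the base image while suppressing independent per-frame noise.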
Real Image Denoising with Feature Attention
Deep convolutional neural networks perform better on images containing
spatially invariant noise (synthetic noise); however, their performance is
limited on real noisy photographs, and they require multi-stage network modeling.
To advance the practicability of denoising algorithms, this paper proposes a
novel single-stage blind real image denoising network (RIDNet) by employing a
modular architecture. We use a residual on the residual structure to ease the
flow of low-frequency information and apply feature attention to exploit the
channel dependencies. Furthermore, the evaluation in terms of quantitative
metrics and visual quality on three synthetic and four real noisy datasets
against 19 state-of-the-art algorithms demonstrates the superiority of our
RIDNet.
Comment: Accepted in ICCV (Oral), 2019
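The feature-attention idea can be sketched as squeeze-and-excitation style channel gating. The weight matrices `w1` and `w2` below are hypothetical placeholders for the learned 1x1 convolutions, and the real RIDNet block additionally wraps this in residual-on-residual connections:

```python
import numpy as np

def feature_attention(x, w1, w2):
    """Channel attention over a (C, H, W) feature map.

    Minimal numpy sketch of squeeze-and-excitation style gating.
    """
    c = x.shape[0]
    squeeze = x.reshape(c, -1).mean(axis=1)        # global average pool -> (C,)
    hidden = np.maximum(w1 @ squeeze, 0.0)         # ReLU bottleneck
    gate = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))    # sigmoid -> per-channel weights
    return x * gate[:, None, None]                 # rescale each channel
```

Because the sigmoid gate lies strictly in (0, 1), the block can only attenuate channels, letting the network emphasize informative channels relative to the rest.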
A Deep Journey into Super-resolution: A survey
Deep convolutional network-based super-resolution is a fast-growing field
with numerous practical applications. In this exposition, we extensively
compare 30+ state-of-the-art super-resolution Convolutional Neural Networks
(CNNs) over three classical and three recently introduced challenging datasets
to benchmark single image super-resolution. We introduce a taxonomy for
deep-learning based super-resolution networks that groups existing methods into
nine categories including linear, residual, multi-branch, recursive,
progressive, attention-based and adversarial designs. We also provide
comparisons between the models in terms of network complexity, memory
footprint, model input and output, learning details, the type of network losses
and important architectural differences (e.g., depth, skip-connections,
filters). The extensive evaluation performed shows consistent and rapid
growth in accuracy over the past few years, along with a corresponding boost
in model complexity and the availability of large-scale datasets. It is also
observed that the pioneering methods identified as the benchmark have been
significantly outperformed by the current contenders. Despite the progress in
recent years, we identify several shortcomings of existing techniques and
provide future research directions towards the solution of these open problems.
Comment: Accepted in ACM Computing Surveys
LookinGood: Enhancing Performance Capture with Real-time Neural Re-Rendering
Motivated by augmented and virtual reality applications such as telepresence,
there has been a recent focus in real-time performance capture of humans under
motion. However, given the real-time constraint, these systems often suffer
from artifacts in geometry and texture such as holes and noise in the final
rendering, poor lighting, and low-resolution textures. We take the novel
approach to augment such real-time performance capture systems with a deep
architecture that takes a rendering from an arbitrary viewpoint, and jointly
performs completion, super resolution, and denoising of the imagery in
real-time. We call this approach neural (re-)rendering, and our live system
"LookinGood". Our deep architecture is trained to produce high resolution and
high quality images from a coarse rendering in real-time. First, we propose a
self-supervised training method that does not require manual ground-truth
annotation. We contribute a specialized reconstruction error that uses semantic
information to focus on relevant parts of the subject, e.g. the face. We also
introduce a saliency reweighting scheme for the loss function that is able to
discard outliers. We specifically design the system for virtual and augmented
reality headsets where the consistency between the left and right eye plays a
crucial role in the final user experience. Finally, we generate temporally
stable results by explicitly minimizing the difference between two consecutive
frames. We tested the proposed system in two different scenarios: one
involving a single RGB-D sensor and upper-body reconstruction of an actor,
and the other consisting of full-body 360-degree capture. Through extensive
experimentation, we demonstrate how our system generalizes across unseen
sequences and subjects. The supplementary video is available at
http://youtu.be/Md3tdAKoLGU.
Comment: To be presented at SIGGRAPH Asia 2018
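The semantically reweighted reconstruction term and the temporal-stability term described above can be sketched as a single scalar loss. The per-pixel weight map (e.g., larger on the face) and the blending factor `beta` are hypothetical, and the paper's actual formulation (including outlier discarding) is richer:

```python
import numpy as np

def rerender_loss(pred_t, pred_prev, target_t, weight_map, beta=0.1):
    # Reconstruction error reweighted by per-pixel semantic importance
    # (e.g., higher weights on the face region).
    recon = np.mean(weight_map * np.abs(pred_t - target_t))
    # Temporal stability: penalize change between consecutive frames.
    temporal = np.mean((pred_t - pred_prev) ** 2)
    return recon + beta * temporal
```

The temporal term explicitly minimizes the difference between consecutive outputs, which is what suppresses frame-to-frame flicker in the live system.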
Deep Convolutional Framelet Denoising for Low-Dose CT via Wavelet Residual Network
Model based iterative reconstruction (MBIR) algorithms for low-dose X-ray CT
are computationally expensive. To address this problem, we recently proposed a
deep convolutional neural network (CNN) for low-dose X-ray CT and won the
second place in the 2016 AAPM Low-Dose CT Grand Challenge. However, some of
the texture was not fully recovered. To address this problem, here we propose a
novel framelet-based denoising algorithm using wavelet residual network which
synergistically combines the expressive power of deep learning and the
performance guarantee from the framelet-based denoising algorithms. The new
algorithms were inspired by the recent interpretation of the deep convolutional
neural network (CNN) as a cascaded convolution framelet signal representation.
Extensive experimental results confirm that the proposed networks have
significantly improved performance and preserve the detail texture of the
original images.
Comment: This will appear in IEEE Transactions on Medical Imaging, a special
issue on Machine Learning for Image Reconstruction
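The framelet-domain idea can be illustrated with a one-level Haar transform: a `shrink` function, standing in here for the learned wavelet residual network, operates on the detail band while the approximation band passes through untouched:

```python
import numpy as np

def haar_1level(x):
    # One level of a 1-D orthonormal Haar transform.
    a = (x[0::2] + x[1::2]) / np.sqrt(2)   # approximation band
    d = (x[0::2] - x[1::2]) / np.sqrt(2)   # detail band
    return a, d

def haar_inverse(a, d):
    # Perfect-reconstruction inverse of haar_1level.
    x = np.empty(2 * len(a))
    x[0::2] = (a + d) / np.sqrt(2)
    x[1::2] = (a - d) / np.sqrt(2)
    return x

def wavelet_residual_denoise(x, shrink):
    # Denoise by modifying detail coefficients only, mirroring the idea
    # of learning residuals in a wavelet (framelet) domain; `shrink`
    # stands in for the learned network.
    a, d = haar_1level(x)
    return haar_inverse(a, shrink(d))
```

With `shrink` as the identity, the round trip reconstructs the input exactly; this perfect-reconstruction property is what the framelet interpretation guarantees.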
Chaining Identity Mapping Modules for Image Denoising
We propose to learn a fully-convolutional network model that consists of a
Chain of Identity Mapping Modules (CIMM) for image denoising. The CIMM
structure possesses two distinctive features that are important for the noise
removal task. Firstly, each residual unit employs identity mappings as the skip
connections and receives pre-activated input in order to preserve the gradient
magnitude propagated in both the forward and backward directions. Secondly, by
utilizing dilated kernels for the convolution layers in the residual branch
(i.e., within an identity mapping module), each neuron in the last
convolution layer can observe the full receptive field of the first layer.
After being trained on the BSD400 dataset, the proposed network produces
remarkably higher numerical accuracy and better visual image quality than the
state-of-the-art when being evaluated on conventional benchmark images and the
BSD68 dataset.
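The receptive-field claim can be checked with a toy 1-D version: stacking 3-tap convolutions with dilations 1, 2 and 4 grows the receptive field to 2*(1+2+4)+1 = 15 samples (the actual CIMM uses 2-D convolutions inside pre-activated identity-mapping units, which this sketch omits):

```python
import numpy as np

def dilated_conv1d(x, kernel, dilation):
    # 'Same'-padded 1-D convolution with a dilated kernel.
    out = np.zeros_like(x)
    r = dilation * (len(kernel) // 2)
    padded = np.pad(x, r)
    for i in range(len(x)):
        for k, w in enumerate(kernel):
            out[i] += w * padded[i + k * dilation]
    return out
```

Pushing an impulse through the three dilated layers and counting the nonzero outputs confirms the 15-sample receptive field.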
Face Hallucination using Linear Models of Coupled Sparse Support
Most face super-resolution methods assume that low-resolution and
high-resolution manifolds have similar local geometrical structure, hence learn
local models on the low-resolution manifolds (e.g., sparse or locally linear
embedding models), which are then applied on the high-resolution manifold.
However, the low-resolution manifold is distorted by the one-to-many
relationship between low- and high-resolution patches. This paper presents a
method which learns linear models based on the local geometrical structure on
the high-resolution manifold rather than on the low-resolution manifold. For
this, in a first step, the low-resolution patch is used to derive a globally
optimal estimate of the high-resolution patch. The approximated solution is
shown to be close in Euclidean space to the ground-truth but is generally
smooth and lacks the texture details needed by state-of-the-art face
recognizers. This first estimate allows us to find the support of the
high-resolution manifold using sparse coding (SC), which is then used as
the support for learning a local projection (or upscaling) model between the
low-resolution and the high-resolution manifolds using Multivariate Ridge
Regression (MRR). Experimental results show that the proposed method
outperforms six face super-resolution methods in terms of both recognition and
quality. These results also reveal that recognition and quality are
significantly affected by the method used for stitching the super-resolved
patches together; quilting was found to better preserve the texture details,
which helps to achieve higher recognition rates.
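The final projection step, Multivariate Ridge Regression from LR to HR patch vectors, has a closed form. A minimal sketch, under the assumption that corresponding patches are stacked as columns (the sparse-coding support selection from the paper is omitted):

```python
import numpy as np

def ridge_projection(X_lr, Y_hr, lam=0.1):
    """Learn a linear LR -> HR patch mapping with Multivariate Ridge Regression.

    Columns of X_lr / Y_hr are corresponding low/high-resolution patch
    vectors; returns W minimizing ||Y - W X||_F^2 + lam * ||W||_F^2,
    i.e. W = Y X^T (X X^T + lam I)^{-1}.
    """
    d = X_lr.shape[0]
    return Y_hr @ X_lr.T @ np.linalg.inv(X_lr @ X_lr.T + lam * np.eye(d))
```

When the true mapping is linear and `lam` is small, the learned `W` recovers it; in practice `lam` trades fidelity for stability on small, correlated patch supports.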
Structure-Preserving Image Super-resolution via Contextualized Multi-task Learning
Single image super-resolution (SR), which refers to reconstructing a
higher-resolution (HR) image from an observed low-resolution (LR) image, has
received substantial attention due to its tremendous application potential.
Despite the breakthroughs of recently proposed SR methods using convolutional
neural networks (CNNs), their generated results usually fail to preserve
structural (high-frequency) details. In this paper, regarding global boundary
context and residual context as complementary information for enhancing
structural details in image restoration, we develop a contextualized multi-task
learning framework to address the SR problem. Specifically, our method first
extracts convolutional features from the input LR image and applies one
deconvolutional module to interpolate the LR feature maps in a content-adaptive
way. Then, the resulting feature maps are fed into two branched sub-networks.
During the neural network training, one sub-network outputs salient image
boundaries and the HR image, and the other sub-network outputs the local
residual map, i.e., the residual difference between the generated HR image and
ground-truth image. On several standard benchmarks (i.e., Set5, Set14 and
BSD200), our extensive evaluations demonstrate the effectiveness of our SR
method on achieving both higher restoration quality and computational
efficiency compared with several state-of-the-art SR approaches. The source
code and some SR results can be found at:
http://hcp.sysu.edu.cn/structure-preserving-image-super-resolution/
Comment: To appear in Transactions on Multimedia 2017