927 research outputs found
Deep Bilateral Learning for Real-Time Image Enhancement
Performance is a critical challenge in mobile image processing. Given a
reference imaging pipeline, or even human-adjusted pairs of images, we seek to
reproduce the enhancements and enable real-time evaluation. For this, we
introduce a new neural network architecture inspired by bilateral grid
processing and local affine color transforms. Using pairs of input/output
images, we train a convolutional neural network to predict the coefficients of
a locally-affine model in bilateral space. Our architecture learns to make
local, global, and content-dependent decisions to approximate the desired image
transformation. At runtime, the neural network consumes a low-resolution
version of the input image, produces a set of affine transformations in
bilateral space, upsamples those transformations in an edge-preserving fashion
using a new slicing node, and then applies those upsampled transformations to
the full-resolution image. Our algorithm processes high-resolution images on a
smartphone in milliseconds, provides a real-time viewfinder at 1080p
resolution, and matches the quality of state-of-the-art approximation
techniques on a large class of image operators. Unlike previous work, our model
is trained off-line from data and therefore does not require access to the
original operator at runtime. This allows our model to learn complex,
scene-dependent transformations for which no reference implementation is
available, such as the photographic edits of a human retoucher.Comment: 12 pages, 14 figures, Siggraph 201
Learned Perceptual Image Enhancement
Learning a typical image enhancement pipeline involves minimization of a loss
function between enhanced and reference images. While L1 and L2 losses are
perhaps the most widely used functions for this purpose, they do not
necessarily lead to perceptually compelling results. In this paper, we show
that adding a learned no-reference image quality metric to the loss can
significantly improve enhancement operators. This metric is implemented using a
CNN (convolutional neural network) trained on a large-scale dataset labelled
with aesthetic preferences of human raters. This loss allows us to conveniently
perform back-propagation in our learning framework to simultaneously optimize
for similarity to a given ground truth reference and perceptual quality. This
perceptual loss is only used to train parameters of image processing operators,
and does not impose any extra complexity at inference time. Our experiments
demonstrate that this loss can be effective for tuning a variety of operators
such as local tone mapping and dehazing
Joint Regression and Ranking for Image Enhancement
Research on automated image enhancement has gained momentum in recent years,
partially due to the need for easy-to-use tools for enhancing pictures captured
by ubiquitous cameras on mobile devices. Many of the existing leading methods
employ machine-learning-based techniques, by which some enhancement parameters
for a given image are found by relating the image to the training images with
known enhancement parameters. While knowing the structure of the parameter
space can facilitate search for the optimal solution, none of the existing
methods has explicitly modeled and learned that structure. This paper presents
an end-to-end, novel joint regression and ranking approach to model the
interaction between desired enhancement parameters and images to be processed,
employing a Gaussian process (GP). GP allows searching for ideal parameters
using only the image features. The model naturally leads to a ranking technique
for comparing images in the induced feature space. Comparative evaluation using
the ground-truth based on the MIT-Adobe FiveK dataset plus subjective tests on
an additional data-set were used to demonstrate the effectiveness of the
proposed approach.Comment: WACV 201
- …