Deep Bilateral Learning for Real-Time Image Enhancement
Performance is a critical challenge in mobile image processing. Given a
reference imaging pipeline, or even human-adjusted pairs of images, we seek to
reproduce the enhancements and enable real-time evaluation. For this, we
introduce a new neural network architecture inspired by bilateral grid
processing and local affine color transforms. Using pairs of input/output
images, we train a convolutional neural network to predict the coefficients of
a locally-affine model in bilateral space. Our architecture learns to make
local, global, and content-dependent decisions to approximate the desired image
transformation. At runtime, the neural network consumes a low-resolution
version of the input image, produces a set of affine transformations in
bilateral space, upsamples those transformations in an edge-preserving fashion
using a new slicing node, and then applies those upsampled transformations to
the full-resolution image. Our algorithm processes high-resolution images on a
smartphone in milliseconds, provides a real-time viewfinder at 1080p
resolution, and matches the quality of state-of-the-art approximation
techniques on a large class of image operators. Unlike previous work, our model
is trained off-line from data and therefore does not require access to the
original operator at runtime. This allows our model to learn complex,
scene-dependent transformations for which no reference implementation is
available, such as the photographic edits of a human retoucher. Comment: 12 pages, 14 figures, SIGGRAPH 201
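The final step described in this abstract, applying per-pixel affine color transforms to the full-resolution image, can be illustrated with a minimal NumPy sketch. It assumes the coefficients have already been upsampled to full resolution as one 3x4 matrix per pixel; the function name `apply_local_affine` and the array layout are hypothetical and are not taken from the paper, and the network and slicing node are not modeled here.

```python
import numpy as np

def apply_local_affine(image, coeffs):
    """Apply one 3x4 affine color transform per pixel.

    image:  (H, W, 3) float array of RGB values.
    coeffs: (H, W, 3, 4) float array; the last column of each matrix is a bias.
    Layout and naming are assumptions made for this illustration.
    """
    # Append a constant 1 so the fourth column acts as an additive offset.
    ones = np.ones(image.shape[:2] + (1,), dtype=image.dtype)
    homogeneous = np.concatenate([image, ones], axis=-1)  # (H, W, 4)
    # For each pixel: out[c] = sum_k coeffs[c, k] * homogeneous[k]
    return np.einsum("hwck,hwk->hwc", coeffs, homogeneous)

# Usage: identity coefficients leave the image unchanged.
rng = np.random.default_rng(0)
image = rng.random((4, 5, 3))
identity = np.zeros((4, 5, 3, 4))
identity[..., :3] = np.eye(3)
result = apply_local_affine(image, identity)
```

Because each pixel carries its own matrix, the transform can vary smoothly across the image while remaining a cheap multiply-add per pixel.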
Fully Convolutional Network with Multi-Step Reinforcement Learning for Image Processing
This paper tackles a new problem setting: reinforcement learning with
pixel-wise rewards (pixelRL) for image processing. After the introduction of
the deep Q-network, deep RL has been achieving great success. However, the
applications of deep RL for image processing are still limited. Therefore, we
extend deep RL to pixelRL for various image processing applications. In
pixelRL, each pixel has an agent, and the agent changes the pixel value by
taking an action. We also propose an effective learning method for pixelRL that
significantly improves the performance by considering not only the future
states of its own pixel but also those of neighboring pixels. The proposed
method can be applied to some image processing tasks that require pixel-wise
manipulations, where deep RL has never been applied. We apply the proposed
method to three image processing tasks: image denoising, image restoration, and
local color enhancement. Our experimental results demonstrate that the proposed
method achieves performance comparable to or better than that of
state-of-the-art methods based on supervised learning. Comment: Accepted to AAAI 201
Depth Estimation via Affinity Learned with Convolutional Spatial Propagation Network
Depth estimation from a single image is a fundamental problem in computer
vision. In this paper, we propose a simple yet effective convolutional spatial
propagation network (CSPN) to learn the affinity matrix for depth prediction.
Specifically, we adopt an efficient linear propagation model, where the
propagation is performed in the manner of a recurrent convolutional operation,
and the affinity among neighboring pixels is learned through a deep
convolutional neural network (CNN). We apply the designed CSPN to two depth
estimation tasks given a single image: (1) to refine the depth output from
existing state-of-the-art (SOTA) methods; and (2) to convert sparse depth
samples to a dense depth map by embedding the depth samples within the
propagation procedure. The second task is inspired by the availability of
LIDARs that provide sparse but accurate depth measurements. We evaluated
the proposed CSPN on two popular benchmarks for depth estimation, i.e., NYU v2
and KITTI, where we show that our approach improves over prior SOTA methods
not only in quality (e.g., 30% more reduction in depth error) but also in speed
(e.g., 2 to 5 times faster). Comment: 14 pages, 8 figures, ECCV 201
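A linear propagation step of the kind this abstract describes can be sketched as follows. This is a generic illustration under stated assumptions, not the paper's exact formulation: the affinities are passed in as a plain array rather than predicted by a CNN, the 8-neighborhood and edge padding are choices made here, and the name `propagate_step` is hypothetical. The optional mask shows one simple way to keep sparse depth samples fixed during propagation.

```python
import numpy as np

def propagate_step(depth, affinity, sparse_depth=None, sparse_mask=None):
    """One linear propagation step over the 8-neighborhood.

    depth:    (H, W) current depth estimate.
    affinity: (8, H, W) raw per-pixel affinities to the 8 neighbors
              (assumed given; in practice a network would predict them).
    """
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
               (0, 1), (1, -1), (1, 0), (1, 1)]
    # Normalize so the absolute neighbor weights sum to less than 1,
    # which keeps repeated application of the update stable.
    norm = np.abs(affinity).sum(axis=0) + 1e-8
    weights = affinity / norm
    # The center weight makes all weights at a pixel sum to 1.
    center = 1.0 - weights.sum(axis=0)
    padded = np.pad(depth, 1, mode="edge")
    h, w = depth.shape
    out = center * depth
    for k, (dy, dx) in enumerate(offsets):
        out += weights[k] * padded[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
    # Assumption for this sketch: measured samples overwrite the estimate.
    if sparse_depth is not None and sparse_mask is not None:
        out = np.where(sparse_mask, sparse_depth, out)
    return out

# Usage: a constant depth map is a fixed point of the update.
depth = np.full((4, 4), 2.0)
affinity = np.ones((8, 4, 4))
out = propagate_step(depth, affinity)

# With one sparse sample, that pixel is pinned to the measured value.
mask = np.zeros((4, 4), dtype=bool)
mask[1, 2] = True
sparse = np.full((4, 4), 7.0)
out_sparse = propagate_step(depth, affinity, sparse, mask)
```

Calling the function repeatedly on its own output mimics the recurrent application of the propagation.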
Model Adaptation with Synthetic and Real Data for Semantic Dense Foggy Scene Understanding
This work addresses the problem of semantic scene understanding under dense
fog. Although considerable progress has been made in semantic scene
understanding, it is mainly related to clear-weather scenes. Extending
recognition methods to adverse weather conditions such as fog is crucial for
outdoor applications. In this paper, we propose a novel method, named
Curriculum Model Adaptation (CMAda), which gradually adapts a semantic
segmentation model from light synthetic fog to dense real fog in multiple
steps, using both synthetic and real foggy data. In addition, we present three
other main stand-alone contributions: 1) a novel method to add synthetic fog to
real, clear-weather scenes using semantic input; 2) a new fog density
estimator; 3) the Foggy Zurich dataset comprising real foggy images,
with pixel-level semantic annotations for images with dense fog. Our
experiments show that 1) our fog simulation slightly outperforms a
state-of-the-art competing simulation with respect to the task of semantic
foggy scene understanding (SFSU); 2) CMAda improves the performance of
state-of-the-art models for SFSU significantly by leveraging unlabeled real
foggy data. The datasets and code are publicly available. Comment: final version, ECCV 201
BLADE: Filter Learning for General Purpose Computational Photography
The Rapid and Accurate Image Super Resolution (RAISR) method of Romano,
Isidoro, and Milanfar is a computationally efficient image upscaling method
using a trained set of filters. We describe a generalization of RAISR, which we
name Best Linear Adaptive Enhancement (BLADE). This approach is a trainable
edge-adaptive filtering framework that is general, simple, computationally
efficient, and useful for a wide range of problems in computational
photography. We show applications to operations which may appear in a camera
pipeline, including denoising, demosaicing, and stylization.
SLIC Based Digital Image Enlargement
Low-resolution image enhancement is a classical computer vision problem.
Selecting the best method to reconstruct an image to a higher resolution with
the limited data available in the low-resolution image is quite a challenge. A
major drawback of existing enlargement techniques is the introduction of
color bleeding when interpolating pixels across the edges that separate distinct
colors in an image. Color bleeding accentuates the edges with new
colors as a result of blending multiple colors over adjacent regions. This
paper proposes a novel approach to mitigate the color bleeding by segmenting
the homogeneous color regions of the image using Simple Linear Iterative
Clustering (SLIC) and applying a higher order interpolation technique
separately on the isolated segments. The interpolation at the boundaries of
each of the isolated segments is handled by using a morphological operation.
The approach is evaluated by comparing against several frequently used image
enlargement methods, such as bilinear and bicubic interpolation, in terms of Peak
Signal-to-Noise Ratio (PSNR). The results show that the
proposed method outperforms the baseline methods in terms of PSNR and also
mitigates color bleeding at the edges, which improves the overall
appearance. Comment: 6 page
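The PSNR metric used for the comparison above follows the standard definition, 10 * log10(MAX^2 / MSE). A minimal sketch is given below; the function name `psnr` and the default peak value of 255 (8-bit images) are assumptions for this example, not details from the paper.

```python
import numpy as np

def psnr(reference, test, max_value=255.0):
    """Peak Signal-to-Noise Ratio in dB between two same-sized images."""
    ref = np.asarray(reference, dtype=np.float64)
    tst = np.asarray(test, dtype=np.float64)
    mse = np.mean((ref - tst) ** 2)
    if mse == 0:
        # Identical images: PSNR is unbounded.
        return float("inf")
    return 10.0 * np.log10(max_value ** 2 / mse)

# Usage: compare an enlarged image against a ground-truth reference.
reference = np.zeros((8, 8), dtype=np.uint8)
worst_case = np.full((8, 8), 255, dtype=np.uint8)
score_same = psnr(reference, reference)
score_worst = psnr(reference, worst_case)
```

Higher values indicate a closer match to the reference; the conversion to float64 avoids wrap-around when subtracting unsigned 8-bit arrays.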