829 research outputs found
DeepFuse: A Deep Unsupervised Approach for Exposure Fusion with Extreme Exposure Image Pairs
We present a novel deep learning architecture for fusing static
multi-exposure images. Current multi-exposure fusion (MEF) approaches use
hand-crafted features to fuse input sequence. However, the weak hand-crafted
representations are not robust to varying input conditions. Moreover, they
perform poorly for extreme exposure image pairs. Thus, it is highly desirable
to have a method that is robust to varying input conditions and capable of
handling extreme exposure without artifacts. Deep representations have known to
be robust to input conditions and have shown phenomenal performance in a
supervised setting. However, the stumbling block in using deep learning for MEF
was the lack of sufficient training data and an oracle to provide the
ground-truth for supervision. To address the above issues, we have gathered a
large dataset of multi-exposure image stacks for training and to circumvent the
need for ground truth images, we propose an unsupervised deep learning
framework for MEF utilizing a no-reference quality metric as loss function. The
proposed approach uses a novel CNN architecture trained to learn the fusion
operation without reference ground truth image. The model fuses a set of common
low level features extracted from each image to generate artifact-free
perceptually pleasing results. We perform extensive quantitative and
qualitative evaluation and show that the proposed technique outperforms
existing state-of-the-art approaches for a variety of natural images.Comment: ICCV 201
Deep Bilateral Learning for Real-Time Image Enhancement
Performance is a critical challenge in mobile image processing. Given a
reference imaging pipeline, or even human-adjusted pairs of images, we seek to
reproduce the enhancements and enable real-time evaluation. For this, we
introduce a new neural network architecture inspired by bilateral grid
processing and local affine color transforms. Using pairs of input/output
images, we train a convolutional neural network to predict the coefficients of
a locally-affine model in bilateral space. Our architecture learns to make
local, global, and content-dependent decisions to approximate the desired image
transformation. At runtime, the neural network consumes a low-resolution
version of the input image, produces a set of affine transformations in
bilateral space, upsamples those transformations in an edge-preserving fashion
using a new slicing node, and then applies those upsampled transformations to
the full-resolution image. Our algorithm processes high-resolution images on a
smartphone in milliseconds, provides a real-time viewfinder at 1080p
resolution, and matches the quality of state-of-the-art approximation
techniques on a large class of image operators. Unlike previous work, our model
is trained off-line from data and therefore does not require access to the
original operator at runtime. This allows our model to learn complex,
scene-dependent transformations for which no reference implementation is
available, such as the photographic edits of a human retoucher.Comment: 12 pages, 14 figures, Siggraph 201
Virtual Rephotography: Novel View Prediction Error for 3D Reconstruction
The ultimate goal of many image-based modeling systems is to render
photo-realistic novel views of a scene without visible artifacts. Existing
evaluation metrics and benchmarks focus mainly on the geometric accuracy of the
reconstructed model, which is, however, a poor predictor of visual accuracy.
Furthermore, using only geometric accuracy by itself does not allow evaluating
systems that either lack a geometric scene representation or utilize coarse
proxy geometry. Examples include light field or image-based rendering systems.
We propose a unified evaluation approach based on novel view prediction error
that is able to analyze the visual quality of any method that can render novel
views from input images. One of the key advantages of this approach is that it
does not require ground truth geometry. This dramatically simplifies the
creation of test datasets and benchmarks. It also allows us to evaluate the
quality of an unknown scene during the acquisition and reconstruction process,
which is useful for acquisition planning. We evaluate our approach on a range
of methods including standard geometry-plus-texture pipelines as well as
image-based rendering techniques, compare it to existing geometry-based
benchmarks, and demonstrate its utility for a range of use cases.Comment: 10 pages, 12 figures, paper was submitted to ACM Transactions on
Graphics for revie
Improving SLI Performance in Optically Challenging Environments
The construction of 3D models of real-world scenes using non-contact methods is an important problem in computer vision. Some of the more successful methods belong to a class of techniques called structured light illumination (SLI). While SLI methods are generally very successful, there are cases where their performance is poor. Examples include scenes with a high dynamic range in albedo or scenes with strong interreflections. These scenes are referred to as optically challenging environments.
The work in this dissertation is aimed at improving SLI performance in optically challenging environments. A new method of high dynamic range imaging (HDRI) based on pixel-by-pixel Kalman filtering is developed. Using objective metrics, it is show to achieve as much as a 9.4 dB improvement in signal-to-noise ratio and as much as a 29% improvement in radiometric accuracy over a classic method. Quality checks are developed to detect and quantify multipath interference and other quality defects using phase measuring profilometry (PMP). Techniques are established to improve SLI performance in the presence of strong interreflections. Approaches in compressed sensing are applied to SLI, and interreflections in a scene are modeled using SLI. Several different applications of this research are also discussed
PC-MSDM: A quality metric for 3D point clouds
International audienceIn this paper, we present PC-MSDM, an objective metric for visual quality assessment of 3D point clouds. This full-reference metric is based on local curvature statistics and can be viewed as an extension for point clouds of the MSDM metric suited for 3D meshes. We evaluate its performance on an open subjective dataset of point clouds compressed by octree pruning; results show that the proposed metric outperforms its counterparts in terms of correlation with mean opinion scores
Semi-Sparsity for Smoothing Filters
In this paper, we propose an interesting semi-sparsity smoothing algorithm
based on a novel sparsity-inducing optimization framework. This method is
derived from the multiple observations, that is, semi-sparsity prior knowledge
is more universally applicable, especially in areas where sparsity is not fully
admitted, such as polynomial-smoothing surfaces. We illustrate that this
semi-sparsity can be identified into a generalized -norm minimization in
higher-order gradient domains, thereby giving rise to a new "feature-aware"
filtering method with a powerful simultaneous-fitting ability in both sparse
features (singularities and sharpening edges) and non-sparse regions
(polynomial-smoothing surfaces). Notice that a direct solver is always
unavailable due to the non-convexity and combinatorial nature of -norm
minimization. Instead, we solve the model based on an efficient half-quadratic
splitting minimization with fast Fourier transforms (FFTs) for acceleration. We
finally demonstrate its versatility and many benefits to a series of
signal/image processing and computer vision applications
Recommended from our members
From Pixels to Physics: Probabilistic Color De-Rendering
Consumer digital cameras use tone-mapping to produce compact, narrow-gamut images that are nonetheless visually pleasing. In doing so, they discard or distort substantial radiometric signal that could otherwise be used for computer vision. Existing methods attempt to undo these effects through deterministic maps that de-render the reported narrow-gamut colors back to their original wide-gamut sensor measurements. Deterministic approaches are unreliable, however, because the reverse narrow-to-wide mapping is one-to-many and has inherent uncertainty. Our solution is to use probabilistic maps, providing uncertainty estimates useful to many applications. We use a non-parametric Bayesian regression technique - local Gaussian process regression - to learn for each pixel's narrow-gamut color a probability distribution over the scene colors that could have created it. Using a variety of consumer cameras we show that these distributions, once learned from training data, are effective in simple probabilistic adaptations of two popular applications: multi-exposure imaging and photometric stereo. Our results on these applications are better than those of corresponding deterministic approaches, especially for saturated and out-of-gamut colors.Engineering and Applied Science
- …