The Unreasonable Effectiveness of Deep Features as a Perceptual Metric
While it is nearly effortless for humans to quickly assess the perceptual
similarity between two images, the underlying processes are thought to be quite
complex. Despite this, the most widely used perceptual metrics today, such as
PSNR and SSIM, are simple, shallow functions, and fail to account for many
nuances of human perception. Recently, the deep learning community has found
that features of the VGG network, trained on ImageNet classification, are
remarkably useful as a training loss for image synthesis. But how perceptual
are these so-called "perceptual losses"? What elements are critical for their
success? To answer these questions, we introduce a new dataset of human
perceptual similarity judgments. We systematically evaluate deep features
across different architectures and tasks and compare them with classic metrics.
We find that deep features outperform all previous metrics by large margins on
our dataset. More surprisingly, this result is not restricted to
ImageNet-trained VGG features, but holds across different deep architectures
and levels of supervision (supervised, self-supervised, or even unsupervised).
Our results suggest that perceptual similarity is an emergent property shared
across deep visual representations.

Comment: Accepted to CVPR 2018; Code and data available at
https://www.github.com/richzhang/PerceptualSimilarit
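The general recipe behind such deep-feature perceptual distances can be sketched without any framework machinery: unit-normalize each layer's activations along the channel axis, take squared differences, and average over space and layers. The snippet below is a minimal illustration of that recipe, not the authors' exact implementation; the random arrays merely stand in for real network activations.

```python
import numpy as np

def perceptual_distance(feats_a, feats_b):
    """Distance in the spirit of deep perceptual metrics:
    unit-normalize each layer's features along the channel axis,
    then average squared differences over space and layers."""
    total = 0.0
    for fa, fb in zip(feats_a, feats_b):
        # fa, fb: (channels, height, width) activations from one layer
        fa = fa / (np.linalg.norm(fa, axis=0, keepdims=True) + 1e-10)
        fb = fb / (np.linalg.norm(fb, axis=0, keepdims=True) + 1e-10)
        total += np.mean(np.sum((fa - fb) ** 2, axis=0))
    return total / len(feats_a)

# identical feature stacks give zero distance
rng = np.random.default_rng(0)
feats = [rng.normal(size=(8, 4, 4)) for _ in range(3)]
print(perceptual_distance(feats, feats))  # 0.0
```

In practice the feature stacks would come from several layers of a pretrained network (e.g., VGG), and the paper additionally learns per-channel weights on the squared differences.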
How Does the Low-Rank Matrix Decomposition Help Internal and External Learnings for Super-Resolution?
Effectively combining internal and external learning methods is a new
challenge in super-resolution. To address this issue, we analyze the two
methodologies and make two observations about their recovered details: 1) they
are complementary in both the feature space and the image plane; 2) they are
sparsely distributed in the spatial domain. These observations inspire us to
propose a low-rank solution that effectively integrates the two learning
methods and achieves superior results. To fit this solution, the internal and
external learning methods are tailored to produce multiple preliminary
results. Our theoretical analysis and experiments prove that the proposed
low-rank solution does not require massive inputs to guarantee performance,
thereby simplifying the design of the two learning methods. Extensive
experiments show that the proposed solution improves each single learning
method in both qualitative and quantitative assessments. Surprisingly, it
shows even greater capability on noisy images and outperforms state-of-the-art
methods.
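One way a low-rank decomposition can fuse several preliminary results is to stack the vectorized estimates as matrix columns and keep only the leading singular components, which capture the content shared across estimates while discarding their independent errors. The sketch below is a simplified stand-in for the paper's solution, with all names invented for the example.

```python
import numpy as np

def low_rank_fuse(estimates, rank=1):
    """Fuse multiple preliminary super-resolution estimates by a
    low-rank approximation: stack each vectorized estimate as a
    matrix column, keep the top `rank` singular components, and
    average the reconstructed columns."""
    shape = estimates[0].shape
    M = np.stack([e.ravel() for e in estimates], axis=1)   # (pixels, n)
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    M_low = (U[:, :rank] * s[:rank]) @ Vt[:rank, :]        # rank-truncated M
    return M_low.mean(axis=1).reshape(shape)

# toy demo: rank-1 fusion of noisy copies keeps their common component
rng = np.random.default_rng(0)
clean = rng.normal(size=(16, 16))
noisy = [clean + 0.1 * rng.normal(size=clean.shape) for _ in range(5)]
fused = low_rank_fuse(noisy, rank=1)
```

With `rank` equal to the number of estimates the truncation is exact and the fusion reduces to a plain average; the interesting regime is a small `rank`, where the sparse, complementary details described in the abstract survive while uncorrelated noise is suppressed.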
Photometric Depth Super-Resolution
This study explores the use of photometric techniques (shape-from-shading and
uncalibrated photometric stereo) for upsampling the low-resolution depth map
from an RGB-D sensor to the higher resolution of the companion RGB image. A
single-shot variational approach is first put forward, which is effective as
long as the target's reflectance is piecewise-constant. It is then shown that
this dependency upon a specific reflectance model can be relaxed by focusing on
a specific class of objects (e.g., faces) and delegating reflectance estimation
to a deep neural network. A multi-shot strategy based on randomly varying
lighting conditions is eventually discussed. It requires no training or prior
on the reflectance, yet this comes at the price of a dedicated acquisition
setup. Both quantitative and qualitative evaluations illustrate the
effectiveness of the proposed methods on synthetic and real-world scenarios.

Comment: IEEE Transactions on Pattern Analysis and Machine Intelligence
(T-PAMI), 2019. First three authors contributed equally.
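The photometric cues this line of work builds on are easiest to see in the calibrated, Lambertian special case: with known directional lights, each pixel's intensities give a small least-squares problem for its albedo-scaled normal. The sketch below illustrates only that textbook special case, not the paper's uncalibrated or learned variants; all names are invented for the example.

```python
import numpy as np

def photometric_stereo_normals(images, lights):
    """Recover per-pixel albedo-scaled normals from images taken
    under known directional lights (Lambertian model I = rho * n.l)
    via per-pixel least squares."""
    n_img, h, w = images.shape
    I = images.reshape(n_img, -1)                    # (n_img, pixels)
    G, *_ = np.linalg.lstsq(lights, I, rcond=None)   # (3, pixels) = rho * n
    rho = np.linalg.norm(G, axis=0)                  # per-pixel albedo
    normals = G / (rho + 1e-10)                      # unit normals
    return normals.reshape(3, h, w), rho.reshape(h, w)

# toy demo: a flat surface (normal (0,0,1), albedo 0.5) under three lights
lights = np.array([[0.0, 0.0, 1.0],
                   [0.6, 0.0, 0.8],
                   [0.0, 0.6, 0.8]])
images = np.stack([np.full((4, 4), 0.5 * l[2]) for l in lights])
normals, rho = photometric_stereo_normals(images, lights)
```

The uncalibrated setting studied in the paper drops the assumption that `lights` is known, which is what makes a dedicated acquisition setup (or a learned reflectance prior) necessary.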