Fully-automatic inverse tone mapping algorithm based on dynamic mid-level tone mapping
High Dynamic Range (HDR) displays can show images with higher color contrast levels and peak luminosities than common Low Dynamic Range (LDR) displays. However, most existing video content is recorded and/or graded in LDR format. To show LDR content on HDR displays, it must be up-scaled using a so-called inverse tone mapping algorithm. Several techniques for inverse tone mapping have been proposed in recent years, ranging from simple approaches based on global and local operators to more advanced algorithms such as neural networks. Drawbacks of existing techniques include the need for human intervention, the high computation time of the more advanced algorithms, limited peak brightness, and failure to preserve the artistic intentions. In this paper, we propose a fully-automatic inverse tone mapping operator based on mid-level mapping that is capable of real-time video processing. Our proposed algorithm expands LDR images into HDR images with peak brightness over 1000 nits while preserving the artistic intentions inherent to the HDR domain. We assessed our results using the full-reference objective quality metrics HDR-VDP-2.2 and DRIM, and by carrying out a subjective pair-wise comparison experiment. We compared our results with those obtained with the most recent methods found in the literature. Experimental results demonstrate that our proposed method outperforms the current state of the art among simple inverse tone mapping methods and that its performance is similar to that of more complex and time-consuming advanced techniques.
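To make the idea of a "simple global operator" concrete, the following is a minimal sketch of a generic global inverse tone mapping step. It is not the mid-level mapping operator proposed in the paper. The function names, the sRGB decoding step, the power-law expansion, and the default `exponent` are illustrative assumptions; only the 1000-nit target peak comes from the abstract.

```python
def srgb_to_linear(v):
    """Decode an sRGB-encoded value in [0, 1] to linear light."""
    if v <= 0.04045:
        return v / 12.92
    return ((v + 0.055) / 1.055) ** 2.4


def expand_luminance(ldr_value, peak_nits=1000.0, exponent=1.5):
    """Map one LDR code value in [0, 1] to a luminance in nits.

    Assumption: a plain power-law expansion applied identically to every
    pixel (a "global" operator). The exponent is a hypothetical tuning
    knob, not a value taken from the paper.
    """
    v = min(max(ldr_value, 0.0), 1.0)   # clamp out-of-range input
    linear = srgb_to_linear(v)          # undo display encoding
    return peak_nits * linear ** exponent


def expand_image(pixels, peak_nits=1000.0, exponent=1.5):
    """Apply the same expansion to every pixel of a 2-D list of values."""
    return [[expand_luminance(p, peak_nits, exponent) for p in row]
            for row in pixels]
```

With an exponent above 1, mid-tones are pushed down relative to highlights, so only the brightest LDR values approach the display peak. A fully automatic method such as the one described would have to choose this kind of parameter per image or per frame rather than fixing it by hand.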
Learned Perceptual Image Enhancement
Learning a typical image enhancement pipeline involves minimization of a loss
function between enhanced and reference images. While L1 and L2 losses are
perhaps the most widely used functions for this purpose, they do not
necessarily lead to perceptually compelling results. In this paper, we show
that adding a learned no-reference image quality metric to the loss can
significantly improve enhancement operators. This metric is implemented using a
CNN (convolutional neural network) trained on a large-scale dataset labelled
with aesthetic preferences of human raters. This loss allows us to conveniently
perform back-propagation in our learning framework to simultaneously optimize
for similarity to a given ground truth reference and perceptual quality. This
perceptual loss is only used to train parameters of image processing operators,
and does not impose any extra complexity at inference time. Our experiments
demonstrate that this loss can be effective for tuning a variety of operators
such as local tone mapping and dehazing.
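The combined objective described in this abstract can be sketched as a reference term plus a weighted no-reference quality term. This is a minimal illustration under stated assumptions: `toy_quality_score` is a hypothetical stand-in for the learned CNN metric (it merely rewards contrast), and `weight` is an arbitrary illustrative value. Images are flat lists of pixel intensities to keep the example dependency-free.

```python
def l1_loss(enhanced, reference):
    """Mean absolute difference between two equally sized pixel lists."""
    return sum(abs(e - r) for e, r in zip(enhanced, reference)) / len(reference)


def toy_quality_score(image):
    """Hypothetical no-reference quality score: the standard deviation.

    A real system would call a trained network here; this placeholder
    only exists so the example runs.
    """
    mean = sum(image) / len(image)
    variance = sum((p - mean) ** 2 for p in image) / len(image)
    return variance ** 0.5


def combined_loss(enhanced, reference, quality_fn=toy_quality_score, weight=0.1):
    """Reference similarity minus a weighted quality score.

    Higher quality lowers the loss, so minimizing it trades off fidelity
    to the reference against the no-reference metric.
    """
    return l1_loss(enhanced, reference) - weight * quality_fn(enhanced)
```

Because the quality term appears only in the training objective, dropping it at inference time costs nothing, which matches the abstract's point that the loss adds no extra complexity once the operator is trained.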
A Style-Based Generator Architecture for Generative Adversarial Networks
We propose an alternative generator architecture for generative adversarial
networks, borrowing from style transfer literature. The new architecture leads
to an automatically learned, unsupervised separation of high-level attributes
(e.g., pose and identity when trained on human faces) and stochastic variation
in the generated images (e.g., freckles, hair), and it enables intuitive,
scale-specific control of the synthesis. The new generator improves the
state-of-the-art in terms of traditional distribution quality metrics, leads to
demonstrably better interpolation properties, and also better disentangles the
latent factors of variation. To quantify interpolation quality and
disentanglement, we propose two new, automated methods that are applicable to
any generator architecture. Finally, we introduce a new, highly varied and
high-quality dataset of human faces. Comment: CVPR 2019 final versio
Towards a Semantic Perceptual Image Metric
We present a full reference, perceptual image metric based on VGG-16, an
artificial neural network trained on object classification. We fit the metric
to a new database based on 140k unique images annotated with ground truth by
human raters who received minimal instruction. The resulting metric shows
competitive performance on TID 2013, a database widely used to assess image
quality assessment methods. More interestingly, it shows strong responses to
objects potentially carrying semantic relevance such as faces and text, which
we demonstrate using a visualization technique and ablation experiments. In
effect, the metric appears to model a higher influence of semantic context on
judgments, which we observe particularly in untrained raters. As the vast
majority of users of image processing systems are unfamiliar with Image Quality
Assessment (IQA) tasks, these findings may have significant impact on
real-world applications of perceptual metrics.
Synthesizing Normalized Faces from Facial Identity Features
We present a method for synthesizing a frontal, neutral-expression image of a
person's face given an input face photograph. This is achieved by learning to
generate facial landmarks and textures from features extracted from a
facial-recognition network. Unlike previous approaches, our encoding feature
vector is largely invariant to lighting, pose, and facial expression.
Exploiting this invariance, we train our decoder network using only frontal,
neutral-expression photographs. Since these photographs are well aligned, we
can decompose them into a sparse set of landmark points and aligned texture
maps. The decoder then predicts landmarks and textures independently and
combines them using a differentiable image warping operation. The resulting
images can be used for a number of applications, such as analyzing facial
attributes, exposure and white balance adjustment, or creating a 3-D avatar.
Manipulating Attributes of Natural Scenes via Hallucination
In this study, we explore building a two-stage framework for enabling users
to directly manipulate high-level attributes of a natural scene. The key to our
approach is a deep generative network which can hallucinate images of a scene
as if they were taken at a different season (e.g. during winter), weather
condition (e.g. on a cloudy day) or time of the day (e.g. at sunset). Once the
scene is hallucinated with the given attributes, the corresponding look is then
transferred to the input image while preserving the semantic details intact,
giving a photo-realistic manipulation result. As the proposed framework
hallucinates what the scene will look like, it does not require any reference
style image as commonly utilized in most of the appearance or style transfer
approaches. Moreover, it allows a given scene to be manipulated simultaneously
according to a diverse set of transient attributes within a single model,
eliminating the need to train a separate network for each translation task.
Our comprehensive set of qualitative and quantitative results demonstrates the
effectiveness of our approach against the competing methods. Comment: Accepted for publication in ACM Transactions on Graphic
DSLR-Quality Photos on Mobile Devices with Deep Convolutional Networks
Despite a rapid rise in the quality of built-in smartphone cameras, their
physical limitations (small sensor size, compact lenses and the lack of
specific hardware) prevent them from achieving the quality of DSLR
cameras. In this work we present an end-to-end deep learning approach that
bridges this gap by translating ordinary photos into DSLR-quality images. We
propose learning the translation function using a residual convolutional neural
network that improves both color rendition and image sharpness. Since the
standard mean squared loss is not well suited for measuring perceptual image
quality, we introduce a composite perceptual error function that combines
content, color and texture losses. The first two losses are defined
analytically, while the texture loss is learned in an adversarial fashion. We
also present DPED, a large-scale dataset that consists of real photos captured
from three different phones and one high-end reflex camera. Our quantitative
and qualitative assessments reveal that the enhanced image quality is
comparable to that of DSLR-taken photos, while the methodology generalizes
to any type of digital camera.
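The composite error function mentioned in this abstract combines content, color and texture terms. The sketch below shows one plausible shape for such a combination, using 1-D pixel lists. It is an illustration only: the box blur inside `color_loss`, the helper names, and the default `weights` are assumptions made for this example, and the content and texture terms are passed in as plain numbers because the abstract does not specify how they are computed beyond the texture loss being learned adversarially.

```python
def box_blur(signal, radius=1):
    """Average each sample with its neighbours within `radius`."""
    n = len(signal)
    blurred = []
    for i in range(n):
        window = signal[max(0, i - radius):min(n, i + radius + 1)]
        blurred.append(sum(window) / len(window))
    return blurred


def mse(a, b):
    """Mean squared error between two equally sized lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def color_loss(enhanced, target, radius=1):
    """Compare blurred signals so fine texture differences count less.

    Assumption: blurring before comparison is one way to focus a loss
    on color and brightness rather than on small-scale detail.
    """
    return mse(box_blur(enhanced, radius), box_blur(target, radius))


def composite_loss(content, color, texture, weights=(1.0, 0.1, 0.4)):
    """Weighted sum of the three terms; the weights are hypothetical."""
    w_content, w_color, w_texture = weights
    return w_content * content + w_color * color + w_texture * texture
```

For two signals that differ only in fine detail, such as `[0.0, 1.0, 0.0, 1.0]` and `[1.0, 0.0, 1.0, 0.0]`, `color_loss` is much smaller than the plain `mse`, which is the behaviour a color-focused term is meant to have.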