Laplacian-Steered Neural Style Transfer
Neural Style Transfer based on Convolutional Neural Networks (CNN) aims to
synthesize a new image that retains the high-level structure of a content
image, rendered in the low-level texture of a style image. This is achieved by
constraining the new image to have high-level CNN features similar to the
content image, and lower-level CNN features similar to the style image. However,
in the traditional optimization objective, low-level features of the content
image are absent, and the low-level features of the style image dominate the
low-level detail structures of the new image. Hence in the synthesized image,
many details of the content image are lost, and a lot of inconsistent and
unpleasing artifacts appear. As a remedy, we propose to steer image synthesis
with a novel loss function: the Laplacian loss. The Laplacian matrix
("Laplacian" in short), produced by a Laplacian operator, is widely used in
computer vision to detect edges and contours. The Laplacian loss measures the
difference of the Laplacians, and correspondingly the difference of the detail
structures, between the content image and a new image. It is flexible and
compatible with the traditional style transfer constraints. By incorporating
the Laplacian loss, we obtain a new optimization objective for neural style
transfer named Lapstyle. Minimizing this objective will produce a stylized
image that better preserves the detail structures of the content image and
eliminates the artifacts. Experiments show that Lapstyle produces more
appealing stylized images with fewer artifacts, without compromising their
"stylishness".
Comment: Accepted by the ACM Multimedia Conference (MM) 2017. 9 pages, 65
figures
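The Laplacian loss described in the abstract above can be illustrated with a minimal NumPy sketch. The abstract does not give the exact formulation, so several choices here are assumptions: the 4-neighbour 3x3 kernel, the single-channel input, the "valid" border handling, and the sum of squared differences. The function names `laplacian` and `laplacian_loss` are hypothetical.

```python
import numpy as np

# Assumed kernel: the common 4-neighbour discrete Laplacian.
LAPLACIAN_KERNEL = np.array(
    [[0.0, -1.0, 0.0],
     [-1.0, 4.0, -1.0],
     [0.0, -1.0, 0.0]]
)


def laplacian(image):
    """Filter a 2-D grayscale image with the kernel ("valid" borders)."""
    height, width = image.shape
    out = np.zeros((height - 2, width - 2))
    # Accumulate the nine shifted, weighted copies of the image.
    for dy in range(3):
        for dx in range(3):
            weight = LAPLACIAN_KERNEL[dy, dx]
            out += weight * image[dy:dy + height - 2, dx:dx + width - 2]
    return out


def laplacian_loss(content, generated):
    """Sum of squared differences between the two Laplacians (assumed form)."""
    diff = laplacian(content) - laplacian(generated)
    return float(np.sum(diff ** 2))


# A vertical edge: left half dark, right half bright.
content = np.zeros((5, 5))
content[:, 2:] = 1.0

same_loss = laplacian_loss(content, content)
shifted_loss = laplacian_loss(content, content + 3.0)
flat_loss = laplacian_loss(content, np.zeros((5, 5)))
```

Because the kernel sums to zero, a uniform brightness shift leaves the loss unchanged, while removing the edge raises it. This is the sense in which such a loss tracks detail structure rather than absolute intensity.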
WESPE: Weakly Supervised Photo Enhancer for Digital Cameras
Low-end and compact mobile cameras demonstrate limited photo quality mainly
due to space, hardware and budget constraints. In this work, we propose a deep
learning solution that translates photos taken by cameras with limited
capabilities into DSLR-quality photos automatically. We tackle this problem by
introducing a weakly supervised photo enhancer (WESPE) - a novel image-to-image
Generative Adversarial Network-based architecture. The proposed model is
trained under weak supervision: unlike previous works, there is no need for
strong supervision in the form of a large annotated dataset of aligned
original/enhanced photo pairs. The sole requirement is two distinct datasets:
one from the source camera, and one composed of arbitrary high-quality images
that can be generally crawled from the Internet - the visual content they
exhibit may be unrelated. Hence, our solution is repeatable for any camera:
collecting the data and training can be achieved in a couple of hours. In this
work, we emphasize extensive evaluation of the obtained results. Besides
standard objective metrics and a subjective user study, we train a virtual rater
in the form of a separate CNN that mimics human raters on Flickr data and use
this network to get reference scores for both original and enhanced photos. Our
experiments on the DPED, KITTI and Cityscapes datasets as well as pictures from
several generations of smartphones demonstrate that WESPE produces qualitative
results comparable to or better than those of state-of-the-art strongly
supervised methods.
Reversible GANs for Memory-efficient Image-to-Image Translation
The Pix2pix and CycleGAN losses have vastly improved the qualitative and
quantitative visual quality of results in image-to-image translation tasks. We
extend this framework by exploring approximately invertible architectures which
are well suited to these losses. These architectures are approximately
invertible by design and thus partially satisfy cycle-consistency before
training even begins. Furthermore, since invertible architectures have constant
memory complexity in depth, these models can be built arbitrarily deep. We are
able to demonstrate superior quantitative output on the Cityscapes and Maps
datasets at a near-constant memory budget.
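The abstract above does not say which invertible layers are used. As a hedged illustration of what "invertible by design" can mean, the sketch below uses an additive coupling block, a common construction for reversible networks. It is an assumption here, not the paper's confirmed architecture, and the names `make_residual`, `coupling_forward` and `coupling_inverse` are hypothetical.

```python
import numpy as np


def make_residual(seed, width):
    """Build an arbitrary (not necessarily invertible) residual function."""
    rng = np.random.default_rng(seed)
    weight = rng.standard_normal((width, width))
    return lambda x: np.tanh(x @ weight)


def coupling_forward(x1, x2, f, g):
    """Additive coupling: each half is shifted by a function of the other."""
    y1 = x1 + f(x2)
    y2 = x2 + g(y1)
    return y1, y2


def coupling_inverse(y1, y2, f, g):
    """Undo the forward pass by subtracting the same shifts in reverse order."""
    x2 = y2 - g(y1)
    x1 = y1 - f(x2)
    return x1, x2


width = 4
f = make_residual(0, width)
g = make_residual(1, width)

rng = np.random.default_rng(42)
x1 = rng.standard_normal((2, width))
x2 = rng.standard_normal((2, width))

y1, y2 = coupling_forward(x1, x2, f, g)
r1, r2 = coupling_inverse(y1, y2, f, g)
```

Since the inputs of such a block can be recomputed from its outputs, intermediate activations need not be stored for backpropagation, which is how memory use can stay roughly constant as depth grows.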