3 research outputs found
Structural Consistency and Controllability for Diverse Colorization
Colorizing a given gray-level image is an important task in the media and
advertising industry. Due to the ambiguity inherent to colorization (many
shades are often plausible), recent approaches started to explicitly model
diversity. However, one of the most obvious artifacts, structural
inconsistency, is rarely considered by existing methods which predict
chrominance independently for every pixel. To address this issue, we develop a
conditional random field based variational auto-encoder formulation which is
able to achieve diversity while taking into account structural consistency.
Moreover, we introduce a controllability mechanism that can incorporate
external constraints from diverse sources including a user interface.
Compared to existing baselines, we demonstrate that our method obtains more
diverse and globally consistent colorizations on the LFW, LSUN-Church and
ILSVRC-2015 datasets.
Comment: Accepted to ECCV 201
Deep Photo Cropper and Enhancer
This paper introduces a new type of image enhancement problem. Compared to
traditional image enhancement methods, which mostly deal with pixel-wise
modifications of a given photo, our proposed task is to crop an image which is
embedded within a photo and enhance the quality of the cropped image. We split
our proposed approach into two deep networks: deep photo cropper and deep image
enhancer. In the photo cropper network, we employ a spatial transformer to
extract the embedded image. In the photo enhancer, we employ super-resolution
to increase the number of pixels in the embedded image and reduce the effect of
stretching and distortion of pixels. We use cosine distance loss between image
features and ground truth for the cropper and the mean square loss for the
enhancer. Furthermore, we propose a new dataset to train and test the proposed
method. Finally, we analyze the proposed method with respect to qualitative and
quantitative evaluations.
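
The two loss terms named in the abstract above can be illustrated with a minimal sketch. It is written in plain Python on flat lists of numbers and is not the authors' implementation; the function names `cosine_distance` and `mean_squared_error`, the `eps` guard, and the choice to treat features as flat vectors are all assumptions made for illustration.

```python
import math


def cosine_distance(u, v, eps=1e-12):
    """Return 1 - cosine similarity between two equal-length vectors.

    Hypothetical helper: the abstract only states that a cosine distance
    loss is used for the cropper, not how features are laid out.
    """
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # eps avoids division by zero when either vector is all zeros.
    return 1.0 - dot / max(norm_u * norm_v, eps)


def mean_squared_error(pred, target):
    """Return the mean of squared element-wise differences.

    Hypothetical helper standing in for the enhancer's mean square loss.
    """
    if len(pred) != len(target):
        raise ValueError("inputs must have the same length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


if __name__ == "__main__":
    # Made-up feature vectors and pixel values, for illustration only.
    print(cosine_distance([1.0, 0.0], [0.0, 1.0]))
    print(mean_squared_error([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
```

Cosine distance depends only on the direction of the feature vectors, not their magnitude, whereas the mean square loss penalizes per-element differences directly.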
SCGAN: Saliency Map-guided Colorization with Generative Adversarial Network
Given a grayscale photograph, a colorization system estimates a visually
plausible color image. Conventional methods often use semantics to colorize
grayscale images. However, in these methods, only classification semantic
information is embedded, resulting in semantic confusion and color bleeding in
the final colorized image. To address these issues, we propose a fully
automatic Saliency Map-guided Colorization with Generative Adversarial Network
(SCGAN) framework. It jointly predicts the colorization and saliency map to
minimize semantic confusion and color bleeding in the colorized image. Because
global features from a pre-trained VGG-16-Gray network are embedded into the
colorization encoder, the proposed SCGAN can be trained with much less data
than state-of-the-art methods to achieve perceptually reasonable colorization.
In addition, we propose a novel saliency map-based guidance method. Branches of
the colorization decoder are used to predict the saliency map as a proxy
target. Moreover, two hierarchical discriminators are utilized for the
generated colorization and saliency map, respectively, in order to strengthen
visual perception performance. The proposed system is evaluated on the ImageNet
validation set. Experimental results show that SCGAN can generate more
reasonable colorized images than state-of-the-art techniques.
Comment: accepted by IEEE Transactions on Circuits and Systems for Video
Technolog