Unsupervised Diverse Colorization via Generative Adversarial Networks
Colorization of grayscale images has been a hot topic in computer vision.
Previous research mainly focuses on producing a colored image to match the
original one. However, since many colors share the same gray value, an input
grayscale image could be diversely colored while maintaining its reality. In
this paper, we design a novel solution for unsupervised diverse colorization.
Specifically, we leverage conditional generative adversarial networks to model
the distribution of real-world item colors, in which we develop a fully
convolutional generator with multi-layer noise to enhance diversity, with
multi-layer condition concatenation to maintain reality, and with stride 1 to
keep spatial information. With such a novel network architecture, the model
yields highly competitive performance on the open LSUN bedroom dataset. The
Turing test of 80 humans further indicates our generated color schemes are
highly convincing.
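The ambiguity this abstract starts from, that many colors share the same gray value, can be checked directly. The sketch below is only an illustration: it assumes the Rec. 601 luma weights as the grayscale conversion (the abstract does not name one), and the names `luma` and `colors_with_gray` are hypothetical, not taken from the paper.

```python
def luma(rgb):
    """Gray level of an 8-bit RGB triple.

    Assumption: Rec. 601 luma weights; the abstract does not specify
    which grayscale conversion is used.
    """
    r, g, b = rgb
    return round(0.299 * r + 0.587 * g + 0.114 * b)


def colors_with_gray(target, step=16):
    """List every RGB triple on a coarse grid whose gray level equals target."""
    levels = range(0, 256, step)
    return [
        (r, g, b)
        for r in levels
        for g in levels
        for b in levels
        if luma((r, g, b)) == target
    ]


# Neutral gray and a saturated red both collapse to gray level 128.
matches = colors_with_gray(128)
print((128, 128, 128) in matches, (240, 80, 80) in matches)
print(len(matches), "grid colors share gray level 128")
```

Because the mapping from color to gray is many-to-one, any of the listed colors is a valid answer for a pixel with gray level 128, which is why a colorization model can reasonably produce several different outputs for one input.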
Colorization as a Proxy Task for Visual Understanding
We investigate and improve self-supervision as a drop-in replacement for
ImageNet pretraining, focusing on automatic colorization as the proxy task.
Self-supervised training has been shown to be more promising for utilizing
unlabeled data than other, traditional unsupervised learning methods. We build
on this success and evaluate the ability of our self-supervised network in
several contexts. On VOC segmentation and classification tasks, we present
results that are state-of-the-art among methods not using ImageNet labels for
pretraining representations.
Moreover, we present the first in-depth analysis of self-supervision via
colorization, concluding that formulation of the loss, training details and
network architecture play important roles in its effectiveness. This
investigation is further expanded by revisiting the ImageNet pretraining
paradigm, asking questions such as: How much training data is needed? How many
labels are needed? How much do features change when fine-tuned? We relate these
questions back to self-supervision by showing that colorization provides a
similarly powerful supervisory signal as various flavors of ImageNet
pretraining.
Comment: CVPR 2017 (Project page: http://people.cs.uchicago.edu/~larsson/color-proxy/)
Probabilistic Image Colorization
We develop a probabilistic technique for colorizing grayscale natural images.
In light of the intrinsic uncertainty of this task, the proposed probabilistic
framework has numerous desirable properties. In particular, our model is able
to produce multiple plausible and vivid colorizations for a given grayscale
image and is one of the first colorization models to provide a proper
stochastic sampling scheme. Moreover, our training procedure is supported by a
rigorous theoretical framework that does not require any ad hoc heuristics and
allows for efficient modeling and learning of the joint pixel color
distribution. We demonstrate strong quantitative and qualitative experimental
results on the CIFAR-10 dataset and the challenging ILSVRC 2012 dataset.
Cross Pixel Optical Flow Similarity for Self-Supervised Learning
We propose a novel method for learning convolutional neural image
representations without manual supervision. We use motion cues in the form of
optical flow to supervise representations of static images. The obvious
approach of training a network to predict flow from a single image can be
needlessly difficult due to intrinsic ambiguities in this prediction task. We
instead propose a much simpler learning goal: embed pixels such that the
similarity between their embeddings matches that between their optical flow
vectors. At test time, the learned deep network can be used without access to
video or flow information and transferred to tasks such as image
classification, detection, and segmentation. Our method, which significantly
simplifies previous attempts at using motion for self-supervision, achieves
state-of-the-art results in self-supervision using motion cues, competitive
results for self-supervision in general, and is overall state of the art in
self-supervised pretraining for semantic image segmentation, as demonstrated on
standard benchmarks.
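The learning goal described in this abstract, making the similarity between pixel embeddings match the similarity between the corresponding optical flow vectors, can be sketched as a pairwise loss. This is a minimal illustration under stated assumptions, not the authors' formulation: the abstract does not say which similarity measure or penalty is used, so cosine similarity and a mean squared difference are stand-ins, and the names `cosine` and `similarity_matching_loss` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two vectors; defined as 0.0 if either is zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_matching_loss(embeddings, flows):
    """Mean squared gap between embedding similarity and flow similarity.

    Assumption: cosine similarity and a squared penalty are illustrative
    choices; the abstract only states that the two similarities should match.
    """
    total, pairs = 0.0, 0
    for i in range(len(embeddings)):
        for j in range(i + 1, len(embeddings)):
            gap = cosine(embeddings[i], embeddings[j]) - cosine(flows[i], flows[j])
            total += gap * gap
            pairs += 1
    return total / pairs if pairs else 0.0


# Three pixels: the first two move right, the third moves left.
flows = [(1.0, 0.0), (1.0, 0.0), (-1.0, 0.0)]
good = [(1.0, 0.0), (1.0, 0.0), (-1.0, 0.0)]   # groups pixels like the flow does
bad = [(1.0, 0.0), (-1.0, 0.0), (1.0, 0.0)]    # groups the wrong pixels together
print(similarity_matching_loss(good, flows))
print(similarity_matching_loss(bad, flows))
```

Only the pairwise structure of the flow is used as a target, so the network never has to predict a flow vector from a single image. That matches the abstract's point that direct flow prediction is needlessly ambiguous.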
The Missing Data Encoder: Cross-Channel Image Completion with Hide-And-Seek Adversarial Network
Image completion is the problem of generating whole images from fragments
only. It encompasses inpainting (generating a patch given its surrounding),
reverse inpainting/extrapolation (generating the periphery given the central
patch) as well as colorization (generating one or several channels given other
ones). In this paper, we employ a deep network to perform image completion,
with adversarial training as well as perceptual and completion losses, and call
it the "missing data encoder" (MDE). We consider several configurations based
on how the seed fragments are chosen. We show that training MDE for "random
extrapolation and colorization" (MDE-REC), i.e. using random
channel-independent fragments, allows a better capture of the image semantics
and geometry. MDE training makes use of a novel "hide-and-seek" adversarial
loss, where the discriminator seeks the original non-masked regions, while the
generator tries to hide them. We validate our models both qualitatively and
quantitatively on several datasets, showing their usefulness for image
completion, unsupervised representation learning, and face occlusion
handling.
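The "random channel-independent fragments" used as seeds in this abstract can be pictured as one visibility mask per channel, each drawn separately. The sketch below is an assumption-heavy illustration: it supposes that a fragment is a single rectangle per channel and that hidden pixels are filled with a constant, neither of which is stated in the abstract. The names `fragment_mask`, `channel_independent_masks` and `hide` are hypothetical.

```python
import random


def fragment_mask(height, width, frag_h, frag_w, rng):
    """Binary mask with one randomly placed visible rectangle (1 = visible).

    Assumption: a rectangular fragment; the abstract does not give its shape.
    """
    top = rng.randrange(height - frag_h + 1)
    left = rng.randrange(width - frag_w + 1)
    return [
        [1 if top <= y < top + frag_h and left <= x < left + frag_w else 0
         for x in range(width)]
        for y in range(height)
    ]


def channel_independent_masks(channels, height, width, frag_h, frag_w, rng):
    """Draw a separate mask for every channel, so visible regions can differ."""
    return [fragment_mask(height, width, frag_h, frag_w, rng)
            for _ in range(channels)]


def hide(image, masks, fill=0):
    """Keep visible pixels and replace hidden ones with a fill value.

    image and masks are nested lists indexed as [channel][row][column].
    """
    return [
        [[px if m else fill for px, m in zip(row, mask_row)]
         for row, mask_row in zip(chan, mask_chan)]
        for chan, mask_chan in zip(image, masks)
    ]


rng = random.Random(0)
image = [[[c * 100 + y * 10 + x + 1 for x in range(6)] for y in range(6)]
         for c in range(3)]
masks = channel_independent_masks(3, 6, 6, frag_h=2, frag_w=3, rng=rng)
seed = hide(image, masks)
```

Under this reading, a pixel location can be visible in one channel and hidden in another. A completion network trained on such seeds therefore has to fill in missing regions (inpainting or extrapolation) and missing channels (colorization) at the same time.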