Cross Pixel Optical Flow Similarity for Self-Supervised Learning
We propose a novel method for learning convolutional neural image
representations without manual supervision. We use motion cues, in the form of
optical flow, to supervise representations of static images. The obvious
approach of training a network to predict flow from a single image can be
needlessly difficult due to intrinsic ambiguities in this prediction task. We
instead propose a much simpler learning goal: embed pixels such that the
similarity between their embeddings matches that between their optical flow
vectors. At test time, the learned deep network can be used without access to
video or flow information and transferred to tasks such as image
classification, detection, and segmentation. Our method, which significantly
simplifies previous attempts at using motion for self-supervision, achieves
state-of-the-art results in self-supervision using motion cues, competitive
results for self-supervision in general, and is overall state of the art in
self-supervised pretraining for semantic image segmentation, as demonstrated on
standard benchmarks.
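The learning goal described in this abstract (make the similarity between pixel embeddings match the similarity between the corresponding optical flow vectors) can be illustrated with a small sketch. The specific choices here are assumptions made for illustration and are not taken from the abstract: cosine similarity, a softmax over each pixel's similarities to the other pixels, and a cross-entropy between the two resulting distributions. The function names are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors (0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv + 1e-8)

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def similarity_rows(vectors, temperature=1.0):
    """For each pixel i, a distribution over the other pixels j by similarity."""
    rows = []
    for i, u in enumerate(vectors):
        sims = [cosine(u, v) / temperature
                for j, v in enumerate(vectors) if j != i]
        rows.append(softmax(sims))
    return rows

def flow_similarity_loss(embeddings, flows, temperature=1.0):
    """Cross-entropy between flow-based and embedding-based similarities.

    Assumed formulation for illustration: the flow similarities act as the
    target distribution and the embedding similarities as the prediction.
    """
    target = similarity_rows(flows, temperature)
    pred = similarity_rows(embeddings, temperature)
    loss = 0.0
    for p_row, q_row in zip(target, pred):
        loss -= sum(p * math.log(q + 1e-12) for p, q in zip(p_row, q_row))
    return loss / len(embeddings)
```

In this sketch the loss is smallest when pixels that move together are also embedded close together. Flow is needed only to compute the loss, which is consistent with the abstract's point that the trained network can be used without video or flow at test time.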
Softmax Splatting for Video Frame Interpolation
Differentiable image sampling in the form of backward warping has seen broad
adoption in tasks like depth estimation and optical flow prediction. In
contrast, how to perform forward warping has seen less attention, partly due to
additional challenges such as resolving the conflict of mapping multiple pixels
to the same target location in a differentiable way. We propose softmax
splatting to address this paradigm shift and show its effectiveness on the
application of frame interpolation. Specifically, given two input frames, we
forward-warp the frames and their feature pyramid representations based on an
optical flow estimate using softmax splatting. In doing so, the softmax
splatting seamlessly handles cases where multiple source pixels map to the same
target location. We then use a synthesis network to predict the interpolation
result from the warped representations. Our softmax splatting allows us to not
only interpolate frames at an arbitrary time but also to fine tune the feature
pyramid and the optical flow. We show that our synthesis approach, empowered by
softmax splatting, achieves new state-of-the-art results for video frame
interpolation.
Comment: CVPR 2020, http://sniklaus.com/softspla
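The forward-warping conflict described in this abstract, where several source pixels land on the same target location, can be sketched as a softmax-weighted average. This is a minimal pure-Python illustration for a single-channel image, not the authors' implementation. The per-pixel `importance` map, the bilinear splatting kernel, and the choice to fill unreached target pixels with 0.0 are assumptions of this sketch, and the function name is hypothetical.

```python
import math

def softmax_splat(image, flow, importance):
    """Forward-warp a 2D grayscale image with softmax-weighted splatting.

    image[y][x] is a float, flow[y][x] is a (dx, dy) pair, and
    importance[y][x] is a float; a larger value wins more of the blend.
    """
    h, w = len(image), len(image[0])
    num = [[0.0] * w for _ in range(h)]  # sum of weight * value
    den = [[0.0] * w for _ in range(h)]  # sum of weight
    # Subtracting the maximum keeps exp() stable; it cancels in num / den.
    z_max = max(max(row) for row in importance)
    for y in range(h):
        for x in range(w):
            dx, dy = flow[y][x]
            tx, ty = x + dx, y + dy  # target position of this source pixel
            x0, y0 = math.floor(tx), math.floor(ty)
            weight = math.exp(importance[y][x] - z_max)
            # Spread the pixel over its four nearest target locations.
            for qy in (y0, y0 + 1):
                for qx in (x0, x0 + 1):
                    if 0 <= qx < w and 0 <= qy < h:
                        b = (1 - abs(tx - qx)) * (1 - abs(ty - qy))
                        num[qy][qx] += weight * b * image[y][x]
                        den[qy][qx] += weight * b
    # Target pixels that received no contribution are left at 0.0.
    return [[num[y][x] / den[y][x] if den[y][x] > 0 else 0.0
             for x in range(w)] for y in range(h)]
```

With equal importance, colliding pixels are averaged; as one pixel's importance grows, the result approaches that pixel's value alone. Because every step is a sum, product, or exponential, the same computation in an autodiff framework would pass gradients to the image, the flow, and the importance map. To warp to an intermediate time t, one option is to scale the flow by t before splatting.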
Discovery of Visual Semantics by Unsupervised and Self-Supervised Representation Learning
The success of deep learning in computer vision is rooted in the ability of
deep networks to scale up model complexity as demanded by challenging visual
tasks. As complexity is increased, so is the need for large amounts of labeled
data to train the model. This is associated with a costly human annotation
effort. To address this concern, with the long-term goal of leveraging the
abundance of cheap unlabeled data, we explore methods of unsupervised
"pre-training." In particular, we propose to use self-supervised automatic
image colorization.
We show that traditional methods for unsupervised learning, such as
layer-wise clustering or autoencoders, remain inferior to supervised
pre-training. In search of an alternative, we develop a fully automatic image
colorization method. Our method sets a new state of the art in revitalizing old
black-and-white photography, without requiring human effort or expertise.
Additionally, it gives us a method for self-supervised representation learning.
In order for the model to appropriately re-color a grayscale object, it must
first be able to identify it. This ability, learned entirely self-supervised,
can be used to improve other visual tasks, such as classification and semantic
segmentation. As a future direction for self-supervision, we investigate whether
multiple proxy tasks can be combined to improve generalization. This turns out
to be a challenging open problem. We hope that our contributions to this
endeavor will provide a foundation for future efforts in making
self-supervision compete with supervised pre-training.
Comment: Ph.D. thesis