Learning Less is More - 6D Camera Localization via 3D Surface Regression
Popular research areas like autonomous driving and augmented reality have
renewed the interest in image-based camera localization. In this work, we
address the task of predicting the 6D camera pose from a single RGB image in a
given 3D environment. With the advent of neural networks, previous works have
either learned the entire camera localization process, or multiple components
of a camera localization pipeline. Our key contribution is to demonstrate and
explain that learning a single component of this pipeline is sufficient. This
component is a fully convolutional neural network for densely regressing
so-called scene coordinates, defining the correspondence between the input
image and the 3D scene space. The neural network is prepended to a new
end-to-end trainable pipeline. Our system is efficient, highly accurate, robust
in training, and exhibits outstanding generalization capabilities. It consistently
exceeds the state of the art on indoor and outdoor datasets. Interestingly,
our approach surpasses existing techniques even without utilizing a 3D model of
the scene during training, since the network is able to discover 3D scene
geometry automatically, solely from single-view constraints.
Comment: CVPR 2018
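The abstract does not include code, but the core idea it describes, a dense scene-coordinate map tying image pixels to 3D scene points from which the 6D pose is recovered, can be sketched roughly. The snippet below is a minimal illustration, not the paper's end-to-end trainable pipeline: it assumes a hypothetical network has already produced an (H, W, 3) scene-coordinate map and recovers the pose with OpenCV's PnP-RANSAC; the helper name, sampling stride, and RANSAC settings are arbitrary choices for illustration.

```python
# Minimal sketch: 6D pose from densely regressed scene coordinates via
# PnP + RANSAC. Not the paper's differentiable pipeline.
import numpy as np
import cv2

def pose_from_scene_coordinates(scene_coords, K, stride=8):
    """Recover a camera pose from a dense scene-coordinate map.

    scene_coords: (H, W, 3) array of predicted 3D scene points per pixel.
    K: (3, 3) camera intrinsic matrix.
    Returns a 4x4 scene-to-camera transform.
    """
    H, W, _ = scene_coords.shape
    ys, xs = np.mgrid[0:H:stride, 0:W:stride]
    img_pts = np.stack([xs, ys], axis=-1).reshape(-1, 2).astype(np.float64)
    obj_pts = scene_coords[ys, xs].reshape(-1, 3).astype(np.float64)

    # Robust PnP: RANSAC tolerates wrong scene-coordinate predictions.
    ok, rvec, tvec, _ = cv2.solvePnPRansac(
        obj_pts, img_pts, K, None,
        iterationsCount=256, reprojectionError=8.0)
    if not ok:
        raise RuntimeError("PnP-RANSAC did not find a pose")

    R, _ = cv2.Rodrigues(rvec)
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = tvec.ravel()
    return T
```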
Learning an Interactive Segmentation System
Many successful applications of computer vision to image or video
manipulation are interactive by nature. However, parameters of such systems are
often trained neglecting the user. Traditionally, interactive systems have been
treated in the same manner as their fully automatic counterparts. Their
performance is evaluated by computing the accuracy of their solutions under
some fixed set of user interactions. This paper proposes a new evaluation and
learning method which brings the user in the loop. It is based on the use of an
active robot user - a simulated model of a human user. We show how this
approach can be used to evaluate and learn parameters of state-of-the-art
interactive segmentation systems. We also show how simulated user models can be
integrated into the popular max-margin method for parameter learning and
propose an algorithm to solve the resulting optimisation problem.
Comment: 11 pages, 7 figures, 4 tables
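As a rough illustration of the robot-user idea, and not the paper's actual user model or max-margin learning procedure, the sketch below shows an evaluation loop in which a simulated user repeatedly places a corrective click in the largest remaining error region and the system re-segments. The `segment` and `largest_error_component` callables, the IoU target, and the click budget are hypothetical placeholders.

```python
# Illustrative sketch of an "active robot user" evaluation loop: a simulated
# user corrects the current segmentation until it is good enough, and the
# number of interactions needed becomes the evaluation score.
import numpy as np

def iou(pred, gt):
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return inter / union if union else 1.0

def robot_user_evaluation(image, gt_mask, segment, largest_error_component,
                          target_iou=0.95, max_clicks=20):
    """Count how many simulated clicks a system needs to reach target_iou.

    segment(image, clicks) -> boolean foreground mask
    largest_error_component(pred, gt) -> (y, x, is_foreground) inside the
        biggest connected region where pred and gt disagree.
    """
    clicks = []
    pred = segment(image, clicks)
    for n in range(max_clicks):
        if iou(pred, gt_mask) >= target_iou:
            return n                      # interactions needed so far
        y, x, is_fg = largest_error_component(pred, gt_mask)
        clicks.append((y, x, is_fg))      # robot user places a corrective click
        pred = segment(image, clicks)     # system updates its segmentation
    return max_clicks
```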
Benchmarking the Robustness of Semantic Segmentation Models
When designing a semantic segmentation module for a practical application,
such as autonomous driving, it is crucial to understand the robustness of the
module with respect to a wide range of image corruptions. While there are
recent robustness studies for full-image classification, we are the first to
present an exhaustive study for semantic segmentation, based on the
state-of-the-art model DeepLabv3+. To increase the realism of our study, we
utilize almost 400,000 images generated from Cityscapes, PASCAL VOC 2012, and
ADE20K. Based on the benchmark study, we gain several new insights. Firstly,
contrary to full-image classification, model robustness increases with model
performance, in most cases. Secondly, some architecture properties affect
robustness significantly, such as a Dense Prediction Cell, which was designed
to maximize performance on clean data only.
Comment: CVPR 2020 camera ready
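The benchmarking procedure itself is not spelled out in the abstract; the sketch below only illustrates the general recipe of evaluating mean IoU over corrupted copies of a validation set, one score per (corruption type, severity) pair. The `model` and `corrupt` callables, the dataset interface, the severity range, and the 19-class default are assumptions made for illustration.

```python
# Sketch of a corruption-robustness benchmark for semantic segmentation:
# compute mean IoU for every (corruption, severity) pair so it can be
# compared against clean performance.
import numpy as np

def confusion(pred, gt, num_classes):
    mask = gt < num_classes                       # drop ignore labels
    idx = num_classes * gt[mask].astype(int) + pred[mask].astype(int)
    return np.bincount(idx, minlength=num_classes ** 2).reshape(num_classes, num_classes)

def mean_iou(conf):
    tp = np.diag(conf)
    denom = conf.sum(0) + conf.sum(1) - tp
    return np.nanmean(tp / np.maximum(denom, 1))

def benchmark(model, dataset, corrupt, corruptions,
              severities=range(1, 6), num_classes=19):
    results = {}
    for name in corruptions:
        for sev in severities:
            conf = np.zeros((num_classes, num_classes), dtype=np.int64)
            for image, gt in dataset:
                pred = model(corrupt(image, name, sev))   # corrupted input
                conf += confusion(pred, gt, num_classes)
            results[(name, sev)] = mean_iou(conf)
    return results
```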
Panoptic Segmentation
We propose and study a task we name panoptic segmentation (PS). Panoptic
segmentation unifies the typically distinct tasks of semantic segmentation
(assign a class label to each pixel) and instance segmentation (detect and
segment each object instance). The proposed task requires generating a coherent
scene segmentation that is rich and complete, an important step toward
real-world vision systems. While early work in computer vision addressed
related image/scene parsing tasks, these are not currently popular, possibly
due to lack of appropriate metrics or associated recognition challenges. To
address this, we propose a novel panoptic quality (PQ) metric that captures
performance for all classes (stuff and things) in an interpretable and unified
manner. Using the proposed metric, we perform a rigorous study of both human
and machine performance for PS on three existing datasets, revealing
interesting insights about the task. The aim of our work is to revive the
interest of the community in a more unified view of image segmentation.
Comment: accepted to CVPR 2019
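For reference, the published panoptic quality metric matches predicted and ground-truth segments when their IoU exceeds 0.5 and divides the summed IoU of the matches by TP + 0.5 FP + 0.5 FN. The sketch below is a simplified, class-agnostic version of that computation; per-class averaging and void-region handling are omitted, and segments are represented as plain pixel-index sets for brevity.

```python
# Simplified sketch of panoptic quality (PQ): segments match at IoU > 0.5,
# and PQ = sum of matched IoUs / (TP + 0.5*FP + 0.5*FN).
# Per-class averaging and void/ignore handling are omitted.

def iou(a, b):
    inter = len(a & b)
    union = len(a | b)
    return inter / union if union else 0.0

def panoptic_quality(pred_segments, gt_segments):
    """pred_segments, gt_segments: lists of sets of pixel indices."""
    matched_ious = []
    matched_gt = set()
    for p in pred_segments:
        for gi, g in enumerate(gt_segments):
            # IoU > 0.5 guarantees each prediction matches at most one GT segment.
            if gi not in matched_gt and iou(p, g) > 0.5:
                matched_ious.append(iou(p, g))
                matched_gt.add(gi)
                break
    tp = len(matched_ious)
    fp = len(pred_segments) - tp
    fn = len(gt_segments) - tp
    denom = tp + 0.5 * fp + 0.5 * fn
    return sum(matched_ious) / denom if denom else 0.0
```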