3,322 research outputs found

Map-Guided Curriculum Domain Adaptation and Uncertainty-Aware Evaluation for Semantic Nighttime Image Segmentation

Authors: Dai Dengxin, Sakaridis Christos, Van Gool Luc
Publication date: 01/01/2021

We address the problem of semantic nighttime image segmentation and improve the state of the art by adapting daytime models to nighttime without using nighttime annotations. Moreover, we design a new evaluation framework to address the substantial uncertainty of semantics in nighttime images. Our central contributions are: 1) a curriculum framework to gradually adapt semantic segmentation models from day to night through progressively darker times of day, exploiting cross-time-of-day correspondences between daytime images from a reference map and dark images to guide the label inference in the dark domains; 2) a novel uncertainty-aware annotation and evaluation framework and metric for semantic segmentation, which includes image regions beyond human recognition capability in the evaluation in a principled fashion; 3) the Dark Zurich dataset, comprising 2416 unlabeled nighttime and 2920 unlabeled twilight images with correspondences to their daytime counterparts, plus a set of 201 nighttime images with fine pixel-level annotations created with our protocol, which serves as a first benchmark for our novel evaluation. Experiments show that our map-guided curriculum adaptation significantly outperforms state-of-the-art methods on nighttime sets, both for standard metrics and for our uncertainty-aware metric. Furthermore, our uncertainty-aware evaluation reveals that selective invalidation of predictions can improve results on data with ambiguous content, such as our benchmark, and can benefit safety-oriented applications involving invalid inputs. Comment: IEEE T-PAMI 202

arXiv.org e-Print Archive

Repository for Publications and Research Data

Multiscale Representations for Manifold-Valued Data

Authors: Bär Christian, David L. Donoho, Helgason Sigurdur, Iddo Drori, Inam Ur Rahman, Lang Serge, Peter Schröder, Victoria C. Stodden
Publication date: 01/01/2005

We describe multiscale representations for data observed on equispaced grids and taking values in manifolds such as the sphere S^2, the special orthogonal group SO(3), the positive definite matrices SPD(n), and the Grassmann manifolds G(n,k). The representations are based on the deployment of Deslauriers-Dubuc and average-interpolating pyramids "in the tangent plane" of such manifolds, using the Exp and Log maps of those manifolds. The representations provide "wavelet coefficients" which can be thresholded, quantized, and scaled in much the same way as traditional wavelet coefficients. Tasks such as compression, noise removal, contrast enhancement, and stochastic simulation are facilitated by this representation. The approach applies to general manifolds but is particularly suited to the manifolds we consider, i.e., Riemannian symmetric spaces such as S^{n-1}, SO(n), and G(n,k), where the Exp and Log maps are effectively computable. Applications to manifold-valued data sources of a geometric nature (motion, orientation, diffusion) seem particularly immediate. A software toolbox, SymmLab, can reproduce the results discussed in this paper.
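
As a rough illustration of what working "in the tangent plane" can look like, the sketch below implements the Exp and Log maps on the unit sphere S^2 and uses them to compute one detail coefficient. It is not taken from SymmLab: the function names are hypothetical, and the predictor is a plain two-point geodesic midpoint standing in for the Deslauriers-Dubuc scheme named in the abstract.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def sphere_exp(p, v):
    """Exp map on the unit sphere: walk from p along tangent vector v."""
    t = norm(v)
    if t < 1e-12:
        return tuple(p)
    return tuple(math.cos(t) * a + math.sin(t) * b / t for a, b in zip(p, v))

def sphere_log(p, q):
    """Log map on the unit sphere: tangent vector at p that reaches q.

    Assumes q is not antipodal to p (the map is not unique there).
    """
    c = max(-1.0, min(1.0, dot(p, q)))  # clamp against rounding error
    w = tuple(b - c * a for a, b in zip(p, q))  # part of q orthogonal to p
    n = norm(w)
    if n < 1e-12:
        return (0.0, 0.0, 0.0)
    theta = math.acos(c)  # geodesic distance between p and q
    return tuple(theta * x / n for x in w)

def midpoint_detail(left, right, actual):
    """Predict a fine-scale sample as the geodesic midpoint of two coarse
    neighbours (an assumed stand-in predictor), then express the actual
    sample as a tangent-vector detail coefficient at the prediction."""
    half = tuple(0.5 * x for x in sphere_log(left, right))
    predicted = sphere_exp(left, half)
    return predicted, sphere_log(predicted, actual)
```

The detail coefficient is an ordinary vector in the tangent plane at the prediction, so it can be thresholded or quantized like a scalar wavelet coefficient; applying `sphere_exp(predicted, detail)` recovers the sample.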

CiteSeerX

Crossref

Columbia University Academic Commons

Caltech Authors

Unsupervised learning of object landmarks by factorized spatial embeddings

Authors: Bilen Hakan, Thewlis James, Vedaldi Andrea
Publication date: 01/01/2017

Automatically learning the structure of object categories remains an important open problem in computer vision. In this paper, we propose a novel unsupervised approach that can discover and learn landmarks in object categories, thus characterizing their structure. Our approach is based on factorizing image deformations, as induced by a viewpoint change or an object deformation, by learning a deep neural network that detects landmarks consistently with such visual effects. Furthermore, we show that the learned landmarks establish meaningful correspondences between different object instances in a category without having to impose this requirement explicitly. We assess the method qualitatively on a variety of object types, natural and man-made. We also show that our unsupervised landmarks are highly predictive of manually annotated landmarks in face benchmark datasets, and can be used to regress these with a high degree of accuracy. Comment: To be published in ICCV 201

arXiv.org e-Print Archive

Oxford University Research Archive

Implicit 3D Orientation Learning for 6D Object Detection from RGB Images

Authors: BT Phong, P Vincent, S Hinterstoisser, SR Richter, T Hodaň, T-Y Lin, W Kehl, W Liu, Y Movshovitz-Attias, Z Zhang
Publication date: 10/09/2018

We propose a real-time RGB-based pipeline for object detection and 6D pose estimation. Our novel 3D orientation estimation is based on a variant of the Denoising Autoencoder that is trained on simulated views of a 3D model using Domain Randomization. This so-called Augmented Autoencoder has several advantages over existing methods: it does not require real, pose-annotated training data, generalizes to various test sensors, and inherently handles object and view symmetries. Instead of learning an explicit mapping from input images to object poses, it provides an implicit representation of object orientations defined by samples in a latent space. Our pipeline achieves state-of-the-art performance on the T-LESS dataset in both the RGB and RGB-D domains. We also evaluate on the LineMOD dataset, where we can compete with other synthetically trained approaches. We further increase performance by correcting 3D orientation estimates to account for perspective errors when the object deviates from the image center, and we show extended results. Comment: Code available at: https://github.com/DLR-RM/AugmentedAutoencode

arXiv.org e-Print Archive

Institute of Transport Research:Publications

Crossref

Joint Viewpoint and Keypoint Estimation with Real and Synthetic Data

Authors: P Felzenszwalb, P Panareda Busto, RG Keys
Publication venue: Springer Science and Business Media LLC
Publication date: 12/12/2019

The estimation of viewpoints and keypoints effectively enhances object detection methods by extracting valuable traits of the object instances. While the outputs of the two processes differ, i.e., angles vs. a list of characteristic points, they share the same focus on how the object is placed in the scene, which implies a certain level of correlation between them. Therefore, we propose a convolutional neural network that jointly computes the viewpoint and keypoints for different object categories. By training both tasks together, each task improves the accuracy of the other. Since the labelling of object keypoints is very time-consuming for human annotators, we also introduce a new synthetic dataset with automatically generated viewpoint and keypoint annotations. Our proposed network can also be trained on datasets that contain both viewpoint and keypoint annotations or only one of them. The experiments show that the proposed approach successfully exploits this implicit correlation between the tasks and outperforms previous techniques that are trained independently. Comment: 11 pages, 4 figure

arXiv.org e-Print Archive

Crossref