Learning Single-Image Depth from Videos using Quality Assessment Networks
Depth estimation from a single image in the wild remains a challenging
problem. One main obstacle is the lack of high-quality training data for images
in the wild. In this paper we propose a method to automatically generate such
data through Structure-from-Motion (SfM) on Internet videos. The core of this
method is a Quality Assessment Network that identifies high-quality
reconstructions obtained from SfM. Using this method, we collect single-view
depth training data from a large number of YouTube videos and construct a new
dataset called YouTube3D. Experiments show that YouTube3D is useful in training
depth estimation networks and advances the state of the art of single-view
depth estimation in the wild.
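As a rough illustration (not the paper's actual architecture), a quality assessment step could be a small network that maps per-reconstruction statistics to a score used to filter the SfM output before it becomes training data. Every feature, layer size, and threshold below is an assumption:

    import torch
    import torch.nn as nn

    class QualityAssessmentNet(nn.Module):
        # Maps per-reconstruction statistics (hypothetical features such as
        # reprojection-error summaries or track lengths) to a score in [0, 1].
        def __init__(self, num_features=16):
            super().__init__()
            self.mlp = nn.Sequential(
                nn.Linear(num_features, 64), nn.ReLU(),
                nn.Linear(64, 64), nn.ReLU(),
                nn.Linear(64, 1), nn.Sigmoid(),
            )

        def forward(self, feats):
            return self.mlp(feats).squeeze(-1)

    net = QualityAssessmentNet()
    feats = torch.randn(8, 16)      # 8 candidate SfM reconstructions
    keep = net(feats) > 0.5         # retain only high-quality reconstructions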
Face Normals "in-the- wild" using Fully Convolutional Networks
In this work we pursue a data-driven approach to the problem of estimating surface normals from a single intensity image, focusing in particular on human faces. We introduce new methods to exploit the currently available facial databases for dataset construction and tailor a deep convolutional neural network to the task of estimating facial surface normals in-the-wild. We train a fully convolutional network that can accurately recover facial normals from images spanning a challenging variety of expressions and facial poses. We compare against state-of-the-art face Shape-from-Shading and 3D reconstruction techniques and show that the proposed network recovers substantially more accurate and realistic normals. Furthermore, in contrast to other existing face-specific surface recovery methods, we do not require an explicit alignment step, owing to the fully convolutional nature of our network.
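A minimal sketch of what such a fully convolutional normal estimator might look like in PyTorch; the layer sizes are assumptions, not the paper's architecture. The key property is that the output is a per-pixel unit normal at the input resolution, which is what removes the need for an explicit alignment step:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class NormalFCN(nn.Module):
        def __init__(self):
            super().__init__()
            self.encoder = nn.Sequential(
                nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            )
            self.head = nn.Conv2d(64, 3, 3, padding=1)

        def forward(self, img):
            h, w = img.shape[-2:]
            n = self.head(self.encoder(img))
            n = F.interpolate(n, size=(h, w), mode="bilinear",
                              align_corners=False)
            return F.normalize(n, dim=1)    # per-pixel unit normals

    normals = NormalFCN()(torch.randn(1, 3, 128, 128))  # (1, 3, 128, 128)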
Surface Normal Estimation of Tilted Images via Spatial Rectifier
In this paper, we present a spatial rectifier to estimate surface normals of
tilted images. Tilted images are of particular interest as more visual data are
captured by arbitrarily oriented sensors such as body-/robot-mounted cameras.
Existing approaches exhibit limited performance when predicting surface normals
because they were trained on gravity-aligned images. Our two main hypotheses
are: (1) visual scene layout is indicative of the gravity direction; and (2)
not all surfaces are equally well represented by a learned estimator due to the
structured distribution of the training data; thus, for each tilted image there
exists a transformation under which the image is more responsive to the learned
estimator than under others. We design a spatial rectifier that is learned to
transform the surface normal distribution of a tilted image to the rectified
one that matches the gravity-aligned training data distribution. Along with the
spatial rectifier, we propose a novel truncated angular loss that offers a
stronger gradient at smaller angular errors and robustness to outliers. The
resulting estimator outperforms the state-of-the-art methods including data
augmentation baselines not only on ScanNet and NYUv2 but also on a new dataset
called Tilt-RGBD that includes considerable roll and pitch camera motion.
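One plausible reading of such a truncated angular loss, sketched in PyTorch; the exact truncation rule in the paper may differ. The arccos term has a steep gradient as the angular error approaches zero, and clamping the angle caps the penalty for outliers:

    import torch

    def truncated_angular_loss(pred, gt, max_angle=2.0):
        # pred, gt: (N, 3) unit normal vectors; max_angle in radians (assumed).
        cos = (pred * gt).sum(dim=-1).clamp(-1 + 1e-6, 1 - 1e-6)
        angle = torch.acos(cos)           # gradient grows as the error shrinks
        return angle.clamp(max=max_angle).mean()    # truncation caps outliers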
Single-Image Depth Prediction Makes Feature Matching Easier
Good local features improve the robustness of many 3D re-localization and
multi-view reconstruction pipelines. The problem is that viewing angle and
distance severely impact the recognizability of a local feature. Attempts to
improve appearance invariance by choosing better local feature points or by
leveraging outside information have come with pre-requisites that made some of
them impractical. In this paper, we propose a surprisingly effective
enhancement to local feature extraction, which improves matching. We show that
CNN-based depths inferred from single RGB images are quite helpful, despite
their flaws. They allow us to pre-warp images and rectify perspective
distortions, to significantly enhance SIFT and BRISK features, enabling more
good matches, even when cameras are looking at the same scene but in opposite
directions.
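As a hedged sketch of the general idea (not the authors' pipeline), one could approximate a dominant scene plane from the predicted depth, warp the image toward a fronto-parallel view with a rotation homography, and then extract SIFT features on the rectified image. The crude plane fit and the known intrinsics K are assumptions:

    import cv2
    import numpy as np

    def rectify_then_sift(img, depth, K):
        # img: 8-bit grayscale image; depth: (H, W) predicted depth map;
        # K: 3x3 camera intrinsics (assumed known).
        h, w = depth.shape
        # Very rough plane normal from mean depth gradients (assumes one
        # dominant plane; real pipelines handle surfaces per region).
        gy, gx = np.gradient(depth)
        n = np.array([-gx.mean(), -gy.mean(), 1.0])
        n /= np.linalg.norm(n)
        # Rodrigues-style rotation aligning the plane normal with the
        # optical axis, so the plane becomes fronto-parallel.
        z = np.array([0.0, 0.0, 1.0])
        v, c = np.cross(n, z), float(n @ z)
        vx = np.array([[0, -v[2], v[1]],
                       [v[2], 0, -v[0]],
                       [-v[1], v[0], 0]])
        R = np.eye(3) + vx + vx @ vx / (1.0 + c)
        H = K @ R @ np.linalg.inv(K)    # homography of a pure camera rotation
        warped = cv2.warpPerspective(img, H, (w, h))
        return cv2.SIFT_create().detectAndCompute(warped, None)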