Learning monocular 3D reconstruction of articulated categories from motion

Authors: Kokkinos Filippos; Kokkinos Iasonas
Publication date: 30/03/2021

Monocular 3D reconstruction of articulated object categories is challenging due to the lack of training data and the inherent ill-posedness of the problem. In this work we use video self-supervision, enforcing the consistency of consecutive 3D reconstructions through a motion-based cycle loss. This largely improves both optimization-based and learning-based 3D mesh reconstruction. We further introduce an interpretable model of 3D template deformations that controls a 3D surface through the displacement of a small number of local, learnable handles. We formulate this operation as a structured layer relying on mesh-Laplacian regularization and show that it can be trained in an end-to-end manner. Finally, we introduce a per-sample numerical optimisation approach that jointly optimises over mesh displacements and cameras within a video, boosting accuracy both during training and as test-time post-processing. While relying exclusively on a small set of videos collected per category for supervision, we obtain state-of-the-art reconstructions with diverse shapes, viewpoints and textures for multiple articulated object categories.

Comment: For the project website, see https://fkokkinos.github.io/video_3d_reconstruction

arXiv.org e-Print Archive

UCL Discovery

ICNet for Real-Time Semantic Segmentation on High-Resolution Images

Authors: C Liu; E Shelhamer; F Xia; G Ghiasi; GJ Brostow; M Everingham; O Ronneberger; T-Y Lin; W Liu
Publication date: 19/08/2018

In this paper we focus on the challenging task of real-time semantic segmentation. It has many practical applications, yet it faces the fundamental difficulty of reducing the large amount of computation required for pixel-wise label inference. We propose an image cascade network (ICNet) that incorporates multi-resolution branches under proper label guidance to address this challenge. We provide an in-depth analysis of our framework and introduce the cascade feature fusion unit to quickly achieve high-quality segmentation. Our system yields real-time inference on a single GPU card, with decent-quality results on challenging datasets such as Cityscapes, CamVid and COCO-Stuff.

Comment: ECCV 201

arXiv.org e-Print Archive

Crossref