Veritatem Dies Aperit - Temporally Consistent Depth Prediction Enabled by a Multi-Task Geometric and Semantic Scene Understanding Approach

Authors: Atapour-Abarghouei A.; Breckon T.P.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/06/2019

Robust geometric and semantic scene understanding is ever more important in many real-world applications such as autonomous driving and robotic navigation. In this paper, we propose a multi-task learning-based approach capable of jointly performing geometric and semantic scene understanding, namely depth prediction (monocular depth estimation and depth completion) and semantic scene segmentation. Within a single temporally constrained recurrent network, our approach uniquely takes advantage of a complex series of skip connections, adversarial training and the temporal constraint of sequential frame recurrence to produce consistent depth and semantic class labels simultaneously. Extensive experimental evaluation demonstrates the efficacy of our approach compared to other contemporary state-of-the-art techniques. Comment: CVPR 201

arXiv.org e-Print Archive

Durham Research Online

Crossref

Temporally Coherent General Dynamic Scene Reconstruction

Authors: Guillemaut Jean-Yves; Hilton Adrian; Kim Hansung; Mustafa Armin; Volino Marco
Publication date: 03/08/2020

Existing techniques for dynamic scene reconstruction from multiple wide-baseline cameras primarily focus on reconstruction in controlled environments, with fixed calibrated cameras and strong prior constraints. This paper introduces a general approach to obtain a 4D representation of complex dynamic scenes from multi-view wide-baseline static or moving cameras without prior knowledge of the scene structure, appearance, or illumination. The contributions of the work are: an automatic method for initial coarse reconstruction to initialize joint estimation; sparse-to-dense temporal correspondence integrated with joint multi-view segmentation and reconstruction to introduce temporal coherence; and a general robust approach for joint segmentation refinement and dense reconstruction of dynamic scenes by introducing a shape constraint. Comparison with state-of-the-art approaches on a variety of complex indoor and outdoor scenes demonstrates improved accuracy in both multi-view segmentation and dense reconstruction. This paper demonstrates unsupervised reconstruction of complete temporally coherent 4D scene models with improved non-rigid object segmentation and shape reconstruction, and its application to free-viewpoint rendering and virtual reality. Comment: Submitted to IJCV 2019. arXiv admin note: substantial text overlap with arXiv:1603.0338

arXiv.org e-Print Archive

University of Surrey

Surrey Research Insight

Accurate dense depth from light field technology for object segmentation and 3D computer vision

Author: San Wai
Publication venue: 'University of Queensland Library'
Publication date: 06/07/2020

University of Queensland eSpace

Deep 3D Information Prediction and Understanding

Author: Zhao Shanshan
Publication venue: Faculty of Engineering, School of Computer Science
Publication date: 01/01/2022

3D information prediction and understanding play significant roles in 3D visual perception. For 3D information prediction, recent studies have demonstrated the superiority of deep neural networks. Despite the great success of deep learning, many challenging issues remain to be solved. One crucial issue is how to learn the deep model in an unsupervised learning framework. In this thesis, we take monocular depth estimation as an example and study this problem by exploring domain adaptation. Apart from prediction from a single image or multiple images, depth can also be estimated from multi-modal data, such as RGB image data coupled with 3D laser scan data. Since the 3D data is usually sparse and irregularly distributed, we need to model the contextual information from the sparse data and fuse the multi-modal features. We examine these issues by studying the depth completion task. For 3D information understanding, such as point cloud analysis, the sparsity and unordered nature of 3D point clouds mean that, instead of the conventional convolution, new operations that can model the local geometric shape are required. We design a basic operation for point cloud analysis by introducing a novel adaptive edge-to-edge interaction learning module. In addition, because of the diversity in configurations of 3D laser scanners, the captured 3D data often varies from dataset to dataset in object size, density, and viewpoint. As a result, domain generalization in 3D data analysis is also a critical problem. We study this issue in 3D shape classification by proposing an entropy regularization term. Through studying four specific tasks, this thesis focuses on several crucial issues in deep 3D information prediction and understanding, including model design, multi-modal fusion, sparse data analysis, unsupervised learning, domain adaptation, and domain generalization.

Sydney eScholarship