Search CORE

Southampton (e-Prints Soton)

Carolina Digital Repository

Modified belief propagation for reconstruction of office environments

Author: Larsen E. Scott
Publication venue
Publication date: 01/12/2008
Field of study

Belief Propagation (BP) is an algorithm that has found broad application in many areas of computer science. The range of these areas includes Error Correcting Codes, Kalman filters, particle filters, and -- most relevantly -- stereo computer vision. Many of the currently best algorithms for stereo vision benchmarks, e.g. the Middlebury dataset, use Belief Propagation. This dissertation describes improvements to the core algorithm to improve its applicability and usefulness for computer vision applications. A Belief Propagation solution to a computer vision problem is commonly based on specification of a Markov Random Field that it optimizes. Both Markov Random Fields and Belief Propagation have at their core some definition of nodes and neighborhoods' for each node. Each node has a subset of the other nodes defined to be its neighborhood. In common usages for stereo computer vision, the neighborhoods are defined as a pixel's immediate four spatial neighbors. For any given node, this neighborhood definition may or may not be correct for the specific scene. In a setting with video cameras, I expand the neighborhood definition to include corresponding nodes in temporal neighborhoods in addition to spatial neighborhoods. This amplifies the problem of erroneous neighborhood assignments. Part of this dissertation addresses the erroneous neighborhood assignment problem. Often, no single algorithm is always the best. The Markov Random Field formulation appears amiable to integration of other algorithms: I explore that potential here by integrating priors from independent algorithms. This dissertation makes core improvements to BP such that it is more robust to erroneous neighborhood assignments, is more robust in regions with inputs that are near-uniform, and can be biased in a sensitive manner towards higher level priors. These core improvements are demonstrated by the presented results: application to office environments, real-world datasets, and benchmark datasets

Disparity Estimation in Stereo Sequences using Scene Flow

Author
Publication venue: 'British Machine Vision Association and Society for Pattern Recognition'
Publication date: 01/01/2009
Field of study

Space-Time Joint Multi-layer Segmentation and Depth Estimation

Author: Guillemaut J-Y
Hilton A
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2012
Field of study

Video-based segmentation and reconstruction tech-niques are predominantly extensions of techniques devel-oped for the image domain treating each frame indepen-dently. These approaches ignore the temporal information contained in input videos which can lead to incoherent re-sults. We propose a framework for joint segmentation and reconstruction which explicitly enforces temporal consis-tency by formulating the problem as an energy minimisation generalised to groups of frames. The main idea is to use op-tical flow in combination with a confidence measure to im-pose robust temporal smoothness constraints. Optimisation is performed using recent advances in the field of graph-cuts combined with practical considerations to reduce run-time and memory consumption. Experimental results with real sequences containing rapid motion demonstrate that the method is able to improve spatio-temporal coherence both in terms of segmentation and reconstruction without introducing any degradation in regions where optical flow fails due to fast motion. 1

CiteSeerX

arXiv.org e-Print Archive

Temporally Coherent General Dynamic Scene Reconstruction

Author: Guillemaut Jean-Yves
Hilton Adrian
Kim Hansung
Mustafa Armin
Volino Marco
Publication venue
Publication date: 03/08/2020
Field of study

Existing techniques for dynamic scene reconstruction from multiple wide-baseline cameras primarily focus on reconstruction in controlled environments, with fixed calibrated cameras and strong prior constraints. This paper introduces a general approach to obtain a 4D representation of complex dynamic scenes from multi-view wide-baseline static or moving cameras without prior knowledge of the scene structure, appearance, or illumination. Contributions of the work are: An automatic method for initial coarse reconstruction to initialize joint estimation; Sparse-to-dense temporal correspondence integrated with joint multi-view segmentation and reconstruction to introduce temporal coherence; and a general robust approach for joint segmentation refinement and dense reconstruction of dynamic scenes by introducing shape constraint. Comparison with state-of-the-art approaches on a variety of complex indoor and outdoor scenes, demonstrates improved accuracy in both multi-view segmentation and dense reconstruction. This paper demonstrates unsupervised reconstruction of complete temporally coherent 4D scene models with improved non-rigid object segmentation and shape reconstruction and its application to free-viewpoint rendering and virtual reality.Comment: Submitted to IJCV 2019. arXiv admin note: substantial text overlap with arXiv:1603.0338

arXiv.org e-Print Archive

U4D: Unsupervised 4D Dynamic Scene Understanding

Author: Hilton Adrian
Mustafa Armin
Russell Chris
Publication venue
Publication date: 22/07/2019
Field of study

We introduce the first approach to solve the challenging problem of unsupervised 4D visual scene understanding for complex dynamic scenes with multiple interacting people from multi-view video. Our approach simultaneously estimates a detailed model that includes a per-pixel semantically and temporally coherent reconstruction, together with instance-level segmentation exploiting photo-consistency, semantic and motion information. We further leverage recent advances in 3D pose estimation to constrain the joint semantic instance segmentation and 4D temporally coherent reconstruction. This enables per person semantic instance segmentation of multiple interacting people in complex dynamic scenes. Extensive evaluation of the joint visual scene understanding framework against state-of-the-art methods on challenging indoor and outdoor sequences demonstrates a significant (approx 40%) improvement in semantic segmentation, reconstruction and scene flow accuracy.Comment: To appear in IEEE International Conference in Computer Vision ICCV 201

Semantically Coherent Co-Segmentation and Reconstruction of Dynamic Scenes

Author: Hilton Adrian
Mustafa Armin
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/07/2017
Field of study

In this paper we propose a framework for spatially and temporally coherent semantic co-segmentation and reconstruction of complex dynamic scenes from multiple static or moving cameras. Semantic co-segmentation exploits the coherence in semantic class labels both spatially, between views at a single time instant, and temporally, between widely spaced time instants of dynamic objects with similar shape and appearance. We demonstrate that semantic coherence results in improved segmentation and reconstruction for complex scenes. A joint formulation is proposed for semantically coherent object-based co-segmentation and reconstruction of scenes by enforcing consistent semantic labelling between views and over time. Semantic tracklets are introduced to enforce temporal coherence in semantic labelling and reconstruction between widely spaced instances of dynamic objects. Tracklets of dynamic objects enable unsupervised learning of appearance and shape priors that are exploited in joint segmentation and reconstruction. Evaluation on challenging indoor and outdoor sequences with hand-held moving cameras shows improved accuracy in segmentation, temporally coherent semantic labelling and 3D reconstruction of dynamic scenes