Doctor of Philosophy
dissertation
3D reconstruction from image pairs relies on finding corresponding points between images and using the corresponding points to estimate a dense disparity map. Today's correspondence-finding algorithms primarily use image features or pixel intensities common between image pairs. Some 3D computer vision applications, however, don't produce the desired results using correspondences derived from image features or pixel intensities. Two examples are the multimodal camera rig and the center region of a coaxial camera rig. Additionally, traditional stereo correspondence-finding techniques that use image features or pixel intensities sometimes produce inaccurate results. This thesis presents a novel image correspondence-finding technique that aligns pairs of image sequences using the optical flow fields. The optical flow fields provide information about the structure and motion of the scene which is not available in still images, but which can be used to align images taken from different camera positions. The method applies to applications where there is inherent motion between the camera rig and the scene and where the scene has enough visual texture to produce optical flow. We apply the technique to a traditional binocular stereo rig consisting of an RGB/IR camera pair and to a coaxial camera rig. We present results for synthetic flow fields and for real image sequences with accuracy metrics and reconstructed depth maps.
Temporally coherent 4D reconstruction of complex dynamic scenes
This paper presents an approach for reconstruction of 4D temporally coherent
models of complex dynamic scenes. No prior knowledge is required of scene
structure or camera calibration allowing reconstruction from multiple moving
cameras. Sparse-to-dense temporal correspondence is integrated with joint
multi-view segmentation and reconstruction to obtain a complete 4D
representation of static and dynamic objects. Temporal coherence is exploited
to overcome visual ambiguities resulting in improved reconstruction of complex
scenes. Robust joint segmentation and reconstruction of dynamic objects is
achieved by introducing a geodesic star convexity constraint. Comparative
evaluation is performed on a variety of unstructured indoor and outdoor dynamic
scenes with hand-held cameras and multiple people. This demonstrates
reconstruction of complete temporally coherent 4D scene models with improved
nonrigid object segmentation and shape reconstruction.
Comment: To appear in The IEEE Conference on Computer Vision and Pattern
Recognition (CVPR) 2016. Video available at:
https://www.youtube.com/watch?v=bm_P13_-Ds
3D Dynamic Scene Reconstruction from Multi-View Image Sequences
A confirmation report outlining my PhD research plan is presented. The PhD research topic is 3D dynamic scene reconstruction from multi-view image sequences. Chapter 1 describes the motivation and research aims. An overview of the progress in the past year is included. Chapter 2 is a review of volumetric scene reconstruction techniques and Chapter 3 is an in-depth description of my proposed reconstruction method. The theory behind the proposed volumetric scene reconstruction method is also presented, including topics in projective geometry, camera calibration and energy minimization. Chapter 4 presents the research plan and outlines the future work planned for the next two years.
Highlighting objects of interest in an image by integrating saliency and depth
Stereo images have been captured primarily for 3D reconstruction in the past.
However, the depth information acquired from stereo can also be used along with
saliency to highlight certain objects in a scene. This approach can be used to
make still images more interesting to look at, and highlight objects of
interest in the scene. We introduce this novel direction in this paper, and
discuss the theoretical framework behind the approach. Even though we use depth
from stereo in this work, our approach is applicable to depth data acquired
from any sensor modality. Experimental results on both indoor and outdoor
scenes demonstrate the benefits of our algorithm.
Multiview stereo via volumetric graph-cuts and occlusion robust photo-consistency
This paper presents a volumetric formulation for the multiview stereo problem which is amenable to a computationally tractable global optimization using graph-cuts. Our approach is to seek the optimal partitioning of 3D space into two regions labeled as "object" and "empty" under a cost functional consisting of the following two terms: 1) a term that forces the boundary between the two regions to pass through photo-consistent locations; and 2) a ballooning term that inflates the "object" region. To take account of the effect of occlusion on the first term, we use an occlusion robust photo-consistency metric based on normalized cross correlation, which does not assume any geometric knowledge about the reconstructed object. The globally optimal 3D partitioning can be obtained as the minimum cut solution of a weighted graph.
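The two-term cost described in this abstract can be illustrated with a toy example. The sketch below is not the paper's implementation: it assumes a 1D row of seven voxels instead of a 3D volume, hand-picked integer edge costs standing in for the photo-consistency term, a constant per-voxel penalty for the "empty" label standing in for the ballooning term, and hard seeds for one "object" voxel and two "empty" voxels. The function name `min_cut_labels` and all parameter names are hypothetical. The minimum cut is found with a plain breadth-first augmenting-path max-flow.

```python
from collections import deque

INF = float("inf")

def min_cut_labels(n, source_caps, sink_caps, neighbor_caps):
    """Return (cut_cost, labels); labels[i] is True if voxel i lies on the
    source ("object") side of a minimum cut. All names are illustrative."""
    s, t = n, n + 1
    cap = [dict() for _ in range(n + 2)]  # residual capacities

    def add_edge(u, v, c_uv, c_vu=0):
        cap[u][v] = cap[u].get(v, 0) + c_uv
        cap[v][u] = cap[v].get(u, 0) + c_vu

    for i in range(n):
        if source_caps[i]:
            add_edge(s, i, source_caps[i])  # paid if voxel i is labeled "empty"
        if sink_caps[i]:
            add_edge(i, t, sink_caps[i])    # paid if voxel i is labeled "object"
    for (i, j), c in neighbor_caps.items():
        add_edge(i, j, c, c)                # paid if i and j get different labels

    flow = 0
    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            break  # no path left: the flow is maximal
        # Find the bottleneck capacity along the path, then push flow.
        bottleneck, v = INF, t
        while parent[v] is not None:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = t
        while parent[v] is not None:
            u = parent[v]
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
            v = u
        flow += bottleneck

    # Voxels still reachable from the source form the "object" region.
    labels = [i in parent for i in range(n)]
    return flow, labels

# Toy problem (all numbers are assumptions chosen for illustration).
n = 7
ballooning = 1                               # cost of labeling a free voxel "empty"
source_caps = [0, ballooning, ballooning, INF, ballooning, ballooning, 0]
sink_caps = [INF, 0, 0, 0, 0, 0, INF]        # voxels 0 and 6 are "empty" seeds
# Low cost = photo-consistent location, where the boundary is cheap to place.
neighbor_caps = {(0, 1): 10, (1, 2): 2, (2, 3): 10,
                 (3, 4): 10, (4, 5): 10, (5, 6): 2}

flow, labels = min_cut_labels(n, source_caps, sink_caps, neighbor_caps)
print(flow, labels)
```

With these toy costs, the boundary settles on the two cheap edges, between voxels 1 and 2 and between voxels 5 and 6. The ballooning penalty is what keeps the "object" region from collapsing onto the single seed voxel.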
Online Mutual Foreground Segmentation for Multispectral Stereo Videos
The segmentation of video sequences into foreground and background regions is
a low-level process commonly used in video content analysis and smart
surveillance applications. Using a multispectral camera setup can improve this
process by providing more diverse data to help identify objects despite adverse
imaging conditions. The registration of several data sources is, however, not
trivial if the appearance of objects produced by each sensor differs
substantially. This problem is further complicated when close-range stereo
pairs are used and parallax effects cannot be ignored. In this work, we present a new
method to simultaneously tackle multispectral segmentation and stereo
registration. Using an iterative procedure, we estimate the labeling result for
one problem using the provisional result of the other. Our approach is based on
the alternating minimization of two energy functions that are linked through
the use of dynamic priors. We rely on the integration of shape and appearance
cues to find proper multispectral correspondences, and to properly segment
objects in low contrast regions. We also formulate our model as a frame
processing pipeline using higher order terms to improve the temporal coherence
of our results. Our method is evaluated under different configurations on
multiple multispectral datasets, and our implementation is available online.
Comment: Preprint accepted for publication in IJCV (December 2018)