
    Using eSkel to Implement the Multiple Baseline Stereo Application

    We give an overview of the Edinburgh Skeleton Library eSkel, a structured parallel programming library which offers a range of skeletal parallel programming constructs to the C/MPI programmer. Then we illustrate the efficacy of such a high-level approach through an application of multiple baseline stereo. We describe the application and show different ways to introduce parallelism using algorithmic skeletons. Some performance results are also reported.
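
    As a loose illustration of the farm-style skeletal parallelism the abstract refers to (eSkel itself is a C/MPI library; this is not its API), a minimal mpi4py sketch might deal image columns out to ranks and gather partial depth results back. The worker function and data sizes here are placeholders.

    ```python
    # Hypothetical farm/deal-style sketch using mpi4py (not the eSkel C API):
    # rank 0 deals out image columns, every rank computes depth for its share,
    # and the partial results are gathered back.
    from mpi4py import MPI
    import numpy as np

    def depth_for_columns(cols, images):
        # Placeholder worker: pretend to compute a depth value per image column.
        return {int(c): float(np.mean([img[:, c] for img in images])) for c in cols}

    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()

    images = [np.random.rand(64, 64) for _ in range(4)] if rank == 0 else None
    images = comm.bcast(images, root=0)                    # share the image set
    chunks = np.array_split(np.arange(64), size) if rank == 0 else None
    my_cols = comm.scatter(chunks, root=0)                 # "deal" the columns out
    partial = depth_for_columns(my_cols, images)           # each rank works on its share
    gathered = comm.gather(partial, root=0)                # collect partial depth maps
    if rank == 0:
        depth = {c: d for part in gathered for c, d in part.items()}
    ```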

    3D sparse feature model using short baseline stereo and multiple view registration

    This paper outlines a methodology to generate a distinctive object representation offline, using short-baseline stereo fundamentals to triangulate highly descriptive object features in multiple pairs of stereo images. A group of sparse 2.5D perspective views is built and the multiple views are then fused into a single sparse 3D model using a common 3D shape registration technique. Having prior knowledge, such as the proposed sparse feature model, is useful when detecting an object and estimating its pose for real-time systems such as augmented reality.
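
    A minimal sketch of the two building blocks the abstract mentions, linear triangulation of a feature correspondence from a calibrated stereo pair and a rigid (Kabsch/ICP-style) alignment between two sparse point sets; the projection matrices and point sets are assumed inputs, not the paper's data.

    ```python
    import numpy as np

    def triangulate(P1, P2, x1, x2):
        """Linear (DLT) triangulation of one correspondence x1 <-> x2
        given 3x4 projection matrices P1, P2. Returns a 3D point."""
        A = np.vstack([x1[0] * P1[2] - P1[0],
                       x1[1] * P1[2] - P1[1],
                       x2[0] * P2[2] - P2[0],
                       x2[1] * P2[2] - P2[1]])
        _, _, Vt = np.linalg.svd(A)
        X = Vt[-1]
        return X[:3] / X[3]

    def rigid_align(src, dst):
        """Least-squares rotation/translation (Kabsch) that maps the Nx3
        point set src onto dst, as used inside ICP-style registration."""
        mu_s, mu_d = src.mean(0), dst.mean(0)
        U, _, Vt = np.linalg.svd((src - mu_s).T @ (dst - mu_d))
        R = Vt.T @ U.T
        if np.linalg.det(R) < 0:          # avoid a reflection
            Vt[-1] *= -1
            R = Vt.T @ U.T
        return R, mu_d - R @ mu_s
    ```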

    Wide-baseline Stereo from Multiple Views: a Probabilistic Account

    This paper describes a method for dense depth reconstruction from a small set of wide-baseline images. In a wide-baseline setting, an inherent difficulty which complicates the stereo-correspondence problem is self-occlusion. Also, we have to consider the possibility that image pixels in different images, which are projections of the same point in the scene, will have different color values due to non-Lambertian effects or discretization errors. We propose a Bayesian approach to tackle these problems. In this framework, the images are regarded as noisy measurements of an underlying 'true' image-function. Also, the image data is considered incomplete, in the sense that we do not know which pixels from a particular image are occluded in the other images. We describe an EM-algorithm, which iterates between estimating values for all hidden quantities and optimizing the current depth estimates. The algorithm has few free parameters, displays stable convergence behavior and generates accurate depth estimates. The approach is illustrated with several challenging real-world examples. We also show how the algorithm can generate realistic view interpolations and how it merges the information of all images into a new, synthetic view.
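
    To make the EM idea concrete, a minimal sketch might treat per-pixel visibility as the hidden quantity and model photometric residuals with a Gaussian-inlier / uniform-outlier mixture; the noise model and parameter values below are assumptions, not taken from the paper.

    ```python
    import numpy as np

    def em_visibility(residuals, sigma=0.05, n_iter=20):
        """residuals: per-pixel photometric differences between the reference image
        and another view warped with the current depth estimate (values in [0, 1]).
        Returns the posterior probability that each pixel is visible (not occluded)."""
        p_vis = 0.8                          # prior mixing weight of the visible component
        uniform = 1.0                        # occluded pixels: uniform density on [0, 1]
        for _ in range(n_iter):
            # E-step: responsibility of the Gaussian (visible) component per pixel
            gauss = np.exp(-0.5 * (residuals / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
            w = p_vis * gauss / (p_vis * gauss + (1 - p_vis) * uniform)
            # M-step: re-estimate mixing weight and noise level from soft assignments
            p_vis = w.mean()
            sigma = np.sqrt((w * residuals ** 2).sum() / w.sum()) + 1e-8
        return w
    ```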

    Segmentation Based Features for Wide-Baseline Multi-view Reconstruction

    A common problem in wide-baseline stereo is the sparse and non-uniform distribution of correspondences when using conventional detectors such as SIFT, SURF, FAST and MSER. In this paper we introduce a novel segmentation-based feature detector, SFD, that produces an increased number of ‘good’ features for accurate wide-baseline reconstruction. Each image is segmented into regions by over-segmentation and feature points are detected at the intersection of the boundaries of three or more regions. Segmentation-based feature detection locates features at local maxima, giving a relatively large number of feature points which are consistently detected across wide-baseline views and accurately localised. A comprehensive comparative performance evaluation with previous feature detection approaches demonstrates that: SFD produces a large number of features with increased scene coverage; detected features are consistent across wide-baseline views for images of a variety of indoor and outdoor scenes; and the number of wide-baseline matches is increased by an order of magnitude compared to alternative detector-descriptor combinations. Sparse scene reconstruction from multiple wide-baseline stereo views using the SFD feature detector demonstrates at least a factor-of-six increase in the number of reconstructed points, with a reduced error distribution compared to SIFT when evaluated against ground truth, at a computational cost similar to SURF/FAST.
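
    A rough sketch of the detection rule described above: over-segment the image and keep pixels whose local neighbourhood touches three or more regions. The choice of SLIC for over-segmentation and the 2x2 neighbourhood are assumptions, not necessarily the paper's implementation.

    ```python
    import numpy as np
    from skimage.segmentation import slic
    from skimage.util import img_as_float

    def sfd_like_features(image, n_segments=800):
        """image: RGB array. Returns (row, col) locations where three or more
        segment labels meet, i.e. junctions of region boundaries."""
        labels = slic(img_as_float(image), n_segments=n_segments, compactness=10)
        h, w = labels.shape
        points = []
        for r in range(h - 1):
            for c in range(w - 1):
                block = labels[r:r + 2, c:c + 2]       # 2x2 neighbourhood
                if len(np.unique(block)) >= 3:         # boundary junction of >= 3 regions
                    points.append((r, c))
        return np.array(points)
    ```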

    Multi-view passive 3D face acquisition device

    Approaches to acquisition of 3D facial data include laser scanners, structured light devices and (passive) stereo vision. The laser scanner and structured light methods allow accurate reconstruction of the 3D surface, but strong light is projected on the faces of subjects. Passive stereo vision based approaches do not require strong light to be projected; however, it is hard to obtain comparable accuracy and robustness of the surface reconstruction. In this paper a passive multiple view approach using 5 cameras in a '+' configuration is proposed that significantly increases robustness and accuracy relative to traditional stereo vision approaches. The normalised cross correlations of all 5 views are combined using direct projection of points instead of the traditionally used rectified images. Also, errors caused by different perspective deformation of the surface in the different views are reduced by using an iterative reconstruction technique, where the depth estimate from the previous iteration is used to warp the windows of the normalised cross correlation for the different views.
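
    A minimal sketch of the normalised cross-correlation score that is combined across the five views; the patch handling and the simple summation over views are simplifying assumptions rather than the device's actual direct-projection scheme.

    ```python
    import numpy as np

    def ncc(a, b, eps=1e-8):
        """Normalised cross-correlation of two equally sized image patches."""
        a = (a - a.mean()) / (a.std() + eps)
        b = (b - b.mean()) / (b.std() + eps)
        return float((a * b).mean())

    def multiview_score(ref_patch, other_patches):
        """Combine the NCC of the reference patch against each of the other
        (here: four) views by simple summation, as a stand-in for combining
        all five views of the '+' camera configuration."""
        return sum(ncc(ref_patch, p) for p in other_patches)
    ```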

    Learning multi-modal features for dense matching-based confidence estimation

    In recent years, the ability to assess the uncertainty of depth estimates in the context of dense stereo matching has received increased attention due to its potential to detect erroneous estimates. In particular, the introduction of deep learning approaches has greatly improved general performance, with feature extraction from multiple modalities proving to be highly advantageous due to the unique and different characteristics of each modality. However, most work in the literature focuses on mono-, bi-, or, more rarely, tri-modal input, without considering the potential of going beyond three modalities. To further advance the idea of combining different types of features for confidence estimation, in this work a CNN-based approach is proposed that exploits uncertainty cues from up to four modalities. For this purpose, a state-of-the-art local-global approach is used as the baseline and extended accordingly. Additionally, a novel disparity-based modality named warped difference is presented to support uncertainty estimation at common failure cases of dense stereo matching. The general validity and improved performance of the proposed approach are demonstrated and compared against the bi-modal baseline in an evaluation on three datasets using two common dense stereo matching techniques.
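
    A toy PyTorch sketch of the general idea of multi-modal confidence estimation: one small convolutional branch per modality, fused by concatenation into a per-pixel confidence map. The layer sizes, number of modalities, and fusion scheme are illustrative assumptions, not the architecture of the paper.

    ```python
    import torch
    import torch.nn as nn

    class MultiModalConfidence(nn.Module):
        """Toy confidence network: one conv branch per input modality
        (e.g. disparity, matching cost, reference image, warped difference),
        fused by concatenation into a per-pixel confidence in [0, 1]."""
        def __init__(self, n_modalities=4, width=16):
            super().__init__()
            self.branches = nn.ModuleList([
                nn.Sequential(nn.Conv2d(1, width, 3, padding=1), nn.ReLU(),
                              nn.Conv2d(width, width, 3, padding=1), nn.ReLU())
                for _ in range(n_modalities)
            ])
            self.head = nn.Sequential(
                nn.Conv2d(width * n_modalities, width, 3, padding=1), nn.ReLU(),
                nn.Conv2d(width, 1, 1), nn.Sigmoid())

        def forward(self, modalities):          # list of (B, 1, H, W) tensors
            feats = [b(m) for b, m in zip(self.branches, modalities)]
            return self.head(torch.cat(feats, dim=1))
    ```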

    WxBS: Wide Baseline Stereo Generalizations

    We have presented a new problem -- wide multiple baseline stereo (WxBS) -- which considers matching of images that simultaneously differ in more than one image acquisition factor, such as viewpoint, illumination or sensor type, or where the object appearance changes significantly, e.g. over time. A new dataset with ground truth for the evaluation of matching algorithms has been introduced and will be made public. We have extensively tested a large set of popular and recent detectors and descriptors and show that the combination of RootSIFT and HalfRootSIFT as descriptors with MSER and Hessian-Affine detectors works best for many different nuisance factors. We show that simple adaptive thresholding improves the Hessian-Affine, DoG, MSER (and possibly other) detectors and allows them to be used on infrared and low-contrast images. A novel matching algorithm for addressing the WxBS problem has been introduced. We have shown experimentally that the WxBS-M matcher dominates state-of-the-art methods on both the new and existing datasets.
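
    The RootSIFT descriptor transform mentioned above is compact enough to show directly: L1-normalise each SIFT descriptor and take the element-wise square root, so that Euclidean distance on the result approximates the Hellinger kernel. A minimal sketch:

    ```python
    import numpy as np

    def root_sift(descriptors, eps=1e-7):
        """Convert SIFT descriptors (N x 128) to RootSIFT:
        L1-normalise each descriptor, then take the element-wise square root."""
        d = descriptors / (np.abs(descriptors).sum(axis=1, keepdims=True) + eps)
        return np.sqrt(d)
    ```

    The resulting descriptors can be matched with ordinary Euclidean nearest-neighbour search, which is why the transform drops into existing pipelines so easily.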

    General Dynamic Scene Reconstruction from Multiple View Video

    This paper introduces a general approach to dynamic scene reconstruction from multiple moving cameras without prior knowledge or limiting constraints on the scene structure, appearance, or illumination. Existing techniques for dynamic scene reconstruction from multiple wide-baseline camera views primarily focus on accurate reconstruction in controlled environments, where the cameras are fixed and calibrated and the background is known. These approaches are not robust for general dynamic scenes captured with sparse moving cameras. Previous approaches for outdoor dynamic scene reconstruction assume prior knowledge of the static background appearance and structure. The primary contributions of this paper are twofold: an automatic method for initial coarse dynamic scene segmentation and reconstruction without prior knowledge of background appearance or structure; and a general robust approach for joint segmentation refinement and dense reconstruction of dynamic scenes from multiple wide-baseline static or moving cameras. Evaluation is performed on a variety of indoor and outdoor scenes with cluttered backgrounds and multiple dynamic non-rigid objects such as people. Comparison with state-of-the-art approaches demonstrates improved accuracy in both multiple view segmentation and dense reconstruction. The proposed approach also eliminates the requirement for prior knowledge of scene structure and appearance.