16 research outputs found

    Semi-Global Stereo Matching with Surface Orientation Priors

    Full text link
    Semi-Global Matching (SGM) is a widely-used efficient stereo matching technique. It works well for textured scenes, but fails on untextured slanted surfaces due to its fronto-parallel smoothness assumption. To remedy this problem, we propose a simple extension, termed SGM-P, to utilize precomputed surface orientation priors. Such priors favor different surface slants in different 2D image regions or 3D scene regions and can be derived in various ways. In this paper we evaluate plane orientation priors derived from stereo matching at a coarser resolution and show that such priors can yield significant performance gains for difficult weakly-textured scenes. We also explore surface normal priors derived from Manhattan-world assumptions, and we analyze the potential performance gains using oracle priors derived from ground-truth data. SGM-P only adds a minor computational overhead to SGM and is an attractive alternative to more complex methods employing higher-order smoothness terms.Comment: extended draft of 3DV 2017 (spotlight) pape

    Image-based rendering and synthesis

    Get PDF
    Multiview imaging (MVI) is currently the focus of some research as it has a wide range of applications and opens up research in other topics and applications, including virtual view synthesis for three-dimensional (3D) television (3DTV) and entertainment. However, a large amount of storage is needed by multiview systems and are difficult to construct. The concept behind allowing 3D scenes and objects to be visualized in a realistic way without full 3D model reconstruction is image-based rendering (IBR). Using images as the primary substrate, IBR has many potential applications including for video games, virtual travel and others. The technique creates new views of scenes which are reconstructed from a collection of densely sampled images or videos. The IBR concept has different classification such as knowing 3D models and the lighting conditions and be rendered using conventional graphic techniques. Another is lightfield or lumigraph rendering which depends on dense sampling with no or very little geometry for rendering without recovering the exact 3D-models.published_or_final_versio

    Stereo Matching Using a Modified Efficient Belief Propagation in a Level Set Framework

    Get PDF
    Stereo matching determines correspondence between pixels in two or more images of the same scene taken from different angles; this can be handled either locally or globally. The two most common global approaches are belief propagation (BP) and graph cuts. Efficient belief propagation (EBP), which is the most widely used BP approach, uses a multi-scale message passing strategy, an O(k) smoothness cost algorithm, and a bipartite message passing strategy to speed up the convergence of the standard BP approach. As in standard belief propagation, every pixel sends messages to and receives messages from its four neighboring pixels in EBP. Each outgoing message is the sum of the data cost, incoming messages from all the neighbors except the intended receiver, and the smoothness cost. Upon convergence, the location of the minimum of the final belief vector is defined as the current pixel’s disparity. The present effort makes three main contributions: (a) it incorporates level set concepts, (b) it develops a modified data cost to encourage matching of intervals, (c) it adjusts the location of the minimum of outgoing messages for select pixels that is consistent with the level set method. When comparing the results of the current work with that of standard EBP, the disparity results are very similar, as they should be

    Deep Blending for Free-Viewpoint Image-Based Rendering

    Get PDF
    International audienceIBR Algorithm Real surface Reconstruction Fig. 1. Image-based rendering blends contributions from different input images to synthesize a novel view. (a) This blending operation is complex for a variety of reasons. For example, incorrect visibility (e.g., for the non-reconstructed green surface in input view (1)) may result in wrong projections of an image into the novel view. Blending must also account for, e.g., differences in projected resolution ((2) and (3)). Previous methods have used hand-crafted heuristics to overcome these issues. (b) Our method generates a set of ranked contributions (mosaics) from the input images and uses predicted blending weights from a CNN to perform deep blending and synthesize (c) novel views. Our solution significantly reduces visible artifacts compared to (d) previous methods. Free-viewpoint image-based rendering (IBR) is a standing challenge. IBR methods combine warped versions of input photos to synthesize a novel view. The image quality of this combination is directly affected by geometric inaccuracies of multi-view stereo (MVS) reconstruction and by view-and image-dependent effects that produce artifacts when contributions from different input views are blended. We present a new deep learning approach to blending for IBR, in which we use held-out real image data to learn blending weights to combine input photo contributions. Our Deep Blending method requires us to address several challenges to achieve our goal of interactive free-viewpoint IBR navigation. We first need to provide sufficiently accurate geometry so the Convolutional Neural Network (CNN) can succeed in finding correct blending weights. We do this by combining two different MVS reconstructions with complementary accuracy vs. completeness tradeoffs. To tightly integrate learning in an interactive IBR system, we need to adapt our rendering algorithm to produce a fixed number of input layers that can then be blended by the CNN. We generate training data with a variety of captured scenes, using each input photo as ground truth in a held-out approach. We also design the network architecture and the training loss to provide high quality novel view synthesis, while reducing temporal flickering artifacts. Our results demonstrate free-viewpoint IBR in a wide variety of scenes, clearly surpassing previous methods in visual quality, especially when moving far from the input cameras

    3D RECONSTRUCTION FROM STEREO/RANGE IMAGES

    Get PDF
    3D reconstruction from stereo/range image is one of the most fundamental and extensively researched topics in computer vision. Stereo research has recently experienced somewhat of a new era, as a result of publically available performance testing such as the Middlebury data set, which has allowed researchers to compare their algorithms against all the state-of-the-art algorithms. This thesis investigates into the general stereo problems in both the two-view stereo and multi-view stereo scopes. In the two-view stereo scope, we formulate an algorithm for the stereo matching problem with careful handling of disparity, discontinuity and occlusion. The algorithm works with a global matching stereo model based on an energy minimization framework. The experimental results are evaluated on the Middlebury data set, showing that our algorithm is the top performer. A GPU approach of the Hierarchical BP algorithm is then proposed, which provides similar stereo quality to CPU Hierarchical BP while running at real-time speed. A fast-converging BP is also proposed to solve the slow convergence problem of general BP algorithms. Besides two-view stereo, ecient multi-view stereo for large scale urban reconstruction is carefully studied in this thesis. A novel approach for computing depth maps given urban imagery where often large parts of surfaces are weakly textured is presented. Finally, a new post-processing step to enhance the range images in both the both the spatial resolution and depth precision is proposed

    3D city scale reconstruction using wide area motion imagery

    Get PDF
    3D reconstruction is one of the most challenging but also most necessary part of computer vision. It is generally applied everywhere, from remote sensing to medical imaging and multimedia. Wide Area Motion Imagery is a field that has gained traction over the recent years. It consists in using an airborne large field of view sensor to cover a typically over a square kilometer area for each captured image. This is particularly valuable data for analysis but the amount of information is overwhelming for any human analyst. Algorithms to efficiently and automatically extract information are therefore needed and 3D reconstruction plays a critical part in it, along with detection and tracking. This dissertation work presents novel reconstruction algorithms to compute a 3D probabilistic space, a set of experiments to efficiently extract photo realistic 3D point clouds and a range of transformations for possible applications of the generated 3D data to filtering, data compression and mapping. The algorithms have been successfully tested on our own datasets provided by Transparent Sky and this thesis work also proposes methods to evaluate accuracy, completeness and photo-consistency. The generated data has been successfully used to improve detection and tracking performances, and allows data compression and extrapolation by generating synthetic images from new point of view, and data augmentation with the inferred occlusion areas.Includes bibliographical reference
    corecore