2,937 research outputs found

    Stereo matching on objects with fractional boundary.

    Xiong, Wei. Thesis (M.Phil.), Chinese University of Hong Kong, 2007. Includes bibliographical references (leaves 56-61). Abstracts in English and Chinese.

    Contents:
    Abstract
    Acknowledgement
    Chapter 1: Introduction
    Chapter 2: Background Study
    Chapter 2.1: Stereo matching
    Chapter 2.2: Digital image matting
    Chapter 2.3: Expectation Maximization
    Chapter 3: Model Definition
    Chapter 4: Initialization
    Chapter 4.1: Initializing disparity
    Chapter 4.2: Initializing alpha matte
    Chapter 5: Optimization
    Chapter 5.1: Expectation Step
    Chapter 5.1.1: Computing $E(P_p(d_f = d_1 \mid \theta^{(n)}, U))$
    Chapter 5.1.2: Computing $E(P_p(d_b = d_2 \mid \theta^{(n)}, U))$
    Chapter 5.2: Maximization Step
    Chapter 5.2.1: Optimize α, given {F, B} fixed
    Chapter 5.2.2: Optimize {F, B}, given α fixed
    Chapter 5.3: Computing Final Disparities
    Chapter 6: Experiment Results
    Chapter 7: Conclusion
    Bibliography

    Blending Learning and Inference in Structured Prediction

    In this paper we derive an efficient algorithm to learn the parameters of structured predictors in general graphical models. This algorithm blends the learning and inference tasks, which results in a significant speedup over traditional approaches, such as conditional random fields and structured support vector machines. For this purpose we utilize the structures of the predictors to describe a low-dimensional structured prediction task which encourages local consistencies within the different structures while learning the parameters of the model. Convexity of the learning task provides the means to enforce the consistencies between the different parts. The inference-learning blending algorithm that we propose is guaranteed to converge to the optimum of the low-dimensional primal and dual programs. Unlike many of the existing approaches, the inference-learning blending allows us to efficiently learn high-order graphical models over regions of any size and with a very large number of parameters. We demonstrate the effectiveness of our approach, while presenting state-of-the-art results in stereo estimation, semantic segmentation, shape reconstruction, and indoor scene understanding.

    Hierarchical Surface Prediction for 3D Object Reconstruction

    Recently, Convolutional Neural Networks have shown promising results for 3D geometry prediction. They can make predictions from very little input data, such as a single color image. A major limitation of such approaches is that they only predict a coarse-resolution voxel grid, which does not capture the surface of the objects well. We propose a general framework, called hierarchical surface prediction (HSP), which facilitates prediction of high-resolution voxel grids. The main insight is that it is sufficient to predict high-resolution voxels around the predicted surfaces. The exterior and interior of the objects can be represented with coarse-resolution voxels. Our approach is not dependent on a specific input type. We show results for geometry prediction from color images and depth images, and for shape completion from partial voxel grids. Our analysis shows that our high-resolution predictions are more accurate than low-resolution predictions. Comment: 3DV 201
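    The central idea in this abstract (spend high resolution only near the surface, keep interior and exterior coarse) can be illustrated with a small sketch. This is not the HSP network: the learned predictor is swapped for a hypothetical analytic occupancy test (`in_sphere`), cells are labeled by sampling their eight corners, and the names `classify`, `refine`, and `hierarchical_grid` as well as the grid sizes are assumptions made for illustration only.

```python
from itertools import product


def classify(origin, size, inside):
    """Label a cubic cell by testing its 8 corners with the occupancy function."""
    corners = [
        inside(origin[0] + i * size, origin[1] + j * size, origin[2] + k * size)
        for i, j, k in product((0, 1), repeat=3)
    ]
    if all(corners):
        return "occupied"
    if not any(corners):
        return "free"
    return "boundary"


def refine(origin, size, inside, levels_left, leaves):
    """Subdivide only boundary cells; free and occupied cells stay coarse."""
    label = classify(origin, size, inside)
    if label != "boundary" or levels_left == 0:
        leaves.append((origin, size, label))
        return
    half = size / 2
    for i, j, k in product((0, 1), repeat=3):
        child = (origin[0] + i * half, origin[1] + j * half, origin[2] + k * half)
        refine(child, half, inside, levels_left - 1, leaves)


def hierarchical_grid(inside, coarse=4, levels=3):
    """Start from a coarse grid over the unit cube and refine near the surface."""
    size = 1.0 / coarse
    leaves = []
    for i, j, k in product(range(coarse), repeat=3):
        refine((i * size, j * size, k * size), size, inside, levels, leaves)
    return leaves


def in_sphere(x, y, z):
    """Hypothetical stand-in for a learned occupancy predictor."""
    return (x - 0.5) ** 2 + (y - 0.5) ** 2 + (z - 0.5) ** 2 <= 0.4 ** 2


leaves = hierarchical_grid(in_sphere)
dense_cells = (4 * 2 ** 3) ** 3  # cell count of a uniform grid at the finest size
print(len(leaves), "leaf cells instead of", dense_cells)
```

    Corner sampling is a simplification that can mislabel cells for shapes with thin features; it is used here only to keep the sketch short.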

    An Improved Multi-Level Edge-Based Stereo Correspondence Technique for Snake Based Object Segmentation

    Disparity maps generated by stereo correspondence are very useful for stereo object segmentation because background clutter can be effectively removed from the image based on disparity. This enables conventional methods, such as snake-based ones, to efficiently detect the contour of the object of interest. In this research I propose two main enhancements to Alattar’s method: first, I increased the number of edge levels, and second, I utilized the color information in the matching process. Together with a few minor modifications, these enhancements achieve a more accurate disparity map, which eventually helps the snake achieve higher segmentation accuracy. Experiments were performed in various indoor and outdoor image conditions to evaluate the matching performance of the proposed method compared to the previous work.
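    As a rough illustration of how a disparity map can suppress background clutter before contour detection, the sketch below keeps only pixels whose disparity is at least a threshold (larger disparity meaning closer to the cameras). The function names, the threshold value, and the toy data are assumptions for illustration; the thesis's multi-level edge matching itself is not reproduced here.

```python
def foreground_mask(disparity, min_disparity):
    """Mark pixels whose disparity is large enough to count as foreground."""
    return [[d >= min_disparity for d in row] for row in disparity]


def apply_mask(image, mask, fill=0):
    """Replace background pixels with a fill value, keeping foreground pixels."""
    return [
        [pixel if keep else fill for pixel, keep in zip(image_row, mask_row)]
        for image_row, mask_row in zip(image, mask)
    ]


# Toy 3x4 disparity map and grayscale image (made-up values).
disparity = [
    [2, 2, 3, 2],
    [2, 12, 13, 2],
    [3, 12, 14, 2],
]
image = [
    [50, 60, 70, 80],
    [90, 200, 210, 40],
    [30, 220, 230, 20],
]

mask = foreground_mask(disparity, min_disparity=10)
segmented = apply_mask(image, mask)
for row in segmented:
    print(row)
```

    A contour method such as a snake would then be initialized on the masked image rather than on the cluttered original.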

    Exploiting High Level Scene Cues in Stereo Reconstruction

    We present a novel approach to 3D reconstruction which is inspired by the human visual system. This system unifies standard appearance matching and triangulation techniques with higher-level reasoning and scene understanding, in order to resolve ambiguities between different interpretations of the scene. The types of reasoning integrated in the approach include recognising common configurations of surface normals and semantic edges (e.g. convex, concave and occlusion boundaries). We also recognise the coplanar, collinear and symmetric structures which are especially common in man-made environments.

    DART: Distribution Aware Retinal Transform for Event-based Cameras

    We introduce a generic visual descriptor, termed distribution aware retinal transform (DART), that encodes the structural context using log-polar grids for event cameras. The DART descriptor is applied to four different problems, namely object classification, tracking, detection and feature matching: (1) The DART features are directly employed as local descriptors in a bag-of-features classification framework, and testing is carried out on four standard event-based object datasets (N-MNIST, MNIST-DVS, CIFAR10-DVS, NCaltech-101). (2) Extending the classification system, tracking is demonstrated using two key novelties: (i) to overcome the low-sample problem in the one-shot learning of a binary classifier, statistical bootstrapping is leveraged with online learning; (ii) to achieve tracker robustness, the scale and rotation equivariance property of the DART descriptors is exploited for the one-shot learning. (3) To solve the long-term object tracking problem, an object detector is designed using the principle of cluster majority voting. The detection scheme is then combined with the tracker, resulting in a high intersection-over-union score with augmented ground truth annotations on the publicly available event camera dataset. (4) Finally, the event context encoded by DART greatly simplifies the feature correspondence problem, especially for spatio-temporal slices far apart in time, which has not been explicitly tackled in the event-based vision domain. Comment: 12 pages, revision submitted to TPAMI in Nov 201
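    To make the phrase "log-polar grids" concrete, the sketch below bins 2D event locations around a center into rings whose radii grow geometrically and into equal angular wedges. It is a generic log-polar histogram, not the DART descriptor as defined in the paper; the function name, the bin counts, and the radii `r_min` and `r_max` are assumptions chosen for illustration.

```python
import math


def log_polar_histogram(points, center, n_rings=4, n_wedges=8, r_min=1.0, r_max=16.0):
    """Count points per (ring, wedge) bin of a log-polar grid around center."""
    hist = [[0] * n_wedges for _ in range(n_rings)]
    log_span = math.log(r_max / r_min)
    for x, y in points:
        dx, dy = x - center[0], y - center[1]
        r = math.hypot(dx, dy)
        if r >= r_max:
            continue  # outside the grid: ignored
        if r <= r_min:
            ring = 0  # innermost disc is folded into the first ring
        else:
            ring = min(int(n_rings * math.log(r / r_min) / log_span), n_rings - 1)
        angle = math.atan2(dy, dx) % (2 * math.pi)
        wedge = min(int(angle / (2 * math.pi) * n_wedges), n_wedges - 1)
        hist[ring][wedge] += 1
    return hist


# Made-up event coordinates relative to a center at the origin.
events = [(0.5, 0.1), (3.0, 1.0), (-5.0, 1.0), (1.0, -10.0), (20.0, 0.0)]
hist = log_polar_histogram(events, center=(0.0, 0.0))
for ring in hist:
    print(ring)
```

    Because ring widths grow with radius, nearby structure is sampled finely and distant context coarsely, which is the usual motivation for log-polar layouts.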