Stereo matching on objects with fractional boundary.
Xiong, Wei. Thesis (M.Phil.), Chinese University of Hong Kong, 2007. Includes bibliographical references (leaves 56-61). Abstracts in English and Chinese.
Contents:
Abstract
Acknowledgement
Chapter 1: Introduction
Chapter 2: Background Study
Chapter 2.1: Stereo matching
Chapter 2.2: Digital image matting
Chapter 2.3: Expectation Maximization
Chapter 3: Model Definition
Chapter 4: Initialization
Chapter 4.1: Initializing disparity
Chapter 4.2: Initializing alpha matte
Chapter 5: Optimization
Chapter 5.1: Expectation Step
Chapter 5.1.1: "Computing E(P_p(d_f = d_1 | θ^(n), U))"
Chapter 5.1.2: "Computing E(P_p(d_b = d_2 | θ^(n), U))"
Chapter 5.2: Maximization Step
Chapter 5.2.1: "Optimize α, given {F, B} fixed"
Chapter 5.2.2: "Optimize {F, B}, given α fixed"
Chapter 5.3: Computing Final Disparities
Chapter 6: Experiment Results
Chapter 7: Conclusion
Bibliography
Blending Learning and Inference in Structured Prediction
In this paper we derive an efficient algorithm to learn the parameters of
structured predictors in general graphical models. This algorithm blends the
learning and inference tasks, which results in a significant speedup over
traditional approaches, such as conditional random fields and structured
support vector machines. For this purpose we utilize the structures of the
predictors to describe a low dimensional structured prediction task which
encourages local consistencies within the different structures while learning
the parameters of the model. Convexity of the learning task provides the means
to enforce the consistencies between the different parts. The
inference-learning blending algorithm that we propose is guaranteed to converge
to the optimum of the low dimensional primal and dual programs. Unlike many of
the existing approaches, the inference-learning blending allows us to
efficiently learn high-order graphical models over regions of any size and with
a very large number of parameters. We demonstrate the effectiveness of our
approach, presenting state-of-the-art results in stereo estimation, semantic
segmentation, shape reconstruction, and indoor scene understanding.
Hierarchical Surface Prediction for 3D Object Reconstruction
Recently, Convolutional Neural Networks have shown promising results for 3D
geometry prediction. They can make predictions from very little input data such
as a single color image. A major limitation of such approaches is that they
only predict a coarse resolution voxel grid, which does not capture the surface
of the objects well. We propose a general framework, called hierarchical
surface prediction (HSP), which facilitates prediction of high resolution voxel
grids. The main insight is that it is sufficient to predict high resolution
voxels around the predicted surfaces. The exterior and interior of the objects
can be represented with coarse resolution voxels. Our approach is not dependent
on a specific input type. We show results for geometry prediction from color
images, depth images and shape completion from partial voxel grids. Our
analysis shows that our high resolution predictions are more accurate than low
resolution predictions. Comment: 3DV 201
An Improved Multi-Level Edge-Based Stereo Correspondence Technique for Snake Based Object Segmentation
Disparity maps generated by stereo correspondence are very useful for stereo object segmentation because, based on disparity, background clutter can be effectively removed from the image. This enables conventional methods, such as snake-based ones, to efficiently detect the contour of the object of interest. In this research I propose two main enhancements to Alattar’s method: I increased the number of edge levels and utilized the color information in the matching process. Together with a few minor modifications, these enhancements achieve a more accurate disparity map, which eventually helps the snake achieve higher segmentation accuracy. Experiments were performed in various indoor and outdoor image conditions to evaluate the matching performance of the proposed method compared to the previous work.
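As a rough illustration of the idea that disparity can be used to remove background clutter, the sketch below thresholds a disparity map to keep only near (high-disparity) pixels. It is not the method proposed in the work above; the function name `foreground_mask` and the parameter `min_disparity` are hypothetical, and the disparity map is assumed to be a plain list of rows.

```python
def foreground_mask(disparity, min_disparity):
    """Return a binary mask over a disparity map given as a list of rows.

    A pixel is marked 1 (foreground) when its disparity is at least
    min_disparity, and 0 (background) otherwise. Larger disparity means
    the point is closer to the cameras. The threshold is an assumed,
    user-chosen value.
    """
    return [[1 if d >= min_disparity else 0 for d in row] for row in disparity]


if __name__ == "__main__":
    # Tiny hand-made disparity map: the larger values stand for a near object.
    disparity = [
        [1, 5],
        [7, 2],
    ]
    print(foreground_mask(disparity, min_disparity=4))
```

A contour-based method such as a snake could then be run on the masked image rather than on the full, cluttered one.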
Exploiting High Level Scene Cues in Stereo Reconstruction
We present a novel approach to 3D reconstruction which is inspired by the human visual system. This system unifies standard appearance matching and triangulation techniques with higher level reasoning and scene understanding, in order to resolve ambiguities between different interpretations of the scene. The types of reasoning integrated in the approach include recognising common configurations of surface normals and semantic edges (e.g. convex, concave and occlusion boundaries). We also recognise the coplanar, collinear and symmetric structures which are especially common in man-made environments.
DART: Distribution Aware Retinal Transform for Event-based Cameras
We introduce a generic visual descriptor, termed the distribution aware
retinal transform (DART), that encodes the structural context using log-polar
grids for event cameras. The DART descriptor is applied to four different
problems, namely object classification, tracking, detection and feature
matching: (1) The DART features are directly employed as local descriptors in a
bag-of-features classification framework and testing is carried out on four
standard event-based object datasets (N-MNIST, MNIST-DVS, CIFAR10-DVS,
NCaltech-101). (2) Extending the classification system, tracking is
demonstrated using two key novelties: (i) For overcoming the low-sample problem
for the one-shot learning of a binary classifier, statistical bootstrapping is
leveraged with online learning; (ii) To achieve tracker robustness, the scale
and rotation equivariance property of the DART descriptors is exploited for the
one-shot learning. (3) To solve the long-term object tracking problem, an
object detector is designed using the principle of cluster majority voting. The
detection scheme is then combined with the tracker to result in a high
intersection-over-union score with augmented ground truth annotations on the
publicly available event camera dataset. (4) Finally, the event context encoded
by DART greatly simplifies the feature correspondence problem, especially for
spatio-temporal slices far apart in time, which has not been explicitly tackled
in the event-based vision domain. Comment: 12 pages, revision submitted to TPAMI in Nov 201
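To make the notion of a log-polar grid concrete, here is a minimal sketch that maps an offset from a grid center to a (ring, wedge) bin. It is an assumption-laden illustration, not the DART implementation: the function name `log_polar_bin` and the ring count, wedge count, and radii are all hypothetical.

```python
import math


def log_polar_bin(dx, dy, n_rings=4, n_wedges=8, r_min=1.0, r_max=16.0):
    """Map an offset (dx, dy) from the grid center to a (ring, wedge) index.

    Returns None when the offset lies beyond the outer radius r_max.
    All default parameter values are illustrative assumptions.
    """
    r = math.hypot(dx, dy)
    if r > r_max:
        return None
    # Rings are spaced uniformly in log-radius; offsets within r_min go to ring 0.
    if r <= r_min:
        ring = 0
    else:
        frac = math.log(r / r_min) / math.log(r_max / r_min)
        ring = min(int(frac * n_rings), n_rings - 1)
    # Wedges are spaced uniformly in angle over [0, 2*pi).
    angle = math.atan2(dy, dx) % (2 * math.pi)
    wedge = min(int(angle / (2 * math.pi) * n_wedges), n_wedges - 1)
    return ring, wedge


if __name__ == "__main__":
    for offset in [(0.5, 0.0), (3.0, 1.0), (-3.0, 1.0), (20.0, 0.0)]:
        print(offset, log_polar_bin(*offset))
```

Counting how many events fall into each bin around a point would give a histogram-style descriptor; because bins grow with radius, nearby structure is encoded more finely than distant structure.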