2,319 research outputs found
DCTM: Discrete-Continuous Transformation Matching for Semantic Flow
Techniques for dense semantic correspondence have provided limited ability to
deal with the geometric variations that commonly exist between semantically
similar images. While variations due to scale and rotation have been examined,
practical solutions are still lacking for more complex deformations such as affine
transformations because of the tremendous size of the associated solution
space. To address this problem, we present a discrete-continuous transformation
matching (DCTM) framework where dense affine transformation fields are inferred
through a discrete label optimization in which the labels are iteratively
updated via continuous regularization. In this way, our approach draws
solutions from the continuous space of affine transformations in a manner that
can be computed efficiently through constant-time edge-aware filtering and a
proposed affine-varying CNN-based descriptor. Experimental results show that
this model outperforms the state-of-the-art methods for dense semantic
correspondence on various benchmarks.
Stereo Matching Using a Modified Efficient Belief Propagation in a Level Set Framework
Stereo matching determines correspondence between pixels in two or more images of the same scene taken from different angles; this can be handled either locally or globally. The two most common global approaches are belief propagation (BP) and graph cuts.
Efficient belief propagation (EBP), which is the most widely used BP approach, uses a multi-scale message passing strategy, an O(k) smoothness cost algorithm, and a bipartite message passing strategy to speed up the convergence of the standard BP approach. As in standard belief propagation, every pixel sends messages to and receives messages from its four neighboring pixels in EBP. Each outgoing message is the sum of the data cost, incoming messages from all the neighbors except the intended receiver, and the smoothness cost. Upon convergence, the location of the minimum of the final belief vector is defined as the current pixel’s disparity.
The present effort makes three main contributions: (a) it incorporates level set concepts, (b) it develops a modified data cost to encourage matching of intervals, and (c) it adjusts the location of the minimum of outgoing messages for select pixels so that it is consistent with the level set method.
When the results of the current work are compared with those of standard EBP, the disparity results are very similar, as they should be.
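The message update and disparity read-out described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the usual min-sum form of belief propagation, in which the sum of data cost, incoming messages and smoothness cost is minimised over the sender's labels, and it uses a plain O(k^2) minimisation rather than the O(k) smoothness algorithm of EBP. The function names, the neighbour keys and the message normalisation step are all hypothetical choices made for the example.

```python
import numpy as np

def outgoing_message(data_cost, incoming, exclude, smoothness):
    """Min-sum message from one pixel to the neighbour named `exclude`.

    data_cost:  (k,) cost of each disparity label at the sending pixel
    incoming:   dict of neighbour name -> (k,) message received from it
    exclude:    name of the intended receiver; its message is left out
    smoothness: (k, k) matrix indexed [sender_label, receiver_label]
    """
    # Data cost plus messages from every neighbour except the receiver.
    h = np.asarray(data_cost, dtype=float).copy()
    for name, msg in incoming.items():
        if name != exclude:
            h = h + msg
    # Add the smoothness cost and minimise over the sender's labels
    # (naive O(k^2); an assumption, EBP itself uses an O(k) method).
    out = (h[:, None] + smoothness).min(axis=0)
    # Subtract the minimum so values stay bounded (assumed normalisation).
    return out - out.min()

def disparity(data_cost, incoming):
    """Label at the minimum of the belief vector (data cost + all messages)."""
    belief = np.asarray(data_cost, dtype=float) + sum(incoming.values())
    return int(np.argmin(belief))

# Example with k = 3 labels and a linear smoothness cost |d_s - d_r|.
k = 3
labels = np.arange(k)
smoothness = np.abs(labels[:, None] - labels[None, :]).astype(float)
data_cost = np.array([1.0, 0.0, 2.0])
incoming = {n: np.zeros(k) for n in ("up", "down", "left", "right")}
msg_to_right = outgoing_message(data_cost, incoming, "right", smoothness)
best_label = disparity(data_cost, incoming)
```

In a full solver these two functions would be applied to every pixel over many iterations, with the four neighbours of each pixel supplying the incoming messages.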
3D RECONSTRUCTION FROM STEREO/RANGE IMAGES
3D reconstruction from stereo/range images is one of the most fundamental and extensively researched topics in computer vision. Stereo research has recently entered something of a new era as a result of publicly available performance testing such as the Middlebury data set, which has allowed researchers to compare their algorithms against all the state-of-the-art algorithms. This thesis investigates general stereo problems in both the two-view and multi-view settings. For two-view stereo, we formulate an algorithm for the stereo matching problem with careful handling of disparity, discontinuity and occlusion. The algorithm works with a global matching stereo model based on an energy minimization framework. The experimental results are evaluated on the Middlebury data set, showing that our algorithm is the top performer. A GPU implementation of the Hierarchical BP algorithm is then proposed, which provides stereo quality similar to CPU Hierarchical BP while running at real-time speed. A fast-converging BP is also proposed to solve the slow convergence problem of general BP algorithms. Besides two-view stereo, efficient multi-view stereo for large-scale urban reconstruction is carefully studied in this thesis. A novel approach is presented for computing depth maps from urban imagery in which large parts of surfaces are often weakly textured. Finally, a new post-processing step is proposed to enhance range images in both spatial resolution and depth precision.
Cross-Scale Cost Aggregation for Stereo Matching
Human beings process stereoscopic correspondence across multiple scales.
However, this bio-inspiration is ignored by state-of-the-art cost aggregation
methods for dense stereo correspondence. In this paper, a generic cross-scale
cost aggregation framework is proposed to allow multi-scale interaction in cost
aggregation. We first reformulate cost aggregation from a unified
optimization perspective and show that different cost aggregation methods
essentially differ in the choices of similarity kernels. Then, an inter-scale
regularizer is introduced into optimization and solving this new optimization
problem leads to the proposed framework. Since the regularization term is
independent of the similarity kernel, various cost aggregation methods can be
integrated into the proposed general framework. We show that the cross-scale
framework is important as it effectively and efficiently expands
state-of-the-art cost aggregation methods and leads to significant
improvements, when evaluated on Middlebury, KITTI and New Tsukuba datasets.
Comment: To Appear in 2013 IEEE Conference on Computer Vision and Pattern
Recognition (CVPR). 2014 (poster, 29.88%
- …