Superpixel Segmentation with Fully Convolutional Networks
In computer vision, superpixels have been widely used as an effective way to
reduce the number of image primitives for subsequent processing. But only a few
attempts have been made to incorporate them into deep neural networks. One main
reason is that the standard convolution operation is defined on regular grids
and becomes inefficient when applied to superpixels. Inspired by an
initialization strategy commonly adopted by traditional superpixel algorithms,
we present a novel method that employs a simple fully convolutional network to
predict superpixels on a regular image grid. Experimental results on benchmark
datasets show that our method achieves state-of-the-art superpixel segmentation
performance while running at about 50fps. Based on the predicted superpixels,
we further develop a downsampling/upsampling scheme for deep networks with the
goal of generating high-resolution outputs for dense prediction tasks.
Specifically, we modify a popular network architecture for stereo matching to
simultaneously predict superpixels and disparities. We show that improved
disparity estimation accuracy can be obtained on public datasets.
Comment: 16 pages, 15 figures, to be published in CVPR'2
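The regular-grid initialization and the downsampling/upsampling idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the paper's network: it assumes hard pixel-to-cell assignments on a fixed grid, whereas the paper predicts the superpixels with a fully convolutional network. The helper names (grid_init_labels, superpixel_downsample, superpixel_upsample) and the cell parameter are hypothetical.

```python
import numpy as np

def grid_init_labels(height, width, cell):
    """Assign every pixel to a regular grid cell of side `cell` (hypothetical helper).

    This mirrors the regular-grid initialization used by traditional
    superpixel algorithms; it does not refine the cells in any way.
    """
    rows = np.arange(height) // cell
    cols = np.arange(width) // cell
    cells_per_row = -(-width // cell)  # ceiling division
    return rows[:, None] * cells_per_row + cols[None, :]

def superpixel_downsample(features, labels):
    """Average a (H, W) feature map over each superpixel label."""
    n = labels.max() + 1
    sums = np.bincount(labels.ravel(), weights=features.ravel(), minlength=n)
    counts = np.bincount(labels.ravel(), minlength=n)
    return sums / np.maximum(counts, 1)  # guard against unused labels

def superpixel_upsample(values, labels):
    """Broadcast one value per superpixel back to full resolution."""
    return values[labels]
```

With hard assignments, upsampling simply copies each superpixel's value to its member pixels. A soft assignment, assumed here to be closer to what a learned model would output, would replace the per-label averages with weighted sums.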
Hierarchical Deep Stereo Matching on High-resolution Images
We explore the problem of real-time stereo matching on high-res imagery. Many
state-of-the-art (SOTA) methods struggle to process high-res imagery because of
memory constraints or speed limitations. To address this issue, we propose an
end-to-end framework that searches for correspondences incrementally over a
coarse-to-fine hierarchy. Because high-res stereo datasets are relatively rare,
we introduce a dataset with high-res stereo pairs for both training and
evaluation. Our approach achieved SOTA performance on Middlebury-v3 and
KITTI-15 while running significantly faster than its competitors. The
hierarchical design also naturally allows for anytime on-demand reports of
disparity by capping intermediate coarse results, allowing us to accurately
predict disparity for near-range structures with low latency (30ms). We
demonstrate that the performance-vs-speed trade-off afforded by on-demand
hierarchies may address sensing needs for time-critical applications such as
autonomous driving.
Comment: CVPR 201
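The coarse-to-fine search described in this abstract can be sketched in a classical, non-learned form. This is an illustration under stated assumptions, not the paper's end-to-end framework: it uses a per-pixel absolute-difference cost instead of learned features, 2x subsampling to build the pyramid, and a refinement radius of one pixel per level. The function names and parameters are hypothetical.

```python
import numpy as np

def refine_disparity(left, right, init_disp, radius):
    """Search disparities in [init - radius, init + radius] for each pixel.

    Convention assumed here: left[y, x] matches right[y, x - d].
    """
    h, w = left.shape
    xs = np.arange(w)[None, :]
    best_cost = np.full((h, w), np.inf)
    best_disp = init_disp.copy()
    for offset in range(-radius, radius + 1):
        cand = init_disp + offset
        src = xs - cand
        valid = (cand >= 0) & (src >= 0)
        src_clipped = np.clip(src, 0, w - 1)
        cost = np.abs(left - np.take_along_axis(right, src_clipped, axis=1))
        cost[~valid] = np.inf  # never pick out-of-range candidates
        better = cost < best_cost
        best_cost[better] = cost[better]
        best_disp[better] = cand[better]
    return best_disp

def coarse_to_fine_disparity(left, right, max_disp, levels=3):
    """Full search at the coarsest level, then a +/-1 refinement per finer level."""
    pyramid = [(left, right)]
    for _ in range(levels - 1):
        l, r = pyramid[-1]
        pyramid.append((l[::2, ::2], r[::2, ::2]))
    coarse_left, coarse_right = pyramid[-1]
    disp = np.zeros(coarse_left.shape, dtype=np.int64)
    disp = refine_disparity(coarse_left, coarse_right, disp,
                            max_disp // 2 ** (levels - 1))
    for l, r in reversed(pyramid[:-1]):
        # Upsample the disparity map by 2 and double its values.
        disp = 2 * np.repeat(np.repeat(disp, 2, axis=0), 2, axis=1)
        disp = disp[: l.shape[0], : l.shape[1]]
        disp = refine_disparity(l, r, disp, radius=1)
    return disp
```

Stopping the loop early and returning the intermediate map is one way to read the "anytime on-demand" reports mentioned above: the coarse result is available before the finer levels are computed.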
A Comparative Evaluation of SGM Variants (including a New Variant, tMGM) for Dense Stereo Matching
Our goal here is threefold: [1] to present a new dense-stereo matching
algorithm, tMGM, that, by combining the hierarchical logic of tSGM with the
support structure of MGM, achieves a 6-8% performance improvement over the
baseline SGM (these performance numbers are posted under tMGM-16 in the
Middlebury Benchmark V3); [2] through an exhaustive quantitative and
qualitative comparative study, to compare how the major variants of the SGM
approach to dense stereo matching, including the new tMGM, perform in the
presence of (a) illumination variations and shadows, (b) untextured or weakly
textured regions, and (c) repetitive patterns in the scene in the presence of large
stereo rectification errors; and [3] to present a novel DEM-Sculpting approach for
estimating initial disparity search bounds for multi-date satellite stereo
pairs. Based on our study, we have found that tMGM generally performs best with
respect to all these data conditions. Both tSGM and MGM improve the density of
stereo disparity maps, and combining the two in tMGM makes it possible to
accurately estimate the disparities at a significant number of pixels that
would otherwise be declared invalid by SGM. The datasets we have used in our
comparative evaluation include the Middlebury2014, KITTI2015, and ETH3D
datasets and the satellite images over the San Fernando area from the MVS
Challenge dataset.
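For readers unfamiliar with the baseline that these variants extend, the standard SGM cost aggregation along a single path can be sketched as follows. This shows only the baseline left-to-right recurrence; it does not implement tSGM, MGM, or tMGM. The penalty values p1 and p2 and the function name are illustrative assumptions.

```python
import numpy as np

def aggregate_left_to_right(cost, p1=1.0, p2=8.0):
    """Aggregate matching costs along one scanline, left to right.

    cost: array of shape (W, D) with the matching cost of each of D
    disparities at each of W pixels. p1 penalizes a disparity change of
    one level, p2 penalizes larger jumps (illustrative values).
    """
    w, d = cost.shape
    agg = np.empty_like(cost, dtype=np.float64)
    agg[0] = cost[0]
    for x in range(1, w):
        prev = agg[x - 1]
        prev_min = prev.min()
        from_lower = np.concatenate(([np.inf], prev[:-1])) + p1  # came from d - 1
        from_upper = np.concatenate((prev[1:], [np.inf])) + p1   # came from d + 1
        best = np.minimum(np.minimum(prev, from_lower),
                          np.minimum(from_upper, prev_min + p2))
        # Subtracting prev_min keeps the aggregated values bounded.
        agg[x] = cost[x] + best - prev_min
    return agg
```

A full SGM implementation sums such aggregated costs over several path directions and then takes the per-pixel minimum over disparities.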