1,705 research outputs found
Local Stereo Matching Using Adaptive Local Segmentation
We propose a new dense local stereo matching framework for gray-level images based on an adaptive local segmentation using a dynamic threshold. We define a new validity domain of the fronto-parallel assumption based on the local intensity variations in the 4-neighborhood of the matching pixel. The preprocessing step smoothes low textured areas and sharpens texture edges, whereas the postprocessing step detects and recovers occluded and unreliable disparities. The algorithm achieves high stereo reconstruction quality in regions with uniform intensities as well as in textured regions. The algorithm is robust against local radiometrical differences; and successfully recovers disparities around the objects edges, disparities of thin objects, and the disparities of the occluded region. Moreover, our algorithm intrinsically prevents errors caused by occlusion to propagate into nonoccluded regions. It has only a small number of parameters. The performance of our algorithm is evaluated on the Middlebury test bed stereo images. It ranks highly on the evaluation list outperforming many local and global stereo algorithms using color images. Among the local algorithms relying on the fronto-parallel assumption, our algorithm is the best ranked algorithm. We also demonstrate that our algorithm is working well on practical examples as for disparity estimation of a tomato seedling and a 3D reconstruction of a face
Anytime Stereo Image Depth Estimation on Mobile Devices
Many applications of stereo depth estimation in robotics require the
generation of accurate disparity maps in real time under significant
computational constraints. Current state-of-the-art algorithms force a choice
between either generating accurate mappings at a slow pace, or quickly
generating inaccurate ones, and additionally these methods typically require
far too many parameters to be usable on power- or memory-constrained devices.
Motivated by these shortcomings, we propose a novel approach for disparity
prediction in the anytime setting. In contrast to prior work, our end-to-end
learned approach can trade off computation and accuracy at inference time.
Depth estimation is performed in stages, during which the model can be queried
at any time to output its current best estimate. Our final model can process
1242375 resolution images within a range of 10-35 FPS on an NVIDIA
Jetson TX2 module with only marginal increases in error -- using two orders of
magnitude fewer parameters than the most competitive baseline. The source code
is available at https://github.com/mileyan/AnyNet .Comment: Accepted by ICRA201
Entropy-difference based stereo error detection
Stereo depth estimation is error-prone; hence, effective error detection
methods are desirable. Most such existing methods depend on characteristics of
the stereo matching cost curve, making them unduly dependent on functional
details of the matching algorithm. As a remedy, we propose a novel error
detection approach based solely on the input image and its depth map. Our
assumption is that, entropy of any point on an image will be significantly
higher than the entropy of its corresponding point on the image's depth map. In
this paper, we propose a confidence measure, Entropy-Difference (ED) for stereo
depth estimates and a binary classification method to identify incorrect
depths. Experiments on the Middlebury dataset show the effectiveness of our
method. Our proposed stereo confidence measure outperforms 17 existing measures
in all aspects except occlusion detection. Established metrics such as
precision, accuracy, recall, and area-under-curve are used to demonstrate the
effectiveness of our method
Depth from Monocular Images using a Semi-Parallel Deep Neural Network (SPDNN) Hybrid Architecture
Deep neural networks are applied to a wide range of problems in recent years.
In this work, Convolutional Neural Network (CNN) is applied to the problem of
determining the depth from a single camera image (monocular depth). Eight
different networks are designed to perform depth estimation, each of them
suitable for a feature level. Networks with different pooling sizes determine
different feature levels. After designing a set of networks, these models may
be combined into a single network topology using graph optimization techniques.
This "Semi Parallel Deep Neural Network (SPDNN)" eliminates duplicated common
network layers, and can be further optimized by retraining to achieve an
improved model compared to the individual topologies. In this study, four SPDNN
models are trained and have been evaluated at 2 stages on the KITTI dataset.
The ground truth images in the first part of the experiment are provided by the
benchmark, and for the second part, the ground truth images are the depth map
results from applying a state-of-the-art stereo matching method. The results of
this evaluation demonstrate that using post-processing techniques to refine the
target of the network increases the accuracy of depth estimation on individual
mono images. The second evaluation shows that using segmentation data alongside
the original data as the input can improve the depth estimation results to a
point where performance is comparable with stereo depth estimation. The
computational time is also discussed in this study.Comment: 44 pages, 25 figure
Memory-efficient belief propagation for high-definition real-time stereo matching systems
Tele-presence systems will enable participants to feel like they are physically together. In order to improve this feeling, these systems are starting to include depth estimation capabilities. A typical requirement for these systems includes high definition, good quality results and low latency. Benchmarks demonstrate that stereo-matching algorithms using Belief Propagation (BP) produce the best results. The execution time of the BP algorithm in a CPU cannot satisfy real-time requirements with high-definition images. GPU-based implementations of BP algorithms are only able to work in real-time with small-medium size images because the traffic with memory limits their applicability. The inherent parallelism of the BP algorithm makes FPGA-based solutions a good choice. However, even though the memory traffic of a commercial FPGA-based ASIC-prototyping board is high, it is still not enough to comply with realtime, high definition and good immersive feeling requirements. The work presented estimates depth maps in less than 40 milliseconds for high-definition images at 30fps with 80 disparity levels. The proposed double BP topology and the new data-cost estimation improve the overall classical BP performance while they reduce the memory traffic by about 21%. Moreover, the adaptive message compression method and message distribution in memory reduce the number of memory accesses by more than 70% with an almost negligible loss of performance. The total memory traffic reduction is about 90%, demonstrating sufficient quality to be classified within the first 40 positions in the Middlebury ranking.This work has been partially supported by the CDTI under project CENIT-VISION 2007-1007 and the
CICYT under TEC2008-04107
Acceleration of stereo-matching on multi-core CPU and GPU
This paper presents an accelerated version of a
dense stereo-correspondence algorithm for two different parallelism
enabled architectures, multi-core CPU and GPU. The
algorithm is part of the vision system developed for a binocular
robot-head in the context of the CloPeMa 1 research project.
This research project focuses on the conception of a new clothes
folding robot with real-time and high resolution requirements
for the vision system. The performance analysis shows that
the parallelised stereo-matching algorithm has been significantly
accelerated, maintaining 12x and 176x speed-up respectively
for multi-core CPU and GPU, compared with non-SIMD singlethread
CPU. To analyse the origin of the speed-up and gain
deeper understanding about the choice of the optimal hardware,
the algorithm was broken into key sub-tasks and the performance
was tested for four different hardware architectures
- …