Semantically Guided Depth Upsampling
We present a novel method for accurate and efficient upsampling of sparse
depth data, guided by high-resolution imagery. Our approach goes beyond the use
of intensity cues only and additionally exploits object boundary cues through
structured edge detection and semantic scene labeling for guidance. Both cues
are combined within a geodesic distance measure that allows for
boundary-preserving depth interpolation while utilizing local context. We
model the observed scene structure by locally planar elements and formulate the
upsampling task as a global energy minimization problem. Our method determines
globally consistent solutions and preserves fine details and sharp depth
boundaries. In our experiments on several public datasets at different levels
of application, we demonstrate superior performance of our approach over the
state-of-the-art, even for very sparse measurements.
Comment: German Conference on Pattern Recognition 2016 (Oral
Depth Superresolution using Motion Adaptive Regularization
Spatial resolution of depth sensors is often significantly lower compared to
that of conventional optical cameras. Recent work has explored the idea of
improving the resolution of depth using higher-resolution intensity as side
information. In this paper, we demonstrate that further incorporating temporal
information in videos can significantly improve the results. In particular, we
propose a novel approach that improves depth resolution, exploiting the
space-time redundancy in the depth and intensity using motion-adaptive low-rank
regularization. Experiments confirm that the proposed approach substantially
improves the quality of the estimated high-resolution depth. Our approach can
be a first component in systems using vision techniques that rely on
high-resolution depth information.
A Joint Intensity and Depth Co-Sparse Analysis Model for Depth Map Super-Resolution
High-resolution depth maps can be inferred from low-resolution depth
measurements and an additional high-resolution intensity image of the same
scene. To that end, we introduce a bimodal co-sparse analysis model, which is
able to capture the interdependency of registered intensity and depth
information. This model is based on the assumption that the co-supports of
corresponding bimodal image structures are aligned when computed by a suitable
pair of analysis operators. No analytic form of such operators exists, and we
propose a method for learning them from a set of registered training signals.
This learning process is done offline and returns a bimodal analysis operator
that is universally applicable to natural scenes. We use this to exploit the
bimodal co-sparse analysis model as a prior for solving inverse problems, which
leads to an efficient algorithm for depth map super-resolution.
Comment: 13 pages, 4 figures
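As a rough illustration of the co-support assumption described above, the following sketch uses a plain forward-difference operator as a stand-in analysis operator. This is an assumption made for illustration only: the paper learns its operators from registered training signals. The function names `forward_difference` and `co_support` and the toy signals are hypothetical. The sketch checks that a 1D intensity signal and a 1D depth signal with an edge at the same position share the same co-support, i.e. the same set of zero analysis coefficients.

```python
def forward_difference(signal):
    """Apply a forward-difference analysis operator: coeff[i] = x[i+1] - x[i]."""
    return [signal[i + 1] - signal[i] for i in range(len(signal) - 1)]


def co_support(signal, tol=1e-9):
    """Return the indices where the analysis coefficients are numerically zero."""
    coeffs = forward_difference(signal)
    return {i for i, c in enumerate(coeffs) if abs(c) <= tol}


# Toy registered signals: both have a single edge between samples 2 and 3.
intensity = [0.2, 0.2, 0.2, 0.9, 0.9, 0.9]
depth = [1.5, 1.5, 1.5, 4.0, 4.0, 4.0]

intensity_cosupport = co_support(intensity)
depth_cosupport = co_support(depth)

# The co-supports are aligned when the two index sets coincide.
aligned = intensity_cosupport == depth_cosupport
```

In this toy case both signals are flat everywhere except at the shared edge, so the two index sets coincide even though the intensity and depth values themselves are unrelated.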
RGB Guided Depth Map Super-Resolution with Coupled U-Net
The depth maps captured by RGB-D cameras are usually of low resolution, which has motivated recent efforts to develop depth super-resolution (DSR) methods. However, several problems remain in existing DSR methods. First, conventional DSR methods often suffer from unexpected artifacts. Second, high-resolution (HR) RGB features and low-resolution (LR) depth features are often fused in shallow layers only. Third, only the last layer of features is used for reconstruction. To address these problems, we propose Coupled U-Net (CU-Net), a new color image guided DSR method built on two U-Net branches for HR color images and LR depth maps, respectively. The CU-Net embeds a dual skip connection structure to leverage the feature interaction of the two branches, and a multi-scale fusion to fuse the deeper and multi-scale features of the two branch decoders for more effective feature reconstruction. Moreover, a channel attention module is proposed to eliminate artifacts. Extensive experiments show that the proposed CU-Net outperforms state-of-the-art methods.
Explicit modeling on depth-color inconsistency for color-guided depth up-sampling
© 2016 IEEE. Color-guided depth up-sampling enhances the resolution of a depth map under the assumption that depth discontinuities and color image edges at corresponding locations are consistent. Among the reported methods, the MRF and its variants form one of the major approaches and have dominated this area for several years. However, the assumption above does not always hold. The usual remedy is to adjust the weighting inside the smoothness term of the MRF model, but no existing method explicitly considers the inconsistency between a depth discontinuity and the corresponding color edge. In this paper, we propose a quantitative measurement of this inconsistency and explicitly embed it into the weighting of the smoothness term. Such a solution has not been reported in the literature. The improved depth up-sampling based on the proposed method is evaluated on the Middlebury and ToFMark datasets and demonstrates promising results.
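To make the role of the smoothness weighting concrete, here is a minimal sketch of a conventional color-guided weight on a 1D signal. It is not the inconsistency measure proposed in the paper, which the abstract does not specify. The Gaussian form of the weight, the `sigma` value, and the names `color_weight` and `smoothness_energy` are assumptions chosen for illustration.

```python
import math


def color_weight(color_p, color_q, sigma=10.0):
    """Conventional color-guided weight: 1 for equal colors, near 0 across a color edge."""
    return math.exp(-((color_p - color_q) ** 2) / (2.0 * sigma ** 2))


def smoothness_energy(depth, color, sigma=10.0):
    """Sum of weighted squared depth differences between neighboring samples."""
    energy = 0.0
    for i in range(len(depth) - 1):
        w = color_weight(color[i], color[i + 1], sigma)
        energy += w * (depth[i] - depth[i + 1]) ** 2
    return energy


# Depth edge coincides with a color edge: the depth jump is barely penalized.
consistent = smoothness_energy([1.0, 1.0, 5.0, 5.0], [20.0, 20.0, 200.0, 200.0])

# Depth edge without a color edge (the inconsistent case): the jump is fully penalized.
inconsistent = smoothness_energy([1.0, 1.0, 5.0, 5.0], [20.0, 20.0, 20.0, 20.0])
```

The second case shows why the consistency assumption matters: when the color image has no edge at a true depth discontinuity, a purely color-driven weight pushes the solution toward smoothing that discontinuity away.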
Investigations of closed source registration method of depth sensor technologies for human-robot collaboration
Productive teaming is the new form of human-robot interaction. Multimodal 3D imaging plays a key role here, both in gaining a more comprehensive understanding of the production system and in enabling trustful collaboration within the teams. Complete scene capture requires registration of the image modalities. Currently, low-cost RGB-D sensors are often used, and these come with a closed-source registration function. To provide an efficient and freely available method for any sensor, we have developed a new method called Triangle-Mesh-Rasterization-Projection (TMRP). To verify its performance, we compare it with the closed-source projection function of the Azure Kinect sensor (Microsoft). The qualitative comparison showed that both methods produce almost identical results. Minimal differences at the edges indicate that our TMRP interpolation is more accurate. Our method thus provides a freely available, open-source registration method that can be applied to almost any multimodal 3D/2D image dataset and, unlike the Microsoft SDK, is not optimized for Microsoft products.
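For readers unfamiliar with depth-to-color registration, the sketch below shows the naive per-pixel baseline under a pinhole camera model: back-project a depth pixel to 3D, move it into the color camera frame, and project it onto the color image. This is not TMRP, whose mesh rasterization details are not given in the abstract. The function name `register_depth_pixel`, the intrinsics tuple layout, and all numeric values are hypothetical.

```python
def register_depth_pixel(u, v, z, depth_k, color_k, rotation, translation):
    """Map one depth pixel (u, v) with depth z to color-image coordinates.

    depth_k and color_k are (fx, fy, cx, cy) pinhole intrinsics. rotation is a
    3x3 nested list and translation a 3-vector taking depth-camera coordinates
    to color-camera coordinates.
    """
    fx, fy, cx, cy = depth_k
    # Back-project the pixel to a 3D point in the depth camera frame.
    point = [(u - cx) * z / fx, (v - cy) * z / fy, z]
    # Rigidly transform the point into the color camera frame.
    x, y, zc = (
        sum(rotation[r][c] * point[c] for c in range(3)) + translation[r]
        for r in range(3)
    )
    if zc <= 0:
        return None  # The point lies behind the color camera.
    fx2, fy2, cx2, cy2 = color_k
    # Project onto the color image plane and keep the depth in the new frame.
    return (fx2 * x / zc + cx2, fy2 * y / zc + cy2, zc)


# Hypothetical setup: identical intrinsics, cameras offset by 5 cm along x.
intrinsics = (500.0, 500.0, 320.0, 240.0)
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

same_pose = register_depth_pixel(400, 300, 2.0, intrinsics, intrinsics,
                                 identity, [0.0, 0.0, 0.0])
shifted = register_depth_pixel(400, 300, 2.0, intrinsics, intrinsics,
                               identity, [0.05, 0.0, 0.0])
```

Projecting each pixel independently like this leaves holes and ragged edges wherever the two image grids do not line up, which is the kind of gap that an interpolating approach is meant to fill.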