Probabilistic ToF and Stereo Data Fusion Based on Mixed Pixel Measurement Models
This paper proposes a method for fusing data acquired by a ToF camera and a stereo pair, based on a model of ToF depth measurement that also accounts for depth discontinuity artifacts due to the mixed pixel effect. This model is exploited within both ML and MAP-MRF frameworks for ToF and stereo data fusion. The proposed MAP-MRF framework is characterized by site-dependent range values, an important feature that can be used both to improve accuracy and to decrease the computational complexity of standard MAP-MRF approaches. To optimize the site-dependent global cost function characteristic of the proposed MAP-MRF approach, the paper also introduces an extension of Loopy Belief Propagation that can be used in other contexts. Experimental data validate the proposed ToF measurement model and the effectiveness of the proposed fusion techniques.
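As a minimal sketch of the ML side of such a fusion (not the paper's full mixed-pixel measurement model): assuming independent Gaussian noise on the ToF and stereo depth measurements, the per-pixel ML estimate reduces to an inverse-variance weighted average. The function name and interface are illustrative.

```python
import numpy as np


def ml_fuse_depth(z_tof, sigma_tof, z_stereo, sigma_stereo):
    """Per-pixel maximum-likelihood fusion of two depth measurements.

    Assuming independent Gaussian noise on each sensor, the ML estimate
    is the inverse-variance weighted average. Works on scalars or on
    NumPy arrays (e.g. whole depth maps with per-pixel noise levels).
    """
    w_tof = 1.0 / sigma_tof**2
    w_stereo = 1.0 / sigma_stereo**2
    return (w_tof * z_tof + w_stereo * z_stereo) / (w_tof + w_stereo)
```

With equal noise levels this is the plain average; as one sensor's noise grows, the estimate moves toward the other sensor's reading.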
Learning to Detect Ground Control Points for Improving the Accuracy of Stereo Matching
While machine learning has been instrumental to the ongoing progress in most areas of computer vision, it has not been applied to the problem of stereo matching with similar frequency or success. We present a supervised learning approach for predicting the correctness of stereo matches, based on a random forest and a set of features that capture various forms of information about each pixel. We show highly competitive results in predicting the correctness of matches and in confidence estimation, which allows us to rank pixels according to the reliability of their assigned disparities. Moreover, we show how these confidence values can be used to improve the accuracy of disparity maps by integrating them with an MRF-based stereo algorithm. This is an important distinction from the current literature, which has mainly focused on sparsification, removing potentially erroneous disparities to generate quasi-dense disparity maps.
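One classic hand-crafted feature of the kind such a confidence-predicting forest can consume is the peak ratio of a pixel's matching cost curve. A naive variant (using the second-smallest cost rather than the second local minimum, purely for illustration) might look like:

```python
import numpy as np


def peak_ratio_confidence(cost_curve):
    """Naive peak-ratio (PKR) confidence for one pixel.

    cost_curve: 1-D array of matching costs over disparity hypotheses.
    Returns c2 / c1, where c1 is the smallest cost and c2 the second
    smallest. A distinctive match has one cost far below the rest, so
    higher values indicate a more reliable disparity.
    """
    costs = np.sort(np.asarray(cost_curve, dtype=float))
    c1, c2 = costs[0], costs[1]
    return c2 / max(c1, 1e-12)  # guard against division by zero
```

A flat cost curve gives a ratio of 1 (ambiguous match); a sharp, isolated minimum gives a large ratio.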
Calibration Method for Texel Images Created from Fused Lidar and Digital Camera Images
The fusion of imaging lidar information and digital imagery results in 2.5-dimensional surfaces covered with texture information, called texel images. These data sets, when taken from different viewpoints, can be combined to create three-dimensional (3-D) images of buildings, vehicles, or other objects. This paper presents a procedure for calibration, error correction, and fusion of flash lidar and digital camera information from a single sensor configuration to create accurate texel images. A brief description of a prototype sensor is given, along with a calibration technique used with the sensor, which is applicable to other flash lidar/digital image sensor systems. The method combines systematic error correction of the flash lidar data, correction for lens distortion of the digital camera and flash lidar images, and fusion of the lidar data to the camera data in a single process. The result is a texel image acquired directly from the sensor. Examples of the resulting images, with improvements from the proposed algorithm, are presented. Results with the prototype sensor show a very good match between 3-D points and the digital image (< 2.8 image pixels), with a 3-D object measurement error of < 0.5%, compared to a noncalibrated error of ~3%.
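Lens distortion correction of the kind mentioned above is commonly based on a polynomial radial model. A minimal sketch, assuming the usual two-coefficient model x_d = x_u (1 + k1 r^2 + k2 r^4) on normalized image coordinates and inverting it by fixed-point iteration (function name and iteration count are illustrative, not taken from the paper):

```python
def undistort_normalized(x_d, y_d, k1, k2, iters=10):
    """Invert the radial distortion model x_d = x_u * (1 + k1*r^2 + k2*r^4).

    (x_d, y_d) are distorted normalized coordinates; k1, k2 are radial
    distortion coefficients. For mild distortion the fixed-point map
    x <- x_d / (1 + k1*r(x)^2 + k2*r(x)^4) converges in a few iterations.
    """
    x_u, y_u = x_d, y_d  # start from the distorted point
    for _ in range(iters):
        r2 = x_u * x_u + y_u * y_u
        factor = 1.0 + k1 * r2 + k2 * r2 * r2
        x_u, y_u = x_d / factor, y_d / factor
    return x_u, y_u
```

Distorting a point with the forward model and feeding it back through this routine should recover the original coordinates to high precision.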
Patch based synthesis for single depth image super-resolution
We present an algorithm to synthetically increase the resolution of a single depth image using only a generic database of local patches. Modern range sensors measure depth with non-Gaussian noise and at lower native resolutions than typical visible-light cameras. While patch-based approaches for upsampling intensity images continue to improve, this is the first exploration of patch-based upsampling for depth images. We match against the height field of each low-resolution input depth patch, and search our database for a list of appropriate high-resolution candidate patches. Selecting the right candidate at each location in the depth image is then posed as a Markov random field labeling problem. Our experiments show that further depth-specific processing, such as noise removal and correct patch normalization, dramatically improves our results. Perhaps surprisingly, even better results are achieved on a variety of real test scenes by providing our algorithm with only synthetic training depth data.
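The candidate-selection step can be illustrated on a simplified 1-D chain of sites, where the MRF labeling problem is solved exactly by the Viterbi algorithm. This is a toy stand-in: the paper's grid MRF needs approximate inference, and the index-distance pairwise cost here merely stands in for a real patch-compatibility term.

```python
import numpy as np


def select_candidates(unary, smooth_weight=1.0):
    """Pick one candidate per site on a 1-D chain MRF (exact, via Viterbi).

    unary: (n_sites, n_candidates) array of data costs.
    The pairwise term penalizes neighboring sites choosing dissimilar
    candidates, approximated here by candidate index distance.
    """
    unary = np.asarray(unary, dtype=float)
    n, k = unary.shape
    idx = np.arange(k)
    pair = smooth_weight * np.abs(idx[:, None] - idx[None, :])

    cost = unary[0].copy()                 # best cost ending at each label
    back = np.zeros((n, k), dtype=int)     # backpointers for path recovery
    for i in range(1, n):
        total = cost[:, None] + pair       # (prev_label, cur_label)
        back[i] = np.argmin(total, axis=0)
        cost = total[back[i], idx] + unary[i]

    labels = np.empty(n, dtype=int)
    labels[-1] = int(np.argmin(cost))
    for i in range(n - 1, 0, -1):          # trace the optimal path back
        labels[i - 1] = back[i, labels[i]]
    return labels
```

With zero smoothing each site independently takes its cheapest candidate; a large smoothing weight forces neighbors onto similar candidates even at some data cost.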
Learning Stereo from Single Images
Supervised deep networks are among the best methods for finding
correspondences in stereo image pairs. Like all supervised approaches, these
networks require ground truth data during training. However, collecting large
quantities of accurate dense correspondence data is very challenging. We
propose that it is unnecessary to have such a high reliance on ground truth
depths or even corresponding stereo pairs. Inspired by recent progress in
monocular depth estimation, we generate plausible disparity maps from single
images. In turn, we use those flawed disparity maps in a carefully designed
pipeline to generate stereo training pairs. Training in this manner makes it
possible to convert any collection of single RGB images into stereo training
data. This results in a significant reduction in human effort, with no need to
collect real depths or to hand-design synthetic data. We can consequently train
a stereo matching network from scratch on datasets like COCO, which were
previously hard to exploit for stereo. Through extensive experiments we show
that our approach outperforms stereo networks trained with standard synthetic
datasets, when evaluated on KITTI, ETH3D, and Middlebury. Accepted as an oral presentation at ECCV 2020.
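The core of such a pipeline, turning a single image plus an estimated disparity map into a stereo training pair, can be sketched as a forward warp. This is a simplification of the paper's pipeline, which additionally handles occlusion artifacts, collisions, and hole filling.

```python
import numpy as np


def synthesize_right_view(left, disparity):
    """Forward-warp a left image into a plausible right view.

    Each left pixel (x, y) moves to (x - d, y) in the right image.
    Pixels are written in far-to-near order so that nearer pixels
    (larger disparity) correctly occlude farther ones. Pixels that
    receive no source remain 0: disocclusion holes to be filled later.
    """
    h, w = left.shape[:2]
    right = np.zeros_like(left)
    order = np.argsort(disparity.ravel())            # far-to-near
    ys, xs = np.unravel_index(order, (h, w))
    xt = xs - np.round(disparity[ys, xs]).astype(int)
    ok = (xt >= 0) & (xt < w)                        # stay inside the image
    right[ys[ok], xt[ok]] = left[ys[ok], xs[ok]]
    return right
```

A constant disparity of 1 simply shifts the row left by one pixel, leaving a one-pixel hole at the right border.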
Background Subtraction for Time of Flight Imaging
A time-of-flight camera provides two types of images simultaneously: depth and intensity. In this paper, a computational method for background subtraction that combines both images, or fast sequences of images, is proposed. The background model is based on unbalanced or semi-supervised classifiers, in particular support vector machines. A brief review of one-class support vector machines is first given. A method that combines the range and intensity data in two operational modes is then provided. Finally, experimental results are presented and discussed.
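As a simplified stand-in for the paper's one-class SVM background model, the combination of the two data channels can be illustrated with a per-pixel Gaussian model over the joint range/intensity features (all names and the threshold are illustrative, not the paper's method):

```python
import numpy as np


def fit_background(frames_range, frames_intensity):
    """Fit a per-pixel Gaussian background model.

    frames_range, frames_intensity: (T, H, W) training sequences of
    registered range and intensity frames showing only background.
    """
    feats = np.stack([frames_range, frames_intensity], axis=-1)  # (T,H,W,2)
    mean = feats.mean(axis=0)
    std = feats.std(axis=0) + 1e-6  # epsilon avoids division by zero
    return mean, std


def foreground_mask(rng, intensity, model, thresh=3.0):
    """Flag pixels deviating more than `thresh` sigmas in either channel."""
    mean, std = model
    feat = np.stack([rng, intensity], axis=-1)      # (H, W, 2)
    z = np.abs(feat - mean) / std
    return (z > thresh).any(axis=-1)                # (H, W) boolean mask
```

Using both channels jointly lets a pixel be flagged when either its range or its intensity departs from the learned background, mirroring the abstract's point that the two data types complement each other.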
Background Suppression in Time-of-Flight Images
This article presents a computational method for detecting and extracting the background plane from data acquired by time-of-flight cameras. A variant of a classification method based on support vector machines is used. Taking into account the particular characteristics of this type of camera, range and intensity information is incorporated appropriately, and the ability to acquire fast data sequences is exploited in a particular operating mode. The article reviews the specific pattern recognition techniques used, presents the proposed solution, and shows preliminary experimental results of the proposed method. Presented at the VII Workshop Procesamiento de Señales y Sistemas de Tiempo Real (WPSTR), Red de Universidades con Carreras en Informática (RedUNCI).