Real Time Dense Depth Estimation by Fusing Stereo with Sparse Depth Measurements
We present an approach to depth estimation that fuses information from a
stereo pair with sparse range measurements derived from a LIDAR sensor or a
range camera. The goal of this work is to exploit the complementary strengths
of the two sensor modalities, the accurate but sparse range measurements and
the ambiguous but dense stereo information. These two sources are effectively
and efficiently fused by combining ideas from anisotropic diffusion and
semi-global matching.
We evaluate our approach on the KITTI 2015 and Middlebury 2014 datasets,
using randomly sampled ground truth range measurements as our sparse depth
input. We achieve significant performance improvements with a small fraction of
range measurements on both datasets. We also provide qualitative results from
our platform using the PMDTec Monstar sensor. Our entire pipeline runs on an
NVIDIA TX-2 platform at 5Hz on 1280x1024 stereo images with 128 disparity
levels.
Comment: 7 pages, 5 figures, 2 tables
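The fusion idea in this abstract can be illustrated at a single pixel. The sketch below is not the authors' pipeline (it implements neither anisotropic diffusion nor semi-global matching); it only shows, under simplifying assumptions, how one sparse range measurement can disambiguate a stereo cost curve. The function names, the truncated linear penalty, and the `weight` and `cap` parameters are hypothetical choices made for this example.

```python
def depth_to_disparity(depth_m, focal_px, baseline_m):
    """Convert metric depth to disparity (pixels) for a rectified stereo pair."""
    if depth_m <= 0:
        raise ValueError("depth must be positive")
    return focal_px * baseline_m / depth_m


def fuse_sparse_prior(costs, measured_disparity, weight=1.0, cap=4.0):
    """Bias a per-pixel cost curve toward a measured disparity.

    costs[d] is the stereo matching cost of disparity d (lower is better).
    A truncated linear penalty (an assumption of this sketch) is added so
    that a wrong measurement cannot completely override the image evidence.
    """
    return [
        c + weight * min(abs(d - measured_disparity), cap)
        for d, c in enumerate(costs)
    ]


def winner_take_all(costs):
    """Return the disparity index with the lowest cost."""
    return min(range(len(costs)), key=costs.__getitem__)


# Ambiguous stereo evidence: disparities 1 and 4 have nearly equal cost.
stereo_costs = [5.0, 1.0, 4.0, 3.0, 1.1, 6.0]
stereo_only = winner_take_all(stereo_costs)
fused = winner_take_all(fuse_sparse_prior(stereo_costs, measured_disparity=4.2))
print(stereo_only, fused)
```

In a full system the biased costs would be passed to a smoothness-aware optimiser rather than to a per-pixel winner-take-all step; the example only isolates the effect of the range prior.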
Non-destructive automatic leaf area measurements by combining stereo and time-of-flight images
Leaf area measurements are commonly obtained by destructive and laborious practice. This study shows how stereo and time-of-flight (ToF) images can be combined for non-destructive automatic leaf area measurements. The authors focus on some challenging plant images captured in a greenhouse environment, and show that even the state-of-the-art stereo methods produce unsatisfactory results. By transforming depth information in a ToF image to a localised search range for dense stereo, a global optimisation strategy is adopted for producing smooth results that preserve discontinuity. They also use edges of colour and disparity images for automatic leaf detection and develop a smoothing method necessary for accurately estimating surface area. In addition to showing that combining stereo and ToF images gives superior qualitative and quantitative results, the authors report that 149 automatic leaf area measurements made with their system in a validation trial have a correlation of 0.97 with true values and a root-mean-square error of 10.97 cm², which is 9.3% of the average leaf area. Their approach could potentially be applied to combining other image modalities with large differences in image resolution and camera position.
Combining Time-of-Flight Depth and Stereo Images without Accurate Extrinsic Calibration
We combine a low resolution time-of-flight depth image camera based on photonic mixer devices with two standard cameras in a stereo configuration. We show that this approach is useful even without accurate calibration. In a graph cut approach, we use depth information from the low resolution time-of-flight camera to initialize the domain, and color information for accurate depth discontinuities in the high resolution depth image. The system is promising as it is low cost, and naturally extends to the setting of dynamic scenes, providing high frame rates.
Autonomous navigation for guide following in crowded indoor environments
The requirements for assisted living are rapidly changing as the number of elderly
patients over the age of 60 continues to increase. This rise places a high level of stress on
nurse practitioners, who must care for more patients than they can manage. As this trend is
expected to continue, new technology will be required to help care for patients. Mobile
robots present an opportunity to help alleviate the stress on nurse practitioners by
monitoring and performing remedial tasks for elderly patients. In order to produce
mobile robots with the ability to perform these tasks, however, many challenges must be
overcome.
The hospital environment requires a high level of safety to prevent patient injury. Any
facility that uses mobile robots, therefore, must be able to ensure that no harm will come
to patients whilst in a care environment. This requires the robot to build a high level of
understanding of the environment and the people in close proximity to the robot.
Hitherto, most mobile robots have used vision-based sensors or 2D laser range finders.
3D time-of-flight sensors have recently been introduced and provide dense 3D point
clouds of the environment at real-time frame rates. This provides mobile robots with
previously unavailable dense information in real-time. I investigate the use of time-of-flight
cameras for mobile robot navigation in crowded environments in this thesis. A
unified framework to allow the robot to follow a guide through an indoor environment
safely and efficiently is presented. Each component of the framework is analyzed in
detail, with real-world scenarios illustrating its practical use.
Time-of-flight cameras are relatively new sensors and, therefore, have inherent problems
that must be overcome to receive consistent and accurate data. I propose a novel and
practical probabilistic framework to overcome many of the inherent problems in this
thesis. The framework fuses multiple depth maps with color information forming a
reliable and consistent view of the world. In order for the robot to interact with the
environment, contextual information is required. To this end, I propose a region-growing
segmentation algorithm to group points based on surface characteristics, surface normal
and surface curvature. The segmentation process creates a distinct set of surfaces;
however, only a limited amount of contextual information is available to allow for
interaction. Therefore, a novel classifier is proposed using spherical harmonics to
differentiate people from all other objects.
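The region-growing idea mentioned above can be sketched as follows. This is not the thesis algorithm, whose details are not given in the abstract; it is a generic scheme that assumes normals, curvatures and a neighbour list have already been computed, and both thresholds are hypothetical parameters.

```python
import math
from collections import deque


def angle_between(n1, n2):
    """Angle in radians between two unit normals, ignoring their sign."""
    dot = abs(sum(a * b for a, b in zip(n1, n2)))
    return math.acos(min(1.0, dot))


def region_grow(normals, curvatures, neighbors, angle_thresh_rad, curv_thresh):
    """Label points so that each label is one smooth surface patch.

    Growth starts from the flattest unlabelled point. A neighbour joins the
    region if its normal is close to the current seed's normal, and it
    becomes a new seed only if its own curvature is low.
    """
    labels = [-1] * len(normals)
    order = sorted(range(len(normals)), key=lambda i: curvatures[i])
    region_id = 0
    for start in order:
        if labels[start] != -1:
            continue
        labels[start] = region_id
        queue = deque([start])
        while queue:
            seed = queue.popleft()
            for nb in neighbors[seed]:
                if labels[nb] != -1:
                    continue
                if angle_between(normals[seed], normals[nb]) > angle_thresh_rad:
                    continue
                labels[nb] = region_id
                if curvatures[nb] < curv_thresh:
                    queue.append(nb)
        region_id += 1
    return labels


# Toy chain of six points: a floor patch (normal +z) meeting a wall (normal +x).
normals = [(0, 0, 1)] * 3 + [(1, 0, 0)] * 3
curvatures = [0.01, 0.02, 0.03, 0.03, 0.02, 0.01]
neighbors = [[1], [0, 2], [1, 3], [2, 4], [3, 5], [4]]
print(region_grow(normals, curvatures, neighbors, math.radians(10), 0.05))
```

On this toy input the floor and wall points receive different labels because their normals differ by 90 degrees, well above the 10 degree threshold.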
The added ability to identify people allows the robot to find potential candidates to
follow. However, for safe navigation, the robot must continuously track all visible
objects to obtain positional and velocity information. A multi-object tracking system is
investigated to track visible objects reliably using multiple cues, shape and color. The
tracking system allows the robot to react to the dynamic nature of people by building an
estimate of the motion flow. This flow provides the robot with the necessary information
to determine where and at what speeds it is safe to drive. In addition, a novel search
strategy is proposed to allow the robot to recover a guide who has left the field-of-view.
To achieve this, a search map is constructed with areas of the environment ranked
according to how likely they are to reveal the guide's true location. Then, the robot can
approach the most likely search area to recover the guide. Finally, all components
presented are joined to follow a guide through an indoor environment. The results
achieved demonstrate the efficacy of the proposed components.
Fusion of LIDAR with stereo camera data - an assessment
This thesis explores data fusion of LIDAR (laser range-finding) with stereo matching, with a particular emphasis on close-range industrial 3D imaging. Recently there has been interest in improving the robustness of stereo matching using data fusion with active range data. These range data have typically been acquired using time-of-flight cameras (ToFCs); however, ToFCs offer poor spatial resolution and are noisy. Comparatively little work has been performed using LIDAR. It is argued that stereo and LIDAR are complementary and that there are numerous advantages to integrating LIDAR into stereo systems. For instance, camera calibration is a necessary prerequisite for stereo 3D reconstruction, but the process is often tedious and requires precise calibration targets. It is shown that a visible-beam LIDAR enables automatic, accurate (sub-pixel) extrinsic and intrinsic camera calibration without any explicit targets. Two methods for using LIDAR to assist the computation of dense disparity maps from featureless scenes were investigated. The first involved using a LIDAR to provide high-confidence seed points for a region-growing stereo matching algorithm. It is shown that these seed points allow dense matching in scenes which fail to match using stereo alone. Secondly, LIDAR was used to provide artificial texture in featureless image regions. Texture was generated by combining real or simulated images of every point the laser hits to form a pseudo-random pattern. Machine learning was used to determine the image regions that are most likely to be stereo-matched, reducing the number of LIDAR points required. Results are compared to competing techniques such as laser speckle, data projection and diffractive optical elements.
Vision artificielle pour les non-voyants : une approche bio-inspirée pour la reconnaissance de formes (Artificial vision for the blind: a bio-inspired approach to pattern recognition)
More than 315 million people worldwide suffer from visual impairments, with several studies suggesting that this number will double by 2030 due to the ageing of the population. To compensate for the loss of sight, current approaches consist of either specific aids designed to answer particular needs or generic systems such as neuroprostheses and sensory substitution devices. These holistic approaches, which try to restore vision as a whole, have been shown to be very inefficient in real-life situations given the low resolution of output interfaces. To overcome these obstacles we propose the use of artificial vision in order to pre-process visual scenes and provide the user with relevant information. We have validated this approach through the development of a novel assistive device for the blind called Navig. Through shape recognition and spatialized sound synthesis, this system allows users to locate and grab objects of interest. It also features navigational aids based on a new positioning method combining GPS, inertial sensors and the visual detection of geolocalized landmarks. To enhance the performance of the visual module we further developed, as part of this thesis, a bio-inspired pattern recognition algorithm which uses latency-based coding of visual information, oriented edge representations and a cascaded architecture combining detection at different resolutions.