
    Real Time Dense Depth Estimation by Fusing Stereo with Sparse Depth Measurements

    We present an approach to depth estimation that fuses information from a stereo pair with sparse range measurements derived from a LIDAR sensor or a range camera. The goal of this work is to exploit the complementary strengths of the two sensor modalities: the accurate but sparse range measurements and the ambiguous but dense stereo information. These two sources are effectively and efficiently fused by combining ideas from anisotropic diffusion and semi-global matching. We evaluate our approach on the KITTI 2015 and Middlebury 2014 datasets, using randomly sampled ground truth range measurements as our sparse depth input. We achieve significant performance improvements with a small fraction of range measurements on both datasets. We also provide qualitative results from our platform using the PMDTec Monstar sensor. Our entire pipeline runs on an NVIDIA TX-2 platform at 5 Hz on 1280x1024 stereo images with 128 disparity levels. (7 pages, 5 figures, 2 tables.)
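
    A minimal sketch of the fusion idea described above, assuming NumPy and illustrative function names throughout (this is not the authors' released code): sparse range measurements are diffused into a dense disparity prior with edge-aware (anisotropic) weights, and the prior then biases a stereo cost volume before semi-global aggregation.

```python
import numpy as np

def diffuse_sparse_prior(sparse_disp, guide, iters=50, beta=10.0):
    """Spread sparse disparity seeds (0 = no measurement) into a dense
    prior. `guide` is a grayscale image in [0, 1]; the diffusion weight
    drops across its strong edges, mimicking anisotropic diffusion."""
    prior = sparse_disp.astype(np.float32).copy()
    conf = (sparse_disp > 0).astype(np.float32)    # confidence, 1 at seeds
    for _ in range(iters):
        for shift in ((0, 1), (0, -1), (1, 0), (-1, 0)):
            p_n = np.roll(prior, shift, axis=(0, 1))   # neighbour prior
            c_n = np.roll(conf, shift, axis=(0, 1))    # neighbour confidence
            edge = np.abs(guide - np.roll(guide, shift, axis=(0, 1)))
            g = np.exp(-beta * edge)                   # edge-aware conductance
            prior = (conf * prior + g * c_n * p_n) / (conf + g * c_n + 1e-6)
            conf = np.maximum(conf, g * c_n)
    return prior, conf

def bias_cost_volume(cost, prior, conf, lam=0.5):
    """Add a quadratic penalty pulling each pixel toward the diffused
    prior; SGM path aggregation would then run on the biased volume."""
    disparities = np.arange(cost.shape[2], dtype=np.float32)
    penalty = (disparities[None, None, :] - prior[..., None]) ** 2
    return cost + lam * conf[..., None] * penalty
```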

    Data Fusion of LIDAR into a Region Growing Stereo Algorithm


    Non-destructive automatic leaf area measurements by combining stereo and time-of-flight images

    Leaf area measurements are commonly obtained through destructive and laborious practices. This study shows how stereo and time-of-flight (ToF) images can be combined for non-destructive automatic leaf area measurements. The authors focus on some challenging plant images captured in a greenhouse environment, and show that even state-of-the-art stereo methods produce unsatisfactory results. By transforming depth information in a ToF image into a localised search range for dense stereo, a global optimisation strategy is adopted that produces smooth, discontinuity-preserving results. They also use edges of colour and disparity images for automatic leaf detection, and develop a smoothing method necessary for accurately estimating surface area. In addition to showing that combining stereo and ToF images gives superior qualitative and quantitative results, 149 automatic leaf area measurements made with the authors' system in a validation trial have a correlation of 0.97 with true values, and the root-mean-square error is 10.97 cm², which is 9.3% of the average leaf area. Their approach could potentially be applied to combining other image modalities with large differences in image resolution and camera position.
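
    The core mechanism here, turning ToF depth into a localised disparity search range, can be sketched as below. The function name and parameters are assumptions, and the ToF map is assumed already registered and upsampled to the stereo view.

```python
import numpy as np

def tof_to_search_range(tof_depth, focal_px, baseline_m, margin=3,
                        max_disp=128):
    """Convert ToF depth (metres) to a [lo, hi] disparity interval per
    pixel, via the standard stereo relation d = f * B / Z. `margin`
    widens the window to absorb ToF noise and registration error."""
    with np.errstate(divide="ignore"):
        disp = focal_px * baseline_m / tof_depth   # expected disparity
    disp = np.nan_to_num(disp, posinf=0.0)
    lo = np.clip(np.floor(disp) - margin, 0, max_disp - 1).astype(np.int32)
    hi = np.clip(np.ceil(disp) + margin, 0, max_disp - 1).astype(np.int32)
    # Pixels with no valid ToF reading fall back to the full range.
    invalid = tof_depth <= 0
    lo[invalid], hi[invalid] = 0, max_disp - 1
    return lo, hi
```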

    Combining Time-of-Flight Depth and Stereo Images without Accurate Extrinsic Calibration

    We combine a low resolution time-of-flight depth camera based on photonic mixer devices with two standard cameras in a stereo configuration. We show that this approach is useful even without accurate calibration. In a graph cut approach, we use depth information from the low resolution time-of-flight camera to initialize the domain, and color information to obtain accurate depth discontinuities in the high resolution depth image. The system is promising as it is low cost and naturally extends to the setting of dynamic scenes, providing high frame rates.
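
    As a rough illustration of this kind of formulation (assumed names throughout; no claim to match the paper's exact energy), the ToF depth can supply the unary term while color gradients give contrast-sensitive smoothness weights; a max-flow solver would then minimize the resulting MRF energy.

```python
import numpy as np

def mrf_terms(color, tof_disp, labels, sigma_prior=2.0, sigma_color=10.0):
    """color: (H, W) or (H, W, 3); tof_disp: (H, W) disparity derived from
    the ToF camera, already warped to the stereo view; labels: (L,)
    candidate disparities. Returns unary costs and pairwise weights."""
    # Unary: distance of each candidate label from the ToF-derived value.
    unary = np.abs(labels[None, None, :] - tof_disp[..., None]) / sigma_prior
    # Pairwise: contrast-sensitive weights so cuts prefer color edges.
    gray = color.mean(axis=2) if color.ndim == 3 else color
    wx = np.exp(-np.abs(np.diff(gray, axis=1)) / sigma_color)  # (H, W-1)
    wy = np.exp(-np.abs(np.diff(gray, axis=0)) / sigma_color)  # (H-1, W)
    return unary, wx, wy
```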

    Autonomous navigation for guide following in crowded indoor environments

    The requirements for assisted living are rapidly changing as the number of elderly patients over the age of 60 continues to increase. This rise places a high level of stress on nurse practitioners, who must care for more patients than they can manage. As this trend is expected to continue, new technology will be required to help care for patients. Mobile robots present an opportunity to help alleviate this stress by monitoring and performing remedial tasks for elderly patients. In order to produce mobile robots with the ability to perform these tasks, however, many challenges must be overcome. The hospital environment requires a high level of safety to prevent patient injury, so any facility that uses mobile robots must be able to ensure that no harm will come to patients whilst in a care environment. This requires the robot to build a high level of understanding about the environment and the people in close proximity to it.

    Hitherto, most mobile robots have used vision-based sensors or 2D laser range finders. 3D time-of-flight sensors have recently been introduced and provide dense 3D point clouds of the environment at real-time frame rates, giving mobile robots previously unavailable dense information in real time. In this thesis I investigate the use of time-of-flight cameras for mobile robot navigation in crowded environments. A unified framework to allow the robot to follow a guide through an indoor environment safely and efficiently is presented, and each component of the framework is analyzed in detail, with real-world scenarios illustrating its practical use.

    Time-of-flight cameras are relatively new sensors and therefore have inherent problems that must be overcome to obtain consistent and accurate data. I propose a novel and practical probabilistic framework to overcome many of these problems; it fuses multiple depth maps with color information, forming a reliable and consistent view of the world. For the robot to interact with the environment, contextual information is required. To this end, I propose a region-growing segmentation algorithm to group points based on surface characteristics, namely surface normal and surface curvature. The segmentation process creates a distinct set of surfaces; however, only a limited amount of contextual information is then available to allow for interaction. Therefore, a novel classifier using spherical harmonics is proposed to differentiate people from all other objects.

    The added ability to identify people allows the robot to find potential candidates to follow. For safe navigation, however, the robot must continuously track all visible objects to obtain positional and velocity information. A multi-object tracking system is investigated to track visible objects reliably using multiple cues, shape and color. The tracking system allows the robot to react to the dynamic nature of people by building an estimate of the motion flow, which provides the robot with the necessary information to determine where, and at what speeds, it is safe to drive. In addition, a novel search strategy is proposed to allow the robot to recover a guide who has left the field of view. To achieve this, a search map is constructed, with areas of the environment ranked according to how likely they are to reveal the guide’s true location; the robot can then approach the most likely search area to recover the guide. Finally, all components presented are joined to follow a guide through an indoor environment. The results achieved demonstrate the efficacy of the proposed components.
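
    The region-growing segmentation step lends itself to a short sketch. The following is a hypothetical rendering, not the thesis code: points join a region while their normals stay nearly parallel, and growth starts from low-curvature seeds.

```python
import numpy as np
from collections import deque

def region_grow(normals, curvature, angle_thresh_deg=8.0, curv_thresh=0.05):
    """normals: (H, W, 3) unit normals of an organised point cloud;
    curvature: (H, W). Returns (H, W) region labels (0 = unassigned)."""
    h, w, _ = normals.shape
    labels = np.zeros((h, w), dtype=np.int32)
    cos_thresh = np.cos(np.radians(angle_thresh_deg))
    next_label = 0
    # Grow from the flattest points first, as is common for this algorithm.
    for seed in np.argsort(curvature, axis=None):
        sy, sx = np.unravel_index(seed, (h, w))
        if labels[sy, sx] or curvature[sy, sx] > curv_thresh:
            continue
        next_label += 1
        labels[sy, sx] = next_label
        queue = deque([(sy, sx)])
        while queue:
            y, x = queue.popleft()
            for ny, nx in ((y+1, x), (y-1, x), (y, x+1), (y, x-1)):
                if 0 <= ny < h and 0 <= nx < w and not labels[ny, nx]:
                    # Join if neighbour's normal is nearly parallel.
                    if normals[y, x] @ normals[ny, nx] > cos_thresh:
                        labels[ny, nx] = next_label
                        queue.append((ny, nx))
    return labels
```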

    Fusion of LIDAR with stereo camera data - an assessment

    This thesis explores data fusion of LIDAR (laser range-finding) with stereo matching, with a particular emphasis on close-range industrial 3D imaging. Recently there has been interest in improving the robustness of stereo matching using data fusion with active range data. These range data have typically been acquired using time-of-flight cameras (ToFCs); however, ToFCs offer poor spatial resolution and are noisy. Comparatively little work has been performed using LIDAR. It is argued that stereo and LIDAR are complementary and that there are numerous advantages to integrating LIDAR into stereo systems. For instance, camera calibration is a necessary prerequisite for stereo 3D reconstruction, but the process is often tedious and requires precise calibration targets. It is shown that a visible-beam LIDAR enables automatic, accurate (sub-pixel) extrinsic and intrinsic camera calibration without any explicit targets. Two methods for using LIDAR to assist dense disparity estimation in featureless scenes were investigated. The first involved using the LIDAR to provide high-confidence seed points for a region-growing stereo matching algorithm; it is shown that these seed points allow dense matching in scenes which fail to match using stereo alone. Secondly, LIDAR was used to provide artificial texture in featureless image regions. Texture was generated by combining real or simulated images of every point the laser hits to form a pseudo-random pattern. Machine learning was used to determine the image regions that are most likely to be stereo-matched, reducing the number of LIDAR points required. Results are compared to competing techniques such as laser speckle, data projection and diffractive optical elements.
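
    The first fusion strategy can be illustrated with a toy sketch (all names are assumptions): LIDAR-derived disparities seed a queue, and matching grows outward, restricted to a narrow disparity window around each accepted neighbour.

```python
import numpy as np
from collections import deque

def grow_from_seeds(costs, seeds, radius=2, max_cost=0.5):
    """costs: (H, W, D) precomputed matching costs, assumed normalised to
    [0, 1]; seeds: list of (y, x, d) LIDAR-derived disparities. Returns a
    float (H, W) disparity map, NaN where growth never reached."""
    h, w, D = costs.shape
    disp = np.full((h, w), np.nan, dtype=np.float32)
    queue = deque()
    for y, x, d in seeds:
        disp[y, x] = d
        queue.append((y, x))
    while queue:
        y, x = queue.popleft()
        d0 = int(disp[y, x])
        for ny, nx in ((y+1, x), (y-1, x), (y, x+1), (y, x-1)):
            if 0 <= ny < h and 0 <= nx < w and np.isnan(disp[ny, nx]):
                # Search only a narrow window around the neighbour's value.
                lo, hi = max(0, d0 - radius), min(D, d0 + radius + 1)
                best = lo + int(np.argmin(costs[ny, nx, lo:hi]))
                if costs[ny, nx, best] < max_cost:  # grow good matches only
                    disp[ny, nx] = best
                    queue.append((ny, nx))
    return disp
```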

    Artificial vision for the blind: a bio-inspired approach to pattern recognition

    More than 315 million people worldwide suffer from visual impairments, and several studies suggest that this number will double by 2030 due to the ageing of the population. To compensate for the loss of sight, current approaches consist of either specific aids designed to answer particular needs or generic systems such as neuroprostheses and sensory substitution devices. These holistic approaches, which try to restore vision as a whole, have been shown to be very inefficient in real-life situations given the low resolution of output interfaces. To overcome these obstacles, we propose the use of artificial vision to pre-process visual scenes and provide the user with only the relevant extracted information. We have validated this approach through the development of a novel assistive device for the blind called Navig. Through shape recognition and spatialized sound synthesis, this system allows users to locate and grab objects of interest. It also features navigational aids based on a new positioning method combining GPS, inertial sensors and the visual detection of geolocalized landmarks. To enhance the performance of the visual module, we further developed, as part of this thesis, a bio-inspired pattern recognition algorithm which uses latency-based coding of visual information, oriented-edge representations, and a cascaded architecture combining detections at different resolutions.
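
    The latency-based coding mentioned above can be illustrated with a toy example (purely hypothetical names): each oriented-edge unit emits at most one spike, stronger responses fire earlier, and the earliest spikes form a rank-order code of the shape.

```python
import numpy as np

def responses_to_latencies(edge_responses, t_max=50.0):
    """Map normalised edge responses in [0, 1] to spike times in ms:
    a response of 1.0 fires at t = 0; a zero response never fires."""
    r = np.clip(edge_responses, 0.0, 1.0)
    latency = t_max * (1.0 - r)
    latency[r <= 0] = np.inf  # silent units carry no spike
    return latency

def first_n_spikes(latency, n=100):
    """Indices of the n earliest-firing units: a rank-order code."""
    order = np.argsort(latency, axis=None)
    return np.unravel_index(order[:n], latency.shape)
```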