203 research outputs found

    Accurate foreground segmentation without pre-learning

    Foreground segmentation has been widely used in many computer vision applications. However, most existing methods rely on a pre-learned motion or background model, which increases the burden on users. In this paper, we present an automatic algorithm, requiring no pre-learning, for segmenting foreground from background based on the fusion of motion, color, and contrast information. Motion information is enhanced by a novel method called support edges diffusion (SED), which is built upon a key observation: edges of the difference image of two adjacent frames appear, in most cases, only in moving regions. Contrasts in the background are attenuated while those in the foreground are enhanced, using the gradient of the previous frame and that of the temporal difference. Experiments on many video sequences demonstrate the effectiveness and accuracy of the proposed algorithm. The segmentation results are comparable to those obtained by other state-of-the-art methods that depend on a pre-learned background or a stereo setup. © 2011 IEEE. The 6th International Conference on Image and Graphics (ICIG 2011), Hefei, Anhui, China, 12-15 August 2011. In Proceedings of the 6th ICIG, 2011, p. 331-33
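    The key observation behind SED — that edges of the difference image between two adjacent frames mostly appear in moving regions — can be sketched as follows. This is a minimal illustration, not the paper's algorithm: the gradient operator and threshold are arbitrary stand-in choices.

```python
import numpy as np

def motion_edge_mask(prev_frame, curr_frame, edge_thresh=25.0):
    """Flag candidate moving regions via edges of the frame difference.

    Edges of the difference image of two adjacent frames tend to appear
    only where objects move, which is the observation behind SED.
    """
    diff = np.abs(curr_frame.astype(np.float64) - prev_frame.astype(np.float64))
    # Gradient magnitude of the difference image (simple finite differences).
    gy, gx = np.gradient(diff)
    grad_mag = np.hypot(gx, gy)
    return grad_mag > edge_thresh

# Toy example: a bright square moves one pixel to the right between frames.
prev = np.zeros((8, 8))
curr = np.zeros((8, 8))
prev[2:5, 2:5] = 255.0
curr[2:5, 3:6] = 255.0
mask = motion_edge_mask(prev, curr)  # True only near the moving square's edges
```

    A static scene yields an all-zero difference image and hence an empty mask, which is what lets the edge map stand in for a pre-learned motion model.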

    Non-Parametric Learning for Monocular Visual Odometry

    This thesis addresses the problem of incremental localization from visual information, a scenario commonly known as visual odometry. Current visual odometry algorithms are heavily dependent on camera calibration, using a pre-established geometric model to provide the transformation between input (optical flow estimates) and output (vehicle motion estimates) information. A novel approach to visual odometry is proposed in this thesis in which the need for camera calibration, or even for a geometric model, is circumvented by the use of machine learning principles and techniques. A non-parametric Bayesian regression technique, the Gaussian Process (GP), is used to elect the most probable transformation function hypothesis from input to output, based on training data collected prior to and during navigation. Besides eliminating the need for a geometric model and traditional camera calibration, this approach also allows for scale recovery even in a monocular configuration, and provides a natural treatment of uncertainties due to the probabilistic nature of GPs. Several extensions to the traditional GP framework are introduced and discussed in depth, and they constitute the core of the contributions of this thesis to the machine learning and robotics community. The proposed framework is tested in a wide variety of scenarios, ranging from urban and off-road ground vehicles to unmanned aircraft operating in unconstrained 3D environments. The results show a significant improvement over traditional visual odometry algorithms, and also surpass results obtained using other sensors, such as laser scanners and IMUs. The incorporation of these results into a SLAM scenario, using an Exact Sparse Information Filter (ESIF), is shown to decrease global uncertainty by exploiting revisited areas of the environment. Finally, a technique for the automatic segmentation of dynamic objects is presented, as a way to increase the robustness of image information and further improve visual odometry results.
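    The core regression step — electing a flow-to-motion mapping with a Gaussian Process, and getting a predictive variance alongside each estimate — can be sketched in a minimal form. The 1-D feature and scalar target below are hypothetical stand-ins for the thesis's optical-flow inputs and vehicle-motion outputs, and the RBF kernel with fixed hyperparameters is a simplification of the extended GP framework the thesis actually develops.

```python
import numpy as np

def gp_predict(X_train, y_train, X_test, length_scale=1.0, noise=1e-2):
    """Minimal GP regression: RBF kernel, predictive mean and variance."""
    def rbf(A, B):
        # Squared-exponential covariance between two point sets.
        sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-0.5 * sq / length_scale ** 2)

    K = rbf(X_train, X_train) + noise * np.eye(len(X_train))  # training covariance
    K_s = rbf(X_train, X_test)                                # train/test covariance
    alpha = np.linalg.solve(K, y_train)
    mean = K_s.T @ alpha                                      # predictive mean
    v = np.linalg.solve(K, K_s)
    var = 1.0 + noise - np.einsum('ij,ij->j', K_s, v)         # predictive variance
    return mean, var

# Hypothetical training set: a 1-D "flow" feature vs. a scalar motion target.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(30, 1))
y = np.sin(2.0 * X[:, 0])
mean, var = gp_predict(X, y, X[:3])  # predict back at three training inputs
```

    The predictive variance is what gives the probabilistic treatment of uncertainty mentioned above: far from the training data it grows toward the prior variance, signaling that the motion estimate should be trusted less.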

    Multiple object tracking with context awareness

    [no abstract]

    Stereo vision and mapping with unsynchronized cameras

    Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2008. Includes bibliographical references (leaves 69-72). Environmental awareness is an important prerequisite for autonomous behavior in vehicles. Without it, robots are unable to react to unknown surroundings and require extensive human input for tasks such as target identification and obstacle avoidance, negating many of the advantages of having an autonomous system. Giving a vehicle the ability to map its surroundings and use the data effectively allows humans to spend less time scanning the vehicle's video feed and providing direct navigational commands. This thesis details the development of a real-time, extensible vision and mapping system that provides an interface for control systems to access details of the map. It addresses the problems of image capture, signal noise, and three-dimensional map storage, and extends existing real-time stereo mapping systems by tolerating unsynchronized stereo cameras. Results indicate that synchronization lets the system locate points significantly more accurately than it can without synchronization. Compared with a monocular mapping system, synchronized stereo provides a more detailed map and tolerates more erroneous localization data. Because it is built on an abstract localization system, this system is designed to be modular and easily extensible. by Ray C. He. M.Eng.
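    For context on the stereo half of such a system: once a calibrated, rectified pair is matched, depth follows the standard triangulation relation Z = f·B/d. The sketch below illustrates only this textbook relation; the numbers are made up for illustration and are not taken from the thesis, whose contribution concerns tolerating unsynchronized capture rather than the triangulation itself.

```python
def stereo_depth(disparity_px, focal_px, baseline_m):
    """Depth from disparity for a rectified stereo pair: Z = f * B / d.

    disparity_px: horizontal pixel offset of a point between the two views
    focal_px:     focal length expressed in pixels
    baseline_m:   distance between the two camera centers in meters
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_px * baseline_m / disparity_px

# Illustrative numbers: 500 px focal length, 0.10 m baseline, 25 px disparity.
z = stereo_depth(25.0, 500.0, 0.10)  # 2.0 m
```

    With unsynchronized cameras the two views are captured at slightly different times, so a moving point violates the rectified-geometry assumption behind this relation; that is the error source the thesis measures when comparing synchronized and unsynchronized results.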