
    Single-pass inline pipeline 3D reconstruction using depth camera array

    A novel inline inspection (ILI) approach using a depth camera array (DCA) is introduced to create high-fidelity, dense 3D pipeline models. A new camera calibration method registers the color and depth information of the cameras into a unified pipe model. By incorporating the calibration outcomes into a robust camera motion estimation approach, dense and complete 3D pipe surface reconstruction is achieved using only the inline image data collected by a self-powered ILI rover in a single pass through a straight pipeline. Laboratory experiments demonstrate one-millimeter geometric accuracy and 0.1-pixel photometric accuracy. On a longer pipeline, the proposed method generates a dense 3D surface reconstruction model with millimeter-level accuracy and less than 0.5% distance error. The achieved performance highlights its potential as a useful tool for efficient inline, non-destructive evaluation of pipeline assets.
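    The calibration-and-registration step described above, which fuses each camera's color and depth data into a unified pipe model, rests on standard per-camera depth-to-color registration. Below is a minimal sketch of that mapping, assuming pinhole intrinsics K_depth and K_color and a calibrated extrinsic transform T_depth_to_color; all names are illustrative assumptions, not the paper's notation.

```python
# Minimal depth-to-color registration sketch. K_depth, K_color (3x3 pinhole
# intrinsics) and T_depth_to_color (4x4 extrinsic) are assumed calibration
# outputs; this is a generic illustration, not the paper's implementation.
import numpy as np

def register_depth_to_color(depth, K_depth, K_color, T_depth_to_color):
    """Back-project every depth pixel to 3D, move it into the color camera
    frame, and reproject it to find its color-image coordinates."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth.ravel().astype(np.float64)
    # Back-projection through the depth intrinsics.
    x = (u.ravel() - K_depth[0, 2]) * z / K_depth[0, 0]
    y = (v.ravel() - K_depth[1, 2]) * z / K_depth[1, 1]
    pts = np.vstack([x, y, z, np.ones_like(z)])      # 4 x N homogeneous
    pts_c = (T_depth_to_color @ pts)[:3]             # into the color frame
    uv = K_color @ pts_c                             # pinhole projection
    return (uv[:2] / np.clip(uv[2], 1e-9, None)).reshape(2, h, w)
```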

    Robust Visual SLAM in Challenging Environments with Low-texture and Dynamic Illumination

    In the last years, visual Simultaneous Localization and Mapping (SLAM) has played a role of capital importance in rapid technological advances, e.g. mobile robotics and applications such as virtual, augmented, or mixed reality (VR/AR/MR), as a vital part of their processing pipelines. As its name indicates, it comprises the estimation of the state of a robot (typically the pose) while, simultaneously, incrementally building and refining a consistent representation of the environment, i.e. the so-called map, based on the equipped sensors. Despite the maturity reached by state-of-the-art visual SLAM techniques in controlled environments, there are still many open challenges to address before reaching a SLAM system robust to long-term operation in uncontrolled scenarios, where classical assumptions, such as static environments, no longer hold. This thesis contributes to improving the robustness of visual SLAM in harsh or difficult environments, in particular:
    - Low-textured environments, where traditional approaches suffer from degraded accuracy and, occasionally, absolute failure of the system. Fortunately, many such low-textured environments contain planar elements rich in linear shapes, so an alternative feature choice such as line segments can exploit information from the structured parts of the scene. This set of contributions exploits both types of features, i.e. points and line segments, to produce visual odometry and SLAM algorithms that are robust in a broader variety of environments, leveraging them at all stages of the related processes: monocular depth estimation, visual odometry, keyframe selection, bundle adjustment, loop closing, etc. Additionally, an open-source C++ implementation of the proposed algorithms has been released along with the published articles, together with extra multimedia material, for the benefit of the community.
    - Dynamic illumination conditions, one of the main open challenges in visual odometry and SLAM, e.g. high dynamic range (HDR) environments. The main difficulties in these situations come both from the limitations of the sensors, for instance the automatic settings of a camera may not react fast enough to properly record dynamic illumination changes, and from limitations in the algorithms, e.g. the tracking of interest points is typically based on brightness constancy (illustrated in the sketch after this list). This thesis contributes to mitigating these phenomena from two different perspectives. The first addresses the problem from a deep learning perspective, enhancing images into invariant and richer representations for VO and SLAM that benefit from the generalization properties of deep neural networks; it also demonstrates how inserting long short-term memory (LSTM) units yields temporally consistent sequences, since the estimation depends on previous states. The second exploits a more traditional, purely geometry-based tracking of line segments, which are intrinsically more informative, in challenging stereo streams with complex or varying illumination.
    Date of doctoral thesis defense: 26 February 2020.
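    As a concrete illustration of the brightness-constancy limitation named above, the sketch below removes global affine lighting changes from a pair of frames before standard KLT point tracking. This is a generic mitigation written against common OpenCV calls, not the method proposed in the thesis.

```python
# Illustration only: brightness constancy assumes I'(x) = I(x); a global
# lighting change I' = a*I + b breaks it. Normalizing each frame to zero
# mean and unit variance cancels such affine changes before tracking.
import cv2
import numpy as np

def normalize_illumination(gray):
    g = gray.astype(np.float32)
    g = (g - g.mean()) / (g.std() + 1e-6)
    # Rescale to 8-bit so standard trackers can consume the result.
    return cv2.normalize(g, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)

def track_points(prev_gray, next_gray):
    prev_n = normalize_illumination(prev_gray)
    next_n = normalize_illumination(next_gray)
    pts = cv2.goodFeaturesToTrack(prev_n, maxCorners=500,
                                  qualityLevel=0.01, minDistance=7)
    nxt, status, _ = cv2.calcOpticalFlowPyrLK(prev_n, next_n, pts, None)
    ok = status.ravel() == 1
    return pts[ok], nxt[ok]
```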

    Depth-Assisted Semantic Segmentation, Image Enhancement and Parametric Modeling

    This dissertation addresses the problem of employing 3D depth information to solve a number of traditionally challenging computer vision/graphics problems. Humans have the ability to perceive depth in the 3D world, which enables them to reconstruct layouts, recognize objects, and understand the geometric space and semantic meanings of the visual world. It is therefore significant to explore how 3D depth information can be utilized by computer vision systems to mimic these human abilities. This dissertation aims at employing 3D depth information to solve vision/graphics problems in the following aspects: scene understanding, image enhancement, and 3D reconstruction and modeling. In addressing the scene understanding problem, we present a framework for semantic segmentation and object recognition on urban video sequences using only dense depth maps recovered from the video. Five view-independent 3D features that vary with object class are extracted from dense depth maps and used for segmenting and recognizing different object classes in street scene images. We demonstrate that a scene parsing algorithm using only dense 3D depth information can outperform approaches based on sparse 3D or 2D appearance features. In addressing the image enhancement problem, we present a framework to overcome the imperfections of personal photographs of tourist sites using the rich information provided by large-scale internet photo collections (IPCs). By augmenting personal 2D images with 3D information reconstructed from IPCs, we address a number of traditionally challenging image enhancement techniques and achieve high-quality results using simple and robust algorithms. In addressing the 3D reconstruction and modeling problem, we focus on parametric modeling of flower petals, the most distinctive part of a plant. Their complex structure, severe occlusions, and wide variations make the reconstruction of 3D petal models a challenging task. We overcome these challenges by combining data-driven modeling techniques with domain knowledge from botany. Taking a 3D point cloud of an input flower scanned from a single view, each segmented petal is fitted with a scale-invariant morphable petal shape model, which is constructed from individually scanned 3D exemplar petals. Novel constraints based on botany studies are incorporated into the fitting process for realistically reconstructing occluded regions and maintaining correct 3D spatial relations. The main contribution of the dissertation is the intelligent use of 3D depth information to solve traditionally challenging vision/graphics problems. By developing advanced algorithms that operate either automatically or with minimal user interaction, this dissertation demonstrates that the 3D depth computed behind multiple images contains rich information about the visual world and can therefore be intelligently utilized to recognize/understand the semantic meanings of scenes, efficiently enhance and augment single 2D images, and reconstruct high-quality 3D models.
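    The abstract does not spell out the five view-independent features, so the sketch below is only a hedged stand-in: it derives two generic per-pixel 3D features (surface normals and a height proxy) from a dense depth map, assuming pinhole intrinsics fx, fy, cx, cy.

```python
# Hedged stand-in for depth-derived, view-independent features; the
# dissertation's actual five features are not reproduced here.
import numpy as np

def depth_features(depth, fx, fy, cx, cy):
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    # Back-project every pixel to a 3D point in camera coordinates.
    X = (u - cx) * depth / fx
    Y = (v - cy) * depth / fy
    P = np.dstack([X, Y, depth])
    # Surface normals from local tangent vectors (finite differences).
    du = np.gradient(P, axis=1)
    dv = np.gradient(P, axis=0)
    n = np.cross(du, dv)
    n /= np.linalg.norm(n, axis=2, keepdims=True) + 1e-9
    # Feature stack: normal direction plus Y as a height proxy (the camera
    # Y axis points down in the usual convention).
    return np.dstack([n, Y])
```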

    Visual Perception For Robotic Spatial Understanding

    Humans understand the world through vision without much effort. We perceive the structure, objects, and people in the environment and pay little direct attention to most of it, until it becomes useful. Intelligent systems, especially mobile robots, have no such biologically engineered vision mechanism to take for granted. In contrast, we must devise algorithmic methods of taking raw sensor data and converting it to something useful very quickly. Vision is such a necessary part of building a robot or any intelligent system that is meant to interact with the world that it is somewhat surprising we don't have off-the-shelf libraries for this capability. Why is this? The simple answer is that the problem is extremely difficult. There has been progress, but the current state of the art is impressive and depressing at the same time. We now have neural networks that can recognize many objects in 2D images, in some cases performing better than a human. Some algorithms can also provide bounding boxes or pixel-level masks to localize the object. We have visual odometry and mapping algorithms that can build reasonably detailed maps over long distances with the right hardware and conditions. On the other hand, we have robots with many sensors and no efficient way to compute their relative extrinsic poses for integrating the data in a single frame. The same networks that produce good object segmentations and labels in a controlled benchmark still miss obvious objects in the real world and have no mechanism for learning on the fly while the robot is exploring. Finally, while we can detect pose for very specific objects, we don't yet have a mechanism that detects pose that generalizes well over categories or that can describe new objects efficiently. We contribute algorithms in four of the areas mentioned above. First, we describe a practical and effective system for calibrating many sensors on a robot with up to 3 different modalities. Second, we present our approach to visual odometry and mapping that exploits the unique capabilities of RGB-D sensors to efficiently build detailed representations of an environment. Third, we describe a 3-D over-segmentation technique that utilizes the models and ego-motion output in the previous step to generate temporally consistent segmentations with camera motion. Finally, we develop a synthesized dataset of chair objects with part labels and investigate the influence of parts on RGB-D based object pose recognition using a novel network architecture we call PartNet.
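    What the multi-sensor calibration in the first contribution buys is the ability to express every sensor's data in one robot frame by composing rigid-body transforms. The sketch below shows that composition; the frame names and offsets are illustrative assumptions, not the dissertation's setup.

```python
# Composing calibrated extrinsics to fuse sensor data in a single frame.
import numpy as np

def make_pose(R, t):
    """Build a 4x4 homogeneous transform from rotation R and translation t."""
    T = np.eye(4)
    T[:3, :3], T[:3, 3] = R, t
    return T

def to_base(points, T_base_sensor):
    """Map an (N, 3) point cloud from a sensor frame into the base frame."""
    pts = np.hstack([points, np.ones((len(points), 1))])
    return (T_base_sensor @ pts.T).T[:, :3]

# Illustrative chain: a camera on a pan unit mounted on the robot base.
T_base_pan = make_pose(np.eye(3), np.array([0.0, 0.0, 1.2]))
T_pan_cam = make_pose(np.eye(3), np.array([0.05, 0.0, 0.1]))
cloud_base = to_base(np.random.rand(100, 3), T_base_pan @ T_pan_cam)
```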

    Approaches to three-dimensional reconstruction of plant shoot topology and geometry

    There are currently 805 million people classified as chronically undernourished, and yet the world's population is still increasing. At the same time, global warming is causing more frequent and severe flooding and drought, thus destroying crops and reducing the amount of land available for agriculture. Recent studies show that without crop climate adaptation, crop productivity will deteriorate. With access to 3D models of real plants it is possible to acquire detailed morphological and gross developmental data that can be used to study their ecophysiology, leading to an increase in crop yield and stability across hostile and changing environments. Here we review approaches to the reconstruction of 3D models of plant shoots from image data, consider current applications in plant and crop science, and identify remaining challenges. We conclude that although phenotyping is receiving an increasing amount of attention – particularly from computer vision researchers – and numerous vision approaches have been proposed, it still remains a highly interactive process. An automated system capable of producing 3D models of plants would significantly aid phenotyping practice, increasing the accuracy and repeatability of measurements.

    Exploiting Structural Regularities and Beyond: Vision-based Localization and Mapping in Man-Made Environments

    Image-based estimation of camera motion, known as visual odometry (VO), plays a very important role in many robotic applications such as control and navigation of unmanned mobile robots, especially when no external navigation reference signal is available. The core problem of VO is the estimation of the camera's ego-motion (i.e. tracking) either between successive frames, namely relative pose estimation, or with respect to a global map, namely absolute pose estimation. This thesis aims to develop efficient, accurate and robust VO solutions by taking advantage of structural regularities in man-made environments, such as piece-wise planar structures, Manhattan World and, more generally, contours and edges. Furthermore, to handle challenging scenarios that are beyond the limits of classical sensor-based VO solutions, we investigate a recently emerging sensor, the event camera, and study event-based mapping, one of the key problems in event-based VO/SLAM. The main achievements are summarized as follows. First, we revisit an old topic in relative pose estimation: accurately and robustly estimating the fundamental matrix given a collection of independently estimated homographies. Three classical methods are reviewed, and then we show a simple but nontrivial two-step normalization within the direct linear method that achieves similar performance to the less attractive and more computationally intensive hallucinated-points-based method. Second, an efficient 3D rotation estimation algorithm for depth cameras in piece-wise planar environments is presented. It shows that, by using surface normal vectors as input, planar modes in the corresponding density distribution function can be discovered and continuously tracked using efficient non-parametric estimation techniques. The relative rotation can be estimated by registering entire bundles of planar modes using robust L1-norm minimization. Third, an efficient alternative to the iterative closest point algorithm for real-time tracking of modern depth cameras in Manhattan Worlds is developed. We exploit the common orthogonal structure of man-made environments in order to decouple the estimation of the rotation from the three degrees of freedom of the translation. The derived camera orientation is absolute and thus free of long-term drift, which in turn benefits the accuracy of the translation estimation as well. Fourth, we look into a more general structural regularity: edges. A real-time VO system that uses Canny edges is proposed for RGB-D cameras. Two novel alternatives to classical distance transforms are developed, with properties that significantly improve the classical Euclidean distance field based methods in terms of efficiency, accuracy and robustness. Finally, to deal with challenging scenarios that go beyond what standard RGB/RGB-D cameras can handle, we investigate the recently emerging event camera and focus on the problem of 3D reconstruction from data captured by a stereo event-camera rig moving in a static scene, such as in the context of stereo Simultaneous Localization and Mapping.
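    The drift-free absolute orientation of the third contribution can be pictured with a simple closed-form loop: assign each surface normal to the nearest of the six Manhattan directions, then re-solve the rotation by orthogonal Procrustes. A minimal sketch under those assumptions, not the thesis algorithm itself:

```python
# Manhattan-World rotation tracking sketch: normals snap to Manhattan axes,
# the rotation is re-solved in closed form. Assumes the scene shows at least
# two non-parallel planes so the Procrustes problem is well conditioned.
import numpy as np

def manhattan_rotation(normals, R_init, iters=5):
    """normals: (N, 3) unit surface normals in the camera frame;
    R_init: rough world-to-camera rotation. Returns an absolute,
    drift-free rotation estimate."""
    axes = np.vstack([np.eye(3), -np.eye(3)])   # six Manhattan directions
    R = R_init
    for _ in range(iters):
        # Assign each normal to the closest rotated Manhattan axis.
        labels = (normals @ (axes @ R.T).T).argmax(axis=1)
        # Orthogonal Procrustes: maximize tr(R @ H), with H = sum a_i n_i^T.
        H = axes[labels].T @ normals
        U, _, Vt = np.linalg.svd(H)
        D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
        R = Vt.T @ D @ U.T
    return R
```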

    Object Tracking

    Object tracking consists of estimating the trajectories of moving objects in a sequence of images. Automating computer object tracking is a difficult task: changes in the many parameters representing the features and motion of the objects, as well as temporary partial or full occlusion of the tracked objects, have to be considered. This monograph presents the development of object tracking algorithms, methods and systems. Both the state of the art of object tracking methods and new trends in research are described in this book. Fourteen chapters are split into two sections: Section 1 presents new theoretical ideas, whereas Section 2 presents real-life applications. Despite the variety of topics contained in this monograph, it constitutes a consistent body of knowledge in the field of computer object tracking. The editor's intention was to keep pace with the rapid progress in the development of these methods as well as the expansion of their applications.

    Model-Based Environmental Visual Perception for Humanoid Robots

    The visual perception of a robot should answer two fundamental questions: What? and Where? In order to answer these questions properly and efficiently, it is essential to establish a bidirectional coupling between external stimuli and internal representations. This coupling links the physical world with the inner abstraction models via sensor transformation, recognition, matching and optimization algorithms. The objective of this PhD is to establish this sensor-model coupling.