9 research outputs found

    Plane-extraction from depth-data using a Gaussian mixture regression model

    We propose a novel algorithm for unsupervised extraction of piecewise planar models from depth data. Among other applications, such models are a good way of enabling autonomous agents (robots, cars, drones, etc.) to perceive their surroundings effectively and to navigate in three dimensions. We propose to do this by fitting the data with a piecewise-linear Gaussian mixture regression model whose components are skewed over planes, making them flat in appearance rather than ellipsoidal; by embedding an outlier-trimming process that is formally incorporated into the proposed expectation-maximization algorithm; and by selectively fusing contiguous, coplanar components. Part of our motivation is to estimate planes more accurately by allowing each model component to make use of all available data through probabilistic clustering. The algorithm is thoroughly evaluated against a standard benchmark and is shown to rank among the best of the existing state-of-the-art methods. (Comment: 11 pages, 2 figures, 1 table)
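    The intuition behind the "flat" mixture components can be sketched with synthetic data: a Gaussian fitted to points sampled near a plane has one covariance eigenvalue far smaller than the other two, and the corresponding eigenvector approximates the plane normal. The sketch below uses made-up data and a single component; it is not the paper's full EM with outlier trimming.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic depth data: noisy samples near the plane z = 0.2x - 0.1y + 1.
n = 2000
xy = rng.uniform(-1.0, 1.0, size=(n, 2))
z = 0.2 * xy[:, 0] - 0.1 * xy[:, 1] + 1.0 + rng.normal(0.0, 0.01, n)
pts = np.column_stack([xy, z])

# A Gaussian fitted to such data is "flat": its covariance has one
# eigenvalue far smaller than the other two, and the corresponding
# eigenvector is an estimate of the plane normal.
cov = np.cov(pts.T)
evals, evecs = np.linalg.eigh(cov)     # eigenvalues in ascending order
normal = evecs[:, 0]                   # direction of least variance

# Compare against the analytic normal of the generating plane.
true_normal = np.array([0.2, -0.1, -1.0])
true_normal /= np.linalg.norm(true_normal)
cos = abs(normal @ true_normal)
print(round(cos, 3))                   # close to 1.0
```

    In the paper's mixture setting, each component would carry its own mean and skewed covariance, with responsibilities computed over all points rather than a hard assignment.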

    SEGMENTATION OF UAV-BASED IMAGES INCORPORATING 3D POINT CLOUD INFORMATION

    Numerous applications related to urban scene analysis demand automatic recognition of buildings and their distinct sub-elements. If LiDAR data is available, for example, the segmentation could be performed using 3D information alone. However, this poses several risks; for instance, in-plane objects cannot be distinguished from their surroundings. On the other hand, if only image-based segmentation is performed, geometric features (e.g., normal orientation, planarity) are not readily available. This renders the task of detecting distinct sub-elements of a building with similar radiometric characteristics infeasible. In this paper, the individual sub-elements of buildings are recognized through sub-segmentation of the building using geometric and radiometric characteristics jointly. 3D points generated from Unmanned Aerial Vehicle (UAV) images are used for inferring the geometric characteristics of the roofs and facades of the building. However, image-based 3D points are noisy, error-prone, and often contain gaps, so segmentation in 3D space alone is not appropriate. Therefore, we propose to perform segmentation in image space using geometric features from the 3D point cloud along with radiometric features. The initial detection of buildings in the 3D point cloud is followed by segmentation in image space using a region growing approach that utilizes various radiometric and 3D point cloud features. The developed method was tested using two data sets of UAV images with a ground resolution of around 1-2 cm. The developed method accurately segmented most of the building elements when compared to plane-based segmentation using the 3D point cloud alone.
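    Region growing over joint geometric and radiometric cues can be sketched as follows: a pixel joins the region when its normal (from the 3D points) is close to the seed's normal and its intensity is close to the seed's intensity. The thresholds, toy scene, and function names below are illustrative assumptions, not the paper's exact criteria.

```python
import numpy as np
from collections import deque

def grow_region(seed, normals, intensity, ang_thresh_deg=10.0, int_thresh=0.1):
    """4-connected region growing: accept a neighbor when its normal is
    within ang_thresh_deg of the seed normal and its intensity is within
    int_thresh of the seed intensity (illustrative thresholds)."""
    h, w = intensity.shape
    cos_thresh = np.cos(np.radians(ang_thresh_deg))
    seed_n, seed_i = normals[seed], intensity[seed]
    mask = np.zeros((h, w), dtype=bool)
    mask[seed] = True
    q = deque([seed])
    while q:
        r, c = q.popleft()
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < h and 0 <= nc < w and not mask[nr, nc]:
                if (normals[nr, nc] @ seed_n >= cos_thresh
                        and abs(intensity[nr, nc] - seed_i) <= int_thresh):
                    mask[nr, nc] = True
                    q.append((nr, nc))
    return mask

# Toy scene: left half is a flat roof (normal up), right half a facade.
normals = np.zeros((4, 8, 3))
normals[:, :4] = [0.0, 0.0, 1.0]
normals[:, 4:] = [0.0, 1.0, 0.0]
intensity = np.where(np.arange(8) < 4, 0.8, 0.3) * np.ones((4, 8))
roof = grow_region((0, 0), normals, intensity)
print(int(roof.sum()))  # 16: only the 4x4 roof pixels are accepted
```

    The combined test mirrors the paper's point: radiometry alone would merge a roof and a facade painted the same color, while geometry alone would miss in-plane boundaries.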

    Towards a data-driven object recognition framework using temporal depth-data

    Object recognition using depth sensors such as the Kinect device has received a lot of attention in recent years. Yet the limitations of such devices, such as large noise and missing data, make the problem very challenging. In this work I propose a framework for data-driven object recognition that uses a combination of local and global features as well as time-varying depth information.

    High-level environment representations for mobile robots

    In most robotic applications we face the problem of building a digital representation of the environment that allows the robot to autonomously complete its tasks. This internal representation can be used by the robot to plan a motion trajectory for its mobile base and/or end-effector. For most man-made environments we either have no digital representation or an inaccurate one. Thus, the robot must be capable of building it autonomously, by integrating incoming sensor measurements into an internal data structure. For this purpose, a common solution consists in solving the Simultaneous Localization and Mapping (SLAM) problem. The map obtained by solving a SLAM problem is called "metric" and it describes the geometric structure of the environment. A metric map is typically made up of low-level primitives (like points or voxels). This means that even though it represents the shape of the objects in the robot's workspace, it lacks the information of which object a surface belongs to. Having an object-level representation of the environment has the advantage of augmenting the set of possible tasks that a robot may accomplish. To this end, in this thesis we focus on two aspects. We propose a formalism to represent in a uniform manner 3D scenes consisting of different geometric primitives, including points, lines and planes. Consequently, we derive a local registration and a global optimization algorithm that can exploit this representation for robust estimation. Furthermore, we present a Semantic Mapping system capable of building an object-based map that can be used for complex task planning and execution. Our system exploits effective reconstruction and recognition techniques that require no a priori information about the environment and can be used under general conditions.

    Fast and accurate plane segmentation in depth maps for indoor scenes


    Camera Marker Networks for Pose Estimation and Scene Understanding in Construction Automation and Robotics.

    The construction industry faces challenges that include high workplace injuries and fatalities, stagnant productivity, and skill shortage. Automation and Robotics in Construction (ARC) has been proposed in the literature as a potential solution that makes machinery easier to collaborate with, facilitates better decision-making, or enables autonomous behavior. However, there are two primary technical challenges in ARC: 1) unstructured and featureless environments; and 2) differences between the as-designed and the as-built. It is therefore impossible to directly replicate conventional automation methods adopted in industries such as manufacturing on construction sites. In particular, two fundamental problems, pose estimation and scene understanding, must be addressed to realize the full potential of ARC. This dissertation proposes a pose estimation and scene understanding framework that addresses the identified research gaps by exploiting cameras, markers, and planar structures to mitigate the identified technical challenges. A fast plane extraction algorithm is developed for efficient modeling and understanding of built environments. A marker registration algorithm is designed for robust, accurate, cost-efficient, and rapidly reconfigurable pose estimation in unstructured and featureless environments. Camera marker networks are then established for unified and systematic design, estimation, and uncertainty analysis in larger scale applications. The proposed algorithms' efficiency has been validated through comprehensive experiments. Specifically, the speed, accuracy and robustness of the fast plane extraction and the marker registration have been demonstrated to be superior to existing state-of-the-art algorithms. These algorithms have also been implemented in two groups of ARC applications to demonstrate the proposed framework's effectiveness, wherein the applications themselves have significant social and economic value. 
The first group is related to in-situ robotic machinery, including an autonomous manipulator for assembling digital architecture designs on construction sites to help improve productivity and quality; and an intelligent guidance and monitoring system for articulated machinery such as excavators to help improve safety. The second group emphasizes human-machine interaction to make ARC more effective, including a mobile Building Information Modeling and way-finding platform with discrete location recognition to increase indoor facility management efficiency; and a 3D scanning and modeling solution for rapid and cost-efficient dimension checking and concise as-built modeling.
    PhD, Civil Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies
    http://deepblue.lib.umich.edu/bitstream/2027.42/113481/1/cforrest_1.pd
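    The core of marker-based pose estimation is recovering a rigid transform between a marker's own coordinate frame and the camera frame. A generic way to do this, given corresponding 3D corner positions (e.g., from a depth sensor), is the Kabsch/Umeyama SVD solution sketched below; the dissertation's registration algorithm and marker layout are not specified here, so the corner coordinates and pose are hypothetical.

```python
import numpy as np

def rigid_transform(A, B):
    """Kabsch/Umeyama: rotation R and translation t minimizing
    sum ||R a_i + t - b_i||^2 for corresponding 3D points (n x 3)."""
    ca, cb = A.mean(axis=0), B.mean(axis=0)
    H = (A - ca).T @ (B - cb)
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))  # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = cb - R @ ca
    return R, t

# Four corners of a hypothetical 10 cm square marker, in the marker frame.
marker = np.array([[0, 0, 0], [0.1, 0, 0], [0.1, 0.1, 0], [0, 0.1, 0]])

# Ground-truth pose: 90-degree yaw plus a translation.
R_true = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
t_true = np.array([0.5, 0.2, 1.0])
observed = marker @ R_true.T + t_true   # corners as seen in the camera frame

R_est, t_est = rigid_transform(marker, observed)
print(np.allclose(R_est, R_true), np.allclose(t_est, t_true))
```

    The reflection guard matters precisely for planar markers: the four corners span only two dimensions, so without the determinant correction the SVD may return an improper rotation.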

    Detection of counterfeit coins based on 3D Height-Map Image Analysis

    Analyzing 3-D height-map images leads to the discovery of a new set of features that cannot be extracted, or even seen, in 2-D images. To the best of our knowledge, there has been no research in the literature analyzing height-map images to detect counterfeit coins or to classify coins. The main goal of this thesis is to propose a new comprehensive method for analyzing 3-D height-map images to detect counterfeits of any type of coin regardless of country of origin, language, shape, and quality. Therefore, we applied a precise 3-D scanner to produce coin height-map images, since detecting a counterfeit coin using 2-D image processing is nearly impossible in some cases, especially when the coin is damaged, corroded, or worn out. In this research, we propose several 3-D approaches to model and analyze large datasets. In our first and second methods, we aimed to solve the degradation problem of shiny coin images due to the scanning process. To solve this problem, first, the characters of the coin images were straightened by a proposed straightening algorithm. The height-map image was then decomposed row-wise into a set of 1-D signals, which were analyzed separately and restored by two different proposed methods. These approaches produced remarkable results. We also proposed a 3-D approach to detect and analyze the precipice borders on the coin surface and extract significant features that are insensitive to the degradation problem. To extract the features, we also proposed Binned Borders in Spherical Coordinates (BBSC) to analyze different parts of precipice borders at different polar and azimuthal angles. We also took advantage of stacked generalization to classify the coins and added a reject option to increase the reliability of the system. The results illustrate that the proposed method outperforms other counterfeit coin detectors.
Since deep learning appears in most recent research related to image processing, it is worthwhile to benefit from deep learning approaches in our study as well. In another proposed method of this thesis, we applied deep learning algorithms in two steps to detect counterfeit coins. As Generative Adversarial Networks are used for generating fake images in image processing applications, we proposed a novel method based on this network to augment our fake-coin class and compensate for the lack of fake coins for training the classifier. We also decomposed the coin height-map image into three slope types: Steep, Moderate, and Gentle. The grayscale height-map image is thereby turned into the proposed SMG height-map channel. Then, we proposed a hybrid CNN-based deep neural network to train on and classify these new SMG images. The results illustrated that a deep neural network trained with the proposed SMG images outperforms the system trained on the grayscale images. In this research, the proposed methods were trained and tested with four types of Danish and two types of Chinese coins, with encouraging results.
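A slope-based decomposition of a height map can be sketched by thresholding the local gradient magnitude into three mutually exclusive classes. The thresholds, toy "coin" relief, and function name below are our own illustrative choices; the thesis does not publish its calibrated values here.

```python
import numpy as np

def smg_channels(height, gentle=0.05, steep=0.2):
    """Split a height map into Steep / Moderate / Gentle binary channels
    by local gradient magnitude (illustrative thresholds)."""
    gy, gx = np.gradient(height.astype(float))
    slope = np.hypot(gx, gy)
    s = slope >= steep
    g = slope < gentle
    m = ~s & ~g
    return np.stack([s, m, g], axis=-1)   # H x W x 3, one channel per class

# Toy coin relief: a flat field with a sharp ridge down the middle.
h = np.zeros((8, 8))
h[:, 4] = 1.0
chans = smg_channels(h)
# The three classes partition the image: every pixel is in exactly one.
print(int(chans.sum()), chans.shape)
```

In this sketch the three boolean planes play the role of the SMG channels fed to the classifier; a real pipeline would likely stack them (or soft versions of them) as input channels to the CNN.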

    Room layout estimation on mobile devices

    Room layout generation is the problem of generating a drawing or a digital model of an existing room from a set of measurements such as laser data or images. The generation of floor plans can find application in the building industry to assess the quality and correctness of an ongoing construction w.r.t. the initial model, or to quickly sketch the renovation of an apartment. The real estate industry can rely on automatic generation of floor plans to ease the process of checking the livable surface and to propose virtual visits to prospective customers. As for the general public, the room layout can be integrated into mixed reality games to provide a more immersive experience, or used in other augmented reality applications such as room redecoration. The goal of this industrial thesis (CIFRE) is to investigate and take advantage of state-of-the-art mobile devices in order to automate the process of generating room layouts. Nowadays, modern mobile devices usually come with a wide range of sensors, such as an inertial measurement unit (IMU), RGB cameras and, more recently, depth cameras. Moreover, tactile touchscreens offer a natural and simple way to interact with the user, thus favoring the development of interactive applications in which the user can be part of the processing loop. This work aims at exploiting the richness of such devices to address the room layout generation problem. The thesis has three major contributions. We first show how the classic problem of detecting vanishing points in an image can benefit from a prior given by the IMU sensor. We propose a simple and effective algorithm for detecting vanishing points relying on the gravity vector estimated by the IMU. A new public dataset containing images and the relevant IMU data is introduced to help assess vanishing-point algorithms and foster further studies in the field.
As a second contribution, we explored the state of the art in real-time localization and map optimization algorithms for RGB-D sensors. Real-time localization is a fundamental task for enabling augmented reality applications, and thus a critical component when designing interactive applications. We evaluate existing algorithms designed for the common desktop set-up in order to assess how well they can be employed on a mobile device. For each considered method, we assess the accuracy of the localization as well as the computational performance when ported to a mobile device. Finally, we present a proof-of-concept application able to generate the room layout relying on a Project Tango tablet equipped with an RGB-D sensor. In particular, we propose an algorithm that incrementally processes and fuses the 3D data provided by the sensor in order to obtain the layout of the room. We show how our algorithm can rely on user interactions to correct the generated 3D model during the acquisition process.
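The IMU prior for the vertical vanishing point can be sketched directly: the image of a 3D direction is its projection through the intrinsic matrix, so the gravity vector expressed in camera coordinates maps straight to the vertical vanishing point. The intrinsics and gravity reading below are hypothetical values, not taken from the thesis or its dataset.

```python
import numpy as np

# Hypothetical pinhole intrinsics: focal length 800 px, principal
# point (320, 240).
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])

def vertical_vp(K, g):
    """Image of the 3D direction g (gravity in camera coordinates):
    project the homogeneous direction through K and dehomogenize."""
    v = K @ (g / np.linalg.norm(g))
    return v[:2] / v[2]   # assumes g is not parallel to the image plane

# Camera pitched slightly down, so gravity has a small +z component.
g = np.array([0.0, 9.7, 1.5])
vp = vertical_vp(K, g)
# With g_x = 0, the vanishing point stays on the principal point's column.
print(np.round(vp, 1))
```

Line segments in the image whose extensions pass near this point can then be labeled vertical, which constrains the remaining (horizontal) vanishing points to the horizon.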