5 research outputs found

    Modelado Semántico 3D de Ambientes Interiores basado en Nubes de Puntos y Relaciones Contextuales

    Context: We propose a methodology to identify and label the structural components of a typical indoor environment in order to generate a semantic model of the scene. We are interested in identifying walls, ceilings, floors, doorways with open doors, doorways with closed doors that are recessed into walls, and partially occluded windows.

    Method: The elements to be identified should be flat in the case of walls, floors, and ceilings, and rectangular in the case of windows and doorways; that is, the indoor structure satisfies the Manhattan-world assumption. These structures are identified by analyzing the contextual relationships among them, such as parallelism, orthogonality, and the position of each structure in the scene. Point clouds were acquired with an RGB-D device (Microsoft Kinect sensor).

    Results: The obtained results show a precision of 99.03% and a recall of 95.68% on a proprietary dataset.

    Conclusions: A method for 3D semantic labeling of indoor scenes based on contextual relationships among the objects is presented. The contextual rules used for classification and labeling make the process easy to understand and make it possible to identify the reasons behind the labeling errors that do occur. The response time of the algorithm is short, the accuracy attained is satisfactory, and the computational requirements are modest.
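The kind of contextual rule described above (orientation relative to gravity plus position in the scene) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a z-up frame, a known room height, and illustrative angle thresholds.

```python
import numpy as np

UP = np.array([0.0, 0.0, 1.0])   # gravity-aligned up axis (assumed z-up frame)

def label_plane(normal, centroid_z, room_height, angle_tol_deg=10.0):
    """Label a planar segment as floor, ceiling, or wall from its
    orientation relative to the up axis (parallelism/orthogonality)
    and its position in the scene. Thresholds are illustrative."""
    n = np.asarray(normal, dtype=float)
    c = abs(float(n @ UP)) / np.linalg.norm(n)
    if c >= np.cos(np.radians(angle_tol_deg)):    # normal parallel to up
        return "floor" if centroid_z < room_height / 2.0 else "ceiling"
    if c <= np.sin(np.radians(angle_tol_deg)):    # normal orthogonal to up
        return "wall"
    return "unknown"

print(label_plane([0, 0, 1], 0.0, 2.5))   # a low, upward-facing plane
```

Doorways and windows would need additional rules (rectangularity, containment within a wall plane), but the same pattern of explicit geometric tests applies, which is what makes the errors of such a classifier easy to diagnose.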

    An empirical assessment of real-time progressive stereo reconstruction

    3D reconstruction from images, the problem of recovering depth from images, is one of the most well-studied problems in computer vision, in part because it is academically interesting, but also because of the significant growth in the use of 3D models. This growth can be attributed to the development of augmented reality, 3D printing, and indoor mapping. Progressive stereo reconstruction is the sequential application of stereo reconstructions to reconstruct a scene. To achieve a reliable progressive stereo reconstruction, a combination of best-practice algorithms needs to be used. The purpose of this research is to determine the combination of best-practice algorithms that leads to the most accurate and efficient progressive stereo reconstruction, i.e. the best-practice combination.

    In order to obtain a similarity reconstruction, the intrinsic parameters of the camera need to be known. If they are not known, they are determined by capturing ten images of a checkerboard with a known calibration pattern from different angles and applying the moving-plane algorithm. Thereafter, to perform a near real-time reconstruction, frames are acquired and reconstructed simultaneously. For the first pair of frames, keypoints are detected and matched using a best-practice keypoint detection and matching algorithm. The motion of the camera between the frames is then determined by decomposing the essential matrix, which is obtained from the fundamental matrix, itself estimated using a best-practice ego-motion estimation algorithm. Finally, the keypoints are reconstructed using a best-practice reconstruction algorithm. For sequential frames, each frame is paired with the previous frame, so keypoints are only detected in the new frame. They are detected, matched, and reconstructed in the same fashion as the first pair of frames; however, to ensure that the reconstructed points are in the same scale as those from the first pair, the motion of the camera between the frames is estimated from 3D-2D correspondences using a best-practice algorithm.

    If the purpose of progressive reconstruction is visualization, the best-practice keypoint detection algorithm was found to be Speeded-Up Robust Features (SURF), as it yields more reconstructed points than the Scale-Invariant Feature Transform (SIFT). SIFT, however, is more computationally efficient and is thus better suited when the number of reconstructed points does not matter, for example when progressive reconstruction is used for camera tracking. For all purposes, the best-practice matching algorithm was found to be optical flow, as it is the most efficient, and the best-practice ego-motion estimation algorithm was found to be the 5-point algorithm, as it is robust to points located on planes. This research is significant because the effects of the key steps of progressive reconstruction, and of the choices made at each step, on the accuracy and efficiency of the reconstruction as a whole had never been studied. As a result, progressive stereo reconstruction can now be performed in near real-time on a mobile device without compromising the accuracy of the reconstruction.
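The two-view core of the pipeline (essential matrix, epipolar constraint, linear triangulation) can be sketched with synthetic data. This is an illustration under assumed motion and made-up points: a real pipeline would estimate the essential matrix from matched keypoints (e.g. with the 5-point algorithm) rather than build it from known motion.

```python
import numpy as np

def skew(v):
    """Cross-product matrix [v]_x, so that skew(v) @ w == np.cross(v, w)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one point from two views,
    given normalized homogeneous image coordinates x1, x2."""
    A = np.vstack([x1[0] * P1[2] - P1[0],
                   x1[1] * P1[2] - P1[1],
                   x2[0] * P2[2] - P2[0],
                   x2[1] * P2[2] - P2[1]])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]

# Known synthetic motion: a small rotation about the y-axis plus a sideways shift.
theta = np.radians(5.0)
R = np.array([[np.cos(theta), 0.0, np.sin(theta)],
              [0.0, 1.0, 0.0],
              [-np.sin(theta), 0.0, np.cos(theta)]])
t = np.array([0.5, 0.0, 0.0])

E = skew(t) @ R                                  # essential matrix: x2^T E x1 = 0
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])    # first camera at the origin
P2 = np.hstack([R, t.reshape(3, 1)])             # second camera

rng = np.random.default_rng(0)
X_true = rng.uniform([-1.0, -1.0, 4.0], [1.0, 1.0, 8.0], size=(20, 3))

epipolar_residual, reconstruction_error = [], []
for X in X_true:
    x1 = X / X[2]                    # normalized homogeneous coords in view 1
    Xc2 = R @ X + t
    x2 = Xc2 / Xc2[2]                # normalized homogeneous coords in view 2
    epipolar_residual.append(abs(x2 @ E @ x1))
    reconstruction_error.append(np.linalg.norm(triangulate(P1, P2, x1, x2) - X))
```

With noise-free correspondences both the epipolar residuals and the reconstruction errors are at floating-point level; the scale ambiguity the thesis resolves via 3D-2D correspondences does not arise here only because the translation is given.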

    Understanding the 3D layout of a cluttered room from multiple images

    No full text

    On-line, Incremental Visual Scene Understanding for an Indoor Navigating Robot.

    An indoor navigating robot must perceive its local environment in order to act. The robot must construct a model that captures critical navigation information from the stream of visual data it acquires while traveling within the environment. Visual processing must be done on-line and efficiently to keep up with the robot's needs. This thesis contributes both representations and algorithms toward solving the problem of modeling the local environment for an indoor navigating robot.

    Two representations, the Planar Semantic Model (PSM) and the Action Opportunity Star (AOS), are proposed to capture important navigation information about the local indoor environment. PSM models the geometric structure of the indoor environment in terms of the ground plane and walls, and captures rich relationships among the wall segments. AOS is an abstracted representation that reasons about the navigation opportunities at a given pose. Both representations can capture incomplete knowledge, so that representations of unknown regions can be built incrementally as observations become available.

    An on-line generate-and-test framework is presented to construct the PSM from a stream of visual data. The framework includes two key elements: an incremental process for generating structural hypotheses and an on-line hypothesis-testing mechanism using a Bayesian filter. Our framework is evaluated in three phases. First, we evaluate the effectiveness of the on-line hypothesis-testing mechanism with an initially generated set of hypotheses in simple empty environments. We demonstrate that our method outperforms state-of-the-art methods on geometric reasoning, both in accuracy and in applicability to a navigating robot. Second, we evaluate the incremental hypothesis-generating process and demonstrate the expressive power of our proposed representations. At this phase, we also demonstrate an attention-focusing method to efficiently discriminate among the active hypothesized models.
    Finally, we demonstrate a general metric to test the hypotheses with partial explanations in cluttered environments.

    PhD thesis, Electrical Engineering: Systems, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/108914/1/gstsai_1.pd
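The hypothesis-testing mechanism is described as a Bayesian filter over candidate structural models. A minimal sketch of such a discrete Bayes update follows; the hypotheses and per-observation likelihoods are invented for illustration, not taken from the thesis.

```python
import numpy as np

# Hypothetical candidate structural models with a uniform prior belief.
hypotheses = ["two-wall corner", "three-wall alcove", "flat wall"]
belief = np.full(len(hypotheses), 1.0 / len(hypotheses))

def update(belief, likelihoods):
    """One Bayesian filter step: multiply the prior belief by the
    likelihood of the current observation under each hypothesis,
    then renormalize so the beliefs sum to one."""
    posterior = belief * np.asarray(likelihoods, dtype=float)
    return posterior / posterior.sum()

# Illustrative per-observation likelihoods P(observation | hypothesis),
# one list per incoming visual observation.
for obs_likelihood in [[0.6, 0.3, 0.1], [0.7, 0.2, 0.1], [0.8, 0.15, 0.05]]:
    belief = update(belief, obs_likelihood)

print(hypotheses[int(np.argmax(belief))], belief.round(3))
```

As observations consistently favor one model, its belief concentrates toward one, which is the behavior that lets the framework retire losing hypotheses and focus attention on the still-plausible ones.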

    Room layout estimation on mobile devices

    Room layout generation is the problem of generating a drawing or a digital model of an existing room from a set of measurements such as laser data or images. The generation of floor plans can find application in the building industry, to assess the quality and correctness of an ongoing construction with respect to the initial model, or to quickly sketch the renovation of an apartment. The real estate industry can rely on automatic generation of floor plans to ease the process of checking the livable surface and to propose virtual visits to prospective customers. As for the general public, the room layout can be integrated into mixed reality games to provide a more immersive experience, or used in other augmented reality applications such as room redecoration.

    The goal of this industrial thesis (CIFRE) is to investigate and take advantage of state-of-the-art mobile devices in order to automate the process of generating room layouts. Modern mobile devices usually come with a wide range of sensors, such as an inertial measurement unit (IMU), RGB cameras and, more recently, depth cameras. Moreover, tactile touchscreens offer a natural and simple way to interact with the user, favoring the development of interactive applications in which the user can be part of the processing loop. This work aims at exploiting the richness of such devices to address the room layout generation problem.

    The thesis has three major contributions. We first show how the classic problem of detecting vanishing points in an image can benefit from a prior given by the IMU sensor. We propose a simple and effective algorithm for detecting vanishing points that relies on the gravity vector estimated by the IMU. A new public dataset containing images and the relevant IMU data is introduced to help assess vanishing point algorithms and foster further studies in the field.

    As a second contribution, we explore the state of the art in real-time localization and map optimization algorithms for RGB-D sensors. Real-time localization is a fundamental task for enabling augmented reality, and thus a critical component when designing interactive applications. We evaluate existing algorithms designed for the common desktop set-up with a view to employing them on a mobile device; for each considered method, we assess the accuracy of the localization as well as the computational performance when ported to a mobile device.

    Finally, we present a proof-of-concept application able to generate the room layout relying on a Project Tango tablet equipped with an RGB-D sensor. In particular, we propose an algorithm that incrementally processes and fuses the 3D data provided by the sensor in order to obtain the layout of the room. We show how our algorithm can rely on user interactions to correct the generated 3D model during the acquisition process.
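The first contribution rests on a simple geometric fact: the vertical vanishing point is the image of the gravity direction, i.e. the projection K g of the IMU gravity vector expressed in camera coordinates. A minimal sketch, with made-up intrinsics and gravity reading (this is not the thesis algorithm, only the prior it builds on):

```python
import numpy as np

# Hypothetical camera intrinsics: focal length and principal point in pixels.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])

def vertical_vanishing_point(K, gravity_cam):
    """Project the gravity direction (in camera coordinates) to obtain
    the vertical vanishing point in pixel coordinates. Assumes gravity
    is not parallel to the image plane (v[2] != 0)."""
    v = K @ gravity_cam
    return v[:2] / v[2]

# Illustrative IMU gravity reading rotated into the camera frame:
# mostly along +y (image "down"), with a small forward component.
g = np.array([0.02, 0.97, 0.24])
vp = vertical_vanishing_point(K, g / np.linalg.norm(g))
print(vp)
```

Line segments whose supporting lines pass near this point can then be classified as vertical without any search over vanishing-point candidates, which is what makes the IMU prior so effective.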