25 research outputs found

    Multi-planar Fitting in an Indoor ManhattanWorld

    Get PDF

    Pose Normalization of Indoor Mapping Datasets Partially Compliant with the Manhattan World Assumption

    Get PDF
    In this paper, we present a novel pose normalization method for indoor mapping point clouds and triangle meshes that is robust against large fractions of the indoor mapping geometries deviating from an ideal Manhattan World structure. In the case of building structures that contain multiple Manhattan World systems, the dominant Manhattan World structure supported by the largest fraction of geometries is determined and used for alignment. In a first step, a vertical alignment orienting a chosen axis to be orthogonal to horizontal floor and ceiling surfaces is conducted. Subsequently, a rotation around the resulting vertical axis is determined that aligns the dataset horizontally with the coordinate axes. The proposed method is evaluated quantitatively against several publicly available indoor mapping datasets. Our implementation of the proposed procedure along with code for reproducing the evaluation will be made available to the public upon acceptance for publication

    Indoor Mapping and Reconstruction with Mobile Augmented Reality Sensor Systems

    Get PDF
    Augmented Reality (AR) ermöglicht es, virtuelle, dreidimensionale Inhalte direkt innerhalb der realen Umgebung darzustellen. Anstatt jedoch beliebige virtuelle Objekte an einem willkĂŒrlichen Ort anzuzeigen, kann AR Technologie auch genutzt werden, um Geodaten in situ an jenem Ort darzustellen, auf den sich die Daten beziehen. Damit eröffnet AR die Möglichkeit, die reale Welt durch virtuelle, ortbezogene Informationen anzureichern. Im Rahmen der vorliegenen Arbeit wird diese Spielart von AR als "Fused Reality" definiert und eingehend diskutiert. Der praktische Mehrwert, den dieses Konzept der Fused Reality bietet, lĂ€sst sich gut am Beispiel seiner Anwendung im Zusammenhang mit digitalen GebĂ€udemodellen demonstrieren, wo sich gebĂ€udespezifische Informationen - beispielsweise der Verlauf von Leitungen und Kabeln innerhalb der WĂ€nde - lagegerecht am realen Objekt darstellen lassen. Um das skizzierte Konzept einer Indoor Fused Reality Anwendung realisieren zu können, mĂŒssen einige grundlegende Bedingungen erfĂŒllt sein. So kann ein bestimmtes GebĂ€ude nur dann mit ortsbezogenen Informationen augmentiert werden, wenn von diesem GebĂ€ude ein digitales Modell verfĂŒgbar ist. Zwar werden grĂ¶ĂŸere Bauprojekt heutzutage oft unter Zuhilfename von Building Information Modelling (BIM) geplant und durchgefĂŒhrt, sodass ein digitales Modell direkt zusammen mit dem realen GebĂ€ude ensteht, jedoch sind im Falle Ă€lterer BestandsgebĂ€ude digitale Modelle meist nicht verfĂŒgbar. Ein digitales Modell eines bestehenden GebĂ€udes manuell zu erstellen, ist zwar möglich, jedoch mit großem Aufwand verbunden. Ist ein passendes GebĂ€udemodell vorhanden, muss ein AR GerĂ€t außerdem in der Lage sein, die eigene Position und Orientierung im GebĂ€ude relativ zu diesem Modell bestimmen zu können, um Augmentierungen lagegerecht anzeigen zu können. Im Rahmen dieser Arbeit werden diverse Aspekte der angesprochenen Problematik untersucht und diskutiert. Dabei werden zunĂ€chst verschiedene Möglichkeiten diskutiert, Indoor-GebĂ€udegeometrie mittels Sensorsystemen zu erfassen. Anschließend wird eine Untersuchung prĂ€sentiert, inwiefern moderne AR GerĂ€te, die in der Regel ebenfalls ĂŒber eine Vielzahl an Sensoren verfĂŒgen, ebenfalls geeignet sind, als Indoor-Mapping-Systeme eingesetzt zu werden. Die resultierenden Indoor Mapping DatensĂ€tze können daraufhin genutzt werden, um automatisiert GebĂ€udemodelle zu rekonstruieren. Zu diesem Zweck wird ein automatisiertes, voxel-basiertes Indoor-Rekonstruktionsverfahren vorgestellt. Dieses wird außerdem auf der Grundlage vierer zu diesem Zweck erfasster DatensĂ€tze mit zugehörigen Referenzdaten quantitativ evaluiert. Desweiteren werden verschiedene Möglichkeiten diskutiert, mobile AR GerĂ€te innerhalb eines GebĂ€udes und des zugehörigen GebĂ€udemodells zu lokalisieren. In diesem Kontext wird außerdem auch die Evaluierung einer Marker-basierten Indoor-Lokalisierungsmethode prĂ€sentiert. Abschließend wird zudem ein neuer Ansatz, Indoor-Mapping DatensĂ€tze an den Achsen des Koordinatensystems auszurichten, vorgestellt

    Convex Decomposition of Indoor Scenes

    Full text link
    We describe a method to parse a complex, cluttered indoor scene into primitives which offer a parsimonious abstraction of scene structure. Our primitives are simple convexes. Our method uses a learned regression procedure to parse a scene into a fixed number of convexes from RGBD input, and can optionally accept segmentations to improve the decomposition. The result is then polished with a descent method which adjusts the convexes to produce a very good fit, and greedily removes superfluous primitives. Because the entire scene is parsed, we can evaluate using traditional depth, normal, and segmentation error metrics. Our evaluation procedure demonstrates that the error from our primitive representation is comparable to that of predicting depth from a single image.Comment: 18 pages, 12 figure

    Finding Good Configurations of Planar Primitives in Unorganized Point Clouds

    Get PDF
    International audienceWe present an algorithm for detecting planar primitives from unorganized 3D point clouds. Departing from an initial configuration, the algorithm refines both the continuous plane parameters and the discrete assignment of input points to them by seeking high fidelity, high simplicity and high completeness. Our key contribution relies upon the design of an exploration mechanism guided by a multiobjective energy function. The transitions within the large solution space are handled by five geometric operators that create, remove and modify primitives. We demonstrate the potential of our method on a variety of scenes, from organic shapes to man-made objects, and sensors, from multiview stereo to laser. We show its efficacy with respect to existing primitive fitting approaches and illustrate its applicative interest in compact mesh reconstruction, when combined with a plane assembly method

    iBARLE: imBalance-Aware Room Layout Estimation

    Full text link
    Room layout estimation predicts layouts from a single panorama. It requires datasets with large-scale and diverse room shapes to train the models. However, there are significant imbalances in real-world datasets including the dimensions of layout complexity, camera locations, and variation in scene appearance. These issues considerably influence the model training performance. In this work, we propose the imBalance-Aware Room Layout Estimation (iBARLE) framework to address these issues. iBARLE consists of (1) Appearance Variation Generation (AVG) module, which promotes visual appearance domain generalization, (2) Complex Structure Mix-up (CSMix) module, which enhances generalizability w.r.t. room structure, and (3) a gradient-based layout objective function, which allows more effective accounting for occlusions in complex layouts. All modules are jointly trained and help each other to achieve the best performance. Experiments and ablation studies based on ZInD~\cite{cruz2021zillow} dataset illustrate that iBARLE has state-of-the-art performance compared with other layout estimation baselines

    Line Primitives and Their Applications in Geometric Computer Vision

    Get PDF
    Line primitives are widely found in structured scenes which provide a higher level of structure information about the scenes than point primitives. Furthermore, line primitives in space are closely related to Euclidean transformations, because the dual vector (also known as Pluecker coordinates) representation of 3D lines is the counterpart of the dual quaternion which depicts an Euclidean transformation. These geometric properties of line primitives motivate the work in this thesis with the following contributions: Firstly, by combining local appearances of lines and geometric constraints between line pairs in images, a line segment matching algorithm is developed which constructs a novel line band descriptor to depict the local appearance of a line and builds a relational graph to measure the pair-wise consistency between line correspondences. Experiments show that the matching algorithm is robust to various image transformations and more efficient than conventional graph based line matching algorithms. Secondly, by investigating the symmetric property of line directions in space, this thesis presents a complete analysis about the solutions of the Perspective-3-Line (P3L) problem which estimates the camera pose from three reference lines in space and their 2D projections. For three spatial lines in general configurations, a P3L polynomial is derived which is employed to develop a solution of the Perspective-n-Line problem. The proposed robust PnL algorithm can efficiently and accurately estimate the camera pose for both small numbers and large numbers of line correspondences. For three spatial lines in special configurations (e.g., in a Manhattan world which consists of three mutually orthogonal dominant directions), the solution of the P3L problem is employed to solve the vanishing point estimation and line classification problem. The proposed vanishing point estimation algorithm achieves high accuracy and efficiency by thoroughly utilizing the Manhattan world characteristic. Another advantage of the proposed framework is that it can be easily generalized to images taken by central catadioptric cameras or uncalibrated cameras. The third major contribution of this thesis is about structure-from-motion using line primitives. To circumvent the Pluecker constraints on the Pluecker coordinates of lines, the Cayley representation of lines is developed which is inspired by the geometric property of the Pluecker coordinates of lines. To build the line observation model, two derivations of line projection functions are presented: one is based on the dual relationship between points and lines; and the other is based on the relationship between Pluecker coordinates and the Pluecker matrix. Then the motion and structure parameters are initialized by an incremental approach and optimized by sparse bundle adjustment. Quantitative validations show the increase in performance when compared to conventional line reconstruction algorithms

    PARSAC: Accelerating Robust Multi-Model Fitting with Parallel Sample Consensus

    Full text link
    We present a real-time method for robust estimation of multiple instances of geometric models from noisy data. Geometric models such as vanishing points, planar homographies or fundamental matrices are essential for 3D scene analysis. Previous approaches discover distinct model instances in an iterative manner, thus limiting their potential for speedup via parallel computation. In contrast, our method detects all model instances independently and in parallel. A neural network segments the input data into clusters representing potential model instances by predicting multiple sets of sample and inlier weights. Using the predicted weights, we determine the model parameters for each potential instance separately in a RANSAC-like fashion. We train the neural network via task-specific loss functions, i.e. we do not require a ground-truth segmentation of the input data. As suitable training data for homography and fundamental matrix fitting is scarce, we additionally present two new synthetic datasets. We demonstrate state-of-the-art performance on these as well as multiple established datasets, with inference times as small as five milliseconds per image.Comment: AAAI 202

    3D Scene Geometry Estimation from 360∘^\circ Imagery: A Survey

    Full text link
    This paper provides a comprehensive survey on pioneer and state-of-the-art 3D scene geometry estimation methodologies based on single, two, or multiple images captured under the omnidirectional optics. We first revisit the basic concepts of the spherical camera model, and review the most common acquisition technologies and representation formats suitable for omnidirectional (also called 360∘^\circ, spherical or panoramic) images and videos. We then survey monocular layout and depth inference approaches, highlighting the recent advances in learning-based solutions suited for spherical data. The classical stereo matching is then revised on the spherical domain, where methodologies for detecting and describing sparse and dense features become crucial. The stereo matching concepts are then extrapolated for multiple view camera setups, categorizing them among light fields, multi-view stereo, and structure from motion (or visual simultaneous localization and mapping). We also compile and discuss commonly adopted datasets and figures of merit indicated for each purpose and list recent results for completeness. We conclude this paper by pointing out current and future trends.Comment: Published in ACM Computing Survey
    corecore