    Visual Analysis of High-Dimensional Point Clouds using Topological Abstraction

    This thesis is about visualizing a kind of data that is trivial for computers to process but difficult for humans to imagine, because nature does not allow for intuition with this type of information: high-dimensional data. Such data often result from representing observations of objects under various aspects or with different properties. In many applications, a typical, laborious task is to find related objects or to group those that are similar to each other. One classic solution for this task is to imagine the data as vectors in a Euclidean space with object variables as dimensions. With Euclidean distance as a measure of similarity, objects with similar properties and values accumulate into groups, so-called clusters, that are exposed by cluster analysis on the high-dimensional point cloud. Because similar vectors can be thought of as objects that are alike in terms of their attributes, the point cloud's structure and individual cluster properties, like their size or compactness, summarize data categories and their relative importance. The contribution of this thesis is a novel analysis approach for visual exploration of high-dimensional point clouds without suffering from structural occlusion. The work is based on implementing two key concepts. The first idea is to discard those geometric properties that cannot be preserved and thus lead to the typical artifacts. Topological concepts are used instead to shift the focus away from a point-centered view on the data to a more structure-centered perspective. The advantage is that topology-driven clustering information can be extracted in the data's original domain and be preserved without loss in low dimensions. The second idea is to split the analysis into a topology-based global overview and a subsequent geometric local refinement.
The occlusion-free overview enables the analyst to identify features and to link them to other visualizations that permit analysis of those properties not captured by the topological abstraction, e.g. cluster shape or value distributions in particular dimensions or subspaces. The advantage of separating structure from data point analysis is that restricting local analysis to data subsets significantly reduces artifacts and the visual complexity of standard techniques. That is, the additional topological layer enables the analyst to identify structure that was hidden before and to focus on particular features by suppressing irrelevant points during local feature analysis. This thesis addresses the topology-based visual analysis of high-dimensional point clouds for both the time-invariant and the time-varying case. Time-invariant means that the points do not change in their number or positions. That is, the analyst explores the clustering of a fixed and constant set of points. The extension to the time-varying case implies the analysis of a varying clustering, where clusters newly appear, merge or split, or vanish. Especially for high-dimensional data, both tracking, which means relating features over time, and visualizing changing structure are difficult problems to solve.
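
The idea that clustering structure can be read off in the data's original domain, using distances only, can be illustrated with a minimal sketch. It is not the thesis's method: it simply links every pair of points closer than a chosen radius and reports the connected components of that graph. The function name `threshold_clusters` and the parameter `radius` are hypothetical.

```python
import math
from itertools import combinations


def threshold_clusters(points, radius):
    """Group points into connected components of the graph that links
    every pair of points closer than `radius` (Euclidean distance).
    Illustrative only; `radius` is a hypothetical tuning parameter."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the representative, shortening the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # The test works in any dimension; no projection is involved.
    for i, j in combinations(range(len(points)), 2):
        if math.dist(points[i], points[j]) < radius:
            parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(points)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())
```

Because the result is a partition of point indices rather than a set of coordinates, it can be shown in two dimensions without the occlusion that a geometric projection of the points would introduce.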

    Adaptive Methods for Robust Document Image Understanding

    A vast amount of digital document material is continuously being produced as part of major digitization efforts around the world. In this context, generic and efficient automatic solutions for document image understanding are a pressing necessity. We propose a generic framework for document image understanding systems, usable for practically any document type available in digital form. Following the introduced workflow, we turn our attention to each of the following processing stages in turn: quality assurance, image enhancement, color reduction and binarization, skew and orientation detection, page segmentation and logical layout analysis. We review the state of the art in each area, identify current deficiencies, point out promising directions and give specific guidelines for future investigation. We address some of the identified issues by means of novel algorithmic solutions, putting special focus on generality, computational efficiency and the exploitation of all available sources of information. More specifically, we introduce the following original methods: a fully automatic detection of color reference targets in digitized material, accurate foreground extraction from color historical documents, font enhancement for hot metal typeset prints, a theoretically optimal solution for the document binarization problem from the point of view of both computational complexity and threshold selection, a layout-independent skew and orientation detection, a robust and versatile page segmentation method, a semi-automatic front page detection algorithm and a complete framework for article segmentation in periodical publications. The proposed methods are experimentally evaluated on large datasets consisting of real-life heterogeneous document scans. The obtained results show that a document understanding system combining these modules is able to robustly process a wide variety of documents with good overall accuracy.
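
To make the notion of threshold selection in binarization concrete, the sketch below shows Otsu's classic global threshold, which picks the gray level that maximizes the between-class variance of the histogram. This is a well-known baseline swapped in for illustration; it is not the binarization method proposed in the thesis. The function name `otsu_threshold` is hypothetical.

```python
def otsu_threshold(histogram):
    """Return the gray level t that maximizes between-class variance,
    with pixels <= t in one class and pixels > t in the other.
    `histogram[level]` is the number of pixels with that gray level."""
    total = sum(histogram)
    weighted_total = sum(level * count for level, count in enumerate(histogram))
    best_t, best_score = 0, -1.0
    count0 = 0  # number of pixels in the lower class
    sum0 = 0    # sum of gray levels in the lower class
    for t, count in enumerate(histogram):
        count0 += count
        sum0 += t * count
        count1 = total - count0
        if count0 == 0 or count1 == 0:
            continue  # one class is empty, variance undefined
        mean0 = sum0 / count0
        mean1 = (weighted_total - sum0) / count1
        # Proportional to the between-class variance (constant factor dropped).
        score = count0 * count1 * (mean0 - mean1) ** 2
        if score > best_score:
            best_t, best_score = t, score
    return best_t
```

A single pass over the histogram suffices because the class counts and sums are updated incrementally, so the cost is linear in the number of gray levels.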

    Strokes2Surface: Recovering Curve Networks From 4D Architectural Design Sketches

    We present Strokes2Surface, an offline geometry reconstruction pipeline that recovers well-connected curve networks from imprecise 4D sketches to bridge concept design and digital modeling stages in architectural design. The input to our pipeline consists of 3D strokes' polyline vertices and their timestamps as the 4th dimension, along with additional metadata recorded throughout sketching. Inspired by architectural sketching practices, our pipeline combines a classifier and two clustering models to achieve its goal. First, using a set of hand-engineered features extracted from the sketch, the classifier labels each stroke as either depicting a boundary (Shape stroke) or depicting an enclosed area (Scribble stroke). Next, the two clustering models parse strokes of each type into distinct groups, each representing an individual edge or face of the intended architectural object. Curve networks are then formed through topology recovery of consolidated Shape clusters and surfaced using Scribble clusters guiding the cycle discovery. Our evaluation is threefold: we confirm the usability of the Strokes2Surface pipeline in architectural design use cases via a user study, we validate our choice of features via statistical analysis and ablation studies on our collected dataset, and we compare our outputs against a range of reconstructions computed using alternative methods. Comment: 15 pages, 14 figures

    Automatic 3D Building Detection and Modeling from Airborne LiDAR Point Clouds

    Urban reconstruction, with an emphasis on man-made structure modeling, is an active research area with broad impact on several potential applications. Urban reconstruction combines photogrammetry, remote sensing, computer vision, and computer graphics. Even though a huge volume of work has been done, many problems remain unsolved. Automation is one of the key focus areas in this research. In this work, a fast, completely automated method to create 3D watertight building models from airborne LiDAR (Light Detection and Ranging) point clouds is presented. The developed method analyzes the scene content and produces multi-layer rooftops, with complex rigorous boundaries and vertical walls that connect rooftops to the ground. The graph cuts algorithm, based on a local analysis of the properties of the local implicit surface patch, is used to separate vegetative elements from the rest of the scene content. The ground terrain and building rooftop footprints are then extracted using a two-step hierarchical Euclidean clustering strategy developed for this purpose. The method presented here adopts a divide-and-conquer scheme. Once the building footprints are segmented from the terrain and vegetative areas, the whole scene is divided into individual independent processing units which represent potential points on the rooftop. For each individual building region, significant features on the rooftop are further detected using a specifically designed region-growing algorithm with surface smoothness constraints. The principal orientation of each building rooftop feature is calculated using a minimum bounding box fitting technique, and is used to guide the refinement of shapes and boundaries of the rooftop parts. Boundaries for all of these features are refined for the purpose of producing a strict description.
Once the description of the rooftops is achieved, polygonal mesh models are generated by creating surface patches with outlines defined by detected vertices to produce triangulated mesh models. These triangulated mesh models are suitable for many applications, such as 3D mapping, urban planning and augmented reality.
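
The minimum bounding box step can be sketched as follows, assuming the rooftop feature is given as 2D footprint points. The sketch sweeps candidate rotation angles and keeps the one whose axis-aligned bounding box has the smallest area; this brute-force search is an illustrative stand-in, not the fitting technique used in the work, and `principal_orientation` and `steps` are hypothetical names.

```python
import math


def principal_orientation(points, steps=180):
    """Return the angle in degrees, in [0, 90), by which the 2D point set
    is rotated relative to its smallest-area bounding rectangle.
    Angles are sampled at 90/steps degree resolution (an assumption)."""
    best_angle, best_area = 0.0, float("inf")
    for k in range(steps):
        theta = (math.pi / 2) * k / steps
        c, s = math.cos(theta), math.sin(theta)
        # Rotate the points by -theta and measure the axis-aligned box.
        xs = [x * c + y * s for x, y in points]
        ys = [-x * s + y * c for x, y in points]
        area = (max(xs) - min(xs)) * (max(ys) - min(ys))
        if area < best_area:
            best_angle, best_area = math.degrees(theta), area
    return best_angle
```

Only the range [0, 90) needs to be searched because a rectangle's bounding box repeats every quarter turn. The recovered angle can then be used to snap boundary segments to the dominant building direction.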

    Text Segmentation in Web Images Using Colour Perception and Topological Features

    The research presented in this thesis addresses the problem of text segmentation in Web images. Text is routinely created in image form (headers, banners etc.) on Web pages, as an attempt to overcome the stylistic limitations of HTML. This text, however, has a potentially high semantic value in terms of indexing and searching for the corresponding Web pages. As current search engine technology does not allow for text extraction and recognition in images, the text in image form is ignored. Moreover, it is desirable to obtain a uniform representation of all visible text of a Web page (for applications such as voice browsing or automated content analysis). This thesis presents two methods for text segmentation in Web images using colour perception and topological features. The nature of Web images and the problems they pose for text segmentation are described, and a study is performed to assess the magnitude of the problem and establish the need for automated text segmentation methods. Two segmentation methods are subsequently presented: the Split-and-Merge segmentation method and the Fuzzy segmentation method. Although each method approaches it in a distinctly different way, the safe assumption that a human being should be able to read the text in any given Web image is the foundation of both methods' reasoning. This anthropocentric character of the methods, along with the use of topological features of connected components, constitutes the underlying working principle of the methods. An approach for classifying the connected components resulting from the segmentation methods as either characters or parts of the background is also presented.
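
Both methods operate on connected components, so a minimal sketch of how such components can be extracted from a binary mask may help. It uses 4-connectivity and a breadth-first flood fill; the choice of connectivity and the name `connected_components` are assumptions for illustration, not details taken from the thesis.

```python
from collections import deque


def connected_components(mask):
    """Find 4-connected foreground regions of a binary grid.
    `mask` is a list of rows with truthy values for foreground pixels.
    Returns a list of components, each a list of (row, col) pixels."""
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Start a new component and flood-fill it breadth-first.
            seen[r][c] = True
            queue = deque([(r, c)])
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append(pixels)
    return components
```

Per-component measurements such as pixel count or bounding box can then be computed from each pixel list and used as features when deciding whether a component is a character or background.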

    Surface Remeshing and Applications

    Due to the focus of popular graphics accelerators, triangle meshes remain the primary representation for 3D surfaces. They are the simplest form of interpolation between surface samples, which may have been acquired with a laser scanner, computed from a 3D scalar field resolved on a regular grid, or identified on slices of medical data. Typical methods for the generation of triangle meshes from raw data attempt to lose as little information as possible, so that the resulting surface models can be used in the widest range of scenarios. When such a general-purpose model has to be used in a particular application context, however, a pre-processing step is often worth considering. In some cases, it is convenient to slightly modify the geometry and/or the connectivity of the mesh, so that further processing can take place more easily. Other applications may require the mesh to have a pre-defined structure, which is often different from that of the original general-purpose mesh. The central focus of this thesis is the automatic remeshing of highly detailed surface triangulations. Besides a thorough discussion of state-of-the-art applications such as real-time rendering and simulation, new approaches are proposed which use remeshing for topological analysis, flexible mesh generation and 3D compression. Furthermore, innovative methods are introduced to post-process polygonal models in order to recover information which was lost, or hidden, by a prior remeshing process. Besides the technical contributions, this thesis aims at showing that surface remeshing is much more useful than it may seem at first sight, as it represents a nearly fundamental step for making several applications feasible in practice.

    Laser-Based Detection and Tracking of Moving Obstacles to Improve Perception of Unmanned Ground Vehicles

    The goal of this thesis is to develop a system that improves the perception stage of heterogeneous unmanned ground vehicles (UGVs), thereby achieving robust navigation in terms of safety and energy savings in different real environments, both indoor and outdoor. Perception must deal with static and dynamic obstacles using heterogeneous sensors, such as odometry, a laser range sensor (LIDAR), an inertial measurement unit (IMU) and a global positioning system (GPS), to obtain information about the environment with the highest possible accuracy, which allows the planning and obstacle avoidance stages to be improved. To achieve this goal, a dynamic obstacle mapping stage (DOMap) is proposed that contains the information on static and dynamic obstacles. The proposal is based on an extension of the Bayesian occupancy filter (BOF) that includes non-discretized velocities. Velocities are detected with optical flow on a grid of discretized LIDAR measurements. In addition, occlusions between obstacles are handled and a multi-hypothesis tracking stage is added, improving the robustness of the proposal (iDOMap). The proposal has been tested in simulated and real environments with different robotic platforms, including commercial platforms and the platform (PROPINA) developed in this thesis to improve collaboration between teams of humans and robots within the ABSYNTHE project. Finally, methods have been proposed to calibrate the position of the LIDAR and to improve odometry with an IMU.
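
The Bayesian reasoning behind an occupancy grid can be sketched for a single cell. The example below is the standard static log-odds update, a much simpler relative of the Bayesian occupancy filter named above: it carries no velocity estimate and does no tracking. The sensor model values `p_hit` and `p_miss` are hypothetical.

```python
import math


def update_log_odds(prior_log_odds, hit, p_hit=0.7, p_miss=0.4):
    """Apply one Bayesian update to a cell's occupancy belief in log-odds form.
    `hit` is True when the range sensor reports an obstacle in the cell.
    `p_hit` and `p_miss` are assumed inverse sensor model probabilities."""
    p = p_hit if hit else p_miss
    return prior_log_odds + math.log(p / (1.0 - p))


def probability(log_odds):
    """Convert a log-odds value back to an occupancy probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))
```

Starting from log-odds 0 (probability 0.5), repeated hits push the belief toward occupied and repeated misses toward free, and each update is a single addition per cell.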

    A Study on Object Detection and Classification Techniques Based on 3D Point Cloud Data for Autonomous Driving

    Doctoral dissertation, Seoul National University Graduate School, Department of Electrical and Computer Engineering, February 2017. Seung-Woo Seo (서승우).

Among the sensors available for developing perception algorithms for automated driving vehicles, a 3D LIDAR provides 3D surface information of objects with the highest position accuracy. For automated driving, accurate surface information gives the following benefits: 1) accurate position information, which is itself quite useful for collision avoidance, is provided stably regardless of illumination conditions, because the LIDAR is an active sensor; 2) the surface information can provide precise 3D shape-oriented features for object classification. Motivated by these characteristics, this dissertation proposes three algorithms for perception in automated driving vehicles based on the 3D LIDAR. The very first step in using the 3D LIDAR as a perception sensor is segmentation, which transforms a stream of LIDAR measurements into multiple point groups, where each point group indicates an individual object near the sensor. In chapter 2, a real-time and accurate segmentation is proposed. In particular, Gaussian Process regression is used to solve a problem called over-segmentation, which increases false positives by partitioning an object into multiple portions. The segmentation result can be used as input to another perception algorithm, such as object classification, which is required for designing more human-like driving strategies. For example, it is important to recognize pedestrians in urban driving environments because avoiding collisions with pedestrians is nearly a top priority. In chapter 3, we propose a pedestrian recognition algorithm based on a Deep Neural Network architecture that learns appearance variation. Another traffic participant that should be recognized with high priority is a vehicle.
Because vehicle types whose appearances differ, such as sedans, buses, or trucks, are present on the road, vehicles must be detected with similar performance regardless of type. In chapter 4, we propose an algorithm that makes use of a common appearance of vehicles to solve the problem. To improve performance, a monocular camera is additionally employed, and the information from both sensors is integrated by a Dempster-Shafer Theory framework.

Contents:
Chapter 1 Introduction
  1.1 Background and Motivations
  1.2 Contributions and Outline of the Dissertation
    1.2.1 Real-time and Accurate Segmentation of 3D Point Clouds based on Gaussian Process Regression
    1.2.2 Pedestrian Recognition Based on Appearance Variation Learning
    1.2.3 Vehicle Recognition using a Common Appearance Captured by a 3D LIDAR and a Monocular Camera
Chapter 2 Real-time and Accurate Segmentation of 3D Point Clouds based on Gaussian Process Regression
  2.1 Introduction
  2.2 Related Work
  2.3 Framework Overview
  2.4 Clustering of Non-ground Points
    2.4.1 Graph Construction
    2.4.2 Clustering of Points on Vertical Surface
    2.4.3 Cluster Extension
  2.5 Accuracy Enhancement
    2.5.1 Approach to Handling Over-segmentation
    2.5.2 Handling Over-segmentation with GP Regression
    2.5.3 Learning Hyperparameters
  2.6 Experiments
    2.6.1 Experiment Environment
    2.6.2 Evaluation Metrics
    2.6.3 Processing Time
    2.6.4 Accuracy on Various Driving Environments
    2.6.5 Impact on Tracking
  2.7 Conclusion
Chapter 3 Pedestrian Recognition Based on Appearance Variation Learning
  3.1 Introduction
  3.2 Related Work
  3.3 Appearance Variation Learning
    3.3.1 Primal Input Data for the Proposed Architecture
    3.3.2 Learning Spatial Features from Appearance
    3.3.3 Learning Appearance Variation
    3.3.4 Classification
    3.3.5 Data Augmentation
    3.3.6 Implementation Detail
  3.4 Experiments
    3.4.1 Experimental Environment
    3.4.2 Experimental Results
  3.5 Conclusions and Future Works
Chapter 4 Vehicle Recognition using a Common Appearance Captured by a 3D LIDAR and a Monocular Camera
  4.1 Introduction
  4.2 Related Work
  4.3 Vehicle Recognition
    4.3.1 Point Cloud Processing
    4.3.2 Image Processing
    4.3.3 Dempster-Shafer Theory (DST) for Information Fusion
  4.4 Experiments
  4.5 Conclusion
Chapter 5 Conclusion
Bibliography
Abstract in Korean
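
The fusion of LIDAR and camera evidence by Dempster-Shafer theory can be illustrated with Dempster's rule of combination. The sketch below is a generic statement of the rule, not the dissertation's implementation; the function name `combine`, the hypothesis labels, and the mass values in the usage note are hypothetical.

```python
def combine(m1, m2):
    """Combine two mass functions with Dempster's rule.
    Each mass function maps a frozenset of hypotheses to its mass,
    and the masses of each function are assumed to sum to 1."""
    combined = {}
    conflict = 0.0
    for a, mass_a in m1.items():
        for b, mass_b in m2.items():
            product = mass_a * mass_b
            intersection = a & b
            if intersection:
                combined[intersection] = combined.get(intersection, 0.0) + product
            else:
                # Mass assigned to contradictory hypotheses.
                conflict += product
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    # Renormalize so the remaining masses sum to 1.
    return {h: m / (1.0 - conflict) for h, m in combined.items()}
```

For example, with `V = frozenset({"vehicle"})`, `O = frozenset({"other"})` and `U = V | O`, combining a LIDAR mass `{V: 0.6, U: 0.4}` with a camera mass `{V: 0.7, O: 0.1, U: 0.2}` concentrates most of the fused mass on `V`. Mass left on `U` expresses ignorance, which is what distinguishes this framework from a plain probabilistic average.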