A High-Performance Domain-Specific Language and Code Generator for General N-body Problems
General N-body problems are a set of problems in which an update to a single element in the system depends on every other element. N-body problems are ubiquitous, with applications in domains ranging from scientific computing simulations in molecular dynamics, astrophysics, acoustics, and fluid dynamics to computer vision, data mining, and machine learning. Different N-body algorithms have been designed and implemented in these fields. However, there is a large gap between the algorithm one designs on paper and the code that runs efficiently on a parallel system: writing fast, parallel, and scalable code for these problems is time-consuming. At the same time, the sheer scale and growth of modern scientific datasets necessitate exploiting both parallel and approximation algorithms wherever accuracy can be traded off for performance. The main problem we tackle in this thesis is how to automatically generate asymptotically optimal N-body algorithms from a high-level specification of the problem. We combine the body of work in performance optimization, compilers, and the domain of N-body problems to build a unified system in which domain scientists can write programs at a high level while attaining the performance of code written by an expert at a low level. To generate high-performance, scalable code for this class of problems, we take the following steps in this thesis. First, we propose a unified algorithmic framework named PASCAL to address the challenge of designing a general algorithmic template for the class of N-body problems. PASCAL utilizes space-partitioning trees and user-controlled pruning/approximations to reduce the asymptotic runtime complexity from linear to logarithmic in the number of data points.
In PASCAL, we design an algorithm that automatically generates conditions for pruning or approximation of an N-body problem from the problem's definition. To evaluate PASCAL, we developed tree-based algorithms for six well-known problems: k-nearest neighbors, range search, minimum spanning tree, kernel density estimation, expectation maximization, and Hausdorff distance. We show that applying domain-specific optimizations and parallelization to the algorithms written in PASCAL achieves 10x to 230x speedups over state-of-the-art libraries on a dual-socket Intel Xeon processor with 16 cores on real-world datasets. Second, we extend the PASCAL framework to build PASCAL-X, which adds support for NUMA-aware parallelization. PASCAL-X also presents insights on the influence of tuning parameters. Tuning parameters of the space-partitioning trees, such as the leaf size (which influences the shape of the tree) and the cut-off level (which controls the granularity of tasks), yield performance improvements of up to 4.6x. A key goal is to generate scalable and high-performance code automatically without sacrificing productivity. This implies minimizing the effort users have to put in to obtain the desired high-performance code. Another critical factor is adaptivity: the amount of effort required to extend high-performance code generation to new N-body problems. Finally, we consider these factors and develop a domain-specific language and code generator named Portal, which is built on top of PASCAL-X. Portal's language design is inspired by the mathematical representation of N-body problems, resulting in an intuitive language for rapid implementation of a variety of problems. Portal's back-end is designed and implemented to generate optimized, parallel, and scalable implementations for multi-core systems.
We demonstrate that the performance achieved using Portal is comparable to that of expert hand-optimized code while providing productivity for domain scientists. For instance, using Portal for the k-nearest neighbors problem achieves performance similar to the hand-optimized code while reducing the lines of code by 68x. To the best of our knowledge, there are no known libraries or frameworks that implement parallel asymptotically optimal algorithms for the class of general N-body problems, and this thesis primarily aims to fill this gap. Finally, we present a case study of Portal for the real-world problem of face clustering. In this case study, we show that Portal not only provides a fast solution for the face clustering problem with accuracy similar to the state-of-the-art algorithm, but also demonstrates productivity: the face clustering algorithm is implemented in only 14 lines of Portal code.
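The pruning idea at the heart of such tree-based N-body algorithms can be illustrated with a minimal, hypothetical sketch (this is not PASCAL's generated code): a kd-tree k-nearest-neighbors query that skips any subtree whose region provably cannot contain a point closer than the current k-th best.

```python
# Minimal kd-tree k-nearest-neighbors sketch illustrating tree-based pruning
# for N-body problems (illustrative only; not PASCAL's generated code).
import heapq

class KDNode:
    def __init__(self, points, depth=0, leaf_size=8):
        self.axis = depth % len(points[0])
        if len(points) <= leaf_size:
            self.points, self.left, self.right = points, None, None
        else:
            points = sorted(points, key=lambda p: p[self.axis])
            mid = len(points) // 2
            self.split = points[mid][self.axis]
            self.points = None
            self.left = KDNode(points[:mid], depth + 1, leaf_size)
            self.right = KDNode(points[mid:], depth + 1, leaf_size)

def knn(root, q, k):
    """Return the k nearest points to q as (squared_distance, point) pairs,
    pruning subtrees whose region cannot contain a closer point."""
    heap = []  # max-heap via negated squared distances

    def visit(node):
        if node.points is not None:  # leaf: exhaustive scan
            for p in node.points:
                d2 = sum((a - b) ** 2 for a, b in zip(p, q))
                if len(heap) < k:
                    heapq.heappush(heap, (-d2, p))
                elif d2 < -heap[0][0]:
                    heapq.heapreplace(heap, (-d2, p))
            return
        near, far = ((node.left, node.right) if q[node.axis] < node.split
                     else (node.right, node.left))
        visit(near)
        # Pruning condition: visit the far child only if the splitting plane
        # is closer than the current k-th nearest distance.
        if len(heap) < k or (q[node.axis] - node.split) ** 2 < -heap[0][0]:
            visit(far)

    visit(root)
    return sorted((-d2, p) for d2, p in heap)
```

Because most far subtrees fail the pruning test, each query touches only a logarithmic number of nodes instead of all N points, which is exactly the asymptotic saving the framework targets.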
Visual Analysis of High-Dimensional Point Clouds using Topological Abstraction
This thesis is about visualizing a kind of data that is trivial to process by computers but difficult to imagine by humans, because nature does not allow for intuition with this type of information: high-dimensional data. Such data often result from representing observations of objects under various aspects or with different properties. In many applications, a typical, laborious task is to find related objects or to group those that are similar to each other. One classic solution for this task is to imagine the data as vectors in a Euclidean space with object variables as dimensions. Utilizing Euclidean distance as a measure of similarity, objects with similar properties and values accumulate into groups, so-called clusters, that are exposed by cluster analysis on the high-dimensional point cloud. Because similar vectors can be thought of as objects that are alike in terms of their attributes, the point cloud's structure and individual cluster properties, like their size or compactness, summarize data categories and their relative importance. The contribution of this thesis is a novel analysis approach for visual exploration of high-dimensional point clouds without suffering from structural occlusion. The work is based on two key concepts: The first idea is to discard those geometric properties that cannot be preserved and thus lead to the typical artifacts. Topological concepts are used instead to shift the focus away from a point-centered view of the data to a more structure-centered perspective. The advantage is that topology-driven clustering information can be extracted in the data's original domain and be preserved without loss in low dimensions. The second idea is to split the analysis into a topology-based global overview and a subsequent geometric local refinement.
The occlusion-free overview enables the analyst to identify features and to link them to other visualizations that permit analysis of those properties not captured by the topological abstraction, e.g. cluster shape or value distributions in particular dimensions or subspaces. The advantage of separating structure from data-point analysis is that restricting local analysis to data subsets significantly reduces artifacts and the visual complexity of standard techniques. That is, the additional topological layer enables the analyst to identify structure that was hidden before and to focus on particular features by suppressing irrelevant points during local feature analysis. This thesis addresses the topology-based visual analysis of high-dimensional point clouds for both the time-invariant and the time-varying case. Time-invariant means that the points do not change in number or position; that is, the analyst explores the clustering of a fixed and constant set of points. The extension to the time-varying case implies the analysis of a varying clustering, where clusters newly appear, merge or split, or vanish. Especially for high-dimensional data, both tracking (relating features over time) and visualizing the changing structure are difficult problems to solve.
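As a toy illustration of the structure-centered idea (not the thesis' actual topological machinery), connectivity at a fixed scale is a property that can be computed directly in the data's original high-dimensional domain and then carried to any low-dimensional view without projection artifacts:

```python
# Toy illustration: clusters defined purely by connectivity at scale eps are a
# structural (topological) property of the point cloud in its original domain.
# They can be computed in any dimension, independent of later visualization.

def eps_clusters(points, eps):
    """Union-find single-linkage clustering at scale eps."""
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    def union(i, j):
        parent[find(i)] = find(j)

    eps2 = eps * eps
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if sum((a - b) ** 2 for a, b in zip(points[i], points[j])) <= eps2:
                union(i, j)

    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())
```

Run on two well-separated 5-dimensional blobs, the function recovers the two clusters regardless of how (or whether) the points are later projected to 2D for display.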
Adaptive Methods for Robust Document Image Understanding
A vast amount of digital document material is continuously being produced as part of major digitization efforts around the world. In this context, generic and efficient automatic solutions for document image understanding represent a stringent necessity. We propose a generic framework for document image understanding systems, usable for practically any document types available in digital form. Following the introduced workflow, we shift our attention to each of the following processing stages in turn: quality assurance, image enhancement, color reduction and binarization, skew and orientation detection, page segmentation, and logical layout analysis. We review the state of the art in each area, identify current deficiencies, point out promising directions, and give specific guidelines for future investigation. We address some of the identified issues by means of novel algorithmic solutions, putting special focus on generality, computational efficiency, and the exploitation of all available sources of information. More specifically, we introduce the following original methods: fully automatic detection of color reference targets in digitized material, accurate foreground extraction from color historical documents, font enhancement for hot-metal typeset prints, a theoretically optimal solution for the document binarization problem from both the computational-complexity and the threshold-selection points of view, layout-independent skew and orientation detection, a robust and versatile page segmentation method, a semi-automatic front page detection algorithm, and a complete framework for article segmentation in periodical publications. The proposed methods are experimentally evaluated on large datasets consisting of real-life heterogeneous document scans. The obtained results show that a document understanding system combining these modules is able to robustly process a wide variety of documents with good overall accuracy.
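The thesis' own binarization algorithm is not reproduced here, but the classic Otsu method illustrates what optimal threshold selection means in this context: choose the histogram threshold that maximizes the between-class variance of the foreground/background split.

```python
# Classic Otsu thresholding, shown as a standard example of globally optimal
# threshold selection for document binarization (illustrative; not the thesis'
# own algorithm). One pass over the 256-bin histogram finds the threshold
# maximizing the between-class variance.

def otsu_threshold(gray_pixels):
    hist = [0] * 256
    for v in gray_pixels:
        hist[v] += 1
    total = len(gray_pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg, sum_bg = 0, 0.0
    for t in range(256):
        w_bg += hist[t]          # background pixel count at threshold t
        if w_bg == 0:
            continue
        w_fg = total - w_bg      # foreground pixel count
        if w_fg == 0:
            break
        sum_bg += t * hist[t]
        mu_bg = sum_bg / w_bg
        mu_fg = (sum_all - sum_bg) / w_fg
        var_between = w_bg * w_fg * (mu_bg - mu_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t  # binarize with: pixel > best_t -> foreground
```

On a bimodal document histogram (dark ink, light paper), the returned threshold falls between the two modes, separating text from background.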
Strokes2Surface: Recovering Curve Networks From 4D Architectural Design Sketches
We present Strokes2Surface, an offline geometry reconstruction pipeline that recovers well-connected curve networks from imprecise 4D sketches to bridge the concept design and digital modeling stages in architectural design. The input to our pipeline consists of 3D strokes' polyline vertices and their timestamps as the 4th dimension, along with additional metadata recorded throughout sketching. Inspired by architectural sketching practices, our pipeline combines a classifier and two clustering models to achieve its goal. First, using a set of hand-engineered features extracted from the sketch, the classifier recognizes the type of individual strokes, distinguishing those depicting boundaries (Shape strokes) from those depicting enclosed areas (Scribble strokes). Next, the two clustering models parse strokes of each type into distinct groups, each representing an individual edge or face of the intended architectural object. Curve networks are then formed through topology recovery of consolidated Shape clusters and surfaced using Scribble clusters guiding the cycle discovery. Our evaluation is threefold: we confirm the usability of the Strokes2Surface pipeline in architectural design use cases via a user study, we validate our choice of features via statistical analysis and ablation studies on our collected dataset, and we compare our outputs against a range of reconstructions computed using alternative methods.
Comment: 15 pages, 14 figures
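The kind of hand-engineered stroke features described above can be illustrated with a hypothetical toy example (the actual feature set and classifier are those validated in the thesis' ablation studies): an area-filling Scribble stroke traverses its bounding box many times over, while a boundary-depicting Shape stroke does not.

```python
# Hypothetical illustration of hand-engineered stroke features for the
# Shape/Scribble distinction (toy features and threshold; not the paper's
# actual classifier).
import math

def stroke_features(vertices):
    """vertices: list of (x, y, z) polyline points of one stroke."""
    length = sum(math.dist(a, b) for a, b in zip(vertices, vertices[1:]))
    lo = [min(v[i] for v in vertices) for i in range(3)]
    hi = [max(v[i] for v in vertices) for i in range(3)]
    diagonal = math.dist(lo, hi)  # bounding-box diagonal
    return {"length": length,
            "fill_ratio": length / diagonal if diagonal else 0.0}

def classify_stroke(vertices, fill_threshold=3.0):
    # A stroke many times longer than its bounding-box diagonal is treated as
    # an area-filling Scribble; otherwise as a boundary-depicting Shape stroke.
    if stroke_features(vertices)["fill_ratio"] > fill_threshold:
        return "Scribble"
    return "Shape"
```

A nearly straight polyline has a fill ratio close to 1 and classifies as Shape; a dense zigzag filling a region has a ratio far above the threshold and classifies as Scribble.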
Automatic 3D Building Detection and Modeling from Airborne LiDAR Point Clouds
Urban reconstruction, with an emphasis on man-made structure modeling, is an active research area with broad impact on several potential applications. Urban reconstruction combines photogrammetry, remote sensing, computer vision, and computer graphics. Even though a huge volume of work has been done, many problems still remain unsolved. Automation is one of the key focus areas in this research. In this work, a fast, completely automated method to create 3D watertight building models from airborne LiDAR (Light Detection and Ranging) point clouds is presented. The developed method analyzes the scene content and produces multi-layer rooftops, with complex rigorous boundaries and vertical walls that connect the rooftops to the ground. The graph cuts algorithm is used to separate vegetative elements from the rest of the scene content, based on local analysis of the properties of the local implicit surface patch. The ground terrain and building rooftop footprints are then extracted using the developed strategy, a two-step hierarchical Euclidean clustering. The method presented here adopts a divide-and-conquer scheme. Once the building footprints are segmented from the terrain and vegetative areas, the whole scene is divided into individual, independent processing units, each representing a potential building rooftop. For each individual building region, significant features on the rooftop are further detected using a specifically designed region-growing algorithm with surface smoothness constraints. The principal orientation of each building rooftop feature is calculated using a minimum-bounding-box fitting technique and is used to guide the refinement of the shapes and boundaries of the rooftop parts. Boundaries of all these features are refined to produce a precise description.
Once the description of the rooftops is achieved, polygonal mesh models are generated by creating surface patches whose outlines are defined by the detected vertices, producing triangulated mesh models. These triangulated mesh models are suitable for many applications, such as 3D mapping, urban planning, and augmented reality.
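A region-growing step with a smoothness constraint, as used above for rooftop feature detection, can be sketched in simplified form (a 2D height grid with a hypothetical height-difference threshold standing in for the surface-based criterion):

```python
# Illustrative sketch of region growing with a smoothness constraint
# (simplified: a height grid and a hypothetical height-step threshold instead
# of the thesis' surface-normal criterion on 3D points).
from collections import deque

def grow_regions(height_grid, max_step=0.3):
    rows, cols = len(height_grid), len(height_grid[0])
    label = [[-1] * cols for _ in range(rows)]
    current = 0
    for r in range(rows):
        for c in range(cols):
            if label[r][c] != -1:
                continue
            # BFS from an unlabeled seed; absorb 4-neighbors whose height
            # difference stays under the smoothness threshold.
            queue = deque([(r, c)])
            label[r][c] = current
            while queue:
                y, x = queue.popleft()
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and label[ny][nx] == -1
                            and abs(height_grid[ny][nx] - height_grid[y][x]) <= max_step):
                        label[ny][nx] = current
                        queue.append((ny, nx))
            current += 1
    return label, current
```

Two flat plateaus at different heights end up with different labels: the smoothness constraint stops the growth at the height discontinuity, which is exactly how distinct rooftop layers are separated.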
Text Segmentation in Web Images Using Colour Perception and Topological Features
The research presented in this thesis addresses the problem of text segmentation in Web images. Text is routinely created in image form (headers, banners, etc.) on Web pages, as an attempt to overcome the stylistic limitations of HTML. This text, however, has a potentially high semantic value in terms of indexing and searching for the corresponding Web pages. As current search engine technology does not allow for text extraction and recognition in images, text in image form is ignored. Moreover, it is desirable to obtain a uniform representation of all visible text of a Web page (for applications such as voice browsing or automated content analysis). This thesis presents two methods for text segmentation in Web images using colour perception and topological features. The nature of Web images and the implicit problems of text segmentation are described, and a study is performed to assess the magnitude of the problem and establish the need for automated text segmentation methods. Two segmentation methods are subsequently presented: the Split-and-Merge segmentation method and the Fuzzy segmentation method. Although approached in a distinctly different way in each method, the safe assumption that a human being should be able to read the text in any given Web image is the foundation of both methods' reasoning. This anthropocentric character of the methods, along with the use of topological features of connected components, comprises the underlying working principles of the methods. An approach for classifying the connected components resulting from the segmentation methods as either characters or parts of the background is also presented.
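The connected-component analysis underlying the character/background classification step can be sketched as follows (illustrative only; the thesis' methods operate on colour-segmented images rather than a plain binary mask):

```python
# Minimal connected-component extraction from a binarized image using
# 8-connectivity flood fill (illustrative sketch, not the thesis' method).
from collections import deque

def connected_components(binary):
    """binary: 2D list of 0/1; returns a list of components, each a list of (row, col)."""
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] and not seen[r][c]:
                comp, queue = [], deque([(r, c)])
                seen[r][c] = True
                while queue:
                    y, x = queue.popleft()
                    comp.append((y, x))
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = y + dy, x + dx
                            if (0 <= ny < rows and 0 <= nx < cols
                                    and binary[ny][nx] and not seen[ny][nx]):
                                seen[ny][nx] = True
                                queue.append((ny, nx))
                components.append(comp)
    return components
```

Each extracted component is then a candidate character whose topological features (e.g. size, holes, adjacency) can drive the character-versus-background classification.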
Surface Remeshing and Applications
Due to the focus of popular graphics accelerators, triangle meshes remain the primary representation for 3D surfaces. They are the simplest form of interpolation between surface samples, which may have been acquired with a laser scanner, computed from a 3D scalar field resolved on a regular grid, or identified on slices of medical data. Typical methods for the generation of triangle meshes from raw data attempt to lose as little information as possible, so that the resulting surface models can be used in the widest range of scenarios. When such a general-purpose model has to be used in a particular application context, however, a pre-processing step is often worth considering. In some cases, it is convenient to slightly modify the geometry and/or the connectivity of the mesh, so that further processing can take place more easily. Other applications may require the mesh to have a pre-defined structure, which is often different from that of the original general-purpose mesh. The central focus of this thesis is the automatic remeshing of highly detailed surface triangulations. Besides a thorough discussion of state-of-the-art applications such as real-time rendering and simulation, new approaches are proposed which use remeshing for topological analysis, flexible mesh generation, and 3D compression. Furthermore, innovative methods are introduced to post-process polygonal models in order to recover information which was lost, or hidden, by a prior remeshing process. Besides the technical contributions, this thesis aims at showing that surface remeshing is much more useful than it may seem at first sight, as it represents a nearly fundamental step for making several applications feasible in practice.
Laser-Based Detection and Tracking of Moving Obstacles to Improve Perception of Unmanned Ground Vehicles
The goal of this thesis is to develop a system that improves the perception stage of heterogeneous unmanned ground vehicles (UGVs), thereby achieving navigation that is robust in terms of safety and energy savings in different real environments, both indoor and outdoor. Perception must deal with static and dynamic obstacles using heterogeneous sensors, such as odometry, a laser range finder (LIDAR), an inertial measurement unit (IMU), and a global positioning system (GPS), in order to obtain environmental information with the highest possible accuracy, allowing the planning and obstacle-avoidance stages to be improved.
To achieve this goal, a dynamic obstacle mapping stage (DOMap) is proposed that contains the information of both static and dynamic obstacles. The proposal is based on an extension of the Bayesian Occupancy Filter (BOF) that includes non-discretized velocities. Velocity detection is obtained by applying Optical Flow over a grid of discretized LIDAR measurements. In addition, occlusions between obstacles are managed, and a multi-hypothesis tracking stage is added, improving the robustness of the proposal (iDOMap).
The proposal has been tested in simulated and real environments with different robotic platforms, including commercial platforms and the platform (PROPINA) developed in this thesis to improve collaboration between teams of humans and robots within the ABSYNTHE project. Finally, methods have been proposed to calibrate the LIDAR position and to improve odometry with an IMU.
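The occupancy-grid machinery that Bayesian Occupancy Filter extensions such as DOMap build on can be illustrated with a minimal log-odds cell update (the sensor-model constants here are hypothetical, and the velocity estimation and tracking stages are omitted):

```python
# Illustrative occupancy-grid update in log-odds form, the basic machinery
# that BOF-style methods such as DOMap extend (hypothetical inverse sensor
# model; velocity estimation and multi-hypothesis tracking not reproduced).
import math

L_OCC = math.log(0.7 / 0.3)   # evidence added when a cell is hit
L_FREE = math.log(0.3 / 0.7)  # evidence added when a cell is seen free

def update_cell(log_odds, hit):
    """Bayes update of one cell: add the sensor's log-odds evidence."""
    return log_odds + (L_OCC if hit else L_FREE)

def probability(log_odds):
    return 1.0 - 1.0 / (1.0 + math.exp(log_odds))

# A cell repeatedly observed as occupied converges toward probability 1.
belief = 0.0  # prior log-odds 0 corresponds to probability 0.5
for _ in range(5):
    belief = update_cell(belief, hit=True)
```

The log-odds form makes each LIDAR update a single addition per cell, which is what keeps grid-based obstacle mapping cheap enough for real-time navigation.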
A Study on Object Detection and Classification Based on 3D Point Cloud Data for Autonomous Driving
Doctoral dissertation, Seoul National University Graduate School, Department of Electrical and Computer Engineering, February 2017. Advisor: 서승우.
A 3D LIDAR provides 3D surface information of objects with the highest position accuracy among the sensors available for developing perception algorithms for automated driving vehicles. In terms of automated driving, this accurate surface information offers the following benefits: 1) accurate position information, which is quite useful in itself for collision avoidance, is provided stably regardless of illumination conditions, because the LIDAR is an active sensor; 2) the surface information can provide precise 3D shape-oriented features for object classification. Motivated by these characteristics, in this dissertation we propose three algorithms for the perception of automated driving vehicles based on the 3D LIDAR.
The very first procedure for utilizing the 3D LIDAR as a perception sensor is segmentation, which transforms a stream of LIDAR measurements into multiple point groups, where each group indicates an individual object near the sensor. In chapter 2, a real-time and accurate segmentation method is proposed. In particular, Gaussian Process regression is used to solve the over-segmentation problem, which increases False Positives by partitioning an object into multiple portions.
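The role of Gaussian Process regression here can be illustrated with a minimal 1D sketch (arbitrary kernel hyperparameters, not the learned ones): given returns from two point groups separated by a sensing gap, the GP posterior predicts where the surface should continue, and agreement between prediction and actual measurements suggests the two groups belong to one under-segmented object.

```python
# Minimal 1D Gaussian Process regression sketch (RBF kernel) illustrating the
# over-segmentation idea: predict the surface inside a sensing gap between two
# point groups. Hyperparameters are arbitrary, not the learned ones.
import numpy as np

def gp_predict(x_train, y_train, x_test, length=1.0, sigma_f=1.0, noise=1e-2):
    def rbf(a, b):
        return sigma_f**2 * np.exp(-0.5 * (a[:, None] - b[None, :])**2 / length**2)
    K = rbf(x_train, x_train) + noise * np.eye(len(x_train))
    K_s = rbf(x_test, x_train)
    return K_s @ np.linalg.solve(K, y_train)  # GP posterior mean

# Points sampled from one smooth surface y = 0.5 * x, with a sensing gap in [2, 4]
# that splits them into two groups.
x = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 4.0, 4.5, 5.0])
y = 0.5 * x
mean = gp_predict(x, y, np.array([3.0]), length=2.0)  # predict inside the gap
```

If actual LIDAR returns near the gap fall close to the predicted surface, merging the two clusters back into one object is the consistent decision.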
The segmentation result can be utilized as input to another perception algorithm, such as object classification, which is required for designing more human-like driving strategies. For example, it is important to recognize pedestrians in urban driving environments because avoiding collisions with pedestrians is nearly a top priority. In chapter 3, we propose a pedestrian recognition algorithm based on a Deep Neural Network architecture that learns appearance variation.
Another traffic participant that should be recognized with high priority is the vehicle. Because vehicle types with differing appearances, such as sedans, buses, and trucks, are present on the road, the vehicles must be detected with similar performance regardless of type. In chapter 4, we propose an algorithm that makes use of the common appearance of vehicles to solve this problem. To improve performance, a monocular camera is additionally employed, and the information from both sensors is integrated within a Dempster-Shafer Theory framework.
Chapter 1 Introduction
1.1 Background and Motivations
1.2 Contributions and Outline of the Dissertation
1.2.1 Real-time and Accurate Segmentation of 3D Point Clouds based on Gaussian Process Regression
1.2.2 Pedestrian Recognition Based on Appearance Variation Learning
1.2.3 Vehicle Recognition using a Common Appearance Captured by a 3D LIDAR and a Monocular Camera
Chapter 2 Real-time and Accurate Segmentation of 3D Point Clouds based on Gaussian Process Regression
2.1 Introduction
2.2 Related Work
2.3 Framework Overview
2.4 Clustering of Non-ground Points
2.4.1 Graph Construction
2.4.2 Clustering of Points on Vertical Surface
2.4.3 Cluster Extension
2.5 Accuracy Enhancement
2.5.1 Approach to Handling Over-segmentation
2.5.2 Handling Over-segmentation with GP Regression
2.5.3 Learning Hyperparameters
2.6 Experiments
2.6.1 Experiment Environment
2.6.2 Evaluation Metrics
2.6.3 Processing Time
2.6.4 Accuracy on Various Driving Environments
2.6.5 Impact on Tracking
2.7 Conclusion
Chapter 3 Pedestrian Recognition Based on Appearance Variation Learning
3.1 Introduction
3.2 Related Work
3.3 Appearance Variation Learning
3.3.1 Primal Input Data for the Proposed Architecture
3.3.2 Learning Spatial Features from Appearance
3.3.3 Learning Appearance Variation
3.3.4 Classification
3.3.5 Data Augmentation
3.3.6 Implementation Detail
3.4 Experiments
3.4.1 Experimental Environment
3.4.2 Experimental Results
3.5 Conclusions and Future Works
Chapter 4 Vehicle Recognition using a Common Appearance Captured by a 3D LIDAR and a Monocular Camera
4.1 Introduction
4.2 Related Work
4.3 Vehicle Recognition
4.3.1 Point Cloud Processing
4.3.2 Image Processing
4.3.3 Dempster-Shafer Theory (DST) for Information Fusion
4.4 Experiments
4.5 Conclusion
Chapter 5 Conclusion
Bibliography
Abstract in Korean
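The Dempster-Shafer fusion named in chapter 4 above can be sketched with Dempster's rule of combination over toy mass functions (the frame and mass values are hypothetical, not the thesis' sensor models):

```python
# Dempster's rule of combination in minimal form: two mass functions over
# frozenset focal elements are combined, and conflicting mass is renormalized
# away (toy frame and masses; not the thesis' actual sensor models).

def combine(m1, m2):
    """Combine two mass functions given as {frozenset: mass} dicts."""
    combined, conflict = {}, 0.0
    for b, mb in m1.items():
        for c, mc in m2.items():
            inter = b & c
            if inter:
                combined[inter] = combined.get(inter, 0.0) + mb * mc
            else:
                conflict += mb * mc  # mass falling on the empty set
    scale = 1.0 / (1.0 - conflict)  # Dempster normalization
    return {a: v * scale for a, v in combined.items()}

# Toy fusion: LIDAR and camera evidence over the frame {vehicle, other}.
V, O = frozenset({"vehicle"}), frozenset({"other"})
THETA = V | O  # full frame of discernment (ignorance)
lidar = {V: 0.6, THETA: 0.4}
camera = {V: 0.7, THETA: 0.3}
fused = combine(lidar, camera)
```

Because both sensors lend independent support to "vehicle", the fused mass on that hypothesis exceeds either individual mass, which is the behavior that motivates DST fusion of LIDAR and camera evidence.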