595 research outputs found

    Acceleration of Computational Geometry Algorithms for High Performance Computing Based Geo-Spatial Big Data Analysis

    Geo-spatial computing and data analysis is the branch of computer science that deals with real-world location-based data. Computational geometry algorithms process geometry and shapes, and they are one of the pillars of geo-spatial computing. Real-world map and location-based data can be huge, and the data structures used to process them can be extremely large, leading to high computational costs. Furthermore, geo-spatial datasets are growing along all of the V's (Volume, Variety, Value, etc.) and are becoming larger and more complex to process, in turn demanding more computational resources. High Performance Computing breaks a problem down so that it can run in parallel on machines with massive processing power, delivering the same results much faster. This dissertation explores techniques to accelerate computational geometry algorithms and geo-spatial computing, including many-core Graphics Processing Units (GPUs), multi-core Central Processing Units (CPUs), multi-node setups with the Message Passing Interface (MPI), cache optimizations, memory and communication optimizations, load balancing, algorithmic modifications, directive-based parallelization with OpenMP or OpenACC, and vectorization with compiler intrinsics (AVX). At least one of these techniques is applied to each of the following problems: a novel method to parallelize plane-sweep-based geometric intersection on GPUs with directives; parallelization of plane-sweep-based Voronoi construction; parallelization of segment tree construction, segment tree queries, and segment-tree-based operations; and spatial autocorrelation and computation of Getis-Ord hotspots. Acceleration performance and speedup results are presented in each corresponding chapter.
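    The Getis-Ord hotspot computation mentioned above reduces, per cell, to a z-score comparing the weighted sum of neighbouring values against the global mean. A minimal sketch of the Gi* statistic, assuming a simple 1-D grid and a hypothetical binary neighbourhood weight matrix (not the dissertation's accelerated implementation):

```python
import math

def getis_ord_gi_star(values, weights, i):
    """Getis-Ord Gi* z-score for cell i.

    values  : list of observed values, one per cell
    weights : weights[i][j] = spatial weight between cells i and j
              (Gi* includes cell i itself, so weights[i][i] > 0)
    """
    n = len(values)
    xbar = sum(values) / n
    s = math.sqrt(sum(v * v for v in values) / n - xbar * xbar)
    w = weights[i]
    sw = sum(w)                                   # sum of weights at i
    swx = sum(wj * xj for wj, xj in zip(w, values))
    sw2 = sum(wj * wj for wj in w)
    num = swx - xbar * sw
    den = s * math.sqrt((n * sw2 - sw * sw) / (n - 1))
    return num / den

# Hypothetical example: a spike around index 2, binary weights covering
# each cell and its immediate neighbours.
vals = [1.0, 1.0, 9.0, 8.0, 1.0, 1.0]
n = len(vals)
W = [[1.0 if abs(i - j) <= 1 else 0.0 for j in range(n)] for i in range(n)]
print(getis_ord_gi_star(vals, W, 2))  # large positive z => hot spot
```

    Accelerating this on a GPU is natural because each cell's statistic is independent; the global terms (mean, standard deviation) are computed once by a parallel reduction.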

    Efficient Parallel and Distributed Algorithms for GIS Polygon Overlay Processing

    Polygon clipping is one of the complex operations in computational geometry. It is used in Geographic Information Systems (GIS), computer graphics, and VLSI CAD. For two polygons with n and m vertices, the number of intersections can be O(nm). In this dissertation, we present the first output-sensitive CREW PRAM algorithm, which can perform polygon clipping in O(log n) time using O(n + k + k′) processors, where n is the number of vertices, k is the number of intersections, and k′ is the number of additional temporary vertices introduced by the partitioning of the polygons. The current best algorithm, by Karinthi, Srinivas, and Almasi, does not handle self-intersecting polygons, is not output-sensitive, and must employ O(n^2) processors to achieve O(log n) time. The second parallel algorithm is an output-sensitive PRAM algorithm based on the Greiner-Hormann algorithm, with O(log n) time complexity using O(n + k) processors. This is cost-optimal when compared with the time complexity of the best-known sequential plane-sweep-based algorithm for polygon clipping. For self-intersecting polygons, the time complexity is O(((n + k) log n log log n)/p) using p processors. In addition to these parallel algorithms, the other main contributions of this dissertation are 1) multi-core and many-core implementations for clipping a pair of polygons and 2) MPI-GIS and Hadoop Topology Suite for distributed polygon overlay on a cluster of nodes. An Nvidia GPU and CUDA are used for the many-core implementation. The MPI-based system achieves a 44X speedup while processing about 600K polygons from two real-world GIS shapefiles, 1) USA Detailed Water Bodies and 2) USA Block Group Boundaries, within 20 seconds on a 32-node (8 cores each) IBM iDataPlex cluster interconnected by InfiniBand.
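    For intuition about where the O(nm) intersection work comes from, here is the classic sequential Sutherland-Hodgman clipping loop — a simple baseline, not the output-sensitive PRAM or Greiner-Hormann algorithms of the dissertation (Sutherland-Hodgman requires a convex clip polygon and does not handle self-intersection):

```python
def clip(subject, clipper):
    """Clip `subject` polygon against convex `clipper`.
    Both are lists of (x, y) vertices in counter-clockwise order."""
    def inside(p, a, b):          # is p left of the directed edge a->b?
        return (b[0]-a[0])*(p[1]-a[1]) - (b[1]-a[1])*(p[0]-a[0]) >= 0

    def intersect(p, q, a, b):    # segment p-q with the infinite line a-b
        x1, y1, x2, y2 = *p, *q
        x3, y3, x4, y4 = *a, *b
        denom = (x1-x2)*(y3-y4) - (y1-y2)*(x3-x4)
        t = ((x1-x3)*(y3-y4) - (y1-y3)*(x3-x4)) / denom
        return (x1 + t*(x2-x1), y1 + t*(y2-y1))

    out = subject
    for a, b in zip(clipper, clipper[1:] + clipper[:1]):   # each clip edge
        inp, out = out, []
        for p, q in zip(inp, inp[1:] + inp[:1]):           # each subject edge
            if inside(q, a, b):
                if not inside(p, a, b):
                    out.append(intersect(p, q, a, b))
                out.append(q)
            elif inside(p, a, b):
                out.append(intersect(p, q, a, b))
    return out

# Unit square clipped by a copy shifted by (0.5, 0.5): the result is the
# 0.5 x 0.5 overlap square.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
shifted = [(.5, .5), (1.5, .5), (1.5, 1.5), (.5, 1.5)]
print(clip(square, shifted))
```

    The doubly nested loop over clip edges and subject edges is exactly the n×m pairing that the parallel algorithms distribute across processors.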

    Increasing the performance of the Wetland DEM Ponding Model using multiple GPUs

    Due to the lack of conventional drainage systems on the Canadian Prairies, when excess water runs off the landscape after snowmelt or heavy rainfall, it may be trapped in surface depressions ranging in size from puddles to permanent wetlands and may cause local flooding. Hydrological processes play an important role in the Canadian Prairies, and hydrological simulation models help people understand past hydrological events and predict future ones. Obtaining an accurate simulation requires higher-resolution systems and larger simulation areas, which lead to larger-scale problems. However, the size of the problem is often limited by the available computational resources, and solving large systems results in unacceptable simulation durations. Improving computational efficiency and taking full advantage of available computational resources is therefore an urgent task for hydrological researchers and software developers. The Wetland DEM Ponding Model (WDPM) was developed to model the distribution of runoff water on the Canadian Prairies. It helps determine the fraction of Prairie basins contributing flow to streams as this fraction changes dynamically with water storage in the depressions. In the WDPM, the water redistribution module is the most computationally intensive part. Previously, the WDPM was parallelized for a single CPU or a single GPU to make the water redistribution module more efficient. Multi-device parallel computing is a common way to increase the available computational resources and, with an appropriate parallel algorithm, can effectively speed up an application. This thesis develops a multiple-GPU parallel algorithm and investigates efficient data transmission methods, compared to the CPU-parallel and single-GPU algorithms. A technique that overlaps communication with computation is applied to optimize the parallel computing process.
    The thesis then evaluates the new implementation from several aspects. First, the output summary and the output system are compared between the new implementation and the original one. The solution shows clear convergence as the simulation proceeds, verifying that the new implementation produces correct results. Second, the multiple-GPU code is profiled, verifying that the algorithm can be reorganized to take advantage of multiple GPUs and carry out efficient data synchronization through the optimized techniques. Finally, numerical experiments show that the new implementation improves performance when using multiple GPUs and demonstrates good scaling. In particular, on a large system, the multiple-GPU implementation produces correct output and is about 2.35 times faster with four GPUs than with one GPU.
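    The multi-GPU decomposition described above follows the standard pattern: split the DEM into subdomains, exchange one-cell "halo" boundaries each iteration, and (on real GPUs) overlap the halo transfer with the interior update. A minimal pure-Python sketch of the pattern, assuming a hypothetical 1-D Jacobi-style smoothing step as a stand-in for WDPM's water-redistribution update:

```python
def smooth_step(row):
    """One Jacobi-style averaging step on a 1-D array with fixed ends
    (a hypothetical stand-in for the water-redistribution update)."""
    return [row[0]] + [(row[i-1] + row[i] + row[i+1]) / 3
                       for i in range(1, len(row) - 1)] + [row[-1]]

def smooth_split(row, steps, mid):
    """The same computation on two subdomains with a one-cell halo
    exchange, mimicking a two-GPU domain decomposition. On real GPUs the
    interior update proceeds while halo cells are in flight, which is the
    communication/computation overlap the thesis applies."""
    left, right = row[:mid], row[mid:]
    for _ in range(steps):
        # halo exchange: each side receives its neighbour's boundary cell
        lhalo, rhalo = right[0], left[-1]
        left_ext = smooth_step(left + [lhalo])
        right_ext = smooth_step([rhalo] + right)
        left, right = left_ext[:-1], right_ext[1:]   # drop the halos
    return left + right

row = [0.0, 9.0, 0.0, 4.0, 0.0, 1.0, 0.0, 0.0]
ref = row
for _ in range(5):
    ref = smooth_step(ref)
print(smooth_split(row, 5, 4) == ref)  # → True: decomposition changes nothing
```

    The key correctness property, checked above, is that the decomposed run reproduces the single-domain result bit for bit, which mirrors the thesis's first verification step of comparing outputs against the original implementation.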

    OpenACC Based GPU Parallelization of Plane Sweep Algorithm for Geometric Intersection

    Line segment intersection is one of the elementary operations in computational geometry. Complex problems in Geographic Information Systems (GIS), such as finding map overlays or computing spatial joins over polygonal data, require solving segment intersections. The plane sweep paradigm finds geometric intersections efficiently; however, it is difficult to parallelize due to its in-order processing of spatial events. We present a new fine-grained parallel algorithm for geometric intersection, together with its CPU and GPU implementations using OpenMP and OpenACC. To the best of our knowledge, this is the first work demonstrating an effective parallelization of plane sweep on GPUs. We chose a compiler-directive-based approach because of the simplicity of parallelizing sequential code this way. Using an Nvidia Tesla P100 GPU, our implementation achieves around 40X speedup on the line segment intersection problem for 40K and 80K data sets compared to the sequential CGAL library.
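    The in-order event processing that makes plane sweep hard to parallelize is easiest to see in the simpler orthogonal case (Bentley-Ottmann generalises the same idea to arbitrary segments). A sketch, assuming hypothetical input tuples and distinct y-coordinates among the horizontal segments:

```python
import bisect

def count_orthogonal_intersections(horiz, vert):
    """Plane-sweep count of intersections between horizontal segments
    (y, x1, x2) and vertical segments (x, y1, y2).

    Events must be processed strictly in x order -- the sequential
    dependence the abstract refers to. Assumes distinct y's among the
    horizontal segments."""
    events = []                       # (x, kind, payload); kind breaks ties
    for y, x1, x2 in horiz:
        events.append((min(x1, x2), 0, y))                 # 0: start
        events.append((max(x1, x2), 2, y))                 # 2: end
    for x, y1, y2 in vert:
        events.append((x, 1, (min(y1, y2), max(y1, y2))))  # 1: query
    events.sort()

    active = []                       # sorted y's of live horizontals
    hits = 0
    for _, kind, payload in events:
        if kind == 0:
            bisect.insort(active, payload)
        elif kind == 2:
            active.pop(bisect.bisect_left(active, payload))
        else:                         # count active y's inside [lo, hi]
            lo, hi = payload
            hits += bisect.bisect_right(active, hi) - bisect.bisect_left(active, lo)
    return hits

horiz = [(1.0, 0.0, 5.0), (3.0, 2.0, 6.0)]
vert = [(4.0, 0.0, 4.0), (1.0, 2.0, 3.0)]
print(count_orthogonal_intersections(horiz, vert))  # → 2
```

    Every query depends on all starts and ends to its left, so naively each event must wait for its predecessors; the fine-grained parallelization above is what breaks that chain for GPUs.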

    Accelerating SIFT on Parallel Architectures

    SIFT is a widely used algorithm that extracts features from images; using it to extract information from hundreds of terabytes of aerial and satellite photographs requires parallelization to be feasible. We explore accelerating an existing serial SIFT implementation with OpenMP parallelization and GPU execution.
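    A large part of the speedup here comes from the embarrassingly parallel outer loop: each image is processed independently, which is exactly what an OpenMP `parallel for` exploits. A sketch of that structure in Python, with a hypothetical `extract_features` standing in for SIFT (it finds simple 2-D local maxima, where real SIFT finds extrema across difference-of-Gaussian scales):

```python
from concurrent.futures import ThreadPoolExecutor

def extract_features(image):
    """Hypothetical stand-in for per-image SIFT feature extraction:
    returns coordinates of strict local maxima in a 2-D intensity grid."""
    h, w = len(image), len(image[0])
    keypoints = []
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            neighbours = [image[r+dr][c+dc]
                          for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                          if (dr, dc) != (0, 0)]
            if image[r][c] > max(neighbours):
                keypoints.append((r, c))
    return keypoints

def extract_all(images, workers=4):
    """Parallel outer loop over images -- the same structure an OpenMP
    `parallel for` gives the serial C++ implementation."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(extract_features, images))

img1 = [[0, 0, 0], [0, 5, 0], [0, 0, 0]]   # one clear peak
img2 = [[1, 1, 1], [1, 0, 1], [1, 1, 1]]   # no interior maximum
print(extract_all([img1, img2]))  # → [[(1, 1)], []]
```

    Note that Python threads only sketch the pattern (the GIL serialises pure-Python compute); OpenMP threads or CUDA kernels run the per-image work truly concurrently.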