Acceleration of Computational Geometry Algorithms for High Performance Computing Based Geo-Spatial Big Data Analysis
Geo-spatial computing and data analysis is the branch of computer science that deals with real-world location-based data. Computational geometry algorithms, which process geometric shapes, are one of the pillars of geo-spatial computing. Real-world map and location-based data can be huge, and the data structures used to process them extremely large, leading to high computational costs. Furthermore, geo-spatial datasets are growing along all the V's of big data (volume, variety, value, etc.) and are becoming larger and more complex to process, in turn demanding more computational resources. High-performance computing breaks a problem down so that it can run in parallel on machines with massive processing power, delivering the same results much faster. This dissertation explores techniques to accelerate computational geometry algorithms and geo-spatial computing, including many-core graphics processing units (GPUs), multi-core central processing units (CPUs), multi-node setups with the Message Passing Interface (MPI), cache optimizations, memory and communication optimizations, load balancing, algorithmic modifications, directive-based parallelization with OpenMP or OpenACC, and vectorization with compiler intrinsics (AVX). At least one of these techniques is applied to each of the following problems: a novel method to parallelize plane-sweep-based geometric intersection on GPUs with directives; parallelization of plane-sweep-based Voronoi construction; and parallelization of segment tree construction, segment tree queries, and segment-tree-based operations. Spatial autocorrelation and the computation of Getis-Ord hotspots are also covered. Acceleration performance and speedup results are presented in each corresponding chapter.
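As a concrete illustration of the directive-based approach mentioned above, the sketch below parallelizes a brute-force segment-intersection count with a single OpenMP directive. It is a minimal stand-in under assumed types and names, not the dissertation's plane-sweep algorithm. Compile with g++ -fopenmp -O2.

// Illustrative sketch only: directive-based parallelization (OpenMP) of a
// brute-force all-pairs segment-intersection count. Not the plane-sweep
// method from the dissertation; all names here are hypothetical.
#include <cstdio>
#include <vector>

struct Point { double x, y; };
struct Segment { Point a, b; };

// Standard orientation test: sign of the cross product (b-a) x (c-a).
static int orientation(Point a, Point b, Point c) {
    double v = (b.x - a.x) * (c.y - a.y) - (b.y - a.y) * (c.x - a.x);
    return (v > 0) - (v < 0);
}

// Intersection test via orientation signs (ignores collinear overlap for brevity).
static bool intersects(const Segment& s, const Segment& t) {
    return orientation(s.a, s.b, t.a) != orientation(s.a, s.b, t.b) &&
           orientation(t.a, t.b, s.a) != orientation(t.a, t.b, s.b);
}

int main() {
    std::vector<Segment> segs = {
        {{0, 0}, {4, 4}}, {{0, 4}, {4, 0}}, {{1, 0}, {1, 3}}, {{5, 5}, {6, 6}}
    };
    long long count = 0;
    // The directive spreads pair tests across cores; reduction(+) gives each
    // thread a private counter summed at the end, and dynamic scheduling
    // balances the triangular loop (a small instance of load balancing).
    #pragma omp parallel for reduction(+:count) schedule(dynamic)
    for (long long i = 0; i < (long long)segs.size(); ++i)
        for (size_t j = i + 1; j < segs.size(); ++j)
            if (intersects(segs[i], segs[j])) ++count;
    std::printf("intersections: %lld\n", count);
}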
08081 Abstracts Collection -- Data Structures
From February 17th to 22nd 2008, the Dagstuhl Seminar 08081 ``Data Structures'' was held in the International Conference and Research Center (IBFI),
Schloss Dagstuhl. It brought together 49 researchers from four continents to discuss recent developments concerning data structures in terms of research but also in terms of new technologies that impact how data can be stored, updated,
and retrieved.
During the seminar a fair number of participants presented their current
research. There was discussion of ongoing work, and in addition an open problem
session was held. This paper first describes the seminar topics and goals in general, then gives the minutes of the open problem session, and concludes with
abstracts of the presentations given during the seminar.
Where appropriate and available, links to extended abstracts or full papers are provided.
Optimal (Randomized) Parallel Algorithms in the Binary-Forking Model
In this paper we develop optimal algorithms in the binary-forking model for a
variety of fundamental problems, including sorting, semisorting, list ranking,
tree contraction, range minima, and ordered set union, intersection and
difference. In the binary-forking model, tasks can only fork into two child
tasks, but can do so recursively and asynchronously. The tasks share memory,
supporting reads, writes and test-and-sets. Costs are measured in terms of work
(total number of instructions), and span (longest dependence chain).
The binary-forking model is meant to capture both algorithm performance and
algorithm-design considerations on many existing multithreaded languages, which
are also asynchronous and rely on binary forks either explicitly or under the
covers. In contrast to the widely studied PRAM model, it does not assume
arbitrary-way forks nor synchronous operations, both of which are hard to
implement in modern hardware. While optimal PRAM algorithms are known for the
problems studied herein, it turns out that arbitrary-way forking and strict
synchronization are powerful, if unrealistic, capabilities. Natural simulations
of these PRAM algorithms in the binary-forking model (i.e., implementations in
existing parallel languages) incur a logarithmic overhead in span. This
paper explores techniques for designing optimal algorithms when limited to
binary forking and assuming asynchrony. All algorithms described in this paper
are the first algorithms with optimal work and span in the binary-forking
model. Most of the algorithms are simple. Many are randomized.
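To make the model concrete, here is a minimal sketch of a binary-forking computation: a divide-and-conquer sum expressed with OpenMP tasks, in which every internal step forks exactly two child tasks and joins them. It illustrates the fork-join pattern and the work/span accounting only, not any algorithm from the paper; the grain size is an arbitrary assumption.

// Minimal binary-forking sketch (OpenMP tasks): each call forks two children
// and joins them with taskwait. Work is O(n); span is O(log n) levels of O(1)
// work each, matching the model's cost measures.
#include <cstdio>
#include <numeric>
#include <vector>

static long long sum(const long long* a, size_t n) {
    if (n <= 1024)                       // grain size: stop forking on small inputs
        return std::accumulate(a, a + n, 0LL);
    long long lo = 0, hi = 0;
    size_t mid = n / 2;
    #pragma omp task shared(lo)          // fork child 1
    lo = sum(a, mid);
    #pragma omp task shared(hi)          // fork child 2
    hi = sum(a + mid, n - mid);
    #pragma omp taskwait                 // join both children
    return lo + hi;
}

int main() {
    std::vector<long long> a(1 << 20, 1);
    long long s = 0;
    #pragma omp parallel
    #pragma omp single                   // one thread seeds the task tree
    s = sum(a.data(), a.size());
    std::printf("%lld\n", s);            // prints 1048576
}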
Format Abstraction for Sparse Tensor Algebra Compilers
This paper shows how to build a sparse tensor algebra compiler that is
agnostic to tensor formats (data layouts). We develop an interface that
describes formats in terms of their capabilities and properties, and show how
to build a modular code generator where new formats can be added as plugins. We
then describe six implementations of the interface that compose to form the
dense, CSR/CSF, COO, DIA, ELL, and HASH tensor formats and countless variants
thereof. With these implementations at hand, our code generator can generate
code to compute any tensor algebra expression on any combination of the
aforementioned formats.
To demonstrate our technique, we have implemented it in the taco tensor
algebra compiler. Our modular code generator design makes it simple to add
support for new tensor formats, and the performance of the generated code is
competitive with hand-optimized implementations. Furthermore, by extending taco
to support a wider range of formats specialized for different application and
data characteristics, we can improve end-user application performance. For
example, if input data is provided in the COO format, our technique allows
computing a single matrix-vector multiplication directly with the data in COO,
which is up to 3.6× faster than first converting the data to CSR. (Presented at OOPSLA 2018.)
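The COO-versus-CSR point above can be made concrete with a small sketch: a sparse matrix-vector multiply executed directly on COO triples, with no conversion step. The format struct and function names are illustrative assumptions, not taco's generated code or API.

// Hedged sketch: y += A*x computed straight off COO (row, col, val) triples.
// Each nonzero contributes independently to one output row, so no particular
// ordering of the triples is required.
#include <cstdio>
#include <vector>

struct Coo {                 // coordinate format: one (row, col, val) triple per nonzero
    std::vector<int> row, col;
    std::vector<double> val;
};

void spmv_coo(const Coo& A, const std::vector<double>& x, std::vector<double>& y) {
    for (size_t k = 0; k < A.val.size(); ++k)
        y[A.row[k]] += A.val[k] * x[A.col[k]];
}

int main() {
    // 3x3 matrix [[2,0,0],[0,0,3],[0,4,0]]
    Coo A{{0, 1, 2}, {0, 2, 1}, {2.0, 3.0, 4.0}};
    std::vector<double> x = {1, 10, 100}, y(3, 0.0);
    spmv_coo(A, x, y);
    std::printf("%g %g %g\n", y[0], y[1], y[2]);  // 2 300 40
}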
Improved Bounds on Information Dissemination by Manhattan Random Waypoint Model
With the popularity of portable wireless devices it is important to model and
predict how information or contagions spread by natural human mobility -- for
understanding the spreading of deadly infectious diseases and for improving
delay tolerant communication schemes. Formally, we model this problem by
considering moving agents, where each agent initially carries a
\emph{distinct} bit of information. When two agents are at the same location or
in close proximity to one another, they share all their information with each
other. We would like to know the time it takes until all bits of information
reach all agents, called the \textit{flood time}, and how it depends on the way
agents move, the size and shape of the network and the number of agents moving
in the network.
We provide rigorous analysis for the Manhattan random waypoint (MRWP) model (in which agents take paths with the minimum number of turns), a convenient model used previously to analyze mobile agents, and derive a high-probability upper bound on the flood time for agents moving on a grid, stated in terms of the number of agents and the grid dimensions. In addition to extensive simulations, we use a data set of
taxi trajectories to show that our method can successfully predict flood times
in both experimental settings and the real world. (10 pages; ACM SIGSPATIAL 2018, Seattle, USA.)
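The flood-time quantity defined above can be illustrated with a toy simulation. The sketch below deliberately simplifies: agents do independent random walks on a small torus rather than following the MRWP model analyzed in the paper, information merges only pairwise within a step, and every parameter is an arbitrary assumption.

// Toy flood-time measurement, not the paper's model: each agent carries one
// distinct bit (a bitmask position); co-located agents OR their bitmasks;
// we count steps until every agent holds all bits.
#include <cstdio>
#include <cstdint>
#include <random>
#include <vector>

int main() {
    const int n = 16, agents = 32;       // n x n torus; one distinct bit per agent
    std::mt19937 rng(1);
    std::uniform_int_distribution<int> cell(0, n - 1), dir(0, 3);
    std::vector<int> x(agents), y(agents);
    std::vector<uint64_t> info(agents);
    for (int i = 0; i < agents; ++i)
        x[i] = cell(rng), y[i] = cell(rng), info[i] = 1ULL << i;

    const uint64_t all = (1ULL << agents) - 1;   // requires agents < 64
    auto done = [&] { for (auto b : info) if (b != all) return false; return true; };
    const int dx[] = {1, -1, 0, 0}, dy[] = {0, 0, 1, -1};
    long long t = 0;
    while (!done()) {
        ++t;
        for (int i = 0; i < agents; ++i) {   // one random-walk step per agent
            int d = dir(rng);
            x[i] = (x[i] + dx[d] + n) % n;
            y[i] = (y[i] + dy[d] + n) % n;
        }
        for (int i = 0; i < agents; ++i)     // co-located agents share all bits
            for (int j = i + 1; j < agents; ++j)
                if (x[i] == x[j] && y[i] == y[j])
                    info[i] = info[j] = info[i] | info[j];
    }
    std::printf("flood time: %lld steps\n", t);
}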
LIPIcs, Volume 244, ESA 2022, Complete Volume
LIPIcs, Volume 244, ESA 2022, Complete Volume
Optimization Techniques for Stencil Data Parallel Programs: Methodologies and Applications
The optimization of data-parallel programs is a challenging open problem. We analyzed in detail the optimization techniques for stencil computations, which are a subset of data-parallel computations.
Drawing on previous research, we developed a structured model to describe program transformations. We used this model to compare the different optimizations presented in the literature and to study the interactions between them.
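As a small example of the program class and transformations in question, the sketch below shows a 2D five-point Jacobi stencil in a naive form and with loop tiling, one classic stencil optimization. Nothing here is taken from the thesis itself; the tile size is an assumed tunable.

// Naive sweep over a flat n x n grid: each interior point becomes the average
// of its four neighbors.
#include <algorithm>
#include <cstdio>
#include <vector>

void jacobi_naive(const std::vector<double>& in, std::vector<double>& out, int n) {
    for (int i = 1; i < n - 1; ++i)
        for (int j = 1; j < n - 1; ++j)
            out[i*n + j] = 0.25 * (in[(i-1)*n + j] + in[(i+1)*n + j] +
                                   in[i*n + j - 1] + in[i*n + j + 1]);
}

// Tiled sweep: visiting T x T blocks keeps each block's working set in cache;
// the tile size T is a tunable assumption, not a value from the thesis.
void jacobi_tiled(const std::vector<double>& in, std::vector<double>& out, int n, int T = 64) {
    for (int ii = 1; ii < n - 1; ii += T)
        for (int jj = 1; jj < n - 1; jj += T)
            for (int i = ii; i < std::min(ii + T, n - 1); ++i)
                for (int j = jj; j < std::min(jj + T, n - 1); ++j)
                    out[i*n + j] = 0.25 * (in[(i-1)*n + j] + in[(i+1)*n + j] +
                                           in[i*n + j - 1] + in[i*n + j + 1]);
}

int main() {
    int n = 512;
    std::vector<double> a(n * n, 1.0), b(n * n, 0.0);
    jacobi_naive(a, b, n);
    jacobi_tiled(b, a, n);
    std::printf("%f\n", a[2*n + 2]);  // deep interior point stays 1.0
}

Both variants perform identical arithmetic; tiling only reorders the iteration space, which is exactly the kind of behavior-preserving transformation a structured model of stencil optimizations has to capture.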