Search CORE

7 research outputs found

External-memory search trees with fast insertions

Author: Nelson Jelani (Jelani Osei)
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/2006

Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2006. Includes bibliographical references (p. 65-68). This thesis provides both experimental and theoretical contributions regarding external-memory dynamic search trees with fast insertions. The first contribution is the implementation of the buffered repository B-tree, a data structure that provably outperforms B-trees for updates at the cost of a constant factor decrease in query performance. This thesis also describes the cache-oblivious lookahead array, which outperforms B-trees for updates at a logarithmic cost in query performance, and does so without knowing the cache parameters of the system it is being run on. The buffered repository B-tree is an external-memory search tree that can be tuned for a tradeoff between queries and updates. Specifically, for any ε ∈ [1/lg B, 1] this data structure achieves O((1/(εB^{1-ε}))(1 + log_B(N/B))) block transfers for INSERT and DELETE and O((1/ε)(1 + log_B(N/B))) block transfers for SEARCH. The update complexity is amortized and is O((1/ε)(1 + log_B(N/B))) in the worst case. Using the value ε = 1/2, I was able to achieve a 17 times increase in insertion performance at the cost of only a 3 times decrease in search performance on a database with 12-byte items on a disk with a 4-kilobyte block size. This thesis also shows how to build a cache-oblivious data structure, the cache-oblivious lookahead array, which achieves the same bounds as the buffered repository B-tree in the case where ε = 1/lg B. Specifically, it achieves an update complexity of O((1/B) log(N/B)) and a query complexity of O(log(N/B)) block transfers. This is the first data structure to achieve these bounds cache-obliviously. The research involving the cache-oblivious lookahead array represents joint work with Michael A. Bender, Jeremy Fineman, and Bradley C. Kuszmaul. by Jelani Nelson. M.Eng.
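To make the tradeoff in the abstract above concrete, the following minimal sketch evaluates the two multipliers that sit in front of the shared (1 + log_B(N/B)) term, writing the tradeoff parameter as ε (`eps`). The function name `brt_factors` and the choice B = 4096 // 12 (items per block, assuming 12-byte items packed into a 4-kilobyte block with no per-block overhead) are illustrative assumptions, not taken from the thesis.

```python
import math

def brt_factors(eps, block_items):
    """Return (update_factor, query_factor), the multipliers on (1 + log_B(N/B)).

    update_factor = 1 / (eps * B^(1 - eps))   # amortized INSERT/DELETE bound
    query_factor  = 1 / eps                   # SEARCH bound
    Constants hidden by the O-notation are ignored (assumption).
    """
    update = 1.0 / (eps * block_items ** (1.0 - eps))
    query = 1.0 / eps
    return update, query

# Hypothetical block capacity: 12-byte items in a 4-kilobyte block.
B = 4096 // 12

u_one, q_one = brt_factors(1.0, B)     # eps = 1: both factors are 1 (B-tree-like)
u_half, q_half = brt_factors(0.5, B)   # eps = 1/2: the value used in the thesis

print("update factor improvement at eps=1/2:", u_one / u_half)
print("query factor degradation at eps=1/2:", q_half / q_one)

# At eps = 1/lg B we have B^(1 - eps) = B / 2, so the update factor is
# 2 lg B / B and the query factor is lg B, matching the O((1/B) log(N/B))
# and O(log(N/B)) bounds quoted for the cache-oblivious lookahead array.
eps_min = 1.0 / math.log2(B)
print("factors at eps=1/lg B:", brt_factors(eps_min, B))
```

Under these assumptions the asymptotic factors at ε = 1/2 predict roughly a 9-fold cheaper update and a 2-fold costlier search. The measured figures in the abstract (17 times and 3 times) need not match, since the bounds hide constants and real disks add effects the model ignores.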

DSpace@MIT

Acceleration of Computational Geometry Algorithms for High Performance Computing Based Geo-Spatial Big Data Analysis

Author: Paudel Anmol
Publication venue: e-Publications@Marquette
Publication date: 01/04/2022

Geo-spatial computing and data analysis is the branch of computer science that deals with real-world location-based data. Computational geometry algorithms, which process geometry and shapes, are one of the pillars of geo-spatial computing. Real-world map and location-based data can be huge, and the data structures used to process them extremely large, leading to huge computational costs. Furthermore, geo-spatial datasets are growing on all V’s (Volume, Variety, Value, etc.) and are becoming larger and more complex to process, in turn demanding more computational resources. High Performance Computing is a way to break down the problem so that it can run in parallel on large computers with massive processing power, reducing the computing time while delivering the same results. This dissertation explores different techniques to accelerate the processing of computational geometry algorithms and geo-spatial computing, such as many-core Graphics Processing Units (GPU), multi-core Central Processing Units (CPU), multi-node setups with the Message Passing Interface (MPI), cache optimizations, memory and communication optimizations, load balancing, algorithmic modifications, directive-based parallelization with OpenMP or OpenACC, and vectorization with compiler intrinsics (AVX). This dissertation applies at least one of these techniques to each of the following problems. A novel method to parallelize plane-sweep-based geometric intersection for GPUs with directives is presented. Parallelization of plane-sweep-based Voronoi construction, of segment tree construction, of segment tree queries, and of segment-tree-based operations is presented. Spatial autocorrelation and the computation of Getis-Ord hotspots are also presented. Acceleration performance and speedup results are presented in each corresponding chapter.

epublications@Marquette

Scalable String and Suffix Sorting: Algorithms, Techniques, and Tools

Author: Bingmann Timo
Publication date: 01/01/2018

This dissertation focuses on two fundamental sorting problems: string sorting and suffix sorting. The first part considers parallel string sorting on shared-memory multi-core machines, the second part external memory suffix sorting using the induced sorting principle, and the third part distributed external memory suffix sorting with a new distributed algorithmic big data framework named Thrill. Comment: 396 pages, dissertation, Karlsruher Instituts für Technologie (2018). arXiv admin note: text overlap with arXiv:1101.3448 by other authors.

arXiv.org e-Print Archive

KITopen

Cache-oblivious R-trees

Author: Arge L.
Berg de, M.
Haverkort H.J.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2009

We develop a cache-oblivious data structure for storing a set S of N axis-aligned rectangles in the plane, such that all rectangles in S intersecting a query rectangle or point can be found efficiently. Our structure is an axis-aligned bounding-box hierarchy and as such it is the first cache-oblivious R-tree with provable performance guarantees. If no point in the plane is contained in more than a constant number of rectangles in S, we can construct, for any constant ε, a structure that answers a rectangle query using O(\sqrt{N/B} + T/B) memory transfers and a point query using O((N/B)^ε) memory transfers, where T is the number of reported rectangles and B is the block size of memory transfers between any two levels of a multilevel memory hierarchy. We also develop a variant of our structure that achieves the same performance on input sets with arbitrary overlap among the rectangles. The rectangle query bound matches the bound of the best known linear-space cache-aware structure.

Pure OAI Repository

Cache-Oblivious R-Trees

Author: Herman Haverkort
Lars Arge
Mark de Berg
Publication date: 01/01/2005

We develop a cache-oblivious data structure for storing a set S of N axis-aligned rectangles in the plane, such that all rectangles in S intersecting a query rectangle or point can be found efficiently. Our structure is an axis-aligned bounding-box hierarchy and as such it is the first cache-oblivious R-tree with provable performance guarantees. If no point in the plane is contained in B or more rectangles in S, the structure answers a rectangle query using O(\sqrt{N/B} + T/B) memory transfers and a point query using O((N/B)^ε) memory transfers for any ε>0, where B is the block size of memory transfers between any two levels of a multilevel memory hierarchy. We also develop a variant of our structure that achieves the same performance on input sets with arbitrary overlap among the rectangles. The rectangle query bound matches the bound of the best known linear-space cache-aware structure
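As a rough illustration of what "an axis-aligned bounding-box hierarchy" means in the abstract above, here is a minimal in-memory sketch of building such a hierarchy and answering rectangle and point queries by pruning subtrees whose bounding box misses the query. It is a generic, naive construction (median split on alternating axes). The names `Node`, `build`, `query`, and the `leaf_size` parameter are hypothetical. It does not reproduce the paper's cache-oblivious layout and carries none of its memory-transfer guarantees.

```python
from dataclasses import dataclass, field

# A rectangle is a tuple (xmin, ymin, xmax, ymax); a point query uses xmin == xmax, ymin == ymax.

def intersects(a, b):
    """True if closed axis-aligned rectangles a and b share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def bbox(rects):
    """Smallest axis-aligned rectangle enclosing all rectangles in rects."""
    return (min(r[0] for r in rects), min(r[1] for r in rects),
            max(r[2] for r in rects), max(r[3] for r in rects))

@dataclass
class Node:
    box: tuple
    children: list = field(default_factory=list)  # child nodes (internal node)
    rects: list = field(default_factory=list)     # stored rectangles (leaf)

def build(rects, leaf_size=4, depth=0):
    """Naive top-down build: sort by rectangle center, split at the median."""
    if not rects:
        raise ValueError("need at least one rectangle")
    if len(rects) <= leaf_size:
        return Node(bbox(rects), rects=list(rects))
    axis = depth % 2  # alternate between x and y
    ordered = sorted(rects, key=lambda r: r[axis] + r[axis + 2])
    mid = len(ordered) // 2
    kids = [build(ordered[:mid], leaf_size, depth + 1),
            build(ordered[mid:], leaf_size, depth + 1)]
    return Node(bbox(rects), children=kids)

def query(node, q):
    """Return all stored rectangles intersecting the query rectangle q."""
    if not intersects(node.box, q):
        return []  # prune: nothing below this node can intersect q
    if not node.children:
        return [r for r in node.rects if intersects(r, q)]
    found = []
    for child in node.children:
        found.extend(query(child, q))
    return found

rects = [(i, i, i + 1.5, i + 1.5) for i in range(20)]
root = build(rects)
print(query(root, (3.2, 3.2, 4.1, 4.1)))      # rectangle query
print(query(root, (5.25, 5.25, 5.25, 5.25)))  # point query as a degenerate rectangle
```

The pruning test in `query` is the part shared by all R-tree-like structures. How nodes are grouped and laid out in memory is where the structure described in the abstract differs from this sketch.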

CiteSeerX

Crossref

Cache-Oblivious R-Trees

Author: A. Aggarwal
D. Comer
G.S. Brodal
Herman Haverkort
J.L. Bentley
J.S. Vitter
K.V.R. Kanth
L. Arge
Lars Arge
Mark de Berg
N. Rahman
P.K. Agarwal
R. Bayer
V. Gaede
Y. Manolopoulos
Publication venue: 'Springer Science and Business Media LLC'

Crossref