Search CORE

20,477 research outputs found

Fast Parallel Operations on Search Trees

Author: Akhremtsev Yaroslav
Sanders Peter
Publication venue
Publication date: 11/05/2016
Field of study

Using (a,b)-trees as an example, we show how to perform a parallel split with logarithmic latency and parallel join, bulk updates, intersection, union (or merge), and (symmetric) set difference with logarithmic latency and with information theoretically optimal work. We present both asymptotically optimal solutions and simplified versions that perform well in practice - they are several times faster than previous implementations

arXiv.org e-Print Archive

Crossref

Dynamic load balancing in parallel KD-tree k-means

Author: Di Fatta Giuseppe
Pettinger David
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 30/06/2010
Field of study

One among the most influential and popular data mining methods is the k-Means algorithm for cluster analysis. Techniques for improving the efficiency of k-Means have been largely explored in two main directions. The amount of computation can be significantly reduced by adopting geometrical constraints and an efficient data structure, notably a multidimensional binary search tree (KD-Tree). These techniques allow to reduce the number of distance computations the algorithm performs at each iteration. A second direction is parallel processing, where data and computation loads are distributed over many processing nodes. However, little work has been done to provide a parallel formulation of the efficient sequential techniques based on KD-Trees. Such approaches are expected to have an irregular distribution of computation load and can suffer from load imbalance. This issue has so far limited the adoption of these efficient k-Means variants in parallel computing environments. In this work, we provide a parallel formulation of the KD-Tree based k-Means algorithm for distributed memory systems and address its load balancing issue. Three solutions have been developed and tested. Two approaches are based on a static partitioning of the data set and a third solution incorporates a dynamic load balancing policy

Central Archive at the University of Reading

Crossref

Parallel Weighted Random Sampling

Author: Sanders Peter
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 27th Annual European Symposium on Algorithms (ESA 2019)
Publication date: 01/01/2019
Field of study

Data structures for efficient sampling from a set of weighted items are an important building block of many applications. However, few parallel solutions are known. We close many of these gaps both for shared-memory and distributed-memory machines. We give efficient, fast, and practicable algorithms for sampling single items, k items with/without replacement, permutations, subsets, and reservoirs. We also give improved sequential algorithms for alias table construction and for sampling with replacement. Experiments on shared-memory parallel machines with up to 158 threads show near linear speedups both for construction and queries

Dagstuhl Research Online Publication Server

Subtree weight ratios for optimal binary search trees

Author: Hirschberg D. S.
Larmore L. L.
Molodowitch M.
Publication venue: eScholarship, University of California
Publication date: 01/01/1986
Field of study

For an optimal binary search tree T with a subtree S(d) at a distance d from the root of T, we study the ratio of the weight of S(d) to the weight of T. The maximum possible value, which we call ρ(d), of the ratio of weights, is found to have an upper bound of 2/F_d+3 where F_i is the ith Fibonacci number. For d = 1, 2, 3, and 4, the bound is shown to be tight. For larger d, the Fibonacci bound gives ρ(d) = O(ϕ^d) where ϕ ~ .61803 is the golden ratio. By giving a particular set of optimal trees, we prove ρ(d) = Ω((.58578 ... )^d), and believe a similar proof follows for ρ(d) = Ω((.60179 ... )^d). If we include frequencies for unsuccessful searches in the optimal binary search trees, the Fibonacci bound is found to be tight

CiteSeerX

eScholarship - University of California