
125,732 research outputs found

Efficient Algorithms for the Closest Pair Problem and Applications

Authors: Pathak Sudipta; Rajasekaran Sanguthevar
Publication date: 21/07/2014

The closest pair problem (CPP) is one of the well-studied and fundamental problems in computing. Given a set of points in a metric space, the problem is to identify the pair of closest points. Another closely related problem is the fixed radius nearest neighbors problem (FRNNP). Given a set of points and a radius $R$, the problem is, for every input point $p$, to identify all the other input points that are within a distance of $R$ from $p$. A naive deterministic algorithm can solve these problems in quadratic time. CPP as well as FRNNP play a vital role in computational biology, computational finance, share market analysis, weather prediction, entomology, electrocardiography, N-body simulations, molecular simulations, etc. As a result, any improvements made in solving CPP and FRNNP will have immediate implications for the solution of numerous problems in these domains. We live in an era of big data, and processing these data takes large amounts of time. Speeding up data processing algorithms is thus more essential now than ever before. In this paper we present algorithms for CPP and FRNNP that improve (in theory and/or practice) on the best-known algorithms reported in the literature. These algorithms also improve on the best-known algorithms for related applications, including time series motif mining and the two-locus problem in Genome Wide Association Studies (GWAS).
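The naive quadratic-time baseline mentioned in the abstract can be sketched as follows. This is only an illustration of the brute-force approach, not the authors' improved algorithms; it assumes points are tuples in Euclidean space, and the function names `closest_pair_naive` and `fixed_radius_neighbors_naive` are hypothetical.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def closest_pair_naive(points):
    """Return (distance, p, q) for the closest pair by checking all pairs."""
    if len(points) < 2:
        raise ValueError("need at least two points")
    # O(n^2) distance evaluations: one per unordered pair.
    return min((dist(p, q), p, q) for p, q in combinations(points, 2))


def fixed_radius_neighbors_naive(points, radius):
    """Map each point index to the indices of all other points within radius."""
    neighbors = {i: [] for i in range(len(points))}
    for i, j in combinations(range(len(points)), 2):
        if dist(points[i], points[j]) <= radius:
            neighbors[i].append(j)
            neighbors[j].append(i)
    return neighbors
```

Both functions examine every unordered pair once, which is where the quadratic running time comes from.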

arXiv.org e-Print Archive

CiteSeerX

Closest pair optimization on modern hardware

Author: Bright Jason
Publication date: 01/05/2019

Master's Project (M.S.) University of Alaska Fairbanks, 2019. In this project we examine the performance of several algorithms for finding the closest pair of points out of a given set of points in a plane. We look at four algorithms (brute force, recursive, non-recursive, and a randomized expected-linear-time algorithm) for numbers of points ranging from one hundred to one billion. In our examination, we find that on average the non-recursive algorithm is the fastest, except in limited cases: 100 points, where brute force is faster, and 32-bit spaces, where the randomized expected-linear-time algorithm is faster.
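For orientation, a recursive closest-pair algorithm in the plane is commonly the textbook divide-and-conquer method; the sketch below shows that standard technique and may differ from the implementation evaluated in the project. The name `closest_pair_recursive` is hypothetical, and for brevity the merge step re-sorts by y instead of doing a linear merge.

```python
from math import dist, inf


def closest_pair_recursive(points):
    """Textbook divide-and-conquer closest pair for 2D points; returns the distance."""
    pts = sorted(points)  # sort by x, ties broken by y
    if len(pts) < 2:
        raise ValueError("need at least two points")

    def solve(lo, hi):
        # Returns (best distance in pts[lo:hi], those points sorted by y).
        if hi - lo <= 3:
            sub = pts[lo:hi]
            best = min(
                (dist(sub[i], sub[j])
                 for i in range(len(sub)) for j in range(i + 1, len(sub))),
                default=inf,
            )
            return best, sorted(sub, key=lambda p: p[1])
        mid = (lo + hi) // 2
        mid_x = pts[mid][0]
        d_left, left = solve(lo, mid)
        d_right, right = solve(mid, hi)
        best = min(d_left, d_right)
        merged = sorted(left + right, key=lambda p: p[1])
        # Only points closer than `best` to the dividing line can improve it.
        strip = [p for p in merged if abs(p[0] - mid_x) < best]
        for i in range(len(strip)):
            for j in range(i + 1, len(strip)):
                if strip[j][1] - strip[i][1] >= best:
                    break  # remaining points are too far away in y
                best = min(best, dist(strip[i], strip[j]))
        return best, merged

    return solve(0, len(pts))[0]
```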

ScholarWorks@UA

Dominance Product and High-Dimensional Closest Pair under $L_\infty$

Authors: Gold Omer; Sharir Micha
Publication date: 01/01/2017

Given a set $S$ of $n$ points in $\mathbb{R}^d$, the Closest Pair problem is to find a pair of distinct points in $S$ at minimum distance. When $d$ is constant, there are efficient algorithms that solve this problem, and fast approximate solutions for general $d$. However, obtaining an exact solution in very high dimensions seems to be much less understood. We consider the high-dimensional $L_\infty$ Closest Pair problem, where $d=n^r$ for some $r > 0$, and the underlying metric is $L_\infty$. We improve and simplify previous results for $L_\infty$ Closest Pair, showing that it can be solved by a deterministic strongly-polynomial algorithm that runs in $O(DP(n,d)\log n)$ time, and by a randomized algorithm that runs in $O(DP(n,d))$ expected time, where $DP(n,d)$ is the time bound for computing the dominance product for $n$ points in $\mathbb{R}^d$. That is, a matrix $D$ such that $D[i,j] = \bigl| \{k \mid p_i[k] \leq p_j[k]\} \bigr|$; this is the number of coordinates at which $p_j$ dominates $p_i$. For integer coordinates from some interval $[-M, M]$, we obtain an algorithm that runs in $\tilde{O}\left(\min\{Mn^{\omega(1,r,1)},\, DP(n,d)\}\right)$ time, where $\omega(1,r,1)$ is the exponent of multiplying an $n \times n^r$ matrix by an $n^r \times n$ matrix. We also give slightly better bounds for $DP(n,d)$, by using more recent rectangular matrix multiplication bounds. Computing the dominance product itself is an important task, since it is applied in many algorithms as a major black-box ingredient, such as algorithms for APBP (all pairs bottleneck paths), and variants of APSP (all pairs shortest paths).
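The definition of the dominance product above can be made concrete with a direct evaluation. This is a minimal sketch that computes $D$ straight from the definition in $O(n^2 d)$ time; it is not the matrix-multiplication-based method behind the $DP(n,d)$ bounds, and the function name `dominance_product` is hypothetical.

```python
def dominance_product(points):
    """Return D where D[i][j] counts coordinates k with points[i][k] <= points[j][k]."""
    n = len(points)
    return [
        [sum(a <= b for a, b in zip(points[i], points[j])) for j in range(n)]
        for i in range(n)
    ]


# Example: three points in R^3.
example = [(1, 5, 3), (2, 4, 3), (0, 0, 0)]
D = dominance_product(example)
# D[0][1] == 2: coordinates 0 (1 <= 2) and 2 (3 <= 3) satisfy the condition.
```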

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

On Closest Pair in Euclidean Metric: Monochromatic is as Hard as Bichromatic

Authors: C. S. Karthik; Manurangsi Pasin
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 10th Innovations in Theoretical Computer Science Conference (ITCS 2019)
Publication date: 01/01/2018

Given a set of $n$ points in $\mathbb{R}^d$, the (monochromatic) Closest Pair problem asks to find a pair of distinct points in the set that are closest in the $\ell_p$-metric. Closest Pair is a fundamental problem in Computational Geometry, and understanding its fine-grained complexity in the Euclidean metric when $d = \omega(\log n)$ was raised as an open question in recent works (Abboud-Rubinstein-Williams [FOCS'17], Williams [SODA'18], David-Karthik-Laekhanukit [SoCG'18]). In this paper, we show that for every $p \in \mathbb{R}_{\geq 1} \cup \{0\}$, under the Strong Exponential Time Hypothesis (SETH), for every $\varepsilon > 0$, the following holds:
- No algorithm running in time $O(n^{2-\varepsilon})$ can solve the Closest Pair problem in $d = (\log n)^{\Omega_{\varepsilon}(1)}$ dimensions in the $\ell_p$-metric.
- There exists $\delta = \delta(\varepsilon) > 0$ and $c = c(\varepsilon) \geq 1$ such that no algorithm running in time $O(n^{1.5-\varepsilon})$ can approximate the Closest Pair problem to a factor of $(1+\delta)$ in $d \geq c \log n$ dimensions in the $\ell_p$-metric.
In particular, our first result is shown by establishing the computational equivalence of the bichromatic Closest Pair problem and the (monochromatic) Closest Pair problem (up to an $n^{\varepsilon}$ factor in the running time) for $d = (\log n)^{\Omega_{\varepsilon}(1)}$ dimensions. Additionally, under SETH, we rule out nearly-polynomial factor approximation algorithms running in subquadratic time for the (monochromatic) Maximum Inner Product problem, where we are given a set of $n$ points in $n^{o(1)}$-dimensional Euclidean space and are required to find a pair of distinct points in the set that maximize the inner product.
At the heart of all our proofs is the construction of a dense bipartite graph with low contact dimension, i.e., we construct a balanced bipartite graph on $n$ vertices with $n^{2-\varepsilon}$ edges whose vertices can be realized as points in a $(\log n)^{\Omega_{\varepsilon}(1)}$-dimensional Euclidean space such that every pair of vertices which have an edge in the graph are at distance exactly 1 and every other pair of vertices are at distance greater than 1. This graph construction is inspired by the construction of locally dense codes introduced by Dumer-Micciancio-Sudan [IEEE Trans. Inf. Theory'03].

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server


Fully dynamic maintenance of Euclidean minimum spanning trees and maxima of decomposable functions

Author: Eppstein David
Publication venue: eScholarship, University of California
Publication date: 25/08/1992

We maintain the minimum spanning tree of a point set in the plane, subject to point insertions and deletions, in time $O(n^{1/2} \log^2 n)$ per update operation. We reduce the problem to maintaining bichromatic closest pairs, which we solve in time $O(n^{\varepsilon})$ per update. Our algorithm uses a novel construction, the ordered nearest neighbors of a sequence of points. Any point set or bichromatic point set can be ordered so that this graph is a simple path. Our results generalize to higher dimensions, and to fully dynamic algorithms for maintaining maxima of decomposable functions, including the diameter of a point set and the bichromatic farthest pair.

eScholarship - University of California

A Parallel Batch-Dynamic Data Structure for the Closest Pair Problem

Authors: Gu Yan; Shun Julian; Wang Yiqiu; Yu Shangdi
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 37th International Symposium on Computational Geometry (SoCG 2021)
Publication date: 01/01/2021

We propose a theoretically-efficient and practical parallel batch-dynamic data structure for the closest pair problem. Our solution is based on a serial dynamic closest pair data structure by Golin et al., and supports batches of insertions and deletions in parallel. For a data set of size $n$, our data structure supports a batch of insertions or deletions of size $m$ in $O(m(1+\log ((n+m)/m)))$ expected work and $O(\log (n+m)\log^*(n+m))$ depth with high probability, and takes linear space. The key techniques for achieving these bounds are a new work-efficient parallel batch-dynamic binary heap, and careful management of the computation across sets of points to minimize work and depth. We provide an optimized multicore implementation of our data structure using dynamic hash tables, parallel heaps, and dynamic $k$-d trees. Our experiments on a variety of synthetic and real-world data sets show that it achieves a parallel speedup of up to 38.57x (15.10x on average) on 48 cores with hyper-threading. In addition, we implement and compare four parallel algorithms for the static closest pair problem, for which we are not aware of any existing practical implementations. On 48 cores with hyper-threading, the static algorithms achieve up to 51.45x (29.42x on average) speedup, and Rabin's algorithm performs the best on average. Comparing our dynamic algorithm to the fastest static algorithm, we find that it is advantageous to use the dynamic algorithm for batch sizes of up to 20% of the data set. As far as we know, our work is the first to experimentally evaluate parallel closest pair algorithms, in both the static and the dynamic settings.
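To get a feel for the stated work bound, it may help to plug in the two extreme batch sizes. The symbol $W(n,m)$ below is introduced here only as shorthand for the expected work and does not appear in the abstract; the two cases are simple substitutions, assuming $n \geq 2$.

```latex
W(n,m) = O\!\left(m\left(1+\log\frac{n+m}{m}\right)\right)

% Single update, m = 1:
W(n,1) = O\bigl(1+\log(n+1)\bigr) = O(\log n)

% Batch as large as the data set, m = n:
W(n,n) = O\bigl(n\,(1+\log 2)\bigr) = O(n)
```

So the bound ranges from logarithmic work for a single update to linear work for a batch as large as the data set itself.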

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server