125,732 research outputs found
Efficient Algorithms for the Closest Pair Problem and Applications
The closest pair problem (CPP) is one of the well-studied and fundamental
problems in computing. Given a set of points in a metric space, the problem is
to identify the pair of closest points. Another closely related problem is the
fixed radius nearest neighbors problem (FRNNP). Given a set of points and a
radius r, the problem is, for every input point p, to identify all the
other input points that are within a distance of r from p. A naive
deterministic algorithm can solve these problems in quadratic time. CPP as well
as FRNNP play a vital role in computational biology, computational finance,
share market analysis, weather prediction, entomology, electrocardiography,
N-body simulations, molecular simulations, etc. As a result, any improvements
made in solving CPP and FRNNP will have immediate implications for the solution
of numerous problems in these domains. We live in an era of big data and
processing these data takes large amounts of time. Speeding up data processing
algorithms is thus more essential now than ever before. In this paper we
present algorithms for CPP and FRNNP that improve (in theory and/or practice)
the best-known algorithms reported in the literature for CPP and FRNNP. These
algorithms also improve the best-known algorithms for related applications
including time series motif mining and the two locus problem in Genome Wide
Association Studies (GWAS).
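The quadratic-time baseline mentioned in the abstract above can be sketched as follows. This is only an illustration of the naive approach, not the paper's improved algorithms; the function names are hypothetical, the metric is assumed to be Euclidean, and "within a distance of r" is assumed to mean distance at most r.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def closest_pair_naive(points):
    """Return (distance, p, q) for the closest pair by checking all pairs."""
    if len(points) < 2:
        raise ValueError("need at least two points")
    # Examine every unordered pair once: O(n^2) distance computations.
    return min((dist(p, q), p, q) for p, q in combinations(points, 2))


def fixed_radius_neighbors_naive(points, r):
    """For each index i, list the indices j != i with dist(points[i], points[j]) <= r."""
    neighbors = [[] for _ in points]
    for i, j in combinations(range(len(points)), 2):
        if dist(points[i], points[j]) <= r:
            # The relation is symmetric, so record it for both endpoints.
            neighbors[i].append(j)
            neighbors[j].append(i)
    return neighbors


if __name__ == "__main__":
    pts = [(0, 0), (3, 4), (1, 0), (10, 10)]
    print(closest_pair_naive(pts))
    print(fixed_radius_neighbors_naive(pts, 5.0))
```

Both routines touch every pair of points, which is the cost that faster CPP and FRNNP algorithms aim to avoid.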
Closest pair optimization on modern hardware
Master's Project (M.S.) University of Alaska Fairbanks, 2019
In this project we examine the performance of several algorithms for finding the closest pair of points
out of a given set of points in a plane. We look at four algorithms (brute force, recursive,
non-recursive, and randomized expected linear time) for numbers of points ranging from one hundred to
one billion. In our examination, we find that on average the non-recursive algorithm is the fastest, except for
limited cases of 100 points for the brute force, and 32-bit spaces for the random expected linear
Dominance Product and High-Dimensional Closest Pair under
Given a set of points in d-dimensional space, the Closest Pair problem is
to find a pair of distinct points in the set at minimum distance. When d is
constant, there are efficient algorithms that solve this problem, and fast
approximate solutions for general d. However, obtaining an exact solution in
very high dimensions seems to be much less understood. We consider the
high-dimensional Closest Pair problem, where for some , and the underlying metric is .
We improve and simplify previous results for Closest Pair, showing
that it can be solved by a deterministic strongly-polynomial algorithm that
runs in time, and by a randomized algorithm that runs in
expected time, where is the time bound for computing the
dominance product for points in . That is a matrix ,
such that ; this is the
number of coordinates at which dominates . For integer coordinates
from some interval , we obtain an algorithm that runs in
time, where
is the exponent of multiplying an matrix by an
matrix.
We also give slightly better bounds for , by using more recent
rectangular matrix multiplication bounds. Computing the dominance product
itself is an important task, since it is applied in many algorithms as a major
black-box ingredient, such as algorithms for APBP (all pairs bottleneck paths),
and variants of APSP (all pairs shortest paths).
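As a rough illustration of the dominance product described in the abstract above, here is a naive sketch. The defining formula did not survive extraction, so the convention used here is an assumption: entry D[i][j] counts the coordinates k with points[i][k] <= points[j][k]. The function name is hypothetical, and this direct method takes time proportional to n^2 times the dimension, not the matrix-multiplication-based bounds the paper refers to.

```python
def dominance_product_naive(points):
    """Return the n x n matrix D where D[i][j] is the number of coordinates k
    with points[i][k] <= points[j][k] (assumed convention)."""
    n = len(points)
    return [
        [sum(1 for a, b in zip(points[i], points[j]) if a <= b) for j in range(n)]
        for i in range(n)
    ]


if __name__ == "__main__":
    pts = [(1, 5, 3), (2, 4, 3), (0, 0, 0)]
    for row in dominance_product_naive(pts):
        print(row)
```

Under this convention a diagonal entry always equals the dimension, since every point dominates itself in all coordinates.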
On Closest Pair in Euclidean Metric: Monochromatic is as Hard as Bichromatic
Given a set of n points in R^d, the (monochromatic) Closest Pair problem asks to find a pair of distinct points in the set that are closest in the l_p-metric. Closest Pair is a fundamental problem in Computational Geometry and understanding its fine-grained complexity in the Euclidean metric when d=omega(log n) was raised as an open question in recent works (Abboud-Rubinstein-Williams [FOCS'17], Williams [SODA'18], David-Karthik-Laekhanukit [SoCG'18]).
In this paper, we show that for every p in R_{>= 1} cup {0}, under the Strong Exponential Time Hypothesis (SETH), for every epsilon>0, the following holds:
- No algorithm running in time O(n^{2-epsilon}) can solve the Closest Pair problem in d=(log n)^{Omega_{epsilon}(1)} dimensions in the l_p-metric.
- There exist delta = delta(epsilon)>0 and c = c(epsilon)>= 1 such that no algorithm running in time O(n^{1.5-epsilon}) can approximate the Closest Pair problem to a factor of (1+delta) in d >= c log n dimensions in the l_p-metric.
In particular, our first result is shown by establishing the computational equivalence of the bichromatic Closest Pair problem and the (monochromatic) Closest Pair problem (up to n^{epsilon} factor in the running time) for d=(log n)^{Omega_epsilon(1)} dimensions.
Additionally, under SETH, we rule out nearly-polynomial factor approximation algorithms running in subquadratic time for the (monochromatic) Maximum Inner Product problem where we are given a set of n points in n^{o(1)}-dimensional Euclidean space and are required to find a pair of distinct points in the set that maximize the inner product.
At the heart of all our proofs is the construction of a dense bipartite graph with low contact dimension, i.e., we construct a balanced bipartite graph on n vertices with n^{2-epsilon} edges whose vertices can be realized as points in a (log n)^{Omega_epsilon(1)}-dimensional Euclidean space such that every pair of vertices which have an edge in the graph are at distance exactly 1 and every other pair of vertices are at distance greater than 1. This graph construction is inspired by the construction of locally dense codes introduced by Dumer-Micciancio-Sudan [IEEE Trans. Inf. Theory'03].
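To make the two problem variants in the abstract above concrete, the following sketch gives quadratic-time brute-force versions of both. It only illustrates the problem definitions, not the paper's reductions; the function names are hypothetical, and treating the p = 0 case as Hamming distance (the number of differing coordinates) is an assumption.

```python
from itertools import combinations, product


def lp_distance(x, y, p):
    """l_p distance for p >= 1; p == 0 is treated as Hamming distance (assumption)."""
    if p == 0:
        return sum(1 for a, b in zip(x, y) if a != b)
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def monochromatic_closest_pair(points, p=2):
    """Closest pair of distinct points within a single set, by checking all pairs."""
    return min((lp_distance(u, v, p), u, v) for u, v in combinations(points, 2))


def bichromatic_closest_pair(red, blue, p=2):
    """Closest (red, blue) pair across two sets, by checking all cross pairs."""
    return min((lp_distance(u, v, p), u, v) for u, v in product(red, blue))


if __name__ == "__main__":
    print(monochromatic_closest_pair([(0, 0), (3, 0), (3, 1)], p=1))
    print(bichromatic_closest_pair([(0, 0), (5, 5)], [(0, 1), (9, 9)], p=2))
```

The only difference between the two is which pairs are eligible: all pairs inside one set, or only pairs with one point from each set.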
Fully dynamic maintenance of Euclidean minimum spanning trees and maxima of decomposable functions
We maintain the minimum spanning tree of a point set in the plane, subject to point insertions and deletions, in time O(n^{1/2} log^2 n) per update operation. We reduce the problem to maintaining bichromatic closest pairs, which we solve in time O(n^epsilon) per update. Our algorithm uses a novel construction, the ordered nearest neighbors of a sequence of points. Any point set or bichromatic point set can be ordered so that this graph is a simple path. Our results generalize to higher dimensions, and to fully dynamic algorithms for maintaining maxima of decomposable functions, including the diameter of a point set and the bichromatic farthest pair.
A Parallel Batch-Dynamic Data Structure for the Closest Pair Problem
We propose a theoretically-efficient and practical parallel batch-dynamic
data structure for the closest pair problem. Our solution is based on a serial
dynamic closest pair data structure by Golin et al., and supports batches of
insertions and deletions in parallel. For a data set of size , our data
structure supports a batch of insertions or deletions of size in
expected work and depth
with high probability, and takes linear space. The key techniques for achieving
these bounds are a new work-efficient parallel batch-dynamic binary heap, and
careful management of the computation across sets of points to minimize work
and depth.
We provide an optimized multicore implementation of our data structure using
dynamic hash tables, parallel heaps, and dynamic k-d trees. Our experiments
on a variety of synthetic and real-world data sets show that it achieves a
parallel speedup of up to 38.57x (15.10x on average) on 48 cores with
hyper-threading. In addition, we also implement and compare four parallel
algorithms for the static closest pair problem, for which we are not aware of any
existing practical implementations. On 48 cores with hyper-threading, the
static algorithms achieve up to 51.45x (29.42x on average) speedup, and Rabin's
algorithm performs the best on average. Comparing our dynamic algorithm to the
fastest static algorithm, we find that it is advantageous to use the dynamic
algorithm for batch sizes of up to 20% of the data set. As far as we know, our
work is the first to experimentally evaluate parallel closest pair algorithms,
in both the static and the dynamic settings.