580 research outputs found

New Applications of Nearest-Neighbor Chains: Euclidean TSP and Motorcycle Graphs

Author: Efrat Alon
Eppstein David
Frishberg Daniel
Goodrich Michael T.
Kobourov Stephen
Mamano Nil
Matias Pedro
Polishchuk Valentin
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 30th International Symposium on Algorithms and Computation (ISAAC 2019)
Publication date: 01/01/2019

We show new applications of the nearest-neighbor chain algorithm, a technique that originated in agglomerative hierarchical clustering. We use it to construct the greedy multi-fragment tour for Euclidean TSP in O(n log n) time in any fixed dimension and for Steiner TSP in planar graphs in O(n sqrt(n) log n) time; we compute motorcycle graphs, a central step in straight skeleton algorithms, in O(n^(4/3+epsilon)) time for any epsilon > 0.
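To make the term "greedy multi-fragment tour" concrete, here is a minimal Python sketch of the straightforward version of that heuristic: sort all pairwise edges by length and accept an edge whenever both endpoints still have degree below two and the edge does not close a cycle. This is only a naive O(n^2 log n) baseline, not the nearest-neighbor chain construction from the paper; the function name `multi_fragment_tour` and the point representation are assumptions made for illustration.

```python
import math
from itertools import combinations


def multi_fragment_tour(points):
    """Naive greedy multi-fragment heuristic (illustrative baseline only).

    Returns the vertex order of the tour; the closing edge from the last
    vertex back to the first is implicit.
    """
    n = len(points)
    if n < 3:
        return list(range(n))

    parent = list(range(n))  # union-find over path fragments

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    neighbors = [[] for _ in range(n)]
    edges = sorted(
        combinations(range(n), 2),
        key=lambda e: math.dist(points[e[0]], points[e[1]]),
    )

    added = 0
    for u, v in edges:
        # Skip edges that would give a vertex degree three.
        if len(neighbors[u]) == 2 or len(neighbors[v]) == 2:
            continue
        ru, rv = find(u), find(v)
        # Skip edges that would close a cycle inside one fragment.
        if ru == rv:
            continue
        parent[ru] = rv
        neighbors[u].append(v)
        neighbors[v].append(u)
        added += 1
        if added == n - 1:  # fragments have merged into one Hamiltonian path
            break

    # Walk the path from one of its two endpoints.
    start = next(i for i in range(n) if len(neighbors[i]) == 1)
    tour, prev, cur = [start], None, start
    while len(tour) < n:
        nxt = next(w for w in neighbors[cur] if w != prev)
        tour.append(nxt)
        prev, cur = cur, nxt
    return tour
```

For example, `multi_fragment_tour([(0, 0), (1, 0), (3, 0), (7, 0)])` visits the four collinear points in left-to-right order.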

Dagstuhl Research Online Publication Server

Dynamic Geometric Data Structures via Shallow Cuttings

Author: Chan Timothy M.
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 35th International Symposium on Computational Geometry (SoCG 2019)
Publication date: 01/01/2019

We present new results on a number of fundamental problems about dynamic geometric data structures: 1) We describe the first fully dynamic data structures with sublinear amortized update time for maintaining (i) the number of vertices or the volume of the convex hull of a 3D point set, (ii) the largest empty circle for a 2D point set, (iii) the Hausdorff distance between two 2D point sets, (iv) the discrete 1-center of a 2D point set, (v) the number of maximal (i.e., skyline) points in a 3D point set. The update times are near n^{11/12} for (i) and (ii), n^{7/8} for (iii) and (iv), and n^{2/3} for (v). Previously, sublinear bounds were known only for restricted "semi-online" settings [Chan, SODA 2002]. 2) We slightly improve previous fully dynamic data structures for answering extreme point queries for the convex hull of a 3D point set and nearest neighbor search for a 2D point set. The query time is O(log^2n), and the amortized update time is O(log^4n) instead of O(log^5n) [Chan, SODA 2006; Kaplan et al., SODA 2017]. 3) We also improve previous fully dynamic data structures for maintaining the bichromatic closest pair between two 2D point sets and the diameter of a 2D point set. The amortized update time is O(log^4n) instead of O(log^7n) [Eppstein 1995; Chan, SODA 2006; Kaplan et al., SODA 2017]

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Dynamic Connectivity in Disk Graphs

Author: Baumann Alexander
Kaplan Haim
Klost Katharina
Knorr Kristin
Mulzer Wolfgang
Roditty Liam
Seiferth Paul
Publication date: 01/01/2024

Let S ⊆ R^2 be a set of n sites in the plane, so that every site s ∈ S has an associated radius r_s > 0. Let D(S) be the disk intersection graph defined by S, i.e., the graph with vertex set S and an edge between two distinct sites s, t ∈ S if and only if the disks with centers s, t and radii r_s, r_t intersect. Our goal is to design data structures that maintain the connectivity structure of D(S) as sites are inserted and/or deleted in S. First, we consider unit disk graphs, i.e., we fix r_s = 1 for all sites s ∈ S. For this case, we describe a data structure that has O(log^2 n) amortized update time and O(log n / log log n) query time. Second, we look at disk graphs with bounded radius ratio Ψ, i.e., for all s ∈ S, we have 1 ≤ r_s ≤ Ψ, for a parameter Ψ that is known in advance. Here, we not only investigate the fully dynamic case, but also the incremental and the decremental scenario, where only insertions or only deletions of sites are allowed. In the fully dynamic case, we achieve amortized expected update time O(Ψ log^4 n) and query time O(log n / log log n). This improves the currently best update time by a factor of Ψ. In the incremental case, we achieve logarithmic dependency on Ψ, with a data structure that has O(α(n)) amortized query time and O(log Ψ log^4 n) amortized expected update time, where α(n) denotes the inverse Ackermann function. For the decremental setting, we first develop an efficient decremental disk revealing data structure: given two sets R and B of disks in the plane, we can delete disks from B, and upon each deletion, we receive a list of all disks in R that no longer intersect the union of B.
Using this data structure, we get decremental data structures with a query time of O(log n / log log n) that support deletions in O(n log Ψ log^4 n) overall expected time for disk graphs with bounded radius ratio Ψ and O(n log^5 n) overall expected time for disk graphs with arbitrary radii, assuming that the deletion sequence is oblivious of the internal random choices of the data structures.
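The definition of D(S) and the incremental setting can be illustrated with a small Python sketch. It is a naive baseline under stated assumptions, not the data structure from the paper: each insertion compares the new disk against every existing disk (linear time per insertion), and connectivity is tracked with a plain union-find using path halving only. The class name `IncrementalDiskConnectivity` and its methods are hypothetical.

```python
import math


class IncrementalDiskConnectivity:
    """Insert-only connectivity for a disk intersection graph (naive sketch)."""

    def __init__(self):
        self.sites = []   # list of (x, y, r) triples
        self.parent = []  # union-find parent pointers, one per site

    def _find(self, i):
        while self.parent[i] != i:
            self.parent[i] = self.parent[self.parent[i]]  # path halving
            i = self.parent[i]
        return i

    def insert(self, x, y, r):
        """Add a disk with center (x, y) and radius r; return its index."""
        idx = len(self.sites)
        self.sites.append((x, y, r))
        self.parent.append(idx)
        # Brute force: two disks intersect iff the distance between their
        # centers is at most the sum of their radii.
        for j, (xj, yj, rj) in enumerate(self.sites[:idx]):
            if math.hypot(x - xj, y - yj) <= r + rj:
                self.parent[self._find(j)] = self._find(idx)
        return idx

    def connected(self, i, j):
        """Report whether sites i and j lie in the same component of D(S)."""
        return self._find(i) == self._find(j)
```

A short usage example: after inserting two far-apart unit disks, `connected` reports `False`; inserting a third disk that overlaps both merges the two components.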

Institutional Repository of the Freie Universität Berlin

Dynamic Enumeration of Similarity Joins

Author: Agarwal Pankaj K.
Hu Xiao
Sintos Stavros
Yang Jun
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 48th International Colloquium on Automata, Languages, and Programming (ICALP 2021)
Publication date: 01/01/2021

Dagstuhl Research Online Publication Server

Graph-based time-space trade-offs for approximate near neighbors

Author: Laarhoven Thijs
Publication date: 08/12/2017

We take a first step towards a rigorous asymptotic analysis of graph-based approaches for finding (approximate) nearest neighbors in high-dimensional spaces, by analyzing the complexity of (randomized) greedy walks on the approximate near neighbor graph. For random data sets of size $n = 2^{o(d)}$ on the $d$-dimensional Euclidean unit sphere, using near neighbor graphs we can provably solve the approximate nearest neighbor problem with approximation factor $c > 1$ in query time $n^{\rho_q + o(1)}$ and space $n^{1 + \rho_s + o(1)}$, for arbitrary $\rho_q, \rho_s \geq 0$ satisfying \begin{align} (2c^2 - 1) \rho_q + 2 c^2 (c^2 - 1) \sqrt{\rho_s (1 - \rho_s)} \geq c^4. \end{align} Graph-based near neighbor searching is especially competitive with hash-based methods for small $c$ and near-linear memory, and in this regime the asymptotic scaling of a greedy graph-based search matches the recent optimal hash-based trade-offs of Andoni-Laarhoven-Razenshteyn-Waingarten [SODA'17]. We further study how the trade-offs scale when the data set is of size $n = 2^{\Theta(d)}$, and analyze asymptotic complexities when applying these results to lattice sieving.
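The greedy walk that this abstract analyzes can be sketched in a few lines of Python. The version below is a simplified, deterministic illustration under several assumptions: it builds an exact k-nearest-neighbor graph by brute force, works on arbitrary Euclidean points rather than random points on the unit sphere, and uses hypothetical function names (`build_knn_graph`, `greedy_walk`). It also assumes at least two points and k >= 1.

```python
import math


def build_knn_graph(points, k):
    """Brute-force k-nearest-neighbor graph: graph[i] lists the k points
    closest to points[i] (excluding i itself)."""
    graph = []
    for i, p in enumerate(points):
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        graph.append(others[:k])
    return graph


def greedy_walk(points, graph, query, start):
    """Repeatedly move to the neighbor closest to the query; stop when no
    neighbor is strictly closer than the current vertex."""
    current = start
    current_dist = math.dist(points[current], query)
    while True:
        best = min(graph[current], key=lambda j: math.dist(points[j], query))
        best_dist = math.dist(points[best], query)
        if best_dist >= current_dist:
            return current  # local minimum of the distance to the query
        current, current_dist = best, best_dist
```

The walk can stop at a local minimum that is not the true nearest neighbor, which is why practical variants add randomized restarts or larger neighborhoods.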

Pure OAI Repository

Online Data Structures in External Memory

Author: Vitter Jeffrey Scott
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 21/03/2011

The original publication is available at www.springerlink.com. The data sets for many of today's computer applications are too large to fit within the computer's internal memory and must instead be stored on external storage devices such as disks. A major performance bottleneck can be the input/output communication (or I/O) between the external and internal memories. In this paper we discuss a variety of online data structures for external memory, some very old and some very new, such as hashing (for dictionaries), B-trees (for dictionaries and 1-D range search), buffer trees (for batched dynamic problems), interval trees with weight-balanced B-trees (for stabbing queries), priority search trees (for 3-sided 2-D range search), and R-trees and other spatial structures. We also discuss several open problems along the way.

KU ScholarWorks

Transactional Support for Visual Instance Search

Author: A Babenko
A Ólafsson
C Li
C Mohan
DG Lowe
E Nowak
H Bay
H Jégou
H Lejsek
J Gray
J Uhlmann
K Beyer
K Fukunaga
K Mikolajczyk
L Paulevé
M Datar
M Muja
N Marz
Y Tao
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/10/2018

This article addresses the issue of dynamicity and durability for scalable indexing of very large and rapidly growing collections of local features for visual instance retrieval. By extending the NV-tree, a scalable disk-based high-dimensional index, we show how to implement the ACID properties of transactions which ensure both dynamicity and durability. We present a detailed performance evaluation of the transactional NV-tree, showing that the insertion throughput is excellent despite the effort to enforce the ACID properties.

HAL-CentraleSupelec

Crossref

INRIA a CCSD electronic archive server

The IT University of Copenhagen's Repository

HAL-Rennes 1

PECANN: Parallel Efficient Clustering with Graph-Based Approximate Nearest Neighbor Search

Author: Engels Joshua
Huang Yihao
Shun Julian
Yu Shangdi
Publication date: 13/12/2023

This paper studies density-based clustering of point sets. These methods use dense regions of points to detect clusters of arbitrary shapes. In particular, we study variants of density peaks clustering, a popular type of algorithm that has been shown to work well in practice. Our goal is to cluster large high-dimensional datasets, which are prevalent in practice. Prior solutions are either sequential, and cannot scale to large data, or are specialized for low-dimensional data. This paper unifies the different variants of density peaks clustering into a single framework, PECANN, by abstracting out several key steps common to this class of algorithms. One such key step is to find nearest neighbors that satisfy a predicate function, and one of the main contributions of this paper is an efficient way to do this predicate search using graph-based approximate nearest neighbor search (ANNS). To provide ample parallelism, we propose a doubling search technique that enables points to find an approximate nearest neighbor satisfying the predicate in a small number of rounds. Our technique can be applied to many existing graph-based ANNS algorithms, which can all be plugged into PECANN. We implement five clustering algorithms with PECANN and evaluate them on synthetic and real-world datasets with up to 1.28 million points and up to 1024 dimensions on a 30-core machine with two-way hyper-threading. Compared to the state-of-the-art FASTDP algorithm for high-dimensional density peaks clustering, which is sequential, our best algorithm is 45x-734x faster while achieving competitive ARI scores. Compared to the state-of-the-art parallel DPC-based algorithm, which is optimized for low dimensions, we show that PECANN is two orders of magnitude faster. As far as we know, our work is the first to evaluate DPC variants on large high-dimensional real-world image and text embedding datasets
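The "nearest neighbor satisfying a predicate" step mentioned in this abstract can be illustrated with a brute-force Python sketch of basic density peaks clustering. It is not PECANN: there is no graph-based ANNS, no doubling search, and no parallelism. The assumptions are that density is the number of other points within a cutoff `d_cut`, that density ties are broken by index, and that cluster centers are simply the points with the largest distance to their dependent point. All names (`density_peaks`, `d_cut`, `num_clusters`) are hypothetical.

```python
import math


def density_peaks(points, d_cut, num_clusters):
    """Brute-force density peaks clustering (illustrative sketch only)."""
    n = len(points)

    def dist(i, j):
        return math.dist(points[i], points[j])

    # Density: number of other points within distance d_cut.
    density = [
        sum(1 for j in range(n) if j != i and dist(i, j) <= d_cut)
        for i in range(n)
    ]

    def rank(i):
        # Strict total order on points; ties in density go to the lower index.
        return (density[i], -i)

    # Predicate search: the dependent point of i is its nearest neighbor
    # among the points that rank strictly higher than i.
    dependent = [None] * n
    dep_dist = [math.inf] * n
    for i in range(n):
        candidates = [j for j in range(n) if rank(j) > rank(i)]
        if candidates:
            dependent[i] = min(candidates, key=lambda j: dist(i, j))
            dep_dist[i] = dist(i, dependent[i])

    # Simplified center selection: largest dependent distances.
    centers = sorted(range(n), key=lambda i: -dep_dist[i])[:num_clusters]
    labels = [None] * n
    for cluster_id, c in enumerate(centers):
        labels[c] = cluster_id

    # Visit points from highest to lowest rank so that each point's
    # dependent point already has a label when the point is processed.
    for i in sorted(range(n), key=rank, reverse=True):
        if labels[i] is None:
            labels[i] = labels[dependent[i]]
    return labels
```

In this sketch the predicate search is a linear scan per point, which is the quadratic step that an approximate nearest neighbor index would replace.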

arXiv.org e-Print Archive

Suitability of Nearest Neighbour Indexes for Multimedia Relevance Feedback

Author: Aumüller Martin
Jónsson Björn Thór
Khan Omar Shahbaz
Publication date: 01/01/2023

The IT University of Copenhagen's Repository