13 research outputs found

    Approximate Geometric MST Range Queries

    Range searching is a widely used method in computational geometry for efficiently accessing local regions of a large data set. Typically, range searching involves either counting or reporting the points lying within a given query region, but it is often desirable to compute statistics that better describe the structure of the point set lying within the region, not just the count. In this paper we consider the geometric minimum spanning tree (MST) problem in the context of range searching where approximation is allowed. We are given a set P of n points in R^d. The objective is to preprocess P so that given an admissible query region Q, it is possible to efficiently approximate the weight of the minimum spanning tree of the subset of P lying within Q. There are two natural sources of approximation error: first, treating Q as a fuzzy object, and second, approximating the MST weight itself. To model this, we assume that we are given two positive real approximation parameters eps_q and eps_w. Following the typical practice in approximate range searching, the range is expressed as two shapes Q^- and Q^+, where Q^- is contained in Q which is contained in Q^+, and their boundaries are separated by a distance of at least eps_q diam(Q). Points within Q^- must be included and points external to Q^+ cannot be included. A weight W is a valid answer to the query if there exist subsets P' and P'' of P, such that P cap Q^- is contained in P' which is contained in P'' which is contained in P cap Q^+ and wt(MST(P')) <= W <= (1+eps_w) wt(MST(P'')). In this paper, we present an efficient data structure for answering such queries. Our approach uses simple data structures based on quadtrees, and it can be applied whenever Q^- and Q^+ are compact sets of constant combinatorial complexity. It uses space O(n), and it answers queries in time O(log n + 1/(eps_q eps_w)^{d + O(1)}). The O(1) term is a small constant independent of dimension, and the hidden constant factor in the overall running time depends on d, but not on eps_q or eps_w. Preprocessing requires knowledge of eps_w, but not eps_q.
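
    For orientation, the quantity being approximated can be computed exactly by brute force. The sketch below is not the quadtree-based structure from the abstract: it filters the points by an axis-aligned box (an assumed query shape) and runs Prim's algorithm in O(m^2) time on the m points inside. The function names mst_weight and range_mst_weight are hypothetical.

```python
import math

def mst_weight(points):
    """Exact Euclidean MST weight via Prim's algorithm, O(m^2) time."""
    m = len(points)
    if m < 2:
        return 0.0
    in_tree = [False] * m
    best = [math.inf] * m   # cheapest known edge linking each point to the tree
    best[0] = 0.0
    total = 0.0
    for _ in range(m):
        # Pick the point outside the tree with the cheapest connecting edge.
        u = min((i for i in range(m) if not in_tree[i]), key=best.__getitem__)
        in_tree[u] = True
        total += best[u]
        for v in range(m):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
    return total

def range_mst_weight(points, lo, hi):
    """MST weight of the points inside the box [lo, hi] (assumed query shape)."""
    inside = [p for p in points
              if all(l <= x <= h for x, l, h in zip(p, lo, hi))]
    return mst_weight(inside)

pts = [(0, 0), (1, 0), (1, 1), (5, 5)]
w = range_mst_weight(pts, lo=(0, 0), hi=(2, 2))
```

    Under the abstract's definition, a returned weight W is acceptable when it lies between wt(MST(P')) and (1+eps_w) wt(MST(P'')) for some admissible subsets P' and P''; this exact baseline corresponds to the case where both subsets equal the points inside Q.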

    Approximate Range Queries for Clustering

    We study approximate range searching for three variants of the clustering problem with a set P of n points in d-dimensional Euclidean space and axis-parallel rectangular range queries: the k-median, k-means, and k-center range-clustering query problems. We present data structures and query algorithms that efficiently compute (1+epsilon)-approximations to the optimal clusterings of P cap Q for a query consisting of an orthogonal range Q, an integer k, and a value epsilon > 0.

    New Bounds for Range Closest-Pair Problems

    Given a dataset S of points in R^2, the range closest-pair (RCP) problem aims to preprocess S into a data structure such that when a query range X is specified, the closest pair in S cap X can be reported efficiently. The RCP problem can be viewed as a range-search version of the classical closest-pair problem, and finds applications in many areas. Due to its non-decomposability, the RCP problem is much more challenging than many traditional range-search problems. This paper revisits the RCP problem, and proposes new data structures for various query types including quadrants, strips, rectangles, and halfplanes. Both worst-case and average-case analyses (in the sense that the data points are drawn uniformly and independently from the unit square) are applied to these new data structures, which result in new bounds for the RCP problem. Some of the new bounds significantly improve the previous results, while the others are entirely new.

    Searching for the Closest-Pair in a Query Translate

    We consider a range-search variant of the closest-pair problem. Let Gamma be a fixed shape in the plane. We are interested in storing a given set of n points in the plane in some data structure such that for any specified translate of Gamma, the closest pair of points contained in the translate can be reported efficiently. We present results on this problem for two important settings: when Gamma is a polygon (possibly with holes) and when Gamma is a general convex body whose boundary is smooth. When Gamma is a polygon, we present a data structure using O(n) space and O(log n) query time, which is asymptotically optimal. When Gamma is a general convex body with a smooth boundary, we give a near-optimal data structure using O(n log n) space and O(log^2 n) query time. Our results settle some open questions posed by Xue et al. at SoCG 2018.

    Optimal-Location-Selection Query Processing in Spatial Databases

    Abstract: This paper introduces and solves a novel type of spatial query, namely, Optimal-Location-Selection (OLS) search, which has many applications in real life. Given a data object set DA, a target object set DB, a spatial region R, and a critical distance dc in a multidimensional space, an OLS query retrieves those target objects in DB that are outside R but have maximal optimality. Here, the optimality of a target object b in DB located outside R is defined as the number of data objects from DA that are inside R and whose distance to b does not exceed dc. When there is a tie, the accumulated distance from the data objects to b serves as the tie breaker, and the one with the smaller distance has the better optimality. In this paper, we present the optimality metric, formalize the OLS query, and propose several algorithms for processing OLS queries efficiently. A comprehensive experimental evaluation has been conducted using both real and synthetic data sets to demonstrate the efficiency and effectiveness of the proposed algorithms. Index Terms: Query processing, optimal-location-selection, spatial database, algorithm.
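
    The optimality metric defined above can be illustrated with a brute-force scan. This is a minimal sketch of the definition only, not one of the paper's algorithms. It assumes points are coordinate tuples, that R is supplied as a membership predicate (the hypothetical in_region), and that the tie-breaking accumulated distance is summed over the counted data objects; the abstract does not spell out that last detail.

```python
import math

def ols_query(data_objects, target_objects, in_region, dc):
    """Return (best_target, count, accumulated_distance), or None if no
    target lies outside the region. Brute force, O(|DA| * |DB|)."""
    # Only data objects inside R contribute to optimality.
    inside = [a for a in data_objects if in_region(a)]
    best = None
    for b in target_objects:
        if in_region(b):
            continue  # candidate targets must lie outside R
        near = [d for d in (math.dist(a, b) for a in inside) if d <= dc]
        # Higher count wins; on a tie, smaller accumulated distance wins.
        # Assumption: the accumulated distance covers the counted objects only.
        key = (-len(near), sum(near))
        if best is None or key < best[0]:
            best = (key, b)
    if best is None:
        return None
    (neg_count, acc), b = best
    return b, -neg_count, acc

def unit_square(p):
    return 0 <= p[0] <= 1 and 0 <= p[1] <= 1

DA = [(0.5, 0.5), (0.9, 0.5), (0.1, 0.1)]
DB = [(1.2, 0.5), (2.0, 2.0), (0.5, 0.5)]
result = ols_query(DA, DB, unit_square, dc=0.8)
```

    In this example the target (0.5, 0.5) is skipped because it lies inside R, and (1.2, 0.5) wins with two data objects within the critical distance.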

    On the power of the semi-separated pair decomposition

    A Semi-Separated Pair Decomposition (SSPD), with parameter s > 1, of a set S is a set {(A_i, B_i)} of pairs of subsets of S such that for each i, there are balls D_{A_i} and D_{B_i} containing A_i and B_i respectively such that d(D_{A_i}, D_{B_i}) >= s · min(radius(D_{A_i}), radius(D_{B_i})), and for any two points p, q in S there is a unique index i such that p in A_i and q in B_i or vice versa. In this paper, we use the SSPD to obtain the following results: First, we consider the construction of geometric t-spanners in the context of imprecise points and we prove that any set of n imprecise points, modeled as pairwise disjoint balls, admits a t-spanner with edges which can be computed in time. If all balls have the same radius, the number of edges reduces to . Secondly, for a set of n points in the plane, we design a query data structure for half-plane closest-pair queries that can be built in time using space and answers a query in time, for any ε > 0. By reducing the preprocessing time to and using space, the query can be answered in time. Moreover, we improve the preprocessing time of an existing axis-parallel rectangle closest-pair query data structure from quadratic to near-linear. Finally, we revisit some previously studied problems, namely spanners for complete k-partite graphs and l

    New Plane-Sweep Algorithms for Distance-Based Join Queries in Spatial Databases

    Efficient and effective processing of the distance-based join query (DJQ) is of great importance in spatial databases due to the wide range of applications that may involve such queries (mapping, urban planning, transportation planning, resource management, etc.). The most representative and studied DJQs are the K Closest Pairs Query (KCPQ) and the εDistance Join Query (εDJQ). These spatial queries involve two spatial data sets and a distance function to measure the degree of closeness, along with a given number of pairs in the final result (K) or a distance threshold (ε). In this paper, we propose four new plane-sweep-based algorithms for KCPQs and their extensions for εDJQs in the context of spatial databases, without the use of an index for either of the two disk-resident data sets (since building and using indexes does not always benefit processing performance). They employ a combination of plane-sweep algorithms and space partitioning techniques to join the data sets. Finally, we present results of an extensive experimental study that compares the efficiency and effectiveness of the proposed algorithms for KCPQs and εDJQs. This performance study, conducted on medium and big spatial data sets (real and synthetic), validates that the proposed plane-sweep-based algorithms are very promising in terms of both efficiency and effectiveness when neither input is indexed. Moreover, the best of the new algorithms is experimentally compared to the best algorithm that is based on the R-tree (a widely accepted access method), for KCPQs and εDJQs, using the same data sets. This comparison shows that the new algorithms outperform R-tree based algorithms in most cases.
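
    As background for the KCPQ described above, the following is a minimal in-memory sketch of a classic plane sweep for K closest pairs between two point sets. It is not one of the four algorithms proposed in the paper, which target disk-resident data and add space partitioning. The function name k_closest_pairs is hypothetical, and points are assumed to be coordinate tuples.

```python
import heapq
import math

def k_closest_pairs(P, Q, k):
    """K closest (p, q) pairs with p in P, q in Q, via a sweep along x.
    Returns a list of (distance, p, q) sorted by distance."""
    if k <= 0:
        return []
    P, Q = sorted(P), sorted(Q)   # sort both sets along the sweep axis
    heap = []                     # max-heap of the k best pairs: (-distance, p, q)
    start = 0                     # first Q index that can still matter
    for p in P:
        bound = -heap[0][0] if len(heap) == k else math.inf
        # The bound only shrinks and p moves right, so points of Q left of
        # the sweep window can be discarded permanently.
        while start < len(Q) and Q[start][0] < p[0] - bound:
            start += 1
        for j in range(start, len(Q)):
            q = Q[j]
            bound = -heap[0][0] if len(heap) == k else math.inf
            if q[0] - p[0] > bound:
                break             # every later q is even farther along x
            d = math.dist(p, q)
            if len(heap) < k:
                heapq.heappush(heap, (-d, p, q))
            elif d < bound:
                heapq.heapreplace(heap, (-d, p, q))
    return sorted((-nd, p, q) for nd, p, q in heap)

P = [(0, 0), (5, 5), (10, 0)]
Q = [(1, 0), (5, 6), (20, 20)]
pairs = k_closest_pairs(P, Q, k=2)
```

    The same sweep adapts to the εDJQ by fixing the bound to the threshold ε and reporting every pair whose distance does not exceed it, instead of maintaining a heap of size K.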

    A unified approach for computing top-k pairs in multidimensional space
