
    Helly-Type Theorems in Property Testing

    Helly's theorem is a fundamental result in discrete geometry, describing the ways in which convex sets intersect with each other. If $S$ is a set of $n$ points in $R^d$, we say that $S$ is $(k,G)$-clusterable if it can be partitioned into $k$ clusters (subsets) such that each cluster can be contained in a translated copy of a geometric object $G$. In this paper, as an application of Helly's theorem, by taking a constant size sample from $S$, we present a testing algorithm for $(k,G)$-clustering, i.e., to distinguish between two cases: when $S$ is $(k,G)$-clusterable, and when it is $\epsilon$-far from being $(k,G)$-clusterable. A set $S$ is $\epsilon$-far $(0<\epsilon\leq1)$ from being $(k,G)$-clusterable if at least $\epsilon n$ points need to be removed from $S$ to make it $(k,G)$-clusterable. We solve this problem for $k=1$ and when $G$ is a symmetric convex object. For $k>1$, we solve a weaker version of this problem. Finally, as an application of our testing result, in clustering with outliers, we show that one can find the approximate clusters by querying a constant size sample, with high probability.
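
The sample-and-check idea for the $k=1$ case can be illustrated with a minimal sketch. It assumes that $G$ is an axis-aligned box with given side widths (one example of a symmetric convex object), and the names `fits_in_translated_box` and `accepts_one_cluster` as well as the caller-chosen `sample_size` are hypothetical. The paper's actual sample size and analysis are not reproduced here.

```python
import random


def fits_in_translated_box(points, widths):
    """Return True if all points fit in some translate of a box with these side widths.

    A point set fits in a translated axis-aligned box exactly when its
    extent along every coordinate is at most the box width on that axis.
    """
    if not points:
        return True
    return all(
        max(p[i] for p in points) - min(p[i] for p in points) <= widths[i]
        for i in range(len(widths))
    )


def accepts_one_cluster(points, widths, sample_size, rng=None):
    """Accept if a uniform random sample of the points fits in a translated box.

    sample_size is an assumption left to the caller; it is not the bound
    derived in the paper. points must be a list of coordinate tuples.
    """
    rng = rng or random.Random()
    sample = rng.sample(points, min(sample_size, len(points)))
    return fits_in_translated_box(sample, widths)
```

In this sketch a clusterable input is always accepted, because every subset of a set that fits in a translated box also fits. An input that is far from clusterable is rejected only when the sample happens to expose the violation, which is why the sample size matters.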

    Multi-Step Processing of Spatial Joins

    Spatial joins are one of the most important operations for combining spatial objects of several relations. In this paper, spatial join processing is studied in detail for extended spatial objects in two-dimensional data space. We present an approach for spatial join processing that is based on three steps. First, a spatial join is performed on the minimum bounding rectangles of the objects, returning a set of candidates. Various approaches for accelerating this step of join processing were examined at last year's conference [BKS 93a]. In this paper, we focus on the problem of how to compute the answers from the set of candidates, which is handled by the following two steps. First of all, sophisticated approximations are used to identify answers as well as to filter out false hits from the set of candidates. For this purpose, we investigate various types of conservative and progressive approximations. In the last step, the exact geometry of the remaining candidates has to be tested against the join predicate. The time required for computing spatial join predicates can be reduced substantially when objects are adequately organized in main memory. In our approach, objects are first decomposed into simple components which are exclusively organized by a main-memory resident spatial data structure. Overall, we present a complete approach of spatial join processing on complex spatial objects. The performance of the individual steps of our approach is evaluated with data sets from real cartographic applications. The results show that our approach reduces the total execution time of the spatial join by factors
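
The filter-then-refine structure described above can be sketched as follows. This is a simplified illustration, not the paper's method: the objects are assumed to be circles given as `(x, y, r)` tuples so that the exact test stays short, the candidate step is a plain nested loop rather than an accelerated join, and the intermediate approximation step is omitted. All function names are hypothetical.

```python
import math


def mbr(circle):
    """Minimum bounding rectangle (xmin, ymin, xmax, ymax) of a circle (x, y, r)."""
    x, y, r = circle
    return (x - r, y - r, x + r, y + r)


def mbrs_intersect(a, b):
    """True if two axis-aligned rectangles overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def circles_intersect(c1, c2):
    """Exact geometry test: circles intersect when center distance <= r1 + r2."""
    return math.hypot(c1[0] - c2[0], c1[1] - c2[1]) <= c1[2] + c2[2]


def mbr_candidates(rel_a, rel_b):
    """Filter step: pairs whose bounding rectangles intersect (may include false hits)."""
    return [(a, b) for a in rel_a for b in rel_b if mbrs_intersect(mbr(a), mbr(b))]


def spatial_join(rel_a, rel_b):
    """Refinement step: keep only candidates that pass the exact geometry test."""
    return [(a, b) for a, b in mbr_candidates(rel_a, rel_b) if circles_intersect(a, b)]
```

For example, the circles `(0, 0, 1)` and `(1.9, 1.9, 1)` have overlapping bounding rectangles but do not intersect, so the pair survives the filter step as a false hit and is discarded by the exact test.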

    Testing surface area with arbitrary accuracy

    Recently, Kothari et al. gave an algorithm for testing the surface area of an arbitrary set $A \subset [0, 1]^n$. Specifically, they gave a randomized algorithm such that if $A$'s surface area is less than $S$ then the algorithm will accept with high probability, and if the algorithm accepts with high probability then there is some perturbation of $A$ with surface area at most $\kappa_n S$. Here, $\kappa_n$ is a dimension-dependent constant which is strictly larger than 1 if $n \ge 2$, and grows to $4/\pi$ as $n \to \infty$. We give an improved analysis of Kothari et al.'s algorithm. In doing so, we replace the constant $\kappa_n$ with $1 + \eta$ for $\eta > 0$ arbitrary. We also extend the algorithm to more general measures on Riemannian manifolds. Comment: 5 page

    The Optimal Mechanism in Differential Privacy

    We derive the optimal $\epsilon$-differentially private mechanism for a single real-valued query function under a very general utility-maximization (or cost-minimization) framework. The class of noise probability distributions in the optimal mechanism has staircase-shaped probability density functions which are symmetric (around the origin), monotonically decreasing and geometrically decaying. The staircase mechanism can be viewed as a geometric mixture of uniform probability distributions, providing a simple algorithmic description for the mechanism. Furthermore, the staircase mechanism naturally generalizes to discrete query output settings as well as more abstract settings. We explicitly derive the optimal noise probability distributions with minimum expectation of noise amplitude and power. Comparing the optimal performances with those of the Laplacian mechanism, we show that in the high privacy regime ($\epsilon$ is small), the Laplacian mechanism is asymptotically optimal as $\epsilon \to 0$; in the low privacy regime ($\epsilon$ is large), the minimum expectation of noise amplitude and minimum noise power are $\Theta(\Delta e^{-\frac{\epsilon}{2}})$ and $\Theta(\Delta^2 e^{-\frac{2\epsilon}{3}})$ as $\epsilon \to +\infty$, while the expectation of noise amplitude and power using the Laplacian mechanism are $\frac{\Delta}{\epsilon}$ and $\frac{2\Delta^2}{\epsilon^2}$, where $\Delta$ is the sensitivity of the query function. We conclude that the gains are more pronounced in the low privacy regime. Comment: 40 pages, 5 figures. Part of this work was presented in DIMACS Workshop on Recent Work on Differential Privacy across Computer Science, October 24 - 26, 201
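
The Laplacian baseline quoted in this abstract can be checked with a small sketch. It covers only the Laplacian mechanism, not the staircase mechanism, and the function names are hypothetical. Laplace noise with scale $\Delta/\epsilon$ is drawn as the difference of two independent exponential variables, a standard construction, and the two closed forms are the ones stated in the abstract.

```python
import random


def laplace_noise(sensitivity, epsilon, rng):
    """Draw Laplace noise with scale sensitivity / epsilon.

    The difference of two independent exponentials with the same mean
    follows a Laplace distribution centered at zero with that scale.
    """
    scale = sensitivity / epsilon
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def laplace_expected_amplitude(sensitivity, epsilon):
    """Expected noise amplitude E|X| = Delta / epsilon."""
    return sensitivity / epsilon


def laplace_expected_power(sensitivity, epsilon):
    """Expected noise power E[X^2] = 2 * Delta^2 / epsilon^2."""
    return 2.0 * sensitivity ** 2 / epsilon ** 2
```

Averaging `abs(laplace_noise(1.0, 2.0, rng))` over many draws should land close to `laplace_expected_amplitude(1.0, 2.0)`. Since these quantities decay only polynomially in $\epsilon$ while the stated optimum decays exponentially, the gap widens as $\epsilon$ grows, which matches the abstract's conclusion about the low privacy regime.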