141,587 research outputs found

    Adaptive Range Counting and Other Frequency-Based Range Query Problems

    Get PDF
    We consider variations of range searching in which, given a query range, our goal is to compute some function based on frequencies of points that lie in the range. The most basic such computation involves counting the number of points in a query range. Data structures that compute this function solve the well-studied range counting problem. We consider adaptive and approximate data structures for the 2-D orthogonal range counting problem under the w-bit word RAM model. The query time of an adaptive range counting data structure is sensitive to k, the number of points being counted. We give an adaptive data structure that requires O(n loglog n) space and O(loglog n + log_w k) query time. Non-adaptive data structures on the other hand require Ω(log_w n) query time (Pătraşcu, 2007). Our specific bounds are interesting for two reasons. First, when k=O(1), our bounds match the state of the art for the 2-D orthogonal range emptiness problem (Chan et al., 2011). Second, when k=Θ(n), our data structure is tight to the aforementioned Ω(log_w n) query time lower bound. We also give approximate data structures for 2-D orthogonal range counting whose bounds match the state of the art for the 2-D orthogonal range emptiness problem. Our first data structure requires O(n loglog n) space and O(loglog n) query time. Our second data structure requires O(n) space and O(log^ε n) query time for any fixed constant ε>0. These data structures compute an approximation k' such that (1-δ)k≤k'≤(1+δ)k for any fixed constant δ>0. The range selection query problem in an array involves finding the kth lowest element in a given subarray. Range selection in an array is very closely related to 3-sided 2-D orthogonal range counting. An extension of our technique for 3-sided 2-D range counting yields an efficient solution to adaptive range selection in an array. In particular, we present an adaptive data structure that requires O(n) space and O(log_w k) query time, exactly matching a recent lower bound (Jørgensen and Larsen, 2011). We next consider a variety of frequency-based range query problems in arrays. We give efficient data structures for the range mode and least frequent element query problems and also exhibit the hardness of these problems by reducing Boolean matrix multiplication to the construction and use of a range mode or least frequent element data structure. We also give data structures for the range α-majority and α-minority query problems. An α-majority is an element whose frequency in a subarray is greater than an α fraction of the size of the subarray; any other element is an α-minority. Surprisingly, geometric insights prove to be useful even in the design of our 1-D range α-majority and α-minority data structures

    On Geometric Range Searching, Approximate Counting and Depth Problems

    Get PDF
    In this thesis we deal with problems connected to range searching, which is one of the central areas of computational geometry. The dominant problems in this area are halfspace range searching, simplex range searching and orthogonal range searching and research into these problems has spanned decades. For many range searching problems, the best possible data structures cannot offer fast (i.e., polylogarithmic) query times if we limit ourselves to near linear storage. Even worse, it is conjectured (and proved in some cases) that only very small improvements to these might be possible. This inefficiency has encouraged many researchers to seek alternatives through approximations. In this thesis we continue this line of research and focus on relative approximation of range counting problems. One important problem where it is possible to achieve significant speedup through approximation is halfspace range counting in 3D. Here we continue the previous research done and obtain the first optimal data structure for approximate halfspace range counting in 3D. Our data structure has the slight advantage of being Las Vegas (the result is always correct) in contrast to the previous methods that were Monte Carlo (the correctness holds with high probability). Another series of problems where approximation can provide us with substantial speedup comes from robust statistics. We recognize three problems here: approximate Tukey depth, regression depth and simplicial depth queries. In 2D, we obtain an optimal data structure capable of approximating the regression depth of a query hyperplane. We also offer a linear space data structure which can answer approximate Tukey depth queries efficiently in 3D. These data structures are obtained by applying our ideas for the approximate halfspace counting problem. Approximating the simplicial depth turns out to be much more difficult, however. Computing the simplicial depth of a given point is more computationally challenging than most other definitions of data depth. In 2D we obtain the first data structure which uses near linear space and can answer approximate simplicial depth queries in polylogarithmic time. As applications of this result, we provide two non-trivial methods to approximate the simplicial depth of a given point in higher dimension. Along the way, we establish a tight combinatorial relationship between the Tukey depth of any given point and its simplicial depth. Another problem investigated in this thesis is the dominance reporting problem, an important special case of orthogonal range reporting. In three dimensions, we solve this problem in the pointer machine model and the external memory model by offering the first optimal data structures in these models of computation. Also, in the RAM model and for points from an integer grid we reduce the space complexity of the fastest known data structure to optimal. Using known techniques in the literature, we can use our results to obtain solutions for the orthogonal range searching problem as well. The query complexity offered by our orthogonal range reporting data structures match the most efficient query complexities known in the literature but our space bounds are lower than the previous methods in the external memory model and RAM model where the input is a subset of an integer grid. The results also yield improved orthogonal range searching in higher dimensions (which shows the significance of the dominance reporting problem). Intersection searching is a generalization of range searching where we deal with more complicated geometric objects instead of points. We investigate the rectilinear disjoint polygon counting problem which is a specialized intersection counting problem. We provide a linear-size data structure capable of counting the number of disjoint rectilinear polygons intersecting any rectilinear polygon of constant size. The query time (as well as some other properties of our data structure) resembles the classical simplex range searching data structures

    Succinct Color Searching in One Dimension

    Get PDF
    In this paper we study succinct data structures for one-dimensional color reporting and color counting problems. We are given a set of n points with integer coordinates in the range [1,m] and every point is assigned a color from the set {1,...sigma}. A color reporting query asks for the list of distinct colors that occur in a query interval [a,b] and a color counting query asks for the number of distinct colors in [a,b]. We describe a succinct data structure that answers approximate color counting queries in O(1) time and uses mathcal{B}(n,m) + O(n) + o(mathcal{B}(n,m)) bits, where mathcal{B}(n,m) is the minimum number of bits required to represent an arbitrary set of size n from a universe of m elements. Thus we show, somewhat counterintuitively, that it is not necessary to store colors of points in order to answer approximate color counting queries. In the special case when points are in the rank space (i.e., when n=m), our data structure needs only O(n) bits. Also, we show that Omega(n) bits are necessary in that case. Then we turn to succinct data structures for color reporting. We describe a data structure that uses mathcal{B}(n,m) + nH_d(S) + o(mathcal{B}(n,m)) + o(nlgsigma) bits and answers queries in O(k+1) time, where k is the number of colors in the answer, and nH_d(S) (d=log_sigma n) is the d-th order empirical entropy of the color sequence. Finally, we consider succinct color reporting under restricted updates. Our dynamic data structure uses nH_d(S)+o(nlgsigma) bits and supports queries in O(k+1) time

    Dynamic Range Majority Data Structures

    Full text link
    Given a set PP of coloured points on the real line, we study the problem of answering range α\alpha-majority (or "heavy hitter") queries on PP. More specifically, for a query range QQ, we want to return each colour that is assigned to more than an α\alpha-fraction of the points contained in QQ. We present a new data structure for answering range α\alpha-majority queries on a dynamic set of points, where α(0,1)\alpha \in (0,1). Our data structure uses O(n) space, supports queries in O((lgn)/α)O((\lg n) / \alpha) time, and updates in O((lgn)/α)O((\lg n) / \alpha) amortized time. If the coordinates of the points are integers, then the query time can be improved to O(lgn/(αlglgn)+(lg(1/α))/α))O(\lg n / (\alpha \lg \lg n) + (\lg(1/\alpha))/\alpha)). For constant values of α\alpha, this improved query time matches an existing lower bound, for any data structure with polylogarithmic update time. We also generalize our data structure to handle sets of points in d-dimensions, for d2d \ge 2, as well as dynamic arrays, in which each entry is a colour.Comment: 16 pages, Preliminary version appeared in ISAAC 201

    Linear-Space Data Structures for Range Mode Query in Arrays

    Full text link
    A mode of a multiset SS is an element aSa \in S of maximum multiplicity; that is, aa occurs at least as frequently as any other element in SS. Given a list A[1:n]A[1:n] of nn items, we consider the problem of constructing a data structure that efficiently answers range mode queries on AA. Each query consists of an input pair of indices (i,j)(i, j) for which a mode of A[i:j]A[i:j] must be returned. We present an O(n22ϵ)O(n^{2-2\epsilon})-space static data structure that supports range mode queries in O(nϵ)O(n^\epsilon) time in the worst case, for any fixed ϵ[0,1/2]\epsilon \in [0,1/2]. When ϵ=1/2\epsilon = 1/2, this corresponds to the first linear-space data structure to guarantee O(n)O(\sqrt{n}) query time. We then describe three additional linear-space data structures that provide O(k)O(k), O(m)O(m), and O(ji)O(|j-i|) query time, respectively, where kk denotes the number of distinct elements in AA and mm denotes the frequency of the mode of AA. Finally, we examine generalizing our data structures to higher dimensions.Comment: 13 pages, 2 figure

    Planar Visibility: Testing and Counting

    Full text link
    In this paper we consider query versions of visibility testing and visibility counting. Let SS be a set of nn disjoint line segments in R2\R^2 and let ss be an element of SS. Visibility testing is to preprocess SS so that we can quickly determine if ss is visible from a query point qq. Visibility counting involves preprocessing SS so that one can quickly estimate the number of segments in SS visible from a query point qq. We present several data structures for the two query problems. The structures build upon a result by O'Rourke and Suri (1984) who showed that the subset, VS(s)V_S(s), of R2\R^2 that is weakly visible from a segment ss can be represented as the union of a set, CS(s)C_S(s), of O(n2)O(n^2) triangles, even though the complexity of VS(s)V_S(s) can be Ω(n4)\Omega(n^4). We define a variant of their covering, give efficient output-sensitive algorithms for computing it, and prove additional properties needed to obtain approximation bounds. Some of our bounds rely on a new combinatorial result that relates the number of segments of SS visible from a point pp to the number of triangles in sSCS(s)\bigcup_{s\in S} C_S(s) that contain pp.Comment: 22 page
    corecore