11 research outputs found

    Succinct Indices for Range Queries with applications to Orthogonal Range Maxima

    We consider the problem of preprocessing $N$ points in 2D, each endowed with a priority, to answer the following queries: given an axis-parallel rectangle, determine the point with the largest priority in the rectangle. Using the ideas of the effective entropy of range maxima queries and succinct indices for range maxima queries, we obtain a structure that uses $O(N)$ words and answers the above query in $O(\log N \log \log N)$ time. This is a direct improvement of Chazelle's result from FOCS 1985 for this problem: Chazelle required $O(N/\epsilon)$ words to answer queries in $O((\log N)^{1+\epsilon})$ time for any constant $\epsilon > 0$. Comment: To appear in ICALP 201

    Encoding Two-Dimensional Range Top-k Queries Revisited

    We consider the problem of encoding two-dimensional arrays, whose elements come from a total order, for answering Top-k queries. The aim is to obtain encodings that use space close to the information-theoretic lower bound and that can be constructed efficiently. For 2 x n arrays, we first give upper and lower bounds on space for answering sorted and unsorted 3-sided Top-k queries. For m x n arrays, with m <= n and k <= mn, we obtain an $(m \lg \binom{(k+1)n}{n} + 4nm(m-1) + o(n))$-bit encoding for answering sorted 4-sided Top-k queries. This improves the $\min\{O(mn \lg n),\ m^2 \lg \binom{(k+1)n}{n} + m \lg m + o(n)\}$-bit encoding of Jo et al. [CPM, 2016] when m = o(lg n). This is a consequence of a new encoding that encodes a 2 x n array to support sorted 4-sided Top-k queries on it using an additional 4n bits, in addition to the encodings to support the Top-k queries on individual rows. This new encoding is a non-trivial generalization of the encoding of Jo et al. [CPM, 2016] that supports sorted 4-sided Top-2 queries on it using an additional 3n bits. We also give almost optimal space encodings for 3-sided Top-k queries, and show lower bounds on encodings for 3-sided and 4-sided Top-k queries.

    An Encoding for Order-Preserving Matching

    Encoding data structures store enough information to answer the queries they are meant to support but not enough to recover their underlying datasets. In this paper we give the first encoding data structure for the challenging problem of order-preserving pattern matching. This problem was introduced only a few years ago but has already attracted significant attention because of its applications in data analysis. Two strings are said to be an order-preserving match if the relative order of their characters is the same: e.g., (4, 1, 3, 2) and (10, 3, 7, 5) are an order-preserving match. We show how, given a string S[1..n] over an arbitrary alphabet of size sigma and a constant c >= 1, we can build an O(n log log n)-bit encoding such that later, given a pattern P[1..m] with m >= log^c n, we can return the number of order-preserving occurrences of P in S in O(m) time. Within the same time bound we can also return the starting position of some order-preserving match for P in S (if such a match exists). We prove that our space bound is within a constant factor of optimal if log(sigma) = Omega(log log n); our query time is optimal if log(sigma) = Omega(log n). Our space bound contrasts with the Omega(n log n) bits needed in the worst case to store S itself, an index for order-preserving pattern matching with no restrictions on the pattern length, or an index for standard pattern matching even with restrictions on the pattern length. Moreover, we can build our encoding knowing only how each character compares to O(log^c n) neighbouring characters.
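
The definition of an order-preserving match in the abstract above can be checked directly. The following is a minimal brute-force sketch in Python, not the paper's encoding: it assumes distinct characters (as in the abstract's example), and the function names `order_signature`, `is_order_preserving_match`, and `count_op_occurrences` are hypothetical names chosen for illustration.

```python
def order_signature(seq):
    """Rank of each element within seq (0 = smallest).

    Assumes the elements are distinct, as in the abstract's example.
    """
    order = sorted(range(len(seq)), key=lambda i: seq[i])
    ranks = [0] * len(seq)
    for rank, idx in enumerate(order):
        ranks[idx] = rank
    return ranks


def is_order_preserving_match(a, b):
    """True if a and b have the same length and the same relative order."""
    return len(a) == len(b) and order_signature(a) == order_signature(b)


def count_op_occurrences(text, pattern):
    """Count windows of text that order-preserving match pattern.

    Naive scan that re-ranks every window; far slower than the O(m)
    query time stated in the abstract.
    """
    m = len(pattern)
    target = order_signature(pattern)
    return sum(
        order_signature(text[i:i + m]) == target
        for i in range(len(text) - m + 1)
    )
```

For instance, `is_order_preserving_match((4, 1, 3, 2), (10, 3, 7, 5))` holds because both sequences have rank signature `[3, 0, 2, 1]`. Unlike the encoding described in the abstract, this sketch needs the full text to answer a query.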

    A Simple Linear-Space Data Structure for Constant-Time Range Minimum Query

    We revisit the range minimum query problem and present a new O(n)-space data structure that supports queries in O(1) time. Although previous data structures exist whose asymptotic bounds match ours, our goal is to introduce a new solution that is simple, intuitive, and practical without increasing asymptotic costs for query time or space.
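
To make the range minimum query interface concrete, here is a hedged sketch of the textbook sparse-table technique in Python. It is not the structure from the paper above: a sparse table answers queries with two table lookups but uses O(n log n) words rather than O(n). The class name `SparseTableRMQ` and its methods are assumptions made for illustration.

```python
class SparseTableRMQ:
    """Sparse table for range minimum queries (O(n log n) words)."""

    def __init__(self, values):
        self.values = list(values)
        n = len(self.values)
        # table[j][i] = index of the leftmost minimum in values[i : i + 2**j]
        self.table = [list(range(n))]
        j = 1
        while (1 << j) <= n:
            prev = self.table[j - 1]
            half = 1 << (j - 1)
            row = []
            for i in range(n - (1 << j) + 1):
                left, right = prev[i], prev[i + half]
                row.append(left if self.values[left] <= self.values[right] else right)
            self.table.append(row)
            j += 1

    def query(self, lo, hi):
        """Index of the leftmost minimum in values[lo..hi], inclusive."""
        if not 0 <= lo <= hi < len(self.values):
            raise IndexError("invalid range")
        # Cover [lo, hi] with two overlapping blocks of length 2**j.
        j = (hi - lo + 1).bit_length() - 1
        left = self.table[j][lo]
        right = self.table[j][hi - (1 << j) + 1]
        return left if self.values[left] <= self.values[right] else right
```

A query such as `SparseTableRMQ([3, 1, 4, 1, 5, 9, 2, 6]).query(2, 4)` returns the position of the minimum rather than its value, which is the convention the encoding results elsewhere in this list also use.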

    Entropy Trees and Range-Minimum Queries In Optimal Average-Case Space

    The range-minimum query (RMQ) problem is a fundamental data structuring task with numerous applications. Despite the fact that succinct solutions with worst-case optimal $2n+o(n)$ bits of space and constant query time are known, it has been unknown whether such a data structure can be made adaptive to the reduced entropy of random inputs (Davoodi et al. 2014). We construct a succinct data structure with the optimal $1.736n+o(n)$ bits of space on average for random RMQ instances, settling this open problem. Our solution relies on a compressed data structure for binary trees that is of independent interest. It can store a (static) binary search tree generated by random insertions in asymptotically optimal expected space and supports many queries in constant time. Using an instance-optimal encoding of subtrees, we furthermore obtain a "hyper-succinct" data structure for binary trees that improves upon the ultra-succinct representation of Jansson, Sadakane and Sung (2012).
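
The link between RMQ and binary trees mentioned above rests on a standard fact: the positions returned by range-minimum queries depend only on the shape of the array's Cartesian tree, not on the stored values. The Python sketch below illustrates that fact under stated assumptions (ties broken toward the leftmost minimum; `cartesian_tree_parents` and `naive_rmq` are hypothetical helper names). It does not implement the entropy-tree compression from the paper.

```python
def cartesian_tree_parents(values):
    """Parent index of each position in the min-Cartesian tree (-1 = root).

    Ties are broken toward the leftmost minimum.
    """
    parents = [-1] * len(values)
    stack = []  # indices on the rightmost path, values non-decreasing
    for i, v in enumerate(values):
        last_popped = -1
        while stack and values[stack[-1]] > v:
            last_popped = stack.pop()
        if last_popped != -1:
            parents[last_popped] = i  # popped subtree becomes i's left child
        if stack:
            parents[i] = stack[-1]    # i becomes right child of the stack top
        stack.append(i)
    return parents


def naive_rmq(values, lo, hi):
    """Index of the leftmost minimum in values[lo..hi], inclusive."""
    return min(range(lo, hi + 1), key=lambda i: values[i])


a = [3, 1, 4, 2]
b = [9, 1, 8, 2]  # different relative order, same Cartesian tree shape
same_shape = cartesian_tree_parents(a) == cartesian_tree_parents(b)
same_answers = all(
    naive_rmq(a, lo, hi) == naive_rmq(b, lo, hi)
    for lo in range(len(a))
    for hi in range(lo, len(a))
)
```

Because only the tree shape matters, an RMQ encoding needs to distinguish binary trees on n nodes rather than arrays, which is where bounds of the form 2n + o(n) bits come from.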

    Solving the Longest Common Extension Problem with an Encoding Data Structure (Pisin yhteinen jatke -ongelman ratkaiseminen koodaavan tietorakenteen avulla)

    In the longest common extension problem, the goal is to use a data structure to determine the length of the longest common prefix of two suffixes of a string. Solving the problem quickly is important, for example, in many string algorithms. In addition, the continuous growth in the amount of data increases the need to minimize the space the data structure occupies. This thesis discusses solving the longest common extension problem with an encoding data structure. In that case the original string can be accessed only through predefined queries, and the string itself is not needed to answer them. Typically an encoding data structure takes less space and contains less information than the original data. To provide background, traditional solutions to the longest common extension problem are discussed first. Next, the space requirement of data is examined in terms of information-theoretic entropy, which provides a basis for assessing the optimality of data structures' space usage and the time-space trade-off. Examples are also given of the use of encoding data structures and of space-saving solutions to the problem. A solution to the longest common extension problem that uses an encoding data structure is covered in detail. The solution has two main parts, which together determine the answer. The implementation makes use of several data structures, for example a suffix tree, a variant of the de Bruijn graph, and a spanning tree. In addition to the implementation of the algorithm and the data structure, the thesis examines improving the upper bound on the data structure's space requirement and representing the trees space-efficiently in the implementation. The main conclusions are as follows. When many queries are made, it can be worthwhile to solve the longest common extension problem with a data structure. As a result of the growth in the amount of data and advances in data structures, space-saving solutions to the problem have been proposed in recent years. They optimize time and space requirements in many different ways. Among others, there is a constant-time solution based on an encoding data structure that can achieve sublinear space without the original string being highly compressible. No comparative data is available on the performance of recent solutions in applications.
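
The query at the centre of the thesis above, the length of the longest common prefix of two suffixes of a string, can be stated in a few lines. This Python sketch is only the naive character-by-character baseline, assuming 0-based suffix start positions; the function name `longest_common_extension` is hypothetical, and none of the suffix-tree or de Bruijn graph machinery of the thesis is reproduced.

```python
def longest_common_extension(text, i, j):
    """Length of the longest common prefix of text[i:] and text[j:].

    Naive scan: O(answer) time per query and direct access to the text,
    which is exactly what an encoding data structure avoids.
    """
    n = len(text)
    length = 0
    while (
        i + length < n
        and j + length < n
        and text[i + length] == text[j + length]
    ):
        length += 1
    return length
```

With `text = "abracadabra"`, the suffixes starting at positions 0 and 7 share the prefix `"abra"`, so the query returns 4.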

    Space Efficient Encodings for Bit-strings, Range queries and Related Problems

    Doctoral dissertation (Ph.D.), Seoul National University Graduate School, Department of Electrical and Computer Engineering, February 2016. Srinivasa Rao Satti.

    In this thesis, we design and implement various space-efficient data structures. Most of these structures use space close to the information-theoretic lower bound while supporting the queries efficiently. In particular, this thesis is concerned with data structures for four problems: (i) supporting rank and select queries on compressed bit strings, (ii) the nearest larger neighbor problem, (iii) simultaneous encodings for range and next/previous larger/smaller value queries, and (iv) range Top-k queries on two-dimensional arrays.

    We first consider practical implementations of compressed bitvectors, which support rank and select operations on a given bit-string while storing the bit-string in compressed form [DBLP:conf/dcc/JoJORS14]. Our approach relies on variable-to-fixed encodings of the bit-string, an approach that has not yet been considered systematically for practical encodings of bitvectors. We show that this approach leads to fast practical implementations with low redundancy (i.e., the space used by the bitvector in addition to the compressed representation of the bit-string), and is a flexible and promising solution to the problem of supporting rank and select on moderately compressible bit-strings, such as those encountered in real-world applications.

    Next, we propose space-efficient data structures for the nearest larger neighbor problem [IWOCA2014, walcom-JoRS15]. Given a sequence of $n$ elements from a total order, and a position in the sequence, the nearest larger neighbor (NLN) query returns the position of the element which is closest to the query position and is larger than the element at the query position. The problem of finding all nearest larger neighbors has attracted interest due to its applications for parenthesis matching and in computational geometry [AsanoBK09, AsanoK13, BerkmanSV93]. We consider a data structure version of this problem, which is to preprocess a given sequence of elements to construct a data structure that can answer NLN queries efficiently. For one-dimensional arrays, we give time-space tradeoffs for the problem in the indexing model. For two-dimensional arrays, we give an optimal encoding with constant query time in the encoding model.

    We also propose space-efficient encodings which support various range queries, and previous and next smaller/larger value queries [cocoonJS15]. Given a sequence of $n$ elements from a total order, we obtain a $(4.088n + o(n))$-bit encoding that supports all these queries. For the case when we need to support all these queries in constant time, we give an encoding that takes $4.585n + o(n)$ bits. This improves the $(5.08n + o(n))$-bit encoding obtained by encoding the colored 2d-Min and 2d-Max heaps proposed by Fischer [Fischer11]. We extend the original DFUDS [BDMRRR05] encoding of the colored 2d-Min and 2d-Max heap that supports the queries in constant time. Then, we combine the extended DFUDS of the 2d-Min heap and 2d-Max heap using the Min-Max encoding of Gawrychowski and Nicholson [Gawry14] with some modifications. We also obtain encodings that take less space and support a subset of these queries.

    Finally, we consider various encodings that support range Top-k queries on a two-dimensional array containing elements from a total order. For an $m \times n$ array, we first propose an optimal encoding for answering one-sided Top-k queries, whose query range is restricted to $[1 \dots m][1 \dots a]$, for $1 \le a \le n$. Next, we propose an encoding for the general Top-k queries that takes $m^2 \lg \binom{(k+1)n}{n} + m \lg m + o(n)$ bits. This generalizes the Top-k encoding of Gawrychowski and Nicholson [Gawry14].

    Contents:
    Chapter 1 Introduction
      1.1 Computational model
        1.1.1 Encoding and indexing models
      1.2 Contribution of the thesis
      1.3 Organization of the thesis
    Chapter 2 Preliminaries
    Chapter 3 Compressed bit vectors based on variable-to-fixed encodings
      3.1 Introduction
      3.2 Bit-vectors using V2F coding
      3.3 V2F compression algorithms for bit-strings
        3.3.1 Tunstall code
        3.3.2 Enumerative codes
        3.3.3 LZW algorithm
        3.3.4 Empirical evaluation of the compressors
      3.4 Practical implementation of bitvectors based on V2F compression
        3.4.1 Testing Methodology
        3.4.2 Results of Empirical Evaluation
      3.5 Future works
    Chapter 4 Space Efficient Data Structures for Nearest Larger Neighbor
      4.1 Introduction
      4.2 Indexing NLV queries on 1D arrays
      4.3 Encoding NLN queries on 2D binary arrays
      4.4 Encoding NLN queries for general 2D arrays
        4.4.1 2D NLN in the encoding model: distinct case
        4.4.2 2D NLN in the encoding model: general case
      4.5 Open problems
    Chapter 5 Simultaneous encodings for range and next/previous larger/smaller value queries
      5.1 Introduction
      5.2 Preliminaries
        5.2.1 2d-Min heap
        5.2.2 Encoding range min-max queries
      5.3 Extended DFUDS for colored 2d-Min heap
      5.4 Encoding colored 2d-Min and 2d-Max heaps
        5.4.1 Combined data structure for DCMin(A) and DCMax(A)
        5.4.2 Encoding colored 2d-Min and 2d-Max heaps using less space
      5.5 Open problems
    Chapter 6 Encoding Two-dimensional range Top-k queries
      6.1 Introduction
      6.2 Encoding one-sided range Top-k queries on 2D array
      6.3 Encoding general range Top-k queries on 2D array
      6.4 Open problems
    Chapter 7 Conclusion
    Bibliography
    Abstract (in Korean)
    Docto

    29th International Symposium on Algorithms and Computation: ISAAC 2018, December 16-19, 2018, Jiaoxi, Yilan, Taiwan
