19,448 research outputs found

Secondary Indexing in One Dimension: Beyond B-trees and Bitmap Indexes

Authors: Rasmus Pagh, S. Srinivasa Rao
Publication date: 18/11/2008

Let S be a finite, ordered alphabet, and let x = x_1 x_2 ... x_n be a string over S. A "secondary index" for x answers alphabet range queries of the form: given a range [a_l; a_r] over S, return the set I_{[a_l; a_r]} = {i | x_i \in [a_l; a_r]}. Secondary indexes are heavily used in relational databases and scientific data analysis. It is well known that the obvious solution, storing a dictionary for the position set associated with each character, does not always give optimal query time. In this paper we give the first theoretically optimal data structure for the secondary indexing problem. In the I/O model, the amount of data read when answering a query is within a constant factor of the minimum space needed to represent I_{[a_l; a_r]}, assuming that the size of internal memory is (|S| log n)^{delta} blocks, for some constant delta > 0. The space usage of the data structure is O(n log |S|) bits in the worst case, and we further show how to bound the size of the data structure in terms of the 0-th order entropy of x. We show how to support updates, achieving various time-space trade-offs. We also consider an approximate version of the basic secondary indexing problem, where a query reports a superset of I_{[a_l; a_r]} containing each element not in I_{[a_l; a_r]} with probability at most epsilon, where epsilon > 0 is the false positive probability. For this problem, the amount of data that needs to be read by the query algorithm is reduced to O(|I_{[a_l; a_r]}| log(1/epsilon)) bits.
Comment: 16 pages
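The "obvious solution" mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the paper's data structure; it is only the naive baseline, under the assumption that the string fits in memory and that characters compare with Python's default ordering. The names `build_index` and `range_query` are hypothetical and chosen for illustration.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict


def build_index(x):
    """Map each character of x to the sorted list of 1-based positions where it occurs."""
    positions = defaultdict(list)
    for i, ch in enumerate(x, start=1):
        positions[ch].append(i)
    return dict(positions)


def range_query(index, a_l, a_r):
    """Return I_[a_l; a_r] = {i | x_i in [a_l; a_r]} as a sorted list of positions."""
    alphabet = sorted(index)              # characters that actually occur in x
    lo = bisect_left(alphabet, a_l)       # first character >= a_l
    hi = bisect_right(alphabet, a_r)      # one past the last character <= a_r
    result = []
    for ch in alphabet[lo:hi]:
        result.extend(index[ch])          # one position list per character in range
    return sorted(result)


index = build_index("banana")
print(range_query(index, "a", "b"))
```

A query over a wide range has to touch one position list per character in the range and then merge them, which is the kind of overhead the abstract says the naive approach does not always avoid.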

arXiv.org e-Print Archive

The IT University of Copenhagen's Repository

Maintaining range trees in secondary memory. Part I: Partitions

Authors: M.T. de Berg, M.J. van Kreveld, M.H. Overmars, M.H.M. Smid
Publication venue: 'The Graduate School of the Humanities, Utrecht University'
Publication date: 01/01/1987

Repository TU/e

Pure OAI Repository

A Unified approach to concurrent and parallel algorithms on balanced data structures

Authors: Joaquim Gabarró Vallès, Xavier Messeguer Peypoch
Publication date: 01/01/1997

Concurrent and parallel algorithms are different. However, in the case of dictionaries, both kinds of algorithms share many common points. We present a unified approach emphasizing these points. It is based on a careful analysis of the sequential algorithm, extracting from it the more basic facts, which are later encapsulated as local rules. We apply the method to the insertion algorithms in AVL trees. All the concurrent and parallel insertion algorithms have two main phases: a percolation phase, which moves the keys to be inserted down the tree, and a rebalancing phase. Finally, some other algorithms and balanced structures are discussed.
Postprint (published version)
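For orientation, the two phases named in the abstract can be seen in an ordinary sequential AVL insertion. The Python sketch below is the standard textbook algorithm, not the authors' concurrent or parallel method or their local rules; all names (`Node`, `insert`, `rebalance`, and so on) are illustrative assumptions.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.height = 1


def height(node):
    return node.height if node else 0


def update(node):
    node.height = 1 + max(height(node.left), height(node.right))


def rotate_right(y):
    x = y.left
    y.left = x.right
    x.right = y
    update(y)   # y is now the lower node, so refresh it first
    update(x)
    return x


def rotate_left(x):
    y = x.right
    x.right = y.left
    y.left = x
    update(x)
    update(y)
    return y


def rebalance(node):
    """Restore the AVL invariant at node, assuming both subtrees are already AVL."""
    update(node)
    balance = height(node.left) - height(node.right)
    if balance > 1:
        if height(node.left.left) < height(node.left.right):
            node.left = rotate_left(node.left)     # left-right case
        return rotate_right(node)
    if balance < -1:
        if height(node.right.right) < height(node.right.left):
            node.right = rotate_right(node.right)  # right-left case
        return rotate_left(node)
    return node


def insert(node, key):
    # Phase 1: move the key down to its leaf position.
    if node is None:
        return Node(key)
    if key < node.key:
        node.left = insert(node.left, key)
    elif key > node.key:
        node.right = insert(node.right, key)
    else:
        return node                                # duplicate key: nothing to do
    # Phase 2: rebalance on the way back up.
    return rebalance(node)


def inorder(node):
    return inorder(node.left) + [node.key] + inorder(node.right) if node else []


root = None
for k in range(1, 8):
    root = insert(root, k)
print(inorder(root), root.key, root.height)
```

In this sequential version the two phases are interleaved by the recursion: the descent happens on the way down and the rebalancing on the way back up.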

CiteSeerX

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC