Optimal Color Range Reporting in One Dimension

Author: B. Chazelle
D.E. Willard
E.M. McCreight
L. Arge
M. Thorup
M.L. Fredman
P. Beame
P. Emde Boas van
P. Gupta
P.B. Miltersen
Q. Shi
R. Janardan
T.M. Chan
Publication date: 01/01/2013

Color (or categorical) range reporting is a variant of the orthogonal range reporting problem in which every point in the input is assigned a color. While the answer to an orthogonal point reporting query contains all points in the query range Q, the answer to a color reporting query contains only distinct colors of points in Q. In this paper we describe an O(N)-space data structure that answers one-dimensional color reporting queries in optimal O(k+1) time, where k is the number of colors in the answer and N is the number of points in the data structure. Our result can also be dynamized and extended to the external memory model.
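To make the query semantics concrete, here is a minimal Python sketch of one-dimensional color reporting. It is not the O(k+1)-time structure from the abstract: it uses the well-known "previous occurrence of the same color" reduction and then simply scans the query range, so a query costs O(log N) plus the number of points in the range. The function names `build` and `report_colors` are illustrative assumptions.

```python
from bisect import bisect_left, bisect_right

def build(points):
    """points: iterable of (coordinate, color) pairs.
    Returns sorted coordinates, their colors, and for each point the index
    of the previous point with the same color (-1 if there is none)."""
    pts = sorted(points)
    xs = [x for x, _ in pts]
    colors = [c for _, c in pts]
    last_seen = {}
    prev = []
    for i, c in enumerate(colors):
        prev.append(last_seen.get(c, -1))
        last_seen[c] = i
    return xs, colors, prev

def report_colors(index, a, b):
    """Distinct colors of points with coordinate in [a, b].
    A point is reported only if it is the first point of its color inside
    the range, i.e. its same-colored predecessor lies before the range."""
    xs, colors, prev = index
    lo = bisect_left(xs, a)
    hi = bisect_right(xs, b)
    # Naive scan of the range; the paper's structure avoids this scan.
    return [colors[i] for i in range(lo, hi) if prev[i] < lo]
```

For example, with points (1, r), (2, g), (3, r), (5, b), (8, g), the query [1, 3] contains two red points but reports red only once.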

arXiv.org e-Print Archive

Crossref

I/O-Efficient Planar Range Skyline and Attrition Priority Queues

Author: Kejlberg-Rasmussen Casper
Tao Yufei
Tsakalidis Konstantinos
Tsichlas Kostas
Yoon Jeonghun
Publication date: 01/01/2013

In the planar range skyline reporting problem, we store a set P of n 2D points in a structure such that, given a query rectangle Q = [a_1, a_2] x [b_1, b_2], the maxima (a.k.a. skyline) of P \cap Q can be reported efficiently. The query is 3-sided if an edge of Q is grounded, giving rise to two variants: top-open (b_2 = \infty) and left-open (a_1 = -\infty) queries. All our results are in external memory under the O(n/B) space budget, for both the static and dynamic settings:
* For static P, we give structures that answer top-open queries in O(log_B n + k/B), O(loglog_B U + k/B), and O(1 + k/B) I/Os when the universe is R^2, a U x U grid, and a rank space grid [O(n)]^2, respectively (where k is the number of reported points). The query complexity is optimal in all cases.
* We show that the left-open case is harder, such that any linear-size structure must incur \Omega((n/B)^e + k/B) I/Os for a query. We show that this case is as difficult as the general 4-sided queries, for which we give a static structure with the optimal query cost O((n/B)^e + k/B).
* We give a dynamic structure that supports top-open queries in O(log_{2B^e}(n/B) + k/B^{1-e}) I/Os, and updates in O(log_{2B^e}(n/B)) I/Os, for any e satisfying 0 \le e \le 1. This leads to a dynamic structure for 4-sided queries with optimal query cost O((n/B)^e + k/B), and amortized update cost O(log (n/B)).
As a contribution of independent interest, we propose an I/O-efficient version of the fundamental structure priority queue with attrition (PQA). Our PQA supports FindMin, DeleteMin, and InsertAndAttrite all in O(1) worst case I/Os, and O(1/B) amortized I/Os per operation. We also add the new CatenateAndAttrite operation that catenates two PQAs in O(1) worst case and O(1/B) amortized I/Os. This operation is a non-trivial extension to the classic PQA of Sundar, even in internal memory. Comment: Appeared at PODS 2013, New York, 19 pages, 10 figures. arXiv admin note: text overlap with arXiv:1208.4511, arXiv:1207.234
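As an illustration of what a range skyline query returns, the following Python sketch computes the maxima of P \cap Q by brute force in internal memory. It is only a reference for the query semantics, not one of the I/O-efficient structures described above; the function name `range_skyline` is an assumption, and points are assumed to be distinct.

```python
def range_skyline(points, a1, a2, b1, b2):
    """Maxima (skyline) of the points inside Q = [a1, a2] x [b1, b2].
    A point is kept if no other point in Q has both a larger or equal x
    and a larger or equal y. Runs in O(n log n) time by filter and sweep."""
    inside = [(x, y) for x, y in points if a1 <= x <= a2 and b1 <= y <= b2]
    # Sweep from the largest x to the smallest; ties broken by larger y first.
    inside.sort(key=lambda p: (-p[0], -p[1]))
    skyline = []
    best_y = float("-inf")
    for x, y in inside:
        if y > best_y:  # nothing to the right is at least as high
            skyline.append((x, y))
            best_y = y
    return skyline
```

A top-open query is obtained by passing `float("inf")` for `b2`, and a left-open query by passing `float("-inf")` for `a1`.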

arXiv.org e-Print Archive

Crossref

Hong Kong University of Science and Technology Institutional Repository

Linear-Space Data Structures for Range Mode Query in Arrays

Author: Durocher Stephane
Morrison Jason
Publication date: 01/01/2011

A mode of a multiset S is an element a \in S of maximum multiplicity; that is, a occurs at least as frequently as any other element in S. Given a list A[1:n] of n items, we consider the problem of constructing a data structure that efficiently answers range mode queries on A. Each query consists of an input pair of indices (i, j) for which a mode of A[i:j] must be returned. We present an O(n^{2-2\epsilon})-space static data structure that supports range mode queries in O(n^\epsilon) time in the worst case, for any fixed \epsilon \in [0,1/2]. When \epsilon = 1/2, this corresponds to the first linear-space data structure to guarantee O(\sqrt{n}) query time. We then describe three additional linear-space data structures that provide O(k), O(m), and O(|j-i|) query time, respectively, where k denotes the number of distinct elements in A and m denotes the frequency of the mode of A. Finally, we examine generalizing our data structures to higher dimensions. Comment: 13 pages, 2 figures
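The definition of a range mode query can be checked against a simple baseline. The Python sketch below counts the elements of A[i:j] with a hash table, which takes expected time proportional to the range length and needs no preprocessing. It is a correctness reference under the abstract's 1-based, inclusive indexing, not any of the paper's data structures; the name `range_mode` is an assumption.

```python
from collections import Counter

def range_mode(A, i, j):
    """Return (mode, frequency) for A[i:j], with 1-based inclusive indices
    as in the abstract. If several elements are tied, the one that appears
    first in the range is returned."""
    if not 1 <= i <= j <= len(A):
        raise ValueError("need 1 <= i <= j <= len(A)")
    counts = Counter(A[i - 1:j])  # shift to Python's 0-based slicing
    return max(counts.items(), key=lambda kv: kv[1])
```

A baseline like this is useful mainly for testing a faster structure against random queries.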

arXiv.org e-Print Archive

CiteSeerX

Online Data Structures in External Memory

Author: Vitter Jeffrey Scott
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 21/03/2011

The original publication is available at www.springerlink.com. The data sets for many of today's computer applications are too large to fit within the computer's internal memory and must instead be stored on external storage devices such as disks. A major performance bottleneck can be the input/output communication (or I/O) between the external and internal memories. In this paper we discuss a variety of online data structures for external memory, some very old and some very new, such as hashing (for dictionaries), B-trees (for dictionaries and 1-D range search), buffer trees (for batched dynamic problems), interval trees with weight-balanced B-trees (for stabbing queries), priority search trees (for 3-sided 2-D range search), and R-trees and other spatial structures. We also discuss several open problems along the way.

KU ScholarWorks

Secondary Indexing in One Dimension: Beyond B-trees and Bitmap Indexes

Author: Pagh Rasmus
Rao S. Srinivasa
Publication date: 18/11/2008

Let S be a finite, ordered alphabet, and let x = x_1 x_2 ... x_n be a string over S. A "secondary index" for x answers alphabet range queries of the form: Given a range [a_l,a_r] over S, return the set I_{[a_l;a_r]} = {i | x_i \in [a_l; a_r]}. Secondary indexes are heavily used in relational databases and scientific data analysis. It is well-known that the obvious solution, storing a dictionary for the position set associated with each character, does not always give optimal query time. In this paper we give the first theoretically optimal data structure for the secondary indexing problem. In the I/O model, the amount of data read when answering a query is within a constant factor of the minimum space needed to represent I_{[a_l;a_r]}, assuming that the size of internal memory is (|S| log n)^{delta} blocks, for some constant delta > 0. The space usage of the data structure is O(n log |S|) bits in the worst case, and we further show how to bound the size of the data structure in terms of the 0-th order entropy of x. We show how to support updates achieving various time-space trade-offs. We also consider an approximate version of the basic secondary indexing problem where a query reports a superset of I_{[a_l;a_r]} containing each element not in I_{[a_l;a_r]} with probability at most epsilon, where epsilon > 0 is the false positive probability. For this problem the amount of data that needs to be read by the query algorithm is reduced to O(|I_{[a_l;a_r]}| log(1/epsilon)) bits. Comment: 16 pages
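The "obvious solution" mentioned in the abstract, a dictionary holding the position set of each character, can be sketched in a few lines of Python. This is the baseline the paper improves on, not the paper's optimal structure; the function names are assumptions, and positions are 1-based to match x = x_1 x_2 ... x_n.

```python
from bisect import bisect_left, bisect_right

def build_secondary_index(x):
    """Map each character of x to the sorted list of its 1-based positions,
    and keep the characters that occur in sorted order."""
    positions = {}
    for i, ch in enumerate(x, start=1):
        positions.setdefault(ch, []).append(i)
    alphabet = sorted(positions)
    return alphabet, positions

def alphabet_range_query(index, a_l, a_r):
    """Return I_{[a_l;a_r]} = {i | x_i in [a_l, a_r]} as a sorted list."""
    alphabet, positions = index
    lo = bisect_left(alphabet, a_l)
    hi = bisect_right(alphabet, a_r)
    result = []
    for ch in alphabet[lo:hi]:  # one position list per character in range
        result.extend(positions[ch])
    return sorted(result)
```

The query has to touch one list per character in the range, which is one way to see why this layout does not always give optimal query time.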

arXiv.org e-Print Archive

The IT University of Copenhagen's Repository

Dynamic Range Majority Data Structures

Author: A. Andersson
E.D. Demaine
J. Bentley
J. Misra
L. Arge
M. Fredman
P. Bozanis
P. Gupta
R. Karp
S. Durocher
T. Gagie
T. Husfeldt
Y. Lai
Publication date: 01/01/2011

Given a set P of coloured points on the real line, we study the problem of answering range \alpha-majority (or "heavy hitter") queries on P. More specifically, for a query range Q, we want to return each colour that is assigned to more than an \alpha-fraction of the points contained in Q. We present a new data structure for answering range \alpha-majority queries on a dynamic set of points, where \alpha \in (0,1). Our data structure uses O(n) space, supports queries in O((\lg n) / \alpha) time, and updates in O((\lg n) / \alpha) amortized time. If the coordinates of the points are integers, then the query time can be improved to O(\lg n / (\alpha \lg \lg n) + (\lg(1/\alpha))/\alpha). For constant values of \alpha, this improved query time matches an existing lower bound, for any data structure with polylogarithmic update time. We also generalize our data structure to handle sets of points in d dimensions, for d \ge 2, as well as dynamic arrays, in which each entry is a colour. Comment: 16 pages, Preliminary version appeared in ISAAC 201
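The query definition above can be stated as a short brute-force Python sketch: count the colours of the points inside the range and keep those whose count exceeds an \alpha-fraction of the total. This is a static reference implementation for checking answers, not the dynamic structure from the abstract; the name `range_alpha_majority` is an assumption.

```python
from bisect import bisect_left, bisect_right
from collections import Counter

def range_alpha_majority(points, a, b, alpha):
    """points: iterable of (coordinate, colour) pairs.
    Return, in sorted order, every colour assigned to more than an
    alpha-fraction of the points with coordinate in [a, b]."""
    pts = sorted(points)
    xs = [x for x, _ in pts]
    lo, hi = bisect_left(xs, a), bisect_right(xs, b)
    counts = Counter(c for _, c in pts[lo:hi])
    total = hi - lo
    # Strict inequality: "more than" an alpha-fraction, as in the abstract.
    return sorted(c for c, n in counts.items() if n > alpha * total)
```

Since each reported colour covers more than an \alpha-fraction of the range, at most 1/\alpha colours can be returned, which is consistent with the 1/\alpha factor in the stated query bounds.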

arXiv.org e-Print Archive

CiteSeerX

Crossref

Copenhagen University Research Information System

10091 Abstracts Collection -- Data Structures

Author: Arge Lars
Demaine Erik D.
Seidel Raimund
Publication venue: Dagstuhl Seminar Proceedings. 10091 - Data Structures
Publication date: 01/01/2010

From February 28th to March 5th 2010, the Dagstuhl Seminar 10091 "Data Structures" was held in Schloss Dagstuhl - Leibniz Center for Informatics. It brought together 45 international researchers to discuss recent developments concerning data structures in terms of research, but also in terms of new technologies that impact how data can be stored, updated, and retrieved. During the seminar a fair number of participants presented their current research, and open problems were discussed. This document first briefly describes the seminar topics and then gives the abstracts of the presentations given during the seminar.

Dagstuhl Research Online Publication Server

External Memory Planar Point Location with Fast Updates

Author: Iacono John
Karsin Ben
Koumoutsos Grigorios
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 30th International Symposium on Algorithms and Computation (ISAAC 2019)
Publication date: 01/01/2019

We study dynamic planar point location in the External Memory Model or Disk Access Model (DAM). Previous work in this model achieves polylog query and polylog amortized update time. We present a data structure with O(log_B^2 N) query time and O(1/B^(1-epsilon) log_B N) amortized update time, where N is the number of segments, B the block size and epsilon is a small positive constant, under the assumption that all faces have constant size. This is a B^(1-epsilon) factor faster for updates than the fastest previous structure, and brings the cost of insertion and deletion down to subconstant amortized time for reasonable choices of N and B. Our structure solves the problem of vertical ray-shooting queries among a dynamic set of interior-disjoint line segments; this is well-known to solve dynamic planar point location for a connected subdivision of the plane with faces of constant size

Dagstuhl Research Online Publication Server

DI-fusion