research

A limit process for partial match queries in random quadtrees and 22-d trees

Abstract

We consider the problem of recovering items matching a partially specified pattern in multidimensional trees (quadtrees and kk-d trees). We assume the traditional model where the data consist of independent and uniform points in the unit square. For this model, in a structure on nn points, it is known that the number of nodes Cn(ξ)C_n(\xi ) to visit in order to report the items matching a random query ξ\xi, independent and uniformly distributed on [0,1][0,1], satisfies E[Cn(ξ)]κnβ\mathbf {E}[{C_n(\xi )}]\sim\kappa n^{\beta}, where κ\kappa and β\beta are explicit constants. We develop an approach based on the analysis of the cost Cn(s)C_n(s) of any fixed query s[0,1]s\in[0,1], and give precise estimates for the variance and limit distribution of the cost Cn(x)C_n(x). Our results permit us to describe a limit process for the costs Cn(x)C_n(x) as xx varies in [0,1][0,1]; one of the consequences is that E[maxx[0,1]Cn(x)]γnβ\mathbf {E}[{\max_{x\in[0,1]}C_n(x)}]\sim \gamma n^{\beta}; this settles a question of Devroye [Pers. Comm., 2000].Comment: Published in at http://dx.doi.org/10.1214/12-AAP912 the Annals of Applied Probability (http://www.imstat.org/aap/) by the Institute of Mathematical Statistics (http://www.imstat.org). arXiv admin note: text overlap with arXiv:1107.223

    Similar works