
17 research outputs found

MKtree: generation and simulations

Author: Brunet Crosa Pere
Franquesa Niubó Marta
Publication date: 01/01/2002

The problem of representing very complex systems has been studied by several authors, who have obtained solutions based on different data structures. In this paper, a K-dimensional tree (Multiresolution Kdtree, MKtree) is introduced. The MKtree represents a hierarchical subdivision of the scene objects that guarantees a minimum space overlap between node regions. MKtrees are useful for collision detection and for time-critical rendering in very large environments requiring external memory storage. Examples in ship design applications are described. Postprint (published version)

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

$\Omega$-storage: a self-organizing multi-attribute storage technique for very large main memories

Author: Karlsson J.S.
Kersten M.L. (Martin)
Publication venue: CWI
Publication date: 01/01/1999

Main memory is continuously improving in both price and capacity. With this come new storage problems as well as new directions of usage. Just before the millennium, several main-memory database systems are becoming commercially available. The hot areas include boosting the performance of web-enabled systems, such as search engines and auctioning systems. We present a novel data storage structure, the $\Omega$-storage structure, a high-performance data structure allowing automatically indexed storage of very large amounts of multi-attribute data. The experiments show excellent performance for point retrieval and highly efficient pruning for pattern searches. It provides the balanced storage previously achieved by random kd-trees, but avoids their increased pattern-match search times through an effective assignment of bits of attributes. Moreover, it avoids the sensitivity of the kd-tree to insertion order.

CWI's Institutional Repository

A limit field for orthogonal range searches in two-dimensional random point search trees

Author: Broutin Nicolas
Sulzbach Henning
Publication date: 23/08/2018

We consider the cost of general orthogonal range queries in random quadtrees. The cost of a given query is encoded into a (random) function of four variables which characterize the coordinates of two opposite corners of the query rectangle. We prove that, when suitably shifted and rescaled, the random cost function converges uniformly in probability towards a random field that is characterized as the unique solution to a distributional fixed-point equation. We also state similar results for $2$-d trees. Our results imply for instance that the worst case query satisfies the same asymptotic estimates as a typical query, and thereby resolve an old question of Chanzy, Devroye and Zamora-Cura [Acta Inf., 37:355-383, 2000]. Comment: 24 pages, 8 figures

arXiv.org e-Print Archive

Crossref

University of Birmingham Research Portal
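
To make the cost measure in the abstract above concrete, here is a minimal Python sketch of an orthogonal range query in a 2-d tree built by inserting uniform random points. It is an illustration only, not the authors' construction: the names `Node`, `insert` and `range_query`, the convention that ties go to the right subtree, and the choice of counting visited nodes as "the cost" are all assumptions made for this example.

```python
import random

class Node:
    """One node of a 2-d tree (hypothetical helper class for this sketch)."""
    def __init__(self, point, axis):
        self.point = point      # (x, y) pair
        self.axis = axis        # 0: this node splits on x, 1: splits on y
        self.left = None        # points with coordinate < point[axis]
        self.right = None       # points with coordinate >= point[axis]

def insert(node, point, depth=0):
    """Standard insertion: the splitting axis alternates with the depth."""
    if node is None:
        return Node(point, depth % 2)
    if point[node.axis] < node.point[node.axis]:
        node.left = insert(node.left, point, depth + 1)
    else:
        node.right = insert(node.right, point, depth + 1)
    return node

def range_query(node, lo, hi, found):
    """Collect points p with lo[i] <= p[i] <= hi[i]; return nodes visited."""
    if node is None:
        return 0
    visited = 1
    p, a = node.point, node.axis
    if lo[0] <= p[0] <= hi[0] and lo[1] <= p[1] <= hi[1]:
        found.append(p)
    if lo[a] < p[a]:            # query rectangle reaches into the left part
        visited += range_query(node.left, lo, hi, found)
    if hi[a] >= p[a]:           # query rectangle reaches into the right part
        visited += range_query(node.right, lo, hi, found)
    return visited

# Usage: the query rectangle is given by two opposite corners.
random.seed(1)
pts = [(random.random(), random.random()) for _ in range(2000)]
root = None
for pt in pts:
    root = insert(root, pt)
lo, hi = (0.2, 0.3), (0.4, 0.6)
found = []
visited = range_query(root, lo, hi, found)
```

The four numbers in `lo` and `hi` are the "four variables" of the cost function mentioned in the abstract; `visited` is one sample of that cost for one random tree.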

On the cost of fixed partial match queries in K-d trees

Author: Duch Brown Amalia
Lau Laynes-Lozada Gustavo Salvador
Martínez Parra Conrado
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016

The final publication is available at Springer via http://dx.doi.org/10.1007/s00453-015-0097-4. Partial match queries constitute the most basic type of associative queries in multidimensional data structures such as K-d trees or quadtrees. Given a query $q = (q_0, \ldots, q_{K-1})$ where $s$ of the coordinates are specified and $K - s$ are left unspecified ($q_i = *$), a partial match search returns the subset of data points $x = (x_0, \ldots, x_{K-1})$ in the data structure that match the given query, that is, the data points such that $x_i = q_i$ whenever $q_i \neq *$. There exists a wealth of results about the cost of partial match searches in many different multidimensional data structures, but most of these results deal with random queries. Only recently have a few papers begun to investigate the cost of partial match queries with a fixed query $q$. This paper represents a new contribution in this direction, giving a detailed asymptotic estimate of the expected cost $P_{n,q}$ for a given fixed query $q$. From previous results on the cost of partial matches with a fixed query and the ones presented here, a deeper understanding is emerging, uncovering the following functional shape for $P_{n,q}$: $P_{n,q} = \nu \cdot \left(\prod_{i \,:\, q_i \text{ is specified}} q_i (1 - q_i)\right)^{\alpha/2} \cdot n^{\alpha} + \text{l.o.t.}$ (l.o.t. stands for lower order terms throughout this work) in many multidimensional data structures, which differ only in the exponent $\alpha$ and the constant $\nu$, both dependent on $s$ and $K$, and, for some data structures, on the whole pattern of specified and unspecified coordinates in $q$ as well. Although it is tempting to conjecture that this functional shape is "universal", we have shown experimentally that it seems not to be true for a variant of K-d trees called squarish K-d trees. Peer Reviewed. Postprint (author's final draft)

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC
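
The partial match search defined in the abstract above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not code from the paper: `KdNode`, `kd_insert` and `partial_match` are hypothetical names, the tree is a standard K-d tree whose discriminant cycles with the depth, ties go to the right subtree, and `None` plays the role of the unspecified coordinate `*`. The return value counts visited nodes, the quantity whose expectation the paper calls $P_{n,q}$.

```python
class KdNode:
    """Node of a standard K-d tree (hypothetical helper for this sketch)."""
    def __init__(self, point, axis):
        self.point = point      # tuple of K coordinates
        self.axis = axis        # discriminating coordinate of this node
        self.left = None        # points with coordinate < point[axis]
        self.right = None       # points with coordinate >= point[axis]

def kd_insert(node, point, depth=0):
    """Insert a point; the discriminant cycles through 0..K-1 with depth."""
    if node is None:
        return KdNode(point, depth % len(point))
    if point[node.axis] < node.point[node.axis]:
        node.left = kd_insert(node.left, point, depth + 1)
    else:
        node.right = kd_insert(node.right, point, depth + 1)
    return node

def partial_match(node, query, out):
    """Append to `out` every point matching `query`; return nodes visited.

    query[i] is a value (specified) or None (unspecified, written * above).
    """
    if node is None:
        return 0
    visited = 1
    if all(q is None or q == x for q, x in zip(query, node.point)):
        out.append(node.point)
    q = query[node.axis]
    if q is None:
        # Unspecified discriminant: both subtrees may contain matches.
        visited += partial_match(node.left, query, out)
        visited += partial_match(node.right, query, out)
    elif q < node.point[node.axis]:
        visited += partial_match(node.left, query, out)
    else:
        visited += partial_match(node.right, query, out)
    return visited

# Usage: K = 3, only the middle coordinate is specified (s = 1).
points = [(5, 1, 7), (2, 4, 7), (8, 1, 3), (2, 1, 9), (6, 6, 6)]
root = None
for p in points:
    root = kd_insert(root, p)
matches = []
visited = partial_match(root, (None, 1, None), matches)
```

The search follows a single branch at nodes whose discriminant is specified in the query and both branches otherwise, which is why the cost depends on the pattern of specified coordinates.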

A limit process for partial match queries in random quadtrees and $2$-d trees

Author: Broutin Nicolas
Neininger Ralph
Sulzbach Henning
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 01/01/2013

We consider the problem of recovering items matching a partially specified pattern in multidimensional trees (quadtrees and $k$-d trees). We assume the traditional model where the data consist of independent and uniform points in the unit square. For this model, in a structure on $n$ points, it is known that the number of nodes $C_n(\xi)$ to visit in order to report the items matching a random query $\xi$, independent and uniformly distributed on $[0,1]$, satisfies $\mathbf{E}[C_n(\xi)] \sim \kappa n^{\beta}$, where $\kappa$ and $\beta$ are explicit constants. We develop an approach based on the analysis of the cost $C_n(s)$ of any fixed query $s \in [0,1]$, and give precise estimates for the variance and limit distribution of the cost $C_n(x)$. Our results permit us to describe a limit process for the costs $C_n(x)$ as $x$ varies in $[0,1]$; one of the consequences is that $\mathbf{E}[\max_{x \in [0,1]} C_n(x)] \sim \gamma n^{\beta}$; this settles a question of Devroye [Pers. Comm., 2000]. Comment: Published in at http://dx.doi.org/10.1214/12-AAP912 the Annals of Applied Probability (http://www.imstat.org/aap/) by the Institute of Mathematical Statistics (http://www.imstat.org). arXiv admin note: text overlap with arXiv:1107.223

arXiv.org e-Print Archive

CiteSeerX

Crossref

INRIA a CCSD electronic archive server

Rank selection in multidimensional data

Author: Duch Brown Amalia
Jiménez Gómez Rosa María
Martínez Parra Conrado
Publication date: 01/01/2009

Suppose we have a set of K-dimensional records stored in a general purpose spatial index like a K-d tree. The index efficiently supports insertions, ordinary exact searches, orthogonal range searches, nearest neighbor searches, etc. Here we consider whether we can also efficiently support search by rank, that is, to locate the i-th smallest element along the j-th coordinate. We answer this question in the affirmative by developing a simple algorithm with expected cost $O(n^{\alpha(1/K)} \log n)$, where $n$ is the size of the K-d tree and $\alpha(1/K) < 1$ for any $K \geq 2$. The only requirement to support the search by rank is that each node in the K-d tree stores the size of the subtree rooted at that node (or some equivalent information). This is not too space demanding. Furthermore, it can be used to randomize the update algorithms to provide guarantees on the expected performance of the various operations on K-d trees. Although selection in multidimensional data can be solved more efficiently than with our algorithm, those solutions rely on ad-hoc data structures or superlinear space. Our solution adds to an existing data structure (K-d trees) the capability of search by rank with very little overhead. The simplicity of the algorithm makes it easy to implement, practical and very flexible; however, its correctness and efficiency are far from self-evident. Furthermore, it can be easily adapted to other spatial indexes as well. Postprint (published version)

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC
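
The role of the stored subtree sizes mentioned in the abstract above can be illustrated with a short Python sketch. This is not the selection algorithm of the paper, which the abstract does not spell out. It only shows, under assumed names (`SizedNode`, `insert`, `count_below`) and the assumption that ties go to the right subtree, how a size field lets a K-d tree count the records lying below a threshold along coordinate j without visiting every node, a natural building block for rank queries.

```python
import random

class SizedNode:
    """K-d tree node that also stores its subtree size (hypothetical helper)."""
    def __init__(self, point, axis):
        self.point = point
        self.axis = axis
        self.left = None        # points with coordinate < point[axis]
        self.right = None       # points with coordinate >= point[axis]
        self.size = 1           # number of nodes in the subtree rooted here

def size(node):
    return node.size if node is not None else 0

def insert(node, point, depth=0):
    """Standard insertion that keeps the size fields up to date."""
    if node is None:
        return SizedNode(point, depth % len(point))
    node.size += 1
    if point[node.axis] < node.point[node.axis]:
        node.left = insert(node.left, point, depth + 1)
    else:
        node.right = insert(node.right, point, depth + 1)
    return node

def count_below(node, j, x):
    """Number of stored points whose j-th coordinate is strictly below x."""
    if node is None:
        return 0
    if node.axis != j:
        # This node does not discriminate on j: both subtrees must be explored.
        here = 1 if node.point[j] < x else 0
        return here + count_below(node.left, j, x) + count_below(node.right, j, x)
    if x <= node.point[j]:
        # The node and its whole right subtree are >= x along coordinate j.
        return count_below(node.left, j, x)
    # The whole left subtree and the node are below x: use the stored size.
    return size(node.left) + 1 + count_below(node.right, j, x)

# Usage: 3-dimensional random records, count those with coordinate 1 below 0.25.
random.seed(7)
records = [tuple(random.random() for _ in range(3)) for _ in range(1000)]
root = None
for r in records:
    root = insert(root, r)
below = count_below(root, 1, 0.25)
```

At nodes that discriminate on coordinate j the recursion continues into one subtree only, and the other is accounted for in constant time through `size`; this is the pruning that the extra field makes possible.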

Design and Analysis of Multidimensional Data Structures

Author: Barbaro M. B.
Caballero Carretero Juan Antonio
Giusti Carlotta
González Jiménez Raúl
Ivanov M. V.
Meucci Andrea
Udías J. M.
Publication venue: Universitat Politècnica de Catalunya
Publication date: 09/12/2004

This thesis is about the design and analysis of point multidimensional data structures: data structures that store $K$-dimensional keys, which we may abstract as points in $[0,1]^K$. These data structures are present in many applications of geographical information systems, image processing or robotics, among others. They are also frequently used as indexes of more complex data structures, possibly stored in external memory. Point multidimensional data structures must have capabilities such as insertion, deletion and (exact) search of items, but in addition they must support the so-called associative queries. Examples of these queries are orthogonal range queries (which are the items that fall inside a given hyper-rectangle?) and nearest neighbour queries (which is the closest item to some given point?). The contributions of this thesis are two-fold. First, contributions to the design of point multidimensional data structures: the design of randomized $K$-d trees, the design of randomized quad trees and the design of fingered multidimensional search trees. Second, contributions to the analysis of the performance of point multidimensional data structures: the average-case analysis of partial match queries in relaxed $K$-d trees and the average-case analysis of orthogonal range queries in various multidimensional data structures. Concerning the design of randomized point multidimensional data structures, we propose randomized insertion and deletion algorithms for $K$-d trees and quad trees that produce random $K$-d trees and quad trees independently of the order in which items are inserted into them and after any sequence of interleaved insertions and deletions. The use of randomization provides expected performance guarantees, irrespective of any assumption on the data distribution, while retaining the simplicity and flexibility of standard $K$-d trees and quad trees. Also related to the design of point multidimensional data structures is the proposal of fingered multidimensional search trees, a new technique that enhances point multidimensional data structures to exploit locality of reference in associative queries. With regard to performance analysis, we start by giving a precise analysis of the cost of partial matches in randomized $K$-d trees. We use these results as a building block in our analysis of orthogonal range queries, together with combinatorial and geometric arguments, and we provide a tight asymptotic estimate of the cost of orthogonal range search in randomized $K$-d trees. We finally show that the techniques used apply easily to other data structures, so we can provide an analysis of the average cost of orthogonal range search in other data structures such as standard $K$-d trees, quad trees, quad tries, and $K$-d tries.

arXiv.org e-Print Archive

Crossref

Tesis Doctorals en Xarxa

Ghent University Academic Bibliography

idUS. Depósito de Investigación Universidad de Sevilla

Design and Analysis of Multidimensional Data Structures

Author: Duch Brown Amàlia
Publication venue: Universitat Politècnica de Catalunya
Publication date: 01/01/2004

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Tesis Doctorals en Xarxa

Secretaría de Estado de Cultura

Design, Analysis and Implementation of New Variants of Kd-trees

Author: Pons Crespo Maria Mercè
Publication venue: Universitat Politècnica de Catalunya
Publication date: 08/09/2010

The representation of multidimensional data is a central issue in database design, as well as in many other fields, including computer graphics, computational geometry, pattern recognition, geographic information systems and others. Indeed, multidimensional points can represent locations, as well as more general records that arise in database management systems. For instance, consider an employee record that has attributes corresponding to the employee's name, address, sex, age, height and weight. Although the different dimensions have different data types (name and address are strings of characters; sex is a binary field; and age, height and weight are numbers), these records can be treated as points in a six-dimensional space. We may see a database as a collection of records. Each record has several attributes, some of which are keys. The associative retrieval problem consists of answering queries with respect to a file of multidimensional records. Such an associative query requires the retrieval of those records in the file whose key attributes satisfy a certain condition. Examples of associative queries are intersection queries and nearest neighbor queries. In order to facilitate the retrieval of records based on some conditions on their key attributes, it is usually helpful to assume the existence of an ordering for their values. In the case of numeric keys, such an ordering is quite obvious. In the case of alphanumeric keys, the ordering is usually based on the alphabetic sequence of the characters making up the attribute value. Furthermore, certain queries, like nearest neighbor searches, require the existence of a distance function.