    Persistent Data Structures for Incremental Join Indices

    Join indices are used in relational databases to make join operations faster. Because they essentially materialise the results of join operations, they accrue a maintenance cost, which makes them more suitable for use cases where modifications are rare and joins are performed frequently. To lower the maintenance cost, incrementally updating existing indices is preferred. The use of persistent data structures for join indices was explored. The motivation for this research was the ability of persistent data structures to construct multiple partially different versions of the same data structure memory-efficiently. This is useful because different versions of join indices can exist simultaneously when a database uses multi-version concurrency control (MVCC). The techniques used in the Relaxed Radix Balanced Tree (RRB-Tree) persistent data structure were found promising, but none of the popular implementations was found directly suitable for the use case. This exploration was done in the context of a particular proprietary embedded in-memory columnar multidimensional database called FastormDB, developed by RELEX Solutions. This focused the research on Java Virtual Machine (JVM) based data structures, as FastormDB is implemented in Java. Multiple persistent data structures made for the thesis, along with ones from Scala, Clojure and Paguro, were evaluated with Java Microbenchmark Harness (JMH) and Java Object Layout (JOL) based benchmarks, and their results were analysed via visualisations.
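
    The memory-efficient versioning mentioned in the abstract above can be illustrated with a minimal sketch. It is not an RRB-Tree and not code from the thesis: it uses a plain, unbalanced binary search tree with path copying as a simpler stand-in, and the names `Node`, `insert` and `lookup` are hypothetical. The point is only that a new version copies the nodes on one search path and shares every other node with the old version.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Node:
    """Immutable tree node (hypothetical name, illustration only)."""
    key: int
    value: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(node: Optional[Node], key: int, value: str) -> Node:
    """Return the root of a new version; only nodes on the search path are copied."""
    if node is None:
        return Node(key, value)
    if key < node.key:
        return Node(node.key, node.value, insert(node.left, key, value), node.right)
    if key > node.key:
        return Node(node.key, node.value, node.left, insert(node.right, key, value))
    # Same key: replace the value, keep both subtrees shared.
    return Node(key, value, node.left, node.right)


def lookup(node: Optional[Node], key: int) -> Optional[str]:
    """Return the value stored under key in this version, or None."""
    while node is not None:
        if key == node.key:
            return node.value
        node = node.left if key < node.key else node.right
    return None


# Build version 1, then derive version 2 from it without modifying version 1.
v1: Optional[Node] = None
for k, v in [(5, "a"), (2, "b"), (8, "c")]:
    v1 = insert(v1, k, v)
v2 = insert(v1, 9, "d")
```

    Both `v1` and `v2` remain usable after the update, which is the property that makes such structures attractive when several versions of an index must coexist under MVCC. The left subtree of `v2` is the very same object as the left subtree of `v1`, so the second version costs only the copied path.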

    Associative access in persistent object stores : a thesis presented in partial fulfilment of the requirements for the degree of Master of Information Sciences in Information Systems at Massey University

    Page 276 missing from original copy. The overall aim of the thesis is to study associative access in a Persistent Object Store (POS) providing necessary object storage and retrieval capabilities to an Object Oriented Database System (OODBS) (Delis, Kanitkar & Kollios, 1998 cited in Kirchberg & Tretiakov, 2002). Associative access in an OODBS often includes navigational access to referenced or referencing objects of the object being accessed (Kim, Kim, & Dale, 1989). The thesis reviews several existing approaches proposed to support associative and navigational access in an OODBS. It was found that the existing approaches proposed for associative access could not perform well when queries involve multiple paths or inheritance hierarchies. The thesis studies how associative access can be supported in a POS regardless of the paths or inheritance hierarchies involved in a query. The thesis proposes extensions to a model of a POS such that approaches proposed for navigational access can be used to support associative access in the extended POS. The extensions include (1) approaches to cluster storage objects in a POS on their storage classes or values of attributes, and (2) approaches to distinguish references between storage objects in a POS based on criteria such as reference types (inheritance and association), storage classes of referenced or referencing storage objects, and reference names. The thesis implements the Matrix-Index Coding (MIC) approach with the extended POS using several coding techniques. The implementation demonstrates that (1) a model of a POS extended by the proposed extensions is capable of supporting associative access in an OODBS and (2) the MIC implemented with the extended POS can support a query that requires associative access in an OODBS and involves multiple paths or inheritance hierarchies. The implementation also provides proof of the concepts suggested by Kirchberg & Tretiakov (2002) that (1) the MIC can be made independent of a coding technique, and (2) data compression techniques should be considered as appropriate alternatives to implement the MIC because they could reduce the storage size required.

    Efficient Processing of Spatial Joins Using R-Trees

    Abstract: In this paper, we show that spatial joins are very suitable to be processed on a parallel hardware platform. The parallel system is equipped with a so-called shared virtual memory which is well-suited for the design and implementation of parallel spatial join algorithms. We start with an algorithm that consists of three phases: task creation, task assignment and parallel task execution. In order to reduce CPU and I/O cost, the three phases are processed in a fashion that preserves spatial locality. Dynamic load balancing is achieved by splitting tasks into smaller ones and reassigning some of the smaller tasks to idle processors. In an experimental performance comparison, we identify the advantages and disadvantages of several variants of our algorithm. The most efficient one shows an almost optimal speed-up under the assumption that the number of disks is sufficiently large. Topics: spatial database systems, parallel database systems

    Combining Indexing Schemes to Accelerate Querying XML on Content and Structure

    This paper presents the advantages of combining multiple document representation schemes for query processing of XML queries on content and structure. We show how extending the Text Region approach [2] with the main features of the Binary Relation approach developed in [8] leads to a considerable speed-up in the processing of the XPath location steps. We detail how, by using the combined scheme, we reduce the number of structural joins used to process the XPath steps, while simultaneously limiting the amount of memory usage. We discuss optimisation strategies enabled by the new 'combined representation scheme'. Experiments comparing the efficiency of alternative query processing strategies on a subset of the queries used at INEX 2003 (the Initiative for the Evaluation of XML Retrieval [4]) demonstrate a favourable performance for the combined indexing scheme.

    On the Selection of Optimal Index Configuration in OO Databases

    An operation in object-oriented databases gives rise to the processing of a path. Several database operations may result in the same path. The authors address the problem of optimal index configuration for a single path. As is shown, an optimal index configuration for a path can be achieved by splitting the path into subpaths and indexing each subpath with the optimal index organization. The authors present an algorithm that selects an optimal index configuration for a given path. They consider a limited number of existing indexing techniques (simple index, inherited index, nested inherited index, multi-index, and multi-inherited index), but the principles of the algorithm remain the same when more indexing techniques are added.
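
    The idea of splitting a path into subpaths and choosing an index organization per subpath can be sketched as a small dynamic program. This is an illustrative reconstruction under stated assumptions, not the authors' algorithm: the function `best_configuration`, the cost callback and the numbers in `toy_cost` are all hypothetical, and a real cost model would depend on query and update workloads.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Piece = Tuple[int, int, str]  # (start step, end step exclusive, organization)


def best_configuration(
    n_steps: int,
    organizations: Sequence[str],
    cost: Callable[[int, int, str], float],
) -> Tuple[float, List[Piece]]:
    """Minimise total cost over all ways to cut steps 0..n_steps into subpaths."""
    best = [0.0] + [float("inf")] * n_steps
    choice: List[Optional[Tuple[int, str]]] = [None] * (n_steps + 1)
    for j in range(1, n_steps + 1):
        for i in range(j):  # the last subpath covers steps i..j-1
            for org in organizations:
                candidate = best[i] + cost(i, j, org)
                if candidate < best[j]:
                    best[j] = candidate
                    choice[j] = (i, org)
    # Walk the recorded choices backwards to recover the subpaths.
    config: List[Piece] = []
    j = n_steps
    while j > 0:
        i, org = choice[j]
        config.append((i, j, org))
        j = i
    config.reverse()
    return best[n_steps], config


def toy_cost(i: int, j: int, org: str) -> float:
    """Made-up costs: a simple index covers one step; a nested index grows quadratically."""
    length = j - i
    if org == "simple index":
        return 4.0 if length == 1 else float("inf")
    return 3.0 + float(length ** 2)  # "nested inherited index"


total, config = best_configuration(
    4, ["simple index", "nested inherited index"], toy_cost
)
```

    Adding another indexing technique only means adding a name to `organizations` and a branch to the cost function; the search over split points stays unchanged, which mirrors the abstract's remark that the principles remain the same.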