19 research outputs found

    Bulkloading and Maintaining XML Documents

    The popularity of XML as an exchange and storage format brings about massive amounts of documents to be stored, maintained and analyzed -- a challenge that traditionally has been tackled with Database Management Systems (DBMS). To open up the content of XML documents to analysis with declarative query languages, efficient bulk loading techniques are necessary. Database technology has traditionally offered support for these tasks yet falls short of providing efficient automation techniques for the challenges that large collections of XML data raise. As a storage back-end, many applications rely on relational databases, which are designed for large data volumes. This paper studies bulk load and update algorithms for XML data stored in relational format and outlines opportunities and problems. We investigate both (1) bulk insertion and deletion and (2) updates in the form of edit scripts, which rely heavily on pointer-chasing techniques that are often considered orthogonal to the algebraic operations relational databases are optimized for. To get the most out of relational database systems, we show that one should make careful use of edit scripts and replace them with bulk operations if more than a very small portion of the database is updated. We implemented our ideas on top of the Monet Database System and benchmarked their performance.

    A user configurable implementation of B-trees

    The use of B-trees for achieving good performance for updates and retrievals in databases is well known. Many excellent implementations of B-trees are available as well. However, it is difficult to find B-trees that are easily configured and deployed into experimental systems. We undertake an implementation of B-trees from scratch that specifically addresses configurability and deployability issues. An XML file both stores and documents information such as the page formats of the B-tree nodes and details about the nature of records and keys. The behavior of the tree is encapsulated by commands for creating B-trees, inserting records into the tree, and making retrievals via the tree. The XML-based configuration together with the commands makes the deployment and functionality of the tree completely clear and straightforward.

    Storing XML Documents in Databases

    The authors introduce concepts for loading large amounts of XML documents into databases where the documents are stored and maintained. The goal is to make XML databases as unobtrusive in multi-tier systems as possible and at the same time provide as many services defined by the XML standards as possible. The ubiquity of XML has sparked great interest in deploying concepts known from Relational Database Management Systems such as declarative query languages, transactions, indexes and integrity constraints. This chapter presents how bulkloading is done in Monet XML, a main memory XML database system, and evaluates the cost of bulkloading and bulk deletion with respect to strategies based on insertion and deletion of individual nodes. Additionally, we survey the applicability of the techniques to a wider class of XML storage schemas.

    Resource-efficient processing of large data volumes

    The complex system environment of data processing applications makes it very challenging to achieve high resource efficiency. In this thesis, we develop solutions that improve resource efficiency at multiple system levels by focusing on three scenarios that are relevant to, but not limited to, database management systems. First, we address the challenge of understanding complex systems by analyzing memory access characteristics via efficient memory tracing. Second, we leverage information about memory access characteristics to optimize the cache usage of algorithms and to avoid cache pollution by applying hardware-based cache partitioning. Third, after optimizing resource usage within a multicore processor, we optimize resource usage across multiple computer systems by addressing the problem of resource contention for bulk loading, i.e., ingesting large volumes of data into the system. We develop a distributed bulk loading mechanism, which utilizes network bandwidth and compute power more efficiently and improves both bulk loading throughput and query processing performance.

    Evaluation and selectivity estimation of XML queries

    Ph.D. thesis (Doctor of Philosophy)

    Granularity in visualisation of 3D BIM models design-science approach

    Building Information Modeling (BIM) has gradually grown into one of the key information management platforms in the Architecture, Engineering and Construction industry. With the growing amount of information in and outside the digital construction industry, information retrieval concepts such as relevance and granularity have become important in this domain. An increased need for interoperability and easy access to information has caused the industry to look towards concepts like the semantic web and linked data. Within all this, however, geometrical visualisation, which is an integral part of the BIM process, has lagged behind in terms of granularity and still seems to be done mainly in conventional ways. We explore ways to introduce granularity in the visualisation of 3D BIM models and to connect it to the semantic information, which is already granular, thus creating a mapping between the two at a granular level. A web-based prototype is implemented and analysed as a proof of the presented concept, with the semantics being represented inside a graph-based data structure. We further present a discussion on the potential applications and use cases of the conceptualised framework in the field of construction and building lifecycle management. The work aims to take the first step towards modularising the visualisation process and to pave the way for detailed analyses and further improvements that may follow in this direction.

    Core Technologies for Native XML Database Management Systems

    This work investigates the core technologies required to build Database Management Systems (DBMSs) for large collections of XML documents. We call such systems XML Base Management Systems (XBMSs). We identify requirements and analyze how they can be met using a conventional DBMS. Our conclusion is that an XML support layer on top of an existing conventional DBMS does not address the requirements for XBMSs. Hence, we built a native XBMS called Natix. Natix has been developed completely from scratch, incorporating optimizations for high-performance XML processing in those places where they are most effective.

    DATA MODELING AND QUERY PROCESSING FOR ONLINE SOCIAL NETWORKING SERVICES

    Master's thesis (Master of Science)