
54 research outputs found

MIL primitives for querying a fragmented world

Author: Boncz P.A. (Peter)
Kersten M.L. (Martin)
Publication venue: Springer Fachmedien Wiesbaden GmbH
Publication date: 01/10/1999

In query-intensive database application areas, like decision support and data mining, systems that use vertical fragmentation have a significant performance advantage. In order to support relational or object-oriented applications on top of such a fragmented data model, a flexible yet powerful intermediate language is needed. This problem has been successfully tackled in Monet, a modern extensible database kernel developed by our group. We focus on the design choices made in the Monet Interpreter Language (MIL), its algebraic query language, and outline how its concept of tactical optimization enhances and simplifies the optimization of complex queries. Finally, we summarize the experience gained in Monet by creating a highly efficient implementation of MIL.

CWI's Institutional Repository
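
The vertical fragmentation mentioned in the abstract above can be illustrated with a minimal sketch. This is plain Python, not MIL; the function names `fragment`, `select_oids`, and `fetch` and the sample rows are hypothetical and chosen only for illustration. Each attribute is stored as its own two-column (oid, value) table, so a query touches only the columns it needs.

```python
from typing import Any, Callable

Column = list[tuple[int, Any]]  # one (oid, value) pair per record


def fragment(rows: list[dict[str, Any]]) -> dict[str, Column]:
    """Split row-wise records into one (oid, value) table per attribute."""
    columns: dict[str, Column] = {}
    for oid, row in enumerate(rows):
        for attr, value in row.items():
            columns.setdefault(attr, []).append((oid, value))
    return columns


def select_oids(column: Column, predicate: Callable[[Any], bool]) -> set[int]:
    """Scan a single column and return the oids whose value qualifies."""
    return {oid for oid, value in column if predicate(value)}


def fetch(column: Column, oids: set[int]) -> list[Any]:
    """Look up values of another column for the qualifying oids."""
    return [value for oid, value in column if oid in oids]


# Hypothetical sample data.
rows = [
    {"name": "ann", "age": 31, "city": "Delft"},
    {"name": "bob", "age": 45, "city": "Utrecht"},
    {"name": "cy", "age": 52, "city": "Delft"},
]
columns = fragment(rows)

# "SELECT name WHERE age > 40" reads only the age and name columns.
older = select_oids(columns["age"], lambda age: age > 40)
names = fetch(columns["name"], older)
print(names)
```

The `city` column is never read by this query, which is the source of the performance advantage the abstract claims for scan-heavy workloads.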

MonetDB/XQuery - Consistent & Efficient Updates on the Pre/Post Plane

Author: Boncz Peter
Flokstra Jan
Grust Torsten
Keulen Maurice van
Manegold Stefan
Mullender Sjoerd
Rittinger Jan
Teubner Jens
Publication venue: Springer
Publication date: 01/01/2006

Relational XQuery processors aim at leveraging mature relational DBMS query processing technology to provide scalability and efficiency. To achieve this goal, various storage schemes have been proposed to encode the tree structure of XML documents in flat relational tables. Basically, two classes can be identified: (1) encodings using fixed-length surrogates, like the preorder ranks in the pre/post encoding [5] or the equivalent pre/size/level encoding [8], and (2) encodings using variable-length surrogates, like, e.g., ORDPATH [9] or P-PBiTree [12]. Recent research [1] showed a clear advantage of the former for efficient evaluation of XPath location steps, exploiting techniques like cheap node order tests, positional lookup, and node skipping in staircase join [7]. However, once updates are involved, variable-length surrogates are often considered the better choice, mainly as a straightforward implementation of structural XML updates using fixed-length surrogates faces two performance bottlenecks: (i) high physical cost (the preorder ranks of all nodes following the update position must be modified, on average 50% of the document), and (ii) low transaction concurrency (updating the size of all ancestor nodes causes lock contention on the document root).

CiteSeerX

University of Twente Research Information
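
The pre/size/level encoding and the two update bottlenecks named in the MonetDB/XQuery abstract above can be sketched as follows. This is an illustrative Python sketch under simple assumptions, not the MonetDB/XQuery implementation; the `Node` class, the helper names, and the tiny sample tree are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Node:
    pre: int    # preorder rank (document order)
    size: int   # number of descendants
    level: int  # depth below the root
    tag: str


def encode(tree, level=0, out=None):
    """Flatten a (tag, [children]) tree into rows in document order."""
    if out is None:
        out = []
    tag, children = tree
    node = Node(pre=len(out), size=0, level=level, tag=tag)
    out.append(node)
    for child in children:
        encode(child, level + 1, out)
    node.size = len(out) - node.pre - 1  # everything appended since is a descendant
    return out


def is_descendant(d, a):
    """Cheap structural test: d lies inside a's preorder interval."""
    return a.pre < d.pre <= a.pre + a.size


def renumbered_on_append(nodes, parent_pre):
    """Nodes whose pre rank shifts if a new last child is added under parent."""
    parent = nodes[parent_pre]
    return [n.tag for n in nodes if n.pre > parent.pre + parent.size]


def resized_on_append(nodes, parent_pre):
    """Nodes whose size grows on that insert: the parent and all its ancestors."""
    return [n.tag for n in nodes if n.pre <= parent_pre <= n.pre + n.size]


# Hypothetical document: <a><b><c/></b><d/></a>
nodes = encode(("a", [("b", [("c", [])]), ("d", [])]))
table = [(n.pre, n.size, n.level, n.tag) for n in nodes]
print(table)
print(renumbered_on_append(nodes, 1))  # bottleneck (i): following nodes
print(resized_on_append(nodes, 1))     # bottleneck (ii): always includes the root
```

Because `resized_on_append` always contains the root for any insert position, every structural update touches the same row, which is the lock-contention problem the abstract describes.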

Moa and the multi-model architecture: a new perspective on XNF2

Author: Blok H.E.
Flokstra J.
Keulen M. van
Vonk J.
Vries A.P. de
Publication venue: Springer-Verlag
Publication date: 01/01/2003

Advanced non-traditional application domains such as geographic information systems and digital library systems demand advanced data management support. In an effort to cope with this demand, we present the concept of a novel multi-model DBMS architecture which provides evaluation of queries on complexly structured data without sacrificing efficiency. A vital role in this architecture is played by the Moa language featuring a nested relational data model based on XNF2, in which we placed renewed interest. Furthermore, extensibility in Moa avoids optimization obstacles due to black-box treatment of ADTs. The combination of a mapping of queries on complexly structured data to an efficient physical algebra expression via a nested relational algebra, extensibility open to optimization, and the consequently better integration of domain-specific algorithms enables the Moa system to handle complex queries from non-traditional application domains efficiently and effectively.

Crossref

CWI's Institutional Repository

University of Twente Research Information

MonetDB/X100 - A DBMS in the CPU cache

Author: Boncz P.A. (Peter)
Héman S. (Sándor)
Nes N.J. (Niels)
Zukowski M. (Marcin)
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/06/2005

X100 is a new execution engine for the MonetDB system that improves execution speed and overcomes its main-memory limitation. It introduces t

CWI's Institutional Repository

Navigating through a forest of quad trees to spot images in a database

Author: Bosch H.G.P.
Kersten M.L. (Martin)
Nes N.J. (Niels)
Publication venue: CWI
Publication date: 01/01/2000

This paper describes how we maintain color and spatial index information on more than 1,000,000 images and how we allow users to browse the spatial color feature space. We break down all our images in color-based quad trees and we store all quad trees in our main-memory database. We allow users to browse the quad trees directly, or they can pre-select images through our color bit vector, which acts as an index accelerator. A Java-based GUI is used to navigate through our image indexes.

CWI's Institutional Repository

Bulkloading and Maintaining XML Documents

Author: Kersten M.L. (Martin)
Schmidt A.R.
Publication venue: American College of Medical Physics (ACMP)
Publication date: 01/01/2002

The popularity of XML as an exchange and storage format brings about massive amounts of documents to be stored, maintained and analyzed, a challenge that traditionally has been tackled with Database Management Systems (DBMS). To open up the content of XML documents to analysis with declarative query languages, efficient bulk loading techniques are necessary. Database technology has traditionally offered support for these tasks, yet it falls short of providing efficient automation techniques for the challenges that large collections of XML data raise. As storage back-end, many applications rely on relational databases, which are designed for large data volumes. This paper studies the bulk load and update algorithms for XML data stored in relational format and outlines opportunities and problems. We investigate both (1) bulk insertion and deletion and (2) updates in the form of edit scripts, which rely heavily on pointer-chasing techniques that are often considered orthogonal to the algebraic operations relational databases are optimized for. To get the most out of relational database systems, we show that one should make careful use of edit scripts and replace them with bulk operations if more than a very small portion of the database is updated. We implemented our ideas on top of the Monet Database System and benchmarked their performance.

CWI's Institutional Repository

Finding a Second Wind: Speeding Up Graph Traversal Queries in RDBMSs Using Column-Oriented Processing

Author: Chernishev George
Firsov Mikhail
Polyntsov Michael
Smirnov Kirill
Publication date: 16/08/2023

Recursive queries and recursive derived tables constitute an important part of the SQL standard. Their efficient processing is important for many real-life applications that rely on graph or hierarchy traversal. Position-enabled column-stores offer a novel opportunity to improve run times for this type of query. Such systems allow the engine to explicitly use data positions (row ids) inside its core and thus enable novel efficient implementations of query plan operators. In this paper, we present an approach that significantly speeds up recursive query processing inside RDBMSes. Its core idea is to employ a particular aspect of column-store technology (late materialization) which enables the query engine to manipulate data positions during query execution. Based on it, we propose two sets of Volcano-style operators intended to process different query cases. In order to validate our ideas, we have implemented the proposed approach in PosDB, an RDBMS column-store with SQL support. We experimentally demonstrate the viability of our approach by providing a comparison with PostgreSQL. Experiments show that for breadth-first search: 1) our position-based approach yields up to 6x better results than PostgreSQL, 2) our tuple-based one results in only 3x improvement when using a special rewriting technique, but it can work in a larger number of cases, and 3) both approaches can't be emulated in row-stores efficiently.

arXiv.org e-Print Archive

Moa and the Multi-model architecture: a new perspective on XNF 2

Author: Blok H.E.
Flokstra J.
Keulen M. van
Vonk J.
Vries A.P. (Arjen) de
Publication venue: Springer Fachmedien Wiesbaden GmbH
Publication date: 01/01/2003

Advanced non-traditional application domains such as geographic information systems and digital library systems demand advanced data management support. In an effort to cope with this demand, we present the concept of a novel multi-model DBMS architecture which provides evaluation of queries on complexly structured data without sacrificing efficiency. A vital role in this architecture is played by the Moa language featuring a nested relational data model based on XNF2, in which we placed renewed interest. Furthermore, extensibility in Moa avoids optimization obstacles due to black-box treatment of ADTs. The combination of a mapping of queries on complexly structured data to an efficient physical algebra expression via a nested relational algebra, extensibility open to optimization, and the consequently better integration of domain-specific algorithms enables the Moa system to efficiently handle complex queries from non-traditional application domains.

CWI's Institutional Repository

CIRQUID: Complex information retrieval queries in a database

Author: Blok H.E.
Hiemstra D.
Jonker W.
Kersten M.L. (Martin)
Keulen M. van
Vries A.P. (Arjen) de
Publication venue: Centre for Telematics and Information Technology
Publication date: 01/01/2003

CWI's Institutional Repository