44,237 research outputs found

    S2ST: A Relational RDF Database Management System

    Get PDF
    The explosive growth of RDF data on the Semantic Web drives the need for novel database systems that can efficiently store and query large RDF datasets. To achieve good performance and scalability of query processing, most existing RDF storage systems use a relational database management system as a backend to manage RDF data. In this paper, we describe the design and implementation of a Relational RDF Database Management System. Our main research contributions are: (1) We propose a formal model of a Relational RDF Database Management System (RRDBMS), (2) We propose generic algorithms for schema, data and query mapping, (3) We implement the first and only RRDBMS, S2ST, that supports multiple relational database management systems, user-customizable schema mapping, schema-independent data mapping, and semantics-preserving query translation

    Fast and Simple Relational Processing of Uncertain Data

    Full text link
    This paper introduces U-relations, a succinct and purely relational representation system for uncertain databases. U-relations support attribute-level uncertainty using vertical partitioning. If we consider positive relational algebra extended by an operation for computing possible answers, a query on the logical level can be translated into, and evaluated as, a single relational algebra query on the U-relation representation. The translation scheme essentially preserves the size of the query in terms of number of operations and, in particular, number of joins. Standard techniques employed in off-the-shelf relational database management systems are effective for optimizing and processing queries on U-relations. In our experiments we show that query evaluation on U-relations scales to large amounts of data with high degrees of uncertainty.Comment: 12 pages, 14 figure

    Highspeed Graph Processing Exploiting Main-Memory Column Stores

    Get PDF
    A popular belief in the graph database community is that relational database management systems are generally ill-suited for efficient graph processing. This might apply for analytic graph queries performing iterative computations on the graph, but does not necessarily hold true for short-running, OLTP-style graph queries. In this paper we argue that, instead of extending a graph database management system with traditional relational operators—predicate evaluation, sorting, grouping, and aggregations among others—one should consider adding a graph abstraction and graph-specific operations, such as graph traversals and pattern matching, to relational database management systems. We use an exemplary query from the interactive query workload of the LDBC social network benchmark and run it against our enhanced in-memory, columnar relational database system to support our claims. Our performance measurements indicate that a columnar RDBMS—extended by graph-specific operators and data structures—can serve as a foundation for high-speed graph processing on big memory machines with non-uniform memory access and a large number of available cores

    Anatomy of a Native XML Base Management System

    Full text link
    Several alternatives to manage large XML document collections exist, ranging from file systems over relational or other database systems to specifically tailored XML repositories. In this paper we give a tour of Natix, a database management system designed from scratch for storing and processing XML data. Contrary to the common belief that management of XML data is just another application for traditional databases like relational systems, we illustrate how almost every component in a database system is affected in terms of adequacy and performance. We show how to design and optimize areas such as storage, transaction management comprising recovery and multi-user synchronisation as well as query processing for XML

    SAP HANA Database: Data Management for Modern Business Applications

    Get PDF
    The SAP HANA database is positioned as the core of the SAP HANA Appliance to support complex business analytical processes in combination with transactionally consistent operational workloads. Within this paper, we outline the basic characteristics of the SAP HANA database, emphasizing the distinctive features that differentiate the SAP HANA database from other classical relational database management systems. On the technical side, the SAP HANA database consists of multiple data processing engines with a distributed query processing environment to provide the full spectrum of data processing -- from classical relational data supporting both row- and column-oriented physical representations in a hybrid engine, to graph and text processing for semi- and unstructured data management within the same system. From a more application-oriented perspective, we outline the specific support provided by the SAP HANA database of multiple domain-specific languages with a built-in set of natively implemented business functions. SQL -- as the lingua franca for relational database systems -- can no longer be considered to meet all requirements of modern applications, which demand the tight interaction with the data management layer. Therefore, the SAP HANA database permits the exchange of application semantics with the underlying data management platform that can be exploited to increase query expressiveness and to reduce the number of individual application-to-database round trips

    6 Access Methods and Query Processing Techniques

    Get PDF
    The performance of a database management system (DBMS) is fundamentally dependent on the access methods and query processing techniques available to the system. Traditionally, relational DBMSs have relied on well-known access methods, such as the ubiquitous B +-tree, hashing with chaining, and, in som

    GRAPHITE: An Extensible Graph Traversal Framework for Relational Database Management Systems

    Get PDF
    Graph traversals are a basic but fundamental ingredient for a variety of graph algorithms and graph-oriented queries. To achieve the best possible query performance, they need to be implemented at the core of a database management system that aims at storing, manipulating, and querying graph data. Increasingly, modern business applications demand native graph query and processing capabilities for enterprise-critical operations on data stored in relational database management systems. In this paper we propose an extensible graph traversal framework (GRAPHITE) as a central graph processing component on a common storage engine inside a relational database management system. We study the influence of the graph topology on the execution time of graph traversals and derive two traversal algorithm implementations specialized for different graph topologies and traversal queries. We conduct extensive experiments on GRAPHITE for a large variety of real-world graph data sets and input configurations. Our experiments show that the proposed traversal algorithms differ by up to two orders of magnitude for different input configurations and therefore demonstrate the need for a versatile framework to efficiently process graph traversals on a wide range of different graph topologies and types of queries. Finally, we highlight that the query performance of our traversal implementations is competitive with those of two native graph database management systems

    RAM: array processing over a relational DBMS

    Get PDF
    Developing multimedia applications in relational databases is hindered by a mismatch in computational frameworks. Efficient manipulation of multimedia data calls for array-based processing, which at best is available as a database add-on, not supported by the query optimizer. As a result, array-based processing ends up in dedicated programs outside the DBMS: non-reusable black boxes. The goal of our research is to reduce this gap between user-needs and system functionality by developing a seemless integration of array processing in a relational algebra engine. The paper introduces a declarative language for array-expressions based on the array comprehension, and its mapping to a relational kernel in a prototype implementation. The layered architecture of the resulting array database management system allows the use of structural knowledge available in the array data type. This additional source of information can be exploited for query optimization, which is demonstrated with a case study. The experiments show how the performance of a standard tool for matrix computations can be achieved without sacrificing data independence, highlighting however a critical aspect in the DBMS architecture proposed

    DAS: a data management system for instrument tests and operations

    Full text link
    The Data Access System (DAS) is a metadata and data management software system, providing a reusable solution for the storage of data acquired both from telescopes and auxiliary data sources during the instrument development phases and operations. It is part of the Customizable Instrument WorkStation system (CIWS-FW), a framework for the storage, processing and quick-look at the data acquired from scientific instruments. The DAS provides a data access layer mainly targeted to software applications: quick-look displays, pre-processing pipelines and scientific workflows. It is logically organized in three main components: an intuitive and compact Data Definition Language (DAS DDL) in XML format, aimed for user-defined data types; an Application Programming Interface (DAS API), automatically adding classes and methods supporting the DDL data types, and providing an object-oriented query language; a data management component, which maps the metadata of the DDL data types in a relational Data Base Management System (DBMS), and stores the data in a shared (network) file system. With the DAS DDL, developers define the data model for a particular project, specifying for each data type the metadata attributes, the data format and layout (if applicable), and named references to related or aggregated data types. Together with the DDL user-defined data types, the DAS API acts as the only interface to store, query and retrieve the metadata and data in the DAS system, providing both an abstract interface and a data model specific one in C, C++ and Python. The mapping of metadata in the back-end database is automatic and supports several relational DBMSs, including MySQL, Oracle and PostgreSQL.Comment: Accepted for pubblication on ADASS Conference Serie
    • …
    corecore