9 research outputs found

    Effective Partitioning and Multiple RDF Indexing for Database Triple Store

    Get PDF
    The capability of semantic technology leads to adaption of semantic technology to multiple applications of various domains. Due to vast number of applications, the size of RDF triple store is increasing. Effective semantic query execution has become a challenge due to the structure of RDF triple store. Effective indexing and partitioning leads to good sematic query performance against RDF triple store. The current research work has focused on various indexing techniques and proposed a predicate centric partitioning and multiple RDF indexing method for database triple store. A detailed analysis process is been executed to measure and compare the query performance. The current method is evaluated using standard benchmark and real datasets with various indexing techniques. Later the methodology is applied to R & D project management dataset. A set of twenty seven queries has been derived by considering various user requirements that cover most of the SPARQL constructs. The method is implemented and a detailed evaluation has been successfully carried out. The query time is evaluated on R & D project management dataset. The test results indicate that the proposed method provides considerable improvement in overall query performance

    Mejora de la eficiencia en la recuperación de imágenes en bases de datos por mediante indexación semántica

    Get PDF
    En este trabajo se presenta una forma de indexación que permite recuperar de manera eficiente metadatos introducidos en tripletas (RDF) en tablas OR. Estas tripletas son utilizadas para relacionar a imágenes con conceptos semánticos. Las imágenes se relacionan con los conceptos semánticos mediante referencias a registros de imágenes. La propuesta de indexación, tiene como objetivo mejorar el tiempo de recuperación de las referencias almacenados en las tripletas. Se realizaron pruebas con diferentes cantidades de tripletas. Los resultados en pruebas empíricas, muestran que la tasa media de recuperación para volúmenes importantes de tripletas disminuye aproximadamente un 50%.Sociedad Argentina de Informática e Investigación Operativ

    An Extensible Framework for Query Optimization on TripleT-Based RDF Stores

    Get PDF
    ABSTRACT The RDF data model is a key technology in the Linked Data vision. Given its graph structure, even relatively simple RDF queries often involve a large number of joins. Join evaluation poses a significant performance challenge on all state-of-the-art RDF engines. TripleT is a novel RDF index data structure, demonstrated to be competitive with the current state-of-the-art for join processing. Query optimization on TripleT, however, has not been systematically studied up to this point. In this paper we investigate how the use of (i) heuristics and (ii) data statistics can contribute towards a more intelligent way of generating query plans over TripleT-based RDF stores. We propose a generic framework for query optimization, and show through an extensive empirical study that our framework consistently produces efficient query evaluation plans

    Benchmarking Bottom-Up and Top-Down Strategies to Sparql-To-Sql Query Translation

    Get PDF
    Many researchers have proposed using conventional relational databases to store and query large Semantic Web datasets. The most complex component of this approach is SPARQL-to-SQL query translation. Existing algorithms perform this translation using either bottom-up or top-down strategy and result in semantically equivalent but syntactically different relational queries. Do relational query optimizers always produce identical query execution plans for semantically equivalent bottom-up and top-down queries? Which of the two strategies yields faster SQL queries? To address these questions, this work studies bottom-up and top-down translations of SPARQL queries with nested optional graph patterns. This work presents: (1) A basic graph pattern translation algorithm that yields flat SQL queries, (2) A bottom-up nested optional graph pattern translation algorithm, (3) A top-down nested optional graph pattern translation algorithm, and (4) A performance study featuring SPARQL queries with nested optional graph patterns over RDF databases created in Oracle, DB2, and PostgreSQL

    Adaptive Low-level Storage of Very Large Knowledge Graphs

    Get PDF
    The increasing availability and usage of Knowledge Graphs (KGs) on the Web calls for scalable and general-purpose solutions to store this type of data structures. We propose Trident, a novel storage architecture for very large KGs on centralized systems. Trident uses several interlinked data structures to provide fast access to nodes and edges, with the physical storage changing depending on the topology of the graph to reduce the memory footprint. In contrast to single architectures designed for single tasks, our approach offers an interface with few low-level and general-purpose primitives that can be used to implement tasks like SPARQL query answering, reasoning, or graph analytics. Our experiments show that Trident can handle graphs with 10^11 edges using inexpensive hardware, delivering competitive performance on multiple workloads.Comment: Accepted WWW 202

    Pushing the Scalability of RDF Engines on IoT Edge Devices

    Get PDF
    Semantic interoperability for the Internet of Things (IoT) is enabled by standards and technologies from the Semantic Web. As recent research suggests a move towards decentralised IoT architectures, we have investigated the scalability and robustness of RDF (Resource Description Framework)engines that can be embedded throughout the architecture, in particular at edge nodes. RDF processing at the edge facilitates the deployment of semantic integration gateways closer to low-level devices. Our focus is on how to enable scalable and robust RDF engines that can operate on lightweight devices. In this paper, we have first carried out an empirical study of the scalability and behaviour of solutions for RDF data management on standard computing hardware that have been ported to run on lightweight devices at the network edge. The findings of our study shows that these RDF store solutions have several shortcomings on commodity ARM (Advanced RISC Machine) boards that are representative of IoT edge node hardware. Consequently, this has inspired us to introduce a lightweight RDF engine, which comprises an RDF storage and a SPARQL processor for lightweight edge devices, called RDF4Led. RDF4Led follows the RISC-style (Reduce Instruction Set Computer) design philosophy. The design constitutes a flash-aware storage structure, an indexing scheme, an alternative buffer management technique and a low-memory-footprint join algorithm that demonstrates improved scalability and robustness over competing solutions. With a significantly smaller memory footprint, we show that RDF4Led can handle 2 to 5 times more data than popular RDF engines such as Jena TDB (Tuple Database) and RDF4J, while consuming the same amount of memory. In particular, RDF4Led requires 10%–30% memory of its competitors to operate on datasets of up to 50 million triples. On memory-constrained ARM boards, it can perform faster updates and can scale better than Jena TDB and Virtuoso. Furthermore, we demonstrate considerably faster query operations than Jena TDB and RDF4J.BMBF, 01IS18025A, Verbundprojekt BIFOLD-BBDC: Berlin Institute for the Foundations of Learning and DataBMBF, 01IS18037A, Verbundprojekt BIFOLD-BZML: Berlin Institute for the Foundations of Learning and DataEC/H2020/661180/EU/A Scalable and Elastic Platform for Near-Realtime Analytics for The Graph of Everything/SMARTE

    Scalable indexing of RDF graphs for efficient join processing

    No full text
    Current approaches to RDF graph indexing suffer from weak data locality, i.e., information regarding a piece of data appears in multiple locations, spanning multiple data structures. Weak data locality negatively impacts storage and query processing costs. Towards stronger data locality, we propose a Three-way Triple Tree (TripleT) secondary memory indexing technique to facilitate flexible and efficient join evaluation on RDF data. The novelty of TripleT is that the index is built over the atoms occurring in the data set, rather than at a coarser granularity, such as whole triples occurring in the data set; and, the atoms are indexed regardless of the roles (i.e., subjects, predicates, or objects) they play in the triples of the data set. We show through extensive empirical evaluation that TripleT exhibits multiple orders of magnitude improvement over the state-of-the-art, in terms of both storage and query processing costs

    Scalable indexing of RDF graphs for efficient join processing

    No full text
    Current approaches to RDF graph indexing suffer from weak data locality, i.e., information regarding a piece of data appears in multiple locations, spanning multiple data structures. Weak data locality negatively impacts storage and query processing costs. Towards stronger data locality, we propose a Three-way Triple Tree (TripleT) secondary memory indexing technique to facilitate flexible and efficient join evaluation on RDF data. The novelty of TripleT is that the index is built over the atoms occurring in the data set, rather than at a coarser granularity, such as whole triples occurring in the data set; and, the atoms are indexed regardless of the roles (i.e., subjects, predicates, or objects) they play in the triples of the data set. We show through extensive empirical evaluation that TripleT exhibits multiple orders of magnitude improvement over the state-of-the-art, in terms of both storage and query processing costs
    corecore