811 research outputs found

Partout: A Distributed Engine for Efficient RDF Processing

Author: Galárraga Luis
Hose Katja
Schenkel Ralf
Publication date: 01/01/2012

The increasing interest in Semantic Web technologies has led not only to a rapid growth of semantic data on the Web but also to an increasing number of backend applications with already more than a trillion triples in some cases. Confronted with such huge amounts of data and the future growth, existing state-of-the-art systems for storing RDF and processing SPARQL queries are no longer sufficient. In this paper, we introduce Partout, a distributed engine for efficient RDF processing in a cluster of machines. We propose an effective approach for fragmenting RDF data sets based on a query log, allocating the fragments to nodes in a cluster, and finding the optimal configuration. Partout can efficiently handle updates and its query optimizer produces efficient query execution plans for ad-hoc SPARQL queries. Our experiments show the superiority of our approach to state-of-the-art approaches for partitioning and distributed SPARQL query processing.

arXiv.org e-Print Archive

CiteSeerX

VBN

MPG.PuRe

LiteMat: a scalable, cost-efficient inference encoding scheme for large RDF graphs

Author: Amann Bernd
Curé Olivier
Naacke Hubert
Randriamalala Tendry
Publication date: 12/10/2015

The number of linked data sources and the size of the linked open data graph keep growing every day. As a consequence, semantic RDF services are more and more confronted with various "big data" problems. Query processing in the presence of inferences is one of them. For instance, to complete the answer set of SPARQL queries, RDF database systems evaluate semantic RDFS relationships (subPropertyOf, subClassOf) through time-consuming query rewriting algorithms or space-consuming data materialization solutions. To reduce the memory footprint and ease the exchange of large datasets, these systems generally apply a dictionary approach for compressing triple data sizes by replacing resource identifiers (IRIs), blank nodes and literals with integer values. In this article, we present a structured resource identification scheme using a clever encoding of concepts and property hierarchies for efficiently evaluating the main common RDFS entailment rules while minimizing triple materialization and query rewriting. We will show how this encoding can be computed by a scalable parallel algorithm and directly be implemented over the Apache Spark framework. The efficiency of our encoding scheme is emphasized by an evaluation conducted over both synthetic and real world datasets. Comment: 8 pages, 1 figur
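
To illustrate the general idea of a hierarchy-aware dictionary encoding mentioned in this abstract, here is a minimal sketch. It is not taken from the paper: the function names (`encode_hierarchy`, `is_subclass_of`), the bit-prefix layout, and the toy class hierarchy are all assumptions made for illustration, and the sketch only handles a tree-shaped (single-inheritance) hierarchy. Each class gets an integer identifier whose leading bits repeat its parent's identifier, so a subClassOf test becomes an integer range check instead of a query rewrite or a materialized triple.

```python
def encode_hierarchy(children, root):
    """Assign each class an integer id whose bit prefix is its parent's code.

    `children` maps a class name to the list of its direct subclasses.
    Returns {class: (identifier, free_bits)}, where free_bits is the number
    of low-order padding bits left for descendants of that class.
    Assumption: the hierarchy is a tree (each class has one parent).
    """
    codes = {}

    def visit(node, prefix):
        codes[node] = prefix
        kids = children.get(node, [])
        if kids:
            # Enough bits for local ids 1..k; 0 is reserved so that a
            # child's code never equals its zero-padded parent code.
            width = len(kids).bit_length()
            for local_id, kid in enumerate(kids, start=1):
                visit(kid, prefix + format(local_id, f"0{width}b"))

    visit(root, "1")
    total = max(len(code) for code in codes.values())
    # Right-pad every code with zeros so all identifiers have equal length.
    return {
        name: (int(code.ljust(total, "0"), 2), total - len(code))
        for name, code in codes.items()
    }


def is_subclass_of(encoding, sub, sup):
    """True if `sub` equals `sup` or is a descendant of it."""
    sub_id, _ = encoding[sub]
    sup_id, free_bits = encoding[sup]
    # All descendants of `sup` share its prefix, so their ids fall in
    # the half-open interval [sup_id, sup_id + 2**free_bits).
    return sup_id <= sub_id < sup_id + (1 << free_bits)


# Hypothetical toy hierarchy, used only for this example.
hierarchy = {
    "Thing": ["Agent", "Document"],
    "Agent": ["Person", "Organization"],
    "Person": ["Student"],
}
encoding = encode_hierarchy(hierarchy, "Thing")
```

With this layout, `is_subclass_of(encoding, "Student", "Agent")` holds while `is_subclass_of(encoding, "Student", "Document")` does not, and both checks cost two integer comparisons. Multiple inheritance and property hierarchies would need additional handling that this sketch does not attempt.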

arXiv.org e-Print Archive

Crossref

Hal-Diderot

HAL-Ecole des Ponts ParisTech

HAL - UPEC / UPEM

A Self-Optimizing Cloud Computing System for Distributed Storage and Processing of Semantic Web Data

Author: Dennis Heinrich
Johannes Blume
Stefan Werner
Sven Groppe
Publication venue: RonPub
Publication date: 01/01/2014

Clouds are dynamic networks of common, off-the-shelf computers to build computation farms. The rapid growth of databases in the context of the semantic web requires efficient ways to store and process this data. Using cloud technology for storing and processing Semantic Web data is an obvious way to overcome difficulties in storing and processing the enormously large present and future datasets of the Semantic Web. This paper presents a new approach for storing Semantic Web data, such that operations for the evaluation of Semantic Web queries are more likely to be processed only on local data, instead of using costly distributed operations. An experimental evaluation demonstrates the performance improvements in comparison to a naive distribution of Semantic Web data.

RonPub -- Research Online Publishing

Neural Graph Reasoning: Complex Logical Query Answering Meets Graph Databases

Author: Cochez Michael
Galkin Mikhail
Leskovec Jure
Ren Hongyu
Zhu Zhaocheng
Publication date: 26/03/2023

VU Research Portal

Adaptive Low-level Storage of Very Large Knowledge Graphs

Author: Azzam A.
Baolin Liu
Bordes Antoine
Chong Eugene Inseok
DATASTAX
Fan Jing
Gonzalez E.
Gray Jim
Guha R.
Harris Steve
Kim Jinha
L.
McBride Brian
Modoni E.
Motik Boris
Rietveld Laurens
Urbani Jacopo
Publication venue: Association for Computing Machinery (ACM)
Publication date: 24/01/2020

The increasing availability and usage of Knowledge Graphs (KGs) on the Web calls for scalable and general-purpose solutions to store this type of data structures. We propose Trident, a novel storage architecture for very large KGs on centralized systems. Trident uses several interlinked data structures to provide fast access to nodes and edges, with the physical storage changing depending on the topology of the graph to reduce the memory footprint. In contrast to single architectures designed for single tasks, our approach offers an interface with few low-level and general-purpose primitives that can be used to implement tasks like SPARQL query answering, reasoning, or graph analytics. Our experiments show that Trident can handle graphs with 10^11 edges using inexpensive hardware, delivering competitive performance on multiple workloads. Comment: Accepted WWW 202

arXiv.org e-Print Archive

VU Research Portal

Crossref

View Selection in Semantic Web Databases

Author: François Goasdoué
Ioana Manolescu
Julien Leblay
Konstantinos Karanasos
Équipes-projets Leo
Publication date: 01/01/2011

We consider the setting of a Semantic Web database, containing both explicit data encoded in RDF triples, and implicit data, implied by the RDF semantics. Based on a query workload, we address the problem of selecting a set of views to be materialized in the database, minimizing a combination of query processing, view storage, and view maintenance costs. Starting from an existing relational view selection method, we devise new algorithms for recommending view sets, and show that they scale significantly beyond the existing relational ones when adapted to the RDF context. To account for implicit triples in query answers, we propose a novel RDF query reformulation algorithm and an innovative way of incorporating it into view selection in order to avoid a combinatorial explosion in the complexity of the selection process. The interest of our techniques is demonstrated through a set of experiments. Comment: VLDB201

arXiv.org e-Print Archive

HAL-CentraleSupelec

CiteSeerX

INRIA a CCSD electronic archive server

Oxford University Research Archive

HAL-Rennes 1