
    SMART-KG: Hybrid Shipping for SPARQL Querying on the Web

    While Linked Data (LD) provides standards for publishing (RDF) and querying (SPARQL) Knowledge Graphs (KGs) on the Web, serving, accessing and processing such open, decentralized KGs is often practically impossible, as query timeouts on publicly available SPARQL endpoints show. Alternative solutions such as Triple Pattern Fragments (TPF) attempt to tackle the problem of availability by pushing query processing workload to the client side, but suffer from unnecessary transfer of irrelevant data on complex queries with large intermediate results. In this paper we present smart-KG, a novel approach to share the load between servers and clients, while significantly reducing data transfer volume, by combining TPF with shipping compressed KG partitions. Our evaluations show that smart-KG outperforms state-of-the-art client-side solutions and increases server-side availability towards more cost-effective and balanced hosting of open and decentralized KGs.
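
    The sketch below is a minimal illustration of the hybrid-shipping idea described above, not smart-KG's actual implementation: for each triple pattern the client either pages through a TPF interface or downloads a compressed per-predicate partition once and evaluates the pattern locally. The endpoint URLs, the totalItems count field, the partition naming scheme and the shipping threshold are all illustrative assumptions.

```python
import gzip
import requests
from rdflib import Graph

# Illustrative endpoints and threshold; a real smart-KG server advertises its own
# partition catalog, and the client plans per query which patterns to ship.
TPF_ENDPOINT = "https://example.org/fragments"      # assumed TPF interface
PARTITION_BASE = "https://example.org/partitions/"  # assumed partition store
SHIP_THRESHOLD = 10_000                             # assumed shipping cutoff


def tpf_params(s, p, o):
    """Encode a triple pattern as TPF request parameters (None = wildcard)."""
    return {"subject": str(s) if s else "",
            "predicate": str(p) if p else "",
            "object": str(o) if o else ""}


def estimated_count(s, p, o):
    """Ask the server how many triples match the pattern; TPF pages expose such
    counts as metadata (assumed here to be available as a JSON field)."""
    r = requests.get(TPF_ENDPOINT, params=tpf_params(s, p, o),
                     headers={"Accept": "application/json"})
    r.raise_for_status()
    return int(r.json().get("totalItems", 0))


def fetch_partition(p):
    """Download, decompress and load the per-predicate partition exactly once."""
    name = str(p).rstrip("/").rsplit("/", 1)[-1]
    r = requests.get(f"{PARTITION_BASE}{name}.nt.gz")
    r.raise_for_status()
    g = Graph()
    g.parse(data=gzip.decompress(r.content).decode("utf-8"), format="nt")
    return g


def resolve(s, p, o, shipped):
    """Hybrid shipping: cheap patterns stay on the TPF interface, expensive ones
    are answered locally from a shipped partition that can be reused later."""
    if p is not None and estimated_count(s, p, o) > SHIP_THRESHOLD:
        if p not in shipped:
            shipped[p] = fetch_partition(p)          # transferred once, cached locally
        return list(shipped[p].triples((s, p, o)))   # local matching, no further requests
    r = requests.get(TPF_ENDPOINT, params=tpf_params(s, p, o))
    r.raise_for_status()
    return r.text                                    # first TPF page (paging omitted)

# Example call (illustrative IRI, not executed here):
# from rdflib import URIRef
# resolve(None, URIRef("http://example.org/knows"), None, shipped={})
```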

    MapReduce-based Solutions for Scalable SPARQL Querying

    The use of RDF to expose semantic data on the Web has seen a dramatic increase over the last few years. Nowadays, RDF datasets are so big and interconnected that, in fact, classical mono-node solutions present significant scalability problems when trying to manage big semantic data. MapReduce, a standard framework for distributed processing of large quantities of data, is earning a place among the distributed solutions facing RDF scalability issues. In this article, we survey the most important works addressing RDF management and querying through diverse MapReduce approaches, with a focus on their main strategies, optimizations and results.
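
    As a rough illustration of the kind of strategy the surveyed systems build on (not the code of any particular system), the following snippet simulates one MapReduce round that joins two triple patterns: mappers emit the join variable as the key, the shuffle groups by that key, and reducers combine compatible bindings.

```python
from collections import defaultdict

# Toy triples; in the surveyed systems these would live in HDFS as RDF files.
TRIPLES = [
    ("alice", "knows", "bob"),
    ("bob",   "knows", "carol"),
    ("bob",   "age",   "42"),
    ("carol", "age",   "37"),
]

# Query: SELECT ?x ?y ?a WHERE { ?x knows ?y . ?y age ?a }  -- join variable: ?y

def map_phase(triple):
    """Emit (join_key, tagged_binding) pairs for each pattern the triple matches."""
    s, p, o = triple
    if p == "knows":                 # pattern 1: ?x knows ?y  -> key on ?y
        yield (o, ("P1", {"x": s, "y": o}))
    if p == "age":                   # pattern 2: ?y age ?a    -> key on ?y
        yield (s, ("P2", {"y": s, "a": o}))

def reduce_phase(key, values):
    """Join bindings from both patterns that agree on ?y."""
    left = [b for tag, b in values if tag == "P1"]
    right = [b for tag, b in values if tag == "P2"]
    for l in left:
        for r in right:
            yield {**l, **r}

# Driver standing in for the MapReduce framework (shuffle = group by key).
groups = defaultdict(list)
for t in TRIPLES:
    for key, value in map_phase(t):
        groups[key].append(value)

results = [row for key, values in groups.items() for row in reduce_phase(key, values)]
print(results)  # [{'x': 'alice', 'y': 'bob', 'a': '42'}, {'x': 'bob', 'y': 'carol', 'a': '37'}]
```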

    Compressed k2-Triples for Full-In-Memory RDF Engines

    Current "data deluge" has flooded the Web of Data with very large RDF datasets. They are hosted and queried through SPARQL endpoints which act as nodes of a semantic net built on the principles of the Linked Data project. Although this is a realistic philosophy for global data publishing, its query performance is diminished when the RDF engines (behind the endpoints) manage these huge datasets. Their indexes cannot be fully loaded in main memory, hence these systems need to perform slow disk accesses to solve SPARQL queries. This paper addresses this problem by a compact indexed RDF structure (called k2-triples) applying compact k2-tree structures to the well-known vertical-partitioning technique. It obtains an ultra-compressed representation of large RDF graphs and allows SPARQL queries to be full-in-memory performed without decompression. We show that k2-triples clearly outperforms state-of-the-art compressibility and traditional vertical-partitioning query resolution, remaining very competitive with multi-index solutions.Comment: In Proc. of AMCIS'201

    Using Description Logics for RDF Constraint Checking and Closed-World Recognition

    RDF and Description Logics work in an open-world setting where absence of information is not information about absence. Nevertheless, Description Logic axioms can be interpreted in a closed-world setting, and in this setting they can be used for both constraint checking and closed-world recognition against information sources. When the information sources are expressed in well-behaved RDF or RDFS (i.e., RDF graphs interpreted under the RDF or RDFS semantics), this constraint checking and closed-world recognition are simple to describe. Further, this constraint checking can be implemented as SPARQL querying and thus performed effectively.
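
    A minimal sketch of this idea using rdflib (the vocabulary and data are invented for illustration): a Description Logic axiom such as Person ⊑ ∃name.⊤, read as a closed-world constraint, becomes a SPARQL query whose answers are exactly the violating resources.

```python
from rdflib import Graph

# Toy RDF graph: under open-world semantics the missing name for ex:bob is
# merely unknown; under closed-world constraint checking it is a violation.
data = """
@prefix ex: <http://example.org/> .
ex:alice a ex:Person ; ex:name "Alice" .
ex:bob   a ex:Person .
"""
g = Graph()
g.parse(data=data, format="turtle")

# The axiom "every Person has a name", read as a constraint, becomes a query
# that returns every Person without a name.
violations = g.query("""
PREFIX ex: <http://example.org/>
SELECT ?p WHERE {
  ?p a ex:Person .
  FILTER NOT EXISTS { ?p ex:name ?n }
}
""")
for row in violations:
    print(f"constraint violated by {row.p}")   # -> http://example.org/bob
```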

    Tool for SPARQL Querying over Compact RDF Representations

    Presented at the 4th XoveTIC Conference, A Coruña, Spain, 7–8 October 2021. We present an architecture for efficiently storing and querying large RDF datasets. Our approach seeks to store RDF datasets in very little space while offering complete SPARQL functionality. To achieve this, our proposal was built over HDT, an RDF serialization framework, and its interaction with the Jena query engine. We propose a set of modifications to this framework in order to incorporate a range of space-efficient compact data structures for data storage and access, while using high-level capabilities to answer more complicated SPARQL queries. As a result, our approach provides a standard mechanism for using low-level data structures in complicated query situations requiring SPARQL searches, which are typically not supported by current solutions. This research was funded by Xunta de Galicia/FEDER grant ED431G 2019/01; Xunta de Galicia/FEDER-UE grant IN852A 2018/14; Ministerio de Ciencia, Innovación y Universidades grants TIN2016-78011-C4-1-R and PID2019-105221RB-C41; Consellería de Cultura, Educación e Universidade/Consellería de Economía, Empresa e Innovación/GAIN/Xunta de Galicia grant ED431C 2021/53; and MICINN (PGE/ERDF) grant PID2020-114635RB-I00.
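
    The following toy sketch (ours, not the tool's code) illustrates the HDT principle the architecture builds on: terms are mapped to integer IDs through a dictionary component, and the triples component stores sorted ID tuples, so triple patterns can be matched over the compact form without materialising strings.

```python
import bisect

class TinyHDT:
    """Toy dictionary-plus-triples encoding in the spirit of HDT (illustrative only)."""

    def __init__(self, triples):
        terms = sorted({t for triple in triples for t in triple})
        self.id_of = {term: i for i, term in enumerate(terms)}   # dictionary component
        self.term_of = terms
        # Triples component: ID tuples sorted in SPO order, enabling range scans.
        self.spo = sorted((self.id_of[s], self.id_of[p], self.id_of[o])
                          for s, p, o in triples)

    def search(self, s=None, p=None, o=None):
        """Match a triple pattern (None = wildcard) over the ID-encoded triples."""
        if s is not None:
            # Bound subject: binary-search the contiguous SPO range for its ID.
            sid = self.id_of[s]
            lo = bisect.bisect_left(self.spo, (sid, -1, -1))
            hi = bisect.bisect_left(self.spo, (sid + 1, -1, -1))
            candidates = self.spo[lo:hi]
        else:
            candidates = self.spo                                 # full scan fallback
        for si, pi, oi in candidates:
            if (p is None or self.id_of[p] == pi) and (o is None or self.id_of[o] == oi):
                yield (self.term_of[si], self.term_of[pi], self.term_of[oi])

store = TinyHDT([("ex:alice", "ex:knows", "ex:bob"),
                 ("ex:alice", "ex:name",  "Alice"),
                 ("ex:bob",   "ex:name",  "Bob")])
print(list(store.search(s="ex:alice", p="ex:name")))  # [('ex:alice', 'ex:name', 'Alice')]
```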

    LODmilla: a Linked Data Browser for All

    Although the Linked Data paradigm is extremely popular and there is an immense amount of Linked Open Data available worldwide, human exploration of these datasets is limited. In our work we are evolving a generic platform called LODmilla for exploring and editing Linked Open Data. Our aim is to enable the extraction and sharing of data associations (or information) hidden in Linked Open Data. LODmilla is an open web application supporting graph views, graph searching and many other common features for browsing Linked Data.

    SPARQL Query Recommendations by Example

    In this demo paper, a SPARQL Query Recommendation Tool (called SQUIRE) based on query reformulation is presented. Based on three steps, Generalization, Specialization and Evaluation, SQUIRE implements the logic of reformulating a SPARQL query that is satisfiable w.r.t. a source RDF dataset into others that are satisfiable w.r.t. a target RDF dataset. In contrast with existing approaches, SQUIRE aims at recommending queries whose reformulations: i) reflect as much as possible the same intended meaning, structure, type of results and result size as the original query, and ii) do not require a mapping between the two datasets. Based on a set of criteria to measure the similarity between the initial query and the recommended ones, SQUIRE demonstrates the feasibility of the underlying query reformulation process, appropriately ranks the recommended queries, and offers valuable support for query recommendations over an unknown and unmapped target RDF dataset, not only assisting the user in learning the data model and content of an RDF dataset, but also supporting its use without requiring the user to have intrinsic knowledge of the data.
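
    As a rough, hedged sketch of a generalize-specialize-evaluate loop of this kind (the helper names, template and toy target dataset are our own assumptions, and SQUIRE's actual ranking criteria are omitted):

```python
from itertools import product
from rdflib import Graph

# Toy target dataset; SQUIRE's actual pipeline works on full RDF datasets.
target = Graph()
target.parse(data="""
@prefix t: <http://target.example/> .
t:paper1 t:writtenBy t:ada .
t:paper2 t:writtenBy t:alan .
""", format="turtle")

# Source query (satisfiable on some source dataset, not on the target):
#   SELECT ?b WHERE { ?b <http://source.example/author> <http://source.example/ada> }

# 1. Generalization: constants of the source query are lifted into template slots.
template = "SELECT ?b WHERE {{ ?b <{prop}> <{obj}> }}"

# 2. Specialization: re-instantiate the lifted positions with candidate terms
#    drawn from the target dataset's vocabulary (here: its predicates and IRI objects).
properties = {str(p) for p in target.predicates()}
objects = {str(o) for o in target.objects() if str(o).startswith("http")}

# 3. Evaluation: keep only reformulations that are satisfiable on the target;
#    ranking by similarity to the original query is omitted in this sketch.
recommended = []
for prop, obj in product(properties, objects):
    query = template.format(prop=prop, obj=obj)
    ask = "ASK" + query.split("WHERE", 1)[1]
    if target.query(ask).askAnswer:
        recommended.append(query)

for q in recommended:
    print(q)
# e.g. SELECT ?b WHERE { ?b <http://target.example/writtenBy> <http://target.example/ada> }
```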

    SECF: Improving SPARQL Querying Performance with Proactive Fetching and Caching

    Querying SPARQL endpoints may be unsatisfactory due to the high latency of connections to the endpoints. Caching is an important way to accelerate query response. In this paper, we propose the SPARQL Endpoint Caching Framework (SECF), a client-side caching framework for this purpose. In particular, we prefetch and cache the results of queries similar to recently cached queries, aiming to improve overall querying performance. The similarity between queries is calculated via an improved Graph Edit Distance (GED) function. We also adapt a smoothing method to implement cache replacement. Empirical evaluations on real-world queries show that our approach has great potential to enhance the cache hit rate and accelerate querying speed on SPARQL endpoints.
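
    A minimal client-side sketch of this kind of caching and proactive fetching (not SECF itself): the graph-edit-distance similarity is replaced by a trivial string-similarity stand-in, the smoothing-based replacement by plain LRU, and the endpoint and capacity are illustrative assumptions.

```python
import time
from difflib import SequenceMatcher
from SPARQLWrapper import SPARQLWrapper, JSON

ENDPOINT = "https://dbpedia.org/sparql"   # any public endpoint; illustrative
CACHE_CAPACITY = 100                      # assumed capacity
cache = {}                                # query text -> (results, last access time)

def similarity(q1, q2):
    """Stand-in for a graph-edit-distance-based similarity between queries."""
    return SequenceMatcher(None, q1, q2).ratio()

def run(query):
    sparql = SPARQLWrapper(ENDPOINT)
    sparql.setQuery(query)
    sparql.setReturnFormat(JSON)
    return sparql.query().convert()

def evict_if_full():
    """Simplified replacement: drop the least recently used entry
    (SECF uses a smoothing-based replacement policy instead)."""
    if len(cache) > CACHE_CAPACITY:
        oldest = min(cache, key=lambda q: cache[q][1])
        del cache[oldest]

def answer(query, candidate_queries):
    """Serve from cache when possible; afterwards prefetch similar queries so
    that likely follow-up requests are already cached."""
    if query in cache:
        results, _ = cache[query]
        cache[query] = (results, time.time())        # refresh recency
        return results
    results = run(query)
    cache[query] = (results, time.time())
    # Proactive fetching: warm the cache with the most similar known queries.
    for q in sorted(candidate_queries, key=lambda q: similarity(query, q), reverse=True)[:3]:
        if q not in cache:
            cache[q] = (run(q), time.time())
            evict_if_full()
    evict_if_full()
    return results
```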