8 research outputs found

    Towards efficient query processing over heterogeneous RDF interfaces

    SaGe: Preemptive Query Execution for High Data Availability on the Web

    Semantic Web applications require querying available RDF data with high performance and reliability. However, ensuring both data availability and performant SPARQL query execution on public SPARQL servers is a challenging problem. Queries can have arbitrary execution times and unknown arrival rates. In this paper, we propose SaGe, a preemptive server-side SPARQL query engine. SaGe relies on a preemptable physical query execution plan and preemptable physical operators. SaGe stops query execution after a given slice of time, saves the state of the plan, and sends the saved plan back to the client with the retrieved results. Later, the client can continue the query execution by resubmitting the saved plan to the server. By ensuring fair query execution, SaGe maintains server availability and provides high query throughput. Experimental results demonstrate that SaGe outperforms state-of-the-art SPARQL query engines in terms of query throughput, query timeouts, and answer completeness.
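The stop, save, and resume cycle described in this abstract can be illustrated with a minimal sketch. It is not SaGe's implementation: the names `SavedPlan`, `execute_slice`, and `run_client` are hypothetical, the "plan" is reduced to a single scan over an in-memory list of triples, and the time slice is approximated by a fixed step budget (`quantum`) so that the example behaves deterministically.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]


@dataclass
class SavedPlan:
    """Hypothetical saved state of a suspended scan: the next offset to read."""
    offset: int


def execute_slice(
    data: List[Triple],
    predicate: str,
    plan: Optional[SavedPlan],
    quantum: int,
) -> Tuple[List[Triple], Optional[SavedPlan]]:
    """Server side: scan for at most `quantum` steps, then suspend.

    Assumption: `quantum` counts scan steps and stands in for a time slice.
    Returns the results found in this slice and the saved plan, or None
    as the plan once the scan has reached the end of the data.
    """
    offset = plan.offset if plan is not None else 0
    results: List[Triple] = []
    steps = 0
    while offset < len(data) and steps < quantum:
        triple = data[offset]
        if triple[1] == predicate:
            results.append(triple)
        offset += 1
        steps += 1
    saved = SavedPlan(offset) if offset < len(data) else None
    return results, saved


def run_client(
    data: List[Triple], predicate: str, quantum: int
) -> Tuple[List[Triple], int]:
    """Client side: resubmit the saved plan until the query completes.

    Returns all answers and the number of requests that were needed.
    """
    answers: List[Triple] = []
    plan: Optional[SavedPlan] = None
    requests = 0
    while True:
        results, plan = execute_slice(data, predicate, plan, quantum)
        answers.extend(results)
        requests += 1
        if plan is None:
            return answers, requests
```

In this sketch the server holds no per-query state between requests; everything needed to continue travels with the saved plan, which is what lets the client resume later by resubmitting it.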

    Analysis of the Effect of Query Shapes on Performance over LDF Interfaces

    Querying Linked Data: An Experimental Evaluation of State-of-the-Art Interfaces

    The adoption of Semantic Web technologies, and in particular the Open Data initiative, has contributed to the steady growth of the number of datasets and triples accessible on the Web. Most commonly, queries over RDF data are evaluated over SPARQL endpoints. Recently, however, alternatives such as TPF have been proposed with the goal of shifting query processing load from the server running the SPARQL endpoint towards the client that issued the query. Although these interfaces have been evaluated against standard benchmarks and testbeds that showed their benefits over previous work in general, a fine-granular evaluation of what types of queries exploit the strengths of the different available interfaces has never been done. In this paper, we present the results of our in-depth evaluation of existing RDF interfaces. In addition, we also examine the influence of the backend on the performance of these interfaces. Using representative and diverse query loads based on the query log of a public SPARQL endpoint, we stress test the different interfaces and backends and identify their strengths and weaknesses. Comment: 18 pages, 14 figures.

    Robust query processing for linked data fragments

    Linked Data Fragments (LDFs) refer to interfaces that allow for publishing and querying Knowledge Graphs on the Web. These interfaces primarily differ in their expressivity and allow for exploring different trade-offs when balancing the workload between clients and servers in decentralized SPARQL query processing. To devise efficient query plans, clients typically rely on heuristics that leverage the metadata provided by the LDF interface, since obtaining fine-grained statistics from remote sources is a challenging task. However, these heuristics are prone to estimation errors based on the metadata, which can lead to inefficient query executions with a high number of requests, large amounts of data transferred, and, consequently, excessive execution times. In this work, we investigate robust query processing techniques for Linked Data Fragment clients to address these challenges. We first focus on robust plan selection by proposing CROP, a query plan optimizer that explores the cost and robustness of alternative query plans. Then, we address robust query execution by proposing a new class of adaptive operators: Polymorphic Join Operators. These operators adapt their join strategy in response to possible cardinality estimation errors. The results of our first experimental study show that CROP outperforms state-of-the-art clients by exploring alternative plans based on their cost and robustness. In our second experimental study, we investigate how different planning approaches can benefit from polymorphic join operators and find that they enable more efficient query execution in the majority of cases.

    A Survey of the First 20 Years of Research on Semantic Web and Linked Data

    This paper is a survey of the research topics in the field of Semantic Web, Linked Data and Web of Data. This study looks at the contributions of this research community over its first twenty years of existence. Compiling several bibliographical sources and bibliometric indicators, we identify the main research trends and we reference some of their major publications to provide an overview of that initial period. We conclude with some perspectives on future research challenges.

    A formal framework for comparing linked data fragments

    The Linked Data Fragment (LDF) framework has been proposed as a uniform view to explore the trade-offs of consuming Linked Data when servers provide (possibly many) different interfaces to access their data. Every such interface has its own particular properties regarding performance, bandwidth needs, caching, etc. Several practical challenges arise. For example, before exposing a new type of LDFs in some server, can we formally say something about how this new LDF interface compares to other interfaces previously implemented in the same server? From the client side, given a client with some restricted capabilities in terms of time constraints, network connection, or computational power, which is the best type of LDFs to complete a given task? Today there are only a few formal theoretical tools to help answer these and other practical questions, and researchers have embarked on solving them mainly by experimentation. In this paper we propose the Linked Data Fragment Machine (LDFM), which is the first formalization to model LDF scenarios. LDFMs work as classical Turing Machines with extra features that model the server and client capabilities. By proving formal results based on LDFMs, we draw a fairly complete expressiveness lattice that shows the interplay between several combinations of client and server capabilities. We also show the usefulness of our model to formally analyze the fine-grained interplay between several metrics, such as the number of requests sent to the server and the bandwidth of communication between client and server.