SaGe: Preemptive Query Execution for High Data Availability on the Web
Semantic Web applications require querying available RDF data with high performance and reliability. However, ensuring both data availability and performant SPARQL query execution on public SPARQL servers is challenging: queries may have arbitrary execution times and unknown arrival rates. In this paper, we propose SaGe, a preemptive server-side SPARQL
query engine. SaGe relies on a preemptable physical query execution plan and
preemptable physical operators. SaGe stops query execution after a given slice
of time, saves the state of the plan and sends the saved plan back to the
client with retrieved results. Later, the client can continue the query
execution by resubmitting the saved plan to the server. By ensuring fair
query execution, SaGe maintains server availability and provides high query
throughput. Experimental results demonstrate that SaGe outperforms
state-of-the-art SPARQL query engines in terms of query throughput, query
timeouts, and answer completeness.
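The preemption cycle described above (execute for a time slice, save the plan's state, return it to the client, resume on resubmission) can be sketched with a toy preemptable scan operator. All names here (`PreemptableScan`, `quantum`, the shape of the saved state) are illustrative assumptions, not the actual SaGe API:

```python
import time

class PreemptableScan:
    """Toy preemptable operator: iterates over triples and can be
    suspended after a time quantum, then resumed from saved state.
    (Illustrative sketch, not SaGe's actual implementation.)"""

    def __init__(self, triples, offset=0):
        self.triples = triples
        self.offset = offset  # saved state: current position in the scan

    def execute(self, quantum):
        """Run for at most `quantum` seconds.
        Returns (results, saved_state); saved_state is None when done."""
        results = []
        deadline = time.monotonic() + quantum
        while self.offset < len(self.triples):
            if time.monotonic() >= deadline:
                # Time slice expired: suspend and hand state back.
                return results, {"offset": self.offset}
            results.append(self.triples[self.offset])
            self.offset += 1
        return results, None  # scan finished, nothing to resume


def run_query(triples, quantum=0.05):
    """Client loop: resubmit the saved plan until the query completes."""
    state, answers = {"offset": 0}, []
    while state is not None:
        scan = PreemptableScan(triples, state["offset"])
        results, state = scan.execute(quantum)
        answers.extend(results)
    return answers
```

Because the server never runs a query longer than one quantum, no single expensive query can monopolize it, which is what the abstract means by fair query execution.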
Querying Linked Data: An Experimental Evaluation of State-of-the-Art Interfaces
The adoption of Semantic Web technologies, and in particular the Open Data
initiative, has contributed to the steady growth of the number of datasets and
triples accessible on the Web. Most commonly, queries over RDF data are
evaluated over SPARQL endpoints. Recently, however, alternatives such as Triple Pattern Fragments (TPF)
have been proposed with the goal of shifting query processing load from the
server running the SPARQL endpoint towards the client that issued the query.
Although these interfaces have been evaluated against standard benchmarks and
testbeds that showed their benefits over previous work in general, a
fine-granular evaluation of what types of queries exploit the strengths of the
different available interfaces has never been done. In this paper, we present
the results of our in-depth evaluation of existing RDF interfaces. In addition,
we also examine the influence of the backend on the performance of these
interfaces. Using representative and diverse query loads based on the query log
of a public SPARQL endpoint, we stress test the different interfaces and
backends and identify their strengths and weaknesses.
Robust query processing for linked data fragments
Linked Data Fragments (LDFs) refer to interfaces that allow for publishing and querying Knowledge Graphs on the Web. These interfaces primarily differ in their expressivity and allow for exploring different trade-offs when balancing the workload between clients and servers in decentralized SPARQL query processing. To devise efficient query plans, clients typically rely on heuristics that leverage the metadata provided by the LDF interface, since obtaining fine-grained statistics from remote sources is a challenging task. However, these heuristics are prone to estimation errors based on the metadata, which can lead to inefficient query executions with a high number of requests, large amounts of data transferred, and, consequently, excessive execution times. In this work, we investigate robust query processing techniques for Linked Data Fragment clients to address these challenges. We first focus on robust plan selection by proposing CROP, a query plan optimizer that explores the cost and robustness of alternative query plans. Then, we address robust query execution by proposing a new class of adaptive operators: Polymorphic Join Operators. These operators adapt their join strategy in response to possible cardinality estimation errors. The results of our first experimental study show that CROP outperforms state-of-the-art clients by exploring alternative plans based on their cost and robustness. In our second experimental study, we investigate how different planning approaches can benefit from polymorphic join operators and find that they enable more efficient query execution in the majority of cases.
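The idea of a join operator that adapts its strategy when the optimizer's cardinality estimate turns out to be wrong can be sketched as follows. The function name, the use of Python dicts as rows, and the switch condition are all assumptions for illustration, not the paper's actual Polymorphic Join Operators:

```python
def polymorphic_join(left, right, key, estimate):
    """Illustrative polymorphic join: start with a nested-loop strategy
    (cheap when the outer input is small, as the estimate predicts) and
    switch to a hash join once the observed cardinality of the outer
    side exceeds the estimate."""
    seen = 0
    left_iter = iter(left)
    for l in left_iter:
        seen += 1
        if seen > estimate:
            # Cardinality estimation error detected: build a hash table
            # on `right` and finish with a hash join for the current and
            # all remaining outer rows.
            table = {}
            for r in right:
                table.setdefault(r[key], []).append(r)
            for l2 in [l, *left_iter]:
                for r in table.get(l2[key], []):
                    yield {**l2, **r}
            return
        # Nested-loop probe: scan `right` for each outer row.
        for r in right:
            if r[key] == l[key]:
                yield {**l, **r}
```

Both strategies produce the same join result; switching mid-execution only changes the cost profile, which is the point of making the operator adaptive rather than committing to one strategy at planning time.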
A Survey of the First 20 Years of Research on Semantic Web and Linked Data
This paper is a survey of the research topics in the field of the Semantic Web, Linked Data, and the Web of Data. This study looks at the contributions of this research community over its first twenty years of existence. Compiling several bibliographical sources and bibliometric indicators, we identify the main research trends and reference some of their major publications to provide an overview of that initial period. We conclude with perspectives on future research challenges.
Geographic Knowledge Graph Summarization
Geographic knowledge graphs play a significant role in the geospatial semantics paradigm for fulfilling the interoperability, accessibility, and conceptualization demands of geographic information science. However, due to the immense quantity of information that accompanies geographic knowledge graphs and their enormous diversity, many challenges hinder the applicability and mass adoption of such useful structured knowledge. In order to tackle these challenges, this dissertation focuses on devising ways in which geographic knowledge graphs can be digested and summarized. Such a summarization task, on the one hand, lifts the burden of information overload for end users and, on the other hand, facilitates the reduction of data storage, speeds up queries, and helps eliminate noise. The main contribution of this dissertation is that it introduces the general concept of geospatial inductive bias and explains different ways this idea can be used in the geographic knowledge graph summarization task. By decomposing the task into separate but related components, this dissertation is based upon three peer-reviewed articles, which focus on the hierarchical place type structure, multimedia leaf nodes, and general relation and entity components, respectively. A spatial knowledge map interface that illustrates the effectiveness of summarizing geographic knowledge graphs is presented. Throughout the dissertation, top-down knowledge engineering and bottom-up knowledge learning methods are integrated. We hope this dissertation will promote awareness of this fascinating area and motivate researchers to investigate related questions.
A formal framework for comparing linked data fragments
The Linked Data Fragment (LDF) framework has been proposed as a uniform view to explore the trade-offs of consuming Linked Data when servers provide (possibly many) different interfaces to access their data. Every such interface has its own particular properties regarding performance, bandwidth needs, caching, etc. Several practical challenges arise. For example, before exposing a new type of LDFs on some server, can we formally say something about how this new LDF interface compares to other interfaces previously implemented on the same server? From the client side, given a client with restricted capabilities in terms of time constraints, network connection, or computational power, which is the best type of LDFs to complete a given task? Today there are only a few formal theoretical tools to help answer these and other practical questions, and researchers have embarked on solving them mainly by experimentation. In this paper we propose the Linked Data Fragment Machine (LDFM), which is the first formalization to model LDF scenarios. LDFMs work as classical Turing Machines with extra features that model the server and client capabilities. By proving formal results based on LDFMs, we draw a fairly complete expressiveness lattice that shows the interplay between several combinations of client and server capabilities. We also show the usefulness of our model to formally analyze the fine-grained interplay between several metrics, such as the number of requests sent to the server and the bandwidth of communication between client and server.