Opportunistic linked data querying through approximate membership metadata
Between URI dereferencing and the SPARQL protocol lies a largely unexplored axis of possible interfaces to Linked Data, each with its own combination of trade-offs. One of these interfaces is Triple Pattern Fragments, which allows clients to execute SPARQL queries against low-cost servers, at the cost of higher bandwidth. Increasing a client's efficiency means lowering the number of requests, which can be achieved, among other means, through additional metadata in responses. We noted that typical SPARQL query evaluations against Triple Pattern Fragments require a significant portion of membership subqueries, which check the presence of a specific triple, rather than a variable pattern. This paper studies the impact of providing approximate membership functions, i.e., Bloom filters and Golomb-coded sets, as extra metadata. In addition to reducing HTTP requests, such functions make it possible to achieve full result recall earlier when temporarily allowing lower precision. Half of the tested queries from a WatDiv benchmark test set could be executed with up to a third fewer HTTP requests with only marginally higher server cost. Query times, however, did not improve, likely due to slower metadata generation and transfer. This indicates that approximate membership functions can partly improve the client-side query process with minimal impact on the server and its interface.
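As an illustration of the approximate membership idea described in this abstract, the following is a minimal Bloom filter sketch in Python. It is not the paper's implementation: the class `BloomFilter`, the helpers `triple_key` and `may_need_request`, the `ex:` identifiers, and the chosen filter size and hash count are all assumptions made for this example. The point it shows is that a client holding such a filter can answer "definitely absent" locally and only needs a request when the filter answers "possibly present".

```python
import hashlib


class BloomFilter:
    """Approximate set: false positives are possible, false negatives are not."""

    def __init__(self, size_bits: int = 1024, num_hashes: int = 4):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, item: str):
        # Derive num_hashes bit positions by salting one hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def triple_key(subject: str, predicate: str, obj: str) -> str:
    """Hypothetical serialization of a triple into one filter key."""
    return f"{subject} {predicate} {obj}"


def may_need_request(membership_filter: BloomFilter,
                     subject: str, predicate: str, obj: str) -> bool:
    """Client side: False means the triple is certainly absent, so no
    membership request is needed; True means it may be present."""
    return membership_filter.might_contain(triple_key(subject, predicate, obj))


# Server side (hypothetical): filter built over the triples of one fragment.
fragment_filter = BloomFilter()
fragment_filter.add(triple_key("ex:alice", "ex:knows", "ex:bob"))
```

A "possibly present" answer can be treated as a provisional result and verified later, which corresponds to the abstract's notion of reaching full recall early while temporarily accepting lower precision.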
DBpedia's triple pattern fragments: usage patterns and insights
Queryable Linked Data is published through several interfaces, including SPARQL endpoints and Linked Data documents. In October 2014, the DBpedia Association announced an official Triple Pattern Fragments interface to its popular DBpedia dataset. This interface proposes to improve the availability of live queryable data by dividing query execution between clients and servers. In this paper, we present a usage analysis between November 2014 and July 2015. In nine months' time, the interface had an average availability of 99.99%, handling 16,776,170 requests, 43.0% of which were served from cache. These numbers provide promising evidence that low-cost Triple Pattern Fragments interfaces provide a viable strategy for live applications on top of public, queryable datasets.
The Highway to Queryable Linked Data: Self-Describing Web APIs with Varying Features
Making Linked Data queryable on the Web is not an easy task for publishers, for technical and logistical reasons. Can they afford to offer a SPARQL endpoint, or should they offer an API or data dump instead? And what technical knowledge is needed for that? This demo presents a user-friendly pipeline to compose APIs for Linked Datasets, consisting of a customizable set of reusable features, e.g., Triple Pattern Fragments, substring search, and membership metadata. These APIs indicate their supported features in hypermedia responses, so that clients can discover which server-provided functionality they understand, and divide the evaluation of queries accordingly between client and server. That way, publishers can determine the complexity of the resulting API, and thus the maximal set of server tasks. This demo shows how publishers can easily set up an API with this pipeline, and demonstrates the client-side execution of federated queries against such APIs.
SaGe: Preemptive Query Execution for High Data Availability on the Web
Semantic Web applications require querying available RDF data with high performance and reliability. However, ensuring both data availability and performant SPARQL query execution in the context of public SPARQL servers is a challenging problem. Queries could have arbitrary execution time and unknown arrival rates. In this paper, we propose SaGe, a preemptive server-side SPARQL query engine. SaGe relies on a preemptable physical query execution plan and preemptable physical operators. SaGe stops query execution after a given slice of time, saves the state of the plan, and sends the saved plan back to the client with the retrieved results. Later, the client can continue the query execution by resubmitting the saved plan to the server. By ensuring a fair query execution, SaGe maintains server availability and provides high query throughput. Experimental results demonstrate that SaGe outperforms state-of-the-art SPARQL query engines in terms of query throughput, query timeout, and answer completeness.
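The stop, save, and resume cycle described in this abstract can be sketched as follows. This is a deliberately simplified illustration, not SaGe's actual engine or API: it models a single scan over an in-memory list, and the names `run_slice`, `run_to_completion`, the `offset` field of the saved plan, and the `clock` parameter are assumptions made for this example.

```python
import time
from typing import Any, Callable, Dict, List, Optional, Tuple

SavedPlan = Dict[str, int]


def run_slice(
    data: List[Any],
    saved_plan: Optional[SavedPlan],
    quantum_seconds: float,
    clock: Callable[[], float] = time.monotonic,
) -> Tuple[List[Any], Optional[SavedPlan]]:
    """Server side: run for at most one time slice, then preempt.

    Returns the results produced in this slice plus a saved plan, or
    None as the plan when the scan has finished.
    """
    offset = saved_plan["offset"] if saved_plan else 0
    deadline = clock() + quantum_seconds
    results: List[Any] = []
    while offset < len(data):
        # Always make progress on at least one item per slice.
        results.append(data[offset])
        offset += 1
        if offset < len(data) and clock() >= deadline:
            # Time slice used up: save the state and hand it to the client.
            return results, {"offset": offset}
    return results, None


def run_to_completion(data: List[Any], quantum_seconds: float) -> List[Any]:
    """Client side: resubmit the saved plan until the server reports completion."""
    plan: Optional[SavedPlan] = None
    all_results: List[Any] = []
    while True:
        results, plan = run_slice(data, plan, quantum_seconds)
        all_results.extend(results)
        if plan is None:
            return all_results
```

Because the saved plan travels with the client rather than staying on the server, the server in this sketch keeps no per-query state between slices, which is what lets it move on to other queries after each time slice.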