9 research outputs found

    Substring filtering for low-cost linked data interfaces

    Recently, Triple Pattern Fragments (TPF) were introduced as a low-cost server-side interface for situations where high numbers of clients need to evaluate SPARQL queries. Scalability is achieved by moving part of the query execution to the client, at the cost of elevated query times. Since the TPF interface purposely does not support complex constructs such as SPARQL filters, queries that use them need to be executed mostly on the client, resulting in long execution times. We therefore investigated the impact of adding a literal substring matching feature to the TPF interface, with the goal of improving query performance while maintaining low server cost. In this paper, we discuss the client/server setup and compare the performance of SPARQL queries on multiple implementations, including Elasticsearch and a case-insensitive FM-index. Our evaluations indicate that these improvements allow for faster query execution without significantly increasing the load on the server. Offering the substring feature on TPF servers allows users to obtain faster responses for filter-based SPARQL queries. Furthermore, substring matching can be used to support other filters such as complete regular expressions or range queries.
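    As a rough illustration of the division of work described above, the following Python sketch builds a fragment request with a hypothetical literal-substring parameter (the objectSubstring name and the endpoint URL are assumptions, not the interface defined in the paper) and then applies a stricter regular-expression filter on the client over the literals the server would return.

        import re
        from urllib.parse import urlencode

        # Hypothetical endpoint; the real interface and its controls may differ.
        TPF_SERVER = "https://example.org/dataset"

        def substring_fragment_url(subject=None, predicate=None, obj_substring=None, page=1):
            """Build a fragment URL with an assumed 'objectSubstring' argument."""
            params = {"page": page}
            if subject:
                params["subject"] = subject
            if predicate:
                params["predicate"] = predicate
            if obj_substring:
                params["objectSubstring"] = obj_substring
            return f"{TPF_SERVER}?{urlencode(params)}"

        def refine_with_regex(literals, pattern):
            """Client-side refinement: the server only guarantees substring matches,
            so a full SPARQL REGEX filter is still evaluated on the client."""
            rx = re.compile(pattern)
            return [lit for lit in literals if rx.search(lit)]

        print(substring_fragment_url(predicate="rdfs:label", obj_substring="berlin"))
        candidates = ["Berlin", "East Berlin", "Berlingo"]  # stand-in for a server response
        print(refine_with_regex(candidates, r"^(East )?Berlin$"))  # ['Berlin', 'East Berlin']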

    DBpedia's triple pattern fragments: usage patterns and insights

    Queryable Linked Data is published through several interfaces, including SPARQL endpoints and Linked Data documents. In October 2014, the DBpedia Association announced an official Triple Pattern Fragments interface to its popular DBpedia dataset. This interface proposes to improve the availability of live queryable data by dividing query execution between clients and servers. In this paper, we present a usage analysis between November 2014 and July 2015. In 9 months' time, the interface had an average availability of 99.99%, handling 16,776,170 requests, 43.0% of which were served from cache. These numbers provide promising evidence that low-cost Triple Pattern Fragments interfaces provide a viable strategy for live applications on top of public, queryable datasets.
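    A quick back-of-the-envelope sketch, in Python, of what the reported 43.0% cache hit ratio means in absolute request counts (figures taken directly from the abstract above):

        # Back-of-the-envelope check of the reported figures.
        total_requests = 16_776_170
        cache_hit_ratio = 0.430  # 43.0% served from cache

        served_from_cache = round(total_requests * cache_hit_ratio)
        reached_origin = total_requests - served_from_cache

        print(f"served from cache: ~{served_from_cache:,}")   # ~7,213,753
        print(f"reached the server: ~{reached_origin:,}")     # ~9,562,417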

    The Highway to Queryable Linked Data: Self-Describing Web APIs with Varying Features

    Making Linked Data queryable on the Web is not an easy task for publishers, for technical and logistical reasons. Can they afford to offer a SPARQL endpoint, or should they offer an API or data dump instead? And what technical knowledge is needed for that? This demo presents a user-friendly pipeline to compose APIs for Linked Datasets, consisting of a customizable set of reusable features, e.g., Triple Pattern Fragments, substring search, membership metadata, etc. These APIs indicate their supported features in hypermedia responses, so that clients can discover which server-provided functionality they understand, and divide the evaluation of queries accordingly between client and server. That way, publishers can determine the complexity of the resulting API, and thus the maximal set of server tasks. This demo shows how publishers can easily set up an API with this pipeline, and demonstrates the client-side execution of federated queries against such APIs.
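    A minimal Python sketch of the kind of feature negotiation the demo describes. The response shape and its "features" key are invented placeholders; the actual interfaces advertise their capabilities through self-describing hypermedia controls rather than a plain list.

        # Placeholder response; real servers describe their features in hypermedia controls.
        SERVER_RESPONSE = {
            "@id": "https://example.org/dataset",
            "features": ["triple-pattern-fragments", "substring-search", "membership-metadata"],
        }

        CLIENT_UNDERSTANDS = {"triple-pattern-fragments", "substring-search"}

        def split_work(advertised, understood):
            """Delegate the features both sides support; advertised features the
            client does not understand are ignored and that work stays local."""
            delegate = sorted(set(advertised) & understood)
            ignored = sorted(set(advertised) - understood)
            return delegate, ignored

        delegate, ignored = split_work(SERVER_RESPONSE["features"], CLIENT_UNDERSTANDS)
        print("use on server:", delegate)          # ['substring-search', 'triple-pattern-fragments']
        print("not understood by client:", ignored)  # ['membership-metadata']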

    A file-based linked data fragments approach to prefix search

    Text fields that need to look up specific entities in a dataset can be equipped with autocompletion functionality. When a dataset becomes too large to be embedded in the page, setting up a full-text search API is not the only alternative. Alternate API designs that balance different trade-offs, such as archivability, cacheability, and privacy, may not require setting up a new back-end architecture. In this paper, we propose to perform prefix search over a fragmentation of the dataset, enabling the client to take part in the query execution by navigating through the fragmented dataset. Our proposal consists of (i) a self-describing fragmentation strategy, (ii) a client search algorithm, and (iii) an evaluation of the proposed solution, based on a small dataset of 73k entities and a large dataset of 3.87 million entities. We found that the server cache hit ratio is three times higher compared to a server-side prefix search API, at the cost of a higher bandwidth consumption. Nevertheless, an acceptable user-perceived performance has been measured: assuming 150 ms as an acceptable waiting time between keystrokes, this approach allows 15 entities per prefix to be retrieved in this interval. We conclude that an alternate set of trade-offs has been established for specific prefix search use cases: having added more choice to the spectrum of Web APIs for autocompletion, a file-based approach enables more datasets to afford prefix search.
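    The Python sketch below gives a rough idea of client-side navigation over a prefix-based fragmentation. The in-memory dictionary stands in for the published fragment files and is not the self-describing format proposed in the paper; the entity names are made up.

        # Illustrative fragment hierarchy keyed by published prefixes.
        FRAGMENTS = {
            "a":  {"entities": ["Aalst", "Aarhus", "Antwerp"], "children": ["aa", "an"]},
            "aa": {"entities": ["Aalst", "Aarhus"], "children": []},
            "an": {"entities": ["Antwerp"], "children": []},
        }

        def prefix_search(prefix, limit=15):
            """Navigate towards the longest published prefix of the query,
            then filter that fragment's entities on the client."""
            key = prefix[:1].lower()
            if key not in FRAGMENTS:
                return []
            node = FRAGMENTS[key]
            for child in node["children"]:
                if prefix.lower().startswith(child):
                    node = FRAGMENTS[child]
            matches = [e for e in node["entities"] if e.lower().startswith(prefix.lower())]
            return matches[:limit]

        print(prefix_search("Aa"))   # ['Aalst', 'Aarhus']
        print(prefix_search("Ant"))  # ['Antwerp']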

    SaGe: Preemptive Query Execution for High Data Availability on the Web

    Semantic Web applications require querying available RDF data with high performance and reliability. However, ensuring both data availability and performant SPARQL query execution is challenging in the context of public SPARQL servers: queries may have arbitrary execution times and unknown arrival rates. In this paper, we propose SaGe, a preemptive server-side SPARQL query engine. SaGe relies on a preemptable physical query execution plan and preemptable physical operators. SaGe stops query execution after a given slice of time, saves the state of the plan, and sends the saved plan back to the client with the retrieved results. Later, the client can continue the query execution by resubmitting the saved plan to the server. By ensuring fair query execution, SaGe maintains server availability and provides high query throughput. Experimental results demonstrate that SaGe outperforms state-of-the-art SPARQL query engines in terms of query throughput, query timeout, and answer completeness.
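    A minimal Python sketch of the client-side loop the abstract describes: the client resubmits the saved plan until the server signals completion. The endpoint URL and the "query"/"next"/"bindings" payload keys are assumptions for illustration, not necessarily SaGe's actual protocol.

        import requests  # third-party HTTP client, used here for brevity

        # Illustrative endpoint; consult the SaGe server for its real interface.
        SAGE_ENDPOINT = "https://example.org/sage/dataset"

        def execute(query):
            """Resubmit the saved plan until the server reports the query is complete."""
            results, saved_plan = [], None
            while True:
                payload = {"query": query, "next": saved_plan}
                response = requests.post(SAGE_ENDPOINT, json=payload, timeout=30)
                response.raise_for_status()
                body = response.json()
                results.extend(body.get("bindings", []))
                saved_plan = body.get("next")  # serialized plan state, or None when finished
                if saved_plan is None:
                    return results

        # bindings = execute("SELECT * WHERE { ?s ?p ?o } LIMIT 100")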