63 research outputs found
DBpedia's triple pattern fragments: usage patterns and insights
Queryable Linked Data is published through several interfaces, including SPARQL endpoints and Linked Data documents. In October 2014, the DBpedia Association announced an official Triple Pattern Fragments interface to its popular DBpedia dataset. This interface proposes to improve the availability of live queryable data by dividing query execution between clients and servers. In this paper, we present a usage analysis between November 2014 and July 2015. In 9 months time, the interface had an average availability of 99.99 %, handling 16,776,170 requests, 43.0% of which were served from cache. These numbers provide promising evidence that low-cost Triple Pattern Fragments interfaces provide a viable strategy for live applications on top of public, queryable datasets
Co-evolution of RDF Datasets
Linking Data initiatives have fostered the publication of large number of RDF
datasets in the Linked Open Data (LOD) cloud, as well as the development of
query processing infrastructures to access these data in a federated fashion.
However, different experimental studies have shown that availability of LOD
datasets cannot be always ensured, being RDF data replication required for
envisioning reliable federated query frameworks. Albeit enhancing data
availability, RDF data replication requires synchronization and conflict
resolution when replicas and source datasets are allowed to change data over
time, i.e., co-evolution management needs to be provided to ensure consistency.
In this paper, we tackle the problem of RDF data co-evolution and devise an
approach for conflict resolution during co-evolution of RDF datasets. Our
proposed approach is property-oriented and allows for exploiting semantics
about RDF properties during co-evolution management. The quality of our
approach is empirically evaluated in different scenarios on the DBpedia-live
dataset. Experimental results suggest that proposed proposed techniques have a
positive impact on the quality of data in source datasets and replicas.Comment: 18 pages, 4 figures, Accepted in ICWE, 201
Opportunistic linked data querying through approximate membership metadata
Between URI dereferencing and the SPARQL protocol lies a largely unexplored axis of possible interfaces to Linked Data, each with its own combination of trade-offs. One of these interfaces is Triple Pattern Fragments, which allows clients to execute SPARQL queries against low-cost servers, at the cost of higher bandwidth. Increasing a client's efficiency means lowering the number of requests, which can among others be achieved through additional metadata in responses. We noted that typical SPARQL query evaluations against Triple Pattern Fragments require a significant portion of membership subqueries, which check the presence of a specific triple, rather than a variable pattern. This paper studies the impact of providing approximate membership functions, i.e., Bloom filters and Golomb-coded sets, as extra metadata. In addition to reducing HTTP requests, such functions allow to achieve full result recall earlier when temporarily allowing lower precision. Half of the tested queries from a WatDiv benchmark test set could be executed with up to a third fewer HTTP requests with only marginally higher server cost. Query times, however, did not improve, likely due to slower metadata generation and transfer. This indicates that approximate membership functions can partly improve the client-side query process with minimal impact on the server and its interface
Moving real-time linked data query evaluation to the client
Traditional RDF stream processing engines work completely server-side, which contributes to a high server cost. For allowing a large number of concurrent clients to do continuous querying, we extend the low-cost Triple Pattern Fragments (TPF) interface with support for timesensitive queries. In this poster, we give the overview of a client-side rdf stream processing engine on top of tpf. Our experiments show that our solution significantly lowers the server load while increasing the load on the clients. Preliminary results indicate that our solution moves the complexity of continuously evaluating real-time queries from the server to the client, which makes real-time querying much more scalable for a large amount of concurrent clients when compared to the alternatives
EcoDaLo : federating advertisement targeting with linked data
A key source of revenue for the media and entertainmentdomain isad targeting: serving advertisements to a select set of visitorsbased on various captured visitor traits. Compared to global media com-panies such as Google and Facebook that aggregate data from varioussources (and the privacy concerns these aggregations bring), local compa-nies only capture a small number of (high-quality) traits and retrieve anunbalanced small amount of revenue. To increase these local publishers’competitive advantage, they need to join forces, whilst taking the visi-tors’ privacy concerns into account. The EcoDaLo consortium, located in Belgium and consisting of Adlogix, Pebble Media, and Roularta MediaGroup as founding partners, aims to combine local publishers’ data without requiring these partners to share this data across the consortium.Usage of Semantic Web technologies enables a decentralized approachwhere federated querying allows local companies to combine their captured visitor traits, and better target visitors, without aggregating alldata. To increase potential uptake, technical complexity to join this consortium is kept minimal, and established technology is used where possible. This solution was showcased in Belgium which provided the participating partners valuable insights and suggests future research challenges. Perspectives are to enlarge the consortium and provide measurable impact in ad targeting to local publishers
Self-Enforcing Access Control for Encrypted RDF
The amount of raw data exchanged via web protocols is
steadily increasing. Although the Linked Data infrastructure could
potentially be used to selectively share RDF data with different individuals
or organisations, the primary focus remains on the unrestricted
sharing of public data. In order to extend the Linked Data paradigm to
cater for closed data, there is a need to augment the existing infrastructure
with robust security mechanisms. At the most basic level both access
control and encryption mechanisms are required. In this paper, we propose
a flexible and dynamic mechanism for securely storing and efficiently
querying RDF datasets. By employing an encryption strategy based on
Functional Encryption (FE) in which controlled data access does not
require a trusted mediator, but is instead enforced by the cryptographic
approach itself, we allow for fine-grained access control over encrypted
RDF data while at the same time reducing the administrative overhead
associated with access control management
Bridging the Semantic Web and NoSQL Worlds: Generic SPARQL Query Translation and Application to MongoDB
International audienceRDF-based data integration is often hampered by the lack of methods to translate data locked in heterogeneous silos into RDF representations. In this paper, we tackle the challenge of bridging the gap between the Semantic Web and NoSQL worlds, by fostering the development of SPARQL interfaces to heterogeneous databases. To avoid defining yet another SPARQL translation method for each and every database, we propose a two-phase method. Firstly, a SPARQL query is translated into a pivot abstract query. This phase achieves as much of the translation process as possible regardless of the database. We show how optimizations at this abstract level can save subsequent work at the level of a target database query language. Secondly, the abstract query is translated into the query language of a target database, taking into account the specific database capabilities and constraints. We demonstrate the effectiveness of our method with the MongoDB NoSQL document store, such that arbitrary MongoDB documents can be aligned on existing domain ontologies and accessed with SPARQL. Finally, we draw on a real-world use case to report experimental results with respect to the effectiveness and performance of our approach
- …