7,189 research outputs found
Distributed Semantic Web data management in HBase and MySQL cluster
Various computing and data resources on the Web are being enhanced with machine-interpretable semantic descriptions to facilitate better search, discovery and integration. This interconnected metadata constitutes the Semantic Web, whose volume can potentially grow the scale of the Web. Efficient management of Semantic Web data, expressed using the W3C\u27s Resource Description Framework (RDF), is crucial for supporting new data-intensive, semantics-enabled applications. In this work, we study and compare two approaches to distributed RDF data management based on emerging cloud computing technologies and traditional relational database clustering technologies. In particular, we design distributed RDF data storage and querying schemes for HBase and MySQL Cluster and conduct an empirical comparison of these approaches on a cluster of commodity machines using datasets and queries from the Third Provenance Challenge and Lehigh University Benchmark. Our study reveals interesting patterns in query evaluation, shows that our algorithms are promising, and suggests that cloud computing has a great potential for scalable Semantic Web data management
Provenance for SPARQL queries
Determining trust of data available in the Semantic Web is fundamental for
applications and users, in particular for linked open data obtained from SPARQL
endpoints. There exist several proposals in the literature to annotate SPARQL
query results with values from abstract models, adapting the seminal works on
provenance for annotated relational databases. We provide an approach capable
of providing provenance information for a large and significant fragment of
SPARQL 1.1, including for the first time the major non-monotonic constructs
under multiset semantics. The approach is based on the translation of SPARQL
into relational queries over annotated relations with values of the most
general m-semiring, and in this way also refuting a claim in the literature
that the OPTIONAL construct of SPARQL cannot be captured appropriately with the
known abstract models.Comment: 22 pages, extended version of the ISWC 2012 paper including proof
Distributed Semantic Web Data Management in HBase and MySQL Cluster
Various computing and data resources on the Web are being enhanced with
machine-interpretable semantic descriptions to facilitate better search,
discovery and integration. This interconnected metadata constitutes the
Semantic Web, whose volume can potentially grow the scale of the Web. Efficient
management of Semantic Web data, expressed using the W3C's Resource Description
Framework (RDF), is crucial for supporting new data-intensive,
semantics-enabled applications. In this work, we study and compare two
approaches to distributed RDF data management based on emerging cloud computing
technologies and traditional relational database clustering technologies. In
particular, we design distributed RDF data storage and querying schemes for
HBase and MySQL Cluster and conduct an empirical comparison of these approaches
on a cluster of commodity machines using datasets and queries from the Third
Provenance Challenge and Lehigh University Benchmark. Our study reveals
interesting patterns in query evaluation, shows that our algorithms are
promising, and suggests that cloud computing has a great potential for scalable
Semantic Web data management.Comment: In Proc. of the 4th IEEE International Conference on Cloud Computing
(CLOUD'11
Semantic Modeling of Analytic-based Relationships with Direct Qualification
Successfully modeling state and analytics-based semantic relationships of
documents enhances representation, importance, relevancy, provenience, and
priority of the document. These attributes are the core elements that form the
machine-based knowledge representation for documents. However, modeling
document relationships that can change over time can be inelegant, limited,
complex or overly burdensome for semantic technologies. In this paper, we
present Direct Qualification (DQ), an approach for modeling any semantically
referenced document, concept, or named graph with results from associated
applied analytics. The proposed approach supplements the traditional
subject-object relationships by providing a third leg to the relationship; the
qualification of how and why the relationship exists. To illustrate, we show a
prototype of an event-based system with a realistic use case for applying DQ to
relevancy analytics of PageRank and Hyperlink-Induced Topic Search (HITS).Comment: Proceedings of the 2015 IEEE 9th International Conference on Semantic
Computing (IEEE ICSC 2015
Enhancing Workflow with a Semantic Description of Scientific Intent
Peer reviewedPreprin
Dynamic Provenance for SPARQL Update
While the Semantic Web currently can exhibit provenance information by using
the W3C PROV standards, there is a "missing link" in connecting PROV to storing
and querying for dynamic changes to RDF graphs using SPARQL. Solving this
problem would be required for such clear use-cases as the creation of version
control systems for RDF. While some provenance models and annotation techniques
for storing and querying provenance data originally developed with databases or
workflows in mind transfer readily to RDF and SPARQL, these techniques do not
readily adapt to describing changes in dynamic RDF datasets over time. In this
paper we explore how to adapt the dynamic copy-paste provenance model of
Buneman et al. [2] to RDF datasets that change over time in response to SPARQL
updates, how to represent the resulting provenance records themselves as RDF in
a manner compatible with W3C PROV, and how the provenance information can be
defined by reinterpreting SPARQL updates. The primary contribution of this
paper is a semantic framework that enables the semantics of SPARQL Update to be
used as the basis for a 'cut-and-paste' provenance model in a principled
manner.Comment: Pre-publication version of ISWC 2014 pape
Hybrid Search: Effectively Combining Keywords and Semantic Searches
This paper describes hybrid search, a search method supporting both document and knowledge retrieval via the flexible combination of ontologybased search and keyword-based matching. Hybrid search smoothly copes with
lack of semantic coverage of document content, which is one of the main limitations of current semantic search methods. In this paper we define hybrid search formally, discuss its compatibility with the current semantic trends and present a reference implementation: K-Search. We then show how the method outperforms both keyword-based search and pure semantic search in terms of precision and recall in a set of experiments performed on a collection of about 18.000 technical documents. Experiments carried out with professional users show that users understand the paradigm and consider it very powerful and reliable. K-Search has been ported to two applications released at Rolls-Royce
plc for searching technical documentation about jet engines
On Reasoning with RDF Statements about Statements using Singleton Property Triples
The Singleton Property (SP) approach has been proposed for representing and
querying metadata about RDF triples such as provenance, time, location, and
evidence. In this approach, one singleton property is created to uniquely
represent a relationship in a particular context, and in general, generates a
large property hierarchy in the schema. It has become the subject of important
questions from Semantic Web practitioners. Can an existing reasoner recognize
the singleton property triples? And how? If the singleton property triples
describe a data triple, then how can a reasoner infer this data triple from the
singleton property triples? Or would the large property hierarchy affect the
reasoners in some way? We address these questions in this paper and present our
study about the reasoning aspects of the singleton properties. We propose a
simple mechanism to enable existing reasoners to recognize the singleton
property triples, as well as to infer the data triples described by the
singleton property triples. We evaluate the effect of the singleton property
triples in the reasoning processes by comparing the performance on RDF datasets
with and without singleton properties. Our evaluation uses as benchmark the
LUBM datasets and the LUBM-SP datasets derived from LUBM with temporal
information added through singleton properties
A General Framework for Representing, Reasoning and Querying with Annotated Semantic Web Data
We describe a generic framework for representing and reasoning with annotated
Semantic Web data, a task becoming more important with the recent increased
amount of inconsistent and non-reliable meta-data on the web. We formalise the
annotated language, the corresponding deductive system and address the query
answering problem. Previous contributions on specific RDF annotation domains
are encompassed by our unified reasoning formalism as we show by instantiating
it on (i) temporal, (ii) fuzzy, and (iii) provenance annotations. Moreover, we
provide a generic method for combining multiple annotation domains allowing to
represent, e.g. temporally-annotated fuzzy RDF. Furthermore, we address the
development of a query language -- AnQL -- that is inspired by SPARQL,
including several features of SPARQL 1.1 (subqueries, aggregates, assignment,
solution modifiers) along with the formal definitions of their semantics
- …