17,271 research outputs found

Dynamic Provenance for SPARQL Update

Author: C. Gutierrez
G. Flouris
H. Halpin
J. Perèz
J.J. Carroll
L. Moreau
N. Lopes
O. Udrea
P. Buneman
R. Horne
R.T. Snodgrass
T.J. Green
V. Papavassiliou
Y. Theoharis
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014

While the Semantic Web currently can exhibit provenance information by using the W3C PROV standards, there is a "missing link" in connecting PROV to storing and querying for dynamic changes to RDF graphs using SPARQL. Solving this problem would be required for such clear use-cases as the creation of version control systems for RDF. While some provenance models and annotation techniques for storing and querying provenance data originally developed with databases or workflows in mind transfer readily to RDF and SPARQL, these techniques do not readily adapt to describing changes in dynamic RDF datasets over time. In this paper we explore how to adapt the dynamic copy-paste provenance model of Buneman et al. [2] to RDF datasets that change over time in response to SPARQL updates, how to represent the resulting provenance records themselves as RDF in a manner compatible with W3C PROV, and how the provenance information can be defined by reinterpreting SPARQL updates. The primary contribution of this paper is a semantic framework that enables the semantics of SPARQL Update to be used as the basis for a 'cut-and-paste' provenance model in a principled manner.
Comment: Pre-publication version of ISWC 2014 paper
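As a rough illustration of the idea in this abstract (recording which update inserted or deleted each triple), here is a minimal Python sketch. It is not the paper's formal model and does not use W3C PROV vocabulary; the `VersionedGraph` class, its `update` and `history` methods, and the `ex:` identifiers are all hypothetical names chosen for this example. The only behaviour borrowed from SPARQL 1.1 Update is that deletions are applied before insertions within one operation.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object), hypothetical encoding


@dataclass
class VersionedGraph:
    """Toy in-memory graph that logs a provenance record per changed triple."""
    triples: Set[Triple] = field(default_factory=set)
    version: int = 0
    log: List[Tuple[int, str, Triple]] = field(default_factory=list)

    def update(self, delete=(), insert=()) -> int:
        """Apply one update: deletions first, then insertions.

        Only triples whose presence actually changes are logged.
        Returns the new version number.
        """
        self.version += 1
        for t in delete:
            if t in self.triples:
                self.triples.remove(t)
                self.log.append((self.version, "delete", t))
        for t in insert:
            if t not in self.triples:
                self.triples.add(t)
                self.log.append((self.version, "insert", t))
        return self.version

    def history(self, triple: Triple) -> List[Tuple[int, str]]:
        """Return the (version, operation) records that touched this triple."""
        return [(v, op) for v, op, t in self.log if t == triple]


g = VersionedGraph()
g.update(insert=[("ex:a", "ex:p", "ex:b")])
g.update(delete=[("ex:a", "ex:p", "ex:b")], insert=[("ex:a", "ex:p", "ex:c")])
print(g.history(("ex:a", "ex:p", "ex:b")))
```

A version-control style query ("when did this triple appear or disappear?") then reduces to a scan of the log, which is the kind of use case the abstract mentions.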

arXiv.org e-Print Archive

CiteSeerX

Crossref

Edinburgh Research Explorer

PROV-JSONLD: a JSON and linked data representation for provenance

Author: Huynh Trung Dong
Michaelides Danius T.
Moreau Luc
Publication venue: 'Springer Science and Business Media LLC'

In this paper, we propose a representation for PROV in JSON-LD, the JSON format for Linked Data, called PROV-JSONLD. As a JSON-based format, this provenance representation can be readily consumed by Web applications currently supporting JSON. As a Linked Data format, at the same time, it also represents provenance data in RDF using the PROV ontology. Hence, it is suitable for use in both the Web and the Semantic Web.

Southampton (e-Prints Soton)

Distributed Semantic Web data management in HBase and MySQL cluster

Author: Franke Craig M.
Publication venue: ScholarWorks @ UTRGV
Publication date: 01/05/2011

Various computing and data resources on the Web are being enhanced with machine-interpretable semantic descriptions to facilitate better search, discovery and integration. This interconnected metadata constitutes the Semantic Web, whose volume can potentially grow to the scale of the Web. Efficient management of Semantic Web data, expressed using the W3C's Resource Description Framework (RDF), is crucial for supporting new data-intensive, semantics-enabled applications. In this work, we study and compare two approaches to distributed RDF data management based on emerging cloud computing technologies and traditional relational database clustering technologies. In particular, we design distributed RDF data storage and querying schemes for HBase and MySQL Cluster and conduct an empirical comparison of these approaches on a cluster of commodity machines using datasets and queries from the Third Provenance Challenge and Lehigh University Benchmark. Our study reveals interesting patterns in query evaluation, shows that our algorithms are promising, and suggests that cloud computing has a great potential for scalable Semantic Web data management.

Scholarworks@UTRGV Univ. of Texas RioGrande Valley

Distributed Semantic Web Data Management in HBase and MySQL Cluster

Author: Abraham John
Brazier Pearl
Chebotko Artem
Franke Craig
Morin Samuel
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 11/05/2011

Various computing and data resources on the Web are being enhanced with machine-interpretable semantic descriptions to facilitate better search, discovery and integration. This interconnected metadata constitutes the Semantic Web, whose volume can potentially grow to the scale of the Web. Efficient management of Semantic Web data, expressed using the W3C's Resource Description Framework (RDF), is crucial for supporting new data-intensive, semantics-enabled applications. In this work, we study and compare two approaches to distributed RDF data management based on emerging cloud computing technologies and traditional relational database clustering technologies. In particular, we design distributed RDF data storage and querying schemes for HBase and MySQL Cluster and conduct an empirical comparison of these approaches on a cluster of commodity machines using datasets and queries from the Third Provenance Challenge and Lehigh University Benchmark. Our study reveals interesting patterns in query evaluation, shows that our algorithms are promising, and suggests that cloud computing has a great potential for scalable Semantic Web data management.
Comment: In Proc. of the 4th IEEE International Conference on Cloud Computing (CLOUD'11)

arXiv.org e-Print Archive

Crossref

Enhancing Workflow with a Semantic Description of Scientific Intent

Author: Edwards Peter
Gotts Nick
Pignotti Edoardo
Polhill Gary
Publication venue: 'Elsevier BV'
Publication date: 10/05/2011

Peer reviewed. Preprint

Aberdeen University Research

Crossref

Provenance-based validation of E-science experiments

Author: Klaus-Peter Zauner
Luc Moreau
Paul Groth
Simon Miles
Sylvia C. Wong
Weijian Fang
Publication venue: Springer
Publication date: 01/01/2005

E-Science experiments typically involve many distributed services maintained by different organisations. After an experiment has been executed, it is useful for a scientist to verify that the execution was performed correctly or is compatible with some existing experimental criteria or standards. Scientists may also want to review and verify experiments performed by their colleagues. There are no existing frameworks for validating such experiments in today's e-Science systems. Users therefore have to rely on error checking performed by the services, or adopt other ad hoc methods. This paper introduces a platform-independent framework for validating workflow executions. The validation relies on reasoning over the documented provenance of experiment results and semantic descriptions of services advertised in a registry. This validation process ensures experiments are performed correctly, and thus results generated are meaningful. The framework is tested in a bioinformatics application that performs protein compressibility analysis.

CiteSeerX

Southampton (e-Prints Soton)

King's Research Portal

Addressing the Challenges of Semantic Citizen-Sensing

Author: Corsar David
Edwards Pete
Nelson John Donald
Pan Jeff Z
Velaga Nagendra Rao
Publication venue: CEUR-WS
Publication date: 23/10/2011

Preprint

Aberdeen University Research

Version Control in Online Software Repositories

Author: Nicole Denis A
Watkins E Rowland
Publication venue: CSREA Press
Publication date: 01/06/2005

Software version control repositories provide a uniform and stable interface to manage documents and their version histories. Unfortunately, Open Source systems such as CVS, Subversion, and GNU Arch are not well suited to highly collaborative environments and fail to track semantic changes in repositories. We introduce document provenance as our Description Logic framework to track the semantic changes in software repositories and draw interesting results about their historic behaviour using a rule-based inference engine. To support the use of this framework, we have developed our own online collaborative tool, leveraging the fluency of the modern WikiWikiWeb.

Southampton (e-Prints Soton)

A Linked Data Approach to Sharing Workflows and Workflow Results

Author: Bechhofer S
Margaria T
Marshall MS
Missier P
Newman DR
Roos M
Roure DD
Steffen B
Zhao J
Publication date: 01/01/2010

A bioinformatics analysis pipeline is often highly elaborate, due to the inherent complexity of biological systems and the variety and size of datasets. A digital equivalent of the ‘Materials and Methods’ section in wet laboratory publications would be highly beneficial to bioinformatics, for evaluating evidence and examining data across related experiments, while introducing the potential to find associated resources and integrate them as data and services. We present initial steps towards preserving bioinformatics ‘materials and methods’ by exploiting the workflow paradigm for capturing the design of a data analysis pipeline, and RDF to link the workflow, its component services, run-time provenance, and a personalized biological interpretation of the results. An example shows the reproduction of the unique graph of an analysis procedure, its results, provenance, and personal interpretation of a text mining experiment. It links data from Taverna, myExperiment.org, BioCatalogue.org, and ConceptWiki.org. The approach is relatively ‘light-weight’ and unobtrusive to bioinformatics users.

Southampton (e-Prints Soton)

Crossref

University of Birmingham Research Portal

Oxford University Research Archive

The University of Manchester - Institutional Repository

Semantic Modeling of Analytic-based Relationships with Direct Qualification

Author: Ahmed Norman
Bryant Jason
Hasseler Gregory
Paulini Matthew
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 15/02/2015

Successfully modeling state and analytics-based semantic relationships of documents enhances representation, importance, relevancy, provenience, and priority of the document. These attributes are the core elements that form the machine-based knowledge representation for documents. However, modeling document relationships that can change over time can be inelegant, limited, complex or overly burdensome for semantic technologies. In this paper, we present Direct Qualification (DQ), an approach for modeling any semantically referenced document, concept, or named graph with results from associated applied analytics. The proposed approach supplements the traditional subject-object relationships by providing a third leg to the relationship: the qualification of how and why the relationship exists. To illustrate, we show a prototype of an event-based system with a realistic use case for applying DQ to relevancy analytics of PageRank and Hyperlink-Induced Topic Search (HITS).
Comment: Proceedings of the 2015 IEEE 9th International Conference on Semantic Computing (IEEE ICSC 2015)

arXiv.org e-Print Archive

Crossref