22,973 research outputs found
DataHub: Collaborative Data Science & Dataset Version Management at Scale
Relational databases have limited support for data collaboration, where teams
collaboratively curate and analyze large datasets. Inspired by software version
control systems like git, we propose (a) a dataset version control system,
giving users the ability to create, branch, merge, difference and search large,
divergent collections of datasets, and (b) a platform, DataHub, that gives
users the ability to perform collaborative data analysis building on this
version control system. We outline the challenges in providing dataset version
control at scale.Comment: 7 page
On Reasoning with RDF Statements about Statements using Singleton Property Triples
The Singleton Property (SP) approach has been proposed for representing and
querying metadata about RDF triples such as provenance, time, location, and
evidence. In this approach, one singleton property is created to uniquely
represent a relationship in a particular context, and in general, generates a
large property hierarchy in the schema. It has become the subject of important
questions from Semantic Web practitioners. Can an existing reasoner recognize
the singleton property triples? And how? If the singleton property triples
describe a data triple, then how can a reasoner infer this data triple from the
singleton property triples? Or would the large property hierarchy affect the
reasoners in some way? We address these questions in this paper and present our
study about the reasoning aspects of the singleton properties. We propose a
simple mechanism to enable existing reasoners to recognize the singleton
property triples, as well as to infer the data triples described by the
singleton property triples. We evaluate the effect of the singleton property
triples in the reasoning processes by comparing the performance on RDF datasets
with and without singleton properties. Our evaluation uses as benchmark the
LUBM datasets and the LUBM-SP datasets derived from LUBM with temporal
information added through singleton properties
Enhancing Workflow with a Semantic Description of Scientific Intent
Peer reviewedPreprin
Recommended from our members
A short survey of discourse representation models
With the advancement of technology and the wide adoption of ontologies as knowledge representation formats, in the last decade, a handful of models were proposed for the externalization of the rhetoric and argumentation captured within scientific publications. Conceptually, most of these models share a similar representation form of the scientific publication, i.e. as a series of interconnected elementary knowledge items. The main differences are given by the terminology used, the types of rhetorical and/or argumentation relations connecting the knowledge items and the foundational theories supporting these relations. This paper analyzes the state of the art and provides a concise comparative overview of the five most prominent discourse representation models, with the goal of sketching an unified model for discourse representation
Dynamic Provenance for SPARQL Update
While the Semantic Web currently can exhibit provenance information by using
the W3C PROV standards, there is a "missing link" in connecting PROV to storing
and querying for dynamic changes to RDF graphs using SPARQL. Solving this
problem would be required for such clear use-cases as the creation of version
control systems for RDF. While some provenance models and annotation techniques
for storing and querying provenance data originally developed with databases or
workflows in mind transfer readily to RDF and SPARQL, these techniques do not
readily adapt to describing changes in dynamic RDF datasets over time. In this
paper we explore how to adapt the dynamic copy-paste provenance model of
Buneman et al. [2] to RDF datasets that change over time in response to SPARQL
updates, how to represent the resulting provenance records themselves as RDF in
a manner compatible with W3C PROV, and how the provenance information can be
defined by reinterpreting SPARQL updates. The primary contribution of this
paper is a semantic framework that enables the semantics of SPARQL Update to be
used as the basis for a 'cut-and-paste' provenance model in a principled
manner.Comment: Pre-publication version of ISWC 2014 pape
- …