21,063 research outputs found
SCOPE - A Scientific Compound Object Publishing and Editing System
This paper presents the SCOPE (Scientific Compound Object Publishing and Editing) system which is designed to enable scientists to easily author, publish and edit scientific compound objects. Scientific compound objects enable scientists to encapsulate the various datasets and resources generated or utilized during a scientific experiment or discovery process, within a single compound object, for publishing and exchange. The adoption of “named graphs” to represent these compound objects enables provenance information to be captured via the typed relationships between the components. This approach is also endorsed by the OAI-ORE initiative and hence ensures that we generate OAI-ORE-compliant Scientific Compound Objects. The SCOPE system is an extension of the Provenance Explorer tool – which enables access-controlled viewing of scientific provenance trails. Provenance Explorer provided dynamic rendering of RDF graphs of scientific discovery processes, showing the lineage from raw data to publication. Views of different granularity can be inferred automatically using SWRL (Semantic Web Rules Language) rules and an inferencing engine. SCOPE extends the Provenance Explorer tool and GUI by: 1) Adding an embedded web browser that can be used for incorporating objects discoverable via the Web; 2) Representing compound objects as Named Graphs, that can be saved in RDF, TriX, TriG or as an Atom syndication feed; 3) Enabling scientists to attach Creative Commons Licenses to the compound objects to specify how they may be re-used; 4) Enabling compound objects to be published as Fedora Object XML (FOXML) files within a Fedora digital library
Representing Dataset Quality Metadata using Multi-Dimensional Views
Data quality is commonly defined as fitness for use. The problem of
identifying quality of data is faced by many data consumers. Data publishers
often do not have the means to identify quality problems in their data. To make
the task for both stakeholders easier, we have developed the Dataset Quality
Ontology (daQ). daQ is a core vocabulary for representing the results of
quality benchmarking of a linked dataset. It represents quality metadata as
multi-dimensional and statistical observations using the Data Cube vocabulary.
Quality metadata are organised as a self-contained graph, which can, e.g., be
embedded into linked open datasets. We discuss the design considerations, give
examples for extending daQ by custom quality metrics, and present use cases
such as analysing data versions, browsing datasets by quality, and link
identification. We finally discuss how data cube visualisation tools enable
data publishers and consumers to analyse better the quality of their data.Comment: Preprint of a paper submitted to the forthcoming SEMANTiCS 2014, 4-5
September 2014, Leipzig, German
Making Digital Artifacts on the Web Verifiable and Reliable
The current Web has no general mechanisms to make digital artifacts --- such
as datasets, code, texts, and images --- verifiable and permanent. For digital
artifacts that are supposed to be immutable, there is moreover no commonly
accepted method to enforce this immutability. These shortcomings have a serious
negative impact on the ability to reproduce the results of processes that rely
on Web resources, which in turn heavily impacts areas such as science where
reproducibility is important. To solve this problem, we propose trusty URIs
containing cryptographic hash values. We show how trusty URIs can be used for
the verification of digital artifacts, in a manner that is independent of the
serialization format in the case of structured data files such as
nanopublications. We demonstrate how the contents of these files become
immutable, including dependencies to external digital artifacts and thereby
extending the range of verifiability to the entire reference tree. Our approach
sticks to the core principles of the Web, namely openness and decentralized
architecture, and is fully compatible with existing standards and protocols.
Evaluation of our reference implementations shows that these design goals are
indeed accomplished by our approach, and that it remains practical even for
very large files.Comment: Extended version of conference paper: arXiv:1401.577
Version Control in Online Software Repositories
Software version control repositories provide a uniform and stable interface to manage documents and their version histories. Unfortunately, Open Source systems, for example, CVS, Subversion, and GNU Arch are not well suited to highly collaborative environments and fail to track semantic changes in repositories. We introduce document provenance as our Description Logic framework to track the semantic changes in software repositories and draw interesting results about their historic behaviour using a rule-based inference engine. To support the use of this framework, we have developed our own online collaborative tool, leveraging the fluency of the modern WikiWikiWeb
Version Control in Online Software Repositories
Software version control repositories provide a uniform and stable interface to manage documents and their version histories. Unfortunately, Open Source systems, for example, CVS, Subversion, and GNU Arch are not well suited to highly collaborative environments and fail to track semantic changes in repositories. We introduce document provenance as our Description Logic framework to track the semantic changes in software repositories and draw interesting results about their historic behaviour using a rule-based inference engine. To support the use of this framework, we have developed our own online collaborative tool, leveraging the fluency of the modern WikiWikiWeb
Local Type Checking for Linked Data Consumers
The Web of Linked Data is the cumulation of over a decade of work by the Web
standards community in their effort to make data more Web-like. We provide an
introduction to the Web of Linked Data from the perspective of a Web developer
that would like to build an application using Linked Data. We identify a
weakness in the development stack as being a lack of domain specific scripting
languages for designing background processes that consume Linked Data. To
address this weakness, we design a scripting language with a simple but
appropriate type system. In our proposed architecture some data is consumed
from sources outside of the control of the system and some data is held
locally. Stronger type assumptions can be made about the local data than
external data, hence our type system mixes static and dynamic typing.
Throughout, we relate our work to the W3C recommendations that drive Linked
Data, so our syntax is accessible to Web developers.Comment: In Proceedings WWV 2013, arXiv:1308.026
- …