A General Framework for Representing, Reasoning and Querying with Annotated Semantic Web Data
We describe a generic framework for representing and reasoning with annotated
Semantic Web data, a task becoming more important with the recently increased
amount of inconsistent and unreliable metadata on the web. We formalise the
annotated language and the corresponding deductive system, and address the query
answering problem. Previous contributions on specific RDF annotation domains
are encompassed by our unified reasoning formalism as we show by instantiating
it on (i) temporal, (ii) fuzzy, and (iii) provenance annotations. Moreover, we
provide a generic method for combining multiple annotation domains, which makes
it possible to represent, e.g., temporally annotated fuzzy RDF. Furthermore, we
address the development of a query language -- AnQL -- that is inspired by SPARQL,
including several features of SPARQL 1.1 (subqueries, aggregates, assignment,
solution modifiers) along with the formal definitions of their semantics.
Provenance-aware knowledge representation: A survey of data models and contextualized knowledge graphs
Expressing machine-interpretable statements in the form of subject-predicate-object triples is a well-established practice for capturing semantics of structured data. However, the standard used for representing these triples, RDF, inherently lacks a mechanism to attach provenance data, which would be crucial to make automatically generated and/or processed data authoritative. This paper is a critical review of data models, annotation frameworks, knowledge organization systems, serialization syntaxes, and algebras that enable provenance-aware RDF statements. The various approaches are assessed in terms of standard compliance, formal semantics, tuple type, vocabulary term usage, blank nodes, provenance granularity, and scalability. This assessment can be used to advance existing solutions and help implementers to select the most suitable approach (or a combination of approaches) for their applications. Moreover, the analysis of the mechanisms and their limitations highlighted in this paper can serve as the basis for novel approaches in RDF-powered applications with increasing provenance needs.
Decentralized provenance-aware publishing with nanopublications
Publication and archival of scientific results is still commonly considered the responsibility of classical publishing companies. Classical forms of publishing, however, which center around printed narrative articles, no longer seem well-suited in the digital age. In particular, there currently exist no efficient, reliable, and agreed-upon methods for publishing scientific datasets, which have become increasingly important for science. In this article, we propose to design scientific data publishing as a web-based bottom-up process, without top-down control of central authorities such as publishing companies. Based on a novel combination of existing concepts and technologies, we present a server network to decentrally store and archive data in the form of nanopublications, an RDF-based format to represent scientific data. We show how this approach allows researchers to publish, retrieve, verify, and recombine datasets of nanopublications in a reliable and trustworthy manner, and we argue that this architecture could be used as a low-level data publication layer to serve the Semantic Web in general. Our evaluation of the current network shows that this system is efficient and reliable.
Provenance for SPARQL queries
Determining trust of data available in the Semantic Web is fundamental for
applications and users, in particular for linked open data obtained from SPARQL
endpoints. There exist several proposals in the literature to annotate SPARQL
query results with values from abstract models, adapting the seminal works on
provenance for annotated relational databases. We present an approach capable
of providing provenance information for a large and significant fragment of
SPARQL 1.1, including for the first time the major non-monotonic constructs
under multiset semantics. The approach is based on the translation of SPARQL
into relational queries over annotated relations with values of the most
general m-semiring, thereby also refuting a claim in the literature
that the OPTIONAL construct of SPARQL cannot be captured appropriately with the
known abstract models.
Comment: 22 pages, extended version of the ISWC 2012 paper including proof
Linked Data - the story so far
The term “Linked Data” refers to a set of best practices for publishing and connecting structured data on the Web. These best practices have been adopted by an increasing number of data providers over the last three years, leading to the creation of a global data space containing billions of assertions: the Web of Data. In this article, the authors present the concept and technical principles of Linked Data, and situate these within the broader context of related technological developments. They describe progress to date in publishing Linked Data on the Web, review applications that have been developed to exploit the Web of Data, and map out a research agenda for the Linked Data community as it moves forward.
Dataset search: a survey
Generating value from data requires the ability to find, access and make
sense of datasets. There are many efforts underway to encourage data sharing
and reuse, from scientific publishers asking authors to submit data alongside
manuscripts to data marketplaces, open data portals and data communities.
Google recently released a beta search service for datasets, which allows users
to discover data stored in various online repositories via keyword queries.
These developments foreshadow an emerging research field around dataset search
or retrieval that broadly encompasses frameworks, methods and tools that help
match a user data need against a collection of datasets. Here, we survey the
state of the art of research and commercial systems in dataset retrieval. We
identify what makes dataset search a research field in its own right, with
unique challenges and methods, and highlight open problems. We look at
approaches and implementations from related areas that dataset search draws
upon, including information retrieval, databases, and entity-centric and tabular
search, in order to identify possible paths to resolve these open problems as
well as immediate next steps that will take the field forward.
Comment: 20 pages, 153 reference
10042 Abstracts Collection -- Semantic Challenges in Sensor Networks
From 24.01. to 29.01.2010, the Dagstuhl Seminar 10042 "Semantic Challenges in Sensor Networks" was held in Schloss Dagstuhl - Leibniz Center for Informatics.
During the seminar, several participants presented their current
research, and ongoing work and open problems were discussed. Abstracts of
the presentations given during the seminar as well as abstracts of
seminar results and ideas are put together in this paper. The first section
describes the seminar topics and goals in general.
Links to extended abstracts or full papers are provided, if available.