8 research outputs found
Decentralized provenance-aware publishing with nanopublications
Publication and archival of scientific results is still commonly considered the responsibility of classical publishing companies. Classical forms of publishing, however, which center around printed narrative articles, no longer seem well-suited to the digital age. In particular, there currently exist no efficient, reliable, and agreed-upon methods for publishing scientific datasets, which have become increasingly important for science. In this article, we propose to design scientific data publishing as a web-based bottom-up process, without top-down control by central authorities such as publishing companies. Based on a novel combination of existing concepts and technologies, we present a server network to decentrally store and archive data in the form of nanopublications, an RDF-based format to represent scientific data. We show how this approach allows researchers to publish, retrieve, verify, and recombine datasets of nanopublications in a reliable and trustworthy manner, and we argue that this architecture could be used as a low-level data publication layer to serve the Semantic Web in general. Our evaluation of the current network shows that this system is efficient and reliable.
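The abstract above describes a nanopublication as a bundle of a scientific claim, its provenance, and its publication metadata. A minimal sketch of that three-part structure, using plain Python dicts of (subject, predicate, object) triples — the URIs, prefixes, and the `make_nanopub` helper are illustrative assumptions, not taken from the actual nanopublication schema or any real dataset:

```python
# Sketch of the three named graphs that make up a nanopublication,
# modeled as plain dicts of (subject, predicate, object) triples.
# All URIs and prefixes below are illustrative placeholders.

def make_nanopub(claim, author, date):
    """Bundle a single claim with its provenance and publication info."""
    np_uri = "http://example.org/np1"
    return {
        "assertion": [claim],  # the scientific claim itself
        "provenance": [        # where the assertion came from
            (np_uri + "#assertion", "prov:wasAttributedTo", author),
        ],
        "pubinfo": [           # metadata about the nanopublication as a whole
            (np_uri, "dct:created", date),
            (np_uri, "dct:creator", author),
        ],
    }

np = make_nanopub(
    claim=("ex:malaria", "ex:isTransmittedBy", "ex:Anopheles"),
    author="orcid:0000-0000-0000-0000",
    date="2015-01-01",
)
assert set(np) == {"assertion", "provenance", "pubinfo"}
```

Keeping the three graphs separate is what makes each part independently queryable: a consumer can filter on provenance or publication metadata without parsing the claim itself.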
Publishing without Publishers: a Decentralized Approach to Dissemination, Retrieval, and Archiving of Data
Making available and archiving scientific results is for the most part still
considered the task of classical publishing companies, despite the fact that
classical forms of publishing centered around printed narrative articles no
longer seem well-suited in the digital age. In particular, there exist
currently no efficient, reliable, and agreed-upon methods for publishing
scientific datasets, which have become increasingly important for science. Here
we propose to design scientific data publishing as a Web-based bottom-up
process, without top-down control of central authorities such as publishing
companies. Based on a novel combination of existing concepts and technologies,
we present a server network to decentrally store and archive data in the form
of nanopublications, an RDF-based format to represent scientific data. We show
how this approach allows researchers to publish, retrieve, verify, and
recombine datasets of nanopublications in a reliable and trustworthy manner,
and we argue that this architecture could be used for the Semantic Web in
general. Evaluation of the current small network shows that this system is
efficient and reliable.
Comment: In Proceedings of the 14th International Semantic Web Conference (ISWC) 2015
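The verification step mentioned in the abstract relies on content-hash identifiers: a nanopublication's URI embeds a hash of its own content, so any consumer can check that retrieved data is unmodified. A simplified sketch of that idea — the real network uses trusty URIs with RDF-specific normalization and base64 encoding, whereas this stand-in just sorts triples and uses a hex SHA-256 digest:

```python
import hashlib

def publish(triples):
    """Serialize triples canonically and derive a hash-based identifier.
    Simplified stand-in for trusty-URI-style identifiers: the real scheme
    normalizes the RDF graph and encodes the hash differently."""
    canonical = "\n".join(sorted(" ".join(t) for t in triples))
    digest = hashlib.sha256(canonical.encode()).hexdigest()
    return "http://example.org/np/" + digest, canonical

def verify(uri, canonical):
    """Recompute the hash from retrieved content and compare it to the URI."""
    digest = hashlib.sha256(canonical.encode()).hexdigest()
    return uri.endswith(digest)

uri, data = publish([("ex:a", "ex:p", "ex:b")])
assert verify(uri, data)
assert not verify(uri, data + " tampered")
```

Because the identifier is derived from the content, any server in the network can serve a copy and the client can still verify it, which is what makes decentralized archiving trustworthy.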
Using nanopublications as a distributed ledger of digital truth
With the increase in the volume of research publications, it is very difficult for researchers to keep abreast of all work in their area. Additionally, the claims in classical publications are not machine-readable, making it challenging to retrieve, integrate, and link prior work. Several semantic publishing approaches have been proposed to address these challenges, including Research Object, Executable Paper, Micropublications, and Nanopublications.
Nanopublications are a granular way of publishing research-based claims, their associated provenance, and publication information (metadata about the nanopublication) in a machine-readable form. To date, over 10 million nanopublications have been published, covering a wide range of topics, predominantly in the life sciences. Nanopublications are immutable, decentralised/distributed, uniformly structured, granular, and authentic. These features allow them to be used as a Distributed Ledger of Digital Truth. Such a ledger enables detecting conflicting claims and generating the timeline of discussion on a particular topic. However, the inability to identify all nanopublications related to a given topic prevents existing nanopublications from forming such a ledger.
In this dissertation, we make the following contributions: (i) we identify quality issues regarding the misuse of authorship properties and linkrot, which impact the quality of the digital ledger, and argue that the Nanopub community needs to develop a set of guidelines for publishing nanopublications; (ii) we provide a framework for generating a timeline of discourse over a collection of nanopublications, retrieving and combining nanopublications on a particular topic to provide interoperability between them; (iii) we detect contradictory claims between nanopublications automatically, highlighting the conflicts and providing explanations based on the provenance information in the nanopublications. Through these contributions, we show that nanopublications can form a distributed ledger of digital truth, providing key benefits such as citability, timelines of discourse, and conflict detection to users of the ledger.
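The conflict detection described in contribution (iii) can be sketched as a pairwise scan over claims whose predicates are declared opposites. This is a deliberate simplification of the dissertation's provenance-based approach; the predicate names, the opposites table, and the nanopublication labels are all illustrative assumptions:

```python
# Sketch: flag contradictory claims across nanopublication-like records.
# A "conflict" here means two claims with the same subject and object whose
# predicates are declared opposites. Predicates and labels are placeholders.

OPPOSITES = {("ex:increases", "ex:decreases")}

def conflicts(claims):
    """Return pairs of source labels whose claims contradict each other."""
    found = []
    for i, (s1, p1, o1, src1) in enumerate(claims):
        for s2, p2, o2, src2 in claims[i + 1:]:
            same_topic = s1 == s2 and o1 == o2
            opposed = (p1, p2) in OPPOSITES or (p2, p1) in OPPOSITES
            if same_topic and opposed:
                found.append((src1, src2))
    return found

claims = [
    ("ex:drugA", "ex:increases", "ex:proteinX", "np1"),
    ("ex:drugA", "ex:decreases", "ex:proteinX", "np2"),
    ("ex:drugB", "ex:increases", "ex:proteinX", "np3"),
]
assert conflicts(claims) == [("np1", "np2")]
```

In the dissertation's setting, each returned pair would then be explained using the provenance graphs of the two nanopublications, e.g. by showing that the conflicting claims cite different studies.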
Provenance, propagation and quality of biological annotation
PhD Thesis
Biological databases have become an integral part of the life sciences, being used
to store, organise and share ever-increasing quantities and types of data. Biological
databases are typically centred around raw data, with individual entries being
assigned to a single piece of biological data, such as a DNA sequence. Although essential, raw data alone offers a reader little information. Therefore,
many databases aim to supplement their entries with annotation, allowing the current
knowledge about the underlying data to be conveyed to a reader. Although annotations
come in many different forms, most databases provide some form of free text
annotation.
Given that annotations can form the foundations of future work, it is important that a
user is able to evaluate the quality and correctness of an annotation. However, this is
rarely straightforward. The amount of annotation, and the way in which it is curated,
varies between databases. For example, the production of an annotation in some
databases is entirely automated, without any manual intervention. Further, sections
of annotations may be reused, being propagated between entries and, potentially,
external databases. This provenance and curation information is not always apparent
to a user.
The work described within this thesis explores issues relating to biological annotation
quality. While the most valuable annotation is often contained within free text, its lack
of structure makes it hard to assess. Initially, this work describes a generic approach
that allows textual annotations to be quantitatively measured. This approach is based
upon the application of Zipf's Law to the words within textual annotation, resulting in a single summary value. The relationship between this value and Zipf's principle of least effort provides an indication of the annotation's quality, whilst also allowing annotations to be quantitatively compared.
Secondly, the thesis focuses on determining annotation provenance and tracking any
subsequent propagation. This is achieved through the development of a visualisation
framework, which exploits the reuse of sentences within annotations. Utilising this framework, a number of propagation patterns were identified which, on analysis, appear to indicate low-quality and erroneous annotation.
Together, these approaches increase our understanding of the textual characteristics of biological annotation, and suggest that this understanding can be used to increase the overall quality of these resources.
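The thesis's quality measure rests on Zipf's Law: word frequency in natural text falls off as a power law of rank, so the slope of log-frequency against log-rank summarizes how Zipf-like an annotation is. A minimal sketch of that fit, using ordinary least squares over rank-frequency pairs — the thesis derives its own specific value, so this is only an illustration of the underlying idea:

```python
import math
from collections import Counter

def zipf_slope(text):
    """Fit the slope of log(frequency) vs log(rank) for the words in an
    annotation. For Zipf-like text the slope is negative, approaching -1;
    the fitted value lets annotations be compared quantitatively."""
    freqs = sorted(Counter(text.lower().split()).values(), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(freq) for freq in freqs]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    # ordinary least-squares slope
    return (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))

slope = zipf_slope("the protein binds the receptor and the ligand binds a site")
assert slope < 0  # frequency falls with rank, as Zipf's Law predicts
```

Real annotations are far longer than this toy sentence; with enough text, deviations of the fitted slope from the expected power-law behaviour can flag annotations worth manual review.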
Converting neXtProt into Linked Data and nanopublications
The development of Linked Data provides the opportunity for databases to supply extensive volumes of biological data, information, and knowledge in a machine-interpretable format, making previously isolated data silos interoperable. To increase ease of use, databases often incorporate annotations from several different resources. Linked Data can overcome many of the formatting and identifier issues that prevent data interoperability, but the extensive cross-incorporation of annotations between databases makes the tracking of provenance in open, decentralized systems especially important. Given the diversity of published data, provenance information becomes critical to providing reliable and trustworthy services to scientists. The nanopublication system addresses many of these challenges. We have developed the neXtProt Linked Data by serializing annotations specific to neXtProt in RDF/XML, and have started employing the nanopublication model to give appropriate attribution to all data. Specifically, a use case demonstrates the handling of post-translational modification (PTM) data modeled as nanopublications, illustrating how different levels of provenance and data quality thresholds can be captured in this model.