76 research outputs found
Trusty URIs: Verifiable, Immutable, and Permanent Digital Artifacts for Linked Data
To make digital resources on the web verifiable, immutable, and permanent, we
propose a technique to include cryptographic hash values in URIs. We call them
trusty URIs and we show how they can be used for approaches like
nanopublications to make not only specific resources but their entire reference
trees verifiable. Digital artifacts can be identified not only on the byte
level but on more abstract levels such as RDF graphs, which means that
resources keep their hash values even when presented in a different format. Our
approach sticks to the core principles of the web, namely openness and
decentralized architecture, is fully compatible with existing standards and
protocols, and can therefore be used right away. Evaluation of our reference
implementations shows that these desired properties are indeed accomplished by
our approach, and that it remains practical even for very large files.
Comment: Small error corrected in the text (table data was correct) on page
13: "All average values are below 0.8s (0.03s for batch mode). Using Java in
batch mode even requires only 1ms per file."
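The core idea in the abstract above, embedding a cryptographic hash of a resource in its URI so that anyone can check the content against the identifier, can be illustrated with a minimal sketch. This is a simplification and not the trusty URI specification: the choice of SHA-256, the URL-safe base64 encoding, the "." separator, and the function names `make_hash_uri` and `verify_hash_uri` are all assumptions made for illustration.

```python
import base64
import hashlib


def make_hash_uri(base_uri: str, content: bytes) -> str:
    """Append a URL-safe base64 SHA-256 digest of `content` to `base_uri`.

    Illustrative scheme only; the real trusty URI format differs.
    """
    digest = hashlib.sha256(content).digest()
    # URL-safe base64 never contains ".", so "." can serve as the separator.
    tag = base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")
    return f"{base_uri}.{tag}"


def verify_hash_uri(uri: str, content: bytes) -> bool:
    """Recompute the hash from `content` and compare it with the one in `uri`."""
    base_uri, _, _tag = uri.rpartition(".")
    return make_hash_uri(base_uri, content) == uri


# Hypothetical resource and content, used only to exercise the functions.
uri = make_hash_uri("http://example.org/r1", b"some immutable content")
print(verify_hash_uri(uri, b"some immutable content"))  # content matches the URI
print(verify_hash_uri(uri, b"some modified content"))   # content was altered
```

Because the hash is part of the identifier, a resource that links to such a URI also fixes the content of what it links to, which is how verifiability can extend to a whole reference tree.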
Provenance-Centered Dataset of Drug-Drug Interactions
Over the years several studies have demonstrated the ability to identify
potential drug-drug interactions via data mining from the literature (MEDLINE),
electronic health records, public databases (Drugbank), etc. While each one of
these approaches is properly statistically validated, they do not take into
consideration the overlap between them as one of their decision making
variables. In this paper we present LInked Drug-Drug Interactions (LIDDI), a
public nanopublication-based RDF dataset with trusty URIs that encompasses some
of the most cited prediction methods and sources to provide researchers a
resource for leveraging the work of others into their prediction methods.
Because a main obstacle to using external resources is mapping between the
drug names and identifiers they use, we also provide the set of mappings we
curated to be able to compare the multiple sources we aggregate in
our dataset.
Comment: In Proceedings of the 14th International Semantic Web Conference
(ISWC) 201
Making Digital Artifacts on the Web Verifiable and Reliable
The current Web has no general mechanisms to make digital artifacts, such
as datasets, code, texts, and images, verifiable and permanent. For digital
artifacts that are supposed to be immutable, there is moreover no commonly
accepted method to enforce this immutability. These shortcomings have a serious
negative impact on the ability to reproduce the results of processes that rely
on Web resources, which in turn heavily impacts areas such as science where
reproducibility is important. To solve this problem, we propose trusty URIs
containing cryptographic hash values. We show how trusty URIs can be used for
the verification of digital artifacts, in a manner that is independent of the
serialization format in the case of structured data files such as
nanopublications. We demonstrate how the contents of these files become
immutable, including dependencies to external digital artifacts and thereby
extending the range of verifiability to the entire reference tree. Our approach
sticks to the core principles of the Web, namely openness and decentralized
architecture, and is fully compatible with existing standards and protocols.
Evaluation of our reference implementations shows that these design goals are
indeed accomplished by our approach, and that it remains practical even for
very large files.
Comment: Extended version of conference paper: arXiv:1401.577
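The claim that verification can be independent of the serialization format can be illustrated by hashing the abstract content (a set of triples) instead of the bytes of one particular file. The sketch below is an assumption-laden simplification, not the authors' algorithm: it models a graph as plain string triples, assumes no blank nodes and no newline characters inside terms, and uses a hypothetical helper named `graph_digest`.

```python
import hashlib
from typing import Iterable, Tuple

Triple = Tuple[str, str, str]


def graph_digest(triples: Iterable[Triple]) -> str:
    """Hash a set of triples in a canonical (sorted, de-duplicated) order.

    Simplified illustration: real RDF canonicalization must also handle
    blank nodes, literal datatypes, and escaping.
    """
    h = hashlib.sha256()
    for triple in sorted(set(triples)):
        for term in triple:
            h.update(term.encode("utf-8"))
            h.update(b"\n")  # term separator; assumes terms contain no newline
    return h.hexdigest()


# The same two statements as they might be read from two files that list
# them in a different order (hypothetical example data).
from_file_a = [
    ("ex:aspirin", "ex:interactsWith", "ex:warfarin"),
    ("ex:aspirin", "rdf:type", "ex:Drug"),
]
from_file_b = list(reversed(from_file_a))

print(graph_digest(from_file_a) == graph_digest(from_file_b))  # same graph, same hash
```

Sorting before hashing is the design choice that removes the dependence on statement order; a byte-level hash of the two files would differ even though they describe the same graph.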
Supplemental Information 2: Example dataset description
Access to consistent, high-quality metadata is critical to finding, understanding, and reusing scientific data. However, while there are many relevant vocabularies for the annotation of a dataset, none sufficiently captures all the necessary metadata. This prevents uniform indexing and querying of dataset repositories. Towards providing a practical guide for producing a high-quality description of biomedical datasets, the W3C Semantic Web for Health Care and the Life Sciences Interest Group (HCLSIG) identified Resource Description Framework (RDF) vocabularies that could be used to specify common metadata elements and their value sets. The resulting guideline covers elements of description, identification, attribution, versioning, provenance, and content summarization. This guideline reuses existing vocabularies, and is intended to meet key functional requirements including indexing, discovery, exchange, query, and retrieval of datasets, thereby enabling the publication of FAIR data. The resulting metadata profile is generic and could be used by other domains with an interest in providing machine-readable descriptions of versioned datasets.
A More Decentralized Vision for Linked Data
In this deliberately provocative position paper, we claim that ten years into Linked Data there are still (too?) many unresolved challenges towards arriving at a truly machine-readable and decentralized Web of data. We take a deeper look at the biomedical domain - currently, one of the most promising "adopters" of Linked Data - if we believe the ever-present "LOD cloud" diagram. Herein, we try to highlight and exemplify key technical and non-technical challenges to the success of LOD, and we outline potential solution strategies. We hope that this paper will serve as a discussion basis for a fresh start towards more actionable, truly decentralized Linked Data, and as a call to the community to join forces.
Series: Working Papers on Information Systems, Information Business and Operation
Exploring the Capacity of Open, Linked Data Sources to Assess Adverse Drug Reaction Signals
In this work, we explore the capacity of open, linked data sources to assess adverse drug reaction (ADR) signals. Our study is based on a set of drug-related Bio2RDF data sources and three reference datasets, containing both positive and negative ADR signals, which were used for benchmarking. We present the overall approach for this assessment and refer to some early findings based on the analysis performed so far.