
    Exposing Provenance Metadata Using Different RDF Models

    A standard model for exposing structured provenance metadata of scientific assertions on the Semantic Web would increase the interoperability, discoverability, reliability, and reproducibility of scientific discourse and evidence-based knowledge discovery. Several Resource Description Framework (RDF) models have been proposed to track provenance. Provenance metadata, however, can be not only verbose but also significantly redundant, so an appropriate RDF provenance model should be efficient for publishing, querying, and reasoning over Linked Data. In the present work, we collected millions of pairwise relations between chemicals, genes, and diseases from multiple data sources and demonstrated the extent of redundancy of provenance information in the life science domain. We also evaluated the suitability of several RDF provenance models for this crowdsourced data set, including the N-ary model, the Singleton Property model, and the Nanopublication model, and we examined query performance against three commonly used large RDF stores: Virtuoso, Stardog, and Blazegraph. Our experiments demonstrate that query performance depends on both the RDF store and the RDF provenance model.
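    To make the modelling difference concrete, the sketch below (Python with rdflib; every URI and the source literal are hypothetical placeholders) expresses one chemical-disease relation in two of the evaluated models: the Singleton Property model, which attaches provenance to a freshly minted property, and a named-graph model in the nanopublication style, which attaches provenance to the graph containing the plain triple.

```python
# Hypothetical example: the same provenance-annotated relation in two RDF models.
from rdflib import Dataset, Graph, Literal, Namespace, URIRef

EX = Namespace("http://example.org/")     # hypothetical vocabulary
SP = Namespace("http://example.org/sp#")  # hypothetical singleton-property terms

# Singleton Property model: mint a unique property for this one relation and
# hang the provenance off that property.
g = Graph()
g.add((EX.aspirin, SP.associatedWith_1, EX.headache))
g.add((SP.associatedWith_1, SP.singletonPropertyOf, EX.associatedWith))
g.add((SP.associatedWith_1, EX.source, Literal("CTD")))

# Named-graph (nanopublication-style) model: keep the plain data triple in its
# own graph and describe that graph in the default graph.
ds = Dataset()
assertion = ds.graph(URIRef("http://example.org/assertion1"))
assertion.add((EX.aspirin, EX.associatedWith, EX.headache))
ds.add((URIRef("http://example.org/assertion1"), EX.source, Literal("CTD")))
```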

    On Reasoning with RDF Statements about Statements using Singleton Property Triples

    The Singleton Property (SP) approach has been proposed for representing and querying metadata about RDF triples, such as provenance, time, location, and evidence. In this approach, one singleton property is created to uniquely represent a relationship in a particular context, which in general generates a large property hierarchy in the schema. This has raised important questions among Semantic Web practitioners. Can an existing reasoner recognize singleton property triples, and how? If the singleton property triples describe a data triple, how can a reasoner infer this data triple from them? And does the large property hierarchy affect reasoners in some way? We address these questions in this paper and present our study of the reasoning aspects of singleton properties. We propose a simple mechanism that enables existing reasoners to recognize singleton property triples and to infer the data triples they describe. We evaluate the effect of singleton property triples on the reasoning process by comparing performance on RDF datasets with and without singleton properties. Our evaluation uses as benchmarks the LUBM datasets and the LUBM-SP datasets, derived from LUBM by adding temporal information through singleton properties.
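    A minimal sketch of the inference in question, assuming (as a simplification, not necessarily the authors' exact mechanism) that singletonPropertyOf is handled like rdfs:subPropertyOf: one forward-chaining pass over hypothetical URIs materialises the data triple that the singleton property triples describe.

```python
# Hypothetical example: inferring a data triple from singleton property triples.
from rdflib import Graph, Namespace, URIRef

EX = Namespace("http://example.org/")
SP_OF = URIRef("http://example.org/sp#singletonPropertyOf")  # placeholder term

g = Graph()
g.add((EX.bob, EX.worksFor_1, EX.acme))     # singleton property triple
g.add((EX.worksFor_1, SP_OF, EX.worksFor))  # ties it to the generic property

# One forward-chaining pass, mimicking an rdfs:subPropertyOf entailment rule.
for sp, _, p in list(g.triples((None, SP_OF, None))):
    for s, _, o in list(g.triples((None, sp, None))):
        g.add((s, p, o))

assert (EX.bob, EX.worksFor, EX.acme) in g  # the inferred data triple
```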

    Decentralized provenance-aware publishing with nanopublications

    Publication and archival of scientific results is still commonly considered the responsibility of classical publishing companies. Classical forms of publishing, however, which center around printed narrative articles, no longer seem well suited to the digital age. In particular, there currently exist no efficient, reliable, and agreed-upon methods for publishing scientific datasets, which have become increasingly important for science. In this article, we propose to design scientific data publishing as a web-based bottom-up process, without top-down control by central authorities such as publishing companies. Based on a novel combination of existing concepts and technologies, we present a server network that decentrally stores and archives data in the form of nanopublications, an RDF-based format for representing scientific data. We show how this approach allows researchers to publish, retrieve, verify, and recombine datasets of nanopublications in a reliable and trustworthy manner, and we argue that this architecture could serve as a low-level data publication layer for the Semantic Web in general. Our evaluation of the current network shows that the system is efficient and reliable.
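    The sketch below shows the basic anatomy of a nanopublication as stored by such a network: an assertion graph, a provenance graph, and a publication-info graph, linked together by a head graph. It uses Python with rdflib; all URIs are hypothetical placeholders, whereas real nanopublications in the network use trusty URIs that embed cryptographic content hashes.

```python
# Hypothetical example of the four-graph nanopublication structure.
from rdflib import Dataset, Namespace, URIRef
from rdflib.namespace import RDF, PROV

NP = Namespace("http://www.nanopub.org/nschema#")  # nanopublication schema
EX = Namespace("http://example.org/np1#")          # hypothetical nanopub URIs

ds = Dataset()
head = ds.graph(EX.Head)
assertion = ds.graph(EX.assertion)
provenance = ds.graph(EX.provenance)
pubinfo = ds.graph(EX.pubinfo)

# Head graph: declares the nanopublication and points at its three parts.
head.add((EX.np1, RDF.type, NP.Nanopublication))
head.add((EX.np1, NP.hasAssertion, EX.assertion))
head.add((EX.np1, NP.hasProvenance, EX.provenance))
head.add((EX.np1, NP.hasPublicationInfo, EX.pubinfo))

assertion.add((EX.gene42, EX.associatedWith, EX.disease7))  # the scientific claim
provenance.add((EX.assertion, PROV.wasDerivedFrom, URIRef("http://example.org/study1")))
pubinfo.add((EX.np1, PROV.wasAttributedTo, URIRef("https://orcid.org/0000-0000-0000-0000")))
```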

    Reuse of design pattern measurements for health data

    Research using health data is challenged by the data's heterogeneous nature, description, and storage. The COVID-19 outbreak made clear that rapid analysis of observations such as clinical measurements across a large number of healthcare providers can have enormous health benefits. This has brought into focus the need for a common model of quantitative health data that enables data exchange and federated computational analysis. The application of ontologies, Semantic Web technologies, and the FAIR principles is an approach used by life science research projects, such as the European Joint Programme on Rare Diseases, to make data and metadata machine readable, thereby reducing the barriers to data sharing and analytics and harnessing health data for discovery. Here, we show the reuse of a design pattern for measurements to model diverse health data, demonstrating the usefulness of this pattern for biomedical research and raising its visibility.
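    As an illustration, a measurement pattern of this kind might be instantiated as below (Python with rdflib; the property and class names are hypothetical stand-ins, since the actual pattern is built from SIO terms): a patient has an attribute, a measurement process targets that attribute, and its output carries a value and a unit.

```python
# Hypothetical instantiation of a measurement design pattern.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, XSD

EX = Namespace("http://example.org/")  # placeholder vocabulary

g = Graph()
g.add((EX.patient1, EX.hasAttribute, EX.weight1))   # the patient's attribute
g.add((EX.weight1, RDF.type, EX.BodyWeight))
g.add((EX.measurement1, EX.hasTarget, EX.weight1))  # the measurement process
g.add((EX.measurement1, EX.hasOutput, EX.value1))
g.add((EX.value1, EX.hasValue, Literal("72.5", datatype=XSD.float)))
g.add((EX.value1, EX.hasUnit, EX.kilogram))         # unit modelled as a resource
```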

    The use of foundational ontologies in biomedical research

    Background: The FAIR principles recommend the use of controlled vocabularies, such as ontologies, to define data and metadata concepts. Ontologies are currently modelled following different approaches, sometimes describing conflicting definitions of the same concepts, which can affect interoperability. To cope with this, prior literature suggests organising ontologies in levels, where domain-specific (low-level) ontologies are grounded in domain-independent high-level ontologies (i.e., foundational ontologies). In this level-based organisation, foundational ontologies work as translators of intended meaning, thus improving interoperability. Despite their considerable acceptance in biomedical research, there are very few studies testing foundational ontologies. This paper describes a systematic literature mapping conducted to understand how foundational ontologies are used in biomedical research and to find empirical evidence supporting their claimed advantages and disadvantages. Results: From a set of 79 selected papers, we identified that foundational ontologies are used for several purposes: ontology construction, repair, mapping, and ontology-based data analysis. Foundational ontologies are claimed to improve interoperability, enhance reasoning, speed up ontology development, and facilitate maintainability. The complexity of using foundational ontologies is the most commonly cited downside. Despite this range of uses, hardly any experiments (one paper) tested the claims for or against foundational ontologies. In the subset of 49 papers that describe the development of an ontology, we observed low adherence to formal methods for ontology construction (16 papers) and ontology evaluation (4 papers). Conclusion: Our findings have two main implications. First, the lack of empirical evidence about the use of foundational ontologies indicates a need to evaluate the use of such artefacts in biomedical research. Second, the low adherence to formal methods illustrates how the field could benefit from a more systematic approach to developing and evaluating ontologies. Understanding how foundational ontologies are used in the biomedical field can drive future research towards the improvement of ontologies and, consequently, data FAIRness. The adoption of formal methods can improve the quality and sustainability of ontologies, and reusing these methods from other fields is encouraged.
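    In RDF terms, the level-based grounding described above amounts to little more than a subclass link from a domain term to a foundational term, as in this sketch (the BFO class chosen is illustrative, not a vetted mapping):

```python
# Illustrative grounding of a domain-specific class in a foundational ontology.
from rdflib import Graph, Namespace
from rdflib.namespace import RDFS

EX = Namespace("http://example.org/biomed#")       # hypothetical domain ontology
OBO = Namespace("http://purl.obolibrary.org/obo/")

g = Graph()
g.add((EX.Diagnosis, RDFS.subClassOf, OBO.BFO_0000015))  # BFO 'process'
```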

    Semantic modelling of common data elements for rare disease registries, and a prototype workflow for their deployment over registry data

    BACKGROUND: The European Platform on Rare Disease Registration (EU RD Platform) aims to address the fragmentation of European rare disease (RD) patient data, scattered among hundreds of independent and non-coordinating registries, by establishing standards for integration and interoperability. The first practical output of this effort was a set of 16 Common Data Elements (CDEs) that should be implemented by all RD registries. Interoperability, however, requires decisions beyond data elements, covering data models, formats, and semantics as well. Within the European Joint Programme on Rare Diseases (EJP RD), we aim to further the goals of the EU RD Platform by generating reusable RD semantic model templates that follow the FAIR Data Principles. RESULTS: Through a team-based iterative approach, we created semantically grounded models to represent each of the CDEs, using the Semanticscience Integrated Ontology (SIO) as the core framework for representing the entities and their relationships. Within that framework, we mapped the concepts represented in the CDEs, and their possible values, to domain ontologies such as the Orphanet Rare Disease Ontology, the Human Phenotype Ontology, and the National Cancer Institute Thesaurus. Finally, we created an exemplar, reusable ETL pipeline that we will deploy over these non-coordinating data repositories to assist them in creating model-compliant FAIR data without requiring site-specific coding or expertise in Linked Data or FAIR. CONCLUSIONS: Within the EJP RD project, we determined that creating reusable, expert-designed templates reduced or eliminated the requirement for our participating biomedical domain experts and rare disease data hosts to understand OWL semantics. This enabled them to publish highly expressive FAIR data using tools and approaches that were already familiar to them.
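    As a simplified illustration of what such a template produces (the relation and date property here are hypothetical stand-ins rather than the project's exact SIO-based shapes), a single CDE instance such as a diagnosis might be rendered like this:

```python
# Hypothetical rendering of one Common Data Element instance as RDF.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, XSD

ORDO = Namespace("http://www.orpha.net/ORDO/")       # Orphanet Rare Disease Ontology
EX = Namespace("http://example.org/registry/")       # placeholder registry terms

g = Graph()
g.add((EX.patient1, EX.hasDiagnosis, EX.diagnosis1))
g.add((EX.diagnosis1, RDF.type, ORDO.Orphanet_558))  # an ORDO disease code
g.add((EX.diagnosis1, EX.diagnosisDate, Literal("2021-03-04", datatype=XSD.date)))
print(g.serialize(format="turtle"))
```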

    Infrastructure for synthetic health data

    Machine learning (ML) methods are becoming ever more prevalent across all domains of the life sciences. However, a key component of effective ML is the availability of large datasets that are diverse and representative. In the context of health systems, with significant heterogeneity of clinical phenotypes and diversity of healthcare systems, there is a need to develop and refine unbiased and fair ML models. Synthetic data are increasingly being used to protect patients' right to privacy and to overcome the paucity of annotated open-access medical data. Here, we present our proof of concept for the generation of synthetic health data and our proposed FAIR implementation of the generated synthetic datasets. The work was developed during and after the one-week BioHackathon Europe by 20 participants (10 new to the project) from different countries (NL, ES, LU, UK, GR, FL, DE, ...).

    A Simple Standard for Sharing Ontological Mappings (SSSOM).

    Despite progress in the development of standards for describing and exchanging scientific information, the lack of easy-to-use standards for mapping between different representations of the same or similar objects in different databases poses a major impediment to data integration and interoperability. Mappings often lack the metadata needed to be correctly interpreted and applied. For example, are two terms equivalent or merely related? Are they narrow or broad matches? Or are they associated in some other way? Such relationships between the mapped terms are often not documented, which leads to incorrect assumptions and makes mappings hard to use in scenarios that require a high degree of precision (such as diagnostics or risk prediction). Furthermore, the lack of descriptions of how mappings were made makes it hard to combine and reconcile mappings, particularly curated and automated ones. We have developed the Simple Standard for Sharing Ontological Mappings (SSSOM), which addresses these problems by: (i) introducing a machine-readable and extensible vocabulary to describe metadata that makes imprecision, inaccuracy, and incompleteness in mappings explicit; (ii) defining an easy-to-use, simple table-based format that can be integrated into existing data science pipelines without the need to parse or query ontologies, and that integrates seamlessly with Linked Data principles; (iii) implementing open and community-driven collaborative workflows designed to evolve the standard continuously to address changing requirements and mapping practices; and (iv) providing reference tools and software libraries for working with the standard. In this paper, we present the SSSOM standard, describe several use cases in detail, and survey some of the existing work on standardizing the exchange of mappings, with the goal of making mappings Findable, Accessible, Interoperable, and Reusable (FAIR). The SSSOM specification can be found at http://w3id.org/sssom/spec. Database URL: http://w3id.org/sssom/spec
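    A minimal sketch of what such a mapping table looks like in practice (the column names follow the published SSSOM specification; the mapped identifiers and confidence value are illustrative):

```python
# Writing a tiny, illustrative SSSOM mapping set as tab-separated values.
import csv

rows = [{
    "subject_id": "HP:0001631",                    # illustrative pairing only
    "predicate_id": "skos:exactMatch",
    "object_id": "MP:0010403",
    "mapping_justification": "semapv:ManualMappingCuration",
    "confidence": "0.95",
}]

with open("mappings.sssom.tsv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=list(rows[0]), delimiter="\t")
    writer.writeheader()
    writer.writerows(rows)
```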