904 research outputs found

    Reproducible Domain-Specific Knowledge Graphs in the Life Sciences: a Systematic Literature Review

    Knowledge graphs (KGs) are widely used for representing and organizing structured knowledge in diverse domains. However, the creation and upkeep of KGs pose substantial challenges. Developing a KG demands extensive expertise in data modeling, ontology design, and data curation. Furthermore, KGs are dynamic, requiring continuous updates and quality control to ensure accuracy and relevance. These intricacies contribute to the considerable effort required for their development and maintenance. One critical dimension of KGs that warrants attention is reproducibility. The ability to replicate and validate KGs is fundamental for ensuring the trustworthiness and sustainability of the knowledge they represent. Reproducible KGs not only support open science by allowing others to build upon existing knowledge but also enhance transparency and reliability in disseminating information. Despite the growing number of domain-specific KGs, a comprehensive analysis concerning their reproducibility has been lacking. This paper addresses this gap by offering a general overview of domain-specific KGs and comparing them based on various reproducibility criteria. Our study across 19 different domains shows that only eight out of 250 domain-specific KGs (3.2%) provide publicly available source code. Among these, only one system successfully passed our reproducibility assessment (14.3%). These findings highlight the challenges and gaps in achieving reproducibility across domain-specific KGs. Our finding that only 0.4% of published domain-specific KGs are reproducible shows a clear need for further research and a shift in cultural practices.

    KG-Hub: building and exchanging biological knowledge graphs

    MOTIVATION: Knowledge graphs (KGs) are a powerful approach for integrating heterogeneous data and making inferences in biology and many other domains, but a coherent solution for constructing, exchanging, and facilitating the downstream use of KGs is lacking. RESULTS: Here we present KG-Hub, a platform that enables standardized construction, exchange, and reuse of KGs. Features include a simple, modular extract-transform-load pattern for producing graphs compliant with Biolink Model (a high-level data model for standardizing biological data), easy integration of any OBO (Open Biological and Biomedical Ontologies) ontology, cached downloads of upstream data sources, versioned and automatically updated builds with stable URLs, web-browsable storage of KG artifacts on cloud infrastructure, and easy reuse of transformed subgraphs across projects. Current KG-Hub projects span use cases including COVID-19 research, drug repurposing, microbial-environmental interactions, and rare disease research. KG-Hub is equipped with tooling to easily analyze and manipulate KGs. KG-Hub is also tightly integrated with graph machine learning (ML) tools which allow automated graph ML, including node embeddings and training of models for link prediction and node classification. AVAILABILITY AND IMPLEMENTATION: https://kghub.org
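    To picture the extract-transform-load pattern the abstract describes, here is a minimal illustrative sketch. It is not KG-Hub's actual tooling or API: the URL, file names, and column names are hypothetical, and the output only approximates a Biolink-style subject/predicate/object edge table.

```python
# Minimal sketch of an extract-transform-load step in the style KG-Hub
# describes; names, columns, and the edge schema are illustrative
# assumptions, not KG-Hub's actual API.
import csv
import urllib.request
from pathlib import Path

RAW = Path("data/raw/interactions.tsv")    # hypothetical upstream source
OUT = Path("data/transformed/edges.tsv")   # simplified KGX-style edge file

def extract(url: str) -> None:
    """Download (and cache) an upstream source file."""
    if not RAW.exists():
        RAW.parent.mkdir(parents=True, exist_ok=True)
        urllib.request.urlretrieve(url, RAW)

def transform() -> None:
    """Rewrite raw rows as subject/predicate/object edges using
    Biolink-style CURIEs and predicates."""
    OUT.parent.mkdir(parents=True, exist_ok=True)
    with RAW.open() as src, OUT.open("w", newline="") as dst:
        reader = csv.DictReader(src, delimiter="\t")
        writer = csv.writer(dst, delimiter="\t")
        writer.writerow(["subject", "predicate", "object"])
        for row in reader:
            writer.writerow([
                f"UniProtKB:{row['protein_a']}",   # assumed column names
                "biolink:interacts_with",
                f"UniProtKB:{row['protein_b']}",
            ])

if __name__ == "__main__":
    extract("https://example.org/interactions.tsv")  # placeholder URL
    transform()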

    Galaxy: a comprehensive approach for supporting accessible, reproducible, and transparent computational research in the life sciences

    Increased reliance on computational approaches in the life sciences has revealed grave concerns about how accessible and reproducible computation-reliant results truly are. Galaxy (http://usegalaxy.org), an open web-based platform for genomic research, addresses these problems. Galaxy automatically tracks and manages data provenance and provides support for capturing the context and intent of computational methods. Galaxy Pages are interactive, web-based documents that provide users with a medium to communicate a complete computational analysis.
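    As a concrete illustration of querying the provenance Galaxy tracks, the sketch below uses BioBlend, the community Python client for the Galaxy REST API. The server URL and API key are placeholders, and the exact fields returned may vary between Galaxy releases.

```python
# Sketch of retrieving Galaxy's tracked provenance through its REST API
# using BioBlend; URL and API key are placeholders.
from bioblend.galaxy import GalaxyInstance

gi = GalaxyInstance(url="https://usegalaxy.org", key="YOUR_API_KEY")

# Pick the most recently updated history and list its datasets.
history = gi.histories.get_histories()[0]
datasets = gi.histories.show_history(history["id"], contents=True)

# For each dataset, ask Galaxy which tool and parameters produced it.
for ds in datasets:
    prov = gi.histories.show_dataset_provenance(history["id"], ds["id"])
    print(ds["name"], "<-", prov.get("tool_id"), prov.get("parameters"))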

    Templates as a method for implementing data provenance in decision support systems

    Decision support systems are used as a method of promoting consistent, guideline-based diagnosis and supporting clinical reasoning at the point of care. However, despite the availability of numerous commercial products, the wider acceptance of these systems has been hampered by concerns about diagnostic performance and a perceived lack of transparency in the process of generating clinical recommendations. This resonates with the Learning Health System paradigm, which promotes data-driven medicine relying on routine data capture and transformation and also stresses the need for trust in an evidence-based system. Data provenance is a way of automatically capturing the trace of a research task and its resulting data, thereby facilitating trust and the principles of reproducible research. While computational domains have started to embrace this technology through provenance-enabled execution middlewares, traditionally non-computational disciplines, such as medical research, that do not rely on a single software platform are still struggling with its adoption. To address these issues, we introduce provenance templates – abstract provenance fragments representing meaningful domain actions. Templates can be used to generate a model-driven service interface for domain software tools to routinely capture the provenance of their data and tasks. This paper specifies the requirements for a decision support tool based on the Learning Health System, introduces the theoretical model for provenance templates, and demonstrates the resulting architecture. Our methods were tested and validated on the provenance infrastructure for a Diagnostic Decision Support System developed as part of the EU FP7 TRANSFoRm project.
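    To make the template idea concrete, here is a minimal sketch of instantiating an abstract provenance fragment with concrete bindings, written against the W3C PROV data model using the `prov` Python package. The template structure, slot names, and identifiers are illustrative assumptions, not the paper's actual serialization.

```python
# Illustrative sketch of a provenance template: an abstract fragment with
# variable slots, instantiated with concrete bindings at capture time.
# Uses the W3C PROV data model via the `prov` package; identifiers are
# hypothetical.
from prov.model import ProvDocument

def instantiate_diagnosis_template(patient_id: str, rule_id: str,
                                   recommendation_id: str) -> ProvDocument:
    """Fill the slots of a 'diagnostic recommendation' template."""
    doc = ProvDocument()
    doc.add_namespace("ex", "http://example.org/dss#")

    # Template slots, bound to concrete identifiers for this invocation.
    record = doc.entity(f"ex:record-{patient_id}")
    rule = doc.entity(f"ex:rule-{rule_id}")
    rec = doc.entity(f"ex:recommendation-{recommendation_id}")
    run = doc.activity("ex:apply-guideline")

    # Fixed template structure: the recommendation is generated by
    # applying the guideline rule to the patient record.
    doc.used(run, record)
    doc.used(run, rule)
    doc.wasGeneratedBy(rec, run)
    doc.wasDerivedFrom(rec, record)
    return doc

print(instantiate_diagnosis_template("042", "chest-pain-v2", "7").get_provn())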

    Combining machine learning and semantic web: A systematic mapping study

    In line with the general trend in artificial intelligence research to create intelligent systems that combine learning and symbolic components, a new sub-area has emerged that focuses on combining Machine Learning components with techniques developed by the Semantic Web community: Semantic Web Machine Learning (SWeML). Due to its rapid growth and impact on several communities in the past two decades, there is a need to better understand the space of these SWeML systems, their characteristics, and trends. Yet surveys that adopt principled and unbiased approaches are missing. To fill this gap, we performed a systematic study and analyzed nearly 500 papers published in the past decade in this area, focusing on architectural and application-specific features. Our analysis identified a rapidly growing interest in SWeML systems, with a high impact on several application domains and tasks. Catalysts for this rapid growth are the increased application of deep learning and knowledge graph technologies. By leveraging the in-depth understanding of this area acquired through this study, a further key contribution of this article is a classification system for SWeML systems that we publish as an ontology.
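    A toy example helps to picture what a SWeML system couples together: a Semantic Web component (an RDF knowledge graph) feeding a Machine Learning component (node embeddings). The sketch below is a deliberately minimal, assumption-laden illustration using rdflib and scikit-learn, not an architecture drawn from the surveyed papers.

```python
# Minimal sketch of a SWeML-style pipeline: parse an RDF knowledge graph
# with rdflib (Semantic Web side), then learn simple node embeddings from
# its adjacency structure with scikit-learn (Machine Learning side).
# The Turtle snippet and embedding dimension are illustrative assumptions.
import numpy as np
from rdflib import Graph
from sklearn.decomposition import TruncatedSVD

TTL = """
@prefix ex: <http://example.org/> .
ex:aspirin ex:treats ex:headache .
ex:aspirin ex:interactsWith ex:warfarin .
ex:ibuprofen ex:treats ex:headache .
"""

g = Graph().parse(data=TTL, format="turtle")

# Index every subject/object node, then build a binary adjacency matrix.
nodes = sorted({t for s, _, o in g for t in (s, o)})
idx = {n: i for i, n in enumerate(nodes)}
adj = np.zeros((len(nodes), len(nodes)))
for s, _, o in g:
    adj[idx[s], idx[o]] = adj[idx[o], idx[s]] = 1.0

# Low-dimensional node embeddings usable by downstream ML tasks
# (e.g., link prediction or node classification).
emb = TruncatedSVD(n_components=2).fit_transform(adj)
for n in nodes:
    print(n, emb[idx[n]].round(3))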

    The FAIR Cookbook - the essential resource for and by FAIR doers

    The notion that data should be Findable, Accessible, Interoperable and Reusable, according to the FAIR Principles, has become a global norm for good data stewardship and a prerequisite for reproducibility. Nowadays, FAIR guides data policy actions and professional practices in the public and private sectors. Despite such global endorsements, however, the FAIR Principles are aspirational, remaining elusive at best, and intimidating at worst. To address the lack of practical guidance, and help with capability gaps, we developed the FAIR Cookbook, an open, online resource of hands-on recipes for "FAIR doers" in the Life Sciences. Created by researchers and data management professionals in academia, (bio)pharmaceutical companies, and information service industries, the FAIR Cookbook covers the key steps in a FAIRification journey, the levels and indicators of FAIRness, the maturity model, the technologies, tools, and standards available, the skills required, and the challenges of achieving and improving data FAIRness. Part of the ELIXIR ecosystem, and recommended by funders, the FAIR Cookbook is open to contributions of new recipes.

    We thank all book dash participants and recipe authors, as well as the FAIRplus fellows, all partners, the members of the FAIRplus Scientific Advisory Board, and the management team. In particular, we acknowledge a number of colleagues for their role in the FAIRplus project: Ebitsam Alharbi (0000-0002-3887-3857), Oya Deniz Beyan (0000-0001-7611-3501), Ola Engkvist (0000-0003-4970-6461), Laura Furlong (0000-0002-9383-528X), Carole Goble (0000-0003-1219-2137), Mark Ibberson (0000-0003-3152-5670), Manfred Kohler, Nick Lynch (0000-0002-8997-5298), Scott Lusher (0000-0003-2401-4223), Jean-Marc Neefs, George Papadotas, Manuela Pruess (0000-0002-6857-5543), Ratnesh Sahay, Rudi Verbeeck (0000-0001-5445-6095), Bryn Williams-Jones, and Gesa Witt (0000-0003-2344-706X). This work and the authors were primarily funded by FAIRplus (IMI 802750). PRS and SAS also acknowledge contributions from the following grants, in which the FAIR Cookbook is embedded or to which it is connected: ELIXIR Interoperability Platform, EOSC-Life (H2020-EU 824087), FAIRsharing (Wellcome 212930/Z/18/Z), NIH CFDE Coordinating Center (NIH Common Fund OT3OD025459-01), Precision Toxicology (H2020-EU 965406), UKRI DASH grant (MR/V038966/1), BY-COVID (Horizon-EU 101046203), and AgroServ (Horizon-EU 101058020).

    brainlife.io: A decentralized and open source cloud platform to support neuroscience research

    Neuroscience research has expanded dramatically over the past 30 years by advancing standardization and tool development to support rigor and transparency. Consequently, the complexity of the data pipeline has also increased, hindering access to FAIR data analysis for portions of the worldwide research community. brainlife.io was developed to reduce these burdens and democratize modern neuroscience research across institutions and career levels. Using community software and hardware infrastructure, the platform provides open-source data standardization, management, visualization, and processing, and simplifies the data pipeline. brainlife.io automatically tracks the provenance history of thousands of data objects, supporting simplicity, efficiency, and transparency in neuroscience research. Here brainlife.io's technology and data services are described and evaluated for validity, reliability, reproducibility, replicability, and scientific utility. Using data from 4 modalities and 3,200 participants, we demonstrate that brainlife.io's services produce outputs that adhere to best practices in modern neuroscience research.