FAIR Computational Workflows
Computational workflows describe the complex multi-step methods that are used for data collection, data preparation, analytics, predictive modelling, and simulation that lead to new data products.
They can inherently contribute to the FAIR data principles: by processing data according to established metadata; by creating metadata themselves during the processing of data; and by tracking and recording data provenance.
These properties aid data quality assessment and contribute to secondary data usage. Moreover, workflows are digital objects in their own right.
This paper argues that FAIR principles for workflows need to address their specific nature in terms of their composition of executable software steps, their provenance, and their development.
Accepted for Data Intelligence special issue: FAIR best practices, 2019.
Carole Goble acknowledges funding by BioExcel2 (H2020 823830), IBISBA1.0 (H2020 730976) and EOSCLife (H2020 824087). Daniel Schober's work was financed by Phenomenal (H2020 654241) during the initiation phase of this effort; his current work is an in-kind contribution. Kristian Peters is funded by the German Network for Bioinformatics Infrastructure (de.NBI)
and acknowledges BMBF funding under grant number 031L0107. Stian Soiland-Reyes is funded by BioExcel2 (H2020 823830). Daniel Garijo and Yolanda Gil gratefully acknowledge support from DARPA award W911NF-18-1-0027, NIH award 1R01AG059874-01, and NSF award ICER-1740683.
U.S. Department of the Interior: Sharing FAIR Data Fairly
Government-produced data are consumed daily by thousands of scientists, researchers, industries, and students around the world, but are often difficult to locate because they are collected and stored in a duplicative state at varying levels of quality, inhibiting their usefulness for data science investigations and analysis. To address these challenges, the United States Department of the Interior (DOI) bureaus have been implementing the FAIR Data Principles in their data sharing strategies since 2016. Differing interpretations of the FAIR Data Principles are leading to data that are not documented uniformly and are not properly integrated for reuse. In order to establish a FAIR baseline, analysis of select datasets is being performed with peer-reviewed FAIR assessment tools. Delphi panels are being conducted with DOI Chief Data Officers and DOI Federal data consumers to gain insights into how to affordably deliver these data according to the FAIR Data Principles.
FAIRsoft - A practical implementation of FAIR principles for research software
Computational tools are increasingly becoming constitutive parts of scientific research, from experimentation and data collection to the dissemination and storage of results. Unfortunately, however, research software is not subject to the same requirements as other methods of scientific research: being peer-reviewed, being reproducible, and allowing one to build upon another's work. This situation is detrimental to the integrity and advancement of scientific research, leading to computational methods frequently being impossible to reproduce and/or verify [1]. Moreover, they are often opaque, directly unavailable, or impossible for others to use [2]. One step to address this problem could be formulating a set of principles that research software should meet to ensure its quality and sustainability, resembling the FAIR (Findable, Accessible, Interoperable and Reusable) Data Principles [3]. The FAIR Data Principles were created to solve similar issues affecting scholarly data, namely the great difficulty of sharing and accessing them, and are now widely recognized across fields. We present here FAIRsoft, our initial effort to assess the quality of research software using a FAIR-like framework, as a first step towards its implementation in OpenEBench [4], the ELIXIR benchmarking platform.
Unique, Persistent, Resolvable: Identifiers as the foundation of FAIR
The FAIR Principles describe characteristics intended to support access to and reuse of digital artifacts in the scientific research ecosystem. Persistent, globally unique identifiers, resolvable on the Web, and associated with a set of additional descriptive metadata, are foundational to FAIR data. Here we describe some basic principles and exemplars for their design, use and orchestration with other system elements to achieve FAIRness for digital research objects
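To make this concrete, here is a minimal Python sketch of the pattern the abstract describes: a globally unique identifier paired with a Web-resolvable URL and descriptive metadata. The resolver base URL and function name are hypothetical; a real system would use a registered scheme such as DOIs or Handles backed by a resolution service.

```python
import uuid

# Hypothetical illustration: mint a globally unique identifier and pair it
# with descriptive metadata, as FAIR identifier practice requires.
# "https://example.org/id/" is a placeholder resolver, not a real service.
def mint_identifier(metadata: dict, resolver: str = "https://example.org/id/") -> dict:
    """Return an identifier record: a globally unique ID, a resolvable URL
    formed from it, and the descriptive metadata it identifies."""
    pid = uuid.uuid4().hex  # globally unique (UUID version 4)
    return {
        "id": pid,
        "url": resolver + pid,   # resolvable on the Web via the resolver
        "metadata": metadata,    # descriptive metadata travels with the ID
    }

record = mint_identifier({"title": "Example dataset", "creator": "A. Researcher"})
```

The key design point the abstract emphasizes is that the identifier alone is not enough: resolution plus associated metadata is what makes the digital object findable and reusable.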
F*** workflows: when parts of FAIR are missing
The FAIR principles for scientific data (Findable, Accessible, Interoperable,
Reusable) are also relevant to other digital objects such as research software
and scientific workflows that operate on scientific data. The FAIR principles
can be applied to the data being handled by a scientific workflow as well as
the processes, software, and other infrastructure which are necessary to
specify and execute a workflow. The FAIR principles were designed as
guidelines, rather than rules, that would allow for differences in standards
for different communities and for different degrees of compliance. There are
many practical considerations which impact the level of FAIR-ness that can
actually be achieved, including policies, traditions, and technologies. Because
of these considerations, obstacles are often encountered during the workflow
lifecycle that trace directly to shortcomings in the implementation of the FAIR
principles. Here, we detail some cases, without naming names, in which data and
workflows were Findable but otherwise lacking in areas commonly needed and
expected by modern FAIR methods, tools, and users. We describe how some of
these problems, all of which were overcome successfully, have motivated us to
push on systems and approaches for fully FAIR workflows.
Comment: 6 pages, 0 figures, accepted to the ERROR 2022 workshop (see https://error-workshop.org/ for more information), to be published in the proceedings of IEEE eScience 202
A Maturity Model for Operations in Neuroscience Research
Scientists are adopting new approaches to scale up their activities and
goals. Progress in neurotechnologies, artificial intelligence, automation, and
tools for collaboration promises new bursts of discoveries. However, compared
to other disciplines and the industry, neuroscience laboratories have been slow
to adopt key technologies to support collaboration, reproducibility, and
automation. Drawing on progress in other fields, we define a roadmap for
implementing automated research workflows for diverse research teams. We
propose establishing a five-level capability maturity model for operations in
neuroscience research. Achieving higher levels of operational maturity requires
new technology-enabled methodologies, which we describe as ``SciOps''. The
maturity model provides guidelines for evaluating and upgrading operations in
multidisciplinary neuroscience teams.
Comment: 10 pages, one figure
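As a rough illustration of what a five-level capability maturity model looks like as a data structure, here is a Python sketch; the level names below are placeholder assumptions for illustration, not the paper's actual SciOps level definitions.

```python
from enum import IntEnum

# Illustrative only: a five-level capability maturity scale in the spirit of
# the proposed "SciOps" model. Level names are PLACEHOLDERS, not the paper's.
class MaturityLevel(IntEnum):
    INITIAL = 1      # ad hoc, manual processes
    MANAGED = 2      # repeatable procedures within a single lab
    DEFINED = 3      # documented, shared workflows
    QUANTIFIED = 4   # measured, automated pipelines
    OPTIMIZING = 5   # continuous improvement across teams

def can_upgrade(current: MaturityLevel) -> bool:
    """Maturity models are climbed one level at a time; the top level
    has no further upgrade."""
    return current < MaturityLevel.OPTIMIZING
```

Encoding the levels as an ordered enum reflects the model's central claim: operational maturity is cumulative, and each level builds on the capabilities of the one below it.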
Normal Tissue Complication Probability (NTCP) Prediction Model for Osteoradionecrosis of the Mandible in Patients With Head and Neck Cancer After Radiation Therapy: Large-Scale Observational Cohort
Purpose: Osteoradionecrosis (ORN) of the mandible represents a severe, debilitating complication of radiation therapy (RT) for head and neck cancer (HNC). At present, no normal tissue complication probability (NTCP) models for the risk of ORN exist. The aim of this study was to develop a multivariable clinical/dose-based NTCP model for the prediction of any-grade ORN (ORN I-IV) and grade IV ORN (ORN IV) after RT (+/- chemotherapy) in patients with HNC.
Methods and Materials: Included patients with HNC were treated with (chemo-)RT between 2005 and 2015. Mandible bone radiation dose-volume parameters and clinical variables (ie, age, sex, tumor site, pre-RT dental extractions, chemotherapy history, postoperative RT, and smoking status) were considered as potential predictors. The patient cohort was randomly divided into a training (70%) and an independent test (30%) cohort. Bootstrapped forward variable selection was performed in the training cohort to select the predictors for the NTCP models. The final NTCP models were validated on the held-out test subset.
Results: Of the 1259 included patients with HNC, 13.7% (n = 173) developed any-grade ORN (ORN I-IV, the primary endpoint) and 5% (n = 65) developed ORN IV (the secondary endpoint). All dose and volume parameters of the mandible bone were significantly associated with the development of ORN in univariable models. Multivariable analyses identified D30% and pre-RT dental extraction as independent predictors in the best-performing NTCP models for both ORN I-IV and ORN IV, with areas under the curve (AUC) of 0.78 (AUC validation = 0.75 [0.69-0.82]) and 0.81 (AUC validation = 0.82 [0.74-0.89]), respectively.
Conclusions: This study presented NTCP models based on mandible bone D30% and pre-RT dental extraction that predict ORN I-IV and ORN IV (ie, ORN needing invasive surgical intervention) after RT for HNC. Our results suggest that less than 30% of the mandible should receive a dose of 35 Gy or more to keep the risk of ORN I-IV below 5%.
These NTCP models can improve ORN prevention and management by identifying patients at risk of ORN. (C) 2021 The Author(s). Published by Elsevier Inc.
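NTCP models of this kind are typically multivariable logistic regressions. The Python sketch below shows the general shape of such a model using the two predictors the abstract reports (mandible D30% and pre-RT dental extraction); the intercept and coefficients are invented placeholders, since the fitted values are not given in the abstract.

```python
import math

# Sketch of a multivariable logistic NTCP model with the two predictors the
# abstract identifies: mandible D30% (Gy) and pre-RT dental extraction.
# The intercept and weights below are HYPOTHETICAL placeholders, NOT the
# fitted coefficients from the study.
B0, B_DOSE, B_EXTRACT = -5.0, 0.08, 1.1

def ntcp(d30_gy: float, dental_extraction: bool) -> float:
    """Complication probability via the logistic function:
    NTCP = 1 / (1 + exp(-(b0 + b1*D30% + b2*extraction)))."""
    s = B0 + B_DOSE * d30_gy + B_EXTRACT * float(dental_extraction)
    return 1.0 / (1.0 + math.exp(-s))
```

With any positive dose weight, predicted risk rises monotonically with D30% and is shifted upward by prior dental extraction, which is the qualitative behavior behind the paper's recommendation of a D30% dose constraint.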
Julia as a unifying end-to-end workflow language on the Frontier exascale system
We evaluate Julia as a single language and ecosystem paradigm powered by LLVM
to develop workflow components for high-performance computing. We run a
Gray-Scott, 2-variable diffusion-reaction application using a memory-bound,
7-point stencil kernel on Frontier, the US Department of Energy's first
exascale supercomputer. We evaluate the performance, scaling, and trade-offs of
(i) the computational kernel on AMD's MI250x GPUs, (ii) weak scaling up to
4,096 MPI processes/GPUs or 512 nodes, (iii) parallel I/O writes using the
ADIOS2 library bindings, and (iv) Jupyter Notebooks for interactive analysis.
Results suggest that although Julia generates a reasonable LLVM-IR, a nearly
50% performance difference exists vs. native AMD HIP stencil codes when running
on the GPUs. As expected, we observed near-zero overhead when using MPI and
parallel I/O bindings for system-wide installed implementations. Consequently,
Julia emerges as a compelling high-performance and high-productivity workflow
composition language, as measured on the fastest supercomputer in the world.
Comment: 11 pages, 8 figures, accepted at the 18th Workshop on Workflows in Support of Large-Scale Science (WORKS23), held with the IEEE/ACM International Conference for High Performance Computing, Networking, Storage, and Analysis, SC2
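The memory-bound 7-point stencil the paper benchmarks is the 3-D discrete Laplacian at the core of a Gray-Scott diffusion-reaction update. A minimal NumPy sketch (Python here rather than Julia, and reduced to a single-field diffusion step with periodic boundaries) of what such a kernel computes:

```python
import numpy as np

# Minimal sketch of a 7-point (3-D Laplacian) stencil, the memory-bound
# kernel at the heart of a Gray-Scott diffusion-reaction step. Simplified:
# one field instead of two, periodic boundaries via np.roll, unit spacing.
def laplacian_7pt(u: np.ndarray) -> np.ndarray:
    """Sum of the six face neighbours minus six times the centre value."""
    return (np.roll(u, 1, 0) + np.roll(u, -1, 0) +
            np.roll(u, 1, 1) + np.roll(u, -1, 1) +
            np.roll(u, 1, 2) + np.roll(u, -1, 2) - 6.0 * u)

def diffuse(u: np.ndarray, D: float = 0.1, dt: float = 0.1) -> np.ndarray:
    """One explicit Euler diffusion step: u <- u + D * dt * lap(u)."""
    return u + D * dt * laplacian_7pt(u)
```

Each output point touches seven array elements but performs only a handful of flops, which is why the paper treats this kernel as memory-bound and why code generation quality (Julia's LLVM-IR vs. native HIP) shows up directly in GPU performance.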