341 research outputs found
Providing traceability for neuroimaging analyses
Introduction: With the increasingly digital nature of biomedical data and the growing complexity of analyses in medical research, accurate information capture, traceability and accessibility have become crucial to medical researchers in pursuing their research goals. Grid- or Cloud-based technologies, often based on so-called Service Oriented Architectures (SOA), are increasingly seen as viable solutions for managing distributed data and algorithms in the biomedical domain. For neuroscientific analyses, especially those centred on complex image analysis, traceability of processes and datasets is essential, but until now this has not been captured in a manner that facilitates collaborative study.
Purpose and Method: Few examples exist of deployed medical systems based on Grids that provide the traceability of research data needed to facilitate complex analyses, and none have been evaluated in practice. Over the past decade, we have been working with mammographers, paediatricians and neuroscientists in three generations of projects to provide the data management and provenance services now required for 21st-century medical research. This paper outlines the findings of a requirements study and a resulting system architecture for the production of services to support neuroscientific studies of biomarkers for Alzheimer's disease.
Results: The paper proposes a software infrastructure and services that provide the foundation for such support. It introduces the use of the CRISTAL software to provide provenance management as one of a number of services delivered on a SOA, deployed to manage neuroimaging projects that have been studying biomarkers for Alzheimer's disease.
Conclusions: In the neuGRID and N4U projects a Provenance Service has been delivered that captures and reconstructs the workflow information needed to facilitate researchers in conducting neuroimaging analyses. The software enables neuroscientists to track the evolution of workflows and datasets, tracks the outcomes of various analyses, and provides provenance traceability throughout the lifecycle of their studies. As the Provenance Service has been designed to be generic, it can be applied across the medical domain as a reusable tool for supporting medical researchers, thus providing communities of researchers for the first time with the tools necessary to conduct widely distributed collaborative programmes of medical analysis.
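The capture-and-reconstruct behaviour such a Provenance Service provides can be illustrated with a minimal sketch. This is plain Python with invented names (`ProvenanceRecord`, `ProvenanceStore`, the file names), not the actual CRISTAL or neuGRID API:

```python
from dataclasses import dataclass, field

@dataclass
class ProvenanceRecord:
    """One captured workflow step: what ran, on what, producing what."""
    step: str
    inputs: list
    outputs: list
    parameters: dict = field(default_factory=dict)

class ProvenanceStore:
    """Append-only log of workflow steps, queryable by output artefact."""
    def __init__(self):
        self.records = []

    def capture(self, record):
        self.records.append(record)

    def trace(self, artefact):
        """Reconstruct, in order, every step that led to `artefact`."""
        lineage, frontier = [], {artefact}
        for record in reversed(self.records):
            if frontier & set(record.outputs):
                lineage.append(record)
                frontier |= set(record.inputs)
        return list(reversed(lineage))

store = ProvenanceStore()
store.capture(ProvenanceRecord("skull-strip", ["scan_001.nii"], ["brain_001.nii"]))
store.capture(ProvenanceRecord("segment", ["brain_001.nii"], ["gm_001.nii"],
                               {"atlas": "AAL"}))
chain = store.trace("gm_001.nii")
print([r.step for r in chain])  # -> ['skull-strip', 'segment']
```

The walk backwards from a result to the steps and inputs that produced it is the essence of the traceability the abstract describes.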
Designing Traceability into Big Data Systems
Providing an appropriate level of accessibility and traceability to data or
process elements (so-called Items) in large volumes of data, often
Cloud-resident, is an essential requirement in the Big Data era.
Enterprise-wide data systems need to be designed from the outset to support
usage of such Items across the spectrum of business use rather than from any
specific application view. The design philosophy advocated in this paper is to
drive the design process using a so-called description-driven approach which
enriches models with meta-data and description and focuses the design process
on Item re-use, thereby promoting traceability. Details are given of the
description-driven design of big data systems at CERN, in health informatics
and in business process management. Evidence is presented that the approach
leads to design simplicity and consequent ease of management thanks to loose
typing and the adoption of a unified approach to Item management and usage.
Comment: 10 pages, 6 figures. In Proceedings of the 5th Annual International Conference on ICT: Big Data, Cloud and Security (ICT-BDCS 2015), Singapore, July 2015. arXiv admin note: text overlap with arXiv:1402.5764, arXiv:1402.575
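To make the description-driven idea concrete, here is a minimal sketch in which an Item's structure comes from a stored description (meta-data) rather than a compiled-in class; this loose typing is what lets new Item kinds be added at runtime and reused across applications. The classes and field names are hypothetical, not the design used at CERN:

```python
class Description:
    """Stored meta-data that defines what an Item of this kind looks like."""
    def __init__(self, name, fields):
        self.name, self.fields = name, fields

class Item:
    """A data element whose shape is validated against its Description,
    not against a hard-coded class definition."""
    def __init__(self, description, **values):
        unknown = set(values) - set(description.fields)
        if unknown:
            raise ValueError(f"fields not in description: {unknown}")
        self.description, self.values = description, values

# A new kind of Item is introduced purely by storing a new description.
scan = Description("MRIScan", ["subject", "modality", "acquired"])
item = Item(scan, subject="S001", modality="T1", acquired="2015-03-01")
print(item.description.name, item.values["subject"])  # -> MRIScan S001
```

Because applications consult the description instead of assuming a fixed schema, the same Item machinery serves every business use rather than any one application view.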
Data provenance tracking as the basis for a biomedical virtual research environment
In complex data analyses it is increasingly important to capture information about how data sets are used, in addition to preserving them over time, in order to ensure reproducibility of results, to verify the work of others and to confirm that data have been used under appropriate conditions for specific analyses. Scientific workflow-based studies are beginning to realise the benefit of capturing this provenance of data and of the activities used to process, transform and carry out studies on those data. This is especially true in biomedicine, where the collection of data through experiment is costly and/or difficult to reproduce and where that data needs to be preserved over time. One way to support the development of workflows and their use in (collaborative) biomedical analyses is through a Virtual Research Environment. The dynamic and distributed nature of Grid/Cloud computing, however, makes the capture and processing of provenance information a major research challenge. Furthermore, most workflow provenance management services are designed only for data-flow oriented workflows, and researchers are now realising that tracking data or workflows alone or separately is insufficient to support the scientific process. What is required for collaborative research is traceable and reproducible provenance support in a fully orchestrated Virtual Research Environment (VRE) that enables researchers to define their studies in terms of the datasets and processes used, to monitor and visualise the outcomes of their analyses, and to log their results so that other users can call upon that acquired knowledge to support subsequent studies. We have extended the work carried out in the neuGRID and N4U projects in providing a so-called Virtual Laboratory to provide the foundation for a generic VRE in which sets of biomedical data (images, laboratory test results, patient records, epidemiological analyses etc.) and the workflows (pipelines) used to process those data, together with their provenance data and result sets, are captured in the CRISTAL software. This paper outlines the functionality provided for a VRE by the Open Source CRISTAL software and examines how it can provide the foundations for a practice-based knowledge base for biomedicine and, potentially, for a wider research community.
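The "log results so that others can call upon that acquired knowledge" idea can be sketched as follows; the function names, pipeline identifiers and dataset names are purely illustrative and not part of the Virtual Laboratory's interface:

```python
# Each completed analysis is logged with the pipeline and dataset that
# produced it; later studies query the log before repeating work.
analysis_log = []

def log_analysis(pipeline, dataset, result):
    """Record a completed analysis so subsequent studies can reuse it."""
    analysis_log.append({"pipeline": pipeline, "dataset": dataset,
                         "result": result})

def find_prior(pipeline, dataset):
    """Return earlier results of the same pipeline on the same dataset."""
    return [e["result"] for e in analysis_log
            if e["pipeline"] == pipeline and e["dataset"] == dataset]

log_analysis("cortical-thickness-v2", "ADNI-baseline", {"mean_mm": 2.41})
print(find_prior("cortical-thickness-v2", "ADNI-baseline"))
```

A real VRE would hold far richer provenance than this, but the query-before-recompute pattern is the kernel of the knowledge base the abstract envisages.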
Designing Reusable Systems that Can Handle Change - Description-Driven Systems : Revisiting Object-Oriented Principles
In the age of the Cloud and so-called Big Data, systems must be increasingly
flexible, reconfigurable and adaptable to change in addition to being developed
rapidly. As a consequence, designing systems to cater for evolution is becoming
critical to their success. To be able to cope with change, systems must have
the capability of reuse and the ability to adapt as and when necessary to
changes in requirements. Allowing systems to be self-describing is one way to
facilitate this. To address the issues of reuse in designing evolvable systems,
this paper proposes a so-called description-driven approach to systems design.
This approach enables new versions of data structures and processes to be
created alongside the old, thereby providing a history of changes to the
underlying data models and enabling the capture of provenance data. The
efficacy of the description-driven approach is exemplified by the CRISTAL
project. CRISTAL is based on description-driven design principles; it uses
versions of stored descriptions to define various versions of data which can be
stored in diverse forms. This paper discusses the need for capturing holistic
system description when modelling large-scale distributed systems.
Comment: 8 pages, 1 figure and 1 table. Accepted by the 9th International Conference on the Evaluation of Novel Approaches to Software Engineering (ENASE'14), Lisbon, Portugal, April 2014.
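The mechanism of creating new versions of data structures alongside the old can be sketched in a few lines. This is an illustrative model of versioned descriptions, not CRISTAL's actual stored-description format:

```python
class DescriptionRegistry:
    """Keeps every version of every description; nothing is overwritten,
    so the registry doubles as a history of changes to the data model."""
    def __init__(self):
        self.versions = {}  # name -> list of field lists, index = version

    def define(self, name, fields):
        """Add a new version alongside the old ones; return its number."""
        self.versions.setdefault(name, []).append(list(fields))
        return len(self.versions[name]) - 1

    def fields(self, name, version):
        return self.versions[name][version]

reg = DescriptionRegistry()
v0 = reg.define("Patient", ["id", "dob"])
v1 = reg.define("Patient", ["id", "dob", "consent"])  # evolved structure
# Data created under the old description remains valid against version 0,
# while new data can be validated against version 1.
print(v0, v1, reg.fields("Patient", v1))
```

Because old versions stay addressable, data captured years earlier can still be interpreted against the description it was created with, which is precisely what enables provenance over an evolving model.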
Facilitating evolution during design and implementation
The volumes and complexity of data that companies need to handle are increasing at an accelerating rate. To compete effectively and ensure their commercial sustainability, it is becoming crucial for them to achieve robust traceability in both their data and the evolving designs of their systems. This is addressed by the CRISTAL software, originally developed at CERN by UWE, Bristol, for one of the particle detectors at the Large Hadron Collider and subsequently transferred into the commercial world. Companies have been able to demonstrate increased agility, generate additional revenue, and improve the efficiency and cost-effectiveness with which they develop and implement systems in areas including business process management (BPM), healthcare and accounting applications. CRISTAL's ability to manage data and its provenance at the terabyte scale, with full traceability over extended timescales, together with its description-driven approach, has provided the flexible adaptability required to future-proof dynamically evolving software for these businesses.
Towards Provenance and Traceability in CRISTAL for HEP
This paper discusses the CRISTAL object lifecycle management system and its
use in provenance data management and the traceability of system events. This
software was initially used to capture the construction and calibration of the
CMS ECAL detector at CERN for later use by physicists in their data analysis.
Some further uses of CRISTAL in different projects (CMS, neuGRID and N4U) are
presented as examples of its flexible data model. From these examples,
applications are drawn for the High Energy Physics (HEP) domain, and some
initial ideas for its use in data preservation in HEP are outlined in detail in
this paper. Currently, investigations are underway to gauge the feasibility of using
the N4U Analysis Service or a derivative of it to address the requirements of
data and analysis logging and provenance capture within the HEP long term data
analysis environment.
Comment: 5 pages, 1 figure. 20th International Conference on Computing in High Energy and Nuclear Physics (CHEP13), 14-18 October 2013, Amsterdam, Netherlands. To appear in Journal of Physics: Conference Series.
NeuroProv: Provenance data visualisation for neuroimaging analyses
Visualisation underpins the understanding of scientific data, both through exploration and through explanation of analysed data. Provenance strengthens that understanding by showing how a result has been achieved. With the significant increase in data volumes and algorithm complexity, clinical researchers are struggling with information tracking, analysis reproducibility and the verification of scientific outputs. In addition, data coming from various heterogeneous sources with varying levels of trust in a collaborative environment adds to the uncertainty of the scientific outputs. This provides the motivation for provenance data capture and visualisation support for analyses. In this paper a system, NeuroProv, is presented to visualise provenance data in order to aid in the verification of scientific outputs and in the comparison, progression and evolution of results for neuroimaging analyses. The experimental results show the effectiveness of visualising provenance data for neuroimaging analyses.
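As a rough illustration of the kind of view such a tool offers, a captured provenance chain can be rendered as a graph description. This sketch emits Graphviz DOT and is only an assumption about one possible rendering, not NeuroProv's actual output format; the step and file names are invented:

```python
# Each tuple is (input artefact, process, output artefact) from a
# hypothetical captured neuroimaging workflow.
steps = [
    ("scan_001.nii", "skull-strip", "brain_001.nii"),
    ("brain_001.nii", "segment", "gm_001.nii"),
]

def to_dot(steps):
    """Render provenance edges as a Graphviz DOT digraph string."""
    lines = ["digraph provenance {"]
    for src, process, dst in steps:
        lines.append(f'  "{src}" -> "{dst}" [label="{process}"];')
    lines.append("}")
    return "\n".join(lines)

print(to_dot(steps))
```

Feeding the resulting string to the `dot` tool would draw the lineage as a directed graph, the sort of picture that lets a researcher compare analyses and follow the evolution of a result at a glance.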