17 research outputs found

    Providing traceability for neuroimaging analyses

    Get PDF
    IntroductionWith the increasingly digital nature of biomedical data and as the complexity of analyses in medical research increases, the need for accurate information capture, traceability and accessibility has become crucial to medical researchers in the pursuance of their research goals. Grid- or Cloud-based technologies, often based on so-called Service Oriented Architectures (SOA), are increasingly being seen as viable solutions for managing distributed data and algorithms in the bio-medical domain. For neuroscientific analyses, especially those centred on complex image analysis, traceability of processes and datasets is essential but up to now this has not been captured in a manner that facilitates collaborative study. Purpose and MethodFew examples exist, of deployed medical systems based on Grids that provide the traceability of research data needed to facilitate complex analyses and none have been evaluated in practice. Over the past decade, we have been working with mammographers, paediatricians and neuroscientists in three generations of projects to provide the data management and provenance services now required for 21st century medical research. This paper outlines the finding of a requirements study and a resulting system architecture for the production of services to support neuroscientific studies of biomarkers for Alzheimer’s Disease.ResultsThe paper proposes a software infrastructure and services that provide the foundation for such support. It introduces the use of the CRISTAL software to provide provenance management as one of a number of services delivered on a SOA, deployed to manage neuroimaging projects that have been studying biomarkers for Alzheimer’s disease. ConclusionsIn the neuGRID and N4U projects a Provenance Service has been delivered that captures and reconstructs the workflow information needed to facilitate researchers in conducting neuroimaging analyses. The software enables neuroscientists to track the evolution of workflows and datasets. It also tracks the outcomes of various analyses and provides provenance traceability throughout the lifecycle of their studies. As the Provenance Service has been designed to be generic it can be applied across the medical domain as a reusable tool for supporting medical researchers thus providing communities of researchers for the first time with the necessary tools to conduct widely distributed collaborative programmes of medical analysis

    Designing Traceability into Big Data Systems

    Full text link
    Providing an appropriate level of accessibility and traceability to data or process elements (so-called Items) in large volumes of data, often Cloud-resident, is an essential requirement in the Big Data era. Enterprise-wide data systems need to be designed from the outset to support usage of such Items across the spectrum of business use rather than from any specific application view. The design philosophy advocated in this paper is to drive the design process using a so-called description-driven approach which enriches models with meta-data and description and focuses the design process on Item re-use, thereby promoting traceability. Details are given of the description-driven design of big data systems at CERN, in health informatics and in business process management. Evidence is presented that the approach leads to design simplicity and consequent ease of management thanks to loose typing and the adoption of a unified approach to Item management and usage.Comment: 10 pages; 6 figures in Proceedings of the 5th Annual International Conference on ICT: Big Data, Cloud and Security (ICT-BDCS 2015), Singapore July 2015. arXiv admin note: text overlap with arXiv:1402.5764, arXiv:1402.575

    A formal architecture-centric and model driven approach for the engineering of science gateways

    Get PDF
    From n-Tier client/server applications, to more complex academic Grids, or even the most recent and promising industrial Clouds, the last decade has witnessed significant developments in distributed computing. In spite of this conceptual heterogeneity, Service-Oriented Architecture (SOA) seems to have emerged as the common and underlying abstraction paradigm, even though different standards and technologies are applied across application domains. Suitable access to data and algorithms resident in SOAs via so-called ‘Science Gateways’ has thus become a pressing need in order to realize the benefits of distributed computing infrastructures.In an attempt to inform service-oriented systems design and developments in Grid-based biomedical research infrastructures, the applicant has consolidated work from three complementary experiences in European projects, which have developed and deployed large-scale production quality infrastructures and more recently Science Gateways to support research in breast cancer, pediatric diseases and neurodegenerative pathologies respectively. In analyzing the requirements from these biomedical applications the applicant was able to elaborate on commonly faced issues in Grid development and deployment, while proposing an adapted and extensible engineering framework. Grids implement a number of protocols, applications, standards and attempt to virtualize and harmonize accesses to them. Most Grid implementations therefore are instantiated as superposed software layers, often resulting in a low quality of services and quality of applications, thus making design and development increasingly complex, and rendering classical software engineering approaches unsuitable for Grid developments.The applicant proposes the application of a formal Model-Driven Engineering (MDE) approach to service-oriented developments, making it possible to define Grid-based architectures and Science Gateways that satisfy quality of service requirements, execution platform and distribution criteria at design time. An novel investigation is thus presented on the applicability of the resulting grid MDE (gMDE) to specific examples and conclusions are drawn on the benefits of this approach and its possible application to other areas, in particular that of Distributed Computing Infrastructures (DCI) interoperability, Science Gateways and Cloud architectures developments

    Facilitating evolution during design and implementation

    Get PDF
    The volumes and complexity of data that companies need to handle are increasing at an accelerating rate. In order to compete effectively and ensure their commercial sustainability, it is becoming crucial for them to achieve robust traceability in both their data and the evolving designs of their systems. This is addressed by the CRISTAL software which was originally developed at CERN by UWE, Bristol, for one of the particle detectors at the Large Hadron Collider, and has been subsequently transferred into the commercial world. Companies have been able to demonstrate increased agility, generate additional revenue, and improve the efficiency and cost-effectiveness with which they develop and implement systems in various areas, including business process management (BPM), healthcare and accounting applications. CRISTAL’s ability to manage data and its provenance at the terabyte scale, with full traceability over extended timescales, together with its description-driven approach, has provided the flexible adaptability required to future proof dynamically evolving software for these businesses

    Data provenance tracking as the basis for a biomedical virtual research environment

    Get PDF
    In complex data analyses it is increasingly important to capture information about the usage of data sets in addition to their preservation over time to ensure reproducibility of results, to verify the work of others and to ensure appropriate conditions data have been used for specific analyses. Scientific workflow based studies are beginning to realize the benefit of capturing this provenance of data and the activities used to process, transform and carry out studies on those data. This is especially true in biomedicine where the collection of data through experiment is costly and/or difficult to reproduce and where that data needs to be preserved over time. One way to support the development of workflows and their use in (collaborative) biomedical analyses is through the use of a Virtual Research Environment. The dynamic and distributed nature of Grid/Cloud computing, however, makes the capture and processing of provenance information a major research challenge. Furthermore most workflow provenance management services are designed only for data-flow oriented workflows and researchers are now realising that tracking data or workflows alone or separately is insufficient to support the scientific process. What is required for collaborative research is traceable and reproducible provenance support in a full orchestrated Virtual Research Environment (VRE) that enables researchers to define their studies in terms of the datasets and processes used, to monitor and visualize the outcome of their analyses and to log their results so that others users can call upon that acquired knowledge to support subsequent studies. We have extended the work carried out in the neuGRID and N4U projects in providing a so-called Virtual Laboratory to provide the foundation for a generic VRE in which sets of biomedical data (images, laboratory test results, patient records, epidemiological analyses etc.) and the workflows (pipelines) used to process those data, together with their provenance data and results sets are captured in the CRISTAL software. This paper outlines the functionality provided for a VRE by the Open Source CRISTAL software and examines how that can provide the foundations for a practice-based knowledge base for biomedicine and, potentially, for a wider research community
    corecore