1,069,509 research outputs found

    Interoperability and FAIRness through a novel combination of Web technologies

    Get PDF
    Data in the life sciences are extremely diverse and are stored in a broad spectrum of repositories ranging from those designed for particular data types (such as KEGG for pathway data or UniProt for protein data) to those that are general-purpose (such as FigShare, Zenodo, Dataverse or EUDAT). These data have widely different levels of sensitivity and security considerations. For example, clinical observations about genetic mutations in patients are highly sensitive, while observations of species diversity are generally not. The lack of uniformity in data models from one repository to another, and in the richness and availability of metadata descriptions, makes integration and analysis of these data a manual, time-consuming task with no scalability. Here we explore a set of resource-oriented Web design patterns for data discovery, accessibility, transformation, and integration that can be implemented by any general- or special-purpose repository as a means to assist users in finding and reusing their data holdings. We show that by using off-the-shelf technologies, interoperability can be achieved atthe level of an individual spreadsheet cell. We note that the behaviours of this architecture compare favourably to the desiderata defined by the FAIR Data Principles, and can therefore represent an exemplar implementation of those principles. The proposed interoperability design patterns may be used to improve discovery and integration of both new and legacy data, maximizing the utility of all scholarly outputs

    Mass storage system experiences and future needs at the National Center for Atmospheric Research

    Get PDF
    This presentation is designed to relate some of the experiences of the Scientific Computing Division at NCAR dealing with the 'data problem'. A brief history and a development of some basic Mass Storage System (MSS) principles are given. An attempt is made to show how these principles apply to the integration of various components into NCAR's MSS. There is discussion of future MSS needs for future computing environments

    Tuning of MC generator MPI models

    Full text link
    MC models of multiple partonic scattering inevitably introduce many free parameters, either fundamental to the models or from their integration with MC treatments of primary-scattering evolution. This non-perturbative and non-factorisable physics in particular cannot currently be constrained from theoretical principles, and hence parameter optimisation against experimental data is required. This process is commonly referred to as MC tuning. We summarise the principles, problems and history of MC tuning, and the still-evolving modern approach to both model optimisation and estimation of modelling uncertainties.Comment: Contributed chapter to "Multiple Parton Interactions at the LHC", World Scientific 201

    Mass storage system experiences and future needs at the National Center for Atmospheric Research

    Get PDF
    A summary and viewgraphs of a discussion presented at the National Space Science Data Center (NSSDC) Mass Storage Workshop is included. Some of the experiences of the Scientific Computing Division at the National Center for Atmospheric Research (NCAR) dealing the the 'data problem' are discussed. A brief history and a development of some basic mass storage system (MSS) principles are given. An attempt is made to show how these principles apply to the integration of various components into NCAR's MSS. Future MSS needs for future computing environments is discussed

    Linked education: interlinking educational resources and the web of data

    Get PDF
    Research on interoperability of technology-enhanced learning (TEL) repositories throughout the last decade has led to a fragmented landscape of competing approaches, such as metadata schemas and interface mechanisms. However, so far Web-scale integration of resources is not facilitated, mainly due to the lack of take-up of shared principles, datasets and schemas. On the other hand, the Linked Data approach has emerged as the de-facto standard for sharing data on the Web and offers a large potential to solve interoperability issues in the field of TEL. In this paper, we describe a general approach to exploit the wealth of already existing TEL data on the Web by allowing its exposure as Linked Data and by taking into account automated enrichment and interlinking techniques to provide rich and well-interlinked data for the educational domain. This approach has been implemented in the context of the mEducator project where data from a number of open TEL data repositories has been integrated, exposed and enriched by following Linked Data principles

    Enhancing reuse of data and biological material in medical research : from FAIR to FAIR-Health

    Get PDF
    The known challenge of underutilization of data and biological material from biorepositories as potential resources formedical research has been the focus of discussion for over a decade. Recently developed guidelines for improved data availability and reusability—entitled FAIR Principles (Findability, Accessibility, Interoperability, and Reusability)—are likely to address only parts of the problem. In this article,we argue that biologicalmaterial and data should be viewed as a unified resource. This approach would facilitate access to complete provenance information, which is a prerequisite for reproducibility and meaningful integration of the data. A unified view also allows for optimization of long-term storage strategies, as demonstrated in the case of biobanks.Wepropose an extension of the FAIR Principles to include the following additional components: (1) quality aspects related to research reproducibility and meaningful reuse of the data, (2) incentives to stimulate effective enrichment of data sets and biological material collections and its reuse on all levels, and (3) privacy-respecting approaches for working with the human material and data. These FAIR-Health principles should then be applied to both the biological material and data. We also propose the development of common guidelines for cloud architectures, due to the unprecedented growth of volume and breadth of medical data generation, as well as the associated need to process the data efficiently.peer-reviewe
    corecore