307 research outputs found

    Towards Interoperable Research Infrastructures for Environmental and Earth Sciences

    This open access book summarises the latest developments on data management in the EU H2020 ENVRIplus project, which brought together more than 20 environmental and Earth science research infrastructures into a single community. It provides readers with a systematic overview of the common challenges faced by research infrastructures and of how a ‘reference model guided’ engineering approach can be used to achieve greater interoperability among such infrastructures in the environmental and Earth sciences. The 20 contributions in this book are structured in 5 parts on the design, development, deployment, operation and use of research infrastructures. Part one provides an overview of the state of the art of research infrastructure and relevant e-Infrastructure technologies; part two discusses the reference model guided engineering approach; part three presents the software and tools developed for common data management challenges; part four demonstrates the software via several use cases; and part five discusses sustainability and future directions.

    Active provenance for Data-Intensive workflows: engaging users and developers

    We present a practical approach to provenance capture in Data-Intensive workflow systems. It provides contextualisation by recording injected domain metadata with the provenance stream. It offers control over lineage precision, combining automation with specified adaptations. We address provenance tasks such as extraction of domain metadata, injection of custom annotations, accuracy and integration of records from multiple independent workflows running in distributed contexts. To allow such flexibility, we introduce the concepts of programmable Provenance Types and Provenance Configuration. Provenance Types handle domain contextualisation and allow developers to model lineage patterns by re-defining API methods, composing easy-to-use extensions. Provenance Configuration, in turn, enables users of a Data-Intensive workflow execution to prepare it for provenance capture by configuring the attribution of Provenance Types to components and by specifying grouping into semantic clusters. This enables better searches over the lineage records. Provenance Types and Provenance Configuration are demonstrated in a system being used by computational seismologists. It is based on an extended provenance model, S-PROV. Published: San Diego (CA, USA). 3IT. Calcolo scientific

    Science Gateways and AI/ML: How Can Gateway Concepts and Solutions Meet the Needs in Data Science?

    Science gateways are a crucial component of critical infrastructure as they provide the means for users to focus on their topics and methods instead of the technical details of the infrastructure. They are defined as end-to-end solutions for accessing data, software, computing services, sensors, and equipment specific to the needs of a science or engineering discipline, and their goal is to hide the complexity of the underlying infrastructure. Science gateways are often called Virtual Research Environments in Europe and Virtual Labs in Australasia; we consider these two terms to be synonymous with science gateways. Over the past decade, artificial intelligence (AI) and machine learning (ML) have found applications in many different fields in private industry, and private industry has reaped the benefits. Likewise, in the academic realm, large-scale data science applications have learned to apply public high-performance computing resources to make use of this technology. However, academic and research science gateways have yet to fully adopt the tools of AI. There is an opportunity in the gateways space both to increase the visibility and accessibility of AI/ML applications and to enable researchers and developers to advance the field of science gateway cyberinfrastructure itself. Harnessing AI/ML is recognized as a high priority by the science gateway community. It is, therefore, critical for the next generation of science gateways to adapt to support the AI/ML that is already transforming many scientific fields. The goal is to increase collaborations between the two fields and to ensure that gateway services are used by, and are valuable to, the AI/ML community. This chapter presents state-of-the-art examples and areas of opportunity for the science gateways community to pursue in relation to AI/ML, together with a vision of where these new capabilities might impact science gateways and support scientific research.

    Web technologies for environmental big data

    Recent evolutions in computing science and web technology provide the environmental community with continuously expanding resources for data collection and analysis that pose unprecedented challenges to the design of analysis methods, workflows, and interaction with data sets. In the light of the recent UK Research Council funded Environmental Virtual Observatory pilot project, this paper gives an overview of currently available implementations of web-based technologies for processing large and heterogeneous datasets and discusses their relevance within the context of environmental data processing, simulation and prediction. We found that processing the simple datasets used in the pilot proved relatively straightforward using a combination of R, RPy2, PyWPS and PostgreSQL. However, the use of NoSQL databases and more versatile frameworks, such as OGC standard based implementations, may provide a wider and more flexible set of features that particularly facilitate working with larger volumes and more heterogeneous data sources.