1,416 research outputs found

    Modeling and Implementation of an Asynchronous Approach to Integrating HPC and Big Data Analysis

    Get PDF
    With the emergence of exascale computing and big data analytics, many important scientific applications require the integration of computationally intensive modeling and simulation with data-intensive analysis to accelerate scientific discovery. In this paper, we create an analytical model to steer the optimization of the end-to-end time-to-solution for the integrated computation and data analysis. We also design and develop an intelligent data broker to efficiently intertwine the computation stage and the analysis stage to practically achieve the optimal time-to-solution predicted by the analytical model. We perform experiments on both synthetic applications and real-world computational fluid dynamics (CFD) applications. The experiments show that the analytic model exhibits an average relative error of less than 10%, and the application performance can be improved by up to 131% for the synthetic programs and by up to 78% for the real-world CFD application

    A component-based framework for certification of components in a cloud of HPC services

    Get PDF
    HPC Shelfis a proposal of a cloud computing platform to provide component-oriented services for High Performance Computing (HPC) applications. This paper presents a Verification-as-a-Service (VaaS) framework for component certification onHPC Shelf. Certification is aimed at providing higher confidence that components of parallel computing systems ofHPC Shelfbehave as expected according to one or more requirements expressed in their contracts. To this end, new abstractions are introduced, starting with certifier components. They are designed to inspect other components and verify them for different types of functional, non-functional and behavioral requirements. The certification framework is naturally based on parallel computing techniques to speed up verification tasks.NORTE-01-0145- FEDER-000037

    Towards Lightweight Data Integration using Multi-workflow Provenance and Data Observability

    Full text link
    Modern large-scale scientific discovery requires multidisciplinary collaboration across diverse computing facilities, including High Performance Computing (HPC) machines and the Edge-to-Cloud continuum. Integrated data analysis plays a crucial role in scientific discovery, especially in the current AI era, by enabling Responsible AI development, FAIR, Reproducibility, and User Steering. However, the heterogeneous nature of science poses challenges such as dealing with multiple supporting tools, cross-facility environments, and efficient HPC execution. Building on data observability, adapter system design, and provenance, we propose MIDA: an approach for lightweight runtime Multi-workflow Integrated Data Analysis. MIDA defines data observability strategies and adaptability methods for various parallel systems and machine learning tools. With observability, it intercepts the dataflows in the background without requiring instrumentation while integrating domain, provenance, and telemetry data at runtime into a unified database ready for user steering queries. We conduct experiments showing end-to-end multi-workflow analysis integrating data from Dask and MLFlow in a real distributed deep learning use case for materials science that runs on multiple environments with up to 276 GPUs in parallel. We show near-zero overhead running up to 100,000 tasks on 1,680 CPU cores on the Summit supercomputer.Comment: 10 pages, 5 figures, 2 Listings, 42 references, Paper accepted at IEEE eScience'2

    Why High-Performance Modelling and Simulation for Big Data Applications Matters

    Get PDF
    Modelling and Simulation (M&S) offer adequate abstractions to manage the complexity of analysing big data in scientific and engineering domains. Unfortunately, big data problems are often not easily amenable to efficient and effective use of High Performance Computing (HPC) facilities and technologies. Furthermore, M&S communities typically lack the detailed expertise required to exploit the full potential of HPC solutions while HPC specialists may not be fully aware of specific modelling and simulation requirements and applications. The COST Action IC1406 High-Performance Modelling and Simulation for Big Data Applications has created a strategic framework to foster interaction between M&S experts from various application domains on the one hand and HPC experts on the other hand to develop effective solutions for big data applications. One of the tangible outcomes of the COST Action is a collection of case studies from various computing domains. Each case study brought together both HPC and M&S experts, giving witness of the effective cross-pollination facilitated by the COST Action. In this introductory article we argue why joining forces between M&S and HPC communities is both timely in the big data era and crucial for success in many application domains. Moreover, we provide an overview on the state of the art in the various research areas concerned
    corecore