Heterogeneous hierarchical workflow composition
Workflow systems promise scientists an automated end-to-end path from hypothesis to discovery. However, expecting any single workflow system to deliver such a wide range of capabilities is impractical. A more practical solution is to compose the end-to-end workflow from more than one system. With this goal in mind, the integration of task-based and in situ workflows is explored, resulting in a hierarchical heterogeneous workflow composed of subworkflows, with different levels of the hierarchy using different programming, execution, and data models. Materials science use cases demonstrate the advantages of such heterogeneous hierarchical workflow composition. This work is a collaboration between Argonne National Laboratory and the Barcelona Supercomputing Center within the Joint Laboratory for Extreme-Scale Computing. This research is supported by the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research, under contract number DE-AC02-06CH11357, program manager Laura Biven; by the Spanish Government (SEV2015-0493); by the Spanish Ministry of Science and Innovation (contract TIN2015-65316-P); and by Generalitat de Catalunya (contract 2014-SGR-1051).
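To make the composition pattern concrete, here is a minimal sketch in plain Python, with hypothetical stand-ins for both layers (the paper's actual workflow systems are not modeled): an outer task-based workflow fans out independent tasks, and each task runs an inner in situ subworkflow that couples a simulation loop with an analysis loop through an in-memory queue instead of files.

```python
# A hedged sketch of a two-level heterogeneous workflow. Both layers are
# hypothetical stand-ins: the outer level uses a process pool as the
# task-based model, and the inner level couples simulation and analysis
# threads through a bounded in-memory queue (the in situ model).
from concurrent.futures import ProcessPoolExecutor
import queue
import threading

def in_situ_subworkflow(param: int) -> int:
    """Inner subworkflow: simulation and analysis share a staging queue."""
    q = queue.Queue(maxsize=4)            # bounded buffer between the stages
    results = []

    def simulate():
        for step in range(10):
            q.put(param * step)           # produce one "field" per timestep
        q.put(None)                       # end-of-stream marker

    def analyze():
        while (item := q.get()) is not None:
            results.append(item ** 2)     # toy in situ reduction

    threads = [threading.Thread(target=simulate),
               threading.Thread(target=analyze)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(results)

if __name__ == "__main__":
    # Outer level: the task-based workflow maps subworkflows over parameters.
    with ProcessPoolExecutor(max_workers=4) as pool:
        print(list(pool.map(in_situ_subworkflow, [1, 2, 3, 4])))
```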
Plasma Edge Kinetic-MHD Modeling in Tokamaks Using Kepler Workflow for Code Coupling, Data Management and Visualization
A new predictive computer simulation tool targeting the development of the H-mode pedestal at the plasma edge in tokamaks and the triggering and dynamics of edge localized modes (ELMs) is presented in this report. This tool brings together, in a coordinated and effective manner, several first-principles physics simulation codes, stability analysis packages, and data processing and visualization tools. A Kepler workflow is used to carry out an edge plasma simulation that loosely couples the kinetic code XGC0 with an ideal MHD linear stability analysis code, ELITE, and an extended MHD initial-value code such as M3D or NIMROD. XGC0 includes the neoclassical ion-electron-neutral dynamics needed to simulate pedestal growth near the separatrix. The Kepler workflow processes the XGC0 simulation results into simple images that can be selected and displayed via the Dashboard, a monitoring tool implemented in AJAX that allows the scientist to track computational resources, examine running and archived jobs, and view key physics data, all within a standard Web browser. The XGC0 simulation is monitored for the conditions needed to trigger an ELM crash by periodically assessing the edge plasma pressure and current density profiles with the ELITE code. If an ELM crash is triggered, the Kepler workflow launches the M3D code on a moderate-size Opteron cluster to simulate the nonlinear ELM crash and to compute the relaxation of plasma profiles after the crash. This process is monitored through periodic outputs of plasma fluid quantities that are automatically visualized with AVS/Express and may be displayed on the Dashboard. Finally, the Kepler workflow archives all data outputs and processed images using HPSS, along with provenance information about the software and hardware used to create the simulation. The complete process of preparing, executing, and monitoring a coupled-code simulation of the edge pressure pedestal buildup and the ELM cycle using the Kepler scientific workflow system is described in this paper.
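The coupling logic amounts to a monitor-and-trigger loop around the running simulation. The sketch below illustrates that loop with subprocess calls; the executable names, the profile file, and the convention that ELITE signals instability through its exit code are assumptions made for illustration, not the actual Kepler actors or code interfaces.

```python
# A hedged sketch of the loose-coupling control loop described above.
# All commands and file names are illustrative assumptions.
import subprocess
import time

XGC0_CMD  = ["./xgc0"]    # kinetic edge code (assumed launch command)
ELITE_CMD = ["./elite"]   # linear MHD stability analysis (assumed)
M3D_CMD   = ["./m3d"]     # nonlinear extended-MHD code (assumed)

def elm_triggered(profiles: str) -> bool:
    """Run ELITE on the current edge profiles; assume a nonzero exit
    code marks an unstable, ELM-triggering profile."""
    return subprocess.run(ELITE_CMD + [profiles]).returncode != 0

def control_loop(check_interval_s: float = 60.0) -> None:
    sim = subprocess.Popen(XGC0_CMD)            # pedestal buildup runs long
    while sim.poll() is None:                   # while XGC0 is still running
        time.sleep(check_interval_s)
        if elm_triggered("edge_profiles.dat"):  # periodic stability check
            sim.terminate()                     # hand the state off to M3D
            subprocess.run(M3D_CMD + ["--restart", "edge_profiles.dat"])
            break
```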
Performance analysis and optimization of in-situ integration of simulation with data analysis: zipping applications up
This paper targets an important class of applications that combine HPC simulations with data analysis for online or real-time scientific discovery. We use state-of-the-art parallel I/O and data-staging libraries to build simulation-time data analysis workflows, and conduct performance analysis with real-world computational fluid dynamics (CFD) and molecular dynamics (MD) simulations. Driven by in-depth analysis of performance inefficiencies, we design an end-to-end, application-level approach that eliminates the interlocks and synchronizations present in existing methods. Our approach employs both task parallelism and pipeline parallelism to reduce synchronization effectively. In addition, we design a fully asynchronous, fine-grained, pipelined runtime system named Zipper. Zipper is a multi-threaded distributed runtime system that executes in a layer below the simulation and analysis applications. To further reduce the simulation application's stall time and enhance data transfer performance, we design a concurrent data transfer optimization that uses both the HPC network and the parallel file system for improved bandwidth. The scalability of the Zipper system has been verified by a performance model and various empirical large-scale experiments. Experimental results on an Intel multicore cluster as well as a Knights Landing HPC system demonstrate that the Zipper-based approach can outperform the fastest state-of-the-art I/O transport library by up to 220% using 13,056 processor cores.
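The essence of the interlock elimination can be shown in a few lines: a bounded staging queue lets the simulation hand off each timestep asynchronously to a pool of analysis workers, so the producer stalls only when the buffer is full. This sketch illustrates the combined task/pipeline parallelism pattern, not the Zipper runtime itself.

```python
# Pattern sketch: overlap simulation and analysis with a bounded queue
# (pipeline parallelism) consumed by several workers (task parallelism).
import queue
import threading

STAGE = queue.Queue(maxsize=8)       # staging buffer between the two stages
NUM_ANALYZERS = 4

def simulation(steps: int) -> None:
    for step in range(steps):
        field = [step] * 1024         # stand-in for one timestep of output
        STAGE.put(field)              # async hand-off; blocks only when full
    for _ in range(NUM_ANALYZERS):
        STAGE.put(None)               # one end-marker per analysis worker

def analysis() -> None:
    while (field := STAGE.get()) is not None:
        _ = sum(field)                # toy analysis kernel

workers = [threading.Thread(target=analysis) for _ in range(NUM_ANALYZERS)]
for w in workers:
    w.start()
simulation(steps=100)
for w in workers:
    w.join()
```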
Recent EUROfusion Achievements in Support of Computationally Demanding Multiscale Fusion Physics Simulations and Integrated Modeling
Integrated modeling (IM) of present experiments and future tokamak reactors requires the provision of computational resources and numerical tools capable of simulating multiscale spatial phenomena as well as fast transient events and relatively slow plasma evolution within a reasonably short computational time. Recent progress achieved within the EUROfusion Consortium is presented: the implementation of new computational resources for fusion applications in Europe based on modern supercomputer technologies (the MARCONI-FUSION supercomputer), the optimization and speedup of EU fusion-related first-principles codes, and the development of a basis for integrating physics codes/modules into a centrally maintained suite of IM tools. Physics phenomena that can now be reasonably modelled in various areas (core turbulence and magnetic reconnection, edge and scrape-off layer physics, radio-frequency heating and current drive, magnetohydrodynamic modeling, reflectometry simulations) following successful code optimization and parallelization are briefly described. Development activities in support of IM are summarized. They include support for (1) the local deployment of the IM infrastructure and access to experimental data at various host sites, (2) the management of releases for sophisticated IM workflows involving a large number of components, and (3) the performance optimization of complex IM workflows. This work has been carried out within the framework of the EUROfusion Consortium and has received funding from the Euratom research and training programme 2014-2018 under grant agreement 633053. The views and opinions expressed herein do not necessarily reflect those of the European Commission or ITER.
A FLEXIBLE APPROACH FOR ORCHESTRATING ADAPTIVE SCIENTIFIC WORKFLOWS FOR SCALABLE COMPUTING
Modern scientific workflows are becoming complex with the incorporation of non-traditional computation methods and advances in technologies enabling on-the-fly analysis. These workflows exhibit unpredictable runtime behaviors and have dynamic requirements. For example, such workflows must maintain overall performance and throughput while dealing with undesired events, adapting to failures, and supporting data-driven adaptive analysis. A fixed, predetermined resource assignment, common on HPC machines, is inefficient for overall performance, throughput, and data-driven adaptive analysis. While solutions exist to enable elastic resource management, there is no support that can manage workflows at runtime to determine when the resource assignment and/or the runtime state of tasks (i.e., stopping, starting, changing task parameters to adapt the analysis, or changing how data is sent and received by workflow tasks) needs to be revised, and then perform the feasible changes at runtime accordingly.

This dissertation provides a flexible and portable model, DYFLOW, with strategies to automate the management of scalable and adaptive workflows. The model gathers runtime statistics, tracks the occurrence of important events, and finalizes a plan of action to execute in response to the events that occurred, mediating between suggested actions with respect to the running state of the workflow tasks and resource availability. Further, the model supports a wide range of constructs and tunable parameters that allow users to express events of interest, select prospective responses, and set various preferences that define the service expectation, e.g., throughput, performance, resilience to failures, or quality of results. To show that the DYFLOW model supports the adaptive functionality desired for emerging workflows, several examples of problematic behavior are demonstrated in which DYFLOW accommodates the specific requirements and automates the runtime management process for scientists while delivering the desired quality of service.
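The mediation step described above, matching observed events against user-expressed rules and picking a feasible response, can be sketched as follows. The Event and Rule classes and the action names are hypothetical illustrations, not DYFLOW's actual interfaces.

```python
# A hedged sketch of event-to-response mediation: user rules express events
# of interest and prospective responses; the mediator picks the highest-
# priority applicable response if resources permit. Names are hypothetical.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Event:
    kind: str                             # e.g., "task_failed", "throughput_low"
    task: str

@dataclass
class Rule:
    matches: Callable[[Event], bool]      # user-expressed event of interest
    action: Callable[[Event], str]        # prospective response
    priority: int                         # user preference among responses

def mediate(event: Event, rules: list[Rule], resources_free: int) -> str:
    """Choose the highest-priority applicable response, honoring resources."""
    applicable = [r for r in rules if r.matches(event)]
    if applicable and resources_free > 0:       # feasibility check
        return max(applicable, key=lambda r: r.priority).action(event)
    return "defer"                              # no feasible change right now

rules = [
    Rule(lambda e: e.kind == "task_failed",
         lambda e: f"restart {e.task}", priority=2),
    Rule(lambda e: e.kind == "throughput_low",
         lambda e: f"scale_up {e.task}", priority=1),
]
print(mediate(Event("task_failed", "analysis_3"), rules, resources_free=2))
```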
Exploring power behaviors and trade-offs of in-situ data analytics
As scientific applications target exascale, challenges related to data and energy are becoming dominant concerns. For example, coupled simulation workflows are increasingly adopting in-situ data processing and analysis techniques to address the costs and overheads of data movement and I/O. However, it is also critical to understand these overheads and the associated trade-offs from an energy perspective. The goal of this paper is to explore data-related energy/performance trade-offs for end-to-end simulation workflows running at scale on current high-end computing systems. Specifically, this paper presents: (1) an analysis of the data-related behaviors of a combustion simulation workflow with an in-situ data analytics pipeline, running on the Titan system at ORNL; (2) a power model based on system power and data exchange patterns, which is empirically validated; and (3) the use of the model to characterize the energy behavior of the workflow and to explore energy/performance trade-offs on current as well as emerging systems.
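The shape of such a power model can be illustrated with a toy calculation: total energy as node power integrated over compute and stall phases plus a per-byte cost for data movement. The phase decomposition and the coefficient values below are assumptions for illustration, not the paper's fitted model.

```python
# A hedged sketch of a phase-based energy model: E = P_compute * t_compute
# + P_idle * t_stall + e_byte * bytes_moved. All numbers are made up to
# illustrate the in situ vs. post hoc trade-off, not measured values.
def workflow_energy_j(p_compute_w: float, t_compute_s: float,
                      p_idle_w: float, t_stall_s: float,
                      e_per_byte_j: float, bytes_moved: float) -> float:
    return (p_compute_w * t_compute_s
            + p_idle_w * t_stall_s
            + e_per_byte_j * bytes_moved)

# In situ analysis lengthens compute but removes most I/O bytes and stalls.
post_hoc = workflow_energy_j(300.0, 1000.0, 120.0, 200.0, 1e-8, 5e12)
in_situ  = workflow_energy_j(300.0, 1200.0, 120.0,  20.0, 1e-8, 5e10)
print(f"post hoc: {post_hoc:.2e} J, in situ: {in_situ:.2e} J")
```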
Purdue Contribution of Fusion Simulation Program
The overall science goal of the FSP is to develop predictive simulation capability for magnetically confined fusion plasmas at an unprecedented level of integration and fidelity. This will directly support and enable effective U.S. participation in research related to the International Thermonuclear Experimental Reactor (ITER) and the overall mission of delivering practical fusion energy. The FSP will address a rich set of scientific issues together with experimental programs, producing validated integrated physics results. This is well aligned with the mission of the ITER Organization to coordinate with its members the integrated modeling and control of fusion plasmas, including benchmarking and validation activities [1]. Initial FSP research will focus on two critical areas: (1) the plasma edge and (2) whole device modeling, including disruption avoidance. The first of these problems involves the narrow plasma boundary layer and its complex interactions with the plasma core and the surrounding material wall. The second requires development of a computationally tractable but comprehensive model that describes all equilibrium and dynamic processes at a sufficient level of detail to provide useful prediction of the temporal evolution of fusion plasma experiments. The initial driver for the whole device model (WDM) will be prediction and avoidance of discharge-terminating disruptions, especially at high performance, which are a critical impediment to successful operation of machines like ITER. If disruptions cannot be avoided, their associated dynamics and effects will be addressed in the next phase of the FSP. The FSP plan targets the needed modeling capabilities by developing Integrated Science Applications (ISAs) specific to those needs. The Pedestal-Boundary model will include boundary magnetic topology, cross-field transport of multi-species plasmas, parallel plasma transport, neutral transport, atomic physics, and interactions with the plasma wall. It will address the origins and structure of the plasma electric field, rotation, the L-H transition, and the wide variety of pedestal relaxation mechanisms. The Whole Device Model will predict the entire discharge evolution given external actuators (i.e., magnets, power supplies, heating, current drive, and fueling systems) and control strategies. Based on components operating over a range of physics fidelity, the WDM will model the plasma equilibrium, plasma sources, profile evolution, linear stability, and nonlinear evolution toward a disruption (but not the full disruption dynamics). The plan assumes that, as the FSP matures and demonstrates success, the program will evolve and grow, enabling additional science problems to be addressed. The next set of integration opportunities could include: (1) simulation of disruption dynamics and their effects; (2) prediction of core profiles, including 3D effects, mesoscale dynamics, and integration with the edge plasma; and (3) computation of non-thermal particle distributions, self-consistent with fusion, radio frequency (RF) and neutral beam injection (NBI) sources, magnetohydrodynamics (MHD), and short-wavelength turbulence.
Simulation Intelligence: Towards a New Generation of Scientific Methods
The original "Seven Motifs" set forth a roadmap of essential methods for the
field of scientific computing, where a motif is an algorithmic method that
captures a pattern of computation and data movement. We present the "Nine
Motifs of Simulation Intelligence", a roadmap for the development and
integration of the essential algorithms necessary for a merger of scientific
computing, scientific simulation, and artificial intelligence. We call this
merger simulation intelligence (SI), for short. We argue the motifs of
simulation intelligence are interconnected and interdependent, much like the
components within the layers of an operating system. Using this metaphor, we
explore the nature of each layer of the simulation intelligence operating
system stack (SI-stack) and the motifs therein: (1) Multi-physics and
multi-scale modeling; (2) Surrogate modeling and emulation; (3)
Simulation-based inference; (4) Causal modeling and inference; (5) Agent-based
modeling; (6) Probabilistic programming; (7) Differentiable programming; (8)
Open-ended optimization; (9) Machine programming. We believe coordinated
efforts between motifs offers immense opportunity to accelerate scientific
discovery, from solving inverse problems in synthetic biology and climate
science, to directing nuclear energy experiments and predicting emergent
behavior in socioeconomic settings. We elaborate on each layer of the SI-stack,
detailing the state-of-art methods, presenting examples to highlight challenges
and opportunities, and advocating for specific ways to advance the motifs and
the synergies from their combinations. Advancing and integrating these
technologies can enable a robust and efficient hypothesis-simulation-analysis
type of scientific method, which we introduce with several use-cases for
human-machine teaming and automated science
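As a concrete taste of one motif, the sketch below implements simulation-based inference in its simplest form, rejection approximate Bayesian computation: prior draws whose simulated outputs land close to the observation are kept as approximate posterior samples. This is a generic illustration of the motif, not a method proposed in the paper.

```python
# Rejection ABC: simulation-based inference without a tractable likelihood.
# Keep parameter draws whose simulated output is within eps of the observation.
import random

def simulator(theta: float) -> float:
    """Stand-in stochastic simulator: a noisy observation of theta."""
    return theta + random.gauss(0.0, 0.5)

def rejection_abc(observed: float, n_draws: int, eps: float) -> list[float]:
    posterior = []
    for _ in range(n_draws):
        theta = random.uniform(-5.0, 5.0)           # draw from a uniform prior
        if abs(simulator(theta) - observed) < eps:  # accept close simulations
            posterior.append(theta)
    return posterior

samples = rejection_abc(observed=1.3, n_draws=100_000, eps=0.1)
print(len(samples), sum(samples) / len(samples))    # posterior mean ~ 1.3
```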