Heterogeneous hierarchical workflow composition
Workflow systems promise scientists an automated end-to-end path from hypothesis to discovery. However, expecting any single workflow system to deliver such a wide range of capabilities is impractical. A more practical solution is to compose the end-to-end workflow from more than one system. With this goal in mind, the integration of task-based and in situ workflows is explored, where the result is a hierarchical heterogeneous workflow composed of subworkflows, with different levels of the hierarchy using different programming, execution, and data models. Materials science use cases demonstrate the advantages of such heterogeneous hierarchical workflow composition.

This work is a collaboration between Argonne National Laboratory and the Barcelona Supercomputing Center within the Joint Laboratory for Extreme-Scale Computing. This research is supported by the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research, under contract number DE-AC02-06CH11357, program manager Laura Biven; by the Spanish Government (SEV2015-0493); by the Spanish Ministry of Science and Innovation (contract TIN2015-65316-P); and by Generalitat de Catalunya (contract 2014-SGR-1051).

Peer Reviewed. Postprint (author's final draft).
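The composition idea above can be sketched in outline. The following is a minimal, hypothetical Python sketch (not the Argonne/BSC implementation, whose programming and data models differ): an outer task-based workflow whose individual tasks each run an inner in situ subworkflow and return only a reduced result, so bulk data never leaves the inner level.

```python
# Hypothetical sketch of hierarchical workflow composition: an outer
# task-based workflow whose tasks wrap inner in situ subworkflows.

def in_situ_subworkflow(params):
    """Inner level: a stubbed simulation coupled with in situ analysis.
    Only the reduced result crosses the level boundary."""
    simulated = [params["seed"] * step for step in range(params["steps"])]
    return max(simulated)  # in situ reduction instead of writing raw data

def task_based_outer_workflow(cases):
    """Outer level: independent tasks, one per science case, each
    delegating to an inner subworkflow with its own execution model."""
    return {name: in_situ_subworkflow(p) for name, p in cases.items()}

results = task_based_outer_workflow({
    "case_a": {"seed": 2, "steps": 5},
    "case_b": {"seed": 3, "steps": 4},
})
print(results)
```

The key design point mirrored here is that each hierarchy level keeps its own programming model, and only small results flow upward.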
ASCR/HEP Exascale Requirements Review Report
This draft report summarizes and details the findings, results, and
recommendations derived from the ASCR/HEP Exascale Requirements Review meeting
held in June 2015. The main conclusions are as follows: 1) Larger, more
capable computing and data facilities are needed to support HEP science goals
in all three frontiers: Energy, Intensity, and Cosmic. The expected scale of
the demand at the 2025 timescale is at least two orders of magnitude -- and in
some cases greater -- than that available currently. 2) The growth rate of data
produced by simulations is overwhelming the current ability of both facilities
and researchers to store and analyze it. Additional resources and new
techniques for data analysis are urgently needed. 3) Data rates and volumes
from HEP experimental facilities are also straining the ability to store and
analyze large and complex data volumes. Appropriately configured
leadership-class facilities can play a transformational role in enabling
scientific discovery from these datasets. 4) A close integration of HPC
simulation and data analysis will aid greatly in interpreting results from HEP
experiments. Such an integration will minimize data movement and facilitate
interdependent workflows. 5) Long-range planning between HEP and ASCR will be
required to meet HEP's research needs. To best use ASCR HPC resources the
experimental HEP program needs a) an established long-term plan for access to
ASCR computational and data resources, b) an ability to map workflows onto HPC
resources, c) the ability for ASCR facilities to accommodate workflows run by
collaborations that can have thousands of individual members, d) to transition
codes to the next-generation HPC platforms that will be available at ASCR
facilities, e) to build up and train a workforce capable of developing and
using simulations and analysis to support HEP scientific research on
next-generation systems.
Comment: 77 pages, 13 figures; draft report, subject to further revision.
Improving Performance of M-to-N Processing and Data Redistribution in In Transit Analysis and Visualization
In an in transit setting, a parallel data producer, such as a numerical simulation, runs on one set of ranks M, while a data consumer, such as a parallel visualization application, runs on a different set of ranks N. One of the central challenges in this in transit setting is to determine the mapping of data from the set of M producer ranks to the set of N consumer ranks. This is a challenging problem for several reasons, such as the producer and consumer codes potentially having different scaling characteristics and different data models. The resulting mapping from M to N ranks can have a significant impact on aggregate application performance. In this work, we present an approach for performing this M-to-N mapping in a way that has broad applicability across a diversity of data producer and consumer applications. We evaluate its design and performance with
a study that runs at high concurrency on a modern HPC platform. By leveraging design characteristics that facilitate an “intelligent” mapping from M to N, we observe that significant performance gains are possible in terms of several different metrics, including time-to-solution and the amount of data moved.
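The mapping problem can be made concrete with a naive baseline. The sketch below (a hypothetical helper, not the paper's “intelligent” mapping) assigns M producer ranks to N consumer ranks in contiguous blocks, which at least keeps spatially neighboring producers on the same consumer for spatially decomposed data:

```python
def block_map(m_producers: int, n_consumers: int) -> dict[int, list[int]]:
    """Assign each of M producer ranks to one of N consumer ranks in
    contiguous blocks. A naive baseline, not the paper's method."""
    mapping: dict[int, list[int]] = {i: [] for i in range(n_consumers)}
    for rank in range(m_producers):
        # Integer scaling keeps neighboring producer ranks on the same
        # consumer, which tends to reduce aggregate data movement.
        consumer = rank * n_consumers // m_producers
        mapping[consumer].append(rank)
    return mapping

print(block_map(8, 3))
```

A production mapping would additionally weigh the producer's domain decomposition and each side's scaling characteristics, which is where the gains reported above come from.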
Exploring power behaviors and trade-offs of in-situ data analytics
Pre-print. As scientific applications target exascale, challenges related to data and energy are becoming dominant concerns. For example, coupled simulation workflows are increasingly adopting in-situ data processing and analysis techniques to address the costs and overheads of data movement and I/O. However, it is also critical to understand these overheads and the associated trade-offs from an energy perspective. The goal of this paper is to explore data-related energy/performance trade-offs for end-to-end simulation workflows running at scale on current high-end computing systems. Specifically, this paper presents: (1) an analysis of the data-related behaviors of a combustion simulation workflow with an in-situ data analytics pipeline, running on the Titan system at ORNL; (2) a power model based on system power and data exchange patterns, which is empirically validated; and (3) the use of the model to characterize the energy behavior of the workflow and to explore energy/performance trade-offs on current as well as emerging systems.
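As a rough illustration of the kind of model involved (the paper's empirically validated model is not reproduced here; every coefficient below is hypothetical), one can estimate workflow energy as baseline system power over compute time plus an elevated power draw during data exchange:

```python
def workflow_energy(t_compute_s: float, bytes_moved: float,
                    p_system_w: float = 200.0,    # hypothetical baseline node power (W)
                    bandwidth_bps: float = 5e9,   # hypothetical transfer bandwidth (B/s)
                    p_transfer_w: float = 30.0) -> float:
    """Energy in joules: baseline power over compute time, plus
    baseline-plus-transfer power over the time spent moving data."""
    t_transfer_s = bytes_moved / bandwidth_bps
    return p_system_w * t_compute_s + (p_system_w + p_transfer_w) * t_transfer_s
```

Even this toy form exposes the trade-off the paper studies: in-situ analysis adds compute time but shrinks `bytes_moved`, and the energy balance between the two depends on the system's power and bandwidth characteristics.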
Diva: A Declarative and Reactive Language for In-Situ Visualization
The use of adaptive workflow management for in situ visualization and
analysis has been a growing trend in large-scale scientific simulations.
However, coordinating adaptive workflows with traditional procedural
programming languages can be difficult because system flow is determined by
unpredictable scientific phenomena, which often appear in an unknown order and
can evade event handling. This makes the implementation of adaptive workflows
tedious and error-prone. Recently, reactive and declarative programming
paradigms have been recognized as well-suited solutions to similar problems in
other domains. However, there is a dearth of research on adapting these
approaches to in situ visualization and analysis. With this paper, we present a
language design and runtime system for developing adaptive systems through a
declarative and reactive programming paradigm. We illustrate how an adaptive
workflow programming system is implemented using our approach and demonstrate
it with a use case from a combustion simulation.
Comment: 11 pages, 5 figures, 6 listings, 1 table, to be published in LDAV 2020. The article has gone through 2 major revisions: Emphasized contributions, features, and examples. Addressed connections between DIVA and FRP. In sec. 3, we fixed a design flaw and addressed it in sec. 3.3-3.4. Re-designed sec. 5 with a more concrete example and benchmark results. Simplified the syntax of DIVA.
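To illustrate the paradigm (in ordinary Python, not Diva's actual syntax): in a declarative, reactive style, analysts register what should happen when a condition on simulation state holds, and the runtime decides when, so phenomena that appear in an unknown order no longer need hand-written event plumbing:

```python
# Minimal illustration of declarative, reactive in situ triggers.
# This is NOT Diva's real API -- only the paradigm the paper describes.

triggers = []

def when(condition):
    """Declare an action to run whenever condition(state) is true."""
    def register(action):
        triggers.append((condition, action))
        return action
    return register

def on_step(state, fired):
    """Called by the simulation each step: evaluate every declared trigger."""
    for condition, action in triggers:
        if condition(state):
            fired.append(action(state))

@when(lambda s: s["max_temp"] > 1500.0)
def render_flame_front(state):
    # Hypothetical analysis action for a combustion use case.
    return f"render at step {state['step']}"

fired = []
on_step({"step": 1, "max_temp": 900.0}, fired)   # condition not yet met
on_step({"step": 2, "max_temp": 1600.0}, fired)  # phenomenon appears
print(fired)
```

The action fires only when the phenomenon actually appears, which is the property that makes the declarative form robust to unpredictable scientific events.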
Applications and Challenges of Real-time Mobile DNA Analysis
DNA sequencing is the process of identifying the exact order of
nucleotides within a given DNA molecule. New portable and relatively
inexpensive DNA sequencers, such as the Oxford Nanopore MinION, have the
potential to move DNA sequencing outside of the laboratory, leading to faster and more
accessible DNA-based diagnostics. However, portable DNA sequencing and analysis
are challenging for mobile systems, owing to high data throughputs and
computationally intensive processing performed in environments with unreliable
connectivity and power.
In this paper, we provide an analysis of the challenges that mobile systems
and mobile computing must address to maximize the potential of portable DNA
sequencing, and in situ DNA analysis. We explain the DNA sequencing process and
highlight the main differences between traditional and portable DNA sequencing
in the context of the actual and envisioned applications. We look at the
identified challenges from the perspective of both algorithms and systems
design, showing the need for careful co-design.
The science-policy interfaces of the European network for observing our changing planet: From Earth Observation data to policy-oriented decisions
This paper reports on major outcomes of the ERA-PLANET (The European network for observing our changing planet) project, which was funded under the Horizon 2020 ERA-NET co-funding scheme. ERA-PLANET strengthened the European Research Area in the domain of Earth Observation (EO) in coherence with European participation in the Group on Earth Observations (GEO) and Copernicus, the European Union's Earth Observation programme. ERA-PLANET was implemented through four projects focused on smart cities and resilient societies (SMURBS), resource efficiency and environmental management (GEOEssential), global changes and environmental treaties (iGOSP), and polar areas and natural resources (iCUPE). These projects developed specific science-policy workflows and interfaces to address selected environmental policy issues and to design cost-effective strategies aimed at achieving targeted objectives. Key Enabling Technologies were implemented to enhance the 'data to knowledge' transition in support of environmental policy making. Data cube technologies, the Virtual Earth Laboratory, Earth Observation ontologies, and Knowledge Platforms were developed and used for such applications.

SMURBS brought a substantial contribution to the resilient cities and human settlements topic, which was adopted by GEO as its 4th engagement priority, bringing urban resilience into the GEO agenda on par with climate change, sustainable development, and disaster risk reduction linked to environmental policies. GEOEssential is contributing to the development of the Essential Variables (EVs) concept, which is encouraging and should allow the EO community to complete the description of the Earth System with EVs in the near future. This will clearly improve our capacity to address intertwined environmental and development policies as a nexus.

iGOSP supports the implementation of the GEO Flagship on Mercury (GOS4M) and the GEO Initiative on POPs (GOS4POPs) by developing a new integrated approach for global real-time monitoring of environmental quality with respect to contamination of air, water, and human matrices by toxic substances, such as mercury and persistent organic pollutants. iGOSP developed end-user-oriented Knowledge Hubs that provide data repository systems integrated with data management consoles and knowledge information systems.

The main outcomes from iCUPE are novel and comprehensive data sets and a modelling activity that contributed to delivering science-based insights for the Arctic region. Applications enable the definition and monitoring of Arctic Essential Variables and set up processes towards the UN 2030 SDGs, including health (SDG 3) and clean water resources and sanitation (SDGs 6 and 14).

Peer reviewed.