197,865 research outputs found

    Post-processing and visualisation of large-scale DEM simulation data with the open-source VELaSSCo platform

    Regardless of its origin, in the near future the challenge will not be how to generate data, but rather how to manage big, highly distributed data so that it can be handled easily and accessed by users on their personal devices. VELaSSCo (Visualization for Extremely Large-Scale Scientific Computing) is a platform developed to provide new visual analysis methods for large-scale simulations, serving the petabyte era. The platform adopts Big Data tools and architectures to enable in-situ analytics of engineering and scientific data and hardware-accelerated interactive visualization. In large-scale simulations, the domain is partitioned across several thousand nodes, and the data (mesh and results) are stored on those nodes in a distributed manner. The VELaSSCo platform accesses this distributed information, processes the raw data, and returns the results to users for local visualization with their specific visualization clients and tools. The global goal of VELaSSCo is to provide Big Data tools for the engineering and scientific community, in order to better manipulate simulations with billions of distributed records. The ability to easily handle large amounts of data will also enable larger, higher-resolution simulations, allowing the scientific and engineering communities to garner new knowledge from simulations previously considered too large to handle. This paper shows, by means of selected Discrete Element Method (DEM) simulation use cases, that the VELaSSCo platform facilitates distributed post-processing and visualization of large engineering datasets.
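    To make the access pattern described above concrete, the sketch below illustrates the general query shape: the client sends a small request describing which stored result it wants and what server-side reduction to apply, and receives back a reduced, visualization-ready payload rather than the raw distributed data. All names here (PlatformClient, submit_query, the endpoint URL) are illustrative assumptions, not the actual VELaSSCo API.

```python
# Hypothetical sketch of the query pattern: heavy data stays on the platform's
# storage nodes; the client only receives a reduced result for local visualization.
# Names (PlatformClient, submit_query, endpoint) are illustrative assumptions.
import json
from dataclasses import dataclass

@dataclass
class Query:
    model_id: str        # which stored simulation to query
    result_name: str     # e.g. "particle_velocity"
    time_step: float
    operation: str       # server-side reduction, e.g. "coarse_mesh"

class PlatformClient:
    """Toy stand-in for a client talking to a remote post-processing platform."""
    def __init__(self, endpoint: str):
        self.endpoint = endpoint

    def submit_query(self, query: Query) -> dict:
        # In a real deployment this would be a remote call (RPC or REST);
        # here we only show the shape of the exchange.
        payload = json.dumps(query.__dict__)
        print(f"POST {self.endpoint} <- {payload}")
        # The platform would run the reduction next to the data and return a
        # small, visualization-ready result instead of the raw dataset.
        return {"vertices": [], "values": [], "metadata": {"reduced_from": "raw DEM partitions"}}

client = PlatformClient("https://platform.example/api/query")  # hypothetical endpoint
result = client.submit_query(Query("dem_silo_discharge", "particle_velocity", 0.25, "coarse_mesh"))
print(result["metadata"])
```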

    A Real-Time Machine Learning and Visualization Framework for Scientific Workflows

    High-performance computing resources are now widely used in science and engineering. Typical post-hoc approaches use persistent storage to save the data produced by a simulation, so reading from storage into memory is required for data analysis tasks. For large-scale scientific simulations, such I/O operations introduce significant overhead. In-situ/in-transit approaches bypass this I/O by accessing and processing in-memory simulation results directly, which suggests that simulations and analysis applications should be more closely coupled. This paper constructs a flexible and extensible framework that connects scientific simulations with multi-step machine learning processes and in-situ visualization tools, thus providing plug-in analysis and visualization functionality over complex workflows in real time. A distributed simulation-time clustering method is proposed to detect anomalies in real turbulent flows.
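    As a rough illustration of the simulation-time clustering idea, the sketch below incrementally clusters each time step's in-memory field values and flags points far from any cluster center as anomalies. It is a single-process stand-in using scikit-learn's MiniBatchKMeans, not the paper's distributed framework.

```python
# Minimal sketch of the general idea: cluster simulation output in memory as it is
# produced and flag outliers, instead of writing everything to disk first. This is
# an illustrative single-process stand-in, not the paper's distributed method.
import numpy as np
from sklearn.cluster import MiniBatchKMeans

model = MiniBatchKMeans(n_clusters=4, random_state=0)

def analyze_step(field_values: np.ndarray, threshold: float = 3.0) -> np.ndarray:
    """Update the clustering with one time step's data and return anomaly flags."""
    X = field_values.reshape(-1, 1)            # e.g. per-cell vorticity magnitudes
    model.partial_fit(X)                       # incremental update, no I/O round trip
    dist = model.transform(X).min(axis=1)      # distance to the nearest cluster center
    cutoff = dist.mean() + threshold * dist.std()
    return dist > cutoff                       # True where a cell looks anomalous

rng = np.random.default_rng(0)
for step in range(5):
    field = rng.normal(size=10_000)
    field[:10] += 8.0                          # inject a few synthetic outliers
    flags = analyze_step(field)
    print(f"step {step}: {flags.sum()} anomalous cells")
```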

    Heterogeneous hierarchical workflow composition

    Workflow systems promise scientists an automated end-to-end path from hypothesis to discovery. However, expecting any single workflow system to deliver such a wide range of capabilities is impractical. A more practical solution is to compose the end-to-end workflow from more than one system. With this goal in mind, the integration of task-based and in situ workflows is explored, where the result is a hierarchical heterogeneous workflow composed of subworkflows, with different levels of the hierarchy using different programming, execution, and data models. Materials science use cases demonstrate the advantages of such heterogeneous hierarchical workflow composition. This work is a collaboration between Argonne National Laboratory and the Barcelona Supercomputing Center within the Joint Laboratory for Extreme-Scale Computing. This research is supported by the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research, under contract number DE-AC02-06CH11357, program manager Laura Biven, and by the Spanish Government (SEV2015-0493), the Spanish Ministry of Science and Innovation (contract TIN2015-65316-P), and the Generalitat de Catalunya (contract 2014-SGR-1051).
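    The sketch below illustrates the two-level composition idea under simplifying assumptions: an outer task-based layer schedules coarse-grained tasks, and each task internally runs a tightly coupled simulation/analysis loop (the in situ layer). The function and class names are illustrative only; they do not correspond to the workflow systems integrated in the paper.

```python
# Illustrative sketch of the composition idea: an outer task-based workflow whose
# tasks are each a small in-situ subworkflow (simulation coupled directly to
# analysis in memory). All names here are illustrative, not a real workflow API.
from concurrent.futures import ProcessPoolExecutor
import statistics

def simulate(param: float, steps: int = 100):
    """Toy 'simulation' producing one value per step."""
    state = param
    for _ in range(steps):
        state = 0.5 * state + param
        yield state

def analyze(value: float) -> float:
    """Toy in-situ analysis applied to each step without touching storage."""
    return value * value

def run_insitu_subworkflow(param: float) -> float:
    # Inner level: tight simulation/analysis coupling, a different execution model
    # from the outer level, which only sees coarse-grained tasks.
    return statistics.fmean(analyze(v) for v in simulate(param))

if __name__ == "__main__":
    # Outer level: independent coarse-grained tasks scheduled like a task workflow.
    params = [0.1, 0.5, 1.0, 2.0]
    with ProcessPoolExecutor() as pool:
        results = list(pool.map(run_insitu_subworkflow, params))
    print(dict(zip(params, results)))
```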

    Optimizing Lossy Compression Rate-Distortion from Automatic Online Selection between SZ and ZFP

    With ever-increasing volumes of scientific data produced by HPC applications, significantly reducing data size is critical because of the limited capacity of storage space and potential bottlenecks on I/O or networks when writing, reading, or transferring data. SZ and ZFP are the two leading lossy compressors available for scientific data sets. However, their performance is not consistent across different data sets, or even across different fields of the same data set: some fields are better compressed with SZ, while others are better compressed with ZFP. This situation raises the need for an automatic online (during compression) selection between SZ and ZFP, with minimal overhead. In this paper, the automatic selection optimizes the rate-distortion, an important statistical quality metric based on the signal-to-noise ratio. To optimize for rate-distortion, we investigate the principles of SZ and ZFP. We then propose an efficient online, low-overhead selection algorithm that accurately predicts the compression quality of the two compressors in early processing stages and selects the best-fit compressor for each data field. We implement the selection algorithm in an open-source library and evaluate the effectiveness of our proposed solution against plain SZ and ZFP in a parallel environment with 1,024 cores. Evaluation results on three data sets representing about 100 fields show that our selection algorithm improves the compression ratio by up to 70% at the same level of data distortion, thanks to very accurate selection (around 99%) of the best-fit compressor, with little overhead (less than 7% in the experiments). Comment: 14 pages, 9 figures, first revision.
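    A simplified sketch of the per-field selection logic follows. Unlike the paper's predictor, which estimates quality without fully compressing the data, this sketch actually compresses a small sample with each candidate codec and scores them by measured PSNR per stored bit; the codecs themselves are toy placeholders, not SZ or ZFP bindings.

```python
# Simplified sketch of per-field selection by measured rate-distortion. The paper
# predicts compression quality without fully compressing; here, for brevity, each
# candidate is run on a small sample block. `compress`/`decompress` are placeholder
# callables standing in for SZ and ZFP bindings (not their real APIs).
import math
import zlib
import numpy as np

def psnr(original: np.ndarray, decoded: np.ndarray) -> float:
    """PSNR in dB based on the field's value range."""
    mse = float(np.mean((original - decoded) ** 2))
    value_range = float(original.max() - original.min())
    return float("inf") if mse == 0 else 20 * math.log10(value_range) - 10 * math.log10(mse)

def select_compressor(field: np.ndarray, candidates: dict, sample_size: int = 4096) -> str:
    """Pick the candidate with the best quality-per-bit on a small sample of the field."""
    sample = field.ravel()[:sample_size]
    best_name, best_score = None, -math.inf
    for name, (compress, decompress) in candidates.items():
        blob = compress(sample)
        decoded = decompress(blob, sample.shape, sample.dtype)
        bits_per_value = 8 * len(blob) / sample.size
        score = psnr(sample, decoded) / bits_per_value   # quality per stored bit
        if score > best_score:
            best_name, best_score = name, score
    return best_name

# Two toy 'lossy' codecs: quantize to different precisions, then zlib-pack.
def make_codec(decimals: int):
    def compress(a):
        return zlib.compress(np.round(a, decimals).astype(np.float32).tobytes())
    def decompress(b, shape, dtype):
        return np.frombuffer(zlib.decompress(b), dtype=np.float32).astype(dtype).reshape(shape)
    return compress, decompress

field = np.random.default_rng(0).normal(size=100_000)
choice = select_compressor(field, {"codec_A": make_codec(2), "codec_B": make_codec(4)})
print("selected:", choice)
```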

    Enabling adaptive scientific workflows via trigger detection

    Next-generation architectures necessitate a shift away from traditional workflows in which the simulation state is saved at prescribed frequencies for post-processing analysis. While the need to shift to in situ workflows has been acknowledged for some time, much of the current research is focused on static workflows, where the analysis that would have been done as a post-process is performed concurrently with the simulation at user-prescribed frequencies. Recently, research efforts have been striving to enable adaptive workflows, in which the frequency, composition, and execution of computational and data manipulation steps dynamically depend on the state of the simulation. Adapting the workflow to the state of the simulation in such a data-driven fashion puts extremely strict efficiency requirements on the analysis capabilities used to identify transitions in the workflow. In this paper we build upon earlier work on trigger detection using sublinear techniques to drive adaptive workflows. We propose a methodology to detect the time when sudden heat release occurs in simulations of turbulent combustion. Our proposed method provides an alternative metric that can be used along with our former metric to increase the robustness of trigger detection. We show the effectiveness of our metric empirically for predicting heat release in two use cases. Comment: arXiv admin note: substantial text overlap with arXiv:1506.0825
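    As a generic illustration of a cheap per-step trigger (not the paper's sublinear heat-release metric), the sketch below tracks a scalar summary of the simulation state each step and fires when it deviates sharply from its recent history.

```python
# Illustrative trigger sketch: watch a cheap scalar summary of the simulation state
# each step and fire when it jumps sharply relative to recent history. This is a
# generic change detector, not the paper's sublinear heat-release metric.
from collections import deque
import numpy as np

class ChangeTrigger:
    def __init__(self, window: int = 20, factor: float = 4.0):
        self.history = deque(maxlen=window)    # recent values of the summary statistic
        self.factor = factor

    def update(self, field: np.ndarray) -> bool:
        stat = float(np.mean(np.abs(field)))   # cheap per-step summary (placeholder)
        fired = False
        if len(self.history) == self.history.maxlen:
            baseline = np.mean(self.history)
            spread = np.std(self.history) + 1e-12
            fired = abs(stat - baseline) > self.factor * spread
        self.history.append(stat)
        return fired                           # True -> run extra analysis/output now

rng = np.random.default_rng(1)
trigger = ChangeTrigger()
for step in range(100):
    field = rng.normal(size=50_000)
    if step == 60:
        field += 5.0                           # sudden change standing in for heat release
    if trigger.update(field):
        print(f"trigger fired at step {step}")
```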