11 research outputs found

    Adjustment of a simulator of a complex dynamic system with emphasis on the reduction of computational resources

    Scientists and engineers continuously build models to interpret axiomatic theories or to explain the reality of the universe of interest, reducing the gap between formal theory and observation in practice. Our work focuses on dealing with the uncertainty in the model's input data in order to improve the quality of the simulation. This type of process requires handling large volumes of data and a great deal of computer processing. This article proposes a methodology for adjusting a simulator of a complex dynamic system that models wave translation along river channels, with emphasis on the reduction of computational resources. We propose a simulator calibration based on successive adjustment steps of the model, built on parametric simulation. The input scenarios used to run the simulator at every step were obtained in an agile way, achieving a model improvement of up to 50% in the reduction of the simulated data error. These results encouraged us to extend the adjustment process over a larger domain region.
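    To make the successive-adjustment idea concrete, here is a minimal Python sketch of such a calibration loop. The run_simulation function, the parameter grid, and the refinement factor are assumptions introduced for illustration; this is a sketch of the general technique, not the authors' implementation.

```python
# Minimal sketch of calibration by successive adjustment steps over a
# parametric scenario grid. run_simulation and the grid contents are
# hypothetical placeholders, not the authors' actual code.
import itertools

def rmse(simulated, observed):
    """Root-mean-square error between simulated and observed series."""
    return (sum((s - o) ** 2 for s, o in zip(simulated, observed))
            / len(observed)) ** 0.5

def calibrate(run_simulation, observed, param_grid, steps=3):
    """Evaluate every scenario in the grid, keep the best parameters,
    then refine the grid around them and repeat."""
    best_params, best_err = None, float("inf")
    for _ in range(steps):
        for combo in itertools.product(*param_grid.values()):
            scenario = dict(zip(param_grid, combo))
            err = rmse(run_simulation(**scenario), observed)
            if err < best_err:
                best_params, best_err = scenario, err
        # Narrow each parameter range around the current best (+/- 5%),
        # so each step explores a smaller, cheaper region of scenarios.
        param_grid = {k: [v * 0.95, v, v * 1.05]
                      for k, v in best_params.items()}
    return best_params, best_err

# Hypothetical usage with two river-model parameters:
# calibrate(run_model, gauge_levels,
#           {"roughness": [0.02, 0.03], "slope": [1e-4, 2e-4]})
```

    Because each step reuses the best scenario found so far, the grid shrinks progressively, which is one way the number of simulator runs (and hence computational resources) can be kept down.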

    Extension of a task-based model to functional programming

    Recently, efforts have been made to bring together the areas of high-performance computing (HPC) and massive data processing (Big Data). Traditional HPC frameworks, like COMPSs, are mostly task-based, while popular big-data environments, like Spark, are based on functional programming principles. The former are known for their good performance on regular, matrix-based computations; the latter, on the other hand, has often been considered more successful for fine-grained, data-parallel workloads. In this paper we present our experience with the integration of some dataflow techniques into COMPSs, a task-based framework, in an effort to bring together the best aspects of both worlds. We present our API, called DDF, which provides a new data abstraction that addresses the challenges of integrating Big Data application scenarios into COMPSs. DDF has a functional-based interface, similar to many Data Science tools, that allows us to use dynamic evaluation to adapt task execution at runtime. Besides the performance optimization it provides, the API facilitates the development of applications by experts in the application domain. We evaluate DDF's effectiveness by comparing the resulting programs to their original versions in COMPSs and Spark. The results show that DDF can improve COMPSs execution time and even outperform Spark in many use cases. This work was partially supported by CAPES, CNPq, Fapemig and NIC.BR, and by projects Atmosphere (H2020-EU.2.1.1 777154) and INCT-Cyber.
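    To illustrate the kind of functional, lazily evaluated interface the abstract describes, here is a minimal Python sketch. The Dataset class and its methods are invented for illustration only; they are not the real DDF API, merely the general shape of a dataflow abstraction where transformations are recorded and deferred until an action triggers execution.

```python
# Hypothetical illustration of a functional, lazily evaluated pipeline
# in the style described for DDF. Class and method names are invented
# for this sketch, not taken from the actual DDF interface.
class Dataset:
    def __init__(self, data, ops=None):
        self._data = data
        self._ops = ops or []  # deferred operations (the dataflow graph)

    def map(self, fn):
        return Dataset(self._data, self._ops + [("map", fn)])

    def filter(self, pred):
        return Dataset(self._data, self._ops + [("filter", pred)])

    def collect(self):
        """Action: only now is the recorded pipeline executed, so a
        runtime could inspect, fuse, or schedule the chained steps."""
        out = self._data
        for kind, fn in self._ops:
            out = ([fn(x) for x in out] if kind == "map"
                   else [x for x in out if fn(x)])
        return out

# Transformations are recorded, not run, until collect() is called.
squares = Dataset(range(10)).map(lambda x: x * x).filter(lambda x: x % 2 == 0)
print(squares.collect())  # [0, 4, 16, 36, 64]
```

    Deferring execution until collect() is what makes dynamic evaluation possible: the runtime sees the whole chain before running it and can adapt task granularity accordingly.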

    Task-based programming in COMPSs to converge from HPC to big data

    Task-based programming has proven to be a suitable model for high-performance computing (HPC) applications. Different implementations have been good demonstrators of this fact and have promoted the acceptance of task-based programming in the OpenMP standard. Furthermore, in recent years, Apache Spark has gained wide popularity in business and research environments as a programming model for addressing emerging big data problems. COMP Superscalar (COMPSs) is a task-based environment that tackles distributed computing (including Clouds) and is a good alternative for a task-based programming model for big data applications. This article describes why we consider task-based programming models a good approach for big data applications. It includes a comparison of Spark and COMPSs in terms of architecture, programming model, and performance, focusing on the structural differences between the two frameworks, their programmability interfaces, and their efficiency by means of three widely known benchmarking kernels: Wordcount, Kmeans, and Terasort. These kernels enable the evaluation of the most important functionalities of both programming models and the analysis of different workflows and conditions. The main results of this comparison are that (1) COMPSs is able to extract the inherent parallelism from the user code with minimal coding effort, as opposed to Spark, which requires existing algorithms to be adapted and rewritten by explicitly using its predefined functions, (2) COMPSs improves on Spark in terms of performance, and (3) COMPSs has been shown to scale better than Spark in most cases. Finally, we discuss the advantages and disadvantages of both frameworks, highlighting the differences that make each unique, thereby helping to choose the right framework for each particular objective. This work is supported by the Spanish Government (SEV2015-0493), by the Spanish Ministry of Science and Innovation (contract TIN2015-65316-P), and by the Generalitat de Catalunya (contracts 2014-SGR-1051 and 2014-SGR-1272). Javier Conejero's postdoctoral contract is cofinanced by the Ministry of Economy and Competitiveness under Juan de la Cierva Formación postdoctoral fellowship FJCI-2015-24651. This work is also supported by the Intel-BSC Exascale Lab. The Human Brain Project receives funding from the EU's Seventh Framework Programme (FP7/2007-2013) under grant agreement no. 604102.
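    The contrast in programming styles can be sketched as follows: in a task-based model, ordinary sequential-looking code is annotated and the runtime extracts parallelism from the data dependencies between task calls. The stand-in @task decorator below only mimics the shape of such annotations so the example runs anywhere; it is not the actual PyCOMPSs machinery.

```python
# Schematic wordcount in a task-based style. The @task decorator here is
# a no-op stand-in so the snippet runs without COMPSs installed; a real
# task-based runtime would intercept these calls and run them as
# asynchronous tasks, parallelizing the independent ones.
from collections import Counter

def task(fn):  # stand-in: the runtime would wrap the call here
    return fn

@task
def count_words(block):
    """Count words in one block of text (independent per block)."""
    return Counter(block.split())

@task
def merge(a, b):
    """Combine two partial counts (creates a data dependency chain)."""
    return a + b

blocks = ["to be or not to be", "that is the question"]
partials = [count_words(b) for b in blocks]  # independent -> parallelizable
total = partials[0]
for p in partials[1:]:
    total = merge(total, p)                  # dependent -> sequenced
print(total.most_common(3))
```

    The point of the comparison in the abstract is that this code stays plain Python, whereas the equivalent Spark program must be restructured around map/reduce-style predefined functions.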

    Performance analysis and optimization of in-situ integration of simulation with data analysis: zipping applications up

    This paper targets an important class of applications that combine HPC simulations with data analysis for online or real-time scientific discovery. We use state-of-the-art parallel-I/O and data-staging libraries to build simulation-time data analysis workflows, and conduct performance analysis with real-world applications: computational fluid dynamics (CFD) and molecular dynamics (MD) simulations. Driven by an in-depth analysis of performance inefficiencies, we design an end-to-end, application-level approach that eliminates the interlocks and synchronizations present in existing methods. Our new approach employs both task parallelism and pipeline parallelism to reduce synchronizations effectively. In addition, we design a fully asynchronous, fine-grained, pipelining runtime system named Zipper. Zipper is a multi-threaded distributed runtime system that executes in a layer below the simulation and analysis applications. To further reduce the simulation application's stall time and enhance data transfer performance, we design a concurrent data transfer optimization that uses both the HPC network and the parallel file system for improved bandwidth. The scalability of the Zipper system has been verified by a performance model and various large-scale empirical experiments. Experimental results on an Intel multicore cluster as well as a Knights Landing HPC system demonstrate that the Zipper-based approach can outperform the fastest state-of-the-art I/O transport library by up to 220% using 13,056 processor cores.
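    The pipeline-parallel idea of decoupling the simulation from the analysis through an asynchronous staging buffer can be sketched in a few lines of Python. This toy producer/consumer is illustrative only and is not the Zipper runtime; the field data and step count are made up for the example.

```python
# Minimal illustration of pipeline parallelism between a simulation and
# an analysis stage: the simulation thread hands data to an asynchronous
# staging queue and keeps computing instead of blocking on the analysis.
import queue
import threading

staging = queue.Queue(maxsize=4)  # bounded buffer = in-memory staging area
DONE = object()                   # sentinel marking the end of the stream

def simulate(steps):
    for step in range(steps):
        field = [step * 0.1] * 8        # stand-in for a CFD/MD field
        staging.put((step, field))      # hand off, then continue simulating
    staging.put(DONE)

def analyze():
    while True:
        item = staging.get()
        if item is DONE:
            break
        step, field = item
        print(f"step {step}: mean = {sum(field) / len(field):.2f}")

producer = threading.Thread(target=simulate, args=(5,))
consumer = threading.Thread(target=analyze)
producer.start(); consumer.start()
producer.join(); consumer.join()
```

    The bounded queue is the key design choice: it absorbs bursts so neither stage stalls the other, while its size limit keeps memory use in check, which is the same trade-off a staging runtime must manage at scale.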

    Big data deployment in containerized infrastructures through the interconnection of network namespaces

    Big Data applications tackle the challenge of fast handling of large streams of data. Their performance depends not only on the data frameworks' implementation and the underlying hardware but also on the deployment scheme and its potential for fast scaling. Consequently, several efforts have focused on easing the deployment of Big Data applications, notably through containerization, a technology that arose to bring multitenancy and multiprocessing out of clusters, providing high deployment flexibility through lightweight container images. Recent studies have focused mostly on Docker containers. This article, by contrast, is interested in the more recent Singularity containers, as they provide more security and support high-performance computing (HPC) environments, and can thus make Big Data applications benefit from the specialized hardware of HPC. Singularity 2.x, however, does not isolate network resources as required by most Big Data components. Singularity 3.x allows allocating each container with isolated network resources, but their interconnection requires a nontrivial amount of configuration effort. In this context, this article makes a functional contribution in the form of a deployment scheme based on the interconnection of network namespaces, through underlay and overlay networking approaches, to make Big Data applications easily deployable inside Singularity containers. We provide a detailed account of our deployment scheme with both interconnection approaches in the form of a "how-to-do-it" report, and we evaluate it by comparing three Hadoop-based Big Data applications running on a bare-metal infrastructure and on scenarios involving Singularity and Docker instances.
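    As a rough illustration of the underlay-style interconnection, the following Python sketch drives standard iproute2 commands to link two network namespaces with a veth pair. The namespace names and addresses are placeholders, it requires root privileges on Linux, and it is a simplified stand-in for the article's full deployment scheme rather than its actual configuration.

```python
# Sketch of the underlay-style idea: interconnect two network namespaces
# with a veth pair so workloads placed in them can reach each other.
# Namespace names and addresses are placeholders; run as root on Linux.
import subprocess

def sh(cmd):
    """Run one iproute2 command, echoing it for visibility."""
    print("+", cmd)
    subprocess.run(cmd.split(), check=True)

# Create two namespaces (e.g., one per container instance).
sh("ip netns add bd-ns1")
sh("ip netns add bd-ns2")

# A veth pair acts as a virtual cable; move one end into each namespace.
sh("ip link add veth-ns1 type veth peer name veth-ns2")
sh("ip link set veth-ns1 netns bd-ns1")
sh("ip link set veth-ns2 netns bd-ns2")

# Address and bring up each end so the namespaces can exchange traffic.
sh("ip netns exec bd-ns1 ip addr add 10.10.0.1/24 dev veth-ns1")
sh("ip netns exec bd-ns2 ip addr add 10.10.0.2/24 dev veth-ns2")
sh("ip netns exec bd-ns1 ip link set veth-ns1 up")
sh("ip netns exec bd-ns2 ip link set veth-ns2 up")

# Connectivity check across the pair.
sh("ip netns exec bd-ns1 ping -c 1 10.10.0.2")
```

    Cleanup afterwards is a matter of ip netns delete bd-ns1 and ip netns delete bd-ns2, which removes the namespaces along with the veth ends moved into them.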