48 research outputs found

    Towards the Reproducibility of Using Dynamic Loop Scheduling Techniques in Scientific Applications

    Get PDF
    Reproducibility of the execution of scientific applications on parallel and distributed systems is of growing interest, underlying the trustworthiness of the experiments and the conclusions derived from them. Dynamic loop scheduling (DLS) techniques are an effective approach to improving the performance of scientific applications via load balancing. These techniques address algorithmic and systemic sources of load imbalance by dynamically assigning tasks to processing elements. The DLS techniques have demonstrated their effectiveness when applied in real applications. Complementing native experiments, simulation is a powerful tool for studying the behavior of parallel and distributed applications. This work is a comprehensive reproducibility study of experiments using DLS techniques published in the earlier literature, conducted to verify their implementation in SimGrid-MSG [1]. The reproducibility study is carried out by comparing the performance of the SimGrid-MSG-based experiments with that reported in [2]. Earlier work [3] showed that a very detailed degree of information regarding the experiments to be reproduced is essential for successful reproducibility. This work concentrates on the reproducibility of experiments with variable application behavior and a high degree of parallelism. It is shown that reproducing measurements of applications with high variance is challenging, albeit feasible and useful. The success of the present reproducibility study indicates that the implementation of the DLS techniques in SimGrid-MSG is verified for the considered applications and systems. Thus, it enables well-founded future research using the DLS techniques in simulation.
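The core DLS idea described above, idle processing elements dynamically claiming work, can be sketched as a minimal self-scheduling loop. This is a hypothetical illustration, not the SimGrid-MSG implementation; `self_schedule`, `work_fn`, and `chunk_size` are invented names:

```python
import threading

def self_schedule(num_iterations, num_workers, work_fn, chunk_size=1):
    """Dynamic self-scheduling: an idle worker grabs the next chunk of
    loop iterations from a shared counter, so uneven iteration costs
    are balanced at run time rather than by a static partition."""
    next_index = 0
    lock = threading.Lock()
    results = {}

    def worker():
        nonlocal next_index
        while True:
            with lock:  # atomically claim the next chunk
                start = next_index
                next_index += chunk_size
            if start >= num_iterations:
                return
            for i in range(start, min(start + chunk_size, num_iterations)):
                results[i] = work_fn(i)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

For example, `self_schedule(100, 4, lambda i: i * i)` computes all 100 iterations with 4 workers, each iteration assigned to whichever worker became free first.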

    Examining the Reproducibility of Using Dynamic Loop Scheduling Techniques in Scientific Applications

    Get PDF
    Reproducibility of the execution of scientific applications on parallel and distributed systems is a growing concern, underlying the trustworthiness of the experiments and the conclusions derived from them. Dynamic loop scheduling (DLS) techniques are an effective approach to improving the performance of scientific applications via load balancing. These techniques address algorithmic and systemic sources of load imbalance by dynamically assigning tasks to processing elements. The DLS techniques have demonstrated their effectiveness when applied in real applications. Complementing native experiments, simulation is a powerful tool for studying the behavior of parallel and distributed applications. In earlier work, the scalability [1], robustness [2], and resilience [3] of the DLS techniques were investigated using the MSG interface of the SimGrid simulation framework [4]. The present work complements the earlier work and concentrates on the verification via reproducibility of the implementation of the DLS techniques in SimGrid-MSG. This work describes the challenges of verifying the performance of using DLS techniques in earlier implementations of scientific applications. The verification is performed via reproducibility of simulations based on SimGrid-MSG. To simulate experiments selected from earlier literature, the reproduction process begins by extracting the information needed from the earlier literature and converting it into the input required by SimGrid-MSG. The reproducibility study is carried out by comparing the performance of SimGrid-MSG-based experiments with that reported in two selected publications in which the DLS techniques were originally proposed. While the reproduction was not successful for experiments from one of the selected publications, it was successful for experiments from the other. This successful reproduction implies the verification of the DLS implementation in SimGrid-MSG for the considered applications and systems, and thus, it allows well-founded future research on the DLS techniques.

    Experimental Verification and Analysis of Dynamic Loop Scheduling in Scientific Applications

    Get PDF
    Scientific applications are often irregular and characterized by large computationally-intensive parallel loops. Dynamic loop scheduling (DLS) techniques improve the performance of computationally-intensive scientific applications via load balancing of their execution on high-performance computing (HPC) systems. Identifying the most suitable choices of data distribution strategies, system sizes, and DLS techniques that improve the performance of a given application requires intensive assessment and a large number of exploratory native experiments (using real applications on real systems), which may not always be feasible or practical due to the associated time and costs. In such cases, simulative experiments are more appropriate for studying the performance of applications. This motivates the question of ‘How realistic are the simulations of executions of scientific applications using DLS on HPC platforms?’ In the present work, a methodology is devised to answer this question. It involves the experimental verification and analysis of the performance of DLS in scientific applications. The proposed methodology is employed for a computer vision application executed with four DLS techniques on two different HPC platforms, both via native and simulative experiments. The evaluation and analysis of the native and simulative results indicate that the accuracy of the simulative experiments is strongly influenced by the approach used to extract the computational effort of the application (FLOP- or time-based), the choice of application model representation in the simulation (data- or task-parallel), and the available HPC subsystem models in the simulator (multi-core CPUs, memory hierarchy, and network topology). The minimum and the maximum percent errors achieved between the native and the simulative experiments are 0.95% and 8.03%, respectively.
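The reported 0.95% and 8.03% figures are presumably relative errors between native and simulative execution times; a minimal sketch, assuming the standard percent-error formula (the exact formula is not stated in the abstract, and the function name is invented):

```python
def percent_error(native, simulative):
    """Relative deviation of a simulative measurement from the native
    one, expressed as a percentage. Assumes the conventional
    |native - simulative| / |native| * 100 definition."""
    return abs(native - simulative) / abs(native) * 100.0
```

For instance, a simulated runtime of 99.05 s against a native 100 s gives a 0.95% error under this definition.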

    An Approach for Realistically Simulating the Performance of Scientific Applications on High Performance Computing Systems

    Full text link
    Scientific applications often contain large, computationally-intensive, and irregular parallel loops or tasks that exhibit stochastic characteristics. Applications may suffer from load imbalance during their execution on high-performance computing (HPC) systems due to such characteristics. Dynamic loop self-scheduling (DLS) techniques are instrumental in improving the performance of scientific applications on HPC systems via load balancing. Selecting a DLS technique that results in the best performance for different problems and system sizes requires a large number of exploratory experiments. A theoretical model that can be used to predict the scheduling technique that yields the best performance for a given problem and system has not yet been identified. Therefore, simulation is the most appropriate approach for conducting such exploratory experiments with reasonable costs. This work devises an approach to realistically simulate computationally-intensive scientific applications that employ DLS and execute on HPC systems. Several approaches to represent the application tasks (or loop iterations) are compared to establish their influence on the simulative application performance. A novel simulation strategy is introduced, which transforms a native application code into a simulative code. The native and simulative performance of two computationally-intensive scientific applications are compared to evaluate the realism of the proposed simulation approach. The comparison of the performance characteristics extracted from the native and simulative performance shows that the proposed simulation approach captured most of the performance characteristics of interest. This work shows and establishes the importance of simulations that realistically predict the performance of DLS techniques for different applications and system configurations.

    Creating an Explainable Intrusion Detection System Using Self Organizing Maps

    Full text link
    Modern Artificial Intelligence (AI)-enabled Intrusion Detection Systems (IDS) are complex black boxes. This means that a security analyst will have little to no explanation or clarification on why an IDS model made a particular prediction. A potential solution to this problem is to research and develop Explainable Intrusion Detection Systems (X-IDS) based on current capabilities in Explainable Artificial Intelligence (XAI). In this paper, we create a Self-Organizing Map (SOM)-based X-IDS that is capable of producing explanatory visualizations. We leverage the SOM's explainability to create both global and local explanations. An analyst can use global explanations to get a general idea of how a particular IDS model computes predictions. Local explanations are generated for individual datapoints to explain why a certain prediction value was computed. Furthermore, our SOM-based X-IDS was evaluated on both explanation generation and traditional accuracy tests using the NSL-KDD and the CIC-IDS-2017 datasets.
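The SOM training at the core of such an X-IDS can be sketched in a few lines of pure Python. This is an illustration of the standard SOM update rule, not the paper's implementation; the grid size, decay schedules, and function names are assumptions:

```python
import math
import random

def train_som(data, rows=3, cols=3, epochs=20, lr=0.5, seed=0):
    """Minimal Self-Organizing Map: a grid of weight vectors is pulled
    toward input samples; neighbors of the best-matching unit (BMU)
    move too, so nearby grid cells come to represent similar inputs,
    which is what makes the trained map visually explainable."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    for epoch in range(epochs):
        # linearly shrink neighborhood radius and learning rate
        radius = max(rows, cols) / 2.0 * (1 - epoch / epochs) + 0.5
        alpha = lr * (1 - epoch / epochs)
        for x in data:
            bmu = best_matching_unit(weights, x)
            for k, w in weights.items():
                d2 = (k[0] - bmu[0]) ** 2 + (k[1] - bmu[1]) ** 2
                h = math.exp(-d2 / (2 * radius ** 2))  # Gaussian neighborhood
                weights[k] = [wi + alpha * h * (xi - wi)
                              for wi, xi in zip(w, x)]
    return weights

def best_matching_unit(weights, x):
    """Grid cell whose weight vector is closest to the input x."""
    return min(weights,
               key=lambda k: sum((w - v) ** 2 for w, v in zip(weights[k], x)))
```

After training, mapping a datapoint to its BMU cell gives the kind of local explanation the abstract describes: the point is classified like the other inputs that land in the same region of the map.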

    Three-dimensional field-scale coupled Thermo-Hydro-Mechanical modelling: parallel computing implementation

    Get PDF
    An approach for the simulation of three-dimensional field-scale coupled thermo-hydro-mechanical problems is presented, including the implementation of parallel computation algorithms. The approach is designed to allow three-dimensional large-scale coupled simulations to be undertaken in reduced time. Owing to progress in computer technology, existing parallel implementations have been found to be ineffective, with the time taken for communication dominating any reduction in time gained by splitting computation across processors. After analysis of the behavior of the solver and the architecture of multicore, nodal, parallel computers, modification of the parallel algorithm using a novel hybrid message passing interface/open multiprocessing (MPI/OpenMP) method was implemented and found to yield significant improvements by reducing the amount of communication required. This finding reflects recent enhancements of current high-performance computing architectures. An increase in performance of 500% over existing parallel implementations on current processors was achieved for the solver. An example problem involving the Prototype Repository experiment undertaken by the Swedish Nuclear Fuel and Waste Management Co. [Svensk Kärnbränslehantering AB (SKB)] in Äspö, Sweden, is presented to demonstrate situations in which parallel computation is invaluable because of the complex, highly coupled nature of the problem.

    Balancing Processor Loads and Exploiting Data Locality in Irregular Computations

    No full text
    Fractiling is a scheduling scheme that simultaneously balances processor loads and exploits locality. Because it is based on a probabilistic analysis, fractiling accommodates load imbalances caused both by predictable phenomena, such as irregular data and conditional statements, and by unpredictable phenomena, such as data access latency and operating system interference. Fractiling exploits both temporal locality, which is often profitable for computations on regular data, and spatial locality, which is often profitable for computations on irregular data. Here, we report on a case study involving the application of fractiling to computations on irregular data, namely N-body simulations. In experiments on a KSR1, performance was improved by as much as 43% by fractiling. Performance improvements were obtained on nonuniform and uniform distributions of bodies, underscoring the need for a scheduling scheme that accommodates application- as well as system-induced execution time variance.
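Fractiling's probabilistic analysis yields chunks of decreasing size, in the spirit of the factoring scheme. A sketch of such a schedule follows, assuming the common factoring rule that each batch of P chunks covers half the remaining iterations; this stands in for fractiling's size rule (which additionally tiles the chunks fractally for locality) and the function name is invented:

```python
import math

def factoring_chunks(total_iterations, num_procs):
    """Decreasing-chunk schedule in the style of factoring: each batch
    of num_procs equal chunks covers half the remaining iterations, so
    early chunks are large (low scheduling overhead) and later chunks
    are small (fine-grained load balancing near the end)."""
    chunks = []
    remaining = total_iterations
    while remaining > 0:
        # chunk size for this batch: half the remainder, split P ways
        size = max(1, math.ceil(remaining / (2 * num_procs)))
        for _ in range(num_procs):
            if remaining <= 0:
                break
            c = min(size, remaining)
            chunks.append(c)
            remaining -= c
    return chunks
```

For 100 iterations on 4 processors this produces chunk sizes 13, 13, 13, 13, 6, 6, 6, 6, 3, ..., shrinking until all iterations are covered.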

    Performance Evaluation of the Graph Partitioning Algorithms in PARTY Version 1.1

    No full text
    Currently, graphs are being used as models for a wide variety of computationally intensive scientific applications. To reduce these computational requirements, the applications are often parallelized. Parallelization requires dividing the graph among the various participating processors. Graph partitioning is an NP-complete problem for which many heuristics have been developed. Several of these heuristics have been implemented in currently available graph partitioning packages. This report presents the timing, edge-cut, and edge-cut imbalance analysis of the algorithms implemented in the PARTY package.
    1 Introduction
    Graphs are currently being used to model a wide variety of computationally intensive scientific problems. Some of these problems lie in the areas of Computational Fluid Dynamics (CFD), Computational Field Simulation (CFS), and Computational Mechanics (CM). The edges of the graph represent dependencies between vertices, while the vertices of the graph represent the computation..
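The edge-cut and imbalance metrics used to evaluate partitioners such as PARTY can be computed directly from a partition assignment. A minimal sketch follows; the function names are invented, and the max-over-average imbalance definition (on vertex counts) is an assumption, since the report's exact edge-cut imbalance metric is not given in the preview:

```python
def edge_cut(edges, part):
    """Number of edges whose endpoints land in different parts --
    the quantity graph partitioners try to minimize, since each cut
    edge implies interprocessor communication."""
    return sum(1 for u, v in edges if part[u] != part[v])

def load_imbalance(part, num_parts):
    """Largest part size divided by the ideal (average) part size;
    1.0 means a perfectly balanced partition."""
    sizes = [0] * num_parts
    for p in part.values():
        sizes[p] += 1
    avg = len(part) / num_parts
    return max(sizes) / avg
```

For a 4-cycle split into two pairs of adjacent vertices, the cut is 2 edges and the imbalance is exactly 1.0.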

    Balancing Processor Loads and Exploiting Data Locality in N-Body Simulations

    No full text
    Although N-body simulation algorithms are amenable to parallelization, performance gains from execution on parallel machines are difficult to obtain due to load imbalances caused by irregular distributions of bodies. In general, there is a tension between balancing processor loads and maintaining locality, as the dynamic re-assignment of work necessitates access to remote data. Fractiling is a dynamic scheduling scheme that simultaneously balances processor loads and maintains locality by exploiting the self-similarity properties of fractals. Fractiling is based on a probabilistic analysis and thus accommodates load imbalances caused by predictable phenomena, such as irregular data, and unpredictable phenomena, such as data-access latencies. In experiments on a KSR1, the performance of N-body simulation codes was improved by as much as 53% by fractiling. Performance improvements were obtained on uniform and nonuniform distributions of bodies, underscoring the need for a scheduling schem..