1,565 research outputs found

    Visualizing network traffic to understand the performance of massively parallel simulations

    Get PDF
    pre-printThe performance of massively parallel applications is often heavily impacted by the cost of communication among compute nodes. However, determining how to best use the network is a formidable task, made challenging by the ever increasing size and complexity of modern supercomputers. This paper applies visualization techniques to aid parallel application developers in understanding the network activity by enabling a detailed exploration of the flow of packets through the hardware interconnect. In order to visualize this large and complex data, we employ two linked views of the hardware network. The first is a 2D view, that represents the network structure as one of several simplified planar projections. This view is designed to allow a user to easily identify trends and patterns in the network traffic. The second is a 3D view that augments the 2D view by preserving the physical network topology and providing a context that is familiar to the application developers. Using the massively parallel multi-physics code pF3D as a case study, we demonstrate that our tool provides valuable insight that we use to explain and optimize pF3D's performance on an IBM Blue Gene/P system

    Modeling the Internet of Things: a simulation perspective

    Full text link
    This paper deals with the problem of properly simulating the Internet of Things (IoT). Simulating an IoT allows evaluating strategies that can be employed to deploy smart services over different kinds of territories. However, the heterogeneity of scenarios seriously complicates this task. This imposes the use of sophisticated modeling and simulation techniques. We discuss novel approaches for the provision of scalable simulation scenarios, that enable the real-time execution of massively populated IoT environments. Attention is given to novel hybrid and multi-level simulation techniques that, when combined with agent-based, adaptive Parallel and Distributed Simulation (PADS) approaches, can provide means to perform highly detailed simulations on demand. To support this claim, we detail a use case concerned with the simulation of vehicular transportation systems.Comment: Proceedings of the IEEE 2017 International Conference on High Performance Computing and Simulation (HPCS 2017

    A visual Analytics System for Optimizing Communications in Massively Parallel Applications

    Get PDF
    Current and future supercomputers have tens of thousands of compute nodes interconnected with high-dimensional networks and complex network topologies for improved performance. Application developers are required to write scalable parallel programs in order to achieve high throughput on these machines. Application performance is largely determined by efficient inter-process communication. A common way to analyze and optimize performance is through profiling parallel codes to identify communication bottlenecks. However, understanding gigabytes of profile data is not a trivial task. In this paper, we present a visual analytics system for identifying the scalability bottlenecks and improving the communication efficiency of massively parallel applications. Visualization methods used in this system are designed to comprehend large-scale and varied communication patterns on thousands of nodes in complex networks such as the 5D torus and the dragonfly. We also present efficient rerouting and remapping algorithms that can be coupled with our interactive visual analytics design for performance optimization. We demonstrate the utility of our system with several case studies using three benchmark applications on two leading supercomputers. The mapping suggestion from our system led to 38% improvement in hop-bytes for MiniAMR application on 4,096 MPI processes.This research has been sponsored in part by the U.S. National Science Foundation through grant IIS-1320229, and the U.S. Department of Energy through grants DE-SC0012610 and DE-SC0014917. This research has been funded in part and used resources of the Argonne Leadership Computing Facility at Argonne National Lab- oratory, which is supported by the Office of Science of the U.S. Department of Energy under contract no. DE-AC02-06CH11357. This work was supported in part by the DOE Office of Science, ASCR, under award numbers 57L38, 57L32, 57L11, 57K50, and 508050

    A Visual Analytics Framework for Reviewing Streaming Performance Data

    Full text link
    Understanding and tuning the performance of extreme-scale parallel computing systems demands a streaming approach due to the computational cost of applying offline algorithms to vast amounts of performance log data. Analyzing large streaming data is challenging because the rate of receiving data and limited time to comprehend data make it difficult for the analysts to sufficiently examine the data without missing important changes or patterns. To support streaming data analysis, we introduce a visual analytic framework comprising of three modules: data management, analysis, and interactive visualization. The data management module collects various computing and communication performance metrics from the monitored system using streaming data processing techniques and feeds the data to the other two modules. The analysis module automatically identifies important changes and patterns at the required latency. In particular, we introduce a set of online and progressive analysis methods for not only controlling the computational costs but also helping analysts better follow the critical aspects of the analysis results. Finally, the interactive visualization module provides the analysts with a coherent view of the changes and patterns in the continuously captured performance data. Through a multi-faceted case study on performance analysis of parallel discrete-event simulation, we demonstrate the effectiveness of our framework for identifying bottlenecks and locating outliers.Comment: This is the author's preprint version that will be published in Proceedings of IEEE Pacific Visualization Symposium, 202

    Exploring performance data with boxfish

    Get PDF
    pre-printThe growth in size and complexity of scaling applications and the systems on which they run pose challenges in analyzing and improving their overall performance. With metrics coming from thousands or millions of processes, visualization techniques are necessary to make sense of the increasing amount of data. To aid the process of exploration and understanding, we announce the initial release of Boxfish, an extensible tool for manipulating and visualizing data pertaining to application behavior. Combining and visually presenting data and knowledge from multiple domains, such as the application's communication patterns and the hardware's network configuration and routing policies, can yield the insight necessary to discover the underlying causes of observed behavior. Boxfish allows users to query, filter and project data across these domains to create interactive, linked visualizations

    Characterization and modeling of PIDX parallel I/O for performance optimization

    Get PDF
    pre-printParallel I/O library performance can vary greatly in re- sponse to user-tunable parameter values such as aggregator count, file count, and aggregation strategy. Unfortunately, manual selection of these values is time consuming and dependent on characteristics of the target machine, the underlying file system, and the dataset itself. Some characteristics, such as the amount of memory per core, can also impose hard constraints on the range of viable parameter values. In this work we address these problems by using machine learning techniques to model the performance of the PIDX parallel I/O library and select appropriate tunable parameter values. We characterize both the network and I/O phases of PIDX on a Cray XE6 as well as an IBM Blue Gene/P system. We use the results of this study to develop a machine learning model for parameter space exploration and performance prediction

    Distributed Hybrid Simulation of the Internet of Things and Smart Territories

    Full text link
    This paper deals with the use of hybrid simulation to build and compose heterogeneous simulation scenarios that can be proficiently exploited to model and represent the Internet of Things (IoT). Hybrid simulation is a methodology that combines multiple modalities of modeling/simulation. Complex scenarios are decomposed into simpler ones, each one being simulated through a specific simulation strategy. All these simulation building blocks are then synchronized and coordinated. This simulation methodology is an ideal one to represent IoT setups, which are usually very demanding, due to the heterogeneity of possible scenarios arising from the massive deployment of an enormous amount of sensors and devices. We present a use case concerned with the distributed simulation of smart territories, a novel view of decentralized geographical spaces that, thanks to the use of IoT, builds ICT services to manage resources in a way that is sustainable and not harmful to the environment. Three different simulation models are combined together, namely, an adaptive agent-based parallel and distributed simulator, an OMNeT++ based discrete event simulator and a script-language simulator based on MATLAB. Results from a performance analysis confirm the viability of using hybrid simulation to model complex IoT scenarios.Comment: arXiv admin note: substantial text overlap with arXiv:1605.0487

    Overview on agent-based social modelling and the use of formal languages

    Get PDF
    Transdisciplinary Models and Applications investigates a variety of programming languages used in validating and verifying models in order to assist in their eventual implementation. This book will explore different methods of evaluating and formalizing simulation models, enabling computer and industrial engineers, mathematicians, and students working with computer simulations to thoroughly understand the progression from simulation to product, improving the overall effectiveness of modeling systems.Postprint (author's final draft

    Scheduling and Tuning Kernels for High-performance on Heterogeneous Processor Systems

    Get PDF
    Accelerated parallel computing techniques using devices such as GPUs and Xeon Phis (along with CPUs) have proposed promising solutions of extending the cutting edge of high-performance computer systems. A significant performance improvement can be achieved when suitable workloads are handled by the accelerator. Traditional CPUs can handle those workloads not well suited for accelerators. Combination of multiple types of processors in a single computer system is referred to as a heterogeneous system. This dissertation addresses tuning and scheduling issues in heterogeneous systems. The first section presents work on tuning scientific workloads on three different types of processors: multi-core CPU, Xeon Phi massively parallel processor, and NVIDIA GPU; common tuning methods and platform-specific tuning techniques are presented. Then, analysis is done to demonstrate the performance characteristics of the heterogeneous system on different input data. This section of the dissertation is part of the GeauxDock project, which prototyped a few state-of-art bioinformatics algorithms, and delivered a fast molecular docking program. The second section of this work studies the performance model of the GeauxDock computing kernel. Specifically, the work presents an extraction of features from the input data set and the target systems, and then uses various regression models to calculate the perspective computation time. This helps understand why a certain processor is faster for certain sets of tasks. It also provides the essential information for scheduling on heterogeneous systems. In addition, this dissertation investigates a high-level task scheduling framework for heterogeneous processor systems in which, the pros and cons of using different heterogeneous processors can complement each other. Thus a higher performance can be achieve on heterogeneous computing systems. A new scheduling algorithm with four innovations is presented: Ranked Opportunistic Balancing (ROB), Multi-subject Ranking (MR), Multi-subject Relative Ranking (MRR), and Automatic Small Tasks Rearranging (ASTR). The new algorithm consistently outperforms previously proposed algorithms with better scheduling results, lower computational complexity, and more consistent results over a range of performance prediction errors. Finally, this work extends the heterogeneous task scheduling algorithm to handle power capping feature. It demonstrates that a power-aware scheduler significantly improves the power efficiencies and saves the energy consumption. This suggests that, in addition to performance benefits, heterogeneous systems may have certain advantages on overall power efficiency

    New technologies for urban designers: the VENUE project

    Get PDF
    In this report, we first outline the basic idea of VENUE. This involves developing digital tools froma foundation of geographic information systems (GIS) software which we then apply to urbandesign, a subject area and profession which has little tradition in using such tools. Our project wasto develop two types of tool, namely functional analysis based on embedding models of movementin local environments into GIS based on ideas from the field of space syntax; and secondlyfashioning these ideas in a wider digital context in which the entire range of GIS technologies werebrought to bear at the local scale. By local scale, we mean the representation of urban environmentsfrom about 1: 500 to around 1: 2500
    • …
    corecore