1,095 research outputs found

    Iso-energy-efficiency: An approach to power-constrained parallel computation

    Get PDF
    Future large scale high performance supercomputer systems require high energy efficiency to achieve exaflops computational power and beyond. Despite the need to understand energy efficiency in high-performance systems, there are few techniques to evaluate energy efficiency at scale. In this paper, we propose a system-level iso-energy-efficiency model to analyze, evaluate and predict energy-performance of data intensive parallel applications with various execution patterns running on large scale power-aware clusters. Our analytical model can help users explore the effects of machine and application dependent characteristics on system energy efficiency and isolate efficient ways to scale system parameters (e.g. processor count, CPU power/frequency, workload size and network bandwidth) to balance energy use and performance. We derive our iso-energy-efficiency model and apply it to the NAS Parallel Benchmarks on two power-aware clusters. Our results indicate that the model accurately predicts total system energy consumption within 5% error on average for parallel applications with various execution and communication patterns. We demonstrate effective use of the model for various application contexts and in scalability decision-making

    \u3cem\u3eHP-DAEMON\u3c/em\u3e: \u3cem\u3eH\u3c/em\u3eigh \u3cem\u3eP\u3c/em\u3eerformance \u3cem\u3eD\u3c/em\u3eistributed \u3cem\u3eA\u3c/em\u3edaptive \u3cem\u3eE\u3c/em\u3energy-efficient \u3cem\u3eM\u3c/em\u3eatrix-multiplicati\u3cem\u3eON\u3c/em\u3e

    Get PDF
    The demands of improving energy efficiency for high performance scientific applications arise crucially nowadays. Software-controlled hardware solutions directed by Dynamic Voltage and Frequency Scaling (DVFS) have shown their effectiveness extensively. Although DVFS is beneficial to green computing, introducing DVFS itself can incur non-negligible overhead, if there exist a large number of frequency switches issued by DVFS. In this paper, we propose a strategy to achieve the optimal energy savings for distributed matrix multiplication via algorithmically trading more computation and communication at a time adaptively with user-specified memory costs for less DVFS switches, which saves 7.5% more energy on average than a classic strategy. Moreover, we leverage a high performance communication scheme for fully exploiting network bandwidth via pipeline broadcast. Overall, the integrated approach achieves substantial energy savings (up to 51.4%) and performance gain (28.6% on average) compared to ScaLAPACK pdgemm() on a cluster with an Ethernet switch, and outperforms ScaLAPACK and DPLASMA pdgemm() respectively by 33.3% and 32.7% on average on a cluster with an Infiniband switch

    Exploring hardware overprovisioning in power-constrained, high performance computing

    Full text link

    On the energy footprint of I/O management in Exascale HPC systems

    Get PDF
    International audienceThe advent of unprecedentedly scalable yet energy hungry Exascale supercomputers poses a major challenge in sustaining a high performance-per-watt ratio. With I/O management acquiring a crucial role in supporting scientific simulations, various I/O management approaches have been proposed to achieve high performance and scalability. However, the details of how these approaches affect energy consumption have not been studied yet. Therefore, this paper aims to explore how much energy a supercomputer consumes while running scientific simulations when adopting various I/O management approaches. In particular, we closely examine three radically different I/O schemes including time partitioning, dedicated cores, and dedicated nodes. To do so, we implement the three approaches within the Damaris I/O middleware and perform extensive experiments with one of the target HPC applications of the Blue Waters sustained-petaflop supercomputer project: the CM1 atmospheric model. Our experimental results obtained on the French Grid’5000 platform highlight the differences among these three approaches and illustrate in which way various configurations of the application and of the system can impact performance and energy consumption. Moreover, we propose and validate a mathematical model that estimates the energy consumption of a HPC simulation under different I/O approaches. Our proposed model gives hints to pre-select the most energy-efficient I/O approach for a particular simulation on a particular HPC system and therefore provides a step towards energy-efficient HPC simulations in Exascale systems. To the best of our knowledge, our work provides the first in-depth look into the energy-performance tradeoffs of I/O management approaches

    Energy-efficient computing for HPC workloads on heterogeneous manycore chips

    Full text link
    Power and energy efficiency is one of the major challenges to achieve exascale computing in the next several years. While chips operating at low voltages have been studied to be highly energy-efficient, low voltage operations lead to heterogeneity across cores within the microprocessor chip. In this work, we study chips with low voltage operation and discuss programming systems, and performance modeling in the presence of heterogeneity. We propose an integer linear programming based approach for selecting optimal configu-ration of a chip that minimizes its energy consumption. We obtain an average of 26 % and 10.7 % savings in energy con-sumption of the chip for two HPC mini-applications- min-iMD and Jacobi, respectively. We also evaluate the energy savings with execution time constraints, using the proposed approach. These energy savings are significantly more than the savings by sub-optimal configurations obtained from heuristics
    • 

    corecore