475 research outputs found

    Power Management Techniques for Data Centers: A Survey

    Full text link
    With growing use of internet and exponential growth in amount of data to be stored and processed (known as 'big data'), the size of data centers has greatly increased. This, however, has resulted in significant increase in the power consumption of the data centers. For this reason, managing power consumption of data centers has become essential. In this paper, we highlight the need of achieving energy efficiency in data centers and survey several recent architectural techniques designed for power management of data centers. We also present a classification of these techniques based on their characteristics. This paper aims to provide insights into the techniques for improving energy efficiency of data centers and encourage the designers to invent novel solutions for managing the large power dissipation of data centers.Comment: Keywords: Data Centers, Power Management, Low-power Design, Energy Efficiency, Green Computing, DVFS, Server Consolidatio

    Characterizing Power and Energy Efficiency of Legion Data-Centric Runtime and Applications on Heterogeneous High-Performance Computing Systems

    Get PDF
    The traditional parallel programming models require programmers to explicitly specify parallelism and data movement of the underlying parallel mechanisms. Different from the traditional computation-centric programming, Legion provides a data-centric programming model for extracting parallelism and data movement. In this chapter, we aim to characterize the power and energy consumption of running HPC applications on Legion. We run benchmark applications on compute nodes equipped with both CPU and GPU, and measure the execution time, power consumption and CPU/GPU utilization. Additionally, we test the message passing interface (MPI) version of these applications and compare the performance and power consumption of high-performance computing (HPC) applications using the computation-centric and data-centric programming models. Experimental results indicate Legion applications outperforms MPI applications on both performance and energy efficiency, i.e., Legion applications can be 9.17 times as fast as MPI applications and use only 9.2% energy. Legion effectively explores the heterogeneous architecture and runs applications tasks on GPU. As far as we know, this is the first study to understand the power and energy consumption of Legion programming and runtime infrastructure. Our findings will enable HPC system designers and operators to develop and tune the performance of data-centric HPC applications with constraints on power and energy consumption

    Adaptive runtime techniques for power and resource management on multi-core systems

    Full text link
    Energy-related costs are among the major contributors to the total cost of ownership of data centers and high-performance computing (HPC) clusters. As a result, future data centers must be energy-efficient to meet the continuously increasing computational demand. Constraining the power consumption of the servers is a widely used approach for managing energy costs and complying with power delivery limitations. In tandem, virtualization has become a common practice, as virtualization reduces hardware and power requirements by enabling consolidation of multiple applications on to a smaller set of physical resources. However, administration and management of data center resources have become more complex due to the growing number of virtualized servers installed in data centers. Therefore, designing autonomous and adaptive energy efficiency approaches is crucial to achieve sustainable and cost-efficient operation in data centers. Many modern data centers running enterprise workloads successfully implement energy efficiency approaches today. However, the nature of multi-threaded applications, which are becoming more common in all computing domains, brings additional design and management challenges. Tackling these challenges requires a deeper understanding of the interactions between the applications and the underlying hardware nodes. Although cluster-level management techniques bring significant benefits, node-level techniques provide more visibility into application characteristics, which can then be used to further improve the overall energy efficiency of the data centers. This thesis proposes adaptive runtime power and resource management techniques on multi-core systems. It demonstrates that taking the multi-threaded workload characteristics into account during management significantly improves the energy efficiency of the server nodes, which are the basic building blocks of data centers. The key distinguishing features of this work are as follows: We implement the proposed runtime techniques on state-of-the-art commodity multi-core servers and show that their energy efficiency can be significantly improved by (1) taking multi-threaded application specific characteristics into account while making resource allocation decisions, (2) accurately tracking dynamically changing power constraints by using low-overhead application-aware runtime techniques, and (3) coordinating dynamic adaptive decisions at various layers of the computing stack, specifically at system and application levels. Our results show that efficient resource distribution under power constraints yields energy savings of up to 24% compared to existing approaches, along with the ability to meet power constraints 98% of the time for a diverse set of multi-threaded applications

    Power Bounded Computing on Current & Emerging HPC Systems

    Get PDF
    Power has become a critical constraint for the evolution of large scale High Performance Computing (HPC) systems and commercial data centers. This constraint spans almost every level of computing technologies, from IC chips all the way up to data centers due to physical, technical, and economic reasons. To cope with this reality, it is necessary to understand how available or permissible power impacts the design and performance of emergent computer systems. For this reason, we propose power bounded computing and corresponding technologies to optimize performance on HPC systems with limited power budgets. We have multiple research objectives in this dissertation. They center on the understanding of the interaction between performance, power bounds, and a hierarchical power management strategy. First, we develop heuristics and application aware power allocation methods to improve application performance on a single node. Second, we develop algorithms to coordinate power across nodes and components based on application characteristic and power budget on a cluster. Third, we investigate performance interference induced by hardware and power contentions, and propose a contention aware job scheduling to maximize system throughput under given power budgets for node sharing system. Fourth, we extend to GPU-accelerated systems and workloads and develop an online dynamic performance & power approach to meet both performance requirement and power efficiency. Power bounded computing improves performance scalability and power efficiency and decreases operation costs of HPC systems and data centers. This dissertation opens up several new ways for research in power bounded computing to address the power challenges in HPC systems. The proposed power and resource management techniques provide new directions and guidelines to green exscale computing and other computing systems

    Energy-Based Accounting and Scheduling of Virtual Machines in a Cloud System

    Get PDF
    Computer systems & NetworksVirtualization enables flexible resource provisioning and improves energy efficiency through consolidating virtualized servers into a smaller number of physical servers than that of the virtualized servers. Therefore, it is becoming an essential component for the emerging cloud computing model. Currently, virtualized environment including cloud computing systems bills users for the amount of their processor time, or the number of their virtual machine instances. However, accounting based only on the depreciation cost of server hardware is not an economically proper model because the cooling and energy cost for datacenters has already exceeded the cost to own servers. This paper suggests an estimation model to account energy consumption of each virtual machine without any dedicated measurement hardware. Our estimation model estimates the energy consumption of a virtual machine based on the in-processor events generated by the virtual machine. Based on the estimation model, this paper also proposes the virtual machine scheduling algorithm that is able to provide computing resources according to the energy budget of each virtual machine. The suggested schemes are implemented in the Xen virtualization system, and the evaluation shows the suggested schemes estimate and provide energy consumption with errors less than 5% of the total energy consumption.ope

    Adaptation-Aware Architecture Modeling and Analysis of Energy Efficiency for Software Systems

    Get PDF
    This thesis presents an approach for the design time analysis of energy efficiency for static and self-adaptive software systems. The quality characteristics of a software system, such as performance and operating costs, strongly depend upon its architecture. Software architecture is a high-level view on software artifacts that reflects essential quality characteristics of a system under design. Design decisions made on an architectural level have a decisive impact on the quality of a system. Revising architectural design decisions late into development requires significant effort. Architectural analyses allow software architects to reason about the impact of design decisions on quality, based on an architectural description of the system. An essential quality goal is the reduction of cost while maintaining other quality goals. Power consumption accounts for a significant part of the Total Cost of Ownership (TCO) of data centers. In 2010, data centers contributed 1.3% of the world-wide power consumption. However, reasoning on the energy efficiency of software systems is excluded from the systematic analysis of software architectures at design time. Energy efficiency can only be evaluated once the system is deployed and operational. One approach to reduce power consumption or cost is the introduction of self-adaptivity to a software system. Self-adaptive software systems execute adaptations to provision costly resources dependent on user load. The execution of reconfigurations can increase energy efficiency and reduce cost. If performed improperly, however, the additional resources required to execute a reconfiguration may exceed their positive effect. Existing architecture-level energy analysis approaches offer limited accuracy or only consider a limited set of system features, e.g., the used communication style. Predictive approaches from the embedded systems and Cloud Computing domain operate on an abstraction that is not suited for architectural analysis. The execution of adaptations can consume additional resources. The additional consumption can reduce performance and energy efficiency. Design time quality analyses for self-adaptive software systems ignore this transient effect of adaptations. This thesis makes the following contributions to enable the systematic consideration of energy efficiency in the architectural design of self-adaptive software systems: First, it presents a modeling language that captures power consumption characteristics on an architectural abstraction level. Second, it introduces an energy efficiency analysis approach that uses instances of our power consumption modeling language in combination with existing performance analyses for architecture models. The developed analysis supports reasoning on energy efficiency for static and self-adaptive software systems. Third, to ease the specification of power consumption characteristics, we provide a method for extracting power models for server environments. The method encompasses an automated profiling of servers based on a set of restrictions defined by the user. A model training framework extracts a set of power models specified in our modeling language from the resulting profile. The method ranks the trained power models based on their predicted accuracy. Lastly, this thesis introduces a systematic modeling and analysis approach for considering transient effects in design time quality analyses. The approach explicitly models inter-dependencies between reconfigurations, performance and power consumption. We provide a formalization of the execution semantics of the model. Additionally, we discuss how our approach can be integrated with existing quality analyses of self-adaptive software systems. We validated the accuracy, applicability, and appropriateness of our approach in a variety of case studies. The first two case studies investigated the accuracy and appropriateness of our modeling and analysis approach. The first study evaluated the impact of design decisions on the energy efficiency of a media hosting application. The energy consumption predictions achieved an absolute error lower than 5.5% across different user loads. Our approach predicted the relative impact of the design decision on energy efficiency with an error of less than 18.94%. The second case study used two variants of the Spring-based community case study system PetClinic. The case study complements the accuracy and appropriateness evaluation of our modeling and analysis approach. We were able to predict the energy consumption of both variants with an absolute error of no more than 2.38%. In contrast to the first case study, we derived all models automatically, using our power model extraction framework, as well as an extraction framework for performance models. The third case study applied our model-based prediction to evaluate the effect of different self-adaptation algorithms on energy efficiency. It involved scientific workloads executed in a virtualized environment. Our approach predicted the energy consumption with an error below 7.1%, even though we used coarse grained measurement data of low accuracy to train the input models. The fourth case study evaluated the appropriateness and accuracy of the automated model extraction method using a set of Big Data and enterprise workloads. Our method produced power models with prediction errors below 5.9%. A secondary study evaluated the accuracy of extracted power models for different Virtual Machine (VM) migration scenarios. The results of the fifth case study showed that our approach for modeling transient effects improved the prediction accuracy for a horizontally scaling application. Leveraging the improved accuracy, we were able to identify design deficiencies of the application that otherwise would have remained unnoticed

    Multi-dimensional optimization for cloud based multi-tier applications

    Get PDF
    Emerging trends toward cloud computing and virtualization have been opening new avenues to meet enormous demands of space, resource utilization, and energy efficiency in modern data centers. By being allowed to host many multi-tier applications in consolidated environments, cloud infrastructure providers enable resources to be shared among these applications at a very fine granularity. Meanwhile, resource virtualization has recently gained considerable attention in the design of computer systems and become a key ingredient for cloud computing. It provides significant improvement of aggregated power efficiency and high resource utilization by enabling resource consolidation. It also allows infrastructure providers to manage their resources in an agile way under highly dynamic conditions. However, these trends also raise significant challenges to researchers and practitioners to successfully achieve agile resource management in consolidated environments. First, they must deal with very different responsiveness of different applications, while handling dynamic changes in resource demands as applications' workloads change over time. Second, when provisioning resources, they must consider management costs such as power consumption and adaptation overheads (i.e., overheads incurred by dynamically reconfiguring resources). Dynamic provisioning of virtual resources entails the inherent performance-power tradeoff. Moreover, indiscriminate adaptations can result in significant overheads on power consumption and end-to-end performance. Hence, to achieve agile resource management, it is important to thoroughly investigate various performance characteristics of deployed applications, precisely integrate costs caused by adaptations, and then balance benefits and costs. Fundamentally, the research question is how to dynamically provision available resources for all deployed applications to maximize overall utility under time-varying workloads, while considering such management costs. Given the scope of the problem space, this dissertation aims to develop an optimization system that not only meets performance requirements of deployed applications, but also addresses tradeoffs between performance, power consumption, and adaptation overheads. To this end, this dissertation makes two distinct contributions. First, I show that adaptations applied to cloud infrastructures can cause significant overheads on not only end-to-end response time, but also server power consumption. Moreover, I show that such costs can vary in intensity and time scale against workload, adaptation types, and performance characteristics of hosted applications. Second, I address multi-dimensional optimization between server power consumption, performance benefit, and transient costs incurred by various adaptations. Additionally, I incorporate the overhead of the optimization procedure itself into the problem formulation. Typically, system optimization approaches entail intensive computations and potentially have a long delay to deal with a huge search space in cloud computing infrastructures. Therefore, this type of cost cannot be ignored when adaptation plans are designed. In this multi-dimensional optimization work, scalable optimization algorithm and hierarchical adaptation architecture are developed to handle many applications, hosting servers, and various adaptations to support various time-scale adaptation decisions.Ph.D.Committee Chair: Pu, Calton; Committee Member: Liu, Ling; Committee Member: Liu, Xue; Committee Member: Schlichting, Richard; Committee Member: Schwan, Karsten; Committee Member: Yalamanchili, Sudhaka

    Understanding Security Threats in Cloud

    Get PDF
    As cloud computing has become a trend in the computing world, understanding its security concerns becomes essential for improving service quality and expanding business scale. This dissertation studies the security issues in a public cloud from three aspects. First, we investigate a new threat called power attack in the cloud. Second, we perform a systematical measurement on the public cloud to understand how cloud vendors react to existing security threats. Finally, we propose a novel technique to perform data reduction on audit data to improve system capacity, and hence helping to enhance security in cloud. In the power attack, we exploit various attack vectors in platform as a service (PaaS), infrastructure as a service (IaaS), and software as a service (SaaS) cloud environments. to demonstrate the feasibility of launching a power attack, we conduct series of testbed based experiments and data-center-level simulations. Moreover, we give a detailed analysis on how different power management methods could affect a power attack and how to mitigate such an attack. Our experimental results and analysis show that power attacks will pose a serious threat to modern data centers and should be taken into account while deploying new high-density servers and power management techniques. In the measurement study, we mainly investigate how cloud vendors have reacted to the co-residence threat inside the cloud, in terms of Virtual Machine (VM) placement, network management, and Virtual Private Cloud (VPC). Specifically, through intensive measurement probing, we first profile the dynamic environment of cloud instances inside the cloud. Then using real experiments, we quantify the impacts of VM placement and network management upon co-residence, respectively. Moreover, we explore VPC, which is a defensive service of Amazon EC2 for security enhancement, from the routing perspective. Advanced Persistent Threat (APT) is a serious cyber-threat, cloud vendors are seeking solutions to ``connect the suspicious dots\u27\u27 across multiple activities. This requires ubiquitous system auditing for long period of time, which in turn causes overwhelmingly large amount of system audit logs. We propose a new approach that exploits the dependency among system events to reduce the number of log entries while still supporting high quality forensics analysis. In particular, we first propose an aggregation algorithm that preserves the event dependency in data reduction to ensure high quality of forensic analysis. Then we propose an aggressive reduction algorithm and exploit domain knowledge for further data reduction. We conduct a comprehensive evaluation on real world auditing systems using more than one-month log traces to validate the efficacy of our approach
    corecore