
    Power Analysis and Optimization Techniques for Energy Efficient Computer Systems

    Reducing power consumption has become a major challenge in the design and operation of today’s computer systems. This chapter describes different techniques addressing this challenge at different levels of system hardware, such as CPU, memory, and internal interconnection network, as well as at different levels of software components, such as compiler, operating system and user applications. These techniques can be broadly categorized into two types: design-time power analysis and run-time dynamic power management. Mechanisms in the first category use analytical energy models that are integrated into existing simulators to measure the system’s power consumption and thus help engineers to test power-conscious hardware and software during design time. On the other hand, dynamic power management techniques are applied during run-time, and are used to monitor system workload and adapt the system’s behavior dynamically to save energy.

    Multilayer Modeling and Design of Energy Managed Microsystems

    Aggressive energy reduction is one of the key technological challenges that all segments of the semiconductor industry have encountered in the past few years. In addition, the notion of environmental awareness and designing “green” products is yet another major driver for ultra low energy design of electronic systems. Energy management is one of the unique solutions that can address the simultaneous requirements of high-performance, (ultra) low energy and greenness in many classes of computing systems; including high-performance, embedded and wireless. These considerations motivate the focus of this dissertation on the energy efficiency improvement of Energy Managed Microsystems (EMM or EM2). The aim is to maximize the energy efficiency and/or the operational lifetime of these systems. In this thesis we propose solutions that are applicable to many classes of computing systems including high-performance and mobile computing systems. These solutions contribute to make such technologies “greener”. The proposed solutions are multilayer, since they belong to, and may be applicable to, multiple design abstraction layers. The proposed solutions are orthogonal to each other, and if deployed simultaneously in a vertical system integration approach, when possible, the net benefit may be as large as the multiplication of the individual benefits. At high-level, this thesis initially focuses on the modeling and design of interconnections for EM2. For this purpose, a design flow has been proposed for interconnections in EM2. This flow allows designing interconnects with minimum energy requirements that meet all the considered performance objectives, in all specified system operating states. Later, models for energy performance estimation of EM2 are proposed. By energy performance, we refer to the improvements of energy savings of the computing platforms, obtained when some enhancements are applied to those platforms. 
These models are based on the components of the application profile. The adopted method is inspired by Amdahl’s law, which is driven by the fact that ‘energy’ is ‘additive’, as ‘time’ is ‘additive’. These models can be used for the design space exploration of EM2. The proposed models are high-level and therefore they are easy to use and show fair accuracy, 9.1% error on average, when compared to the results of the implemented benchmarks. Finally, models to estimate energy consumption of EM2 according to their “activity” are proposed. By “activity” we mean the rate at which EM2 perform a set of predefined application functions. Good estimations of energy requirements are very useful when designing and managing the EM2 activity, in order to extend their battery lifetime. The study of the proposed models on some Wireless Sensor Network (WSN) application benchmark confirms a fair accuracy for the energy estimation models, 3% error on average on the considered benchmarks
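
The abstract above does not give the thesis's actual equations, so the following is only a minimal sketch of the generic Amdahl-style form it alludes to, assuming energy is additive across application components. The function name `energy_improvement` and its parameters are hypothetical and are not taken from the thesis.

```python
def energy_improvement(enhancements):
    """Amdahl-style estimate of the overall energy improvement.

    `enhancements` is a list of (fraction, gain) pairs, where `fraction`
    is the share of baseline energy spent in a component and `gain` is
    the factor by which an enhancement reduces that component's energy.
    Because energy is additive, the untouched share is left unchanged.
    """
    enhanced_share = sum(fraction for fraction, _ in enhancements)
    if not 0.0 <= enhanced_share <= 1.0:
        raise ValueError("component fractions must sum to a value in [0, 1]")
    if any(gain <= 0 for _, gain in enhancements):
        raise ValueError("gains must be positive")
    # Normalized energy after the enhancements (baseline energy = 1.0).
    new_energy = (1.0 - enhanced_share) + sum(
        fraction / gain for fraction, gain in enhancements
    )
    return 1.0 / new_energy
```

For instance, halving the energy of a component that accounts for half of the baseline energy leaves 0.5 + 0.25 = 0.75 of the original energy, which is an overall improvement factor of about 1.33.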

    Architecture of an end-to-end energy consumption model for a cloud data center

    Estimates show that a significant proportion of future ICT-related energy consumption will be from Cloud Computing. Based on a detailed analysis and survey of energy consumption and optimization trends in cloud computing, this research presents a comprehensive end-to-end energy consumption model of a cloud facility extending from the end-user equipment to the data center facility. The model is subdivided into three planes and four associated layers and depicts the cross-plane and cross-layer relationships between the components in terms of energy consumption and potential optimization areas and provides a reference framework for planning power optimization strategies at a cloud facility.

    Hardware acceleration for power efficient deep packet inspection

    The rapid growth of the Internet leads to a massive spread of malicious attacks like viruses and malware, making the safety of online activity a major concern. The use of Network Intrusion Detection Systems (NIDS) is an effective method to safeguard the Internet. One key procedure in NIDS is Deep Packet Inspection (DPI). DPI can examine the contents of a packet and take actions on the packets based on predefined rules. In this thesis, DPI is mainly discussed in the context of security applications. However, DPI can also be used for bandwidth management and network surveillance. DPI inspects the whole packet payload, and due to this and the complexity of the inspection rules, DPI algorithms consume significant amounts of resources including time, memory and energy. The aim of this thesis is to design hardware accelerated methods for memory and energy efficient high-speed DPI. The patterns in packet payloads, especially complex patterns, can be efficiently represented by regular expressions, which can be translated by the use of Deterministic Finite Automata (DFA). DFA algorithms are fast but consume very large amounts of memory with certain kinds of regular expressions. In this thesis, memory efficient algorithms are proposed based on the transition compressions of the DFAs. In this work, Bloom filters are used to implement DPI on an FPGA for hardware acceleration with the design of a parallel architecture. Furthermore, aiming at a balance of power and performance, an energy efficient adaptive Bloom filter is designed with the capability of adjusting the number of active hash functions according to current workload. In addition, a method is given for implementation on both two-stage and multi-stage platforms. Nevertheless, false positive rates still prevent the Bloom filter from extensive utilization; a cache-based counting Bloom filter is presented in this work to get rid of the false positives for fast and precise matching. 
Finally, in future work, in order to estimate the effect of power savings, models will be built for routers and DPI, which will also analyze the latency impact of dynamic frequency adaptation to current traffic. In addition, a low power DPI system will be designed with a single or multiple DPI engines. Results and evaluation of the low power DPI model and system will be produced in the future.
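
The idea of a Bloom filter whose number of active hash functions can be adjusted, mentioned in the abstract above, can be illustrated in software. This is a rough sketch under stated assumptions, not the thesis's FPGA design: the class `AdaptiveBloomFilter`, its parameters, and the use of SHA-256 as the hash family are all hypothetical choices made for illustration. Insertions always set all bit positions, so querying with fewer hashes does less work at the cost of more false positives, but never produces a false negative.

```python
import hashlib


class AdaptiveBloomFilter:
    """Bloom filter whose query cost can be lowered by using fewer hashes."""

    def __init__(self, num_bits=1024, max_hashes=4):
        self.num_bits = num_bits
        self.max_hashes = max_hashes
        self.active_hashes = max_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity

    def _index(self, item: bytes, i: int) -> int:
        # Derive the i-th hash by salting SHA-256 with the hash number.
        digest = hashlib.sha256(bytes([i]) + item).digest()
        return int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: bytes) -> None:
        # Always set all positions so that any prefix of hashes still matches.
        for i in range(self.max_hashes):
            self.bits[self._index(item, i)] = 1

    def set_active_hashes(self, k: int) -> None:
        # Clamp to [1, max_hashes]; a workload monitor would call this.
        self.active_hashes = max(1, min(k, self.max_hashes))

    def might_contain(self, item: bytes) -> bool:
        return all(
            self.bits[self._index(item, i)] for i in range(self.active_hashes)
        )
```

A controller could call `set_active_hashes` with a smaller value under heavy traffic and restore the full count when the load drops; the policy for choosing the value is left open here because the abstract does not describe it.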

    RIM: Reconfigurable Instruction Memory Hierarchy for Embedded Systems

    Ph.D. (Doctor of Philosophy)

    Software Power Analysis And Optimization For Power-Aware Multicore Systems

    Among all the factors in sustainable computing, power dissipation and energy consumption are arguably fundamental aspects of modern computer systems. Unlike performance metrics, power dissipation is not easy to measure because hardware instrumentation is usually required. Yet as an indispensable component of a computer system, software becomes a major factor affecting power dissipation besides hardware energy-efficiency and power states. With detailed information on resource usage and power dissipation of an application/software, software developers will be able to leverage algorithms and implementations in order to produce power-efficient solutions. Hardware instrumentation, despite its accuracy, is costly and complicated to set up. A general solution to connect software with hardware along with detailed power and system information will improve the overall system efficiency. In this work, we design and implement a general solution to analyze and model software power dissipation. Based on the analysis, we propose a combined solution to optimize the energy efficiency of parallel workloads. Starting from the hands-on power measurement method in detail, we provide a fine-grain power profile of two computer systems using hardware instrumentation. Focusing on dynamic power dissipation analysis, we propose a two-level power model for power-aware multicore computer systems. Based on the model, we design and implement SPAN to relate power dissipation to the different portions of an application using the proposed power model. By using SPAN, developers can easily identify the sections of code consuming the most power in the program. Alternatively, to enable automatic source code instrumentation, we utilize compiler techniques to insert profiling code before and after each function in source code. The expected outcome includes an open source function level power profiling tool, Safari. 
Using the profiling tools, we propose a model to capture the relationship between concurrency (C), power (P) and execution time (T). By changing the system configuration for different parallel workloads, we are able to achieve optimal/near optimal energy-efficient execution of a given workload on a specific platform.
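
The abstract above does not state the form of its concurrency, power and time model. As a minimal sketch, assuming only the standard relation that energy is average power multiplied by execution time, the most energy-efficient concurrency level can be picked from profiled measurements as follows. The function name and the measurement values in the usage note are hypothetical.

```python
def most_energy_efficient(measurements):
    """Return the concurrency level with the lowest energy, E = P * T.

    `measurements` maps a concurrency level C to a tuple
    (average power P in watts, execution time T in seconds).
    """
    if not measurements:
        raise ValueError("at least one measurement is required")

    def energy(concurrency):
        power, time = measurements[concurrency]
        return power * time  # joules

    return min(measurements, key=energy)
```

With made-up measurements such as `{1: (50.0, 100.0), 2: (70.0, 55.0), 4: (110.0, 40.0)}`, the energies are 5000 J, 3850 J and 4400 J, so two threads would be selected even though four threads finish sooner.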

    Power Management for Deep Submicron Microprocessors

    As VLSI technology scales, the enhanced performance of smaller transistors comes at the expense of increased power consumption. In addition to the dynamic power consumed by the circuits there is a tremendous increase in the leakage power consumption which is further exacerbated by the increasing operating temperatures. The total power consumption of modern processors is distributed between the processor core, memory and interconnects. In this research two novel power management techniques are presented targeting the functional units and the global interconnects. First, since most leakage control schemes for processor functional units are based on circuit level techniques, such schemes inherently lack information about the operational profile of higher-level components of the system. This is a barrier to the pivotal task of predicting standby time. Without this prediction, it is extremely difficult to assess the value of any leakage control scheme. Consequently, a methodology that can predict the standby time is highly beneficial in bridging the gap between the information available at the application level and the circuit implementations. In this work, a novel Dynamic Sleep Signal Generator (DSSG) is presented. It utilizes the usage traces extracted from cycle accurate simulations of benchmark programs to predict the long standby periods associated with the various functional units. The DSSG bases its decisions on the current and previous standby state of the functional units to accurately predict the length of the next standby period. The DSSG presents an alternative to Static Sleep Signal Generation (SSSG) based on static counters that trigger the generation of the sleep signal when the functional units idle for a prespecified number of cycles. The test results of the DSSG are obtained by the use of a modified RISC superscalar processor, implemented by SimpleScalar, the most widely accepted open source vehicle for architectural analysis. 
In addition, the results are further verified by a Simultaneous Multithreading simulator implemented by SMTSIM. Results show an increase of up to 146% in leakage savings using the DSSG versus the SSSG, with an accuracy of 60-80% for predicting long standby periods. Second, chip designers, in their effort to achieve timing closure, have focused on achieving the lowest possible interconnect delay through buffer insertion and routing techniques. This approach, though, taxes the power budget of modern ICs, especially those intended for wireless applications. Also, in order to achieve more functionality, die sizes are constantly increasing. This trend is leading to an increase in the average global interconnect length which, in turn, requires more buffers to achieve timing closure. Unconstrained buffering is bound to adversely affect the overall chip performance, if the power consumption is added as a major performance metric. In fact, the number of global interconnect buffers is expected to reach hundreds of thousands to achieve an appropriate timing closure. To mitigate the impact of the power consumed by the interconnect buffers, a power-efficient multi-pin routing technique is proposed in this research. The problem is based on a graph representation of the routing possibilities, including buffer insertion and identifying the least power path between the interconnect source and set of sinks. The novel multi-pin routing technique is tested by applying it to the ISPD and IBM benchmarks to verify the accuracy, complexity, and solution quality. Results obtained indicate that average power savings as high as 32% for the 130-nm technology are achieved with no impact on the maximum chip frequency.
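
The static baseline described in the abstract above, in which a counter raises the sleep signal once a functional unit has idled for a prespecified number of cycles, can be sketched in a few lines. This illustrates only the SSSG-style counter, not the DSSG predictor, whose details the abstract does not give; the function name and the trace format are assumptions.

```python
def static_sleep_signal(busy_trace, threshold):
    """Per-cycle sleep signal from a static idle counter.

    `busy_trace` is a sequence of truthy/falsy values, one per cycle,
    indicating whether the functional unit is in use. The sleep signal
    is asserted once the unit has been idle for `threshold` consecutive
    cycles and is cleared as soon as the unit is used again.
    """
    if threshold < 1:
        raise ValueError("threshold must be at least 1 cycle")
    sleep = []
    idle_cycles = 0
    for busy in busy_trace:
        idle_cycles = 0 if busy else idle_cycles + 1
        sleep.append(idle_cycles >= threshold)
    return sleep
```

The weakness that motivates a predictive scheme is visible in this sketch: every idle period costs `threshold - 1` cycles of leakage before the unit sleeps, and short idle periods just above the threshold trigger a sleep transition that may not pay off.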

    ENERGY-AWARE OPTIMIZATION FOR EMBEDDED SYSTEMS WITH CHIP MULTIPROCESSOR AND PHASE-CHANGE MEMORY

    Over the last two decades, functions of the embedded systems have evolved from simple real-time control and monitoring to more complicated services. Embedded systems equipped with powerful chips can provide the performance that computationally demanding information processing applications need. However, due to the power issue, the easy way to gain increasing performance by scaling up chip frequencies is no longer feasible. Recently, low-power architecture designs have been the main trend in embedded system designs. In this dissertation, we present our approaches to attack the energy-related issues in embedded system designs, such as thermal issues in the 3D chip multiprocessor (CMP), the endurance issue in the phase-change memory (PCM), the battery issue in the embedded system designs, the impact of inaccurate information in embedded systems, and the use of cloud computing to move the workload to remote cloud computing facilities. We propose a real-time constrained task scheduling method to reduce peak temperature on a 3D CMP, including an online 3D CMP temperature prediction model and a set of algorithms for scheduling tasks to different cores in order to minimize the peak temperature on chip. To address the challenging issues in applying PCM in embedded systems, we propose a PCM main memory optimization mechanism through the utilization of the scratch pad memory (SPM). Furthermore, we propose an MLC/SLC configuration optimization algorithm to enhance the efficiency of the hybrid DRAM + PCM memory. We also propose an energy-aware task scheduling algorithm for parallel computing in mobile systems powered by batteries. When scheduling tasks in embedded systems, we make the scheduling decisions based on information, such as estimated execution time of tasks. Therefore, we design an evaluation method for impacts of inaccurate information on the resource allocation in embedded systems. 
Finally, in order to move workload from embedded systems to remote cloud computing facilities, we present a resource optimization mechanism in heterogeneous federated multi-cloud systems. We also propose two online dynamic algorithms for resource allocation and task scheduling, and we consider resource contention in the task scheduling.