
    HAPPE: Human and Application-Driven Frequency Scaling for Processor Power Efficiency

    Get PDF
    Abstract: Conventional dynamic voltage and frequency scaling techniques use high CPU utilization as a predictor for user dissatisfaction, to which they react by increasing CPU frequency. In this paper, we demonstrate that for many interactive applications, perceived performance is highly dependent upon the particular user and application, and is not linearly related to CPU utilization. This observation reveals an opportunity for reducing power consumption. We propose Human and Application driven frequency scaling for Processor Power Efficiency (HAPPE), an adaptive user- and application-aware dynamic CPU frequency scaling technique. HAPPE continuously adapts processor frequency and voltage to the learned performance requirement of the current user and application. Adaptation to user requirements is quick and requires minimal effort from the user (typically a handful of keystrokes). Once the system has adapted to the user's performance requirements, the user is not required to provide continued feedback but is permitted to provide additional feedback to adjust the control policy to changes in preferences. HAPPE was implemented on a Linux-based laptop and evaluated in 22 hours of controlled user studies. Compared to the default Linux CPU frequency controller, HAPPE reduces the measured system-wide power consumption of CPU-intensive interactive applications by 25 percent on average while maintaining user satisfaction.
    Index Terms: Power, CPU frequency scaling, user-driven study, mobile systems
    1 INTRODUCTION
    Power efficiency has been a major technology driver for battery-powered mobile systems, such as mobile phones, personal digital assistants, MP3 players, and laptops. Power efficiency has also become a new focus for line-powered desktop systems and data centers because of its impact on power dissipation and chip temperature, which affect performance, reliability, and lifetime. Processor power consumption is often a substantial portion of system power consumption in mobile systems. Traditional CPU power management approaches can lose sight of an important fact: the ultimate goal of any computer system is to satisfy its users, not to execute a particular number of instructions per second. Although CPU utilization is a good indication of processor performance, the actual perceivable system performance depends on individual users and applications, and user satisfaction is not linearly related to CPU utilization. We conducted a study on 10 users with four interactive applications and found that for some applications, some users are satisfied with system performance when the processor is at the lowest frequency, while other users may not be satisfied even when it operates at the highest frequency. We also found that users may be insensitive to varying processor frequency for one application, but may be very sensitive to such changes for another application. Traditional DVFS policies that consider only CPU utilization or other user-oblivious performance metrics are often too pessimistic about user performance requirements, and use a high frequency to satisfy all users, resulting in wasted power. Similar findings were also reported in other studies. In this paper, we propose Human and Application driven frequency scaling for Processor Power Efficiency (HAPPE), a CPU DVFS technique that adapts voltage and frequency to the performance requirement of the current user and application.
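    The core mechanism the abstract describes, a few keystrokes of user feedback training a per-user, per-application frequency setting, can be illustrated with a short sketch. Everything here (the frequency table, the hotkey semantics, the update rule) is an assumption for illustration, not HAPPE's actual controller:

```python
# Minimal sketch of a HAPPE-style feedback loop: a per-(user, application)
# frequency preference, nudged by explicit user feedback. All names and the
# update rule are illustrative assumptions.

FREQ_LEVELS_MHZ = [600, 800, 1000, 1200, 1400, 1600]  # hypothetical P-states

class UserAwareGovernor:
    def __init__(self):
        self.preference = {}  # (user, app) -> index into FREQ_LEVELS_MHZ

    def on_user_feedback(self, user, app, wants_faster):
        """Hotkey handler: a keystroke nudges the learned setting."""
        idx = self.preference.get((user, app), len(FREQ_LEVELS_MHZ) - 1)
        if wants_faster:
            idx = min(idx + 1, len(FREQ_LEVELS_MHZ) - 1)
        else:
            idx = max(idx - 1, 0)  # user tolerates a slowdown: save power
        self.preference[(user, app)] = idx

    def frequency_for(self, user, app):
        # Until trained, default to the highest frequency (the pessimistic
        # assumption conventional utilization-driven governors make).
        idx = self.preference.get((user, app), len(FREQ_LEVELS_MHZ) - 1)
        return FREQ_LEVELS_MHZ[idx]
```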

    Polymorphic computing abstraction for heterogeneous architectures

    Get PDF
    Integration of multiple computing paradigms onto a system on chip (SoC) has pushed the boundaries of design space exploration for hardware architectures and the computing system software stack. The heterogeneity of computing styles on an SoC has created a new class of architectures referred to as Heterogeneous Architectures. Novel applications developed to exploit the different computing styles are user-centric for embedded SoCs. Software and hardware designers face several challenges in harnessing the full potential of heterogeneous architectures. Applications have to execute on more than one compute style to increase overall SoC resource utilization. The implication of such an abstraction is that application threads need to be polymorphic. The operating system layer is thus faced with the problem of scheduling polymorphic threads. Resource allocation is also an important problem to be dealt with by the OS. Morphism evolution of application threads is constrained by the availability of heterogeneous computing resources. Traditional design optimization goals such as computational power and lower energy per computation are inadequate to satisfy user-centric application resource needs. Resource allocation decisions at the application layer need to permeate to the architectural layer to avoid conflicting demands which may affect the energy-delay characteristics of application threads. We propose the Polymorphic computing abstraction as a unified computing model for heterogeneous architectures to address the above issues. A simulation environment for polymorphic applications is developed and evaluated under various scheduling strategies to determine the effectiveness of the polymorphism abstraction on resource allocation. A user satisfaction model is also developed to complement polymorphism and is used to optimize resource utilization at the application and network layers of embedded systems.
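    As a rough illustration of what scheduling polymorphic threads might involve, the sketch below gives each thread one implementation ("morphism") per compute style and picks among the styles that are currently free. The cost model, style names, and data structures are invented for illustration; the thesis's actual abstraction and scheduling strategies are richer:

```python
# Illustrative sketch only: pick the runnable morphism of a polymorphic
# thread with the best (lowest) assumed energy-delay product.

from dataclasses import dataclass, field

@dataclass
class Morphism:
    style: str            # e.g. "cpu", "gpu", "dsp" (hypothetical styles)
    energy_delay: float   # assumed energy-delay product of this variant

@dataclass
class PolymorphicThread:
    name: str
    morphisms: list = field(default_factory=list)

def schedule(thread, free_resources):
    """Choose a morphism whose compute style is currently free."""
    candidates = [m for m in thread.morphisms if m.style in free_resources]
    if not candidates:
        return None  # no compatible resource free; the thread must wait
    return min(candidates, key=lambda m: m.energy_delay)

t = PolymorphicThread("decode", [Morphism("cpu", 4.0), Morphism("dsp", 1.5)])
print(schedule(t, free_resources={"cpu"}).style)  # -> "cpu" (dsp is busy)
```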

    Automatic Performance Setting for Dynamic Voltage Scaling

    Full text link
    The emphasis on processors that are both low power and high performance has resulted in the incorporation of dynamic voltage scaling into processor designs. This feature allows one to make fine-granularity tradeoffs between power use and performance, provided there is a mechanism in the OS to control that tradeoff. In this paper, we describe a novel software approach to automatically controlling dynamic voltage scaling in order to optimize energy use. Our mechanism is implemented in the Linux kernel and requires no modification of user programs. Unlike previous automated approaches, our method works equally well with irregular and multiprogrammed workloads. Moreover, it has the ability to ensure that the quality of interactive performance is within user-specified parameters. Our experiments show that as a result of our algorithm, processor energy savings of as much as 75% can be achieved with only a minimal impact on the user experience.
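    The paper's goal, keeping interactive performance within user-specified bounds while minimizing energy, is commonly realized by picking the lowest frequency whose predicted completion time still meets a perception deadline. The sketch below shows that general idea; the 50 ms deadline, frequency table, and exponentially weighted predictor are assumptions, not the paper's algorithm:

```python
# Deadline-driven frequency choice (generic sketch, not the paper's method).

PERCEPTION_DEADLINE_S = 0.050           # assumed interactive deadline
FREQS_HZ = [0.6e9, 1.0e9, 1.4e9, 1.8e9]  # hypothetical frequency table

def choose_frequency(predicted_cycles):
    """Lowest frequency that still meets the deadline, else the maximum."""
    for f in FREQS_HZ:                  # lowest first: prefer energy savings
        if predicted_cycles / f <= PERCEPTION_DEADLINE_S:
            return f
    return FREQS_HZ[-1]                 # deadline unreachable: run flat out

def predict_cycles(prev_prediction, observed_cycles, alpha=0.5):
    """Exponentially weighted estimate of the next episode's cycle demand."""
    return alpha * observed_cycles + (1 - alpha) * prev_prediction

print(choose_frequency(40e6))  # 40M cycles fit at 1.0 GHz -> 1000000000.0
```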

    Tailbench: a benchmark suite and evaluation methodology for latency-critical applications

    Get PDF
    Latency-critical applications, common in datacenters, must achieve small and predictable tail (e.g., 95th or 99th percentile) latencies. Their strict performance requirements limit utilization and efficiency in current datacenters. These problems have sparked research in hardware and software techniques that target tail latency. However, research in this area is hampered by the lack of a comprehensive suite of latency-critical benchmarks. We present TailBench, a benchmark suite and evaluation methodology that makes latency-critical workloads as easy to run and characterize as conventional, throughput-oriented ones. TailBench includes eight applications that span a wide range of latency requirements and domains, and a harness that implements a robust and statistically sound load-testing methodology. The modular design of the TailBench harness facilitates multiple load-testing scenarios, ranging from multi-node configurations that capture network overheads, to simplified single-node configurations that allow measuring tail latency in simulation. Validation results show that the simplified configurations are accurate for most applications. This flexibility enables rapid prototyping of hardware and software techniques for latency-critical workloads.
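    The tail metric TailBench targets is easy to state concretely: rather than the mean, one reports a high percentile of per-request latency, which a few slow requests can dominate. A small nearest-rank illustration (not TailBench's harness code):

```python
# Nearest-rank percentile over measured per-request latencies.

def percentile(latencies, p):
    """Nearest-rank p-th percentile; assumes a non-empty list."""
    ranked = sorted(latencies)
    k = max(0, min(len(ranked) - 1, round(p / 100 * len(ranked)) - 1))
    return ranked[k]

lat_ms = [2, 3, 3, 4, 4, 5, 5, 6, 40, 90]   # one slow tail dominates
print(percentile(lat_ms, 50))  # median: 4 ms
print(percentile(lat_ms, 99))  # tail: 90 ms, invisible in the mean
```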

    Architecting Efficient Data Centers.

    Full text link
    Data center power consumption has become a key constraint in continuing to scale Internet services. As our society's reliance on "the Cloud" continues to grow, companies require an ever-increasing amount of computational capacity to support their customers. Massive warehouse-scale data centers have emerged, requiring 30 MW or more of total power capacity. Over the lifetime of a typical high-scale data center, power-related costs make up 50% of the total cost of ownership (TCO). Furthermore, the aggregate effect of data center power consumption across the country cannot be ignored. In total, data center energy usage has reached approximately 2% of aggregate consumption in the United States and continues to grow. This thesis addresses the need to increase computational efficiency in the face of this growing problem. It proposes a new class of power management techniques: coordinated full-system idle low-power modes that increase the energy proportionality of modern servers. First, we introduce the PowerNap server architecture, a coordinated full-system idle low-power mode which transitions in and out of an ultra-low power nap state to save power during brief idle periods. While effective for uniprocessor systems, PowerNap relies on full-system idleness, and we show that such idleness disappears as the number of cores per processor continues to increase. We expose this problem in a case study of Google Web search in which we demonstrate that coordinated full-system active power modes are necessary to reach energy proportionality and that PowerNap is ineffective because of a lack of idleness. To recover full-system idleness, we introduce DreamWeaver, architectural support for deep sleep. DreamWeaver allows a server to exchange latency for full-system idleness, allowing PowerNap-enabled servers to be effective, and provides a better latency-power savings tradeoff than existing approaches. Finally, this thesis investigates workloads which achieve efficiency through methodical cluster provisioning techniques. Using the popular memcached workload, this thesis provides examples of provisioning clusters for cost-efficiency given latency, throughput, and data set size targets.
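    The reasoning behind a PowerNap-style nap decision can be made concrete with a break-even calculation: napping only pays off if the idle period outlasts the point where the saved power repays the transition overhead. The power and latency numbers below are invented for illustration, not measurements from the thesis:

```python
# Break-even analysis for a full-system nap (illustrative numbers only).

P_IDLE_W = 150.0    # assumed awake-but-idle system power
P_NAP_W = 10.0      # assumed nap-state power
P_TRANS_W = 250.0   # assumed power drawn while transitioning
T_TRANS_S = 0.001   # assumed total in+out transition time

def breakeven_seconds():
    # Nap wins when  P_idle * T  >  P_trans * T_t + P_nap * (T - T_t),
    # i.e. when the idle period T exceeds the break-even point below.
    return T_TRANS_S * (P_TRANS_W - P_NAP_W) / (P_IDLE_W - P_NAP_W)

def should_nap(predicted_idle_s):
    return predicted_idle_s > breakeven_seconds()

print(breakeven_seconds())      # ~1.7 ms with these numbers
print(should_nap(0.010))        # a 10 ms idle period is worth a nap
```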

    User experience driven CPU frequency scaling on mobile devices towards better energy efficiency

    Get PDF
    With the development of modern smartphones, mobile devices have become ubiquitous in our daily lives. With high processing capabilities and a vast number of applications, users now need them for both business and personal tasks. Unfortunately, battery technology did not scale at the same speed as computational power; modern smartphone batteries often last for less than a day before they need to be recharged. One of the most power-hungry components is the central processing unit (CPU). Multiple techniques are applied to reduce CPU energy consumption, among them dynamic voltage and frequency scaling (DVFS). This technique reduces energy consumption by dynamically changing the CPU supply voltage depending on the currently running workload. Reducing voltage, however, also makes it necessary to reduce the clock frequency, which can have a significant impact on task performance. Current DVFS algorithms deliver a good user experience; however, as experiments conducted later in this thesis show, they do not deliver optimal energy efficiency for an interactive mobile workload.
    This thesis presents methods and tools to determine where energy can be saved during mobile workload execution when using DVFS. Furthermore, an improved DVFS technique is developed that achieves higher energy efficiency than the current standard. One important question when developing a DVFS technique is: how much can a task be slowed down to save energy before the negative effect on performance becomes intolerable? The ultimate goal when optimising a mobile system is to provide a high quality of experience (QoE) to the end user. In that context, task slowdowns become intolerable when they have a perceptible effect on QoE. Experiments conducted in this thesis answer this question by identifying workload periods in which performance changes are directly perceptible by the end user and periods where they are imperceptible, namely interaction lags and interaction idle periods. Interaction lags are the time it takes the system to process a user interaction and display a corresponding response. Idle periods are the periods between interactions where the user perceives the system as idle and ready for the next input. By knowing where those periods are and how they are affected by frequency changes, a more energy-efficient DVFS governor can be developed.
    This thesis begins by introducing a methodology that measures the duration of interaction lags as perceived by the user and uses them as an indicator to benchmark the quality of experience for a workload execution. A representative benchmark workload is generated comprising 190 minutes of interactions collected from real users. In conjunction with this QoE benchmark, a DVFS Oracle study is conducted. It is able to find a frequency profile for an interactive mobile workload which has the maximum energy savings achievable without a perceptible performance impact on the user. The developed Oracle performance profile achieves a QoE which is indistinguishable from always running at the fastest frequency while needing 45% less energy. Furthermore, this Oracle is used as a baseline to evaluate how well current mobile frequency governors perform. It shows that none of these governors perform particularly well and that up to 32% energy savings are possible.
    Equipped with a benchmark and an optimisation baseline, a user-perception-aware DVFS technique is developed in the second part of this thesis. Initially, a runtime heuristic is introduced which is able to detect interaction lags as the user would perceive them. Using this heuristic, a reinforcement-learning-driven governor is developed which is able to learn good frequency settings for interaction lag and idle periods based on sample observations. It consumes up to 22% less energy than current standard governors on mobile devices while maintaining a low impact on QoE.
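    A loose sketch of the kind of reinforcement-learning governor the abstract describes: learn a frequency per period type ("lag" vs. "idle") from a reward that charges for energy and heavily penalizes perceptible lag. The bandit-style update, reward shape, and frequency table are assumptions for illustration, not the thesis design:

```python
# Epsilon-greedy frequency selection per period type (illustrative sketch).

import random

FREQS = [0.6, 1.0, 1.4, 1.8]   # GHz, hypothetical OPP table

class RLGovernor:
    def __init__(self, epsilon=0.1, alpha=0.2):
        # One action value per (period type, frequency) pair.
        self.q = {(s, f): 0.0 for s in ("lag", "idle") for f in FREQS}
        self.epsilon, self.alpha = epsilon, alpha

    def choose(self, period_type):
        if random.random() < self.epsilon:                      # explore
            return random.choice(FREQS)
        return max(FREQS, key=lambda f: self.q[(period_type, f)])  # exploit

    def update(self, period_type, freq, energy_j, lag_perceptible):
        # Charge for energy always; heavily penalize perceptible lag.
        reward = -energy_j - (10.0 if lag_perceptible else 0.0)
        key = (period_type, freq)
        self.q[key] += self.alpha * (reward - self.q[key])
```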

    Survey of End-to-End Mobile Network Measurement Testbeds, Tools, and Services

    Full text link
    Mobile (cellular) networks enable innovation, but can also stifle it and lead to user frustration when network performance falls below expectations. As mobile networks become the predominant method of Internet access, the developer, research, network operator, and regulatory communities have taken an increased interest in measuring end-to-end mobile network performance to, among other goals, minimize negative impact on application responsiveness. In this survey we examine current approaches to end-to-end mobile network performance measurement, diagnosis, and application prototyping. We compare available tools and their shortcomings with respect to the needs of researchers, developers, regulators, and the public. We intend for this survey to provide a comprehensive view of currently active efforts and some auspicious directions for future work in mobile network measurement and mobile application performance evaluation.
    Comment: Submitted to IEEE Communications Surveys and Tutorials. arXiv does not format the URL references correctly. For a correctly formatted version of this paper go to http://www.cs.montana.edu/mwittie/publications/Goel14Survey.pd

    Maximizing heterogeneous processor performance under power constraints

    Get PDF