Search CORE

1,231 research outputs found

Recommended from our members

A RISC-V Vector Processor With Simultaneous-Switching Switched-Capacitor DC-DC Converters in 28 nm FDSOI

Author: Alon E
Asanović K
Avizienis R
Bailey S
Blagojević M
Chen PH
Chiu PF
Flatresse P
Jevtić R
Keller B
Kwak J
Le HP
Lee Y
Nikolić B
Puggelli A
Richards B
Sutardja N
Waterman A
Zimmer B
Publication venue: eScholarship, University of California
Publication date: 01/04/2016
Field of study

This work demonstrates a RISC-V vector microprocessor implemented in 28 nm FDSOI with fully integrated simultaneous-switching switched-capacitor DC-DC (SC DC-DC) converters and adaptive clocking that generates four on-chip voltages between 0.45 and 1 V using only 1.0 V core and 1.8 V IO voltage inputs. The converters achieve high efficiency at the system level by switching simultaneously to avoid charge-sharing losses and by using an adaptive clock to maximize performance for the resulting voltage ripple. Details about the implementation of the DC-DC switches, DC-DC controller, and adaptive clock are provided, and the sources of conversion loss are analyzed based on measured results. This system pushes the capabilities of dynamic voltage scaling by enabling fast transitions (20 ns), simple packaging (no off-chip passives), low area overhead (16%), high conversion efficiency (80%-86%), and high energy efficiency (26.2 DP GFLOPS/W) for mobile devices

eScholarship - University of California

Hierarchical Agent-based Adaptation for Self-Aware Embedded Computing Systems

Author: Guang Liang
Publication venue: Annales Universitatis Turkuensis A I 452
Publication date: 10/12/2012
Field of study

Siirretty Doriast

UTUPub

Recommended from our members

Active timing margin management to improve microprocessor power efficiency

Author: Zu Yazhou
Publication venue
Publication date: 11/04/2019
Field of study

Improving power/performance efficiency is critical for today’s micro- processors. From edge devices to datacenters, lower power or higher performance always produces better systems, measured by lower cost of ownership or longer battery time. This thesis studies improving microprocessor power/performance efficiency by optimizing the pipeline timing margin. In particular, this thesis focuses on improving the efficacy of Active Timing Margin, a young technology that dynamically adjusts the margin. Active timing margin trims down the pipeline timing margin with a control loop that adjusts voltage and frequency based on real-time chip environment monitoring. The key insight of this thesis is that in order to maximize active timing margin’s efficiency enhancement benefits, synergistic management from processor architecture design and system software scheduling are needed. To that end, this thesis covers the major consumers of pipeline timing margin, including temperature, voltage, and process variation. For temperature variation, the thesis proposes a table-lookup based active timing margin mechanism, and an associated temperature management scheme to minimize power consumption. For voltage variation, the thesis characterizes the limiting factors of adaptive clocking’s power saving and proposes application scheduling to maximize total system power reduction. For process variation, the thesis proposes core-level adaptive clocking reconfiguration to automatically expose inter-core variation and discusses workload scheduling and throttling management to control critical application performance. The author believes the optimization presented in this thesis can potentially benefit a variety of processor architectures as the conclusions are based on the solid measurement on state-of-the-art processors, and the research objective, active timing margin, already has wide applicability in the latest microprocessors by the time this thesis is written.Electrical and Computer Engineerin

Texas ScholarWorks

On Energy Efficient Computing Platforms

Author: Yin Alexander Wei
Publication venue: Turku Centre for Computer Science
Publication date: 28/08/2013
Field of study

In accordance with the Moore's law, the increasing number of on-chip integrated transistors has enabled modern computing platforms with not only higher processing power but also more affordable prices. As a result, these platforms, including portable devices, work stations and data centres, are becoming an inevitable part of the human society. However, with the demand for portability and raising cost of power, energy efficiency has emerged to be a major concern for modern computing platforms. As the complexity of on-chip systems increases, Network-on-Chip (NoC) has been proved as an efficient communication architecture which can further improve system performances and scalability while reducing the design cost. Therefore, in this thesis, we study and propose energy optimization approaches based on NoC architecture, with special focuses on the following aspects. As the architectural trend of future computing platforms, 3D systems have many bene ts including higher integration density, smaller footprint, heterogeneous integration, etc. Moreover, 3D technology can signi cantly improve the network communication and effectively avoid long wirings, and therefore, provide higher system performance and energy efficiency. With the dynamic nature of on-chip communication in large scale NoC based systems, run-time system optimization is of crucial importance in order to achieve higher system reliability and essentially energy efficiency. In this thesis, we propose an agent based system design approach where agents are on-chip components which monitor and control system parameters such as supply voltage, operating frequency, etc. With this approach, we have analysed the implementation alternatives for dynamic voltage and frequency scaling and power gating techniques at different granularity, which reduce both dynamic and leakage energy consumption. Topologies, being one of the key factors for NoCs, are also explored for energy saving purpose. A Honeycomb NoC architecture is proposed in this thesis with turn-model based deadlock-free routing algorithms. Our analysis and simulation based evaluation show that Honeycomb NoCs outperform their Mesh based counterparts in terms of network cost, system performance as well as energy efficiency.Siirretty Doriast

UTUPub

Exceeding Conservative Limits: A Consolidated Analysis on Modern Hardware Margins

Author: Chatzidimitriou Athanasios
Gizopoulos Dimitris
Kestelman Adrian Cristal
Leng Jingwen
Papadimitriou George
Reddi Vijay Janapa
Salami Behzad
Unsal Osman S.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/06/2020
Field of study

Modern large-scale computing systems (data centers, supercomputers, cloud and edge setups and high-end cyber-physical systems) employ heterogeneous architectures that consist of multicore CPUs, general-purpose many-core GPUs, and programmable FPGAs. The effective utilization of these architectures poses several challenges, among which a primary one is power consumption. Voltage reduction is one of the most efficient methods to reduce power consumption of a chip. With the galloping adoption of hardware accelerators (i.e., GPUs and FPGAs) in large datacenters and other large-scale computing infrastructures, a comprehensive evaluation of the safe voltage reduction levels for each different chip can be employed for efficient reduction of the total power. We present a survey of recent studies in voltage margins reduction at the system level for modern CPUs, GPUs and FPGAs. The pessimistic voltage guardbands inserted by the silicon vendors can be exploited in all devices for significant power savings. On average, voltage reduction can reach 12% in multicore CPUs, 20% in manycore GPUs and 39% in FPGAs.Comment: Accepted for publication in IEEE Transactions on Device and Materials Reliabilit

arXiv.org e-Print Archive

UPCommons. Portal del coneixement obert de la UPC

Efficient and Scalable Computing for Resource-Constrained Cyber-Physical Systems: A Layered Approach

Author: Zou An
Publication venue: Washington University Open Scholarship
Publication date: 15/05/2021
Field of study

With the evolution of computing and communication technology, cyber-physical systems such as self-driving cars, unmanned aerial vehicles, and mobile cognitive robots are achieving increasing levels of multifunctionality and miniaturization, enabling them to execute versatile tasks in a resource-constrained environment. Therefore, the computing systems that power these resource-constrained cyber-physical systems (RCCPSs) have to achieve high efficiency and scalability. First of all, given a fixed amount of onboard energy, these computing systems should not only be power-efficient but also exhibit sufficiently high performance to gracefully handle complex algorithms for learning-based perception and AI-driven decision-making. Meanwhile, scalability requires that the current computing system and its components can be extended both horizontally, with more resources, and vertically, with emerging advanced technology. To achieve efficient and scalable computing systems in RCCPSs, my research broadly investigates a set of techniques and solutions via a bottom-up layered approach. This layered approach leverages the characteristics of each system layer (e.g., the circuit, architecture, and operating system layers) and their interactions to discover and explore the optimal system tradeoffs among performance, efficiency, and scalability. At the circuit layer, we investigate the benefits of novel power delivery and management schemes enabled by integrated voltage regulators (IVRs). Then, between the circuit and microarchitecture/architecture layers, we present a voltage-stacked power delivery system that offers best-in-class power delivery efficiency for many-core systems. After this, using Graphics Processing Units (GPUs) as a case study, we develop a real-time resource scheduling framework at the architecture and operating system layers for heterogeneous computing platforms with guaranteed task deadlines. Finally, fast dynamic voltage and frequency scaling (DVFS) based power management across the circuit, architecture, and operating system layers is studied through a learning-based hierarchical power management strategy for multi-/many-core systems

Washington University St. Louis: Open Scholarship

Toward Reliable, Secure, and Energy-Efficient Multi-Core System Design

Author: Basu Prabal
Publication venue: DigitalCommons@USU
Publication date: 01/08/2019
Field of study

Computer hardware researchers have perennially focussed on improving the performance of computers while stipulating the energy consumption under a strict budget. While several innovations over the years have led to high performance and energy efficient computers, more challenges have also emerged as a fallout. For example, smaller transistor devices in modern multi-core systems are afflicted with several reliability and security concerns, which were inconceivable even a decade ago. Tackling these bottlenecks happens to negatively impact the power and performance of the computers. This dissertation explores novel techniques to gracefully solve some of the pressing challenges of the modern computer design. Specifically, the proposed techniques improve the reliability of on-chip communication fabric under a high power supply noise, increase the energy-efficiency of low-power graphics processing units, and demonstrate an unprecedented security loophole of the low-power computing paradigm through rigorous hardware-based experiments

DigitalCommons@USU

GPU NTC Process Variation Compensation with Voltage Stacking

Author: Ardestani Ehsan
Briz Velasco Jose Luis
Ebrahimi Elnaz
Renau Jose
Sankaranarayanan Alamelu
Trapani Possignolo Rafael
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2018
Field of study

Near-threshold computing (NTC) has the potential to significantly improve efficiency in high throughput architectures, such as general-purpose computing on graphic processing unit (GPGPU). Nevertheless, NTC is more sensitive to process variation (PV) as it complicates power delivery. We propose GPU stacking, a novel method based on voltage stacking, to manage the effects of PV and improve the power delivery simultaneously. To evaluate our methodology, we first explore the design space of GPGPUs in the NTC to find a suitable baseline configuration and then apply GPU stacking to mitigate the effects of PV. When comparing with an equivalent NTC GPGPU without PV management, we achieve 37% more performance on average. When considering high production volume, our approach shifts all the chips closer to the nominal non-PV case, delivering on average (across chips) ˜80 % of the performance of nominal NTC GPGPU, whereas when not using our technique, chips would have ˜50 % of the nominal performance. We also show that our approach can be applied on top of multifrequency domain designs, improving the overall performance

Repositorio Universidad de Zaragoza

Distributed IC Power Delivery: Stability-Constrained Design Optimization and Workload-Aware Power Management

Author: Zhan Xin
Publication venue
Publication date: 15/10/2019
Field of study

ABSTRACT Power delivery presents key design challenges in today’s systems ranging from high performance micro-processors to mobile systems-on-a-chips (SoCs). A robust power delivery system is essential to ensure reliable operation of on-die devices. Nowadays it has become an important design trend to place multiple voltage regulators on-chip in a distributive manner to cope with power supply noise. However, stability concern arises because of the complex interactions be-tween multiple voltage regulators and bulky network of the surrounding passive parasitics. The recently developed hybrid stability theorem (HST) is promising to deal with the stability of such system by efficiently capturing the effects of all interactions, however, large overdesign and hence severe performance degradation are caused by the intrinsic conservativeness of the underlying HST framework. To address such challenge, this dissertation first extends the HST by proposing a frequency-dependent system partitioning technique to substantially reduce the pessimism in stability evaluation. By systematically exploring the theoretical foundation of the HST framework, we recognize all the critical constraints under which the partitioning technique can be performed rigorously to remove conservativeness while maintaining key theoretical properties of the partitioned subsystems. Based on that, we develop an efficient stability-ensuring automatic design flow for large power delivery systems with distributed on-chip regulation. In use of the proposed approach, we further discover new design insights for circuit designers such as how regulator topology, on-chip decoupling capacitance, and the number of integrated voltage regulators can be optimized for improved system tradeoffs between stability and performances. Besides stability, power efficiency must be improved in every possible way while maintaining high power quality. It can be argued that the ultimate power integrity and efficiency may be best achieved via a heterogeneous chain of voltage processing starting from on-board switching voltage regulators (VRs), to on-chip switching VRs, and finally to networks of distributed on-chip linear VRs. As such, we propose a heterogeneous voltage regulation (HVR) architecture encompassing regulators with complimentary characteristics in response time, size, and efficiency. By exploring the rich heterogeneity and tunability in HVR, we develop systematic workload-aware control policies to adapt heterogeneous VRs with respect to workload change at multiple temporal scales to significantly improve system power efficiency while providing a guarantee for power integrity. The proposed techniques are further supported by hardware-accelerated machine learning prediction of non-uniform spatial workload distributions for more accurate HVR adaptation at fine time granularity. Our evaluations based on the PARSEC benchmark suite show that the proposed adaptive 3-stage HVR reduces the total system energy dissipation by up to 23.9% and 15.7% on average compared with the conventional static two-stage voltage regulation using off- and on-chip switching VRs. Compared with the 3-stage static HVR, our runtime control reduces system energy by up to 17.9% and 12.2% on average. Furthermore, the proposed machine learning prediction offers up to 4.1% reduction of system energy

Texas A&M Repository

Energy Saving and Scavenging in Stand-alone and Large Scale Distributed Systems.

Author: He Xuejing
Publication venue
Publication date
Field of study

This thesis focuses on energy management techniques for distributed systems such as hand-held mobile devices, sensor nodes, and data center servers. One of the major design problems in multiple application domains is the mismatch between workloads and resources. Sub-optimal assignment of workloads to resources can cause underloaded or overloaded resources, resulting in performance degradation or energy waste. This work specifically focuses on the heterogeneity in system hardware components and workloads. It includes energy management solutions for unregulated or batteryless embedded systems; and data center servers with heterogeneous workloads, machines, and processor wear states. This thesis describes four major contributions: (1) This thesis describes a battery test and energy delivery system design process to maintain battery life in embedded systems without voltage regulators. (2) In battery-less sensor nodes, this thesis demonstrates a routing protocol to maintain reliable transmission through the sensor network. (3) This thesis has characterized typical workloads and developed two models to capture the heterogeneity of data center tasks and machines: a task performance model and a machine resource utilization model. These models allow users to predict task finish time on individual machines. It then integrates these two models into a task scheduler based on the Hadoop framework for MapReduce tasks, and uses this scheduler for server energy minimization using task concentration. (4) In addition to saving server energy consumption, this thesis describes a method of reducing data center cooling energy by maintaining optimal server processor temperature setpoints through a task assignment algorithm. This algorithm considers the reliability impact of processor wear states. It records processor wear states through automatic timing slack tests on a cluster of machines with varying core temperatures, voltages, and frequencies. These optimal temperature setpoints are used in a task scheduling algorithm that saves both server and cooling energy.PhDElectrical EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/116746/1/xjhe_1.pd

Deep Blue Documents at the University of Michigan