
    An Experimental Study of Reduced-Voltage Operation in Modern FPGAs for Neural Network Acceleration

    We empirically evaluate an undervolting technique, i.e., underscaling the circuit supply voltage below the nominal level, to improve the power-efficiency of Convolutional Neural Network (CNN) accelerators mapped to Field Programmable Gate Arrays (FPGAs). Undervolting below a safe voltage level can lead to timing faults due to an excessive increase in circuit latency. We evaluate the reliability-power trade-off for such accelerators. Specifically, we experimentally study the reduced-voltage operation of multiple components of real FPGAs, characterize the corresponding reliability behavior of CNN accelerators, propose techniques to minimize the drawbacks of reduced-voltage operation, and combine undervolting with architectural CNN optimization techniques, i.e., quantization and pruning. We investigate the effect of environmental temperature on the reliability-power trade-off of such accelerators. We perform experiments on three identical samples of modern Xilinx ZCU102 FPGA platforms with five state-of-the-art image classification CNN benchmarks, which allows us to study the effects of our undervolting technique across both software and hardware variability. We achieve more than 3X power-efficiency (GOPs/W) gain via undervolting. 2.6X of this gain is the result of eliminating the voltage guardband region, i.e., the safe voltage region below the nominal level that is set by the FPGA vendor to ensure correct functionality in worst-case environmental and circuit conditions. 43% of the power-efficiency gain is due to further undervolting below the guardband, which comes at the cost of accuracy loss in the CNN accelerator. We evaluate an effective frequency underscaling technique that prevents this accuracy loss, and find that it reduces the power-efficiency gain from 43% to 25%.
    Comment: To appear at the DSN 2020 conference.
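    To make the reported breakdown concrete, the following minimal sketch recomputes the overall gain from the figures in the abstract; the baseline GOPs/W value is a hypothetical placeholder, and only the relative factors (2.6X, +43%, +25%) come from the text.

```python
# A minimal sketch of the power-efficiency breakdown reported above.
# The baseline GOPs/W figure is hypothetical; only the relative gains
# (2.6X from guardband elimination, +43% below the guardband, reduced
# to +25% with frequency underscaling) come from the abstract.

baseline_gops_per_watt = 10.0   # hypothetical nominal-voltage efficiency

guardband_gain = 2.6            # gain from removing the vendor guardband
below_guardband_gain = 1.43     # further undervolting (costs CNN accuracy)
underscaled_gain = 1.25         # with frequency underscaling (no accuracy loss)

at_guardband = baseline_gops_per_watt * guardband_gain
aggressive = at_guardband * below_guardband_gain   # ~3.7X overall, "more than 3X"
safe = at_guardband * underscaled_gain             # ~3.25X overall, no accuracy loss

print(f"guardband edge: {at_guardband:.1f} GOPs/W ({guardband_gain:.2f}X)")
print(f"aggressive undervolting: {aggressive:.1f} GOPs/W "
      f"({aggressive / baseline_gops_per_watt:.2f}X)")
print(f"with frequency underscaling: {safe:.1f} GOPs/W "
      f"({safe / baseline_gops_per_watt:.2f}X)")
```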

    Approximation Algorithms for Energy Minimization in Cloud Service Allocation under Reliability Constraints

    We consider allocation problems that arise in the context of service allocation in Clouds. More specifically, we assume, on the one hand, that each computing resource is associated with a capacity constraint, which can be chosen using the Dynamic Voltage and Frequency Scaling (DVFS) method, and with a probability of failure. On the other hand, we assume that the service runs as a set of independent instances of identical Virtual Machines. Moreover, there exists a Service Level Agreement (SLA) between the Cloud provider and the client that can be expressed as follows: the client comes with a minimal number of service instances that must be alive at the end of the day, and the Cloud provider offers a list of pairs (price, compensation), the compensation being paid by the Cloud provider if it fails to keep the required number of services alive. On the Cloud provider's side, each pair actually corresponds to a guaranteed success probability of fulfilling the constraint on the minimal number of instances. In this context, given a minimal number of instances and a probability of success, the question for the Cloud provider is to determine the number of necessary resources, their clock frequency, and an allocation of the instances (possibly using replication) onto machines. This solution should satisfy all types of constraints during a given time period while minimizing the energy consumption of the used resources. We consider two energy consumption models based on DVFS techniques, where the clock frequency of physical resources can be changed. For each allocation problem and each energy model, we prove deterministic approximation ratios on the consumed energy for algorithms that provide guaranteed failure probabilities, as well as an efficient heuristic whose energy ratio is not guaranteed.
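    As an illustration of the kind of question the Cloud provider faces, the sketch below picks the smallest replica count that meets a target success probability and the slowest feasible frequency under a cubic dynamic power model. All function names and parameter values here are hypothetical, and the model (identical machines, one instance per machine, independent failures, power proportional to f^3) is a simplifying assumption rather than the paper's exact formulation.

```python
# A minimal sketch of the allocation question, under simplifying assumptions
# not taken from the paper: identical machines, one VM instance per machine,
# independent failures with probability p, and dynamic power ~ f^3.

from math import comb

def survival_prob(n, p, k):
    """Probability that at least k of n instances survive the day,
    with independent per-machine failure probability p."""
    return sum(comb(n, i) * (1 - p) ** i * p ** (n - i) for i in range(k, n + 1))

def find_allocation(k, p, success_target, frequencies, work):
    """Smallest replica count meeting the SLA target, then the cheapest
    frequency finishing `work` within a unit period: energy ~ n * f^3 * t
    with t = work / f, i.e. energy ~ n * f^2 * work."""
    n = k
    while survival_prob(n, p, k) < success_target:
        n += 1
    feasible = [f for f in frequencies if work / f <= 1.0]
    f = min(feasible)          # slowest feasible frequency minimizes f^2
    return n, f, n * f ** 2 * work

n, f, energy = find_allocation(k=10, p=0.05, success_target=0.999,
                               frequencies=[0.6, 0.8, 1.0], work=0.5)
print(f"replicas={n}, frequency={f}, relative energy={energy:.2f}")
```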

    Using MCD-DVS for dynamic thermal management performance improvement

    With chip temperature being a major hurdle in microprocessor design, techniques to recover the performance lost to thermal emergency mechanisms are crucial in order to sustain performance growth. Many techniques for power reduction in the past, and some on thermal management more recently, have helped alleviate this problem. Probably the most important thermal control technique is dynamic voltage and frequency scaling (DVS), which allows for an almost cubic reduction in power with only a linear worst-case performance penalty. So far, DVS techniques for temperature control have been studied at the chip level. Finer-grain DVS is feasible if a globally-asynchronous locally-synchronous (GALS) design style is employed. GALS, also known as multiple-clock domain (MCD), allows for independent voltage and frequency control of each of the clock domains that make up the chip. There are several studies on DVS for GALS that aim to improve energy and power efficiency, but not temperature. This paper proposes and analyses the use of DVS at the domain level to control temperature in a clustered MCD microarchitecture, with the goal of improving the performance of applications that do not meet the thermal constraints imposed by the designers.
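    The sketch below illustrates the idea of domain-level DVS for temperature, with each clock domain throttled independently when it crosses a thermal limit; the first-order thermal model, the constants, and the control policy are illustrative assumptions, not the mechanism proposed in the paper.

```python
# A toy model of per-domain DVS thermal control in an MCD/GALS design.
# Power scales roughly with f^3 (voltage lowered together with frequency),
# and each clock domain is throttled independently of the others.

T_LIMIT = 85.0                  # thermal emergency threshold (C), hypothetical
T_AMBIENT = 45.0
LEVELS = [1.0, 0.9, 0.8, 0.7]   # normalized frequency/voltage operating points

def step_temperature(temp, freq, activity, alpha=0.9, beta=40.0):
    """First-order thermal model: temperature relaxes toward a
    steady state set by dissipated power (~ activity * f^3)."""
    power = activity * freq ** 3
    return alpha * temp + (1 - alpha) * (T_AMBIENT + beta * power)

def control(domains):
    """Per-domain policy: throttle a hot domain one level, speed up a cool one."""
    for d in domains:
        idx = LEVELS.index(d["freq"])
        if d["temp"] > T_LIMIT and idx < len(LEVELS) - 1:
            d["freq"] = LEVELS[idx + 1]     # local slowdown only
        elif d["temp"] < T_LIMIT - 5.0 and idx > 0:
            d["freq"] = LEVELS[idx - 1]     # recover performance when cool

domains = [{"name": "int-cluster", "temp": 60.0, "freq": 1.0, "act": 1.2},
           {"name": "fp-cluster", "temp": 60.0, "freq": 1.0, "act": 0.6}]
for _ in range(50):
    for d in domains:
        d["temp"] = step_temperature(d["temp"], d["freq"], d["act"])
    control(domains)
for d in domains:
    print(f'{d["name"]}: f={d["freq"]:.1f}, T={d["temp"]:.1f}C')
```

    Note the payoff of the finer grain: in this run only the hot integer cluster is slowed down, while the cooler floating-point cluster keeps running at full frequency, which chip-level DVS cannot do.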

    Fault Tolerance and Energy Efficiency in Real-Time Systems with Temporal Redundancy

    The concept of real-time systems (RTSs) has been present in computer science for decades. During that period, RTSs have evolved from special-purpose microcomputer systems for industrial applications to various forms of embedded systems that are deeply ingrained in wide segments of daily life. The new application domains pose new design requirements and goals for RTSs, which are now often required to provide both fault tolerance and energy efficiency in addition to their main objective of computing and delivering correct results within a specified period of time. There is a fundamental tradeoff between these two additional requirements, because fault tolerance techniques use slack time to improve reliability, while low-energy techniques exploit slack time to increase energy efficiency. The central problem considered in the dissertation is how to optimally distribute the slack time between these techniques. Dynamic voltage scaling (DVS) is known as one of the most effective low-energy techniques for RTSs. However, most existing DVS techniques focus only on minimizing energy consumption without taking the fault-tolerant capability of RTSs into account. To solve this problem, the dissertation develops a new heuristic-based fault-tolerant dynamic voltage and frequency scaling (FT-DVFS) algorithm. The goal of the proposed algorithm is to minimize the amount of energy consumed by a real-time system under fault tolerance constraints while guaranteeing that all real-time tasks complete successfully before their deadlines. Essentially, FT-DVFS is a DVS algorithm with integrated response time analysis (RTA) that checks both the schedulability and the fault tolerance constraints of real-time task sets. The performance of the FT-DVFS algorithm is evaluated by simulation in a custom-built simulator. The simulation results are analyzed from three different points of view: schedulability, energy consumption, and fault tolerance. They show that the proposed algorithm saves a significant amount of energy even with only two frequency/voltage levels, and the savings further increase with the number of frequency levels. The simulations also show that the reduction in power consumption achievable with the FT-DVFS algorithm decreases as the processor utilization factor increases (i.e., as processor spare time shrinks). From the fault tolerance point of view, the results show that a higher level of fault tolerance can only be attained by sacrificing a part of the savings in power consumption, and vice versa. The proposed heuristic FT-DVFS algorithm is compared with the optimal DVS algorithm; the analysis shows that FT-DVFS achieves near-optimal solutions in very short computation time, even for large task sets.
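    The schedulability core of such an approach can be sketched as fixed-priority response time analysis extended with a re-execution term for fault recovery, evaluated at discrete frequency levels (WCETs scale as C/f). The task set, the fault inter-arrival time, and the frequency-selection rule below are illustrative assumptions, not the dissertation's exact FT-DVFS heuristic.

```python
# A minimal sketch of RTA-based schedulability checking under faults and DVFS.
# Tasks are listed highest priority first; each fault that can occur within a
# response window forces a re-execution of the largest affected WCET.

import math

TASKS = [(1.0, 10.0), (2.0, 20.0), (3.0, 50.0)]   # (WCET at f=1.0, period=deadline)
T_FAULT = 25.0                                    # min. time between faults (assumed)
LEVELS = [0.5, 0.75, 1.0]                         # normalized frequency levels

def response_time(i, tasks, t_fault):
    """Iterate R = C_i + interference + recovery until convergence,
    or report a deadline miss by returning None."""
    c_i, d_i = tasks[i]
    r = c_i
    while True:
        interference = sum(math.ceil(r / t_j) * c_j for c_j, t_j in tasks[:i])
        recovery = math.ceil(r / t_fault) * max(c for c, _ in tasks[:i + 1])
        r_next = c_i + interference + recovery
        if r_next > d_i:
            return None          # unschedulable at this frequency
        if r_next == r:
            return r
        r = r_next

def lowest_feasible_frequency():
    """Pick the slowest level at which every task still meets its deadline."""
    for f in LEVELS:                               # try slowest level first
        scaled = [(c / f, t) for c, t in TASKS]    # WCETs grow as frequency drops
        if all(response_time(i, scaled, T_FAULT) is not None
               for i in range(len(scaled))):
            return f
    return None

print("lowest schedulable frequency:", lowest_feasible_frequency())
```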

    Statistical Power Supply Dynamic Noise Prediction in Hierarchical Power Grid and Package Networks

    One of the most crucial design challenges for high-performance systems-on-chip is coping with power supply noise caused by high frequencies, the huge number of functional blocks, and technology scaling. In contrast to traditional post-physical-design static voltage drop analysis, the focus of this work is a priori dynamic voltage drop evaluation. It takes into account transient currents and on-chip and package RLC parasitics while exploring the power grid design solution space: design countermeasures can thus be defined early, and long post-physical-design verification cycles can be shortened. As shown by an extensive set of results, a carefully extracted and modular grid library ensures a realistic evaluation of the impact of parasitics on noise and facilitates power network construction; furthermore, statistical analysis guarantees a correct current envelope evaluation, and Spice simulations confirm reliable results.
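    The statistical flavor of the approach can be sketched with a small Monte Carlo experiment: sample random switching-current waveforms, push them through a lumped R + L supply model, and take a high percentile of the peak drop as the noise envelope. The grid model, waveform statistics, and parasitic values below are illustrative assumptions, not the paper's extracted grid library.

```python
# A toy statistical envelope estimate for dynamic supply noise.
# Each trial draws a random block-activity pattern and evaluates
# v = R*i + L*di/dt on a lumped supply model; the envelope is a high
# percentile of the peak drop across trials. All values are illustrative.

import random

R, L = 0.05, 5e-12            # lumped supply resistance (ohm) and inductance (H)
DT = 0.1e-9                   # time step (s)
N_BLOCKS, N_STEPS, N_TRIALS = 8, 200, 2000

def trial():
    """Peak drop for one random activity pattern."""
    i_prev, peak = 0.0, 0.0
    for _ in range(N_STEPS):
        # each block draws a random current burst (data-dependent switching)
        i_now = sum(random.uniform(0.0, 0.3) for _ in range(N_BLOCKS))
        v = R * i_now + L * (i_now - i_prev) / DT
        peak = max(peak, abs(v))
        i_prev = i_now
    return peak

peaks = sorted(trial() for _ in range(N_TRIALS))
p999 = peaks[int(0.999 * N_TRIALS) - 1]
print(f"median peak drop: {peaks[N_TRIALS // 2] * 1000:.1f} mV")
print(f"99.9th-percentile envelope: {p999 * 1000:.1f} mV")
```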

    Development of a superconductor magnetic suspension and balance prototype facility for studying the feasibility of applying this technique to large scale aerodynamic testing

    The basic research and development work towards proving the feasibility of operating an all-superconductor magnetic suspension and balance device for aerodynamic testing is presented. The feasibility of applying a quasi-six-degree-of-freedom free support technique to dynamic stability research was studied, along with the design concepts and parameters for applying magnetic suspension techniques to large-scale aerodynamic facilities. A prototype aerodynamic test facility was implemented. Relevant aspects of the development of the prototype facility are described in three sections: (1) design characteristics; (2) operational characteristics; and (3) scaling to larger facilities.