An Experimental Study of Reduced-Voltage Operation in Modern FPGAs for Neural Network Acceleration
We empirically evaluate an undervolting technique, i.e., underscaling the
circuit supply voltage below the nominal level, to improve the power-efficiency
of Convolutional Neural Network (CNN) accelerators mapped to Field Programmable
Gate Arrays (FPGAs). Undervolting below a safe voltage level can lead to timing
faults due to excessive circuit latency increase. We evaluate the
reliability-power trade-off for such accelerators. Specifically, we
experimentally study the reduced-voltage operation of multiple components of
real FPGAs, characterize the corresponding reliability behavior of CNN
accelerators, propose techniques to minimize the drawbacks of reduced-voltage
operation, and combine undervolting with architectural CNN optimization
techniques, i.e., quantization and pruning. We investigate the effect of
environmental temperature on the reliability-power trade-off of such
accelerators. We perform experiments on three identical samples of modern
Xilinx ZCU102 FPGA platforms with five state-of-the-art image classification
CNN benchmarks. This approach allows us to study the effects of our
undervolting technique for both software and hardware variability. We achieve
more than 3X power-efficiency (GOPs/W) gain via undervolting. 2.6X of this gain
is the result of eliminating the voltage guardband region, i.e., the safe
voltage region below the nominal level that is set by the FPGA vendor to ensure
correct functionality in worst-case environmental and circuit conditions. 43%
of the power-efficiency gain is due to further undervolting below the
guardband, which comes at the cost of accuracy loss in the CNN accelerator. We
evaluate an effective frequency underscaling technique that prevents this
accuracy loss, and find that it reduces the power-efficiency gain from 43% to
25%. Comment: To appear at the DSN 2020 conference.
Approximation Algorithms for Energy Minimization in Cloud Service Allocation under Reliability Constraints
We consider allocation problems that arise in the context of service
allocation in Clouds. More specifically, we assume on the one hand that each
computing resource is associated with a capacity constraint, which can be chosen
using the Dynamic Voltage and Frequency Scaling (DVFS) method, and with a probability
of failure. On the other hand, we assume that the service runs as a set of
independent instances of identical Virtual Machines. Moreover, there exists a
Service Level Agreement (SLA) between the Cloud provider and the client that
can be expressed as follows: the client comes with a minimal number of service
instances which must be alive at the end of the day, and the Cloud provider
offers a list of pairs (price,compensation), this compensation being paid by
the Cloud provider if it fails to keep alive the required number of services.
On the Cloud provider side, each pair corresponds actually to a guaranteed
success probability of fulfilling the constraint on the minimal number of
instances. In this context, given a minimal number of instances and a
probability of success, the question for the Cloud provider is to find the
number of necessary resources, their clock frequency and an allocation of the
instances (possibly using replication) onto machines. This solution should
satisfy all types of constraints during a given time period while minimizing
the energy consumption of used resources. We consider two energy consumption
models based on DVFS techniques, where the clock frequency of physical
resources can be changed. For each allocation problem and each energy model, we
prove deterministic approximation ratios on the consumed energy for algorithms
that provide guaranteed failure probabilities, as well as an efficient
heuristic whose energy ratio is not guaranteed.
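The reliability constraint described in this abstract can be illustrated with a small calculation. The sketch below is not the paper's algorithm: it assumes, for illustration only, that every instance runs on its own machine and that all machines fail independently with the same probability. The function names `survival_probability` and `min_instances` are hypothetical.

```python
from math import comb


def survival_probability(n: int, k: int, p_fail: float) -> float:
    """Probability that at least k of n independent instances are still alive.

    Assumption (not from the abstract): one instance per machine, and every
    machine fails independently with the same probability p_fail.
    """
    p_ok = 1.0 - p_fail
    return sum(comb(n, i) * p_ok**i * p_fail ** (n - i) for i in range(k, n + 1))


def min_instances(k: int, p_fail: float, target: float, n_max: int = 10_000) -> int:
    """Smallest number of replicated instances that keeps at least k alive
    with probability >= target, found by linear search up to n_max."""
    for n in range(k, n_max + 1):
        if survival_probability(n, k, p_fail) >= target:
            return n
    raise ValueError("target success probability not reachable within n_max")
```

Under these assumptions, a higher guaranteed success probability in the SLA translates directly into more replicas, and therefore into more energy, which is the quantity the abstract's algorithms try to minimize.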
Using MCD-DVS for dynamic thermal management performance improvement
With chip temperature being a major hurdle in microprocessor design, techniques to recover the performance lost to thermal emergency mechanisms are crucial in order to sustain performance growth. Many past techniques for power reduction, and some more recent ones for thermal management, have helped alleviate this problem. Probably the most important thermal control technique is dynamic voltage and frequency scaling (DVS), which allows an almost cubic reduction in power with a worst-case performance penalty that is only linear. So far, DVS techniques for temperature control have been studied at the chip level. Finer-grain DVS is feasible if a globally-asynchronous locally-synchronous (GALS) design style is employed. GALS, also known as multiple-clock domain (MCD), allows independent voltage and frequency control for each of the clock domains that are part of the chip. There are several studies on DVS for GALS that aim to improve energy and power efficiency but not temperature. This paper proposes and analyses the use of DVS at the domain level to control temperature in a clustered MCD microarchitecture, with the goal of improving the performance of applications that do not meet the thermal constraints imposed by the designers. Peer Reviewed. Postprint (published version).
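The "almost cubic" power reduction mentioned above follows from the usual dynamic power relation, in which power is proportional to capacitance times voltage squared times frequency. The following is a minimal sketch under two idealized assumptions that are not stated in the abstract: supply voltage scales proportionally with frequency, and static (leakage) power is ignored. The function name `dvs_scaling` is hypothetical.

```python
def dvs_scaling(s: float) -> dict:
    """Relative power, runtime and energy when frequency is scaled by s.

    Assumptions (idealized): voltage scales proportionally with frequency,
    leakage power is ignored, and the workload is fully compute-bound, so
    runtime grows as 1/s (the worst-case, linear performance penalty).
    """
    if not 0.0 < s <= 1.0:
        raise ValueError("scaling factor must be in (0, 1]")
    power = s**3            # P ~ C * V^2 * f with V ~ f
    runtime = 1.0 / s       # linear slowdown for compute-bound code
    energy = power * runtime  # equals s^2
    return {"power": power, "runtime": runtime, "energy": energy}
```

For example, `dvs_scaling(0.8)` reports roughly half the power for a 25% longer runtime. Real processors deviate from this because voltage cannot be lowered indefinitely and leakage does not scale the same way.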
Fault Tolerance and Energy Efficiency in Real-Time Systems with Time Redundancy (Tolerisanje grešaka i energetska efikasnost kod sistema za rad u realnom vremenu sa vremenskom redundansom)
The concept of real-time systems (RTSs) has been present in computer science for
decades. During that period, RTSs have evolved from special-purpose microcomputer
systems for industrial applications to various forms of embedded systems that are deeply
ingrained in wide segments of daily life. The new application domains pose new design
requirements and goals to RTSs, which are now often required to provide both fault tolerance
and energy efficiency in addition to their main objective to compute and deliver correct
results within a specified period of time. There is a fundamental tradeoff between these two
additional requirements because fault tolerance techniques use slack time to improve
reliability while low-energy techniques exploit slack time to increase energy
efficiency. The central problem considered in the dissertation is how to optimally distribute
the slack time between these techniques.
Dynamic voltage scaling (DVS) is known as one of the most effective low-energy
techniques for RTSs. However, most existing DVS techniques focus only on minimizing
energy consumption without taking the fault-tolerance capability of RTSs into account. To
address this problem, the dissertation develops a new heuristic-based fault-tolerant
dynamic voltage and frequency scaling (FT-DVFS) algorithm. The goal of the
proposed algorithm is to minimize the amount of energy consumed by a real-time system
under fault tolerance constraints while guaranteeing that all real-time tasks can complete
successfully before their deadlines. Basically, the FT-DVFS is a DVS algorithm with
integrated response time analysis (RTA) to check both the schedulability and the fault
tolerance constraints of real-time task sets. The performance of the FT-DVFS algorithm is
evaluated by simulation in a custom-built simulator. The simulation results are analyzed from
three different points of view: the schedulability, the energy consumption, and the fault
tolerance. The simulation results show that the proposed algorithm saves a significant amount
of energy even with only two frequency/voltage levels, and the savings increase further as
the number of frequency levels grows. The simulations also show that the reduction
in power consumption achievable with the FT-DVFS algorithm decreases as
the processor utilization factor increases (i.e., as processor spare time shrinks). The simulation results
from the fault-tolerance point of view show that a higher level of fault tolerance can only be
attained by sacrificing part of the savings in power consumption, and vice versa. The
proposed heuristic FT-DVFS algorithm is compared with the optimal DVS algorithm. The
simulation analysis shows that the FT-DVFS algorithm achieves near-optimal solutions in very
short computation time, even for large task sets.
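The idea of combining DVS with response time analysis can be sketched as follows. This is not the dissertation's FT-DVFS algorithm, whose details the abstract does not give. It is a generic illustration that assumes fixed-priority scheduling, deadlines equal to periods, recovery by re-execution, at most one fault per `fault_interval`, and execution time scaling inversely with processor speed. All names are hypothetical.

```python
from math import ceil


def response_time(tasks, i, speed, fault_interval):
    """Worst-case response time of task i, or None if it misses its deadline.

    tasks: list of (wcet_at_full_speed, period), highest priority first.
    Assumptions: deadline == period, a fault is recovered by re-executing the
    longest task of equal or higher priority, and faults are at least
    fault_interval apart.
    """
    c = [wcet / speed for wcet, _ in tasks]   # slower clock, longer execution
    deadline = tasks[i][1]
    recovery = max(c[: i + 1])                # cost of one re-execution
    r = c[i]
    while True:
        interference = sum(ceil(r / tasks[j][1]) * c[j] for j in range(i))
        fault_cost = ceil(r / fault_interval) * recovery
        r_next = c[i] + interference + fault_cost
        if r_next > deadline:
            return None
        if r_next == r:                       # fixed point reached
            return r
        r = r_next


def lowest_feasible_speed(tasks, speeds, fault_interval):
    """Lowest speed level at which every task still meets its deadline."""
    for s in sorted(speeds):
        if all(response_time(tasks, i, s, fault_interval) is not None
               for i in range(len(tasks))):
            return s
    return None
```

With `tasks = [(1, 10), (2, 20)]` and speed levels `[0.25, 0.5, 1.0]`, the task set can run at speed 0.25 when no faults need to be tolerated (`fault_interval = float("inf")`), but needs speed 0.5 when one fault per 100 time units must be absorbed. This mirrors the trade-off reported above: more fault tolerance leaves less slack for energy savings.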
Statistical Power Supply Dynamic Noise Prediction in Hierarchical Power Grid and Package Networks
One of the most crucial challenges in high-performance system-on-chip design is coping with power supply noise caused by high frequencies, the huge number of functional blocks, and technology scaling. In contrast to traditional post-physical-design static voltage drop analysis, this work focuses on a priori dynamic voltage drop evaluation. It takes into account transient currents and on-chip and package RLC parasitics while exploring the power grid design solution space: design countermeasures can thus be defined early, and long post-physical-design verification cycles can be shortened. As shown by an extensive set of results, a carefully extracted and modular grid library assures realistic evaluation of the impact of parasitics on noise and facilitates power network construction; furthermore, statistical analysis guarantees a correct current envelope evaluation, and SPICE simulations confirm that the results are reliable.
Development of a superconductor magnetic suspension and balance prototype facility for studying the feasibility of applying this technique to large scale aerodynamic testing
The basic research and development work towards proving the feasibility of operating an all-superconductor magnetic suspension and balance device for aerodynamic testing is presented. The feasibility of applying a quasi-six-degree-of-freedom free support technique to dynamic stability research was studied, along with the design concepts and parameters for applying magnetic suspension techniques to large-scale aerodynamic facilities. A prototype aerodynamic test facility was implemented. Relevant aspects of the development of the prototype facility are described in three sections: (1) design characteristics; (2) operational characteristics; and (3) scaling to larger facilities.