Abstract: Semiconductor scaling means that individual parts can no longer share the same supply voltage, and some chips even require multiple supply voltage levels. The different input and output voltage specifications of each device call for multiple supply voltage levels. Various devices such as displays, RF, USB, and SD cards increase the number of supply voltage levels. Moreover, analog devices often cannot share a power supply because of coupling noise. However, these components are commonly powered by a single power source such as a battery. Consequently, power converters such as on- and off-chip switching-mode DC-DC converters, low-dropout linear regulators, and charge pumps are heavily populated even on a single circuit board. The efficiency of the power converters is commonly assumed to be high enough and is often ignored during power management policy development. However, their actual conversion efficiency varies significantly according to device activity and power mode, which sometimes results in substantially lower efficiency than the value provided in datasheets. Moreover, hardware designers generally optimize the power converters for the maximum supply current of the device and even over-design them, while the actual device power consumption at runtime can be far from the energy-optimal operating point. This tutorial paper covers a wide range of topics on power converter-aware design and introduces several design practices: i) power converter basics and conversion efficiency, ii) power converter voltage transition overhead, iii) power converter-aware design of embedded systems, and iv) maximum energy transfer of energy harvesting devices.
Introduction
Modern digital systems are equipped with various digital and analog devices that require various supply voltage levels, as shown in Fig. 1. Various devices such as displays, RF, USB, and SD cards increase the number of supply voltages. Moreover, analog devices such as audio amplifiers do not tolerate power supply noise coupled from digital devices. It is not surprising to see a small circuit board equipped with a number of DC-DC converters and low-dropout regulators (LDOs). In practice, the conversion efficiency of the power converters significantly impacts the overall system efficiency.
The power conversion efficiency has often been overlooked in commercial products. The conversion efficiency of the power converters is not constant and, depending on operating conditions, can be significantly lower than the value specified in the datasheet. Circuit designers typically apply an overdesign factor to ensure reliable operation of the chip powered by a power converter. However, overdesign strongly affects the conversion efficiency, because the nominal operating point of the power converter ends up far from the best operating condition [1]. In addition, power management techniques move the operating point even further from the best operating condition. Careless power management may even harm the overall energy efficiency because of the power loss in the power converter. The power consumption of each component should not be optimized individually; it should be optimized with careful consideration of the power converter operating point and the resulting efficiency.
© 2013 Information Processing Society of Japan

In this paper, we introduce power converter efficiency as the major hurdle for improving overall system efficiency, show various aspects of power converters that previous design methodologies often overlook, and provide several practices of power converter-aware design methodologies. We begin with the basics of power converter operation and power conversion efficiency models. We emphasize various aspects of power converters that should be carefully considered during system design. For example, the overhead of a dynamic voltage and frequency scaling (DVFS) transition has been widely ignored or incorrectly evaluated in the existing literature. Incorrect evaluation of the converter effects could result in sub-optimal optimization, deadline violations in real-time systems, and thermal instability if DVFS is used to control temperature [2]. DVS scheduling techniques that take DC-DC converter efficiency into consideration can truly maximize the system energy efficiency. DC-DC converter efficiency is incorporated into the DVS scheduling problem to maximize the overall system efficiency including the DC-DC converter [3], [4]. We also look into the elimination of power converters to improve system energy efficiency. Sometimes, it is beneficial not to use power converters and to connect the power source directly to the system to minimize the power converter overhead [5]. The DC-DC converter efficiency becomes more critical in micro-scale energy harvesting systems [6], [7]. A DC-DC converter has a minimum required input voltage, so the system designer should also consider such practical requirements when designing a system [8]. EscaCap performs reconfiguration of the supercapacitor array to maximize DC-DC converter efficiency [9]. Nowadays, more systems are being equipped with renewable energy sources such as photovoltaic (PV) modules.
Maximum power point tracking (MPPT) algorithms are applied to PV modules to maximize their output power. However, merely maximizing the output power of the PV modules does not guarantee maximum energy efficiency at the system level. Jointly optimizing the conversion efficiency with the energy source is crucial [10]. This tutorial paper handles various topics on power converter-aware design, from converter basics and efficiency models to architectural techniques, system design methodologies, and renewable power conversion.
(i.e., buck) input voltage according to the converter topologies. Figure 3 (a) shows the basic structure of the buck-boost switching regulator. Depending on the relation between V in and V out, the buck-boost switching regulator has two working modes: buck (step-down) mode and boost (step-up) mode. As the names imply, it operates in the buck mode if V in > V out, and otherwise in the boost mode [12].
A switching regulator contains a circuit, located on the path between the external power supply and the energy-storage element, which controls two MOSFET switches. The switch control techniques most widely used in practical switching regulators are pulse-width modulation (PWM), which controls the turn-on duty ratio of each MOSFET with a fixed switching frequency, and pulse-frequency modulation (PFM), which controls the switching frequency by constraining the peak current flowing through the inductor, as shown in Fig. 3 (b) and (c). Each control technique has its own advantages and shortcomings. Switching regulators controlled by PWM generate less ripple in V out, their switching noise is easier to filter out, and they are more efficient under heavy loads, whereas switching regulators controlled by PFM exhibit higher efficiency under light loads [4].
Most commercial switching regulators use either PWM or hybrid PWM/PFM control. The hybrid technique inherits higher light-load efficiency from the PFM control technique, but PWM is preferred for applications that require low cost, or small size, and for noise-sensitive systems including analog circuits and wireless communication subsystems [4].
Power Conversion Efficiency
An ideal power converter delivers the entire power from the source to the load without any loss, but in practice power conversion involves a non-zero amount of power loss. The power conversion efficiency η conv is defined as

$$\eta_{conv} = \frac{P_{out}}{P_{in}} = \frac{P_{in} - P_{conv}}{P_{in}} = \frac{V_{in} \cdot I_{in} - P_{conv}}{V_{in} \cdot I_{in}} \qquad (1)$$

where P in, P out, and P conv are the input power, the output power, and the power loss during the power conversion, respectively, and I in is the input current [12].
Linear Regulator Efficiency
The power conversion efficiency of a linear regulator is primarily determined by the dropout voltage, as its I in and output current, I out, are almost the same. Recent linear regulators have a very small dropout voltage, around 300 mV, and their peak efficiency is high. In general, however, V out is fixed as part of the design specification, while V in varies within a particular range due to the IR drop across the power source. Therefore, a linear regulator does not achieve high average efficiency, even though its dropout is small. Linear regulators are more efficient when the power source's voltage is closer to the required V out [11]. The power loss in a linear regulator is computed as

$$P_{conv(linear)} = V_{in} \cdot I_{in} - V_{out} \cdot I_{out} \approx (V_{in} - V_{out}) \cdot I_{out}$$
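As a rough illustration of the dropout argument, the following Python sketch estimates linear regulator loss and efficiency under the stated approximation that the input and output currents are almost equal. The function names and the optional quiescent current `i_q` are assumptions introduced here for illustration; they are not taken from the paper.

```python
def linear_regulator_loss(v_in, v_out, i_out, i_q=0.0):
    """Approximate power lost in a linear regulator, in watts.

    Assumes I_in = I_out + i_q, where i_q is a hypothetical quiescent
    current (0 by default, matching the I_in ~= I_out approximation).
    """
    if v_in < v_out:
        raise ValueError("a linear regulator cannot step the voltage up")
    p_in = v_in * (i_out + i_q)
    p_out = v_out * i_out
    return p_in - p_out


def linear_regulator_efficiency(v_in, v_out, i_out, i_q=0.0):
    """Efficiency P_out / P_in; tends to V_out / V_in when i_q is negligible."""
    p_in = v_in * (i_out + i_q)
    return (v_out * i_out) / p_in
```

With `i_q = 0` the efficiency reduces to V out / V in, so a 3.3 V output is served at roughly 79% efficiency from a 4.2 V source and roughly 92% from a 3.6 V source, which is why a varying source voltage lowers the average efficiency.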
Switching Regulator Efficiency
An ideal switching regulator consumes no power, unlike an ideal linear voltage regulator. However, practical switching regulators have non-ideal characteristics that cause power to be lost. Generally, the major sources of power dissipation in the switching regulation are classified into three categories [13] . Based on a number of previous studies of power loss in switching regulators [13] , [14] , [15] , we express the power dissipation due to each source in terms of manufacturing parameters and load conditions such as the V out and the I out . The power dissipation of the switching regulator is expressed as the sum of the following three components: P conv(swt) = P cdct + P sw + P ctrl [4] .
Conduction Power Dissipation: All the elements of a switching regulator, such as switches, inductors, and capacitors, are nonideal and have their own resistive components R ESR. This means that power dissipation I² · R ESR due to the current I through these elements is unavoidable.
Although varied amounts of current flow through different components, as shown in Fig. 3 , these currents are all positively related to the load current of the system. Consequently, the conduction power dissipation of the switching regulator can be reduced by reducing the load current, which can be achieved by high-level power management that controls the load power.
Since the different types of switching regulator have different ways of switching their MOSFETs, as shown in Fig. 3 (b) and (c), their conduction power dissipation has different characteristics. Therefore, different conduction power models are used for the different types of switching regulator.
The conduction power dissipation of a PWM switching regulator in the buck mode can be formulated as
where R SW1, R SW2, R SW4, R L, and R C are the turn-on resistance of SW1, the turn-on resistance of SW2, the turn-on resistance of SW4, the equivalent series resistance of the inductor L, and the equivalent series resistance of the capacitor C, respectively. D and ΔI L(PWM,buck) are the duty ratio (time when the current actually flows through the component/total time) and the ripple of the current flowing through the inductor, respectively, which can be expressed as follows:
where L f is the value of the inductor, and f S is the switching frequency, which is assumed to be constant in a PWM switching regulator.
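The structure of such a PWM buck conduction model can be sketched in Python as follows. This is an illustration under stated assumptions, not the paper's exact equations: it assumes the ideal buck duty ratio D = V out / V in, the textbook peak-to-peak inductor ripple V out · (1 − D) / (L f · f S), a duty-ratio-weighted path resistance for SW1, SW2, SW4, and L, and an ac term based on the RMS value of a triangular ripple (ΔI / √12) that additionally sees R C. The function name is hypothetical.

```python
def pwm_buck_conduction_loss(v_in, v_out, i_out, l_f, f_s,
                             r_sw1, r_sw2, r_sw4, r_l, r_c):
    """Sketch of PWM buck-mode conduction loss in watts (assumed model)."""
    d = v_out / v_in                          # ideal buck duty ratio (assumption)
    ripple = v_out * (1.0 - d) / (l_f * f_s)  # peak-to-peak inductor ripple (assumption)

    # Path resistance weighted by how long each element conducts:
    # SW1 for D, SW2 for (1 - D), SW4 and L for the whole period.
    r_eff = d * r_sw1 + (1.0 - d) * r_sw2 + r_sw4 + r_l

    p_dc = i_out ** 2 * r_eff                    # dc component of the current
    p_ac = (ripple ** 2 / 12.0) * (r_eff + r_c)  # ac (ripple) component, RMS^2 = dI^2 / 12
    return p_dc + p_ac
```

The dc term scales with the square of the load current, which is why reducing the load current through high-level power management reduces the conduction loss, while the ripple term is set by the inductor value and the fixed switching frequency.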
In the boost mode, the conduction power dissipation can be expressed as
P cdct(PWM,buck) and P cdct(PWM,boost) consist of two terms. The first and second terms represent the conduction power consumption due to the dc component and the ac component (or current ripple), respectively, of the current flowing through all components (i.e., SW1, SW2, SW3, SW4, L, and C) on the current path. In the first term of P cdct(PWM,buck), (D · R SW1 + (1 − D) · R SW2 + R SW4 + R L) is the effective resistance of the current path of the switching regulator, considering the duty ratio of each component on that path. The duty ratios for SW1, SW2, SW4, and L are D, (1 − D), 1, and 1, respectively. (Since the dc component of the current flowing through C is zero, the term related to C is omitted.) It is well known that the conduction power consumption of a system can be expressed by I² · R, where I is the current flowing through the system and R is the resistive component of the system. Therefore, the product of this effective resistance and the square of the dc current gives the conduction power consumption due to the dc component.
In a PFM switching regulator, the switching frequency is not constant but depends on the output current, the output voltage, and other factors. Therefore, the switching frequency should be characterized accurately to determine the amount of the conduction power dissipation of a PFM switching regulator. From Ref. [13], the switching frequency can be described as
where I peak is the peak inductor current allowed in a given PFM switching regulator, and T SW1 and T SW2 are the turn-on times of SW1 and SW2, respectively. T SW1 and T SW2 can be determined as follows:
P cdct(PFM,buck) in Eq. (9) is modeled in the same way as P cdct(PWM,buck). In the first term, ((T SW1 + T SW2)/T) · (I peak/2)² is the square of the dc component of the current flowing through each component, and the second term contains the square of the ac component of that current. The duty ratios for SW1 and SW2 are (T SW1/(T SW1 + T SW2)) and (T SW2/(T SW1 + T SW2)), respectively. These expressions for the current and duty ratios can be found in (or derived from) many references (e.g., Refs. [13] and [15]).
Replacing T SW1 and T SW2 with the expressions from Eq. (8), we can also construct the alternative expression shown in the last two lines of the following equation:
where ΔI L(PFM,buck) is the ripple of the inductor current, which is almost the same as I peak in the PFM switching regulator [4].
Gate Drive Power Dissipation: The gate capacitance of the two MOSFET switches is another source of power dissipation in switching regulators. A switching regulator controls the output voltage and maintains the required load current by opening and closing two switches alternately. This process requires repeated charging of the gate capacitances of the two switches. Thus, the gate drive power dissipation is directly affected by the amount of switching per unit time, that is, the switching frequency. Consequently, PWM switching regulators with a constant switching frequency consume a fixed gate drive power that is independent of the load condition, whereas PFM switching regulators consume less gate drive power as the output current diminishes. Gate drive power dissipation is roughly proportional to the input voltage, the switching frequency, and the gate charge of the MOSFETs, as shown in the following equation [15]:

$$P_{sw(buck)} = V_{in} \cdot f_S \cdot (Q_{SW1} + Q_{SW2}) \qquad (10)$$
where Q SW1 and Q SW2 are the gate charges of SW1 and SW2, respectively. This gate drive power model can be applied to both PWM and PFM switching regulators in the same way, except that f S is a constant in the PWM model, but a variable in the PFM model.
In the boost mode, the gate drive power dissipation can be expressed as
Controller Power Dissipation: Besides the gate drive power dissipation of the control circuit, the static power dissipation of the PWM or PFM control circuit and the power lost in miscellaneous circuits in a switching regulator should be considered. Generally, controller power dissipation is independent of the load condition, which makes it the dominant power dissipation under light loads. We characterize the controller power dissipation as

$$P_{ctrl} = V_{in} \cdot I_{ctrl} \qquad (12)$$

where I ctrl is the current flowing into the controller of the switching regulator, excluding the current charging the gate capacitance.
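To see why load-independent losses dominate under light loads, the three loss components can be combined in a small Python sketch. It assumes a fixed-frequency (PWM-like) regulator, keeps only the dc conduction term I out² · R eff, and takes the gate drive and controller losses as V in · f S · Q and V in · I ctrl, following the proportionalities stated in the text. The function name, the lumped gate charge `q_gate`, and the numeric values in the usage note are illustrative assumptions.

```python
def switching_regulator_efficiency(v_in, v_out, i_out, r_eff, f_s, q_gate, i_ctrl):
    """Sketch of efficiency with P_conv = P_cdct + P_sw + P_ctrl (assumed forms)."""
    p_out = v_out * i_out
    p_cdct = i_out ** 2 * r_eff   # conduction loss, dc term only (ripple ignored)
    p_sw = v_in * f_s * q_gate    # gate drive loss at a fixed switching frequency
    p_ctrl = v_in * i_ctrl        # controller loss, independent of the load
    return p_out / (p_out + p_cdct + p_sw + p_ctrl)
```

For example, with `v_in=3.6`, `v_out=1.8`, `r_eff=0.3`, `f_s=1e6`, `q_gate=2e-9`, and `i_ctrl=1e-4`, the fixed losses are a few milliwatts: negligible against a 200 mA load but several times larger than the output power at 1 mA, so the efficiency collapses at light load.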
Other Practical Considerations
The effect of DC-DC converter efficiency is more critical in micro-scale systems [6], [7]. Micro-scale solar energy harvesting systems perform MPPT with consideration of the energy efficiency variation to maximize the harvested energy. A DC-DC converter is also subject to a minimum input voltage requirement: it cannot operate below a certain input voltage level. System designers should be aware of such requirements when developing a system. Sensors that rely on harvested energy and supercapacitor storage may suffer severe losses in time and energy efficiency to meet this requirement during cold boot. DuraCap relieves this problem by adding a small capacitor beside a large reservoir supercapacitor array [8]. EscaCap reconfigures the supercapacitor array to maximize DC-DC converter efficiency [9].
Low-power Techniques and Correct Overhead Modeling

Dynamic Voltage and Frequency Scaling
Dynamic voltage and frequency scaling (DVFS) has gained popularity over the last decade as a promising power reduction technique. DVFS exploits under-utilized resources by reducing supply voltage and operating frequency. Power consumption of a CMOS circuit is as follows.
where P total, P dynamic, and P leakage are the total, dynamic, and leakage power consumption, and C, V dd, f, α 1, and α 2 are the effective switching capacitance, supply voltage, switching frequency, and two leakage coefficients, respectively. Dynamic power P dynamic is proportional to the switching frequency and the square of the supply voltage, as Eq. (13) implies. The dynamic energy gain comes only from reducing the supply voltage, since reducing the switching frequency increases the execution time in proportion to the power reduction. The operating frequency, in turn, is subject to another constraint, the alpha-power law [16].
$$d \propto \frac{V_{dd}}{(V_{dd} - V_T)^{\alpha}} \qquad (14)$$

where d, V dd, V T, and α are the critical path delay, the supply voltage, the threshold voltage, and an empirical parameter between 1 and 2, respectively; α approaches 1 as technology scales. Delay increases as the supply voltage decreases, which means that the circuit should operate at a lower frequency. One should also allow for leakage power consumption. Scaling down the voltage and frequency too much might increase the leakage energy consumption because execution takes longer. There exists an optimal operating point that minimizes the total energy consumption; the corresponding speed is called the critical speed. There exists a large body of research on DVFS techniques based on these observations. DVFS is also extensively studied as a means of dynamic thermal management, since power dissipation is directly related to temperature [17], [18]. A study on an Intel Pentium M system has shown that DVFS is very effective in controlling the processor temperature [19]. For fine-grained control of temperature, voltage and frequency transitions could be very frequent, on the order of a few milliseconds.
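The existence of a critical speed can be illustrated with a short Python sketch that searches for the supply voltage minimizing energy per clock cycle. It assumes the alpha-power law in the form f = k · (V dd − V T)^α / V dd and a hypothetical linear leakage model P leakage = a1 · V dd + a2; the leakage form, the constant `k`, and all function and parameter names are assumptions made for illustration rather than the paper's Eq. (13).

```python
def max_frequency(v_dd, v_t, alpha, k):
    """Maximum clock frequency from the alpha-power law (delay ~ V / (V - Vt)^alpha)."""
    return k * (v_dd - v_t) ** alpha / v_dd


def energy_per_cycle(v_dd, c_eff, v_t, alpha, k, a1, a2):
    """Dynamic plus leakage energy spent in one clock cycle at v_dd."""
    f = max_frequency(v_dd, v_t, alpha, k)
    p_leak = a1 * v_dd + a2            # hypothetical leakage model (assumption)
    return c_eff * v_dd ** 2 + p_leak / f


def critical_voltage(v_min, v_max, steps=2000, **params):
    """Grid search for the voltage that minimizes energy per cycle.

    v_min must be above the threshold voltage v_t.
    """
    grid = [v_min + (v_max - v_min) * n / steps for n in range(steps + 1)]
    return min(grid, key=lambda v: energy_per_cycle(v, **params))
```

Below the returned voltage, the longer cycle time lets leakage energy grow faster than the dynamic energy shrinks, so running slower than the critical speed wastes energy.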
Power Converter Voltage and Frequency Transition Overhead
Despite the extensive use of DVFS as a low-power technique or a thermal management technique, the overhead of a DVFS transition has not yet been fully understood [2]. The converter output is typically programmed through configuration register accesses from the microprocessor. The configuration registers are often mapped to special registers of the microprocessor, such as the MSRs (Model Specific Registers) of the Intel Core Duo [20]. The processor writes to such special registers, and a command is then sent to the DC-DC converter to increase the output voltage. However, the output voltage takes some time to settle. The processor also sets a new target frequency in the frequency synthesizer, and the output frequency likewise takes some time to stabilize. This whole sequence incurs delay and energy overhead that require careful modeling.
Existing DVFS transition overhead models have limitations and are not applicable to modern DVFS setups. In particular, they are significantly simplified, contain technical fallacies, or are limited to uncommon setups. The most widely used model for DVFS transition overhead is that of Refs. [21], [22]. The model is summarized in the following equations.
where T X and E X are the delay and energy overhead of a DVFS transition, max(I L(t)) is the maximum inductor current, V e and V s are the end and start voltages of the DVFS transition, and η is the converter efficiency. However, this model has fallacies. Equation (15) is the time required for the converter to charge the output bulk capacitor. This results from the assumption that the microprocessor stops operating during the entire voltage transition period, something that is neither desirable nor practical [23]. Also, Eq. (16) assumes that the charge moved in and out of the bulk capacitor is itself the energy overhead. Other works often assume that voltage-controlled oscillators are used for the clock generator, which is unusual in today's microprocessors.
An actual DVFS transition sequence is shown in Fig. 4. During an upscaling transition, the CPU voltage is increased first and then the frequency is increased. The processor is underclocked using a conservative value during the voltage transition to ensure safe operation. A frequency change requires a certain amount of time, called the PLL lock time, during which the processor must halt. During a downscaling transition, the frequency is decreased first and then the CPU voltage is decreased.
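For reference, the conventional model discussed here is commonly written in the DVS literature as T X = 2 · C · |V e − V s| / max(I L) and E X = (1 − η) · C · |V e² − V s²|, where C is the bulk capacitance. The Python sketch below assumes that this commonly cited form corresponds to Eqs. (15) and (16); the function and parameter names are illustrative.

```python
def conventional_transition_overhead(c_bulk, i_l_max, v_start, v_end, efficiency):
    """Conventional DVFS overhead estimate (assumed form of Eqs. (15) and (16)).

    Returns (t_x, e_x): transition time in seconds and energy overhead in joules.
    """
    # Time for the converter to move the bulk capacitor between the two voltages,
    # implicitly assuming the processor draws no current in the meantime.
    t_x = 2.0 * c_bulk * abs(v_end - v_start) / i_l_max
    # Energy overhead attributed solely to the charge moved in or out of the capacitor.
    e_x = (1.0 - efficiency) * c_bulk * abs(v_end ** 2 - v_start ** 2)
    return t_x, e_x
```

Because of the absolute values, this estimate is symmetric for upscaling and downscaling and contains no term for the processor current, which are the simplifications that the refined overhead model addresses.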
The actual DVS voltage transition time T † X should satisfy the following equation:
T † X is an important variable in determining the amounts of E up, E down, and E uc. Note that we take I cpu(t) into consideration because max(I L(t)) is not large enough to allow us to ignore I cpu during the voltage transition. However, I cpu was ignored in previous models [21]. Table 1 shows the definitions of the terms to be used in the following sections.
Converter-induced Overhead of a DVFS Transition
Conventional DVFS transition energy overhead accounts for the charge transfer to and from the bulk capacitor as described in Section 3.2. We perform LTspice simulation and confirm the presence of: i) additional inductor IR loss during voltage upscaling and ii) energy loss due to continuous-mode DC-DC conversion.
Charge transfer to and from the bulk capacitor: The DC-DC converter output voltage is set by the bulk capacitor terminal voltage, which is in turn proportional to the charge stored in the capacitor. Voltage upscaling transfers additional charge to the bulk capacitor and increases the terminal voltage from V s to V e. The amount of energy for the upscaling is calculated as
Notice that E cap > 0 for voltage upscaling, which denotes energy loss.
For downscaling, E cap ≤ 0 because the upper MOSFET in the DC-DC converter is open and stops supplying current to the bulk capacitor, but the bulk capacitor still supplies power to the microprocessor until the voltage converges to V e . This contributes as a source of negative energy overhead (i.e., energy gain) during voltage downscaling.
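The sign behavior described for E cap follows from the energy stored in a capacitor. The Python sketch below assumes the standard expression ½ · C · V² for stored energy, so that the change between V s and V e is ½ · C · (V e² − V s²); the paper's own expression is not reproduced here, and the function and parameter names are illustrative.

```python
def bulk_capacitor_energy(c_bulk, v_start, v_end):
    """Change in energy stored in the bulk capacitor, in joules.

    Positive for upscaling (energy must be supplied to the capacitor),
    negative for downscaling (stored energy can keep powering the load).
    """
    return 0.5 * c_bulk * (v_end ** 2 - v_start ** 2)
```

Under this assumption the upscaling cost and the downscaling gain between the same two voltages cancel exactly, which is why the converter-side losses, and not the capacitor charge itself, determine the net overhead.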
Additional inductor IR losses: The additional charge transfer to the bulk capacitor also incurs IR loss in the inductor. This loss is not symmetrical because voltage downscaling does not involve the inductor.
Notice that the above equation does not account for the inductor IR losses due to the microprocessor current that goes through the inductor. This is necessary in order to avoid double counting the energy losses during upscaling. Instead, it only counts the additional current needed to increase the bulk capacitor voltage.
Energy loss due to continuous-mode DC-DC conversion: Unfortunately, continuous-mode DC-DC conversion mostly wastes the potential energy gain from E cap during voltage downscaling because the lower MOSFET discharges the bulk capacitor to GND. The energy loss, E down, is given by
where
On the other hand, discontinuous-mode DC-DC conversion effectively blocks the negative inductor current, i.e., E down = 0.
Total converter-induced DVFS overhead: The total converter-induced overhead, which is not symmetrical for voltage upscaling and downscaling, is given by
Microprocessor-induced Overhead of a DVFS Transition
There is a microprocessor-induced overhead when the microprocessor changes its voltage and frequency. We perform LTspice simulation and evaluate the energy loss due to microprocessor underclocking. We find that I cpu during the PLL lock time is an important factor in calculating the energy loss and thus should not be ignored.
Microprocessor performance underclocking loss: One of the most dominant energy loss sources is caused by underclocking the microprocessor (i.e., applying a conservative clock frequency below the maximum frequency that the supply voltage can safely support) during the transition period as shown in Fig. 4 . Because of underclocking, the microprocessor consumes additional dynamic and static power, which is given by
Power consumption during the PLL lock time: The microprocessor operation is halted during the PLL lock time as explained in Section 3.2. In general, clock and/or power gating cannot be ideal (without overhead losses), i.e., there is non-zero amount of static power consumption from the microprocessor during the PLL lock time, which is given by
Total microprocessor-induced DVFS overhead: The total microprocessor-induced DVFS overhead is given by
The total energy overhead of a DVFS transition is the sum of the converter-induced overhead in Eq. (22) and the microprocessor-induced overhead in Eq. (25) as follows:
Time Overhead of a DVFS Transition
Once again, we model the PLL lock time as a constant time penalty regardless of f s and f e as described in Ref. [24] . Most previous work regards the PLL lock time as the only source of the time penalty that causes the microprocessor performance degradation. However, we address another time overhead factor especially for upscaling. Therefore, we first present the penalty of the microprocessor during the DVFS transition in cycles, and then derive time penalty. The microprocessor operates at f s during the voltage upscaling time to guarantee safe operation of the microprocessor. This is similar to the performance underclocking loss but only voltage upscaling is subject to the additional time overhead. Thus, the cycle penalty, C P is given by
and then we present time penalty using the cycle penalty as follows
: upscaling,
The time penalty is the only overhead for a DVFS transition because the microprocessor is not halted for T † X as described in Section 3.2. Therefore, the time overhead of a DVFS transition is the same as the time penalty as follows:
DC-DC Converter Aware Dynamic Voltage Scaling
Almost all of the currently proposed methods do not seriously take into account the efficiency of DC-DC converters, simply assuming the DC-DC converter power efficiency to be a constant value. If the amount of power dissipation of a DC-DC converter were constant over the entire operating range, we could ignore its effect on the total energy consumption of the system. However, the efficiency of a DC-DC converter has a close correlation with the output voltage level and the load current, as we discussed in the previous sections. The key concern of DC-DC converter-aware DVS is that even though an effective power management scheme can reduce the power consumption of a device to a large extent, this does not always mean that the power consumption of the DC-DC converter is also minimized; in some cases the converter operates very inefficiently, resulting in poor battery life enhancement. Consequently, it is necessary to solve two problems in an integrated fashion, namely the (output) voltage scaling of the DC-DC converter and the voltage scaling applied to the devices other than the DC-DC converter, so that the total energy consumption is globally minimized [3], [4].
Specifically, we approach the problem from two aspects, in which the two subproblems (1) and (2) below cover the core parts of the problem of DC-DC converter-aware power management: (1. Converter-aware voltage scaling problem) For a given single task with execution cycles and a deadline, we derive the power consumption model of a DC-DC converter by analyzing how the power consumption is related to the output voltage, and propose a voltage scaling technique that minimizes the total sum of the energy consumed by the execution of the task and the energy dissipated by the DC-DC converter. The proposed technique is then simply, yet effectively, extended to handle multiple tasks; (2. Application-driven converter optimization problem) Conversely to the problem solved in (1), we propose a solution to the problem of finding a configuration of a DC-DC converter that is best suited for the application to be executed in the system in terms of minimizing total energy consumption.
Converter-aware Voltage Scaling
We adopt the power loss model introduced in Ref. [25] to describe the energy consumption of a DC-DC converter according to the load current. We do not use this model as it is, but make a simplified version, which considers many manufacture-related parameters as constants, as follows:
where I is the load current, W is a DC-DC converter configuration parameter which controls a tradeoff between load independent power consumption and load dependent power consumption (e.g., the gate width of MOSFET switches in Kursun's loss model [25] ), and c 1 , · · · , c 4 are constants. If I = 0, we can consider P DC (I, W) as zero because many DC-DC converters enter the shutdown state with very little power loss when there is no load current. An instance of a task scheduling and a voltage allocation problem in a system consists of a set J = {J 1 , J 2 , · · · , J N } of tasks (or jobs) and a variable voltage range [V min , V max ] where N represents the number of tasks. We denote f (V) to be the clock speed corresponding to the voltage V.
Each task J i ∈ J is associated with the following parameters:
• a i : the arrival time of J i .
• d i : the deadline of J i .
• R i : the number of processor cycles required to complete J i .
Since the supply voltage directly determines the processor's clock frequency (as implied in Eq. (14)), it is often convenient to think of the energy consumption as a function of the clock frequency. Let f i (t) be the clock frequency assigned to task J i at time t, and P i ( f i (t)) be the energy consumed in task J i during a period of unit time, starting at t. Then, the total energy consumed by a voltage scaling, A i , for task J i is given in Ref. [26].
where t i,1 and t i,2 are the start and ending times of the execution of task J i . Thus, the total CPU energy consumption, E CPU , excluding that in DC-DC converter for N tasks (
Then, from Eqs. (32) and (30), the total energy consumption including that in a DC-DC converter for the tasks is computed by
Note that the values of a i , d i , and R i are given for task J i , and the values of s i (t) and P i ( f i (t)) vary according to the dynamic scaling of voltages to J i , and, thus, directly affect the amount of energy consumption. A schedule of tasks is referred to as a feasible schedule if all the timing constraints of the tasks are satisfied. We assume that tasks can be preempted. Then, the task scheduling and voltage scaling problem is:
Problem 1: Given an instance of tasks, a DC-DC converter, and a voltage range of a processor, find a feasible task schedule and voltage scaling to tasks that minimizes the quantity of E tot in Eq. (33).
To reduce the complexity of the problem, we first propose a technique for solving a restricted version of Problem 1, and then extend it to fully solve Problem 1.
• Solution to Problem 1 with a single task: We derive a total power equation in terms of the supply voltage only, from Eq. (30) and P = CV² f. For a system with dynamic voltage scaling, the maximum operating frequency is proportional to the operating voltage. That is, f = αV where α is a system-dependent constant, and thus P = CαV³. Furthermore, since power consumption can also be expressed as the product of load current and supply voltage (i.e., P = VI), we have

$$I = \frac{P}{V} = C \alpha V^{2} \qquad (34)$$
We can express the total power consumption from Eqs. (34) and (30) with a fixed value of W as follows:

$$P_{tot}(V) = C \alpha V^{3} + P_{DC}(C \alpha V^{2}, W)$$
For a task with execution time T and deadline D, the quantity E tot for the execution of the task can be obtained by simply multiplying the total power consumption, P tot(V), by the execution time, because the power loss of the DC-DC converter in the standby state is negligible:

$$E_{tot}(V) = P_{tot}(V) \cdot T$$

Applying T = R/f = R/(αV) (where R is the number of cycles for the given task) to E tot(V) gives
The last term in Eq. (36) indicates that the total energy consumption is not a monotonically increasing function of the output voltage. This means that using the lowest feasible voltage (or frequency) for a task does not always lead to minimal total energy consumption. Figure 5 shows the curve of E tot(V) for a DC-DC converter. The curve clearly indicates that the optimal voltage for E tot(V) is not always the lowest feasible voltage.
• Solution to Problem 1 with multiple tasks: There can be two directions to solve Problem 1 with multiple tasks. One is a generic technique that is applicable to a broad class of DVS methods. The other is a fine-tuned technique only applicable to a specific DVS method. Since we are interested in the problem of integrating the efficiency variation of a DC-DC converter into the existing DVS methods, we choose the former direction. Specifically, for any (existing) DVS method with no consideration of power minimization in the DC-DC converter, our devised technique attempts to improve the quality of the results produced by the method by reflecting the power consumption in the DC-DC converter. The idea of our proposed technique, called DC DVS-m, is to decompose the schedule of tasks on a per-task basis and apply DC DVS-1 in Fig. 6 to each of the decomposed schedules to further reduce the total energy consumption of the task. (Fig. 6, DC DVS-1, includes the step "Derive f OPT from Eq. (36)".) Let E before i and E after i be the quantities of E tot in Eq. (36) of task i, before and after the application of DC DVS-1 to task i. Then, the total amount of energy saving ΔE tot by DC DVS-m over that of an existing DVS method is:
Note that the value of $\Delta E_{tot}$ is always positive because, for every $i$, $E_i^{before} - E_i^{after} > 0$. Figure 7 summarizes the procedure of DC DVS-m. DC DVS-m preserves the schedule of tasks produced by the input DVS method; it only updates the frequency (i.e., supply voltage) of each task. If the schedule of a task spans more than one time interval, the intervals are merged into one time interval and DC DVS-m is applied to that interval. The assignment $[a_i, d_i] = |s|$ in Fig. 7 performs this merge of time intervals. After DC DVS-1 is applied to each task, the merged interval is restored to the original intervals.
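The per-task procedure can be illustrated with a minimal Python sketch. It is not the algorithm of Figs. 6 and 7; it only assumes $f = \alpha V$ and $I = C\alpha V^2$ from the text, and it replaces the converter model of Eq. (30) with a hypothetical loss made of a fixed term plus a term proportional to $I^2$. All constants and the names `Task`, `e_tot`, `dc_dvs_1`, and `dc_dvs_m` are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class Task:
    cycles: float  # clock cycles the task needs
    v_dvs: float   # voltage assigned by the input DVS method (lowest feasible)


# Hypothetical constants, chosen only to make the effect visible.
C_EFF = 1e-9    # effective switched capacitance (F)
ALPHA = 1e8     # f = ALPHA * V (Hz per volt)
P_FIXED = 0.2   # assumed load-independent converter loss (W)
R_LOSS = 0.5    # assumed load-dependent loss coefficient (ohm)


def e_tot(v, cycles):
    """Total energy (CPU plus assumed converter loss) to run `cycles` at voltage v."""
    t_exec = cycles / (ALPHA * v)            # T = cycles / f
    i_load = C_EFF * ALPHA * v ** 2          # from P = V * I = C * alpha * V^3
    p_cpu = C_EFF * ALPHA * v ** 3
    p_conv = P_FIXED + R_LOSS * i_load ** 2  # assumed stand-in for Eq. (30)
    return (p_cpu + p_conv) * t_exec


def dc_dvs_1(task, v_max, steps=200):
    """Grid-search [v_dvs, v_max]; raising V only shortens the task, so the deadline holds."""
    span = v_max - task.v_dvs
    candidates = [task.v_dvs + k * span / steps for k in range(steps + 1)]
    return min(candidates, key=lambda v: e_tot(v, task.cycles))


def dc_dvs_m(tasks, v_max):
    """Apply dc_dvs_1 to each task and report the total energy saving."""
    new_v = [dc_dvs_1(t, v_max) for t in tasks]
    saving = sum(e_tot(t.v_dvs, t.cycles) - e_tot(v, t.cycles)
                 for t, v in zip(tasks, new_v))
    return new_v, saving
```

With these assumed constants, a task that DVS placed at 0.6 V is moved to a higher voltage, because the fixed converter loss accumulates over the longer execution time, while a task already above the energy-optimal voltage keeps its DVS assignment.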
Application-driven Converter Optimization
The problem of implementing a DC-DC converter that consumes the least energy under the application of DVS is not simple, since there may be various parameters, some of which conflict with each other. However, as mentioned in Section 2.3, one of the parameters with the most critical impact on the variation of energy consumption is parameter $W$ in Eq. (30), which strongly controls the tradeoff between load-independent power and load-dependent power. (Figure 8 shows two different energy curves extracted from experiments for two different values of parameter $W$ of a DC-DC converter.) In this section, we show how the parameter $W$ can be optimized to minimize the total energy consumption of a system. Note that our optimization procedure is general in that it is applicable to any parameter, as long as the energy consumption can be expressed in terms of that parameter.
The energy model derived in terms of parameter $W$ and applied voltage $V$ is the one in Eq. (36):
The last two terms represent the energy consumed in the converter itself, while the first term represents the CPU energy consumed by a task. Note that $W$ in the converter design is constrained to a value in $[W_{min}, W_{max}]$. Although it is not difficult to find energy-optimal values of $W$ and $V$ from Eq. (38) for a single task in a specific application, from a practical point of view it is hard to find optimal values for multiple tasks. Since solving the problem with a complex mathematical tool would be very time consuming, we simplify the problem: we find the best value of $W$ after DVS has been applied independently of the DC-DC converter. In other words, for a given DVS result, we want to find a value of $W$ in $[W_{min}, W_{max}]$ that minimizes the total energy consumption of the system. Precisely, let $v_1, v_2, \cdots, v_k$ be the voltages applied to a (scheduled) sequence of unit times of execution of multiple tasks produced by a DVS scheme, and let $E_{tot}(v_i, W)$ be the total energy consumption in the corresponding unit time using voltage $v_i$. Then the total energy can be expressed, in terms of variable $W$ only, as follows:

$$E(W) = \sum_{i=1}^{k} E_{tot}(v_i, W) \qquad (39)$$

We then evaluate this total over the range $[W_{min}, W_{max}]$ and set $W$ to the energy-minimal value, as shown in Fig. 9.
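The sweep over $W$ can be sketched in a few lines of Python. The function `best_w` takes the per-unit-time energy model as an argument, because the actual model of Eq. (38) is not reproduced here. `toy_e_tot` is a hypothetical stand-in that only assumes one converter loss term grows with $W$ and another shrinks with $W$; its coefficients are invented for illustration.

```python
def best_w(voltages, e_tot, w_min, w_max, steps=100):
    """Return (W, energy) minimizing sum_i e_tot(v_i, W) on a uniform grid over [w_min, w_max]."""
    best = None
    for k in range(steps + 1):
        w = w_min + k * (w_max - w_min) / steps
        total = sum(e_tot(v, w) for v in voltages)
        if best is None or total < best[1]:
            best = (w, total)
    return best


def toy_e_tot(v, w):
    """Hypothetical energy per unit time; NOT the model of Eq. (38)."""
    cpu = v ** 2                        # CPU term, independent of W
    load_independent = 0.1 * w          # assumed to grow with W
    load_dependent = 0.4 * v ** 4 / w   # assumed to shrink with W
    return cpu + load_independent + load_dependent
```

For example, `best_w([1.0, 1.0, 2.0], toy_e_tot, 1.0, 10.0, steps=900)` selects a value of $W$ close to $\sqrt{24} \approx 4.9$ for this toy model, which lies strictly inside the allowed range rather than at either bound.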
The experimental results are shown in Tables 2 and 3, where NO DVS, DVS ONLY, and DC DVS refer to the results without DVS, DVS without considering DC-DC converter loss, and the proposed DC-DC converter-aware DVS, respectively. Benchmarks VP, AVN, and CNC refer to a videophone application, an avionics application, and a computerized numerical control machine application, respectively. DC DVS-1 is applied to the single-task benchmark MPEG, and DC DVS-m is applied to the rest.
Passive Voltage Scaling (PVS)
Passive Voltage Scaling
Passive Voltage Scaling (PVS) is a supply voltage scaling method that eliminates the DC-DC converter by connecting a battery directly to a microcontroller. The direct battery connection is the same as in typical wireless sensor nodes [27], which operate the microcontroller at a fixed low clock frequency at all times; the substantial difference is that we scale the clock frequency according to the battery voltage. Figure 10 illustrates the concept of PVS. When the battery has a high state of charge (SoC), $V_B$ is high enough to run the ULP core at the maximum speed. In general, $V_B$ decreases as the SoC decreases (Fig. 10 (a)), and it is monitored by an embedded voltage supervisor (Fig. 10 (b)). The voltage supervisor generates interrupts that prompt the microcontroller core to slow down the clock frequency according to $V_B$. The clock scaling is generally done in a discrete manner. As a result, the throughput of the sensor node is a function of the battery voltage, i.e., the SoC of the battery (Fig. 10 (c)).
A discrete PVS can be easily implemented with the embedded peripheral devices of the MSP430F1611, as shown in Fig. 10, which does not incur additional hardware overhead. In a discrete PVS, the supply voltage monitor watches whether $V_B$ crosses the boundary of one of the pre-defined subranges. Whenever $V_B$ crosses a subrange boundary, the monitor generates an interrupt. The microcontroller core then reprograms the DLL to scale down the clock frequency. In fact, a discrete PVS adjusts the clock frequency only a few times over the whole lifetime of a sensor node, which incurs only negligible overhead. The clock frequency is adjusted according to the following delay equation:
where $V_{DD}$ is the supply voltage such that $V_{DD} = V_B$, $V_t$ is the threshold voltage of the CMOS logic that composes the microcontroller, and $\alpha$ is a velocity saturation coefficient.
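As an illustration of discrete clock selection, the Python sketch below assumes that the delay equation has the standard alpha-power-law form, in which gate delay is proportional to $V_{DD} / (V_{DD} - V_t)^{\alpha}$. This form matches the symbols defined above but is an assumption here. The threshold voltage, the coefficient, the reference clock, and the subrange boundaries are hypothetical values, not MSP430F1611 data.

```python
import bisect

# Hypothetical device parameters, for illustration only.
V_T = 0.6      # threshold voltage (V)
ALPHA = 1.5    # velocity saturation coefficient
V_REF = 3.0    # reference supply voltage (V)
F_REF = 8e6    # clock frequency sustained at V_REF (Hz)

# Ascending subrange boundaries watched by the voltage supervisor (V), assumed.
BOUNDARIES = [2.0, 2.4, 2.8]


def max_clock(v_dd):
    """Maximum clock under the assumed alpha-power law: delay ~ V_DD / (V_DD - V_T)^ALPHA."""
    if v_dd <= V_T:
        return 0.0
    delay = v_dd / (v_dd - V_T) ** ALPHA
    delay_ref = V_REF / (V_REF - V_T) ** ALPHA
    return F_REF * delay_ref / delay


def discrete_pvs_clock(v_b):
    """Clock for battery voltage v_b: the safe frequency at the lower edge of its subrange."""
    idx = bisect.bisect_right(BOUNDARIES, v_b)
    if idx == 0:
        return 0.0  # below the lowest boundary: no safe operating clock
    return max_clock(BOUNDARIES[idx - 1])
```

Choosing the frequency that is safe at the lower edge of each subrange means the clock only has to be reprogrammed when the supervisor reports a boundary crossing, which is what keeps the number of adjustments small over the lifetime of a node.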
Throughput and Power Saving
Sensor nodes with DC-DC converters regulate the battery voltage, and thus the microcontrollers can run at any desired clock frequency over the entire lifetime. In other words, the performance of the microcontrollers is independent of the battery SoC. Recent wireless sensor nodes removed the DC-DC converter and simply fixed the clock frequency low enough that the microcontroller tolerates the lowest battery voltage. Thus, such a sensor node also exhibits throughput that is independent of the battery SoC [27]. However, as the sensor node experiences the lowest battery voltage only when the battery is about to die, this is a very conservative method that wastes the performance of the microcontroller during most of its lifetime. In contrast, as PVS continuously decreases the clock frequency according to Eq. (40), the throughput of the microcontroller also decreases. This is a new characteristic for a digital system, although it was common in old analog systems. For example, old portable radios produced louder sound with new batteries and quieter sound as the batteries discharged.

DVS achieves energy gain from a microcontroller by reducing the dynamic energy per clock cycle. At the same time, leakage energy and other peripheral energy increase as the execution time of a task is lengthened [28], [29]. The baseline energy consumption of a system, with a DC-DC converter and a fixed $f_c$, is given by
$$E(\text{Baseline}) = E(\text{CPU}) + E(\text{Peripherals}) + E(\text{DC-DC}) \qquad (41)$$
Therefore, the energy gain from DVS can be written as follows:
The primary advantage of PVS is longer battery life, but the sources of energy gain are completely different from those of DVS; PVS does not aim only at CPU energy reduction. First, eliminating the DC-DC converter recovers its energy loss, which typically ranges from 20% to 40% in conventional DVS for microcontrollers. Second, decreasing the battery current such that $i_B \propto V_B$ increases the total energy that can be drawn from the battery. Third, a DC-DC converter is a noisy analog component that requires passive components such as an inductor and a bulk capacitor, and the layout and circuit board pattern design needed to minimize and isolate the switching noise is not cheap. PVS thus helps reduce the complexity, area, and cost of a system. Fourth, PVS is ideal for cheap alkaline batteries, whose voltage drops significantly as the SoC decreases. Most previous battery-aware power management work assumed Li-ion batteries, but alkaline batteries are more appropriate for disposable and low-cost systems. Fifth, PVS recovers the potential performance of the CPU and reduces the execution time (or increases throughput). Consequently, it does not incur appreciable peripheral energy overhead, in contrast to DVS. The energy gain from PVS can be written as follows:
Therefore, the energy gain of PVS is not limited to the CPU, and has great potential to reduce the whole system energy when the CPU power portion is not dominant. We demonstrate the amount of energy saving in the following example.
PVS on a Wireless Sensor Network
In this section, we evaluate PVS considering the whole sensor network. We demonstrate how PVS achieves a longer lifetime with traditional performance-driven routing for distributed data processing. The necessary condition to justify the effectiveness of PVS is that network latency (not power consumption) should be bounded by computational delay, not by communication delay, which is common in wireless sensor networks [30].
We borrow TinyOS [31] and real applications, and compare the actual battery SoC and lifetime of PVS with DCDC and TELOS. DCDC represents a sensor network where all the nodes are equipped with a DC-DC converter and run at the highest frequency, while TELOS represents typical sensor nodes with a direct battery connection and the microcontroller running at the lowest frequency. We use TOSSIM and a customized MultiHopRouter component to include the battery, DC-DC converters, and detailed energy and performance models of the MSP430F1611. We employ a discrete battery model that considers the rate capacity and recovery effects [32]. We assume a 7 × 7 sensor network where the nodes are equally spaced to form a grid. The center node is a sink node that has a storage element and only receives packets. The other nodes sample ambient noise at a 7.5 kHz sampling rate with 8-bit resolution, and forward it to the sink node. When a node is triggered by a noise level that exceeds the threshold, it samples the noise for 4 s. We assume that the event occurs every 8 s with uniform spatial probability.
Once a node is triggered, it captures and forwards the noise data to an adjacent node without compression. The adjacent node receives the data, compresses it with a ratio of two, and forwards it to its adjacent node, and so on until the data reaches the sink node. We do not attempt any further compression once data has been compressed. If a node receives another data packet while compressing data, it forwards the newly arrived packet to the next adjacent node without compression. The compression throughput of a node is 80 kbps at 8 MHz. The radio bandwidth is 250 kbps, and the actual data transmission rate is 60 kbps. We scale down the capacity of the Toshiba LR-44 button alkaline battery to 0.2 mAh (500:1) for faster simulation. The TX power and RX power are 52.2 mW and 56.4 mW, respectively [31]. We assume ideal scheduled wake-up for the RX power management. The underlying routing scheme is a performance-driven routing: a tree-based routing [33].

Fig. 11 Residual energy snapshots of the sensor network with DCDC, TELOS and PVS [5].

Fig. 12 Latency distribution of the sensor nodes [5].

Figure 11 shows snapshots of the residual energy of the sensor network for the three schemes: DCDC, TELOS and PVS. DCDC shows a rapid decrease of the residual energy compared with TELOS and PVS. PVS shows the best residual energy saving and even wear-out of the sensor nodes. Figure 12 compares the network performance of DCDC, TELOS and PVS. DCDC shows an average latency of 0.56 s, but its lifetime is much shorter, around 689 s. TELOS lasts more than 817 s, but its average latency is 0.73 s. In contrast, PVS lasts up to 1,095 s with an average latency of 0.54 s. In conclusion, PVS combines the advantages of TELOS and DCDC in terms of residual energy saving and high performance. Most of all, PVS offers even wear-out of the network with a performance-driven routing.
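The workload parameters quoted in this section can be cross-checked with simple arithmetic. The Python sketch below uses only the stated figures (7.5 kHz, 8 bit, 4 s capture, compression ratio of two, 80 kbps compression throughput at 8 MHz, 60 kbps transmission rate). It additionally assumes that the compression throughput is measured on the uncompressed input and scales linearly with the clock frequency; both are assumptions made for illustration. The resulting per-event totals are not the per-packet latencies reported in Fig. 12.

```python
SAMPLE_RATE_HZ = 7_500
BITS_PER_SAMPLE = 8
CAPTURE_S = 4
COMPRESSION_RATIO = 2
COMPRESS_BPS_AT_8MHZ = 80_000   # compression throughput at 8 MHz
LINK_BPS = 60_000               # actual data transmission rate

raw_bits = SAMPLE_RATE_HZ * BITS_PER_SAMPLE * CAPTURE_S   # bits captured per event
compressed_bits = raw_bits // COMPRESSION_RATIO


def compress_time(clock_hz):
    """Seconds to compress one event, assuming throughput scales linearly with the clock."""
    throughput = COMPRESS_BPS_AT_8MHZ * clock_hz / 8e6
    return raw_bits / throughput


def send_time(bits):
    """Seconds to transmit `bits` at the actual data transmission rate."""
    return bits / LINK_BPS
```

Under these assumptions one event produces 240 kbit of raw data, which takes 4 s to transmit uncompressed and 2 s after compression, while compressing it takes 3 s at 8 MHz and 6 s at 4 MHz. This illustrates why the clock frequency, and not only the radio, shapes the end-to-end delay in this workload.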
Maximum Power Transfer Tracking
Solar Energy Harvesting
Practical deployment of a renewable energy source requires an electrical energy storage element to compensate for the output power fluctuation of the source. In general, the energy storage element of a renewable energy source experiences very frequent changes between charge and discharge phases. Supercapacitors, also known as electric double-layer capacitors, are among the most promising energy storage elements for this application because of i) a cycle life that is orders of magnitude longer than that of ordinary batteries, ii) a high power rating, iii) no limitation on deep-cycle use, and iv) very low environmental impact. Supercapacitors and PV (solar cell) modules are an excellent combination because the energy storage element typically undergoes a deep-cycle discharge every night.

Figure 13 illustrates a simplified schematic diagram from energy generation to storage. Enhancing the total system efficiency means maximizing the power that goes into the supercapacitor, $P_{charge}$. In contrast, previous maximum power point tracking (MPPT) work maximizes the power output from the PV module, $P_{pv}$. If the efficiency of the charger, $\eta$, is constant, maximizing $P_{pv}$ also maximizes $P_{charge}$. However, the efficiency of a typical switching charger for supercapacitors changes dramatically as a function of the difference between its input and output voltages, i.e., the PV module voltage and the supercapacitor terminal voltage. The charger efficiency varies from 10% to 80% as a function of this input-output voltage difference and, in turn, of the supercapacitor's SoC. Because $\eta$ is not constant, conventional MPPT methods no longer guarantee the maximum $P_{charge}$. In other words, even if conventional MPPT draws the maximum power from the PV modules, a large portion of that power is dissipated as heat in the charger and never reaches the supercapacitor.
Furthermore, the choice of the supercapacitor capacitance $C$ is constrained not only by the required amount of energy storage but also by the charger efficiency.
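The gap between the operating point that maximizes $P_{pv}$ and the one that maximizes $P_{charge}$ can be illustrated numerically. The Python sketch below uses a hypothetical exponential PV curve and a hypothetical charger efficiency that falls linearly with the input-output voltage difference and is clipped to the 10% to 80% range mentioned above. None of these models or constants come from measurements; they are assumptions chosen only to show the effect.

```python
import math

# Hypothetical PV module: short-circuit current (A), open-circuit voltage (V), curve knee (V).
I_SC, V_OC, KNEE = 1.0, 5.0, 0.4


def pv_current(v):
    return I_SC * (1.0 - math.exp((v - V_OC) / KNEE))


def pv_power(v):
    return v * pv_current(v)


def charger_efficiency(v_pv, v_cap):
    """Assumed efficiency: best when input and output voltages are close, within [0.10, 0.80]."""
    return max(0.10, 0.80 - 0.15 * abs(v_pv - v_cap))


def charge_power(v_pv, v_cap):
    return charger_efficiency(v_pv, v_cap) * pv_power(v_pv)


def argmax_on_grid(fn, lo, hi, steps=1000):
    grid = [lo + k * (hi - lo) / steps for k in range(steps + 1)]
    return max(grid, key=fn)


def compare(v_cap):
    """Return (V at max P_pv, V at max P_charge, P_charge at each of the two points)."""
    v_mpp = argmax_on_grid(pv_power, 0.0, V_OC)
    v_mpt = argmax_on_grid(lambda v: charge_power(v, v_cap), 0.0, V_OC)
    return v_mpp, v_mpt, charge_power(v_mpp, v_cap), charge_power(v_mpt, v_cap)
```

With a supercapacitor voltage of 2 V, `compare(2.0)` places the point that maximizes $P_{charge}$ at a lower PV voltage than the conventional maximum power point, and the power delivered to the supercapacitor there is higher even though less power is drawn from the PV module.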
MPPT methods dynamically adjust the output current to match impedance so that the maximum amount of power can be drawn from the power generating device. They first identify the maximum power point (MPP) to draw the maximum $P_{pv}$, and then continuously track this point against variations in irradiance and/or load impedance. Many MPPT methods have been proposed. The perturb & observe (P&O) method and the incremental conductance method identify the MPP by generating a slight change in $I_{pv}$ and observing the change in $P_{pv}$ [34]. The ripple correlation control method [35] finds the MPP using the time derivatives of $I_{pv}$ and $P_{pv}$. As for economical implementations of MPPT, a small pilot cell [36], or a linear relationship between the MPP and the open-circuit voltage or short-circuit current, can be used to estimate the MPP [37].
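A minimal Python sketch of the perturb & observe idea is given below, under the assumption of a fixed perturbation step and a noiseless power reading. The function `toy_pv_power` is a hypothetical PV curve expressed in terms of $I_{pv}$, and its constants are invented for illustration. The hill-climbing loop takes the measured quantity as an argument, so the same loop could track $P_{charge}$ instead of $P_{pv}$.

```python
import math

# Hypothetical PV module: short-circuit current (A), open-circuit voltage (V), curve knee (V).
I_SC, V_OC, KNEE = 1.0, 5.0, 0.4


def toy_pv_power(i_pv):
    """Hypothetical P_pv as a function of the current drawn from the module."""
    if i_pv <= 0.0 or i_pv >= I_SC:
        return 0.0
    v = V_OC + KNEE * math.log(1.0 - i_pv / I_SC)
    return max(0.0, v) * i_pv


def perturb_and_observe(power_at, x0, step, iterations):
    """Keep perturbing in the same direction while power rises; reverse when it falls."""
    x = x0
    p_prev = power_at(x)
    direction = 1.0
    for _ in range(iterations):
        x += direction * step
        p = power_at(x)
        if p < p_prev:
            direction = -direction
        p_prev = p
    return x
```

Starting from 0.1 A with a 0.01 A step, the loop climbs to the neighborhood of the maximum of the toy curve (about 0.91 A) and then oscillates around it by a step or two, which is the characteristic steady-state behavior of P&O.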
Maximum Power Transfer Tracking
The maximum power transfer tracking (MPTT) [10] keeps tracking the operating point $(V_{pv}, I_{pv})$, which may be slightly different from that of conventional MPPT, to guarantee the maximum amount of power to the load at all times rather than the maximum power from the PV module. MPTT sets the operating point (voltage and current) of the PV module to where the output power of the charger is maximized, not to where the output power of the PV module is maximized. Because it maximizes $P_{charge}$ directly, MPTT delivers at least as much net power to the load as MPPT, regardless of environmental change. More importantly, $P_{charge}$ at the maximum power transfer (MPT) point is determined not only by $G$, but also by the current value of $V_{cap}$.

Figure 14 (a) uses the $G$-$V_{cap}$ domain. We may draw a trace of $(V_{cap}(t), G(t))$ pairs when $C$ is 300 F, 3,000 F or 30,000 F. For illustration purposes, we set $G(t_{noon}) = 900$ W/m$^2$. Initially, $V_{cap}(t_{sunrise}) = 0$ V. Figure 14 (b) shows the supercapacitor's energy, $E_{cap}$, as $t$ elapses. The sampling points in Fig. 14 (a) and (b) correspond to each other in terms of $t$. These figures show that choosing an appropriate $C$ is critical for efficient energy harvesting. However, this problem has not been considered before, because previous MPPT methods do not take into account the energy efficiency of the subsequent charger stage.

A naive brute-force method to find the optimal design is to first obtain $P_{charge}$ surfaces for all $n \times m$ PV array configurations, and then evaluate $E_{cap}$ for all $C$. However, obtaining the $P_{charge}$ surfaces as in Fig. 15 is very time consuming because of the numerical iterations needed to reach the convergence point of the PV and charger models. Based on the observation that a switching converter exhibits a higher efficiency when its input and output voltages are similar, we develop a systematic algorithm that efficiently derives near-optimal values of $n$, $m$, and $C$ when $E_{req}$, $V_{rating}$, and $G$ are given.
The objective of this algorithm is to derive the minimum $n \times m$ and the optimal $C$. Table 4 shows the energy efficiency of the suggested MPTT method compared with the conventional MPPT method. $E_{cap}$ is the harvested energy at the end of the day when operated by the MPPT and MPTT methods. The capacitance $C$ of the MPTT case is the theoretical optimum for each given $n \times m$. MPTT harvests more than 6x the energy of a poorly configured conventional MPPT. This is because MPTT finds the true optimal tracking point considering the charger loss, while conventional MPPT draws more power from the PV array but loses even more power in the charger. Table 5 compares the designs derived by the suggested algorithm with the optimal designs found by exhaustive search for various $E_{req}$ and $V_{rating}$ values. In all cases, the negative error is less than 2%, which is quite reasonable in light of the typical device tolerances in commercial circuits.
Energy Harvesting Improvement
Our proposed MPTT framework derives the cost-optimal solar cell array configuration and the optimal supercapacitor bank capacitance, and keeps track of the true power-optimal points, which differ from those of conventional MPPT. This framework enhances the overall system efficiency by 3% up to more than 6 times. This energy-optimal design and operation contribute significantly to the use of renewable energy sources for green computing. Future work includes a design improvement based on battery-supercapacitor hybrid energy storage, as well as load scheduling.
Conclusion
This tutorial paper introduced the importance of power converter-aware design in various aspects. The importance of power conversion efficiency in low-power design has been widely underestimated or incorrectly evaluated in spite of a significant amount of work in low-power research. The conversion efficiency of power converters is not constant; it changes according to the input voltage, output voltage, and load current, unlike the constant-efficiency assumptions in existing work. We first showed the basics of DC-DC converters and provided a concise efficiency model. We then performed an accurate evaluation of the overhead of the most widely known low-power technique, dynamic voltage and frequency scaling (DVFS), with regard to the losses of DC-DC converters. We introduced DC-DC converter-aware DVS, passive voltage scaling (PVS), and maximum power transfer tracking (MPTT) as converter-aware design practices. These design practices show that the power conversion efficiency has a significant impact on the energy efficiency of a digital system, and that it should be carefully considered throughout the system design process.
