Abstract-A thermoelectric energy harvesting interface based on a single-inductor dual-output (SIDO) boost converter is presented. A system-level design methodology combined with ultra-low-power circuit techniques reduces the power consumption and minimizes the losses within the converter. Additionally, accurate zero-current switching (ZCS) and zero-voltage switching (ZVS) techniques are employed in the control circuit to ensure high conversion efficiency at µW input power levels. The proposed SIDO boost converter is implemented in a 0.18 µm CMOS process and can operate from input voltages as low as 15 mV. The measurement results show that the converter achieves a peak conversion efficiency of 86.6% at 30 µW input power.
Index Terms-Boost converter, dead time, energy harvesting, low-power design, single-inductor dual-output, zero-current switching, zero-voltage switching.
I. INTRODUCTION
EXTENSIVE investigation of potential harvesting solutions is driven by the increasing need for self-powered, long-lifetime, and small-size wireless sensor systems. For instance, wearable and implantable biomedical systems could particularly benefit from autonomous, miniature, biocompatible and eco-friendly harvesting devices. However, for most biomedical systems, harvesting solutions remain challenging to implement because of their constrained size and their excessive power consumption in comparison to the available energy extracted from the environment. Recent micro-/nanotechnology advances have pushed the power consumption of biomedical systems to extremely low values. At the same time, the power density of miniature harvesters is constantly improving, making energy harvesting a feasible powering solution.
Among harvesting sources that are suitable for biomedical systems [1], [2], the latest thermoelectric generators (TEGs) provide the highest power densities at miniature scales (more than 100 μW/cm³ [3]). However, in wearable, and especially in implantable, biomedical applications, the expected temperature differences across the plates of a TEG are very low, ranging from 0.5 K to 3 K [4], [5]. At these temperature differences, the TEG [3] can provide voltages of 15 mV-90 mV and power levels of 1 μW-40 μW to a matched load. Therefore, a customized boost converter is necessary to efficiently up-convert such low voltages to the levels required by a biosensor system. Extremely low input voltages combined with high conversion ratios and ultra-low power levels set very challenging specifications for the boost converter. Recent literature demonstrates various low-input-voltage boost converters for thermoelectric energy harvesting [6]-[10]. Different approaches have been proposed to overcome the challenges and improve the conversion efficiency. For instance, a low-power control method has been proposed in [6] to facilitate the up-conversion of very low input voltages. A maximum power extraction (MPE) technique has been introduced in [7] to maximize the end-to-end transfer of the extracted energy. In [8], a fully electrical start-up circuit has been demonstrated. High conversion efficiencies (above 80%) are achieved in [9] and [10] due to improved control schemes. However, all these solutions obtain high efficiencies at relatively high input power levels (from hundreds of μW to mW). As the input power decreases, the efficiencies degrade and become insufficient at μW power levels. This is because the power consumption of the control circuit and the losses within the converter become comparable to the input power.
To overcome these challenges, this paper proposes a combination of methods and circuit techniques to reduce the power consumption and further suppress the losses within the converter. First, loss expressions are used to find the optimal switching frequency, inductor value and switch sizes of the converter. Next, a dual-output architecture is employed to reduce the power consumption of the control circuit to the nW level. Finally, accurate zero-current and zero-voltage switching techniques mitigate the losses related to the accuracy of the control. This paper is organized as follows. Section II introduces the system architecture and its advantages. Section III presents a thorough loss analysis and a design methodology for achieving high efficiency. Section IV focuses on detailed circuit implementations of individual building blocks. Section V highlights the measurement results. Finally, concluding remarks are given in Section VI.
0018-9200 © 2016 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
II. SYSTEM ARCHITECTURE
Fig. 1 shows the proposed single-inductor dual-output (SIDO) architecture of the boost converter. It consists of a DC-DC core circuit and a control block. The control block includes a switch control, driver circuit, supply multiplexer, voltage divider, voltage monitor circuit, clock generator, and reference circuits. To achieve a high conversion efficiency at μW input power levels, besides reducing the losses within the converter, the power consumption of the control circuit has to be minimized relative to the input power. Considering that the control circuit is almost entirely digital [6]-[8], the most effective way to reduce its power consumption is to reduce the supply voltage [11]. The control circuit is usually powered from the output of the converter [6], [8]-[10], and since the converter is driving a biosensor system, its output voltage is defined by the requirements of the biosensor rather than optimized for low power consumption of the control circuit. Instead, in this work a separate output is added to the converter and used to power the control circuit. The additional output is beneficial for multiple reasons. First, since it is independent of the requirements of the biosensor system, it can be optimized for low power consumption of the control circuit. Second, to start up the converter, instead of charging a large storage capacitor, C ST, a much smaller second output capacitor, C CTRL, is charged. Finally, the second output can also be used as a second voltage supply for the biosensor system. The boost converter transfers the energy extracted from the harvester to only one of its two outputs at a time. The main output, V ST, is used to power the load (duty-cycled biosensor) and the additional output, V CTRL, powers the internal circuits (control block) of the converter. The timing diagrams of the output voltages are also shown in Fig. 1. The output V CTRL is regulated to around 1 V. The control block ensures that the energy is forwarded to the capacitor C CTRL whenever V CTRL < 1 V is detected.
When the second output is not used, C CTRL is charged roughly once in every 80 cycles in typical conditions (input power of approximately 17 μW). Most of the time, the harvested energy is transferred to V ST and accumulated in the storage capacitor C ST. When sufficient energy is accumulated (V ST = 2 V), the signal V ST_OK (active high) is triggered to enable the load. The load stays enabled until C ST is discharged to 1.8 V, the point at which V ST_OK becomes inactive again. Therefore, the load is enabled asynchronously to accommodate unpredictable input power levels.
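The output arbitration and the load-enable hysteresis described above can be summarized in a short behavioral sketch. It is only an illustration of the decision logic, not the circuit itself: the function and class names are hypothetical, and the thresholds are the nominal values quoted in the text (1 V for the control supply, 2 V and 1.8 V for the storage output).

```python
from dataclasses import dataclass

V_CTRL_MIN = 1.0  # V, regulation threshold of the control supply (from the text)
V_ST_ON = 2.0     # V, storage voltage at which V_ST_OK goes active (from the text)
V_ST_OFF = 1.8    # V, storage voltage at which V_ST_OK goes inactive (from the text)


def select_output(v_ctrl: float) -> str:
    """Pick the output that receives the next energy packet (hypothetical helper)."""
    return "CTRL" if v_ctrl < V_CTRL_MIN else "ST"


@dataclass
class LoadEnable:
    """Hysteretic V_ST_OK flag: set at V_ST_ON, cleared at V_ST_OFF."""
    ok: bool = False

    def update(self, v_st: float) -> bool:
        if not self.ok and v_st >= V_ST_ON:
            self.ok = True      # enough energy stored, enable the load
        elif self.ok and v_st <= V_ST_OFF:
            self.ok = False     # storage capacitor discharged, disable the load
        return self.ok
```

Because the flag depends only on the stored voltage, the load duty cycle follows the available input power without any timing reference, which is the asynchronous behavior the text describes.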
III. DESIGN METHODOLOGY
The main goal of the boost converter is to provide as much power as possible to the load (at the appropriate voltage level). In order to do so, it first has to extract the maximum possible power from the TEG, then efficiently transfer that power to the load and, at the same time, convert the voltage to a higher level. The proposed methodology aims to maximize the efficiency by jointly optimizing the total losses within the converter and the power consumption of its control circuit.
The MPE is achieved by matching the equivalent input resistance of the converter, R IN, to the internal resistance of the TEG, R TEG. Assuming that V ST ≫ V IN, R IN can be expressed as [7]:
$$R_{IN} \approx \frac{2L}{\tau_N^{2} f_s} \tag{1}$$
where L is the inductor value, τ N is the duration of the nMOS switch ON time and f s is the switching frequency of the converter.
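As a numerical illustration, the sketch below solves the input-resistance relation for the nMOS ON time. It assumes that (1) has the standard discontinuous-conduction form R_IN ≈ 2L/(τ_N² f_s), which is an assumption here, and it uses the operating point reported later in the paper (R IN = 210 Ω, L = 33 μH, f s = 15 kHz). The function name is hypothetical.

```python
import math


def nmos_on_time(r_in: float, inductance: float, f_s: float) -> float:
    """Solve R_IN = 2L / (tau_N^2 * f_s) for tau_N (assumed form of Eq. (1))."""
    return math.sqrt(2.0 * inductance / (r_in * f_s))


R_IN = 210.0   # ohm, matched to the TEG internal resistance
L_IND = 33e-6  # H
F_S = 15e3     # Hz

tau_n = nmos_on_time(R_IN, L_IND, F_S)  # seconds
duty = tau_n * F_S                      # D = tau_N * f_s

print(f"tau_N = {tau_n * 1e6:.2f} us, D = {duty * 100:.1f} %")
```

Under this assumption the result lands close to the 4.6 μs ON time and roughly 7% duty cycle quoted in Section IV.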
The conversion efficiency of the boost converter is defined as:
$$\eta = \frac{P_{out}}{P_{in}} = \frac{P_{in} - P_{loss} - P_{ctrl}}{P_{in}} \tag{2}$$
where P in and P out are the input and output power of the converter, respectively, P loss represents the total losses within the converter and P ctrl is the power consumption of the control block. The total losses within the boost converter are the sum of conduction, switching, synchronization and leakage losses. The synchronization losses arise from inaccurate turn-ON and turn-OFF timings of the pMOS switches (in Fig. 1), where in both cases the body diode or the reverse conduction adds losses to the conduction losses already present [6], [12]. The leakage losses originate from the subthreshold leakage currents through the switches and may become significant when very low input power levels are targeted [13].
A. Losses Analysis
The two pMOS switches of the boost converter are identically sized, and since only one is working at a time, in terms of the conduction losses this is approximately equivalent to having one output that turns on in every cycle. A similar observation can be made regarding the switching losses. However, the additional switch, M PC, introduces more parasitic capacitance at node X, increasing the switching losses related to this node. The leakage loss of M PC is negligible compared to the leakage loss of the main switch, M P. This is because in the off-state, the drain voltage of M PC is roughly half that of M P while its gate voltage is the same as that of M P, so the drain-induced barrier lowering (DIBL) effect is much less pronounced [14]. For these reasons and for the sake of simplicity, the loss analysis is done considering that the converter has a single output, but the total capacitance at node X is estimated considering both pMOS switches.
1) Conduction Losses:
The conduction losses originate from the nMOS and pMOS switch ON resistances (R N and R P, respectively), the inductor's equivalent series resistance (R L) and the total parasitic series resistance (R par). The resistance R par includes all parasitic resistances in the power path, such as resistances from PCB tracks, bondwires and others. The inductor core losses are neglected. Assuming that V ST ≫ V IN, the inductor's volt-second balance can be approximated as V IN · τ N ≈ V ST · τ P, indicating that τ N ≫ τ P. Therefore, the conduction losses can be expressed as:
where τ N is extracted from (1) and
2) Switching Losses: The switching losses arise from driving the gate capacitances of the nMOS and pMOS switches, C N and C P , respectively, and the parasitic capacitance at node X, C X [13] . The switching losses can be approximated as:
where k is the power consumption factor of a driver circuit. The driver circuit usually consists of several buffer stages, so besides the power required to charge and discharge the gate capacitance, additional power is consumed in the driver circuit itself. For this reason, the factor k is greater than one. In this work, the driver circuit is implemented as a 4-stage tapered buffer, which gives k ≈ 2.
3) Synchronization Losses: These losses can be completely eliminated if ON and OFF timings of the pMOS switches are perfect. However, in reality small timing errors are inevitable.
Turn-off timing losses are defined by the duration of the pMOS switch ON time, τ P. The duration of τ P is accurate if the pMOS switch is turned off when the inductor current reaches zero. As a result, almost lossless zero-current switching (ZCS) is achieved. When τ P is short, the switch is turned off early, and the inductor current flows through the body diode of the transistor and introduces additional losses. When τ P is long, the pMOS switch is turned off late. In this situation, the inductor current changes polarity and discharges the output capacitor, introducing relatively high losses. The losses introduced when τ P is short (P sync,s) are much lower than when τ P is long (P sync,l) for the same timing error, t err. This is evident from P sync,s/P sync,l = 1 − (t os/t err), which can be extracted from [6], where t os is the overshoot time (usually very close to t err). The fact that P sync,s is much lower than P sync,l for the same t err is exploited in the design of the control block to further suppress the synchronization losses.
Turn-on timing losses of the pMOS switch are defined by the dead time, t dead. During t dead both switches are OFF and the inductor current is charging C X. The pMOS switch should turn on when V X reaches the output voltage, as shown in Fig. 2(a). In this case, almost lossless ZVS is achieved. If t dead is short, the switch is turned on early, as illustrated in Fig. 2(b). Since V X is still lower than the output voltage, C X is charged from the output capacitor. The reverse current flow introduces significant losses. When t dead is long, the switch is turned on late, as shown in Fig. 2(c). As a result, V X exceeds the output voltage and, eventually, the body diode of the transistor becomes forward-biased, introducing additional losses. The dead time has to be adaptive to successfully eliminate the dead time losses. Otherwise, the efficiency can be degraded by more than 2% [15]. Nevertheless, typical designs consider only a fixed dead time. In this work, an adaptive dead time generator is incorporated in the control circuit to suppress the dead time losses and achieve ZVS.
4) Losses From Parasitics:
The capacitance C X and the resistance R par are the major parasitic contributors to the total losses of the boost converter. They include parasitics from the PCB tracks, package, bondwires, PADs, ESD protection, on-chip interconnections and switches of the converter. In μW boost converters, all these parasitics have to be carefully addressed. Otherwise, they can significantly reduce the overall efficiency of the converter. An additional problem is that most of the techniques used to reduce C X increase R par, and vice versa. Both C X and R par can be reduced by using a miniature, no-lead chip package, such as a quad-flat no-lead (QFN) package or similar. Besides removing parasitics from the chip leads, the miniature package also reduces the length and, consequently, the resistance of the bondwires. The use of multiple bondwires/PADs for node X and the ground connections also reduces R par, but at the same time increases C X. In this design, the optimal numbers of bondwires/PADs for these nodes are determined empirically. In addition, to further reduce the parasitic resistance, the on-chip interconnections for the power path are implemented using networks of multiple metal layers. Finally, the achieved values of C X and R par are estimated at 10 pF and 0.15 Ω, respectively.
B. Efficiency Optimization
To achieve a high efficiency, the sum of P loss and P ctrl has to be minimized, as can be seen from (2). In [13], the optimal values of the switching frequency and the power transistors' widths that minimize the losses are derived for given R IN, V IN, V ST and L. In this work, the optimization is extended to the power consumption of the control circuit. This is possible because the control is mostly digital, so its power consumption is frequency dependent [11]. In addition, the optimal inductor value is also obtained, contrary to the common practice of intuitively selecting the inductor value as a trade-off between efficiency and form factor [6]-[8], [13]. Equation (3) indicates that a higher inductance reduces the conduction losses. However, for inductors limited by form factor, the equivalent series resistance (ESR) of the inductor increases proportionally with the inductance. Fig. 3 shows the ESR versus inductance for miniature inductors, family 744043 [16]. Therefore, the ESR value (R L) can be roughly approximated as p 1 · L, where p 1 is an inductor-family-dependent constant coefficient. Consequently, the overall contribution of R L to the conduction losses increases (it is proportional to √L) when a higher inductance is used. In fact, there is an optimal inductance for which the losses are minimized.
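The existence of an optimal inductance can be illustrated with a first-order model. The sketch below is not the paper's loss expression: it assumes a triangular inductor current, counts only the nMOS conduction phase (reasonable since τ N ≫ τ P), assumes the form R_IN ≈ 2L/(τ_N² f_s) for the input resistance, and lumps all inductance-independent resistance into a single hypothetical value `r_fixed`. The ESR slope is the 10 mΩ/μH quoted in the paper.

```python
import math


def conduction_loss(inductance, v_in=60e-3, r_in=210.0, f_s=15e3,
                    r_fixed=0.3, p1=1e4):
    """First-order conduction loss in watts during the nMOS ON phase.

    r_fixed: hypothetical lumped switch + parasitic resistance (ohm).
    p1: ESR slope in ohm per henry (1e4 ohm/H equals 10 mOhm/uH).
    """
    tau_n = math.sqrt(2.0 * inductance / (r_in * f_s))  # keeps R_IN matched
    i_pk = v_in * tau_n / inductance                    # peak inductor current
    i_sq_avg = i_pk ** 2 * tau_n * f_s / 3.0            # mean-square of a triangular pulse
    return (r_fixed + p1 * inductance) * i_sq_avg


# Sweep 1 uH .. 100 uH in 1 uH steps and keep the lowest-loss candidate.
candidates = [k * 1e-6 for k in range(1, 101)]
best_l = min(candidates, key=conduction_loss)
print(f"lowest-loss inductance in this model: {best_l * 1e6:.0f} uH")
```

In this simplified model the fixed-resistance term falls as 1/√L while the ESR term grows as √L, so the minimum sits at L = r_fixed/p1. The paper's optimum additionally accounts for switching, leakage and control power, so the two values need not coincide.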
Assuming that the control circuitry is successfully suppressing the synchronization losses, by combining (3) and (4), the sum of P loss and P ctrl can be expressed as:
R N and R P are the resistances per unit width, C N and C P are the gate capacitances per unit width, I leak,N and I leak,P are the leakage currents per unit width of the nMOS and pMOS transistors, W N and W P are the transistors' widths and C c,eff is the total effective capacitance of the control circuitry [11]. Optimal values of W N, W P, f s and L can be obtained, for which the sum of P loss and P ctrl is minimal for given R IN, V IN, V ST and V CTRL. Fig. 4 shows the sum of P loss and P ctrl versus the individual variables (W N, W P, f s and L) while the other variables are set to their optimal values obtained from (5). For instance, Fig. 4(b) shows how this sum depends on the switching frequency when W N, W P and L are optimal. The arrows in Fig. 4 indicate the direction in which particular types of losses increase. The input voltage is set to 60 mV, which is expected in typical conditions. The input resistance is assumed to be matched to the internal resistance of the TEG and equal to 210 Ω [3]. The voltage V ST is set to its mean value of 1.9 V and V CTRL is set to 1 V. The capacitance C X and the resistance R par are estimated at 10 pF and 0.15 Ω, respectively. The coefficient p 1 is fixed to 10 mΩ/μH, which corresponds to the inductor family [16] after accounting for tolerance. The capacitance C c,eff is estimated at around 5 pF in this particular case.
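The frequency-dependent part of the control power can be estimated from the values listed above. The sketch assumes the usual dynamic-power relation P = C·V²·f for the effective control capacitance and a matched input (P in = V IN²/R IN); both are standard approximations rather than expressions taken from the paper, and the function name is hypothetical.

```python
def dynamic_power(c_eff: float, v_supply: float, f_s: float) -> float:
    """Dynamic power of a digital block, assuming P = C_eff * V^2 * f."""
    return c_eff * v_supply ** 2 * f_s


C_C_EFF = 5e-12  # F, effective control capacitance (from the text)
V_CTRL = 1.0     # V, control supply (from the text)
F_S = 15e3       # Hz, switching frequency reported in Section IV
V_IN = 60e-3     # V, typical input voltage (from the text)
R_IN = 210.0     # ohm, matched input resistance (from the text)

p_dyn = dynamic_power(C_C_EFF, V_CTRL, F_S)  # W
p_in = V_IN ** 2 / R_IN                      # W, power drawn from the TEG
share = p_dyn / p_in

print(f"P_dyn = {p_dyn * 1e9:.0f} nW, P_in = {p_in * 1e6:.1f} uW, share = {share * 100:.2f} %")
```

The estimate is of the same order as the dynamic share implied by the control power reported in Section IV (160 nW in total, of which 80 nW is static), and it stays below one percent of the typical input power.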
IV. CIRCUIT IMPLEMENTATION
The design methodology presented in Section III is complemented by ultra-low-power circuit techniques in order to obtain high efficiency. The power dissipation of digital circuits is reduced by using a low supply voltage. The short-circuit currents are additionally reduced by making the rise/fall times at the input and the output of the gates similar [11]. The analog blocks are either low-frequency dynamic circuits (voltage divider and comparators) or operate in the weak inversion region (voltage and current references), so their power dissipation is also very low. As a result, the complete control circuit consumes 160 nW, of which the static power accounts for only 80 nW.
A. Current and Voltage References
Fig. 5 shows the current and voltage references. The power consumption of both circuits is at nW levels. The current reference is independent of the supply voltage, operates in the weak inversion region and generates a PTAT biasing current. The resistance R s and the transistors M 1 and M 2 are sized so that the biasing current is equal to 10 nA. According to the Monte Carlo simulation (process and mismatch variations, 1000 runs), the average value of the biasing current is 10.26 nA and the standard deviation is 0.98 nA. The start-up circuit does not consume any static power [17]. The voltage reference, proposed in [18], is adopted since it provides very good performance at low power levels. The power consumption is reduced
B. Driving the nMOS Switch
The switching frequency and the value of the inductor are set to minimize the losses within the converter, as shown in Fig. 4. Therefore, τ N and the duty cycle of the boost converter, D = τ N f s, are determined by (1). For R IN = 210 Ω and the optimal values of f s and L (15 kHz and 33 μH, respectively), the calculated τ N is 4.6 μs and D is approximately 7%. The signal for driving the nMOS switch is obtained directly from the clock generator. Fig. 6 shows the clock generator circuit, which consists of a current-starved ring oscillator and
C. Zero-Current Switching
The ZCS is achieved when τ P is accurate. In previous works [6]-[8], the value of τ P is adjusted after every switching cycle according to the 1-bit information obtained from sensing the voltage V X. Over time, τ P approaches the accurate value, but when it gets very close, it fluctuates around that value. Consequently, τ P is always slightly shorter or slightly longer, and introduces synchronization losses as already explained. The timing error t err depends on the unit delay in the one-shot pulse generator (the resolution of τ P). A slightly shorter τ P is preferred in terms of synchronization losses. This result is utilized in this work, so τ P is kept at the value that is closest to, but slightly shorter than, the accurate one [19]. Equivalently, the ZCS of the additional pMOS switch is achieved when its ON time, τ PC, is accurate. Considering the inductor's volt-second balance in both cases
For the targeted input voltage, τ P varies from 36 ns to 218 ns in steps of approximately 12 ns and τ PC from 70 ns to 414 ns in steps of approximately 23 ns. As a result, the switch control circuit can share the counter and the sensing circuitry for both switches, but separate one-shot signal generators are required. The proposed implementation of the ZCS scheme is shown in Fig. 7. The signal V CTRL_OK is active as long as V CTRL ≥ 1 V, in which case the transistor M P (in Fig. 1) is operating. Otherwise, the energy is forwarded to the additional output through M PC. The voltage V X is sensed two times per cycle. The sensing is triggered by the delayed rising edges of one of the signals q P or q PC. The sensed 2-bit digital signal (b 1 b 0) provides the information whether τ P and τ PC are long
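The quoted ranges of the two pMOS ON times can be checked against the volt-second balance stated in Section III (V IN · τ N ≈ V ST · τ P) and its counterpart for the control output. This is a sketch under stated assumptions: the same balance is assumed to hold with V CTRL in place of V ST, and the nominal values τ N = 4.6 μs, V ST = 1.9 V and V CTRL = 1 V are taken from the text. The function name is hypothetical.

```python
TAU_N = 4.6e-6  # s, nMOS ON time (from the text)
V_ST = 1.9      # V, mean storage output voltage (from the text)
V_CTRL = 1.0    # V, control supply (from the text)


def pmos_on_time(v_in: float, v_out: float, tau_n: float = TAU_N) -> float:
    """Volt-second balance for V_out >> V_in: V_in * tau_N = V_out * tau_P."""
    return v_in * tau_n / v_out


for v_in in (15e-3, 90e-3):
    tau_p = pmos_on_time(v_in, V_ST)
    tau_pc = pmos_on_time(v_in, V_CTRL)
    print(f"V_IN = {v_in * 1e3:.0f} mV: tau_P = {tau_p * 1e9:.0f} ns, "
          f"tau_PC = {tau_pc * 1e9:.0f} ns")
```

Under these assumptions the two ON times differ only by the fixed ratio V ST/V CTRL, so both ranges span about the same number of steps at their respective resolutions. That observation is consistent with sharing one counter between the two switches.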
D. Zero-Voltage Switching
The dead time for which ZVS is achieved can be approximated as [15] :
The dead time is introduced as a delay of the falling edge of the signal q N, which triggers the one-shot signal generator, as shown in Fig. 7. According to (6), the same counter that is used to define τ P can be used to control the dead time.
The additional output of the boost converter requires a separate dead time circuit. The proposed implementation of the adaptive dead time circuit is shown in Fig. 8. The circuit operates as a digitally programmable delay element controlled by the decoded counter value. A 2-bit signal d 1 d 0 adjusts the dead time for variability of C X. The signal d 1 d 0 is also used to prolong the dead time during measurements at node X to compensate for the capacitance added at this node by an oscilloscope probe. Fig. 9 shows the typical dead time introduced by the circuit and the accurate dead time required for perfect ZVS. As in the ZCS scheme, the introduced dead time is slightly longer than the accurate dead time to avoid lossy reverse conduction of the pMOS switch.
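An order-of-magnitude estimate of the required dead time follows from a simple charge-balance argument. The sketch assumes that the inductor current stays at its peak value during the dead time and charges C X from approximately 0 V to the output voltage; this is a first-order assumption and not necessarily the expression given in (6). All numerical values are the nominal ones from the text, and the function name is hypothetical.

```python
def dead_time(c_x: float, v_out: float, v_in: float,
              tau_n: float, inductance: float) -> float:
    """Time for a constant peak inductor current to charge C_X up to v_out."""
    i_pk = v_in * tau_n / inductance  # peak inductor current at the end of tau_N
    return c_x * v_out / i_pk


C_X = 10e-12    # F, estimated node-X capacitance (from the text)
L_IND = 33e-6   # H
TAU_N = 4.6e-6  # s
V_IN = 60e-3    # V, typical input voltage

t_dead_st = dead_time(C_X, 1.9, V_IN, TAU_N, L_IND)    # main output, V_ST
t_dead_ctrl = dead_time(C_X, 1.0, V_IN, TAU_N, L_IND)  # control output, V_CTRL

print(f"t_dead(ST) = {t_dead_st * 1e9:.2f} ns, t_dead(CTRL) = {t_dead_ctrl * 1e9:.2f} ns")
```

Under this assumption the dead time equals L·C X/τ P, so it is fully determined by the same quantity that sets the pMOS ON time. A larger C X, for example with an oscilloscope probe attached, lengthens the required dead time proportionally.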
E. Voltage Divider, Voltage Monitor and Supply Multiplexer Circuits
A conventional resistive voltage divider requires a very high total resistance so that its power consumption is negligible compared to other losses within the converter. For instance, a 10 MΩ voltage divider dissipates around 0.36 μW from the output and reduces the efficiency of the converter by approximately 2% in the typical case. Further increasing the
The smaller of the two output capacitors, C CTRL, is charged to start up the boost converter. This means that in the initial phase of the converter's operation, V CTRL is higher than V ST, until enough energy is accumulated in C ST so that V ST surpasses V CTRL. For proper operation of the converter and to prevent conduction of the body diodes, the driver circuits and the bulk terminals of the pMOS switches of the converter should be powered from the greater of V CTRL and V ST at any time. The supply MUX circuit compares V CTRL and V ST and connects the greater of the two voltages, V MAX, to the corresponding circuits [20]. The supply MUX circuit is shown in Fig. 11. A track-and-latch dynamic comparator compares the supply voltages and controls the switches accordingly. The outputs of the comparator are buffered to drive the switches M 1 and M 2. The level of the negative output is adjusted so that the switch M 2 is properly turned off. Unlike the supply MUX in [20], this solution does not consume any static power.
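The resistive-divider figures quoted at the start of this subsection can be reproduced with a short calculation. The sketch assumes the divider is connected across the mean storage voltage of 1.9 V and that the typical case corresponds to a matched input at 60 mV and 210 Ω; the function name is hypothetical.

```python
def divider_power(v_out: float, r_total: float) -> float:
    """Static power drawn by a resistive divider of total resistance r_total."""
    return v_out ** 2 / r_total


V_ST = 1.9    # V, mean storage output voltage (from the text)
R_DIV = 10e6  # ohm, divider resistance used as the example in the text
V_IN = 60e-3  # V, typical input voltage
R_IN = 210.0  # ohm, matched input resistance

p_div = divider_power(V_ST, R_DIV)  # W
p_in = V_IN ** 2 / R_IN             # W, typical input power
fraction = p_div / p_in             # efficiency penalty

print(f"P_div = {p_div * 1e6:.2f} uW, penalty = {fraction * 100:.1f} %")
```

A penalty of about two percentage points is large compared with the total control power reported in Section IV, which illustrates why a conventional divider is a poor fit at this power level.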
V. MEASUREMENT RESULTS
The proposed SIDO boost converter was implemented in a 0.18 μm CMOS process. Fig. 12 shows the die photo of the test chip and the evaluation board with the TEG connected. The total chip area is 2.3 mm², of which the active area occupies around 0.3 mm². The evaluation board was designed so that the parasitics added to the power path are minimized. The off-chip components include a 33 μH inductor L (4.8 mm × 4.8 mm × 2.8 mm, number 744043330 [16]), a 1 μF input capacitor C IN, a 100 nF storage capacitor C ST and a 30 nF control supply capacitor C CTRL. The load (duty-cycled sensor) is modeled as a resistor in series with a switch, which is enabled by the V ST_OK signal from the chip. To start up the boost converter, C CTRL was precharged to 1 V. The chip was evaluated using two tests. In the first test, the equivalent circuit of the TEG (in Fig. 1) was used to fully characterize the converter. In the second test, the chip was evaluated with a commercially available packaged TEG from Micropelt (model TGP-651 in Fig. 12) [3]. The TEG has a typical internal resistance of 210 Ω and a Seebeck voltage of 60 mV/K.
The voltage of the equivalent circuit, V TEG, was initially set to 100 mV. Fig. 13 shows the measured input voltage of the converter. Since the average value of V IN is approximately V TEG/2, the MPE is achieved (R IN = R TEG). on C ST. When V ST reaches 1.99 V, the signal V ST_OK becomes active and enables the load (a 48.7 kΩ resistor). The load is powered until V ST drops to 1.83 V. Fig. 15 shows the measured voltage at node X, V X. The measured f s is 15.2 kHz, τ N is 4.68 μs and D is around 7.1%. The dead time is slightly longer than the accurate value, since V X surpasses V ST and turns on the body diode for a few nanoseconds. The duration of τ P is slightly shorter than, but very close to, the accurate value. At the moment when M P turns off, the inductor current I L is still flowing through the switch, but it is so close to zero that it cannot completely turn on the body diode. Therefore, both ZCS and ZVS are achieved. While measuring V X, the oscilloscope probe adds capacitance to node X. To account for this capacitance, the dead time is intentionally prolonged according to (6) by using the signal d 1 d 0 (in Fig. 8). To evaluate the conversion efficiency, V TEG was varied from 30 mV to 180 mV. As a result, V IN changes between 15 mV and 90 mV. The measured efficiency (given by (2), where P out represents the power delivered to the load) versus the input voltage is shown in Fig. 16. This figure also shows the theoretical efficiency, obtained from (2) and (5). The small deviation of the measured efficiency from the theoretical one is due to process variations, tolerances and the approximations used. The efficiency is relatively flat for input voltages higher than 45 mV. For these values, the sum of P loss and P ctrl in (5) is dominated by the conduction losses, which scale down with the input voltage, keeping the efficiency almost constant. The efficiency starts to roll off when the input voltage drops below 45 mV.
From that point, the switching losses dominate and reduce the efficiency as the input voltage decreases further. The boost converter is still operating when the input voltage is below 20 mV. However, in that case all of the energy is transferred to the additional output to power the control circuit, meaning that the power delivered to the load is zero, and so is the efficiency. When the input voltage drops below 15 mV, the converter shuts down and requires a new start-up procedure.
The chip was also tested with the packaged thermoelectric device [3]. When the hot plate of the device was pressed against the skin on the wrist while the cold plate was connected to a heat sink, to maintain the temperature of the surrounding air, the harvester generated a voltage of around 150 mV, which corresponds to a 2.5 K temperature difference between the plates. In this case, the converter extracted 27 μW from the TEG and delivered around 23 μW to the load. Table I compares this work with the state-of-the-art low-power boost converters [6]-[10], [13], [21]. It can be seen that the proposed SIDO boost converter provides the highest conversion efficiency and is one of the lowest-power designs among the CMOS μW power solutions.
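The figures reported for the on-body test can be cross-checked with a matched-load calculation. The sketch assumes a perfectly matched input (V IN = V TEG/2 and R IN = R TEG), the typical TEG parameters quoted above (210 Ω, 60 mV/K), and the measured input and output powers; the function name is hypothetical.

```python
SEEBECK = 60e-3  # V/K, typical Seebeck voltage of the TEG (from the text)
R_TEG = 210.0    # ohm, typical internal resistance (from the text)


def matched_input_power(v_teg: float, r_teg: float = R_TEG) -> float:
    """Power into a matched load: V_IN = V_TEG / 2 and P_in = V_IN^2 / R_TEG."""
    v_in = v_teg / 2.0
    return v_in ** 2 / r_teg


V_TEG_MEAS = 150e-3  # V, open-circuit voltage generated on the wrist

delta_t = V_TEG_MEAS / SEEBECK            # K, temperature difference
p_in = matched_input_power(V_TEG_MEAS)    # W, extracted power if matched
eta = 23e-6 / 27e-6                       # measured P_out / P_in

print(f"dT = {delta_t:.1f} K, P_in = {p_in * 1e6:.1f} uW, efficiency = {eta * 100:.1f} %")
```

The same helper applied to the ends of the characterization sweep (V TEG of 30 mV and 180 mV) gives roughly 1 μW and 39 μW, in line with the input power range stated in the paper.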
VI. CONCLUSION
This paper has presented a high-efficiency SIDO boost converter for micropower thermoelectric energy harvesting. The dual-output architecture has enabled a significant reduction of the control circuit's power consumption. A theoretical analysis has indicated the optimal switching frequency, inductor value and switch sizes of the converter for minimizing the power consumption and losses. High efficiency at very low input power levels is reached by combining the obtained optimal values with accurate ZCS and ZVS techniques. The converter achieves over 80% conversion efficiency for input powers higher than 9 μW and a peak conversion efficiency of 86.6% at 30 μW input power. The converter can operate with input voltages as low as 15 mV, which corresponds to 1 μW input power. Such high efficiency at such low harvested power levels is an important improvement with respect to previously reported works, and it paves the way to the development of self-powered wearable and implantable biosensors.
