Abstract-This work presents a switched-capacitor (SC) DC-DC voltage regulator that converts a 3.7V battery voltage down to ~0.8V in order to power the 'brain' SoC of a flapping-wing microrobotic bee. A cascade of two 2:1 SC converters offers high efficiency for a 4:1 conversion ratio. A charge recycling technique reduces the flying capacitor's bottom-plate parasitic loss by 50%, and overall conversion efficiency reaches 70%. The output droop is less than 10% of the nominal output voltage for a worst-case 47mA load step.
I. INTRODUCTION
In the aerial microrobotic bee application [1], the on-board battery (~3.7V) is the only source of energy. A digital SoC, which works as the 'brain' of the robotic bee, operates at low voltages (~0.8V or less). While a voltage regulator is required to bridge the voltage difference, the stringent weight and area requirements of the robotic bee make the regulator design challenging. First, the regulator needs to be fully integrated with the SoC, without any external components, in order to minimize weight and area. Second, the regulator must connect directly to the battery and support a high (4:1) step-down ratio. Third, high conversion efficiency is important to achieve long flight times for the robotic bee.
SC converters are well suited for this application from weight and area perspectives since they only require capacitors and MOS transistors. On-chip MOS capacitors with density as high as 10nF/mm² are available in digital CMOS processes [2]. However, choosing the right topology is important. Single-stage SC converters suffer from power switch voltage breakdown and high bottom-plate parasitic loss when the conversion ratio and input voltage are high [2] [4] [5]. One solution has been to cascade thick-oxide transistors to avoid transistor breakdown in 3:1 SC converters [2] [5], but this degrades conversion efficiency. Novel switching techniques have also been shown to mitigate flying capacitor bottom-plate parasitic loss [4] [8]. Unfortunately, these issues get worse in single-stage 4:1 SC converters.
This paper presents a fully integrated two-stage SC regulator to address these challenges. The proposed two-stage topology simplifies the overall design and implements several techniques to improve conversion efficiency: (1) it uses the appropriate flavor of transistors (thin-oxide and thick-oxide transistors) in each stage; (2) it applies a charge recycling technique to mitigate bottom-plate parasitic loss; and (3) it employs separate low-boundary feedback controls to regulate each stage's output to the desired level. Lastly, the two-stage topology provides an intermediate voltage for use by other parts of the microrobotic bee.
The two SC stages are nearly identical except for the type of transistors and sizing. Each SC stage implements a multi-phase topology to reduce voltage ripple. Sixteen modules operate off both edges of eight interleaved clock phases. A multi-phase current-starved pseudo-differential VCO generates the clock edges and operates directly off of the battery to guarantee proper start-up. To ensure there is always a balanced number of modules in operation, pairs of modules operate 180° out-of-phase off of one shared clock phase.
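As a rough numerical illustration of the cascaded 2:1 + 2:1 arrangement, the sketch below computes the ideal no-load voltage after each stage and the corresponding upper bound on efficiency when the output is regulated to ~0.8V. The bound used here, V OUT divided by the ideal no-load output voltage, is the standard textbook limit for an SC converter and is an assumption brought in for illustration; it is not a formula stated in this paper. The function names are hypothetical.

```python
# Ideal no-load voltages of a cascade of SC stages, and the textbook
# efficiency ceiling V_OUT / V_NOLOAD (an assumption, not from the paper).

V_BAT = 3.7            # battery voltage in volts (from the paper)
V_OUT = 0.8            # regulated output in volts (from the paper)
STAGE_RATIOS = (2, 2)  # two cascaded 2:1 stages (from the paper)


def ideal_no_load_voltages(v_in, ratios):
    """Return the ideal (lossless, unloaded) voltage after each stage."""
    voltages = []
    v = v_in
    for ratio in ratios:
        v = v / ratio
        voltages.append(v)
    return voltages


def efficiency_bound(v_in, v_out, ratios):
    """Upper bound on efficiency: regulated output over ideal no-load output."""
    v_no_load = ideal_no_load_voltages(v_in, ratios)[-1]
    return v_out / v_no_load


voltages = ideal_no_load_voltages(V_BAT, STAGE_RATIOS)
bound = efficiency_bound(V_BAT, V_OUT, STAGE_RATIOS)
print("ideal stage outputs (V):", voltages)
print("efficiency ceiling: %.1f%%" % (100 * bound))
```

Under these assumptions the ideal intermediate and output voltages are 1.85V and 0.925V, so regulating down to 0.8V caps the achievable efficiency at roughly 86% before any switching or parasitic losses are counted.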
SC converters have two basic phases of operation, thoroughly discussed in [2]. In one phase, energy drawn from the input charges the flying capacitor up and flows to the load. In the other phase, energy stored on the capacitor during the previous phase flows to the load. The power switches operate with stacked voltage domains similar to [3] and [6]. Taking the first stage as an example, switches driven by Φ S1_1H and Φ S1_2H operate in the high voltage domain (between V INT and V BAT ) while switches driven by Φ S1_1L and Φ S1_2L operate in the low voltage domain (between ground and V INT ).
The maximum switching frequencies of the two stages are also different. The first-stage maximum switching frequency is one quarter of that in the second stage. By doing this, the two stages occupy similar chip area and have similar conversion efficiencies, resulting in optimal overall efficiency and power density for the regulator. By optimizing the two stages separately, the first stage connects to the high battery voltage, 
B. Bottom-Plate Charge Recycling
A dominant source of efficiency loss in SC converters comes from switching the bottom-plate parasitic capacitance associated with the flying capacitor (C FLY ). All of the flying capacitors in this design rely on bulk MOS transistors, which usually have non-negligible bottom-plate parasitic capacitance (~2% in this technology, ~5% in [4]). Each stage implements circuitry that combines two-step charging/discharging with charge recycling, as illustrated in Fig. 2 for the second stage. C PAR is the parasitic bottom-plate capacitor of C FLY . By adding a recycling capacitor, C REC , the proposed technique avoids using an external voltage source. The two-step charging/discharging occurs during the converter's dead time to recycle charge, reduce losses, and improve conversion efficiency.
The charge recycling operation is as follows. Assume C REC >>C PAR and V REC starts out at V OUT /2. When discharging C PAR , C PAR first transfers charge to C REC through the additional switch controlled by Φ REC . In this process, C PAR discharges from V OUT to V OUT /2. Then, the switch Φ REC turns off and C PAR fully discharges to ground. The amount of charge transferred from C PAR to C REC is C PAR V OUT /2, which is stored on C REC and is recycled in the charging phase. When charging C PAR , C PAR first charges up from ground to V OUT /2 via C REC . In this period, C REC transfers Q=C PAR V OUT /2 to C PAR , which is the same amount of charge that C REC gets from C PAR in the discharging process. C PAR then disconnects from C REC and fully charges up to V OUT . From an energy perspective, V OUT only needs to provide E=C PAR V OUT ²/2 in this charging process, which is half of the energy otherwise required. It is important to note that V REC eventually settles to V OUT /2 regardless of its initial voltage, because this is the only balanced state where the energy stored on C REC when discharging C PAR matches the energy that C REC loses when charging C PAR .
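The settling behavior and the factor-of-two energy saving described above can be checked with a simple ideal charge-sharing model. The sketch below is illustrative only: it assumes ideal switches and complete settling in every step, and the capacitor ratio (C REC = 100 C PAR ), the initial V REC , and the cycle count are hypothetical values chosen for the example rather than values from this design.

```python
# Ideal charge-sharing model of two-step charging/discharging with a
# recycling capacitor. All numeric defaults except v_out are assumptions.

def simulate_charge_recycling(v_out=0.8, c_par=1.0, c_rec=100.0,
                              v_rec0=0.1, cycles=2000):
    """Return (settled V_REC, energy drawn from V_OUT in the last cycle)."""
    v_rec = v_rec0
    energy = 0.0
    for _ in range(cycles):
        # Discharge, step 1: C_PAR (at v_out) shares charge with C_REC.
        v_rec = (c_par * v_out + c_rec * v_rec) / (c_par + c_rec)
        # Discharge, step 2: C_PAR is shorted to ground; that charge is lost.
        # Charge, step 1: C_PAR (at 0 V) is pre-charged from C_REC.
        v_rec = c_rec * v_rec / (c_par + c_rec)
        v_pre = v_rec  # C_PAR and C_REC sit at the same voltage here
        # Charge, step 2: V_OUT lifts C_PAR from v_pre to v_out.
        energy = v_out * c_par * (v_out - v_pre)
    return v_rec, energy


v_rec, e_recycled = simulate_charge_recycling()
e_plain = 1.0 * 0.8 ** 2       # C_PAR * V_OUT^2 without recycling
ratio = e_recycled / e_plain   # approaches 0.5 as C_REC / C_PAR grows
print("settled V_REC = %.3f V, energy ratio = %.3f" % (v_rec, ratio))
```

Starting from an arbitrary V REC , the model settles close to V OUT /2, and the energy drawn from V OUT per cycle approaches half of the C PAR V OUT ² needed without recycling, consistent with the argument above. With a finite C REC the settled value sits slightly below V OUT /2.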
The above recycling process assumes C REC >>C PAR . Thanks to the converter's multi-phase operation, C REC can be shared by all of the phases and C REC only needs to be larger than the parasitic capacitance in one phase, achieved with negligible penalty. In this implementation, C REC is 2% of the total flying capacitance. 
C. Low-Boundary Feedback Control
Closed-loop operation regulates V OUT and V INT to desired voltage levels. Each stage implements the same low-boundary feedback control loop illustrated in Fig. 3 [3]. Since the feedback topology is the same in both stages, the following illustration uses the second stage as an example. Pairs of the interleaved modules share separate feedback paths, i.e., there are a total of eight feedback paths in the 2 nd stage. In each feedback path, two comparators operate off of complementary clocks generated by the VCO. The comparators compare V OUT with a reference voltage, V REF2 , on the rising and falling edges of the clock. If V OUT is smaller than V REF2 , V LA switches either from low to high or high to low, depending on its previous state. V LA then propagates through to control the power switches and switch the state of the SC converter. This action increases the output voltage V OUT . If V OUT is larger than V REF2 , V LA remains in its previous state. The power switches do not switch and V OUT decreases until the SC converter reacts.
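The toggle-or-hold decision made at each clock edge can be sketched behaviorally as follows. This is a simplified single-path model and not the implemented circuit: the per-edge droop and the per-toggle voltage boost are hypothetical constants (in integer millivolts), standing in for the load discharge and the charge packet delivered by one switching event. The function names are likewise illustrative.

```python
# Behavioral sketch of one low-boundary feedback path.
# Voltages are integer millivolts; droop and boost values are assumptions.

def low_boundary_step(v_out_mv, v_ref_mv, latch):
    """Comparator decision at one clock edge.

    Toggle the latch (which switches the converter state) only when the
    output is below the reference; otherwise hold the previous state.
    Returns (new_latch, switched).
    """
    if v_out_mv < v_ref_mv:
        return (not latch), True
    return latch, False


def simulate(n_edges=200, v_ref_mv=800, v_start_mv=800,
             droop_mv=4, boost_mv=10):
    """Run the controller for n_edges clock edges; return (trace, toggles)."""
    v = v_start_mv
    latch = False
    toggles = 0
    trace = []
    for _ in range(n_edges):
        latch, switched = low_boundary_step(v, v_ref_mv, latch)
        if switched:
            v += boost_mv   # a switching event delivers charge to the output
            toggles += 1
        v -= droop_mv       # the load discharges the output every edge
        trace.append(v)
    return trace, toggles


trace, toggles = simulate()
print("toggles:", toggles, "min/max (mV):", min(trace), max(trace))
```

In this toy model the converter skips edges whenever the output is at or above the reference, so the effective switching frequency tracks the load, and the output ripples around the reference rather than holding a constant value.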
A resistor DAC (R-DAC), shown in Fig. 3, provides separate reference voltages to the 16 comparators via a switch network that connects each individual comparator to the resistor ladder separately. By doing so, we can use the R-DAC to calibrate comparator offsets. The switch network also generates 16 separate reference voltages for the first SC stage. Calibrating comparator offsets improves steady-state voltage ripple and conversion efficiency.
III. MEASUREMENT RESULTS
The two-stage SC converter was fabricated in TSMC's 40nm CMOS technology. The chip was tested in two modes: open- and closed-loop operation. In open-loop operation, the output voltage and output power can be tuned by changing the switching frequency, F sw , of the converter via the VCO. In closed-loop operation, the VCO frequency is set to its maximum and the feedback control loop adjusts the effective switching frequency of the converter to regulate the output.
In open-loop operation, the switching frequency sets both the output voltage and the output power. As shown in Fig. 4(a), higher output power requires a higher switching frequency to deliver energy more frequently. However, when switching frequency increases, there is less time for the switched-capacitor circuit to settle in each cycle. Because of this incomplete charge transfer, the energy that is delivered from input to output in each cycle decreases as switching frequency increases. Hence, switching frequency increases superlinearly with output power: switching frequency, and thus switching loss, increases faster than the delivered power. Fig. 4(b) shows that higher output voltages also require higher switching frequencies. As the output voltage increases, there is less energy that can be delivered from input to output in each cycle [2]. So, switching frequency and switching loss increase faster than V OUT increases.
In open-loop operation, we manually tuned the VCO frequency to keep V OUT at ~800mV for each power level. In closed-loop operation, the feedback loop keeps the output voltage at ~800mV. Steady-state ripple in open-loop operation is small (~10mV) due to the interleaved design with constant switching frequency. In contrast, closed-loop ripple is generally higher due to the cycle-skipping nature of the feedback topology. In each cycle, the feedback controller must determine whether the converter should switch or not. As a result, the instantaneous switching frequency can vary widely from cycle to cycle. Delay through the feedback loop further exacerbates the ripple, because the control loop can only react after the output has dropped below the reference voltage. The longer the feedback delay is, the larger the ripple is. Measurement results show that closed-loop ripple increases with output power since larger load currents discharge the output voltage more quickly. Comparing Figs. 5(b) and (c), calibration helps to reduce voltage ripple by minimizing inconsistent switching thresholds across all of the comparators in the multiple feedback paths. In all subsequent plots, the comparators are always calibrated unless noted otherwise.
B. Conversion efficiency
In SC converters, the major sources of efficiency loss are [...]. In Fig. 6(a), open-loop efficiency reaches a peak of 70% at P OUT =15mW. The efficiency rolls off for higher output power, because switching frequency and switching losses increase faster than the delivered power. Efficiency also rolls off for lower output power, because of static overheads. Comparing Figs. 6(a) and (b), closed-loop efficiency is generally lower than open-loop efficiency, because of larger voltage ripple. Fig. 6 also shows charge recycling consistently improves conversion efficiency by ~2%. Charge recycling is always on for all subsequent plots.
C. Transient response
Fig. 9 presents the SC converter's measured response to 47mA output load transients using an on-die load circuit with rise and fall times of ~100ps. As seen in Fig. 9(a), when the SC converter runs in open-loop with maximum switching frequency, a 3mA to 50mA load step causes V OUT to drop by 155mV. When running in closed-loop with the nominal output voltage set to 750mV, however, the control loop quickly reacts and the voltage droop caused by the load current step is much smaller. In fact, the ~60mV droop in Fig. 9(c) is mostly due to the larger steady-state voltage ripple previously seen with respect to higher output power.
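The closed-loop droop figure can be related to the nominal output with a one-line calculation. The short sketch below only restates the measured values quoted above (3mA to 50mA step, ~60mV droop, 750mV nominal); the helper name is hypothetical.

```python
# Relative droop from the measured closed-loop numbers quoted in the text.

def droop_fraction(droop_mv, nominal_mv):
    """Droop expressed as a fraction of the nominal output voltage."""
    return droop_mv / nominal_mv


load_step_ma = 50 - 3                     # 3mA -> 50mA load step
closed_loop = droop_fraction(60, 750)     # ~60mV droop at 750mV nominal
print("load step: %d mA, closed-loop droop: %.0f%% of nominal"
      % (load_step_ma, 100 * closed_loop))
```

With these numbers the closed-loop droop is about 8% of the nominal output for the 47mA step, in line with the sub-10% figure given in the abstract.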
D. Test chip summary
The silicon area, shown by the micrograph in Fig. 10, was not optimized for power density but was governed by the pads and circuitry added for testing. Flying capacitors and output filter capacitors, which occupy half of the overall area, total 2.64nF. Table 1 compares this work to prior-art SC converters. The 70% peak efficiency of this design is comparable to the efficiencies in [3] and [5], but for a higher 4:1 conversion ratio.
IV. CONCLUSIONS
This paper demonstrates a fully integrated battery-connected switched-capacitor converter for the brain SoC of a microrobotic bee. The two-stage topology, with bottom-plate charge recycling, offers high conversion efficiency for the high 4:1 conversion ratio. While closed-loop regulation provides fast transient response, it also exhibits larger steady-state voltage ripple, which results in an efficiency drop compared to open-loop operation. This tradeoff motivates exploring an adaptive clocking strategy to improve overall system efficiency as described in [7].
ACKNOWLEDGMENTS
This work was supported in part by the NSF Expeditions in Computing Award #: CCF-0926148. The authors thank the TSMC university shuttle program for chip fabrication. 
