Abstract-In this paper, an efficient voltage scalable switched capacitor converter (SCC) for a 1.1V battery-powered system is presented. The SCC employs a binary resolution technique to step down the input voltage to a range of voltages, while keeping the efficiency high. An optimization strategy for designing a multitopology SCC is presented to improve the effectiveness of the circuit and to preserve efficiency over a wide range of load voltages.
I. INTRODUCTION
The growing quest for power saving techniques in digital systems has raised the need for an integrated DC-DC converter that is compatible with sub-threshold operation levels [1]. To minimize power dissipation in ultra low power systems, the DC-DC converter needs to supply variable sub-threshold load voltages [2]. The most efficient technique for step-down conversion is based on the Buck converter. However, the use of an off-chip inductor for each voltage domain causes serious EMI noise and a large pin requirement, which makes such an implementation impractical in many applications. Switched capacitor converters (SCC) have become more attractive for battery operated systems because they can minimize the number of off-chip components and offer flexibility in sizing switches and capacitors [3], [4]. The challenge associated with the realization of an efficient low voltage and ultra low power SCC has led to several recent developments [5]-[7]. These studies have shown that an SCC with multiple conversion ratios can maintain its efficiency over a wider input voltage range than a single topology converter. It has been shown that a combination of two standard divide by two SCC cells can support two additional topologies, which provide 2/3 and 1/3 conversion ratios [8]. This circuit and its variants have been systematically designed in numerous works, utilizing one or more of its possible conversion ratios. For example, an interleaved SCC structure partitioned into multiple circuits to reduce the input current and output voltage ripples has been explored in [8].
The most common loss mechanisms of an integrated SCC are: 1) Charge transfer conduction losses, 2) Switching losses, 3) Bottom-plate parasitic capacitors, 4) Output voltage ripple, 5) Switch parasitic losses, 6) Gate drive loss.
In light of these, it is clear that the main challenge in SCC design is the optimization [9] of the converter parameters, while meeting various constraints on efficiency, silicon area and output voltage V DD . If, for instance, a traditional design approach is adopted, such as sizing the switches of the SCC to the same R ds(on) or, alternatively, adjusting it to allow operation in the complete charge transfer mode, one might end up with a non-optimal design.
Motivated by the above concerns, this work presents an optimization procedure for a 40nm 1.1V CMOS voltage scalable SCC, which is suitable for mobile sub-threshold applications supporting output levels of 0.18V-0.6V. It should be noted that the presented concept is general and does not depend on a particular implementation technology. The core of the proposed circuit is a classical SCC shown in Fig. 1. This circuit is composed of nine switches and two flying capacitors to support three conversion ratios (this topology is denoted henceforth as the conventional SCC). Typical candidates for this SCC are applications that require ultra-low dissipation with low to moderate circuit performance. To enhance the efficiency of this SCC in the vicinity of V DD =0.2V, a single extra power switch was introduced (Fig. 2) as compared to the topology of the conventional SCC (Fig. 1). Hence, a total of only 10 power switches are able to support four topologies with conversion ratios of 2/3, 1/2, 1/3 and 1/4. The proposed G=1/4 topology employs the technique of binary SCC [5].
Fig. 1. Conventional 9 switch SCC with three conversion ratios G={1/3,1/2,2/3}.
Since the proposed SCC is supposed to work over four different topologies, it is hard to define a single optimized operating point that can constitute an optimization criterion for all topologies. To the best of the authors' knowledge, this subject has received limited attention over the years. Therefore, this paper outlines an optimization methodology based on a multiobjective optimization approach.
II. PROPOSED SWITCHED CAPACITOR CONVERTER
It is well established that any SCC can be modeled as an equivalent no-load voltage source connected in series with an equivalent resistor R eq [10], [11]. The loss contributors in the SCC power stage are lumped into that equivalent resistor. The maximum theoretical efficiency of the SCC for a given conversion ratio G can be expressed by,
$$\eta = \frac{V_{DD}}{G \cdot V_{BAT}} = \frac{R_L}{R_L + R_{eq}} \qquad (1)$$
where V DD and V BAT are the loaded output voltage and the input battery voltage, respectively, and R L is the load resistance. Equation (1), however, does not take into account the power losses that stem from leakage, the bottom-plate capacitance, the gate drive and the control circuitry. A more accurate representation of (1) includes all loss contributors, as follows,
where P L , P BP , P Drive , P Leak , P Control are respectively the output power, the bottom-plate parasitic capacitor related losses, the gate drive loss, the power consumption due to leakage current and the power lost in the control circuitry. In practice, employing the circuit of Fig. 1 to produce an ultra low V DD voltage such as 0.2V from 1.1V battery is possible but the resulting efficiency will be very poor. For example, using the smallest possible ratio of G=1/3, it follows from (1) that for V DD =0.2V, the obtainable efficiency is limited to 54%. It would appear that a conversion ratio of G=1/4 can be realized by cascading two divide by two SCC in order to increase the efficiency up to 73%. However, this requires an additional number of power switches that need to be considered.
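The two efficiency figures quoted above can be checked numerically. The following is a minimal sketch, assuming the ideal bound takes the form η = V DD /(G·V BAT ), i.e. the loaded output voltage divided by the no-load target voltage; the function name `max_efficiency` is illustrative and not taken from the paper.

```python
def max_efficiency(v_dd, v_bat, gain):
    """Ideal efficiency bound: loaded output over the no-load target G*V_BAT.

    Assumption: only the equivalent-resistor loss is modeled, so leakage,
    bottom-plate, gate-drive and control losses are ignored here.
    """
    target = gain * v_bat
    if v_dd > target:
        raise ValueError("V_DD cannot exceed the no-load target G*V_BAT")
    return v_dd / target


V_BAT = 1.1  # battery voltage in volts (from the text)
V_DD = 0.2   # loaded output voltage in volts (from the text)

eta_third = max_efficiency(V_DD, V_BAT, 1 / 3)
eta_quarter = max_efficiency(V_DD, V_BAT, 1 / 4)

print(f"G=1/3: {eta_third:.1%}")    # about 54.5%
print(f"G=1/4: {eta_quarter:.1%}")  # about 72.7%
```

Under this assumption the bound for G=1/3 is close to the 54% stated above and the bound for G=1/4 is close to 73%, which is what motivates adding the fourth conversion ratio.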
A. The Approach
The configuration of the proposed power stage converter is depicted in Fig. 2 . This circuit consists of a conventional SCC ( Fig. 1 ) and an additional switch M 10 . The SCC is designed to implement 4 different conversion ratios, excluding the trivial case G=1. The reader is referred to [6] and [8] for a detailed analysis of the ordinary conversion ratios and their corresponding topologies. When the circuit is supposed to operate with G=1/4, essentially it consists of three subcircuits extended over three non-overlapping time sessions {Ф 1 ,Ф 2 ,Ф 3 }. Following the theory in [5] , the sub-circuits of the SCC can be represented by an equivalent system of linear equations. Designating the voltages across C f1 and C f2 by V C1 and V C2 , the system of linear equations is composed as follows,
The solution of (3) is: V DD =(1/4)V BAT , V C1 =(1/2)V BAT and V C2 =(1/4)V BAT . These results determine that, in a steady state irrespective of the order in which the three sub-circuits repeat (Fig. 2) , the voltages V DD , V C1 and V C2 eventually converge to the above calculated values.
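The stated steady-state voltages can be reproduced by solving a small linear system. The sketch below is an assumption about the form of the three sub-circuit equations: it uses one Kirchhoff voltage relation per time session, chosen in the style of binary SCC theory so that the solution matches the values given above. The actual sub-circuits are those of Fig. 2, and the helper `solve_linear` is illustrative only. Voltages are normalized to V BAT =1.

```python
from fractions import Fraction


def solve_linear(a, b):
    """Solve a square system a*x = b exactly by Gauss-Jordan elimination."""
    n = len(a)
    # Augmented matrix with exact rational entries.
    m = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Find a row with a non-zero pivot in this column and swap it up.
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        # Normalize the pivot row.
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col and m[r][col] != 0:
                f = m[r][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [row[n] for row in m]


# Unknowns: [V_DD, V_C1, V_C2], normalized to V_BAT = 1.
# Assumed relations (hypothetical assignment to the three sessions):
#   session 1: V_DD + V_C1 + V_C2 = V_BAT
#   session 2: V_DD - V_C1 + V_C2 = 0      (V_C1 = V_C2 + V_DD)
#   session 3: V_DD        - V_C2 = 0      (V_C2 = V_DD)
A = [[1, 1, 1],
     [1, -1, 1],
     [1, 0, -1]]
B = [1, 0, 0]

v_dd, v_c1, v_c2 = solve_linear(A, B)
print(v_dd, v_c1, v_c2)  # 1/4 1/2 1/4
```

Because the three relations are independent of the order in which they are applied, the same fixed point is obtained for any ordering of the sessions, in line with the remark above.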
B. Architecture
The block diagram architecture of the considered SCC is shown in Fig. 3 . Apart from the SCC circuit, all control blocks were implemented with the Verilog-A language (See Appendix). The SCC block contains the flying capacitors (C f1 , C f2 ) and the power switches, as shown in Fig. 2 . Eight clock waveforms are generated inside the non-overlapping clock generation block (Fig. 3) . Fig. 4 illustrates the clock waveforms which are involved in the operation of the converter. The gain selection block activates one of the four topologies, depending on the proximity of its target voltage to the load voltage being delivered and its ability to provide the load power demand. To facilitate a tight regulation and achieve high efficiency, the converter should employ a feedback controller. A typical hysteretic mode control (burst mode) is used, although other techniques such as PFM are also feasible. The converter uses the hysteretic mode control to maintain the feedback voltage V DD within the hysteretic band of ±ΔV, where ΔV is set to 10mV. In this method, the converter remains idle until the output voltage falls below V REF -ΔV. At this point, the output of the comparator V comp switches to a high state and thereby enables the SCC to transfer charge packets to the load.
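The hysteretic (burst mode) control described above can be illustrated with a short behavioral sketch. This is not the Verilog-A controller of the paper. It is a Python model under the assumption that the SCC is enabled when V DD falls below V REF -ΔV and disabled again once V DD rises above V REF +ΔV; the class name and the sample voltages are hypothetical.

```python
class HystereticComparator:
    """Behavioral model of a comparator with a +/- delta_v hysteresis band."""

    def __init__(self, v_ref, delta_v=0.010):
        self.low = v_ref - delta_v    # enable threshold (from the text)
        self.high = v_ref + delta_v   # disable threshold (assumed)
        self.enabled = False          # models V_comp; False means idle

    def update(self, v_dd):
        """Return True while the SCC should transfer charge to the load."""
        if v_dd < self.low:
            self.enabled = True
        elif v_dd > self.high:
            self.enabled = False
        # Inside the band the previous state is kept.
        return self.enabled


comp = HystereticComparator(v_ref=0.30)
for v in (0.305, 0.295, 0.289, 0.300, 0.311, 0.300):
    print(f"V_DD={v:.3f} V -> enable={comp.update(v)}")
```

Inside the band the comparator holds its previous state, which is what produces the bursts of charge transfer followed by idle periods.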
III. OPTIMIZATION
The proposed circuit was tested and characterized in a standard low power 40nm TSMC CMOS process. To avoid a short channel effect, the channel length was set three times larger than the minimum technology L min namely, L=120nm. The widths of the switches and additional parameters are adjusted by a global optimization procedure using a parallel simulated annealing algorithm. A global optimization was performed by evaluating the average efficiency and unregulated output V DD over a range of switches' widths and flying capacitors C f1 =C f2 , until the system specifications were satisfied. As mentioned above, to facilitate a variable output V DD , the following optimization is based on a behavioral Verilog-A controller which switches conveniently between the four possible topologies.
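To make the search procedure concrete, the following is a minimal serial simulated annealing sketch. It is not the parallel engine used in this work, and the cost function is a purely hypothetical surrogate: in the actual flow each evaluation would be a circuit simulation returning the average efficiency and the unregulated output voltage. The parameter names, the capacitance bounds and the location of the toy optimum are assumptions made only for illustration.

```python
import math
import random


def simulated_annealing(cost, bounds, steps=2000, t_start=1.0, t_end=1e-3, seed=0):
    """Minimize cost(x) over a box using Metropolis acceptance and geometric cooling."""
    rng = random.Random(seed)
    x = [rng.uniform(lo, hi) for lo, hi in bounds]
    fx = cost(x)
    best, f_best = list(x), fx
    for k in range(steps):
        # Geometric cooling from t_start down to t_end.
        t = t_start * (t_end / t_start) ** (k / max(steps - 1, 1))
        # Propose a neighbour: perturb one parameter and clip it to its range.
        i = rng.randrange(len(x))
        lo, hi = bounds[i]
        cand = list(x)
        cand[i] = min(hi, max(lo, x[i] + rng.gauss(0.0, 0.1 * (hi - lo))))
        fc = cost(cand)
        # Always accept improvements; accept worse points with probability exp(-d/t).
        if fc <= fx or rng.random() < math.exp((fx - fc) / t):
            x, fx = cand, fc
            if fx < f_best:
                best, f_best = list(x), fx
    return best, f_best


def toy_cost(p):
    """Hypothetical surrogate with a minimum at width=150 um, C_f=0.6 (arbitrary units)."""
    width_um, c_fly = p
    return ((width_um - 150.0) / 400.0) ** 2 + (c_fly - 0.6) ** 2


# Width range follows the 1 um to 400 um example; the C_f range is assumed.
BOUNDS = [(1.0, 400.0), (0.1, 1.1)]
best, f_best = simulated_annealing(toy_cost, BOUNDS)
print(best, f_best)
```

In a multi-objective setting such as the one considered here, the cost would typically combine the weighted goal function with penalty terms for violated constraints.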
The optimization procedure was conducted using the following assumptions and conditions: 1) All losses are considered except the losses of the control circuit, which can reasonably be assumed negligible compared with the major loss factors, 2) The switching frequency f sw was set for simplicity to a fixed value of f sw =26MHz, 3) The SCC was implemented with low V t MOS transistors of the process library, 4) The output capacitor (Fig. 3) was set to C L =1nF to fulfill a requirement of Δ ripple =8mV maximum output ripple by the relation C L =I o /(Δ ripple f sw ) [12]. Each designable parameter was constrained to take values from a limited range. The performance function of the circuit, which also serves as the goal function with the highest weight, is the average efficiency η avg of the four topologies, evaluated while V comp (Fig. 3) is maintained in a high state. The average efficiency η avg and the four efficiencies at the respective unregulated maximum output voltages were constrained to satisfy the following,
and simultaneously, 
while letting the optimizer engine choose values such that the SCC complies with constraints (5) and (6). In our case the average efficiency η avg was constrained between 78% and 83%, which is acceptable since most SCC designs actually fall into this range. TABLE I summarizes all design variables involved in the optimization procedure, the corresponding optimization ranges of legal values and the final optimum values. The notation MIN:STEPS:MAX in TABLE I represents the optimization scanning range, in which MIN and MAX are the lower and upper bounds, respectively, and STEPS defines the number of unequally spaced values between the MIN and MAX limits. For example, the format 1μm:30:400μm indicates that the engine automatically searches up to 30 different unequally spaced width values within 1μm to 400μm to comply with the optimization criteria.
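As an illustration of the MIN:STEPS:MAX notation, the sketch below generates 30 unequally spaced widths between 1μm and 400μm. The spacing law used by the optimizer is not specified here, so geometric (logarithmic) spacing is assumed purely for illustration; the function name `geometric_grid` is hypothetical.

```python
def geometric_grid(lo, hi, steps):
    """Return `steps` values from lo to hi with a constant ratio between neighbours.

    Assumption: geometric spacing is one possible realization of
    'unequally spaced'; the actual engine may use a different law.
    """
    if steps < 2 or lo <= 0 or hi <= lo:
        raise ValueError("need steps >= 2 and 0 < lo < hi")
    ratio = (hi / lo) ** (1.0 / (steps - 1))
    return [lo * ratio ** k for k in range(steps)]


# Scanning range written as 1um:30:400um in the MIN:STEPS:MAX notation.
widths_um = geometric_grid(1.0, 400.0, 30)
print(", ".join(f"{w:.1f}" for w in widths_um))
```

With this choice the grid is dense for narrow switches and sparse for wide ones, so the same relative change in width is explored across the whole range.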
IV. SIMULATION RESULTS
The SCC was implemented and simulated using the Cadence Spectre simulator. The flying capacitors are realized using nMOS capacitors connected in parallel. In this design, the bulk of all nMOS transistors is tied to ground while the bulk of the pMOS transistors is tied to V BAT =1.1V. No level shifters were used; hence, the input voltage V BAT serves as the rail for the gate drive of all transistors, resulting in a gate voltage swing between ground and V BAT. Fig. 5 shows key waveforms of the SCC at two operating points. As evident from the figure, the parameters of the SCC are optimized to work in the incomplete charge transfer mode. It can be noticed that with the G=1/2 topology the operation is governed by two clock phases, while with the G=1/4 topology one clock period is divided into three time sessions.
The efficiency of both the conventional and the proposed SC converters, while delivering a load current of 200μA, is plotted in Fig. 6. The result shows that the proposed SCC dramatically improves the efficiency at low output voltages. Specifically, the SCC efficiency is improved by 16% in the vicinity of V DD =200mV as compared to the one using a conventional 9 switch topology [6], [8], while retaining the same efficiency over the rest of the range. The SCC achieves a peak efficiency of 87% at V DD =0.53V with 200μA load current. Theoretically, the load current could be higher; however, the aim of the study is to present a general concept of SCC optimization. Furthermore, at higher output power the SCC is subject to performance degradation due to higher voltage/current ripples, parasitic effects and a larger driver. Nevertheless, it is evident that the proposed circuit has a comparatively small number of power switches and silicon area, while supporting four different conversion ratios.
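The output capacitor value used in the optimization can be cross-checked against the ripple relation C L =I o /(Δ ripple f sw ) quoted in Section III. The sketch below assumes that the 200μA load current of this section is the I o used for sizing, which is not stated explicitly; the function names are illustrative.

```python
def min_output_cap(i_load, ripple, f_sw):
    """Smallest C_L (farads) that keeps the per-cycle droop below `ripple`."""
    return i_load / (ripple * f_sw)


def ripple_for_cap(i_load, c_load, f_sw):
    """Per-cycle output droop (volts) for a given C_L."""
    return i_load / (c_load * f_sw)


I_O = 200e-6      # load current in amperes (assumed to be the sizing current)
RIPPLE = 8e-3     # maximum output ripple in volts
F_SW = 26e6       # switching frequency in hertz
C_L = 1e-9        # chosen output capacitor in farads

c_min = min_output_cap(I_O, RIPPLE, F_SW)
dv = ripple_for_cap(I_O, C_L, F_SW)
print(f"C_L,min = {c_min * 1e9:.2f} nF")        # about 0.96 nF
print(f"ripple at 1 nF = {dv * 1e3:.2f} mV")    # about 7.69 mV
```

Under this assumption the required capacitance is slightly below 1nF, so the chosen C L =1nF keeps the estimated ripple just under the 8mV limit.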
V. CONCLUSIONS
This work has presented an SC converter with four conversion ratios. We have shown that the proposed converter can deliver scalable load voltages using the core 40nm 1.1V CMOS devices. The presented concept is general and does not depend on a particular implementation technology. Four different conversion ratios were obtained by the addition of a single power switch to the conventional topology (Fig. 1). The additional conversion ratio of G=1/4 was obtained by dividing the charge transfer process into three time sessions. Simulation results showed that the proposed SCC dramatically improves the efficiency at load voltages around and below 200mV. The SCC achieves an efficiency improvement of 16% in the vicinity of V DD =200mV as compared to the conventional topology (Fig. 1) and a peak efficiency of 87% at V DD =0.53V.
It can thus be concluded that the proposed optimization procedure provides an effective solution for designing SCC along with constraints on parameters' values. 
