Abstract-A current-efficient, fully integrated low-dropout regulator (LDO) with improved load transient response for system-on-chip (SoC) applications is presented in this paper. It uses a high-bandwidth common-gate amplifier and a slew-rate enhancement (SRE) circuit triggered by voltage spikes to greatly reduce the output voltage spike and the response time of the LDO. The proposed circuit has been implemented in a 0.35 µm standard CMOS process and occupies an active chip area of 0.057 mm². Experimental results show that it can deliver 100 mA of load current at a 150 mV dropout voltage. It consumes only 8 µA of quiescent current and recovers within 0.2 µs even under the maximum load current change. Consequently, a low-power and ultra-fast capacitor-less LDO is achieved.
I. INTRODUCTION
Advanced portable devices such as cellular phones and personal digital assistants generally require a variety of voltage levels to power different blocks. For example, low-dropout linear regulators (LDOs) can convert decaying battery voltages into accurate, low-noise voltages for noise-sensitive analog and/or radio-frequency blocks. Since integrated CMOS LDOs occupy only a small chip area, they are also adopted in system-on-a-chip designs to power the sub-blocks of a system individually and thereby tackle the crosstalk problem. Both board space and external pins can be minimized [1]-[6].
For portable applications, power efficiency is a critical requirement for prolonging battery life. Therefore, low quiescent current and low dropout voltage are essential in LDO design. In low-power conditions, the large gate capacitance of the power transistor degrades the loop-gain bandwidth and the slew rate at the gate drive of the LDO. As a result, low quiescent current and fast load transient response cannot be achieved simultaneously with the conventional LDO structure. Several techniques have been proposed to improve the transient response without increasing the quiescent current [7]-[13]. Adaptive biasing is employed in [7]-[9]; a slew-rate-enhancement circuit is used in [10]; a push-pull technique is developed in [11]; and voltage-spike detection circuits are proposed in [12] and [13] to improve the dynamic performance of the LDO.
This work was supported in part by the National Natural Science Foundation of China.
Another important issue is reducing cost by cutting down the silicon area, bulky off-chip components, and pin count. As a result, LDOs without an external filtering capacitor are desirable, and their stability must be ensured. Many advanced methods, such as damping-factor-control compensation [2] and Q-reduction compensation [3], have been developed to achieve high bandwidth with a low quiescent current and a small on-chip compensation capacitance for capacitor-less LDOs.
In this work, a high-bandwidth error amplifier with a dynamic low-power push-pull SRE circuit is proposed to enable a capacitor-less LDO regulator with improved transient response. The concept of the proposed LDO is discussed in Section II. Circuit design and implementation are presented in Section III. Experimental results and conclusions are given in Sections IV and V, respectively.
II. DESIGN OF THE PROPOSED CIRCUITS
The transient response, which depends on parameters such as the closed-loop stability, the loop bandwidth and the slew rate at the gate of the power transistor, is a critical dynamic specification in LDO design. Both the amplitude of the voltage spike and the recovery time of the regulated output voltage affect its overall accuracy. As shown in Fig. 1, the basic structure of the capacitor-less LDO is similar to that of [11] and focuses on dynamic biasing, so extra bias current is used only at the transient instant when the output current changes. The error amplifier is built from two common-gate differential-input transconductance cells, a voltage buffer and a current-summation circuit. The two $G_m$ cells, each basically formed by a pair of matched transistors ($M_a$ and $M_b$ in Fig. 1 as an example) in the form of a current mirror, are cross-coupled to realize a push-pull output stage that injects and withdraws more current for charging and discharging during the transient instant. Because the output current $I_o$ has a quadratic dependence on the input-voltage difference, according to the square-law characteristic of the MOS transistor, the maximum output current $I_{omax}$ is no longer limited by a constant-current source as in a conventional amplifier with a tail current. This is very helpful for improving the transient response of fully on-chip LDOs, since low power and high slew rate can be realized at the same time.
IEEE/IFIP 19th International Conference on VLSI and System-on-Chip
[...] $V_{OUT}$ and cannot react to voltage variation anymore. The second one is due to the limited input transconductance $G_m$ and the attainable gain-bandwidth (GBW), which is determined by $G_m/C_{pass}$ for capacitor-less LDOs. Thus, a fast-changing voltage spike cannot be detected effectively by the amplifier at low bias current. Solving the problems mentioned above while retaining high SR and current efficiency is therefore the key objective of this paper.
A. Current Subtracter
To improve the transconductance of the amplifier, a positive-feedback technique using a current subtracter can be adopted. As shown in Fig. 2, another common-gate amplifier ($M_3$-$M_4$) is added and cross-coupled with $M_1$-$M_2$ in the $G_{mH}$ cell. The output current $I_{M4}$ is then fed back to node C through the current mirror $M_5$-$M_6$. The only difference between the two pairs is the aspect ratio of the input transistors, which is 2/1 to guarantee a normal bias point for the total output current $I_{push}$. When the positive input $V_+$ increases, $I_1$ increases while $I_2$ decreases accordingly. The larger the voltage spike $\Delta V_+$ is, the more $I_{push}$ is gained compared with using $M_1$-$M_2$ alone. Positive feedback is therefore realized at node C, and the total transconductance $G_m$ is enhanced by a factor of 1.5, i.e., $G_m \approx 1.5\,g_{m1}$.
B. Adaptive Biasing
Note that the power transistor can be designed to work in the linear region when the LDO load is heavy, so that the chip area is used more efficiently. In the saturation region, the relationship between $I_d$ and $V_{gs}$ is quadratic, whereas in the linear region it is linear, so an equal relative increase in $I_d$ requires a larger increase in $V_{gs}$. Therefore, the circuit needs a higher bandwidth and a larger slew rate at heavy load for high-speed control, which is achieved by adaptive biasing [9]. This technique senses the output current of the regulator and feeds a fraction of that current back to the error amplifier. For example, the buffer stage is adaptively biased in [7], where the increased current in the buffer stage pushes the parasitic pole associated with the parasitic capacitance at the gate of the power transistor to higher frequencies and increases the current available for slewing; alternatively, adaptive biasing is applied at the input stage in [8] and [9] to extend the loop bandwidth.
As shown in Fig. 2, the small-signal response speed of the common-gate amplifier is mainly determined by transistor $M_1$ and the gate capacitance $C_{pass}$ of the power transistor, where $1.5g_{m1}/C_{pass}$ sets the maximum response speed of the amplifier. To enlarge the transconductance $g_{m1}$, more bias power is applied to the amplifier as the load increases, which significantly improves the transient response. This can be done with a simple current mirror and a sense MOSFET, both of which are area-efficient. In addition to the small fixed biasing current $I_B$, a feedback current $I_{AB}$ proportional to the load current $I_{load}$ (i.e., $I_{AB} = \beta I_{load}$) is applied to the drain of transistor $M_2$ to control $V_{gs2}$ at different loads. Because $V_{gs1}$ and $V_{gs2}$ are equal at the DC operating point, the transconductance $g_{m1}$ of $M_1$ is set by the total bias current $I_B + I_{AB}$ and therefore rises with the load.
The resulting larger bias current at heavy loads increases the transconductance of the input pair, leading to a larger amplifier bandwidth. At low load currents, the feedback current $I_{AB}$ is negligible, which yields a high current efficiency without shortening battery life.
The larger $I_{load}$ is, the smaller $V_{INmin}$ is. Therefore, the ICMR is enlarged at heavy load, which greatly improves the detection range for the output voltage spike amplitude. One important design issue is the careful choice of $I_{AB}$, i.e., of the ratio $\beta$ between the current-sense transistor and the power MOSFET, over the load range. Too small a $\beta$ forgoes the advantages of dynamic biasing. However, since $M_2$ is diode-connected and the input $V_-$ is a stable reference voltage, too large a $\beta$ introduces more feedback current $I_{AB}$ into the input stage and pushes $V_B$ to a very low voltage, especially when $V_{OUT}$ is small, which may force the transistors in the current sources $I_B$ and $I_{AB}$ into the linear region. If this happens at large load current, there is no isolation between ground and the bias voltage $V_B$. Ground noise then couples freely to the gates of the input transistors and degrades the performance of the amplifier. In this circuit, the largest load current is 100 mA and the ratio $I_{load}/I_{AB}$ is chosen as 10000/1, so the largest feedback current is approximately 10 µA.
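The load dependence of the adaptive bias can be illustrated numerically. The sketch below assumes the long-channel square-law model, in which $g_m \propto \sqrt{I_D}$, and uses the 10000/1 ratio quoted above. The fixed bias current `I_B` of 1 µA is a hypothetical value chosen only for illustration, since the text does not state it, and the function names are likewise illustrative.

```python
import math

BETA = 1.0 / 10000.0   # I_AB / I_load, the ratio quoted in the text
I_B = 1e-6             # hypothetical fixed bias current in amperes (not given in the text)


def adaptive_bias_current(i_load, i_b=I_B, beta=BETA):
    """Total input-stage bias: fixed I_B plus load-dependent I_AB = beta * I_load."""
    return i_b + beta * i_load


def gm_scale(i_load, i_b=I_B, beta=BETA):
    """Square-law estimate of g_m1 relative to its no-load value (g_m ~ sqrt(I_D))."""
    return math.sqrt(adaptive_bias_current(i_load, i_b, beta) / i_b)


if __name__ == "__main__":
    for i_load in (50e-6, 1e-3, 50e-3, 100e-3):
        i_ab = BETA * i_load
        print(f"I_load = {i_load * 1e3:8.3f} mA   "
              f"I_AB = {i_ab * 1e6:7.4f} uA   "
              f"gm/gm0 = {gm_scale(i_load):.2f}")
```

Under these assumptions the feedback current is negligible at light load and reaches 10 µA at 100 mA, which is where the extra transconductance is needed most.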
C. Dynamic Push-Pull SRE Circuits
Unfortunately, adaptive biasing is activated only once the gate voltage $V_G$ of the power MOS transistor goes down, i.e., when the feedback loop is already compensating for an abrupt increase in load current. When the load current suddenly increases, some time elapses before $V_G$ moves down and the adaptive biasing is activated; this delay is determined by the bandwidth of the loop. The latency may strongly reduce the effectiveness of adaptive biasing.
To remove the dependence on the limited bandwidth and to further reduce the output voltage spikes and the recovery time, an SR enhancement (SRE) circuit based on dynamic push-pull techniques is embedded in parallel with the error amplifier to obtain a better-regulated power supply. The SRE circuit provides dynamic current to charge and discharge the gate capacitance of the power transistor only during transients, and it is completely turned off in the static state. It should improve both the loop-gain bandwidth and the slew rate at the gate drive of the power transistor while dissipating little quiescent current in the static state. Normally, an SRE circuit consists of a sensing circuit and a driving circuit. Different SRE circuits have been developed based on different sensing and driving circuits [14]-[16]. The critical issue in the existing methods is how to avoid a larger loading capacitance caused by the additional structures and a high dynamic current at the input stage. For example, the current-detection SRE circuit reported in [10] detects changes in the current signal at the active load of the core amplifier, so that the sensing circuit does not increase the loading of the error amplifier in the LDO.
In the structure shown in Fig. 3, the sensing circuit adopts a voltage-detection method based on capacitive coupling. It senses rapid transient voltage changes at the output of the LDO and then changes the current signal $I_{M4}$ or $I_{M11}$ to trigger the dynamic push-pull circuit, which momentarily increases the driving current. The basic circuit is a modification of the current mirrors $M_3$-$M_4$ and $M_3$-$M_{11}$, in which capacitor $C_1$ and resistor $R_1$ form a high-pass circuit. It provides a fast path for detecting the output voltage spike. As shown by the timing diagrams in Fig. 3, when $V_{OUT}$ steps from low to high by $\Delta V$ instantaneously, the rapid voltage change couples directly to the gate of $M_{11}$ owing to the high-pass property of $C_1$. When $C_1$ is chosen to be much larger than $C_{gs3} + C_{gs4} + C_{gs11}$, the gate voltage of $M_{11}$ is dominated at that instant by the signal coupled through $C_1$. Thus, $V_{gs11}$ changes momentarily and produces the extra current $\Delta I_6$.
A larger aspect ratio of the current mirror helps to increase $\Delta I_6$ and thus to inject more transient current. When $V_{OUT}$ changes from high to low, the coupling effect generates a smaller $I_{M4}$ and triggers the pull action. When $V_{OUT}$ stays at a constant voltage in the steady state, $C_1$ acts as an open circuit, which automatically shuts down the current-boosting circuit.
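The requirement that $C_1$ be much larger than the gate capacitances can be read as a capacitive-divider condition. The following sketch assumes an ideal instantaneous step and ignores the discharge through $R_1$; the 3 pF value for `C1` is the one quoted for $C_1$ in Section III, while the total gate capacitance `C_GATE` of 0.3 pF is a hypothetical figure used only for illustration.

```python
def coupled_step(delta_v, c1, c_gate_total):
    """Gate-voltage step produced by an instantaneous output step delta_v.

    Models C1 and the lumped gate capacitance (Cgs3 + Cgs4 + Cgs11) as an
    ideal capacitive divider; the slower discharge through R1 is ignored.
    """
    return delta_v * c1 / (c1 + c_gate_total)


C1 = 3e-12         # coupling capacitor in farads
C_GATE = 0.3e-12   # hypothetical Cgs3 + Cgs4 + Cgs11 in farads

if __name__ == "__main__":
    for dv in (0.05, 0.1, 0.25):
        print(f"dV_OUT = {dv * 1e3:5.0f} mV -> "
              f"dV_gate = {coupled_step(dv, C1, C_GATE) * 1e3:6.1f} mV")
```

With a tenfold capacitance ratio, roughly 91% of the output step reaches the mirror gate in this idealized picture, which is why the coupled signal dominates the gate voltage during the spike.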
The driving circuit is composed of transistors $M_9$-$M_{16}$. With appropriate current-mirror ratios ($b_1$, $b_2$), $M_9$ and $M_{10}$ ($M_{11}$ and $M_{12}$) are designed such that, if both transistors operated in the saturation region, their drain currents would have to satisfy $I_3 < I_4$ ($I_5 > I_6$). Consequently, $M_{10}$ and $M_{12}$ operate in the triode region, so that the voltages of nodes $N_1$ and $N_2$ are set to "1" and "0", which keeps transistors $M_{13}$ and $M_{16}$ turned off in the steady state.
Once the load current decreases and causes a large output variation, the extra current $\Delta I_6$ is generated to pull the voltage of node $N_1$ down. Transistor $M_{13}$ is then strongly turned on to charge the gate capacitance of the power transistor. When $V_{OUT}$ is regulated back to its expected steady-state voltage, $I_6$ decreases and the voltage of node $N_1$ is smoothly reset to "1", turning transistor $M_{13}$ off.
Similarly, transistor $M_{16}$ can be controlled to discharge the load capacitance during the negative slewing period. The traditional way to increase a PMOS current such as $I_3$ is simply to pull down the gate directly [12]. However, this may require another large coupling capacitor and significantly degrade the PSRR of the LDO. The idea proposed here is to use the current subtracter $M_4$-$M_6$ instead to perform the same function. With this optimization, only one coupling capacitor $C_1$ is needed. In addition, transistors $M_{14}$ and $M_{15}$ prevent the noise at $N_1$ and $N_2$ from coupling to the gate of the power transistor when transistors $M_{13}$ and $M_{16}$ are turned on.
The response time of the SRE circuit is determined by the time required to turn the drive transistors $M_{13}$ and $M_{16}$ on or off when an output voltage spike $\Delta V$ is applied to the SRE circuit. During positive (negative) output slewing, transistor $M_{12}$ ($M_{10}$) is in the saturation region. Therefore, the response times $t_{res,p}$ and $t_{res,n}$ of the SRE circuit for the positive and negative slewing periods are approximately given by (5) and (6), which involve the parasitic capacitance at node $N_1$ ($N_2$). Equations (5) and (6) indicate that the response time can be shortened by enlarging $g_{m4}$ and $g_{m11}$ without increasing the power much.
From the above analysis, the dynamic push-pull scheme can effectively shorten the time needed to regulate the output voltage of the LDO back to a stable level, i.e., the circuit enhances the slew rate of the error amplifier during the transient period. The feedforward path formed by the capacitive coupling introduces a zero ($1/R_1C_1$) in the frequency response of the LDO and adds gain to the loop through the boosted current. The values of $R_1$ and $C_1$ can be selected by placing the zero around the GBW, which extends the loop bandwidth of the LDO and ensures that the proposed SRE circuit responds only to high-frequency spikes. Moreover, because the coupling is independent of the DC value of $V_{OUT}$ owing to the high-pass characteristic of $C_1$, the proposed method is suitable for detecting spikes at any output voltage level, which considerably improves the ICMR of the amplifier.
III. CIRCUIT REALIZATION
Fig. 4 shows the schematic of the proposed LDO regulator, which consists of a pMOS power transistor $M_o$, a current-sensing circuit, a high-slew-rate push-pull error amplifier, an SRE circuit and a reference buffer. The push-pull output stage constructed with transistors $M_{13}$ and $M_{20}$ allows the LDO regulator to provide a wide range of load currents with only a moderately sized $M_o$. In this circuit, to provide 100 mA of load current with a 150 mV dropout, the aspect ratio $(W/L)_{Mo}$ is chosen to be 15000 µm/0.35 µm in a 0.35 µm standard CMOS process, where the threshold voltage $|V_{thp}|$ of $M_o$ is about 0.66 V.
The amplifier is mainly constructed from two cross-coupled common-gate cells, $G_{mH}$ and $G_{mL}$. Some transistors, such as $M_2$ and $M_3$, are shared by the input stages of both cells. The typical bandwidth of an LDO with 100 mA output capability is about 200 kHz to 1 MHz [1]-[3]. Assuming the corner frequency is set to 100 kHz, the required $R_1$ and $C_1$ are 500 kΩ and 3 pF, respectively. Their accuracy is not critical.
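As a quick check of these component values, the corner frequency of a first-order RC high-pass network is $f_c = 1/(2\pi R_1 C_1)$. The sketch below simply evaluates this standard expression for the quoted values; the function name is illustrative.

```python
import math


def highpass_corner_hz(r_ohm, c_farad):
    """Corner frequency of a first-order RC high-pass network, in hertz."""
    return 1.0 / (2.0 * math.pi * r_ohm * c_farad)


R1 = 500e3   # ohms
C1 = 3e-12   # farads

f_corner = highpass_corner_hz(R1, C1)

if __name__ == "__main__":
    print(f"f_c = {f_corner / 1e3:.1f} kHz")   # prints "f_c = 106.1 kHz"
```

The result of about 106 kHz agrees with the 100 kHz target to within a few percent, which is consistent with the remark that the accuracy of $R_1$ and $C_1$ is not critical.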
Since most voltage references cannot supply output current [17], a voltage buffer without frequency compensation is introduced here. Because adaptive biasing is applied, the bias current of the amplifier increases at heavy load, which requires a larger driving current $I_{drive}$ from the buffer. Therefore, the aspect ratio $(W/L)_{M30}$ should be designed to provide the maximum driving current without a large overdrive voltage, which must remain smaller than the dropout voltage at all times. Otherwise, $M_{30}$ enters the linear region when the difference between $V_{IN}$ and $V_{OUT}$ is small. This provides a low-resistance path through which supply noise can couple to the inputs of the $G_m$ cells, significantly degrading the PSRR.
As shown in Fig. 4, the output of the LDO is connected to low-resistance nodes, namely the source terminals of $M_1$ and $M_3$ inside the $G_{mH}$ cell. This places the dominant pole $p_1$ at the gate of the power transistor and makes the output pole $p_2$ of the LDO nondominant. As both drive transistors $M_{44}$ and $M_{45}$ are off in the static state, there is almost no difference between the ac responses of the LDO with and without the SRE circuit. Four parts mainly contribute to the total output load capacitance $C_{load}$ in this structure: $C_{db}$ of the power MOSFET, the input capacitance $C_{in}$ of the $G_m$ cells, the coupling capacitor $C_1$ of the SRE circuit, and the parasitic output capacitance $C_{par}$ of the metal lines used for on-chip power distribution, which is generally in the range of 10-100 pF [5]. With the proposed circuit, more input transistors and capacitors are connected to the output of the LDO than in [11]. $C_{load}$ is therefore increased, which pushes $p_2$ to a lower frequency and may degrade the phase margin of the feedback loop. Stability may become even worse with a large parasitic capacitance $C_{par}$ and a small $I_{load}$.
To realize pole splitting over a wide range of $I_{load}$, from several tens of milliamperes down to several microamperes, while occupying less silicon area, an active capacitor multiplier is adopted for Miller compensation [18]. Capacitor $C_2$ serves as the Miller capacitor and is multiplied by a current buffer. The overall equivalent Miller capacitor $C_c$ is equal to $kC_2$, where $k = (S_{18}/S_{17})(S_{20}/S_{19})$ and $S_i = (W/L)_i$ is the aspect ratio of the i-th transistor.
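The multiplication factor is a product of two mirror ratios, so modest ratios already give a large effective capacitance. The sketch below evaluates $C_c = kC_2$ for the 2.3 pF value reported for $C_2$ in this design; the aspect ratios passed to the function are hypothetical, since the transistor sizes are not listed in the text.

```python
def multiplied_miller_cap(c2, s17, s18, s19, s20):
    """Equivalent Miller capacitance C_c = k * C_2 with k = (S18/S17) * (S20/S19)."""
    k = (s18 / s17) * (s20 / s19)
    return k * c2


C2 = 2.3e-12   # farads, the on-chip compensation capacitor of this design

# Hypothetical W/L ratios, chosen only to illustrate the arithmetic.
c_c = multiplied_miller_cap(C2, s17=1.0, s18=4.0, s19=1.0, s20=5.0)

if __name__ == "__main__":
    print(f"k = 20, C_c = {c_c * 1e12:.1f} pF")   # prints "k = 20, C_c = 46.0 pF"
```

In this illustrative case a 2.3 pF physical capacitor behaves like a 46 pF Miller capacitor, which shows how the multiplier trades a small amount of bias current for silicon area.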
Let $G_{m1}$ and $R_{o1}$ be the equivalent first-stage transconductance and output resistance of the LDO, $g_{mo}$ and $C_{pass}$ the transconductance and gate capacitance of the power transistor, and $R_{out}$ the overall output resistance. The frequency response of the loop can then be expressed in terms of these parameters.
Here the input resistance $1/G_{m1}$ of the error amplifier mainly determines $R_{out}$. Because adaptive biasing is applied, the poles and the GBW change with the load condition, as shown in equations (8) and (9). To ensure a phase margin larger than 45°, $p_2$ must be larger than the GBW, and this condition determines the required total Miller capacitor $C_c$.
Normally, the product $g_{mo}R_{out}$ is large enough for this compensation to be achieved without any large on-chip compensation capacitor. In this design, the required compensation capacitor $C_2$ is only 2.3 pF. The area efficiency of the LDO regulator is thus maintained, which makes it particularly suitable for chip-level power management. In addition, the capacitor multiplier introduces a left-half-plane zero $z_1$ ($g_{m17}/C_2$) at relatively high frequencies, which can be placed near the GBW to add phase and optimize the frequency compensation.
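The 45° criterion and the benefit of the left-half-plane zero can be illustrated with a textbook two-pole approximation. The sketch below is not the loop expression of this design: it assumes that the dominant pole lies far below the GBW (contributing 90° of lag), that the unity-gain frequency is approximately equal to the GBW, and that all other poles and zeros are negligible. The frequencies used are arbitrary examples.

```python
import math


def phase_margin_deg(gbw_hz, p2_hz, z1_hz=None):
    """Approximate phase margin of a loop with a dominant pole, one
    nondominant pole p2 and an optional left-half-plane zero z1."""
    pm = 90.0 - math.degrees(math.atan(gbw_hz / p2_hz))
    if z1_hz is not None:
        pm += math.degrees(math.atan(gbw_hz / z1_hz))   # LHP zero adds phase lead
    return pm


if __name__ == "__main__":
    gbw = 1e6   # example unity-gain frequency in hertz
    for ratio in (0.5, 1.0, 2.0, 4.0):
        print(f"p2 = {ratio:3.1f} x GBW -> PM = {phase_margin_deg(gbw, ratio * gbw):5.1f} deg")
    print(f"p2 = GBW with z1 = 2 x GBW -> PM = {phase_margin_deg(gbw, gbw, 2 * gbw):5.1f} deg")
```

In this approximation the phase margin is exactly 45° when $p_2$ equals the GBW and grows as $p_2$ moves higher, while a zero near the GBW recovers additional margin.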
IV. EXPERIMENTAL RESULTS AND DISCUSSION
The proposed circuit has been implemented in a 0.35 µm CMOS technology. The chip micrograph is shown in Fig. 5, where the active area of the circuit is 0.057 mm². The load transient behavior, which is mainly determined by the SR and the bandwidth of the LDO, is measured to evaluate the transient performance. The measured load-transient responses without an off-chip output capacitor are shown in Fig. 6. In Fig. 6(a), $I_{load}$ varies from 50 µA to 50 mA and $V_{IN}$ is 2.5 V; in Fig. 6(b), the load current change is increased to 100 mA. The measurements show that the proposed capacitor-less LDO recovers fully within 0.18 µs with a voltage spike of less than 251 mV. The voltage deviation and response time are much better than those in [11], even with twice the load current change.
The measured load regulation, which is determined by the low-frequency loop gain of the LDO, is shown in Fig. 8. $V_{OUT}$ varies by only about 1.7 mV when the load current changes from 1 mA to 100 mA. These results show that satisfactory load regulation is achieved with only 8 µA of quiescent current.
Table I compares the performance with some previously reported capacitor-less LDOs. A figure of merit (FOM $= T_{settle} \cdot I_Q / I_{load(max)}$) used in [5] and [11] is adopted here to evaluate different current-efficient designs for improving the transient response. A lower FOM implies better slewing performance, and the proposed regulator has the lowest FOM (0.016 ns). This feature is important and attractive for high-density SoC applications.
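The quoted FOM can be reproduced from the headline figures. The sketch below evaluates the FOM definition above with the 0.2 µs recovery time stated in the abstract, 8 µA of quiescent current and a 100 mA maximum load; the function name is illustrative.

```python
def fom_seconds(t_settle_s, i_q_a, i_load_max_a):
    """FOM = T_settle * I_Q / I_load(max), in seconds (lower is better)."""
    return t_settle_s * i_q_a / i_load_max_a


fom = fom_seconds(t_settle_s=0.2e-6, i_q_a=8e-6, i_load_max_a=100e-3)

if __name__ == "__main__":
    print(f"FOM = {fom * 1e9:.3f} ns")   # prints "FOM = 0.016 ns"
```

Using the measured 0.18 µs recovery time instead of 0.2 µs would give a slightly lower value of about 0.014 ns, so the quoted 0.016 ns corresponds to the rounded settling time.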
V. CONCLUSION
This paper presents an ultra-fast, capacitor-less LDO with an advanced common-gate error amplifier and an SRE circuit. Low-power methods such as adaptive biasing and capacitive coupling have been adopted to greatly improve both the ICMR and the loop bandwidth of the error amplifier while maintaining the traditional advantages of low quiescent current and small chip area. By applying them to an LDO with a power-efficient methodology, the accuracy and response speed are significantly enhanced. The experimental results confirm that overshoots and undershoots in the load transient of the LDO are greatly reduced as a result of the loop-gain-bandwidth enhancement. The performance is especially encouraging for chip-level power management.
