of low-voltage power converters to address these issues. The proposed approach is based on a two-stage power conversion architecture integrated on a single complementary metal-oxide-semiconductor (CMOS) die and copackaged with the passive components to form a miniaturized, integrated surface-mountable power converter.
Two-stage converters have recently been developed to try to address the fundamental limitations of single-stage converters in this general space. We begin by reviewing examples of this general approach, some of which have only been published in academic journals, while others are in commercial products. To be consistent, we focus on low-voltage step-down versions of such two-stage converters. The most basic type of two-stage converter is a cascade of two switched-mode power converters, such as a higher voltage buck converter followed by a lower voltage buck converter [1]. This alleviates the issues with large step-down ratios, but increases the solution size by requiring the use of two large inductors rather than one large inductor as in the single-stage solution. To address this issue, Sun et al. [2] introduced the idea of using an unregulated and very efficient SC voltage divider in place of the front-end buck converter. The SC voltage divider had a fixed two-to-one voltage transformation ratio, allowing the use of lower voltage switches in the second-stage buck converter. Unfortunately, with a fixed voltage divider, the power converter is not efficient over a wide input voltage range. As proposed in [3]-[5], this can be solved by utilizing an SC converter with multiple distinct voltage conversion ratios that are selected based upon the input voltage.
It has also been shown that performance in SC/magnetic two-stage converters can be enhanced beyond that of a simple cascade of stages. As shown in [3] , [4] , [6] , and [7] , merging the structure and operation of the SC and magnetic stages (e.g., by eliminating or greatly reducing the capacitor between the two stages and controlling the circuit appropriately) can be highly advantageous. In a well-formed design of a merged two-stage converter, the capacitors within the SC converter can be soft charged, thereby reducing the charge redistribution losses among the capacitors within the SC portion of the converter, enabling reductions in capacitor size and/or improvements in efficiency.
Other two-stage or quasi-two-stage converters are also possible. Instead of placing an unregulated voltage divider (such as an SC converter) in front of a switching regulator, it is also possible to place the unregulated voltage divider after the switching regulator. Vicor Corporation [8] has developed such a system. Instead of connecting two stages in series, it is also possible to connect the two stages in parallel as in the quasi-parallel or Sigma architecture [9] . Each of these approaches has benefit in some applications. However, designs that are amenable to implementation with the semiconductor devices and controls on a single CMOS die with copackaged passive components may offer the greatest benefit in many modern battery-powered applications.
This paper investigates and demonstrates a two-stage power conversion approach suitable for power-supply-in-package converters for powering logic devices in battery-operated applications. The proposed approach addresses the need for achieving extremely high power density, wide input-voltage range, low-voltage output, and high control bandwidth. We consider a two-stage design approach incorporating a variable-conversion-ratio SC stage and a high-frequency (HF) synchronous buck converter stage implemented on a single die, and copackaged with the power converter passives. This paper addresses both design techniques and packaging methods, and demonstrates the proposed approach in a 2.4-W dc-dc converter implemented in a 180-nm CMOS IC process and copackaged with its passive components for high performance. The prototype power converter operates from an input voltage of 2.7-5.5 V with an output voltage of ≤1.2 V, and achieves a 2210-W/in³ power density with ≥80% efficiency. A key contribution of this paper is the development and demonstration of a fully package-integrated two-stage SC/magnetic power converter design. This includes both semiconductor integration of the conversion stages and methods to achieve overall package integration, including copackaging of the requisite passive elements to achieve extremely high power density. Moreover, we experimentally demonstrate, for the first time, the combination of multiple voltage conversion ratios in the SC stage with the operation of the HF regulation stage to achieve a wide input-voltage range.
Section II of this paper presents an overview of the power converter architecture and its characteristics as compared with conventional synchronous buck converters. Section III presents details of the circuit design and control for the proposed system. Section IV describes the implementation and packaging strategy, and Section V shows experimental results for the prototype design. Finally, Section VI concludes this paper.
II. ARCHITECTURE
A dc-dc converter can be thought of as serving two functions. First, it provides voltage transformation from an input to an output (e.g., a step-down voltage transformation). Second, it provides a mechanism to actively regulate the output to compensate for deviations in the input voltage and load. A synchronous buck converter provides both of these functions using only two switches. However, the two switches must be rated for both the high input voltage and the high output current, limiting achievable switching frequency, power density, efficiency, and control bandwidth. Fig. 1 shows a two-stage power conversion architecture in which the transformation and regulation functions of the power converter are separated. In this architecture, the transformation stage and the regulation stage are highly optimized for their specific functions. The first stage is a variable-topology (variable-conversion-ratio) SC converter operating at low-to-moderate switching frequency (e.g., hundreds of kilohertz to low megahertz range). Owing to its use of capacitive energy transfer, it can achieve very high efficiency and power density using only small capacitors as energy storage components [10], [11]. This first stage provides a voltage step down to an intermediate voltage whose value varies over a narrow range as the system input voltage varies over a wide range (by varying the SC stage operating mode among a discrete set of states as a function of input voltage). The second stage is a synchronous buck converter that operates at HF (e.g., 10 MHz and above) to regulate the output voltage from the small, narrow-range intermediate voltage.
The two stages can be realized on a single CMOS die, with the slow SC stage realized using extended-voltage devices and the fast regulation stage realized using low-voltage core devices. This power conversion architecture is thus well suited to leveraging the device types available in modern CMOS processes. As compared with a synchronous buck converter (the most common topology for low output-voltage conversion from wide range battery inputs and power systems in package [12] ) this approach can benefit power density, efficiency, and bandwidth.
Traditional switch-mode power converters consist of a combination of magnetic and electric storage elements. The amount of energy storage required in this type of power converter is a function of the switching frequency; the higher the switching frequency, the lower the energy storage requirement, and the higher the potential control bandwidth. For very low and narrow-range input voltages, it is possible to design synchronous buck converters that operate efficiently at very high frequencies, even up to hundreds of megahertz [13], [14]. This is a result of CMOS scaling and low device voltage stresses. As shown in the appendix, the optimal switching frequency of a synchronous buck converter with CMOS devices follows a power law:
$$f_{\mathrm{opt}} = k V_{\mathrm{in}}^{\beta} \tag{1}$$
with β ranging from −3.0 to −2.5. Thus, at quite low input voltages (e.g., 1-3 V) it is feasible to create power converters operating at quite high frequencies with high-bandwidth control and small passive components (e.g., inductors and capacitors). However, as input voltage increases, the achievable switching frequency (and hence size and bandwidth) rapidly deteriorates. Moreover, as the range of input voltages for which the buck converter must operate widens, one is less able to optimize the design, reducing achievable performance for some portions of the operating range [15]. These factors limit the miniaturization and performance of buck converters for battery-powered inputs. (For example, a 2.7-5.5 V input range with a ∼1.0 V output is common for power converters supplying low-voltage devices from Li-ion batteries.)
The architecture of Fig. 1 utilizes two approaches to achieve higher performance. First, the SC stage provides the bulk of the voltage transformation function. As typical multilayer ceramic capacitors provide energy storage densities at least two orders of magnitude higher than inductors given present technological constraints, this voltage transformation can be achieved in much smaller volume with the first SC stage than can be accomplished with a conventional buck stage [3], [6], [10]. Second, since the first stage provides a variable-conversion-ratio step down, it yields a low and narrow-range intermediate voltage for the second synchronous buck regulation stage. Owing to the large magnitude of exponent β in (1) and the narrow range of its input voltage, the second stage can be designed for efficient operation at very HF, yielding reduced passive component size (especially of the inductor) and high achievable control bandwidth. As will be shown, this results in a step-down power conversion system providing extremely high power density and control bandwidth.
III. DESIGN
A prototype dc-dc converter IC has been created to demonstrate the performance and power density of the proposed power converter architecture. All of the power devices and control circuits are monolithically integrated onto a single IC (180-nm CMOS process) to minimize the size of the silicon real estate, while also maximizing the electrical performance of the prototype power converter. Fig. 2 shows a block diagram of the power converter.
Inside the prototype power converter, the transformation stage efficiently provides an intermediate voltage V_x rail that varies over a narrow range of low voltages as the input voltage V_in varies over a wider range of higher voltages. To achieve this behavior, a reconfigurable series-parallel SC converter was selected. As shown in Fig. 2, the SC converter is composed of power MOSFETs M_1-M_7 and energy storage
Within the synchronous buck converter, the high-side device M_H is a core pMOS transistor while the low-side device M_L is a core nMOS transistor. These devices were selected for easy drivability and are driven by simple tapered inverters. A driver tapering factor in the 8-10 range was chosen to minimize the combined switching and gate driver loss. Care was taken to match the delay of the drivers. Furthermore, to prevent the shoot-through condition in M_L and M_H, two nonoverlapping clock signals Φ_p and Φ_n are fed into the drivers. The nonoverlapping period is a few hundred picoseconds, approximately two to three times longer than the turn-ON and turn-OFF time of the main power devices.
The prototype power converter was designed to operate over the ranges given in Table I. The voltage and power levels were chosen so the power converter could be used in a wide variety of applications. For example, the input voltage V_in range of 2.7-5.5 V allows the power converter to work with most lithium-ion battery chemistries. Likewise, the allowed ranges of the output voltage V_o and the output power P_o are suitable for a wide range of digital electronics, such as mobile application processors and digital signal processors. Fig. 3 shows the two different SC network configurations for each voltage conversion ratio of the transformation stage. In Fig. 3(a) and (c), the capacitors are charged, while in Fig. 3(b) and (d), the capacitors are discharged.
The maximum designed V_in of the prototype power converter is 5.5 V, so considering the SC circuit topology, the IC process must have devices with a voltage rating of at least 3.66 V. Therefore, a 180-nm CMOS process with two different device flavors was selected. It has low-voltage core devices with a maximum rated voltage of 2.0 V and high-voltage IO devices with a maximum rated voltage of 6.0 V. The low-voltage devices are ideally suited for the regulation stage, while the high-voltage devices are suitable for use in the transformation stage.
The transformation stage is controlled such that V_x never exceeds 2.0 V and is never below 1.5 V. The voltage conversion ratio of the SC stage is dynamically set to 2:1 when V_in is below 4.0 V and 3:1 when V_in is in the 4.0-5.5 V range. This is important because the regulation stage, which follows the transformation stage, is built using low-voltage core devices with a maximum rating of 2.0 V. In addition, the lower 1.5 V specification ensures that the regulation stage will have enough headroom to regulate a V_o of at least 1.2 V. With appropriate control, the regulation stage is able to operate effectively up to at least a 90% duty cycle, yielding a maximum V_o of 1.35 V.
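The selection rule and headroom figures above can be illustrated with a short sketch. It is not the prototype's control logic: the function names are hypothetical, the SC stage is treated as ideal (lossless and unloaded), and the worst-case switch stress is assumed to be two thirds of V_in, a fraction chosen only because it reproduces the quoted 3.66 V requirement.

```python
# Illustrative sketch only. Thresholds and ratings are the values quoted in the
# text; function names and the 2/3 stress fraction are assumptions.

CORE_DEVICE_RATING_V = 2.0   # maximum rated voltage of the low-voltage core devices
RATIO_THRESHOLD_V = 4.0      # below this V_in the SC stage runs 2:1, otherwise 3:1
MAX_DUTY = 0.90              # highest duty cycle quoted for the regulation stage


def sc_ratio(v_in: float) -> int:
    """Return N for the N:1 step-down ratio selected at a given input voltage."""
    if not 2.7 <= v_in <= 5.5:
        raise ValueError("input voltage outside the 2.7-5.5 V design range")
    return 2 if v_in < RATIO_THRESHOLD_V else 3


def ideal_intermediate_voltage(v_in: float) -> float:
    """Ideal (no-load, lossless) intermediate voltage V_x = V_in / N."""
    return v_in / sc_ratio(v_in)


def worst_case_switch_stress(v_in_max: float, fraction: float = 2.0 / 3.0) -> float:
    """Assumed worst-case blocking voltage of an SC switch (fraction of V_in)."""
    return fraction * v_in_max


def max_output_voltage(v_x: float, max_duty: float = MAX_DUTY) -> float:
    """Ideal buck relation V_o = D * V_x evaluated at the maximum duty cycle."""
    return max_duty * v_x


if __name__ == "__main__":
    # The ratio switch at 4.0 V keeps the ideal V_x within the core device rating.
    for v_in in (3.6, 3.99, 4.0, 5.5):
        v_x = ideal_intermediate_voltage(v_in)
        ok = v_x <= CORE_DEVICE_RATING_V
        print(f"V_in = {v_in:.2f} V -> {sc_ratio(v_in)}:1, within core rating: {ok}")
    print(f"SC switch stress at 5.5 V: {worst_case_switch_stress(5.5):.2f} V")
    print(f"Max V_o from V_x = 1.5 V: {max_output_voltage(1.5):.2f} V")
```

Keeping the ratio decision in one small function makes the 4.0 V threshold easy to adjust if hysteresis or a different device rating were considered.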
To control V_o, a feedback loop is wrapped around the regulation stage, while the transformation stage is run in open loop. (The transformation stage is operated at a fixed frequency in the prototype, though dynamic selection of the SC switching frequency based on V_in and/or the load would provide efficiency benefits.) A controller is required to determine the duty cycle of the power devices within the regulation stage such that V_o is regulated in spite of variations at the input or output port of the prototype power converter. A linear voltage-mode controller was used for this task because it is straightforward to design and implement.
In the controller, V_o is compared with a reference voltage and the residual is conditioned by a type three voltage compensator. The design of the compensator sets the bandwidth and steady-state accuracy of the control loop. The output of the compensator is a control signal, which is fed into a pulsewidth modulator in which the control signal is compared with a triangular waveform, thereby producing a double-sided pulsewidth modulation (PWM) gate signal for the power stage. The PWM signal is further conditioned by a dead-time control circuit. Fig. 4 shows the small-signal loop gain response of the power converter when V_x is equal to 1.8 V and V_o is equal to 1.0 V.
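The pulsewidth modulator described above can be sketched in normalized form as follows. This is an illustrative model, not the prototype's circuit: the function names, the unit-amplitude carrier, and the sample count are assumptions, and the control level is simply set to the ideal buck duty cycle D = V_o/V_x for the operating point of Fig. 4.

```python
# Illustrative sketch only: a sampled, normalized model of a double-sided
# (triangle-carrier) pulsewidth modulator. All names are hypothetical.

def triangle_carrier(phase: float) -> float:
    """Symmetric triangle over one period: 1 at the period edges, 0 at mid-period."""
    return abs(2.0 * phase - 1.0)


def pwm_gate(control: float, samples: int = 1000) -> list[bool]:
    """Gate is high wherever the control signal exceeds the carrier."""
    return [control > triangle_carrier((i + 0.5) / samples) for i in range(samples)]


def duty_cycle(gate: list[bool]) -> float:
    """Fraction of the switching period for which the gate signal is high."""
    return sum(gate) / len(gate)


if __name__ == "__main__":
    v_x, v_o = 1.8, 1.0
    control = v_o / v_x  # ideal buck duty cycle D = V_o / V_x
    gate = pwm_gate(control)
    print(f"commanded D = {control:.3f}, modulator D = {duty_cycle(gate):.3f}")
```

Because the carrier is symmetric, the pulse stays centered in the period and both of its edges move as the control signal changes, which is what distinguishes double-sided from single-edge modulation.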
To respond to large slew rates imposed by digital logic, the prototype power converter must have either a large control bandwidth or a large amount of output capacitance. Since the goal of this design is to create a high-power-density design, we have chosen to push up the control bandwidth instead. The power converter's control loop was designed to be closed at 6 MHz with a 60° phase margin, as shown in Fig. 4.
As described in detail in [16] , the circuit design and operating parameters were carefully optimized to achieve a good tradeoff between power density and efficiency given the selected semiconductor process and available passive component technologies. Table II shows the optimized parameters for the transformation stage and the regulation stage.
As can be seen, the component values and device sizes are commensurate with the switching frequencies and conversion requirements of each stage. The energy density of multilayer ceramic capacitors (MLCCs) is a few orders of magnitude higher than that of ferrite inductors, and consequently, the volume consumed by the passive components within the transformation stage is roughly similar to the volume consumed by the passive components within the regulation stage.
IV. IMPLEMENTATION
In power electronics, packaging is often one of the most critical aspects to achieving high power density. As power density increases, a given amount of power is processed in a smaller space. This can lead to both electrical and thermal issues that need to be managed. (Packaging was a limiting factor in both efficiency and power density of the power converter in [6] , for example.) As a result, we developed a packaging scheme that enabled us to achieve both a high power density and excellent electrical performance. The goal of the package was to create a self-contained power converter module that did not require any external passive components, with solder bumps for surface mounting to a target board. The resulting module could then be soldered directly onto a standard PCB as an SMT package. The prototype IC is the heart of the power converter module. Fig. 5 shows a die image of the fabricated IC.
The overall packaging scheme of the prototype power converter is shown in Fig. 6 . In this packaging scheme, the silicon IC, which is thinned to a thickness of 200 μm, is flipped over with its device layer facing upward. A first level of bumps (3 mil gold stud bumps with a 220 μm pitch) connects the IC to a high-density interconnect interposer, which has four metal layers and a total thickness of 250 μm. The passive components, such as the capacitors and inductors, are then soldered onto the opposite side of the interposer.
The IC and interposer are laid out such that each passive component is oriented directly above the particular location on the silicon IC to which it is to be electrically connected. This ensures that the electrical path between the transistors and their respective passive components is very short. Furthermore, a large number of parallel interconnects are utilized to widen the electrical path. Both of these techniques reduce the inductance and resistance between the silicon IC and its passive components, thereby reducing energy loss. (Much more detail about the design and layout of both the IC and the interposer can be found in [16] .)
To complete the package, a second level of bumps (i.e., C5) is formed on the interposer on the same side as the IC is mounted. These bumps provide electrical connections of the system to the outside world. The pitch of the second level of bumps is 500 μm, which allows the module to be mounted on a standard PCB. Note that with the packaging structure of Fig. 6, there is also the opportunity to remove heat directly from the back side of the thinned IC die into the target board. Fig. 7 shows a close-up photograph of the module with three 100-nH Coilcraft inductors (PFL1005-101MRU) placed in parallel to form the buck inductor, and Fig. 8 shows a cross section (inductors not shown). To prevent damaging the IC during testing, we decided to place the Coilcraft inductors off to the side of the interposer. However, there is adequate space to place the inductors on top of the interposer.
The prototype module is quite small: it has a width of 6.96 mm, a length of 2.56 mm, and a height of 1.0 mm, thereby yielding a footprint of 17.80 mm² and a total volume of 17.80 mm³. Based upon Table III above, the components consume 29.5% of the total package. Consequently, to improve the power density further, the components could be packed closer together, thereby reducing the wasted space.
V. EXPERIMENTAL RESULTS
In this section, we introduce test results from two different prototype modules. The first module has superior electrical performance to that of the second module. A few of the bumps within the first level of bumps in the second module are missing, thereby increasing the resistive losses. It is also worth noting that the data in Figs. 9-13 were taken when the filter inductor L was replaced by a 27.3-nH air-core inductor from Coilcraft (0908SQ-27NJLB). This substitution was made because the core loss parameters for the PFL1005-101MRU inductors were unknown, which made it difficult to correlate the measured losses with the simulated losses. Furthermore, two different SC stage switching frequencies were tested (580 kHz and 1 MHz); dynamic control of the SC stage switching frequency would have yielded higher performance, but was not implemented.
The estimated loss breakdown of the first module, based on models detailed in [16], is shown in Fig. 12. It can be seen that the dominant losses of the system are capacitive switch losses, resistive switch losses, and resistive interconnect loss. At light load, the loss is dominated by the capacitive losses, while at heavy load, the loss is dominated by the resistive losses. Despite the 20-MHz switching frequency of the buck converter, the commutation losses are low due to the low supply voltage applied to the synchronous buck converter, which is a benefit of the proposed architecture. To better gauge the power loss contribution of each stage in Fig. 12, the efficiency of the SC stage and the buck stage across output current are displayed in Fig. 13.
Fig. 14 shows the measured efficiency of the second prototype module with an air-core filter inductor and a magnetic-core filter inductor (three parallel PFL1005-101MRU). The dotted line is a projection (detailed in [16]) that assumes the magnetic-core inductor has the same dc resistance as the air-core inductor. In this figure, the input voltage is 5.0 V and the output voltage is 1.0 V. As can be seen, the efficiencies of the projection and of the system with the air-core inductor are quite similar when the output current is above 0.5 A. At light load, however, the deviation in efficiency is as high as 7% at 0.1 A. This behavior is consistent with the addition of core loss. Therefore, the impact on performance using magnetic-core inductors is minor for most load currents even at the second-stage switching frequency of 20 MHz.
Fig. 14. Magnetic-core inductor versus air-core inductor of second module (f_sc = 500 kHz and f_b2 = 20 MHz).
The air-core inductor has a height of 1.829 mm with a 2.972 mm × 2.134 mm footprint, whereas the magnetic-core inductors have a total height of 0.71 mm with a 1.14 mm × 0.635 mm total footprint, thereby resulting in a 23-fold reduction in inductor volume using a cored magnetic element for a small efficiency penalty.
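The volume comparison above can be checked with a few lines of arithmetic. The sketch below treats each inductor (or inductor group) as a rectangular box using the dimensions quoted in the text; the helper name is hypothetical.

```python
# Illustrative arithmetic check using the dimensions quoted in the text.
# The bounding-box approximation and the helper name are assumptions.

def box_volume_mm3(height_mm: float, length_mm: float, width_mm: float) -> float:
    """Volume of the rectangular bounding box of a component, in cubic millimeters."""
    return height_mm * length_mm * width_mm


air_core_mm3 = box_volume_mm3(1.829, 2.972, 2.134)       # single air-core inductor
magnetic_core_mm3 = box_volume_mm3(0.71, 1.14, 0.635)    # magnetic-core inductors, total
volume_reduction = air_core_mm3 / magnetic_core_mm3

if __name__ == "__main__":
    print(f"air-core volume:      {air_core_mm3:.2f} mm^3")
    print(f"magnetic-core volume: {magnetic_core_mm3:.3f} mm^3")
    print(f"volume reduction:     {volume_reduction:.1f}x")
```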
To measure the transient response, the second prototype module was loaded with an electronic load. Fig. 15 shows the response to a load step (between 50 mA and 1 A) with a 16-μs rise/fall time at an input voltage of 3.6 V. The top curve in blue is the intermediate voltage at 500 mV/div, the middle curve in red is the output voltage at 20 mV/div, and the bottom curve in green is the output current at 500 mA/div.
The load step does not produce a noticeable over-shoot or under-shoot because the bandwidth of the regulation stage is very high (approximately 6 MHz). This high achievable control bandwidth (owing to the high switching frequency of the second stage) is a particular benefit of the architecture. The output voltage has a voltage ripple of approximately 20-mV peak-to-peak, which is mainly due to the small output capacitor used; this could be reduced using additional output capacitance.
There are numerous integrated power converter modules for this general operating range in the marketplace today. Ease of use, reliability, and simplicity (from a PCB footprint perspective) drive their appeal. Table IV shows a comparison of various commercial power converter modules along with the first prototype module from this paper.
As can be seen from Table IV, the prototype module has a maximum output power of 2.4 W and consumes a volume of 17.80 mm³. Consequently, the power density is 2210 W/in³ with a peak efficiency near 84% at an output voltage of 1.2 V. In comparison, the EN5329QI, which has similar specifications with regard to input voltage range, maximum output current, and efficiency, has a power density of 650 W/in³. This corresponds to a 3.4-fold improvement in power density as compared with this typical commercial design, and the power density achieved here exceeds (by roughly a factor of two or more) all of the commercial modules of Table IV.
The volume of each module is calculated by multiplying the total solution size, which includes the IC along with its external components, by the height of the tallest component. For example, the solution size of the EN5329QI is 55 mm² (acquired from the manufacturer's website) with a height of 1.1 mm, yielding a volume of 60.50 mm³. In the case of the prototype power converter, the volume is equal to the size of the module since no external components are required for proper operation. Fig. 16 plots the power density versus switching frequency of the power converter modules from Table IV. As can be seen, the power density increases as the switching frequency increases. This behavior is the reason there is a continued trend to push up the switching frequency in both industry and academia. The ongoing challenge is then to increase the frequency and decrease the power converter volume without sacrificing efficiency. The architecture and construction method investigated here are beneficial in this regard.
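The volume and power-density figures quoted above can be reproduced with a short calculation. The sketch below uses only numbers stated in the text; the function names are hypothetical, and the conversion between cubic millimeters and cubic inches follows from 1 in = 25.4 mm.

```python
# Illustrative arithmetic check of the quoted volume and power-density figures.
# Function names are hypothetical; input numbers are taken from the text.

MM3_PER_IN3 = 25.4 ** 3  # cubic millimeters in one cubic inch


def module_volume_mm3(solution_area_mm2: float, tallest_component_mm: float) -> float:
    """Solution footprint multiplied by the height of the tallest component."""
    return solution_area_mm2 * tallest_component_mm


def power_density_w_per_in3(power_w: float, volume_mm3: float) -> float:
    """Output power divided by volume, expressed in W/in^3."""
    return power_w / volume_mm3 * MM3_PER_IN3


prototype_density = power_density_w_per_in3(2.4, 17.80)
en5329qi_volume = module_volume_mm3(55.0, 1.1)
improvement = 2210.0 / 650.0  # quoted densities: prototype versus EN5329QI

if __name__ == "__main__":
    print(f"prototype power density: {prototype_density:.0f} W/in^3")
    print(f"EN5329QI volume:         {en5329qi_volume:.2f} mm^3")
    print(f"improvement factor:      {improvement:.1f}x")
```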
It is likely that most or all of the power converters in the commercial modules of Table IV are synchronous buck converters, though the authors have no direct means of verifying this. In light of this, Table V includes a comparative listing of some previous academic work in this space with different power converter topologies.
The prototype module described here achieves the largest voltage step-down ratio and input voltage range ratio of these comparable works, while operating at a higher switching frequency than many of the other power converters and with 
VI. CONCLUSION
This paper has presented a two-stage power conversion architecture, design techniques, and packaging approach for integrated power converters operating from battery-scale input voltages and supplying low-voltage outputs. The proposed approach leverages the devices available in low-voltage CMOS processes to achieve higher power density and improved transient response as compared with more conventional power converter designs. The proposed packaging approach yields low-parasitic interconnections of the CMOS die to the passive components, facilitating operation at elevated switching frequencies (20 MHz in the prototype design), and provides a surface-mount power system in package (PSIP). The first-generation design operates from an input voltage of 2.7-5.5 V with an output voltage of ≤1.2 V, and achieves a 2210 W/in³ power density with ≥80% efficiency. It is anticipated that the proposed approach will find application in a range of power converter designs for low-voltage systems.
APPENDIX
In this appendix, we explore how the achievable switching frequency of dc-dc converters in deep submicrometer CMOS processes scales with process voltage. For simplicity, we focus on CMOS synchronous buck converters with a switching frequency f_sw. Consider how the device-scaling characteristic influences converter performance. As shown in [20], the MOSFET losses in a synchronous buck converter can be modeled using an effective bridge capacitance C_b and an effective bridge resistance R_b, as shown in Fig. 17, where W_L, W_H, C_L0, C_H0, R_L0, and R_H0 are the widths, effective specific capacitances, and specific ON-resistances of the switches, respectively. Typically, the gate capacitance is the dominant contributor to dynamic loss in low-voltage CMOS processes. Therefore, this model ignores commutation loss.
In this model, we have defined the following parameters as: and
where the duty cycle is equal to
The optimal width ratio of W H to W L can be found by a constrained minimization of C b at a constant R b , yielding [20] 
Assuming that the power loss is only in the MOSFETs, then the total power loss is a combination of the static loss and dynamic loss given by
where
and
The optimal C_b can be found by minimizing P_loss in (6), yielding
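which results in a minimum power loss of
$$P_{\mathrm{loss(min)}} = 2 W_{\mathrm{opt}} C_{b0} V_{\mathrm{in}}^{2} f_{\mathrm{sw}} = 2 I_o V_{\mathrm{in}} \sqrt{R_{b0} C_{b0} f_{\mathrm{sw}}}. \tag{10}$$
An optimal f_sw can be chosen given a desired P_loss(min) or, equivalently, a desired efficiency. This optimum is
$$f_{\mathrm{opt}} = \frac{P_{\mathrm{loss(min)}}^{2}}{4 I_o^{2} V_{\mathrm{in}}^{2} R_{b0} C_{b0}}. \tag{11}$$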
In addition, by combining (4), (5), (7), (8), and (11), holding I_o and P_loss(min) constant, and assuming R_H0 C_H0 ∝ V_in and R_L0 C_L0 ∝ V_in, it can be shown that the optimal f_sw is
and
In (14), b represents a relative performance factor for S_H and S_L. If both devices exhibit the same RC product (e.g., are both nMOS), then b equals one. More typically, S_H is a pMOS and S_L is an nMOS, so b is closer to three. Furthermore, it can be shown empirically that f_opt fits the power law
$$f_{\mathrm{opt}} = k V_{\mathrm{in}}^{\beta} \tag{15}$$
assuming both β and k are functions of b and V_o, and given V_in > 2V_o. Fig. 18 shows how the exponent β varies as a function of b for various V_o.
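The step from the loss model to (10), and the optimum frequency that follows from it, can be sketched as below. This is a hedged reconstruction rather than the original derivation: it assumes that the bridge resistance and capacitance scale with a common width factor W as R_b = R_b0/W and C_b = W C_b0, a form chosen because it reproduces both equalities in (10). The total loss is taken as the sum of a static term I_o² R_b and a dynamic term C_b V_in² f_sw.

```latex
\begin{align*}
P_{\mathrm{loss}} &= I_o^{2} R_b + C_b V_{\mathrm{in}}^{2} f_{\mathrm{sw}}
  = \frac{I_o^{2} R_{b0}}{W} + W C_{b0} V_{\mathrm{in}}^{2} f_{\mathrm{sw}}, \\
\frac{\partial P_{\mathrm{loss}}}{\partial W} = 0
  \;&\Rightarrow\; W_{\mathrm{opt}} = \frac{I_o}{V_{\mathrm{in}}}
  \sqrt{\frac{R_{b0}}{C_{b0} f_{\mathrm{sw}}}}, \\
P_{\mathrm{loss(min)}} &= 2 W_{\mathrm{opt}} C_{b0} V_{\mathrm{in}}^{2} f_{\mathrm{sw}}
  = 2 I_o V_{\mathrm{in}} \sqrt{R_{b0} C_{b0} f_{\mathrm{sw}}}, \\
f_{\mathrm{sw}} &= \frac{P_{\mathrm{loss(min)}}^{2}}{4 I_o^{2} V_{\mathrm{in}}^{2} R_{b0} C_{b0}}.
\end{align*}
```

Under these assumptions the static and dynamic losses are equal at the optimum width, which is where the factor of two in (10) comes from.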
What can be concluded from (15) and Fig. 18 is that f_opt increases very rapidly with decreasing V_in. For example, if b = 3 and V_o = 1.0 V, then β = −2.67. Therefore, a buck converter with a V_in of 1.8 V should be able to switch 15.3 times faster than a buck converter with a V_in of 5.0 V with equal power loss in both cases. This leads to less energy storage in the filter elements (L and C) for a given dynamic and static response.
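The 15.3-fold figure follows directly from the power law (15), since the constant k cancels when two input voltages are compared at the same b and V_o. A minimal sketch, with a hypothetical function name:

```python
# Illustrative check of the frequency scaling implied by f_opt = k * V_in**beta.
# The function name is hypothetical; k cancels in the ratio.

def frequency_ratio(v_in_low: float, v_in_high: float, beta: float) -> float:
    """Ratio f_opt(v_in_low) / f_opt(v_in_high) for a common k and exponent beta."""
    return (v_in_low / v_in_high) ** beta


speedup = frequency_ratio(1.8, 5.0, -2.67)

if __name__ == "__main__":
    print(f"beta = -2.67: a 1.8 V stage can switch {speedup:.1f}x faster than a 5.0 V stage")
    # The same comparison across the range of exponents quoted in Section II.
    for beta in (-2.5, -3.0):
        print(f"beta = {beta}: {frequency_ratio(1.8, 5.0, beta):.1f}x")
```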
