Distributed power delivery is becoming common in SoC power systems because fine-grained power management needs separate power sources to adjust each voltage island dynamically. In addition, dedicated power sources for critical circuit blocks can achieve better signal integrity. To make full use of power modules that are redundant or idle, this work applies the cooperation concept to SoC power management. The key controller is a mixed-signal estimator that executes the intelligent procedures: it swaps power modules in real time depending on their loading and health conditions, automatically configures the power system with phase interleaving, and supports all the peripheral functions. To demonstrate the proposed concept, a prototype chip for voltage down-conversion is implemented. This chip contains four switched-inductor converter modules to emulate the cooperative power network. Each module is small, so its power efficiency is not optimal under heavy load. With cooperation between the power modules, the power efficiency is 88% for a 300 mA load, which is 8.5% higher than with single-module operation.
Introduction
Distributed power systems have been widely used in high-power delivery applications such as micro-grid power systems [1] and voltage regulator modules (VRMs) for CPUs [2]. A system with multiple modules helps to deliver high power, and the associated phase interleaving technique effectively reduces the output ripple while using smaller filter components.
Recently, the distributed power delivery concept has been applied to system-on-chip (SoC) and system-in-package (SiP) integration [3], [4]. The motivation comes not only from concerns about power deliverability but also from the need for fine-grained power management. With ever larger-scale system integration using ever smaller devices, power issues have become the bottleneck of today's SoC development. To improve the power usage efficiency and control the power loss [5], many advanced techniques such as leakage power reduction, aggressive active and sleep mode control, subthreshold operation [6], and dynamic voltage and frequency scaling (DVFS) [7], [8] have been widely applied in SoC products. For example, Intel [9], IBM [10], Oracle [11], and Hitachi and Renesas [12] all applied the fine-grained power management concept, with multiple power domains, separate voltage islands, and DVFS techniques, in their processors for server and mobile applications. In addition to handling power delivery and executing fine-grained power control, a dedicated power module is needed for sensitive function blocks to preserve signal integrity. For example, for the high-speed data transceiver in an IBM processor, distributed microregulators were developed to locally regulate the supply against the digital switching noise [3].
Manuscript received December 18, 2015. Manuscript revised January 25, 2016.
† The authors are or were with the Department of Electrical Engineering, National Tsing Hua University, Taiwan.
* Presently, with FocalTech Systems.
** Presently, with Taiwan Semiconductor Manufacturing Company (TSMC), Taiwan.
*** Part of this work was presented at IEEE A-SSCC 2013.
a) E-mail: pchuang@ee.nthu.edu.tw
DOI: 10.1587/transele.E99.C.606
To support advanced power management techniques, distributed power modules are essential, so that the power source for each function block can be dynamically assigned under different operation modes. Embedded sensors are usually inserted to monitor the chip operation and the circuit health conditions in real time. Figure 1 shows an SoC with distributed power modules and the associated monitoring circuits.
In practice, not every function in every power domain is active all the time. In this work, we explore a cooperative scheme for the power delivery system. The power resources can be shared to reduce the hardware cost. The load condition of each module can be adaptively balanced, taking its health condition into account. Dynamic power allocation may also alleviate the hot spot problem.
This paper is organized as follows. The cooperative power management concept and the operation algorithm are introduced in Sect. 2. In Sect. 3 we explain the fundamental load sharing controls for the distributed power modules. Key functions to support the cooperation are discussed in Sect. 4. The prototype chip implementation is presented in Sect. 5. Finally, a conclusion is drawn in the last section.
Copyright © 2016 The Institute of Electronics, Information and Communication Engineers
Cooperative Power Management
Optimal power efficiency requires precise power control in all aspects. For this purpose, each circuit should be operated at exactly the supply voltage and clock rate that its performance requires. In association with system-level estimation, many fine-grained power management schemes have been proposed [13]. These algorithms place new demands on the power modules. First, the power modules have to switch dynamically between different power modes over a wide range of operating conditions. Regulating the power conversion quickly while keeping the power efficiency high is the challenge. Second, as more function blocks are integrated, the hardware cost of the power modules will rise exponentially. A smart control algorithm should leverage the power sources to maximize the module utilization. Third, to guarantee the system reliability, embedded sensors are necessary to monitor the health condition of each power module.
The utilization of distributed power resources is an interesting research topic in advanced power management systems. In this work, we introduce the cooperative concept into SoC power delivery system design. The basic idea is that, if a power domain is idle or lightly loaded, its spare power resources can support another power domain under heavy load. Ideally, this hardware sharing can reduce the cost, improve the reliability, and alleviate the thermal problem, provided that the system contains many mutually exclusive power domains. Figure 2 shows the conceptual block diagram of the cooperative power management scheme. An SoC contains several distributed power modules. Each module has a local controller to regulate its output. To support the cooperative function, a power delivery network with additional power traces connects the distributed power sources to the different loads. The adaptive control algorithm implemented in the estimator configures the power network and balances the power resources, taking the load requests and the power module health into account. The load request can be analyzed by interpreting the operating system instructions, and the power module health can be monitored by the embedded voltage, current, and temperature sensors.
Based on the system operation, the power management algorithm assigns each power module to be active or idle. The system becomes even more dynamic when cooperative operation is involved. A power module may serve another active load if the system estimation indicates a power advantage. Based on the real behavior of power conversion and regulation, we suggest the following guidelines for module status changes:
1. Only one module can be added or dropped at a time. Too many power modules switching their status simultaneously may cause chaotic behavior.
2. The guard time between two status changes must be long enough. The power system has to settle down before the next status change.
3. The estimator scores each power module with an evaluation index. A power module is dropped from the active list if its evaluation index is lower than the threshold value. Conversely, a power module can be added to the active list if its evaluation index is higher than that of an on-line module.
4. A hazardous power module, for example one that is over current or over temperature, should get the lowest evaluation index, and equivalently the highest priority to be shut down.
5. If the evaluation indices are the same for two or more modules, the swap sequence follows the order that is pre-defined in the estimator.
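The five guidelines above can be sketched as a small decision routine. This is only an illustrative sketch, not the estimator implemented in the prototype: the class names, the numeric threshold, the guard time, and the convention that a higher index is better are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Module:
    name: str
    index: float             # evaluation index; in this sketch, higher is better
    active: bool = False
    hazardous: bool = False  # e.g. over-current or over-temperature flag


@dataclass
class Estimator:
    modules: list            # list order doubles as the pre-defined swap order
    threshold: float         # hypothetical drop threshold
    guard_time: float        # hypothetical settling time between changes
    last_change: float = float("-inf")

    def step(self, now):
        """Apply at most one status change (guideline 1) and describe it."""
        # Guideline 2: do nothing until the guard time has elapsed.
        if now - self.last_change < self.guard_time:
            return None
        # Guideline 4: a hazardous module gets the lowest possible index.
        for m in self.modules:
            if m.hazardous:
                m.index = float("-inf")
        active = [m for m in self.modules if m.active]
        idle = [m for m in self.modules if not m.active]
        # Guideline 3 (drop): min() keeps the first of equal candidates,
        # so ties follow the pre-defined order (guideline 5).
        if active:
            worst = min(active, key=lambda m: m.index)
            if worst.index < self.threshold:
                worst.active = False
                self.last_change = now
                return f"drop {worst.name}"
        # Guideline 3 (add): an idle module joins if it outscores an
        # on-line module; max() also keeps the first of equal candidates.
        if active and idle:
            best = max(idle, key=lambda m: m.index)
            worst = min(active, key=lambda m: m.index)
            if best.index >= self.threshold and best.index > worst.index:
                best.active = True
                self.last_change = now
                return f"add {best.name}"
        return None
```

Calling `step` periodically with the current time yields at most one "add" or "drop" decision per guard interval, which mirrors guidelines 1 and 2.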
There are several fundamental rules for power system design. For safety, if a power module is not healthy, for example when its operating current or temperature exceeds the specification, the estimator must shut down that active module immediately. In addition, the power delivery network has to be evaluated carefully at the design stage [14]. Each power module should not be far away from its load, because larger loading effects along the power delivery path degrade the power efficiency. The system estimator plays an important role: it executes these evaluation algorithms, updates the evaluation index for each module, and decides the active module list for each load.
For N modules in the cooperative power management [...] 1011 or 0111 means that module 1 or 2 is added to the working list, and 0010 or 0001 means that power module 3 or 4 is dropped. Point PFM is a special case in which the system enters the very light load condition. In this state, only one module is active, using the PFM control scheme.
Load Sharing in Distributed Power System
In an SoC distributed power system, there are three possible candidates for power conversion and regulation: the linear regulator, the switched-inductor converter, and the switched-capacitor converter. Each candidate has its advantages and challenges in system integration. Generally, the linear regulator is easy to integrate on chip, but its dropout voltage limits the power efficiency. The switched-inductor converter achieves high power efficiency, but the cost of the inductor is critical. The size of the switched-capacitor converter is proportional to the load current, so power density is its limiting factor.
Cooperative power delivery means that several power modules may be dynamically grouped to serve one heavy load. Because the source impedance of each power module differs, particular attention must be paid when connecting power modules in parallel. Mismatches between power modules unbalance the currents, which may overload some modules. Figure 4 shows a simplified distributed power system with two parallel power modules. To share the load properly, two control schemes are available [15]. One is the passive droop current sharing method, in which the output power of each module is adjusted by the source impedances Z_1 and Z_2. The other is the active current sharing method, in which the delivered power is adjusted by the inner current control loop. In this work, the latter is adopted because the passive droop method suffers from more efficiency loss. Furthermore, the inner current control loop can be easily extended to cooperative power management. More details are given in Sect. 4.2.
Fig. 4 The control schemes for parallel load sharing. This work adopts the active current sharing control method with the inner current control loop.
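To illustrate why source impedance mismatch matters under passive droop sharing, the following minimal sketch solves the two-module case. It assumes purely resistive source impedances and identical open-circuit voltages, which is a simplification of the system in Fig. 4; the function names and the numeric values are hypothetical.

```python
def droop_share(i_load, z1, z2):
    """Split a load current between two parallel modules under passive droop.

    Assumes both modules have the same open-circuit voltage and purely
    resistive source impedances z1 and z2 (ohms). Equal output voltage
    then requires i1 * z1 == i2 * z2 with i1 + i2 == i_load.
    """
    i1 = i_load * z2 / (z1 + z2)
    i2 = i_load * z1 / (z1 + z2)
    return i1, i2


def droop_loss(i_load, z1, z2):
    """Power dissipated in the two droop impedances (watts)."""
    i1, i2 = droop_share(i_load, z1, z2)
    return i1 * i1 * z1 + i2 * i2 * z2


# Hypothetical example: 0.3 A load, 0.10 ohm and 0.15 ohm source impedances.
i1, i2 = droop_share(0.3, 0.10, 0.15)
print(f"module 1: {i1 * 1e3:.0f} mA, module 2: {i2 * 1e3:.0f} mA")
print(f"droop loss: {droop_loss(0.3, 0.10, 0.15) * 1e3:.1f} mW")
```

In this model the module with the lower source impedance always carries more current, and the droop impedances dissipate power whenever current flows, which is in line with the efficiency argument made above for choosing the active method.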
Multi-phase configuration is a common technique for switching-type power converters to reduce the output voltage ripple. Figure 5 depicts its operation concept. System analysis indicates that, with four-phase interleaving, the output ripple can be three times smaller than in single-phase operation.
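The ripple cancellation behind interleaving can be checked numerically. The sketch below sums idealized, normalized inductor ripple currents of N buck phases shifted by 1/N of the switching period. It is not the system analysis referred to above: the triangular waveform model, the duty ratios, and the function names are assumptions, and the result is the summed current ripple rather than the output voltage ripple.

```python
def phase_ripple(t, duty):
    """Normalized ripple current of one buck phase (period 1, peak-to-peak 1).

    The current ramps up during the on-time [0, duty) and ramps down
    during the rest of the period.
    """
    t = t % 1.0
    if t < duty:
        return t / duty - 0.5
    return (1.0 - t) / (1.0 - duty) - 0.5


def total_ripple_pp(n_phases, duty, samples=12000):
    """Peak-to-peak ripple of the summed current of n interleaved phases."""
    values = []
    for k in range(samples):
        t = k / samples
        total = sum(phase_ripple(t - p / n_phases, duty)
                    for p in range(n_phases))
        values.append(total)
    return max(values) - min(values)


for n in (1, 2, 4):
    print(n, round(total_ripple_pp(n, duty=0.36), 3))
```

In this idealized model the reduction depends strongly on the duty ratio: four phases cancel the ripple completely at a duty ratio of 0.5, and leave roughly 0.27 of the single-phase ripple at a duty ratio of 0.36.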
In fine-grained power management applications, the power system may change its state frequently, so the activity of each power module is even more dynamic. The transient performance, especially the disturbance and response time during an operation mode change, is important. In addition, because the load assigned to each module varies, the efficiency of the power module must stay high over a wide load range.
Key Functions for Cooperation
In an SoC with fine-grained power management, the system is complicated by multiple power domains and hybrid power sources. To explain the operation of the cooperative power delivery system clearly, the system in Fig. 2 is simplified to four identical switched-inductor power modules and one power domain load. In the following, the design considerations of the three key function blocks for cooperation, namely the system estimator, the power module, and the multiphase timing generator, are discussed.
System Estimator
The estimator is the kernel to control the cooperative system operation. Figure 6 is the block diagram of the estimator developed in the prototype.
The estimator receives the system command (usually from the task scheduler in the operating system) to learn the next load requirement. At the same time, the estimator collects the sensor information to learn the system status, such as the power delivery capability and the health condition. The core of the estimator is a digital processing unit that implements the estimation functions. The important tasks include updating the priority of each power module, deciding the working list for the next power state, smoothing away hazardous conditions, and so on. Once the evaluation results are available, the estimator sends commands and the evaluation index value to each power module. The local controller of each power module then configures its operation based on these parameters. The estimator also programs the analog peripheral circuits to provide the proper voltage reference, clock timing, and power delivery path to support the power system. The latency of the processing unit and the peripheral circuits has to be carefully controlled to ensure a real-time response. For flexibility, the estimator has to be easily programmable and extendable.
Based on the guidelines mentioned, the four power modules in this system give a total of 16 power modes. One special mode covers the very light load condition. Under this condition, all modules are shut down except one, which is operated under PFM control [16]. The estimator has to switch automatically between the normal PWM control and the light-load PFM control to achieve good power efficiency over a wide load range.
Power Module
Some researchers are currently working on control strategies for hybrid switching-linear regulators [17], [18]. To simplify the distributed power delivery system design, this prototype uses four identical switched-inductor power modules as the cooperative converters. Figure 7 depicts the block diagram of a single power module. There are two control modes: PWM for normal regulation and PFM for the light load condition. The PWM scheme, as discussed above, adopts inner current mode control [19]. The inductor current sensed by an embedded current detector is summed with the compensation ramp V_ramp and the evaluation index EI-i to generate the current control signal V_inner. This ramp-like waveform is then compared with the voltage V_outer to determine the PWM duty ratio. The compensation function for the outer voltage control loop [20] is included in the system estimator, which calculates V_outer for all the cooperative modules. The timing generation circuits provide a clock signal CK-i for precise multi-phase operation. The estimator sends the proper evaluation index EI-i, based on the system requests and working conditions. With these signals, the feedback loops around each power module adjust its output power.
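The way the evaluation index enters the inner loop can be illustrated with a linearized model of one switching cycle. This is a hedged sketch rather than the circuit of Fig. 7: it assumes that the inductor current and the compensation ramp both rise linearly during the on-time, and the function name, parameter names, and numeric values are hypothetical.

```python
def pwm_on_time(v_outer, ei, i_start, sense_gain,
                current_slope, ramp_slope, period):
    """On-time after which the control signal V_inner reaches V_outer.

    Assumed model during the on-time:
        V_inner(t) = sense_gain * (i_start + current_slope * t)
                     + ramp_slope * t + ei
    The switch turns off when V_inner(t) >= v_outer, or at the end of
    the period at the latest.
    """
    headroom = v_outer - ei - sense_gain * i_start
    if headroom <= 0.0:
        return 0.0  # the index alone already exceeds V_outer: no pulse
    t_on = headroom / (sense_gain * current_slope + ramp_slope)
    return min(t_on, period)


# Hypothetical values: 2 V/A sense gain, 0.1 A valley current,
# 2e5 A/s inductor slope, 1e5 V/s ramp, 2 us switching period.
for ei in (0.0, 0.3, 1.0):
    t_on = pwm_on_time(1.0, ei, 0.1, 2.0, 2e5, 1e5, 2e-6)
    print(f"EI = {ei:.1f} V -> on-time = {t_on * 1e6:.2f} us")
```

Under these assumptions a larger index shortens the on-time, so the module delivers less current, and a sufficiently large index suppresses the pulse entirely.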
The evaluation index EI-i is important in this system. The estimator uses this parameter to rate each power module, depending on its loading and health conditions. For example, if the temperature of one specific power module is rising, the temperature sensor reports this condition to the estimator, and the estimator assigns a high value to the index EI-i. The output current delivered by this module is therefore reduced, or the module is even shut down. Depending on its evaluation index, a power module is added to or dropped from the working list. The peripheral settings are also based on this index value.
In this multi-phase system, the phase relationship depends on the number of active modules. When the system adds or drops one module, the estimator has to account for the new phase relationship between the active modules. The transient time for the phase update must be short to reduce the transient disturbance on the output power. In the proposed system, the reference CK-i comes from the multiphase timing generator. To compensate for the frequency and phase offset between the system reference clock and the local oscillator in each module, a phase-locked loop (PLL) is adopted in each active module to lock the local timing LO-i to the reference CK-i. Figure 8 shows the block diagram and the PLL frequency responses. The loop bandwidth is 500 kHz. The transient time is less than 1 µs if the initial frequency offset is small.
Multiphase Timing Generator
The delay-locked loop (DLL) is a peripheral circuit that generates precise clocks with equal phase shifts. These clocks serve as the timing reference for the active power modules to realize multi-phase operation. Figure 9 shows the block diagram. The DLL is composed of a phase detector, a voltage-controlled delay line, an output clock synthesizer, and a start-control circuit. The loop generates a proper control signal V_c to set the phase difference between CLK_i and CLK_o as desired. To avoid harmonic locking during startup, the technique proposed in [21] is applied. In this design, the number of active power modules ranges from 1 to 4. This means that twelve phases are needed to synthesize the 90°, 120°, and 180° phase differences of the multi-phase configurations. The voltage-controlled delay line contains 12 delay cells in total, which generate a 30° phase difference between adjacent cell outputs. The output clock synthesizer combines the appropriate clock phases at its output, depending on the phase number N_CK.
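The choice of twelve delay cells follows from the tap spacing needed for two, three, and four active modules. The sketch below shows one way the clock synthesizer could pick taps; the tap numbering (tap 0 as the reference phase) and the function names are assumptions, not details given for the prototype.

```python
DELAY_CELLS = 12  # 30 degrees per cell, as described above


def select_taps(n_active):
    """Delay-line taps giving n_active equally spaced clock phases."""
    if n_active not in (1, 2, 3, 4):
        raise ValueError("this sketch supports 1 to 4 active modules")
    step = DELAY_CELLS // n_active  # 12 is divisible by 1, 2, 3 and 4
    return [k * step for k in range(n_active)]


def tap_phase_deg(tap):
    """Phase of a tap in degrees, assuming tap 0 is the reference."""
    return tap * 360 // DELAY_CELLS


for n in (1, 2, 3, 4):
    taps = select_taps(n)
    print(n, taps, [tap_phase_deg(t) for t in taps])
```

Twelve is the least common multiple of 2, 3, and 4, so it is the smallest number of equally spaced taps from which the 180°, 120°, and 90° spacings can all be picked.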
In addition to these function blocks, further peripheral circuits such as the reference voltage generator, the power delivery matrix, and the power module monitoring circuits support the cooperation system. Because these functions are common to many applications, they are not detailed here.
Prototype Chip Realization
To demonstrate the functionality of the proposed cooperation system, a prototype chip has been realized [22], as shown in Fig. 10. It uses a 0.35 µm, 5-V CMOS technology. The chip integrates four power modules placed separately at the corners. The mixed-signal estimator is located at the center.
Figure 11 depicts the output voltage and the inductor current waveforms of the active power module when the load current is switched between the normal and light load conditions. When the load demand switches to 20 mA, as shown in Fig. 11 (a), the active module reduces the PWM duty ratio continuously. If the duty ratio becomes smaller than a specific value, the estimator changes the power module operation to PFM mode. Under asynchronous PFM control, the pulse frequency is proportional to the load demand. In this design, for a 20 mA load, the PFM frequency is 50 kHz. Figure 11 (b) shows the other case, in which the output is connected to a heavy load of 300 mA. The inductor current then appears as high-frequency, high-amplitude pulses. To avoid possibly unstable operation, the estimator returns the power module control to PWM mode only after counting five consecutive high-frequency pulses.
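The mode change described above behaves like a two-state machine with a counter. The following sketch captures only that logic: the duty threshold is hypothetical, and how a pulse is classified as high-frequency is left to the caller as a boolean, since the text does not give the criterion.

```python
class ModeSwitch:
    """PWM/PFM mode selection for one power module (illustrative only)."""

    def __init__(self, min_duty, pulses_to_exit=5):
        self.min_duty = min_duty              # hypothetical duty threshold
        self.pulses_to_exit = pulses_to_exit  # five pulses in the text
        self.mode = "PWM"
        self.count = 0

    def on_pwm_cycle(self, duty):
        """Call once per PWM cycle with the present duty ratio."""
        if self.mode == "PWM" and duty < self.min_duty:
            self.mode = "PFM"
            self.count = 0
        return self.mode

    def on_pfm_pulse(self, high_frequency):
        """Call once per PFM pulse; high_frequency is decided elsewhere."""
        if self.mode != "PFM":
            return self.mode
        # Only consecutive high-frequency pulses count toward the exit.
        self.count = self.count + 1 if high_frequency else 0
        if self.count >= self.pulses_to_exit:
            self.mode = "PWM"
            self.count = 0
        return self.mode
```

Requiring consecutive pulses means a single low-frequency pulse resets the counter, so a brief load spike does not cause the module to bounce between the two modes.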
In the proposed cooperative system, the evaluation index is an important parameter for power management. To emulate real-time responses such as automatic multi-phase reconfiguration and real-time power swap, the evaluation index of each power module is programmable through a data bus in this prototype. Figure 12 shows the waveforms of the regulated output, the inductor currents, and the multi-phase system clocks. With proper evaluation indices, power modules 3, 4, 1, and 2 join the cooperation system sequentially to share a 240 mA load current. The real-time power module swap indicates that both the multi-phase clock synthesis and the power module phase alignment respond quickly.
The inductor current waveforms in Fig. 12 show that a certain mismatch exists among the modules. However, with the active current sharing control, the system operation is robust to component mismatches and other parasitic effects, and the output voltage regulation is stable.
Figure 13 summarizes the performance of the prototype chip. The power module integrated into this cooperative system is optimized for a moderate load condition, around 200 mA. Larger power MOS devices can, of course, lower R_ds-on to reduce the power loss. However, the overhead will be large if all the distributed modules are optimized for their heaviest load conditions. In this design, for a 300 mA load, the power efficiency of a single power module is lower than its peak efficiency. By having the four available modules cooperate, even including the power overhead of the processing unit and the peripheral circuits, the system power efficiency is improved from 79% to 88%. On the other hand, for the light load condition (below 50 mA), the system switches to the variable-frequency PFM control scheme, which gains 5% more efficiency than the fixed-frequency PWM control.
Fig. 13 Performance summary of the prototype system. Under heavy load, with more power modules cooperating, the system power efficiency can be improved.
Conclusion
As the SoC development trend continues, more functions will undoubtedly be integrated into a single chip. For fine-grained power and thermal management, multiple power domains with distributed power modules are required. In this paper, we introduce the concept of joint operation between on-chip power modules to gain more system advantages. First, under a specific loading condition, if nearby idle modules help to share the load, the spatial spread can alleviate hot spot problems. Second, if some power domains are active exclusively, the system can share the power sources dynamically to leverage the hardware cost. Of course, the cooperation functions raise more challenges, such as ingenious power management algorithms and complicated power matrix networks. Each aspect, such as the module operation status and the module swap overhead, has to be evaluated in real time to make the optimal decision.
A prototype chip is realized to demonstrate the key operation. It is a simplified system that contains four power modules serving one pseudo load and an estimator that dynamically manages the system operation. The mixed-signal estimator executes the power module cooperation, taking into account the system demand, the power module availability, and the cost of each swap operation. The load-sharing current mode control keeps the power delivery system stable, even when a certain mismatch exists among the modules. The multi-phase technique applied to the cooperation of multiple modules reduces the output ripple voltage. With the help of cooperation, the distributed power management system can achieve higher power efficiency, better system reliability, and better hardware utilization. The concept can be applied extensively to power delivery and management systems in SoC, SiP, and 3D-IC integration.
