Abstract-The primary drawback of thermoelectric coolers (TECs) for electronics cooling applications is their thermodynamic inefficiency due to material limitations. The present work considers a control strategy to improve the overall coefficient of performance in an engineering system instead of addressing material shortcomings. Typical TECs are composed of several individual thermocouples that are powered in series and remove heat in parallel. If one of the numerous thermocouples is powered, all the thermocouples receive the same power whether or not they are needed. The fact that chips heat nonuniformly provides an opportunity for performance enhancement, by sensing and controlling the power to individual couples within the device. The current work presents evidence that applying distributed control to TEC operation can realize appreciable improvement in performance. Compared to monolithic cooling devices, a distributed control strategy can realize a factor of 2 increase in performance for the device studied. Additionally, this type of control can be used in conjunction with many of the existing material-based research initiatives to further compound the benefits.
Index Terms-Thermoelectric coolers (TECs).

I. INTRODUCTION
ALTHOUGH many research efforts to develop miniature refrigerators exist, thermoelectric coolers (TECs) are currently the only miniature cooling devices finding use in microelectronics cooling applications [1]. Unlike other miniature cooling schemes, TECs are solid-state devices and, therefore, are reliable, compact, and quiet. However, TECs generally suffer from poor device efficiencies, which limit their widespread applicability. To overcome these performance limitations, many research initiatives are currently seeking to improve the effectiveness of TECs by improving the materials used for these coolers [2]. Examples of possibilities that are being explored include skutterudites, superlattices, thin films, and other nanostructured materials [3]-[7]. Additional research related to chip cooling has also been devoted to improving fin-fan systems, minimizing contact resistances, and employing multistage TECs, which have already been implemented in chip-cooling applications [8]-[13]. In electronic chip-cooling applications, the packaging may consist of the chip, a TEC, a heat spreader, and finally, a fin-fan system. Thermal transport at each interface is enhanced with a thermal interface material (TIM). The heat spreader uniformly distributes the heat load produced by the chip. A TEC that is comparable in size and shape is contacted with the spreader to extract the heat from the chip. The fin-fan system must still remove the heat from the TEC and away from the computer. The purpose of the TEC is to lower the processor temperature so that the integrated circuits may operate in a desired temperature range, because the performance of modern computer chips will degrade above 85 °C [14]. However, the ultimate load on the power and cooling system increases as with any thermodynamic cycle.
The typical TEC unit that is currently marketed for chip-cooling applications consists of many individual thermocouples, called elements or cells, connected electrically in series but thermally in parallel. Each individual thermocouple is composed of two dissimilar materials, called pellets or legs, one of which is a p-type semiconductor and the other an n-type semiconductor [15]. In general, we can identify two metrics used to evaluate the performance of TECs.
At the material level, a TEC is rated by a dimensionless figure of merit, $ZT$. Typical values of $ZT$ available from manufacturers are slightly less than unity. The $ZT$ values required to cool future generations of computer chips will need to approach three, and eventually go even higher, to be competitive with modern vapor-compression cycles [16]. This figure of merit is based solely on material properties but provides a useful comparison between the transport properties of different materials. As such, this metric has a direct effect on the performance of the system. Recent material improvements have resulted in increases in $ZT$ of between two- and threefold [17]. Some sources purport even higher theoretical increases [18]. One problem with the current material improvements, however, is that they may require years to become fully implemented into production, if they can be mass-produced at all.
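As a rough numerical illustration of the figure of merit discussed above, the sketch below evaluates the conventional definition $ZT = S^2 T / (\rho k)$. The material values are assumed, order-of-magnitude numbers for a bismuth-telluride-like material; they are not taken from this paper or its Table I.

```python
def figure_of_merit(seebeck, resistivity, conductivity, temperature):
    """Dimensionless figure of merit ZT = S^2 * T / (rho * k), all in SI units."""
    return seebeck**2 * temperature / (resistivity * conductivity)


# Assumed, illustrative room-temperature properties (not from the paper):
zt = figure_of_merit(
    seebeck=200e-6,      # V/K
    resistivity=1.0e-5,  # ohm*m
    conductivity=1.5,    # W/(m*K)
    temperature=300.0,   # K
)
print(f"ZT = {zt:.2f}")
```

With these assumed inputs the result falls slightly below unity, which is consistent with the range quoted for commercially available material.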
When the TEC is placed into a system and used as a refrigerator, the thermodynamic efficiency, or coefficient of performance (COP), is often used to evaluate performance. This general measure of the efficiency of a TEC is based on the amount of heat that it removes compared to the amount of work required to remove that heat. This value is calculated in the same way as for any refrigeration cycle and is defined as
$$\mathrm{COP} = \frac{Q_c}{W} \tag{1}$$
where $Q_c$ is the heat removed and $W$ is the work required to remove it.
In this configuration, all elements within the TEC device are powered simultaneously. If the heating in a chip is localized, then elements within the TEC device may be needlessly powered, which will reduce the overall COP of the system. One alternative approach is to control individual elements of the TEC independently based on the local heat load experienced by a single element [19]. This technology is independent of the improvement of $ZT$. Instead, this technology could improve the performance of any TEC device regardless of internal material efficiency. Because computer chips generate hot spots corresponding to the portion of the chip that is performing calculations at any given time, the heat load is not uniformly distributed. Heat spreaders improve the distribution of heat but do not transfer heat instantaneously [20]. Existing TECs can only be powered as a monolithic unit; all the elements receive the same amount of current regardless of their individual heat load. Therefore, the COP might be improved if each TEC element is powered only when it is needed, and then only to the degree that is required to remove the amount of heat being produced. We therefore propose to enhance the system efficiency by distributed control of TEC units.
1521-3331/$25.00 © 2007 IEEE
Distributed control refers to a method that allows the individual elements of a system to respond independently based on their respective states [21], [22]. Previous research has shown the potential for increased effectiveness from separately controlling individual thermocouples or groups of thermocouples in TECs [12], [23]. Another major benefit of distributed control is that it may be applied concurrently with other technological advances. Regardless of the material advances in TECs, distributed control would allow for the optimization of COP. The present work analyzes a strategy for using distributed control to improve the performance of TECs as a proof of concept and demonstrates the preliminary effectiveness of the proposed approach.
II. THEORY
A typical TEC is a 2-D array of individual thermocouples that work together to cool larger areas. The device is in direct contact with the surface of a computer chip. For the present analysis, a one-dimensional (linear) array is considered for simplicity. Although transient effects such as periodic heating or migration of the heated spot may be important in some applications, the present analysis considers only steady-state heating.
Each section of the computer chip that is in contact with an individual couple is considered to be a separate and distinct heat-producing portion. Using the electrical analogy for heat transfer, the schematic for $n$ thermocouples is given in Fig. 1. $Q_{g,i}$ represents the heat being produced by the $i$th portion of the chip. $Q_{c,i}$ represents the heat being removed by each thermocouple. $Q_{k,i}$ represents the heat that is transferred by conduction between the $i$th and the $(i+1)$th portions of the computer chip. $R_c$ and $R_h$ represent the thermal resistances between the chip and thermocouple, and between the thermocouple and the ambient environment, respectively. $R_c$ is an equivalent resistance that accounts for chip packaging (including a heat spreader if used), TIM, and contact resistances. $R_h$ accounts for TIM, contact resistances, and the effectiveness of the fin-fan system. Likewise, $R_k$ is an equivalent thermal resistance between portions of the computer chip and is based on the expected lateral conductivity of the chip with no heat spreader. $W_i$ is the amount of work the thermocouple requires to remove the heat from the chip. Due to the first law of thermodynamics, however, the sum of the work and the heat, $Q_{c,i} + W_i$, must be removed from the system through $R_h$. $T_{j,i}$ represents the junction temperature, which is at the surface of the computer chip. $T_{c,i}$ and $T_{h,i}$ are the temperatures of the cold and hot surfaces of each thermocouple, respectively. Finally, $T_\infty$ is the ambient temperature.
The solution is found by solving a system of simultaneous algebraic equations. The heat removed by each thermocouple, $Q_{c,i}$, is [24]
$$Q_{c,i} = 2\left[ S I_i T_{c,i} - \frac{I_i^2 \rho}{2G} - k G \,\Delta T_i \right] \tag{2}$$
where $S$ is the Seebeck coefficient, $I_i$ is the electrical current through the device, $T_{c,i}$ is the cold-side temperature, and $\Delta T_i$ is given by
$$\Delta T_i = T_{h,i} - T_{c,i} \tag{3}$$
where $T_{h,i}$ is the hot-side temperature. Referring to (2), $G$ is a geometric factor equal to the cross-sectional area for heat removal divided by the height of a thermocouple, $\rho$ is the electrical resistivity, and $k$ is the thermal conductivity. The factor of two is required because each couple is composed of two pellets. The first term in brackets represents the heat being pumped by the couple due to the Peltier effect. This benefit is offset by the Joule heating in the device (the second term in brackets) and the parasitic heat flow in the couple due to the adverse temperature gradient (the last term in brackets). The parameters used for analysis are based on a specific Melcor device, Model CP5-31-06L, and are given in Table I. Because the contact resistances, $R_c$ and $R_h$, vary widely with temperature, surface preparation, TIM, and heat removal system, they are largely unknown and regarded as free parameters. A constant value based on other research efforts is assumed here for simplicity [25]. The work required by each thermocouple, $W_i$, is given by [25]
$$W_i = 2\left[ \frac{I_i^2 \rho}{G} + S I_i \,\Delta T_i \right]. \tag{4}$$
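The single-couple relations can be sketched in a few lines of Python. The sketch assumes the common manufacturer form $Q_c = 2[S I T_c - I^2\rho/(2G) - kG\,\Delta T]$ and $W = 2[I^2\rho/G + S I\,\Delta T]$; the function name and all numerical values are hypothetical and are not the Table I parameters.

```python
def couple_heat_and_work(current, t_cold, t_hot, seebeck, resistivity,
                         conductivity, geometry):
    """Return (heat removed at the cold side, electrical work) for one couple.

    Assumed per-couple model (two pellets, hence the factor of two):
      Q_c = 2 * [S*I*T_c - I^2*rho/(2G) - k*G*(T_h - T_c)]
      W   = 2 * [I^2*rho/G + S*I*(T_h - T_c)]
    """
    delta_t = t_hot - t_cold
    peltier = seebeck * current * t_cold                  # Peltier pumping
    joule = current**2 * resistivity / (2.0 * geometry)   # Joule heat reaching the cold side
    backflow = conductivity * geometry * delta_t          # parasitic conduction
    q_removed = 2.0 * (peltier - joule - backflow)
    work = 2.0 * (current**2 * resistivity / geometry + seebeck * current * delta_t)
    return q_removed, work


# Hypothetical material and geometry values in SI units (assumptions for illustration).
props = dict(seebeck=2.0e-4, resistivity=1.0e-5, conductivity=1.5, geometry=0.01)

q_c, w = couple_heat_and_work(current=10.0, t_cold=300.0, t_hot=310.0, **props)
print(f"Q_c = {q_c:.3f} W, W = {w:.3f} W, COP = {q_c / w:.2f}")

# An unpowered couple does no work but still conducts heat from hot to cold.
q_off, w_off = couple_heat_and_work(current=0.0, t_cold=300.0, t_hot=310.0, **props)
```

Setting the current to zero leaves only the conduction term, which is the mechanism behind heat removal without work when the temperature gradient across the couple is favorable.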
The remaining equations used to define the system are obtained by balancing heat into and out of the chip sections and each thermocouple. That is, heat is conserved for each portion of the chip as
$$Q_{g,i} + Q_{k,i-1} - Q_{k,i} - Q_{c,i} = 0. \tag{5}$$
The heat transferred between two portions of the chip is given by
$$Q_{k,i} = \frac{T_{j,i} - T_{j,i+1}}{R_k} \tag{6}$$
where $T_{j,i}$ is the junction temperature. The heat removed by a thermocouple is equivalent to (2) and is given by
$$Q_{c,i} = \frac{T_{j,i} - T_{c,i}}{R_c}. \tag{7}$$
The heat and work removed by the system is given by
$$Q_{c,i} + W_i = \frac{T_{h,i} - T_\infty}{R_h}. \tag{8}$$
The algebraic system is non-linear with respect to current, so it must be solved iteratively for temperatures, heat loads, and currents. A Matlab script is used to maximize the overall COP by adjusting the applied current to each element individually for a given generation profile $Q_{g,i}$. The optimization is constrained to maintain a local junction temperature below some maximum value. Note that the effective COP of the system as a whole is different than (1), which is valid for monolithic devices. In the present work the COP is defined as
$$\mathrm{COP} = \frac{\sum_i Q_{g,i}}{\sum_i W_i} = \frac{\sum_i Q_{c,i}}{\sum_i W_i}. \tag{9}$$
Both forms in (9) are equivalent since all the heat generated must leave the device through the TECs during steady operation. However, because of lateral conduction, the distribution of the generated heat and the heat removed through the TECs will not necessarily be the same, even though their sums are equal. This equation gives some insight to the inherent advantage of distributed control. If a couple is not powered, it may still provide some heat removal due to Fourier heat conduction. This advantage is termed free heat removal because the work required is zero and can result in scenarios that make analysis for the system different than typical analysis for a single TEC device. For instance, in some operating conditions the resulting COP of the present device will be infinite if all the heat can be removed by conduction without powering the device. In addition, favorable temperature gradients across the device can theoretically produce power, and COP will be negative. For performance analysis of a TEC not attached to a system, the COP is considered only when the device is powered at optimal conditions. These optimality conditions are for zero heat flux and maximum temperature drop or for zero temperature drop and maximum heat flux. While these conditions are useful for TEC design, they are not necessarily the same conditions that result in the best system efficiencies. Therefore, results obtained for the entire system may not compare to results for a cooler operating at optimal conditions and must be interpreted cautiously. The analysis in this work will concentrate on the regions where the device is powered, so COP is both positive and finite. Furthermore, we will compare the COP for a monolithic device in the system to the distributed controlled device in the same system only. Consequently, our comparison of COPs is for the system and not an isolated device.
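The thermal network and the system COP can be illustrated with a short sketch. It is not the authors' Matlab optimization: it only solves the heat balances for a prescribed set of currents, for which the equations are linear in the temperatures. It assumes the per-couple form $Q_c = 2[S I T_c - I^2\rho/(2G) - kG\,\Delta T]$ and $W = 2[I^2\rho/G + S I\,\Delta T]$, and every parameter value and function name is hypothetical.

```python
def solve_linear(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def solve_array(q_gen, currents, p):
    """Return (T_junction, T_cold, T_hot) lists for fixed per-couple currents."""
    n = len(q_gen)
    a = [[0.0] * (3 * n) for _ in range(3 * n)]
    b = [0.0] * (3 * n)
    two_kg = 2.0 * p["k"] * p["G"]
    for i, cur in enumerate(currents):
        j, c, h = i, n + i, 2 * n + i          # unknown indices: T_j, T_c, T_h
        joule = cur**2 * p["rho"] / p["G"]     # equals 2 * I^2 * rho / (2G)
        peltier = 2.0 * p["S"] * cur
        # Chip portion: generation = heat to the couple + lateral conduction out.
        a[j][j] += 1.0 / p["R_c"]
        a[j][c] -= 1.0 / p["R_c"]
        for nb in (i - 1, i + 1):
            if 0 <= nb < n:
                a[j][j] += 1.0 / p["R_k"]
                a[j][nb] -= 1.0 / p["R_k"]
        b[j] = q_gen[i]
        # Cold side: (T_j - T_c)/R_c = heat pumped by the couple.
        a[c][j] = 1.0 / p["R_c"]
        a[c][c] = -1.0 / p["R_c"] - peltier - two_kg
        a[c][h] = two_kg
        b[c] = -joule
        # Hot side: heat pumped plus work = (T_h - T_inf)/R_h.
        a[h][c] = two_kg
        a[h][h] = peltier - two_kg - 1.0 / p["R_h"]
        b[h] = -joule - p["T_inf"] / p["R_h"]
    t = solve_linear(a, b)
    return t[:n], t[n:2 * n], t[2 * n:]


def total_work(currents, t_cold, t_hot, p):
    """Sum of the electrical work over all couples."""
    return sum(2.0 * (cur**2 * p["rho"] / p["G"] + p["S"] * cur * (th - tc))
               for cur, tc, th in zip(currents, t_cold, t_hot))


def system_cop(q_gen, currents, t_cold, t_hot, p):
    """System COP: total generated heat over total work (infinite if unpowered)."""
    work = total_work(currents, t_cold, t_hot, p)
    return float("inf") if work == 0.0 else sum(q_gen) / work


# Hypothetical parameters in SI units (assumed; not the paper's Table I values).
params = {"S": 2.0e-4, "rho": 1.0e-5, "k": 1.5, "G": 0.01,
          "R_c": 2.0, "R_h": 2.0, "R_k": 10.0, "T_inf": 300.0}
q_gen = [3.0, 0.0, 0.0, 0.0, 0.0]       # only the first chip portion generates heat
currents = [5.0, 1.0, 0.0, 0.0, 0.0]    # arbitrary distributed currents, not optimized

t_j, t_c, t_h = solve_array(q_gen, currents, params)
print("junction temperatures [K]:", [round(t, 2) for t in t_j])
print("system COP:", system_cop(q_gen, currents, t_c, t_h, params))
```

An optimizer would wrap `solve_array` in a search over the currents, rejecting any candidate whose junction temperatures exceed the allowed maximum. With all currents at zero, the same function reproduces the purely conductive case, for which `system_cop` returns infinity.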
III. RESULTS
The present work will examine an array of five thermocouples corresponding to a discretized chip with five nodes. For simplicity and ease of comparison, only the first portion of the chip, corresponding to the first thermocouple, will produce heat; this is the worst-case scenario for a monolithic TEC that is cooling uniformly. For this example, the input values of heat being generated by the first portion of the chip range from 0.0 W to 5.0 W, representing a chip that is capable of generating 25 W in total. Because computer-chip performance degrades at 85 °C, this value is used as the maximum junction temperature for the optimization. At heat loads greater than 4.8 W, the five-thermocouple configuration is unable to maintain a junction temperature below 85 °C for the given resistances in the problem. These resulting computed values are not considered in the analysis because the system is operating in failure mode.
The maximum COP for the system with a TEC whose control is distributed among the elements is plotted in Fig. 2 as a function of heat load. For comparison the values of COP for the same configuration with the thermocouples connected in series (monolithic device) are also plotted. Therefore, Fig. 2 illustrates the maximum theoretical gain in COP from applying distributed control.
In order to achieve this gain, a distributed control rule must be applied, which will allow the system to respond appropriately via sensors and actuators to reproduce the optimum values. More specifically, the nodes within the system must be able to sense specified local parameters, to apply a distributed control rule in order to determine appropriate system response, and then to apply that response to the system. Ideally, the amount of cooling achieved by the TEC should be directly related to the amount of heating provided by the chip. The most reasonable way to sense the amount of heat generation is by measuring local temperatures. The local temperatures that are available to be directly sensed by the TEC are the hot-side and cold-side temperatures, $T_h$ and $T_c$, of the individual couples. If the optimal current (the current that maximizes COP) can be determined from its relationship with the sensed parameters, then this relationship is the distributed control rule.
In order to find this distributed control rule, the sensed temperatures $T_c$ and $T_h$ were compared to the optimum current using curve-fitting techniques. Several different methods were attempted, including variations of multiple-linear and multiple-polynomial regression. Multiple-polynomial regression with squared terms provided a good fit, so higher-order terms were not needed. A comparison of the sensitivities of each term revealed that one of the squared terms affected the current minimally. Therefore, that squared term was eliminated from the subsequent expression. Considering all factors, such as the number of terms and closeness of fit, the best distributed control rule found includes only one squared term and is given by (10), where $T_c$ and $T_h$ are provided in units of Kelvin and the resulting current $I$ is given in amperes. The resulting values of COP from applying (10) are also plotted in Fig. 2. The rule-based values are just slightly less than the optimal values throughout the operating range. Note that the rule-based values still represent a considerable improvement over the serial values.

A comparison of the optimal values of current and the values of current computed with (10) is given in Fig. 3. In this figure each thermocouple has a distinct curve. The couple adjacent to the heat source has the highest temperatures and requires the most current. The other couples require less current because they are farther from the heat source. Each computed value differs by less than 1.0 A from the optimal value, and, in most cases, by less than 0.1 A. These minor variations in the computed values from the optimal values affect the COP as shown in Fig. 2. However, the reduction is minimal compared to the benefit gained over the serial value.

Fig. 3. Plot of the optimum COP data compared to the data calculated from the distributed control rule as calculated using multiple polynomial regression (T squared term only).
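The curve-fitting step can be sketched as an ordinary least-squares fit of current against the two sensed temperatures plus a single squared term. The coefficients of the actual rule are not reproduced here, so the sketch recovers made-up coefficients from synthetic data. Which squared term the authors retained is also an assumption (the cold-side term is used), and the temperatures are shifted by a reference value purely for numerical conditioning.

```python
def solve_linear(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def features(t_cold, t_hot, t_ref=300.0):
    """Regression terms: constant, T_c, T_h, and T_c squared (all shifted by t_ref)."""
    x, y = t_cold - t_ref, t_hot - t_ref
    return [1.0, x, y, x * x]


def fit_control_rule(samples):
    """Least-squares fit via the normal equations; samples are (T_c, T_h, I) tuples."""
    rows = [features(tc, th) for tc, th, _ in samples]
    targets = [cur for _, _, cur in samples]
    k = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    atb = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve_linear(ata, atb)


def control_current(coeffs, t_cold, t_hot):
    """Evaluate the fitted rule for one couple from its sensed temperatures."""
    return sum(a * f for a, f in zip(coeffs, features(t_cold, t_hot)))


# Synthetic "optimal current" data generated from made-up coefficients
# (hypothetical; these are NOT the coefficients of the paper's rule).
true_coeffs = [1.0, -0.05, 0.08, 0.002]
samples = [(tc, th, control_current(true_coeffs, tc, th))
           for tc in range(290, 331, 10) for th in range(300, 351, 10)]

coeffs = fit_control_rule(samples)
print("fitted coefficients:", [round(a, 6) for a in coeffs])
print("current at T_c = 305 K, T_h = 330 K:", round(control_current(coeffs, 305.0, 330.0), 4))
```

In practice the samples would come from the optimizer's output over the operating range, and the fitted rule would then be checked against the junction-temperature limit before use.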
To ensure the new scheme does not enter a forbidden operating region (where the chip temperature exceeds the maximum junction temperature), the junction temperatures were plotted for the optimum current values and the rule-based current values. The resulting graph for the first thermocouple is presented in Fig. 4. The resulting graph for the second thermocouple is presented in Fig. 5. For the optimum values, the junction temperature for the first thermocouple, $T_{j,1}$, rises steadily until it reaches the maximum allowable junction temperature, $T_{j,\max}$. At this point, approximately 2.4 W, the couple must be powered to keep $T_{j,1}$ from exceeding $T_{j,\max}$ throughout the operating regime. The rule-based temperature closely follows the temperatures from the optimal solution. The junction temperature remains just below $T_{j,\max}$ throughout most of the operating regime. Although $T_{j,1}$ does barely exceed $T_{j,\max}$ between 4.0 and 4.5 W, this minor breach is considered acceptable. Fig. 4 shows that the rule-based computations can stay within system limitations and produce the desired gains in COP.
In Fig. 5, the second junction temperature, $T_{j,2}$, initially rises due to the heat from conduction, $Q_{k,1}$. At approximately 2.4 W, the second thermocouple starts to consume more current and actually starts to lower $T_{j,2}$. Recall that only the first section of the chip is producing heat. The first couple responds to this generation but cannot satisfy the load and maintain the first junction temperature, $T_{j,1}$, below the maximum junction temperature, $T_{j,\max}$. This reduction in temperature of the adjacent section at first may seem counterintuitive to the goal of increasing COP, since $T_{j,2}$ is not approaching $T_{j,\max}$. However, the benefit in this reduction is to increase the temperature difference between $T_{j,1}$ and $T_{j,2}$, thereby increasing $Q_{k,1}$ as described in (5). So, the second couple, as well as the other couples, gradually assumes more of the heat load, although another portion of the chip is producing the heat. When the adjacent cell must turn on in order to reduce the load on its neighbor, the operation enters a second regime where the gradients in the chip become greater. One claimed benefit of using distributed control to power TECs is the lack of temperature gradients, yet this regime is producing larger temperature gradients. The design tradeoffs of distributed TECs, therefore, must include internal chip gradients. This drawback must be balanced with the ultimate gains in COP. The benefit of each couple assuming a portion of the heat load can be seen in the Joule heating term of (2). Each couple consumes a small amount of current to optimize the load on the primary thermocouple, thereby controlling the Joule heating. Fig. 6 illustrates the degree to which the five thermocouples assume three different heat loads. The corresponding COPs are also shown.
The distribution of heat load assumed by the other couples depends on the thermal resistance between the portions of the chip, $R_k$. As $R_k$ approaches infinity, the heat load seen by the other couples will approach zero. This limit will require each thermocouple to manage the heat load of its assigned chip portion alone. This means that the assigned chip portion can only produce an amount of heat less than or equal to the capacity of the individual thermocouple without operating in failure mode. As $R_k$ approaches zero, the heat load seen by each couple will be exactly the same regardless of where the generation occurs. Therefore, distributed control offers no benefit, and the optimum solution will be the same as a serially connected TEC. This relationship shows that a heat spreader with a high thermal conductivity may benefit a traditional TEC. However, a TEC operated with distributed control will benefit from a reduced value of thermal conductivity and should not employ a traditional heat spreader. The value of $R_k$ chosen for the current analysis approximates that anticipated from a computer chip with no heat spreader. Fig. 7 compares the COPs for the serial and distributed control cases for three different values of lateral conductivity. Lateral conductivity is inversely proportional to lateral thermal resistance, $R_k$. The figure shows the gain from distributed control as the space between the solid and the dashed curves. As the lateral conductivity increases, the benefit that distributed control offers decreases.
The effect of the thermal conductance of interfaces on the performance of the system is also considered. The values of the two interface resistances, $R_c$ (chip to thermocouple) and $R_h$ (thermocouple to ambient), were selected to represent the typical conditions of existing systems. These values vary greatly based on the contact resistances, TIM, and fin-fan systems. Fig. 8 shows how the COP varies over a wide range of values of one of these resistances, with the other held constant, for an intermediate heat load of 3.9 W. The optimum, rule-based, and serial solutions are shown. The varied resistance was chosen because it typically has the widest range of values. Fig. 8 shows that the optimal COPs represent a twofold increase over the serial COP values for 3.9 W regardless of the resistance value. The most significant feature in Fig. 8 is that the rule-based values exceed the optimum values. We do not expect any control rule to give better performance than the optimal operating conditions. However, the present rule was specifically designed for a resistance value of 10 K/W, where the rule-based performance is slightly lower than the optimal performance. The control rule appears to outperform the optimal solution because the constraint used in the optimization (a junction temperature no greater than 85 °C) is being violated. Therefore, this result indicates that a control rule must be devised for each cooling solution and is not characteristic of a particular chip, for example. The distributed control rule is sufficient for small variations in this resistance, but when it varies considerably from the value used in this paper, a new rule must be developed.
Other considerations relevant to the success of the distributed control approach include interconnect heating and thermal fatigue. Due to the increased number of connections required by distributed control, the power losses due to interconnect heating will also increase. Preliminary calculations show that, due to the high current levels and the small sizes, the worst-case scenario of individual control of each couple may be prohibitive. One solution may be to control the individual couples in clusters or arrays. Such prospects should be investigated to a greater extent in a working prototype to determine feasibility.
Another immediate issue affecting feasibility of distributed control is thermal fatigue. The application of distributed control will reduce temperature gradients across the computer chip (at least in nominal operating regimes), and therefore, reduce thermal stresses internal to the chip itself. However, current TEC technology is quite susceptible to thermal fatigue due to cycling. Traditional TECs are powered at one level constantly, and are not frequently cycled. Frequent cycling of a TEC causes the internal electrical connections to fail. Some advances in materials and manufacturing have already been considered and some have been implemented [26] . Nevertheless, the extent of the thermal stresses due to distributed control should be considered.
IV. CONCLUSION
The need for improved performance of TECs is well-documented. By managing the individual thermocouples internal to a typical TEC with distributed control, significant gains in performance can be realized. A computer model was developed that determines the optimal performance of the individual thermocouples in a simplified linear array for a non-uniform heat distribution case. This optimum performance can be closely approximated by applying a distributed control rule to the TEC. This rule accepts the sensed local temperature values and determines the current loads to individual thermocouples to maximize efficiency. The results represent improved performance throughout the operating regime. In much of the regime, this improvement is 1.5 to 2.0 times the traditional values. This improvement could, at a minimum, narrow the gap between the thermal demands of modern computing equipment and cooling capability, making material advances even more rewarding. Distributed control of thermoelectric devices for chip-cooling applications where non-uniform heating occurs is a technology that deserves further investigation. Because the heat loads and distribution of heat loads, as well as the construction of chips, vary dramatically between chip technologies, a general solution cannot be provided here. The concepts presented in this article should be studied in the context of different chip designs and load requirements, transient effects, modern interface resistance technologies, and so forth to assess the widespread applicability of this work.
