Abstract: Solutions to the integration challenges of a new thermal sensor technology into 3-D integrated circuits (ICs) are discussed in this paper. Our proposed architecture uses bimetallic thin-film thermocouples, which are thermally linked to points of measurement throughout the 3-D stack with dedicated vias. These vias are similar to thermal through-silicon vias (TSVs) in structure, yet different in functionality. We propose a low-overhead design methodology by linking the sensor placement task with the existing thermal TSV planning phase for 3-D ICs. A fraction of the thermal TSV resources is decoupled from its original use and repurposed for the temperature sensing infrastructure. Tradeoffs concerning the reduction of the thermal TSV resources are investigated. Furthermore, we present an end-to-end system, including the physical realization of the sensor network as well as its analog interface circuitry with the sensor data sampling unit. We demonstrate the operation and correctness of this interface with transistor-level simulations. Next, through thermal modeling and simulation using a state-of-the-art tool (FloTHERM), we demonstrate that we can achieve high accuracy (1°C error) in temperature tracking while still maintaining the effectiveness of the thermal TSVs in heat management (conforming to a peak temperature constraint of 95°C).
monitoring of 3-D ICs. We also present a design methodology that leverages thermal via planning for integrating these sensors into 3-D ICs, thereby minimizing the additional design effort.
A bimetallic thermocouple is made of two dissimilar metals. These devices generate small voltage outputs in proportion to the thermal gradients they sense. Measurement interfaces need to be carefully designed to convert these measurements effectively into absolute readings that are compatible with the rest of the system. Since the majority of the sensor structure is passive and resides in the metal layer, the sensors require minimal area within the silicon layer. An important architectural question concerns the thermal coupling between the main sensor structure, which is physically located in the metal layer, and the active layer, where the thermal event occurs. We developed a solution that uses dedicated through-silicon vias (TSVs) for this purpose in our previous work [14]. This unique structure creates a new use of thermal sensors, where the physical structure in charge of "sensing" can be located in one die (i.e., the thermocouple patterned into the metal layers of that die), whereas it may be used to monitor a point in any other die within the 3-D stack thanks to the through-silicon thermal coupling technology.
In this paper, we present a design flow for configuring a given 3-D IC with a set of bimetallic thermocouple thermal sensors. We present an extensive analysis to identify optimal configurations and design practices for several different multistack ICs. We construct a method that is tightly integrated with the thermal TSV planning stage of 3-D ICs. Positions of through-silicon coupling paths for the sensors are determined by judiciously repurposing a fraction of the existing thermal TSVs that are inserted into the design at the thermal TSV planning stage. By tightly coordinating sensor placement with thermal via planning, we minimize both the additional design overhead and the chip real estate used for additional vias.
Converting a thermal TSV into a sensor coupler via means that this via will no longer serve as a heat transfer path to the heat sink and heat spreader. In this paper, we pursue an effective tradeoff between the heat removal capability of the thermal TSVs that remain after some of the planned thermal vias are repurposed and the temperature tracking accuracy of the thermocouples that use the converted TSVs as support structures.
Our novel contributions in this paper are as follows. We first present the design and analysis of interface circuitry for effective operation of thermocouple-based sensors at the thin-film scale for thermal gradients encountered in typical ICs. Next, we tailored the thermal via planning to also encompass thermocouple sensor planning, by utilizing a state-of-the-art thermal via planning method [3], [5]. Finally, we explored the optimal allocation for the thermal TSV versus sensor coupler via ratio. We performed experiments to determine the thermal hotspot tracking accuracy of the resulting thermocouple sensor array for varying power profiles in the 3-D stacks. We also evaluated the heat removal capability of the resulting thermal TSV array in terms of peak temperature. Our experiments revealed that our thermal-aware TSV placement algorithm allowed us to obtain accurate temperature tracking while strictly remaining below the predefined peak temperature limit of 95°C by using only 0.27% of total die area for high-workload applications and 0.15% of total die area for medium-workload applications.
The remainder of the paper is organized as follows. In Section II, we give an overview of related work. In Section III, we discuss thermocouple-based sensors and costs and limitations of integrating them into 3-D ICs. Section IV presents the 3-D IC stack model and its various parameters. Section V describes the integration strategy for thermocouple-based sensors into 3-D ICs. Section VI presents the underlying thermal models, and Section VII presents a discussion of our experimental results. We summarize our conclusions in Section VIII.
II. RELATED WORKS
There is a large body of work on design of silicon-based temperature sensors and temperature monitoring by embedding sensors into planar ICs [15] - [18] . However, temperature monitoring and hotspot detection in 3-D ICs bring various new challenges [1] , [19] . Thanks to the availability of TSVs in a 3-D IC, new possibilities for distributing components of a thermal monitoring and management system across multiple dies and layers arise. These aspects now require careful consideration to achieve effective and low-cost integration of thermal sensors into 3-D ICs.
Two studies have investigated integration of a ring-oscillator-based sensor [20], [21] and a delay-line-based sensor [22] into 3-D ICs. Kashfi and Draper [20], [21] proposed a ring-oscillator-based sensor, where the inverter chain is divided into two halves and placed in two adjacent layers in an effort to capture an interesting 3-D thermal condition. This sensor aims to capture a hotspot that may diffuse from one layer into a second and create a so-called volumetric spatial hotspot. While this design is effective in detecting the existence of a "hotspot," the accuracy of this sensor is limited to the averaged thermal behavior of the volume occupied by all sensor components. Hence, it cannot be used to pinpoint the exact layer that is the root cause of the thermal emergency, which thermal management schemes would need in order to intervene at a core or die level. Datta and Burleson [22] proposed integration of delay-line-based digital thermal sensors into 3-D ICs. Their study focused only on circuit optimizations for a single sensor to combat process variation and supply noise effects on the delay lines, and on communicating the sensor output signal to the control plane through a TSV in the 3-D IC. A full chip-level thermal evaluation with a complete set of sensors in place has not been performed. Lo and Chiu [23] proposed an all-digital voltage-compensated ring-oscillator-based thermal sensor. While this sensor shows high accuracy, low power, and low cost in its design, their paper does not address the temperature differences that would exist between the layers of a 3-D IC structure if the sensor were implemented in a 3-D design.
In our previous paper, we presented the first study of the integration of bimetallic thermocouple-based sensors into 3-D ICs. We presented the optimal organization of this sensor architecture for two-tier 3-D IC configurations with medium and high power values under a uniform TSV planning algorithm [14] . In this paper, we extend our study to include the following.
1) Real-time accuracy measurements of our actual fabricated thermocouple samples over the time domain.
2) The relationship between the thickness of the sensor sheet resistance and the absolute error of the sensor due to the surrounding bulk silicon.
3) A complete system circuit design with interface circuit for converting analog sensor readings to digital outputs.
4) A detailed analysis of four-tier 3-D IC structures of both medium and high power loads.
5) Utilizing a thermal aware TSV placement algorithm applied to our two-tier and four-tier 3-D IC configurations showing area cost, TSV cooling efficiency, and sensor accuracy.
6) The coupling accuracy of the two-tier and four-tier, quantifying the relationship between the sensing point, monitoring point, and the bulk silicon point.
III. BIMETALLIC THERMOCOUPLES AS ON-CHIP TEMPERATURE SENSORS
A thermocouple consists of a junction of two different metals or alloys connected at one end and open on the other end. It operates according to the well-known Seebeck effect as defined through

$$V_{HC} = \alpha \, (T_H - T_C) \tag{1}$$

$V_{HC}$ represents the voltage measured across the open-ended junction of the thermocouple. The right side of this equation represents the equivalent voltage in terms of the Seebeck coefficient, $\alpha$, multiplied by the temperature difference between the two ends of the junctions, where $T_H$ typically corresponds to the temperature at the junction of the two metals and $T_C$ refers to the temperature at the open leads. A thermocouple can be produced as a thin film with a thickness on the order of hundreds of nanometers. There are various promising solutions for on-chip integrated thin-film thermocouple-based temperature sensors for planar ICs [24]-[26]. These solutions have employed thin-film thermocouples based on metal pairings such as Ni-Cr and Cu-Constantan. These devices have been shown to be effective in tracking fine-resolution thermal behavior.
A. Copper-Constantan Thin-Film Design and Fabrication
We have also successfully fabricated bimetallic thin-film thermocouples made of Cu-Constantan to assess the feasibility and capabilities of this technology with encouraging results. Fig. 1 shows the mask design for our copper-constantan thin-film thermocouple test sample with a thin-film heater made of Ni-Cr to simulate an on-chip hotspot. Aluminum legs deliver current to the Ni-Cr heater. The measured $V_{HC}$ corresponds to the temperature gradient created by the Ni-Cr heater.
Our sensor devices were deposited over several experimental substrates consisting of resistive networks (to emulate hotspots) fabricated on P-type boron-doped silicon wafers. We used negative photolithography for all material deposition layers with photoresist AZ5214E. Ni-Cr (100-nm thickness) and Al (90-nm thickness) layers were deposited using an AJA E-Beam Evaporator. Copper (100-nm thickness) and constantan (260-nm thickness) were deposited using an AJA Orion Sputter system.
In order to quantify our sensors' accuracy, we attached two thermistors (US Sensor PPG102C1RD): one on the closed ($T_H$) and one on the open ($T_C$) junction of the thermocouple. Each thermistor reports real-time temperature readings. The current supplied to the resistive network is slowly and steadily increased and then decreased as a function of time. At the same time, we measure the output voltage from the thermocouple and the change in temperature reported by the thermistors over time. Experimental testing, illustrated in Fig. 2, shows that our calibrated sensors have less than 0.1°C error compared to our test thermistors on the surface.
B. Interface Design for the Thermocouple-Based Sensor
A thermocouple can only measure relative temperature. Clearly, absolute measurements are needed for thermal monitoring. In order to obtain the absolute temperature, we combine the thermocouples with a thermal reference point circuit. This circuit contains a reference diode that generates an output voltage proportional to the temperature in its vicinity. The voltage output of the thermocouple, added to the output voltage of the diode circuit, yields an output voltage corresponding to the absolute temperature at the thermocouple junction. Fig. 3 illustrates in detail how such a circuit would be realized and how the sensor coupler is combined with the thermocouple that is attached to the sensing point. As mentioned before, the thermocouple sensing point needs to be at the same temperature as the monitor point. The junction end of the thermocouple is connected to the sensing point thermally. The other end of the thermocouple is left open to be connected to a switch relaying the output voltage to the reference point circuit for the ΔV voltage reading. Each coupler via used for temperature sensing is directly attached to its respective thermocouple sensor. An interface circuit is necessary to multiplex and read all the thermocouples in the array for their temperature readings. This interface circuit will be placed at the coolest part of the chip, close to the edge of the chip, and/or near the cache blocks. A local temperature sensor (temperature sensing diode) will be used to obtain the absolute temperature. Thus, the temperature of each thermocouple sensor can be obtained in the form of output voltages, and an analog-to-digital converter (ADC) can be used to output the digital signals corresponding to the temperature. The proper choice of resistor and diode parameters to minimize the effects of process variation and temperature in a diode-based reference circuit has been studied analytically by Long et al. [27].
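The combination of a relative thermocouple reading with an absolute reference reading can be illustrated with a short sketch. It assumes a constant Seebeck coefficient of about 40 μV/°C (the approximate Cu-Constantan figure used in the interface design in this section) and a reference temperature already converted from the diode voltage. The function name and the example readings are hypothetical and serve only as illustration.

```python
# Minimal sketch: absolute junction temperature from a relative thermocouple
# reading plus an absolute reference (diode) reading.
# Assumption: a constant Seebeck coefficient of ~40 uV/degC.

SEEBECK_UV_PER_C = 40.0  # approximate sensitivity, microvolts per degree C


def junction_temperature_c(v_tc_uv, t_ref_c, seebeck_uv_per_c=SEEBECK_UV_PER_C):
    """Return the absolute temperature (degC) at the thermocouple junction.

    v_tc_uv  : thermocouple output voltage in microvolts (before amplification)
    t_ref_c  : temperature at the open leads, reported by the reference diode
    """
    delta_t_c = v_tc_uv / seebeck_uv_per_c  # Seebeck relation: dT = V / alpha
    return t_ref_c + delta_t_c


# Hypothetical reading: 1200 uV with the reference point at 45 degC.
print(junction_temperature_c(v_tc_uv=1200.0, t_ref_c=45.0))  # 75.0
```

Because all open leads can share one reference point, the same `t_ref_c` value would be reused for every thermocouple in the array.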
An on-chip ADC module is a typical interface for on-die sensors [28] that generate a voltage output in relation to temperature. Hence, an on-chip implementation of the entire sensor system is feasible. A single thermal reference point can be shared among multiple thermocouples. Likewise, a single ADC can operate in a time-multiplexed fashion to accommodate multiple thermocouples. The built-in self-calibration techniques [29], [30] for on-chip ADCs can be used to compensate for process variations. Similarly, NIST's on-silicon thermocouple calibration techniques [31] can be used for our thin-film thermocouple array.
We demonstrate a complete eight-sensor array system in Fig. 4. Eight thermocouple sensor blocks output thermocouple readings into sensor amplifier blocks, which in turn amplify the voltage signal. In this design, we choose a generic analog amplifier design with a gain of 25 to demonstrate the functionality of our circuit system. Note that the amplifiers are high-input-impedance devices whose purpose is to detect and amplify the voltage signals. Once the signal has been amplified by the sensor amplifier circuits, it is read in by a single clocked multiplexer block.
The multiplexer clock determines how fast the analog multiplexer circuit reads out its eight input lines. Since chip temperature changes are slow relative to normal circuit signals, our multiplexer clock can run at a fairly low rate, in the kilohertz range. The clocked multiplexer block is used to read out all available sensors of interest in order to provide a complete thermal map of the entire chip surface. Once a signal is selected for output by the multiplexer circuit, we use a generic voltage level shifter to bring the voltage level into the input range required by the ADC. The ADC then converts our analog signal into a vector of 8-bit digital data that represents the voltage level of our temperature sensor readings for thermal and power management purposes. Our 3-D simulation shows that hotspot temperature can vary between 25°C and 125°C, with 25°C being the nominal temperature and 125°C being the chip maximum temperature. In addition, since one thermocouple outputs about 40 μV/°C and we designed our sense amplifiers with a gain of 25, each degree of change on the thermocouple side corresponds to 1 mV at the ADC input. For a range of 100°C at 1°C granularity, a 7-bit ADC would suffice. However, we choose an 8-bit ADC so that our resolution is 0.5°C per ADC step or finer, and also to demonstrate that a higher resolution ADC can easily be chosen since the thermocouple values are analog. For example, with an 8-bit ADC, one ADC step corresponds to roughly 0.4°C of thermocouple change. To obtain 0.1°C thermocouple resolution, a 10-bit ADC can be used. Note that we designed our system such that any of its components can easily be replaced with customized application-specific circuit designs. Fig. 5 shows our analog mixed-signal simulation of this circuit in ADE-XL. Our simulated temperature changes are correctly converted from our sensor's analog signal into digital output bits.
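The ADC sizing arithmetic above can be reproduced with a few lines of code. This is a sketch that uses only the numbers quoted in the text (about 40 μV/°C, a gain of 25, and a 25°C to 125°C range). The helper names are hypothetical, and an ideal ADC whose input range exactly spans the temperature range is assumed.

```python
import math

# Values taken from the text; the ideal-ADC model is an assumption.
SEEBECK_V_PER_C = 40e-6   # thermocouple sensitivity, volts per degC
AMP_GAIN = 25             # sense amplifier gain
TEMP_RANGE_C = 125.0 - 25.0


def volts_per_degree():
    """Amplified signal swing per degree C at the ADC input (about 1 mV)."""
    return SEEBECK_V_PER_C * AMP_GAIN


def min_adc_bits(temp_range_c, resolution_c):
    """Smallest bit width whose code count covers the range at this resolution."""
    return math.ceil(math.log2(temp_range_c / resolution_c))


def resolution_c(temp_range_c, bits):
    """Temperature change represented by one ADC step."""
    return temp_range_c / (2 ** bits)


print(min_adc_bits(TEMP_RANGE_C, 1.0))   # 7
print(min_adc_bits(TEMP_RANGE_C, 0.5))   # 8
print(min_adc_bits(TEMP_RANGE_C, 0.1))   # 10
print(resolution_c(TEMP_RANGE_C, 8))     # 0.390625
print(resolution_c(TEMP_RANGE_C, 10))    # 0.09765625
```

The 8-bit and 10-bit step sizes agree with the approximate 0.4°C and 0.1°C figures quoted above.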
C. Advantages of Thermocouple-Based Sensors
Bimetallic thermocouples offer an alternative temperature measurement strategy with several advantages. The Seebeck effect is an intrinsic property of the passive metallic structures that comprise the sensor, and it is independent of process variations. Also, once the thickness of the thin films is at or above 100 nm for constantan and 120 nm for copper and the width of the thermocouple legs is at or above 30 μm, the Seebeck coefficient is stable [32]. The metallic legs of the thermocouple also do not conduct any current in the act of sensing. Therefore, the bulk of the sensor element does not consume any power, with the only exception being the reference point circuitry. Effectively, a network of K thermocouple sensors can be considered to consume power equivalent to one traditional absolute sensor (the thermal diode and the reference point circuitry can be viewed as equivalent to one diode-based sensor), independent of the size of the network. The fact that the majority of the sensor structure resides in the metal layers makes it possible to place sensors right on the hotspot. With conventional sensors that use active devices, this is often not possible, since high temperatures might be harmful to the sensor logic itself, or the area would already be occupied because it is a region of high activity and timing criticality. Furthermore, the ability to physically separate the reference point circuit from the bimetallic sensing leads (potentially by several layers in the 3-D stack, thanks to TSVs) allows the freedom to choose a spot for it with ample free space and low thermal fluctuations. Also, the output of the ADC can be strategically placed nearest to the off-chip interfaces to comfortably use I/O resources and convey thermal monitoring results to system-level tracking and management modules.
D. Implementation Costs and Limitations
Introducing this technology will naturally incur some overheads. In the following, we discuss how the majority of these can be easily circumvented and what remaining price is to be paid for the advantages described above. Thin-film bimetallic thermocouples can be made with a thickness on the order of a few hundred nanometers, and they can also be produced from CMOS-compatible materials [24]-[26]. The fabrication method for bimetallic thin-film thermocouples is no different from the existing fabrication method used for metal layers in ICs. Materials such as copper and constantan have already been tested as CMOS-industry-compatible materials [24], [27], [32]-[37]. Effectively, one new dedicated metal layer needs to be deposited to house the constantan material. As for the copper leg of the device, there are two options. One extra copper layer can be added to the process for the thermocouples, or one of the less densely populated existing metal layers of the IC can be leveraged to fit the copper legs of the thermocouples. In a 3-D multistacked IC, different dies may be produced with different processes by different suppliers. Some may be more suitable for one choice than the other. It may be a cost-effective solution to embed the new constantan layer in only one of these dies and not introduce this material to the others at all. We identify as the best practice placing the thermocouple devices in the metal layers of the bottommost die, closest to the printed circuit board (PCB), so that all other dies have access to these devices through coupling TSVs. Besides eliminating the need to embed a new material in each individual die, this also avoids increasing the overall stack height and, indirectly, the height of the signal TSVs.
The thin films comprising the thermocouples are also subject to certain constraints and limitations. We consider two size effects inherent to the design of the thin-film thermocouple that are essential to the consistency of its Seebeck coefficient. First, the Seebeck coefficient depends on the thickness of the thin film. A copper-constantan thin-film thermocouple becomes independent of this size effect once the copper layer reaches 120 nm and the constantan layer reaches 100 nm [35]. Existing CMOS processes would comfortably meet this requirement. Second, the Seebeck coefficient also depends on the width of the legs of a thin-film thermocouple. Sun et al. [32] report a lower bound of 30 μm for the width of each thin-film strip in a thermocouple device. This is comparable to the diameter of a relatively large thermal TSV, and the available metal area can be considered to allow for an almost unlimited number of such strips to be placed.
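The two size constraints can be expressed as a simple design-rule check. The sketch below encodes only the three thresholds quoted above (120 nm copper, 100 nm constantan, 30 μm leg width). The function name and the candidate dimensions in the usage lines are hypothetical.

```python
# Minimal sketch of a design-rule check for a Cu-Constantan thin-film
# thermocouple, using the size-effect thresholds quoted in the text.

MIN_CU_THICKNESS_NM = 120.0          # copper thickness threshold
MIN_CONSTANTAN_THICKNESS_NM = 100.0  # constantan thickness threshold
MIN_LEG_WIDTH_UM = 30.0              # minimum width of each thermocouple leg


def seebeck_is_size_independent(cu_nm, constantan_nm, leg_width_um):
    """Return True when all three size-effect thresholds are met."""
    return (
        cu_nm >= MIN_CU_THICKNESS_NM
        and constantan_nm >= MIN_CONSTANTAN_THICKNESS_NM
        and leg_width_um >= MIN_LEG_WIDTH_UM
    )


# Hypothetical candidate geometries, for illustration only.
print(seebeck_is_size_independent(150.0, 260.0, 30.0))  # True
print(seebeck_is_size_independent(150.0, 260.0, 20.0))  # False (legs too narrow)
```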
IV. THREE-DIMENSIONAL IC STACK MODELING
In this section, we first present the overall structure of the 3-D IC stack. Our thermal models are built to include the main components of this structure. We then elaborate on our proposed integration of thermocouple sensors into this model. Fig. 6 illustrates the stacking of two dies in a 3-D IC. A heat spreader is attached to die A, and a PCB is attached to die B through copper bumps. On the other side of the heat spreader, a heat sink is attached along with an active fan to extract heat. In Fig. 6, we also illustrate two structures of importance in our discussion: thermal TSVs and coupler TSVs.
We illustrate thermal TSVs extending through the 3-D IC and establishing a thermal path toward the heat sink. TSVs are composed of electrically conductive material such as copper. In the normal operation mode, these vias provide an electrical pathway and enable interdie signal communication. Material with high electrical conductivity tends to have high thermal conductivity, and common via materials are no exception. Thus implementing thermal TSVs as a cooling mechanism takes advantage of their heat transfer property [38] - [43] . The hotspot thermal energy is spread out across the die toward cooler parts.
Coupler TSVs are also shown in this illustration. They enable a thermally coupled sensing path between die A and the thermocouples embedded in the metal layer of die B. Note that the main concepts of embedding thermocouple arrays into the metal layer of one or more dies and using coupler TSVs to track the active layers of different dies apply to any 3-D IC. A four-stack configuration is simply an extension with more dies in the 3-D structure.
V. INTEGRATION OF BIMETALLIC THERMOCOUPLE SENSORS INTO 3-D IC STACKS
The integration of thermocouple sensors into 3-D ICs has two considerations. First, we must determine the physical design structure of a thermocouple sensor inside the 3-D stack. Second, we must determine how to efficiently place these sensors within the die stack.
A. Design Structure of Thermocouple Sensor
The basic principle of a thermocouple is to convert the thermal gradient between its junction and open leads into a voltage difference. In practice, the temperature of the thermocouple junction is associated with the temperature of the location that is being monitored. The temperature of the open leads is either maintained at a close to constant temperature (through placement of the leads at a constant temperature location of the chip) or can be measured and subtracted from the junction temperature. In a large scale array of thermocouple sensors, all open leads can be consolidated in close physical proximity and using only one absolute reference measurement their common open leads temperature is determined (as we discussed in Section III). We refer to the location of interest, where the temperature needs to be measured, as the monitor point. This point is located on the top surface of a die's substrate, near its active logic. We refer to the location of the junction of a thermocouple sensor as the sensing point. This point overlaps with the two metal layers involved in creating the junction. In the most general case, monitor points could be on any die within the 3-D stack, wherever the system requires to monitor chip temperature. Similarly, the thermocouple junction could be implemented in any of the metal layers of any die. In order for the sensing point temperature to closely match, and hence, accurately track, the monitor point, thermal coupling between these two points needs to be established. For this purpose, we can take advantage of thermal TSVs already going through each layer of the die.
We refer to TSVs that create a thermal shortcut between the sensing and monitor points as sensor coupler TSVs, sensor couplers, in short. As shown in Fig. 6 , in order to monitor temperature in die A, we would use sensor couplers to coordinate with the monitor points within die A. Those sensor couplers extend from the surface of die A all the way through interdie copper contacts and through die B. The sensor coupler ends at the metal layer of die B, where sensing points are located, i.e., the thermocouple legs are placed. Note that the sensor coupler does not extend entirely through die A and it is disconnected from the heat spreader. This is required to prevent a thermal short circuit between the monitor point and the heat spreader. Otherwise, the heat spreader would have interfered with the actual temperature measurement. This structure allows us to take a temperature reading at the thermocouple sensing point residing in the metal layer of die B from the surface of the active layer of a different die embedded deeply within the 3-D IC stack.
It is important to note that there will still be thermal TSVs available for thermal management connected to the heat spreader. Fig. 6 illustrates both types of TSVs for comparison. Furthermore, thermal TSVs are electrically isolated and hence, no current flows through them. Similarly, sensor couplers only function as heat pathways and hence do not need to conduct electricity. Being TSVs, the couplers can penetrate multiple dies. This allows us to access any die surface and construct a highly flexible temperature monitoring system by potentially coupling and coordinating monitoring points with sensing points several layers apart throughout the stack. This presents a unique opportunity for thermal monitoring in 3-D IC stacks. Typically, high power density dies, and hence the majority of the strategically important (from the perspective of dynamic thermal management) monitor points, are located further away from the PCB and closer to the heat sink. On the other hand, it would not be desirable to use up the resources of these high-activity and resource-constrained dies for sensing-related structures. Therefore, sensing points could be placed near the bottom of the 3-D stack, i.e., near the PCB with easy access to system-wide thermal management modules. Fig. 7 illustrates in greater detail how the sensor coupler is combined with the thermocouple that is attached to the sensing point. As mentioned before, the thermocouple sensing point needs to be at the same temperature as the monitor point. The junction end of the thermocouple is connected to the sensing point thermally. The other end of the thermocouple is left open to be connected to a switch relaying the output voltage to the reference point circuit for the ΔV voltage reading. This ΔV will be converted to a temperature gradient that is equal to the temperature gradient experienced by the two ends of the thermocouple. As also shown in Fig. 7, the thermocouple device is made of two metal legs, each residing on a different metal layer. Their electrical connection is through another short via.
B. Thermocouple Sensor Placement
We have previously explored the uniform placement of thermocouple sensors with promising results [14]. However, nonuniform thermal-aware TSV placements based on the thermal profile are more efficient at relieving hotspots present in the chips. A thermal-aware placement algorithm can take advantage of predicted temperature hotspots over many applications to find the most likely position and size of hotspots [3], [5]. Hotter regions (core functional blocks) of the chip will be assigned more thermal TSVs, while cooler regions (L1 and L2 cache blocks) of the chip will be assigned fewer or no thermal TSVs. Sensor coupler TSVs can also be strategically placed on high power density block(s) while minimizing overhead cost significantly. This reduces the overall number of TSVs used and decreases the complexity during routing. Goplen and Sapatnekar [3] proposed a thermal-aware nonuniform TSV placement that adjusts the thermal conductivity of the effective bulk silicon according to the calculated thermal profile and corresponding thermal gradient during each iteration. Thermal TSVs are inserted into the bulk silicon area to reduce the thermal issues caused by high power density blocks. We propose to use a simplified version of the thermal-aware algorithm in the following equation:
The term $K_{new(i)}$ represents the new thermal conductivity at iteration $i$. $K_{new(i-1)}$ is the previous iteration's thermal conductivity. $T_{Max(i-1)}$ is the maximum temperature from the previous iteration. $T_{Max\_Ideal}$ is the desired maximum temperature. $K_{new(0)}$ is the initial thermal conductivity of the bulk silicon material. At each iteration $i$, the $K_{new(i)}$ value can be converted to a thermal TSV density $VD_i$ with (3) and a new thermal profile can be recalculated. $VD_0$ is set to the initial via density within the bulk silicon block. In (4), $K_{TSV}$ is the thermal conductivity of the thermal TSV. The algorithm iterates until the thermal conductivity converges to a value that ensures the $T_{Max\_Ideal}$ condition is met.
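The iteration described above can be sketched in code under several loudly stated assumptions, since the update equations themselves are not reproduced here. The sketch assumes a proportional update in which the conductivity is scaled by the ratio of the previous peak temperature to the ideal peak temperature, a toy peak-temperature model that stands in for the thermal simulator, and a parallel rule-of-mixtures conversion from effective conductivity to via density. All three forms, the function names, and the numerical values are hypothetical and may differ from the expressions actually used in this paper.

```python
# Hedged sketch of an iterative, thermal-aware conductivity update.
# ASSUMPTIONS (not taken from the paper):
#   * update rule:  K_i = K_(i-1) * T_max(i-1) / T_max_ideal
#   * toy thermal model replaces the full thermal simulation
#   * via density from a parallel rule of mixtures


def toy_peak_temperature(k_eff, t_ambient=20.0, heat_factor=15000.0):
    """Stand-in for a thermal simulation: hotter when conductivity is lower."""
    return t_ambient + heat_factor / k_eff


def plan_conductivity(k_initial, t_max_ideal, peak_temperature,
                      tol=0.01, max_iter=100):
    """Raise effective conductivity until the peak-temperature target is met."""
    k = k_initial
    for iteration in range(max_iter):
        t_max = peak_temperature(k)
        if t_max <= t_max_ideal + tol:
            return k, t_max, iteration
        k = k * t_max / t_max_ideal  # assumed proportional update
    return k, peak_temperature(k), max_iter


def via_density(k_eff, k_silicon, k_tsv):
    """Assumed rule of mixtures: k_eff = VD*k_tsv + (1 - VD)*k_silicon."""
    return max(0.0, (k_eff - k_silicon) / (k_tsv - k_silicon))


# Hypothetical conductivities in W/(m*K): 150 for silicon, 400 for copper.
k_final, t_final, steps = plan_conductivity(150.0, 95.0, toy_peak_temperature)
print(round(k_final, 2), round(t_final, 2), steps)
print(round(via_density(k_final, 150.0, 400.0), 3))
```

In a real flow, `toy_peak_temperature` would be replaced by a call into the thermal simulation of the region under consideration, and the resulting via density would still be subject to the per-region upper bound on TSV area.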
In our study, we divided each functional block into equal square regions of 363 μm by 363 μm. The algorithm is then applied to each of these square regions for potential thermal TSV and coupler TSV placement. To ensure that the algorithm does not overplace thermal TSVs, we set a constraint of $V_{max}$ on every square region under consideration. $V_{max}$ is defined as the maximum allowed ratio of total TSV area to the silicon area in that region. A $V_{max}$ value of 1% would indicate that only 1% of the silicon area in the square region under consideration can be used for thermal TSV placement.
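As a worked illustration of the $V_{max}$ constraint, the sketch below computes how many circular TSVs fit within the area budget of one 363 μm by 363 μm region. The TSV diameters used in the example are assumptions chosen for illustration (the actual via dimensions are listed in Table I), and the function name is hypothetical.

```python
import math

REGION_SIDE_UM = 363.0  # side length of one square planning region (from the text)


def max_tsvs_per_region(v_max, tsv_diameter_um, region_side_um=REGION_SIDE_UM):
    """Largest TSV count whose total area stays within v_max of the region area.

    v_max is a fraction (0.01 means 1%). TSVs are modeled as circles, which is
    an assumption of this sketch.
    """
    region_area_um2 = region_side_um ** 2
    tsv_area_um2 = math.pi * (tsv_diameter_um / 2.0) ** 2
    return math.floor(v_max * region_area_um2 / tsv_area_um2)


# With a 1% budget (about 1318 um^2 of the 131,769 um^2 region):
print(max_tsvs_per_region(0.01, tsv_diameter_um=30.0))  # 1
print(max_tsvs_per_region(0.01, tsv_diameter_um=10.0))  # 16
```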
VI. THREE-DIMENSIONAL IC THERMAL MODEL
We constructed a detailed thermal model for the 3-D IC stack by modifying the stacked plastic ball grid array template from FloTHERM pack to match our 3-D IC specifications. We then simulated this model using the commercial tool FloTHERM Version 9.2. Our two-stack 3-D IC model consists of two silicon dies stacked vertically. Each die has an area of 6 mm × 8 mm. Die A has a thickness of 310 μm, while die B and any additional stacked dies are thin dies with 50-μm thickness. The silicon wafer's thickness depends on the diameter of the original wafer from the manufacturer [44] (a 300-μm-thick silicon wafer has a typical diameter of 50 mm). For wafer thicknesses in the 50-μm range, a thicker "handling wafer" can be attached temporarily to the thin die for mechanical support [45]-[47]. Each die is flipped so that the metallization layers point in the −z-direction. Die A has a heat spreader with a heat sink attached to its back. Die B is attached through copper bumps to the PCB. Adjacent dies are attached together using a copper thermal diffusion process [45]-[47]. This results in heat transfer properties close to those of copper at each copper contact point. We model the copper contact points with copper material properties, while all the surrounding material outside of the copper contacts is modeled with an equivalent interlayer material [45]-[47].
Each die has three basic layers: a silicon bulk layer, an active layer, and a metal layer. All active layers are silicon with 20-μm thickness and all metal layers are copper with 10-μm thickness (except for one constantan layer embedded in the bottommost die). This serves as the basic structure of the 3-D IC stack into which different types of TSVs can be inserted. Table I contains detailed information regarding the most important components, their dimensions, and thermal conductivity. Table II presents details on each material layer of a single die. The ambient temperature is set to 20°C. The simulator takes both convection and conduction into account. Fig. 8 shows an overall view of a uniformly mixed and equally distributed array of thermal TSVs and sensor couplers for one of the 3-D IC configurations we have tested. Rectangular structures pointing upward out of
Converting thermal TSVs into sensor couplers means that the TSV height has to be modified and that the TSV has to be thermally insulated at certain points. Note that CMOS-compatible copper TSVs with high aspect ratios ranging from 1:10 to 1:35 have been reported in various published works [48]-[53]. The first modification is to decrease the height of the thermal TSV from 385 to 80 μm. As shown in Fig. 6, the sensor coupler is disconnected from the heat spreader and therefore does not go entirely through the top die as the thermal TSVs do. Instead, the coupler via starts at the surface of die A and extends through die B to reach the thermocouple junction. Ideally, we expect the sensor coupler to match the temperature at its two endpoints with negligible discrepancy. However, because heat escapes along the sides of the coupler TSV, the two endpoints might disagree significantly.
Fig. 9 shows our simulation results, which reveal the relationship between the thickness of an isolating sheet surrounding the sides and one end of a coupler TSV and the difference in temperature between its two endpoints. We refer to this difference as the via coupling error. We observe that for a thin layer of isolation the error can exceed 1°C. The error drops asymptotically with increasing sheet thickness, approaching 0.1°C at 10 μm. Following this analysis, our second modification is to add five insulating thermal resistance sheets with reasonably low thermal conductivity around the sensor coupler. Note that all kinds of TSVs already have isolation around them by default; hence, this is not a newly added process step specific to sensor coupler TSVs. We determined experimentally (as shown in Fig. 9) that an isolating sheet of feasible thickness (on the order of 10 μm) with a thermal conductivity of 0.2 W/(m·K) provides sufficient thermal isolation for the sensor couplers. One possible CMOS-compatible material for this insulation layer with equivalent thermal conductivity is graphene-oxide thin film [54]-[57]. Another is porous silicon, whose thermal conductivity depends on the pore size and density [58]. These sheets surround the sensor coupler and leave uncovered only the via side that is exposed to the surface of die A. Essentially, our aim is for the sensor coupler to be exposed to the die A surface and insulated everywhere else. This helps the sensor couplers yield proper temperature readings.
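To give a rough intuition for why a thicker isolating sheet reduces the via coupling error, the following sketch compares two first-order 1-D conduction resistances: along the copper coupler and sideways through the sheet. It is not the FloTHERM model and does not reproduce Fig. 9. The square 30 μm × 30 μm cross section (matching a 900 μm² via) and the copper conductivity of 400 W/(m·K) are assumptions, only the four sidewalls are modeled, and all function names are hypothetical.

```python
UM = 1e-6  # metres per micrometre


def slab_resistance(thickness_m: float, k_w_per_mk: float, area_m2: float) -> float:
    """1-D conduction resistance R = t / (k * A), in K/W."""
    return thickness_m / (k_w_per_mk * area_m2)


def leakage_to_axial_ratio(sheet_thickness_um: float,
                           via_side_um: float = 30.0,    # ASSUMPTION: square via, 900 um^2
                           via_height_um: float = 80.0,  # coupler height given in the text
                           k_copper: float = 400.0,      # ASSUMPTION: approximate bulk copper
                           k_sheet: float = 0.2) -> float:  # sheet conductivity from the text
    """Ratio of sidewall (leakage) resistance to axial resistance of the via.

    A larger ratio means heat prefers to travel along the via rather than
    escape through its four sidewalls.
    """
    side = via_side_um * UM
    height = via_height_um * UM
    r_axial = slab_resistance(height, k_copper, side * side)
    r_side = slab_resistance(sheet_thickness_um * UM, k_sheet, 4 * side * height)
    return r_side / r_axial


if __name__ == "__main__":
    for t_um in (1.0, 5.0, 10.0):
        print(f"sheet {t_um:4.1f} um -> leakage/axial ratio {leakage_to_axial_ratio(t_um):.2f}")
```

Under these assumptions the ratio grows linearly with sheet thickness, which agrees qualitatively with the trend of a smaller coupling error for thicker sheets; the asymptotic behavior reported above comes from the full simulation, not from this estimate.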
VII. EXPERIMENTAL RESULTS
In this section, we will first introduce our experimental setup in Section VII-A. Next, in Section VII-B, we present our experimental results. 
A. Experimental Setup
Recognizing the growing interest in 3-D integration of multicore processor systems and the importance of thermal management for processor systems, we classify each die layer into one of the following categories: 1) memory only; 2) logic comprising two cores and L2 cache; and 3) logic comprising four cores but no L2 cache [14]. We explore four distinct 3-D IC configurations built from a mixture of the above three types of die, as shown in Table III. The first three configurations (LM, Quad, and LL) contain two stacks and the last configuration (4 LL) contains four stacks. If two vertically adjacent dies both contain high-power-density logic (cores in our case), one of the two logic dies is rotated 180° with respect to the other, ensuring no hotspot overlap.
In those configurations where different stacks do not contain identical components (e.g., cores versus memory), we assign the components with higher power consumption to the die (die A) that is closer to the heat sink. The lower layers (die B, die B1, etc.) are assigned the dies consuming lower total power. All dies in the stack have the same area, so the die with the higher total power also has the higher power density, and it is reasonable to place it closer to the heat sink for better thermal management. It is common practice to stack processor cores and memory chips in a 3-D structure. These scenarios are represented by the LM and Quad configurations in our experiments, where die A represents the high-performance processors and die B represents the memory system.
In order to assign a realistic worst-case power density to each die, we performed power analysis of a 65-nm Alpha-like (21364) processor core executing a mix of benchmarks from the SPEC2000 suite [59] using gem5 and the Wattch power simulator [60]. This core has dimensions of 3 mm × 4 mm. We created the 3-D IC configurations listed in Table III by using one or more copies of this core as the basic building block. We created a compute-intensive workload with high total power consumption from a mix of the apsi, perlbmk, gzip, galgel, and mesa benchmarks from SPEC2000. We refer to this workload as the H-workload in the rest of the discussion. We created a second workload of relatively lower total power by introducing other applications (ammp, applu, equake, art, and mcf) into the mix. We refer to this workload as the M-workload. Table IV shows the power consumption of each core for different workloads and different 3-D stack configurations. We assigned a total of 2.88 W to the memory die, representative of a dynamic random access memory (DRAM) of comparable die size.
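As a small worked example of the geometry and power figures quoted above, the sketch below computes the average power density of the memory die and the number of 3 mm × 4 mm cores that tile a 6 mm × 8 mm die. The helper names are hypothetical, and per-core power values from Table IV are deliberately not reproduced.

```python
def power_density_w_per_mm2(total_power_w: float, width_mm: float, height_mm: float) -> float:
    """Average power density of a rectangular block, in W/mm^2."""
    return total_power_w / (width_mm * height_mm)


def cores_per_die(die_w_mm: float, die_h_mm: float, core_w_mm: float, core_h_mm: float) -> int:
    """Number of identical cores that tile the die on a regular grid."""
    return int(die_w_mm // core_w_mm) * int(die_h_mm // core_h_mm)


if __name__ == "__main__":
    # Memory die: 2.88 W spread over 6 mm x 8 mm.
    density = power_density_w_per_mm2(2.88, 6.0, 8.0)
    print(f"memory die: {density:.3f} W/mm^2")  # prints memory die: 0.060 W/mm^2
    # A 3 mm x 4 mm core tiles the 6 mm x 8 mm die in a 2 x 2 grid.
    print(cores_per_die(6.0, 8.0, 3.0, 4.0))  # prints 4
```

The 2 × 2 tiling is consistent with the quad-core die category; the same density function can be applied to a single core once its power from Table IV is known.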
B. Results
We first present an analysis to identify a baseline thermal TSV allocation for each of the 3-D stack configurations. Due to their varying content of logic components and power distribution, the configurations exhibit different thermal characteristics. Hence, their requirements for thermal management are not identical. We first identify the lowest cost thermal TSV allocation that allows each configuration to remain within reliable thermal limits. Next, we analyze how many thermal TSVs can be converted to sensor couplers while incurring minimal degradation in peak temperature and achieving a specific temperature monitoring accuracy.
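The baseline selection step can be sketched as follows, assuming the peak temperature for each candidate thermal TSV allocation has already been obtained from thermal simulation. The temperatures in the example are hypothetical placeholders, not results from this paper, and the function name is an illustrative choice.

```python
from typing import Dict, Optional

PEAK_LIMIT_C = 95.0  # peak temperature threshold used in this study


def lowest_cost_allocation(peak_temp_by_allocation: Dict[float, float],
                           limit_c: float = PEAK_LIMIT_C) -> Optional[float]:
    """Return the smallest TSV allocation (% of die area) whose simulated
    peak temperature stays below the limit, or None if none qualifies."""
    feasible = [alloc for alloc, peak in peak_temp_by_allocation.items() if peak < limit_c]
    return min(feasible) if feasible else None


if __name__ == "__main__":
    # HYPOTHETICAL simulation results: allocation (% die area) -> peak temperature (deg C).
    simulated = {0.0: 110.0, 0.25: 100.5, 0.5: 94.0, 0.75: 91.5, 1.0: 90.0}
    print(lowest_cost_allocation(simulated))  # prints 0.5
```

Cost is approximated here by TSV area alone; any additional cost terms would change only the selection key, not the feasibility check.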
1) TSV Coupling Error:
In order for our sensor system to work efficiently, we need to determine the coupling accuracy between the sensing point and the monitoring point. The temperature of the surrounding silicon and the thickness of the insulation sheet affect the accuracy of the TSV coupling.
A well-designed system will be able to minimize the coupler error by correctly sizing the insulation sheet. Table V shows the accuracy of the coupler TSV in each configuration for the case with the largest temperature difference between the bulk silicon around the sensing point and the sensing point itself. This is listed in the table as the temperature difference. The TSV coupling error is calculated as the temperature difference between the sensing point and the monitoring point. It represents the difference between the real hotspot temperature and what the coupler TSV can transfer to the thermocouple sensors. The largest temperature difference, 28.6°C, occurs in configuration LL. The smallest, 8°C, occurs in configuration Quad. Even with such large temperature differences between the surrounding bulk silicon and the actual sensing point, all three configurations in our system show a TSV coupling error of less than or equal to 0.5°C. Table VI explores the effect of stacking more layers on sensor accuracy. We simulate 3-D configurations ranging from LM1, with one logic layer and one memory layer, to LM6, with one logic layer and six memory layers. While the temperature difference between the sensing point and the surrounding silicon can be as large as 32.9°C, our sensor still achieves an accuracy of 0.8°C.
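The coupling error definition above can be written as a short check. This is a minimal sketch that assumes the error is reported as an absolute value; the temperatures in the example are hypothetical and are not taken from Table V or Table VI.

```python
def coupling_error_c(sensing_point_c: float, monitoring_point_c: float) -> float:
    """TSV coupling error: temperature difference between the sensing point
    and the monitoring point (ASSUMPTION: reported as a magnitude)."""
    return abs(sensing_point_c - monitoring_point_c)


def meets_target(sensing_point_c: float, monitoring_point_c: float,
                 target_c: float = 0.5) -> bool:
    """True if the coupling error does not exceed the target bound."""
    return coupling_error_c(sensing_point_c, monitoring_point_c) <= target_c


if __name__ == "__main__":
    # HYPOTHETICAL endpoint temperatures in deg C.
    print(meets_target(92.4, 92.0))  # prints True
    print(meets_target(92.4, 91.0))  # prints False
```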
2) Four-Tier Configuration Using Uniform TSV Placement:
In our previous work [14], we explored the uniform TSV placement algorithm. In this paper, we extend our study by utilizing a more efficient thermal aware TSV placement algorithm. We set 95°C as the threshold on the peak temperature for all stack configurations.
To further extend our discussion of multistack structures, we also simulated a four-stack structure made of four logic dies. We determined that the H-workload cannot be executed on this configuration while remaining below the 95°C limit without allocating an excessive amount (over 1% of die area) of thermal TSVs for the baseline. This is because a large number of processor cores are packed within the four stacks, and this IC cannot sustain such a high power budget within the thermal limit. The M-workload, on the other hand, is sustainable for this configuration with 0.5% of the total die area dedicated to thermal TSVs in the baseline configuration. Fig. 10 presents the peak temperature and tracking errors for the topmost die in the stack. Over the entire range of 10% to 50% sensor allocation, the average tracking error is always below 0.7°C and the peak temperature is always less than 95°C. This indicates that we only need to allocate 10% of the thermal TSVs for this layer and can convert a fraction of the remaining TSVs into sensor couplers to serve other layers. The resulting monitoring system could achieve less than 1°C tracking error. Different sensor allocations affect both the temperature tracking accuracy and how much cooling the thermal TSVs can achieve. Both tracking accuracy and peak temperature involve tradeoffs that vary from one 3-D stack configuration to another, although some common guidelines hold across configurations.
3) Thermal Aware TSV Results:
To explore the impact of the V max constraint on cost and the cooling efficiency of the system, we sweep V max between 0% and 1% at 0.25% intervals. Note that V max is defined as the maximum allowed ratio of total TSV area to silicon area in that region. The first component of our algorithm determines the amount of thermal TSVs necessary to cool the chip below 95°C. Fig. 11 shows the effect of thermal aware thermal TSV placement on the hotspot for a V max of 0.25%. Fig. 11(a) shows the original thermal map for LM at H-workload, where the hotspot reaches 114°C. Fig. 11(b) shows that after inserting 45 thermal TSVs, the hotspot temperature is maintained below 95°C. As an example, shown in Fig. 12, a functional block with an area of 0.5 mm² at an initial temperature of 114°C would require a thermal TSV to die area ratio of 0.0086. Each via has an area of 900 μm². The thermal conductivity of the silicon die area at 114°C is 111.5 W/(m·K). The new thermal conductivity, using our equation, is 135.5 W/(m·K) since our target is 95°C. With V max set to 0.25%, our algorithm recommends six copper thermal vias for that functional block.
The second component of our algorithm converts the minimum number of thermal TSVs already present in the stack into sensor coupler via structures. In the example given above, the algorithm chooses the one thermal TSV that is closest to the hotspot to be converted into a sensor coupler via. Fig. 13 shows the percentage of TSV area with respect to die area under different V max constraints for H-workloads. Fig. 14 shows the corresponding peak temperature of H-workloads under different V max constraints. Similarly, Fig. 15 shows the percentage of TSV area with respect to die area under different V max constraints for M-workloads, and Fig. 16 shows the corresponding peak temperature of M-workloads under different V max constraints. Note that even with V max set to 0.25%, all configurations are assigned enough thermal TSVs to cool the chip below 95°C. Thermal aware placement utilizes significantly fewer thermal TSVs for cooling than uniform placement. Table VII shows the maximum, minimum, and average error of the temperature sensors under our thermal aware placement for all configurations. A positive error indicates that the sensor detected a higher temperature than the actual hotspot temperature, and a negative error indicates the opposite. We observe positive errors for the LL and 4 LL configurations since the surrounding bulk silicon is at a higher temperature than the hotspot itself. The largest average error is 1°C and 0.6°C for the medium and high power workloads, respectively. The largest maximum error is 2.6°C and 3.1°C for the medium and high power workloads, respectively.
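The coupler selection step described above (converting the thermal TSV closest to the hotspot) can be sketched as a nearest-neighbor search. This is an illustrative sketch only: Euclidean distance in the die plane is an assumption, and the coordinates and function name are hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def pick_coupler_index(tsv_positions_um: Sequence[Point], hotspot_um: Point) -> int:
    """Index of the thermal TSV closest to the hotspot.

    ASSUMPTION: "closest" means smallest Euclidean distance in the die plane.
    """
    if not tsv_positions_um:
        raise ValueError("at least one thermal TSV is required")
    return min(range(len(tsv_positions_um)),
               key=lambda i: math.dist(tsv_positions_um[i], hotspot_um))


if __name__ == "__main__":
    # HYPOTHETICAL positions (um) of six thermal TSVs in one functional block.
    tsvs = [(100.0, 100.0), (400.0, 250.0), (650.0, 300.0),
            (200.0, 600.0), (500.0, 500.0), (50.0, 400.0)]
    hotspot = (420.0, 270.0)
    print(pick_coupler_index(tsvs, hotspot))  # prints 1
```

The selected via would then be shortened and insulated as a sensor coupler, while the remaining vias keep their cooling role.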
VIII. CONCLUSION
A novel method for integrating bimetallic thermocouple-based temperature sensors into a 3-D IC stack has been presented. Interface circuits required for this integration have also been presented and analyzed. The 3-D IC stack was modeled and simulated with an industrial thermal simulator to determine the impact of utilizing some of the existing thermal TSVs to aid in the construction of thermocouple sensors. The optimum point for the thermal TSV area to die area ratio was determined. Finally, the impact of different sensor coupler via placements on the accuracy of the temperature tracking and the cooling capability of the thermal TSVs was analyzed. The main methods outlined in this paper can be applied to any given 3-D IC structure and workload characteristics.
Siddhartha Joshi received the B.S. degree in Electrical and Electronics Engineering and Physics from the Birla Institute of Technology and Science, Pilani, India. He is currently pursuing the Ph.D. degree with the Electrical Engineering and Computer Science Department, Northwestern University, Evanston, IL, USA.
His current research interests include electronic design automation and power modeling and testing.
Ji-Hoon Kim received the B.S. degree in computer engineering and the master of robotic degree from Northwestern University, Evanston, IL, USA.
He is currently a Systems Engineer with DMC Inc. His current research interests include electronics, robotics, and haptic interfaces. 
