Electrical and Thermal Analysis for System-in-a-Package (SiP) Implementation Platform by Michael Wang
Electrical and Thermal Analysis for  
System-in-a-Package (SiP) Implementation Platform 
 
Michael Wang, Katsuharu Suzuki, Wayne Dai 
Computer Engineering Department  
University of California, Santa Cruz 
(mwang, suzuki, dai)@soe.ucsc.edu 
Abstract 
 
This paper presents an electrical and thermal
performance analysis of a System-in-a-Package (SiP)
memory/logic implementation platform based on
Chip-Laminate-Chip (CLC) technology. The internal IO
interface inside the CLC module is modeled and compared
with a Stack-Chip (SC) implementation. A thermal analysis,
including a comparison against Stack-Chip and
System-on-a-Chip (SoC) implementations, is also presented.
The results demonstrate that CLC technology provides a
significant performance advantage over conventional SiP
technologies and has a great impact on future
system-level integration.
 
1. Introduction 
 
Today, more than 100 million transistors can be made on a
single die, and the cost of a transistor is approaching
"zero". Every year the semiconductor industry makes
almost as many transistors as in all of the
previous years combined. The new challenge is not how
many transistors can be built on a single chip, but rather how
to integrate diverse technologies together, predictably and
cost-effectively. System-in-a-Package (SiP), a
generalization of System-on-a-Chip (SoC), overcomes
formidable integration barriers without compromising
individually optimized chip technologies. By preserving
the on-chip electrical environment, SiP matches or exceeds
SoC performance at lower cost. SiP should be viewed
as a giant chip rather than a miniaturized circuit board.
As the feature size of current IC fabrication
technology approaches 90 nm, DRAM technology and
logic technology diverge more and more, even though both are
CMOS. Embedded DRAM cannot provide a cost-effective
solution because of its low yield and high fabrication
complexity. Memory/logic integration based on SiP is a
feasible alternative to embedded memory. Unlike the SoC
approach, which compromises different chip fabrication
technologies, the SiP approach unlocks the full potential of
IC technology by integrating conventional ASIC
and memory technologies using existing, individually
optimized ICs. Therefore, memory and logic can be
integrated at lower cost and reduced size, while the
performance can compete with the SoC counterpart.
Several technologies have been proposed to develop
SiP modules. Stacked-chip SiP technology, which does
not require any extra steps in chip design, is
most commonly used to build an SiP [11]. However,
bonding wires have inferior electrical properties, including
high parasitic inductance, and the stacked structure results in
poor heat conduction. It is therefore difficult
to achieve better electrical or thermal performance than
a conventional design. To avoid these problems,
Chip-on-Chip SiP technology (CoC) has been proposed [1][2][3].
One of the modules developed with CoC is an FPGA/DRAM
SiP module, which integrates a large-scale FPGA and
multi-bank DRAM in a single package to provide
high-bandwidth memory access. By exploiting solder bumping
and flip-chip assembly, CoC enables the integration of
different chips with shorter interconnection length and
higher IO density. However, CoC can be applied only
when the substrate chip is large enough to hold all of the
other chips. This is a serious limitation, and to
overcome it, Chip-Laminate-Chip SiP technology
(CLC) has been introduced [9][10]. In a CLC module, a
thin-film laminate serves as the package substrate, the wiring
resource, and the decoupling capacitor for the power supply.
Chips are solder-bumped on both sides of the laminate, which
allows heat dissipation from both the top and bottom sides of the
package. Thus, CLC can achieve better electrical and
thermal performance than CoC.
In this paper, we analyze the electrical and thermal
performance of CLC-based SiP and compare it with other
SiP technologies. The paper is organized as follows:
Section 2 introduces CLC technology. Section 3 analyzes
the electrical performance of chip-to-chip connections in
CLC-based SiP and compares it with stacked-chip SiP.
Section 4 examines the thermal performance of CLC-based
SiP by developing its thermal model and simulating it with
FLOTHERM, a commercial thermal analysis tool.
Section 5 concludes the paper with some remarks.
 
2. Chip-Laminate-Chip Technology 
 
Chip-laminate-chip technology employs one thin 
laminate film between the top and bottom chips to 
provide better electrical environment and robust
power/ground distribution. Figure 1 illustrates the CLC
technology.

0-7695-1881-8/03 $17.00 © 2003 IEEE
 
[Figure 1 labels: laminate, logic, DRAM, decoupling capacitor, BGA ball]
 
Figure 1. Chip-Laminate-Chip technology 
 
In the CLC module, the laminate is part of the BGA
package; top-side chips and bottom-side chips are
flip-chip mounted on the laminate. Decoupling capacitors are
built in, which provides a better power/ground structure
than the CoC architecture.
Some CLC package characteristics are [5]: 
•  Maximum off-chip delay << IO buffer delay 
(3.5ns). 
•  Signal round trip time < rise time (500ps). 
•  Inter-chip skew < board skew (500ps). 
•  No terminating resistors required. 
•  Smaller buffer size and minimized ESD 
protection. 
When logic and memory chips are assembled with
CLC, they behave electrically as if they were on the same chip, even though
they are physically fabricated as separate chips. The
memory/logic interface can achieve over 500 MHz for
double data rate (DDR). The elimination of terminating
resistors dramatically reduces the per-pin power
consumption.
Figure 2 shows an example of a CLC memory/logic
integration module, which consists of one S3 graphics
chip and eight Micron 8 MB DDR SDRAMs [10]. This tighter
integration offers much higher memory access bandwidth
than on-board graphics memory at little cost premium.
[Figure 2 panels: (a), (b)]
Figure 2. DDR SDRAM and graphics processor
integration module implemented by CLC technology
(a) One graphics chip on one side of the CLC (b) Eight DDR
SDRAM chips on the other side of the CLC
 
CLC technology offers the potential for a low-end PC
CPU with DDR memory to compete with the performance of
current high-end server systems by further improving
memory access time, which is the system bottleneck. The
speed-up of DDR SDRAM inside the CLC module balances
the core logic and memory access speeds. Furthermore,
one logic chip integrating the CPU, graphics chips, and
chipset on one side, and 500 MHz DDR SDRAMs on the
other side, make a single-package computer. Unique
features such as low cost, low power, small area, and light
weight create great opportunities in consumer products.
 
3. Electrical Performance Analysis of CLC-based SiP
 
In this section, we analyze the electrical characteristics
of the CLC-based SiP by modeling the internal IO
performance inside the CLC package. In this study, we focus
on a DRAM and FPGA integration module, which
contains one FPGA and two DRAM chips. The IO path
from FPGA to DRAM includes the FPGA IO (buffer and
pad), FPGA chip IO rerouting, the solder bump on the FPGA
side, laminate routing, the solder bump on the DRAM side, DRAM
chip IO rerouting, and the DRAM IO (buffer and pad). Since
the designer can easily reroute the FPGA IOs to match the
footprint of the DRAM solder bumps, connections between
FPGA solder bumps and DRAM solder bumps are mostly vertical
(vias). We include those parasitics in the model of the
solder bumps. The solder bump pitch is 500 um [7]. Some
specifications of the FPGA chip and DRAM chip are
listed in Table 1.
  
Table 1. Specification of FPGA and DRAM
                      FPGA            DRAM
Geometry              12 x 12 mm^2    9 x 4.5 mm^2
Number of IO          537             84
Power supply          3.3 V           3.3 V
Process technology    0.22 um CMOS    0.24 um DRAM
 
 
3.1. Modeling of rerouting wire length 
 
The rerouting wire length is estimated based on chip
geometry, IO floorplan, and technological parameters
such as IO pad pitch and solder bump pitch.
The chip IO placements of the FPGA and DRAM are illustrated in
Figure 3.
[Figure 3 panels: FPGA IO floorplan, DRAM IO floorplan]
 
Figure 3. Chip IO placement 
 
We studied the rerouting layouts produced by automatic routing
tools with two metal layers and found that the rerouting wire
lengths always fall in a certain range: the
shortest distance from the solder bump to the IO boundary plus
a few solder pitches. Assuming the origin is the center
of the chip, the chip size is M x N, the solder pitch is
P, and the solder bump is located at (x, y) in the
upper-right quadrant of the chip, the wire length can be calculated
as follows:
For IOs on chip boundary only,  
 
$$W_{reroute} = \min\left(\frac{M}{2} - x,\ \frac{N}{2} - y\right) + \alpha P$$
 
For IOs on chip boundary and centerline,  
 
$$W_{reroute} = \min\left(x,\ \frac{N}{2} - y\right) + \alpha P$$
 
where α is a parameter dependent on the chip geometry,
IO pitch, and solder bump pitch. α can be
estimated as the solder pitch divided by the IO pitch.
The average wire length is the average W_reroute over all
solder bumps, which is:
 
$$W_{reroute\_average} = \frac{\dfrac{M^{2} N}{4} - \dfrac{M^{3}}{12}}{MN} + \alpha P, \quad \text{assuming } N > M$$
 
The worst case occurs when the solder bump is located at the
center of the chip (for IOs on the centerline, the worst case
is a solder bump in the middle of the left/right chip boundary).
The worst-case wire length is:
 
$$W_{reroute\_worst} = \min\left(\frac{M}{2},\ \frac{N}{2}\right) + \alpha P$$
 
Table 2 compares the measurement results with
the calculated data. The average
wire lengths are slightly overestimated because αP is
closer to an upper bound on the offset between the chip
IO and the solder bump.
 
Table 2. Measured wire length vs. calculated wire length
(unit: um)
          Measurement        Calculation
          Avg.     Worst     Avg.     Worst
FPGA      3960     8450      4500     8500
DRAM      1680     3350      1940     3250
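
As a cross-check of the wire-length model in Section 3.1, the following Python sketch evaluates the per-bump, average, and worst-case formulas. The function names are hypothetical. The values α = 5 (FPGA) and α = 2 (DRAM) are assumptions back-solved from the calculated worst-case entries of Table 2 with P = 500 um; the paper does not state α or the IO pad pitch. The DRAM is entered with M = 4.5 mm as the shorter side so that the condition N > M holds.

```python
def reroute_length(x, y, M, N, alpha, P, centerline=False):
    """Rerouting length for a solder bump at (x, y) in the upper-right quadrant.

    All lengths share one unit (um here). With centerline=True the formula
    for IOs on the chip boundary and centerline is used.
    """
    if centerline:
        nearest = min(x, N / 2 - y)
    else:
        nearest = min(M / 2 - x, N / 2 - y)
    return nearest + alpha * P


def average_reroute_length(M, N, alpha, P):
    """Average rerouting length over all solder bumps (requires N >= M)."""
    if N < M:
        raise ValueError("the average formula assumes N >= M")
    return (M**2 * N / 4 - M**3 / 12) / (M * N) + alpha * P


def worst_reroute_length(M, N, alpha, P):
    """Worst-case rerouting length."""
    return min(M / 2, N / 2) + alpha * P


P_UM = 500  # solder bump pitch from the text, in um

# alpha values are assumptions inferred from Table 2, not given in the paper.
chips = {
    "FPGA": {"M": 12000, "N": 12000, "alpha": 5},
    "DRAM": {"M": 4500, "N": 9000, "alpha": 2},
}

for name, c in chips.items():
    avg = average_reroute_length(c["M"], c["N"], c["alpha"], P_UM)
    worst = worst_reroute_length(c["M"], c["N"], c["alpha"], P_UM)
    print(f"{name}: average {avg:.1f} um, worst {worst:.1f} um")
```

Under these assumptions the sketch gives 4500 um and 8500 um for the FPGA and about 1940 um and 3250 um for the DRAM, which agrees with the calculated columns of Table 2.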
 
3.2. RLC equivalent circuit 
 
The equivalent circuit of the IO path can be obtained by
approximating the R, L, and C of the IO pad, rerouting metal,
and solder bumps.
The pad is mainly a capacitive load for the output
driver; its capacitance can be found using the
approximate formula [11]:

$$C_{pad} = \varepsilon_{ox}\left[1.15\,\frac{A_{pad}}{H} + 1.4\left(\frac{T}{H}\right)^{0.222} P_{pad}\right]$$
 
where A_pad is the bonding pad area, P_pad the bonding pad
perimeter, H the height of the bonding pad above the
conductive silicon substrate, and T the thickness of the
bonding pad metallization.
Because the spacing between rerouting wires is
usually large, the coupling capacitance between
rerouting wires is negligible. The wire capacitance can be
derived as [11]:
 
$$C_{reroute} = \varepsilon_{ox}\left[1.15\,\frac{W}{H} + 2.8\left(\frac{T}{H}\right)^{0.222}\right] \times W_{reroute}$$
 
where W is the width of the rerouting wire, H the height of
the rerouting layer above the conductive silicon substrate,
T the thickness of the rerouting metal, and W_reroute the wire
length.
We simplify the estimation of the equivalent resistance by
treating the rerouting interconnect as a single metal line. This
is reasonable because, with adequate routing resources, the
rerouting usually does not change routing layers, or changes
only once.
 
$$R_{reroute} = \rho\,\frac{W_{reroute}}{W \times T}$$
 
where ρ is the resistivity of the metal line, W the
interconnect width, and T the thickness of the metal layer.
The solder bump can be viewed as a cylindrical
conductor. Its resistance can be estimated as:
 
$$R_{solder} = \rho_{solder}\,\frac{4 H_{solder}}{\pi D^{2}}$$
 
where ρ_solder is the resistivity of the solder material, H_solder the
height of the solder bump, and D the solder diameter.
The capacitance of solder joint can be derived as [11]:

$$C_{solder} = \varepsilon_{laminate}\left[1.15\,\frac{\pi D^{2}}{4 H_{sl}} + 1.4\left(\frac{H_{solder}}{H_{sl}}\right)^{0.222} \pi D\right]$$
 
where H_solder is the height of the solder bump, D the solder
diameter, and H_sl the distance from the solder bump to the laminate
ground plane. In addition, each solder bump with its underlying
via contributes approximately 0.5 nH of
inductance [12]. We therefore obtain the equivalent RLC values
listed in Table 3.
 
Table 3. RLC values in CLC IO equivalent circuit
                   R             L         C
FPGA pad           --            --        0.362 pF
FPGA rerouting     2.98 ohms     --        0.377 pF
Solder bump        3.46 mohms    0.5 nH    0.0317 pF
DRAM rerouting     1.28 ohms     --        0.162 pF
DRAM pad           --            --        0.362 pF
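
To give a feel for the magnitudes in Table 3, the following Python sketch sums the series resistance and shunt capacitance along the CLC IO path and forms a single lumped RC product. This is only a rough first-order illustration, not the HSPICE model used in the paper. It assumes two solder bumps in the path (one on each side of the laminate, following the IO path description in Section 3) and ignores the driver resistance, the solder inductance, and the load capacitance. The variable names are hypothetical.

```python
# Element values taken from Table 3 (ohms and farads).
R_FPGA_REROUTE = 2.98
R_SOLDER = 3.46e-3
R_DRAM_REROUTE = 1.28

C_FPGA_PAD = 0.362e-12
C_FPGA_REROUTE = 0.377e-12
C_SOLDER = 0.0317e-12
C_DRAM_REROUTE = 0.162e-12
C_DRAM_PAD = 0.362e-12

N_SOLDER_BUMPS = 2  # assumption: one bump on each side of the laminate

# Total series resistance of the on-package interconnect.
r_total = R_FPGA_REROUTE + N_SOLDER_BUMPS * R_SOLDER + R_DRAM_REROUTE

# Total parasitic capacitance hanging on the path (load not included).
c_total = (C_FPGA_PAD + C_FPGA_REROUTE + N_SOLDER_BUMPS * C_SOLDER
           + C_DRAM_REROUTE + C_DRAM_PAD)

# Crude lumped time constant; a distributed model would be smaller still.
tau = r_total * c_total

print(f"R_total = {r_total:.3f} ohm")
print(f"C_total = {c_total * 1e12:.3f} pF")
print(f"R_total * C_total = {tau * 1e12:.2f} ps")
```

The lumped product comes out at a few picoseconds, well below the sub-nanosecond rise and fall times reported in Section 3.3, which is consistent with the IO buffer rather than the interconnect dominating the delay.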
 
As a comparison, we also analyzed the corresponding 
implementation using Stack-Chip (SC) technology, as 
illustrated in Figure 4. 
[Figure 4 labels: BGA ball, substrate, bonding wire, logic, DRAM]
 
Figure 4. Stack chip technology 
 
The stack-chip package mainly exploits wire-bonding
technology. By eliminating additional rerouting layers, the
stack-chip package offers lower cost and lower design
complexity than the CLC package. However, the large
parasitic inductance introduced by bonding wires limits
its application in the high-frequency domain.
We model the bonding wire as a 25 um diameter
copper line; the bonding wire lengths for the FPGA and
DRAM are 3 mm and 5 mm, respectively. The parasitic
inductance can be approximated as 1.1 nH/mm [11]. The
routing on the substrate is also negligible because
flexibility in the netlist can minimize the distance between
corresponding FPGA and DRAM bonding
pads. We therefore obtain an equivalent RLC for the FPGA
bonding wire of 0.162 ohms, 0.321 pF, and 3.3 nH, and for the
DRAM bonding wire of 0.270 ohms, 0.401 pF, and 5.5
nH. The equivalent circuits of the CLC IO path and the SC IO
path are illustrated in Figure 5.
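
The bonding-wire resistance and inductance quoted above can be checked with the same cylindrical-conductor formula used for the solder bump. The Python sketch below is illustrative: the function names are hypothetical, and the resistivity is an assumption, since the paper does not state the value used. A resistivity of 2.65e-8 ohm-m reproduces the quoted 0.162 and 0.270 ohms; that is the usual handbook figure for aluminum, whereas the handbook value for copper (about 1.7e-8 ohm-m) would give roughly 0.10 and 0.17 ohms.

```python
import math

INDUCTANCE_NH_PER_MM = 1.1   # approximation quoted in the text [11]
WIRE_DIAMETER_M = 25e-6      # 25 um bonding wire, from the text
RESISTIVITY_OHM_M = 2.65e-8  # assumption: value that matches the quoted resistances


def cylinder_resistance(resistivity, length, diameter):
    """Resistance of a cylindrical conductor: rho * 4 * L / (pi * D^2)."""
    return resistivity * 4 * length / (math.pi * diameter**2)


def wire_inductance_nh(length_mm):
    """Bonding-wire inductance from the per-millimeter approximation."""
    return INDUCTANCE_NH_PER_MM * length_mm


for name, length_mm in (("FPGA", 3.0), ("DRAM", 5.0)):
    r = cylinder_resistance(RESISTIVITY_OHM_M, length_mm * 1e-3, WIRE_DIAMETER_M)
    l_nh = wire_inductance_nh(length_mm)
    print(f"{name} bonding wire: R = {r:.3f} ohm, L = {l_nh:.1f} nH")
```

The inductances (3.3 nH and 5.5 nH) are several times the 0.5 nH attributed to each solder bump, which is the property the comparison in Section 3.3 turns on.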
  
[Figure 5 panels: (top) RLC equivalent circuit for CLC package, with elements Driver, Cpad, RFPGA, CFPGA, two solder sections (Rsolder, Lsolder, Csolder), RDRAM, CDRAM, Cpad, Cload; (bottom) RLC equivalent circuit for SC package, with elements Driver, Cpad, two wire sections (Rwire, Lwire, Cwire), Cpad, Cload]
 
Figure 5. RC elements on SiP IO path 
 
3.3. Simulation result 
 
We simulate the above circuits using HSPICE. The goal is
to analyze how the signals are transmitted from chip to
chip inside the package, and how this new packaging
technology impacts the chip design. Figure 6 shows
the simulation results for the CLC IO path and the SC IO path.
The upper waveform is the signal transferred from the FPGA IO to
the DRAM IO in the CLC package; the lower one is the
corresponding signal in the SC package.
 
Figure 6. CLC Vs. SC simulation result 
 
Signals in the CLC package have very clean edges.
Overshoot and undershoot introduced by the solder
inductance are less than 0.05 V, so no terminating resistors
are needed. In contrast, the signals in the SC package
show significant overshoot (> 0.5 V) and undershoot (>
1.1 V). The falling-edge oscillation takes over 1.5 ns to
settle, which limits the application of the SC package in the
high-frequency domain (> 1 GHz). With an IO buffer driving
strength of (400/1, 200/1) and a 2 pF load at the input
gate, the delay, rise time, and fall time of the CLC IO path are
0.22 ns, 0.36 ns, and 0.27 ns, respectively.
Chip design, especially IO design, should be optimized
for the superior electrical environment in the CLC package.
The buffer size of the output driver can be minimized to
achieve smaller area and lower power. We analyzed how
the delay, rise time, and fall time change with
driving strength, as illustrated in Figure 7. The delay, rise
time, and fall time do not increase much when the buffer size
is reduced to 50%, which means significant savings in
chip area and power consumption. In SiP, internal IOs are
not connected to external pins and are not exposed to
electrostatic discharge (ESD). ESD protection therefore
becomes redundant and can be minimized [4]. Figure 8
shows the delay, rise time, and fall time vs. the input
capacitance. The timing decreases
linearly with smaller load capacitance, which means less delay,
smaller IO area, and less power. In
addition, the output driver can be further minimized with a
smaller load on the input end.
 
 
Figure 7. Delay and slew rate Vs. IO buffer size in CLC 
module (W/L is P transistor, P/N ratio is 2/1) 
 
 
Figure 8. Delay and slew rate Vs. load capacitance in 
CLC module 
 
4. Thermal Analysis of CLC-based SiP 
 
We analyzed the thermal performance of the CLC-based
SiP module using FLOTHERM [13], a commercial
thermal simulation tool, and compared the thermal
performance of the CLC package with that of the stack-chip package
and a system-on-a-chip (SoC) implementation. We assume that
the SoC implementation has a die area and power
consumption equal to the sums over the individual chips. This is
a reasonable first-order approximation: an SoC
implementation usually has lower power consumption and
smaller chip area than a CLC or SC implementation because
inter-chip connections and IOs are removed, and these two
factors have opposing effects on junction temperature,
which keeps the thermal performance of the SoC close to that of the
CLC implementation. The test packages were
simulated in a standard JEDEC test environment with
different airflows. The ambient temperature is set to
30 °C. Table 4 lists some technology parameters used to
model the three implementations.
Table 4. Thermal modeling parameters
                          CLC                   SC                    SoC
Geometry (mm)             12 x 12 (FPGA)        12 x 12 (FPGA)        15 x 15
                          2 x 4.5 x 9 (DRAM)    2 x 4.5 x 9 (DRAM)
Power consumption (W)     3 (FPGA)              3 (FPGA)              4.6
                          0.8 (DRAM)            0.8 (DRAM)
Package size (mm x mm)    31 x 31               31 x 31               31 x 31
Assembly technology       Solder bumping        Wire bonding          Solder bumping
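
The SoC column of Table 4 follows from the stated assumption that the SoC die area and power are the sums over the individual chips. A minimal arithmetic check in Python, with hypothetical variable names and assuming the 0.8 W DRAM figure is per chip:

```python
import math

N_DRAM = 2  # two DRAM chips in the module

# Die areas in mm^2, from the geometry row of Table 4.
fpga_area = 12 * 12
dram_area = 4.5 * 9
soc_area = fpga_area + N_DRAM * dram_area

# Side of an equivalent square die, in mm.
soc_side = math.sqrt(soc_area)

# Power in W, from the power row of Table 4 (0.8 W assumed per DRAM chip).
soc_power = 3 + N_DRAM * 0.8

print(f"SoC area  = {soc_area} mm^2 ({soc_side} mm square)")
print(f"SoC power = {soc_power:.1f} W")
```

The summed area of 225 mm^2 corresponds to the 15 x 15 mm die, and the summed power matches the 4.6 W entry.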
[Figure 7 plot: CLC Module IO Performance vs. Buffer Size; x-axis: P transistor W/L (400 down to 50); y-axis: Time (ns), 0 to 1.5; series: Delay, Rise Time, Fall Time]
 
Table 5 shows the simulation results. The CLC package has a
significant thermal advantage over the stack-chip package. In
the stack-chip package, silicon is not a good
thermal conductor, and the heat of the lower chip (FPGA)
has to dissipate through the upper chips (DRAM);
therefore, the FPGA junction temperature is much higher
than in the CLC module. In addition, the “back-heat” effect
on the DRAM chips makes the DRAM junction
temperature higher than that of the DRAMs in the CLC module. The
junction temperatures of the CLC module are suitable for
portable devices, which do not allow a large heat sink.
Airflow also has a large impact on junction temperature: with
1 m/s airflow from left to right, the temperature can be
reduced by as much as 14 °C.
[Figure 8 plot: CLC IO Performance vs. Load Capacitance; x-axis: Load Capacitance (pF), 0.25 to 2; y-axis: Time (ns), 0 to 0.4; series: Delay, Rise Time]
Table 5. Thermal simulation result (unit: °C)
                 Still Air    Forced Air
CLC    FPGA      80.72        66.82
       DRAM1     82.78        69.03
       DRAM2     82.72        68.81
       Case      80.08        66.11
SC     FPGA      102.90       89.78
       DRAM1     103.15       89.59
       DRAM2     103.35       89.25
       Case      102.55       88.40
SoC    Die       81.36        67.36
       Case      80.94        66.92
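
One way to summarize Table 5 is as an effective junction-to-ambient thermal resistance, theta_JA = (T_junction - T_ambient) / P_total, using the 30 °C ambient and the 4.6 W total power from this section. The paper does not report theta_JA; the Python sketch below is derived arithmetic only. It takes the FPGA temperature (or the SoC die temperature) as the junction temperature, which is an assumption, and the names are hypothetical.

```python
AMBIENT_C = 30.0      # ambient temperature from the text
TOTAL_POWER_W = 4.6   # 3 W FPGA plus two 0.8 W DRAMs (Table 4)

# (still air, forced air) temperatures in deg C from Table 5.
junction_temps_c = {
    "CLC": (80.72, 66.82),   # FPGA
    "SC": (102.90, 89.78),   # FPGA
    "SoC": (81.36, 67.36),   # die
}


def theta_ja(t_junction_c, power_w=TOTAL_POWER_W, ambient_c=AMBIENT_C):
    """Effective junction-to-ambient thermal resistance in deg C per watt."""
    return (t_junction_c - ambient_c) / power_w


for name, (still, forced) in junction_temps_c.items():
    print(
        f"{name}: theta_JA = {theta_ja(still):.1f} C/W (still air), "
        f"{theta_ja(forced):.1f} C/W (forced air), "
        f"airflow lowers the temperature by {still - forced:.1f} C"
    )
```

Under these assumptions the SC package comes out at roughly 16 °C/W in still air, against about 11 °C/W for both the CLC module and the SoC, and the 1 m/s airflow lowers each temperature by about 13 to 14 °C.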
 
Figure 9 compares the temperature
distributions of the CLC module, SC module, and SoC module.
In the CLC and
SoC modules, the heat is dissipated very well from the top
of the package. The SC package, on the other hand,
presents greater thermal resistance due to the low
conductivity of the encapsulant, so the junction
temperature is significantly higher.
  
 
 
 
 
Figure 9. Comparison of temperature distribution among 
CLC module, SC module, and SoC module  
(1m/s air flow, JEDEC test environment, package is 
removed to show the die temperature.) 
 
5. Conclusion 
 
CLC-based SiP can serve as an implementation
platform for giga-scale systems by giving designers
the opportunity to explore their hardware architecture and
physical implementation at an early stage and by simplifying
the physical package design task. We analyzed the
electrical characteristics and thermal performance of
CLC-based SiP and compared them with other implementation
platforms, such as stack-chip SiP and SoC. The results
demonstrate that CLC technology has a significant
performance advantage over conventional SiP and is an
ideal cost-effective alternative to system-on-a-chip.
 
 
 
 
 
6. Acknowledgement 
 
The authors would like to thank Professor Andrew B.
Kahng from University of California, San Diego, Dr.
King L. Tai from SyChip Inc., and Mr. Ethan Warner
from Flomerics Limited for their valuable help and
suggestions on this research. This work was funded in part
by the DARPA/MARCO Gigascale Silicon Research
Center.
 
 
References

[1] K. L. Tai, "System-In-Package (SIP): Challenges and Opportunities", Asia and South Pacific Design Automation Conference, pp. 191-196, 2000.
[2] Y. L. Low, R. C. Frye, and K. J. O'Conner, "Design methodology for chip-on-chip applications", IEEE Trans. on Components, Packaging, and Manufacturing Technology Part B, vol. 21, pp. 298-301, Aug. 1998.
[3] M. X. Wang, K. Suzuki, W. W.-M. Dai, Y. L. Low, K. J. O'Conner, and K. L. Tai, "Integration of Large-Scale FPGA and DRAM in a Package Using Chip-on-Chip Technology", Asia and South Pacific Design Automation Conference, pp. 205-210, 2000.
[4] M. Wang, K. Suzuki, W. Dai, A. Sakai, and K. Watanabe, "Configurable area IO memory for System-in-a-Package (SiP)", Proceedings of the 2001 European Solid State Circuit Conference, Sept. 2001.
[5] K. L. Tai, private communication, 2000.
[6] S. J. Yang, T. C. Chang, Ruey-Wen Chien, E. D. Wang, T. J. Gabara, K. L. Tai, and R. C. Frye, "High speed I/O buffer design for MCM", Proceedings of the 1997 IEEE Multi-Chip Module Conference, pp. 52-57, 1997.
[7] www.apack.com.tw, Apack Technologies Inc., 2002.
[8] "Packages for High Density Mounting", SONY semiconductor, vol. 23, 2000.
[9] M. X. Wang, K. Suzuki, and W. W.-M. Dai, "Memory and Logic Integration for System-in-a-Package", 4th International Conference on ASIC, 2001.
[10] Z. Yang, M. Rahman, and S. Mourad, "Signal integrity and design consideration of an MCM for video graphic acceleration", IEEE Transactions on Advanced Packaging, 2001.
[11] K. Martin, "Digital Integrated Circuit Design", Oxford University Press, 2000.
[12] Y. L. Low, Y. Degani, K. V. Guinn, T. D. Dudderrar, J. A. Gregus, and R. C. Frye, "RF Flip-Module BGA Package", IEEE Transactions on Advanced Packaging, 1999.
[13] www.flotherm.com, FLOMERICS Limited, 2002.