Temperature sensing and control in multi-zone semiconductor thermal processing by YAN HAN
 TEMPERATURE SENSING AND CONTROL IN  













A THESIS SUBMITTED 
FOR THE DEGREE OF DOCTOR OF PHILOSOPHY 
 
DEPARTMENT OF ELECTRICAL AND 
COMPUTER ENGINEERING 
NATIONAL UNIVERSITY OF SINGAPORE 
2009 
Acknowledgements
My deepest gratitude is to my advisor, Professor Ho Weng Khuen, for his patience
and support throughout my study and research at the National University of Singa-
pore. I have benefited enormously from the constructive advice and critiques that he
offered over many of our discussions. I must also extend my gratitude to Professor
Ling Keck Voon and Professor José Romagnoli for the help they rendered to my
research.
I would also like to thank my friends and colleagues: Dr. Hu Ni, Dr. Wu
Xiaodong, Dr. Fu Jun, Dr. Ye Zhen, Dr. Chen Ming, Ms. Wang Yuheng, Mr. Feng
Yong, Mr. Shao Lichun, Ms. Lim Li Hong, Mr. Nie Maowen, Mr. Chua Teck Wee,
Mr. Ngo Yit Sung, Mr. Lee See Chek, Ms. Teh Siew Hong, Mr. Lin Feng, Mr. Tan
Kiat An, Mr. Gibson Lee, and many others at the Advanced Control Technology
Laboratory (ACT), the Mechatronics & Automation Laboratory and the Control
& Simulation Laboratory. I have enjoyed entertaining and inspiring conversations
with them and I am thankful for the congenial and conducive working environment
to which we have all contributed.
Finally, my heartfelt thanks to my family for their unfailing support and under-




Table of Contents iii
Summary viii
List of Tables xi
List of Figures xiii
1 Introduction 1
1.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1.1 RTD Bias Estimation . . . . . . . . . . . . . . . . . . . . . . . 4
1.1.2 Control of the Multi-Zone Bake-Plate . . . . . . . . . . . . . . 6
1.2 Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.2.1 RTD Bias Estimation in Multi-Zone Semiconductor Thermal
Processing and Estimator Performance Analysis . . . . . . . . 8
1.2.2 Multiplexed MPC for Multi-Zone Semiconductor Thermal Pro-
cessing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.3 Organization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2 RTD Bias Estimation for Multi-Zone Semiconductor Thermal Pro-
cessing 13
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.2 Bake-Plate Thermal Modeling . . . . . . . . . . . . . . . . . . . . . . 17
2.3 Bias Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.3.1 Least Squares Estimation . . . . . . . . . . . . . . . . . . . . 21
2.3.2 GT-based Estimation . . . . . . . . . . . . . . . . . . . . . . . 23
2.4 Analysis of Estimator Performance . . . . . . . . . . . . . . . . . . . 25
2.4.1 Influence Function (IF) . . . . . . . . . . . . . . . . . . . . . . 26
2.4.2 IF of LS Estimator . . . . . . . . . . . . . . . . . . . . . . . . 29
2.4.3 IF of IQR+LS Estimator . . . . . . . . . . . . . . . . . . . . . 29
2.4.4 IF of GT-based Estimator . . . . . . . . . . . . . . . . . . . . 31
2.4.5 Estimation Variance . . . . . . . . . . . . . . . . . . . . . . . 32
2.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
Appendix 2A . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
Appendix 2B . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
Appendix 2C . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
Appendix 2D . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
Appendix 2E . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
3 Simulation and Experimental Results 43
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
3.2 A Simulation Example . . . . . . . . . . . . . . . . . . . . . . . . . . 45
3.2.1 Problem Setup . . . . . . . . . . . . . . . . . . . . . . . . . . 45
3.2.2 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
3.3 Experimental Verification of Theoretical Results . . . . . . . . . . . . 52
3.3.1 Experiment Setup . . . . . . . . . . . . . . . . . . . . . . . . . 52
3.3.2 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
3.3.3 Sample Calculation . . . . . . . . . . . . . . . . . . . . . . . . 61
3.3.4 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
3.4 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
Appendix 3A . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
4 Multiplexed MPC for Multi-Zone Semiconductor Thermal Process-
ing 75
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
4.2 Bake-Plate Thermal Modeling . . . . . . . . . . . . . . . . . . . . . . 81
4.3 A Review of Multiplexed MPC and Feedforward Control . . . . . . . 83
4.3.1 Multiplexed MPC . . . . . . . . . . . . . . . . . . . . . . . . . 83
4.3.2 Feedforward Control . . . . . . . . . . . . . . . . . . . . . . . 84
4.4 Experimental Results . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
4.4.1 Experiment Setup . . . . . . . . . . . . . . . . . . . . . . . . . 85
4.4.2 Parameter Estimation . . . . . . . . . . . . . . . . . . . . . . 85
4.4.3 Experimental Runs . . . . . . . . . . . . . . . . . . . . . . . . 87
4.4.4 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
4.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
Appendix 4A . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
5 Conclusion 97
5.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97





The importance of lithography to semiconductor manufacturing is evident. Lithog-
raphy constitutes about 30% of the cost of manufacturing a chip, and is the key
technological enabler for further down-scaling of device dimensions and upgrading
of chip performance. Thermal processing is an integral part of lithography. Because
the final critical dimension (CD) is very sensitive to the thermal processing
temperature, the call for ever smaller CD places increasingly stringent requirements
on temperature sensing and control in multi-zone semiconductor thermal processing.
Resistance Temperature Detectors (RTD’s) installed in a multi-zone bake-plate
typically used for semiconductor thermal processing are subject to measurement
bias. Data reconciliation (DR) techniques are extended so that RTD biases can
be estimated online from process data. To handle frequently encountered non-
normality in process data, a generalized T distribution (GT) based bias estimator
is proposed. Equations are derived which relate variance of a bias estimator to
sample size (number of wafer runs per estimation). These equations enable the
computation of the sample size or the number of wafers needed by the bias estimator
to achieve specified variance. With this information, the exact number of wafers
can be used for estimation so that bias can be estimated precisely and eliminated
as soon as possible to avoid wafer wastage. Alternatively, these equations allow the
calculation of the variance of the bias estimator and hence its precision if the number
of wafers used is given.
The theoretical results on estimator analysis are verified experimentally. In
the light of the equations derived, an efficient estimator can be selected. In the
presence of outliers that are close to the good data, the equations show that using
GT, instead of normal distribution, to characterize process data gives rise to a more
efficient estimator than Least Squares (LS) and Interquartile test plus Least Squares
(IQR+LS) and therefore enables earlier remedial actions against RTD bias to save
semiconductor wafers from sensing-related processing defects. In view of the cost of
manufacturing one wafer, a guided choice of an efficient RTD bias estimator and an
appropriate sample size for estimation is economically important.
To fulfil the stringent requirement on temperature control of a multi-zone bake-
plate, Multiplexed Model Predictive Control (MMPC) with feedforward is demon-
strated experimentally on a multi-zone bake-plate application. By distributing the
control moves over one complete update cycle, MMPC can afford to work with higher
sampling rate. It is shown to have the potential to make the bake-plate respond
and recover faster than under conventional MPC when disturbance is induced by
placement of a wafer and cannot be sufficiently compensated by feedforward.
The proposed GT-based bias estimator and the MMPC controller can easily
work in the same process to minimize processing defects due to RTD bias, and to




1.1 CD sensitivities for some commercially available KrF resists. . . . . . 2
3.1 Theoretical and experimental results of 5000 wafer runs divided into
50 per batch and bias estimation was performed batch by batch. . . . 57
3.2 Theoretical and experimental results with randomly selected 25% of
the measurements amplified by 3 times. . . . . . . . . . . . . . . . . 58
3.3 Theoretical and Experimental Results with randomly selected 10% of
the measurements amplified by 10 times. . . . . . . . . . . . . . . . 59
3.4 Theoretical and Experimental Results with randomly selected 25% of
the measurements amplified by 2 times. . . . . . . . . . . . . . . . . 66
3.5 Theoretical and Experimental Results with randomly selected 25% of
the measurements amplified by 6 times. . . . . . . . . . . . . . . . . 67
3.6 Simulation data for the illustrative example in Section 3.2. . . . . . . 71
4.1 Comparison of SMPC’s and MMPC’s measurement and computation
instants. ×1 denotes solving a 15-variable Quadratic Program (QP).
×2 denotes solving a 5-variable QP. . . . . . . . . . . . . . . . . . . . 89
List of Figures
2.1 A schema of an N-zone bake-plate: side view and slant view. The
bake-plate has radially distributed zones. Each zone contains inside
itself an RTD for temperature sensing and an individual resistive
heater for temperature control. . . . . . . . . . . . . . . . . . . . . . 18
2.2 An electric network analog of the bake-plate system. Analogy exists
between voltage and temperature above ambient, current and heater
power, capacitor and thermal capacitance, resistor and thermal resis-
tance, ground and ambient temperature. . . . . . . . . . . . . . . . . 19
2.3 Some special cases of GT. Distributional parameters for these special
cases are: σ = √2, p = 2, q → +∞ for normal; σ = √2, p = 2, q = 2.4
for t; σ = √2, p, q → +∞ for uniform; and σ = √2, p = 1, q → +∞
for Laplace. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
3.1 GT distribution versus normal distribution in characterizing a non-
normal distribution. GT’s distributional parameters are σ = √2, p = 20,
q = 100. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
3.2 The bake-plate used in the experiment. . . . . . . . . . . . . . . . . . 53
3.3 Raw data from one run (baking process of one wafer). . . . . . . . . . 56
3.4 GT distribution versus normal distribution in characterizing a non-
normal distribution. GT’s distributional parameters are σ = 1.57 × 10⁻²,
p = 2, q = 2.4. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
3.5 Influence function of various estimators. GT’s distributional parameters
are σ = 1.57 × 10⁻², p = 2, q = 2.4. . . . . . . . . . . . . . . . . . . . 69
4.1 Closed-loop temperature responses under MMPC and SMPC when a
flat wafer was placed on the bake-plate. . . . . . . . . . . . . . . . . . 77
4.2 Closed-loop temperature responses under MMPC and SMPC, with
feedforward, when a flat wafer was placed on the bake-plate. . . . . . 78
4.3 Schematics of (a) a flat wafer and (b) a warped wafer on the bake-plate. 79
4.4 Patterns of input moves for Standard MPC (left), and for the Multi-
plexed MPC (right). . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
4.5 Model verification using step response. From left to right, step input
applied to zone 1, zone 2, and zone 3. . . . . . . . . . . . . . . . . . . 86
4.6 Closed-loop temperature responses under MMPC and SMPC, with
feedforward, in response to a warped wafer. . . . . . . . . . . . . . . 92
4.7 Magnified view of Figure 4.6 from t = 73s to t = 77s. ‘×’s denote
measurement instants for MMPC while ‘◦’s denote measurement in-




The importance of lithography in the fabrication of integrated circuits (IC) is two-
fold. First, because of the numerous lithography steps involved in IC manufactur-
ing, lithography accounts for around 30% of the cost of manufacturing a chip [1].
Any drop in the lithographic process throughput amounts to a drop in the overall
throughput for the entire fabrication factory. Second, lithography tends to be the
technical bottleneck for further progress in reducing device size and improving chip
performance, and historically, advances in lithography have framed advances in IC
cost and performance [2].

Table 1.1: CD sensitivities for some commercially available KrF resists.
ARCH 2 ≈ 0

The Critical Dimension (CD), or the size of the smallest feature printed in resist
with adequate control, is the proxy for the accuracy of circuit patterns formed over
the lithography process. The shrinking of CD is the technological driver of Moore’s
Law [1]. As dimensions of a transistor shrink, an increased number of transistors can
be implemented per unit area on a silicon wafer, resulting in cost advantages and
in many cases improved device reliability. For instance, both gate delay and drive
current are proportional to the inverse of the gate length which is determined by
CD. It has been shown that a 1nm variation in channel CD is tantamount to a 1MHz
variation in chip speed [3].
The post-exposure bake (PEB) is a very critical thermal process in lithography
and the CD is typically much more sensitive to this bake than to the softbake [4]. CD
sensitivities to PEB temperature for commercially available KrF resists are shown
in Table 1.1 [5, 6]. While resist suppliers have responded to industrial requests
with resists that can be manipulated to realize smaller CD, there is an increasingly
demanding requirement on the accuracy of temperature control in thermal processing,
especially in PEB [7]. It has been shown that the application of an advanced thermal
processing system in PEB could contribute to a reduction in CD variance by 40%
[8]. It has also been shown that proper PEB control leads to improvement in CD
uniformity, which in turn leads to device performance improvement [9]. Specifically,
temperature uniformity across a bake-plate during both transient and steady-state
phases of PEB is critical to enhancing CD uniformity [10, 11, 12, 13, 14, 15].
Currently the most popular PEB method involves the use of a bake-plate [1].
The wafer is brought either into intimate vacuum contact with or close proximity
to a hot, high-mass metal plate. To fulfil the extremely tight CD specifications, the
bake-plate must have excellent across-plate temperature uniformity. To this end, the
bake-plate is typically designed into a multi-zone thermal system. A state-of-the-art
bake-plate with 49 independently controlled heating elements is proposed in [16, 17],
where each zone has an RTD installed within to obtain real-time measurements of
zone temperature.
A typical PEB step begins with a wafer at ambient temperature being trans-
ferred by a mechanical manipulator to the bake-plate held at a setpoint (between
70°C and 150°C and recipe specific). The wafer is removed immediately after being
baked for a pre-specified period of time.
The multi-zone configuration poses a major challenge to temperature sensing.
The fact that readings by RTD’s can be biased is a limiting factor on temperature
control accuracy. Bias should be estimated efficiently online to minimize the effect of
inaccurate temperature sensing. Another challenge is posed to control engineering:
a computationally tractable algorithm is required for real-time temperature control
of a multi-zone bake-plate which is then able to reject in a timely manner the
disturbance caused by the wafer.
1.1.1 RTD Bias Estimation
Thermal processing of semiconductor wafers is common and critical in semiconductor
manufacturing [18, 19, 20]. The most temperature sensitive step in the lithography
sequence is the post-exposure bake step [4, 21]. To obtain temperature uniformity,
a wafer is heated by multiple independently controlled heating elements simultane-
ously. Each zone is equipped with an RTD for temperature measurement. Heating
in the presence of RTD bias in such temperature sensitive cases inevitably causes
processing defects and reduces wafer yield. To maintain temperature control per-
formance, data reconciliation techniques [22, 23], which seek to find optimal state
estimates from process measurements by maximizing noise likelihood under model
constraint, can be used to perform online RTD calibration by estimating RTD biases
from process data [24].
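As a rough illustration of what data reconciliation does, the sketch below reconciles measurements against a single linear constraint under the normal-noise assumption, in which case maximizing the likelihood reduces to weighted least squares. The constraint (one quantity equals the sum of two others), the numbers and the function name `reconcile` are hypothetical; this is not the bake-plate model developed in Chapter 2.

```python
def reconcile(y, variances, a):
    """Weighted least-squares reconciliation under one constraint a . x = 0.

    Minimises sum((y_i - x_i)^2 / var_i) subject to sum(a_i * x_i) = 0,
    assuming independent, normally distributed measurement noise.
    Closed form: x = y - S a^T (a S a^T)^-1 (a y), with S = diag(variances).
    """
    violation = sum(ai * yi for ai, yi in zip(a, y))            # a . y
    scale = sum(ai * ai * vi for ai, vi in zip(a, variances))   # a S a^T
    return [yi - vi * ai * violation / scale
            for yi, vi, ai in zip(y, variances, a)]


# Hypothetical measurements that should satisfy x1 = x2 + x3.
measured = [10.3, 6.1, 3.9]
x_hat = reconcile(measured, [1.0, 1.0, 1.0], [1.0, -1.0, -1.0])
print(x_hat)  # reconciled values satisfy the constraint
```

The adjustment applied to each measurement is proportional to its variance, so less trusted sensors absorb more of the constraint violation.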
In process engineering, assumptions commonly made such as normal (Gaussian)
statistics are approximations to reality. The occurrence of outliers, transient data
in steady-state measurements, instrument failure, human error, process nature, etc.
can all induce non-normal process data [25]. Indeed, even where the central limit
theorem is invoked, it is only a limit theorem and can at most
suggest approximate normality for real data. Thermal processes in semiconductor
manufacturing are no exception and if the process being monitored does not vary
normally, conventional data reconciliation techniques, which rest on the normal
distribution assumption, may lead to poor results [18].
According to the central limit theorem, any random distribution can be effec-
tively transformed to the normal distribution by subgrouping and averaging. Studies
have shown that this can be accomplished effectively with subgroups of size as small
as three or four, so long as the primary distribution does not depart too significantly
from normality [26]. There are a number of circumstances that arise in semicon-
ductor manufacturing operations in which it is inconvenient to generate subgroups
of size greater than one. As a result, it is frequently desirable to have subgroups of
size equal to one and the practice may be problematic where the processes do not
vary normally. Hence it is useful to develop methods for non-normal distribution.
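The effect of subgrouping can be sketched numerically. The example below draws heavy-tailed (Laplace) samples, averages them in subgroups of four, and compares the excess kurtosis of the raw data with that of the subgroup means. The Laplace choice, the sample count and the seed are assumptions made purely for illustration; they do not come from the experiments in this thesis.

```python
import random
import statistics


def excess_kurtosis(xs):
    """Sample excess kurtosis (0 for a normal distribution)."""
    m = statistics.fmean(xs)
    var = statistics.fmean([(x - m) ** 2 for x in xs])
    fourth = statistics.fmean([(x - m) ** 4 for x in xs])
    return fourth / var ** 2 - 3.0


rng = random.Random(0)
# The difference of two independent exponentials is Laplace distributed.
raw = [rng.expovariate(1.0) - rng.expovariate(1.0) for _ in range(40000)]
# Subgroups of size four, averaged.
means = [statistics.fmean(raw[i:i + 4]) for i in range(0, len(raw), 4)]

print(excess_kurtosis(raw), excess_kurtosis(means))
```

For independent samples the excess kurtosis of a subgroup mean is that of the raw data divided by the subgroup size, so the means are closer to normal but not exactly normal. With subgroups of size one no such smoothing occurs at all.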
Conventional data reconciliation formulations assume that process data follows
normal (Gaussian) distribution. However, even high-quality process data may not
be normal. The presence of a single huge outlier can spoil the statistical analysis
completely. Hence it is not wise to use the Least Squares (LS) algorithm, the sim-
plest data reconciliation algorithm, without any built-in check [25, 27]. A common
practice to make LS robust is to introduce a preliminary outlier test before applying
LS. In the presence of outliers, an outlier test is expected to remove some or all of
them. One of the most popular tests is the interquartile (IQR) test [28].
The generalized T distribution (GT) has previously been employed in econo-
metrics to model random residuals in regression parameter estimation [29]. Being
a distribution superset encompassing normal, uniform, t and Laplace distributions,
GT has the flexibility to characterize economic data with non-normal statistical
properties [30]. It would be a promising distribution model based on which an RTD
bias estimator can be constructed.
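To make the flexibility of GT concrete, the sketch below evaluates a GT density in the parameterization commonly used in the econometrics literature, f(u) = p / (2σ q^(1/p) B(1/p, q) (1 + |u|^p / (qσ^p))^(q+1/p)). This form is an assumption here; the notation adopted in Section 2.3.2 may differ. It is, however, consistent with the special cases listed for Figure 2.3 (for example σ = √2, p = 2, q → +∞ for the normal distribution). The function name `gt_pdf` is hypothetical.

```python
import math


def gt_pdf(u, sigma, p, q):
    """Generalized T density, evaluated in log space for numerical stability."""
    # log B(1/p, q) via log-gamma, so that large q does not overflow.
    log_beta = math.lgamma(1.0 / p) + math.lgamma(q) - math.lgamma(1.0 / p + q)
    log_norm = math.log(p) - math.log(2.0 * sigma) - math.log(q) / p - log_beta
    log_kernel = -(q + 1.0 / p) * math.log1p(abs(u) ** p / (q * sigma ** p))
    return math.exp(log_norm + log_kernel)


sigma = math.sqrt(2.0)
# Large q with p = 2 approaches the standard normal density.
print(gt_pdf(0.0, sigma, 2, 1e4), 1.0 / math.sqrt(2.0 * math.pi))
# p = 2, q = 2.4 gives a heavier-tailed, t-like density.
print(gt_pdf(3.0, sigma, 2, 2.4), gt_pdf(3.0, sigma, 2, 1e4))
```

Smaller q thickens the tails, which is what lets a GT-based estimator down-weight measurements that a normal model would treat as fully informative.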
1.1.2 Control of the Multi-Zone Bake-Plate
Thermal processing of semiconductor wafers is commonly performed by placement
of the wafer on a heated plate for a pre-specified period of time. The heated plate
is of large thermal mass relative to the wafer and is held at a constant temperature
by a feedback controller that adjusts the resistive heater power in response to mea-
surements of the plate temperature. The plate is designed with multiple radial zone
configurations.
A general requirement for the multi-zone bake-plate is the ability to reject the
load disturbance induced by placement of a cold wafer on the plate. Initially the
plate temperature drops and then recovers because of closed-loop control. In man-
ufacturing, wafers are processed in quick succession, one after another. Sluggish
response will adversely affect, for example, the repeatability of the manufacturing
process if the recovery time of the plate temperature is longer than the baking time
of the wafer and the next wafer comes before the plate temperature fully recovers.
When this happens, there is not only wafer-to-wafer non-repeatability in tempera-
ture processing trajectory, but also plate-to-plate non-repeatability as the feedback
controller generally does not respond the same. If the processing temperature is
not critical, then this type of response is acceptable. However, with processes such
as PEB for chemically amplified photoresist processing, temperature control is very
critical [31, 32].
If a wafer is perfectly flat and the time at which the wafer arrives is known in
advance, a standard feedforward controller can be designed to eliminate the tem-
perature disturbance. In practice, however, wafer conditions differ. For example, a
wafer can warp up to 100µm [33]. In this case, feedforward control will not be able
to eliminate the temperature disturbance completely and feedback control will still
be necessary.
Work on bake-plate temperature control can be found in [34, 35, 36]. In [35],
PI was used as the feedback controller, while the more sophisticated MPC and
LQG controllers were used in [34] and [36] respectively. MPC operates by solving
a constrained optimization problem online, in real-time, in order to decide how to
update the control inputs (manipulated variables) at the next update instant. This
results in demanding online computational load and can be a limiting factor when
MPC is applied to complex systems with many inputs or implemented on embedded
systems where computational resources are limited.
In a recent work, a variant of MPC called Multiplexed MPC, or MMPC was
proposed and stability results for MMPC were also established [37, 38]. MMPC
distributes the update of control inputs over one complete cycle to the effect that
optimization is with respect to only one control input at a time. The motivation
for MMPC is to reduce real-time computational load when MPC is implemented on
a multivariable system. In multi-zone semiconductor thermal processing, MMPC
has the potential to bring about better temperature control performance, than con-
ventional MPC, by affording faster sampling with its much reduced computational
load.
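The difference in update patterns can be sketched as a simple schedule. The example assumes the 3-zone plate, the control horizon of 5 and the 1.2s and 0.4s sampling intervals reported for the experiments in this thesis; the round-robin zone order and the function names are illustrative assumptions, and no optimization is actually solved.

```python
def smpc_schedule(n_zones, period_ms, n_steps):
    """Standard MPC: all inputs are updated together at every sampling instant."""
    all_zones = list(range(1, n_zones + 1))
    return [(k * period_ms, all_zones) for k in range(n_steps)]


def mmpc_schedule(n_zones, period_ms, n_steps):
    """Multiplexed MPC: one input per instant, cycling through the zones.

    Assumes the full update cycle equals the standard MPC period, so each
    sub-interval is period_ms / n_zones (taken here to divide exactly).
    """
    sub_ms = period_ms // n_zones
    return [(k * sub_ms, [k % n_zones + 1]) for k in range(n_steps)]


def qp_size(inputs_updated, control_horizon):
    """Number of decision variables in the QP solved at one instant."""
    return inputs_updated * control_horizon


print(smpc_schedule(3, 1200, 2))  # zones 1-3 together, every 1200 ms
print(mmpc_schedule(3, 1200, 4))  # one zone every 400 ms
print(qp_size(3, 5), qp_size(1, 5))
```

Each multiplexed step solves a smaller problem, which is what makes the shorter sampling interval affordable on the same hardware.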
1.2 Contributions
1.2.1 RTD Bias Estimation in Multi-Zone Semiconductor
Thermal Processing and Estimator Performance Anal-
ysis
As has been discussed in Section 1.1.1, an array of RTD’s are installed in the multi-
zone bake-plate for semiconductor thermal processing. GT-based data reconciliation
for the state vector has been examined in a simulation study [25] and we extend the
technique to RTD bias estimation.
We derive equations relating variance of the bias estimator to sample size (number
of wafer runs per estimation). These equations enable us to compute the sample
size or the number of wafers needed by the bias estimator to achieve specified vari-
ance. With this information, the precise number of wafers can be used and wastage
can be prevented. Alternatively, these equations allow us to calculate the variance
of the bias estimator and hence its precision if the number of wafers used is given.
Such information can be used to select appropriate bias estimators depending on
applications.
We examine specifically the performance of the simple least squares (LS), in-
terquartile range test plus least squares (IQR+LS) and the generalized T distribution
(GT) based bias estimator for the difficult problem where measurement outliers are
close to good data such that they cannot be separated easily. The theory is verified
experimentally on a multi-zone bake-plate for semiconductor thermal processing.
In the light of the equations derived, an efficient estimator can be selected. In
the presence of outliers that are close to the good data, the equations show that
using GT, instead of normal distribution, to characterize process data gives rise to a
more efficient estimator than the LS and the IQR+LS and therefore enables earlier
remedial actions against RTD bias to save semiconductor wafers from sensing-related
processing defects.
An experiment was performed with 25% of the temperature and power fluctu-
ations coming from distributions with standard deviations 3 times as great as the
rest. It is shown that the GT based estimator with 43 wafer runs achieved the same
estimation variance as the IQR+LS estimator with 50 wafer runs. In other words,
7 wafers could be saved from sensing-related processing defects. Considering that
the cost of manufacturing one semiconductor wafer is of the order of USD 10,000
[39], process data is expensive to come by and processing defects are costly. A
guided choice of an efficient RTD bias estimator and an appropriate sample size for
estimation is therefore economically important.
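The trade-off between sample size and estimation variance can be sketched as follows, assuming the usual asymptotic form in which the variance of an estimator equals a constant divided by the number of wafer runs. The exact equations are derived in Chapter 2 and are not reproduced here. The two constants below are placeholders chosen only so that their ratio mirrors the 43 versus 50 wafer runs quoted above; they are not measured values.

```python
import math


def estimator_variance(asymptotic_variance, n_runs):
    """Variance of the bias estimate after n_runs wafer runs (assumes V / n)."""
    return asymptotic_variance / n_runs


def runs_needed(asymptotic_variance, target_variance):
    """Smallest number of wafer runs with V / n <= target_variance."""
    return math.ceil(asymptotic_variance / target_variance)


# Placeholder constants in arbitrary units (illustrative only).
V_IQR_LS = 50.0
V_GT = 43.0
TARGET = 1.0

n_iqr_ls = runs_needed(V_IQR_LS, TARGET)
n_gt = runs_needed(V_GT, TARGET)
print(n_iqr_ls, n_gt, n_iqr_ls - n_gt)
```

Under this assumption, the ratio of the two constants is the relative efficiency of the estimators, and it translates directly into the number of wafers saved for a given target variance.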
1.2.2 Multiplexed MPC for Multi-Zone Semiconductor Ther-
mal Processing
Multiplexed MPC (MMPC), by distributing the control moves over a complete up-
dating cycle, reduces computational load otherwise incurred by conventional MPC.
We demonstrate experimentally the application of MMPC with feedforward to the
control of a multi-zone bake-plate.
It is shown that adding feedforward reduces the effect of disturbance signifi-
cantly. While most of the temperature drop will be compensated by feedforward
control, feedback control is still needed to cope with the errors due, for example,
to warped wafers. This is supported by experimental results. These results are
important for the semiconductor wafer baking process, because temperature non-
uniformity due to poor temperature control affects CD of the wafer.
Depending on the thermal process, the recipe baking time varies and plate
temperature should recover within the pre-specified period of time. We note from
the experiments the potential of MMPC to make plate temperature recover faster
than under conventional MPC after a disturbance takes place. Although the sampling
rate is not the only factor bearing on closed-loop control performance, MMPC’s key
advantage of reduced computational load supports faster sampling and potentially
brings about superior control performance.
For a 3-zone bake-plate with control horizon Mu = 5, conventional MPC needs to
perform optimization with respect to 3×Mu = 15 variables every sampling interval
whereas MMPC needs to perform optimization with respect to only 5 variables. In
our experiment, standard MPC (SMPC) sampled every 1.2s, or about 0.83Hz. MMPC, with
its reduced computational load, could sample every 0.4s, or 2.5Hz, without the need for
a hardware upgrade. This computational advantage of MMPC becomes even more
significant when constraints are considered and with an increasing number of zones
and extended control horizon. For example, consider the state-of-the-art 49-zone
bake-plate [36] with Mu = 5. SMPC would have to solve a Quadratic Program




This thesis is organized as follows. In Chapter 2, a GT-based RTD bias estima-
tor is proposed for the multi-zone bake-plate and an analytic method is developed
to study estimator performance. Chapter 3 gives detailed examples verifying the
results obtained in Chapter 2 and discusses the performance difference among se-
lected estimators, including the GT-based estimator that we propose. Chapter 4
discusses the use of MMPC as a computationally efficient control scheme favorable
for temperature control of the multi-zone bake-plate. Chapter 5 summarizes the
work presented in previous chapters and gives a brief outlook on future work.
Chapter 2




Thermal processing of semiconductor wafers is common and critical in semiconduc-
tor manufacturing [18, 19, 20]. Temperature uniformities within a wafer and from
wafer to wafer are important issues with stringent specifications [7] and significant
impact on the smallest feature size or critical dimension of integrated circuits [15].
The most temperature sensitive step in the microlithography sequence is the post-
exposure bake step [21]. To obtain temperature uniformity, a wafer is heated by
multiple independently controlled heating elements simultaneously. A state-of-the-
art thermal system with 49 independently controlled heating elements is described in
[16, 17]. Each zone has a resistance temperature detector (RTD) within to provide
temperature measurements. Heating in the presence of sensor bias in such tempera-
ture sensitive processes inevitably causes processing defects and reduces wafer yield.
To maintain temperature control performance, data reconciliation techniques [22,
23] are proposed in this chapter to perform online sensor calibration by estimating
RTD biases from process data.
Conventional data reconciliation formulations assume that process data follows
normal (Gaussian) distribution. However, even high-quality process data may not
be normal. The presence of a single huge outlier can spoil the statistical analysis
completely. In process engineering, assumptions commonly made such as normal
(Gaussian) statistics are approximations to reality. The occurrence of outliers, tran-
sient data in steady-state measurements, instrument failure, human error, process
nature, etc. can all induce non-normal process data [25]. Indeed, even where the
central limit theorem is invoked, it is only a limit theorem and can
at most suggest approximate normality for real data. Semiconductor manufacturing
processes are no exception and if the statistics of the process being monitored are not
normal, conventional data reconciliation techniques may lead to poor results [18].
Hence it is not wise to use Least Squares (LS), the simplest data reconciliation
algorithm, without any built-in check [25, 27].
According to the central limit theorem, any random distribution can be effec-
tively transformed to the normal distribution by subgrouping and averaging. Studies
have shown that this can be accomplished effectively with subgroups of size as small
as three or four, so long as the primary distribution does not depart too significantly
from normality [26]. There are a number of circumstances that arise in semicon-
ductor manufacturing operations in which it is inconvenient to generate subgroups
of size greater than one. There are many reasons for restricting subgroups to the
minimum size, which is one:
• Measuring is expensive. Metrology equipment, for measuring defects, film
thickness, linewidths and overlay, is expensive. Reducing the number of mea-
surements reduces the expenditures required for metrology equipment.
• Equipment utilization needs to be maximized. Processing wafers for the pur-
pose of monitoring equipment detracts from productive time. Processing
equipment is used to make saleable products. It is clearly desirable to minimize
the time used for processing material that cannot be sold.
• Data collection is very time consuming. The amount of time required for peo-
ple to process, measure and analyze data collected by monitoring equipment
tends to increase with the number of wafers processed. Minimizing the number
of wafers reduces the level of human resources needed to control the process.
For all these reasons, it is frequently desirable to have subgroups of size equal to
one and the practice may be problematic where the processes do not vary normally.
Hence it is useful to develop methods for non-normal distributions.
A common practice to make LS robust is to introduce a preliminary outlier
test before applying LS. In the presence of outliers, an outlier test is expected to
remove some or all of them. One of the most popular tests is the interquartile (IQR)
test [28], which detects as outliers any measurements that lie more than 1.5 times
the interquartile range below the sample’s first quartile or more than 1.5 times the
interquartile range above the sample’s third quartile.
The generalized T distribution (GT) has previously been employed in econometrics
to model random residuals in regression parameter estimation [29]. Being a
distribution superset encompassing the normal, uniform, t and Laplace distributions, GT
has the flexibility to characterize economic data with non-normal statistical
properties [30]. GT-based data reconciliation for the state vector has been examined in a
simulation study [25]. In this chapter, it is extended to bias estimation.
We derive equations relating variance of the bias estimator to sample size. These
equations enable us to compute the sample size or the number of wafers needed by
the bias estimator to achieve specified estimate variance. With this information, the
precise number of wafers can be used and wastage can be prevented. Alternatively,
these equations allow us to calculate the variance of the bias estimator and hence
its precision if the number of wafers used is given. This information can be used to
select appropriate bias estimators depending on applications. As special cases we
examine the performance of the least squares (LS), interquartile range test plus least
squares (IQR+LS) and the generalized T distribution (GT) based bias estimator.
The theory is verified experimentally in Chapter 3 on a multi-zone thermal system
for semiconductor wafer processing.
The chapter is organized as follows. In Section 2.2, the steady-state model
of a generic multi-zone bake-plate is obtained. In Section 2.3, RTD bias estimation
problems based on LS, IQR+LS, and GT are formulated and their solutions derived.
In Section 2.4, estimator performance is analyzed and, as special cases, simple LS,
IQR+LS and the GT-based estimator are studied. The chapter is concluded in
Section 2.5.
2.2 Bake-Plate Thermal Modeling
For bias estimation, the structure of the steady-state model of the multi-zone thermal
system is determined first. The model serves as the steady-state constraint for
data reconciliation and is relevant to the quantitative analyses of bias estimators in
Chapters 2 and 3. A typical multi-zone thermal device with a wafer being baked
on top can be divided into ring-shaped zones (see Figure 2.1). Zones are numbered,
from center to edge, 1, 2, · · · , N , respectively. Consider the energy balance and heat
transfer in a distributed lumped parameter model. An analogy can be made between




Figure 2.1: A schema of an N−zone bake-plate: side view and slant view.
The bake-plate has radially distributed zones. Each zone contains inside
itself an RTD for temperature sensing and an individual resistive heater for
temperature control.
the circuit, the dynamic model of the thermal system is given by









where Ci, Ti(t) and pi(t) denote the heat capacitance, temperature above ambient
and power input of the ith zone, respectively. The thermal resistance between the
(i− 1)th and the ith zones is given by r(i−1)i, and rai denotes the thermal resistance
between the ith zone and ambient. Special cases are r(i−1)i = ∞ for i = 1 and
ri(i+1) =∞ for i = N .
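The energy balance that these definitions describe can be sketched as follows. This is a reconstruction from the electric-network analogy (each zone acts as a capacitor, connected through resistors to its neighbouring zones and to ambient) and is an assumption about the form of Equation (2.1); the arrangement of terms in the original may differ:

```latex
C_i \dot{T}_i(t) = -\frac{T_i(t)}{r_{ai}}
                   - \frac{T_i(t) - T_{i-1}(t)}{r_{(i-1)i}}
                   - \frac{T_i(t) - T_{i+1}(t)}{r_{i(i+1)}}
                   + p_i(t), \qquad i = 1, \dots, N
```

With r(i−1)i = ∞ for i = 1 and ri(i+1) = ∞ for i = N, the corresponding coupling terms vanish for the innermost and outermost zones.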
At steady state, T˙i(∞) = 0 and Equation (2.1) becomes







































Figure 2.2: An electric network analog of the bake-plate system. Analogy
exists between voltage and temperature above ambient, current and heater
power, capacitor and thermal capacitance, resistor and thermal resistance,
ground and ambient temperature.
Defining new variables θi(t) = Ti(t)−Ti(∞), ui(t) = pi(t)− pi(∞) and substituting
Equation (2.2) into Equation (2.1) gives









At steady state, θ˙i(t) = 0 and Equation (2.3) reduces to



















. . . . . . . . .








x = [x1 · · · x2N ]T = [θ1 θ2 · · · θN u1 u2 · · · uN ]T
2.3 Bias Estimation
For an N -zone bake-plate in the absence of bias, a 2N -dimensional data vector y is
related to a 2N -dimensional state vector x by
y = x+ ε
where ε is the random noise vector and it is assumed that the expectation E(ε) = 0.
In the presence of bias, data vector y and state vector x are related by
y = x+Be+ ε (2.5)
In the above, e is an N -dimensional vector containing N bias values to be estimated.
B is a 2N×N matrix with Bii (i = 1, · · · , N) being one and all other elements being
zero. B specifies all N RTD’s in an N -zone bake-plate: x1, · · · , xN are temperatures
and are measured by RTD’s which are subject to bias; xN+1, · · · , x2N are heater
powers and are measured independent of RTD bias. However, temperatures and
heater powers are related by matrix A.
2.3.1 Least Squares Estimation
The LS data reconciliation algorithm is studied in [22, 23]. We now extend it to bias
estimation. Under the assumption that ε follows a multivariate normal distribution,
the noise variance matrix is given by
Λ = diag(Λ1, . . . , Λ2N)
where Λi = var(εi). The estimation of the bias vector e can be formulated as a
minimization problem under the constraint Ax = 0. In the LS framework, specifically, the
objective is to minimize
$$ \sum_{j=1}^{S} (y(j) - x - Be)^T \Lambda^{-1} (y(j) - x - Be) $$
with respect to x and e, subject to Ax = 0, where S is the sample size.
It is shown in Appendix 2A that the solution to the above problem is given by
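$$ e^* = (AB)^{-1} A \bar{y} \qquad (2.6) $$
where the ith element of ȳ is the sample mean $\bar{y}_i = \frac{1}{S}\sum_{j=1}^{S} y_i(j)$, which can be rewritten as
$$ \sum_{j=1}^{S} (y_i(j) - \bar{y}_i) = 0 \qquad (2.7) $$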
If Ω = diag(var(ȳ1), . . . , var(ȳ2N)), then the variance of e∗ is calculated as
$$ \Phi = (AB)^{-1} A \Omega A^T (B^T A^T)^{-1} \qquad (2.8) $$
It is well known that the LS algorithm is highly sensitive to outliers. A common
practice to make LS robust is to introduce a preliminary outlier test before applying
LS. In the presence of outliers, an outlier test is expected to remove some or all of
them. One of the most popular tests is perhaps the interquartile (IQR) test [28, 40],
among others. For a given data sample, the IQR test finds the 25th percentile Q1,
the 75th percentile Q3, and the interquartile range IQR = Q3−Q1. Any observation
yi(j) such that yi(j) < yL = Q1 − 1.5× IQR or yi(j) > yH = Q3 + 1.5× IQR will
then be removed from the sample. The remaining data are then relegated to LS, i.e.,
Equation (2.7) is modified to
$$ \sum_{j} (y_i(j) - \bar{y}_i) = 0, \quad y_L \le y_i(j) \le y_H \qquad (2.9) $$
so that any measurements detected as outliers do not affect the solution of y¯i.
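The IQR+LS step can be sketched for a single measurement channel as below. This is a minimal illustration, not the implementation used in the experiments: the function name `iqr_filtered_mean`, the sample readings, and the "inclusive" quartile convention are all assumptions, since the text does not fix a percentile convention.

```python
import statistics

def iqr_filtered_mean(sample):
    """Mean of the observations that pass the 1.5*IQR outlier test."""
    # Quartile convention is an assumption; other conventions shift the limits slightly.
    q1, _, q3 = statistics.quantiles(sample, n=4, method="inclusive")
    iqr = q3 - q1
    y_low, y_high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    kept = [y for y in sample if y_low <= y <= y_high]
    return sum(kept) / len(kept), (y_low, y_high)

# Hypothetical steady-state readings with one gross outlier (95.0).
readings = [90.9, 91.0, 91.1, 91.0, 90.95, 91.05, 91.0, 95.0]
mean_all = sum(readings) / len(readings)          # plain LS average
mean_iqr, limits = iqr_filtered_mean(readings)    # IQR test followed by LS
```

The plain average is pulled toward the outlier, while the filtered average is computed only from readings inside [yL, yH], which is the modification expressed by Equation (2.9).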
2.3.2 GT-based Estimation
This estimator is based on the GT distribution, which has the probability density
function
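$$ f(z; \sigma, p, q) = \frac{p}{2 \sigma q^{1/p} \beta(1/p, q) \left(1 + \dfrac{|z|^p}{q \sigma^p}\right)^{q + 1/p}} \qquad (2.10) $$
where z is any realization value of ε, σ is the scaling parameter, p and q are the
shape parameters, and $\beta(a, b) = \int_0^1 v^{a-1} (1 - v)^{b-1} \, dv$ is the Beta function.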
It is shown in Appendix 2B that GT can become normal, t, uniform, and Laplace
distributions by assuming appropriate distributional parameter values. Figure 2.3
gives the plots of some distributions which GT can reduce to. It can be seen that
p = 2 and an appropriate finite q make GT close to the normal distribution with
slightly thicker distribution flanks. This offers the possibility of having GT perform
very much like LS when process data has good normality and yet much better than
LS when process data involves outliers, which GT’s thicker flanks can take into
consideration.
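The behaviour described above can be checked numerically. The sketch below assumes the standard form of the GT density (scale σ, shape parameters p and q, Beta-function normalization); the helper names `gt_pdf` and `normal_pdf` are illustrative only. It compares GT with p = 2 and a very large q against N(0, σ²/2), and looks at how much heavier the flank becomes for q = 2.4.

```python
import math

def gt_pdf(z, sigma, p, q):
    """Generalized T density f(z; sigma, p, q), evaluated in log space for stability."""
    log_beta = math.lgamma(1.0 / p) + math.lgamma(q) - math.lgamma(1.0 / p + q)
    log_norm = math.log(p) - math.log(2.0 * sigma) - math.log(q) / p - log_beta
    log_kernel = -(q + 1.0 / p) * math.log1p(abs(z) ** p / (q * sigma ** p))
    return math.exp(log_norm + log_kernel)

def normal_pdf(z, var):
    """Zero-mean normal density with variance var."""
    return math.exp(-z * z / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

sigma = math.sqrt(2.0)

# p = 2 with very large q should approach N(0, sigma^2 / 2).
near_normal = gt_pdf(0.5, sigma, 2.0, 1e6)
reference = normal_pdf(0.5, sigma ** 2 / 2.0)

# A finite q (here 2.4) keeps the bell shape but thickens the flanks.
tail_ratio = gt_pdf(4.0, sigma, 2.0, 2.4) / normal_pdf(4.0, sigma ** 2 / 2.0)
```

Far from the centre the q = 2.4 density is many times larger than the normal density, which is why observations in the flanks are down-weighted rather than treated as implausible.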























Figure 2.3: Some special cases of GT. Distributional parameters for these
special cases are: σ = √2, p = 2, q → +∞ for normal; σ = √2, p = 2, q = 2.4
for t; σ = √2, p, q → +∞ for uniform; and σ = √2, p = 1, q → +∞ for
Laplace.
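In the framework of GT, the maximum likelihood method is used to find the
bias estimate by maximizing the log-likelihood
$$ \sum_{i=1}^{2N} \sum_{j=1}^{S} \ln f(y_i(j) - x_i - b_i e; \sigma_i, p_i, q_i) \qquad (2.11) $$
with respect to x and e, subject to Ax = 0, where row vector bi denotes the
ith row of B.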




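$$ \sum_{j=1}^{S} \frac{(p_i q_i + 1)\,\mathrm{sgn}(y_i(j) - \bar{y}_i)\,|y_i(j) - \bar{y}_i|^{p_i - 1}}{q_i \sigma_i^{p_i} + |y_i(j) - \bar{y}_i|^{p_i}} = 0 \quad (i = 1, \cdots, 2N) \qquad (2.12) $$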
and then by finding
e∗ = (AB)−1Ay¯
As with LS, if Ω is the variance of y¯ then the variance of e∗ can again be calculated
from Equation (2.8).
2.4 Analysis of Estimator Performance
Note that from Equation (2.8) the estimation variance by LS and GT depends on
the variance of y¯ denoted by Ω. Hence to compare estimators, we turn to the study
of Ω. An important analysis tool is the Influence Function (IF) [27]. It will be used
here to quantify estimate variance and to offer a qualitative picture of how outliers
affect the performance of various estimators.
2.4.1 Influence Function (IF)
Consider a more general form of Equations (2.7), (2.9) and (2.12) where subscript i
is dropped for brevity:
$$ \sum_{j=1}^{S} \psi(y(j) - \bar{y}) = 0 \qquad (2.13) $$
If an extra measurement y(S + 1) is added to the sample y(j) (j = 1, · · · , S), let
the new estimate be ȳ + ∆ȳ so that Equation (2.13) is modified to
$$ \sum_{j=1}^{S+1} \psi(y(j) - \bar{y} - \Delta\bar{y}) = 0 $$
Or equivalently
$$ \psi(y(S+1) - \bar{y} - \Delta\bar{y}) + \sum_{j=1}^{S} \psi(y(j) - \bar{y} - \Delta\bar{y}) = 0 $$
Hence using Taylor series expansion at y¯ gives












$\psi^{(k)}(y(j) - \bar{y}) \cdot \dfrac{(-\Delta\bar{y})^k}{k!}$












The approximation tends to equality as sample size S →∞ and therefore ∆y¯ → 0.
Consider S∆y¯ as the standardized effect on the estimate y¯ by the addition of y(S+1)






















This gives a finite sample approximation for the standardized effect. Given a dis-
tribution instead of a finite sample, the influence function (IF) is the formal metric
of the standardized effect on the resultant estimate caused by an extra observa-
tion to an infinite sample. In other words, IF describes estimator behavior in the
neighborhood of an underlying data distribution. It is defined as
$$ \mathrm{IF}(z) = \lim_{h \to 0} \frac{\mu((1 - h) f + h \delta(z)) - \mu(f)}{h} $$
where µ(·) is the estimator which yields an estimate given a data distribution as the
input argument, f is the probability density function of ε, and δ(z) is the probability
density of a unit probability mass at point z, i.e., δ(z) is a unit impulse at z.
Consider an estimator µ(f) determined from
$$ \int_{-\infty}^{+\infty} \psi(\varepsilon, \mu(f)) f(\varepsilon) \, d\varepsilon = 0 \qquad (2.16) $$






In the following we find the IF of LS, IQR+LS, and GT-based estimators using
Equation (2.17). To facilitate the derivation of IF we define
µi = y¯i − xi − bie (2.18)
so that using Equation (2.5) we rewrite the term (yi(j) − y¯i) in Equations (2.7),
(2.9) and Equation (2.12) as
yi(j)− y¯i = εi(j)− µi (2.19)
2.4.2 IF of LS Estimator
With Equation (2.19), and for a continuous distribution with probability density
function f(ε), Equation (2.7) becomes
$$ \int_{-\infty}^{+\infty} (\varepsilon - \mu(f)) f(\varepsilon) \, d\varepsilon = 0 \qquad (2.20) $$
where subscript i is dropped for the sake of brevity. Comparing Equations (2.16)
and (2.20) gives
$$ \psi(\varepsilon) = \varepsilon \qquad (2.21) $$







$\int_{-\infty}^{+\infty} f(\varepsilon) \, d\varepsilon = 1$.
2.4.3 IF of IQR+LS Estimator
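With Equation (2.19), and for a continuous distribution with probability density
function f(ε), Equation (2.9) becomes
$$ \int_{-\infty}^{\varepsilon_L} 0 \cdot f(\varepsilon) \, d\varepsilon + \int_{\varepsilon_L}^{\varepsilon_H} (\varepsilon - \mu(f)) f(\varepsilon) \, d\varepsilon + \int_{\varepsilon_H}^{+\infty} 0 \cdot f(\varepsilon) \, d\varepsilon = 0 \qquad (2.23) $$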
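where subscript i is dropped for the sake of brevity. The limits εL and εH are given
by
$$ \varepsilon_L = Q_1 - 1.5 (Q_3 - Q_1) \qquad (2.24) $$
$$ \varepsilon_H = Q_3 + 1.5 (Q_3 - Q_1) \qquad (2.25) $$
where the quartiles Q1 and Q3 satisfy
$$ \int_{-\infty}^{Q_1} f(\varepsilon) \, d\varepsilon = 0.25, \qquad \int_{-\infty}^{Q_3} f(\varepsilon) \, d\varepsilon = 0.75 $$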
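Comparing Equations (2.16) and (2.23) gives
$$ \psi(\varepsilon) = \begin{cases} \varepsilon & \text{if } \varepsilon_L \le \varepsilon \le \varepsilon_H \\ 0 & \text{otherwise} \end{cases} \qquad (2.26) $$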












if εL ≤ z ≤ εH , and
IF(z) = 0










if εL ≤ z ≤ εH
0 otherwise
(2.27)
2.4.4 IF of GT-based Estimator
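With Equation (2.19), and for a continuous distribution with probability density
function f(ε), Equation (2.12) becomes
$$ \int_{-\infty}^{+\infty} \frac{(pq + 1)\,\mathrm{sgn}(\varepsilon - \mu)\,|\varepsilon - \mu|^{p-1}}{q \sigma^p + |\varepsilon - \mu|^p} f(\varepsilon) \, d\varepsilon = 0 \qquad (2.28) $$
where subscript i is dropped for the sake of brevity. Comparing Equations (2.16)
and (2.28) gives
$$ \psi(\varepsilon) = \frac{(pq + 1)\,\mathrm{sgn}(\varepsilon)\,|\varepsilon|^{p-1}}{q \sigma^p + |\varepsilon|^p} \qquad (2.29) $$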




















In summary, the variance of y¯, denoted by Ω, can be found element-wise from
Equation (2.31) based on the IF given by Equation (2.17) where the estimator is
determined by Equation (2.16). The variance of bias estimates, denoted by Φ, can
then be computed from Equation (2.8). In addition, with variance specifications
on bias estimations, the appropriate sample size can be calculated from Equations
(2.31) and (2.8).
2.5 Conclusion
In this chapter we studied the problem of sensor bias estimation, a critical issue
in multi-zone semiconductor thermal processing. We derived equations relating
variance of the bias estimator to sample size. These equations enable us to compute
the sample size or the number of wafers needed by the bias estimator to meet
specified estimation variance. With this information, the precise number of wafers
can be used and wastage can be prevented. In addition, these equations allow us to
calculate the variance of the bias estimator and hence its precision if the number of
wafers used is given.
Appendix 2A
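We use the Lagrange multiplier method to obtain the solution to the bias estimation
problem in the LS framework. Taking into account the constraint Ax = 0, the
Lagrangian is
$$ J = \sum_{j=1}^{S} (y(j) - x - Be)^T \Lambda^{-1} (y(j) - x - Be) + \lambda^T A x \qquad (2.32) $$
where λ = [λ1, . . . , λN]^T. Differentiating J with respect to x and equating the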







(x− y(j) +Be) +ATλ (2.33)







(x− y(j) +Be) (2.34)















Substituting Equation (2.36) into Equation (2.35) gives





















































For GT to reduce to a normal distribution, let p = 2 and q → +∞ so that the PDF












2σq1/2 · ∫ +∞
0

































where σ2/2 is the variance.
For GT to reduce to a t distribution, let p = 2 so that the PDF


































where γ(a) is the Gamma function given by $\gamma(a) = \int_0^{+\infty} v^{a-1} e^{-v} \, dv$, 2q is the number
of degrees of freedom and qσ²/(2q − 2) is the variance.
For GT to reduce to a uniform distribution, let p→ +∞ and q → +∞ so that
the PDF









































































−σ ≤ z ≤ σ
0 otherwise
where σ2/3 is the variance.
For GT to reduce to a Laplace distribution, let p = 1 and q → +∞ so that the
PDF
























where 2σ2 is the variance.
Appendix 2C
We again use the Lagrange multiplier method to obtain the solution to the bias
estimation problem in the GT framework. Taking into account the constraint Ax =







ln f(yi(j)− xi − bie) + λTAx
where λ = [λ1, . . . , λN]^T and bi is the ith row vector of B. GT’s distributional
parameters σi, pi and qi are omitted for the sake of brevity. With the substitution







ln f(yi(j)− y¯i) + λTA(y¯−Be)





























= 0 (i = 1, · · · , 2N) (2.43)
Substituting f from Equation (2.10) into Equation (2.43) gives
$$ \sum_{j=1}^{S} \frac{(p_i q_i + 1)\,\mathrm{sgn}(y_i(j) - \bar{y}_i)\,|y_i(j) - \bar{y}_i|^{p_i - 1}}{q_i \sigma_i^{p_i} + |y_i(j) - \bar{y}_i|^{p_i}} = 0 \quad (i = 1, \cdots, 2N) $$
Equation (2.41) leads to the solution for e∗, which is
$$ e^* = (AB)^{-1} A \bar{y} $$
Appendix 2D
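Consider an estimator µ(f) determined from
$$ \int_{-\infty}^{+\infty} \psi(\varepsilon, \mu(f)) f(\varepsilon) \, d\varepsilon = 0 $$
Replacing f by (1 − h)f + hg gives
$$ \int_{-\infty}^{+\infty} \psi(\varepsilon, \mu((1 - h) f + h g)) \, ((1 - h) f(\varepsilon) + h g(\varepsilon)) \, d\varepsilon = 0 $$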















∂ψ(ε, µ((1− h)f + hg))
∂µ




µ((1− h)f + hg) (2.44)
For the estimators used in bias estimation, ψ(ε, µ) is a function of (ε−µ), ie, ψ(ε, µ)













∂ψ(ε− µ((1− h)f + hg))
∂ε




µ((1− h)f + hg) (2.45)
At h = 0, Equation (2.45) gives
d
dh










ψ(ε− µ(f))) f(ε)dε (2.46)





















ψ(ε− µ(f))) f(ε)dε (2.47)






Using Taylor series expansion
µ((1− h)f + hg)|h=1 ≈ µ((1− h)f + hg)|h=0 + d
dh





µ(g) ≈ µ(f) + d
dh




By Equation (2.47), Equation (2.46) becomes
d
dh








By Equation (2.49), Expression (2.48) becomes











where δ(ε(j)) is the probability density of a unit probability mass at point ε(j), ie,
δ(ε(j)) is a unit impulse at ε(j). Substitute g(ε) with fS(ε) in (2.50), and the finite
sample estimator can be approximated by
µ(fS) ≈ µ(f) +
∫ +∞
−∞











and from Equation (2.18)














By the central limit theorem, $\frac{1}{S} \sum_{j=1}^{S} \mathrm{IF}(\varepsilon(j))$ is asymptotically normal. ȳ is there-






In this chapter we verify, with two examples, the set of equations derived in Chapter
2.
The first example is an illustrative case study based on simulation. Bias estimation
is carried out on a single-zone bake-plate model and significantly non-normal
noise is applied to differentiate the performance of GT from that of LS. Simulation
results favor GT, and are very well matched by the results obtained from the equations
derived in Chapter 2.
The second example is an implementation on a real three-zone bake-plate. In
the presence of outliers that are close to the good data, the equations show that
using GT, instead of normal distribution, to characterize process data gives rise to
a more efficient estimator than LS and IQR+LS and therefore enables earlier re-
medial actions against RTD bias to save semiconductor wafers from sensing-related
processing defects. An intuitive explanation is handy: the flexibility of GT allows
a distribution curve with thicker flanks than normal, thereby making itself a bet-
ter description of the experimental data. These theoretical results were verified
experimentally.
An experiment was performed with 25% of the temperature and power fluctu-
ations coming from distributions with standard deviations 3 times as great as the
good data. This is a commonly used distribution to study outliers [40] that are close
to good, normal data such that the outliers cannot be separated easily. This distri-
bution also approximates a t3 distribution, Student’s t with 3 degrees of freedom, for
the estimation problem [27, 41, 42]. Results show that the GT-based estimator with
43 wafer runs achieves the same estimation variance as the IQR+LS estimator with
50 wafer runs. In other words, 7 wafers can be saved from sensing-related processing
defects.
Considering that the cost of manufacturing one semiconductor wafer is of the
order of USD 10,000 [39], process data is expensive to come by and processing defects
are costly. A guided choice of an efficient RTD bias estimator and an appropriate
sample size for estimation is therefore economically important.
The chapter is organized as follows. In Section 3.2, an illustrative, simulation-
based example is presented to show the workings of the equations derived in Chapter
2. In Section 3.3, experimental results obtained on a multi-zone plate are presented
in further verification of the equations, and performance of specific estimators is
discussed. The chapter is concluded in Section 3.4.
3.2 A Simulation Example
In this section, we will work on a simple case with a single-zone bake-plate where
there is only one RTD.
3.2.1 Problem Setup
Consider the thermal model given by Equation (2.4) derived in Section 2.2. In this






In this case we only have one RTD so that the bias vector e introduced in Equation




, x = [100, 100]^T, B = [1, 0]^T











i.e., −1 and +1 are the only values that the random variables can assume, each with
a probability of 1/2. The limited number of noise values enables an investigation
covering all possible combinations of noises weighted by their frequencies of
occurrence.
We use the LS estimator and the GT estimator, respectively, to perform esti-
mation of the sensor bias e, and compute estimate variances to make a comparison
between LS and GT. Further, simulation results will be compared against the ones
predicted by Equations (2.31) and (2.8). In this example where measurements are
uniformly distributed, it is seen from Equations (2.24) and (2.25) that εL = −4 and
εH = 4. Therefore f(ε) = 0 for ε < εL and ε > εH . As a result, all measurements
fall within the acceptance range of the IQR test and will be processed with LS;
IQR+LS will hence behave the same as simple LS.
3.2.2 Results
For a sample size of S = 50, all possible combinations of ε1 or ε2 and hence y1 and
y2 are given by Table 3.6 in Appendix 3A. The probability of each combination is
determined as follows. Consider, for instance, a sample (ε1(1), ε1(2), · · · , ε1(50)) =
(−1, 1, · · · , 1) whose probability is given by $(1/2)^{50}$. The order of measurements within
a sample does not matter, so (−1, 1, 1, · · · , 1), (1, −1, 1, · · · , 1), · · · , (1, 1, 1, · · · , −1)
are considered the same combination, and the probability for this combination is
therefore $(1/2)^{50} \times C^{50}_{1}$. Generally, for the combination where exactly k out of the
50 measurements in a sample result from the noise value of −1, the corresponding
probability is given by $(1/2)^{50} \times C^{50}_{k}$.
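The solutions for ȳ1 and ȳ2 under LS and GT (p1 = p2 = 20, q1 = q2 =
100, σ1 = σ2 = √2, where pi, qi and σi are the parameters for the GT probability
density introduced in Equation (2.10)) are listed in Table 3.6 in Appendix 3A
accordingly. With LS, the solutions for ȳ1 are obtained from Equation (2.7)
$$ \sum_{j=1}^{50} (y_1(j) - \bar{y}_1) = 0 \quad \Rightarrow \quad \bar{y}_1 = \frac{1}{50} \left( 109 \times k + 111 \times (50 - k) \right) $$
With GT, the solutions for ȳ1 are obtained from Equation (2.12)
$$ \sum_{j=1}^{50} \frac{(p_1 q_1 + 1)\,\mathrm{sgn}(y_1(j) - \bar{y}_1)\,|y_1(j) - \bar{y}_1|^{p_1 - 1}}{q_1 \sigma_1^{p_1} + |y_1(j) - \bar{y}_1|^{p_1}} = 0 $$
that is,
$$ \frac{(p_1 q_1 + 1)\,\mathrm{sgn}(109 - \bar{y}_1)\,|109 - \bar{y}_1|^{p_1 - 1}}{q_1 \sigma_1^{p_1} + |109 - \bar{y}_1|^{p_1}} \times k + \frac{(p_1 q_1 + 1)\,\mathrm{sgn}(111 - \bar{y}_1)\,|111 - \bar{y}_1|^{p_1 - 1}}{q_1 \sigma_1^{p_1} + |111 - \bar{y}_1|^{p_1}} \times (50 - k) = 0 $$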
The solutions for y¯2 under LS and GT are obtained similarly.
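The two estimating equations can be solved numerically as in the sketch below, which is illustrative rather than the procedure used to build Table 3.6. It assumes the score function of Equation (2.12) and uses bisection, which is valid here because the GT score is monotone for residuals of magnitude up to 2 with these parameters. The function names are hypothetical.

```python
def psi_gt(r, sigma=2 ** 0.5, p=20, q=100):
    """GT score function: (pq + 1) sgn(r) |r|^(p-1) / (q sigma^p + |r|^p)."""
    sign = (r > 0) - (r < 0)
    return (p * q + 1.0) * sign * abs(r) ** (p - 1) / (q * sigma ** p + abs(r) ** p)

def gt_location(k, n=50, low=109.0, high=111.0):
    """Solve k*psi(low - y) + (n - k)*psi(high - y) = 0 for y by bisection."""
    a, b = low, high
    for _ in range(200):
        mid = 0.5 * (a + b)
        g = k * psi_gt(low - mid) + (n - k) * psi_gt(high - mid)
        if g > 0.0:      # estimate is still too small
            a = mid
        else:
            b = mid
    return 0.5 * (a + b)

def ls_location(k, n=50, low=109.0, high=111.0):
    """LS solution: the plain average of k readings at `low` and n - k at `high`."""
    return (low * k + high * (n - k)) / n

ls_k1 = ls_location(1)   # one reading at 109, forty-nine at 111
gt_k1 = gt_location(1)
```

For k = 1 the LS solution stays close to 111, whereas the GT solution with p = 20 sits near the midpoint of the two noise values, in line with the values 110.96 and 110.102 quoted in the expectation sums that follow.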
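Based on Table 3.6, the expectations with LS are
$$ E(\bar{y}_1) = 111 \times \frac{1}{2^{50}} \times C^{50}_{0} + 110.96 \times \frac{1}{2^{50}} \times C^{50}_{1} + \cdots + 109 \times \frac{1}{2^{50}} \times C^{50}_{50} = 110 $$
$$ E(\bar{y}_2) = 101 \times \frac{1}{2^{50}} \times C^{50}_{0} + 100.96 \times \frac{1}{2^{50}} \times C^{50}_{1} + \cdots + 99 \times \frac{1}{2^{50}} \times C^{50}_{50} = 100 $$
$$ E(e^*) = (AB)^{-1} A \begin{bmatrix} E(\bar{y}_1) \\ E(\bar{y}_2) \end{bmatrix} = 10 = e $$
Similarly with GT, the expectations are
$$ E(\bar{y}_1) = 111 \times \frac{1}{2^{50}} \times C^{50}_{0} + 110.102 \times \frac{1}{2^{50}} \times C^{50}_{1} + \cdots + 109 \times \frac{1}{2^{50}} \times C^{50}_{50} = 110 $$
$$ E(\bar{y}_2) = 101 \times \frac{1}{2^{50}} \times C^{50}_{0} + 100.102 \times \frac{1}{2^{50}} \times C^{50}_{1} + \cdots + 99 \times \frac{1}{2^{50}} \times C^{50}_{50} = 100 $$
$$ E(e^*) = (AB)^{-1} A \begin{bmatrix} E(\bar{y}_1) \\ E(\bar{y}_2) \end{bmatrix} = 10 = e $$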
So both LS and GT are unbiased estimators.
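Based on Table 3.6, the variances with LS are
$$ \mathrm{var}(\bar{y}_1) = (111 - 110)^2 \times \frac{1}{2^{50}} \times C^{50}_{0} + (110.96 - 110)^2 \times \frac{1}{2^{50}} \times C^{50}_{1} + \cdots + (109 - 110)^2 \times \frac{1}{2^{50}} \times C^{50}_{50} = 0.02 $$
$$ \mathrm{var}(\bar{y}_2) = (101 - 100)^2 \times \frac{1}{2^{50}} \times C^{50}_{0} + (100.96 - 100)^2 \times \frac{1}{2^{50}} \times C^{50}_{1} + \cdots + (99 - 100)^2 \times \frac{1}{2^{50}} \times C^{50}_{50} = 0.02 $$
$$ \Phi = \mathrm{var}(e^*) = (AB)^{-1} A \begin{bmatrix} \mathrm{var}(\bar{y}_1) & 0 \\ 0 & \mathrm{var}(\bar{y}_2) \end{bmatrix} A^T (B^T A^T)^{-1} = 0.04 \qquad (3.1) $$
Similarly with GT, the variances are
$$ \mathrm{var}(\bar{y}_1) = (111 - 110)^2 \times \frac{1}{2^{50}} \times C^{50}_{0} + (110.102 - 110)^2 \times \frac{1}{2^{50}} \times C^{50}_{1} + \cdots + (109 - 110)^2 \times \frac{1}{2^{50}} \times C^{50}_{50} = 5.62 \times 10^{-5} $$
$$ \mathrm{var}(\bar{y}_2) = (101 - 100)^2 \times \frac{1}{2^{50}} \times C^{50}_{0} + (100.102 - 100)^2 \times \frac{1}{2^{50}} \times C^{50}_{1} + \cdots + (99 - 100)^2 \times \frac{1}{2^{50}} \times C^{50}_{50} = 5.62 \times 10^{-5} $$
$$ \Phi = \mathrm{var}(e^*) = (AB)^{-1} A \begin{bmatrix} \mathrm{var}(\bar{y}_1) & 0 \\ 0 & \mathrm{var}(\bar{y}_2) \end{bmatrix} A^T (B^T A^T)^{-1} = 1.12 \times 10^{-4} \qquad (3.2) $$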
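The binomial-weighted sums above can be reproduced by direct enumeration. The sketch below works with the noise values −1 and +1 rather than the raw readings (the variance is unchanged by the offset) and assumes, as the text's own Φ = var(ȳ1) + var(ȳ2) indicates, that the bias variance is the sum of the two channel variances. Function names are illustrative.

```python
from math import comb

SIGMA, P, Q, S = 2 ** 0.5, 20, 100, 50

def psi_gt(r):
    """GT score function for the parameters used in this example."""
    sign = (r > 0) - (r < 0)
    return (P * Q + 1.0) * sign * abs(r) ** (P - 1) / (Q * SIGMA ** P + abs(r) ** P)

def gt_location(k):
    """GT location estimate when k of the S noise values are -1 and the rest are +1."""
    lo, hi = -1.0, 1.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        g = k * psi_gt(-1.0 - mid) + (S - k) * psi_gt(1.0 - mid)
        if g > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def ls_location(k):
    """LS location estimate: the sample mean of the noise values."""
    return (-1.0 * k + 1.0 * (S - k)) / S

def moments(location):
    """Mean and variance over all combinations, weighted by C(S, k) / 2^S."""
    weights = [comb(S, k) / 2 ** S for k in range(S + 1)]
    values = [location(k) for k in range(S + 1)]
    mean = sum(w * v for w, v in zip(weights, values))
    var = sum(w * (v - mean) ** 2 for w, v in zip(weights, values))
    return mean, var

mean_ls, var_ls = moments(ls_location)
mean_gt, var_gt = moments(gt_location)
phi_ls = 2.0 * var_ls   # var(y1_bar) + var(y2_bar), both channels share the noise model
phi_gt = 2.0 * var_gt
```

Both means are zero by symmetry, so neither estimator is biased, and the ratio `phi_ls / phi_gt` gives the factor by which the LS sample would have to grow to match GT.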
These variances calculated based on Table 3.6 can be predicted by the equations
obtained in Chapter 2. With LS, note that the IF is given by Equation (2.22). From
(2.31)















where IF(−1) = −1 and IF(1) = 1. From Equation (2.8)
Φ = var(e∗) = var(y¯1) + var(y¯2) ≈ 0.04
which coincides with var(e∗) computed from Equation (3.1).
With GT, note that the IF is given by Equation (2.30). From (2.31)















where IF(−1) = −5.26 × 10^−2 and IF(1) = 5.26 × 10^−2. From Equation (2.8)
Φ = var(e∗) = var(ȳ1) + var(ȳ2) ≈ 1.11 × 10^−4
which closely approximates var(e∗) computed from Equation (3.2).
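The influence-function prediction can be sketched in a few lines. The code assumes the usual M-estimator result, IF(z) = ψ(z − µ) / E[ψ′(ε − µ)] with µ = 0 for this symmetric noise, and an asymptotic variance of E[IF²(ε)] / S; these are taken here as the content of Equations (2.17) and (2.31), which is an assumption about their exact form.

```python
SIGMA, P, Q, S = 2 ** 0.5, 20, 100, 50
NOISE = (-1.0, 1.0)   # the two equally likely noise values

def psi_ls(r):
    return r

def dpsi_ls(r):
    return 1.0

def psi_gt(r):
    sign = (r > 0) - (r < 0)
    return (P * Q + 1.0) * sign * abs(r) ** (P - 1) / (Q * SIGMA ** P + abs(r) ** P)

def dpsi_gt(r):
    """Derivative of psi_gt with respect to r (an even function of r)."""
    d, a = Q * SIGMA ** P, abs(r)
    return (P * Q + 1.0) * a ** (P - 2) * ((P - 1) * d - a ** P) / (d + a ** P) ** 2

def influence(psi, dpsi):
    """IF(z) = psi(z) / E[psi'(eps)] under the discrete noise distribution."""
    denom = sum(dpsi(e) for e in NOISE) / len(NOISE)
    return lambda z: psi(z) / denom

def predicted_variance(psi, dpsi):
    """Asymptotic variance of the location estimate: E[IF^2(eps)] / S."""
    inf = influence(psi, dpsi)
    return sum(inf(e) ** 2 for e in NOISE) / len(NOISE) / S

if_ls = influence(psi_ls, dpsi_ls)
if_gt = influence(psi_gt, dpsi_gt)
phi_ls = 2.0 * predicted_variance(psi_ls, dpsi_ls)   # var(y1_bar) + var(y2_bar)
phi_gt = 2.0 * predicted_variance(psi_gt, dpsi_gt)
```

Under these assumptions the sketch returns IF(±1) = ±1 for LS and roughly ±0.0526 for GT, and the two values of Φ agree with the predictions quoted above.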
























Figure 3.1: GT distribution versus normal distribution in characterizing a
non-normal distribution. GT’s distributional parameters are σ = √2, p = 20,
q = 100.
It is noted that (2.31) gives good approximations to var(y¯1), var(y¯2) and hence
to Φ. In using (2.31), one always encounters the need for the specific IF. The IF
is generally given by Equation (2.17), and in the specific cases of LS and GT, it
reduces to Equations (2.22) and (2.30), respectively.
It is also noted that in this case considered by Section 3.2.2 where noise is signif-
icantly non-normal (by being discrete uniform, specifically), the GT-based estimate
has considerably smaller variance (on the order of 1/300th) than the LS estimate.
This means that to achieve the same estimation variance as GT, LS requires samples
about 300 times as big. By Figure 3.1, this observation can intuitively be explained
as a result of GT with p = 20 and q = 100 being a much better description of
the noise distribution and therefore making an estimator closer to the maximum-
likelihood estimator. This advantage results from GT being a flexible parametric
distribution model that can reduce to a wide range of simpler distributions.
3.3 Experimental Verification of Theoretical Results
3.3.1 Experiment Setup
Having dealt with a conceptual example based on simulation, we now demonstrate
measurement bias estimation implemented on a real multi-zone thermal system.
Figure 3.2: The bake-plate used in the experiment.
The bake-plate shown in Figure 3.2 was used in the experiment. For simplicity, it
was configured into three zones, numbered, from center to edge, 1, 2, and 3 respec-
tively. Each of the three zones had an independent heater driven by an independent
proportional-integral controller. Each zone had an RTD temperature sensor em-
bedded within. These RTD’s measured temperature of their respective zones at a
sampling interval of 0.5 second. From the identification experiment, the steady-state
model was given by
$$ A = \begin{bmatrix} -0.684 & 0.584 & 0 & 1 & 0 & 0 \\ 0.593 & -4.957 & 4.065 & 0 & 1 & 0 \\ 0 & 3.953 & -5.879 & 0 & 0 & 1 \end{bmatrix} $$

















We made a step change in input u1(1) and measured the steady-state θ1(1), θ2(1),














θ1(1) θ2(1) 0 · · · 0
0 0 θ1(1) θ2(1) θ3(1) 0 0
0 · · · 0 θ2(1) θ3(1)
θ1(2) θ2(2) 0 · · · 0
0 0 θ1(2) θ2(2) θ3(2) 0 0
0 · · · 0 θ2(2) θ3(2)
θ1(3) θ2(3) 0 · · · 0
0 0 θ1(3) θ2(3) θ3(3) 0 0
0 · · · 0 θ2(3) θ3(3)

For each wafer run, a wafer at room temperature was placed on the bake-plate
held at the setpoint of 90 °C. In response to the cold wafer, the temperature of the
plate dropped and then recovered because of the proportional-integral controller.
Figure 3.3 shows the raw measurements of temperature and heater power obtained
from the baking of one wafer for 120 seconds. The thermal system in the last 15
seconds was at steady state and the measurements were averaged to give the steady-
state temperatures and powers for the wafer run. Zones 1, 2, and 3 were calibrated
to bear a bias of +1 °C, +1 °C, and −1 °C, respectively. Data for 500 wafer runs were
collected and gave the variance matrix
Λ = diag(1.240 × 10^−4, 2.210 × 10^−4, 5.061 × 10^−4, 3.275 × 10^−2,
3.361 × 10^−1, 7.970 × 10^−1)




































Figure 3.3: Raw data from one run (baking process of one wafer).
Another 5000 wafer runs were collected, and bias estimation was performed on the
data batch by batch. One batch (sample) consisted of 50 wafer runs.
The GT distributional parameters σi, pi and qi were chosen as
σ1 = 1.57 × 10^−2, σ2 = 2.10 × 10^−2, σ3 = 3.18 × 10^−2, σ4 = 2.56 × 10^−1,
σ5 = 8.20 × 10^−1, σ6 = 1.26, pi = 2, qi = 2.4 (i = 1, · · · , 6)
It is shown in Appendix 2B that the GT probability density f(z; σ, p, q) with p = 2, q → ∞
gives the probability density of the normal distribution N(0, σ²/2). We therefore
set p = 2, σi = √(2Λi) to make GT’s distribution shape close to normal in order that
GT’s performance be close to LS’s in the case of normal data. Meanwhile, we chose
qi = 2.4, because assigning a finite value to qi would thicken both flanks of GT’s
distribution to take account of potential outliers. Also the choice qi = 2.4 would lead
to the same estimation variance by GT as by IQR+LS when there was no outlier
contamination. Notice that the variances of GT and IQR+LS listed in Table 3.1
are approximately equal. With GT and IQR+LS, this was a fair starting baseline
for us to proceed from the ideal normal case to the case with outlier contamination.

Table 3.1: Theoretical and experimental results for 5000 wafer runs divided
into batches of 50, with bias estimation performed batch by batch.

Average difference between estimated & true bias (e∗ − e)
Estimator     RTD 1      RTD 2      RTD 3
Simple LS     0.0040    −0.0012     0.0015
IQR+LS        0.0054     0.0010     0.0036
GT            0.0065     0.0008     0.0031

Variance of estimated bias (e∗) (×10^−2)
              RTD 1            RTD 2            RTD 3
Estimator     Theory  Expt     Theory  Expt     Theory  Expt
Simple LS     0.589   0.580    0.503   0.523    0.421   0.435
IQR+LS        0.629   0.605    0.538   0.549    0.450   0.465
GT            0.628   0.595    0.536   0.529    0.449   0.463
Table 3.2: Theoretical and experimental results with randomly selected 25%
of the measurements amplified by 3 times.

Average difference between estimated & true bias (e∗ − e)
Estimator (wafers/batch)    RTD 1      RTD 2      RTD 3
Simple LS (50)             −0.0002    −0.0045    −0.0005
IQR+LS (50)                 0.0035    −0.0030     0.0016
GT (50)                     0.0048    −0.0013     0.0023
GT (43)                     0.0038    −0.0016     0.0040

Variance of estimated bias (e∗) (×10^−2)
                            RTD 1            RTD 2            RTD 3
Estimator (wafers/batch)    Theory  Expt     Theory  Expt     Theory  Expt
Simple LS (50)              1.768   1.770    1.510   1.482    1.264   1.324
IQR+LS (50)                 1.194   1.136    1.020   0.993    0.853   0.876
GT (50)                     1.031   0.982    0.880   0.860    0.737   0.748
GT (43)                     1.199   1.128    1.024   0.980    0.857   0.860

Table 3.3: Theoretical and experimental results with randomly selected 10%
of the measurements amplified by 10 times.

Average difference between estimated & true bias (e∗ − e)
Estimator     RTD 1      RTD 2      RTD 3
Simple LS    −0.0042    −0.0015    −0.0035
IQR+LS        0.0025    −0.0032     0.0006
GT            0.0011    −0.0009     0.0013

Variance of estimated bias (e∗) (×10^−2)
              RTD 1            RTD 2            RTD 3
Estimator     Theory  Expt     Theory  Expt     Theory  Expt
Simple LS     6.425   6.600    5.487   5.389    4.592   4.402
IQR+LS        0.727   0.711    0.621   0.595    0.520   0.497
GT            0.765   0.749    0.653   0.620    0.547   0.530
3.3.2 Results
For each estimator, the mean, variance and sample size served as three important
performance indicators. While the mean checked whether the estimator was biased,
the variance and sample size quantified precision and efficiency respectively. The
variances of bias estimates by the three estimators were calculated using Equations
(2.31) and (2.8), and the results are shown in Table 3.1.
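For the simple LS estimator with normal data, the calculation behind the "Theory" column of Table 3.1 can be sketched as follows. The code takes Ω = Λ/S (the variance of a sample mean over S wafer runs, which is an assumption about how Equation (2.31) specializes to LS) and evaluates the diagonal of Φ from Equation (2.8) with the identified matrix A. The small linear solver and function names are illustrative only.

```python
def gauss_jordan_solve(m, rhs):
    """Solve m @ x = rhs (m is n x n, rhs is n x k) using partial pivoting."""
    n = len(m)
    aug = [list(m[i]) + list(rhs[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        piv = aug[col][col]
        aug[col] = [v / piv for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

# Steady-state model and noise variances quoted in Section 3.3.1.
A = [[-0.684, 0.584, 0.0, 1.0, 0.0, 0.0],
     [0.593, -4.957, 4.065, 0.0, 1.0, 0.0],
     [0.0, 3.953, -5.879, 0.0, 0.0, 1.0]]
LAMBDA = [1.240e-4, 2.210e-4, 5.061e-4, 3.275e-2, 3.361e-1, 7.970e-1]
N_ZONES, S = 3, 50

def bias_variance(a, omega_diag, n_zones):
    """Diagonal of Phi = (AB)^-1 A Omega A^T (B^T A^T)^-1 for diagonal Omega."""
    ab = [row[:n_zones] for row in a]      # A @ B selects the first N columns of A
    g = gauss_jordan_solve(ab, a)          # G = (AB)^-1 A, an N x 2N matrix
    return [sum(g[i][j] ** 2 * omega_diag[j] for j in range(len(omega_diag)))
            for i in range(n_zones)]

phi_ls = bias_variance(A, [lam / S for lam in LAMBDA], N_ZONES)
```

The same `bias_variance` call applies to the other estimators once their Ω has been obtained from the influence function; only the `omega_diag` argument changes.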
To investigate the behavior of the estimators in the presence of outliers, we
introduced outliers into the off-line data as follows. Take for example temperature
measurements for zone 1. First, the mean value of y1 was computed. Next, 25% ran-
domly selected y1 had their values changed such that their distances from the mean
value became 3 times as great. The same thing was done for other measurements.
Thus an expected 25% of the data points came from distributions with greater vari-
ances than the rest. They were probable outliers close to the good, normal data
[40]. Bias estimation was performed and the results are shown in Table 3.2.
Table 3.3 is similar to Table 3.2 except that 10% instead of 25% of the data
had their distance from the mean value changed to 10 times as great, instead of 3
times. Thus an expected 10% of the data points came from distributions with much
greater variances than the rest. They were probable outliers far away from the good,
normal data. Bias estimation was performed and the results are shown in Table 3.3.
3.3.3 Sample Calculation
In this section we give a sample calculation for the IQR+LS and GT estimators’
variance entries in Table 3.2.





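$$ \Omega_1 = \mathrm{var}(\bar{y}_1) \approx \frac{1}{S} \int_{-\infty}^{+\infty} \mathrm{IF}^2(\varepsilon) f(\varepsilon) \, d\varepsilon = 5.023 \times 10^{-6} $$
where from Equation (2.27)
$$ \mathrm{IF}(\varepsilon) = \begin{cases} 1.243\,\varepsilon & \text{if } -0.0370 \le \varepsilon \le 0.0370 \\ 0 & \text{otherwise} \end{cases} $$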
Since 75% of the observations come from a normal distribution characterized by
the variance matrix Λ and 25% of the observations come from a normal distribution






















The IQR test’s rejection points of εL = −0.0370 and εH = 0.0370 are obtained
under f(ε). The other elements of Ω can be found likewise, giving
Ω ≈ diag(5.023× 10−6, 8.953× 10−6, 2.050× 10−5, 1.327× 10−3,









The diagonal elements are listed in Table 3.2.
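The first element of Ω for IQR+LS can be reproduced numerically as sketched below. The noise model is the mixture described above (75% of observations with variance Λ1, 25% with standard deviation three times as large). The slope of the influence function is taken as 1 / (P_accept − εH f(εH) + εL f(εL)), where the boundary terms account for the jump of ψ at the rejection points; this expression is an assumption about the form of Equation (2.27), and the variance is taken as E[IF²] / S.

```python
import math

SD = math.sqrt(1.240e-4)   # zone-1 temperature noise std, from Lambda
S = 50                     # wafer runs per batch

def norm_cdf(x, sd):
    return 0.5 * (1.0 + math.erf(x / (sd * math.sqrt(2.0))))

def norm_pdf(x, sd):
    return math.exp(-0.5 * (x / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def mix_cdf(x):
    return 0.75 * norm_cdf(x, SD) + 0.25 * norm_cdf(x, 3.0 * SD)

def mix_pdf(x):
    return 0.75 * norm_pdf(x, SD) + 0.25 * norm_pdf(x, 3.0 * SD)

def quantile(prob, lo=-1.0, hi=1.0):
    """Quantile of the mixture by bisection on its CDF."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mix_cdf(mid) < prob:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def simpson(func, a, b, n=2000):
    """Composite Simpson rule with an even number n of intervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0

q1, q3 = quantile(0.25), quantile(0.75)
eps_l = q1 - 1.5 * (q3 - q1)     # Equation (2.24)
eps_h = q3 + 1.5 * (q3 - q1)     # Equation (2.25)

accept_prob = mix_cdf(eps_h) - mix_cdf(eps_l)
slope = 1.0 / (accept_prob - eps_h * mix_pdf(eps_h) + eps_l * mix_pdf(eps_l))
second_moment = simpson(lambda e: e * e * mix_pdf(e), eps_l, eps_h)
omega_1 = slope ** 2 * second_moment / S
```

Under these assumptions the sketch gives rejection points near ±0.0370, a slope near 1.243 and Ω1 near 5.02 × 10^−6, matching the figures quoted in this section.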





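$$ \Omega_1 = \mathrm{var}(\bar{y}_1) \approx \frac{1}{S} \int_{-\infty}^{+\infty} \mathrm{IF}^2(\varepsilon) f(\varepsilon) \, d\varepsilon = 4.334 \times 10^{-6} $$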











The other elements of Ω can be found likewise, giving
Ω ≈ diag(4.334× 10−6, 7.726× 10−6, 1.769× 10−5, 1.146× 10−3,










The diagonal elements are listed in Table 3.2.
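The corresponding GT entry can be sketched in the same way. The code uses the p = 2 case of the score function in Equation (2.29) with σ1 = 1.57 × 10^−2 and q = 2.4, the same contaminated-normal noise model as for IQR+LS, and the assumed relations IF(z) = ψ(z) / E[ψ′(ε)] and Ω1 = E[IF²] / S. Integration limits of ±0.4 (about twelve standard deviations of the wider component) are an arbitrary but safe choice.

```python
import math

LAMBDA_1 = 1.240e-4               # zone-1 temperature noise variance
SIGMA, P, Q = 1.57e-2, 2.0, 2.4   # GT parameters chosen for zone 1
S = 50                            # wafer runs per batch

def norm_pdf(x, var):
    return math.exp(-x * x / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def mix_pdf(x):
    """75% good data, 25% data with three times the standard deviation."""
    return 0.75 * norm_pdf(x, LAMBDA_1) + 0.25 * norm_pdf(x, 9.0 * LAMBDA_1)

def psi(r):
    """GT score function for p = 2."""
    return (P * Q + 1.0) * r / (Q * SIGMA ** 2 + r * r)

def dpsi(r):
    """Derivative of psi with respect to r."""
    d = Q * SIGMA ** 2
    return (P * Q + 1.0) * (d - r * r) / (d + r * r) ** 2

def simpson(func, a, b, n=20000):
    """Composite Simpson rule with an even number n of intervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0

LIMIT = 0.4
mean_psi_sq = simpson(lambda e: psi(e) ** 2 * mix_pdf(e), -LIMIT, LIMIT)
mean_dpsi = simpson(lambda e: dpsi(e) * mix_pdf(e), -LIMIT, LIMIT)
omega_1 = mean_psi_sq / mean_dpsi ** 2 / S
```

The result is close to the 4.334 × 10^−6 quoted above and lower than the IQR+LS value, which is the source of the wafer saving discussed in Section 3.3.4.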
3.3.4 Discussion
In Table 3.1, the variances of the LS estimator are the lowest. For the sake of the
experiment, we baked for 2 minutes so that the last 15 seconds of measurements
could be averaged to give a reading and the steady-state reading could be assumed
to follow a normal distribution closely. In practice, depending on the product and
recipe, the baking time may be shorter, for instance 1.5 minutes, when the temper-
ature has just reached steady state and so there is no time for proper averaging.
Conventional data reconciliation formulations assume that process data follow a
normal (Gaussian) distribution. However, the presence of a single huge outlier can spoil
the statistical analysis completely. Hence it is not wise to use the LS algorithm
without any built-in check [25, 27]. For the GT-based bias estimator, we propose the
simple tuning method of making GT close to the normal distribution while limiting
q (e.g. q = 2.4) to thicken GT’s distribution flanks and make the resultant estimator
robust to potential outliers. As can be noted from Figure 3.4 where the histogram






















Figure 3.4: GT distribution versus normal distribution in characterizing a
non-normal distribution. GT’s distributional parameters are σ = 1.57 × 10^−2,
p = 2, q = 2.4.
of experimental data with 25% outlier contamination is superimposed onto the GT
and normal curves, GT’s PDF allows for thicker flanks than normal to take account
of outliers while being close to normal otherwise.
In Table 3.2, the variances of the GT-based estimator with 50 wafer runs are the
lowest. The reduction in estimation variance by GT translates to fewer wafers
needed to achieve a specified estimation precision, and hence to wafer savings. The
table shows that the GT-based estimator with 43 wafer runs achieved the same
variances as the IQR+LS estimator with 50 wafer runs. In other words, if the IQR+LS
estimator requires data from 50 wafer runs to give a sufficiently precise bias
estimate, the GT-based estimator requires 7 fewer wafers and allows earlier removal
of RTD bias, saving those 7 wafers from sensing-related thermal processing defects.
Although IQR+LS, by rejecting some outliers, achieved estimation variances
substantially lower than simple LS, its estimation variances were not as low as
GT's. This can be understood by considering a simple case. Let both the IQR+LS
estimate and the GT estimate be 0, and let the number of accepted data points after
the IQR test be $S'$. Now move one data point left from $y_H^+$ to $y_H^-$, i.e.,
from just above to just below the upper rejection point. This data point, which was
previously rejected, is now accepted by the IQR test and the resultant LS estimate
changes from 0 to $y_H^-/(S'+1)$. The GT estimate will not change appreciably
because $y_H^- \approx y_H^+$. Hence if there is a large probability of having data
around the rejection points, the binary decision mode of the IQR test increases the
variance of the estimate appreciably.
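The wafer saving quoted above can be checked with a short calculation. The sketch
below assumes that the estimation variance scales as 1/S with the number of wafer
runs S, which is how asymptotic variance expressions of this kind usually behave but
is not restated in this section. It also assumes that the first-element values
5.023 × 10^{-6} and 4.334 × 10^{-6} of Ω computed earlier belong to IQR+LS and GT
respectively. The function name is illustrative only.

```python
def equivalent_runs(omega_ref: float, omega_new: float, runs_ref: int) -> float:
    """Runs the new estimator needs to match the reference estimator's variance.

    Assumes variance = omega / S, so equal variance means
    S_new = S_ref * omega_new / omega_ref.
    """
    return runs_ref * omega_new / omega_ref


omega_iqr_ls = 5.023e-6  # first diagonal element of Omega, taken to be IQR+LS
omega_gt = 4.334e-6      # first diagonal element of Omega, taken to be GT

runs_gt = equivalent_runs(omega_iqr_ls, omega_gt, runs_ref=50)
print(f"variance ratio GT/IQR+LS: {omega_gt / omega_iqr_ls:.2f}")
print(f"equivalent GT wafer runs: {runs_gt:.1f}")
```

Under these assumptions the ratio comes to roughly 0.86 and the equivalent run count
to roughly 43, in line with the figures quoted in the text.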
In Table 3.3, where outliers were clearly far away from the good data, IQR+LS
was effective in rejecting them as the variances of IQR+LS were the lowest. However,
this problem would be easy to deal with as the outliers were so obviously separated
from the good data that they could be easily identified, sometimes even by visual
inspection.
Tables 3.4 and 3.5 show the intermediate cases. They are similar to Table 3.2,
except that instead of three times, the data had their distances from the mean
increased by two times and six times, respectively. Consider the bias estimation
variance for RTD 1 (Theory) in Table 3.1 through to Table 3.5. The variance ratios
GT/IQR+LS indicate the performance of GT versus that of IQR+LS and they are
1 (0.628/0.629), 0.91 (0.896/0.988), 0.86, 0.91 and 1.05 for outliers whose standard
deviations are 1 (Table 3.1), 2 (Table 3.4), 3 (Table 3.2), 6 (Table 3.5), and 10
(Table 3.3) times the standard deviations of good data, respectively. In summary,
when no outlier was introduced (Table 3.1), GT and IQR+LS had similar performance.
IQR+LS was better than GT when outliers were very far off the sample center
(Table 3.3). For the three cases (Tables 3.2, 3.4 and 3.5) where the outliers were
close to the good data, GT was better.

Table 3.4: Theoretical and Experimental Results with randomly selected 25%
of the measurements amplified by 2 times.

Average difference between estimated & true bias (e∗ − e)
Estimator    RTD 1      RTD 2      RTD 3
Simple LS    −0.0039    −0.0098    −0.0009
IQR+LS        0.0045    −0.0031     0.0009
GT            0.0081    −0.0005     0.0016

Variance of estimated bias (e∗) (×10^{-2})
             RTD 1             RTD 2             RTD 3
Estimator    Theory   Expt     Theory   Expt     Theory   Expt
Simple LS    1.032    0.960    0.881    0.916    0.737    0.811
IQR+LS       0.988    0.929    0.844    0.880    0.706    0.730
GT           0.896    0.910    0.765    0.745    0.640    0.691
Table 3.5: Theoretical and Experimental Results with randomly selected 25%
of the measurements amplified by 6 times.

Average difference between estimated & true bias (e∗ − e)
Estimator    RTD 1      RTD 2      RTD 3
Simple LS     0.0047     0.0034     0.0018
IQR+LS        0.0068    −0.0011     0.0085
GT            0.0042    −0.0029     0.0006

Variance of estimated bias (e∗) (×10^{-2})
             RTD 1             RTD 2             RTD 3
Estimator    Theory   Expt     Theory   Expt     Theory   Expt
Simple LS    5.747    6.011    4.908    5.118    4.108    4.190
IQR+LS       1.228    1.356    1.048    1.156    0.877    0.966
GT           1.112    1.159    0.950    1.004    0.795    0.872
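The GT/IQR+LS variance ratios discussed for Tables 3.4 and 3.5 are simple quotients
of the theoretical RTD 1 variances. As a small worked check, the sketch below
recomputes them from the tabulated values. The dictionary layout and function name
are illustrative, not part of the original analysis.

```python
# Theoretical RTD 1 variances (x 1e-2) copied from Tables 3.4 and 3.5.
theory_rtd1 = {
    "Table 3.4 (x2 outliers)": {"IQR+LS": 0.988, "GT": 0.896},
    "Table 3.5 (x6 outliers)": {"IQR+LS": 1.228, "GT": 1.112},
}


def variance_ratio(entry: dict) -> float:
    """Return the GT / IQR+LS variance ratio; below 1 means GT is more precise."""
    return entry["GT"] / entry["IQR+LS"]


ratios = {name: variance_ratio(entry) for name, entry in theory_rtd1.items()}
for name, ratio in ratios.items():
    print(f"{name}: GT/IQR+LS = {ratio:.2f}")
```

Both ratios round to 0.91, matching the values quoted for the two intermediate cases.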
Finally, by the negligible differences between the estimated and true bias in
Table 3.1 through to Table 3.5, the estimates may be considered accurate.
Figure 3.5 gives the plot of the IF of LS, GT and IQR+LS related to Table 3.2.
It is clear from (2.31) that variance varies with IF. The IF of LS is proportional to the
distance between the observation and the origin. A distant observation, therefore,
has a large contribution to the variance. The IF of GT is continuous and decreases
to zero as an observation tends to infinity. This means a distant outlier has zero
influence on the estimate and explains the robustness exhibited by this estimator
against outliers. The same thing can be said of IQR+LS whose IF jumps to zero if
the outlier lies beyond rejection points.
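The qualitative shapes described above can be reproduced numerically. The sketch
below evaluates, up to a constant scale factor, the score function that drives each
influence function, so only the shapes are comparable and not the magnitudes. For GT
it assumes the commonly used generalized T density, proportional to
(1 + |ε|^p/(qσ^p))^{-(q+1/p)}, which may differ in normalization from the definition
in Chapter 2. The rejection points ±0.0370 and the GT parameters are taken from this
chapter, and all function names are illustrative.

```python
def psi_ls(eps: float) -> float:
    """LS score: grows linearly with the residual, so it is unbounded."""
    return eps


def psi_iqr_ls(eps: float, low: float = -0.0370, high: float = 0.0370) -> float:
    """IQR+LS score: linear inside the rejection points, zero outside."""
    return eps if low <= eps <= high else 0.0


def psi_gt(eps: float, sigma: float = 1.57e-2, p: float = 2.0, q: float = 2.4) -> float:
    """GT score -d/d(eps) log f(eps) for the assumed GT density.

    It is continuous and decays towards zero for large |eps|.
    """
    if eps == 0.0:
        return 0.0
    sign = 1.0 if eps > 0 else -1.0
    magnitude = abs(eps)
    return (p * q + 1.0) * sign * magnitude ** (p - 1.0) / (q * sigma ** p + magnitude ** p)


for eps in (0.01, 0.03, 0.05, 0.5, 5.0):
    print(f"eps={eps:<5} LS={psi_ls(eps):.3f} "
          f"IQR+LS={psi_iqr_ls(eps):.3f} GT={psi_gt(eps):.3f}")
```

The LS column keeps growing, the IQR+LS column drops abruptly to zero past the
rejection point, and the GT column rises and then decays smoothly.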
Because each data reconciliation exercise only requires a limited amount of data
(e.g. a bias estimate is obtained from every 50 data points), a recursive algorithm
is not necessary.
Figure 3.5: Influence function of various estimators. GT's distributional
parameters are σ = 1.57 × 10^{-2}, p = 2, q = 2.4.

3.4 Conclusion
In this chapter we used two examples to show how the equations derived in Chapter 2
are applied. The equations were verified to predict bias estimation variances well.
Of particular interest was the difficult problem where measurement outliers were
close to good data such that they could not be separated easily. For this problem,
the proposed GT-based estimator showed better performance than simple LS and IQR+LS
and, in the presence of RTD bias, would save 7 wafers out of every 50 from
sensing-related processing defects.
Appendix 3A
Simulation data for the illustrative example in Section 3.2 is listed in Table 3.6.





where exactly k measurements in a sample result from ε1 = −1 (i.e., y1 = 109) and
the remaining (50 − k) measurements result from ε1 = 1 (i.e., y1 = 111). Likewise for

(1/2)^50 × C^50_k, where
exactly k measurements in a sample result from ε2 = −1 (i.e., y2 = 99) and the
remaining (50 − k) measurements result from ε2 = 1 (i.e., y2 = 101).
Table 3.6: Simulation data for the illustrative example in Section 3.2.

k     50 − k   Probability            ȳ1                   ȳ2
⋮
8     42       (1/2)^50 × C^50_8      110.68   110.044     100.68   100.044
⋮
26    24       (1/2)^50 × C^50_26     109.96   109.999     99.96    99.999
⋮
44    6        (1/2)^50 × C^50_44     109.24   109.948     99.24    99.948
⋮
50    0        (1/2)^50 × C^50_50     109.00   109.000     99.00    99.000
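The surviving entries of Table 3.6 can be cross-checked with a few lines of code. The
sketch below assumes each measurement takes its two values with equal probability
1/2, which is what a probability of the form (·)^50 × C^50_k implies. It reproduces
only the sample probability and the plain sample mean of each row. The second value
listed under each ȳ heading is not reproduced because its definition is not
available here. Function names are illustrative.

```python
from math import comb


def sample_probability(k: int, n: int = 50) -> float:
    """Probability that exactly k of n equiprobable two-valued draws are low."""
    return comb(n, k) * 0.5 ** n


def sample_mean(k: int, low: float, high: float, n: int = 50) -> float:
    """Mean of a sample with k measurements at `low` and n - k at `high`."""
    return (k * low + (n - k) * high) / n


for k in (8, 26, 44, 50):
    print(f"k={k:2d}  P={sample_probability(k):.3e}  "
          f"mean y1={sample_mean(k, 109, 111):.2f}  "
          f"mean y2={sample_mean(k, 99, 101):.2f}")
```

For k = 8, 26, 44 and 50 the computed means agree with the first ȳ1 and ȳ2 values in
the corresponding rows.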
Chapter 4




Thermal processing of semiconductor wafers is commonly performed by placement
of the wafer on a heated plate for a given period of time. The heated plate is of
large thermal mass relative to the wafer and is held at a constant temperature by a
feedback controller that adjusts the resistive heater power in response to measure-
ments of the plate temperature. The plate is designed with multiple radial zone
configurations. The wafer may be placed in direct contact or on proximity pins.
A general requirement for the thermal system is the ability to reject the load dis-
turbance induced by placement of a cold wafer on the bake-plate. Figure 4.1 shows
the closed-loop (control algorithm to follow in Section 4.3) temperature response of
a bake-plate used for photoresist processing when a 200 mm wafer at room tem-
perature was placed on the bake-plate. Initially the temperature dropped and then
recovered because of closed-loop control. In manufacturing, wafers are processed
in quick succession, one after another. Sluggish response will adversely affect, for
example, the repeatability of the manufacturing process if the recovery time of the
temperature disturbance is longer than the baking time of the wafer and the next
wafer comes before the temperature fully recovers. When this happens, there is
not only wafer-to-wafer non-repeatability in temperature processing trajectory, but
also plate-to-plate non-repeatability as the feedback controller generally does not
respond the same. If the processing temperature is not critical, then this type of
response is acceptable. However, for some processes such as PEB for chemically
amplified photoresist processing, temperature control is very critical [31, 32].
If a wafer is perfectly flat and the time at which the wafer arrives is known in
advance, a standard feedforward controller (control algorithm to follow in Section
4.3) can be designed to eliminate the temperature disturbance as shown in Figure 4.2.
In practice, however, wafer conditions differ. For example, a wafer can warp up to
100µm [33] as shown in Figure 4.3. In this case, feedforward control will not be able
to eliminate the temperature disturbance completely and feedback control will still
be necessary.
Figure 4.1: Closed-loop temperature responses under MMPC and SMPC
when a flat wafer was placed on the bake-plate.
Figure 4.2: Closed-loop temperature responses under MMPC and SMPC,












Figure 4.3: Schematics of (a) a flat wafer and (b) a warped wafer on the
bake-plate.
Work on bake-plate temperature control can be found in [34, 35, 36]. In [35],
PI was used as the feedback controller, while the more sophisticated MPC and
LQG controllers were used in [34] and [36] respectively. MPC operates by solving
a constrained optimization problem online, in real-time, in order to decide how to
update the control inputs (manipulated variables) at the next update instant. This
results in demanding online computational load and can be a limiting factor when
MPC is applied to complex systems with many inputs (such as the 49-zone bake-
plate in [36]) or implemented on embedded systems where computational resources
are limited.
In a recent work, a variant of MPC called Multiplexed MPC, or MMPC, was
proposed and stability results for MMPC were also established [37, 38]. The motiva-
tion for MMPC is to reduce real-time computational load when MPC is implemented
on a multivariable system. In this chapter we present the first successful application
of MMPC to multi-zone semiconductor thermal processing.
Figure 4.4 shows the pattern of input moves in the MMPC scheme for a three-
input system. In contrast to the conventional MPC scheme where the three input
moves have to be solved simultaneously, the main idea of MMPC is to solve each
move sequentially, and update the control as soon as the solution becomes available.
In other words, the MMPC scheme distributes the control moves over one complete
update cycle, with each control move obtained from solving a smaller optimization
problem, resulting in reduced computational complexity and hence reduced
computational load. Differences in the sequence of signal updates may cause
differences in output responses, and their significance is plant-dependent.
Nevertheless, the update sequence does not affect the computational complexity of
the MMPC algorithm. Moreover, whatever the sequence, each input is updated exactly
once in one complete update cycle, as in Standard MPC (SMPC), which forms
the basis for a comparison between MMPC and SMPC.
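The update pattern described above can be sketched in a few lines. The example uses
the indexing rule σ(k) = (k mod N) + 1 given in the appendix to this chapter. Placing
both schemes on a common fine time grid, with SMPC acting once every N steps, is an
assumption made here purely for side-by-side comparison, and the function names are
illustrative.

```python
def mmpc_input_index(k: int, n_inputs: int) -> int:
    """Input (1-based) that MMPC updates at step k: sigma(k) = (k mod N) + 1."""
    return (k % n_inputs) + 1


def update_pattern(n_inputs: int, n_steps: int):
    """Return, for each step, the inputs updated by SMPC and by MMPC.

    SMPC is taken to update every input together once per cycle of N steps;
    MMPC updates a single input per step, cycling through all N inputs.
    """
    pattern = []
    for k in range(n_steps):
        smpc = list(range(1, n_inputs + 1)) if k % n_inputs == 0 else []
        mmpc = [mmpc_input_index(k, n_inputs)]
        pattern.append((k, smpc, mmpc))
    return pattern


for k, smpc, mmpc in update_pattern(n_inputs=3, n_steps=6):
    print(f"step {k}: SMPC updates {smpc}, MMPC updates {mmpc}")
```

Over any N consecutive steps each input is updated exactly once under both schemes,
which is the property the comparison relies on.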
The chapter is organized as follows. In Section 4.2, the dynamic model of a
generic multi-zone bake-plate is obtained. In Section 4.3, MMPC and feedforward
are reviewed and the equations essential to implementation are presented. In Section
4.4, experimental results are presented and discussed. The chapter is concluded in
Section 4.5.
Figure 4.4: Patterns of input moves for Standard MPC (left), and for the
Multiplexed MPC (right).
4.2 Bake-Plate Thermal Modeling
The multi-zone bake-plate shown in Figure 3.2 was used. The physical model of an
N -zone bake-plate has been derived in Section 2.2. The dynamic model to be used
with MPC has exactly the same structure and can be determined from open-loop
step response tests. The model is given by Equation (2.3) and can be written in the
state-space form
$$\dot{z} = A_c z + B_c u, \qquad y = C_c z \tag{4.1}$$
where $z = [\theta_1 \ \theta_2 \ \cdots \ \theta_N]^T$ and $C_c = I$.
A first-order disturbance model has been assumed and its parameters, at dis-




z−0.890 − 0.228z−0.902 − 0.253z−0.904
]T
It can be seen from Figure 4.2 that the feedforward controller (see Section 4.3.2)
designed based on this disturbance model could eliminate the disturbance effectively
as the error signal to the feedback controller (MMPC or SMPC) was practically zero.
The temperature of the bake-plate remained at the setpoint of 90°C throughout.
4.3 A Review of Multiplexed MPC and Feedfor-
ward Control
The MMPC was proposed in [37, 38]. In this section, we describe the key ideas
of MMPC and the equations necessary for the implementation of MMPC on the
bake-plate.
4.3.1 Multiplexed MPC
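Given the continuous-time model of Equation (4.1), a discrete-time model, with
discretization interval of h seconds, suitable for digital control design, can be
obtained as
$$z_{k+1} = A_d z_k + B_d u_k, \qquad y_k = C_d z_k \tag{4.2}$$
where
$$A_d = e^{A_c h}, \qquad B_d = \int_0^h e^{A_c \tau} B_c \, d\tau, \qquad C_d = C_c$$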
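As an illustration of the discretization formulas, the sketch below works through the
scalar case, where the matrix exponential reduces to an ordinary exponential and the
integral has a closed form. The numerical values are hypothetical and were not
identified from the bake-plate. The function names are likewise illustrative, and a
midpoint-rule integral is included only as a cross-check.

```python
import math


def discretize_scalar(a: float, b: float, h: float):
    """Zero-order-hold discretization of dz/dt = a*z + b*u with interval h.

    Scalar case of A_d = exp(A_c h), B_d = integral_0^h exp(A_c tau) B_c dtau.
    """
    a_d = math.exp(a * h)
    b_d = b * h if a == 0.0 else b * (a_d - 1.0) / a
    return a_d, b_d


def b_d_numeric(a: float, b: float, h: float, n: int = 10000) -> float:
    """Midpoint-rule approximation of the B_d integral, used as a cross-check."""
    step = h / n
    return sum(math.exp(a * (i + 0.5) * step) * b * step for i in range(n))


# Hypothetical single-zone values, not identified from the bake-plate.
a_c, b_c, h = -0.1, 0.5, 0.4
a_d, b_d = discretize_scalar(a_c, b_c, h)
print(f"a_d = {a_d:.6f}, b_d = {b_d:.6f}, numeric b_d = {b_d_numeric(a_c, b_c, h):.6f}")
```

For the multi-zone plate the same formulas apply with matrices in place of scalars.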
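For MMPC design, it is convenient to express the model given by Equation (4.2)
in the following form with incremental inputs, ∆u
$$x_{k+1} = A x_k + B \Delta u_k \tag{4.3}$$
where
$$x_k = \begin{bmatrix} \Delta z_k \\ y_k \end{bmatrix}, \quad
A = \begin{bmatrix} A_d & 0 \\ A_d & I \end{bmatrix}, \quad
B = \begin{bmatrix} B_d \\ B_d \end{bmatrix} = [\, B_1, \cdots, B_N \,]$$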
The design of the Multiplexed MPC, with the model given by Equation (4.3),
is detailed in Appendix 4A.
4.3.2 Feedforward Control
Given the impulse response of the disturbance, D(z), the feedforward control signal
can be calculated as





In this section, MMPC with feedforward control is demonstrated on a three-zone
(N = 3) bake-plate for 200mm wafers.
4.4.1 Experiment Setup
A photograph of the bake-plate is shown in Figure 3.2. Three RTD’s were embedded
in the bake-plate, one for each zone. The ambient temperature was 25°C and the
experiments were conducted at a setpoint of 90°C.
The temperatures of all zones were maintained at 90°C, and wafers at ambient
temperature were placed on the plate. Proximity pins on the plate created a
wafer-to-plate air gap of approximately 165µm to avoid contact between the wafer
and the plate surface.
4.4.2 Parameter Estimation
The physical model given by Equation (4.1) suggests a particular structure for the
state-space matrices. Given this structure, we used the System Identification
Toolbox in MATLAB to perform a structured state-space model estimation of the
unknown parameters.
























































































Figure 4.5: Model verification using step response. From left to right, step
input applied to zone 1, zone 2, and zone 3.
Open-loop step response tests were carried out to collect the required input-
output data for parameter estimation. The temperatures of all zones were made to
reach the steady state of 90°C (baking process temperature). Then a step input of
magnitude 0.1V was applied to the power electronics circuitry of the heater of one
zone while inputs to the other two heating zones were maintained constant. The
power electronics was designed such that the power of the heater was proportional
to the applied voltage. The temperature changes in all three zones were recorded.
This process was repeated for the other two inputs. The results are shown in Figure
4.5.















Figure 4.2 shows the response obtained under feedforward control for a flat wafer.
The figure essentially gives the response of a standard feedforward controller,
because the feedback error signal was very small and hence the feedback control
action due to MMPC or SMPC was negligible. Clearly, feedforward control was
effective for both SMPC and MMPC schemes in eliminating the disturbance caused by
the placement of the flat wafer on the bake-plate.
Figure 4.6 shows the response obtained under feedforward control when a wafer
with a center-to-edge warp of about 100µm, was used. In this case, the disturbance
caused by the warped wafer deviated from that by the flat wafer. Hence, feedfor-
ward control was not able to eliminate the temperature disturbance completely and
feedback control came into play.
4.4.4 Discussion
Depending on the thermal process, the recipe baking time can be less than three
minutes and the temperature should recover within three minutes. In the experi-
ment, the warped wafer was dropped onto the bake-plate at t = 20s and caused a
disturbance. It can be seen from Figure 4.6 that the temperature of the bake-plate
recovered to the 90°C setpoint at t = 200s and t = 300s for MMPC and SMPC,
respectively. Under SMPC, if the next wafer arrived at t = 200s, the temperature
of the bake-plate would not have recovered to 90°C by then and the second wafer
would be baked at a different starting temperature. Whenever this happens, there
will be wafer-to-wafer non-repeatability in temperature processing trajectory.
Figure 4.7 is a magnified view of Figure 4.6 from t = 73s to t = 77s to clarify
the differences in the sampling and updating patterns of MMPC and SMPC. Figure
4.7 shows the instants when SMPC and MMPC measured the temperatures of the
zones where the ‘×’s denote measurement instants for MMPC while the ‘◦’s denote
measurement instants for SMPC.
SMPC updated the three control signals simultaneously at a 1.2s interval, i.e.,
at t = 74.0s, 75.2s and 76.4s, as indicated by the dotted lines in Figure 4.7. In
comparison, MMPC also updated each control signal at a 1.2s interval, but in a
multiplexed fashion, i.e., control signal 1 at t = 74.0s, 75.2s and 76.4s; control
signal 2 at t = 74.4s, 75.6s and 76.8s; control signal 3 at t = 74.8s, 76.0s and
77.2s, as indicated by the solid lines in Figure 4.7.
Table 4.1: Comparison of SMPC's and MMPC's measurement and computation
instants. ×1 denotes solving a 15-variable Quadratic Program (QP).
×2 denotes solving a 5-variable QP.

                                        Time (s)
Controller  Activity             74.0  74.4  74.8  75.2  75.6  76.0
SMPC        Zone 1 (Measure)     ×                 ×
            Zone 1 (Compute)     ×1                ×1
            Zone 2 (Measure)     ×                 ×
            Zone 2 (Compute)     ×1                ×1
            Zone 3 (Measure)     ×                 ×
            Zone 3 (Compute)     ×1                ×1
MMPC        Zone 1 (Measure)     ×     ×     ×     ×     ×     ×
            Zone 1 (Compute)     ×2                ×2
            Zone 2 (Measure)     ×     ×     ×     ×     ×     ×
            Zone 2 (Compute)           ×2                ×2
            Zone 3 (Measure)     ×     ×     ×     ×     ×     ×
            Zone 3 (Compute)                 ×2                ×2
These instants are tabulated in Table 4.1. It can be seen that the MMPC scheme
measured the zone temperatures more frequently than SMPC. In both cases, the
zero-order-hold for the control signal was 1.2s, the control horizon was Mu = 5 and
the weighting matrices were q = diag(0.01, 0.1, 0.2) and r = 1. In general, a large
Mu tends to be computationally expensive; Mu = 5 gave a reasonable trade-off between
performance and computational load. The values of q and r were determined by
experimental tuning. The MMPC scheme divided the MPC problem into a sequence
of smaller optimization problems, solved them sequentially and updated each zone's
control signal as soon as the solution became available, thus distributing the
control moves over a complete update cycle of 1.2s.
One may be curious why the comparison between SMPC and MMPC was not
done with the same sampling rate. It is known that the computational complexity
of MPC (or Quadratic Programming) is O((N × Mu)^3) [43], where N is the number
of control inputs and Mu the control horizon. As the number of decision variables
(N × Mu) increases, the computational complexity increases nonlinearly. We reason
that if one could implement SMPC in T with the available computational resources,
then one could implement MMPC at a higher sampling rate (or a smaller sampling




Multiplexed MPC with feedforward has been demonstrated experimentally on a
multi-zone bake-plate application. Adding feedforward reduces the effect of the
disturbance significantly. While most of the temperature drop is compensated by
feedforward control, feedback control is still needed to cope with the errors due
to warped wafers. This is supported by experimental results. These results are
important for the semiconductor wafer baking process, because temperature
non-uniformity will affect the critical dimension of the wafer. MMPC has the
potential to make the plate recover faster than SMPC when a disturbance takes
place. There are admittedly many factors affecting closed-loop performance, such as
sampling rate and controller tuning. A key motivation for MMPC is its reduced
computational load and hence faster sampling. With control horizon Mu = 5, SMPC
needs to perform optimization with respect to N × Mu = 15 variables whereas MMPC
needs to perform optimization with respect to only 5 variables. In our experiment,
the sampling of SMPC was at 1.2s, or about 0.83Hz. MMPC, with its reduced
computational load, could sample at 0.4s, or 2.5Hz, without the need for hardware
upgrade. This computational advantage of MMPC becomes even more significant when
constraints are considered and with an increasing number of zones and an extended
control horizon. For example, consider the state-of-the-art 49-zone bake-plate [36]
with Mu = 5. SMPC would have to solve a Quadratic Program (QP) of 49 × 5 = 245
variables whereas MMPC would solve 49 QPs of only 5 variables each.
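To give a feel for the numbers, the sketch below applies the cubic scaling
O((N × Mu)^3) to the 3-zone and 49-zone cases. It is a rough operation-count model
only: constant factors, constraint handling and the extra measurements taken by MMPC
are all ignored, and the function names are illustrative.

```python
def qp_cost(n_variables: int) -> int:
    """Cubic cost model for one QP, following the O((N * Mu)^3) scaling."""
    return n_variables ** 3


def per_cycle_cost(n_zones: int, horizon: int):
    """Return (SMPC, MMPC) cost per complete update cycle under the cubic model.

    SMPC solves one QP in N * Mu variables; MMPC solves N QPs in Mu variables.
    """
    smpc = qp_cost(n_zones * horizon)
    mmpc = n_zones * qp_cost(horizon)
    return smpc, mmpc


for n_zones in (3, 49):
    smpc, mmpc = per_cycle_cost(n_zones, horizon=5)
    print(f"N={n_zones}: SMPC ~{smpc}, MMPC ~{mmpc}, ratio {smpc // mmpc}")
```

Under this model the per-cycle cost ratio grows as N^2, which is why the advantage
widens as the number of zones increases.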
Figure 4.6: Closed-loop temperature responses under MMPC and SMPC,
with feedforward, in response to a warped wafer.
Figure 4.7: Magnified view of Figure 4.6 from t = 73s to t = 77s. ‘×’s denote




Given a discrete-time LTI plant model in the state-space form represented by




The plant is assumed to have N inputs, denoted by $u_{1,k}, \cdots, u_{N,k}$, and
$\Delta u_{j,k} = u_{j,k} - u_{j,k-1}$ is the control move applied to input j at
time index k. In the interest of brevity, we will consider only the regulation
problem. It is assumed that at time index k, the complete state vector $x_k$ is
known exactly.
The control strategy of MMPC is to update the N inputs of the multivariable
plant one at a time. To be specific, MMPC, at time index k, changes only plant
input ((k mod N) + 1). Hence an alternative representation of the N-input plant
for MMPC is a periodic linear system with one input:
$$x_{k+1} = A x_k + B_{\sigma(k)} \Delta\tilde{u}_k \tag{4.5}$$
where $\Delta\tilde{u}_k = \Delta u_{\sigma(k),k}$, and $\sigma(k)$ is the indexing
function defined as
$$\sigma(k) = (k \bmod N) + 1 \tag{4.6}$$






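$$
\begin{aligned}
\text{minimize} \quad & \sum_{i=0}^{M-1} \left( \|x_{k+i+1|k}\|_q^2 + \|\Delta\tilde{u}_{k+i|k}\|_r^2 \right) + F(x_{k+M|k}) \\
\text{wrt} \quad & \Delta\tilde{u}_{k+i|k}, \quad (i = 0, N, 2N, \cdots, M-1) \\
\text{s.t.} \quad & \Delta\tilde{u}_{k+i|k} \in U_{\sigma(k+i)}, \quad (i = 0, \cdots, M-1) \\
& x_{k+i|k} \in X, \quad (i = 1, \cdots, M) \\
& x_{k+M|k} \in X_I(K_{\sigma(k)}) \\
& x_{k+i+1|k} = A x_{k+i|k} + B_{\sigma(k+i)} \Delta\tilde{u}_{k+i|k}, \quad (i = 0, \cdots, M-1) \\
& \Delta\tilde{u}_{k+i|k} = \Delta\tilde{u}_{k+i|k-1}, \quad (i \neq 0, N, 2N, \cdots, M-1)
\end{aligned} \tag{4.7}
$$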
where $M = (M_u - 1)N + 1$ and $M_u$ is the control horizon, a design parameter which
denotes the number of control moves to be optimized per input channel of the original
system given by Equation (4.4); $F(x_{k+M|k})$ is a suitably chosen terminal cost,
and $X$ and $U$ are compact polyhedral sets containing the origin in their interior.
$X_I(K_{\sigma(k)})$ denotes the set in which none of the constraints is active, and
which is the maximum positively invariant set [44] for the linear periodic system
given by Equation (4.5), when a stabilizing linear periodic feedback controller
$K_{\sigma(k)}$ is applied. In other words, $x_k \in X_I(K_{\sigma(k)})$ implies
$K_{\sigma(k)} x_k \in U_{\sigma(k)}$ and
$(A + B_{\sigma(k)} K_{\sigma(k)}) x_k \in X_I(K_{\sigma(k)})$.
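The bookkeeping behind the horizon M = (Mu − 1)N + 1 can be made concrete with a
small example. The sketch below lists, for the three-zone case with Mu = 5, which
move indices i are free decision variables in one MMPC solve and which are inherited
from earlier plans. It illustrates the index sets only and is not an MPC solver.
The function names are illustrative.

```python
def mmpc_horizon(n_inputs: int, control_horizon: int) -> int:
    """Prediction length M = (Mu - 1) * N + 1 used in the MMPC problem."""
    return (control_horizon - 1) * n_inputs + 1


def split_moves(n_inputs: int, control_horizon: int):
    """Split indices i = 0..M-1 into free moves and moves fixed by earlier plans.

    Free moves sit at i = 0, N, 2N, ..., M-1 and belong to the input being
    updated now; all other moves are inherited from previously computed plans.
    """
    m = mmpc_horizon(n_inputs, control_horizon)
    free = [i for i in range(m) if i % n_inputs == 0]
    fixed = [i for i in range(m) if i % n_inputs != 0]
    return m, free, fixed


m, free, fixed = split_moves(n_inputs=3, control_horizon=5)
print(f"M = {m}")
print(f"free move indices:  {free}")
print(f"fixed move indices: {fixed}")
```

With N = 3 and Mu = 5 this gives M = 13 and exactly Mu = 5 free moves per solve,
consistent with the 5-variable QP mentioned in the discussion.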
The main idea of MMPC, as captured in the problem formulation (4.7), is to
partition the entire system into smaller subsystems, solve the control for each
subsystem sequentially, and make the control update as soon as the solution becomes
available. Hence the optimization is carried out with respect to only a subset of the
available decision variables, i.e., $\Delta\tilde{u}_{k+i|k}$
(i = 0, N, 2N, · · · , M − 1). This is in contrast to conventional MPC, which updates
all the control variables simultaneously in one update cycle. Consequently, some
assumptions must be made about those inputs which have already been planned but
which have not yet been executed. MMPC assumes that all such planned decisions are
known to the controller, and that they will be executed as planned, i.e.,





Temperature in semiconductor thermal processing is an important determinant of
CD which quantifies the accuracy of circuit patterns formed over the lithography
process. Increasingly stringent requirement on temperature control in semiconductor
thermal processing gives rise to the need for a multi-zone bake-plate with unbiased
temperature sensing and a computationally efficient temperature control scheme.
In this thesis we develop and analyze data reconciliation techniques to efficiently
estimate RTD bias from thermal processing data in order for biased RTD’s to be
remedied online. We propose and implement MMPC as a control scheme to lessen
computational load otherwise incurred by conventional MPC in the multi-zone bake-
plate so that potentially superior control performance results without the need for
hardware upgrade.
In Chapter 2, data reconciliation (DR) is extended so that RTD biases can
be estimated online from process data. To handle frequently encountered
non-normality in process data, a generalized T distribution (GT) based bias
estimator is proposed. Equations are derived which relate the variance of a bias
estimator to the sample size (number of wafer runs per estimation). These equations
enable the computation of the sample size, i.e., the number of wafers needed by the
bias estimator to achieve a specified variance. With this information, just the
required number of wafers can be used and wastage can be prevented. Alternatively,
these equations allow the calculation of the variance of the bias estimator, and
hence its precision, if the number of wafers used is given. Equations are obtained
for the analysis of simple LS, IQR+LS, and the proposed GT-based estimator.
In Chapter 3, simulation and experimental examples are given to demonstrate
the application of the estimators and estimator analysis developed in Chapter 2.
We examine specifically the performance of simple LS, IQR+LS and GT-based bias
estimator for the difficult problem where measurement outliers are close to good data
such that they cannot be separated easily. The theory is verified experimentally on
a multi-zone bake-plate for semiconductor thermal processing. In the light of the
equations derived, an efficient estimator can be selected. In the presence of outliers
that are close to the good data, the equations show that using GT, instead of normal
distribution, to characterize process data gives rise to a more efficient estimator than
LS and IQR+LS and therefore enables earlier remedial actions against RTD bias to
save semiconductor wafers from sensing-related processing defects. The experiment
was performed with 25% of the temperature and power fluctuations coming from
distributions with standard deviations 3 times as great as the good data. It is
shown that the GT-based estimator with 43 wafer runs achieved the same estimation
variance as the IQR+LS estimator with 50 wafer runs. In other words, 7 wafers could
be saved from sensing-related processing defects.
In Chapter 4, we propose the use of Multiplexed MPC (MMPC) for temperature
control in multi-zone thermal processing. MMPC with feedforward is demonstrated
experimentally on a multi-zone bake-plate application. It is shown that adding
feedforward reduces the effect of disturbance significantly. While most of the tem-
perature drop will be compensated by feedforward control, feedback control is still
needed to cope with the errors due, for example, to warped wafers. Depending on
the thermal process, the recipe baking time varies and plate temperature should
recover fully within the pre-specified period of time. We note from the experiments
the potential of MMPC to make plate temperature recover faster than under
conventional MPC after a disturbance takes place. Although the sampling rate is not
the only factor bearing on closed-loop control performance, MMPC's key advantage of
reduced computational load supports faster sampling and can potentially bring about
superior control performance. These results are important for the semiconductor
wafer baking process, because temperature non-uniformity resulting from poor
temperature control performance has an adverse impact on the CD of wafers.
5.2 Future Work
It would be of interest to study the adoption of a Bayesian framework in sensor bias
estimation. In what has been discussed in this thesis, the probability that an RTD
is biased is not factored into the calculation. Nor is the probability distribution
of the bias of an RTD, since we have assumed that any bias value is equiprobable.
Factoring in the bias probability may be especially beneficial in reducing false
alarms when no bias actually exists. Factoring in the bias distribution may bring
further efficiency gains for bias estimators. A bias value tends to be less likely
as its magnitude grows, so it may be modeled to follow a unimodal distribution
centered at zero. The bias estimation problem,















with respect to x and e, subject to Ax = 0, where P (e) is the probability an RTD
has bias and f(e) is the probability density of an RTD’s bias value. Equations for
estimator analysis remain to be worked out, which would be a generalization of the
equations we have derived in Chapter 2.
As far as the thesis is concerned, sensor bias estimation, or more broadly, data
reconciliation (DR), has been executed in a centralized manner (CDR). All
measurements from the thermal process are collected, transferred to and reconciled
by an estimator. CDR per se cannot handle total breakdown of certain sensors unless
the centralized DR knows that the quantities these sensors are supposed to measure
are unmeasurable. In a state-of-the-art multi-zone bake-plate with 49 zones, total
failure of some RTD's is a real possibility. Robustness of DR to such a mishap would
be welcome, as bias estimation of the remaining functional RTD's could proceed
without human intervention, i.e., RTD replacement. Indeed, model constraints as
used by CDR can also be exploited if each sensor participates in DR by reconciling
its own measurement with measurements obtained from neighboring sensors. In the
case of a sensor failure, the neighboring sensors may be able to construct a
suboptimal state estimate for the failed sensor, and the failed sensor is bypassed
in the final centralized estimation for bias.
Bibliography
[1] C. A. Mack, Fundamental Principles of Optical Lithography, John Wiley &
Sons, 2007.
[2] C. A. Mack, Field Guide to Optical Lithography, SPIE Optical Engineering
Press, 2006.
[3] J. Sturtevant, “CD Control Challenges for Sub-0.25µm Patterning”, SEMATECH
DUV Lithography Workshop, Austin, TX, Oct. 16-18, 1996.
[4] H. J. Levinson, Principles of Lithography, SPIE Optical Engineering Press,
2005.
[5] R. Kaesmaier, A. Wolter, H. Löschner, and S. Schunck, “Ion-projection
Lithography: Nov00 Status and Sub-70nm Prospects”, Proceedings of SPIE, 4226, pp.
52-58, 2000.
[6] A. Ehrmann, A. Elsner, R. Liebe, T. Struck, J. Butschke, F. Letzkus,
M. Irmscher, R. Springer, E. Haygeneder, and H. Löschner, “Stencil Mask Key
Parameter Measurement and Control”, Proceedings of SPIE, 3997, pp. 373-384,
2000.
[7] J. Parker, W. Renken, “Temperature metrology for CD control in DUV Lithog-
raphy”, Semiconductor International, pp.111-116, No. 10, Vol. 20, 1997.
[8] Q. Zhang, K. Poolla, and C. J. Spanos, “Across Wafer Critical Dimension
Uniformity Enhancement Through Lithography and Etch Process Sequence:
Concept, Approach, Modeling, and Experiment”, IEEE Trans. on Semi. Manu.,
pp. 488-505, no. 4, vol. 20, 2007.
[9] T. Masahide, S. Shinichi, O. Kunie, and M. Tamotsu, “Effects Produced by
CDU Improvement of Resist Pattern with PEB Temperature Control for Wiring
Resistance Variance Reduction”, Proceedings of SPIE, pp. 69222Z, vol. 6922,
2008.
[10] C.D. Schaper, K. El-Awady, T. Kailath, A. Tay, L.L. Lee, W.K. Ho, and
S. Fuller, “Characterizing Photolithographic Linewidth Sensitivity to Process
Temperature Variations for Advanced Resists Using a Thermal Array”, Applied
Physics A: Material Science and Processing, DOI: 10.1007/s00339-003-2343-x,
2003.
[11] P. Friedberg, C. Tang, B. Singh, T. Brueckner, W. Gruendke, B. Schulz, and
C. Spanos, “Time-based PEB Adjustment for Optimizing CD Distributions”,
Proceedings of SPIE, pp. 703-712, vol. 5375, 2004.
104
[12] J. Cain, P. Naulleau, and C. Spanos, “Critical Dimension Sensitivity to Post-
exposure Bake Temperature Variations in EUV Photoresists”, Proceedings of
SPIE, pp. 1092-1100, vol. 5751, 2005.
[13] Q. Zhang, C. Tang, T. Hsieh, N. Maccrae, B. Singh, K. Poolla, and C. Spanos,
“Comprehensive CD Uniformity Control in Lithography and Etch Process”,
Proceedings of SPIE, vol. 5752(3), 2005.
[14] S. Scheer, M. Carcasi, T. Shibata, and T. Otsuka, “CDU Minimization at the
45nm Node and Beyond: Optical, Resist, and Process Contributions to CD
Control”, Proceedings of SPIE, 65204H, vol. 6520, 2007.
[15] A. Hisai, K. Kaneyama, C. Pieczulewski, “Optimizing CD uniformity by to-
tal PEB cycle temperature control on track equipment”, Advances in Resist
Technology and Processing XIX, Proceedings of SPIE, pp. 754-760, Vol. 4690,
2002.
[16] K. El-Awady, C. D. Schaper, and T. Kailath, “Control of spatial and tran-
sient temperature trajectories for photoresist processing”, Vacuum Science and
Technology, pp. 2109-2114, No.5, Vol. 17, 1999.
[17] K. El-Awady, C. D. Schaper, and T. Kailath, “Integrated Bake/Chill for Pho-
toresist Processing”, IEEE Trans. on Semiconductor Manufacturing, pp. 264-
266, No. 2, Vol. 12, 1999.
[18] H. J. Levinson, Lithography Process Control, SPIE Optical Engineering Press,
1999.
105
[19] H. Huff, R. Goodall, R. Nilson and S. Griffiths, “Thermal Processing Issues
for 300 mm Silicon Wafers: Challenges and Opportunities”, ULSI Science and
Technology VI, Proceedings of the Electrochemical Society, pp. 135-181, 1997.
[20] B. W. Smith, Resist Processing in Microlithography: Science and Technology,
J. R. Sheats and B. W. Smith, Eds., Marcel Dekker, 1998.
[21] D. Steele, A. Coniglio, C. Tang, B. Singh, S. Nip and C. Spanos, “Characterizing
Post Exposure bake Processing for Transient and Steady State Conditions, in
the Context of Critical Dimension Control”, Metrology, Inspection, and Process
Control for Microlithography XVI, Proceedings of SPIE, pp. 517-530, vol. 4689,
2002.
[22] Jose A. Romagnoli, Mabel C. Sanchez, Data Processing and Reconciliation for
Chemical Process Operations, Academic Press, 1999.
[23] S. Narasimhan, M. Jordache, Data reconciliation and gross error detection :
an intelligent use of process data, Gulf Pub. Co, 1999.
[24] W. K. Ho, H. Yan, J. A. Romagnoli, and K.V. Ling, “Measurement Bias De-
tection, Identification and Elimination for Multi-Zone Thermal Processing in
Semiconductor Manufacturing”, The 33rd Annual Conference of the IEEE In-
dustrial Electronics Society, Taiwan, Nov. 5-8, 2007.
[25] D. Wang, J. A. Romagnoli, “A Framework for Robust Data Reconciliation
Based on a Generalized Objective Function”, Industrial & Engineering Chem-
istry Research, pp. 3075-3084, 42, 2003.
106
[26] D. C. Montgomery, Introduction to Statistical Quality Control, 4th Edition,
John Wiley & Sons, 2001.
[27] Frank R. Hampel, Elvezio M. Ronchetti, Peter J. Rousseeuw andWerner A. Sta-
hel, Robust Statistics - The Approach Based on Influence Functions, John
Wiley & Sons, 1986.
[28] J. Tukey, Exploratory Data Analysis, Addison-Wesley, 1977.
[29] James B. McDonald, Whitney K. Newey, “Partially Adaptive Estimation of
Regression Models via the Generalized T Distribution”, Econometric Theory,
pp. 428-457, No.4, 1988.
[30] Christian Hansen, James B. McDonald, Whitney K. Newey, “Instrumental Vari-
ables Estimation with Flexible Distributions”, Journal of Economics and Busi-
ness Statistics, 2007.
[31] W. K. Ho, A. Tay, M. Chen, J. Fu, H. J. Lu and X. C. Shan, “ Critical Dimension
Uniformity via Real Time Photoresist Thickness Control”, IEEE Transactions
on Semiconductor Manufacturing 20, no. 4, pp. 376-380, 2007.
[32] W. K. Ho, A. Tay, J. Fu, M. Chen and Y. Feng, “Critical dimension and
real time temperature control for warped wafers”, accepted for publication by
Journal of Process Control, 2008.
[33] M. Quirk and J. Serda, Semiconductor Manufacturing Technology, Prentice
Hall, 2001.
107
[34] L. L. Lee, D. Schaper, W. K. Ho, “Real-Time Predictive Control of Photoresist
Film Thickness Uniformity”, IEEE Transactions on Semiconductor Manufatur-
ing, vol. 15(1), pp 51-59, 2002.
[35] W. K. Ho, A. Tay, M. Chen, and C. M. Kiew, “Optimal Feed-Forward Control
for Multizone Baking in Microlithography”, Ind. Eng. Chem. Res, 46, pp. 3623-
3628, 2007.
[36] K. El-Awady, C. Schapter and T. Kailath, “Programmable thermal processing
module for semiconductor substrates”, IEEE Transcations on Control System
Technology, vol. 12, pp. 493-509, 2004.
[37] K. V. Ling, J. M. Maciejowski, and B. F. Wu, “Multiplexed Model Predictive
Control”, The 16th IFAC World Congress, Prague, July 2005.
[38] K. V. Ling, J. M. Maciejowski, and B. F. Wu, “Multiplexed Model Pre-
dictive Control”, Technical report Cambridge University Engineering Dept,
CUED/FINFENG/ TR. 561, 2006.
[39] S. W. Jones, “A Simulation Study of the Cost and Economics of 450 mm
Wafers”, Semiconductor International, 2005.
[40] D. F. Andrews, P. J. Bickel, F. R. Hampel, P. J. Huber, W. H. Rogers and
J. W. Tukey, Robust Estimates of Location: Survey and Advances, Princeton
University Press, 1972.
108
[41] Frank R. Hampel, “The Breakdown Points of the Mean Combined With Some
Rejection Rules”, Technometrics, No. 2, vol. 27, 1985.
[42] W. N. Venables, B. D. Ripley, Modern Applied Statistics with S, Springer-
Verlag New York, 2002.
[43] J. Skorin-Kapov, “On Strongly Polynomial Algorithms for Some Classes of
Quadratic Programming Problems”, Mathematical Communications, 2, pp. 95-
105, 1997.




H. Yan, W. K. Ho, K. V. Ling and K. W. Lim, “Multi-Zone Thermal Processing in
Semiconductor Manufacturing: Bias Estimation”, accepted for publication by IEEE
Transactions on Industrial Informatics, 2010.
K. V. Ling, W. K. Ho, B. Wu, Andreas and H. Yan, “Experimental Evaluation of
Multiplexed MPC for Semiconductor Manufacturing”, Asian Control Conference,
August 2009.
K. V. Ling, W. K. Ho, B. Wu, Andreas and H. Yan, “Multiplexed MPC for Multi-
Zone Thermal Processing in Semiconductor Manufacturing”, accepted for publica-
tion by IEEE Transactions on Control Systems Technology, 2009.
W. K. Ho, H. Yan, J. A. Romagnoli and K. V. Ling, “Measurement
Bias Detection, Identification and Elimination for Multi-Zone Thermal Processing in
Semiconductor Manufacturing”, The 33rd Annual Conference of the IEEE Industrial
Electronics Society, November 2007.
K. V. Ling, H. Yan, W. K. Ho, J. A. Romagnoli and Y. Joe, “Multi-Zone Ther-
mal System in Semiconductor Manufacturing: Gross Error Treatment”, IFAC Con-
ference on Advanced Process Control for Semiconductor Manufacturing, December
2006.
