Implementation of a hierarchical controller in a factory simulation by Violette, James D. (James Douglas)
IMPLEMENTATION OF A HIERARCHICAL
CONTROLLER IN A FACTORY SIMULATION
by
James D. Violette
B.S., Aeronautical Engineering, Massachusetts Institute of
Technology (1989)
Submitted to the Department of Aeronautics and Astronautics in partial fulfillment
of the requirements for the degree of
MASTER of SCIENCE
at the
MASSACHUSETTS INSTITUTE OF TECHNOLOGY
June 1993
© Massachusetts Institute of Technology, 1993. All rights reserved.
Signature of Author
Department of Aeronautics and Astronautics
April 23, 1993
Certified by
Dr. Stanley B. Gershwin
Senior Research Scientist
Department of Mechanical Engineering
Thesis Supervisor
Accepted by
Professor Harold Y. Wachman
Chairman, Department Graduate Committee
IMPLEMENTATION OF A HIERARCHICAL
CONTROLLER IN A FACTORY SIMULATION
by
James D. Violette
Submitted to the Department of Aeronautics and Astronautics
on April 23, 1993, in partial fulfillment of the
requirements for the degree of
Master of Science in Aeronautics and Astronautics
Abstract
This thesis describes the implementation of a hierarchical controller in a factory
simulation. This work is based on the hierarchical scheduling control algorithm of
Gershwin, Akella, and Choong (1984) and the initial implementation described in
Darakananda (1989). The purpose of this implementation is to provide a research
testbed so that extensions to the hierarchical control algorithm may be tested by
applying the theories in an empirical environment. The simulation consists of a
20,000-line program written in pre-ANSI C. The architecture of the code is modular
for ease of expansion and alteration.
The factory model consists of multiple processes and multiple machine types of
limited flexibility and limited reliability. The factory controller approximates a
dynamic programming feedback control law that acts on the surplus of each
process. The controller is built around the concepts of frequency and process
decomposition, embodied in a hierarchy of semi-independent cells. Work-in-process is
actively controlled by use of hedging points.
This thesis also describes solutions to a number of technical issues related to the
implementation of the theory in this simulation. Examples include reentrant flow
control, boundaries in surplus space, design limits on hedging points, and control of
setup changes. These concepts are illustrated in a series of simulation experiments.
Definitions and basic concepts are carefully developed to provide a common reference
for future work.
Thesis Supervisor: Dr. Stanley B. Gershwin
Title: Senior Research Scientist
Department of Mechanical Engineering
Contents
1 Introduction
1.1 Purpose and Intended Audience
1.1.1 Purpose
1.1.2 Intended Audience
1.2 Types of Descriptions
1.3 Overview of Hiercsim
1.3.1 Features Included and Not Included
1.3.2 Design and Performance
1.4 Outline of Thesis
1.5 Literature Survey
1.5.1 Commercially Available Simulators
1.5.2 Prior Development on Hiercsim
1.5.3 Work Based on Hiercsim
2 Definitions, Assumptions, Observations and Notation
2.1 Manufacturing System Description
2.1.1 Definitions
2.1.2 Notation Conventions
2.2 Types of Events in the Simulation
2.2.1 Types of Events and Activities
2.3 Performance Measure Definitions
2.3.1 Objective System Measures
2.3.2 Controller-Specific Performance Measures
2.4 Capacity Constraints
2.5 Assumptions about Frequencies of Events
2.5.1 Characteristic Frequencies of Events
2.5.2 Definition of Level
2.5.3 Assumptions Based on Relative Frequencies
2.5.4 Bounds on Frequencies in the Hierarchical Controller
2.6 Assumptions about Processes
2.7 Assumptions about Machine Reliability
2.7.1 Characterization of Failure Modes
2.7.2 Effects of Failures on Capacity
2.8 Observations about Buffers
2.8.1 Role of Buffers in a Manufacturing System
2.8.2 Buffer Behavior
2.8.3 Constraints Imposed by Buffers
2.9 Observations about Measurement Frequency
2.9.1 Production as a Function of Level
2.9.2 Capacity over Different Time Scales
2.9.3 Material in Buffers as a Function of Level
3 Algorithms for a Hierarchical Controller
3.1 Requirements of a Hierarchical Controller
3.1.1 Requirements of a Real-Time Controller
3.1.2 Difficulties of Scheduling
3.1.3 Decomposition Requirements
3.2 Microscope Analogy to Hierarchical Control
3.3 Basic Decompositions
3.4 Control Decomposition Infrastructure
3.4.1 Control Level Definition
3.4.2 Buffer Control Level Definition
3.5 Definition of Controller Infrastructure Terms
3.5.1 Frequency Decomposition Infrastructure
3.5.2 Process Decomposition Infrastructure
3.5.3 Pyramid Decomposition Infrastructure
3.6 The Cell as a Building Block
3.6.1 The Cell as a Sub-Factory
3.6.2 Capacity of a Cell
3.6.3 Cell Controller Function
3.7 Dynamic Programming Approach
3.7.1 Problem Sketch
3.7.2 Dynamic Programming Problem Approximation
3.8 Flow Constraints Imposed from Adjacent Cells
3.9 Translation of Production Rates into Loading Times
3.10 Machine Loading Controller
3.11 Information Flow within a Distributed Control System
3.11.1 Relative Position of Level k Cell c
3.11.2 Information Flow Between Levels
3.11.3 Information Flow between Adjacent Cells at the same Level
3.11.4 Information Flow During a High Level State Change
3.12 Reentrant Process Control
3.12.1 Reentrant Process
3.12.2 Behavior of Hierarchical Controller
3.12.3 Modification for Reentrant Processes
3.13 Constraints on Hedging Points
4 Implementation Details for Hiercsim Versions 3.5 and 4.0
4.1 Overall Architecture of Hiercsim
4.1.1 Data Representation Architecture
4.1.2 Data Input Architecture
4.1.3 Factory Model Architecture
4.1.4 Linear Program Architecture
4.1.5 Output
4.2 Major Interfaces in Hiercsim
4.3 The Factory Model
4.3.1 Possible Machine States
4.3.2 Processing
4.3.3 Machine Failures
4.4 Initialization
4.4.1 Initial Values of the Simulation
4.4.2 Start Event Queue
4.4.3 Degrees of Freedom in Controller Parameters
4.5 Simulation Dynamics
4.5.1 Virtual Machine Addition and Removal
4.5.2 Sequence of Rate Calculations in a System
4.5.3 System Response to Events
4.6 Cell Controller Support
4.6.1 Linear Program Constraint Implementation
4.6.2 Supporting Steps in the Rate Calculation
4.7 Boundaries in Surplus Space
4.7.1 Regions in Surplus Space
4.7.2 Attractive and Unattractive Boundaries
4.7.3 Attractive Boundary Chattering
4.7.4 Attractive Boundary Constraint
4.8 Boundary Installation
4.8.1 Equations of Motion
4.8.2 Projected Trajectory
4.8.3 Multiple Boundaries
4.8.4 Installation of Multiple Boundaries
5 Manufacturing Systems with Limited Flexibility
5.1 Limited Flexibility in a Manufacturing System
5.2 Setup Notation and Fundamental Concepts
5.2.1 Fundamental Concepts
5.2.2 Notation
5.3 Assumptions about Flexibility
5.3.1 Assumptions Required by the Current Theory
5.3.2 Implementation Assumptions
5.4 Limited Flexibility Capacity Set
5.4.1 Short-Term Limited Flexibility Capacity Set
5.4.2 Long-Term Limited Flexibility Capacity Set
5.4.3 Generalized Limited Flexibility Capacity Set
5.4.4 Constraints on Setup Change Frequencies
5.5 Hierarchical Control of Setup Changes
5.5.1 Setup Controller as a Function of Level
5.5.2 Linear Program with Setup Changes
5.5.3 Coordination of Systems with Limited Flexibility
6 Setup Implementation
6.1 Design of Hiercsim Version 4.0
6.1.1 Setup Scheduling Implementation Objective
6.1.2 Limitations of Hiercsim Version 4.0
6.2 Setup Constraints in the Master Linear Program
6.3 Capacity of a Cell During a Setup Change
6.4 System Coordination of Setups
6.4.1 Order of Cell Calculations with Configuration Changes
6.4.2 Controllable Rates During a Configuration Change
6.4.3 Cell Configuration Change Status
6.4.4 Cancellation of Configuration Change Requests
6.4.5 Machine Group Configuration Change Status
6.4.6 Machine Configuration Change Status
6.5 Configuration Catalog of a Cell
6.5.1 Catalog Entry in a Cell
6.5.2 Relationship of Catalog Entries Between Cells
6.6 Algorithm for Determining Cell Configuration
6.6.1 Obtaining and Using Initial Conditions
6.6.2 Setup Staircase Policy
6.6.3 Valid Catalog Entries
6.6.4 Determination of the Best Configuration
6.6.5 Coordination of Configuration Changes
6.6.6 Supporting Routines
6.7 Choosing the Next Time to Change Configuration
6.8 Configuration Change Coordination Mechanism
6.8.1 Configuration Calculation Sequencing
6.8.2 Part Processing Completion
6.8.3 Machine Group Configuration Change Activation
6.9 Machine Setup Assignment Algorithm
6.9.1 Notation for Assignment Algorithm
6.9.2 Setup Assignment Algorithm Description
6.10 Machine Setup Change Procedure
6.10.1 Machine Setup Initiation
6.10.2 Setup Completion
7 Simulation Experiments
7.1 Overview of Simulations
7.2 Demonstration of Buffer Behavior
7.2.1 Disruption Duration and Buffer Size
7.2.2 Systems with No Buffers
7.2.3 Systems with Partly Full Buffers
7.2.4 Systems with Empty or Full Buffers
7.3 Demonstration of Measurement Frequency
7.3.1 Cumulative Production
7.3.2 Amount of Material in a Buffer
7.3.3 Capacity of a Machine
7.4 Demonstration of the Cell as a Sub-Factory
7.4.1 Target Rates within Long-Term Capacity
7.4.2 Hedging Points at Different Levels
7.4.3 Failure/Repair Cycle
7.4.4 Boundary Installation Algorithm Example
7.4.5 Variation in Relative Priority Coefficients
7.5 Demonstration of Reentrant Flow Anti-Looping Constraints
7.5.1 Reentrant Flow Behavior
7.5.2 Reentrant Flow Controller Linear Programs
7.6 Demonstration of Setup Changes in Hiercsim Version 4.0
7.7 ICL Semiconductor Fab Case Study
8 Conclusion
8.1 Conclusion
8.2 Summary
8.3 Future Research
8.4 Note on the Status of Hiercsim
8.4.1 Code that Works
8.4.2 Algorithm Improvements
8.4.3 Code that must be Fixed
A Sample Input File, Hiercsim Version 3.5
B Sample Input File for Hiercsim, Version 4.0
C Procedure to Add Control Options to Hiercsim
C.1 Introduction
C.1.1 Description of Output Control Architecture
C.1.2 Overview of User Control Options
C.1.3 Files and Subroutines to be Changed
C.1.4 Accessing files within the Revision Control System (RCS)
C.2 Procedure
C.2.1 New Macros and Variable Names - Basic Information
C.2.2 New Macros and Variable Names - File Paths
C.2.3 Memory Bookkeeping and Initialization
C.2.4 New Command Line Options
C.2.5 New Output Paths
C.2.6 New Element in Toggle Option List
C.2.7 Control Using Flags in Hiercsim
D Procedure to Add Constraints
E Hiercsim Version 3.5 Cross Reference
F Hiercsim Version 4.0 Cross Reference
G Matlab Graphs from Hiercsim Output
G.1 Introduction
G.2 Step-by-Step
List of Figures
1-1 Different Scales of Models Described in this Thesis
2-1 Frequency Clusters
2-2 Example of Starvation and Blockage
3-1 Microscope Analogy to Hierarchical Control
3-2 Relationship between Cells, Buffers and Processes
3-3 Frequency Decomposition Infrastructure
3-4 Pyramid Decomposition Infrastructure
3-5 Subfactory Example
3-6 Sample Capacity Set f(ekk) for Level k Cell c
3-7 Virtual Machine Added to Level k Cell c
3-8 Virtual Machine Constraint in Level k Capacity Set f2k(ekmk)
3-9 Staircase Policy Cumulative Production
3-10 Example Reentrant Flow Hierarchy
3-11 Constraints on Hedging Points
4-1 Example Boundary with Capacity Set
4-2 Relationship Between xp, xo, and xk
5-1 Setup Tree for the Multi-Purpose Shop Tool
5-2 Configuration of a Group of Five Multi-Purpose Shop Tools
5-3 Example Setup Tree with Notation for Machine Group i
6-1 Original and New Configurations
6-2 Stage 1 and 2 of Setup Assignment
6-3 Stage 3 of Setup Assignment
7-1 Disruption Duration and Buffer Size
7-2 Cumulative Production with Large Buffer
7-3 Amount of Material in Large Buffer
7-4 Cumulative Production with Small Buffer
7-5 Amount of Material in Small Buffer
7-6 Cumulative Production with Small Buffer, Low Demand
7-7 Amount of Material in Small Buffer, Low Demand
7-8 System with No Buffers
7-9 Level 2 Cell 2 Cumulative Production with No Buffers
7-10 Level 2 Cell 5 Cumulative Production with No Buffers
7-11 System which Undergoes Starvation
7-12 Starvation - Cumulative Production for All Cells
7-13 Starvation - Amount of Material in All Buffers
7-14 System which Undergoes Blockage
7-15 Blockage - Cumulative Production for All Cells
7-16 Blockage - Amount of Material in All Buffers
7-17 Frequency Demonstration System
7-18 Cumulative Production for Entry Cells, Long Interval
7-19 Cumulative Production for Entry Cells, Intermediate Interval
7-20 Cumulative Production in Entry Cells, Short Interval
7-21 Material in Buffer 1, Intermediate Interval
7-22 Material in Buffer 1, Short Interval
7-23 Subfactory Demonstration System
7-24 Cumulative Production for Process 1
7-25 Cumulative Production for Process 2
7-26 Process 1 at its Collective Hedging Point
7-27 Process 2 at its Collective Hedging Point
7-28 Recovery of Process 1 From Level 2 Failure
7-29 Recovery of Process 2 From Level 2 Failure
7-30 Process 1 Cumulative Production, High Priority
7-31 Process 2 Cumulative Production, Low Priority
7-32 Process 1 Cumulative Production, Low Priority
7-33 Process 2 Cumulative Production, High Priority
7-34 Reentrant Flow System
7-35 Cumulative Production, Long Interval
7-36 Cumulative Production, Short Interval
7-37 Amount of Material in Buffer 1, Long Interval
7-38 Amount of Material in Buffer 1, Short Interval
7-39 Amount of Material in Buffer 2, Long Interval
7-40 Amount of Material in Buffer 2, Short Interval
7-41 System with Setups
7-42 Setup Catalog for the System with Setups
7-43 Cumulative Production under Corridor Policy
7-44 Cumulative Production under Setup Staircase Policy
7-45 The MIT Twin-Well CMOS process
7-46 CMOS Line Level 2 Cumulative Production
7-47 CMOS Line Level 3 Cumulative Production - Youngest Wafer First
7-48 CMOS Line Overall Work-in-Process - Youngest Wafer First
7-49 CMOS Line Level 3 Cumulative Production - Oldest Wafer First
7-50 CMOS Line Overall Work-in-Process - Oldest Wafer First
List of Tables
7.1 Results of Large Buffer Size Simulation
7.2 Results of Small Buffer Size Simulation
7.3 Results of Small Buffer Size Simulation, Low Demand
7.4 Results of System with No Buffers
7.5 Starvation Simulation Results
7.6 Blockage Simulation Results
7.7 Frequency Demonstration Results
7.8 Results of Subfactory Demonstration
7.9 Results of Reentrant Flow System
7.10 Results of Corridor Setup Policy
7.11 Results of Setup Staircase Policy
Acknowledgments
I would like to thank my wife, Melanie, for being supportive throughout the writing of
this thesis. I would also like to thank Mitchell Burman for his help in the development
of Hiercsim.
Many people contributed to the debugging and enhancement of Hiercsim. These
include Joe Kowalski, Chris Morin, Steve Riester, Hannes Smarason, and Marcello
Torres. The semiconductor data used to develop the reentrant flow control algorithms
in Hiercsim were developed by Xiewie Sherman Bai.
John Kasarda and the people at Crown Wood Products deserve thanks for providing
us with data and time in their television cabinet factory. Likewise, the people from
Packard Electric Company, Hamilton Standard, Hewlett Packard, and Ethicon deserve
thanks for allowing us to use data from their facilities to test Hiercsim. Many
of these facilities were made available through the Leaders for Manufacturing Program
at MIT.
Sarah Hood from IBM contributed information for the writing of the literature
survey.
Finally, I would like to extend special thanks to Dr. Stanley Gershwin for his
patience through this project. His high editorial standards saved this thesis from the
oblivion of unreadability.
This work was partly supported by the National Science Foundation under Grant
DDM-09142777; partly supported by the Defense Advanced Research Projects Agency
under contracts N00014-85-K-0213 and MDA-972-00-K-0008; and partly supported
by the IBM Corporation.
Chapter 1
Introduction
1.1 Purpose and Intended Audience
1.1.1 Purpose
The purpose of this thesis is to describe an implementation of the hierarchical control
theory and framework in Gershwin (1989) and Violette and Gershwin (1991). The
implementation has taken the form of a software package called Hiercsim. Hiercsim is
a simulator of manufacturing systems which is used as a research testbed for different
control policies consistent with the hierarchical framework. Hiercsim integrates many
separate concepts from the body of papers which comprise the hierarchical
manufacturing control system. Hiercsim Version 1.0 was written by Darakananda (1989).
This thesis describes two subsequent versions of Hiercsim, 3.5 and 4.0.
1.1.2 Intended Audience
This thesis has been written for two intended audiences. The first audience is one
which has access to the source code for Hiercsim, the manufacturing simulator. This
audience will be reading the thesis to understand the details of algorithms as
implemented in Hiercsim. Modifications to Hiercsim will be made based on this
understanding. The second audience is one which is interested primarily in the theoretical
aspects of the extensions to the basic hierarchical control theory and framework that
are incorporated in Hiercsim.
Figure 1-1: Different Scales of Models Described in this Thesis
1.2 Types of Descriptions
One of the overriding issues in any implementation is that reality has to be
modeled in some way. Any model necessarily approximates reality by including only
the most salient details and ignoring less important aspects. There is always
judgment involved in deciding what is important and what can be ignored. In addition,
details which are included in the model can be approximated in a number of different
ways.
Therefore, within this thesis, there are a number of different classes of topics which
attempt to describe the models developed for this implementation. They are Reality,
the Long Term Control Framework, Current Theory, and Implementation-Specific
Theory. The relationship between these different classes is shown in Figure 1-1.
Reality. Reality is taken to be the invariant, underlying physical phenomena
which are being modeled. All models are built on approximations of
reality. In this thesis, our perception of reality is described and contrasted with
the following three versions of our models.
Long Term Control Framework. The hierarchical control system, described in
this thesis and in several papers, is designed so that it may eventually be
expanded to accommodate a larger portion of reality. The framework developed in this
thesis and in the papers on which it is based makes no assumptions about the nature
of specific control algorithms, other than that they decompose the scope and
bandwidth of the control.
Current Theory. Despite the hope that many types of events and phenomena will
eventually be included in the control framework, the present state of the theory is
limited to operations, failures, and setup changes. The control algorithms developed
for these activities specify most, but not all, of the decisions needed to
produce finished products. The remaining decisions are left to simple, rule-based
heuristics, of which there are many to choose from.
Implementation. For any implementation of the hierarchical controller, each
control decision must be precisely specified. Each decision which is not determined in
the current state of the theory must be specified by a heuristic.
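As an illustration of the kind of simple, rule-based heuristic that can fill a decision the theory leaves open, the following Python sketch chooses which waiting lot a machine should load next. It is not taken from Hiercsim, which is written in C; the names `Lot` and `pick_next_lot` and the rule labels are hypothetical. The two rules echo the "Oldest Wafer First" and "Youngest Wafer First" labels in the Chapter 7 figure titles, under the assumption that age is measured from the time a lot is released into the factory. The thesis may define age differently, for example by process stage.

```python
from collections import namedtuple

# Hypothetical record for a lot waiting in a buffer.
# release_time is the time the lot entered the factory (assumed meaning of "age").
Lot = namedtuple("Lot", ["lot_id", "release_time"])


def pick_next_lot(queue, rule="oldest_first"):
    """Choose the next lot to load from a list of waiting lots.

    Returns None when the queue is empty, so the caller can leave the
    machine idle instead of treating an empty buffer as an error.
    """
    if not queue:
        return None
    if rule == "oldest_first":
        # Earliest release time means the lot has been in the factory longest.
        return min(queue, key=lambda lot: lot.release_time)
    if rule == "youngest_first":
        # Latest release time means the lot entered the factory most recently.
        return max(queue, key=lambda lot: lot.release_time)
    raise ValueError("unknown dispatch rule: %r" % (rule,))


if __name__ == "__main__":
    waiting = [Lot("A", 5.0), Lot("B", 1.0), Lot("C", 9.0)]
    print(pick_next_lot(waiting, "oldest_first").lot_id)
    print(pick_next_lot(waiting, "youngest_first").lot_id)
```

Keeping the rule behind a single function means a different heuristic can be substituted without touching the rest of a simulation, which is the property a research testbed needs.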
At this time, there are two distinct implementations of the hierarchical controller
simulation, called Hiercsim Versions 3.5 and 4.0. Version 3.5 models operations and
failures, and Version 4.0 models operations and setup changes.
The assumptions made for the implementation are based loosely on the data which
our research group has received from various manufacturing facilities in the U.S. In
all cases, Hiercsim was able to model at least some aspects of these facilities.
1.3 Overview of Hiercsim
1.3.1 Features Included and Not Included
Hiercsim is a research testbed. For this reason, many of the features required in
a commercial simulation package are not included. The intent is to refine specific
aspects of the hierarchical control theory before additional complexity is added.
Hiercsim Version 3.5 contains the following features:
* The decomposition of control by frequency of events and by process step. This
control is an extension of the dynamic programming formulation described in
Gershwin (1989) and Violette and Gershwin (1991).
* A model of a factory which includes multiple processes, random failures and
repairs, and multiple machine types with no setup changes.
* A model of groups of identical machines which contain a mix of setup
configurations. The configuration of a machine group is fixed for the duration of a
simulation.
* The response of the hierarchical controller to stochastic events such as failures.
* The ability to model the control of reentrant processes in a hierarchical
framework (e.g., as in semiconductor manufacturing).
Hiercsim Version 4.0 contains the following features:
* The decomposition of control by frequency of events and by process step. This
control is an extension of the dynamic programming formulation described in
Gershwin (1989) and Violette and Gershwin (1991).
* A model of a factory which includes multiple processes, and multiple machine
types with variable setup states. Failures and repairs are not included in
Hiercsim Version 4.0.
* A model of groups of identical machines which contain a mix of setup
configurations. The configuration of a machine group may change during a simulation
run.
* The response of the hierarchical controller to stochastic events such as buffer
blockage and starvation.
* The infrastructure and preliminary control algorithms for modeling setup
changes within a single machine group.
* The preliminary control algorithms for coordination of setup changes among a
number of different machine groups.
Features which would be useful but are not yet incorporated into Hiercsim are:
* A model of yield loss or rework.
* A model of line personnel.
* A model with explicit due dates for build-to-order facilities.
* A model of assembly processes.
* A model of machine maintenance.
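The surplus-based control shared by both versions can be illustrated in its simplest textbook form: one machine, one part type, and a single hedging point. The Python sketch below is not Hiercsim code, and its names (`hedging_point_rate`, `simulate`) and parameters are assumptions made for illustration. Hiercsim itself handles multiple processes and cells through linear programs, which this sketch does not attempt to reproduce. Surplus here is cumulative production minus cumulative demand.

```python
def hedging_point_rate(surplus, hedging_point, demand_rate, max_rate, machine_up=True):
    """Return the production rate chosen by a single-part hedging point policy."""
    if not machine_up:
        return 0.0                      # a failed machine cannot produce
    if surplus < hedging_point:
        return max_rate                 # behind the target: produce as fast as possible
    if surplus > hedging_point:
        return 0.0                      # ahead of the target: let demand catch up
    return min(demand_rate, max_rate)   # at the target: track demand


def simulate(hedging_point, demand_rate, max_rate, up_schedule, dt=1.0, surplus=0.0):
    """Integrate d(surplus)/dt = rate - demand_rate over fixed time steps.

    up_schedule is a sequence of booleans, one per step (True = machine up).
    Returns the surplus at the end of each step.
    """
    history = []
    for machine_up in up_schedule:
        rate = hedging_point_rate(surplus, hedging_point, demand_rate,
                                  max_rate, machine_up)
        if machine_up and surplus < hedging_point:
            # Limit the rate so the surplus lands on the hedging point
            # instead of overshooting it within one step.
            rate = min(rate, demand_rate + (hedging_point - surplus) / dt)
        surplus += (rate - demand_rate) * dt
        history.append(surplus)
    return history


if __name__ == "__main__":
    # Four steps up, three steps failed, five steps up again.
    schedule = [True] * 4 + [False] * 3 + [True] * 5
    print(simulate(2.0, 1.0, 2.0, schedule))
```

With a hedging point of 2, a demand rate of 1, and a maximum rate of 2, the surplus climbs to the hedging point, holds there while the machine is up, drains during the failure, and then recovers once the machine is repaired.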
1.3.2 Design and Performance
The design goal of Hiercsim is for it to be reliable, maintainable, and flexible. The
algorithms developed in this thesis have neither computational efficiency nor memory
usage as explicit design criteria. Simple simulations were completed in under two
minutes, and a complex semiconductor simulation modeling 500 days of production
ran in 3 hours on a VAXstation 3100 workstation with 8 MB of RAM. Hiercsim has
successfully run on Sun SPARCstations and on IBM PC compatibles under both UNIX
and DOS operating systems. Most of the running time was spent executing part
movement within the factory, not performing the computations required by the
controller. Therefore, simulations that do not represent part movement explicitly
should run considerably faster.
Hiercsim Versions 3.5 and 4.0 are written in pre-ANSI C. Each version contains
approximately 20,000 lines of code and 150 subroutines in 60 files. They both accept
ASCII input data files and produce ASCII output.
Hiercsim Version 3.5 has been extensively tested and has been used to model
numerous manufacturing facilities. Hiercsim Version 4.0 has not been tested, and its
algorithms are still in a preliminary state. It has been used primarily as an algorithm
development vehicle for setup change modeling and control policies.
1.4 Outline of Thesis
This thesis attempts to separate general theoretical models from specific
implementation policies. Chapters 2, 3, and 5 are devoted to
describing general theoretical models. Chapters 4 and 6 are devoted to describing
the details of Hiercsim Versions 3.5 and 4.0. Chapter 7 demonstrates the general
theoretical models using Hiercsim through a series of simulations. Chapter 8 lists
conclusions from this work and potential areas for future work.
Chapter 2 describes our view of a manufacturing system and the overall framework
used to model such systems. Included in our view of a manufacturing system are
basic definitions, types of events modeled, performance measures, capacity models,
and basic assumptions about frequencies of events, processes, and machine reliability.
Chapter 3 introduces the control algorithms used in the hierarchical controller.
Extensions to the basic framework of Gershwin (1989) are described in this chapter,
including the combination of frequency and process decomposition, the
communication links required to make independent cells operate as a system, and
reentrant flow control algorithms.
Chapter 4 describes implementation details which are shared between Hiercsim
Versions 3.5 and 4.0. These details include the basic architecture of Hiercsim, the
interfaces required for data transfer between modules of the simulation, the basic
factory model, simulation dynamics, and cell controller implementation. An extension
of the boundary control algorithm (Gershwin, Akella, and Choong, 1985) to multiple
boundaries is described. Details specific to Hiercsim Version 3.5, such as failure
and repair models, are also discussed.
Chapter 5 introduces systems with limited flexibility that require setup changes
to meet production schedules. This chapter develops basic concepts and notation.
The capacity set for a system with setup changes is developed.
Chapter 6 uses the notation and concepts from Chapter 5 to describe the setup
change control algorithms used in Hiercsim Version 4.0. These algorithms include
choosing the time to change setup state, choosing an appropriate setup state, and the
coordination of setup changes across multiple machine groups and cells. The low level
setup assignment algorithm is described within the context of a setup tree. These
algorithms are preliminary and have not been tested.
A series of appendices are included which describe procedures to use and extend
Hiercsim. Appendix A describes extensions to the user interface originally described
in Darakananda (1989). Appendix B describes how to add more command line ar-
guments, how to add new modules to the input file, how to control output, and how
memory is allocated in Hiercsim to prevent memory leaks and pointer errors. Ap-
pendix C details the procedure to incorporate additional linear program constraints
in Hiercsim as more features are added to the simulator. Appendices D and E con-
tain the list of files and subroutines in Hiercsim Versions 3.5 and 4.0 respectively.
Appendix F details a procedure which can be used to generate graphs of results in
Matlab.
1.5 Literature Survey
Core Hierarchical Papers This section lists a sample of the papers on the topic
of hierarchical control of manufacturing systems as used in this thesis. This list is
not intended to be exhaustive, but instead, it provides a starting point for further
reading.
Gershwin (1989) describes the hierarchical framework and the basic dynamic
programming control theory used in this thesis. Gershwin, Akella, and Choong
(1985) describes the performance of the hierarchical control theory in a simple simu-
lation and develops the concept of boundaries in surplus space. Setup control systems
which are consistent with the hierarchical framework appear in Sharifnia, Caramanis,
and Gershwin (1991) and Srivatsan and Gershwin (1990). A preliminary description
of the pyramid extension of the hierarchy appears in Violette and Gershwin (1991).
Extensions to the hierarchical control theory and its application to semiconductor
manufacturing appear in Bai (1991a, 1991b) and Bai and Gershwin (1990a, 1990b).
1.5.1 Commercially Available Simulators
Commercial simulation software was available at the start of the Hiercsim project
in 1987, and many more packages have been developed since then. This section
mentions some of those simulators and lists the reasons why we chose to create a
custom simulation software package. For a detailed overview of available simulation
software, refer to Swain (1991).
There are at least two classes of manufacturing simulation software. The primary
purpose of simulation software is to model the capacity of new manufacturing lines for
capital planning or to determine the sensitivity of overall production measures such
as throughput, work-in-process, or cycle time to changes in the quantity or reliability
of equipment.
In one class, the user constructs a model of a manufacturing system from
a set of predefined objects. The user collects the necessary data and converts the
data into a set of parameters for the simulation. This class does not require the user
to be a programmer and usually has a menu-driven user interface. Examples of such
simulators are SIMAN (Pegden, Sadowski, and Shannon, 1990), ManSim (Wolverine
Software), and Achilles (Atherton, 1987).
The other class consists of those manufacturing simulators which are high level
programming languages. The user is required to write a computer program using pre-
defined subroutines which model components of a manufacturing system. Customized
subroutines may be written to supplement the pre-defined subroutines. Examples of
these high level languages are MODSIM by CACI, SLAM II (Pritsker, 1986), GPSS,
and BLOCS (Glassey and Adiga, 1989). The boundary between the two classes is
fuzzy since custom extensions can be written for SIMAN.
Two fundamental reasons contributed to the decision to create Hiercsim, rather
than purchasing a commercially available package.
* Difference in Purpose The purpose of most commercial simulation software
packages is to aid companies in analyzing business decisions. These include
analyzing the capacity of a proposed manufacturing line and its capital require-
ments, and analyzing the effect of additional or more reliable equipment on an
existing line's performance.
The purpose of Hiercsim is limited in scope to testing and refining real-time
hierarchical manufacturing system control policies. The intent is not to model
any specific system in great detail.
* System Control Policies Control of flow through the model of a manufac-
turing system in commercial simulation software is primarily limited to local
dispatch rules and start of line release rules. These rules do not generally re-
spond to stochastic changes in the manufacturing system.
The aim of the hierarchical control framework is to provide a dynamic control
law which is based on feedback of manufacturing measurements. Transforming
an existing software package into one which dynamically responds to the state
of a manufacturing system was considered to be too arduous.
While writing the custom simulation software, some disadvantages arose. None of
these disadvantages outweighed the advantages of having the flexibility to incorporate
features unique to a research testbed.
* Hiercsim was written without the aid of modern Computer Aided Software En-
gineering (CASE) tools. These tools permit the creation of user interfaces with
minimal effort, allow for quick debugging of code, and assist version manage-
ment.
In addition, this code was written by graduate and undergraduate students,
which made it difficult to retain people who understood the inner workings of
Hiercsim.
* The user interface and data file creation of Hiercsim are crude.
In contrast, SIMAN includes templates which automate much of the modeling
required for specific types of manufacturing systems. One example is a wafer
fabrication template by the Systems Modeling Corporation. ManSim has a com-
pletely menu-driven user interface which incorporates many standard models of
manufacturing systems.
* Hiercsim is limited in its functionality and modeling capability. This is due to
the lack of mature control policies consistent with the hierarchical framework as
well as the lack of software which implements those policies. It is relatively easy
to model a physical phenomenon, but it proved difficult to develop appropriate
control policies based on the hierarchical framework. This would be a disadvantage
only if Hiercsim were to be used in the management of a real factory.
Many different features are included in commercial simulation software.
Operations, batch processing, assembly, setup changes, machine failures and
repairs, preventative maintenance, operators, workday calendars, and lunch breaks
are standard in most commercial simulation software. Sequence dependent and
sequence independent setup changes are also modeled.
1.5.2 Prior Development on Hiercsim
The Hiercsim project originated in 1987 and Version 1.0 was completed in 1989
(Darakananda, 1989). Hiercsim Version 1.0 did not perform to expectations due to
unforeseen difficulties in translating the hierarchical framework and dynamic pro-
gramming control laws into a manufacturing simulation system.
In addition to the underdeveloped hierarchical theory, Hiercsim Version 1.0 was
written in three months, whereas the debugging took 12 months. Four major defi-
ciencies arose from this short gestation period:
* The overall architecture of Hiercsim Version 1.0 was sound and was used as a
springboard for Hiercsim Versions 3.5 and 4.0. However, many of the details of
the implementation were not expandable and often were wrong.
Functionality of subroutines and data structures overlapped and contained a
high degree of interdependence. This made changing the code to accommodate
new theory difficult.
* The theory was not well enough understood for the implementation to be suc-
cessful. Numerical and computational problems had to be solved before the code
could work. This was not recognized, leading to a proliferation of temporary
fixes based on specific simulations. These fixes compounded the complexity of
the code and its subsequent debugging effort.
* The factory model was inaccurate and highly complex as multiple patches had
been made by the end of the debugging period.
* Documentation on the software was virtually non-existent. The 100 pages of
code were uncommented, variables had multiple uses, and the thesis did not
describe the implementation other than providing a data entry user's manual.
It took two years of further work to understand and correct deficiencies in the
hierarchical algorithms, the software architecture, and the software implementation
of the hierarchical algorithms. Those extensions of the theory and modification of
the implementation are described in detail in this thesis. Note that without this
preliminary attempt at an implementation, it would have been impossible to develop
the implementation to the level of Hiercsim Version 3.5.
1.5.3 Work Based on Hiercsim
Simulations Run Under Hiercsim A number of real manufacturing systems have
been simulated using Hiercsim Version 3.5. No real systems have been simulated using
Hiercsim Versions 1.0 and 4.0 due to their incomplete states.
The purpose of the simulations of actual manufacturing lines is primarily to val-
idate the hierarchical framework and its control algorithms with real manufacturing
problems. The contribution of each of the simulations to the development of Hiercsim
Version 3.5 is noted along with the reference to the thesis in which the simulation is
described.
* A portion of a line used in the manufacture of Crown Wood Products television
cabinets in North Carolina was modeled by Kowalski (1990). This simulation
provided sufficient experience to solve reentrant flow control problems.
* The printed circuit board surface mount line in the Hewlett-Packard Medical
Systems line in Massachusetts was modeled by Morin (1991). This simulation
showed that multiple machines and multiple part types could be modeled in
Hiercsim.
* The simulation of the MIT Integrated Circuit Laboratory Wafer Fabrication line
in Massachusetts was adapted from Bai (1991b) to test reentrant flow control
in a distributed control system. This simulation is described in Chapter 7 of
this thesis.
* A Packard Electric Fuse Block Line in Mississippi was modeled by Smarason
(1991). This line demonstrated the ability of Hiercsim to control nine separate
part types in a system with machine failures.
* A United Technologies Hamilton Standard printed circuit board line in Con-
necticut was modeled by Hager (1992) and Riester (1992).
Extensions to Hiercsim Enomoto (1992) added the ability to model time-varying
demand and preventative machine maintenance. The validity of the extended features
was verified through a series of experiments.
A software utility which converts Hiercsim output into graphs was written by
Villaflor (1992).
Chapter 2
Definitions, Assumptions,
Observations and Notation
2.1 Manufacturing System Description
In this chapter, the types of manufacturing systems implemented in Hiercsim are de-
scribed. Notation for types of events, resources and measured quantities is presented.
Assumptions about phenomena are presented. These assumptions are made so that
phenomena may be modeled within the long term framework, the current theory, and
the actual implementations of the hierarchical controller. Each of these models has
different restrictions, with the long term framework being most general, followed by
the current theory, and finally the most restrictive is the current implementation. All
the concepts in this chapter appear in Gershwin (1989) and Violette and Gershwin
(1991).
2.1.1 Definitions
Manufacturing System A manufacturing system is a system in which raw mate-
rial is transformed into finished product. For us, such a system consists of machines,
parts, and buffers. Resources such as people, computers and communication systems
are not considered here.
Scheduling Scheduling is the selection of times for future controllable events. In
this thesis, the only controllable events which are studied are operations and setup
changes. Other controllable events include preventative maintenance, shipments, and
employee meetings.
Part A part is a piece of material which undergoes a transformation from raw
material to finished product. A type of part is one which is specifically assigned to
a particular process. Different types could sometimes follow the same path, but the
operations performed at each step may be different.
Lot A lot is a group of parts which travel from machine to machine as a single unit.
Resource A resource is a permanent part of a manufacturing system. It is not
consumed and does not leave the system. It is required for operations, maintenance,
or other activities. A resource can engage in only one activity at a time, and cannot
be used more than 100% of the time. Machines and people are examples of resources.
Machine A machine is a resource which is used to perform a set of operations on
a set of part types.
Machine Group A machine group is a set of identical machines. Machines in the
group can fail and can be set up with different tooling configurations independently
of each other. A part type which visits the machine group can be operated on by any
of the machines in the group as long as the individual machine is operational, and
has the correct tooling for the operation on the part.
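The eligibility rule in the Machine Group definition can be illustrated with a minimal sketch. The `Machine` record, the `eligible_machines` helper, and the tooling labels are hypothetical names chosen for this illustration; they are not taken from Hiercsim.

```python
from dataclasses import dataclass


@dataclass
class Machine:
    name: str
    operational: bool   # False while the machine is failed
    setup: str          # current tooling configuration (label is illustrative)


def eligible_machines(group, required_setup):
    """Names of machines in the group that can perform the operation now.

    A machine qualifies only if it is operational and currently holds the
    tooling that the operation requires.
    """
    return [
        m.name for m in group
        if m.operational and m.setup == required_setup
    ]


group = [
    Machine("M1", operational=True, setup="tooling_A"),
    Machine("M2", operational=False, setup="tooling_A"),
    Machine("M3", operational=True, setup="tooling_B"),
]
available = eligible_machines(group, "tooling_A")  # only M1 qualifies
```

In this example M2 has the correct tooling but is failed, and M3 is operational but set up for a different operation, so only M1 may receive the part.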
Buffer A buffer is a storage area where a part may wait for any length of time. A
buffer has a fixed size. The size of the buffer is the maximum amount of material
that the buffer may hold, and must be an integer number of parts. A homogeneous
buffer is only able to hold parts which are indistinguishable from one another. In this
thesis, all buffers are homogeneous.
Process A process is a fixed sequence of steps whereby a part is transformed from
raw material into finished product. Each step in the process requires that a part
visit either a specific machine group or a specific buffer. A machine step describes
the necessary machine type and tooling configuration, as well as the operation to be
performed. A buffer step serves as a temporary storage area between consecutive
machine steps.
Process Stage The stage of a part describes the cumulative number of steps which
the part has undergone in its process. The stage of a part also tells which step is
to be performed next. All parts of the same type which are at the same stage are
considered to be identical.
Process Segment A process segment is comprised of any number of contiguous
process steps. A process segment starts with an entry buffer, proceeds through a
number of machine group and buffer steps, and ends with an exit buffer. An entire
process may be composed of many segments. The entry buffer of one segment is the
exit buffer of its upstream neighbor. Likewise, the exit buffer of a segment is the
entry buffer of its downstream neighbor.
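The shared-buffer relationship between neighboring segments can be checked mechanically. The sketch below assumes, purely for illustration, that each segment is recorded as an (entry buffer, exit buffer) pair and that the segments are listed from upstream to downstream; the buffer names are hypothetical.

```python
def segments_are_chained(segments):
    """Check that each segment's exit buffer is its downstream neighbor's entry buffer.

    `segments` is a list of (entry_buffer, exit_buffer) pairs ordered from
    upstream to downstream.
    """
    return all(
        upstream[1] == downstream[0]
        for upstream, downstream in zip(segments, segments[1:])
    )


process = [("b0", "b2"), ("b2", "b5"), ("b5", "b7")]
broken = [("b0", "b2"), ("b3", "b5")]
```

Here `process` forms a valid chain, while `broken` does not because the second segment does not begin at the buffer where the first one ends.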
State The state describes important time-varying attributes of the manufacturing
system. Included in the state are machine repair conditions, machine tooling config-
uration or setup, buffer levels, and production status.
Events An event is a discontinuous change in the system state or in the model of
the system state. The start of processing at a machine is an example of an event.
The change of production rates is also an event.
Activity An activity is a pair of events: a starting event followed by a completing
event. The duration of an activity is the amount of time between the starting event
and the completing event. Between a start event and an end event, the resource
is unable to do any other activity. The exceptions are failures which preempt any
controllable activity. An operation is an example of an activity. The duration of an
operation is the length of the period of time between the start of processing and the
completion of processing at a machine. The term disruption is sometimes used for an
undesirable activity such as a machine failure or an absence of raw material.
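As an illustration of the definitions of Events and Activity, the following sketch represents an activity as a pair of events on the same resource. The class names, field names, and event labels are assumptions made for this example and do not describe the data structures used in Hiercsim.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    time: float      # simulation time at which the state changes
    kind: str        # e.g. "processing_start" (label is illustrative)
    resource: str    # resource whose state changes


@dataclass(frozen=True)
class Activity:
    start: Event     # starting event
    end: Event       # completing event

    def __post_init__(self):
        if self.start.resource != self.end.resource:
            raise ValueError("both events must belong to the same resource")
        if self.end.time < self.start.time:
            raise ValueError("completing event precedes starting event")

    @property
    def duration(self):
        # Duration is the time between the starting and completing events.
        return self.end.time - self.start.time


operation = Activity(
    Event(12.0, "processing_start", "machine_1"),
    Event(15.5, "processing_end", "machine_1"),
)
```

The duration of `operation` is 3.5 time units, the length of the period between the start and the completion of processing.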
Capacity Capacity is the set of possible frequencies of controllable events in a
manufacturing system. The capacity of a system is represented as a set of constraints
on the rates of controllable events, such as operations and setup changes. Any set of
controllable rates which is chosen by controllers must lie within the constraints of the
capacity set. In Hiercsim Version 3.5, the capacity of a system is the set of possible
production rates of all the part types that are being produced.
Flexibility Flexibility is defined as the number of different part types that may be
produced in a manufacturing system over a given period of time.
Cell A cell is a model of a subset of a manufacturing system which is only affected
by a limited bandwidth of event frequencies. The cell is the basic building block of
the hierarchical model, and is the foundation for the control decompositions described
in Sections 3.5.1, 3.5.2, and 3.5.3.
Controller A controller is a mechanism which allocates resources in a manufactur-
ing system for controllable activities. In this thesis, there are three distinct types of
controllers. A cell controller is one which is responsible for choosing rates and ap-
proximate times for controllable activities for a subset of the manufacturing system.
A machine group controller is responsible for loading of parts onto machines for pro-
cessing, transportation of parts, choosing precise setup change times, and reporting
machine status information to cell controllers. The system controller is the combi-
nation of all cell controllers, machine group controllers, and the infrastructure which
serves to coordinate each of the smaller controllers towards the same overall goal:
to meet production requirements with the least amount of inventory or surplus.
Mean Time to Repair MTTR The mean time to repair (MTTR) of a failure
mode is the expected duration between the instant of failure and the instant of repair.
Frequently, we write MTTR = 1/r.
Mean Time to Fail MTTF The mean time to fail (MTTF) for a failure mode
is the expected cumulative amount of time between the instant of repair and the
instant of failure. Failures are either operation-dependent (ODF) or time-dependent
(TDF) (Buzacott and Hanifin, 1978). Operation-dependent failures occur after a
random amount of processing has been completed, regardless of the amount of idle
time incurred between operations. Time-dependent failures occur after a random
amount of time has elapsed, regardless of the amount of processing accomplished in
that interval. We sometimes write MTTF = 1/p.
Mean Time Between Failures MTBF The mean time between failures is the
sum of the mean time to fail and the mean time to repair. MTBF = MTTF +
MTTR. The inverse of this quantity is the frequency of occurrence of a failure
mode.
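The relations among MTTR, MTTF, and MTBF can be checked numerically. The sketch below is illustrative only: the function name and the sample values (in hours) are assumptions, not data from any simulation in this thesis.

```python
def failure_mode_statistics(mttf, mttr):
    """Return (p, r, mtbf, frequency) for one failure mode.

    mttf and mttr are mean times expressed in the same unit.
    """
    if mttf <= 0 or mttr <= 0:
        raise ValueError("MTTF and MTTR must be positive")
    p = 1.0 / mttf            # from MTTF = 1/p
    r = 1.0 / mttr            # from MTTR = 1/r
    mtbf = mttf + mttr        # MTBF = MTTF + MTTR
    frequency = 1.0 / mtbf    # frequency of occurrence of the failure mode
    return p, r, mtbf, frequency


p, r, mtbf, frequency = failure_mode_statistics(mttf=90.0, mttr=10.0)
```

With a mean time to fail of 90 hours and a mean time to repair of 10 hours, the mean time between failures is 100 hours and the failure mode occurs at a frequency of 0.01 per hour.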
2.1.2 Notation Conventions
Resource Each resource in a manufacturing system which performs operations on
parts is denoted by i. In this thesis, the only resources which are considered are
machine groups. A specific machine group will be denoted by Machine Group i.
Cell When the control structure is introduced, the concept of a cell will be used. A
cell is a subset of resources, buffers, and process steps. A specific instance of a cell is
denoted by Cell c.
Buffer The buffer separating steps $n$ and $n + 1$ of Process $j$ is represented as $nj$.
The maximum amount of material which can be held in Buffer $nj$ is $B_{nj}$. The actual
amount of material which is contained in Buffer $nj$ at time $t$ is $b_{nj}(t)$.
Activities The following list details some notation conventions used when describ-
ing activities:
1. Activities (e.g., operations, tooling changes, and failures) are denoted by j.
2. The set of all operations at Machine Group i is written Pi.
3. The set of all failure modes at Machine Group i is written Fi.
4. The state of activity j on Resource i at time t is denoted by aij(t), and is 1
when the resource is occupied by the activity, 0 otherwise. (The symbol j is
also used to denote the process or process segment within which an operation
occurs).
Activity Durations: Operation Time and Machine Repair Time If $j \in P_i$
is an operation that may occur at Machine Group i, then $\tau_{ij}$ is the average amount
of time required for a machine in Machine Group i to perform the operation on Part
Type j.
If $j \in F_i$ is a possible failure mode of a machine in Machine Group i, then $\tau_{ij}$ is
the average time required to repair a Group i machine (MTTR$_{ij}$) when the machine
has failed in Mode j. Frequently, we write MTTR$_{ij} = 1/r_{ij}$.
Activity Rates The production rate $u_j$, for $j \in P_i$, is the number of Type j parts
produced during a period of time divided by the length of the time period. When cells
are introduced and production rates differ from cell to cell, then $u_j^c$ is the symbol for
the production rate of Type j parts in Cell c.
If $j \in F_i$, then $u_j$ is the rate at which Machine i experiences Mode j failures and is
equal to $u_j = 1/\text{MTBF}_j$.
2.2 Types of Events in the Simulation
There are two major classes of events in the factory simulation: control events and
physical events. Control events are used in the simulation to trigger controller cal-
culations which result in the triggering of additional control events, the change in
rates of controllable events, or actual physical events. Physical events are changes
in the state of a resource, and in many cases, come in pairs to form an activity. In
this thesis, the only activities which are considered are operations, failures, and setup
changes.
2.2.1 Types of Events and Activities
Events treated in Hiercsim include the start and the end of actual activities on re-
sources and parts. Physical events which begin a controllable activity are done only
when there is a request from the system controller. Uncontrollable physical events
occur randomly, irrespective of the controller status.
Physical events include:
1. Arrival of parts at a machine group or buffer
2. Departure of parts from a machine group or buffer
3. Machine loading
4. Machine unloading
5. Processing initiation of parts
6. Processing completion of parts
7. Setup change initiation
8. Setup change completion or cancellation
9. Failure of a machine
10. Repair of a machine
Activities include:
1. Transportation of parts
2. Processing of parts
3. Setup changes at machines
4. Failures and repairs of machines
Other events such as demand changes and engineering change orders are not con-
sidered in this thesis.
2.3 Performance Measure Definitions
In order to compare system performance among different controllers, performance
measures must be defined. There are two types of performance measures in this
thesis: objective system measures and controller-specific measures.
2.3.1 Objective System Measures
Objective system measures indicate how well the entire system is performing and can
be used to compare different controllers. They include total production, work-in-
process (WIP), the distribution of work-in-process, and cycle time.
Total Production Total production is defined as the actual number of parts of a
process that have gone from raw material to finished product by the end of a given,
fixed time period. The completion of processing for a part in the system is defined as
the instant that the process completion event occurs at the last machine step in the
process.
Work in Process Work-in-process (WIP) for a system is defined in this imple-
mentation as the number of parts of a process that are contained within the system's
boundaries. The boundaries of the system are the start of processing on the first
machine in the process and the completion of processing on the last machine in the
process. When cells are introduced, the WIP of a cell is defined in a similar manner,
except that the cell's boundaries are the first machine step in the process segment
and the last machine step in the process segment.
Distribution of Work-in-Process The distribution of work-in-process is com-
puted by measuring the amount of material in every buffer in the system. The
distribution of work-in-process gives an indication of how balanced the system is. An
efficient distribution will be able to reduce the effects of disruptions without requiring
too much inventory. Conversely, an inefficient distribution will accumulate excessive
inventory while not preventing the effects of disruptions from propagating within the
system.
Cycle Time Cycle time is defined as the length of time which a part takes to travel
between the boundaries of the system. The cycle time through a cell is defined as the
length of time which a part takes to travel between the boundaries of the cell.
Cycle time sometimes has a different definition in the literature, where it is used
as the operation time of a synchronous transfer line. After each elapsed cycle time,
operations at each machine in the system have been completed, and each machine
sends its current part to the next station downstream. Cycle time in this thesis is
the total amount of time required to travel the entire length of the process, and is
not the longest operation time in the process.
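The objective system measures defined above can be computed from a record of when each part crosses the system boundaries. The sketch below assumes, for illustration only, that each part is recorded as a pair (start of processing on the first machine, completion of processing on the last machine), with `None` for parts still in the system. The function names and sample data are hypothetical.

```python
def total_production(records, horizon):
    """Number of parts whose last machine step completed by time `horizon`."""
    return sum(1 for _, done in records if done is not None and done <= horizon)


def work_in_process(records, t):
    """Number of parts inside the system boundaries at time t."""
    return sum(
        1 for start, done in records
        if start <= t and (done is None or done > t)
    )


def cycle_times(records):
    """Cycle time of each finished part: completion time minus start time."""
    return [done - start for start, done in records if done is not None]


# (start, completion) times for three parts; the third is still in process.
records = [(0.0, 4.0), (1.0, 6.0), (3.0, None)]
```

For these records, one part is complete by time 5.0, three parts are in process at time 3.5, and the cycle times of the two finished parts are 4.0 and 5.0.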
2.3.2 Controller-Specific Performance Measures
Controller-specific measures tell how well each component of the controller is per-
forming and can be used to detect weaknesses and isolate faults in the controller.
These quantities are meaningful mainly in the context of the theory underlying the
controller. Controller-specific measures for the hierarchical controller include cumu-
lative requirements, cumulative production, and production surplus for subsets of
each process in the manufacturing system. These measures are used directly by the
hierarchical controller to set rates of controllable events.
Cumulative Requirement The cumulative requirement of a process is the total
number of parts which must have completed processing within a system at any in-
stant of time. This requirement is specified from outside of the system, where the
hierarchical controller is unable to exert its influence. The cumulative requirement
of a subset of a process is similarly defined, except that the requirement is specified
outside of the domain of the controller of the process subset.
Cumulative Production The cumulative production of a process is the total num-
ber of parts which have completed processing within a system at any given instant in
time. The cumulative production of a subset of a process is similarly defined, except
that only the number of parts which have traveled through the steps of the subset
are considered.
Production Surplus Production surplus is defined as the difference between the
controller's cumulative production and its cumulative requirements at any given in-
stant of time. In the current state of the theory, cumulative production is computed
based on production rates set by the controller, instead of direct production measure-
ments from the factory. The production surplus of a subset of a process is defined to
be the difference between the cumulative production and cumulative requirements of
the subset. The surplus can be positive (indicating production is ahead of require-
ments), negative (indicating production is behind requirements), or zero (indicating
production is equal to requirements).
The controller is designed with the assumption that the rates of controllable ac-
tivities set by the controller are met almost exactly by the system which is in the
domain of the controller. The controller sets those rates by allocating capacity to
those activities that need it most. In the current theory, the surplus xj(t) of Activity
j is measured by taking the difference between the integral of the rate of occurrence
uj set by the controller and the integral of the target rate dj:
$$x_j(t) = \int_0^t u_j(s)\,ds - \int_0^t d_j(s)\,ds + x_j(0) \tag{2.1}$$
Consequently, if the predicted capacity of the resources within the domain of
the controller is imprecise, then the actual rate of occurrence of an activity will not
meet the controller's perception of the rate, uj, exactly, and the controller will be
computing production rates based on inaccurate information.
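A discrete-time sketch of the surplus computation in (2.1) is given below. It assumes that both the rate set by the controller and the target rate are constant over each time step of length `dt`; the function name and the sample rates are illustrative assumptions.

```python
def surplus_trajectory(u, d, dt, x0=0.0):
    """Accumulate surplus as the integral of u minus the integral of d, plus x(0).

    u and d are equal-length sequences holding the rate set by the controller
    and the target rate, each held constant over one step of length dt.
    Returns the surplus at the end of each step.
    """
    if len(u) != len(d):
        raise ValueError("u and d must have the same length")
    x = x0
    trajectory = []
    for u_k, d_k in zip(u, d):
        x += (u_k - d_k) * dt
        trajectory.append(x)
    return trajectory


# Target rate of 2 parts/hour; the controller sets the rate to zero
# during the second hour and to 3 parts/hour otherwise.
x = surplus_trajectory(u=[3.0, 0.0, 3.0], d=[2.0, 2.0, 2.0], dt=1.0)
```

The surplus is positive after the first hour, negative after the second, and returns to zero after the third. Note that this value is computed from the rates the controller sets, not from measured production, which is exactly why an imprecise capacity estimate leads the controller to act on inaccurate information.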
2.4 Capacity Constraints
It is important for a controller to have an accurate, quantitative representation of
the manufacturing system capacity. Given such a capacity set, any production rates
which are chosen within that set can be met. If production rates are chosen outside
of the capacity of the system, then not only will they not be met, but in some cases,
less production will be done than is theoretically possible. This will occur in cases
where there is more than one operation at a machine. If the rates of operations are
not controlled, then one operation may be performed more than needed. Inventory
may clog up the transportation system, leading to reduced capacity for the other
operations.
Given the assumptions about the types of processes allowed in the current state
of the theory (Section 2.6), the long-term capacity of a manufacturing system can be
defined. This definition is valid over a long time period compared with the duration
of all activities, mean times between events of all classes, and the time required to fill
or empty buffers.
Long Term Capacity Consider a manufacturing system that includes a set of
machine groups. Each Machine Group i contains $n_i$ identical machines. The capacity
set is the set of possible values that the production rates $u_j$, $j \in P_i$, can have, where $P_i$
is the set of all operations which can be performed on Machine i. The most important
limitation on the $u_j$ is that no resource may be occupied more than 100% of the time.
The occupation of Activity j on Machine Group i is the fraction of time that the
machine group spends on Activity j and is equal to $\tau_{ij} u_j$. Therefore, the long term
capacity set for a system with only failures and operations is the set of $u_j$, $j \in P_i$, such
that
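$$
\begin{aligned}
\sum_{j \in P_i} \tau_{ij} u_j &\le n_i - \sum_{j \in F_i} \tau_{ij} u_j \quad \forall i \\
u_j &\ge 0 \quad \forall j \in P_i
\end{aligned}
\tag{2.2}
$$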
Note that this capacity set assumes that the buffers within the manufacturing system
are large compared to the amount of production expected during any disruption. This
is because the capacity set (2.2) does not consider starvation or blockage of machines.
The sum over all the event types in the system gives the total fraction of the
system which is occupied. The remainder is defined to be idle (or slack) time, Si of
Machine Group i:
$$S_i = n_i - \sum_{j \in F_i} \tau_{ij} u_j - \sum_{j \in P_i} \tau_{ij} u_j \quad \forall i \tag{2.3}$$
When other activities besides operations and failures are introduced, their use of
capacity is accounted for in a similar manner. For an example of the capacity set in
which setup changes are included, see Section 5.4.3.
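The long term capacity test of (2.2) and the slack of (2.3) can be sketched for a single machine group as follows. The function names, dictionary layout, and numerical values are assumptions made for this illustration; they do not describe the linear program used in Hiercsim.

```python
def long_term_slack(n, tau_ops, u_ops, tau_fail, u_fail):
    """Slack of one machine group, as in (2.3).

    n        : number of identical machines in the group
    tau_ops  : average operation time, keyed by operation j
    u_ops    : production rate, keyed by operation j
    tau_fail : average repair time, keyed by failure mode j
    u_fail   : failure frequency, keyed by failure mode j
    """
    operations = sum(tau_ops[j] * u_ops[j] for j in tau_ops)
    failures = sum(tau_fail[j] * u_fail[j] for j in tau_fail)
    return n - failures - operations


def in_long_term_capacity(n, tau_ops, u_ops, tau_fail, u_fail):
    """True when the rates satisfy (2.2): nonnegative rates and nonnegative slack."""
    if any(rate < 0 for rate in u_ops.values()):
        return False
    return long_term_slack(n, tau_ops, u_ops, tau_fail, u_fail) >= 0


# Two machines, two operations, one failure mode (all values illustrative).
tau_ops = {"A": 0.5, "B": 0.25}
tau_fail = {"jam": 8.0}
u_fail = {"jam": 0.03125}
slack = long_term_slack(2, tau_ops, {"A": 2.0, "B": 2.0}, tau_fail, u_fail)
```

In this example operations occupy 1.5 machines and failures occupy 0.25 machines on average, leaving a slack of 0.25. Raising the rate of operation A to 3.0 would require 2.25 machines in total and therefore falls outside the capacity set.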
Short Term Capacity Some events occur much more frequently than other events
in most manufacturing systems. Let $P_i^{\text{high}}$ denote high frequency controllable events
which occur at Machine Group i. Similarly, let $P_i^{\text{low}}$ denote low frequency controllable
events which occur at Machine Group i. Assuming that the frequencies of events in
$P_i^{\text{high}}$ are much higher than any of the failures in $F_i$, there is a time period over which
few events in $P_i^{\text{low}}$ and $F_i$ occur, while many events in $P_i^{\text{high}}$ occur. This relationship
among event frequencies will be expanded later in this thesis.
Over such a short time period, there may be machines which are occupied by
low frequency activities and cannot be used for high frequency activities. A Failure
Mode j, $j \in F_i$, on Machine Group i is represented in the capacity set by its activity
state $a_{ij}$. Similarly, a low frequency controllable Activity j, $j \in P_i^{\text{low}}$, which is being
performed by Machine Group i is represented in the capacity set by its activity state
$a_{ij}$. Each of the $n_i$ machines in the machine group may be occupied by at most one
active failure mode or one low frequency controllable activity, where $a_{ij} = 1$. Those
machines which are so occupied are unable to perform any high frequency controllable
activities.
An operational machine in Machine Group i is one that is not occupied by a low
frequency controllable event or a failure. Define $m_i$ to be the number of operational
machines at Machine Group i at any given time instant. The value of $m_i$ is given by
the relation
$$m_i = n_i - \sum_{j \in F_i} \alpha_{ij} - \sum_{j \in P_i^{\mathrm{low}}} \alpha_{ij} \quad \forall i. \tag{2.4}$$
The total number of machines which are operational can never exceed the total
number of machines in the machine group. That is,
$$0 \le m_i \le n_i \quad \forall i \tag{2.5}$$
If the buffers are large, then the short term capacity set can be constructed by
taking into account low frequency activities at specific machines. The short term
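capacity set is the set of $u_j$, $j \in P_i^{\mathrm{high}}$, such that the occupation of machines in
the group by high frequency activities, $j \in P_i^{\mathrm{high}}$, does not exceed the number of
operational machines $m_i$:
$$\sum_{j \in P_i^{\mathrm{high}}} \tau_{ij} u_j \le n_i - \sum_{j \in F_i} \alpha_{ij} - \sum_{j \in P_i^{\mathrm{low}}} \alpha_{ij} \quad \forall i$$
$$u_j \ge 0 \quad \forall j \in P_i^{\mathrm{high}} \tag{2.6}$$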
The short term capacity set can be rewritten as:
$$\sum_{j \in P_i^{\mathrm{high}}} \tau_{ij} u_j \le m_i \quad \forall i$$
$$u_j \ge 0 \quad \forall j \in P_i^{\mathrm{high}} \tag{2.7}$$
Capacity set (2.2) is the average of (2.7) over a long period of time. The short
term capacity set (2.7) is actually slightly conservative (Lasserre, 1989 and 1992). It
ignores the effects of any buffers which may exist in the system. There is an implicit
assumption that if a machine fails, production is immediately stopped, without the
possibility of draining downstream buffers or filling upstream buffers. Note that
this capacity set does not account for starvation or blockage due to high frequency
operations if buffers are very small.
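The short term capacity test can be sketched in the same spirit. The following Python fragment is a minimal illustration of (2.4), (2.5), and (2.7) for one machine group; the function names and the sample numbers are hypothetical and do not describe the Hiercsim code.

```python
def operational_machines(n_i, active_failures, active_low_freq):
    """Number of operational machines m_i, as in (2.4).

    active_failures and active_low_freq are the sums of the activity
    states over the failure modes and the low frequency activities.
    """
    m_i = n_i - active_failures - active_low_freq
    if not 0 <= m_i <= n_i:
        raise ValueError("activity states violate 0 <= m_i <= n_i (2.5)")
    return m_i


def in_short_term_capacity_set(m_i, tau, u):
    """True if the high frequency rates u satisfy (2.7)."""
    if any(rate < 0 for rate in u.values()):
        return False
    return sum(tau[j] * u[j] for j in u) <= m_i


# Hypothetical group of 3 machines: one failed, one busy with a setup change.
m_i = operational_machines(3, active_failures=1, active_low_freq=1)
tau = {"op_A": 0.5, "op_B": 0.25}     # hours per operation
print(m_i)                                                                # 1
print(in_short_term_capacity_set(m_i, tau, {"op_A": 1.0, "op_B": 1.0}))  # True
print(in_short_term_capacity_set(m_i, tau, {"op_A": 2.0, "op_B": 1.0}))  # False
```

The second set of rates would be feasible with two operational machines, which shows how a single failure or setup change shrinks the short term capacity set.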
2.5 Assumptions about Frequencies of Events
2.5.1 Characteristic Frequencies of Events
Over any given interval of time, a number of occurrences of Type j events will occur.
The frequency of occurrence of the Type j event is equal to the number of times the
event occurs divided by the duration of the interval. The frequency of occurrence is
dependent on the length of the interval. The long-term average frequencies for all
events in the system are used as a reference upon which the control hierarchy will be
built using the current theory.
Only strictly deterministic events are able to occur with a frequency exactly equal
to their long-term average frequencies. Random events occur with frequencies that
vary from one interval to another. In the current state of the theory, it is assumed
that the frequency of any given event does not vary significantly from its long-term
average for any significant length of time.
Different types of events have different long-term average frequencies of occurrence.
For example, the frequency of an operation can be much greater than the
frequency of a failure. From the point of view of an observer which can detect events
only of frequencies which are roughly equal to the long-term average frequency of a
failure, operations can only be measured in terms of frequencies of occurrence.
[Figure: machine upgrades, major failures, minor failures, and operations placed along an event frequency axis marked $f_{k-1}$, $f_k$, $f_{k+1}$.]
Figure 2-1: Frequency Clusters
In the hierarchical framework, it is assumed that the long-term average frequencies
of events cluster about distinct values, or characteristic frequencies fk, and that each
cluster is widely separated from all other clusters. Each cluster k of event frequencies
is close to the characteristic frequency fk, and the characteristic frequencies satisfy
$$0 = f_1 < f_2 < \cdots < f_{k-1} < f_k < f_{k+1} < \cdots \tag{2.8}$$
For example, Figure 2-1 shows a sample of some typical factory events on a fre-
quency scale. The least frequent event is a machine upgrade while the most frequent
events are operations. Each event frequency cluster is widely separated from the other
clusters.
Define the long term average frequency of occurrence of a Type j event as $u_j$.
The Type j event belongs to the frequency class k if its long-term average rate of
occurrence satisfies:
$$f_{k-1} < u_j < f_{k+1}. \tag{2.9}$$
This assumption allows an important class of manufacturing systems to be studied.
There are other systems which do not follow this assumption, but those are not
considered here.
2.5.2 Definition of Level
It is useful to define the concept of a level when discussing the relative frequencies
of different events. The level of an observer is determined by the frequency class
k of the events whose times of occurrence the observer can model. A Level k
observer only has models of events whose frequencies of occurrence satisfy (2.9). The
characteristic frequency of events modeled by the Level k observer is $f_k$.
Define the level L(j) of Activity j to be the value of k associated with the class
in which the event frequency lies (Gershwin, 1989). Therefore,
$$L(j) = k \quad \text{if } f_{k-1} < u_j < f_{k+1}. \tag{2.10}$$
Type j events whose frequencies of occurrence are higher than the Level
k characteristic frequency $f_k$ ($L(j) > k$) are modeled by the Level k observer as
occurring at a rate $u_j^k$. The only change that a Level k observer is able to detect is
a change in the frequency of occurrence, $u_j^k$, of the event, and this is only possible
if the frequency of such changes is around $f_k$ or less.
On the other hand, Type j events whose frequencies of occurrence are of comparable
magnitude to $f_k$, or much lower than $f_k$ ($L(j) \le k$), influence the state of
the system in a discrete manner. Define $\alpha_{ij}^k$ to be the discrete state of Activity j,
$L(j) \le k$, at Resource i, as seen by a Level k observer. The value of $\alpha_{ij}^k$ is equal
to 1 when the activity is taking place at Resource i, and is 0 otherwise. The vector
$\alpha^k$, which is a subset of $\alpha$, the complete state of the system, includes all the discrete
states of the system that can be seen by a Level k observer.
When a resource is a machine group with multiple machines, the state of the group
is the sum of all the states of the individual machines.
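The assignment of a level to each event type can be illustrated with a short Python sketch. Because the intervals in (2.10) overlap for neighboring values of k, the sketch resolves the choice by picking the characteristic frequency nearest to $u_j$ on a logarithmic scale; that tie-breaking rule, the function name, and the sample frequencies are assumptions made for this example only.

```python
import math


def level_of(u_j, char_freqs):
    """Return the 1-based level k whose f_k is nearest to u_j on a log scale.

    char_freqs[0] is f_1 = 0 (static events); the remaining entries are
    assumed to be positive and increasing, as in (2.8).
    """
    if u_j == 0:
        return 1
    best_k, best_dist = None, math.inf
    for k, f_k in enumerate(char_freqs[1:], start=2):
        dist = abs(math.log10(u_j) - math.log10(f_k))
        if dist < best_dist:
            best_k, best_dist = k, dist
    return best_k


# Hypothetical characteristic frequencies, in events per hour.
char_freqs = [0.0, 0.01, 1.0, 100.0]
print(level_of(0.02, char_freqs))    # major failure -> 2
print(level_of(0.5, char_freqs))     # minor failure -> 3
print(level_of(120.0, char_freqs))   # operation     -> 4
```

When the clusters are widely separated, as the framework assumes, any reasonable rule of this kind gives the same classification.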
2.5.3 Assumptions Based on Relative Frequencies
A bandwidth is a spectrum of frequencies within which a set of long-term average
frequencies of events are located. The bandwidth for the long-term average frequencies
of all types of events in a manufacturing system is sometimes much larger than the
bandwidth of any single observer in the system. That is, the range of event frequencies
is sometimes much larger in a manufacturing system than can be detected by a single
observer of the system.
The assumption (2.8) that long-term average frequencies of occurrence of events
cluster around distinct characteristic frequencies in the spectrum allows a Level k
observer in Cluster k to make accurate simplifying assumptions about events in other
clusters. These simplifying assumptions are used in the construction of the hierarchi-
cal controller in Chapter 3.
Consider a Level k observer which only has models of events which occur over a
small band of frequencies near fk. From the point of view of that Level k observer, a
Type j event lies within one of three domains:
1. Lower Level Events Consider a Type j event which has a long-term average
frequency, $u_j$, which is much higher than $f_k$ ($L(j) > k$). From the point of view
of the Level k observer, the Type j event can only be described as occurring
with a variable rate, $u_j^k$, where the average value of $u_j^k$ is equal to the long-term
average frequency, $u_j$, of Type j event occurrences. Two streams of Type j
events in which the frequency of occurrence is the same (but the actual times
of occurrences are different) are identical from the viewpoint of the observer.
If the event is controllable, then $u_j$ is chosen by a controller which responds to
events of frequencies less than or not far from $f_k$. If the event is uncontrollable,
then the amount of capacity used by the Type j event ($\tau_{ij} u_j$ of Resource i) is
not available for controllable events. A change in the rate $u_j$ may be an event
from the point of view of the Level k observer, if it occurs at a frequency less
than or not far from $f_k$.
2. Same Level Events Consider a Type j event which occurs with a long-term
average frequency, $u_j$, which is roughly equal to $f_k$ ($L(j) = k$). From the point
of view of a Level k observer, each occurrence of the event is seen in detail. The
Level k observer has a model for predicting future occurrences of this event.
(This model need not be very precise. For example, it might say that the
time between Type j events is exponentially distributed with mean $1/u_j$.) A
controller whose bandwidth includes the frequency $f_k$ must specify the time of
occurrence of each individual event, if the event is controllable, or respond to
each event, if it is uncontrollable. This specification or response takes into account
predictions of future occurrences of the event. Such an event changes the state
$\alpha_{ij}^k$ when it occurs at Resource i.
3. Higher Level Events Consider a Type j event which occurs with a long-term
average frequency $u_j$ which is much lower than $f_k$ ($L(j) < k$). From the point of
view of a Level k observer, each occurrence of the event is a rare instance. The
event is so rare that the Level k observer does not have a model of when it will
occur in the future. When the event does occur, the Level k controller adjusts
to the new value of the Level k state, $\alpha_{ij}^k$, influenced by the event, acting as
though the value will never change again.
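The three domains above amount to a simple rule based on comparing L(j) with k. The Python sketch below is only an illustration of that rule; the labels, function names, and the sample event levels are hypothetical.

```python
def observer_view(level_j, k):
    """How a Level k observer treats an event whose level is L(j) = level_j."""
    if level_j > k:
        return "rate"            # lower level event: seen only as a rate
    if level_j == k:
        return "modeled event"   # same level event: each occurrence is modeled
    return "rare event"          # higher level event: no model of future occurrences


def view_from(k, levels):
    """Map every event type to its treatment by a Level k observer."""
    return {name: observer_view(level_j, k) for name, level_j in levels.items()}


# Hypothetical levels for three event types.
levels = {"operation": 4, "minor failure": 3, "major failure": 2}
print(view_from(3, levels))
# {'operation': 'rate', 'minor failure': 'modeled event', 'major failure': 'rare event'}
```

The same event is therefore treated differently at each level: an operation is a rate to the Level 3 observer but an individually modeled event to the Level 4 observer.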
2.5.4 Bounds on Frequencies in the Hierarchical Controller
In order to get a rough idea of the number of distinct clusters of frequencies in a
manufacturing system, it is useful to draw on the experience with simulations run on
the current implementation of Hiercsim. In the two case studies of factories performed
to date using Hiercsim (Morin, 1991; and Smarason, 1991), the total production
interval was equal to one week, and each operation required less than one minute to
run. In Smarason (1991), there were a total of four distinct clusters of frequencies:
static, major failures, minor failures, and operations. In Morin (1991), there were
three distinct clusters of frequencies: static, setup changes, and operations.
These simulation runs were the first which were done using the hierarchical con-
troller described in this thesis and using Hiercsim Versions 3.5 and 4.0. In the future,
the total bandwidth of frequencies of events that are studied in simulations
will be increased. However, for the time being, the bandwidth of events in the system
spans roughly four orders of magnitude. This limits the number of frequency clusters
which can be formed and still satisfy the frequency separation assumptions presented
in this section.
Simulations using Hiercsim Version 1.0 are described in Darakananda (1989) and
Kowalski (1990).
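The figure of roughly four orders of magnitude follows directly from the one-week production interval and the operation time of at most about one minute. The Python sketch below reproduces that arithmetic. The second function, which bounds the number of nonzero frequency clusters for a required separation between neighboring clusters, is an assumption introduced only for illustration; the theory as stated here does not fix a numerical separation.

```python
import math

MINUTES_PER_WEEK = 7 * 24 * 60   # 10080


def orders_of_magnitude(longest, shortest):
    """Span, in powers of ten, between the longest and shortest time scales."""
    return math.log10(longest / shortest)


def max_nonzero_clusters(span, separation):
    """Largest number of clusters that fit in span when neighboring clusters
    must be at least `separation` orders of magnitude apart (hypothetical rule)."""
    return int(span // separation) + 1


span = orders_of_magnitude(MINUTES_PER_WEEK, 1.0)
print(round(span, 2))                    # 4.0
print(max_nonzero_clusters(span, 2.0))   # 3
```

Under the illustrative requirement of two orders of magnitude between clusters, at most three nonzero clusters fit in the available bandwidth, in addition to the static cluster at $f_1 = 0$.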
2.6 Assumptions about Processes
The current state of the theory and that of the implementations (Hiercsim Versions
3.5 and 4.0) requires that assumptions be made about the classes of processes which
can be modeled. The current implementation restricts the types of systems that can
be modeled beyond those restrictions imposed by the theory.
There are two sets of assumptions in the implementation of the hierarchical con-
troller. The first set is necessary because the theory is not yet fully developed to
accommodate some classes of processes. The second set is necessary because of the
particular way in which the simulation was coded. The implementation assump-
tions do not affect the controller's performance, and can for the most part be worked
around when an actual manufacturing system does not match the conditions of the
assumptions exactly.
Assumptions Required by the Current Theory
* Each process is assumed to have a fixed sequence of steps, even though in some
physical systems order may be irrelevant. Each step in a process may only use
one type of resource, with a fixed (constant) amount of operation time. Multiple
routing (Maimon and Gershwin, 1988) is not allowed. Reentrant flow is allowed
(Bai and Gershwin, 1990a and 1990b).
* Processes are assumed to have a fixed overall demand rate, which does not
change for the duration of the simulation. This was one reason why the simu-
lations of Smarason (1991) and Morin (1991) were limited to a maximum time
scale of one week.
* A process makes discrete parts which are identical to each other. In addition
to processes with different required operations, there may be many types of
parts which follow the same steps, and may only be different in details such as
the wording on labels. Each of those variations is considered to be a different
process in this theory.
Whether to distinguish such variations is up to the user of Hiercsim. If the user does not care about the differences,
then the processes may be lumped together.
* During the course of production, individual parts of a process are only distin-
guished from each other by the stage they have reached. There is no splitting
or assembly of parts.
* Each step of a process will complete its operation on all parts successfully. There
are no rejects or rework.
* The transit time between steps is negligible compared to operation times.
* The granularity of production is fine, so that flow rate approximations to pro-
duction may be made. A production system with coarse granularity is one where
the frequency of operations is low, and so the assumption that part movement
can be modeled as a flow breaks down. Section 7.7 contains an example which
demonstrates the consequences of using this controller with coarse grain pro-
duction.
* The entry point to a process is never starved and its exit is never blocked. For
this reason, there is always raw material to draw from, and there is always room
to place finished product for all processes. However, buffers which are not at the
entry point or at the exit point of a process may become empty or full, leading
to the possibility of blockage and starvation of intermediate process steps.
If it is necessary to model raw material shortages, a dummy first machine may
be included in the system. That machine could have a reliability which models
the shortages of raw material. Likewise, if it is necessary to model the inability
to deposit finished product, an unreliable dummy last machine may be included
in the system.
Implementation-Specific Assumptions The next assumptions relate specifically
to the current implementation of the hierarchical controller in Hiercsim Versions 3.5
and 4.0.
* Each buffer in a process sequence has an integer size. Otherwise, the controller
responsible for transporting parts crashes in the current versions of Hiercsim.
* All processes are modeled as a series of machine steps, where each machine step
is preceded by a homogeneous buffer which separates the machine step from the
machine step immediately upstream (or the warehouse if the machine performs
the first step in the process). Systems with no buffers between machines cannot
be modeled exactly, but must be approximated as having very small buffers
compared to the number of parts that would be produced during a disruption.
(See Section 7.2.2 for an example).
2.7 Assumptions about Machine Reliability
2.7.1 Characterization of Failure Modes
In the current state of the theory, a failure is any down time caused by an un-
controllable event. Down time is any interval where a machine is prevented from
performing controllable activities. A failure can be characterized by its mean time
to fail (MTTF), and its mean time to repair (MTTR). In Hiercsim Version 3.5,
there are two possible categories of failures: time-dependent failures (TDFs) and
operation-dependent failures (ODFs) (Buzacott and Hanifin, 1978). In both cases,
the time to repair and the time to fail may each be either deterministic or exponentially
distributed random variables. Hiercsim Version 4.0 does not allow failures.
A time-dependent failure is one where the time to fail is measured from the instant
the machine gets repaired to the next instant the machine fails. Examples of time
dependent failures include failures of computer or power systems.
An operation-dependent failure is one where the time to fail is based on the cumu-
lative operation time since the last repair. An example of a deterministic operation-
dependent failure is cartridge usage in which a machine's magazine becomes empty
and must be replenished. An example of a random operation-dependent failure is
a failure due to the wear on drill bits. After a random cumulative amount of time,
which is dependent on factors such as the hardness of each block of metal, the drill
bit becomes dull and must be replaced. In order to keep the coding simple in
Hiercsim Version 3.5, it was assumed that the rate of accumulation of wear and tear
is independent of the type of operation being performed at a machine. Time is
counted towards each operation-dependent failure whenever a given machine is
actually performing an operation.
The time to repair is measured from the instant the machine fails to the instant
the machine gets repaired. The time to repair is measured in the same manner for
both time dependent and operation dependent failures.
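The difference between the two categories lies only in when the time-to-fail clock advances. The following Python sketch illustrates that distinction; the class and its fields are hypothetical and are not taken from the Hiercsim source.

```python
class FailureClock:
    """Countdown to the next failure of one failure mode on one machine."""

    def __init__(self, time_to_fail, operation_dependent):
        self.remaining = time_to_fail
        self.operation_dependent = operation_dependent

    def advance(self, dt, operating):
        """Advance the clock by dt and return True if the failure is due.

        A time-dependent clock always runs; an operation-dependent clock
        runs only while the machine is performing an operation.
        """
        if operating or not self.operation_dependent:
            self.remaining -= dt
        return self.remaining <= 0


tdf = FailureClock(time_to_fail=10.0, operation_dependent=False)
odf = FailureClock(time_to_fail=10.0, operation_dependent=True)

# 6 time units idle, then 6 time units operating.
print(tdf.advance(6.0, operating=False), odf.advance(6.0, operating=False))  # False False
print(tdf.advance(6.0, operating=True), odf.advance(6.0, operating=True))    # True False
```

After twelve time units the time-dependent failure is due, while the operation-dependent clock has accumulated only six units of operation time.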
The mean time to repair MTTR and the mean time to fail MTTF are measured
by averaging the durations of many repairs and failures over a period which is long
compared to the duration of a single repair or failure. The mean time between two
consecutive time-dependent failures of Mode j for a single machine in the machine
group, MTBFj, is
MTBFj = MTTFj + MTTRj. (2.11)
Similarly, the mean time between two operation-dependent failures of Mode j on
a machine with a high utilization can be approximated as the MTBF for a time-
dependent failure with identical parameters.
The long-term average frequency of a failure is important in our hierarchy in order
to create the hierarchical framework and to classify failure modes. Since each machine
fails, on average, once per interval of length $\mathrm{MTBF}_j$, the frequency of failures of
Mode j, $u_j$, in Machine Group i with $n_i$ machines is approximately
$$u_j \approx \frac{n_i}{\mathrm{MTBF}_j} \quad \text{if } j \in F_i. \tag{2.12}$$
The controller whose bandwidth includes the long-term average frequency of Fail-
ure Mode j is able to accommodate both random and deterministic failure and repair
times. Deterministic failures will occur after exactly MTTFj time units have elapsed.
Random failures will occur after an interval has elapsed which has an exponentially
distributed duration of mean MTTFj time units. The repair time of any failure mode
can also be either exactly MTTRj time units or be exponentially distributed with a
mean of MTTRj time units.
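Relations (2.11) and (2.12), together with the choice between deterministic and exponentially distributed durations, can be sketched as follows. The function names and the sample parameters are assumptions made for illustration.

```python
import random


def mtbf(mttf, mttr):
    """Mean time between failures of one mode on one machine, as in (2.11)."""
    return mttf + mttr


def failure_frequency(n_i, mttf, mttr):
    """Approximate failure frequency u_j for a group of n_i machines, as in (2.12)."""
    return n_i / mtbf(mttf, mttr)


def sample_duration(mean, deterministic, rng):
    """Draw one time to fail or time to repair with the given mean."""
    if deterministic:
        return mean
    return rng.expovariate(1.0 / mean)   # exponential with the requested mean


# Hypothetical failure mode: MTTF = 90 h, MTTR = 10 h, on a group of 4 machines.
print(mtbf(90.0, 10.0))                   # 100.0
print(failure_frequency(4, 90.0, 10.0))   # 0.04 failures per hour

rng = random.Random(0)
print(sample_duration(10.0, True, rng))   # 10.0
```

With random durations, the average of many samples approaches the specified mean, so the long-term frequency in (2.12) is the same for deterministic and random failures.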
2.7.2 Effects of Failures on Capacity
As with processes, there are two sets of assumptions in the models of failures in the
hierarchical framework. The first set of assumptions is necessary because the current
theory does not accommodate some kinds of machine downtime. The second set is
necessary because of the particular way in which the simulation was coded.
Assumptions Required by Current Theory Failures affect capacity in a man-
ufacturing system by removing machines from production temporarily without any
possible control over timing or duration. To calculate the impact of failures on the
system capacity, it is assumed that failure modes are independent of each other. In-
dependence means that the occurrence of one failure mode cannot influence when
another will occur. It is also assumed that only one failure may occur on a given
machine at any one time.
When the effect of failures is measured at a frequency which is much lower than
the long-term average frequency of the failure, the loss of capacity can be calculated
using the zero-buffer transfer line model of Buzacott and Hanifin (1978).
The model and its approximations are sketched here. The effective efficiency of
a machine in Machine Group i with ni identical machines is denoted by ei. This is
the average fraction of time that a machine will be available to perform operations.
In the transfer line model, the effective efficiency, ei, of a machine in Group i with
multiple independent failure modes can be computed by approximating the machine
as a transfer line with as many machines as failure modes, and no internal buffers.
Because there are no buffers, when one of the imaginary machines fails, the entire
line goes down (see Observation 2 of Section 2.8.2 and the example in Section 7.2.1).
This models the assumption that the single machine can only experience one failure
mode at a time.
If the machine has only independent time-dependent failure modes (denoted by
Failure Modes j, $j \in \mathrm{TDF}$), then the average fraction of time the machine is able to
work, $e_i$, is given by
$$e_i = \prod_{j \in \mathrm{TDF}} \frac{\mathrm{MTTF}_j}{\mathrm{MTTF}_j + \mathrm{MTTR}_j} \tag{2.13}$$
Likewise, if the machine has only independent operation-dependent failure modes
(denoted by Failure Modes j, $j \in \mathrm{ODF}$), then the average fraction of time the machine
is able to work, $e_i$, is given by
$$e_i = \frac{1}{1 + \sum_{j \in \mathrm{ODF}} \dfrac{\mathrm{MTTR}_j}{\mathrm{MTTF}_j}} \tag{2.14}$$
Since the current implementation accounts for both time-dependent and operation-
dependent failures, an estimate of the long-term capacity with both types of failure
is needed. As a first approximation for this implementation, the effects of time-
dependent and operation-dependent failures are assumed to be independent. There-
fore, for a machine which has both time-dependent and operation-dependent failure
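modes, the expected fraction of time the machine will be able to work is approximated as the
product of (2.13) and (2.14):
$$e_i = \frac{\prod_{j \in \mathrm{TDF}} \dfrac{\mathrm{MTTF}_j}{\mathrm{MTTF}_j + \mathrm{MTTR}_j}}{1 + \sum_{j \in \mathrm{ODF}} \dfrac{\mathrm{MTTR}_j}{\mathrm{MTTF}_j}} \tag{2.15}$$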
The relation (2.15) also applies to machines within a machine group.
The long term capacity set for Machine Group i, (2.2), can be rewritten to account
for the quantitative approximations presented in this section as the set of all vectors
u such that:
$$\sum_{j \in P_i} \tau_{ij} u_j \le e_i n_i \quad \forall i$$
$$u_j \ge 0 \quad \forall j \in P_i \tag{2.16}$$
The quantity ei is the fraction of a machine in Machine Group i which is still available
for other activities besides failures, and is computed using (2.13), (2.14), or (2.15).
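The efficiency formulas can be checked numerically with a short Python sketch. Each failure mode is represented here as an (MTTF, MTTR) pair; that representation, the function names, and the sample values are assumptions made for this example.

```python
def efficiency_tdf(modes):
    """Efficiency with only time-dependent failure modes, as in (2.13)."""
    e = 1.0
    for mttf, mttr in modes:
        e *= mttf / (mttf + mttr)
    return e


def efficiency_odf(modes):
    """Efficiency with only operation-dependent failure modes, as in (2.14)."""
    return 1.0 / (1.0 + sum(mttr / mttf for mttf, mttr in modes))


def efficiency_mixed(tdf_modes, odf_modes):
    """First approximation for both kinds of failure, as in (2.15)."""
    return efficiency_tdf(tdf_modes) * efficiency_odf(odf_modes)


# Hypothetical machine: one TDF mode (MTTF 90, MTTR 10), one ODF mode (MTTF 40, MTTR 10).
tdf_modes = [(90.0, 10.0)]
odf_modes = [(40.0, 10.0)]
e_i = efficiency_mixed(tdf_modes, odf_modes)
n_i = 5
print(efficiency_tdf(tdf_modes), efficiency_odf(odf_modes))   # 0.9 0.8
print(round(e_i * n_i, 6))   # right-hand side of (2.16): 3.6 machines
```

In this example a group of five such machines offers the equivalent of about 3.6 machines for controllable activities over the long term.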
Implementation-Specific Assumptions Due to programming complexity, limits
on the types of machines in simulations using either version of Hiercsim were imposed.
These limits are not fundamental to the current state of the theory, and may be
relaxed in the future as Hiercsim is improved.
The restrictions are:
* In Hiercsim, Version 3.5, machines are unreliable but are unable to change
setups during the course of a simulation run.
* In Hiercsim, Version 4.0, machines can change setups, but cannot fail during
the course of a simulation run. One of the goals for the next version of Hiercsim
is to allow both failures and setup changes in the same run of a simulation.
* A machine which is occupied by Failure Mode j, $j \in F_i$, is unable to perform
any other type of activity until the failure is repaired. Therefore, failure modes
which only slow down machines or restrict their set of operations cannot be
modeled using Hiercsim Version 3.5.
* An operation which is interrupted by a failure will resume where it left off at
the time of interruption. Rejects caused by machine failures are not considered.
2.8 Observations about Buffers
2.8.1 Role of Buffers in a Manufacturing System
Buffers exist in any manufacturing system even though some of their effects are not
desirable. Those undesirable effects include work-in-process (WIP) inventory and
long lead times. However, buffers do serve the useful purpose of isolating disruptions
in a factory by indirectly controlling the distribution of WIP.
An analogy which roughly describes the importance of WIP can be made from
the structure of an aircraft. The distribution of WIP can be viewed as the amount
of metal in an aircraft. If there is too much metal, the aircraft is too heavy to fly.
If there is too little metal in key areas, the aircraft will break in the air. The key to
good design is to identify where the metal needs to go, and where it can be removed,
so that the minimum amount can be used to ensure a strong, responsive aircraft.
Likewise, WIP in a factory allows the smooth flow of production to occur even
when there are disruptions. However, too much WIP is expensive to maintain, hides
defects, and causes the system to be unresponsive to market demands. Too little WIP
causes the system to be continually interrupted, as even the smallest glitch forces a
shutdown. This is because when buffers are small, there is no room to store material,
and so it will not take long after a disturbance occurs before all buffers are empty
downstream of the disturbance, and all buffers are filled upstream of the disturbance.
It is important to identify machines which experience disruptions due to failures
or inflexibility, and therefore require larger buffers than reliable and flexible machines.
There is a tradeoff between the size of a buffer and the propagation of disruptions.
Buffer sizes influence the distribution of WIP throughout the system by allowing
accumulation in some areas, and preventing accumulation in others. The simulation
tool described in this thesis enables the buffer sizes to be determined by many runs of
the same system and using trial and error to pick a reasonable distribution of buffer
sizes.
2.8.2 Buffer Behavior
In this section, four observations are made about how the behavior of manufacturing
systems differs when different buffer strategies are used. The first deals with the
relationship between disruption duration and buffer size; the second describes system
behavior when there are no buffers; the third describes system behavior in which
buffers are neither empty nor full; and the fourth describes system behavior with
either all empty buffers or all full buffers.
Observation 1: Disruption Duration and Buffer Size Buffers are low-pass
filters which attenuate high-frequency disruptions and let pass low-frequency disrup-
tions. Machines which perform consecutive operations on the same part are considered
to be decoupled, or independent of each other, if the upstream machine always has
room to place completed parts, and if the downstream machine always has raw mate-
rial to work on. In order to decouple completely two such machines from the effects
of a disruption in either machine, a buffer must be able to hold as much material as
may be processed during the time of the disruption.
In general, larger buffers decouple longer disruptions, but the cost of WIP in-
creases with the size of the buffer. The smaller the buffer, the lower the cost of WIP,
but the more disruptions are propagated, reducing the capacity of the system. The
capacity is reduced because either the downstream machine will spend a significant
fraction of time starved of raw material or the upstream machine will spend a significant
fraction of time blocked. A blocked machine is unable to perform operations be-
cause it cannot dispose of completed parts, while a starved machine cannot perform
operations because it has no material to work on.
In most cases, the sizing of buffers in a manufacturing system will allow some
coupling between machines as a tradeoff is made between the cost of WIP and the
cost of lost capacity. Unreliable machines, which generate many disruptions, require
larger buffers than reliable machines.
This behavior is demonstrated for an unreliable system in Section 7.2.1.
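The sizing rule in Observation 1 is a one-line calculation: a buffer decouples a disruption completely only if it can hold everything that would be processed while the disruption lasts. The Python sketch below illustrates this; the function names and the numbers are hypothetical, and the result is rounded up because buffer sizes in the current implementation must be integers.

```python
import math


def decoupling_buffer_size(production_rate, disruption_duration):
    """Smallest integer buffer that fully decouples a disruption of this length."""
    return math.ceil(production_rate * disruption_duration)


def longest_decoupled_disruption(buffer_size, production_rate):
    """Longest disruption that a buffer of this size can absorb completely."""
    return buffer_size / production_rate


# Hypothetical machine pair producing 2 parts per minute.
print(decoupling_buffer_size(2.0, 15.0))       # 30 parts for a 15 minute disruption
print(longest_decoupled_disruption(10, 2.0))   # 5.0 minutes for a 10 part buffer
```

Disruptions longer than the second figure pass through the buffer to the neighboring machine, which is the low-pass filtering behavior described above.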
Observation 2: Systems with No Buffers A sequence of machines within a
factory, which are not separated by buffers, acts as a single unit. As mentioned in the
previous observation, a machine which is either starved or blocked cannot perform
operations. In systems with no buffers, the only place to dispose of finished parts
is directly on the next machine downstream. Likewise, raw material must be taken
directly off of the immediate upstream machine.
Because of this handoff requirement for parts, a system with no buffers will only
be able to operate as fast as the current slowest machine. In most cases, the limiting
rate of production is equal to that of the machine which has the slowest overall
production rate. However, when there is a failed machine, none of the machines are
able to pass parts to the next machine downstream, nor take parts from the next
upstream machine, and production is forced to stop.
This behavior is demonstrated in Section 7.2.2.
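Observation 2 can be summarized in a small Python sketch: a sequence of machines with no buffers runs at the rate of its slowest member and stops entirely when any member is down. The function name and the rates are assumptions for illustration.

```python
def unbuffered_line_rate(max_rates, is_up):
    """Production rate of a sequence of machines with no buffers between them."""
    if not all(is_up):
        return 0.0          # one failed machine stops the whole sequence
    return min(max_rates)   # otherwise the slowest machine sets the pace


# Hypothetical three-machine sequence, rates in parts per minute.
rates = [3.0, 2.0, 4.0]
print(unbuffered_line_rate(rates, [True, True, True]))    # 2.0
print(unbuffered_line_rate(rates, [True, False, True]))   # 0.0
```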
Observation 3: Systems with Partly Full Buffers While the buffers separating
the machines in the sequence are neither empty nor full, each machine is able to
operate independently up to its own maximum production rate. This is because the
machine is able to take raw parts from its upstream buffers, and send finished parts
to its downstream buffers as fast as it needs to.
This behavior is demonstrated in Section 7.2.3.
Observation 4: Systems with Empty or Full Buffers A sequence of machines
separated by buffers sometimes acts, in a limited way, as if there were no buffers. This
behavior occurs when either all the buffers are empty or all the buffers are full.
When all the buffers are empty, machines are only able to remove parts from their
upstream buffers at the rate with which the upstream machine is producing them.
However, each machine is able to place finished parts into the downstream buffers as
fast as they are being produced. For this reason, a system with all
empty buffers acts as if there were no buffers only from the point of view of machines
drawing raw material from upstream machines.
Likewise, when all the buffers are full, machines are only able to deposit finished
parts downstream at the rate with which the machine downstream is taking parts
from its upstream buffers. There is no restriction on the number or rate at which
parts can be drawn from the upstream buffers. In this situation, a system with all
full buffers acts as if there were no buffers only from the point of view of machines
depositing finished parts downstream.
These behaviors are demonstrated in Section 7.2.4.
2.8.3 Constraints Imposed by Buffers
Buffers act to decouple two machines which perform consecutive operations in a pro-
cess by providing room for the upstream machine to place finished product, and
allowing the accumulation of raw material for the downstream machine. However,
buffers are limited in their effectiveness by their size as compared to the amount
of production possible during a disruption. Once a buffer becomes empty or full,
the buffer ceases to decouple the cells, as stated in Observation 4 of Section 2.8.2.
This section specifies what constraints are needed in a flow rate approximation of
production in order to effectively model the behavior due to empty or full buffers.
The concept of a cell controller is informally introduced, for the purposes of this
section. All machines described in this thesis are contained in cells whose controllers
determine the rates and times of controllable events. This concept is developed pre-
cisely in Section 3.6.1. For now, it is sufficient to state that the controller of a cell
which contains a machine determines the rate of production, uj, of Type j parts.
Assume that the nth operation on Type j parts is performed by a machine in
Cell c, and Operation n + 1 is performed by a machine in Cell c'. The production
rate of Type j parts in Cell c is denoted by $u_{cj}$ and that in Cell c' by $u_{c'j}$. The two
cells, and therefore the two machines, are separated by Buffer $\beta_{nj}$, which only holds
Type j parts between the nth and (n + 1)st operations. The amount of material, $b_{nj}$,
in Buffer $\beta_{nj}$ can never exceed the maximum buffer size, $B_{nj}$, and can never
be negative:
$$0 \le b_{nj} \le B_{nj} \tag{2.17}$$
From the point of view of the controllers of the cells, the buffer fills up or empties
depending on the difference in flow rates between upstream Cell c and downstream
Cell c'. (Note that this does not take into account the discrete nature of production,
but instead smoothes out the discrete jumps into continuous rates of increase or
decrease). The rate of change of the buffer level $b_{nj}$ is
$$\dot{b}_{nj} = u_{cj} - u_{c'j} \tag{2.18}$$
Constraints (2.17) and (2.18) imply that once a buffer becomes empty, the down-
stream cell is starved, and is limited in production rate to the rate of the upstream cell
(Bai and Gershwin, 1990a and 1990b). The conditional constraint due to starvation
is given by
$$\text{if } b_{nj} = 0 \text{ then } \dot{b}_{nj} \ge 0 \;\Rightarrow\; u_{c'j} \le u_{cj} \tag{2.19}$$
Likewise, once a buffer becomes full, the upstream cell is blocked, and is limited in
the rate at which it can deposit raw material for the downstream cell. The conditional
constraint due to blockage is given by
$$ \text{if } b_{nj} = B_{nj} \text{ then } \dot{b}_{nj} \le 0 \;\Rightarrow\; u_{cj} \le u_{c'j} \tag{2.20} $$
Figure 2-2 shows these constraints graphically. These constraints are included
in the capacity set (2.7) as needed in addition to the constraints imposed by the
machines.
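The conditional constraints (2.19) and (2.20) can be illustrated with a minimal Python sketch. The function name `limit_rates` and its arguments are hypothetical and are not taken from the simulation code; the sketch only assumes that each cell requests a rate and that the buffer state decides whether the request is cut back.

```python
def limit_rates(b, B, u_up, u_down):
    """Apply the conditional constraints (2.19) and (2.20) to requested rates.

    b      -- current buffer level, expected to satisfy 0 <= b <= B (2.17)
    B      -- maximum buffer size
    u_up   -- rate requested by the upstream Cell c
    u_down -- rate requested by the downstream Cell c'
    Returns the (upstream, downstream) rates that are allowed.
    """
    if not 0 <= b <= B:
        raise ValueError("buffer level violates (2.17)")
    if b == 0 and u_down > u_up:
        # Starvation: the downstream cell cannot outrun the upstream cell.
        u_down = u_up
    if b == B and u_up > u_down:
        # Blockage: the upstream cell cannot outrun the downstream cell.
        u_up = u_down
    return u_up, u_down


starved = limit_rates(0, 10, 2.0, 3.0)    # empty buffer, downstream asks for more
blocked = limit_rates(10, 10, 3.0, 2.0)   # full buffer, upstream asks for more
free = limit_rates(5, 10, 3.0, 2.0)       # partly full buffer, no limit applies
```

When the buffer is neither empty nor full, the two cells are decoupled and both requested rates pass through unchanged.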
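[Figure: two panels. Starvation: Buffer $\beta_{nj}$ between Operation n and Operation n + 1 on Type j parts is empty ($b_{nj} = 0$), so $u_{c'j} \le u_{cj}$. Blockage: Buffer $\beta_{nj}$ between Operation n in Cell c and Operation n + 1 in Cell c' is full, so $u_{cj} \le u_{c'j}$.]
Figure 2-2: Example of Starvation and Blockage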
2.9 Observations about Measurement Frequency
As was pointed out in Section 2.5, the bandwidth of event frequencies in a manu-
facturing system spans at least four orders of magnitude. Due to this broad range,
the manufacturing system takes on a vastly different appearance when quantities are
measured at different characteristic frequencies. The frequency with which the system
is viewed and measured has a major impact on the apparent availability of machines,
the cumulative production, and the distribution of WIP throughout the system. This
phenomenon is exploited in the development of the hierarchical controller.
This section builds on the assumptions about frequencies mentioned in Section 2.5,
which defined frequency clusters, characteristic frequencies, and observers of limited
bandwidth. Section 7.3 provides an example of each of the points described here.
2.9.1 Production as a Function of Level
This section describes how the view of production changes as a function of frequency
level. Consider a manufacturing system in which operations occur at a frequency
comparable to the Level k + 1 characteristic frequency, $f_{k+1}$. Other lower frequency
events occur in the system, but it is not necessary to specify those events for this
section. A detailed example is presented in Section 7.3.1.
Production in this thesis consists of operations on discrete parts. Each operation
takes a small amount of time compared to other activities in the system. The produc-
tion rate uj of Type j parts is the number of parts produced in any interval divided
by the length of the interval. There is a range over which the production rate of Type
j parts can vary which is independent of time scale. The smallest production rate of
Type j parts at a particular machine is zero, and the largest is when the machine is
operating full time on that part type.
Consider the production of Type j parts at Machine Group i, with one reliable
machine. The production rate for the part type which is measured at the Level k
characteristic frequency $f_k$ is represented by $u_j^k$. The average time to complete an
operation for a Type j part on a machine in Group i is $\tau_{ij}$. Regardless of the frequency
of measurement, the bounds on the magnitude of $u_j^k$ can be written as
$$ 0 \le u_j^k \le \frac{1}{\tau_{ij}} \tag{2.21} $$
When production is observed at the Level k + 1 characteristic frequency $f_{k+1}$,
production occurs in discrete jumps as individual operations on parts are completed.
For this reason, a Level k + 1 production rate for Type j parts, $u_j^{k+1}$, is not defined.
Instead, the Level k + 1 observer must treat each occurrence of an operation as an
individual event, and account for production using the production state $\alpha_j^{k+1}$.
As the average frequency of measurement is decreased to the Level k characteristic
frequency $f_k$, the interval of measurement increases, and so does the average number
of parts produced during a measurement interval. It is possible to speak of a Level k
production rate, $u_j^k$, which is equal to the number of Type j parts produced during
the interval divided by the length of the interval.
At each measurement point, the production rate over the interval, $u_j^k$, may change
from its value over the previous interval. Within each interval, the production rate is
assumed to have been maintained at a constant rate. Therefore, the discrete nature
of production is approximated by a piecewise linear curve. When the measurement
intervals are short, the number of times the production rate $u_j^k$ changes can be very
high, allowing for the possibility of a jagged cumulative production curve, where
cumulative production is defined to be
$$ \text{cum prod}_j^k(t) = \int_0^t u_j^k(s)\,ds \tag{2.22} $$
As the average frequency of measurement decreases to the Level k - 1 characteristic
frequency $f_{k-1}$, the observed production rate of Type j parts is $u_j^{k-1}$. Because the
interval over which production is measured is increased, the number of times the
production rate $u_j^{k-1}$ can change is smaller than that of the Level k production rate
$u_j^k$. Again, the production rate is assumed to be constant over any Level k - 1 interval.
The lower number of measurement points forces the Level k - 1 production curve to
be smoother than that of Level k.
The average of the Level k production rates $u_j^k$ over a Level k - 1 interval is equal
to the Level k - 1 production rate $u_j^{k-1}$ for that interval. However, the actual values of
cumulative production may be slightly different at any time instant. This difference
between cumulative production at Level k - 1 and Level k is due to the fact that
a Level k observer sees more detail and is able to detect small fluctuations in rates.
However, from the point of view of a Level k - 1 observer, those Level k fluctuations
are insignificant.
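The relationship between rates measured at two levels can be checked numerically. The following Python sketch is illustrative only: the completion times, the interval lengths, and the function name `interval_rates` are assumptions made for the example and do not come from the thesis.

```python
def interval_rates(completion_times, interval, horizon):
    """Return parts completed in each measurement interval divided by its length."""
    n = int(round(horizon / interval))
    counts = [0] * n
    for t in completion_times:
        # An operation completing at time t is counted in the interval containing t.
        idx = min(int(t // interval), n - 1)
        counts[idx] += 1
    return [c / interval for c in counts]


# Hypothetical operation completion times (hours) over a 16 hour run.
times = [1, 2, 3, 5, 9, 10, 11, 12, 14, 15]

rates_k = interval_rates(times, 4, 16)     # shorter interval: Level k observer
rates_km1 = interval_rates(times, 8, 16)   # longer interval: Level k - 1 observer

# Average the Level k rates over each Level k - 1 interval (two Level k
# intervals per Level k - 1 interval in this example).
averaged = [sum(rates_k[i:i + 2]) / 2 for i in range(0, len(rates_k), 2)]
```

In this example the Level k rates are 0.75, 0.25, 0.75, and 0.75 parts per hour, and their averages over the two longer intervals, 0.5 and 0.75, match the rates seen directly by the Level k - 1 observer. The dip in the second Level k interval is invisible at Level k - 1.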
At the static level in the hierarchy (Level 1), the production rate of Type j parts
$u_j^1$ is the total number of parts produced during the production run divided by the
overall length of the run. This is the long-term average production rate that is used
to determine the frequency cluster in which production belongs. All detail about
fluctuations in production rate over that interval is lost. Any fluctuations that did
occur are considered to be insignificant.
2.9.2 Capacity over Different Time Scales
Long-term and short-term capacity sets are defined in Section 2.4. This section
describes the relationship between the long-term and the short-term capacity sets in
more detail, from the point of view of observers with different bandwidths. A precise
definition of capacity at any arbitrary Level k is given by (3.12) in Section 3.6.1. A
specific example of the general concepts described here is presented in Section 7.3.3.
Relationship Between Activity States at Different Levels Consider Activity
j that occurs on Resource i at a frequency comparable to the Level k characteristic
frequency $f_k$. The Level k state of Activity j is represented by the binary quantity
$\alpha_{ij}^k$.
When Activity j is in progress, $\alpha_{ij}^k = 1$, and when it is not in progress, $\alpha_{ij}^k = 0$.
At Level k + 1, the state of Activity j is represented by the binary quantity $\alpha_{ij}^{k+1}$.
Similar binary quantities are used at Levels k + 2 and below which represent the state
of Activity j.
The relationship between the representations of the state of Activity j at levels k
and lower is given by
$$ \begin{aligned} \alpha_{ij}^k = 1 &\;\Rightarrow\; \alpha_{ij}^{\lambda} = 1, \quad \lambda > k \\ \alpha_{ij}^k = 0 &\;\Rightarrow\; \alpha_{ij}^{\lambda} = 0, \quad \lambda > k \end{aligned} \tag{2.23} $$
That is, when Level k Activity j is occurring on Resource i, it occupies that re-
source at all levels below Level k. This relationship is used to develop the relationship
between capacity sets over different time scales.
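At levels k - 1 and higher, the occurrence of Activity j on Resource i is represented
by its rate of occurrence $u_{ij}^{\lambda}$, $\lambda < k$. The value of $u_{ij}^{\lambda}$, $\lambda < k$, may be a fraction which
indicates that the resource is only partially occupied by Activity j over a long time
scale. The relationship between the rate of occurrence of Activity j and its discrete
Level k state is
$$ \begin{aligned} u_{ij}^{\lambda} > 0, \; \lambda < k &\;\Rightarrow\; \alpha_{ij}^k = 0 \text{ or } 1 \\ u_{ij}^{\lambda} = 0, \; \lambda < k &\;\Rightarrow\; \alpha_{ij}^k = 0 \end{aligned} \tag{2.24} $$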
Example Manufacturing System Consider a manufacturing system which contains
Machine Group i. All controllable activities which are performed on Machine
Group i are in the set $P_i$. Likewise, all uncontrollable activities which occur on
Machine Group i are in the set $F_i$.
Type 1 failures occur at a frequency which is comparable to the Level k - 1 characteristic
frequency, $f_{k-1}$; Type 2 failures occur at a frequency which is comparable
to the Level k characteristic frequency, $f_k$; and operations on Type j parts occur at
a frequency which is comparable to the Level k + 1 characteristic frequency, $f_{k+1}$.
Therefore, the set $F_i$ consists of the integers 1 and 2 while the set $P_i$ consists of the
integers 3, 4, ...
Level k + 1 Capacity Set When the capacity set is measured at the Level k + 1
characteristic frequency, $f_{k+1}$, the availability of each machine is constrained to binary
values. At any given instant, a machine can either be operational or failed. If the
machine is operational, then it is available for the start of an operation only if it is
not currently performing an operation. Therefore, the capacity set of the machine is
determined simply by its current status: $\alpha_{ij}^{k+1}$. The Level k + 1 production rate, $u_{ij}^{k+1}$,
is not meaningful in this example. The status $\alpha_{ij}^{k+1}$ of the machine is changing very
frequently as parts are loaded, processed, and unloaded and as the machine fails and
gets repaired.
The Level k + 1 capacity set is:
$$ \sum_{j \in P_i} \alpha_{ij}^{k+1} \le 1 - \alpha_{i1}^{k+1} - \alpha_{i2}^{k+1} \tag{2.25} $$
$\alpha_{ij}^{k+1} = 0, 1 \Leftarrow$ Type j Operation
$\alpha_{i1}^{k+1} = 0, 1 \Leftarrow$ Type 1 Failure
$\alpha_{i2}^{k+1} = 0, 1 \Leftarrow$ Type 2 Failure
Note that the right-hand side of (2.25) can only be 0 or 1 since only one activity may
occupy Machine i at any time instant.
Level k Capacity Set When the frequency of measurement is decreased to the
Level k characteristic frequency $f_k$, the capacity set (2.7) can be used. This is because
the frequency of operations on Type j parts is high enough to use a flow rate
approximation $u_{ij}^k$ for the occupation of the machine, $\tau_{ij} u_{ij}^k$. However, the right-hand
side of (2.7), which represents the machine availability, is still constrained to be an
integer since the Type 2 failures are seen as individual events by a Level k observer,
and are represented by the failure state, $\alpha_{i2}^k$. Whenever a Type 1 failure occurs, the
Level k observer does not have a model for when it will next occur. This is because
Type 1 failures and repairs occur so infrequently. Type 1 failures are also modeled as
discrete states $\alpha_{i1}^k$.
The Level k capacity set is therefore:
$$ \sum_{j \in P_i} \tau_{ij} u_{ij}^k \le 1 - \alpha_{i1}^k - \alpha_{i2}^k \tag{2.26} $$
$u_{ij}^k \ge 0 \Leftarrow$ Type j Operation
$\alpha_{i1}^k = 0, 1 \Leftarrow$ Type 1 Failure
$\alpha_{i2}^k = 0, 1 \Leftarrow$ Type 2 Failure
Note again that only one activity may occupy Machine i at any given time, so that
the right-hand side of (2.26) is never negative: it is 1 when the machine is
operational and 0 while a failure is in progress.
Level k - 1 Capacity Set When the frequency of measurement is decreased to
the Level k - 1 characteristic frequency $f_{k-1}$, the machine availability is no longer
limited to binary values. Many occurrences of Type 2 failure modes will occur over
a measurement interval, and even more operations will occur when machines are
operational. The effect of those failures will be averaged over the time between
measurements using the rate approximation for the failure: $u_{i2}^{k-1}$. The integer machine
availability of (2.7) is replaced by the fraction of time the machine is expected to be
operational. Type 2 failures can be averaged according to (2.13) or (2.14) depending
on whether they are time-dependent or operation-dependent. Again, the occupation
of the machine due to processing of parts can be modeled as $\tau_{ij} u_{ij}^{k-1}$. Type 1 failures
must be considered in detail as changes in the failure state $\alpha_{i1}^{k-1}$. Those failures must
be considered as discrete events, causing the loss of an entire machine when they
occur. Therefore, at Level k - 1, there is a mixture of averaged failure modes (Type
2) and discrete failure modes (Type 1).
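The Level k - 1 capacity set is:
$$ \sum_{j \in P_i} \tau_{ij} u_{ij}^{k-1} \le 1 - \alpha_{i1}^{k-1} - \tau_{i2} u_{i2}^{k-1} \tag{2.27} $$
$u_{ij}^{k-1} \ge 0 \Leftarrow$ Type j Operation
$\alpha_{i1}^{k-1} = 0, 1 \Leftarrow$ Type 1 Failure
$u_{i2}^{k-1} \ge 0 \Leftarrow$ Type 2 Failure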
Level 1 Capacity Set As the frequency of measurement of capacity is still further
decreased to that of the static level, $f_1 = 0$, the effects of both types of failure are
averaged out using the highest level failure rates: $u_{i1}^1$ and $u_{i2}^1$. At that point, the
long-term capacity set (2.2) becomes valid. Both failure modes are averaged out
using (2.13), (2.14), and (2.15) of Section 2.7.2, depending on the nature of the
failure modes.
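The Level 1 capacity set is:
$$ \sum_{j \in P_i} \tau_{ij} u_{ij}^1 \le 1 - \tau_{i1} u_{i1}^1 - \tau_{i2} u_{i2}^1 \tag{2.28} $$
$$ \tau_{i1} u_{i1}^1 + \tau_{i2} u_{i2}^1 \le 1 $$
$u_{ij}^1 \ge 0 \Leftarrow$ Type j Operation
$u_{i1}^1 \ge 0 \Leftarrow$ Type 1 Failure
$u_{i2}^1 \ge 0 \Leftarrow$ Type 2 Failure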
Note that the fraction of time that failures occupy Machine i can be no greater than 1.0.
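The way the same machine is described at different levels can be sketched as three feasibility checks, one each for (2.25), (2.26), and (2.28). This is a hedged illustration: the function names, argument names, and numerical values below are assumptions chosen for the example, and the mixed Level k - 1 case is omitted.

```python
def level_k_plus_1_feasible(op_states, fail_1, fail_2):
    """(2.25): every activity is a 0/1 state; at most one occupies the machine."""
    return sum(op_states) <= 1 - fail_1 - fail_2


def level_k_feasible(tau, u, fail_1, fail_2):
    """(2.26): operations are flow rates; both failure types are 0/1 states."""
    load = sum(t * r for t, r in zip(tau, u))
    return load <= 1 - fail_1 - fail_2


def level_1_feasible(tau, u, tau_f1, u_f1, tau_f2, u_f2):
    """(2.28): operations and both failure types are long-term average rates."""
    lost = tau_f1 * u_f1 + tau_f2 * u_f2   # fraction of time lost to failures
    load = sum(t * r for t, r in zip(tau, u))
    return lost <= 1 and load <= 1 - lost


tau = [0.5, 0.25]   # hypothetical operation times (hours per part)
u = [1.0, 1.0]      # hypothetical production rates (parts per hour)

up_at_level_k = level_k_feasible(tau, u, 0, 0)     # machine operational
down_at_level_k = level_k_feasible(tau, u, 0, 1)   # Type 2 failure in progress
long_term = level_1_feasible(tau, u, 10, 0.01, 0.5, 0.2)
```

With these numbers the operations occupy 75 percent of the machine. The rates are feasible at Level k while the machine is up, infeasible during a Type 2 failure, and feasible at Level 1, where failures remove an average of 20 percent of the machine's time.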
2.9.3 Material in Buffers as a Function of Level
This section describes how the view of the amount of material in a buffer changes
as a function of frequency level. Consider the manufacturing system defined in
Section 2.9.2, with Type 1 failures at Level k - 1, Type 2 failures at Level k, and
operations at Level k + 1. In addition, consider the Buffer $\beta_{nj}$ which separates the
nth and (n + 1)st steps in the processing of Type j parts. The size of the buffer, $B_{nj}$, is
comparable to the amount of production that will occur during a Type 1 failure on
one of the machines while the other is still operational. An example is described in
Section 7.3.2.
At frequencies of measurement comparable to the Level k + 1 characteristic
frequency $f_{k+1}$ (which is comparable to the frequency of operations), the amount of
material in Buffer $\beta_{nj}$, $b_{nj}^{k+1}(t)$, is the number of parts which are in the buffer at time
t and it is an integer. The machine which performs the operation for Step n deposits
parts into Buffer $\beta_{nj}$, while the machine which performs the operation for Step n + 1
takes parts out of $\beta_{nj}$.
At frequencies of measurement comparable to the Level k characteristic frequency,
$f_k$, operations on parts at Step n are completed at the Level k rate $u_{nj}^k$. At the same
time, parts are removed from Buffer $\beta_{nj}$ for Step n + 1 at the Level k rate, $u_{n+1,j}^k$.
The amount of material in the buffer $b_{nj}^k(t)$ over the time scale of Level k ($1/f_k$) is
approximated by the integral of the rate of parts flowing in, $u_{nj}^k$, less the integral of
the rate of parts flowing out of the buffer, $u_{n+1,j}^k$:
$$ b_{nj}^k(t) = \int_0^t u_{nj}^k(s)\,ds - \int_0^t u_{n+1,j}^k(s)\,ds + b_{nj}^k(0) \tag{2.29} $$
Each time either of the production rates affecting the amount of material changes,
the rate at which the buffer is filling up or is emptying also changes. During an interval
in which neither $u_{nj}^k$ nor $u_{n+1,j}^k$ changes, material is accumulating in the buffer at a
constant rate $\dot{b}_{nj}^k$. Each time either of $u_{nj}^k$ or $u_{n+1,j}^k$ changes, $\dot{b}_{nj}^k$ may also change.
Over the course of an interval in which the buffer fill rate, $\dot{b}_{nj}^k$, is constant, $b_{nj}^k$ will
change by an amount equal to the integral of $\dot{b}_{nj}^k$ over the interval. Note that since
Level k corresponds to the frequent but short Type 2 failures, the buffer will not often
become full or empty due to a Type 2 failure.
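Because the rates are constant within each interval, the integral in (2.29) reduces to a sum over intervals. The Python sketch below works through that arithmetic. The function name `buffer_trajectory` and the numbers are hypothetical, and clamping the level to the bounds of (2.17) is a simplification: in the model, the rates themselves would be cut back by the starvation and blockage constraints.

```python
def buffer_trajectory(b0, B, segments):
    """Integrate piecewise-constant inflow and outflow rates, as in (2.29).

    b0       -- initial buffer level
    B        -- maximum buffer size
    segments -- list of (duration, u_in, u_out), with rates constant per segment
    Returns the buffer level at the start and after each segment.
    """
    levels = [b0]
    b = b0
    for duration, u_in, u_out in segments:
        # The fill rate u_in - u_out is constant over the segment.
        b = b + (u_in - u_out) * duration
        # Simplification: clamp to the bounds (2.17) instead of limiting rates.
        b = max(0, min(B, b))
        levels.append(b)
    return levels


# Hypothetical run: fill for 2 hours, drain for 4 hours, then balanced flow.
levels = buffer_trajectory(5, 10, [(2, 3, 1), (4, 0, 2), (1, 2, 2)])
```

Starting from 5 parts, the buffer rises to 9, falls to 1 while the upstream step is not producing, and then holds at 1 when inflow and outflow are equal.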
At frequencies of measurement comparable to the Level k - 1 characteristic frequency,
$f_{k-1}$, the buffer fill rate is represented by $\dot{b}_{nj}^{k-1}$, and the amount of material in the buffer
by $b_{nj}^{k-1}$. Because the interval over which $\dot{b}_{nj}^{k-1}$ is constant is longer than that at Level
k, the fluctuation in $b_{nj}^{k-1}$ is also larger than that of Level k. In fact, the Level k
fluctuations are too small to be seen by a Level k - 1 observer. At Level k - 1, the
amount of material in the buffer, $b_{nj}^{k-1}$, is equal to the integral of the rate at which
the Level k - 1 production at Steps n and n + 1 are filling and emptying the buffer:
$$ b_{nj}^{k-1}(t) = \int_0^t u_{nj}^{k-1}(s)\,ds - \int_0^t u_{n+1,j}^{k-1}(s)\,ds + b_{nj}^{k-1}(0) \tag{2.30} $$
Note that the buffer in the example will fill up or empty if more than one Type 1
failure occurs on the same machine during a period of bad luck. This is due to the
sizing of the buffer, which is enough to accommodate one Type 1 failure at a time,
but not two.
There is a level in the hierarchy at which the size of the change in the amount of
material in the buffer has the potential to become larger than the size of the buffer, if
left unchecked. At that level, the buffer ceases to decouple disruptions, because the
bounds on the amount of material in the buffer given by (2.17) limit the machines
according to the conditional constraints (2.19) and (2.20). To an observer over the
long interval, the ability of the machines to produce at different rates is lost.
In the example of this section, the level at which the buffer ceases to decouple
the two machines is the static level, Level 1. The production rates $u_{nj}^1$ and $u_{n+1,j}^1$
at Steps n and n + 1 do not change, and so must be equal. Otherwise, the bounds
on the amount of material in Buffer $\beta_{nj}$ will be violated. The controller at the static
level ignores the buffer, and absorbs it into the long-term representation of capacity
(2.2).
Chapter 3
Algorithms for a Hierarchical
Controller
3.1 Requirements of a Hierarchical Controller
This section describes some of the issues involved in the control of a manufacturing
system. Some basic requirements for a real-time controller of a manufacturing system
are sketched as well as some of the difficulties inherent in manufacturing. These
requirements and difficulties lead to a need for decomposition techniques in the design
of a controller. The decomposition approach taken in this thesis requires that the
system controller consist of many components. An overview of those components is
given, along with a brief description of how they fit together into a system.
3.1.1 Requirements of a Real-Time Controller
Scheduling is the selection of times for future controllable events. In this thesis, the
only controllable events which are studied are those which initiate operations and
setup changes. Other controllable events include preventative maintenance, ship-
ments, and employee meetings.
Here a controller or scheduler is a mechanism which allocates resources in a man-
ufacturing system for controllable activities. In this thesis, there are three distinct
types of controllers. A cell controller is one which is responsible for choosing rates
and approximate times for controllable activities for a subset of the manufacturing
system. A machine group controller is responsible for loading of parts onto machines
for processing, transportation of parts, choosing precise setup change times, and re-
porting machine status information to cell controllers. The system controller is the
combination of all cell controllers, machine group controllers, and the infrastructure
which serves to coordinate the component controllers towards the same overall goal
of meeting requirements with a minimal amount of inventory or surplus.
To be effective, a real-time scheduler needs to be responsive to random uncontrol-
lable events. The only random uncontrollable events studied in this thesis are failures,
as described in Section 2.7, and buffers becoming empty or full. Randomness is taken
into account by means of a feedback control law which varies the tentative schedule
according to the state of the manufacturing system.
In addition to being responsive, the system controller must be easily expanded
when the manufacturing system changes. A change may be the addition of new
machines, the introduction of new products, or a change in existing parameters.
The amount of data in any manufacturing system is quite large, and so errors,
inconsistencies, and gaps in the data are to be expected. The system controller should
still be able to operate despite inaccurate or incomplete data. As the data becomes
more accurate and complete, the performance of a good system controller should
improve accordingly.
3.1.2 Difficulties of Scheduling
Scheduling a manufacturing system is a very complex task. The enormous amount
of data, the large bandwidth of event frequencies, the vast number of decisions, and
the interactions of various parts of the system all make for a difficult problem.
A manufacturing system has an overwhelming amount of information which the
system controller needs to make its decisions. The repair and configuration status of
machines, the location of production lots, the status of production for each process,
and the allocation of priorities for sequencing of controllable events are all vital to
a system controller. When a system is more complex than two machines and one
buffer, the amount of data grows rapidly.
The timing of each movement of production lots, tooling change requests, and raw
material staging must be selected by some controller, regardless of whether or not the
controller is a person or a computer. Production rates and resource allocation all
should be chosen to ensure that the movement of production lots and tooling changes
are feasible and meet requirements.
The system controller must manage the interaction of events in the system so
that all the machines are working toward the same overall goal: satisfying demand
with minimal inventory or surplus. Each decision within the system can be relatively
simple in isolation, and there may be many different reasonable options available.
However, the effects of large numbers of decisions may interact in unexpected and
undesirable ways. If such interaction is not managed properly, then machines can be
forced to be idle while there are backlogs, and machines will be working after already
producing more than was needed.
Finally, all this information processing, decision making, and interaction manage-
ment must be accomplished in a time period that is short compared to operations at a
machine so that machines are not idle, or doing the wrong thing, while the controller
is determining the next tasks.
This thesis is about not only the real-time control of factories but also
the simulation of a factory. In a factory simulation, many decisions must be modeled
in some way in order for it to retain realism. Below the level of the hierarchical
controller, common sense policies must be implemented to choose among equally
valid options and to clarify ambiguous commands. Simulating the factory thus adds
one more layer of complexity on top of the already complex task of controlling events
within the factory.
3.1.3 Decomposition Requirements
The approach adopted here to resolve the complexity issues and requirements for a
real time controller is based on breaking the problem into many manageable pieces.
Control of the system is achieved by forming an infrastructure which clearly defines
the responsibilities of individual controllers with respect to each other. Each cell
or machine group controller has the tools to handle its responsibilities within its
time frame and trusts that all other individual controllers within the system are so
equipped.
The system-wide control infrastructure defines the paths over which information
must flow. Information which is essential to an individual cell or machine group
controller is kept near at hand, while pathways and approximation techniques are used
to provide information which is not needed very often or in detail. The system-wide
control infrastructure has also been created in such a way as to facilitate expansion
and to be independent of any particular control algorithm.
Within the system-wide control infrastructure, each partition of the system con-
troller is responsible for a small portion of the whole. This reduces the burden on each
individual controller, allowing each to perform its calculations in a very short time.
Coordination of controllers is accomplished by the use of the system communication
network.
The hierarchical system controller concept presented above is not new. In fact,
most companies' management structures are organized along these lines. A hierar-
chy is an attempt to decompose a complex problem into a set of intertwined, but
simpler problems. Some traditional hierarchies that are designed for production sys-
tems do not have a systematic method of defining areas of responsibility, resulting in
overlapping responsibilities or gaps in responsibilities.
Because the consequences of a poor hierarchy are not well understood, simple
heuristic rules are often invented to compensate for the deficiencies. However, those
rules do not address the underlying problems, and so are largely ineffective. This
prompts even more rules to compensate for the consequences of the initial set of rules.
Eventually, this vicious cycle leads to an ineffective and complex system controller.
This thesis attempts to describe the underlying structure of a manufacturing sys-
tem in order that the hierarchy may be understood and designed properly. In addi-
tion, algorithms are developed which take advantage of the hierarchically decomposed
system in order to control the manufacturing system.
3.2 Microscope Analogy to Hierarchical Control
One of the fundamental aspects of the hierarchical controller is the method by which
it specifies the precise timing of controllable events. This section is an attempt to
build an analogy of how the system controller operates by using a description of
how a microscope works. The major point demonstrated by this analogy is how the
hierarchical controller makes individual, independent controllers work together as a
team.
Microscope Analogy The purpose of a microscope is to view a small object at a
large magnification. It is difficult to locate such a small object and bring it into focus
in one step. Microscopes overcome this problem by being equipped with at least three
lenses, each of a different magnification power. The lenses and slide are held in place
by a larger structure.
The first step in looking for the small object is to put the slide under the least
powerful lens. The slide is out of focus initially, but by making large movements of
the lens, the slide can be brought into a sharp focus. Most of the slide is seen in the
lens, and general features can be made out. However, the small object cannot be seen
because the fine details are lost. See Figure 3-1, Part A. This is similar to the view
of the highest level controller.
Once the general layout of the slide is seen using the least powerful lens, areas
where the small object might be located are identified. One such area is moved into
the center of the lens viewing area, and the next power lens replaces the first lens.
At first, the middle power lens is out of focus, even though the slide was in focus
for the least powerful lens. The movement required to bring the middle power lens
into focus is much less than the original coarse movements needed to focus the least
powerful lens. This is mainly because the medium power lens started out in the right
neighborhood, identified by the focus point of the least powerful lens.
[Figure panels: Part A, Low Power Lens; Part B, Middle Power Lens; Part C, High Power Lens]
Figure 3-1: Microscope Analogy to Hierarchical Control
The area of the slide seen using the middle power lens is smaller than that of the
lowest power lens, but the smaller area is seen in more detail. Salient features of the
general area identified by the lowest power lens are now clear. The small object can
be seen, but it is too small to identify its precise shape. See Figure 3-1, Part B.
The goal is to view the small object in as much detail as possible. The object
is brought into the center of the viewing area of the medium power lens. The most
powerful lens then replaces the medium power lens. The new lens is out of focus in
the same way the medium power lens was out of focus when it replaced the first lens.
The new lens is very near where it needs to be, and with some very small adjustments
to its location, the small object comes into sharp focus, filling the entire viewing area.
All the necessary detail of the object is present, but the rest of the system is not seen.
See Figure 3-1, Part C. The highest power lens is like a controller at the lowest level
of the hierarchy.
Relationship of Analogy to Controller The hierarchical controller determines
the precise timing of events in a way similar to the way that the microscope looks
at a small object. The infrastructure and communication links between different
controllers can be compared to the structure which holds the lenses and the slide in
place. The lenses of the microscope relate to each controller of the factory, where the
magnification power of the lens is analogous to the time scale of the controller. The
viewing area of the microscope can be related to the amount of the factory for which
a controller is responsible. The level of detail in the viewing area of the microscope
can be related to how precisely the controller sees the timing of events.
At high levels in the hierarchy, the high level controller sees the entire factory, but
events are aggregated into long-term flow rates. The details of the movement of parts,
failure times of machines and tooling configurations of machines are approximated by
aggregate quantities. The high level controller is able to determine production rates
in such a way as to try to meet the long-term demand rates and still stay within the
capacity of the factory.
Each middle level controller uses the production rates set by the high level con-
troller as a guide to determining production rates at its time scale. Each controls a
smaller number of machines. Details such as machine failure states and tooling con-
figurations are visible to these controllers, but still the precise timing of part loading
on the machine is represented as a set of flow rates. The production rates of each
middle level controller, therefore, are able to take into account specific information
about machine states and production status to determine the best mix of parts to
produce.
Each of the lowest level controllers is responsible only for a single machine. It
is also responsible for determining how many parts to load, and when to load those
parts. It uses the production rates from its middle level controller as a guide, and
loads parts according to those rates. The difference between the exact loading times
and the times implied by the production rates accounts for minor machine jams, short
transportation times, and any number of short-duration events.
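The step from a target rate to discrete loading times can be sketched as follows. This is not the loading rule of the thesis. It is a simple stand-in, with a hypothetical function name and integer time steps, which loads a part only when doing so keeps cumulative production at or below the target set by the level above.

```python
def load_schedule(rate, horizon, op_time):
    """Turn a target production rate into discrete part loading times.

    rate    -- target rate handed down by the middle level controller
    horizon -- length of the scheduling window (integer time units)
    op_time -- duration of one operation (integer time units)
    """
    times = []
    loaded = 0
    t = 0
    while t + op_time <= horizon:
        # Load only if the finished part keeps production at or below target.
        if loaded + 1 <= rate * (t + op_time):
            times.append(t)
            loaded += 1
            t += op_time      # the machine is busy until the operation ends
        else:
            t += 1            # otherwise wait one time unit and check again
    return times


schedule = load_schedule(0.25, 16, 2)
```

For a target of 0.25 parts per minute and a 2 minute operation, the parts are loaded at minutes 2, 6, 10, and 14, so four parts are completed in 16 minutes and the long-run rate matches the target.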
3.3 Basic Decompositions
The hierarchical controller is based on two concepts: feedback control algorithms
which are used to allocate resources, and a framework which ties many of those feed-
back controllers into a system-wide controller. This section describes the relationship
of the different types of decomposition to one another. The next sections develop the
controller infrastructure and algorithms.
Basic Building Block The basic building block for the hierarchical control system
is based on work by Kimemia (1982) and Kimemia and Gershwin (1983). Further
development of the basic building block can be found in Gershwin, Akella, and Choong
(1985). The system was controlled by a two level hierarchy. The module used feedback
on production to allocate resources among different part types. The measured variable
was surplus, which is the difference between cumulative production and cumulative
requirements. The behavior of the basic model is described in Akella, Choong and
Gershwin (1984). This controller works well for systems with a small number of
process steps and two classes of events: low frequency failures and high frequency
operations.
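The idea of feedback on surplus can be illustrated with a short sketch. The rule used here, which serves the part type that is furthest behind, is a deliberately simplified stand-in and is not the control law of Kimemia and Gershwin. The function names and numbers are hypothetical.

```python
def surplus(cum_production, demand_rate, t):
    """Surplus: cumulative production minus cumulative requirements at time t."""
    return cum_production - demand_rate * t


def choose_part(cum_production, demand_rates, t):
    """Pick the part type that is furthest behind, or None if none is behind."""
    x = {j: surplus(cum_production[j], demand_rates[j], t) for j in demand_rates}
    worst = min(x, key=x.get)
    return worst if x[worst] < 0 else None


produced = {"A": 10, "B": 3}      # parts made so far (hypothetical)
demand = {"A": 1.0, "B": 0.5}     # parts required per hour (hypothetical)

behind = choose_part(produced, demand, 12)   # both types are behind at hour 12
ahead = choose_part(produced, demand, 4)     # both types are ahead at hour 4
```

At hour 12 the surpluses are -2 for Type A and -3 for Type B, so Type B is chosen. At hour 4 both surpluses are positive and no part type needs the machine.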
Frequency Decomposition An extension of the basic model to systems with mul-
tiple clusters of event frequencies appeared in Gershwin (1989). System control is
accomplished by multiple controllers which schedule the same machines at different
time scales. Controllers at long time scales set targets for shorter time scale con-
trollers. Frequency decomposition alone works for systems with a small number of
process steps, many failure modes of different frequencies, and machines with lim-
ited flexibility that require setup changes. It is not able to handle work-in-process
inventory.
Process Decomposition The basic model of Kimemia and Gershwin (1983) and
Gershwin, Akella, and Choong (1985) was extended to control a manufacturing system
with internal storage by Bai and Gershwin (1990a, 1990b) and Bai (1991a, 1991b).
The system is controlled by a set of smaller controllers, each with responsibility for a
subsection of the factory. The controllers are coordinated by having the same target
production rate, and by the status of buffers separating the subsections. This is a
process decomposition technique and works for systems with a single level of failure
mode frequencies but many process steps with work-in-process.
Pyramid Decomposition The combination of frequency decomposition and pro-
cess decomposition into a pyramid hierarchy first appeared in Darakananda (1989)
and Violette and Gershwin (1991), and is expanded in this thesis. The pyramid de-
composition allows the system controller to be composed of many smaller controllers
within an infrastructure of communication links. Each controller is able to use a feed-
back algorithm which is best suited for its frequency control range. This allows each
controller to operate as an independent module with its own set of variables, while
making reasonable assumptions about the rest of the system. The pyramid structure
works for systems which have many distinct event frequencies and numerous process
steps with work-in-process. Frequency decomposition and process decomposition hi-
erarchies are special cases of the pyramid hierarchy.
3.4 Control Decomposition Infrastructure
3.4.1 Control Level Definition
In Section 2.5, it was assumed that each Type j event has a stationary long-term
average frequency, uj, and that the frequencies of all events are clustered together
about distinct characteristic frequencies, fk. It is possible to assign to each Cluster k
a set of controllers. Those controllers are based on models of events whose frequencies
lie in the Level k frequency domain, but make simplifying assumptions about all other
events. The convention to be used in the remainder of this thesis is defined here.
The level of a controller corresponds to the level of the frequency cluster whose
events are controlled by the controller. Consider a Level k controller which controls
and responds to events that occur at a frequency approximately equal to fk. The
values of the integer k in the term "Level k" follow this general rule:
* Low values of k correspond to low frequencies, fk, and high levels in the hier-
archy.
* High values of k correspond to high frequencies, fk, and low levels in the hier-
archy.
The relative location of a controller to other controllers in the hierarchy is de-
scribed as follows:
* A control level is higher than another if it encompasses more territory, and sees
a more aggregate view of the system over a longer time scale.
A high control level implies that the system is viewed from the point of view
of an observer who sees only aggregate information over a long time scale and
a large number of resources. A high level controller responds to events which
occur at low frequencies, and therefore to events which belong in clusters with
low values of k. (See Figure 3-3 for a sample hierarchy which illustrates the
rule relating k to control level.)
* A control level is lower than another if it encompasses less territory, and sees a
more detailed view of the system over a shorter time scale.
A low control level implies that the system is viewed from the point of view of
an observer who sees detailed information over a short time scale and a limited
number of resources. A low level controller responds to events which occur at
high frequencies, and therefore to events which belong in clusters with high
values of k.
In this thesis, when reference is made to levels higher than Level k, events which
occur at frequencies f < fk are to be considered. Likewise, when reference is made
to levels lower than Level k, events which occur at frequencies f > fk are to be
considered.
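The numbering convention can be sketched in Python as follows. The nearest-neighbor assignment on a logarithmic scale, the function names, and the sample frequencies are assumptions made for illustration; the sketch requires strictly positive frequencies and so does not cover a level whose characteristic frequency is zero.

```python
import math


def assign_level(event_freq, char_freqs):
    """Return the 1-based level k whose characteristic frequency is closest
    to event_freq on a log scale. char_freqs is sorted in increasing order,
    so k = 1 is the lowest frequency and the highest level."""
    best_k, best_dist = None, float("inf")
    for k, f_k in enumerate(char_freqs, start=1):
        dist = abs(math.log10(event_freq) - math.log10(f_k))
        if dist < best_dist:
            best_k, best_dist = k, dist
    return best_k


def is_higher_level(k_a, k_b):
    # A lower value of k means a lower frequency and a higher control level.
    return k_a < k_b


# Hypothetical characteristic frequencies, in events per hour.
char_freqs = [0.01, 1.0, 100.0]
print(assign_level(0.02, char_freqs), assign_level(50.0, char_freqs))
```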
For the controller of Level k Cell c (defined in Section 3.5), states ak which are
influenced by events of frequencies f < fk are treated as constant. For example,
suppose that a failure mode occurs with a long-term average frequency which is close
to the Level k - 1 characteristic frequency fk-1. A machine which is down because of
this failure is completely unavailable to the Level k controller. The times for events,
which occur at frequencies comparable to the Level k characteristic frequency fk, are
chosen directly by the Level k controller, if they are controllable.
On the other hand, events which occur at frequencies much higher than fk can only
be measured by the Level k controller by the rate uk at which they occur. The loss
of capacity due to high frequency failures is averaged out using (2.15). Occurrences
of controllable events are modeled as rates, uC, which are chosen by the controller.
The lowest level in the hierarchy that has a dedicated controller is at the operation
level of the processes. Below that level, automatic logic or operator judgment takes
over to ensure that parts actually get placed into the correct position for an operation
to take place.
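The way a Level k controller treats events of different levels, as described above, can be summarized in a short Python sketch. The function name and the returned labels are hypothetical and chosen only to mirror the three cases in the text.

```python
def treatment(event_level, controller_level, controllable):
    """Classify how a Level k controller models an event of a given level."""
    if event_level < controller_level:
        # Lower frequency than fk: the resulting state is treated as constant.
        return "constant"
    if event_level == controller_level:
        # Comparable to fk: event times are chosen directly if controllable,
        # otherwise the controller responds to them as they occur.
        return "scheduled" if controllable else "responded-to"
    # Much higher frequency than fk: only the rate of occurrence is seen.
    return "chosen-rate" if controllable else "averaged-rate"


# A Level 2 controller looking at a Level 1 failure and a Level 3 operation.
print(treatment(1, 2, False), treatment(3, 2, True))
```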
3.4.2 Buffer Control Level Definition
Observation 1 of Section 2.8.2 states that buffers are low pass filters, masking short
disruptions, but transmitting the effects of long disruptions. The time scale of the
disruptions that a buffer may mask is on the same order as the length of time needed
to empty or fill the buffer.
A Level k buffer is a buffer which is capable of decoupling events at control levels
k or lower. Machines which operate on two subsequent operations of a process, and
which are separated by a Level k buffer are usually able to operate independently of
each other at control levels k or lower. Each decoupled machine has its own values
for production rates and surpluses. The two Level k rates may be different at any
time instant, but over a time period comparable to or longer than the Level k - 1
time scale (1/fk-1), the average values of those rates are equal.
An example which illustrates buffer control levels can be found in Section 7.3.2.
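A rough way to relate buffer size to buffer level is sketched below in Python. It assumes that a buffer can mask disruptions no longer than the time needed to empty or fill it, and that a Level k disruption lasts on the order of 1/fk; the threshold rule, the function name, and the sample numbers are assumptions for illustration only.

```python
def buffer_level(capacity_parts, flow_rate, char_freqs):
    """Return the smallest k (highest level) whose time scale 1/fk does not
    exceed the time needed to empty or fill the buffer, or None if the
    buffer is too small to decouple any level."""
    fill_time = capacity_parts / flow_rate
    for k, f_k in enumerate(char_freqs, start=1):
        if 1.0 / f_k <= fill_time:
            return k
    return None


# Hypothetical characteristic frequencies (events per hour), 10 parts/hour flow.
char_freqs = [0.01, 1.0, 100.0]
print(buffer_level(50, 10.0, char_freqs))
```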
3.5 Definition of Controller Infrastructure Terms
The designs of hierarchical manufacturing system controllers in this thesis are based
on the number of distinct control levels (Section 3.4.1) and the placement of different
size buffers (Section 3.4.2). Different types of systems require different control infras-
tructures. This section formally defines the terms for the three types of infrastructures
used in this thesis, as well as components of the controller which are common to all
classes of hierarchical controllers.
Basic Terms For the purposes of this section, define Level k Cell c to be a subset of
resources (buffers and machine groups), as seen by a Level k observer. This concept
is developed further in Section 3.6. The controller of Level k Cell c chooses Level k
target production rates, uc,k for the process segments which use the resources of the
cell. Those Level k target rates are based on a view of capacity and production as
seen by a Level k observer.
Level k - 1 Cell C is a subset of resources, as seen by a Level k - 1 observer. Let
Level k - 1 Cell C contain all of the resources in Level k Cell c, and possibly more
resources outside of Cell c, including the resources of Level k Cell c'. Each of the
process steps which are contained in the process segments of Level k Cell c are also
contained in the process segments of Level k - 1 Cell C. However, the Level k - 1
process segments may have more steps than the Level k process segments. One of
the functions of the Level k - 1 Cell C controller is to provide Level k - 1 target
production rates, $u^{k-1}$, for each of the Level k process segments in Level k Cell c.
Figure 3-2 shows an example of a factory decomposition which has three control
levels, and multiple cells. There are four processes which pass through the system.
The process steps are separated by buffers of various sizes.
[Figure 3-2: Relationship between Cells, Buffers and Processes. Recoverable labels: Level k-1 Cell C; Level k Cell; Level k+1 Cell; Process j, Step n; Process j, Step n+1; Level k-1 Buffer; Level k Buffer; Level k+1 Buffer.]
Common Components The three classes of decomposition infrastructures defined
in Sections 3.5.1, 3.5.2, and 3.5.3 have common components which correspond to the
interface of the hierarchical controller with the system outside of its control. Between
the highest level controller (which serves as an interface between the outside world
and the factory) and the lowest level controllers (which serve as interfaces between the
hierarchical controller and the individual machines) there exists room for a designer
to tailor the hierarchical controller to the particular needs of a specific factory.
At the low frequency end of the hierarchical controller is the high interface level.
The highest interface level is designated as Level 1 with a characteristic frequency,
f1 = 0. Regardless of the type of infrastructure used, there is only one controller
which operates at Level 1. It serves as an interface between the long-term external
demand and the factory which meets that demand. That controller supplies target
rates to the entire system based on external demand rates and long-term system
capability. Those target rates are chosen to be within the long-term average capacity
of the system, given by (2.2). In the context of the current theory, those target rates
are static for the duration of any production run.
The presence of the Level 1 interface prevents excessive demand rates from being
imposed on the system. Excessive demand rates are replaced with a combination of
target rates which utilize the full capability of the system, but are less than those
required. Therefore, excessive demand rates will not be met, but instead, the system
will produce in such a manner that at least one of the machine groups will be occupied
100% of the time with controllable and uncontrollable activities. In this way, the
system capacity is not reduced due to overloading the factory with too much work-
in-process.
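The effect of the Level 1 interface on excessive demand can be sketched in Python. The proportional scaling used here is an assumption; the text only requires that the replacement target rates use the full capability of the system and be less than the demanded rates. The names `tau` and `capacity` and all numbers are hypothetical.

```python
def level1_targets(demand, tau, capacity):
    """demand[j]: demanded rate of part type j.
    tau[i][j]: processing time of part type j on machine group i.
    capacity[i]: long-term average number of available machines in group i.
    Returns target rates that keep every group at or below 100% load."""
    loads = [sum(t * d for t, d in zip(row, demand)) for row in tau]
    worst = max(load / cap for load, cap in zip(loads, capacity))
    # Assumption: excessive demand is scaled down uniformly until the
    # most heavily loaded machine group is exactly fully occupied.
    scale = min(1.0, 1.0 / worst) if worst > 0 else 1.0
    return [d * scale for d in demand]


tau = [[0.2, 0.0], [0.1, 0.3]]   # hours per part
capacity = [1.0, 0.8]            # machines
print(level1_targets([4.0, 2.0], tau, capacity))
```

With these numbers the second machine group would be loaded to 125%, so both target rates are reduced by the same factor until that group is fully occupied.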
At the high frequency end of the hierarchical controller (which is typically at
the level where the highest frequency controllable events are clustered), there is the
low level interface. At that level, there is one controller for each subset of machine
groups for which the control of production can be accomplished by simple rules. The
controller of a subset of machine groups at the low level interface is responsible for
providing the subset with raw material so that the higher level target production
rates can be met as closely as possible.
From the buffer immediately upstream of the low level interface to the point where
parts have completed all processing within the subset of machine groups, controllers
based on simple rules transfer each part from one operation to the next. Those
simple rules are based on the current repair and configuration status of machines
in the subset of groups, and on the detailed location of each part in the subset of
groups. An example of a subset of machine groups in which simple rules may be used
is a short flexible manufacturing station in which the completion of an operation on
a part triggers a rule-based controller to transfer the part immediately to the next
machine downstream.
3.5.1 Frequency Decomposition Infrastructure
The frequency decomposition infrastructure is used for systems which have short pro-
cesses (only a few process steps), and which have many different clusters of event
frequencies. Gershwin (1989) describes the details of this infrastructure. The con-
troller of the cell at the low level interface (Section 3.5) consists of one controller for
all the machines in the system and is only responsible for releasing parts from the
warehouse at the target rate supplied to it from the next higher cell.
Between the Level 1 cell and the low interface cell are one or more intermediate
cells. The cells are arranged in a cascade. Each of the intermediate cells has a
controller which responds to a unique cluster of events according to a model consistent
with the cluster's characteristic frequency. The controller provides target production
rates for the entire system based on its view of capacity.
As the target rates are handed down the cell cascade, the view of capacity becomes
more detailed as more and more frequent events become visible and are accounted for
individually. Eventually, the target rates have filtered down to the low level interface
where parts are ordered from the warehouse for processing on the machines in the
system. At that point, the target rates which are used by the low level controller
have accounted for any machine downtime due to uncontrollable events as well as the
performance of the system relative to previous production goals.
[Figure 3-3: Frequency Decomposition Infrastructure. Recoverable labels: Controller; Frequency of Events, from Low to High; Level k-1 Model, k-1 = 1 (High-Level Cell); Level k Model, k = 2; Level k+1 Model, k+1 = 3 (Low-Level Cell); Factory; Type j Parts.]
The only buffers contained within the cells of the frequency decomposition infras-
tructure are used by the rule-based controllers on each machine group as temporary
holding areas for parts during the transit between two operations. The buffers are
too small to decouple any two steps within a process.
Figure 3-3 shows a system whose control hierarchy follows the frequency decomposition
infrastructure. Level k Cell c receives Level k - 1 target rates, $u^{k-1}$, from
Level k - 1 Cell C. The controller of Level k Cell c converts the Level k - 1 target
rates, $u^{k-1}$, into Level k target rates, $u^{k}$, for its Level k + 1 component cell, Level
k + 1 Cell c*. Level k + 1 Cell c* sends loading commands to the warehouse which
supplies the machine.
3.5.2 Process Decomposition Infrastructure
The process decomposition infrastructure is described in Bai (1991a, 1991b) and is
used for systems which have long processes (many steps), and which have only one
cluster of event frequencies, other than operations. Therefore, there is only one
intermediate control level between the high interface level and the low interface level.
Due to the large number of steps in each process, it is possible to divide a process
into two or more semi-independent process segments. Each of the process segments
is controlled at the intermediate level by one of a number of independent cells.
The intermediate cells are separated from one another by buffers which are capable
of masking short term disruptions. The presence of the buffers forces two adjacent
cells to cease being independent of each other when there is a long term disruption,
according to Observation 4 of Section 2.8.2.
The highest level controller supplies each of the intermediate cell controllers with
the same set of production rate targets, based on demand and the long-term system
status. The intermediate level cell controllers translate those long-term target rates
into short term target rates based on the production through the local process seg-
ments. Each of the intermediate cells supplies target rates to a single low interface
level cell. The low interface level cell controller supplies its process segments with
raw material based on the target rates from its intermediate level parent and the
availability of parts in the entry buffer to the cell.
The purpose of the intermediate level buffers in the controller is to separate the
system into a set of semi-independent modules. Each of the modules operates without
a detailed knowledge of the rest of the system. However, when a disruption is large
enough to propagate across a number of modules at the intermediate control level,
that information is transmitted by the filling or emptying of buffers. When a buffer
either empties or fills up due to the disruption, the conditional constraints (2.19)
and (2.20) are imposed on the cell to which the effects of the disruption have been
transmitted.
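The coupling through buffers can be sketched in Python as follows. The exact form of the conditional constraints (2.19) and (2.20) is not reproduced here; the sketch assumes that an empty entry buffer limits a cell to the upstream delivery rate and a full exit buffer limits it to the downstream removal rate. All names are hypothetical.

```python
def constrained_rate(desired, entry_level, exit_level, exit_capacity,
                     upstream_rate, downstream_rate):
    """Return the production rate a cell can actually use for one segment."""
    rate = desired
    if entry_level <= 0:
        # Starved: cannot produce faster than parts arrive from upstream.
        rate = min(rate, upstream_rate)
    if exit_level >= exit_capacity:
        # Blocked: cannot produce faster than parts are removed downstream.
        rate = min(rate, downstream_rate)
    return rate


# Buffers neither empty nor full: the cell acts independently.
print(constrained_rate(5.0, 10, 10, 20, 3.0, 2.0))
```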
There may be even smaller buffers within each of the low interface level cells, but
those are too small to mask any disruption which would interrupt the production
flow. Those buffers serve as transfer points for parts which are traveling between two
steps in the same process segment.
3.5.3 Pyramid Decomposition Infrastructure
The pyramid decomposition infrastructure is used for systems which have both long
processes (large numbers of process steps) and multiple levels of event frequency
clusters. Therefore, there is a need for multiple intermediate control levels between
the high level interface and the low level interface (Section 3.5). In addition, there
is a need for many buffers in a range of sizes. Each buffer decouples disruptions of a
specific frequency and duration, depending on its size.
The infrastructure of cells in the pyramid decomposition is a combination of the
frequency decomposition and the process decomposition infrastructures. An example
pyramid hierarchy appears in Figure 3-4. There is one process consisting of four steps.
The process is controlled at three levels. There are more Level 2 cells than Level 1
cells, and more Level 3 cells than Level 2 cells. The buffers separating the cells at
each level get smaller as the hierarchy is descended.
The multiple levels of event frequency clusters lead to a cascade of cells similar
to that of the frequency decomposition infrastructure. The controller of Level k Cell
c responds to events which occur at frequencies near the Level k characteristic
frequency, fk. Similarly, the controller of its parent cell, Level k - 1 Cell C, responds
to events which occur at frequencies near the Level k - 1 characteristic frequency,
fk-1.
[Figure 3-4: Pyramid Decomposition Infrastructure. Recoverable labels: Long Term Demand Rates; Level 1; Level 1 Target Rates; Level 2; Level 2 Target Rates; Level 3; Loading Commands; Factory; P1; Cell C; Cell c; Cell c'; Process Decomposition; Higher; Lower.]
The large number of steps in each process implies that the time required for a
part to travel through the process is much longer than the Level k time scale, 1/fk.
Therefore, it is necessary to divide a process as seen by a Level k observer into a
number of Level k process segments. Each of these process segments is separated
from the next segment upstream and the next segment downstream by buffers of
Level k or higher, and thus can be controlled independently of the other segments
as long as its entry buffer is not empty and its exit buffer is not full. Each Level k process segment is
controlled by the cell which contains all the resources necessary to perform the steps
in the segment.
At Level k - 1, the characteristic time scale, 1/fk-1, is longer than that of Level
k. This implies that the number of steps which may be performed within the Level
k - 1 time scale is larger than the number at Level k. Therefore, each process may be
divided up into a smaller number of semi-independent process segments at Level k - 1
than at Level k. Each Level k - 1 process segment is contained in one of a number
of Level k - 1 cells, and is separated from the others by buffers of Level k - 1 or
higher. Note that each Level k - 1 process segment contains one or more Level k
process segments. In fact, the Level k - 1 target rate $u_j^{k-1}$ for Process Segment j in
Level k - 1 Cell C is transmitted to each of the Level k process segments whose steps
are contained within Level k - 1 Process Segment j.
The term "pyramid decomposition" infrastructure arises from the observation that
the number of cells increases as the hierarchy is descended. A single Level k - 1 Cell
C provides target rates, $u^{k-1}$, which are consistent with the Level k - 1 time horizon,
1/fk-1, to each of its component Level k cells. In turn, the component cell, Level k
Cell c, translates the Level k - 1 target rates, $u^{k-1}$, into Level k production rates,
$u^{k}$, which are consistent with the Level k time scale, 1/fk. Those Level k rates are
transmitted to each of the Level k + 1 component cells of Level k Cell c. As each
control level is reached, a greater amount of detail is incorporated into the choice
of production rates, although the number of resources which will see the production
rates decreases.
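The pyramid of cells and the downward flow of target rates can be sketched as a tree in Python. The `Cell` class, its fields, and the pass-through default for the rate translation are assumptions for illustration and do not describe the data structures of the simulation.

```python
from dataclasses import dataclass, field


@dataclass
class Cell:
    name: str
    level: int                      # k: larger k means higher frequency
    children: list = field(default_factory=list)
    target_in: float = 0.0          # Level k-1 rate received from the parent
    target_out: float = 0.0         # Level k rate handed to component cells

    def propagate(self, rate, translate=None):
        """Pass a target rate down the pyramid, one level at a time."""
        self.target_in = rate
        # Assumption: with no translate function the rate passes unchanged.
        self.target_out = translate(self, rate) if translate else rate
        for child in self.children:
            child.propagate(self.target_out, translate)


def cells_per_level(root):
    """Count the cells at each level of the hierarchy."""
    counts = {}
    stack = [root]
    while stack:
        cell = stack.pop()
        counts[cell.level] = counts.get(cell.level, 0) + 1
        stack.extend(cell.children)
    return counts


# Hypothetical pyramid: 1 Level 1 cell, 2 Level 2 cells, 4 Level 3 cells.
leaves = [Cell(f"L3-{n}", 3) for n in range(4)]
mid = [Cell("L2-a", 2, leaves[:2]), Cell("L2-b", 2, leaves[2:])]
root = Cell("L1", 1, mid)
root.propagate(5.0)
print(cells_per_level(root))
```

A real translation step would adjust the rate at each level using that level's view of capacity and surplus; the identity translation is used here only to show the direction of information flow.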
Therefore, a long-term production target rate is diffused throughout the system
by an infrastructure of semi-independent cells which operate at specific frequencies,
and eventually provides each resource with an appropriate amount of raw material in
order to meet the long-term production goals.
Because each Level k process segment is bounded by a Level k or higher buffer,
the designer of the pyramid decomposition infrastructure is able to specify the
distribution of work-in-process at each Level k time scale (1/fk) in the manufacturing
system. Higher level disruptions require larger buffers and more work-in-process,
while lower level disruptions require smaller buffers and less work-in-process (Obser-
vation 1, Section 2.8.2). The capability to control the distribution of work-in-process
while accounting for the effects of multiple duration disruptions allows for a more ef-
fective trade-off between the costs of inventory (material costs, rates of defects, etc.),
and the costs of disruptions (starvation and blockage of operational machines).
3.6 The Cell as a Building Block
This section details the components of a cell which permit it to function as a build-
ing block in a complex manufacturing system controller. The basic components are
defined. These include process decomposition within a cell (Section 3.5.2) and the
implications of using a flow rate approximation of production (Section 2.6). A model
of capacity is built using the failure model in Section 2.7 and the assumptions about
processes in Section 2.6. The cell controller function is described in terms of the
model of the factory which is used by the cell.
A detailed set of examples is given in Section 7.4 which demonstrate the cell as a
building block in the hierarchical control system.
3.6.1 The Cell as a Sub-Factory
Cell Definition Recall that Level k Cell c is defined to be a subset of machine
groups, buffers and process segments as viewed by a Level k observer (Section 3.5).
The controller of the cell specifies the times of controllable events and responds to
uncontrollable events which belong to Frequency Cluster k and occur within the
domain of the cell. It also specifies the rates of controllable events uk which occur at
frequencies higher than fk, within the domain of the cell. The Level k specifications
and responses of the controller in Level k Cell c are implemented by subordinate cells
which control the resources of Level k Cell c at Levels k + 1 and lower.
Level k Cell c is separated from the rest of the factory, as seen by a Level k observer,
by Level k and higher entry and exit buffers. The cell acts as an independent factory
as long as the entry and exit buffers to the cell's process segments are neither starved
nor blocked. Disruptions which occur within the cell are almost always invisible to the
outside when the disruption frequencies are fk or higher. Likewise, disruptions due
to events outside the cell are masked by the surrounding buffers if the characteristic
frequencies of the events are fk or higher.
Figure 3-5 is an example of a Level k cell which contains three machine groups
and two Level k + 1 or lower buffers. It has two process segments, each of which
requires two operations. The cell is separated from upstream and downstream cells
by four buffers of Level k or higher. Both process segments provide raw material for
the downstream cell, Level k Cell c'. Section 7.4 provides an in-depth example of how
the controller of Level k Cell c operates.
Process Decomposition Since the Level k Cell c is considered to be a sub-factory,
it controls only a subset of the steps of any given process. The subset of process steps
within the Level k cell is called a Level k process segment (Section 2.1.1). In the
context of Level k Cell c, the cell is surrounded by buffers of Level k or higher. Each
set of contiguous process steps which begins in a Level k or higher entry buffer of a
cell and ends in a Level k or higher exit buffer of the same cell is a single Level k
process segment. Each process segment has one or more steps, and there are one or
more process segments in each cell.
[Figure 3-5: Subfactory Example. Recoverable labels: Level k Cell c; Level k Cell c'.]
Each process segment appears as if it were a complete process from the point of
view of the cell. Each segment has a unique production rate, surplus, and performance
measure within the cell. However, unlike the total process where the entry is never
starved and the exit is never blocked, each process segment in a cell can be starved
or blocked. (The only exception is if the process segment coincides with either the
entry or exit of the total process.) The production rate for each process segment is set
by the cell controller independently of anything outside the cell, given that the entry
and exit buffers of the segment are not empty or full, respectively. This independence
allows the cell controller to deal with disturbances local to the cell without having to
account for the state of the system outside the cell.
In the overall system, process segments are delimited by the placement of buffers
in the hierarchy. The number of steps in a process segment can be controlled by
the choice of the size of buffers within a process. Large buffers divide a process into
high level process segments, while small buffers divide a process into low level process
segments.
Production as a Flow Define $T_j^c$ as the sum of the operation times for each step
in the Level k Process Segment j in Level k Cell c. The process segment production
can be viewed as a flow if the total time $T_j^c$ for a part to travel through Cell c is
much less than the characteristic time scale, 1/fk, of the cell:
$$T_j^c \ll \frac{1}{f_k} \qquad (3.1)$$
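Condition (3.1) can be checked numerically with a short Python sketch. The factor used to interpret "much less than" is an assumption; the text does not give a numerical threshold.

```python
def flow_approximation_holds(operation_times, f_k, margin=10.0):
    """Return True if the total travel time through the segment is smaller
    than the Level k time scale 1/f_k by at least the factor `margin`."""
    total_time = sum(operation_times)
    # Assumption: "much less than" means at least `margin` times smaller.
    return total_time * margin <= 1.0 / f_k


# Hypothetical segment with two operations of 0.02 and 0.03 hours.
print(flow_approximation_holds([0.02, 0.03], f_k=1.0))
print(flow_approximation_holds([0.02, 0.03], f_k=10.0))
```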
An important subtle point is that no step in a Level k process segment can be
viewed separately from the rest of the steps by a Level k observer. From the point
of view of a Level k + 1 or lower observer, an individual part visits each step in the
Level k process segment at different times. However, to a Level k observer, the time
for a part to travel from the first step to the last step in the segment is insignificant.
Therefore, a Level k observer sees part movements only as a flow of many parts
through the steps of the segment. Each step in the same process segment has the
same cumulative production, the same cumulative requirements, and when the control
policy is introduced, the same Level k hedging point and surplus. The control of
the steps in a Level k process segment has been reduced to the choice of a single
production rate, u, based on a single value of cumulative production, and cumulative
requirements.
At levels in the hierarchy where the characteristic frequency of measurement is
roughly equal to or greater than the production rate of a process, the flow rate
approximation no longer holds. Cells at those levels still have process segments, but
instead of treating part movement as a stream of parts, each movement of a part
must be treated as a single event. Even though the processing time is non-negligible,
a part is counted towards cumulative production as soon as it is released into the cell.
This is done so that only the number of parts which are needed to satisfy cumulative
requirements are released into the system.
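The release rule for levels where the flow approximation fails can be sketched in Python. The sketch assumes one release decision per time step and at most one part released per decision; these choices and the function name are illustrative assumptions.

```python
def simulate_release(cumulative_requirements):
    """Decide, step by step, whether to release a part into the cell.
    cumulative_requirements[t] is the cumulative requirement at step t."""
    cumulative_production = 0
    releases = []
    for required in cumulative_requirements:
        release = cumulative_production < required
        if release:
            # The part counts toward cumulative production at release,
            # not at completion, so no surplus parts are started.
            cumulative_production += 1
        releases.append(release)
    return releases, cumulative_production


print(simulate_release([1, 1, 2, 2, 3]))
```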
[Figure 3-6: Sample Capacity Set $\Omega^c(e^k, m^k)$ for Level k Cell c. Axes: $u_j$, parts per unit time. The set of feasible rates is bounded by three constraints: Machine Group 1, $\tau_{11} u_1 \le e_1 m_1$; Machine Group 2, $\tau_{22} u_2 \le e_2 m_2$; Machine Group 3, $\tau_{31} u_1 + \tau_{32} u_2 \le e_3 m_3$.]
3.6.2 Capacity of a Cell
Machine Group Capacity The control of the components of Level k Cell c is
based on a representation of capacity as seen by a Level k observer. The concept of cell
capacity is the same as that of system capacity of Section 2.4, but the notation is more
complex. The capacity set $\Omega^c(e^k(t), m^k(t), t)$ of the cell (defined in (3.12)) experiences
discrete changes at roughly the Level k characteristic frequency fk. Precise failure
and repair information is localized within the cell and is only transmitted to other
Level k cells by the status of its entry and exit buffers.
Figure 3-6 illustrates the capacity set for the Level k Cell c shown in Figure 3-5.
The terms used on the capacity set are developed in this section. Notice that each of
the three machine groups contributes one capacity constraint to the set. Machine Groups 1
and 2 only constrain a single process segment, whereas Machine Group 3 constrains
the combination of the two process segments. A specific example of a capacity set for
Level k Cell c is presented in Section 7.4.1.
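Membership in a capacity set of this shape can be tested with a few lines of Python. The sketch assumes one linear constraint per machine group, with the processing-time-weighted rates bounded by availability times the number of available machines; the numbers are hypothetical and are not those of Figure 3-6.

```python
def in_capacity_set(u, tau, e, m):
    """u[j]: production rate of process segment j.
    tau[i][j]: processing time of segment j at machine group i.
    e[i], m[i]: availability and number of available machines of group i."""
    if any(rate < 0 for rate in u):
        return False
    return all(
        sum(t * rate for t, rate in zip(row, u)) <= e_i * m_i
        for row, e_i, m_i in zip(tau, e, m)
    )


# Groups 1 and 2 each serve one segment; Group 3 serves both.
tau = [[0.5, 0.0], [0.0, 0.25], [0.2, 0.2]]
e = [0.9, 0.9, 0.9]
m = [1, 1, 1]
print(in_capacity_set((1.0, 2.0), tau, e, m))
print(in_capacity_set((1.7, 3.5), tau, e, m))
```

In the second call each rate is individually within the limits of Groups 1 and 2, but the combination exceeds the shared constraint of Group 3.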
Consider the Level k Cell c where the time to fill or empty its entry and exit
buffers is not very different than the time between Level k events within the cell. In
the current theory, the resources within the cell consist only of machine groups and
Level k + 1 or lower buffers. The capacity of the cell over the Level k characteristic
time scale, 1/fk, is a function of three components for each machine group in the
cell. The first component is the number of machines devoted to controllable events;
the second is the number of machines which are not currently available, as seen by a
Level k observer; and the third is the expected availability of those machines which
are currently available, as seen by a Level k observer.
Let the Level k rate of Type j events in Cell c be denoted by $u_j^k$. This rate
may represent the production rate for a process segment in the cell, the frequency
of performing preventative maintenance, the setup change frequency for a machine
group, a failure or repair, or the rate of any other activity which consumes time at a
resource and occurs at a frequency much greater than fk. In Hiercsim Versions 3.5
and 4.0, controllable events are limited to production operations and setup changes.
Setup changes are addressed in Chapters 5 and 6 and Hiercsim Version 4.0, and are
not considered in this chapter, nor in Chapter 4, nor in Hiercsim Version 3.5.
Consider Machine Group i in Level k Cell c which has a total of $n_i$ machines. Let
$\alpha_{ij}(t) = 1$ whenever Activity j is being performed at Machine Group i, and $\alpha_{ij}(t) = 0$
otherwise. The capacity set at time t for Machine Group i in its most general form is
$$\sum_{j} \alpha_{ij}(t) \le n_i \qquad (3.2)$$
This capacity set can be separated according to the level of activity performed
and whether or not the activity is controllable or uncontrollable.
1. $\{j \in F_i, L(j) > k\}$ is the set of all uncontrollable activities which occur at
Machine Group i much more frequently than fk.
2. $\{j \in P_i, L(j) > k\}$ is the set of all controllable activities performed at Machine
Group i which occur much more frequently than fk.
3. $\{j \in F_i, L(j) \le k\}$ is the set of failure modes on Machine Group i whose
long-term average frequencies are less than or equal to the Level k characteristic
frequency, fk.
4. $\{j \in P_i, L(j) \le k\}$ is the set of operations performed by Machine Group i at a
long-term average frequency less than or equal to fk.
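The capacity set (3.2), after it is segregated according to level and type of activities,
becomes
$$\sum_{\substack{j \in P_i \\ L(j) > k}} \alpha_{ij}(t) + \sum_{\substack{j \in F_i \\ L(j) > k}} \alpha_{ij}(t) + \sum_{\substack{j \in P_i \\ L(j) \le k}} \alpha_{ij}(t) + \sum_{\substack{j \in F_i \\ L(j) \le k}} \alpha_{ij}(t) \le n_i \qquad (3.3)$$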
When the Level k expectation is performed on (3.3), the capacity used by activities
at Level k + 1 and lower levels (L(j) > k) can be expressed in terms of the average rate
at which the activity is performed. For example, production of parts in Process Segment j
at rate $u_j^k$ requires $\tau_{ij} u_j^k$ operational machines in Machine Group i on the average,
where $\tau_{ij}$ is the total amount of processing time required for all steps in Process
Segment j which are performed at Machine Group i. If Machine Group i performs
many different controllable activities, then $\sum_{j \in P_i, L(j) > k} \tau_{ij} u_j^k$ is the average number of
operational machines in the group devoted to performing those activities at the rates
$u_j^k$, for all $j \in P_i$, $L(j) > k$. Uncontrollable lower level activities, $j \in F_i$, $L(j) > k$,
can be represented in a similar fashion. The Level k expected capacity set for Machine
Group i is
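$$\sum_{\substack{j \in P_i \\ L(j) > k}} \tau_{ij} u_j^k(t) + \sum_{\substack{j \in F_i \\ L(j) > k}} \tau_{ij} u_j^k(t) + \sum_{\substack{j \in P_i \\ L(j) \le k}} \alpha_{ij}(t) + \sum_{\substack{j \in F_i \\ L(j) \le k}} \alpha_{ij}(t) \le n_i \qquad (3.4)$$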
Any machine which is occupied by any uncontrollable Activity j, $j \in F_i$, or any
high level controllable Activity j, $j \in P_i$, $L(j) \le k$, cannot be used by the cell for
any lower level controllable Activity j, $j \in P_i$, $L(j) > k$. Therefore, (3.4) can be
rearranged so that all lower level activities which must be specified by Level k Cell c
are on the left hand side of (3.4), and all activities which take away from the overall
capacity of Level k Cell c can be included on the right hand side of (3.4).
$$\sum_{\substack{j \in P_i \\ L(j) > k}} \tau_{ij} u_j^k(t) \le n_i - \sum_{\substack{j \in P_i \\ L(j) \le k}} \alpha_{ij}(t) - \sum_{\substack{j \in F_i \\ L(j) \le k}} \alpha_{ij}(t) - \sum_{\substack{j \in F_i \\ L(j) > k}} \tau_{ij} u_j^k(t) \qquad (3.5)$$
Define $m_i^k(t)$ to be the total number of machines in Machine Group i as seen by a
Level k observer at time t which are available for lower level activities. The number
of machines available for high frequency activities, as seen by a Level k observer, is
$$m_i^k(t) = n_i - \sum_{\substack{j \in F_i \\ L(j) \le k}} \alpha_{ij}(t) - \sum_{\substack{j \in P_i \\ L(j) \le k}} \alpha_{ij}(t) \tag{3.6}$$
The value of $m_i^k(t)$ is constrained to be non-negative, and no greater than the total
number of machines in Group i:
$$0 \le m_i^k(t) \le n_i \tag{3.7}$$
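The capacity set (3.5) can be rewritten using the definition of $m_i^k(t)$,
$$\sum_{\substack{j \in P_i \\ L(j) > k}} \tau_{ij} u_{cj}^k(t) \le m_i^k(t) - \sum_{\substack{j \in F_i \\ L(j) > k}} \tau_{ij} u_{cj}^k(t) \tag{3.8}$$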
Note that even though there are $m_i^k(t)$ machines in Machine Group i available for
performing activities whose frequencies are much higher than $f_k$, as seen by a Level k
observer, some of the time on those machines will be used by lower level uncontrollable
activities. Therefore, the availability of machines in Machine Group i for lower level
controllable activities will be less than or equal to $m_i^k(t)$.
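The count of machines left over for high-frequency activities, as defined in (3.6) and bounded by (3.7), can be illustrated with a minimal sketch. The function name and the list-based representation of occupied machines are assumptions made for this illustration; they are not taken from the simulation code.

```python
def machines_available(n_i, low_freq_occupancy):
    """Machines left for high-frequency activities, in the spirit of (3.6).

    n_i: total number of machines in the group.
    low_freq_occupancy: number of machines currently held by each activity
        with L(j) <= k (failure modes and low-frequency operations alike).
    """
    m = n_i - sum(low_freq_occupancy)
    if not 0 <= m <= n_i:  # bound (3.7)
        raise ValueError("occupancy is inconsistent with the group size")
    return m


# A group of five machines with one machine down and one held by a
# low-frequency operation (illustrative numbers only).
print(machines_available(5, [1, 1]))
```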
The capacity set (3.8) is based on the assumption that Machine Group i is isolated
from disturbances elsewhere in the manufacturing system. It is an area of further
research to determine how accurate this model is in the hierarchical framework with
finite buffers.
Let $e_i^k(t)$ be the fraction of time an operational machine is available for performing
lower level controllable activities, as seen by a Level k observer.
$$e_i^k(t) = 1 - \frac{1}{m_i^k(t)} \sum_{\substack{j \in F_i \\ L(j) > k}} \tau_{ij} u_{cj}^k(t), \quad \text{if } m_i^k(t) > 0 \tag{3.9}$$
If $m_i^k(t) = 0$, then $e_i^k(t) = 0$.
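As a small numerical illustration of (3.9), the sketch below computes the fraction of time an operational machine remains available once high-frequency uncontrollable activities have taken their share. The function name and argument layout are hypothetical.

```python
def availability_fraction(m, tau, u):
    """Fraction of machine time left for controllable activities, as in (3.9).

    m:   number of machines available at Level k.
    tau: duration of each high-frequency uncontrollable activity.
    u:   average rate of each of those activities (same order as tau).
    """
    if m == 0:
        return 0.0  # convention stated in the text for an empty group
    return 1.0 - sum(t * r for t, r in zip(tau, u)) / m


# Three machines, two uncontrollable activities occupying 0.2 machines each.
e = availability_fraction(3, [2.0, 0.5], [0.1, 0.4])
print(round(e, 4))
```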
In the current theory, the expectation of (3.9) taken using the reliability assump-
tions of Section 2.7 becomes the equation (2.15). However, only those failure modes
of Machine Group i which occur at a long-term average frequency much greater than
$f_k$ are included in (2.13)-(2.15). Therefore, the value of $e_i^k(t)$ is computed using (2.15)
as follows
$$e_i^k(t) = \frac{\displaystyle\prod_{j \in \mathrm{TDF}^k} \frac{\mathrm{MTTF}_j}{\mathrm{MTTF}_j + \mathrm{MTTR}_j}}{1 + \displaystyle\sum_{j \in \mathrm{ODF}^k} \frac{\mathrm{MTTR}_j}{\mathrm{MTTF}_j}} \tag{3.10}$$
where $j \in \mathrm{TDF}^k$ are all time-dependent failures such that $L(j) > k$ and $j \in \mathrm{ODF}^k$
are all operation-dependent failures such that $L(j) > k$.
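A hedged sketch of how an availability of this kind might be computed from reliability data follows. It assumes that time-dependent failures contribute a product of MTTF/(MTTF + MTTR) factors and that operation-dependent failures contribute a denominator of one plus the sum of MTTR/MTTF ratios; that reading of the formula, the function name, and the data layout are all assumptions of this example.

```python
def expected_availability(tdf, odf):
    """Expected availability from reliability parameters.

    tdf: list of (MTTF, MTTR) pairs for time-dependent failures, L(j) > k.
    odf: list of (MTTF, MTTR) pairs for operation-dependent failures, L(j) > k.
    Assumption: TDF factors multiply; ODF ratios add in the denominator.
    """
    numerator = 1.0
    for mttf, mttr in tdf:
        numerator *= mttf / (mttf + mttr)
    denominator = 1.0 + sum(mttr / mttf for mttf, mttr in odf)
    return numerator / denominator


# One time-dependent failure (up 90, down 10) and one operation-dependent
# failure (MTTF 100, MTTR 25); the numbers are illustrative only.
print(round(expected_availability([(90.0, 10.0)], [(100.0, 25.0)]), 4))
```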
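The Level k capacity set of Machine Group i becomes
$$\sum_{\substack{j \in P_i \\ L(j) > k}} \tau_{ij} u_{cj}^k(t) \le e_i^k(t)\, m_i^k(t)$$
$$u_{cj}^k(t) \ge 0 \quad \forall j \in P_i,\ L(j) > k \tag{3.11}$$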
This capacity set may be expanded to include multiple machine groups by adding
one linear constraint (3.11) for each additional machine group. This capacity estimate
is accurate provided that any Level k + 1 and higher buffers internal to Level k Cell c
are of sufficient size to decouple the effects of any Level k + 1 or lower disturbances.
The simulation in Section 7.2.2 demonstrates how the capacity set estimate changes
when the internal buffers are too small to decouple Level k + 1 and lower disturbances.
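Define the Level k capacity of Level k Cell c as $\Omega_c^k(e^k m^k, t)$. It is the set of all
possible rates for Level k Cell c. The set can be written as
$$\Omega_c^k(e^k m^k, t) = \Big\{ u_c^k(t) \;:\; \sum_{\substack{j \in P_i \\ L(j) > k}} \tau_{ij} u_{cj}^k(t) \le e_i^k(t)\, m_i^k(t) \ \ \forall i \in c; \quad u_{cj}^k(t) \ge 0 \ \ \forall j \in P_i,\ L(j) > k \Big\} \tag{3.12}$$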
Note that setup changes are not considered in (3.12). They are incorporated into
the capacity set (5.6) in Chapter 5. Note also that the bottleneck machine group in
the cell is the one which has the least amount of idle time after the production rates
$u_c^k$ have been chosen within the constraints of (3.12). This capacity set is based on
the assumption that no entry buffers are empty and no exit buffers are full.
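The capacity set and the notion of a bottleneck group can be sketched in a few lines. The helper names, the matrix layout (one row of processing times per machine group), and the choice to measure idle time as unused machine capacity are assumptions of this illustration.

```python
def group_loads(tau, u):
    """Machine occupation of each group: sum over j of tau[i][j] * u[j]."""
    return [sum(t * r for t, r in zip(row, u)) for row in tau]


def is_feasible(tau, u, capacity, tol=1e-9):
    """True if the rate vector u lies in the capacity set.

    capacity[i] plays the role of e_i * m_i for machine group i.
    """
    if any(r < 0 for r in u):
        return False
    return all(load <= cap + tol
               for load, cap in zip(group_loads(tau, u), capacity))


def bottleneck(tau, u, capacity):
    """Index of the group with the least unused capacity."""
    idle = [cap - load for load, cap in zip(group_loads(tau, u), capacity)]
    return min(range(len(idle)), key=idle.__getitem__)


# Three machine groups, two process segments (illustrative numbers only).
tau = [[1.0, 0.0], [0.0, 2.0], [0.5, 0.5]]
capacity = [2.0, 1.8, 1.0]
rates = [1.0, 0.8]
print(is_feasible(tau, rates, capacity), bottleneck(tau, rates, capacity))
```

In this example the third group carries a load of 0.9 against a capacity of 1.0, so it is the group reported as the bottleneck.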
Representation of the Outside System Level k Cell c acts as an independent
factory because it is able to approximate the system outside of its domain in a rea-
sonable way. Events which occur at frequencies $f \ll f_k$ are treated as though they
never occur, and those states $\alpha_c^k$ which are affected by those events are considered
to remain constant forever. Once such an event does occur, however, the cell must
respond to the event and subsequent change in state, $\alpha_c^k$, but it does not have a model
of when the next occurrence will happen. Events which occur at frequencies $f \gg f_k$
are modeled as aggregate streams of events using rate approximations. The effects of
uncontrollable high frequency events are averaged out from the overall machine avail-
ability using the method of Section 2.7.2. The rates of controllable high frequency
events, $u_{cj}^k$, are set by the controller of the cell, which then assumes that those rates
will be met exactly by controllers at lower levels in the hierarchy.
The system state at the same time scale, but outside the spatial domain of the
cell, is summarized by the status of the buffers surrounding the cell. The amount
of material, $b^k$, and its rate of change, $\dot{b}^k$, in each entry and exit buffer are monitored
by a Level k observer. During any period of time when an entry buffer is empty or
an exit buffer is full, the Level k controller adds the conditional flow rate constraints
(2.19) and (2.20) from the outside system into the cell's capacity set $\Omega_c^k(e^k(t)m^k(t), t)$
(3.12). Otherwise, the status of the system at Level k outside of the domain of Level
k Cell c is ignored in the choice of Level k rates, $u_c^k$.
Open and Closed Flow Rates In the perception of a Level k observer, all the
steps in Process Segment j of Level k Cell c are performed at the same production
rate, $u_{cj}^k$. This rate is limited by the slowest step in the process segment at any given
instant in time. If any of the resources in the process segment are not set up with the
correct tools, or a resource required for the process segment has a Level k or higher
failure, or an entry or exit buffer prevents parts from being loaded or removed from
the point of view of a Level k observer, then the entire process segment is shut down.
This severe restriction on the production rate of a process segment prevents ex-
cessive buildup of work-in-process within the cell. It also allows the cell to allocate
its resources for the production of parts for those process segments which are open.
In the long run, this policy will enable the cell to make efficient use of its capacity.
Controller Data within a Cell Each cell in the hierarchy has a controller which
contains all support data necessary to determine production rates for each of its pro-
cess segments. Each process segment that passes through the cell has a surplus value,
production target, and production rate. The capacity of the cell is independent of the
other cells (apart from blockage and starvation), and depends on the operation times
and flow paths of the cell's process segments, and machine reliability parameters. The
cell contains information and parameters for its controller. The algorithm employed
by the controller is dependent only on the time scale of the cell and the nature of the
process segments. The cell's performance is recorded in measures which include cycle
time, work-in-process, and total production for each process segment.
3.6.3 Cell Controller Function
The controller of Level k Cell c must accomplish two functions. It schedules events
within Cell c which occur at a long-term average frequency near the Level k charac-
teristic frequency, $f_k$. It also chooses rates, $u_{cj}^k$, of controllable events j which occur
at frequencies much higher than $f_k$. Those rates and times are chosen to be always
within the Level k capacity of the cell, and to match as closely as possible the re-
quirements from the Level k - 1 Cell C which contains Cell c. Whenever any Level k
event occurs, the controller of Cell c must reschedule the Level k controllable events,
and recompute the values of $u_{cj}^k$.
Let the Level k - 1 target rate for controllable activity $j \in P$ from Cell C be
denoted by $u_{Cj}^{k-1}$. It is computed by Level k - 1 Cell C over the time horizon of
Level k - 1, and is assumed to be feasible in Level k Cell c over the Level k - 1 time
horizon. At any given time instant, however, the target rate, $u_{Cj}^{k-1}$, may not be feasible
due to occurrences of Level k events. Nevertheless, there are many Level k events over
the time horizon of Level k - 1, and the target rate, $u_{Cj}^{k-1}$, will be in the interior of the
Level k capacity set $\Omega_c^k(e^k(t)m^k(t), t)$ most of the time. Therefore, during the periods
when $u_{Cj}^{k-1}$ is in the interior of the Level k capacity set $\Omega_c^k(e^k(t)m^k(t), t)$, the Level k
controller is able to choose a rate, $u_{cj}^k$, which exceeds $u_{Cj}^{k-1}$ and thus makes up for the
periods of time when $u_{Cj}^{k-1}$ is infeasible.
Let $E_k$ be the Level k conditional expectation operator (Gershwin, 1989). It is
the conditional expectation, given that all quantities which vary at a frequency less
than or roughly equal to $f_k$ remain constant at their current values at time t. All
other quantities are allowed to vary.
Over the Level k - 1 time scale, the Level k capacity set $\Omega_c^k(e^k(t)m^k(t), t)$ will
change many times, allowing many different values of Level k controllable rates, $u_{cj}^k$,
to be chosen. Since the target rates of controllable events, $u_{Cj}^{k-1}$, were chosen to be
feasible over the Level k - 1 time scale, the Level k controller should be able to
choose a sequence of rates $u_{cj}^k$ such that their expectation over the Level k - 1 time
scale $1/f_{k-1}$ is equal to the target rates $u_{Cj}^{k-1}$:
$$E_{k-1}\, u_{cj}^k = u_{Cj}^{k-1} \tag{3.13}$$
A proof of this relationship between rates at different levels in the hierarchy ap-
pears in Gershwin (1989).
3.7 Dynamic Programming Approach
3.7.1 Problem Sketch
The system controller must meet production requirements in an uncertain environ-
ment. Even though the general aspects of the uncertainty can be characterized, the
controller can never know the precise details of future events. This thesis's approach
to control in such an environment is based on dynamic programming. The complete
dynamic programming formulation appears in Gershwin (1993) and in Kimemia and
Gershwin (1983).
Surplus Definition Consider Level k Cell c. Define the Level k surplus, $x_{cj}^k$, for
Process Segment j in Level k Cell c to be the difference between cumulative production
and cumulative requirements, as seen by a Level k observer. The Level k cumulative
production is the integral of the Level k rate, $u_{cj}^k$; the Level k cumulative requirements
is the integral of the Level k - 1 target rate, $u_{Cj}^{k-1}$. The surplus can be written in
vector form (where the subscript j has been dropped, to imply that all controllable
events, $j \in P$, are considered) as
$$x_c^k(t) = \int \left( u_c^k(t) - u_C^{k-1}(t) \right) dt \tag{3.14}$$
The values for the target production rates, $u_C^{k-1}$, are considered to be constant by
the Level k observer.
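The surplus defined above is simply the running integral of production rate minus target rate. A minimal discrete-time sketch follows; the forward-Euler step, the zero initial surplus, and the function name are assumptions of this example rather than features of the simulation.

```python
def integrate_surplus(x0, rates, targets, dt):
    """Accumulate surplus x += (u - u_target) * dt over successive steps.

    x0:      surplus at the start of the horizon.
    rates:   production rate chosen in each step.
    targets: target rate from the next higher level in each step.
    dt:      step length.
    Returns the surplus at the start and after every step.
    """
    x = x0
    history = [x]
    for u, d in zip(rates, targets):
        x += (u - d) * dt
        history.append(x)
    return history


# Producing at twice the target builds surplus; an outage draws it down.
print(integrate_surplus(0.0, [2.0, 2.0, 0.0, 2.0], [1.0, 1.0, 1.0, 1.0], 1.0))
```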
Dynamic Program Recall that the Level k cell states $\alpha_c^k(t)$ are those that are
influenced by events whose frequencies are of the same order of magnitude as $f_k$ or
lower.
In order to achieve its goal of satisfying requirements with a minimal amount
of surplus or inventory (described in Section 3.6.3), the controller of Level k Cell c
satisfies (3.13) by keeping $u_c^k$ near $u_C^{k-1}$. Kimemia and Gershwin formulated a dynamic
programming problem in which they defined a convex function g such that $g(0) = 0$;
$g(x_c^k) \ge 0\ \forall x_c^k$; and $\lim_{\|x_c^k\| \to \infty} g(x_c^k) = \infty$. They also defined the Level k cost-to-go
as a function of surplus and cell state: $J_c^k(x_c^k(t), \alpha_c^k(t), t)$. The Level k cost-to-go,
$J_c^k$, is determined by the rate vector $u_c^k(x_c^k, \alpha_c^k, t)$ which minimizes the total cost of
production over some long time period T, given initial conditions at $t_0$:
$$J_c^k\big(x_c^k(t_0), \alpha_c^k(t_0), t_0\big) = \min_{u_c^k} E_{k-1} \left\{ \int_{t_0}^{T} g\big(x_c^k(t)\big)\, dt \;\middle|\; x_c^k(t_0),\ \alpha_c^k(t_0) \right\} \tag{3.15}$$
In the case of Level k Cell c, the period T should be long enough so that the
dynamic programming problem has a time-invariant solution, $u_c^k(x_c^k, \alpha_c^k)$. Thus,
$$T - t_0 \gg 1/f_k. \tag{3.16}$$
Gershwin (1993) approximated the cost-to-go function by a time dependent com-
ponent, $J_c^{k\prime}(t)$, and a surplus and event dependent component, $W_c^k(x_c^k, \alpha_c^k)$:
$$J_c^k \approx J_c^{k\prime}(T - t_0) + W_c^k(x_c^k, \alpha_c^k) \tag{3.17}$$
assuming that the time interval, $T - t_0$, is large and that the target production rates
$u_C^{k-1}$ from Level k - 1 Cell C are feasible over the Level k - 1 time period in Level
k Cell c. Note that $W_c^k$ is not a function of t. Note also that minimizing $J_c^k$ with
respect to the surplus $x_c^k$ is the same as minimizing $W_c^k$ with respect to $x_c^k$.
If the component $W_c^k$ of (3.17) is small, then $x_c^k$ must be small for all t, $t_0 \le t \le T$,
due to the nature of the function g. Equation (3.14) then implies that $u_c^k$ is near $u_C^{k-1}$.
Kimemia and Gershwin (1983) derived a Bellman equation for the minimization
problem (3.15). They subsequently approximated the solution by assuming that
the time-independent component, $W_c^k$, of the cost-to-go function $J_c^k$ was a known
quadratic function. The maximum principle determined that the instantaneous pro-
duction rate vector $u_c^k$ for Level k Cell c is the solution to the minimization
$$\min_{u_c^k} \sum_{j \in P} \frac{\partial W_c^k}{\partial x_{cj}^k}\, u_{cj}^k(t) \tag{3.18}$$
subject to the capacity set $\Omega_c^k(e^k(t)m^k(t), t)$:
$$\sum_{j} \tau_{ij} u_{cj}^k(t) \le e_i^k(t)\, m_i^k(t) \quad \forall i \in c$$
$$u_{cj}^k(t) \ge 0 \quad \forall j \in P$$
Note that this capacity set assumes that the cell is operating independently of
the rest of the system. If there were empty entry buffers or full exit buffers, then the
capacity set would have the additional conditional constraints (2.19) and (2.20).
3.7.2 Dynamic Programming Problem Approximation
Quadratic Approximation An approximate solution of the dynamic program-
ming problem (3.15) was first presented by Gershwin, Akella, and Choong (1985).
The approximation was made because any exact or numerical solution would en-
counter the "curse of dimensionality" and become unmanageable. Their solution
used a quadratic approximation, $W_c^k(x_c^k, \alpha_c^k, t)$, in the Level k surplus of Cell c, $x_c^k$,
to the cost-to-go function $J_c^k(x_c^k, \alpha_c^k, t)$. Once $J_c^k$ is specified, then the minimization
can be performed.
Even though this thesis considers systems more complicated than the Flexible
Manufacturing Systems (FMS) studied in Kimemia and Gershwin (1983), that control
algorithm appeared as if it could be extended to more complicated systems. The key
to the extension of their work to a general system lies in the decomposition of the
factory by using frequency decomposition techniques (Gershwin, 1989), and process
decomposition techniques (Darakananda, 1989; Bai and Gershwin, 1990a and 1990b;
Bai, 1991a and 1991b; and Violette and Gershwin, 1991). In that manner, a general
manufacturing system can be considered for control purposes as a set of smaller
subfactories, each of which satisfies the assumptions of Kimemia and Gershwin (1983).
The approximate cost function, $W_c^k$, for the Level k Cell c can be written in matrix
form, with the vector of surplus $x_c^k$ and the coefficient matrix $A_c^k(\alpha_c^k)$, vector $B_c^k(\alpha_c^k)$,
and scalar $C_c^k$:
$$W_c^k(x_c^k) = \frac{1}{2} (x_c^k)^T A_c^k x_c^k + (B_c^k)^T x_c^k + C_c^k \tag{3.19}$$
This approximation is such that a greater cost arises for a greater deviation from
some ideal value of surplus. Components which are further behind generate greater
costs. Those costs are used as a relative priority scheme which allocates more resources
to the components with greater cost, thus serving as a feedback controller on surplus
which drives the entire system back to its ideal value.
Note that the gradient of (3.19) with respect to the surplus vector $x_c^k$ is a linear
function of surplus $x_c^k$:
$$\frac{\partial W_c^k}{\partial x_c^k} = A_c^k x_c^k + B_c^k \tag{3.20}$$
The minimum of $W_c^k$ occurs when $x_c^k = -(A_c^k)^{-1} B_c^k$. This value of the surplus
is called the hedging point, $z_c^k$. By setting $B_c^k = -A_c^k z_c^k$, the gradient of (3.19) with
respect to the surplus $x_c^k$ can be written as
$$\frac{\partial W_c^k}{\partial x_c^k} = A_c^k (x_c^k - z_c^k). \tag{3.21}$$
In the current theory, the values of $A_c^k$ and $z_c^k$ are dependent on the instantaneous
state of the cell, $\alpha_c^k$, in order to account for dynamic changes in Cell c at the Level
k characteristic frequency $f_k$. However, the values of $A_c^k$ and of $z_c^k$ are difficult to
compute exactly.
In the current implementation, the values of $A_c^k$ and $z_c^k$ are independent of $\alpha_c^k$
and are based on the long-term average reliability of resources in the cell. The less
reliable the cell resources are, the greater the hedging point has to be in order to
minimize the cost of being behind. This is due to the large fluctuations in surplus as
disruptions hinder production. In addition, $A_c^k$ is assumed to be a diagonal matrix
which represents the cost of deviation from the hedging point for each process segment
in the cell.
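With a diagonal coefficient matrix, the gradient of the quadratic cost reduces to one scalar per process segment: the diagonal entry times the distance of the surplus from its hedging point. The sketch below illustrates this; the function names and the sample numbers are hypothetical, and the cost is computed only up to an additive constant.

```python
def cost_gradient(a_diag, x, z):
    """Gradient of the quadratic cost for diagonal A: a_j * (x_j - z_j)."""
    return [a * (xi - zi) for a, xi, zi in zip(a_diag, x, z)]


def quadratic_cost(a_diag, x, z):
    """Quadratic cost measured from the hedging point, additive constant dropped."""
    return 0.5 * sum(a * (xi - zi) ** 2 for a, xi, zi in zip(a_diag, x, z))


# Segment 0 is five parts below its hedging point; segment 1 sits on it.
a_diag = [1.0, 2.0]
surplus = [-3.0, 1.0]
hedging = [2.0, 1.0]
print(cost_gradient(a_diag, surplus, hedging))
print(quadratic_cost(a_diag, surplus, hedging))
```

A negative gradient component marks a segment that is behind its hedging point, which is what gives that segment priority in the allocation of capacity.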
Linear Program The constraints (3.12) placed upon the possible values of control-
lable rates $u_{cj}^k$, in Level k Cell c, are linear in those rates. In addition, the quadratic
approximation $W_c^k$ to the cost-to-go function $J_c^k$ gives a gradient, $\partial W_c^k / \partial x_c^k$, which is
linear in $x_c^k$.
Since both the objective function (3.18) and the constraints (3.12) on $u_c^k$ are linear
in $u_c^k$, the rates, $u_{cj}^k$, are determined by a linear program which can be written as
$$\min_{u_c^k} \sum_{j \in P} A_{cj}^k \left( x_{cj}^k - z_{cj}^k \right) u_{cj}^k \tag{3.22}$$
subject to the capacity set
$$\sum_{j \in P_i} \tau_{ij} u_{cj}^k \le e_i^k m_i^k \quad \forall i \in c$$
$$u_{cj}^k \ge 0 \quad \forall j \in P$$
when the Level k Cell c is neither starved nor blocked. In the case that the cell is
starved or blocked, the conditional constraints (2.19) and (2.20) impose further limits
on $u_c^k$.
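For a cell containing a single machine group, the linear program has only one capacity constraint, and its solution can be written down directly: all capacity goes to the segment whose cost coefficient per unit of machine time is most negative, and nothing is produced if no coefficient is negative. The sketch below illustrates that special case only; a cell with several machine groups would need a general linear programming solver. The function name and argument layout are assumptions of this example, and ties between segments (the boundaries discussed in Sections 4.7 and 4.8) are not treated.

```python
def allocate_single_group(a_diag, x, z, tau, capacity):
    """Solve min sum_j a_j (x_j - z_j) u_j  s.t.  sum_j tau_j u_j <= capacity, u >= 0.

    Valid for one machine group only; every tau_j is assumed positive.
    """
    coeffs = [a * (xi - zi) for a, xi, zi in zip(a_diag, x, z)]
    # Cost change per unit of machine time spent on each segment.
    ratios = [c / t for c, t in zip(coeffs, tau)]
    best = min(range(len(ratios)), key=ratios.__getitem__)
    u = [0.0] * len(tau)
    if ratios[best] < 0:
        u[best] = capacity / tau[best]
    return u


# Segment 0 is further behind per unit of machine time, so it gets everything.
print(allocate_single_group([1.0, 1.0], [-4.0, -1.0], [1.0, 1.0], [2.0, 1.0], 1.8))
```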
This linear program is the heart of the hedging point strategy (Gershwin, 1989).
The Level k controller of Cell c tries to ensure that the average of Level k production
is as close as possible to the Level k - 1 target rates. The hedging point strategy
embodied in (3.22) allocates available capacity to the process segments which need
it the most. The feedback of this system lies in the objective function (3.22): as
production occurs for those process segments j which are farthest behind, the gap
between the surplus and the hedging point, $x_{cj}^k - z_{cj}^k$, closes. On the other hand, those
process segments which were close to their hedging points experience an increase in
the gap between their surplus and hedging points. Eventually, if the Level k - 1 target
production rates are within the Level k capacity $\Omega_c^k(e^k(t)m^k(t), t)$, all surpluses will
reach their respective hedging points.
Suppose that the component of the objective function $A_{cj}^k (x_{cj}^k - z_{cj}^k)$ of (3.22)
corresponding to Process Segment j is much more negative than those of all other
process segments. For certain initial conditions, all capacity will be allocated to
producing parts in Process Segment j. As time progresses, the magnitude of
$A_{cj}^k (x_{cj}^k - z_{cj}^k)$ will approach zero while another Process Segment j', with component
$A_{cj'}^k (x_{cj'}^k - z_{cj'}^k)$, will fall further behind. At that point, capacity must be shared
between Process Segments j and j'. Eventually, if the target rates, $u_C^{k-1}$, are feasible,
and no other Level k or higher events occur, each of the components of the surplus,
$x_{cj}^k$, is driven to its hedging point, $z_{cj}^k$.
Sections 4.7 and 4.8 describe the implications of changes in relative cost over time
to the behavior of the hedging point strategy algorithm. Those sections describe
boundaries in surplus space which are hyperplanes along which more than one ca-
pacity allocation is optimal. These boundaries require special treatment in the linear
program (3.22) in order to pick a combination of the optimal production rates and
avoid the chattering described in Gershwin, Akella, and Choong (1985).
3.8 Flow Constraints Imposed from Adjacent Cells
The focus of this thesis is to describe a manufacturing system controller which is
composed of many semi-independent cell controllers. Each of the cells in the system
contains a subset of resources and buffers. The controller of a cell regulates the
production of parts in each of the cell's process segments. The system outside of a
cell is summarized by the status of the entry and exit buffers for each of its process
segments. The status of a buffer is given by the amount of material, $b$, in the buffer,
and the rate at which material is flowing into the buffer, $\dot{b}$.
While the entry buffers of a cell are not empty, and the exit buffers are not full, the
cell controller is able to operate independently of the rest of the system (Observation
3, Section 2.8.2). However, when either the entry buffer of a process segment is empty,
or the exit buffer is full, the production of parts for that segment is limited by the
flow of material through the buffer (Observation 4, Section 2.8.2). An empty entry
buffer or a full exit buffer of a process segment is considered to be failed.
Consider Level k Cell c where the characteristic frequency of Level k events, $f_k$, is
much lower than the long-term average frequency of operations. Production in Process
Segment j in Cell c is represented by the rate $u_{cj}^k$. The capacity set $\Omega_c^k(e^k(t)m^k(t), t)$
of the cell consists of one linear constraint (3.12) on the rate vector, $u_c^k$, for each
Machine Group i in the cell.
Recall that $\tau_{ij} u_{cj}^k$ is the occupation of Machine Group i by the high-frequency
Activity j. Since the state $\alpha_c^k$ of the cell only changes with a frequency less than or
equal to the Level k characteristic frequency, $f_k$, the occupation of Machine Group i
for all high-frequency activities must be less than the number of operational machines,
$m_i^k$, given by (3.6), multiplied by the fraction of time the machine is not occupied
by uncontrollable high frequency events, $e_i^k$. The value of $e_i^k m_i^k$ is constant over the
Level k time scale, $1/f_k$. In addition, the operation time, $\tau_{ij}$, of each high frequency
activity is constant. Therefore, the maximum production rate for Type j parts for a
machine under a given high level cell state $\alpha_c^k$ is bounded by a constant value, equal
to $e_i^k m_i^k / \tau_{ij}$.
The cell is surrounded by Level k or higher buffers. The starvation or blockage
status of those buffers changes with a frequency of the same magnitude or lower as
the Level k characteristic frequency, fk. Therefore, any failed buffer at the entrance
or exit of Process Segment j will impose a constant linear constraint on Level k Cell
c for a time period of duration on the order of $1/f_k$, thus limiting production rate $u_{cj}^k$
in the same manner as if the buffer were a machine.
From the point of view of a Level k observer, the effect of a disturbance exterior
to the cell on Process Segment j can be summarized by a single virtual machine,
imposed whenever the flow rate through either the entry buffer or exit buffer constricts
production. The virtual machine operates as if it were added to the list of resources
in the cell.
Figure 3-7 shows a schematic of Level k Cell c in which one of its process segments
is blocked by a full downstream buffer. This constraint is represented by a virtual
machine which has been added immediately upstream of the full buffer. The virtual
machine is added to the capacity set $\Omega_c^k(e^k(t)m^k(t), t)$ of the cell.
Figure 3-7: Virtual Machine Added to Level k Cell c
Figure 3-8 shows how the Level k capacity set is affected by the Level k virtual
machine constraint. This new constraint allows the cell controller to allocate more
resources to the segment which is not blocked (if they are needed) by limiting the
maximum production rate of the blocked segment (see footnote 2). An example which
demonstrates virtual machines can be found in Section 7.2.4.
Consider Process j in which Step n is performed by a resource in Level k Cell c
and Step n + 1 is performed by a resource in Level k Cell c'. The production rate of
the process segment in Level k Cell c which contains Step n is denoted by $u_{cj}^k$. The
production rate of the process segment in Level k Cell c' which contains Step n + 1 is
denoted by $u_{c'j}^k$. Let Buffer $\beta_{nj}$ be the Level k buffer which decouples the two cells.
The amount of material in Buffer $\beta_{nj}$ as seen by a Level k observer is denoted by $b_{nj}^k$,
while the maximum buffer size is denoted by $B_{nj}$.
Footnote 2. When neither the entry buffer nor the exit buffer of Process Segment j is failed, the production
rate for the segment, $u_{cj}^k$, is coordinated with its upstream and downstream segments by a common
target production rate, $u_{Cj}^{k-1}$, determined by Level k - 1 Cell C.
Figure 3-8: Virtual Machine Constraint in Level k Capacity Set $\Omega_c^k(e^k m^k)$
[Figure: the Level k capacity set in the plane of production rates (parts per unit time), bounded by the Machine Group 1 constraint $\tau_{11} u_1^k \le e_1^k m_1^k$, the Machine Group 2 constraint $\tau_{22} u_2^k \le e_2^k m_2^k$, the Machine Group 3 constraint $\tau_{31} u_1^k + \tau_{32} u_2^k \le e_3^k m_3^k$, and the virtual machine constraint $u_1^k \le u_1^{\max}$, which together define the new set of feasible rates.]
Suppose that Buffer $\beta_{nj}$ is full as seen by a Level k observer, $b_{nj}^k = B_{nj}$. Let
$u_{cj}^{\max}$ be the maximum rate of flow of parts through the failed Buffer $\beta_{nj}$ for
Level k Process Segment j. The conditional constraint imposed on Level k Cell c is
$$\text{if } b_{nj}^k = B_{nj}, \text{ then } u_{cj}^k \le u_{cj}^{\max} \tag{3.23}$$
There is a similar conditional constraint imposed on Level k Cell c' if the buffer
separating Step n and Step n + 1 becomes empty ($b_{nj}^k = 0$).
$$\text{if } b_{nj}^k = 0, \text{ then } u_{c'j}^k \le u_{c'j}^{\max} \tag{3.24}$$
Note that either conditional constraint may be converted into a machine constraint
by writing $\tau_{\beta j} = 1/u_{cj}^{\max}$ and $m_{\beta}^k = 1$. Thus, the conditional blockage constraint
(3.23) can be rewritten in the form:
$$\text{if } b_{nj}^k = B_{nj}, \text{ then } \tau_{\beta j} u_{cj}^k \le 1 \tag{3.25}$$
Written in this form, outside disturbances may be handled by the controller of
Level k Cell c as if another machine has joined the cell. Virtual machines eliminate
the need for a system-wide controller whose purpose is to tell cells about disturbances
in other cells. Another advantage is that these constraints fit into the same form as
all other machine capacity constraints, and so no special algorithms are needed to
accommodate starved or blocked buffer constraints.
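Because a virtual machine has the same form as any other machine group constraint, installing one amounts to appending a row to the constraint matrix. The following sketch shows the idea; the function name, the dictionary of failed buffers, and the requirement that every maximum flow rate be positive are assumptions of this illustration.

```python
def add_virtual_machines(tau, capacity, failed):
    """Append one virtual-machine row per failed buffer.

    tau:      tau[i][j], processing time of segment j at machine group i.
    capacity: capacity[i], the value of e_i * m_i for group i.
    failed:   maps a segment index j to the maximum flow rate u_max (> 0)
              through its failed (full or empty) buffer.
    The virtual machine has a processing time of 1 / u_max for segment j,
    zero for every other segment, and a capacity of one machine.
    """
    n_segments = len(tau[0])
    rows = [list(row) for row in tau]
    caps = list(capacity)
    for j, u_max in sorted(failed.items()):
        row = [0.0] * n_segments
        row[j] = 1.0 / u_max
        rows.append(row)
        caps.append(1.0)
    return rows, caps


# One real machine group, two segments; segment 0 is blocked at 0.5 parts
# per unit time (illustrative numbers only).
rows, caps = add_virtual_machines([[1.0, 0.5]], [2.0], {0: 0.5})
print(rows, caps)
```

The enlarged matrix can then be passed to whatever routine already handles the ordinary capacity constraints, which is the advantage the text describes.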
Virtual machine constraints are only active when a buffer is either full downstream
or empty upstream. The controller of a cell continuously monitors the amount of
material in the buffers surrounding the cell, at the control level of the cell.
It is important to note that a buffer may have independent active virtual machines
at different control levels. In other words, a buffer may be failed at one control level,
but not be failed at another level, or it may be failed at multiple control levels
simultaneously. This is possible because of the different perceptions of the amount
of material in the buffer due to differences in frequency of measurement between two
control levels (Section 2.9.3).
For example, when a Level k - 1 buffer is full or empty from the point of view of
a Level k - 1 observer, a Level k - 1 virtual machine is installed in the appropriate
Level k - 1 cell. This constraint alters the target rates for the Level k component cells
such that the buffer constraints are satisfied at Level k when the Level k - 1 target
rates are being met. If the Level k perception of the amount of material in the buffer
is such that the buffer is not failed, then the Level k - 1 target rates will prevent the
buffer from becoming failed only if the Level k hedging point $z_c^k$ has been reached by
the Level k surplus $x_c^k$. Therefore, it is possible to have a Level k - 1 virtual machine
installed, while at the same time, a similar Level k virtual machine is not needed.
However, it is also possible that a Level k virtual machine will have to be installed
at the same time as the Level k - 1 virtual machine in order to account for Level k
fluctuations in the amount of material in the buffer.
3.9 Translation of Production Rates into Loading Times
The purpose of the Level k - 1 Cell C controller is to choose the instantaneous
production rate $u_{Cj}^{k-1}$ for Type j parts, where the long-term production rate for Type
j parts is much higher than the Level k - 1 characteristic frequency $f_{k-1}$. The
production rate, $u_{Cj}^{k-1}$, only specifies the frequency of operations. Two sequences of
operations of the same type with different times of occurrence, but the same frequency
of occurrence, are indistinguishable from each other, as seen by a Level k - 1 observer
because the precise times of occurrence are lost. (See Section 2.5.3 for a more detailed
explanation.)
Now suppose that the frequency of operations for Type j parts is comparable to
that of the Level k characteristic frequency, fk. From the point of view of a Level
k observer, two sequences of the same operation which occur at the same frequency,
but different times, are distinguishable from each other because the precise times
of occurrence are visible. The controller of Level k Cell c, which receives its target
production rate for Type j parts, $u_{Cj}^{k-1}$, from Level k - 1 Cell C, must convert that
target rate into distinct loading times. This section describes a simple algorithm
which serves to convert continuous target production rates into loading times. The
algorithm is called the staircase policy (Gershwin, 1989).
Staircase Policy When Level k Cell c is located in the lowest level of the hierarchy
- the level of operations - the staircase policy is used to convert the continuous Level
k - 1 production target rates, $u_{Cj}^{k-1}$, from Level k - 1 Cell C into a discrete set of
loading times. Through the staircase policy, parts are cleared to enter the cell one
at a time for use as raw material by the operation level (Level k) Process Segment
j. When the total number of parts cleared for use at time t, $N_{cj}(t)$, satisfies the
cumulative requirements from the next higher level Cell C
$$N_{cj}(t) \ge \int_0^t u_{Cj}^{k-1}(s)\, ds \tag{3.26}$$
no more parts are cleared for entry into the cell. Figure 3-9 graphically represents
the staircase policy. Section 7.3.1 provides sample simulation results demonstrating
the staircase policy.
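The staircase policy can be sketched as a loop that integrates the target rate and clears one part at a time while the number of cleared parts is below the cumulative requirement. The fixed time step, the end-of-step check, and the function name are assumptions of this illustration; they are not taken from Hiercsim.

```python
def staircase_release_times(target_rate, dt, n_steps):
    """Times at which parts are cleared to enter the cell.

    target_rate: function of time returning the target rate from the next
                 higher level (parts per unit time).
    dt:          length of one time step.
    n_steps:     number of steps to simulate.
    A part is cleared whenever the count of cleared parts is still below the
    cumulative requirement, which is checked at the end of each step.
    """
    cleared = 0
    required = 0.0
    times = []
    for n in range(n_steps):
        t = n * dt
        required += target_rate(t) * dt  # running integral of the target rate
        while cleared < required:
            cleared += 1
            times.append(t + dt)
    return times


# A constant target of one part every two time units.
print(staircase_release_times(lambda t: 0.5, 1.0, 6))
```

With a constant target rate the clearances are evenly spaced, which gives the staircase shape of cumulative clearances against the straight line of cumulative requirements.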
The staircase policy in the current theory simply gives permission for a part to
enter a cell for processing. The staircase policy does not specify the order in which
cleared parts are to be processed. Section 3.10 describes this aspect of the control in
Hiercsim Version 3.5 in more detail.
For the purposes of accounting in the current theory, the Level k controller counts
all parts which have been granted permission for processing towards the cumulative
production of the respective process segments in the cell. This is assumed even though
those parts do not reach inventory until they have completed all processing in the
cell. The justification is the assumption that the time to process the parts is short
compared to the time scale of the Level k controller.
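Figure 3-9: Staircase Policy Cumulative Production
[Figure: cumulative number of parts cleared, $N_{cj}(t)$, plotted against time as a staircase tracking the cumulative requirement $\int_0^t u_{Cj}^{k-1}(s)\, ds$.]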
3.10 Machine Loading Controller
The staircase policy described in the previous section provides raw material to a
cell in order to meet the required production rates specified by higher control levels.
However, the staircase policy does not specify the order in which released parts are
to be processed, nor does it specify the precise time at which processing is to begin.
Experience with simulation work has shown that modification of the policy so that it
specifies the order and timing of processing also sacrifices its simplicity.
Therefore, a separate controller is required which complements the staircase policy.
In the current implementation, this controller is located directly on the machine
group, instead of within a cell, and its function is to choose the order of processing,
the precise timing of processing, and the transportation of parts needed to accomplish
the processing. The machine group controller's decisions are based on information
local to the group, such as the number and type of cleared parts in its entry buffers
and the current status of its machines. All decisions made by the group controller
are go/no-go decisions which are based entirely on situational heuristics.
In this manner, the machine group controller is able to account for the possibility
of more than one type of part being available at the same time and the occurrences
of minor failures. In addition, this controller allows for a clean interface between the
flow rate hierarchical controller and the actual factory. In other words, any person or
algorithm which can account for minor variations in processing start times and choices
among different part types is capable of assuming the function of the machine group
controller. Chapter 4 describes the particular set of rules which were implemented in
the current versions of the simulation program, Hiercsim.
3.11 Information Flow within a Distributed Control System
The decomposition technique described in this thesis provides each cell with infor-
mation which it needs to choose rates and times of controllable events. This section
details the source of each type of information within the context of the current theory.
3.11.1 Relative Position of Level k Cell c
Consider a manufacturing system which can be controlled by a pyramid control in-
frastructure, described in Section 3.5. The next paragraphs outline the terms for
controller components which are used to describe the information required by Level
k Cell c. These components are shown in Figure 3-2.
Consider Level k Cell c which contains a subset of resources of the manufacturing
system. A Level k observer in Cell c only has a detailed model of those events which
occur at a frequency near the Level k characteristic frequency, fk in these resources.
The resources in Level k Cell c are controlled at higher frequencies (i.e., in more
detail, to account for higher frequency events) by one or more component cells at
Level k + 1 and lower. Level k Cell c' is immediately downstream from Level k Cell
c, which implies that parts completing all steps in Level k Process Segment j within
Cell c immediately proceed into Cell c' for further processing (after waiting in a Level
k or higher buffer). Level k Cell c is contained within the Level k - 1 Cell C, which
may contain other Level k cells besides Cell c (for example, possibly Cell c'). Level k
Cell c is separated from cells adjacent to it at Level k by buffers whose control levels
are Level k or higher.
3.11.2 Information Flow Between Levels
Level k Cell c requires information about controllable and uncontrollable events from
Level k - 1 Cell C, from resources within Cell c as seen by a Level k observer, and
from events at resources in Cell c which occur at a frequency much higher than the
Level k characteristic frequency, fk.
Whenever Level k - 1 Cell C recomputes Level k - 1 production target rates, $u^{k-1}_C$,
based on the Level k - 1 view of all resources within Cell C, Level k Cell c is required
to update its production rates, $u^k_c$. Therefore, a path exists over which Level k Cell
c is told about changes in its Level k - 1 target rates, and over which those target
rates can be accessed. Note that the Level k - 1 target rates, $u^{k-1}_C$, are based on the
Level k - 1 status, $\alpha^{k-1}_C$, of all resources in Level k - 1 Cell C, and not only those
which are also within Level k Cell c.
Each resource within Level k Cell c is required to report the occurrence of any
Type j event, L(j) < k, directly to the cell controller. Such an event is seen as a
discontinuous change in the Level k state of Cell c, $\alpha^k_c$, which triggers a recalculation
of the Level k rates of controllable events, $u^k_c$, for resources in Cell c. Therefore, a
direct link exists from each resource in Cell c to the controller of the cell. However,
a direct link does not exist in the opposite direction: from the Level k controller to
the resource. That is, the controller does not influence the resources directly.
Any Type j event, L(j) < k, requires an immediate response from the Level k
controller, in order that capacity be reallocated to account for the effects of the event.
However, any request for a Type j controllable event, L(j) = k, that originates in
the controller of Level k Cell c, must be filtered down through the hierarchy. This is
necessary in order to account for any Level k + 1 or lower activities which may be in
progress and are not seen by a Level k observer. Therefore, the link from the Level
k controller to each resource in the cell is established indirectly using intermediate
level controllers.
Whenever the rates of lower level controllable events, $u^k_c$, change, each of the Level
k + 1 component cells within Level k Cell c is told about the change. The controller
of each Level k + 1 component cell updates its target rates from the new values of the
Level k rates, $u^k_c$. The path over which the production rate information flows is the
only real-time connection from the Level k Cell c to its Level k + 1 components.
Level k Cell c requires models of the aggregate effect of events which occur at
frequencies much higher than the Level k characteristic frequency, fk. In the current
theory, all such models are specified in the design stage of the hierarchical controller.
At that stage, the average operation time, $\tau_{ij}$, for Activity j on Resource i is
determined, as well as the reliability characteristics of each Resource i.
Those parameters are used in the creation of the Level k capacity set
$\Omega^k_c(e^k(t)m^k(t), t)$ of (3.12). The operation times are used to determine the
occupation of each resource by high frequency controllable activities, and the reliability
information is used in the equations (2.13)-(2.15) for each Resource i. Those equations
provide the controller of Cell c with the fraction of time, $e^k_i$, which Resource i
will be available for controllable activities.
Once the parameters of high frequency events are determined by the designer,
and incorporated into the Level k capacity set $\Omega^k_c(e^k(t)m^k(t), t)$ of Cell c, there is
no opportunity to change those parameters in real time. In fact, the measure of
production as seen by a Level k observer is based solely on the values of the Level
k production rates, $u^k_c$. This implies that any deviation of the Level k + 1 or lower
controllers' production from the Level k production rates, $u^k_c$, is not detected by the
Level k controller. Therefore, the designer of the hierarchical controller, according
to the current theory, must ensure that the models used in the construction of the
Level k capacity set $\Omega^k_c(e^k(t)m^k(t), t)$ in Cell c are accurate, or at least conservative.
In the future, feedback of actual system parameters will be incorporated into the
hierarchical controller, so that the Level k capacity set will be self-correcting.
Note that we do not know how critical it is to have good estimates of activity
durations, such as operation times $\tau_{ij}$. Hiercsim Version 3.5 provides the capability
to specify different values of activity durations: one for the actual duration, and the
other for the duration as perceived by the controller. This feature will enable the
user of Hiercsim to experiment with different levels of accuracy between actual and
perceived values.
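The effect of a mismatch between actual and perceived operation times can be
illustrated with a small calculation. The sketch below is not taken from Hiercsim;
the function name and all numbers are hypothetical, and it assumes a single resource
performing a single activity.

```python
def occupation(rate, operation_time):
    """Fraction of time a resource is occupied by an activity performed at
    `rate` parts per unit time, each part taking `operation_time` time units."""
    return rate * operation_time


# Hypothetical numbers, chosen only for illustration.
availability = 0.9      # fraction of time the resource is available
perceived_time = 2.0    # operation time assumed by the controller
actual_time = 2.5       # operation time realized in the factory

# The controller picks the largest rate that its own model allows.
planned_rate = availability / perceived_time

planned_load = occupation(planned_rate, perceived_time)
actual_load = occupation(planned_rate, actual_time)

# A rate that looks feasible to the controller overloads the real resource
# when the perceived operation time is optimistic.
overloaded = actual_load > availability
```

With these assumed values the planned rate exactly fills the perceived capacity,
while the actual occupation exceeds the availability of the resource, which is the
kind of discrepancy the controller cannot detect without feedback.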
3.11.3 Information Flow between Adjacent Cells at the same Level
The controller of Level k Cell c operates independently of all other Level k cells in
the system. However, when an event occurs in a resource outside of Cell c such that
its effects are propagated to Cell c, then the effect of the event must be accounted for
in the capacity set $\Omega^k_c(e^k(t)m^k(t), t)$ of Level k Cell c.
Information about the system outside of Level k Cell c as seen by a Level k observer
is summarized in the Level k status of the entry and exit buffers of Cell c. The Level
k status of a buffer consists of the amount of material in the buffer, $b^k_{nj}$, and the rate
at which the buffer is emptying or filling up, $\dot{b}^k_{nj}$. The value $\dot{b}^k_{nj}$ is sometimes
referred to as the fill rate of Buffer $\beta_{nj}$, as seen by a Level k observer. In order to
obtain precise information about the state of a buffer, a path must be available to the
Level k Cell c controller to determine the flow into the buffer and the flow out of the
buffer.
Consider the relationship between the two Level k Cells, c and c', introduced in
Section 2.8.3. Recall that Operation n on Type j parts is performed by a resource
in Cell c, and Operation n + 1 is performed by a resource in Cell c'. The production
rate for Operation n is $u^k_{cj}$, and that for Operation n + 1 is $u^k_{c'j}$. The resources which
perform Operation n and n + 1 are separated by Level k Buffer $\beta_{nj}$. The amount of
material in Buffer $\beta_{nj}$, as seen by a Level k observer, is denoted by $b^k_{nj}$, and the rate
of material flow into the buffer by $\dot{b}^k_{nj}$. The value of the Level k buffer fill rate is
$$\dot{b}^k_{nj} = u^k_{cj} - u^k_{c'j} \qquad (3.27)$$
In order for the controllers of Level k Cells c and c' to determine the fill rate of
the buffer, $\dot{b}^k_{nj}$, a two-way link between Cells c and c' must exist. The production
rate, $u^k_{cj}$, must be available to Cell c' whenever either $u^k_{cj}$ or $u^k_{c'j}$ changes. Similarly,
$u^k_{c'j}$ must be available to Cell c.
The information about production rates in adjacent cells is used in two ways. The
first use is simply to compute the current value of the buffer fill rate, $\dot{b}^k_{nj}$, so that the
amount of material in the buffer, $b^k_{nj}(t)$, can be computed. The second use is to serve
as the source of the limiting rate whenever Buffer $\beta_{nj}$ is either full or empty, so that
the appropriate conditional constraints (2.19) and (2.20) may be determined.
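The two uses of the exchanged rates can be sketched in a few lines. This is only an
illustration under simplifying assumptions (constant rates over the interval, a single
buffer); the function names and the returned flags are hypothetical and are not part of
Hiercsim.

```python
def fill_rate(u_upstream, u_downstream):
    """Buffer fill rate: upstream production rate minus downstream rate."""
    return u_upstream - u_downstream


def advance_buffer(level, u_upstream, u_downstream, dt, capacity):
    """Advance the buffer level over an interval dt with constant rates.

    Returns the new level and a flag ('empty', 'full', or None) naming the
    conditional constraint that becomes active at the end of the interval.
    """
    new_level = level + fill_rate(u_upstream, u_downstream) * dt
    if new_level <= 0.0:
        return 0.0, "empty"       # downstream cell is limited by the upstream rate
    if new_level >= capacity:
        return capacity, "full"   # upstream cell is limited by the downstream rate
    return new_level, None


# Hypothetical values: the buffer drains faster than it fills and runs empty.
level, flag = advance_buffer(5.0, 2.0, 3.0, dt=10.0, capacity=20.0)
```

The flag indicates which adjacent cell supplies the limiting rate once the buffer can
no longer decouple the two cells.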
There is exactly one path for information flow per buffer surrounding Level k Cell
c. Those paths are created in the design stage of the hierarchical controller, when
the buffer control levels and buffer placement are specified. All information about
the system outside of Level k Cell c is summarized using the rates transmitted over
those paths. In particular, dynamic bottleneck information is transmitted through
the buffer status. This is vital to the function of the hierarchical controller, since
the location of bottlenecks moves around over time as machine states and target
production mixes change. In addition, the location of bottlenecks may be different at
different levels of the hierarchy as different frequency class events occur.
3.11.4 Information Flow During a High Level State Change
The architecture of the hierarchical controller relies on many semi-independent con-
trollers acting in concert. When the system is at its collective hedging point, each
controller is able to act without any knowledge of the rest of the system. However,
when an event occurs which affects more than one cell, a precedence of calculations
is required.
Events which affect more than one cell at the same time include: the failure
or repair of a machine; the change of target production rates; and the change in
configuration of a high level cell. The order of precedence established in this section
is not sufficient for configuration changes, but that issue is addressed in Chapter 5.
The purpose of establishing a precedence in calculations is to avoid a situation
where the same cell is forced to recompute the rates of high frequency controllable
events more than once in the same calculation cycle, where a calculation cycle is
the set of calculations required to determine the new set of production rates across
multiple cells. One advantage in being able to compute rates only once per cycle is a
simplification in the control algorithm.
The simplification arises because the accounting that accompanies the allocation
of capacity is done only once. When the accounting procedure must be done more
than once, previous results and directives must be canceled, adding another step to
the calculation procedure. Experience with writing the simulation program Hiercsim
has shown that this extra step is complex due to the multitude of possible cancellation
scenarios which can arise, and therefore must be addressed.
Precedence of Calculations between Levels
Whenever an event occurs which changes a high level state, cell controllers at
multiple levels respond by reallocating capacity. High level calculations must be
completed before low level calculations are begun.
Consider Level k - 1 Cell C, its Level k component Cell c, and the Level k + 1
component cells of Level k Cell c. When a Type j event occurs, where L(j) = k - 1
(using notation defined in Section 2.5.2), the Level k - 1 rates $u^{k-1}_C$, the Level k rates
$u^k_c$, and the Level k + 1 rates are no longer valid.
In this case, Level k - 1 Cell C recomputes the Level k - 1 rates, $u^{k-1}_C$. Upon
completion, the rates $u^{k-1}_C$ are passed to Level k Cell c and are used as target rates
in the calculation of Level k rates, $u^k_c$. The Level k + 1 components of Cell c do not
begin their calculations until the Level k rates $u^k_c$ are chosen.
Precedence of Calculations within a Level
Even though cells at the same
control level operate independently of each other, virtual machine connections (Sec-
tion 3.8) in one cell could trigger a subsequent calculation in an adjacent cell. There-
fore, the precedence is established such that a cell which is limited by an adjacent
cell only calculates production rates after the limiting rate is known. Otherwise, the
limited cell will be forced to redo its calculations when the limiting rate is changed.
In the case that two cells are mutually limiting (the upstream cell is blocked and the
downstream cell is starved) the upstream cell rates are calculated first.
For example, consider the two Level k Cells c and c', which perform Operations
n and n + 1 of Process j respectively. Both cells receive their Level k - 1 target rate,
$u^{k-1}_{Cj}$, from the common Level k - 1 parent Cell C. When the Level k - 1 target rate,
$u^{k-1}_{Cj}$, changes, both Level k Cells c and c' must recompute the production rates, $u^k_{cj}$
and $u^k_{c'j}$. Suppose that Level k Cell c' limits production in Level k Cell c via a virtual
machine. The rate $u^k_{c'j}$ must be calculated first, followed by the calculation of $u^k_{cj}$.
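One way to obtain such an ordering is a topological sort of the "is limited by"
relation. The sketch below is an assumption about how the rule could be coded, not a
description of Hiercsim; it uses the standard library `graphlib` module (Python 3.9 or
later), and the cell names are placeholders.

```python
from graphlib import TopologicalSorter


def calculation_order(limited_by):
    """Return an order in which each cell is computed after the cells that
    limit it. `limited_by` maps a cell name to the set of cells whose rates
    limit it through a virtual machine."""
    return list(TopologicalSorter(limited_by).static_order())


# Cell c' limits Cell c, so the rates of c' must be computed first.
order = calculation_order({"c": {"c'"}, "c'": set()})
```

Two mutually limiting cells form a cycle, for which `static_order` raises
`graphlib.CycleError`; that case would need the upstream-first rule stated above to be
applied before sorting.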
Precedence Implemented in Hiercsim
Up to this point, the precedence of calculations in a hierarchy has been established
for calculations between levels or within levels. In Hiercsim Version 3.5 and Version
4.0, no Level k production rates can be sent to the Level k + 1 component cells until
all Level k cells within the same Level k - 1 Cell C have completed their rate
calculations.
This rule is enforced whether or not virtual machine connections exist. This is a
stricter rule than what is necessary due to virtual machine connections, but it makes
the code easier to write and does not impose any performance penalty.
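A level-by-level traversal is one simple way to satisfy this rule. The following is a
minimal sketch under assumed data structures; the `Cell` class and the `recompute`
callback are hypothetical and do not reproduce the Hiercsim source, which is written
in C. The traversal is slightly stricter than the rule above, since it finishes every
cell at a level before descending.

```python
class Cell:
    """Hypothetical cell record: a name and its component cells one level down."""

    def __init__(self, name, components=()):
        self.name = name
        self.components = list(components)


def calculation_cycle(top_cell, recompute):
    """Visit cells level by level, starting at the cell where the event occurred.

    All cells at one level are recomputed before any of their component cells,
    so every cell is visited exactly once per calculation cycle.
    """
    visited = []
    current_level = [top_cell]
    while current_level:
        next_level = []
        for cell in current_level:
            recompute(cell)              # choose this cell's production rates
            visited.append(cell.name)
            next_level.extend(cell.components)
        current_level = next_level       # rates are handed down only now
    return visited


# Placeholder hierarchy: Cell C contains c and c', each with one component cell.
tree = Cell("C", [Cell("c", [Cell("c*")]), Cell("c'", [Cell("c'*")])])
order = calculation_cycle(tree, recompute=lambda cell: None)
```

Because each cell appears once in the visiting order, no previously issued rates have
to be canceled within a cycle.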
3.12 Reentrant Process Control
This section describes some of the issues which are raised when reentrant flows are
controlled using the hierarchical control architecture of this thesis.
3.12.1 Reentrant Process
Definition
A reentrant process is defined as a sequence of steps in which a resource
(e.g. a machine group) is used more than once. Each time a part is operated on by
the resource, it is considered to have advanced to the next stage of production. Each
stage is separated from the previous and subsequent stages by buffers.
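Under this definition, reentrancy can be detected directly from the sequence of
resources visited by a part. The short sketch below assumes a process route is given
as a list of resource names; the function and the names `M1`, `M2`, `M3` are
hypothetical.

```python
from collections import Counter


def reentrant_resources(route):
    """Return the resources that appear more than once in a process route,
    together with the number of visits (stages) for each."""
    visits = Counter(route)
    return {resource: n for resource, n in visits.items() if n > 1}


# Resource M1 is visited at two distinct stages, so the process is reentrant.
stages = reentrant_resources(["M1", "M2", "M1", "M3"])
```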
Examples of Reentrant Flow
Semiconductor technology is a highly reentrant
process in which circuitry for chips is imprinted on silicon wafers using photolithog-
raphy. The MIT Integrated Circuit Laboratory produces wafers using a DA-CMOS
technology. The wafers produced in the lab have nine layers of circuits. Each layer is
imprinted using the standard process to clean, dope, expose, and develop the specific
circuit mask. This process is accomplished using the same set of machines. Therefore,
the same wafer visits the same sequence of machines nine times in the course of one
process.
Surface mount technology is a method of placing electronic components such as
chips and resistors on a printed circuit board without wires. Because this technology
does not use wires, it is possible to place components on both sides of a printed circuit
board, increasing the density of the circuitry. This process is reentrant because a
board travels through the mounting process twice: once for the first side, and once
for the reverse side. Because the surface mount equipment is expensive, there is
usually only one line available for all processing.
The fabrication of wooden television cabinets requires that pieces of molding (dec-
orative strips of wood) be made. The molding is formed using a two step process.
The first step gives a rough cut piece of wood a perfectly square cross-section. After
the square cross-section is cut, a decorative profile is given to the piece using the
same molding machine with a different set of tooling. Again, this process is reentrant
because the same piece of wood visits the same machine at two distinct stages of
production.
Implications of Reentrant Flow
Recall that in the hierarchical control infrastructure, a machine group is contained
within a unique cell at each level. The controller of such a cell sets independent
rates for each process segment within the cell.
Consider Level k Cell c whose characteristic frequency, fk, is much lower than the
frequency of operations, but is approximately equal to or higher than the frequencies
of events decoupled by the buffers between stages of a reentrant process. Level k Cell
c controls each stage of the reentrant process as an independent process segment,
except when entry or exit buffers are failed.
A complication arises when a buffer separating two stages of a reentrant process
fails. The virtual machine control described in Section 3.8 is based in part on the
assumption that the limit rate imposed on a process segment is constant and is set
by circumstances outside the domain of Level k Cell c. In the case of a reentrant
process, the limiting rate is not a constant, but is instead determined simultaneously
with the limited rate, as the cell controller chooses production rates for both stages
in the same calculation sequence.
This section describes this problem in detail, along with the solution used in the
pyramid hierarchy. This solution is incomplete, as it leads to excessive computation,
as shown in a worked example in Section 7.5. In addition, a related approach to
hierarchical rate calculations formulated by Bai (1991a and 1991b) is described. Bai's
formulation does not encounter this problem with limiting rates, but its application
is limited to a strict process decomposition hierarchy defined in Section 3.5.2.
Illustrative Example
In order to demonstrate the behavior of reentrant processes
in a hierarchical control infrastructure, an illustrative example is developed.
Consider the hierarchy depicted in Figure 3-10 where operations occur at a fre-
quency near the Level k + 1 characteristic frequency, fk+1. The process has two steps,
each of which requires an operation at Machine Group i. This makes the process
reentrant.
Operation n in the process is the last step in the first stage and Operation n + 1
is the first step in the second stage. Level k Buffer $\beta_{nj}$ separates Operation n from
Operation n + 1. Level k Buffer $\beta_{nj}$ can contain up to $B_{nj}$ parts. The current amount
of material in the buffer is equal to $b^k_{nj}$ and the fill rate of the buffer is equal to $\dot{b}^k_{nj}$.
Control at Level k is accomplished by Level k Cell c which contains Machine
Group i. All steps of the first stage of the process are contained within Level k
Process Segment j. All steps of the second stage of the process are contained within
Level k Process Segment j'. The Level k production rate for Process Segment j is
represented by $u^k_{cj}$, and that for Process Segment j' by $u^k_{cj'}$.
Level k Cell c is contained within Level k - 1 Cell C. Since Buffer $\beta_{nj}$ is too
small to isolate Process Segments j and j' whenever a Level k - 1 event occurs, the
Level k - 1 production rate, $u^{k-1}_C$, is the same for both segments. The Level k - 1
production rate, $u^{k-1}_C$, is set by the controller of Level k - 1 Cell C, and is used as a
target rate by Level k Cell c.
Note that Process Segment j' (the second stage) draws its raw material from Pro-
cess Segment j (the first stage). Conversely, Process Segment j deposits its finished
parts into the buffer leading into Process Segment j'.
Loading commands are issued from Level k + 1 Cell c*. The operation times of
the machining steps are $\tau_{ij}$ and $\tau_{ij'}$, respectively.
3.12.2 Behavior of Hierarchical Controller
The hierarchical control algorithm presented in the previous sections performs well
with a multitude of processes, as long as none of the processes is reentrant. However,
under certain conditions, a reentrant process causes the algorithm to become locked
in an infinite calculation loop. In particular, infinite looping occurs when one stage
of a reentrant process becomes linked to the next stage upstream or downstream by a
chain of virtual machines. When this occurs, the hierarchical controller described in
this thesis stops functioning. This problem is described in detail in this section and
a solution is presented in the following section.
Figure 3-10: Example Reentrant Flow Hierarchy
Consider the system introduced in the previous section in which a reentrant process
passes through Level k Cell c and includes the two independent Process Segments
j and j'. While Buffer $\beta_{nj}$ is neither full nor empty, both process segments operate
independently of each other. Level k Cell c chooses production rates $u^k_{cj}$ and $u^k_{cj'}$
according to the hedging point strategy of Section 3.7.2.
As soon as Buffer $\beta_{nj}$ becomes full or empty, Process Segments j and j' are connected
by a virtual machine. Suppose that Buffer $\beta_{nj}$ becomes empty. The production
rate of Process Segment j' is limited by that of Process Segment j, according to the
conditional constraint (2.19).
Since the virtual machine constraint is built on the assumption that the limiting
rate is constant, the two segments are still treated as if they were independent. For
process segments in which the limiting rate is determined outside of the cell, this
assumption is adequate. However, in the case of linked reentrant process segments,
the limiting rate is determined simultaneously with the limited rate. Therefore, the
limiting rate cannot be accurately modeled by a constant value.
Infinite looping arises if the reentrant nature of the process is not taken into ac-
count. This is because the limiting rate used in the virtual machine constraints (3.24)
and (3.23) has been computed in the previous calculation of rates. To demonstrate
how this leads to infinite looping, consider the following example. The notation $(\cdot)_q$
denotes the qth calculation of the quantity in parentheses.
Consider Level k Cell c in which Process Segment j' (the second stage) is limited by
Process Segment j (the first stage). Both production rates $u^k_{cj}$ and $u^k_{cj'}$ are unknown
initially, and recall that the buffer between the two segments is empty. In the first
calculation round, $u^k_{cj}$ and $u^k_{cj'}$ are determined by the following linear program:
$$\min \; A^k_{cj}(x^k_{cj} - z^k_{cj})\,u^k_{cj} + A^k_{cj'}(x^k_{cj'} - z^k_{cj'})\,u^k_{cj'} \qquad (3.28)$$
subject to the capacity set
$$\tau_{ij} u^k_{cj} + \tau_{ij'} u^k_{cj'} \le e^k_i m^k_i$$
$$u^k_{cj} \ge 0$$
$$u^k_{cj'} \ge 0$$
where $e^k_i m^k_i$ is the instantaneous capacity of the machine which operates on the
reentrant process segments.
Because both $A^k_{cj}(x^k_{cj} - z^k_{cj})$ and $A^k_{cj'}(x^k_{cj'} - z^k_{cj'})$ are negative, $u^k_{cj}$ and $u^k_{cj'}$ are as
large as possible, and the capacity inequality is satisfied with equality. (There may
be other capacity constraints, but we do not care about them if they are not satisfied
with equality.)
Suppose that $|A^k_{cj'}(x^k_{cj'} - z^k_{cj'})| > |A^k_{cj}(x^k_{cj} - z^k_{cj})|$, so that the linear program
(3.28) gives as much capacity as it can to Process Segment j'.
The solution to the first linear program is $(u^k_{cj})_1$, $(u^k_{cj'})_1$, where $(u^k_{cj'})_1 > (u^k_{cj})_1$.
This violates the virtual machine constraint (3.24). More capacity has been allocated
to Process Segment j' in the first calculation than can be used due to the starved
buffer. A second calculation is required in which a virtual machine constraint
$$u^k_{cj'} \le (u^k_{cj})_1 \qquad (3.29)$$
is added to the linear program (3.28).
The solution to the second linear program is $(u^k_{cj})_2$, $(u^k_{cj'})_2$. Since $(u^k_{cj'})_2 < (u^k_{cj'})_1$,
additional capacity is freed up for $u^k_{cj}$. (3.28) implies that $(u^k_{cj})_2 > (u^k_{cj})_1$.
Since the limiting value in the virtual machine constraint on $u^k_{cj'}$ has increased,
the linear program must be solved again. The constraint on $u^k_{cj'}$ becomes
$$u^k_{cj'} \le (u^k_{cj})_2 \qquad (3.30)$$
The solution to the third linear program is $(u^k_{cj})_3$, $(u^k_{cj'})_3$. Since $(u^k_{cj'})_3 > (u^k_{cj'})_2$,
capacity is taken away from $u^k_{cj}$. (3.28) implies that $(u^k_{cj})_3 < (u^k_{cj})_2$. This is an infinite
loop.
Let us define $(u^k_{cj})_*$, $(u^k_{cj'})_*$ as the solution to (3.28) with the additional constraint
$$u^k_{cj'} = u^k_{cj} \qquad (3.31)$$
We know that $(u^k_{cj'})_1 > (u^k_{cj'})_*$ and $(u^k_{cj})_1 < (u^k_{cj})_*$ because the linear program
(3.28) is under constrained. Therefore, we know that
$$(u^k_{cj'})_2 < (u^k_{cj'})_* \qquad (3.32)$$
since
$$(u^k_{cj'})_2 \le (u^k_{cj})_1 < (u^k_{cj})_* = (u^k_{cj'})_* \qquad (3.33)$$
By similar reasoning, we know that $(u^k_{cj})_2 > (u^k_{cj})_*$.
Since the linear program gives all available capacity to $u^k_{cj'}$, $(u^k_{cj})_3 < (u^k_{cj})_*$.
Therefore, $(u^k_{cj'})_3 > (u^k_{cj'})_*$, and $(u^k_{cj})_3 < (u^k_{cj'})_3$.
This situation is the same as that after the first linear program, and so the cycle
starts over again. Therefore, in a simple system, a reentrant process can lead to
infinite looping using the virtual machine constraints of Section 3.8.
This infinite looping condition becomes worse when process segments from two
adjacent stages of a reentrant process are separated by more than a single failed
buffer, or when there are more than two stages to a reentrant process, each of which
is linked by a chain of empty or full buffers.
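The oscillation can be reproduced numerically with a toy model. The sketch below
replaces the linear program by its closed-form solution for the special case of one
shared machine, under the assumption that the downstream segment j' has the larger
benefit per unit of capacity; the function names and the unit values are hypothetical.

```python
def solve_round(capacity, tau_up, tau_down, limit=None):
    """One rate calculation for two segments sharing a machine.

    Assumes the downstream segment j' has priority, so it is given as much
    capacity as the (constant) limit allows; the upstream segment j receives
    whatever capacity is left.
    """
    u_down = capacity / tau_down
    if limit is not None:
        u_down = min(u_down, limit)
    u_up = (capacity - tau_down * u_down) / tau_up
    return u_up, u_down


def iterate(capacity, tau_up, tau_down, rounds):
    """Repeat the calculation, using the previous upstream rate as the limit."""
    history = []
    limit = None
    for _ in range(rounds):
        u_up, u_down = solve_round(capacity, tau_up, tau_down, limit)
        history.append((u_up, u_down))
        limit = u_up    # virtual machine limit used in the next round
    return history


history = iterate(capacity=1.0, tau_up=1.0, tau_down=1.0, rounds=5)
```

In this toy case the pair of rates alternates between two values and never settles,
because each round treats as constant a limit that the same round then changes.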
3.12.3 Modification for Reentrant Processes
Two approaches to solving the infinite looping problem described in the previous
section exist. One approach can be used in the distributed hierarchy described in
this thesis. The second approach involves a different formulation of the control algo-
rithm which bypasses the infinite looping problem completely. However, it places a
larger computational load on the system controller because it is limited to the process
decomposition hierarchy of Section 3.5.2.
The first approach is described in detail because it is a direct extension of the con-
trol formulation presented in this thesis. A detailed example is shown in Section 7.5.
The second approach is outlined, with specific attention to its structure which avoids
the infinite looping problem but is inconsistent with the pyramid hierarchy concept.
Reentrant Processes in a Pyramid Hierarchy
The purpose of the hierarchical
controller presented up to now is to decompose control of a manufacturing system
into a set of semi-independent controllers. This decomposition permits the distributed
calculation of production rates, where each cell controller has a small linear program
to compute local production rates. That linear program is only performed when con-
ditions within the cell's domain change. Any other condition change can be ignored.
This leads to a computational load which involves infrequent, small linear programs
throughout the system.
The solution to the infinite looping problem for reentrant processes attempts to
maintain the clean decomposition of the pyramid hierarchy. This is accomplished by
using two complementary techniques. The first technique requires the identification
and subsequent monitoring of all buffers in a reentrant process. The second tech-
nique requires the addition of a new set of conditional constraints to the hedging
point strategy linear program (3.22). The new conditional constraints link together
explicitly two adjacent reentrant process segments as the buffer conditions require it.
Through the monitoring capability of the hierarchical controller, the amount of
material in each buffer separating two adjacent reentrant process segments is watched.
The controller of the cell which contains the reentrant process segments will be told
when all dividing buffers are empty or all are full. At that point, the cell controller
installs the constraints described below.
Consider the example of Level k Cell c, and the two process segments, j and j'.
When Buffer $\beta_{nj}$ separating the two stages is empty, the hedging point strategy linear
program is modified by adding the explicit constraint linking the production rates $u^k_{cj}$
and $u^k_{cj'}$ together. In this case, the upstream rate, $u^k_{cj}$, limits the downstream rate,
$u^k_{cj'}$:
$$-u^k_{cj} + u^k_{cj'} \le 0 \quad \text{if } b^k_{nj} = 0 \qquad (3.34)$$
Conversely, when Buffer $\beta_{nj}$ is full, the downstream rate, $u^k_{cj'}$, limits the upstream
rate, $u^k_{cj}$. This is represented in the hedging point linear program (3.22) as:
$$u^k_{cj} - u^k_{cj'} \le 0 \quad \text{if } b^k_{nj} = B_{nj} \qquad (3.35)$$
These additional conditional constraints are called anti-looping constraints because
they prevent infinite looping in reentrant, or looping, processes.
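The effect of an anti-looping constraint can be checked on a two-variable example.
The sketch below is not the solver used in Hiercsim; it is a small vertex-enumeration
routine, adequate only for bounded two-variable problems, and the cost weights and
capacity are hypothetical values chosen so that the downstream segment is favored.

```python
from itertools import combinations


def solve_lp_2d(cost, constraints, tol=1e-9):
    """Minimize cost[0]*x + cost[1]*y subject to a*x + b*y <= r for every
    (a, b, r) in constraints, by checking every vertex of the feasible
    region. Suitable only for small, bounded two-variable problems."""
    best = None
    for (a1, b1, r1), (a2, b2, r2) in combinations(constraints, 2):
        det = a1 * b2 - a2 * b1
        if abs(det) < tol:
            continue                       # parallel boundaries, no vertex
        x = (r1 * b2 - r2 * b1) / det      # Cramer's rule
        y = (a1 * r2 - a2 * r1) / det
        if all(a * x + b * y <= r + tol for a, b, r in constraints):
            value = cost[0] * x + cost[1] * y
            if best is None or value < best[0]:
                best = (value, x, y)
    return None if best is None else (best[1], best[2])


# x is the upstream rate, y is the downstream rate (hypothetical numbers).
constraints = [
    (1.0, 1.0, 1.0),    # shared machine capacity
    (-1.0, 0.0, 0.0),   # upstream rate is nonnegative
    (0.0, -1.0, 0.0),   # downstream rate is nonnegative
    (-1.0, 1.0, 0.0),   # anti-looping constraint for an empty buffer
]
u_up, u_down = solve_lp_2d(cost=(-1.0, -2.0), constraints=constraints)
```

With the linking constraint inside the program, a single solution gives equal upstream
and downstream rates, so no second calculation round is triggered.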
Alternative Formulation
An alternative formulation of the hedging point strategy
in Section 3.7.2 which effectively avoids the infinite looping of the reentrant
process control algorithm can be found in Bai (1991a and 1991b). In that formula-
tion, each control level of the hierarchy has a single linear program which determines
all rates within the level. Unlike the strict frequency decomposition hierarchy of
Section 3.5.1, Bai's formulation does take into account work-in-process and the de-
coupling effects of buffers.
This is accomplished by using the technique of process decomposition in which a
process is divided into a set of semi-independent process segments. Instead of com-
puting the production rate for a process segment using the independent cell controller
where the steps of the segment are located, Bai's formulation groups all rate com-
putations into a single controller at each level of the hierarchy. During a Level k
calculation, all Level k rates are determined, regardless of whether or not they have
changed. This contrasts with the pyramid decomposition which calculates production
rates only when the capacity set or the target production rates of an independent cell
have changed.
Bai's calculation at each level using a single controller gives the same result as the
formulation presented in this thesis when buffers are not failed. However, Bai's work
gives a better result when buffers are failed in a reentrant process.
The difference is in the way the conditional constraints imposed by buffers (2.19)
and (2.20) are included in the rate calculation algorithm. In this thesis, the use of
many independent controllers is built on the assumption that outside conditions will
be constant for a long time compared to the time scale of the cell (Section 3.8).
In Bai's work, the grouping of all rate calculations of a level into one controller
allows the buffer constraint limiting values to be computed simultaneously with the
values that are limited. For this reason, the anti-looping constraints (3.34) and (3.35)
are automatically included in the controller. Therefore, reentrant processes do not
induce the infinite looping found in the pyramid decomposition formulation.
A possible hybrid formulation would be to use the pyramid decomposition of in-
dependent cells when there is no possibility for infinite looping (i.e. buffers are not
failed). When there are virtual machine links within a reentrant flow, Bai's formu-
lation could be used to simultaneously solve for the limiting and limited rates. This
hybrid would take advantage of the decoupling effects of the buffers when possible,
and would eliminate excess calculations when the decoupling effects are not present.
3.13 Constraints on Hedging Points
The hedging point strategy presented in Section 3.7.2 is a feedback control law based
on the deviation of the production surplus from an ideal value (the hedging point).
Each process segment has a unique hedging point which is based on the reliability
of resources within the cell. Hedging points help to maintain a desired distribution
of work in process, and to maintain the isolation of cells when the system is at its
collective hedging point.
The control infrastructure of process segments and placement of buffers imposes
conditions on the hedging points for any given process segment. Hedging points in a
system must be chosen so the constraints (2.19) and (2.20) are satisfied for all buffers.
This section describes how the hedging points are limited by the infrastructure and
placement of buffers.
Sample Hierarchy
Consider the manufacturing system shown in Figure 3-11. It
consists of two segments of a process in adjacent Level k Cells c and c'. The lowest
level cell in which both Level k Cell c and Cell c' are contained is Level m - 1 Cell C,
where m < k. Level m - 1 Cell C operates at a lower characteristic frequency than
that of the Level k cells.
Step n of the process is the last step performed in Level k Cell c, and Step n + 1
is the first step performed in Level k Cell c'. Denote the upstream process segment
in Level k Cell c as Process Segment j, and the downstream process segment in Level
k Cell c' as Process Segment j'. The process segments are separated by Buffer $\beta_{nj}$ of
size $B_{nj}$. The Level k production rate of the upstream Process Segment j is $u^k_j$ and
that of the downstream Process Segment j' is $u^k_{j'}$.
By convention in this section, upstream process segments are denoted by subscript
j, and downstream process segments are denoted by the subscript j'. The control
level of a quantity is denoted by superscript k or m.
Single Level Limits on Hedging Points Assume for the moment that m = k,
which implies that Cell C operates at Level k - 1, or the next higher level than Level
k. In this case, Buffer $\beta_{nj}$ is a Level k buffer. The Level k hedging point of the
upstream Process Segment j is $z_j^k$ and that of the downstream Process Segment j' is
$z_{j'}^k$. Note that the rate of change of the amount of material in Buffer $\beta_{nj}$, according to
(2.18), is
$$\dot{b}_{nj}^k = u_j^k - u_{j'}^k \tag{3.36}$$
[Figure 3-11: Constraints on Hedging Points]
Because Cell C contains both process segments at Level k - 1, the
production rate at Level k - 1 is the same for both process segments: $u_j^{k-1} = u_{j'}^{k-1}$.
Equation (3.36) can be rewritten as:
$$\dot{b}_{nj}^k = (u_j^k - u_j^{k-1}) - (u_{j'}^k - u_{j'}^{k-1}) \tag{3.37}$$
By definition of production surplus (3.14),
$$x_j^k = \int u_j^k \, dt - \int u_j^{k-1} \, dt$$
and
$$x_{j'}^k = \int u_{j'}^k \, dt - \int u_{j'}^{k-1} \, dt$$
Equation (3.37) can be integrated, and the amount of material in the buffer as a
function of time becomes:
$$b_{nj}^k(t) = x_j^k(t) - x_{j'}^k(t) \tag{3.38}$$
When both process segments are at their respective hedging points, $z_j^k$ and $z_{j'}^k$,
the amount of material in Buffer $\beta_{nj}$ is equal to the difference between the hedging
points of the upstream and the downstream process segments:
$$b_{nj}^k = z_j^k - z_{j'}^k \tag{3.39}$$
The relationship between the upstream and downstream hedging points given in
(3.39) implies a set of bounds on those values. The amount of material in the buffer
is constrained as in (2.17). Therefore, the hedging points must satisfy
$$0 \le z_j^k - z_{j'}^k \le B_{nj} \tag{3.40}$$
This is a constraint that is imposed on the designer of the hierarchical controller.
Multiple Level Limits on Hedging Points Suppose that there are one or more
control levels between the Level k Cells c and c' and the Level m - 1 Cell C. Suppose
that Level m - 1 Cell C is the lowest level cell in the hierarchy that contains both
Level k Cell c and Cell c'. Buffer $\beta_{nj}$ is now a Level m buffer, which is able to decouple
the effects of Level m and lower events. Therefore, the Level k - 1 production targets
may be different for Level k Process Segments j and j'.
Consequently, the limits on the hedging points in this situation involve multiple
levels of hedging points, instead of the single level found in (3.40). Recall that
$\dot{x}_j^k = u_j^k - u_j^{k-1}$, and that $u_j^{m-1} = u_{j'}^{m-1}$. Expanding the fill rate equation (2.18) for Buffer
$\beta_{nj}$,
$$\dot{b}_{nj}^k = \{(u_j^m - u_j^{m-1}) + (u_j^{m+1} - u_j^m) + \cdots + (u_j^k - u_j^{k-1})\} - \{(u_{j'}^m - u_{j'}^{m-1}) + (u_{j'}^{m+1} - u_{j'}^m) + \cdots + (u_{j'}^k - u_{j'}^{k-1})\} \tag{3.41}$$
Note that all the interior terms cancel out, leaving
$$\dot{b}_{nj}^k = (u_j^k - u_j^{m-1}) - (u_{j'}^k - u_{j'}^{m-1}) = u_j^k - u_{j'}^k$$
After substituting the definition of surplus (3.14) and integrating, the Level k
amount of material in Buffer $\beta_{nj}$ as a function of each of the surpluses feeding it from
Level k to Level m is obtained:
$$b_{nj}^k(t) = \{x_j^m(t) + x_j^{m+1}(t) + \cdots + x_j^{k-1}(t) + x_j^k(t)\} - \{x_{j'}^m(t) + x_{j'}^{m+1}(t) + \cdots + x_{j'}^{k-1}(t) + x_{j'}^k(t)\} \tag{3.42}$$
Note that the surplus at Level k + 1, $x^{k+1}$, is small compared to the surplus at
Level k. It is assumed that from the point of view of a Level k observer, the Level
k + 1 surplus is negligible. This implies that $x^k + x^{k+1} \approx x^k$ from the point of view
of a Level k observer. For this reason, only those surpluses from Level k to Level m
are included in (3.42). Those surpluses at Level k + 1 and below are excluded.
When each of the process segments from Level k to Level m is at its respective
hedging point, each of the surpluses $x^\lambda$, $m \le \lambda \le k$, is equal to the corresponding
hedging point value $z^\lambda$. Since this is a desired state for the hierarchical controller, the
hedging points must satisfy the buffer constraints (2.19) and (2.20). Therefore, if the
values of surplus $x^\lambda$ in (3.42) are replaced with the corresponding values of hedging
point $z^\lambda$, and the buffer constraints are imposed, then the following limits on multiple
level hedging points are obtained:
$$0 \le \sum_{\lambda=m}^{k} z_j^\lambda - \sum_{\lambda=m}^{k} z_{j'}^\lambda \le B_{nj} \tag{3.43}$$
Isolation of Cells using Hedging Points Virtual machines, introduced in Sec-
tion 3.8, are devices used to inform a cell of a limiting rate due to events outside of its
domain. When there is a disturbance, or the system is recovering from a disturbance,
virtual machines are invaluable for balancing a system of isolated cells. However,
when the system is at its collective hedging point, all process segments are producing
at the overall target production rate. There is no need in that case to limit produc-
tion of a cell from the outside given the assumption that the target production rate
is feasible system-wide.
If the hedging points of two adjacent process segments satisfy either one of the
conditions in (3.43) with equality, then a virtual machine will be active when the
system is at its collective hedging point. The active virtual machine effectively elim-
inates the process decomposition between the two segments by tying one segment to
the other. Whenever a calculation in the cell which provides a limiting rate for the
virtual machine is made, a calculation is automatically triggered in the cell with the
virtual machine, eliminating the advantage gained by constructing a controller based
on independent cells.
To avoid unnecessary virtual machine connections and the resulting extra
computation when the system is at its collective hedging point, (3.43) is imposed on the
hierarchy designer with strict inequalities for control levels $m \le k$:
$$0 < \sum_{\lambda=m}^{k} z_j^\lambda - \sum_{\lambda=m}^{k} z_{j'}^\lambda < B_{nj} \tag{3.44}$$
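The hedging point limits derived in this section can be checked mechanically when a hierarchy is specified. The following C fragment is a minimal sketch of such a check and is not taken from Hiercsim: the function name, the argument layout, and the convention that the arrays hold one hedging point per control level from Level m to Level k are all assumptions made for illustration.

```c
#include <stddef.h>

/* Hypothetical helper (not part of Hiercsim): checks the hedging point
 * limits for the buffer between upstream Process Segment j and
 * downstream Process Segment j'.
 *
 * z_up[i], z_down[i]: hedging points of the upstream and downstream
 *                     segments, one entry per control level from
 *                     Level m to Level k (n_levels = k - m + 1;
 *                     n_levels = 1 is the single level case).
 * buffer_size:        capacity of the buffer.
 * strict:             nonzero requires strict inequalities,
 *                     zero allows equality at either bound.
 * Returns 1 if the limits are satisfied, 0 otherwise. */
int hedging_points_feasible(const double *z_up, const double *z_down,
                            size_t n_levels, double buffer_size, int strict)
{
    double diff = 0.0;
    size_t i;

    /* Amount of material in the buffer when every segment sits at its
     * hedging point: upstream sum minus downstream sum. */
    for (i = 0; i < n_levels; i++)
        diff += z_up[i] - z_down[i];

    if (strict)
        return diff > 0.0 && diff < buffer_size;
    return diff >= 0.0 && diff <= buffer_size;
}
```

For instance, with upstream hedging points {10, 4}, downstream hedging points {6, 3}, and a buffer of size 5, the difference of the sums is 5. The non-strict limits hold with equality, but the strict limits fail, which corresponds to the case in which a virtual machine would be active at the collective hedging point.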
Distribution of WIP using Hedging Points The hedging points for buffers are
implicitly defined at each Level k by the process segment hedging points, as shown
by (3.39). Buffers serve as temporary storage areas for parts in the middle of the
process sequence. For this reason, the work in process distribution can be regulated
by using the hedging points to determine a desirable amount of material in each
buffer. The tradeoff between the isolation of disturbances and the accumulation of
work in process can be made by choosing appropriate process segment hedging points
and buffer sizes.
Bai (1991a, 1991b) and Bai and Gershwin (1990a, 1990b) introduced the concept
of a buffer hedging point and a buffer hedging space. The buffer hedging point $z_{nj}^b$
is equal to the amount of material in the buffer when the process segments entering
and leaving the buffer are at their hedging points:
$$z_{nj}^b = z_j^k - z_{j'}^k$$
The buffer hedging space $z_{nj}^s$ is the difference between the maximum buffer size
and the buffer hedging point:
$$z_{nj}^s = B_{nj} - z_{nj}^b$$
Using long-term probabilistic models of simple systems, Bai and Gershwin were
able to formulate approximate analytical solutions for optimum buffer spaces and
buffer hedging points.
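The two buffer quantities defined above follow directly from the process segment hedging points and the buffer size. The C fragment below is an illustrative sketch only; the function names are hypothetical and do not correspond to routines in Hiercsim or to the formulation of Bai and Gershwin.

```c
/* Hypothetical helpers (names are illustrative, not from Hiercsim). */

/* Buffer hedging point: material in the buffer when the upstream and
 * downstream process segments are both at their hedging points. */
double buffer_hedging_point(double z_upstream, double z_downstream)
{
    return z_upstream - z_downstream;
}

/* Buffer hedging space: room remaining in the buffer when it holds
 * exactly its hedging point amount. */
double buffer_hedging_space(double buffer_size, double z_upstream,
                            double z_downstream)
{
    return buffer_size - buffer_hedging_point(z_upstream, z_downstream);
}
```

With a buffer of size 20, an upstream hedging point of 12, and a downstream hedging point of 5, the buffer hedging point is 7 and the buffer hedging space is 13. Raising the upstream hedging point increases the work in process held in the buffer and reduces the space available to absorb a downstream disruption.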
Chapter 4
Implementation Details for
Hiercsim Versions 3.5 and 4.0
This chapter describes some of the implementation details embodied in Hiercsim
Versions 3.5 and 4.0.
Hiercsim Version 1.0 was written by Darakananda (1989) to serve as a research
testbed for the hierarchical control concept in a factory simulation. This initial version
demonstrated that implementation is a non-trivial task, mainly because there is much
more to a factory simulation than the hierarchical controller described in Chapter 3.
The subsequent versions of Hiercsim (3.5 and 4.0) are built on the initial work of
Darakananda. However, in the process of demonstrating the control algorithm, the
entire simulation code was rewritten.
Hiercsim Version 3.5 and Version 4.0 each contain roughly 20,000 lines of code and
comments. The code is written in pre-ANSI C on MIT Project Athena and is contained
in 60 different files. It consists of over 350 separate routines. The simulation code
contains a complete factory model based on the assumptions of Chapter 2, a complete
factory controller based on the hierarchical control theory described in Chapter 3, and
a complete user interface using ASCII flat files.
Hiercsim Version 3.5 focuses on factories where the only activities are operations
and failures given a constant setup state for all machines. Hiercsim Version 4.0
focuses on factories where the only activities are operations and setup changes given
a constant repair state for all machines. Version 4.0 is partially completed.
Three general areas drove the complexity of this implementation: architecture
development, algorithm development, and system integration. There still remains
work which can improve the efficiency and theoretical understanding of the solutions.
The important issues related to architecture, system integration, and algorithms are
addressed in this chapter. Below is a brief definition of these terms, along with an
outline of the specific problems encountered.
1. Architecture Development Architecture development of Hiercsim is defined
as the layout of all the functions required to run a factory simulation using a
hierarchical control policy. The architecture defines the boundaries of responsi-
bility for different functions, the representation of objects, and the connections
between functions and objects.
In Hiercsim Versions 3.5 and 4.0, a modular architecture was chosen so that
each routine either performed a single function, or called other functions. In
this manner, Hiercsim was designed to be expandable and adaptable to different
control policies within a hierarchical framework. Since each function either calls
other routines, or performs a single task, changing a specific policy requires
changing only a single routine.
The actual control policies are simplified because each of the many decisions is
performed in its own unique routine. This allows the routine to focus on a single
decision at a time, thus minimizing the complexity of interaction described
in Section 3.1.2. These separate decisions are integrated into a system-wide
controller by the interfacing techniques outlined below.
Two other features of the architecture are worth noting. The first is that the
output was designed as a series of probes into the code which could be turned on
or off from the input file. Therefore, the output is non-intrusive, and can be ex-
panded to meet the research needs of users. The output probes are independent
of the execution of the simulation.
The second feature is that a conscious effort was made to separate the factory
data storage and representation from the control algorithms. The factory
data is placed into a database format with individual records for each type of
component in the factory.
The advantage of this approach is that it allows the user to change specific
control algorithms without having to redefine the data representing the factory
and control structure. This approach required attention to system integration
which took care of the interfacing between the control algorithms and the data
structures.
2. System Integration System integration is defined as the efforts required to
fuse the many different components of a factory simulation into a single system.
The integration is accomplished through the use of interfaces, the specification
of initial conditions, the control of event timing, and the preparation of the
linear programs (3.22) required for rate calculations.
Hiercsim 3.5 and 4.0 have four major areas requiring interfaces. These interfaces
are described in detail in Section 4.2:
* Interface between the Simulation and the User.
* Interface between Independent Cells in the Hierarchy.
* Interface between the Hierarchical Controller and the Factory.
* Interface between Computation Routines and Data Storage.
3. Algorithm Development In this thesis, algorithm development is defined to
be the translation of the control policy described in Chapter 3 into a com-
plete algorithm which determines the rates and times of controllable events in
a factory simulation.
Two major problem areas were resolved in this implementation, and a third
received a partial resolution. The first area involved the addition of appro-
priate boundary constraints to the linear program (3.22) as the hedging point
was approached. This problem and its solution are described in Sections 4.8.3
and 4.8.4. The second area involved the definition of an appropriate factory
model and is described in Section 4.3. The third area involves reentrant flow
constraints from Section 3.12.
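The output probes mentioned under Architecture Development can be pictured as a set of switches that are set once from the input file and consulted wherever a report might be written. The C fragment below is a sketch under that assumption; the probe names, the bit-mask representation, and the report format are invented for illustration and are not the Hiercsim output routines.

```c
#include <stdio.h>

/* Hypothetical probe identifiers; the real output options are not
 * reproduced here. */
enum probe_id {
    PROBE_BUFFER_LEVELS = 1 << 0,
    PROBE_RATE_CALCS    = 1 << 1,
    PROBE_LOT_MOVES     = 1 << 2
};

/* Set of enabled probes, filled in while the input file is read. */
static unsigned enabled_probes = 0;

void probe_enable(enum probe_id id)  { enabled_probes |= (unsigned)id; }
void probe_disable(enum probe_id id) { enabled_probes &= ~(unsigned)id; }

int probe_is_enabled(enum probe_id id)
{
    return (enabled_probes & (unsigned)id) != 0;
}

/* Writes one record if the probe is switched on. The routine only
 * reports; it never alters the state of the simulation.
 * Returns 1 if a record was written, 0 if the probe was off. */
int probe_report(enum probe_id id, FILE *out, double time, const char *msg)
{
    if (!probe_is_enabled(id))
        return 0;
    fprintf(out, "%10.3f  %s\n", time, msg);
    return 1;
}
```

Because a disabled probe returns before doing any work, adding a new probe for a particular research question does not change the sequence of simulated events.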
4.1 Overall Architecture of Hiercsim
Unlike a physical object in which connections between components are limited to
immediate neighbors, a computer code has no such inherent limitation. Unless lim-
itations are imposed on the code which reflect the physical limitations found in a
factory, the simulation code will not accurately reflect behavior of the factory. Great
care was taken in designing the revised architecture of Hiercsim to embed the physical
reality of a factory into the code.
There are five major components of the architecture, listed below. These five
components are expanded in more detail in the following sections.
1. Data representation
2. Data input
3. Basic factory model
4. Capacity allocation
5. Output
4.1.1 Data Representation Architecture
This section describes the data representation architecture for both the factory model
and the controller.
Data representation describes how the static and dynamic components of the
factory state are stored in the simulation database. The layout of this database
determines how routines within the simulation interact to influence the future state
of the factory based on the current state. This layout is fundamental to achieve a
factory simulation which can accurately reflect the behavior of a real factory.
The data representations in Hiercsim Versions 3.5 and 4.0 are designed to mirror
reality as closely as possible. This is accomplished by building each data structure
with exactly the types of information which are available in real life. Hiercsim contains
a total of eight major data structures with numerous supporting data structures. Each
of the major structures, and the type of information available to it, is described
briefly below. The reader is referred to Chapters 2 and 3 for definitions
and explanations of the items in this list.
1. QELEM A QELEM in Hiercsim is a Queue ELEMent, or event. This data
structure contains information on the time and type of the event, as well as
specific information such as the control level, cell, lot, resource, or machine
which is directly affected by the event.
These elements are placed into an event queue based on the scheduled time of the
event. In the case that two events are scheduled at the same time, the event
queue order is resolved on a first-in, first-out basis. This simple convention aids in
maintaining the integrity of event timing. More about the event queue appears
in Section 4.1.3.
2. LOT A LOT in Hiercsim is a fundamental group of parts which travel through
the factory together. No parts from the lot drop out during processing (100%
yield is assumed), and no parts join the lot in transit.
Each lot has a process, location, size, and processing status. The lot keeps track
of when it entered the system so that cycle time statistics can be maintained.
The location changes over time as the lot moves through the various stages of
the process. The lot contains the name of the lowest level cell in the hierarchy
that is currently controlling the movement of the lot.
3. MACHINE A MACHINE in Hiercsim is the fundamental unit which is capa-
ble of performing operations on parts in a lot. A machine is always part of a
machine group, and has an identifying number within that machine group which
distinguishes it from any other identical machine in the same group. The
information kept by the machine consists of the lot currently being processed,
the machine's current setup state, and the machine's current failure mode (if
any).
4. RESOURCE A RESOURCE is a place where a lot of parts can come to
rest for any length of time in a processing sequence. A resource in Hiercsim
can either be a buffer or a machine group. Data contained by each resource
includes the name of the lowest level cell in the hierarchy which controls it, all
the different process steps which use the resource, and a list of all lots presently
at the resource.
Information specific to buffers is the amount of material in the buffer as seen
from all control levels of the hierarchy. (This includes the actual count of parts
in the buffer.) The amount of material in the buffer is different at different levels
of the hierarchy due to the difference in measurement frequencies as described
in Section 2.9.3.
Information specific to machine groups consists of the total number of identical
machines, the failure parameters of the machine type, the current effective capacity
of the machine type at each control level in the hierarchy, and the setup change
parameters of the machine type.
5. ROUTEENTRY A ROUTEENTRY is the name in the code for a single step
in a process. The information contained in this data structure is the resource
required (either a buffer or a machine group), the lowest level rate variable
which sets the production rate of the step, and the upstream and downstream
steps in the process. The ROUTEENTRY also specifies the setup state required
and the operation performed if the step is a machine group step.
6. PROCESS A PROCESS is a sequence of steps required to transform raw
material into finished product. Hiercsim does not support multiple routing of
parts. The process data structure contains a list of steps to be performed, the
overall target demand rate, and the relative importance of the process compared
to all other processes (i.e. the A coefficient of Section 3.7.2).
7. CELL A CELL structure is a fundamental building block of the hierarchical
controller and receives continuous target production rates. This structure is
based on the definition in Section 2.1.1. (A machine group is the other fun-
damental building block and receives discrete loading requirements instead of
continuous target production rates). The cell has a limited view of the factory,
both from a frequency and a spatial vantage point.
The cell contains information about the control level where it is located, the
next lower level components which it controls (either lower level cells or ma-
chine groups), the next higher level cell which supplies it with target rates, as
well as boundary buffers. The cell also contains information about the process
segments which it controls directly and the control structure (CTRLLEVEL)
which contains all the information needed to compute the rates for each of its
process segments.
8. CTRLLEVEL CTRLLEVEL is the name of the data structure in the code
which contains all the information necessary to compute production rates for
each of the process segments in the cell. This is also called the cell controller.
The information consists of the linear program at the heart of the hedging
point strategy (Section 3.7.2), the status of virtual machines (Section 3.8), the
current state of production surplus (x), and setup change control information
(in Hiercsim Version 4.0 described in Chapter 5).
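To make the list above more concrete, two of the eight structures are sketched here in C. These declarations are illustrative only: the structure names follow the text, but every field name, type, and the event type codes are assumptions and should not be read as the actual Hiercsim definitions.

```c
/* Illustrative declarations only; field names and types are assumed. */

struct cell;       /* CELL, not detailed in this sketch      */
struct resource;   /* RESOURCE: a buffer or a machine group  */
struct process;    /* PROCESS: the sequence of steps         */
struct machine;    /* MACHINE                                */

/* LOT: a group of parts that travels through the factory together. */
typedef struct lot {
    int              id;
    struct process  *process;     /* process the lot follows             */
    struct resource *location;    /* resource where the lot currently is */
    int              size;        /* number of parts (100% yield)        */
    int              step;        /* processing status: current step     */
    double           entry_time;  /* time the lot entered the system     */
    struct cell     *controller;  /* lowest level cell controlling it    */
} LOT;

/* Hypothetical event type codes. */
typedef enum {
    EV_LOT_ARRIVAL, EV_LOT_DEPARTURE, EV_FAILURE, EV_REPAIR, EV_RATE_CALC
} EventType;

/* QELEM: one event in the event queue. Only the fields relevant to a
 * given event type would be filled in; the rest stay null. */
typedef struct qelem {
    double           time;        /* scheduled time of the event */
    EventType        type;
    int              level;       /* control level affected      */
    struct cell     *cell;
    LOT             *lot;
    struct resource *resource;
    struct machine  *machine;
    struct qelem    *next;        /* next event in time order    */
} QELEM;

/* Cycle time of a lot, measured from its entry into the system. */
double lot_cycle_time(const LOT *lot, double now)
{
    return now - lot->entry_time;
}
```

Storing the entry time in the lot itself is what allows cycle time statistics to be gathered when the lot is shipped, without consulting any other structure.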
4.1.2 Data Input Architecture
This section describes how data is entered into the simulation for both the factory
model and the controller.
Data input is the means by which parameters governing the behavior of the fac-
tory are entered by a user. The input for a simulation not only has to contain all
information required to run the program, but it also has to be easily readable by a
user so that alterations may be made quickly. Otherwise, the function of the simula-
tion as a research testbed will not be achieved. Note that the data representation in
the simulation has a direct impact on the architecture of the input file.
The architecture of the input file consists of modules which are read into the
simulation in sequence. The sequence of the input modules and a brief description of
the content of each module are described below:
1. Overall Parameters This module specifies the random number seed, the sim-
ulation end time, and the values for small and large numbers in the simulation
($\epsilon$ and Z). The choice of these parameters affects the sequence of random events,
the amount of time required to run the simulation, and the sensitivity of the
algorithms to numerical instabilities.
Each control level k requires a time interval $\epsilon_k$. Events whose interval between
occurrences is shorter than $\epsilon_k$ are represented by flow rates at Level k. Events
whose interval between occurrences is greater than $\epsilon_k$ are treated as discrete
changes in system state at the instant that they occur.
The value of the interval $\epsilon_k$ should satisfy the relation
1
k < (4.1)
Similarly, each control level k requires a long time interval Ik. That inter-
val represents a threshold between events whose occurrences must be modeled
explicitly and those events whose occurrences may be considered only at the
instant when they affect the Level k state.² Such an interval should satisfy the
relation:
1
Ik > (4.2)
¹ If this time scale concept were implemented correctly, there would be an interval $\epsilon_k$ for each
Level k cell based on the characteristic frequency $f_k$. However, the current implementation only
reads a single value of $\epsilon$ for the entire simulation. This should be changed in future simulations.
² As with the short interval $\epsilon_k$, only a single value of I is read in for all control levels. In future
versions of the simulation, this should be changed so that each level has its own unique version of a
long interval Ik.
2. Machine Prototypes The machine prototype section details the different
types of machines which are present in the simulation. Each machine type
has a list of operations which it is capable of performing, a list of possible setup
states, and a list of possible failure modes. (Note that setup information ap-
pears in Hiercsim Version 3.5, but that no setup changes are possible. Only
Version 4.0 allows setup changes.)
From this prototype information, individual machines of each type are created
for each of the machine groups specified in the cell input module. Multiple
copies of the same machine prototype may be created. Only the machines in
the machine groups are used to process parts. Once the machine groups are
specified, the machine prototypes are ignored.
3. Buffers All buffers in the simulation are specified here. The only information
required here is the name and size of each buffer. The location of the buffer is
specified in the cell control structure module.
4. Cell Structure The cell structure module specifies the hierarchy which will
be used to control the factory. In this module, the relative positions of cells,
machine groups, and buffers are specified. The machine groups are created from
machine prototypes, and are inserted directly into the lowest level cells. Buffers
are included in the lowest level cell which supplies target production rates to
both the process segment feeding the buffer and the one which draws from the
buffer (Section 3.5).
5. Process Layout The process layout specifies the overall demand rate, relative
importance (the A-coefficient), the operation level, and the route of a part
through the system. The route is comprised of alternating buffer and machine
group steps. Note that the combination of the cell structure and route layout
determines which process steps are included in which process segments at each
control level of the hierarchy.
Hierarchy Creation After the machine prototypes, buffers, cell structure, and
process layout are all read in, the hierarchy is constructed using intermediate
routines between input modules. Those routines set up the linear programs,
the control variables, and all the interconnections necessary for successful com-
munication throughout the controller. Note that this is not an input module.
6. Hedging Points Once interconnections between cells and process segments are
created, hedging points can be read in from the input file for each control level
and each process in the simulation. These hedging points specify the amount of
work-in-process which is desirable at each point in each process, based on the
nature of disruptions affecting production.
7. Output Specification The last modules of the input file specify both the
content and the timing of output to be generated during the simulation run.
These output options are described in Appendix C.
A sample input file appears in Appendix A. That input file is used in the example
in Section 7.2.4.
4.1.3 Factory Model Architecture
This section describes in detail the factory model architecture.
The factory model architecture consists of the layout of the database (which stores
the state of the factory) and the routines which modify the state of the factory model.
In order to accurately model a factory, both the data representation and the routines
which use that data must have a very well defined scope. Note that the factory model
represents the physical components of the factory, whereas the controller represents
the decision-making process. The communication between the controller and the
factory model is described in Section 4.3.
Section 4.1.1 describes the eight types of data structures required in the factory
simulation. Four of those data structures interact to form the basic factory model.
Those structures are the lot, machine type, machine group, and buffer. The integrity
of the timing of events affecting these structures is maintained by the event queue,
which is a specialized data structure.
This section describes the architecture of the factory model, and how specific
routines interact with the database. The two basic components of factory dynamics
which are modeled in Hiercsim Version 3.5 and 4.0 are lots and machines. Specifically,
a lot may be created, moved from location to location, processed at machine groups,
and removed from the system upon completion. A machine may process lots, fail,
become repaired, and change setup state.
Event Queue The simulation implementation uses an event queue to maintain
continuity and sequence of events. A time is associated with each event in the sim-
ulation. Events are placed in the event queue according to that time. Events which
are supposed to occur in the same time instant are executed sequentially in a first-in,
first-out manner. The event queue and the routine which handles events from the
event queue are the heart of the factory model.
The purpose of putting an event in the event queue is either to start a control
calculation or to model a physical event. An event can change the state of the system,
and cause other events to be put in the event queue. For example, a failure will trigger
a rate calculation; and a part which is just cleared for entry into an operation level
cell will trigger an attempt to load the part by a machine group controller. The scope
of the state alteration depends on the event type.
An event may alter the state of a specific machine, lot, buffer, machine group,
or cell. The fidelity of the factory model is maintained by limiting the scope of the
routines triggered by the event to that of real life. For example, a machine is only
able to react to events in the buffers immediately upstream and downstream of its
machine group. Any control involving more than one machine group is accomplished
by the hierarchical controller.
Once the state of the system is altered, other events can be scheduled which react
to the new state. Those events are placed in the event queue and will be pulled off
at the appropriate time in the future. For example, the event which represents the
completion of processing at a specific machine updates the bookkeeping and schedules
the event which will transport the lot to the downstream buffer.
In the case that an event eliminates the need for a previously scheduled event,
that future event is canceled from the queue. This type of event occurs frequently in
the scheduling of control calculations.
For example, the production rate calculations in a cell are based on the assumption
that the state of the cell will be constant until the next time a calculation is required.
That next calculation is scheduled in the event queue. Suppose that a failure occurs
within the cell which changes the state of the cell. The cell controller must react
immediately to the state change, so a calculation event is immediately scheduled.
The previously scheduled calculation event is no longer valid since the cell's state has
changed, and is canceled.
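The behavior of the event queue described above (insertion in time order, first-in, first-out handling of simultaneous events, and cancellation of events that are no longer valid) can be sketched with a sorted linked list. The fragment below is a minimal illustration in C, not the Hiercsim implementation; the Event record is a simplified stand-in for the QELEM structure, and the function names and integer type and cell codes are assumptions.

```c
#include <stdlib.h>

/* Simplified, hypothetical event record. */
typedef struct event {
    double        time;   /* scheduled time of the event      */
    int           type;   /* event type code (illustrative)   */
    int           cell;   /* id of the affected cell          */
    struct event *next;
} Event;

typedef struct { Event *head; } EventQueue;

/* Insert an event in time order. The new event is placed after every
 * queued event with an equal or earlier time, so simultaneous events
 * are executed first-in, first-out. Returns NULL if out of memory. */
Event *eq_schedule(EventQueue *q, double time, int type, int cell)
{
    Event *e = malloc(sizeof *e);
    Event **p = &q->head;

    if (e == NULL)
        return NULL;
    e->time = time;
    e->type = type;
    e->cell = cell;
    while (*p != NULL && (*p)->time <= time)
        p = &(*p)->next;
    e->next = *p;
    *p = e;
    return e;
}

/* Remove and return the earliest event, or NULL if the queue is
 * empty. The caller frees the returned event. */
Event *eq_pop(EventQueue *q)
{
    Event *e = q->head;

    if (e != NULL) {
        q->head = e->next;
        e->next = NULL;
    }
    return e;
}

/* Cancel every pending event of the given type for the given cell.
 * Returns the number of events removed. */
int eq_cancel(EventQueue *q, int type, int cell)
{
    Event **p = &q->head;
    int removed = 0;

    while (*p != NULL) {
        if ((*p)->type == type && (*p)->cell == cell) {
            Event *dead = *p;
            *p = dead->next;
            free(dead);
            removed++;
        } else {
            p = &(*p)->next;
        }
    }
    return removed;
}
```

In the failure example above, the routine handling the failure would call eq_cancel for the cell's pending rate calculation and then eq_schedule a new calculation at the current time. A linked list keeps the sketch short; a real simulation with many pending events might prefer a heap with a sequence counter to preserve the first-in, first-out rule.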
Lot Dynamics Whenever a lot is removed from an entry buffer, a new lot is created
from the raw material in the warehouse. This new lot is placed into the entry buffer
and remains there until the controller permits processing to begin. The routines which
create the lot handle the allocation of memory in the database and the initialization
of data (such as the lot number, lot size, and lot location).
A similar set of routines is responsible for removing a lot upon
completion of processing. These routines ensure the integrity of the memory handling
system and the data management system. Bookkeeping is updated to account for the
shipment of the completed lot.
During the course of processing, lots are moved from station to station. Four sets
of routines handle this function. The precise details of the policies used to move lots
from place to place are described later in Section 4.3.2. The function of these routines
is described here. The four sets of routines can be divided into two separate groups:
the first being lot departures, and the second being lot arrivals.
Lot Departure Upon the departure of a lot from a resource, the departure routine
updates the production and status statistics at the resource. In addition, the routine
schedules the arrival of the lot at the downstream resource.
The departure routine immediately looks upstream to see if there is another lot to
fill the space freed by the moved lot. If there is one, a policy is invoked to determine
whether or not that lot may be moved forward. The policies used in Hiercsim Version
3.5 are described in Section 4.3.2. At that point, the lot is marked and is scheduled
for departure from the upstream station. Different policies for buffers and machine
groups are invoked to determine whether or not an upstream lot may move forward.
Lot Arrival Upon arrival at a resource, a different set of routines determines the
next action to be performed on the lot.
If the lot has arrived at a buffer, then the policy to load the lot onto the next
machine group is invoked. Again, these policies are described in Section 4.3.2. If
there is a need to process the lot, then a lot departure is scheduled. Otherwise, the
lot waits in the buffer until needed.
If the lot has arrived at a machine within a machine group, then the arrival routine
determines the time to start processing. This start time is scheduled and is handled
by an entirely different set of routines.
Machine Dynamics The most basic function of a machine is to process lots.
Superimposed on this basic function are the more complex activities of machine failures,
repairs, and setup changes.
In Hiercsim, there are three separate components to processing lots at machines:
the decision to load a lot, the actual loading of the lot, and the processing of the
loaded lot. The architecture of Hiercsim has separated these three components in
order to separate the controller from the factory. This separation allows the user to
experiment with different control algorithms without having to rewrite major portions
of code.
Machine Group Controller The fundamental architectural feature which allows
the separation of the controller from the factory is the existence of a machine group
controller. The machine group controller chooses from among available lots in the
buffers immediately upstream of the machine group and assigns those lots to available
machines within the group. An available machine is one which is not currently occupied
by any other activity. The algorithm used as a default in Hiercsim is described
in Section 4.3.2.
Lots are made available for processing by the hierarchical controller. An available
lot is indicated by being marked green. Unavailable lots are marked red.
A lot enters the entry buffer of a process marked red. It is turned green when
the staircase policy of Section 3.9 permits the entry cell to process another lot. A lot
remains green until it reaches a buffer whose control level is less than or equal to that
of the operation level of the lot. At that point, the lot is red until cleared for entry
into the next cell downstream by that cell's staircase policy.
This marking of lots allows the hierarchical controller to communicate indirectly
with the machine group controller. In addition, the marking of lots permits each
controller to operate independently of the algorithms used in all other controllers.
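The marking rule can be sketched in a few lines of C. This is only an illustration under stated assumptions: the type and function names (Lot, lot_arrive_at_buffer, lot_clear_for_entry, lot_is_available) are hypothetical and are not taken from the Hiercsim source, and control levels are assumed to be stored as plain integers.

```c
#include <stdbool.h>

/* Hypothetical types; not taken from the Hiercsim source. */
typedef enum { LOT_RED, LOT_GREEN } LotColor;

typedef struct {
    int operation_level;   /* operation level of the lot (assumed integer) */
    LotColor color;        /* RED = unavailable, GREEN = available */
} Lot;

/* Called when a lot arrives at a buffer: the lot loses its clearance if the
 * buffer's control level is less than or equal to the lot's operation level. */
void lot_arrive_at_buffer(Lot *lot, int buffer_control_level)
{
    if (buffer_control_level <= lot->operation_level)
        lot->color = LOT_RED;
}

/* Called by the hierarchical controller when its staircase policy
 * permits the cell to process another lot. */
void lot_clear_for_entry(Lot *lot)
{
    lot->color = LOT_GREEN;
}

/* The only question the machine group controller asks about a lot. */
bool lot_is_available(const Lot *lot)
{
    return lot->color == LOT_GREEN;
}
```

In this sketch the two controllers share nothing but the color field, which is the sense in which they communicate indirectly.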
Loading, Processing, and Unloading of Lots Once an available lot has been
assigned to a specific machine for processing, separate routines move the lot from the
buffer onto the machine. See Section 4.3.2 for details.
The processing routines are split into four sections: load a lot onto an individ-
ual machine, start processing, interrupt processing (in case of failure), and resume
processing. These routines handle the bookkeeping associated with tracking the pro-
duction status of machines, the allocation of capacity to discrete machines, and the
accounting of failures into the processing time of lots.
Once the lot has completed processing, it is immediately unloaded from the ma-
chine and moved into the appropriate downstream buffer. The machine group con-
troller is informed that the machine is idle.
Note that each of these routines only carries out actions and makes no decisions. All
of the decisions required to make the lot available for processing are made elsewhere.
This feature greatly simplifies the writing of the code.
A feature which allows the separation of the factory from its controller is the
existence of actual and perceived operation times. An actual operation time is the
clock time it takes to physically process the lot on a machine. A perceived operation
time is the duration that the controller perceives the operation to take, and therefore it
is the time used in capacity allocations. This difference between actual and perceived
operation times can be used to explore the sensitivity of the control algorithm to
imprecise factory data.
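One way to picture the two operation times is as a pair of fields read by different parts of the program. The following is a sketch only: the record layout and function names are hypothetical, and the controller-side load (rate multiplied by perceived time) is assumed from the usual form of a capacity constraint rather than taken from the Hiercsim code.

```c
/* Hypothetical record; field and function names are not from Hiercsim. */
typedef struct {
    double actual;     /* clock time the factory model takes to process a lot */
    double perceived;  /* duration the controller assumes in capacity allocation */
} OperationTime;

/* Factory side: when a lot whose processing starts at `start` physically finishes. */
double factory_finish_time(const OperationTime *op, double start)
{
    return start + op->actual;
}

/* Controller side: fraction of machine time that a production rate is
 * believed to consume (assumed form: rate times perceived operation time). */
double controller_load(const OperationTime *op, double rate)
{
    return rate * op->perceived;
}
```

Setting the two fields to different values is then enough to feed the controller imprecise data without touching the factory model.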
Machine Failure and Repair Dynamics In Hiercsim, a failure can occur at any
point in the processing cycle of a machine without warning. During a failure, time
stands still for that machine, according to the assumptions of Section 2.7. That
is, processing stops in mid-operation and resumes upon completion of
the machine repair as if the failure had not occurred. This assumption simplifies the
failure and repair routines since they can be written independently of the rest of the
simulation.
This architecture also permits each machine in a machine group to be treated
independently of all other machines in the group. These routines alter the status of
machines (Section 4.3.3) and thus communicate indirectly with the machine group
controller which assigns lots to be processed. This ability allows the machine relia-
bility assumptions of Section 2.7 to be coded easily.
In the same manner that operation times may be different for the factory and the
controller, failure and repair times may be different. The actual failure and repair
parameters are representative of the physical factory. The perceived failure and repair
parameters are those which are used in the calculations by the controller to allocate
capacity.
4.1.4 Linear Program Architecture
This section describes the architecture associated with linear programs in Hiercsim
Versions 3.5 and 4.0. It deals with the hierarchical controller. Two factors drove the
choice of architecture: ease of maintenance and usefulness in debugging the simulation
code. Those factors are outlined, followed by the specific description of the linear
program architecture.
Common Format The algorithms which operate on a linear program use a com-
mon linear program format, regardless of the types of constraints involved. Con-
straints may be added as the simulation conditions change during a run. In the
future, as new algorithms are formulated, new types of constraints may be required
and must be able to be added to the simulation.
The architecture of Hiercsim requires an interface to translate constraints into the
linear program format. In this way, constraints may be added to the hedging point
strategy in the simulation by merely altering the interface while leaving the calculation
routines intact. The procedure for adding constraints to the control algorithm is
described in Appendix D.
Simple Linear Program Algorithm Early in the development of Hiercsim Ver-
sion 2.0, it was discovered that the linear program solver could not solve some of the
problems generated by the simulation. Because the algorithms used in Hiercsim were
under development, the fact that the linear program solver was not powerful turned
out to be a blessing in disguise.
The solver usually did not work when an algorithm error provided it with an
unusual problem. This failure information was used to detect and subsequently fix
subtle algorithm errors on many occasions. Even though a more powerful solver would
not fail as often, it would also mask important development information.
A fundamental design decision in Hiercsim has been to use the simple linear program
solver, and to use supporting logic to remove all rate variables which are known to
be zero in the optimal solution. (Rates that are known to be zero include those of
parts whose surplus is greater than their hedging point and those of process segments
which contain one or more machine groups with all machines failed.) For the most
part, this approach has been sufficient. The approach has led to a deeper understanding
of the algorithms, but it has also led to fragile code which is prone to fail upon the
slightest provocation. A full-scale implementation of this algorithm will need a better
linear program solver so that the routines become more robust.
A side benefit to reducing the number of variables and constraints is a savings in
the time it takes to compute each linear program. Since the number of linear programs
needed to determine the production rates using the current algorithms described in
Section 4.8.4 can be large if there are many part types and the system is near its
collective hedging point, the overall savings in time can be significant.
Master and Work Linear Programs Each cell controller which controls events
using flow rates uses the linear program solver to determine rates according to the
hedging point strategy of Section 3.7.2. Depending on the status of machines within
the cell, buffers at the cell boundary and production by the cell, different constraints
are needed at different times. In addition, a subset of the controllable rates may
be forced to zero by the constraints on the cell capacity. The number of rows and
columns in the linear program can change depending on the state of the cell.
The variable number of rows and columns of the linear program can make it
difficult to integrate the controller of an individual cell into an infrastructure of many
cells. For example, the virtual machine concept requires the addition of constraints
depending on the state of entry and exit buffers. When neighboring controllers look
into a cell to find limiting production rates (Section 3.8), they must know which rows
correspond to which machines, and which columns correspond to which rate variable.
From a programming perspective, it is easier to code those routines if there exists a
fixed location within each cell where rates and constraints are stored.
The solution to this problem is to keep the linear program in two different loca-
tions, each with a unique purpose. The first location is the master linear program,
and the second is the work linear program.
The purpose of the master LP is to act as a storehouse for all possible constraints
and all controllable rate variables. It also serves as a ready-reference for component
cells to look up target production rates, and neighboring cells to look up limiting
rates imposed by virtual machines. Any event which causes the capacity of the cell
to change is reported to the master LP. The relative positions of the rate variables
and capacity constraints never change within the master LP, once they are created
in the initial part of the simulation.
Suppose that Level k Cell c contains n_p production rate variables and n_m possible
machine states. The overall dimensions of the master LP for the cell are 2 + n_m + 5n_p
rows by n_p columns. They include
* 2 rows for the objective function.
* n_m rows for machine capacity constraints of the form
(Section 3.6, Equation (3.10)):
\sum_j \tau_{mj} u_j^k \le e_m^k
* n_p rows for possible boundary constraints of the form given in Section 4.8.4.
* 2n_p rows for all possible blockage and starvation constraints of the form given in
Section 3.8, Equations (3.23) and (3.24).
* n_p rows for possible upstream virtual machine loops in reentrant flow of the form
given in Section 3.12.3, Equation (3.34).
* n_p rows for possible downstream virtual machine loops in reentrant flows of the form
given in Section 3.12.3, Equation (3.35).
* n_p columns for the production rate variables.
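The row count is simply the sum of the blocks listed above. A small C helper makes the arithmetic explicit; the names LpShape and master_lp_shape are hypothetical and serve only as an illustration.

```c
/* Hypothetical helper; names are illustrative and not from Hiercsim. */
typedef struct {
    int rows;
    int cols;
} LpShape;

/* n_m: number of machine capacity rows; n_p: number of production rate variables. */
LpShape master_lp_shape(int n_m, int n_p)
{
    LpShape s;
    s.rows = 2          /* objective function */
           + n_m        /* machine capacity constraints */
           + n_p        /* boundary constraints */
           + 2 * n_p    /* blockage and starvation constraints */
           + n_p        /* upstream virtual machine loops */
           + n_p;       /* downstream virtual machine loops */
    s.cols = n_p;       /* one column per production rate variable */
    return s;
}
```

For example, a cell with 4 capacity rows and 3 rate variables would reserve 2 + 4 + 15 = 21 rows and 3 columns.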
The purpose of the work LP is to provide the hedging point strategy algorithms
with the simplest possible linear program.
The work LP is a selection of rows and columns from the master LP. At any given
time, the constraint coefficients, rate variables, and machine availabilities are taken
directly from the master LP and distilled into the work LP. All rate variables whose
values are restricted to be zero are excluded in the work LP. The interface routines
which transfer constraints from the master LP to the work LP contain the logic that
filters those zero-valued rates.
Because the work LP has been written in a standard format, the same set of
calculation routines may be used for all cells in the hierarchy which require a linear
program to determine rates of controllable events. This greatly simplifies the control
algorithms and enhances the maintainability of the code by reducing and centralizing
the logic.
After the controllable rates are successfully computed using the work LP, they are
placed into the correct positions in the master LP for integration into the rest of the
system. (Note that the master LP never sees the linear program solver and so must
be told of the results of the work LP.)
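The relationship between the two linear programs can be sketched as a gather step before the solver runs and a scatter step after it. The code below is a simplified illustration, not the Hiercsim implementation: the structures, the fixed array sizes, the flags row_active and col_zero, and the function names are all assumptions, and the objective rows are left out for brevity.

```c
#define MAX_ROWS 16
#define MAX_COLS 8

/* Hypothetical layout; the real Hiercsim data structures are not shown in the text. */
typedef struct {
    int n_rows, n_cols;
    double a[MAX_ROWS][MAX_COLS];  /* constraint coefficients, fixed positions */
    double rhs[MAX_ROWS];          /* right-hand sides */
    int row_active[MAX_ROWS];      /* 1 if the constraint currently applies */
    int col_zero[MAX_COLS];        /* 1 if the rate is known to be zero */
    double rate[MAX_COLS];         /* target rates, read by other cells */
} MasterLp;

typedef struct {
    int n_rows, n_cols;
    double a[MAX_ROWS][MAX_COLS];
    double rhs[MAX_ROWS];
    int col_map[MAX_COLS];         /* work column index -> master column index */
} WorkLp;

/* Distill the master LP into the smallest LP the solver needs to see. */
void build_work_lp(const MasterLp *m, WorkLp *w)
{
    int i, j;
    w->n_rows = 0;
    w->n_cols = 0;
    for (j = 0; j < m->n_cols; j++)        /* keep only rates not forced to zero */
        if (!m->col_zero[j])
            w->col_map[w->n_cols++] = j;
    for (i = 0; i < m->n_rows; i++) {      /* keep only active constraints */
        if (!m->row_active[i])
            continue;
        for (j = 0; j < w->n_cols; j++)
            w->a[w->n_rows][j] = m->a[i][w->col_map[j]];
        w->rhs[w->n_rows] = m->rhs[i];
        w->n_rows++;
    }
}

/* Place the solver's result x (one value per work column) back into the
 * fixed positions of the master LP; excluded rates are set to zero. */
void store_work_solution(MasterLp *m, const WorkLp *w, const double *x)
{
    int j;
    for (j = 0; j < m->n_cols; j++)
        m->rate[j] = 0.0;
    for (j = 0; j < w->n_cols; j++)
        m->rate[w->col_map[j]] = x[j];
}
```

Keeping the column map inside the work LP is one possible design choice; it lets the calculation routines stay ignorant of which master positions were filtered out.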
4.1.5 Output
For maximum flexibility, the output routines are designed as a series of probes inserted
at places in the code where data is generated. The simulation is unaffected by the
presence of these probes.
The output can be turned on or off using commands included in the input file.
Additional data probes can be inserted in the code with minimal effort. The procedure
to insert additional probes is described in Appendix C.
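A probe of this kind might look like the following C sketch. The probe identifiers, the enable table, and the record format are hypothetical; the actual probes and the input-file commands that switch them are those documented in Appendix C.

```c
#include <stdio.h>

/* Hypothetical probe identifiers; the actual probe set is not listed here. */
enum { PROBE_LOT_MOVE, PROBE_MACHINE_STATE, PROBE_COUNT };

/* One on/off flag per probe; all probes are off until switched on. */
static int probe_enabled[PROBE_COUNT];

/* Would be called while reading the output-control part of the input file. */
void probe_set(int id, int on)
{
    probe_enabled[id] = on;
}

/* Writes one record if the probe is enabled and returns 1; otherwise does
 * nothing and returns 0. The simulation state is never modified. */
int probe_emit(FILE *out, int id, double time, const char *msg)
{
    if (!probe_enabled[id])
        return 0;
    fprintf(out, "%g %s\n", time, msg);
    return 1;
}
```

Because the probe only reads its arguments, inserting or removing a call leaves the simulated events unchanged.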
4.2 Major Interfaces in Hiercsim
This section describes the various interfaces which are required to allow different
components of the simulation to talk to each other. The extensive use of interfacing
permits a decoupling of code, so that the maintainability of the code is enhanced.
The major interfaces described in this section are:
1. Interface between the Simulation and the User.
2. Interface between Independent Cells in the Hierarchy.
3. Interface between the Hierarchical Controller and the Factory.
4. Interface between Computation Routines and Data Storage.
Interface Between Simulation and User The interface between the simulation
and the user reads in data about a factory and its controller, and reports the results
of the simulation. In order to be effective, the interface must produce input data files
that are easily read and modified. In addition, the output to the user must be easily
generated and interpreted.
This interface in Hiercsim Versions 3.5 and 4.0 is accomplished using an ASCII-
based input file and output options directed from a module within the input file.
The user enters data describing the different components of a simulation (machines,
buffers, cells, and processes), specific control parameters (hedging points and setup
change rates) and output control. This is described in Section 4.1.2. See Darakananda
(1989) for a description of how to use the Hiercsim input files.
During the simulation run, the output requested by the input file is printed to
various output files in ASCII format. These files are later read by the user to interpret
the results of the simulation. For a description of the output options available, see
Appendix C. That appendix also contains a description of how to add further options.
Interface Between Cells in the Hierarchy This section deals with the interface
between cells in the hierarchical controller.
Each Level k Cell c is treated as an independent cell which is required to com-
municate with its neighbors as described in Section 3.5. The interface between Cell
c and the other cells in the hierarchy is accomplished primarily through the use of a
mini-database within each cell.
Since a basic assumption in the formulation of the control algorithms is that the
Level k state of Cell c is constant during the time required to complete a calcula-
tion, and is constant between calculations, it is sufficient to store all Level k state
information about the cell in a database. That database is only changed when a
Level k or higher event occurs which affects Cell c, at which time the assumption
is only invalid during the instant in which the state changes. Therefore, when the
controller of Level k Cell c is calculating a new capacity allocation, it is able to read
all necessary information directly from its own database and those of its neighboring
cells. Such information includes Level k - 1 target production rates, Level k machine
states, Level k surplus, and the status of Level k virtual machines.
A change in the Level k state of Cell c is communicated to other cells by scheduling
a recalculation event in the cells directly affected by the change. This includes all
Level k neighboring cells which have an active virtual machine connection to Level k
Cell c, and all Level k + 1 component cells of Cell c. The precise timing algorithms
and rules associated with this communication are described in Section 4.5.
Interface Between Hierarchical Controller and Factory This section de-
scribes how the hierarchical controller and the factory model communicate with each
other in the simulation.
The design of the hierarchical controller explicitly separates the controller from
the factory. In order for the controller to have any effect on events in the factory,
some sort of communication link must be established.
Commands from the hierarchical controller are transmitted to the factory through
marking of lots red or green as described in Section 3.9. The machine group controller
of Section 4.3.2 looks in upstream buffers for green lots to load. If the hierarchical con-
troller is operating well, there will be sufficient green lots to satisfy target production
rates within the capacity of the system.
Changes in the state of the factory are transmitted directly to the appropriate
controller in the hierarchy, as described in Section 4.5.3.
Interface Between Computation Routines and Data Storage This section
describes how each computation routine accesses appropriate data. This interface
applies equally to the factory model and to the hierarchical controller.
The interface between computation routines and data storage areas is accom-
plished on a need-to-know basis. The layout of the databases reflects natural limits
on information transfer. Data access is limited to those routines which directly affect
or require information.
For example, Level k Cell c which has an active virtual machine requires infor-
mation about the limiting Level k rate through the affected buffer. Therefore, access
is permitted into the adjacent Level k cell database to retrieve that rate. This is
described in more detail in Section 4.1.1.
4.3 The Factory Model
Once the hierarchical controller has specified machine states and production rates,
those commands must be implemented in a model of the factory with its own distinct
controller. This section describes the details of the factory model and its controller.
All possible machine states are detailed, followed by the detailed loading rules, and
the way in which failures are modeled.
4.3.1 Possible Machine States
This section details the possible machine states in the factory model.
The state of a machine is used to describe the status of the activity which it is
currently performing. The state is invisible to the hierarchical controller because it
changes at a much higher frequency than the hierarchical controller is able to detect.
These states are essential to maintain the continuity of the simulation. Some of the
states are merely flags which trigger some control function and are only in effect for
the current time instant. Other states can be in effect for longer periods of time, such
as when the machine is processing a part, failed, or idle. This section describes each
of the states, and when they are in effect.
IDLE A machine is IDLE when it is ready to accept a part for processing. The
machine is in this state at the start of the simulation, and when a part has just
been moved off of the machine into a downstream buffer. The change of state to
IDLE triggers the machine group controller to attempt to assign another part to the
machine. The machine remains IDLE until it is assigned a part, at which point it becomes
PROCESSING, or until it experiences a time-dependent failure3, at which point it
becomes FAILED.
WAITING A machine is WAITING if a part has been assigned to it that has not
yet been loaded. This state prevents more than one part from being loaded onto the
same machine at the same time. The change of state to WAITING serves as a trigger
to move a part from an upstream buffer to the machine. Therefore, the machine is
WAITING for only an instant.
PROCESSING A machine is PROCESSING whenever it is actively operating on
a part. All time that elapses while the machine is PROCESSING is counted towards
the time required to complete an operation. The machine ceases to be PROCESSING
when the operation time has completely elapsed. Processing is paused during failures,
and resumes where it left off when the repair is complete.
HOLDING A machine is HOLDING when it has completed an operation on a
part, but has not yet deposited the part in the downstream buffer. This state triggers
the routines which unload the part. This state is only in effect for an instant because
the machine loading policy described in Section 4.3.2 requires that there be room
downstream to deposit a part before it is loaded onto the machine.
SETTING UP A machine is SETTING UP when it is currently changing its setup
state. All time that elapses while the machine is SETTING UP is counted towards
the time required to complete the setup change. The machine ceases to be SETTING
UP when the setup change time has completely elapsed. The setup change is paused
during failures, and resumes where it left off when the repair is complete.
This machine state is only available in Hiercsim Version 4.0. It is not available in
Hiercsim Version 3.5.
3 When a machine is IDLE, it cannot process parts, and so operation-dependent failures cannot
occur in that state.
FAILED A machine is FAILED whenever it is interrupted from processing, setting
up, or being idle due to unscheduled downtime. A failure serves to postpone any
operation or setup change on the FAILED machine by an amount of time equal to
the repair time.
REPAIRED A machine is REPAIRED as soon as the time of the repair has
elapsed. This state serves as a trigger to resume the activity in progress at the
time of the failure and is only in effect for an instant.
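The states described in this section map naturally onto an enumeration. The sketch below is illustrative only: the identifiers are hypothetical, and the two helper predicates simply restate which states the text describes as lasting for an instant and which state makes a machine available for loading.

```c
/* Hypothetical names for the machine states described in this section. */
typedef enum {
    M_IDLE,         /* ready to accept a part */
    M_WAITING,      /* part assigned but not yet loaded */
    M_PROCESSING,   /* actively operating on a part */
    M_HOLDING,      /* operation complete, part not yet unloaded */
    M_SETTING_UP,   /* changing setup (Version 4.0 only) */
    M_FAILED,       /* interrupted by unscheduled downtime */
    M_REPAIRED      /* repair time has just elapsed */
} MachineState;

/* States that act as triggers and are in effect only for the current instant. */
int state_is_instantaneous(MachineState s)
{
    return s == M_WAITING || s == M_HOLDING || s == M_REPAIRED;
}

/* A machine can be assigned a part only when no other activity occupies it. */
int machine_is_available(MachineState s)
{
    return s == M_IDLE;
}
```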
4.3.2 Processing
This section describes the model used to simulate the steps involved in processing
of parts on machines. There are two essential components to the model. The first
models the physical tasks of the processing, and the second models the heuristic rules
employed by the controller at each machine group for deciding when to load and
transport parts.
Operation Time Even though an operation is considered to be a single unit at the
lowest level of the hierarchy, it actually is made up of numerous tasks. Such tasks
include the transportation of parts from an upstream buffer to a machine, processing,
and the transportation of parts from a machine to a downstream buffer. Hiercsim
allows the transport time to be modeled separately from the operation time \tau_{ij} of
a step of Process Segment j at Machine Group i. However, the hierarchical control
algorithms do not take transport time into account and all simulations in Chapter 7
have zero transport times throughout.
Each of the substeps requires that a decision be made by heuristic controllers
within the factory model. Those controllers are independent of the cell controllers
of the hierarchical control system. For the most part, the heuristic controllers can
be combined with the machine group controllers which govern the precise timing of
events at each machine.
Part Generation Raw material is generated at the start of each process. Initially,
the entry buffer is filled with parts. The hierarchical controller gives permission to
the parts for loading onto machines according to the staircase policy of Section 3.9.
As parts are removed from the entry buffer, they are immediately replaced by fresh
parts which do not yet have permission to be loaded onto machines.
Loading Permission from the Hierarchical Controller Parts which arrive at
an operation level or higher buffer must be cleared for entry into the operation level
cell by that cell's controller. The controller uses the staircase policy of Section 3.9 to
clear a part for entry. Once a part is cleared for entry, it retains that clearance until
it arrives at the next buffer whose control level is at the operation level of the process
or higher.
A transfer line with finite buffers may be modeled as a series of machines with
buffers of control level lower than the operation level of the process. All control of part
movement and processing within the operation level cell is ceded to the controllers at
individual machine groups whose algorithms are based purely on heuristics.
Loading Triggers The controller at Machine Group i which is responsible for load-
ing parts onto individual machines from buffers immediately upstream of the group
lies dormant most of the time. The part loading controller is activated only when
certain triggering events occur at the machine group or its upstream or downstream
buffers. Once the loading controller is triggered, it attempts to load a part onto a
machine in the group.
Any of the following events acts as a trigger for the part loading controller:
1. A part arrives at any buffer immediately upstream of the machine group and
has clearance to be loaded from the hierarchical controller.
2. A part already in a buffer immediately upstream of the machine group is given
loading clearance.
3. A machine's status is changed to IDLE.
4. A part departs from any downstream buffer.
Part Reservation After the part loading controller is triggered, it checks to see
if a part may be loaded onto a machine in the machine group. There may be many
different parts to choose from. The machine loading controller which reserves parts
for specific machines must decide among the options based on some loading priority
rules.
This topic was introduced in Section 3.10 which described the conditions required
for a part to be loaded onto a machine. That section mentioned that the buffers in
front of a machine group were scanned for parts which satisfied the loading conditions,
but it did not give the specific scanning order. The current implementation of the
scanning order is described here.
The loading controller looks into each buffer in front of the machine for cleared
parts (parts which are marked green and have not yet been assigned to a machine).
There is one buffer per type of operation able to be performed at the machine. Each
time the controller looks for a part to load, the buffers are scanned in the same
order, always beginning from the same point. That order is
determined by the order in which the upstream buffers appear in a list at the machine
group controller.
Once a cleared part is found, the machine controller checks two things: the
status of the machine and the status of the downstream buffer. If the machine
tooling configuration is incorrect for the operation required by the cleared part or if
the downstream buffer has no room for the part, then the part is passed over.
The buffers are scanned until a cleared part is found on which the machine can
operate, and which can be moved downstream immediately. That part is assigned
to the machine, and the machine's status is changed to WAITING. Otherwise, if no
cleared parts are found, then the scanning is stopped after the last buffer.
Every time the machine checks for parts to load, it starts its search with the
same buffer. Since a machine may perform multiple operations, this policy affects the
priority of those operations. The first operation in the list of operations performed
at the machine will always have the greatest priority, the second operation in the list
will be next, and so on down to the last operation which is performed by the machine
group. The last operation which is performed by the machine group will be reached
eventually, but only after the cumulative requirements of all the other operations have
been satisfied.
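The fixed scanning order amounts to a first-fit search over the list of upstream buffers. The following C sketch shows the idea; BufferView, its three flags, and scan_buffers are hypothetical names, and each flag stands in for a check that the real controller would perform on its own data.

```c
#define NO_PART (-1)

/* Hypothetical summary of one upstream buffer as seen by the loading controller. */
typedef struct {
    int has_cleared_part;   /* a green, not yet assigned part is present */
    int setup_matches;      /* machine tooling is correct for this operation */
    int downstream_room;    /* the downstream buffer can accept the part */
} BufferView;

/* Scan buffers in list order, always starting from index 0. Returns the index
 * of the first buffer whose cleared part can be reserved, or NO_PART if the
 * last buffer is reached without finding one. */
int scan_buffers(const BufferView *buf, int n_buffers)
{
    int i;
    for (i = 0; i < n_buffers; i++) {
        if (!buf[i].has_cleared_part)
            continue;
        if (buf[i].setup_matches && buf[i].downstream_room)
            return i;
        /* otherwise the cleared part is passed over */
    }
    return NO_PART;
}
```

Because the loop always starts at index 0, the position of a buffer in the list is what gives its operation priority.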
These rules were chosen for their simplicity, and not because they were the best.
Any other reasonable rule for loading the machine may be used, as long as there is
some mechanism to choose among a range of available parts.
This policy is able to give reasonable results when the volume of parts is great
enough to approximate the production as a flow, since the hierarchical controller is
clearing parts within the capacity of the system. Section 7.7 describes the performance
of the controller when the flow rate assumption of Section 2.6 does not hold and the
policy is not adjusted to match the situation.
Part Loading Once a part is identified, it is reserved for a particular machine in
the group. Reserving the part prevents the controller from loading that part on more
than one machine at the same time. A reserved part will be loaded immediately onto
the machine. If a machine fails before reserved parts are loaded onto it, then the
parts are released for loading onto another machine. Processing begins as soon as the
parts arrive at the machine.
4.3.3 Machine Failures
This section describes the implementation details of the factory model for operation-
dependent and time-dependent failure modes and their repair as described in Sec-
tion 2.7.
When a machine is created in the input section of the simulation, each of the
failure modes for that machine type is initialized. The time to fail for Mode j, t_{fj},
is initialized as an exponentially distributed random variable with mean time to fail
MTTF_j:
\Pr(t_{fj} > t) = \int_t^{\infty} \frac{1}{MTTF_j} e^{-s/MTTF_j} \, ds    (4.3)
As the simulation progresses, the time to fail, t_{fj}, is decremented according
to the rules described below. When the time to fail for Mode j has been decremented
to zero (t_{fj} = 0), the machine fails in Mode j. The time to fail is decremented
differently for operation-dependent failures (ODF) than for time-dependent failures
(TDF).
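For an exponential distribution the integral in (4.3) has a closed form, which also suggests one common way to draw the initial time to fail (inverse-transform sampling). The uniform variate U is an assumption introduced here for illustration; the text does not say how Hiercsim generates its random variables.

```latex
\Pr(t_{fj} > t) = \int_t^{\infty} \frac{1}{MTTF_j}\, e^{-s/MTTF_j}\, ds = e^{-t/MTTF_j}
\quad\Longrightarrow\quad
t_{fj} = -MTTF_j \,\ln(1 - U), \qquad U \sim \mathrm{Uniform}[0, 1)
```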
Operation-Dependent Failures Recall from Section 2.7 that an operation-dependent
failure is one where the time to fail is dependent on the cumulative operation
time since the last repair.
When a part is loaded onto a machine and is about to be processed, the time to
fail for Mode j, t_{fj}, is compared to the operation time \tau_{ij}. If the time to fail is
greater than the operation time, then the time to fail is decremented by the operation time
t_{fj} \leftarrow t_{fj} - \tau_{ij}
and the machine will successfully complete the operation. However, if the time to fail
is smaller than the operation time, then the machine will fail t_{fj} time units into the
operation. Operation-dependent failures are implemented correctly in Hiercsim Version
3.5. No failures are modeled in Hiercsim Version 4.0.
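The operation-dependent rule can be written as a single comparison at the start of each operation. This C sketch is an illustration under stated assumptions: OdfResult and odf_start_operation are hypothetical names, and the case where the time to fail exactly equals the operation time, which the text does not address, is treated here as a failure.

```c
/* Hypothetical result record; not taken from the Hiercsim source. */
typedef struct {
    int fails;            /* 1 if a failure interrupts this operation */
    double time_into_op;  /* how far into the operation the failure occurs */
} OdfResult;

/* Apply the operation-dependent rule for one failure mode when an operation
 * of length op_time is about to start. *time_to_fail is updated in place. */
OdfResult odf_start_operation(double *time_to_fail, double op_time)
{
    OdfResult r;
    if (*time_to_fail > op_time) {
        *time_to_fail -= op_time;        /* operation completes successfully */
        r.fails = 0;
        r.time_into_op = 0.0;
    } else {
        r.fails = 1;                     /* assumption: equality counts as failure */
        r.time_into_op = *time_to_fail;
        *time_to_fail = 0.0;             /* time to fail has run out */
    }
    return r;
}
```

Starting from a time to fail of 10 with operations of length 4, the first two operations complete and the third is interrupted 2 time units in.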
Time-Dependent Failures Recall from Section 2.7 that a time-dependent failure
is one where the time to fail is measured from the instant the machine gets repaired to
the next instant the machine fails, regardless of the amount of processing the machine
accomplishes. Time-dependent failure modes are not implemented correctly in either
Hiercsim Version 3.5 or Hiercsim Version 4.0.
Currently, the possible failure of a machine is only checked when processing begins
on a part on that machine. If the time since the last failure for the time-dependent
Mode j, t_{tdj}, is greater than the time to fail, t_{fj}, then the machine will fail.
Upon failure, the time since the last failure, t_{tdj}, is reset to zero. Otherwise, the
next time Mode j is checked is when the next part begins processing on the
machine, regardless of how far in the future this occurs. This implementation
ensures that processing will always be interrupted when a time-dependent failure
occurs.
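The check described above, including its limitation, can be sketched as follows. The structure and function names are hypothetical, and storing the clock time of the last failure (rather than an elapsed-time counter) is an assumption made to keep the example short.

```c
/* Hypothetical record for one time-dependent failure mode. */
typedef struct {
    double time_to_fail;       /* drawn time to fail for this mode */
    double last_failure_time;  /* clock time of the last failure (assumed storage) */
} TdfMode;

/* Called only when processing begins on a part. Returns 1 if the machine
 * fails now, 0 otherwise. */
int tdf_check_at_start(TdfMode *mode, double now)
{
    double elapsed = now - mode->last_failure_time;  /* time since last failure */
    if (elapsed > mode->time_to_fail) {
        mode->last_failure_time = now;   /* elapsed time is reset to zero */
        return 1;
    }
    return 0;
}
```

With a time to fail of 5 and parts starting at times 3 and 9, the failure is not registered until time 9, which illustrates why the text calls this implementation incorrect.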
Ideally, time-dependent failures should be scheduled directly into the event queue
for all machines with the time-dependent failure modes. Therefore, a time-dependent
failure should not require a part to be on the machine in order to fail. Machines
should be able to experience more than one time dependent failure mode at the same
time.
Machine Repair Once a machine has experienced a Mode j failure, the time to
repair, t_{rj}, is initialized as an exponentially distributed random variable with mean
time to repair MTTR_j:
\Pr(t_{rj} > t) = \int_t^{\infty} \frac{1}{MTTR_j} e^{-s/MTTR_j} \, ds    (4.4)
A REPAIR event is scheduled t_{rj} time units in the future. Once that time has
elapsed, the machine reverts back to the activity which was interrupted by the failure.
If there was no activity, the machine becomes IDLE.
For the duration of the repair, no other activities may take place on the failed
machine. When the machine is repaired, the activity which was interrupted by the
failure is resumed with no penalty. The time to complete the interrupted activity is
increased by the time to repair, t_{rj}.
As soon as the failure occurs, a new time to fail, t_{fj}, is initialized for Mode j
according to (4.3). This new failure time does not begin to be decremented until the
repair is complete.
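The effect of a failure on the interrupted activity is a pure postponement. A minimal C sketch, assuming the activity is tracked by its scheduled completion time (the Activity record and both function names are hypothetical):

```c
/* Hypothetical record of the activity in progress on a machine. */
typedef struct {
    double completion_time;  /* scheduled end of the current activity */
    int failed;              /* 1 while the machine is under repair */
} Activity;

/* On failure at time `now`: postpone the activity by the repair time and
 * return the time at which the REPAIR event should be scheduled. */
double schedule_repair(Activity *a, double now, double time_to_repair)
{
    a->failed = 1;
    a->completion_time += time_to_repair;  /* time stands still during repair */
    return now + time_to_repair;
}

/* On the REPAIR event: the interrupted activity resumes with no penalty. */
void finish_repair(Activity *a)
{
    a->failed = 0;
}
```

For example, an operation due to finish at time 10 that fails at time 4 with a repair time of 3 is repaired at time 7 and finishes at time 13.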
4.4 Initialization
The current implementation of the hierarchical controller, Hiercsim Versions 3.5 and
4.0, requires initial conditions about the state of the manufacturing system. The
number of initial conditions that can be set from the input file is currently limited. In
the future, the input file may be expanded to contain more and more initial conditions
as needed.
This section describes the default initial values, as well as those values which can
be specified from the input file. It also describes the initialization of the event queue
and the degrees of freedom available in specifying control parameters which affect the
course of the simulation.
4.4.1 Initial Values of the Simulation
This section details the initial values of both the factory model and the hierarchical
controller.
Initial Conditions and Values for Resources Each machine in the simulation is
assumed to be idle at the start (t = 0). The initial tooling configuration (setup) of the
machine is specified in the input file. Note that the configuration states of Hiercsim
Version 4.0 are implemented in dummy form in Hiercsim Version 3.5. That is, all
information required to describe a machine's configuration is included in Hiercsim
Version 3.5, but none of the control algorithms required to change the configuration
are available in Version 3.5.
The entry buffer for each process is assumed never to be empty. Before the start of
the simulation, the entry buffer is completely filled with parts that are not yet cleared
for processing.[4] The controller limits the rate of part loading into the initial cell by
clearing the part for production only as needed by the staircase policy of Section 3.9.
All other buffers are initially empty.
[4] Note that the size of the entry buffer can be equal to one, since it will always contain at least
one part.
Exit buffers for each process are not required since parts are assumed to be re-
moved from the system immediately upon completion of the final operation.
Process Initial Conditions and Values The user is required to specify the long
term target demand rate for each process. This target demand rate does not change
for the duration of the simulation. The user is also required to specify the highest
control level where process operations must be treated as discrete events. It is assumed
that production is represented by flow rates at all control levels above that specified
level.
The initial value of all surpluses is set to zero. This implies that at the start of the
simulation, cumulative production equals cumulative demand everywhere. Note that
the objective function of the hedging point calculation linear program will not be zero
when the hedging points are different from zero and satisfy the limits of Section 3.13.
4.4.2 Start Event Queue
This section describes the start of the simulation for both the factory model and the
hierarchical controller.
The simulation is started by initiating a production rate calculation in the top cell
of the hierarchy using the event queue described in Section 4.1.3. This calculation
will trigger other production rate calculations and finally will result in the loading of
parts onto machines. Uncontrolled events are initially placed into the event queue by
the routines which determine when those events are to occur. Such events include
the time for the first failure of each machine in the system.
4.4.3 Degrees of Freedom in Controller Parameters
The user must determine the relative priorities of each of the processes in the system at
the start of simulation. Those priorities are in effect for the duration of the simulation
run. This section describes how those priorities are set in the hierarchical controller
by specifying parameters.
The function of the hedging point strategy outlined in Section 3.7.2 is to allocate
resources. The resource allocation of Level k Cell c is accomplished by choosing rates
of controllable events within the capacity of the cell that minimize a linear objective
function. The user can choose the relative costs of deviation from the hedging point,
which in turn determines the relative priorities of production between the various
processes in the factory.
Relative Priorities Production mix is regulated by the controller's perception of
priorities. Recall that the dynamic programming formulation uses an approximation
of the total cost J to find the best production mix using the hedging point strategy
(Section 3.7.2).
Each controllable Type j event of Level k Cell c has a hedging point $z_j^k$ and a cost
coefficient $A_j^k$. The cost of deviation in the hedging point strategy linear program
(3.22) is
$$\frac{\partial J}{\partial x_j^k} = A_j^k (x_j^k - z_j^k) \qquad (4.5)$$
The priority of a process in the controller is determined by the relative size of its
cost coefficient $A_j^k$ compared to all $A_{j'}^k$, $j' \neq j$. The larger the value of the priority
coefficient $A_j^k$, the larger the cost of deviation will be for event j.
If (4.5) is more negative for some j than for all others, the hedging point strategy
will allocate all resources to bring it to zero faster than the others. Therefore, the
event j with the larger deviation multiplier $A_j^k$ will usually be maintained closer to
its hedging point $z_j^k$ than the other events. When the surplus is greater than its
hedging point, zero resources will be allocated to the event.
An example of this effect can be found in Section 7.4.5.
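The effect of the cost coefficients can be illustrated with a deliberately simplified sketch. It assumes a cell with a single machine group, so the linear program has one capacity constraint and its optimum lies at a corner that gives all capacity to one event. The function name, the operation times `tau`, and the numbers are hypothetical and are not taken from Hiercsim.

```python
def allocate_single_group(A, x, z, tau, capacity):
    """Minimize sum_j A_j (x_j - z_j) u_j subject to sum_j tau_j u_j <= capacity, u >= 0.

    With one constraint, each corner of the capacity set gives all capacity
    to a single event j, with objective value costs[j] * capacity / tau[j].
    """
    costs = [a * (xi - zi) for a, xi, zi in zip(A, x, z)]
    rates = [0.0] * len(A)
    best = min(range(len(A)), key=lambda j: costs[j] / tau[j])
    if costs[best] < 0:  # producing only helps if the cost coefficient is negative
        rates[best] = capacity / tau[best]
    return costs, rates

# Event 2 is behind by less than event 1, but its larger coefficient A wins.
costs, rates = allocate_single_group(
    A=[1.0, 5.0], x=[-2.0, -1.0], z=[0.0, 0.0], tau=[1.0, 1.0], capacity=1.0
)

# When every surplus is above its hedging point, nothing is produced.
_, idle_rates = allocate_single_group(
    A=[1.0, 5.0], x=[1.0, 1.0], z=[0.0, 0.0], tau=[1.0, 1.0], capacity=1.0
)
```

In the first call the event with the larger coefficient receives all of the capacity even though its surplus deficit is smaller, which is the priority behavior described above.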
4.5 Simulation Dynamics
This section describes the dynamic response of the simulation to the failure of buffers,
the calculation of production rates, and the failure of machines.
4.5.1 Virtual Machine Addition and Removal
This section describes the details of the operation of virtual machines within the con-
troller in a simulation using Hiercsim Versions 3.5 and 4.0. The majority of these
details involve the scheduling and rescheduling of virtual machine activities as pro-
duction rates are recalculated. Virtual machines are introduced in Section 3.8 where
only the constraint added to the linear program is discussed.
Addition of Virtual Machines For the purposes of illustration, only the addition
of a blockage virtual machine constraint (2.20) will be explained. A similar procedure
is used for the addition of a starvation virtual machine constraint (2.19). Figure 2-2
depicts the system used to illustrate the virtual machine addition dynamics.
Consider Process Segment j whose production rate, $u_{cj}^k$, is set by the controller of
Level k Cell c. Process Segment j deposits material into Level k Buffer $\beta_{nj}$.
A virtual machine is installed in Level k Cell c if the Level k Buffer $\beta_{nj}$ has failed
(become full) and the calculated Level k production rate $u_{cj}^k$ violates the flow rate
condition (2.20) of the buffer.
A virtual machine will not be added to the cell controller's linear program (3.22)
due to the failed buffer if the production rate, $u_{cj}^k$, independently satisfies the
constraint (2.20). This prevents redundant constraints from appearing in the linear
program (3.22). However, if the limiting flow rate through the failed buffer is exceeded,
then the virtual machine constraint is added to (3.22) and will be in effect until the flow
rate, $u_{cj}^k$, is strictly less than the limiting flow rate.
As soon as rate calculations are completed in a cell, virtual machine additions are
rescheduled. Suppose that Level k Cell c has just recomputed its production rates, $u_c^k$.
Each component of the rate vector which corresponds to a production rate variable
is used in a calculation to determine the amount of time until a virtual machine is
needed for that production rate variable.
An event is placed in the event queue for each process segment in the cell, schedul-
ing the addition of a virtual machine. A scheduled virtual machine addition is ren-
dered obsolete if either of the production rates contributing to the buffer fill rate
($u_{cj}^k$ or $u_{c'j}^k$ as defined in the next paragraph) changes before the virtual machine is
installed.
Time until Virtual Machine Addition Let component $u_{cj}^k$ be the production
rate variable in Level k Cell c which deposits material into the downstream Buffer
$\beta_{nj}$. Let the component $u_{c'j}^k$ be the production rate variable from Level k Cell c' which
draws material from Buffer $\beta_{nj}$.
The rate at which Buffer $\beta_{nj}$ is filling up is represented by $\dot{b}_{nj}^k$ and is the difference
between the Level k flow into the buffer and the Level k flow out of the buffer:
$$\dot{b}_{nj}^k = u_{cj}^k - u_{c'j}^k \qquad (4.6)$$
Given that Buffer $\beta_{nj}$ is neither full nor empty and the fill rate does not change,
a virtual machine will be added sometime in the future depending on the sign and
magnitude of the buffer fill rate $\dot{b}_{nj}^k$.
Let the current Level k amount of material in the buffer be $b_{nj}^k$ and the maximum
allowable material in the buffer be $B_{nj}$.
If the buffer is emptying, then a starvation virtual machine given by (2.19) is
scheduled to be added to Cell c' after time $\Delta t_v$ has elapsed, where
$$\text{if } \dot{b}_{nj}^k < 0, \text{ then } \Delta t_v = \frac{b_{nj}^k}{|\dot{b}_{nj}^k|} \qquad (4.7)$$
If the buffer is filling up, then a blockage virtual machine given by (2.20) will be
scheduled to be added to the upstream Cell c after time $\Delta t_v$ has elapsed, where
$$\text{if } \dot{b}_{nj}^k > 0, \text{ then } \Delta t_v = \frac{B_{nj} - b_{nj}^k}{\dot{b}_{nj}^k} \qquad (4.8)$$
Before the current virtual machine addition is scheduled for Buffer $\beta_{nj}$, any
previously scheduled virtual machine addition is canceled because the previously scheduled
virtual machine has become obsolete.
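The scheduling rule of (4.6) through (4.8) amounts to a short calculation, sketched below. The function name and the returned labels are hypothetical; the sketch assumes the buffer is neither full nor empty and that the fill rate stays constant until the virtual machine is installed.

```python
import math

def time_until_virtual_machine(u_in, u_out, level, max_level):
    """Return (kind, delay) for the next virtual machine on one buffer.

    u_in      : rate depositing material into the buffer (upstream Cell c)
    u_out     : rate drawing material from the buffer (downstream Cell c')
    level     : current amount of material in the buffer
    max_level : maximum allowable material in the buffer
    """
    fill_rate = u_in - u_out                      # equation (4.6)
    if fill_rate < 0:
        # Buffer is emptying: starvation virtual machine for Cell c', (4.7)
        return ("starvation", level / abs(fill_rate))
    if fill_rate > 0:
        # Buffer is filling: blockage virtual machine for Cell c, (4.8)
        return ("blockage", (max_level - level) / fill_rate)
    return (None, math.inf)                       # balanced flow: nothing to schedule

emptying = time_until_virtual_machine(u_in=3.0, u_out=5.0, level=4.0, max_level=10.0)
filling = time_until_virtual_machine(u_in=5.0, u_out=3.0, level=4.0, max_level=10.0)
balanced = time_until_virtual_machine(u_in=4.0, u_out=4.0, level=4.0, max_level=10.0)
```

With these example numbers the buffer empties in 2 time units in the first case and fills in 3 time units in the second.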
A similar calculation is performed for Level k buffers that supply Level k Cell c
with raw material. The linear program must allow for the same rate variable to have
both starvation and blockage virtual machines simultaneously.
Virtual Machine Removal Level k virtual machines are checked for removal after
all Level k cells which are within the same Level k - 1 Cell C have completed their
rate calculations. Virtual machine constraints are removed if and only if the limited
production rate is strictly less than the virtual machine constraint rate. In the case
of a starvation virtual machine, (2.19) must be satisfied by strict inequality. In the
case of a blockage virtual machine, (2.20) must be satisfied by strict inequality.
If the virtual machine constraints are removed before all cells within Level k - 1
Cell C have converged to a stable production rate mix, infinite looping will result.
Premature virtual machine removal will allow a limited cell to violate the constraint
when it recomputes its production rates before all Level k calculations are complete.
This forces the constraint to be added once again. This behavior has slowed the
convergence to the final production rate mix in actual simulations.
Calculations Triggered by Virtual Machines A rate calculation is scheduled to
occur immediately in Level k Cell c whenever a virtual machine constraint is added,
and whenever the limiting rate of an existing constraint is altered.
Reentrant Flow Constraint Addition and Removal Reentrant flows are de-
scribed in Section 3.12. At the start of the simulation, all reentrant flow processes are
detected and marked. After every virtual machine addition or removal, the simulation
checks for an unbroken chain of virtual machines between two consecutive stages of
a process which are controlled by the same cell.
After each virtual machine addition or removal, all affected process segments are
checked for the completion or breakage of a virtual machine chain. If the addition of a
virtual machine completes an unbroken chain linking two reentrant process segments
in Level k Cell c, then one of the antiloop constraints, either (3.34) or (3.35), is added
to the cell, depending on the direction of the limiting rate. Similarly, if the removal of
a virtual machine breaks a chain linking two reentrant process segments in the Level
k Cell c, then the antiloop constraint is removed from the cell's capacity set.
4.5.2 Sequence of Rate Calculations in a System
Section 3.11 introduced the concept of the message center to coordinate independent
cells. This section describes the use of the message center in detail for the coordina-
tion of a system-wide calculation of rates within the controller. This coordination is
important because a cell relies on other cells to provide accurate information about
target rates and restrictions on production rates.
When a major event occurs or high-level production rates change, many inde-
pendent cells are affected. The effect of the change is felt from the highest level in
the hierarchy where the change originated, down to the operation level. The time to
calculate production rates is assumed to be very short compared to the time scale
of the lowest level in the hierarchy. For that reason, the system appears to respond
instantaneously to the major change with a completely new set of production rates
at each cell which is affected by the change.
The number of cells affected can cover much territory across many levels of fre-
quency. The highest level cell affected by the major change recomputes production
rates, and forces each of its components to do so as well. Even though a low level cell
may be far removed from the source of the major change, the effect of the change is
transmitted by the new high-level target production rates.
Communication Restriction across Levels To prevent spurious target trans-
missions from Level k Cell c to its Level k + 1 components, the cell is prevented from
communicating its results until it receives permission from its parent, Level k - 1 Cell
C. When Level k Cell $c \in C$ completes its calculations, the higher level Level k - 1
Cell C checks each of its Level k component cells for a scheduled calculation in the
current time instant. If there exists at least one Level k Cell $c \in C$ which is scheduled
to recompute production rates for its process segments in the current time instant,
then no other Level k cells within Level k - 1 Cell C are permitted to communicate
production rates to their Level k + 1 component cells. If all the Level k component cells
of Cell C have completed their calculations, then each Level k Cell c schedules an
immediate calculation for each of its Level k + 1 components.
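The permission check made by the parent cell can be sketched as follows. The data layout is an assumption made for the example: `pending` maps each Level k cell of Cell C to a flag saying whether it still has a rate calculation scheduled in the current time instant, and `components` maps each Level k cell to its Level k + 1 component cells.

```python
def may_release_targets(pending):
    """True only when no Level k cell of the parent is still waiting to recompute."""
    return not any(pending.values())

def cells_to_trigger(pending, components):
    """Level k + 1 cells that should receive an immediate rate calculation."""
    if not may_release_targets(pending):
        return []  # at least one sibling is still pending: hold all targets back
    return [child for cell in pending for child in components.get(cell, [])]

components = {"c1": ["c1a", "c1b"], "c2": ["c2a"]}

held = cells_to_trigger({"c1": False, "c2": True}, components)
released = cells_to_trigger({"c1": False, "c2": False}, components)
```

Holding back every cell's targets until all siblings have finished is what prevents the spurious target transmissions mentioned above.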
Sequence Restrictions at the Same Level When Level k - 1 Cell C changes
its production rates, all the Level k component cells within Cell C are required to
recompute their production mixes as well. If there are any active virtual machines
which connect the Level k cells, the sequence of calculations must be such that the
limiting flow rates are computed first. Otherwise, if a Level k cell which is limited
by a virtual machine computes its rates first, it will have to recompute its rates
immediately, since the limiting rate imposed by the virtual machine will have changed.
To make sure that the limiting rates are calculated first, Level k Cell c checks for
the existence of virtual machine constraints in its capacity set. If a virtual machine
does exist, the cell checks to see if the limiting Level k Cell c' is scheduled to recompute
its rates in the current time instant. If the limiting cell is going to recompute its rates
in the current time instant, then Level k Cell c cancels its rate calculation. The
limiting Level k Cell c' will automatically schedule a new rate calculation for Level k
Cell c based on the scheduling trigger algorithm described in Section 4.5.1.
This canceling and rescheduling of rate calculations is a result of the structure of
the event queue (Section 4.1.3). It turns out to be easier to cancel an event and
reschedule it in the same time instant than to explicitly move the event in the queue
when the order of events is modified.
The sequencing technique is not as straightforward for reentrant flows, where Level
k Cell c may limit Level k Cell c' at the same time Level k Cell c' limits Level k Cell
c through a virtual machine chain.
If the virtual machine chain is a starvation chain with all the buffers being empty,
then the ultimate limiting rate is the first process segment in the chain. The cell
which contains the first segment is scheduled to calculate its rates first.
If the virtual machine chain is a blockage chain with all the buffers being full, then
the ultimate limiting rate is the last process segment in the chain. The cell which
contains the last segment is scheduled to calculate its rates first. This logic is also
used for reentrant processes with anti-looping constraints described in Section 3.12.
In order to prevent an endless cycle of rate calculation postponement, a cell is
allowed to postpone its rate calculation at most once between two successive
rate calculations in the same time instant.[5]
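One way to enforce the single-postponement rule is to remember which cells have already postponed since their last completed calculation. The sketch below is an assumption about how such bookkeeping could look; the function names and the use of a set are not taken from Hiercsim.

```python
def try_postpone(cell, limiting_cell_pending, already_postponed):
    """Postpone `cell` only if its limiting cell is pending and it has not postponed yet."""
    if limiting_cell_pending and cell not in already_postponed:
        already_postponed.add(cell)
        return True
    return False  # must compute now, even if the limiting cell is still pending

def finish_calculation(cell, already_postponed):
    """After a completed rate calculation the cell may postpone once more."""
    already_postponed.discard(cell)

postponed = set()
first = try_postpone("c", True, postponed)    # allowed: first postponement
second = try_postpone("c", True, postponed)   # refused: already postponed once
finish_calculation("c", postponed)
third = try_postpone("c", True, postponed)    # allowed again after a calculation
```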
4.5.3 System Response to Events
This section details the method by which the event reporting is handled in the sim-
ulation for the controller. The failure and repair cycle of a Level k failure mode of a
machine in Machine Group i in Level k Cell c is used as an example.
Suppose a class of discrete events occurs with a frequency on the same order as
the Level k characteristic frequency fk. When a single event in that class occurs, all
Level k cells and below which contain the resource where the event occurs must be
informed of the event. Once those cells are informed, they respond to the new system
state by adjusting production rates.
Failure/Repair Cycle Consider a system that contains Machine Group i which
has failure modes of widely different duration and frequency. The hierarchical con-
troller is divided into a set of cells, each with a characteristic frequency of events to
which the cell controller responds.
Suppose one of the machines in Machine Group i undergoes a Level k failure. The
failure is reported directly to Level k Cell c which controls the machine group at the
characteristic frequency fk. A new production mix is calculated by the cell controller
to be within the capacity of the machine group with the failed machine. All Level
k + 1 cell components of Level k Cell c are informed of the new target rates and
recompute new production mixes immediately, subject to the sequence restrictions
described in Section 4.5.2.
The new production mixes are transmitted in this manner down to the controller
which releases parts into the system. Machine Group i will continue to process parts
at the new rates until the failed machine gets repaired. At that time, the repair is
communicated directly to Level k Cell c, bypassing all lower level cells.
[5] The logic of this section has not yet been proven to converge to the correct rates. However, in
the simulations run to test the algorithm, the correct rates were always chosen. The goal of Hiercsim
Versions 3.5 and 4.0 is to provide a working research testbed. This is an area for future research.
A new set of production rates is computed at Level k Cell c. Those new rates are
the best rates according to the hedging point strategy of Section 3.7.2 with which the
system can recover from production lost during the failure. These rates are converted
into loading times at the machine group in the usual manner.
4.6 Cell Controller Support
4.6.1 Linear Program Constraint Implementation
This section describes the computational forms of the constraints, conditions which
force a rate variable to zero, and implementation details for the boundary constraints
(4.35), described later in Section 4.8.4. It also describes the method by which the
work LP is created from the master LP. The master and work linear programs (LP's)
mentioned in this section are the linear programs required by the hedging point
strategy of Section 3.7.2 for Level k Cell c and described in Section 4.1.4.
1. Virtual Machine Constraints The master LP must contain a constraint row
for both starvation and blockage constraints of each Level k process segment
within Level k Cell c. In addition, each process segment that is part of a
reentrant process must have an antiloop starvation constraint (3.34) or blockage
constraint (3.35) for each possible link to process segments in the same cell.
The virtual machine constraints (3.24) and (3.23) for Process Segment j in Level
k Cell c are written in the master and work LP's in the form
$$u_{cj}^k \le u_{\text{limit}} \qquad (4.9)$$
2. Rate Variables A production rate variable column for Process Segment j of
Level k Cell c is transferred from the master LP to the work LP when the
following four conditions are met:
(a) All the machines in the process segment path for that variable are opera-
tional and correctly set up for the operations required by the segment.
(b) The Level k - 1 target rate, $u_j^{k-1}$, for the process segment production rate
is greater than zero ($u_j^{k-1} > 0$).
(c) Virtual machines do not limit the process segment production rate to be
zero.
(d) Antilooping constraint limit rates for the process segment production rate
are not zero.
All rate variables which are not included in the work LP have zero rates in the
optimal solution. Therefore, the columns which represent those rates in the
master LP are set automatically to zero. Note that all rate variables which are
set to zero in the master LP are excluded from the work LP as described in
Section 4.1.4.
3. Machine Constraints A machine constraint for Machine Group i, of the form
of one of the rows in (3.12), requires two conditions to be true in order to be
included in the work LP:
(a) There is at least one operational machine in the Machine Group (mi > 0).
(b) Not all the production rates for operations on Machine Group i are already
known to be zero.
4. Boundary Constraints (NOTE: The terms used here are defined in Sec-
tion 4.7). In the master LP, the boundary constraints are only assigned an
index and do not contain any values. The boundary constraints are not ini-
tially included in the work LP, but as each boundary is reached in the hedging
point strategy calculation, the constraints are activated and the coefficients F
are inserted into the appropriate slots.
There are no general boundary constraints which are known a priori. For this
reason, boundary constraints are excluded every time the work LP is rebuilt
(when either the capacity changes or production targets change). They are
reinstalled as the algorithm of Section 4.8.4 requires.
Construction of the Mask and Work LP The construction of the work LP
is accomplished using one mask of logical variables for the rows and one for the
columns of the master LP. The row mask has one element for each row in the master
LP. Likewise, the column mask has one element for each column in the master LP.
Each element in the mask tells the algorithm whether or not the corresponding row
or column is to be included in the work LP. At the start of the work LP construction,
all rows and columns are turned off in the mask. The mask entries are turned on only
for those rows and columns which meet the four conditions outlined in Item 2
above. All rates that are turned off in the mask are known a priori to be zero. Any
constraint which affects only rates which are known a priori to be zero is excluded
from the linear program.
The first step in the mask creation is to turn on the column elements corresponding
to each active rate variable. Next, any rate variable which had been turned on, but
which is dependent on another production rate through an antiloop constraint, is
turned off if the limiting rate does not appear in the work LP. Once all the necessary
rate variables have been activated in the column mask, the appropriate constraint
rows are activated in the row mask using the conditions outlined above in Item 3.
After the row mask and the column mask have been created, the work LP is
constructed. An element of the master LP is only included in the work LP if both
the corresponding row mask entry and column mask entry have been activated. All
other elements are ignored. The row and column masks are stored so that they can
be used in the transfer of the optimal rates from the work LP to the master LP once
the calculation is complete. Recall that neighboring cell controllers only have access
to rates stored in the master LP for reasons outlined in Section 4.1.4.
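The mask mechanism can be sketched with plain lists. The matrix layout, the function names, and the example values are assumptions for illustration; the actual Hiercsim data structures are described in Section 4.1.4 and are not reproduced here.

```python
def build_work_lp(master, row_mask, col_mask):
    """Keep an element of the master LP only if both its row and column are on."""
    rows = [i for i, on in enumerate(row_mask) if on]
    cols = [j for j, on in enumerate(col_mask) if on]
    work = [[master[i][j] for j in cols] for i in rows]
    return work, rows, cols

def scatter_rates(work_rates, cols, n_master_cols):
    """Copy optimal work LP rates back to the master LP; masked-off rates are zero."""
    rates = [0.0] * n_master_cols
    for value, j in zip(work_rates, cols):
        rates[j] = value
    return rates

master = [[1, 2, 0],
          [0, 0, 3],
          [4, 0, 5]]
col_mask = [True, False, True]   # second rate variable is known to be zero
row_mask = [True, True, False]   # third constraint is excluded

work, rows, cols = build_work_lp(master, row_mask, col_mask)
master_rates = scatter_rates([0.5, 0.25], cols, n_master_cols=3)
```

Storing `rows` and `cols` alongside the work LP plays the role of the saved masks: they tell the transfer step which master LP entry each work LP rate belongs to.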
4.6.2 Supporting Steps in the Rate Calculation
The rate calculation algorithm for the controller described in Section 4.8.4 requires
some supporting routines which prepare the data and make sure that the results of the
calculations are directed to the correct locations at the end of the calculation. This
section describes the steps which are required before the rate calculation is started
and after the rate calculation is completed.
Pre-Processing Steps This section describes the steps needed before a rate cal-
culation can be performed in Level k Cell c. The precise algorithm for computing
production rates is described in Section 4.8.4.
1. Cell Calculation Cancellation When a rate calculation is performed, all pre-
viously scheduled future rate calculations for Level k Cell c become invalid since
the information on which the calculation was scheduled has become obsolete.
Therefore, all previously scheduled rate calculations are canceled by removing
all rate calculations for Level k Cell c from the event queue.
2. Update Amount of Buffer Material The amount of material in buffers
surrounding Cell c, as seen by a Level k observer, must be updated. The
change in the amount of material in the buffer, $\Delta b$, is equal to the integral
of the difference of the current Level k production rates ($u_{cj}^k - u_{c'j}^k$) over the
interval since the last time, $t_0$, that the Level k flow rates through the buffer
were computed,
$$\Delta b(t) = (u_{cj}^k - u_{c'j}^k)(t - t_0) \qquad (4.10)$$
3. Update Machine Availability Whenever the controller of Level k Cell c has
been told about a capacity change, the controller counts the number of machines
which are operational in each possible machine state with each Machine Group i
within the cell, as seen by a Level k observer. That amount is multiplied by the
effective capacity e as determined by the Level k machine availability formula
(3.10). This total represents the amount of machine availability in Machine
Group i which can be scheduled for controllable events at lower levels.
4. Update Target Production Rates Target production rates $u^{k-1}$ are read
from the master LP of the Level k - 1 Cell C. If the target production rates
have changed since the last calculation, the work LP is rebuilt from the master
LP. The target production rates are compared to the machine constraints (2.2)
to determine whether the target rates are feasible, barely feasible, or
infeasible.[6] The result of each of the constraints is printed out. This provides
valuable feedback to the user about the presence of bottlenecks.
5. Update the Surplus Vector $x^k$ Production rates and target rates are inte-
grated over the time interval since the last production rate calculation in this
cell. These integrals keep track of the cumulative production and cumulative
requirements for each rate variable in the cell. The surplus vector $x^k$ is updated
by (3.14).
6. Build Work LP A new work LP is built from the master LP if either the
target production rates or the capacity set of Cell c have changed since the last
production rate calculation of Cell c. Otherwise, the current work LP is used.
The work LP is built from the master LP based on the Level k - 1 target
production rates, $u_j^{k-1}$, the machine availabilities, $e^k m^k$, and the conditional
constraints (3.23), (3.24), (3.34), and (3.35).
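Two of the pre-processing steps above, the cancellation of stale rate calculations (Step 1) and the buffer update of (4.10) (Step 2), can be sketched as follows. The event tuple layout `(time, kind, target)` and the function names are hypothetical, and the buffer update assumes the flow rates were constant since the last update.

```python
def cancel_rate_calculations(event_queue, cell):
    """Drop every scheduled rate calculation for `cell`; keep all other events."""
    return [ev for ev in event_queue
            if not (ev[1] == "RATE_CALC" and ev[2] == cell)]

def update_buffer_level(level, u_in, u_out, t, t0):
    """Apply equation (4.10): rates are constant on [t0, t], so the integral is a product."""
    return level + (u_in - u_out) * (t - t0)

queue = [(1.0, "RATE_CALC", "c1"), (2.0, "REPAIR", "m3"), (3.0, "RATE_CALC", "c2")]
queue_after = cancel_rate_calculations(queue, "c1")

new_level = update_buffer_level(level=4.0, u_in=5.0, u_out=3.0, t=12.0, t0=10.0)
```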
Post-Processing Steps This section describes all steps needed after a rate calcu-
lation described in Section 4.8.4 has been performed in Level k Cell c.
1. Once the final rates, $u_c^k$, have been found, they are transferred into the master
LP, where they are accessible to the rest of the system. All rates which did not
appear in the work LP are set to zero in the master LP.
2. The virtual machine constraints in adjacent cells are updated if the limiting
values have changed. The addition of virtual machine constraints is scheduled,
based on the new buffer fill rates.
[6] The only place in which the target rates can be infeasible is at the highest level of the hierarchy,
where the user specifies target production rates. All rates below that level satisfy the capacity
constraints of the system.
3. Assuming that neither the capacity of the cell nor the target rates will change
before the next rate calculations are needed, the next rate calculation for this
cell is scheduled. The time until the next boundary, $\Delta t$, is computed by (4.48).
Any random event which changes the capacity, $\Omega(e^k m^k)$, defined below, or the
target rates, $u^{k-1}$, of the cell will require an immediate rate calculation, thus
canceling this scheduled rate calculation.
4.7 Boundaries in Surplus Space
The role of the controller described in Section 3.7.2 is to allocate production resources
in Level k Cell c among the different process segments in the cell. There are conditions
in which the control algorithm has a choice of more than one solution. In those cases,
the algorithm is unable to find a unique solution without additional information. This
section describes this phenomenon in terms of boundaries as well as an algorithm to
choose a solution (Gershwin, Akella, Choong, 1985). The next section describes a
refinement to the algorithm which was necessary in the current implementation of
Hiercsim Versions 3.5 and 4.0.
Control Priority Criteria The control algorithm of Section 3.7.2 prioritizes
production based on the current surplus $x^k$, the current repair state of the cell, $e^k m^k$,
and the quadratic cost-to-go function $J$ of Level k Cell c.
The linear program which represents this prioritization scheme, (3.22) of
Section 3.7.2, is reproduced here:
$$\min \sum_{j \in P} A_j^k (x_j^k - z_j^k)\, u_{cj}^k \qquad (4.11)$$
subject to the capacity set
$$\sum_{j \in P_i} \tau_{ij}\, u_{cj}^k \le e_i^k m_i, \quad i \in c$$
$$u_{cj}^k \ge 0, \quad j \in P_i,\ i \in c$$
[Figure: surplus space divided into numbered regions, showing the hedging point $(z_1^k, z_2^k)$ and the boundary between regions, together with the related capacity set.]
Figure 4-1: Example Boundary with Capacity Set
The capacity set (3.12) is sometimes denoted as $\Omega(e^k m^k)$ and is assumed to be
constant during the period of time for which the linear program is computed. The
surplus $x^k$ changes continuously. $P_i$ represents all controllable activities at Machine
Group i in Level k Cell c.
Characteristics of the Cost Function There are three major characteristics of
the cost function, $C(x) = A(x - z)$, which affect the priority assigned to each
process segment production rate $u_{cj}^k$.
1. The linear program attempts to pick production rates, $u_c^k$, to move the surplus,
$x^k$, back to the hedging point, $z^k$, as cheaply as possible, given the current Level
k state of the system. Those process segments j with the highest costs C are
assigned the most capacity.
2. The linear program divides the surplus space into regions which are convex
cones. Any positive scalar which multiplies the cost function will not change
the outcome of the linear program (Gershwin, 1993).
3. The only part of the cost function, C(x), which is dependent on time is the
production surplus, $x^k$. The production rate solution, $u_c^k$, is generated based
on the relative magnitudes of the components of C(x). The dynamics of the
surplus are dependent on the solution $u_c^k$:
$$\dot{x}^k = u_c^k - u^{k-1} \qquad (4.12)$$
Based on this relation, the priorities given by the cost C(x) will change over
time, forcing the controller to alter the production allocation.
Figure 4-1 shows an example boundary space and its related capacity set. Each
corner of the capacity set corresponds to a region in surplus space. The rela-
tionship is indicated by the integer labels.
Characteristics of LP Solution Some characteristics of the solution, $u_c^k$, to the
linear program (4.11) are examined here. The conditions in which the linear program
does not have a unique solution are discussed in terms of those characteristics. Refer
to Gershwin (1993) for details and proofs.
By expanding the production rate vector $u_c^k$ with slack variables, the inequality
constraints in (4.11) can be converted into equality constraints on the expanded
production rate vector, $\tilde{u}_c^k$. Let $C(x)$ be the expanded cost coefficient vector $A(x^k - z^k)$,
where the priority coefficients, $A_j^k$, for the slack variables are zero. The linear program
(4.11) becomes
$$\min\; C(x)^T \tilde{u}_c^k \qquad (4.13)$$
subject to
$$D \tilde{u}_c^k = e^k m^k, \qquad \tilde{u}_c^k \ge 0 \qquad (4.14)$$
where D is a matrix whose elements are given by
$$D_{ij} = \begin{cases} \tau_{ij} & \text{if } j \text{ corresponds to Process Segment } j \\ 1 & \text{if } \tilde{u}_j \text{ is the slack variable for Machine Group } i \\ 0 & \text{otherwise} \end{cases} \qquad (4.15)$$
The standard solution of (4.13) breaks $\tilde{u}_c^k$ into basic and non-basic parts, $\tilde{u}_c^k = [u_B\ u_N]$,
with $C(x) = [C_B(x)\ C_N(x)]$ and $D = [D_B\ D_N]$ broken up correspondingly
(Luenberger, 1977). The basic part of D is a square, invertible matrix, $D_B$. By using
the equality in (4.13), the basic part of $\tilde{u}_c^k$, $u_B$, can be eliminated and the linear
program becomes
$$\min\; C_R(x)^T u_N \qquad (4.16)$$
subject to the capacity set
$$D_B^{-1} D_N u_N \le D_B^{-1} (e^k m^k), \qquad u_N \ge 0$$
where
$$C_R(x)^T = C_N(x)^T - C_B(x)^T D_B^{-1} D_N \qquad (4.17)$$
is the reduced cost.
A unique optimal solution to (4.16) occurs when the basic/nonbasic split gives
a reduced cost $C_R$ in which all components are strictly positive. In that case, the
optimal solution is
$$u_N = 0, \qquad u_B = D_B^{-1}(e^k m^k) \qquad (4.18)$$
The solution (4.18) of (4.16) is valid for all values of surplus $x^k_c$ which leave all
components of the reduced cost $C_R(x)$ strictly positive for that basic/nonbasic split.
The solution is non-unique if some components of $C_R$ are positive and some are zero.
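The reduced cost (4.17) can be illustrated with a minimal sketch. It assumes a hypothetical cell with two process segments sharing a single machine group, so that $D$ has one row and $D_B$ is a scalar. The function names and all numeric values are illustrative assumptions and are not taken from Hiercsim.

```python
def reduced_cost_one_machine(cost, tau, basic):
    """Reduced cost (4.17) for one machine group plus one slack variable.

    cost  -- cost coefficients C(x) of the process segments (slack cost is 0)
    tau   -- capacity coefficients of the segments in the single row of D
    basic -- index of the basic variable (len(cost) selects the slack)
    Returns {nonbasic index: reduced cost component}.
    """
    c = list(cost) + [0.0]   # expanded cost vector
    d = list(tau) + [1.0]    # expanded row of D
    # With a single constraint, D_B is the scalar d[basic].
    return {j: c[j] - c[basic] * d[j] / d[basic]
            for j in range(len(c)) if j != basic}


def is_unique_optimum(reduced):
    """A split is the unique optimum when every component is strictly positive."""
    return all(value > 0.0 for value in reduced.values())


# Hypothetical data: segment 0 has a negative cost coefficient (surplus below
# its hedging point), segment 1 a positive one.
reduced = reduced_cost_one_machine(cost=[-2.0, 1.0], tau=[1.0, 2.0], basic=0)
print(reduced, is_unique_optimum(reduced))
```

With these assumed numbers, making segment 0 basic leaves both reduced cost components positive, so that split is the unique optimum; choosing the slack as the basic variable instead leaves a negative component.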
4.7.1 Regions in Surplus Space
This section describes regions in surplus space for a cell in the hierarchical controller.
Define the Region $R_c$ as the set of all $x^k_c$ such that all components of $C_R(x)$,
defined by (4.17), are strictly positive (Gershwin, 1993). Let the reduced cost in
Region $R_c$ as a function of surplus, $x^k_c$, be defined as $C_{R_c}(x)$. The components of
$C_{R_c}(x)$ are strictly positive in Region $R_c$, but $C_{R_c}(x)$ has negative components
outside of Region $R_c$. The boundaries $R^h$ of Region $R_c$ are the set of all $x^k_c$ such that
the components of $C_{R_c}(x)$ are positive or zero, and at least one component is zero.
Outside of Region $R_c$ and beyond the boundaries $R^h$, one or more components of
the reduced cost, $C_{R_c}(x)$, are less than zero, indicating that the basic/non-basic split
of Region $R_c$ is not optimal outside of Region $R_c$. Outside of Region $R_c$, a reduced
cost different from $C_{R_c}(x)$ represents the optimal solution.
The optimal solution (4.18) to the linear program (4.11) is constant throughout
Region $R_c$. There is a one-to-one correspondence between Region $R_c$ and the optimal
corner of the capacity set $\Omega(e^k m^k)$.
Note that the cost function, $C(x)$, is linear in $x^k_c$ within Region $R_c$, and
consequently the reduced cost is also linear in $x^k_c$. Therefore, the boundaries $R^h$ of
Region $R_c$ are portions of hyperplanes.
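The region definitions above amount to a sign test on the reduced cost of one basic/nonbasic split. The following is a minimal sketch of that test; the function name and the tolerance `tol` are assumptions introduced for illustration.

```python
def locate_surplus(reduced_cost, tol=1e-9):
    """Classify a surplus point from the reduced cost of one split.

    "interior" -- every component is strictly positive (inside the region)
    "boundary" -- all components non-negative, at least one equal to zero
    "outside"  -- at least one component is negative (split not optimal)
    `tol` is an assumed numerical tolerance for the zero test.
    """
    if any(c < -tol for c in reduced_cost):
        return "outside"
    if any(abs(c) <= tol for c in reduced_cost):
        return "boundary"
    return "interior"


print(locate_surplus([5.0, 2.0]))   # all components positive
print(locate_surplus([0.0, 2.0]))   # one component zero
print(locate_surplus([-1.0, 2.0]))  # one component negative
```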
Dynamics of the Control Algorithm Generally as time progresses, the dynamics
of production
$$\dot{x}^k_c = u^k_c - u^{k-1}_C \qquad (4.19)$$
will drive the surplus $x^k_c$ from the interior of Region $R_c$ to one or more of the
boundaries $R^h$. If the control algorithm does not adjust for the transition from Region $R_c$
to the neighboring Region $R^+$, then the solution $\tilde{u}^k_c = [u_B \; u_N]$ based on the reduced
cost, $C_{R_c}(x)$, will cease to be optimal.
The decomposition used to create the control algorithm of Section 3.7.2 decouples
the production rate $u^k_c$ from the surplus $x^k_c$ in Level k Cell c. The production rate
solution $u^k_c$ is known when the surplus $x^k_c$ is known, as long as the surplus lies in the
interior of Region $R_c$.
If the surplus lies on a boundary $R^h$ between Region $R_c$ and one of its neighboring
regions, $R^+$, then the linear program (4.11) is underconstrained. There will be an
infinite number of solutions which minimize the total cost (Gershwin, 1993). Further
information is required to specify one particular solution.
Boundary Coefficients Suppose that the surplus $x^k_c$ of Level k Cell c lies in the
interior of Region $R_c$. The components of the reduced cost (4.17) for the optimal
solution are strictly positive and are defined in terms of the basic/nonbasic split. The
basic/nonbasic split of the solution is constant throughout the region and the values
of the production rates, $u^k_c$, are given by (4.18).
The components of the cost function are based on the cost-to-go function
approximation of Section 3.7.2. Each component j of the cost function combines a priority
coefficient, $A^k_j$, with the deviation of the component's surplus, $x^k_j$, from its
hedging point, $z^k_j$, for all process segments j. All priority coefficients, $A^k_j$, can be
grouped together into a diagonal matrix $A^k_c$. This matrix can be partitioned according
to the basic/nonbasic split of the solution (4.18) corresponding to Region $R_c$,
$A^k_c = [A_B \; A_N]$.
The reduced cost (4.17) can be expanded in terms of the partitioned priority matrix
$A^k_c = [A_B \; A_N]$ and the partitioned capacity coefficient matrix $D = [D_B \; D_N]$.
By factoring out the deviation of the surplus, $x^k_c$, from its hedging point, $z^k_c$, the
reduced cost can be written as
$$C_R(x) = (A_N - A_B D_B^{-1} D_N)(x^k_c - z^k_c) \qquad (4.20)$$
Define the matrix $F$ to be
$$F = A_N - A_B D_B^{-1} D_N \qquad (4.21)$$
The dimensions of $F$ are one row for each nonbasic variable and one column for
each process segment rate variable. Note that the components of $F$ corresponding to
the slack variables are zero since their priority coefficients ($A$) are zero.
The reduced cost simplifies to
$$C_R(x) = F (x^k_c - z^k_c) \qquad (4.22)$$
Each row $F_i$ in (4.21) contains the coefficients of Boundary $R^h_i$ of Region $R_c$. The
rows $F_i$ are linearly independent of each other. Define the component of the reduced
cost, $C_R(x)$, which corresponds to Boundary $R^h_i$ to be $h_i(x)$:
$$h_i(x) = F_i (x^k_c - z^k_c) \qquad (4.23)$$
The surplus point, $x^k_c$, lies on Boundary $R^h_i$ when that component, $h_i(x)$, is equal
to zero:
$$h_i(x) = 0 \iff x^k_c \text{ lies on Boundary } R^h_i \qquad (4.24)$$
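As a small illustration of (4.23) and (4.24), the sketch below evaluates $h_i(x)$ for one boundary row and tests whether a surplus point lies on that boundary. The row `F_i`, the hedging point and the surplus values are hypothetical numbers chosen for the example, and `tol` is an assumed tolerance.

```python
def boundary_value(F_i, x, z):
    """h_i(x) = F_i (x - z): reduced cost component for one boundary, as in (4.23)."""
    return sum(f * (xj - zj) for f, xj, zj in zip(F_i, x, z))


def on_boundary(F_i, x, z, tol=1e-9):
    """The surplus lies on the boundary when h_i(x) is zero, as in (4.24)."""
    return abs(boundary_value(F_i, x, z)) <= tol


F_i = [-2.0, 1.0]    # hypothetical boundary coefficients
z = [10.0, 10.0]     # hypothetical hedging point
print(boundary_value(F_i, [8.0, 9.0], z))  # positive: interior side of this boundary
print(on_boundary(F_i, [8.0, 6.0], z))     # h_i is exactly zero for this point
```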
Suppose that Boundary $R^h_i$ separates Region $R_c$ from Region $R^+$. Let $F^+$ be the
boundary coefficients of Region $R^+$ given by (4.21) according to the basic/nonbasic
split of Region $R^+$. Let $F^+_i$ be the row in $F^+$ which contains the coefficients of
Boundary $R^h_i$ for Region $R^+$.
Let $h^+_i(x)$ be the component of the reduced cost in Region $R^+$ which corresponds
to Boundary $R^h_i$. It can be written as
$$h^+_i(x) = F^+_i (x^k_c - z^k_c) \qquad (4.25)$$
Due to the one-to-one correspondence between regions, $R$, and corners of the
capacity set, $\Omega(e^k m^k)$, the basic/nonbasic split of Region $R_c$ is different from that of
Region $R^+$. However, the location and coefficients of Boundary $R^h_i$ do not change
when the surplus, $x^k_c$, moves from Region $R_c$ to Region $R^+$ despite the change in the
basic/nonbasic split.
The reduced cost component $h_i(x)$ of Boundary $R^h_i$ of Region $R_c$ is negative when
the surplus $x^k_c$ lies in Region $R^+$. The reduced cost component, $h^+_i(x)$, of Boundary
$R^h_i$ of Region $R^+$ is positive when the surplus $x^k_c$ lies in Region $R^+$.
Since these two quantities, $h_i(x)$ and $h^+_i(x)$, represent the same quantity and are
different only in sign, the relationship between $F_i$ and $F^+_i$ is$^7$
$$F^+_i = -F_i \qquad (4.27)$$
Suppose that the solution to the linear program (4.11) in Region $R_c$, $u^k_c$, drives the
surplus, $x^k_c$, to Boundary $R^h_i$ according to the dynamics given by (4.19). This implies
that the value of $h_i(x)$ is decreasing. At Boundary $R^h_i$, $h_i(x) = 0$. Across Boundary
$R^h_i$, in Region $R^+$, the value of $h_i(x)$ continues to decrease if the solution, $u^k_c$, from
Region $R_c$ is used. However, if the solution, $u^+$, from Region $R^+$ is used, the value of
$h_i(x)$ may increase, decrease, or remain constant.
Therefore, the solution $u^+$ from Region $R^+$ may either drive the surplus $x^k_c$ deeper
into Region $R^+$, send the surplus $x^k_c$ back into Region $R_c$, or keep the surplus $x^k_c$ on
Boundary $R^h_i$. The linear program (4.11) does not have a mechanism to determine
this outcome, so an additional step in the rate calculation algorithm is required when
a boundary is encountered.
$^7$ The boundary coefficient row, $F_i$, is normalized in Hiercsim Versions 3.5 and 4.0 so that when
$F_i$ is used to compute the reduced cost, $F_i$ will not affect the magnitude of the result. Therefore, $F'_i$
is used, where
$$F'_i = \frac{F_i}{\sqrt{F_i F_i^T}} \qquad (4.26)$$
because $F'_i {F'_i}^T = 1$. Therefore, the relation (4.27) does not include a positive scalar.
4.7.2 Attractive and Unattractive Boundaries
This section describes the attractiveness conditions for boundaries of a cell in the
hierarchical controller.
Suppose the Level k Cell c surplus $x^k_c$ lies in Region $R_c$ and is being driven towards
Region $R^+$ by the solution, $u^k_c$, of the linear program (4.11). Level k Cell c is operating
under a target production rate $u^{k-1}_C$ supplied by its parent cell, Level k - 1 Cell C.
Region $R_c$ and Region $R^+$ are separated from each other by Boundary $R^h_i$. The
coefficients of Boundary $R^h_i$ are $F_i$ $^8$ as determined by the basic/nonbasic split of
Region $R_c$ (4.21).
Define $x^-$ to be any value of surplus in Region $R_c$ and $x^+$ to be any value of
surplus in Region $R^+$. The solution to the linear program (4.11) in Region $R_c$ is $u^-$
and the solution in Region $R^+$ is $u^+$.
The value of the component of the reduced cost, $C_R(x)$, corresponding to Boundary
$R^h_i$ is $h_i(x)$, see (4.23). Since the surplus $x^k_c$ is being driven towards Boundary $R^h_i$
from Region $R_c$,
$$\frac{d h_i(x)}{dt} = F_i (u^- - u^{k-1}_C) < 0 \qquad (4.28)$$
Boundary $R^h_i$ is unattractive$^9$ if the surplus point $x^k_c$ is moved away from Boundary
$R^h_i$ when it lies within Region $R^+$. This is true if $h_i(x)$ continues to decrease after
crossing the boundary and operating under the production rate $u^+$ of Region $R^+$,
i.e., if
$$\frac{d h_i(x)}{dt} = F_i (u^+ - u^{k-1}_C) < 0 \qquad (4.29)$$
On the other hand, Boundary $R^h_i$ is attractive if the surplus $x^k_c$ either remains
stationary on Boundary $R^h_i$ or is moved back into Region $R_c$ when it operates under
the production rate $u^+$ of Region $R^+$. This occurs if $h_i(x)$ remains constant or
increases after entering Region $R^+$:
$$\frac{d h_i(x)}{dt} = F_i (u^+ - u^{k-1}_C) \geq 0 \qquad (4.30)$$
$^8$ Since the coefficients $F_i$ of Region $R_c$ are different from the coefficients $F^+_i$ of Region $R^+$ only
by their sign (4.27), $F_i$ will be used throughout this section exclusively.
$^9$ An unattractive boundary is also called a deflective boundary.
Attractiveness Condition Consider two neighboring regions in surplus space, $R_c$
and $R^+$, which are separated by Boundary $R^h_i$. The coefficients of Boundary $R^h_i$ in
Region $R_c$ are given by the row $F_i$.
The solution to the linear program (4.11) in Region $R_c$ is $u^-$, and the solution in
Region $R^+$ is $u^+$. The component of the reduced cost corresponding to Boundary $R^h_i$
in Region $R_c$ is $h_i(x)$, as shown in (4.23). The target production rate in both regions
is $u^{k-1}_C$. The value of $h_i(x)$ is zero on the boundary.
The attractiveness conditions for Boundary $R^h_i$ are:
$$R^h_i \text{ is attractive if } F_i(u^- - u^{k-1}_C) < 0 \text{ and } F_i(u^+ - u^{k-1}_C) \geq 0 \qquad (4.31)$$
$$R^h_i \text{ is unattractive if } F_i(u^- - u^{k-1}_C) > 0 \text{ or } F_i(u^+ - u^{k-1}_C) < 0 \qquad (4.32)$$
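The attractiveness test can be sketched as a small function. This is an illustration under stated assumptions, not the Hiercsim routine: it assumes the second inequality of (4.31) is non-strict, in agreement with (4.30), and it reports the case the two conditions do not cover (zero approach rate with a non-negative rate beyond the boundary) as "undetermined". All names and numbers are hypothetical.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(p * q for p, q in zip(a, b))


def classify_boundary(F_i, u_minus, u_plus, u_target):
    """Apply the attractiveness conditions (4.31) and (4.32) to one boundary.

    F_i      -- boundary coefficients taken from the region being left
    u_minus  -- optimal rates in that region
    u_plus   -- optimal rates in the neighboring region
    u_target -- target production rates from the parent cell
    """
    approach = dot(F_i, [u - t for u, t in zip(u_minus, u_target)])
    beyond = dot(F_i, [u - t for u, t in zip(u_plus, u_target)])
    if approach < 0 and beyond >= 0:
        return "attractive"
    if approach > 0 or beyond < 0:
        return "unattractive"
    return "undetermined"  # assumed label for the case neither condition covers


F_i = [1.0, -1.0]
target = [1.0, 1.0]
print(classify_boundary(F_i, [0.0, 2.0], [2.0, 0.0], target))
print(classify_boundary(F_i, [0.0, 2.0], [0.5, 2.0], target))
```

In the first call the rates on the far side push $h_i(x)$ back up, so the boundary is attractive; in the second call they keep decreasing it, so the boundary is unattractive.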
4.7.3 Attractive Boundary Chattering
This section describes chattering within a cell in the hierarchical controller and how
it is prevented.
If a boundary is attractive according to (4.31), then chattering will occur if no
additional information is provided to the controller through the linear program (4.11).
Suppose the surplus $x^k_c$ of Level k Cell c starts in Region $R_c$ and is driven across
Boundary $R^h_i$ by the solution $u^-$ to (4.11). As soon as the surplus $x^k_c$ enters the
neighboring Region $R^+$, it is immediately driven back across Boundary $R^h_i$ by the
solution $u^+$ to (4.11) in Region $R^+$. Upon reentering Region $R_c$, the surplus $x^k_c$
will again be driven across Boundary $R^h_i$ back into Region $R^+$. This back-and-forth
motion of surplus $x^k_c$ is called chattering.
If this chattering behavior is allowed to continue for a number of cycles, the surplus
$x^k_c$ will zig-zag back and forth across Boundary $R^h_i$. It will always stay near Boundary
$R^h_i$, and its average direction of movement will follow the boundary.
Chattering is undesirable for two reasons:
1. Chattering leads to a large amount of calculation in a short period of time,
thus significantly reducing the efficiency of the control algorithm developed in
Section 3.7.2.
2. Chattering violates the assumption (2.9) of Section 2.5 which states that the
frequency of a Level k event is much lower than that of a Level k + 1 event and
much higher than that of a Level k - 1 event. Chattering violates (2.9) because
the frequency of rate calculation in Level k Cell c immediately increases many
times over, which in turn forces Level k Cell c to communicate with its Level
k + 1 components much more often than the hierarchy permits.
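The chattering described in this section can be reproduced in a toy calculation. The sketch below tracks only the scalar $h_i(x)$ and switches between two assumed rates of change depending on which side of the boundary the surplus lies. The fixed time step, the rates and the function name are assumptions for illustration; the actual controller is event driven rather than time stepped.

```python
def count_crossings(h0, rate_minus, rate_plus, dt, steps):
    """Step h_i forward in time and count how often the boundary is crossed.

    h0         -- initial value of h_i(x), positive inside the starting region
    rate_minus -- dh_i/dt under the rates of the starting region (negative)
    rate_plus  -- dh_i/dt under the rates of the neighboring region
    dt, steps  -- assumed fixed time step and number of steps
    """
    h, crossings = h0, 0
    for _ in range(steps):
        rate = rate_minus if h > 0 else rate_plus
        new_h = h + rate * dt
        if (h > 0) != (new_h > 0):
            crossings += 1
        h = new_h
    return crossings, h


# Attractive boundary: the neighboring region pushes h_i back up.
print(count_crossings(1.0, -2.0, 3.0, 0.25, 20))
# Unattractive boundary: h_i keeps decreasing after the crossing.
print(count_crossings(1.0, -2.0, -1.0, 0.25, 20))
```

With the attractive rates the boundary is crossed repeatedly while $h_i$ stays close to zero, which is the zig-zag behavior the attractive boundary constraint is meant to remove. With the unattractive rates there is a single crossing.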
4.7.4 Attractive Boundary Constraint
This section describes a method to prevent chattering within a cell in the hierarchical
controller.
Chattering as described in Section 4.7.3 may be prevented if an additional constraint
is included in the linear program (4.11). This new constraint will force the
solution of the linear program, $u^k_c$, to drive the surplus $x^k_c$ along the Boundary $R^h_i$
between Regions $R_c$ and $R^+$ using a convex combination of the solutions $u^+$ and $u^-$.
Suppose the production rate solutions $u^+$ and $u^-$ on either side of Boundary
$R^h_i$ satisfy (4.31), indicating that Boundary $R^h_i$ is attractive. Suppose also that the
surplus $x^k_c$ lies on Boundary $R^h_i$, implying that
$$h_i(x) = F_i (x^k_c - z^k_c) = 0 \qquad (4.33)$$
Chattering will be avoided if $h_i(x)$ remains zero. Therefore, the additional constraint
in (4.11) is
$$\frac{d h_i(x)}{dt} = F_i (u^k_c - u^{k-1}_C) = 0 \qquad (4.34)$$
Since the target production rate, $u^{k-1}_C$, from Level k - 1 Cell C is a constant, the
constraint on $u^k_c$ is
$$F_i u^k_c = F_i u^{k-1}_C \qquad (4.35)$$
The modified linear program (4.13) including slack variables is
$$\min \; C(x)^T \tilde{u}^k_c \qquad (4.36)$$
subject to
$$D \tilde{u}^k_c = e^k m^k$$
$$F_i u^k_c = F_i u^{k-1}_C$$
$$\tilde{u}^k_c \geq 0$$
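The statement that the constrained solution is a convex combination of $u^+$ and $u^-$ can be checked numerically. The sketch below does not solve the linear program; it only finds the weight on $u^+$ for which the combined rate satisfies (4.35), assuming the boundary is attractive in the sense of (4.31). The function names and numbers are hypothetical.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(p * q for p, q in zip(a, b))


def sliding_rate(F_i, u_minus, u_plus, u_target):
    """Convex combination of u_minus and u_plus that satisfies (4.35).

    Returns (theta, u) where u = (1 - theta) * u_minus + theta * u_plus
    and F_i u = F_i u_target, so that h_i(x) stays at zero.
    """
    a = dot(F_i, [u - t for u, t in zip(u_minus, u_target)])
    b = dot(F_i, [u - t for u, t in zip(u_plus, u_target)])
    if not (a < 0 and b >= 0):
        raise ValueError("boundary is not attractive; no sliding rate is defined here")
    theta = a / (a - b)  # solves (1 - theta) * a + theta * b = 0
    u = [(1.0 - theta) * m + theta * p for m, p in zip(u_minus, u_plus)]
    return theta, u


theta, u = sliding_rate([1.0, -1.0], [0.0, 2.0], [2.0, 0.0], [1.0, 1.0])
print(theta, u)
```

Because the approach rate is negative and the rate beyond the boundary is non-negative, the weight always lies between 0 and 1, so the result is a genuine convex combination.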
The next section describes the mechanics associated with installing these bound-
ary constraints into the linear program. The installation is non-trivial because the
surplus xk is likely to lie on more than one boundary at any given time, rendering the
attractiveness conditions (4.31) and (4.32) insufficient. This is described in detail in
Section 4.8.
4.8 Boundary Installation
Overview The purpose of this section is to describe the algorithm used to install
multiple attractive boundary constraints (4.35) in the linear program (4.11) in
Hiercsim Versions 3.5 and 4.0, for Level k Cell c. Level k Cell c is provided target
production rates, $u^{k-1}_C$, by Level k - 1 Cell C.
The boundary installation algorithm described in Gershwin, Akella, Choong (1985)
is sufficient when only one boundary is encountered at a time. Section 4.7.2
describes the conditions which determine whether a boundary is attractive or
unattractive. These conditions rely on moving the surplus $x^k_c$ across the boundary,
and comparing the direction of the surplus across the boundary to the location of the
boundary. When multiple boundaries are encountered, and the surplus $x^k_c$ is moved
into a different region, it is difficult to determine which boundaries were crossed
and which were not. Therefore, the attractiveness conditions (4.31) and (4.32)
become confounded, making it difficult to separate the attractive boundaries from the
unattractive boundaries.
Boundary Installation Concept The basis of the boundary installation algo-
rithm used in Hiercsim Versions 3.5 and 4.0 is to convert the multiple boundary
problem into a series of single boundary problems.
The algorithm is built on three main components. The first computes how the
surplus, $x^k_c$, moves from the interior of Region $R_c$ to the nearest Boundary, $R^h_i$, according
to the equations of motion of surplus. The second uses those equations of motion and
adds boundary constraints one at a time, forming a deterministic projected trajectory
of surplus, $x^k_c$. The third component determines whether or not the surplus, $x^k_c$, is
sufficiently close to a boundary to be treated as if it were on that boundary. This is
used as a criterion to count the number of boundaries on which the ghost surplus, $x_p$,
lies.
Each of these three components will be described individually, and then assembled
into the algorithm along with the criterion which terminates the algorithm.
That information is used to install the appropriate constraints (4.35) on the optimal
production rates $u^k_c$. This prevents chattering as described in Section 4.7.3.
Algorithm Features The multiple boundary installation algorithm is designed to
minimize the chance of a coding error going undetected by using numerous
cross-checks. The algorithm is inefficient because it uses many calls of the linear program
solver. It uses no prior information in determining the attractiveness of boundaries
if the capacity set, $\Omega(e^k m^k)$, or the target production rate, $u^{k-1}_C$, has changed. All
calculations within the algorithm use a common set of routines so that any changes
may be made in a single location.
These features ensure that the code is robust and will be able to handle most
situations, foreseen and unforeseen. As more experience is gained with this control
algorithm, much of the redundancy should be removed.
4.8.1 Equations of Motion
This section describes the equations of motion of the surplus point of a cell in the
hierarchical controller.
Suppose the surplus, $x^k_c$, of Level k Cell c lies in the interior of Region $R_c$. When
the optimal production rates, $u^k_c$, are computed according to the algorithm of
Section 3.7.2, the cell controller schedules the next required calculation some time in the
future.
That time is determined using the equation of motion (4.12) for the surplus, $x^k_c$,
provided the Level k capacity set $\Omega(e^k m^k)$ and the target production rate, $u^{k-1}_C$, do
not change.
Recall that everywhere within Region $R_c$, each component of the reduced cost
$C_R(x)$ is strictly positive for the optimal basic/nonbasic split of the linear program
(4.11). That region is bounded by the set of Boundaries $R^h$ on which all components
of the reduced cost are positive or zero, with at least one component equal to zero. The
next boundary that will be reached will be the one corresponding to the component
of the reduced cost which reaches zero first.
Since the reduced cost, $C_R(x)$, depends linearly on the surplus, $x^k_c$, and the
velocity of the surplus, $\dot{x}^k_c$, is constant throughout the region (4.12), the reduced cost
as a function of time can be written as
$$C_R(x^k_c(t)) = C_R(x^k_c(t_0)) + \frac{d C_R}{dt} (t - t_0) \qquad (4.37)$$
where $x^k_c(t_0)$ is the surplus at time $t_0$.
Suppose that the current conditions will lead the surplus, $x^k_c$, to Boundary $R^h_i$
at time $t_1$, before any other boundary is reached. Solving (4.37) for the interval
$t_1 - t_0$ and applying (4.22) and (4.12), given the initial condition, $x^k_c(t_0)$, and using
the optimal production rate, $u_0$, yields the time interval to reach Boundary $R^h_i$:
$$t_1 - t_0 = -\frac{C_{R_i}(x^k_c(t_0))}{dC_{R_i}/dt} = -\frac{F_i (x^k_c(t_0) - z^k_c)}{F_i (u_0 - u^{k-1}_C)} \qquad (4.38)$$
provided there is no change in either the capacity set $\Omega(e^k m^k)$ or the target production
rate, $u^{k-1}_C$. $C_{R_i}(x^k_c(t_0))$ is the reduced cost component and $F_i$ are the coefficients
corresponding to Boundary $R^h_i$. Note that for this equation to yield a positive time
interval, $dC_{R_i}/dt < 0$, since $C_{R_i} > 0$ in Region $R_c$. Otherwise, the surplus, $x^k_c$, will move
away from Boundary $R^h_i$.
The value of the surplus, $x^k_c(t)$, at Boundary $R^h_i$ at time $t_1$ is
$$x^k_c(t_1) = x^k_c(t_0) + (u_0 - u^{k-1}_C)(t_1 - t_0) \qquad (4.39)$$
To demonstrate that $x^k_c(t_1)$ is on the boundary $R^h_i$, multiply (4.39) by the
boundary coefficients, $F_i$,
$$F_i x^k_c(t_1) = F_i x^k_c(t_0) + F_i (u_0 - u^{k-1}_C)(t_1 - t_0)$$
From (4.38), replace the time interval $t_1 - t_0$,
$$F_i x^k_c(t_1) = F_i x^k_c(t_0) - F_i (u_0 - u^{k-1}_C) \frac{F_i (x^k_c(t_0) - z^k_c)}{F_i (u_0 - u^{k-1}_C)}$$
This expression simplifies to
$$F_i x^k_c(t_1) = F_i z^k_c$$
which is the equation for the boundary $R^h_i$, (4.33).
At time $t_1$, the controller will reallocate capacity according to the algorithm in
Section 4.8.4.
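The time-to-boundary calculation (4.38) and the surplus update (4.39) can be sketched as follows. The function names, the use of an infinite time for boundaries that are not being approached, and the numeric values are assumptions made for illustration.

```python
import math


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(p * q for p, q in zip(a, b))


def time_to_boundary(F_i, x0, z, u0, u_target):
    """Interval t1 - t0 of (4.38); infinite if the boundary is not approached."""
    h0 = dot(F_i, [x - zz for x, zz in zip(x0, z)])
    rate = dot(F_i, [u - t for u, t in zip(u0, u_target)])
    if rate >= 0:
        return math.inf
    return -h0 / rate


def first_boundary(F, x0, z, u0, u_target):
    """Index and arrival time of the boundary that is reached first."""
    times = [time_to_boundary(F_i, x0, z, u0, u_target) for F_i in F]
    i = min(range(len(times)), key=times.__getitem__)
    return i, times[i]


def advance(x0, u0, u_target, dt):
    """Surplus after dt time units under constant rates, as in (4.39)."""
    return [x + (u - t) * dt for x, u, t in zip(x0, u0, u_target)]


F = [[1.0, 0.0], [0.0, 1.0]]     # hypothetical boundary rows
z = [0.0, 0.0]                   # hypothetical hedging point
x0, u0, target = [4.0, 6.0], [0.0, 0.0], [2.0, 1.0]
i, dt = first_boundary(F, x0, z, u0, target)
x1 = advance(x0, u0, target, dt)
print(i, dt, x1, dot(F[i], [a - b for a, b in zip(x1, z)]))
```

For the assumed data the first row is reached first, and evaluating $F_i (x^k_c(t_1) - z^k_c)$ at the new surplus gives zero, which mirrors the derivation above.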
4.8.2 Projected Trajectory
This section describes the projected trajectory of the surplus point of a cell in the
hierarchical controller.
The hierarchical controller of Section 3.7.2 is designed to respond to random events
in Level k Cell c in a deterministic manner. The randomness in Level k Cell c
is limited to changes in its capacity set $\Omega(e^k m^k)$ (3.12) and its Level k - 1 target
production rates, $u^{k-1}_C$. Both of these quantities undergo instantaneous changes at
random intervals with some long-term average frequency, $f_k$, but are constant between
changes.
Once the capacity set, $\Omega(e^k m^k)$, and the target production rates, $u^{k-1}_C$, are known,
the path of Level k surplus, $x^k_c$, is determined until the next change in either $\Omega(e^k m^k)$
or $u^{k-1}_C$ occurs. That path is a sequence of piecewise linear stages because the velocity
of the surplus, $\dot{x}^k_c$, is constant within any Region $R_c$, once appropriate boundary
constraints (4.35) are added to the linear program (4.13).
The path of surplus, $x^k_c$, that can be computed based on $\Omega(e^k m^k)$ and $u^{k-1}_C$ is called
the projected trajectory in Gershwin (1989). It is calculated using the equations of
motion of Section 4.8.1 and the boundary attractiveness conditions (4.31) and (4.32).
The projected trajectory describes how the controller will bring the surplus, $x^k_c$, back
to the hedging point, $z^k_c$, if the target production rates, $u^{k-1}_C$, are feasible
($u^{k-1}_C \in \Omega(e^k m^k)$), or find the best sequence of production mixes with which to
move away from the hedging point if $u^{k-1}_C$ are infeasible ($u^{k-1}_C \notin \Omega(e^k m^k)$).
Calculation of the Projected Trajectory Consider Level k Cell c which contains
N process segments whose surplus is $x^k_c(t_0)$ at time $t_0$. Suppose that the surplus $x^k_c(t_0)$
lies in the interior of Region $R_c$. The future path of $x^k_c(t_0)$ can be projected given that the
capacity set, $\Omega(e^k m^k)$, and the target production rates, $u^{k-1}_C$, remain constant. The
projected path of the surplus, $x^k_c(t)$, is computed for the interval of time $[t_0, \infty)$ given
the conditions at time $t_0$. It is assumed that along the path of $x^k_c(t)$, boundaries will
be encountered one at a time. Multiple boundaries will be discussed in Section 4.8.3.
The capacity set, $\Omega(e^k m^k)$, represents all feasible Level k production rates, $u^k_c$,
and has dimension, N, equal to the number of process segments in Level k Cell c.
Among all feasible values of $u^k_c$, only one, $u_0$, is optimal throughout Region
$R_c$. The interval of time $(t_1 - t_0)$ until the first Boundary, $R^h_i$, is reached is given by
(4.38), using the value of $u_0$.
To create the projected trajectory, the surplus, $x^k_c(t_1)$, is determined. The
attractiveness condition (4.31) is checked on Boundary $R^h_i$. If the boundary is attractive,
the constraint (4.35) is added to the constraint matrix D of the linear program (4.13)
and will be in effect for the remainder of the projected trajectory.
After an attractive boundary constraint (4.35) is added to the constraint matrix
D of the linear program (4.13), the surplus, $x^k_c(t_1)$, lies on a slice through the original
Region $R_c$.
Let this slice be Region $R_c'$ of dimension N - 1. The dimension of the set of
feasible production rates, $u^k_c$, for Region $R_c'$ is equal to N - 1. Within Region $R_c'$,
the dimension of the reduced cost, $C_R'(x)$, is equal to N - 1, and the optimal solution,
$u_1$, corresponds to the corner of the constraint set of (4.36) where all components
of $C_R'(x) > 0$. Region $R_c'$ is bounded by Boundaries $R^{h\prime}$ on which the components
of the reduced cost $C_R'(x) \geq 0$, with at least one component equal to zero.
The calculations required to determine the coefficients of the next Boundary, $R^{h\prime}_i$,
and the time interval, $(t_2 - t_1)$, to reach that boundary are exactly the same as for the
first stage of the projected trajectory, except that the dimension of possible solutions,
$u^k_c$, is reduced to N - 1 from N. Boundary $R^{h\prime}_i$ will be checked for attractiveness using
conditions (4.31) and (4.32) and its constraint will be added if necessary to the linear
program (4.13).
As each boundary is encountered, an interval of time and a set of optimal rates are
added to the projected trajectory of the surplus, $x^k_c(t)$. They define the path of $x^k_c(t)$
according to the equations of motion (4.39). As each attractive boundary constraint
is added, the dimension of possible solutions, $u^k_c$, is further reduced by 1.
This process of moving from boundary to boundary ends when either the
hedging point $z^k_c$ is reached (if $u^{k-1}_C \in \Omega(e^k m^k)$) or the time to the next boundary is
infinite (if $u^{k-1}_C \notin \Omega(e^k m^k)$). If the time to reach the next boundary is infinite, then
the value computed by (4.38) will be negative.
If the target production rates are feasible, the final production rates will be equal
to the target production rates. Otherwise, if the target production rates are infeasible,
the final production rates will cause the surplus to move away from the hedging point
without limit, until either the capacity set $\Omega(e^k m^k)$ or the target production rates
$u^{k-1}_C$ change.
4.8.3 Multiple Boundaries
This section describes the situation when more than one boundary is reached by the
surplus of a cell in the hierarchical controller.
Section 4.7.1 described the existence of boundaries in surplus space for Level k Cell
c. These boundaries are either attractive or unattractive according to the conditions
(4.31) and (4.32). However, these conditions are only appropriate for a boundary in
isolation of all others because they are based on the assumption that there are only
two sets of production rates, u+ and u-, near the boundary.
Additional effort is required to determine the attractiveness of each boundary
in the case where the surplus $x^k_c$ lies on multiple boundaries. As the number n of
boundaries at the surplus $x^k_c$ increases, the number of regions adjacent to $x^k_c$ grows
to as many as $2^n$. Each region contributes its own optimal solution, $u^k_c$, of (4.13) to the set
of possible combinations of optimal production rates. The number of combinations of
attractive and unattractive boundaries quickly becomes unmanageable. Section 4.8.4
describes the solution to this problem as implemented in Hiercsim Versions 3.5 and
4.0.
Criteria for Multiple Boundaries Recall from Section 4.1.2 that Ek is a small
time compared to the characteristic time scale, 1/fk of Level k Cell c. Similarly, Ik
is a large time compared to the Level k characteristic time scale, 1/fk. Define the
quantity e, to be a small real number close to the numerical precision of the computer.
There are two criteria used in the simulation to determine if the surplus $x^k_c$ is
sufficiently close to Boundary $R^h_i$ to be treated as if it lay on Boundary $R^h_i$. The first
deals with the condition that surplus $x^k_c$ is about to intersect the hyperplane $F_i$
under the current production and target rates. This can be written as
$$t_1 - t_0 = -\frac{F_i (x^k_c - z^k_c)}{F_i (u^k_c - u^{k-1}_C)} < \epsilon_k \qquad (4.40)$$
where $t_0$ is the current time and $t_1$ is the time at which the boundary will be
intersected.
The second deals with the condition when the surplus $x^k_c$ is sufficiently close to
Boundary $R^h_i$ to be treated as if it were on the boundary, but is motionless relative to
the boundary. In this case, the time interval is large, but the reduced cost component
of that boundary is small. This condition can be written as
$$-\epsilon < F_i (x^k_c - z^k_c) < \epsilon \qquad (4.41)$$
The surplus is on multiple boundaries if the surplus $x^k_c$ satisfies either (4.40) or
(4.41) for more than one Boundary $R^h_i \in R^h$ of Region $R_c$.
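The two criteria can be combined into a single test, sketched below. This is an interpretation rather than the Hiercsim code: it assumes that (4.40) only applies to boundaries that are actually being approached, so a non-negative time is required, and the threshold names `eps_time` and `eps_cost` stand in for the small time and the small real number defined above.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(p * q for p, q in zip(a, b))


def near_boundary(F_i, x, z, u, u_target, eps_time, eps_cost):
    """True if the surplus is treated as lying on the boundary with row F_i."""
    h = dot(F_i, [xj - zj for xj, zj in zip(x, z)])
    rate = dot(F_i, [uj - tj for uj, tj in zip(u, u_target)])
    # Criterion (4.40): the boundary will be reached within eps_time.
    # Assumption: only boundaries being approached (rate < 0) are counted.
    if rate < 0 and 0.0 <= -h / rate < eps_time:
        return True
    # Criterion (4.41): the reduced cost component is numerically zero.
    return -eps_cost < h < eps_cost


def boundaries_at(F, x, z, u, u_target, eps_time, eps_cost):
    """Indices of all boundaries on which the surplus is considered to lie."""
    return [i for i, F_i in enumerate(F)
            if near_boundary(F_i, x, z, u, u_target, eps_time, eps_cost)]


F = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # hypothetical boundary rows
z, u, target = [0.0, 0.0], [0.0, 0.0], [1.0, 1.0]
print(boundaries_at(F, [0.001, 5.0], z, u, target, 0.01, 1e-6))
print(boundaries_at(F, [0.001, 0.002], z, u, target, 0.01, 1e-6))
```

The surplus is on multiple boundaries when the returned list has more than one entry, as in the second call.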
Origin of Multiple Boundaries The surplus $x^k_c$ will lie on multiple boundaries
without the appropriate constraints to solve for the optimal rates (4.11) due to either
of two conditions. The first condition arises when the target production rates, $u^{k-1}_C$,
from the Level k - 1 Cell C, change, thus possibly changing the attractiveness of the
boundaries. The second condition arises when the capacity set, $\Omega(e^k m^k)$, changes,
which alters the number and/or location of the boundaries.
In Hiercsim Versions 3.5 and 4.0, there is a unique value for each hedging point
in Level k Cell c, regardless of the state of the manufacturing system$^{10}$. Because
the location of the hedging point does not change, the surplus $x^k_c$ will remain on
boundaries it was on before the target production rates, $u^{k-1}_C$, changed.
If the target production rates, $u^{k-1}_C$, have changed, then the attractiveness
conditions (4.31) and (4.32) will have changed. This invalidates any of the previously
installed boundaries in the linear program (4.36). Before the controller is able to
continue, all boundaries must be checked for attractiveness again.
$^{10}$ In general, the hedging point values will change in response to stochastic events.
If the capacity set, $\Omega(e^k m^k)$, has changed, then the boundaries on which the
surplus, $x^k_c$, lies may have shifted. In simulations, the magnitude of these shifts (especially
due to changes in virtual machine limiting rates (3.23) or (3.24)) is not sufficient to
violate the condition (4.41). This implies that the surplus still lies on the boundary,
but the optimal rates, $u^k_c$, will have changed. Therefore, the boundary must be
checked for attractiveness.
More often than not, the surplus $x^k_c$ will lie at the hedging point, $z^k_c$, by the design
of the controller described in Section 3.7.2. When a machine fails or is repaired,
a boundary is either removed from or added to those at the hedging point. In this case, the
optimal production rates, $u^k_c$, will have changed, possibly altering the attractiveness
of the remaining boundaries.
4.8.4 Installation of Multiple Boundaries
This section details how multiple boundaries are installed in a cell in the hierarchical
controller.
Section 4.8.3 described multiple boundaries and some of the reasons why the
attractiveness conditions of Section 4.7.2 are insufficient. This section details the
algorithm used in Hiercsim Versions 3.5 and 4.0 to install boundary constraints in the
linear program (4.13) when the surplus, $x^k_c$, of Level k Cell c lies on multiple
boundaries. Recall that Level k Cell c has a capacity set $\Omega(e^k m^k)$ and target production
rates $u^{k-1}_C$.
Overview The purpose of the boundary installation algorithm is to:
* Determine which boundaries are effective (i.e. for which the surplus satisfies
(4.33)).
* Determine which of those boundaries are attractive.
* Determine the coefficients $F_i$ of the attractive boundaries.
* Install those boundaries in the capacity set $\Omega(e^k m^k)$.
This algorithm is capable of working with an arbitrary number of boundaries in
the same calculation sequence.
The primary concept of this algorithm is the ghost surplus, $x_p$. The ghost surplus
is a copy of the actual surplus $x^k_c$ that can be altered without changing the actual
surplus. The algorithm moves the ghost surplus, $x_p$, through the surplus space in such a
way as to encounter each boundary in isolation from the other boundaries, according to the
projected trajectory of Section 4.8.2. This method permits the application of the
conditions (4.31) and (4.32) which were developed for a single boundary.
Reference Surplus, $x_o$ At the start of the calculation at time $t_0$, the surplus,
$x^k_c(t_0)$, lies on multiple boundaries according to the conditions (4.40) and (4.41).
Those conditions only require the surplus, $x^k_c(t_0)$, to lie near each boundary, and not
precisely on the boundaries.
Some of those boundaries may be attractive, while others may be unattractive.
When the attractive boundary constraints are added to the linear program (4.11), the
optimal production rates, $u^k_c$, change. In some simulations, the conditions (4.40) and
(4.41) were just barely met, and when the production rates changed, the remaining
unattractive boundaries no longer satisfied the conditions, even though nothing had
changed other than $u^k_c$.
This behavior created problems in the algorithm because it prevented the pro-
duction rates from being computed across the unattractive boundaries. Another rate
calculation would be required almost immediately, as the surplus is moved across
the unattractive boundaries, violating the frequency assumptions of Section 2.5. A
solution to this problem would be to move the surplus xk(to) as close to the nearest
boundary as possible. However, moving the surplus, xk(to), has other problems.
The boundary installation algorithm uses a reference surplus, $x_o$, which lies exactly
on the nearest boundary to surplus, $x^k_c(t_0)$. The actual surplus $x^k_c(t_0)$ is therefore
only used as a baseline from which both the ghost surplus, $x_p$, and the reference
surplus, $x_o$, are created. At the end of the algorithm, all appropriate attractive
boundary constraints are included in the capacity set $\Omega(e^k m^k)$, and the actual surplus
is restored.
Figure 4-2 illustrates the relationship between the ghost surplus $x_p$, the reference
surplus $x_o$, and the initial surplus $x^k_c$.
[Figure 4-2: Relationship Between $x_p$, $x_o$, and $x^k_c$. Labels in the figure: Surplus Space, Boundary Criterion, Hedging Point, Initial $x^k_c$.]
Algorithm Steps The rest of this section details the boundary installation algo-
rithm.
1. Calculate Initial Rates The rate calculation is started by computing the
production rates, uc, according to the linear program (4.11), given the sur-
200
plus, Xz(to), and the capacity set, Q(ekmk). The calculation determines the
basic/nonbasic split of production rates and computes the basis inverse, DB1
required to calculate the coefficients, F (4.21) of boundaries Rh.
The time to the first boundary, tl -to, is computed according to (4.38). If tl -to
satisfies (4.40), then the algorithm is continued. Otherwise the surplus, xz(to),
does not lie on a boundary and a rate calculation is scheduled at tl - to time
units in the future, terminating the algorithm.
2. Calculate Reference Surplus, $x_0$ If the surplus, $x^k(t_0)$, requires the installation
of boundary constraints, it requires $t_1 - t_0 < \epsilon^k$ time units to reach the
first boundary under the current production rates, $u^k$. To avoid numerical problems
that were encountered during simulation runs, the reference surplus, $x_0$, is
calculated to lie on the closest boundary, as follows:
$$x_0 = x^k(t_0) + (u^k - u^{k-1})(t_1 - t_0) \qquad (4.42)$$
The production rates are determined using the reference surplus, $x_0$, in (4.11).
The boundaries $R_0$ on which the reference surplus lies are determined. The
basic/nonbasic split of the reference surplus, $x_0$, may not be identical to that
of surplus, $x^k$, because a boundary may have been crossed within the numerical
precision of the computer. This difference does not alter the result of the
algorithm, but it does prevent the use of the initial production rates.
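The reference surplus calculation is a straight-line projection of the surplus along its current drift. The following Python fragment is only an illustrative sketch, not code from Hiercsim: the function and argument names are assumptions, vectors are plain lists with one entry per part type, and the time to the first boundary is taken as already computed.

```python
def reference_surplus(x, u, u_target, time_to_boundary):
    """Project the surplus onto the boundary it is about to reach (4.42).

    x                -- current surplus, one entry per part type
    u                -- current production rates at this level
    u_target         -- target production rates from the level above
    time_to_boundary -- t1 - t0, assumed already computed from (4.38)
    """
    return [xj + (uj - dj) * time_to_boundary
            for xj, uj, dj in zip(x, u, u_target)]


# Hypothetical two-part example: part 0 is gaining surplus, part 1 is losing it.
x0 = reference_surplus([1.0, 2.0], [3.0, 1.0], [2.0, 2.0], 0.5)
```

Because the projection uses the same rates that determined the time to the boundary, the result lies on the boundary up to floating-point rounding, which is the property the step relies on.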
3. Create the Ghost Surplus The heart of the boundary installation algorithm
is the creation of a ghost surplus, $x_p$, which does not lie on any boundary,
but still lies in a region adjacent to all boundaries $R_0$. A projected trajectory
from Section 4.8.2 is created, encountering the boundaries $R_0$ one at a
time. (The boundaries must be encountered one at a time so that the attractiveness
criteria (4.31) and (4.32) may be applied without confusing an attractive
boundary with an unattractive boundary.)
This step in the algorithm creates the ghost surplus, $x_p$, based on a small
perturbation, $\delta$, around the reference surplus, $x_0$. Since the ghost surplus must
lie in a region adjacent to all boundaries $R_0$, the perturbation, $\delta$, must be as
small as possible. However, it is difficult to predict the minimum magnitude
and appropriate direction of $\delta$. This algorithm is designed to operate without
that information.
This has led to an inefficient algorithm because multiple iterations of $\delta$ are computed
until the conditions (4.40) and (4.41) are no longer satisfied by the ghost
surplus, $x_p$. With each iteration, the mean magnitude of $\delta$ is increased. When
the mean magnitude exceeds a threshold before a successful perturbation is
made, the simulation is stopped.
The ghost surplus, $x_p$, is created in Hiercsim Versions 3.5 and 4.0 by adding,
componentwise, a small random perturbation, $\delta$, to the reference surplus, $x_0$,
away from the hedging point:
$$x_p = x_0 + \delta \qquad (4.43)$$
For any component, $x_{0j}$, of the reference surplus, $x_0$, the sign of the corresponding
component, $\delta_j$, of the perturbation is determined by the relation of $x_{0j}$ to
its hedging point, $z^k_j$:
$$\text{if } x_{0j} < z^k_j \text{ then } \delta_j < 0, \quad \text{else if } x_{0j} \ge z^k_j \text{ then } \delta_j > 0, \qquad \forall j \qquad (4.44)$$
Even though no explanation is currently available, the simulation was more
stable when this rule was followed. Two behaviors related to stability were
observed.
(a) When some or all components of surplus, $x^k$, were near their respective
components of the hedging point, $z^k$, the algorithm sometimes crashed
when the hedging point boundary was crossed. This may have been due to
the fact that the costs of (4.11) are small enough to introduce numerical
stability problems in the computation of the basis inverse.
(b) The coefficients, $F_i$, of Boundary $R_h$ (4.21) are simplified greatly if they are
computed when the surplus used to compute the cost in the linear program
(4.11) is greater than the hedging point, $z^k$. That is, the coefficients are
simplified when the costs in (4.11) are positive. This simplification avoids
truncation errors in the calculation of the basis inverse $D_B^{-1}$ of (4.13).
The mean magnitude, $1/\lambda_m$, of each component, $\delta_j$, of the perturbation, $\delta$, is
determined by three parameters: the current iteration, $m = 1, 2, \ldots$, the base
perturbation mean, $1/\lambda$, and a fixed multiplier, $\Lambda$:
$$\frac{1}{\lambda_m} = \Lambda^m \frac{1}{\lambda} \qquad (4.45)$$
(The term $\Lambda^m$ is $\Lambda$ raised to the power $m$. In Hiercsim Version 3.5, the term $\lambda$ is called
MINDELTA, and the term $\Lambda$ is called PERTMULT.)
The values of $\lambda$ and $\Lambda$ used in Hiercsim Versions 3.5 and 4.0 are set by the user
in the input file. A typical value of $\Lambda$ is 1.1. The value of the base mean $1/\lambda$
depends on the size of numbers used in the simulation.
The magnitude $|\delta_j|$ of component $\delta_j$ is an exponentially distributed random
variable of mean $1/\lambda_m$:
$$\Pr(|\delta_j| > y) = e^{-\lambda_m y} \qquad (4.46)$$
Once the ghost surplus, $x_p$, is computed, it is used to calculate production rates,
$u_p$, in the linear program (4.13). Using the production rates, $u_p$, the boundary
proximity criteria (4.40) and (4.41) are computed. If no boundaries satisfy
(4.40) and (4.41), then the perturbation is successful.
Otherwise, the value of the ghost surplus, $x_p$, is thrown away, the iteration
number, $m$, is increased by 1, and the ghost surplus is recomputed using the
larger mean, $1/\lambda_{m+1}$. If the algorithm fails to generate a suitable ghost surplus,
$x_p$, after the base mean, $1/\lambda$, is increased by some predetermined multiple, $\Lambda_{max}$,
then the simulation is stopped. In that case, the user should try a different
value of the base mean $1/\lambda$. A typical value of $\Lambda_{max}$ is 100.
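The retry loop of this step can be sketched as follows. This is a minimal illustration under stated assumptions rather than the Hiercsim code: `near_boundary` is a hypothetical stand-in for the proximity tests (4.40) and (4.41), the parameter names `base_mean`, `multiplier`, and `max_multiple` are invented here for the base mean, the fixed multiplier, and the stopping multiple, and the mean is assumed to grow geometrically with the iteration number.

```python
import random


def perturbation(x0, z, mean, rng):
    """Draw one perturbation: exponential magnitudes, signed away from z."""
    delta = []
    for x0j, zj in zip(x0, z):
        magnitude = rng.expovariate(1.0 / mean)  # exponential with this mean
        # Sign rule (4.44): below the hedging point moves further below.
        delta.append(-magnitude if x0j < zj else magnitude)
    return delta


def create_ghost_surplus(x0, z, base_mean, multiplier, max_multiple,
                         near_boundary, rng=None):
    """Retry with a growing mean until the ghost surplus clears all boundaries.

    near_boundary(xp) is a hypothetical predicate that returns True while
    the candidate ghost surplus is still too close to some boundary.
    """
    if multiplier <= 1.0:
        raise ValueError("multiplier must exceed 1 so that the mean grows")
    rng = rng or random.Random()
    m = 1
    while multiplier ** m <= max_multiple:
        mean = base_mean * multiplier ** m          # assumed growth law
        delta = perturbation(x0, z, mean, rng)
        xp = [x0j + dj for x0j, dj in zip(x0, delta)]  # (4.43)
        if not near_boundary(xp):
            return xp, m
        m += 1
    raise RuntimeError("no suitable ghost surplus; try a different base mean")


# Hypothetical use: accept the first candidate.
xp, iterations = create_ghost_surplus(
    [1.0, 5.0], [2.0, 3.0], 0.01, 1.1, 100.0,
    near_boundary=lambda candidate: False, rng=random.Random(0))
```

With a multiplier of 1.1 and a stopping multiple of 100, the loop gives up after roughly 48 unsuccessful draws, which illustrates why the text calls the approach inefficient.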
4. Follow Trajectory Once an appropriate ghost surplus $x_p$ is found, the projected
trajectory of Section 4.8.2 is used to examine the boundaries $R_0$ one
at a time. The attractiveness conditions (4.31) and (4.32) are applied to each
boundary and the appropriate attractive boundary constraints (4.35) are added
to the linear program (4.13).
As each Boundary $R_p$ is reached by the ghost surplus, $x_p$, the coefficients $F_i$ of
that boundary are computed (4.21). The optimal production rates, $u_p$, of the
ghost surplus are computed. Those coefficients are compared to the reference
surplus, $x_0$, using the criterion (4.40) and the ghost production rates, $u_p$.
If that criterion is not met, then the ghost surplus has moved a significant
distance from the original surplus, $x^k$, and Boundary $R_p$ is not one of the
boundaries $R_0$ that are being examined. In that case, the algorithm is complete,
and all attractive boundary constraints required by the set $R_0$ have been added
to the linear program (4.13).
The projected trajectory of the ghost surplus can also be terminated by the
projected trajectory criteria described in Section 4.8.2.
At each leg of the ghost trajectory which does not terminate the trajectory, the
attractiveness of the encountered boundary, $R_p$, is examined. If Boundary $R_p$ is
attractive, according to (4.31), then its constraint (4.35) is added to the linear
program (4.13). Otherwise, Boundary $R_p$ is unattractive (4.32) and the ghost
surplus is moved across Boundary $R_p$, eliminating Boundary $R_p$ from further
legs of the trajectory.
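The control flow of this step can be summarized in a short sketch. It is a simplification under assumptions: the real algorithm recomputes the projected trajectory after every leg, whereas here the boundaries are supplied in the order they are encountered, and the two predicates are hypothetical stand-ins for the proximity criterion (4.40) and the attractiveness criterion (4.31).

```python
def install_boundaries(encountered, is_near_reference, is_attractive):
    """Classify boundaries met by the ghost surplus, one at a time.

    encountered       -- boundaries in the order the ghost trajectory meets them
    is_near_reference -- hypothetical test standing in for criterion (4.40)
    is_attractive     -- hypothetical test standing in for criterion (4.31)

    Returns (constraints, crossed): boundaries whose constraint is added to
    the linear program, and boundaries the ghost surplus was moved across.
    """
    constraints, crossed = [], []
    for boundary in encountered:
        if not is_near_reference(boundary):
            break  # the ghost has left the set being examined: done
        if is_attractive(boundary):
            constraints.append(boundary)  # add its constraint (4.35)
        else:
            crossed.append(boundary)      # unattractive: move across it
    return constraints, crossed


# Hypothetical example with four boundaries; "d" lies outside the examined set.
constraints, crossed = install_boundaries(
    ["a", "b", "c", "d"],
    is_near_reference=lambda b: b != "d",
    is_attractive=lambda b: b in {"a", "c"})
```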
5. Calculate Final Rates Two results come from this final step. First, the production
rates, $u^k$, are computed and will be used until the next set of boundaries
is reached, the cell's capacity set, $\Omega(e^k m^k)$, changes, or the target production
rates, $u^{k-1}$, are changed. Second, the time until surplus, $x^k$, reaches the next
boundary using the new production rates, $u^k$, is computed.
The purpose of the previous steps is to add the appropriate constraints (4.35)
to the linear program (4.13) for each of the boundaries in $R_0$ which are attractive
(4.31). Any remaining boundaries are unattractive. The only copy of the
surplus which was moved at all in the previous steps was the ghost surplus,
$x_p$. As an unattractive boundary was encountered, the ghost surplus, $x_p$, was
moved across it, but the reference surplus, $x_0$, and the original surplus, $x^k$, were
not. Therefore, the surplus, $x^k$, still lies on the wrong side of the unattractive
boundaries in $R_0$.
If the surplus, $x^k$, lies on an unattractive boundary after all appropriate attractive
boundary constraints are installed, then the small amount of time, $\Delta t_p$,
required to move the surplus, $x^k$, across the unattractive boundary is computed
(4.40). That time is equal to the time to reach the boundary plus a very small
amount to account for the numerical precision of the computer.
$$\Delta t_p = t_1 - t_0 + \frac{\epsilon^k}{10} \qquad (4.47)$$
The surplus, $x^k$, is moved across those boundaries according to (4.39), where
$t_1 - t_0$ is replaced by $\Delta t_p$ and where the production rates, $u^k$, are those from
the unattractive side of the boundaries.
Once the surplus, $x^k$, is across all unattractive boundaries, the final production
rates, $u^k$, are computed according to (4.36) and the time interval, $t_2 - t_1$, to the
next set of boundaries is computed using (4.38).
The next rate calculation is scheduled $\Delta t$ time units in the future, where
$$\Delta t = t_2 - t_1 + \Delta t_p \qquad (4.48)$$
If the surplus is at the hedging point, or is falling away from the hedging point,
then $\Delta t$ is not defined because no boundaries will be reached given the current
target production rates and capacity set. This is represented by setting
$$\Delta t = \infty \qquad (4.49)$$
indicating that an infinite amount of time has to elapse before the next boundary
is reached.
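The scheduling arithmetic of this final step is small enough to show directly. The sketch below is illustrative only: the function names are assumptions, the margin is taken to be one tenth of the boundary tolerance, and a missing next boundary is represented by `None` as a convention chosen for this example.

```python
import math


def crossing_time(time_to_boundary, tolerance):
    """Time to move the surplus just past an unattractive boundary.

    Adds a small margin (assumed here to be tolerance / 10) so that the
    surplus lands beyond the boundary despite finite numerical precision.
    """
    return time_to_boundary + tolerance / 10.0


def next_rate_calculation(time_to_next_boundary, dt_cross):
    """Delay until the next scheduled rate calculation.

    time_to_next_boundary is None when no boundary will be reached, for
    example when the surplus sits at the hedging point.
    """
    if time_to_next_boundary is None:
        return math.inf
    return time_to_next_boundary + dt_cross


dt_cross = crossing_time(0.5, 2.5)
dt = next_rate_calculation(4.0, dt_cross)
never = next_rate_calculation(None, dt_cross)
```

Returning `math.inf` lets an event scheduler treat the "no boundary" case uniformly, since an infinite delay simply never becomes the earliest pending event.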
Chapter 5
Manufacturing Systems with
Limited Flexibility
5.1 Limited Flexibility in a Manufacturing System
The first four chapters of this thesis deal with manufacturing systems that are com-
pletely flexible. That is, each machine in any given system is able to switch between
any of its operations without any time or cost penalty. This chapter describes some of
the issues related to the hierarchical controller when there is a time or cost penalty as-
sociated with switching between two different operations on the same machine. This
chapter also describes some of the assumptions and restrictions in Hiercsim Version
4.0, which permits setup changes.¹
¹ Hiercsim Version 4.0 is in a preliminary state and has not yet been completely debugged and
tested.
Flexibility Issues Manufacturing systems are sometimes limited in flexibility by
the inability of machines to switch between operations without a time or cost penalty.
Such penalties are incurred because machines may require different fixturing or tooling
to perform a different operation. The changing of fixtures, tooling or any similar
equipment so that a machine may perform a different set of operations is referred to
as a setup change.
Setup changes considered in this chapter require much more time than operations.
Setups are therefore controlled in the hierarchy at a higher level than operations
(Section 3.5.1).²
Scheduling setup changes is a complex issue. The complexity is due in part to
the long term impact of decisions. Once a setup on a machine is chosen, the opera-
tions that the machine can perform are limited. Operations which require a different
machine setup must wait.
The more often a machine has its setup changed during a given interval of time,
the greater the variety of operations it can perform. However, the more often setups
are performed, the less production can occur and the more expensive the overall
production becomes. Therefore, there is a tradeoff between flexibility (the ability to
perform different operations) and production volume.
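The tradeoff can be made concrete with a small worked relation. It is an illustrative assumption, not a formula taken from this chapter: suppose a machine performs setup changes at an average frequency $f$, each lasting a time $T$ (both symbols are used here only for this example). The fraction of time left for production is then

```latex
\rho = 1 - fT, \qquad 0 \le fT \le 1
```

For instance, with $f = 2$ setup changes per shift and $T = 0.1$ shift per change, $\rho = 1 - 0.2 = 0.8$, so one fifth of the machine's time is spent on flexibility. Doubling the setup frequency doubles the variety of operations reachable in a shift but lowers $\rho$ to 0.6.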
In addition to the complexity on a single machine, the combination of setups
within a manufacturing system also poses a scheduling challenge. The setup of a
single machine group can affect the ability of the entire system to function if that
machine is unable to perform an operation required by other machines in the system.
This leads to either starvation or blockage if buffer sizes are small, or the accumulation
of large work-in-process inventory if buffer sizes are large (Section 2.8.2).
It has been observed in numerous manufacturing facilities that some setup changes
require a long time to complete whereas others require a short time. These differences
in setup change time lead to the application of the hierarchical framework for the
scheduling of the changes. Long duration setup changes occur infrequently, and are
controlled at high levels. Short duration setup changes occur frequently, and are
controlled at low levels.
² If setup changes require much less time than operations, it is possible to combine the setup time
with the operation time in the capacity set (3.12) and bypass the scheduling difficulties of setup
changes.
Goal of Hiercsim Version 4.0 The goal of Hiercsim Version 4.0 is primarily to
provide a research testbed to test different algorithms. Two areas are addressed in
this thesis.
The first is to develop a model of setup changes consistent with a pyramid hier-
archy with arbitrary control levels and multiple machines (Section 3.5.3). The model
must include notation which is flexible enough to describe setup states at multiple
control levels and allow for communication of setup information between cells at dif-
ferent levels and with different spans of control. An extension of the capacity set
(3.12) is required so that setup constraints may be incorporated into the control
strategy. In parallel to development required for the hierarchical controller, a model
of the factory with setup changes must be developed and implemented.
The second area is to develop some preliminary control strategies to demonstrate
the model. These control strategies are not intended to be optimal, but instead are
intended to provide a workable controller for later enhancement.
The goal of a hierarchical setup change scheduler is to find a balance between
the amount of time and resources allocated to flexibility (changing setups), and the
amount allocated to production at each control level in the hierarchy. In addition,
the controller must coordinate all the different machine groups contained within the
system. The algorithm developed and implemented for Hiercsim Version 4.0 is based
on the hedging point strategy of Section 3.7.2 and loosely based on the corridor
policy as described in Srivatsan and Gershwin (1990) and Caramanis, Sharifnia and
Gershwin (1991).
Organization of Chapters 5 and 6 Chapter 5 describes extensions which include
setup changes to the hierarchical model found in Chapters 2 and 3. A detailed
description of the definitions and notation used in the models implemented in Hiercsim
Version 4.0 can be found in Section 5.2. All major assumptions used in the creation of
the model are explicitly defined in Section 5.3. An extension of the capacity set (3.12)
is developed in Section 5.4. This development takes place in three stages. Extensions to
the short-term and long-term capacity sets (2.2) and (2.7) are made to accommodate
differences due to setups. Then the generalized capacity set defined by (3.12) is
extended to include setups at arbitrary levels (5.6). Issues related to hierarchical
control strategies are discussed in Section 5.5.
Chapter 6 describes the setup change model and control policies implemented
in Hiercsim Version 4.0. The design objectives and limitations of Hiercsim Version
4.0 are discussed in Section 6.1. An extension of the master linear program (from
Section 4.1.4) is described in Section 6.2. Section 6.3 describes how and when the
general limited capacity set (5.6) is changed due to simulation events. The bulk of
Chapter 6 describes the implementation of the preliminary coordination and setup
change control strategies implemented in Hiercsim Version 4.0. Section 6.9 describes
the factory model which implements the setup changes calculated by the hierarchical
controller.
5.2 Setup Notation and Fundamental Concepts
This section defines fundamental concepts and describes the notation used in the
implementation of setup changes in Hiercsim Version 4.0. Section 5.2.1 defines the
fundamental concepts used in the construction of the models implemented in Hierc-
sim Version 4.0. Section 5.2.2 outlines the set of notation used in the remainder of
Chapter 5 and all of Chapter 6. A brief definition of each quantity is made and a
reference to the section where it is first used is given.
5.2.1 Fundamental Concepts
Flexibility Flexibility is the number of different products that may be made by a
manufacturing system in a given time interval. The greater the variety of products
that can be made, the greater is the flexibility of the system.
Setup State A setup state of a machine is the representation of a condition which
permits a specific set of operations to be performed and prohibits all other operations.
Setup states may be changed after the expenditure of the time and resources required.
Once the setup state is changed, a different set of operations may be performed, and
all others outside of the new set are prohibited.
Setup Tree In general, a machine can produce a greater variety of products when
viewed from higher control levels because there is time available to perform long
duration setup changes as well as short duration setup changes (Gershwin, 1989).
High level setup states encompass a wide range of operations, whereas lower level
setup states encompass a subset of the higher level operations. The combination of
long and short setup states implies that not all setup changes should be controlled in
the same level of the hierarchy.
The concept of multiple control levels of setup changes is implemented in Hiercsim
Version 4.0 through a setup tree. The setup tree of a machine is a model of all possible
setup states and the operations performed in those setup states. There is one setup
tree for each different type of machine in the manufacturing system. The setup tree
of a machine is invariant throughout an entire simulation.
An important feature of the setup tree concept is the ability of a controller to
determine the aggregate setup state of a machine over a long time period, while
leaving the detailed setup state decisions to the lower level controllers which use
more detail in their decisions.
An illustrative example of a setup tree for a hypothetical Multi-Purpose Shop
Tool appears later on in this section and in Figure 5-1. The notation which describes
the setup tree is developed in Section 5.2.2. An example of a setup hierarchy appears
in Burman (1992).
The implementation in Hiercsim Version 4.0 includes two copies of the setup tree
for each different machine type in the manufacturing system. One setup tree contains
the hierarchical controller's perception of setup change times and operation times.
The other setup tree contains actual setup change times and operation times. This
implementation may be used to study the effects of corrupted or inaccurate data on
overall controller performance.
Level k Setup State A Level k setup state is a description of a possible setup state
of a machine such that all controllable events in that setup state occur at a much
higher frequency than the Level k characteristic frequency fk. It can be found in one
of the Level k nodes in the setup tree.
Level k - 1 Parent Setup State A Level k - 1 parent setup state is a setup state
which can be refined into one or more Level k setup states.
Level k Setup Class A Level k setup class is a group of Level k setup states.
All Level k setup states in a Level k setup class have a common Level k - 1 parent
setup state. The exception to this rule is the setup class which contains all Level
k setup states, regardless of the Level k - 1 parent setup state. The difference in
notation between these two types of setup classes is described in Section 5.2.2. The
Multi-Purpose Shop Tool described later in this section provides an example of this
definition.
Setup Change A setup change is a change in a machine's characteristics such that
it becomes capable of performing a different set of operations.
Setup Change Level The change level of a setup is the control level in the hier-
archy at which commands are issued to change into or out of that setup. The setup
change level is determined by some combination of the frequency of the setup change
and its duration. Hiercsim Version 4.0 requires that the user supply the change level
of each setup state in the system.
Sequence-Dependent Setup Change A sequence-dependent setup change is one
for which the time to change into a particular setup state depends on both the initial
setup state and the new setup state of the machine. In the example shown in
Figure 5-1,³ the Level k setups are sequence-dependent. The time to change into the jig saw
setup state is short if the machine is already set up to saw, but long if the machine
is set up to drill.
Hiercsim Version 4.0 has the capability to model sequence-dependent setup changes.
Sequence-dependent setup changes require the user to supply the setup change times
into each setup state from each of the other setup states. This information is stored
in a series of vectors. However, there are no control strategies implemented to determine
when to change into or out of sequence-dependent setup states. A future
modification of Hiercsim could be the addition of such a controller.
³ The manufacturing system shown in Figure 5-1 is described in the illustrative example.
Sequence-Independent Setup Change A sequence-independent setup change is
one for which the setup change time is independent of the initial setup state. In the
example shown in Figure 5-1, the Level k + 1 setups within either the Level k saw
setup class or the Level k drill setup class are sequence-independent. The time to
change into the jig saw setup state is short if the machine is set up to saw, regardless
of the initial saw blade. Likewise, the time to change into the 1 inch drill bit setup
state is short if the machine is set up to drill, regardless of the initial drill bit size.
The control algorithms implemented in Hiercsim Version 4.0 and discussed in this
thesis are for sequence-independent setup changes between setup states in a setup
class with a single parent setup state.
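The difference between the two kinds of setup change shows up in how the change time is stored: a single number per destination state for sequence-independent changes, and one number per originating state for sequence-dependent ones. The sketch below illustrates that distinction only; the table layout, state names, and times are hypothetical and are not the Hiercsim input format.

```python
def setup_change_time(times, initial, new):
    """Look up the time to change from setup state `initial` into `new`.

    times[new] is either a number (sequence-independent: the same from any
    initial state) or a dict keyed by initial state (sequence-dependent).
    """
    entry = times[new]
    if isinstance(entry, dict):
        return entry[initial]
    return entry


# Hypothetical times in hours, for illustration only.
independent = {"jig saw": 0.1, "hack saw": 0.1}
dependent = {"jig saw": {"hack saw": 0.1, "1 inch drill bit": 2.0}}

quick = setup_change_time(independent, "hack saw", "jig saw")
slow = setup_change_time(dependent, "1 inch drill bit", "jig saw")
```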
Illustrative Example Figure 5-1 shows the setup tree of a hypothetical machine,
the Multi-Purpose Shop Tool. This tool is capable of drilling different size holes and
sawing a variety of materials.
The two Level k - 1 setup states are "Sawing" and "Drilling". A major setup
change is required to switch between operations that require sawing and operations
that require drilling. That setup change uses up a significant amount of time and
resources, and is done infrequently.
Within the Level k - 1 setup state "Sawing" is the Level k setup class which
consists of the jig saw, the hack saw, and the high speed saw. Once the machine is in
the Level k - 1 setup state "Sawing", there are a number of types of saw blades that
may be used. Changing from one saw blade to another can be done quickly, requiring
only a minor setup change.
Likewise, within the Level k - 1 setup state "Drilling" is the Level k setup class
which consists of 1/4 inch, 1/2 inch, 3/4 inch, and 1 inch drill bits. When the machine
is in the Level k - 1 setup state "Drilling", there are a number of different drill bit
sizes to choose from. Changing from one drill bit size to another can also be done
quickly with a minor setup change. However, changing from one drill bit size to a
new blade type cannot be done without a major setup change.
Figure 5-1: Setup Tree for the Multi-Purpose Shop Tool (Level 1: Multi-Purpose Shop Tool;
Level k - 1: Sawing and Drilling, separated by a major setup change; Level k: blade types and
drill bit sizes; Level k + 1: complex shapes, metal bars, pine boards, and 1/4", 1/2", 3/4",
and 1" holes)
At the operation level, Level k + 1, once a blade type or drill bit size has been
chosen, only the operations corresponding to that choice may be performed until
a minor or major setup change is performed. For example, if the jig saw blade is
mounted on the Multi-Purpose Shop Tool, only complex shapes can be cut until
either a minor or major setup change is performed.
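The Multi-Purpose Shop Tool example can be written down as a small nested structure, which makes the major/minor distinction mechanical. This is a sketch for illustration, not the Hiercsim data structure; the pairing of the high speed saw with pine boards is inferred from the order of the labels in the figure, and the function names are assumptions.

```python
# Top-level keys are the parent setup states; the next level holds the blade
# and bit choices; the lists hold the operations each choice permits.
SETUP_TREE = {
    "Sawing": {
        "jig saw": ["complex shapes"],
        "hack saw": ["metal bars"],
        "high speed saw": ["pine boards"],  # pairing assumed from the figure
    },
    "Drilling": {
        "1/4 inch drill bit": ["1/4 inch holes"],
        "1/2 inch drill bit": ["1/2 inch holes"],
        "3/4 inch drill bit": ["3/4 inch holes"],
        "1 inch drill bit": ["1 inch holes"],
    },
}


def parent_of(state):
    """Return the parent setup state ("Sawing" or "Drilling") of a choice."""
    for parent, children in SETUP_TREE.items():
        if state in children:
            return parent
    raise KeyError(state)


def change_type(old, new):
    """Classify a setup change as 'none', 'minor', or 'major'."""
    if old == new:
        return "none"
    return "minor" if parent_of(old) == parent_of(new) else "major"


def allowed_operations(state):
    """Operations that may be performed until the next setup change."""
    return SETUP_TREE[parent_of(state)][state]
```

For example, `change_type("jig saw", "hack saw")` classifies a blade swap as minor, while `change_type("1 inch drill bit", "jig saw")` is major because the two states have different parents.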
Configuration The configuration is the method used in Hiercsim Version 4.0 to
describe the combination of setup states of a number of machines. Two configuration
variations are used in Hiercsim Version 4.0. The first is a means of describing the
combination of setup states of a group of machines which contains multiple copies
of the same machine type. The second variation is used to describe the combination
of setup states of a cell which may contain multiple machine groups, each with a
different type of machine.
Level k Machine Group Configuration Consider a machine group that has multiple
copies of the same type of machine. The Level k setup state of each machine can
be represented by a particular Level k node in the setup tree. The combination
of all Level k setup states of the machines in Machine Group i is the Level k
machine group configuration.
There is one configuration for each level of the hierarchy.
For example, consider a group of five Multi-Purpose Shop Tools, with the fol-
lowing setup states: one machine is set up to saw complex shapes, two are set
up to saw metal bars, and two are set up to drill 1 inch holes. The configuration
is displayed in Figure 5-2 and is interpreted as follows:
Level 1 Configuration The Level 1 configuration consists of five machines
because at that level, the machine group appears to be completely flexible.
Figure 5-2: Configuration of a Group of Five Multi-Purpose Shop Tools
Level k - 1 Configuration The Level k - 1 configuration consists of three
machines in the Level k - 1 setup state "Sawing" and two machines in the
Level k - 1 setup state "Drilling".
Level k Configuration The Level k configuration consists of one machine in
the Level k setup state "jig saw", two in the Level k setup state "hack
saw", and two in the Level k setup state "1 inch Drill Bit".
Level k + 1 Configuration The Level k + 1 configuration consists of one ma-
chine in the Level k + 1 setup state "complex shapes", two in the Level
k + 1 setup state "metal bars", and two in the Level k + 1 setup state "1
inch Holes".
Even though there are no choices in Level k + 1 setup states once the Level
k configuration is chosen, there is still a Level k + 1 configuration. This is
used in the factory algorithm which assigns setup changes to specific machines
(Section 6.9).
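The per-level configurations of the five-machine example amount to counting machines by the setup state they occupy at each depth of the tree. The following sketch reproduces the example with a hypothetical representation in which each machine is described by its path down the setup tree; it is not the representation used in Hiercsim.

```python
from collections import Counter

# One tuple per machine: (parent setup state, blade or bit, operation).
MACHINES = [
    ("Sawing", "jig saw", "complex shapes"),
    ("Sawing", "hack saw", "metal bars"),
    ("Sawing", "hack saw", "metal bars"),
    ("Drilling", "1 inch drill bit", "1 inch holes"),
    ("Drilling", "1 inch drill bit", "1 inch holes"),
]


def configuration(machines, depth):
    """Count machines per setup state at one depth of the setup tree.

    depth 0 is the parent level, 1 the blade/bit level, 2 the operation level.
    """
    return Counter(path[depth] for path in machines)


top_level = len(MACHINES)              # at Level 1 the group looks fully flexible
parents = configuration(MACHINES, 0)
choices = configuration(MACHINES, 1)
operations = configuration(MACHINES, 2)
```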
Level k Cell c Configuration The configuration of Level k Cell c is simply the list
of Level k configurations of each of the machine groups which are contained in
the cell.
When there are many machine groups in a cell, some machine group configurations
may leave those groups unable to process
parts which are needed by other groups in the cell. This lack of coordination
between machine groups in the same cell can lead to a significant loss of capacity
and is addressed in Hiercsim Version 4.0 through the concept of a configuration
catalog. Each entry in the configuration catalog of a cell contains a list of
compatible machine group configurations. When the cell controller chooses to
change setup states of its machines, the final cell configuration is limited to be
among its catalog entries. The configuration catalog is described in detail in
Section 6.5.
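One way to picture the catalog is as a list of permitted cell configurations against which a proposed configuration is checked. The sketch below rests on assumptions made for illustration: a cell configuration is modeled as a mapping from machine group name to a count of machines per setup state, and the group names and catalog entries are invented. The actual catalog format is the one described in Section 6.5.

```python
def freeze(cell_configuration):
    """Turn {group: {state: count}} into a hashable, order-independent key."""
    return frozenset(
        (group, frozenset(config.items()))
        for group, config in cell_configuration.items()
    )


def is_in_catalog(cell_configuration, catalog):
    """True if the proposed cell configuration matches some catalog entry."""
    return freeze(cell_configuration) in {freeze(entry) for entry in catalog}


# Hypothetical catalog for a cell with two machine groups.
CATALOG = [
    {"shop tools": {"jig saw": 2}, "finishers": {"sanding": 1}},
    {"shop tools": {"1 inch drill bit": 2}, "finishers": {"deburring": 1}},
]

proposed_ok = {"finishers": {"sanding": 1}, "shop tools": {"jig saw": 2}}
proposed_bad = {"shop tools": {"jig saw": 2}, "finishers": {"deburring": 1}}
```

Here `proposed_bad` pairs sawing with a finishing setup that, in this invented catalog, only appears alongside drilling, so a controller restricted to catalog entries would not select it.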
Limited Flexibility Capacity Set The limited flexibility capacity set is the set
of possible frequencies of controllable events, given the current setup state of the
machine. An illustrative example is presented here, while the full development of the
limited flexibility capacity set is in Section 5.4.
Consider the Multi-Purpose Shop Tool that is shown in Figure 5-1. That machine is
capable of both drilling and sawing, but cannot hold both a drill bit and a saw blade
at the same time. In fact, it is only able to hold one type of drill bit or one type of
saw blade at a time.
While the machine is ready for drilling a specific size hole, it is incapable of drilling
other sizes or sawing until the drilling stops and time is used to replace the bit with
either a different bit or a saw blade. Therefore, the Level k limited capacity set of
the machine while it is set up for drilling a specific size hole consists of a non-zero
constraint on the rate of the operation which requires the current drill bit, while
all other controllable rates are forced to be zero. Likewise, the Level k limited
capacity set of the machine while it is set up for sawing a particular type of shape
or material consists of a non-zero constraint on the rate of the operation which
requires the saw blade, while all other controllable rates are forced to be zero. The
Level k capacity set (3.12) is no longer sufficient because the machine requires seven
constraints on its rates, one for each of the Level k setup states.
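If the Multi-Purpose Shop Tool is set up to saw complex shapes, the Level k
limited capacity set is
$$\Omega^k(\text{jig}) = \left\{ \begin{array}{l} 0 \le \tau_{jig}\, u^k_{jig} \le 1 \\ u^k_{hack} = 0 \\ u^k_{speed} = 0 \\ u^k_{0.25} = 0 \\ u^k_{0.50} = 0 \\ u^k_{0.75} = 0 \\ u^k_{1.00} = 0 \end{array} \right. \qquad (5.1)$$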
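If the Multi-Purpose Shop Tool is set up to drill quarter inch holes, the Level k
limited capacity set is
$$\Omega^k(\text{1/4 inch drill}) = \left\{ \begin{array}{l} u^k_{jig} = 0 \\ u^k_{hack} = 0 \\ u^k_{speed} = 0 \\ 0 \le \tau_{0.25}\, u^k_{0.25} \le 1 \\ u^k_{0.50} = 0 \\ u^k_{0.75} = 0 \\ u^k_{1.00} = 0 \end{array} \right. \qquad (5.2)$$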
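The two capacity sets above follow one pattern: the rate of the operation matching the current setup is bounded, and every other rate is pinned to zero. The sketch below generates such bounds for any setup state. It assumes the non-zero constraint has the form "operation time times rate is at most 1"; the operation labels and times are hypothetical, and the function is not part of Hiercsim.

```python
OPERATIONS = ["jig", "hack", "speed", "0.25", "0.50", "0.75", "1.00"]


def limited_capacity_set(current_setup, operation_time):
    """Return {operation: (lower, upper)} bounds on each production rate.

    Only the operation matching the current setup may run; its rate u is
    bounded by 0 <= operation_time * u <= 1. All other rates are forced to 0.
    """
    bounds = {}
    for op in OPERATIONS:
        if op == current_setup:
            bounds[op] = (0.0, 1.0 / operation_time[op])
        else:
            bounds[op] = (0.0, 0.0)
    return bounds


# Hypothetical operation times: every operation takes half a time unit.
TIMES = {op: 0.5 for op in OPERATIONS}
sawing_bounds = limited_capacity_set("jig", TIMES)
drilling_bounds = limited_capacity_set("0.25", TIMES)
```

Expressing the set as per-rate bounds keeps one row per setup state, which matches the observation in the text that seven constraints are needed, one for each Level k setup state.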
5.2.2 Notation
The notation that is developed to describe setup changes and their control algorithms
in Chapters 5 and 6 is complex. This section is intended to provide a ready reference
for the meaning of the different symbols. More detailed explanations of the models
which use this notation can be found in the referenced sections.
Cell Names The following names for cells will be used in the development of the
setup change control algorithms in Chapters 5 and 6. A definition of a cell can be
found in Section 3.6.
Level k Cell c This is the prototype Level k cell for which the algorithms are de-
veloped.
Level k Cell c' This is a Level k cell which is downstream of Level k Cell c and
receives raw material from Level k Cell c.
Level k - 1 Cell C This is the Level k - 1 cell which contains both Level k Cell c
and Level k Cell c'.
Level k + 1 Cell c* This is one of the Level k + 1 cells contained in Level k Cell c.
Setup Tree Notation The existence of major and minor setup changes is a part
of the physical description of a type of machine and is invariant over the course
of a simulation. Since the hierarchy is driven by the frequency clusters of events
(Section 2.5), the setup tree, which links the major and minor setup changes to
specific control levels, is also part of the machine description and is invariant over the
course of the simulation.
Setups in Hiercsim Version 4.0 are modeled in the form of a setup tree. A representative
setup tree appears in Figure 5-3. An explanation of the symbols which
appear in that figure follows:
$S_i^k$ The set of all possible Level k setup states on Machine Group i.
$\sigma_i^k$ A particular instance of a Level k setup state on Machine Group i.
$S_i^k(\sigma_i^{k-1})$ The set of all Level k setup states on Machine Group i which are aggregated
into Level k - 1 setup state $\sigma_i^{k-1}$.
0 This is the null setup state and is used to represent the setup state of machines
which are undergoing a setup change.
$T^k(S_i^k)$ The number of distinct Level k setup states in the set $S_i^k$. In Figure 5-3,
$T^k(S_i^k) = 7$.
$T(S_i^k)$ The number of distinct Level k + 1 and lower setup states in the set $S_i^k$. In
Figure 5-3, $T(S_i^k) = 16$.
Controllable Events⁴ The following rate variables represent rates of controllable
events, as seen by a Level k observer. The two types of controllable events in Hiercsim
Version 4.0 are operations and setup changes. This notation is an extension of the
notation developed for the Level k capacity set (3.12).
These quantities are used in the development of the capacity set used in Hiercsim
Version 4.0 which accounts for the effect of setup changes. That development occurs
in Section 5.4.
⁴ This notation can be extended to include multiple Level k setup states by substituting the Level
k setup class $S_i^k$ for the Level k setup state $\sigma_i^k$. The definition of the quantity is extended to all
Level k setup states in the Level k setup class $S_i^k$.
Figure 5-3: Example Setup Tree with Notation for Machine Group i
u A Level k production rate variable set by the controller of Level k Cell c for a
Process j.
P (a,) The set of indices corresponding to the Level k production rate variables for
operations which can be performed by a machine in Machine Group i that is
set up in Level k setup state ao,.
j E Pj(or) A particular index of the Level k production rate variable for an operation
which can be performed by a machine in Machine Group i that is set up in Level
k setup state or.
f, The Level k setup change rate variable which represents the frequency with which
a Level k+1 setup change, m, is performed. That Level k+1 setup is aggregated
into the Level k setup state or. This rate is specified by the controller of Level
k Cell c and is used as a target rate by Level k + 1 Cell c*.
$T_{im}$ The time to change into Level k+1 setup state m. If this setup change is sequence-
independent, the value of this time is a scalar. If this setup change is sequence-
dependent, then the setup change time is a vector with $T^{k+1}(S_i^{k+1}(\sigma_i^k)) - 1$
elements. This corresponds to one element for each of the possible originating
Level k + 1 setup states in the set $S_i^{k+1}(\sigma_i^k)$. (The originating setup state is the
setup state out of which the machine is being changed.) Hiercsim Version 4.0
incorporates only sequence-independent setup changes.
$M^{k+1}(\sigma_i^k)$ Set of indices corresponding to Level k setup change rate variables used
by the controller of Level k Cell c for the Level k + 1 setup states that are
aggregated into the Level k setup state $\sigma_i^k$.
$m \in M^{k+1}(\sigma_i^k)$ A particular index of a setup change rate variable in the set $M^{k+1}(\sigma_i^k)$.
$M^k(\sigma_i^k)$ The index which corresponds to the Level k setup state $\sigma_i^k$ in Level k - 1
Cell C.
$M(\sigma_i^k)$ Set of indices corresponding to Level k setup change rate variables for all
Level k + 1 and lower setup states which are aggregated into the Level k setup
state $\sigma_i^k$.
Machine Availability Notation This notation is an extension of the machine
availability notation of Section 2.4. The extension takes into account the limited
flexibility of a machine group with setups. This notation is used in the development
of the capacity set which accounts for setups in Section 5.4.
$n_i$ Total number of machines in Machine Group i.
$n_i^k(\sigma_i^k)$ Number of machines in Machine Group i which are set up in setup state $\sigma_i^k$,
as seen by a Level k observer.
Setup Control Algorithm Notation This notation describes the state of setups
in a cell at any given time using the concept of the setup tree. These quantities are
used by the controllers of cells in the hierarchy to determine the best combination of
setups given the current state of the manufacturing system (Section 6.6.4). A detailed
description of these quantities is given in Section 5.2.1.
$C_c^k$ The configuration of Level k Cell c. This describes the setup states of each of the
machines in each of the machine groups that are contained in Level k Cell c.
$C_i^k$ The current Level k configuration of Machine Group i in Level k Cell c.
$C_i^k(\sigma_i^k)$ The number of machines that are set up in Level k setup state $\sigma_i^k$ when the
Level k configuration of Machine Group i is $C_i^k$.
$C_i^{k\prime}$ A new or proposed Level k configuration of Machine Group i in Level k Cell c.
$\Delta C_i^k$ The net Level k change in the configuration of Machine Group i required to
transform its configuration from $C_i^k$ to $C_i^{k\prime}$.
$\Delta C_i^k(\sigma_i^k)$ The net Level k change in the number of machines in Level k setup state
$\sigma_i^k$ required to transform the configuration of Machine Group i from $C_i^k$ to $C_i^{k\prime}$.
$C_i^k(0)$ The number of machines in Machine Group i which are undergoing a Level k
or higher setup change.
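The configuration quantities above can be illustrated with a minimal sketch. It assumes, purely for illustration, that a machine group configuration is stored as a dictionary mapping each Level k setup state to a machine count, with the key 0 reserved for machines undergoing a setup change. The function name `net_change` and the setup state names are hypothetical and are not taken from Hiercsim.

```python
def net_change(current, proposed):
    """Return the net change per setup state: proposed count minus current count."""
    states = set(current) | set(proposed)
    return {s: proposed.get(s, 0) - current.get(s, 0) for s in states}


# Hypothetical machine group with four machines and two setup states.
current = {"saw": 3, "drill": 1, 0: 0}    # key 0: machines changing setup
proposed = {"saw": 1, "drill": 3, 0: 0}

delta = net_change(current, proposed)
# Two machines must leave "saw" and two must enter "drill". Because the
# number of machines in the group is fixed, the net changes sum to zero.
```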
Setup Change Timing Notation There is a delay from the time a setup change is
initiated to the time that the change is completed. In the algorithms used in Hiercsim
Version 4.0 to calculate a new configuration for Level k Cell c, three different sets of
target production rates must be available to the controller of Level k Cell c. The use
of these target production rates is described in Section 6.4.2.
The notation presented here also describes the meaning of the various times used
in Chapter 6.
t The current simulation time.
$t + \Delta t$ The time in the future at which a proposed setup change is projected to be
complete.
t + t' The time at which a setup change actually begins, even though it is scheduled
to begin at time t.
t + At, The next time the setup change calculation will be performed.
$u^{k-1}(t)$ and $f^{k-1}(t)$ Level k - 1 target rates used by the controller of Level k Cell c
for production rate variables $u^k$ and setup change rate variables $f^k$, which are
calculated by the controller of Level k - 1 Cell C under the current Level k - 1
configuration $C_C^{k-1}$.
$u^{k-1}(t + \Delta t)$ and $f^{k-1}(t + \Delta t)$ Level k - 1 target rates used by the controller of
Level k Cell c for production rate variables $u^k$ and setup change rate variables
$f^k$, which are calculated by the controller of Level k - 1 Cell C under the proposed
Level k - 1 configuration $C_C^{k-1\prime}$.
$u^{k-1}(t + t')$ and $f^{k-1}(t + t')$ Level k - 1 target rates used by the controller of Level k
Cell c for production rate variables $u^k$ and setup change rate variables $f^k$, which
are calculated by the controller of Level k - 1 Cell C during the transition from
the current Level k - 1 configuration $C_C^{k-1}$ to the new Level k - 1 configuration
$C_C^{k-1\prime}$. These rates account for those machines which are undergoing a setup
change, and therefore belong to the set $n_i^{k-1}(0)$.
Cell Controller Setup Change Policy Notation When the controller of Level k
Cell c converts Level k - 1 target setup change frequencies into discrete setup change
allocations, quantities are required to keep track of how many setup changes have
been allocated, how many have been used, and how many are needed. The following
quantities perform those functions. Their use is described in Sections 6.6.2 and 6.6.3.
$N_i^k(\sigma_i^k, t)$ Cumulative number of setup changes allocated to setup state $\sigma_i^k$ up to time
t.
$M_i^k(\sigma_i^k, t)$ Cumulative number of setup changes into setup state $\sigma_i^k$ actually performed
up to time t.
$\Delta N_i^k(\sigma_i^k, t)$ Number of available setup changes into setup state $\sigma_i^k$ at time t. This
is the difference between the cumulative number of setup changes allocated
$N_i^k(\sigma_i^k, t)$ and the cumulative number performed $M_i^k(\sigma_i^k, t)$.
$D_i^k(\sigma_i^k, t)$ Number of Level k setup changes into setup state $\sigma_i^k$ which are required to
transform the configuration of Machine Group i from $C_i^k$ to $C_i^{k\prime}$, but have not
been allocated by the controller. This number is always less than or equal to
zero.
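The bookkeeping of allocated, performed, and available setup changes can be sketched as follows. The rule used here to convert a target frequency into discrete allocations (the integer part of frequency times elapsed time) is an assumption made for illustration only; the policy actually used is described in Sections 6.6.2 and 6.6.3. The function names are hypothetical.

```python
import math


def allocated_changes(target_frequency, t):
    """Cumulative setup changes allocated up to time t.

    Assumption: allocations accrue as a staircase, floor(frequency * t).
    """
    return math.floor(target_frequency * t)


def available_changes(allocated, performed):
    """Available changes: cumulative allocated minus cumulative performed."""
    return allocated - performed


# Target frequency of 0.5 changes per unit time, observed at t = 7.
allocated = allocated_changes(target_frequency=0.5, t=7.0)
available = available_changes(allocated, performed=2)
```

With these numbers, three changes have been allocated and two performed, so one setup change into the state remains available.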
Machine Group Setup Assignment Notation Due to the separation of the
hierarchical controller from the factory controller, when all cell controllers have cal-
culated a new configuration for Machine Group i, the controller of Machine Group i
(Section 4.1.3) has to initiate the necessary setup changes.
The controller of Machine Group i has to account for all configuration changes in
the hierarchy, up to the control level where the change originated. This requires the
set of variables listed below. The details of how these variables are used can be found
in Section 6.9.
$\Delta n_i^k(\sigma_i^k)$ The net number of setup change assignments into or out of Level k setup
state $\sigma_i^k$ that have been made in the machine group setup assignment algorithm.
$\Delta m_i^k(\sigma_i^k)$ The remaining number of Level k setup changes to be assigned into or
out of Level k setup state $\sigma_i^k$ to complete the transformation of the Level k
configuration of Machine Group i from $C_i^k$ to $C_i^{k\prime}$.
$\Delta m^+(S_i^k)$ The total remaining number of setup assignments into setup class $S_i^k$ to
complete the transformation of the Level k configuration of Machine Group i
from $C_i^k$ to $C_i^{k\prime}$.
$\Delta m^-(S_i^k)$ The total remaining number of setup assignments out of setup class $S_i^k$ to
complete the transformation of the Level k configuration of Machine Group i
from $C_i^k$ to $C_i^{k\prime}$.
Amot(S i ) The total remaining number of setup assignments which can be accom-
plished by taking one machine out of a setup state in S and giving it to another
setup state in S k to complete the transformation of the Level k configuration
of Machine Group i from Ck to Ck'.
Amce,,,(SP) The total remaining number of setup assignments that are expected to
involve a setup state from outside setup class Sik to complete the transformation
of the Level k configuration of Machine Group i from C'. to Ck'.
$\sigma_{gain}$ and $\sigma_{loss}$ Two Level k setup states in Level k setup class $S_i^k(\sigma_i^{k-1})$ of Machine
Group i. One of the machines in Machine Group i will change its Level k setup
state from $\sigma_{loss}$ to Level k setup state $\sigma_{gain}$ during the current configuration
change.
5.3 Assumptions about Flexibility
There are two classes of assumptions made about flexibility in this thesis: those
which reflect the current state of the theory; and those which reflect the current
state of the simulation implementation. The theoretical assumptions represent the
current restrictions placed on the types of limited flexibility that can be handled. As
more work is invested in the issue of flexibility, these restrictions may be eased. The
simulation assumptions represent the current status of Hiercsim Version 4.0. In some
cases, the simulation assumptions anticipate theoretical advances, and in others, they
lag behind the current state of the theory.
5.3.1 Assumptions Required by the Current Theory
Sequence Dependent vs. Sequence Independent Setup Changes The issue
of sequence-dependent setups is bypassed by dividing the control problem among
different levels in the hierarchy. All setup changes within the Level k setup class
$S_i^k(\sigma_i^{k-1})$ are assumed to be sequence-independent. Setup changes between two Level
k setup classes are specified at the higher control level where the two setup classes
are aggregated into a single class. Therefore, hierarchical setup changes are a special
case of sequence-dependent setup changes.
For example, the Multi-Purpose Shop Tool is controlled at four levels in the
hierarchy. Capacity is allocated for setup changes between sawing and drilling at the
Level 1 and for all setup changes within the Level k setup classes $S^k(\text{saw})$ and $S^k(\text{drill})$.
Level k - 1 setup changes are chosen and performed at Level k - 1. Capacity for
Level k setup changes within the Level k setup class $S^k(\text{saw})$ and within the Level
k setup class $S^k(\text{drill})$ is allocated at Level k - 1 according to the appropriate Level
k - 1 configuration. The setup changes within a Level k setup class are performed at
Level k. This technique reduces the complexity of the setup change algorithms.
Failures and Setups Failure modes are assumed to be independent of the setup
state of a machine. There are no failures which only occur in some setup states, but
not others. This restriction simplifies the accounting of lost capacity. For example,
the capacity set which includes failures that vary with setup state must account for
the proportion of time a machine will be in any given setup state over a given interval.
By eliminating the dependence on setup state, the formulas of Section 2.7 can be used
in the estimation of capacity lost due to failures.
It is assumed that failures do not occur during a change of setup.
5.3.2 Implementation Assumptions
The two implementations of the hierarchical control algorithms (Hiercsim Versions
3.5 and 4.0) restrict the type of manufacturing systems that may be modeled. These
restrictions are due to the current theory's inability to accommodate a wider range
of phenomena, and a lack of programming resources to implement and test more
ambitious versions of the software.
This section describes the assumptions about the relative frequencies of failures
and setups in the two current versions of Hiercsim. Version 3.5 allows failures and no
setup changes while Version 4.0 allows setup changes but no failures. Both versions
contain the same setup tree structure, described in Section 5.2, so both can model
systems with limited flexibility. However, neither version of Hiercsim permits the
specification of initial conditions. This prevents the user from manually transferring
information from one simulation to another as failures and setups occur.
Hiercsim Version 3.5 Hiercsim Version 3.5 allows machines to be set up indepen-
dently of each other, but once the setup state of a machine is chosen, it cannot be
changed during the simulation. Each machine has an initial setup state among all
possible states for that machine type. Within any given setup state of a machine,
there may be many different operations. This represents the capability of performing
minor tooling adjustments, whose time has been incorporated into the operation time
by the user in the creation of the input data file (Section 4.1.2).
This implementation allows a machine group (which is a set of identical machines)
to be configured with each machine in a different setup state, thus modeling a man-
ufacturing system with limited flexibility over a short time period compared to the
interval between setup changes. During the course of the simulation, machines are al-
lowed to fail, and the system is able to respond to the failures in the manner described
in Section 4.5.3.
Hiercsim Version 4.0 Hiercsim Version 4.0 reverses the assumptions about reli-
ability and flexibility from Version 3.5. In Version 4.0, setup changes are possible,
but machines cannot fail during the course of a simulation. The time to change from
one setup to another is assumed to be long compared to operation times. Very high
frequency minor failures can be represented by adjusting the operation times in the
input data file (Section 4.1.2).
Each machine may change its initial setup state into any other state based on the
algorithms presented in this chapter and in Chapter 6. The time to change from one
setup to another is assumed to be either deterministic or an exponentially distributed
random variable.
Operations and Setups It is assumed that operations in any given setup state
are unique to that setup state and cannot be performed in any other setup state.
A situation where a machine is capable of performing the same process step in two
different setup states is similar to having two different types of machines able to
perform the same operation. The controller which chooses production rates would be
forced to choose the proportion of production that will be performed in each setup
state. This problem is harder than the multiple routing problem, which has not yet
been implemented in the hierarchical controller.
5.4 Limited Flexibility Capacity Set
This section develops the capacity set used in Hiercsim Version 4.0 for manufacturing
systems with limited flexibility. The capacity set for a system with limited flexibility
depends on the time scale with which it is viewed. This section develops the short-
term, long-term, and generalized time frame capacity sets. Also described in this
section are constraints which are unique to setup change frequency variables.
5.4.1 Short-Term Limited Flexibility Capacity Set
The capacity set for a flexible machine is presented in Section 2.4. In that repre-
sentation of capacity, each machine group has a single constraint row in the Level k
Cell c capacity set (3.12) which is used in the hedging point strategy linear program
(3.22). Machines within those machine groups are flexible: switching from
one operation to any other requires an insignificant amount of time.
However, when machines are limited in their flexibility, those capacity constraints
are insufficient, since a machine may only make a limited subset of parts at any given
time instant. Let Level k be a level in the hierarchy whose characteristic frequency
of events, fk, is much higher than the frequency of any setup change. A machine in
Machine Group i in Level k Cell c which is in one setup state appears as if it were
a completely different machine compared to an identical machine in a different setup
state, even though the two machines belong to the same machine group.
For that reason, Machine Group i (whose machines have $T^k(S_i^k)$ different possible
setup states) has $T^k(S_i^k)$ different capacity constraint rows in the Level k Cell c
controller's master LP (Section 4.1.4).^5
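The Level k limited flexibility capacity set for Level k Cell c is
$$
\begin{aligned}
\sum_{j \in P_i^k(\sigma_i^k)} \tau_{ij}\, u_{cj}^k &\le C_i^k(\sigma_i^k), \quad \sigma_i^k \in S_i^k, \ \forall i \\
u_{cj}^k &\ge 0, \quad j \in P_i^k, \ \forall i
\end{aligned}
\tag{5.3}
$$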
Note that the limited flexibility capacity set for Level k Cell c has multiple constraint
rows for each Machine Group i, one for each element in the machine group
configuration vector $C_i^k$, and that the number of machine-related constraint rows for Level
k Cell c is equal to the total number of elements contained in the Level k
configuration $C_c^k$. The element of each machine group configuration, $C_i^k(0)$, which contains the
number of machines undergoing setup changes, is not included in the master linear
program because machines which are setting up are unavailable for other controllable
activities.
Note also that only those elements of the Level k configuration Ck which are
non-zero are transferred from the master LP to the work LP (Section 4.1.4). Those rates,
which are constrained by the elements of the Level k configuration whose values are
zero, are known to be zero a priori.
5 Note that the method of constructing the work LP from the master LP does not change with
the addition of setup-related constraints.
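The structure of the short-term capacity set (5.3) can be illustrated with a small feasibility check: one constraint row per setup state, with the number of machines in that state on the right-hand side. This is a sketch under stated assumptions and is not the Hiercsim implementation; the function name, the dictionary layout, and the numerical values are all hypothetical.

```python
def short_term_feasible(op_time, rate, ops_by_state, config, tol=1e-9):
    """Check one capacity row per setup state, in the form of (5.3).

    op_time[j]      -- operation time of production rate variable j
    rate[j]         -- proposed production rate for j
    ops_by_state[s] -- indices of operations that can be done in setup state s
    config[s]       -- machines currently in setup state s (key 0: setting up)
    """
    if any(r < 0 for r in rate.values()):
        return False
    for state, machines in config.items():
        if state == 0:
            continue  # machines that are setting up have no constraint row
        load = sum(op_time[j] * rate[j] for j in ops_by_state[state])
        if load > machines + tol:
            return False
    return True


op_time = {"a": 0.5, "b": 0.25, "c": 1.0}
ops_by_state = {"saw": ["a", "b"], "drill": ["c"]}
config = {"saw": 2, "drill": 0, 0: 1}

ok = short_term_feasible(op_time, {"a": 2.0, "b": 4.0, "c": 0.0}, ops_by_state, config)
bad = short_term_feasible(op_time, {"a": 2.0, "b": 4.0, "c": 0.1}, ops_by_state, config)
```

In the second call the rate for operation "c" is positive although no machine is set up for drilling, so the check fails. This mirrors the observation that rates constrained by zero elements of the configuration are zero a priori.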
5.4.2 Long-Term Limited Flexibility Capacity Set
Consider a manufacturing system which includes Machine Group i that is composed
of machines with limited flexibility. Machine Group i is part of Level k - 1 Cell C
and of Level k Cell c. The controller of Level k - 1 Cell C provides target frequencies
of controllable events to Level k Cell c.
Suppose that for all machine groups in this system, setup changes are performed
at Level k in the hierarchy, and that all setup changes are sequence-independent.
At Level k - 1 in the hierarchy, all setup changes on machines in Machine Group i
can be performed much more frequently than the characteristic frequency of Level
k - 1 events, $f^{k-1}$. Setup change events, as seen by a Level k - 1 observer, can be
represented by the frequency of setup change events, $f_{Cm}^{k-1}$, $m \in M^k(S_i^{k-1})$. In this
case, there is one setup change rate variable for each Level k setup state $\sigma_i^k \in S_i^k$.
From the point of view of a Level k - 1 observer, the machines in Machine Group i
are completely flexible for the operations that they are capable of performing because
any setup change required to switch between two operations takes much less time
than the Level k - 1 characteristic time scale, $1/f^{k-1}$. Therefore, the flexible capacity
constraints (2.2) may be used with modifications to allow for setup change rates. The
Level k - 1 machine group capacity constraint with ni identical machines is written
as a single constraint:
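$$
\begin{aligned}
\sum_{j \in P_i} \tau_{ij}\, u_{Cj}^{k-1} + \sum_{m \in M^k(S_i^{k-1})} T_{im}\, f_{Cm}^{k-1} &\le n_i \\
u_{Cj}^{k-1} &\ge 0, \quad j \in P_i \\
f_{Cm}^{k-1} &\ge 0, \quad m \in M^k(S_i^{k-1})
\end{aligned}
\tag{5.4}
$$
(There are additional constraints on the setup change frequencies, $f_{Cm}^{k-1}$, which are
not included in (5.4). Those constraints are introduced in Section 5.4.4.)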
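As a minimal sketch of how the single long-term constraint splits capacity between production and setup changes, the following function evaluates the two summations of (5.4) separately. The function name, argument names, and numbers are hypothetical and chosen only for illustration.

```python
def long_term_capacity_use(op_time, rate, setup_time, setup_freq, n_machines, tol=1e-9):
    """Evaluate the two summations of the long-term capacity constraint.

    Returns (production load, setup load, feasible), where feasible means
    production load + setup load <= number of machines in the group.
    """
    production = sum(op_time[j] * rate[j] for j in rate)
    setups = sum(setup_time[m] * setup_freq[m] for m in setup_freq)
    return production, setups, production + setups <= n_machines + tol


production, setups, feasible = long_term_capacity_use(
    op_time={"a": 0.5, "b": 0.25},
    rate={"a": 2.0, "b": 2.0},
    setup_time={"saw": 2.0, "drill": 2.0},
    setup_freq={"saw": 0.1, "drill": 0.1},
    n_machines=2,
)
```

Here production occupies 1.5 machines' worth of capacity and setup changes occupy 0.4, leaving 0.1 of the two machines idle.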
Relationship Between Flexibility and Production Capacity There is a trade-
off between flexibility and maximum production rate. The first summation in (5.4)
represents the fraction of capacity of Machine Group i devoted to actual production.
The second summation represents the fraction of capacity devoted to changing setups.
If possible, the capacity set allocated to production over the long term must
contain the long term target demand rate for each part type. In this
implementation, the remaining capacity is used by failures, setup changes, and idle time.
The split between setup changes and idle time depends on the setup change policy
(Section 6.6.2).
The more production and failures there are in the simulation, the less capacity
there is for setup changes. This implies that the total number of possible setup
changes is less when demand is high and/or machines are unreliable. Therefore,
in those situations, the manufacturing system must produce larger batches to meet
demand. Larger batches generate larger work-in-process, with all the undesirable
effects of larger inventories.
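The link between demand, setup capacity, and batch size can be made concrete with a small calculation. The sketch assumes, for illustration only, a machine group with no failures and no idle time, a single setup change time shared by all part types, and part types that are produced in rotation so that each receives an equal share of the setup changes. The function name and values are hypothetical.

```python
def min_batch_sizes(demand, op_time, setup_time, n_machines=1):
    """Smallest average batch size per part type that still meets demand.

    Assumptions: all capacity not used for production is spent on setup
    changes, and setup changes are shared equally among part types.
    """
    load = sum(op_time[j] * demand[j] for j in demand)
    spare = n_machines - load
    if spare <= 0:
        raise ValueError("demand uses all capacity; none is left for setup changes")
    total_setup_freq = spare / setup_time          # setup changes per unit time
    per_type_freq = total_setup_freq / len(demand)
    return {j: demand[j] / per_type_freq for j in demand}


op_time = {"j1": 0.1, "j2": 0.1}
low = min_batch_sizes({"j1": 2.0, "j2": 2.0}, op_time, setup_time=2.0)
high = min_batch_sizes({"j1": 4.0, "j2": 4.0}, op_time, setup_time=2.0)
```

Doubling demand in this example raises the production load from 0.4 to 0.8 of the machine, cuts the capacity left for setup changes from 0.6 to 0.2, and increases the required batch size of each part type several times over.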
When Machine Group i contains a large number of machines which are limited
in flexibility, most machines can be dedicated to particular part types. If the target
demand rate mix does not change very often, the large machine group's need for
flexibility is reduced, leaving almost all of the capacity of the machine group for
production.
However, in many cases, only large companies can afford the capital expense to
accumulate a large number of identical machines. The scheduling problem for a
system with machines of limited flexibility is to find a balance between the fraction of
available capacity devoted to production and the fraction devoted to setup changes.
5.4.3 Generalized Limited Flexibility Capacity Set
Machine Flexibility and Control Level It has been observed in numerous man-
ufacturing facilities that some setup changes require a short period of time, while
others require a long time to complete (Gershwin, 1989). These differences in setup
change time lead to the application of the hierarchical framework for the scheduling
of the changes. Long duration setup changes occur infrequently, and are controlled
at high levels. Short duration setup changes occur frequently, and are controlled at
low levels.
Seen from the long time scale of a high control level, a machine of limited flexibility
appears in some respects to be flexible, although some capacity must be devoted
to setup changes instead of production (Section 5.4.2). The capacity set for the
machine is a single constraint (5.4), with additional constraints on the setup change
frequencies, described in Section 5.4.4. As the hierarchy is descended and the time
scale decreases, the same machine appears less and less flexible. At the control level
where setup changes are no longer possible, the capacity set for the machine becomes
the set of constraints (5.3).
At an intermediate Level k, the limits on the flexibility of a machine in Machine
Group i are a combination of the short term limits on flexibility (5.3) and the long
term limits on flexibility (5.4). The machine appears limited in flexibility because the
Level k - 1 and higher setup states are already chosen by the appropriate Level k - 1
and higher controllers.
However, within those Level k - 1 and higher setup states, there may be Level k
or lower level setup changes which allow for the production of different, but related,
parts. These minor setup changes take up a fraction of the machine's capacity, and
are accounted for in the Level k capacity set in the same manner as setup changes
are accounted for in the long-term capacity set (5.4) for higher level controllers.
Capacity must also be allocated at the highest control level for these short duration
setup changes.
Below a certain level, no further changes in setup state can be made, so the
controller must schedule production within the current setup configuration of the
machine group. The lowest levels of the hierarchy have to trust that the higher levels
have chosen an appropriate setup configuration in order to meet the long term target
demand rates for each of the products in the system.
Additional Setup Change Rates The long-term capacity set (5.4) represents a
manufacturing system which has only a single level of setup changes. For systems
where setup changes are controlled at multiple levels in the hierarchy, additional setup
change rates are required for the Level k cell controller capacity set. The additional
setup change rate variables are those which determine the amount of setup change
adjustments.
Generalized Limited Flexibility Capacity Constraints Consider Level k Cell
c which contains Machine Group i. Suppose that the setup tree (Section 5.2) of
Machine Group i is such that there are some setup states that are chosen by Level
k and higher controllers and some others by Level k + 1 and lower controllers. In
this case, there are both fixed setup states and setup change rate variables $f_{cm}^k$. The
capacity set of Level k Cell c is therefore a combination of the long-term capacity set
(5.4) and the short term capacity set (5.3).
The Level k capacity set for Cell c can be written as $T^k(S_i^k)$ constraints, each with
two summations. The first summation in the constraint for Level k setup state $\sigma_i^k$ is
for all operations that can be performed in $\sigma_i^k$. The second summation is for all Level
k + 1 or higher setup changes which are possible within $\sigma_i^k$.
The number of machines in Level k setup state $\sigma_i^k$ is given by the equation
$$
n_i^k(\sigma_i^k) = C_i^k(\sigma_i^k) \tag{5.5}
$$
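Therefore, the capacity set which includes long term and short term setup changes
can be written,
$$
\begin{aligned}
\sum_{j \in P_i^k(\sigma_i^k)} \tau_{ij}\, u_{cj}^k + \sum_{m \in M(\sigma_i^k)} T_{im}\, f_{cm}^k &\le n_i^k(\sigma_i^k) \\
u_{cj}^k &\ge 0, \quad j \in P_i^k \\
f_{cm}^k &\ge 0, \quad m \in M(S_i^k) \\
\sigma_i^k &\in S_i^k
\end{aligned}
\tag{5.6}
$$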
Note that the set of indices, M(S), does not have a superscript on the M because
it includes the Level k setup change rate variables for all Level k + 1 and lower setup
states.
5.4.4 Constraints on Setup Change Frequencies
As was mentioned in Section 5.4.3, setup changes are aggregated by the Level k Cell c
controller as rate variables $f_{cm}^k$ when the frequency of the setup change is much greater
than the Level k characteristic frequency, $f^k$. The Level k setup change rate variables
$f_{cm}^k$ are treated in almost the same manner as production rate variables. However,
they require a specialized set of constraints to reflect the nature of the setup change
activity.
Not all the constraints presented here are implemented in Hiercsim Version 4.0.
However, all the necessary constraints are listed for completeness. Some of these
setup constraints first appeared in Srivatsan and Gershwin (1990).
Feasibility Constraints Setup changes are required in a system of limited flexibil-
ity to ensure that all required production can be performed. For this reason, all setup
change rate variables which permit production of required process segment operations
must be strictly positive. Any process segment operations which are blocked by failed
machines or failed buffers do not require a positive setup change rate variable.
The Level k setup change rate variable $f_{cm}^k$ for changing into the lower level setup
state which corresponds to the index $m \in M(S_i^k)$ in Level k Cell c must be greater
than some suitably dimensioned rate $\epsilon^k$. The value of $\epsilon^k$ is small compared to the
Level k characteristic frequency $f^k$.
$$
\epsilon^k \ll f^k \tag{5.7}
$$
The constraint may be written as
$$
f_{cm}^k \ge \epsilon^k, \quad m \in M(S_i^k) \tag{5.8}
$$
This constraint requires that an $\epsilon^k$ be defined at each Level k. However, this
constraint is not currently implemented in Hiercsim Version 4.0. The lack of this
constraint in simulations has led to problems where there is no capacity allocated
for some setup changes. The initial testing of Hiercsim showed that this omission
prevented a system with limited flexibility from meeting its production goals because no
capacity had been allocated to change into some of the required setup states.
Interdependence Constraints Consider Machine Group i with machines of lim-
ited flexibility. Level k - 1 Cell C and Level k Cell c contain Machine Group i at their
respective control levels. Suppose that there are multiple Level k - 1 setup states, one
of which is $\sigma_i^{k-1}$, and that the setup class $S_i^k(\sigma_i^{k-1})$ contains multiple Level k setup
states.
The set of possible frequencies of any change into Level k setup state $\sigma_i^k \in S_i^k(\sigma_i^{k-1})$
depends on the frequencies of changes into all the other setup states in $S_i^k(\sigma_i^{k-1})$.
This effect is a conservation of setup changes where the number of changes
into any given setup state $\sigma_i^k$ may not exceed the number of changes into all other
setup states combined.
The simplest case to demonstrate this constraint is one with two possible setup
states. As soon as the system changes out of the first state, it immediately changes
into the second, and vice-versa. In this case, the setup change frequencies are equal
to each other. For three or more setup states, a setup change frequency must be less
than or equal to the sum of all the other frequencies.
This constraint represents a physical phenomenon and may be expressed as
$$
f_{cs}^k \le \sum_{m \in M^{k+1}(\sigma_i^k),\ m \ne s} f_{cm}^k, \quad s \in M^{k+1}(\sigma_i^k) \tag{5.9}
$$
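The interdependence constraint can be checked numerically with a short sketch: each setup change frequency is compared with the sum of all the others. The function name and the example frequencies are hypothetical.

```python
def satisfies_interdependence(freq, tol=1e-9):
    """True if every frequency is at most the sum of all the other frequencies."""
    total = sum(freq.values())
    return all(f_s <= (total - f_s) + tol for f_s in freq.values())


two_states = satisfies_interdependence({"s1": 0.2, "s2": 0.2})
balanced = satisfies_interdependence({"s1": 0.3, "s2": 0.2, "s3": 0.2})
unbalanced = satisfies_interdependence({"s1": 0.5, "s2": 0.2, "s3": 0.2})
```

With two setup states the check passes only when the two frequencies are equal, which matches the two-state case described above. The third example fails because the system cannot change into the first state more often than it changes out of it into the other two.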
Relationship to Production In order to prevent setup changes from occurring
without production, a set of constraints is introduced to ensure that the frequency
with which the system changes into any given setup state is less than the target
production rate of the process segment.
Suppose that Process Segment j in Level k Cell c is controlled by the Level k rate
variable $u_{cj}^k$, and that it requires Machine Group i to have at least one machine in
the Level k + 1 or lower setup state $\sigma_i^{k+1}$. The setup change rate variable $f_{cm}^k$ (where
$m \in M(S_i^k)$ corresponds to the Level k index of $\sigma_i^{k+1}$) must be less than
or equal to the target production rate of the process segment, $u_{Cj}^{k-1}$, provided by Level
k - 1 Cell C:
$$
f_{cm}^k \le u_{Cj}^{k-1} \tag{5.10}
$$
5.5 Hierarchical Control of Setup Changes
This section describes some of the issues relevant to control of setup changes by a
hierarchical controller. The role of cell controllers as a function of level is outlined, and
the linear program which replaces (4.1.4) in determining rates of controllable events
is developed. The coordination of cell configurations in a hierarchy of independent
cells is discussed.
5.5.1 Setup Controller as a Function of Level
Each control level in the hierarchy has a specific role in the choice of system-wide
machine configuration. The highest control levels provide long term production rates
u and setup change rates f, thus specifying an overall split between production and
flexibility. The high control levels are also responsible for determining the appropriate
aggregate setup states for each of the machine groups. Mid-level controllers determine
the more refined setup states which are consistent with the required high level setup
states. Finally, the lowest level controllers are responsible for assigning and initiating
setup changes at specific machines, and reporting the progress of setup changes to
the higher control levels.
5.5.2 Linear Program with Setup Changes
The hedging point strategy linear program of Section 3.7.2 which is used to determine
the values of controllable rates within a cell is modified to account for setup change
rate variables.
The objective function of the linear program (3.22) for the controller of Level k
Cell c of Machine Group i must be modified to take into account the setup change
rate variables $f_{cm}^k$, $m \in M(S_i^k)$, in the case of systems with limited flexibility. As a
first implementation, the objective function for $f_{cm}^k$ is a direct extension of that of the
process segment flow rate variables $u_{cj}^k$. Each setup change rate variable, $f_{cm}^k$, has its
own weighting coefficient $A_m$, surplus $x_m^k$, and hedging point $z_m^k$. The cost of being
behind in setup changes is computed in the same manner as that for process segment
rate variables (3.21).
Therefore, the controller of Level k Cell c of Machine Group i tries to minimize
the cost of being behind in production and setup changes:^6
$$
\min \left[ \sum_{j \in P_i^k} A_j (x_j^k - z_j^k)\, u_{cj}^k + \sum_{m \in M(S_i^k)} A_m (x_m^k - z_m^k)\, f_{cm}^k \right] \quad \forall i \tag{5.11}
$$
subject to the constraints (5.6), (5.8), (5.9), (5.10), and the conditional constraints
(2.19) and (2.20) due to blockage and starvation of buffers. The procedure for instal-
lation of boundaries described in Section 4.8 is used when solving (5.11).
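The form of the objective in (5.11) can be illustrated by computing its cost coefficients and evaluating it for a candidate set of rates. This sketch only evaluates the objective; it does not solve the linear program or apply the constraints. The names and numbers are hypothetical.

```python
def objective_coefficients(weight, surplus, hedging_point):
    """Cost coefficient of each rate variable: weight * (surplus - hedging point)."""
    return {i: weight[i] * (surplus[i] - hedging_point[i]) for i in weight}


def objective_value(coeff, rates):
    """Linear objective: sum of coefficient times rate over all rate variables."""
    return sum(coeff[i] * rates[i] for i in coeff)


# One production rate variable "j" and one setup change rate variable "m".
coeff = objective_coefficients(
    weight={"j": 1.0, "m": 2.0},
    surplus={"j": -3.0, "m": 1.0},
    hedging_point={"j": 2.0, "m": 0.0},
)
value = objective_value(coeff, {"j": 1.0, "m": 0.5})
```

The production variable is behind its hedging point, so its coefficient is negative and minimization favors a larger rate. The setup change variable is ahead of its hedging point, so its coefficient is positive and minimization favors a smaller rate.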
5.5.3 Coordination of Systems with Limited Flexibility
The coordination of setup configuration in a system is a complicated and important
task. This thesis identifies two types of coordination issues: inter-cell configuration
coordination and intra-cell configuration compatibility. Inter-cell coordination is used
6It is important to note that this objective function is one interpretation of the cost of setup
changes. Other objective functions may be better able to capture the real cost. This formulation
was chosen in order that the implementation of setup changes in the simulation could be done as
quickly as possible. Gershwin (1993) shows that this setup staircase policy does not perform well.
The example in Section 7.6 shows the poor performance of the setup staircase policy.
238
to refine a Level k - 1 setup configuration Ck - 1 into an appropriate Level k configu-
ration Ck for Level k Cell c which is consistent with the Level k configuration of each
Level k cell within Level k - 1 Cell C. Intra-cell coordination compatibility must
ensure that the best combination of Level k setup configurations C is chosen on all
machine groups in Level k Cell c, given the current status of production.
Inter-cell Configuration Coordination The controller of Level k Cell c chooses
the Level k configuration $C_c^k$ of the machine groups in the cell based only on Level k
data within Cell c. Due to this limited data, it might choose an optimal configuration
for itself that may not be compatible with the configurations of other Level k cells in
the system. The coordination between cells at the same control level which reduces
the possibility of a mismatch is called inter-cell coordination.
For example, the Level k Cell c controller might not choose a configuration Ck
which is able to produce part j, whereas the remaining Level k cells do. This single
configuration choice will permit the production of part j only until all buffers upstream
of Level k Cell c become full and all buffers downstream become empty. Production
beyond that amount will require a second configuration change within Level k Cell c.
In addition to the problem of a cell being improperly configured, the overall flexi-
bility of a system may be less than that of its isolated components. Consider a system
with two machine groups, i and i' (each in different Level k cells). The system pro-
duces two part types, j and j'. Both machine groups perform operations on both part
types, and each machine group requires a Level k setup change to switch production
from one part type to the other.
Suppose that it is easy for Machine Group i to switch production into part type
j, but it is difficult to switch into part type j'. Suppose that the reverse is true
for Machine Group i'. If Machine Group i were considered in isolation within its
Level k cell, then there would be one easy and one difficult setup change in a cycle
from one part type to the other. The same would be true if Machine Group i' were
considered in isolation within its Level k cell. However, the combination of the two
machine groups leads to a system where switching into either setup state involves
one difficult setup change. Therefore, the total amount of time and cost for changing
setups in the system is greater than that of each machine group in isolation.
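The comparison can be made concrete with a small worked example. It is an illustration only, under two assumptions that are not stated in the text: both machine groups change setups in parallel, and production of the new part type cannot resume until both changes are finished. Let $e$ be the duration of an easy setup change and $h$ the duration of a difficult one, with $e < h$ (both symbols are introduced here for illustration).
\[
T_{\text{isolated}} = e + h, \qquad
T_{\text{system}} = \max(e, h) + \max(h, e) = 2h > e + h .
\]
Here $T_{\text{isolated}}$ is the setup time in one full cycle (into $j'$ and back into $j$) for either machine group alone, and $T_{\text{system}}$ is the corresponding time for the pair. With the purely illustrative values $e = 1$ and $h = 4$ time units, a cycle costs 5 time units in isolation and 8 time units in the combined system.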
Intra-cell Configuration Coordination Suppose that the controller of Level k
Cell c has determined that it is time to change the Level k configuration $C_c^k$ of its
machines within the constraints of the current Level k - 1 configuration $C_C^{k-1}$. The
controller must choose a configuration $C_{ci}^k$ for each Machine Group i such that at
least some production is possible after all required setup changes are completed.
Ideally, the configuration should be such that the cell is capable of reducing its
Level k backlogs in the best way possible. This configuration decision must only
account for what is possible within the cell, and can ignore the outside system. This
is an intra-cell compatibility problem.
Coordination Mechanism A control policy is needed which can effectively pick
a reasonable configuration for each machine group within a cell while taking into
account the status of neighboring cells. A mechanism is needed to communicate
configuration change information to the next lower level. The mechanism developed
in Section 6.8 uses production parts as signal bearers, and also uses virtual machine
connections to provide appropriate information about neighboring cells.
Chapter 6
Setup Implementation
6.1 Design of Hiercsim Version 4.0
6.1.1 Setup Scheduling Implementation Objective
Setup changes are an important part of any scheduler's responsibility. However,
setup changes in the hierarchical framework are not well understood. The purpose of
Hiercsim Version 4.0 is to provide a research testbed from which an understanding of
setup changes can be gained. This chapter describes the algorithms used to determine
the system configuration based on a feedback control policy.
6.1.2 Limitations of Hiercsim Version 4.0
Setup Changes and Failures The initial goal of Version 4.0 was to have a policy
which would allow setup changes and failures to occur in the same simulation. That
problem turned out to be more difficult than originally anticipated. Therefore, this
feature is not yet implemented and only setup changes are allowed¹.
1 Note that Hiercsim Version 3.5 provides a complementary set of features by allowing failures with
a fixed system configuration. The two versions cannot be used together yet since initial conditions
cannot be specified in sufficient detail to transfer data from one version to the other.
Algorithm Complexity The complexity of the setup change algorithms implemented
in Hiercsim Version 4.0 is driven by a number of factors. The major factors
are outlined here.
1. Multiple Setup Change Control Levels The desire to incorporate multiple
setup change control levels forced the code to be written so that it is easily
expandable. This posed a design challenge which required the development of
general notation and recursive algorithms. A scheme had to be developed which
is able to communicate configuration change information across a distributed
collection of cells and machine groups. Sections 5.2, 6.4, and 6.8 address these
issues.
2. Inherent Delay Setup changes (by definition in Section 5.2) require a signif-
icant amount of time to perform. This delay poses problems in a system of
cells because communication between cells is limited and yet a setup change
command may have an impact far beyond the scope of the controller of the cell
where the command is issued (Section 5.1).
Two problems are addressed in Hiercsim Version 4.0. The first is the possibility
that a Level k - 1 setup change might occur while a Level k setup change is in
progress on Machine Group i. This is addressed in Section 6.4.3. The second is
the possibility that Machine Group i might change out of a setup configuration
required to complete all parts cleared for processing by the staircase policy
(Section 3.9). This stranded part issue is addressed in Sections 6.8 and 6.9.
3. Coordination of Setup States The distribution of cells across a pyramid
hierarchy structure (Section 3.5.3) poses problems when the setup configurations
of more than one cell must be coordinated while leaving individual events in
each cell locally controlled.
When setup configuration information is transmitted down the hierarchy, re-
finements are required to determine the precise setup state on each machine
affected by the configuration change. This leads to restrictions of cells based on
the configuration change status.
The change of configuration of Level k - 1 Cell C implies a change in Level k - 1
target production rates for the Level k component Cell c. Depending on the
purpose, Level k Cell c requires the use of the Level k - 1 target production rates
under the old configuration, those during the period when setup changes are
underway, and those anticipated for when all setup changes are complete. Also
depending on the purpose, the status of virtual machines is affected. These
issues are described in Sections 6.5 and 6.6.
6.2 Setup Constraints in the Master Linear Program
Section 4.1.4 introduced the concept of a master LP for all constraints that may be
used in the linear program. That section listed all the constraints required for Level
k Cell c where there are operations, machine failures, and buffer failures. This section
lists the number of additional rows and columns required to accommodate the setup
change rate variable constraints in the master LP.
Following the notation conventions of Section 5.2.2, there are a total of $T(S_i^k)$
Level k + 1 and lower setup changes possible on Machine Group i which is contained
in Level k Cell c. A total of $T(S_i^k)$ columns must be added to the master LP to
accommodate the setup change rate variables. Level k + 1 setup change variables are
listed first, followed by Level k + 2 setup change variables and so on.
The feasibility constraints (5.8) are not implemented in Hiercsim Version 4.0.
The interdependence constraints (5.9) are implemented in this thesis. The con-
straints relating the setup frequencies to production (5.10) are also implemented.
Each type of constraint requires one constraint row for each of the setup change
rate variables. Therefore, there are a total of $2T(S_i^k)$ rows for these two types of
constraints.
6.3 Capacity of a Cell During a Setup Change
A setup change is an activity which occupies a machine for a finite duration. For
that reason, any machine undergoing a setup change is unable to perform any other
activity, such as operations. This section describes how capacity is affected during a
transition between two configurations.
Suppose the controller of Level k Cell c initiates a configuration change in which
Machine Group i must change its configuration. The current Level k configuration is
$C_{ci}^k$ and the new Level k configuration is $C_{ci}^{k'}$.
As soon as the new Level k setup configuration is determined, the number of
machines that will be changed into each setup state can be computed. The quantity
$\Delta C_{ci}^k(\sigma_i^k)$ represents the net gain or net loss of machines in Machine Group i for the
Level k setup state $\sigma_i^k$. It is calculated as follows:
\[ \Delta C_{ci}^k(\sigma_i^k) = C_{ci}^{k'}(\sigma_i^k) - C_{ci}^k(\sigma_i^k) \tag{6.1} \]
The total number of machines in Machine Group i which will be changing setup
states is denoted $C_{ci}^k(0)$. None of those machines has a setup state during its
change, and none is capable of performing any other controllable activity. This quantity
is equal to the net loss of machines summed over all setup states that experience a net loss:
\[ C_{ci}^k(0) = \sum_{\sigma_i^k \in S_i^k} \left| \Delta C_{ci}^k(\sigma_i^k) \right| \tag{6.2} \]
In (6.2), the sum is taken over all setups $\sigma_i^k \in S_i^k$ such that $\Delta C_{ci}^k(\sigma_i^k) < 0$.
The controller of Level k Cell c assumes that each setup state which will lose one or
more machines in the new configuration loses those machines instantly. In reality, the
information about the configuration change has to be communicated down through
the hierarchy in a finite amount of time. However, this time is insignificant compared
to the characteristic Level k time scale, $1/f^k$, and so the assumption that the setup
changes begin immediately is appropriate.
Setup states which are to gain machines in the new configuration do not do so
until the setup change is actually completed. The time to change into a Level k setup
state is significant compared to the Level k characteristic time scale, $1/f^k$.
A machine which is changing into setup state $\sigma_i^k$ is not credited to $\sigma_i^k$ until the
machine has completed the change. The number of machines which are
set up in Level k setup state $\sigma_i^k$ is $n_{ci}^k(\sigma_i^k)$. The value of $n_{ci}^k(\sigma_i^k)$, $\sigma_i^k \in S_i^k$, used in the
capacity set (5.6) for the linear program (5.11) is computed as follows:
\[ n_{ci}^k(\sigma_i^k) =
\begin{cases}
C_{ci}^k(\sigma_i^k) + \Delta C_{ci}^k(\sigma_i^k) & \text{if } \Delta C_{ci}^k(\sigma_i^k) < 0 \\
C_{ci}^k(\sigma_i^k) & \text{otherwise}
\end{cases} \tag{6.3} \]
These values of $n_{ci}^k(\sigma_i^k)$ are to be used in the capacity set (5.6) until the Level k
setup changes are completed. As each setup change is completed, the newly changed
machine is added to the appropriate constraint row in (5.6).
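The bookkeeping of (6.1) through (6.3) can be sketched in a few lines of C. The fragment below is a minimal illustration written in ANSI C for readability; it is not taken from the Hiercsim source, and the function name, argument names, and array layout (one entry per setup state of a single machine group) are assumptions. The number of machines in transit is accumulated as the magnitude of each net loss.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of Hiercsim.
 * cur[s]  : machines currently set up in state s
 * next[s] : machines set up in state s under the new configuration
 * n_avail : output, transient availability per state as in (6.3)
 * Returns the number of machines that are changing setup states and
 * therefore have no setup state, as in (6.2). */
int transient_capacity(const int *cur, const int *next, int *n_avail,
                       size_t n_states)
{
    int changing = 0;
    size_t s;

    for (s = 0; s < n_states; s++) {
        int delta = next[s] - cur[s];      /* net gain or loss, (6.1) */

        assert(cur[s] >= 0 && next[s] >= 0);
        if (delta < 0) {
            n_avail[s] = cur[s] + delta;   /* losing states give up machines at once */
            changing += -delta;            /* those machines are now in transit */
        } else {
            n_avail[s] = cur[s];           /* gaining states wait for completion */
        }
    }
    return changing;
}
```

For a group moving from three machines in one state and one in a second state to one, one, and two machines in three states, the sketch reports two machines in transit and leaves the gaining state at zero until its setup changes finish.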
6.4 System Coordination of Setups
6.4.1 Order of Cell Calculations with Configuration Changes
Whenever the controller of Level k Cell c allocates capacity for controllable events,
Level k - 1 and higher events are accounted for first, then capacity is allocated to the
Level k events, followed by capacity for Level k + 1 and lower events.
The calculations of the Level k Cell c controller are divided into three components
in Hiercsim Version 4.0: the best Level k configuration consistent with the appropriate
Level k - 1 configuration is chosen; Level k operations are cleared for processing
according to the staircase policy of Section 3.9; and the rates of Level k + 1 and lower
controllable events within the domain of Level k Cell c are chosen.
6.4.2 Controllable Rates During a Configuration Change
Suppose Level k - 1 Cell C changes its configuration from configuration $C_C^{k-1}$ to
configuration $C_C^{k-1'}$ at time t. Assume that the time required to completely change
the Level k - 1 configuration is $\Delta t$. The controller of Level k - 1 Cell C assumes
that the configuration change begins immediately. The time at which the Level k - 1
setup changes actually begin is denoted as t + t' and occurs once all parts already in
the system under the old configuration have been assured a clear path to exit Level
k - 1 Cell C. The delay t' is due to the method by which the configuration change
is coordinated among the lower level component cells of Level k - 1 Cell C. This
method is described in Section 6.6.5.
The Level k Cell c component cell of Level k - 1 Cell C must alter its current
configuration $C_c^k$ to be consistent with the new Level k - 1 configuration. The Level
k transition from Level k - 1 configuration $C_C^{k-1}$ to configuration $C_C^{k-1'}$ can be divided
into three calculations at time t.
The first Level k calculation uses the Level k - 1 target rates $u_C^{k-1}(t)$ and $f_C^{k-1}(t)$
which are determined by the Level k - 1 Cell C controller using (5.11) under the old
configuration $C_C^{k-1}$. This is necessary so that all parts which are already in the system
can be completed before a setup change occurs.
The second calculation uses the Level k - 1 target rates $u_C^{k-1}(t + \Delta t)$ and
$f_C^{k-1}(t + \Delta t)$ which are determined by the Level k - 1 Cell C controller using (5.11) under the
new configuration $C_C^{k-1'}$ as if the configuration change were complete. This permits
the new Level k configuration to be optimized according to the new Level k - 1
configuration. These rates are not used for any other purpose at time t.
The third calculation uses the Level k - 1 target rates $u_C^{k-1}(t + t')$ and $f_C^{k-1}(t + t')$
which are determined by the Level k - 1 Cell C controller using (5.11) with the
transient machine availability values (6.3). The remainder of the machines are off-line
according to (6.2). This set of rates is to be used as target rates for all parts
produced during the interval $[t + t', t + \Delta t)$ and is to be updated upon each setup
change completion in that interval.
The Level k Cell c controller computes the Level k version of each of these three
sets of rates which are passed down to all of its Level k + 1 component cells. This
process continues until the operation level of the hierarchy is reached and setup change
assignments are made by the machine group controllers according to the algorithm
in Section 6.9.
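The timing of the three sets of target rates can be summarized by a small selector. The following C fragment is only a sketch under stated assumptions: the enumeration labels and function name are hypothetical, the delay t' is assumed to satisfy 0 <= t' <= Δt, and the use of the new-configuration rates once the change is complete is an inference from the text rather than a documented Hiercsim rule.

```c
#include <assert.h>

/* Hypothetical labels for the three sets of Level k - 1 target rates. */
enum rate_set {
    RATES_OLD_CONFIG,   /* computed at t under the old configuration */
    RATES_INTERIM,      /* computed with the transient availability (6.3) */
    RATES_NEW_CONFIG    /* computed as if the change were complete */
};

/* Select the target rates that govern production at time `now`.
 * t       : time at which the configuration change is ordered
 * t_delay : delay t' before the setup changes actually begin
 * dt      : total time needed to complete the configuration change */
enum rate_set production_rate_set(double now, double t, double t_delay,
                                  double dt)
{
    assert(t_delay >= 0.0 && t_delay <= dt);

    if (now < t + t_delay)
        return RATES_OLD_CONFIG;   /* parts in the system are finished first */
    if (now < t + dt)
        return RATES_INTERIM;      /* interval [t + t', t + dt) */
    return RATES_NEW_CONFIG;       /* all setup changes complete */
}
```

The sketch deliberately leaves out the recomputation of the interim rates after each completed setup change; it only shows which of the three sets is in force.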
6.4.3 Cell Configuration Change Status
In order to ensure the smooth transition of the entire system from one overall con-
figuration to another, Hiercsim Version 4.0 uses a configuration change status. The
configuration change status allows the controller of Level k Cell c to place certain
restrictions on a cell so that it is able to coordinate its new configuration with the
rest of the cells in the system. The three possible configuration change status states of
a cell are: CELL-WAITING-TO-SETUP, CELL-SETTING-UP and CELL-SETUP-
COMPLETE. This section defines each status and describes restrictions specific to
each status. The mechanism which is used to implement these restrictions is described
in Section 6.8.
CELL-WAITING-TO-SETUP Each Level k component cell of Level k - 1 Cell
C is immediately assigned the configuration change status CELL-WAITING-TO-
SETUP when the Level k - 1 configuration, $C_C^{k-1}$, changes. This status places two
restrictions on the controller of Level k Cell c. It prevents the Level k controller from
initiating any configuration change until the configuration change status becomes
CELL-SETUP-COMPLETE; and it tells the Level k controller to use the Level k - 1
target production rates in effect before the Level k - 1 configuration change was made.
The delay in starting the Level k configuration change due to the status CELL-
WAITING-TO-SETUP is required for the proper coordination of all the Level k
component cells in Level k - 1 Cell C. The mechanism which causes the delay is
described in Section 6.6.5. The delay is assumed to be small compared to the Level
k - 1 characteristic time scale, $1/f^{k-1}$, but can be significant when compared to the
Level k characteristic time scale, $1/f^k$.
When the Level k - 1 configuration change information is transmitted to the
component Level k Cell c, the cell's status becomes CELL-SETTING-UP.
The restriction on which Level k - 1 target production rates to use is motivated
by the possibility of stranding parts in the system. The time for a part to travel
through a Level k - 1 process segment is insignificant compared to the Level k - 1
time scale, but is significant when compared to the Level k time scale. The interim
target controllable rates, $u_C^{k-1}(t + t')$ and $f_C^{k-1}(t + t')$, set by Level k - 1 Cell C account
for machines being in the process of changing setups. Those target rates are therefore
smaller than those which were in effect immediately before the change was initiated,
$u_C^{k-1}(t)$ and $f_C^{k-1}(t)$. The smaller target production rates can leave parts stranded at
Level k due to the reduced demand used in the staircase policy (Section 3.9).
CELL-SETTING-UP The configuration status of Level k Cell c is changed from
CELL-WAITING-TO-SETUP to CELL-SETTING-UP as soon as it receives a signal
which tells it to react to the Level k - 1 configuration change in its parent cell, Level
k - 1 Cell C. The implementation of the signal is described in Section 6.8. Upon
receiving this signal, the controller of Level k Cell c changes its configuration to be
consistent with the new Level k - 1 configuration. This status restricts the ability of
the Level k Cell c controller to initiate any other Level k configuration change.
In calculating the new Level k configuration, $C_c^k$, the controller of Level k Cell c
uses the Level k - 1 target production rates $u_C^{k-1}(t + \Delta t)$ and $f_C^{k-1}(t + \Delta t)$ anticipated
after the new Level k - 1 configuration is projected to be completed. Once a new Level
k configuration is found, the controller of Level k Cell c changes the configuration
status of each of its Level k + 1 component cells to CELL-WAITING-TO-SETUP.
The cell controller then computes interim Level k production rates, $u_c^k(t + t')$ and
$f_c^k(t + t')$, for the configuration change period. This calculation requires that the
controller account for the loss of capacity due to the configuration change (6.3). The
target rates used in the calculation are the interim Level k - 1 production rates,
$u_C^{k-1}(t + t')$ and $f_C^{k-1}(t + t')$, computed by Level k - 1 Cell C. After each Level k - 1
setup change is completed, the target production rates $u_C^{k-1}(t + t')$ and $f_C^{k-1}(t + t')$
are recomputed to account for the new capacity. This process continues until all
Level k - 1 setup changes are completed or an unplanned event occurs which forces
a new configuration. The Level k production rates are recomputed every time the
Level k - 1 target rates are changed.
The restriction imposed on the ability of the Level k Cell c controller to change the
Level k configuration Ck during the time its status is CELL-WAITING-TO-SETUP
is maintained during the time the cell is CELL-SETTING-UP. The restriction is
imposed for the same reason: to avoid loss of capacity due to setup changes that
must be canceled immediately. The restriction is in effect until either all the required
setup changes are completed, in which case the cell's status changes to CELL-SETUP-
COMPLETE, or an uncontrollable event occurs which affects capacity at or above
the control level where the change was initiated.
CELL-SETUP-COMPLETE The configuration change status of Level k Cell c is
CELL-SETUP-COMPLETE whenever none of its component machines is undergoing
a Level k or higher setup change and the higher level configuration has not changed.
This status places no restrictions on the ability of the cell's controller to initiate a
Level k configuration change. It also specifies that the Level k - 1 target production
rates be those that reflect the current Level k - 1 configuration state, $u_C^{k-1}(t)$ and
$f_C^{k-1}(t)$.
It is possible, in rare cases, for the configuration status of Level k Cell c to change
directly from CELL-WAITING-TO-SETUP to CELL-SETUP-COMPLETE if the
current configuration change is overridden by an even higher level change. This case
is described in more detail in Section 6.4.4.
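The three cell statuses and the transitions described in this section form a small state machine. The C fragment below is a minimal sketch of that state machine, not the Hiercsim implementation: the identifiers use underscores because hyphens are not legal in C names, the event names are hypothetical, and only the transitions named in this section are modeled (configuration changes initiated by the cell itself and uncontrollable events are left out).

```c
#include <assert.h>

enum cell_status {
    CELL_WAITING_TO_SETUP,
    CELL_SETTING_UP,
    CELL_SETUP_COMPLETE
};

/* Hypothetical events, named for illustration only. */
enum cell_event {
    EV_PARENT_CONFIG_CHANGED,  /* the parent Level k - 1 configuration changes */
    EV_SETUP_SIGNAL,           /* signal to react to the parent change arrives */
    EV_ALL_SETUPS_DONE,        /* every required setup change has finished */
    EV_CANCELLED               /* superseded by an even higher level change */
};

/* Return the status that follows status s when event e occurs. */
enum cell_status next_cell_status(enum cell_status s, enum cell_event e)
{
    switch (e) {
    case EV_PARENT_CONFIG_CHANGED:
        return CELL_WAITING_TO_SETUP;
    case EV_SETUP_SIGNAL:
        return s == CELL_WAITING_TO_SETUP ? CELL_SETTING_UP : s;
    case EV_ALL_SETUPS_DONE:
        return s == CELL_SETTING_UP ? CELL_SETUP_COMPLETE : s;
    case EV_CANCELLED:
        return CELL_SETUP_COMPLETE;
    }
    return s;
}

/* Only a cell with status CELL_SETUP_COMPLETE is free to initiate a
 * Level k configuration change of its own. */
int may_initiate_config_change(enum cell_status s)
{
    return s == CELL_SETUP_COMPLETE;
}
```

Writing the restrictions as a predicate on the status, rather than scattering checks through the controller, is one way to keep the rule in a single place.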
6.4.4 Cancellation of Configuration Change Requests
When the configuration status of Level k Cell c is either CELL-WAITING-TO-
SETUP or CELL-SETTING-UP, it is possible for the configuration change to be
canceled. A cancellation occurs when the configuration change in progress is super-
seded by an even higher level configuration change. In that case, the cell's status
reverts to CELL-SETUP-COMPLETE for the brief period of time until the higher
level configuration change information reaches the Level k controller.
When a high level cell controller decides to change configuration, this change
overrides all setups in progress in its component cells. In addition to forcing each lower level
cell's configuration status to be CELL-SETUP-COMPLETE, a cancellation makes
any machines which are in the process of changing setups revert back to the setup
state before the change was begun. It is assumed that there is no cost for canceling
a setup change that has been initiated on a machine.
This is an implementation issue which simplifies the calculation of which
configuration is the best within a cell. By giving precedence to higher level configuration
changes, the coding of the implementation becomes much less complicated than in the
case where two configuration changes have to be accommodated at the same time.
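The effect of a cancellation on individual machines can be sketched as follows. This C fragment is an illustration under assumptions, not Hiercsim code: the structure, its field names, and the status identifiers are hypothetical, and setup state 0 is used here to mark a machine that has no setup state while it is changing.

```c
#include <assert.h>
#include <stddef.h>

enum machine_status {
    MACH_WAITING_TO_SETUP,
    MACH_SETTING_UP,
    MACH_SETUP_COMPLETE
};

/* Hypothetical per-machine record. */
struct machine {
    enum machine_status status;
    int state;                /* current setup state, 0 while changing */
    int state_before_change;  /* setup state held before the change began */
};

/* Cancel every pending or in-progress setup change in a group of n
 * machines.  Each affected machine reverts to the setup state it held
 * before the change, at no cost.  Returns the number of machines
 * reverted. */
size_t cancel_setup_changes(struct machine *m, size_t n)
{
    size_t i, reverted = 0;

    for (i = 0; i < n; i++) {
        if (m[i].status != MACH_SETUP_COMPLETE) {
            m[i].state = m[i].state_before_change;
            m[i].status = MACH_SETUP_COMPLETE;
            reverted++;
        }
    }
    return reverted;
}
```

Because canceling is assumed to be free, the sketch does not track how far a setup change had progressed before it was overridden.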
6.4.5 Machine Group Configuration Change Status
MCG-WAITING-TO-SETUP The configuration status of Machine Group i is
MCG-WAITING-TO-SETUP when it is waiting for a signal to assign specific setups
to specific machines in the group. This status indicates that the operation level cell
which contains Machine Group i is undergoing a configuration change.
The machine group is able to perform operations without restrictions. However,
the machine group controller monitors parts for a signal to begin the required config-
uration change.
MCG-SETTING-UP As soon as the signal to change configurations is received
at Machine Group i, the machine group controller assigns setup changes to specific
machines.
MCG-SETUP-COMPLETE As soon as all required setup changes on machines
in Machine Group i are completed, the configuration status is changed to MCG-
SETUP-COMPLETE.
6.4.6 Machine Configuration Change Status
WAITING-TO-SETUP The setup change status of a machine is changed to
WAITING-TO-SETUP as soon as it is notified that it has been selected to change
setup state. The setup change does not begin until a signal is received. The details
of this signal are described in Section 6.8.3.
SETTING-UP Once the signal is received, the machine setup status is changed to
SETTING-UP. At this point, the machine begins performing its setup change. The
procedure for changing setup on a machine is described in Section 6.10.
SETUP-COMPLETE When the setup change on the machine is complete, its
status is changed to SETUP-COMPLETE. There are no restrictions on a machine's
ability to perform operations when it is in this status, other than those imposed by
its particular setup state.
6.5 Configuration Catalog of a Cell
There is a large number of possible combinations of system configurations in any
reasonably sized manufacturing system. Among those configurations, there are many
which prevent any production from occurring by forcing at least one machine to be
set up in a state that is incompatible with the setup states of all the other machines.
The controller of any Level k Cell c which specifies the Level k configuration $C_c^k$ must
know which configurations make the target production rates feasible, and also which
leave the cell in a compatible configuration with the rest of the system.
Most of the possible configurations of Level k Cell c are undesirable because
it only takes one machine in the wrong setup state to block production. The few
configurations that are desirable can be stored in a catalog of possible configurations.
The current implementation of Hiercsim Version 4.0 limits the controller of Level k
Cell c to choose only a Level k configuration which is contained in the configuration
catalog. In addition, at any given time, the controller of Level k Cell c must only
choose from among those Level k configurations in the catalog which are consistent
with the appropriate Level k - 1 configuration $C_C^{k-1}$ of the Level k - 1 Cell C parent
cell (Section 6.4.2).
An example of a setup catalog appears in Figure 7-42 in the setup example of
Section 7.6. The remainder of this section describes the configuration catalog, and
how it is used to represent the system setup state at each level in the hierarchy. It
also describes how the catalog can be used to ensure consistent configurations across
control levels.
6.5.1 Catalog Entry in a Cell
The combination of one configuration vector for each machine group within a cell
comprises the configuration of the cell for a given catalog entry. Each Level k catalog
entry is related to a Level k - 1 parent catalog entry in which each of the Level k
setups are consistent with the Level k - 1 setups. There may be more than one Level
k catalog entry which has the same Level k - 1 catalog entry. Each Level k catalog
entry has one or more Level k + 1 component catalog entries, unless Level k is the
interface level between the hierarchy and the factory (Section 3.5).
Consider Machine Group i in Level k Cell c. The machine group has $n_i$ identical
machines which can be set up independently of each other. Recall from Section 5.2
that the Level k configuration of the machine group, $C_{ci}^k$, is the description of the setup
states of each of the machines. Each element $C_{ci}^k(\sigma_i^k)$ corresponds to the number of
machines in Level k setup state $\sigma_i^k$. Those machines in Machine Group i which are
undergoing a setup change or are failed belong to the configuration element $C_{ci}^k(0)$.
Note that a configuration vector within the catalog does not necessarily represent
the current configuration of the cell. However, it does represent a possible configura-
tion of the cell for a later date.
The values within the Level k configuration vector of a catalog entry are used by
the controller to determine the anticipated Level k capacity of Level k Cell c if it were
to change to that configuration. There is one element in the configuration for each
of the machine constraint rows in the cell linear program given by (5.11). This linear
program is used by the controller to determine which of the feasible configurations
would be the best given the current setup change policy and cell configuration change
status (Section 6.6).
6.5.2 Relationship of Catalog Entries Between Cells
The configuration of each cell should be consistent throughout the hierarchy. Each
Level k - 1 configuration, $C_C^{k-1}$, of Level k - 1 Cell C is an aggregation of the Level
k configurations of each of its Level k component cells. Cells at the same level must
have consistent configurations so that production may be synchronized and WIP can
be kept low. This places restrictions on the controller of Level k Cell c when it is time
to change the Level k configuration. This section describes the relationships between
catalog configurations which are needed in order to maintain the system consistency.
Consistency Between Levels Suppose that the controller of Level k Cell c has
determined that it is time to change the Level k configuration. A new Level k
configuration must be among the subset of configurations in the Level k Cell c catalog
which may be aggregated into the current Level k - 1 configuration $C_C^{k-1}$ (Section 5.2).
For example, suppose that the Level k - 1 configuration of Machine Group i
contains machines which are set up in the Level k - 1 setup state $\sigma_i^{k-1}$. Valid Level
k configurations of Machine Group i may contain machines which are set up in any
Level k setup state in the setup class $S_i^k(\sigma_i^{k-1})$ as long as the number of machines
set up in $S_i^k(\sigma_i^{k-1})$ is equal to the number of machines set up in Level k - 1 setup
state $\sigma_i^{k-1}$. If the number of machines set up in $\sigma_i^{k-1}$ is greater than 1 and there is
more than one Level k refinement of $\sigma_i^{k-1}$, then more than one Level k catalog entry
is possible, as different combinations of setup states within the same group are valid.
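The aggregation condition in this example can be checked mechanically. The C fragment below is a sketch under assumptions, not the Hiercsim consistency check: the function name and the array representation of the setup tree (each Level k state stores the index of the Level k - 1 state it refines) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical check for one machine group.
 * child_count[s]  : machines in Level k setup state s
 * parent_of[s]    : index of the Level k - 1 state that state s refines
 * parent_count[p] : machines in Level k - 1 setup state p
 * Returns 1 if, for every Level k - 1 state, the machines in its setup
 * class add up to the Level k - 1 count, and 0 otherwise. */
int aggregates_into(const int *child_count, const size_t *parent_of,
                    size_t n_child, const int *parent_count,
                    size_t n_parent)
{
    size_t s, p;

    for (p = 0; p < n_parent; p++) {
        int total = 0;

        for (s = 0; s < n_child; s++) {
            assert(parent_of[s] < n_parent);
            if (parent_of[s] == p)
                total += child_count[s];
        }
        if (total != parent_count[p])
            return 0;   /* this setup class does not match its parent */
    }
    return 1;
}
```

With two machines in a Level k - 1 state that has two Level k refinements, the splits (1, 1), (2, 0), and (0, 2) all pass the check, which is why several catalog entries can share one parent entry.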
Consistency at the Same Level The Level k configuration of Level k Cell c
must also be consistent with its neighboring cells so that blockage and starvation of
machines is reduced. This condition is satisfied indirectly through a special use of
virtual machines, described in Section 6.6.4.
Consistency Discontinuity The only exception to system-wide consistency is
when the system² is in transition between two configurations. In that case, there
are two separate system states: one which uses the previous configuration, and the other
which uses the new configuration. The boundary between the two system states is
made up of all cells which have status CELL-WAITING-TO-SETUP. Both subsystems
must be self-consistent, but there may be a discontinuity between each cell of
status CELL-WAITING-TO-SETUP and its parent cell and any of its neighboring
cells.
2 The system is the collection of all cells in the manufacturing system being controlled.
Cases Where There is No Choice of Configuration In Hiercsim Version 4.0,
a convention has been adopted in which Level 1 is the highest level cell in the system.
The setup tree from Section 5.2 is required to collapse to a single node at Level 1.
Due to this convention, the long-term capacity set (5.4) applies. Each machine group
in the system only has a single Level 1 configuration. Therefore, the Level 1 cell
has only one configuration, $C^1$, in its catalog. The configuration vector $C_{ci}^1$ of each
Machine Group i has two elements. The value of the element $C_{ci}^1(\sigma_i^1)$ is equal to the
total number of operational machines in the machine group, and the value of the
element $C_{ci}^1(0)$ is the total number of Level 1 failed machines.
At control levels below the level where the last setup change is controlled, there is
no choice about the setup configuration of the cell. There is a unique configuration
for each cell below the last setup change level for each configuration at the last setup
change level. Therefore, there may be many possible catalog entries at the lowest
levels, but only one of the entries is available at any given instant. It is important
to have these lower level configurations so that changes in configuration may be
communicated down to the machine groups (Section 6.9).
6.6 Algorithm for Determining Cell Configuration
Whenever the configuration change status in Level k Cell c is CELL-SETTING-UP,
the controller of the cell is required to change the cell's configuration. For any given
configuration change, there may be more than one configuration which is valid. This
section describes the algorithm used in Hiercsim Version 4.0 that attempts to choose a
reasonable configuration, based on the current status of the cell and the configuration
of its parent cell, Level k - 1 Cell C.
The performance of this algorithm in a simple system is presented in Section 7.6.
That section shows results generated using two policies used in Hiercsim Version 4.0
which govern the setup change timing. Those policies are described in Section 6.7.
The central part of the algorithm is a series of linear programs with which each of
the configurations in the cell's catalog is compared to all others. The best catalog entry
according to the criteria in this section is chosen as the new configuration for the cell.
Each of the configurations in the cell's catalog is examined for feasibility. The feasi-
bility conditions are described in Section 6.6.3. A linear program (5.6) is constructed
for each of the valid configurations. The cost of each of the valid configurations is
calculated using the objective function (5.11) and the rates determined by the linear
program (5.6). The catalog entry which has the lowest cost (most negative objective
function) is considered to be the optimal solution. The specifics of this algorithm are
described in the following sections.
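The selection step can be sketched as a single pass over the catalog. The C fragment below is a minimal illustration, not the Hiercsim routine: the structure and function names are hypothetical, and the feasibility flag and the cost of each entry are assumed to have been filled in beforehand by the feasibility screening and by the linear program, neither of which is shown.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical summary of one catalog entry after evaluation. */
struct catalog_entry {
    int feasible;   /* nonzero if the entry passed the feasibility conditions */
    double cost;    /* objective function value from the linear program */
};

/* Return the index of the feasible entry with the lowest (most negative)
 * cost, or -1 if no entry in the catalog is feasible. */
int choose_configuration(const struct catalog_entry *cat, size_t n)
{
    int best = -1;
    size_t i;

    for (i = 0; i < n; i++) {
        if (!cat[i].feasible)
            continue;                        /* skip invalid configurations */
        if (best < 0 || cat[i].cost < cat[best].cost)
            best = (int)i;                   /* keep the cheapest so far */
    }
    return best;
}
```

Ties are resolved in favor of the entry that appears first in the catalog; that tie-breaking rule is a choice made for this sketch and is not stated in the text.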
6.6.1 Obtaining and Using Initial Conditions
Before any calculations can be done, initial conditions must be set, either by the
user or as default values. This section describes those initial conditions which are
unique to Hiercsim Version 4.0 and setup changes. All the default initial conditions
for Hiercsim Version 3.5 described in Section 4.4 are in effect for Version 4.0.
Initial Setup Configuration At the start of the simulation, the controller of Level
k Cell c requires that the initial catalog entry and its configuration, Ck, be known.
Consistency between cell catalogs is maintained by accepting only consistent cata-
log entries as the hierarchy is descended from Level 1 to the operation level. Hiercsim
Version 4.0 does not check for consistency between cells at the same level, however.
Once each cell's initial catalog entry has been specified, the machine group con-
troller assigns an initial setup state to each individual machine. The number of
255
machines in any Level k setup state o is equal to the number of machines specified
in the Level k configuration element, C)'(o, ), found in the initial Level k setup catalog
of Level k Cell c.
Initial Conditions for Setup Change Variables Suppose that the Machine
Group i setup tree includes the Level k setup state $\sigma_i^k$ which is in the Level k setup
class $S_i^k(\sigma_i^{k-1})$. In the Level k - 1 Cell C linear program, the setup state $\sigma_i^k$ is
represented by a setup change rate variable $f_m^{k-1}$, where $m \in M_i^k(\sigma_i^k)$ is the index
which corresponds to the Level k setup state $\sigma_i^k$. This rate variable is treated in a
manner similar to the Level k - 1 production rate variables, $u^{k-1}$. The user is required
to specify the Level k - 1 relative cost weight coefficient $A^{k-1}$ (Section 5.5.2) and the
Level 1 target setup change frequency $f^1$. The Level k - 1 initial surplus, $x^{k-1}$, of the
setup change rate variable is set equal to zero. All hedging points of the setup change
rate variables, $z^{k-1}$, are set to zero by default. An identical procedure is followed for
Level k - 2 and higher cells that contain Machine Group i.
6.6.2 Setup Staircase Policy
This section describes the technique used in Hiercsim Version 4.0 to limit the number
of setup changes in a manufacturing system with limited flexibility. Setups can be
expensive beyond what is reflected in the objective function. The extra expense may
be due to a large number of man-hours required to perform the change, wear and tear
incurred on equipment, or other effects that are difficult to quantify. Therefore, it
is sometimes desirable to restrict the number of setup changes in a system, allowing
some idle time to occur at machines. The current implementation of the setup change
frequency limitation is based on the staircase policy of Section 3.9. The setup staircase
policy is known to give poor performance (Gershwin, 1993).
Setups Allocated Vector Level k - 1 Cell C provides the controller of Level k
Cell c with target setup change rates, $f_m^{k-1}$, $m \in M_i^k(\sigma_i^k)$, for Machine Group i.
The cumulative integral of those rates tells the controller of Level k Cell c how many
Level k setup changes have been allocated to date. The cumulative number of setup
changes may not exceed the cumulative number allocated.
In this implementation, the cumulative number of setup changes allocated for $\sigma_i^k$ is
defined as $N_i^k(\sigma_i^k, t)$. $N_i^k(\sigma_i^k, t)$ is the next integer larger than the cumulative integral
[3][4] of $f_m^{k-1}$:
$N_i^k(\sigma_i^k, t) = \left\lceil \int_0^t f_m^{k-1}(s)\, ds \right\rceil, \quad m \in M_i^k(\sigma_i^k)$    (6.4)
One of the differences between this policy and the staircase policy is that the setup
changes are allocated, but do not have to be used, whereas when a part is cleared for
production, it is expected to be completed. This distinction gives the Level k Cell c
controller some flexibility in choosing to change configuration.
The cumulative number of setup changes into setup state $\sigma_i^k$ performed to date is
represented by the symbol $\mathcal{N}_i^k(\sigma_i^k, t)$, which can never exceed the cumulative number
allocated for changing into $\sigma_i^k$, $N_i^k(\sigma_i^k, t)$:
$\mathcal{N}_i^k(\sigma_i^k, t) \le N_i^k(\sigma_i^k, t), \quad m \in M_i^k(\sigma_i^k)$    (6.5)
The vector, $\Delta\mathcal{N}_i^k(t)$, represents the number of setup changes available to the
controller of Level k Cell c for Machine Group i. There are $T^k(S_i^k)$ elements in
$\Delta\mathcal{N}_i^k(t)$. The value of the element $\Delta\mathcal{N}_i^k(\sigma_i^k, t)$ is equal to the difference between the
cumulative number of setup changes allocated into the setup state by the Level k - 1
target frequency, $N_i^k(\sigma_i^k, t)$, and the cumulative number of changes into $\sigma_i^k$ performed to
date [5], $\mathcal{N}_i^k(\sigma_i^k, t)$:
$\Delta\mathcal{N}_i^k(\sigma_i^k, t) = N_i^k(\sigma_i^k, t) - \mathcal{N}_i^k(\sigma_i^k, t)$    (6.6)
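The bookkeeping in (6.4) and (6.6) can be illustrated with a minimal sketch. It is not taken from Hiercsim; the function names are hypothetical, and it assumes a constant, non-negative target setup change rate so that the cumulative integral reduces to rate times elapsed time.

```c
/* Cumulative number of setup changes allocated to date, following (6.4):
 * the integral of the target setup change rate, rounded up to an integer.
 * Assumption (not from the thesis): the rate is constant, so the integral
 * is rate * elapsed, and both arguments are non-negative. */
long setups_allocated(double rate, double elapsed)
{
    double integral = rate * elapsed;
    long n = (long)integral;          /* truncates toward zero */
    if ((double)n < integral)
        n++;                          /* round any fraction up to the next integer */
    return n;
}

/* Setup changes still available, following (6.6):
 * cumulative number allocated minus cumulative number performed. */
long setups_available(long allocated, long performed)
{
    return allocated - performed;
}
```

For instance, a rate of 0.25 changes per time unit over 10 time units allocates three changes; if one has been performed, two remain available.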
[3] The set $M_i^k(\sigma_i^k)$ consists of the single index which corresponds to the Level k setup state $\sigma_i^k$ in
Level k - 1 Cell C.
[4] The quantity in the brackets $\lceil\ \rceil$ is converted into the next higher integer. For example, the
number $\lceil 0.05 \rceil$ is converted to 1.
[5] This notation is only valid for sequence-independent setup changes between setup states in the
same setup class. In the case of sequence-dependent setup changes between setup states in the same
setup class, the rate of change out of Level k setup state $\sigma_i^k$ must also be taken into account.
Suppose the setup state $\sigma_i^k$ is contained in the setup class $S_i^k(\sigma_i^{k-1})$. The
available setup changes (represented by $\Delta\mathcal{N}_i^k(\sigma_i^k, t)$) can only be used to change
the setup state of machines which are currently in one of the other setups in the
setup class $S_i^k(\sigma_i^{k-1})$. Setup changes on any other machine require a higher level
setup change in order to change into $\sigma_i^k$.
The element $\Delta\mathcal{N}_i^k(\sigma_i^k, t)$ is zero, according to (6.6), if the Level k setup state $\sigma_i^k$
is the only Level k setup state within a given Level k - 1 setup state $\sigma_i^{k-1}$. That is,
$S_i^k(\sigma_i^{k-1})$ only contains the Level k setup state $\sigma_i^k$, so there is no need to perform a
Level k setup change into $\sigma_i^k$.
6.6.3 Valid Catalog Entries
This section describes the conditions under which a catalog entry in Level k Cell c is
valid. A valid entry is one which may be considered by the controller of Level k Cell
c in the algorithm which chooses the next configuration for the cell. All non-valid
catalog entries are ignored during the current configuration change calculation. Valid
catalog entries in Level k Cell c satisfy two conditions.
1. The first condition is that the configuration of the catalog entry must be able
to be aggregated into the Level k - 1 configuration $C_C^{k-1}$ of the parent cell, Level
k - 1 Cell C.
2. The second condition is that all setup changes required to convert the current
configuration of Level k Cell c into the configuration of the catalog entry do not
exceed the setup change limits determined by the setup staircase policy (6.4).
Consistent Level k Configurations The first condition is satisfied by finding the
list of Level k Cell c catalog entries which are consistent with the Level k - 1 catalog
entry that is in effect. This Level k - 1 catalog entry may be the current Level k - 1
catalog entry, or it may be the new Level k - 1 catalog entry if the configuration
change originated at Level k - 1 or higher. All Level k configurations which are not
found in this list of catalog entries are invalid.
Configurations within Frequency Limits The implementation technique which
incorporates frequency limits on setup changes is described here. This policy is based
on the staircase policy of Section 3.9. Suppose that the current Level k configuration
of Level k Cell c is $C_c^k$ and that the proposed Level k catalog entry contains the
configuration $C_c^{k\prime}$. Assume that $C_c^{k\prime}$ satisfies the first condition of this section, which
requires that the configuration of the catalog entry be able to be aggregated into the
Level k - 1 configuration $C_C^{k-1}$. The calculation occurs at time t.
Three quantities are required to determine whether or not the configuration $C_c^{k\prime}$
is valid given the current state of setup changes allocated [6].
The first quantity is the total number of setup changes required to change the
configuration on Machine Group i, $\Delta C_{ci}^k$, computed by (6.1). Note that $\Delta C_{ci}^k$ is the
total number of setup changes required, including Level k - 1 and higher setup changes,
as well as Level k setup changes.
The second quantity is the number of available Level k setup changes, $\Delta\mathcal{N}_i^k(\sigma_i^k, t)$,
for each Level k setup state $\sigma_i^k$ in $S_i^k$ (6.6).
The third quantity is the number of Level k - 1 setup changes which have been
allocated. If the Level k - 1 or higher configuration has changed, there are Level k - 1
or higher setup changes which augment the Level k setups allocated vector $\Delta\mathcal{N}_i^k(t)$.
Using the notation of Section 6.3, there are $\Delta C_{Ci}^{k-1}(\sigma_i^{k-1})$ machines which are required
to change into any one of the Level k setup states in $S_i^k(\sigma_i^{k-1})$. This is true for each
Level k - 1 setup state $\sigma_i^{k-1} \in S_i^{k-1}$. Note that Level k - 2 and higher setup changes
are automatically included in $\Delta C_{Ci}^{k-1}(\sigma_i^{k-1})$ because the Level k - 1 instance of this
algorithm has taken them into account.
There are two stages in the calculation to determine if the configuration $C_c^{k\prime}$ is valid.
The first stage computes the deficit of setup allocations when the total required setup
changes are compared to the allocated Level k setup changes for each Level k setup
state $\sigma_i^k \in S_i^k$. The second stage applies the appropriate allocated Level k - 1 setup
changes to each of the Level k setup states with a deficit. The configuration $C_c^{k\prime}$ is
valid if the combination of Level k and Level k - 1 setup change allocations permits
the configuration to be reached.
[6] For simplicity, the criteria are developed for Machine Group i in Level k Cell c, but they must
be satisfied by all machine groups in Level k Cell c for the configuration to be valid.
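Define the vector of integers, $D_{ci}^k(t)$, to be the Level k deficit when converting from
Level k configuration $C_c^k$ to Level k configuration $C_c^{k\prime}$. The element $D_{ci}^k(\sigma_i^k, t)$ is a
non-positive integer which represents the number of required Level k setup changes
into Level k setup state $\sigma_i^k$ that are not allocated. It is computed by the equation [7]
$D_{ci}^k(\sigma_i^k, t) = \begin{cases} \lfloor \Delta C_{ci}^k(\sigma_i^k) - \Delta\mathcal{N}_i^k(\sigma_i^k, t) \rfloor & \text{if } \Delta C_{ci}^k(\sigma_i^k) - \Delta\mathcal{N}_i^k(\sigma_i^k, t) < 0 \\ 0 & \text{otherwise} \end{cases}$    (6.7)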
The configuration $C_c^{k\prime}$ is valid if all the elements of the deficit vector $D_{ci}^k(t)$ are zero.
However, if elements of $D_{ci}^k(t)$ are not zero, the Level k - 1 setup change allocations
must be applied to offset the appropriate deficits. All the Level k - 1 configuration
change vector elements, $\Delta C_{Ci}^{k-1}(\sigma_i^{k-1})$, which are positive form a pool that can be
distributed among the Level k setup states in the Level k setup class $S_i^k(\sigma_i^{k-1})$ to
overcome the Level k setup change frequency limitations.
There are sufficient Level k - 1 setup changes allocated if, for each Level k - 1 setup
state $\sigma_i^{k-1} \in S_i^{k-1}$, the magnitude of the sum of the Level k deficits $D_{ci}^k(\sigma_i^k, t)$,
$\sigma_i^k \in S_i^k(\sigma_i^{k-1})$, does not exceed the number of required Level k - 1 setup changes into Level
k - 1 setup state $\sigma_i^{k-1}$. This is written
$\sum_{\sigma_i^k \in S_i^k(\sigma_i^{k-1})} D_{ci}^k(\sigma_i^k, t) + \Delta C_{Ci}^{k-1}(\sigma_i^{k-1}) \ge 0, \quad \forall\, \sigma_i^{k-1} \in S_i^{k-1}$    (6.8)
[7] The quantity in the brackets $\lfloor\ \rfloor$ is converted into the next lower integer. For example, the
number $\lfloor -0.05 \rfloor$ is converted to -1.
Only those Level k configurations which satisfy (6.8) are considered in the selection
of the next configuration of Level k Cell c. (Note that an implicit assumption of
this algorithm is that any cost or time difference between multiple options for changing
a configuration from $C_c^k$ to $C_c^{k\prime}$ is insignificant.) If no configurations are valid,
then the configuration calculation is postponed until one does become valid. This
postponement is described in Section 6.7.
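The second stage of the validity test, condition (6.8), can be sketched for a single Level k setup class as follows. This is an illustration rather than the Hiercsim routine: the function name is hypothetical, the deficits are assumed to have been computed already (each entry non-positive), and the pool is assumed to hold only the positive Level k - 1 configuration change for the class, as described in the text.

```c
#include <stddef.h>

/* Returns 1 if the pooled Level k-1 setup changes cover the Level k
 * deficits of one setup class, in the sense of (6.8); returns 0 otherwise.
 * deficit[]   : Level k deficits for the setup states in the class (each <= 0)
 * parent_pool : Level k-1 setup changes into the parent setup state
 *               (assumed non-negative; hypothetical parameter name) */
int class_deficit_covered(const long *deficit, size_t n_states, long parent_pool)
{
    long sum = 0;
    size_t s;

    for (s = 0; s < n_states; s++)
        sum += deficit[s];            /* accumulate the (non-positive) deficits */

    return (sum + parent_pool >= 0) ? 1 : 0;
}
```

A configuration would be accepted for a machine group only if this check passes for every Level k - 1 setup state, and for the cell only if it passes for every machine group.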
6.6.4 Determination of the Best Configuration
Once a list of all valid catalog entries has been determined by the Level k Cell c controller using
the procedure in Section 6.6.3, the controller is able to choose which one to use. This
section describes the criterion used to select an appropriate configuration, the special
modifications to the standard linear program (5.11) used in configuration evaluation,
and the coordination of configuration calculations in a set of Level k component cells
contained in Level k - 1 Cell C.
Criterion The controller evaluates each of the Level k configurations in the list of
valid catalog entries by performing the same calculation procedure on each one. The
one configuration with the lowest overall reduced cost satisfying (5.11) is chosen.
Capacity of a Setup Catalog Configuration The evaluation of a valid Level k
catalog entry is performed using the linear program (5.11) where the capacity set (5.6)
is derived from the proposed Level k configuration $C_c^{k\prime}$ of the catalog entry. During
the evaluation, the full capacity of the Level k configuration $C_c^{k\prime}$ is used, as if Level
k Cell c had already completed the configuration change. Recall from Section 5.4.3
that each constraint row in (5.6) corresponds to a possible Level k setup state $\sigma_i^k$ for
Machine Group i. The number of machines which will be in setup state $\sigma_i^k$ is $C_{ci}^{k\prime}(\sigma_i^k)$.
It is assumed that the time and cost-to-change differences between any two con-
figurations contained in the list of valid Level k catalog entries are insignificant to the
Level k Cell c controller. This assumption allows the controller to ignore transition
dynamics and focus on the relative production capabilities of the configuration in
each of the valid Level k catalog entries. It is a reasonable assumption to use in this
set of preliminary configuration evaluation criteria given the frequency assumptions
of Section 2.5.
A further assumption is that all machines in Level k Cell c are operational. That
is, none of the machines are undergoing a Level k or higher failure. This assumption
simplifies the evaluation calculation because only one Level k capacity set per valid
Level k catalog entry need be used.
If a machine were failed in a Level k or higher failure mode, the Level k Cell c
controller would have to evaluate a Level k catalog entry based on a reduced capacity
of its configuration. This becomes difficult because the Level k flexibility of the cell
permits the controller to compensate for the failure and choose the Level k setup
state that is best able to lose a machine. The subsequent configuration change would
ignore the failed machine until it is repaired, at which point that machine would be
required to alter its setup state to complete the chosen Level k configuration. The
evaluation would require multiple calculations for each catalog entry, leading to a
significant increase in computational effort.
6.6.5 Coordination of Configuration Changes
Configuration changes are complex in part because the setup state of a machine in
one section of a system directly affects the ability of machines in other sections of
the system to produce parts. If the setup states are incompatible, upstream buffers
quickly become filled, upstream machines become blocked (2.20), downstream buffers
become empty, and downstream machines become starved (2.19).
This section details the mechanisms implemented in Hiercsim Version 4.0 which
attempt to solve this coordination problem.
Intra-Cell Coordination It is the responsibility of the user to define Level k
catalog entry configurations which coordinate all machine groups within Level k Cell
c. Once the catalog is read in from the input file, no changes are made to it for
the duration of the simulation. The catalog is the method by which the intra-cell
coordination problem described in Section 5.5.3 is resolved in Hiercsim Version 4.0.
Inter-Cell Coordination It is important to ensure that the Level k process seg-
ments within Level k Cell c are able to perform operations at the rate specified by
the cell's controller over the time period when the new configuration will be in effect.
If the outside Level k system is not taken into account at the time the configuration
is chosen, the Level k entry and exit buffers may fail, leading to the installation of
Level k virtual machine constraints (2.19) and (2.20) and reducing the cell's ability
to perform the operations associated with the limited Level k process segments.
The approach chosen in Hiercsim Version 4.0 to coordinate independent Level k
cells uses a restriction on the sequence of configuration choice calculations within the
Level k component cells of Level k - 1 Cell C and a special application of virtual
machine constraints. The details of the sequence control method are presented in
Section 6.8.
Inter-Cell Coordination Mechanism Consider a system similar to the one shown
in Figure 3-2. Suppose that Level k - 1 Cell C determines the Level k - 1 production
rates $u_j^{k-1}$ for the Level k - 1 Process Segment j. Level k - 1 Cell C contains at least
two Level k component cells, Level k Cell c and Level k Cell c', each of which operates
on a different Level k subset of the operations in Process Segment j. Level k Cell c
provides the Level k production rate $u_{cj}^k$ to its portion of Level k - 1 Process Segment
j while Level k Cell c' provides the Level k production rate $u_{c'j}^k$ to its portion.
Level k Cell c controls the Level k subset of Level k - 1 Process Segment j where
parts enter the manufacturing system. The Level k exit buffer of the Level k Cell c process
segment is the Level k entry buffer into Level k Cell c'. Therefore, Level k Cell c is
upstream of Level k Cell c'.
Since Level k Cell c contains the first operation of Process Segment j, its en-
try buffer can be assumed never to be empty (Section 2.6). Therefore, no Level k
starvation virtual machine constraint (2.19) will limit its production (Section 3.8),
regardless of configuration choice.
In addition, since the controller of Level k Cell c' has not yet chosen the Level k
configuration for Cell c', any existing condition which would lead to blockage of Level
k Process Segment j in Level k Cell c may be removed with a configuration change.
Therefore, any blockage virtual machine constraint (2.20) which exists on Level k
Process Segment j in Level k Cell c is removed for the evaluation of any new Level k
configuration for Level k Cell c.
Virtual Machine Constraint Modifications The sequence of calculations for
the configurations of the rest of the Level k component cells within Level k - 1 Cell C
proceeds downstream in order of operations within the Level k - 1 Process Segment
j. The next Level k controller to choose a configuration is that of Level k Cell
c'. The rules which ensure that upstream cell configurations are determined before
downstream cell configurations are detailed in Section 6.8.
Ideally, the configuration of Level k Cell c' will have a capacity set which will
allow Level k Cell c' to match the production rates of Level k Cell c exactly. However,
there will be differences between the configurations. One of the objectives of inter-cell
coordination is to minimize those differences. One approach used in Hiercsim Version
4.0 to reduce those differences is to activate all starvation virtual machine constraints
(2.19) in Level k Cell c' during the evaluation of potential configurations. However,
instead of using the current values of the upstream Level k production rates $u^k(t)$, the
starvation constraints will be limited by the production rates $u^k(t + \Delta t)$ anticipated
after Level k Cell c completes its transition to the new configuration. All blockage
constraints within Level k Cell c' are removed for the same reason that they were
removed within Level k Cell c.
When the configuration for Level k Cell c' is chosen by its controller, the controller
of the next downstream Level k component cell within Level k - 1 Cell C is cleared to
choose a new configuration. This procedure is continued, moving downstream, until
the controllers of all the Level k component cells within Level k - 1 Cell C have
chosen new configurations.
6.6.6 Supporting Routines
The configuration evaluation calculation for Level k Cell c uses the same routines
as those used for (5.11), which are described in Section 4.8. However, the methods
for computing the values of the Level k capacity set (5.6) and the Level k virtual
machine constraints (2.19) and (2.20) are different from those of a standard production rate
calculation. This section describes those differences and the way they are incorporated
into Hiercsim Version 4.0. The approach taken to accommodate those differences is
designed to reduce the impact on code maintainability.
The section is divided into two parts. The first part describes the pre-processing
routines, which are those routines that place appropriate data for a given configuration
evaluation into the data structures used by the standard algorithm, described in
Section 4.8. The second part describes the post-processing routines, which are those
routines which extract the necessary results and return the standard algorithm data
structures to their original state.
Pre-Processing Steps The number of machines in each setup state is different
for each valid configuration $C_c^{k\prime}$ within Level k Cell c, and the Level k virtual machine
constraints are different from the current status of the Level k buffers bounding the
cell. Therefore, the standard routines for converting the master linear program into
the work linear program (Section 4.6) cannot be used. For this reason, the master
linear program used by the Level k Cell c controller is set aside temporarily and
replaced with a catalog linear program. The catalog linear program is identical in
structure to the master linear program, but the values of the machine availability m
of the catalog linear program change with each configuration being evaluated and are
discarded once the evaluation is completed.
A special routine is used to transfer the necessary constraint rows, rate variable
columns, and target production rates from the catalog linear program to the work
linear program. This routine uses the logic detailed in Section 6.6.4, following a
procedure similar to the one described in Section 4.6. The constraint row which represents
setup state $\sigma_i^k$ of Machine Group i contains $C_{ci}^{k\prime}(\sigma_i^k)$ machines. The limiting rate of
each Level k starvation virtual machine constraint row (2.19) is equal to the value
of the Level k limiting upstream production rate anticipated at time $t + \Delta t$, after
the upstream Level k cell has completed its change. If there is no upstream Level k
cell, then the starvation virtual machine constraint is turned off. All Level k blockage
virtual machine constraint rows are automatically turned off.
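The treatment of the virtual machine constraint rows during a configuration evaluation can be summarized in a short sketch. The structure and function below are hypothetical and are not Hiercsim data structures; they only restate the rules of the preceding paragraph for one process segment.

```c
/* Hypothetical representation of one virtual machine constraint row. */
struct vm_constraint {
    int    active;   /* 1 if the row is imposed during the evaluation */
    double limit;    /* limiting production rate when active */
};

/* Prepare the starvation (2.19) and blockage (2.20) rows of one process
 * segment for the evaluation of a proposed configuration.
 * has_upstream_cell         : nonzero if an upstream Level k cell exists
 * anticipated_upstream_rate : upstream rate expected at time t + dt, after
 *                             the upstream cell has completed its change */
void set_evaluation_constraints(struct vm_constraint *starvation,
                                struct vm_constraint *blockage,
                                int has_upstream_cell,
                                double anticipated_upstream_rate)
{
    /* Starvation is limited by the anticipated upstream rate, or turned
     * off when there is no upstream cell. */
    starvation->active = has_upstream_cell ? 1 : 0;
    starvation->limit  = has_upstream_cell ? anticipated_upstream_rate : 0.0;

    /* Blockage rows are always turned off for the evaluation. */
    blockage->active = 0;
    blockage->limit  = 0.0;
}
```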
The Level k - 1 target production rates $u^{k-1}(t + \Delta t)$ to be used in the attractiveness
conditions for boundaries in Section 4.7.2 are those which are anticipated after the
Level k - 1 configuration change of the Level k - 1 Cell C is complete. If there is no
Level k - 1 configuration change in the Level k - 1 Cell C, then the current target
production rates $u^{k-1}(t)$ are used.
Once the information specific to the proposed configuration $C_c^{k\prime}$ has been transferred
into the catalog linear program, the work linear program is constructed. The
logic of Section 4.6.1 is used to transform the catalog linear program into the work
linear program. Since the logic is identical to that used for the master linear program,
the same routines are used within the simulation. At this point, the work linear pro-
gram contains a set of inequalities and an objective function which can be minimized
by the standard routines described in Section 4.8.
Post-Processing Steps Upon completion of the rate calculations described in Section
4.8, the anticipated rates $u^k(t + \Delta t)$ and $f^k(t + \Delta t)$ for the proposed configuration
$C_c^{k\prime}$ are stored for later use. In addition, the overall total cost of the objective function
in (5.11) using the anticipated rates is computed and stored.
In Hiercsim Version 4.0, the best configuration is the one which has the most
negative cost of the objective function (5.11). To resolve a tie, the controller of Level
k Cell c picks the first configuration in the list of valid configurations with the most
negative overall cost.
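The selection rule, including the tie-break, can be sketched as below. The function name and the array representation of the catalog are assumptions made for illustration; the costs are taken to be the stored objective function values of (5.11) for each catalog entry.

```c
#include <stddef.h>

/* Returns the index of the valid catalog entry with the most negative
 * cost, or -1 if no entry is valid. Because the comparison is strict,
 * a tie is resolved in favor of the entry that appears first in the list.
 * cost[]  : objective function value computed for each catalog entry
 * valid[] : nonzero if the entry passed the validity conditions */
int best_configuration(const double *cost, const int *valid, size_t n)
{
    int best = -1;
    size_t i;

    for (i = 0; i < n; i++) {
        if (!valid[i])
            continue;                 /* non-valid entries are ignored */
        if (best < 0 || cost[i] < cost[best])
            best = (int)i;
    }
    return best;
}
```

A return value of -1 corresponds to the case in which no configuration is valid and the calculation is postponed.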
Once the best new configuration is chosen, the catalog linear program data struc-
ture is discarded and replaced by the original master linear program. If the best
configuration is the same as the current configuration, the cell controller computes
the next time to attempt a configuration change according to the algorithm of Section
6.7. Otherwise, the cell controller initiates a configuration change in the cell. If
there is a Level k - 1 configuration change, the target production and setup change
rates from Level k - 1 Cell C are updated to account for the transition during the
configuration change (Section 6.4.2).
6.7 Choosing the Next Time to Change Configuration
This section describes the algorithm used in Hiercsim Version 4.0 to compute the
time when the controller of Level k Cell c will begin a configuration calculation. The
algorithm is based on the corridor policy found in Sharifnia, Caramanis, and Gershwin
(1989, 1990, and 1991) and Srivatsan and Gershwin (1990) combined with the setup
change staircase policy presented in Section 6.6.2. The algorithm is designed to serve
as a preliminary policy and is not expected to be optimal.
Example simulations showing the performance of the corridor policy in isolation
and the hybrid described in this section are presented in Section 7.6. The corridor
policy is implemented by removing any explicit frequency restrictions on the setup
change variables. Frequency restrictions are implicitly imposed by the capacity set.
Suppose that at time t, the configuration change status of Level k Cell c is changed
to CELL-SETUP-COMPLETE (Section 6.4.3). The controller of Level k Cell c is
required to compute the time $t + \Delta t$, at which it will begin the configuration change
calculation described in Sections 6.6.3 and 6.6.4.
Corridor Policy Hybrid Assuming that Level k Cell c has just completed a configuration
change, time and effort have been invested to permit production of parts
which had not been possible before. For that reason, some production should occur
before another configuration change is considered. The heuristic that is used in Hiercsim
Version 4.0 is to compute the time until the first process segment surplus equals
its hedging point. This time becomes the amount of time that will elapse before the
cell controller can start a new configuration change calculation. It is computed as
$\Delta t_c = \min_{j} \frac{x_j^k - z_j^k}{u_j^{k-1} - u_j^k}, \quad j \in c$    (6.9)
This equation is superseded if none of the Level k production rates $u_j^k$ exceeds
the corresponding Level k - 1 target production rates $u_j^{k-1}$. This situation can arise
if there are Level k virtual machine constraints (2.19) and (2.20) imposed on Level
k Cell c. This combination of production rates prevents production in Level k Cell
c from ever reaching the hedging point. A change of configuration may avoid the
virtual machine constraints, allowing other production targets to be met. In this case,
a configuration change calculation will be performed immediately, so $\Delta t_c = 0$.
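The corridor heuristic can be sketched as follows. The function is hypothetical and makes two assumptions that the text does not state: process segments whose production rate does not exceed the target are skipped in the minimum, and a segment already at or above its hedging point contributes a waiting time of zero. The time for each segment is written as (hedging point - surplus) / (production rate - target rate).

```c
#include <stddef.h>

/* Time until the first process segment surplus reaches its hedging point.
 * Returns 0.0 when no production rate exceeds its target rate, which is
 * the case where the calculation is performed immediately.
 * x[] : current surplus          z[]        : hedging point
 * u[] : Level k production rate  u_target[] : Level k-1 target rate */
double time_to_first_hedging_point(const double *x, const double *z,
                                   const double *u, const double *u_target,
                                   size_t n)
{
    double best = 0.0;
    int found = 0;
    size_t j;

    for (j = 0; j < n; j++) {
        double drift = u[j] - u_target[j];   /* rate at which surplus grows */
        double dt;

        if (drift <= 0.0)
            continue;                        /* assumption: never reaches z */
        dt = (z[j] - x[j]) / drift;
        if (dt < 0.0)
            dt = 0.0;                        /* assumption: already past z */
        if (!found || dt < best) {
            best = dt;
            found = 1;
        }
    }
    return found ? best : 0.0;
}
```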
Setup Staircase Hybrid When the time until the next configuration calculation
is equal to the current time t ($\Delta t_c = 0$), it is possible for the algorithm to fall into
an infinite loop. This arises when the current configuration is the one with the most
negative objective function (Section 6.6.4) and the conditions leading to $\Delta t_c = 0$ are
still present. An initial attempt to prevent this from happening in Hiercsim Version
4.0 is to permit only one configuration calculation at time t.
When the second request for a configuration calculation at time t is made, the
controller of Level k Cell c computes the earliest time until another setup change is
allocated, according to (6.4). If the controller waits until that time to perform its
configuration change calculation, another configuration may become valid (Section 6.6.3)
and replace the current configuration. The time $\Delta t_c$ that will elapse is calculated as:
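$\Delta t_c = \min \left( \lceil N_i^k(\sigma_i^k, t) \rceil - \Delta\mathcal{N}_i^k(\sigma_i^k, t) \right) \quad \forall\, i$    (6.10)
where $\Delta\mathcal{N}_i^k(\sigma_i^k, t)$ is the number of available Level k setup changes into Level k setup
state $\sigma_i^k$, computed using (6.6).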
6.8 Configuration Change Coordination Mechanism
A manufacturing system which has multiple independent cells arranged in a pyramid
hierarchy (Section 3.5.3) must have a coordination mechanism when a configuration
change occurs [8]. Two conditions are imposed in Hiercsim Version 4.0 which comprise
the coordination mechanism when the configuration of Level k- 1 Cell C has changed.
1. The first condition is to ensure that upstream cell configurations are chosen
before downstream cell configurations so that the calculation procedures of Sec-
tion 6.6.4 may be used. The method which is used to satisfy this condition is
described in Section 6.8.1.
2. The second condition is to ensure that no parts are stranded in the system when
a configuration change switches production from one set of parts to another.
The method used in Hiercsim Version 4.0 is described in Section 6.8.2.
When both of these conditions are met, then the configuration change status of
Level k Cell c (contained within Level k-1 Cell C) is changed from CELL-WAITING-
TO-SETUP to CELL-SETTING-UP (Section 6.4.3).
6.8.1 Configuration Calculation Sequencing
The first condition is monitored by checking the configuration change status of each
Level k cell which supplies the Level k process segments of Level k Cell c. Only those
Level k cells which are contained within the same parent cell, Level k - 1 Cell C, as
Level k Cell c are monitored.
[8] The algorithms described in this section have been implemented in Hiercsim Version 4.0 but
have not been tested or debugged.
If any of those cells has a configuration change status of CELL-WAITING-TO-SETUP,
then the configuration calculation in Level k Cell c is postponed. When the status of
every upstream Level k cell has changed out of CELL-WAITING-TO-SETUP, the
first condition preventing the controller of Level k Cell c from starting its configuration
change calculation is lifted.
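The sequencing condition amounts to a scan of the upstream cell statuses. In the sketch below the enumeration spells the status names from the text with underscores, and the function name is hypothetical; it is not the routine used in Hiercsim.

```c
#include <stddef.h>

/* Configuration change statuses named in the text (spelled with
 * underscores here so that they are valid C identifiers). */
enum cell_status {
    CELL_WAITING_TO_SETUP,
    CELL_SETTING_UP,
    CELL_SETUP_COMPLETE
};

/* Returns 1 if the first coordination condition is met, that is, if no
 * upstream Level k cell within the same parent cell is still waiting to
 * choose its configuration; returns 0 if the calculation must be postponed.
 * A cell with no upstream cells (n == 0) is never held back. */
int upstream_cells_ready(const enum cell_status *upstream, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (upstream[i] == CELL_WAITING_TO_SETUP)
            return 0;
    }
    return 1;
}
```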
6.8.2 Part Processing Completion
This section describes cell-to-cell communication required to ensure that no parts
are stranded. The analogous cell-to-machine group communication is described in
Section 6.8.3.
The controller of Level k - 1 Cell C must ensure that all parts already in the cell
can be processed when a Level k - 1 configuration change is initiated. The method
used in Hiercsim Version 4.0 is based on creating a signal which tells Level k Cell c [9]
when there are no more parts in Process Segment j that may become stranded.
The signal for Process Segment j tells the Level k controller that all parts subject
to the Level k - 1 target rates $u^{k-1}(t)$ under the previous Level k - 1 configuration
have been cleared for processing in Level k Cell c. All subsequent parts cleared for
processing will be subject to the Level k - 1 target rates $u^{k-1}(t + t')$, computed for
the transition from the previous to the new Level k - 1 configuration (Section 6.4.2).
Level k Cell c is prevented from changing configuration until a signal from each of its
Level k process segments has been received.
Signal Generation Consider Level k - 1 Cell C which contains only Level k com-
ponent cells and therefore does not control any machine groups directly. When the
controller of Level k - 1 Cell C has changed its configuration, each of the Level k
component cells must also change configuration to be consistent with the new Level
k - 1 configuration.
The configuration change status of each of the Level k component cells is changed
to CELL-WAITING-TO-SETUP. Before the Level k component cells can change
configurations, it must be ensured that all parts which were cleared for processing under
the previous Level k - 1 configuration using the staircase policy of Section 3.9 can
complete their processing.
[9] The relationship between Level k - 1 Cell C and its component cell, Level k Cell c, is used as
an example for the relationship between Level k - 1 Cell C and all of its Level k component cells.
The controller of Level k - 1 Cell C examines Level k - 1 Process Segment j.
Starting from the Level k - 1 entry buffer, each step of the Level k - 1 process
segment is looked at in turn. The first part that is encountered is marked with a
Level k - 1 tag. That part is the last part which was loaded into Level k - 1 Cell C
under the rate $u^{k-1}(t)$ determined by the controller using the previous configuration.
When there are no parts in the process segment, the procedure stops upon reaching
the exit Level k - 1 buffer, and the process segment is declared to be free of parts.
This marking of parts is repeated for each of the remaining Level k - 1 process
segments in Level k - 1 Cell C.
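The marking scan for one process segment can be sketched as below. The array layout is an assumption made for illustration: index 0 stands for the Level k - 1 entry buffer, the last index for the exit buffer, and each entry holds the number of parts at that step. The function name is hypothetical.

```c
#include <stddef.h>

/* Scans a process segment from its entry buffer toward its exit buffer
 * and returns the index of the first step that holds a part; that part
 * is the one to be marked with the Level k-1 tag. Returns -1 when the
 * scan reaches the exit buffer without finding a part, in which case
 * the process segment is free of parts. */
int find_part_to_tag(const int *parts_at_step, size_t n_steps)
{
    size_t s;

    for (s = 0; s < n_steps; s++) {
        if (parts_at_step[s] > 0)
            return (int)s;
    }
    return -1;
}
```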
Signal Detection and Bookkeeping Consider Level k Cell c which has a configuration
change status of CELL-WAITING-TO-SETUP due to a Level k - 1 configuration
change in its parent cell, Level k - 1 Cell C. The controller of Level k Cell c is
presently operating under the Level k - 1 target production rates $u^{k-1}(t)$ determined
using the previous Level k - 1 configuration and the linear program (5.11).
The controller of Level k Cell c monitors the entry buffer to Process Segment
j. As each part in that buffer is cleared for processing at the operation level of the
hierarchy, according to the staircase policy of Section 3.9, it is examined for a Level
k - 1 tag. If such a tag is found, then the Level k Process Segment j is checked off.
Three cases exist which do not require a part to be cleared for processing in order
to check off Process Segment j. They are:
1. If the Level k - 1 target production rate $u_j^{k-1}(t)$ for Level k Process Segment
j is zero under the previous Level k - 1 configuration, the process segment is
automatically checked off.
2. If the part in Process Segment j that was marked by the controller of Level
k - 1 Cell C has already been processed by Level k Cell c, then the process
segment is automatically checked off.
3. If no part in Process Segment j was marked by the controller of Level k - 1
Cell C, then the process segment is automatically checked off.
Each of the other Level k process segments is monitored in the same fashion by
the controller of Level k Cell c.
When all process segments in Level k Cell c are checked off, there are no more
parts subject to the previous Level k - 1 configuration that are waiting to be loaded
into Level k Cell c. When this occurs, the second condition required to change the
configuration change status to CELL-SETTING-UP is met.
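The check-off bookkeeping above can be summarized in a short Python sketch. The names `SegmentStatus`, `apply_automatic_cases`, `on_part_cleared` and `all_segments_checked_off` are hypothetical and are not taken from Hiercsim; the sketch only mirrors the three automatic cases and the tag test described in the text.

```python
from dataclasses import dataclass


@dataclass
class SegmentStatus:
    """Hypothetical bookkeeping record for one Level k process segment."""
    prev_target_rate: float            # Level k-1 rate, previous configuration
    marked_part: object = None         # part tagged by the parent cell, if any
    marked_part_processed: bool = False
    checked_off: bool = False


def apply_automatic_cases(seg):
    """Check off a segment that needs no part to be cleared."""
    if (seg.prev_target_rate == 0          # case 1: zero target rate
            or seg.marked_part_processed   # case 2: tagged part already done
            or seg.marked_part is None):   # case 3: no part was tagged
        seg.checked_off = True


def on_part_cleared(seg, part):
    """Called when a part in the entry buffer is cleared for processing."""
    if seg.marked_part is not None and part is seg.marked_part:
        seg.checked_off = True


def all_segments_checked_off(segments):
    """True when no parts subject to the previous configuration remain."""
    return all(seg.checked_off for seg in segments)


# Hypothetical cell with two segments; only the second has a tagged part.
tagged = object()
segments = [SegmentStatus(0.0), SegmentStatus(2.0, marked_part=tagged)]
for seg in segments:
    apply_automatic_cases(seg)
```

In this example the first segment is checked off immediately because its previous target rate is zero, while the second waits until the tagged part is cleared.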
6.8.3 Machine Group Configuration Change Activation
The procedure for signaling a machine group to change its configuration is similar to
the procedure used to signal a cell, described in Section 6.8.2. The new configuration
for Machine Group i is specified by the operation level cell which contains the machine
group. This eliminates the need for the controller of Machine Group i to coordinate
with other machine groups in the same cell. However, it is still possible to strand
parts if downstream machine groups change configurations before upstream machine
groups.
Consider Level k Cell c whose control level corresponds to the operation level of
the manufacturing system. This cell is at the low level interface of the hierarchical
controller as defined in Section 3.5. Machine Group i is contained in Level k Cell c.
When the configuration change status of Level k Cell c is changed to CELL-
SETTING-UP, the cell converts the configuration change status of each of its machine
groups to MCG-WAITING-TO-SETUP (Section 6.4.5). The controller of Level k Cell
c places a Level k tag on the last part loaded under the previous configuration, in the
same way that is described in Section 6.8.2.
As soon as Machine Group i completes an operation on a part, it examines the
part for a Level k tag. If the tag exists, the operation which was just completed
is checked off. Once a Level k tag is detected for each operation that is active (an
active operation is one that has a positive Level k production rate under the previous
configuration)¹⁰, the configuration change status of Machine Group i is changed to
MCG-SETTING-UP.
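A minimal Python sketch of this machine group signal follows. The class `MachineGroupSignal` and its method names are hypothetical; the sketch assumes that the previous Level k production rates are available as a mapping from operation to rate and that a part carries a set of tags.

```python
class MachineGroupSignal:
    """Hypothetical tracker for one machine group's configuration change."""

    def __init__(self, prev_rates):
        # Only active operations (positive previous rate) must be checked
        # off; inactive operations are treated as checked off from the start.
        self.pending = {op for op, rate in prev_rates.items() if rate > 0}
        self.status = "MCG-WAITING-TO-SETUP"
        self._update()

    def operation_completed(self, op, part_tags, tag):
        """Examine the part just finished by operation `op` for the tag."""
        if tag in part_tags:
            self.pending.discard(op)
        self._update()

    def _update(self):
        if not self.pending:
            self.status = "MCG-SETTING-UP"


# Hypothetical group with two active operations and one inactive operation.
group = MachineGroupSignal({"op1": 1.5, "op2": 0.0, "op3": 0.4})
```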
6.9 Machine Setup Assignment Algorithm
This section describes the procedure which is used to assign specific setup changes to
specific machines in a machine group in Hiercsim Version 4.0.
A machine group controller can only assign setup changes to machines in
terms of an operation level configuration. High level setup changes must be resolved
into appropriate factory level setup changes. The resolution is specified according to
the algorithm described in Section 6.6.4. This section describes how a machine group
controller translates the hierarchical controller configuration change specifications into
specific setup changes on specific machines¹¹.
Section 6.9.1 defines quantities which are used to track setup requirements and
assignments over the course of the algorithm. Section 6.9.2 describes the machine
group setup assignment algorithm and includes an illustrative example.
6.9.1 Notation for Assignment Algorithm
The quantities defined in this section are summarized in Section 5.2.2 for ready ref-
erence.
Configuration Change Vector Consider Level k Cell c which contains Machine
Group i. Recall from Section 6.3 that the configuration change vector, $\Delta C_i^k(\sigma_i^k)$,
is the number of machines that will change into or out of Level k setup state $\sigma_i^k$
during the current configuration change (6.1). Each setup state in the setup tree for
Machine Group i (Section 5.2.1) at each control level in the hierarchy is represented
by a unique element of a configuration change vector.
¹⁰ All inactive operations are automatically checked off because there are no parts to strand.
¹¹ The algorithms described in this section have been implemented in Hiercsim Version 4.0 but
have not been tested or debugged.
The configuration change vector is fixed by the time the factory controller of
Machine Group i (Section 3.10) receives the command to assign setup changes to
specific machines. Figure 6-1 shows an example of a setup tree for Machine Group i.
In the example, Machine Group i contains two machines. The setup tree shows how
each of the machines is set up now, and what the new setup states of the machines
will be after the setup change is complete.
Net Number of Setup Assignments The net number of Level k assignments
made in this call of the setup assignment algorithm is kept in the vector, $\Delta n_i^k$. The
controller of Machine Group i has one such vector for each level in the hierarchy.
These vectors are only visible to the machine group controller. The elements of these
vectors are initialized to zero at the start of the setup change algorithm.
The element, $\Delta n_i^k(\sigma_i^k)$, is a running variable that is incremented each time a
machine is assigned a setup change into setup state $\sigma_i^k$ and is decremented each
time a machine is assigned a setup change out of setup state $\sigma_i^k$. When the setup
assignment algorithm is complete, the value in this element will be equal to the value
of the element $\Delta C_i^k(\sigma_i^k)$.
All other setup counting variables are derived from $\Delta C_i^k(\sigma_i^k)$ and $\Delta n_i^k(\sigma_i^k)$ for all
setup states at all control levels in the hierarchy affected by the configuration change.
Remaining Setup Changes for a Setup State $\Delta m_i^k(\sigma_i^k)$ is a running variable.
It is equal to the number of machines that remain to switch into or out of Level k
setup state $\sigma_i^k$ in Machine Group i. It is computed from the configuration change
vector and the net number of setup assignments vector as follows:
$$\Delta m_i^k(\sigma_i^k) = \Delta C_i^k(\sigma_i^k) - \Delta n_i^k(\sigma_i^k), \quad \sigma_i^k \in S_i^k, \ \forall k \qquad (6.11)$$
This number may be positive or negative. A positive number indicates that
$\Delta m_i^k(\sigma_i^k)$ machines must still change into Level k setup state $\sigma_i^k$ during this
configuration change. A negative number indicates that $|\Delta m_i^k(\sigma_i^k)|$ machines must still
change out of Level k setup state $\sigma_i^k$ during this configuration change.
Figure 6-1: Original and New Configurations
Setup Changes into a Setup Class $\Delta m^{+}(S_i^k)$ is a running variable. It is equal
to the remaining number of machines that must switch into Level k setup class $S_i^k$
from another class. This number is always positive or zero. It is computed from $\Delta m_i^k$
as follows:
$$\Delta m^{+}(S_i^k) = \sum_{\sigma_i^k \in S_i^k,\ \Delta m_i^k(\sigma_i^k) > 0} \Delta m_i^k(\sigma_i^k) \qquad (6.12)$$
Setup Changes out of a Setup Class $\Delta m^{-}(S_i^k)$ is a running variable. It is equal
to the remaining number of machines that must switch out of Level k setup class $S_i^k$
during the current configuration change. This number is always positive or zero. It
is computed from $\Delta m_i^k$ as follows:
$$\Delta m^{-}(S_i^k) = -\sum_{\sigma_i^k \in S_i^k,\ \Delta m_i^k(\sigma_i^k) < 0} \Delta m_i^k(\sigma_i^k) \qquad (6.13)$$
Total Matches within a Setup Class Consider a machine that is set up in a
setup state with a negative value of $\Delta m_i^k(\sigma_i^k)$ where $\sigma_i^k \in S_i^k$. Suppose that another
setup state $\sigma_i^{k\prime} \in S_i^k$ has a positive value of $\Delta m_i^k(\sigma_i^{k\prime})$. It is possible to satisfy the
requirement to switch out of $\sigma_i^k$ and the requirement to switch into $\sigma_i^{k\prime}$ by assigning a
setup change into $\sigma_i^{k\prime}$ to the machine. This setup assignment requires a Level k setup
change and is an example of a match within the setup class $S_i^k$.
There is one Level k setup change match for each pair of setup states in which one
setup state has a positive value of its corresponding element in the vector $\Delta m_i^k$ and
the other has a negative value of its corresponding element in the vector $\Delta m_i^k$. The
total number of setup matches $\Delta m_{tot}(S_i^k(\sigma_i^{k-1}))$ that can be made within the Level k
setup class $S_i^k(\sigma_i^{k-1})$ is
$$\Delta m_{tot}(S_i^k(\sigma_i^{k-1})) = \min\left[\Delta m^{+}(S_i^k(\sigma_i^{k-1})),\ \Delta m^{-}(S_i^k(\sigma_i^{k-1}))\right] \qquad (6.14)$$
This is based on the assumption that setup changes can only occur within a setup
class under a common Level k - 1 setup state.
Excess Setup Changes within a Setup Class The total number of unmatched
Level k setup change requirements $\Delta m_{excess}(S_i^k(\sigma_i^{k-1}))$ in Level k setup class $S_i^k(\sigma_i^{k-1})$
is
$$\Delta m_{excess}(S_i^k(\sigma_i^{k-1})) = \mathrm{abs}\left[\Delta m^{+}(S_i^k(\sigma_i^{k-1})) - \Delta m^{-}(S_i^k(\sigma_i^{k-1}))\right] \qquad (6.15)$$
The value of $\Delta m_{excess}(S_i^k(\sigma_i^{k-1}))$ equals the number of Level k - 1 or higher setup
changes that are required to complete the current Level k configuration change. Recall
from Section 5.2.1 that each Level k - 1 setup state $\sigma_i^{k-1}$ must be refined into one of
the Level k setup states in setup class $S_i^k(\sigma_i^{k-1})$.
A higher level setup change will add to the total number of machines that are set
up in the setup class $S_i^k(\sigma_i^{k-1})$ if
$$\Delta m_{excess}(S_i^k(\sigma_i^{k-1})) > 0$$
Conversely, a higher level setup change will take away from the total number of
machines that are set up in the setup class $S_i^k(\sigma_i^{k-1})$ if
$$\Delta m_{excess}(S_i^k(\sigma_i^{k-1})) < 0$$
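The counting variables of (6.11) through (6.15) can be illustrated with a short Python sketch. The function name `class_counters` and the dictionary representation of the vectors are assumptions made for this example only. The input values are illustrative: one setup class with two setup states, where state "1" is to gain two machines and state "2" is to lose one.

```python
def class_counters(delta_c, delta_n, setup_class):
    """Compute the running variables for one setup class.

    delta_c, delta_n: dicts mapping setup state -> integer
    setup_class:      iterable of the setup states in the class
    """
    # (6.11) remaining changes into (+) or out of (-) each setup state
    dm = {s: delta_c[s] - delta_n.get(s, 0) for s in setup_class}
    # (6.12) remaining changes into the class
    m_plus = sum(v for v in dm.values() if v > 0)
    # (6.13) remaining changes out of the class
    m_minus = -sum(v for v in dm.values() if v < 0)
    # (6.14) matches that can be made inside the class
    m_tot = min(m_plus, m_minus)
    # (6.15) unmatched requirements, left for higher level setup changes
    m_excess = abs(m_plus - m_minus)
    return dm, m_plus, m_minus, m_tot, m_excess


# Illustrative class: state "1" gains two machines, state "2" loses one.
delta_c = {"1": 2, "2": -1}
delta_n = {"1": 0, "2": 0}      # no assignments made yet
dm, m_plus, m_minus, m_tot, m_excess = class_counters(
    delta_c, delta_n, ["1", "2"])
```

With these inputs one match can be made inside the class and one requirement is left unmatched.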
6.9.2 Setup Assignment Algorithm Description
This section describes the algorithm used by the controller of Machine Group i to
assign setup changes to machines during a configuration change. The basic function
of the setup assignment algorithm is to find a machine which is set up in a setup
state that is to lose a machine and tell the machine to change into a setup state that
is to gain a machine. The setup assignments are made consistent with the set of
assumptions defined in Section 5.2.1.
The setup assignment algorithm is performed once for each machine group that is
affected by the current configuration change. The algorithm assigns as many low level
setup changes as possible before assigning higher level setup changes. Therefore, the
algorithm begins at the low level interface of the hierarchy and works up the hierarchy
one level at a time to the setup change level.
The algorithm is illustrated by a simple example which accompanies this de-
scription. A sample simulation is presented in Section 7.6 which shows some results
obtained using Hiercsim Version 4.0.
Assumptions Three assumptions are used in the machine group setup change as-
signment algorithm. They are:
1. Setup changes can only be assigned between two setup states in the same class
with a common parent setup state.
2. The cost to change setups within a given setup class is approximately the same
for all combinations of two setup states in that class.
3. The time required to perform a setup change within a given setup class is
approximately the same for all combinations of two setup states in that class.
Example System Consider Machine Group i which has two machines. Both ma-
chines are of the same type and may be set up independently of each other. Level k
Cell c contains the machine group and specifies Level k setup states. Level k + 1 Cell
c* also contains the machine group and specifies Level k + 1 setup states.
The setup tree for that type of machine is shown in Figure 6-1. The setup tree
consists of two Level k setup states, $1_i^k$ and $2_i^k$. Level k setup state $1_i^k$ can be further
refined into one of two Level k + 1 setup states, whereas there is only one possible
Level k + 1 refinement of Level k setup state $2_i^k$.
Suppose that the configuration change status of Machine Group i is changed to
MCG-SETTING-UP at time t (Section 6.8.3) in response to the configuration change
represented in Figure 6-1. The original configuration of the machine group has one
machine in the Level k + 1 setup state $1_i^{k+1} \in S_i^{k+1}(2_i^k)$ and the other machine in
the Level k + 1 setup state $2_i^{k+1} \in S_i^{k+1}(1_i^k)$. The configuration change requires
both machines to change into the Level k + 1 setup state $1_i^{k+1} \in S_i^{k+1}(1_i^k)$. This
configuration change requires one major setup change, and one minor setup change.
The machine group setup assignment algorithm is started at time t.
Initialize the Running Variables For each Level k setup state $\sigma_i^k$ in the Level k
setup class $S_i^k(\sigma_i^{k-1})$, the value of $\Delta n_i^k(\sigma_i^k)$ is set to zero because no setups have been
assigned. The values of the running variables $\Delta m_i^k(\sigma_i^k)$, $\Delta m^{+}(S_i^k(\sigma_i^{k-1}))$,
$\Delta m^{-}(S_i^k(\sigma_i^{k-1}))$, $\Delta m_{tot}(S_i^k(\sigma_i^{k-1}))$ and $\Delta m_{excess}(S_i^k(\sigma_i^{k-1}))$ are each initialized using
(6.11), (6.12), (6.13), (6.14), and (6.15). This initialization is repeated at each level
in the hierarchy that is affected by the configuration change.
Stage 1 in Figure 6-2 shows the initial state of the running variables used in the
machine group setup assignment algorithm. For example, the Level k + 1 setup state
$1_i^{k+1} \in S_i^{k+1}(1_i^k)$ will gain two machines in this configuration change so
$$\Delta n_i^{k+1}(1_i^{k+1}) = 0$$
$$\Delta m_i^{k+1}(1_i^{k+1}) = 2$$
The Level k + 1 setup class $S_i^{k+1}(1_i^k)$ will gain only one machine and will experience
one internal setup change. Its initial values are therefore
$$\Delta m^{+}(S_i^{k+1}(1_i^k)) = 2$$
$$\Delta m^{-}(S_i^{k+1}(1_i^k)) = 1$$
$$\Delta m_{tot}(S_i^{k+1}(1_i^k)) = 1$$
$$\Delta m_{excess}(S_i^{k+1}(1_i^k)) = 1$$
Identify Setup Assignments within a Setup Class Setup assignments are only
performed within a setup class whose setups have a common parent setup state.
Consider the Level k setup class, $S_i^k(\sigma_i^{k-1})$. This stage of the algorithm identifies two
Level k setup states $\sigma_{gain}^k$ and $\sigma_{lose}^k$ within that class.
Such a pair of setup states exists if
$$\Delta m_{tot}(S_i^k(\sigma_i^{k-1})) > 0$$
Figure 6-2: Stage 1 and 2 of Setup Assignment
Figure 6-3: Stage 3 of Setup Assignment
The setup state $\sigma_{gain}^k$ is a setup state such that the value
$$\Delta m_i^k(\sigma_{gain}^k) > 0$$
Likewise, the setup state $\sigma_{lose}^k$ is a setup state such that the value
$$\Delta m_i^k(\sigma_{lose}^k) < 0$$
This implies that there is a machine which is set up in Level k setup state $\sigma_{lose}^k$
and can be assigned a setup change into Level k setup state $\sigma_{gain}^k$.
Once the two setup states $\sigma_{gain}^k$ and $\sigma_{lose}^k$ have been identified, the running variables
are updated as follows:
$$\begin{aligned}
\Delta n_i^k(\sigma_{gain}^k) &\leftarrow \Delta n_i^k(\sigma_{gain}^k) + 1 \\
\Delta m_i^k(\sigma_{gain}^k) &\leftarrow \Delta m_i^k(\sigma_{gain}^k) - 1 \\
\Delta n_i^k(\sigma_{lose}^k) &\leftarrow \Delta n_i^k(\sigma_{lose}^k) - 1 \\
\Delta m_i^k(\sigma_{lose}^k) &\leftarrow \Delta m_i^k(\sigma_{lose}^k) + 1 \\
\Delta m^{+}(S_i^k(\sigma_i^{k-1})) &\leftarrow \Delta m^{+}(S_i^k(\sigma_i^{k-1})) - 1 \\
\Delta m^{-}(S_i^k(\sigma_i^{k-1})) &\leftarrow \Delta m^{-}(S_i^k(\sigma_i^{k-1})) - 1 \\
\Delta m_{tot}(S_i^k(\sigma_i^{k-1})) &\leftarrow \Delta m_{tot}(S_i^k(\sigma_i^{k-1})) - 1
\end{aligned} \qquad (6.16)$$
according to (6.11), (6.12), (6.13), and (6.14). The value of $\Delta m_{excess}(S_i^k(\sigma_i^{k-1}))$ does
not change with a Level k setup assignment (6.15).
At this point, a Level k allocated setup change has been used. The setup change
allocation control variables ANk(ain, t) and AAk( ain, t) are updated as follows:
( O ain, t) /k( ain, t) + 1
A'k (k (6.17)A.Vkain k (aint) - 1
according to (6.6).
Stage 2 of Figure 6-2 shows the values of the running variables after a Level k + 1
setup assignment has been made within the Level k + 1 setup class $S_i^{k+1}(1_i^k)$.
Stage 3 of Figure 6-3 shows the values of the running variables after a Level k
setup assignment has been made within the Level k setup class $S_i^k(1_i^{k-1})$. Note how
the Level k + 1 values of the running variables are changed when a Level k setup
assignment occurs. These changes are described next.
Refine Level k Setup State $\sigma_{gain}^k$ When Level k is not the low level interface between
the hierarchical controller and the factory, the Level k setup change assignment
is not specified in sufficient detail for the machine to perform operations. Therefore,
Level k + 1 and lower refinements to the Level k setup change must be made.
Since a Level k setup change has been assigned, the value of $\Delta m_{excess}(S_i^{k+1}(\sigma_{gain}^k))$
is
$$\Delta m_{excess}(S_i^{k+1}(\sigma_{gain}^k)) > 0$$
A Level k + 1 setup state $\sigma_{gain}^{k+1} \in S_i^{k+1}(\sigma_{gain}^k)$ is located such that
$$\Delta m_i^{k+1}(\sigma_{gain}^{k+1}) > 0$$
This Level k + 1 setup state becomes the Level k + 1 refinement to Level k setup
state $\sigma_{gain}^k$. The values of the Level k + 1 running variables are updated as follows:
$$\begin{aligned}
\Delta n_i^{k+1}(\sigma_{gain}^{k+1}) &\leftarrow \Delta n_i^{k+1}(\sigma_{gain}^{k+1}) + 1 \\
\Delta m_i^{k+1}(\sigma_{gain}^{k+1}) &\leftarrow \Delta m_i^{k+1}(\sigma_{gain}^{k+1}) - 1 \\
\Delta m^{+}(S_i^{k+1}(\sigma_{gain}^k)) &\leftarrow \Delta m^{+}(S_i^{k+1}(\sigma_{gain}^k)) - 1 \\
\Delta m_{excess}(S_i^{k+1}(\sigma_{gain}^k)) &\leftarrow \Delta m_{excess}(S_i^{k+1}(\sigma_{gain}^k)) - 1
\end{aligned} \qquad (6.18)$$
according to (6.11), (6.12), and (6.15). The value of $\Delta m_{tot}(S_i^{k+1}(\sigma_{gain}^k))$ does not
change with a Level k or higher setup assignment (6.14).
Since resource time has already been allocated for the Level k setup change by
the Level k controller, the Level k + 1 setup change control variables Afkt+l ( ) and
A/k, l(gain) are not changed.
This procedure is followed down the hierarchy to the low level interface between
the hierarchical controller and the factory.
Refine Level k Setup State $\sigma_{lose}^k$ A similar procedure is followed to refine the
Level k setup state into a low level interface setup state.
Since a Level k setup change has been assigned, the value of $\Delta m_{excess}(S_i^{k+1}(\sigma_{lose}^k))$
is
$$\Delta m_{excess}(S_i^{k+1}(\sigma_{lose}^k)) < 0$$
A Level k + 1 setup state $\sigma_{lose}^{k+1} \in S_i^{k+1}(\sigma_{lose}^k)$ is located such that
$$\Delta m_i^{k+1}(\sigma_{lose}^{k+1}) < 0$$
This Level k + 1 setup state becomes the Level k + 1 refinement to Level k setup
state $\sigma_{lose}^k$. The values of the Level k + 1 running variables are updated as follows:
$$\begin{aligned}
\Delta n_i^{k+1}(\sigma_{lose}^{k+1}) &\leftarrow \Delta n_i^{k+1}(\sigma_{lose}^{k+1}) - 1 \\
\Delta m_i^{k+1}(\sigma_{lose}^{k+1}) &\leftarrow \Delta m_i^{k+1}(\sigma_{lose}^{k+1}) + 1 \\
\Delta m^{-}(S_i^{k+1}(\sigma_{lose}^k)) &\leftarrow \Delta m^{-}(S_i^{k+1}(\sigma_{lose}^k)) - 1 \\
\Delta m_{excess}(S_i^{k+1}(\sigma_{lose}^k)) &\leftarrow \Delta m_{excess}(S_i^{k+1}(\sigma_{lose}^k)) - 1
\end{aligned} \qquad (6.19)$$
according to (6.11), (6.13), and (6.15). The value of $\Delta m_{tot}(S_i^{k+1}(\sigma_{lose}^k))$ does not
change with a Level k or higher setup assignment (6.14).
This procedure is followed down the hierarchy to the low level interface between
the hierarchical controller and the factory.
Machine Setup Change Assignment When the low level interface versions of
the Level k setup states $\sigma_{gain}^k$ and $\sigma_{lose}^k$ are determined, the controller of Machine
Group i locates a machine which is currently set up in the low level version of Level
k setup state $\sigma_{lose}^k$ and whose setup change status is SETUP-COMPLETE. The controller
changes that machine's setup change status to WAITING-TO-SETUP and
assigns the setup change to the machine. A machine whose setup change status is
WAITING-TO-SETUP will begin its setup change according to the procedure described
in Section 6.10.
Continuation of the Assignment Algorithm When the setup change assignment
is given to a machine, this process is repeated until $\Delta m_{tot}(S_i^k(\sigma_i^{k-1})) = 0$, at
which point, the algorithm moves to the next Level k setup class. When all Level
k setup assignments in all Level k setup classes have been completed, the algorithm
proceeds to Level k - 1.
The algorithm proceeds up the hierarchy until the level at which the setup change
originated is reached.
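The matching loop for a single setup class can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the Hiercsim implementation: it handles one setup class at one level, ignores refinement to lower levels and the setup change allocation variables, and the function name `match_within_class` is hypothetical.

```python
def match_within_class(delta_c, setup_class):
    """Pair setup states that must lose a machine with states that must
    gain one, until no further match exists inside the class.

    Returns (assignments, delta_n, leftover) where assignments is a list of
    (lose_state, gain_state) pairs and leftover is the final delta-m vector.
    """
    delta_n = {s: 0 for s in setup_class}
    assignments = []
    while True:
        dm = {s: delta_c[s] - delta_n[s] for s in setup_class}
        gain = next((s for s in setup_class if dm[s] > 0), None)
        lose = next((s for s in setup_class if dm[s] < 0), None)
        if gain is None or lose is None:
            break                      # total matches remaining is zero
        delta_n[gain] += 1             # one assignment into the gain state
        delta_n[lose] -= 1             # one assignment out of the lose state
        assignments.append((lose, gain))
    leftover = {s: delta_c[s] - delta_n[s] for s in setup_class}
    return assignments, delta_n, leftover


# Illustrative class: state "1" gains two machines, state "2" loses one.
assignments, delta_n, leftover = match_within_class(
    {"1": 2, "2": -1}, ["1", "2"])
```

In this example one match is made, and the single requirement left in state "1" would have to be satisfied by a higher level setup change.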
6.10 Machine Setup Change Procedure
This section describes the model of the factory and how it performs a setup change
on a specific machine. The two essential parts to this model are the setup initiation
and the setup completion. The outcome of a setup change depends on the conditions
present when either event occurs.
6.10.1 Machine Setup Initiation
When the setup change status of a machine is changed to WAITING-TO-SETUP,
the machine may be in the middle of another activity. Therefore, the machine does
not initiate the setup change until it becomes IDLE (Section 4.3.1).
If the machine is IDLE at the time of the setup assignment, then the setup change
is initiated immediately. If the machine is PROCESSING a part at the time of the
setup change assignment, then the machine waits until the part is finished before the
setup change is started.
As soon as the setup change is initiated at a machine, the time to complete the
setup change is computed. A setup completion event is placed in the event queue
(Section 4.1.3) and is scheduled to occur after the setup change time has elapsed.
The value of the highest control level which sees a change in the setup state of the
machine is stored so that the machine may report back to the correct controller when
the change is completed.
Before the simulation is allowed to proceed, the machine setup change status is
changed to SETTING-UP, and the machine is removed from the list of active machines
within the group. The machine does not have a setup state during a setup change.
However, both the new setup state and the old setup state are stored for possible
later use.
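The initiation logic can be sketched as a small event-driven Python example. All names here (`Machine`, `assign_setup`, `part_finished`, `initiate_setup`, and the tuple layout of the event queue) are hypothetical and chosen for illustration; the removal of the machine from the group's list of active machines is omitted for brevity.

```python
import heapq


class Machine:
    """Hypothetical machine record for the setup change model."""

    def __init__(self, name, setup_state):
        self.name = name
        self.activity = "IDLE"               # or "PROCESSING"
        self.setup_status = "SETUP-COMPLETE"
        self.setup_state = setup_state
        self.old_state = None
        self.new_state = None
        self.setup_time = None
        self.report_level = None


def assign_setup(machine, new_state, setup_time, report_level, now, events):
    """Record the assignment; start at once only if the machine is idle."""
    machine.setup_status = "WAITING-TO-SETUP"
    machine.new_state = new_state
    machine.setup_time = setup_time
    machine.report_level = report_level      # highest level seeing the change
    if machine.activity == "IDLE":
        initiate_setup(machine, now, events)


def part_finished(machine, now, events):
    """A busy machine starts its pending setup when its part is done."""
    machine.activity = "IDLE"
    if machine.setup_status == "WAITING-TO-SETUP":
        initiate_setup(machine, now, events)


def initiate_setup(machine, now, events):
    """Schedule the completion event and enter the SETTING-UP status."""
    heapq.heappush(
        events, (now + machine.setup_time, machine.name, "SETUP-COMPLETION"))
    machine.old_state = machine.setup_state  # kept for possible later use
    machine.setup_state = None               # no setup state during a change
    machine.setup_status = "SETTING-UP"


events = []
idle = Machine("A", "state 1")
busy = Machine("B", "state 2")
busy.activity = "PROCESSING"
assign_setup(idle, "state 3", 5.0, 2, now=10.0, events=events)
assign_setup(busy, "state 3", 5.0, 2, now=10.0, events=events)
```

Here machine A starts its setup change immediately, while machine B remains in WAITING-TO-SETUP until `part_finished` is called for it.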
6.10.2 Setup Completion
The machine setup change status may be changed from SETTING-UP to SETUP-
COMPLETE in one of two ways. The setup change may be completed normally, in
which case the machine's setup state becomes the requested setup state.
The setup request may be terminated due to a configuration change initiated at
an even higher control level than the one which originated the current configuration
change. In that case, the machine reverts back to the setup state it was in immediately
before the setup change was started. This does not necessarily accurately model a
factory. However, some policy is required to manage a setup change termination. In
the future, a better policy may be implemented.
The machine uses the setup change level recorded earlier to report back to the
correct control level in the hierarchy. The controller at that level will use the new
information to recompute production rates, taking into account the new setup state
of the machine.
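The two ways of leaving the SETTING-UP status can be expressed in a few lines of Python. The function `finish_setup` and the dictionary keys are hypothetical names for this sketch; the reversion to the old setup state on termination follows the policy stated above.

```python
def finish_setup(machine, terminated):
    """Leave SETTING-UP and return the control level to report back to.

    machine: dict with keys 'old_state', 'new_state', 'report_level'
    terminated: True if a higher level configuration change cut the
                setup change short
    """
    if terminated:
        machine["setup_state"] = machine["old_state"]   # revert
    else:
        machine["setup_state"] = machine["new_state"]   # normal completion
    machine["setup_status"] = "SETUP-COMPLETE"
    return machine["report_level"]


normal = {"old_state": "state 1", "new_state": "state 3", "report_level": 2}
cut_short = {"old_state": "state 2", "new_state": "state 3", "report_level": 2}
level_a = finish_setup(normal, terminated=False)
level_b = finish_setup(cut_short, terminated=True)
```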
Chapter 7
Simulation Experiments
7.1 Overview of Simulations
This chapter contains a series of simulations which demonstrate some of the concepts
described in this thesis. All results were obtained by using Hiercsim Versions 3.5 and
4.0.
Each simulation shows the response of a system to a set of external target produc-
tion rates and internal events, such as failures of machines. Some of the simulations
are used to show general responses of the hierarchical controller. Others are used to
demonstrate the internal mechanics of the controller and the factory model.
Combined, these simulations demonstrate the validity of both the hierarchical
controller described in this thesis and the simulation model implemented in Hiercsim
Version 3.5 and 4.0. These simulations show how groups of independent cells interact
to produce a satisfactory overall result. This interaction includes coordination of
cells through target production rates and virtual machines, the stability of the entire
system in response to unforeseen events such as failures, and how the system performs
under high target production rates.
Some of the tradeoffs involved between work-in-process, cycle time, and through-
put are shown.
Outline of Simulations The first series of simulations provides a demonstration
of buffer behavior described in Section 2.8.2. The relationship between disruption
duration and buffer size, and the behaviors of a system with no buffers, a system with
partly full buffers, and a system with empty or full buffers are shown. In addition to
the basic behaviors of these systems, some of the controller features are demonstrated.
These include virtual machines from Section 3.8, and the cell as a subfactory from
Section 3.6.1.
The second series of simulations demonstrates the effect of measurement frequency
on the perception of the controller of Level k Cell c, described in Section 2.9. Central
to this concept are the assumptions about frequencies of events from Section 2.5.
Each cell in the hierarchy perceives events in the factory according to its control
level, as described in Section 3.6.1.
The controller of Level k Cell c requires knowledge of three quantities in order to
perform its function of capacity allocation:
1. Cumulative production, defined in Section 2.9.1.
2. A model of capacity of the resources within its boundaries, as described in
Section 2.9.2.
3. A measure of the amount of material in neighboring buffers, as described in
Section 2.9.3.
Each of these quantities is examined at different frequencies of measurement.
The third series of simulations demonstrates the performance of a cell as a sub-
factory, as described in Section 3.6.1. This includes an example of target production
rates which lie within the cell's capacity set (3.12), and an example of how hedging
points at different control levels relate to each other, detailed in Section 3.13. The
dynamics of the cell controller are demonstrated through a close examination of its
response to a failure (Section 4.5.3) and then through the controller's performance
with different parameters (Section 4.4.3).
The fourth simulation shows how Hiercsim 3.5 handles reentrant flows
(Section 3.12). This includes both the overall system performance and the detailed
description of how antilooping constraints (3.34) and (3.35) are added to the cell
controller's capacity set (3.12).
The fifth series of simulations demonstrates the preliminary versions of the setup
change algorithms implemented in Hiercsim Version 4.0. The two policies shown are
the corridor policy and the frequency limited setup change.
The sixth series of simulations is based on the MIT ICL semiconductor wafer fab
(Bai and Gershwin, 1989 and Bai, 1991). In particular, this series demonstrates the
antilooping constraints (3.34) and (3.35) in a decoupled system and how the choice
of low level heuristics (Section 3.10 and Section 4.3.2) affects the overall performance
of the hierarchical scheduler when the frequency assumptions of Section 2.5 are not
met.
7.2 Demonstration of Buffer Behavior
Each of the Observations in Section 2.8.2 is demonstrated in this section with simple
simulations using Hiercsim 3.5. A sample input file appears in Appendix A. That file
is used in the simulation which appears in Section 7.2.4 and is shown in Figure 7-11.
7.2.1 Disruption Duration and Buffer Size
This section demonstrates how the relationship between buffer size and disruption
duration can affect the overall capacity of a system. In Section 2.8.2, Observation 1
states that buffers are low pass filters which attenuate high frequency disruptions and
pass low frequency disruptions. The system is represented in Figure 7-1. It has
two machines operating in sequence on a single part type. Both of the machines are
unreliable.
The control system is divided into three control levels. Level 1 is the static control
level which converts the target production rate, $u^0$ = 9.5 parts per time unit, into
the Level 1 production rate, $u^1$, which is distributed to the two Level 2 cells. Control
at Level 2 is accomplished by two cells, each converting the Level 1 target production
rate, $u^1$, into Level 2 production rates, $u^2_{21}$ and $u^2_{31}$, independently of the other Level
2 cell, as long as the buffer between them is neither empty nor full. The Level 2
production rates are converted into staircase policy loading commands, $N_{41}$ and $N_{51}$,
(Section 3.9) by the Level 3 cells.
[Figure: three control levels (Level 1: Cell 1; Level 2: Cells 2 and 3; Level 3: Cells 4 and 5) above a factory with Steps 1 to 4. M1: MTTF = 45R, MTTR = 5R, operation time 0.075. M2: MTTF = 47.5R, MTTR = 2.5R, operation time 0.1. Buffer size B = 100 or 20.]
Figure 7-1: Disruption Duration and Buffer Size
The machine failures occur at a frequency which is comparable to the characteristic
frequency of Level 2. The failure parameters (Mean Time to Fail and Mean Time to
Repair) are indicated immediately below the Level 2 cells. The symbol "R" on the
mean time to fail MTTF and mean time to repair MTTR indicates that the failure and
repair times are exponentially distributed random variables. The production required
during an average failure is approximately 25 to 50 parts.
There is one buffer in the system. Its maximum size B 1 is set first at 100 parts in
Case 1, then at 20 parts in Case 2 and Case 3. The first size is sufficient to decouple
the two machines, while the second is not. The choice of hedging points, z21, z21 , z 1 ,7
and z1, places 10 parts into the buffer at Level 2 and 10 parts at Level 3 when the
system has reached its hedging point (see Section 3.13). The weighting coefficient
A is 1.0 (3.21). The operation times are indicated below the respective machines in
the factory. The simulations were run for a total of 1000 time units using Hiercsim
Version 3.5.
Case 1: Sufficiently Large Buffer This case demonstrates how a sufficiently
large buffer can dampen the disruptive effects of unreliable machines in a system.
The size of Buffer 1 (B 1 = 100) is large enough to hold twice as many parts as are to
be expected to accumulate during any given failure. It is expected that the maximum
capacity is limited by the slowest machine, which in this case is Machine 2.
The two machines are decoupled so that they can be treated independently. The
system rate is limited to the maximum production rate of the slowest machine, Machine 2.
The maximum production rate, $u_{max}$, is¹
$$u_{max} = \frac{e_{M2}}{\tau_{21}} = 9.5 \text{ parts per time unit}$$
¹ The target production rate was intentionally chosen to match this limiting production rate.
Overall Performance After 1000 Time Units
Process Name   Total Required   Total Produced   Mean Rate   Mean WIP   Mean Cycle Time
Process 1      9500             9438             9.44        66.82      7.05
Machine Group Utilization Fractions
Group Name   # Machines   % Idle    % Operating   % Failed
M1           1            23.80%    71.30%        4.90%
M2           1            1.30%     94.40%        4.40%
Buffer Statistics
Buffer Name   Buffer Size   Average # of Parts in Buffer   Average Wait in Buffer
Buffer 1      100           65.01                          3.43
Table 7.1: Results of Large Buffer Size Simulation
Figure 7-2: Cumulative Production with Large Buffer
Decoupling the machines comes at the price of high work-in-process inventory and
long cycle times. Table 7.1 gives the overall statistics generated by this case.
The actual throughput is close to the required throughput. The average work-in-
process is 67 parts which is much higher than the desired hedging point of 10 parts.
The average cycle time is 40 times the raw processing time (the total operation time).
Figure 7-2 shows the cumulative production as a function of time. The straight line
in the graph is the cumulative integral of the target demand rate. The cumulative
production follows the cumulative requirements closely throughout the simulation
run. Failures disrupt production only when Machine 2 is down.
Figure 7-3 shows the amount of material in the buffer as a function of time. There
are large swings in the number of parts in the buffer as Machine 1 and Machine 2
fail independently of each other. The amount of material is limited by the maximum
buffer size of 100 parts. Machine 2 is starved and idle between t = 70 and t = 80,
or about 1% of the simulation run. This corresponds to the results presented in
Table 7.1. Machine 1 is blocked and idle during the interval from roughly t = 500
to t = 800, or about 23% of the time. This period of blockage occurs after a failure
at Machine 2, and its length is due to the slowness of Machine 2 relative to the target
production rate.
Figure 7-3: Amount of Material in Large Buffer (Level 2 Buffer 1 versus Time)
Case 2: Insufficiently Large Buffer This case demonstrates the effect a small
buffer has on a system with unreliable machines. The size of Buffer 1, B1 = 20, is
large enough to hold about one-half of the amount of material expected to be added
to the requirements during a disruption. It is expected that the maximum capacity2
is limited by the unreliability of Machines 1 and 2 according to the efficiency (2.14)
divided by the longest operation time, r21 . For this system, the maximum expected
production rate, u,, is
1 1
1 - 8.593 parts per time unit (7.1)
U m 1a + + r21
Table 7.2 gives the overall statistics generated by this case. The actual production
rate is 6.3% higher3 than that expected from (7.1).
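The percentage comparisons quoted here can be reproduced with a few lines of arithmetic. The following sketch is illustrative only and is not part of Hiercsim; the function name percent_above is an assumption introduced for this example. It uses the mean rate of 9.13 from Table 7.2, the predicted rate of 8.593 from (7.1), and the alternative prediction of 9.066 quoted in footnote 3.

```python
def percent_above(measured, predicted):
    """Return how far `measured` lies above `predicted`, as a percent of `predicted`."""
    return 100.0 * (measured - predicted) / predicted


measured_rate = 9.13               # mean production rate, Table 7.2
rate_from_given_params = 8.593     # prediction of equation (7.1)
rate_from_observed_params = 9.066  # prediction quoted in footnote 3

gap_given = percent_above(measured_rate, rate_from_given_params)
gap_observed = percent_above(measured_rate, rate_from_observed_params)

print(f"above (7.1) with given parameters:    {gap_given:.2f}%")
print(f"above (7.1) with observed parameters: {gap_observed:.2f}%")
```

Both results agree with the quoted 6.3% and 0.7% to within the rounding of the tabulated values.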
The average work-in-process is 18 parts and is limited by the size of the buffer.
This is more than 3 times lower than in the case with the 100-part buffer.
The average cycle time is only 5 times the raw processing time, compared to 40 times
with the larger buffer. Notice that the fraction of time that Machine 2 is idle
has increased by a factor of 5, while that of Machine 1 has stayed constant. This is due
to the increased time that Machine 2 is starved for parts from Machine 1. Note that
any time that Machine 2 spends waiting for parts subtracts directly from the system
capacity, since Machine 2 is the bottleneck.
Figure 7-4 shows a graph of the cumulative production as a function of time. The
straight line in the graph is the cumulative integral of the target demand rate. Notice
how the cumulative production diverges with each failure. There is not enough excess
capacity to recover between occurrences of failures.
2 Due to Buffer 1, this set is slightly conservative (Lasserre, 1992).
3 If the actual failure percentages from the simulation are used in (7.1) instead of the given failure
parameters, then u_max = 9.066. Under these assumptions, the measured production rate is only
0.7% higher than predicted.
Overall Performance After 1000 Time Units
Process Name   Total Required   Total Produced   Mean Rate   Mean WIP   Mean Cycle Time
Process 1      9500             9130             9.13        18.15      1.99
Machine Group Utilization Fractions
Group Name   # Machines   % Idle   % Operating   % Failed
M1           1            23.70%   68.60%        7.70%
M2           1            6.10%    91.30%        2.60%
Buffer Statistics
Buffer Name   Buffer Size   Average # of Parts in Buffer   Average Wait in Buffer
Buffer 1      20            16.44                          0.9
Table 7.2: Results of Small Buffer Size Simulation
Figure 7-4: Cumulative Production with Small Buffer
Figure 7-5: Amount of Material in Small Buffer
Figure 7-5 shows a graph of the amount of material in Buffer 1 as a function of
time. When Machine 2 is down, Machine 1 is unable to produce anything beyond 20
parts before its production is restricted to the limiting rate of Machine 2 (2.19).
Case 3: Insufficiently Large Buffer, Low Demand This case demonstrates
the behavior of the hierarchical controller in a system with insufficiently large buffers
and with a low target production rate. The size of Buffer 1, B1 = 20, is large
enough to hold about one-half of the amount of material expected to be added to the
requirements during a disruption. The target production rate is equal to 8.5 parts
per unit time, compared to the actual maximum production rate of 9.0 parts per unit
time (7.1).
Case 2 shows the effect of a small buffer on a system with unreliable machines.
That case shows the effect on the Level 1 capacity set (3.12) when the independence
assumption implicit in the capacity set is violated. The effect is a net reduction in
the maximum production rate, according to the failure model (2.14) in Buzacott and
Hanifin (1978). Section 7.2.2 provides a detailed description of this effect.
This effect impacts production only when the target demand rate exceeds the
actual available capacity of the system. When the target production is less than the
actual available capacity, the hedging point strategy works well.
Table 7.3 gives the overall statistics generated by this case. The actual production
rate is equal to the target demand rate. Actual production is less than required
production because a failure occurred near the end of the simulation.
The average work-in-process is 15.5 parts and is limited by the size of the buffer.
This is more than 3 times lower than in the case with the 100-part buffer.
The average cycle time is only 5 times the raw processing time, compared to 40 times
with the larger buffer. Notice that the fraction of time that each machine is idle
has increased. This is because the system is at its hedging point more often
than in the previous cases, and does not need to work as hard.
Overall Performance After 1000 Time Units
Process Name   Total Required   Total Produced   Mean Rate   Mean WIP   Mean Cycle Time
Process 1      8500             8146             8.5         15.53      1.83
Machine Group Utilization Fractions
Group Name   # Machines   % Idle   % Operating   % Failed
M1           1            28.90%   61.10%        10.00%
M2           1            15.70%   81.50%        2.90%
Buffer Statistics
Buffer Name   Buffer Size   Average # of Parts in Buffer   Average Wait in Buffer
Buffer 1      20            13.96                          0.82
Table 7.3: Results of Small Buffer Size Simulation, Low Demand
Figure 7-6: Cumulative Production with Small Buffer, Low Demand
Figure 7-7: Amount of Material in Small Buffer, Low Demand
Figure 7-6 shows a graph of the cumulative production as a function of time. The
straight line in the graph is the cumulative integral of the target demand rate. Notice
how the cumulative production recovers from a failure well before the next failure
occurs. This contrasts with the overloaded system in Case 2, where the system could
not recover completely before the next failure occurred.
Figure 7-7 shows a graph of the amount of material in Buffer 1 as a function of
time. When Machine 2 is down, Machine 1 is unable to produce anything beyond
20 parts before its production is restricted to the limiting rate of Machine 2 (2.19).
Notice that the system spends a significant fraction of time at the hedging point of
10 parts in the buffer.
7.2.2 Systems with No Buffers
This section demonstrates Observation 2 from Section 2.8.2 which states that a se-
quence of machines which are not separated by buffers acts as a single unit. This
simulation also demonstrates the importance of buffer sizing and its impact on ca-
pacity estimation.
System Description The system used in this simulation is shown in Figure 7-8.
It has four machines operating in sequence on a single part type. All four machines
are unreliable with different reliability parameters. The buffers separating each of
the machines have a maximum capacity of 1 part each, which is too small to offer a
significant decoupling effect given the reliability parameters of the machines.
The control system is divided into three levels. Level 1 is the static control level.
The Level 1 controller converts the target production rate, u0 = 8.5 parts per time
unit, into the Level 1 production rate, u^1_11, according to the linear program (4.11). That
rate is distributed to the four Level 2 cells. Each of the four Level 2 cells operates
independently of the others, except when a virtual machine constraint (2.19) or (2.20)
is in effect. (Virtual machines are described in Section 3.8.) A Level 2 controller
converts the Level 1 production rate into Level 2 production rates, accounting for the
repair state of the machine in its cell. That rate is passed to the corresponding Level
3 cell where the staircase policy (Section 3.9) is used to convert the rate into loading
times.
Figure 7-8: System with No Buffers (four machines in series with operation times of 0.100,
0.106, 0.103, and 0.108 time units, separated by buffers with a maximum size of 1 part;
target production rate u0 = 8.5)
Failures occur at a frequency which is comparable to the Level 2 characteristic
frequency. The failure parameters for each machine are indicated directly under the
Level 2 cell which contains that machine. All failure modes are operation dependent
(Section 2.7) and are exponentially distributed. The production required during an
average failure is approximately 10 to 40 parts, depending on the machine. This value
is much larger than the maximum buffer sizes.
The target demand rate for Process 1 was chosen to be equal to 95% of the long
term capability for a system with buffers with the same parameters. The choice was
made in order to show how production diverged from requirements.
The weighting coefficient A' in the cost function (3.21) is set to 1.0. The operation
times are indicated in Figure 7-8 below the respective machines in the factory. The
non-zero Level 2 hedging points are designed to make each buffer be half full 4 when
the system is at its steady state.
The simulation was run for 1000 time units using Hiercsim Version 3.5.
Results Table 7.4 gives the overall statistics generated by this simulation. The
actual production rate is 9.5% less than that required by the system. An explanation
for this discrepancy is presented in the interpretation portion of this section.
Figure 7-9 shows the cumulative production as a function of time in Cell 2 at Level
2. The straight line in the graph is the cumulative integral of the target demand rate.
Notice that the cumulative production begins to deviate from the target production
almost immediately.
Figure 7-10 shows the cumulative production as a function of time in Cell 5 at
Level 2. The straight line in the graph is the cumulative integral of the target demand
rate. Notice how the cumulative production at the fourth machine matches almost
exactly that of the first machine. This implies that the system is always producing at
the maximum rate of the slowest machine in the system, as was stated in Observation
2 of Section 2.8.2.
4 A buffer may be half full as perceived by a Level 2 observer because the amount of material in
that buffer is averaged over a long time period compared to the duration of any individual activity
(Section 2.9.3).
Figure 7-9: Level 2 Cell 2 Cumulative Production with No Buffers
Figure 7-10: Level 2 Cell 5 Cumulative Production with No Buffers
Overall Performance After 1000 Time Units
Process Name   Total Required   Total Produced   Mean Rate   Mean WIP   Mean Cycle Time
Process 1      8500             7693             7.69        3.67       0.478
Machine Group Utilization Fractions
Group Name   # Machines   % Idle   % Operating   % Failed
M1           1            15.70%   77.00%        7.30%
M2           1            14.80%   81.60%        3.60%
M3           1            17.60%   79.20%        3.10%
M4           1            14.10%   83.10%        2.80%
Table 7.4: Results of System with No Buffers
Interpretation The Level 1 production rate, u^1, is calculated from the capacity
set (3.12). This calculation is based on the assumption that starvation and blockage
(Section 3.8) of machines within Level 1 Cell 1 are minimal. This assumption is
accurate when buffers internal to Level 1 Cell 1 are of sufficient size to decouple
machines from Level 2 and lower internal disruptions5.
Based on the assumptions of Section 3.6.2, the maximum production rate through
the system should be 8.96 parts per time unit6, limited by Machine 2. The
target production rate (u0 = 8.5 parts per time unit) was chosen to be 95% of this
maximum rate. The actual maximum production rate is 7.69 parts per time unit, or
14.2% less than the predicted maximum.
5 An internal disruption in this context is a disruption of production within Level 1 Cell 1 caused
by a failure of a machine within Level 1 Cell 1.
6 If the observed parameters from Table 7.4 are used in (3.12), the maximum production rate
becomes 9.00 parts per time unit, limited by Machine 4.
Due to the small size of the buffers internal to Level 1 Cell 1, none of the ma-
chines are decoupled from Level 2 disruptions in other machines in the system. This
manufacturing system is a close approximation to a transfer line with zero buffers as
described in Buzacott and Hanifin (1978). The estimate of capacity (2.14) may be
used to determine the maximum production rate.
From the point of view of a Level 1 observer, the four machines may be lumped
into a single machine with an operation time equal to the longest operation time in
the system. That operation time is τ41 = 0.108 on Machine 4. When the production
rate is high enough, τ41 becomes the transfer time of the line. That is, one part will
be completed every 0.108 time units even though the total cycle time is 0.417 time
units.
The effective fraction of time the lumped four machine transfer line is operational
is represented by the quantity e*. If the actual reliability parameters from Table 7.4
are used in (2.14), then e* becomes
\[ e^* = \frac{1}{1 + \frac{7.3}{92.7} + \frac{3.6}{96.4} + \frac{3.1}{96.9} + \frac{2.8}{97.2}} = 0.8498 \tag{7.2} \]
The maximum production rate of the system using this model is 0.8498/0.108 = 7.87
parts per time unit. The observed maximum production rate is only 2.27% less than
this predicted rate. This model is much closer to the actual rate because it accounts
for the amount of time that machines are blocked or starved due to disturbances
internal to Level 1 Cell 1. Notice the significant amount of time that each machine
is idle in Table 7.4. This idle time is the time that each machine is either starved or
blocked during the simulation.
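The calculation in (7.2) and the resulting rate can be reproduced with the short sketch below. It is illustrative only and is not taken from Hiercsim; the function name lumped_efficiency is an assumption introduced here, and the form of the efficiency (one over one plus the sum of the down-time to up-time ratios) is read off (7.2) itself. The failed-time percentages, the operation time of Machine 4, and the observed rate are those of Table 7.4 and the text above.

```python
def lumped_efficiency(failed_percentages):
    """Efficiency of a zero-buffer line: 1 / (1 + sum of failed/(100 - failed))."""
    ratios = [f / (100.0 - f) for f in failed_percentages]
    return 1.0 / (1.0 + sum(ratios))


failed_pct = [7.3, 3.6, 3.1, 2.8]  # % Failed for M1..M4, Table 7.4
longest_op_time = 0.108            # operation time on Machine 4
observed_rate = 7.69               # mean production rate, Table 7.4

e_star = lumped_efficiency(failed_pct)
predicted_rate = e_star / longest_op_time
shortfall_pct = 100.0 * (predicted_rate - observed_rate) / predicted_rate

print(f"e* = {e_star:.4f}")
print(f"predicted maximum rate = {predicted_rate:.2f} parts per time unit")
print(f"observed rate is {shortfall_pct:.2f}% below the prediction")
```

Because the table percentages are rounded, the printed values may differ from those in the text in the last digit.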
7.2.3 Systems with Partly Full Buffers
This section uses the results generated in Section 7.2.1 to explain how a partly full
buffer decouples machines. Observation 3 of Section 2.8.2 stated that a machine which
is separated from its neighbors by partly full buffers is able to set its production rates
independently of other machines' limitations.
Consider the system and results presented in Section 7.2.1. Figure 7-3 of that
section depicted the time history of the amount of material in Buffer 1. Buffer 1
separates Machines 1 and 2. Recall that both machines are able to fail independently
of each other, and therefore at any given time, either of the machines could be down.
The system control hierarchy operates the machines independently of one another
at Level 2. The controllers of Cell 2 and Cell 3 set production rates for the machines
in the respective cells based on the current repair state of the machine and the surplus
state of production.
During the first 500 time units of the simulation, the buffer was never completely
full, and rarely empty. During that period, Machine 1 failed a total of five times,
and Machine 2 failed a total of eleven times. The controller of Cell 2 was able to run
Machine 1 at its best rates according to the hedging point strategy of Section 3.7.2,
without having to know about the state of Machine 2. This is because Machine 1
was always able to deposit its finished products into the buffer, regardless of how
fast it was making them. Likewise, Machine 2 was able to operate independently of
Machine 1 because there was always raw material for the operation at Machine 2.
7.2.4 Systems with Empty or Full Buffers
This simulation illustrates Observation 4 of Section 2.8.2 which stated that a sequence
of machines separated by buffers sometimes acts, in a limited way, as if there were
no buffers. This behavior occurs when either all the buffers are empty or all the
buffers are full. In addition, this simulation demonstrates the use of virtual machines
(Section 3.8) to transmit bottleneck information throughout a distributed control
system.
The system modeled here is contrived in a manner similar to the one used in
Section 7.2.2. That is, the buffers are intentionally chosen to be much smaller than
the amount of material required to be produced during an average disruption. To
offset this violation of the hierarchical assumption detailed in Section 7.2.2, the target
production rate imposed on the system is much lower than the maximum possible
production rate.
Two simulations will be used to demonstrate the buffer behavior. The first sim-
ulation demonstrates how a series of empty buffers transmits bottleneck information
downstream through many cells. The second demonstrates how a series of full buffers
transmits bottleneck information upstream through many cells. The only differences
between the systems in each case are the machine reliability parameters and the speed
at which the machines can operate.
The simulations were run for a total of 2000 time units each using Hiercsim Version
3.5.
Case 1: Starvation, System Description The first simulation demonstrates
starvation of machines in a system and the way in which the hierarchical controller
accounts for starvation in determining production rates. See Appendix A for the
input file which corresponds to this simulation.
Figure 7-11 shows the system. It contains four machines operating in sequence
on a single part type. Only the first machine is unreliable, with a mean time to
repair of 200 time units and a mean time to fail of 800 time units. The operation
times of each process step are indicated beneath the machines. The operation time
spent on a machine decreases with step number so that the transmission of starvation
information can be demonstrated clearly. The maximum buffer size for each of the
buffers is equal to 50 parts.
The target demand rate for Process 1 was chosen to be only 53% of the long term
capacity of the system. This target production rate was chosen so that the system
could recover quickly from the failure of Machine 1. (Even though the maximum
buffer size is approximately three times smaller than the amount of requirements
expected to accumulate during a failure of Machine 1, the capacity set (3.6.2) is
accurate, unlike the system in Section 7.2.2. This is due to the fact that there is only
one unreliable machine - Machine 1 - and it is also the bottleneck).
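The 53% figure can be checked with the sketch below. It rests on an assumption that is not stated explicitly in this section: that the long term capacity of this line equals the availability of Machine 1, MTTF/(MTTF + MTTR), divided by its operation time, since Machine 1 is both the only unreliable machine and the bottleneck. The variable names are introduced here for illustration; the numerical values are those of the system description and Figure 7-11.

```python
mttf = 800.0        # mean time to fail of Machine 1
mttr = 200.0        # mean time to repair of Machine 1
op_time_m1 = 0.500  # operation time on Machine 1 (the slowest machine)
target_rate = 0.850 # target demand rate for Process 1

# Assumed capacity model: fraction of time Machine 1 is up, times its maximum rate.
availability = mttf / (mttf + mttr)
long_term_capacity = availability / op_time_m1
fraction_of_capacity = target_rate / long_term_capacity

print(f"availability of Machine 1 = {availability:.2f}")
print(f"long term capacity = {long_term_capacity:.2f} parts per time unit")
print(f"target rate is {100.0 * fraction_of_capacity:.1f}% of capacity")
```

Under this assumption the target rate comes out at about 53% of capacity, in agreement with the text.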
The control system framework is identical to that of Section 7.2.2. The response
to failures of Machine 1 is handled by Level 2 Cell 2.
Figure 7-11: System which Undergoes Starvation (four machines in series with operation times of
0.500, 0.475, 0.451, and 0.429 time units, separated by buffers with a maximum size of 50 parts;
Machine 1 has MTTF = 800 and MTTR = 200; target production rate u0 = 0.850)
Overall Performance After 2000 Time Units
Process Name   Total Required   Total Produced   Mean Rate   Mean WIP   Mean Cycle Time
Process 1      1700             1700             0.85        64.17      73.52
Machine Group Utilization Fractions
Group Name   # Machines   % Idle   % Operating   % Failed
M1           1            45.60%   44.40%        10.00%
M2           1            58.40%   41.60%        0.00%
M3           1            61.10%   38.00%        0.00%
M4           1            63.50%   36.50%        0.00%
Buffer Statistics
Buffer Name   Buffer Size   Average # of Parts in Buffer   Average Wait in Buffer
Buffer 1      50            19.93                          11.3
Buffer 2      50            20.82                          12.0
Buffer 3      50            21.71                          12.7
Table 7.5: Starvation Simulation Results
As in the earlier examples, the relative priority coefficient was chosen arbitrarily
to be 1.0. The hedging points of each Level 2 cell are such that the buffers will be
half full when the system is at its collective hedging point.
Case 1: Starvation, Simulation Results Table 7.5 shows the overall results of
the simulation. As expected, the production met requirements by the end of the
simulation run7. The significant amount of idle time on each machine is due to the
fact that the target production rate is much less than the system capacity. The
work-in-process for the system translates to 75 time units of requirements. This is due to
the large buffers which force parts to sit still for 73 out of the 75 time units which
comprise the average cycle time through the system.
7 If a failure had occurred near the end of the simulation run, the system would not have caught
up with requirements.
Figure 7-12: Starvation - Cumulative Production for All Cells (cumulative production of Cells 2
through 5 versus time, t = 900 to 1350; annotated with Machine 1 failure, Machine 1 repair, and
full system recovery)
The dynamics of how bottleneck information is transmitted downstream are shown
in plots of cumulative production and of the Level 2 amounts of material in each of
the three buffers.
Figure 7-12 is a graph of cumulative production vs. time for each of the four
machines in the system. Only a limited amount of the simulation run is shown.
Machine 1 fails at t = 961. The machine is repaired 200 time units later at
t = 1161. Since the system is at its hedging point at the time of failure, each of
the downstream machines is unaware of the upstream failure until its entry buffer
is empty. Until that time, each of the downstream machines operates at the target
demand rate and remains at its hedging point.
Figure 7-13: Starvation - Amount of Material in All Buffers (amount of material in each buffer
versus time, t = 900 to 1350; annotated with Machine 1 failure, Machine 1 repair, and full
system recovery)
Figure 7-13 is a graph of the Level 2 amount of material in each of the buffers in
the system. The same time interval is shown as in Figure 7-12. Initially, all buffers
contain the hedging amount of material, which is 25 parts (3.39) for each buffer. After
the failure, the amount of material in Buffer 1 (between Machines 1 and 2) starts to
go down because Machine 2 is still operating at the target production rate, while
Machine 1 is shut down (2.18).
Once all parts in Buffer 1 are gone, Machine 2 must match its production rate
to that of Machine 1 (2.19). (This is when bottleneck information is transmitted
from Level 2 Cell 2 to Level 2 Cell 3.) At that time, the amount of material in
Buffer 2 starts to go down until it is empty, at which time, Machine 3 must match
the production rate of Machine 2. This process continues until the entire system
downstream of Machine 1 is drained of parts8 (t = 1050).
When Machine 1 gets repaired at t = 1161, all the machines begin to process parts
as if they were in a transfer line (Section 7.2.2). Because Machine 1 is the slowest
machine, the rate at which the system is able to recover is the maximum production
rate of Machine 1. Therefore, all starvation virtual machines (2.19) remain in effect,
but the limiting rate is changed from 0 to 1/τ11 = 2 parts per time unit. During the
initial recovery period, before Machine 4 reaches its hedging point at t = 1243, all
buffers remain empty.
Once Machine 4 does reach its hedging point at t = 1243, its production rate is
slowed to the target production rate (u0 = 0.850) and Buffer 3 begins to fill
up (2.18). Buffer 3 continues to gather parts until Machine 3 reaches its hedging
point, at which time Machine 3 slows to match the target production rate. The rate
of increase of the amount of material in Buffer 3 drops to zero, at which time Buffer 2
begins to fill up. This cascade effect continues until Machine 1 regains its hedging
point and the system is fully recovered at t = 1320.
At that time, all machines are operating independently of each other again.
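The timing of the starvation cascade can be approximated with simple fluid arithmetic. The sketch below is illustrative only and is not the Hiercsim controller: it assumes that material flows continuously, that each downstream machine runs at exactly the target rate until its entry buffer empties, and that after the repair the whole line runs at the maximum rate of Machine 1. The function names buffer_empty_times and catch_up_time are introduced here for the example. The failure and repair times (t = 961 and t = 1161), the hedging amount of 25 parts per buffer, the target rate of 0.850, and the operation time of 0.500 on Machine 1 are taken from the text.

```python
def buffer_empty_times(t_fail, buffer_levels, drain_rate):
    """Times at which successive downstream buffers empty after an upstream failure.

    Buffer k starts to drain only once buffer k-1 is empty, and drains at `drain_rate`.
    """
    times = []
    t = t_fail
    for level in buffer_levels:
        t += level / drain_rate
        times.append(t)
    return times


def catch_up_time(t_idle_start, t_repair, target_rate, recovery_rate):
    """Time at which a machine idle from `t_idle_start` to `t_repair` clears its backlog."""
    backlog = target_rate * (t_repair - t_idle_start)
    return t_repair + backlog / (recovery_rate - target_rate)


t_fail, t_repair = 961.0, 1161.0
target_rate = 0.850
recovery_rate = 1.0 / 0.500        # maximum rate of Machine 1
hedging_levels = [25.0, 25.0, 25.0]  # Buffers 1, 2, 3 at the hedging point

empty_times = buffer_empty_times(t_fail, hedging_levels, target_rate)
t_machine4 = catch_up_time(empty_times[-1], t_repair, target_rate, recovery_rate)

for k, t in enumerate(empty_times, start=1):
    print(f"Buffer {k} empties at about t = {t:.0f}")
print(f"Machine 4 regains its hedging point at about t = {t_machine4:.0f}")
```

Under these assumptions the last buffer empties near t = 1050 and Machine 4 regains its hedging point near t = 1243, which is consistent with the times read from Figures 7-12 and 7-13.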
8 The choice of operation times causes the system to behave in this manner because the speed of
machines increases with step number. If this were not the case, then the sequence in which buffers
became empty and the times at which they emptied would be different. In all cases, the equations
(2.18) and (2.19) would be satisfied.
Case 2: Blockage, System Description The second simulation demonstrates
blockage of machines in a system and the way in which the hierarchical controller
accounts for blockage in determining production rates. This system is the mirror
image of Case 1, starvation. The impact of blockage on work-in-process and cycle
time is different than that of starvation.
Figure 7-14 shows the system. It contains four machines operating in sequence
on a single part type. Only the fourth machine is unreliable, with a mean time to
repair of 200 time units and a mean time to fail of 800 time units. The operation
times of each process step are indicated beneath the machines. The operation time
spent on a machine increases with step number so that the transmission of blockage
information can be demonstrated clearly. The maximum buffer size for each of the
buffers is equal to 50 parts.
The target demand rate for Process 1 was chosen to be only 53% of the long term
capability of the system. This target production rate was chosen so that the system
could recover quickly from the failure of Machine 4. (Even though the maximum
buffer size is approximately three times smaller than the amount of material required
to be produced during a failure of Machine 4, the capacity set (3.6.2) is accurate,
unlike the system in Section 7.2.2. This is due to the fact that there is only one
unreliable machine - Machine 4 - and it is also the bottleneck).
The control system framework is identical to that of Section 7.2.2. The response
to failures of Machine 4 is handled by Level 2 Cell 5.
As in the earlier examples, all relative priority coefficients Ak were chosen to be
1.0. The hedging points of each Level 2 cell are such that the buffers will be half full
when the system is at its collective hedging point.
Case 2: Blockage, Simulation Results Table 7.6 shows the overall results of
the simulation. As expected, the production met requirements by the end of the
simulation run9. The significant amount of idle time on each machine is due to
the fact that the target production rate is much less than the system capacity. The
work-in-process for the system translates to 99 time units of requirements. This is
due to the large buffers which force parts to sit still for 95 out of the 97 time units
which comprise the average cycle time through the system.
9 If a failure had occurred near the end of the simulation run, the system would not have caught
up with requirements.
Figure 7-14: System which Undergoes Blockage (four machines in series with operation times of
0.429, 0.451, 0.475, and 0.500 time units, separated by buffers with a maximum size of 50 parts;
Machine 4 has MTTF = 800 and MTTR = 200; target production rate u0 = 0.850)
Overall Performance After 2000 Time Units
Process Name   Total Required   Total Produced   Mean Rate   Mean WIP   Mean Cycle Time
Process 1      1700             1700             0.85        84.72      97.7
Machine Group Utilization Fractions
Group Name   # Machines   % Idle   % Operating   % Failed
M1           1            61.90%   38.10%        0.00%
M2           1            60.50%   39.50%        0.00%
M3           1            59.00%   41.00%        0.00%
M4           1            47.50%   42.50%        10.00%
Buffer Statistics
Buffer Name   Buffer Size   Average # of Parts in Buffer   Average Wait in Buffer
Buffer 1      50            26.88                          15.3
Buffer 2      50            27.66                          15.9
Buffer 3      50            28.47                          16.6
Table 7.6: Blockage Simulation Results
Figure 7-15: Blockage - Cumulative Production for All Cells
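The conversion of work-in-process into time units of requirements, used here and in the starvation case, is a division of the mean work-in-process by the demand rate. The sketch below is illustrative only; the function name wip_in_time_units is an assumption introduced for this example, and the inputs are the Mean WIP and Mean Rate entries of Tables 7.5 and 7.6.

```python
def wip_in_time_units(mean_wip, demand_rate):
    """Express work-in-process (parts) as the time needed to demand that many parts."""
    return mean_wip / demand_rate


demand_rate = 0.85  # mean rate, Tables 7.5 and 7.6

starvation_wip_time = wip_in_time_units(64.17, demand_rate)  # Table 7.5
blockage_wip_time = wip_in_time_units(84.72, demand_rate)    # Table 7.6

print(f"starvation case: {starvation_wip_time:.1f} time units of requirements")
print(f"blockage case:   {blockage_wip_time:.1f} time units of requirements")
```

The results are close to the 75 and 99 time units quoted in the text, and both are of the same order as the mean cycle times in the two tables, which is why the buffers dominate the cycle time.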
The dynamics of how bottleneck information is transmitted upstream are shown
in plots of cumulative production and of the Level 2 amounts of material in each of
the three buffers.
Figure 7-15 is a graph of cumulative production vs. time for each of the four
machines in the system. Only a limited amount of the simulation run is shown.
Machine 4 fails at t = 1050. The machine is repaired 200 time units later at
t = 1250. Since the system is at its hedging point at the time of failure, each of the
upstream machines does not detect the downstream failure until its exit
buffer is full. Until that time, each of the upstream machines operates at the target
demand rate and remains at its hedging point.
Figure 7-16: Blockage - Amount of Material in All Buffers (amount of material in Buffers 1
through 3 versus time, t = 900 to 1350)
Figure 7-16 is a graph of the Level 2 amount of material in each of the buffers in
the system. The same time interval is shown as in Figure 7-15. Initially, all buffers
contain the hedging amount of material, which is 25 parts (3.39) for each buffer.
After the failure, the amount of material in Buffer 3 (between Machines 3 and 4)
starts to increase because Machine 3 is still operating at the target production rate,
while Machine 4 is shut down (2.18).
Once all available space in Buffer 3 is occupied, Machine 3 must match its production
rate to that of Machine 4 (2.20). (This is when bottleneck information is
transmitted from Level 2 Cell 5 to Level 2 Cell 4.) At that time, the amount of material
in Buffer 2 starts to increase until it is full, at which time Machine 2 must match
the production rate of Machine 3. This process continues until the entire system
upstream of Machine 4 is clogged with parts10 (t = 1140).
When Machine 4 gets repaired at t = 1250, all the machines begin to process
parts as if they were a transfer line (Section 7.2.2). Because Machine 4 is the slowest
machine, the fastest rate at which the system is able to recover is equal to the maximum
production rate of Machine 4. Therefore, all blockage virtual machines (2.20) remain
in effect, but the limiting rate is changed from 0 to 1/τ41 = 2 parts per time unit.
During the initial recovery period, before Machine 1 reaches its hedging point at
t = 1325, all buffers remain full.
Once Machine 1 reaches its hedging point at t = 1325, its production rate is slowed
to the target production rate (u0 = 0.850) and Buffer 1 begins to empty (2.18).
Buffer 1 continues to lose parts until Machine 2 reaches its hedging point, at which
time Machine 2 slows to match the target production rate. The rate of decrease of
the amount of material in Buffer 1 drops to zero, at which time Buffer 2 begins to
empty. This cascade effect continues until Machine 4 regains its hedging point and
the system is fully recovered at t = 1400.
10 The choice of operation times causes the system to behave in this manner because the speed of
machines decreases with step number. If this were not the case, then the sequence in which buffers
filled up and the times at which they became full would be different. In all cases, the equations
(2.18) and (2.20) would be satisfied.
At that time, all machines are operating independently of each other again.
7.3 Demonstration of Measurement Frequency
The hierarchical controller detailed in Section 3.7.2 depends on three measurements
made on the factory floor. They are: actual production, capacity, and the amount of
material in buffers. The values of each of these measurements change when the characteristic
frequency of measurement changes. In essence, when measurement frequency
is high, more changes can be recorded. Likewise, when measurement frequency is low,
fewer changes can be recorded. Section 2.5 details the reasons for this dependence on
measurement frequency.
System Description This section demonstrates this phenomenon using a simu-
lation of the system depicted in Figure 7-17. The system consists of two unreliable
machines operating in sequence on a single part type. The machines are separated
by a buffer whose maximum size is 100 parts. Each of the machines has two failure
modes. One of the failure modes is a major and infrequent failure, whereas the other
is a minor and frequent failure.
The system is controlled by a four-level hierarchy. The Level 1 controller sets Level
1 production rates for both machines. These rates are constant for the duration of the
simulation and are consistent with the Level 1 capacity set (3.12). Level 2 controllers
respond to the major and infrequent failures by specifying Level 2 production rates
within the Level 2 capacity sets (3.12) for each Level 2 cell. Each machine is contained
in its own Level 2 cell. Level 3 controllers respond to the minor and frequent failures
by specifying Level 3 production rates consistent with the Level 3 capacity sets (3.12)
for each Level 3 cell. Each machine is contained in its own Level 3 cell. Parts are
cleared for processing by the Level 4 controllers.
The target demand rate for Process 1 was chosen to be 93% of the Level 1 capacity
of the system (3.12). The relative priority coefficient was arbitrarily chosen to be 1.0.
Note that the ratio between the Level 2 and Level 3 hedging points is equal to the
ratio between the Mode 1 and Mode 2 mean times to repair. This choice of hedging
points is designed to show how failures at each control level are attenuated by the
hedging point of the respective control levels.
[Figure 7-17 diagram: a four-level hierarchy (Cells 1 through 7) over Machines M1 and M2, which are separated by a buffer of size 100. Legible labels: U = 8.5; M1 Level 2: MTTF = 45R, MTTR = 5R; M1 Level 3: MTTF = 15.0, MTTR = 1.6R; M2 Level 2: MTTF = 47.5R, MTTR = 2.5R; M2 Level 3: MTTF = 15.8R, MTTR = 0.83R; operation-time labels 0.075 and 0.1; Steps 1 through 4.]
Figure 7-17: Frequency Demonstration System
Overall Performance After 100 Time Units
Process Name | Total Required | Total Produced | Mean Rate | Mean WIP | Mean Cycle Time
Process 1 | 850 | 850 | 8.5 | 15.27 | 1.784
Machine Group Utilization Fractions
Group Name | # Machines | % Idle | % Operating | % Failed
M1 | 1 | 21.30% | 64.70% | 13.90%
M2 | 1 | 10.60% | 85.00% | 4.40%
Buffer Statistics
Buffer Name | Buffer Size | Average # of Parts in Buffer | Average Wait in Buffer
Buffer 1 | 100 | 13.59 | 0.794
Table 7.7: Frequency Demonstration Results
The simulation was run for a total of 100 time units using Hiercsim Version 3.5.
Overall Results The overall performance of the system is shown in Table 7.7. The
system performed well compared to the requirements. The work-in-process and cycle
time reflect the high hedging points chosen to counteract the lack of reliability in the
machines. There is a significant amount of idle time on each machine, which reflects
the fact that the target production rate was chosen to be well within the capacity of
the system.
Even though the maximum buffer size was 100 parts, on average only 13.6 parts
were stored in the buffer. This is due to the choice of Level 2 and Level 3 hedging
points, which total 13 parts (3.43).
Figure 7-18: Cumulative Production for Entry Cells, Long Interval
7.3.1 Cumulative Production
Observations about the effect of measurement frequency on the perceived value of
cumulative production are detailed in Section 2.9.1.
Figure 7-18 shows the cumulative production of parts at Machine 1 as perceived by
Level 1 Cell 1, Level 2 Cell 2, and Level 3 Cell 4 over the duration of the simulation.^11
The Level 1 view of production is a straight line because it is measured at the start
of the simulation, and again at the end of the simulation. The Level 2 production is a
line which is composed of a number of straight segments. Each segment corresponds
to a controller response to a Level 2 random event, such as a failure or repair, or
to a feedback signal from the surplus, such as when the surplus reaches the hedging
point. Similarly, the Level 3 production curve is piecewise linear. Since the Level 3
controller responds to more frequent events, its cumulative production curve is much
more jagged (has many more changes in slope) than that of Level 2.
^11 Recall from (3.13) that the Level $k-1$ expectation of the Level $k$ production rate $u^k$ is equal
to the Level $k-1$ production rate $u^{k-1}$ when the Level $k-1$ rates are specified with the hedging
point policy of Section 3.7.2.
[Plot: cumulative production (0 to 900 parts) versus Time (0 to 100) for Cell 1, Cell 2, and Cell 4.]
Figure 7-19: Cumulative Production for Entry Cells, Intermediate Interval
Figure 7-20: Cumulative Production in Entry Cells, Short Interval
Notice also that the magnitude of deviation from the target cumulative produc-
tion decreases as the hierarchy is descended. This reflects the fact that lower level
disruptions are of shorter duration than higher level disruptions.
Figure 7-19 shows the cumulative production at Machine 1 in Level 1 Cell 1, Level
2 Cell 2, and Level 3 Cell 4, over the period between t = 10 and t = 45. The Level
2 cumulative production curve changes slope at three points in response to a single
failure and repair cycle on Machine 1. On the other hand, the Level 3 production
curve changes slope at a total of nine points in the same interval.
[Plot: cumulative production (180 to 260 parts) versus Time (25 to 30) for Cell 1, Cell 2, Cell 4, and Cell 6.]
Figure 7-20 shows the cumulative production of Machine 1 at all four levels of
the hierarchy over an interval of 5 time units, between t = 25 and t = 30. Note
how the Level 4 staircase closely follows the Level 3 cumulative production curve.
The Level 4 production curve changes slope each time a part is cleared for processing
(Section 3.9).
This series of figures demonstrates how production curves become more jagged as
the hierarchy is descended.
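The effect of measurement frequency on the apparent shape of a cumulative production curve can be illustrated with a small sketch. The rate history below is hypothetical (it is not taken from the simulation), and the function names are illustrative only. The same piecewise-linear production history is sampled with a long, an intermediate, and a short period, and the number of observed slope changes is counted.

```python
# Hypothetical rate history: (start_time, production_rate) segments.
SEGMENTS = [(0.0, 9.0), (3.0, 0.0), (4.0, 10.0), (7.0, 8.5), (10.0, 8.5)]
END_TIME = 12.0


def cumulative(t):
    """Cumulative production at time t for the piecewise-constant rates."""
    total = 0.0
    for k, (start, rate) in enumerate(SEGMENTS):
        stop = SEGMENTS[k + 1][0] if k + 1 < len(SEGMENTS) else END_TIME
        if t <= start:
            break
        total += rate * (min(t, stop) - start)
    return total


def observed_slope_changes(period):
    """Count the slope changes seen when sampling every `period` time units."""
    n = int(round(END_TIME / period))
    samples = [cumulative(i * period) for i in range(n + 1)]
    slopes = [(b - a) / period for a, b in zip(samples, samples[1:])]
    return sum(1 for a, b in zip(slopes, slopes[1:]) if abs(a - b) > 1e-9)


for period in (12.0, 6.0, 1.0):
    print(period, observed_slope_changes(period))
```

With these assumed segments, a single measurement over the whole run sees a straight line, while shorter sampling periods reveal progressively more of the underlying changes in rate.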
7.3.2 Amount of Material in a Buffer
Section 2.9.3 details how the perception of the amount of material in a buffer changes
with frequency of measurement. This leads to independent representations of the
same buffer at different levels of the hierarchy (Section 3.4.2). This section demon-
strates how buffer control levels appear in a simulation.
Figure 7-21 shows a graph of the amount of material in Buffer 1 during the interval
between t = 10 and t = 45. The Level 2 and Level 3 perceptions of the amount of
material in the buffer are shown. The Level 2 amount of material reflects only Level 2
disruptions, such as the indicated Level 2 failures and repairs. On the other hand, the
Level 3 amount of material in the buffer reflects Level 3 disruptions as well as Level 2
disruptions, as indicated by the Level 3 failures and repairs. It is possible for a Level 3
virtual machine constraint (2.19) or (2.20) to be active when the corresponding Level
2 virtual machines are not active (Section 3.8). This is a subtle but key point in
explaining the role of virtual machines in the pyramid decomposition (Section 3.5.3).
Figure 7-22 shows a graph of the amount of material in Buffer 1 during the interval
between t = 25 and t = 30. The Level 2 amount of material in Buffer 1 changes slope
six times over the interval, while the Level 3 amount of material changes slope 14
times over the same interval. The Level 4 amount of material in Buffer 1 tracks the
actual number of parts in the buffer. It is interesting to note how closely the Level
4 amount follows the Level 3 amount. Over a longer interval, the difference between
the Level 3 curve and the Level 4 curve is very small.
Figure 7-21: Material in Buffer 1, Intermediate Interval
Figure 7-22: Material in Buffer 1, Short Interval
[Plot: amount of material in Buffer 1 (5 to 35 parts) versus Time (25 to 30) for Level 2 Buffer 1, Level 3 Buffer 1, and Level 4 Buffer 1, with an annotation "Staircase Policy".]
7.3.3 Capacity of a Machine
None of the graphs presented in this section explicitly demonstrates the change in
capacity as a function of measurement frequency (Section 2.9.2). This change can be
shown indirectly through the cumulative production graphs, Figure 7-19 and Figure 7-20.
Figure 7-19 shows that the Level 1 capacity set does not change, but it does take
into account the loss of capacity due to Level 2 and Level 3 failures (3.12). At Level
2, the capacity of the machine changes with every occurrence of a Level 2 failure.
Such a failure occurs on Machine 1 at t = 13, and the corresponding repair occurs
at t = 19. At that time, the Level 2 controller attempts to recover from the loss of
production over the duration of the failure. The system recovers as fast as the
machine is capable of producing, taking into account capacity lost due to Level 3
failures. This rate is faster than the maximum Level 1 production rate.
Likewise, the maximum Level 3 production rate is greater than the maximum
Level 2 production rate.
The difference between each of the views of capacity is shown in Figure 7-20. Note
how the Level 3 production is slowly catching up to the Level 2 target, which is in
turn catching up to the Level 1 target. Over the interval of that figure, the capacities
at Levels 1 and 2 are constant, while that of Level 3 changes due to a Level 3
failure of Machine 1.
7.4 Demonstration of the Cell as a Sub-Factory
The concept of Level k Cell c was defined in Section 3.6.1. The purpose of this
section is to illustrate through examples some of the concepts and algorithms used in
Hiercsim Version 3.5 which are described in Chapters 3 and 4. The following concepts
and algorithms are included:
1. A method to determine whether or not the combination of demand rates lies
within the Level 1 capacity set (3.12) is shown.
2. The algorithm which governs the response of a controller to a failure cycle on
a machine using the hedging point strategy of Section 3.7.2 is shown.
3. The installation of a boundary constraint in a linear program, described in
Sections 4.8 and 4.8.4, is shown.
4. The effect of varying the priority coefficient A in the cost function (3.21) on
the controller response to failures is shown (Section 4.4.3).
System Description The system used to demonstrate these points is shown in
Figure 7-23. It has three machines performing operations on two processes. Process
1 is operated on by Machine 1 and Machine 3 in sequence, and Process 2 is operated
on by Machine 2 and Machine 3 in sequence. Machine 3 is the only unreliable machine
in the system. Machine 3 is also flexible so it can switch from Process 1 to Process 2
and back without any setup change penalty. The proportion of time that Machine 3
operates on each process is determined by the hierarchical controller.
Operation times and buffer sizes are such that virtual machines from Section 3.8
become active when Machine 3 experiences a failure.
The combination of demand rates for Process 1 and Process 2 is within the long-
term capacity of the system.
Initially, the relative priority coefficient A is chosen to be 1.0 for both processes.
In the final demonstration, these coefficients are changed to show how they affect the
controller behavior.
The system is controlled by a four-level hierarchical controller. Level 1 Cell 1
sets the long term production rates for both processes based on the reliability of the
machines in the system and the demand rate. Level 2 Cell 2 responds to the Level 2
failures on Machine 3 and sets Level 2 production rates for both processes. Level 3
Cell 3 responds to Level 3 failures on Machine 3 and sets the corresponding Level 3
production rates. Level 4 Cell 4 clears parts for loading onto Machines 1 and 2.^12
^12 Parts which have been cleared for processing at either Machine 1 or 2 are automatically cleared
for loading onto Machine 3. This is because no Level 4 cell boundaries are crossed when parts are
moved from either Machine 1 or Machine 2 to Machine 3. See Section 3.9.
[Figure 7-23 diagram: a four-level hierarchy (Level 1 Cell 1, Level 2 Cell 2, Level 3 Cell 3, Level 4 Cell 4) over Machines 1, 2, and 3. Legible labels: demand rates U = 0.9 for both processes; M3 Level 2: MTTF = 45R, MTTR = 5R; M3 Level 3: MTTF = 10R, MTTR = 0.5R; hedging points of 10.0 at Level 2, 5.0 and 2.0 at Level 3, and 0.0 at Levels 1 and 4; buffer sizes of 2; Steps 1 through 4.]
Figure 7-23: Subfactory Demonstration System
Overall Performance After 500 Time Units
Process Name | Total Required | Total Produced | Mean Rate | Mean WIP | Mean Cycle Time
Process 1 | 450 | 464 | 0.928 | 1.38 | 1.484
Process 2 | 450 | 461 | 0.922 | 1.69 | 1.831
Machine Group Utilization Fractions
Group Name | # Machines | % Idle | % Operating | % Failed
M1 | 1 | 25.60% | 74.40% | 0.00%
M2 | 1 | 21.50% | 78.50% | 0.00%
M3 | 1 | 4.60% | 83.30% | 12.00%
Buffer Statistics
Buffer Name | Buffer Size | Average # of Parts in Buffer | Average Wait in Buffer
Buffer 11 | 2 | 0.15 | 0.162
Buffer 12 | 2 | 0.435 | 0.472
Table 7.8: Results of Subfactory Demonstration
The hedging points at each level are chosen for each process to demonstrate how
hedging points can dampen the effects of failures.
The simulation was run for 500 time units using Hiercsim Version 3.5.
Overall Results Table 7.8 shows the overall performance of the system over 500
time units. Actual cumulative production kept pace with cumulative requirements.
The excess production at the end of the simulation is due to the Level 2 and Level 3
hedging points.
Cycle time is close to the raw processing time, and work-in-process is minimal.
These results are due to the fact that the internal buffers are small.
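As a rough consistency check on Table 7.8 (an added check, not part of the original analysis), the time-averaged inventory should be approximately equal to throughput multiplied by time in the system (Little's law). The sketch below copies the numbers from the table. Pairing Buffer 11 with Process 1 and Buffer 12 with Process 2, and using the process mean rate as the buffer throughput, are assumptions.

```python
# (mean rate, mean inventory, mean time in system), copied from Table 7.8.
PROCESSES = {
    "Process 1": (0.928, 1.38, 1.484),
    "Process 2": (0.922, 1.69, 1.831),
}
# Assumed pairing: Buffer 11 carries Process 1, Buffer 12 carries Process 2.
BUFFERS = {
    "Buffer 11": (0.928, 0.15, 0.162),
    "Buffer 12": (0.922, 0.435, 0.472),
}


def little_gap(rate, inventory, wait):
    """Absolute difference between inventory and rate * wait."""
    return abs(inventory - rate * wait)


for name, row in {**PROCESSES, **BUFFERS}.items():
    print(name, round(little_gap(*row), 4))
```

Under these assumptions every gap is well below 0.01 parts, so the reported work-in-process, cycle times, and buffer statistics are mutually consistent.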
Machine 3 is idle for only 5% of the time. This indicates that the combination of
demand rates is close to the capacity of the system. Machines 1 and 2 both have a
significant amount of idle time. Therefore, Machine 3 is the bottleneck in the system.
7.4.1 Target Rates within Long-Term Capacity
This section develops the capacity set used in the hedging point linear program (4.11)
by the controller of Level 1 Cell 1.
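Basic Capacity Set Recall that the capacity set $\Omega^k(e^k m^k, t)$ of a cell with multiple
machine groups is modeled by (3.12). That model is reproduced here:
$$\Omega^k(e^k m^k, t) = \left\{ u^k(t) \;:\; \sum_{j \in P,\; L(j) > k} \tau_{ij}\, u^k_j(t) \le e^k_i(t)\, m^k_i(t) \;\; \forall i \in c; \quad u^k_j(t) \ge 0 \;\; \forall j \in P,\; L(j) > k \right\} \tag{7.3}$$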
Level 1 Cell 1 Capacity Set For Level 1 Cell 1, the values of $m^1_1$, $m^1_2$, and $m^1_3$
are equal to 1.0 since the only failures in the system are Level 2 and Level 3 failures.
Likewise, the values of $e^1_1$ and $e^1_2$ are both equal to 1.0, since both Machine 1 and
Machine 2 are completely reliable. On the other hand, the value of $e^1_3$ is less than
1.0 since Machine 3 experiences Level 2 and Level 3 operation dependent failures.
Using (2.14), an estimate for the time available on Machine 3 after Level 2 and Level
3 failures are accounted for is given by
$$e^1_3 = \frac{1}{1 + \dfrac{\mathrm{MTTR}_2}{\mathrm{MTTF}_2} + \dfrac{\mathrm{MTTR}_3}{\mathrm{MTTF}_3}} = 0.8613$$
where the subscript on MTTR and MTTF denotes the failure level.
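The availability estimate can be checked numerically. The sketch below assumes that the estimate has the form 1 / (1 + sum of MTTR/MTTF over the failure modes accounted for), which is consistent with the values quoted in this section; equation (2.14) itself is not reproduced here. The MTTF and MTTR values are those shown in Figure 7-23, and the function name is illustrative.

```python
def availability(failure_modes):
    """Fraction of time available, given (MTTF, MTTR) pairs of the
    operation dependent failure modes that are being averaged over."""
    return 1.0 / (1.0 + sum(mttr / mttf for mttf, mttr in failure_modes))


LEVEL2_MODE = (45.0, 5.0)   # Machine 3, Level 2: MTTF, MTTR
LEVEL3_MODE = (10.0, 0.5)   # Machine 3, Level 3: MTTF, MTTR

# Level 1 view: both failure modes are averaged out.
e1_machine3 = availability([LEVEL2_MODE, LEVEL3_MODE])
# Level 2 view while Machine 3 is up: only Level 3 failures are averaged out.
e2_machine3 = availability([LEVEL3_MODE])

print(round(e1_machine3, 4), round(e2_machine3, 6))
```

Under this assumption the Level 1 value agrees with the quoted 0.8613 to about three decimal places, and the Level 2 value reproduces the 0.952381 used in the Level 2 linear programs of Section 7.4.3.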
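Combining the values of $e^1_i m^1_i$ with the operation times shown in Figure 7-23, the
Level 1 Cell 1 capacity set can be written,
$$\begin{aligned}
\text{Machine 1:}\quad & 0.80\,u^1_1 \le 1.0 \\
\text{Machine 2:}\quad & 0.85\,u^1_2 \le 1.0 \\
\text{Machine 3:}\quad & 0.44\,u^1_1 + 0.46\,u^1_2 \le 0.861 \\
& u^1_1 \ge 0, \quad u^1_2 \ge 0
\end{aligned} \tag{7.4}$$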
When the system is at its hedging point, the hedging point strategy linear program
(4.11) specifies that the rates $u^1_1$ and $u^1_2$ are equal to the demand rates, $u^0_1 = 0.9$
parts per time unit and $u^0_2 = 0.9$ parts per time unit. Using the demand rate values,
(7.4) becomes,
$$\begin{aligned}
\text{Machine 1:}\quad & 0.720 \le 1.0 \\
\text{Machine 2:}\quad & 0.765 \le 1.0 \\
\text{Machine 3:}\quad & 0.810 \le 0.861
\end{aligned} \tag{7.5}$$
Machine 1 is predicted to be idle a fraction 1.0 - 0.720 = 0.28 (28%) of the time. Table 7.8 shows
that Machine 1 is actually idle 25.6% of the time. Likewise, Machine 2 is predicted
to be idle 23.5% of the time compared to an actual idle fraction of 21.5%. Machine
3 is predicted to be idle 5.1% of the time compared to its actual 4.6% idle fraction.
All these predicted numbers are close to the actual numbers, although they are
consistently less than the actual numbers. This is due to the fact that the buffers are
too small to decouple the failures in Machine 3, leading to starvation and blockage.
This points to the need for a refinement in the capacity model to account for finite
buffers between machines. Section 7.2.2 demonstrates this discrepancy in detail.
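The membership test of (7.4) and the idle-time predictions of (7.5) can be sketched as follows. The dictionary layout and function names are illustrative and are not Hiercsim's; the coefficients and right-hand sides are those quoted in (7.4).

```python
# Rows of (7.4): operation-time coefficients for (Process 1, Process 2).
COEFFS = {
    "Machine 1": (0.80, 0.0),
    "Machine 2": (0.0, 0.85),
    "Machine 3": (0.44, 0.46),
}
# Right-hand sides of (7.4): time available on each machine at Level 1.
CAPACITY = {"Machine 1": 1.0, "Machine 2": 1.0, "Machine 3": 0.861}


def loads(rates):
    """Fraction of time each machine must operate to sustain `rates`."""
    return {m: sum(c * u for c, u in zip(row, rates)) for m, row in COEFFS.items()}


def within_capacity(rates):
    """True when `rates` satisfies every constraint of (7.4)."""
    load = loads(rates)
    return all(u >= 0 for u in rates) and all(load[m] <= CAPACITY[m] for m in COEFFS)


def predicted_idle(rates):
    """Predicted idle fraction of each machine at the given rates."""
    load = loads(rates)
    return {m: CAPACITY[m] - load[m] for m in COEFFS}


demand = (0.9, 0.9)
print(within_capacity(demand), predicted_idle(demand))
```

For the demand rates of 0.9 this reproduces the predicted idle fractions of 28%, 23.5%, and 5.1%; a demand vector such as (1.0, 1.0) fails the Machine 3 constraint.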
When the demand rate vector violates any of the constraints in the Level 1 capacity
set, the set of Level 1 production rates will be determined such that the limiting
machine has no idle time. Hiercsim will accept excessive demand rates in the input
file. The controller of the Level 1 cell will translate those rates into Level 1 production
rates which are within the long-term system capacity.
Figures 7-24 and 7-25 are graphs of cumulative production for Process 1 and
Process 2 at Levels 1, 2, and 3, respectively. Notice that despite failures (indicated
by periods of zero increase in cumulative production) the overall demands are met
quite closely.
Figure 7-24: Cumulative Production for Process 1
[Plot: cumulative production (0 to 500 parts) versus Time (0 to 500) for Level 1 P1 and Level 3 P1.]
Figure 7-25: Cumulative Production for Process 2
[Plot: cumulative production (120 to 170 parts) versus Time (140 to 180) for Level 1 P1, Level 2 P1, and Level 3 P1.]
Figure 7-26: Process 1 at its Collective Hedging Point
7.4.2 Hedging Points at Different Levels
In this system, Level 2 hedging points are intended to dampen the effects of the Level
2 failures of Machine 3 on the system downstream of Level 2 Cell 2. Likewise, the
Level 3 hedging points are intended to dampen the effects of the Level 3 failures
of Machine 3 on the system downstream of Level 3 Cell 3.
The overall system hedging point is equal to the sum of the Level 2 and Level
3 hedging points (3.43). In the case of Process 1, this is equal to $z^2_1 + z^3_1 = 15$.
Figure 7-26 shows the cumulative production of Process 1 over a time interval when
the system is at its collective hedging point. Note how the Level 2 production is 10
parts ahead of Level 1 requirements, and how the Level 3 production is 5 parts ahead
of Level 2 requirements.
[Plot: cumulative production (120 to 170 parts) versus Time (140 to 180) for Level 1 P2, Level 2 P2, and Level 3 P2.]
Figure 7-27: Process 2 at its Collective Hedging Point
Similarly for Process 2, the overall system hedging point is equal to $z^2_2 + z^3_2 = 12$.
Figure 7-27 shows the cumulative production of Process 2 over a time interval when
the system is at its collective hedging point. Note how the Level 2 production is 10
parts ahead of Level 1 requirements, and how the Level 3 production is 2 parts ahead
of Level 2 requirements.
7.4.3 Failure/Repair Cycle
This section presents an example of how a flow rate controller responds during a
fail/repair cycle of a machine in the cell, described in Section 4.5.3.
The Level 2 Cell 2 is used as an example cell. At each instance where the Level
2 production rates must be computed by the controller of Level 2 Cell 2 (either in
response to a change of machine state, or to accommodate boundaries in surplus
space), the linear program (4.11) and its optimal solution are given. In addition, the
time at which the controller must recompute its rates is also given.
In the simulation, there are two Level 2 failures of Machine 3 at around t = 200
time units. This example starts at the beginning of the second Level 2 failure (at
t = 223.1), continues through the repair of the machine, and ends when the system
has returned to its hedging point at t = 297.
Figures 7-28 and 7-29 give the history of cumulative production for Process 1 and
Process 2 over the period where the production is disrupted by the Level 2 failure of
Machine 3. Cumulative production at Levels 1, 2, and 3 is shown in the figures.
Rate Calculation in Cell 2 at Time = 223.1000 The second Level 2 failure on
Machine 3 occurs at t = 223.1000. The controller of Level 2 Cell 2 computes produc-
tion rates which account for the fact that Machine 3 is not operational. The linear
program (4.11) at this time is
$$\min\; -6.418\,u^2_1 - 6.711\,u^2_2$$
Subject to
$$\begin{aligned}
\text{Machine 1:}\quad & 0.80\,u^2_1 \le 1.0 \\
\text{Machine 2:}\quad & 0.85\,u^2_2 \le 1.0 \\
\text{Machine 3:}\quad & 0.44\,u^2_1 + 0.46\,u^2_2 \le 0.0 \\
& u^2_1 \ge 0, \quad u^2_2 \ge 0
\end{aligned}$$
The solution to this linear program is
$$u^2_1 = 0.0, \qquad u^2_2 = 0.0$$
Figure 7-28: Recovery of Process 1 From Level 2 Failure
Figure 7-29: Recovery of Process 2 From Level 2 Failure
Because the surplus point is falling away from the hedging point, the time until
the next boundary is reached is undefined and is set to infinity according to (4.49):
$$\Delta t = \infty$$
Notice that the deviation from the hedging point of both processes (4.5) is neg-
ative. This indicates that the system has not yet fully recovered from the previous
Level 2 failure on Machine 3.
Rate Calculation in Cell 2 at Time = 228.0077 Machine 3 is repaired at t =
228.0077. Note that the cost of deviation for both processes has increased since
the time Machine 3 failed. The Level 2 availability of Machine 3 is $e^2_3 = 0.952381$
according to (3.10). The linear program (4.11) used by the controller of Level 2 Cell
2 to specify production rates is
$$\min\; -10.835\,u^2_1 - 11.128\,u^2_2$$
Subject to
$$\begin{aligned}
\text{Machine 1:}\quad & 0.80\,u^2_1 \le 1.0 \\
\text{Machine 2:}\quad & 0.85\,u^2_2 \le 1.0 \\
\text{Machine 3:}\quad & 0.44\,u^2_1 + 0.46\,u^2_2 \le 0.952381 \\
& u^2_1 \ge 0, \quad u^2_2 \ge 0
\end{aligned}$$
The solution to this linear program is
$$u^2_1 = 1.2500, \qquad u^2_2 = 0.8747$$
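The rate calculation at t = 228.0077 can be reproduced with a small sketch. For a two-variable linear program with a bounded feasible region, the optimum lies at a vertex, so it is enough to intersect every pair of constraints and keep the best feasible point. This brute-force vertex enumeration is a stand-in chosen for illustration; it is not the solver used in Hiercsim, and the function name is hypothetical.

```python
from itertools import combinations


def solve_lp_2d(cost, rows, rhs, tol=1e-9):
    """Minimize cost . u subject to rows[i] . u <= rhs[i] (two variables),
    by checking every vertex formed by a pair of constraints."""
    best = None
    for (a, p), (b, q) in combinations(list(zip(rows, rhs)), 2):
        det = a[0] * b[1] - a[1] * b[0]
        if abs(det) < tol:
            continue  # parallel constraints do not define a vertex
        u = ((p * b[1] - a[1] * q) / det, (a[0] * q - b[0] * p) / det)
        if all(r[0] * u[0] + r[1] * u[1] <= h + tol for r, h in zip(rows, rhs)):
            value = cost[0] * u[0] + cost[1] * u[1]
            if best is None or value < best[0]:
                best = (value, u)
    if best is None:
        raise ValueError("no feasible vertex found")
    return best[1]


# Machine 1, Machine 2, Machine 3, then u1 >= 0 and u2 >= 0 written as <= rows.
ROWS = [(0.80, 0.0), (0.0, 0.85), (0.44, 0.46), (-1.0, 0.0), (0.0, -1.0)]

# Machine 3 repaired at t = 228.0077: time available is 0.952381.
u_repaired = solve_lp_2d((-10.835, -11.128), ROWS, [1.0, 1.0, 0.952381, 0.0, 0.0])
# Machine 3 failed at t = 223.1: no time is available on Machine 3.
u_failed = solve_lp_2d((-6.418, -6.711), ROWS, [1.0, 1.0, 0.0, 0.0, 0.0])
print(u_repaired, u_failed)
```

The repaired case returns rates of approximately (1.2500, 0.8747), with Machine 1 and Machine 3 both fully loaded; the failed case returns zero rates for both processes.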
These production rates emphasize Process 1 over Process 2 even though the Process
2 surplus is further behind than that of Process 1. This is due to the shape of the
capacity set, which influences the location of regions in surplus space (see Section 4.7.1)
and therefore the optimal solution to (4.11). If the capacity set were symmetric,
then Process 2 would be allocated more capacity than Process 1. The time until a
boundary will be reached is determined by (4.48), given that neither the capacity set
nor the target Level 1 rates change. That time is
$$\Delta t = 0.508075$$
Therefore, a calculation event is scheduled in the event queue at time t = 228.5158
(Section 4.1.3).
Rate Calculation in Cell 2 at Time = 228.5158 The surplus of Level 2 Cell
2 has reached its first boundary at t = 228.5158 (Section 4.7). The appropriate
boundary constraints are added using the algorithm in Section 4.8.4. A detailed
example of the boundary installation algorithm for this rate calculation can be found
in Section 7.4.4. The linear program which is used to determine the next set of
production rates is
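$$\min\; -10.657\,u^2_1 - 11.141\,u^2_2$$
Subject to
$$\begin{aligned}
\text{Machine 1:}\quad & 0.80\,u^2_1 \le 1.0 \\
\text{Machine 2:}\quad & 0.85\,u^2_2 \le 1.0 \\
\text{Machine 3:}\quad & 0.44\,u^2_1 + 0.46\,u^2_2 \le 0.952381 \\
\text{Boundary:}\quad & 0.723\,u^2_1 - 0.691\,u^2_2 = 0.028277 \\
& u^2_1 \ge 0, \quad u^2_2 \ge 0
\end{aligned}$$
The solution to this linear program is
$$u^2_1 = 1.0546, \qquad u^2_2 = 1.0616$$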
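These rates can be checked by hand. Assuming that the Machine 3 constraint and the boundary constraint are both tight at the optimum (an assumption, consistent with the zero Machine 3 slack reported in Section 7.4.4), the rates solve a 2 by 2 linear system. The sketch below uses the four-digit boundary coefficients 0.7226 and -0.6912 given in Section 7.4.4; the helper name is illustrative.

```python
def solve_2x2(row_a, row_b, rhs_a, rhs_b):
    """Solve row_a . u = rhs_a and row_b . u = rhs_b by Cramer's rule."""
    det = row_a[0] * row_b[1] - row_a[1] * row_b[0]
    if abs(det) < 1e-12:
        raise ValueError("constraints are parallel")
    u1 = (rhs_a * row_b[1] - row_a[1] * rhs_b) / det
    u2 = (row_a[0] * rhs_b - row_b[0] * rhs_a) / det
    return u1, u2


# Machine 3 constraint (tight) and the installed boundary constraint.
u21, u22 = solve_2x2((0.44, 0.46), (0.7226, -0.6912), 0.952381, 0.028277)

# The remaining machine constraints must still hold with slack.
machine1_slack = 1.0 - 0.80 * u21
machine2_slack = 1.0 - 0.85 * u22
print(round(u21, 4), round(u22, 4), machine1_slack > 0, machine2_slack > 0)
```

The result agrees with the quoted solution to within rounding of the coefficients, and both Machine 1 and Machine 2 retain positive slack.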
The solution to the linear problem permits both processes to be run at a Level 2
rate which is greater than the Level 1 production rates. This drives the Level 2 surplus
towards the Level 2 hedging point. The time until a boundary will be reached is
determined by (4.48), given that neither the capacity set nor the target Level 1 rates
change. That time is
$$\Delta t = 68.9281$$
Therefore, a calculation event is scheduled in the event queue at time t = 297.4440
(Section 4.1.3). The next boundary that will be reached is the boundary that corre-
sponds to the hedging point.
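The scheduling step can be pictured with a minimal event-queue sketch. Hiercsim's event queue (Section 4.1.3) is not reproduced in this section, so the heap-based queue, the event labels, and the second event below are hypothetical; only the current time and the value of the time interval come from the text.

```python
import heapq

event_queue = []  # min-heap ordered by event time


def schedule(time, label):
    """Insert an event; the earliest event is always popped first."""
    heapq.heappush(event_queue, (time, label))


now = 228.5158
delta_t = 68.9281  # time until the next boundary, as quoted above

schedule(now + delta_t, "recompute rates, Level 2 Cell 2")
schedule(now + 100.0, "hypothetical later event")

next_time, next_label = heapq.heappop(event_queue)
print(round(next_time, 4), next_label)
```

If a machine state change were to occur before the scheduled time, the rates would be recomputed at that earlier event instead, since the time interval is valid only while the capacity set and the target Level 1 rates remain unchanged.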
Rate Calculation in Cell 2 at Time = 297.4440 At time t = 297.4440, the
Level 2 hedging point has been reached. Since Level 2 Cell 2 contains two process
segments, the linear program (4.1.4) requires two boundary constraints to remain at
the hedging point. The cost of both processes is zero when at the hedging point. The
first boundary is the same as in the previous calculation. The appropriate boundary
constraints are added using the algorithm in Section 4.8.4. The linear program which
is used to determine the next set of production rates is
$$\min\; 0.0\,u^2_1 - 0.0\,u^2_2$$
Subject to
$$\begin{aligned}
\text{Machine 1:}\quad & 0.80\,u^2_1 \le 1.0 \\
\text{Machine 2:}\quad & 0.85\,u^2_2 \le 1.0 \\
\text{Machine 3:}\quad & 0.44\,u^2_1 + 0.46\,u^2_2 \le 0.952381 \\
\text{Boundary:}\quad & 0.723\,u^2_1 - 0.691\,u^2_2 = 0.028277 \\
\text{Boundary:}\quad & 0.691\,u^2_1 + 0.723\,u^2_2 = 1.272478 \\
& u^2_1 \ge 0, \quad u^2_2 \ge 0
\end{aligned}$$
The solution to this linear program is
$$u^2_1 = 0.9000, \qquad u^2_2 = 0.9000$$
The solution to the linear problem restricts both processes to be run at the Level
1 production rates. The calculation which determines the time until the next boundary
is reached is undefined. Unless the Level 2 capacity set (3.12) or the Level 1
production rates change, the Level 2 surplus will always remain at the hedging
point. Therefore, according to (4.49),
$$\Delta t = \infty$$
7.4.4 Boundary Installation Algorithm Example
This section demonstrates the boundary installation algorithm described in Sec-
tion 4.8.4. The system of Section 7.4.3 is used here. The rate calculation at time
t = 228.5158, in which a boundary is installed, is described in detail. The steps of
the multiple boundary installation algorithm are
1. Calculate initial rates
2. Create reference surplus, $x_0$
3. Create ghost surplus, $x_p$
4. Follow trajectory
5. Calculate final rates
In each case, the linear program expanded with slack variables (4.13) is used.
1. Calculate Initial Rates At the start of the rate calculation at t = 228.5158,
initial production rates are computed to determine the initial boundary coefficients.
When the linear program (4.11) is expanded with slack variables^13 in the form of
(4.13), the linear program used to compute the initial rates can be written,
$$\min\; -10.657\,u^2_1 - 11.141\,u^2_2$$
Subject to
$$\begin{aligned}
\text{Machine 1:}\quad & 0.80\,u^2_1 + s_1 = 1.0 \\
\text{Machine 2:}\quad & 0.85\,u^2_2 + s_2 = 1.0 \\
\text{Machine 3:}\quad & 0.44\,u^2_1 + 0.46\,u^2_2 + s_3 = 0.952381 \\
& u^2_1 \ge 0, \quad u^2_2 \ge 0 \\
& s_1 \ge 0, \quad s_2 \ge 0, \quad s_3 \ge 0
\end{aligned}$$
The solution to this linear program is
$$u^2_1 = 0.9346, \quad u^2_2 = 1.1765, \quad s_1 = 0.2523, \quad s_2 = 0.0000, \quad s_3 = 0.0000$$
^13 Slack variables are shown in this section's linear programs to demonstrate the effect of adding
boundaries to the solution. As each boundary is added, the number of non-basic variables decreases
by one. This can be shown explicitly when the slack variables are included.
Slack variables $s_2$ and $s_3$ are nonbasic variables, while variables $u^2_1$, $u^2_2$, and slack
variable $s_1$ are in the basis.
The boundary coefficients $F$ are computed using (4.21). These coefficients define
the boundary hyperplane $R_h$. Using the results of the initial linear program, the
boundary coefficients are
$$F = \begin{bmatrix} F_1 \\ F_2 \end{bmatrix} = \begin{bmatrix} -1.0000 & 0.0000 \\ 0.7226 & -0.6912 \end{bmatrix}$$
The condition which determines whether or not the surplus is on a particular
boundary is given by (4.24). In Hiercsim Version 3.5, the conditions (4.40) and (4.41)
are used instead so that the numerical accuracy of the computer could be taken into
account.
The first condition, (4.40), gives the time until each of the boundaries will be
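reached. For this example, the quantity $(t_1 - t_0)_i$ is the time until the $i$th boundary
will be reached.
The deviation of the surplus $x^2$ from the hedging point $z^2$ is
$$\begin{bmatrix} x^2_1 - z^2_1 \\ x^2_2 - z^2_2 \end{bmatrix} = \begin{bmatrix} -10.657 \\ -11.141 \end{bmatrix}$$
The rate of change of surplus, $\dot{x}^2$, is given by (4.12). In this example, $\dot{x}^2$ is
$$\begin{bmatrix} \dot{x}^2_1 \\ \dot{x}^2_2 \end{bmatrix} = \begin{bmatrix} u^2_1 - u^1_1 \\ u^2_2 - u^1_2 \end{bmatrix} = \begin{bmatrix} 0.0346 \\ 0.2765 \end{bmatrix}$$
The times to reach the two boundaries are computed from (4.38)
$$t_1 - t_0 = \min \begin{bmatrix} (t_1 - t_0)_1 \\ (t_1 - t_0)_2 \end{bmatrix} = \min \begin{bmatrix} 308.0058 \\ 0.00001 \end{bmatrix}$$
For this run of the simulation, small times are considered those less than $\epsilon_2$ =
0.0001. According to this criterion, the surplus is on the second boundary, but is not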
on the first boundary. Because the surplus is on one of the boundaries, the boundary
installation algorithm is continued to Step 2.
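The time-to-boundary values quoted in this section can be reproduced with a short sketch. Equations (4.38) through (4.40) are not reproduced here, so the form below is an assumption: each boundary is treated as a hyperplane through the hedging point with normal vector $F_i$, and a negative result is replaced by infinity in the spirit of (4.49). The function name is illustrative.

```python
import math


def time_to_boundary(normal, deviation, velocity, tol=1e-12):
    """Time for a surplus with the given deviation from the hedging point,
    moving at `velocity`, to reach the hyperplane normal . (x - z) = 0."""
    gap = sum(f * d for f, d in zip(normal, deviation))
    speed = sum(f * v for f, v in zip(normal, velocity))
    if abs(speed) < tol:
        return math.inf  # moving parallel to the boundary
    t = -gap / speed
    return t if t >= 0.0 else math.inf  # negative time: boundary lies behind


# First boundary of the initial rate calculation.
t_first = time_to_boundary((-1.0, 0.0), (-10.657, -11.141), (0.0346, 0.2765))
# Remaining boundary after the boundary constraint has been installed.
t_final = time_to_boundary((-0.6912, -0.7226), (-10.657, -11.141), (0.1546, 0.1616))
# First boundary as seen from the ghost surplus, which has already crossed it.
t_behind = time_to_boundary((0.0, -1.0), (-10.658, -11.141), (0.3500, -0.0253))
print(round(t_first, 4), round(t_final, 2), t_behind)
```

Under this assumption the first value matches the quoted 308.0058, and the second is within rounding of the roughly 68.93 time units needed to reach the hedging point.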
2. Create Reference Surplus $x_0$ The surplus $x^2$ lies close to the second boundary,
but it is not on the boundary within the precision of the computer. The reference
surplus $x_0$ is a surplus which is close to the surplus $x^2$, but which lies precisely on
the boundary. The reference surplus $x_0$, defined in (4.42), is created using the time
interval $(t_1 - t_0)_2 = 0.00001$ and the rates computed in Step 1.
$$\begin{bmatrix} x_{0_1} \\ x_{0_2} \end{bmatrix} = \begin{bmatrix} x^2_1 + (u^2_1 - u^1_1) \times (0.00001) \\ x^2_2 + (u^2_2 - u^1_2) \times (0.00001) \end{bmatrix} = \begin{bmatrix} -10.657 \\ -11.141 \end{bmatrix}$$
The reference surplus $x_0$ is not different from the original surplus $x^2$ to the number
of significant digits displayed in this example. However, the time interval required to
reach the nearest boundary (4.40) changes to
$$t_1 - t_0 = \min \begin{bmatrix} (t_1 - t_0)_1 \\ (t_1 - t_0)_2 \end{bmatrix} = \min \begin{bmatrix} 308.395 \\ 13.37 \times 10^{-16} \end{bmatrix}$$
Note that the time to reach the second boundary is indistinguishable from zero.
3. Create Ghost Surplus, $x_p$ When the surplus $x^2$ lies on more than one boundary
at the same time, it is difficult to distinguish between the boundaries when the
attractiveness condition (4.31) is applied. The ghost surplus $x_p$ is a surplus created
from the reference surplus $x_0$ which does not lie on any boundary (4.43). It is used as
a means to install attractive boundary constraints in the linear program (4.1.4). The
trajectory of the ghost surplus is such that each boundary is encountered individually,
allowing the attractiveness condition (4.31) to be applied.
In this example, the base perturbation amount is 0.001 and the fixed multiplier
is 1.1. These two quantities are related by (4.45). The actual
perturbation $\delta$ about the reference surplus is
$$\delta = \begin{bmatrix} -0.001 \\ 0.000 \end{bmatrix}$$
The cost function of the linear program (4.11) using the ghost surplus is
$$\min\; -10.658\,u^2_1 - 11.141\,u^2_2$$
subject to the same capacity set as the linear program in the initial rate calculation.
When the results of the ghost surplus linear program are used to calculate the
boundary coefficients $F$ (4.21), their values become
$$F = \begin{bmatrix} F_1 \\ F_2 \end{bmatrix} = \begin{bmatrix} 0.0000 & -1.0000 \\ -0.7226 & 0.6912 \end{bmatrix}$$
Notice that the first boundary coefficients are reversed from the initial rates'
boundary coefficients, while the second differs only in sign.
The times (4.40) to reach the boundaries $F$ are
$$t_1 - t_0 = \min \begin{bmatrix} (t_1 - t_0)_1 \\ (t_1 - t_0)_2 \end{bmatrix} = \min \begin{bmatrix} \infty \\ 0.002982 \end{bmatrix}$$
The time to reach the first boundary is undefined, so it is set to be infinity according
to (4.49).
Recall that times are considered to be small when they are less than $\epsilon_2 = 0.0001$.
According to this criterion, the ghost surplus is on neither boundary. In fact, the
ghost surplus crossed a boundary when it was created from the reference surplus,
because the first boundary is now behind the ghost surplus.
If the ghost surplus were still on a boundary after the initial perturbation, the
perturbation amount would be increased according to (4.45) and the ghost surplus
creation process would be repeated. However, because the ghost surplus is on neither of the
boundaries, the boundary installation algorithm is continued to Step 4.
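The perturb-and-retry idea of Step 3 can be sketched as follows. Equations (4.43) and (4.45) are not reproduced in this section, so two points are assumptions: the perturbation is applied along a fixed direction, and each retry multiplies the perturbation amount by the fixed multiplier. The function name and the `on_any_boundary` predicate are hypothetical.

```python
def create_ghost(reference, direction, on_any_boundary,
                 base=0.001, multiplier=1.1, max_tries=100):
    """Perturb `reference` along `direction` until the result is off every
    boundary; return the ghost surplus and the perturbation amount used."""
    amount = base
    for _ in range(max_tries):
        ghost = tuple(x + amount * d for x, d in zip(reference, direction))
        if not on_any_boundary(ghost):
            return ghost, amount
        amount *= multiplier  # assumed growth rule; see (4.45) for the actual one
    raise RuntimeError("no boundary-free ghost surplus found")


reference_deviation = (-10.657, -11.141)
direction = (-1.0, 0.0)  # matches the perturbation of -0.001 in the first coordinate

# Case from the text: the first perturbation already clears both boundaries.
ghost, amount = create_ghost(reference_deviation, direction, lambda g: False)

# Hypothetical case: anything within 0.00115 of the reference counts as "on".
near = lambda g: abs(g[0] - reference_deviation[0]) < 0.00115
ghost_retry, amount_retry = create_ghost(reference_deviation, direction, near)
print(ghost, amount, amount_retry)
```

In the first case the ghost deviation is (-10.658, -11.141), which matches the cost coefficients of the ghost surplus linear program. In the hypothetical second case two retries are needed before the ghost surplus clears the boundary.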
4. Follow Trajectory Up to this point in the algorithm, it is only known that the
surplus lies on a boundary. Any information generated by the initial rate calculation
has been lost due to the creation of the ghost surplus. Therefore, the projected
trajectory of Section 4.8.2 must be created from scratch.
Using the notation of Section 4.7.2, the production rates using the ghost surplus
are $u^-$ and the rates on the other side of the boundary from the ghost surplus are $u^+$.
Those rates will be used in the conditions (4.31) and (4.32) to determine whether or
not the boundary is attractive.
Compute $u^-$ The ghost surplus is used in the cost function of (4.11) to compute
$u^-$ as follows:
$$\min\; -10.658\,u^-_1 - 11.141\,u^-_2$$
Subject to
$$\begin{aligned}
\text{Machine 1:}\quad & 0.80\,u^-_1 + s_1 = 1.0 \\
\text{Machine 2:}\quad & 0.85\,u^-_2 + s_2 = 1.0 \\
\text{Machine 3:}\quad & 0.44\,u^-_1 + 0.46\,u^-_2 + s_3 = 0.952381 \\
& u^-_1 \ge 0, \quad u^-_2 \ge 0 \\
& s_1 \ge 0, \quad s_2 \ge 0, \quad s_3 \ge 0
\end{aligned}$$
The solution to this linear program is
$$u^-_1 = 1.2500, \quad u^-_2 = 0.8747, \quad s_1 = 0.0000, \quad s_2 = 0.2565, \quad s_3 = 0.0000$$
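Slack variables $s_1$ and $s_3$ are nonbasic variables, while variables $u^-_1$, $u^-_2$, and slack
variable $s_2$ are in the basis. Notice that this basis is different from the initial rate
calculation basis.
The boundary coefficients $F^-$ are computed using (4.21). These coefficients define
the boundary hyperplane $R_h$. The boundary coefficients are
$$F^- = \begin{bmatrix} F^-_1 \\ F^-_2 \end{bmatrix} = \begin{bmatrix} 0.0000 & -1.0000 \\ -0.7226 & 0.6912 \end{bmatrix}$$
The deviation of the ghost surplus $x_p$ from the hedging point $z^2$ is
$$\begin{bmatrix} x_{p_1} - z^2_1 \\ x_{p_2} - z^2_2 \end{bmatrix} = \begin{bmatrix} -10.658 \\ -11.141 \end{bmatrix}$$
The rate of change of the ghost surplus $\dot{x}_p$ is given by
$$\begin{bmatrix} \dot{x}_{p_1} \\ \dot{x}_{p_2} \end{bmatrix} = \begin{bmatrix} u^-_1 - u^1_1 \\ u^-_2 - u^1_2 \end{bmatrix} = \begin{bmatrix} 0.3500 \\ -0.0253 \end{bmatrix}$$
The times to reach the two boundaries are
$$t_1 - t_0 = \min \begin{bmatrix} (t_1 - t_0)_1 \\ (t_1 - t_0)_2 \end{bmatrix} = \min \begin{bmatrix} \infty \\ 0.002982 \end{bmatrix}$$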
For this run of the simulation, times are considered to be small if they are less
than $\epsilon_2 = 0.0001$. Note that the first boundary has a time interval $(t_1 - t_0)_1 = \infty$
(4.49). This is because the computed time interval is undefined (it is less than zero),
which indicates that the boundary lies in the direction opposite to the ghost surplus's
direction of travel. Given the rates from the linear program, the ghost surplus is 0.002982 time
units from the second boundary.
Compute $u^+$ The ghost surplus $x_p$ is moved using (4.39), as if 0.002982 + 0.0001 =
0.003082 time units had elapsed. This places the ghost surplus $x_p$ across the second
boundary. The rates across the boundary, $u^+$, are computed so that the attractiveness
criterion (4.31) may be applied to the boundary that was just crossed. The following
linear program is used to compute $u^+$:
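$$\min\; -10.658\,u^+_1 - 11.141\,u^+_2$$
Subject to
$$\begin{aligned}
\text{Machine 1:}\quad & 0.80\,u^+_1 + s_1 = 1.0 \\
\text{Machine 2:}\quad & 0.85\,u^+_2 + s_2 = 1.0 \\
\text{Machine 3:}\quad & 0.44\,u^+_1 + 0.46\,u^+_2 + s_3 = 0.952381 \\
& u^+_1 \ge 0, \quad u^+_2 \ge 0 \\
& s_1 \ge 0, \quad s_2 \ge 0, \quad s_3 \ge 0
\end{aligned}$$
The solution to this linear program is
$$u^+_1 = 0.9346, \quad u^+_2 = 1.1765, \quad s_1 = 0.2523, \quad s_2 = 0.0000, \quad s_3 = 0.0000$$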
Slack variables s2 and s3 are nonbasic variables, while variables u + , + , and slack
variable Sl are in the basis.
The boundary coefficients F + are computed using (4.21). These coefficients define
the boundary hyperplane R+. The boundary coefficients are
F F 1 [ -1.0000 0.0000
F2 J 0.7226 -0.6912
The deviation of the ghost surplus $x_p$ from the hedging point $z$ is
\[
\begin{bmatrix} x_{p1} - z_1 \\ x_{p2} - z_2 \end{bmatrix} =
\begin{bmatrix} -10.657 \\ -11.141 \end{bmatrix}
\]
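As a rough check on the move of the ghost surplus, the sketch below advances the deviation linearly in time. It assumes that (4.39) amounts to adding the rate of change multiplied by the elapsed time, which is an assumption made here for illustration; the function name `advance` is hypothetical. The inputs are the rounded values quoted for the u- solution, and the elapsed time is the boundary time of 0.002982 plus the small-time tolerance of 0.0001.

```python
# Sketch only: linear motion of the surplus deviation under constant rates.
# The linear form is an assumption about (4.39), not taken from the source.

def advance(deviation, rate_of_change, dt):
    """Return deviation + rate_of_change * dt, component by component."""
    return [d + r * dt for d, r in zip(deviation, rate_of_change)]

# Rounded values quoted in the text for the u- solution.
deviation_before = [-10.658, -11.141]   # x_p - z before the move
rate_minus = [0.3500, -0.0253]          # rate of change of x_p under u-
dt = 0.002982 + 0.0001                  # boundary time plus small-time tolerance

deviation_after = advance(deviation_before, rate_minus, dt)
print([round(v, 3) for v in deviation_after])
```

With these rounded inputs, the result agrees to three decimal places with the deviation listed after the move.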
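The rate of change of the ghost surplus $\dot{x}_p$ is given by
\[
\begin{bmatrix} \dot{x}_{p1} \\ \dot{x}_{p2} \end{bmatrix} =
\begin{bmatrix} 0.0346 \\ 0.2765 \end{bmatrix}
\]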
The times to reach the two boundaries are
\[
t_1 - t_0 = \min \begin{bmatrix} (t_1 - t_0)_1 \\ (t_1 - t_0)_2 \end{bmatrix}
= \min \begin{bmatrix} 308.006 \\ 6.68 \times 10^{-16} \end{bmatrix}
\]
The second boundary satisfies the attractive boundary condition (4.31) because
both u- and u+ drive the ghost surplus towards the boundary.
The attractive boundary constraint (4.35) is therefore
\[
0.7226\,u^2_{21} - 0.6912\,u^2_{22} = 0.028277
\]
Compute Rates After Boundary Installation After a boundary is installed,
it is necessary to determine whether or not there are more attractive boundaries. The
linear program (4.13) (which includes the boundary constraint) is solved using the
reference surplus:
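\[
\begin{aligned}
\min \quad & -10.657\,u^2_{21} - 11.141\,u^2_{22} \\
\text{subject to} \quad & \\
\text{Machine 1:} \quad & 0.80\,u^2_{21} + s_1 = 1.0 \\
\text{Machine 2:} \quad & 0.85\,u^2_{22} + s_2 = 1.0 \\
\text{Machine 3:} \quad & 0.44\,u^2_{21} + 0.46\,u^2_{22} + s_3 = 0.952381 \\
\text{Boundary:} \quad & 0.7226\,u^2_{21} - 0.6912\,u^2_{22} = 0.028277 \\
& u^2_{21} \ge 0, \quad u^2_{22} \ge 0 \\
& s_1 \ge 0, \quad s_2 \ge 0, \quad s_3 \ge 0
\end{aligned}
\]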
The solution to this linear program is
\[
\begin{aligned}
u^2_{21} &= 1.0546 \\
u^2_{22} &= 1.0616 \\
s_1 &= 0.1563 \\
s_2 &= 0.0976 \\
s_3 &= 0.0000
\end{aligned}
\]
Slack variable $s_3$ is the nonbasic variable, while variables $u^2_{21}$, $u^2_{22}$, and slack
variables $s_1$ and $s_2$ are in the basis.
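Since $s_3$ is nonbasic (zero) and the boundary constraint is an equality, the two production rates lie at the intersection of the Machine 3 constraint and the boundary constraint. The sketch below solves that 2-by-2 system directly with Cramer's rule rather than with the simplex method used by the simulation; the helper `solve_2x2` is a hypothetical name introduced for illustration.

```python
# Sketch only: intersect the two binding constraints of the linear program.
# This is not the simplex code used in the simulation.

def solve_2x2(a11, a12, b1, a21, a22, b2):
    """Solve [[a11, a12], [a21, a22]] [x, y]^T = [b1, b2]^T by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("constraints are parallel; no unique intersection")
    x = (b1 * a22 - a12 * b2) / det
    y = (a11 * b2 - a21 * b1) / det
    return x, y

# Machine 3 with s3 = 0, and the installed boundary constraint.
u21, u22 = solve_2x2(0.44, 0.46, 0.952381,
                     0.7226, -0.6912, 0.028277)

# The remaining slack variables follow from the Machine 1 and Machine 2 rows.
s1 = 1.0 - 0.80 * u21
s2 = 1.0 - 0.85 * u22
print(round(u21, 4), round(u22, 4), round(s1, 4), round(s2, 4))
```

The computed rates and slacks agree with the listed solution to within the rounding of the printed coefficients.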
The boundary coefficients F are computed using (4.21). These coefficients define
the boundary hyperplane Rh. The boundary coefficients are
\[
F = F_1 = \begin{bmatrix} -0.6912 & -0.7226 \end{bmatrix}
\]
The deviation of the reference surplus $x_o$ from the hedging point $z$ is
\[
\begin{bmatrix} x_{o1} - z_1 \\ x_{o2} - z_2 \end{bmatrix} =
\begin{bmatrix} -10.657 \\ -11.141 \end{bmatrix}
\]
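The rate of change of surplus, $\dot{x}$, which is given by (4.12), is
\[
\begin{bmatrix} \dot{x}^2_{21} \\ \dot{x}^2_{22} \end{bmatrix} =
\begin{bmatrix} u^2_{21} - u^1_{11} \\ u^2_{22} - u^1_{12} \end{bmatrix} =
\begin{bmatrix} 0.1546 \\ 0.1616 \end{bmatrix}
\]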
The time to reach the remaining boundary (4.38) is
\[
t_1 - t_0 = 68.9291
\]
For this run of the simulation, times are considered to be small if they are less
than $\epsilon_2 = 0.0001$. According to this criterion, the reference surplus is not on this
boundary, so the procedure which follows the trajectory of the ghost surplus $x_p$ is
complete. Given this information, the final production rates can be computed since
all attractive boundaries have been installed. The procedure continues with Step 5,
the final rate calculation.
(If there had been another boundary, the procedure would have continued with
the calculation of another pair of production rates u- and u+ at the start of Step 4.)
5. Final Rate Calculation The final rate calculation uses the linear program
(4.36) and the original surplus x2. If the original surplus is on one or more boundaries,
those boundaries are unattractive. An appropriately small amount is added to the
surplus so that it is pushed across the unattractive boundaries. On the other hand,
if the original surplus is not on any boundary, then it is used without modification.
The final production rates are computed using the following linear program:
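\[
\begin{aligned}
\min \quad & -10.657\,u^2_{21} - 11.141\,u^2_{22} \\
\text{subject to} \quad & \\
\text{Machine 1:} \quad & 0.80\,u^2_{21} + s_1 = 1.0 \\
\text{Machine 2:} \quad & 0.85\,u^2_{22} + s_2 = 1.0 \\
\text{Machine 3:} \quad & 0.44\,u^2_{21} + 0.46\,u^2_{22} + s_3 = 0.952381 \\
\text{Boundary:} \quad & 0.7226\,u^2_{21} - 0.6912\,u^2_{22} = 0.028277 \\
& u^2_{21} \ge 0, \quad u^2_{22} \ge 0 \\
& s_1 \ge 0, \quad s_2 \ge 0, \quad s_3 \ge 0
\end{aligned}
\]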
The solution to this linear program is
\[
\begin{aligned}
u^2_{21} &= 1.0546 \\
u^2_{22} &= 1.0616 \\
s_1 &= 0.1563 \\
s_2 &= 0.0976 \\
s_3 &= 0.0000
\end{aligned}
\]
The boundary coefficients F are computed using (4.21). These coefficients define
the boundary hyperplane Rh. The boundary coefficients are
F = F1 = [ -0.6912 -0.7226]
The time to reach this boundary is
\[
t_1 - t_0 = 68.9291
\]
The next rate calculation is scheduled 68.9291 time units in the future, assuming
that no failures will occur, and the Level 1 production rates do not change. If there
had been an unattractive boundary, the time required to move the surplus across that
boundary would be added into the scheduled time using (4.48).
7.4.5 Variation in Relative Priority Coefficients
Section 4.4.3 stated that by changing the relative priority coefficients of each process
in the cost function (4.5), the controller's response to events could be altered to favor
one process over another. This section uses the system described above to show how
that response is varied by increasing the cost of deviation for Process 1 by a factor of
10 relative to Process 2.
Figures 7-30 and 7-31 show the cumulative production of Process 1 and Process
2 over an interval of 300 time units. Note that since the relative priority coefficients
changed, the simulation's use of the random number generator also changed. This
resulted in a different sequence of failures at Machine 3. However, a comparison of
the general trends can be made between the curves in Figures 7-30 and 7-31 and
those of Figures 7-24 and 7-25 (where the relative priorities of the two processes are
identical).
Figure 7-30: Process 1 Cumulative Production, High Priority
Figure 7-31: Process 2 Cumulative Production, Low Priority ($A_1 = 10$, $A_2 = 1$; horizontal axis: Time)
The Level 2 production tracks the Level 1 production closely regardless of the
values of the relative priority coefficients. However, the detailed behavior of the Level
2 controller's response to a failure is quite different.
When the relative priority coefficients are identical, neither process is favored over
the other. This can be seen in Figures 7-24 and 7-25 where after a failure occurs,
both processes reach their respective hedging points at roughly the same time and
with the same speed.
When the relative priority coefficients differ, the cost of being behind in one process
is much greater than that of the other process. Therefore, the controller attempts to
minimize the cost by allocating a larger proportion of capacity to the more expensive
process. In this example, the cost of deviation for Process 1 is 10 times
that for Process 2. That is,
\[
A_1 = 10, \qquad A_2 = 1
\]
Process 1 is always the first to reach its hedging point after a failure. Process
2 only begins to catch up after the deviation of Process 1 has been reduced to 1/10th
of the magnitude of the deviation of Process 2. Overall, Process 2 experiences much
larger swings in deviation from its hedging point when compared to Process 1.
The opposite results can be obtained by reversing the relative priority coefficients
between Process 1 and Process 2. In Figures 7-32 and 7-33, cumulative production
is shown for Process 1 and Process 2 where $A_1 = 1$ and $A_2 = 10$. Note how closely
the Level 1 production is tracked by Level 2 Process 2 compared to that of Level 2
Process 1.
Figure 7-32: Process 1 Cumulative Production, Low Priority ($A_1 = 1$, $A_2 = 10$; curves: Level 1 P1, Level 2 P1, Level 3 P1; horizontal axis: Time)
Figure 7-33: Process 2 Cumulative Production, High Priority
7.5 Demonstration of Reentrant Flow Anti-Looping Constraints
Reentrant flows and their impacts on the hedging point strategy algorithms are
discussed in Sections 3.12 and 4.5.1. This section presents an example of a reentrant
flow simulation. The first part of this section provides an overview of the perfor-
mance of the hierarchical controller with a reentrant process. The second part details
the linear program constraints required to accommodate the anti-looping constraints
(3.34) and (3.35) in the frequency decomposition hierarchy of Section 3.5.1.
It is important to note that not all of the reentrant flow algorithm bugs have been
resolved in Hiercsim Version 3.5. The issue of whether or not a virtual machine and
an anti-loop constraint due to the same buffer failure are redundant has not been fully
explored. Logic might have to be added to the creation of the work LP (Section 4.1.4)
such that a virtual machine constraint is removed when the corresponding anti-loop
constraint is present.
Evidence of a problem is demonstrated by the fact that some of the linear programs
shown in Section 7.5.2 required many iterations to come to a stable solution. In
addition, the last linear program in Section 7.5.2 has an inconsistency in its rate
solution. Since the hedging point strategy algorithm in Hiercsim Version 3.5 works
well for systems without reentrant processes, it is assumed that the problem arises
from the implementation of the anti-loop constraints.
The reentrant flow algorithm specified production rates which drove the surplus
to a point in the vicinity of the hedging point, but not directly to the hedging point.
This caused an additional rate calculation to actually put the surplus on the hedging
point. However, the overall system performance did not suffer beyond the extra
calculation effort, as is shown in Section 7.5.1. The robustness of the hedging point
strategy (which is based on a feedback control law) is shown by the fact that the
system performs well despite a less than ideal algorithm.
7.5.1 Reentrant Flow Behavior
Figure 7-34: Reentrant Flow System (Machine M1: MTTF = 85, MTTR = 15)
System Description The system is shown in Figure 7-34. It consists of a single
unreliable machine. The process which transforms raw material into finished product
consists of three steps. Each step uses the same machine. The sequence of the steps
in the process is fixed. There are two buffers in the system. Each buffer can hold
up to 5 parts. The buffers are used to temporarily store parts that have completed
intermediate stages of production.
The scheduling problem is to divide the capacity of the machine between the three
stages of the process so that the overall demand rate is met.
The control system which accomplishes this task is divided into three control
levels. The controller of Level 1 Cell 1 views the process as a single entity and
sets the Level 1 production rate according to the demand rate $u^0$ and the long-term
capacity of the machine (3.12).
The controller of Level 2 Cell 2 responds to Level 2 failures of Machine 1. It
also treats the process as three independent process segments, each with its unique
production rate, surplus, and hedging point. The Level 2 production rates for each
of the three process segments are sent to Level 3 Cell 3 whose controller clears parts
for production according to the staircase policy of Section 3.9.
The operation times on the machine for each stage of the process were chosen
to be different in order that the virtual machine constraints of Section 3.8 and the
anti-looping constraints of Section 3.12 could be demonstrated.
The target demand rate for Process 1 was chosen to be within the long-term
capacity of the system. Its relative priority coefficient was arbitrarily chosen to be
1.0. The values of the Level 2 hedging points provide for an average of 2.5 parts in
each buffer when the system is at its collective hedging point.
The simulation was run on Hiercsim Version 3.5 for a total of 500 time units.
Overall Results Table 7.9 shows the overall performance of the system over 500
time units. Actual production kept pace with requirements. The machine utilization
is as expected. The machine was idle 17.8% of the time, which indicates that the
demand rate is much lower than the system capacity. This compares to an expected
idle fraction of 19%.

Overall Performance After 500 Time Units
Process Name   Total Required   Total Produced   Mean Rate   Mean WIP   Mean Cycle Time
Process 1      150              150              0.3         4.94       16.162

Machine Group Utilization Fractions
Group Name   # Machines   % Idle    % Operating   % Failed
M1           1            17.80%    67.20%        15.00%

Buffer Statistics
Buffer Name   Buffer Size   Average Buffer   Average Wait in Buffer
Buffer 11     5             2.31             3.745
Buffer 21     5             1.81             2.974

Table 7.9: Results of Reentrant Flow System
Figure 7-35: Cumulative Production, Long Interval (curves: Level 2 Step 1, Level 2 Step 2, Level 2 Step 3; horizontal axis: Time)
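The expected idle fraction of 19% can be estimated from the system parameters. The sketch below assumes that the failed fraction is MTTR / (MTTF + MTTR), that the operating fraction is the demand rate multiplied by the total operation time per part, and that the operation times are the coefficients of the Machine 1 constraint in Section 7.5.2. These are assumptions made for illustration, not the thesis's own derivation.

```python
# Sketch only: expected time fractions for the single unreliable machine.

mttf, mttr = 85.0, 15.0                # Machine 1 values shown in Figure 7-34
operation_times = [0.75, 0.80, 0.65]   # Machine 1 coefficients from Section 7.5.2
demand_rate = 0.3                      # mean rate reported in Table 7.9

failed = mttr / (mttf + mttr)                   # long-run fraction of time under repair
operating = demand_rate * sum(operation_times)  # machine time needed per unit time
idle = 1.0 - failed - operating

print(f"failed={failed:.2f} operating={operating:.2f} idle={idle:.2f}")
```

Under these assumptions the operating fraction is 66% and the failed fraction is 15%, which compare with the simulated values of 67.2% and 15.0% in Table 7.9.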
Figures 7-35 and 7-36 are graphs of cumulative production. Figure 7-35 shows
the production over the entire interval of 500 time units. Figure 7-36 shows the
production over the interval of time where the system begins to recover from a Level
2 failure in Machine 1.
At t = 214.4, Machine 1 is repaired, and can resume production. Level 2 Process
Segment 3 is allocated all the capacity of the machine until its entry buffer empties. At
that time (t = 216.1), the production rate of Level 2 Process Segment 3 is constrained
by that of Level 2 Process Segment 2, as can be seen by the merging of the two curves.
This constraint is implemented in the form of a starvation anti-looping constraint
(3.34).
As the system continues to recover from the failure, it continually reduces the
deviation of the surplus of each process segment from its respective hedging point.
Figure 7-36: Cumulative Production, Short Interval (curves: Level 1, Level 2 Step 3; horizontal axis: Time, 210 to 260)
Figure 7-37: Amount of Material in Buffer 1, Long Interval (curve: Level 2 B1; horizontal axis: Time, 0 to 500)
At t = 252.2, the production rates of Process Segment 3 are less than those of
Process Segment 2, and so material in Buffer 2 begins to accumulate. At that point,
the starvation anti-looping constraint (3.34) is removed and the process segments are
controlled as independent units.
Figure 7-37 shows the amount of material in Buffer 1 over the entire interval of
the simulation. Figure 7-38 shows the amount of material in Buffer 1 over the shorter
interval from t = 210 to t = 260. The Level 2 amount of material in Buffer 1 is never
zero, nor is it ever equal to the maximum buffer size.
Figure 7-39 shows the amount of material in Buffer 2 over the entire interval of
the simulation. Figure 7-40 shows the amount of material in Buffer 2 over the shorter
interval from t = 210 to t = 260. Notice that during the interval between t = 216.4
and t = 252, the amount of material in Buffer 2 is equal to zero. During that time
interval, the production rate of Process Segment 3 is restricted to be less than or
equal to that of Process Segment 2.
Figure 7-38: Amount of Material in Buffer 1, Short Interval (horizontal axis: Time, 210 to 260)
Figure 7-39: Amount of Material in Buffer 2, Long Interval (curve: Level 2 B2; horizontal axis: Time, 0 to 500)
Figure 7-40: Amount of Material in Buffer 2, Short Interval
7.5.2 Reentrant Flow Controller Linear Programs
This section contains the complete set of linear programs and their solutions for the
period between t = 210 and t = 260. Combined, these linear programs demonstrate
how the controller in Hiercsim Version 3.5 responds to a failure of a machine which
operates on a reentrant process.
This example shows that the reentrant flow constraint implementation requires
refinement. In particular, some linear programs had to be solved multiple times
before a solution was found. This may be due to the fact that the starvation anti-
looping constraint (3.34) duplicates the starvation virtual machine constraint (2.19).
As the algorithm becomes better understood, this problem should be resolved.
Rate Calculation in Cell 2 at Time = 214.3529 This is the first linear pro-
gram (4.11) to be solved after the repair of Machine 1. For that reason, all process
segments are behind requirements.
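\[
\begin{aligned}
\min \quad & -18.096\,u^2_{21} - 18.057\,u^2_{22} - 18.094\,u^2_{23} \\
\text{subject to} \quad & \\
\text{Machine 1:} \quad & 0.75\,u^2_{21} + 0.80\,u^2_{22} + 0.65\,u^2_{23} + s_1 = 1.0 \\
& u^2_{21} \ge 0, \quad u^2_{22} \ge 0, \quad u^2_{23} \ge 0 \\
& s_1 \ge 0
\end{aligned}
\]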
The solution to this linear program is
\[
\begin{aligned}
u^2_{21} &= 0.0000 \\
u^2_{22} &= 0.0000 \\
u^2_{23} &= 1.5385 \\
s_1 &= 0.0000
\end{aligned}
\]
The time interval to reach the next boundary is computed using the algorithm in
Section 4.8.4, and is
\[
t_1 - t_0 = 1.6089
\]
The solution to this linear program allocates all capacity to Process Segment 3,
causing the amount of material in Buffer 2 to decrease rapidly, as can be seen in
Figure 7-40. The next rate calculation is scheduled 1.6089 time units in the future, at
time t = 215.9618, assuming that no failures will occur, and the Level 1 production
rates do not change.
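With a single capacity constraint and nonnegative rates, a linear program of this form places all capacity on the variable with the greatest cost reduction per unit of machine time. The sketch below applies that rule to the coefficients of the linear program solved at time 214.3529. It is not the simplex code used in Hiercsim, and the function name `allocate_single_machine` is hypothetical.

```python
# Sketch only: closed-form solution of an LP with one capacity constraint.
#   minimize   sum(cost[j] * u[j])
#   subject to sum(op_time[j] * u[j]) <= capacity,  u[j] >= 0

def allocate_single_machine(cost, op_time, capacity=1.0):
    """Give all capacity to the variable with the best cost reduction per unit time."""
    ratios = [-c / t for c, t in zip(cost, op_time)]
    best = max(range(len(cost)), key=lambda j: ratios[j])
    u = [0.0] * len(cost)
    if ratios[best] > 0:               # produce only if it lowers the objective
        u[best] = capacity / op_time[best]
    return u

# Objective coefficients and Machine 1 operation times from the linear program.
u = allocate_single_machine([-18.096, -18.057, -18.094], [0.75, 0.80, 0.65])
print([round(v, 4) for v in u])
```

Process Segment 3 has the shortest operation time and nearly the largest deviation, so it receives the whole machine, in agreement with the listed solution.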
Rate Calculation in Cell 2 at Time = 215.9618 This linear program is trig-
gered because the Level 2 surplus has reached a boundary. The boundary is installed
according to the algorithm in Section 4.8.4.
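\[
\begin{aligned}
\min \quad & -18.579\,u^2_{21} - 18.540\,u^2_{22} - 16.102\,u^2_{23} \\
\text{subject to} \quad & \\
\text{Machine 1:} \quad & 0.75\,u^2_{21} + 0.80\,u^2_{22} + 0.65\,u^2_{23} + s_1 = 1.0 \\
\text{Boundary:} \quad & -0.655\,u^2_{21} + 0.756\,u^2_{23} = 0.030228 \\
& u^2_{21} \ge 0, \quad u^2_{22} \ge 0, \quad u^2_{23} \ge 0 \\
& s_1 \ge 0
\end{aligned}
\]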
The solution to this linear program is
\[
\begin{aligned}
u^2_{21} &= 0.7416 \\
u^2_{22} &= 0.0000 \\
u^2_{23} &= 0.6827 \\
s_1 &= 0.0000
\end{aligned}
\]
The time interval to reach the next boundary is computed using the algorithm in
Section 4.8.4, and is
\[
t_1 - t_0 = 1.6569
\]
The solution to this linear program begins to divide the capacity of the machine
among the process segments. On the basis of the boundary interval alone, the next
rate calculation would be scheduled 1.6569 time units in the future, at time
t = 217.6187, assuming that no failures will occur, and the Level
1 production rates do not change.
However, Buffer 2 is running out of parts. The time for Buffer 2 to empty, $\Delta t_v$,
is computed using (4.7) and is less than the time to reach the next boundary.
\[
\Delta t_v = -\frac{b_2}{\dot{b}_2} = -\frac{0.06184}{-0.6827} = 0.09057
\]
The starvation of Buffer 2 will change the capacity set of Level 2 Cell 2. Since
$\Delta t_v$ is less than the boundary interval $t_1 - t_0$, the next time a rate calculation is
required is $\Delta t_v$ time units in the future.
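The scheduling decision described here can be sketched as taking the earlier of two events. The code assumes constant rates, that Buffer 2 holds 0.06184 parts, and that its inflow and outflow are the production rates of Process Segments 2 and 3. The helper `time_to_empty` is a hypothetical name, not a Hiercsim function.

```python
# Sketch only: schedule the next rate calculation at the earlier of
# (a) the time to reach the next boundary and (b) the time for a buffer to empty.

def time_to_empty(level, inflow, outflow):
    """Time until a buffer drains at constant rates; infinite if it is not draining."""
    net = inflow - outflow
    return level / -net if net < 0 else float("inf")

t_now = 215.9618
boundary_interval = 1.6569             # t1 - t0 from the linear program
dt_v = time_to_empty(level=0.06184, inflow=0.0, outflow=0.6827)

next_calc = t_now + min(boundary_interval, dt_v)
print(round(dt_v, 4), round(next_calc, 4))
```

The buffer empties first, so the next rate calculation falls near t = 216.05, which matches the time of the following linear program to within rounding.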
Rate Calculation in Cell 2 at Time = 216.0523 Once Buffer 2 empties, Level
2 Process Segment 3 is limited by the rate of Level 2 Process Segment 2, according to
the constraint (3.34). In addition, Hiercsim Version 3.5 includes the starvation
constraint (2.19). This may be the cause of the failure to converge seen in later linear
programs in this example. Note that the boundary which was installed in the previous
linear program has been removed.
The following linear program is used to solve for production rates [footnote 14]:
\[
\begin{aligned}
\min \quad & -18.539\,u^2_{21} - 18.567\,u^2_{22} - 16.067\,u^2_{23} \\
\text{subject to} \quad & \\
\text{Machine 1:} \quad & 0.75\,u^2_{21} + 0.80\,u^2_{22} + 0.65\,u^2_{23} + s_1 = 1.0 \\
\text{Starve 2:} \quad & u^2_{23} + s_2 = 0.030228 \\
\text{Starve AL 2:} \quad & -u^2_{22} + u^2_{23} + s_3 = 0.0000 \\
& u^2_{21} \ge 0, \quad u^2_{22} \ge 0, \quad u^2_{23} \ge 0 \\
& s_1 \ge 0, \quad s_2 \ge 0, \quad s_3 \ge 0
\end{aligned}
\]
The solution to this linear program is
\[
\begin{aligned}
u^2_{21} &= 1.3333 \\
u^2_{22} &= 0.0000 \\
s_1 &= 0.0000 \\
s_2 &= 0.0000 \\
s_3 &= 0.0000
\end{aligned}
\]
[Footnote 14] The term Starve 2 indicates a virtual machine constraint (2.19) on Process Segment 3 due to
Buffer 2 being empty. Likewise, the term Starve AL 2 indicates a starvation anti-loop constraint
(3.34) on Process Segment 3 due to the reentrant link to Process Segment 2.
The time interval to reach the next boundary is computed using the algorithm in
Section 4.8.4, and is
\[
t_1 - t_0 = 0.8613
\]
Due to the zero production rate of Process Segment 2, the production rate of
Process Segment 3 is forced to be zero. This allocates all capacity to the production
of Process Segment 1 parts. The first boundary will be reached after 0.8613 time
units, which occurs at t = 216.9136.
Rate Calculation in Cell 2 at Time = 216.9136 At time t = 216.9136, the
Level 2 surplus is at a boundary. The appropriate boundary constraint is installed,
and the production rates are computed using the following linear program:
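\[
\begin{aligned}
\min \quad & -17.649\,u^2_{21} - 18.825\,u^2_{22} - 16.325\,u^2_{23} \\
\text{subject to} \quad & \\
\text{Machine 1:} \quad & 0.75\,u^2_{21} + 0.80\,u^2_{22} + 0.65\,u^2_{23} + s_1 = 1.0 \\
\text{Starve 2:} \quad & u^2_{23} + s_2 = 0.452750 \\
\text{Starve AL 2:} \quad & -u^2_{22} + u^2_{23} + s_3 = 0.0000 \\
\text{Boundary:} \quad & -0.807\,u^2_{21} + 0.417\,u^2_{22} + 0.417\,u^2_{23} = 0.008349 \\
& u^2_{21} \ge 0, \quad u^2_{22} \ge 0, \quad u^2_{23} \ge 0 \\
& s_1 \ge 0, \quad s_2 \ge 0, \quad s_3 \ge 0
\end{aligned}
\]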
The solution to this linear program is
\[
\begin{aligned}
u^2_{21} &= 0.4580 \\
u^2_{22} &= 0.4527 \\
u^2_{23} &= 0.4527 \\
s_1 &= 0.0000 \\
s_2 &= 0.0000 \\
s_3 &= 0.0000
\end{aligned}
\]
The time interval to reach the next boundary is computed using the algorithm in
Section 4.8.4, and is
\[
t_1 - t_0 = 34.7794
\]
This linear program required 17 iterations before it converged to the above
solution. This may be due to the presence of both the starvation constraint
and the anti-looping constraint in the linear program. The current implementation of
Hiercsim does not take into account the fact that a starvation virtual machine and
an anti-looping virtual machine on the same process segment are redundant.
All three process segments are being produced now, but the rate of flow of material
into Buffer 2 is still zero. The next rate calculation is scheduled 34.7794 time units
in the future, at time t = 251.6930, at which point a boundary is reached.
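One way to examine a reported solution is to substitute it into the constraints and inspect the residuals. The sketch below does this for the linear program solved at time 216.9136, using the rounded values as printed, so residuals on the order of the rounding error are expected. The variable names are illustrative only.

```python
# Sketch only: residuals of the printed solution in each equality constraint
# (left-hand side minus right-hand side), using rounded published values.

u21, u22, u23 = 0.4580, 0.4527, 0.4527
s1 = s2 = s3 = 0.0

residuals = {
    "Machine 1":   0.75 * u21 + 0.80 * u22 + 0.65 * u23 + s1 - 1.0,
    "Starve 2":    u23 + s2 - 0.452750,
    "Starve AL 2": -u22 + u23 + s3 - 0.0,
    "Boundary":    -0.807 * u21 + 0.417 * u22 + 0.417 * u23 - 0.008349,
}

for name, value in residuals.items():
    print(f"{name:12s} {value:+.6f}")
```

All four residuals are below 0.001 in magnitude, so the printed solution is consistent with the constraints at the precision shown. A check of this kind says nothing about why the solver needed 17 iterations.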
Rate Calculation in Cell 2 at Time = 251.6930 The linear program used to
calculate production rates at time t = 251.6930 is
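\[
\begin{aligned}
\min \quad & -12.153\,u^2_{21} - 13.513\,u^2_{22} - 11.013\,u^2_{23} \\
\text{subject to} \quad & \\
\text{Machine 1:} \quad & 0.75\,u^2_{21} + 0.80\,u^2_{22} + 0.65\,u^2_{23} + s_1 = 1.0 \\
\text{Starve 2:} \quad & u^2_{23} + s_2 = 0.689655 \\
\text{Starve AL 2:} \quad & -u^2_{22} + u^2_{23} + s_3 = 0.0000 \\
& u^2_{21} \ge 0, \quad u^2_{22} \ge 0, \quad u^2_{23} \ge 0 \\
& s_1 \ge 0, \quad s_2 \ge 0, \quad s_3 \ge 0
\end{aligned}
\]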
The solution to this linear program is
\[
\begin{aligned}
u^2_{21} &= 0.0000 \\
u^2_{22} &= 0.6897 \\
u^2_{23} &= 0.6897 \\
s_1 &= 0.0000 \\
s_2 &= 0.0000 \\
s_3 &= 0.0000
\end{aligned}
\]
The time interval to reach the next boundary is computed using the algorithm in
Section 4.8.4, and is
\[
t_1 - t_0 = 0.4606
\]
This solution to the linear program has shifted all resources to Process Segment
2 and Process Segment 3. It required 4 iterations before it converged to the above
solution.
At least two factors may have played a role in the removal
of the boundary installed in the previous linear program. One possibility is
that the virtual machine constraint has changed, altering the capacity set
and the corresponding boundaries. The change in the virtual machine constraint is
caused by the variable nature of the limiting rate imposed by Process Segment 2 on
Process Segment 3. However, the magnitude of the shift suggests another factor is
also present.
The other possibility is that the rate solution from the previous linear program
may not be accurate. That solution required 17 iterations to converge to a stable
solution. The slow convergence may be due to the existence of both the virtual
machine constraint and the anti-loop constraint installed in the constraint matrix for
the same buffer. This is a topic for further research.
In any case, the next scheduled time for a rate calculation is 0.4606 time units in
the future, at time t = 252.1536.
Rate Calculation in Cell 2 at Time = 252.1536 The final linear program to
be calculated before the next failure of Machine 1 is:
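\[
\begin{aligned}
\min \quad & -12.291\,u^2_{21} - 13.333\,u^2_{22} - 10.833\,u^2_{23} \\
\text{subject to} \quad & \\
\text{Machine 1:} \quad & 0.75\,u^2_{21} + 0.80\,u^2_{22} + 0.65\,u^2_{23} + s_1 = 1.0 \\
\text{Starve 2:} \quad & u^2_{23} + s_2 = 0.467385 \\
\text{Starve AL 2:} \quad & -u^2_{22} + u^2_{23} + s_3 = 0.0000 \\
\text{Boundary:} \quad & 0.730\,u^2_{21} - 0.684\,u^2_{22} = 0.013679 \\
\text{Boundary:} \quad & -0.349\,u^2_{21} - 0.372\,u^2_{22} + 0.860\,u^2_{23} = 0.041849 \\
& u^2_{21} \ge 0, \quad u^2_{22} \ge 0, \quad u^2_{23} \ge 0 \\
& s_1 \ge 0, \quad s_2 \ge 0, \quad s_3 \ge 0
\end{aligned}
\]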
The solution to this linear program is
\[
\begin{aligned}
u^2_{21} &= 0.4569 \\
u^2_{22} &= 0.4674 \\
u^2_{23} &= 0.4360 \\
s_1 &= 0.0000 \\
s_2 &= 0.0000 \\
s_3 &= 0.0000
\end{aligned}
\]
The time interval to reach the next boundary is computed using the algorithm in
Section 4.8.4, and is
\[
t_1 - t_0 = 79.1962
\]
This linear program required 2 iterations before it converged to the above solution.
The production rate of Process Segment 3 satisfies the virtual machine constraint
with inequality. Therefore, the rate of flow of material into Buffer 2 is positive, filling
the buffer. The virtual machine constraint will be removed after this rate calculation
is complete, according to Section 4.5.1.
According to the results of the algorithm, the time until the next boundary is
reached is 79.1962 time units. This requires a rate calculation to be scheduled at
time t = 331.3499.
These results have inconsistencies that are important to note. If the calculated
time until the next boundary is allowed to elapse, the deviation of the surplus x from
the hedging point z would be
\[
\begin{aligned}
x^2_{21} - z^2_{21} &= 0.134 \\
x^2_{22} - z^2_{22} &= -0.076 \\
x^2_{23} - z^2_{23} &= -0.062
\end{aligned}
\]
Since this linear program already has two installed boundaries, the next boundary
that would be reached should be the hedging point, where all deviations from hedging
points are zero. The third and final boundary would be installed at that time. This
is not the case.
In addition, the value of the deviation for Process Segment 1 is positive, which
means that the surplus is greater than its hedging point. This implies that excess
capacity is being used on Process Segment 1 which could be used on either Process
Segment 2 or Process Segment 3. These two inconsistencies point to an inadequate
understanding of the reentrant flow anti-loop constraint implementation in Hiercsim
Version 3.5.
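The final deviations quoted in this example can be reproduced approximately by propagating the surplus linearly in time. The sketch below assumes that the deviation at t = 252.1536 equals the objective coefficients of the last linear program (the relative priority coefficient is 1.0), and that every process segment is measured against a Level 1 rate of 0.3, the mean rate in Table 7.9. Both are inferences made for illustration.

```python
# Sketch only: deviation after the scheduled interval, assuming the surplus
# changes at the constant rate (production rate - Level 1 rate).

deviation = [-12.291, -13.333, -10.833]   # x - z at t = 252.1536 (assumed)
rates = [0.4569, 0.4674, 0.4360]          # Level 2 rates from the last solution
level1_rate = 0.3                         # assumed common Level 1 rate
dt = 79.1962                              # scheduled interval t1 - t0

final = [d + (u - level1_rate) * dt for d, u in zip(deviation, rates)]
print([round(v, 3) for v in final])

# A hedging point would require every component to be (near) zero.
at_hedging_point = all(abs(v) < 1e-3 for v in final)
print(at_hedging_point)
```

The propagated values are close to the quoted deviations and none of them is zero, which illustrates the inconsistency described above: the next boundary is not the hedging point, and the Process Segment 1 surplus overshoots.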
7.6 Demonstration of Setup Changes in Hiercsim Version 4.0
This section presents two examples of a system with a machine that is limited in
flexibility. These examples illustrate some of the concepts and policies developed in
Chapters 5 and 6. A sample input file from the first of these two examples appears
in Appendix B.
The system is identical in both cases, and both use the algorithm for setup change
control as described in Section 6.7.
The first example demonstrates a rudimentary corridor policy (Sharifnia, Cara-
manis, and Gershwin, 1991), where the corridor boundaries are simply the hedging
points of each of the part types. The corridor policy as used in this implementation is
described in Section 6.7. The policy simply ignores the setup change rate frequency
restrictions imposed by the setup staircase policy.
The second example demonstrates a setup policy which restricts the frequency of
setup changes even though there may be capacity available for more setup changes
after production is satisfied. The frequency limiting policy is described in Section 6.7
in which setup change rate frequency restrictions are imposed.
System Description The system is represented in Figure 7-41 and it consists of
two processes and a single machine. Each process requires a different setup state
from the other. The setup catalog entries (Section 6.5) for the machine are shown in
Figure 7-42. One of the catalog entries is designed for production of Type 1 parts,
and the other catalog entry is designed for production of Type 2 parts.
The control system is divided into three control levels. Level 1 is the static control
level which converts the target production rate, $u^0$ = 9.5 parts per time unit, into
the Level 1 production rate, $u^1_{11}$, which is distributed to Level 2 Cell 2. Level 1
Cell 1 also provides the setup change frequencies $f_1$ and $f_2$ to Level 2 Cell 2.
Control at Level 2 is accomplished by the controller of Level 2 Cell 2, which converts
the Level 1 setup change rate frequencies into Level 2 setup states and the Level 1
target production rate, $u^1_{11}$, into Level 2 production rates, $u^2_{21}$ and $u^2_{22}$. The Level 2
production rates are converted into staircase policy loading commands, $N_{31}$ and $N_{32}$,
(Section 3.9) by Level 3 Cell 3.
The machine is completely reliable.
All hedging points are set to 0.0. The weighting coefficient
A is 1.0 (3.21). The operation times are indicated next to the respective process
segments in the factory in Figure 7-41. The setup change times are indicated in the
setup tree at Level 2, where the setup changes are initiated. The simulations were
run for a total of 1000 time units using Hiercsim Version 4.0.
The target production rates were chosen in order to leave 30% of the long-term
capacity available for setup changes. When setup changes are limited in frequency,
the remaining time is IDLE.
Figure 7-41: System with Setups (panel labels: Machine 1 Setup Tree; Time to Change into Setup State; Operation Time)
Figure 7-42: Setup Catalog for the System with Setups (Catalog 1, Catalog 2)

Overall Performance After 1000 Time Units
Process Name   Total Required   Total Produced   Mean Rate   Mean WIP   Mean Cycle Time
Process 1      350              332              0.332       0.35       1
Process 2      350              346              0.346       0.35       1

Machine Group Utilization Fractions
Group Name   # Machines   % Idle   % Operating   % Setting Up
M1           1            0.00%    67.80%        31.20%

Table 7.10: Results of Corridor Setup Policy
Figure 7-43: Cumulative Production under Corridor Policy
Case 1: Corridor Setup Policy This section presents results from the system
under the corridor policy algorithm, described in Section 6.7. The corridors are
bounded by the hedging points of each part type. Summary results are shown in
Table 7.10. Notice that there is no idle time on the machine. All available time on
the machine which is not used in production is used in changing setup state.
Figure 7-43 shows the Level 2 cumulative production of Process 1 and Process
2, compared to the Level 1 requirements. A setup change is performed when the
cumulative production of the part in production reaches the Level 2 hedging point.
The amount of production during a given setup run is roughly equal to 30 parts.
Case 2: Setup Staircase Policy In this simulation, the Level 1 setup change rate
frequencies were set to the following values:
$f_1$ = 0.005 setup changes per unit time
$f_2$ = 0.005 setup changes per unit time

Overall Performance After 1000 Time Units
Process Name   Total Required   Total Produced   Mean Rate   Mean WIP   Mean Cycle Time
Process 1      350              309              0.309       0.351      1.000
Process 2      350              350              0.350       0.350      1.000

Machine Group Utilization Fractions
Group Name   # Machines   % Idle    % Operating   % Setting Up
M1           1            21.60%    65.90%        12.50%

Table 7.11: Results of Setup Staircase Policy
Figure 7-44: Cumulative Production under Setup Staircase Policy
These setup change rates and the target production rates occupy 82.5% of the
machine's time, leaving 17.5% of the machine's time free for any other events. In this
simulation, that time was converted to IDLE time.
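The time budget described here can be written out as simple arithmetic. The sketch takes the 70% production share implied by the 30% of capacity reserved for setup changes, together with the 82.5% occupied share from the text. The implied mean setup duration is an inference from those figures, not a value stated in the source.

```python
# Sketch only: machine time budget under the frequency-limited setup policy.

production_fraction = 0.70          # 30% of capacity was left for setup changes
occupied_fraction = 0.825           # production plus setup changes, from the text
setup_frequencies = [0.005, 0.005]  # Level 1 setup change frequencies

setup_fraction = occupied_fraction - production_fraction
idle_fraction = 1.0 - occupied_fraction

# Inferred, not stated: average duration of one setup change.
implied_mean_setup_time = setup_fraction / sum(setup_frequencies)

print(round(setup_fraction, 3), round(idle_fraction, 3),
      round(implied_mean_setup_time, 2))
```

The 12.5% setup share agrees with the "% Setting Up" entry of Table 7.11, while the simulated idle share of 21.6% exceeds the planned 17.5% because less production was completed than required.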
Table 7.11 shows the results for the setup staircase policy. Production requirements
for Process 1 were not met by the system. The actual percentage of time that
was used in changing setups is equal to the limiting fraction specified by the Level 1
controller. However, the machine was IDLE 21.6% of the time, which indicates that
the machine catches up to production in one process, but is prevented from switching
to the other process by the limiting Level 1 setup change rates $f_1$ and $f_2$.
(Figure 7-44 annotations: P2 production continued because setup for P1 is prevented by the setup rate limit; time to change production from P1 to P2. Curves: Level 1 Demand, Level 2 P1, Level 2 P2; horizontal axis: Time.)
Figure 7-44 shows the Level 2 and Level 3 cumulative production of Process 1
and Process 2, compared to the requirements of Level 1. Note that the cumulative
production of Process 2 reaches its hedging point at time t = 500, but all allocated
setup changes have been used. Therefore, the machine continues to produce parts
for Process 2 while Process 1 falls further behind. In contrast to the corridor policy
of the previous case, the amount of production performed during a production run
varies widely.
Process 2 is always favored by this setup change policy because the setup change
into Process 2 used the last allocated setup change (6.6) until the setup staircase
permitted another change 75 time units later.
This policy performs badly in this case. As more and more processes are added,
each with its own setup states, this policy begins to perform even worse. This
performance shortfall is due to the lack of sequencing control which can regulate
where the idle time on the machine is used up. In the current implementation, all
the idle time gets used during the last interval of the setup change sequence.
7.7 ICL Semiconductor Fab Case Study
This section provides an overview of a complex system which has been modeled using
Hiercsim Version 3.5. The details of the system can be found in Bai (1991b).
This case is used to demonstrate two points about the algorithms described in
this thesis:
1. A large complex system can be modeled using the current implementation of
Hiercsim Version 3.5. Bai (1991b) conducted a simulation of the MIT CMOS
process using Hiercsim Version 1.0 (Darakananda, 1989). That simulation took
17 weeks to complete 40 days of production on a VAX 11-780 mini-computer.
The simulation of the same MIT CMOS process which generated the results
for this section was conducted using Hiercsim Version 3.5. It required 3 hours
running on a VAXStation 3100 to generate 500 days of production.
2. The consequences of violating the flow rate assumptions in Section 2.5 are
demonstrated.
Figure 7-45: The MIT Twin-Well CMOS process
The MIT CMOS Process The MIT CMOS process which is simulated using
Hiercsim Version 3.5 is described below:
In the Integrated Circuit Laboratory of MIT, there is a baseline process,
a 1.75 micron CMOS process, which is used to monitor equipment perfor-
mance and device characteristics. The baseline process is enhancement-
compatible so that new technology innovations can be tested in a real
integrated circuit process. The process was designed modularly such that
coupling between processing steps was minimized. Whenever possible,
these baseline processing steps are incorporated into new processes and
experiments.
In order to use the hierarchical model, we assume that all inspection
operations in the baseline process are not restrictive. That is, none of the
wafers fail the inspections. We only count the time for inspections and
add it to the processing time of the preceding operation.
Figure 7-45 illustrates the route of the baseline CMOS process which
consists of 73 operations. A wafer is released into the system at the RCA
station and leaves the system at Tube-b7 after the process is completed.
There are 19 machines involved in the CMOS production. ... There are
72 buffers in the system which are located between every two consecutive
operations.[5]
The machine parameters are not listed here; a table of them appears in Bai (1991b).
Machines are unreliable to varying degrees. The most unreliable machine fails, on
average, once every 50 days. The worst mean time to repair is 3 days. Total
processing time is 21.3 days for one wafer. The machines are modeled as flexible
machines with all setup change times included in the appropriate operation times.
[5] Figure 7-45 and this text are excerpted from Bai (1991b, pp. 148-150) with permission from
MIT.
The maximum size of all buffers in the simulations used in this section is equal
to 1.0 wafer. The hedging points were chosen such that, on average, 4 wafers would
be distributed evenly among the 72 buffers. The nature of the process prevents the
accumulation of a large amount of work-in-process. Therefore, this low amount of
work-in-process distributed among such a large number of buffers violates the flow
rate assumptions of Section 2.5. This is a large grain manufacturing system.
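The scale of the mismatch can be illustrated with a short calculation. This is only a sketch under two assumptions that are not stated in the text: wafers are indivisible, and all 4 target wafers sit in buffers rather than on machines. The variable names are hypothetical.

```python
target_wip = 4        # wafers implied by the hedging points, on average
n_buffers = 72        # buffers between consecutive operations
buffer_size = 1.0     # maximum size of each buffer, in wafers

# Average buffer level if the wafers could be spread evenly as a fluid.
avg_level = target_wip / n_buffers            # about 0.056 wafers per buffer

# Fraction of each buffer's capacity that this average represents.
avg_fill_fraction = avg_level / buffer_size   # about 5.6 percent

# With whole wafers, at most 4 buffers can hold a wafer at any instant,
# so at least 68 of the 72 buffers are empty.
min_empty_buffers = n_buffers - target_wip

print(f"average level: {avg_level:.3f} wafers per buffer")
print(f"buffers empty at any instant: at least {min_empty_buffers}")
```

A continuous flow model treats each buffer as holding about 0.056 wafers, while the discrete factory sees mostly empty buffers. This is one way to read the statement that the system is large grain.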
The control system used in Hiercsim Version 3.5 consists of a single cell at Level
1 which supplies target production rates based on the demand rate and the machine
parameters. There are 19 Level 2 cells, one for each machine in the system. The
controllers of those cells respond to failures of the machine they control, and to virtual
machine constraints (Section 3.8) imposed by upstream and downstream cells.
The demand rate for this system is 0.15 wafers per day. The maximum theoretical
long-term production rate is 0.1755 wafers per day (3.12). Case 1 uses the hierarchical
controller described in Chapter 3 with a low level machine loading policy of processing
the youngest wafer first. Case 2 uses the hierarchical controller described in Chapter 3
with a low level machine loading policy of processing the oldest wafer first.
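The two low level loading policies differ only in which waiting wafer a free machine picks next. The sketch below is illustrative Python, not the Hiercsim source (which is written in C). It assumes that a wafer's age is measured from its release into the system, and the `Wafer` class, `select_wafer` function, and policy names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Wafer:
    wafer_id: int
    release_time: float  # assumption: age is measured from release into the system


def select_wafer(waiting, policy):
    """Pick the next wafer to load from the wafers waiting at a machine."""
    if not waiting:
        return None
    if policy == "youngest_first":
        # Case 1: the most recently released wafer goes first.
        return max(waiting, key=lambda w: w.release_time)
    if policy == "oldest_first":
        # Case 2: the earliest released wafer goes first.
        return min(waiting, key=lambda w: w.release_time)
    raise ValueError(f"unknown policy: {policy}")


# Three wafers waiting at a reentrant station such as the photo-track.
queue = [Wafer(1, 0.0), Wafer(2, 6.7), Wafer(3, 13.3)]
case1_choice = select_wafer(queue, "youngest_first")
case2_choice = select_wafer(queue, "oldest_first")
print(case1_choice.wafer_id, case2_choice.wafer_id)
```

At a reentrant station the waiting wafers are at different stages of the route, so the youngest-first rule tends to hold back the wafers that are closest to completion.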
The simulation was run for 500 days in Case 1 and for 190 days in Case 2.
This system is highly reentrant (the photo-track sees each wafer 12 times during
the process). Therefore, the reentrant flow constraints (3.34) and (3.35) are imposed
frequently during the simulation. The problems associated with those constraints,
illustrated in Section 7.5, are also present in this simulation. The simulation ran in
4 hours on a VAX 3100 with reasonable results. Case 2 ended prematurely 190 time
units into its simulation. The reason for this premature ending is not known with certainty.
However, it is possible that the algorithm for reentrant flow constraints described in
Section 3.12.3 caused the simulation to stop.
Figure 7-46 shows the Level 1 and Level 2 cumulative production over the 500
time units for Case 1. The cumulative production for the Level 2 entry step at the
RCA station and the Level 2 exit step at Tube-b7 are shown. Case 2 had similar
results for Level 1 and Level 2, and is not shown.
Figure 7-46: CMOS Line Level 2 Cumulative Production (cumulative production versus time, 0 to 500; curves: Cell 1, Level 2 RCA, Level 2 Tube B7)
The Level 2 cumulative production at the exit step at Tube-b7 tracks the Level 1
target rate quite well. The 21 day cycle time can be seen by the time interval between
the Level 2 entry step at the RCA station and the Level 2 exit step at Tube-b7. Note
the 4 wafer hedging point at the entry step at the RCA station.
Case 1: Youngest Wafer First The difference between the two cases appears
in the Level 3 cumulative production and in the work in process inventory. Since
there are so few wafers distributed over so many buffers when the system is far
from the hedging point, the Level 2 controllers do not have a good model of blockage
and starvation. Therefore, those controllers operate as if their machines are neither
blocked nor starved, even though the machines are for the most part starved during
the initial loading of the process.
The policy of processing the youngest wafer first in the highly reentrant system
ensures that some buffers are filled unnecessarily before any of the older wafers are
able to be processed. This builds up work-in-process and delays the time when the
first part leaves the system. The effects of this policy are felt for 160 days, after which
the system behaves much like the system in Case 2.
Figure 7-47 shows the Level 3 cumulative production at the first step in the process
at the RCA station and at the last step in the process at Tube-b7. Only the results
for the first 190 days are shown to remain consistent with the results of Case 2. The
Level 1 and Level 2 calculated cumulative production are also shown as a comparison
between the controller's model of production and what actually is happening.
There is a small deviation between the Level 2 production at the entry step at
the RCA station and the corresponding Level 3 production. On the other hand, the
loading policy which holds older parts in favor of younger ones forces a large deviation
between Level 2 production and Level 3 production at the exit step at Tube-b7. After
the initial 160 days, this transient phenomenon damps out, and the system maintains
a work-in-process inventory of 5-6 wafers.
Figure 7-47: CMOS Line Level 3 Cumulative Production - Youngest Wafer First (versus time; curves: Cell 1, Level 2 RCA, Level 3 RCA, Level 2 Tube B7, Level 3 Tube B7; label: Finished Product)
Figure 7-48: CMOS Line Overall Work-in-Process - Youngest Wafer First
Figure 7-48 shows the work-in-process in the system as a function of time. The
first 100 days of the simulation are spent filling the internal buffers. The greatest
work-in-process is less than the maximum of 72 parts because the machine loading
rule only adversely affects the highly reentrant stations such as the phototrack and
the asher, which also happen to be the system bottlenecks.
Figure 7-49: CMOS Line Level 3 Cumulative Production - Oldest Wafer First
Case 2: Oldest Wafer First The policy of processing the oldest wafer first in the
highly reentrant system ensures that wafers are expedited through the system. This
keeps work-in-process inventory low and shortens the time until the first part leaves
the system.
Figure 7-49 shows the Level 3 cumulative production at the first step in the process
at the RCA station and at the last step in the process at Tube-b7. Only the results
for the first 190 days are shown because the simulation stopped at that time. The
Level 1 and Level 2 calculated cumulative production are also shown as a comparison
between the controller's model of production and what actually is happening. The
Level 3 production tracks closely the Level 2 requirements both at the entry step at
the RCA station and at the exit step at Tube-b7.
Figure 7-50: CMOS Line Overall Work-in-Process - Oldest Wafer First
Figure 7-50 shows the work-in-process in the system as a function of time. The
work-in-process remains at a constant value of 6-7 wafers which is only slightly dif-
ferent from the 4 wafers required by the hedging points.
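The figures quoted in this section can be cross-checked with two short calculations. The sketch below is illustrative Python with hypothetical variable names. Little's law (average inventory equals throughput multiplied by average time in the system) is a standard queueing result that this thesis does not derive. Applying it here assumes steady state, output equal to the demand rate, and that the plotted work-in-process counts every wafer in the system.

```python
demand_rate = 0.15        # wafers per day
max_rate = 0.1755         # maximum theoretical long-term rate, wafers per day
processing_days = 21.3    # total processing time for one wafer, days

# Fraction of the theoretical capacity that the demand consumes.
utilization = demand_rate / max_rate                    # about 0.855

# Little's law with no waiting at all: wafers in process on average.
wafers_in_process = demand_rate * processing_days       # about 3.2 wafers

# Little's law applied in reverse to the observed 6-7 wafers of Case 2:
# implied average time in the system, in days.
implied_days = [wip / demand_rate for wip in (6, 7)]    # about 40 and 46.7 days

print(f"utilization: {utilization:.3f}")
print(f"wafers in process with no waiting: {wafers_in_process:.2f}")
print(f"implied time in system: {implied_days[0]:.1f} to {implied_days[1]:.1f} days")
```

Under these assumptions, roughly 3 wafers are in the system even if no wafer ever waits, so an inventory of a few wafers above the hedging point total is not surprising.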
Chapter 8
Conclusion
8.1 Conclusion
The purpose of Hiercsim Versions 3.5 and 4.0 is to provide a research testbed for the
exploration of hierarchical control algorithms. Hiercsim Version 3.5 is a generally sta-
ble simulation which is able to model manufacturing systems which have operations,
failures, and repairs. Arbitrary control levels, number of machines, and number of
processes can be modeled. An attempt has been made to make the design of the
code modular, limiting the amount of interaction between subroutines. Hopefully,
this architecture can be ported to an object oriented environment without major
modification.
At this stage in the development of the hierarchical control algorithms, many of
the decisions required to run a factory do not have policies consistent with the hier-
archical framework. Development of those policies has been constrained in the past
to theoretical papers with limited experimentation. Hiercsim Version 3.5 provides a
stable platform to conduct policy development experimentation.
Hiercsim Version 4.0 is a preliminary version of a simulation which incorporates
setup changes into the hierarchical framework. An initial description of setup changes
consistent with the hierarchical framework has been developed as well as initial control
policies. Further work is required to permit the use of Hiercsim Version 4.0 as a
research testbed for setup change policies.
8.2 Summary
Hiercsim Versions 3.5 and 4.0 are implementations of the hierarchical manufacturing
control framework found in Gershwin (1989) and Violette and Gershwin (1991). This
thesis covers many of the issues that enable the hierarchical controller (which is
composed of a set of independent and distributed controllers) to function as a coherent
system. This thesis also points out areas which require further research and which
can use Hiercsim to aid in that research. It also provides a springboard into future
work that incorporates setup changes into manufacturing control.
This thesis describes the assumptions required to implement the hierarchical con-
trol policies (Sections 2.4, 2.5, 2.6, and 2.7). Those assumptions have implications for
the creation of the hierarchy and the estimation of capacity (Sections 2.8.2, 2.9, and
3.6). This thesis also describes all the algorithms used in Hiercsim, from the major
hierarchical control policies to the heuristic low level factory loading policies and the
communication necessary to link the distributed controllers into a cohesive system.
Three major implementation issues relevant to the pyramid decomposition hierar-
chy were encountered and addressed during the implementation. These issues, once
resolved, permitted the modeling of much more diverse systems than those modeled
using only the frequency decomposition hierarchy. Those issues were:
1. The interaction and precedence of cell calculations in a pyramid decomposition
(Sections 3.6, 3.8, and 3.13).
2. Reentrant flow control in a pyramid hierarchy (Section 3.12).
3. Boundary installation in the linear program (Section 4.8.4).
In addition to the algorithms required for the hierarchical controller, this imple-
mentation also defines the architecture for a simulator which incorporates the concepts
of the hierarchical framework (Section 4.1). Included in that architecture is the defi-
nition of interfaces between major components in the hierarchical controller such as
the cell hierarchy and the factory.
Hiercsim Version 4.0 is a preliminary version of a simulation which models setup
changes in manufacturing systems of limited flexibility. Preliminary estimates of
capacity for systems with setup changes are modeled, as well as preliminary control
algorithms based on those estimates (Chapters 5 and 6).
A wide range of simulations are presented in Chapter 7 which exercise Hiercsim
under a variety of conditions. The basic capability of Hiercsim is shown, as well as
cases which pose problems for future research. The hierarchical controller performed
best in simulations which followed the assumptions of Chapter 2. The assumptions
were most critical when target demand rates were specified near the edges of the
capacity set. However, when target demand rates were well within the capacity
set, the hierarchical controller performed very well, even when the assumptions of
Chapter 2 were not followed closely.
8.3 Future Research
This section provides some direction for those who are interested in pursuing research
in the hierarchical control framework. Based on experience in manufacturing, the
following additional features would improve the ability of Hiercsim to model a wider
range of manufacturing systems:
Realtime Feedback from Production The current formulation of the hierarchi-
cal controller uses a feedback control law based on calculated performance of
the factory. It has been observed in simulations using Hiercsim Version 3.5 that
in cases where the capacity estimates are inaccurate, the controller's perception
of performance diverges from the actual factory performance. Since manufac-
turing parameters are difficult to obtain accurately in a manufacturing setting,
there is a high probability that estimates of capacity will be inaccurate to a
certain degree. Feedback based on actual production should help compensate
for this divergence.
Due Dates Industrial contracts are based on the production of a certain quantity of
product for delivery at specified due dates. The hierarchical control framework
does not yet have a mechanism to convert these contractual terms into demand
rates and the subsequent flow-controlled production into shipments.
Assembly Many manufacturing systems involve the assembly of components into
larger systems. It would be useful to be able to model this, especially over long
time frames.
Multiple Routing Machine shops which fabricate metal parts often have many
types of machines which are capable of performing the necessary operations.
Hiercsim is currently limited to modeling manufacturing systems in which an
operation can only be performed on a single type of machine, although there is
a choice about the particular instance of the machine type which can be used.
It would be useful to be able to specify the same operation on two different
types of machines and to be able to control that system.
Batch Processing Some manufacturing systems process parts in large batches.
Baking bread, anodizing machined parts, and cleaning substrates
for electronic components are all examples of batch processes. Hiercsim does
not have the capability to model batch processes.
Set Management Manufacturing systems which include assembly at some point
in their processes must account for the availability of different parts in some
predetermined ratio so that a shipment may be completed. Hiercsim does not
have the capability to link two processes together such that they fall behind
requirements and catch up to their hedging points in fixed ratios of production.
Preventive Maintenance Preventive maintenance is an important part of scheduling
manufacturing operations. Hiercsim does not yet have the capability to
include preventive maintenance.
Direct and Indirect Labor Some manufacturing systems must carefully allocate
people among different contracts so that costs may be controlled. In some cases,
the manufacturing system is limited by the availability of people to run or repair
machines. Hiercsim does not have the capability to model these situations.
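The divergence described under Realtime Feedback from Production can be reproduced in a very small model. The sketch below is illustrative Python and is not the control law of Chapter 3 or any part of Hiercsim. It assumes a single part type, no failures, a discrete-time version of the hedging point rule, and hypothetical numbers in which the controller believes the capacity is 1.5 while the true capacity is 1.1. All function and parameter names are assumptions.

```python
def commanded_rate(surplus, hedging_point, demand, capacity, dt):
    """Discrete-time hedging point rule for one part type.

    Far below the hedging point the rate is the perceived capacity, at the
    hedging point it is the demand rate, and far above it the rate is zero.
    """
    wanted = demand + (hedging_point - surplus) / dt
    return max(0.0, min(capacity, wanted))


def simulate(use_actual_feedback, demand=1.0, perceived_capacity=1.5,
             actual_capacity=1.1, hedging_point=2.0, dt=0.1, steps=500):
    """Return (calculated surplus, actual surplus) at the end of the run."""
    calculated = 0.0  # surplus according to the controller's own model
    actual = 0.0      # surplus achieved by the factory
    for _ in range(steps):
        feedback = actual if use_actual_feedback else calculated
        rate = commanded_rate(feedback, hedging_point, demand,
                              perceived_capacity, dt)
        # The factory cannot produce faster than its true capacity.
        achieved = min(rate, actual_capacity)
        calculated += (rate - demand) * dt
        actual += (achieved - demand) * dt
    return calculated, actual


# Feedback on calculated performance: the controller believes it has reached
# the hedging point while the factory is still well short of it.
calc_open, actual_open = simulate(use_actual_feedback=False)

# Feedback on actual production: the factory surplus reaches the hedging point.
calc_closed, actual_closed = simulate(use_actual_feedback=True)

print(f"calculated feedback: model {calc_open:.2f}, factory {actual_open:.2f}")
print(f"actual feedback: factory {actual_closed:.2f}")
```

With these numbers the controller's model settles at the hedging point of 2 while the factory surplus stalls near 0.4. Feeding back actual production lets the factory surplus reach the hedging point, at the cost of a longer catch-up period.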
8.4 Note on the Status of Hiercsim
This section gives an overview for future developers of Hiercsim on the status of
Hiercsim. It is intended to provide some direction for the expenditure of programming
resources. It outlines areas of the code which work, areas where improvement would
be helpful, and areas which do not work properly.
8.4.1 Code that Works
Areas of the code which work well are listed below:
Data Input The data input routines in both Hiercsim Version 3.5 and 4.0 are known
to be implemented properly.
Factory Model The factory model works properly. The factory model includes
processing, transportation of parts, operation-dependent failures, and setup
changes. Time-dependent failures are not implemented properly.
Cell Hierarchy The cell hierarchy and all communication between cells are known
to work properly.
Buffers in the Hierarchy Models of buffers at arbitrary levels in the hierarchy are
implemented correctly.
Hedging Point Models Hedging points are implemented properly.
Linear Program The linear program and the routines which transfer data into and
out of the solver are known to work properly.
Virtual Machines Virtual machine constraints are implemented correctly, includ-
ing the addition and removal of virtual machine constraints to the cell's capacity
set and the use of independent virtual machines at different control levels.
Boundary Installation Algorithm The boundary installation algorithm works as
described in Section 4.8.4, but can be improved to use fewer computational resources.
8.4.2 Algorithm Improvements
Improvements to the following areas of Hiercsim would greatly increase its utility as
a research testbed by making simulations easier to use and faster to run.
Updating the Programming Language Simulations are best when implemented
in an object-oriented language such as C++ or Smalltalk. Hiercsim is written
in pre-ANSI C and is therefore not compatible with compilers developed after
1992. It would be worthwhile to update the language to at least ANSI C.
User Interface Hiercsim was developed before the advent of graphical user inter-
faces. Therefore, the user interface is difficult to use and the resulting input
files are difficult to modify.
Improved Estimate of Cell Capacity The cell capacity set (3.12) is based on the
assumption that machines in the same process segment in a cell are separated
by buffers of sufficient size to eliminate starvation and blockage internal to the
cell due to low level events. When buffers are not large enough to eliminate
starvation and blockage within a cell, a different model of capacity is required.
Hiercsim is designed in such a way as to facilitate experimentation with different
capacity sets.
Boundary Installation Algorithm The boundary installation algorithm described
in Section 4.8.4 is not as efficient as it could be. With some extra development,
extra calls to the linear program solver can be eliminated, leading to faster
execution of the rate calculation algorithm.
Reentrant Flow Control Reentrant flow control in the distributed pyramid
hierarchy is subject to infinite looping. The solution described in Section 3.12.3
works marginally. The result is that the surplus approaches
the hedging point with an excessive amount of calculation. A hybrid formulation
is suggested where the pyramid hierarchy is used when virtual machine chains
linking a cell to itself are not active, and the process decomposition hierarchy
found in Bai (1991b) is used when those virtual machine chains are active.
8.4.3 Code that must be Fixed
The following list shows areas of Hiercsim which are known to be flawed. These flaws
generally do not affect the performance of a simulation. Rather, they prevent the use
of additional promised features.
Time-Dependent Failures Time-dependent failures, described in Section 2.7, are
improperly implemented in Hiercsim Version 3.5. However, operation-dependent
failures are implemented correctly.
Reentrant Flow Constraints The interaction between reentrant flow constraints
(3.34) and (3.35) and virtual machine constraints (2.19) and (2.20) is not well
understood. When a virtual machine chain links a cell to itself, both the virtual
machine constraints and the reentrant flow constraints are added to the linear
program (4.11). This leads to the problems illustrated in Section 7.5. These
problems impact the efficiency of the control algorithm, and they may have
been the cause of the simulation crash in Case 2 of Section 7.7.
Setup Tree Model It is uncertain whether or not the setup tree model described in
Section 5.2.1 is implemented correctly in Hiercsim Version 4.0. Further testing
is required.
Setup Change Controller The control algorithm implemented in Hiercsim Version
4.0 performs well for simple cases, as shown in Case 1 of Section 7.6. However,
for more complicated systems, the controller does not work at all. This is an
area where Hiercsim could be used effectively as a research testbed.
Hiercsim Version 4.0 Other areas of Hiercsim Version 4.0 which do not work are
unknown because the code has not been thoroughly tested.
References
Akella, R., Choong, Y.F., and Gershwin, S.B. (1984) "Performance of Hier-
archical Production Scheduling Policy," IEEE Transactions on Components,
Hybrids, and Manufacturing Technology, pp. 225-240, September 1984.
Atherton, R. (1987) "Factory Scheduling Using Simulation Models," Proceedings,
AIM III, Electrochemical Society, 1987.
Bai, S. X. (1991a) "Linear Formulation of Control Parameter Estimation for Real-
Time Production Scheduling," MS Thesis, Department of Mechanical Engineer-
ing, Massachusetts Institute of Technology, September 1991.
Bai, S. X. (1991b) "Scheduling Manufacturing Systems with Work-In-Process In-
ventory Control," Ph. D. Thesis, Operations Research Center, Massachusetts
Institute of Technology, September 1991.
Bai, S.X. and Gershwin, S.B. (1989) "A Manufacturing Scheduler's Perspective
on Semiconductor Fabrication," MIT Laboratory for Manufacturing and Pro-
ductivity, Technical Report VLSI-89-518, 1989.
Bai, S.X. and Gershwin, S. B. (1990a) "Scheduling Manufacturing Systems
with Work-in-Process Inventory Control: Single-Part-Type Systems," MIT Op-
erations Research Center, Report OR 218-90, June 1990.
Bai, S.X. and Gershwin, Stanley B. (1990b) "Scheduling Manufacturing Sys-
tems with Work-in-Process Inventory Control: Multiple-Part-Type Systems,"
MIT Operations Research Center, Report OR 230-90, October 1990.
Darakananda, B. (1989) "Simulation of Manufacturing Process Under a Hierar-
chical Control Algorithm," MS Thesis, Department of Electrical Engineering
and Computer Science and Sloan School of Management, Massachusetts Insti-
tute of Technology, 1989.
Enomoto, M. (1992) "Implementation of a Hierarchical Controller in a Manufac-
turing Process," MIT Laboratory for Manufacturing and Productivity, Report
LMP-92-012, August 1992.
Gershwin, Stanley B. (1989) "Hierarchical Flow Control: A Framework for
Scheduling Events in Manufacturing Systems," Proceedings of the IEEE, Vol.
77(11), 1989.
Gershwin, Stanley B. (1993) Manufacturing Systems Engineering, in preparation.
Gershwin, Stanley B., Akella, R., and Choong, Y.F. (1985) "Short-
Term Production Scheduling of an Automated Manufacturing Facility," IBM
Journal of Research and Development, Vol 29, No. 4, pp. 392-400, July 1985.
Glassey, C.R., and Adiga, S. (1989) "Conceptual Design of a Software Object
Library for Simulation of Semiconductor Manufacturing Systems," Journal of
Object-Oriented Programming, 2 (4), pp. 39-43, 1989.
Hager, D.M. (1992) "Applying Continuous Flow Manufacturing Principles to a
Low Volume Electronics Manufacturer," MS Thesis, Department of Mechani-
cal Engineering and Sloan School of Management, Massachusetts Institute of
Technology, June 1992.
Lasserre (1992) "New Capacity Sets for the Hedging Point Strategy," International
Journal of Production Research, Vol. 30, No. 12, December, 1992, pp. 2941-
2949.
Luenberger, D. (1977) Introduction to Linear and Nonlinear Programming,
Addison-Wesley Publishing Co., Reading, MA, 1977.
Kimemia, Joseph G. (1982) "Hierarchical Control of Production in Flexible Man-
ufacturing Systems," Ph.D. Thesis, Massachusetts Institute of Technology, 1982.
Kimemia, Joseph G. and Gershwin, Stanley B. (1983) "An Algorithm for the
Computer Control of Production in a Flexible Manufacturing System," IIE
Transactions, Vol 15(4), 1983.
Kowalski, J (1990) "Simulation Experiments Using a Model of a Wood Products
Factory," BS Thesis, Department of Mechanical Engineering, Massachusetts
Institute of Technology, June 1990.
Morin, C. (1991) "Scheduling Simulation of a Surface Mount Manufacturing Pro-
cess," BS Thesis, Department of Mechanical Engineering, Massachusetts Insti-
tute of Technology, June 1991.
Pegden, C., Sadowski, Randall P., and Shannon, Robert E. (1990)
Introduction to Simulation using SIMAN, Systems Modeling Corporation and
McGraw-Hill, Inc., New York, 1990.
Pritsker, A.A.B. (1986) Introduction to Simulation and SLAM II, Systems Pub-
lishing Corporation and John Wiley and Sons, New York, 1986.
Riester S.M. (1992) "Application of a Hierarchical Flow Control Production Sim-
ulator to a Low Volume Electronics Manufacturer," BS Thesis, Department of
Mechanical Engineering, Massachusetts Institute of Technology, June 1992.
Sharifnia, A., Caramanis, M., and Gershwin, S. (1989) "Dynamic Set-
up Scheduling and Flow Control in Flexible Manufacturing Systems," Third
ORSA/TIMS Special Interest Conference on Flexible Manufacturing Systems,
Massachusetts Institute of Technology, Cambridge, MA, August 14-16, 1989.
Sharifnia, A., Caramanis, M., and Gershwin, S (1990) "Dynamic Set-
up Scheduling and Flow Control in Flexible Manufacturing Systems," Boston
University, College of Engineering, Department of Manufacturing Engineering,
1990.
Sharifnia, A., Caramanis, M., and Gershwin, S (1991)
"Dynamic Set-up Scheduling and Flow Control in Manufacturing Systems,"
Journal of Discrete Event Dynamic Systems, Volume 1 Number 2, pp. 149-176,
September 1991.
Smarason, H. (1991) "Application of a Hierarchical Control Algorithm to Pro-
duction Scheduling in an Injection Molding System," BS Thesis, Department
of Mechanical Engineering, Massachusetts Institute of Technology, June 1991.
Srivatsan, N. and Gershwin, Stanley B. (1990) "Selection of Setup Times in a
Hierarchically Controlled Manufacturing System," Proceedings of the 29th IEEE
Conference on Decision and Control, Hawaii, December 1990.
Swain, J. (1991) "Simulation Software Survey," OR/MS Today, pp. 81-102, Octo-
ber 1991.
Torres, M. (1991) "A Graphical User's Interface for Hiercsim," BS Thesis, Depart-
ment of Electrical Engineering and Computer Science, Massachusetts Institute
of Technology, May 1991.
Villaflor, L (1992) "Xgrafconv: A Software Utility for Producing Graphical Out-
put for Hiercsim," Bachelor's Thesis, Department of Mechanical Engineering,
Massachusetts Institute of Technology, Feb 1992.
Violette, James D. and Gershwin, Stanley B. (1991) "Decomposition of Con-
trol in a Hierarchical Framework for Manufacturing Systems," Proceedings of
the Automatic Control Conference, Boston, MA, June 1991.
Wolverine Software Corp. 4115 Annandale Road, Suite 200, Annandale, VA 22003.
Appendix A
Sample Input File, Hiercsim
Version 3.5
This is the input file for the simulation described in Section 7.2.4. It was run using
Hiercsim Version 3.5.
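Judging from the intact lines of the listing, each line of the input file starts with a semicolon, carries a prompt padded with dots, and ends with a second semicolon followed by an optional value. The following Python sketch splits one such line into its prompt and value. It is an assumption about the layout made for illustration and is not the Hiercsim input routine, which is written in C; the function name is hypothetical.

```python
def parse_line(line):
    """Split ';prompt ..... ; value' into (prompt, value).

    Returns None for lines that do not start with a semicolon. Assumes that
    the last semicolon on the line separates the prompt from the value.
    """
    line = line.strip()
    if not line.startswith(";"):
        return None
    body = line[1:]
    prompt, separator, value = body.rpartition(";")
    if not separator:
        # No closing semicolon: treat the whole line as a prompt with no value.
        return body.strip(" ."), ""
    # Drop the dot leaders and surrounding spaces from the prompt.
    return prompt.rstrip(" .").strip(), value.strip()


buffer_size = parse_line(";Buffer size: ............................. ; 50.000")
heading = parse_line(";Setup description.........................;")
subfile = parse_line(';Enclose subfile name with double quotes .; "none"')
print(buffer_size, heading, subfile)
```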
;ALL TIMES UNITS MUST BE CONSISTENT.........;
;Simulation end time : .................... ; 2000.000
;Random number seed: ...................... ; 234
;Enter numerical stability parameters:.....;
;Simulation EPS: .......................... ;1.000000e-06
;min_delta: ............................... ;5.000000e-01
;EPStime : ................................ ;1.000000e-03
;smallEPS: ................................ ;1.000000e-04
;Infinity value: .......................... ;1.000000e+08
;Perturbation Multiplier : ................ ;1.100000e+00
;Maximum Perturbation Multiplier: ......... ;1.000000e+02
;Linear Program EPS: ...................... ;1.000000e-08
;Surplus space movement factor: ........... ; 1.000
;Surplus space scale factor: ...............; 1.000
;Enter subfile name for ....................;
;Machine Description........................;
;Enclose subfile name with double quotes .; "none"
;Machine type description...................;
;Machine type name: ....................... ; m1
;Setup description ........................ ;
;Enter name of setup class :............... ; m1setup
;Control level of m1setup :................ ; 1
;Number of operations in m1setup :......... ;
;Actual operation times in m1setup......... ;
;Operation 0 :............................. ; 0.5__/01
;Perceived operation times in m1setup ...... ;
;Operation 0 :.............................; 0.5__/01
;Enter name of setup in this class: ........; end
;Enter name of setup class :............... ; end
;Failure name: ............................ ; m1fail
;Mean Time To Failure: .................... ;
;[actual, perceived]: ..................... ; 800.000_ 800.000
;Mean Time To Repair: ..................... ;
;[actual, perceived]: ..................... ; 200.000 200.00
;Failure level (control level #): ......... ; 2
;Failure name: ............................ ; end
; **************************************************************;
;Machine type name: ....................... ; m2
;Setup description ........................ ;
;Enter name of setup class :...............; m2setup
;Control level of m2setup :................ ;
;Number of operations in m2setup :.........;
;Actual operation times in m2setup.........;
;Operation 0 :............................. ; 0.475__/01
;Perceived operation times in m2setup......;
;Operation 0 :............................. ; 0.475__/01
;Enter name of setup in this class: ........ ; end
;Enter name of setup class :...............; end
;Failure name: ............................ ; end
;Machine type name: ....................... ; m3
;Setup description.........................;
;Enter name of setup class :................; m3setup
;Control level of m2setup :................; 1
;Number of operations in m2setup :.........; 1
;Actual operation times in m2setup ........
;Operation 0 :.............................; 0.451__/01
;Perceived operation times in m2setup......
;Operation 0 :.............................; 0.451__/01
;Enter name of setup in this class:........; end
;Enter name of setup class :...............; end
;Failure name: ............................ ; end
;Machine type name: ....................... ; m4
;Setup description.........................;
;Enter name of setup class :...............; m4setup
;Control level of m2setup :................; 1
;Number of operations in m2setup : ......... ;
;Actual operation times in m2setup.........;
;Operation 0 :............................. ; 0.429__/01
;Perceived operation times in m2setup......;
;Operation 0 :.............................; 0.429__/01
;Enter name of setup in this class:........; end
;Enter name of setup class :...............; end
;Failure name: ............................ ; end
;Machine type name: ....................... ; end
;Enter subfile name for....................
;Buffer Description........................;
;Enclose subfile name with double quotes .; "none"
;Buffer description........................;
;Buffer name: ............................. ; initbuf
;Buffer size: ............................. ; 1.000
;Buffer name: ............................. ; buf1
;Buffer size: ............................. ; 50.000
417
;Buffer name:
;Buffer size:
;Buffer name:
;Buffer size:
;Buffer name:
............................. ;
. . .. .. .. . .. .. . .. . ... .. .. . .. .;
.I
.I
;Enter subfile name for .................... ;
;Cell Description .......................... ;
;Enclose subfile name with double quotes .; "none"
;Cell Description .......................... ;
;Cell name: .............................. ;
;Control level of cell: ....................;
;Cell components.. .......... .............. ;
;[#] machine type/buffer/cell name: ....... ;
;Group name: ................................ ;
;Initial setup for machines in group gml...;
;Initial setup for machine 0: .............. ;
;[#] machine type/buffer/cell name: ....... ;
;Cell name: ............................... ;
;Control level of cell: ..................
;Cell components... ............... ........ ;
;[#] machine type/buffer/cell name: ....... ;
;Group name: .............................
;Initial setup for machines in group gm2...;
;Initial setup for machine 0: .............. ;
cell6
3
m1
gm1
m1setup
end
*********** ********
cell7
3
m2
gm2
m2setup
buf2
50.000
buf3
50.000
end
;[#] machine type/buffer/cell name: ....... ; end
;Cell name: ............................... ; cell8
;Control level of cell: ................... ; 3
;Cell components...........................
;[#] machine type/buffer/cell name: ....... ; m3
;Group name: .............................. ; gm3
;Initial setup for machines in group gm2...;
;Initial setup for machine 0:..............; m3setup
;[#] machine type/buffer/cell name: ....... ; end
;Cell name: ............................... ; cell9
;Control level of cell: ................... ; 3
;Cell components .......................... ;
;[#] machine type/buffer/cell name: ....... ; m4
;Group name: .............................. ; gm4
;Initial setup for machines in group gm2...;
;Initial setup for machine 0:..............; m4setup
;[#] machine type/buffer/cell name: ....... ; end
;Cell name: ............................... ; cell2
;Control level of cell: ................... ; 2
;Cell components .......................... ;
;[#] machine type/buffer/cell name: ....... ; cell6
;[#] machine type/buffer/cell name: ....... ; end
; **************************************************************;
;Cell name: ............................... ; cell3
;Control level of cell: ................... ; 2
;Cell components ........................... ;
;[#] machine type/buffer/cell name: ....... ; cell7
;[#] machine type/buffer/cell name: ....... ; end
;Cell name: ............................... ; cell4
;Control level of cell: ................... ; 2
;Cell components .......................... ;
;[#] machine type/buffer/cell name: ....... ; cell8
;[#] machine type/buffer/cell name: ....... ; end
;Cell name: ............................... ; cell5
;Control level of cell: ................... ; 2
;Cell components........................... ;
;[#] machine type/buffer/cell name: ....... ; cell9
;[#] machine type/buffer/cell name: ....... ; end
;Cell name: ............................... ; cell1
;Control level of cell: ................... ;
;Cell components .......................... ;
;[#] machine type/buffer/cell name: ....... ; cell2
;[#] machine type/buffer/cell name: ....... ;
;[#] machine type/buffer/cell name: ....... ;
;[#] machine type/buffer/cell name: ....... ;
;[#] machine type/buffer/cell name: ....... ;
;[#] machine type/buffer/cell name: ....... ;
;[#] machine type/buffer/cell name: ....... ;
;[#] machine type/buffer/cell name: ....... ;
;[#] machine type/buffer/cell name: ....... ;
;Cell name: ............................... ; end
;Enter subfile name for....................;
;Process Description.......................;
;Enclose subfile name with double quotes .; "none"
;Controllable process description.........;
;Process name : ........................... ;
;A coefficient (ie. relative importance): .;
;Demand rate (parts/unit time): ........... ;
;Operating level (control level #): ....... ;
;Production route...........................;
;Step # 1: ................................ ;
;Machine group/buffer name: ............... ;
;Transit time to this step: ............... ;
;Average lot size: ........................ ;
;Required holding space: .................. ;
p1
1.000
0.850
3
initbuf
1.000
1.000
cell3
cell4
cell5
initbuf
buf1
buf2
buf3
end
;Step # 2: ................................ ;
;Machine group/buffer name: ............ ;
;Transit time to this step: ............... ;
;Average lot size: ........................ ;
;Required machine setup name: .............;
;Enter operation number within setup: ..... ;
;Alternate routing probability: ........... ;
;Step # 3: ................
;Machine group/buffer name:
;Transit time to this step:
;Average lot size: ........
;Required holding space: ..
................;
;Step # 4: ..........................
;Machine group/buffer name: .........
;Transit time to this step: .........
;Average lot size: ..................
;Required machine setup name: .......
;Enter operation number within setup:
;Alternate routing probability: .....
;Step # 5: ..........................
;Machine group/buffer name: .........
;Transit time to this step: .........
;Average lot size: ..................
;Required holding space: ............
;Step # 6: ..........................
;Machine group/buffer name: .........
;Transit time to this step: .........
;Average lot size: ..................
;Required machine setup name: .......
;Enter operation number within setup:
;Alternate routing probability: .....
;Step # 7: ..........................
;Machine group/buffer name: .........
;Transit time to this step: .........
;Average lot size: ..................
;Required holding space: ............
;Step # 8: ................
;Machine group/buffer name:
;Transit time to this step:
...... ;
..... ;
...... ;
..... ;
.I
.p
............... ;
gm1
0.000__/01
1.000
m1setup
0
end
buf1
0.000__/01
1.000
1.000
gm2
0.000__/01
1.000
m2setup
0
end
buf2
0.000__/01
1.000
1.000
gm3
0.000__/01
1.000
m3setup
0
end
buf3
0.000__/01
1.000
1.000
gm4
0.000__/01
;Average lot size: ........................ ;
;Required machine setup name: ............. ;
;Enter operation number within setup: ..... ;
;Alternate routing probability: ........... ;
;Step # 9: ................................ ;
;Machine group/buffer name: ............... ;
;Process name : ........................... ;
* ** * ** ******** ******* *** ** ************************
S S~~S~~~t~~~~t~~~~~t ~
;Enter subfile name for .................... ;
;Output Log Specification..................;
;Enclose subfile name with double quotes .;
;Output log specification.................;
;Process name: ............................ ;
S p~tft~~~t~~~~~~tt~ ~ ~
;Process name: ............................
;Step number: .............................
;Step number: .............................
p1
1 -__ /01
end
;Process name: ............................ ; end
;Enter subfile name for...................;
;Hedge Point Entry.........................;
;Enclose subfile name with double quotes .; "none"
1.000
m4setup
0
end
end
end
"none"
;~~t~~~~~~~~~~t~~~~t~~~~~t~t~~~~S~~~~t~~
;Enter hedge points for the following ...... ;
;processes and levels:.....................;
;Process p1, at Level 1 :.................;
;Cell cell1, at Stepnumber 4: ..............; 0.000
;Process p1, at Level 2 :.................;
;Cell cell5, at Stepnumber 8:.............. ; 0.000
;Cell cell4, at Stepnumber 7:..............; 25.00
;Cell cell3, at Stepnumber 5:.............. ; 50.00
;Cell cell2, at Stepnumber 3:...............; 75.00
;Process p1, at Level 3 :.................;
;Cell cell9, at Stepnumber 8: .............. ; 0.000
;Cell cell8, at Stepnumber 7:............. ; 0.000
;Cell cell7, at Stepnumber 5:..............; 0.000
;Cell cell6, at Stepnumber 3: ............. ; 0.000
;Enter subfile name for.....................;
;Setup Control Data Entry.................. ;
;Enclose subfile name with double quotes .; "none"
;Enter top level setup change frequencies..;
;----- Cell name: cell1 .................... ;
;Enter setup change Acoeff.................;
;----- Cell name: cell4 .................... ;
;Enter setup change Acoeff.................;
;----- Cell name: cell5 ................... ;
;Enter setup change Acoeff.................;
; ----- Cell name: cell2....................;
;Enter setup change Acoeff.................;
;----- Cell name: cell3...................;
;Enter setup change Acoeff.................;
;Enter subfile name for....................;
;Gantt Chart Output Selection..............;
;Enclose subfile name with double quotes .; "none"
;Gantt chart selection:....................;
;Enter Machine Group or Cell name:..........; end
;Enter subfile name for .................... ;
;Xgraph Output Specifications..............;
;Enclose subfile name with double quotes .; "none"
;Xgraph selection for cumulative production & buffers:;
;Enter Process name:....................... ; p1
; overall demand ;
;Step number: ............................. ; 1
;level number: ............................ ; 1
; machines ;
;Step number: ............................. ; 2
;level number: ............................ ; 2
;Step number: ............................. ; 4
;level number: ............................ ; 2
;Step number: ............................. ; 6
;level number: ............................ ; 2
;Step number: ........
;level number: .......
; buffers ;
;Step number: ........
;level number: .......
;Step number: ........
;level number: .......
;Step number: ........
;level number: .......
;Step number: ........
;Enter Process name:..
..................... ; 3
..................... ; 2
..................... ; 5
..................... ; 2
..................... ; 7
..................... ; 2
..................... ; end
******************************* **********;
..................... ;
;Xgraph selection for WIP of processes.....;
;Enter Process name: ........................ ;
;Xgraph selection for WIP integral of processes;
;Enter Process name: ........................ ;
end
end
end
;Enter subfile name for .................... ;
;Program Output Time Table Specification...;
;Enclose subfile name with double quotes .; "none"
;Program Output Time Table Specification...;
;Option to be toggled: .................... ;
;Time to toggle option: ................... ;
;Option to be toggled: .................... ;
;Time to toggle option: ................... ;
;Option to be toggled: .................... ;
x
0.000
s
0.000
f
..................... ;
..................... ;
;Time to toggle option:
;Option to be toggled:
;Time to toggle option:
;Option to be toggled:
;Time to toggle option:
;Option to be toggled:
.................... ;
................... ;
.................... ;
................... ;
.................... ;
0.000
v
;0.000
v
;1.000
end
Appendix B
Sample Input File for Hiercsim,
Version 4.0
This is the input file used in the first simulation described in Section 7.6. It was run
using Hiercsim Version 4.0.
;ALL TIMES UNITS MUST BE CONSISTENT........;
;Simulation end time : .................... ; 1000.000
;Random number seed: ...................... ; 1234
;Enter numerical stability parameters: .... ;
;Simulation EPS: .......................... ; 1.000e-06
;min_delta: ............................... ; 1.000e-03
;EPStime : ................................ ; 1.000e-04
;smallEPS: ................................ ; 1.000e-05
;Infinity value: .......................... ; 1.000e+08
;Perturbation Multiplier : ................ ; 1.100e+00
;Maximum Perturbation Multiplier: ......... ; 1.000e+02
;Linear Program EPS: ...................... ; 1.000e-08
;Surplus space movement factor: ........... ; 40.000
;Enter subfile name for....................;
;Machine Description ...................... ;
;Enclose subfile name with double quotes .; "none"
;Machine type description .................. ;
;Machine type name: ....................... ;
;Setup description.........................;
;Enter name of setup class :...............;
;Control level of SetupA :................. ;
;Number of operations in SetupA :.......... ;
;Actual operation times in SetupA..........;
;Operation 0 :............................. ;
;Perceived operation times in SetupA........;
;Operation 0 :............................. ;
;Enter name of setup in this class: ........ ;
;Enter name of setup class : ............... ;
;Control level of SetupB :.................;
;Number of operations in SetupB : .......... ;
;Actual operation times in SetupB..........;
;Operation 0 :.............................;
;Perceived operation times in SetupB........;
;Operation 0 :............................. ;
;Enter name of setup in this class: ........ ;
;Enter name of setup class :...............;
;Control level of Machinel :...............;
m1
SetupA
2
1
1.000__/01
1.000__/01
end
SetupB
2
1
1.000__/01
1.000__/01
end
Machinel
1
;Number of operations in Machinel :........;
;Enter name of setup in this class:........ ;
;Enter name of setup in this class:.
;Enter name of setup in this class:.
;Enter name of setup class :........
;Enter perceived setup change times.
;for subsetups of Machinel..........
SetupA
....... ; SetupB
end
end
.. ... ..;
;Enter dependence of setups on sequence .... ;
;Possible setup change policies:............;
;Enter menu number for desired policy ...... ;
;1 Sequence Dependent....................;
;2 Sequence Independent ................. ;
;Enter a number:...........................; 2
;Enter sequence independent times .......... ;
;Time to change into SetupB:.................; 10.000__/01
;Time to change into SetupA:...............; 15.000__/01
;Enter actual setup change times............;
;for subsetups of Machinel ................ ;
;Enter dependence of setups on sequence ....;
;Possible setup change policies: ........... ;
;Enter menu number for desired policy ...... ;
;1 Sequence Dependent....................;
;2 Sequence Independent..................;
;Enter a number: .......................... ;
;Enter sequence independent times...........;
;Time to change into SetupB:...............;
;Time to change into SetupA:...............;
;Enter the setup change policy for this setup
;Possible setup change policies:............;
;Enter menu number for desired policy......;
;1 ROUND ROBIN .......................... ;
;2 CORRIDOR ............................. ;
;3 STAIRCASE ............................ ;
;Enter a number: .......................... ;
2
10.000__/01
15.000__/01
class :;
3
;Failure name: ............................ ;
;Machine type name: ....................... ;
;Enter subfile name for....................;
;Buffer Description........................;
;Enclose subfile name with double quotes .;
;Buffer description........................;
;Buffer name:
;Buffer size:
. ....... ...... . .
.. .. ... ... .. ... . ..
;Buffer name: .............................
;Buffer size: .............................
;Buffer name: ..............................
;******** p***************************
;***** p******************************
;Enter subfile name for.....................
;Cell Description..........................
;Enclose subfile name with double quotes .;
;Cell Description..........................
;Cell name: ...............................
;Control level of cell: ...................
;Cell components .......................... ;
;[#] machine type/buffer/cell name: .......
;Group name: ..............................
end
"none"
initbuf
1.000
initbuf2
1.000
end
"none"
Cell3
3
m1
gm1
end
;~~~~~~~~~~~~~~~~~~~I~S~~~t~~~~~~~~~t~~~
;~t~~~t~~t~~t~l~~~t~~~~~~~t~~~~~~~~~~~~~
;~~~~~~~~~~~~~~~t~~~~~~~~t~tl~~~~~~~~t~~
;[#] machine type/buffer/cell name: ....... ; end
;Cell name: ............................... ; Cell2
;Control level of cell: ................... ; 2
;Cell components ...........................
;[#] machine type/buffer/cell name: ....... ; Cell3
;[#] machine type/buffer/cell name: ....... ; end
;Cell name: ............................... ; Cell1
;Control level of cell: ................... ;
;Cell components...........................
;[#] machine type/buffer/cell name: ....... ; Cell2
;[#] machine type/buffer/cell name: ....... ; initbuf1
;[#] machine type/buffer/cell name: ....... ; initbuf2
;[#] machine type/buffer/cell name: ....... ; initbuf1
;[#] machine type/buffer/cell name: ....... ; end
;Cell name: ............................... ; end
;Enter subfile name for .................... ;
;Process Description.......................;
;Enclose subfile name with double quotes .; "none"
;Controllable process description..........;
;Process name : ........................... ;
;A coefficient (ie. relative importance): .;
;Demand rate (parts/unit time): ..
;Operating level (control level #)
;Production route.......... ............... ;
;Step # 1: ..........................
;Machine group/buffer name: .........
;Transit time to this step: .........
;Average lot size: ..................
;Required holding space: ............
;Step # 2: ..........................
;Machine group/buffer name: .........
;Transit time to this step: .........
;Average lot size: ..................
;Required machine setup name: .......
;Enter operation number within setup:
;Alternate routing probability: .....
...... ;
...... ;
...... ;
...... ;
..... ;
.p
.p
.p
.p'
.I
initbuf1
0.000__/01
1.000
gm1
0.000__/01
1.000
SetupA
0
end
;Step # 3: ................................ ;
;Machine group/buffer name: ............... ; end
;Process name : ........................... ; p2
;A coefficient (ie. relative importance): .; 1.000
;Demand rate (parts/unit time): ........... ; 0.350
;Operating level (control level #): ....... ; 3
;Production route .......................... ;
;Step # 1: ..................
;Machine group/buffer name: .
;Transit time to this step: .
;Average lot size: ..........
;Required holding space: ....
;Step # 2: ..................
;Machine group/buffer name: ............... ;
p1
1.000
0.350
3
initbuf2
0.00__/01
1.000
1.000
gm1
)
)
;Transit time to this step: .........
;Average lot size: ..................
;Required machine setup name: .......
;Enter operation number within setup:
;Alternate routing probability: .....
;Step # 3: ................................ ;
;Machine group/buffer name: ............... ;
0.000__/01
1.000
SetupB
0
end
end
p p~t~~~~~~~t ~ ~~~~~~~~
;Process name : ........................... ; end
p p~S~~~~~~~~~~~~~t~ ~~
;Enter subfile name for ...................
;Output Log Specification................
;Enclose subfile name with double quotes
;Output log specification................
;Process name: ...........................
;Step number: ............................
;Step number: ............................
;Process name: ............................
;Step number: .............................
;Step number: .............................
"none"
p1
2al__/01
end
p2
2 .- /01
end
p p~tt~lt~~~~tt ~ ~ ~ ~ ~ ~ S
;Process name: ............................ ;
;** ** **** **** * ***** *** **************** * **** * *****
p p******************************
;Enter subfile name for .................... ;
;Hedge Point Entry .........................
;Enclose subfile name with double quotes .;
end
"none"
t
)
)
;~t~t~t~~~~~~~~~~t~~~~~~~~~~~~~~~~~~~~~~
;Enter hedge points for the following ...... ;
;processes and levels:.....................;
;Process p1, at Level 1 :.................;
;Cell Cell1, at Stepnumber 2:.............. ; 0.000
;Process p1, at Level 2 :.................;
;Cell Cell2, at Stepnumber 2:.............. ; 0.000
;Process p1, at Level 3 :.................;
;Cell Cell3, at Stepnumber 2:..............; 0.000
;Enter hedge points for the following ...... ;
;processes and levels:.....................;
;Process p2, at Level 1 :................. ;
;Cell Cell1, at Stepnumber 2:..............; 0.000
;Process p2, at Level 2 :.................;
;Cell Cell2, at Stepnumber 2: ............. ; 0.000
;Process p2, at Level 3 :.................;
;Cell Cell3, at Stepnumber 2:............. ;
;~~~t~~~~~~~~~~~~~t ~ ~  ~ ~ ~ ~ ~ ~ ~ ~ *
;Enter subfile name for....................;
;Setup Control Data Entry..................;
;Enclose subfile name with double quotes .;
;Enter top level setup change frequencies..;
;Change into SetupB:. ...................... ; 0.010
;Change into SetupA: .....................
;----- Cell name: Cell1.................
;Enter setup change Acoeff .............
;level = 2, Change into SetupB: .........
;level = 2, Change into SetupA:. ........... ; 1.000
;----- Cell name: Cell2...................
;Enter setup change Acoeff................
;----- Cell name: Cell3....
;Enter setup change Acoeff. ................ ;
;S S~~~~~~~~~~~S~~S~
;Enter subfile name for....................;
;Catalog Description.......................;
;Enclose subfile name with double quotes .;
;Catalog for top cell Cell1 ................;
;Top Catalog name: ......................... ;
;Configuration of machine group: gm1.......;
;Number of machines in Machinel ............ ;
"none"
0.010
1.000
"none"
m1cat
1.000
0.000
;Configuration of catalogs for cell: Cell2.;
;Subcatalogs of: m1cat, from cell: Cell1...;
;Catalog Description.......................;
;Catalog name: .................
;Configuration of machine group:
;Number of machines in SetupB...
;Number of machines in SetupA...
;Catalog name: .................
;Configuration of machine group:
;Number of machines in SetupB...
;Number of machines in SetupA...
;Catalog name: .................
gml..
.....
.................;
..... ;
..... ;
..... ;
o e o o
gm1
............;
;Configuration of catalogs for cell: Cell3.;
;Subcatalogs of: sAcat, from cell: Cell2...;
;Catalog Description.................. ..... ;
;Catalog name: ............................ ; sAsubcat
;Configuration of machine group: gm1 ....... ;
;Number of machines in SetupB..............; 0.000
;Number of machines in SetupA..............; 1.000
;Catalog name: ............................ ; end
;Configuration of catalogs for cell: Cell3.;
sAcat
0.000
1.000
sBcat
1.000
0.000
end
;~~~s~~~~~~~~~~~t~~l~~~~S~t~~~~t~tt~~~~~
)
. )
)
;Subcatalogs of: sBcat, from cell: Cell2..
;Catalog Description......................
;Catalog name: .......................
;Configuration of machine group: gm1..
;Number of machines in SetupB.........
;Number of machines in SetupA.........
;Catalog name: .......................
;Enter subfile name for....................;
;Initial Catalog...........................;
;Enclose subfile name with double quotes .;
;Initial catalog for cell Cell1: .......... ;
;Initial catalog for cell Cell2: .......... ;
;Initial catalog for cell Cell3: .......... ;
; sBsubcat
.....; 1.000
.....; 0.000
.....; end
"none"
m1cat
sBcat
sBsubcat
;Enter subfile name for....................;
;Gantt Chart Output Selection..............;
;Enclose subfile name with double quotes .;
;Gantt chart selection:....................;
;Enter Machine Group or Cell name:..........;
"none"
end
;Enter subfile name for....................;
;~~~~~~~f~~~~t~~~~~~~~t~~~~~~~~~~~t~tt~~
;~~~~~~~~~~~I~~~~~~~~~t~tttt~~t~~~~t~~~~
;~~~~~t~~~~~tl~~t~t~t~~~~t~~t~~~~~~~~~t~
;~~~~t~~~~S~~~~t~~~~t~~~~~~~~~~~~~~~~~ft
;~~t~~t~~~~~~~t~~~~~tt~~~~~~~~~~~~~~~~~~
;Xgraph Output Specifications..............;
;Enclose subfile name with double quotes ; "none"
;Xgraph selection for cumulative production & buffers:;
;Enter Process name:.......................; p1
;Step number: ..............
;level number: .............
;Step number: ..............
;level number: .............
;Step number: ..............
;level number: .............
;Step number: ..............
;Enter Process name:........
;Step number:
;level number:
;Step number:
;level number:
;Step number:
;level number:
;Step number:
............................ ;
. .... l.... l ... ... . .
.S
.S
.S
....... ; 2
....... ; 1
....... ; 2
....... ; 2
....... ; 2
....... ; 3
....... ; end
***************************
2
1
2
2
2
3
end
;Enter Process name: ....................... ; end
;Xgraph selection for WIP of processes .....;
;Enter Process name:........................; end
;Xgraph selection for WIP integral of processes;
;Enter Process name:.......................; end
;Enter subfile name for....................;
;Program Output Time Table Specification...;
;Enclose subfile name with double quotes .; "none"
;Program Output Time Table Specification...;
;Option to be toggled: .................... ; 1
;Time to toggle option: ................... ; 0.000
;Option to be toggled: ..
;Time to toggle option:
;Option to be toggled: ..
;Time to toggle option:
;Option to be toggled: ..
;Time to toggle option:
.................. ;
.. .. .. .. .. .. .. ..
.................. ;
.I
;Option to be toggled: ...
x
0.000
s
0.000
v
;0.000
end
Appendix C
Procedure to Add Control
Options to Hiercsim
C.1 Introduction
The architecture of Hiercsim allows output to be controlled from the input file and
from the command line. This document describes the procedure to add a new control
option. The changes needed to add an option are mostly informational: once the new
option has been declared in the code, the existing infrastructure handles all the
mechanics. The file names and subroutine names to be changed are specified, and the
addition of a control option is carried through as an example.
C.1.1 Description of Output Control Architecture
The central technique used in Hiercsim to control printout is the use of flags which,
when turned on, allow printout and, when turned off, prevent printout. The flags are
grouped into one global data structure called 'flags'. The flags data structure contains
several integer elements, each of which is able to represent up to 16 different logical
variables.
The logical variables are accessed using the ability of C to manipulate bits. Each
integer is composed of 16 bits. Each bit is tested by masking the integer (bitwise AND)
with the power of 2 corresponding to the bit's position in the integer. If the result is
nonzero, the logical variable is turned on; otherwise, the logical variable is turned off.
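The bit test described above can be sketched as follows. This is a minimal illustration in ANSI C, not the Hiercsim source (which is pre-ANSI C): the function names are hypothetical, and only the QUEUE value is taken from the example later in this appendix.

```c
#include <assert.h>

#define QUEUE 0x0010u   /* option bit used as the example in this appendix */

/* Hypothetical helper: 1 if the option bit is on in the flag integer, else 0. */
int option_is_on(unsigned int flagword, unsigned int bit)
{
    assert(bit != 0u);              /* an empty mask selects no option */
    return (flagword & bit) != 0u;
}

/* Hypothetical helper: return the flag integer with the option turned on. */
unsigned int option_turn_on(unsigned int flagword, unsigned int bit)
{
    assert(bit != 0u);
    return flagword | bit;
}

/* Hypothetical helper: return the flag integer with the option turned off. */
unsigned int option_turn_off(unsigned int flagword, unsigned int bit)
{
    assert(bit != 0u);
    return flagword & ~bit;
}
```

Printout guarded by a flag would then take the form "if (option_is_on(word, QUEUE)) ...", with the other 15 bits of the same integer left free for other options.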
All code which manages the operation of the logical variables is general. This code
accomplishes a number of functions. The most basic function is to control the amount
of output generated, both in quantity and in the span of simulation time covered.
Supporting functions organize the output into different paths, permit specification
of options from the command line and from the input file, allocate memory for the
flags, and create a list of valid flags.
C.1.2 Overview of User Control Options
The user is able to activate control options using a number of different techniques.
The code must accommodate all the options and so can be quite confusing. This
section explains which techniques are available to the user so that the programmer
will understand the purpose of the code that follows in the Procedure section.
Two functions are accomplished by the code. The first function enables the user
to specify output paths. The second function enables the user to toggle control flags
on and off.
Output Paths The user can either specify a control option output path in two
different ways or accept the default output path to the screen.
* Full filename specification:
Queue output directed to file "<queuefile>".
hiercsim -q <queuefile>
* Coordinated filename specification:
Queue output directed to file "<kernel>.que".
hiercsim -k <kernel> q
* Default output path:
Queue output directed to the screen.
hiercsim
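The three cases above can be sketched as a single selection routine. This is an illustrative sketch in ANSI C under stated assumptions: the function names are hypothetical, and giving the full filename precedence over the kernel when both are supplied is an assumption, since the text does not say which wins.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write "<kernel>.que" into buf.
   Returns 0 on success, -1 if buf is too small. */
int kernel_queue_name(const char *kernel, char *buf, size_t buflen)
{
    size_t need;
    assert(kernel != NULL && buf != NULL);
    need = strlen(kernel) + strlen(".que") + 1;
    if (need > buflen)
        return -1;
    strcpy(buf, kernel);
    strcat(buf, ".que");
    return 0;
}

/* Hypothetical helper: choose the queue output stream.
   Full filename first (assumed precedence), then kernel, then the screen. */
FILE *open_queue_output(const char *queuefile, const char *kernel)
{
    char name[256];
    if (queuefile != NULL)
        return fopen(queuefile, "w");
    if (kernel != NULL) {
        if (kernel_queue_name(kernel, name, sizeof name) != 0)
            return NULL;
        return fopen(name, "w");
    }
    return stdout;   /* default output path */
}
```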
Flag Toggling The user is able to toggle control flags in two ways. The first way
is directly from the command line. The second way is remotely from the input file.
NOTE: If the input file control is used, all flags which are contained in the list found
in inittoggle() (toggle.c) are set to zero unless told otherwise by the input file. Any
flag which is controlled only from the command line is not affected by the use of
the input file.
* Command line control of flags:
When a command line option is used, the flag corresponding to that option is
turned on for the duration of the simulation unless it is overridden by the use
of input file control.
* Input file control of flags:
The user is able to control a fixed list of flags from the input file by specifying
the letter of the option and the time at which to toggle the option. The list of
options controllable from the input file is contained in inittoggle() (toggle.c).
C.1.3 Files and Subroutines to be Changed
The following files and subroutines will be changed when a new control option is
added:
* simmac.h
* simstruc.h
- typedef struct flagstruct {} FLAGS
- typedef struct outfpstruct {} OUTFPLIST
* comline.c
- commandline_parser()
- openfiles()
* multfile.c
- coordoutput()
* toggle.c
- inittoggle()
* utility.c
- initflags()
- initoutfile()
C.1.4 Accessing files within the Revision Control System
(RCS)
This section deals with the use of the Revision Control System (RCS). RCS is a set
of UNIX commands which allow multiple programmers to work on a software project
simultaneously. RCS manages revisions (it stores in condensed form all revisions of
all files in a software package), manages conflicts of editing (it prevents more than
one person from editing a file at a given time), and it minimizes the risk of accidental
deletion of files.
RCS operates much like a library. Programmers check out (co) files either for reading
only (unlocked, -u) or for editing (locked, -l). If a given file is checked out and locked,
then only the user who checked out the file can write to that file. Other users may read
the file at any time. The lock remains on the file until the user checks the file back in (ci).
Upon check in, RCS asks the user for a short description of the changes
made to the file. The check in routine stores the old version in condensed form and
makes the version just checked in the current version.
read only:
co -u <filename> ---> check out and unlock for reading
edit:
co -l <filename> ---> check out and lock for editing
ci <filename> ---> check in after editing
co -u <filename> ---> check out new version for reading
NOTE: It is important that all files which are locked be checked in before logout.
Otherwise, no one else will be able to edit the locked files.
Check-out/Check-in automation 'lastci': Hiercsim is quite large, and it is easy
to forget to check in some locked files. For this reason, I have written a script
file called 'lastci'. This script will check in all locked files for a user
and will subsequently unlock the new versions of those files.
The script 'lastci' is located in the subdirectory "/mit/msa/HS???/bin/csh".
To call 'lastci', the current directory must be "/mit/msa/HS???/src".
There is an alias called lci which is simply ../bin/csh/lastci and serves to save typing.
Before logout and after editing, check in all files.
Within source directory /mit/msa/HS???/src type:
../bin/csh/lastci
or use the alias:
lci
C.2 Procedure
C.2.1 New Macros and Variable Names - Basic Information
simmac.h
New output control option:
Add the new option bit to the list of options for the flag that will contain the bit.
The value for the option should be a power of 2 in hexadecimal format and unique
within the flag integer.
/* test option flag */
#define QUEUE 0x0010
simstruc.h
New flag integer element addition:
If a new flag integer element is created, it must be declared in the FLAGS data
structure.
The flag integer is called from the global variable 'flags'.
int tst;
flags->tst
New file path addition:
If a new output path is needed, add the file path to the OUTFPLIST data structure.
OUTFPLIST stands for 'output file pointer list'.
The file structure is called from the global variable 'out'.
FILE *que;
out->que
C.2.2 New Macros and Variable Names - File Paths
In addition, if the new file path is to be read in from the command line, the i/o
filename indices must be adjusted.
simmac.h
The mechanics of file creation rely on an accurate accounting of the total number of
files possible. This accounting is taken care of in the macro definition file. There are
two types of files possible. The first type is for filenames which are entered directly
in the command line. The second type is for filenames which are generated from a
kernel specified in the command line.
Those files which are specified completely from the command line can also be
entered via the kernel option. If the new option requires any filename, then NUMFILE
should be incremented by one. If the new option creates only a filename generated
from a kernel, then NUMKERNELFILE should also be incremented by one.
#define NUMFILE (increase by one)
#define NUMKERNELFILE (increase by one if this is a kernel option)
All filenames either read in or generated by the program are stored in one of two
global arrays of names. Command line file names are stored in the array 'filename'.
Filenames generated by the kernel option are stored in the array 'kernelname'. Each
new filename must have its unique array index. If the new file name is common to
both the command line and the kernel option, put the filename index before those
filename indices which are exclusively kernel files.
#define QUEUEFILE 6
If the new filename is part of the kernel option, the code must be told to bypass
the default filename. This is accomplished by using integer element
flags->out. A bit must be added to the macrolist. The bit must be unique among all
the bits used for filenames within flags->out.
#define QUE 0x0080
C.2.3 Memory Bookkeeping and Initialization
utility.c
initflags() - new integer elements in 'flags'
Initialization of new integer elements of the flags global data structure must be done
here. All elements must start out as completely turned off. This prevents spurious
output.
flags->tst=0;
initoutfile() - new output paths in 'out'
If a new output path has been created, the default path should be directed to the
screen. This is done by directing the new path in the global output stream data
structure to the standard output stream 'stdout'.
out->que = stdout;
NOTE: I do not know the limit on the number of different output paths that may
be open simultaneously. Any new addition of output path should be checked for
compatibility with the maximum number of paths available.
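One portable way to check this limit is offered here as a suggestion only; it is not part of Hiercsim. ANSI C defines the macro FOPEN_MAX in <stdio.h> as the number of streams the implementation guarantees can be open at the same time (at least eight, counting stdin, stdout, and stderr). The operating system may allow more. The helper names below are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Streams every C program starts with: stdin, stdout, stderr. */
#define STANDARD_STREAMS 3

/* Number of additional files guaranteed to be openable at once. */
int guaranteed_extra_streams(void)
{
    return FOPEN_MAX - STANDARD_STREAMS;
}

/* Returns 1 if 'num_paths' distinct output files fit within the guarantee. */
int paths_fit_guarantee(int num_paths)
{
    assert(num_paths >= 0);
    return num_paths <= guaranteed_extra_streams();
}
```

Output paths that are left pointing at stdout do not consume an additional stream, so only paths redirected to files count against this limit.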
C.2.4 New Command Line Options
comline.c
commandlineparser()
The commandlineparser() routine must be modified in order to read in new options
from the command line at the prompt. This routine turns on output control flags
and also reads in file names for the output paths.
Add the new options to the command line parser by copying an existing option
and changing the appropriate values. In effect, any modification should amount to the
replacement of a few variable names within each block of code.
/*---------------------*
 * QUEUE OUTPUT OPTION *
 *---------------------*/
case 'q':
case 'Q':
    fnameindex=QUEUEFILE;
    appendoption(&(options_in_effect), "QUEUEFILE", "q");
    break;
multfile.c
coordoutput()
If the new option is a suboption of the -k kernel option, then the option should be
added within the coordoutput() subroutine. This routine turns on output control flags
for the kernel options and also creates coordinated file names for the output paths
activated in the command line.
Add the new options to coordoutput() by copying an existing option and
changing the appropriate values.
/* Create a separate QUEUE chart output stream. */
case 'q':
case 'Q':
    kernelname[QUEUEFILE]=(char *) getmem(MAXFILENAMELEN +1);
    sprintf(kernelname[QUEUEFILE], "%s.que", kernel);
    flags->out |= QUE;
    flags->tst |= QUEUE;
    break;
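The way a coordinated file name is formed from the kernel can be sketched in isolation. This is a simplified illustration, not the Hiercsim routine: make_kernel_name() is a hypothetical helper, the value given to MAXFILENAMELEN is assumed, and the length check is an addition that the excerpt above does not perform.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAXFILENAMELEN 64   /* assumed value; the real one lives in simmac.h */

/* Builds "<kernel>.<ext>" in 'dest', which must hold MAXFILENAMELEN + 1
 * characters. Returns 1 on success, 0 if the name would not fit. */
int make_kernel_name(char *dest, const char *kernel, const char *ext)
{
    assert(dest != NULL && kernel != NULL && ext != NULL);
    if (strlen(kernel) + 1 + strlen(ext) > MAXFILENAMELEN) {
        return 0;   /* refuse to overrun the buffer */
    }
    sprintf(dest, "%s.%s", kernel, ext);
    return 1;
}
```

With a kernel of run1 and the extension que, the generated name is run1.que.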
C.2.5 New Output Paths
comline.c
openfiles()
Once the command line has been read and the correct flags turned on, the files must
be opened in the subroutine 'openfiles()'. Each possible file path is taken care of in its
own block of code. Each block of code is in a standard format. The block screens the
information read in by the command line parser. The standard output is the default
if no file specification was given in the command line. When a filename is specified
from the command line, the block uses information analyzed by the command line
parser to sort out which filename is to be used.
To add a new output file path, copy an existing block of code and change the names
within the block to match those names created in simmac.h. Change all comments
to reflect the function of the new output path.
/*-OPEN OUTPUT PATH FOR QUEUE OUTPUT.----------------------*/
outputbuf = mkfilelist();
/* Determines the name of the output stream where the QUEUE data
 * will be directed. */
if (!(flags->out&QUE) && !(filename[QUEUEFILE])) {
    /* send all queue output to screen if neither the '-k q' option is
     * in effect nor a QUEUE output file was specified.
     * 'out->que' is by default 'stdout' */
    strcpy(outputbuf->filename,"STDOUT");
}
else {
    if (flags->out&QUE) {
        /* '-k q' option is in effect */
        strcpy(outputbuf->filename,kernelname[QUEUEFILE]);
    }
    else if (filename[QUEUEFILE]) {
        /* kernel option is not in effect for the queue file
         * and the QUE file is specified in the command line. */
        strcpy(outputbuf->filename,filename[QUEUEFILE]);
    }
    /* open the queue file with the desired name. */
    out->que=openfile(outputbuf->filename, "w");
}
outputbuf->filepointer = out->que;
strcpy(outputbuf->purpose,"QUEUE PRINTOUT");
/* document queue file path. */
appendfile(&(outputfiles),outputbuf);
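The selection logic of the block above can be isolated into a small function for study. This is a sketch under stated assumptions: choose_output_name() is a hypothetical helper, and the flag integer and the two candidate names are passed in as arguments instead of being read from the global structures.

```c
#include <assert.h>
#include <stddef.h>

#define QUE 0x0080   /* kernel-file bit within flags->out, as in simmac.h */

/* Picks the name of the queue output stream.
 *   out_flags    : the flags->out integer
 *   kernel_name  : name generated by the -k option (may be NULL)
 *   cmdline_name : name given directly on the command line (may be NULL)
 * Returns "STDOUT" when neither source supplies a file. */
const char *choose_output_name(int out_flags, const char *kernel_name,
                               const char *cmdline_name)
{
    if (out_flags & QUE) {
        assert(kernel_name != NULL);   /* the -k option must have built it */
        return kernel_name;
    }
    if (cmdline_name != NULL) {
        return cmdline_name;
    }
    return "STDOUT";
}
```

As in the original block, a name generated by the kernel option takes precedence over a name given directly on the command line.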
C.2.6 New Element in Toggle Option List
toggle.c
inittoggle()
All options which can be controlled from the input file must be
placed in a list located in inittoggle().
To add the option to the list of options that can be manipulated from the input
file, copy one of the existing data blocks and change the values to match the new
option.
Make sure that the letter code chosen for the new option does not repeat an
existing option.
/* 6th Element of togglelist is Queue Printout */
appendtogglelist(&(toggletype), &(flags->tst), "QUEUE PRINTOUT",
                 "q", QUEUE, ZERO);
NOTE: The same type of mechanism used to toggle flags from the input file may be
modified to read in a schedule of orders. The orders can be placed in the event queue
at a specified time in the simulation. This has not yet been done.
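The idea behind the toggle list can be illustrated with a much simplified structure. The sketch below is not the TOGGLELIST of Hiercsim; struct toggle and toggle_by_code() are hypothetical, and the sketch assumes that each entry records the flag integer to change, the bit within it, and the letter code used in the input file.

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE 0x0010   /* option bit from the simmac.h example */

/* Hypothetical, simplified toggle entry. */
struct toggle {
    int  *flag;   /* flag integer that holds the option  */
    int   bit;    /* option bit within that flag integer */
    char  code;   /* letter code used in the input file  */
};

/* Flips the option whose letter code is 'code'.
 * Returns 1 if an entry matched, 0 otherwise. */
int toggle_by_code(struct toggle *list, size_t n, char code)
{
    size_t i;

    assert(list != NULL);
    for (i = 0; i < n; i++) {
        if (list[i].code == code) {
            *list[i].flag ^= list[i].bit;   /* on -> off, off -> on */
            return 1;
        }
    }
    return 0;
}
```

The lookup returns on the first match, which is one reason a repeated letter code would leave the second option unreachable.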
C.2.7 Control Using Flags in Hiercsim
At this point, add in the changes to the code using the format:
if (flags-><flag name>&<flag bit>) {
output commands
}
where <flag name> is the flags code in the FLAGS data structure
and <flag bit> is the macro defined within simmac.h.
NOTE: It is vital to have only a single ampersand sign '&' between the two integers
being compared.
if (flags->tst & QUEUE) {
    fprintf(out->que, "Queue data", variables);
}
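The reason for the single ampersand can be shown with a short sketch. A single '&' is the bitwise AND, which isolates one option bit; a double '&&' is the logical AND, which is true whenever both operands are nonzero. QUEUE is taken from simmac.h as described above; GANTT_DEMO and the function names are hypothetical.

```c
#include <assert.h>

#define GANTT_DEMO 0x0008   /* hypothetical option bit               */
#define QUEUE      0x0010   /* option bit from the simmac.h example  */

/* Correct test: bitwise AND isolates the QUEUE bit. */
int queue_on_bitwise(int flag)
{
    return (flag & QUEUE) != 0;
}

/* Incorrect test: logical AND is true whenever both operands are nonzero,
 * so any enabled option makes it look as if QUEUE were on. */
int queue_on_logical(int flag)
{
    return flag && QUEUE;
}

/* Demonstrates the difference when only GANTT_DEMO is switched on. */
void demo_ampersand(void)
{
    int flag = GANTT_DEMO;

    assert(queue_on_bitwise(flag) == 0);   /* QUEUE is off: correct */
    assert(queue_on_logical(flag) == 1);   /* wrongly reports "on"  */
}
```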
Appendix D
Procedure to Add Constraints
In the design of Hiercsim, it was decided to put all possible constraints into a master lp
in the beginning of the program and then to selectively choose those which are needed
while running the program. The following changes occur when a new constraint option
is added:
* simmac.h
Under the category of 'LP CONSTRAINT TYPES' a new type of constraint
must be defined so as to allow the LP matrix to recognize what type of constraint
one has added. The name of the constraint should reflect the purpose of the
constraint. Example:
/* ---------- LP CONSTRAINT TYPES ------------------------- */
#define MCCONSTRAINT      /* inequality */
#define UPS_VIRTMACHINE   /* inequality */
#define DNS_VIRTMACHINE   /* inequality */
#define PARTTOSETUP       /* inequality */
#define SETUP_TO_SETUP    /* inequality */
#define UPS_ANTILOOP      /* inequality */
#define DNS_ANTILOOP      /* inequality */
#define SETUPFREQ         /* equality */
* initlp.c
In the routine updmatrix() the variable lastrow adds all the possible rows together
to allow for the allocation of memory. This variable must be incremented
to account for all the possible rows one might add.
Example:
In routine updmatrix():
/* num_mcrow is for the # of machine constraints including
 * setup states.
 * 1 numvar accounts for the number of possible boundary
 * constraints.
 * 1 numvar accounts for the number of possible BLOCKAGE
 * virtual machines.
 * 1 numvar accounts for the number of possible STARVATION
 * virtual machines.
 * 1 numvar accounts for possible upstream virtual
 * machine loops
 * 1 numvar accounts for possible downstream virtual
 * machine loops
 * 1 su_numvar accounts for the number of possible boundary
 * constraints.
 * 1 su_numvar for going into a state equaling going out
 * (SEQ_INDEP) or sum freq in = sum freq out (SEQ_DEP)
 * 1 su_numvar for u>=f for each setup frequency.
 * 2 for the cost rows. */
lastrow = ctrl->num_mcrow + 5*(ctrl->numvar) +
          3*(ctrl->su_numvar) + 2;
In the first routine 'mkmat', under the comment 'Adds the extra constraints', a new
routine of the form 'set_xxxx_constraints' must be called which initializes and gets
memory for the new constraint(s). xxxx should be a 2 or 3 letter mnemonic
code for the constraint's purpose. Make certain to keep the constraint addition
routines in the same order, because some of them require other constraints to be
in place in order to work.
Example:
In routine mkmat():
/* Adds the extra constraints.*/
-> set_su_constraints(ctrl);
   set_pts_constraints(ctrl);
   set_vm_constraints(ctrl);
   set_bo_constraints(ctrl);
In addition the new routine must be written.
Example:
******* Describe the routine *********
/* set_su_constraints() puts in the constraints on the setup
 * change rate variables.
 * Currently, there are only constraints in for sequence-
 * independent setups.
 * The constraint is: u[i] <= sum(u[j]), j!=i, for each i.
 * u[i] is the rate of change into setup i for a given class. */
set_su_constraints(ctrl)
CTRLLEVEL *ctrl;
{
    SUCELLVARIABLE *suvar, *other_setup;
    ROWID *newrow_id, *mkrowvar(), *row_idlist;
    int row, col, varindex, i;

    row_idlist=ctrl->row_id;
    suvar=ctrl->suvarlist;
****Go through all possible columns that might need
    to be assigned a value.**************************
    /* Go through each of the setup change variables in the list.*/
    for (i=0; i<ctrl->su_numvar; i++) {
****Assign the memory and make the new row data structure. ****
        newrow_id=mkrowvar();
        addrowvar(((COMPONENT *) NULL), suvar->to_setup,
                  (row_idlist), newrow_id);
****Define the type of constraint.*************************
        newrow_id->constraint_type=SETUP_TO_SETUP;
        row=newrow_id->master_index;
****Have a conditioning statement**************************
        while (other_setup) {
            if ((suvar->to_setup->supermcgsu==
                 other_setup->to_setup->supermcgsu) &&
                (suvar!=other_setup)) {
                varindex=other_setup->varindex;
**********Sets value of function**********
                ctrl->master_lp->c[row][varindex] = (-1);
            }
            other_setup=other_setup->nextsuvar;
Since all constraints are of the form = 0 or <= 0, we run two while
loops over the CELLVARIABLEs and/or SUCELLVARIABLEs, one for each
side of the equation. A conditioning statement is used to determine which
variables to include. Remember to multiply all variables which would normally
be on the right side of the equation by -1 to bring them to the left side.
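The sign convention can be sketched on a small dense matrix. The constraint u[i] <= sum(u[j]), j != i, becomes u[i] - sum(u[j]) <= 0, so row i holds +1 in column i and -1 in every other column. This is a simplified illustration that assumes all setup change variables belong to one class; NUM_SETUPS, fill_setup_row(), and row_value() are hypothetical.

```c
#include <assert.h>

#define NUM_SETUPS 4   /* hypothetical number of setup change variables */

/* Fills row 'i' of coefficient matrix 'c' for the constraint
 *     u[i] <= sum of u[j], j != i
 * rewritten with everything on the left side:
 *     u[i] - sum of u[j] <= 0                                     */
void fill_setup_row(double c[NUM_SETUPS][NUM_SETUPS], int i)
{
    int j;

    assert(i >= 0 && i < NUM_SETUPS);
    for (j = 0; j < NUM_SETUPS; j++) {
        c[i][j] = (j == i) ? 1.0 : -1.0;
    }
}

/* Evaluates the left side of row 'i' for the rates 'u'.
 * The constraint is satisfied when the result is <= 0. */
double row_value(double c[NUM_SETUPS][NUM_SETUPS], int i, const double *u)
{
    double sum = 0.0;
    int j;

    for (j = 0; j < NUM_SETUPS; j++) {
        sum += c[i][j] * u[j];
    }
    return sum;
}
```

In Hiercsim the -1 entries are set only for variables that pass the conditioning statement; the sketch sets them for every other column.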
* print.c
In the routine drawcell(), an extra case must be added to the switch statement inside
the while loop to allow for the new constraint. Example:
while (rowid) {
    if ((rowid->workindex != NOT_VALID)
        || master) {
        equality = FALSE;
        /* --------- ROW LABEL -------------------- */
        switch (rowid->constraint_type) {
        case SETUP_TO_SETUP:
            fprintf(out->ver,"s to s %-4d ", rowid->master_index);
            break;
        case BOUNDARY:
            /* Bypass equality constraints. */
            equality = TRUE;
            break;
        default:
            fprintf(out->ver,"drawcell: Invalid constraint type\n");
            exit();
        }
* worklp.c
The only two routines in worklp.c which may change when adding a new constraint
are buildcolmask() and buildrowmask(). These routines identify rows
and columns to be included when making the working tableaus from the master
lp tableau.
In most cases buildcolmask() will not be changed because the variable will have
already been included. I only recommend that this be checked to make sure.
In the case of buildrowmask(), a new routine of the following form, which should be
included in buildmask(), may be added:
Example:
/* build_su_row_mask() uses the column mask and activates all
 * rows which are needed by the active columns. */
build_su_row_mask(ctrl)
CTRLLEVEL *ctrl;
{
    for (i=0; i<ctrl->su_numvar; i++, suvar = suvar->nextsuvar) {
        suvarindex = suvar->varindex;
        if (colmask[suvarindex] == ACTIVE) {
            row_id = ctrl->row_id;
            while (row_id) {
                if (row_id->constraint_type == SETUP_TO_SETUP) {
                    row = row_id->master_index;
                    if ((master_lp->c[row][suvarindex])==ONE) {
                        rowmask[row] = ACTIVE;
                    }
                    row_id = row_id->nextrow;
                }
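The masking idea can be sketched on a small dense matrix: a row is activated when an active column has a coefficient of one in that row, mirroring the ==ONE test in the excerpt above. This is an illustration only. The array sizes and build_row_mask() are hypothetical, and the sketch scans a plain array where Hiercsim walks the ROWID list.

```c
#include <assert.h>

#define ACTIVE   1
#define INACTIVE 0
#define NROWS    3   /* hypothetical sizes of the master tableau */
#define NCOLS    4

/* Activates every row that has a coefficient of 1.0 in an active column.
 * Returns the number of rows activated. */
int build_row_mask(double c[NROWS][NCOLS], const int colmask[NCOLS],
                   int rowmask[NROWS])
{
    int row, col, count = 0;

    assert(c != 0 && colmask != 0 && rowmask != 0);
    for (row = 0; row < NROWS; row++) {
        rowmask[row] = INACTIVE;
    }
    for (col = 0; col < NCOLS; col++) {
        if (colmask[col] != ACTIVE) {
            continue;   /* this variable is not in the working lp */
        }
        for (row = 0; row < NROWS; row++) {
            if (c[row][col] == 1.0 && rowmask[row] == INACTIVE) {
                rowmask[row] = ACTIVE;
                count++;
            }
        }
    }
    return count;
}
```

Rows in which an active column appears only with a coefficient of -1 are left inactive, since in that case the variable sits on the right side of the original constraint.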
Appendix E
Hiercsim Version 3.5 Cross
Reference
Subroutines within Files of Hiercsim
Version 3.5
Auxiliary files are support files that may be separated from the
algorithm routines without losing information
Filename
Subroutine
analyze.c
analyze
addvariable
bltentry
CELLVARIABLE *mkvar
CTRLLEVEL *findctrl
addentry
construct
mknumcom
mklevel
LPDATA *mklpdata
countstates
numatlevel
ROWID *mkrowvar
addrowvar
CTRL_MSGCENTER *mkctrlmsg
anlyzesu.c
analyzesu
scan_resources
assigntolist
SUCELLVARIABLE *mksetupvar
addsuvariable
antiloop.c
idloops
chkloop
setiloopconstraints
LOOP *mkloop
addupsloop
adddnsloop
rmupsloop
rmdnsloop
boundary.c
installboundaries
calc xonboundary
perturb_offboundary
smallperturbation
followtrajectory
checkboundary
buffer.c
calcfilllevel
chgfillrate
calcstat.c
detlevivmc
detlevarmc
statAvmc
statarmc
statar-bf
stat Avbf
carbufsta
clvbufsta
clvmcsta
carmcsta
CELLSTAT *mkcellstat
BUFFSTAT *mkbuffstats
drawcellstats
drawbuffstats
finalrepstats
CELL find_topcell
drawstatcell
drawstatbuff
finalcalc-cell
final_calcbuff
cc.c
doccdown
double ccdown
double calcinitialrates
checkfeasible
countboundaries
update
calc_cost
mkenv
clrenv
mkscale
cd.c
docddown
double cd_down
cdsu.c
cdsudown
set-mc_group
find_sueligible
findmc_eligible
set_suvar
comline.c - - Auxiliary
commandlineparser
open_files
coordsys.c
newlowtask
updlowstatus
chklowstatus
telllowlevel
tellcellcom
rearrangecellcalc
clr_rearrange
rearrangestarvation
rearrangeblockage
facpol.c
checkbatch
check_dnsbuffer
failure.c
double getfailtime
failmc
repairmc
mcrepaired
FAILSTAT *getfstat
idlevel
gantt.c - - Auxiliary
finishgantt
update_gant t machine
checkchangestatus
int findsymbol
update_status
locate_proc_symbol
print _statusline
findcolumn
global.c
Disclaimer and version number are located here.
All global variables must appear twice:
1) Declared in global.c
2) Redeclared in extern.h, preceded by the extern keyword
Global variables are declared in global.c and are linked to all
files that "#include extern.h"
initlp.c
mkmat
updmatrix
suupdmatrix
setsuconstraints
set vmconstraints
setboconstraints
getindex
sugetindex
initeffcap
initcap
chgcap
clrrow
clrcol
initproc.c
initproc
initsu.c
MCGSUDATA *mkmcgsu
fillmcgsu
MCGSU_DATA *findmcgsu
inpbuff.c
getbuffer
inpcell.c
getcell
CELL *mkcell
getcomponent
CTRLLEVEL *mkctrl
COMPONENT *mkcomponent
inpgantt.c - Auxiliary
getgantt
addmachtogantt
initgantt-id
addcellto_gantt
initstatuslines
gantt ledger
makeproc_table
headerofgant t_chart
inphedge.c
gethedge
double readhedge
double getbound
findcell
inplog.c
getlogspec
PROCESS *findproc
ROUTEENTRY *findentry
inpmach.c
getmcdes
RESOURCE *mkresource
MCGSTATS *mkmcstats
TIMEPARAM *getoptime
getfailstat
RESOURCE *mkgroup
MACHINE *mkmachine
identify
inpproc.c
PROCESS *getproc
PROCESS *mkproc
ROUTEENTRY *getroute
RESOURCE *findres
inpsetup.c
check-setup_tree
SETUP *mksetup
getsu_stats
getroundrobin
getsuchg_time
getseqdep
getseqindep
prnsetuptree
prnmcgsutree
get initialsetup
allclear
prnsu_chgmenu
prnsu_timemenu
assign_index
SETUP *find setup
appendsetup
rmsetup
prnsetuplist
input.c
simparam
inpxgra.c
getxgraph
XGRAPHTABLE *add var to.xgraph
initxcoords
isp.c
ispfindmcgroup
find-next mcgroup
ispvarel
initsuchglist
SETUPLIST *mksetuplist
mksuchglist
updsu_active
ispmc
isp_var
lp.c - Auxiliary
double lp
zxolp
lp _chkfeasible
main.c - Auxiliary
main
init globalvar
test rout init
inputdata
matprep.c - Auxiliary
matprep
multfile.c - Auxiliary
coordoutput
getsubfile
ungetsubfile
FILELIST *appendfile
freeFILELIST
print.c
markprogress
prnstat
prnresstat
label_output
draw inverse
drawfmatrix
testdrawcell
drawcell
process.c
LOT *mklot
rmlot
LOT applot
double calctime
loadmachine
buflotreserve
buflotunreserve
LOT *mkupssendlist
buflottranspport
tellupsmcgroup
mcrecvlot
process
contprocess
procdone
unloadmachine
mclot reserve
LOT *mkdns_sendlist
mclot_transport
bufrecvlot
queue.c - - Auxiliary
queueup
rmQ
QELEM *mkQ
clrQ
QELEM *searchQ
r_robin.c
rrindex
rr_sueligible
random.c - Auxiliary
initrandom
double randuniform
double randexp
double randgauss
randrte.c - - Auxiliary
getalternate
copyentry
linkbranch
procresult
setup.c
setupstart
setupfinish
double calcsetuptime
TIMEPARAM *getsutimeparam
simulate.c
runsim
stats.c - - Auxiliary
updWIP
keeplog
updstat
updvariance
string.c - - Auxiliary
long extlong
double extfraction
double extdouble
sudata.c
getsudata
getsuacoeff
test.c
test
testcell
listentry
testresrc
testproc
testctrl
dumpmat
dumprow
dumpcol
toggle.c - - Auxiliary
inittoggle
testoption
OPTIONLIST *appendoption
freeOPTIONLIST
valid
gettoggle
TOGGLELIST *mktoggle
printtoggle
toggleoption
resetflags
resetoption
ttbound.c
mk_ttb_env
clr_ttbenv
double time_to_boundary
buildcol inbasis
buildfmatrix
normalizef
calcvalidf
calcmintime
utility.c -- Auxiliary
char *getmem
initflags
initutil
initoutfile
fillline
prompt
prnnewline
copynewline
FILE *openfile
pushfile
popfile
filushout
chrinp
prnlocation
inpscreen
ungetinp
char *getfilename
char *mkname
cutline
char *getname
rmwhite
int getinteger
double getdouble
double getexponent
double getnumber
getmode
virtual.c
update_virt mach
cleanvirtual
checkvirt
updateups virt
updatednsvirt
sched_ups_virt
scheddns_virt
add_virtmach
ROWID *find_virtrowid
worklp.c
numinsetup
double numrowmc
buildmask
buildcolmask
buildinterdepend
buildrowmask
expandworklp
buildworkp
xgraph.c
addxcoord
prnxgraph
simmac.h
All macro definitions are located here.
simstruc.h
All data structures definitions are located here.
extern.h
All global variables are declared as extern here.
Appendix F
Hiercsim Version 4.0 Cross
Reference
Subroutines within Files of Hiercsim
Version 4.0
Auxiliary files are support files that may be separated from the
algorithm routines without losing information
Filename
Subroutine
analyze.c
analyze_proc
addvariable
CELLVARIABLE *mkvar
CTRLLEVEL *findctrl
anlyzesu.c
analyzesu
scan-resources
assign_tolist
SUCELLVARIABLE *mksetupvar
addsuvariable
anlyzres.c
analyze-res
bltentry
addentry
ENTRYPTR *mkentryptr
assignentryindex
count setup-tree
mksuinfo
antiloop.c
idloops
chkloop
set ioop_constraints
LOOP *mkloop
addupsloop
add_dnsloop
rmupsloop
rmdnsloop
boundary.c
installboundaries
calc xon_boundary
perturb_offboundary
smallperturbation
follow_trajectory
checkboundary
buffer.c
calc_filllevel
chgfillrate
calcstat.c
detJevlvmc
detlevarmc
stati vmc
statarmc
stat_arbf
statlv_bf
carbufsta
clvbufsta
clvmcsta
carmcsta
CELLSTAT *mkcellstat
BUFFSTAT *mkbufferstats
finalrepstats
drawstatcell
drawstat buff
finalcalccell
finalcalcbuff
cc.c
doccdown
double ccdown
calcinitial_rates
checkfeasible
count _boundaries
update
calc-cost
mkenv
clrenv
mkscale
cd.c
docd_down
double cddown
cellcalc.c
cell_calc
comline.c - Auxiliary
commandlineparser
openfiles
FILELIST *mkfilelist
construc.c
construct
mknumcom
mklevel
createvarindex
mkctrlquant
mkctrllp
mksuthresh
prncellcontents
LPDATA *mklpdata
fill_llp
count_states
numatlevel
ROWID *mkrowvar
addrowvar
CTRL_MSGCENTER *mkctrlmsg
coordsys.c
newlowtask
updlowstatus
chklowstatus
telllowlevel
tellcellcom
rearrange_cell_calc
clrrearrange
rearrangestarvation
rearrangeblockage
facpol.c
check_batch
checkdns buffer
failure.c
double getfailtime
failmc
repairmc
mcrepaired
FAILSTAT *getfstat
idlevel
gantt.c - Auxiliary
finish_gantt
updategantt machine
checkchangestatus
int findsymbol
updatestatus
locateprocsymbol
printstatusline
findcolumn
global.c
Disclaimer and version number are located here.
All global variables must appear twice:
1) Declared in global.c
2) Redeclared in extern.h, preceded by the extern keyword
Global variables are declared in global.c and are linked to all
files that "#include extern.h"
initlp.c
mkmat
updmatrix
suupdmatrix
set suconstraints
set _ptsconstraints
setvmconstraints
set boconstraints
getindex
sugetindex
initeffcap
initproc.c
initproc
initsu.c
MCGSU_DATA *mkmcgsu
fillmcgsu
MCGSU_DATA *findmcgsu
inpbuff.c
getbuffer
inpcat.c
inpcat
initcat
inpinit_cat
lower_cat
get _cat
filltopcell
get cat config
addmctocat
searcom
SETUPCATALOG *mkcatalog
MCCATDATA *mkmc data
COMCATALOGS *mkcomcat
topdrawcat
drawlowcat
drawcat
inpcell.c
getcell
CELL *mkcell
getcomponent
CTRLLEVEL *mkctrl
COMPONENT *mkcomponent
inpgantt.c - Auxiliary
getgantt
addmachtogantt
GANTT *mkgantt
initganttid
addcelltogantt
init statusines
ganttledger
makeproctable
PROCESSTABLE *mkprocess_table
headerofgantt _chart
inphedge.c
gethedge
double readhedge
double getbound
findcell
inplog.c
getlogspec
PROCESS *findproc
ROUTEENTRY *findentry
inpmach.c
getmcdes
RESOURCE *mkresource
MCGSTATS *mkmcstats
MCGMSGCENTER *mkmcgmsg
TIMEPARAM *getoptime
TIMEPARAM *mktimeparam
getfailstat
FAILSTAT *mkfailstat
RESOURCE *mkgroup
MACHINE *mkmachine
identify
inpproc.c
PROCESS *getproc
PROCESS *mkproc
ROUTEENTRY *getroute
ROUTEENTRY *mkrouteentry
RESOURCE *findres
inpsetup.c
check.setuptree
SETUP *mksetup
getsu_stats
getroundrobin
getsuchgtime
getseqdep
getseqindep
prnsetuptree
prnmcgsutree
get initialsetup
all-clear
prnsuchgmenu
prnsu_timemenu
assignindex
SETUP *find_setup
appendsetup
rmsetup
prnsetuplist
inpsudat.c
getsudata
getsuacoeff
input.c
simparam
lot.c
init_entrybuffer
maint ainentry_buffer
LOT *mklot
rmlot
LOT *applot
lp.c - Auxiliary
double lp
zxolp
lp_chkfeasible
main.c - Auxiliary
main
init global_var
test _rout-init
inputdata
matprep.c - Auxiliary
matprep
multfile.c - Auxiliary
coordoutput
getsubfile
ungetsubfile
appendfile
freeFILELIST
print.c
prnstat
prnresstat
labeloutput
draw inverse
drawfmatrix
testdrawcell
drawcell
process.c
double calctime
loadmachine
buflotreserve
buflotunreserve
LOT *mkupssendlist
buflottranspport
tellupsmcgroup
mcrecvlot
process
contprocess
procdone
unloadmachine
mclot _reserve
LOT *mkdnssendlist
mclot_transport
bufrecvlot
queue.c - - Auxiliary
queueup
rmQ
QELEM *mkQ
clrQ
QELEM *searchQ
random.c - Auxiliary
initrandom
double randuniform
double randexp
double randgauss
randrte.c - - Auxiliary
getalternate
copyentry
linkbranch
procresult
setup.c
setupstart
setupfinish
double calcsetuptime
TIMEPARAM *getsutimeparam
findchangelevel
simulate.c
runsim
stats.c - Auxiliary
keeplog
updstat
string.c - Auxiliary
long extlong
double extfraction
double extdouble
sucell.c
mkcatenv
clrcatenv
findnewhigh_cat
docdsudown
double computesetupsallowed
idvalidcatalog
SETUPCATALOG *find_best catalog
find_cat cost
sucomp.c
compare_newsetup
next _setup_time
check x_t hreshhold
init _su_threshhold
sulotchk.c
chkmcgsuchg
chkmcgpaths
chk_cellsuchg
chkcell-paths
sumarklt.c
cellsumarkpath
cell_marklot
init cellmarklot
mcgmarklot
initmcglot
initsumark
clr_cellsumark
clrmcgsumark
sumcg.c
mcgsu_start
updatesuconfig
double fillsuconfig
updatedeltamcgconfig
mcgroup chgconfig
assignmc_setup
deltain_class
filLnewconfig
suutil.c
MCGSU_DATA *findnextmcgsu
COMCATALOGS *find_complist
SETUPCATALOG *find_newcatlist
MCCATDATA *find_mccat
SUINFO *find_suinfo
suworklp.c
transfermasterlp
restoremasterlp
buildcat virtual
buildcat _target
build catcap
double numcatrowmc
check_catfail
toggle.c - - Auxiliary
inittoggle
testoption
appendoption
OPTIONLIST *mkoptionlist
freeOPTIONLIST
valid
gettoggle
append_togglelist
printtoggle
toggleoption
resetflags
resetoption
TOGGLELIST *mktoggle
ttbound.c
mkttb_env
clrttbenv
double timeto_boundary
buildcolin_basis
buildfmatrix
normalizef
calc_valid_f
calc-mintime
utility.c - Auxiliary
char *getmem
initflags
initutil
FILEENTRY *mkfileentry
initoutfile
fillline
prompt
prnnewline
copynewline
FILE *openfile
pushfile
popfile
filushout
chrinp
prnlocation
inpscreen
ungetinp
char *getfilename
char *mkname
cutline
char *getname
clrrow
rmwhite
int getinteger
double getdouble
double getexponent
double getnumber
getmode
CELL *findtopcell
virtual.c
updatevirt mach
cleanvirtual
check_virt
updateupsvirt
updatednsvirt
schedupsvirt
sched_dnsvirt
add_virt _mach
findvirt rowid
worklp.c
buildcap
chgcap
numinsetup
double numsowmc
buildmask
buildcol_mask
buildinterdepend
buildrow-mask
expandworklp
buildworklp
simmac.h
All macro definitions are located here.
simstruc.h
All data structures definitions are located here.
extern.h
All global variables are declared as extern here.
Appendix G
Matlab Graphs from Hiercsim
Output
G.1 Introduction
The following is the recommended procedure to make graphs using Matlab from
data points collected using Xgraph in Hiercsim. 1
There is also help available in Matlab which can be accessed by typing 'help' from
the prompt in the Matlab window. Additional help is available through olc by typing
olc at the athena prompt then answers at the olc prompt.
G.2 Step-by-Step
1. Create the data. Run Hiercsim using the -k option with the x output control
option. This will produce a file *.m which is nearly in Matlab format.
2. Edit the *.m file. Variable names must be added for the vectors. The data
values are printed in the *.m file in the form:
    [ 0.000 0.000;
      1000 200 ]
To make the file readable by Matlab, type in the variable name and an equals
sign on the same line as the opening bracket. All opening brackets in a given
file must have a name and an equals sign:
    variable1 = [ 0.000 0.000;
                  1000 200 ]
1 This appendix was written by Steve Riester. The Matlab procedures were written by James
Violette and Marcello Torres.
3. Enter Matlab. Use the following commands to start Matlab:
    add matlab
From a VAXstation:
    cd /mit/matlab/vaxbin
    matlab directory
From an IBM RT:
    cd /mit/matlab/rtbin
    matlab directory
where directory is the directory in which the *.m file containing the data is
stored. The directory must include the full path.
4. Load the *.m file into Matlab. At the prompt in the Matlab window type
filename, where filename is the name of the file where the data is stored,
without the .m extension.
5. Plot the data. Create a plot of the data by typing
    plot(variable1(:,1),variable1(:,2),variable2(:,1),variable2(:,2))
where each variable is a variable name assigned to a vector to be plotted in
the *.m file.
6. Scaling the axes. If the automatically scaled axes are not suitable for the
plot, the axes can be scaled using the axis command. At the prompt enter a
new variable and assign to it a four-element vector where the first entry
is the desired minimum value on the x-axis, the second entry is the desired
maximum on the x-axis, the third entry is the desired minimum on the y-axis
and the fourth entry is the desired maximum on the y-axis.
For example,
    Z = [200 300 1 10]
will correspond to an x range from 200 to 300 and a y range from 1 to 10.
Then type axis(Z) and issue the plot command again. To return to auto-scaling
mode type axis without the argument.
7. Add titles and labels. Titles and axis labels can be added to the graph
with the title, xlabel and ylabel commands. All three commands use the same
format title('My Graph') where My Graph is the title or label depending on
the command.
8. Print the graphs. After the data is plotted and labels and titles are added
the graphs are ready to be printed.
If only one graph is to be printed type
    print('printer')
where printer is the PostScript printer being used.
If several graphs are to be printed there is a more efficient way to print them.
In this case, after the first graph is plotted and labeled type meta filename
where filename is a new filename including the full path. For each additional
graph, after plotting and labeling simply type meta. After all of the
graphs have been made, exit Matlab but stay in the directory. Type
    gpp filename.met -dps
at the athena prompt. This will convert the meta file
that was created to a PostScript file called
filename.ps which can then be printed using the lpr command:
    lpr -Pprinter filename.ps
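The manual edit in step 2 exists because the data is written without a variable name. As a possible alternative, the sketch below shows in C how a writer could emit the name together with the data, so that the file loads into Matlab unchanged. This is not how Hiercsim currently writes its output; write_matlab_matrix() is a hypothetical routine and the number format is an assumption.

```c
#include <assert.h>
#include <stdio.h>

/* Writes 'n' (x, y) pairs to 'fp' as a named Matlab matrix:
 *     name = [
 *     x0 y0;
 *     x1 y1;
 *     ];
 * Returns the number of rows written. */
int write_matlab_matrix(FILE *fp, const char *name,
                        const double *x, const double *y, int n)
{
    int i;

    assert(fp != NULL && name != NULL && n >= 0);
    fprintf(fp, "%s = [\n", name);
    for (i = 0; i < n; i++) {
        fprintf(fp, "%.3f %.3f;\n", x[i], y[i]);
    }
    fprintf(fp, "];\n");
    return n;
}
```

Called once per curve with names such as variable1 and variable2, a routine of this kind would produce a file that the plot command in step 5 can use directly.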
