Automatic detection of software failures with hierarchical supervisors by Savor, Tony
Automatic Detection of Software Failures with
Hierarchical Supervisors
Tony Savor
A thesis
presented to the University of Waterloo
in fulfilment of the
thesis requirement for the degree of
Doctor of Philosophy
in
Electrical Engineering
Waterloo, Ontario, Canada, 1997





The author has granted a non-exclusive licence allowing the
National Library of Canada to
reproduce, loan, distribute or sell
copies of this thesis in microform,
paper or electronic formats.
The author retains ownership of the
copyright in this thesis. Neither the
thesis nor substantial extracts from it
may be printed or otherwise
reproduced without the author's
permission.
The University of Waterloo requires the signatures of all persons using or
photocopying this thesis. Please sign below, and give address and date.
Acknowledgments
First and foremost, I would like to thank my supervisors, Prof. R.E. Seviora and
Prof. P. Dasiewicz whose efforts and contributions to this work went well beyond the
call of duty. Gratitude also goes to all members of my committee: Prof. J.M. Atlee,
Prof. B.R. Preiss, Prof. A. Singh and Prof. M.A. Vouk for their guidance throughout
the course of this work. In addition, appreciation goes to all members of the
Software Engineering Group for their many contributions over the years. Finally, I
would like to thank Claudia and the members of my family who stood behind me
during this period.
Funding for this work was provided by Bell Canada, the Natural Sciences and
Engineering Research Council of Canada, the Province of Ontario and the
University of Waterloo.
Abstract 
As the size and complexity of modern software systems grow, it becomes
increasingly difficult to determine whether they operate as specified. Presently, the process
is excessively dependent on human observation, limiting its scalability and accuracy.
Accurate and reliable detection of software failures would aid in the management
and improvement of software reliability. An automated approach to detection of
software failures is needed.
This thesis addresses software supervision, an approach to specification-based,
automated detection of software failures. The work is focused on real-time reactive
systems specified in a formalism based on communicating finite state machines. The
supervisor, a separate unit, observes the inputs and outputs of a target software
system. It makes use of the target system's requirements specification.
Discrepancies between specified and observed behaviors are reported as failures by the
supervisor.
Supervision involves a number of difficult issues. A prominent one is the
handling of specification nondeterminism. Specification nondeterminism permits the
target system to generate several legal output behavioral alternatives for a single
input behavior. The supervisor must be able to consider all behavioral alternatives
so that unwarranted failure reports are not generated. In some cases, the
exhaustive consideration of all behavioral alternatives results in an excessive supervisor
time and space cost.
This thesis presents a novel approach to supervision, called hierarchical
supervision, that reduces the time and space cost of supervising systems whose
specifications contain large amounts of nondeterminism. In a hierarchical supervisor, failure
detection is carried out at two levels of abstraction: the path detection level and the
base level. The path detection level determines the path or trajectory through the
specification that corresponds with observed target system behavior. Effectively, at
the path detection level, the behavioral alternative chosen by the target system is
identified. At the base level, a detailed check of observed behavior along the path
identified is made.
This thesis presents the underlying concepts of hierarchical supervision, the
architecture of a hierarchical supervisor, the derivation of the supervisor model from
the requirements specification, and the definition of the interpreters for both the path
detection and base supervisor levels, and describes the derivation of the time and
space complexities for both. The major research contributions of the thesis include
splitting of supervision into two sub-problems (path detection and detailed behavior
checking), making use of both target system input and output signals to track
target system behavior, discussion of tradeoffs between the latency of failure detection
and the computational cost of supervision, development of an approach to prune
behavioral alternatives from consideration, and development of a base supervisor
aimed at detailed behavior checking.
To evaluate hierarchical supervision, a demonstration supervisor was
implemented. It supervised the control program of a small telephone exchange. Two key
aspects, failure detection and time/space complexity, were evaluated.
The failure detection evaluation included both optimistic and pessimistic
reporting. Pessimistic reporting refers to unwarranted generation of failure reports, while
optimistic refers to not generating warranted failure reports. Experimental
observations revealed that all failures were reported and no failures were missed. The
time and space cost was evaluated by measuring the number of behavioral
alternatives considered by the supervisor, which is indicative of its time and space cost.
Experimental measurements showed improvements of over two orders of magnitude
over the direct single-layer approach.
Contents
1 Introduction
1.1 Failure Detection
1.2 Software Supervision
1.3 Why Supervision
1.4 Objectives
1.5 Summary of Research Contributions
1.6 Organization of Thesis
2 Issues & Related Work
2.1 Definition of Correct Behavior
2.1.1 Target System Response Time
2.2 Specification Non-Determinism
2.2.1 An Execution Path Interpretation of Non-Determinism
2.2.2 Categories of Non-Determinism
2.3 Supervisor Signal Processing Latency
2.3.1 In-time Supervision
2.3.2 Out-of-time Supervision
2.4 Tradeoffs Between Accuracy and Computational Cost
2.5 Attachment of a Supervisor to a Target System
2.5.1 Tapping of a Data Link
2.5.2 Polling of Controlled Hardware Interface Memory
2.6 Continuation of Supervision After Detection of a Failure
2.6.1 Resynchronization
2.7 Related Work
2.7.1 Intrusive
2.7.2 Non-Intrusive
2.8 Research Focus
3 Hierarchical Software Supervision
3.1 Definitions
3.2 Internal Organization of a Non-Hierarchical Supervisor
3.2.1 Approaches to Dealing with Specification Non-determinism
3.3 Tracking Target System Operation
3.3.1 The Tracking Model
3.4 Hierarchical Software Supervisor
3.4.1 Operation of the Hierarchical Supervisor
3.4.2 Supervisor Signal Processing Latency
3.4.3 Computational Cost
4 The PDM Model
4.1 Example Software System
4.1.1 Illustration of Nondeterministic Behavior
4.2 Issues in the Derivation of the PDM-Model
4.2.1 Identification of State Transitions
4.2.2 Causality Pathways
4.2.3 Signal Parameters
4.3 PDM-Model Transformation Algorithm
4.3.1 Overview
4.3.2 Constraint-Based Stimulus Consistency
4.3.3 PDM-Model Transformation Algorithm
4.3.4 Stimulus Selection Algorithm
4.3.5 PDM-Model Generation Algorithm
4.3.6 PDM-Model Transformation Example
5 The Path Detection Module Interpreter
5.1 Overview
5.1.1 Components of the PDM
5.2 Temporal Signal Tags
5.2.1 Interpretation of Occurrence Intervals
5.2.2 Singly Bound Occurrence Intervals
5.2.3 Generation of Signal Tags
5.2.4 Timers
5.3 Partial-Order Signal Consumption
5.3.1 Application of Partial Order Signal Consumption
5.3.2 Definitions
5.3.3 An Implementation of Partial-Order Signal Consumption
5.4 Belief Method
5.5 Core Interpreter
5.5.1 SDL Abstract Machine
5.5.2 PDM Abstract Machine
5.5.3 PDM Input Port
5.5.4 Complexity Analysis
5.5.5 Computational Complexity of the Input Port
5.5.6 Scheduling Process Execution within the PDM
5.6 Time and Space Complexity of the PDM
5.6.1 Running-Time Complexity
5.6.2 Space Complexity
6 The Base Supervisor
6.1 The Base Supervisor Model
6.2 Base Supervisor Interpreter Overview
6.3 Time within the Base Supervisor
6.4 Behavior Supervisor Interpreter
6.4.1 Belief Creation/Termination
6.5 Comparator
6.5.1 Queue Signal Algorithm
6.5.2 Process Contents Algorithm
6.5.3 Complexity Analysis
6.6 Input Port
6.6.1 Queue Signal Algorithm
6.6.2 Consume Signal Algorithm
6.6.3 Complexity Analysis
6.7 Scheduling Process Execution within the BSup
6.7.1 Complexity Analysis
6.8 Time and Space Complexity of the BSup
6.8.1 Time Complexity
6.8.2 Space Complexity
6.9 Time and Space Complexity of the Hierarchical Supervisor
6.9.1 Time Complexity
6.9.2 Space Complexity
7 Evaluation
7.1 Demonstration System
7.1.1 Class Description
7.1.2 Supervisor Operation
7.2 Evaluation Testbed
7.3 Evaluation
7.3.1 Failure Detection Capability
7.3.2 Number of Legitimate Behavioral Alternatives
7.3.3 Number of Behavioral Alternatives Generated
7.3.4 Running-Time Complexity
7.3.5 Space Complexity
7.3.6 Scalability
8 Conclusions
8.1 Hierarchical Supervision
8.2 Major Research Contributions
8.3 Future Work
Bibliography
A Target System Specification
A.1 PhoneHandler
A.2 TTRX-Manager
A.3 Network Path Manager (NetPathManager)
List of Tables
5.1 Example: Partial Order Distance Table
7.1 Hierarchical Supervisor - Lines of Source
7.2 Supervisor Failure Detection Capability
A.1 SDL Requirements Dictionary (1/2)
A.2 SDL Requirements Dictionary (2/2)
List of Figures
1.1 Software Supervisor
2.1 Non-Deterministic Behaviors
2.2 Example Finite State Machines
2.3 Causality Violation in Event Processing
2.4 Supervisor Connectivity Patterns
2.5 Sampling of the Hardware Interface Memory
2.6 Operation of a System After Occurrence of a Failure
3.1 Anatomy of a Software Supervisor
3.2 Example SDL Specification
3.3 Example Tracking Model
3.4 Hierarchical Software Supervisor
3.5 Operating States of a Hierarchical Supervisor
3.6 Failure Types
3.7 Illegitimate Behaviors
4.1 Telephone Exchange SDL System Specification
4.2 Fragments of the Phone Handler Specification
4.3 Behavioral Alternatives for the A and B Call Z Scenario
4.4 Permuteable Signals at the Input of a SDL Process
4.5 Causality Pathway and Causality Pathway Tracing
4.6 Example: PDM-Model Deadlock
4.7 PDM-Model Transformation Process
4.8 Segment of Constraint Graph
4.9 Stimulus Selection Algorithm
4.10 PDM-Model Generation Algorithm
4.11 Application of the Stimulus Selection Algorithm
4.12 Application of the PDM-Model Generation Algorithm (1/3)
4.13 Application of the PDM-Model Generation Algorithm (2/3)
4.14 Application of the PDM-Model Generation Algorithm (3/3)
5.1 Signal Ordering
5.2 Overlapping and Non-Overlapping Occurrence Intervals
5.3 SDL Timer Set/Reset Constructs
5.4 Channels Carrying Timer Signals
5.5 Generation of Beliefs
5.6 SDL Abstract Machine
5.7 Path Detection Module Abstract Machine
5.8 Belief Creation
5.9 Belief Termination
5.10 Input Port Queue Signal Algorithm
5.11 Consume Signal Algorithm
5.12 PDM Input Port Signal Consumption Algorithm
5.13 PDM Process Scheduling Algorithm
6.1 Base Supervisor Model Transformations
6.2 Base Supervisor Abstract Machine
6.3 Comparator Queue Signal Algorithm
6.4 Comparator Signal Matching Algorithm
6.5 Input Port Queue Signal Algorithm
6.6 BSup Input Port Signal Consumption Algorithm
7.1 High-Level Supervisor Design
7.2 Signal Routing within the Hierarchical Supervisor
7.3 Scheduling a not-ready-to-run Comparator
7.4 Execution of a PDMProcess and Comparator
7.5 Belief Creation/Termination
7.6 Generating a Failure Report
7.7 Evaluation Testbed
7.8 Measured Number of Beliefs
7.9 Number of Behavioral Alternatives Generated
7.10 CPU Time Per Call
7.11 Maximum Memory Usage
A.1 Private Branch Exchange (PBX)
A.2 System Specification of PBX
A.3 SDL Specification of PhoneHandler Process (1/2)
A.4 SDL Specification of PhoneHandler Process (2/2)
A.5 SDL Specification of NetPathManager Process
A.6 SDL Specification of TTRXManager Process
Chapter 1 
Introduction 
1.1 Failure Detection 
This thesis addresses automatic detection of software failures. It is well known that
state-of-the-art software development processes yield imperfect software. Thus it
is common for large systems such as telecommunication switches and avionics flight
systems to contain several thousand software faults. Automatic failure
detection is the first step in dealing with failures arising from software faults.
1.2 Software Supervision 
Software supervision is an approach aimed at automatically detecting externally
observable software failures. A software supervisor monitors the inputs and outputs
of the target system (figure 1.1) and makes use of the target system's requirements
specification.
Figure 1.1: Software Supervisor (block diagram: the inputs and outputs of the
real-time system or subsystem are observed by the supervisor, which emits a
failure report)
Internally, the supervisor generates a set of expected behaviors from the
requirements specification and target system input and/or output signals. The expected
behaviors are compared with actually observed behaviors. A failure is reported if
a match between the two cannot be made.
The supervisor may be attached to either the entire system or a sub-system,
provided that the inputs and outputs of the sub-system are observable. In the latter
case, a supervisor could be used to detect errors before they manifest themselves
as externally observable failures.
A number of challenges exist in the development of a software supervisor. One
major challenge is dealing with specification non-determinism. Specification
non-determinism permits several legitimate output behaviors for a single input behavior.
A supervisor that uses a specification containing non-determinism must be able to
consider all legitimate behavioral alternatives so that false failure reports are not
generated.
For some specifications, the number of legitimate behavioral alternatives can be 
large. Explicit consideration of each alternative can result in a very large supervisor 
time and/or space complexity. 
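The comparison described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the legitimate behavioral alternatives are finite output sequences enumerated up front; the function name `supervise` and the signal names are hypothetical and are not taken from the thesis.

```python
from typing import Iterable, List, Sequence, Tuple

Behavior = Tuple[str, ...]  # one legitimate output sequence (hypothetical encoding)


def supervise(alternatives: Iterable[Behavior], observed: Sequence[str]) -> List[str]:
    """Compare observed outputs against every legitimate alternative.

    Returns an empty list when at least one alternative is still consistent
    with everything observed so far, otherwise a single failure report.
    """
    candidates = [tuple(a) for a in alternatives]
    for position, signal in enumerate(observed):
        # Discard alternatives that disagree with the signal just observed.
        candidates = [
            a for a in candidates if position < len(a) and a[position] == signal
        ]
        if not candidates:
            return [f"failure at output {position}: unexpected signal {signal!r}"]
    return []


# Two legitimate orderings of the same outputs (illustrative signal names).
ALTERNATIVES = [("x", "y"), ("y", "x")]
ok_reports = supervise(ALTERNATIVES, ["y", "x"])   # a legitimate alternative
bad_reports = supervise(ALTERNATIVES, ["x", "x"])  # matches no alternative
```

In this sketch the work per observed signal grows with the number of candidate alternatives still held, which is the cost that the preceding paragraph warns can become very large.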
1.3 Why Supervision 
There are three principal categories of application areas for supervision: software 
development, on-line supervision and software reliability instrumentation. This 
section outlines several uses of a supervisor within each application area. 
A. Software Development: During software development, a supervisor may be
used to report software failures prior to release. Two specific application areas
are:
1. Fault-localization tool. Large software systems can be partitioned
into several sub-systems. If a supervisor is attached to each sub-system,
a fault may be automatically localized to a sub-system. In this case,
supervision has the potential of reducing costs associated with software
debugging.
2. Test tool. Systems exhibiting non-determinism are difficult to test due
to the number of possible outcomes for a given test case. As a result,
testing, in most cases, is restricted to specific cases with few, known
outcomes.
A supervisor is able to report a relatively complete set of failures. It
would serve to improve the effectiveness of testing and indirectly improve
the reliability of developed software.
B. On-Line Supervision: The presence of faults in software systems during field
operation makes supervision an attractive approach for detecting failures.
On-line supervision presents a number of advantages including:
• Early reporting of failures allows a company to repair underlying faults
before more serious consequences occur.
• Minor failures are, in some cases, indicative of more serious future
problems. Accurate reporting of failures often gives early warning of potential
future catastrophes.
• The supervisor maintains a more global perspective of the system than
any individual user. It is thus able to report failures not visible to
individual users.
• The supervisor is able to provide more accurate and detailed failure
reports than non-technically oriented users.
C. Software Reliability Instrumentation: A major impediment to the
advancement of the software reliability engineering discipline is the difficulty
associated with collection of software failure data. At present, the process is
excessively dependent on human intervention both for the detection of
failures and collection of relevant descriptors. The software supervisor may be
used to automate this process.
1.4 Objectives 
The primary objective of this work is the research of an efficient approach to auto- 
matic detection of software failures in the presence of specification non-determinism. 
The intended application of the failure detection unit is real-time reactive telecom- 
munications software. 
1.5 Summary of Research Contributions 
• Partitioning supervision into two subproblems: target system tracking and
detailed behavior checking.
• Definition of a framework to track the target system operation. The tracking
unit consists of a model and an interpreter.
- Formalization of the semantics of the tracking unit model.
- Research of a derivation procedure for the tracking unit model.
- Definition of algorithms for a suitable tracking system model interpreter.
- Development of a prototype implementation of the tracking unit
interpreter.
• Definition of a framework for detailed behavior checking. A detailed
behavior checking unit consists of a model and an interpreter.
- Formalization of the semantics of the detailed behavior checking unit.
- Development of algorithms for a suitable interpreter of the detailed
behavior checking unit.
- Development of a prototype implementation of the detailed behavior
checking unit.
• Computational complexity assessment of the proposed approach.
1.6 Organization of Thesis
This thesis is organized as follows: Chapter 2 outlines the major issues related to 
automated failure detection and overviews existing approaches. 
Chapter 3 presents an overview of hierarchical supervision. A hierarchical
supervisor consists of two layers: (1) the tracking layer (called the path-detection layer)
and (2) the detailed behavior checking layer (called the base supervisor layer). Each
layer makes use of a unique target system model and interpreter. Target system
models are derived from the target system requirements specification.
Chapter 4 describes the transformation of the model used by the path
detection layer. The transformation accepts as input the target system's requirements
specification and generates a suitable model to be used by the path detection layer.
Chapter 5 describes an interpreter for the aforementioned model.
Chapter 6 presents the model transformation and interpreter for the base
supervisor layer. Evaluations of the approach based on a prototype supervisor and a
small telephone exchange that served as a target system are presented in chapter 7.
Conclusions are drawn in chapter 8.
Chapter 2 
Issues & Related Work 
This chapter outlines six major issues that arise in software supervision. Existing 
work that may be used for automatic detection of software failures is described 
next. The chapter concludes with an overview of the focus of this thesis in light of 
the issues and existing work. 
2.1 Definition of Correct Behavior 
The objective of supervision is to detect failures in the operation of a target system. 
The supervisor requires a definition of legitimate target system behavior. The 
definition is required to be complete and expressed using a formal notation. 
One possibility is that the supervisor uses the target software system's
requirements specification, typically developed as part of the software life cycle [46]. The
requirements specification defines the externally observable behavior of the software
system. A multitude of formal specification languages exist with formally defined
semantics to minimize semantic ambiguities.
This work is focused on communicating finite state machine (CFSM) based
formalisms. Many internationally standardized formalisms are based on a CFSM
model. Examples include the Specification and Description Language (SDL) [58],
Estelle [23], and Lotos [22].
2.1.1 Target System Response Time 
Physical systems are typically specified as having finite response times. Thus an
event E will be serviced by the target system after R units of time.
For actual systems, R may be different for different events. Furthermore, for a
single event, R may vary depending on several factors such as the target system
load and the availability of resources. The exact response time may be impossible
to determine analytically.
An approximation of the individual event response times can be made by
considering the best and worst-case response times. The actual response time will fall
within this interval. $T_{min}$ is defined as the best-case response time of any event
under any specified condition of the target system. Similarly, $T_{max}$ is defined as
the worst-case response time. For the remainder of the thesis, each event will be
considered to have a response time that falls within the interval $[T_{min}, T_{max}]$.
This thesis considers the case where the requirements specification consists of
two components. The behavioral specification appears in a CFSM-based formalism.
The behavioral specification is supplemented by a declarative specification of best
and worst-case response times.
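As a hedged illustration of how a declarative best- and worst-case response-time specification might be checked, the sketch below tests whether an observed response falls inside the permitted interval. The class `ResponseWindow`, its field names, and the assumption that stimulus and response are timestamped on the same clock are all illustrative choices, not part of the thesis.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ResponseWindow:
    """Declared response-time bounds (hypothetical representation)."""

    t_min: float  # best-case response time
    t_max: float  # worst-case response time

    def response_interval(self, stimulus_time: float) -> Tuple[float, float]:
        # Earliest and latest instants at which the response may legitimately occur.
        return (stimulus_time + self.t_min, stimulus_time + self.t_max)

    def is_timely(self, stimulus_time: float, response_time: float) -> bool:
        # Assumes both timestamps come from one clock, which is a simplification.
        earliest, latest = self.response_interval(stimulus_time)
        return earliest <= response_time <= latest


window = ResponseWindow(t_min=0.5, t_max=2.0)
interval = window.response_interval(10.0)
on_time = window.is_timely(10.0, 11.0)
too_late = window.is_timely(10.0, 12.5)
too_early = window.is_timely(10.0, 10.25)
```

A response outside the interval in either direction is treated as a violation here; how such a violation should be reported is left open by this sketch.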
2.2 Specification Non-Determinism 
Before non-determinism is defined, a definition of determinism is presented first.
The definition originally appeared in the Encyclopedia of Philosophy [lS].
Determinism is the general philosophical thesis which states that for
everything that ever happens there are conditions such that, given them,
nothing else could happen. (...) an event might be said to be
determined in this sense if there is some other event or condition or group
of them, sometimes called its cause, that is a sufficient condition for its
occurrence, the sufficiency residing in the effects following the cause in
accordance with one or more laws of nature.
From this definition, non-determinism may be defined as the theory or doctrine 
that for each cause, there may be two or more legitimate effects. 
Non-determinism is an important part of many specification formalisms. It
allows the specification writer to omit portions of the specification that are not
relevant. This reduces the specification effort and gives the software designer more
design freedom to choose the behavioral alternative (or alternatives) that would
result in a less costly or otherwise desirable implementation.
Specifications having non-determinism allow systems to exhibit non-deterministic
behavior during field operation. Consider a telephone exchange and the scenario
where two parties, A and B, simultaneously attempt to call a third party, Z. The
exchange will typically exhibit non-deterministic behavior in that either A or B
can connect to Z. The two behavioral alternatives arising are shown in figure 2.1.
A software supervisor must be able to consider all behavioral alternatives arising
out of the non-determinism in the requirements specification. A supervisor that
is not able to consider all behavioral alternatives may generate erroneous failure
reports. Specification non-determinism is one of the major challenges of supervision
as the number of behavioral alternatives required to be considered by the supervisor
may be large, resulting in a large supervisor time and space complexity [47].
Figure 2.1: Non-Deterministic Behaviors (two message sequence charts among
Phone A, Exchange, Phone B and Phone Z, one per behavioral alternative; signal
labels include digit, ring-tone, slow busy and ring phone)
Specification non-determinism may refer to several categories of non-determinism.
Descriptions of many of these can be found in [44]. The principal ones dealt
with here are non-deterministically delayed communication paths and the
non-determinism associated with the precise time of a local clock. For our purposes,
the latter will refer to the difference between the values of the supervisor clock and
the target system clock.
2.2.1 An Execution Path Interpretation of Non-Determinism
Many specification formalisms support different types of non-determinism. A common
framework can be used to represent most types of non-determinism. The
framework shall be referred to as the execution path (EP) interpretation.
An EP is defined as a series of state transitions through a finite state machine.
Essentially, non-determinism permits two or more legitimate EPs for a possibly
empty set of stimuli directed to a CFSM-based specification.
As an example, consider the single-FSM specification in figure 2.2a. On the
arrival of stimulus a, either path S0 → S1 or S0 → S2 could be taken. The choice
of EPs is non-deterministic despite both paths producing an identical observable
output, X.
Figure 2.2: Example Finite State Machines 
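The EP interpretation can be illustrated with a small sketch. The encoding below is an assumption made for illustration only: the dictionary layout, the name FSM_2_2A and the function execution_paths are hypothetical and do not appear in the thesis. The machine is meant to resemble figure 2.2a, where stimulus a leads from S0 to either S1 or S2 with output X.

```python
from typing import Dict, List, Tuple

# (state, stimulus) -> list of (next_state, output). More than one entry
# means the specification is non-deterministic for that stimulus.
Transitions = Dict[Tuple[str, str], List[Tuple[str, str]]]

# Hypothetical encoding resembling the machine of figure 2.2a.
FSM_2_2A: Transitions = {
    ("S0", "a"): [("S1", "X"), ("S2", "X")],
}


def execution_paths(fsm: Transitions, start: str, stimuli: List[str]):
    """Return every legitimate EP as (visited states, generated outputs)."""
    paths = [([start], [])]
    for stimulus in stimuli:
        extended = []
        for states, outputs in paths:
            # A path with no matching transition is dropped (not legitimate).
            for next_state, output in fsm.get((states[-1], stimulus), []):
                extended.append((states + [next_state], outputs + [output]))
        paths = extended
    return paths


eps = execution_paths(FSM_2_2A, "S0", ["a"])
for states, outputs in eps:
    print(" -> ".join(states), "outputs:", outputs)
```

Both EPs produce the same observable output, so an observer of outputs alone cannot tell which one the target system followed.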
2.2.2 Categories of Non-Determinism 
Non-determinism refers to choices in the EP. Specification non-determinism may be
categorized into don't care non-determinism and don't know non-determinism. In
this section, an informal definition of the two types of non-determinism is presented.
A formal definition will appear later.
Don't care non-determinism refers to two or more alternate EPs that, if followed
for a finite number of state transitions, will leave the system in an identical global
state. For communicating extended finite state machine (CEFSM)-based specifications,
global state refers to the collective state of all FSMs including the contents of
communication channels and input ports.
A trivial example of don't care non-determinism is illustrated in figure 2.2b.
Assume that the FSM is initially in state S0 and stimulus a is consumed by the
FSM. Regardless of the EP chosen, the FSM will output signal X and terminate
in state S1.
Don't know non-determinism refers to two or more alternate EPs that, if traced, will
leave the system in a different global state. FSMs containing examples of don't
know non-determinism are illustrated in figures 2.2a and 2.2c. The two paths
in figure 2.2a leave the FSM in two different symbolic states, while the paths in
figure 2.2c output different signals.
From the perspective of a software supervisor, all behavioral alternatives must
be considered so that erroneous failure reports are not generated, as described in
section 2.2. If the don't care non-determinism could be separated from the don't
know non-determinism, the supervisor would only have to consider don't know
non-determinism. This would have the desirable effect of reducing the time and/or
space complexity of the supervisor.
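The informal distinction between the two categories can be sketched as a comparison of where alternate EPs end up. This is only an illustration under stated assumptions: the function classify is hypothetical, each EP is summarized by its final state and output sequence, and outputs are treated as part of the compared state because they would sit in communication channels. The three example lists are guesses at figures 2.2a to 2.2c, not a transcription of them.

```python
def classify(alternatives):
    """Classify alternate EPs given as (final_state, outputs) pairs."""
    if len(alternatives) < 2:
        return "deterministic"
    # Alternatives that end in the same state with the same outputs are
    # indistinguishable, so only one of them needs to be tracked.
    distinct = {(state, tuple(outputs)) for state, outputs in alternatives}
    return "don't care" if len(distinct) == 1 else "don't know"


# Hypothetical summaries of the machines in figure 2.2.
fig_2_2a = [("S1", ["X"]), ("S2", ["X"])]  # different final states
fig_2_2b = [("S1", ["X"]), ("S1", ["X"])]  # identical state and output
fig_2_2c = [("S1", ["X"]), ("S1", ["Y"])]  # different outputs

for name, alts in [("2.2a", fig_2_2a), ("2.2b", fig_2_2b), ("2.2c", fig_2_2c)]:
    print(name, classify(alts))
```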
2.3 Supervisor Signal Processing Latency 
Signals to and from the target system are directed to the supervisor. Signals may
be processed by the supervisor an arbitrary time after their occurrence. Two
general categories of supervisors are in-time and out-of-time. They differ
principally in the time at which signals are processed by the supervisor, in other
words, in the relation between the clocks of the supervisor and the target system. A
loosely bound definition of in-time and out-of-time supervision is presented below. This
definition will be refined later as more issues are presented.
Consider an event, E, generated by the environment to be processed by the
target system. Assume that E was generated at time T. A supervisor will process
E at some time, T + Δ. An in-time supervisor must be able to process event E
such that Δ = 0, while an out-of-time supervisor must be able to process event E
such that Δ > 0.
The time at which events in a supervisor are processed is dependent on the
response time of the target system. Consider two events, E1 and E2, representing
requests for service (e.g. two telephones going offhook). E1 and E2 are generated
at times T1 and T2 respectively. If |T1 - T2| < T_max, then the order in which the
events are serviced by a non-deterministically specified system may be arbitrary.
For example, if telephone A goes offhook before telephone B, it may be possible
(and legitimate) for B to receive dialtone before A.
In general, a violation of causality may result if events are processed as they are
received by a supervisor. In the previous example, suppose event E1 is processed before
E2 arrives (figures 2.3a and 2.3b). On the arrival of E2, the supervisor determines
that the order in which the events were processed does not correspond with the
order chosen by the target system (figure 2.3c). The supervisor must revert to a
previous state and reprocess the events in order E2 - E1 (figure 2.3d).
Figure 2.3: Causality Violation in Event Processing
Based on the issues in event processing latency, more precise definitions of in-time
and out-of-time supervision follow.
2.3.1 In-time Supervision 
An in-time supervisor is defined as one where events, generated at time T, are
processed on the interval [T, T + T_max].
As outlined in section 2.3, causality violations may occur in an in-time supervisor.
This category of supervisors must make provision for un-consuming consumed
signals to un-do causality violations. Two approaches have been studied thus far.
The signal-in-transit approach [25] pre-creates an explicit behavioral alternative
for each possible signal that may arrive. The rollback-and-recovery approach [56]
moves the global state of the supervisor back and re-orders the processing of events
as required.
The principal advantage of in-time supervision is that failures are reported
within T_max of their occurrence. The disadvantage is that the supervisor must
be able to keep up with the target system (i.e. the supervisor cannot lag the target
system by more than T_max units of time). In most cases, the supervisor is more
computationally intensive than the target system due to the need to consider all
behavioral alternatives. For systems with large amounts of non-determinism, the
computational complexity of the in-time approach has been found to be a severe
shortcoming [47].
2.3.2 Out-of-time Supervision 
An out-of-time supervisor is defined as one where events generated at time T are
processed on the interval [T + T_max, T + ∞].
In an out-of-time supervisor, before an event is processed, the supervisor waits
at least T_max units of time. The supervisor can thus guarantee that no further
events will be generated that may precede the current one. Thus the out-of-time
supervisor does not need a mechanism to un-do causality violations like its in-time
counterpart.
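The waiting rule can be sketched as a hold-back buffer. The sketch below is an illustration, not the mechanism used in the thesis: the class OutOfTimeBuffer is hypothetical, it assumes every event carries its generation time, and it assumes that releasing events in timestamp order is adequate once the waiting period has elapsed.

```python
import heapq


class OutOfTimeBuffer:
    """Hold events until at least t_max time units have passed since generation."""

    def __init__(self, t_max):
        self.t_max = t_max
        self._heap = []  # entries are (timestamp, arrival number, event)
        self._arrivals = 0

    def add(self, timestamp, event):
        # The arrival number breaks ties so events are never compared directly.
        heapq.heappush(self._heap, (timestamp, self._arrivals, event))
        self._arrivals += 1

    def release(self, now):
        """Return events with timestamp + t_max <= now, oldest first."""
        ready = []
        while self._heap and self._heap[0][0] + self.t_max <= now:
            timestamp, _, event = heapq.heappop(self._heap)
            ready.append((timestamp, event))
        return ready


buf = OutOfTimeBuffer(t_max=5.0)
buf.add(10.0, "E1")  # reaches the supervisor first
buf.add(9.0, "E2")   # reaches the supervisor later but was generated earlier
first = buf.release(now=14.5)
second = buf.release(now=15.0)
print(first, second)
```

Because E1 is held until T_max has elapsed, the late-arriving E2 is still released ahead of it and no rollback is needed.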
The principal advantage of out-of-time supervision is that peaks in processing 
requirements can be amortized over an arbitrary amount of time. Thus the out- 
of-time supervisor requires a CPU that can process the average computational 
requirements of the target system rather than the peak as required by the in-time 
one. The disadvantage of the approach is the latency of failure reporting. 
2.4 Tradeoffs Between Accuracy and Computational Cost
Specification non-determinism may result in large supervisor computational
complexities as mentioned in section 2.2. This is currently one of the major impediments
to the use of a supervisor. One possible approach to dealing with specification
non-determinism is to use partial supervisor models [47]. A partial model would reduce
the computational complexity of supervision at the expense of reduced failure
detection capability.
Partial models may be derived from the requirements specification. There are 
two categories of approaches to devising partial models: pessimistic and optimistic. 
A partial supervisor model may be derived using a combination of the two ap- 
proaches. 
Pessimistic models can cause the supervisor to report failures while the target 
system is operating correctly. The failures reported by a pessimistic model are a 
superset of the actual set of failures. Pessimistic models are derived by eliminating 
alternative EPs representing don't know non-determinism from the requirements 
specification [47]. 
Optimistic models can cause the supervisor to miss reporting some failures.
Failures reported by an optimistic model are a subset of the actual set of failures.
Optimistic models are derived by eliminating n EPs representing don't know
non-determinism from the specification and replacing them with m new EPs such that
m > n [51].
The effects of reduced model supervision have been studied in [45, 47]. It was
determined that the savings are proportional to the number of encountered behavioral
alternatives. As system loads get larger, more non-determinism was encountered
and more savings in computational complexity were realized. For one particular
experiment, reductions in computational complexity of several orders of magnitude
were observed with approximately three quarters of failures reported [47].
2.5 Attachment of a Supervisor to a Target 
System 
To minimize the interference with the target system software, a supervisor typically
executes on a separate hardware platform. There are several ways a supervisor can
be attached to observe the input and output signals of a target system. This work
is targeted towards systems with a large number of input and/or output connections
such as communication controllers, telephone exchanges etc. The physical
connection of the supervisor to each input/output wire of a large system is
practically infeasible. Two commonly used approaches will be described here, namely:
(1) tapping of a data link and (2) polling of controlled hardware interface memory.
Figure 2.4: Supervisor Connectivity Patterns
2.5.1 Tapping of a Data Link
Tapping of a data link refers to snooping traffic traveling across a communication 
channel. Data is monitored in read-only mode. A protocol translator converts 
physical-layer signals to events that can be processed by the supervisor. 
The difficulty with this approach is the multiple interpretations of a lack of
information by the protocol translator. The absence of information is typically
handled by timeouts in many protocols. The protocol translator must deal with the
absence of an event (for example) in the same way as the target system. The precise
time that the timeout occurs is non-deterministic due to the lack of knowledge in the
supervisor about the local clock of the target system (as discussed in section 2.2).
In such cases, two behavioral alternatives need to be considered by the supervisor:
(1) that the timeout expires before the event is received by the target
system and (2) that the timeout expires after the event is received.
2.5.2 Polling of Controlled Hardware Interface Memory
A software system's input and output signals can be identified by polling the
controlled hardware interface memory. An abstractor (figure 2.4) is used to convert
bit changes into signals recognizable by the supervisor. Current hardware
design-for-testability trends such as boundary-scan [42] may facilitate polling hardware
interface memories.
Several issues arise when polling the hardware interface memory. Three common
ones are described here. First, signals of short duration may be missed. Second,
the order in which signals are reported may be permuted by the abstractor and finally, the
scanning of some signals may be dependent on the correct target system operation.
A brief overview of each of the issues follows.
Short Duration Signals 
An abstractor samples the hardware interface memory at a fixed frequency, f_s.
Consider a signal E, generated by the target system with duration T_E such that
T_E is less than the sampling period (i.e. T_E < 1/f_s). If E is generated between
sampling points, it will be missed by the abstractor.
Consider the example in figure 2.5a. Signal E is generated between sampling
points 1 and 2. The abstractor will miss reporting the occurrence of signal E. The
supervisor will then erroneously report the missed event as a failure of the
target system.
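The effect of sampling on short signals can be reproduced with a few lines of code. This is a simplified model under stated assumptions: the function sampled_signals is hypothetical, time is expressed in integer milliseconds, sampling points fall at multiples of the sampling period (the reciprocal of the sampling frequency), and a signal is seen only if it is active exactly at a sampling point.

```python
def sampled_signals(signals, period, n_samples):
    """Return the names of signals that are active at some sampling point.

    signals maps a name to (start, duration); sampling points are at
    0, period, 2 * period, ... (n_samples points in total).
    """
    seen = set()
    for k in range(n_samples):
        t = k * period
        for name, (start, duration) in signals.items():
            if start <= t < start + duration:
                seen.add(name)
    return seen


# Sampling every 100 ms. E lasts 50 ms and lies entirely between the
# sampling points at 100 ms and 200 ms; F lasts 100 ms and spans 200 ms.
signals = {"E": (120, 50), "F": (150, 100)}
print(sampled_signals(signals, period=100, n_samples=5))
```

Only F is reported; E is shorter than the sampling period and happens to fall between two sampling points.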
Figure 2.5: Sampling of the Hardware Interface Memory
Reversal of Signal Order 
If two or more signals are generated between sampling intervals, the abstractor will
not be able to report the actual signal generation order. Rather, the order reported
will be based on some internal abstractor scanning order.
As an example consider the two events, A and B occurring between sampling 
points 1 and 2 as shown in figure 2.5b. Both signals, A and B will be detected 
by the abstractor at sampling point 2 and the actual order of occurrence is not 
resolvable by the abstractor. 
For some specifications, order of signal generation is critical. If the abstractor
reports signals out of order, the supervisor will report an erroneous failure of the
target system.
Dependence on Correct Target System Operation 
The supervisor relies on the correct operation of the target system for some signals
to be reported by the abstractor. Consider the case shown in figure 2.5c. A common
signaling translator is used by both the supervisor and target system. Furthermore,
assume that the signaling translator is turned off and on as needed by the target
system software. This could be representative of a power-critical application such
as a battery-operated device, or the case where the signaling translator is a shared
resource, allocated/deallocated as needed.
The difficulty arises in that the requirements specification only specifies the
externally observable behavior. Switching the signaling translator on and off is
typically not specified in the requirements specification since it is not an externally
observable event.
If a software fault exists that omits turning on the signaling translator, the
events will be suppressed by the translator and neither the target system nor the
supervisor will receive them. This type of failure is not detectable by a supervisor as
the supervisor relies on the correct operation of the target system for the signal to
be generated.
2.6 Continuation of Supervision After Detection 
of a Failure 
A requirements specification typically does not specify the behavior of a target
system after the occurrence of a failure. From the requirements specification per- 
spective, a failure causes the target system to traverse a state transition that does 
not correspond with any transition in the requirements specification. This may 
lead the target system into a state that does not correspond with any state in the 
requirements specification. 
If the supervisor remains attached to a system after a failure occurs with the 
supervisor state different from the target system state, the supervisor would expect 
one behavior and the target system would generate another. The result would be 
a shower of failure reports generated by the supervisor. 
Most systems exhibit some fault tolerance capability. For minor failures a system
may be able to recover its operation after a period of time, t_r (figure 2.6).
Session oriented systems typically fall into this category. For example, if a failure
is observed during a telephone call in North America, a natural reaction would be
to place the telephone onhook and to re-attempt the call, effectively re-setting the
state of the local phone.






Figure 2.6: Operation of a System After Occurrence of a Failure 
2.6.1 Resynchronization 
The post-failure state of the target system is not known by the supervisor, but is
needed to prevent generation of spurious failure reports. The post-failure state of
the target system may be determined once it resumes normal operation. Once the
state of the target system is known, supervision may resume.
A resynchronization mechanism is needed to determine the post-failure state
of a target system. The mechanism accepts as input both target system inputs
and outputs, just like the supervisor. It generates a state corresponding with the
current state of the target system based on the requirements specification. The
problem is complicated because distinguishing signal sequences must be determined
for all CEFSMs including internal ones that do not communicate directly with the
environment. The result is a very large search space [2, 16, 15].
In the context of supervision, resynchronization was studied in [30, 35]. The
central research issue in both cases was coping with the large number of possible
states that the target system could be in. Both used assumptions to limit the number
of possible states; for example, [30] made the assumption that the post-failure
state was closest to the pre-failure state while [35] proposed resynchronization based
on the pre-failure state and target system fault models.
2.7 Related Work 
Previous work on monitoring software systems for failures can be subdivided into 
two broad categories: intrusive and non-intrusive. Intrusive approaches require 
modifications to the target system software while non-intrusive approaches do not. 
Existing work on several intrusive and non-intrusive approaches to software
monitoring is described in the following sections.
2.7.1 Intrusive 
Software Audits 
Software data errors are detected and possibly corrected by means of audit
programs [1, 13, 41, 43] before they manifest themselves as failures. Audit programs
consist of additional software which has access to the main program's data
structures. An audit executes at a lower priority than the main program and periodically
checks data structures for errors.
Audits principally detect three types of errors [41]: (1) direct comparison errors,
comparison of data structures with a duplicate; (2) comparison by association
errors, detection of failures with the aid of data structure redundancy such as a
doubly-linked list and (3) format comparison errors, common sense checking of data
such as bounds checking.
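A comparison by association audit over a doubly-linked list might look like the sketch below. It is offered only as an illustration of the idea: the Node class and the function audit_links are hypothetical, and the audit merely checks that each forward link is mirrored by the matching backward link.

```python
class Node:
    """Doubly-linked list node; prev and next carry redundant information."""

    def __init__(self, value):
        self.value = value
        self.prev = None
        self.next = None


def audit_links(head):
    """Return the nodes whose successor does not point back to them."""
    inconsistent = []
    visited = set()
    node = head
    # The visited set guards against a corrupted list that contains a cycle.
    while node is not None and id(node) not in visited:
        visited.add(id(node))
        if node.next is not None and node.next.prev is not node:
            inconsistent.append(node)
        node = node.next
    return inconsistent


a, b, c = Node("a"), Node("b"), Node("c")
a.next, b.prev = b, a
b.next, c.prev = c, b
print([n.value for n in audit_links(a)])  # consistent list

c.prev = a  # simulate a data error
print([n.value for n in audit_links(a)])
```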
The main advantage of audits is that they are able to detect software errors 
before the errors manifest themselves as failures. However, audits detect only a 
limited set of errors. In addition, audits themselves may contain faults, potentially 
reducing the overall reliability of the software.
Watchdog Timers
The watchdog timer is an approach for detecting severe system failures [38]. The
approach requires that the target system software be instrumented with code to
generate sanity pulses within an interval of time, T. Generally, generation of sanity
pulses surrounds code such as procedure calls, resource requests or loops with
known worst case execution times. In the event that some portion of code does
not terminate before its maximum execution time, a sanity pulse is not generated
within the required time.
An external unit, or watchdog timer, monitors the sanity pulses. The timer may
be implemented in hardware and/or software [29]. Hardware implementations are
able to report a broader range of failures than purely software approaches. If a
pulse is not received within T units of time, the unit reports that a failure of the
software has occurred.
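A software watchdog of this kind can be sketched in a few lines. The class below is a hypothetical illustration: times are passed in explicitly rather than read from a real clock, so the example is deterministic, and the names WatchdogTimer, pulse and failed are not taken from the cited work.

```python
class WatchdogTimer:
    """Report a failure when no sanity pulse arrives within the interval."""

    def __init__(self, interval):
        self.interval = interval
        self.last_pulse = 0.0

    def pulse(self, now):
        # Called by the instrumented target software.
        self.last_pulse = now

    def failed(self, now):
        # Called periodically by the monitoring unit.
        return now - self.last_pulse > self.interval


watchdog = WatchdogTimer(interval=1.0)
watchdog.pulse(now=0.5)
print(watchdog.failed(now=1.2))  # pulse seen 0.7 time units ago
print(watchdog.failed(now=1.6))  # pulse seen 1.1 time units ago
```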
The advantage of watchdog timers is that they are simple and easily imple- 
mented. The disadvantage is the limited set of failures that can be detected. 
Run-Time Result-Checking 
Run-time result-checking refers to a collection of approaches to check the correctness
of results produced by program modules [8, 9, 12, 50]. Correctness checks are
performed on the outputs of modules/programs. As an example, if a procedure is
to compute a function, y = f(x), a checker could make use of the inverse function
to re-compute the actual inputs, x = f^-1(y).
There are several difficulties with this approach. Development of a checking routine
may be more complex than the actual routine itself. Result-checking software
that executes on the same processor as the target system may degrade the overall
system performance. In addition, the checking software may itself contain faults,
reducing the overall reliability of the system. Sankar and Mandel [50] have developed
a distributed monitoring approach where the monitor resides on a separate
processor that alleviates these problems to some degree.
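The inverse-function check mentioned above can be sketched as a wrapper around the routine being checked. Everything named here is hypothetical: checked_call, encode, decode and faulty_encode are toy functions chosen so that the inverse is exact on integers.

```python
def checked_call(f, f_inverse, x):
    """Compute y = f(x) and verify it by recomputing x from y."""
    y = f(x)
    if f_inverse(y) != x:
        raise RuntimeError(f"result check failed for input {x!r}")
    return y


def encode(x):
    return 2 * x + 1


def decode(y):
    return (y - 1) // 2


def faulty_encode(x):
    # Deliberately seeded fault: wrong result for inputs above 100.
    return 2 * x + 3 if x > 100 else 2 * x + 1


print(checked_call(encode, decode, 4))
try:
    checked_call(faulty_encode, decode, 101)
except RuntimeError as err:
    print("detected:", err)
```

The check only catches faults that the inverse exposes; a fault shared by f and its inverse would go unnoticed.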
2.7.2 Non-Intrusive 
N-Version Programming 
N-version programming (NVP) refers to an approach for failure detection/fault
tolerance [3]. From a single requirements specification, N separate designs and
implementations are produced by N isolated teams of developers.
The N versions of software are all executed concurrently. The outputs of all
N copies are fed into a voting algorithm that compares outputs. If all outputs are
not identical, a failure may be reported. Fault-tolerance is achieved by having the
voting algorithm choose a non-failed output and use it as the actual output of the
system. A majority-wins algorithm is one such common voting scheme.
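A majority-wins voter can be sketched as follows. The function vote is a hypothetical illustration that assumes outputs are hashable and directly comparable; it reports a failure whenever the versions disagree and selects an output only if a strict majority agrees on it.

```python
from collections import Counter


def vote(outputs):
    """Return (chosen_output, failure_detected) for the outputs of N versions."""
    counts = Counter(outputs)
    winner, votes = counts.most_common(1)[0]
    failure_detected = len(counts) > 1
    if votes <= len(outputs) // 2:
        winner = None  # no strict majority, so no output can be trusted
    return winner, failure_detected


print(vote([4, 4, 4]))
print(vote([4, 4, 5]))
print(vote([1, 2, 3]))
```

Exact comparison is the weak point of such a voter when the specification is non-deterministic, since versions may legitimately produce different outputs.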
The principal difficulty with NVP is its cost. N versions of the software are
required. Studies have shown that the N versions of software may contain identical
faults despite being developed by isolated teams [21, 32]. Additionally,
non-determinism poses difficulty as each of the N versions may have different outputs
that are all legitimate. Recent research has focused on ways to reduce
implementation non-determinism [44]. However, this may have the undesirable effect of
increasing development costs.
External Assertion Checking 
External assertion checking refers to an approach that checks certain properties of
outputs generated by a specific target system. Two such systems, Elektra [31, 53]
and HMON [17], are described here.
Elektra is an electronic railway control system. It consists of two primary
components, the logic processor and the safety bag. The logic processor is the target
system. The safety bag checks and possibly rejects outputs produced by the logic
processor. The safety bag consists of a real-time rule-based expert system that
encodes various safety rules stated by the railway authority.
HMON is a distributed real-time monitoring and debugging environment. It
is able to monitor several event types including system calls, context switches,
interrupts and shared variables. HMON attaches itself to the target system software
through shared libraries and a modified kernel. It allows the user to specify
attributes about each of the events. Discrepancies between the specified event
attributes and actually observed events are reported as failures.
Both approaches monitor properties of the target system. As a result, they are
only able to report a limited set of failures.
The Observer 
The observer [4, 5, 14] is an approach for formal on-line validation of distributed
systems. It is very similar to a software supervisor. The observer monitors the
inputs and outputs of the target system and makes use of a formal model of the
target system, derived from the requirements specification. Discrepancies between
observed behaviors and behaviors represented by its internal model are reported as
failures.
The observer was applied to the monitoring of distributed systems. The major
difference between the observer and supervision is that the work reported on the
observer does not address the issue of specification non-determinism.
Software Oracles 
An oracle is an external source of information about a program. Common examples
of oracles include proof axioms, another program or a formal specification [10, 40,
49]. Approaches to the automated development of oracles from specifications have
been described.
A principal use of oracles has been in software testing. Oracles categorize test
cases as either legitimate or illegitimate. As a result, they are typically only able
to categorize the behaviors represented by the test cases due to their limited model of
the target system.
2.8 Research Focus 
Category of Systems: This thesis addresses supervision of discrete, real-time,
reactive systems that service humans. The case where the system specifications
appear in a communicating extended finite state machine based formalism is
considered. Such systems typically have a simple interface and as a
consequence a simple specification.
Categories of Failures: The detection and reporting of behavioral and perfor- 
mance failures is addressed. 
Behavioral failures are defined as spurious, incorrect or missing events that are
generated and/or not generated by the target system. Performance failures
are defined as violations of the temporal requirements of a specification [11].
The category of performance failures considered is violations of the worst case
response time, T_max.
Definition of Correct Behavior: This thesis addresses supervision of CEFSM-
based requirements specifications. For the sake of concreteness, discussion
is aimed at the Specification and Description Language (SDL) [58]. SDL
is standardized by the International Telecommunications Union (ITU) and
used internationally within the telecommunications industry. The reader is
referred to [7] for an introduction to the language.
Treatment of supervision with SDL specifications is focused on a subset of
SDL-88. The subset is sufficient for many applications such as
telecommunications call processing software. An outline of addressed constructs follows.
Structural Constructs: system, block, process
Communication Constructs: signal, signal route, channel
SDL Process Constructs: decision, signal input, signal output, save, task,
start, state, stop, any, none
Specification Non-Determinism: This work addresses non-determinism associated
with multiple event consumption orders. Three types of SDL
non-determinism that fall into this category are: non-deterministic channel delay,
spontaneous transitions and non-deterministic decisions. The latter two types
of non-determinism may be modeled with non-deterministic channel delay.
Supervisor Signal Processing Latency: This thesis focuses on out-of-time
supervision. Events generated by the target system's environment or by the
target system itself may be processed by the supervisor an arbitrary time
after their generation.
Tradeoffs Between Accuracy and Computational Cost: The case where a 
complete set of failures is required is considered. Thus supervision with a 
full model of the target system is treated in this work.
Observability of Target System Inputs: This work assumes complete observ- 
ability of all target system input and output events. 
Continuation of Supervision After Detection of a Failure: Addressed is su- 
pervision of correct behavior from the point where the target system is ini- 
tialized to the point where a failure is detected. 
Chapter 3 
Hierarchical Software Supervision 
This chapter gives an overview of hierarchical software supervision, an approach to
supervision aimed at dealing with specification non-determinism.
The chapter begins with some definitions that will be used throughout the
remainder of the thesis. The internal organization of a hierarchical supervisor is
described next, followed by a discussion of each functional unit within the supervisor.
The chapter concludes with a description of the operation of the supervisor.
3.1 Definitions 
Definition 3.1.1 (Process State) For an SDL process, P_i, the process state is
defined as a 3-tuple, ψ_i = <σ, V, Q> where:
• σ represents the current symbolic state of P_i;
• V is the set of all variables and associated assignments;
• Q is the sequence representing the contents of P_i's input queue.
Definition 3.1.2 (Global State) For an SDL specification consisting of processes,
P_1, P_2, ..., P_n, the global state of the specification, C, is defined as a tuple of all
n process states, C = <ψ_1, ψ_2, ..., ψ_n>.
Note that the definition of global state assumes that all communication channels
in the specification are empty. Thus it may be considered a quiescent global state.
This definition simplifies the discussion as the additional state space introduced by
channels is omitted.
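The two definitions map naturally onto immutable records. The sketch below is one possible encoding and not the data structure used in the thesis: the names ProcessState and GlobalState and the field names are assumptions, with the symbolic state, variable assignments and input queue stored as hashable values so that global states can be compared and collected in sets.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ProcessState:
    """State of one SDL process: symbolic state, variables, input queue."""

    symbolic_state: str
    variables: Tuple[Tuple[str, object], ...] = ()  # (name, value) pairs
    queue: Tuple[str, ...] = ()                     # pending input signals


# A quiescent global state is a tuple of all process states
# (communication channels are assumed to be empty).
GlobalState = Tuple[ProcessState, ...]

p1 = ProcessState("idle", (("count", 0),), ("a", "b"))
p2 = ProcessState("waiting")
global_state: GlobalState = (p1, p2)

same = (ProcessState("idle", (("count", 0),), ("a", "b")), ProcessState("waiting"))
print(global_state == same)
```

Value equality and hashing come for free from the frozen dataclass, which is convenient when alternate execution paths must be tested for ending in an identical global state.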
3.2 Internal Organization of a Non-Hierarchical
Supervisor
The following description gives a conceptual overview of the components and
operation of a software supervisor. Conceptually, a software supervisor consists of five
fundamental components: the supervisor model, interpreter, expected behavior
buffer, observed behavior buffer and a matcher. One possible variant of a software
supervisor, an input-driven supervisor in which inputs are used to generate expected
behaviors, is shown in figure 3.1.
The supervisor model captures the legitimate behaviors of the target system.
As discussed in section 2.8, the case where the supervisor model is specified in
SDL is considered. The interpreter interprets the supervisor model. Behaviors
expected to be generated by the target system (expected behaviors) are buffered
in the expected behavior buffer. Correspondingly, observed behaviors are buffered
in the observed behavior buffer. This alleviates the need for both behaviors to be
generated at precisely the same time. A matcher compares the contents of the two
buffers and reports a failure if a match cannot be made.
Figure 3.1: Anatomy of a Software Supervisor
3.2.1 Approaches to Dealing with Specification
Non-determinism
Specification non-determinism permits more than one legitimate expected behavior
for a given observed behavior. If the behavioral alternatives are visualized as alternate
EPs through the supervisor model, as outlined in section 2.2.1, the supervisor
must be able to consider all alternate EPs. Two approaches have been developed.
The belief method [26] explores all legitimate EPs in a breadth-first manner. A
separate thread of execution, or belief, is created for each encountered EP. A belief
represents one global state of the supervisor model and the contents of the
expected/observed behavior buffers. Beliefs are terminated as their externally observable
behavior is invalidated by the actually observed target system behavior.
The belief method is a conceptually elegant approach for dealing with behavioral
alternatives. However, its most serious shortcoming is its worst case time/space
complexity. Consider the case where N signals are queued for consumption whose
order cannot be determined. In this scenario, the worst case computational complexity
of the supervisor is given by (3.1) [26].
The optimistic path prediction and rollback (OPPR) approach [56] was developed
to overcome the large time and space requirements of the belief method. OPPR
explores legitimate EPs in a depth-first fashion, according to a heuristic derived
from the target system's operational profile.
Results indicate that the average-case complexity of the OPPR approach is
significantly better than that of the belief-based approach [55, 56]. However, upon
occurrence of a failure, OPPR must explore all behavioral alternatives, resulting in
a worst-case complexity similar to that of the belief-based method.
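The contrast between the two approaches can be illustrated with a small sketch. It is a simplification under stated assumptions, not the algorithms of [26] or [56]: each EP is reduced to a list of expected outputs, and the names `PATHS`, `ORDER`, `agrees`, `belief_method` and `oppr` are hypothetical.

```python
def agrees(expected, observed):
    """True while the two sequences agree on their common prefix."""
    k = min(len(expected), len(observed))
    return expected[:k] == observed[:k]


def belief_method(paths, observed):
    """Breadth-first: keep every EP (belief) not yet invalidated."""
    return [name for name, expected in paths.items() if agrees(expected, observed)]


def oppr(paths, order, observed):
    """Depth-first: try EPs in heuristic order, rolling back on a mismatch.

    Returns the first agreeing EP (or None) and the number of EPs tried.
    """
    for tried, name in enumerate(order, start=1):
        if agrees(paths[name], observed):
            return name, tried
    return None, len(order)


# Hypothetical EPs with their expected outputs.
PATHS = {"ep1": ["X", "Z"], "ep2": ["Y", "Z"], "ep3": ["Y", "W"]}
ORDER = ["ep2", "ep1", "ep3"]  # assumed ranking from an operational profile
```

With a legitimate observation, `oppr` may stop after its first guess while `belief_method` carries every surviving EP. With an illegitimate observation, `oppr` has to try every EP before a failure can be declared, which mirrors the worst case noted above.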
3.3 Tracking Target System Operation 
The belief method considers all EPs concurrently while OPPR considers a
heuristically ordered sequence of EPs. In many cases, however, the actual EP chosen by
the target system may be inferred dynamically from the observable signals to and
from the target system.
As an example, consider the SDL specification in figure 3.2. Assume that signals
a and b are generated by the environment within a short duration, ε, of each other¹.
Due to the non-deterministic SDL channel delay, process A could consume the
signals in order: a - b or b - a. Specification non-determinism thus permits either
path 1 or path 2 to be legitimately traversed.
¹ The actual bounds for ε will be discussed later.
Figure 3.2: Example SDL Specification
By having the supervisor watch for key signals (either target system inputs or
target system outputs), the path chosen by the target system could be inferred.
For the specification in figure 3.2, a supervisor could infer that path 1/path 2 was
followed if signal X/Y was generated by the target system. The reader should
note that signals X and Z would have been equally effective in detecting the two
state transitions.
3.3.1 The Tracking Model
In general, both target system input and output signals may be used to track
target system operation through the supervisor model. The observed signals are
used to detect the occurrence of state transitions corresponding with target system
behavior.
For each state transition in the requirements specification, a different signal
may be used to detect that the transition has taken place. A tracking model is one
representation of such signals.
The tracking model contains all symbolic states and state transitions of the
requirements specification. The principal difference between the two models is
their stimuli. Stimuli for the tracking model are chosen to detect state transitions
corresponding with target system behavior. For each state transition in the tracking
model, a stimulus is chosen from the set of signals consumed/generated during the
corresponding state transition in the requirements specification.
A primary criterion to select stimuli for the tracking model is signal uniqueness.
Uniqueness is a relative concept. In general, signal S1 is considered more unique
than S2 if S1 can be consumed/generated in fewer states than S2. The precision of
state detection is improved by choosing more unique stimuli. This reduces
uncertainty within the supervisor as to the actual state transition that occurred and, as
a consequence, improves the supervisor time and/or space complexity.
For the example requirements specification in figure 3.2, a corresponding track- 
ing model is shown in figure 3.3. The model is developed based on the choice that 
signals X and Y are used to detect paths 1 and 2 through the requirements specifi- 
cation. Note that the SDL system specification (figure 3.3a) has the output channel 
reversed to introduce the supervisor perspective. Signals, a, b and Z are not used 
to track the target system and are consumed without effect. Additional (non-SDL) 




Figure 3.3: Example Tracking Model
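The tracking idea for this example can be sketched in a few lines. The sketch assumes a single symbolic state S0 in which X and Y are the only stimuli, as the text describes. The table `TRACKING_MODEL` and the function `track` are hypothetical names, and the real tracking model in figure 3.3 is an SDL process rather than a lookup table.

```python
# Stimuli of the hypothetical tracking model in state S0: key signal -> path.
TRACKING_MODEL = {"S0": {"X": "path 1", "Y": "path 2"}}


def track(state, signals):
    """Return the path implied by the first key signal, or None if none is seen.

    Signals that are not stimuli in the current state (here a, b and Z)
    are consumed without effect.
    """
    stimuli = TRACKING_MODEL[state]
    for signal in signals:
        if signal in stimuli:
            return stimuli[signal]
    return None
```

For instance, `track("S0", ["b", "a", "Y", "Z"])` yields path 2 regardless of the order in which a and b were observed.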
3.4 Hierarchical Software Supervisor 
Supervision may be decomposed into two smaller sub-problems: (1) tracking the
evolution of the target system state through the requirements specification and (2)
detailed behavior checking. Lessons learned from disciplines such as AI planning
indicate that a problem can be solved more efficiently if decomposed and each part
solved with a domain-specific problem solver [6, 34]. The resultant architecture is
hierarchical and consists of two functional units: the path detection module (PDM)
and the BSup.







Figure 3.4: Hierarchical Software Supervisor
The PDM tracks the operation of the target system. It accepts both input
and output signals of the target system and generates EP information. The PDM
consists of a PDM-model, similar to the tracking model described in section 3.3.1,
and an interpreter. The PDM-model is derived from the requirements specification.
The BSup is a detailed behavior checker. It accepts target system inputs, outputs
and EP information from the PDM. The BSup consists of the five components
described in section 3.2. The BSup-model very closely resembles the requirements
specification. The interpreter interprets the BSup-model, steering execution
according to EP information generated by the PDM.
3.4.1 Operation of the Hierarchical Supervisor 
The hierarchical supervisor operates with one of its two functional units active
at any point in time. Figure 3.5 shows the operating states of the hierarchical
supervisor.
Figure 3.5: Operating States of a Hierarchical Supervisor
Execution begins at the PDM. The PDM executes until it determines the next
segment of the EP followed by the target system. The PDM communicates this
information to the BSup and passes control to the BSup. The BSup attempts to
follow the EP through the requirements specification and generates the expected
output(s) corresponding to the EP traversed. The matcher compares the expected
output(s) with the actually observed output(s).
Failure Reporting 
Failures may be reported by either the PDM or BSup. The PDM reports a failure
if the signals generated by the target system could not have been generated along
any path emanating from the current symbolic state. The BSup reports a failure in
any one of three cases: (1) if the BSup cannot be steered along the path prescribed
by the PDM, (2) if the expected and observed behaviors do not match and (3) if a
timeout occurs while the BSup waits for path information to be generated by the
PDM.
The failures described are sub-divided into four commonly occurring types,
categorized by two attributes: the failure category and the hindrance of the PDM's
tracking ability. The two failure categories are: (1) spuriously generated signals
and (2) missing or not-generated signals. The presence of a failure may or may not
cause the PDM to report an incorrect EP. Both cases are described. The failure
types are summarized in figure 3.6.
                                     PDM Tracking Hindered
Failure Category                     No           Yes
Spuriously generated signal          Type I       Type II
Missing signal                       Type III     Type IV
Figure 3.6: Failure Types
As an example, consider a hierarchical supervisor that uses the PDM-model
shown in figure 3.3 and the BSup-model in figure 3.2. Both the PDM and BSup are
initially in state S0. Examples of the four different failure types are shown in
figure 3.7.
Figure 3.7: Illegitimate Behaviors: (a) Type I, (b) Type II, (c) Type III, (d) Type IV
The first failure type (figure 3.7a) is an example of an illegitimate output
produced by the target system. The behavior does not correspond to any path
emanating from the current symbolic state. This type of failure is reported by the
PDM.
The second failure type (figure 3.7b) represents an incorrect output that
corresponds to an existing but incorrect EP (see figure 3.3). The PDM reports
that path 2 was traversed by the target system. The BSup attempts to steer
execution along path 2 but cannot due to the absence of signal b. The BSup
reports the failure.
The third failure type (figure 3.7c) represents a missing signal that does not
interfere with the PDM's ability to determine EP information. The PDM reports
that path 2 was traversed. The BSup generates an expected behavior consisting
of signals Y and Z. The matcher discovers that the expected behavior does not
match the observed behavior. A failure is reported by the matcher.
The final failure type (figure 3.7d) represents a missing signal that interferes with
the PDM's ability to detect the EP. In this example, signal Y was not generated
by the target system. The PDM cannot determine EP information since it waits
for Y; however, the BSup has received signals b and Z. The BSup waits TL, from
the receipt of b for EP information from the PDM to account for signals b and Y.
If EP information from the PDM has not arrived after this time, the BSup reports
a failure.
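The four failure types can be reproduced with a deliberately simplified, one-shot check. Everything in the sketch is an assumption made for illustration: the per-path inputs and outputs in `BSUP_MODEL` are guesses loosely based on the running example (figure 3.2 is not fully recoverable here), the timeout is modelled simply as "no key signal was observed", and `supervise` is a hypothetical name rather than part of the thesis.

```python
# Assumed contents of the two models; the inputs and outputs per path are
# illustrative guesses, not taken from figure 3.2.
PDM_MODEL = {"X": "path 1", "Y": "path 2"}  # key signal -> EP
BSUP_MODEL = {
    "path 1": {"input": "a", "outputs": ["X"]},
    "path 2": {"input": "b", "outputs": ["Y", "Z"]},
}


def supervise(inputs, outputs):
    """Return a failure report, or None if the behavior is legitimate."""
    # PDM: every output must lie on some path from the current state.
    legal = {s for spec in BSUP_MODEL.values() for s in spec["outputs"]}
    if any(s not in legal for s in outputs):
        return "PDM: output lies on no path from the current state"
    # PDM: infer the EP from the first key signal observed.
    path = next((PDM_MODEL[s] for s in outputs if s in PDM_MODEL), None)
    if path is None:
        return "BSup: timeout waiting for path information"
    # BSup: steer along the reported EP, then let the matcher compare.
    spec = BSUP_MODEL[path]
    if spec["input"] not in inputs:
        return "BSup: cannot steer along " + path
    if outputs != spec["outputs"]:
        return "BSup: expected and observed behaviors do not match"
    return None
```

Under these assumptions, an unknown output triggers the PDM report (Type I), output Y and Z without input b cannot be steered (Type II), a missing Z is caught by the matcher (Type III), and a missing Y ends in the timeout (Type IV).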
3.4.2 Supervisor Signal Processing Latency 
The PDM tracks target system behavior by waiting for key signals so that the next
segment of the EP traversed by the target system can be determined. The PDM
typically uses a combination of target system input and output signals. As outlined
in section 2.1.1, target system outputs may have a latency of up to TA,, units of
time before they are generated by the target system.
The PDM cannot guarantee accurate path detection unless it lags the target
system in the processing of events by at least T'A,, units of time. Thus out-of-time
is a natural mode of operation for the hierarchical supervisor.
In some cases, the PDM may not be able to resolve the EP chosen by the
target system. This is principally due to a lack of unique signals: the available
signals may be generated/consumed in more than one requirements specification
state transition. In such a case, the PDM must resort to an approach where several
candidate EPs are considered concurrently. Two such approaches (the belief method
and OPPR) were described in section 3.2.1. The belief method will be used in this
thesis due to its maturity over the OPPR approach.
3.4.3 Computational Cost 
The hierarchical supervisor makes use of two models and two interpreters. As a
first approximation, its time and space cost is twice that of a monolithic one. As
will be discussed in the latter parts of this thesis, the computational cost of a
hierarchical supervisor is proportional to the number of beliefs generated. Thus a
point of indifference between the choice of a hierarchical supervisor and a monolithic
one occurs when the hierarchical supervisor eliminates from consideration half of
the beliefs generated by a monolithic one. Once more than half of the beliefs can be
eliminated from consideration, a hierarchical supervisor becomes more cost effective
than a monolithic one.
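The break-even argument can be made concrete with a small calculation. The sketch takes the first-order approximation at face value: cost is proportional to the number of beliefs, and the hierarchical supervisor pays an assumed factor of two. The function name `cheaper_supervisor` and the `overhead` parameter are hypothetical.

```python
def cheaper_supervisor(monolithic_beliefs, hierarchical_beliefs, overhead=2.0):
    """Compare first-order costs, taking cost as proportional to belief count.

    `overhead` is the assumed factor of two for running two models and two
    interpreters.
    """
    hierarchical_cost = overhead * hierarchical_beliefs
    if hierarchical_cost < monolithic_beliefs:
        return "hierarchical"
    if hierarchical_cost > monolithic_beliefs:
        return "monolithic"
    return "indifferent"
```

If a monolithic supervisor generates 100 beliefs, a hierarchical supervisor that retains 50 of them sits at the point of indifference, and one that retains 40 is the cheaper choice under this approximation.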
The time and space cost of a hierarchical supervisor depends on: (1) the amount
of non-determinism in the requirements specification, (2) the implementation of
non-determinism in the target system and (3) the operational profile. An analytical
model of the computational cost of a hierarchical supervisor is left as future
work. However, the time and space complexities of a monolithic and a hierarchical
supervisor are evaluated experimentally for one target system in chapter 7.
Chapter 4 
The PDM Model
This chapter describes the derivation of a tracking model from the requirements
specification. As mentioned previously, this model is referred to as a PDM-model.
Recall from section 3.3.1 that the PDM-model is used by the path detection module
(PDM) to track target system operation through the requirements specification.
The PDM-model derivation procedure is illustrated with the aid of a non-trivial
system: a fragment of a small telephone exchange. The example was chosen
to exemplify the main parts of the transformation process, which are difficult to
illustrate with a trivial example.
The chapter begins with a description of the telephone exchange and its requirements
specification. The prominent issues arising in the derivation of a PDM-model
are described next, followed by the actual derivation procedure.
4.1 Example Software System 
The example software system is the call processing software of a small telephone
exchange. Its complete specification appears in appendix A. For discussion purposes,
the SDL process interaction diagram of the exchange is duplicated in this
chapter. It appears in figure 4.1.
Figure 4.1: Telephone Exchange SDL System Specification
(system Private-Branch-Exchange; blocks Phone-Hdlr and Net-Path-Mgr;
signallist L1 = Dial-Tone, No-DT, Fast-Busy, No-FB, Slow-Busy, No-SB, Ring-Back,
No-RB, Conn-CE, Disc-CE, Ring, No-Ring, Conn-CR, Disc-CR;
signallist E = ONHK, OFFHK, Digit(x))
The behavior seen by each telephone is defined by a Phone-Handler process.
Phone-Handlers communicate to connect and terminate telephone calls. A separate,
bidirectional communication path exists between each pair of Phone-Handler
processes, represented by implicit SDL signal routes in figure 4.1.
All Phone-Handlers are identical. To simplify the discussion, only two fragments
of the Phone-Handler are shown (figure 4.2). They deal with an originating party
dialing the final digit of the telephone number and requesting connection with
the terminating party. For brevity, identification of the destination process for
signals req-connect, remote-avail and remote-busy is omitted, as are portions of the
specification dealing with exceptions such as timeouts and uncompleted dialing.
The numbers in square brackets ([·]) appearing in figure 4.2a will be described later.
Figure 4.2: Fragments of the Phone Handler Specification: (a) Originating Fragment, (b) Terminating Fragment
4.1.1 Illustration of Nondeterministic Behavior 
Consider the A and B call Z scenario. A chart illustrating the signals exchanged
between Phone-Handlers A, B and Z and the environment is shown in figure 4.3.
If both A and B dial the final digit of Z within a brief interval of each other,
the indeterminate delay on the inter-process communication paths between the
environment and processes A and B permits either A or B to complete the call to
Z (the other will receive slow busy tone). Figure 4.3a shows the case where the
delay to process B is larger and figure 4.3b where it is smaller than the delay from
the environment to A. For this particular scenario, the specification permits two legal
behavioral alternatives. Both alternatives must be considered by a supervisor.
Figure 4.3: Behavioral Alternatives for the A and B Call Z Scenario
At the input port of a process, specification nondeterminism permits the two
CR-Con signals (figure 4.4) to be consumed in either order. Provision must be made
by the supervisor to consider all possible signal orderings if consumption order
uncertainty exists. The consequence of considering only a subset of all possible
signal permutations is that the supervisor may generate spurious failure reports.
For a process with n signals in its input port, the upper bound on the number
of signal permutations is n!. This may lead to a potentially large computational
complexity if all possible signal permutations must be explored.
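The growth of the n! bound is easy to see by enumerating orderings directly. The helper `consumption_orders` is a hypothetical name used only for this illustration, and the signal labels are written as plain strings.

```python
from itertools import permutations
from math import factorial


def consumption_orders(signals):
    """All orders in which the queued signals could be consumed."""
    return list(permutations(signals))


# Two queued signals give 2! = 2 candidate orderings.
orders = consumption_orders(["CR-Con(A)", "CR-Con(B)"])
```

Ten queued signals of undetermined order would already give `factorial(10)`, that is 3628800, orderings to consider.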
Figure 4.4: Permutable Signals at the Input of an SDL Process
4.2 Issues in the Derivation of the PDM-Model
As mentioned in chapter 3, stimuli for the PDM-model are chosen based on their
uniqueness. A metric of uniqueness is described first. A discussion of maintaining
sequences of internal state transitions, or causality pathways, in the PDM-model
follows. The section concludes with a description of data flows in the PDM-model.
Data flows appearing in the requirements specification must be maintained
in the PDM-model.
4.2.1 Identification of State Transitions 
As discussed in section 3.3.1, the occurrence of a state transition is detected with
either target system input or output signals. The motivation for using signals
other than target system inputs to detect state transitions is to reduce the number
of required signal permutations and, as a result, the computational complexity of
the supervisor.
Signals in the PDM-model, used to detect state transitions, are chosen based
on their uniqueness. In the requirements specification, signals that are
consumed/generated during fewer state transitions are considered more unique than
signals consumed/generated during more transitions. The precision with which
state transitions can be detected improves as the uniqueness of the signals used
increases.
The notion of a uniqueness metric, or u-metric, is used to quantify the idea of
signal uniqueness. The u-metric is defined for all signal-transition pairs in the
requirements specification.
Definition 4.2.1 (Uniqueness Metric (u-metric)) Let P be an SDL process
and s an SDL signal that initiates a state transition or is generated during a state
transition in P. The u-metric(s, P) is defined as:
• if s is an input signal, u-metric(s, P) is defined as the number of state
  transitions initiated by s in P
• if s is an output signal, u-metric(s, P) is defined as the number of state
  transitions in P where s is generated
The ability to map a signal to fewer state transitions reduces the number of
behavioral alternatives the supervisor must consider. The u-metric is used as a basis
to select stimuli for the PDM-model by the derivation procedure to be discussed
in section 4.3. Signals with lower u-metric values are preferred over signals with
higher u-metric values.
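Definition 4.2.1 translates almost directly into code. The sketch assumes a process is given as a flat list of transitions, each with one initiating signal and a tuple of generated signals. The `Transition` type, the example `PROCESS` and the helper `most_unique` are hypothetical and do not correspond to the telephone exchange specification.

```python
from typing import NamedTuple


class Transition(NamedTuple):
    source: str
    target: str
    stimulus: str   # signal that initiates the transition
    outputs: tuple  # signals generated during the transition


def u_metric(signal, process, is_input):
    """Count the transitions of `process` initiated by / generating `signal`."""
    if is_input:
        return sum(1 for t in process if t.stimulus == signal)
    return sum(1 for t in process if signal in t.outputs)


def most_unique(candidates, process):
    """Pick the (signal, is_input) candidate with the lowest u-metric."""
    return min(candidates, key=lambda c: u_metric(c[0], process, c[1]))


# Hypothetical process with three transitions.
PROCESS = [
    Transition("S0", "S1", "a", ("X",)),
    Transition("S0", "S2", "b", ("Y", "Z")),
    Transition("S1", "S0", "b", ("Z",)),
]
```

For the transition S0 to S2 in this made-up process, input b and output Z each map to two transitions while output Y maps to one, so Y would be preferred as the stimulus.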
Dynamic Metrics
In general, metrics for PDM-model stimulus selection may be classified as either
static or dynamic. Static metrics take into consideration the specification but not
the corresponding operational profile of the target system. Dynamic metrics
also take into consideration the operational profile.
This thesis describes only one static metric (the u-metric). Other static or dynamic
metrics may be developed. The PDM-model transformation process to be described
remains the same regardless of the metric used.
Example 
Consider the example in figure 4.2. The u-metrics are shown in square brackets
([·]) beside each signal in figure 4.2. The u-metrics are computed based on the
full requirements specification of the telephone exchange appearing in Appendix A.
Note that signal CR-Con causes many state transitions (in Appendix A, the
star-state notation is used to capture this) and as a result it has a high u-metric
value.
4.2.2 Causality Pathways
A target system input signal may cause a sequence of n state transitions in one
or more processes of the requirements specification. The n state transitions may
produce zero or more externally observable outputs (target system outputs). This
series of state transitions shall be referred to as a causality pathway (CP).
As an example, consider the specification in figure 4.2. Assume that the processes
shown are in states Wait-D2 and Wait-Call. If signal digit(Y) is consumed,
it would cause state transition Wait-D2 → Wait-Rsp, which would cause
Wait-Call → Wait-Ans followed by Wait-Rsp → Wait-Co. This collective set of
state transitions, initiated by signal digit(Y), is referred to as a causality pathway.
Figure 4.5a illustrates this CP. A compact notation is used. Stimuli are denoted
as incoming lines to the process. Generated outputs are denoted as outgoing lines.
Actual state transitions are abstracted.
Figure 4.5 (panels (a) to (c) show Phone-Handler A and Phone-Handler Z with signals
digit, req-connect, ring-phone, remote-avail and ring-back-tone)
The PDM, responsible for detecting state transitions that occur, effectively
traces each CP. CPs can be traced in a forward direction, backward direction or a
combination of the two. A CP is traced forward by using stimuli of the requirements
specification as stimuli in the PDM-model. Conversely, a CP is traced backwards
by using outputs from the specification as stimuli in the PDM-model. Two issues
arise when CPs are traced backwards by the PDM.
The first issue deals with a possible violation of signal sequencing in the detection
of state transitions. As an example, consider the specification in figure 4.2 and
the corresponding CP in figure 4.5a. If the entire CP is traced backwards while
process A is in state Wait-D2, the PDM would be required to report that transition
Wait-Rsp → Wait-Co occurred before transition Wait-D2 → Wait-Rsp
(figure 4.5b).
The solution to this problem is to trace the CP only in the forward direction or
to use a combination of forward and backward tracing. For the previous example,
one possible forward/backward tracing that solves the described signal sequencing
problem is shown in figure 4.5c.
The second issue deals with the consistency in the selection of stimuli for the
PDM-model between individual processes. Consider two state transitions, S0 → S1
and Sa → Sb, occurring in two different processes such that the occurrence of
S0 → S1 triggers Sa → Sb (i.e. both transitions are part of a single CP) (figure 4.6a). If,
in the PDM-model, the identical signal is used as a stimulus for both transitions,
deadlock will occur (figure 4.6b). Clearly, stimuli that are chosen in one process
constrain the choice of stimuli in other processes.
4.2.3 Signal Parameters 
Parameters tagged to signals constitute the data flow through the requirements
specification. The state of a process is dependent on the values of data. Relevant
data flows must be maintained in the PDM-model.
Figure 4.6: Example: PDM-Model Deadlock: (a) Requirements Specification, (b) PDM-Model
Two types of parameters are addressed: implicit and explicit. Explicit
parameters are specified by a specification writer. As an example, the signal digit(Y)
in figure 4.2 uses an explicit parameter to carry digit information.
Implicit parameters are appended to each signal by the semantics of the
specification formalism. Examples of such parameters include the sender ID of a
signal, the signal type, destination ID, etc. For brevity, we restrict discussion of
implicit parameters to the sender ID and signal type. Other implicit parameters
may be treated in a similar manner.
In the PDM-model, all parameters used by a process must be communicated to
the process. In many cases implicit parameters are not actually used and can be
dropped to simplify the transformation and the resultant PDM-model.
4.3 PDM-Model Transformation Algorithm
This section presents the algorithm for transformation of the requirements
specification into the PDM-model. The section begins with an overview. The transformation
algorithm is presented next, followed by an example.
The presentation of the PDM-model transformation algorithm assumes that
signals in the requirements specification have unique names. Formally, consider
two signal send constructs, s1 and s2, appearing in the requirements specification.
If a state transition T does not exist such that both s1 and s2 could cause T under
any given scenario, signals s1 and s2 must have different symbolic names. The
above requirement can be enforced by simply relabeling the symbolic signal names
in the requirements specification.
4.3.1 Overview 
The PDM-model differs from the specification primarily in its stimuli. All states
and state transitions in the original specification appear in the PDM-model.
Path information is communicated to the BSup on the occurrence of each PDM
state transition. Path information consists of a sequence of stimuli that, if consumed
by the BSup, would steer execution along the same path as determined by the PDM.
The PDM-model transformation consists of two parts: (1) stimuli selection and
(2) model generation. Stimuli selection successively eliminates PDM-model stimuli
(initially, all signals generated and consumed during a state transition are candidate
stimuli for the PDM-model). Stimuli selection terminates when exactly one
signal remains for each state transition. At this point model generation is invoked.
Model generation constructs a communicating extended finite state machine with
the chosen stimuli. The result is the PDM-model.
Stimuli selection is the most challenging part of the PDM-model transformation
process. This is due to the fact that the selection of a stimulus for a particular state
transition may constrain the choice of stimuli for adjacent state transitions on one
or more CPs. These constraints are represented as a constraint graph so that, as
stimuli are chosen, other inconsistent stimuli can be removed from consideration.
The components and data flows of the transformation process are shown in
figure 4.7. The stimuli selection and model generation components of the
transformation are described below in further detail.
Figure 4.7 (components shown include the Requirements Specification and the Constraint Graph)
Stimuli Selection 
The stimulus selection algorithm considers the requirements specification at three
independent levels of abstraction.
The first level considers the data flows through the specification. The stimuli
selection algorithm ensures that data flows remain in the PDM-model as they
influence the state of processes. All processes are considered at this level.
The second level deals with the consistent selection of stimuli. As discussed,
choosing a stimulus for a state transition in process X will influence the choices
of stimuli in adjacent processes (processes that communicate directly with process
X). Consistency of stimulus selection requires consideration of stimuli for adjacent
processes.
Stimuli are actually chosen at the third level. At this level, each process is
considered independently of other processes. Stimuli are chosen based on their
uniqueness within the specification. A signal that causes or is generated in few
state transitions will give the PDM more precise information as to which state
transition occurred than would a signal that may be consumed/generated in many.
PDM-Model Generation 
The PDM-model generator begins with a model that represents the requirements
specification in topology. All finite state machines, states and state transitions
remain the same. State transitions are unlabeled (i.e. no input or output signals
appear on the transition).
The PDM-model generation consists of three steps. First, the selected stimuli
are added to the model. Second, signal output constructs are added to state transitions
that are to cause internal state transitions based on the choice of stimuli. Finally,
state transitions are added that consume target system input or output signals not
chosen as stimuli. These signals are consumed without effect¹.
4.3.2 Constraint-Based Stimulus Consistency 
To ensure consistency between the selection of stimuli for the individual state
transitions of the PDM-model, the problem is projected as a finite-domain constraint
satisfaction problem (CSP) [48]. The classic formulation of a CSP consists of
three components: (1) variables, (2) variable domains and (3) constraints between
variables. A constraint satisfaction algorithm is used to ensure that all constraints
are satisfied by successively restricting elements or ranges of elements from a
variable's domain. The CSP is said to be solvable if at least one variable assignment²
exists that satisfies all constraints.
For the model transformation problem, state transitions are mapped into variables,
candidate PDM-model stimuli for a particular transition are mapped to variable
domains and inequality constraints are placed between adjacent state transitions
of a CP. The interpretation of the constraints is that adjacent state transitions
cannot be initiated by a single signal generated or consumed during both transitions.
The CSP can then be represented as a graph where nodes represent state
transitions, contents of nodes represent possible PDM-model stimuli and labeled
arcs represent constraints.
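The mapping from causality pathways to a constraint graph can be sketched as follows. The representation is an assumption made for illustration: domains are Python sets, inequality constraints are stored as symmetric neighbor sets, and the function name `build_constraint_graph` is hypothetical. The candidate signals in the example are taken loosely from the labels of figure 4.5 and may differ from the domains actually shown in the thesis.

```python
def build_constraint_graph(candidates, pathways):
    """Build variable domains and inequality constraints.

    candidates: dict mapping each transition to its candidate stimuli
    pathways:   list of CPs, each an ordered list of transitions
    Returns (domains, neighbors), where neighbors[t] holds the transitions
    linked to t by an inequality constraint.
    """
    domains = {t: set(signals) for t, signals in candidates.items()}
    neighbors = {t: set() for t in candidates}
    for cp in pathways:
        # Adjacent transitions on a CP must not share a stimulus.
        for first, second in zip(cp, cp[1:]):
            neighbors[first].add(second)
            neighbors[second].add(first)
    return domains, neighbors


# Hypothetical candidates for the CP of section 4.2.2.
CANDIDATES = {
    "Wait-D2 -> Wait-Rsp": ["digit", "req-connect"],
    "Wait-Call -> Wait-Ans": ["req-connect", "ring-phone", "remote-avail"],
    "Wait-Rsp -> Wait-Co": ["remote-avail", "ring-back-tone"],
}
CP = ["Wait-D2 -> Wait-Rsp", "Wait-Call -> Wait-Ans", "Wait-Rsp -> Wait-Co"]
DOMAINS, NEIGHBORS = build_constraint_graph(CANDIDATES, [CP])
```

Only transitions that are adjacent on some CP are linked, so the first and last transitions of this pathway remain unconstrained with respect to each other.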
As an example, a constraint graph was derived for the specification fragments
in figure 4.2. The graph appears in figure 4.8. Note that due to space limitations,
the graph captures only the originating fragment for phones A and B and the
terminating fragment for phone Z. A complete constraint graph must capture all
interactions of all processes appearing in the communication topology (figure 4.1).
¹ These transitions are equivalent in semantics to SDL implicit transitions. They are described
explicitly for completeness purposes only.
² A variable assignment may be considered as an elimination of all domain values except one
for a given variable.
Figure 4.8: Segment of Constraint Graph (CFSM A, CFSM Z, CFSM B)
A constraint graph is said to be consistent if, for each variable's domain value, at
least one corresponding domain value exists in each variable linked by a constraint
that satisfies each corresponding constraint. The elimination of domain values from
variables may cause the graph to become inconsistent.
As an example, if signal digit(Y) is removed from transition Wait-D2 →
Wait-Rsp, signal CR-Con becomes the stimulus for the aforementioned transition.
Thus the stimulus assignment in transition Wait-D2 → Wait-Rsp is no
longer consistent with the assignment of CR-Con as the stimulus for either of
Wait-Call → Wait-Ans or Wait-Ans → Wait-Ans.
Constraint propagation is a technique to eliminate inconsistent variable domain
values. A constraint propagation algorithm accepts as input an inconsistent
constraint graph and returns a consistent constraint graph, provided that a consistent
variable assignment exists. Constraint propagation algorithms operate by successively
removing inconsistent domain values until the graph becomes consistent.
The algorithm is applied each time a value is removed from a variable's domain. A
survey of such algorithms can be found in [37].
From the above example, if the described graph was an input into a constraint
propagation algorithm, the algorithm would eliminate signal CR-Con from the
domain values of transitions Wait-Call → Wait-Ans and Wait-Ans → Wait-Ans
and signal Busy from the two Wait-Rsp → Wait-O2 transitions.
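One way to realize such propagation for inequality constraints is an arc-consistency pass in the style of AC-3. This is offered as a sketch of a well-known technique, not as the algorithm used in the thesis. The function name `propagate`, the set-based graph representation and the two-candidate domains in the example are assumptions; only the removal of digit and the resulting elimination of CR-Con follow the text.

```python
from collections import deque


def propagate(domains, neighbors):
    """AC-3 style propagation for inequality constraints (modifies domains).

    domains:   dict mapping each variable to a set of candidate stimuli
    neighbors: dict mapping each variable to the variables it must differ from
    Returns False if some domain becomes empty, True otherwise.
    """
    queue = deque((x, y) for x in domains for y in neighbors[x])
    while queue:
        x, y = queue.popleft()
        # Under "!=", a value v of x lacks support only if y's domain is {v}.
        if len(domains[y]) == 1:
            (v,) = domains[y]
            if v in domains[x]:
                domains[x].discard(v)
                if not domains[x]:
                    return False
                # x changed, so its other neighbors must be re-examined.
                queue.extend((z, x) for z in neighbors[x] if z != y)
    return True


# Hypothetical domains; "ring" is an invented second candidate.
domains = {
    "Wait-D2 -> Wait-Rsp": {"digit", "CR-Con"},
    "Wait-Call -> Wait-Ans": {"CR-Con", "ring"},
}
neighbors = {
    "Wait-D2 -> Wait-Rsp": {"Wait-Call -> Wait-Ans"},
    "Wait-Call -> Wait-Ans": {"Wait-D2 -> Wait-Rsp"},
}
domains["Wait-D2 -> Wait-Rsp"].discard("digit")  # digit is eliminated
solvable = propagate(domains, neighbors)
```

After digit is removed, CR-Con is the only stimulus left for Wait-D2 -> Wait-Rsp, so the pass deletes CR-Con from the neighboring transition, as in the example above.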
4.3.3 PDM-Model Transformation Algorithm
The PDM-model transformation algorithm is presented in two parts. The first part
is the stimulus selection algorithm (SSA). It is used to choose stimuli for the PDM-
model. The second part, the PDM-model generation algorithm (PMGA), generates
the PDM-model based on the stimuli chosen by the SSA. Recall that the transformation
process was shown graphically in figure 4.7.
In the descriptions of the SSA and PMGA, the following notation will be used:
$T_i^{rs}$ will be used to refer to transition $i$ in the requirements specification, $T_i^{cg}$ to
the corresponding node $i$ in the constraint graph and $T_i^{pm}$ to the corresponding
transition in the PDM-model.
4.3.4 Stimulus Selection Algorithm 
The SSA accepts as input the requirements specification (Spec) and the corresponding
constraint graph derived from the requirements specification (Cons-Graph).
The SSA returns a stimulus for each state transition in the requirements specification.
The SSA can be subdivided into three parts. The first part checks for causality
violations in the detection order of state transitions as described in section 4.2.2.
The second part of the algorithm ensures that data flows remain intact in the
PDM-model, as outlined in section 4.2.3. The final part of the algorithm
selects the signals that will be used to identify state transitions in the PDM-model
(i.e. the stimuli for state transitions). The selection process is based on the
u-metric described in section 4.2.1.
The SSA appears in figure 4.9. A textual summary of the algorithm follows.
Causality Violations Check [lines 1-7]
As described in section 4.2.2, a violation of causality occurs if an attempt is made
to determine that a transition occurs after the current one. The SSA statically
detects possible causality violations by tracing the CPs through the requirements
specification. If a CP is found that crosses a particular process more than once,
the algorithm forces the portion of the CP which is crossed more than once to be
traced forward.
As an example, the CP shown in figure 4.5a crosses process A twice. The SSA
enforces that the first transition be processed in a forward direction. This effectively
restricts the entire CP to be processed either entirely in a forward direction
Algorithm SSA (Spec, Cons-Graph)
1.  for all state transitions $T_i^{rs}$ ∈ Spec
2.      CP = the set of all forward causality pathways passing through $T_i^{rs}$
3.      if (there exists a cp ∈ CP such that cp initiates two or more state transitions
            in the process where transition $T_i^{rs}$ appears)
4.          stimulus($T_i^{cg}$) = stimulus($T_i^{rs}$)
5.          apply constraint propagation algorithm to Cons-Graph
6.      end if
7.  end for
8.  for all state transitions $T_i^{rs}$ ∈ Spec
9.      if (stimulus($T_i^{rs}$) carries an explicit, used parameter)
10.         stimulus($T_i^{cg}$) = stimulus($T_i^{rs}$)
11.         apply constraint propagation algorithm to Cons-Graph
12.     end if
13.     if (implicit parameter(s) of stimulus($T_i^{rs}$) cannot be statically determined)
14.         delete all signals from the domain of $T_i^{cg}$ that do not carry needed
            implicit parameters
15.     end if
16. end for
17. for all nodes $T_i^{cg}$ ∈ Cons-Graph
18.     while (number-of-elements-in-domain($T_i^{cg}$) > 1) do
19.         compute collective u-metric for each element in $T_i^{cg}$
20.         delete element in $T_i^{cg}$ with largest collective u-metric
21.         apply constraint propagation algorithm to Cons-Graph
22.     end while
23. end for
24. return (stimuli)
25. end Algorithm
Figure 4.9: Stimulus Selection Algorithm
(figure 4.5a) or partially forward and partially backward. One example of the latter
is illustrated in figure 4.5c.
Maintaining Data Flows [lines 8-16] 
Data flows that appear in the requirements specification must be maintained in the
PDM-model where required. The SSA checks all data flows and determines if the
data is required. If so, it imposes constraints on stimuli to ensure that the data
flows will appear in the PDM-model.
The SSA checks both explicit (programmer specified) and implicit parameters.
As discussed, the implicit parameters consist only of the ID of the sender process
for each signal.
The precision of the PDM-model in detecting state transitions is reduced by
imposing constraints on stimuli to maintain data flows. In some cases, some signal
parameters may be determined statically, which reduces the constraints on stimuli.
As an example, the sender ID of a signal can often be determined statically from the
communication structure if there is only one process that could actually generate
the signal.
For parameters that are used and cannot be determined statically, the CP is
constrained to be processed in a forward direction. This ensures that the direction
of the CP remains identical to that in the requirements specification.
Stimuli Selection [lines 17-23] 
For the remaining transitions having two or more candidate stimuli, stimuli are
selected based on signal u-metric. Signals having lower u-metrics are preferred
since they indicate which transition has occurred with greater certainty than a
signal with a higher u-metric.
Choosing a stimulus for a particular state transition constrains choices of other
stimuli along the CP as described in section 4.2.2. Stimuli are chosen to minimize
the restriction on the use of signals with small u-metrics in adjacent processes. A
signal s chosen as a PDM-model stimulus eliminates other candidate stimuli from
being selected. The sum of the u-metrics of all signals that are eliminated as a result of
choosing s shall be referred to as the collective u-metric. Note that the collective
u-metric includes the u-metric of s.
The final part of the SSA operates by repeatedly removing candidate stimuli
from a particular state transition, T. The stimulus with the highest collective
u-metric is removed. When only one signal is left in a node, that signal
becomes the stimulus for the node (state transition). A constraint propagation
algorithm is applied after the removal of each stimulus to ensure consistency. This
process repeats until each node in the constraint graph contains exactly one signal.
The reader should note that in the worst case, the SSA will choose stimuli for
the PDM-model that are identical to those in the requirements specification. This
would occur, for example, in a specification that does not generate any outputs. As
a result, a consistent selection of stimuli for the PDM-model always exists. However,
in some cases the stimuli selection algorithm may return an inconsistent set of
stimuli. In such a case the constraint propagation algorithm could be combined
with search to exhaustively consider the search space. In our experience, such a
scenario has not been encountered in the target system specifications considered.
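The final part of the SSA (lines 17-23 of figure 4.9) can be sketched in Python as follows. This is one possible reading, not the thesis implementation: the collective u-metric of a candidate is computed by tentatively selecting it, propagating constraints on a copy of the graph, and summing the u-metrics of the values eliminated in other nodes plus the candidate's own u-metric. The function names, the `allowed` constraint encoding and the u-metric values in the example are all hypothetical.

```python
import copy

def propagate(domains, allowed):
    """Naive fixed-point propagation: drop values without support."""
    changed = True
    while changed:
        changed = False
        for (x, y), pairs in allowed.items():
            keep = {vx for vx in domains[x]
                    if any((vx, vy) in pairs for vy in domains[y])}
            if keep != domains[x]:
                domains[x] = keep
                changed = True

def collective_u(domains, allowed, u, node, s):
    """u-metric of s plus u-metrics of values eliminated by choosing s."""
    trial = copy.deepcopy(domains)
    trial[node] = {s}
    propagate(trial, allowed)
    eliminated = sum(u[v] for n in domains if n != node
                     for v in domains[n] - trial[n])
    return u[s] + eliminated

def select_stimuli(domains, allowed, u):
    """Repeatedly delete the candidate with the largest collective u-metric."""
    for node in domains:
        while len(domains[node]) > 1:
            # sorted() makes tie-breaking deterministic.
            worst = max(sorted(domains[node]),
                        key=lambda s: collective_u(domains, allowed, u, node, s))
            domains[node].discard(worst)
            propagate(domains, allowed)
    return {n: next(iter(d)) for n, d in domains.items() if d}

# Placeholder data: choosing "b" in T1 would rule out the cheap signal "c" in T2.
domains = {"T1": {"a", "b"}, "T2": {"c", "d"}}
pairs = {("a", "c"), ("a", "d"), ("b", "d")}
allowed = {("T1", "T2"): pairs,
           ("T2", "T1"): {(y, x) for (x, y) in pairs}}
u = {"a": 1, "b": 5, "c": 1, "d": 4}
stimuli = select_stimuli(domains, allowed, u)
```

In this example "b" has a collective u-metric of 6 (its own 5 plus 1 for eliminating "c"), so it is deleted first, and the selection ends with "a" for T1 and "c" for T2.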
4.3.5 PDM-Model Generation Algorithm
The PDM-model generation algorithm accepts as input the requirements specification,
a stimulus for each state transition selected by the SSA and the unaltered
constraint graph. The algorithm creates the PDM-model. The PDM-model appears
at two levels of abstraction, similar to the corresponding requirements specification:
(1) the system or process interaction level and (2) the process level.
At the system level, all processes appearing in the requirements specification
appear in the PDM-model. The channels and signal routes connecting processes
differ, principally due to the possibility of signal direction reversal based on the
choice of stimuli for the PDM-model.
The communication topology is generated based on the following rules. Consider
two signals, $s_1$ and $s_2$, traveling from process $P_1$ to $P_2$ in the PDM-model. If the
two signals traveled on a single channel/signal route in the requirements specification,
a single channel/signal route is created between processes $P_1$ and $P_2$. If the two
signals traveled on different channels/signal routes, two separate channels/signal
routes are created between processes $P_1$ and $P_2$. Note that some internal signals
appearing in the requirements specification may not appear in the PDM-model and
as a consequence the PDM-model may contain fewer channels and/or signal routes
than the specification.
The process level PDM-model generation algorithm (Algorithm PMGA) is described
in three parts. The first part creates the transitions using the stimuli prescribed
by the SSA. The second part introduces constructs to communicate path
information from the PDM-model to the BSup. The final part adds implicit signal
consumption constructs for any signals from the environment not used as stimuli.
The PMGA is shown in figure 4.10. A textual summary of the algorithm follows.
Algorithm PMGA (Spec, Stimuli, Cons-Graph)
1.  create all processes in PDM-Model having stimuli from Stimuli
2.  for all state transitions, $T_i^{rs}$ ∈ Spec
3.      for all transitions, $T_j^{cg}$, having a constraint between $T_i^{cg}$
            and $T_j^{cg}$ ∈ Cons-Graph
4.          if (stimulus($T_j^{pm}$) ∈ $T_i^{rs}$)
5.              add output signal stimulus($T_j^{pm}$) to transition $T_i^{pm}$
6.          end if
7.      end for
8.  end for
9.  for all state transitions, $T_i^{pm}$ ∈ PDM-Model
10.     add BSup-output construct to transition $T_i^{pm}$ to communicate signal
            stimulus($T_i^{rs}$) to BSup
11. end for
12. for all state transitions, $T_i^{rs}$ ∈ Spec
13.     for all signals, sig ∈ $T_i^{rs}$
14.         if (sig ≠ stimulus($T_i^{pm}$)) and (sig originates from environment)
15.             if (sig appears before stimulus($T_i^{pm}$) in Spec)
16.                 add implicit transition in state before $T_i^{pm}$ occurs
                    to consume sig without effect
17.             else
18.                 add implicit transition in state after $T_i^{pm}$ occurs
                    to consume sig without effect
19.             end if
20.         end if
21.     end for
22. end for
23. end Algorithm
Figure 4.10: PDM-Model Generation Algorithm
Creation of PDM-Model State Transitions [lines 1-8]
This portion of the algorithm creates all state transitions with the stimuli specified
by the SSA. For state transitions triggered by internally generated signals, the
PMGA adds output constructs to the state transitions responsible for triggering
these transitions.
Insertion of Path Information to the BSup [lines 9-11] 
Constructs to communicate path information are added to each transition in the
PDM-model. The path information is used to steer the BSup along the path of
the PDM. Path information consists of the triggering signal name and its explicit and
implicit parameters. The reader should note that the path information consists of
the triggering signal that would have caused the state transition in the requirements
specification, not in the PDM-model.
Addition of Implicit Signal Consumption Constructs [lines 12-22]
All signals generated and consumed by the target system travel to the PDM. Not
all signals from the environment are used as stimuli in the PDM-model. Explicit
signal consumption constructs are added for all signals from the environment not
used as stimuli.
Signals may be consumed without effect before or after the corresponding state
transition occurs in the PDM. The main issue is to preserve the order of signal
consumption specified by the requirements specification.
For explanation purposes, the signal to be consumed without effect shall be
referred to as S. S is generated or consumed in the requirements specification
during transition T. Assume that the chosen stimulus for transition T in the
PDM-model is Stim. Note that Stim ≠ S.
If, during transition T in the requirements specification, signal S is consumed
or generated before Stim, then S in the PDM-model must be consumed before
transition T takes place. If, during state transition T, signal S is generated after
Stim is consumed or generated, then in the PDM-model S must be consumed
directly after transition T takes place (i.e. in the terminating state of transition
T).
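The ordering rule above can be captured in a few lines of Python. This is a hedged sketch: the function name `consumption_point` and the list encoding of a transition are assumptions made for illustration, with the list giving the signals of transition T in the order they are consumed or generated in the requirements specification.

```python
def consumption_point(transition_signals, stim, s):
    """Decide where signal s is consumed without effect in the PDM-model.

    transition_signals: signals of transition T, in specification order.
    stim: the stimulus chosen for T in the PDM-model.
    s: an environment signal of T that is not used as the stimulus.
    Returns "before" if s precedes stim in the specification, else "after".
    """
    if s == stim:
        raise ValueError("the stimulus itself is not consumed without effect")
    if transition_signals.index(s) < transition_signals.index(stim):
        return "before"   # consume in the state preceding transition T
    return "after"        # consume in the terminating state of transition T

# Ring is generated after Avail, so it is consumed after the transition.
where = consumption_point(["Avail", "Ring"], stim="Avail", s="Ring")
```

The example mirrors the case of signals Avail and Ring discussed in section 4.3.6, where Ring is consumed after the transition has taken place.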
4.3.6 PDM-Model Transformation Example
As an application example, the SSA and PMGA are applied to the specification
fragment illustrated in figure 4.2. The corresponding constraint graph for the
specification is shown in figure 4.8. The description of each algorithm's execution is
broken down into the three steps used during the description of the algorithm.
Stimulus Selection Algorithm
The outputs from intermediate stages in the execution of the SSA are illustrated
in figure 4.11.
Causality Violation Check 
The algorithm begins by tracing each of the CPs through the specification. In doing
so, it is determined that the CP initiated by signal digit(Y) in process A crosses
process A twice. For this reason, the stimulus for transition Wait-D2 → Wait-Rsp
is set to the stimulus of the requirements specification. A similar stimulus selection
Figure 4.11: Application of the Stimulus Selection Algorithm: (a) step 1, (b) step 2, (c) step 3
is made for the corresponding transition in process B. The resulting constraint
graph is shown in figure 4.11a.
Maintaining Data Flows
The second step of the SSA examines the explicit and implicit parameters carried by
all signals remaining in the constraint graph. The parameter (Y) carried by signal
digit(Y) is needed. However, in the previous step this signal was instantiated as
the stimulus in the PDM-model (if it had not been instantiated as the stimulus in the
previous step, it would have been during this step). Signal Ring during state
transition Wait-Call → Wait-Ans does not carry the sender ID of the stimulus
CR-Con in the PDM-model. This information is needed to communicate path
information to the BSup and cannot be determined statically since it depends on
parameters not locally known to the process. For this reason, Ring is eliminated
from the PDM-model as a candidate stimulus (figure 4.11b).
Stimuli Selection 
During the final step, the remaining stimuli are chosen. Signal CR-Con is a candidate
stimulus for transition Wait-Call → Wait-Ans. This signal is eliminated as a
candidate stimulus since signal Avail has a lower u-metric value of 2. Signal Avail is
chosen and the constraint propagation algorithm invoked, which in turn eliminates
signal Avail from transition Wait-Rsp → Wait-Co.
For transition Wait-Ans → Wait-Ans, signal Busy has a lower u-metric value.
Thus CR-Con is eliminated. The constraint propagation algorithm is invoked. The
final constraint graph is shown in figure 4.11c.
PDM-Model Generation Algorithm
The intermediate stages in the execution of the PMGA are illustrated in
figures 4.12 - 4.14.
Figure 4.12: Application of the PDM-Model Generation Algorithm (1/3)
Creation of PDM-Model State Transitions
The algorithm begins by creating a PDM-model. The PDM-model contains all
state transitions of the original requirements specification. The stimuli generated
by the SSA are used as stimuli in the PDM-model (figure 4.12). State transitions
Wait-Call → Wait-Ans and Wait-Ans → Wait-Ans are triggered by internally
generated signals. Output constructs are added to these transitions to generate
these signals.
Figure 4.13: Application of the PDM-Model Generation Algorithm (2/3)
Figure 4.14: Application of the PDM-Model Generation Algorithm (3/3)
Insertion of Path Information to the BSup 
The second step of the PMGA adds output constructs to communicate path
information to the BSup. Path information consists of the signal that would cause
the corresponding state transition in the requirements specification. Note that all
explicit and implicit parameters must be defined for this signal (figure 4.13).
Addition of Implicit Signal Consumption Constructs
The final step of the algorithm adds explicit signal consumption constructs for
all signals from the environment not used as stimuli. For this example, a signal
consumption construct is added for signal Ring. In the requirements specification,
it is generated after signal Avail and as a result it must be consumed after the
transition has taken place in the PDM-model (figure 4.14).
Chapter 5 
The Path Detection Module 
Interpreter 
This chapter outlines the theory and operation of the PDM interpreter. The PDM
interpreter interprets a PDM-model, which is an SDL specification. For this reason
the PDM-interpreter closely resembles the SDL interpreter.
The chapter begins with an overview of the interpreter. The notion of time
within the interpreter is subsequently described. The two approaches used to deal
with behavioral alternatives arising from specification non-determinism, partial-
order signal consumption and belief-based supervision, are described next. Finally,
the key algorithms of the interpreter are presented along with an analysis of their
time and space complexity.
5.1 Overview
The PDM-interpreter interprets the PDM-model. The fundamental difference
between the PDM-interpreter and the SDL abstract machine is their operation in
the presence of non-determinism. The SDL abstract machine may select any one
behavioral alternative arising from specification non-determinism. However, the
PDM-interpreter must identify and follow the behavioral alternative chosen by the
target system.
The most prominent SDL non-determinism is channel delay. As an example,
consider the SDL process and incoming channels shown in figure 5.1. The signals
traveling on each SDL channel are first-in-first-out (FIFO) ordered. The
contents of the channels are merged into a single input queue associated with the
SDL process. Several potential total orders of signals typically exist due to the
non-deterministic channel delay.
The PDM-interpreter must determine the total order chosen by the corresponding
target system and sequence signal consumption accordingly. A supervisor that
arbitrarily sequences signals for consumption would incorrectly report failures of the
target system.
5.1.1 Components of the PDM
The PDM-interpreter is described in terms of its four fundamental components:
(1) temporal signal tags, (2) partial-model supervision, (3) belief-based handling
of non-determinism and (4) the core interpreter. The components are described in
further detail.
Figure 5.1: Signal Ordering
Temporal Signal Tags Each signal in the supervisor is tagged with its time of
generation and/or consumption to facilitate its processing after its occurrence.
Partial-Model Supervision Used to reduce the number of behavioral alternatives
that need to be considered by the PDM.
Belief Creation Algorithm Used when the PDM/partial-order signal consumption
cannot resolve the behavioral alternative chosen by the target system.
Core Interpreter An out-of-time, directed SDL interpreter.
The remainder of this chapter describes, in further detail, the four components of
the PDM-interpreter.
5.2 Temporal Signal Tags 
Signals within the supervisor are tagged with the time of generation and/or
consumption. This information facilitates their processing after their occurrence. The
interpreter is responsible for generating the tags.
Signal tags are analogous to timestamps. However, uncertainty exists as to
the actual signal generation/consumption time, principally due to a lack of internal
target system observability. As a result, signals within the supervisor are tagged
with a timestamp ranging over an interval. The interval represents the time during
which signals were generated and/or consumed within the target system. Such an
interval is referred to as an occurrence interval (OI).
OIs are derived based on the time that inputs from and outputs to the environment
were generated. Consider the series of state transitions in (5.1). $C_1, \ldots, C_n$
represent global states of the PDM-model, $i$ a target system input signal, $o$ a target
system output signal and $int_1, \ldots, int_n$ internally generated and consumed signals.
OIs for the state transitions in (5.1) can be derived from the observation times
of signals $i$ and $o$. Assume that signal $i$ was observed at time $t_l$ and $o$ at time
$t_u$. $t_l$ and $t_u$ represent the lower and upper bounds of both signal generation and
consumption. Thus an OI, $[t_l, t_u]$, represents the consumption time of signal $i$, the
generation and consumption times of signals $int_1, \ldots, int_n$ and the generation time
of signal $o$.
5.2.1 Interpretation of Occurrence Intervals 
As previously stated, the actual time signals were generated and/or consumed
within the target system typically cannot be determined due to a lack of observability.
An OI captures the range of time over which a signal was generated and/or
consumed.
Within the supervisor, OIs are used to order signals. Consider two signals, $s_1$
and $s_2$, each with its own OI. A definite order of the two signals can be determined
from their OIs if the OIs do not overlap. Conversely, the order of the two signals
cannot be determined solely based on OIs if the OIs of the two signals overlap. The
formal definition of OI overlap is given below. The dot (.) operator is used to
address the OI of a signal.
Definition 5.2.1 (Overlapping Occurrence Intervals) The occurrence intervals
of two signals, $s_1$ and $s_2$, overlap if:
$$\exists t \text{ such that } [(t \ge s_1.t_l) \wedge (t \le s_1.t_u)] \wedge [(t \ge s_2.t_l) \wedge (t \le s_2.t_u)]$$
As an example, overlapping and non-overlapping OIs are illustrated graphically in
figures 5.2a and 5.2b respectively.
Figure 5.2: Overlapping and Non-Overlapping Occurrence Intervals: (a) Overlapping OIs, (b) Non-Overlapping OIs
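The overlap test of Definition 5.2.1 reduces to a comparison of interval bounds: a common time t exists exactly when the larger of the two lower bounds does not exceed the smaller of the two upper bounds. The following Python sketch illustrates this; the class name `OI` and the helper `definite_order` are hypothetical names chosen for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class OI:
    """Occurrence interval [t_l, t_u] with t_l <= t_u."""
    t_l: float
    t_u: float

    def overlaps(self, other):
        # Some t lies in both closed intervals iff max lower <= min upper.
        return max(self.t_l, other.t_l) <= min(self.t_u, other.t_u)

def definite_order(a, b):
    """Return -1 if a precedes b, 1 if b precedes a, None if undetermined."""
    if a.t_u < b.t_l:
        return -1
    if b.t_u < a.t_l:
        return 1
    return None  # OIs overlap, so the order cannot be derived from OIs alone

s1 = OI(0.0, 5.0)
s2 = OI(3.0, 8.0)
s3 = OI(6.0, 9.0)
```

Here `s1` and `s2` overlap, so their order is undetermined, whereas `s1` definitely precedes `s3`.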
The PDM-interpreter orders signals for consumption based on (1) their occurrence
intervals and (2) their sequence on the channel/signal route which they
traversed. In some cases, the interpreter may not be able to determine
the exact consumption order of signals. The set of signals whose consumption
order cannot be determined is referred to as the consumable signal set, defined
below.
Definition 5.2.2 (Consumable Signal Set (CSS)) At time $t$, let $s$ represent a
signal from the set of signals appearing at the heads of the incoming signal routes
or channels¹ having the smallest occurrence interval lower bound. Let $J$ be the set
of signals other than $s$ that appear at the heads of the incoming channels/signal
routes whose occurrence intervals overlap with $s$. The consumable signal set ($K(t)$)
is defined as: $K(t) = J \cup \{s\}$.
Theorem 5.2.1 (Consumable Signals) The signal to be subsequently consumed
must be contained in the consumable signal set.
Proof: Let $X$ be a signal such that $X \notin K$. From definition 5.2.2, $X$ either: (1)
does not appear as a signal at the head of an incoming signal route/channel or (2)
the occurrence interval of $X$ does not overlap with $s$. The two cases are treated
independently.
Case 1: $X$ does not appear at the head of a signal route/channel. The signal at the
head must be consumed before $X$. Thus $X$ cannot be a consumable signal in
the current state.
Case 2: There are two possible situations in which the occurrence intervals of $s$
and $X$ do not overlap: (1) $X.t_u < s.t_l$ and (2) $s.t_u < X.t_l$. The former is not
possible since $s$ is defined to have the minimum $t_l$ of all signals at the heads
of the incoming channels/signal routes and ($t_l \le t_u$). The latter case verifies
that $s$ must be consumed before $X$ and agrees with Theorem 5.2.1.
¹In SDL, signal routes carry all signals to processes within a block. However, signals that travel
over a channel before reaching their destination shall be referred to as traveling over channels.
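A consumable signal set can be computed directly from the signals at the heads of the incoming channels. The Python sketch below is an illustration under assumed data structures: each head signal is a hypothetical `(name, t_l, t_u)` tuple, and the function name `consumable_signal_set` is not taken from the thesis. Because $s$ has the smallest lower bound, another head signal overlaps $s$ exactly when its lower bound does not exceed $s.t_u$.

```python
def consumable_signal_set(heads):
    """Compute K(t) from the signals at the heads of the incoming channels.

    heads: list of (name, t_l, t_u) tuples, one per non-empty channel or
           signal route (an assumed encoding of a tagged signal).
    Returns the set of names of the signals that may be consumed next.
    """
    if not heads:
        return set()
    # s is the head signal with the smallest OI lower bound.
    s = min(heads, key=lambda h: h[1])
    # J holds the other heads whose OI overlaps that of s; since their lower
    # bounds are >= s.t_l, overlap reduces to t_l <= s.t_u. s itself passes.
    return {name for (name, t_l, t_u) in heads if t_l <= s[2]}

heads = [("x", 0.0, 3.0), ("y", 2.0, 6.0), ("z", 4.0, 9.0)]
css = consumable_signal_set(heads)
```

In this example "z" is excluded because its OI starts after the OI of "x" has ended, so "x" must be consumed before "z".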
5.2.2 Singly Bound Occurrence Intervals 
In some cases it may not be possible or desirable to obtain both upper and lower
bounds of a signal's OI. For example, obtaining both upper and lower bounds on the
OI for input signal $i$ requires that the time at which output $o$ is generated be propagated
backwards before the input is processed². The backward propagation of event
occurrence times adds a significant amount of complexity to the interpreter.
It is possible to determine an OI with only one bound, either the lower or upper.
The worst-case target system response time, $T_{max}$, is required in such cases. $T_{max}$
may be considered an upper limit on the time at which input $i$ will be legitimately
serviced by the target system. An event that is serviced after $T_{max}$ time units is
considered a hard real-time failure.
An OI for the case where the lower bound is not known can be approximated
as $[t_u - T_{max}, t_u]$. Correspondingly, the OI for the case where the upper OI bound
is not known is $[t_l, t_l + T_{max}]$³.
²OIs are used by the supervisor to order signals. A signal cannot be processed by the supervisor
without an OI.
³Note that this is an approximation of the actual OI. It is possible that in some cases the
supervisor would miss reporting some failures as a result of this. In chapter 7 an empirical
evaluation of the number of missed failures based on approximated OIs is presented.
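The approximation of a singly bound OI can be written as a small helper. This Python sketch is illustrative only; the function name `occurrence_interval` and its keyword arguments are assumptions, and times are taken to be plain numbers in a common unit.

```python
def occurrence_interval(t_max, t_l=None, t_u=None):
    """Build an OI from one or both observed bounds.

    t_max: worst-case target system response time.
    t_l, t_u: observed lower/upper bound, or None if not available.
    Returns the pair (t_l, t_u), approximating the missing bound with t_max.
    """
    if t_l is None and t_u is None:
        raise ValueError("at least one bound must be observed")
    if t_l is None:
        t_l = t_u - t_max      # lower bound unknown: [t_u - T_max, t_u]
    elif t_u is None:
        t_u = t_l + t_max      # upper bound unknown: [t_l, t_l + T_max]
    return (t_l, t_u)

from_output = occurrence_interval(4.0, t_u=10.0)
from_input = occurrence_interval(4.0, t_l=3.0)
```

With a worst-case response time of 4 time units, an output observed at time 10 yields the OI [6, 10], and an input observed at time 3 yields [3, 7].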
5.2.3 Generation of Signal Tags 
For target system input or output signals, the OI of the corresponding signal is
computed based on the signal observation time and the worst-case target system
response time ($T_{max}$) as described above.
OIs for internally generated signals are derived from the stimulus that caused
the state transition in the PDM-model. Recall that an OI is a bound on the
generation/consumption times of all signals generated during a sequence of state
transitions. As an example, consider the state transition sequence in (5.1). The
occurrence interval for signal $i$ includes the time where $i$ was consumed and $o$
generated. Thus it must also include the generation/consumption times of signals
$int_1, int_2, \ldots, int_n$. Thus all generated signals inherit the OI of the stimulus causing
the state transition in the PDM-model.
5.2.4 Timers 
Timers are used to implement delay and timeout facilities in CEFSM-based
specifications. Conceptually, timers may be implemented with signal send and receive
facilities. As an example, SDL timer set and reset constructs are shown in figure 5.3.
Timers are supervised so that delay and timeout failures can be detected by
the supervisor. In an out-of-time supervisor, timers are handled with the aid of
the OIs described. The semantics of SDL timers dictate that the setting of a timer
(figure 5.3a) creates a signal, which shall be referred to as a timer signal, and places
it in the input port of the corresponding process. Timer signals (representing an
expired timer) are consumed identically to other SDL signals (figure 5.3b). The
resetting of a timer (figure 5.3c) removes and discards an unconsumed timer signal
from the input port. The tags of each timer signal influence when it is consumed.
Figure 5.3: SDL Timer Set/Reset Constructs 
The occurrence interval of a timer signal is a function of three parameters: (1)
the OI of the stimulus that caused the state transition in which the timer was
set, (2) the timeout value of the timer and (3) a parameter, $\Delta$, which represents
the tolerance of timers in the target system. The last of the three parameters
implies that within the target system, a timeout will expire after $T_{out} \pm \Delta$ units of
time. For the general timer set operation in figure 5.3a, the OI of the timer signal
is set to $[t_l + T_{out} - \Delta, t_u + T_{out} + \Delta]$. $t_l$ and $t_u$ represent the OI of the signal that
caused the state transition containing the timer set operation (X).
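As a worked illustration of the timer OI formula, the following Python sketch computes the interval from the stimulus OI, the timeout value and the tolerance. The function name `timer_oi` and the numeric values are hypothetical.

```python
def timer_oi(stimulus_oi, t_out, delta):
    """OI of a timer signal set during a transition.

    stimulus_oi: (t_l, t_u) of the signal that caused the transition.
    t_out: timeout value of the timer.
    delta: tolerance of timers in the target system.
    """
    t_l, t_u = stimulus_oi
    # The earliest expiry follows the earliest possible set time, shortened
    # by the tolerance; the latest expiry follows the latest set time,
    # lengthened by the tolerance.
    return (t_l + t_out - delta, t_u + t_out + delta)

oi = timer_oi((10, 12), t_out=5, delta=1)
```

For a stimulus with OI [10, 12], a timeout of 5 and a tolerance of 1, the timer signal is tagged with the OI [14, 18].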
Any number of timers can be set by an SDL process. Each timer signal in the
supervisor is sent to the process input port via a separate, delayed channel. Timer
channels are not programmer specified but rather implicit, used to conceptualize
the ordering of timer signals within the supervisor.
As an example, consider a process, $P_1$, that uses two timers, T1 and T2. One
configuration of channels leading to $P_1$ is shown in figure 5.4. Unexpired timers
are indicated by signals pending consumption. For the example shown, T1 is an
unexpired timer, while timer T2 is not yet set.
Figure 5.4: Channels Carrying Timer Signals
5.3 Partial-Order Signal Consumption
The objective of partial-order signal consumption [24,61] is to reduce the number of
behavioral alternatives that need to be considered. Recall from section 2.2.2 that
behavioral alternatives arising from specification non-determinism can be partitioned
into two categories, don't know and don't care. Partial-order signal consumption
addresses don't care non-determinism. Its goal is to eliminate consideration of
don't care non-determinism by the supervisor.
5.3.1 Application of Partial Order Signal Consumption 
Three common types of SDL non-determinism are addressed in this thesis as out- 
lined in section 2.8. The three types of non-determinism can be sub-divided into 
directly and indirectly specified. 
Directly specified types of non-determinism addressed are spontaneous transitions
and non-deterministic decisions. A specification writer must explicitly
introduce one of these constructs. For this reason, directly specified types of
non-determinism typically fall into the don't know category. In other words, the
behavioral alternatives generated rarely lead to identical behavior (otherwise the
non-deterministic constructs would not have been introduced by the specification
writer).
Non-deterministic channel delay is an indirectly specified non-determinism, as
SDL semantics dictate that delayed channel communication must be used in certain
situations, beyond the control of the specification writer. In theory, a signal traveling
on an SDL channel can be delayed anywhere from $[0, \infty]$ units of time. However,
in practice there is typically at least an upper bound placed on communication.
From experience, many behavioral alternatives arising from channel delay fall into
the don't care category as described in section 2.2.2.
In the context of this thesis, partial-order supervision is thus targeted at reducing
the amount of don't care non-determinism arising from SDL channel delay.
5.3.2 Definitions 
Behavioral alternatives arising from SDL channel delay shall be referred to as
C-behavioral alternatives. C-behavioral alternatives arise as a result of multiple
possible signal consumption orders. The definition of a C-behavioral alternative appears
below.
Definition 5.3.1 (C-Behavioral Alternatives) Let $S_P$ represent the partially-
ordered set of signals at the input queue of process $P$. Let $R_S = \{r_1, r_2, \ldots, r_N\}$
represent the set of $N$ possible total orders of signals from set $S_P$ based on the FIFO
ordering of SDL channels/signal routes and individual signal OIs. Each member of
$R_S$ shall be referred to as a C-behavioral alternative.
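To make the set $R_S$ concrete, the following Python sketch enumerates total orders that respect both the FIFO order within each channel and the definite orders implied by non-overlapping OIs. It is a simplified illustration under stated assumptions: signals are hypothetical `(name, t_l, t_u)` tuples, a channel head may be consumed next only if no other pending head has an OI that ends before the head's OI begins, and signals deeper in a channel are assumed never to precede their own head. The function name is not taken from the thesis.

```python
def c_behavioral_alternatives(channels):
    """Enumerate total consumption orders for FIFO channels of tagged signals.

    channels: list of lists; each inner list is one channel in FIFO order,
              holding (name, t_l, t_u) tuples.
    Returns a list of total orders, each a list of signal names.
    """
    def definitely_before(a, b):
        return a[2] < b[1]          # a's OI ends before b's OI starts

    def extend(prefix, positions):
        heads = [(k, ch[positions[k]]) for k, ch in enumerate(channels)
                 if positions[k] < len(ch)]
        if not heads:
            yield [sig[0] for sig in prefix]
            return
        for k, head in heads:
            # Skip heads that some other pending head must precede.
            if any(definitely_before(other, head)
                   for j, other in heads if j != k):
                continue
            nxt = list(positions)
            nxt[k] += 1
            yield from extend(prefix + [head], nxt)

    return list(extend([], [0] * len(channels)))

channels = [[("a", 0, 5), ("b", 6, 9)],   # channel 1, FIFO order a then b
            [("c", 3, 5)]]                # channel 2
alternatives = c_behavioral_alternatives(channels)
```

Of the three FIFO-consistent interleavings, the order a, b, c is rejected because the OI of "c" ends before the OI of "b" begins, leaving two C-behavioral alternatives.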
The notion of process behavior and process behavior equivalence is now defined. 
This will be used to define behavioral alternatives that lead to identical and different 
observable behavior. 
Definition 5.3.2 (Process-Behavior) Let $P'$ represent a process in the PDM
model, in process state $\psi$ with $m$ outgoing channels and signal routes. Let $r$
represent a sequence of signals such that $r \in R_S$.
$beh_{P'}(\psi, r)$, the behavior of $P'$ after consumption of signal sequence $r$, is defined
as a 2-tuple $(\psi', C)$ where:
$\psi'$ represents the process-state of $P'$ after consumption of signal-sequence $r$
$C = \{c_1, c_2, \ldots, c_m\}$ represents the set of signal sequences from the alphabet
$\Lambda' \cup \{\epsilon\}$ on the emanating channels and signal routes of $P'$, generated as the
signal-sequence $r$ was consumed.
Two process-behaviors are considered identical if: (1) the generated sequences of
signals between the two behavioral alternatives are identical and the OIs of corresponding
signals overlap and (2) the final process-states are identical.
C-behavioral alternatives may arise from don't know or don't care non-determinism.
Don't know/don't care non-determinism is defined based on the equivalence of the
behavior arising from the two C-behavioral alternatives.
Definition 5.3.3 (Don't Care Non-Determinism) Let $r_1, r_2 \in R_S$ represent
two C-behavioral alternatives. $r_1$ and $r_2$ are said to be generated under don't care
non-determinism if $beh_P(\psi, r_1) = beh_P(\psi, r_2)$ for a given process state $\psi$ of $P$.
Definition 5.3.4 (Don't Know Non-Determinism) Let $r_1, r_2 \in R_S$ represent
two C-behavioral alternatives. $r_1$ and $r_2$ are said to be generated under don't know
non-determinism if $beh_P(\psi, r_1) \neq beh_P(\psi, r_2)$ for a given process state $\psi$ of $P$.
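The two definitions amount to an equality test on process behavior, which the following sketch makes concrete. It is a simplification under stated assumptions: the process is a hypothetical toy transition table with a single output stream, and the OI-overlap condition on generated signals is ignored, so only the final state and the emitted sequence are compared.

```python
from typing import Dict, List, Tuple

# Hypothetical toy process: (state, signal) -> (next_state, output signal or "").
Transitions = Dict[Tuple[str, str], Tuple[str, str]]


def behavior(delta: Transitions, state: str,
             order: List[str]) -> Tuple[str, Tuple[str, ...]]:
    """Return (final state, emitted outputs) after consuming `order` from `state`."""
    outputs: List[str] = []
    for sig in order:
        state, out = delta[(state, sig)]
        if out:
            outputs.append(out)
    return state, tuple(outputs)


def classify(delta: Transitions, state: str, r1: List[str], r2: List[str]) -> str:
    """Label a pair of C-behavioral alternatives per Definitions 5.3.3 and 5.3.4."""
    same = behavior(delta, state, r1) == behavior(delta, state, r2)
    return "don't care" if same else "don't know"


delta: Transitions = {
    ("S0", "X"): ("S1", "ack0"),
    ("S1", "X"): ("S1", "ack1"),
    ("S0", "Y"): ("S1", ""),
    ("S1", "Y"): ("S1", ""),
    ("S0", "Z"): ("S0", "log"),
    ("S1", "Z"): ("S1", "log"),
}
print(classify(delta, "S0", ["Z", "Y"], ["Y", "Z"]))
print(classify(delta, "S0", ["X", "Y"], ["Y", "X"]))
```

Only pairs labelled don't know would need to be carried forward as separate alternatives.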
From experience, many of the behavioral alternatives arise from don't care
non-determinism in an SDL supervisor [39]. The following claim, stated without proof
due to its obviousness, is the basis for a substantial reduction of time and/or space
complexity in a software supervisor.
Claim 5.3.1 (Partial-Order Signal Consumption) All legitimate, specified behaviors
can be considered by simulating only C-behavioral alternatives generated under
don't know non-determinism.
Not all behavioral alternatives need be considered [36]. Behavioral alternatives
arising from don't care non-determinism can be pruned from the search space. One
approach for pruning such behavioral alternatives in a supervisor is described in
the subsequent section.
5.3.3 An Implementation of Partial-Order Signal 
Consumption 
This section describes an implementation of partial-order signal consumption. There
are two principal categories of existing work on this subject, differing in their
target application. The first is applied to verification [28]. However, it is
not immediately applicable to supervision since it does not address the real-time
aspects of supervision. The second category is focused on automatic generation
of test cases for SDL specifications [59]. However, the work does not address the
non-determinism associated with SDL channel delay. In other words, it assumes
SDL specifications to have a constant SDL channel delay, an assumption not valid
for the context of this work.
A spectrum of algorithms to reduce consideration of don't care behavioral
alternatives can be envisioned. At one end is a time-intensive algorithm that uses
a generate-and-test approach to determine if behavioral alternatives were generated
under don't care non-determinism. This is the approach used in belief-based
supervision, for example. At the other end of the spectrum is a space-intensive
approach that uses a look-up table, indexed by the behavioral alternative. The
table facilitates an O(1) determination of whether two behavioral alternatives were generated
under don't care non-determinism. However, the approach has an enormous space
requirement. Neither of the two approaches is well suited to the problem at hand.
A hybrid approach is needed.
In general, there is a tradeoff between the time and/or space complexity of the
approach and the reduction in the number of don't care behavioral alternatives
considered.
The partial-order approach described capitalizes on the observation that a given
signal s in the input port of a process in the PDM-model is permutable with a
finite number of signals. This follows from the discussion of OIs in section 5.2. In
addition, many SDL specifications have only a few states in which a signal can be
consumed to result in a different behavior.
These two properties are combined into a redundant permutation distance
(rp-distance). The rp-distance represents the minimum distance, measured in state
transitions, before a transition is reached where an SDL signal can be consumed
differently than in the current state. The formal definition of rp-distance follows.
Definition 5.3.5 (rp-distance) Let $P'$ be a process in the PDM model in process-state
$\psi$ and $\Lambda'$ the input and output alphabet of $P'$. $\Lambda'^{*}$ thus represents the set of
all possible input signal sequences of $P'$. For signal $s \in \Lambda'$, the rp-distance,
rp-dist$(P', \psi, s)$, is defined as the minimum length of a signal sequence $\lambda \in \Lambda'^{*}$ such
that:
[equation missing in source]
where $s\lambda$ denotes the concatenation of signal $s$ with a signal sequence $\lambda$. If no such
sequence exists, rp-dist$(P', \psi, s) = \infty$.
Note that rp-distance is not defined for signal-process state pairs where the signal
is not consumable (i.e. where the SDL signal is saved).
The rp-distance is enumerated for each process-state and each stimulus in the
PDM model. Its significance is that it can be used to reduce redundant signal
permutation in the PDM. If the number of signals in a set whose order is not
known (i.e. the consumable signal set) is less than the rp-distance of every signal in
that set, permutation is redundant.
State/rp-distance pairs are tabulated for each process in the PDM-model. Such
a table is called a partial order distance table (POD-table) and constitutes the static
information used by the partial-order approach.
The rp-distances for all stimuli of the PDM-model fragment in figure 3.3 are
shown in table 5.1.
As an example of the derivation of the table, consider process state S0 and
signal X. In the PDM-model, the closest state (in state transitions) where X can be
consumed with a different behavior than in state S0 is S1. The distance between
S0 and S1 is one state transition. Thus the rp-distance of signal X in state S0 is
one. Consider signal Z as a second example. The behavior of the PDM-model will
be identical irrespective of the state in which Z is consumed. Thus, a state that
results in a different behavior does not exist and as a result the rp-distance of Z in
any state is ∞.
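The derivation in the example above can be sketched as a breadth-first search over the state graph. This is an illustrative reading of the prose, not the formal definition: the transition table is hypothetical, every signal is assumed consumable in every state, and a signal is assumed to be "consumed differently" in a state when it produces a different output there.

```python
import math
from collections import deque
from typing import Dict, List, Tuple

# Hypothetical toy process: (state, signal) -> (next_state, output signal or "").
Transitions = Dict[Tuple[str, str], Tuple[str, str]]


def rp_distance(delta: Transitions, state: str, signal: str) -> float:
    """Fewest transitions from `state` to a state reacting differently to `signal`."""
    here = delta[(state, signal)][1]
    seen = {state}
    frontier = deque([(state, 0)])
    while frontier:
        current, dist = frontier.popleft()
        if delta[(current, signal)][1] != here:
            return dist
        for (src, _sig), (dst, _out) in delta.items():
            if src == current and dst not in seen:
                seen.add(dst)
                frontier.append((dst, dist + 1))
    return math.inf  # no reachable state consumes the signal differently


def permutation_redundant(pod_row: Dict[str, float], consumable: List[str]) -> bool:
    """True when the set is smaller than the rp-distance of every signal in it."""
    return all(len(consumable) < pod_row[s] for s in consumable)


delta: Transitions = {
    ("S0", "X"): ("S1", "ack0"),
    ("S1", "X"): ("S1", "ack1"),
    ("S0", "Y"): ("S1", ""),
    ("S1", "Y"): ("S1", ""),
    ("S0", "Z"): ("S0", "log"),
    ("S1", "Z"): ("S1", "log"),
}
# One row of a POD-table: the rp-distance of each signal in state S0.
pod_row = {s: rp_distance(delta, "S0", s) for s in ("X", "Y", "Z")}
print(pod_row)
print(permutation_redundant(pod_row, ["Y", "Z"]))
```

Because the table is computed once from the model, the run-time check reduces to a lookup and a comparison per consumable signal.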
5.4 Belief Method 

















Two approaches have been described thus far to deal with some aspects of specification
non-determinism. The PDM-model facilitates identification of the behavioral
alternative chosen by the target system while partial order signal consumption
prunes behavioral alternatives that do not lead to different external behavior. The
mechanisms facilitate efficient handling of specification non-determinism. However,
they cannot resolve the behavioral alternative chosen by the target system
in all circumstances. The supervisor resorts to the belief method if both of the
other approaches fail (i.e. more than one unresolved behavioral alternative exists).
The belief method was discussed in section 3.2.1. It is a conceptually elegant
approach for dealing with all types of non-determinism. Thus it is more general
than the PDM-model and partial-order signal consumption. However, it has a much
larger time and space complexity.
The PDM generates a belief for each unresolvable behavioral alternative. This
occurs in two cases. The first case is where the queueing order of two or more signals
cannot be determined (figure 5.5a). In this case, a belief is generated for each
possible signal queueing order (e.g. for the example shown, A.B and B.A). The
second case is where the PDM-model contains an ANY construct, as described in
chapter 4 (figure 5.5b). In this case, a separate belief is created for each emanating
path from the ANY construct.
Figure 5.5: Generation of Beliefs 
Beliefs are treated as separate threads of execution. They are terminated in
one of two cases: (1) if the behavior represented by the belief does not match the
externally observed behavior and (2) if n beliefs represent identical global states
of the hierarchical supervisor (i.e. PDM and BSup), n - 1 of these beliefs are
terminated. Note that in the latter case, the supervisor may require processing of
the n beliefs for a finite period of time before it can determine that n - 1 of the
beliefs are redundant.
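The creation and termination rules can be sketched as follows. The data layout is hypothetical: a belief is reduced to a pair of a hashable global state and the outputs it predicts, and beliefs are evaluated in one batch rather than as concurrent threads.

```python
from itertools import permutations
from typing import Dict, Hashable, List, Tuple


def spawn_beliefs(unordered: List[str]) -> List[Tuple[str, ...]]:
    """One belief per possible queueing order of signals whose order is unknown."""
    return list(permutations(unordered))


def prune_beliefs(beliefs: Dict[int, Tuple[Hashable, Tuple[str, ...]]],
                  observed: Tuple[str, ...]) -> List[int]:
    """Return surviving belief ids; each belief is (global_state, predicted_outputs)."""
    survivors: List[int] = []
    seen_states = set()
    for belief_id in sorted(beliefs):
        state, predicted = beliefs[belief_id]
        if predicted != observed:
            continue  # case (1): prediction contradicts the observed behavior
        if state in seen_states:
            continue  # case (2): duplicate global state, keep only the first
        seen_states.add(state)
        survivors.append(belief_id)
    return survivors


print(spawn_beliefs(["A", "B"]))
beliefs = {1: ("g1", ("x",)), 2: ("g1", ("x",)), 3: ("g2", ("y",))}
print(prune_beliefs(beliefs, ("x",)))
```

Belief 3 is dropped under case (1) and belief 2 under case (2), leaving a single thread of execution.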
5.5 Core Interpreter 
The PDM interpreter closely resembles the abstract machine of SDL [58]. This section
begins with an overview of the relevant portions of the SDL abstract machine.
It then describes the key aspects of the PDM abstract machine.
5.5.1 SDL Abstract Machine 
The semantics of SDL are formally defined by means of an abstract machine. The
SDL abstract machine consists of six types of CSP [27] processes, executing concurrently
and communicating synchronously. Figure 5.6 illustrates each of the process
types and the communication between them. An overview of the functionality of
each SDL process follows. Note that discussion focuses on the supported subset of
SDL as outlined in section 2.8.
system: Responsible for creating other process instances in the abstract ma- 
chine. It also routes signals between SDL processes. 
path: Handles the non-deterministic delay of channels.
timer: Keeps track of current time and handles time-outs.
sdl-process: An SDL interpreter. One instance of this process exists for each SDL
process in the specification.




Figure 5.6: SDL Abstract Machine
input-port: Handles the queueing of signals for an SDL process. One instance of
input-port exists for each sdl-process instance.
view: Keeps track of all revealed variables. Implements communication between 
sdl-processes by means of shared memory. 
This process represents functionality of SDL not addressed in this work and 
is not discussed further. 
5.5.2 PDM Abstract Machine 
The PDM abstract machine is similar to the SDL abstract machine described. The
difference between the two arises principally from the treatment of specification
non-determinism.
The SDL abstract machine may arbitrarily choose a single behavioral alternative
from the set of possible alternatives arising from specification non-determinism. The
PDM abstract machine is required to identify and choose the behavioral alternative
followed by the target system.
The PDM abstract machine differs in three respects from its SDL counterpart.
First, as described previously, most behavioral alternatives arise from a number of
possible signal permutations at the input port. As a result, the input port of the
PDM abstract machine differs significantly from its SDL counterpart. Second, in
some cases, the PDM will not be able to resolve the selected behavioral alternative.
Beliefs were proposed as a way of dealing with this. Thus the PDM abstract
machine must provide support for belief creation, management and termination.
Finally, to support out-of-time processing of signals, the PDM abstract machine
tags all signals with their OIs.
The process interaction diagram of the PDM abstract machine appears in figure 5.7.
A textual summary of each process type in the abstract machine follows.
The descriptions highlight the differences between the PDM and SDL abstract machines.










Figure 5.7: Path Detection Module Abstract Machine 
system: Creates, manages and terminates beliefs. Timestamps all signals generated
by the environment with an occurrence interval. Handles routing of
signals from the PDM to the BSup. Note that the PDM abstract machine
only supports static SDL process creation.
path: Communicates signals traveling over SDL channels to their appropriate destinations.
Note that, unlike its SDL counterpart, the path process does not output
signals to the environment.
timer: Keeps track of current time and handles time-outs.
PDM-process: An SDL interpreter. The only difference between the sdl-process
and PDM-process is that all paths are followed by the PDM-process when
executing an ANY construct, by generating one belief for each emanating
path.
input-port: Orders signals for consumption according to the order
chosen by the target system when the order can be determined. Creates
beliefs when the exact order cannot be determined.
Belief Creation/Termination 
As outlined in section 5.4, beliefs are generated in response to unresolvable behavioral
alternatives. In the hierarchical supervisor, the PDM is used to resolve the
behavioral alternative chosen by the target system. As a result only the PDM creates
beliefs. Beliefs are terminated when the external behavior represented by the
belief doesn't match the expected behavior from the target system. Thus beliefs
may be terminated either by the PDM or BSup.
Within the abstract machine, beliefs may be created by either the PDM-process
or the input port as described. The control signals exchanged by the processes when
creating beliefs in these two cases are shown in figures 5.8a and 5.8b respectively.
Beliefs may be terminated by either the BSup or input port. The control signals
exchanged under these scenarios are shown in figures 5.9a and 5.9b respectively.
Figure 5.8: Belief Creation. (a) PDM Process Initiated; (b) Input Port Initiated
Note that all SDL processes in a belief are terminated. Figures 5.9a and 5.9b show
only one process being terminated as others receive identical signals.
Figure 5.9: (a) BSupAM Initiated; (b) Input Port Initiated





The belief creation/termination facilities were originally described and formal- 














The supervisor processes target system input and output signals out-of-time as
described in section 3.4.2. Signals generated at time t may be potentially processed
















lags the clock of the target system by at least $T_{max}$ units of time.
The actual time within the PDM, or the value of the PDM's clock, is defined
in this section. The clock of the PDM is advanced as signals are consumed. Thus
the PDM clock is derived from the timestamps of the signals in the input ports
of its SDL processes. Individual processes represent executions at different points
in time, depending on how the processes are scheduled. Thus the time within the
supervisor is defined at two levels: (1) a process level and (2) a global level. The
definitions appear below.
Definition 5.5.1 (PDM Process Time) Let $P$ represent a PDM SDL process
in process state $\sigma$ and $S$ the set of signals queued in its input port, $K$ the
consumable signal set and $K'$ a subset of $K$ where all signals in set $K'$ are consumable
in the current state (i.e. not in the Save set) ($K' \subseteq K \subseteq S$). The process time of
$P$ ($T_P$) is an interval: $T_P = [T_{P_l}, T_{P_h}]$ where:
$T_{P_l}$ = minimum occurrence interval lower bound of a signal, $s_1 \in K'$
$T_{P_h}$ = maximum occurrence interval upper bound of a signal, $s_2 \in K'$
For a process $P$, if set $K'$ is empty its process time is undefined.
The process time ranges over an interval due to the uncertainty of the actual
generation/consumption time of signals in the target system. The process time is
undefined for processes with zero signals in their consumable unsaved signal sets
(CUSSs). Process times are consolidated into a PDM global time, defined below.
Definition 5.5.2 (PDM Global Time) For a set of SDL processes, $G$, the global
time of $G$ ($T_{G_{PDM}}$) is an interval, $T_{G_{PDM}} = [T_{G_l}, T_{G_h}]$ where:
$T_{G_l}$ = minimum occurrence interval lower bound of a process, $P_1 \in G$ having
a defined process time
$T_{G_h}$ = maximum occurrence interval upper bound of a process, $P_2 \in G$ having
a defined process time
If all processes $P \in G$ have undefined process times, the global time is undefined
as well.
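The two time definitions reduce to interval envelopes, as the following minimal sketch shows. It assumes occurrence intervals are plain `(lo, hi)` pairs and uses `None` for an undefined time; the process names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Interval = Tuple[float, float]


def process_time(cuss: List[Interval]) -> Optional[Interval]:
    """Process time per Definition 5.5.1; None when no signal is consumable."""
    if not cuss:
        return None
    return min(lo for lo, _ in cuss), max(hi for _, hi in cuss)


def global_time(process_times: Dict[str, Optional[Interval]]) -> Optional[Interval]:
    """Global time per Definition 5.5.2, ignoring processes with undefined time."""
    defined = [t for t in process_times.values() if t is not None]
    return process_time(defined)  # same envelope, taken over process times


times = {
    "P1": process_time([(3, 5), (4, 8)]),
    "P2": process_time([]),            # empty CUSS: undefined
    "P3": process_time([(1, 2)]),
}
print(times)
print(global_time(times))
```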
5.5.3 PDM Input Port 
The central part of the PDM abstract machine is the input port. The input port
orders signals for consumption. It takes into consideration the FIFO constraints
imposed by the channels and signal routes of the specification as well as the signal
occurrence intervals. Thus for n signals, only a subset of the n! signal orders
typically need be considered.
The core of the input port is a sorting algorithm that orders signals in the
consumable signal set (CSS), defined in section 5.2.1. Signals in the Save set are
removed from the CSS to form the consumable unsaved signal set (CUSS). Signals
in the CUSS are candidate signals to be consumed in the current state.
The input port is described in two parts. The first part (Algorithm QueueSignal())
deals with the queueing of signals and preservation of the FIFO signal orders
imposed by channels and signal routes. The second part (Algorithm ConsumeSignal())
outputs signals to the corresponding SDL process for consumption. The
discussion begins with a description of the major type and datastructure definitions
followed by the actual algorithms.
Type Definitions
signal-name: symbolic signal names consumed/generated by SDL process P
OI: the occurrence interval type as defined in section 5.2
signal: represents an SDL signal. signal has the following sub-fields:
    name: symbolic signal name of type signal-name
    sender: pid of sender process
    receiver: pid of receiver process
    origin: source of the signal (i.e. PDM/BSup/environment)
    OI: occurrence interval
    parameters: associated signal parameters as defined in the PDM-model
PS: the process states of P
time: an interval ranging over a period of time
Datastructures 
Datastructures principally store incoming signals to the signal routes/channels. It
is assumed that an SDL process has n incoming channels and/or signal routes.
$c_i$: a sequence of elements of type signal. $c_i$ represents the sequence of signals on
an incoming channel or signal route i ($1 \le i \le n$). As signals are consumed
they are removed from the head of $c_i$. Note that $c_i$ may be empty.
C: a set of sequences of type signal. $C = \{c_1, c_2, \ldots, c_n\}$
J: a sequence of signals. Contains a copy of the signals at the heads of the incoming
channels/signal routes. J is kept sorted based on the lower bound of
each signal's occurrence interval.
PODT($\sigma$)(s): partial order distance table, an array indexed by the current process
state ($\sigma$) and signal (s), returning the partial-order distance.
$T_{PDM}$: global time of the PDM
curr-belief: a pointer to the current belief of the process
Operations
comm-path-id(s : signal): accepts as input a signal s and returns $c_i$, the incoming
communication path traversed by s, where $c_i \in C$.
$x \frown y$: sequence concatenation. x and y represent sequences. The function returns
the concatenation of x and y.
Queue Signal Algorithm
The queue signal algorithm is responsible for queueing signals in a datastructure
that preserves the FIFO order of SDL channels/signal routes ($c_i$). It also updates
J, a copy of the signals at the heads of the incoming channels/signal routes.
Consume Signal Algorithm
The consume signal algorithm orders signals for consumption. It implements partial-order
signal consumption and creates beliefs when uncertainty exists as to the actual
ordering of signals.
Algorithm QueueSignal(C : signal-sequence-set, J : signal-sequence, s : signal)
1. c_i = comm-path-id(s)
2. if (c_i == empty)
3.     insert s into J, sorted by the lower bound of each signal's OI
4. end if
5. append s to the tail of c_i
6. return(C, J)
end Algorithm
Figure 5.10: Input Port Queue Signal Algorithm
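The queueing step of figure 5.10, together with the matching dequeue step of figure 5.12, can be sketched as a small class. This is an illustration under assumptions rather than the thesis implementation: `InputPort`, `Signal` and their field names are hypothetical, the standard `bisect` module stands in for the sorted insert, and a dequeued signal is assumed to be the head of its path.

```python
import bisect
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Dict, List


@dataclass(order=True)
class Signal:
    lo: float                          # lower bound of the occurrence interval
    hi: float                          # upper bound of the occurrence interval
    name: str
    path: str = field(compare=False)   # incoming channel or signal route


class InputPort:
    def __init__(self, paths: List[str]) -> None:
        # C: one FIFO sequence per incoming channel/signal route.
        self.c: Dict[str, Deque[Signal]] = {p: deque() for p in paths}
        # J: the head of every non-empty path, sorted by OI lower bound.
        self.j: List[Signal] = []

    def queue_signal(self, s: Signal) -> None:
        ci = self.c[s.path]
        if not ci:                     # s becomes the head of its path
            bisect.insort(self.j, s)
        ci.append(s)

    def dequeue_signal(self, s: Signal) -> None:
        self.j.remove(s)
        ci = self.c[s.path]
        ci.popleft()
        if ci:                         # expose the next signal on that path
            bisect.insort(self.j, ci[0])


port = InputPort(["ch1", "ch2"])
port.queue_signal(Signal(2, 6, "B", "ch2"))
port.queue_signal(Signal(0, 4, "A", "ch1"))
port.queue_signal(Signal(5, 9, "C", "ch1"))
print([s.name for s in port.j])
port.dequeue_signal(port.j[0])
print([s.name for s in port.j])
```

Keeping only path heads in J bounds its size by the fan-in, which is what the later complexity analysis relies on.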
Due to the complexity of the algorithm, a flowchart of the algorithm appears
in figure 5.11. The actual algorithm appears in figure 5.12. A textual summary of
the algorithm follows.
Consume Signal Algorithm Description
lines 1-2 Construct the consumable signal set (see definition 5.2.2).
lines 3-6 Check if the global time has advanced past the process time. If so, no
signal will ever arrive allowing the signals in the input port to be consumed.
The current belief is terminated.
line 7 The consumable unsaved signal set (CUSS) is generated for the current
process state.
lines 8-16 A check is made if the partial-order distance of each unsaved consumable
signal is greater than the total number of unsaved consumable signals. If
so the order of signal consumption is arbitrary and flag is set false to indicate
this.
Figure 5.11: Consume Signal Algorithm
Algorithm ConsumeSignal(C : signal-sequence-set, J : signal-sequence,
                        σ : process-state, T_G,PDM : time)
1.  α = head(J)
2.  construct J_l and J_h such that J = J_l ⌢ J_h. The OIs of all signals in J_l overlap
    with the OI of α and the OIs of all signals in J_h do not overlap with the OI of α
3.  if (T_G,PDM does not overlap with the OI of any signal in J_l)
4.      output(Terminate-Belief)
5.      exit Algorithm
6.  end if
7.  let J_l' represent the unsaved signals for process state σ in J_l, ordered as in J_l
8.  let N = number of elements in J_l'
9.  flag = false
10. for each v of type signal-name
11.     if (at least one signal (s) of type v exists in J_l')
12.         if (PODT(σ)(s) < N)
13.             flag = true
14.         end if
15.     end if
16. end for
17. cb = curr-belief
18. if (flag == true)
19.     for (index = 2 to N)
20.         output(Register-Belief)
21.         output(Send-Signal(J_l'(index)))
22.         (C, J) = DeQueueSignal(C, J, J_l'(index))
23.         output(Set-Belief(cb))
24.     end for
24. end if
25. output(Send-Signal(J_l'(1)))
26. (C, J) = DeQueueSignal(C, J, J_l'(1))
27. return(C, J)
end Algorithm
Algorithm DeQueueSignal(C : signal-sequence-set, J : signal-sequence, s : signal)
1. remove signal s from J
2. c = comm-path-id(s)
3. delete-head(c)
4. if (c ≠ empty)
5.     x = head(c)
6.     insert x into J, sorted by the lower bound of each signal's OI
7. end if
8. return(C, J)
end Algorithm
Figure 5.12: PDM Input Port Signal Consumption Algorithm
lines 18-24 If flag is true, a separate belief is created for each possible signal
consumption order.
lines 25-26 The final signal is consumed in the current belief.
DeQueue Signal Algorithm Description
lines 1-3 The signal to be consumed is removed from the internal signal list (J)
and from the incoming channel/signal route.
lines 4-7 If the channel/signal route carrying the signal to be consumed is not
empty, the subsequent signal is added to the consumable signal list.
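The decision logic summarized above can be sketched as a pure function. It deliberately departs from figure 5.12 in one respect, which is an assumption of this sketch: instead of emitting the Register-Belief, Send-Signal and Set-Belief control signals, it returns the list of candidate signals, where the first entry is consumed in the current belief and each further entry stands for one newly registered belief. Signals are hypothetical `(name, lo, hi)` tuples.

```python
from typing import Dict, List, Optional, Set, Tuple

Signal = Tuple[str, float, float]      # (name, OI lower bound, OI upper bound)
Interval = Tuple[float, float]


def overlaps(a: Interval, b: Interval) -> bool:
    """Closed intervals overlap when each starts no later than the other ends."""
    return a[0] <= b[1] and b[0] <= a[1]


def consume_signal(j: List[Signal], saved: Set[str], pod_row: Dict[str, float],
                   global_time: Interval) -> Optional[List[Signal]]:
    """Return candidate signals (one per belief), or None to terminate the belief."""
    alpha = j[0]
    # Lines 1-2: J_l holds the signals whose OI overlaps that of the head.
    j_l = [s for s in j if overlaps(s[1:], alpha[1:])]
    # Lines 3-6: terminate the belief if global time overlaps none of them.
    if not any(overlaps(global_time, s[1:]) for s in j_l):
        return None
    # Line 7: drop saved signals to obtain the CUSS.
    cuss = [s for s in j_l if s[0] not in saved]
    n = len(cuss)
    # Lines 8-16: permutation matters if some partial-order distance is below n.
    flag = any(pod_row[s[0]] < n for s in cuss)
    # Lines 17-26: every candidate when order matters, otherwise only the first.
    return cuss if flag else cuss[:1]


inf = float("inf")
j = [("A", 0, 4), ("B", 2, 6), ("C", 7, 9)]
print(consume_signal(j, set(), {"A": 1, "B": inf, "C": inf}, (3, 3)))
print(consume_signal(j, set(), {"A": inf, "B": inf, "C": inf}, (3, 3)))
print(consume_signal(j, set(), {"A": 1, "B": inf, "C": inf}, (20, 21)))
```

The second call shows the pruning effect: with every distance above the set size, only one consumption order is pursued and no belief is created.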
5.5.4 Complexity Analysis
The analysis evaluates the asymptotic time and space complexity of the major
algorithms associated with the input port. A definition of the notation used in the
analysis is presented first. The analysis omits discussion of portions of algorithms
whose complexity is O(1).
c: the maximum number of incoming channels and signal routes to an SDL process
(i.e. fan-in)
t: the number of signal types consumed/generated by the SDL process (i.e. the
cardinality of signal-name)
B: the number of beliefs generated. An analytical expression will not be presented
for B as it is highly application-specific. However, B is a function of the
specification, the PDM-model and the operational profile.
N: maximum number of signals queued within the PDM. N is principally a function
of the load on the target system and the SDL specification.
Queue Signal Algorithm
The essential function of the queue signal algorithm is to insert signals into a sorted
list (J). The list contains at most one signal from each channel/signal route (c).
The algorithm is called for each signal queued (N) and is executed once per belief
(B). As a result, the worst case running time complexity of the algorithm is given
by 5.2.
DeQueue Signal Algorithm
The running-time complexity of all lines in the DeQueue Signal algorithm is O(1)
except for line 6, which performs an insert into a sorted list. Thus the algorithm's
running-time complexity is given by 5.3.
$T_{PDM\text{-}IP_{DeQueueSig}}(B, N, c) = O(B \cdot N \cdot \log c)$    (5.3)
Consume Signal Algorithm
line 2 A linear search and copy examines each element in J. The worst case size
of J is c elements, resulting in a running time complexity of O(c).
lines 3,7 Linear search of all elements in $J_l$. The maximum size of $J_l$ is identical
to J, resulting in a running-time complexity of O(c).
lines 10-16 A search of $J_l'$ is performed t times. Thus the running time complexity
of the outer loop is O(t log c).
lines 19-26 For each belief generated, a call is made to the DeQueue algorithm.
A maximum of one belief per element in $J_l'$ is created. Thus the running time
complexity becomes O(c log c).
The running time complexity of the consume signal algorithm is dominated by
lines 10-16 and 19-26. The algorithm is repeated for each signal consumed (N) and
for each belief generated (B). Thus, the overall running-time complexity is given
by 5.4.
5.5.5 Computational Complexity of the Input Port 
The computational complexity of the input port is dominated by the consume
signal algorithm. This makes intuitive sense since this is the most sophisticated of
all algorithms. Thus the complexity of the input port is given by 5.5.
5.5.6 Scheduling Process Execution within the PDM 
SDL processes in an SDL specification execute concurrently [62]. For a given specification,
many execution interleavings exist. Two or more execution interleavings
may result in different observable behavior. The supervisor, as described, considers
the execution of SDL processes sequentially.
Scheduling of SDL processes within the supervisor must therefore take into
consideration that alternate scheduling orders may result in different externally observable
behaviors. Behavioral alternatives arising from different process execution
orders may be classified and dealt with similarly to behavioral alternatives arising
from signal consumption orders. Don't know alternatives result in different externally
observable behavior while don't care alternatives do not. Beliefs need be
created to consider don't know process scheduling orders.
From experience, the majority of SDL process scheduling alternatives do not
result in different externally observable behavior. The intuitive explanation behind
this is that the consideration of alternate scheduling orders adds to the complexity
of the specification. This makes the specification more difficult to understand
and impedes its central purpose: unambiguity and understandability to all parties
involved with the software development effort, from the customer to software
developers and testers.
The implication of incorrectly scheduling the execution of processes within the
PDM is that the hierarchical supervisor will generate false failure reports. An
analysis was done on the class of systems described. It was determined that the
scheduling order can be approximated by scheduling processes based on their process
times. A total scheduling order can be imposed if the process times do not
overlap. Processes with overlapping process times are ordered heuristically.
Processes that are about to consume signals from the environment are executed
before processes that are about to consume internally generated signals, such
that all internally generated signals are generated before processing begins. The
scheduling algorithm appears in the subsequent section.
PDM Process Scheduling Algorithm 
Scheduling of processes within the PDM is done by the system process in the PDM
abstract machine. The scheduling algorithm maintains a list of SDL processes
ready to execute (i.e. SDL processes having at least one unsaved signal in their
consumable signal set). The first process P on the scheduling list is removed and
executed provided that the upper bound of P's process time ($t_u$) is earlier than
the current time (i.e. $t_u < T_c$) (see section 3.4.2).
The scheduling algorithm accepts the parameters defined below as input. It 
returns an updated scheduling list, L. The algorithm appears in figure 5.13. 
P : process to be scheduled 
L : ordered process scheduling list 
Algorithm ScheduleProcess(P : process, L : scheduling-list)
1. let X represent a 3-tuple: <P, P.t_l, P.t_u>
2. insert X into L, sorted in ascending order based on t_l
3. re-sort L such that P-tuples with overlapping OIs having signals from the environment
   in their input queue appear before processes having internal signals in their input queue
4. return(L)
end Algorithm
Figure 5.13: PDM Process Scheduling Algorithm
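A minimal sketch of the scheduling steps follows. Several choices here are assumptions: the 3-tuple is flattened into a hypothetical `(t_l, t_u, from_environment, name)` entry, "overlapping" is interpreted as belonging to the same chain of overlapping process times, and the list is fully re-sorted on each call for clarity rather than updated by a single sorted insertion.

```python
from typing import List, Tuple

# Assumed entry layout: (t_l, t_u, consumes_environment_signal, process name).
Entry = Tuple[float, float, bool, str]


def schedule_process(entry: Entry, sched: List[Entry]) -> List[Entry]:
    """Order by t_l, then put environment consumers first within overlapping runs."""
    ordered = sorted(sched + [entry], key=lambda e: e[0])      # step 2
    result: List[Entry] = []
    run: List[Entry] = []          # current chain of overlapping process times
    run_hi = float("-inf")
    for e in ordered:
        if run and e[0] > run_hi:  # e starts after the whole run has ended
            result.extend(sorted(run, key=lambda x: not x[2]))  # step 3
            run = []
            run_hi = float("-inf")
        run.append(e)
        run_hi = max(run_hi, e[1])
    result.extend(sorted(run, key=lambda x: not x[2]))
    return result


L: List[Entry] = []
for e in [(0, 5, False, "P1"), (3, 8, True, "P2"), (10, 12, False, "P3")]:
    L = schedule_process(e, L)
print([name for *_, name in L])
```

Because Python's sort is stable, processes within a run keep their ascending t_l order apart from the environment-first rule.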
Running-Time Complexity 
Let $N_p$ represent the number of SDL processes in the PDM-model, N the number of
queued signals within the PDM and B the maximum number of beliefs generated.
The maximum size of L is $N_p$. A sorted insertion into L has a running-time
complexity of $O(\log N_p)$. The scheduling algorithm is executed after each signal
consumption. Scheduling is re-computed for each belief independently. Thus the
overall running-time complexity of the algorithm is given by 5.6.
5.6 Time and Space Complexity of the PDM 
The complexity is analyzed based on the dominant algorithms described (i.e. Consume
Signal and Schedule Process). The time and space complexities are presented
individually.
5.6.1 Running-Time Complexity
For each signal within the PDM, internally or externally generated, the Consume
Signal and Schedule Process algorithms are invoked. The algorithms are invoked once per
belief. Based on equations 5.5 and 5.6, the overall running time complexity of the
PDM is given by 5.7.
5.6.2 Space Complexity 
The space complexity of the supervisor is largely dependent upon the number of
beliefs generated (B). Each belief makes a duplicate of each signal (N), the state of
each process P, and the scheduling list ($N_p$). Thus for a specification of size S, the
space complexity of the PDM is given by 5.8.
Chapter 6 
The Base Supervisor 
This chapter describes the base supervisor (BSup). Like the PDM, the base
supervisor consists of a BSup-model, obtained from the requirements specification by
transformation, and a BSup-interpreter.
The chapter begins with a discussion of the BSup-model transformation process.
A high-level overview of the BSup interpreter is presented next, followed by a
discussion of time within the BSup. The BSup interpreter is then described in
detail. Major algorithms and their time/space complexities are presented.
6.1 The Base Supervisor Model
As described in chapters 4 and 5, the objective of the PDM is to steer the
execution of the BSup. Unlike the PDM, the BSup makes use of an almost-unaltered
requirements specification.
The PDM steers BSup execution by specifying the signal consumption order that
would lead the BSup along the determined path. In two cases, the path chosen by
the BSup does not depend on the signal consumed. Execution of the SDL ANY or
NONE constructs directs an SDL specification along a non-deterministically chosen
path.
Two transformations are used to allow the PDM to steer the BSup in both
of these cases. The transformations appear in figure 6.1. The ANY construct is
replaced by a state transition for each emanating path (figure 6.1a). Signals causing
the state transitions (ANY-P1, ANY-P2, ..., ANY-Pn) are generated solely by the
PDM and are not matched with a signal generated within the BSup. Similarly,
spontaneous transitions are replaced with signal transitions initiated by the PDM
(figure 6.1b). Directives that are not matched shall be referred to as non-matchable
directives.
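As an illustration of the ANY transformation, the following Python sketch replaces an ANY decision with one signal-triggered transition per emanating path. The function name `transform_any`, the dictionary representation of the transition table and the state names are assumptions made for this example only; they are not taken from the BSup-model format.

```python
def transform_any(state, branches):
    """Replace an ANY decision leaving `state` with signal-triggered transitions.

    `branches` lists the target states of the emanating paths P1..Pn.
    The result maps (state, directive name) to the target state, so the
    path taken is selected by the directive ANY-Pi sent by the PDM.
    """
    return {(state, f"ANY-P{i}"): target
            for i, target in enumerate(branches, start=1)}


# Hypothetical state S0 with three paths emanating from an ANY construct.
table = transform_any("S0", ["S1", "S2", "S3"])
```

The none transformation would follow the same pattern with directive names none-P1 to none-Pn.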
(a) ANY Transformation
(b) none Transformation
Figure 6.1: Base Supervisor Model Transformations
6.2 Base Supervisor Interpreter Overview 
The BSup abstract machine is similar to the SDL abstract machine. However,
differences arise from the difference in purpose of the BSup abstract machine, that
is, detailed behavior checking. The major differences between the SDL and BSup
abstract machines are outlined below.
Time: The BSup abstract machine is an out-of-time SDL interpreter. All signals
are tagged with OIs as in the PDM. OIs are used to order signals for
consumption.
Belief Processing: The BSup includes facilities for belief generation, management
and termination. Beliefs are created under the direction of the PDM;
however, they may be terminated from within the BSup (as well as within the
PDM).
Comparator: With reference to figure 3.4, the comparator implements the
expected/observed behavior buffers and the matcher functionality. Its function
is to compare expected and observed signals generated by the BSup and target
system respectively and terminate the currently executing belief if a match
cannot be made.
Path-Direction: Signals to be consumed by a BSup process must match with
path-directives generated by the PDM. The BSup interpreter includes facilities
for matching path directives from the PDM with signals in the BSup
queued for consumption.
The remainder of this chapter describes the BSup abstract machine. The discussion
begins with a description of time in the supervisor. A process-level description
of the BSup abstract machine is presented next. The major algorithms of the
abstract machine are subsequently presented along with their associated time and
space complexities.
6.3 Time within the Base Supervisor 
The BSup makes use of signal occurrence intervals, similar to the PDM. Each signal
is tagged with an OI, ranging over an interval, that represents when the signal was
generated and/or consumed. OIs within the BSup are derived identically to those in
the PDM (discussed in section 5.2). SDL timers are implemented with the aid of
OIs and operate as described in section 5.2.4.
The notions of process and global time are defined for the BSup as done for the
PDM. The BSup-specific versions of the definitions follow.
Definition 6.3.1 (BSup Process Time) Let P represent a BSup SDL process
in process state $\sigma$, and S the set of signals queued in the input port of P. K is
the consumable signal set and K' a subset of K where all signals in set K' are
consumable in the current state (i.e. not in the SDL Save set) ($K' \subseteq K \subseteq S$). The
process time of P ($T_P$) is an interval: $T_P = [T_{P_l}, T_{P_u}]$ where:
• $T_{P_l}$ = minimum occurrence interval lower bound of a signal, $s_1 \in K'$
• $T_{P_u}$ = maximum occurrence interval upper bound of a signal, $s_2 \in K'$
For a process P, if set K' is empty its process time is undefined.
Unlike the PDM, a BSup input port must process signals generated within the BSup
(or from the environment) in addition to signals generated by the PDM. Note that
process time is not influenced by the path-directives generated by the PDM.
Definition 6.3.2 (BSup Global Time) For a set of SDL processes, G, the global
time of G ($T_{G_{BSup}}$) is defined over a range $T_G = [T_{G_l}, T_{G_u}]$ where:
• $T_{G_l}$ = minimum occurrence interval lower bound of a process, $P_1 \in G$ having
a defined process time
• $T_{G_u}$ = maximum occurrence interval upper bound of a process, $P_2 \in G$ having
a defined process time
If all processes $P \in G$ have undefined process times, the global time is undefined
as well.
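Definitions 6.3.1 and 6.3.2 can be read as two small reductions over occurrence intervals. The sketch below is one possible Python rendering; the type names `OI` and `Signal`, the use of signal names to represent the Save set, and the use of `None` for an undefined time are assumptions of this example.

```python
from typing import Iterable, NamedTuple, Optional


class OI(NamedTuple):
    lo: float   # occurrence interval lower bound
    hi: float   # occurrence interval upper bound


class Signal(NamedTuple):
    name: str
    oi: OI


def process_time(consumable: Iterable[Signal], save_set: set) -> Optional[OI]:
    """Process time over K': the consumable signals that are not in the Save set."""
    k_prime = [s for s in consumable if s.name not in save_set]
    if not k_prime:
        return None   # K' is empty, so the process time is undefined
    return OI(min(s.oi.lo for s in k_prime), max(s.oi.hi for s in k_prime))


def global_time(process_times: Iterable[Optional[OI]]) -> Optional[OI]:
    """Global time over all processes that have a defined process time."""
    defined = [t for t in process_times if t is not None]
    if not defined:
        return None   # every process time is undefined
    return OI(min(t.lo for t in defined), max(t.hi for t in defined))


K = [Signal("a", OI(1, 3)), Signal("b", OI(2, 5)), Signal("c", OI(0, 9))]
t_p = process_time(K, save_set={"c"})
t_g = global_time([t_p, None, OI(4, 7)])
```

Path directives from the PDM never appear in `consumable`, which mirrors the note that process time is not influenced by them.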
6.4 Base Supervisor Interpreter
The BSup interpreter is specified as an abstract machine, based on the SDL
abstract machine (figure 5.6). The BSup interpreter, as described, does not implement
the functionality associated with the view process. The comparator process is
introduced to match expected and observed signals (figure 3.4). The BSup abstract
machine process interaction diagram is shown in figure 6.2. A brief description of
the functionality of each process follows.
system: Creates, manages and terminates beliefs. Tags all signals generated by the
environment with an OI. Note that the BSup abstract machine only supports
static SDL process creation.
path: Stamps all signals with the traversed channel ID. Does not delay signals like
its SDL abstract machine counterpart.
timer: Keeps track of current time and handles time-outs.












Figure 6.2: Base Supervisor Abstract Machine 
BSup-process: An SDL interpreter. Identical to the SDL abstract machine with 
one exception: no support is provided for execution of ANY or none constructs 
(see section 6.1). 
input-port: Maintains two groups of signals: (1) signals generated by the environment
and internally within the BSup and (2) signals generated by the PDM.
Orders signals for consumption according to the order prescribed by the PDM.
comparator: Queues signals for matching. Terminates the current belief if a 
match between expected and observed signals cannot be made. 
6.4.1 Belief Creation/Termination 
As indicated previously, belief creation is initiated only by the PDM. A belief 
created by the PDM requires the BSup to create a matching belief. Both the 
PDM and BSup process the same belief at all times. Beliefs may be terminated by 
either the PDM or BSup. If, for example, a belief is terminated by the BSup, the 
corresponding belief must be terminated within the PDM and vice-versa.
Within the BSup, beliefs are terminated by either the input-port or comparator.
The input-port terminates beliefs in one of two cases. First, if the path prescribed
by the PDM cannot be followed due to missing signals (i.e. a path directive cannot
be matched with any signal in the BSup). Second, if spurious signals have been 
generated by the environment that do not correspond with the path prescribed by 
the PDM (i.e. a signal in the BSup cannot be matched with any path directive). 
The comparator terminates beliefs if a match cannot be made between the contents 
of the expected/observed behavior queues. 
The belief generation/termination protocol used in the BSup is identical to that
used in the PDM. As an example, the reader is referred to figures 5.8 and 5.9.
The following sections describe the novel aspects of the BSup interpreter. The
discussion begins with the comparator followed by the BSup input port. The
discussion concludes with a commentary on BSup process scheduling.
6.5 Comparator 
The functions of the comparator are: (1) to store signals in a pair of observed /
expected behavior queues and (2) to compare the contents of the two queues. If a
match of the contents of the two queues can be made, the contents are annihilated.
If a match cannot be made, the current belief is terminated.
The comparator is presented as two algorithms. The QueueSignal algorithm
determines the source of signals and queues them in either the expected or observed
behavior queue. The ProcessContents algorithm matches signal contents of the two
queues. The two algorithms appear in figures 6.3 and 6.4. A description of the
major datastructures used by the algorithms follows.
OBQ: Observed behavior queue. A queue of elements of type signal. 
EBQ: Expected behavior queue. A queue of elements of type signal. 
T_GBSup: global time of the BSup.
6.5.1 Queue Signal Algorithm
The QueueSignal algorithm accepts as input the EBQ, OBQ and a signal to be
queued. It returns the new EBQ and OBQ after the signal has been queued. The
algorithm appears in figure 6.3.
Algorithm QueueSignal(EBQ : signal-queue, OBQ : signal-queue, s : signal)
1. if (source(s) == environment)
2.     append s to OBQ
3. else
4.     append s to EBQ
5. end if
6. return (EBQ, OBQ)
end Algorithm
Figure 6.3: Comparator Queue Signal Algorithm
6.5.2 Process Contents Algorithm
The ProcessContents algorithm compares the contents of the expected/observed
behavior queues. It accepts the EBQ, OBQ and the global time of the BSup as
input. It returns the new EBQ and OBQ. The algorithm appears in figure 6.4. A
textual summary of the algorithm follows.
line 1 the algorithm attempts to match the entire contents of the EBQ/OBQ
lines 4-6 matching signals in the EBQ/OBQ are annihilated
lines 7-9 if a match cannot be made the current belief is terminated
lines 12-17 if the EBQ is not empty and the global time of the BSup has advanced
past the OI of the signal at the head of the EBQ, the current signal will never
be matched. The current belief is terminated.
lines 18-23 if the OBQ is not empty and the global time of the BSup has advanced
past the OI of the signal at the head of the OBQ, the current signal will never
be matched. The current belief is terminated.
Algorithm ProcessContents(EBQ : signal-queue, OBQ : signal-queue, T_GBSup : time)
1. while ( (EBQ ≠ empty) and (OBQ ≠ empty) )
2.     x = head(EBQ)
3.     y = head(OBQ)
4.     if ( (x.name = y.name) and (x.OI overlaps with y.OI) )
5.         delete( head(EBQ) )
6.         delete( head(OBQ) )
7.     else
8.         output (Terminate-Belief)
9.         exit Algorithm
10.    end if
11. end while
12. if ( EBQ ≠ empty )
13.     if ( head(EBQ).OI.t_u < T_GBSup.t_l )
14.         output (Terminate-Belief)
15.         exit Algorithm
16.     end if
17. end if
18. if ( OBQ ≠ empty )
19.     if ( head(OBQ).OI.t_u < T_GBSup.t_l )
20.         output (Terminate-Belief)
21.         exit Algorithm
22.     end if
23. end if
24. return (EBQ, OBQ)
end Algorithm
Figure 6.4: Comparator Signal Matching Algorithm
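The two comparator algorithms can be sketched together in Python as follows. This is an illustrative rendering under stated assumptions: the `Signal` fields, the `TerminateBelief` exception (standing in for the Terminate-Belief output), the closed-interval overlap test and the `"environment"` source tag are all hypothetical names chosen for the example.

```python
from collections import deque
from typing import NamedTuple


class Signal(NamedTuple):
    name: str
    lo: float      # OI lower bound
    hi: float      # OI upper bound
    source: str    # "environment" for observed signals, anything else for expected


class TerminateBelief(Exception):
    """Raised where the pseudocode outputs Terminate-Belief."""


def overlaps(x: Signal, y: Signal) -> bool:
    # Assumption: OIs are closed intervals.
    return x.lo <= y.hi and y.lo <= x.hi


def queue_signal(ebq: deque, obq: deque, s: Signal):
    """Observed (environment) signals go to the OBQ, expected signals to the EBQ."""
    (obq if s.source == "environment" else ebq).append(s)
    return ebq, obq


def process_contents(ebq: deque, obq: deque, t_global_lo: float):
    """Annihilate matching heads; terminate on a mismatch or a stale head."""
    while ebq and obq:
        x, y = ebq[0], obq[0]
        if x.name == y.name and overlaps(x, y):
            ebq.popleft()
            obq.popleft()
        else:
            raise TerminateBelief("heads of EBQ and OBQ do not match")
    for q in (ebq, obq):
        # A head whose OI lies wholly before the global time can never be matched.
        if q and q[0].hi < t_global_lo:
            raise TerminateBelief("global time has passed an unmatched signal")
    return ebq, obq


ebq, obq = deque(), deque()
queue_signal(ebq, obq, Signal("ack", 1.0, 3.0, "bsup"))
queue_signal(ebq, obq, Signal("ack", 2.0, 4.0, "environment"))
process_contents(ebq, obq, t_global_lo=0.0)
```

Raising an exception is only one way to model belief termination; an implementation could equally return a status flag to the scheduler.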
6.5.3 Complexity Analysis 
The asymptotic running-time complexity is presented for the algorithms comprising
the comparator. The notation used is consistent with that introduced in
chapter 5.
Queue Signal Algorithm
The queue signal algorithm simply appends a signal to the appropriate queue. Its
running-time complexity is O(1). It is invoked once for each signal to be queued.
Signals are re-queued individually in each belief. For a worst case of N signals
queued, and a maximum of B beliefs, the running-time complexity of the algorithm
is given by 6.1.
Process Contents Algorithm 
The ProcessContents algorithm compares the contents of the two queues. For a
worst-case queue length of N, its running-time complexity is O(N). Queues are
duplicated in each belief and thus the running-time complexity of the algorithm is
given by 6.2.
6.6 Input Port 
The input port is specified as a collection of three algorithms: QueueSignal,
Annihilate and ConsumeSignal. The QueueSignal algorithm queues both BSup
signals and PDM path directives in the input port. Annihilate deletes a matching
BSup signal and PDM path directive from the input port and ConsumeSignal
performs the matching between path directives and BSup signals, keeping track of the
PDM/BSup global times.
The majority of type and datastructure definitions used by the algorithms are
consistent with those used to specify the PDM input port and appear in
section 5.5.3. In addition, the BSup input port must queue path directives from
the PDM and thus it requires an appropriate datastructure, defined below.
PDMQ: a queue of path directives of type signal from the PDM
6.6.1 Queue Signal Algorithm
The QueueSignal algorithm accepts as input the set of signal sequences corresponding
to the incoming channels/signal routes (C), the PDMQ, a sorted list of signals
at the heads of the incoming channels/signal routes (J) and the signal to be queued
(s). It returns the updated datastructures C, PDMQ and J.
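A minimal Python sketch of this queueing step is given below. The field names of `Signal`, the dictionary used for C (keyed by channel/signal-route id) and the `"PDM"` origin tag are assumptions made for illustration; J is kept sorted on the OI lower bound by placing that bound first in the tuple.

```python
import bisect
from collections import deque
from typing import NamedTuple


class Signal(NamedTuple):
    lo: float      # OI lower bound; first field, so tuples sort on it
    hi: float      # OI upper bound
    name: str
    path: str      # id of the channel / signal route the signal arrived on
    origin: str    # "PDM" for path directives


def queue_signal(C: dict, pdmq: deque, J: list, s: Signal):
    """Queue a path directive in PDMQ, or a BSup signal on its incoming path."""
    if s.origin == "PDM":
        pdmq.append(s)
    else:
        c = C.setdefault(s.path, deque())
        if not c:
            # s becomes the head of its path, so it joins the sorted list J.
            bisect.insort(J, s)
        c.append(s)
    return C, pdmq, J


C, pdmq, J = {}, deque(), []
queue_signal(C, pdmq, J, Signal(5.0, 6.0, "a", "c1", "BSup"))
queue_signal(C, pdmq, J, Signal(7.0, 8.0, "b", "c1", "BSup"))
queue_signal(C, pdmq, J, Signal(1.0, 2.0, "c", "c2", "env"))
queue_signal(C, pdmq, J, Signal(1.0, 2.0, "c", "c2", "PDM"))
```

After these calls J holds only the heads of the two paths, ordered by OI lower bound, and the directive sits alone in the PDMQ.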
Algorithm QueueSignal(C : signal-sequence-set, PDMQ : signal-queue,
J : signal-sequence, s : signal)
1. if (s.origin = PDM)
2.     insert s at the tail of PDMQ
3. else
4.     c_i = comm-path-id(s)
5.     if (c_i = empty)
6.         insert s, sorted into J based on the lower bound of each signal's OI
7.     end if
8.     append s to the tail of c_i
9. end if
10. return (C, PDMQ, J)
end Algorithm
Figure 6.5: Input Port Queue Signal Algorithm
6.6.2 Consume Signal Algorithm
The BSup ConsumeSignal algorithm has a similar function to the PDM ConsumeSignal
algorithm described in section 5.5.3. Unlike its PDM counterpart, the BSup
algorithm matches signals in the input port with the directives from the PDM. The
effect is that execution is steered along the PDM-specified path.
The algorithm accepts as input queued signals on the incoming channels/signal
routes (C), the sequence of signals at the heads of the channels/signal routes (J),
the PDMQ, the current process state of the associated BSup-process (σ) and the
global times of both the PDM and BSup (T_PDM and T_BSup). It returns the updated
datastructures C, J and PDMQ and outputs a signal to the corresponding process
for consumption.
The algorithm appears in figure 6.6. A textual summary of the algorithm follows.
Algorithm ConsumeSignal(C : signal-sequence-set, J : signal-sequence,
PDMQ : signal-queue, σ : process-state, T_PDM : time, T_BSup : time)
1. while ( (PDMQ ≠ empty) and (J ≠ empty) )
2.     x = head(PDMQ)
3.     if ( x is a non-matchable path directive )
4.         output ( Send-Signal(x) )
5.     else if (fields of x match with fields of a signal, s ∈ J
           and OIs of x and s overlap)
6.         output ( Send-Signal(x) )
7.         (C, J, PDMQ) = Annihilate(C, J, PDMQ)
8.     else
9.         output (Terminate-Belief)
10.        exit Algorithm
11.    end if
12.    if ( PDMQ ≠ empty )
13.        x = head(PDMQ)
14.        if ( x.OI.t_u < T_BSup.t_l )
15.            output (Terminate-Belief)
16.            exit Algorithm
17.        end if
18.    if ( J ≠ empty )
19.        x = head(J)
20.        if ( x.OI.t_u < T_PDM.t_l )
21.            output (Terminate-Belief)
22.            exit Algorithm
23.        end if
24. end while
25. return (C, J, PDMQ)
end Algorithm
Algorithm Annihilate(C : signal-sequence-set, J : signal-sequence,
PDMQ : signal-queue)
1. x = head(PDMQ)
2. delete-head(PDMQ)
3. remove signal z having identical fields as x and
   overlapping OIs from J
4. c = comm-path-id(z)
5. delete-head(c)
6. if (c ≠ empty)
7.     z = head(c)
8.     insert z, sorted into J based on the lower bound of each signal's OI
9. end if
10. return (C, J, PDMQ)
end Algorithm
Figure 6.6: BSup Input Port Signal Consumption Algorithm
Consume Signal Algorithm
line 1 the algorithm attempts to match all unsaved path directives from the PDM
with signals within the BSup
line 2 x represents the path directive for the current path from the PDM. Note
that path directives are identical to signals but are generated by the
PDM rather than internally within the BSup
lines 3-4 if the current path directive is a non-matchable directive (see section 6.1)
then it is consumed directly
lines 5-7 if the path directive matches with a signal in J, the signal is consumed.
The matching path directive and signal are deleted.
lines 8-11 if the path directive does not match with any signal in the BSup, the current
belief is terminated.
lines 12-17 if a path directive from the PDM exists but no signals exist to be
matched and the BSup global time has advanced past the OI of the path
directive, a signal will never be generated to match the directive. Thus the
current belief is terminated.
lines 18-24 if signals exist but no path directive has been generated and the PDM
time has advanced past the smallest OI of the signals, a matching path
directive will never be generated. The current belief is terminated.
Annihilate Algorithm
lines 1-2 the path directive corresponding to the followed path is deleted
lines 3-5 the consumed signal is deleted
lines 6-9 the consumable signal set is updated
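The matching and annihilation steps summarized above can be sketched in Python. This is a simplified illustration, not the figure's algorithm line for line: it assumes that a directive matches a signal when name and path agree and the OIs overlap, it removes a non-matchable directive from the PDMQ once it has been sent, and it performs the two staleness checks after the matching loop, as the textual summary describes them. All identifiers (`Signal`, `TerminateBelief`, `find_match`, `matchable`) are hypothetical.

```python
import bisect
from collections import deque
from typing import NamedTuple


class Signal(NamedTuple):
    lo: float               # OI lower bound (sort key for J)
    hi: float               # OI upper bound
    name: str
    path: str               # channel / signal route id
    matchable: bool = True  # False for ANY-Pi / none-Pi directives


class TerminateBelief(Exception):
    """Raised where the pseudocode outputs Terminate-Belief."""


def overlaps(x, s):
    return x.lo <= s.hi and s.lo <= x.hi


def find_match(J, x):
    """Return the signal in J matching directive x, or None."""
    for s in J:
        if s.name == x.name and s.path == x.path and overlaps(x, s):
            return s
    return None


def annihilate(C, J, pdmq, s):
    """Delete the head directive and matched signal s; expose the next head of s's path."""
    pdmq.popleft()
    J.remove(s)
    c = C[s.path]
    c.popleft()
    if c:
        bisect.insort(J, c[0])


def consume_signal(C, J, pdmq, t_pdm_lo, t_bsup_lo):
    sent = []                                  # directives passed on for consumption
    while pdmq and J:
        x = pdmq[0]
        if not x.matchable:
            sent.append(pdmq.popleft())        # consumed directly
            continue
        s = find_match(J, x)
        if s is None:
            raise TerminateBelief("directive matches no queued signal")
        sent.append(x)
        annihilate(C, J, pdmq, s)
    if pdmq and pdmq[0].hi < t_bsup_lo:
        raise TerminateBelief("no signal can still match the directive")
    if J and J[0].hi < t_pdm_lo:
        raise TerminateBelief("no directive can still match the signal")
    return sent


C = {"c1": deque([Signal(1, 3, "a", "c1"), Signal(4, 6, "b", "c1")])}
J = [C["c1"][0]]
pdmq = deque([Signal(2, 4, "a", "c1"), Signal(5, 7, "b", "c1")])
sent = consume_signal(C, J, pdmq, t_pdm_lo=0, t_bsup_lo=0)
```

The linear scan in `find_match` keeps the sketch short; the complexity analysis in the text assumes a logarithmic search of J.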
6.6.3 Complexity Analysis 
Queue Signal Algorithm
The complexity of the QueueSignal algorithm is dominated by line 6, which does an
insertion into a sorted list (J). As outlined in section 5.5.4, the maximum size of
J is the worst-case fan-in of the corresponding SDL process (c). The algorithm is
repeated for each signal queued (N) and each active belief (B). Thus the worst-case
running-time complexity of the algorithm is given by 6.3.
Annihilate Algorithm
The complexity analysis of the Annihilate algorithm is dominated by line 8, which
does an insertion into a sorted list (J). Thus the running-time complexity of this
algorithm is identical to the complexity of the QueueSignal algorithm and is given
by 6.4.
Consume Signal Algorithm
Due to the size of the ConsumeSignal algorithm, the running-time complexity is
presented on a line-by-line basis. Lines with O(1) running-time complexity are
omitted from the analysis.
line 1 the outer loop iterates once per signal-pair in the input port; its complexity
is O(N)
line 5 a binary search of J, whose worst-case size is c. The resultant complexity
is O(log c).
line 7 from above, the Annihilate algorithm has a running-time complexity of
O(log c).
The resultant complexity of the consume signal algorithm per belief is O(N
log c). The algorithm is re-executed for each belief. Thus the overall complexity is
given by 6.5.
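The steps of this argument can be written out as a short derivation. The labels $T_{\text{belief}}$ and $T_{CS}$ are introduced here only for illustration; the bound follows from the per-line costs stated above (a logarithmic search of J and a logarithmic Annihilate call inside a loop of at most N iterations).

```latex
\begin{align*}
T_{\text{belief}}(N, c) &= N \cdot \bigl( O(\log c) + O(\log c) \bigr) = O(N \log c) \\
T_{CS}(B, N, c)         &= B \cdot T_{\text{belief}}(N, c) = O(B \, N \log c)
\end{align*}
```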
6.7 Scheduling Process Execution within the BSup
Recall from the discussion in section 5.5.6 that the PDM determines the scheduling
order corresponding to target system execution. Given that the PDM-model and
BSup-model both contain identical SDL processes, the BSup must execute SDL
processes in the BSup model according to the scheduling order prescribed by the
PDM.
Scheduling order is prescribed by the PDM indirectly, through the path directives
it generates. For a path directive to be generated, a corresponding process must be executed
within the PDM. If processes within the BSup are executed in the same order as
path-directives are generated, the BSup will follow the scheduling order prescribed
by the PDM.
As in the PDM, scheduling is done by the system process in the BSup abstract
machine. Processes are executed in the order that path-directives are received from
the PDM. Recall that the system routes signals (and path directives) between SDL
processes. An SDL process within the BSup is queued for execution as path-directives
from the PDM are observed.
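The scheduling rule described above amounts to a first-in, first-out queue of process identifiers driven by the arrival of path directives. A minimal Python sketch follows; the class and method names are hypothetical and chosen only for this example.

```python
from collections import deque


class BSupScheduler:
    """Executes BSup processes in the order their path directives arrive."""

    def __init__(self):
        self.ready = deque()

    def on_path_directive(self, process_id):
        # Scheduling: O(1) append to the tail of the scheduling queue.
        self.ready.append(process_id)

    def next_process(self):
        # De-scheduling: O(1) removal from the head; None when nothing is queued.
        return self.ready.popleft() if self.ready else None


sched = BSupScheduler()
for pid in ["P2", "P1", "P2"]:          # order in which directives are observed
    sched.on_path_directive(pid)
run_order = [sched.next_process() for _ in range(3)]
```

Because both operations are constant-time, the queue itself adds no more than a constant cost per signal and belief.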
6.7.1 Complexity Analysis 
The complexity of the BSup scheduling algorithm is a function of the number of
processes scheduled. In the worst case, each process is scheduled for each signal
consumed. A scheduling/de-scheduling operation consists of an addition/deletion from
a scheduling queue. Both are O(1) operations. Thus the running-time complexity
of the scheduling algorithm for N signals and B beliefs is given by 6.6.
6.8 Time and Space Complexity of the BSup 
The complexity is analyzed based on the dominant algorithms described (i.e.
Consume Signal and Process Contents). The time and space complexities are presented
individually.
6.8.1 Time Complexity 
The running-time complexity of the BSup is dominated by the input port. Thus
for B beliefs, a maximum of N signals queued within the BSup and a worst-case
fan-in of c, from 6.5, the running-time complexity of the BSup is given by 6.7.
6.8.2 Space Complexity 
The space complexity of the supervisor is largely dependent upon the number of
beliefs generated (B). Each belief makes a duplicate of each signal (N), the state
of each process P, and the scheduling list of worst-case size N_p. Thus for a
specification of size S, the space complexity of the BSup is given by 6.8.
6.9 Time and Space Complexity of the Hierarchical Supervisor
6.9.1 Time Complexity
From equations 5.7 and 6.7, it is clear that the running-time complexity of the
hierarchical supervisor is dominated by the PDM. Conceptually this makes sense since
the PDM must identify the chosen behavioral alternative, a much more complex
task than merely detailed behavior checking. The running-time complexity of the
hierarchical supervisor is thus given by 6.9.
$$T_{HS}(B, N, t, c, N_p) = O(B \, N \, ((t + c) \log c + \log N_p)) \tag{6.9}$$
6.9.2 Space Complexity 
The space complexity of the PDM and BSup is largely dominated by the number
of beliefs generated as indicated by equations 5.8 and 6.8. The asymptotic space
complexity of the PDM and BSup is identical. Thus the overall space complexity
of the hierarchical supervisor is given by 6.10.
$$R_{HS}(B, N, N_p, S) = O(B \, (N + P_\sigma \cdot N_p + N_p) + S) \tag{6.10}$$
Chapter 7 
Evaluation
This chapter is organized into three parts. The first part provides an overview of the
structure and operation of a demonstration supervisor. The second part describes
the testbed (including target system) that was used to evaluate the supervisor. The
third part describes the experiments conducted to evaluate the supervisor.
7.1 Demonstration System 
A demonstration supervisor was developed based on the supervisor abstract
machines outlined in chapters 5 and 6. The top-level design of the supervisor, in the
object model notation [20], appears in figure 7.1.
The following two sub-sections describe the static function of each top-level
class appearing in figure 7.1 and the dynamic communication between classes under
common operational scenarios.
7.1.1 Class Description 
The principal difference between the specification of the PDM/BSup abstract
machines (appearing in figures 5.7 and 6.2 respectively) and the design of the supervisor
is that the system process is refined into several classes. The PDM system process
is refined into objects: EnvRouter, MainSched, PDMRouter and PDMSched.
Similarly, the BSup system process is refined into objects: EnvRouter, MainSched,
BSupRouter and BSupSched. Note that the EnvRouter and MainSched are shared
between the PDM and BSup.
The PDM/BSup path process functionalities are implemented by the PDMRouter
and BSupRouter classes respectively. The PDM/BSup input port and SDL
processes are implemented by the PDMInputPort, PDMProcess, BSupInputPort
and BSupProcess classes respectively. The BSup comparator process is implemented by
the Comparator class. A HandleFailure class, shared by both the PDM and BSup,
implements failure reporting once a failure has been detected. A detailed description
of the function of each class follows.
EnvRouter: Collects target system input and output signals. Tags all signals with
OIs.
MainSched: Specific functions of this class include:
• creates/terminates and manages the list of beliefs
• compacts beliefs representing identical global states (i.e. beliefs created
under don't care non-determinism)
• schedules for execution the PDM, BSup, and EnvRouter
PDMRouter/BSupRouter: Routes signals between processes within the PDM
/ BSup. Uses the system specification of the PDM/BSup models as the
communication topology. The PDMRouter includes additional functionality
for routing signals from the PDM to the BSup.
PDMSched/BSupSched: Schedule individual PDMProcess/BSupProcess objects
for execution. Scheduling is implemented as described in chapters 5 and 6.
For scheduling purposes, comparators are treated as SDL processes. Thus the
BSupSched class includes additional functionality to schedule the execution
of Comparator objects.
PDMInputPort/BSupInputPort: Queue and order signals for consumption by
the corresponding process. The PDMInputPort uses a partial-order distance
table to reduce redundant signal permutation. It creates beliefs when a unique
total order of signals cannot be determined. The BSupInputPort queues
signals into two groups: (1) signals generated by the environment and signals
generated internally within the BSup and (2) path directives generated by
the PDM.
PDMProcess/BSupProcess: Implement an SDL interpreter for each process.
The PDMProcess and BSupProcess are almost identical in functionality except
that the BSupProcess does not include support for ANY and none constructs
as described in section 6.1. The PDMProcess generates a separate belief for
each emanating path from an ANY construct and for multiple none constructs
emanating from a single state.
Comparator: Each process implements one expected and one observed behavior
queue per channel. Each Comparator process is responsible for queuing
signals in the appropriate queue, comparing the contents of queues, annihilating
identical queue contents and signaling for the current belief to be terminated
if a match between contents cannot be made.






HandleFailure: Reports a failure of the target system and terminates operation of
the hierarchical supervisor.
The supervisor was implemented in C++. It consists of approximately 110 different
classes, 1000 methods and 38,000 commented lines of source. The line counts of
the PDM, BSup and common components of the supervisor appear in table 7.1.
[Table 7.1: line counts of the PDM, BSup and common components; columns: Commented, Non-Commented]
7.1.2 Supervisor Operation
The operation of the hierarchical supervisor is described in several sections. Each of 
the sections describes one particular aspect of functionality within the supervisor. 
A textual overview of the functionality is presented, followed by an example of the 
methods invoked by each class under one particular scenario. 
Signal Routing 
Observed signals (i.e. inputs and outputs of the target system) are tagged with
OIs by the EnvRouter and transmitted to MainSched. For each belief in existence,
MainSched duplicates each signal and routes it to the belief. For the currently
executing belief, signals are routed to their appropriate destinations by the
PDMRouter and BSupRouter. As an example, the flow of control during routing of a
signal is shown in figure 7.2.
(a) PDM Signal Routing (objects: EnvRouter, MainSched, PDMRouter, PDMSched, PDMInputPort, PDMProcess, HandleFailure)
(b) BSup Signal Routing (objects: MainSched, BSupRouter, BSupSched, BSupInputPort, BSupProcess, Comparator, HandleFailure)
Figure 7.2: Signal Routing within the Hierarchical Supervisor
Scheduling SDL Processes and Comparators 
The two objectives of scheduling within the PDM are: (1) to order execution of
SDL processes and comparators such that expected signals match with the observed
signals (assuming that the target system is operating as specified) and (2) to reduce
the computational complexity of the supervisor by listing objects that are
ready-to-run, thus eliminating the need to exhaustively search all objects
to determine if they are ready-to-run. Processes are scheduled to execute when a
signal is queued in their input port. Comparators are scheduled to execute when
both their expected and observed input queues are non-empty.
In response to the second motivation for scheduling, there are three types of
scheduling within the hierarchical supervisor: (1) ready-to-run, (2) not-ready-to-run
and (3) not scheduleable. SDL processes with unsaved signals in their input
port, or comparators with signals in both their expected and observed behavior queues,
are classified as ready-to-run. SDL processes where every signal in the consumable
signal set is in the Save set and comparators with either the observed or expected
behavior queue empty and the other non-empty are classified as not-ready-to-run.
If the PDM/BSup global time advances past the OI of the signal with the smallest
OI lower bound in the input port/comparator, the process/comparator will never
be ready to run and as a result the currently executing belief is terminated. SDL
processes with no signals in their input ports are classified as not scheduleable since
they cannot execute.
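The three scheduling classes can be expressed as two small classification functions. The Python sketch below is illustrative only: the enum and function names are hypothetical, signals are represented by name, and a comparator with both queues empty is assumed to be not scheduleable, by analogy with a process whose input port is empty.

```python
from enum import Enum


class SchedState(Enum):
    READY = "ready-to-run"
    NOT_READY = "not-ready-to-run"
    NOT_SCHEDULEABLE = "not scheduleable"


def classify_process(input_port, save_set):
    """Classify an SDL process from its queued signal names and its Save set."""
    if not input_port:
        return SchedState.NOT_SCHEDULEABLE      # nothing queued, cannot execute
    if any(name not in save_set for name in input_port):
        return SchedState.READY                 # at least one unsaved signal
    return SchedState.NOT_READY                 # every queued signal is saved


def classify_comparator(ebq, obq):
    """Classify a comparator from its expected/observed behavior queues."""
    if ebq and obq:
        return SchedState.READY
    if ebq or obq:
        return SchedState.NOT_READY             # one queue empty, the other not
    return SchedState.NOT_SCHEDULEABLE          # assumption: both queues empty


state = classify_process(["connect", "release"], save_set={"release"})
```

An object in the not-ready-to-run class stays listed so that the scheduler can detect when global time passes the OI of its earliest signal and terminate the belief.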
An example of ready-to-run scheduling is shown in figure 7.2a. After a signal
is queued in the PDMInputPort, the PDMInputPort schedules itself to execute.
A PDMInputPort is re-scheduled only if the scheduling parameters of the process
change. Note that the PDMProcess is not scheduled; however, it is executed by the
PDMInputPort. Thus it executes after the PDMInputPort executes.
BSupInputPort/BSupProcess and Comparator ready-to-run scheduling operates similarly.
An example of not-ready-to-run scheduling is shown in figure 7.3. A signal is
queued in a comparator with an empty observed and expected behavior queue. The
flow of control within the supervisor for not-ready-to-run scheduling of a
PDMProcess and a BSupProcess is similar.
Figure 7.3: Scheduling a not-ready-to-run Comparator
Execution of a PDMProcess, BSupProcess or Comparator is initiated by the
MainSched. Initially, all ready-to-run processes in the PDM are executed, followed by
ready-to-run processes/comparators in the BSup. Both processes and comparators
must, after executing, re-schedule themselves based on the remaining signals in their
input ports or expected/observed queues. After execution, processes/comparators
may be in a ready-to-run, not-ready-to-run or not scheduleable state (if no
signals remain in their input port/queues). Objects in either the ready-to-run or
not-ready-to-run state must be scheduled as described in the previous section.
As an example, figure 7.4a shows the flow of control in the hierarchical
supervisor when executing a PDM process that becomes ready-to-run after execution.
Figure 7.4b illustrates the case where a comparator is executed and after execution
only the observed behavior queue (for example) is non-empty (i.e. the comparator
is not-ready-to-run).
Figure 7.4: (a) PDM Process Execution, (b) BSup Comparator Execution
As discussed, beliefs are created only by the PDM. Within the PDM, beliefs may be
created by a PDMProcess executing an ANY or none construct, the
PDMInputPort if ambiguity exists with regard to the actual signal consumption
order, or the PDMSched if uncertainty exists as to the actual process execution
order.
As an example, the flow of control during the creation of a belief by a
PDMProcess executing an ANY construct is shown in figure 7.5a. The other two cases
are handled in a similar fashion. Note that the thread of control remains with the
current belief. The new belief is subsequently scheduled by MainSched.







[Figure 7.5, panels: (a) Belief Creation (participants: MainSched, PDMRouter,
PDMSched, PDMInputPort, PDMProcess); (b) Belief Termination (participants:
MainSched, BSupRouter, BSupSched, BSupInputPort, BSupProcess, Comparator,
HandleFailure)]
The currently executing belief may be terminated within either the PDM or BSup.
Within the PDM/BSup, beliefs may be terminated by the PDMSched/BSupSched
if the PDM/BSup time advances past the OI of any signal in a PDMInputPort/
BSupInputPort that is not-ready-to-run². Additionally, a belief may be terminated




queue is non-empty and the BSup time advances past the OI of the signal at the
head of the non-empty queue or (2) if a match cannot be made between the heads
of the expected/observed signal queues.




queues do not match is illustrated in figure 7.5b. 
²Recall that PDM and BSup times are based on process times of processes that are
ready-to-run.
Redundant Belief Compaction 
Beliefs are generated in response to uncertainty as to the behavioral alternative
chosen by the target system. In some cases, typically resulting from the tradeoffs
made in partial-order signal consumption (described in section 5.3), n > 1 beliefs
may be generated that represent identical observable behavior. The redundant belief
compaction mechanism (RBCM) is used to terminate n - 1 of these beliefs.
Recall that a belief represents the global state of both the PDM and BSup.
Essentially the RBCM compares the global states represented by two beliefs and,
if identical, terminates one of the two beliefs. To reduce the computational cost of
the RBCM, two-level hashing is used during the comparison. Two hash functions,
$h_1()$ and $h_2()$, were developed such that for two beliefs, A and B, if the hash
values of either function are different then the two beliefs represent different
global states (i.e. if $h_{1,A}() \neq h_{1,B}()$ or $h_{2,A}() \neq h_{2,B}()$ then
$A \neq B$).
The first level hash simply takes into consideration the symbolic state of each
process and the number of signals in each input port/comparator queue. The second
level hash takes into consideration symbolic signal names and OIs of signals. If both
the first and second level hash functions are equal for the two beliefs, the global
state of the two beliefs is exhaustively compared before one of the two beliefs is
terminated.
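The two-level comparison can be sketched as follows. This is a minimal
illustration rather than the RBCM implementation: the Belief and Signal records,
the representation of an OI as a (start, end) pair, and the use of Python's
built-in hash are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signal:
    name: str   # symbolic signal name
    oi: tuple   # OI of the signal; a (start, end) pair is assumed here


@dataclass(frozen=True)
class Belief:
    process_states: tuple   # symbolic state of each process
    queues: tuple           # one tuple of Signal per input port/comparator queue


def h1(b: Belief) -> int:
    # First level: process states and per-queue signal counts only.
    return hash((b.process_states, tuple(len(q) for q in b.queues)))


def h2(b: Belief) -> int:
    # Second level: signal names and OIs, queue by queue.
    return hash(tuple(tuple((s.name, s.oi) for s in q) for q in b.queues))


def same_global_state(a: Belief, b: Belief) -> bool:
    if h1(a) != h1(b) or h2(a) != h2(b):
        return False    # differing hashes imply differing global states
    return a == b       # hashes agree: fall back to exhaustive comparison


def compact(beliefs):
    """Keep one representative of each distinct global state."""
    kept = []
    for b in beliefs:
        if not any(same_global_state(b, k) for k in kept):
            kept.append(b)
    return kept
```

The exhaustive comparison is only reached when both hashes agree, which is what
keeps the common case (different beliefs) cheap.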
From empirical measurements, the first-level hash is able to identify
approximately 70% of different beliefs and the second level 100% for a sample size of
several hundred beliefs.
The RBCM is invoked by the MainSched in one of two cases: first, if the
number of beliefs exceeds a threshold ($\Delta B$) and second, if the age of a belief
exceeds a threshold ($\Delta T$).
Failure Reporting 
Failures are reported by the hierarchical supervisor after all beliefs are terminated.
MainSched manages and schedules beliefs for execution. If the belief scheduling list
becomes empty, MainSched signals HandleFailure to report a failure of the target
system. The flow of control within the supervisor is illustrated in figure 7.6.
Figure 7.6: Generating a Failure Report
(Participants: EnvRouter, MainSched, PDMRouter, PDMSched, PDMInputPort,
PDMProcess, HandleFailure)
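The rule that a failure is reported only once no belief survives can be sketched
with a toy belief set. This is a simplified illustration, not the MainSched
algorithm: beliefs are reduced to plain state names, all beliefs advance in
lock-step on each observed signal, and supervise, toy_advance and the transition
table are hypothetical.

```python
def supervise(trace, initial_beliefs, advance, handle_failure):
    """Return True if some belief explains the whole trace, else report a failure."""
    beliefs = list(initial_beliefs)
    for signal in trace:
        survivors = []
        for belief in beliefs:
            # A belief yields no successors (terminated), one, or several
            # (a new belief per behavioral alternative).
            survivors.extend(advance(belief, signal))
        beliefs = survivors
        if not beliefs:
            # The belief list is empty: no alternative explains the observation.
            handle_failure(signal)
            return False
    return True


# Hypothetical non-deterministic specification fragment.
TRANSITIONS = {
    ("idle", "offhook"): ["await_digit", "await_onhook"],
    ("await_digit", "digit"): ["ringing"],
    ("await_onhook", "onhook"): ["idle"],
}


def toy_advance(state, signal):
    return TRANSITIONS.get((state, signal), [])
```

With this table, the trace offhook, digit leaves one surviving belief, whereas
offhook followed by an unspecified signal empties the belief list and triggers the
failure handler.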
7.2 Evaluation Testbed 
The control program of a small telephone exchange was used as the target system
on which the hierarchical supervisor was evaluated. The exchange serviced
60 telephones.
The exchange hardware was simulated and exchange software executed on a
UNIX workstation. A random telephone traffic generator served as a generator of
inputs. Several tools were used to analyze the traffic data generated. The various
components of the testbed and their interconnections are shown in figure 7.7. A
detailed description of each component follows.
Telephone Traffic Generator: The telephone traffic generator simulated typical,
random plain old telephone service (POTS) usage patterns. Several parameters
such as the origination rate and connect time were programmable,
allowing modeling of various load profiles. It executed as a single UNIX
process. The generator used in this work is described further in [47].
Hardware Emulator: This unit emulated the exchange hardware. It supported 
up to 60 telephones. The emulator executed as a single UNIX process. 
Hardware Interface Memory: The hardware interface memory represented the 
memory map of the exchange hardware. It was implemented as a contiguous 
block of UNIX shared memory. 
Call Processing Software: Provides functionality for all telephones serviced by
the exchange and manages shared hardware resources. The SDL requirements
specification and an overview of the call processing software can be found in
Appendix A.
Interface: Served two purposes: first, it provided a visual display of the state of
each telephone and second, it allowed manual telephone calls to be placed. Note
that use of the user interface in the latter case excludes use of the telephone
traffic generator.
Abstractor: Translated bit sequences appearing in the hardware interface memory
into signals as appearing in the SDL requirements specification.
Hierarchical Software Supervisor: Consists of the PDM-model, BSup-model
and interpreters as described in section 7.1.
Trace Manipulation Tools: Permitted seeding random failures at random points
in the trace file.
Trace Analysis Tools: A collection of utilities for analysis of telephone traffic
statistics. Parameters such as the number of originations, number of calls
routed to slow busy, number of calls routed to fast busy etc. are generated
from the contents of a trace file.
Behavioral Alternative Counter: A tool used to measure the total number of
behavioral alternatives (i.e. don't know and don't care) that arise under a
given requirements specification and traffic load over time.
The components of the testbed are written in five programming languages, as some
languages are more suitable for certain applications than others. The majority of
the testbed is written in C and C++; the Interface, which is largely graphical, is
written in Tcl/Tk; and the Trace Manipulation/Analysis Tools are written in Perl and
csh. The entire testbed consists of approximately 70,000 lines of commented source.
7.3 Evaluation
The hierarchical supervisor presented in this thesis was evaluated along two lines:
(1) its failure detection capability and (2) its time/space complexity.
The section begins with an experimental evaluation of the supervisor's ability
to detect failures and to simultaneously limit generation of false-failure reports.
An analysis of the size of the problem space (i.e. the total number of behavioral
alternatives) is presented next. This is followed by experimental evaluations of the
supervisor time and space complexity. The section concludes with a commentary
on the scalability of hierarchical supervision to industrial systems based on the
evaluations presented.
7.3.1 Failure Detection Capability 
The failure detection capability of hierarchical supervision was evaluated with the
aid of the target system described. The supervisor was used to monitor the exchange
for extended periods of time. The failure detection capability of the supervisor was
evaluated based on two attributes: (1) the supervisor's spurious failure reporting
and (2) the supervisor's failure detection capability. Both sets of evaluations are
presented in the following two sub-sections.
Spurious Failure Reporting 
Spurious failure reporting refers to the number of unwarranted failure reports
generated by the supervisor. It was evaluated by having the supervisor monitor the
operation of the target system for several thousand call originations. Typical
reliability requirements for North American telephone switching systems are that
up to two calls out of ten thousand may be mishandled. These requirements were used
as a guideline in setting the interval during which the supervisor was executed.
The supervisor was used to supervise several traces consisting of over twenty
thousand call originations ranging in origination rates from 2-6 calls/phone/hour.
The loads were chosen to range from heavy residential to heavy commercial
traffic levels. The target system call processing software was a third-generation
debugged version. The supervisor detected several failures in the output of the
exchange. Detected failures were subsequently traced back to either (1) faults in
the PDM/BSup model or (2) residual faults in the target system control program.
The faults in the PDM/BSup models were introduced during the transformation
of the models from the requirements specification and resulted from human error.
The results are summarized in table 7.2.
Table 7.2: Supervisor Failure Detection Capability

Fault Category      Number of Fault Types   Number of Instances
Supervisor Model    1                       1
Implementation      2                       6

One supervisor model fault type was detected by the supervisor. The supervisor
was able to detect and subsequently report the discrepancy. The supervisor model
fault was due to resources being incorrectly deallocated. When a call was placed and
the terminating party was busy, resources were not deallocated upon the originating
party going onhook. The supervisor reported a failure after all resources within
the supervisor model were depleted (i.e. after the effect of resources not being
deallocated became externally visible).
Two types of residual target system faults manifested themselves as externally
observable failures. The first related to the scanning of digits dialed by the user.
When waiting for the first digit, the control program disconnected and re-connected
the touch-tone receiver hardware as part of the process of removing dial-tone. The
connection/disconnection of the touch-tone receiver is not an externally observable
event. However, it was interpreted by the supervisor (or more precisely by the
abstractor) as two separate digits dialed. The second failure type was due to a
difference between a specified and implemented timeout duration. The supervisor
reported the external signal generated after the timeout as a failure, because it was
not expecting it at that time.
All failures detected by the supervisor were traced back either to faults in the
supervisor model or to residual faults in the target software system. Based on the
experiment conducted, no unwarranted failure reports were generated by the
hierarchical supervisor.
Failure Detection Capability
The evaluation of the supervisor failure detection capability is difficult due to the
large sizes of the trace files. Manual verification that a given trace represents a
behavior corresponding to the specification is almost impossible.
For this reason, the failure detection capability of the supervisor was evaluated
by seeding known failures into a trace representing the execution of the exchange.
The failure model consisted of altering the signals emitted during state
transitions [54]. Three types of failures were seeded: (1) signal removal, (2) signal
insertion and (3) signal replacement. Note that the final failure type is a combination
of the first two.
Two different types of evaluations were carried out: exhaustive and random.
They differed principally in how failures were seeded. Exhaustive evaluation is
better suited for use with small trace files due to its computational cost. Random
evaluation is better suited for use with large trace files. Evaluations of the
supervisor based on these two types of evaluations are described below.
Exhaustive evaluation refers to seeding all three types of failures at each line of
the trace file. Ten small trace files representing loads from 2-20 calls/phone/hour
were used. Each line of the trace was seeded with all three failure types,
representing a total of approximately 30 failures per line. The traces contained a total
of approximately 10 calls or 200 lines each. Thus a total of 10 x 30 x 200 = 60,000
failures were seeded in separate traces. The supervisor was executed on each individual
trace. The presence of all seeded failures was reported by the supervisor.
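Exhaustive seeding of the three failure types can be sketched as a generator that
produces one mutated trace per seeded failure. The trace representation (a list of
signal names), the signal alphabet and the function name seed_single_failures are
assumptions made for illustration; the actual trace manipulation tools were
written in Perl and csh and are not reproduced here.

```python
def seed_single_failures(trace, signal_types):
    """Yield (kind, line_index, mutated_trace), one seeded failure per mutant."""
    for i, original in enumerate(trace):
        # (1) signal removal: drop the signal on line i.
        yield "removal", i, trace[:i] + trace[i + 1:]
        for s in signal_types:
            # (2) signal insertion: add an extra signal before line i.
            yield "insertion", i, trace[:i] + [s] + trace[i:]
            if s != original:
                # (3) signal replacement: removal and insertion on the same line.
                yield "replacement", i, trace[:i] + [s] + trace[i + 1:]
```

The number of mutants per line depends on the size of the signal alphabet: with k
signal types this sketch yields 1 + k + (k - 1) mutants per line when the line's
own signal belongs to the alphabet. Each mutant is written to a separate trace, so
the supervisor sees exactly one seeded failure per run.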
Random evaluation refers to seeding randomly chosen failures at random
locations in a given trace file. Fifteen trace files representing approximately 20,000
calls at loads ranging between 2-20 calls/phone/hour were seeded with the three failure
types described. A total of approximately 60,000 failures were seeded into separate
traces. The supervisor was executed on each trace individually. The presence of all
seeded failures was reported by the supervisor.
7.3.2 Number of Legitimate Behavioral Alternatives 
The size of the supervisor problem space was estimated by measuring the number
of legal behavioral alternatives (BAs). Internally, BAs arise from specification
non-determinism under a particular input scenario. Within the supervisor, they are
represented as beliefs. A subset of the BAs generated by the supervisor actually
lead to different externally observable behavior.
The number of generated BAs is a function of the requirements specification
and the target system load. The small telephone exchange was run under several
different traffic loads. The maximum number of BAs generated by the supervisor
(i.e. beliefs) for each load is plotted in figure 7.8.
Further analysis on the number of generated BAs was done to determine the
proportion of don't care and don't know BAs. BAs were grouped into n sets:
$s_1, s_2, \ldots, s_n$. All BAs in set $s_i$ result in identical observable behavior
(i.e. don't care BAs), while any two BAs in sets $s_i$ and $s_j$, $i \neq j$,
represent different observable behavior (i.e. don't know BAs). Thus the total number
of don't know BAs is n and the total number of don't care BAs is equal to
$(\#s_1 + \#s_2 + \cdots + \#s_n - n)$ where # represents a set cardinality
operator. For the experiment described, the results are plotted in figure 7.8.
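The grouping can be sketched as follows, assuming each BA is available as a pair
of an identifier and the observable behavior it produces. The function name
count_alternatives and this representation are hypothetical and are not taken from
the Behavioral Alternative Counter tool.

```python
from collections import defaultdict


def count_alternatives(bas):
    """bas: iterable of (ba_id, observable_behavior) pairs.

    Returns (don't know count, don't care count) as defined in the text.
    """
    groups = defaultdict(list)
    for ba_id, behavior in bas:
        # BAs with identical observable behavior fall into the same set s_i.
        groups[behavior].append(ba_id)
    n = len(groups)                                       # number of sets
    dont_care = sum(len(g) for g in groups.values()) - n  # #s_1 + ... + #s_n - n
    return n, dont_care
```

For example, four BAs of which three share one observable behavior give n = 2
don't know BAs and 4 - 2 = 2 don't care BAs.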
As expected, the total number of behavioral alternatives is very large. This is
due to the worst case factorial number of possible signal interleavings at the input
ports of SDL processes, each leading to a legitimate BA. Few of these BAs actually
lead to different observable behavior, making the motivation for pruning such BAs
from consideration strong.
Figure 7.8: Measured Number of Beliefs
7.3.3 Number of Behavioral Alternatives Generated 
The number of behavioral alternatives generated is a key parameter in both the time
and space complexity of the hierarchical supervisor. The supervisor was executed on
the load described in section 7.3.2. The number of behavioral alternatives generated
is plotted in figure 7.9.
Figure 7.9: Number of Behavioral Alternatives Generated
(Horizontal axis: Call Traffic (calls/hour). Curves: Total Behavioral Alternatives;
Total Behavioral Alternatives Generated by Hierarchical Supervisor; Don't Know
Behavioral Alternatives)
As shown, the hierarchical supervisor significantly reduces the number of
behavioral alternatives considered. The majority of the BAs generated are don't care
BAs due to the tradeoffs made with partial-order signal consumption. As the load
on the exchange is increased, the number of don't know BAs increases. This is
principally due to resource starvation; the PDM is not able to determine which
of the two resources in the target system has been depleted based on the signals
observed (i.e. which path was followed). For the example system described, a PDM
would be able to accurately track the don't know BAs for an exchange with properly
provisioned resources.
The actual number of beliefs generated is highly application specific. It depends
on the requirements specification, the algorithm used to derive the PDM-model,
the load and the detailed implementation of algorithms in the PDM interpreter.
Empirical curve fitting revealed that for an exchange subject to a traffic load L,
the number of beliefs generated by the hierarchical supervisor is of order
$O(L \log L)$ as shown in figure 7.9. This is a substantial reduction from the
factorial number of total legitimate BAs.
7.3.4 Running-Time Complexity 
This section presents empirical validation of the running time complexity of the
hierarchical supervisor as given by equation 6.9. For the experiment described, the
worst case fan-in of each process (c), the number of signal types (t) and the
worst-case number of SDL processes $N_p$ are defined by the specification topology and
are treated as constants since the evaluation deals only with one target system. Thus
equation 6.9 reduces to $T_{HS}(B, N) = O(B \cdot N)$.
From the empirical analysis presented in section 7.3.3, B can be estimated as
$B = O(L \log L)$ where L represents the load on the exchange. N, representing
the number of signals in the supervisor, increases linearly with the load on the
exchange. Thus N can be approximated as $N = O(L)$. The resultant running-time
complexity of our example is thus given by equation 7.1.
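The substitution behind this result can be written out explicitly. The block below
is a worked reconstruction from the two estimates stated above, not a quotation of
the thesis equation; the single-argument form $T_{HS}(L)$ is notation assumed here
for illustration.

```latex
\begin{align*}
T_{HS}(B, N) &= O(B \cdot N), \qquad B = O(L \log L), \qquad N = O(L) \\
T_{HS}(L)    &= O\bigl((L \log L) \cdot L\bigr) = O(L^{2} \log L)
\end{align*}
```

The resulting order matches the reference curve against which the measured CPU
time is compared later in this section.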
The hierarchical supervisor was used to monitor the operation of the target
system at several different operational loads. As the load increased, the number of
beliefs generated increased (as described in section 7.3.3), increasing the CPU time
required by the supervisor per telephone call.
The supervisor CPU time per call was measured by running several hundred
calls and dividing the total supervisor running time by the number of originations.
The number of originations was made large to reduce the effect of the supervisor
initialization on the total running time. Supervisor running time was measured
using the UNIX getrusage system call.
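The measurement procedure can be sketched with Python's standard resource module,
which wraps the same getrusage call on UNIX systems. The supervisor itself is not
reproduced here; run_supervisor is a hypothetical callable standing in for it, and
counting user plus system time is an assumption about what was measured.

```python
import resource


def cpu_seconds() -> float:
    """User plus system CPU time consumed so far by this process."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return usage.ru_utime + usage.ru_stime


def cpu_time_per_call(run_supervisor, trace, originations: int) -> float:
    """Average CPU seconds per call origination for one supervised trace."""
    start = cpu_seconds()
    run_supervisor(trace)            # hypothetical stand-in for the supervisor
    return (cpu_seconds() - start) / originations
```

Averaging over a large number of originations, as described above, keeps the fixed
initialization cost from dominating the per-call figure.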
The CPU time per call is plotted in figure 7.10 for various operational loads.
Results were obtained on a machine having a SPECint95 and SPECfp95 of 1.0.
An $O(L^2 \log L)$ curve is plotted as a reference. As shown, good correspondence
between the predicted and measured running-time complexity was observed.
7.3.5 Space Complexity 
The predicted space complexity of the supervisor (given by equation 6.10) was
compared with the measured space complexity of the supervisor developed. For the
particular experiment described, only one specification was considered. Thus the
specification size S and the number of processes in the specification $N_p$ are
constant. The number of signals within the supervisor, N, is approximated as
$N = O(L)$ (as outlined in section 7.3.4). Thus the resultant space complexity of our
example is given by equation 7.2.
Figure 7.10: CPU Time Per Call
The exchange was executed over several different traffic loads. The maximum
memory usage of the supervisor was determined with the aid of the UNIX top
command³. The measured supervisor memory usage is plotted in figure 7.11 for
various loads. The constant size of the supervisor executable was subtracted from
the results plotted.
7.3.6 Scalability 
This section attempts to extrapolate the time complexity results presented to larger
systems. The results are meant only to serve as a general indicator of the
scalability of the approach. An actual system would introduce factors not taken into
consideration in the presentation below such as a larger requirements specification.
Most telephone exchanges have a modular organization to facilitate module reuse
and to allow ease of expandability. For example, the line interface module (LIM)
that interfaces subscribers' telephones with the central exchange controller services
approximately 1000 lines in both the Northern Telecom DMS-100 [57] and
the Lucent 5ESS [19]. It would be difficult to observe the inputs and outputs of the
entire exchange. However, a supervisor would be suitable for monitoring a single
module such as the LIM.
³The supervisor contains an internal memory manager. Memory, when deallocated, is
returned to the supervisor memory pool rather than the operating system memory pool.
Thus the maximum memory usage of the supervisor results just before the supervisor
completes its execution.
Figure 7.11: Maximum Memory Usage
The CPU requirements of the supervisor were estimated based on maximum
business traffic (i.e. 6 calls/phone/hour). The supervisor is assumed to monitor
a LIM servicing 1000 telephones at the standard business origination rate of 6
calls/hour/phone. Extrapolating from figure 7.10, the supervisor running on a
machine having a SPECint95 and SPECfp95 of 1.0 requires approximately 3 CPU
seconds/call at this load. The LIM is required to process 6 x 1000 calls/hour or
1.67 calls/second. Thus a CPU having a SPECint95 and SPECfp95 greater than




CHAPTER 8. CONCLUSIONS
This thesis addressed the automatic detection of software failures, or software
supervision. The software supervisor is a unit that monitors the inputs and outputs
of a given target software system. It makes use of the target system's requirements
specification as a definition of correct behavior. Discrepancies between specified
and observed behaviors are reported as failures by the supervisor.
The complexity and sophistication of modern software systems make automatic
detection of failures an industrially important area of research. Three potential
applications of supervision include on-line detection of failures, evaluation of testcase
results during software development and the collection of accurate failure data to
identify problem areas and improve the reliability of software.
This thesis focuses on the supervision of real-time reactive software systems.
This class of systems represents some of the largest and most complex software
ever developed. The case where the requirements specification of the target system
external behavior appears in a finite state machine based formalism is considered.
Software supervision is a highly complex activity. Several open research issues
related to supervision exist. The principal issue addressed by this work is the
computationally efficient handling of specification non-determinism. Non-determinism
is an important component of a specification formalism. It permits the
specification writer to avoid stating irrelevant aspects of behavior. This leaves freedom
to the software designer to choose the least costly or otherwise desirable
alternative. However, the supervisor must be able to consider all legitimate behavioral




alternatives exist even for moderate size systems and exhaustive 
all alternatives is prohibitive. Hierarchical supervision addresses 
8.1 Hierarchical Supervision 
A novel approach to supervision, called hierarchical supervision, was proposed. The 
objective of the approach is the efficient handling of specification non-determinism. 
Hierarchical supervision improves the efficiency of non-determinism handling
by a divide-and-conquer approach. Supervision is split into two sub-problems: (1)
determination of the path through the specification chosen by the target system and
(2) detailed behavior checking. The corresponding architecture of the hierarchical
supervisor has two layers. The path detection module (PDM) determines the path
through the specification chosen by the target system while the base supervisor
(BSup) checks that the followed path was actually the legitimate one.
The functionality underlying the BSup has been studied extensively and is
relatively well understood. However, the PDM has not been addressed previously. The
major focus of the thesis is on the PDM.
The PDM relies on signals, generated by the target system, that uniquely
identify the followed path through the target system. The precision of tracking the
target system improves with the availability of signals that uniquely identify the
path followed.
Hierarchical supervision is best suited for target systems where the average
uniqueness of signals used by the PDM to track the target system (i.e. PDM-model
stimuli)¹ is greater than the average uniqueness of the requirements specification
stimuli. The chosen behavioral alternative is identified by the PDM based on
a subset of the signals directed to/from the target system. Unique signals may
be mapped to fewer state transitions than less unique ones. As a result, fewer
behavioral alternatives need be considered by the supervisor. The average number
of behavioral alternatives explored by a hierarchical supervisor decreases as the
average uniqueness of signals chosen to track the target system increases.
8.2 Major Research Contributions 
This thesis presented five major research contributions to cost-effective automatic
detection of software failures in the presence of specification non-determinism: (1)
the notion of splitting supervision into two sub-problems: path determination and
detailed behavior checking, (2) improvement of the accuracy of path
determination by the use of both target system input and output signals, (3) exploration of
the tradeoffs in having the supervisor lag the target system, (4) development of
a method for pruning alternatives arising from specification non-determinism not
leading to different observable behaviors and (5) development of a base supervisor,
a directed simulator for detailed behavior checking. A further description of each
contribution follows.
¹An underlying assumption is that a suitable metric of uniqueness exists. The reader
is referred to section 4.2.1 for a definition of one such metric.
Splitting supervision into two sub-problems separates the two fundamental
functionalities of the supervisor. It may be considered a divide-and-conquer approach
to reducing the computational cost of supervision. The two components of the
hierarchical supervisor which implement these functionalities differ substantially
in their purpose, design and implementation. The result is that each component
implements a more specialized function than a monolithic supervisor, allowing for
improved efficiency and a conceptually simpler implementation.
Hierarchical supervision makes use of both target system input and output
signals to determine the path chosen by the target system. Thus the occurrence of
a state transition in the requirements specification may be determined to have taken
place by either an input or output signal. This improves the use of the information
provided about the path traversed by the target system. From the perspective
of the supervisor, it reduces the number of behavioral alternatives that need to
be considered. However, it complicates the derivation process of the PDM-model,
which must ensure that sequences of state transitions in the PDM-model follow a
similar causal path as in the requirements specification.
A supervisor that has the capability to lag the target system by a sufficiently
long period $\Delta$ (or an out-of-time supervisor) needs only consider what happened
rather than what may happen. The advantage of the approach is that only a subset
of the behaviors need to be considered by such a supervisor. In addition, signals
generated by the target system may be stored, allowing the supervisor to process
peak target system activity over a longer period of time. The tradeoff with
out-of-time supervision is the increased space required to store signals
generated by the target system during the interval $\Delta$ in addition to the latency of
failure reporting by a worst-case period, $\Delta$.
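The space and latency tradeoff can be illustrated with a small buffer that
withholds each signal until the lag has elapsed. This is a sketch under simplifying
assumptions (a single time-ordered signal stream and a numeric clock); the class
LagBuffer and its methods are hypothetical and do not describe the supervisor's
actual signal storage.

```python
from collections import deque


class LagBuffer:
    """Hold observed signals for `delta` time units before releasing them."""

    def __init__(self, delta):
        self.delta = delta
        self.pending = deque()   # (timestamp, signal) pairs in arrival order
        self.max_stored = 0      # peak storage: the space cost of lagging

    def push(self, timestamp, signal):
        self.pending.append((timestamp, signal))
        self.max_stored = max(self.max_stored, len(self.pending))

    def release(self, now):
        """Return the signals that are at least `delta` old at time `now`."""
        ready = []
        while self.pending and self.pending[0][0] <= now - self.delta:
            ready.append(self.pending.popleft())
        return ready
```

A larger lag lets the supervisor see more of what actually happened before
committing to an alternative, at the cost of a larger peak buffer and a failure
report that can trail the failure by up to the lag.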
In many requirements specifications and operational scenarios, a number of
behavioral alternatives arise from specification non-determinism that do not lead to
different externally observable behaviors. Partial-order techniques were proposed
to prune such alternatives from consideration. The approach makes use of static
information compiled from the requirements specification. Static information is used
to dynamically discard alternatives arising from specification non-determinism not
leading to unique externally observable behavior. A spectrum of such algorithms
can be envisioned, each suited for different applications. However, in general, a
tradeoff exists between the time/space resource requirements of the approach and
its capability to prune behavioral alternatives.
At the core, a software supervisor must have a simulator to generate expected
behaviors of the target system. Expected behaviors are compared with observed
behaviors, and failures are reported if a match cannot be made. A typical
simulator chooses a behavioral alternative in the presence of non-determinism.
However, the proposed simulator (i.e. the BSup) is directed by the PDM along the
behavioral alternative chosen by the target system.
8.3 Future Work 
The fundamental contributions described may have applications beyond those
described in this thesis. Future work is sub-divided into three categories: (1) further
reductions in computational complexity arising from specification non-determinism,
(2) continuation of supervision after detection of a failure and (3) alternate
applications of the described work. A discussion of each follows.
This thesis described a hierarchical approach to software supervision consisting
of two layers. Experience gained in domains such as artificial intelligence
planning indicates that N-layer problem solving is a principal means of dealing with
computational complexity [33].
The two-layer approach to supervision could be extended into an N-layer
approach by abstracting paths and successively resolving paths at lower layers in the
supervisor. Several state transitions could be abstracted into a single, aggregate
state transition. Upon determination that the aggregate state transition has taken
place, the supervisor effectively knows the destination state. Subsequently, lower
layers could then resolve the actual path from the previous composite state to the
current composite state. It is believed that such an approach would yield further
reductions in computational complexity provided that sufficiently unique signals
exist to track the target system. The tradeoff with the approach is the increased
delay in failure reporting.
Supervision requires that the state of the target system and the specification
state of the supervisor be in-sync. However, few assumptions can be made about the
post-failure specification state of a target system. For supervision to continue,
an approach to determining the state of the target system after the occurrence of
a failure is needed.
The notion of path detection is similar to the notion of state detection. Path
detection attempts to identify the state transition that took place from the
emanating state. State detection requires that the state transition that took place,
and consequently the final state, be identified without a notion of the current
state. Thus the research contributions developed for path detection, such as
tracking the execution path by means of unique signals and delaying the reporting of
the execution path, may be applied to state detection. One obvious difficulty is the
enormous potential state space. Research results indicate a tradeoff between the
amount of time spent determining the state and the computational complexity of
the approach [35, 52]. A unit to determine the post-failure state will probably have
to lag the target system by an interval greater than that of the PDM.
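One simple way to picture state detection is as the narrowing of a set of candidate states. The sketch below is an assumption-laden illustration, not the method of the thesis: the specification SPEC, its states, and its signals are hypothetical, and the specification is taken to be deterministic. Every state is initially possible, and each observed signal discards the candidates that cannot produce it.

```python
# Hypothetical deterministic specification: state -> {observed signal: next state}.
SPEC = {
    "S0": {"a": "S1", "b": "S2"},
    "S1": {"a": "S1", "c": "S3"},
    "S2": {"c": "S0"},
    "S3": {"d": "S0"},
}


def detect_state(spec, observed):
    """Narrow the set of possible current states as signals are observed.

    Returns (candidates, signals_consumed). Detection succeeds once exactly
    one candidate remains; an empty set means that no state of the
    specification explains the observations.
    """
    candidates = set(spec)  # after a failure, any state is possible
    for used, sig in enumerate(observed, start=1):
        candidates = {spec[s][sig] for s in candidates if sig in spec[s]}
        if len(candidates) <= 1:
            return candidates, used
    return candidates, len(observed)


if __name__ == "__main__":
    print(detect_state(SPEC, ["a"]))       # one signal is enough here
    print(detect_state(SPEC, ["c", "d"]))  # "c" alone is ambiguous
```

The number of signals consumed before a single candidate remains corresponds to the lag discussed above, and the size of the initial candidate set reflects the state-space difficulty.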
The work on path detection as described may have several applications
other than supervision. For example, a PDM with a properly instrumented model
may be used as a quality of service (QoS) monitor. For systems that have large
amounts of internal state, simple assertion checking is typically not suitable. The
out-of-time orientation of the described PDM is naturally suitable for monitoring
QoS. Other applications include resource utilization monitoring and specification
coverage metering for applications such as software testing.
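As an illustration of specification coverage metering, the following sketch computes which specified transitions were exercised by a set of reported execution paths. The representation of a transition as a (source state, signal, destination state) tuple and the function name coverage are assumptions made for this example only.

```python
def coverage(spec_transitions, executed_paths):
    """Return (fraction of specified transitions exercised, uncovered ones).

    spec_transitions: iterable of (source, signal, destination) tuples.
    executed_paths:   iterable of paths, each a list of such tuples, as a
                      path detector might report them (assumed format).
    """
    specified = set(spec_transitions)
    if not specified:
        return 1.0, []  # nothing to cover
    seen = {t for path in executed_paths for t in path if t in specified}
    return len(seen) / len(specified), sorted(specified - seen)


if __name__ == "__main__":
    spec = [("S0", "a", "S1"), ("S1", "c", "S3"),
            ("S3", "d", "S0"), ("S0", "b", "S2")]
    paths = [[("S0", "a", "S1"), ("S1", "c", "S3")], [("S0", "a", "S1")]]
    print(coverage(spec, paths))
```

The list of uncovered transitions is the part a tester would act on, since it names the specified behavior that no test run has yet exercised.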
Bibliography 
[1] R.P. Almquist, J.R. Hagerman, R.J. Hass, R.W. Peterson, and S.L. Stevens.
Software protection in No. 1 ESS. In International Switching Symposium
Record, pages 565-569. IEEE, 1972.
[2] Rajeev Alur, Costas Courcoubetis, and Mihalis Yannakakis. Distinguishing
tests for non-deterministic and probabilistic machines. In Proceedings of the
27th Annual ACM Symposium on the Theory of Computing, pages 363-372,
1995.
[3] Algirdas Avizienis. The N-version approach to fault-tolerant software. IEEE
Transactions on Software Engineering, SE-11(12):1491-1501, December 1985.
[4] J.M. Ayache, P. Azema, and M. Diaz. Observer, a concept for on line detection
for control errors in concurrent systems. In 9th International Symposium on
Fault-Tolerant Computing, pages 79-86. IEEE, 1979.
[5] J.M. Ayache, J.P. Courtiat, and M. Diaz. Self-checking software in distributed
systems. In 3rd International Conference on Distributed Computer Systems,
pages 163-170. IEEE, 1982.
[6] F. Bacchus and Q. Yang. The expected value of hierarchical problem-solving.
In Proceedings of the Annual National Conference on Artificial Intelligence.
American Association for Artificial Intelligence (AAAI), 1992.
[7] F. Belina, D. Hogrefe, and A. Sarma. SDL with Applications from Protocol
Specification. Prentice Hall International, 1991.
[8] M. Blum and H. Wasserman. Program result-checking: A theory of testing
meets a test of theory. In 35th Annual Symposium on Foundations of Computer
Science (FOCS '95), pages 382-392, 1994.
[9] M. Blum and H. Wasserman. Reflections on the Pentium division bug. IEEE
Transactions on Computers, 45(4):385-393, April 1996.
[10] D.B. Brown, R.F. Roggio, J.H. Cross, and C.L. McCreary. An automated
oracle for software testing. IEEE Transactions on Reliability, 41(2):272-279,
June 1992.
[11] Alan Burns and Andy Wellings. Real-Time Systems and their Programming
Languages. Addison-Wesley, 1990.
[12] S.E. Chodrow, F. Jahanian, and M. Donner. Run-time monitoring of real-time
systems. In Proceedings of the Real-Time Systems Symposium, pages 74-83.
IEEE, 1991.
[13] John R. Connet, Edward J. Pasternak, and Bruce D. Wagner. Software defenses
in real-time control systems. In Fault-Tolerant Computing, pages 94-99, June
1972.
[14] M. Diaz, G. Juanole, and J.P. Courtiat. Observer - a concept for formal on-line
validation of distributed systems. IEEE Transactions on Software Engineering,
20(12):900-913, December 1994.
[15] D. Lee, A. Netravali, K. Sabnani, and B. Sugla. Passive testing and applications
to network management. To appear.
[16] D. Lee and M. Yannakakis. Principles and methods of testing finite state
machines - a survey. In Proceedings of the IEEE, volume 84, pages 1090-1126,
1996.
[17] P.S. Dodd and C.V. Ravishankar. Monitoring and debugging distributed
real-time programs. Software - Practice and Experience, 22(10):863-877, October
1992.
[18] P. Edwards, editor. The Encyclopedia of Philosophy, volume 2. Collier-Macmillan
Limited, London, 1967. Pages 359-378.
[19] H.G. Holland et al. The 5ESS-2000 switch: Exceeding customer expectations.
AT&T Technical Journal, 73(6):28-38, November/December 1994.
[20] J. Rumbaugh et al. Object-Oriented Modeling and Design. Prentice Hall,
1991.
[21] Peter G. Bishop et al. PODS - A Project on Diverse Software. IEEE
Transactions on Software Engineering, SE-12(9):929-940, September 1986.
[22] International Organization for Standardization. A Formal Description
Technique based on Temporal Ordering of Observational Behavior. ISO/IEC 8807,
Geneva, 1989.
[23] International Organization for Standardization. A Formal Description
Technique based on Extended State Transition Model. ISO/IEC 9074, Geneva, 1990.
[24] P. Godefroid. Partial-Order Methods for the Verification of Concurrent
Systems - An Approach to the State Explosion Problem. PhD thesis, Université
de Liège, Faculté des Sciences Appliquées, October 1994.
[25] D.B. Hay. A belief method for detecting operational failures in soft real time
systems. Master's thesis, University of Waterloo, Department of Electrical and
Computer Engineering, Waterloo, Ontario, Canada N2L 3G1, 1991.
[26] D.B. Hay and R.E. Seviora. A real-time validator. In Proceedings of the Third
IEE International Conference on Software Engineering for Real-Time Systems,
pages 199-204. IEE, 1991. IEE Publication Number 344.
[27] C.A.R. Hoare. Communicating Sequential Processes. Prentice-Hall
International, 1985.
[28] G. Holzmann and D. Peled. An improvement in formal verification. In
Proceedings of the Seventh International Conference on Formal Description
Techniques (FORTE '94), Berne, Switzerland, October 1994.
[29] Yennun Huang and Chandra Kintala. Software implemented fault tolerance:
Technologies and experience. In The Twenty-Third Annual International
Symposium on Fault-Tolerant Computing (FTCS-23), pages 2-9, Los Alamitos,
California, June 1993. IEEE Computer Society Press.
[30] Radu Iorgulescu. Resynchronization in supervision of soft real-time systems.
Master's thesis, University of Waterloo, Department of Electrical and
Computer Engineering, Waterloo, Ontario, Canada N2L 3G1, 1991.
[31] Peter Klein. The safety-bag expert system in the electronic railway interlocking
system Elektra. In EXPERTSYS-90, pages 177-182, 1990.
[32] J.C. Knight and N.G. Leveson. An experimental evaluation of the assumption
of independence in multi version programming. IEEE Transactions on Software
Engineering, SE-12(1):96-109, January 1986.
[33] Craig A. Knoblock. Automatically Generating Abstractions for Problem
Solving. PhD thesis, Carnegie Mellon, School of Computer Science, Pittsburgh,
PA 15213, May 1991. CMU-CS-91-120.
[34] Craig A. Knoblock, Josh D. Tenenberg, and Qiang Yang. Characterizing
abstraction hierarchies for planning. In Proceedings of the Annual National
Conference on Artificial Intelligence, pages 692-697. American Association for
Artificial Intelligence (AAAI), 1991.
[35] Radmila Kovacevic. A resynchronization scheme for belief-based real-time
software supervision. Master's thesis, University of Waterloo, Department of
Electrical and Computer Engineering, Waterloo, Ontario, Canada N2L 3G1,
1991.
[36] Robert Kowalski. Logic for Problem Solving. Artificial Intelligence Series.
North Holland, New York, 1979.
[37] Vipin Kumar. Algorithms for constraint-satisfaction problems: A survey. AI
Magazine, 13(1):32-44, Spring 1992.
[38] P.A. Lee and T. Anderson. Fault Tolerance - Principles and Practice, volume 3
of Dependable Computing and Fault-Tolerant Systems Series. Springer-Verlag,
Wien, New York, 2nd edition, 1990.
[39] J.J. Li. A Real-Time Software Supervision Approach for Automatic Failure
Detection. PhD thesis, University of Waterloo, Department of Electrical and
Computer Engineering, Waterloo, Ontario, Canada, N2L 3G1, 1996.
[40] Luqi, H. Yang, and X. Zhang. Constructing an automated testing oracle: An
effort to produce reliable software. In Computer Software and Applications
Conference, pages 228-233. IEEE, 1994.
[41] M.N. Meyers, W.A. Routt, and K.W. Yoder. No. 4 ESS: Maintenance
Software. The Bell System Technical Journal, 56(7):1139-1167, September 1977.
[42] Kenneth P. Parker. The Boundary-Scan Handbook. Kluwer Academic
Publishers, Norwell, Massachusetts 02061, 1992.
[43] Brian Penney. The DMS-100: A switch that takes care of itself. Telesis,
Four:41-43, 1980.
[44] S. Poledna. Fault-Tolerant Real-Time Systems - The problem of Replica
Determinism. Kluwer Academic Publishers, 1996.
[45] Danny Prairie. Supervising a class of software systems with two weakly inter- 
acting functionalities. Master's thesis, University of Waterloo, Department of 
Electrical and Computer Engineering, Waterloo, Ontario, Canada N2L 3G1, 
1996. 
[46] Roger S. Pressman. Software Engineering - A Practitioner's Approach.
McGraw-Hill Inc., third edition, 1992.
[47] Scott Reid. Reduced model supervision: Quantifying trade-offs in failure
detection and computational complexity. Master's thesis, University of Waterloo,
Waterloo, Ontario, Canada N2L 3G1, 1996.
[48] Elaine Rich and Kevin Knight. Artificial Intelligence. McGraw-Hill Inc., 1991.
[49] D.J. Richardson, S.L. Aha, and T.O. O'Malley. Specification-based test oracles
for reactive systems. In Proceedings of the 14th International Conference on
Software Engineering, pages 105-118, 1992.
[50] S. Sankar and M. Mandal. Concurrent runtime monitoring of formally specified
programs. IEEE Computer, pages 32-41, March 1993.
[51] T. Savor and R.E. Seviora. User-oriented supervision of SDL-specified software.
In Fourth Bellcore/KPN/Purdue Workshop on Issues in Software Reliability,
October 1995.
[52] T. Savor and R.E. Seviora. An approach to automatic detection of software 
failures in real-time systems. In Proceedings of the IEEE Real- Time Technology 
and Applications Symposium, 1997. To appear. 
[53] E. Schoitsch, E. Dittrich, S. Grasegger, D. Kropfitsch, A. Erb, P. Fritz, and
H. Kopp. The Elektra testbed: Architecture of a real-time test environment for
high safety and reliability requirements. In Proceedings of the 1990 IFAC/IFIP
Symposium on Safety of Computer Control Systems (SAFECOMP '90), pages
59-65, 1990.
[54] D.P. Sidhu and T.K. Leung. Formal methods for protocol testing: A detailed
study. IEEE Transactions on Software Engineering, 15(4):413-426, April 1989.
[55] D. Simser and R.E. Seviora. Supervision of real-time systems using optimistic
path prediction and rollbacks. In Proceedings of the Seventh International
Symposium on Software Reliability Engineering (ISSRE '96), pages 340-349.
IEEE, 1996.
[56] David Simser. Supervision of real-time systems using optimistic path
prediction and rollbacks. Master's thesis, University of Waterloo, Waterloo, Ontario,
Canada N2L 3G1, 1996.
[57] R. Swan. DMS-100 family evolution. BNR Telesis, 3(2):2-9, 1983.
[58] International Telegraph and Telephone Consultative Committee. Functional
Specification and Description Language, Recommendations Z.100-Z.104, (Blue
Book). International Telecommunication Union, Geneva, 1989.
[59] D. Toggweiler, J. Grabowski, and D. Hogrefe. Partial order simulation of SDL
specifications. In R. Braek and A. Sarma, editors, Proceedings of the 7th SDL
Forum. Elsevier, 1995.
[60] University of Waterloo, Department of Electrical and Computer Engineering.
ECE 455 Software engineering course project, PBX hardware description.
Version 1.2.
[61] Pierre Wolper and Patrice Godefroid. Partial-order methods for temporal
verification. In Eike Best, editor, 4th International Conference on Concurrency
Theory (CONCUR '93), pages 233-246, Hildesheim, Germany, August 1993.
Springer-Verlag.
[62] ITU-T Recommendation Z.100. Specification and Description Language, SDL.
International Telecommunication Union, Geneva, 1992.
Appendix A 
Target System Specification
This appendix contains a full specification for a private branch telephone exchange
(PBX). The PBX consists of 60 telephones as shown in figure A.1. Each telephone
is allocated a two-digit telephone number. To simplify the system, inter-PBX calls
are not allowed. A description of the PBX hardware can be found in [60].
The specifications that follow are for the control program of the PBX and are
given in SDL/GR. Figure A.2 illustrates the system specification of the PBX as
consisting of two types of processes, the Phone_Handler and the Resource_Manager.
The finite state machine specification of the Phone_Handler appears in figures A.3
and A.4. The Net_Path_Manager appears in figure A.5 and the TTRX_Manager in
figure A.6.
The Phone_Handler is responsible for the behavior associated with both
originating and terminating telephones. The Resource_Manager controls access to
shared resources which are required for the duration of a call. Each of the 60
telephones in the PBX is allocated an individual Phone_Handler process.
Tables A.1 and A.2 supplement the SDL specifications by giving a textual
description of all signals used. A brief, textual summary of each type of process
follows.
Figure A.1: Private Branch Exchange (PBX)
A.1 Phone_Handler
The Phone_Handler process (figures A.3 and A.4) specifies all observable signal
sequences of an originating telephone. Phone_Handler is responsible for obtaining
all resources required for the duration of a telephone call in addition to establishing
a voice path between the originator and terminator.
Signal CR_Con(x) is assumed to be routed to the Phone_Handler process which
has been assigned to the telephone whose directory number matches the dialed
number. This is omitted from the SDL specification to limit specification size
and complexity. Pairs of Phone_Handler processes corresponding to the originator
and terminator of a telephone conversation communicate via implicit SDL signal
routes. Conceptually, a bi-directional signal route exists between each pair of
Phone_Handler processes.
A.2 Touch Tone Receiver Manager (TTRX_Manager)
The TTRX_Manager process (figure A.6) arbitrates allocation of touch tone
receivers (TTRXs), which are required during dialing. Resources are requested and
released by signals Get_ttrx and Rel_ttrx respectively. Similarly, resources are granted
and indicated as not being available by signals Grant_ttrx and NG_ttrx respectively.
A.3 Network Path Manager (Net_Path_Manager)
The Net_Path_Manager process (figure A.5) arbitrates allocation of network paths
through the exchange, which are required for the duration of the call. Resources
are requested and released by signals Get_path and Rel_path respectively. Similarly,
resources are granted and indicated as not being available by signals Grant_path
and NG_path respectively.
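The request, grant, and release behavior shared by the two resource managers can be sketched as follows. This is a minimal illustration and not the SDL specification itself: the class name, the pool size, the underscore spelling of the signal names, and the rule that a sender already holding a resource is refused a second one are all assumptions made for the example.

```python
class ResourceManager:
    """Arbitrates a fixed pool of identical resources (illustrative sketch)."""

    def __init__(self, name, capacity):
        self.name = name      # "path" or "ttrx", used to build signal names
        self.free = capacity  # pool size is an assumption, not from the spec
        self.holders = set()  # senders currently holding a resource

    def handle(self, signal, sender):
        """Consume one input signal and return the reply signal, if any."""
        if signal == f"Get_{self.name}":
            # Assumption: a sender holds at most one resource at a time.
            if self.free > 0 and sender not in self.holders:
                self.free -= 1
                self.holders.add(sender)
                return f"Grant_{self.name}"
            return f"NG_{self.name}"
        if signal == f"Rel_{self.name}" and sender in self.holders:
            self.holders.remove(sender)
            self.free += 1
        return None  # a release produces no reply in this sketch


if __name__ == "__main__":
    paths = ResourceManager("path", capacity=1)
    print(paths.handle("Get_path", sender=11))
    print(paths.handle("Get_path", sender=12))
    paths.handle("Rel_path", sender=11)
    print(paths.handle("Get_path", sender=12))
```

With a pool of one path, the second request is refused until the first holder releases, which is the arbitration behavior the text describes.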





signallist L1 = Dial_Tone, No_DT, Fast_Busy, No_FB, Slow_Busy, No_SB, Ring_Back,
No_RB, Conn_CE, Disc_CE, Ring, No_Ring, Conn_CR, Disc_CR
signallist L2 = ONHK, OFHK, Digit(x)
Figure A.2: System Specification of PBX
Figure A.3: SDL Specification of Phone_Handler Process (1/2)
Figure A.4: SDL Specification of Phone_Handler Process (2/2)
Figure A.5: SDL Specification of Net_Path_Manager Process
Figure A.6: SDL Specification of TTRX_Manager Process









Table A.1: SDL Requirements Dictionary (1/2)
Description
User dialed digit number x
User has taken originating telephone offhook
User has placed originating telephone onhook
Digit dialing timer
Slow Busy tone timer
Ring Back tone timer
PBX provides user with dial tone
PBX removes dial tone
PBX provides user with fast busy tone
PBX removes fast busy tone
PBX provides user with slow busy tone
PBX removes slow busy tone
PBX provides user with ring-back tone
PBX removes ring-back tone







