Extended update plans by Mencák, Jirí
University of Huddersfield Repository
Mencák, Jirí
Extended update plans
Original Citation
Mencák, Jirí (2003) Extended update plans. Doctoral thesis, University of Huddersfield.
This version is available at http://eprints.hud.ac.uk/5935/
The University Repository is a digital collection of the research output of the
University, available on Open Access. Copyright and Moral Rights for the items
on this site are retained by the individual author and/or other copyright owners.
Users may access full items free of charge; copies of full text items generally
can be reproduced, displayed or performed and given to third parties in any
format or medium for personal research or study, educational or not-for-profit
purposes without prior permission or charge, provided:
• The authors, title and full bibliographic details is credited in any copy;
• A hyperlink and/or URL is included for the original metadata page; and
• The content is not changed in any way.
For more information, including our policy and submission procedure, please
contact the Repository Team at: E.mailbox@hud.ac.uk.
http://eprints.hud.ac.uk/
Extended Update Plans 
Jiří Menčák
A thesis submitted to the University of Huddersfield 
in partial fulfilment of the requirements for 
the degree of Doctor of Philosophy 
The University of Huddersfield 
School of Computing and Engineering May 2003 
Abstract 
Formal methods are gaining popularity as a way of increasing the reliability of systems through 
the use of mathematically based techniques. Their domain is no longer restricted to purely 
academic environments and examples, as they are slowly moving into industrial settings. The 
slow rate at which this transition takes place is mainly due to the perceived difficulty of 
formalising the behaviour of systems. While this is undoubtedly true, it is not the case with 
all formal methods. 
Update Plans are a powerful formalism for the description of computer architectures 
and intermediate to low-level languages. They are a declarative specification language with 
an underlying imperative machine model. The descriptions using Update Plans are clear, 
compact, intuitive, unambiguous and simple to read. These characteristics allow for the 
minimisation of possible errors at early stages of the development process even before a 
verification takes place. 
In this thesis an overview of the Update Plans formalism is given and a number of real-world
applications are shown. The investigation of the application area focuses on computer
architectures for which various specifications already exist. The comparison of Update Plan 
specifications to other specifications provides a useful insight into the strengths and shortcom- 
ings of the formalism. The shortcomings, in particular the lack of synchronisation primitives 
and modularity, are addressed by the development and evaluation of several syntactic and 
semantic extensions described in this thesis. The extended formalism is also compared to 
other specification languages and conclusions are drawn. 
Acknowledgements 
The research detailed in this thesis was funded by the Engineering and Physical Sciences 
Research Council (EPSRC) and partially by the School of Computing and Engineering of the 
University of Huddersfield. I wish to express my gratitude to both of these institutions for 
their financial support. 
I am indebted to my director of studies Dr. Hugh R. Osborne and my supervisor Dr. Adrian 
R. Jackson, for their help and guidance over the past three years. 
Special thanks belong to my parents and my girlfriend Kamila for their continuous support 
and encouragement. 
Statement of Original Authorship 
The work contained in this thesis has not been previously submitted for a degree or diploma 
at any other higher education institution. To the best of my knowledge and belief, the thesis 
contains no material previously published or written by another person except where due 
reference is made. 
Brno, 4th April 2004 
Contents 
Introduction 1
1 Context ....................................... 1 
2 Update Plans .................................... 2 
2.1 Research questions ............................. 3 
3 Organisation .................................... 4 
4 Notational conventions ............................... 6 
I Basic Update Plans 7 
1 Update Plans 9 
1 Basic Update Plans ................................. 9 
2 Typing ........................................ 13 
3 Archetypes ..................................... 14 
3.1 Syntax .................................... 15 
3.2 Expansion .................................. 16 
3.3 Syntactic sugar ............................... 17 
4 Parallelism ..................................... 19 
5 Examples ...................................... 19 
5.1 ADTs .................................... 19 
5.2 Archetype expansion ............................ 20 
2 PDP-11 22 
1 Addressing modes ................................. 22 
2 Instructions ..................................... 24 
2.1 Single operand instructions ........................ 24 
2.2 Double operand instructions ........................ 26 
2.3 Condition code and program flow operations .............. 27 
2.4 Interrupts .................................. 28 
2.5 Other instructions ............................. 29 
3 Conclusions ..................................... 29 
3 SPARC-V9 30
1 Types and constants ................................ 30 
2 Registers ....................................... 31 
2.1 General purpose r registers ........................ 31 
2.2 Floating-point f registers ......................... 33 
3 Instructions ..................................... 33 
3.1 Arithmetic and logical operations ..................... 33 
3.2 Register window manipulation instructions ............... 35
3.3 Load/Store instructions .......................... 36 
3.4 Floating-point instructions ........................ 38 
3.5 Control transfer instructions ....................... 40 
3.6 Miscellaneous instructions ......................... 45 
4 An example ..................................... 45 
5 Conclusions ..................................... 46 
4 Java Virtual Machine 47 
1 Types, constants, variables ............................ 47 
1.1 Types .................................... 47 
1.2 Constants .................................. 48 
1.3 Variables .................................. 48
2 Instructions ..................................... 49
2.1 Operand stack management ........................ 49
2.2 Local variable access ............................ 50 
2.3 Arithmetic instructions .......................... 51 
2.4 Immediate operands ............................ 52 
2.5 Control transfer .............................. 52 
3 Conclusions ..................................... 54
II Extended Update Plans 55 
5 Syntactic Extensions 57 
1 Everything is an update .............................. 57 
2 Archetypes ..................................... 58
2.1 Grammar .................................. 58
2.2 Ambidextrous archetypes ......................... 58 
2.3 Command archetypes ........................... 59 
2.4 Archetype parameters ........................... 59 
2.5 Archetypes in guards ............................ 60 
3 Types ........................................ 60
3.1 Constants ................................... 60
3.2 Type grammar ............................... 61
4 Comments ...................................... 61
5 Conclusions ..................................... 62
6 Semantic Extensions 63 
1 Parallel blocks ................................... 63
2 Sequential update schemes ............................. 64
2.1 Background/Motivation .......................... 64
2.2 Syntax .................................... 66
2.3 Semantics .................................. 67
2.4 Canonical form ............................... 69
2.5 Implementation ............................... 72
3 Sequential archetypes ............................... 76
3.1 Background/Motivation .......................... 76
3.2 Syntax .................................... 77
3.3 Semantics .................................. 77
3.4 Special types of sequential archetypes .................. 80
3.5 Parameters ................................. 81
3.6 Syntactic sugar ............................... 82
3.7 Limitations ................................. 83
4 Special cases of archetype expansion ....................... 83
4.1 Alternatives ................................. 83
4.2 Parallel blocks ............................... 83
5 Conclusions ..................................... 84
7 PRAM 85 
1 Informal description ................................ 85
2 2-PRAM memory models ............................. 87 
3 n-PRAM memory models ............................. 89 
4 Instructions ..................................... 91 
4.1 Addressing modes ............................. 91 
4.2 Accumulator loading instructions ..................... 93
4.3 General purpose register loading instructions .............. 95
4.4 Program counter loading instructions .................. 95
4.5 Memory read instructions ......................... 95
4.6 Memory write instructions ......................... 95
4.7 The instruction set ............................. 96
5 Conclusion ..................................... 96
8 Other Methods 97 
1 Specification methods ............................... 97 
1.1 Hardware .................................. 97 
1.2 Concrete machines and instruction sets ................. 100 
1.3 Parallelism ................................. 10, 
1.4 Protocols .................................. 103
1.5 Z/VDM ................................... 104 
2 Verification ..................................... 104 
2.1 ACL2 .................................... 105 
3 Conclusions ..................................... 106 
3.1 Integrated specification/verification methodologies ........... 106 
3.2 Specification methods ........................... 106 
3.3 Summary .................................. 108 
9 Conclusions and Future Research 109 
1 Update Plans applications ............................. 109 
2 Update Plans extensions .............................. 110 
3 Future considerations ................................ 
3.1 Theory ................................... 
3.2 Applications ................................ 112 
3.3 Implementation ............................... 113 
Bibliography 114 
A Extended Update Plans Grammar 121 
B PDP-11 129 
C SPARC-V9 133 
D Java Virtual Machine 140 
E n-PRAM 144
F Glossary 151
Introduction 
1 Context
The first recorded ideas about modern formal methods were those of Leibniz (1646-1716), 
who dreamt [47] about a machine called "calculus ratiocinator" which would decide any 
question given in a language "characteristica universalis" rich enough to describe any kind of 
phenomena. 
Many years have passed since these ideas surfaced, and formal methods are today ad- 
vocated as a means of increasing the reliability of systems as exhaustive testing through 
simulation is no longer possible due to the ever increasing complexity of integrated circuits. 
Furthermore, the gap between the performance of current systems used for simulation and 
the complexity of systems under development is increasing rather than shrinking. 
The current uses of formal methods include but are not limited to safety and security- 
critical systems in domains such as transport, defence, banking and medical applications, 
where the cost of failure is unacceptably high. The losses caused by bad microprocessor
design in particular can be devastating. For example, the failure of Ariane 5 cost the European
Space Agency $7,000 million and 10 years of design work; it was due to an overflow error during
the conversion of a number from a 64-bit format to a 16-bit format [27]. The Pentium
floating-point bug [63] in the division algorithm cost Intel an estimated $500 million. Perhaps
not surprisingly, the use of formal methods is spreading from government agencies such as NASA
or the MoD into a wider industrial sphere.
The "weakest link" in the development of a system is usually its specification. Many 
inconsistencies and ambiguities can be discovered just by going through the process of rigorous 
specification. By writing a formal specification and devising a verification strategy one is 
forced to think about the system in new ways, new questions are raised, new insights into the 
system are gained and thus the number of possible errors is minimised at early stages of the 
development. 
While there are methods for formal development of high level software (e.g. [7,14]), and
for hardware description (e.g. [36,57,73]), there seems to be a shortage of specification and
development formalisms at the intermediate level of machine architectures and instruction sets,
as the semantics of low-level languages is still unjustly assumed trivial [48,70,77]. Although
it is possible to describe the semantics of low-level languages by various methods [13,16,34,
66,69], they either lack formal semantics themselves or the readability of specifications aimed
at this level is generally poor.
Update Plans [60], on the other hand, aim to fill the gap among formal languages by 
providing a formal, clear, unambiguous and expressive way of describing intermediate to 
low-levels of machine architectures. 
Many formal methods (some of which are briefly described and compared with Update
Plans in this thesis) are difficult to grasp. The learning time for the production of usable
designs with such methods is usually very long, and efficient use of a specification language
relies heavily on detailed knowledge of it. This also results in all sorts of other difficulties.
Firstly, it is very easy to formalise the wrong thing in this environment. Secondly, a hard to
understand design description makes any verification more complicated.
Update Plans, on the other hand, have a short learning curve: they are easy to learn
and understand because their basic concepts are very simple, and confidence in using the
formalism is achieved very soon. Designs written in Update Plans are highly communicable and
compact formal descriptions.
2 Update Plans 
Update Plans (UP for short) are a formalism for the description of (abstract) machines and 
algorithms. A specification produced by UP combines both the structural and behavioural 
model of a system. The structural descriptions are concrete, detailed and low-level (although 
it is easy to abstract away from irrelevant details). With respect to algorithmic structures the 
descriptions are abstract and high-level. These characteristics make UP particularly suitable 
as a specification language for the description of large classes of machine architectures [55, 
58,59,61] and as a target language in compiler design [54,60]. 
In [54] the Update Plans formalism (then called Update Schemes) was introduced by Hans
Meijer as a target language in the framework of the design of a translator generator. Since
then it has been extended to the description of machines and algorithms.
Hugh Osborne designed a formal semantics [58] and developed a powerful macro-like
mechanism called archetypes. He also simplified the typing regime, introduced parallelism [60],
constructed a prototype implementation based on [45], investigated a number of typical
applications [59,61] and approached the problem of the formal verification of UP specifications.
A brief description of most of the above-mentioned past areas of Update Plans research 
is given in the following points. 
• Semantics. The presence of a formal semantics for any specification language is a
necessity for any kind of formal reasoning about specifications written in that language. In UP
this has been done in [58] by defining the underlying imperative model and the referentially
transparent semantics of UP.
• Archetypes. Archetypes are an abstraction and structure reuse mechanism that has
been introduced into Update Plans in order to further increase their expressiveness [60].
Although the primary motivation was abstracting away the details of addressing modes, they
can be used for a variety of other purposes where it is desirable to express multiple update
schemes by a single archetype call.
• Parallelism. Parallelism is inherent in Update Plans. While there is (to some extent) no
need for an explicit mechanism to describe it, it has been introduced [60] to increase legibility
by providing a way to allow many update schemes to be combined into one atomic update.
• Typing. The type information introduced to Update Plans in [60] is not only important
for type checking, but it also has serious consequences for the implementability and formal
verification in Update Plans.
• Verification. Perhaps most importantly, the work on proving semantic equivalence
between update plans (possibly representing various levels of description) has started [60].
• Applications. A number of machines varying from very abstract to concrete have been
described. One of the first larger applications was a machine for a simple functional lan- 
guage [59]. Two lower level specifications have been given for a more realistic machine model, 
taken from [25]. The more abstract of these was a machine for a tree language, typical of 
intermediate code generated by a compiler. The more concrete was a PDP-11 style machine. 
A linearisation of the tree code (register allocation) was given transforming it to concrete 
machine code. A proof of the semantic equivalence of the two specifications under the trans- 
formation was shown, which verified the register allocation algorithm. A specification of a 
pipelined RISC processor has been given [61]. UP have also been applied to a (partial) spec- 
ification of the Java Virtual Machine [55]. Using an abstraction mechanism similar to the 
archetype mechanism, elementary VLSI components have been specified [53]. 
Update Plans have also proved very useful in education. Apart from their didactic use in 
explaining the intricacies of addressing modes and the semantics of instructions of a computer 
architecture simulator [62], they can also be used as a compact means of explaining algorithms 
such as rebalancing operations in AVL trees [60]. 
2.1 Research questions 
The main aim of this project is to develop UP as a specification and verification formalism,
primarily at the level of low-level code and hardware. To achieve this a hierarchical modular
specification and validation structure for UP needs to be developed. This would provide a
formalism in which multi-level specifications of architectures can be given, each level being
a translation or transformation of its neighbouring levels. The formally defined translation
between levels will provide a proof of the semantic equivalence of the specifications.
A second aim is to extend the syntax and semantics of UP to cover a wide range of parallel 
paradigms and, in particular, their communication and synchronisation primitives. 
To achieve this the problems of parallelism, modularity and formal verification need to be 
addressed. 
• Parallelism. More general mechanisms for specifying parallel systems need to be
introduced, allowing for the specification of asynchronous parallelism and related communication
primitives. The semantics of the new constructs must be clearly defined. While some
preliminary investigation of parallelism in UP has been done, this has not gone much further
than replacing nondeterministic execution of an applicable update scheme with simultaneous
execution of all applicable update schemes; so in order to keep specifications well-behaved a
host of artificial semaphore-like constructs has to be used.
• Modularity. A modular structure needs to be introduced into UP in order to provide
the formalism with a mechanism for structure reuse, for information hiding and to encourage
hierarchical specification. A specification at one level, e.g. that of the fetch/execute cycle,
could then be defined in terms of a lower level specification such as microcode, and in turn
be used to specify higher level activities such as the instruction set.
• Formal verification. A proof system needs to be developed that, given UP
specifications of two machine architectures and a translation, would prove the semantic equivalence
of the specifications under the translation. This work should be based on the concepts of
semantic equivalence, translation and irrelevance [60].
3 Organisation 
This thesis contains an evaluation of Update Plans as a specification formalism for hardware 
architectures, and suggests extensions based on this evaluation. It is divided into two parts. 
Unless indicated otherwise by marginal notes giving forward references to sections of chapter 5, 
the first part uses basic Update Plans as defined by [60], and the suggested extensions are 
described in the second part. The first part of this thesis is an evaluation of basic Update 
Plans, and consists of three case studies involving instruction set specifications of machines 
differing in the degree of abstraction. 
A tutorial on basic Update Plans is given in chapter 1. A lot of material in this chapter 
was adapted with minor amendments from [60] and [61]. 
The PDP-11 specification in chapter 2 is the first attempt to specify a large subset of a real 
machine's instruction set using Update Plans. It was chosen as a candidate for specification 
because some historically important formal specifications of the PDP-11 exist [20,69]. Con- 
sideration of a more detailed specification at the fetch/execute cycle drove the development 
of some semantic extensions described in chapter 6. 
The first chapters to make some advance use of the syntactic extensions to Update Plans
described in chapter 5 are chapters 3 and 4. Both of these chapters feature modern, abstract,
but very different architectures. Chapter 3 has the UP specification for the SPARC-V9
architecture. Again, there was already a partial specification of the SPARC-V9 architecture [65],
but this is aimed at the specification of connections and communication rather than at a 
concise specification of the instruction set. In chapter 4 the Java Virtual Machine (JVM) 
was chosen to test UP's suitability for describing more abstract instruction sets using a wide 
variety of types. 
The second part of this thesis proposes some extensions to Update Plans and contains a 
comparison of Update Plans to other formal methods. 
Chapter 5 covers the syntactic extensions brought about during the development of the 
SPARC-V9 and JVM instruction set specifications in the first part of the thesis. In particular 
this chapter discusses the changes in the revised Update Plans grammar and the extensions 
to the typing mechanism which, in many cases, allow multi-level specifications. 
The concept of sequential update schemes and archetypes is introduced in chapter 6 
together with other semantic extensions to make the Update Plans formalism more expressive 
and consistent. More importantly, this chapter gives possible answers to the research questions 
of parallelism and modularity. The problem with synchronisation of parallel processes is 
solved by the introduction of a general synchronisation primitive (sequential update schemes) 
that augments the non-deterministic model of Update Plans by explicitly stating the order in
which updates will be applied. Sequential archetypes extend the possibilities for information 
hiding and structure reuse by encapsulating a series of synchronised updates at one level 
of abstraction into a modular update easily identifiable against a corresponding update at 
another level of description. 
Chapter 7 contains a more comprehensive example of the application of sequential update
schemes and archetypes in a specification of a theoretical model of computation, the Parallel
Random Access Machine (PRAM). PRAM provides a useful test-bed for the semantic
extensions as it specifies strictly sequential operations (the fetch/execute cycle) within a massively
parallel context. 
The penultimate chapter evaluates Update Plans. Other existing formalisms which could 
be used for the specification of intermediate levels of hardware architectures are considered. 
A brief description of each of the formalisms is given and they are compared to Update Plans. 
This chapter uses a few simple examples to show the advantages the introduction of sequences 
brings to UP.
Finally, conclusions are drawn and future research directions are suggested in chapter 9. 
4 Notational conventions 
Throughout this thesis, the following notational conventions are observed. 
• Production rules used to describe basic and Extended Update Plans grammars given
in this thesis make use of the notational conventions of the specification formalism
ASF+SDF [43,74] in which the notation {S term}+ indicates a list of one or more S's
separated by the terminal term. If the + is replaced by a * the list may also be empty.
The suffix -opt indicates zero or one occurrences of its nonterminal. More notational
conventions used by the ASF+SDF specification formalism can be found in appendix A
on page 121.
• Update plans given in this thesis use the typewriter font. This convention has two
exceptions. Firstly, update plan comments and 'meta-variables' such as lhs use the italic
typestyle. The second exception is update plan elements which have been commented
out. They use the slanted typewriter font.
• All emphasised words appearing freely in the text are glossary terms explained in
appendix F for quick reference. The only exception is references to update plan
'meta-variables', which use the emphasised (italic) typestyle.
• Whenever a reference to an instruction field is made, a sans serif typeface is used. The
ý symbol designates concatenation of bit vectors, the % symbol is arithmetic modulo,
and × is used for multiplication. Symbols &, |, and ^ are used for bit-wise AND, OR,
and XOR operations respectively. And finally, symbols =, ≠, <, >, ≤ and ≥ have their
usual mathematical meanings.
• Marginal notes used in the first part of this thesis are forward references to sections
of chapter 5 where syntactic UP extensions are explained. References to the trivial
extension introduced in section 4 of the same chapter are not made.
• The American spelling 'program' is used to denote computer software.
Part I 
Basic Update Plans
Introduction 
Part one of this thesis serves as an introduction to Update Plans with the focus on the 
description of hardware architectures. It shows three specifications of instruction sets of 
machines at various degrees of abstraction, ordered from the most concrete to the most 
abstract. Each of the specifications provides some interesting insights into the formalism
and uncovers its various shortcomings, which are addressed in the second part of the thesis. 
Although the specifications use basic Update Plans as described in [60], some advance use (in 
chapters 3 and 4) is made of syntactic changes to the formalism described in chapter 5. 
Chapter 1
Update Plans 
This chapter gives an informal overview of basic Update Plans. Only the basic facts
necessary to understand the specifications in the first part and the reasons for the extensions
in the second part of the thesis are introduced. More information on Update Plans along
with their complete formal syntactic and semantic definition can be found in [58-61].
1 Basic Update Plans 
An update plan specifies state transitions in an abstract machine. This machine consists of 
a number of stores, each containing a linear countably infinite sequence of memory cells (e.g.
bytes or machine words) between locators (addresses). Note that it is not the cells themselves 
that are addressed, but the boundaries between cells as shown in figure 1.1. 
Triples α[ξ]β are called locator expressions, where α and β are locators, α < β and the
cells between the addresses α and β contain (a particular representation of) the value of ξ.
A set of locator expressions is consistent if there are no two or more expressions in the set 
which specify the contents of some cell to have different values. A consistent set of locator 
expressions describes a (sub) configuration of the machine. 
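The consistency condition can be illustrated with a minimal Python sketch. It is not part of the Update Plans formalism or of any implementation cited here; it assumes, purely for illustration, that locators are integers and that every term occupies exactly one cell, and the names `cells` and `consistent` are hypothetical.

```python
from typing import Dict, Iterable, Tuple

# Assumption (not from the thesis): locators are integers and each term fills
# exactly one cell, so alpha[v0 v1 ...]beta covers cells alpha .. beta - 1.
LocatorExpr = Tuple[int, Tuple[int, ...], int]


def cells(expr: LocatorExpr) -> Dict[int, int]:
    """Map each cell (named by its left boundary) to the value it must hold."""
    alpha, values, beta = expr
    if beta - alpha != len(values):
        raise ValueError("text does not fit between the two locators")
    return {alpha + i: value for i, value in enumerate(values)}


def consistent(exprs: Iterable[LocatorExpr]) -> bool:
    """True if no cell is required to hold two different values."""
    seen: Dict[int, int] = {}
    for expr in exprs:
        for cell, value in cells(expr).items():
            # setdefault records the first value seen for a cell and returns
            # the recorded value on every later visit.
            if seen.setdefault(cell, value) != value:
                return False
    return True
```

Under these assumptions, two expressions that overlap but agree on the shared cell are consistent, whereas two expressions that give the same cell different values are not.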
State transitions in the abstract machine are described by update schemes, which are
constructed from two sets of locator expressions forming the left-hand side (lhs) and the
right-hand side (rhs), separated by a guard (=[ γ ]=>) which carries an applicability condition
γ. A guard whose condition is always true can be simply written as ==>.
An update plan is a set of update schemes, each of which may contain unspecified values 
(variables). Variables are indicated by lower case words, constants are indicated either by 
a value or, symbolically, by upper case words. An update scheme containing no unspecified 
values is called an update rule. Update schemes yield update rules by instantiation. Both the 
left and right-hand sides of the resulting update rule must be self-consistent, i.e. all locator
expressions in the left and right-hand side must be mutually consistent. 
[Figure 1.1: Addressing in arrays and Update Plans. Panels: traditional array addressing; Update Plan addressing.]
An update rule is applicable to a given configuration if its left-hand side is a subset of that 
configuration and its guard is true. The memory may then be minimally updated such that
thereafter all locator expressions in its right-hand side are satisfied.
An initial configuration, which specifies the initial state of the memory before any update 
scheme is applied, and an update plan form an update script. The update plan is executed 
by repeatedly (non-deterministically) choosing an applicable update scheme from the plan, 
and applying it, until the configuration is such that no scheme is applicable. This final 
configuration is the result of the script. 
An overview of the basic Update Plans syntax is given by the following grammar.
⟨script⟩ → ⟨configuration⟩ "." ⟨plan⟩
⟨plan⟩ → ⟨item⟩*
⟨item⟩ → ⟨scheme⟩
⟨configuration⟩ → ⟨locator expression⟩*
⟨scheme⟩ → ⟨configuration⟩ ⟨guard⟩ ⟨configuration⟩
⟨guard⟩ → "=[" ⟨term⟩ "]=>"
⟨locator expression⟩ → ⟨locator⟩ "[" ⟨text⟩ "]" ⟨locator⟩
⟨locator⟩ → ⟨term⟩
⟨text⟩ → ⟨term⟩*
A (term) is an expression built from constants, variables and operators. Note that the gram- 
mar presented here is a very simplified version of the Update Plans grammar to demonstrate 
the basic structure. This grammar will be extended at several places in this chapter, but will 
still not be presented in its entirety. The complete basic Update Plans grammar can be found 
in [60], and a complete and updated grammar for Extended Update Plans in appendix A. 
The classic two-scheme update plan in example 1 demonstrates Euclid's algorithm for
computing the GCD of the number initially between A and B and that initially between B and
C. The constants A, B and C are fixed locators and x and y are unspecified values.
Example 1
A[x]B B[y]C =[ x<y ]=> B[y - x]C.
A[x]B B[y]C =[ x>y ]=> A[x - y]B.
If at any stage of the computation the machine configuration contains A[9]B and B[6]C,
(only) the second update scheme (instantiating into an update rule) in example 1 is applicable,
whereupon the 9 is replaced by a 3.
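The execution model described above (repeatedly choose an applicable scheme, apply it, stop when none applies) can be sketched in Python for example 1. This is only an illustrative simulation, not the prototype implementation mentioned in the introduction. The dictionary keys "AB" and "BC" (standing for the cells between A and B and between B and C), the encoding of schemes as guard/update function pairs, and the seeded random choice are all assumptions, and the sketch presumes positive initial values.

```python
import random
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical encoding: "AB" is the cell between locators A and B,
# "BC" the cell between B and C.
Config = Dict[str, int]
Scheme = Tuple[Callable[[Config], bool], Callable[[Config], Config]]

# A[x]B B[y]C =[ x<y ]=> B[y - x]C.
# A[x]B B[y]C =[ x>y ]=> A[x - y]B.
EUCLID: List[Scheme] = [
    (lambda c: c["AB"] < c["BC"], lambda c: {**c, "BC": c["BC"] - c["AB"]}),
    (lambda c: c["AB"] > c["BC"], lambda c: {**c, "AB": c["AB"] - c["BC"]}),
]


def run(plan: List[Scheme], config: Config,
        rng: Optional[random.Random] = None) -> Config:
    """Apply applicable schemes until none is applicable; return the result."""
    rng = rng or random.Random(0)
    while True:
        applicable = [update for guard, update in plan if guard(config)]
        if not applicable:
            return config  # final configuration: the result of the script
        # Non-deterministic choice, modelled here by a seeded random pick.
        config = rng.choice(applicable)(config)
```

Starting from `{"AB": 9, "BC": 6}`, the second scheme fires first (9 becomes 3), then the first scheme (6 becomes 3), after which neither guard holds and both cells contain the GCD.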
Syntactic sugar 
The following points describe notational simplifications which make update plans more legible.
• A superfluous locator may be omitted. A locator is superfluous if its removal does not
lead to any confusion.
• Contiguous sequences may be concatenated. Two expressions x[s]y and y[t]z may be
written as x[s]y[t]z.
• Locators may also be omitted when concatenating contiguous sequences, so that x[s]y[t]z
may also be written as x[s t]z.
• Identical left-hand sides may be shared. A repeat of the previous left-hand side is
indicated by the 'repeat' symbol '11'.
• Other notational conventions (I/O, the introduction of the program counter and
alternatives) are described in sections of their own.
1/0 
As input/output instructions are among the most frequently used operations, a mechanism for
hiding the details and so sweetening the syntax is provided. Input streams which are described
by an update scheme
IP[i] i[input]j ... =[ g ]=> ... IP[j]
may be written as
?IP[input] ... =[ g ]=> ...
For the standard input stream, '?[input]' can be used when this is unambiguously
defined.
Similarly, output streams
OP[o] ... =[ g ]=> ... OP[p] o[output]p
may be written as
... =[ g ]=> ... !OP[output]
Again, '![output]' can be used for the standard output stream, if it is defined.
Program counter 
By acknowledging the existence of a program counter at a fixed locator (by convention PC),
update schemes exhibiting the pattern
PC[pc] pc[OP args]qc lc =[ g ]=> PC[pc'] pc'[next]qc rc.
may be written as so-called commands
OP args lc =[ g ]=> next rc.
with the program counter hidden, where at least one of OP args and next is a non-empty
command sequence, and lc/rc are left/right contexts. The first term of a command will in
most cases be a constant, known as the opcode. An update plan in which all update schemes
are commands is known as a command driven plan. A configuration is in command form if it
contains a non-empty command sequence, or one in which the contents of the register PC are
not specified.
[Figure 1.2: Visualisation of the PUSH operation on a stack machine, PUSH a a[x] SP[t] ==> s[x]t SP[s].]
Example 2 
The following two commands may be part of a stack machine. 
PUSH a a[x] SP[t] ==> s[x]t SP[s].
POP a SP[s] s[x]t ==> a[x] SP[t].
They will be desugared as
PC[pc] pc[PUSH a]qc a[x] SP[t] ==> PC[qc] s[x]t SP[s].
PC[pc] pc[POP a]qc SP[s] s[x]t ==> PC[qc] a[x] SP[t].
A graphical description of the PUSH operation is in figure 1.2. 
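The desugared commands of example 2 can be read as a small interpreter step. The following Python sketch is an illustration only: it assumes that every value occupies one cell (so s = t - 1), that a command occupies two cells (opcode and operand), and that the configuration is a dictionary with hypothetical keys "PC", "SP" and "mem".

```python
def step(config):
    """Execute the PUSH or POP command found at the program counter."""
    mem, pc = config["mem"], config["PC"]
    opcode, a = mem[pc], mem[pc + 1]   # pc[OP a]qc
    qc = pc + 2                        # locator following the command
    if opcode == "PUSH":               # a[x] SP[t] ==> s[x]t SP[s]
        s = config["SP"] - 1           # assumed: one cell per value
        mem[s] = mem[a]
        config["SP"] = s
    elif opcode == "POP":              # SP[s] s[x]t ==> a[x] SP[t]
        s = config["SP"]
        mem[a] = mem[s]
        config["SP"] = s + 1
    else:
        raise ValueError(f"unknown opcode {opcode!r}")
    config["PC"] = qc                  # the hidden PC[pc] ==> PC[qc] update
    return config


# Push the value stored at locator 50, then pop it into locator 51.
config = {"PC": 0, "SP": 100,
          "mem": {0: "PUSH", 1: 50, 2: "POP", 3: 51, 50: 7}}
step(config)
step(config)
```

Cells that appear only on the left-hand side, such as a[x] in PUSH, are left unchanged, which is why the sketch never erases them.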
Alternatives 
A set of update schemes with mutually exclusive guards often performs some form of case 
analysis. Such a set 
lhs_1 =[ g_1 ]=> rhs_1.
lhs_2 =[ ¬g_1 ∧ g_2 ]=> rhs_2.
...
lhs_n =[ ¬g_1 ∧ ... ∧ ¬g_(n-1) ∧ g_n ]=> rhs_n.
can then be written as a series of n update schemes known as alternatives
lhs_1 =[ g_1 ]=> rhs_1;
lhs_2 =[ g_2 ]=> rhs_2;
...
lhs_n =[ g_n ]=> rhs_n.
The update schemes are divided by semicolons and only the first applicable update scheme, 
reading from top to bottom, will be applied. The following production rules are added to the 
basic Update Plans grammar. 
(item) → (alternatives) "."
(alternatives) → { (update scheme) ";"
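The "first applicable scheme wins" reading of alternatives can be sketched as follows. This is a hedged illustration in Python, assuming each alternative is modelled as a (guard, rewrite) pair of functions over some configuration value; `apply_alternatives` and the `sign` example are hypothetical names, not part of the formalism.

```python
def apply_alternatives(alternatives, config):
    """Apply the first alternative, top to bottom, whose guard holds.

    Returns the rewritten configuration, or None if no alternative applies.
    """
    for guard, rewrite in alternatives:
        if guard(config):
            return rewrite(config)
    return None


# The second guard need not repeat the negation of the first one:
# it is only ever tested when the first guard has failed.
sign = [
    (lambda n: n < 0, lambda n: "negative"),
    (lambda n: n <= 0, lambda n: "zero"),
    (lambda n: True, lambda n: "positive"),
]
```

Because the order of the list carries the negated earlier guards implicitly, the guards themselves stay short, which is the point of the notation.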
2 Typing 
The basic type in Update Plans is the locator. In fact all objects appearing in an update plan 
are considered to be locators. 
New types may be defined by combining existing types using any of the standard operators 
of regular expressions. Such a declaration is said to define a type alias. Store names are lower 
case words with an initial upper case letter, and must, therefore, contain at least two letters, 
in order to ensure that store names can be distinguished from constants. Stores are declared 
by listing them between braces, e.g.
{Bit, Bool, Int, Stack, Heap}.
Each store name is said to be a type primitive. Type aliases are declared similarly
{Num = Byte | Short | Long}.
Type aliases may not be recursive, either directly or indirectly. This ensures that any type 
alias can be expressed as a regular expression containing only type primitives. Every object 
appearing in an update plan must be typed. For some objects this may be done implicitly. 
Each symbolic constant is considered to have its own unique type, again unless indicated 
otherwise. Other objects-expressions for which no type can be determined automatically- 
must have their type indicated. This can be done by means of a global declaration valid 
throughout an update plan, e.g. 'v :: Store1.'. A global type definition can be overridden by
casting a term within an update scheme with a different type, e.g. '(v :: Store2)', such a cast
determining only the type of the term to which it is applied. The syntax of type declarations
is obtained by adding the following rules to the basic UP grammar
(item) → (store declaration) | (type declaration)
(store declaration) → "{" { (store) "," }* "}"
(type declaration) → { (store) "," }+ "::" (store structure)
(term) → "(" (term) "::" (store structure) ")"
(store) →
(store name) |
(store name) "=" (store structure)
A (store name) is a lower case word with a leading upper case letter, and a (store structure) 
is a regular expression over the set of store names. Some extensions have been added to the 
typing mechanism, and can be found in chapter 5 alongside an updated type grammar. 
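The requirement that type aliases are not recursive can be checked mechanically. The sketch below is a simplification made for illustration: it assumes aliases use only the alternation operator, so that an alias maps to a list of store names, and the names `expand_alias`, `primitives` and `aliases` are hypothetical.

```python
def expand_alias(name, aliases, primitives, _seen=()):
    """Expand a type alias into the set of type primitives it can denote.

    Raises ValueError if the alias is directly or indirectly recursive.
    """
    if name in primitives:
        return {name}
    if name in _seen:
        raise ValueError(f"recursive type alias: {name}")
    if name not in aliases:
        raise KeyError(f"undeclared store name: {name}")
    result = set()
    for alternative in aliases[name]:
        result |= expand_alias(alternative, aliases, primitives,
                               _seen + (name,))
    return result


primitives = {"Bit", "Byte", "Short", "Long"}
aliases = {"Num": ["Byte", "Short", "Long"],   # {Num = Byte | Short | Long}.
           "Scalar": ["Num", "Bit"]}           # illustrative second alias
scalar = expand_alias("Scalar", aliases, primitives)
```

The expansion terminates exactly because recursion is rejected, which matches the observation that every alias reduces to an expression over type primitives only.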
A very important concept related to typing and implementability of update plans is 
grounding. It is extensively described in [60], but the following terminology is provided 
for convenience as it will be used in this thesis. 
A ground expression is one for which a unique variable-free expression can be derived,
possibly by instantiation of variables with respect to the current configuration; a semi-ground
expression is one for which a finite number of such expressions can be derived.
Example 3 
This example assumes that the length of an object of the type Bit is 1. 
Given the following update plan 
x, y:: Bit. 
xs, ys :: (Bit)*. 
A[x]b[xs]C[ys]d ==> B[x y]c.
A, B and C are ground as they are constants; b and c are also ground
(b = A + 1, c = B + 2); d is non-ground, since the length of ys cannot be
established; x is ground, since both of its locators are ground and its value
can be determined by a reference to a configuration. The 'sequence' xs is
ground for b ≤ C ≤ b + 1, semi-ground otherwise. Finally, both ys and y
are non-ground.
3 Archetypes
The expressive power of Update Plans is greatly increased by the use of a macro-like mech- 
anism known as archetypes. An update plan often contains a set of similar update schemes 
sharing certain parts of their left/right-hand sides. Such patterns can be moved out of up- 
date schemes into an archetype definition, and replaced by archetype calls. This will almost 
certainly mean a reduction in the number of update schemes necessary for a description as 
duplicate update schemes will be omitted from the update plan. 
The primary motivation for the introduction of the archetype mechanism was the com- 
plexity of many machines due to the number of addressing modes they use. In an update plan, 
addressing modes can easily be replaced by a single archetype call, thus making it possible to 
express many update schemes as one. 
Archetypes can also be viewed as a mechanism of abstraction, where certain frequently
used actions are grouped into "modular updates". This view will be reinforced in the
second part of the thesis, where the concept of sequential update schemes and archetypes is 
introduced. 
3.1 Syntax 
The basic Update Plans grammar is updated by the following production rules. 
(item) → (archetype definition)
(archetype definition) →
(basic archetype definition) |
(command archetype definition)
(basic archetype definition) → (basic declaration) (basic definition)+
(basic declaration) → (basic archetype name) (parameters)
(basic definition) → "=" (basic body) "."
(basic body) →
(configuration) (guard) (configuration) |
(repeat) (guard) (configuration) |
(configuration) |
(text) (context) (guard) (context) |
(text) (repeat) (guard) (context) |
(text) (context)
(command archetype definition) →
(command declaration) (command definition)+
(command declaration) →
(command archetype name) (parameters) (text)
(command definition) → "=" (command body) "."
(command body) →
(context) (guard) (context) |
(repeat) (guard) (context) |
(context)
(parameters) → "(" {(term)
A basic archetype name is an identifier, a command archetype name is a constant. The syntax 
of an archetype call is basically that of an archetype declaration. 
The only difference is when archetype calls are 'coupled' by an index.
(term) → (archetype call)
(archetype call) → (archetype name) (index)opt (parameters)
(archetype name) → (basic archetype name) | (command archetype name)
(index) → (number) | "I" (number) "I"
Again, the archetype grammar has been adapted with minor amendments from [60]. 
3.2 Expansion 
The archetype expansion mechanism consists of two stages, the textual expansion and the 
parameter resolution stage. 
The parameter resolution phase is necessary due to the fact that parameters do not have 
to be variables-they can be complex expressions. The aim of parameter resolution is to 
derive semi-ground expressions for non-ground terms. Parameters are resolved by using a 
resolution set, to which equations are added as rewriting takes place. If at any point in this 
process a non-trivial equation is derived relating two semi-ground expressions in an update 
scheme, it is added to the scheme's guard. The method is discussed in great detail in [60]. 
An archetype's body consists of a left and a right-hand side expansion and context. The 
expansion is the text that actually replaces the archetype call, the left/right-hand side ex- 
pansion replacing the call on the left/right-hand side. The contexts are simply added to the 
call's scheme, again the left/right-hand side context is added to the left/right-hand side of the 
scheme. Archetype variables conflicting with variables of the call's scheme will be renamed 
before expansion. 
See, for example, an archetype autoinc in example 4a which could define an autoincrement 
addressing mode. The (output) parameter v obtains its value by means of instantiation against 
the current configuration, and the contents of r are updated. 
Example 4a 
autoinc(v) = AUTOINC r r[b] b[v]c ==> r[c]. 
In the example, the left-hand side expansion is AUTOINC r and the left-hand side context is
r[b] b[v]c.
Archetype calls occur in indexed pairs, with one element of the pair on the left-hand side 
of the update scheme or archetype definition in which it occurs, and the other on the right- 
hand side. The resolution mechanism is demonstrated in example 4b, which gives a possible 
expansion of ADD r autoinc1(x) autoinc2(y) ==> autoinc1(x) autoinc2(y) r[x + y].
Example 4b
ADD r AUTOINC r1 AUTOINC r2
r1[b1] r2[b2] b1[v1]c1 b2[v2]c2 ==> r[v1 + v2] r1[c1] r2[c2].
As will be noted in the following section, syntactic sugar allows one of the elements of an 
archetype call pair to be omitted if the corresponding expansion is empty. The indices may 
then also be omitted, as they are superfluous. The update plan from the previous example 
can then be rewritten as ADD r autoinc(x) autoinc(y) ==> r[x + y].
Archetype calls can be recursive. However, there are several limitations as to where 
recursive and non-recursive archetype calls may appear in an update plan. For example, a 
guard may not contain an archetype call and a locator and an archetype parameter may not 
contain a call of a recursive archetype. 
3.3 Syntactic sugar 
The bodies of archetype definitions inherit syntactic sugar as described in section 1. Addi- 
tionally, syntactic sugar also makes it possible to 
1. omit the guard and right-hand side from the body of an archetype definition, if the 
right-hand side and the guard are both empty 
2. replace irrelevant parameters by the "don't care" symbol 
3. share identical archetype declarations (see the grammar) 
Other forms of syntactic sugar are left/right-handed, ambidextrous and command arche- 
types. 
Left/Right-handed archetypes
If the expansions (text) of all right/left-hand sides of a given archetype in all of its definitions 
are empty then the right/left-hand side of that archetype may be omitted. If all right/left- 
hand side calls are omitted in an update plan, such an archetype is called a left/right-handed 
archetype. 
Left/Right-handed archetypes further contribute to sweetening of an update plan by the 
omission of archetype call's indices which are rendered superfluous as all calls of the same 
name appear only on one (left/right) side of an update scheme. 
Ambidextrous archetypes
Left/right-handed archetypes are restricted in that they may only be called on the left/right-
hand side. An ambidextrous archetype is a context independent archetype, which can be called
on both sides of an update scheme.
A definition of an ambidextrous archetype a is equivalent to a pair of left and right
archetype definitions a_l and a_r
a_l(params) = text lc =[ g ]=> rc.
a_r(params) = lc =[ g ]=> text rc.
which can be 'compressed' into one ambidextrous archetype definition
a(params) = text | lc =[ g ]=> rc.
One use of ambidextrous archetypes can be to describe repetitive computations occurring 
on both sides of an update scheme. 
Example 5 
Consider the following recursive ambidextrous archetype pad(x) and an 
update scheme containing a call of this archetype. The archetype expands 
x padding bytes PAD to align address a of the JMP instruction so that a 
begins at an address that is a multiple of 4 bytes. 
pad(0) = .
pad(n) = PAD pad(n - 1) | =[ n > 0 ]=> .
PC[pc] pc[JMP pad(4 - (pc + 1) % 4) a] ==> PC[a].
The purpose of making the pad(n) archetype ambidextrous is that it can
be called on both sides of an update scheme without the need to define the
following pad_l(n)/pad_r(n) pair
pad_l(n) = PAD pad(n - 1) =[ n > 0 ]=> .    # To be called on the lhs
pad_r(n) = =[ n > 0 ]=> PAD pad(n - 1).    # To be called on the rhs
in which case the pad_l(n) and pad_r(n) archetypes would have to be called
on the left/right-hand side of an update scheme respectively, depending on 
the side padding is needed. 
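The recursive expansion of pad(n) and the alignment it achieves can be checked with a short Python sketch. It assumes, purely for illustration, that JMP and each PAD occupy one address unit; `pad` and `jmp_operand_address` are hypothetical helper names.

```python
def pad(n):
    """Expand pad(n) into a list of n PAD constants, mirroring the recursion."""
    if n == 0:                       # pad(0) = .
        return []
    if n > 0:                        # pad(n) = PAD pad(n - 1), guarded by n > 0
        return ["PAD"] + pad(n - 1)
    raise ValueError("pad is only defined for n >= 0")


def jmp_operand_address(pc):
    """Address at which operand a starts for a JMP located at pc."""
    n = 4 - (pc + 1) % 4             # argument of pad in the update scheme
    return pc + 1 + len(pad(n))      # one unit for JMP, n units of padding


addresses = [jmp_operand_address(pc) for pc in range(8)]
```

Note that under this formula an already aligned operand still receives four padding units, since 4 - 0 = 4.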
Command archetypes 
An ambidextrous archetype a whose text part begins with a constant can be further com- 
pressed by using this constant as a name of a command archetype, unless this constant has 
already been used for this purpose. The command archetype
CONST(params) text = lc =[ g ]=> rc.
is the sugared version of
a(params) = CONST text | lc =[ g ]=> rc.
with calls of a replaced throughout the update plan by calls of CONST. Note that the syntax 
of ambidextrous archetypes changed (see chapter 5) in Extended Update Plans to be more 
consistent with command archetypes. 
4 Parallelism
Parallelism is inherent in Update Plans, as application of an update rule is atomic and many 
cells may be changed simultaneously as a result. However, relying purely on this mechanism 
may create update schemes of unmanageable length. The solution is to have a set of non- 
related update schemes, instantiations of which will be applied simultaneously in a parallel 
block. These instantiations will only be applied if the conditions for application are satisfied 
for every single update rule and their right-hand sides are mutually consistent. An alternative 
view of parallel blocks will be given in chapter 6. 
Parallel blocks are delimited by the open parallel block symbol, '(||', and the close parallel
block symbol, '||)'. The use of the double pipeline symbol '||' is encouraged to separate the
individual update schemes in a parallel block. An m update scheme parallel block can then
be written as
(|| lhs_1 =[ g_1 ]=> rhs_1.
|| lhs_2 =[ g_2 ]=> rhs_2.
...
|| lhs_m =[ g_m ]=> rhs_m. ||)
or making use of typesetting possibilities as
|| lhs_1 =[ g_1 ]=> rhs_1.
|| lhs_2 =[ g_2 ]=> rhs_2.
...
|| lhs_m =[ g_m ]=> rhs_m.
To accommodate parallel blocks, the following production rules need to be added to
the basic Update Plans grammar.
(item) → (parallel block)
(parallel block) → "(||" ( (alternatives) "." )+ "||)"
Note that in Extended Update Plans not only alternatives can be in the body of a parallel 
block as will be shown in the second part of this thesis. 
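The all-or-nothing behaviour of a parallel block can be sketched in Python. This is a simplified model under stated assumptions: the configuration is a dictionary of cells, each rule is a (guard, rewrite) pair, and every rewrite computes its updates from the old configuration; `apply_parallel` and `swap` are hypothetical names.

```python
def apply_parallel(block, config):
    """Apply every rule of a parallel block atomically, or none of them.

    Returns the new configuration, or None if some guard fails or two
    right-hand sides disagree about the value of the same cell.
    """
    if not all(guard(config) for guard, _ in block):
        return None                        # some rule is not applicable
    merged = {}
    for _, rewrite in block:
        for cell, value in rewrite(config).items():
            if cell in merged and merged[cell] != value:
                return None                # right-hand sides are inconsistent
            merged[cell] = value
    return {**config, **merged}


# Two unrelated schemes applied simultaneously exchange X and Y.
swap = [
    (lambda c: True, lambda c: {"X": c["Y"]}),
    (lambda c: True, lambda c: {"Y": c["X"]}),
]
swapped = apply_parallel(swap, {"X": 1, "Y": 2})
```

The swap works without a temporary cell because both right-hand sides are evaluated against the configuration as it was before the block was applied.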
5 Examples 
5.1 ADTs 
The use of Update Plans is not limited to specification of hardware architectures although 
this is their primary aim. As has already been shown in this chapter, Update Plans can 
also be used to describe algorithms. This section shows their use in the description and 
implementation of abstract data types (ADTs), in particular singly linked lists. 
LIST[head] head[data1 item2] item2[data2 item3] item3[data3 NULL].
is-empty(list) = list[item] (item = NULL) =[ list ≠ NULL ]=> .
insert(item1, item2, data) = =[ item2 ≠ NULL ]=> item2[data item1].
insert(item1, item2, data) = item1[- next] =[ item1 ≠ NULL ∧ item2 ≠ NULL ]=>
item1[- item2] item2[data next].
remove(item) = item0[- item] item[- next] =[ item0 ≠ NULL ∧ item ≠ NULL ]=>
item0[- next].
length(NULL) = 0.
length(item) = item[- next] length(next)+1 =[ item ≠ NULL ]=> .
Figure 1.3: ADT list operations
An item of a list can be described as item[data itemf]. It consists of a payload data 
left of locator item and locator itemf addressing a following item. To make a list, items can 
be simply linked in a list of items, and a reference to its head will be added as shown in 
figure 1.3. 
The simplest operation on a list is to test whether it is empty. The archetype is-empty
simply checks whether the head of a list points to the constant NULL and expands the appropriate
truth value.
Two versions of the insert item operation are provided. The difference is only in the
placement of item1. Note that an additional update scheme inserting an item at the start of the
list would be required to complete the ADT if a list without a dummy head and the second
version of the insert archetype were to be used.
Similarly to the second insert item archetype, the remove item archetype doesn't
take into account removal of the first item in a list of linked items. Thus a similar function
removing the first item in a list would be required if a list without a dummy head were
to be used.
Finally, the recursive length archetype calculates length of a list of items. The remaining 
ADT operations such as "retrieve n-th item" can be defined in a similar fashion. 
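For comparison, the list operations of figure 1.3 can be mirrored in Python. The sketch makes several assumptions that are not in the original: the store is a dictionary mapping a locator to a two-element list [data, next], the list has a dummy head cell, NULL is modelled by None, and removal is addressed through the predecessor item0 rather than found by matching.

```python
NULL = None  # stands for the constant NULL


def is_empty(store, head):
    """True if the dummy head cell points to NULL."""
    return store[head][1] is NULL


def insert_after(store, item1, item2, data):
    """Second insert version: item1[- next] ==> item1[- item2] item2[data next]."""
    if item1 is NULL or item2 is NULL:
        raise ValueError("guard item1 != NULL and item2 != NULL failed")
    nxt = store[item1][1]
    store[item1][1] = item2
    store[item2] = [data, nxt]


def remove_after(store, item0):
    """Unlink the item following item0: item0[- item] item[- next] ==> item0[- next]."""
    if item0 is NULL or store[item0][1] is NULL:
        raise ValueError("guard item0 != NULL and item != NULL failed")
    item = store[item0][1]
    store[item0][1] = store[item][1]   # the unlinked cell itself is left untouched


def length(store, item):
    """Recursive length: length(NULL) = 0, otherwise length(next) + 1."""
    return 0 if item is NULL else 1 + length(store, store[item][1])


store = {"LIST": [None, NULL]}          # empty list with a dummy head
insert_after(store, "LIST", "i1", 10)
insert_after(store, "i1", "i2", 20)
```

With a dummy head, inserting or removing the first real item needs no special case, which is the situation the surrounding text contrasts with.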
5.2 Archetype expansion 
The following example uses archetypes and an update scheme from chapter 2 to illustrate the 
archetype expansion mechanism. In this example the update scheme 
[2.2] arithm(x, r1) dop(a, x) ==> a[r1] cc(r1).
will be expanded into one update scheme showing the effects of the INC instruction in direct 
autodecrement mode from the PDP-11 instruction set. 
The command scheme 2.2 expands as shown in figure 1.4. The archetypes and their 
index numbers from chapter 2 are included in figure 1.4 as comments for convenience. After 
substitution using equations derived at each step the full expansion of our update scheme is 
INC AUTODEC r r[b] a[x]b ==> r[a] a[x + 1]
CCN[((x + 1) <s 0) ((x + 1) = 0) (¬(MINs ≤s (x + 1) ≤s MAXs)) ((x + 1) >u MAXu)].
arithm(x, r1) dop(a, x) ==> a[r1] cc(r1).
# [2.2] arithm(x, x + 1) = INC.    x = x, r1 = x + 1
INC dop(a, x) ==> a[r1] cc(r1).
# [1.4] dop(a, v1) = AUTODEC(a, v1).    a = a, x = v1
INC AUTODEC(a, v1) ==> a[r1] cc(r1).
# [1.1] AUTODEC(a, v1) r = r[b] a[v1]b ==> r[a].    a = a, v1 = v1
INC AUTODEC r r[b] a[v1]b ==> r[a] a[r1] cc(r1).
# [2.1] cc(v2) = CCN[(v2 <s 0) (v2 = 0) (¬(MINs ≤s v2 ≤s MAXs)) (v2 >u MAXu)] | .    r1 = v2
INC AUTODEC r r[b] a[v1]b ==> r[a] a[r1] CCN[(v2 <s 0) (v2 = 0)
(¬(MINs ≤s v2 ≤s MAXs)) (v2 >u MAXu)].
Figure 1.4: Example of archetype expansion
A slightly more complex example of archetype expansion can also be found in chapter 3. 
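The effect of the fully expanded INC scheme can be imitated in Python. The sketch rests on assumptions that the text does not fix: words are two address units wide (so a = b - 2), the stored result wraps to 16 bits, N and Z are taken from the wrapped result, V checks the signed reading of x + 1 and C checks the unsigned reading. The names `inc_autodec` and `to_signed` are illustrative.

```python
MASK = 0xFFFF
MIN_S, MAX_S, MAX_U = -32768, 32767, 65535


def to_signed(word):
    """Two's complement reading of a 16-bit word."""
    return word - 0x10000 if word & 0x8000 else word


def inc_autodec(regs, mem, r, cc):
    """INC AUTODEC r: r[b] a[x]b ==> r[a] a[x + 1] plus the condition codes."""
    b = regs[r]
    a = b - 2                          # assumed word size of 2 address units
    x = mem[a]
    result = (x + 1) & MASK            # assumed wrap-around on store
    regs[r] = a
    mem[a] = result
    cc["N"] = to_signed(result) < 0
    cc["Z"] = result == 0
    cc["V"] = not (MIN_S <= to_signed(x) + 1 <= MAX_S)
    cc["C"] = x + 1 > MAX_U


regs, mem, cc = {1: 102}, {100: 0x7FFF}, {}
inc_autodec(regs, mem, 1, cc)
```

In this run the register is decremented to 100 first, the word there goes from 0x7FFF to 0x8000, and the signed range check reports an overflow.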
Chapter 2 
PDP-11 
The PDP-11 is one of the most widely known and used machines in the history of computing
science, and its importance in mission-critical applications can hardly be questioned. 
Although one might argue the relevance of a specification of a historically dead machine 
(although sales of hardware products for the PDP-11 continued until as recently as September 
1997), the very fact of its historical importance serves as a counterargument. 
The instruction set of the PDP-11 was designed to provide a clean, general, symmetric 
instruction set. Word length is 16 bits with the leftmost, most significant bit being bit 15. 
There are eight general registers of 16 bits each. Register 7 is the program counter (PC)
and, by convention, register 6 is the stack pointer (SP). There is also a Processor Status 
Register/Word (PSW) which among other things indicates the 4 condition code bits (N, Z, 
V, C). 
In this chapter, a formal specification (based on [21, 23, 69]) of the DEC PDP-11 machine
instruction set is given using the Update Plans formalism. Although some formal
specifications of the PDP-11 instruction set already exist [20, 69], this chapter gives a clear, compact,
unambiguous [60], and easy to follow specification which can serve as a comparison to those
already existing.
1 Addressing modes
Basic addressing modes of the PDP-11 machine are given in 1.1. They define two results-the 
effective address of the value and the value (v) itself. The following list describes PDP-11 
addressing modes informally. "The register" refers to one of eight general purpose registers 
(r), identified by an instruction word. "The displacement" refers to a word in the memory 
stored after an instruction (d). 
• register. Direct access to a general register. The content of the selected register is
taken as the value.
• autoincrement. At the start of an instruction's execution, the register contains the
address of the value, and after the value is accessed and the instruction is executed, the
address is incremented.
• autodecrement. The register is decremented before the value is accessed at
this new address.
• index. The content of the register is added to the displacement to produce the address
of the value.
• register deferred. The address of the value is stored in the register.
• autoincrement deferred. The register contains a pointer to the address of the value.
The pointer is automatically incremented after the value is retrieved.
• autodecrement deferred. The content of the register is first decremented and then
it contains a pointer to the address of the value.
• index deferred. The displacement is added to the base address stored in the register.
The result is a pointer to the address of the value.
[1.1] REG(r, v) r = r[v].                                        # register mode
      AUTOINC(b, v) r = r[b] b[v]c ==> r[c].                     # autoincrement mode
      AUTODEC(a, v) r = r[b] a[v]b ==> r[a].                     # autodecrement mode
      INDEX(b + d, v) r d = r[b] b+d[v].                         # index mode
      REGDEF(b, v) r = r[b] b[v].                                # register deferred mode
      AUTOINCDEF(b2, v) r = r[b1] b1[b2]c1 b2[v]c2 ==> r[c1].    # autoincrement deferred mode
      AUTODECDEF(a2, v) r = r[b1] a1[a2]b1 a2[v]b2 ==> r[a1].    # autodecrement deferred mode
      INDEXDEF(b2, v) r d = r[b1] b1+d[b2] b2[v].                # index deferred mode
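The two results of each basic addressing mode, together with the register side effect, can be tabulated by a small Python function. This is a sketch under assumptions: memory and registers are dictionaries, a word is two address units wide (the constant `WORD`), and the function name `operand` and the ("reg", r) address tag for register mode are invented for the illustration.

```python
WORD = 2  # assumed width of a word in address units


def operand(mode, regs, mem, r, d=0):
    """Return (effective address, value) and update regs[r] where required."""
    b = regs[r]
    if mode == "REG":            # r[v]
        return ("reg", r), b
    if mode == "AUTOINC":        # r[b] b[v]c ==> r[c]
        regs[r] = b + WORD
        return b, mem[b]
    if mode == "AUTODEC":        # r[b] a[v]b ==> r[a]
        a = b - WORD
        regs[r] = a
        return a, mem[a]
    if mode == "INDEX":          # r[b] b+d[v]
        return b + d, mem[b + d]
    if mode == "REGDEF":         # r[b] b[v]
        return b, mem[b]
    if mode == "AUTOINCDEF":     # r[b1] b1[b2]c1 b2[v]c2 ==> r[c1]
        regs[r] = b + WORD
        b2 = mem[b]
        return b2, mem[b2]
    if mode == "AUTODECDEF":     # r[b1] a1[a2]b1 a2[v]b2 ==> r[a1]
        a1 = b - WORD
        regs[r] = a1
        a2 = mem[a1]
        return a2, mem[a2]
    if mode == "INDEXDEF":       # r[b1] b1+d[b2] b2[v]
        b2 = mem[b + d]
        return b2, mem[b2]
    raise ValueError(f"unknown addressing mode {mode!r}")
```

The deferred modes differ from their direct counterparts only by one extra memory lookup, which is visible as the additional `mem[...]` step in each branch.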
In addition to these basic addressing modes there are 4 other special PC addressing modes. 
These addressing modes (1.2) come into effect when referencing the PC (register 7). Again, 
they define (apart from the immediate mode) two results-the effective address of the value 
and the value itself. "The word" refers to the contents of the location following the instruction. 
• PC immediate. The word is the resulting value itself.
• PC absolute. The word is interpreted as the address of the value.
• PC relative. The value's address is calculated by adding the word to the PC.
• PC relative deferred. A pointer to the value's address is calculated by adding the
word to the PC.
[1.2] IMM(v) v = .                             # immediate mode
      ABS(a, v) a = a[v].                      # absolute mode
      REL(pc + d, v) d = PC[pc] pc+d[v].       # relative mode
      RELDEF(a, v) d = PC[pc] pc+d[a] a[v].    # relative deferred mode
Archetypes in 1.3 and 1.4 define two classes of addressing modes-the source (sop), and 
the destination (dop) operand. 
[1.3] sop(v) = REG(-, v).
             = AUTOINC(-, v).
             = AUTODEC(-, v).
             = INDEX(-, v).
             = REGDEF(-, v).
             = AUTOINCDEF(-, v).
             = AUTODECDEF(-, v).
             = INDEXDEF(-, v).
             = IMM(-, v).
             = ABS(-, v).
             = REL(-, v).
             = RELDEF(-, v).
[1.4] dop(a, v) = REG(a, v).
                = AUTOINC(a, v).
                = AUTODEC(a, v).
                = INDEX(a, v).
                = REGDEF(a, v).
                = AUTOINCDEF(a, v).
                = AUTODECDEF(a, v).
                = INDEXDEF(a, v).
                = ABS(a, v).
                = REL(a, v).
                = RELDEF(a, v).
2 Instructions
2.1 Single operand instructions 
In order to make the Update Plans specification closer to the real implementation, the follow- 
ing data types are defined. Promotions and conversions between these individual data types 
are considered to be defined elsewhere. 
{Nibble = Bit Bit Bit Bit, Byte = Nibble Nibble, Word = Byte Byte}.
The following text shows the layout of condition codes in the PSW register. We make use
of the CCN, CCZ, CCV and CCC locators later on in this specification. These registers carry the negative,
zero, overflow and carry bits respectively.
bn, bz, bv, bc :: Bit.
nibble :: Nibble.
byte :: Byte.
PSW[byte nibble]CCN[bn]CCZ[bz]CCV[bv]CCC[bc].
Archetype 2.1 sets the condition codes according to the value of the parameter v.
[2.1] cc(v) = CCN[(v <s 0) (v = 0) (¬(MINs ≤s v ≤s MAXs)) (v >u MAXu)] | .
The constants MINs/MAXs are the smallest/largest 16-bit two's complement signed
integers (-32768/32767). MAXu is the largest 16-bit unsigned integer (65535). The operators <s,
≤s (signed comparison operators), >u (unsigned comparison operator) and = are assumed to
be defined elsewhere.
It is assumed that the standard arithmetic operators are already defined. The arithmetic 
instructions are then defined as archetypes. 
[2.2] arithm(-, 0) = CLR.                # clear destination
      arithm(x, ¬x) = COM.               # complement destination
      arithm(x, x + 1) = INC.            # increment destination
      arithm(x, x - 1) = DEC.            # decrement destination
      arithm(x, -x) = NEG.               # negate destination
      arithm(x, x + c) = ADC CCC[c].     # add carry to destination
      arithm(x, x - c) = SBC CCC[c].     # subtract carry from destination
      arithm(x, x / 2) = ASR.            # arithmetic shift right destination
      arithm(x, x × 2) = ASL.            # arithmetic shift left destination
      arithm(-, -1 × n) = SXT CCN[n].    # sign extend destination
      arithm(x, r) dop(a, x) ==> a[r] cc(r).
This specification covers mainly integer operand instructions. The byte operand instruc- 
tions could easily be defined. For example SWAB is defined by 
[2.3] v1, v0 :: Byte.
      SWAB dop(a, -) = a[v1 v0] ==> a[v0 v1].    # swap bytes of destination
The specification of INCB is slightly more complicated. Here is an example of how the 
specification would change if we were to specify the instruction set including the byte operand 
instructions. 
[2.4] arithm(x, x + 1) = INC ==> W.     # increment destination
      arithm(x, x + 1) = INCB ==> B.    # increment destination byte
      arithm(x, r) dop(a, x) ==> rcc(a, r, arithm(-, -)).
The other byte/word operand pairs can be defined similarly. The constants W and B are 
used to pass typing information to the new condition code archetype rcc. Note that the left- 
hand side call of arithm occurs in the text part of the left-hand side, while the right-hand side 
call is a parameter of the rcc archetype so that, e.g. the INC in update scheme 2.4 will appear
in the text of the expansion, while the constant W will be passed to rcc on the right-hand side. 
A small example is shown in figure 2.1. As this is a simple example (of a partial expansion), 
parameters are resolved every time an archetype is expanded. Some extra archetypes, which 
are used for byte operand instructions are shown in 2.5. 
[2.5] type(W) = Word.
      type(B) = Byte.
      rcc(a, v, w) = cc(v, w) | ==> a[(v :: type(w))].
      cc(v, W) = CCN[(v <s 0) (v = 0) (¬(MINs ≤s v ≤s MAXs)) (v >u MAXu)] | .
      cc(v, B) = CCN[(v <bs 0) (v = 0) (¬(MIN-Bs ≤bs v ≤bs MAX-Bs))
                 (v >bu MAX-Bu)] | .
arithm(x, r) dop(a, x) ==> rcc(a, r, arithm(-, -)).    # [2.4]
# [2.4] arithm(x, x + 1) = INCB ==> B.    x = x, r = x + 1
INCB dop(a, x) ==> rcc(a, x + 1, B).
# [2.5] rcc(a, v, w) = cc(v, w) | ==> a[(v :: type(w))].    a = a, x + 1 = v, B = w
INCB dop(a, x) ==> cc(x + 1, B) a[(x + 1 :: type(B))].
# [2.5] cc(v, B) = CCN[(v <bs 0) (v = 0) (¬(MIN-Bs ≤bs v ≤bs MAX-Bs))
#       (v >bu MAX-Bu)] | .    x + 1 = v
INCB dop(a, x) ==> CCN[(x + 1 <bs 0) (x + 1 = 0) (¬(MIN-Bs ≤bs x + 1 ≤bs MAX-Bs))
(x + 1 >bu MAX-Bu)] a[(x + 1 :: type(B))].
Figure 2.1: An example of archetype expansion
Again, the byte constants MIN-Bs, MAX-Bs, MAX-Bu and the byte operators <bs, ≤bs, >bu
have similar meaning to their integer (word) counterparts, and are assumed to be defined
elsewhere.
[2.6] TST dop(-, x) ==> cc(x).    # test destination
      ROR dop(a, x) CCC[c] ==> a[(x >> 1) | (c << 15)] CCC[(x & 1 :: Bit)].
      ROL dop(a, x) CCC[c] ==> a[(x << 1) | c] CCC[((x >> 15) & 1 :: Bit)].
The TST instruction sets the condition codes according to the contents of the destination 
word. 
ROR/ROL: the PSW C-bit and the destination word (considered as a 17-bit "word") are rotated
right/left one bit. The vacated high/low-order bit (bit 15/0) is loaded with the contents of 
the C-bit. The low/high-order bit of the destination (bit 0/15) is loaded into the C-bit. 
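The 17-bit rotation through the C-bit can be checked with a short Python sketch. The masking with 0xFFFF is an assumption added here to keep the destination at 16 bits; the function names `ror` and `rol` simply echo the opcodes.

```python
def ror(x, c):
    """Rotate right through carry: returns (new destination word, new C-bit)."""
    return ((x >> 1) | (c << 15)) & 0xFFFF, x & 1


def rol(x, c):
    """Rotate left through carry: returns (new destination word, new C-bit)."""
    return ((x << 1) | c) & 0xFFFF, (x >> 15) & 1


# Seventeen rotations of the 17-bit quantity restore the initial state.
word, carry = 0x1234, 1
for _ in range(17):
    word, carry = ror(word, carry)
```

After the loop, `word` and `carry` hold their initial values again, which is what treating the pair as a single 17-bit "word" implies.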
2.2 Double operand instructions 
The MOV instruction moves a copy of the source word contents to the destination. Its command 
archetype definition is given in 2.7. 
[2.7] MOV sop(v) dop(a, -) ==> a[v].
The following arithmetic and bit-wise instructions perform the specified operations, and 
set the destination operand and the condition codes in accordance with the result. 
[2.8] abrithm(x, y, x + y) = ADD.     # add source to destination
      abrithm(x, y, x - y) = SUB.     # subtract source from destination
      abrithm(x, y, ¬x & y) = BIC.    # bit clear destination from source
      abrithm(x, y, x | y) = BIS.     # bit set destination from source
      abrithm(x, y, r) sop(x) dop(a, y) ==> a[r] cc(r).
The comparison instructions are similar to the above instructions; however, the destination
operand is not affected.
[2.9] cmps(x, y, x - y) = CMP.    # compare source to destination
      cmps(x, y, x & y) = BIT.    # bit test source and destination bytes
      cmps(x, y, r) sop(x) dop(-, y) ==> cc(r).
2.3 Condition code and program flow operations 
The condition code operations SEc and CLc, where c is any of the condition codes N, Z, V, and
C, are shown in 2.10. The SEc operations set the corresponding condition code, and the CLc
operations clear it. TRUE and FALSE are predefined constants of the type Bit, values of which are 1 and 0
respectively.
[2.10] SEN ==> CCN[TRUE].    CLN ==> CCN[FALSE].
       SEZ ==> CCZ[TRUE].    CLZ ==> CCZ[FALSE].
       SEV ==> CCV[TRUE].    CLV ==> CCV[FALSE].
       SEC ==> CCC[TRUE].    CLC ==> CCC[FALSE].
       SCC ==> CCN[TRUE] CCZ[TRUE] CCV[TRUE] CCC[TRUE].
       CCC ==> CCN[FALSE] CCZ[FALSE] CCV[FALSE] CCC[FALSE].
The full suite of program flow instructions is given in the following definitions. First, the 
conditional branches are given (2.11), and then the unconditional jump is defined in 2.12. 
[2.11] branch(true)            = BR.                         # branch always
       branch(¬z)              = BNE  CCZ[z].                # ≠ 0
       branch(z)               = BEQ  CCZ[z].                # = 0
       branch(¬(n ^ v))        = BGE  CCN[n] CCV[v].         # ≥ 0
       branch(n ^ v)           = BLT  CCN[n] CCV[v].         # < 0
       branch(¬(z | (n ^ v)))  = BGT  CCZ[z] CCN[n] CCV[v].  # > 0
       branch(z | (n ^ v))     = BLE  CCZ[z] CCN[n] CCV[v].  # ≤ 0
       branch(¬n)              = BPL  CCN[n].                # +
       branch(n)               = BMI  CCN[n].                # -
       branch(¬(c | z))        = BHI  CCC[c] CCZ[z].         # higher (unsigned comparison)
       branch(¬c)              = BCC  CCC[c].                # carry clear
                               = BHIS CCC[c].                # higher or same (unsigned comparison)
       branch(c)               = BCS  CCC[c].                # carry set
                               = BLO  CCC[c].                # lower (unsigned comparison)
       branch(c | z)           = BLOS CCC[c] CCZ[z].         # lower or same (unsigned comparison)
       branch(¬v)              = BVC  CCV[v].                # overflow clear
       branch(v)               = BVS  CCV[v].                # overflow set
       PC[pc] pc[branch(c) d]cp =[ c ]=> PC[cp + d];         # branch
                                ==> PC[cp].                  # no branch
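The branch archetypes of 2.11 can be read as a table from mnemonic to a predicate over the condition code bits. The following Python fragment is only an illustrative sketch of that reading, not part of the specification: the names BRANCH_TAKEN and next_pc are hypothetical, and the displacement d is assumed to be already scaled, exactly as it appears in the update scheme (PC[cp + d]).

```python
# Sketch of definition 2.11: each predicate receives the condition code
# bits n, z, v, c (0 or 1) and tells whether the branch is taken.
# BRANCH_TAKEN and next_pc are illustrative names, not from the thesis.
BRANCH_TAKEN = {
    "BR":   lambda n, z, v, c: True,
    "BNE":  lambda n, z, v, c: not z,
    "BEQ":  lambda n, z, v, c: bool(z),
    "BGE":  lambda n, z, v, c: not (n ^ v),
    "BLT":  lambda n, z, v, c: bool(n ^ v),
    "BGT":  lambda n, z, v, c: not (z | (n ^ v)),
    "BLE":  lambda n, z, v, c: bool(z | (n ^ v)),
    "BPL":  lambda n, z, v, c: not n,
    "BMI":  lambda n, z, v, c: bool(n),
    "BHI":  lambda n, z, v, c: not (c | z),
    "BCC":  lambda n, z, v, c: not c,
    "BHIS": lambda n, z, v, c: not c,
    "BCS":  lambda n, z, v, c: bool(c),
    "BLO":  lambda n, z, v, c: bool(c),
    "BLOS": lambda n, z, v, c: bool(c | z),
    "BVC":  lambda n, z, v, c: not v,
    "BVS":  lambda n, z, v, c: bool(v),
}


def next_pc(mnemonic, cp, d, n, z, v, c):
    """Return the new PC: cp + d if the branch is taken, otherwise cp.

    cp is the address following the branch instruction, as in 2.11.
    """
    return cp + d if BRANCH_TAKEN[mnemonic](n, z, v, c) else cp
```

For instance, with the Z flag set, next_pc("BEQ", 100, 6, 0, 1, 0, 0) selects the branch alternative, while the same call with "BNE" falls through to cp.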
All branch instructions jump relative to the contents of the program counter, based on the
condition codes. The displacement d is contained in the instruction word. The absolute target
address of the jump instruction defined by the following command depends on the addressing
mode encoded in the instruction word.
[2.12] JMP dop(-, a) ==> PC[a].
The following register addressing archetype is defined for convenience. Note that the
register addressing command archetype defined in 1.1 cannot be used here, since it expands
the additional REG text.
[2.13] reg(r, v) = r[v] 1. 
JSR calculates the destination address (a), saves the contents (v) of the source register
(r) on the hardware stack, and saves the address of the following instruction in the source
register. (This saving of the PC provides the linkage from the subroutine back to the calling
program.) Finally, the PC is given the destination address, which produces the actual jump
to the subroutine. Return to the calling routine is provided by the companion instruction
RTS: the contents of the specified register are loaded into the PC, and the top of the hardware
stack is popped into the register. Note that register 6 is by convention the stack pointer.
[2.14] PC[pc] pc[JSR r dop(a, -)]qc reg(r, v) reg(6, tp)
           ==> PC[a] reg(6, sp) sp[v]tp reg(r, qc).
       RTS r reg(r, pc) reg(6, sp) sp[v]tp ==> PC[pc] reg(6, tp) reg(r, v).
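The effect of the two update schemes in 2.14 can be illustrated with a small state-transition sketch in Python. It is a hedged illustration only: the Machine class and the function names are hypothetical, memory is modelled as a sparse dictionary, and the stack is assumed to grow downwards in 2-byte words (the scheme itself only states that v lies between the locators sp and tp).

```python
WORD = 2  # assumed size of a stacked word in bytes


class Machine:
    """Minimal machine state: PC, eight registers, sparse memory."""

    def __init__(self):
        self.pc = 0
        self.reg = [0] * 8   # reg[6] is the stack pointer by convention
        self.mem = {}        # address -> word value


def jsr(m, r, dest, next_instr):
    """JSR r, dest: push reg[r], keep the return address in reg[r], jump."""
    m.reg[6] -= WORD                 # reg(6, tp) becomes reg(6, sp)
    m.mem[m.reg[6]] = m.reg[r]       # sp[v]tp: old register contents stacked
    m.reg[r] = next_instr            # reg(r, qc): linkage back to the caller
    m.pc = dest                      # PC[a]


def rts(m, r):
    """RTS r: PC takes reg[r], and reg[r] is restored from the stack."""
    m.pc = m.reg[r]                  # PC[pc]
    m.reg[r] = m.mem[m.reg[6]]       # reg(r, v)
    m.reg[6] += WORD                 # reg(6, tp)
```

A JSR followed by an RTS on the same register returns the machine to the instruction after the call and leaves both the register and the stack pointer as they were before the call.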
2.4 Interrupts 
Software interrupt instructions perform a trap through the vector at (fixed) memory locations. 
The effect of all of these instructions is the same: stack the contents of the PC and PSW 
(program status word), and then load the PC and PSW with the contents of a fixed memory 
location. The following definitions specify interrupt traps. 
[2.15] trap(14₈, 16₈) = BPT.    # break-point trap
       trap(20₈, 22₈) = IOT.    # input/output trap
       trap(30₈, 32₈) = EMT.    # emulator trap
       trap(34₈, 36₈) = TRAP.   # TRAP
       trap(pc_a, psw_a) reg(6, tp) pc_a[pc] psw_a[psw] PC[pc_s] PSW[psw_s]
           ==> sp[pc_s psw_s]tp reg(6, sp) PC[pc] PSW[psw].
Returns from interrupts, on the other hand, restore the original contents (just before an 
interrupt) of the PC and PSW registers from stack. 
[2.16] ret() = RTI.   # return from interrupt
             = RTT.   # return from trap
       ret() reg(6, sp) sp[pc psw]tp ==> PC[pc] PSW[psw] reg(6, tp).
2.5 Other instructions 
The NOP instruction is the standard no operation instruction. It produces no effect, but 
occupies one word of memory and requires a complete execution cycle for execution. 
[2.17] NOP ==> .   # no operation
Conclusions 
A significant subset of the PDP-11 instruction set has been described using the Update Plans 
formalism. Although this specification adheres strictly to the Update Plans formalism as 
described in [60], it already raises several issues addressed in the second part of this thesis. 
In particular, it prompted the development of sequential update schemes and archetypes as 
explained in chapter 6. A new, more complex specification would be based on the concepts 
described in the same chapter, possibly on a fetch/execute cycle level description. This 
specification is rather an abstract one. Use of the syntactic extensions described in the second 
part of the thesis could be made, together with additional typing information to define the 
precise layout of instruction words, as is done, for example, in the following chapter. 
Chapter 3 
SPARC-V9 
The SPARC¹ has been implemented in processors used in a range of computers from laptops
to supercomputers, and today boasts over 8,000 software application programs. SPARC-V9,
like its predecessor SPARC-V8, is a modern microprocessor specification created by the
SPARC Architecture Committee of SPARC International. SPARC-V9 is not a specific chip. 
It is an architectural specification that can be implemented as a microprocessor by those 
obtaining a license from SPARC International. 
In this chapter, a formal specification of the SPARC-V9 abstract machine instruction set 
is given using the Update Plans formalism. Although a partial specification of the SPARC-V9 
architecture already exists [50], this is aimed at a different level of description. 
The structure of this chapter is as follows. Section 1 defines types and constants used 
throughout the specification. The SPARC-V9 has a rich set of registers using a relatively new 
concept of windowing which is both informally described and then specified using Update 
Plans in section 2. The most important instructions of this architecture are specified in 
section 3. An example taken from the specification is shown in section 4. Finally, conclusions 
and some future research suggestions are given in section 5.
1 Types and constants 
Throughout the specification, the following types are used. The remaining types used in this 
specification which are not displayed here are declared in sections of their immediate use. Bit 
is considered to be a primitive type having the usual meaning. 
{Bit}.
{Byte(01₂) = {Bit}8, Halfword(10₂) = {Bit}16, Word(00₂) = {Bit}32,      5: 3.1, 3.2
 Extended-word = {Bit}64}.                                               5: 3.2
{Fp-single = {Bit}32, Fp-double = {Bit}64, Fp-quad = {Bit}128}.          5: 3.2
¹ Scalable Processor ARChitecture
In Update Plans, constants are uppercase words. However, as will be shown in chapter 5,
any sequence of letters (store) can be made into a constant by assigning it a type Constant.
The main reason for listing these constants is to show their bit values, supply grounding 
information where necessary, and use names easily identifiable against the official (informal) 
description of the SPARC-V9 architecture [77]. 
nPC :: Constant.
IMM13(1₂) :: Bit.
OP:ARM(10₂), OP:LS(11₂), OP:BRS(00₂), OP:CLL(01₂) :: {Bit}2.
BRZ(01₂), BRLEZ(10₂), BRLZ(11₂) :: {Bit}2.
ADD(000₂), SUB(100₂), SAVE(100₂), RESTORE(101₂) :: {Bit}3.
BPA(1000₂), BPN(0000₂), BPNE(1001₂), BPE(0001₂) :: {Bit}4.
LDSTUB(00 1101₂) :: {Bit}6.
FADD(0 0100 00₂), FSUB(0 0100 01₂) :: {Bit}7.
REGRES(0 00 00 00 00₂) :: {Bit}9.
2 Registers 
A SPARC-V9 processor includes two types of registers: general-purpose, or working data 
registers, and control/status registers. In the following sections only integer and floating 
point registers are described. For the structure of control/status registers, please refer to the 
informal specification [77]. 
2.1 General purpose r registers 
An implementation of the instruction unit may contain anything from 64 to 528 general-purpose
64-bit r registers. At any time, an instruction can access the first 8 global registers
(r[0-7]), and a 24-register window into the r registers. A register window comprises the 8
'in' and 8 'local' registers, together with the 8 'in' registers of an adjacent register set, which
are addressable from the current window as 'out' registers. See figure 3.1 for an illustration.
The following two sets of archetypes are provided to facilitate reading (2.1) and writing
(2.2) of r registers, and expansion of a 5-bit (a :: {Bit}5) register address in an instruction
word.                                                                    5: 3.2
[2.1] rr(0, 0) = .                                            # reading r[0] yields 0
      rr(a, v) = a[v] =[ 1 ≤ a < 8 ]=> .                      # global registers
               = CWP[w] outr(a - 8, v, w)  =[ 8 ≤ a < 16 ]=> .
               = CWP[w] locr(a - 16, v, w) =[ 16 ≤ a < 24 ]=> .
               = CWP[w] inr(a - 24, v, w)  =[ 24 ≤ a < 32 ]=> .
[Figure: register sets labelled outs W(NWINDOWS-1), ins W0, locals W0, outs W0 and ins W1, with register addresses 31-24, 23-16 and 15-8]
Figure 3.1: Register Window 0
CWP is a constant locator pointing to the left boundary of the CWP (Current Window
Pointer) register, which contains the 5-bit address (w :: {Bit}5) of the current 24-register
window of r registers.                                                   5: 3.2
A typical use of the archetypes rr is through the archetypes sop1 and sop2 (3.1), both of which
will be described in section 3.1.
[2.2] rw(a, v) a= CWP[w] rw(a, v, w).                         5: 2.2
      rw(a, v, w) = =[ 0 = a ]=> .                            # writing r[0] has no effect
                  = =[ 1 ≤ a < 8 ]=> a[v].                    # global registers
                  = =[ 8 ≤ a < 16 ]=> outw(a - 8, v, w).
                  = =[ 16 ≤ a < 24 ]=> locw(a - 16, v, w).
                  = =[ 24 ≤ a < 32 ]=> inw(a - 24, v, w).
Global register zero r[0] always reads as zero (archetype 2.1), and writes to it have no
program-visible effect (archetype 2.2).
[2.3] outr(a, v, w) = ilocr(a, v, w).
      outw(a, v, w) = ==> ilocw(a, v, w).
      locr(a, v, w) = ilocr(a + 8, v, w).
      locw(a, v, w) = ==> ilocw(a + 8, v, w).
      inr(a, v, w)  = ilocr(a, v, w - 1).
      inw(a, v, w)  = ==> ilocw(a, v, w - 1).
Archetypes outr/outw, locr/locw and inr/inw (used to read/write access 'out', 'local' 
and 'in' registers) are defined in terms of archetypes ilocr/ilocw (2.4) which calculate left 
locators for these 3 types of r registers. 
[2.4] ilocr(a, v, w) = (NWINDOWS - 1 - (w % NWINDOWS)) × 16 + a[v].
      ilocw(a, v, w) = ==> (NWINDOWS - 1 - (w % NWINDOWS)) × 16 + a[v].
The number of windows or register sets, NWINDOWS, is implementation-dependent and
ranges from 3 to 32. Note that an r register with address o, where 8 ≤ o < 16, refers to exactly
the same register as o + 16 does after the CWP is incremented by 1 (% NWINDOWS). Likewise,
a register with address i, where 24 ≤ i < 32, refers to exactly the same register as address
i - 16 does after the CWP is decremented by 1 (% NWINDOWS).
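The overlap between adjacent windows follows directly from archetypes 2.1, 2.3 and 2.4, and can be checked with a short calculation. The Python sketch below mirrors those archetypes under stated assumptions: the function names are hypothetical, NWINDOWS is fixed at 8 purely for illustration, and only the locator arithmetic is modelled (no register contents).

```python
NWINDOWS = 8  # implementation-dependent (3 to 32); 8 is an arbitrary choice


def iloc(a, w):
    """Left locator computed as in archetype 2.4 for offset a and window w."""
    return (NWINDOWS - 1 - (w % NWINDOWS)) * 16 + a


def windowed_locator(a, cwp):
    """Map a windowed r register address a (8..31) to its locator.

    Follows the case split of archetype 2.1 and the offsets of 2.3.
    """
    if 8 <= a < 16:                      # 'out' registers
        return iloc(a - 8, cwp)
    if 16 <= a < 24:                     # 'local' registers
        return iloc((a - 16) + 8, cwp)
    if 24 <= a < 32:                     # 'in' registers
        return iloc(a - 24, cwp - 1)
    raise ValueError("r[0..7] are global registers and are not windowed")
```

With these definitions, windowed_locator(o, w) and windowed_locator(o + 16, (w + 1) % NWINDOWS) coincide for every 'out' address o, which is the overlap property stated above.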
2.2 Floating-point f registers 
The floating-point unit contains:
* 32 single-precision (32-bit) floating-point registers, numbered f[0], f[1], ..., f[31]
* 32 double-precision (64-bit) floating-point registers, numbered f[0], f[2], ..., f[62]
* 16 quad-precision (128-bit) floating-point registers, numbered f[0], f[4], ..., f[60]
The floating-point registers are arranged so that some of them overlap, that is, are aliased. 
Layout and numbering of the floating-point registers is apparent from the following archetypes. 
[2.5] fr(a, v, s) a= a[(v :: Fp-single)] =[ s = 01₂ ]=> .       5: 2.2
                   = a[(v :: Fp-double)] =[ s = 10₂ ]=> .
                   = a[(v :: Fp-quad)]   =[ s = 11₂ ]=> .
      fw(a, v, s) a= =[ s = 01₂ ]=> a[(v :: Fp-single)].        5: 2.2
                   = =[ s = 10₂ ]=> a[(v :: Fp-double)].
                   = =[ s = 11₂ ]=> a[(v :: Fp-quad)].
Archetypes (2.5) are provided to encapsulate reading (fr) and writing (fw) of different
types of floating-point values into f registers. Similarly to r registers, the address a of an f
register is 5 bits wide (a :: {Bit}5).                                   5: 3.2
Unlike the windowed r registers, all of the floating-point registers are accessible at any 
time. 
3 Instructions
3.1 Arithmetic and logical operations 
Arithmetic and logical instructions (and some others) use the instruction word format shown
in detail in figure 3.2. The value (major opcode) of the two highest bits of all instructions falling
into this category is 10₂, for which the constant OP:ARM is used. The other fields in the instruction
word are the 5-bit addresses of the source (rs1, rs2) and destination (rd) registers, the minor opcode op3,
and, if bit 13 is set to 1, a 13-bit immediate value simm13, which is always sign-extended to 64 bits.
In order to make the specification as compact and clear as possible, frequent use is made of the
archetype mechanism. Firstly, we define archetypes for accessing values in source
registers, which at the same time describe the layout of the instruction word.
[3.1] sop1(v) = rr(a, v) a.
      sop2(v) = REGRES(v).
              = IMM13(v).
Command archetypes REGRES and IMM13 help in defining the instruction word format and 
return a value of a source operand in variable v. The function sign-ext sign-extends an 
integer value and is assumed to be defined elsewhere. 
[3.2] REGRES(v) a= rr(a, v).                 # REGRES = 0 00 00 00 00₂     5: 2.2
      IMM13(sign-ext(v)) = (v :: {Bit}13).   # IMM13 = 1₂                  5: 3.2
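The choice between the REGRES and IMM13 alternatives of sop2 corresponds to testing bit 13 of the instruction word. A minimal Python sketch of that decoding is given below; the function names and the read_reg callback are assumptions made for illustration, and the nine REGRES bits are not checked for being zero.

```python
def sign_ext(value, bits):
    """Sign-extend a two's complement value that is `bits` wide."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def second_operand(word, read_reg):
    """Return the second source operand of a 32-bit instruction word.

    read_reg is a caller-supplied function mapping a 5-bit register
    address to its value (it plays the role of archetype rr).
    """
    if (word >> 13) & 1:                       # IMM13: bit 13 is 1
        return sign_ext(word & 0x1FFF, 13)     # simm13 in bits 12-0
    return read_reg(word & 0x1F)               # REGRES: rs2 in bits 4-0
```

For example, a word with bit 13 set and all thirteen low bits set yields the operand -1.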
Secondly, there must be a way of reading and writing the carry bit in the condition codes
register. Archetypes 3.3 and 3.4 do exactly that. Note that arithmetic and logical instructions
always read the carry bit from the integer condition code register (ICC), whereas they write to
both the ICC and the extended integer condition code (XCC) registers.
[3.3] iccr(c)  = CCR:ICC:C[c].
      iccrb(c) = 1₂ iccr(c).
      iccrb(0) = 0₂.
Archetype iccrb reads the carry flag to c, if the carry bit in the op3 field (bit 22) is set. 
Similarly, ccwb sets the condition code registers XCC and ICC (only) if the ccwb bit (23) in 
the instruction is 1. 
[3.4] ccw(v) = CCR:XCC[(v <_4s 0) (v = 0) (¬(MIN_4s ≤_4s v ≤_4s MAX_4s)) (v >_4u MAX_4u)]      5: 2.2
               CCR:ICC[(v <_2s 0) (v = 0) (¬(MIN_2s ≤_2s v ≤_2s MAX_2s)) (v >_2u MAX_2u)].
      ccwb(v) = 0₂.
              = 1₂ ==> ccw(v).
The constants MIN_ns and MAX_ns are the smallest and largest n-byte two's complement signed
integers, and MAX_nu is the largest n-byte unsigned integer. The n-byte operators <_ns and ≤_ns
(signed comparison), >_nu (unsigned comparison) and = are assumed to be defined elsewhere.
[3.5] c :: Bit.
      aloper(x, y, x + y + c) = 0₂ ccwb($3) iccrb(c) ADD.   # ADD = 000₂     5: 2.4
      aloper(x, y, x - y - c) = 0₂ ccwb($3) iccrb(c) SUB.   # SUB = 100₂     5: 2.4
Archetype aloper (3.5) shows the behaviour of 8 different arithmetic instructions and defines
the layout of their op3 field, making use of the previously defined archetypes iccrb and
ccwb. These instructions include add/subtract, add/subtract and modify condition codes,
add/subtract with carry, and add/subtract with carry and modify condition codes. The $n
symbol used in archetype bodies references the archetype's nth parameter. Thus
the $3 of the first aloper archetype is merely shorthand for 'x + y + c'.
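The way the two aloper archetypes expand into 8 instructions can be made explicit by decoding the 6-bit op3 field as laid out in 3.5: a leading 0, the condition code write bit (consumed by ccwb), the carry bit (consumed by iccrb), and the 3-bit ADD or SUB constant. The Python sketch below illustrates this under stated assumptions; the function names are hypothetical, and the mnemonic suffixes C and cc follow the naming used in the SPARC-V9 manual.

```python
def decode_arith_op3(op3):
    """Split op3 as 0 | cc | carry | ooo; return None if it does not fit."""
    if op3 >> 5 or (op3 & 0b111) not in (0b000, 0b100):
        return None
    base = "ADD" if (op3 & 0b111) == 0b000 else "SUB"
    sets_cc = bool((op3 >> 4) & 1)      # bit 23 of the instruction word
    uses_carry = bool((op3 >> 3) & 1)   # bit 22 of the instruction word
    return base, sets_cc, uses_carry


def mnemonic(base, sets_cc, uses_carry):
    """Compose a mnemonic such as ADDCcc from the decoded fields."""
    return base + ("C" if uses_carry else "") + ("cc" if sets_cc else "")


def result(base, x, y, carry, uses_carry):
    """Third aloper parameter: x + y + c or x - y - c (c is 0 without carry)."""
    c = carry if uses_carry else 0
    return x + y + c if base == "ADD" else x - y - c
```

Exactly 8 of the 64 possible op3 values decode successfully, matching the count given in the text.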
Since arithmetic and logical instructions fall into the same category as regards their
instruction field layout, the following archetypes (and in particular the aloper archetype) are
provided to specify both the op3 field and the appropriate actions for logical instructions.
bits:    31-30  29-25  24-19  18-14  13  12-5  4-0
i = 0:   10     rd     op3    rs1    0   -     rs2
i = 1:   10     rd     op3    rs1    1   simm13 (bits 12-0)
Figure 3.2: Format of arithmetic, logical, and SAVE and RESTORE instructions
[3.6] n :: Bit.
      op :: {Bit}2.
      neg(v) = ¬ =[ v = 1₂ ]=> .
             = ==> .                     # empty body if v ≠ 1₂
      lop(op) = & =[ op = 01₂ ]=> .      # bit-wise and
              = | =[ op = 10₂ ]=> .      # bit-wise or
              = ^ =[ op = 11₂ ]=> .      # bit-wise xor
      aloper(x, y, neg(n) (x lop(op) y)) = 0₂ ccwb($3) 0₂ n op.
If the n bit is set, it inverts (by means of the neg archetype) the logical operation
determined by the op field. Note that archetype neg expands to empty right and left-hand
sides if the value of the bit n is 0. The archetype lop simply selects one of 3 logical
operations based on the value of its two-bit parameter op. Again, condition codes are set or not
by the archetype ccwb based on the result of the logical operation (third argument of archetype
aloper) in the same way as for arithmetical operations.
Many update schemes in this specification are commands. As explained in [60], every
command whose left and right program counters are hidden can be prefixed by 'pc() =',
where pc() is a unique archetype declaration. As there are two program counters in the
SPARC-V9 (PC, containing the address of the current instruction, and nPC, containing the
address of the next instruction), the following update scheme can be defined and all commands
instantiated through the expansion in this scheme.
[3.7] PC[pc] pc[pc()]qc nPC[pc'] ==> PC[pc'] pc[pc()]qc nPC[pc' + 4].
Finally, the command update scheme 3.8 specifies the instruction field layout of all arithmetic
and logical instructions defined in this section.
[3.8] OP:ARM rw(-, r) aloper(x, y, r) sop1(x) sop2(y)   # OP:ARM = 10₂
3.2 Register window manipulation instructions 
All r registers (apart from the first eight) are windowed. This means that there must be
a mechanism to access registers in various windows. There are two instructions that do
that: SAVE and RESTORE.
The SAVE instruction provides the routine executing it with a new register window. The
'out' registers from the old window become the 'in' registers of the new window.
The RESTORE instruction restores the register window saved by the last SAVE instruction 
executed by the current process. The 'in' registers of the old window become the 'out' registers 
of the new window. The 'in' and 'local' registers in the new window contain the previous 
values. 
[3.9] sr(x, y, x + y + c, w + 1) =
          10 ccwb($3) iccr(c) SAVE CWP[w]       # SAVE = 100₂       5: 2.4
      sr(x, y, x + y + c, w - 1) =
          10 ccwb($3) iccr(c) RESTORE CWP[w]    # RESTORE = 101₂    5: 2.4
      OP:ARM rw(a, r, w) a sr(x, y, r, w) sop1(x) sop2(y) ==> CWP[w % NWINDOWS].
Furthermore, SAVE and RESTORE behave like an ordinary ADD instruction, except that
the source operands r[rs1] and/or r[rs2] are read from the old window (that is, the window
addressed by the original CWP) and the sum is written into r[rd] of the new window (that is,
the window addressed by the new CWP).
3.3 Load/Store instructions 
3.3.1 Load-store unsigned byte 
The load-store unsigned byte instruction copies a byte from memory into r[rd], and then
rewrites the addressed byte in memory to all ones. The fetched byte is right-justified in the
destination register r[rd] and zero-filled on the left. The format of this instruction is the same
as the format of arithmetic instructions, with the exception of the major opcode.
[3.10] r :: Byte.
       ldstub(a, r) = LDSTUB a[r] ==> a[11111111₂].      # LDSTUB = 00 1101₂
       OP:LS rw(-, r) ldstub(x + y, r) sop1(x) sop2(y)    # OP:LS = 11₂
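The read-then-overwrite behaviour captured by the ldstub archetype can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: memory is a bytearray, the function name mirrors the archetype but is otherwise hypothetical, and the atomicity that an implementation must guarantee is not modelled.

```python
def ldstub(memory, address):
    """Return the byte at `address` and rewrite that byte to all ones.

    The returned value is already zero-filled on the left, since a
    single byte is read as a non-negative integer.
    """
    old = memory[address] & 0xFF     # a[r] on the left-hand side
    memory[address] = 0xFF           # a[11111111] on the right-hand side
    return old
```

Calling the function twice on the same address returns the original byte first and 0xFF afterwards, which is what makes the instruction usable as a simple lock primitive.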
3.3.2 Load integer from alternate space 
The load integer from alternate space instructions copy a byte, a halfword, a word or an 
extended word from memory into r[rd]. A fetched byte, halfword, or word is right-justified 
in the destination register; it is either sign-extended or zero-filled on the left (to 64 bits), 
depending on whether the opcode specifies a signed or unsigned operation, respectively. 
For each instruction access and each normal data access, the IU appends an 8-bit address 
space identifier (ASI) to the 64-bit memory address. Load/Store alternate instructions can 
provide an arbitrary ASI with their data addresses (if bit 13 of the instruction word = 0), or 
use the ASI value contained in the ASI register (if bit 13 of the instruction word = 1). 
Archetypes defined in 3.11 and 3.13, and the type alias declaration in 3.12, are provided in
order to make the final command update scheme in 3.14 more compact.
bits:    31-30  29-25  24-19  18-14  13  12-5     4-0
i = 0:   11     rd     op3    rs1    0   imm-asi  rs2
i = 1:   11     rd     op3    rs1    1   simm13 (bits 12-0)
Figure 3.3: Format of load/store integer from/into alternate space instructions
[3.11] signed(s) = zero-fill =[ ¬s ]=> .
                 = sign-ext  =[ s ]=> .
[3.12] {Integer = Byte | Halfword | Word}.
The archetype 3.13 makes sure that we get the correct value of ASI regardless of the value 
of bit 13 in the instruction field, fetches values x and y from source registers, and sets the 
format of the instruction word for bits 0-18. 
[3.13] asi :: Byte.
       sopasi(x, y, asi) = sop1(x) 0₂ asi sop1(y).
                         = sop1(x) IMM13(y) ASI[asi].
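The two alternatives of sopasi differ only in where the second operand and the ASI come from. The following Python sketch shows one possible reading of bits 0-18 of the instruction word; the function names, the read_reg callback and the asi_register argument are hypothetical, and 64-bit address wrap-around is not modelled.

```python
def sign_ext13(value):
    """Sign-extend a 13-bit two's complement value."""
    return (value & 0x0FFF) - (value & 0x1000)


def decode_alternate_access(word, read_reg, asi_register):
    """Return (effective_address, asi) for a load/store alternate word.

    read_reg maps a 5-bit register address to its value; asi_register
    is the current content of the ASI register.
    """
    x = read_reg((word >> 14) & 0x1F)                       # rs1, bits 18-14
    if (word >> 13) & 1:                                    # bit 13 = 1
        return x + sign_ext13(word & 0x1FFF), asi_register  # simm13, ASI register
    # bit 13 = 0: rs2 in bits 4-0, imm-asi in bits 12-5
    return x + read_reg(word & 0x1F), (word >> 5) & 0xFF
```

With bit 13 clear, the ASI is taken from the instruction word itself; with bit 13 set, the caller-supplied ASI register value is returned unchanged.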
Load/Store instructions are inconsistent in the bit values they use to denote the type
Extended-word, which is why a separate archetype lias is provided for this type. The
asi:a locator points to a normal or alternative address area.
[3.14] s :: Bit.
       lias(asi, a, signed(s) (r)) = s 0₂ Integer asi:a[(r :: Integer)].      5: 3.1
       lias(asi, a, r) = 1011₂ asi:a[(r :: Extended-word)].
       OP:LS rw(-, r) 01₂ lias(asi, x + y, r) sopasi(x, y, asi) ==> .
3.3.3 Store integer into alternate space 
The store integer into alternate space instructions copy the whole extended (64-bit) integer,
the least-significant word, the least-significant halfword, or the least-significant byte of r[rd]
into memory. Again, due to the inconsistency described in the previous section, a separate
archetype sias is provided for the type Extended-word.
[3.15] sias(asi, a, r) = 01₂ Integer ==> asi:a[(r :: Integer)].        5: 3.1
       sias(asi, a, r) = 1110₂ ==> asi:a[(r :: Extended-word)].
       OP:LS rr(-, r) 01₂ sias(asi, x + y, r) sopasi(x, y, asi) ==> .
bits:    31-30  29-25  24-19   18-14  13-7  6-5  4-0
         10     rd     110100  rs1    fop   s    rs2
Figure 3.4: Format of the floating-point ADD and SUB instructions
3.4 Floating-point instructions 
3.4.1 Floating-point ADD and SUB instructions 
The floating-point add instructions add the floating-point register specified by the rs1 field
and the floating-point register specified by the rs2 field, and write the sum into the
floating-point register specified by the rd field. The floating-point subtract instructions subtract the
floating-point register specified by the rs2 field from the floating-point register specified by
the rs1 field, and write the difference into the floating-point register specified by the rd field.
Note that the command archetype in 3.16 makes use of the archetypes fr defined in section 2.2,
which read appropriate floating-point type values (based on the value in the two-bit cell s)
into variables x and y. The archetype fw writes the result (r) into a floating-point register
specified by the rd field; again, its type is based on the value of s.
[3.16] s :: {Bit}2.
       farithm(x, y, x + y) = FADD.     # FADD = 0 0100 00₂
       farithm(x, y, x - y) = FSUB.     # FSUB = 0 0100 01₂
       OP:ARM fw(-, r, s) 110100₂ fr(-, x, s) farithm(x, y, r) s fr(-, y, s) ==> .
3.4.2 Floating-point compare 
These instructions compare the floating-point register specified by the rs1 field with the
floating-point register specified by the rs2 field, and set the selected floating-point condition
code (fccn) as specified by the 3.17 archetype.
[3.17] fcmp(x, y) = =[ x = y ]=> 00₂.
                  = =[ x < y ]=> 01₂.
                  = =[ x > y ]=> 10₂.
                  = =[ x ? y ]=> 11₂.    # unordered (x and/or y is NaN)
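The four outcomes of the fcmp archetype, including the unordered case, can be reproduced with ordinary floating-point values. The Python sketch below is an illustration only; the constant names E, L, G and U anticipate the constants this chapter introduces for the floating-point branches in section 3.5.4, and Python floats stand in for all three SPARC floating-point formats.

```python
import math

# 2-bit condition code values produced by archetype 3.17
E, L, G, U = 0b00, 0b01, 0b10, 0b11


def fcmp(x, y):
    """Return the 2-bit floating-point condition code for x compared to y."""
    if math.isnan(x) or math.isnan(y):
        return U          # unordered: at least one operand is NaN
    if x == y:
        return E
    if x < y:
        return L
    return G
```

The NaN test has to come first, because every ordered comparison involving a NaN is false and would otherwise fall through to the final case.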
Archetypes fccnr and fccnw read/write a selected floating-point condition code field,
and finally, update scheme 3.19 makes use of all the archetypes from this section to specify
the floating-point compare instruction and its field layout.
[3.18] v :: {Bit}2.                                 5: 3.2
       fccnr(fcc, v) = FSR:FCC:fcc[v].
       fccnw(fcc, v) = ==> FSR:FCC:fcc[v].
bits:    31-30  29-27  26   25   24-19   18-14  13-7     6-5  4-0
         10     000    cc1  cc0  110101  rs1    0010100  s    rs2
Figure 3.5: Format of the floating-point compare instructions
[3.19] OP:ARM 000₂ cc1 cc0 110101₂ fr(-, x, s) 0010100₂ s fr(-, y, s) ==>
           fccnw(cc1 ‖ cc0, fcmp(x, y)).
3.4.3 Load floating-point from alternate space 
The load single floating-point from alternate space instruction LDFA (see figure 3.6) copies
a word from memory into f[rd]. The load doubleword floating-point from alternate space
instruction (LDDFA) copies a word-aligned doubleword from memory into a double-precision
floating-point register. The load quad floating-point from alternate space instruction (LDQFA)
copies a word-aligned quadword from memory into a quad-precision floating-point register.
The command update scheme in 3.20 makes use of archetypes defined earlier in 2.5 and 3.13.
[3.20] ldfa(asi, a, r, 01) = 00₂ asi:a[(r :: Fp-single)].     # LDFA
       ldfa(asi, a, r, 10) = 11₂ asi:a[(r :: Fp-double)].     # LDDFA
       ldfa(asi, a, r, 11) = 10₂ asi:a[(r :: Fp-quad)].       # LDQFA
       OP:LS fw(-, r, s) 1100₂ ldfa(asi, x + y, r, s) sopasi(x, y, asi)
3.4.4 Store floating-point into alternate space 
The store single floating-point into alternate space instruction (STFA) copies f[rd] into memory.
The store double floating-point into alternate space instruction (STDFA) copies a doubleword
from a double floating-point register into a word-aligned doubleword in memory. The store
quad floating-point into alternate space instruction (STQFA) copies the contents of a quad
floating-point register into a word-aligned quadword in memory.
Similarly to the load floating-point from alternate space instructions (and the load/store integer
from/into alternate space instructions), the store floating-point instructions take the ASI to be used
from the imm-asi field if bit 13 = 0 in the instruction word, or from the ASI register if the same
bit (i) reads as 1. The effective address for these instructions is r[rs1] + r[rs2] if i = 0, or
r[rs1] + sign-ext(simm13) if i = 1.
[3.21] stfa(asi, a, r, 01) = 00₂ ==> asi:a[(r :: Fp-single)].     # STFA
       stfa(asi, a, r, 10) = 11₂ ==> asi:a[(r :: Fp-double)].     # STDFA
       stfa(asi, a, r, 11) = 10₂ ==> asi:a[(r :: Fp-quad)].       # STQFA
       OP:LS fr(-, r, s) 1101₂ stfa(asi, x + y, r, s) sopasi(x, y, asi)
bits:    31-30  29-25  24-19  18-14  13  12-5     4-0
i = 0:   11     rd     op3    rs1    0   imm-asi  rs2
i = 1:   11     rd     op3    rs1    1   simm13 (bits 12-0)
Figure 3.6: Load/Store floating-point from/into alternate space

bits:    31-30  29-25  24-19   18-14  13  12-5  4-0
i = 0:   10     rd     111000  rs1    0   -     rs2
i = 1:   10     rd     111000  rs1    1   simm13 (bits 12-0)
Figure 3.7: Format of the JMPL instruction
3.5 Control transfer instructions 
3.5.1 Jump and link 
One of the simplest control transfer instructions is the JMPL instruction. JMPL writes
the contents of the PC (program counter), which points to the JMPL instruction itself, into
the r[rd] register² and then causes a delayed transfer of control (the register nPC (next PC)
is written) to the address given by r[rs1] + r[rs2] or r[rs1] + sign-ext(simm13), based on the
value of bit 13 of the instruction word. Note that its format is the same as that of arithmetic
instructions (figure 3.2). The value 111000₂ is the content of the op3 instruction field and
simply distinguishes the JMPL instruction from other (arithmetic) instructions.
[3.22] PC[pc] nPC[npc] pc[OP:ARM rw(-, pc) 111000₂ sop1(x) sop2(y)] ==> PC[npc] nPC[x + y].
3.5.2 Branch on integer register with prediction (BPr) 
These instructions branch based on the contents of r[rs1]. They treat the register contents
as a signed integer value. A BPr instruction examines all 64 bits of r[rs1] according to the
cond field of the instruction, producing either a TRUE or FALSE result. If TRUE, the branch
is taken; that is, the instruction causes a PC-relative, delayed control transfer to the address
PC + (4 × sign-ext(d16hi ‖ d16lo)). If FALSE, the branch is not taken. If the branch is taken,
the delay instruction is always executed, regardless of the value of the annul bit. If the branch
is not taken and the annul bit (a) is 1, the delay instruction is annulled (not executed); see
update scheme 3.25.
The predict bit (p) is used to give the hardware a hint about whether the branch is
expected to be taken. A 1 in the p bit indicates that the branch is expected to be taken; a 0
indicates that the branch is expected not to be taken.
First, let us define some stores which will be used later in the specification. 
² The value written into the register is visible to the instruction in the delay slot.

bits:    31-30  29  28  27-25  24-22  21-20  19  18-14  13-0
         00     a   0   cond   011    d16hi  p   rs1    d16lo
Figure 3.8: Format of BPr instructions
a, p, n :: Bit.       # annul, prediction and "negate condition" bits
d16hi :: {Bit}2.      # 2-bit PC-relative displacement               5: 3.2
rs1 :: {Bit}5.        # address of the 1st source register           5: 3.2
d16lo :: {Bit}14.     # 14-bit PC-relative displacement              5: 3.2
The values of constants BRZ, BRLEZ, BRLZ and OP: BRS can be found in section 1. 
[3.23] bpr-cond(v, neg(n) (v = 0)) = n BRZ.      # Branch on Register Zero
       bpr-cond(v, neg(n) (v ≤ 0)) = n BRLEZ.    # " Register Less Than or Equal to Zero
       bpr-cond(v, neg(n) (v < 0)) = n BRLZ.     # " Register Less Than Zero
Note that the archetypes bpr-cond can expand into 6 different update schemes, based on the
value of bit 27 (n) of the instruction word. The archetype neg is defined in 3.6 on page 35.
Ambidextrous archetypes ea are used throughout the specification, and expand an effective
jump address to branches.
[3.24] ea(pc, displ) pc + (4 × sign-ext(displ)) = .
[3.25] PC[pc] nPC[npc] pc[OP:BRS a 0₂ bpr-cond(v, c) 011₂ d16hi p rw(rs1, v) d16lo]
           =[ c ]=> PC[npc] nPC[ea(pc, d16hi ‖ d16lo)].
           =[ ¬c ∧ a = 0 ]=> PC[npc] nPC[npc + 4].         # instruction in delay slot executed
           =[ ¬c ∧ a = 1 ]=> PC[npc + 4] nPC[npc + 8].     # instruction in delay slot annulled
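The three guarded alternatives of update scheme 3.25 amount to a small function from the current (PC, nPC) pair to the next one. The Python sketch below is a hedged illustration of that function: the names are hypothetical, the branch condition is passed in as an already evaluated boolean, and the two displacement fields are concatenated with d16hi as the most significant bits, as in the text.

```python
def sign_ext(value, bits):
    """Sign-extend a two's complement value that is `bits` wide."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def bpr_next(pc, npc, taken, annul, d16hi, d16lo):
    """Return the new (PC, nPC) pair after a BPr instruction."""
    if taken:
        disp = sign_ext((d16hi << 14) | d16lo, 16)
        return npc, pc + 4 * disp        # delay instruction runs, then target
    if annul:
        return npc + 4, npc + 8          # delay instruction annulled
    return npc, npc + 4                  # delay instruction executed
```

In the taken case the annul bit is not consulted at all, which reflects the rule that the delay instruction of a taken BPr branch is always executed.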
3.5.3 Branch on integer condition codes with prediction (BPcc) 
Unconditional Branches (BPA, BPN). A BPN (Branch Never with Prediction) in no case 
causes a transfer of control to take place. This instruction may be treated as a NOP by an 
implementation. 
BPA (Branch Always with Prediction) causes an unconditional PC-relative, delayed control
transfer to the address PC + (4 × sign-ext(disp19)).
If the BPN's or BPA's a field is 1, the following (delay) instruction is not executed. If the a
field is 0, the following instruction is executed.
Conditional Branches. Conditional BPcc instructions (except BPA and BPN) evaluate one
of the two integer condition codes (icc or xcc), as selected by cc0 and cc1, according to the
cond field of the instruction, producing either a TRUE or FALSE result.
If TRUE, the branch is taken; that is, the instruction causes a PC-relative, delayed control
transfer to the address PC + (4 × sign-ext(disp19)). If FALSE, the branch is not taken. If a
conditional branch is taken, the delay instruction is always executed regardless of the value
of the annul (a) field. If a conditional branch is not taken and the a field is 1, the delay
instruction is not executed.

bits:    31-30  29  28-25  24-22  21   20   19  18-0
         00     a   cond   001    cc1  cc0  p   disp19
Figure 3.9: Format of BPcc instructions
Again, let us define some stores which will be used later in the following sections of this 
specification. 
n, z, v, c :: Bit.      # bits of the ICC/XCC register
cc1, cc0 :: Bit.        # condition codes selection
cond :: {Bit}4.         # the condition field
disp19 :: {Bit}19.      # branch's PC-relative displacement
[3.26] bp-cond(-, TRUE)              = BPA.                       # Branch Always
       bp-cond(-, FALSE)             = BPN.                       # " Never
       bp-cond(ncc, ¬z)              = CCR:ncc[n z v c] BPNE.     # " on ≠
       bp-cond(ncc, z)               = CCR:ncc[n z v c] BPE.      # " on =
       bp-cond(ncc, ¬(z ∨ (n ^ v)))  = CCR:ncc[n z v c] BPG.      # " on >
       bp-cond(ncc, z ∨ (n ^ v))     = CCR:ncc[n z v c] BPLE.     # " on ≤
       bp-cond(ncc, ¬(n ^ v))        = CCR:ncc[n z v c] BPGE.     # " on ≥
       bp-cond(ncc, n ^ v)           = CCR:ncc[n z v c] BPL.      # " on <
       bp-cond(ncc, ¬(c ∨ z))        = CCR:ncc[n z v c] BPGU.     # " on > Unsigned
       bp-cond(ncc, c ∨ z)           = CCR:ncc[n z v c] BPLEU.    # " on ≤ Unsigned
       bp-cond(ncc, ¬c)              = CCR:ncc[n z v c] BPCC.     # " on ≥ Unsigned
       bp-cond(ncc, c)               = CCR:ncc[n z v c] BPCS.     # " on < Unsigned
       bp-cond(ncc, ¬n)              = CCR:ncc[n z v c] BPPOS.    # " on Positive
       bp-cond(ncc, n)               = CCR:ncc[n z v c] BPNEG.    # " on Negative
       bp-cond(ncc, ¬v)              = CCR:ncc[n z v c] BPVC.     # " on Overflow Clear
       bp-cond(ncc, v)               = CCR:ncc[n z v c] BPVS.     # " on Overflow Set
The bp-cond archetypes are used for reading condition code values and deciding whether
a condition holds. They also define the cond field in an instruction word (the 4-bit
values of the constants BPA, BPN, ..., BPVS are assumed to be defined elsewhere). COND is a
3-bit offset into the instruction word.
[3.27] PC[pc] nPC[npc] pc:COND[cond]                                                 5: 3.2
       pc[OP:BRS a]pc:COND[bp-cond(cc1 ‖ cc0, c) 001₂ cc1 cc0 p disp19]
           =[ ¬c ∧ a = 0 ]=> PC[npc] nPC[npc + 4].
           =[ ¬c ∧ a = 1 ]=> PC[npc + 4] nPC[npc + 8].
           =[ c ∧ a = 0 ]=> PC[npc] nPC[ea(pc, disp19)].
           =[ c ∧ a = 1 ∧ cond = BPA ]=> PC[ea(pc, disp19)] nPC[ea(pc, disp19) + 4].
           =[ c ∧ a = 1 ∧ cond ≠ BPA ]=> PC[npc] nPC[ea(pc, disp19)].
Note that the value of the CCR:ncc locator (CCR:ICC or CCR:XCC) depends on the values
of the condition code selection bits cc1 and cc0 (00₂ for XCC, 10₂ for ICC). The reason for using so many
guards in update scheme 3.27 is mainly the different effects of the annul bit on conditional
and unconditional branches described earlier in this section.
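The five guards of update scheme 3.27 can be summarised as a function of three booleans: whether the condition holds, the annul bit, and whether the instruction is BPA. The Python sketch below is an illustration under stated assumptions: the function names are hypothetical, and evaluation of the condition codes is left to the caller.

```python
def sign_ext(value, bits):
    """Sign-extend a two's complement value that is `bits` wide."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def bpcc_next(pc, npc, taken, annul, always, disp19):
    """Return the new (PC, nPC) pair after a BPcc instruction.

    taken  - the condition c evaluated to TRUE
    annul  - the a bit of the instruction
    always - the cond field equals BPA
    """
    target = pc + 4 * sign_ext(disp19, 19)       # archetype ea
    if not taken:
        return (npc + 4, npc + 8) if annul else (npc, npc + 4)
    if annul and always:
        return target, target + 4                # BPA with a = 1: delay slot annulled
    return npc, target                           # delay instruction executed first
```

BPN never satisfies its condition, so with the annul bit set it takes the first branch of the function and skips the delay instruction, as described in the text.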
3.5.4 Branch on floating-point condition codes with prediction (FBPfcc)
The effects of these instructions are exactly the same as those of their integer counterparts 
described in section 3.5.3 as far as control transfer and annulment of the instruction in a delay 
slot is concerned. See figure 3.10 for their format. 
The following set of archetypes reads a floating-point condition register selected by a 2-bit
value fcc and compares it to the constants E(00₂), L(01₂), G(10₂), U(11₂) :: {Bit}2.      5: 3.1, 3.2
As a result, a TRUE or FALSE value is instantiated as the archetype's second parameter. Again, the content
of the cond field of the instruction word is defined by 4-bit constants (FBPA, FBPN, ..., FBPO)
which are assumed to be defined elsewhere.
[3.28] fbp-cond(-, TRUE) = FBPA. # Branch Always
fbp-cond(-, FALSE) = FBPN. # Branch Never
fbp-cond(fcc, v = U) = fccnr(fcc, v) FBPU. # Branch on Unordered
fbp-cond(fcc, v = G) = fccnr(fcc, v) FBPG. # Branch on >
fbp-cond(fcc, v = U ∨ v = G) = fccnr(fcc, v) FBPUG. # Branch on Unordered or >
fbp-cond(fcc, v = L) = fccnr(fcc, v) FBPL. # Branch on <
fbp-cond(fcc, v = U ∨ v = L) = fccnr(fcc, v) FBPUL. # Branch on Unordered or <
fbp-cond(fcc, v = L ∨ v = G) = fccnr(fcc, v) FBPLG. # Branch on < or >
fbp-cond(fcc, v = ¬E) = fccnr(fcc, v) FBPNE. # Branch on ≠
fbp-cond(fcc, v = E) = fccnr(fcc, v) FBPE. # Branch on =
fbp-cond(fcc, v = U ∨ v = E) = fccnr(fcc, v) FBPUE. # Branch on Unordered or =
fbp-cond(fcc, v = G ∨ v = E) = fccnr(fcc, v) FBPGE. # Branch on ≥
fbp-cond(fcc, v = ¬L) = fccnr(fcc, v) FBPUGE. # Branch on Unordered or ≥
fbp-cond(fcc, v = L ∨ v = E) = fccnr(fcc, v) FBPLE. # Branch on ≤
fbp-cond(fcc, v = ¬G) = fccnr(fcc, v) FBPULE. # Branch on Unordered or ≤
fbp-cond(fcc, v = ¬U) = fccnr(fcc, v) FBPO. # Branch on Ordered
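The sixteen archetypes amount to a lookup from a branch mnemonic to the set of condition values for which the branch is taken. The Python sketch below restates that table; it is an illustration only. The names FBP_CONDITIONS and fbp_cond are hypothetical, mnemonics are used as dictionary keys because the 4-bit opcode constants are defined elsewhere, and the value of the selected condition register is assumed to be passed in directly rather than read through fccnr.

```python
# Symbolic values of the floating-point condition codes, as given in the text.
E, L, G, U = 0b00, 0b01, 0b10, 0b11

# For every mnemonic, the set of condition values on which the branch is taken.
FBP_CONDITIONS = {
    "FBPA": {E, L, G, U},  # always
    "FBPN": set(),         # never
    "FBPU": {U},
    "FBPG": {G},
    "FBPUG": {U, G},
    "FBPL": {L},
    "FBPUL": {U, L},
    "FBPLG": {L, G},
    "FBPNE": {L, G, U},    # anything but E
    "FBPE": {E},
    "FBPUE": {U, E},
    "FBPGE": {G, E},
    "FBPUGE": {U, G, E},   # anything but L
    "FBPLE": {L, E},
    "FBPULE": {U, L, E},   # anything but G
    "FBPO": {E, L, G},     # anything but U
}


def fbp_cond(mnemonic, v):
    """Return True when a branch with this mnemonic is taken for condition value v."""
    return v in FBP_CONDITIONS[mnemonic]
```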
A more complex, but concise alternative to archetypes 3.28 is given in 3.29. 
[3.29] e :: Bit. lgu :: {Bit}3.
fbp-cond(fcc, neg(e) (dm(-, v) & (-, lgu + 1))) = fccnr(fcc, v) e lgu.
00 | a | cond | 101 | cc1 | cc0 | p | disp19
31 30 | 29 | 28 25 | 24 22 | 21 | 20 | 19 | 18 0
Figure 3.10: Format of FBPfcc instructions
01 | disp30
31 30 | 29 0
Figure 3.11: Format of the Call and link instruction
The function dm is an ordinary demultiplexor, which could be defined using the arithmetic 
shift left operator << as dm(v): 12 << v Iv IvE jVol. 
[3.30] PC[pc] nPC[npc] pc:COND[cond]
pc[OP:BRS a]pc:COND[fbp-cond(cc1 | cc0, c) 101₂ cc1 cc0 p disp19]
=[ ¬c ∧ a=0 ]=> PC[npc] nPC[npc + 4].
=[ ¬c ∧ a=1 ]=> PC[npc + 4] nPC[npc + 8].
=[ c ∧ a=0 ]=> PC[npc] nPC[ea(pc, disp19)].
=[ c ∧ a=1 ∧ cond = FBPA ]=> PC[ea(pc, disp19)] nPC[ea(pc, disp19) + 4].
=[ c ∧ a=1 ∧ cond ≠ FBPA ]=> PC[npc] nPC[ea(pc, disp19)].
Again, the annul bit has a different effect for conditional branches than it does for uncon- 
ditional branches (in this case opcode FBPA). If the annul bit (a) is set, it annuls the delay 
instruction for unconditional branches always, whereas for conditional branches it annuls the 
delay only if the branch is not taken. Please refer to [77] for details. 
3.5.5 Call and link 
The CALL instruction writes the contents of the PC into r[15] ('out' register³ 7) and then
causes an unconditional delayed transfer of control to a PC-relative effective address. 
[3.31] disp30 :: {Bit}30.
PC[pc] nPC[npc] CWP[w] pc[OP:CLL disp30] OP:CLL ≡ 01₂
==> PC[npc] nPC[ea(pc, disp30)] rw(15, pc, w).
3.5.6 Return 
The RETURN instruction causes a delayed transfer of control to the target address and has the 
window semantics of a RESTORE instruction; that is, it restores the register window prior to 
the last SAVE instruction. The target address is r[rs1] + r[rs2] if bit 13 of the instruction word
(i) is 0, or r[rs1] + sign-ext(simm13) if i=1.
³ See footnote 2 on page 40.
10 | - | 111001 | rs1 | i=0 | - | rs2
10 | - | 111001 | rs1 | i=1 | simm13
31 30 | 29 25 | 24 19 | 18 14 | 13 | 12 5 | 4 0
Figure 3.12: Format of the RETURN instruction
00 | 00000 | 100 | 00 00000 00000 00000 00000
31 30 | 29 25 | 24 22 | 21 0
Figure 3.13: Format of the NOP instruction
[3.32] PC[pc] nPC[npc] CWP[w] pc[OP:ARM 00000₂ 111001₂ sop1(x) sop2(y)]
==> PC[npc] nPC[x + y] CWP[(w - 1) % NWINDOWS].
Note that registers r[rs1] and r[rs2] come from the old window.
3.6 Miscellaneous instructions 
3.6.1 No operation 
The NOP instruction changes no program-visible state (except the PC and nPC registers). 
[3.33] OP:BRS 00000₂ 100₂ 00₂ 00000₂ 00000₂ 00000₂ 00000₂ ==> .
An example 
The following example uses archetypes from sections 2 and 3 to illustrate the archetype
expansion mechanism. In this example the update scheme
[3.22] PC[pc] nPC[npc] pc[OP:ARM rw(-, pc) 111000₂ sop1(x) sop2(y)] ==> PC[npc] nPC[x + y].
will be expanded into one update scheme showing the effects of the JMPL (Jump and Link)
instruction described in section 3.5.1. For maximum simplicity, the instruction uses global
r registers (1 ≤ r < 8) as its destination and first source operands, and an immediate value
carried by the instruction itself as its second source operand.
The arithmetic command scheme 3.22 expands as shown in figure 3.14. The archetypes 
and their index numbers from sections 2 and 3 are included as comments for convenience. 
After substitution using equations derived at each step the full expansion of our update scheme 
is 
PC[pc] nPC[npc] CWP[w] a1[x] # initial configuration
pc[OP:ARM - 111000₂ a1 IMM13 (v3 :: {Bit}13)] # field format of the current instruction
=[ (1 ≤ - < 8) ∧ (1 ≤ a1 < 8) ]=> # conditions for application
PC[npc] nPC[x + sign-ext(v3)] -[pc]. # effects of instruction on configuration
PC[pc] nPC[npc] pc[OP:ARM rw(-, pc) 111000₂ sop1(x) sop2(y)]
==> PC[npc] nPC[x + y]. # [3.22]
# [2.2] rw(a, v) a = CWP[w] rw(a, v, w). - = a, pc = v
PC[pc] nPC[npc] CWP[w] pc[OP:ARM rw(a, v, w) a 111000₂ sop1(x) sop2(y)]
==> PC[npc] nPC[x + y].
# [3.1] sop1(v1) = rr(a1, v1) a1. x = v1
PC[pc] nPC[npc] CWP[w] pc[OP:ARM rw(a, v, w) a 111000₂ rr(a1, v1) a1 sop2(y)]
==> PC[npc] nPC[v1 + y].
# [3.1] sop2(v2) = IMM13(v2). y = v2
PC[pc] nPC[npc] CWP[w] pc[OP:ARM rw(a, v, w) a 111000₂ rr(a1, v1) a1 IMM13(v2)]
==> PC[npc] nPC[v1 + v2].
# [2.2] rw(a, v, w) = =[ 1 ≤ a < 8 ]=> a[v].
PC[pc] nPC[npc] CWP[w] pc[OP:ARM a 111000₂ rr(a1, v1) a1 IMM13(v2)]
=[ 1 ≤ a < 8 ]=> PC[npc] nPC[v1 + v2] a[v].
# [2.1] rr(a1, v1) = a1[v1] =[ 1 ≤ a1 < 8 ]=>.
PC[pc] nPC[npc] CWP[w] a1[v1] pc[OP:ARM a 111000₂ a1 IMM13(v2)]
=[ (1 ≤ a < 8) ∧ (1 ≤ a1 < 8) ]=> PC[npc] nPC[v1 + v2] a[v].
# [3.2] IMM13(sign-ext(v3)) = (v3 :: {Bit}13). v2 = sign-ext(v3)
PC[pc] nPC[npc] CWP[w] a1[v1] pc[OP:ARM a 111000₂ a1 IMM13 (v3 :: {Bit}13)]
=[ (1 ≤ a < 8) ∧ (1 ≤ a1 < 8) ]=> PC[npc] nPC[v1 + v2] a[v].
Figure 3.14: Example of archetype expansion
For a slightly longer example of archetype expansion please refer to appendix C, where 
the same instruction uses as its destination operand one of 7 global r registers, as its first 
source operand one of 8 'in' registers, and finally as its second source operand an immediate 
value. 
Conclusions 
This chapter gives a (partial) specification of the SPARC-V9 CPU at the instruction set level. 
The main achievements are its compactness (the number of update schemes necessary to specify
the whole instruction set of this processor is less than the number of opcodes appearing in the
informal specification [77]) and clarity. The next step could be a specification of a specific
SPARC-V9 compliant chip (such as UltraSPARC IIi [70]) and a proof of their (partial) equivalence.
Furthermore, since the same formalism can be used to specify lower level representations 
(such as the microcode), it would be possible to prove transformations between these levels 
and reason about the specifications. 
Chapter 4 
Java Virtual Machine 
The Java Virtual Machine (JVM) is a platform-independent abstract computing machine.
The JVM has a bytecoded instruction set designed to be compact and easily interpreted
in either software or hardware. In this chapter, a formal specification of a subset of the JVM
instructions is given. Instructions are described using the basic Update Plans formalism briefly
introduced in chapter 1. This is not the first attempt at a formal description of Java bytecode
semantics, as similar work has already been done [11]. However, the following description
of JVM instruction semantics illustrates the readability and flexibility of Update Plans for this
purpose, and also demonstrates some of the new Update Plans features introduced to the
formalism by this thesis. The specification is based on informal specifications [19,48,75], and
part of it has already been published in [55].
1 Types, constants, variables
1.1 Types 
The following types are used throughout the specification. Store Bit is considered to be a 
primitive type having the usual meaning. 
[1.1] {Bit}.
{Boolean = {Bit}32, Byte = {Bit}32, Char = {Bit}32, Short = {Bit}32,
Int = {Bit}32, Float = {Bit}32, Reference = {Bit}32,
ReturnAddress = {Bit}32, Long = {Bit}64, Double = {Bit}64}.
Since many JVM instructions are defined in terms of so called category types [48], the 
following type aliases are provided. 
[1.2] {Category1 = Boolean | Byte | Char | Short | Int | Float | Reference | ReturnAddress,
Category2 = Long | Double,
Category12 = Category1 | Category2}.
For the purposes of this specification, additional types Signed/Unsigned Byte/Short are 
declared. The main reason is to supply grounding information for Update Plans. Signed 
values use two's-complement encoding similarly to Int and Long types. 
[1.3] {UnsignedByte = {Bit}8, SignedByte = {Bit}8,
UnsignedShort = {Bit}16, SignedShort = {Bit}16}.
In order to make the definitions of the instructions as compact and clear as possible, an 
alias type Word is also defined, which is used mainly by stack manipulation instructions. 
[1.4] {Word = Category1 Category1 | Category2}.
1.2 Constants 
This section contains a list of all JVM opcodes used in this specification with their values. 
POP(87), POP2(88), DUP(89), DUP_X1(90), DUP_X2(91), DUP2(92),
DUP2_X1(93), DUP2_X2(94), SWAP(95),
ILOAD(21), ILOAD_0(26), ILOAD_1(27), ILOAD_2(28), ILOAD_3(29),
ISTORE(54), ISTORE_0(59), ISTORE_1(60), ISTORE_2(61), ISTORE_3(62),
IADD(96), ISUB(100), IMUL(104), IDIV(108), IREM(112),
IAND(126), IOR(128), IXOR(130), ISHL(120), ISHR(122), IUSHR(124),
INEG(116), IINC(132),
ICONST_M1(2), ICONST_0(3), ICONST_1(4), ICONST_2(5), ICONST_3(6), ICONST_4(7),
ICONST_5(8),
BIPUSH(16), SIPUSH(17),
IFEQ(153), IFNE(154), IFLT(155), IFGE(156), IFGT(157), IFLE(158),
IF_ICMPEQ(159), IF_ICMPNE(160), IF_ICMPLT(161), IF_ICMPGE(162), IF_ICMPGT(163),
IF_ICMPLE(164),
GOTO(167), GOTO_W(200), JSR(168), JSR_W(201), RET(169),
WIDE(196) :: UnsignedByte.
1.3 Variables 
Unless stated otherwise, throughout the whole document the following variables, used for 
description of instruction parameters, are assumed to have the following types 
bs :: SignedByte.
bu, opc :: UnsignedByte.
ss, δs :: SignedShort.
su :: UnsignedShort.
i, δi :: Int.
z :: Category12.
2 Instructions
2.1 Operand stack management
A number of instructions are provided for direct manipulation of the operand stack in the
current frame: POP, POP2, DUP, DUP2, DUP_X1, DUP2_X1, DUP_X2, DUP2_X2 and SWAP. One of
many possible UP definitions is as follows
[2.1] POP SP[t] s[v]t ==> SP[s].
POP2 SP[t] s[w]t ==> SP[s].
DUP SP[t] s[v]t ==> SP[t'] s[v v]t'.
DUP2 SP[t] s[w]t ==> SP[t'] s[w w]t'.
DUP_X1 SP[t] s[v1 v0]t ==> SP[t'] s[v0 v1 v0]t'.
DUP2_X1 SP[t] s[v1 w0]t ==> SP[t'] s[w0 v1 w0]t'.
DUP_X2 SP[t] s[w1 v0]t ==> SP[t'] s[v0 w1 v0]t'.
DUP2_X2 SP[t] s[w1 w0]t ==> SP[t'] s[w0 w1 w0]t'.
SWAP SP[t] s[v1 v0]t ==> SP[t] s[v0 v1]t.
where SP denotes the stack pointer in the current frame and the variables s, t and t' are stack
locators. In the specification above, the variables v, w, and their indexed variants, are of data
types
[2.2] v :: Category1.
w :: Word.
where variables of the types Category1 and Word occupy one and two stack items on the
operand stack respectively.
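The stack effects of the simpler instructions can be mimicked on an ordinary list. The Python sketch below is an illustration only and makes several assumptions that are not part of the specification: the operand stack is modelled as a list whose last element is the top of the stack, every element is a single Category1 item, and the function names are hypothetical.

```python
def pop(stack):
    """POP: discard the top item."""
    return stack[:-1]


def dup(stack):
    """DUP: duplicate the top item."""
    return stack + [stack[-1]]


def dup_x1(stack):
    """DUP_X1: ... v1 v0  becomes  ... v0 v1 v0."""
    v1, v0 = stack[-2:]
    return stack[:-2] + [v0, v1, v0]


def swap(stack):
    """SWAP: ... v1 v0  becomes  ... v0 v1; the stack height is unchanged."""
    v1, v0 = stack[-2:]
    return stack[:-2] + [v0, v1]
```

Each function returns a new list, which mirrors the declarative reading of an update scheme: the old stack contents appear on the left-hand side and the new contents on the right.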
Variables w, w0 and w1 in 2.1 can each be viewed as two variables of the type Category1. Thus
(for example) the definition of the instruction DUP2_X2 also covers definitions such as
[2.3] DUP2_X2 SP[t] s[v2 v1 w0]t ==> SP[t'] s[w0 v2 v1 w0]t'.
DUP2_X2 SP[t] s[w2 v1 v0]t ==> SP[t'] s[v1 v0 w2 v1 v0]t'.
DUP2_X2 SP[t] s[v3 v2 v1 v0]t ==> SP[t'] s[v1 v0 v3 v2 v1 v0]t'.
The above specification assumes that the memory for the JVM stack is contiguous, which
is in fact implementation dependent. One way around this particular problem
is to use generic push and pop archetypes whose exact definitions would be given later
in terms of specific, implementation-dependent archetypes. For example, assuming there is a
generic popsh(pop, psh) archetype combining the effects of pop and push instructions, we can
simply say that
[2.4] push(v) = popsh(, v).
pop(v) = popsh(v, ).
The push and pop archetypes from 2.4 are frequently used in the following sections as a 
means of defining other JVM instructions which make changes to the operand stack. 
2.2 Local variable access 
The JVM instruction set includes a number of instructions for accessing local variables of 
different types in the current frame. The following ambidextrous archetype is used throughout 
this chapter to denote accesses to a local variable at index n. 
[2.5] var(n, z) n[z] = .
Note that the value z to the right of the locator n can be of either type Category1 or Category2,
as declared in section 1.3. As the usual size of a local variable is 32 bits (the width of the
computational type Int, see [48]), the value z is stored in one or two local variables depending
on its size.
Many JVM instructions can be prefixed by the WIDE instruction, which changes the
interpretation of the instruction modified in this way. The WIDE instruction takes one of two
formats, depending on the instruction being modified. The archetypes in 2.6 are provided to
illustrate the modifications. Two non-wide instruction formats are immediately followed by
two wide formats. The σ and δ helper archetypes reconstruct (un)signed integer values from
bytes of data.
[2.6] σ(bu1, bu2) (((bu1 << 8) | bu2) :: UnsignedShort) = .
δ(bu1, bu2) (((bu1 << 8) | bu2) :: SignedShort) = .
δ(bu1, bu2, bu3, bu4) ((bu1 << 24) | (bu2 << 16) | (bu3 << 8) | bu4) = .
wide(opc, bu) = opc bu.
wide(opc, bu, bs) = opc bu bs.
wide(opc, σ(bu1, bu2)) = WIDE opc bu1 bu2.
wide(opc, σ(bu1, bu2), δ(bu3, bu4)) = WIDE opc bu1 bu2 bu3 bu4.
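The two helper archetypes differ only in how the assembled bit pattern is interpreted. The following Python sketch illustrates this under the assumption that the operand bytes arrive as integers in the range 0..255, most significant byte first; the function names sigma and delta are chosen here to match the helper archetypes and are not taken from any library.

```python
def sigma(b1, b2):
    """Reconstruct an unsigned 16-bit value from two unsigned bytes."""
    return (b1 << 8) | b2


def delta(*operand_bytes):
    """Reconstruct a signed two's-complement value from two or four unsigned bytes."""
    value = 0
    for b in operand_bytes:
        value = (value << 8) | b
    bits = 8 * len(operand_bytes)
    # If the sign bit is set, subtract 2**bits to obtain the negative value.
    if value & (1 << (bits - 1)):
        value -= 1 << bits
    return value
```

For example, the byte pair 0xFF 0xFE is 65534 when read by sigma but -2 when read by delta.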
The most common instructions that access local variables are load and store instructions 
which transfer values between JVM local variables and the operand stack. The following 
archetypes describe the instruction formats of all Int load/store instructions. 
[2.7] i3(0) = 0. i3(1) = 1. i3(2) = 2. i3(3) = 3.
iload(i) = wide(ILOAD, i).
iload(i) = ILOAD_-i3(i).
istore(i) = wide(ISTORE, i).
istore(i) = ISTORE_-i3(i).
The purpose of the i3 archetypes is to condense the specification, which would otherwise
have to list explicitly all ILOAD_n and ISTORE_n instructions that implicitly access variables
n. Finally, the update schemes in 2.8 show the effects that these instructions have on the
JVM.
[2.8] iload(n) var(n, v) push(v) ==> .
istore(n) pop(v) ==> var(n, v).
All the instructions described in this section operate on the data type Int. Similar arche- 
types and update schemes could be specified for the instructions operating on the data types 
Float, Long, Double, and Reference; that is, instructions FLOAD, LLOAD, DLOAD, and ALOAD 
respectively, plus their implicit immediate operand variants FLOAD_n, LLOAD_n, DLOAD_n, and
ALOAD_n, where n ∈ {0, 1, 2, 3}.
2.3 Arithmetic instructions 
The Java Virtual Machine instruction set offers a wide range of arithmetic operators. It is
assumed that the standard arithmetic operators are defined in the standard environment, so
only the less common ones are named here: shift left (<<), shift right (>>),
and unsigned shift right (>>>). Archetypes and an update scheme for all the JVM's two-operand
Int arithmetic instructions are given in 2.9.
[2.9] iarithm-binary(i1, i2, i1 + i2) = IADD.
iarithm-binary(i1, i2, i1 - i2) = ISUB.
iarithm-binary(i1, i2, i1 × i2) = IMUL.
iarithm-binary(i1, i2, i1 / i2) = IDIV =[ i2 ≠ 0 ]=> .
iarithm-binary(i1, i2, i1 % i2) = IREM =[ i2 ≠ 0 ]=> .
iarithm-binary(i1, i2, i1 & i2) = IAND.
iarithm-binary(i1, i2, i1 | i2) = IOR.
iarithm-binary(i1, i2, i1 ^ i2) = IXOR.
iarithm-binary(i1, i2, i1 << i2) = ISHL.
iarithm-binary(i1, i2, i1 >> i2) = ISHR.
iarithm-binary(i1, i2, i1 >>> i2) = IUSHR.
iarithm-binary(i1, i2, ir) popsh(i1 i2, ir) ==> .
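The difference between the two right shifts is easiest to see on a negative operand. The Python sketch below is an illustration under stated assumptions: Python integers are unbounded, so the helper to_int32 (a hypothetical name) wraps results to the 32-bit two's-complement range of the type Int, and only the low five bits of the shift distance are used, as the JVM does for Int shifts.

```python
MASK32 = 0xFFFFFFFF


def to_int32(x):
    """Wrap an arbitrary Python integer to a signed 32-bit value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def ishl(i1, i2):
    """Shift left; bits shifted past position 31 are lost."""
    return to_int32(i1 << (i2 & 31))


def ishr(i1, i2):
    """Arithmetic shift right; the sign bit is replicated."""
    return to_int32(i1) >> (i2 & 31)


def iushr(i1, i2):
    """Unsigned (logical) shift right; zeros are shifted in from the left."""
    return to_int32((i1 & MASK32) >> (i2 & 31))
```

With these definitions, shifting -8 right by one position yields -4 with ishr but 2147483644 with iushr.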
Similarly, INEG, the only single-operand¹ Int arithmetic instruction in the JVM instruction
set, can be defined as
[2.10] iarithm-monadic(i1, -i1) = INEG.
iarithm-monadic(i, ir) popsh(i, ir) ==> .
Note that because of the two's-complement representation used for negative numbers, negation 
of the minimum value of the type Int produces the same value, not the maximum value of 
this type as one might expect. 
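This overflow behaviour can be checked with a short calculation. The Python sketch below is an illustration only; the function name ineg and the constant INT_MIN are hypothetical, and the 32-bit wrap-around is made explicit because Python integers do not overflow.

```python
INT_MIN = -(1 << 31)
INT_MAX = (1 << 31) - 1


def ineg(i):
    """Negate a 32-bit two's-complement integer with wrap-around."""
    r = (-i) & 0xFFFFFFFF
    return r - (1 << 32) if r & 0x80000000 else r
```

Here ineg(5) gives -5 as expected, while ineg(INT_MIN) gives INT_MIN again, because the mathematical result 2^31 is one larger than INT_MAX and wraps around.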
¹ Bitwise negation of Int values is not present in the JVM instruction set; it is typically
carried out by the instruction IXOR with the constant ICONST_M1.
Finally, the last arithmetic instruction defined in this section is the Int increment of
a local variable by a constant. It is the only arithmetic instruction that can take the wide
instruction format.
[2.11] wide(IINC, su, ss) var(su, i) ==> var(su, i + ss).
All the arithmetic instructions specified here operate on the data type Int. The instruction
set of the Java Virtual Machine contains similar instructions for the data types Long, Float,
and Double, so their formal description would add little either to the explanation of their
semantics or to the demonstration of the UP formalism.
2.4 Immediate operands 
The instruction set of the Java Virtual Machine includes a number of instructions pushing an 
immediate operand onto the operand stack. 
The simplest form of these instructions are pushes of implicit immediate operands. In a 
way similar to the i3 archetypes from section 2.2, i5 archetypes are used in an effort to make 
the specification as succinct as possible. 
[2.12] i5(-1) = M1. i5(0) = 0. i5(1) = 1. i5(2) = 2. i5(3) = 3. i5(4) = 4. i5(5) = 5.
iconst(i) = ICONST_-i5(i).
The rest of the instructions that push Int values onto the operand stack have explicit 
immediate operands. 
[2.13] iconst(bs) = BIPUSH bs.
iconst(δ(bu1, bu2)) = SIPUSH bu1 bu2.
iconst(i) push(i) ==> .
Please note that there is an implicit promotion of the (un)signed types Byte and Short to
Int, as described in [48].
Again, all the instructions described in this section operate on the data type Int. Similar
archetypes and update schemes could be specified for the instructions operating on the data types
• Float: FCONST_n, where n ∈ {0, 1, 2}
• Long: LCONST_n, where n ∈ {0, 1}
• Double: DCONST_n, where n ∈ {0, 1}
• Reference: ACONST_NULL, which pushes a special NULL value
2.5 Control transfer 
The control transfer instructions conditionally or unconditionally make the Java Virtual
Machine continue execution with an instruction other than the one following the control transfer
instruction. These instructions can be divided into three categories:
• conditional branches (e.g. IFEQ δs, IF_ICMPEQ δs) comparing the top stack item against zero
or comparing the two topmost stack items against each other
• compound conditional branches (TABLESWITCH and LOOKUPSWITCH) with a variable
number of operands (branches)
• unconditional branches (GOTO δs, GOTO_W δi, JSR δs, JSR_W δi, RET bu, and WIDE RET su)
As has already been pointed out, there are two groups of conditional branches comparing
Int type values. The first compares its parameter against an implicit immediate operand
0, and the other compares its two parameters. Both versions are listed on the same line
for the same type of conditional jump. The δ archetype defined in section 2.2 is used.
[2.14] jmpc(i, i = 0, δ(b1, b2)) = IFEQ b1 b2.    jmpc(i, j, i = j, δ(b1, b2)) = IF_ICMPEQ b1 b2.
jmpc(i, i ≠ 0, δ(b1, b2)) = IFNE b1 b2.    jmpc(i, j, i ≠ j, δ(b1, b2)) = IF_ICMPNE b1 b2.
jmpc(i, i < 0, δ(b1, b2)) = IFLT b1 b2.    jmpc(i, j, i < j, δ(b1, b2)) = IF_ICMPLT b1 b2.
jmpc(i, i ≥ 0, δ(b1, b2)) = IFGE b1 b2.    jmpc(i, j, i ≥ j, δ(b1, b2)) = IF_ICMPGE b1 b2.
jmpc(i, i > 0, δ(b1, b2)) = IFGT b1 b2.    jmpc(i, j, i > j, δ(b1, b2)) = IF_ICMPGT b1 b2.
jmpc(i, i ≤ 0, δ(b1, b2)) = IFLE b1 b2.    jmpc(i, j, i ≤ j, δ(b1, b2)) = IF_ICMPLE b1 b2.
The list of conditional branches in 2.14 is not complete. Again, only instructions comparing
Int type values are included. There are four other conditional branches comparing values
of the type Reference: IF_ACMPEQ δs, IF_ACMPNE δs, IFNONNULL δs and IFNULL δs; however,
the semantics of these instructions is very similar to that of the instructions already defined
in 2.14.
[2.15] jmpu(TRUE, δ(b1, b2)) = GOTO b1 b2.
jmpu(TRUE, δ(b1, b2, b3, b4)) = GOTO_W b1 b2 b3 b4.
Note that the only difference between the instructions GOTO δs and GOTO_W δi is the size of
their operands. The wide variant of the unconditional jump takes a four-byte offset, whereas
the standard variant takes only a two-byte offset. The aim of the archetype definitions above
is to compress both the conditional and unconditional branches into one update scheme, 2.17.
[2.16] jump(cond, δi) = jmpc(i, cond, δi) pop(i).
= jmpc(i, j, cond, δi) pop(i j).
= jmpu(cond, δi).
[2.17] PC[pc] pc[jump(cond, δi)]qc =[ cond ]=> PC[pc + δi].
=[ ¬cond ]=> PC[qc].
PC is the JVM program counter, which contains the address of the current instruction. Note
that the target address of a GOTO(_W) instruction must be that of an opcode of an instruction
within the method that contains this instruction. This is not dealt with in this specification,
since it is the responsibility of the JVM bytecode verifier, which is not specified here.
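The jump update scheme selects between two successor addresses: the branch target, which is relative to the address of the jump opcode itself, and the address of the instruction that follows. The Python sketch below illustrates this for IFEQ only. It rests on assumptions: the function names are hypothetical, the operand stack is a Python list with its top at the end, and the three-byte length of IFEQ (one opcode byte and two offset bytes) is used to compute the fall-through address.

```python
def step_jump(pc, qc, cond, offset):
    """Return the next PC: the target pc + offset if cond holds, otherwise qc."""
    return pc + offset if cond else qc


def ifeq(pc, stack, b1, b2):
    """Execute IFEQ located at address pc; b1 and b2 are its two offset bytes."""
    i = stack.pop()  # the operand is popped whether or not the branch is taken
    offset = int.from_bytes(bytes([b1, b2]), "big", signed=True)
    return step_jump(pc, pc + 3, i == 0, offset)
```

A backward branch is obtained simply by encoding a negative offset, for example the byte pair 0xFF 0xFA for an offset of -6.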
The format of JSR(_W) instructions is exactly the same as the format of GOTO(_W) instructions.
However, unlike the GOTO(_W) instructions, the JSR(_W) instructions also push
the return address on the operand stack as shown in 2.19.
[2.18] jsr(δ(b1, b2)) = JSR b1 b2.
jsr(δ(b1, b2, b3, b4)) = JSR_W b1 b2 b3 b4.
Note that the asymmetry of the JSR and RET instructions is intentional. After a JSR instruction,
the return address is stored into a local variable by one of the ASTORE instructions, and this
local variable may in turn be used by a RET instruction.
[2.19] PC[pc] pc[jsr(δi)]qc push(qc) ==> PC[pc + δi].
PC[pc] pc[wide(RET, n)]qc var(n, a) ==> PC[a].
Conclusions 
The specification of a subset of the JVM instruction set is the third and last demonstration
of the original Update Plans, using only minor syntactic extensions from the Extended Update
Plans introduced in the second part of this thesis. It demonstrates that even more abstract
instruction sets using a wide variety of types can easily be described by UP. An interesting
exercise for the future could be a specification of a real Java processor (used widely in mobile
devices), which usually implements one of several existing JVM subsets.
Part II
Extended Update Plans
Introduction 
The first part of the thesis concentrated mainly on Update Plans as proposed by [60]. Several
case studies tested the formalism on real examples, and UP was compared
against other existing formal methods. This prompted a number of
research questions, some of which are addressed in this second part of the thesis. It
contains a number of syntactic and semantic extensions to Update Plans, called Extended
Update Plans (EUP). The main contributions of EUP are the concept of sequential update schemes
and archetypes, and improved consistency of the formalism in some areas. Finally, various
other formal methods with similar application domains are examined and comparisons with
the Update Plans formalism are drawn.
Chapter 5 
Syntactic Extensions 
The whole Update Plans grammar has been revised and has undergone changes to make it
more compact and consistent in its structure and terminology. As a result, the formalism
has become much simpler to use, without the need for frequent reference to its grammar. As an
additional benefit, the implementation will be more transparent. Some syntactic changes, such
as parallel blocks in archetypes, introduced new possibilities in Update Plans and (inevitably)
resulted in the need to define their semantics. These changes, along with a completely new
concept of sequential update schemes and archetypes, will be discussed in chapter 6.
This chapter describes most of the important changes and additions to the grammar and 
the semantic impact of less significant syntactic changes. The complete revised grammar of 
Extended Update Plans can be found in appendix A. 
1 Everything is an update
Most 'inconsistencies' in the original Update Plans grammar stemmed from the fact that there
was no unifying concept of an 'update'. In the new grammar, an update is either a parallel
block, a sequential block or a set of alternatives. In other words, changes to a configuration
are always caused by an update. Another significant inconsistency was the use of a dot '.'
to separate all (top-level) items other than parallel blocks. Now, strictly only (top-level) items
are separated by a dot. One of the consequences is that sugared multiple archetype
definitions such as
a(params) = update1
= update2
...
= updaten.
are perceived as one item. They are separated from other items in an update plan by the dot
behind the last update updaten, and the individual updates of the archetype are separated by
the '=' sign.
A similar consideration applies to parallel blocks. The use of the double pipeline symbol '||'
to separate alternatives in a parallel block was entirely optional. As only items are now
separated by a dot, it is necessary to separate individual updates in parallel blocks. This is
done by changing the production rule for parallel blocks to
<parblock> --> "(||" {<update> "||"}+ "||)"
2 Archetypes 
2.1 Grammar 
The following grammar tries to be as compact as possible while preserving all features offered 
by the original grammar. 
<item> --> <archetype definition>
<archetype definition> -->
<basic archetype definition>
| <ambidextrous archetype definition>
<basic archetype definition> --> <basic declaration> <basic definition>+
<basic declaration> --> <basic archetype name> <parameters>
<ambidextrous archetype definition> -->
<ambidextrous declaration> <basic definition>+
<ambidextrous declaration> -->
<archetype name> <parameters> <text>
<basic definition> --> "=" <archetype body>
<archetype body> --> <update> | <configuration>
<parameters> --> "(" {<text> ","}* ")"
A <basic archetype name> is an identifier; an <archetype name> is a symbolic constant
or an identifier. Note that the pipeline symbol '|' is no longer used to denote ambidextrous
archetypes. The use of this symbol by ambidextrous archetypes, alongside other means of
separating the text in command (also ambidextrous) archetypes, was one of the main inconsistencies.
The consequences of these simplifications are discussed in the following two sections. The 
syntax of archetype calls has been left unchanged and can be found in appendix A or in the 
original work [60]. 
2.2 Ambidextrous archetypes 
As mentioned earlier, the '|' symbol no longer denotes the text to be expanded on the left or
right-hand side of an update scheme during ambidextrous archetype expansion. Instead, the
text is placed between the parameters of an archetype and its definition, so that the archetype
definitions
al(params) = text lc =[ g ]=> rc.
ar(params) = lc =[ g ]=> text rc.
can be 'compressed' into one EUP ambidextrous archetype definition
a(params) text = lc =[ g ]=> rc.
where lc/rc and text are the left/right contexts and the text shared among archetypes al, ar and
a. Note that every archetype whose left and right-hand side expansions are empty is an
ambidextrous archetype, which was not the case in basic Update Plans.
If the archetype body of an ambidextrous archetype consists of alternatives
a(params) text = lc1 =[ g1 ]=> rc1; lc2 =[ g2 ]=> rc2; ... ; lcn =[ gn ]=> rcn.
it is interpreted as syntactic sugar for
a(params) text = lc1 =[ g1 ]=> rc1.
a(params) text = lc2 =[ ¬g1 ∧ g2 ]=> rc2.
...
a(params) text = lcn =[ ¬g1 ∧ ... ∧ ¬g(n-1) ∧ gn ]=> rcn.
Note that text in ambidextrous archetypes is not allowed to appear on the left or right- 
hand side of an archetype's body. The semantics of ambidextrous sequential archetypes is 
discussed in chapter 6. 
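The desugaring of alternatives is mechanical and can be sketched as a small transformation over guards. The Python code below is an illustration under an explicit assumption, namely that an alternative applies only when the guards of all earlier alternatives fail, which is how the expansion above appears to read. The function name desugar and the representation of an alternative as a (left context, guard, right context) triple of strings are hypothetical.

```python
def desugar(alternatives):
    """Turn ordered alternatives into independent definitions with cumulative guards.

    alternatives -- list of (lc, guard, rc) string triples, in source order
    """
    result = []
    earlier_guards = []
    for lc, guard, rc in alternatives:
        # Negate every earlier guard and conjoin the current one.
        parts = [f"not ({g})" for g in earlier_guards] + [guard]
        result.append((lc, " and ".join(parts), rc))
        earlier_guards.append(guard)
    return result
```

For three alternatives with guards g1, g2 and g3, the third definition receives the guard "not (g1) and not (g2) and g3".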
2.3 Command archetypes 
If a is an ambidextrous archetype and the expansions of all definitions of a begin with the
same constant, then that constant may be used as the archetype name, unless it has already
been so used. The remainder of the expansion is then placed immediately after the archetype's
parameters. Such an archetype is called a command archetype. The command archetype
CONST(params) text = update. 
is the sugared version of 
a(params) CONST text = update. 
with calls of a replaced throughout the update plan by calls of CONST. 
2.4 Archetype parameters 
An archetype's parameters can be referenced in the archetype's body by the symbol '$' and
a number corresponding to the position of the parameter, where the
parameters are numbered from the left starting from 1. For instance, the fourth parameter
'x + y' is referred to by $4 in the following archetype.
add(o, b, c, x + y) = A[o] aa[x] b[y] ==> c[$4].
One of the changes to the archetype grammar is that the individual parameters of an 
archetype call can be empty. For example add(, b, c, r) is a perfectly valid archetype call, and 
an "empty string" is in place of parameter $1. No production rules are added to the grammar 
as the preprocessor can expand these references as simple macros. 
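A preprocessor pass of this kind can be sketched in a few lines. The Python code below is an illustration only, not the implementation described in the thesis: the function name expand_refs is hypothetical, parameters are assumed to be available as a list of strings, and an empty parameter is represented by the empty string.

```python
import re


def expand_refs(body, params):
    """Replace every $n in body by the text of the n-th parameter (1-based)."""

    def substitute(match):
        position = int(match.group(1))
        return params[position - 1]

    return re.sub(r"\$(\d+)", substitute, body)
```

For example, expanding "c[$4]" with the parameters a, b, c and x + y yields "c[x + y]", and a reference to an empty parameter simply disappears from the text.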
2.5 Archetypes in guards 
The last archetype extension concerns archetype calls appearing in guards. Archetype
calls were not allowed in guards in basic Update Plans. As there is no reason for this
restriction, archetype calls can be used in EUP guards unless they are recursive.
3 Types 
3.1 Constants 
Constants are uppercase words in Update Plans. An exception to this rule is an
explicit declaration of constants using the predefined type Constant, which makes it possible
to compose constants of lowercase letters. For instance, c and nPC are given "constant status"
by the following definition: c, nPC :: Constant.
Variables, on the other hand, are denoted by lowercase words. However, even if a word 
uses uppercase letters, it can be assigned an explicit type. Furthermore, individual types can 
be viewed as constants when used as an ordinary text in an update scheme. This feature can, 
for example, compress the following two update schemes 
get-r(a, r) = 001₂ a[(r :: Byte)].
= 010₂ a[(r :: Halfword)].
into only one
get-r(a, r) = Integer a[(r :: Integer)].
provided that a new type alias Integer is created, and that the constants Byte and Halfword
are assigned symbolic values 001₂ and 010₂ respectively by an explicit list of types with their
symbolic values.
{Byte(001₂), Halfword(010₂)},
{Integer = Byte | Halfword}.
It is sometimes useful to initialise a variable or a constant to a value. The following
example assumes that we want constants E, L, G and U to have symbolic values 00₂, 01₂, 10₂
and 11₂ respectively. Note that this example makes use of the repeat construct introduced in
section 3.2.
E(00₂), L(01₂), G(10₂), U(11₂) :: {Bit}2.
A new meaning is given to the symbol '-'. It is now used as the text concatenation
symbol. For instance, the sequence 'T-E-X-T' is interpreted as the constant 'TEXT'. A more
realistic example of its use can be found in section 2.4 of chapter 4.
An updated type grammar for these changes is given in the following section. 
3.2 Type grammar 
<item> --> <store declaration> | <type declaration>
<store declaration> --> "{" {<store> ","}+ "}"
<type declaration> --> {<term> <const>-opt ","}+ "::" <store structure>
<term> --> "(" <term> "::" <store structure> ")"
<store> -->
<store identifier>
| <store identifier> "=" <store structure>
<store identifier> --> <store name> <const>-opt
<const> --> "(" <number> ")"
Note that <store structure> is a regular expression over a set of <store name>s, i.e. lower
case words with a leading upper case letter.
A new type can be declared reusing an old one provided that the structure of the new 
type contains a pattern identical to the structure of the old type. In order to express the 
exact number of repetitions of the old structure in the new type, the syntax of the original 
Update Plans needs to be extended by adding the following rules. 
(store structure) → "{" (store structure) "}" (number)
For instance, assuming that the type Bit is already defined, the type Byte could be defined
using the repeat construct as {Byte = {Bit}8}.
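The effect of the repeat construct can be illustrated with a small Python sketch. It assumes, purely for illustration, that a store structure is written as a string and that `{X}n` simply stands for n consecutive copies of X; the function name `expand_repeat` is hypothetical and not part of Update Plans.

```python
import re

# Matches an innermost repeat group such as "{Bit}8" (no nested braces inside).
REPEAT = re.compile(r"\{([^{}]*)\}(\d+)")

def expand_repeat(structure):
    """Expand innermost '{X}n' groups repeatedly until none are left."""
    while True:
        expanded = REPEAT.sub(
            lambda m: " ".join([m.group(1)] * int(m.group(2))), structure
        )
        if expanded == structure:
            return expanded
        structure = expanded

print(expand_repeat("{Bit}8"))
print(expand_repeat("{{Bit}2}2"))
```

Nested groups are handled from the inside out, so `{{Bit}2}2` is first rewritten to `{Bit Bit}2` and then to four copies of `Bit`.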
4 Comments 
The last trivial but useful syntactic addition to basic Update Plans is the comment. A comment
is started by the symbol '#' and is terminated by the end of the line. Comments would be removed
by a preprocessor during lexical analysis.
5 Conclusions
Although not significant, the changes described in this chapter contribute to the usability of 
the formalism in three ways. Firstly, the grammar now has a slightly less restrictive syntax.
As a result several simple semantic rules had to be added to the formalism, and more
will follow in the following chapter. This, however, is a small price to pay for its increased
consistency and more intuitive use of the whole formalism. Secondly, the introduction of a 
few simple conventions improves the type reuse mechanism and allows even more compact, 
multi-level UP specifications. And finally, the changes to the grammar open new possibilities 
to UP such as nesting of sequential/parallel blocks within an update which will be described 
in the following chapter. 
Chapter 6 
Semantic Extensions 
While parallel blocks form a useful extension to basic Update Plans by making them more
readable, they do not add much power to the formalism as they can be interpreted as mere 
syntactic sugar. Moreover, one of the drawbacks of a parallel block lies in the nondeterministic 
execution of its constituent update schemes which makes any synchronisation of these schemes 
impossible. 
Not only do the extensions described in this chapter address these problems, but they also 
provide Update Plans with modularity, which with the exception of the archetype mechanism 
was not present in basic Update Plans. 
The layout of this chapter is as follows. Firstly, a simple convention concerning parallel 
blocks making Extended Update Plans more consistent is introduced in section 1. Secondly, 
the syntax, semantics and the implementation of sequential update schemes is presented in 
section 2. An extension, known as sequential archetypes, closely related to the archetype 
mechanism makes the concept of sequential update schemes more powerful and is introduced 
in section 3. The EUP grammar allows (in contrast to the basic UP grammar) any kind of 
update to appear in the body of an archetype. The consequences of this change are considered 
in section 4. Finally, conclusions are drawn in section 5. 
1 Parallel blocks
Parallel blocks were introduced in [60] as a way of expressing synchronous parallelism. In the 
original Update Plans, a parallel block is a set of independent alternatives which are applied to 
a configuration simultaneously. The application of a parallel block is naturally atomic, so all 
changes to the configuration contributed by any of its alternatives will appear simultaneously. 
The only way to share data between the individual alternatives is to use constant locators. 
The introduction of a simple convention allowing variables to be shared across all updates 
in a parallel block not only simplifies sharing of data among these updates and makes a 
parallel block consistent in this respect with its sequential counterpart, but it also makes 
transformation of parallel blocks into canonical form slightly easier as there is no need to 
rename unrelated variables of the same name among the parallel block's updates. 
Example 1 
Consider the following update plan, which best illustrates the difference 
between the semantics of parallel blocks in basic and Extended Update 
Plans. 
A[0] B[1].             # initial configuration
A[a] ==> A[a + 1]      # basic Update Plans require '.' here
B[a] ==> B[a - 1].
In basic Update Plans, the first application of the parallel block will
result in the configuration A[1] B[0]. In Extended Update Plans, the parallel
block is not applicable, as the cells right of A and B have different values and 
as such the set of locator expressions on the left-hand side of the resulting 
update scheme is not consistent. 
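The difference between the two readings can be sketched in Python. This is only an illustrative model, assuming a configuration is a dictionary from locator names to cell values and an update is a triple of locator, variable name and a function computing the new value; the names `apply_basic` and `apply_extended` are hypothetical.

```python
def apply_basic(config, updates):
    """Basic UP reading: each update binds its own variables independently."""
    new = dict(config)
    for locator, _var, fn in updates:
        new[locator] = fn(config[locator])
    return new

def apply_extended(config, updates):
    """EUP reading: variables are shared across the whole parallel block.
    Returns None when a shared variable would need two different values."""
    bindings = {}
    for locator, var, _fn in updates:
        value = config[locator]
        if bindings.setdefault(var, value) != value:
            return None  # inconsistent left-hand side, block not applicable
    return {**config, **{loc: fn(bindings[var]) for loc, var, fn in updates}}

initial = {"A": 0, "B": 1}
block = [("A", "a", lambda a: a + 1), ("B", "a", lambda a: a - 1)]

print(apply_basic(initial, block))     # {'A': 1, 'B': 0}
print(apply_extended(initial, block))  # None
```

With an initial configuration in which both cells hold the same value, for example `{"A": 2, "B": 2}`, the extended reading becomes applicable as well.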
2 Sequential update schemes
2.1 Background/Motivation 
Apart from improving modularity in Update Plans, the motivation for sequential update 
schemes had two sources. Firstly, at the formalism level, it was necessary to provide some 
synchronisation for update schemes, and on a related note, at the application level, the 
introduction of explicit sequential execution proved to be very useful as there is often a need 
to express a sequential execution of update schemes without resorting to techniques which 
would make the specification less transparent. 
At the instruction level, consider any PDP-11-like instruction, but with three operands¹,
e.g. ADD z x y, with x and y source operands and z the destination operand. The semantics
of this instruction seems simple. Source operands x and y are added and the result is written
into the destination z. Consider, however, what happens if two of the operands use addressing
modes that address the same register, and change the value in that register, e.g. x is
postincrement on register R1 (R1+) and y is predecrement on the same register (-R1). It is
then not clear what values are being addressed. Should x be addressed first and then y, i.e.
get x from @R1, increment R1, then decrement R1 and get y from @R1, or is y addressed first
and then x, or are they in some way accessed in parallel? In the original Update Plans such
an instruction was illegal.
¹ PDP-11 instructions have a maximum of only 2 operands.
The specification of the above-mentioned addressing modes and the arithmetic instruction
is given in 2.1.
[2.1] POSTINC(b, v) r = r[b] b[v]c ==> r[c].   # postincrement mode
      PREDEC(a, v) r = r[b] a[v]b ==> r[a].    # predecrement mode
      arithm(x, y, x + y) = ADD.               # addition
      arithm(x, y, r) r3 POSTINC(b, x) PREDEC(a, y) ==> r3[r].
In the following text the expansion of an ADD R2 R1+ -R1 instruction is examined. First, 
before an archetype's expansion, it is necessary to rename all archetype's variables so that 
they do not conflict with variables within the scheme the archetype is being called from. 
[2.2] POSTINC(b, v) r1 = r1[b] b[v]c ==> r1[c].
The POSTINC archetype is now used and its parameters added to the resolution set.
[2.3] arithm(x, y, r) r3 POSTINC r1 r1[b] b[v]c PREDEC(a, y) ==> r3[r] r1[c].
# {b = b, x = v}
Similarly, all conflicting local variables of the PREDEC archetype are renamed.
[2.4] PREDEC(a, v1) r2 = r2[b1] a[v1]b1 ==> r2[a].
Finally, after the PREDEC's archetype expansion, the original update scheme becomes
[2.5] arithm(x, y, r) r3 POSTINC r1 r1[b] b[v]c PREDEC r2 r2[b1] a[v1]b1 ==> r3[r] r1[c] r2[a].
# {a = a, y = v1}
If r1 and r2 refer to the same register (R1) then b = b1, so it must be the case that
a ≠ c. This is therefore a conflict between the cells r1[c] and r2[a], i.e. R1[c] and R1[a] (the
right-hand side of the update scheme 2.5 is inconsistent). Although the update scheme is not
yet fully expanded, we can already recognise the inconsistency at this early stage of archetype 
expansion. 
Although there is a good reason for making these instructions illegal, there are situations 
when instruction operands are accessed sequentially in a clearly defined order during a f/e 
cycle. While even such cases can be specified using basic Update Plans, there would be a 
significant loss of clarity in the specification. 
On most PDP-11 implementations, operands of an instruction are accessed in some
sequence. In this example the operand access order of the ADD z x y instruction is x, then y and
finally z. Using the extension to the UP formalism introduced in this chapter, the behaviour
of our arithmetic instruction can be described in terms of sequential update scheme S in 2.6.
[2.6] S |3 arithm(x, y, r) r3 ==> r3[r]
        |1 POSTINC(b, x) ==>
        |2 PREDEC(a, y) ==> .
Informally, the above definition says: expand archetypes POSTINC, PREDEC, and arithm 
and apply their locator expressions to the configuration in this exact order (the application 
order). However, the ordering of text expanded from sequence's archetypes is from left to 
right/top to bottom (the textual order). 
2.2 Syntax 
Similarly to parallel blocks [60], sequential blocks are delimited by the open sequential block
symbol, '(SEQ|', which takes an identifier (sequencer), here 'SEQ', and the close sequential
block symbol, '|)'. The reason for using the identifier in the open sequential block symbol will
become clear in section 3 where sequential archetypes are discussed.
While the double pipeline symbol '||' is entirely optional (with the exception of the
open/close parallel block symbol) in basic Update Plans, the (single) pipeline symbol is
essential in sequential blocks, in order to separate updates in these blocks². The pipeline symbol
takes an additional number which is the order in which updates nested in the sequential block
are applied to a configuration.
A basic notation for sequential update schemes (sequences for short) is
(SEQ|a update_a
    |b update_b
    ...
    |m update_m
|)
or making use of typesetting possibilities
SEQ | a  update_a
    | b  update_b
    | ...
    | m  update_m
where updates are either alternatives, a parallel block, or a sequence.
Sequences either have all stages (sequence numbers/indicators; in the above generic
sequential update scheme a, b, ..., m) in a sequential block tagged, or all stages left untagged.
Such 'untagged' sequences are called stageless. A sequence with no sequential block
identifier is referred to as an anonymous sequence. Sequencers do not have to be unique, but if
they are not, the risk of creating an incorrect specification by an oversight is increased, as
a sequential archetype can expand into two or more unrelated sequences, as will be seen in
section 3.3.
The following production rules need to be added to the UP grammar. 
² Note that in EUP the double pipeline symbol is no longer optional.
(item) → (seqblock)
(seqblock) → "(" (seqblock id)-opt "|" {(stage)-opt (update) "|"}+ "|)"
(seqblock id) → (symb-const)
(stage) → (symb-const) | (variable)
The complete grammar of Extended Update Plans can be found in appendix A. 
2.3 Semantics 
Consider the generic sequential update scheme from section 2.2. There are $n = |\{a, b, \ldots, m\}|$
updates in the sequence, where the variables $a, b, \ldots, m$ are stages of application. All the
stage-representing variables can be grouped into a countable set of variables $V = \{a, b, \ldots, m\}$.
Every variable $v \in V$ instantiates a value $i \in I$, where $I$ is a countable instantiation set. The
mapping between $V$ and $I$ is a many-to-one mapping, in other words, two or more stages in a
sequential block can have equal instantiations, which allows for non-deterministic sequential
update schemes. The stages represent the application order for all the updates within a
sequence and for every instantiation $\alpha$
$$\{\alpha \mid \alpha \in (I - \{\omega\})\}, \quad \text{where } \omega = \max(I)$$
of these variables there is a direct successor function $succ(\alpha)$. If for any $\{\alpha \mid succ(\alpha) \notin I\}$, the
sequence is always only partially applicable, as explained later. If for all $\{\alpha \mid succ(\alpha) \in I\}$,
stages do not restrict the applicability of the sequence, and it can be fully applicable depending
on the applicability of its updates. The application/temporal order of updates in a sequence
is defined as
$$\min(I),\ succ(\min(I)),\ succ(succ(\min(I))),\ \ldots,\ succ^{n-1}(\min(I)) \equiv \max(I)$$
where $\min(I)$ is the first stage, and $succ^{n-1}(\min(I))$ is the last stage of application.
Although the application of a sequence takes place in individual stages (steps), it is still 
atomic (as is for example the application of a parallel block) in relation to the rest of the 
updates outside the sequence. The syntax and the semantics of any of these n updates 
(including their atomicity of application) is naturally preserved. The semantics of sequential
blocks nested in parallel blocks is slightly more complicated, and is explained in section 2.4. 
Unlike parallel blocks, which are either applicable as a whole or simply not applicable at all, 
we define a full and partial applicability of sequences. A sequence is fully applicable if all of its 
constituent updates are applicable in the application order. A sequence is partially applicable, 
if one or more but less than n updates are applicable in the application order. For example, 
sequence 2.1 in chapter 7 will always be only partially applicable unless some sequential 
archetype expands its stage two. The effects of a sequential update scheme S which is only 
partially applicable are permanent, unless there was another update which was applicable 
(in the case of a sequential update scheme this may be either fully or partially) before the 
application of the scheme S. This behaviour is especially useful when simulating fatal failures 
of a system, when the system retains the state when the failure occurred. Breaking the 
application order by omitting a stage $\{\alpha \mid succ(\alpha) \notin I\}$ results in a sequential block which is
always only partially applicable. 
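Full and partial applicability can be sketched as follows. The sketch assumes integer stages with succ(stage) = stage + 1 and models each update as a function that returns a new configuration or `None` when it is not applicable; equal stage instantiations (the non-deterministic case) are not modelled, and the name `apply_sequence` is hypothetical.

```python
def apply_sequence(config, staged_updates):
    """staged_updates maps an integer stage to a function config -> config,
    where the function returns None if its update is not applicable."""
    stages = sorted(staged_updates)
    stage, applied = stages[0], 0
    while stage in staged_updates:
        result = staged_updates[stage](config)
        if result is None:      # update not applicable, the sequence stops here
            break
        config, applied = result, applied + 1
        stage += 1              # succ(stage)
    if applied == len(stages):
        status = "full"
    elif applied > 0:
        status = "partial"
    else:
        status = "not applicable"
    return config, status

inc = lambda c: {**c, "n": c["n"] + 1}
print(apply_sequence({"n": 0}, {1: inc, 2: inc, 3: inc}))
print(apply_sequence({"n": 0}, {1: inc, 3: inc}))
```

In the second call stage 2 is omitted, so only the first update is applied and its effect is kept, which mirrors the permanent effect of a partially applicable sequence.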
Since text can be expanded as a result of a sequence's application, we need to define the 
placement of the text for the sequence as a whole, and also an internal textual ordering. The
placement of text expanded as a result of a top-level sequence's application fits nicely with 
the concept of a command driven update plan. An update plan's top-level update is also an 
item in the update plan. In other words, a top-level update is always separated from other 
update plan's items by the '.' symbol. Since the application of a sequence can be viewed as
an atomic action, it can also be viewed as a command update scheme
ltxt lc =[ g ]=> rtxt rc
which can be desugared for top-level sequences as
PC[pc] pc[ltxt]qc lc =[ g ]=> PC[pc'] pc'[rtxt]qc rc.
where ltxt/rtxt or both are non-empty left/right-hand side texts, and lc/rc is the left/right-
hand side context before/after application of a sequence, and PC is a program counter.
The (internal) textual ordering of left/right-hand side text ltxt_i/rtxt_i, i ∈ {a, b, ..., m}
from updates of the form
ltxt_i lc_i =[ g_i ]=> rtxt_i rc_i
is then
ltxt_a ltxt_b ... ltxt_m lc =[ g ]=> rtxt_a rtxt_b ... rtxt_m rc
which can again be desugared for top-level sequences as
PC[pc] pc[ltxt_a ltxt_b ... ltxt_m]qc lc =[ g ]=> PC[pc'] pc'[rtxt_a rtxt_b ... rtxt_m]qc rc.
Not only are variables shared between left and right-hand sides of an update scheme, but 
they are also shared across all updates of a sequence. This further reduces the complexity 
of an update plan as it is possible to share data between updates directly rather than using 
additional constant locators. However, as will be shown in section 3.7, no forward references 
to variables are allowed. 
The temporal ordering of stageless sequences is equivalent to its textual ordering. In other
words, the stageless sequence
SEQ | update_a
    | update_b
    | ...
    | update_m
is syntactic sugar for
SEQ | min(I)                        update_a
    | succ(min(I))                  update_b
    | ...
    | succ^{n-1}(min(I)) ≡ max(I)   update_m
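The relation between textual and application order can be sketched in Python. The sketch assumes integer stages and represents a tagged sequence as a list of (stage, update) pairs in textual order; the helper names are hypothetical, and the stable sort used for equal stages is a simplification of the non-deterministic case.

```python
def desugar_stageless(updates, first_stage=1):
    """Tag each update of a stageless sequence with a stage that follows
    its textual position."""
    return [(first_stage + k, update) for k, update in enumerate(updates)]

def application_order(tagged):
    """Return the updates ordered by stage (the temporal order)."""
    return [update for _stage, update in sorted(tagged, key=lambda su: su[0])]

# Sequence 2.6: textual order is arithm, POSTINC, PREDEC.
s = [(3, "arithm"), (1, "POSTINC"), (2, "PREDEC")]
print(application_order(s))

stageless = desugar_stageless(["update_a", "update_b", "update_m"])
print(stageless)
```

For a stageless sequence the two orders coincide, whereas for sequence 2.6 the application order differs from the order in which the updates are written.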
2.4 Canonical form 
A sequential block is said to have a canonical form if all of its updates are update schemes. The 
canonical form of sequential blocks is introduced to simplify the implementation of sequential 
update schemes and to define the semantics of sequential blocks nested in parallel blocks. 
Every sequential block has its canonical form. The following text shows how a sequential 
update scheme can be translated into its canonical form. 
One of the requirements for transformation of a sequential update scheme into canonical 
form is that all its stages and stages of all nested sequences must be ground. At first sight 
this might seem as a limitation, but as temporal ordering is also required for the implemen- 
tation, and transformation into canonical form is primarily required as the first step of the 
implementation, this is not an issue. 
The algorithm can be divided into two independent parts. Firstly, the sequential block 
is 'linearised' by three transformations so that all of updates in the block are alternatives or 
parallel blocks. Then, in the second part, any parallel blocks and alternatives are replaced 
by update schemes. Due to the possible presence of alternatives (or alternatives nested in 
parallel blocks), this step (conversions 4 and 5) may produce two or more sequential update 
schemes in canonical form. 
The algorithm 
while (sequential block contains other sequential blocks) {
    convert:
    1) sequential blocks in || blocks            /* τ1 */
    2) sequential blocks in sequential blocks    /* τ2 */
    3) || blocks in || blocks                    /* τ3 */
}
/* assertion: the only type of nested updates are || blocks or alternatives */
convert:
4) all || blocks into update schemes             /* τ4a, τ4b */
5) all alternatives into update schemes          /* τ5 */
All five individual steps of the algorithm are explained in more detail in the following sections. 
Sequential blocks in a parallel block 
Of all the transformations used in the algorithm the transformation of sequential blocks nested 
in a parallel block has the most noticeable impact on the structure of the resulting sequential 
block. As a side effect, the transformation also shows the semantics of sequential blocks nested
in a parallel block. The main idea is demonstrated by the following example. There are
two sequential updates A and B, and two non-sequential updates update_1 and update_2 in the
parallel block. Let there be instantiation sets $I_A$ and $I_B$ containing instantiations of the
stages of sequential blocks A and B. The parallel block can then be replaced by a sequential block
containing n parallel blocks/stages, where $n = |I_A \cup I_B|$. Each of these parallel blocks/stages
will contain updates to be applied at the same time. All non-sequential updates are grouped 
in the first stage of the newly created sequential block. 
|| update_1                                  AB | a_A=a_B  || update_1
|| A | a_A  update_{a_A}                                   || update_{a_A}
     | b_A  update_{b_A}                                   || update_{a_B}
     | c_A  update_{c_A}        --τ1-->                    || update_2
|| B | a_B  update_{a_B}                        | b_A=b_B  || update_{b_A}
     | b_B  update_{b_B}                                   || update_{b_B}
|| update_2                                     | c_A      || update_{c_A}
Provided stages $a_A = a_B$ and $b_A = b_B$, the parallel block can be split into three separate
parallel blocks as demonstrated above (τ1). As all updates in a parallel block are guaranteed
to be desugared, no special care needs to be taken about textual ordering. 
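The grouping performed by this transformation can be sketched in Python. The sketch assumes integer stages, represents a sequential block as a dictionary from stage to update and any other item as a non-sequential update; the function name `tau1` simply echoes the transformation's label and is not taken from an existing implementation.

```python
def tau1(parallel_block):
    """Turn a parallel block containing sequential blocks into a dictionary
    {stage: [updates to be applied in parallel at that stage]}."""
    sequences = [item for item in parallel_block if isinstance(item, dict)]
    plain = [item for item in parallel_block if not isinstance(item, dict)]
    stages = sorted(set().union(*sequences))   # union of all stage values
    result = {stage: [] for stage in stages}
    if stages:
        result[stages[0]].extend(plain)        # non-sequential updates go first
    for seq in sequences:
        for stage, update in seq.items():
            result[stage].append(update)
    return result

A = {1: "update_aA", 2: "update_bA", 3: "update_cA"}
B = {1: "update_aB", 2: "update_bB"}
print(tau1(["update_1", A, B, "update_2"]))
```

The number of resulting stages equals the size of the union of the two instantiation sets, three in this example.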
Sequential blocks in a sequential block 
Let there be a sequential block A with a nested sequential block B in stage $b_A$. Then it
is possible to transform (τ2) the original sequential block A into a sequential block AB by
simply splitting B's stages and adding them to the newly created sequential block.
A | a_A  update_1                              AB | a_AB  update_1
  | b_A  B | a_B  update_{a_B}     --τ2-->        | b_AB  update_{a_B}
           | b_B  update_{b_B}                    | c_AB  update_{b_B}
  | c_A  update_2                                 | d_AB  update_2
Values of stages of the new block AB must preserve the application order to ensure the
semantic equivalence of sequential blocks A and AB. Values of all stages temporally preceding
stage $b_A$ are left unchanged. For instance, assuming that $b_A$ temporally follows both the $a_A$
and $c_A$ stages, $a_{AB} = a_A$ and $d_{AB} = c_A$. All stages $b$ introduced from the block B will have
their value changed to $b_A + b - \min(B)$, where $b \in B$.
Values of all stages temporally following stage $b_A$ need to be incremented by the difference
between the last and the first stage to be applied in block B, i.e. $\max(B) - \min(B)$. Textual
ordering is naturally preserved.
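The stage arithmetic can be sketched in Python. The sketch assumes distinct integer stages and represents a sequential block as a dictionary from stage to update, where the nested block B is itself such a dictionary; the function name `tau2` only echoes the transformation's label.

```python
def tau2(outer, nested_stage):
    """Flatten the sequential block nested at outer[nested_stage]."""
    inner = outer[nested_stage]
    shift = max(inner) - min(inner)
    flat = {}
    for stage, update in outer.items():
        if stage < nested_stage:
            flat[stage] = update              # preceding stages are unchanged
        elif stage > nested_stage:
            flat[stage + shift] = update      # following stages are moved past B
    for b, update in inner.items():
        flat[nested_stage + b - min(inner)] = update   # b_A + b - min(B)
    return dict(sorted(flat.items()))

A = {1: "update_1", 2: {1: "update_aB", 2: "update_bB"}, 3: "update_2"}
print(tau2(A, 2))
```

In this instance the stage following the nested block is shifted by max(B) - min(B) = 1, giving four consecutive stages.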
Parallel blocks in a parallel block
The third step of conversion of a sequential block into its canonical form is the simplest one. 
A parallel block in the sequential block S having as its updates other parallel blocks will 
be transformed into a new parallel block using a simple manipulation demonstrated by the 
following figure. 
|| update_1                       || update_1
|| update_2                       || update_2
|| || update_3      --τ3-->       || update_3
   || update_4                    || update_4
Parallel blocks 
In contrast to the three transformations described above, transformation of parallel blocks 
into canonical form can produce two or more updates (update schemes) derived from a single 
parallel block. This fact is due to the possible presence of alternatives, which will now be 
separated into update schemes. 
A parallel block P containing m alternatives with a variable number of update schemes
(us_ij) for each of the alternatives
|| us_11; us_12; ...; us_1a
|| us_21; us_22; ...; us_2b
|| ...
|| us_m1; us_m2; ...; us_mn
can be transformed (τ4a) into x = a × b × ... × n parallel blocks
|| us_11      || us_11      || us_11               || us_1a
|| us_21      || us_21      || us_22      ...      || us_2b
|| ...        || ...        || ...                 || ...
|| us_m1      || us_m2      || us_m1               || us_mn
This means that the sequential block containing the parallel block P will be replaced by 
multiple (x) instances of the sequential block where P's occurrence will be replaced in each 
of the instances by a parallel block transformed as shown above. 
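The separation of alternatives amounts to taking a Cartesian product, which can be sketched in Python. The sketch assumes a parallel block is a list of alternatives and each alternative a list of update schemes, here represented by plain strings; `tau4a` is a hypothetical name.

```python
from itertools import product

def tau4a(parallel_block):
    """Return one parallel block (a tuple of update schemes) for every
    combination that picks a single scheme from each alternative."""
    return list(product(*parallel_block))

P = [["us_11", "us_12"], ["us_21", "us_22", "us_23"]]
blocks = tau4a(P)
print(len(blocks))   # 2 * 3 combinations
print(blocks[0])
```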
Every parallel block consisting of only update schemes
|| lhs_1 =[ g_1 ]=> rhs_1
|| lhs_2 =[ g_2 ]=> rhs_2
|| ...
|| lhs_m =[ g_m ]=> rhs_m
is said to be in canonical form and can be rewritten (τ4b) as a single update scheme simply
by taking the unions of all left/right-hand sides and combining the guards.
lhs_1 lhs_2 ... lhs_m =[ g_1 ∧ g_2 ∧ ... ∧ g_m ]=> rhs_1 rhs_2 ... rhs_m
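A minimal Python sketch of this merge is given below. It assumes that left and right-hand sides are sets of locator expressions written as strings and that guards are kept as strings joined with `and`; the `Scheme` class and `tau4b` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Scheme:
    lhs: frozenset
    guard: str
    rhs: frozenset

def tau4b(schemes):
    """Merge the update schemes of a canonical parallel block into one."""
    lhs = frozenset().union(*(s.lhs for s in schemes))
    rhs = frozenset().union(*(s.rhs for s in schemes))
    guard = " and ".join(f"({s.guard})" for s in schemes)
    return Scheme(lhs, guard, rhs)

s1 = Scheme(frozenset({"A[a]"}), "a > 0", frozenset({"A[a + 1]"}))
s2 = Scheme(frozenset({"B[b]"}), "b < 5", frozenset({"B[b - 1]"}))
merged = tau4b([s1, s2])
print(merged.guard)
```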
Alternatives 
Linearised sequential blocks containing alternatives will be normalised (τ5) into multiple
sequential blocks using exactly the same principle as described in the previous section.
An alternative A containing n update schemes
lhs_1 =[ g_1 ]=> rhs_1; ... ; lhs_n =[ g_n ]=> rhs_n.
can be rewritten as n update schemes
lhs_1 =[ g_1 ]=> rhs_1.
...
lhs_n =[ (¬g_1 ∧ ... ∧ ¬g_{n-1}) ∧ g_n ]=> rhs_n.
The sequential block containing alternative A will then be replaced by n instances of the se- 
quential block where A's occurrence will be replaced in each of the instances by the individual 
update schemes shown above. 
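The guard strengthening used when an alternative is split can be sketched in Python. The sketch assumes an alternative is a list of (lhs, guard, rhs) triples with guards written as strings; `tau5` is a hypothetical name.

```python
def tau5(alternative):
    """Split an alternative into independent update schemes. Each scheme's
    guard is conjoined with the negation of all earlier guards, so that
    only the first applicable scheme of the alternative can fire."""
    schemes, earlier = [], []
    for lhs, guard, rhs in alternative:
        negated = [f"not ({g})" for g in earlier]
        schemes.append((lhs, " and ".join(negated + [f"({guard})"]), rhs))
        earlier.append(guard)
    return schemes

alt = [("lhs_1", "g1", "rhs_1"), ("lhs_2", "g2", "rhs_2"), ("lhs_3", "g3", "rhs_3")]
for scheme in tau5(alt):
    print(scheme)
```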
2.5 Implementation 
In this section a simple translation of sequential update schemes to basic update schemes 
is presented. The advantages of such a translation are obvious. As an algorithm for the
implementation of basic Update Plans has already been described [60], it is not necessary to
redesign the whole implementation. In other words, the proposed translation is a front-end,
independent of the actual implementation of basic UP. The implementation is divided into three
sections. Firstly, the main idea is presented in the following section. Secondly, the algorithm
is described in section 2.5.2. Finally, section 2.5.3 gives an example of the implementation. 
2.5.1 Preliminaries 
The algorithm described in the following section assumes that the sequential update scheme 
under transformation is already in its canonical form and that all of its sequential archetypes 
have been expanded. For the expansion of sequential archetypes please refer to section 3.3. 
The main idea is to translate all sequential update schemes at compile-time into basic update
schemes and to add a mechanism to synchronise the translated update schemes and existing
top-level updates. Synchronisation is made possible by the introduction of a shared central
update plan synchroniser UP:SEQ, which is a locator addressing the synchroniser's stack. The
role of the stack is twofold. Firstly, it serves as an extra guard to allow the application of
only a particular update scheme from a sequential block, and secondly, if the application
of a sequential block has finished, or has yet to start, it allows the application of only top-level
updates.
2.5.2 The algorithm
For the purposes of the implementation, the following popsh archetype is defined. It is a 
two-parameter stack management archetype, which combines the effects of the classical pop 
and push operations. 
[2.7] popsh(v1, v2) = UP:SEQ[q] p[v1]q ==> UP:SEQ[q'] p[v2]q'. # pop and push
In order to prevent existing top-level update schemes from interfering with half-completed
sequential update schemes, it is necessary to add an archetype call popsh(UP:TOP, UP:TOP) to
every top-level update scheme in an update plan. The initial synchroniser's configuration
is UP:SEQ[p] [UP:TOP]p, which enables the application of all top-level update schemes. The
algorithm for the actual translation of sequential update schemes into basic update schemes 
is described below. 
Consider the following generic top-level canonical sequential update scheme S 
[2.8] S | a  ltxt_a lc_a =[ g_a ]=> rtxt_a rc_a
        | b  ltxt_b lc_b =[ g_b ]=> rtxt_b rc_b
        | ...
        | m  ltxt_m lc_m =[ g_m ]=> rtxt_m rc_m
where ltxt_i/rtxt_i is the left/right-hand side text and lc_i/rc_i is the left/right-hand side context
of an update scheme i ∈ {a, b, ..., m}.
As already explained in section 2.3, not only are variables shared between left and right- 
hand sides of an update scheme, but they are also shared across all updates in a sequence. 
As the aim is to convert a sequential block into top-level update schemes, and the only way 
to share data across top-level updates is using ground (in other words constant) locators, 
additional ground locators will be used as addresses of the shared variables. 
Let there be a set A of update schemes which share a variable v in the sequence S. Let 
there also be a universe of locators U used in the transformed update plan and a locator 
V, where V ∉ U. Then all update schemes from A need to be adapted by adding a locator
expression V[v] using the following two rules.
1. An update scheme from A in which v is ground is subject to transformation τ1
lhs =[ g ]=> rhs   --τ1-->   lhs =[ g ]=> rhs V[v]
2. An update scheme from A in which v is not ground is subject to transformation τ2
lhs =[ g ]=> rhs   --τ2-->   lhs V[v] =[ g ]=> rhs
It is also necessary to make the textual ordering of a sequence's text explicit, to conform
to the semantics of sequences described in 2.3. Let there be an ordering B of n update
schemes in the sequence S ordered textually in the order text is arranged in a configuration
after expansion as described in section 2.3. Then every update scheme from that ordering
is assigned a number t in the range 1 to n representing the position it appears in B and is
subject to transformation τ3
$$ltxt\ \ lc \ =\![\,g\,]\!\Rightarrow\ rtxt\ \ rc$$
$$\xrightarrow{\ \tau_3\ }$$
$$PC[pc]\ \ \Bigl(pc + \sum_{i=1}^{t-1} |ltxt_i|\Bigr)[ltxt]\ \ lc \ =\![\,g\,]\!\Rightarrow\ \Bigl(pc + \delta - \sum_{i=t}^{n} |rtxt_i|\Bigr)[rtxt]\ \ rc\ \ \widehat{pc}$$
where $\delta = \sum_{j=1}^{n} |ltxt_j|$ and $\widehat{pc} \equiv PC[pc + \delta - \sum_{j=1}^{n} |rtxt_j|]$ if the update scheme under
transformation is in the last (temporal) stage of S, otherwise $\widehat{pc}$ is empty. The '| · |' symbol is the
length operator.
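The offset arithmetic can be checked with a short Python sketch. It follows the reading of the transformation that agrees with example 2.14 in section 2.5.3 (left-hand texts laid out from pc, right-hand texts aligned to end at pc plus the total left-hand length); the treatment of non-empty right-hand texts and the function name `text_offsets` are assumptions made for illustration.

```python
def text_offsets(left_lengths, right_lengths):
    """Offsets relative to pc for n update schemes given in textual order.
    Returns (left offsets, right offsets, offset of the new program counter)."""
    n = len(left_lengths)
    delta = sum(left_lengths)                        # total left-hand text length
    left = [sum(left_lengths[:t]) for t in range(n)]
    right = [delta - sum(right_lengths[t:]) for t in range(n)]
    new_pc = delta - sum(right_lengths)
    return left, right, new_pc

# |ADD r3| = 16, |POSTINC r1| = 8, |PREDEC r2| = 8, no right-hand text.
print(text_offsets([16, 8, 8], [0, 0, 0]))
```

For these lengths the left-hand locators are pc, pc+16 and pc+24, every empty right-hand locator is pc+32, and the program counter is set to pc+32.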
Lastly, the application order of update schemes within S needs to be made explicit together 
with an explicit statement of its entering and leaving. 
Rule 1. The first update scheme μ_f of the sequence S can only be applied if there is an
appropriate unique value α corresponding to μ_f on the synchroniser. The following update
scheme needs to be added for (top-level) sequence S in order to enter sequential application
of S's schemes.
[2.9] popsh(UP:TOP, UP:TOP α) ==> .
# preamble, the first update scheme (rule 1)
Rule 2. Let there be a temporal ordering C of all stages of the sequence S given by the successor
function succ(α) as described in section 2.3. Then for every stage α ∈ C there is an update
scheme μ in that stage (already adapted by transformations τ1, τ2 and τ3) which will be
translated into a basic update scheme
[2.10] popsh(v1, v2) μ. # sequential block itself (rule 2 and 3)
where v1 and v2 are both unique constants: the first one identifying the update scheme in
stage α, the second one identifying the update scheme in stage succ(α). The uniqueness of
these constants in combination with the popsh archetype ensures sequentiality of application
of all update schemes within the sequence S and no interference with other updates.
Rule 3. The update scheme μ_l in the last stage of S's application, succ^{n-1}(min(I)), where
I is a countable instantiation set of the sequence S as explained in section 2.3, is not subject
to rule 2, as its successor function is not defined. There is, however, a need to transfer
control back to (or enable execution of) top-level updates, in other words exit the sequential
application of S. This is ensured by adding an update scheme 2.10 where μ = μ_l, v2 = 'UP:TOP'
and v1 = 'UP:TOP ψ', where ψ is the constant which was used by rule 2 as v2 for μ in stage
succ^{n-2}(min(I)).
Anonymous and stageless sequences are mere syntactic sugar for ordinary sequences as 
described in sections 2.2 and 2.3. As such they do not need to be considered as a special case 
for implementation. 
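How the three rules cooperate with the synchroniser's stack can be sketched in Python. The sketch assumes the stack is a list of constants with its top at the end, and that the unique constants are formed from the sequencer's name and the stage (for example `S1`); both function names are hypothetical.

```python
def synchroniser_calls(name, stages):
    """Return the (pop, push) argument pairs of the popsh calls produced by
    rules 1-3 for a sequence with the given integer stages."""
    tokens = [f"{name}{s}" for s in sorted(stages)]
    calls = [(["UP:TOP"], ["UP:TOP", tokens[0]])]         # rule 1: enter
    for current, following in zip(tokens, tokens[1:]):    # rule 2: step
        calls.append(([current], [following]))
    calls.append((["UP:TOP", tokens[-1]], ["UP:TOP"]))    # rule 3: leave
    return calls

def popsh(stack, pop, push):
    """Replace the top of the stack when it matches pop, otherwise
    return None (the guarded update scheme is not applicable)."""
    if stack[-len(pop):] != pop:
        return None
    return stack[:-len(pop)] + push

stack = ["UP:TOP"]
for pop, push in synchroniser_calls("S", [1, 2, 3]):
    stack = popsh(stack, pop, push)
    print(stack)
```

While the sequence is half-completed, for instance with the stack `["UP:TOP", "S2"]`, a top-level update guarded by popsh(UP:TOP, UP:TOP) finds a different constant on top of the stack and is therefore not applicable.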
2.5.3 Example
Continuing with the 'PDP-11' example introduced in this chapter as one of the motivations 
for sequential update schemes, the implementation of the sequential update scheme S shown 
in section 2.1 on page 65 is now examined. It is given again for convenience in 2.11. 
[2.11] S |3 arithm(x, y, r) r3 ==> r3[r]
         |1 POSTINC(b, x) ==>
         |2 PREDEC(a, y) ==> .
As the sequential update scheme is already in canonical form, we can start with
transformations τ1 and τ2 and thus make sharing of variables between update schemes explicit.
[2.12] S |3 arithm(x, y, r) r3 X[x] Y[y] ==> r3[r]   # τ2, τ2 (x and y not ground by arithm)
         |1 POSTINC(b, x) ==> X[x]                   # τ1 (archetype POSTINC grounds x)
         |2 PREDEC(a, y) ==> Y[y].                   # τ1 (archetype PREDEC grounds y)
Note that variables a, b, r3 and r are not shared among the individual update schemes of S,
and as such are not subject to transformations τ1 and τ2.
As mentioned in section 2.5.1, this is a compile-time transformation and so all text- 
expanding archetypes of a sequence must be expanded before the sequence can be translated 
into basic update schemes. This step is demonstrated by the update scheme 2.13. 
[2.13] S |3 ADD r3 X[x] Y[y] ==> r3[r]                 # {x = x, y = y, r = x + y}
         |1 POSTINC r1 r1[b] b[v]c ==> X[x] r1[c]      # {b = b, x = v}
         |2 PREDEC r2 r2[b1] a[v1]b1 ==> Y[y] r2[a].   # {a = a, y = v1}
Note that the variables of archetype definitions have been renamed before the expansion of 
archetype calls as in section 2.1. As a sequential update scheme shares variables among its 
individual updates, variables in all updates within a sequence have to be taken into account 
during the renaming process. 
Assuming that |ADD r3| = 16, |POSTINC r1| = 8, and |PREDEC r2| = 8, internal and external
textual orderings are made explicit by the transformation τ3 (see section 2.3) as shown by 2.14.
[2.14] S |3 PC[pc] pc[ADD r3] X[x] Y[y] ==> r3[r] pc+32[] PC[pc + 32]         # t = 1, n = 3
         |1 PC[pc] pc+16[POSTINC r1] r1[b] b[v]c ==> X[x] r1[c] pc+32[]       # t = 2
         |2 PC[pc] pc+24[PREDEC r2] r2[b1] a[v1]b1 ==> Y[y] r2[a] pc+32[].    # t = 3
Finally, S can be split into individual top-level update schemes using rules 1-3 from the 
previous section. As no text is expanded on any of the right-hand sides the 'empty' locator
expressions pc+32[] are discarded for better readability.
[2.15] popsh(UP:TOP, UP:TOP S1) ==> .                                                 # rule 1
       popsh(S1, S2) PC[pc] pc+16[POSTINC r1] r1[b] b[v]c ==> X[x] r1[c].             # rule 2
       popsh(S2, S3) PC[pc] pc+24[PREDEC r2] r2[b1] a[v1]b1 ==> Y[y] r2[a].           # rule 2
       popsh(UP:TOP S3, UP:TOP) PC[pc] pc[ADD r3] X[x] Y[y] ==> r3[r] PC[pc + 32].    # rule 3
Figure 6.1: A simple logic circuit 
Since in our example r1 and r2 are identical registers, r2 can be substituted by r1. Also,
based on the resolution set derived in 2.13, the v, v1 and r variables are replaced by x, y and
x+y respectively.
[2.16] popsh(UP:TOP, UP:TOP S1) ==> .
       popsh(S1, S2) PC[pc] pc+16[POSTINC r1] r1[b] b[x]c ==> X[x] r1[c].
       popsh(S2, S3) PC[pc] pc+24[PREDEC r1] r1[b1] a[y]b1 ==> Y[y] r1[a].
       popsh(UP:TOP S3, UP:TOP) PC[pc] pc[ADD r3] X[x] Y[y] ==> r3[x+y] PC[pc + 32].
To complete the example, all popsh archetypes are expanded. 
[2.17] UP:SEQ[q] p[UP:TOP]q ==> UP:SEQ[q'] p[UP:TOP S1]q'.
       UP:SEQ[q'] q[S1]q' PC[pc] pc+16[POSTINC r1] r1[b] b[x]c ==>
           X[x] r1[c] UP:SEQ[q'] q[S2]q'.
       UP:SEQ[q'] q[S2]q' PC[pc] pc+24[PREDEC r1] r1[b1] a[y]b1 ==>
           Y[y] r1[a] UP:SEQ[q'] q[S3]q'.
       UP:SEQ[q'] p[UP:TOP S3]q' PC[pc] pc[ADD r3] X[x] Y[y] ==>
           r3[x + y] PC[pc + 32] UP:SEQ[q] p[UP:TOP]q.
Note that b1 = c, and since x and y are of the same type, x = y (and b = a). Also note
that the value b in r1[b] does not change with respect to the value before application of the
update scheme 2.6 on page 65.
3 Sequential archetypes
3.1 Background/Motivation 
While sequential update schemes have already proved to be a good step towards more trans- 
parent and hierarchical UP specifications, they still provide only a limited improvement on 
the original model. 
Consider the example of a simple logic circuit in figure 6.1. A basic Update Plans speci- 
fication for AND and OR gates is as follows. 
[3.1] and(a, b, c) = a[0] b[0] ==> c[0].     or(a, b, c) = a[0] b[0] ==> c[0].
                   = a[0] b[1] ==> c[0].                 = a[0] b[1] ==> c[1].
                   = a[1] b[0] ==> c[0].                 = a[1] b[0] ==> c[1].
                   = a[1] b[1] ==> c[1].                 = a[1] b[1] ==> c[1].
To connect the output D of the AND gate to the input D of the OR gate, sequential update 
scheme 3.2 is defined. 
[3.2] |1 and(A, B, D)
      |2 or(D, C, E).
In the situation shown in figure 6.1 only one of 16 possible expansions is applicable:
[3.3] |1 A[1] B[1] ==> D[1]    # AND gate
      |2 D[1] C[0] ==> E[1].   # OR gate
This example demonstrates that archetypes can be used in sequential update schemes in the 
same manner as in basic update schemes. Note that it would be possible to assign delays (or 
other metrics) to the individual and and or gates/archetypes, and that the overall delay of 
the logic circuit could then be easily calculated simply by adding maximum delays occurring 
in every stage of the sequential update scheme. 
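The count of 16 expansions, and the fact that only one of them applies, can be checked mechanically. The sketch below is a plain Python illustration, not Update Plans: it encodes the four alternatives of each gate from 3.1 as (a, b, c) triples, pairs them as in 3.2, and filters for the input values of 3.3 (A = 1, B = 1, C = 0). The names `AND`, `OR` and `applicable` are hypothetical.

```python
from itertools import product

# Alternatives of the and/or archetypes from 3.1, written as (a, b, c) triples
AND = [(0, 0, 0), (0, 1, 0), (1, 0, 0), (1, 1, 1)]
OR = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 1)]


def applicable(state):
    """Return the pairs of alternatives of 3.2 that match the inputs A, B, C in `state`."""
    result = []
    for (a, b, d), (d_in, c, e) in product(AND, OR):
        inputs_match = (a, b, c) == (state["A"], state["B"], state["C"])
        if inputs_match and d == d_in:  # D written in stage 1 must be the D read in stage 2
            result.append(((a, b, d), (d_in, c, e)))
    return result


expansions = list(product(AND, OR))
matches = applicable({"A": 1, "B": 1, "C": 0})
print(len(expansions), matches)
```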
However, the main limitation of using basic archetypes in sequential update schemes is that 
sequences would be unduly complicated to encapsulate into non-sequential archetypes. Not 
only do sequential archetypes address this issue, but they also make it possible to explicitly 
predetermine locations of sequential archetypes' updates in the hierarchy of the top-level 
sequential update scheme these archetypes are called from. They also provide the formalism
with an additional information-hiding facility with the help of which modular specifications can
easily be defined.
3.2 Syntax 
The syntax of sequential archetype definitions is that of ordinary archetype definitions where 
the (update) (see page 58) appearing in the archetype body is a sequential block. The syntax 
of archetype calls is the same for all types of archetypes. Using typesetting possibilities a
sequential archetype a is defined
a(params) = SEQ|a update_a
               |b update_b
               ...
               |m update_m.
The new archetype grammar can be found in chapter 5, or alternatively refer to appendix A 
for the revised complete Extended Update Plans grammar. 
3.3 Semantics 
Syntactically, sequential archetype calls are equivalent to basic archetype calls. A sequential
archetype definition, on the other hand, has a sequential block in its body instead of an update 
scheme. The individual stages (together with their updates) of this sequential block are then 
matched during expansion against corresponding stages of the sequential update scheme they 
are called from. 
As matching is done using sequencers and stages, both of these must be ground in the 
archetype body and in the sequential update scheme before any expansion can take place. 
The sequencer/stage pair has a similar purpose to that of indices in basic archetype calls, as 
described in [60], i.e. to ensure the correct placement of archetype bodies during an expansion.
In a sense, sequential archetypes are not as 'dynamic' as normal archetypes, and serve 
only as a framework to place update schemes in their predetermined locations of a sequential 
update scheme. Nevertheless, this is still a powerful concept, examples of which will be shown 
in chapter 7. 
Expansion 
If
(λX.(SEQ|a1 update_a1
        |a2 update_a2
        ...
        |am update_am))(a(params))
is equivalent to a (top-level or nested) sequential update scheme S containing one or more
calls of the archetype a(params) and the definition of this archetype is
a(params) = SEQ|b1 update_b1
               |b2 update_b2
               ...
               |bn update_bn.
then the result of expanding the archetype in S is
SEQ|c1 update_c1
   |c2 update_c2
   ...
   |co update_co.
where {c1, c2, ..., co} = {a1, a2, ..., am} ∪ {b1, b2, ..., bn}, and the individual update_cj
updates, where j ∈ {1, ..., o}, are constructed as described in the following four sections.
As has already been shown in the introduction to this section, expansion of non-sequential 
archetypes in sequential blocks is allowed and they expand in the same manner as they do in 
the original UP. 
3.3.1 Non-matching sequencers 
If the archetype's sequencer does not match any of the sequencers of the top-level sequential 
update scheme it is called from, it is matched against the sequencer of the closest sequence 
in which the archetype call a(params) appears, i.e. the lowest-level sequence containing the
archetype call. Note that due to the syntactic sugar introduced in section 3.6.2, this is in 
effect an in situ expansion. 
3.3.2 Non-matching stages 
If during the archetype expansion any a(params) b-stage does not match any of the (sequencer 
matched) sequential update scheme's S stages, it is added temporally after the previous stage 
of S. 
Example 2 
The sequential archetype a
a() = ID|3 update3
        |2 update2.
in a sequential update scheme
ID|1 a() lhs ==> rhs.
expands as
ID|1 lhs ==> rhs
  |2 update2     # 1 < 2
  |3 update3.    # 2 < 3
Although it might seem attractive to preserve textual ordering as it is in the sequential 
block of an archetype, this is not the primary aim. The primary aim is an accurate placement 
of a sequential archetype's updates in a sequence regardless of the order in which sequen- 
tial archetypes expand (resulting in the normal form). This kind of expansion is also more 
consistent with the way placement of matching updates is performed. 
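The placement rule of Example 2 can be pictured as a merge of two stage-indexed collections. The Python sketch below is a deliberately simplified model, assuming numeric stages under a single matched sequencer and ignoring text expansion and guards; the function name `expand` and the dictionary representation are hypothetical.

```python
def expand(scheme, archetype):
    """Merge archetype stages into a scheme; both map stage number -> list of updates."""
    merged = {stage: list(updates) for stage, updates in scheme.items()}
    for stage, updates in archetype.items():
        # A matching stage receives the archetype's updates; a new stage is created otherwise
        merged.setdefault(stage, []).extend(updates)
    # Stages apply in temporal (numeric) order, whatever their textual order was
    return dict(sorted(merged.items()))


scheme = {1: ["lhs ==> rhs"]}
archetype = {3: ["update3"], 2: ["update2"]}  # textual order 3, 2 as in Example 2
result = expand(scheme, archetype)
print(result)
```

Because the result is keyed by stage rather than by textual position, the same stage set is obtained regardless of the order in which several archetypes are merged.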
3.3.3 update_a and update_b are alternatives
In case sequencer and stage matched updates update_a and update_b are both alternatives,
the expansion mechanism is identical to the expansion mechanism in basic Update Plans.
Consider the following sequential archetype and update scheme containing its calls.
a(params) = SEQ|1 ltxt1 lc1 =[ g1 ]=> rtxt1 rc1
               |2 ltxt2 lc2 =[ g2 ]=> rc2
               |m ltxtm lcm =[ gm ]=> rtxtm rcm.
SEQ|1 s-lhs1 a(params) =[ s-g1 ]=> a(params) s-rhs1
   |2 a(params) s-lhs2 =[ s-g2 ]=> s-rhs2
   |n s-lhsn =[ s-gn ]=> s-rhsn.
Note the intentional omission of an archetype call a(params) on the right-hand side of 
the alternative in stage 2. The archetype call can be omitted on the right/left-hand side 
provided the corresponding right/left expansion of the sequential archetype is empty. This is 
equivalent to the notion of left/right-handed archetypes introduced in [60]. 
Provided m does not match any stage in the sequential update scheme and m > n the
expansion is
SEQ|1 s-lhs1 ltxt1 lc1 =[ s-g1 ∧ g1 ]=> rtxt1 s-rhs1 rc1
   |2 ltxt2 s-lhs2 lc2 =[ s-g2 ∧ g2 ]=> s-rhs2 rc2
   |n s-lhsn =[ s-gn ]=> s-rhsn
   |m ltxtm lcm =[ gm ]=> rtxtm rcm.
Note that the text in stage m does not require any archetype call to be present since there
is no matching stage m in the sequential update scheme and the stage is simply added as
described earlier in section 3.3.2.
Although alternatives, all a and b updates in this section were in fact only plain update
schemes. The expansion of true alternatives, however, is no more complicated than the
expansion presented here as every sequential update block containing alternatives can be
replaced by multiple instances of the sequential block containing only update schemes as was
described in section 2.4.
3.3.4 update_a or update_b is a parallel or a sequential block
If one of sequencer and stage matched updates update_a or update_b is a parallel or a sequential
block, the newly constructed update_c will consist of both updates arranged in parallel in
the matching stage. For example, see the expansion of the xor archetype in example 4 in 
section 3.6.2. 
3.4 Special types of sequential archetypes 
3.4.1 Ambidextrous sequential archetypes 
Ambidextrous sequential archetypes have a slightly different semantics to ambidextrous ar- 
chetypes from basic Update Plans. Again, consider the definition of the archetype a and the 
sequential update scheme SEQ containing its calls. 
a(params) text = SEQ|1 ltxt1 lc1 =[ g1 ]=> rtxt1 rc1
                    |2 ltxt2 lc2 =[ g2 ]=> rc2
                    |m ltxtm lcm =[ gm ]=> rtxtm rcm.
SEQ|1 s-lhs1 a(params) =[ s-g1 ]=> a(params) s-rhs1
   |2 a(params) s-lhs2 =[ s-g2 ]=> s-rhs2
   |n s-lhsn =[ s-gn ]=> s-rhsn.
Ambidextrous sequential archetypes have an additional text, which is substituted for ev- 
ery instance of its call. This text precedes any text expanded as a result of expansion of 
left/right-hand side of an update scheme in a particular stage. Apart from this, the expansion 
mechanism of ambidextrous sequential archetypes is equivalent to the expansion mechanism 
of sequential archetypes described in the previous section. 
Again, provided m does not match any stage in the sequential update scheme and m > n
the expansion is
SEQ|1 s-lhs1 text ltxt1 lc1 =[ s-g1 ∧ g1 ]=> text rtxt1 s-rhs1 rc1
   |2 text ltxt2 s-lhs2 lc2 =[ s-g2 ∧ g2 ]=> s-rhs2 rc2
   |n s-lhsn =[ s-gn ]=> s-rhsn
   |m ltxtm lcm =[ gm ]=> rtxtm rcm.
3.4.2 Command sequential archetypes 
Command archetypes in general are a special case of ambidextrous archetypes and their 
semantics has already been explained in section 2.3 on page 59. 
3.5 Parameters 
Parameters of sequential and parallel archetypes have to be ground expressions before arche- 
type expansion. The primary reason for this restriction is that the structure of a sequential 
block that is to be expanded must be known before transformation into canonical form which 
is an important part of the implementation of sequential update schemes. 
Example 3 
The following archetype definition is perfectly valid as long as n is known 
(ground) before archetype expansion and n>0. 
a(0) = .
a(n) = S|n b(n) a(n - 1).
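To see why n must be ground, it helps to unfold the recursion by hand. The Python sketch below is an illustration only, with the hypothetical helper `expand_a` returning a stage-to-call mapping; it mirrors the two defining clauses and refuses to run when n is not a known non-negative integer, since the number of stages could not otherwise be determined.

```python
def expand_a(n):
    """Unfold a(n) = S|n b(n) a(n - 1), a(0) = . into {stage: call} for ground n >= 0."""
    if not isinstance(n, int) or n < 0:
        raise ValueError("n must be ground: a known non-negative integer")
    if n == 0:
        return {}  # a(0) = . contributes no stage
    stages = expand_a(n - 1)  # stages n-1, ..., 1 from the recursive call
    stages[n] = f"b({n})"     # the call b(n) is placed in stage n of sequencer S
    return stages


unfolded = expand_a(3)
print(unfolded)
```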
Parameter resolution of archetype calls inside a sequential block is slightly complicated by 
the fact that variables are shared across individual stages of the sequential block application 
and the parameter resolution set has to reflect this. 
3.6 Syntactic sugar 
3.6.1 Update schemes in sequential and parallel blocks 
If the right-hand side of an update scheme in a sequential or parallel block is empty and its 
guard is true, then the transition symbol can be omitted. 
3.6.2 An update is a sequential update 
Every update update (excluding the updates which form archetype bodies) can be viewed as 
syntactic sugar for an anonymous one-stage sequential update |1 update.
The impact of this syntactic sugar is bigger than it may at first sight seem. It is significant 
as sequential archetypes do not necessarily have to be called from within sequential update 
schemes. The intention is to use this sugar in conjunction with the rule from section 3.3.1. 
Example 4 
The following update plan defines a half adder. 
xor(x, y, s) and(x, y, c) ==> .   # half adder
It can be viewed as syntactic sugar for
|1 xor(x, y, s) and(x, y, c) ==> .   # half adder
Given an archetype definition 
xor(x, y, s) = not(x, w2)
               not(y, w1)
               and(x, w1, w3)
               and(w2, y, w4)
               or(w3, w4, s).
expanding the xor archetype in the definition of a half adder using the rule 
from section 3.3.4 gives 
|1 ∥ not(x, w2)
     not(y, w1)
|2 ∥ and(x, w1, w3)
     and(w2, y, w4)
|3 ∥ or(w3, w4, s)
     and(x, y, c).
which can be transformed as 
and(x, y, c)
not(x, w2)
not(y, w1)
and(x, w1, w3)
and(w2, y, w4)
or(w3, w4, s).
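That the gate network of Example 4 really computes a half adder can be confirmed by evaluating it for all four inputs. The following Python sketch is a hedged illustration: it evaluates the gates in three steps (inverters, the two AND gates, the OR gate), which is an assumed reading of the staging, and computes the carry alongside. The function name `half_adder` is hypothetical; the wire names follow the example.

```python
def half_adder(x, y):
    """Evaluate the xor network of Example 4 plus the carry gate; returns (s, c)."""
    # Step 1: the two inverters
    w2, w1 = 1 - x, 1 - y
    # Step 2: the two AND gates feeding the OR gate
    w3, w4 = x & w1, w2 & y
    # Step 3: the OR gate yields the sum bit
    s = w3 | w4
    # The carry is the plain and(x, y, c) of the half adder
    c = x & y
    return s, c


table = {(x, y): half_adder(x, y) for x in (0, 1) for y in (0, 1)}
print(table)
```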
3.7 Limitations 
With the introduction of sequential update schemes a new set of problems needs to be ad- 
dressed. The most obvious one stems from allowing variables to be shared among all stages 
of a sequential block. Consider example 5. 
Example 5 
This rather forced example uses two sequential archetypes and a (sugared) 
sequential update scheme which calls these archetypes. One of these ar- 
chetypes (r) reads a value of variable v, and the second one (w) writes a 
value of variable v. 
r(v) = S|2 A[v].
w(v) = S|1 ==> B[v].
r(v) ==> w(v).
Clearly, the expansion of these archetypes results in a "forward reference"
to the variable v (v is not ground in stage 1), which is not a valid
specification.
S|1 ==> B[v]
 |2 A[v] ==> .
However, problems such as these are not exclusive to Extended UP, they were also present 
in basic UP, and can easily be detected by a trivial data flow analysis [1]. Any update scheme 
containing such a forward reference is illegal. 
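A data flow check of the kind alluded to here can be very small. The Python sketch below is one possible illustration, not the analysis of [1]: it assumes a simplified model in which each stage is reduced to the set of variables its left-hand side grounds and the set its right-hand side uses, and it ignores guards. The name `forward_references` and the data layout are hypothetical.

```python
def forward_references(stages):
    """stages: {stage: (vars grounded by the left-hand side, vars used on the right-hand side)}.

    Returns (stage, variable) pairs where a variable is used before any stage grounds it.
    """
    grounded, errors = set(), []
    for stage in sorted(stages):  # walk stages in temporal order
        lhs_vars, rhs_vars = stages[stage]
        grounded |= set(lhs_vars)  # the left-hand side of this stage grounds its variables
        errors += [(stage, v) for v in sorted(set(rhs_vars) - grounded)]
    return errors


# Example 5 after expansion: stage 1 writes B[v], stage 2 reads A[v]
example5 = {1: (set(), {"v"}), 2: ({"v"}, set())}
# The same updates with the read placed before the write
reordered = {1: ({"v"}, set()), 2: (set(), {"v"})}
print(forward_references(example5), forward_references(reordered))
```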
4 Special cases of archetype expansion 
4.1 Alternatives 
As a result of syntactic changes, alternatives can now (in Extended Update Plans) be bodies 
of archetypes. A single update scheme in which such an archetype is called will, assuming 
there are n update schemes in the archetype's alternative, expand into n update schemes. 
In other words, the archetype definition of such an archetype can be viewed as n archetype 
definitions of the same archetype containing update schemes with their guards adapted as 
shown by the transformation τ5 in section 2.4. Again, consistency is the primary reason for
introducing alternatives into archetypes' bodies.
4.2 Parallel blocks 
The original Update Plans formalism does not allow the use of parallel blocks in archetypes. 
While this may seem unnecessary in the original Update Plans, in Extended Update Plans 
parallel blocks can appear in archetypes not only as a part of a sequential update scheme, 
but also entirely on its own. This increases consistency and adds a degree of modularity and 
information hiding to the formalism. 
An archetype definition 
a(params) = ∥ update_1
              ...
              update_n.
is syntactic sugar for
a(params) = |1 ∥ update_1
                 ...
                 update_n.
that is, the same parallel block is embedded in a one-stage anonymous sequence. Thanks
to the existence of the expansion rule introduced in section 3.3.1 and the syntactic sugar in 
section 3.6.2, the expansion of this kind of archetype is governed by the rule in section 3.3.4, 
which is thus in effect an in situ expansion. 
Conclusions 
In this chapter sequential update schemes and sequential archetypes were introduced as se- 
mantic extensions to basic Update Plans. They contribute to the usability of the formalism 
in two major areas. 
Firstly, the obvious area of synchronisation. Synchronisation, or more precisely explicit 
temporal ordering of update schemes, was already possible in basic Update Plans. How- 
ever, this required an introduction of artificial constructs unrelated to the architecture under 
description and made the whole specification less readable and elegant. Sequential update 
schemes augment the non-deterministic model of execution of basic Update Plans by explic- 
itly stating the order in which updates will be applied and expressing a number of consecutive 
updates as one. 
Secondly, the introduction of sequential archetypes extended the possibility for information 
hiding and structure reuse by encapsulating a series of synchronised updates rather than just 
a single atomic update. As a series of actions can be encapsulated into a module (a sequential 
archetype), it is possible to provide multiple definitions for exactly the same series of actions, 
but perhaps on a different level of abstraction. The next step is to design a mechanism 
proving equivalence of these levels or ideally a refinement methodology to automatically derive 
provably correct levels of description. 
Overall, Extended Update Plans present a real improvement in readability of descriptions 
where sequential behaviour is required. These may include fetch/execute cycle and lower-level 
descriptions such as gate-level models of (a)synchronous circuits. 
Chapter 7 
PRAM 
To demonstrate the power, compactness and intuitiveness of the use of sequential update 
schemes and archetypes, a specification of the parallel random access machine (PRAM) 
has been developed. The PRAM specification has been chosen in particular not only because 
of its historical importance as one of the first models of parallel computing [22], but also
because of its ongoing relevance and the interest of researchers in this model [2,40,46,51,76]. 
A PRAM is characterised by its "memory models" which determine the PRAM's be- 
haviour when two or more random access machines (RAMs) attempt to read from or write to 
the same memory cell. An informal description of these memory models is given in section 1.
Section 2 gives an EUP specification of the memory models for PRAMs with two RAMs. 
More general n-RAM PRAM memory models are defined in section 3. The EUP specifica- 
tions are used in section 4 where they are placed in the context of the remainder of a PRAM's 
instruction set. Conclusions are drawn in section 5. The full specification for n-PRAM can 
be found in appendix E. 
1 Informal description
The parallel random access machine is a theoretical uniform memory access shared memory 
model (UMA/SMP) [35] of parallel computing. This description and the formal specification
is based on the informal PRAM specification given in [30]. PRAM consists of n random access 
machines [67,72] with infinite shared memory and a common clock. Each of the RAMs can 
access shared memory independently from any other RAM in constant time. 
Figure 7.1 shows a RAM consisting of R general-purpose registers, a program counter 
(PC), a signature register (SIG), an accumulator (ACC) and a memory address register (MAR). 
Note that the signature register is only used in the priority model, which will be explained 
later in this section. 
A PRAM's instruction set can be divided into four classes: arithmetic/logic instructions, 
load/store instructions, flow control instructions and read/write instructions. These instructions
are specified in section 4, but it is worth noting that only the read/write instructions
access shared memory. 
Figure 7.1: The structure of a PRAM's RAM and data flow inside it
All RAMs execute the same program, but each one can be executing a separate code 
segment within the program. Execution of any instruction on the PRAM machine takes 
one clock cycle, i.e. a constant unit of time. One clock cycle is divided into four sequential
phases (in a sense a f/e cycle) which are synchronous between processors (RAMs). In the first 
phase program counters of all processors that are not yet halted are increased to point to the 
following instruction. The second phase is a register read/write access phase and instruction 
execution phase. A read/write instruction will, on execution, prepare for shared memory 
access in the next two phases by loading the memory address register. All other instructions 
will update the contents of local registers. Finally, the third and the fourth phases are shared
memory read and write access phases respectively. 
Depending on whether simultaneous reads of the same shared memory cell are allowed, two
read models, the concurrent read model (CR), and the exclusive read model (ER) can be 
defined. 
Similarly, two memory write models exist: a concurrent write (CW) and an exclusive
write (EW). However, concurrent write models need further consideration. The result of two
or more processors trying to write simultaneously into the same shared memory cell c has to be
defined. There are many different write conflict resolution rules. Some of the most common 
ones are as follows. 
WEAK Simultaneously writing the value zero to c by two or more RAMs is allowed and the 
value zero is stored into the cell. Simultaneously writing any other value to the cell by 
two or more RAMs is forbidden and the execution of the PRAM ends in a write conflict 
if such a write is attempted. 
COMMON Simultaneously writing a common value to c by two or more RAMs is allowed 
and the common value is stored into the cell. Writing two or more different values 
simultaneously to c by two or more RAMs is forbidden and the execution of the PRAM 
ends in a write conflict if such a write is attempted. 
TOLERANT If two or more RAMs simultaneously try to write to c, then the value of the 
cell is not changed. The value of c is changed only if just one RAM is writing to it at 
the time. 
COLLISION If two or more RAMs simultaneously try to write to c, then a special collision 
symbol (COLL) is written to the cell, even if they are writing the same value. 
COLLISION+ If two or more RAMs simultaneously try to write two or more different 
values to c, then a special collision symbol (COLL) is written to the cell. The value of c 
is changed normally if the RAMs are writing the same value. 
ARBITRARY If two or more RAMs simultaneously try to write to c, then an arbitrarily 
chosen RAM writing to the cell succeeds in writing. There is no way to determine 
prior to the write which RAM will succeed, or determine after the write which RAM 
succeeded. 
PRIORITY If two or more RAMs simultaneously try to write to c, then the RAM with 
the smallest RAM identifier, i.e. with the smallest value of the SIG register, succeeds in
writing. In other words, RAM identifiers define an unequivocal and RAM-wise order of 
priority of RAMs so that the RAM with the smallest RAM identifier has the highest 
priority. 
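The seven resolution rules can be compared side by side when written as a single function. The Python sketch below is an informal paraphrase of the descriptions above, not the EUP specification; the function `resolve`, the `WriteConflict` exception and the representation of a write as a (SIG value, value) pair are all hypothetical, and RAM identifiers are assumed to be distinct.

```python
import random

COLL = "COLL"  # the special collision symbol


class WriteConflict(Exception):
    """Raised when the PRAM execution ends in a write conflict."""


def resolve(model, writes, old, rng=random):
    """Resolve >= 2 simultaneous writes to one cell.

    writes: list of (sig, value) pairs; old: current cell value; returns the new cell value.
    """
    values = [value for _, value in writes]
    same = len(set(values)) == 1
    if model == "WEAK":
        if all(value == 0 for value in values):
            return 0
        raise WriteConflict(model)
    if model == "COMMON":
        if same:
            return values[0]
        raise WriteConflict(model)
    if model == "TOLERANT":
        return old  # the cell is left unchanged
    if model == "COLLISION":
        return COLL  # even if all RAMs write the same value
    if model == "COLLISION+":
        return values[0] if same else COLL
    if model == "ARBITRARY":
        return rng.choice(values)  # no way to predict which RAM wins
    if model == "PRIORITY":
        return min(writes, key=lambda write: write[0])[1]  # smallest SIG wins
    raise ValueError(f"unknown model: {model}")


print(resolve("PRIORITY", [(2, 7), (1, 9)], old=5))
```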
2 2-PRAM memory models
The aim of this section is to show a compact and elegant specification of a 2-RAM PRAM 
using EUP. This specification will serve as the basis for an n-PRAM specification presented 
in the following section. The EUP RAM instruction set specification is shared by 2- and n- 
PRAM specifications, and can be found in section 4. 
As already mentioned in the previous section, the execution of an instruction on a PRAM 
machine is divided into four stages. The following sequential update scheme defines three of 
these stages. Phase two of the f/e cycle (the register read/write phase) is defined in terms of 
the instr archetype which is part of the definition of the pc archetype, i.e. when the instr
archetype is expanded, appropriate actions for (not only) phase two of the f/e cycle will be 
added to the specification. The pc, shr, and shw archetypes are defined later in this section. 
[2.1] FE|1 pc(1) pc(2)   # update PCs, block (stage 2) if neither P1 nor P2 is running
        |3 shr()         # read shared memory
        |4 shw().        # write shared memory
The pc archetype in 2.2 increases program counters on both RAMs. C is the number of
instructions in the program each of the RAMs is executing. A RAM having the value of its
program counter pc < 0 or pc ≥ C is halted. The constants P1 and P2 are used to distinguish
between processor (RAM) 1 and 2. Note the use of the symbol '-' to expand these constants.
For example, for p = 1, P-p expands to P-1, which is equivalent to P1. The 'empty' update
scheme in stage two of the pc archetype prevents unwanted blocking behaviour of the FE
sequence caused by the absence of the register access stage.
[2.2] pc(p) = FE|1 P-p:PC[pc] pc[instr(p)]qc =[ 0 ≤ pc < C ]=> P-p:PC[qc]   # running
                |2 ==> .                                                   # don't block
      pc(p) = P-p:PC[pc] =[ 0 > pc ∨ pc ≥ C ]=> .                          # not running
In order to make the specification more concise, operators ⊜ and ⊘ are introduced.
{a1 ⊜ a2 | a1 = a2 ∧ a1 ≠ NULL}
{a1 ⊘ a2 | a1 ≠ a2 ∨ a1 = NULL ∨ a2 = NULL}
Memory read conflicts occur only if the PRAM's memory read model (given by the memory 
read model register MR) is ER, and both RAMs are trying to read the same shared memory 
cell. The shr (shared memory read) archetype is defined to read values from shared memory 
to the processors' accumulators, and to act as a barrier to halt the entire ER PRAM should 
a read conflict occur. 
[2.3] shr() = PRAM:MR[mr] P1:MAR[a1] P2:MAR[a2] =[ a1 ⊜ a2 ∧ mr = ER ]=> P1:PC[C] P2:PC[C];
              ∥ ==> read(1, a1) read(2, a2).
If no read conflicts occur and the current instruction is a memory read instruction, a value 
is read from shared memory into a RAM's accumulator by the following archetype. 
[2.4] read(p, a) = a[v] =[ a ≠ NULL ]=> P-p:ACC[v];   # read value v from shared memory
                   ==> .                              # not a memory read instruction
The MAR on every RAM contains a special value NULL when the currently executed instruction 
doesn't access shared memory. 
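The behaviour of the shared memory read phase for two RAMs can be mimicked in a few lines of Python. This is a sketch under stated assumptions rather than a translation of 2.3 and 2.4: registers are modelled as dictionaries keyed by RAM number, NULL is modelled as `None`, and the names `conflict` and `shr` are hypothetical.

```python
NULL = None  # MAR value when the current instruction does not access shared memory


def conflict(a1, a2):
    """True when both RAMs address the same (non-NULL) shared memory cell."""
    return a1 == a2 and a1 is not NULL


def shr(mr, mar, acc, pc, memory, C):
    """Stage 3 for two RAMs; mar, acc and pc are dicts keyed by RAM number 1 and 2."""
    if mr == "ER" and conflict(mar[1], mar[2]):
        pc[1] = pc[2] = C  # writing C into a PC halts that RAM
        return
    for p in (1, 2):
        if mar[p] is not NULL:  # only memory read instructions load the accumulator
            acc[p] = memory[mar[p]]


memory = {100: 42, 101: 7}
acc, pc = {1: 0, 2: 0}, {1: 3, 2: 4}
shr("CR", {1: 100, 2: 100}, acc, pc, memory, C=10)  # concurrent read is allowed
acc_er, pc_er = {1: 0, 2: 0}, {1: 3, 2: 4}
shr("ER", {1: 100, 2: 100}, acc_er, pc_er, memory, C=10)  # exclusive read conflicts
print(acc, pc, acc_er, pc_er)
```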
The shw archetype performs a similar function to the shr archetype, but it writes values into shared
memory. The PRAM's memory write model is given by the memory write model register 
(MW). 
[2.5] shw() = PRAM:MW[mw] P1:MAR[a1] P2:MAR[a2] P1:ACC[v1] P2:ACC[v2] P1:SIG[s1] P2:SIG[s2]
              =[ a1 ⊜ a2 ∧ mw = EW ]=> P1:PC[C] P2:PC[C].
            = ∥ =[ a1 ⊜ a2 ∧ mw = CW-WEAK ∧ ((v1 = 0) ∧ (v2 = 0)) ]=> write(a1, 0).
                =[ a1 ⊜ a2 ∧ mw = CW-WEAK ∧ ((v1 ≠ 0) ∨ (v2 ≠ 0)) ]=> P1:PC[C] P2:PC[C].
                =[ a1 ⊜ a2 ∧ mw = CW-COMMON ∧ v1 = v2 ]=> write(a1, v1).
                =[ a1 ⊜ a2 ∧ mw = CW-COMMON ∧ v1 ≠ v2 ]=> P1:PC[C] P2:PC[C].
                =[ a1 ⊜ a2 ∧ mw = CW-TOLERANT ]=> .                          # value not changed
                =[ a1 ⊜ a2 ∧ mw = CW-COLLISION ]=> write(a1, COLL).
                =[ a1 ⊜ a2 ∧ mw = CW-COLLISIONP ∧ v1 = v2 ]=> write(a1, v1).
                =[ a1 ⊜ a2 ∧ mw = CW-COLLISIONP ∧ v1 ≠ v2 ]=> write(a1, COLL).
                =[ a1 ⊜ a2 ∧ mw = CW-ARBITRARY ]=> write(a1, v1).
                =[ a1 ⊜ a2 ∧ mw = CW-ARBITRARY ]=> write(a1, v2).
                =[ a1 ⊜ a2 ∧ mw = CW-PRIORITY ∧ s1 < s2 ]=> write(a1, v1).
                =[ a1 ⊜ a2 ∧ mw = CW-PRIORITY ∧ s1 ≥ s2 ]=> write(a1, v2).
                =[ a1 ⊘ a2 ]=> write(a1, v1) write(a2, v2).                  # no conflict
Note that writing the constant C into a processor's program counter register is effectively a HALT
instruction. 
If the addresses in MAR registers are not the same, there are clearly no write conflicts and 
both values are written to shared memory using the following archetype. 
[2.6] write(a, v) = =[ a ≠ NULL ]=> a[v];   # write value v into shared memory
                    ==> .                   # not a memory write instruction
3 n-PRAM memory models
This section gives a specification for an n-RAM PRAM. It is a relatively high-level specification
abstracting away implementation details and giving only two memory write models. The
complete EUP specification can be found in appendix E.
We first define the instruction cycle, as we did for the 2-PRAM. This time the PRAM is 
defined by the archetype pram (3.1), rather than just a sequential update scheme as for the 
2-PRAM (2.1). The number of processors and memory read/write models are parameters 
of the specification rather than of the machine configuration. Again, only three out of four 
stages are included in the FE sequence, as the second (execution) stage of the instruction 
execution will be expanded by the pc archetype which was defined in section 2. As will be 
explained later, the enforced blocking behaviour when no RAM is running is intentional. The 
pcs, shr and shw archetypes now take additional arguments. The variable n is the number 
of RAMs in the PRAM machine (n > 0), rm and wm are the memory read and write models 
respectively. For example, a 5-PRAM with concurrent read and common concurrent write 
would be specified by pram(5, CR, CW-COMMON). 
[3.1] pram(n, rm, wm) = FE|1 pcs(n)        # update PCs, block (stage 2) if no RAM is running
                          |3 shr(n, rm)    # shared memory reads
                          |4 shw(n, wm).   # shared memory writes
As mentioned at the start of this section, this specification is relatively abstract. It uses 
the 'limits' notation (e. g. in 3.2) to define 'multi-processor' archetypes, and the (multi)set 
notation (e. g. in 3.6) to simplify guards. This notation can easily be implemented as recursive 
archetypes as shown in appendix E. For example, the pcs archetype (which expresses the 
first stage of an instruction execution) can be defined using the limits notation as 
                n
[3.2] pcs(n) = pc(p).
               p=1
which is shorthand for 'pc(l) pc(2) ... pc(n). '. The same archetype in 3.3 is an example of a 
recursive implementation. It increases the PC on every running RAM p using the archetype 2.2 
from the previous section. If there are no running RAMs, execution is blocked (the FE sequence 
is only partially applicable), and the whole PRAM stops. 
[3.3] pcs(0) = .                                 # 0 RAMs
      pcs(p) = pc(p) pcs(p - 1) =[ p > 0 ]=> .
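The recursive shape of 3.3 carries over directly to an ordinary recursive function. The Python sketch below is an illustration under simplifying assumptions: every instruction is taken to occupy one address unit (so a running PC is simply incremented), program counters live in a dictionary keyed by RAM number, and the function returns the number of running RAMs so that a result of 0 stands for the blocked case. The signature `pcs(n, pc, C)` is hypothetical.

```python
def pcs(n, pc, C):
    """Advance the PC of every running RAM 1..n; return how many RAMs were running.

    Mirrors pcs(p) = pc(p) pcs(p - 1) with pcs(0) = . as the base case.
    """
    if n == 0:
        return 0  # pcs(0) = . : no RAMs left to update
    running = 0 <= pc[n] < C  # a RAM with pc < 0 or pc >= C is halted
    if running:
        pc[n] += 1  # assumption: unit-size instructions
    return int(running) + pcs(n - 1, pc, C)


pc = {1: 0, 2: 5, 3: -1}
advanced = pcs(3, pc, C=5)
halted = pcs(2, {1: 5, 2: 5}, C=5)  # no RAM running: the whole PRAM stops
print(advanced, pc, halted)
```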
The following halt archetype serves to stop a specific RAM and also the whole PRAM. 
This archetype is used by shr/shw archetypes in terminal conflict models when a read/write 
conflict occurs. 
                 n
[3.4] halt(p) = P-p:PC[C].   # halt(p) stops the whole PRAM
                p=1
The shr archetype checks simultaneous reads from shared memory. It uses set theory in 
order to discover read accesses to the same memory location. Addresses waiting to be read in 
shared memory are stored in each RAM's MAR register. The archetype shr checks a
RAM's (non-NULL) MAR register value against the values in all the other RAMs' MARs, and if it
finds a conflict, the entire PRAM is halted. The read archetype is defined in 2.4. The shrd 
archetype in 3.5 provides the address (ap) from the memory address register of processor p. 
[3.5] shrd(p, a) = P-p:MAR[a].
The notation ⟦a_p^¬NULL⟧ is the bag or multiset containing the elements a_p^¬NULL, where an element
a_p^¬NULL = a_p if and only if a_p ≠ NULL; otherwise it is equivalent to an empty/non-existent
element.
                    n             n
[3.6] shr(n, CR) = shrd(p, a_p)  read(p, a_p).
                   p=1           p=1
                    n             n
      shr(n, ER) = shrd(p, a_p)  halt(p)       =[ |⟦a_p^¬NULL⟧| ≠ |{a_p^¬NULL}| ]=> .   # read conflict
                   p=1           p=1
                    n             n
                 = shrd(p, a_p)  read(p, a_p)  =[ |⟦a_p^¬NULL⟧| = |{a_p^¬NULL}| ]=> .   # no read conflict
                   p=1           p=1
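Detecting a read conflict among n RAMs amounts to asking whether any non-NULL address occurs more than once in the bag of MAR values. The Python sketch below illustrates this with the standard `collections.Counter` as the multiset; it is not a rendering of the guards in 3.6, NULL is modelled as `None`, and the function name `read_conflict` is hypothetical.

```python
from collections import Counter


def read_conflict(mars):
    """True if some non-NULL address occurs in more than one MAR.

    mars: one MAR value per RAM, with None standing for NULL.
    """
    bag = Counter(a for a in mars if a is not None)  # NULL entries are dropped from the bag
    return any(count > 1 for count in bag.values())


print(read_conflict([100, None, 101, 100]), read_conflict([100, None, None, 101]))
```

Equivalently, a conflict exists exactly when the bag of non-NULL addresses has more elements than the set of distinct non-NULL addresses.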
A similar approach will be used to determine shared memory write conflicts, but simple 
detection of the fact that one or more write conflicts are due to occur is not sufficient. Different
actions need to be taken in different memory models. 
The number of RAMs (n) and the memory write model are now parameters of shw archetypes.
The write archetype is defined in 2.6. The shwr archetype reads the address (ap) from
the memory address register of processor p and the value (vp) to be written at this address 
in shared memory. 
[3.7] shwr(p, a, v) = P-p:MAR[a] P-p:ACC[v].
An archetype is defined for each memory write model. Only the exclusive write and the 
weak concurrent write models are described in detail here. The full specification is in
appendix E.
                    n                  n
[3.8] shw(n, EW) = shwr(p, a_p, -)    halt(p)           =[ |⟦a_p^¬NULL⟧| ≠ |{a_p^¬NULL}| ]=> .   # write conflict
                   p=1                p=1
                    n                  n
                 = shwr(p, a_p, v_p)  write(a_p, v_p)   =[ |⟦a_p^¬NULL⟧| = |{a_p^¬NULL}| ]=> .   # no write conflict
                   p=1                p=1
                         n                  n
      shw(n, CW-WEAK) = shwr(p, a_p, v_p)  halt(p)                                                # write conflict
                        p=1                p=1
          =[ ∃(a_p, v_p), (a_q, v_q) ∈ ⟦(a_p^¬NULL, v_p)⟧ · p ≠ q ∧ a_p = a_q ∧ (v_p ≠ 0 ∨ v_q ≠ 0) ]=> .
                         n                  n
                      = shwr(p, a_p, v_p)  write(a_p, v_p)                                        # 0 or no write conflict
                        p=1                p=1
          =[ ∀(a_p, v_p), (a_q, v_q) ∈ ⟦(a_p^¬NULL, v_p)⟧ · a_p = a_q ⇒ (v_p = v_q = 0) ]=> .
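The two write models described in detail can be contrasted in a short Python sketch. It is an informal model under stated assumptions, not the archetypes of 3.8: each RAM contributes an (address, value) request with `None` standing for a NULL address, shared memory is a dictionary, and a return value of `False` stands for halting the whole PRAM. The function name `shw` is reused only for readability.

```python
from collections import defaultdict


def shw(model, requests, memory):
    """Apply one write phase for model "EW" or "CW-WEAK".

    requests: list of (address or None, value), one per RAM.
    Returns False (memory untouched) if the PRAM halts on a write conflict, True otherwise.
    """
    by_addr = defaultdict(list)
    for address, value in requests:
        if address is not None:  # NULL address: not a memory write instruction
            by_addr[address].append(value)
    for values in by_addr.values():
        shared = len(values) > 1
        # EW forbids any shared address; CW-WEAK tolerates it only if every value is zero
        if shared and (model == "EW" or any(value != 0 for value in values)):
            return False
    for address, values in by_addr.items():
        memory[address] = values[0]  # all values agree (zero) whenever the cell is shared
    return True


mem = {}
ok_ew = shw("EW", [(10, 1), (None, 0), (11, 2)], mem)
mem_weak = {10: 9}
ok_weak = shw("CW-WEAK", [(10, 0), (10, 0)], mem_weak)
mem_bad = {10: 9}
ok_bad = shw("CW-WEAK", [(10, 0), (10, 3)], mem_bad)
print(ok_ew, mem, ok_weak, mem_weak, ok_bad, mem_bad)
```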
4 Instructions
4.1 Addressing modes 
4.1.1 Implicit modes 
There are three implicit addressing modes on the PRAM, i.e. addressing modes that are
implicit in the opcode, rather than being explicitly provided as an operand. Access to these 
addressing modes always takes place in the register read/write phase. The three implicit 
addressing modes are accumulator addressing, 
[4.1] accr(p, v) = FE 2 P-p: ACC[v]. # accumulator read
      accw(p, v) = FE 2 ==> P-p: ACC[v]. # accumulator write
signature register addressing 
[4.2] sigr(p, v) = FE 2 P-p: SIG[v]. # signature register read
      sigw(p, v) = FE 2 ==> P-p: SIG[v]. # signature register write
and program counter addressing. 
[4.3] pcr(p, v) = FE 2 P-p: PC[v]. # program counter read
      pcw(p, v) = FE 2 ==> P-p: PC[v]. # program counter write
4.1.2 Register modes 
We first define two 'textless' register addressing archetypes that can be used to read/write 
values from/to local registers. 
[4.4] regr(p, r, v) = FE 2 P-p: r[v]. # local register read
      regw(p, r, v) = FE 2 ==> P-p: r[v]. # local register write
These archetypes can, among other things, be used to define the operand register addressing 
modes. These are direct register addressing mode 
[4.5] DR(p, r, v) r = regr(p, r, v). # direct register (read)
and indirect register addressing mode. 
[4.6] IR(p, ri, v) r = regr(p, r, ri) regr(p, ri, v). # indirect register (read)
Note that the archetypes in 4.5 and 4.6 are command archetypes that generate the
addressing mode mnemonic for a PRAM instruction. These two addressing modes are known as
register modes.
[4.7] rm(p, a, v) = DR(p, a, v) # register modes
                  = IR(p, a, v).
Parameter p is the processor number, a is the effective (register) address, and v is the
value accessed. Register modes are the only addressing modes that may be used by a STORE
instruction (the STORE instruction must have an effective (register) address in which to store
the value; see section 4.3).
4.1.3 Immediate/Register modes 
Most of the remaining instructions (jump instructions, arithmetic/logical instructions, the 
LOAD instruction) may also take an immediate value. Adding this to the register modes 
defines the set of immediate/register modes. 
[4.8] irm(p, v) = IMM v # immediate mode and register modes
                = rm(p, -, v).
4.1.4 Memory Modes 
Memory read and memory write accesses can be direct or indirect. A direct memory read is 
defined by 
[4.9] DMR(p) m = FE 2 ==> P-p: MAR[m] # direct memory read
                    3 ==> P-p: MAR[NULL].
The MAR is loaded with the address in phase two (register read/write), and cleared in phase 
three (memory read). 
The indirect memory read is 
[4.10] IMR(p) r = FE 2 P-p: r[mi] ==> P-p: MAR[mi] # indirect memory read
                     3 ==> P-p: MAR[NULL].
The addressing modes in 4.9 and 4.10 are combined to form a set of memory read modes
for use with the READ instruction (see section 4.5).
[4.11] mrm(p) = DMR(p)
              = IMR(p).
The memory write modes are defined similarly. 
[4.12] DMW(p) m = FE 3 ==> P-p: MAR[m] # direct memory write
                     4 ==> P-p: MAR[NULL].
       IMW(p) r = FE 2 P-p: r[mi] # indirect memory write
                     3 ==> P-p: MAR[mi]
                     4 ==> P-p: MAR[NULL].
These two addressing modes form the memory write mode set. 
[4.13] mwm(p) = DMW(p)
              = IMW(p).
A read from shared memory will take place in the memory read phase (phase three), and a 
write to shared memory will take place in the memory write phase (phase four). A processor 
will read from memory if and only if its MAR contains a non-NULL address at the onset of phase 
three, and write to memory if and only if its MAR contains a non-NULL address at the onset 
of phase four. The mrm and mwm addressing modes are designed to load the processors' MARs 
with non-NULL addresses at the correct stage of the f/e cycle. 
4.2 Accumulator loading instructions 
There are two groups of (local) accumulator loading instructions: the arithmetic/logic
instructions and the load instructions. We will first define the way in which these instructions
access the value to be written to the accumulator.
Arithmetic instructions 
All arithmetic/logical instructions use only (local) registers (the irm addressing mode set),
and hence take place wholly in phase two (the register read/write phase).
Arithmetic/logical operators can be distinguished as being binary operators 
[4.14] binary(x, y, x + y) = ADD. # addition
       binary(x, y, x - y) = SUB. # subtraction
       binary(x, y, x × y) = MUL. # multiplication
       binary(x, y, x / y) = DIV. # division
       binary(x, y, x % y) = MOD. # modulo
       binary(x, y, x << y) = SHIFT. # shift left
       binary(x, y, x & y) = AND. # bitwise and
       binary(x, y, x | y) = OR. # bitwise or
       binary(x, y, x ^ y) = XOR. # bitwise xor
or monadic operators
[4.15] monadic(x, log(x)) = LOG. # logarithm
       monadic(x, not(x)) = NOT. # bitwise not
Both binary and monadic operators access one argument from the instruction (immedi- 
ate/register addressed) operand. Binary operators access their other (first) argument from 
the processor's accumulator. 
[4.16] arlog(p, r) = binary(x, y, r) irm(p, y) accr(p, x) 
= monadic(y, r) irm(p, y). 
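How an arithmetic/logical opcode selects an operator and its arguments can be sketched in Python. The sketch makes several assumptions that the specification does not state: values are unbounded Python integers, DIV is integer division, and LOG is omitted because the base of the logarithm is not given here. The function name `arlog` merely echoes the archetype; it is not a translation of it.

```python
import operator

# Binary operators take the accumulator as first argument (x)
# and the instruction operand as second argument (y).
BINARY = {
    "ADD": operator.add,
    "SUB": operator.sub,
    "MUL": operator.mul,
    "DIV": operator.floordiv,   # assumption: integer division
    "MOD": operator.mod,
    "SHIFT": operator.lshift,
    "AND": operator.and_,
    "OR": operator.or_,
    "XOR": operator.xor,
}

# Monadic operators use only the instruction operand (LOG omitted, see above).
MONADIC = {
    "NOT": operator.invert,
}


def arlog(opcode, acc, operand):
    """Return the value that would be written back to the accumulator."""
    if opcode in BINARY:
        return BINARY[opcode](acc, operand)
    if opcode in MONADIC:
        return MONADIC[opcode](operand)
    raise ValueError(f"unknown arithmetic/logical opcode: {opcode}")
```

For instance, `arlog("SUB", 7, 2)` subtracts the operand from the accumulator, while `arlog("NOT", 7, 0)` ignores the accumulator and negates the operand bitwise.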
Load instructions 
There are three load instructions on the PRAM. They all load the processor's accumulator 
with a value. We will first define the value access and corresponding opcode. Note that 
LOADINDEX and LOADPC are zero operand instructions. 
[4.17] load(p, v) = LOAD irm(p, v)
                  = LOADINDEX sigr(p, v)
                  = LOADPC pcr(p, v).
Writing the result 
We now combine these two sets of instructions and define the write to the accumulator. 
[4.18] toacc(p, v) = arlog(p, v) ==> accw(p, v)
                   = load(p, v) ==> accw(p, v).
4.3 General purpose register loading instructions 
The store instruction writes the contents of the accumulator to a local register. The following 
update scheme defines the value and effective address access. 
[4.19] store(p, r, v) = STORE rm(p, r, -) accr(p, v).
The write to the register is defined by 
[4.20] toreg(p, r, v) = store(p, r, v) ==> regw(p, r, v).
4.4 Program counter loading instructions 
The jump family of instructions will load a processor's program counter. The value to be 
loaded is defined by 
[4.21] jump(p, v > 0, a) = JPOS irm(p, a) accr(p, v).
       jump(p, v = 0, a) = JZERO irm(p, a) accr(p, v).
       jump(p, TRUE, a) = JUMP irm(p, a). # unconditional jump
       jump(-, TRUE, C) = HALT.
The program counter is only updated if the condition in the second parameter of jump is 
TRUE. 
[4.22] topc(p) = jump(p, cond, a) =[ cond ]=> pcw(p, a)
               = jump(p, cond, a) =[ ¬cond ]=> .
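The conditional update of the program counter can be illustrated by a short Python sketch. It assumes that the accumulator value and the jump target are plain integers, and it leaves out HALT; the names `CONDITIONS` and `topc` are used for illustration only.

```python
# Jump condition evaluated on the accumulator value v for each opcode.
CONDITIONS = {
    "JPOS": lambda v: v > 0,    # jump if the accumulator is positive
    "JZERO": lambda v: v == 0,  # jump if the accumulator is zero
    "JUMP": lambda v: True,     # unconditional jump
}


def topc(opcode, acc, target, pc):
    """Return the program counter after the instruction: the target address
    if the condition holds, otherwise the unchanged program counter."""
    return target if CONDITIONS[opcode](acc) else pc
```

When the condition is false the function returns `pc` unchanged, mirroring the second alternative of the archetype, whose update part is empty.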
4.5 Memory read instructions 
The READ instruction moves a value from shared memory into a RAM's accumulator. As
mentioned earlier, the values are read from shared memory in phase three of the f/e cycle by
the shr archetype, which checks for memory read conflicts. The read archetype only generates
the instruction and addressing mode mnemonic, and moves the shared memory address into the
MAR for the shr archetype.
[4.23] read(p) = READ mrm(p). 
4.6 Memory write instructions 
The WRITE instruction stores values from a RAM's accumulator into shared memory. Similarly 
to the READ instruction, the actual write is done by the shw archetype, which checks and 
resolves memory write conflicts. 
[4.24] write(p) = WRITE mwm(p).
4.7 The instruction set 
Finally, all five types of instructions are added to the instruction set.
[4.25] instr(p) = toacc(p, -)
                = toreg(p, -, -)
                = topc(p)
                = read(p)
                = write(p).
Conclusion 
Although [61] contains a specification of a parallel machine, this was only of pipelining in a
relatively simple RISC processor. This specification, on the other hand, is a simple
demonstration that a massively parallel system can easily be formally described with EUP. It could
also be relatively easily adapted to describe more recent PRAM models such as [76].
While a similar specification of the PRAM machine would be possible using basic Update
Plans, this could only be done at a more concrete level of description by enforcing sequential
behaviour through constructs unspecified by the PRAM model, i.e. effectively preempting
design decisions about the implementation of parallelism which the PRAM model was
developed to avoid. The Extended Update Plan specification preserves the level of abstraction of
the PRAM model, with the shr and shw archetypes containing all the characteristics of the
various access models.
In addition, the Extended Update Plan specification encapsulates most of the PRAM
instructions in single sequential archetype definitions (corresponding to the descriptions with
which a user would be familiar), even though the effects of these instructions take place across
several clock cycles. This greatly simplifies the task of verification between multiple levels of
specification, as a direct correspondence between these levels can easily be found.
Chapter 8 
Other Methods 
This chapter provides an overview of the most frequently used formal methods in the area
of formal specification of hardware architectures. It is almost impossible to provide a
complete list of methods that serve this purpose. For a fuller picture the reader is referred
to comprehensive surveys in [18, 29, 42, 52, 68, 80], where methods for specification and
verification, not only of computer hardware, have been described.
A group of specification methods (e.g. [31]) which is not considered in this chapter is
based on process algebras. Although they are based on rigorous, well developed mathematical
theories providing a variety of techniques for proving and verifying properties, this approach
is still in its infancy and the readability of such specifications is generally poor.
Several examples that demonstrate most of the formalisms and compare them to UP are 
given. Although the examples are too basic to exercise a wide range of language features, or to 
give a true test of ease of expressibility, they should provide some insight into the formalisms. 
1 Specification methods 
It is difficult to separate various formalisms into distinct classes as some of them overlap 
and every generalisation is bound to introduce some inaccuracies. Nevertheless, they will be 
categorised here by their area of predominant use. 
1.1 Hardware 
Hardware Description Languages (HDLs) have been used in the industry since the 1960s to 
document and simulate designs, mainly at the circuitry level of machine architectures. The 
most widely used HDLs are VHDL, Verilog and ELLA. 
Verilog
module half_adder(a, b, s, c);
  input a, b;
  output s, c;
  xor g1 (s, a, b);  // gate primitives list the output terminal first
  and g2 (c, a, b);
endmodule
Update Plans
half_adder(a, b, s, c) = xor(a, b, s) and(a, b, c).
Figure 8.1: Verilog vs. UP
1.1.1 VHDL 
VHDL [3, 36] stands for VHSIC (Very High Speed Integrated Circuit) HDL, an international
IEEE standard (1987) specification language for describing digital hardware.
Each VHDL description contains three main parts: the ENTITY section describing the 
entity interface, the ARCHITECTURE section of which there may be several instances for a 
particular entity and the CONFIGURATION section which defines the particular architecture to 
be used for an entity by its environment. 
A rich set of tools has been written for VHDL to aid the development, synthesis,
testing and verification of hardware designs. There are also IEEE standard libraries for some
pre-defined components.
Several methods have been developed to translate a VHDL subset into formats suitable 
for formal verification by both model checking and theorem proving [49,71]. However, as 
VHDL semantics (in contrast to UP) is not formally defined, any such translations must be 
taken as only 'provisional'. 
1.1.2 Verilog 
Verilog HDL [73] and VHDL are essentially identical in function; however, Verilog is simpler
(less general) than VHDL, and syntactically different. Its programming constructs are based
on C, while those of VHDL are based on Ada. Verilog became an IEEE standard in 1995, eight
years after VHDL. Similarly to VHDL, Verilog has a large library of predefined components.
Figure 8.1 shows a structural specification of a half adder in both Verilog and the UP
formalism. An example of a sequential logic circuit is given in figure 8.2, now comparing
Verilog to Extended UP as described in chapter 6.
The attempts [28] to formalise Verilog in order to facilitate formal verification are (in 
common with VHDL) still in very early stages. Unlike UP, Verilog is (in common with 
VHDL and ELLA) a deterministic language, so non-determinism must be emulated. 
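The structural composition used in figures 8.1 and 8.2 (a full adder built from two half adders and an OR gate) can be checked with a small Python sketch. Bits are assumed to be the integers 0 and 1, and the wire names `w1`, `w2` and `w3` follow the figures; the Python functions themselves are illustrative only.

```python
def half_adder(a, b):
    """Return (sum, carry) of two bits, using an XOR and an AND gate."""
    return a ^ b, a & b


def full_adder(a, b, i):
    """Return (sum, carry-out) of two bits and a carry-in,
    wired as two half adders followed by an OR gate."""
    w1, w2 = half_adder(a, b)    # partial sum and first carry
    s, w3 = half_adder(w1, i)    # final sum and second carry
    return s, w2 | w3            # carry-out is the OR of both carries
```

Enumerating all eight input combinations confirms that `s + 2 * o` equals `a + b + i` for the outputs `(s, o)` of `full_adder`.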
Verilog
module full_adder(a, b, i, s, o);
  input a, b, i;
  output s, o;
  wire w1, w2, w3;
  half_adder g1 (a, b, w1, w2);
  half_adder g2 (w1, i, s, w3);
  or g3 (o, w2, w3);  // gate primitives list the output terminal first
endmodule
Extended Update Plans
full_adder(a, b, i, s, o) =
  half_adder(a, b, w1, w2)
  half_adder(w1, i, s, w3)
  or(w2, w3, o).
Figure 8.2: Verilog vs. EUP
ELLA
FN MUX = (bit: c i1 i2) -> bit:
  CASE c
  OF hi: i1,
     lo: i2
  ESAC.
Update Plans
mux(c, i1, i2, i1) = c[HI].
mux(c, i1, i2, i2) = c[LO].
Figure 8.3: ELLA vs. UP
1.1.3 ELLA 
ELLA [57] has been developed for circuit design, with the aim of supporting automatic syn-
thesis from high-level behavioural descriptions to low-level structural descriptions. ELLA is 
a parallel language and describes circuit behaviour by defining nodes, connections and signal 
flows in that circuit. 
One of the marked differences between ELLA and VHDL is that in VHDL the structural, 
behavioural and procedural design descriptions must be separate, whereas in ELLA they can 
be freely mixed. Also, ELLA is primarily a functional language, whereas VHDL relies on 
state transitions. It is not only the regional use¹ and features described in this section that
separate VHDL and ELLA. For a fuller comparison between VHDL and ELLA the reader is 
referred to [79]. 
As with other HDLs, though a complete formal semantics for ELLA does not currently
exist, various verification strategies (based on model checking and theorem proving) for a
subset of this specification language have been developed [5, 6, 12].
Two simple examples are provided to compare the specification capabilities of ELLA and
(E)UP. The first one (figure 8.3) is a two-bit multiplexer, the second one (figure 8.4) is a
sequential parity checker. Note the clear arrangement of the parity checker modules in the
EUP definition, which makes the layout and the connections between the modules immediately
obvious.
¹ VHDL is the standard language for the US DoD; ELLA is likely to become its equivalent for the UK MoD.
ELLA
FN PARITY_IMP = (bit: in) -> bit:
BEGIN
  MAKE INV: l1,
       MUX: l3 out,
       REG: l2 l4.
  JOIN (in, l1, l2) -> l3,
       hi -> l4,
       (l4, l3, hi) -> out,
       out -> l2,
       l2 -> l1.
  OUTPUT out
END.
Extended Update Plans
parity_imp(in, out) =
  inv(l2, l1)
  mux(in, l1, l2, l3)
  reg(HI, l4)
  mux(l4, l3, HI, out)
  reg(out, l2).
Figure 8.4: ELLA vs. EUP
1.2 Concrete machines and instruction sets 
1.2.1 RTL, ISPS 
Historically, the best known formalisms specifically aimed at the description of instruction sets 
are register transfer languages (RTL) and Instruction Set Processor Specification (ISPS) [69]. 
There is a variety of slightly differing RTL notations for describing the workings of com- 
puters at the register level. They all, however, semi-formally describe behaviour of computers 
as stepwise transformations on register contents, where variables correspond to hardware 
registers, using some FORTRAN, PASCAL or C constructs. 
ISPS has its origins in the Instruction Set Processor (ISP) notation and has been
frequently used in the past as a design tool to cover a wide area of applications. One of the
most significant applications is the PDP-11 specification [20]. However, the specification was
ambiguous and various implementations of the PDP-11 machine exhibited differing behaviour
of several instructions². The formal semantics of the Update Plans formalism ensures that only
unambiguous specifications are written.
1.2.2 RAPIDE 
RAPIDE is a high level, event-based, concurrent, object-oriented specification and simulation 
language. It was designed for prototyping system architectures. Instruction sets are modelled 
by communicating modules forming an 'architecture'. The formalism started as an effort to 
complement often over-specific and hard to understand HDL descriptions. 
Unfortunately, RAPIDE lacks a formal semantics, which makes any rigorous formal
verification methodology impossible. The only way to verify properties of a system under description
² For example, the MOV Rn, (Rn)+ instruction resulted in Rn having either the original value or the original
value+2 depending on the processor.
RAPIDE
type Producer is interface
  action out Send(data: integer);
  action in Ready();
behavior
  function GenData() return integer;
begin
  Start => Send(GenData());;
  Ready => Send(GenData());;
constraint
  observe from Start, Send, Ready
  match (Start -> Send) -> [* rel ~]
        (Ready -> Send);
end
end Producer;
Update Plans
RECEIVED[0] START[1] READY[0].
# Producer
START[s] READY[v] =[ s ∨ v ]=>
  SEND[gendata()] START[0] READY[0].
# Consumer
RECEIVED[r] SEND[data] READY[0] ==>
  RECEIVED[r + 1] READY[1] r[data].
Figure 8.5: RAPIDE vs. UP
is traditional simulation. The authors [50, 66] argue that the strength of the RAPIDE
formalism lies in partially ordered sets of events (posets) which are generated by executing a
RAPIDE model. Where other concurrent event-based simulation languages produce linear
traces of events, RAPIDE simulation shows dependency between events. However, the same
effect (building posets) could easily be achieved by a UP simulator by annotating special
"communication cells".
The power of Update Plans to describe concrete data manipulations in a parallel
environment is demonstrated in figure 8.5, which shows the classic "producer/consumer" problem.
The very first 'Send' is initiated by 'Start' and subsequent 'Send' signals causally follow
'Ready' (acknowledgement) signals. The RAPIDE specification of the 'Consumer' interface is
not shown here for the sake of brevity, as its description is even longer than that of 'Producer'.
On the other hand, the UP specifications of both the 'Producer' and 'Consumer' processes
are included. The 'Consumer' simply stores data sent by the 'Producer' into a buffer.
1.3 Parallelism 
1.3.1 UNITY 
The UNITY formalism [16] consists of two parts: a programming language based on transition
systems, and a specification language based on a linear temporal logic. Similarly to other
formalisms, such as UP and CHAM, the UNITY computational model is liberated from
control management.
A minimal UNITY program consists of three parts: a collection of variable declarations 
called the 'declare' section, a set of initial conditions called the 'initially' section, and a 
UNITY
program TrafficLight
declare
  ns, ew : {red, grn, yel}
initially
  ns, ew = red, red
assign
  ns := grn if (ns = red) ∧ (ew = red)
  ns := yel if (ns = grn)
  ns := red if (ns = yel)
  ew := grn if (ns = red) ∧ (ew = red)
  ew := yel if (ew = grn)
  ew := red if (ew = yel)
end TrafficLight
Update Plans
EW[RED] NS[RED].
NS[RED] EW[RED] ==> NS[GRN].
NS[GRN] ==> NS[YEL].
NS[YEL] ==> NS[RED].
NS[RED] EW[RED] ==> EW[GRN].
EW[GRN] ==> EW[YEL].
EW[YEL] ==> EW[RED].
Figure 8.6: UNITY vs. UP program for a traffic light controller.
finite set of statements in the 'assign' section. The standard UNITY execution model is a 
non-deterministic, fair interleaved selection of all statements from the 'assign' section. 
An example of a program for a traffic light controller in both UNITY and UP is given in
figure 8.6. Although very similar to UP, UNITY lacks an appropriate abstraction mechanism
and is more suited to the description of concurrent programs than of architectures. Also,
as opposed to Extended UP, as described in the second part of this thesis, there is no concept
of a sequence, which is often needed to describe hardware architectures.
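The UNITY execution model applied to the traffic light controller of figure 8.6 can be mimicked by a short Python sketch. It is a simulation under stated assumptions rather than an implementation of UNITY: fair non-deterministic selection is approximated by a seeded pseudo-random choice, and the names `STATEMENTS` and `run` are hypothetical.

```python
import random

# Each guarded assignment of the 'assign' section: (variable, new value, guard).
STATEMENTS = [
    ("ns", "grn", lambda s: s["ns"] == "red" and s["ew"] == "red"),
    ("ns", "yel", lambda s: s["ns"] == "grn"),
    ("ns", "red", lambda s: s["ns"] == "yel"),
    ("ew", "grn", lambda s: s["ns"] == "red" and s["ew"] == "red"),
    ("ew", "yel", lambda s: s["ew"] == "grn"),
    ("ew", "red", lambda s: s["ew"] == "yel"),
]


def run(steps, seed=0):
    """Execute `steps` randomly selected statements and return the trace of states."""
    rng = random.Random(seed)
    state = {"ns": "red", "ew": "red"}      # the 'initially' section
    trace = [dict(state)]
    for _ in range(steps):
        var, value, guard = rng.choice(STATEMENTS)
        if guard(state):                    # a statement whose guard fails is a no-op
            state[var] = value
        trace.append(dict(state))
    return trace
```

In every state of such a trace at least one of the two lights is red, because a light can only leave red while the other one is red.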
1.3.2 Γ, CHAM
The Chemical Abstract Machine (CHAM) model of computation [9, 10, 37] is fashioned on
chemicals and chemical reactions. The model is built upon the Γ language [4] for parallel
programming. A Γ computation is a set of reactions that consume elements of a multiset and
produce new ones according to the rules that constitute the program. As reactions can take
place in any order (or even simultaneously), the model is inherently parallel, in common with
other formalisms such as UP and Petri Nets.
The Γ language is extended by the CHAM formalism through the provision of a classification
scheme for reaction rules and a membrane construct which extends the use of multisets in such
a way that they can form parts of molecules. The second extension allows the formalism to
deal with abstraction and hierarchical programming, as a membrane can be porous to allow
communication between an encapsulated solution (multisets of molecules) and its
environment. Perhaps an overused example, but one which effectively demonstrates the philosophy
of CHAM, is that a solution originally made of all integers from 2 to n, along with a rule that
any integer destroys its multiples, will result in a solution of the prime numbers between 2 and n.
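The prime number example can be reproduced with a minimal Python sketch of a reaction loop. The sketch is sequential and scans the solution in a fixed order, which is only one of the many reaction orders the model allows; the function name `gamma_primes` is hypothetical.

```python
def gamma_primes(n):
    """Start from the solution {2, ..., n} and apply the reaction
    'x destroys any other element that is a multiple of x' until no reaction applies."""
    solution = list(range(2, n + 1))
    reacted = True
    while reacted:
        reacted = False
        for x in solution:
            for y in solution:
                if x != y and y % x == 0:
                    solution.remove(y)   # the multiple y is consumed, x survives
                    reacted = True
                    break
            if reacted:
                break                    # rescan the changed solution from the start
    return sorted(solution)
```

The result does not depend on the order in which reactions fire: a prime is never a multiple of another element, and every composite number keeps at least one prime divisor in the solution until it is destroyed.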
CHAM, as opposed to UP, is predominantly aimed at formally specifying and analysing 
software architectures. In contrast to CHAM, UP specifications are also reasonably simple to
implement on concrete machines.
1.4 Protocols 
There are many competing, well established protocol description languages/environments. 
The main contributors are SDL, Estelle and LOTOS. 
SDL stands for Specification and Description Language [15]. It is an object-oriented,
formal language standardised by the International Telecommunications Union,
Telecommunications Standardization Sector (formerly CCITT). It was developed to describe real-time and
distributed communicating systems. The major drawback in using SDL for protocol
specification, however, is that both the graphical and textual protocol specification descriptions
tend to be large, and therefore difficult to understand and maintain.
Estelle (Extended State Transition Language) [38] is an ISO standardised, partly for- 
malised technique for the specification of distributed and concurrent processing systems based 
on an extended state transition model (non-deterministic finite state machine augmented by 
the addition of variables). Estelle has been successfully used on the specification and analysis 
of many real protocols. 
LOTOS and especially DILL deserve further attention as they have been used for the
description of hardware, specifically sequential digital circuits.
1.4.1 LOTOS, DILL 
LOTOS (Language Of Temporal Ordering Specification) [39] is a by-product of the effort of
standardisation of the Open Systems Interconnection (OSI) within ISO. It is a standardised
formal description technique designed to describe distributed concurrent information
processing systems, in particular the OSI architecture and the related standards.
DILL (Digital Logic in LOTOS) [33,34] uses LOTOS to formally specify digital hardware 
making use of a library of typical components and is realised through translation into LOTOS. 
The analysis and verification of properties takes place at the LOTOS level. 
For a comparison between DILL and UP see figure 8.7. The basic component of a logic
circuit (a NAND gate) is described using (only) a behavioural style in DILL, and both the
behavioural and structural styles in UP. It turns out that great care must be taken to avoid
non-deterministic behaviour of components when writing DILL specifications. Even if such
care has been taken there is another problem: the DILL model is not suitable for circuits
containing cyclic connections. Although the NAND gate is modelled in the manner "when
all inputs arrive, then output happens" as in UP, the DILL specification will deadlock [33] if
there is a cyclic connection within each stage.
Another disadvantage of this approach is that DILL only has a limited set of components 
in the hardware library and the construction of new components requires knowledge of both 
DILL
process Nand2 [Ip1, Ip2, Op] : noexit :=
  (Ip1 ? dtIp1 : Bit; exit (dtIp1, any Bit)
   |||
   Ip2 ? dtIp2 : Bit; exit (any Bit, dtIp2))
  >> accept dtIp1, dtIp2 : Bit in
  (Op ! (dtIp1 nand dtIp2);
   Nand2 [Ip1, Ip2, Op])
endproc (* Nand2 *)
Update Plans (behavioural description)
ip1, ip2 :: Bit.
nand(ip1, ip2, ¬(ip1 & ip2)).
Update Plans (structural description)
nand(ip1, ip2, op) = ip1[0] ip2[0] ==> op[1].
                   = ip1[0] ip2[1] ==> op[1].
                   = ip1[1] ip2[0] ==> op[1].
                   = ip1[1] ip2[1] ==> op[0].
Figure 8.7: DILL vs. UP modelling a NAND gate
DILL and LOTOS. However, it seems to be convenient for giving higher-level architecture 
specifications. 
1.5 Z/VDM 
The list of formal methods would not be complete without at least mentioning the pioneering 
Z notation and the Vienna Development Method (VDM) [32] both of which still have a large 
user base. Z and VDM are based on set theory and first order predicate calculus. ISO 
standards exist for both of them. 
Z is a non-executable specification language developed mainly by the Programming
Research Group at the Oxford University Computing Laboratory from the late 1970s. The most
noticeable difference between Z and VDM is the use in Z of structures called schemas, by which
programs are described.
VDM, being a method rather than just a notation for expressing software specification
design and development, also contains an inference system for constructing correctness proofs
and a methodology for developing software from a specification in a formally verifiable manner.
While different in syntax and structure, Z and VDM do not differ radically from one 
another. Unfortunately, neither Z nor VDM can handle concurrent systems [78]. 
Although there has been some work [24] in describing computer architectures using both
these methods, they are used primarily for software requirements specification and program
development.
Verification 
While formal specification of a system often forms an important part of its design, it is 
sometimes not sufficient and needs to be complemented by some sort of a verification strategy. 
This section gives a short overview of different verification techniques. 
The traditional method of trying to ensure correctness is through simulation and testing. 
However, as Moore's prediction³ turned out to be very accurate, these techniques have their
limitations: with the ever-increasing complexity of hardware it is impossible to simulate all
inputs or sequences of inputs. Formal techniques, on the other hand, are better able to scale
with complexity by using various mathematical methodologies rather than exhaustive testing.
While there have been some attempts at the "ideal solution" of correct-by-construction
synthesis [64], designers are still using hand-crafted custom design [44] in an effort to optimise
the performance of systems, which necessitates some post-design verification.
Formal post-design verification involves the use of analytical methods to prove that the 
implementation of a system conforms to the specification. Formal proofs are based on estab- 
lishing that universal properties about the design hold independently of any particular set of 
inputs, or on showing the equivalence between several layers of a system specification with 
differing degree of abstraction. Verification methods can be classified into two major groups: 
model-checking and theorem-proving, based on these criteria. 
In model-checking, the implementation description (model) is given as (or transformed to)
a finite state machine (FSM), and the specification description by properties given in some
kind of temporal logic. Correctness is then established by showing whether some property
holds in the FSM model; if it does not, a counterexample is provided. Model-checking tools
are fully automated, and perform exhaustive searches through the state space of the model.
Unfortunately, this method is not scalable to larger circuits due to the state explosion problem.
Various methodologies have been developed to alleviate the state explosion [17], usually by
treating the sets of states symbolically or replacing the system to be checked by a simpler one
in which irrelevant details are suppressed.
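The exhaustive search performed by an explicit-state model checker can be sketched in Python for the simplest kind of property, an invariant that must hold in every reachable state. The sketch assumes a finite model given by an initial state and a successor function over hashable states; the name `check_invariant` is hypothetical and no particular tool is implied.

```python
from collections import deque


def check_invariant(initial, successors, invariant):
    """Breadth-first search of the reachable state space.

    Returns None if the invariant holds in every reachable state,
    otherwise a counterexample: a path from the initial state to a violating state.
    """
    parent = {initial: None}          # visited states and how each was reached
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        if not invariant(state):
            path = []
            while state is not None:  # walk back to the initial state
                path.append(state)
                state = parent[state]
            return path[::-1]
        for nxt in successors(state):
            if nxt not in parent:
                parent[nxt] = state
                queue.append(nxt)
    return None
```

For a modulo-4 counter, `check_invariant(0, lambda s: [(s + 1) % 4], lambda s: s != 3)` yields the counterexample `[0, 1, 2, 3]`. The `parent` table holds every reachable state, which is precisely what becomes infeasible when the state space explodes.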
Some specification methods use first-order (Boyer-Moore/ACL2) or higher-order logic
(HOL, PVS) for both the specification and implementation description. Theorem-proving
then tries to establish whether the specification and implementation are equivalent or the
language representing the implementation is contained in the language representing the
specification. The main advantages of this approach are that the formal proof can be mechanically
checked and, in contrast to model-checking, theorem-proving can deal directly with infinite
state spaces. However, as opposed to model-checking, the derivation of a formal proof is
extremely tedious, as the theorem-proving tools are semi-automatic and need a large degree of
expertise for efficient use. Moreover, a theorem-prover is not guaranteed to always give an
answer because of decidability problems.
2.1 ACL2 
One integrated specification and verification method for computer architectures that stands
out from the rest, mainly because of its use in industry, is A Computational Logic for
Applicative Common Lisp (ACL2) [13, 41]. ACL2 is a re-implementation of the Boyer-Moore
³ In 1965 Gordon Moore predicted that complexity of hardware devices would double every 18 months.
system Nqthm. Like Nqthm, ACL2 supports a Lisp-like, first-order mathematical logic.
Most of the Common Lisp functions were axiomatised or defined as functions or macros
in ACL2. In contrast to Common Lisp, all functions in ACL2 are total.
There are two large scale verification projects based on ACL2. The first one is a formal
executable specification of the Motorola CAP [26] digital signal processor. The second,
perhaps even more important, project was the application of ACL2 to formalise and prove the
correctness of the microcode for the kernel of the floating point division operation used on
the AMD5K86 microprocessor [56].
As ACL2 is based on theorem proving, it is more reliant on the user. The authors themselves
admit [13] that ACL2 proofs require many skills including great familiarity with and insight into
the application areas, engineering issues, mathematics, formal logic, the workings of the
ACL2 proof tool, and a lot of persistence and dedication.
Conclusions 
This chapter briefly described several specification and verification methods relevant to the 
area of the Update Plan formalism. 
3.1 Integrated specification/verification methodologies 
The most significant challenge to UP comes from integrated specification and verification
methodologies such as ACL2. However, these require a great deal of skill from the user, both in
designing the specification in order to make it amenable to verification, and in performing the
verification itself. A verification methodology is only useful if it is used. While Update Plans
are not, as yet, embedded in a verification formalism (e.g. HOL), there do not seem to be
any major obstacles to achieving this. This would provide a useful mechanism for producing
intuitive, structured and verifiable specifications.
3.2 Specification methods 
It is noticeable that all of the Update Plan specifications in section 1 are more compact
and, arguably, more easily understood than the corresponding specifications using alternative
methods. In contrast to many of the methods in section 1, the Update Plans formalism has
a formal semantics, and can be applied to the specification of both hardware and software
architectures.
3.2.1 Hardware 
None of the HDLs described in section 1.1 has a formal semantics, which has hindered the
development of formal verification methodologies [79]. Although some pioneering work has
been done on verification of HDL specifications, this was either only preliminary work to
inspire further research [28], or developed verification methodologies for only a restricted
subset/core of that language [5]. UP, on the other hand, has a simple but formally defined
semantics.
The HDLs in section 1.1 are also generally not very well suited to the specification of
software architectures.
Furthermore they are deterministic. This makes it relatively simple to specify sequences 
and modules consisting of sequentially connected elements, but difficult to specify parallel 
and/or synchronous processes; in general these must be emulated. On the other hand,
non-determinism is an inherent part of the semantics of Update Plans, making the specification
of parallel processes relatively easy. Indeed, basic Update Plans were somewhat weak in
specifying sequential behaviour. The introduction of sequential schemes and archetypes in 
Extended Update Plans has solved this, and sequential systems can now be specified with 
ease. 
3.2.2 Concrete machines and instruction sets 
Again, none of the methods described in section 1.2 has a formal semantics, and RTL in
particular does not even have a standard syntax, as can be seen from the number of differing RTL
notations. ISPS dates from the 1970s and is essentially an imperative procedural language
with all the difficulties in formal reasoning that this entails. Though both RTLs and ISPS can
provide detailed descriptions of the hardware/software interface, neither is particularly suited
to the specification of either hardware or software at a more abstract level.
Nor are they strong in the description of parallel processes.
Though RAPIDE is a more recent development, it still lacks a formal semantics, and
verification can only be achieved through simulation. RAPIDE specifications also tend to be
considerably longer than the corresponding UP specifications.
3.2.3 Parallelism 
None of the specification methods discussed so far would be the tool of choice for defining 
parallel processes. The methods discussed in section 1.3 (UNITY, Γ and CHAM) were
designed with parallelism explicitly in mind. However, they are all far more suitable for the
description of parallel programs than of hardware. They also lack mechanisms, such as
sequential Update Schemes and archetypes, for imposing the sequentiality and synchronisation
that is almost always an essential part of hardware behaviour. In addition, Γ and CHAM
specifications are considerably harder to implement on real machines than the corresponding
UP specifications.
3.2.4 Protocols 
Of the specification methods discussed in section 1.4, LOTOS and DILL are the most suited to
the description of systems across the hardware/software interface. LOTOS provides the higher
level (protocol) specification, while DILL uses LOTOS to provide a limited hardware specification
facility. This means, however, that writing non-trivial DILL specifications requires a good
knowledge of both LOTOS and DILL. Also, DILL is limited in the class of systems that it can
specify: it cannot even be used to specify a simple RS flip-flop, since this contains cyclical
connections. It also tends to introduce unexpected non-determinism unless used carefully.
3.2.5 Z/VDM 
These methods are again more suited to higher level (software) specifications rather than 
hardware. In contrast to UP they do not produce executable specifications. 
3.3 Summary 
Though there is a large number of specification methods across the application area of Update
Plans, none is as suited to describing systems at the hardware/software interface as Update
Plans. Almost all of the methods tend to be either hardware or software oriented. Most of
the methods lack a formal semantics, making verification of the systems specified difficult.
Also the Extended Update Plans formalism can be used equally well for the specification of 
sequential and parallel systems, and the interaction between sequential and parallel processes. 
The other methods surveyed here are mostly suitable for describing sequential or parallel 
systems, but not both. In general Extended Update Plans provide a powerful method for 
specifying and reasoning about all the most common processes and systems that exist at the 
hardware/software interface. 
Chapter 9 
Conclusions and Future Research 
Update Plans constitute a very flexible, clean formalism in which clear, elegant, compact,
intuitive, simple to read and unambiguous low level specifications can be written.
Contributions to the Update Plans formalism presented in this thesis can be divided into two
major areas.
Firstly, in the application domain, instruction sets of four different machines (ranging from 
concrete to purely mathematical models) have been described using both original Update 
Plans and UP with the extensions described in the second part of the thesis. 
Secondly, syntactic and semantic extensions to Update Plans made the whole formalism 
even more consistent and expressive. The semantic extensions in particular provide a way to 
make UP descriptions more compact and readable. This is mainly due to the advances in the 
area of synchronisation between parallel and sequential processes and the provision of a tool 
to design specifications with a better degree of modularity. 
1 Update Plans applications
Update Plans applications featuring both in the first and second part of the thesis provide 
a good deal of evaluation of the Update Plans formalism which drove both the syntactic 
and semantic extensions. Specifications were produced for a variety of concrete and abstract
machines. 
The PDP-11 machine instruction set specification provided a useful comparison of a UP
specification to the historically important formal specifications in [20,69]. A significant subset
of the PDP-11 instruction set, including the complete set of addressing modes, single and
double operand instructions, condition code and program flow instructions, and the interrupt
mechanism, was specified. This specification was unambiguous (in contrast to, for example,
[20]), more compact than [20,69] and significantly clearer. Consideration of a more detailed
specification at the fetch/execute cycle drove the development of sequential update schemes,
a semantic extension to the Update Plan model.
The SPARC-V9 was chosen to test UP's suitability for specifying modern RISC
architectures. A partial specification of the SPARC-V9 architecture exists [50,65], but this is
aimed more at the specification of connections and communication than at a concise specifi- 
cation of the instruction set. The UP specification covers the SPARC-V9 register architecture 
(general purpose and floating-point registers, the SPARC-V9 register window mechanism and 
instructions); arithmetic, logical and floating-point instructions; data transfer instructions; 
and control transfer instructions. The main achievement of this specification, apart from 
its clarity and compactness, is that the extensions to the Update Plan formalism allowed 
simultaneous specification of the SPARC-V9's assembly language and machine code. 
The Java Virtual Machine (JVM) was chosen to test Update Plans' suitability for de- 
scribing more abstract instruction sets using a wide variety of types. Again a compact and 
readable specification of the JVM instruction set subset was produced. This specification 
exercise led to some syntactic changes to the original Update Plans formalism.
The Parallel Random Access Machine (PRAM) [22] was chosen as a specification target 
because it provided a useful test-bed for the semantic extensions to Update Plans described in 
chapter 6. It was also a test of the expressive power of Update Plans in that the specification 
should be able to describe all of the PRAM's memory models within one specification. The
"n-PRAM" specification is thus a family of specifications, parameterised by the number of 
RAMs, and the particular read/write model under consideration. 
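The idea of a single description parameterised by the number of RAMs and by the read/write model can be illustrated outside Update Plan notation. The following Python sketch is only an illustration under stated assumptions: EREW, CREW and CRCW are the standard PRAM read/write models, while the function name `step_allowed` and the encoding of memory accesses are hypothetical and do not come from the n-PRAM specification itself.

```python
from collections import Counter

# Standard PRAM read/write models:
# (concurrent reads allowed, concurrent writes allowed).
MODELS = {
    "EREW": (False, False),
    "CREW": (True, False),
    "CRCW": (True, True),
}

def step_allowed(model, accesses):
    """Check one synchronous step of an n-RAM machine against a model.

    `accesses` holds one entry per RAM (so n = len(accesses)): either None
    or a pair (kind, address) with kind "r" or "w". This encoding is an
    assumption of the sketch. Reads and writes are treated as separate
    phases, so a read and a write to the same cell do not conflict here.
    """
    concurrent_reads, concurrent_writes = MODELS[model]
    issued = [a for a in accesses if a is not None]
    reads = Counter(addr for kind, addr in issued if kind == "r")
    writes = Counter(addr for kind, addr in issued if kind == "w")
    if not concurrent_reads and any(n > 1 for n in reads.values()):
        return False
    if not concurrent_writes and any(n > 1 for n in writes.values()):
        return False
    return True

# Three RAMs: two read cell 0 in the same step, one writes cell 1.
step = [("r", 0), ("r", 0), ("w", 1)]
print(step_allowed("EREW", step))  # False: cell 0 is read twice
print(step_allowed("CREW", step))  # True
```

Both parameters appear only as data (the length of `accesses` and the key into `MODELS`), which mirrors the way one parameterised specification can stand for a whole family.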
2 Update Plans extensions 
The second class of advances achieved by the thesis are the extensions to the previous Update 
Plans formalism. 
The syntactic extensions have a more significant impact than might at first sight seem. 
They now allow, in many cases, for specifications at multiple levels of abstraction within one 
update plan. The equivalence of the specifications is then implicit in that single specification, 
rather than requiring explicit proof through transformation. The SPARC-V9 specification is 
an example of such a specification, since it describes both SPARC-V9 assembly language and 
machine code. 
The typing mechanism has been extended to allow a clearer match between "hardware
types" (e.g. bits, bytes, words, etc.; registers, memory, etc.) and "software types" (e.g.
integers, booleans, etc.; opcodes, operands, condition codes, instruction words, etc.). The
JVM specification, for example, defines both the abstract and the physical structure of Java
bytecode.
The problem in specifying parallel processes is often not the parallelism in itself, but
the interaction between, and the synchronisation of, sequential and parallel processes. In
the PRAM, for example, the massive parallelism of the memory access model must be
synchronised with the sequentiality of the fetch/execute cycles of each of the individual RAMs
in the PRAM. When this research started, parallel systems could be specified using Update
Plans, but synchronisation of the parallel processes thus specified required the introduction 
of artificial constructs unrelated to the architecture under consideration, leading to inelegant 
specifications that were hard to read. The introduction of sequential update schemes provides 
the formalism with a general synchronisation primitive that augments the non-deterministic 
model of Update Plans by explicitly stating the order in which updates will be applied. As a 
result it is now also possible to give more abstract descriptions in EUP than in UP. 
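The difference between the non-deterministic model and an explicitly ordered sequence of updates can be sketched in Python. This is not Update Plan notation and is not taken from the thesis: the encoding of an update as a (guard, action) pair, the toy fetch/execute cycle and the names `run_nondeterministic` and `run_sequential` are all assumptions made for illustration.

```python
import random

# An update is a (guard, action) pair over a state dictionary
# (an assumed encoding, chosen only for this sketch).
fetch = (lambda s: s["pc"] < len(s["mem"]),
         lambda s: {**s, "ir": s["mem"][s["pc"]]})
increment = (lambda s: True,
             lambda s: {**s, "pc": s["pc"] + 1})
execute = (lambda s: True,
           lambda s: {**s, "acc": s["acc"] + s["ir"]})

def run_nondeterministic(updates, state, steps, rng):
    """Apply up to `steps` updates, each time picking any applicable one."""
    for _ in range(steps):
        choices = [u for u in updates if u[0](state)]
        if not choices:
            break
        _, action = rng.choice(choices)
        state = action(state)
    return state

def run_sequential(stages, state):
    """Apply the stages in the stated order; each one must be applicable."""
    for guard, action in stages:
        if not guard(state):
            raise ValueError("stage is not applicable in the current state")
        state = action(state)
    return state

start = {"mem": (5, 7), "pc": 0, "ir": 0, "acc": 0}
cycle = [fetch, increment, execute]

print(run_sequential(cycle, start))
# {'mem': (5, 7), 'pc': 1, 'ir': 5, 'acc': 5}
print(run_nondeterministic(cycle, start, 3, random.Random()))  # varies between runs
```

With the order stated explicitly, the three updates behave as one fetch/execute cycle. Without it, the same three updates may fire in any order, so an extra artificial guard (for example a phase counter) would be needed to force the intended behaviour.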
The development of sequential update schemes led naturally to the development of sequential
archetypes. These greatly extend the possibilities for information hiding and structure
reuse by encapsulating a series of synchronised updates into a "module" (a sequential
archetype). A sequential archetype can express an atomic action at one level of abstraction (e.g.
an instruction execution) as a series of atomic actions at a lower level of specification (e.g.
phases of the fetch/execute cycle). This simplifies the task of proving the equivalence of the two levels
of specification.
3 Future considerations 
Suggestions for further UP-related work can be classified into three different areas. Firstly,
there is work on theoretical aspects of the formalism itself. Secondly, more research should be
carried out on the application area of Update Plans. Finally, an implementation of the
Update Plans formalism consisting of appropriate CAD tools should be developed.
3.1 Theory 
3.1.1 Metrics 
An annotation to indicate cost of applying an update or expanding an archetype coupled 
with a cost-derivation methodology should be developed. This (with a working UP simulator) 
would not only make the application of Update Plans to compiler optimisations possible as 
suggested in [60], but it would also prove useful in a variety of other applications such as
automatic calculation of delays of (networks of) components in a logic circuit, simulation of 
latencies in a network environment or answering other performance-related questions. 
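One possible shape for such a cost derivation is sketched below in Python. The thesis does not fix a cost-derivation methodology, so everything here is an assumption made for illustration: the class names `Update`, `Seq` and `Par`, the integer cost annotation, and the rule that sequential composition adds costs while parallel composition takes the maximum.

```python
from dataclasses import dataclass
from typing import Tuple, Union

@dataclass(frozen=True)
class Update:
    name: str
    cost: int  # hypothetical annotation, e.g. gate delays or clock cycles

@dataclass(frozen=True)
class Seq:
    parts: Tuple["Block", ...]  # stages applied one after another

@dataclass(frozen=True)
class Par:
    parts: Tuple["Block", ...]  # blocks applied side by side

Block = Union[Update, Seq, Par]

def cost(block: Block) -> int:
    """Derive a cost: sum over sequential parts, maximum over parallel parts."""
    if isinstance(block, Update):
        return block.cost
    if isinstance(block, Seq):
        return sum(cost(p) for p in block.parts)
    if isinstance(block, Par):
        return max((cost(p) for p in block.parts), default=0)
    raise TypeError(f"not a block: {block!r}")

# Two gates side by side, followed by a third gate.
circuit = Seq((Par((Update("xor", 2), Update("and", 1))), Update("or", 1)))
print(cost(circuit))  # 3
```

Under these assumed rules the derived figure corresponds to the critical path through the composed updates, which is the kind of quantity needed for delay calculation in logic circuits.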
3.1.2 Modularity 
It would be useful to have a mechanism for restricting access of updates to only a selected set 
of cells in the memory by means of some kind of a modular structure or simply by restricting 
the domain of variables in some way. Not only would this automatically protect against accesses
to undesired parts of memory, but the current set of grounding rules [60] could be further
extended. The impact on the compactness and readability of UP specifications would in some
cases be significant. 
3.1.3 Verification 
Verification, and possibly transformation, methods should be developed, together with
appropriate heuristic rules. Although a first step in this direction has been taken [60], further work
is necessary. 
On a higher level of abstraction, it is more likely that complex data manipulations will be 
described by a well-defined external function rather than Update Plans. On a lower level of 
abstraction, this function will in some cases be specified in terms of Update Plans. Therefore 
it will be necessary to find such a formalism and develop a methodology of proving semantic 
equivalence or containment between these two descriptions. 
The majority of successful general-purpose proof checking systems that have proved
themselves on several industrial-scale microprocessors are based on theorem proving in first or
second order logic. Therefore a promising approach seems to be the formalisation of Update
Plans in such a system (e.g. ACL2, PVS, HOL, Isabelle, IMPS or Nuprl).
Alternatively, use of existing model-checking tools (e.g. SPIN) could be made after the
development of methodologies to transform UP descriptions into specification languages the 
tools use. It would then be possible to use the tool's logic to specify and check desired 
properties. 
As no method or tool is general enough and appropriate for all the varied, and sometimes 
conflicting requirements of hardware architecture description and analysis, a combination of 
existing methodologies will almost certainly need to be used. UP should be used in con- 
junction with other formal techniques and the traditional approaches such as testing through 
simulation. 
Finally, methodologies to translate specification languages used by general theorem provers 
to hardware specification languages have been developed [8]. A similar approach could be
taken to facilitate automatic synthesis and implementation of chips. 
3.2 Applications 
Although there have already been a lot of UP-based specifications of concrete and abstract 
machines, further validation of the Update Plans formalism is needed. This effort should 
concentrate on the specification of one of the concrete machines already specified, but on a 
lower level, e.g. microcode. These two levels should then be subject to formal verification for
equivalence or containment by one of the methods suggested in section 3.1.3. As Extended
Update Plans introduced a convenient way to describe sequences, libraries of components for
an Update Plans simulator could be defined, describing the behaviour of basic building blocks
ranging from basic elements of logic circuits to blocks such as multipliers.
A true test of the (Extended) Update Plans formalism will be more sophisticated models 
of parallel computation such as F-PRAM [76], which introduce a communication network 
with latency and a synchronisation barrier among asynchronously running processors. On 
a slightly related note, attempts to enter into neighbouring domains such as communication 
protocol specification could be tried.
3.3 Implementation 
A prototype implementation of Update Plans [60], which has been successfully used as a
didactic aid, has been developed. Completion of this work will greatly enhance the usefulness of
the formalism not only in automatic derivation of specified properties, but also in prototyping
and validation. A full implementation of UP (a UP simulator) will lead to better
popularisation of UP, and help to prompt more research into certain features of Update Plans and
their refinement.
Bibliography 
[1] A. V. Aho, R. Sethi, and J. D. Ullman. Compilers, Principles, Techniques and Tools. 
Addison-Wesley, 1986. 
[2] A. G. Alexandrakis, A. V. Gerbessiotis, D. S. Lecomber, and C. J. Siniolakis. Bandwidth,
space and computation efficient PRAM programming: The BSP approach. In Proceedings
of the SUPEUR '96 Conference, Sep 1996.
[3] J. H. Aylor, R. Waxman, and C. Scarratt. VHDL - feature description and analysis.
IEEE Design and Test of Computers, 3(2), April 1986.
[4] J.-P. Banâtre and D. Le Métayer. Programming by multiset transformation.
Communications of the ACM, 36(1):98-111, 1993.
[5] H. Barringer, G. Gough, T. Longshaw, B. Monahan, M. Peim, and A. Williams. A semantics
and verification framework for ELLA. Technical Report UMCS-92-4-6, Department
of Computer Science, University of Manchester, Oxford Road, Manchester, UK, March
1992. URL ftp://ftp.cs.man.ac.uk/pub/TR/UMCS-92-4-6.ps.Z [cited May 2003].
[6] H. Barringer, G. Gough, B. Monahan, and A. Williams. A process algebra foundation for
reasoning about core ELLA. Technical Report UMCS-94-12-1, Department of Computer
Science, University of Manchester, Oxford Road, Manchester, UK, December 1994. URL
ftp://ftp.cs.man.ac.uk/pub/TR/UMCS-94-12-1.ps.Z [cited May 2003].
[7] F. L. Bauer. The Munich Project CIP, volume 183 of Lecture Notes in Computer Science.
Springer-Verlag, Berlin/Heidelberg/New York, 1985.
[8] C. Berg, S. Beyer, C. Jacobi, D. Kröning, and D. Leinenbach. Formal verification of the
VAMP microprocessor (project status). In Symposium on the Effectiveness of Logic in
Computer Science (ELICS02), pages 31-36, September 2002. URL
http://busserver.cs.uni-sb.de/publikationen/BBJKL02.pdf [cited May 2003].
[9] G. Berry. The chemical abstract machine. 1998. URL
http://www-sop.inria.fr/meije/personnel/Gerard.Berry/cham.ps [cited May 2003].
[10] G. Berry and G. Boudol. The chemical abstract machine. Theoretical Computer Science,
96:217-248, 1992.
[11] P. Bertelsen. Semantics of Java byte code. Technical report, 1997. URL
ftp://ftp.dina.kvl.dk/pub/Staff/Peter.Bertelsen/jvm-semantics.ps.gz [cited May 2003].
[12] R. Boulton, M. Gordon, J. Herbert, and J. V. Tassel. The HOL verification of ELLA
designs. Technical Report TR199, University of Cambridge Computer Laboratory,
Cambridge, UK, 1999. URL
http://www.ftp.cl.cam.ac.uk/ftp/papers/reports/TR199-rjb-mjcg-jmjh-jvt-HOL-verification-ELLA.ps.gz
[cited May 2003].
[13] B. Brock, M. Kaufmann, and J. S. Moore. ACL2 theorems about commercial
microprocessors. 1166:275-293, 1996. URL
http://www.cs.utexas.edu/users/moore/acl2/v2-1/reports/bkm96.ps [cited May 2003].
[14] R. M. Burstall and J. Darlington. A transformation system for developing recursive
programs. Journal of the ACM, 24(1):44-67, Jan 1977.
[15] CCITT. Red book - Functional Specification and Description Language (SDL).
Recommendations Z.101-Z.104, volume VI. CCITT, Geneva, 1985.
[16] K. M. Chandy and J. Misra. Parallel Program Design: A Foundation. Addison-Wesley,
1988.
[17] E. Clarke, O. Grumberg, S. Jha, Y. Lu, and H. Veith. Progress on the state explosion
problem in model checking. Lecture Notes in Computer Science 2000, pages 176-194,
2001.
[18] E. Clarke and J. Wing. Formal methods: State of the art and future directions. Journal
of the ACM, 28(1):626-643, Dec 1996.
[19] R. M. Cohen. The Defensive Java Virtual Machine specification, version 0.53. Technical
report, Austin Technical Services Center, 98 San Jacinto Blvd, Suite 500, Austin, TX
78701, 1997.
[20] PDP-11 Processor Handbook. Digital Equipment Corporation, 1971.
[21] H. R. Eckhouse, Jr. and L. R. Morris. Minicomputer Systems - Organization,
Programming and Applications (PDP-11). Prentice-Hall, 1979. ISBN 0-13-583914-9.
[22] S. Fortune and J. Wyllie. Parallelism in random access machines. In Proceedings of the
10th Annual Symposium on Theory of Computing, pages 114-118, 1978.
[23] T. S. Frank. Introduction to the PDP-11 and its Assembly Language. Prentice-Hall, Le
Moyne College, Syracuse, New York, 1983. ISBN 0-13-491704-9.
[24] S. Gerhart, D. Craigen, and T. Ralston. Observations on industrial practice using formal
methods. In 15th International Conference on Software Engineering (ICSE), Baltimore,
Maryland, USA, May 1993.
[25] R. Giegerich. Implementierung von Programmiersprachen. Technische Fakultät,
Universität Bielefeld, Postfach 86 40, 4800 Bielefeld 1, BRD, 1992. Lecture notes for a course
in compiler construction. (In German).
[26] S. Gilfeather, J. Gehman, and C. Harrison. Architecture of a complex arithmetic
processor for communication signal processing. In SPIE Proceedings, International
Symposium on Optics, Imaging, and Instrumentation, pages 624-625, March 1994. URL
http://www.cli.com/hardware/cap.eps [cited May 2003].
[27] J. Gleick. A bug and a crash: Sometimes a bug is more than a nuisance. 1996. URL
http://www.around.com/ariane.html [cited May 2003].
[28] M. J. C. Gordon. The semantic challenge of Verilog HDL. In Tenth Annual IEEE
Symposium on Logic in Computer Science (LICS'95), pages 136-145, 1995. URL
http://www.math.chalmers.se/~gpace/HDL/papers/V.ps [cited May 2003].
[29] A. Gupta. Formal hardware verification methods: A survey. In Formal Methods in System
Design, volume 1, School of Computer Science, Carnegie Mellon University, Pittsburgh,
Pennsylvania 15213, 1992. Kluwer Academic Publishers, Hingham, MA, USA.
[30] P. Hämäläinen. PRAM emulator, user's manual. Technical Report B-1992-2, University
of Joensuu, PO Box 111, 80101 Joensuu, Finland, 1992. URL
ftp://cs.joensuu.fi/pub/Reports/B-1992-2.ps [cited May 2003].
[31] E. Harcourt, J. Mauney, and T. Cook. Formal specification and simulation of
instruction-level parallelism. In Proceedings of the 1994 European Design Automation Conference.
IEEE Press, October 1994.
[32] A. Harry. Formal Methods: VDM and Z. Wiley, 1996. ISBN 0-471-95857-3.
[33] J. He and K. J. Turner. Modelling and verifying synchronous circuits in DILL.
Technical Report CSM-152, Department of Computing Science and Mathematics, University
of Stirling, Scotland, April 1999. URL
ftp://ftp.cs.stir.ac.uk/pub/staff/kjt/research/pubs/sync-dill.ps.gz [cited May 2003].
[34] J. He and K. J. Turner. Specification and verification of synchronous hardware using
LOTOS. October 1999. URL
ftp://ftp.cs.stir.ac.uk/pub/staff/kjt/research/pubs/sync-lot.ps.gz [cited May 2003].
[35] J. L. Hennessy and D. A. Patterson. Computer Architecture: A Quantitative Approach.
Morgan Kaufmann Publishers, 3rd edition, June 2002. ISBN 1-55860-724-2.
[36] IEEE Standard VHDL Language Reference Manual. IEEE, New York, 1988. IEEE Std.
1076-1987.
[37] P. Inverardi and A. L. Wolf. Formal specification and analysis of software architectures
using the chemical abstract machine model. IEEE Transactions on Software Engineering,
21(4), 1995.
[38] ISO. Information Processing Systems - Open Systems Interconnection - Estelle - A
Formal Description Technique based on an Extended State Transition Model. ISO 9074.
International Organization for Standardization, Geneva, 1989.
[39] ISO/IEC. Information Processing Systems - Open Systems Interconnection - LOTOS
- A Formal Description Technique based on the Temporal Ordering of Observational
Behaviour. ISO/IEC 8807. International Organization for Standardization, Geneva, 1989.
[40] S. Juvaste. Modelling Parallel Shared Memory Computations. PhD thesis, University of
Joensuu, 1998. URL ftp://ftp.cs.joensuu.fi:/pub/Dissertations/juvaste.ps.gz
[cited May 2003].
[41] M. Kaufmann and J. S. Moore. An industrial strength theorem prover for a logic based
on Common Lisp. Software Engineering, 23(4):203-213, March 1997. URL
http://www.cs.utexas.edu/users/moore/publications/km97.ps.Z [cited May 2003].
[42] C. Kern and M. R. Greenstreet. Formal verification in hardware design: A survey. ACM
Transactions on Design Automation of Electronic Systems, 4(2):123-193, 1999.
[43] P. Klint. A meta-environment for generating programming environments. ACM
Transactions on Software Engineering and Methodology, 2(2):176-201, April 1993.
ISSN 1049-331X.
[44] A. Kuehlmann, A. Srinivasan, and D. P. LaPotin. Verity - a formal verification program
for custom CMOS circuits. IBM Journal of Research and Development, 39(1/2):149-165,
March 1995.
[45] W. Lamain. Update Plans, implementatie aspecten. Master's thesis, University of
Nijmegen, Toernooiveld 1, Nijmegen, The Netherlands, 1992. (In Dutch).
[46] D. S. Lecomber, K. R. Sujithan, and J. M. D. Hill. Architecture-independent locality
analysis and efficient PRAM simulations. In HPCN'97, Vienna, April 1997.
Springer-Verlag.
[47] G. W. Leibniz. Œuvres philosophiques, latines et françaises, de feu Mr. de Leibniz, tirées
de ses manuscrits, qui se conservent dans la bibliothèque royale à Hanovre et publiées
par M. Rud. Eric Raspe. Amsterdam/Leipzig, 1765.
[48] T. Lindholm and F. Yellin. The Java Virtual Machine Specification (Second Edition).
Addison-Wesley, 1999. URL http://java.sun.com/docs/books/vmspec/index.html
[cited May 2003].
[49] J. Lohse, J. Bormann, M. Payer, and G. Venzl. VHDL-translation for BDD-based
formal verification. 1994. URL
http://tech-www.informatik.uni-hamburg.de/vhdl/papers/verification/vhdl2fsm.ps.gz
[cited May 2003].
[50] D. C. Luckham, J. L. Kenney, L. M. Augustin, J. Vera, D. Bryan, and W. Mann.
Specification and analysis of system architecture using RAPIDE. IEEE Transactions on Software
Engineering, 21(4):336-355, 1995.
[51] M. Marin. Binary tournaments and priority queues: PRAM and BSP. Technical Report
PRG-TR-7-97, Oxford University, January 1997.
[52] M. C. McFarland. Formal verification of sequential hardware: A tutorial. IEEE
Transactions on Computer-Aided Design of Integrated Circuits and Systems, 12(5), May 1993.
[53] E. Meijer. Calculating Compilers. PhD thesis, University of Nijmegen, Toernooiveld 1,
Nijmegen, The Netherlands, 1992.
[54] H. Meijer. Programmar: A Translator Generator. PhD thesis, University of Nijmegen,
Toernooiveld 1, Nijmegen, The Netherlands, 1986.
[55] J. Mencák. Java target code optimization. Master's thesis, Department of Computer
Science and Engineering, Brno University of Technology, The Czech Republic, 1999.
[56] J. S. Moore, T. W. Lynch, and M. Kaufmann. A mechanically checked proof of the
correctness of the kernel of the AMD5K86 floating point division algorithm. March 1996.
URL http://www.cli.com/news/divide.ps [cited May 2003].
[57] J. D. Morison and A. S. Clarke. ELLA 2000: A Language for Electronic System Design.
McGraw-Hill, 1993.
[58] H. R. Osborne. The semantics and syntax of Update Schemes. In Code Generation -
Concepts, Tools, Techniques (Proceedings of the International Workshop on Code
Generation, Dagstuhl, Germany, 20-24 May 1991), Workshops in Computing, pages 210-223.
Springer Verlag, 1992. URL
http://scom.hud.ac.uk/scomhro/Papers/CODE91/code91.ps [cited May 2003].
[59] H. R. Osborne. Update Plans. In Proceedings of the 25th Hawaii International Conference
on System Sciences (Volume II: Software Technology), pages 488-496. IEEE Computer
Society Press, 1992. URL http://scom.hud.ac.uk/scomhro/Papers/HICSS25/hicss25.ps
[cited May 2003].
[60] H. R. Osborne. Update Plans - A High Level Low Level Specification Language. PhD
thesis, University of Nijmegen, 1995. URL
http://scom.hud.ac.uk/scomhro/Papers/PhD/phd.ps [cited May 2003].
[61] H. R. Osborne. Update Plans for parallel architectures. In Abstract Machine Models for
Parallel and Distributed Computing, pages 79-90, Amsterdam, 1996. IOS Press. URL
http://scom.hud.ac.uk/scomhro/Papers/AMW/amw.ps [cited May 2003].
[62] H. R. Osborne. The Postroom Computer. Journal of Educational Resources in
Computing, 1(4):81-110, December 2001. URL
http://scom.hud.ac.uk/scomhro/Papers/JERIC/jeric.ps [cited May 2003].
[63] V. Pratt. Anatomy of the Pentium bug. In TAPSOFT95: Theory and Practice of
Software Development, pages 97-107. Springer Verlag, 1995. URL
http://boole.stanford.edu/pub/anapent.ps.gz [cited May 2003].
[64] P. S. Rajan. Transformations in high-level synthesis: Formal specification and efficient
mechanical verification. October 1994. URL
http://www.csl.sri.com/papers/c/s/csl-94-10/csl-94-10.ps.gz [cited May 2003].
[65] A. Santoro, W. Park, and D. Luckham. SPARC-V9 architecture specification with
RAPIDE. Technical Report CSL-TR-95-677, 1995. URL
ftp://pavg.stanford.edu/pub/Rapide-1.0/sparc.ps.Z [cited May 2003].
[66] A. Santoro, W. Park, and D. Luckham. Specifying instruction set architectures with
RAPIDE. Technical report, Computer Systems Lab, Stanford University, to appear.
[67] J. Savage. Models of Computation. Exploring the Power of Computing. Addison-Wesley,
1998.
[68] C.-J. H. Seger. An introduction to formal verification. Technical Report 92-13,
Department of Computer Science, Vancouver, B.C., Canada, June 1992.
[69] D. P. Siewiorek, C. G. Bell, and A. Newell. Computer Structures: Principles and
Examples. McGraw-Hill, 1982.
[70] UltraSPARC IIi - User's Manual. Sun Microelectronics, 1999. URL
http://www.sun.com/oem/products/manuals/805-0087.pdf [cited May 2003].
[71] J. V. Tassel and D. Hemmendinger. Toward formal verification of VHDL specification.
In L. Claesen, editor, Applied Formal Methods For Correct VLSI Design, pages 261-270,
Amsterdam, November 1989. Elsevier Science Publishers.
[72] R. G. Taylor. Models of Computation and Formal Languages. Oxford University Press,
1998. ISBN 0-19-510983-X.
[73] D. Thomas and P. Moorby. The Verilog Hardware Description Language. Kluwer
Academic, 1991.
[74] M. G. J. van den Brand and P. Klint. ASF+SDF Meta-Environment User Manual,
Revision 1.125. Kruislaan 413, 1098 SJ Amsterdam, The Netherlands, 2002. URL
http://www.cwi.nl/projects/MetaEnv/meta/doc/manual.ps.gz [cited May 2003].
[75] B. Venners. Inside the Java Virtual Machine. McGraw-Hill, 1998.
[76] J. Veräjäntausta. An F-PRAM emulator. Technical Report B-1998-1, University of
Joensuu, PO Box 111, 80101 Joensuu, Finland, 1998. URL
ftp://cs.joensuu.fi/pub/Reports/B-1998-1.ps.gz [cited May 2003].
[77] D. L. Weaver and T. Germond, editors. The SPARC Architecture Manual. Prentice
Hall, 2000. ISBN 0-13-825001-4. URL http://www.sparc.com/standards/v9.ps.Z
[cited May 2003].
[78] B. A. Wichmann. A personal view of formal methods. March 2000. URL
http://www.npl.co.uk/ssfm/download/documents/baw-fm.pdf [cited May 2003].
[79] A. Williams. Comparison of ELLA and VHDL. Technical Report IED 4/1/1357,
Department of Computer Science, University of Manchester, Oxford Road,
Manchester, UK, 1994. URL
ftp://ftp.cs.man.ac.uk/pub/hardware-verification/ELLA-PROJECT/D2.1b.ps.gz [cited May 2003].
[80] P. J. Windley. Formal modeling and verification of microprocessors. IEEE Transactions
on Computers, 44(1), October 1995.
Appendix A 
Extended Update Plans Grammar 
This appendix gives a full context-free grammar for Extended Update Plans as specified in
this thesis. The grammar is given as an ASF+SDF [43, 74] specification, a specification
formalism developed at the University of Amsterdam and the Centrum voor Wiskunde en
Informatica. The specification can be read as a context-free grammar whose production rules
read from right to left: the sort being defined appears to the right of the arrow. The
following notational conventions apply:
• S* defines zero or more repetitions of a symbol (non-terminal or literal) S;
• S+ defines one or more repetitions of a symbol S;
• {S sep}* defines zero or more repetitions of a symbol S separated by the literal sep;
• {S sep}+ defines one or more repetitions of a symbol S separated by the literal sep;
• (...) defines grouping of two or more symbols '...';
• [s] represents any one of the literals in the string s;
• ~[s] represents any character not in the string s;
• \t is the horizontal tabulation character;
• \n is the newline character.
The {left} and {bracket} annotations are applied to disambiguate parsing, as is the
information provided under a priorities header. The symbols appearing in the sorts section
denote start symbols for the grammar.
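As an informal illustration of the repetition conventions above, the following Python sketch maps each operator to an equivalent regular expression. The helper names (star, plus, sep_star, sep_plus, matches) are hypothetical and not part of ASF+SDF, and the sketch ignores layout between symbols, which a real SDF parser would permit.

```python
import re


def star(s: str) -> str:
    """S* : zero or more repetitions of the pattern S."""
    return f"(?:{s})*"


def plus(s: str) -> str:
    """S+ : one or more repetitions of the pattern S."""
    return f"(?:{s})+"


def sep_plus(s: str, sep: str) -> str:
    """{S sep}+ : one or more S separated by the literal sep."""
    return f"(?:{s})(?:{re.escape(sep)}(?:{s}))*"


def sep_star(s: str, sep: str) -> str:
    """{S sep}* : zero or more S separated by the literal sep."""
    return f"(?:{sep_plus(s, sep)})?"


def matches(pattern: str, text: str) -> bool:
    """True when the whole text is derivable from the pattern."""
    return re.fullmatch(pattern, text) is not None
```

For instance, `matches(sep_plus("[0-9]+", ","), "1,22,333")` holds, while a trailing separator as in `"1,"` is rejected, which is the difference between a separated list and a plain repetition of `S sep`.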
module Main 
imports Layout Schemes Types Stores Archetypes Updates 
exports 
sorts SCRIPT 
context-free syntax 
CONFIGURATION "." PLAN → SCRIPT
(ITEM ".")* → PLAN
TYPE-DECLARATION → ITEM
STORE-DECLARATION → ITEM
ARCHETYPE-DEFINITION → ITEM
UPDATE → ITEM
module Updates 
imports Alternatives ParallelBlocks SequentialBlocks
exports
context-free syntax
ALTERNATIVES → UPDATE
PARBLOCK → UPDATE
SEQBLOCK → UPDATE
module Alternatives 
imports Schemes 
exports 
context-free syntax 
{SCHEME ";"}+ → ALTERNATIVES
module ParallelBlocks 
imports Updates 
exports 
context-free syntax
"(||" {UPDATE "||"}+ "||)" → PARBLOCK
In the production rules for sequential blocks, layout is forbidden between the pipeline
symbol '|' and the symbol/variable indicating a stage.
module SequentialBlocks
imports Lexicon Updates
exports
context-free syntax
"(|" SEQBLOCK-ID-OPT "|" {(STAGE-OPT UPDATE) "|"}+ "|)" → SEQBLOCK
SYMB-CONST → SEQBLOCK-ID-OPT
→ SEQBLOCK-ID-OPT
SYMB-CONST | VARIABLE → STAGE-OPT
→ STAGE-OPT
In the production rules for update schemes, layout is forbidden between a locator and the
'[' or ']' of the cell sequence of which it is a locator.
module Schemes 
imports Terms 
exports 
context-free syntax 
CONFIGURATION GUARD CONFIGURATION SCHEME 
REPEAT GUARD CONFIGURATION SCHEME 
TEXT CONTEXT 
TERM* TEXT 
LOC-EXPR* --ýCONTEXT 
REPEAT 
TERM GUARD 
"==> " --+GUARD 
LEFT-SECTION+ LOCATOR LOC-EXPR 
TERM-OPT TEXT LOC-EXPR 
TERM-OPT TEXT LOC-EXPR 
LOCATOR "[" TEXT LEFT-SECTION 
TERM-OPT LOCATOR 
module Archetypes 
imports BasicArchetypes AmbidextrousArchetypes Lexicon Schemes Updates 
exports 
context-free syntax 
BASIC-ARCHETYPE-DEFINITION ARCHETYPE-DEFINITION 
AMBID-ARCHETYPE-DEFINITION ARCH ETYPE-DEFINITION 
"=" ARCHETYPE-BODY 
UPDATE I CONFIGURATION 
ARCHETYPE-CALL 
-+ BASIC-DEFINITION 
--4ARCHETYPE-BODY 
--) TERM 
ARCHETYPE-NAME INDEX-OPT PARAMETERS --+ ARCHETYPE-CALL 
INDEX 
"(" ITEXT ", "}* ")" 
INDEX-OPT 
INDEX-OPT 
--ý PARAMETERS 
module BasicArchetypes 
imports Archetypes Lexicon 
exports 
context-free syntax 
BASIC-DECLARATION BASIC-DEFINITION+ → BASIC-ARCHETYPE-DEFINITION
BASIC-ARCHETYPE-NAME PARAMETERS → BASIC-DECLARATION
module AmbidextrousArchetypes 
imports Archetypes Lexicon 
exports 
context-free syntax 
AMBID-DECLARATION BASIC-DEFINITION+ → AMBID-ARCHETYPE-DEFINITION
ARCHETYPE-NAME PARAMETERS TEXT → AMBID-DECLARATION
module Stores 
imports Lexicon 
exports 
context-free syntax 
STORE-NAME → STORE-STRUCTURE
"(" STORE-STRUCTURE ")" → STORE-STRUCTURE {bracket}
STORE-STRUCTURE "++" STORE-STRUCTURE → STORE-STRUCTURE {left}
STORE-STRUCTURE "|" STORE-STRUCTURE → STORE-STRUCTURE {left}
STORE-STRUCTURE "*" → STORE-STRUCTURE
"{" STORE-STRUCTURE "}" NUMBER → STORE-STRUCTURE
"{" {STORE ","}+ "}" → STORE-DECLARATION
STORE-IDENTIFIER → STORE
STORE-IDENTIFIER "=" STORE-STRUCTURE → STORE
STORE-NAME CONST-OPT → STORE-IDENTIFIER
context-free priorities
STORE-STRUCTURE "*" → STORE-STRUCTURE >
STORE-STRUCTURE "++" STORE-STRUCTURE → STORE-STRUCTURE >
STORE-STRUCTURE "|" STORE-STRUCTURE → STORE-STRUCTURE
module Types 
imports Terms Lexicon Stores 
exports 
context-free syntax 
{(TERM CONST-OPT) ","}+ "::" STORE-STRUCTURE → TYPE-DECLARATION
module Terms 
imports Archetypes BasicTerms Arithmetic Logic Ordering Memory 
exports 
context-free syntax
"(" TERM "::" STORE-STRUCTURE ")" → TERM
TERM "*" → TERM
TERM "++" TERM → TERM {left}
TERM "|" TERM → TERM {left}
"(" TERM ")" → TERM {bracket}
TERM → TERM-OPT
→ TERM-OPT
context-free priorities
TERM "*" → TERM > TERM "++" TERM → TERM > TERM "|" TERM → TERM
module BasicTerms 
imports Lexicon 
exports 
context-free syntax 
NUMBER → TERM
CHAR → TERM
SYMB-CONST → TERM
VARIABLE → TERM
DONTCARE → TERM
module Arithmetic 
imports BasicTerms 
exports 
context-free syntax 
TERM TERM TERM {left} 
TERM TERM TERM {left} 
TERM "x" TERM TERM {left} 
TERM TERM -4TERM fleft} 
TERM TERM TERM {Ieftj 
TERM TERM TERM {left} 
"-" TERM TERM 
context-free priorities 
"-" TERM -) TERM > TERM "' TERM TERM 
{left: TERM TERM --ý 
TERM TERM TERM 
TERM "x" TERM TERM}> 
fleft: TERM TERM --4TERM 
TERM TERM -ý TERM 
module Logic
imports BasicTerms
exports
context-free syntax
TERM "∧" TERM → TERM {left}
TERM "∨" TERM → TERM {left}
"¬" TERM → TERM
context-free priorities
"¬" TERM → TERM > TERM "∧" TERM → TERM > TERM "∨" TERM → TERM
module Ordering 
imports BasicTerms 
exports 
context-free syntax 
TERM TERM TERM 
TERM TERM TERM 
TERM TERM TERM 
TERM TERM TERM 
TERM TERM --4TERM 
TERM '5" TERM --+ TERM 
module Memory 
imports Lexicon BasicTerms 
exports 
context-free syntax 
"I" STORE-NAME "I" --+ TERM 
"La" "(" TERM ", " TERM TERM 
"La" TERM TERM 
module Lexicon 
exports 
lexical syntax 
[0-9]+ 
""' -['\n] ""' 
[A-Z][A-Z'0-9-1* 
[a-z][a-z'0-9]* 
--4 NUMBER 
CHAR 
SYMB-CONST 
--4VARIABLE 
--+ DONTCARE 
[A-Z][A-Z'0-9-]* [a-z][A-Za-z'0-9_]* - STORE-NAME 
[a-z][a-z'0-9-]* BASIC-ARCHETYPE-NAME 
[A-Z][A-Z'0-9]* COMMAND-ARCHETYPE-NAME 
BASIC-ARCHETYPE-NAME ARCHETYPE-NAME 
COMMAND-ARCHETYPE-NAME ARCHETYPE-NAME 
NUMBER INDEX 
NUMBER INDEX 
NUMBER --4CONST-OPT 
-+ CONST-OPT 
module Layout 
exports 
lexical syntax 
[\ \t\n]* → LAYOUT
"#" ~[\n]* [\n] → LAYOUT
context-free restrictions
LAYOUT? -/- [\ \t\n]
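To make the lexical conventions concrete, the following Python sketch tokenises a fragment of an Update Plans script using the NUMBER, SYMB-CONST and VARIABLE patterns of the Lexicon module and the two LAYOUT rules (blanks, and '#' comments running to the end of the line). It is an approximation under stated assumptions: the underscore in the symbolic-constant character class is a guess at a character that is unclear in the printed pattern, and every other character is returned as a one-character OTHER token, which is a simplification introduced here.

```python
import re

# Ordered alternatives; LAYOUT is recognised first and then discarded.
TOKEN_RE = re.compile(r"""
    (?P<LAYOUT>[ \t\n]+|\#[^\n]*\n?)    # blanks, or a '#' comment to end of line
  | (?P<NUMBER>[0-9]+)
  | (?P<SYMB_CONST>[A-Z][A-Z'0-9_]*)    # '_' in the class is an assumption
  | (?P<VARIABLE>[a-z][a-z'0-9]*)
  | (?P<OTHER>.)                        # any remaining single character
""", re.VERBOSE)


def tokenize(text):
    """Return (kind, lexeme) pairs for text, with layout removed."""
    tokens = []
    for match in TOKEN_RE.finditer(text):
        kind = match.lastgroup
        if kind != "LAYOUT":
            tokens.append((kind, match.group()))
    return tokens
```

Calling `tokenize("PC[pc] # counter\n42")` yields the symbolic constant `PC`, the brackets as OTHER tokens, the variable `pc` and the number `42`; the comment and blanks are dropped.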
Appendix B 
PDP-11 
REG(r, v) r = r[v]. # register mode
AUTOINC(b, v) r = r[b] b[v]c r[c]. # autoincrement mode
AUTODEC(a, v) r = r[b] a[v]b r[a]. # autodecrement mode
INDEX(b + d, v) r d = r[b] b+d[v]. # index mode
REGDEF(b, v) r = r[b] b[v]. # register deferred mode
AUTOINCDEF(b2, v) r = r[b1] b1[b2]c1 b2[v]c2 r[c1]. # autoincrement deferred mode
AUTODECDEF(a2, v) r = r[b1] a1[a2]b1 a2[v]b2 r[a1]. # autodecrement deferred mode
INDEXDEF(b2, v) r d = r[b1] b1+d[b2] b2[v]. # index deferred mode
IMM(V) V# immediate mode 
ABS (a, v) a= a[v]- # absolute mode 
REL(pc + d, v) d= PC[pc] pc+d[v]. # relative mode 
RELDEF(a, v) d= PC[pc] pc+d[a] a[v]. # relative deferred mode 
sop(v) = REG(-, v).
= AUTOINC(-, v).
= AUTODEC(-, v).
= INDEX(-, v).
= REGDEF(-, v).
= AUTOINCDEF(-, v).
= AUTODECDEF(-, v).
= INDEXDEF(-, v).
= IMM(v).
= ABS(-, v).
= REL(-, v).
= RELDEF(-, v).
dop(a, v) = REG(a, v).
= AUTOINC(a, v).
= AUTODEC(a, v).
= INDEX(a, v).
= REGDEF(a, v).
= AUTOINCDEF(a, v).
= AUTODECDEF(a, v).
= INDEXDEF(a, v).
= ABS(a, v).
= REL(a, v).
= RELDEF(a, v).
{Nibble = Bit Bit Bit Bit, Byte = Nibble Nibble, Word = Byte Byte}.
bn, bz, bv, bc :: Bit.
nibble :: Nibble.
byte :: Byte.
PSW[byte nibble]CCN[bn]CCZ[bz]CCV[bv]CCC[bc].
cc(v) : -- CCN[(v <, 0) (v = 0) (--, (MIN, <, v <, MAXJ) (v >, MAXJ] 1. 
arithm(-, 0) = CLR. # clear destination
arithm(x, ¬x) = COM. # complement destination
arithm(x, x + 1) = INC. # increment destination
arithm(x, x - 1) = DEC. # decrement destination
arithm(x, -x) = NEG. # negate destination
arithm(x, x + c) = ADC CCC[c]. # add carry to destination
arithm(x, x - c) = SBC CCC[c]. # subtract carry from destination
arithm(x, x / 2) = ASR. # arithmetic shift right destination
arithm(x, x × 2) = ASL. # arithmetic shift left destination
arithm(-, -1 × n) = SXT CCN[n]. # sign extend destination
arithm(x, r) dop(a, x) ==> a[r] cc(r).
vl, vo :: Byte. 
SWAB dop(a, -) = a[vl vol =: ý> a[vo vl]. 
#swap bytes of destination 
arithm(x, x+ 1) = INC =* W. # increment destination 
arithm(x, x+ 1) = INCB ==* B. # increment destination byte 
arithm(x, r) dop(a, x) ==* rcc(a, r, arithm(-, -)). 
type(W) = Word 
type(B) = Byte 
rcc(a, v, w) = cc(v, w) a[(v:: type(w))]. 
CC(V7W) CCN[(V <s 0) (v = 0) (-n(MIN, <, v <,, MAX, )) (v >,, MAXJ] 1. 
cc (v, B) CCN 
[(V <bs 0) (V = 0) (- (MIN-B., <-bs V 
-<bs 
MAX-B., )) (V >bu MAX-Bu)] 
TST dop(-, x) ==> cc(x). # test destination
ROR dop(a, x) CCC[c] ==> a[(x>>1)|(c<<15)] CCC[(x&1 :: Bit)].
ROL dop(a, x) CCC[c] ==> a[(x<<1)|c] CCC[((x>>15)&1 :: Bit)].
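The two rotate rules can be checked with a small executable model. The following Python sketch mirrors the right-hand sides of ROR and ROL on a 16-bit word; the function names and the explicit 16-bit mask are assumptions of the sketch (in the specification the truncation is left to the Word type of the destination).

```python
WORD_MASK = 0xFFFF  # 16-bit PDP-11 word (explicit mask is an assumption of this sketch)


def ror(x, c):
    """Rotate x right through carry c; returns (new word, new carry)."""
    return ((x >> 1) | (c << 15)) & WORD_MASK, x & 1


def rol(x, c):
    """Rotate x left through carry c; returns (new word, new carry)."""
    return ((x << 1) | c) & WORD_MASK, (x >> 15) & 1
```

Because the carry bit takes part in the rotation, `rol(*ror(x, c))` returns the original pair `(x, c)` for any 16-bit `x`.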
MOV sop(v) dop(a, -) ==> a[v].
abrithm(x, y, x + y) = ADD. # add source to destination
abrithm(x, y, x - y) = SUB. # subtract source from destination
abrithm(x, y, ¬x & y) = BIC. # bit clear destination from source
abrithm(x, y, x | y) = BIS. # bit set destination from source
abrithm(x, y, r) sop(x) dop(a, y) ==> a[r] cc(r).
cmps(x, y, x - y) = CMP. # compare source to destination
cmps(x, y, x & y) = BIT. # bit test source and destination bytes
cmps(x, y, r) sop(x) dop(-, y) ==> cc(r).
SEN ==> CCN[TRUE]. CLN ==> CCN[FALSE].
SEZ ==> CCZ[TRUE]. CLZ ==> CCZ[FALSE].
SEV ==> CCV[TRUE]. CLV ==> CCV[FALSE].
SEC ==> CCC[TRUE]. CLC ==> CCC[FALSE].
SCC ==> CCN[TRUE] CCZ[TRUE] CCV[TRUE] CCC[TRUE].
CCC ==> CCN[FALSE] CCZ[FALSE] CCV[FALSE] CCC[FALSE].
branch(true) = BR. # branch always
branch(¬z) = BNE CCZ[z]. # ≠ 0
branch(z) = BEQ CCZ[z]. # = 0
branch(¬(n ^ v)) = BGE CCN[n] CCV[v]. # ≥ 0
branch(n ^ v) = BLT CCN[n] CCV[v]. # < 0
branch(¬(z | (n ^ v))) = BGT CCZ[z] CCN[n] CCV[v]. # > 0
branch(z | (n ^ v)) = BLE CCZ[z] CCN[n] CCV[v]. # ≤ 0
branch(¬n) = BPL CCN[n]. # +
branch(n) = BMI CCN[n]. # -
branch(¬(c | z)) = BHI CCC[c] CCZ[z]. # higher (unsigned comparison)
branch(¬c) = BCC CCC[c]. # carry clear
= BHIS CCC[c]. # higher or same (unsigned compar.)
branch(c) = BCS CCC[c]. # carry set
= BLO CCC[c]. # lower (unsigned comparison)
branch(c | z) = BLOS CCC[c] CCZ[z]. # lower or same (unsigned compar.)
branch(¬v) = BVC CCV[v]. # overflow clear
branch(v) = BVS CCV[v]. # overflow set
PC[pc] pc[branch(c) d]cp =[ c ]=> PC[cp + d]; # branch
==> PC[cp]. # no branch
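The branch archetype amounts to a table from mnemonics to predicates over the condition codes, followed by a single rule that updates the program counter. The Python sketch below is a hypothetical rendering of that table; the names BRANCH_CONDITIONS and next_pc are illustrative only, and the new program counter is computed as cp + d exactly as the rule is written above, with cp the address following the branch instruction.

```python
# Predicates over the condition codes (n, z, v, c), given as booleans.
BRANCH_CONDITIONS = {
    "BR":   lambda n, z, v, c: True,
    "BNE":  lambda n, z, v, c: not z,
    "BEQ":  lambda n, z, v, c: z,
    "BGE":  lambda n, z, v, c: not (n ^ v),
    "BLT":  lambda n, z, v, c: n ^ v,
    "BGT":  lambda n, z, v, c: not (z or (n ^ v)),
    "BLE":  lambda n, z, v, c: z or (n ^ v),
    "BPL":  lambda n, z, v, c: not n,
    "BMI":  lambda n, z, v, c: n,
    "BHI":  lambda n, z, v, c: not (c or z),
    "BLOS": lambda n, z, v, c: c or z,
    "BCC":  lambda n, z, v, c: not c,
    "BHIS": lambda n, z, v, c: not c,
    "BCS":  lambda n, z, v, c: c,
    "BLO":  lambda n, z, v, c: c,
    "BVC":  lambda n, z, v, c: not v,
    "BVS":  lambda n, z, v, c: v,
}


def next_pc(mnemonic, flags, cp, d):
    """New PC after a branch: cp + d if the condition holds, else cp."""
    n, z, v, c = flags
    taken = bool(BRANCH_CONDITIONS[mnemonic](n, z, v, c))
    return cp + d if taken else cp
```

Keeping the predicates in one table mirrors the way a single update rule serves all seventeen mnemonics in the specification.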
JMP dop(-, a) ==> PC[a].
reg(r, v) = r[v] 1. 
PC[pc] pc[JSR r dop(a, -)]qc reg(r, v) reg(6, tp) 
==* PC[a] reg(6, sp) sp[v]tp reg(r, qc). 
RTS r reg(r, pc) reg(6, sp) sp[v]tp =#. PC[pc] reg(6, tp) reg(r, v). 
trap(14₈, 16₈) = BPT. # breakpoint trap
trap(20₈, 22₈) = IOT. # input/output trap
trap(30₈, 32₈) = EMT. # emulator trap
trap(34₈, 36₈) = TRAP. # TRAP
trap(Pca7 pswa) reg(6, tp) Pca[Pcl Pswa[PSWI PC[Pc., ] PSW[Psw., ] 
==> sp[pc, psws]tp reg(6, sp) PC[pc] PSW[psw]. 
reto = RTI. # return from interrupt 
= RTT. # return from trap 
reto reg(6, sp) sp[pc psw]tp ==* PC[pc] PSW[psw] reg(6, tp). 
NOP ==> . # no operation
Appendix C 
SPARC-V9 
{Bit}.
{Byte(01₂) = {Bit}8, Halfword(10₂) = {Bit}16, Word(00₂) = {Bit}32,
Extended-word = {Bit}64}.
{Fp-single = {Bit}32, Fp-double = {Bit}64, Fp-quad = {Bit}128}.
nPC :: Constant.
IMM13(1₂) :: Bit.
OP:ARM(10₂), OP:LS(11₂), OP:BRS(00₂), OP:CLL(01₂) :: {Bit}2.
BRZ(01₂), BRLEZ(10₂), BRLZ(11₂) :: {Bit}2.
ADD(000₂), SUB(100₂), SAVE(100₂), RESTORE(101₂) :: {Bit}3.
BPA(1000₂), BPN(0000₂), BPNE(1001₂), BPE(0001₂) :: {Bit}4.
LDSTUB(00 1101₂) :: {Bit}6.
FADD(0 0100 00₂), FSUB(0 0100 01₂) :: {Bit}7.
REGRES(0 00 00 00 00₂) :: {Bit}9.
rr(0, 0) =. # reading r[0] yields 0
rr(a, v) = a[v] =[ 1 ≤ a < 8 ]=> . # global registers
= CWP[w] outr(a - 8, v, w) =[ 8 ≤ a < 16 ]=> .
= CWP[w] locr(a - 16, v, w) =[ 16 ≤ a < 24 ]=> .
= CWP[w] inr(a - 24, v, w) =[ 24 ≤ a < 32 ]=> .
rw(a, v) a = CWP[w] rw(a, v, w).
rw(a, v, w) = =[ 0 = a ]=> . # writing r[0] has no effect
= =[ 1 ≤ a < 8 ]=> a[v]. # global registers
= =[ 8 ≤ a < 16 ]=> outw(a - 8, v, w).
= =[ 16 ≤ a < 24 ]=> locw(a - 16, v, w).
= =[ 24 ≤ a < 32 ]=> inw(a - 24, v, w).
outr(a, v, w) = ilocr(a, v, w).
outw(a, v, w) = ==> ilocw(a, v, w).
locr(a, v, w) = ilocr(a + 8, v, w).
locw(a, v, w) = ==> ilocw(a + 8, v, w).
inr(a, v, w) = ilocr(a, v, w + 1).
inw(a, v, w) = ==> ilocw(a, v, w + 1).
ilocr(a, v, w) = (NWINDOWS - 1 - (w % NWINDOWS)) × 16 + a[v].
ilocw(a, v, w) = ==> (NWINDOWS - 1 - (w % NWINDOWS)) × 16 + a[v].
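The address arithmetic behind the windowed registers can be traced with a few lines of Python. The sketch below only reproduces the formula of ilocr/ilocw and the offsets used by the out, local and in archetypes; the function names and the value chosen for NWINDOWS are assumptions made for illustration (the number of windows is implementation-dependent).

```python
NWINDOWS = 8  # assumed value for the sketch; implementation-dependent in SPARC-V9


def iloc(a, w):
    """Cell address of slot a (0..15) of the window selected by w."""
    return (NWINDOWS - 1 - (w % NWINDOWS)) * 16 + a


def out_reg(a, w):
    """outr/outw: out register a (0..7) of window w."""
    return iloc(a, w)


def local_reg(a, w):
    """locr/locw: local register a (0..7) of window w."""
    return iloc(a + 8, w)


def in_reg(a, w):
    """inr/inw: in register a (0..7) of window w."""
    return iloc(a, w + 1)


def windowed_address(r, w):
    """Map architectural register r (8..31) to its cell address for window w."""
    if 8 <= r < 16:
        return out_reg(r - 8, w)
    if 16 <= r < 24:
        return local_reg(r - 16, w)
    if 24 <= r < 32:
        return in_reg(r - 24, w)
    raise ValueError("global registers (0..7) are not windowed")
```

The formulas imply that the in registers of window w occupy the same cells as the out registers of window w + 1, including across the wrap-around at NWINDOWS, which is how adjacent windows share eight registers.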
fr(a, v, s) a = a[(v :: Fp-single)] =[ s = 01₂ ]=> .
= a[(v :: Fp-double)] =[ s = 10₂ ]=> .
= a[(v :: Fp-quad)] =[ s = 11₂ ]=> .
fw(a, v, s) a = =[ s = 01₂ ]=> a[(v :: Fp-single)].
= =[ s = 10₂ ]=> a[(v :: Fp-double)].
= =[ s = 11₂ ]=> a[(v :: Fp-quad)].
sop1(v) = rr(a, v) a.
sop2(v) = REGRES(v).
= IMM13(v).
REGRES(v) a = rr(a, v). # REGRES = 0 00 00 00 00₂
IMM13(sign-ext(v)) = v :: {Bit}13. # IMM13 = 1₂
iccr(c) = CCR:ICC:C[c].
iccrb(c) = 1₂ iccr(c).
iccrb(0) = 0₂.
ccw(v) CCR: XCC[(v <4, Ö) (v = 0) (-(MIN4s : 54, V 
-<4s 
MAX4, » (V >4u MAX)] 
CCR: ICC[(V <2,0) (V = 0) ('(MIN2s : 52s V 
-<2s 
MAX2s» (V >2u MAX2J)- 
ccwb(V) ý 02- 
ý-- 12 CCW(V)- 
Bit. 
aloper(x, y, x+y+ c) 02 ccwb($3) iccrb(c) ADD. # ADD 0002 
aloper(x, y, x-y- c) 02 ccwb($3) iccrb(c) SUB. # SUB 1002 
n:: Bit. 
op:: {Bit}2. 
neg(v) V 12 1*; 
==: > # empty body if V 54 12 
1OP(OP) & OP 012 Y* # bit-wise and 
OP 102 J#> # bit-wise or 
OP 112 1=ý # bit-wise xor 
aloper(x, y, neg(n) (X IOP(OP) Y)) --"= 02 ccwb($3) 02 n op- 
PC[pc] pc[pcoJqc nPC[pc'] ===> PC[pc] pc'fpco]qc nPC[pc'+ 4]. 
OP: ARM rw(-, r) aloper(x, y, r) sopl(x) sop2(y) ==* . 
#OP: ARM= 102 
sr(x, y, x+y+c, w+ 1) = 
10 ccwb($3) iccr(c) SAVE CWP[w] SAVE = 1002 
sr(x, y, x+y+c, w- 1) = 
10 ccwb($3) iccr(c) RESTORE CWP[w] #RESTORE = 1012 
OP: ARM rw(a, r, w) a sr(x, y, r, w) sopl(x) sop2(y) =ý- CWP[w % NWINDOWS]. 
r:: Byte 
ldstub(a, r) = LDSTUB a[r] ==ý. a[l 111111121 -# LDSTUB = 
00 11012 
OP: LS rw(-, r) ldstub(x + y, r) sopl(x) sop2(y) # OP: LS = 112 
signed(s) = zero-f ill =4 s=0 ý* . 
= sign-ext =4 s == 1 1=--> - 
f Integer = Byte I Half word I Word}. 
asi :: Byte. 
sopasi(x, y, asi) = SOP1(X) 02 asi sopi(y). 
= sopl(x) IMM13(y) ASI[asi]. 
s :: Bit. 
lias(asi, a, signed(s) (r)) S 02 Integer asi: a[(r:: Integer)]. 
lias(asi, a, r) 10112 asi: a[(r:: Extended-word)]. 
OP: LS rw(-, r) 012 lias(asi, x+y, r) sopasi(x, y, asi) ==* . 
sias(asi, a, r) 012 Integer ==. > asi: a[(r :: Integer)]. 
sias(asi, a, r) 11102 ==* asi: a[(r:: Extended-word)]. 
OP: LS rr(-, r) 012 sias(asi, x+y, r) sopasi(x, y, asi) =* . 
s :: fBit}2. 
f arithm(x, y, x+ y) = FADD. # FADD 0 0100 002 
f arithm(x, y, x- y) = FSUB. # FSUB 0 0100 012 
OP: ARM f w(-, r, s) 1101002 f r(-, x, s) f arithm(x, y, r) sf r(-, y, s) ==> . 
f cmp(Xl Y) = =1 X=Y ý* 002- 
X<Y J#'012- 
X>Y 102- 
X? Y 112- # unordered (x or/and y is NaN) 
v:: fBitJ2. 
f ccnr(f cc, v) = FSR: FCC: f cc[v]. 
fccnw(fcc, v)= ==t-FSR: FCC: fcc[vj. 
OP: ARM 0002 CCI CCO 1101012 fr(-, x, s) 00101002 sf r(-, y, s) ==> 
f ccnw(ccl ý ccO, f cmp(x, y)). 
ldfa(asi, a, r, 01) 002 asi: a[(r:: Fp-single)]. #LDFA 
ldf a(asi, a, r, 10) 112 asi: a[(r Fp-double)). # LDDFA 
ldf a(asi, a, r, 11) 102 asi: a[(r Fp-quad)]. # LDQFA 
OP: LS f w(-, r, s) 11002 ldf a(asi, x+y, r, s) sopasi(x, y, asi) 
stf a(asi, a, r, 01) 002 asi: a[(r :: Fp-single)]. # STFA 
stf a(asi, a, r, 10) 112 asi: a[(r:: Fp-double)]. # STDFA 
stf a(asi, a, r, 11) 102 asi: a[(r:: Fp-quad)]. # STQFA 
OP: LS f r(-, r, s) 11012 stf a(asi, x+y, r, s) sopasi(x, y, asi) 
PC[pc] nPC[npc] pc[OP: ARM rw(-, pc) 1110002 SOP1(X) sop2(y)] ==* PC[npc] nPc[x+y]. 
a, p, n :: Bit. annul, prediction and "negate condition" bits 
d16hi:: jBitj2.2-bit PC-relative displacement 
rsl:: {Bitl5. address of the Ist source register 
d161o:: jBitj14.14-bit PC-relative displacement 
bpr-cond(v, neg(n) (v = 0)) = n BRZ. # Branch on Register Zero
bpr-cond(v, neg(n) (v ≤ 0)) = n BRLEZ. # " Register Less Than or Equal to Zero
bpr-cond(v, neg(n) (v < 0)) = n BRLZ. # " Register Less Than Zero
ea(pc, displ) pc + (4 x sign-ext(displ)) =. 
PC[pc] nPC[npc] pc[OP: BRS a 02 bpr-cond(v, c) 0112 d16hi p rw(rsl, v) d161o] 
c J* Pc[npc] nPc[ea(pc, d16hi ý d161o)]. 
--ic Aa=0 1=> PC[npcl nPC[npc + 4]. # instruction in delay slot executed 
=1 --ic Aa=1 j#- PC[npc + 41 nPC[npc + 8]. # instruction in delay slot annulled 
n, z, v, c :: Bit. # bits of the ICC/XCC register
cc1, cc0 :: Bit. # condition codes selection
cond :: {Bit}4. # the condition field
disp19 :: {Bit}19. # branch's PC-relative displacement
bp-cond(-, TRUE) = BPA. # Branch Always
bp-cond(-, FALSE) = BPN. # " Never
bp-cond(ncc, ¬z) = CCR:ncc[n z v c] BPNE. # " on ≠
bp-cond(ncc, z) = CCR:ncc[n z v c] BPE. # " on =
bp-cond(ncc, ¬(z ∨ (n ^ v))) = CCR:ncc[n z v c] BPG. # " on >
bp-cond(ncc, z ∨ (n ^ v)) = CCR:ncc[n z v c] BPLE. # " on ≤
bp-cond(ncc, ¬(n ^ v)) = CCR:ncc[n z v c] BPGE. # " on ≥
bp-cond(ncc, n ^ v) = CCR:ncc[n z v c] BPL. # " on <
bp-cond(ncc, ¬(c ∨ z)) = CCR:ncc[n z v c] BPGU. # " on > Unsigned
bp-cond(ncc, c ∨ z) = CCR:ncc[n z v c] BPLEU. # " on ≤ Unsigned
bp-cond(ncc, ¬c) = CCR:ncc[n z v c] BPCC. # " on ≥ Unsigned
bp-cond(ncc, c) = CCR:ncc[n z v c] BPCS. # " on < Unsigned
bp-cond(ncc, ¬n) = CCR:ncc[n z v c] BPPOS. # " on Positive
bp-cond(ncc, n) = CCR:ncc[n z v c] BPNEG. # " on Negative
bp-cond(ncc, ¬v) = CCR:ncc[n z v c] BPVC. # " on Overflow Clear
bp-cond(ncc, v) = CCR:ncc[n z v c] BPVS. # " on Overflow Set
PC[pc] nPC[npc] pc:COND[cond]
pc[OP:BRS a]pc:COND[bp-cond(cc1 ý cc0, c) 001₂ cc1 cc0 p disp19]
=[ ¬c ∧ a = 0 ]=> PC[npc] nPC[npc + 4].
=[ ¬c ∧ a = 1 ]=> PC[npc + 4] nPC[npc + 8].
=[ c ∧ a = 0 ]=> PC[npc] nPC[ea(pc, disp19)].
=[ c ∧ a = 1 ∧ cond = BPA ]=> PC[ea(pc, disp19)] nPC[ea(pc, disp19) + 4].
=[ c ∧ a = 1 ∧ cond ≠ BPA ]=> PC[npc] nPC[ea(pc, disp19)].
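The five guarded alternatives of the conditional branch rule can be read as a function from the current PC/nPC pair to the next one. The Python sketch below follows the alternatives in the order given, together with the effective-address archetype ea(pc, displ) = pc + 4 × sign-ext(displ); the function and parameter names are hypothetical, and the boolean `always` stands for the test cond = BPA.

```python
def sign_ext(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def ea(pc, disp, bits=19):
    """Effective address: pc + 4 * sign_ext(disp)."""
    return pc + 4 * sign_ext(disp, bits)


def bpcc_next(pc, npc, taken, annul, always, disp19):
    """Return the next (PC, nPC) pair for a BPcc instruction at pc."""
    target = ea(pc, disp19)
    if not taken:
        # Untaken: the delay-slot instruction is executed unless annulled.
        return (npc + 4, npc + 8) if annul else (npc, npc + 4)
    if annul and always:
        # Branch always with the annul bit set skips the delay slot.
        return (target, target + 4)
    # Taken branch: the delay slot at npc executes, then control reaches target.
    return (npc, target)
```

Writing the rule this way makes the asymmetry visible: the annul bit suppresses the delay slot for untaken conditional branches and for branch-always, but not for taken conditional branches.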
f bp-cond(-, TRUE) = FBPA. # Branch Always 
f bp-cond(-, FALSE) = FBPN. #" Never 
f bp-cond(f cc, v= U) =f ccnr(f cc, v) FBPU. #" on Unordered 
f bp-cond(f cc, v= G) =f ccnr(f cc, v) FBPG. #" on > 
f bp-cond(f cc, v=UVv= G) =f ccnr(f cc, v) FBPUG. #" on Unordered or 
f bp-cond(f cc, v= L) =f ccnr(f cc, v) FBPL. # on < 
f bp-cond(f cc, v=UVv= L) =f ccnr(f cc, v) FBPUL. # on Unordered or < 
f bp-cond(f cc, v=LVv= G) =f ccnr(f cc, v) FBPLG. # on< or> 
f bp-cond(f cc, v= -iE) =f ccnr(f cc, v) FBPNE. # on 54 
f bp-cond(f cc, v= E) =f ccnr(f cc, v) FBPE. # on = 
f bp-cond(f cc, v=UVv= E) =f ccnr(f cc, v) FBPUE. # on Unordered or 
f bp-cond(f cc, v=GVv= E) =f ccnr(f cc, v) FBPGE. # on > 
f bp-cond(f cc, v= -iL) =f ccnr(f cc, v) FBPUGE. # on Unordered or > 
f bp-cond(f cc, v=LVv= E) =f ccnr(f cc, v) FBPLE. # on < 
f bp-cond(f cc, v= -G) =f ccnr(f cc, v) FBPULE. #" on Unordered or < 
f bp-cond(f cc, v=-, U) =f ccnr(f cc, v) FBPO. #" on Ordered 
#e:: Bit. lgu:: {Bit}3. 
#f bp-cond(f cc, neg(e) (dm(-, v) & (-, lgu + 1))) =f ccnr(f cc, v) e lgu. 
PC[pc] nPC[npcj pc: COND[cond] 
pc[OP: BRS a]pc: COND[f bp-cond(ccl ý ccO, c) 1012 ccl ccO p displ9] 
-ic Aa=0 ]=-ý- PC[npc] nPC[npc + 4]. 
-, c Aa=1 j#- PC[npc + 4] nPC[npc + 8]. 
cAa=0 ý* PC[npcj nPC[ea(pc, displ9)]. 
cAa=1A cond = FBPA J#ý PC[ea(pc, displ9)] nPC[ea(pc, displ9) + 4]. 
cAa=1A cond =7ý FBPA J=: ý PC[npc] nPC[ea(pc, displ9)). 
disp30 :: {Bit}30.
PC[pc] nPC[npc] CWP[w] pc[OP:CLL disp30] # OP:CLL = 01₂
==> PC[npc] nPC[ea(pc, disp30)] rw(15, pc, w).
PC[pc] nPC[npc] CWP[w] pc[OP:ARM 00000₂ 111001₂ sop1(x) sop2(y)]
==> PC[npc] nPC[x + y] CWP[(w - 1) % NWINDOWS].
OP: BRS 000002 1002 002 000002 000002 000002 000002 `ý - 
I PC[pc] nPC[npc] pc[OP: ARM rw(-, pc) 1110002 sopl(x) sop2(y)] 
2 =* PC[npc] nPC[x + y]. # [3.22] 
3# [2.2] rw (a, v) a= CWP [w] rw (a, v, w). a, pc =v 
4 PC[pc] nPC[npc] CWP[w] pc[OP: ARM rw(a, v, w) a 1110002 SOP1(X) sop2(y)] 
5 ==* PC[npc] nPC[x + y]. 
6# [3.1] s op 1 (vl ) =- rr (a,, v, ) a, X=V, 
7 PC[pc] nPC[npc] CWP[w] pc[OP: ARM rw(a, v, w) a 1110002 rr(al, vi) a, sop2(y)] 
8 ==* PC[npc] nPC[vi + y]. 
9# [3.1] sop2(v2) = IMM13(v2). Y= V2 
10 PC[pcj nPC[npc] CWP[w] pc[OP: ARM rw(a, v, w) a 1110002 rr(al, vi) a, IMM13(V2)] 
11 ==* PC[npc] nPc[vi + V21- 
12 # [2.2] rw (a, v, w) = =1 I<a<8 J=ý a [vl. 
13 PC[pc] nPC[npcj CWP[w] pc[OP: ARM a 1110002 rr(al, vl) a, IMM13(V2)] 
14 =[ 1<a<8 ý* PC[npc] nPC[vi + V21 a[v). 
15 # [2.1] rr (a,, v, ) = CWP [w, j inr (a, - 24, vi, wi) =[ 24 < a, < 32 
16 PC[pc] nPC[npcj CWP[w] CWP[wi] pc[OP: ARM a 1110002 inr(al - 24, vi, wi) a, IMM13(V2)] 
17 =4 (1 <a< 8) A (24 < a, < 32) ý* PC[npc] nPC[vj + V2] a[v]. 
18 # [3.2] IMM13 (sign 
-ext 
(v3)) = (v3 :: jBitJ13). v2 = sign-ext(V3) 
19 PC[pc] nPC[npcl CWP[w] CWP[wl] 
20 pc[OP: ARM a 1110002 inr(al -24, vl, wl) a, IMM13 (V3 Bit} 13)] 
21 =[ (I <a< 8) A (24 < a, < 32) ý* PC[npc] nPC[vj + V2] a[vl. 
22 # [2.3] inr(a2, V4, VI) = ilocr(a2, V4, VI + I)- a, -24 = a2, VI = V4 
23 PC[pc] nPC[npc] CWP[w] CWP[wll 
24 pc[OP: ARM a 1110002 ilocr(a2, V4, Wl + 1) a, IMM13 (V3 :: fBit}13)] 
25 =f (1 <a< 8) A (24 < a, < 32) ý* PC[npc] nPC[vj + V2] a[v]. 
26 # [2.4] ilocr (a2 , 
V4 i T72) = (NWINDOWS -I- (w2 % NWINDOWS)) x 16 + a2 
[V41 
27 # Wl +I= W2 
28 PC[pc] nPC[npc] CWP[w] CWP[wl] (NWINDOWS -1- 
(TI72 % NWINDOWS)) x 16 + a2[V4] 
29 pc[OP: ARM a 1110002 a, IMM13 
(V3 
:: fBit}13)] 
30 =1 (1 -< a< 
8) A (24 <- a, < 32) 1* PC[npc] nPC[vl + V2] a[v]. 
31 # after resolution and rearrangement of locators and text 
32 PC[pc] nPC[npc] CWP[w] (NWINDOWS -1- ((w + 1) % NWINDOWS)) x 16 + (a, - 24)[x] 
33 pc[OP: ARM - 1110002 a, IMM13 
(V3 :: {Bit}13)] #field format of current instruction 
34 =4 (I <-< 8) A (24 < a, < 32) ý* # conditions for application 
35 PC[npc] nPC[x + sign-ext(V3)] -[pc]. 
# effects of the instruction on configuration 
Figure C.1: Example of archetype expansion
Appendix D 
Java Virtual Machine 
{Bit}.
{Boolean = {Bit}32, Byte = {Bit}32, Char = {Bit}32, Short = {Bit}32,
Int = {Bit}32, Float = {Bit}32, Reference = {Bit}32,
ReturnAddress = {Bit}32, Long = {Bit}64, Double = {Bit}64}.
{Category1 = Boolean | Byte | Char | Short | Int | Float | Reference | ReturnAddress,
Category2 = Long | Double,
Category12 = Category1 | Category2}.
{UnsignedByte = {Bit}8, SignedByte = {Bit}8,
UnsignedShort = {Bit}16, SignedShort = {Bit}16}.
{Word = Category1 Category1 | Category2}.
POP(87), POP2(88), DUP(89), DUP-X1(90), DUP-X2(91), DUP2(92),
DUP2-X1(93), DUP2-X2(94), SWAP(95),
ILOAD(21), ILOAD-0(26), ILOAD-1(27), ILOAD-2(28), ILOAD-3(29),
ISTORE(54), ISTORE-0(59), ISTORE-1(60), ISTORE-2(61), ISTORE-3(62),
IADD(96), ISUB(100), IMUL(104), IDIV(108), IREM(112),
IAND(126), IOR(128), IXOR(130), ISHL(120), ISHR(122), IUSHR(124),
INEG(116), IINC(132),
ICONST-M1(2), ICONST-0(3), ICONST-1(4), ICONST-2(5), ICONST-3(6), ICONST-4(7),
ICONST-5(8),
BIPUSH(16), SIPUSH(17),
IFEQ(153), IFNE(154), IFLT(155), IFGE(156), IFGT(157), IFLE(158),
IF-ICMPEQ(159), IF-ICMPNE(160), IF-ICMPLT(161), IF-ICMPGE(162), IF-ICMPGT(163),
IF-ICMPLE(164),
GOTO(167), GOTO-W(200), JSR(168), JSR-W(201), RET(169),
WIDE(196) :: UnsignedByte.
bs :: SignedByte.
bu, opc :: UnsignedByte.
ss, δs :: SignedShort.
su :: UnsignedShort.
i, j, δi :: Int.
z :: Category12.
POP SP[t] s[v]t ==> SP[s].
POP2 SP[t] s[w]t ==> SP[s].
DUP SP[t] s[v]t ==> SP[t'] s[v v]t'.
DUP2 SP[t] s[w]t ==> SP[t'] s[w w]t'.
DUP-X1 SP[t] s[v1 v0]t ==> SP[t'] s[v0 v1 v0]t'.
DUP2-X1 SP[t] s[v1 w0]t ==> SP[t'] s[w0 v1 w0]t'.
DUP-X2 SP[t] s[w1 v0]t ==> SP[t'] s[v0 w1 v0]t'.
DUP2-X2 SP[t] s[w1 w0]t ==> SP[t'] s[w0 w1 w0]t'.
SWAP SP[t] s[v1 v0]t ==> SP[t'] s[v0 v1]t.
v:: Categoryl. 
w:: Word. 
push(v) = popsh(, v). 
pop(v) = popsh(v, ). 
var(n, z) n[z] =. 
(v(b,,,, bu2) (((bul << 8) 1 bu2) :: UnsignedShort) =- 
(S(bul, bu2) (((b,,, << 8) JbI12) :: SignedShort) =. 
6(b,,,, bu2, bu3, bu4) ((b,,, << 24) 1 (bU2 << 16) 1 (bu3 << 8) lbu4) 
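The three operand-forming archetypes above combine instruction bytes into wider values by shifting and or-ing, most significant byte first. The Python sketch below shows the same arithmetic; the function names are hypothetical, and treating the four-byte combination as a signed 32-bit value is an assumption based on its use as a wide branch offset.

```python
def unsigned_short(b1, b2):
    """((b1 << 8) | b2) viewed as an UnsignedShort (0..65535)."""
    return ((b1 << 8) | b2) & 0xFFFF


def signed_short(b1, b2):
    """((b1 << 8) | b2) viewed as a SignedShort (-32768..32767)."""
    value = unsigned_short(b1, b2)
    return value - 0x10000 if value & 0x8000 else value


def signed_int(b1, b2, b3, b4):
    """Four bytes, most significant first, viewed as a signed 32-bit Int (assumed)."""
    value = (b1 << 24) | (b2 << 16) | (b3 << 8) | b4
    return value - (1 << 32) if value & (1 << 31) else value
```

For example, the byte pair `0xFF 0xFE` denotes 65534 as an unsigned index but -2 as a signed branch offset, which is why the specification distinguishes the two by type.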
wide(opc, b, ) = opc b,,. 
wide (opc, b, bs) = opc b,, bs. 
wide(opc, (v(bul, bu2)) ý WIDE opc bul bu2 - 
wide(opc, a(bul, bu2)i 6(bu3, bu4)) = WIDE opc bul b112 bu3 bu4- 
i3(O) = 0. i3(1) = 1. i3(2) = 2. i3(3) = 3. 
iload(i) =wide(ILOAD, i). 
iload(i) = ILOAD--i3(i). 
istore(i) = wide(ISTORE, i). 
istore(i) = ISTORE--i3(i). 
iload(n) var(n, v) push(v) ==#ý . 
istore(n) pop(v) --ý, var(n, v). 
iarithm-binary(i1, i2, i1 + i2) = IADD.
iarithm-binary(i1, i2, i1 - i2) = ISUB.
iarithm-binary(i1, i2, i1 × i2) = IMUL.
iarithm-binary(i1, i2, i1 / i2) = IDIV =[ i2 ≠ 0 ]=> .
iarithm-binary(i1, i2, i1 % i2) = IREM =[ i2 ≠ 0 ]=> .
iarithm-binary(i1, i2, i1 & i2) = IAND.
iarithm-binary(i1, i2, i1 | i2) = IOR.
iarithm-binary(i1, i2, i1 ^ i2) = IXOR.
iarithm-binary(i1, i2, i1 << i2) = ISHL.
iarithm-binary(i1, i2, i1 >> i2) = ISHR.
iarithm-binary(i1, i2, i1 >>> i2) = IUSHR.
iarithm-binary(i1, i2, ir) popsh(i1 i2, ir) ==> .
iarithm-monadic(i1, -i1) = INEG.
iarithm-monadic(i, ir) popsh(i, ir) ==> .
wide(IINC, su, ss) var(su, i) ==> var(su, i + ss).
i5(-1) = Ml. i5(O) = 0. i5(1) = 1. i5(2) = 2. i5(3) = 3. i5(4) = 4. i5(5) = 5. 
iconst(i) = ICONST--i5(i). 
iconst(b, ) = BIPUSH b.. 
iconst(J(b,,,, bu2)) = SIPUSH bul bu2- 
iconst(i) push(i) =: > . 
jmpc(i, i = 0, δ(b1, b2)) = IFEQ b1 b2.
jmpc(i, i ≠ 0, δ(b1, b2)) = IFNE b1 b2.
jmpc(i, i < 0, δ(b1, b2)) = IFLT b1 b2.
jmpc(i, i ≥ 0, δ(b1, b2)) = IFGE b1 b2.
jmpc(i, i > 0, δ(b1, b2)) = IFGT b1 b2.
jmpc(i, i ≤ 0, δ(b1, b2)) = IFLE b1 b2.
jmpc(i, j, i = j, δ(b1, b2)) = IF-ICMPEQ b1 b2.
jmpc(i, j, i ≠ j, δ(b1, b2)) = IF-ICMPNE b1 b2.
jmpc(i, j, i < j, δ(b1, b2)) = IF-ICMPLT b1 b2.
jmpc(i, j, i ≥ j, δ(b1, b2)) = IF-ICMPGE b1 b2.
jmpc(i, j, i > j, δ(b1, b2)) = IF-ICMPGT b1 b2.
jmpc(i, j, i ≤ j, δ(b1, b2)) = IF-ICMPLE b1 b2.
jmpu(TRUE, δ(b1, b2)) = GOTO b1 b2.
jmpu(TRUE, δ(b1, b2, b3, b4)) = GOTO-W b1 b2 b3 b4.
jump(cond, δi) = jmpc(i, cond, δi) pop(i).
= jmpc(i, j, cond, δi) pop(i j).
= jmpu(cond, δi).
PC[pc] pc[jump(cond, δi)]qc =[ cond ]=> PC[pc + δi];
=[ ¬cond ]=> PC[qc].
jsr(δ(b1, b2)) = JSR b1 b2.
jsr(δ(b1, b2, b3, b4)) = JSR-W b1 b2 b3 b4.
PC[pc] pc[jsr(δi)]qc push(qc) ==> PC[pc + δi].
PC[pc] pc[wide(RET, n)]qc var(n, a) ==> PC[a].
Appendix E 
n-PRAM 
{MrModel = Int, MwModel = Int}. # type aliases for the memory read and write models
{ER, CR :: MrModel,
mr :: MrModel}.
{EW, CW-WEAK, CW-COMMON, CW-TOLERANT, CW-COLLISION, CW-COLLISIONP,
CW-ARBITRARY, CW-PRIORITY :: MwModel,
mw :: MwModel}.
pram(n, rm, wm) = FE 1 pcs(n) # update PCs, block (stage 2) if no RAM is running 
3 shr(n, n, rm, O) # shared memory reads 
4 shw(n, n, WM, O). # shared memory writes 
pcs(O) =. #0 RAMs 
pcs(p) = pc(p) pcs(p - 1) =1 p>0 
pC(p) = FEI 1 P-p: PC[pcl pc[instr(p)lqc =1 0< pc <C j#- P-p: PC[qc] 
2 =. 
pC(P) = P-P: pc[pcl =[ 0> PC V PC >C J#- . 
halt(O) =- #halt 0 RAMs 
halt(p) = halt(p - 1) =f p>0 ý* P-p: PC[C]. # halt p RAMs 
shrd (p, a) =P -p: MAR [a]. 
n 
# shr (n, CR) = shrd (p, ap) 
P=j 
n # shr (n, ER) = shrd (p, ap) 
P=j 
n #= shrd(p, ap) 
P=j 
# running 
# don't block 
# not running 
n 
read (p, ap). 
P=j 
halt(p) apFffL -. 
jjOjaWaLjj#> #readconflict E- 
P=j p 
n 
. read(p, ap) =Jfl apýL-jj=f aRVfL P=l p# 
no read conflict 
shr(O, -, -7 -) ý-#0 
RAMs, no conflicts 
shr(p, n, CR, P-p: MAR[a] shr(p - 1, n, CR, -) read(p, a) 
=4 p>0 )=#, -# CR, no conflicts 
shr(p, n, ER, s) = P-p: MAR[al shr(p - 1, n, ER, s) halt(n) 
=1 a =7ý NULL AaCsAp>0 1=ý .# conflict 
shr(p, n, ER, s) = P-p: MAR[a] shr(p - 1, n, ER, sUf al) read(p, a) 
=4 (a = NULL Vaý s) Ap>0# no conflict 
read(p, a) = a[vj =[ a -7ý NULL J* P-p: ACC[vj; # read value v from shared memory 
==ý> .# not a memory read instruction 
shw(O, -, -, -) =. 
#0 RAMs, no write conflicts 
shw(p, n, wm, s) = P-p: MAR[a] shw(p - 1, n, wm, s) =f aEsAp>0 1=4, #a resolved 
= P-p: MAR[a] shw(p - 1, n, wm, sU ja}) sw(p, n, wm) =[ aýsAp>0 
sw(p, n, EW) # exclusive write
= P-p:MAR[a] P-p:ACC[v] ew(p, a, v, n, n).
sw(p, n, CW-WEAK) # concurrent write, weak model
= P-p:MAR[a] P-p:ACC[v] weak(p, a, v, n, n).
sw(p, n, CW-COMMON) # concurrent write, common model
= P-p:MAR[a] P-p:ACC[v] common(p, a, v, n, n).
sw(p, n, CW-TOLERANT) # concurrent write, tolerant model
= P-p:MAR[a] P-p:ACC[v] tolerant(p, a, v, n, n).
sw(p, n, CW-COLLISION) # concurrent write, collision model
= P-p:MAR[a] P-p:ACC[v] collision(p, a, v, n, n).
sw(p, n, CW-COLLISIONP) # concurrent write, collision+ model
= P-p:MAR[a] P-p:ACC[v] collisionp(p, a, v, n, n).
sw(p, n, CW-ARBITRARY) # concurrent write, arbitrary model
= P-p:MAR[a] P-p:ACC[v] arbitrary(p, a, v, n, n).
sw(p, n, CW-PRIORITY) # concurrent write, priority model
= P-p:MAR[a] P-p:ACC[v] P-p:SIG[s] priority(p, a, v, s, n, n).
# fal E) a2 I al = a2 A al 0 NULL} 
# fal(jDa2 I al 54a2 Val =NULLVa2 =NULL} 
#shwr(p, a, v)=P-p: MAR[al P-p: ACC[vi. 
sh w (n, EW) 
#= shwr (p, ap, -) 
halt (p) aa 
-11#, # w7ite conflict 
P=j P=j pp 
nn 
shwr(p, ap, vp) write (ap, vp) aNTL-11= 
I 
alýýLjj=#- . 
#no wTite conflict 
P=j P=j pp 
ew(-, a, v, 0, -) = write 
(a, v). 
ew(pt, at, vt, p, n) 
= P~p: MAR[a] halt(n) 
=[ pt 7ý pA atE)a Ap>0# write conflict 
= P-p: MAR [a] ew(pt, at, vt, p-1, n) 
=1 (pt =pV at (Da) Ap>0 Y=>. .# no conflict 
# shw(n, CW-WEAK) 
nn 
#= shwr(p, ap, vp) halt (p) # write conflict 
P=1 P=1 
#3 (ap, vp), (aq , Vq) EI 
(apWffL, vp) p0qA ap = aq A (vp =0 0V Vq 0 0) 
nn 
shwr (p, ap, vp) write (ap, vp) #0 or no write conflict 
P=j P=j 
L V(ap, vp), (aq, v. ) (ar, vp) ap = a. =ý 
(Vp 
= Vq 
weak(-, a, v, 0, -) = write(a, v). 
weak (pt, at, vt, p, n) 
= P-p: MAR[a] P-p: ACC[v] weak(pt, at, vt, p-1, n) 
=Ipt OpAatE)aAvt = OAv= OAp > 01#. #write 0 conflict 
= P-p: MAR[a] P-p: ACC[vj halt(n) 
=fpt =OpA atE)aA (vt OOVvOO) Ap > Oý* #conflict, halt 
= P_p: MAR[aj weak(pt, at, vt, p-1, n) 
=4 (pt =pV at @ a) Ap>0 ]=#> .# no conflict, check fuTiher 
# shw(n, CW-COMMON) 
nn 
sh wr (p, ap, vp) h al t (p) 
P=j P=l 
3(ap, vp), (a., Vq) Ef (aP751ýL, vp) ap = a. A vp =A Vq 
nn 
shwr(p, ap, vp) write (ap, vp) 
P=j P=j 
V(ap, vp), (aq, Vq) EI (aplýu'--L, vp) 
II 
ap = aq =4> (Vp = Vq = 0) J=> . 
common(-, a, v, 0, -) = write(a, v). 
common(pt, at, vt, p, n) 
= P_p: MAR[aj P-p: ACC[v] common(pt, at, vt, p-1, n) 
=[ pt /= pA at (E) aA vt =vAp>0 conflict, same value, proceed 
= P-p: MAR[a] P-p: ACC[v] halt(n) 
=[ at E) aA vt 0vAp>0 ý* # conflict, halt 
=P -p: MAR (a] common (pt, at, vt, p-1, n) 
=j (pt =pv at @ a) Ap>0 1=ý .# no conflict 
nn # shw(n, CW-TOLERANT) shwr(p, ap, vp) write(a,, p, vp) 
P=j P=l 
I(ap, vp), (aq, Vq) E (ap7ýLL, vp) (a,, p := 
(p =A qA ap = a. ) ? NULL: ap p 
tolerant(-, a, v, 0, -) = write(a, v). 
tolerant (pt, at, vt, p, n) 
= P_p: MAR[a] 
=4 pt 0pA atE) aAp>0 
= P_p: MAR[a] tolerant (pt, at, vt, p-1, n) 
=[ (pt =pV at (D a) Ap>0 J* . 
# conflict, value not changed 
# no conflict 
# shw (n, CW-COLLISION) = shwr (p, ap, vp) wri te (ap, v,, p) 
P=j P=j 
(ap, vp), (aq , Vq) 
Ef (aP5ý'-", vp) (v,, p := 
(p 0qA ap = aq) ? COLL : vp) 
collision(-, a, v, 0, -) = write(a, v). 
collision(pt, at, vt, p, n) 
= P_p: MAR[aj 
=4 pt : ýý pA at E) aAp>0 1=4> write (at, COLL) # conflict, wTite a collision symbol 
P-p: MAR[a] collision(pt, at, vt, p-1, n) 
=1 (pt =pV at @ a) Ap>0 ]=#- .# no conflict 
nn # shw(n, CW-COLLISIONP) = shwr(p, ap, vp) write (ap, v,, p) 
P=j P=j 
3(ap, vp), (a., vq) EI (aWTLL, vp) (vW := (ap = aq A Vp :A Vq) ? COLL : vp p) 
1=> 
- 
collisionp(-, a, v, 0, -) = write(a, v). 
collisionp(pt, at, vt, p, n) = 
= P~p: MAR[a] P-p: ACC[v] collisionp(pt, at, vt, p-1, n) 
=tpt Op A at E)a A vt =vAp> Oý* # conflict, same value, proceed 
= P-p: MAR[a] P-p: ACC[v] 
=f at E) aA vt 0vAp>0 1=-e write (at, COLL) # conflict, wTite a collision symbol 
P-p: MAR[a] collisionp(pt, at, vt, p-1, n) 
=4(pt =pV at (B a) Ap> 01* .# no conflict 
nn # shw (n, CW-ARBITRARY) = shwr (p, ap, vp) write (ap, v,, p) 
P=l P=j 
#=4E](ap, vp), (aq, Vq)EI(aplýý", vp))I(V,, p: =(p: 7ýqAap=aq)? rand(I 
a'}): vp)ý*- VP 
arbitrary(-, a, v, 0, write (a, v). 
arbitrary(pt, at, vt, p, n) 
= P_p: MAR[a] arbitrary(pt, at, vt, p-1, n) 
=[ pt =7ý pA at (9 aAp>0 ý* # conflict, choose vt 
= P-p: MAR[a] P-p: ACC[v] arbitrary(pt, at, v, p-1, n) 
=4 pt : 7ý pA at (E) aAp>0 ý* # conflict, choose v 
= P~P: MAR[a] arbitrary(pt, at, vt, p-1, n) 
=4 (pt =pV at 0 a) Ap>0 ]=-ý> .# no conflict 
12 n # shw(n, CW-PRIORITY) = shwr(p, ap, vp, sp) write (ap, vp) 
P=j P=l 
#=[I(ap, vp), (aq, v, )EI(alpýULL, vp)ll(v,, p: =(p54qAap=a, 
)? min,, P(f 
'P}): vp)ý*. VP 
priority(_, a, v, _, 0, _) = write(a, v).
priority(pt, at, vt, st, p, n)
  = P_p: MAR[a] P_p: ACC[v] P_p: SIG[s] priority(pt, at, vt, st, p-1, n)
    =[ pt ≠ p ∧ at ⊜ a ∧ st < s ∧ p > 0 ]⇒                 # conflict, p has lower priority
  = P_p: MAR[a] P_p: ACC[v] P_p: SIG[s] priority(pt, at, v, s, p-1, n)
    =[ pt ≠ p ∧ at ⊜ a ∧ st > s ∧ p > 0 ]⇒                 # conflict, p has higher priority
  = P_p: MAR[a] P_p: ACC[v] P_p: SIG[s] priority(pt, at, vt, st, p-1, n)
    =[ (pt = p ∨ at ⊕ a) ∧ p > 0 ]⇒ .                      # no conflict
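
The ARBITRARY and PRIORITY policies both let exactly one of the conflicting writers succeed; they differ in how the winner is chosen. The Python sketch below is an illustration only. It assumes requests are tuples with `None` for a NULL address, that the priority of a processor is given by its signature register with the smaller signature winning (as the guards `st < s` and `st > s` above suggest), and that ties keep the earlier request; the function names are hypothetical.

```python
import random


def resolve_arbitrary(requests, rng=random):
    """One of the competing values per address is chosen at random.

    requests: iterable of (processor, address, value) triples.
    """
    candidates = {}
    for _proc, addr, value in requests:
        if addr is not None:
            candidates.setdefault(addr, []).append(value)
    return {addr: rng.choice(values) for addr, values in candidates.items()}


def resolve_priority(requests):
    """The writer with the smallest signature wins each address.

    requests: iterable of (processor, address, value, signature) tuples.
    """
    best = {}  # address -> (signature, value) of the current winner
    for _proc, addr, value, sig in requests:
        if addr is None:
            continue
        if addr not in best or sig < best[addr][0]:
            best[addr] = (sig, value)
    return {addr: value for addr, (_sig, value) in best.items()}
```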
write(a, v) = =[ a ≠ NULL ]⇒ a[v];         # write value v into shared memory
              ⇒ .                           # not a memory write instruction
accr(p, v) = FE|2 P_p: ACC[v].              # accumulator read
accw(p, v) = FE|2 ⇒ P_p: ACC[v].            # accumulator write
sigr(p, v) = FE|2 P_p: SIG[v].              # signature register read
sigw(p, v) = FE|2 ⇒ P_p: SIG[v].            # signature register write
pcr(p, v) = FE|2 P_p: PC[v].                # program counter read
pcw(p, v) = FE|2 ⇒ P_p: PC[v].              # program counter write
regr(p, r, v) = FE|2 P_p: r[v].             # local register read
regw(p, r, v) = FE|2 ⇒ P_p: r[v].           # local register write
DR(p, r, v) = regr(p, r, v).                # direct register (read)
IR(p, ri, v) = regr(p, r, ri) regr(p, ri, v).   # indirect register (read)
rm(p, a, v) = DR(p, a, v)                   # register modes
            = IR(p, a, v).
irm(p, v) = IMM v                           # immediate mode and register modes
          = rm(p, _, v).
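
The three operand modes (immediate, direct register, indirect register) can be illustrated outside the formalism. The Python sketch below assumes that the local registers of one processor are modelled as a dictionary from register names to contents; the function names and the mode tags passed as strings are hypothetical and only mirror the archetype names IMM, DR and IR.

```python
def direct(regs, r):
    """Direct register mode: the operand is the content of register r."""
    return regs[r]


def indirect(regs, r):
    """Indirect register mode: register r names the register holding the operand."""
    return regs[regs[r]]


def operand(regs, mode, x):
    """Fetch an operand in immediate ('IMM'), direct ('DR') or indirect ('IR') mode."""
    if mode == "IMM":
        return x                 # the operand is the literal itself
    if mode == "DR":
        return direct(regs, x)
    if mode == "IR":
        return indirect(regs, x)
    raise ValueError(f"unknown mode: {mode}")
```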
DMR(p) = FE|2 P_p: MAR[m]                   # direct memory read
         ⇒ P_p: MAR[NULL].
IMR(p) = FE|2 P_p: r[mi] P_p: MAR[mi]       # indirect memory read
         ⇒ P_p: MAR[NULL].
mrm(p) = DMR(p)
       = IMR(p).
DMW(p) = FE|3 P_p: MAR[m]                   # direct memory write
         ⇒ P_p: MAR[NULL].
IMW(p) = FE|2 P_p: r[mi]                    # indirect memory write
         P_p: MAR[mi]
         ⇒ P_p: MAR[NULL].
mwm(p) = DMW(p)
       = IMW(p).
binary(x, y, x + y) = ADD.                  # addition
binary(x, y, x - y) = SUB.                  # subtraction
binary(x, y, x × y) = MUL.                  # multiplication
binary(x, y, x / y) = DIV.                  # division
binary(x, y, x % y) = MOD.                  # modulo
binary(x, y, x << y) = SHIFT.               # shift left
binary(x, y, x & y) = AND.                  # bitwise and
binary(x, y, x | y) = OR.                   # bitwise or
binary(x, y, x ^ y) = XOR.                  # bitwise xor
monadic(x, log(x)) = LOG.                   # logarithm
monadic(x, not(x)) = NOT.                   # bitwise not
arlog(p, r) = binary(x, y, r) irm(p, y) accr(p, x) 
= monadic(y, r) irm(p, y). 
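
The arithmetic and logic instructions combine the accumulator (x) with the fetched operand (y), or apply a monadic operation to the operand alone. A table-driven Python sketch of that dispatch is given below. It is an illustration under stated assumptions: values are Python integers, DIV is taken to be integer division and LOG the floor of the base-2 logarithm (the specification above fixes neither), and the names `BINARY`, `MONADIC` and `arlog` as a Python function are hypothetical.

```python
import operator

# Binary operations: result = op(accumulator, operand).
BINARY = {
    "ADD": operator.add,
    "SUB": operator.sub,
    "MUL": operator.mul,
    "DIV": operator.floordiv,   # assumption: integer division
    "MOD": operator.mod,
    "SHIFT": operator.lshift,   # shift left
    "AND": operator.and_,
    "OR": operator.or_,
    "XOR": operator.xor,
}

# Monadic operations: result = op(operand); the accumulator is not read.
MONADIC = {
    "LOG": lambda y: y.bit_length() - 1,  # assumption: floor(log2 y), y > 0
    "NOT": operator.invert,               # bitwise not
}


def arlog(opcode, acc, y):
    """Return the value to be written to the accumulator."""
    if opcode in BINARY:
        return BINARY[opcode](acc, y)
    return MONADIC[opcode](y)
```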
load(p, v) = LOAD irm(p, v) 
= LOADINDEX sigr(p, v) 
= LOADPC pcr(p, v). 
toacc(p, v) = arlog(p, v) ⇒ accw(p, v)
            = load(p, v) ⇒ accw(p, v).
store(p, r, v) = STORE rm(p, r, _) accr(p, v).
toreg(p, r, v) = store(p, r, v) ⇒ regw(p, r, v).
jump(p, v > 0, a) = JPOS irm(p, a) accr(p, v).
jump(p, v = 0, a) = JZERO irm(p, a) accr(p, v).
jump(p, TRUE, a) = JUMP irm(p, a).          # unconditional jump
jump(_, TRUE, C) = HALT.
topc(p) = jump(p, cond, a) =[ cond ]⇒ pcw(p, a)
        = jump(p, cond, a) =[ ¬cond ]⇒ .
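
The jump instructions write the target address to the program counter only when their condition on the accumulator holds. A small Python sketch of that decision follows; it is not part of the specification, it leaves HALT aside, and the function name `next_pc` and its argument order are hypothetical.

```python
def next_pc(opcode, acc, target, pc):
    """Return the program counter after a jump instruction.

    opcode: 'JPOS', 'JZERO' or 'JUMP'; acc: accumulator content;
    target: jump address; pc: program counter if the jump is not taken.
    """
    conditions = {
        "JPOS": acc > 0,    # jump if the accumulator is positive
        "JZERO": acc == 0,  # jump if the accumulator is zero
        "JUMP": True,       # unconditional jump
    }
    return target if conditions[opcode] else pc
```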
read(p) = READ mrm(p). 
write(p) = WRITE mwm(p). 
instr(p) = toacc(p, 
= toreg(p, 
= topc(p, 
= read(p) 
= write(p).
Appendix F 
Glossary 
This glossary contains basic terminology for Extended Update Plans. Many terms have
been adopted from the original work on Update Plans [60], others have been changed to
accommodate changes introduced in this thesis, and there are also a number of completely
new terms exclusive to EUP.
alternatives, 14 
a series of update schemes; the first applicable scheme in the series is applied 
ambidextrous archetype, 18 
a pair of archetypes of the same name, one left-handed, the other right-handed 
anonymous sequence, 67 
a sequence with no sequential block identifier 
applicable, 11 
an update rule is applicable if its left-hand side is consistent with the current
configuration and its guard evaluates to true
application order, 68 
the order updates are applied to a configuration in a sequence 
archetype, 15 
a macro-like mechanism 
archetype expansion, 17 
the mechanism by which archetypes are expanded to give update schemes
canonical form, 70 
a sequential or a parallel block is in canonical form if it has only one type of updates:
update schemes
casting, 14 
forcing the value of a term to be of a different type than the type globally declared for 
that term 
cell, 10 
an element of configuration or memory 
command, 13 
an update scheme in which both the left and right-hand sides are in command form 
command archetype, 19 
an ambidextrous archetype whose name starts with a constant 
command driven, 13 
an update plan in which all update schemes are commands 
command form, 13 
a configuration containing a non-empty command sequence, or one in which the 
contents of the register PC are not specified 
command sequence, 13 
a sequence of terms 
configuration, 10
a partial function from locators to values; a consistent set of locator expressions 
consistent, 10 
a set of locator expressions is consistent if it does not specify conflicting contents for 
one and the same cell 
context, 17 
the non-text part of a configuration, in particular in an archetype body 
expansion, 17 
see archetype expansion; the text that actually replaces the archetype call
final configuration, 11 
a configuration to which none of the update schemes in an update plan are applicable
full applicability, 68 
a sequence is fully applicable if all of its constituent updates are applicable in the 
application order 
ground term, 15 
a term for which a unique variable free value can be derived 
guard, 10 
a condition for applicability 
initial configuration, 11 
specifies the initial state of the memory before any update scheme is applied 
left-handed archetype, 18 
an archetype having an empty right-hand side expansion 
locator, 10 
an index to a memory 
locator expression, 10 
a sequence of cells delimited on the left and right by a locator 
memory, 11 
a function from locators to values 
opcode, 13 
the first element of a command sequence 
parallel block, 20 
a set of update schemes to be applied simultaneously 
parallel block symbol (open), 20
'(||': indicates the start of a parallel block
parallel block symbol (close), 20
'||)': indicates the end of a parallel block
parameter resolution, 17 
the mechanism by which parameters of an archetype are rewritten to evaluable
expressions
partial applicability, 68 
a sequence is partially applicable if more than one, but not all, of its constituent updates
are applicable in the application order
program counter, 12 
the register PC 
register, 12 
a constant locator 
repeat, 12 
Ell ': indicates a repeat of the previous left-hand side 
right-handed archetype, 18 
an archetype having an empty left-hand side expansion 
semi-ground term, 15 
a term for which a finite number of variable free values can be derived 
sequence, 67 
a synonym for a sequential update scheme 
sequencer, 67 
a synonym for the sequential block identifier 
sequential archetype, 78 
an archetype whose body is a sequential block 
sequential block, 67 
a set of updates to be applied sequentially 
sequential block symbol (open), 67
'(S|': indicates the start of a sequential block S
sequential block symbol (close), 67
'|)': indicates the end of a sequential block
sequential update scheme, 67 
a top-level sequential block 
stage, 67 
one step of a sequence 
stageless sequence, 67 
a sequence with all of its stages left untagged 
store, 10 
a two-way countably infinite set of cells 
store structure, 15 
a regular expression over store names 
synchroniser, 73 
central update plan synchroniser (UP: SEQ), a constant used in the implementation of
sequences
text, 17 
a sequence of terms 
textual expansion, 17 
the replacement text of an archetype call, before parameter resolution 
textual ordering, 69 
the way text expanded in individual stages of a sequence is arranged 
top-level, 69 
an update is top-level if it is itself an item of the update plan
type alias, 14 
a type declaration 
type primitive, 14 
a type name not appearing as the left-hand side of a type alias 
update, 58 
a parallel block, a sequential block or alternatives 
update plan, 10 
a set of updates, type and store declarations 
update rule, 10 
an update scheme that contains no variables and whose left and right-hand sides are
both self-consistent
update scheme, 10 
consists of a left-hand side, a right-hand side (both configurations) and a guard 
update script, 11 
an update plan and an initial configuration 
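
Several of the entries above (configuration, consistent, applicable, alternatives, final configuration) fit together in a way that a small executable model may make easier to see. The Python sketch below is an illustration only and makes strong simplifying assumptions: a configuration is a dictionary from locators to values, an update rule is a dictionary with hypothetical keys `lhs`, `guard` and `rhs`, and applying a rule simply overwrites the cells named on its right-hand side.

```python
def consistent(*configs):
    """True if the configurations never give one cell two different contents."""
    merged = {}
    for cfg in configs:
        for locator, value in cfg.items():
            if merged.setdefault(locator, value) != value:
                return False
    return True


def applicable(rule, config):
    """A rule is applicable if its left-hand side is consistent with the
    current configuration and its guard evaluates to true."""
    return consistent(rule["lhs"], config) and rule["guard"](config)


def apply_alternatives(rules, config):
    """Apply the first applicable rule of a series of alternatives.

    If no rule is applicable, the configuration is returned unchanged.
    """
    for rule in rules:
        if applicable(rule, config):
            return {**config, **rule["rhs"]}
    return config
```

For example, with two alternatives that both match `PC` = 0 but differ in their guards, only the first applicable one is applied, and a configuration matched by neither is left as it is.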
