Applying Formal Methods to the Design of Smart Card Software by Butler, Michael et al.
Applying Formal Methods to the Design of Smart Card
Software
Michael Butler, Pieter Hartel, Eduard de Jong and Mark Longley
Declarative Systems and Software Engineering Group
Technical Report DSSE-TR-97-08
July 1997
www.dsse.ecs.soton.ac.uk/techreports/
Department of Electronics and Computer Science
University of Southampton
Highfield, Southampton SO17 1BJ, United Kingdom
Michael Butler, Pieter Hartel, and Mark Longley
Department of Electronics and Computer Science, University of Southampton
Eduard de Jong
Integrity Arts, San Mateo, USA
Deliverable of the
Smart Card Software Generator Project
July 1997
Contents
1 Management Summary
1.1 Conclusion
2 Initial Z Specification of Memory Manager
2.1 Introduction
2.2 Abstract Specification
2.3 Concrete Specification
2.4 Conclusion
3 Memory System Implementation
3.1 Introduction
3.2 Parameterised Specification
3.3 Abstract Type Signature
3.4 Implementation Type
3.5 initial_memsys
3.6 ANewObject/new_assoc
3.7 AReadObject/read_assoc
3.8 AWriteObject/write_assoc
3.9 AReleaseObject/release_assoc
3.10 Conclusion
4 Modified Z Specification
4.1 Introduction
4.2 Abstract Specification
4.3 Concrete Specification
4.4 Conclusion and Future Work
5 Z Specification of Memory Module Design
5.1 Introduction
5.2 Sequences of Atomic Operations
5.3 Handling Erroneous States
5.4 Specification Constraints
5.5 Error States
5.6 Error Recovery
5.7 Design Specification
5.8 Conclusion
6 "Safe" Storage Allocation
7 Modula-3 Memory Management
7.1 Introduction
7.2 Type System
7.3 Safety
7.4 Generic Modules
7.5 BYTE Module
7.6 MEM Module
7.7 Store Module
7.8 Store INTEGER Module
7.9 Conclusion
8 A Proposed Type System for CLASP
8.1 Introduction
8.2 Storage Allocation
8.2.1 Type Coercions
8.2.2 Addresses
8.2.3 Operations
8.3 Arrays and Number Ranges
8.4 Generic Modules
8.5 Tagged Memory
8.6 Views
8.7 Orthogonal Issues
8.8 Conclusion
Section 1
Management Summary
The goal of this work is the design of a language for the implementation of smart card applications, specifically an operating system, as high integrity software. The integrity of a piece of software is demonstrated by proving various properties of the software. The language must therefore exclude any constructs that would make such proofs unreasonably difficult. An untyped language is not only very difficult to reason about formally but also allows many unchecked run-time errors that are eliminated in a suitably typed language. We would like the type system of the language to be strong, expressive and simple. Unfortunately the language is required to be able to implement certain routines that might normally be part of the run-time system, notably the storage allocation routines. This requirement is likely to force the adoption of a weaker type system than we would ideally prefer.
In order to understand the consequences of this requirement we first had to understand in more detail the storage allocation system required. To this end Michael Butler prepared an initial Z specification of the existing implementation (see Section 2). Pieter Hartel then produced an executable specification and Mark Longley a Miranda implementation (see Section 3). These led to a modified Z specification for the existing implementation in both an abstract and a refined form (see Section 4). The refined form of the modified Z specification was further refined to a detailed design (see Section 5). This was followed by some thought about the general requirements and implications of a storage allocation function (see Section 6) and an example implementation in Modula-3 (see Section 7).
Finally a proposal for a type system was prepared, describing the advantages of certain choices and the problems introduced by others (see Section 8).
1.1 Conclusion
• The formal specification, and example implementations, enabled discussions concerning the existing implementation of the tagged memory system to proceed without confusion or ambiguity.
• Consideration of the general problem of a functional interface to a storage allocator, the tagged memory specification and the example Modula-3 implementation led to a type system proposal.
• The proposed type system has a number of significant features:
1. It is not so strong as to preclude the implementation of a storage allocator.
2. By distinguishing between type coercions we have shown that it is possible to limit the "type-unsafe" type coercion to a single instance in the storage allocator.
3. By introducing range types and associated language constructs we have shown that static checking of array indexing can be achieved in many cases.
4. As well as describing the inherently type-unsafe operations we have also described those operations that may still require a run-time type check. These run-time checks could also be eliminated if the appropriate proof obligations were satisfied.
Section 2
Initial Z Specification of Memory Manager
2.1 Introduction
The following is a LaTeX presentation of a handwritten Z specification for the CLASP tagged memory implementation presented by Michael Butler. We use a non-standard form of Z to which the various Z tools will not be applicable.
2.2 Abstract Specification
We leave the details of the information being stored and the tags it is associated with unspecified.
\begin{zed}
[Tag, Page]
\end{zed}
We have a limit on the size of the memory system and a given set of available tags.
\begin{axdef}
msize : \nat \\
tags : \finset Tag
\end{axdef}
The abstract specification of the memory system specifies the association between tags and sequences of information as a partial function. The only property we require is that the total number of pages associated with all the tags is no greater than the size of the memory.
\begin{schema}{AMemSys}
data : tags \pfun \seq Page
\where
(\sum t \mid t \in \dom(data) \bullet \# data(t)) \leq msize
\end{schema}
We have an operation that creates a new association, for a new tag, for a given sequence length.
\begin{schema}{ANewObject}
\Delta AMemSys \\
n? : \nat \\
t! : tags
\where
t! \notin \dom(data) \\
(\sum t \mid t \in \dom(data) \bullet \# data(t)) \leq (msize - n?) \\
\exists d : \seq Page \mid \# d = n? \bullet data' = data \cup \{ t! \mapsto d \}
\end{schema}
We have an operation that returns the information associated with a tag.
\begin{schema}{AReadObject}
\Xi AMemSys \\
t? : tags \\
d! : \seq Page
\where
t? \in \dom(data) \\
d! = data(t?)
\end{schema}
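We have an operation that writes a sequence of information to a tag. The sequence written must have the same length as the sequence currently associated with the tag.
\begin{schema}{AWriteObject}
\Delta AMemSys \\
t? : tags \\
d? : \seq Page
\where
t? \in \dom(data) \land \# d? = \# data(t?) \\
(\sum t \mid t \in \dom(data) \bullet \# data(t)) \leq (msize - \# d?) \\
data' = data \oplus \{ t? \mapsto d? \}
\end{schema}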
We have an operation that releases a tag.
\begin{schema}{AReleaseObject}
\Delta AMemSys \\
t? : tags
\where
t? \in \dom(data) \\
data' = \{ t? \} \ndres data
\end{schema}
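The four abstract operations can be animated with a small executable model. The following is a minimal sketch in Python (not a language used in this report), assuming that tags are comparable values such as strings and that the unspecified page contents of a new object can be stood in for by `None`. The class and method names are illustrative only; the preconditions follow the schemas above, including the room check in AWriteObject.

```python
class AMemSys:
    """Abstract memory system: data maps each tag to its list of pages."""

    def __init__(self, msize, tags):
        self.msize = msize        # limit on the total number of pages
        self.tags = set(tags)     # the given finite set of tags
        self.data = {}            # partial function tags -> seq Page

    def usage(self):
        # The summation in the AMemSys invariant.
        return sum(len(pages) for pages in self.data.values())

    def new_object(self, n):
        # ANewObject: an unused tag and room for n more pages must exist.
        free = sorted(self.tags - set(self.data))
        if not free or self.usage() > self.msize - n:
            raise ValueError("no unused tag or not enough memory")
        t = free[0]                   # arbitrary choice of t!
        self.data[t] = [None] * n     # page contents are left unspecified
        return t

    def read_object(self, t):
        # AReadObject: raises KeyError when t is not in dom(data).
        return list(self.data[t])

    def write_object(self, t, d):
        # AWriteObject: same length, and room for a fresh copy of the data.
        if len(d) != len(self.data[t]) or self.usage() > self.msize - len(d):
            raise ValueError("length mismatch or not enough memory")
        self.data[t] = list(d)

    def release_object(self, t):
        # AReleaseObject: remove t from the domain of data.
        del self.data[t]
```

Note that, as specified, a write can fail for lack of room even though the abstract state does not grow; this reflects the fact that the concrete system writes a new generation before the old one is reclaimed.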
2.3 Concrete Specification
The concrete specification requires some more complex types so we introduce some abbreviations.
\begin{zed}
Loc == 0 \upto (msize - 1) \\
Gen == \nat \\
PageNo == \nat \\
CPage == tags \cross PageNo \cross Gen \cross Page \\
CMem == Loc \fun CPage
\end{zed}
The concrete specification of the memory system introduces a number of relations, most of which are implicit in the others. We have more complex conditions controlling the various generations of information that may be associated with a tag.
\begin{schema}{CMemSys}
cmem : CMem \\
locs : tags \pfun (\seq Loc) \\
size : tags \pfun \nat \\
avail : \power Loc
\where
\dom(locs) = \dom(size) \\
\forall t : tags \mid t \in \dom(locs) \bullet \\
\t1 locs(t) = recent(good\_matches(t, size(t), cmem)) \\
avail = Loc \setminus \{ i \mid \exists t \in \dom(size) \bullet \exists j : \nat \mid 0 \leq j < size(t) \land locs(t)(j) = i \}
\end{schema}
The data refinement used when deriving the concrete specification from the abstract specification is as follows.
\begin{schema}{Abstraction}
AMemSys \\
CMemSys
\where
\dom(data) = \dom(size) \\
\forall t \in \dom(data) \bullet \\
\t1 \# data(t) = size(t) \land \\
\t1 data(t) = page \circ cmem \circ (locs(t))
\end{schema}
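We have an operation to write to the memory system. The new pages are written to available locations with a generation one greater than the most recent generation of the tag; the locations previously used by the tag become available again.
\begin{schema}{CWriteObject}
\Delta CMemSys \\
t? : Tag \\
d? : \seq Page
\where
t? \in \dom(size) \\
\# d? = size(t?) \\
\exists lc : \seq Loc \mid injective(lc) \land (\ran lc) \subseteq avail \land \# lc = size(t?) \bullet \\
\t1 cmem' = cmem \oplus \{ j \mid 0 \leq j < size(t?) \bullet \\
\t3 lc(j) \mapsto (t?, j, g + 1, d?(j)) \\
\t3 \mbox{where } g = fst(max\_gen(good\_matches(t?, size(t?), cmem))) \} \\
\t1 locs' = locs \oplus \{ t? \mapsto lc \} \\
\t1 avail' = (avail \setminus (\ran lc)) \cup \ran(locs(t?)) \\
\t1 size' = size
\end{schema}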
We have an operation to read the information associated with a tag.
\begin{schema}{CReadObject}
\Xi CMemSys \\
t? : tags \\
d! : \seq Page
\where
t? \in \dom(size) \\
d! = page \circ cmem \circ (locs(t?))
\end{schema}
We have an operation to release a tag.
\begin{schema}{CReleaseObject}
\Delta CMemSys \\
t? : tags
\where
t? \in \dom(size) \\
size' = \{ t? \} \ndres size \\
locs' = \{ t? \} \ndres locs \\
avail' = avail \cup \ran(locs(t?)) \\
cmem' = cmem
\end{schema}
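We have an operation that returns the pages associated with a tag.
\begin{axdef}
good\_matches : tags \cross \nat \cross CMem \fun \power (Gen \cross \seq Loc)
\where
good\_matches(t, n, cmem) = \\
\t1 \{ (g, lc) \mid \# lc = n \land \\
\t2 \forall i \mid 0 \leq i < n \bullet \\
\t3 tag(cmem(lc(i))) = t \land \\
\t3 gen(cmem(lc(i))) = g \land \\
\t3 pageno(cmem(lc(i))) = i \}
\end{axdef}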
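We have an operation that returns the maximum generation associated with a tag.
\begin{axdef}
max\_gen : \power (Gen \cross \seq Loc) \pfun Gen \cross \seq Loc
\where
\dom(max\_gen) = \power (Gen \cross \seq Loc) \setminus \{ \emptyset \} \\
\forall s \in \dom(max\_gen) \bullet \\
\t1 max\_gen(s) \in s \land \\
\t1 \forall x \bullet x \in s \implies fst(max\_gen(s)) \geq fst(x)
\end{axdef}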
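We have an operation to return the most recent information associated with a tag. Its result is the sequence of locations of the maximum generation.
\begin{axdef}
recent : \power (Gen \cross \seq Loc) \pfun \seq Loc
\where
\dom(recent) = \dom(max\_gen) \\
recent(s) = snd(max\_gen(s))
\end{axdef}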
We have four operations that extract the fields from a concrete page.
\begin{axdef}
tag : CPage \fun tags \\
gen : CPage \fun \nat \\
pageno : CPage \fun \nat \\
page : CPage \fun Page
\where
tag(t, pn, g, p) = t \\
gen(t, pn, g, p) = g \\
pageno(t, pn, g, p) = pn \\
page(t, pn, g, p) = p
\end{axdef}
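The way good_matches, max_gen and recent recover the current locations of a tag from the page housekeeping data can be illustrated as follows. This is a hedged Python sketch, not part of the specification: it assumes the memory is a list of `CPage` records indexed by location and that at most one location holds a given (tag, generation, page number) triple, so that the set of (generation, locations) pairs can be held in a dictionary keyed by generation.

```python
from collections import namedtuple

# Concrete page: same field order as CPage == tags x PageNo x Gen x Page.
CPage = namedtuple("CPage", "tag pageno gen page")


def good_matches(t, n, cmem):
    """Map each generation of tag t that has a complete set of n pages
    to its list of locations, ordered by page number."""
    by_gen = {}
    for loc, cp in enumerate(cmem):
        if cp.tag == t and 0 <= cp.pageno < n:
            by_gen.setdefault(cp.gen, {})[cp.pageno] = loc
    return {g: [pages[i] for i in range(n)]
            for g, pages in by_gen.items() if len(pages) == n}


def recent(matches):
    """Locations of the maximum generation; undefined for an empty set."""
    if not matches:
        raise ValueError("no complete generation for this tag")
    return matches[max(matches)]
```

For example, reading a tag amounts to `[cmem[l].page for l in recent(good_matches(t, n, cmem))]`, which mirrors the composition page ∘ cmem ∘ locs(t) used in CReadObject.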
2.4 Conclusion
The details of the actual implementation still remain unclear. The relationship between the New and Write
operations and the precise data structures required by the implementation require further explanation.
Section 3
Memory System Implementation
3.1 Introduction
Starting from the initial Z specification of a memory system I produced a Miranda[1] implementation of the operations specified in order to clarify some of the questions raised by the specification.
I chose to define an abstract type that provided the New, Read, Write and Release operations from the abstract specification. The implementation (type and operations) of this abstract type was intended to illuminate some of the issues raised by the concrete specification.
3.2 Parameterised Specification
The first problem with implementing the specification is its use of the unspecified types Tag and Page and the values msize and tags. I chose to use the Miranda %free mechanism to produce a script that was parameterised by the types info and tag and the values memsys_size and tags[2].
%free
{ info :: type
  memsys_size :: num
  tag :: type
  tags :: [tag]
}
I use the name info for the "chunk" of information associated with a tag rather than Page as I find this less confusing.
This does not completely model the Z specification as it is not possible to constrain memsys_size to be a natural number or tags to be a finite list. We do, however, obtain an implementation that is independent of the types Tag and Page.
3.3 Abstract Type Signature
The signature of the abstract type does not capture any of the conditions specified by the abstract specification; it simply declares the types of the functions. As the ANewObject, AWriteObject and AReleaseObject operations modify a memory system we must add an extra operation to provide an initial memory system[3].
As the operations over a memory system may fail for a variety of reasons I include a simple form of exception handling using the following Miranda algebraic type:
ok * ::= OK * | Error [[char]]
This allows us to return an error message when an exception occurs.
[1] Miranda is a trademark of Research Software Ltd.
[2] The script is also parameterised by display functions for the types.
[3] The signature also contains a function to display a memory system.
abstype memsys with
    initial_memsys :: memsys
    new_assoc :: num -> memsys -> ok (tag,memsys)
    read_assoc :: tag -> memsys -> ok [info]
    write_assoc :: tag -> [info] -> memsys -> ok memsys
    release_assoc :: tag -> memsys -> ok memsys
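For readers unfamiliar with Miranda, the `ok` type plays the role of a result type: every operation returns either a value wrapped in `OK` or an `Error` carrying a message. A rough Python analogue is sketched below. The `read_assoc` shown here is a hypothetical stand-in that treats the memory system as a plain dictionary, purely to show how a caller would distinguish the two cases; it is not the implementation described in this section.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar, Union

T = TypeVar("T")


@dataclass
class OK(Generic[T]):
    value: T           # the successful result


@dataclass
class Error:
    message: str       # a description of what went wrong


Result = Union[OK[T], Error]


def read_assoc(tag, memsys):
    # Hypothetical stand-in: memsys is assumed to be a dict from tags
    # to lists of info, which is NOT the representation used in 3.4.
    if tag not in memsys:
        return Error("no pages associated with tag")
    return OK(memsys[tag])
```

A caller then tests `isinstance(result, Error)` before using `result.value`, which corresponds to pattern matching on the two constructors in Miranda.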
3.4 Implementation Type
The concrete specification defines a number of relations (cmem, locs, size and avail) most of which, as Michael Butler noted, are included for ease of presentation. I have chosen to represent a memory system as a simple list of pages (that is page not Page!) where all these relations are implicit in the housekeeping information associated with the pages.
location, generation, page_No == num
page ::= Page bool bool tag generation page_No info
memsys == [page]
A real implementation may well include auxiliary data structures that record information for each tag. Lacking information about what these data structures might be I chose an implementation that examined the extreme case where no such data structures are available. This means that all operations over the relations defined in the concrete specification will require repeated searches over the entire memory system in this implementation.
The page type contains two boolean flags that indicate whether the page is currently in use (associated with some tag) and whether it has been written to yet. The first flag is required so that we can generate a new list of pages to associate with a tag and the second so that we can write into pages returned by new_assoc and still maintain a trace of the generations of pages associated with a tag. Each page also records the tag it is associated with, if any, and its generation and page number for that tag.
In the implementation equations for the abstract type I have chosen to manipulate the locations of pages rather than the pages themselves to give some sense of the problems a real implementation might face.
3.5 initial_memsys
The initial memory system is simply a list of pages, each of which is marked as unused. In a real implementation the data in a page (housekeeping and information) will have some arbitrary initial value. Errors in the implementation of the memory system could cause this initial information to be returned in some circumstances. There is no way in Miranda to provide arbitrary initial data in the initial memory system. The best we can do is use initial values that will cause a fatal error if they are referenced. This means that a Miranda implementation can never properly model a real system where erroneously referencing initial data would not cause an error.
3.6 ANewObject/new_assoc
The new_assoc function creates a new association between a tag and a list of pages of the required length. This will fail if there are no unused tags or there are not enough unused pages. We implement existential quantifiers in the specification as searches (filter) followed by choices (hd and take). The association between tags and pages is modified by updating the housekeeping information in the pages newly associated with the tag. While these pages are now marked as used they are not yet marked as written.
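The "search then choose" reading of the existential quantifiers can be sketched in Python as follows. This is an illustration under stated assumptions rather than a transcription of the Miranda script: the `Page` record mirrors the fields of the page type in 3.4, the list of available tags is passed in explicitly, failure is signalled by returning `None` instead of an `Error` value, the initial generation of `0` is a guess, and `n` is assumed to be at least 1.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Page:
    in_use: bool = False      # first flag: associated with some tag
    written: bool = False     # second flag: written to since allocation
    tag: object = None
    generation: int = 0
    page_no: int = 0
    info: object = None


def new_assoc(n, memsys, tags):
    """Return (tag, new_memsys), or None when no tag or space is free."""
    used_tags = {p.tag for p in memsys if p.in_use}
    free_tags = [t for t in tags if t not in used_tags]             # search
    free_locs = [i for i, p in enumerate(memsys) if not p.in_use]   # search
    if not free_tags or len(free_locs) < n:
        return None
    tag = free_tags[0]        # choice, like hd
    chosen = free_locs[:n]    # choice, like take n
    new_memsys = list(memsys)
    for page_no, loc in enumerate(chosen):
        # Only housekeeping data changes; the page is used but unwritten.
        new_memsys[loc] = replace(memsys[loc], in_use=True, written=False,
                                  tag=tag, generation=0, page_no=page_no)
    return tag, new_memsys
```

Because there is no auxiliary table of tags, both searches scan the whole memory, which is exactly the cost noted in 3.4.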
3.7 AReadObject/read_assoc
The read_assoc function returns the list of information, if any, associated with a tag. This will fail if there are no pages associated with the tag or they are unwritten pages. Because we are maintaining a trace of the generations of pages associated with a tag this function must ensure that it returns those pages associated with the most recent generation of the tag.
3.8 AWriteObject/write_assoc
The write_assoc function writes a list of information to a tag. This will fail if there are no pages associated with the tag, if the number of pages to write differs from the number associated with the tag, or if there are not enough unused pages. Each successive write to a tag obtains a new generation count, the most recent generation being that returned by a read. When we write some information to a tag we must distinguish the first write to newly allocated storage from all subsequent writes to the tag. We use the second boolean flag in the pages to do this.
3.9 AReleaseObject/release_assoc
The release_assoc function frees all the pages associated with a tag. This releases all the generations of pages associated with a tag. As there is no limit on the number of generations that may be associated with a tag a sequence of writes to the same tag can exhaust the memory system.
3.10 Conclusion
• There are many data structures that could be chosen to represent the relations employed in the concrete specification. They have various advantages and disadvantages in terms of the storage used and the ease with which functions over them may be implemented. This implementation uses the simplest possible data structure, resulting in complex function definitions.
• There are various properties of real implementations that can't be captured in a Miranda implementation.
• The ANewObject operation doesn't seem to be entirely necessary. A suitably sophisticated write_assoc function could achieve the same effect (generational backup) without the complexities this operation introduces.
Section 4
Modified Z Specification
4.1 Introduction
This modification of the initial Z specification incorporates the new features required of the memory system that arose during discussions of the original version. The following changes have been made in the specification:
• The ANewObject operation is not required in its original form. We can instead use an operation that returns an unused tag and a more sophisticated write operation.
• The generation marking of the information associated with a tag is now controlled by a new "commit" operation. After a commit operation a subsequent write will acquire a new generation mark. All following writes, assuming no commit occurs, will have this same generation mark.
• The read operation may now specify a generation relative to the current generation.
• A limit on the number of generations that may be associated with a tag is introduced.
• Operations that manipulate the generations of information associated with a tag may be introduced.
These new requirements alter the abstract specification significantly, introducing new operations and making explicit some of the lower level details of the memory system.
4.2 Abstract Specification
As in the original specification we don't need to know anything about tags and the information associated with them so we parameterise our specification by the types Tag and Info.
\begin{zed}
[Tag, Info]
\end{zed}
We also have a given set of available tags and limits on the size of the memory system and the maximum generation of information that may be associated with a tag:
\begin{axdef}
tags : \finset Tag \\
msize : \nat_1 \\
maxgen : \nat_1
\end{axdef}
The memory system is specified by two partial functions and a set; we include a derived value to aid the presentation:
\begin{schema}{AMemSys}
assoc : tags \pfun \seq (\seq Info) \\
size : tags \pfun \nat_1 \\
committed : \power tags \\
usage : \nat
\where
\dom assoc = \dom size \\
committed \subseteq \dom assoc \land (\forall t : tags \mid t \in committed \bullet assoc~t \neq \langle \rangle) \\
\forall t : tags \mid t \in \dom assoc \bullet \\
\t1 \# (assoc~t) \leq maxgen \land \\
\t1 \forall i : \nat_1 \mid 1 \leq i \leq \# (assoc~t) \bullet \\
\t2 \# (assoc~t~i) = size~t \\
usage = (\sum t : tags \mid t \in \dom assoc \bullet \# (assoc~t) * size~t) \\
usage \leq msize
\end{schema}
The assoc function associates a tag with a sequence of sequences of information; the most recent generation is at the head of the sequence. The size function gives the length of the information sequences associated with a tag. The committed set records those tags whose most recent generation of information has been committed. We require the two functions to have the same domain, the committed set to be a subset of this domain and all the information sequences associated with a tag to be of the length given by the size function. Finally, we require that the total amount of information associated with all the tags should not exceed the size of the memory system.
We need to describe the initial state of the memory system:
\begin{schema}{AInitialMemSys}
AMemSys
\where
\dom assoc = \emptyset
\end{schema}
We simply require the assoc function to have an empty domain.
We have an operation that returns an unused tag. We have chosen to specify the size of the information sequences we expect to write to the tag as an argument to this operation instead of letting the first write determine the sequence length:
\begin{schema}{ANewTag}
\Delta AMemSys \\
n? : \nat_1 \\
t! : tags
\where
t! \notin \dom assoc \\
assoc' = assoc \cup \{ t! \mapsto \langle \rangle \} \\
size' = size \cup \{ t! \mapsto n? \} \\
committed' = committed
\end{schema}
We return an unused tag (one that has no associated sequence of information sequences), record the expected length of the information sequences and mark the most recent generation as uncommitted.
We have an operation to read the information sequence, of a given generation, associated with a tag:
\begin{schema}{AReadGeneration}
\Xi AMemSys \\
t? : tags \\
g? : \nat \\
info! : \seq Info
\where
t? \in \dom assoc \land assoc~t? \neq \langle \rangle \land g? \leq (\# (assoc~t?) - 1) \\
info! = assoc~t?~(g? + 1)
\end{schema}
The tag must have an associated information sequence of the given generation, numbered relative to the current generation.
We have a schema that constrains a generation argument to the current generation:
\begin{schema}{CurrentGeneration}
g? : \nat
\where
g? = 0
\end{schema}
Using schema conjunction and hiding we can specify an operation that reads the current generation of information associated with a tag:
\begin{zed}
ARead \defs (AReadGeneration \land CurrentGeneration) \hide (g?)
\end{zed}
We have an operation that releases all the information associated with a tag:
\begin{schema}{ARelease}
\Delta AMemSys \\
t? : tags
\where
t? \in \dom assoc \\
assoc' = \{ t? \} \ndres assoc \\
size' = \{ t? \} \ndres size \\
committed' = committed \setminus \{ t? \}
\end{schema}
We simply remove the tag from the domains of the two functions and from the committed set.
We have an operation that commits the current generation of information associated with a tag:
\begin{schema}{ACommit}
\Delta AMemSys \\
t? : tags
\where
t? \in \dom assoc \land assoc~t? \neq \langle \rangle \\
committed' = committed \cup \{ t? \}
\end{schema}
The tag must have an associated information sequence, which we mark as committed.
We have an operation that writes a sequence of information to a tag. This operation has a number of different cases depending on the state of the sequence of generations associated with the tag and whether the current generation has been committed.
The first write to a tag, after ANewTag, must make sure there is enough room to write the new information:
\begin{schema}{AWriteFirst}
\Delta AMemSys \\
t? : tags \\
info? : \seq Info
\where
t? \in \dom assoc \land \# info? = size~t? \\
assoc~t? = \langle \rangle \\
(usage + \# info?) \leq msize \\
assoc' = assoc \oplus \{ t? \mapsto \{ 1 \mapsto info? \} \} \\
size' = size \\
committed' = committed
\end{schema}
We override the association for the tag with a singleton sequence containing the new information sequence.
Writing to a tag whose current generation is not committed doesn't need any extra room[1]:
\begin{schema}{AWriteUncommitted}
\Delta AMemSys \\
t? : tags \\
info? : \seq Info
\where
t? \in \dom assoc \land \# info? = size~t? \\
assoc~t? \neq \langle \rangle \\
t? \notin committed \\
assoc' = assoc \oplus \{ t? \mapsto (assoc~t? \oplus \{ 1 \mapsto info? \}) \} \\
size' = size \\
committed' = committed
\end{schema}
We override the association for the tag with a new sequence of information sequences obtained by overriding its first element with the new information sequence.
Writing to a tag whose current generation has been committed requires extra room for the new information:
\begin{schema}{AWriteCommitted}
\Delta AMemSys \\
t? : tags \\
info? : \seq Info
\where
t? \in \dom assoc \land \# info? = size~t? \\
assoc~t? \neq \langle \rangle \\
t? \in committed \\
(usage + \# info?) \leq msize \\
assoc' = assoc \oplus \{ t? \mapsto (1 \upto maxgen \dres (\{ 1 \mapsto info? \} \cat assoc~t?)) \} \\
size' = size \\
committed' = committed \setminus \{ t? \}
\end{schema}
We again override the association for the tag with a sequence of information sequences obtained from the current value. In this case the sequence of sequences is obtained by concatenating the new sequence in front of the existing one and then cropping the result to the maximum allowed number of generations.
We can now use schema disjunction to specify the write operation:
\begin{zed}
AWrite \defs AWriteFirst \lor AWriteUncommitted \lor AWriteCommitted
\end{zed}
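The interaction between write, commit and read is easier to follow when animated. The following Python sketch models the abstract state with generations held newest first. It is an illustration only: the class name `GenMemSys` and its method names are invented here, tags are assumed to be comparable values, and the AWriteFirst and AWriteCommitted cases are folded into one branch because both prepend a generation after the same room check.

```python
class GenMemSys:
    """Sketch of the modified abstract memory system of Section 4.2."""

    def __init__(self, msize, maxgen, tags):
        self.msize, self.maxgen, self.tags = msize, maxgen, set(tags)
        self.assoc = {}          # tag -> list of generations, newest first
        self.size = {}           # tag -> length of every generation
        self.committed = set()   # tags whose newest generation is committed

    def usage(self):
        return sum(len(gens) * self.size[t] for t, gens in self.assoc.items())

    def new_tag(self, n):
        # ANewTag: return an unused tag with no generations yet.
        free = sorted(self.tags - set(self.assoc))
        if not free:
            raise ValueError("no unused tag")
        t = free[0]
        self.assoc[t], self.size[t] = [], n
        return t

    def commit(self, t):
        # ACommit: the tag must already have some information.
        if not self.assoc[t]:
            raise ValueError("nothing to commit")
        self.committed.add(t)

    def write(self, t, info):
        if len(info) != self.size[t]:
            raise ValueError("wrong length")
        gens = self.assoc[t]
        if gens and t not in self.committed:
            gens[0] = list(info)                  # AWriteUncommitted
            return
        if self.usage() + len(info) > self.msize:
            raise ValueError("not enough memory")
        # AWriteFirst / AWriteCommitted: prepend, then crop to maxgen.
        self.assoc[t] = ([list(info)] + gens)[: self.maxgen]
        self.committed.discard(t)

    def read(self, t, g=0):
        # AReadGeneration: g is relative to the current generation.
        gens = self.assoc[t]
        if g >= len(gens):
            raise ValueError("no such generation")
        return list(gens[g])
```

As in the schemas, a committed write demands room for the new generation even when cropping to maxgen will immediately discard the oldest one.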
4.3 Concrete Specification
In the concrete specification the functions of the abstract specification are now implicit in the lower level functions that describe locations and the pages associated with them.
We introduce the following abbreviation and values to simulate a boolean type:
\begin{zed}
\mathbb{B} == \nat
\end{zed}
\begin{axdef}
true : \mathbb{B} \\
false : \mathbb{B}
\where
true \neq false
\end{axdef}
[1] This may not capture the desired behaviour but can easily be changed.
We will associate a piece of housekeeping data with each tag. This will record whether it is in use, the length of information sequences associated with it, whether its current generation of information has been committed, the number of generations currently associated with it and the index of the current generation:
\begin{zed}
TagData == \mathbb{B} \cross \nat_1 \cross \mathbb{B} \cross \nat \cross \nat
\end{zed}
We adopt a peculiar generation numbering scheme which is, apparently, the one used in the real implementation.
We introduce some projection operations over this type:
\begin{axdef}
tdu : TagData \fun \mathbb{B} \\
tds : TagData \fun \nat_1 \\
tdc : TagData \fun \mathbb{B} \\
tdg : TagData \fun \nat \\
tdx : TagData \fun \nat
\where
tdu(inuse, size, committed, generations, genindex) = inuse \\
tds(inuse, size, committed, generations, genindex) = size \\
tdc(inuse, size, committed, generations, genindex) = committed \\
tdg(inuse, size, committed, generations, genindex) = generations \\
tdx(inuse, size, committed, generations, genindex) = genindex
\end{axdef}
We introduce a new constraint that is required by the generation indexing scheme:
\begin{axdef}
maxindex : \nat
\where
maxindex > maxgen
\end{axdef}
The following abbreviation will be useful:
\begin{zed}
Indices == 0 \upto maxindex
\end{zed}
The abstract association between tags and sequences of sequences of information is now implicit in a function from locations to pages. We first introduce some useful abbreviations:
\begin{zed}
Loc == 0 \upto (msize - 1) \\
Page == Info \cross \mathbb{B} \cross tags \cross \nat \cross \nat \\
Memory == Loc \fun Page
\end{zed}
The memory will consist of msize locations each of which addresses a page. Each page contains a piece of information and housekeeping data that records whether it is in use, the tag it is associated with, its generation index for that tag and its page number (relative position in the information sequence) for that tag.
We again introduce some projection operations:
\begin{axdef}
pi : Page \fun Info \\
pu : Page \fun \mathbb{B} \\
pt : Page \fun tags \\
px : Page \fun \nat \\
pn : Page \fun \nat
\where
pi(info, inuse, tag, genindex, pageno) = info \\
pu(info, inuse, tag, genindex, pageno) = inuse \\
pt(info, inuse, tag, genindex, pageno) = tag \\
px(info, inuse, tag, genindex, pageno) = genindex \\
pn(info, inuse, tag, genindex, pageno) = pageno
\end{axdef}
We have an operation that takes a memory and returns all the locations whose pages are associated with a given tag:
\begin{axdef}
taglocs : tags \cross Memory \fun \power Loc
\where
taglocs(tag, mem) = \\
\t1 \{ l : Loc \mid pu(mem~l) = true \land pt(mem~l) = tag \}
\end{axdef}
We need only consider those pages that are marked as in use.
Given this we can specify an operation that takes a memory and returns a set of sets of locations. The pages of the locations in each set will have the same generation index:
\begin{axdef}
gentaglocs : tags \cross Memory \fun \power (\power Loc)
\where
\exists lsets : \power (\power Loc) \mid lsets = gentaglocs(tag, mem) \bullet \\
\t1 \bigcup lsets = taglocs(tag, mem) \land \\
\t1 \forall lset : \power Loc \mid lset \in lsets \bullet \\
\t2 \forall l_1, l_2 : Loc \mid l_1 \in lset \land l_2 \in lset \bullet \\
\t3 px(mem~l_1) = px(mem~l_2)
\end{axdef}
We can also specify an operation that returns the generation indices associated with a tag:
\begin{axdef}
tagindices : tags \cross Memory \fun \power \nat
\where
tagindices(tag, mem) = \\
\t1 \{ l : Loc \mid pu(mem~l) = true \land pt(mem~l) = tag \bullet px(mem~l) \}
\end{axdef}
Before we can specify the concrete memory system we must first characterise the acceptable sets of generation indices that may be associated with a tag. This (partial) function returns the oldest and newest index in an acceptable, non-empty, index set. An acceptable index set is contiguous modulo the fact that it may wrap around at maxindex.
\begin{axdef}
indices : \power \nat \pfun \nat \cross \nat
\where
indices = \\
\t1 \{ i : \power \nat; old, new : \nat \mid \\
\t2 i \subseteq Indices \land old \in Indices \land new \in Indices \land \\
\t2 (old \leq new \land i = old \upto new \lor \\
\t2 old > new \land i = 0 \upto new \cup old \upto maxindex) \bullet \\
\t2 i \mapsto (old, new) \}
\end{axdef}
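A direct, brute-force reading of this definition can be written down to check small cases. The Python sketch below is an aid to understanding under one stated assumption: where no pair (old, new) or more than one pair matches the set, the function returns `None`, treating the set as outside the domain of the partial function. The ambiguous case arises only for the full set 0..maxindex.

```python
def indices(index_set, maxindex):
    """Return (old, new) for an acceptable index set, otherwise None."""
    target = set(index_set)
    matches = []
    for old in range(maxindex + 1):
        for new in range(maxindex + 1):
            if old <= new:
                # Plain contiguous run old..new.
                candidate = set(range(old, new + 1))
            else:
                # Run that wraps around at maxindex: old..maxindex, 0..new.
                candidate = (set(range(0, new + 1))
                             | set(range(old, maxindex + 1)))
            if candidate == target:
                matches.append((old, new))
    # Exactly one witness is needed for a well defined result.
    return matches[0] if len(matches) == 1 else None
```

With maxindex = 7, the set {6, 7, 0, 1} is acceptable with oldest index 6 and newest index 1, whereas {1, 3} is not contiguous and is rejected.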
The concrete memory system is specified by two functions; we include two derived values to aid the presentation:
\begin{schema}{CMemSys}
data : tags \fun TagData \\
mem : Memory \\
freetags : \power tags \\
freelocs : \power Loc
\where
\forall t : tags \mid taglocs(t, mem) \neq \emptyset \bullet tdu(data~t) = true \\
\forall t : tags \mid tdu(data~t) = true \bullet \\
\t1 tdg(data~t) = \# glocs \land tdg(data~t) \leq maxgen \\
\t1 \land (tdc(data~t) = true \implies tdg(data~t) > 0) \\
\t1 \land (tdg(data~t) > 0 \implies \\
\t2 (\exists old : \nat \bullet indices~tidxs = (old, tdx(data~t)))) \\
\t1 \land (\forall lset : \power Loc \mid lset \in glocs \bullet \\
\t2 \# lset = tds(data~t) \land \\
\t2 \{ l : Loc \mid l \in lset \bullet pn(mem~l) + 1 \} = 1 \upto \# lset) \\
\t1 \mbox{where } glocs = gentaglocs(t, mem) \mbox{ and } tidxs = tagindices(t, mem) \\
freetags = \{ t : tags \mid tdu(data~t) \neq true \} \\
freelocs = \{ l : Loc \mid pu(mem~l) \neq true \}
\end{schema}
The conditions over these two functions are more complex than in the abstract specification. This is partly due to the consistency conditions required by the relationship between the two functions. The greatest complexity arises from the conditions on the memory required to make it behave as a representation of a seq(seq Info). The generation indexing scheme introduces even more complexity. The only simplification is that the memory size constraint is clearly satisfied by the memory function.
The data renement from the abstract to concrete specications is specied by the following Abstraction
schema:
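\[
\begin{array}{l}
\mathit{Abstraction} \\
\hline
\mathit{AMemSys} \\
\mathit{CMemSys} \\
\hline
\mathit{assoc} = \\
\quad \{\, t : \mathit{tags} \mid \mathit{tdu}(\mathit{data}\ t) = \mathit{true} \bullet \\
\qquad t \mapsto \{\, g : \mathbb{N};\ \mathit{glocs} : \mathbb{P}(\mathbb{P}\,\mathit{Loc});\ \mathit{tidxs} : \mathbb{P}\,\mathbb{N} \mid \\
\qquad\qquad \mathit{glocs} = \mathit{gentaglocs}(t, \mathit{mem}) \land {} \\
\qquad\qquad \mathit{tidxs} = \mathit{tagindices}(t, \mathit{mem}) \land {} \\
\qquad\qquad g \leq \mathit{tdg}(\mathit{data}\ t) \bullet \\
\qquad\qquad g \mapsto \{\, \mathit{lset} : \mathbb{P}\,\mathit{Loc};\ l : \mathit{Loc} \mid \\
\qquad\qquad\qquad \mathit{lset} \in \mathit{glocs} \land {} \\
\qquad\qquad\qquad l \in \mathit{lset} \land {} \\
\qquad\qquad\qquad \mathit{offset}(\mathit{tidxs}, \mathit{px}(\mathit{mem}\ l)) = g \bullet \\
\qquad\qquad\qquad \mathit{pn}(\mathit{mem}\ l) + 1 \mapsto \mathit{pi}(\mathit{mem}\ l) \,\} \,\} \,\} \\
\forall\, t : \mathit{tags} \mid t \in \mathrm{dom}\ \mathit{size} \bullet \mathit{size}\ t = \mathit{tds}(\mathit{data}\ t) \\
\forall\, t : \mathit{tags} \mid t \in \mathit{committed} \bullet \mathit{tdc}(\mathit{data}\ t) = \mathit{true}
\end{array}
\]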
The data refinement above requires some functions over generation indices. We can increment and decrement
generation indices:
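\[
\begin{array}{l}
\mathit{incrindex} : \mathit{Indices} \to \mathit{Indices} \\
\mathit{decrindex} : \mathit{Indices} \to \mathit{Indices} \\
\hline
\mathit{incrindex} = \{\, i, j : \mathit{Indices} \mid i = \mathit{maxindex} \land j = 0 \ \lor \\
\qquad\qquad i \neq \mathit{maxindex} \land j = i + 1 \bullet i \mapsto j \,\} \\
\mathit{decrindex} = \{\, i, j : \mathit{Indices} \mid i = 0 \land j = \mathit{maxindex} \ \lor \\
\qquad\qquad i \neq 0 \land j = i - 1 \bullet i \mapsto j \,\}
\end{array}
\]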
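Given these functions we can specify a function that takes a generation index and returns the offset of that
generation relative to the newest generation:
\[
\begin{array}{l}
\mathit{offset} : \mathbb{P}\,\mathit{Indices} \times \mathit{Indices} \rightharpoonup \mathbb{N} \\
\hline
\mathit{offset} = \\
\quad \{\, i : \mathbb{P}\,\mathit{Indices};\ x : \mathit{Indices};\ o : \mathbb{N} \mid \\
\qquad x \in i \land {} \\
\qquad (\exists\, \mathit{old}, \mathit{new} : \mathbb{N} \bullet \\
\qquad\quad \mathit{indices}\ i = (\mathit{old}, \mathit{new}) \land {} \\
\qquad\quad (x = \mathit{new} \land o = 0 \ \lor \\
\qquad\quad\ x \neq \mathit{new} \land o = \mathit{offset}(i, \mathit{incrindex}\ x) + 1)) \bullet \\
\qquad (i, x) \mapsto o \,\}
\end{array}
\]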
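The index arithmetic above can be sketched in Python as follows. This is only an illustration: the value of `MAXINDEX` is assumed, and for brevity `offset` is given the newest index directly rather than deriving it from the index set via indices.

```python
MAXINDEX = 7  # assumed value of maxindex, chosen only for illustration


def incrindex(i: int, maxindex: int = MAXINDEX) -> int:
    """Successor of a generation index, wrapping from maxindex to 0."""
    return 0 if i == maxindex else i + 1


def decrindex(i: int, maxindex: int = MAXINDEX) -> int:
    """Predecessor of a generation index, wrapping from 0 to maxindex."""
    return maxindex if i == 0 else i - 1


def offset(new: int, x: int, maxindex: int = MAXINDEX) -> int:
    """Number of incrindex steps needed to get from x to the newest index."""
    steps = 0
    while x != new:
        x = incrindex(x, maxindex)
        steps += 1
    return steps
```

With the newest index equal to 1, index 1 has offset 0, index 0 has offset 1, and index 7 (reached by wrapping around) has offset 2.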
The initial state of the memory system is:
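\[
\begin{array}{l}
\mathit{CInitialMemSys} \\
\hline
\mathit{CMemSys} \\
\hline
\mathit{freetags} = \mathit{tags} \\
\mathit{freelocs} = \mathit{Loc}
\end{array}
\]
There are no used tags or memory locations. The remaining fields in the tag data and pages take arbitrary
initial values.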
We can return an unused tag:
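\[
\begin{array}{l}
\mathit{CNewTag} \\
\hline
\Delta\mathit{CMemSys} \\
n? : \mathbb{N}_1 \\
t! : \mathit{tags} \\
\hline
t! \in \mathit{freetags} \\
\mathit{data}' = \mathit{data} \oplus \{\, t! \mapsto (\mathit{true}, n?, \mathit{false}, 0, \mathit{tdx}(\mathit{data}\ t!)) \,\} \\
\mathit{mem}' = \mathit{mem} \\
\mathit{freetags}' = \mathit{freetags} \setminus \{ t! \} \\
\mathit{freelocs}' = \mathit{freelocs}
\end{array}
\]
This simply updates the tag data; the memory remains unchanged.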
We can read the information sequence, of a given generation, associated with a tag:
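\[
\begin{array}{l}
\mathit{CReadGeneration} \\
\hline
\mathit{CMemSys} \\
t? : \mathit{tags} \\
g? : \mathbb{N} \\
\mathit{info}! : \mathrm{seq}\ \mathit{Info} \\
\hline
\mathit{tdu}(\mathit{data}\ t?) = \mathit{true} \land \mathit{tdg}(\mathit{data}\ t?) \neq 0 \land g? < \mathit{tdg}(\mathit{data}\ t?) \\
\mathit{info}! = \\
\quad \{\, \mathit{lset} : \mathbb{P}\,\mathit{Loc};\ l : \mathit{Loc} \mid \\
\qquad \mathit{lset} \in \mathit{gentaglocs}(t?, \mathit{mem}) \land {} \\
\qquad l \in \mathit{lset} \land {} \\
\qquad \mathit{offset}(\mathit{tagindices}(t?, \mathit{mem}), \mathit{px}(\mathit{mem}\ l)) = g? \bullet \\
\qquad \mathit{pn}(\mathit{mem}\ l) + 1 \mapsto \mathit{pi}(\mathit{mem}\ l) \,\}
\end{array}
\]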
Using schema conjunction and hiding we can specify an operation that reads the current generation of
information associated with a tag:
\[
\mathit{CRead} \mathrel{\widehat{=}} (\mathit{CReadGeneration} \land \mathit{CurrentGeneration}) \setminus (g?)
\]
We can release all the information associated with a tag:
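\[
\begin{array}{l}
\mathit{CRelease} \\
\hline
\Delta\mathit{CMemSys} \\
t? : \mathit{tags} \\
\hline
\mathit{tdu}(\mathit{data}\ t?) = \mathit{true} \\
\mathit{data}' = \mathit{data} \oplus \{\, t? \mapsto (\mathit{false}, \mathit{tds}(\mathit{data}\ t?), \mathit{tdc}(\mathit{data}\ t?), \mathit{tdg}(\mathit{data}\ t?), \mathit{tdx}(\mathit{data}\ t?)) \,\} \\
\mathit{mem}' = \mathit{release}(\mathit{mem}, \mathit{taglocs}(t?, \mathit{mem})) \\
\mathit{freetags}' = \mathit{freetags} \cup \{ t? \} \\
\mathit{freelocs}' = \mathit{freelocs} \cup \mathit{taglocs}(t?, \mathit{mem})
\end{array}
\]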
This uses the following operation to release the locations associated with the tag:
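\[
\begin{array}{l}
\mathit{release} : \mathit{Memory} \times \mathbb{P}\,\mathit{Loc} \to \mathit{Memory} \\
\hline
\mathit{release}(\mathit{mem}, \mathit{lset}) = \mathit{mem} \oplus \{\, l : \mathit{Loc} \mid l \in \mathit{lset} \bullet \\
\qquad l \mapsto (\mathit{pi}(\mathit{mem}\ l), \mathit{false}, \mathit{pt}(\mathit{mem}\ l), \mathit{px}(\mathit{mem}\ l), \mathit{pn}(\mathit{mem}\ l)) \,\}
\end{array}
\]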
We can commit the current generation of information associated with a tag:
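\[
\begin{array}{l}
\mathit{CCommit} \\
\hline
\Delta\mathit{CMemSys} \\
t? : \mathit{tags} \\
\hline
\mathit{tdu}(\mathit{data}\ t?) = \mathit{true} \land \mathit{tdg}(\mathit{data}\ t?) > 0 \\
\mathit{data}' = \mathit{data} \oplus \{\, t? \mapsto (\mathit{tdu}(\mathit{data}\ t?), \mathit{tds}(\mathit{data}\ t?), \mathit{true}, \mathit{tdg}(\mathit{data}\ t?), \mathit{tdx}(\mathit{data}\ t?)) \,\} \\
\mathit{mem}' = \mathit{mem} \\
\mathit{freetags}' = \mathit{freetags} \\
\mathit{freelocs}' = \mathit{freelocs}
\end{array}
\]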
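We can perform the first write to a tag after CNewTag:
\[
\begin{array}{l}
\mathit{CWriteFirst} \\
\hline
\Delta\mathit{CMemSys} \\
t? : \mathit{tags} \\
\mathit{info}? : \mathrm{seq}\ \mathit{Info} \\
\hline
\mathit{tdu}(\mathit{data}\ t?) = \mathit{true} \land \mathit{tds}(\mathit{data}\ t?) = \#\mathit{info}? \\
\mathit{tdg}(\mathit{data}\ t?) = 0 \\
\#\mathit{freelocs} > \#\mathit{info}? \\
\mathit{data}' = \mathit{data} \oplus \{\, t? \mapsto (\mathit{tdu}(\mathit{data}\ t?), \mathit{tds}(\mathit{data}\ t?), \mathit{false}, 1, 0) \,\} \\
\exists\, \mathit{lset} : \mathbb{P}\,\mathit{Loc} \mid \mathit{lset} \subseteq \mathit{freelocs} \land \#\mathit{lset} = \#\mathit{info}? \bullet \\
\quad \mathit{mem}' = \mathit{update}(\mathit{mem}, \mathit{lset}, \mathit{info}?, t?, 0) \land {} \\
\quad \mathit{freelocs}' = \mathit{freelocs} \setminus \mathit{lset} \\
\mathit{freetags}' = \mathit{freetags}
\end{array}
\]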
This uses the following function to update the set of locations newly associated with the tag:
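\[
\begin{array}{l}
\mathit{update} : \mathit{Memory} \times \mathbb{P}\,\mathit{Loc} \times \mathrm{seq}\ \mathit{Info} \times \mathit{tags} \times \mathbb{N} \to \mathit{Memory} \\
\hline
\forall\, \mathit{mem} : \mathit{Memory};\ \mathit{lset} : \mathbb{P}\,\mathit{Loc};\ \mathit{info} : \mathrm{seq}\ \mathit{Info};\ t : \mathit{tags};\ x : \mathbb{N} \bullet \\
\quad \exists\, \mathit{mem}' : \mathbb{P}\,\mathit{Loc} \to \mathit{Page} \mid \mathrm{dom}\ \mathit{mem}' = \mathit{lset} \bullet \\
\qquad \forall\, n : 1 \,..\, \#\mathit{info} \bullet \\
\qquad\quad \exists\, l : \mathit{Loc} \mid l \in \mathit{lset} \bullet \\
\qquad\qquad \mathit{pi}(\mathit{mem}'\ l) = \mathit{info}\ n \land {} \\
\qquad\qquad \mathit{pu}(\mathit{mem}'\ l) = \mathit{true} \land {} \\
\qquad\qquad \mathit{pt}(\mathit{mem}'\ l) = t \land {} \\
\qquad\qquad \mathit{px}(\mathit{mem}'\ l) = x \land {} \\
\qquad\qquad \mathit{pn}(\mathit{mem}'\ l) = n - 1 \\
\qquad \land \\
\qquad \mathit{update}(\mathit{mem}, \mathit{lset}, \mathit{info}, t, x) = \mathit{mem} \oplus \mathit{mem}'
\end{array}
\]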
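The effect of update can be illustrated with a small Python sketch. It is one possible reading only: the specification leaves open which location receives which page number, whereas the sketch assumes (for determinism) that locations are assigned in ascending order. Pages are represented as tuples (info, inuse, tag, genindex, pageno), following the field order used by release; the dictionary representation of the memory is an assumption of the example.

```python
def update(mem: dict, lset: set, info: list, tag: str, genindex: int) -> dict:
    """Return a copy of mem in which the locations in lset hold the pages of info.

    The n-th item of info (counting from 1) is stored with page number n - 1.
    """
    if len(lset) != len(info):
        raise ValueError("need exactly one location per item of information")
    new_mem = dict(mem)
    # Assumed policy: hand out locations in ascending order.
    for pageno, (loc, item) in enumerate(zip(sorted(lset), info)):
        new_mem[loc] = (item, True, tag, genindex, pageno)
    return new_mem
```

For instance, writing the sequence ["x", "y"] into locations {3, 5} for a tag "t" at generation index 0 places "x" in location 3 as page 0 and "y" in location 5 as page 1.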
We write to a tag whose current generation is uncommitted:
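\[
\begin{array}{l}
\mathit{CWriteUncommitted} \\
\hline
\Delta\mathit{CMemSys} \\
t? : \mathit{tags} \\
\mathit{info}? : \mathrm{seq}\ \mathit{Info} \\
\hline
\mathit{tdu}(\mathit{data}\ t?) = \mathit{true} \land \mathit{tds}(\mathit{data}\ t?) = \#\mathit{info}? \\
\mathit{tdg}(\mathit{data}\ t?) > 0 \\
\mathit{tdc}(\mathit{data}\ t?) \neq \mathit{true} \\
\mathit{data}' = \mathit{data} \\
\exists\, \mathit{lset} : \mathbb{P}\,\mathit{Loc} \mid \mathit{lset} \in \mathit{gentaglocs}(t?, \mathit{mem}) \land {} \\
\qquad (\forall\, l : \mathit{Loc} \mid l \in \mathit{lset} \bullet \mathit{px}(\mathit{mem}\ l) = \mathit{tdx}(\mathit{data}\ t?)) \bullet \\
\quad \mathit{mem}' = \mathit{update}(\mathit{mem}, \mathit{lset}, \mathit{info}?, t?, \mathit{tdx}(\mathit{data}\ t?)) \\
\mathit{freetags}' = \mathit{freetags} \\
\mathit{freelocs}' = \mathit{freelocs}
\end{array}
\]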
When we write to a tag whose current generation has been committed we add a new generation to those
associated with the tag. If this results in more than the maximum allowed number of generations the oldest
generation must be dropped.
If we have not reached the maximum allowed number of generations we needn't drop the oldest generation:
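\[
\begin{array}{l}
\mathit{CWriteCommittedAddGen} \\
\hline
\Delta\mathit{CMemSys} \\
t? : \mathit{tags} \\
\mathit{info}? : \mathrm{seq}\ \mathit{Info} \\
\hline
\mathit{tdu}(\mathit{data}\ t?) = \mathit{true} \land \mathit{tds}(\mathit{data}\ t?) = \#\mathit{info}? \\
\mathit{tdg}(\mathit{data}\ t?) > 0 \land \mathit{tdg}(\mathit{data}\ t?) < \mathit{maxgen} \\
\mathit{tdc}(\mathit{data}\ t?) = \mathit{true} \\
\#\mathit{freelocs} > \#\mathit{info}? \\
\mathit{data}' = \mathit{data} \oplus \{\, t? \mapsto (\mathit{tdu}(\mathit{data}\ t?), \mathit{tds}(\mathit{data}\ t?), \mathit{tdc}(\mathit{data}\ t?), \\
\qquad\qquad \mathit{tdg}(\mathit{data}\ t?) + 1, \mathit{incrindex}(\mathit{tdx}(\mathit{data}\ t?))) \,\} \\
\exists\, \mathit{lset} : \mathbb{P}\,\mathit{Loc} \mid \mathit{lset} \subseteq \mathit{freelocs} \land \#\mathit{lset} = \#\mathit{info}? \bullet \\
\quad \mathit{mem}' = \mathit{update}(\mathit{mem}, \mathit{lset}, \mathit{info}?, t?, \mathit{incrindex}(\mathit{tdx}(\mathit{data}\ t?))) \land {} \\
\quad \mathit{freelocs}' = \mathit{freelocs} \setminus \mathit{lset} \\
\mathit{freetags}' = \mathit{freetags}
\end{array}
\]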
If we have reached the maximum allowed number of generations we must drop the oldest generation:
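\[
\begin{array}{l}
\mathit{CWriteCommittedMaxGen} \\
\hline
\Delta\mathit{CMemSys} \\
t? : \mathit{tags} \\
\mathit{info}? : \mathrm{seq}\ \mathit{Info} \\
\hline
\mathit{tdu}(\mathit{data}\ t?) = \mathit{true} \land \mathit{tds}(\mathit{data}\ t?) = \#\mathit{info}? \\
\mathit{tdg}(\mathit{data}\ t?) = \mathit{maxgen} \\
\mathit{tdc}(\mathit{data}\ t?) = \mathit{true} \\
\#\mathit{freelocs} > \#\mathit{info}? \\
\mathit{data}' = \mathit{data} \oplus \{\, t? \mapsto (\mathit{tdu}(\mathit{data}\ t?), \mathit{tds}(\mathit{data}\ t?), \mathit{tdc}(\mathit{data}\ t?), \\
\qquad\qquad \mathit{tdg}(\mathit{data}\ t?), \mathit{incrindex}(\mathit{tdx}(\mathit{data}\ t?))) \,\} \\
\exists\, \mathit{old}, \mathit{new} : \mathbb{N};\ \mathit{oldlset} : \mathbb{P}\,\mathit{Loc};\ \mathit{mem}'' : \mathit{Memory};\ \mathit{lset} : \mathbb{P}\,\mathit{Loc} \mid \\
\qquad (\mathit{old}, \mathit{new}) = \mathit{indices}(\mathit{tagindices}(t?, \mathit{mem})) \land {} \\
\qquad \mathit{oldlset} = \{\, l : \mathit{Loc} \mid l \in \mathit{taglocs}(t?, \mathit{mem}) \land \mathit{px}(\mathit{mem}\ l) = \mathit{old} \,\} \land {} \\
\qquad \mathit{mem}'' = \mathit{release}(\mathit{mem}, \mathit{oldlset}) \land {} \\
\qquad \mathit{lset} \subseteq \mathit{freelocs} \land \#\mathit{lset} = \#\mathit{info}? \bullet \\
\quad \mathit{mem}' = \mathit{update}(\mathit{mem}'', \mathit{lset}, \mathit{info}?, t?, \mathit{incrindex}(\mathit{tdx}(\mathit{data}\ t?))) \land {} \\
\quad \mathit{freelocs}' = \mathit{freelocs} \setminus \mathit{lset} \\
\mathit{freetags}' = \mathit{freetags}
\end{array}
\]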
We can now use schema disjunction to specify the write operation:
\[
\mathit{CWrite} \mathrel{\widehat{=}} \mathit{CWriteFirst} \lor \mathit{CWriteUncommitted} \lor \mathit{CWriteCommittedAddGen} \lor \mathit{CWriteCommittedMaxGen}
\]
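The four disjuncts have mutually exclusive preconditions on the tag data, so at most one of them applies to any given state. The following Python sketch shows that case analysis only; the `TagData` record, the function name and the returned labels are illustrative assumptions, and the requirement on the number of free locations is deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class TagData:
    in_use: bool      # tdu
    size: int         # tds
    committed: bool   # tdc
    generations: int  # tdg


def write_case(td: TagData, info_len: int, maxgen: int) -> str:
    """Name the write schema whose precondition matches the tag data."""
    if not td.in_use or td.size != info_len:
        raise ValueError("tag not in use, or information has the wrong size")
    if td.generations == 0:
        return "CWriteFirst"
    if not td.committed:
        return "CWriteUncommitted"
    if td.generations < maxgen:
        return "CWriteCommittedAddGen"
    return "CWriteCommittedMaxGen"
```

The last branch relies on the state invariant that the generation count never exceeds maxgen, so a committed tag that cannot add a generation must already hold the maximum.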
4.4 Conclusion and Future Work
This specification demonstrates how we can refine an abstract specification into a concrete specification that is
fairly close to a real implementation. It does, however, reveal that an implementation of sequences via the data
in the pages of the memory requires a complex specification. This specification successfully captured our current
understanding of the real implementation of the tagged memory system and allowed us to discuss it at length
with Eduard de Jong. This discussion revealed a number of ways in which the implementation differed from the
specification and suggested future work on the specification:
• The relationships between the CNewTag and CWrite operations must be refined to capture the correct
allocation of locations to a tag.
• As it stands this specification is too strong, as it assumes that the concrete operations are atomic. We must
introduce a further level of refinement in terms of the truly atomic operations: writing a page and reading
a byte.
• Once we accept that the concrete operations used in this specification are not atomic we must then allow
for the memory system being in states not allowed by this specification. This means we must include
detection of these non-standard states, allowing recovery or error reporting.
• If we weaken the concrete specification to allow error reporting we must rewrite the abstract specification
in a similar fashion.
• We would like to be able to extend the basic specification in a modular way to encompass these enhancements.
Section 5
Z Specification of Memory Module Design
We describe a further refinement of the Concrete specification of the memory management system. This refinement
introduces the atomic operations over pages in terms of which all operations must be described. We also
describe the new error states that may arise when a sequence of atomic operations is interrupted. By elaborating
the housekeeping data stored in the pages in memory we are able to constrain these error states and ensure that
any "lost" pages can subsequently be reclaimed.
5.1 Introduction
In Figure 5.1 we describe the various specifications of the memory management system and the documents in
which they appear. The initial Abstract and Concrete specifications appeared in §2 of the first deliverable.
[Figure 5.1: Z Specifications. Initial (Jun 96): Abstract, refined to Concrete. After review, Modified (Aug-Oct 96): Abstract, refined to Concrete. Current (Apr 97): Design, a refinement of the modified Concrete specification. Next (May 97): Implementation, a refinement of the Design specification.]
Through a process of review and discussion with Eduard de Jong these were modified to produce the versions
that appeared in §4 of that document. This document describes the Design level specification of the memory
management system, which is a refinement of the Concrete specification. This document is also the product of
an ongoing process of review and discussion with Eduard de Jong. This specification will itself be refined into
the Implementation specification, which will appear in a future document.
This specification introduces some of the low-level implementation details corresponding to the real
hardware/software as described by Eduard de Jong. The inclusion of these implementation details gave rise to new
error states which were not considered in the abstract and concrete specifications. This could have required the
modification of these specifications to include these new error states. However, by careful specification of the
handling of these new error states in this specification we were able to avoid the need for changes to the abstract
and concrete specifications. The new implementation details are described below:
• Repeated updates of the same memory location cannot be allowed. The major consequence of this is that we
must simplify the data associated with each tag in the TagData mapping in order to remove values that
can be calculated from the memory.
• The atomic operations over the memory are page based and are sequenced.
• The sequences of atomic operations may be interrupted at any point. This means that the memory can be
left in error states not found during normal operation.
5.2 Sequences of Atomic Operations
In the actual implementation the basic operations over the memory are the writing of a page into tagged memory
and the reading of a byte from tagged memory. The number of bytes in each page will be constant but may
vary from implementation to implementation. We can assume that these atomic operations must either succeed
or fail, as this requirement can easily be satisfied by the actual implementation. Sequences of atomic operations
can, however, be interrupted at any point by the removal of the card from its power supply. If a sequence of
atomic write operations is interrupted the tagged memory can be left in a state that is not captured by the concrete
specification. This design specification describes all operations that write to the memory in terms of sequences of
the atomic page writing operation that may be interrupted at any point. This design specification must capture
the new error states that arise, allow for their subsequent detection and, where possible, support recovery from these
erroneous states. We do not describe operations that read from the memory in terms of sequences of atomic
read operations, as their interruption cannot introduce new states of the memory.
5.3 Handling Erroneous States
There are two different ways to handle erroneous states. The first approach I considered was to modify both
the concrete and abstract specifications to allow for such erroneous states. The design specification could then
simply allow such states but avoid discussing how they might be handled. The problem with this approach is that
while error states can be detected, by the absence or duplication of pages, there is no way to recognise the cause
of the error and therefore no way to perform error recovery. To solve this problem the memory manager would
have to record some indication of its current state in the memory in such a way as to allow for subsequent error
recovery. My attempts to specify the recording of such a state in a form that relates to the memory operations
as seen by an application all required repeated writes of the state information to some page in memory.
This problem, along with the need to modify the abstract and concrete specifications, led me to investigate
an approach in which all the error detection and recovery could be contained within the new specification and
hidden at some level within the final implementation of the system. This is the method I adopted and which is
described in the remainder of this paper.
5.4 Specication Constraints
There are a number of new constraints that I used as goals when designing the new specification. The first
constraint was actually the motivation for the development of the tagged memory management system. However,
the abstract and concrete specifications did not take this into account and in that sense it is new in this
specification:
• We should write to a given page as few times as possible. This basically means that we should only write
to a page when we have no choice:
  – When writing new pages of information.
  – When superseding pages of information.
  – When removing an association between a page and a tag.
All the information required to track the state of the memory manager should be stored utilising only these
write operations.
I was not certain that I would be able to satisfy all these constraints but in the event I believe I have succeeded
in doing so while imposing only a slight memory cost on the memory manager.
• Memory is limited so the memory management system should use as little as possible itself.
• The only write operation we may perform on the memory is the atomic writing of a page.
• The card can be pulled at any time, thus any sequence of atomic write operations can be interrupted at
any point. We should be able to detect the resulting erroneous state and then tidy up the memory.
• We should ensure that we can always recover the memory lost when an atomic operation sequence is
interrupted. This is because one of the major results of verification will be the memory boundedness of
programs, which will only hold if we can recover lost memory.
• We, of course, retain all the constraints employed in the previous specifications, such as requiring that the
information read from a tag is equal to that previously written to that tag.
5.5 Error States
There are four contexts in which a sequence of atomic operations can be interrupted to give rise to a distinct
error state:
• When writing a new generation of information we may fail to write all the required pages.
• When writing a new version of the current generation we may fail to write all the pages of the new version
or to supersede all the pages of the old version.
• When releasing the pages of an old generation in order to provide space for a new generation we may fail
to release all the pages of the old generation.
• When deallocating all the pages for a tag for the Release operation we may fail to deallocate them all.
We can't record a separate flag to track the current state of the memory manager for a tag as we would have to
pick a page to keep it in, which would then suffer from repeated writes as the state changed. Instead we use the
presence of page zero to indicate the presence of all the other pages of a generation, as described by Eduard
de Jong, and elaborate the information otherwise stored in a page by a further piece of data:
• A cyclic three-state flag that allows us to determine the relative age of two versions of the same generation.
Each page in a given version will have the same value in this flag; the pages of a new version will all take
the successor state to that of the current version.
5.6 Error Recovery
As checking for and remedying error states before each operation would be expensive we instead wait until we
have no choice but to reclaim the memory lost due to disrupted operations. Thus the presence of an error state in
the memory manager will be noted by a Write operation failing to find sufficient free pages. We can then invoke
an operation to tidy up the memory, releasing the lost pages for reuse. By performing some inexpensive local
housekeeping in the operations we can restrict the complexity of the error states that can arise from repeated
disruptions. This greatly simplifies the error recovery task. We describe the different forms of error recovery,
how they are tidied up, the error states that invoke them and the housekeeping required of the operations:
1. If there are pages marked as in use by a tag but the tag data does not mark it as in use they can all be
marked as not in use. This will only occur due to a disruption while releasing all the information associated
with a tag. The New operation is required to tidy up any pages marked as in use by the new tag.
2. If there are pages for a given generation and version with no page zero they can all be marked as not in
use. This will occur due to disruptions while writing new generations and versions and while superseding
old versions and releasing old generations. The Commit and Write operations are required to tidy up
incomplete versions and generations for the given tag.
3. If there are two complete sets of pages for a tag with the same generation the pages of the older version
can be marked as not in use. This will only occur due to a disruption while writing a new version of the
current generation. The Commit and Write operations are required to tidy up out-of-date versions for the
given tag. Given this housekeeping we can ensure that only the current generation can ever have multiple
versions.
4. If there are more generations associated with a tag than the maximum allowed then the pages of the oldest
generation can be marked as not in use. This will only occur due to a disruption while writing a new
generation when the maximum number of generations already exist. The Write operation is required to
tidy up excessively old generations for the given tag.
Given this localised housekeeping we can easily calculate a conservative estimate of the number of locations
currently in use before each Write operation. This estimate is conservative in the sense that, in the error states,
it may conclude that more locations are in use than in fact are marked as in use. During normal operation this
estimate will correspond exactly to the number of pages required by all the tags currently in use. If this estimate
indicates that there are not enough free locations we can tidy up the memory, recovering locations lost due to
interruption of a memory update, and try again. If there are still not enough free locations this indicates an
unrecoverable error due to an application requiring more than the available memory. We make no attempt to
handle this error; we instead require the user to avoid calling operations in such a manner as would cause this
error. This may well require that memory boundedness constraints are verified for all applications.
This is an instance of a general issue concerning the limits of our specification. We are assuming that certain
operations will only be called when it is sensible to do so. This allows us to avoid the additional complexity that
would be required in the specifications if we were to consider these additional sources of errors. In a development
process involving verification of the use of operations such simplifying assumptions can be formally justified.
5.7 Design Specication
The design specification follows the concrete specification in its general structure and retains some elements from
both the abstract and concrete specifications. The specification is parameterised by the following types:
\[
[\mathit{Tag}, \mathit{Info}]
\]
We have a set of tags, a memory size and a maximum generations count:
\[
\begin{array}{l}
\mathit{tags} : \mathbb{F}\,\mathit{Tag} \\
\mathit{msize} : \mathbb{N}_1 \\
\mathit{maxgen} : \mathbb{N}_1
\end{array}
\]
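We have a simulated boolean type:
\[
\begin{array}{l}
\mathbb{B} == \mathbb{N} \\
\mathit{true} : \mathbb{B} \\
\mathit{false} : \mathbb{B} \\
\hline
\mathit{true} \neq \mathit{false}
\end{array}
\]
We have a constraint on the generation indices:
\[
\begin{array}{l}
\mathit{maxindex} : \mathbb{N} \\
\hline
\mathit{maxindex} > \mathit{maxgen}
\end{array}
\]
We have two useful abbreviations:
\[
\begin{array}{l}
\mathit{Indices} == 0 \,..\, \mathit{maxindex} \\
\mathit{Loc} == 0 \,..\, (\mathit{msize} - 1)
\end{array}
\]
We have the following characterisation of acceptable sets of generation indices:
\[
\begin{array}{l}
\mathit{indices} : \mathbb{P}\,\mathbb{N} \rightharpoonup \mathbb{N} \times \mathbb{N} \\
\hline
\mathit{indices} = \\
\quad \{\, i : \mathbb{P}\,\mathbb{N};\ \mathit{old}, \mathit{new} : \mathbb{N} \mid \\
\qquad i \subseteq \mathit{Indices} \land \mathit{old} \in \mathit{Indices} \land \mathit{new} \in \mathit{Indices} \land {} \\
\qquad (\mathit{old} \leq \mathit{new} \land i = \mathit{old} \,..\, \mathit{new} \ \lor \\
\qquad\ \mathit{old} > \mathit{new} \land i = 0 \,..\, \mathit{new} \cup \mathit{old} \,..\, \mathit{maxindex}) \bullet \\
\qquad i \mapsto (\mathit{old}, \mathit{new}) \,\}
\end{array}
\]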
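In the following I derive the names employed for types and functions from those employed in the concrete
specification, prefixing them with a `D' or `d'. In some cases I have also shortened the resulting name.
We modify the tag data, removing the generation count and index as these values can be determined from
the memory.
\[
\mathit{DTagData} == \mathbb{B} \times \mathbb{N}_1 \times \mathbb{B}
\]
We have projections over this type:
\[
\begin{array}{l}
\mathit{dtdu} : \mathit{DTagData} \to \mathbb{B} \\
\mathit{dtds} : \mathit{DTagData} \to \mathbb{N}_1 \\
\mathit{dtdc} : \mathit{DTagData} \to \mathbb{B} \\
\hline
\mathit{dtdu}(\mathit{inuse}, \mathit{size}, \mathit{committed}) = \mathit{inuse} \\
\mathit{dtds}(\mathit{inuse}, \mathit{size}, \mathit{committed}) = \mathit{size} \\
\mathit{dtdc}(\mathit{inuse}, \mathit{size}, \mathit{committed}) = \mathit{committed}
\end{array}
\]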
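The abbreviations defining a page in memory are now altered to include the extra state information required
by the design specification.
\[
\begin{array}{l}
\mathit{DPage} == \mathbb{B} \times \mathit{tags} \times \mathit{Info} \times \mathbb{N} \times \mathbb{N} \times \mathit{Version} \\
\mathit{Version} ::= V_A \mid V_B \mid V_C \\
\mathit{DMemory} == \mathit{Loc} \to \mathit{DPage}
\end{array}
\]
The memory will consist of msize locations, each of which identifies a page. Each page contains a piece of
information and housekeeping data that records whether it is in use, the tag it is associated with, its generation
index, its page number and its version. The page versions form a cycle that allows us to order successive versions
of a generation:
\[
\begin{array}{l}
\_ \prec \_ : \mathit{Version} \leftrightarrow \mathit{Version} \\
\mathit{nextV} : \mathit{Version} \to \mathit{Version} \\
\hline
V_A \prec V_B \land V_B \prec V_C \land V_C \prec V_A \\
\mathit{nextV}\ V_A = V_B \\
\mathit{nextV}\ V_B = V_C \\
\mathit{nextV}\ V_C = V_A
\end{array}
\]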
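To see why three states are enough to order two successive versions, consider the following Python sketch. The enumeration and the helper names (`next_v`, `precedes`, `older_of`) are assumptions made for illustration; the sketch relies on the housekeeping property that at most two versions of a generation, one the successor of the other, exist at any time.

```python
from enum import Enum


class Version(Enum):
    VA = 0
    VB = 1
    VC = 2


def next_v(v: Version) -> Version:
    """Successor in the cycle VA -> VB -> VC -> VA."""
    return Version((v.value + 1) % 3)


def precedes(a: Version, b: Version) -> bool:
    """True when b is the immediate successor of a in the cycle."""
    return next_v(a) is b


def older_of(a: Version, b: Version) -> Version:
    """Return the out-of-date version of two successive versions."""
    if precedes(a, b):
        return a
    if precedes(b, a):
        return b
    raise ValueError("versions are not successive")
```

With only two states the relation would be symmetric (each state would be the successor of the other), so the older of two coexisting versions could not be determined; with three states exactly one direction holds for any pair of distinct versions.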
We introduce some projection operations on DPage:
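\[
\begin{array}{l}
\mathit{dpu} : \mathit{DPage} \to \mathbb{B} \\
\mathit{dpt} : \mathit{DPage} \to \mathit{tags} \\
\mathit{dpi} : \mathit{DPage} \to \mathit{Info} \\
\mathit{dpx} : \mathit{DPage} \to \mathbb{N} \\
\mathit{dpn} : \mathit{DPage} \to \mathbb{N} \\
\mathit{dpv} : \mathit{DPage} \to \mathit{Version} \\
\hline
\mathit{dpu}(\mathit{inuse}, \mathit{tag}, \mathit{info}, \mathit{genindex}, \mathit{pageno}, \mathit{version}) = \mathit{inuse} \\
\mathit{dpt}(\mathit{inuse}, \mathit{tag}, \mathit{info}, \mathit{genindex}, \mathit{pageno}, \mathit{version}) = \mathit{tag} \\
\mathit{dpi}(\mathit{inuse}, \mathit{tag}, \mathit{info}, \mathit{genindex}, \mathit{pageno}, \mathit{version}) = \mathit{info} \\
\mathit{dpx}(\mathit{inuse}, \mathit{tag}, \mathit{info}, \mathit{genindex}, \mathit{pageno}, \mathit{version}) = \mathit{genindex} \\
\mathit{dpn}(\mathit{inuse}, \mathit{tag}, \mathit{info}, \mathit{genindex}, \mathit{pageno}, \mathit{version}) = \mathit{pageno} \\
\mathit{dpv}(\mathit{inuse}, \mathit{tag}, \mathit{info}, \mathit{genindex}, \mathit{pageno}, \mathit{version}) = \mathit{version}
\end{array}
\]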
We have an operation that takes a memory and returns all the locations whose pages are marked as being
associated with a given tag. We need only consider those pages that are marked as in use:
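\[
\begin{array}{l}
\mathit{dtaglocs} : \mathit{tags} \times \mathit{DMemory} \to \mathbb{P}\,\mathit{Loc} \\
\hline
\mathit{dtaglocs}(\mathit{tag}, \mathit{mem}) = \\
\quad \{\, l : \mathit{Loc} \mid \mathit{dpu}(\mathit{mem}\ l) = \mathit{true} \land \mathit{dpt}(\mathit{mem}\ l) = \mathit{tag} \bullet l \,\}
\end{array}
\]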
We can also specify an operation that returns the generation indices associated with a tag:
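\[
\begin{array}{l}
\mathit{dtagindices} : \mathit{tags} \times \mathit{DMemory} \to \mathbb{P}\,\mathbb{N} \\
\hline
\mathit{dtagindices}(\mathit{tag}, \mathit{mem}) = \\
\quad \{\, l : \mathit{Loc} \mid l \in \mathit{dtaglocs}(\mathit{tag}, \mathit{mem}) \bullet \mathit{dpx}(\mathit{mem}\ l) \,\}
\end{array}
\]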
In the following characterisations of the sets of locations corresponding to the various classes of error we use the
numbering scheme of the list in Section 5.6.
As it is possible for the deallocation of pages for a tag to be interrupted we must be able to detect this
condition. These are simply the locations that are marked as being in use by a tag that is itself marked as not
being in use:
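\[
\begin{array}{l}
\mathit{badlocs1} : (\mathit{tags} \to \mathit{DTagData}) \times \mathit{DMemory} \to \mathbb{P}(\mathbb{P}\,\mathit{Loc}) \\
\hline
\mathit{badlocs1}(\mathit{data}, \mathit{mem}) = \\
\quad \{\, \mathit{lset} : \mathbb{P}\,\mathit{Loc} \mid (\exists\, \mathit{tag} : \mathit{tags} \mid \mathit{lset} = \mathit{dtaglocs}(\mathit{tag}, \mathit{mem}) \bullet \\
\qquad\qquad \mathit{dtdu}(\mathit{data}\ \mathit{tag}) = \mathit{false}) \bullet \mathit{lset} \,\}
\end{array}
\]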
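We can specify an operation that takes a tag and a memory and returns a set of sets of locations. The pages of
the locations in each set will have the same generation index for that tag:
\[
\begin{array}{l}
\mathit{gentlocs} : \mathit{tags} \times \mathit{DMemory} \to \mathbb{P}(\mathbb{P}\,\mathit{Loc}) \\
\hline
\mathit{gentlocs}(\mathit{tag}, \mathit{mem}) = \mathit{lsets} \\
\quad \textbf{where} \\
\qquad \bigcup \mathit{lsets} = \{\, l : \mathit{Loc} \mid l \in \mathit{dtaglocs}(\mathit{tag}, \mathit{mem}) \bullet l \,\} \land {} \\
\qquad \forall\, \mathit{lset} : \mathbb{P}\,\mathit{Loc} \mid \mathit{lset} \in \mathit{lsets} \bullet \\
\qquad\quad \forall\, l_1, l_2 : \mathit{Loc} \mid l_1 \in \mathit{lset} \land l_2 \in \mathit{lset} \bullet \\
\qquad\qquad \mathit{dpx}(\mathit{mem}\ l_1) = \mathit{dpx}(\mathit{mem}\ l_2)
\end{array}
\]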
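We can further refine this by separating the pages of different versions for a given generation:
\[
\begin{array}{l}
\mathit{vergtlocs} : \mathit{tags} \times \mathit{DMemory} \to \mathbb{P}(\mathbb{P}(\mathbb{P}\,\mathit{Loc})) \\
\hline
\mathit{vergtlocs}(\mathit{tag}, \mathit{mem}) = \mathit{lsetss} \\
\quad \textbf{where} \\
\qquad \{\, \mathit{lsets} : \mathbb{P}(\mathbb{P}\,\mathit{Loc}) \mid \mathit{lsets} \in \mathit{lsetss} \bullet \bigcup \mathit{lsets} \,\} = \mathit{gentlocs}(\mathit{tag}, \mathit{mem}) \land {} \\
\qquad \forall\, \mathit{lsets} : \mathbb{P}(\mathbb{P}\,\mathit{Loc});\ \mathit{lset} : \mathbb{P}\,\mathit{Loc} \mid \mathit{lsets} \in \mathit{lsetss} \land \mathit{lset} \in \mathit{lsets} \bullet \\
\qquad\quad \forall\, l_1, l_2 : \mathit{Loc} \mid l_1 \in \mathit{lset} \land l_2 \in \mathit{lset} \bullet \\
\qquad\qquad \mathit{dpv}(\mathit{mem}\ l_1) = \mathit{dpv}(\mathit{mem}\ l_2)
\end{array}
\]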
As it is possible for the writing or release of pages for a given generation to be interrupted we must be able
to detect this condition. We always write page zero of a generation last and release it rst so that its absence
indicates an incomplete generation:
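\[
\begin{array}{l}
\mathit{badlocs2} : \mathit{DMemory} \to \mathbb{P}(\mathbb{P}\,\mathit{Loc}) \\
\hline
\mathit{badlocs2}\ \mathit{mem} = \\
\quad \{\, \mathit{lset} : \mathbb{P}\,\mathit{Loc} \mid (\exists\, \mathit{tag} : \mathit{tags};\ \mathit{lsets} : \mathbb{P}(\mathbb{P}\,\mathit{Loc}) \bullet \\
\qquad\qquad \mathit{lsets} \in \mathit{vergtlocs}(\mathit{tag}, \mathit{mem}) \land \mathit{lset} \in \mathit{lsets}) \land {} \\
\qquad \lnot\, (\exists\, l : \mathit{Loc} \bullet l \in \mathit{lset} \land \mathit{dpn}(\mathit{mem}\ l) = 0) \,\}
\end{array}
\]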
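The detection of incomplete versions can be sketched in Python as follows. This is an illustrative reading, not the specification itself: it assumes that the in-use pages are grouped by (tag, generation index, version), which is the intended partition behind vergtlocs, and it uses a hypothetical `Page` record and a dictionary from locations to pages as the memory.

```python
from collections import defaultdict
from typing import NamedTuple


class Page(NamedTuple):
    in_use: bool
    tag: str
    info: int
    genindex: int
    pageno: int
    version: str


def incomplete_groups(mem: dict) -> list:
    """Return the location sets of in-use versions that lack a page zero."""
    groups = defaultdict(set)
    for loc, page in mem.items():
        if page.in_use:
            groups[(page.tag, page.genindex, page.version)].add(loc)
    return [
        locs
        for locs in groups.values()
        if not any(mem[loc].pageno == 0 for loc in locs)
    ]
```

Because page zero is written last and released first, any group found by this check is the remnant of an interrupted write or release and can safely be marked as not in use.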
It is also possible for two complete versions of the same generation to exist at the same time. The version
numbers allow us to distinguish the locations of the two versions and determine the out-of-date version:
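\[
\begin{array}{l}
\mathit{badlocs3} : \mathit{DMemory} \to \mathbb{P}(\mathbb{P}\,\mathit{Loc}) \\
\hline
\mathit{badlocs3}\ \mathit{mem} = \\
\quad \{\, \mathit{lset} : \mathbb{P}\,\mathit{Loc} \mid (\exists\, \mathit{tag} : \mathit{tags};\ \mathit{lset}' : \mathbb{P}\,\mathit{Loc} \bullet \\
\qquad\qquad \{ \mathit{lset}, \mathit{lset}' \} \in \mathit{vergtlocs}(\mathit{tag}, \mathit{mem}) \land {} \\
\qquad\qquad \{ \mathit{lset}, \mathit{lset}' \} \cap \mathit{badgtlocs}\ \mathit{mem} = \emptyset \land {} \\
\qquad\qquad (\forall\, l, l' : \mathit{Loc} \mid l \in \mathit{lset} \land l' \in \mathit{lset}' \bullet \\
\qquad\qquad\quad \mathit{dpv}(\mathit{mem}\ l) \prec \mathit{dpv}(\mathit{mem}\ l'))) \,\}
\end{array}
\]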
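Finally, it is possible for there to be one more complete generation than the maximum allowed:
\[
\begin{array}{l}
\mathit{badlocs4} : \mathit{DMemory} \to \mathbb{P}(\mathbb{P}\,\mathit{Loc}) \\
\hline
\mathit{badlocs4}\ \mathit{mem} = \\
\quad \{\, \mathit{lset} : \mathbb{P}\,\mathit{Loc} \mid \exists\, \mathit{old}, \mathit{new} : \mathbb{N};\ \mathit{tag} : \mathit{tags} \bullet \\
\qquad (\mathit{old}, \mathit{new}) = \mathit{indices}(\mathit{dtagindices}(\mathit{tag}, \mathit{mem})) \land {} \\
\qquad \mathit{lset} = \{\, l : \mathit{Loc} \mid l \in \mathit{dtaglocs}(\mathit{tag}, \mathit{mem}) \land \mathit{dpx}(\mathit{mem}\ l) = \mathit{old} \,\} \land {} \\
\qquad \#(\mathit{dtagindices}(\mathit{tag}, \mathit{mem})) > \mathit{maxgen} \,\}
\end{array}
\]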
Given these characterisations of the erroneous states of the memory system we can describe how to tidy up a
memory system.
All operations that modify the memory will be expressed in terms of sequences of the following atomic
operation which updates a single location in the memory:
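\[
\begin{array}{l}
\mathit{write} : \mathit{DMemory} \times \mathit{Loc} \times \mathit{DPage} \to \mathit{DMemory} \\
\hline
\mathit{write}(\mathit{mem}, l, p) = \mathit{mem} \oplus \{ l \mapsto p \}
\end{array}
\]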
To tidy up a page in the memory we simply mark it as not in use with the following procedure:
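\[
\begin{array}{l}
\textbf{PROCEDURE} \\
\mathit{release} : \mathit{DMemory} \times \mathbb{P}\,\mathit{Loc} \to \mathit{DMemory} \\
\hline
\mathit{release}(\mathit{mem}, \mathit{lset}) = \\
\quad \textbf{FOR}\ l\ \textbf{IN}\ \mathit{lset}\ \textbf{DO} \\
\qquad \mathit{write}(\mathit{mem}, l, (\mathit{false}, \mathit{dpt}(\mathit{mem}\ l), \mathit{dpi}(\mathit{mem}\ l), \mathit{dpx}(\mathit{mem}\ l), \mathit{dpn}(\mathit{mem}\ l), \mathit{dpv}(\mathit{mem}\ l)))
\end{array}
\]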
It is important that we retain the housekeeping data in the page after we release it as it may be required when
calculating the conservative estimate of free locations, described later in this section, before attempting subsequent Write
operations. The construct
FOR l IN lset DO write (mem, l, p)
is used to capture the fact that an update of a set of locations is achieved by a sequence of updates that may
be interrupted at any point. In this case we are not concerned about the order in which the elements of the set
of locations are released. In some cases we do wish to be sure that the first page we release from a generation is
page zero:
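\[
\begin{array}{l}
\textbf{PROCEDURE} \\
\mathit{release0} : \mathit{DMemory} \times \mathbb{P}\,\mathit{Loc} \to \mathit{DMemory} \\
\hline
\mathit{release0}(\mathit{mem}, \mathit{lset}) = \\
\quad \mathit{release}(\mathit{release}(\mathit{mem}, \mathit{lset0}), \mathit{lset} \setminus \mathit{lset0}) \\
\quad \textbf{where} \\
\qquad \mathit{lset0} = \{\, l : \mathit{Loc} \mid l \in \mathit{lset} \land \mathit{dpn}(\mathit{mem}\ l) = 0 \,\}
\end{array}
\]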
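The effect of releasing page zero first, and of an interruption part-way through the sequence of atomic writes, can be simulated with the following Python sketch. The dictionary-based pages, the `interrupt_after` parameter and the ordering of the remaining pages are assumptions made for the example; only the "page zero first" ordering comes from the procedure above.

```python
def release0_order(mem: dict, lset: set) -> list:
    """Order the locations so that any page zero is released first."""
    zero = sorted(l for l in lset if mem[l]["pageno"] == 0)
    rest = sorted(l for l in lset if mem[l]["pageno"] != 0)
    return zero + rest


def release0(mem: dict, lset: set, interrupt_after=None) -> dict:
    """Release lset one atomic page write at a time.

    interrupt_after simulates the card being pulled: only that many
    single-page writes are performed before the sequence stops.
    """
    new_mem = dict(mem)
    for count, loc in enumerate(release0_order(mem, lset)):
        if interrupt_after is not None and count >= interrupt_after:
            break
        # One atomic write: clear the in-use flag, keep the housekeeping data.
        new_mem[loc] = {**new_mem[loc], "in_use": False}
    return new_mem
```

If the release of a three-page generation is interrupted after the first write, pages 1 and 2 remain marked as in use but page zero does not, so the remnant is recognisable as an incomplete generation and can be reclaimed later.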
We can now describe how to tidy up the various sets of erroneous pages:
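\[
\begin{array}{l}
\mathit{tidy1} : (\mathit{tags} \to \mathit{DTagData}) \times \mathit{DMemory} \times \mathit{tags} \to \mathit{DMemory} \\
\mathit{tidy2} : \mathit{DMemory} \times \mathit{tags} \to \mathit{DMemory} \\
\mathit{tidy3} : \mathit{DMemory} \times \mathit{tags} \to \mathit{DMemory} \\
\mathit{tidy4} : \mathit{DMemory} \times \mathit{tags} \to \mathit{DMemory} \\
\hline
\mathit{tidy1}(\mathit{data}, \mathit{mem}, \mathit{tag}) = \mathit{release}(\mathit{mem}, \mathit{dtaglocs}(\mathit{tag}, \mathit{mem}) \cap \bigcup \mathit{badlocs1}(\mathit{data}, \mathit{mem})) \\
\mathit{tidy2}(\mathit{mem}, \mathit{tag}) = \mathit{release}(\mathit{mem}, \mathit{dtaglocs}(\mathit{tag}, \mathit{mem}) \cap \bigcup \mathit{badlocs2}\ \mathit{mem}) \\
\mathit{tidy3}(\mathit{mem}, \mathit{tag}) = \mathit{release0}(\mathit{mem}, \mathit{dtaglocs}(\mathit{tag}, \mathit{mem}) \cap \bigcup \mathit{badlocs3}\ \mathit{mem}) \\
\mathit{tidy4}(\mathit{mem}, \mathit{tag}) = \mathit{release0}(\mathit{mem}, \mathit{dtaglocs}(\mathit{tag}, \mathit{mem}) \cap \bigcup \mathit{badlocs4}\ \mathit{mem})
\end{array}
\]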
It should be noted that we are not concerned with the order in which locations are tidied up, unlike when releasing
locations during normal operation.
The memory system design is specified by two functions, as in the concrete specification:
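\[
\begin{array}{l}
\mathit{DMemSys} \\
\hline
\mathit{ddata} : \mathit{tags} \to \mathit{DTagData} \\
\mathit{dmem} : \mathit{DMemory} \\
\hline
\forall\, t : \mathit{tags};\ \mathit{lsetss} : \mathbb{P}(\mathbb{P}(\mathbb{P}\,\mathit{Loc})) \mid \mathit{lsetss} = \mathit{vergtlocs}(t, \mathit{dmem}) \bullet \\
\quad \#\mathit{lsetss} \leq \mathit{maxgen} + 1 \land {} \\
\quad \mathit{lsetss} \neq \emptyset \Rightarrow \mathit{dtagindices}(t, \mathit{dmem}) \in \mathrm{dom}\ \mathit{indices} \\
\quad \forall\, \mathit{lsets} : \mathbb{P}(\mathbb{P}\,\mathit{Loc}) \mid \mathit{lsets} \in \mathit{lsetss} \bullet \\
\qquad \#\mathit{lsets} \leq 2 \land {} \\
\qquad \forall\, \mathit{lset} : \mathbb{P}\,\mathit{Loc} \mid \mathit{lset} \in \mathit{lsets} \bullet \\
\qquad\quad \#\mathit{lset} \leq \mathit{dtds}(\mathit{ddata}\ t) \land {} \\
\qquad\quad (\exists\, l : \mathit{Loc} \mid l \in \mathit{lset} \bullet \mathit{dpn}(\mathit{dmem}\ l) = 0) \\
\qquad\qquad \Rightarrow \{\, l : \mathit{Loc} \mid l \in \mathit{lset} \bullet \mathit{dpn}(\mathit{dmem}\ l) + 1 \,\} = 1 \,..\, \#\mathit{lset}
\end{array}
\]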
The conditions over the two functions are weaker than in the concrete specification. This is required to allow for
the possible interruption of a sequence of atomic operations over the memory, which can leave it in any of the
previously described error states.
The data refinement from the concrete to the design specification is specified in terms of the tidied up design
memory system:
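\[
\begin{array}{l}
\mathit{Abstraction} \\
\hline
\mathit{CMemSys} \\
\mathit{DMemSys} \\
\hline
\exists\, \mathit{tdydmem} : \mathit{DMemory};\ \mathit{lset} : \mathbb{P}\,\mathit{Loc} \mid \\
\qquad \mathit{lset} = \bigcup \mathit{badlocs1}(\mathit{ddata}, \mathit{dmem}) \cup \bigcup \mathit{badlocs2}\ \mathit{dmem} \cup \bigcup \mathit{badlocs3}\ \mathit{dmem} \cup \bigcup \mathit{badlocs4}\ \mathit{dmem} \land {} \\
\qquad \mathit{tdydmem} = \mathit{release}(\mathit{dmem}, \mathit{lset}) \land {} \\
\quad \forall\, t : \mathit{tags} \bullet \\
\qquad \mathit{tdu}(\mathit{data}\ t) = \mathit{dtdu}(\mathit{ddata}\ t) \land {} \\
\qquad \mathit{tdu}(\mathit{data}\ t) = \mathit{true} \Rightarrow \\
\qquad\quad \mathit{tds}(\mathit{data}\ t) = \mathit{dtds}(\mathit{ddata}\ t) \land {} \\
\qquad\quad \mathit{tdc}(\mathit{data}\ t) = \mathit{dtdc}(\mathit{ddata}\ t) \land {} \\
\qquad\quad \mathit{tdg}(\mathit{data}\ t) = \#\mathit{vergtlocs}(t, \mathit{tdydmem}) \land {} \\
\qquad\quad \mathit{tdx}(\mathit{data}\ t) = 0 \land \mathit{dtdx}(\mathit{ddata}\ t) = 0 \ \lor \\
\qquad\quad \exists\, o, n : \mathbb{N} \bullet \\
\qquad\qquad (o, n) = \mathit{indices}(\mathit{tagindices}(t, \mathit{mem})) \land {} \\
\qquad\qquad (o, n) = \mathit{indices}(\mathit{dtagindices}(t, \mathit{tdydmem})) \\
\quad \land \\
\quad \forall\, l : \mathit{Loc} \bullet \\
\qquad \mathit{pi}(\mathit{mem}\ l) = \mathit{dpi}(\mathit{tdydmem}\ l) \land {} \\
\qquad \mathit{pu}(\mathit{mem}\ l) = \mathit{dpu}(\mathit{tdydmem}\ l) \land {} \\
\qquad \mathit{pt}(\mathit{mem}\ l) = \mathit{dpt}(\mathit{tdydmem}\ l) \land {} \\
\qquad \mathit{px}(\mathit{mem}\ l) = \mathit{dpx}(\mathit{tdydmem}\ l) \land {} \\
\qquad \mathit{pn}(\mathit{mem}\ l) = \mathit{dpn}(\mathit{tdydmem}\ l)
\end{array}
\]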
The initial state of the memory system is:
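\[
\begin{array}{l}
\mathit{DInitialMemSys} \\
\hline
\mathit{DMemSys} \\
\hline
\forall\, l : \mathit{Loc} \bullet \mathit{dpu}(\mathit{dmem}\ l) = \mathit{false} \ \land \\
\forall\, t : \mathit{tags} \bullet \mathit{dtdu}(\mathit{ddata}\ t) = \mathit{false}
\end{array}
\]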
We can return an unused tag:
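\[
\begin{array}{l}
\mathit{DNewTag} \\
\hline
\Delta\mathit{DMemSys} \\
n? : \mathbb{N}_1 \\
t! : \mathit{tags} \\
\hline
\mathit{dtdu}(\mathit{ddata}\ t!) = \mathit{false} \\
\mathit{dmem}' = \mathit{tidy1}(\mathit{ddata}, \mathit{dmem}, t!) \\
\mathit{ddata}' = \mathit{ddata} \oplus \{\, t! \mapsto (\mathit{true}, n?, \mathit{false}) \,\}
\end{array}
\]
This not only updates the tag data but also tidies up any locations associated with the tag in order to control
the range of possible error states. We are assuming here that the update of the tag data is an atomic operation
that takes place after we tidy up the memory.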
We can read the information sequence, of a given generation, associated with a tag:
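\[
\begin{array}{l}
\mathit{DReadGeneration} \\
\hline
\Delta\mathit{DMemSys} \\
t? : \mathit{tags} \\
g? : \mathbb{N} \\
\mathit{info}! : \mathrm{seq}\ \mathit{Info} \\
\hline
\mathit{dtdu}(\mathit{ddata}\ t?) = \mathit{true} \\
\mathit{dmem}' = \mathit{tidy4}(\mathit{tidy3}(\mathit{tidy2}(\mathit{dmem}, \{t?\}), \{t?\}), \{t?\}) \\
\mathit{info}! = \\
\quad \{\, \mathit{lset} : \mathbb{P}\,\mathit{Loc};\ l : \mathit{Loc} \mid \\
\qquad \mathit{lset} \in \mathit{gentlocs}(t?, \mathit{dmem}') \land {} \\
\qquad l \in \mathit{lset} \land {} \\
\qquad \mathit{offset}(\mathit{dtagindices}(t?, \mathit{dmem}'), \mathit{dpx}(\mathit{dmem}'\ l)) = g? \bullet \\
\qquad \mathit{dpn}(\mathit{dmem}'\ l) + 1 \mapsto \mathit{dpi}(\mathit{dmem}'\ l) \,\} \\
\mathit{ddata}' = \mathit{ddata}
\end{array}
\]
It should be noted that this Read operation can modify the memory. This is because it must first tidy up the
locations associated with the tag before it can determine the locations corresponding to the requested generation.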
We can again use schema conjunction and hiding to specify an operation that reads the current generation:
\[
\mathit{DRead} \mathrel{\widehat{=}} (\mathit{DReadGeneration} \land \mathit{CurrentGeneration}) \setminus (g?)
\]
We can release all the information associated with a tag:
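\[
\begin{array}{l}
\mathit{DRelease} \\
\hline
\Delta\mathit{DMemSys} \\
t? : \mathit{tags} \\
\hline
\mathit{dtdu}(\mathit{ddata}\ t?) = \mathit{true} \\
\mathit{ddata}' = \mathit{ddata} \oplus \{\, t? \mapsto (\mathit{false}, \mathit{dtds}(\mathit{ddata}\ t?), \mathit{dtdc}(\mathit{ddata}\ t?)) \,\} \\
\mathit{dmem}' = \mathit{release}(\mathit{dmem}, \mathit{dtaglocs}(t?, \mathit{dmem}))
\end{array}
\]
We are assuming here that the update of the tag data is an atomic operation that takes place before we release
any of the memory locations.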
We can commit the current generation of information associated with a tag:
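\[
\begin{array}{l}
\mathit{DCommit} \\
\hline
\Delta\mathit{DMemSys} \\
t? : \mathit{tags} \\
\hline
\mathit{dtdu}(\mathit{ddata}\ t?) = \mathit{true} \\
\mathit{dmem}' = \mathit{tidy3}(\mathit{tidy2}(\mathit{dmem}, \{t?\}), \{t?\}) \\
\mathit{dtaglocs}(t?, \mathit{dmem}') \neq \emptyset \\
\mathit{ddata}' = \mathit{ddata} \oplus \{\, t? \mapsto (\mathit{true}, \mathit{dtds}(\mathit{ddata}\ t?), \mathit{true}) \,\}
\end{array}
\]
It should be noted that we must tidy up the locations associated with the tag before we can be sure it is safe to
mark the current generation as committed.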
Before we can perform any of the Write operations we must first define how we calculate the conservative
estimate of the number of locations in use. In normal operation, where the memory is not in an error state, the
maximum number of locations in use between operations can be calculated by multiplying the sizes associated
with each tag that is in use by maxgen. The Write operations require a further generation's worth of locations for
the tag being written to. Thus the memory must be large enough to handle the maximum number of generations
for all tags that will be in use at the same time plus one generation of the tag with the largest size. When error
states are considered each tag may have maxgen + 1 generations associated with it. These error states mean that
Write operations cannot guarantee that there will be enough free locations unless they check the memory and
tidy up any error states. We need a way to calculate an estimate of the number of locations marked as in use
that will correspond to the actual number in normal operation and provide a conservative estimate in the error
states. We estimate the number of locations in use by a tag by assuming that the existence of each location with
a unique generation and version pair implies the existence of an entire generation of locations. This calculation
applies even if the tag data marks the tag as not in use and requires that the tag data record the size of the
generations previously associated with a tag. The conservative estimate we use is:
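\[
\begin{array}{l}
usedlocs : DTagData \times DMemory \rightarrow \mathbb{N} \\
usedlocs\,(ddata, dmem) = \displaystyle\sum_{t : tags} dtds\,(ddata\ t) \times \max\,(maxgen,\ \#\bigcup vergtlocs\,(t, dmem))
\end{array}
\]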
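As an informal illustration, the estimate can be sketched in Python. This is not part of the specification: the Page record and the dictionary of tag sizes are hypothetical stand-ins, and since vergtlocs is defined outside this section the sketch simply assumes that its contribution is the number of distinct (generation, version) pairs found among the in-use pages of a tag.

```python
from collections import namedtuple

# Hypothetical page record: in-use flag, tag, generation index, version.
Page = namedtuple("Page", "in_use tag gen ver")

def used_locs(sizes, mem, maxgen):
    """Conservative estimate of the number of locations in use.

    sizes maps each tag to its generation size (the role of dtds);
    mem is a list of Page records.
    """
    total = 0
    for tag, size in sizes.items():
        # Assumption: every distinct (generation, version) pair stands
        # for a whole generation of `size` locations.
        pairs = {(p.gen, p.ver) for p in mem if p.in_use and p.tag == tag}
        total += size * max(maxgen, len(pairs))
    return total
```

With maxgen = 2, a tag of size 2 that shows four distinct pairs contributes 8 locations, while a tag of size 3 with no pages still contributes 2 * 3 = 6, which is why the estimate can only err on the high side.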
Given this we can check there is enough room to perform a Write operation. We express this as a schema that
performs the check and if it fails tidies up the memory and performs the check again.
\[ DWriteOK \mathrel{\widehat{=}} DCheckOK \lor (DCheckNotOK \mathbin{{}^{o}_{9}} DTidy \mathbin{{}^{o}_{9}} DCheckOK) \]
To check the Write operation we simply subtract our estimate of locations in use from the memory size and
check there is room for a generation for the tag:
DCheckOK
\[
\begin{array}{l}
CMemSys \\
t? : tags \\
info? : \mathrm{seq}\ Info \\
\hline
msize - usedlocs\,(ddata, dmem) \geq dtds\,(ddata\ t?)
\end{array}
\]
We can't simply use schema negation to produce the negation of this check because its property is partly
expressed by its declaration.
DCheckNotOK
\[
\begin{array}{l}
CMemSys \\
t? : tags \\
info? : \mathrm{seq}\ Info \\
\hline
msize - usedlocs\,(ddata, dmem) < dtds\,(ddata\ t?)
\end{array}
\]
To tidy up the memory we simply tidy up all the error states:
DTidy
\[
\begin{array}{l}
\Delta DMemSys \\
t? : tags \\
info? : \mathrm{seq}\ Info \\
\hline
ddata' = ddata \\
dmem' = tidy4\,(tidy3\,(tidy2\,(tidy1\,(ddata, dmem, \{t?\}), \{t?\}), \{t?\}), \{t?\})
\end{array}
\]
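The check, tidy and check-again structure of DWriteOK can be pictured with a small Python sketch. The function and parameter names are hypothetical; estimate stands for a callable returning the conservative count of locations in use and tidy for a callable that clears the error states in place.

```python
def write_ok(msize, needed, estimate, tidy):
    """Return True if a write needing `needed` locations may proceed.

    estimate() gives the conservative number of locations in use;
    tidy() removes recoverable error states (hypothetical callables).
    """
    if msize - estimate() >= needed:      # first branch: DCheckOK
        return True
    tidy()                                # DCheckNotOK followed by DTidy
    return msize - estimate() >= needed   # final DCheckOK
```

The memory is only tidied when the first check fails, so the common case pays for a single estimate.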
We can perform the first write to a tag after DNewTag:
DWriteFirst
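\[
\begin{array}{l}
\Delta DMemSys \\
t? : tags \\
info? : \mathrm{seq}\ Info \\
\hline
dtdu\,(ddata\ t?) = true \land dtds\,(ddata\ t?) = \#info? \\
tdymem = tidy4\,(tidy3\,(tidy2\,(dmem, \{t?\}), \{t?\}), \{t?\}) \\
dtaglocs\,(t?, tdymem) = \emptyset \\
ddata' = ddata \\
\exists\, lseq : \mathrm{seq}\ Loc \mid \\
\quad (\forall\, l : Loc \mid l \in \mathrm{ran}\ lseq \Rightarrow dpu\,(tdymem\ l) = false) \land {} \\
\quad \#lseq = \#info? \land {} \\
\quad \#(\mathrm{ran}\ lseq) = \#info? \\
\quad \bullet\ dmem' = update\,(tdymem, lseq, info?, t?, 0, V_A)
\end{array}
\]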
We must tidy up the locations associated with the tag before we perform the memory update in order to
control the range of possible error states. This uses the following procedure to update the set of locations newly
associated with the tag:
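\[
\begin{array}{l}
\mathbf{PROCEDURE} \\
update : Memory \times \mathrm{seq}\ Loc \times \mathrm{seq}\ Info \times tags \times \mathbb{N} \times Version \rightarrow Memory \\
update\,(mem, lseq, info, t, x, v) = \\
\quad \mathbf{FOR}\ n = 1\ \mathbf{TO}\ \#info\ \mathbf{DO} \\
\quad\quad write\,(mem,\ lseq\ n,\ (true,\ t,\ info\,(\#info - n + 1),\ x,\ \#info - n,\ v))
\end{array}
\]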
The FOR loop can be interrupted at any point and it is important that the last page written is page zero.
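The effect of the write order can be seen in a small Python simulation. It is only a sketch: the dictionary model of the memory, the stop_after parameter that imitates an interruption, and the is_complete helper are all hypothetical, and pages are modelled as tuples (in use, tag, data, generation, page number, version).

```python
def update(mem, lseq, info, tag, gen, ver, stop_after=None):
    """Write info into the locations lseq, highest page number first.

    stop_after (hypothetical) simulates an interruption after that
    many atomic page writes.
    """
    size = len(info)
    for n in range(1, size + 1):
        if stop_after is not None and n > stop_after:
            break  # interrupted: the remaining pages are never written
        # The n-th location receives info[size - n] as page size - n,
        # so page zero is the final write.
        mem[lseq[n - 1]] = (True, tag, info[size - n], gen, size - n, ver)
    return mem

def is_complete(mem, tag, gen, ver):
    """Treat a generation as complete only if its page zero is present."""
    for in_use, t, _data, g, page, v in mem.values():
        if in_use and (t, g, page, v) == (tag, gen, 0, ver):
            return True
    return False
```

Because page zero is written last, an interrupted update leaves pages that lack a page zero, which is a state the tidy operations can recognise.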
We can write to a tag whose current generation is uncommitted:
DWriteUncommitted
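\[
\begin{array}{l}
\Delta DMemSys \\
t? : tags \\
info? : \mathrm{seq}\ Info \\
\hline
dtdu\,(ddata\ t?) = true \land dtds\,(ddata\ t?) = \#info? \\
tdymem = tidy4\,(tidy3\,(tidy2\,(dmem, \{t?\}), \{t?\}), \{t?\}) \\
dtaglocs\,(t?, tdymem) \neq \emptyset \\
dtdc\,(ddata\ t?) \neq true \\
ddata' = ddata \\
(old, new) = indices\,(dtagindices\,(t?, tdymem)) \\
\exists\, olseq : \mathrm{seq}\ Loc;\ v : Version;\ lseq : \mathrm{seq}\ Loc \mid \\
\quad \#olseq = \#info? \land {} \\
\quad \mathrm{ran}\ olseq \in gentlocs\,(t?, tdymem) \land {} \\
\quad (\forall\, l : Loc \mid l \in \mathrm{ran}\ olseq \Rightarrow dpx\,(tdymem\ l) = new \land dpv\,(tdymem\ l) = v) \land {} \\
\quad dpn\,(olseq\ 1) = 0 \land {} \\
\quad (\forall\, l : Loc \mid l \in \mathrm{ran}\ lseq \Rightarrow dpu\,(tdymem\ l) = false) \land {} \\
\quad \#lseq = \#info? \land {} \\
\quad \#(\mathrm{ran}\ lseq) = \#info? \\
\quad \bullet\ udmem = update\,(tdymem, lseq, info?, t?, new, nextV\ v) \land {} \\
\quad \phantom{\bullet}\ dmem' = free\,(udmem, olseq)
\end{array}
\]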
We must again tidy up the locations associated with the tag before we perform the memory update in order to
control the range of possible error states. We are assuming here that the memory update is performed before
the memory release. The new version of the current generation has the same generation index but its version is
the one following the current version of the current generation. This uses the following procedure to release
the locations of the old version of the current generation, releasing page zero first:
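\[
\begin{array}{l}
\mathbf{PROCEDURE} \\
free : Memory \times \mathrm{seq}\ Loc \rightarrow Memory \\
free\,(mem, lseq) = \\
\quad \mathbf{FOR}\ n = 1\ \mathbf{TO}\ \#lseq\ \mathbf{DO} \\
\quad\quad write\,(mem,\ lseq\ n,\ (false,\ dpt\,(mem\,(lseq\ n)),\ dpi\,(mem\,(lseq\ n)), \\
\quad\quad\quad dpx\,(mem\,(lseq\ n)),\ dpn\,(mem\,(lseq\ n)),\ dpv\,(mem\,(lseq\ n))))
\end{array}
\]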
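A matching Python sketch of the release procedure is given below, again under assumptions: the memory is a hypothetical dictionary of page tuples (in use, tag, data, generation, page number, version), stop_after imitates an interruption, and the first element of lseq is taken to hold page zero, as the schema requires of olseq.

```python
def free(mem, lseq, stop_after=None):
    """Mark the locations in lseq as not in use, in sequence order.

    lseq[0] is assumed to hold page zero, so an interrupted release
    always removes page zero before any other page.
    """
    for n, loc in enumerate(lseq, start=1):
        if stop_after is not None and n > stop_after:
            break  # interrupted release
        _in_use, tag, data, gen, page, ver = mem[loc]
        # Only the in-use flag changes; all other page fields are kept.
        mem[loc] = (False, tag, data, gen, page, ver)
    return mem
```

If the loop stops early, the pages still marked as in use belong to a version whose page zero has already gone, so they can be identified as leftovers.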
When we write to a tag whose current generation has been committed we add a new generation to those associated
with the tag. If this results in more than the maximum allowed number of generations the oldest generation
must be dropped.
If we have not reached the maximum allowed number of generations we needn't drop the oldest generation:
DWriteCommittedAddGen
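\[
\begin{array}{l}
\Delta DMemSys \\
t? : tags \\
info? : \mathrm{seq}\ Info \\
\hline
dtdu\,(ddata\ t?) = true \land dtds\,(ddata\ t?) = \#info? \\
tdymem = tidy4\,(tidy3\,(tidy2\,(dmem, \{t?\}), \{t?\}), \{t?\}) \\
\#gentlocs\,(t?, tdymem) < maxgen \\
dtdc\,(ddata\ t?) = true \\
(old, new) = indices\,(dtagindices\,(t?, tdymem)) \\
\exists\, lseq : \mathrm{seq}\ Loc \mid \\
\quad (\forall\, l : Loc \mid l \in \mathrm{ran}\ lseq \Rightarrow dpu\,(tdymem\ l) = false) \land {} \\
\quad \#lseq = \#info? \land {} \\
\quad \#(\mathrm{ran}\ lseq) = \#info? \\
\quad \bullet\ dmem' = update\,(tdymem, lseq, info?, t?, incrindex\ new, V_A) \\
ddata' = ddata \oplus \{\, t? \mapsto (true,\ dtds\,(ddata\ t?),\ false) \,\}
\end{array}
\]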
We again tidy up the locations associated with the tag before updating the memory. We are assuming here that
the memory update is performed before the tag data update [1].
If we have reached the maximum allowed number of generations we must drop the oldest generation:
DWriteCommittedMaxGen
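\[
\begin{array}{l}
\Delta DMemSys \\
t? : tags \\
info? : \mathrm{seq}\ Info \\
\hline
dtdu\,(ddata\ t?) = true \land dtds\,(ddata\ t?) = \#info? \\
tdymem = tidy4\,(tidy3\,(tidy2\,(dmem, \{t?\}), \{t?\}), \{t?\}) \\
\#gentlocs\,(t?, tdymem) = maxgen \\
dtdc\,(ddata\ t?) = true \\
(old, new) = indices\,(dtagindices\,(t?, tdymem)) \\
\exists\, olseq : \mathrm{seq}\ Loc;\ lseq : \mathrm{seq}\ Loc \mid \\
\quad \#olseq = \#info? \land {} \\
\quad \mathrm{ran}\ olseq \in gentlocs\,(t?, tdymem) \land {} \\
\quad (\forall\, l : Loc \mid l \in \mathrm{ran}\ olseq \Rightarrow dpx\,(tdymem\ l) = old) \land {} \\
\quad dpn\,(olseq\ 1) = 0 \land {} \\
\quad (\forall\, l : Loc \mid l \in \mathrm{ran}\ lseq \Rightarrow dpu\,(tdymem\ l) = false) \land {} \\
\quad \#lseq = \#info? \land {} \\
\quad \#(\mathrm{ran}\ lseq) = \#info? \\
\quad \bullet\ udmem = update\,(tdymem, lseq, info?, t?, incrindex\ new, V_A) \land {} \\
\quad \phantom{\bullet}\ dmem' = free\,(udmem, olseq) \\
ddata' = ddata \oplus \{\, t? \mapsto (true,\ dtds\,(ddata\ t?),\ false) \,\}
\end{array}
\]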
We again tidy up before the memory update and assume that the update precedes the release and that both
precede the tag data update.
We can now use schema disjunction to specify the write operation:
\[
\begin{array}{l}
DWrite \mathrel{\widehat{=}} DWriteOK \mathbin{{}^{o}_{9}} (DWriteFirst \lor {} \\
\quad DWriteUncommitted \lor {} \\
\quad DWriteCommittedAddGen \lor {} \\
\quad DWriteCommittedMaxGen)
\end{array}
\]
[1] It is worth noting that the committed flag in the tag data introduces a possible data inconsistency between the memory and
the tag data which is not present in the real system.
5.8 Conclusion
In this specication we have successfully described the operations that write to the memory in terms of sequences
of atomic page writing operations. The possible interruption of these sequences of atomic operations introduced
a number of possible error states not present in the previous specications. By elaborating the data stored in a
page and adding some limited housekeeping to the standard operations we have been able to restrict the number
of possible error states. Given this restricted set of new error states we are able to specify how the operations
may detect and recover from all such errors. This approach has the advantage that we have not had to revise
the abstract and concrete specications to handle error returns from the standard operations. Furthermore, by
replacing data stored with each tag by searches over the memory, we have avoided the previous requirement for
multiple updates of single memory locations.
Section 6
"Safe" Storage Allocation
To perform storage allocation for objects of a variety of types the storage allocation routines must be able to
perform a number of operations we would not like to see used throughout a program:
• Determine the size (in bytes or bits, say) of an object of any type.
• Determine the address boundary constraints of objects of any type.
• Coerce storage of any type into storage of some low-level type (an array of bytes or bits, say).
• Given allocated storage of some low-level type of the required size and alignment for a type, coerce it into
storage of that type.
If the storage allocator is to be written in the language itself then these dangerous constructs must be present in
the language itself. In order to limit the parts of a program that must be considered "unsafe" we must restrict
the occurrences of all these operations.
The problems that arise if we don't can be seen in C, where malloc(3) performs the storage allocation,
handling all the alignment constraints. Each call of malloc will, however, appear as something like the following:
(Type *) malloc(Len * sizeof(Type))
Thus the entire program will be littered with sizeof operations and type casts. The problem C has is that it
can't parameterise the storage allocation routines with the type of the object being allocated.
Modula-3 solves this problem by providing a NEW(T, ...) operation that is parameterised by the type to
be allocated. A similar effect could be achieved in Modula-3 by utilising its generic modules. This would allow
Modula-3 to parameterise a module of storage allocation routines by the type being allocated and restrict all
the unsafe operations to this module. Each type could import an instance of this generic module to give a type
specific function for allocating storage of that type.
Section 7
Modula-3 Memory Management
7.1 Introduction
We describe an implementation of a general storage allocation scheme for structures of any type in Modula-3.
7.2 Type System
Modula-3 is strongly typed but requires run-time type checking. This has both performance and storage
implications because of the manipulations of the housekeeping information that must be associated with each
value.
The types corresponding to "pointers" are called references in Modula-3 and two different kinds of reference
are supplied. The first, called "traced" references, refer to storage that is managed by the garbage collector.
The second, called "untraced" references, refer to storage that is not garbage collected, though it may still have
associated housekeeping information.
7.3 Safety
Certain operations, such as address arithmetic, are classified as unsafe and may only appear in modules that
are explicitly marked as unsafe. Arbitrary type casts, such as from an arbitrary reference to a char pointer, are
classified as unsafe. It is not possible for an unchecked run-time error to occur in a safe module.
7.4 Generic Modules
Modula-3 modules can be parameterised but only by the name of an interface. This means that if what we are
really trying to implement is a type parameterisation we must explicitly code interfaces that do nothing more
than contain the desired type for each type that we wish to pass to the parameterised module.
7.5 BYTE Module
We start by dening a module for the type BYTE. For this we require only an interface:
INTERFACE BYTE;
IMPORT ASCII;
TYPE T = BITS 8 FOR ASCII.Range;
TYPE PTR = UNTRACED REF ARRAY OF T;
END BYTE;
This requires the library module ASCII, which contains a definition of the sub-range of the type CHAR that
corresponds to the 256 characters that can be represented in 8 bits. We define a packed [1] type for 8 bit characters [2].
7.6 MEM Module
Given the interface BYTE we can dene a module that handles low-level operations over a block of storage. The
interface for this module is:
INTERFACE MEM
IMPORT BYTE;
EXCEPTION IllMemRef(CARDINAL);
PROCEDURE Get(offset: CARDINAL): BYTE.T RAISES IllMemRef;
PROCEDURE Set(offset: CARDINAL; byte: BYTE.T) RAISES IllMemRef;
END MEM.
This is a safe module and the functions may raise an exception if the supplied oset is too large
3
. We have
a function to return the byte at a given oset/address and a function to set the byte at a given oset The
implementation of this module is straightforward:
MODULE MEM;
IMPORT BYTE;
CONST Size: CARDINAL = 1024;
VAR Block: BYTE.PTR;
PROCEDURE Get(offset: CARDINAL): BYTE.T RAISES {IllMemRef} =
  BEGIN
    (* Reject offsets beyond the end of the block. *)
    IF offset >= Size THEN
      RAISE IllMemRef(offset);
    ELSE
      RETURN Block[offset];
    END;
  END Get;
PROCEDURE Set(offset: CARDINAL; byte: BYTE.T) RAISES {IllMemRef} =
  BEGIN
    (* Reject offsets beyond the end of the block. *)
    IF offset >= Size THEN
      RAISE IllMemRef(offset);
    ELSE
      Block[offset] := byte;
    END;
  END Set;
BEGIN
  (* Allocate the block and fill it with NULs. *)
  Block := NEW(BYTE.PTR, Size);
  FOR offset := 0 TO Size - 1 DO
    Block[offset] := '\000';
  END;
END MEM.
The functions Get and Set read and write bytes in the block of storage. They simply check the offset before
accessing the array of bytes. We initialise the allocated block of storage to NULs to make it easier to track what
has been written where in the buffer.
[1] The conversions between the packed and unpacked representations of values are performed automatically by Modula-3.
[2] Though CHAR seems to take only 8 bits on this implementation it is simply guaranteed to contain at least 256 elements
corresponding to the 8 bit characters.
[3] Modula-3 would raise its own exception if the array index was out of bounds.
7.7 Store Module
In order to store a value of an arbitrary type we must be able to regard it as a sequence of bytes. This requires
us to perform an arbitrary type coercion so we will require an unsafe module. Furthermore, to parameterise a
module by a type we must parameterise it by an interface that is expected to define a type of a known name.
We adopt the Modula-3 idiom of naming such types "T". The generic interface is:
GENERIC INTERFACE Store(I);
PROCEDURE WriteVAR(VAR v:I.T; offset: CARDINAL);
PROCEDURE ReadVAR(VAR v:I.T; offset: CARDINAL);
END Store.
We parameterise by an arbitrary interface I which is expected to contain a type named "T". Notice that the
argument to the functions is a variable designator, not a value. The implementation is a little tricky as we must
forge a byte pointer from the variable designator:
GENERIC MODULE Store(I);
IMPORT BYTE, MEM;
PROCEDURE WriteVAR(VAR value: I.T; offset: CARDINAL) =
  VAR addr: UNTRACED REF BYTE.T;
  BEGIN
    FOR i := 0 TO BYTESIZE(I.T) - 1 DO
      addr := ADR(value) + i;
      MEM.Set(offset + i, addr^);
    END;
  END WriteVAR;
PROCEDURE ReadVAR(VAR value: I.T; offset: CARDINAL) =
  VAR addr: UNTRACED REF BYTE.T;
  BEGIN
    FOR i := 0 TO BYTESIZE(I.T) - 1 DO
      addr := ADR(value) + i;
      addr^ := MEM.Get(offset + i);
    END;
  END ReadVAR;
BEGIN
END Store.
These functions copy the value in the argument variable designator into and out of storage a byte at a time. By
declaring a local variable of the required type we can avoid an explicit use of the LOOPHOLE function.
7.8 Store_INTEGER Module
Given the generic module Store we can instantiate it to get a storage module for a type. For example, to store
INTEGER we have the interface:
UNSAFE INTERFACE Store_INTEGER = Store(Integer)
END Store_INTEGER.
Its implementation is:
UNSAFE MODULE Store_INTEGER = Store(Integer)
END Store_INTEGER.
This module could then be imported into a module:
IMPORT Store_INTEGER AS SINT;
In that module we could then store and retrieve integers:
SINT.WriteVAR(numin,offset);
SINT.ReadVAR(numout,offset);
For any given offset and numin after these two commands the values of numin and numout will be equal.
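The round-trip property can be illustrated in Python, where the standard struct module plays the part of the byte-level coercion. This is a sketch under assumptions: the Store class and the "<i" (little-endian 32-bit integer) format are hypothetical choices, not a rendering of the Modula-3 generic.

```python
import struct

class Store:
    """Typed access to a block of bytes; fmt is a struct format string."""

    def __init__(self, block, fmt):
        self._block = block
        # Size and byte layout of the type are confined to this object.
        self._codec = struct.Struct(fmt)

    def write(self, value, offset):
        self._codec.pack_into(self._block, offset, value)

    def read(self, offset):
        return self._codec.unpack_from(self._block, offset)[0]

block = bytearray(1024)
sint = Store(block, "<i")   # plays the role of Store_INTEGER
numin = -12345
sint.write(numin, 16)
numout = sint.read(16)      # equal to numin
```

The calling code never mentions sizes or casts; they are confined to the one object created for the type, which is the point made about generic modules in Section 6.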
7.9 Conclusion
As a model for a smart-card language Modula-3 has a number of obvious disadvantages:
• Run-time type checking.
• No simple way to provide access to the sequence of bytes that represent a value for an arbitrary type.
• Personal opinion: an awful syntax.
It does have some advantages:
• Strongly typed.
• Object oriented.
• Defined module/interface system [4].
[4] This can be contrasted with C where header file usage is merely a convention that is not enforced by the language.
Section 8
A Proposed Type System for CLASP
8.1 Introduction
When designing a type system for the new CLASP language a number of conflicting goals must be reconciled:
• Consideration of program development and verification would lead to the choice of a strong type system
that supports generic constructs. In particular we would want a type system that has no "holes",
such as the LOOPHOLE construct in Modula-3. We would also choose a simple type system
that can be described by a small set of generally applicable rules, unlike that of Modula-3.
• Because of the constraints imposed by the target hardware the type system must impose a minimal run-time
cost, unlike the run-time type checking found in Modula-3.
• It is required that the language be capable of implementing its own storage allocator. Such storage
allocators normally require the ability to perform arbitrary type coercions and arbitrary address arithmetic,
as exemplified by the implementation of malloc(3) in C.
• We want to be able to localise, and minimise, the possibility of unchecked run-time errors and
check as many run-time errors as possible. This is similar to the features provided by the "safe" and
"unsafe" modules in Modula-3. In particular we must check for illegal array indexing (See 8.3).
It seems clear that the most appropriate language for the task is some kind of object-oriented language. Given a
free choice of type system one would prefer that of Eiffel to that of Modula-3. Eiffel is strongly
typed with no type "holes", supports generic constructs that are parameterised by types (not interfaces) and
requires no run-time type checking. Unfortunately the requirement that the language be capable of implementing
its own storage allocator means that it must allow arbitrary type coercions. We regard the storage allocator as
representing all the low-level routines that the language may be required to implement, as any problems they
raise for the type system will be raised by the storage allocator. The question then becomes, how much of
the elegance and power of a type system such as that of Eiffel can we retain while supporting the low-level
programming constructs required in the language?
8.2 Storage Allocation
The basic function of a storage allocator is to manage regions of memory in such a way as to support the creation
and destruction of structures of any type. The standard approach is to regard the memory as a sequence of bytes
over whose addresses arithmetic operations may be performed. Management of the memory involves locating
and reserving regions of the memory in order to represent these structures. This management process will need
to store housekeeping information in the memory along with the contents of the structures themselves. It may
also be necessary to enforce address alignment restrictions when allocating storage of certain types. All these
features can be seen in the various versions of malloc(3) in C.
Programming errors in the implementation of a storage allocator can lead to a variety of run-time errors. The
most obvious is that the value returned from the memory may not be that initially stored there. The operations
of a storage allocator are "type-unsafe" in the sense that they can easily "forge" values (bit patterns) for a type that
are not valid for that type. If the application employs an out-of-date handle to a structure previously stored in
memory similar errors can ensue. These programming errors will lead to obscure, uncheckable run-time errors.
We wish, at least, to isolate and minimise such type-unsafe operations and hopefully eliminate all but one
type-unsafe operation (See 8.2.1 and 8.6). This type-unsafe operation should, of course, be supported by a
proof of its safety as it will be the most likely cause of run-time errors.
8.2.1 Type Coercions
There are in fact two dierent kinds of type coercion required by a storage allocator.
The rst is required to turn a reference to a structure of some type into a reference to a sequence of bytes.
This is required when the application passes a structure to the storage allocator. This operation is, in fact,
type-safe, in the sense used above, as all bit patterns are included in the type of bytes. The type conformance
rule for Bit type in Eielregards such a coercion as acceptable. It becomes type-unsafe if the type coercion
does not take a copy of the structure and subsequently modies the sequence of bytes. If we either require this
type coercion to copy its argument or ensure that the resulting byte sequence is marked as \read-only" then this
becomes a type-safe type coercion.
The second is required to turn a reference to a sequence of bytes into a reference to a structure of some
type. This is required when the storage allocator passes a structure back to the application. This coercion is
type-unsafe and we should severely restrict its use, perhaps allowing only a single use in the entire system.
8.2.2 Addresses
In the simple model of the storage allocator we require a type of addresses over which we can perform arithmetic.
We thus require an operation that returns the address of a structure so that by performing operations relative
to this address we can perform operations on the contents of the structure. This separation can easily lead to
type-unsafe programming errors. In the restricted context of the CLASP language we may be able to avoid the
use of a separate address type.
In the simple model we already require the compiler to provide values for the base address of the memory
and some measure of its size. If the compiler could instead be assumed to provide us with a predefined array of
bytes that represents the memory we could then remove some of the programming errors that may occur with
address arithmetic. In particular, as arrays will have index checking in our language (See 8.3), illegal addresses
can be excluded.
8.2.3 Operations
In the tagged memory scheme employed in CLASP the operations over the persistent (EEPROM) memory are
restricted. While it is possible to read a byte from any address one must write blocks of bytes of a given size
to an appropriate block address. While representing memory as an array removes some errors we would still
be dependent on the coding of the operations over this array respecting the restricted nature of the operations
actually allowed over the memory.
If we could represent the memory as an object whose methods imposed the restrictions required by the actual
operations over the memory we could remove another source of programming errors. This object could easily
have a single instance provided by the compiler.
8.3 Arrays and Number Ranges
Our language must employ array index checking as illegal array accesses are a common source of run-time errors.
In the general case this will require a run-time check imposing both time and storage [1] costs on our programs.
Number ranges are a useful programming device as they can be used to capture more clearly the intent of a
program as well as detecting some programming errors. By combining number ranges, arrays whose length is
statically determined (similar to the closed arrays of Modula-3) and language constructs to iterate over ranges
we can provide static array index checking in many cases.
[1] For the bounds of the array.
In particular we would not want to perform run-time checks, or allocate housekeeping information, for
the array representing the memory (See 8.2.2). We therefore introduce a kind of array which requires that all its
indexing be statically checked. This requirement could be in the form of a proof obligation rather than a strict
requirement on the compiler itself. We do, however, expect our compiler to be able to statically determine that
memory[I + J]
is a valid array access given:
memory : ARRAY [0..MemLen]
I : [0..MemLen - BlockLen]
J : [0..BlockLen]
for any compile-time constants MemLen and BlockLen where BlockLen ≤ MemLen. Removal of run-time checks
on arbitrary array indexing may be achieved if the appropriate proof obligations are satisfied.
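The reasoning expected of the compiler here is a single step of interval arithmetic. As a short worked derivation, using only the declarations of I and J above:

```latex
\[
0 \le I \le MemLen - BlockLen
\quad\text{and}\quad
0 \le J \le BlockLen
\;\Longrightarrow\;
0 \le I + J \le (MemLen - BlockLen) + BlockLen = MemLen .
\]
```

Hence I + J always lies in the index range [0..MemLen] of memory. The side condition that BlockLen does not exceed MemLen is what makes the range of I non-empty.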
8.4 Generic Modules
The generic modules in Modula-3 suer from a number of serious aws.
 A generic module cannot be type checked! Only its instances may be type checked, unhelpful for program
development.
 Modules are parameterised by interfaces not types. Thus all types that may be used to instantiate a generic
module must have a trivial interface dened for them and all these interfaces and generic modules must
adopt a consistent naming scheme
2
.
The approach adopted in Eiel is much neater and avoids all these problems.
8.5 Tagged Memory
The tagged memory implemented in the persistent (EEPROM) memory provides special problems for the design
of the language. The basic question is, should the application be aware that certain data is actually stored in
tagged memory?
If the application is aware that a piece of data is in tagged memory it will need to explicitly read the data
into RAM before treating it as a value of the appropriate type. The types of values that may be read into RAM are
therefore limited by the size of the RAM buffer used by the read operation. In this approach the tagged memory
is regarded very much as the file system is in most programming languages [3]. The advantage of this approach is
that the compiler need know nothing about the implementation of the tagged memory. If the compiler is required
to know the details of the tagged memory implementation this raises the possibility of all sorts of problems arising
from inconsistencies between the implementation and the compiler's model of the implementation.
If the storage of data in tagged memory is supposed to be invisible to the application, at least when reading
values, then the compiler must have some knowledge of the implementation of the tagged memory. By making
suitable choices in the implementation of the interface to the tagged memory allocator we can reduce this
knowledge to its absolute minimum.
[2] The Modula-3 idiom seems to be to name types in the interface T.
[3] This interpretation of tagged memory makes it clear why it would not be safe to store structures containing pointers in tagged
memory.
The basic idea is to implement a read from tagged memory in an application as:
Read(tag, type, field) : type.field
where type is the type of the data stored for the tag and field is the field we want to read, which may be the
entire value. The types of fields that may be read into RAM are again limited by the buffer size. The interface
to the tagged memory will transform this into a call to the storage allocator:
ReadMemory(tag, size, offset, length)
where size is the length in bytes of values of type type, and offset and length specify how to extract the
specified field from values of that type when it is stored in contiguous memory. Both type casts, to and from
byte arrays, are performed within the Read function, the storage allocator itself handling only arrays of bytes.
Thus our language must provide operations that return the size in bytes of values of any type (like the sizeof
operator in C), return the type of any field of a type, return the offset of any field of a type and return the
length of any field of a type. With this interface the implementation of the tagged memory can be varied at
will without the compiler having to change at all. Thus the compiler can invisibly transform source language
expressions of the form value.field into the appropriate calls to the Read function.
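The translation from Read to ReadMemory can be sketched in Python using the standard ctypes module, which happens to expose exactly the size, offset and length information the text asks of the language. Everything else here is an assumption made for illustration: the Account type, the TAGGED dictionary standing in for the tagged memory, and the function names read and read_memory.

```python
import ctypes

class Account(ctypes.Structure):          # hypothetical stored type
    _fields_ = [("owner", ctypes.c_uint32), ("balance", ctypes.c_int32)]

TAGGED = {}                                # stand-in for the tagged memory

def read_memory(tag, size, offset, length):
    """Allocator side: deals only in bytes."""
    raw = TAGGED[tag]
    assert len(raw) == size
    return raw[offset:offset + length]

def read(tag, typ, field):
    """Interface side: derives size, offset and length from the type and
    performs the cast from bytes back to the field's type."""
    desc = getattr(typ, field)             # ctypes field descriptor
    raw = read_memory(tag, ctypes.sizeof(typ), desc.offset, desc.size)
    field_type = dict(typ._fields_)[field]
    return field_type.from_buffer_copy(raw).value

TAGGED["acct"] = bytes(Account(7, -250))   # cast to bytes on the way in
```

A call such as read("acct", Account, "balance") then corresponds to the source expression value.balance, with read_memory free to change its storage scheme as long as it keeps returning the requested bytes.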
In either of these approaches, if the size of the RAM buffer is decreased it may become necessary to rewrite
programs that attempt to read values, or fields, that will no longer fit in the buffer.
8.6 Views
Views are a generalisation of the concept of a type-safe type coercion. When viewing a value of one type as a
value of another type we must, in general, perform run-time checks to ensure that the operation is type-safe. For
instance, a value of type [0..5] may only be viewed as of type [0..3] if the value is actually less than 4. The
reverse view would require no run time check. Views can be implemented either by copying or by ensuring that
the resulting value is read-only. One possible use for views is when interpreting input character sequences. The
language can clearly implement any protocol it wishes over an input stream, building any desired byte sequence
from the input. A view of the resulting byte array both allows the application to regard the input as a structured
value and to perform a run-time check, via the exception generated by a failed view, that it is a valid value of
that type. We should not therefore need to use the type-unsafe type coercion when reading input.
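A view with a run-time check can be sketched in Python as follows. The names ViewError, view_range and view_record are hypothetical, as is the two-byte (kind, count) record used to stand for structured input; the sketch takes the copying option rather than the read-only one.

```python
class ViewError(Exception):
    """Raised when a value does not belong to the target type."""

def view_range(value, lo, hi):
    """View an integer as a member of the range [lo..hi], checking it."""
    if not lo <= value <= hi:
        raise ViewError(f"{value} is not in [{lo}..{hi}]")
    return value

def view_record(raw):
    """View two input bytes as a (kind, count) record, kind in [0..3]."""
    if len(raw) != 2:
        raise ViewError("expected exactly 2 bytes")
    data = bytes(raw)   # copy, so the view cannot alias the input buffer
    return (view_range(data[0], 0, 3), data[1])
```

Viewing a [0..5] value as [0..3] succeeds for 3 and raises ViewError for 4, while the reverse direction would need no check at all. An input of b"\x05\x00" is rejected by view_record for the same reason, without any type-unsafe coercion being involved.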
8.7 Orthogonal Issues
Many issues in the design of the language are orthogonal to the design of its type system. One such issue is the
interpretation of values of structured types. These can be interpreted as either references to values or as simply
being values. Either approach can also be adopted when passing arguments to functions. Whichever choice is
made the type system, and storage allocator, described can be implemented for the language.
8.8 Conclusion
By modelling memory as arrays of statically determined sizes and adopting the appropriate interface in the
implementation of the storage allocator we can avoid the need for a type of addresses and for run-time checks on
memory accesses. We can also restrict the type-unsafe type coercion to the interface between the applications
and the storage allocator, which itself performs no type-unsafe operations. References to values stored in tagged
memory can be rendered invisible to applications by the compiler without it requiring detailed knowledge of the
tagged memory scheme. By using views with run-time checks we may be able to restrict the type-unsafe type
coercion to its single use in the memory allocator interface.