








submitted for the degree of
Doktor der Ingenieurswissenschaften (Dr.-Ing.)
of the Faculty of Natural Sciences and Technology I





Date of the colloquium: 16 March 2010
Dean: Prof. Dr. Joachim Weickert
Chair of the examination committee: Prof. Dr. Philipp Slusallek
First reviewer: Prof. Dr. Wolfgang J. Paul
Second reviewer: Prof. Dr. Reinhard Wilhelm
(substituting for Dr. Gerwin Klein)
Academic staff member: Dr. Eyad Alkassar
I hereby declare that I have produced this thesis without inadmissible help from
third parties and without the use of aids other than those stated. Data and
concepts taken from other sources or adopted indirectly are marked with a
reference to the source. This thesis has not previously been submitted, in the
same or a similar form, in any other examination procedure, either in Germany
or abroad.
Saarbrücken, November 2009

Acknowledgments
First and foremost I express my gratitude to Prof. Dr. Wolfgang J. Paul for
giving me an opportunity to join possibly the world’s leading research team in
the field of formal verification.
I am indebted to my colleagues Dr. Norbert Schirmer, Dr. Eyad Alkassar,
Dr. Mark Hillebrand, Dr. Dirk Leinenbach, Alexandra Tsyban, and Cosmin
Condea. Their scientific results laid a solid foundation for the experimental part
of my work and their comprehensive theses dramatically facilitated creation of
this document.
I am deeply grateful to Dr. Norbert Schirmer for his guidance and collaboration
on the topic of the semantics stack. Every request I made to adapt the
stack’s definitions and proofs for the needs of this thesis led to his immediate
reaction. I feel fortunate to have had a chance to work with such a responsive
colleague.
This thesis work was funded by the German Federal Ministry of Education
and Research (BMBF) in the framework of the Verisoft project under grant 01




This thesis presents the formal pervasive verification of demand paging.
Memory virtualization by means of demand paging is a crucial component of
every modern operating system. The formal verification is challenging because
the reasoning about the page-fault handler (i) has to cover two concurrent com-
putational sources: the processor and the hard disk, and (ii) involves different
kinds of semantics for high- and low-level programming languages.
In order to tackle the challenge we applied a stack of semantics [Sch06,
AHL+09] for a high-level C-dialect [Lei07] and low-level assembly code. It
can handle mixed-language implementations and concurrently operating de-
vices, and permits the transferral of properties to the target architecture while
obeying its resource restrictions. We use a formally verified microprocessor
VAMP [BJK+06] with devices [Alk09] as a target architecture to run the de-
mand paging implementation.
The main result of this work is a mechanically checked formal proof that the
page-fault handler maintains memory virtualization of user processes running
on top of an operating-system microkernel: each user process is provided with
the notion of an own, large and isolated memory.
This work is a part of the Verisoft project, a large scale effort bringing
together industrial and academic partners to push the state-of-the-art in formal
verification for realistic computer systems comprising hard- and software.
Summary
This dissertation presents the formal verification of the demand paging
mechanism.
Memory virtualization by means of demand paging plays a decisive role in all
modern operating systems. The formal verification of this mechanism is a
challenge because, first, the reasoning has to account for two concurrent
computational resources, the processor and the hard disk, and, second, it
involves different semantics for high- and low-level programming languages.
To tackle this challenge, a semantics stack [Sch06, AHL+09] for a high-level
C dialect [Lei07] and for a low-level assembler was used. This semantics stack
can handle implementations in mixed programming languages and concurrent
computational resources. It permits the transfer of properties to the target
architecture while respecting the restrictions of that architecture. We use
a formally verified microprocessor VAMP [BJK+06] with devices [Alk09]
as the target architecture on which the demand paging mechanism is executed.
The most important result of this work is a machine-checked proof about the
page-fault handler, which manages the memory virtualization of the user
processes: each user is presented with the illusion of its own, large, and
isolated memory.
This work is part of the long-term research project Verisoft. The project
brings together industrial and academic partners,





2 Virtual Memory Simulation
2.1 VAMP: Verified Architecture Microprocessor
2.1.1 VAMP Instruction Set Architecture
2.1.2 VAMP Assembly
2.2 Devices Framework
2.2.1 Devices
2.2.2 Combined Systems
2.3 C0 and Small-Step Semantics
2.3.1 Syntax
2.3.2 Small-Step Semantics
2.3.3 Compiling C0 to VAMP
2.4 CVM: Communicating Virtual Machines
2.4.1 Semantics
2.4.2 Correctness
2.5 Demand Paging
2.5.1 Data Structures
2.5.2 Execution Scenarios
2.5.3 Approach to Correctness Proof
3 Leveraging a Semantics Stack
3.1 Simpl
3.2 Hoare Logic
3.3 BS: C0 Big-Step Semantics
3.4 Property Transfer from Simpl to BS
3.5 Property Transfer from BS to SS
4 Specification of Demand Paging
4.1 Configurations
4.1.1 Abstract Configuration
4.1.2 Extended State
4.2 Initial Configuration
4.3 Address Translation
4.3.1 Physical Memory Address
4.3.2 Page Faults
4.4 I/O Operations on Swap Memory
4.4.1 Address Adjustment
4.4.2 Swap In
4.4.3 Swap Out
4.4.4 Filling a Page with Zeros
4.5 Page-Fault Handling Algorithm
4.5.1 Updating Data Structures
4.5.2 Updating Extended State
4.5.3 Prevention of Most Recently Loaded Page Swap Out
4.6 Validity
4.6.1 Invariants about Active and Free Lists
4.6.2 Invariants about Page Tables
4.6.3 Invariants about Big-Page Tables
4.6.4 Invariants about Process Control Blocks
4.6.5 Altogether
4.6.6 Validity Proofs
5 Implementation Correctness
5.1 Simpl Hoare Logic State Space
5.2 Abstraction Relation from Simpl
5.2.1 Doubly-Linked Lists
5.2.2 Page and Big-Page Management
5.2.3 Page and Big-Page Tables
5.2.4 Process Control Blocks
5.2.5 Miscellaneous
5.2.6 Altogether
5.3 Initialization Code
5.3.1 Implementation
5.3.2 Specification
5.3.3 Correctness
5.4 Swap In
5.4.1 Implementation
5.4.2 Specification
5.4.3 Correctness
5.5 Swap Out
5.5.1 Implementation
5.5.2 Specification
5.5.3 Correctness
5.6 Page-Fault Handler
5.6.1 Implementation
5.6.2 Specification
5.6.3 Correctness
6 Property Transfer
6.1 Abstraction Relation from BS
6.1.1 Doubly-Linked Lists
6.1.2 Page and Big-Page Management
6.1.3 Page and Big-Page Tables
6.1.4 Process Control Blocks
6.1.5 Miscellaneous
6.1.6 Altogether
6.2 Property Transfer from Simpl to BS
6.2.1 Mapping Simpl States to BS States
6.2.2 Transfer of Abstraction Relation
6.2.3 Specification at the Level of BS
6.2.4 Correctness at the Level of BS
6.3 Program Context Extension
6.4 Abstraction Relation from SS
6.4.1 Doubly-Linked Lists
6.4.2 Page and Big-Page Management
6.4.3 Page and Big-Page Tables
6.4.4 Process Control Blocks
6.4.5 Miscellaneous
6.4.6 Altogether
6.5 Property Transfer from BS to SS
6.5.1 Mapping BS States to SS States
6.5.2 Transfer of Abstraction Relation
6.5.3 Specification at the Level of SS
6.5.4 Correctness at the Level of SS
7 Integrating Results
7.1 Hard-Disk Drivers and Zero Fill Page
7.1.1 Extended Semantics Environment
7.1.2 Correctness of the Extended Calls
7.1.3 Applying Correctness of the Extended Calls
7.2 Initialization Code Top-Level Correctness
7.3 Page-Fault Handler Top-Level Correctness
7.4 Using Results in the CVM Proof
7.4.1 Handling User Page Faults
7.4.2 Address Translation in CVM Primitives
8 Summary and Future Work
Appendix A: Macros from the Implementation
Appendix B: Loop Invariants and Ranking Functions












A proof is a proof. What kind of a proof? It’s
a proof. A proof is a proof. And when you
have a good proof, it’s because it’s proven.
– Jean Chrétien
Computer systems nowadays are universal and omnipresent. They are crucial
to the functionality of vast numbers of systems with examples ranging from
electronic banking to medical technology. Our confidence in computer sys-
tems makes their reliability of large social importance. This makes correctness
guarantees for computer systems a hot research topic.
For the time being, the only way to guarantee the absence of errors in a
computer system is to exploit rigorous formal methods of mathematics for
specifying the system’s intended behavior and proving that the actual
implementation meets the desired specification. The latter is known as formal
verification. To prevent possible human errors, the proof is either conducted
completely or, where this is not possible with state-of-the-art technology,
at least checked by computer-aided verification tools.
With a comparatively small code base of only a few thousand lines of code, and
implementing important safety and security abstractions such as process isolation,
operating-system microkernels seem to offer themselves as perfect candidates
for a feasible approach to formal verification. Possibly the most challenging
part in microkernel verification is memory virtualization, i.e., to ensure that
each user process has the notion of an own, large, and isolated memory. User
processes access memory by virtual addresses, which are then translated to
physical ones. Modern computer systems implement virtual memory by means
of demand paging : small consecutive chunks of data, called pages, are either
stored in a fast but small physical memory, or in a large but slower auxiliary
memory (usually a hard disk), called swap memory. Whenever the process
accesses a page located in the swap memory, a page-fault handler reacts by
moving the requested page to the physical memory. In case the physical memory
is full, some other page is swapped out.
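The mechanism just described can be illustrated by a small toy model. The following Python sketch is not the page-fault handler verified in this thesis. It assumes a first-in-first-out replacement policy and zero-initialized fresh pages, and the names ToyPager, access, and write are hypothetical, chosen for illustration only.

```python
from collections import OrderedDict


class ToyPager:
    """Toy demand pager: a bounded 'physical memory' backed by 'swap memory'.

    Assumptions (not taken from the thesis): FIFO replacement, one value
    per page, and pages that were never written read as 0.
    """

    def __init__(self, num_frames: int) -> None:
        self.num_frames = num_frames
        self.resident = OrderedDict()  # page index -> contents, in load order
        self.swap = {}                 # page index -> contents
        self.faults = 0

    def access(self, page: int):
        """Return the contents of a page, handling a page fault if needed."""
        if page in self.resident:
            return self.resident[page]
        # Page fault: the page is not in physical memory.
        self.faults += 1
        if len(self.resident) >= self.num_frames:
            # Physical memory is full: swap out the oldest resident page.
            victim, data = self.resident.popitem(last=False)
            self.swap[victim] = data
        # Swap in the requested page (zero-filled if it was never used).
        data = self.swap.pop(page, 0)
        self.resident[page] = data
        return data

    def write(self, page: int, value) -> None:
        """Write a value to a page, faulting it in first if necessary."""
        self.access(page)
        self.resident[page] = value


pager = ToyPager(num_frames=2)
pager.write(0, "a")
pager.write(1, "b")
pager.write(2, "c")            # evicts page 0 to swap memory
value_of_page_0 = pager.access(0)  # faults again and evicts page 1
```

From the point of view of the caller, every page keeps its contents regardless of where it currently resides; this is the property that the correctness proof has to establish for the real implementation.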
The goal of this work is to prove correctness of exemplary demand paging
software in the interactive theorem prover Isabelle/HOL [NPW02].
Related Work
The related work for this thesis can be grouped into three categories: (i) demand
paging correctness, (ii) microkernel verification, and (iii) the Verisoft
research project which hosts the present work.
Demand paging correctness. The method of storage allocation known as de-
mand paging was first introduced in the early 1960s by the designers of the Atlas
computer [KHPS61, KELS62]. Due to its conceptual simplicity the method quickly
became widespread [KR68] and subsequently classical [Tan97]. The bibliography
[Smi78] lists many sources on paging and virtual memory. However,
they focus mostly on design and evaluation of page-replacement algorithms and
virtual memory systems rather than on correctness.
Paper-and-pencil formalizations of demand paging which constitute the the-
oretical basis for this thesis were conducted by Hillebrand [Hil05]. The page-
fault handler considered in this thesis was originally implemented by Con-
dea [Con06] who also provided paper-and-pencil specifications of the imple-
mentation.
Microkernel verification. An in-depth retrospective of the microkernel verifi-
cation projects is presented by Klein [Kle09]. Below we outline the key players
in the field from both past and present.
First attempts to use theorem provers for formal specification and correct-
ness proofs of operating systems date back to the mid 1970s in PSOS and UCLA
DSU projects. Provably Secure Operating System (PSOS) [FN79, NBF+80,
NF03] was designed at SRI International as a general-purpose operating system
with provable security properties. Simple illustrative proofs were carried
out to demonstrate how the system’s properties could be formally proven. UCLA
Data Security Unix (DSU) [WKP79], a kernel-structured operating system,
was developed at the University of California at Los Angeles (UCLA) in the
late 1970s in order to demonstrate that program verification methods could be
applied to prove an operating system secure. The kernel supported processes,
capabilities, pages, and non-modeled devices via kernel calls. A proof that
specifications on different levels of abstraction are consistent with each other
was undertaken but not completed for all portions of the kernel.
Substantial progress in microkernel verification was achieved in the
late 1980s with the KIT kernel [Bev87, Bev89] of the CLI stack [BHMY89a,
BHMY89b, Moo03] developed at the University of Texas at Austin. Kernel
for Isolated Tasks (KIT) was a small operating-system kernel implementing
services for process scheduling, error handling, single-word message passing,
and character I/O to non-modeled devices. KIT lacked dynamic creation of
processes as well as shared memory and demand paging. A top-level correctness
property was process isolation: execution of one process must not interfere with
the other in unintended ways. The project pioneered an innovative approach
for pervasive systems verification: not only the kernel program was verified
but also its execution environment of a processor, assembler, and compiler.
The project appeared to be the first example of an accomplished mechanically
checked proof of the correct implementation of a complete microkernel.
The VFiasco project [HT05, End05, HT03], conducted at the Technical University
of Dresden, aims at the formal verification of Fiasco, a small L4-compatible
operating-system microkernel. The Fiasco kernel is implemented in fewer than
15K lines of source code. Verification was undertaken in the theorem prover
PVS; however, the achieved results are vague.
The Coyotos team has defined a new low-level programming language BitC
with precise formal semantics in order to carry out verification of a general-
purpose operating system Coyotos [JSM04], a successor of EROS [SW00]. As
yet, the status of formal verification is unclear.
The L4.verified project, carried out at National ICT Australia (NICTA),
is aimed at the construction and verification of the seL4 (secure embedded L4)
microkernel [HEK+07, CKS08, EKE08, KDE09, KEH+09]. From its prototype
designed in Haskell, both a formal model and a high-performance C implementation
are generated. The verification goal is to show in Isabelle/HOL that the
produced implementation conforms to an abstract model via a refinement
between three levels of abstraction: from the C implementation, through an
executable specification, to the abstract kernel model. As of today, the project
has been completed, resulting in a 200,000-line formal correctness proof of the
seL4 implementation.
The FLINT group at Yale University focuses on studying separate problems
in operating-system verification rather than proving a complete OS correct.
XCAP [NS06], a theoretical verification framework implemented in the
Coq theorem prover, was applied to certify a realistic x86 assembly
implementation of machine-context management procedures [NYS07]. Recently, results
on verification of low-level programs with hardware interrupts and preemptive
threads were reported [FSDG08]. The work provides a solid foundation for
reasoning about preemptive kernels and hypervisors.
The Verisoft project. The context of this thesis is the Verisoft project, a large
scale effort bringing together industrial and academic partners to push the
state-of-the-art in formal verification for realistic computer systems, comprising
hard- and software. This thesis deals with Verisoft’s academic system,
a computer system prototype for writing, signing, and sending emails. As it
covers all implementation layers from the gate-level hardware up to communi-
cating user processes it is a representative of a vertical slice of a general-purpose
computer system.
The academic system comprises the following layers, which are connected by
respective simulation theorems: (i) a gate-level design of the VAMP [BJK+03,
BJK+06, Tve09], a RISC processor with out-of-order execution, caches [Bey05],
and memory-management units [DHP05, Dal06], (ii) an operational semantics
for the VAMP instruction set architecture based on the DLX ISA [MP00],
(iii) an operational semantics for the VAMP assembly language, (iv) an op-
erational semantics for the C0 programming language [LPP05, LP08, Lei07,
Pet07], a slightly restricted dialect of C, (v) a framework for microkernel
programmers, called Communicating Virtual Machines (CVM) [GHLP05, IdRT08,
Tsy09], which implements low-level microkernel functionality including a page-
fault handler [Con06, ASS08], context switch [ST08b], communication [ST08a]
and memory-management primitives [Con06], (vi) an L4-inspired microkernel
VAMOS [DDB08, DDSW08, DDW09], which provides support for priority-based
scheduling [DDW09], user-mode device drivers, and interprocess communication
(IPC), (vii) a user-level Simple Operating System (SOS) [Bog08] featuring
a TCP/IP communication protocol and a file system, and (viii) a number of
useful user applications such as remote procedure calls (RPC) [Sha06, ABP09, Alk09],
SMTP and signature servers, and an email client [BHW06].
A decisive novelty of the project is the integration of concurrent
devices [AHK+07, HIP05, Kna08], including a hard disk, through the whole
stack. The project developed a C0 semantics stack [AHL+08, AHL+09] which
is orthogonal to the system stack described above. The semantics stack estab-
lishes a convenient Hoare logic [Sch05, Sch06] to reason about the sequential
parts of C0 programs and simultaneously provides the means to compose the
results to deal with assembly code and to integrate devices [Alk09].
All work is conducted in Isabelle/HOL [NPW02]. To support management
of formal theories a repository of verification environments [HP07] is built.
Outline
This chapter ends with an introduction of notations used throughout the thesis.
The remainder of the thesis is organized in seven chapters.
• In Chapter 2 we formalize the problem of virtual memory simulation:
how one can provide user processes with the illusion of an own memory which
exceeds the physical memory. We introduce models of the VAMP processor
with devices, formalize the C0 programming language, and, exploiting
it, define CVM, a model of low-level microkernel functionality. We discuss
how CVM implements demand paging in order to support memory
virtualization and outline our approach to proving it correct.
• Chapter 3 is devoted to the so-called C0 semantics stack, a methodology
for proving correctness of systems code at a convenient abstract semantics
level in Hoare logic and transferring the obtained results to the level
of C0 small-step semantics.
• In Chapter 4 we introduce abstract page-fault handler configurations and
operations which we use to specify the intended behavior of the demand
paging implementation. We introduce all invariants necessary to establish
correctness of demand paging and prove them to be preserved under the
defined operations.
• In Chapter 5 we specify functions of the demand paging implementation
and prove their correctness against these specifications using Hoare
logic.
• Chapter 6 elaborates on transferring the results obtained in the previous
chapter down to the level of C0 small-step semantics.
• In Chapter 7 we import results on the correctness of hard-disk drivers
which we use — together with correctness on the small-step semantics
level — to show the top-level correctness of demand paging.
• In Chapter 8 we conclude and discuss directions for future work.
Notation
We denote the set of natural numbers including zero by N, the set of integers
by Z, the set of boolean values {T,F} by B, and the set of identifiers, e.g.,
variable names, by S. We write pow(t) for the power set of t. We introduce the
constant A to denote an undefined (arbitrary) value.
Lists. We denote the type of an abstract list with elements of type t by t∗.
We write [ ] for an empty list and by [x, y, . . .] we construct a list of particular
elements. For numbers a and b we construct a list of consecutive elements
ranging from a to b by [a, . . . , b]. To obtain a list of a given length n where
each element is equal to x we use the function rep(x, n). For concatenation of
two lists l and l′ we write l ◦ l′. The head of a list l, denoted by hd(l), is the
first element of the list. The tail of a list l, denoted by tl(l), is the part of the
list besides its first element. We reverse a list l by means of the function rev(l).
The function |l| returns the length (number of elements) of the list l. To filter
a list l, i.e., keep only those elements in the list which satisfy a given predicate
P?, we use the notation filter(P?, l). We apply a function f to each element
of a given list l by writing map(f, l). Sometimes we interpret lists as sets (of
their elements) and allow set operations like intersection or inclusion for lists.
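For readers who prefer executable notation, the list operations above have direct counterparts in Python. The sketch below is an informal illustration only. The names rep, hd, tl, and rev follow the text, whereas consecutive, filter_list, and map_list are hypothetical names introduced here.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")
U = TypeVar("U")


def rep(x: T, n: int) -> List[T]:
    """rep(x, n): a list of length n in which every element equals x."""
    return [x] * n


def consecutive(a: int, b: int) -> List[int]:
    """[a, ..., b]: consecutive numbers from a to b, both inclusive."""
    return list(range(a, b + 1))


def hd(l: List[T]) -> T:
    """hd(l): the first element of a non-empty list."""
    return l[0]


def tl(l: List[T]) -> List[T]:
    """tl(l): the list without its first element."""
    return l[1:]


def rev(l: List[T]) -> List[T]:
    """rev(l): the reversed list."""
    return l[::-1]


def filter_list(p: Callable[[T], bool], l: List[T]) -> List[T]:
    """filter(P?, l): keep only the elements satisfying the predicate."""
    return [x for x in l if p(x)]


def map_list(f: Callable[[T], U], l: List[T]) -> List[U]:
    """map(f, l): apply f to every element."""
    return [f(x) for x in l]


xs = consecutive(1, 6)
evens = filter_list(lambda x: x % 2 == 0, xs)
squares = map_list(lambda x: x * x, [1, 2, 3])
joined = [1, 2] + rep(0, 2)   # list concatenation, written l ◦ l′ in the text
length = len(xs)              # |l| in the text
```

Interpreting lists as sets corresponds to converting them with set(...) before applying intersection or inclusion.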
Bit vectors. We represent bit vectors with the type bv-t = {0, 1}∗. The left-
most bit is the most significant bit and the rightmost bit is the least significant
bit. Hence, we index bit vectors from right to left. For a bit vector w and two
natural numbers a > b we support the notation w[a : b] to denote the part of
the bit vector from the bit b up to the bit a. For a bit vector w we denote
by 〈w〉 the conversion to the natural number with the binary representation
w. For a natural number n we denote by bin(n) the conversion to the binary
representation of n.
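The bit-vector conventions can be made concrete with a short Python sketch. It is an illustration under assumptions: bit vectors are represented as strings of the characters 0 and 1, the function names bits_slice, to_nat, and to_bin are hypothetical, and to_bin takes an explicit width, which the notation bin(n) in the text leaves implicit.

```python
def bits_slice(w: str, a: int, b: int) -> str:
    """w[a : b]: bits b up to a (inclusive), where bit 0 is the rightmost bit."""
    n = len(w)
    if not (0 <= b <= a < n):
        raise ValueError("indices out of range")
    # Bit i sits at string position n - 1 - i because the leftmost
    # character is the most significant bit.
    return w[n - 1 - a : n - b]


def to_nat(w: str) -> int:
    """<w>: the natural number whose binary representation is w."""
    return int(w, 2) if w else 0


def to_bin(n: int, width: int) -> str:
    """bin(n), padded with leading zeros to the given width (an assumption)."""
    if n < 0 or n >= 2 ** width:
        raise ValueError("number does not fit into the given width")
    return format(n, "0{}b".format(width)) if width > 0 else ""


w = "10110"                   # bit 4 = 1, bit 3 = 0, bit 2 = 1, bit 1 = 1, bit 0 = 0
middle = bits_slice(w, 3, 1)  # bits 3, 2, 1
value = to_nat(w)
padded = to_bin(value, 8)
```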
Option type. Besides specific abstract data types to be defined further in this
thesis, we introduce a general option type. It is used to extend an existing type
with some error value ⊥. For a particular type t we write t⊥ = t ∪ {⊥} to
define such an extended type. All non-error values x ∈ t will be written as ⌊x⌋.
We obtain the value of an element x of an option type by writing ⌈x⌉ = y,
provided that x is not an error value, i.e., x = ⌊y⌋.
Pairs. We denote the first and the second elements of a pair x = (a, b) by
fst(x) = a and snd(x) = b, respectively. We convert a list of pairs l into a
function f by writing map-of(l) = f. The function f is defined as f(x) = ⌊y⌋
in case (x, y) ∈ l, and as f(x) = ⊥, otherwise.
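The option type and the function map-of can be mimicked in Python as follows. This is a sketch under assumptions: None plays the role of the error value ⊥, the helper name the is hypothetical, and, since the text does not say which pair wins when a key occurs more than once in the list, the first occurrence is assumed to take precedence.

```python
from typing import Callable, Hashable, List, Optional, Tuple, TypeVar

K = TypeVar("K", bound=Hashable)
V = TypeVar("V")


def the(x: Optional[V]) -> V:
    """Extract the value from a non-error element of an option type."""
    if x is None:
        raise ValueError("error value has no content")
    return x


def map_of(pairs: List[Tuple[K, V]]) -> Callable[[K], Optional[V]]:
    """map-of(l): turn a list of pairs into a partial function.

    Assumption: for duplicate keys the first pair in the list wins.
    """
    table = {}
    for key, value in pairs:
        table.setdefault(key, value)  # keeps the first value seen for a key
    return table.get                  # returns None (error value) if absent


f = map_of([("a", 1), ("b", 2), ("a", 3)])
first = f("a")
missing = f("c")
second = the(f("b"))
```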
We supplement predicates with a question mark, i.e., P?. Predicates which
describe validity of some concept are supplemented with a tick, i.e., pfh√.
Throughout the thesis we use the abbreviations “SS” and “BS” for small-step and big-step
semantics, respectively, whereas “H” refers to the Simpl Hoare logic. “PFH”
is an abbreviation for the page-fault handler meaning anything which refers to
the demand paging specification. We use a typewriter font to denote names

























The virtual memory simulation problem is to
introduce by hardware and software means
a notion of own, large, and isolated memory
to a number of user processes. In this chap-
ter we formalize a system which provides vir-
tual memory to user processes running on top
of it. We start by introducing the proces-
sor VAMP which features address translation
to support memory virtualization. Next, we
present a general theory of devices and show
how one can couple devices and a processor.
We need devices, in particular, a hard disk to
store exceeding portions of user virtual mem-
ory. Further, we introduce a formal model
of C0, a slightly restricted dialect of C, fea-
turing inline assembly, which is essential for
communicating with devices. We state simulation
theorems between C0 and VAMP
with devices. Then, we introduce the formal
model of CVM which makes it possible to run
several user processes with virtual memory on
top of VAMP. CVM is also implemented in C0
as a framework featuring, among others, de-
mand paging which is an essential component
to implement virtual memory. We conclude
by discussing the demand paging implemen-
tation in CVM.
8 Virtual Memory Simulation
2.1 VAMP: Verified Architecture Microprocessor
In this section we introduce two models of the VAMP microprocessor: VAMP
ISA (instruction set architecture) and VAMP assembly. While describing them
we follow the thesis of Tsyban [Tsy09].
2.1.1 VAMP Instruction Set Architecture
VAMP ISA is based on but not limited to DLX ISA, a RISC processor architec-
ture designed by Hennessy and Patterson [HP96]. Essentially, DLX is a cleaned
and simplified 32-bit MIPS architecture [KH92]. VAMP features beyond the
classical DLX model include mechanisms for operating-system support: user
and system mode, and address translation in user mode for virtual memory
implementation. Formal specification of VAMP ISA and its correctness proof
towards the gate-level implementation have been originally conducted in PVS
by Beyer [Bey05] and Dalinger [Dal06]. Subsequently, Tverdyshev [Tve09]
translated the model into Isabelle and re-proved its correctness using largely
automated verification techniques.
Data representation. Data types used for representation of VAMP ISA com-
ponents are summarized in Table 2.1. Registers are modeled as bit vectors.
Thirty-two registers form a register file which we model as a mapping from bit
vectors to bit vectors. For compatibility with double-word floating-point
instructions, memories are modeled as mappings from bit vectors to pairs of
bit vectors. For a 32-bit address a we retrieve a single word from a
double-word addressable memory m using the following notation.
DEFINITION 2.1 (Read word from ISA memory)
\[
m_{word}(a) =
\begin{cases}
\mathit{fst}(m(a[31:3])) & \text{if } a[2] = 1 \\
\mathit{snd}(m(a[31:3])) & \text{otherwise}
\end{cases}
\]
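Definition 2.1 can be read operationally as follows. The Python sketch below is only an illustration: it assumes that the memory is a dictionary from double-word indices (the upper 29 address bits) to pairs of 32-bit integers, and the name read_word is hypothetical.

```python
from typing import Dict, Tuple

# Hypothetical model: double-word index -> (first word, second word).
DoubleWordMemory = Dict[int, Tuple[int, int]]


def read_word(m: DoubleWordMemory, a: int) -> int:
    """Read the 32-bit word at byte address a from a double-word memory."""
    dword_index = (a >> 3) & ((1 << 29) - 1)  # a[31 : 3]
    first, second = m[dword_index]
    # Bit a[2] selects which component of the pair is returned:
    # the first one if a[2] = 1, the second one otherwise.
    return first if (a >> 2) & 1 else second


memory = {0: (0xAAAA0000, 0x0000BBBB)}
upper = read_word(memory, 4)  # a[2] = 1
lower = read_word(memory, 0)  # a[2] = 0
```

The two lowest address bits do not influence the result, so all four byte addresses inside a word yield the same word.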




Table 2.1
Component      Type                             Validity predicate
[...]          [...]                            bv-t_n√(w) = (|w| = n)
Register file  Regf_ISA = bv-t ↦ bv-t           Regf_ISA√(r) = ∀ i : bv-t_5√(i) ⟶ bv-t_32√(r(i))
Memory         Mem_ISA = bv-t ↦ (bv-t × bv-t)   Mem_ISA√(m) = ∀ a : bv-t_29√(a) ⟶
                                                bv-t_32√(fst(m(a))) ∧ bv-t_32√(snd(m(a)))
Configurations. VAMP ISA configurations cISA are modeled by the record
CISA with the following fields.
cISA.pc, cISA.dpc Registers for normal and delayed program counters.
cISA.gpr, cISA.spr General and special purpose register files.
cISA.m Memory.
According to the DLX computational model the general purpose register zero
always contains the zero value. The special purpose register file contains registers
needed to process interrupts and registers used for virtual memory support.
In the specification of ISA considered in this thesis not all of the thirty-two
special purpose registers are used. Some of the special registers are just
reserved for possible extensions, e.g., an integration of a floating point unit.
Table 2.2 defines the set sprsISA :: pow(bv-t) of used special purpose register
binary indices.














Table 2.2: VAMP special purpose registers
Bin. index  Dec. index  Alias  Name
00000       0           sr     Status register
00001       1           esr    Exceptional status register
00010       2           eca    Exceptional cause
00011       3           epc    Exceptional program counter
00100       4           edpc   Exceptional delayed program counter
00101       5           edata  Exceptional data
01001       9           pto    Page table origin
01010       10          ptl    Page table length
01011       11          emode  Exceptional mode
10000       16          mode   Mode
Instructions. VAMP supports a variety of instructions for (i) memory, data
transfer and control operations, (ii) arithmetic, logical, test, set and shift operations,
and (iii) special operations for system calls and return from exception.
Semantics. The semantics of execution without interrupts is given by the
transition function $\delta^{woi}_{ISA} :: C_{ISA} \mapsto C_{ISA}$ which yields for a
configuration $c_{ISA}$ the next state $c'_{ISA} = \delta^{woi}_{ISA}(c_{ISA})$. The
definition of $\delta^{woi}_{ISA}$ splits cases depending on the instruction to be
executed. The semantics distinguishes two execution modes: system, denoted by
$\mathit{sys\text{-}mode}_{ISA}?(c_{ISA})$, and user, denoted by
$\mathit{user\text{-}mode}_{ISA}?(c_{ISA})$. User mode restricts access to special purpose
registers and execution of some instructions, e.g., a return from exception. Most
notably, memory accesses are subject to address translation in user mode.
Address translation. In user mode all addresses for instruction fetch as well
as for load/store operations are translated with the help of special purpose
registers pto and ptl. The virtual address space is divided into pages of size PG SZ =
$2^{12}$ bytes. Each virtual address va is split into the virtual page index va[31 : 12]
and byte index va[11 : 0] which is an offset within the page.
The main data structure for address translation is the page table which
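resides in the processor memory. The page table origin register and the virtual page
index determine the address of the page table entry for va, which is read from memory:
\[
pte_{ISA}(c_{ISA}, va) = c_{ISA}.m_{word}(c_{ISA}.spr(pto)[19:0] \circ rep(0, 12) +_{32} va[31:12] \circ [0, 0]).
\]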
Above, $+_{32}$ is the addition of bit vectors modulo $2^{32}$.
Each page table entry pte contains a physical page index pte[31 : 12] and
(i) a valid bit pte[11] which denotes whether the page resides in the physical
memory, (ii) a protection bit pte[10] which signals whether the page is allowed
to be written, and (iii) an execution bit pte[9] which is on when the page
contains executable code.




pmaISA(cISA, va) = pteISA(cISA, va)[31 : 12] ◦ va[11 : 0].
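The address arithmetic behind the translation can be sketched with plain integers. The function names `pte_address` and `pma` and the integer encoding of bit vectors are assumptions made for this illustration; the sketch covers only the bit manipulation and not the exceptions discussed next.

```python
MASK32 = 0xFFFFFFFF

def pte_address(pto, va):
    """Byte address of the page table entry for virtual address va."""
    base = (pto & 0xFFFFF) << 12     # pto[19:0] followed by twelve zeros
    offset = (va >> 12) << 2         # va[31:12] followed by two zeros
    return (base + offset) & MASK32  # addition modulo 2^32

def pma(pte, va):
    """Physical address: physical page index pte[31:12] joined with va[11:0]."""
    return (pte & 0xFFFFF000) | (va & 0xFFF)
```

For instance, with a page table origin of 1 and virtual address `0x00003ABC`, the entry for virtual page 3 is read from `0x100C`; if that entry holds physical page index `0x55`, the physical address is `0x00055ABC`.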
However, this computation does not always succeed and might result in
raising one of the page fault interrupts which are introduced later in this sec-
tion. Definitions of page faults share a translation exception predicate which
we define below.
The page table length register is used to specify the amount of allocated
virtual memory. In case the virtual address va does not belong to the user
memory, address translation results in a page table length exception:
DEFINITION 2.5 (PTL exception)
ptlexcpISA?(cISA, va) = 〈va[31 : 12]〉 > 〈cISA.spr(ptl)[19 : 0]〉.
Failures in address translation also occur if invalid data was used for the
translation, or the processor must forbid the attempted operation at the spec-
ified address. This happens if the memory which stores an instruction is not
tagged as executable, or protected memory is accessed for writing, or even the
page containing this address is not present in the physical memory. Altogether,
failures in address translation cause a translation exception:
DEFINITION 2.6 (Translation exception)
translexcpISA?(cISA, va,mw?, fetch?) = ptlexcpISA?(cISA, va)
∨ fetch? ∧ pteISA(cISA, va)[9] = 0
∨ mw? ∧ pteISA(cISA, va)[10] = 1
∨ pteISA(cISA, va)[11] = 0.
Above, the flag mw? indicates memory write and the flag fetch? denotes
instruction fetch.
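Definitions 2.5 and 2.6 translate directly into Boolean functions over integers. The sketch below assumes that the page table entry `pte` and the register value `ptl` have already been read; the helper `bit` and the function names are illustrative and not part of the formal model.

```python
def bit(x, i):
    """Return bit i of the integer x."""
    return (x >> i) & 1

def ptl_excp(ptl, va):
    """PTL exception: the virtual page index exceeds ptl[19:0]."""
    return (va >> 12) > (ptl & 0xFFFFF)

def transl_excp(pte, ptl, va, mw, fetch):
    """Translation exception for va, given its page table entry pte.

    mw marks a memory write, fetch marks an instruction fetch.
    """
    return (ptl_excp(ptl, va)
            or (fetch and bit(pte, 9) == 0)   # fetch from a non-executable page
            or (mw and bit(pte, 10) == 1)     # write to a protected page
            or bit(pte, 11) == 0)             # page not present in physical memory
```

A fetch from a valid, executable page inside the page table range, such as `transl_excp(0xA00, 5, 0x3000, False, True)`, raises no exception.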
Interrupts. VAMP ISA computations can be interrupted by interrupt signals
numbered with indices from zero to thirty one. Interrupts are classified ac-
cording to the following criteria: (i) maskable or not maskable, (ii) internal
or external, and (iii) of repeat, continue, or abort type. Maskable interrupts
can be ignored under software control. If an interrupt signal arrives during
execution of some instruction i and it is of repeat type then the instruction i
is repeated when the program execution is resumed. If the interrupt is a con-
tinue interrupt then the instruction that follows i in the program is executed
after the interrupt handling. In the remaining case the program execution is
aborted.
Table 2.3 depicts interrupts supported by the VAMP ISA model. Page fault
interrupts are of particular importance to this thesis. In order to define them
formally, let us first introduce the misaligned access exception. It is raised if
the memory-access width does not match (i) the delayed program counter in
case of instruction fetch (instruction misalignment imal?), or (ii) the low-order
bits of the effective address ea(cISA) (data misalignment dmal?):
imal?(cISA) = cISA.dpc[1] = 1 ∨ cISA.dpc[0] = 1,
dmal?(cISA) = iw-mem?(cISA) ∧ ( ¬iw-byte?(cISA) ∧ ea(cISA)[0] = 1
∨ iw-word?(cISA) ∧ ea(cISA)[1] = 1).
Above, the predicate iw-mem? denotes that a memory access takes place whereas
the predicates iw-byte? and iw-word? denote the corresponding width of that
access. The effective address ea(cISA) is computed as the sum modulo $2^{32}$
of the content of the general purpose register specified by the rs1 instruction
word field and the immediate constant.
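The misalignment predicates can be illustrated on integers as follows. The decoding predicates iw-mem?, iw-byte?, and iw-word? are passed in as Boolean arguments here, which is a simplification made for this sketch; the function names are likewise assumptions.

```python
def imal(dpc):
    """Instruction misalignment: one of the two low bits of dpc is set."""
    return (dpc & 0b11) != 0

def dmal(is_mem, is_byte, is_word, ea):
    """Data misalignment for an access of the given width at address ea."""
    return is_mem and ((not is_byte and (ea & 1) == 1)
                       or (is_word and ((ea >> 1) & 1) == 1))

def effective_address(gpr_rs1, imm):
    """Sum of register content and immediate constant, modulo 2^32."""
    return (gpr_rs1 + imm) & 0xFFFFFFFF
```

A byte access is never misaligned, a half-word access requires an even address, and a word access requires both low bits to be zero.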
The page fault on fetch interrupt is raised in the user mode whenever trans-
lation of the fetch address cannot be done:
DEFINITION 2.7




The page fault on load/store is similar to the page fault on fetch, but here
the effective address of the load/store instruction is examined:
DEFINITION 2.8




∧ translexcpISA?(cISA, ea(cISA), iw-write?(cISA),F).
Above, the predicate iw-write? denotes that a memory-write access takes
place.
Table 2.3: Interrupts of VAMP ISA
Index   Name          Meaning                      Maskable   External   Type
0       reset?        Reset                        No         Yes        Abort
1       ill?          Illegal instruction          No         No         Abort
2       imal?∨dmal?   Instr. / data misalignment   No         No         Abort
3       pff?          Page fault on fetch          No         No         Repeat
4       pfls?         Page fault on load/store     No         No         Repeat
5       trap?         Trap / System call           No         No         Continue
6       ovf?          Overflow                     Yes        No         Continue
≥ 12    eevi?         Device interrupts            Yes        Yes        Continue
Which of the internal interrupts occur in the current configuration is determined
by the configuration cISA alone. In contrast, external interrupts are modeled
as an external input eev :: bv-t of length nineteen which is a parameter to the
transition function of VAMP ISA:
δISA :: CISA × bv-t 7→ CISA,
c′ISA = δISA(cISA, eev).
Interrupt signals raised in the configuration cISA are collected in the cause
function ca(cISA, eev). The masked cause vector mca(cISA, eev) is computed as
a bitwise conjunction of ca(cISA, eev) with the mask stored in status register
cISA.spr(sr): if interrupt i is maskable and cISA.spr(sr)[i] = 0 then bit i is
masked out. If at least one bit of mca(cISA, eev) is on, the jump to interrupt
service routine (JISR) signal jisr(cISA, eev) is activated. In case JISR is not
activated we continue with the uninterrupted transition: $c'_{ISA} = \delta^{woi}_{ISA}(c_{ISA})$.
Otherwise, if the lowest raised interrupt is of continue type then the instruction is
executed, which leads to state $\hat{c}_{ISA}$:
\[
\hat{c}_{ISA} =
\begin{cases}
\delta^{woi}_{ISA}(c_{ISA}) & \text{if continue interrupt} \\
c_{ISA} & \text{otherwise.}
\end{cases}
\]
Afterwards, the program counters are set to the start address of the inter-
rupt service routine. The exceptional versions of program counters and the
status and mode registers are assigned their normal versions. Register eca is
set to the masked interrupt cause. Register edata stores the data needed for
interrupt handling. Finally, the mode and the status registers are set to zero.
Execution of an interrupt service routine ends with the return from exception
instruction. According to its semantics the normal versions of registers pc, dpc,
sr, and mode are assigned their exceptional versions.
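The masking of the cause vector and the JISR signal can be sketched on 32-bit integers. The parameter `maskable`, a bit mask of the maskable interrupt indices, is an assumption of this sketch, as are the function names; the formal model defines these notions on bit vectors.

```python
def masked_cause(ca, sr, maskable):
    """Keep bit i of ca unless interrupt i is maskable and sr[i] = 0."""
    return ca & (sr | ~maskable) & 0xFFFFFFFF

def jisr(mca):
    """JISR is activated if at least one masked cause bit is on."""
    return mca != 0

def lowest_raised(mca):
    """Index of the lowest raised interrupt, or -1 if none is raised."""
    return (mca & -mca).bit_length() - 1
```

As a usage example, take interrupt 6 and interrupts 12 to 31 as maskable, following the rows shown in Table 2.3. With a zero status register, a pending overflow (bit 6) together with a page fault on load/store (bit 4) leaves only bit 4 in the masked cause.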
2.1.2 VAMP Assembly
Experience of Verisoft shows that reasoning about VAMP programs on the ISA
level is unnecessarily hard for a number of reasons: (i) bit-vector encoding of
instructions, (ii) bit-vector representation of operands and program counters,
(iii) unwanted interrupts, and (iv) low-level specification of functional units
close to their implementations. As a response to this problem a convenient
abstraction, the VAMP assembly model [Tsy09, Section 3.2], was introduced.
Data representation. Data types used for representation of VAMP ISA com-
ponents are summarized in Table 2.4. We represent register contents on the
assembly level with thirty-two-bit natural and integer numbers. Observation
shows that Verisoft programs use integers more frequently than natural
numbers. Therefore, it was decided to represent all registers except the program
counters with integers. Accordingly, register files are represented as lists of
integers. We represent memories as mappings from naturals to integers. We
suppose the memory to be word-addressable and thus each memory cell must
be represented with a thirty-two-bit integer.




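$\mathbb{N}$, with validity $\mathbb{N}^{\surd}_{32}(x) = x < 2^{32}$
$\mathbb{Z}$, with validity $\mathbb{Z}^{\surd}_{32}(x) = -2^{31} \le x < 2^{31}$
Register file: $\mathit{Regf}_{ASM} = \mathbb{Z}^{*}$, with validity
$\mathit{Regf}^{\surd}_{ASM}(r) = |r| = 32 \wedge \forall i < 32 : \mathbb{Z}^{\surd}_{32}(r[i])$
Memory: $\mathit{Mem}_{ASM} = \mathbb{N} \mapsto \mathbb{Z}$, with validity
$\mathit{Mem}^{\surd}_{ASM}(m) = \forall a : \mathbb{N}^{\surd}_{32}(a) \longrightarrow \mathbb{Z}^{\surd}_{32}(m(a/4))$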
Configurations. VAMP assembly configurations cASM are modeled with the
record CASM which has the following fields.
cASM.pc, cASM.dpc Registers for normal and delayed program counters.
cASM.gpr, cASM.spr General and special purpose register files.
cASM.m Memory.
Table 2.2 defines the set sprsASM :: pow(N) of special purpose register decimal
indices which we use in the VAMP assembly model.














Instructions. In contrast to the bit-vector instruction representation of VAMP
ISA, instructions on the assembly level are modeled with the inductive data type
Instr. Its constructors are named after instruction mnemonics [Tsy09, Table 3.2].
Semantics. The effect of a single instruction execution on the assembly con-
figuration is defined by the function execinstr :: CASM × Instr 7→ CASM [Tsy09,
Section 3.2.6]. Executions of the assembly model are specified by the transition
function δASM :: CASM 7→ CASM which executes current instruction instr(cASM)
with function execinstr:
δASM(cASM) = execinstr(cASM, instr(cASM)).






In contrast to ISA, in the assembly model interrupts are not visible and there
are neither execution modes nor address translation.
2.2 Devices Framework
This section introduces the devices model used in the thesis. We start by
sketching a generic device model and later illustrate by the hard disk example
how this model can be instantiated with concrete devices. We show how several
devices are organized in a devices system. The reader should consult [Kna08]
for additional information on the devices model in general, and the doctoral
thesis of Alkassar [Alk09] for the hard disk. Note that in Isabelle devices have
outputs to the external environment. Since they are irrelevant to the current
work, we omit them.
2.2.1 Devices
Device model. A device of type x is modeled as a finite transition system with
configurations cx :: Cx and a transition function δx. The step function takes
the current state of the device cx :: Cx, a memory interface input mifit :: mifi-tt
from the processor, and an external interface input eifix :: eifi-tx from a non-modeled
external environment. It returns the device's updated state c′x :: Cx
and a memory interface output mifot :: mifo-tt to the processor. Thus, the
transition function has the following signature:
δx :: Cx ×mifi-tt × eifi-tx 7→ Cx ×mifo-tt.
The sets of inputs eifi-tx from the external environment are device-specific,
whereas the memory interfaces mifi-tt and mifo-tt depend only on the type t
they are represented with. Devices are accessed via word-sized ports which
occupy the 1024 highest word addresses of a processor memory.
Memory interface. A memory interface between a processor and devices is
specified by the memory interface inputs mifit and the outputs mifot. The
inputs are given by a processor to a device, and the outputs are produced by
a device for a processor. The memory interface inputs mifit are represented
by records of the type mifi-tt with the following fields whose representation
depends on type t of the memory interface.
mifit.rd Read flag which indicates a read operation on a device.
mifit.wr Write flag which indicates a write operation on a device.
mifit.a Access address.
mifit.din Word-sized data input used for write accesses to a device.
We denote the idle memory interface input by ε-mifit. The memory interface
output is a singleton mifot :: mifo-tt containing a word-sized data output. We
denote the idle memory interface output by ε-mifot.
Hard disk. A device model can be instantiated with particular devices. Next,
we sketch details of the hard disk model relevant for the thesis. We use
the model of the disk based on the ATA/ATAPI protocol. The hard disk
is parameterized over the number of sectors it has. Each sector has a size
of WORDS PER SECTOR = 128 words. The processor can issue read or write
commands to a range of sectors, by writing the start address and the count
of sectors to a special port. Each sector is then read/written word by word
from/to a sector buffer. After a complete sector is read/written from/to the
sector buffer, the hard disk needs some time to transfer data to the sector
memory. This amount of time is modeled as non-determinism by an oracle
input from the external environment modeled by the type eifi-tHD = {1, 0}.
The value eifihd = 1 indicates the end of the transfer.
Hard disk configurations cHD are represented by records of the type CHD.
The record comprises fields for modeling hard-disk internal functionality as
well as contents stored on the hard disk. For this thesis we are interested only
in the following fields of the hard-disk record.
cHD.s :: N Number of sectors; has to be ≤ MAX SECTORS = $2^{28}$.
cHD.buf :: N∗ The sector word buffer.
cHD.bp :: N The word buffer pointer.
cHD.sm :: N∗ The swap memory content.
cHD.int :: B The pending interrupt flag.
cHD.cs The control state; can be IDLE, BRD, BWR, PRD, PWR, or ERR.
In state IDLE the disk is ready to process new commands. Reading from the
disk starts in state BRD by filling the disk buffer which is then read by the
processor when the disk is in state PRD. Similarly, write commands visit states
PWR and BWR. In case an invalid command is issued, the disk transits to the
error state ERR.
We restrict the set of disk states to the valid ones, which are at the points
between the complete hard disk operations. These restrictions are: (i) the hard
disk is in idle state, (ii) it has no pending interrupt, (iii) the size of the hard
disk is sufficient, (iv) the buffer size is fixed, and (v) the buffer pointer points





(cHD) = cHD.cs = IDLE
∧ ¬cHD.int
∧ 8 · (BOOT PGS + TOT BIG PGS · PGS PER BIG PG) ≤ cHD.s < $2^{28}$
∧ |cHD.sm| = cHD.s · WORDS PER SECTOR
∧ |cHD.buf| = WORDS PER SECTOR
∧ cHD.bp = 0
The meanings of the constants BOOT PGS = 150, TOT BIG PGS = 1152, and
PGS PER BIG PG = 1024 are explained in Section 2.5.
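The validity conditions on the hard disk can be checked by a small predicate. The record layout below is a simplified stand-in for the hard-disk record: the Python field names (in particular `int_pending` for the interrupt flag) and the string encoding of control states are assumptions of this sketch. The factor 8 reflects that a page of $2^{12}$ bytes spans eight sectors of 128 words.

```python
from collections.abc import Sequence
from dataclasses import dataclass

WORDS_PER_SECTOR = 128
BOOT_PGS = 150
TOT_BIG_PGS = 1152
PGS_PER_BIG_PG = 1024
MAX_SECTORS = 2 ** 28

@dataclass
class HardDisk:
    s: int              # number of sectors
    buf: Sequence       # sector word buffer
    bp: int             # word buffer pointer
    sm: Sequence        # swap memory content
    int_pending: bool   # pending interrupt flag
    cs: str             # control state, e.g. "IDLE"

def hd_valid(c):
    """Validity of a hard disk between complete disk operations."""
    min_sectors = 8 * (BOOT_PGS + TOT_BIG_PGS * PGS_PER_BIG_PG)
    return (c.cs == "IDLE"
            and not c.int_pending
            and min_sectors <= c.s < MAX_SECTORS
            and len(c.sm) == c.s * WORDS_PER_SECTOR
            and len(c.buf) == WORDS_PER_SECTOR
            and c.bp == 0)
```

Because a valid disk holds more than a billion words, a `range` object can stand in for the swap memory when only its length matters.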
Generalized devices. The concept of generalized devices allows us to deal with
devices in a generic fashion, i.e., without having knowledge about particular
kinds of devices. Generalized devices are represented by inductive data types,
where each inductive constructor corresponds to a certain device. We consider
only the hard disk in the thesis; however, the concept is easily extensible to
further devices.
A generalized device configuration cGD is an instance of the inductive data
type
CGD = dev-hd(CHD) | · · · | ill-dev.
The constructor ill-dev is used to model an illegal device access. In order to
determine whether a generalized device configuration cGD corresponds to a particular
device, predicates of the form dev-hd?(cGD) = ∃cHD : cGD = dev-hd(cHD)
are used.
Generalized inputs from the external environment eifiGD are modeled by
inductive data type
eifi-tGD = eifi-hd(eifi-tHD) | · · · | ill-eifi.
The constructor ill-eifi is used to model illegal device inputs. In order to test
whether a generalized input eifiGD corresponds to a particular device input, we
use predicates like eifi-hd?(eifiGD) = ∃eifiHD : eifiGD = eifi-hd(eifiHD). We test
whether generalized input eifiGD is compatible with generalized device cGD by
means of predicate
eifi-match-dev?(cGD, eifiGD) = dev-hd?(cGD) ∧ eifi-hd?(eifiGD)
∨ · · ·
∨ ill-dev?(cGD) ∧ ill-eifi?(eifiGD).
The transition function δGD of a generalized device takes a generalized device
configuration cGD, a memory interface input mifiN represented with natural
numbers, and a generalized external input eifiGD. It returns an updated
generalized device configuration c′GD and a memory interface output mifoN
represented with natural numbers. We define the generalized transition function
inductively on generalized external inputs and a generalized device configuration.
In case the input is compatible with the device, we apply the step function
for that device. Otherwise, the illegal device ill-dev is returned together with the idle output:
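\[
\delta_{GD}(c_{GD}, mifi, eifi_{GD}) =
\begin{cases}
(\textit{dev-x}(c'_x), mifo) & \text{if } c_{GD} = \textit{dev-x}(c_x) \wedge eifi_{GD} = \textit{eifi-x}(eifi_x) \wedge \delta_x(c_x, mifi, eifi_x) = (c'_x, mifo) \\
(\textit{ill-dev}, \varepsilon\textit{-mifo}_t) & \text{otherwise.}
\end{cases}
\]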
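The dispatch performed by the generalized transition function can be sketched with tagged pairs. The encoding of constructors as `(tag, payload)` tuples, the table `steps` of per-device step functions, and the constants `ILL_DEV` and `IDLE_MIFO` are all assumptions made for this illustration.

```python
ILL_DEV = ("ill", None)   # stands for the constructor ill-dev
IDLE_MIFO = 0             # stands for the idle memory interface output

def delta_gd(c_gd, mifi, eifi_gd, steps):
    """Generalized device step.

    c_gd and eifi_gd are (tag, payload) pairs; steps maps a device tag to
    that device's step function (c_x, mifi, eifi_x) -> (c_x_new, mifo).
    """
    dev_tag, c_x = c_gd
    in_tag, eifi_x = eifi_gd
    if dev_tag == in_tag and dev_tag in steps:
        # The external input matches the device: run the device's own step.
        c_new, mifo = steps[dev_tag](c_x, mifi, eifi_x)
        return (dev_tag, c_new), mifo
    # Incompatible input or illegal device.
    return ILL_DEV, IDLE_MIFO
```

With a toy step function such as `lambda c, mifi, e: (c + e, c)` registered under the tag `"hd"`, a matching input advances the device, whereas a mismatched input yields the illegal device.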
Devices Systems. Several devices can be organized in a devices system. Let
devnum-t = {1, . . . , MAX DEV} be the set of possible device identifiers. Devices
systems cDS are modeled as mappings from device identifiers to generalized
device configurations CDS = devnum-t 7→ CGD. We distinguish two possible
kinds of transitions a device may take in a system: internal, which are taken as
a reaction to a memory-interface input from the processor, and external, which
process inputs from the external environment.
In an internal step the memory-interface port address mifi.a defines iden-
tifier did of the device supposed to make a transition. The internal step of a
device in a system is defined by function
δINTDS :: CDS ×mifi-tt 7→ CDS ×mifo-tt.
Device did performs a step with idle external inputs ε-eifi:
δGD(cDS(did),mifi, ε-eifi) = (cGD,mifo),





\[
c'_{DS}(i) =
\begin{cases}
c_{GD} & \text{if } i = did \\
c_{DS}(i) & \text{otherwise.}
\end{cases}
\]
For the external step an explicit input of an identifier did of the device
which is supposed to make a step is necessary:
δEXTDS :: CDS × devnum-t× eifi-tGD 7→ CDS,






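\[
c'_{DS}(i) =
\begin{cases}
\mathit{fst}(\delta_{GD}(c_{DS}(did), \varepsilon\textit{-mifi}_t, eifi)) & \text{if } i = did \\
c_{DS}(i) & \text{otherwise.}
\end{cases}
\]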
Finally, we define a version of validity predicate for a hard disk which states







(cDS) = ∃cHD : cDS(SWAP DID) = dev-hd(cHD) ∧ hd ′√(cHD)
2.2.2 Combined Systems
We introduce a notion of a combined system, a processor model coupled with a
devices system. Computations of such systems are guided by an external oracle,
called an execution sequence, which defines for each point of time which of
the computational sources, either the processor or some device, makes a step.
Whenever a device makes a step the external oracle additionally provides a
device input. We define two particular kinds of combined systems: (i) VAMP
ISA with devices, and (ii) VAMP assembly with devices.
Execution sequences and external inputs. An execution sequence is an exter-
nal oracle which parametrizes runs of a combined system. A sequence element
s denotes whether the processor or a device with a particular number and
particular input makes a step and is modeled by the data type
seqel-t = Proc | Dev(N× eifi-tDEV).
A sequence seq is defined as a mapping from combined system’s step numbers
to sequence elements:
seq-t = N 7→ seqel-t.
Execution sequence seq is called valid, if (i) it is live with respect to the pro-
cessor and all devices, i.e., for any position in the sequence there are positions
further in the sequence at which the processor and devices make a step, and
(ii) for any position n in the sequence where a device makes a step, an external





(seq, cDS) = ∀ n : ∃ i : n < i ∧ seq(i) = Proc
∧ ∀ dn, n : ∃ i, eifiGD : n < i ∧ seq(i) = Dev(dn, eifiGD)
∧ ∀ n : seq(n) = Dev(dn, eifiGD)
−→ eifi-match-dev?(cDS(dn), eifiGD).
We determine the number of processor steps in a prefix of length T of
execution sequence seq by means of the function
DEFINITION 2.13 (Processor step numbers)
\[
\textit{proc-steps}(seq, T) =
\begin{cases}
0 & \text{if } T = 0 \\
\textit{proc-steps}(seq, T - 1) & \text{if } seq(T - 1) \neq \mathit{Proc} \\
\textit{proc-steps}(seq, T - 1) + 1 & \text{otherwise.}
\end{cases}
\]
We obtain a list of inputs corresponding to the device with number did from
execution sequence seq prefix of length T by means of function
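DEFINITION 2.14 (Devices inputs filter)
\[
\textit{dev-input}(seq, did, T) =
\begin{cases}
[\,] & \text{if } T = 0 \\
\textit{dev-input}(seq, did, T - 1) \circ [eifi_{GD}] & \text{if } seq(T - 1) = \mathit{Dev}(did, eifi_{GD}) \\
\textit{dev-input}(seq, did, T - 1) & \text{otherwise.}
\end{cases}
\]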
Note that the length of the obtained list in the definition above is the
number of steps for the particular device.
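Both functions on execution sequences can be sketched iteratively rather than recursively. In this illustration a sequence is a Python function from step numbers to tuples; the tuple encoding of `Proc` and `Dev` elements is an assumption of the sketch.

```python
PROC = ("Proc",)

def dev(did, eifi):
    """Sequence element for a step of device did with external input eifi."""
    return ("Dev", did, eifi)

def proc_steps(seq, T):
    """Number of processor steps among the first T sequence elements."""
    return sum(1 for t in range(T) if seq(t) == PROC)

def dev_input(seq, did, T):
    """Inputs of device did among the first T sequence elements, in order."""
    prefix = (seq(t) for t in range(T))
    return [s[2] for s in prefix if s[0] == "Dev" and s[1] == did]
```

For the prefix `Proc, Dev(1, a), Proc, Dev(2, b), Dev(1, c)` the processor makes two steps and device 1 receives the inputs `a` and `c`, so the length of its input list equals its number of steps.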
Non-interference of devices. As defined in Section 2.2.1 devices may take tran-
sitions triggered by the processor they interact with or by the external environ-
ment. As long as devices take steps triggered only by the external environment
it is possible to split a computation of the combined system into two inde-
pendent execution sequences of the processor and of the devices system. This
allows us to reason about the processor computation separately and then ex-
ploit the result of such reasoning in order to describe the computation of the
whole combined system.
We check whether a configuration of the devices system c′DS is obtained
from configuration cDS by executing independently all devices steps contained
in the prefix of sequence seq of length T by the predicate
DEFINITION 2.15 (Non-interference of devices)
non-interf-dev?(cDS, c′DS, seq, T ) =
∀did : c′DS(did) = δ∗GD(cDS(did),
rep(ε-mifit, |dev-input(seq, did, T )|),
fst(dev-input(seq, did, T ))).
Memory mapping for devices. Semantics of combined systems distinguishes
whether the processor accesses some device by executing a load or store in-
struction with an effective address which belongs to devices ports. We define
constant DEVICES BORDER = $\langle 1^{17}0^{15} \rangle$ which partitions the VAMP memory into
two parts: normal memory and devices ports.
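As a small numeric illustration, the border can be computed as an integer. The sketch assumes that the constant denotes the bit string of seventeen ones followed by fifteen zeros, and that addresses at or above the border belong to the devices ports; the function name `is_device_address` is hypothetical.

```python
# Seventeen ones followed by fifteen zeros, read as a natural number.
DEVICES_BORDER = ((1 << 17) - 1) << 15

def is_device_address(ea):
    """True if the byte address ea lies in the devices range (assumed: ea >= border)."""
    return ea >= DEVICES_BORDER
```

Under these assumptions the border equals `0xFFFF8000`, so the devices ports occupy the top $2^{15}$ bytes of the address space.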
VAMP ISA with devices. Configurations cISA+DS of VAMP ISA with devices
are modeled by the record CISA+DS which has two fields.
cISA+DS.cpu :: CISA Processor.
cISA+DS.devs :: CDS Devices system.
As for semantics, first of all we need to adapt the interrupts definitions. Since
we have two memory parts, it has to be guaranteed that page tables and the
fetched instruction do not lie behind DEVICES BORDER. We extend the page
fault on fetch predicate (Definition 2.7) with a condition that an interrupt
occurs if the fetch address or, in case of user mode, the corresponding page
table entry address, belongs to the devices range. For load/store we allow the
accessed address to point to a device port only if the memory operation has
the width of a word. The transition function of the VAMP ISA with devices
model
δISA+DS :: N× CISA+DS × seq-t 7→ CISA+DS
takes as arguments the number of steps T to be executed, ISA with devices
configuration cISA+DS, and execution sequence seq together with external in-
puts. The step function is defined by induction on the step numbers T . For
T = 0 we have:
$\delta^{0}_{ISA+DS}(c_{ISA+DS}, seq) = c_{ISA+DS}.$






then, depending on the current sequence element s = seq(T ) the definition for
the step T + 1 distinguishes three cases:
• the processor makes a device access: the new states $c^{T+1}_{ISA}$ and $c^{T+1}_{DS}$ of
both the processor and the devices system (internal step) are computed,
• the processor makes a step without a device access: only the new processor
state $c^{T+1}_{ISA}$ is generated, and
• the devices system makes a step triggered by the external environment
(external step): only the new state $c^{T+1}_{DS}$ of the devices system is generated.
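The three cases can be sketched as a loop over the execution sequence. All parameters of `run_combined` are hypothetical stand-ins: `is_dev_access` decides whether the processor is about to access a device, while `cpu_step`, `internal_step`, and `external_step` abstract the respective transition functions. The sketch shows only the case distinction, not the formal definition.

```python
def run_combined(cpu, devs, seq, T, *, is_dev_access, cpu_step,
                 internal_step, external_step):
    """Execute T steps of a combined system guided by the sequence seq."""
    for t in range(T):
        s = seq(t)
        if s[0] == "Proc":
            if is_dev_access(cpu):
                # Internal step: processor and devices system both change.
                cpu, devs = internal_step(cpu, devs)
            else:
                # Plain processor step: the devices system is untouched.
                cpu = cpu_step(cpu)
        else:
            # External step: only the devices system changes.
            _, did, eifi = s
            devs = external_step(devs, did, eifi)
    return cpu, devs
```

The design keeps the processor and devices semantics as parameters, so the same driver applies to both the ISA and the assembly variant of the combined system.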
VAMP assembly with devices. Configurations cASM+DS of VAMP assembly with
devices are modeled by the record CASM+DS which has two fields.
cASM+DS.cpu :: CASM Processor.
cASM+DS.devs :: CDS Devices system.
Semantics of the VAMP assembly with devices model distinguishes whether the
processor accesses devices or not in the same fashion as VAMP ISA combined
systems do. The transition function of VAMP assembly with devices
δASM+DS :: N× CASM+DS × seq-t 7→ CASM+DS
performs n steps guided by an execution sequence seq starting from a given
configuration cASM+DS and yields an updated configuration of the VAMP as-
sembly with devices. It is defined inductively over the step number. For n = 0
we have:
$\delta^{0}_{ASM+DS}(c_{ASM+DS}, seq) = c_{ASM+DS}.$






Then, depending on s = seq(n + 1) as well as whether a device access takes
place the function distinguishes the three following cases:
• the processor attempts a device access, and hence new configurations
$c^{n+1}_{ASM}$ and $c^{n+1}_{DS}$ of the processor and the devices system, respectively, are
computed,
• the processor makes a step which does not access any device: only the new
configuration $c^{n+1}_{ASM}$ of the processor is computed, and
• some device performs a transition, therefore only the updated configuration
$c^{n+1}_{DS}$ is obtained.
2.3 C0 and Small-Step Semantics
C0 [Lei07] is a type-safe garbage-collected dialect of C without pointer arith-
metic. C0 was chosen in Verisoft as a compromise between two orthogonal
issues: the language has to be powerful enough to allow implementation of
systems software while having clean formal semantics making verification of
this software feasible. As systems software like a page-fault handler may access
hardware resources beyond visibility of the C0 language, e.g., devices ports, C0
was extended with support of inline assembly statements.
2.3.1 Syntax
The C0 concrete syntax and visibility rules for variables are very similar to
standard C. Operational semantics, though, is similar to Pascal. The major
restrictions are: (i) no initialization during declarations, except for a constant
declaration, (ii) no side-effects inside expressions, (iii) no function calls inside
expressions, (iv) the size of arrays is fixed at the compile time, (v) no vari-
able declarations in functions after the first statement, (vi) only one return
statement which is the last control command in each function, (vii) no pointer
arithmetic, (viii) no pointers to local variables, (ix) no pointers to functions,
(x) no void pointers, i.e. all pointers are typed.
Types. C0 is a typed language. C0 types are defined by the data type ty-t
with the following constructors:
BooleanT,
IntegerT for signed integers,
UnsgndT for unsigned integers,
CharT,
PtrT(tn), where tn :: S,
NullT,
ArrT(n, T ), where n :: N, T :: ty-t,
StructT(fs), where fs :: S× ty-t∗.
Expressions. C0 supports the following expressions of the data type expr-t.
Lit(v) Literal values, where v is a constant.
VarAcc(vn) Variable access, where vn :: S.
ArrAcc(e, i) Access of array e with index i, where e, i :: expr-t.
StructAcc(e, n) Access of structure e :: expr-t with field n :: S.
Deref(e) Dereferencing, where e :: expr-t.
UnOp(uop, e) Unary operations, where uop :: unop and e :: expr-t.
BinOp(bop, e1, e2) Binary operations; bop :: binop and e1, e2 :: expr-t.
LzBinOp(bop, e1, e2) Lazy binary operations; bop :: lzbinop, e1, e2 :: expr-t.
Data types for unary unop, binary binop, and lazy binary lzbinop operations
are inductive with constructors of the form unary-minus, plus, logical-and, etc.
Also note that constants passed as parameters to literal values are represented
with the type val-t. Since it is also used to represent values in the big-step
semantics, we define it formally later in Section 3.3 where we discuss the big-
step semantics.
Statements. C0 supports the following statements modeled by the data type
stmt-t:
Skip The empty statement.
Ass(e1, e2) Assignment of expression e2 :: expr-t to e1 :: expr-t.
PAlloc(e, tn) Allocation of a pointer to new element of type tn :: S
with assignment of the address to expression e :: expr-t.
Comp(c1, c2) Sequential composition, where c1, c2 :: stmt-t.
Ifte(e, c1, c2) Conditional execution; e :: expr-t and c1, c2 :: stmt-t.
Loop(e, c) While loop, where e :: expr-t and c :: stmt-t.
SCall(vn, pn, ps) Call of procedure pn :: S with parameters ps :: expr-t∗
and return value assigned to variable vn :: S.
XCall(pn, ps, rs) Extended call of procedure pn with parameters ps and
result expressions rs, where pn :: S and ps, rs :: expr-t∗.
ESCall(pn, ps, rs) External call to a procedure pn that is only declared.
Return(e) Return from procedure.
Asm(l) Inline assembly statement; l :: Instr∗.
All statements in a C0 program are tagged with unique numerical identifiers
which are hidden in the present thesis.
We compute the number of return statements in the statement (tree) s by
the function #ret(s) [Lei07, Definition 4.3]. We convert a statement (tree) s
into a list of statements by the function s2l(s) [Lei07, Definition 4.5].
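The two helper functions can be illustrated on a reduced statement type. The precise definitions are those of [Lei07]; the sketch below merely assumes that #ret counts the Return nodes of a statement tree and that s2l flattens nested sequential compositions. Only a few statement constructors are modeled, and the class and function names are chosen for this example.

```python
from dataclasses import dataclass

@dataclass
class Skip:
    pass

@dataclass
class Return:
    e: object

@dataclass
class Comp:
    c1: object
    c2: object

@dataclass
class Ifte:
    e: object
    c1: object
    c2: object

@dataclass
class Loop:
    e: object
    c: object

def num_returns(s):
    """Count Return statements in the statement tree s (assumed reading of #ret)."""
    if isinstance(s, Return):
        return 1
    if isinstance(s, (Comp, Ifte)):
        return num_returns(s.c1) + num_returns(s.c2)
    if isinstance(s, Loop):
        return num_returns(s.c)
    return 0

def s2l(s):
    """Flatten nested sequential compositions into a list (assumed reading of s2l)."""
    if isinstance(s, Comp):
        return s2l(s.c1) + s2l(s.c2)
    return [s]
```

For a body consisting of a skip, a conditional, and a final return, `num_returns` yields 1 and `s2l` yields a list of three statements.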
Programs. A type environment maps type names to types:
tenv-t = (S× ty-t)∗.
A function table stores information about the functions. A single function f
is described by data type func-t which has four components.
f.body :: stmt-t The body of the function.
f.params :: (S× ty-t)∗ The list of function parameters.
f.lvars :: (S× ty-t)∗ The list of local variables.
f.rtype :: ty-t The return type of the function.
Parameters and local variables are represented by their names and types. We
call a list of pairs of names and types a symbol table. A complete function table
is a list of pairs of function names and descriptions:
functable-t = (S× func-t)∗.
Finally, a program is modeled by type prog-t which consists of a type environ-
ment, a function table, and a global symbol table:
prog-t = tenv-t× functable-t× (S× ty-t)∗.
A program Π :: prog-t is called translatable [Lei07, Definition 7.41], denoted
by Π ∈ xltblprog, if it fulfills constraints on the size of immediate operands and
the number of required VAMP assembly registers to evaluate expressions.
2.3.2 Small-Step Semantics
Different kinds of semantics for C0 were developed in Verisoft (cf. Chapter 3). In
this section we consider the lowest-level version, the C0 small-step semantics,
which describes each single computation step that transforms configurations.
Values. Variables and sub-variables are modeled by data type gvar-t which
has the following constructors.
gvargm(vn) Global variable of name vn :: S.
gvarlm(n, vn) Local variable of n-th frame with name vn :: S; n :: N.
gvarhm(n) n-th heap variable; n :: N.
gvararr(g, n) n-th array element of variable g :: gvar-t; n :: N.
gvarstr(g, fn) Field of structure g :: gvar-t with name fn :: S.
In the small-step semantics all values are flattened. We have only elementary
values that could occupy one memory cell. Values of aggregate types are stored
consecutively as sequences of memory cells. We model memory cells by data
type mcell-t with the following constructors.
mcellbool(b) A Boolean b :: B.
mcellint(i) A (signed) integer i :: Z.
mcellnat(n) An unsigned integer n :: N.
mcellchar(c) A character c :: Z.
mcellptr(p) A pointer to p :: gvar-t⊥; ⊥ models the null pointer.
State. Configurations cSS of C0 small-step semantics are modeled by the data
type CSS which has two components.
cSS.mem The memory configuration of the type memconf-t.
cSS.prog The program rest of type stmt-t.
A memory configuration cSS.mem is a record of the type memconf-t with the
following three components.
cSS.mem.gm Memory frame for global variables.
cSS.mem.hm Heap memory frame.
cSS.mem.lm Stack of pairs of local memory frames and return destinations.
Each memory frame m consists of its symbol table m.st :: (S× ty-t)∗, the set
of initialized variables m.init :: pow(S), and the content m.ct :: N 7→ mcell-t.
We abbreviate global and heap symbol tables of a memory configuration
mem as gst(mem) and hst(mem), respectively. The top-most local memory
frame in the memory configuration mem is denoted by lmtop(mem) whereas
the top-most return destination is denoted by restop(mem). The symbol table
of the top-most local memory frame is denoted by lsttop(mem).
Expression evaluation. Evaluation of expression e in memory configuration
cSS.mem and with respect to type environment te is done by means of func-
tion evalSS(te, cSS.mem, e) and is defined inductively over an expression tree in
Section 4.3.4 of Leinenbach’s thesis [Lei07]. In case e is a variable access the
content from corresponding memory frame is obtained, otherwise, equivalent
abstract operations are applied.
Transition function. The core of the small-step semantics is its transition func-
tion [Lei07, Section 4.4.3]. The transition function
δSS :: tenv-t × functable-t × CSS ↦ CSS⊥
gets a type environment te and a function table ft and calculates the transition
from configuration cSS:
\[
\delta_{SS}(te, ft, c_{SS}) =
\begin{cases}
\lfloor c'_{SS} \rfloor & \text{if no fault occurs} \\
\bot & \text{otherwise.}
\end{cases}
\]
We execute n steps by δ^n_SS.
Valid configurations. A configuration cSS is in the set of valid configurations
C0√(te, ft) [Lei07, Section 5.5] for a particular type environment and a function
table if basic well-formedness and typing constraints hold for the configuration
and the components, in particular:
• unique identifiers within the different name spaces of type names, global
variable names, local variable names within the different procedures, and
procedure names,
• procedure bodies are well-typed and contain a single return statement at
the end,
• all memory frames are well-typed and contain only valid pointers, i.e.,
pointers which point to existing variables,
• the program rest conforms with the function tables, i.e., all statements in
the program rest are from one of the procedures and their order follows
certain rules, and
• the number of return statements in the program rest is not greater than
the number of stack frames.
A version of the C0 validity C0′√(te, ft) which is often used in program
verification strengthens the last condition by requiring the number of return
statements in the program rest to be equal to the recursion depth minus one.
2.3.3 Compiling C0 to VAMP
So far in this chapter we have defined three models for reasoning about
programs: VAMP ISA, VAMP assembly, and C0 small-step semantics. Most of the
software in Verisoft is implemented and verified at the C0 level. However, the
key theorem of this thesis, the top-level correctness of a page-fault handler,
has to describe, among other things, how C0 code is executed on the target ISA
machine. To define this, C0 code needs to be translated, via assembly code, to
object code that is executable on the hardware. For this purpose a simple
non-optimizing compiler [Lei07, Pet07] has been developed and verified in
Verisoft. The correctness theorem of the compiler specification [Lei07,
Theorem 8.3] is a simulation theorem between a C0 program executed by the C0
small-step semantics and the generated assembly code executed by the VAMP
assembly model. The gap towards VAMP ISA was closed by Tsyban: a sim-
ulation theorem between VAMP ISA and assembly [Tsy09, Theorem 4.8] was
proven. Moreover, these two theorems were transitively combined in a ver-
ified ministack from C0 to VAMP ISA [Tsy09, Section 4.3]. We present its
correctness statement below.
Abstraction relation from VAMP ISA towards assembly. The major difference
between the VAMP ISA and assembly models is data representation. ISA registers
are defined as bit vectors whereas assembly registers are natural numbers.
ISA memory is defined as a mapping from bit vectors of length twenty-nine to
pairs of bit vectors of length thirty-two whereas assembly memory is a mapping
from thirty-bit natural numbers to thirty-two-bit integers. The abstraction
relation isa-sim-asm?(cASM, cISA) between VAMP ISA and assembly [Tsy09,
Definition 4.6] basically defines how components of an ISA configuration are
converted to their assembly equivalents.
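To make the difference in data representation concrete, the following sketch converts one word of an ISA-style memory into its assembly-style value. It is not the definition of isa-sim-asm? from [Tsy09]: bit vectors are modeled as strings of '0' and '1' (most significant bit first), and the choice that the first component of a pair holds the even word is an assumption made only for this illustration.

```python
from typing import Dict, Tuple

# ISA-style memory: 29-bit address (bit string) -> pair of 32-bit words (bit strings)
IsaMem = Dict[str, Tuple[str, str]]


def bv_to_nat(bits: str) -> int:
    """Interpret a bit string (most significant bit first) as a natural number."""
    return int(bits, 2)


def nat_to_int32(n: int) -> int:
    """Reinterpret a 32-bit natural number as a two's-complement integer."""
    return n - (1 << 32) if n >= (1 << 31) else n


def asm_mem_word(isa_mem: IsaMem, word_addr: int) -> int:
    """Assembly-style view: 30-bit word address -> 32-bit integer.

    ASSUMPTION: component 0 of a pair holds the even word, component 1 the odd word.
    """
    pair = isa_mem[format(word_addr // 2, "029b")]
    return nat_to_int32(bv_to_nat(pair[word_addr % 2]))
```

For example, with a single double word stored at ISA address 3, the word addresses 6 and 7 select its two halves.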
Abstraction relation from VAMP assembly towards C0. An allocation func-
tion maps variables g to pairs alloc(g) = (b, s) of the allocated base ad-
dress b for g and the allocated size s of g’s type. The simulation relation
consis?(te, ft, cSS, alloc, cASM) [Lei07, Definition 8.11] defines whether a C0 small-
step semantics configuration and an assembly configuration are equivalent with
respect to a given allocation function. It is defined as a conjunction of different
individual consistency properties including those for code, control, allocation,
values, pointers, registers, and frame headers.
Abstraction relation from VAMP ISA towards C0. The transitive composition of
the abstraction relations introduced in this section defines the ministack
abstraction relation from ISA towards C0:
C0-sim-isa?(te, ft, cSS, cISA) = ∃ cASM, alloc : asm√(cASM)
∧ consis?(te, ft, cSS, alloc, cASM)
∧ isa-sim-asm?(cASM, cISA).
Preconditions for simulation. The predicate dyn-C0-props?(te, ft, cSS, n)
denotes the necessary preconditions [Tsy09, Section 4.2.4] for the simulation
of C0 by VAMP ISA, which have to hold at each of the n steps. The preconditions
are the absence of assembly statements and the “sufficient memory” requirement.
The latter states that the topmost stack frame ends below the heap base and that
the allocated heap size is appropriately bounded.
We denote that an ISA machine is running in system mode with all inter-
rupts masked out by
sys-execISA?(cISA) = sys-modeISA?(cISA) ∧ 〈cISA.spr(sr)〉 = 0.
Additional conclusions. In order to effectively apply the simulation theorem
between C0 and VAMP ISA in the context of systems verification the theorem’s
conclusion is enriched with additional properties which state that certain com-
ponents of the system stay unchanged during the C0 execution. The predicate
no-mod-spr?(spr, spr′) states that no special registers are modified. The predi-
cate only-mod-mem?(m,m′, abegin, aend) states that only the part of the memory
between the addresses abegin and aend is modified. Both predicates are formally
defined in Section 4.3.2 of Tsyban’s thesis [Tsy09].
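The two frame predicates can be sketched as follows. The formal definitions are in [Tsy09]; here memories are modeled as sparse dictionaries with default value 0, and the modifiable range is assumed to be the half-open interval from abegin up to (but excluding) aend. Both choices are assumptions of this sketch.

```python
from typing import Dict, Sequence


def no_mod_spr(spr: Sequence[int], spr_new: Sequence[int]) -> bool:
    """No special purpose register differs between the two register files."""
    return list(spr) == list(spr_new)


def only_mod_mem(m: Dict[int, int], m_new: Dict[int, int],
                 a_begin: int, a_end: int) -> bool:
    """Memory may differ only at addresses a with a_begin <= a < a_end (ASSUMED bounds)."""
    addrs = set(m) | set(m_new)
    return all(m.get(a, 0) == m_new.get(a, 0)
               for a in addrs
               if not (a_begin <= a < a_end))
```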
THEOREM 2.16 (C0 simulates ISA)
Assume that (i) the initial C0 program is translatable, (ii) the C0-ISA
relation holds for the current valid C0 configuration cSS and some valid VAMP
ISA machine cISA+DS.cpu running in system mode, (iii) the execution sequence is
valid, (iv) the C0 computation starting from this configuration does not
produce a ⊥-configuration up to step n, (v) during these n steps we execute
only non-assembly statements and have enough stack and heap memory, and (vi) the
last C0 address PROGEND does not lie in the devices range. Then there exists a
number T of steps of ISA with devices during which the combined ISA system
transits to a valid resulting state c′ISA+DS. Moreover, for this state and a
corresponding valid C0 configuration c′SS the following holds: (i) the C0-ISA
relation is preserved, (ii) the special purpose registers are unchanged,
(iii) only the memory which belongs to the C0 program is possibly changed, and
(iv) the devices non-interference holds. Formally:
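\[
\begin{aligned}
& (te, ft, gst(c_{SS}.mem)) \in \text{xltblprog} \\
& \land\; \text{C0-sim-isa?}(te, ft, c_{SS}, c_{ISA+DS}.cpu) \\
& \land\; \text{isa}_{\surd}(c_{ISA+DS}.cpu) \land \text{seq}_{\surd}(seq, c_{ISA+DS}.devs) \land c_{SS} \in C0'_{\surd}(te, ft) \\
& \land\; \text{sys-exec}_{ISA}?(c_{ISA+DS}.cpu) \\
& \land\; \delta^{n}_{SS}(te, ft, c_{SS}) = \lfloor c'_{SS} \rfloor \\
& \land\; \text{PROGEND} < \text{DEVICES\_BORDER} \\
& \land\; \text{dyn-C0-props?}(te, ft, c_{SS}, n) \\
& \longrightarrow \exists\, T, c'_{ISA+DS} : \\
& \qquad \delta^{T}_{ISA+DS}(c_{ISA+DS}, seq) = c'_{ISA+DS} \\
& \qquad \land\; \text{isa}_{\surd}(c'_{ISA+DS}.cpu) \land c'_{SS} \in C0'_{\surd}(te, ft) \\
& \qquad \land\; \text{C0-sim-isa?}(te, ft, c'_{SS}, c'_{ISA+DS}.cpu) \\
& \qquad \land\; \text{no-mod-spr?}(c'_{ISA+DS}.cpu.spr, c_{ISA+DS}.cpu.spr) \\
& \qquad \land\; \text{only-mod-mem?}(c'_{ISA+DS}.cpu.m, c_{ISA+DS}.cpu.m, \text{abasegm}(te, ft, gst(c_{SS}.mem)), \text{PROGEND}) \\
& \qquad \land\; \text{non-interf-dev?}(c_{ISA+DS}.devs, c'_{ISA+DS}.devs, seq, T).
\end{aligned}
\]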
Note that above the function abasegm(te, ft, gst) [Lei07, Definition 7.15] com-
putes the base address of the global memory.
2.4 CVM: Communicating Virtual Machines
One of the most challenging verification objectives of Verisoft is to prove that a
VAMP ISA machine with devices correctly implements memory virtualization
for user processes: the physical memory and the swap space of the hard disk
are organized by the kernel to provide separate uniform linear memories for
user processes. Such organization is described by the model communicating
virtual machines (CVM) [GHLP05, IdRT08].
CVM [Tsy09] is a computational model for concurrent user processes in-
teracting with a generic microkernel and devices. CVM is implemented in
C0 with inline assembly as a framework featuring virtual memory, demand
paging [ASS08], memory management, and low-level inter-process and devices
communications. The demand paging implementation includes a page-fault
handler and its initialization code. Most other features are implemented in the
form of so-called microkernel primitives [ST08a], functions with inline
assembly parts realizing basic operations which constitute the kernel’s functionality.
The framework can be linked on the source code level with an abstract ker-
nel [In 09], an interface to users, in order to obtain a concrete kernel, a program
that can run on a target machine, e.g., a VAMP processor.
2.4.1 Semantics
Configurations. We denote the number of processes, including the kernel, that
are allowed to run in our system by the constant MAX_PID = 128. For identifiers
of user processes we introduce the data type procnum-t = {1, . . . , MAX_PID − 1}.
Altogether, the user processes of the system are modeled as a mapping from
process identifiers to VAMP assembly configurations. We use the data type
userprocs-t = procnum-t ↦ CASM for that. CVM configurations cCVM are modeled by
the record CCVM which has the following components.
cCVM.ak :: CmonoC0 Configuration of the abstract kernel.
cCVM.ups :: userprocs-t Mapping of user processes.
cCVM.ds :: CDS Configuration of the devices system.
cCVM.cup :: procnum-t⊥ Current-process identifier.
cCVM.sr :: N Status-register used as a mask for interrupts.
The abstract kernel is modeled by the monolithic C0 small-step semantics
configuration which (additionally) includes a type environment and a function
table. Each user process is modeled by the VAMP assembly semantics. Con-
figurations of the CVM model are by no means restricted to any particular
instantiation of the devices system component. However, we will instantiate
the devices-system component with the devices system of the underlying phys-
ical combined system from which the swap hard disk is removed as swapping is
transparent to the CVM model. The identifier of a current process is modeled
by option type procnum-t⊥: the value ⊥ corresponds to the kernel whereas
any value pid which belongs to the set procnum-t corresponds to the process
with number pid. The status register component represents the interrupt mask
shared between the user processes. CVM executions are parametrized with
abstract kernel code ΠAK :: prog-t. Initial configuration of the CVM model for
devices system configuration cDS is denoted by cvminit(ΠAK, cDS).
Computations. The parameters of the CVM transition function
δCVM :: CCVM × seqel-t ↦ CCVM⊥
are a CVM model configuration cCVM :: CCVM, and an execution-sequence
element s :: seqel-t. In case the sequence element corresponds to a processor
step either the kernel or some user makes progress. Otherwise, δCVM boils down
to a step of some device. The CVM transition function yields either an error
constant ⊥ or an updated configuration of the CVM model.
Depending on the execution sequence element s and the current-process
identifier cCVM.cup the CVM transition function δCVM(cCVM, s) distinguishes
the following cases.
Devices step: s corresponds to some device.
User step: s is a processor step and cCVM.cup is a user-process identifier.
Kernel step: s is a processor step and cCVM.cup corresponds to the kernel.
Formally, the CVM transition function is
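\[
\delta_{CVM}(c_{CVM}, s) =
\begin{cases}
\lfloor \text{stepuser}(c_{CVM}) \rfloor & \text{if } s = \text{Proc} \land c_{CVM}.cup = \lfloor pid \rfloor \\
\text{stepkernel}(c_{CVM}) & \text{if } s = \text{Proc} \land c_{CVM}.cup = \bot \\
\lfloor \text{stepdevs}(c_{CVM}, did, eifi) \rfloor & \text{if } s = \text{Dev}(did, eifi)
\end{cases}
\]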
For formal definitions of the functions stepdevs, stepuser, and stepkernel consult
the thesis of Tsyban [Tsy09].
A device step stepdevs [Tsy09, Definition 5.14] is taken as a response to
an input from the external environment and boils down to an application of
external device step function δEXTDS .
User steps distinguish three cases: (i) a user step without interrupts, (ii) a
user step with an interrupt that aborts the user execution (illegal,
misalignment, or PTL exception), and (iii) a user step with an interrupt which
allows us to take a step (external interrupts, trap, and overflow). User steps
without interrupts boil down to an application of the VAMP assembly transition
function to the user process which is making a step. Steps with interrupts
result in setting the current-process identifier to the kernel value and in the
abstract-kernel invocation. The update of the user process depends on the kind
of interrupt. In case (ii) the user process is not changed, otherwise it makes
a step as in case (i). Altogether, the function stepuser [Tsy09, Definition 5.15]
updates the CVM-state components for user processes, the current-process
identifier, and the abstract kernel.
We distinguish three cases of a kernel step stepkernel [Tsy09, Definition 5.21]:
(i) waiting for interrupts from devices, (ii) finishing the kernel execution, and
(iii) a step of the abstract kernel. The first case models a situation when there
is not even a single user-process in the system to be resumed. In this case, the
kernel has no jobs to accomplish, and hence its configuration remains the same.
We say that the kernel is waiting for interrupts in this situation. If an interrupt
occurs, the kernel will be restarted. The second case handles a switch from the
kernel execution either to an execution of the next scheduled user process, or
to the idle state defined in the first case. The last case models a step of the
abstract kernel. This step is either a simple C0 step of the abstract kernel or
an execution of some CVM primitive.
The multiple-step function of the CVM δnCVM is defined by induction on the
step number n:
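\[
\begin{aligned}
\delta^{0}_{CVM}(c_{CVM}, seq) &= \lfloor c_{CVM} \rfloor \\
\delta^{n+1}_{CVM}(c_{CVM}, seq) &=
\begin{cases}
\bot & \text{if } \delta^{n}_{CVM}(c_{CVM}, seq) = \bot \\
\delta_{CVM}(c^{n}_{CVM}, seq(n)) & \text{if } \delta^{n}_{CVM}(c_{CVM}, seq) = \lfloor c^{n}_{CVM} \rfloor
\end{cases}
\end{aligned}
\]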
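The case distinction of the CVM transition function and its multiple-step version can be sketched as follows. This is a minimal illustration, not the formal model: configurations are plain dictionaries with a "cup" entry, the error constant ⊥ is modeled by None, and the three step functions are passed in as parameters because their real definitions are in [Tsy09]. The names delta_cvm and delta_cvm_n are hypothetical.

```python
PROC = "Proc"  # processor step; device steps are tuples ("Dev", did, eifi)


def delta_cvm(c, s, step_user, step_kernel, step_devs):
    """One CVM step; returns the new configuration or None (the error constant)."""
    if s == PROC:
        if c["cup"] is not None:
            return step_user(c)      # user step: never yields an error
        return step_kernel(c)        # kernel step: may yield None
    _tag, did, eifi = s
    return step_devs(c, did, eifi)   # device step: never yields an error


def delta_cvm_n(c, seq, n, **steps):
    """n CVM steps along the execution sequence seq (a function from step index to element)."""
    for k in range(n):
        c = delta_cvm(c, seq(k), **steps)
        if c is None:                # once an error occurs, it persists
            return None
    return c
```

The loop mirrors the inductive definition: step k consumes the sequence element seq(k), and an error at any step makes all later results an error as well.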
Implementation. The CVM model is implemented in C0 with inline assembly
as a framework [Tsy09, Section 6.1] which consists of (i) process-context switch
procedures saving and restoring contexts of user processes, (ii) demand paging
implementation: a page-fault handler with its initialization code as well as
elementary device drivers, (iii) an elementary dispatcher which decides whether
an invocation of a page-fault handler is needed and calls the dispatcher of an
abstract kernel, and (iv) fourteen primitives for different operations for user
processes.
2.4.2 Correctness
Linker. As mentioned before a concrete kernel, a complete kernel program
that can run on a computer, is obtained by linking an abstract kernel with the
CVM framework. Next we follow Chapter 6 of Tsyban’s thesis [Tsy09] and
sketch how the linking operator is formally defined. Let the program of the
CVM framework be ΠCVM = (teCVM, ftCVM, gstCVM).
Linking of two type environments te and te′ is done by means of the function
linkte(te, te′) [Tsy09, Definition 6.1]. The result is a type environment which
contains all types which constitute both original type environments.
Linking of two function tables ft and ft′ is done by means of the function
linkft(ft, ft′) [Tsy09, Definition 6.6]. First, the function removes all external
functions from ft′, deletes entries of external functions in ft which are defined
in ft′, and replaces all external function call statements in ft by ordinary calls
in case a callee is defined in ft′. Next, the same is done for ft with respect to ft′.
Finally, the modified function tables are subject to concatenation. Here one
peculiarity has to be considered. The C0 small-step semantics requires that
each statement in a program is uniquely tagged with a statement identifier. The
concatenation mentioned above violates the uniqueness of statements identifiers
property. Therefore, the statements of the two function tables are renumbered
by means of the functions renum^odd_fun and renum^even_fun [Tsy09, Definition 6.5]
to regain the statement identifier uniqueness and only afterwards concatenated.
Linking of two symbol tables st and st′ is done by means of the function
linkst(st, st′) [Tsy09, Definition 6.7]. It concatenates the first symbol table with
the part of the second symbol table from which all entries that occur in the
first symbol table are removed.
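The symbol table linking just described can be sketched in a few lines. Symbol tables are modeled as lists of (name, type) pairs, and an entry of the second table counts as occurring in the first if its name does; this comparison by name is an assumption of the sketch, not a statement about [Tsy09, Definition 6.7].

```python
from typing import List, Tuple

SymTab = List[Tuple[str, str]]  # (variable name, type name)


def link_st(st: SymTab, st2: SymTab) -> SymTab:
    """Concatenate st with those entries of st2 whose names do not occur in st."""
    names = {name for name, _ in st}
    return list(st) + [(name, ty) for name, ty in st2 if name not in names]
```

For instance, linking [("x", "int"), ("y", "bool")] with [("y", "bool"), ("z", "int")] keeps x and y from the first table and adds only z from the second.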
Linking of two C0 programs Π and Π′ is done by means of the function
linkΠ(Π,Π′) = Π′′ [Tsy09, Definition 6.8] which uses the functions defined
above to link the type environments, function tables, and symbol tables of the
two programs.

The C0 program of the concrete kernel ΠCK and its individual components
are defined below. All of them are parametrized by the program of the abstract
kernel ΠAK.
DEFINITION 2.17 I





Correctness of linking is justified by showing the C0 validity of the indi-
vidual components of the linked program. In order to succeed with that a
number of assumptions on the structure of the abstract kernel’s program ΠAK
have to be imposed. These assumptions are listed in Section 6.4 of Tsyban’s
thesis [Tsy09] and are collected in the predicate
abs-kernel-props(ΠAK).
Abstraction relation. Essentially, the correctness criteria of CVM are
formulated as an abstraction relation from configurations of VAMP ISA with
devices towards the CVM model state. The abstraction relation denoted by
cvm-sim?(ΠAK, cCVM, cSS, cISA+DS)
holds if the CVM configuration cCVM encodes both the states of VAMP ISA
with devices cISA+DS and the concrete kernel cSS with respect to the abstract
kernel program ΠAK. The relation distinguishes user and kernel executions
within CVM and is formally defined in Section 7.1 of Tsyban’s thesis [Tsy09].
Since demand paging executions belong to the kernel we discuss below only
essential terms of the abstraction relation during the kernel runs. The latter
comprises (i) the kernel relation [Tsy09, Definition 7.10] which defines how a
concrete kernel is related to the abstract one, (ii) the devices relation [Tsy09,
Definition 7.1] which claims that the device states in CVM cCVM.ds and VAMP
ISA cISA+DS.devs configurations are equal except for the swap hard disk which
is an idle device in CVM, (iii) the relation for user processes [Tsy09, Definition
7.3] which states that the user processes configurations cCVM.ups are repre-
sented in the configuration cISA+DS of the VAMP ISA machine, and (iv) the
equality between the status register value retrieved from the memory of VAMP
ISA and the corresponding CVM component cCVM.sr.
The relation for user processes is of particular importance to the present
work since it can be maintained only with a correct page-fault handler. The
registers of suspended user processes are implemented by the process control
blocks (PCBs), a data structure maintained by the CVM framework which
we describe in detail later in Section 2.5. An active user process, i.e., the
one that is currently running on the processor, is represented by the contents
of hardware registers. In both cases memories of user processes are stored in
the physical memory and on the hard disk. The cases are distinguished by
analyzing the value of the current process identifier and the execution mode of
the ISA machine.
The function vm(cISA+DS, p) performs the mentioned case distinction and
reconstructs a virtual assembly machine for a given process identifier p.
DEFINITION 2.18 (Reconstruction of user processes)
\[
vm(c_{ISA+DS}, p) =
\begin{cases}
\text{vmdirect}(c_{ISA+DS}, p) & \text{if } \lnot\,\text{sys-mode}_{ISA}?(c_{ISA+DS}.cpu) \land \text{valcup}(c_{ISA+DS}) = p \\
\text{vmvars}(c_{ISA+DS}, p) & \text{otherwise}
\end{cases}
\]
In the first case the process p is active and its virtual machine is
constructed directly from the hardware registers with the help of the function
vmdirect. The second case corresponds to a situation when the process p is
suspended and its virtual machine is reconstructed from variables (PCBs) by
means of the function vmvars. The reader can find the definitions of both
functions in Section 7.1.3 of Tsyban’s thesis [Tsy09]. By valcup(cISA+DS) we
denote the value of the current process identifier on the side of CVM’s
implementation, i.e., the variable cup, read from the memory of cISA+DS. For
the reconstruction of user memory both vmvars and vmdirect use the function
mem(cISA+DS, p) which we introduce next. Its definition uses the functions
natvars(cISA, ad), intvars(cISA, ad), and intswap(cDS, ad) which retrieve
natural or integer numbers from the physical or swap memory at a given
address ad.
For each memory address, the decision where the data can be found is taken
according to the valid bit of the respective page table entry. On the input we
have a process identifier p and a virtual address va. Below we formally describe
how the address translation mechanism is defined in the CVM formal theories.¹
The page index and the byte index of a virtual address va are defined as
follows.
DEFINITION 2.19 (Page index and byte index)
px(va) = ⌊va / PG_SZ⌋,
bx(va) = va mod PG_SZ.
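As a small worked example of Definition 2.19, the sketch below splits a virtual address into its page index and byte index, using the page size of 4KB stated for this system in Section 2.5.1.

```python
PG_SZ = 4096  # page size in bytes (4KB)


def px(va: int) -> int:
    """Page index of a virtual address (integer division)."""
    return va // PG_SZ


def bx(va: int) -> int:
    """Byte index of a virtual address within its page."""
    return va % PG_SZ


va = 0x12345
# 0x12345 lies in page 0x12 at byte offset 0x345
page, offset = px(va), bx(va)
```

The two indices determine the address uniquely: px(va) · PG_SZ + bx(va) gives back va.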
The page table origin and length for a process p are obtained from the memory
of ISA by reading the registers pto and ptl from the process control block at
the addresses ad^p_pto and ad^p_ptl, respectively.
ptoCVM(cISA, p) = natvars(cISA, ad^p_pto)
ptlCVM(cISA, p) = intvars(cISA, ad^p_ptl)
¹ Note that the hardware specification of this mechanism was already given in
Section 2.1.1. Moreover, Chapter 4 will define the same mechanism in terms of
the abstract configurations of the page-fault handler. All three
representations of the address translation mechanism are shown to be equivalent
during the proofs of the CVM and page-fault handler top-level correctness
theorems.
The i-th page table entry of a process p is delivered by reading the i-th
word in the physical memory starting from the respective page table origin.
DEFINITION 2.21 (Page table entry)
pteCVM(cISA, p, i) = natvars(cISA, ptoCVM(cISA, p) · PG_SZ + i · 4)
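The page index of a page table entry has the meaning of a physical page
index. Combined with the byte index of the virtual address it yields the
physical memory address.
pmaCVM(cISA, p, va) = px(pteCVM(cISA, p, px(va))) · PG_SZ + bx(va)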
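The page table lookup and the resulting physical memory address can be sketched as below. This is an illustration under simplifying assumptions: the physical memory is a dictionary from byte addresses to natural numbers (standing in for natvars), and the page table origin is passed in directly instead of being read from the process control block.

```python
PG_SZ = 4096  # page size in bytes


def px(x: int) -> int:
    return x // PG_SZ


def bx(x: int) -> int:
    return x % PG_SZ


def pte_cvm(phys: dict, pto: int, i: int) -> int:
    """i-th page table entry: the i-th word after the page table origin (a page index)."""
    return phys[pto * PG_SZ + i * 4]


def pma_cvm(phys: dict, pto: int, va: int) -> int:
    """Physical address: physical page index from the entry, byte index from va."""
    return px(pte_cvm(phys, pto, px(va))) * PG_SZ + bx(va)


# Hypothetical example: page table at page 1024, virtual page 2 mapped to physical page 7000
phys = {1024 * PG_SZ + 2 * 4: 7000 * PG_SZ + (1 << 11)}
pma = pma_cvm(phys, 1024, 0x2010)
```

In this example the virtual address 0x2010 (page 2, byte 0x10) translates to byte 0x10 of physical page 7000; the low twelve bits of the entry do not influence the result.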
We use big pages of size BIG_PG_SZ bytes. Therefore, the big-page and big-byte
indices of a virtual address va are defined as follows.
bpx(va) = ⌊va / BIG_PG_SZ⌋,
bbx(va) = va mod BIG_PG_SZ.
We obtain the big-page table origin for a process p by reading a natural
number from the memory of ISA with devices at address ad^p_bpto.
DEFINITION 2.24 (Big-page table origin)
bptoCVM(cISA, p) = natvars(cISA, ad^p_bpto)
Note that big-page table origins do not store absolute values of addresses,
but only indices within the bpt array. To obtain the absolute address we need
to add the start address of this array, ad_bpt. The i-th big-page table entry
of a process p is defined as the i-th word read from the physical memory
starting from the corresponding big-page table origin.
DEFINITION 2.25 (Big-page table entry)
bpteCVM(cISA, p, i) = natvars(cISA, ad_bpt + bptoCVM(cISA, p) · 4 + i · 4)
The swap memory address is defined as a combination of the respective
big-page table entry and the big-byte index.
DEFINITION 2.26 (Swap memory address)
smaCVM(cISA, p, va) = bpteCVM(cISA, p, bpx(va)) · BIG_PG_SZ + bbx(va)
From a page table entry pte we extract the valid bit at bit position eleven.
DEFINITION 2.27 (Valid bit)
validCVM(pte) = ⌊pte / 2^11⌋ mod 2
The value of this bit signals whether the page we are considering resides in
the main or swap memory. We define the function which makes this decision
for a word address va as follows.
\[
mem(c_{ISA+DS}, p)(va) =
\begin{cases}
\text{intvars}(c_{ISA+DS}.cpu, \text{pmaCVM}(c_{ISA+DS}.cpu, p, va \cdot 4)) & \text{if valid?} \\
\text{intswap}(c_{ISA+DS}.devs, \text{smaCVM}(c_{ISA+DS}.cpu, p, va \cdot 4)) & \text{otherwise}
\end{cases}
\]
Above, valid? = (validCVM(pteCVM(cISA+DS.cpu, p, px(va · 4))) = 1).
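The complete lookup of a user memory word, including the decision between main and swap memory, can be sketched as follows. The sketch makes several assumptions that are not part of the formal theories: both memories are dictionaries from byte addresses to numbers, the page table origin pto, the big-page table origin bpto, and the start address ad_bpt of the bpt array are passed as parameters, and the valid bit is taken at bit position eleven as stated in the prose.

```python
PG_SZ = 4096                    # page size in bytes (4KB)
BIG_PG_SZ = 4 * 1024 * 1024     # big-page size in bytes (4MB)


def valid_cvm(pte: int) -> int:
    """Valid bit of a page table entry (ASSUMPTION: bit position eleven)."""
    return (pte >> 11) & 1


def mem_cvm(phys: dict, swap: dict, pto: int, bpto: int, ad_bpt: int, wa: int) -> int:
    """Content of the user memory word with word address wa."""
    va = wa * 4                                          # byte address of the word
    pte = phys[pto * PG_SZ + (va // PG_SZ) * 4]          # page table entry
    if valid_cvm(pte) == 1:
        pma = (pte // PG_SZ) * PG_SZ + va % PG_SZ        # physical memory address
        return phys[pma]
    bpte = phys[ad_bpt + bpto * 4 + (va // BIG_PG_SZ) * 4]  # big-page table entry
    sma = bpte * BIG_PG_SZ + va % BIG_PG_SZ              # swap memory address
    return swap[sma]


# Hypothetical memory image: virtual page 0 is in main memory, virtual page 1 is swapped out
phys = {
    1024 * PG_SZ + 0: 7000 * PG_SZ + (1 << 11),  # entry 0: valid, physical page 7000
    1024 * PG_SZ + 4: 0,                         # entry 1: invalid
    7000 * PG_SZ + 4: 42,                        # word 1 of physical page 7000
    0x1000: 3,                                   # big-page table entry 0: big page 3
}
swap = {3 * BIG_PG_SZ + PG_SZ: -7}               # first word of virtual page 1 on disk
```

With this image, word address 1 is served from the main memory and word address 1024 (the first word of virtual page 1) from the swap memory.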
Now we are able to define a correctness criterion for user processes. Its idea
is to reconstruct virtual machines for every user process and check whether each
of these machines matches the one specified by the CVM user process component
ups. For this we define an equality check operator for assembly machines.
asm-equal?(c1ASM, c2ASM, vm-size) = …
∧ c1ASM.pc = c2ASM.pc
∧ tl(c1ASM.gpr) = tl(c2ASM.gpr)
∧ stored-spr(c1ASM.spr) = stored-spr(c2ASM.spr)
∧ ∀ a < vm-size : c1ASM.m(a/4) = c2ASM.m(a/4)
Above, stored-spr(regs) = [regs[sr], regs[pto], regs[ptl], regs[mode]] is the
list of those special purpose registers that are stored in process control
blocks or other kernel variables.
The only thing that remains before we can state the desired correctness
relation for user processes is the right choice of the parameter vm-size. For
each process p we should compare only those virtual memory parts that have
been allocated. The size of the allocated memory measured in pages is stored in
the page table length register and is expressed by the formula ptlCVM(cISA, p) + 1.
Ultimately, the relation for user processes, denoted by B(ups, cISA+DS), is
nothing but an equality of the reconstructed assembly machines and all user
processes parametrized with the right amount of virtual memory.
DEFINITION 2.30 (Relation for user processes)
B(ups, cISA+DS) = ∀ 0 < p < MAX_PID :
asm-equal?(vm(cISA+DS, p), ups(p), (ptlCVM(cISA+DS.cpu, p) + 1) · PG_SZ)
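The relation for user processes can be sketched as below. Assembly configurations are modeled as dictionaries, memories as sparse dictionaries from word addresses to values with default 0, and the functions vm and ptl are passed in as parameters; all of this is an assumption of the sketch. Only the conjuncts of the equality check that are listed above are modeled, so the function is an approximation of asm-equal? rather than its definition.

```python
STORED_SPR = ("sr", "pto", "ptl", "mode")  # registers kept in PCBs or kernel variables


def stored_spr(spr: dict) -> list:
    return [spr[r] for r in STORED_SPR]


def asm_equal(c1: dict, c2: dict, vm_size: int) -> bool:
    """Equality of two assembly machines on the first vm_size bytes of memory."""
    return (c1["pc"] == c2["pc"]
            and c1["gpr"][1:] == c2["gpr"][1:]                 # tl: gpr[0] is ignored
            and stored_spr(c1["spr"]) == stored_spr(c2["spr"])
            and all(c1["m"].get(a // 4, 0) == c2["m"].get(a // 4, 0)
                    for a in range(0, vm_size, 4)))            # one check per word


def rel_user_procs(ups: dict, vm, ptl, max_pid: int = 128, pg_sz: int = 4096) -> bool:
    """Every reconstructed machine vm(p) equals ups[p] on its allocated memory."""
    return all(asm_equal(vm(p), ups[p], (ptl(p) + 1) * pg_sz)
               for p in range(1, max_pid))
```

Note that memory beyond the allocated size, register gpr[0], and special purpose registers outside the stored list may differ without violating the relation.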
Implementation invariants. To prove that the CVM abstraction relation holds
throughout CVM executions a number of invariants over the concrete kernel’s
C0 machines as well as the underlying ISA machine with devices have to hold.
These implementation invariants are formally defined in Section 7.1.6 of
Tsyban’s thesis [Tsy09]. All implementation invariants are collected in the
predicate
impl-inv?(ΠAK, cCVM, cSS, cISA+DS).
The formal definition distinguishes cases of user and kernel executions. The
implementation invariants which hold during kernel executions include (i) the
validity of the ISA machine, ISA code invariants code-inv?(ΠAK, cISA), and
the zero-filled page condition zfp-cond?(cISA), (ii) the requirement that the
ISA machine operates in system mode, (iii) the validity of the concrete kernel
C0 configuration, (iv) memory structure invariants of the concrete kernel, and
(v) the C0-ISA ministack relation C0-sim-isa?(teCK(ΠAK), ftCK(ΠAK), cSS, cISA)
discussed in Section 2.3.3.
DEFINITION 2.31 (ISA code invariants)
The ISA code invariants [Tsy09, Definition 7.11] code-inv?(ΠAK, cISA) state
that (i) the instruction stored in the ISA memory at address zero must be
a jump instruction to the beginning of the kernel code, and (ii) the concrete
kernel code translated by the C0 compiler resides in the ISA memory starting
from the address PROGBASE.
DEFINITION 2.32 (Zero-filled page condition)
The zero-filled page condition zfp-cond?(cISA) states that a page filled with
zeros resides at the address ZFP in the memory of the ISA machine cISA.
We discuss the zero-filled page in detail further in this chapter.
CVM correctness theorem. Before we state the correctness theorem of CVM
we need to mention a number of additional definitions. All of them are
introduced in Section 7.2 of Tsyban’s thesis [Tsy09]. The predicate
init-isa?(ΠAK, cISA) claims that cISA is the initial ISA configuration, which
is the first valid configuration after reset. HEAP-SIZEAK is the upper bound on
the heap memory of the abstract kernel. The function asizeheap(hst) [Lei07,
Definition 7.12] computes the allocated size of the heap for a heap symbol
table hst. Finally, for two generalized device configurations d1 and d2 we
denote that both devices are of the same type by d1 ∼type d2.

Assume that (i) the abstract kernel properties hold for ΠAK, (ii) the
processor component cISA+DS.cpu of ISA with devices is in its initial state,
(iii) the swap hard disk properties hold for the devices system cISA+DS.devs,
and (iv) the ISA execution sequence seqISA is valid. Then there exists a valid
CVM execution sequence seqCVM and for any finite number n ≤ N of kernel steps
there exists a number of CVM model steps TCVM, such that by executing this
number of steps starting from the initial CVM configuration the CVM model
transits to a non-error state c′CVM with no heap boundaries violation.
Moreover, there exists a number of ISA with devices steps TISA during which the
ISA machine transits to a resulting state c′ISA+DS and a corresponding
configuration c′SS of the concrete kernel, such that (i) the devices typing and
the swap hard disk properties hold for the updated devices system, (ii) the
implementation invariants hold, and (iii) the CVM abstraction relation holds.

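\[
\begin{aligned}
& \dots \longrightarrow (\exists\, seq_{CVM} : \forall\, n \le N : \exists\, T_{CVM} : \\
& \qquad \text{proc-steps}(seq_{CVM}, T_{CVM}) = n \\
& \qquad \land\; \text{seq}_{\surd}(seq_{CVM}, \text{dsinit}(c_{ISA+DS}.devs)) \\
& \qquad \land\; (\forall\, c'_{CVM} : \\
& \qquad\qquad \delta^{T_{CVM}}_{CVM}(\text{cvminit}(\Pi_{AK}, c_{ISA+DS}.devs), seq_{CVM}) = \lfloor c'_{CVM} \rfloor \\
& \qquad\qquad \land\; \text{asizeheap}(hst(c'_{CVM}.ak.mem)) \le \text{HEAP-SIZE}_{AK} \\
& \qquad\qquad \longrightarrow (\exists\, T_{ISA}, c'_{ISA+DS}, c'_{SS} : \\
& \qquad\qquad\qquad \delta^{T_{ISA}}_{ISA+DS}(c_{ISA+DS}, seq_{ISA}) = c'_{ISA+DS} \\
& \qquad\qquad\qquad \land\; \forall\, did : c'_{ISA+DS}.devs(did) \stackrel{\text{type}}{\sim} c_{ISA+DS}.devs(did) \\
& \qquad\qquad\qquad \land\; \text{hd}_{\surd}(c'_{ISA+DS}.devs) \\
& \qquad\qquad\qquad \land\; \text{impl-inv?}(\Pi_{AK}, c'_{CVM}, c'_{SS}, c'_{ISA+DS}) \\
& \qquad\qquad\qquad \land\; \text{cvm-sim?}(\Pi_{AK}, c'_{CVM}, c'_{SS}, c'_{ISA+DS})))).
\end{aligned}
\]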
Above, we have presented the formal model of CVM and its correctness
theorem: a simulation theorem between a physical ISA machine and virtual
machines of user processes. Most crucially, the correctness of the simulation
theorem depends on the correctness of demand paging, as the page-fault handler
takes care of maintaining the relation for user processes. In the next section
we focus on the demand paging implementation in CVM whereas the remainder of
the thesis will be concerned with its correctness.
2.5 Demand Paging
User processes are modeled as virtual assembly machines in the CVM model.
This means that user processes have an illusion of their own, large, and iso-
lated memory. This is implemented via demand paging based on the VAMP’s
address translation mechanism and a swap hard disk. Memory virtualization
is transparent to user processes: within the CVM model page faults that might
occur during a user step are handled silently by the page-fault handler such
that the user can continue its run. The page table space, a data structure both
accessed by the processor and by software, maintains information whether a
certain page is in the swap or the main memory. Whenever a user processes
accesses a page that is currently swapped out, a page-fault interrupt is trig-
gered by the processor (cf. Definitions 2.7 and 2.8). In response, the page-fault
handler is invoked, which copies the requested page back to the main memory.
To copy pages between the swap disk and the main memory the page-fault
handler relies on a hard-disk driver [Alk09]. Both the handler and the driver
are part of the CVM framework implementation.
In this section we follow the master’s thesis of Condea [Con06] and describe
which data structures are maintained by the demand paging implementation
and outline execution scenarios of the page-fault handler.
2.5.1 Data Structures
Constants. The implementation of demand paging uses the following con-
stants which define the memory map of the kernel.
• Our system configuration, and implicitly the page-fault handler, is
parameterized with two constants PG_SZ and BIG_PG_SZ holding the page size
and the big-page size, respectively. The former, as mentioned before, is set
to 4KB whereas the latter is 4MB. The ratio between big-page and page
size is denoted by PGS_PER_BIG_PG = 1024. Moreover, the page size measured
in words is denoted by PG_SZ_WD = 1024.
• The user memory region is placed at the high end of the physical memory,
which is TOT_PHYS_PGS = 8192 pages large, whereas everything below it
belongs to the system memory. Thus, we keep the index of the first user
memory page or, equivalently, the number of pages given to the kernel in
the constant KERNEL_PGS = 6390. Pages from 0 to KERNEL_PGS − 1 are reserved
for the kernel whereas the remaining USER_PGS = 8192 − 6390 = 1802 pages are
used for user processes. We devote the larger part of the available physical
memory to the kernel because, in order to show competitive performance,
it has to reside completely in the physical memory.
• The last page of the system memory, which resides at page ZFP =
KERNEL_PGS − 1, is permanently filled with zeros to support the
copy-on-write principle.
• The swap memory stores TOT_BIG_PGS = 1152 big pages. The first
BOOT_PGS = 150 pages are reserved for the boot region which stores an
operating-system image.
• The constant PT_START = 4194304 stores the absolute address where the page
table space begins. It must be known in order to appropriately set page
table origins.
• The total amount of virtual memory is kept in the constant TOT_PGS and is
set in our system to 4GB.
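The arithmetic relations between the constants listed above can be checked with a short sketch. The values are those given in the text; the word size of four bytes is taken from the factor 4 used in the page table entry definitions, and the derived quantities (such as the total physical memory size) are computed here only for illustration.

```python
KB, MB = 1024, 1024 * 1024

PG_SZ = 4 * KB                           # page size in bytes
BIG_PG_SZ = 4 * MB                       # big-page size in bytes
PGS_PER_BIG_PG = BIG_PG_SZ // PG_SZ      # pages per big page
PG_SZ_WD = PG_SZ // 4                    # page size in 4-byte words

TOT_PHYS_PGS = 8192                      # physical memory size in pages
KERNEL_PGS = 6390                        # pages 0 .. KERNEL_PGS - 1 belong to the kernel
USER_PGS = TOT_PHYS_PGS - KERNEL_PGS     # pages left for user processes
ZFP = KERNEL_PGS - 1                     # last system page, permanently zero-filled

PT_START = 4194304                       # absolute address of the page table space

# Derived values (not stated in the text, computed for illustration only)
phys_mem_bytes = TOT_PHYS_PGS * PG_SZ    # 32MB of physical memory
pt_start_page = PT_START // PG_SZ        # page index where the page table space begins
```

Under these values the page table space starts at page 1024, which lies inside the kernel region.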
Process control blocks. In order to manage the interleaved execution of user
programs and to distribute the system’s resources among the tasks, the
user-visible and the system information is kept in a table of process control
blocks (PCBs). The table has a fixed size, equal to the maximum number of
supported processes MAX_PID. The data structure for a process control block of
a process pid is declared in Listing 2.1 and has the following components.
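Listing 2.1: Process control block and page descriptor
struct pcb_t {
    int ef[EF_DIM];             /* exception frame */
    unsigned int ihd[IHD_DIM];  /* interrupt handlers */
    int bpto;                   /* big-page table origin */
    int bptl;                   /* big-page table length */
    int empty[46u];             /* padding */
};

struct pd_t {
    unsigned int pid;           /* process identifier */
    unsigned int vpx;           /* virtual page index */
    unsigned int ppx;           /* physical page index */
    struct pd_t *next;          /* next descriptor in the list */
    struct pd_t *prev;          /* previous descriptor in the list */
};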
Exception frame. This is the main save area within the process control block.
For any pid different from the identifier of the current process, or when the
machine runs in system mode and the register save has already completed, the
exception frame array holds the values of the visible CPU registers. These
are thirty-one general-purpose registers and ten special-purpose registers,
which, together with thirty-two items reserved for floating-point registers,
make EF DIM = 73 registers in total. For each register to be stored, there
is a corresponding variable in the PCB. The following registers are not
stored in the exception frame:
• Register gpr[0] since it permanently has value zero.
• The special purpose mode register, because all the processes except the
kernel run in user mode of the processor.
• The emode register, since it is only needed to hold a copy of the mode
register before a context switch from system to user mode of the CPU.
Because CVM has no interrupt nesting, the mode is a binary decision
between the two aforementioned modes of the processor.
• The sr register, because as soon as an interrupt occurs, the hardware saves
the status register sr into the exception status register esr. From there it
is afterwards restored. Keep in mind that in CVM there is no interrupt
nesting. During the system mode execution all maskable interrupts are
masked by setting sr to zero.
• The esr register, since the kernel has a variable SR to store the shared
status-register value.
Interrupt handlers. An array of IHD DIM = 6 unsigned integers is reserved for
handlers of internal interrupts / user-defined signals.
Big-page table origin. It stores the index in the big-page table space indicating
the start of the big-page table for process pid.
Big-page table length. It stores the length of the big-page table for process pid.
Empty array. The empty array at the end of a PCB is used to push the PCB
size to half a megabyte. This places the starting addresses of the individual
PCBs inside the PCB array at multiples of a power of two, which allows
efficient address computations with logical shift operations.
Of all the data contained in the process control block, the exception frame
elements at indices PTO = 72 and PTL = 73 as well as the bpto and bptl
fields are of particular relevance for page fault handling and memory
management. Note that the page table length register also distinguishes
between an active and an inactive task: a value of -1 denotes that the task
is inactive, whereas any non-negative value denotes an active process.
Page table space and big-page table space. The page tables of the active tasks
are stored in a region called page table space. They must not overlap and
they must be page-aligned. In our system, the total virtual memory TOT PGS
is limited to 4GB and the page size is 4KB. For each allocated virtual page
there must exist a corresponding page table entry in the page table space.
Without wasting any space, we would need to store $2^{20}$ page table
entries, each 4 bytes long. This takes up $4 \cdot 2^{20} / 2^{12} = 1024$
pages of physical memory. However, due to the page alignment of the page
tables, there is a maximum waste of one page per process. Since our system
supports MAX PID processes, the total waste is at most 128 pages. As a
result, the accumulated size of the page table space, including the potential
waste, is 1024+128 = 1152 pages of physical memory.
With the above considerations, we declare the page table space as a two-
dimensional array on the heap pt[TOT PGS PT][PTES PER PG] with dimensions:
• TOT PGS PT = 1152 which represents the upper bound on the number of
pages occupied by the page table space, and
• PTES PER PG = 1024 which represents the number of page table entries
contained in a page.
Let us now estimate the memory requirements for the big-page tables. The
big-page tables of active tasks are stored in a region called big-page table space
and they must also not overlap.
Generally speaking, for every big-page of virtual memory there must exist
a corresponding big-page table entry in the big-page table space. With a total
virtual memory TOT PGS limited to 4GB and a big-page size of 4MB, we would
consequently need $2^{32} / 2^{22} = 1024$ big-page table entries, each
4 bytes long.
However, there is a difference with respect to the granularity of the virtual
memory and the swap memory: while the former is partitioned in pages, the
latter is partitioned in big-pages. Therefore, to store the virtual memory in
the swap file, we would waste at most one swap memory big-page per process.
Since our system supports MAX PID processes, the total waste accumulates to
at most 128 swap memory big-pages. As a result, we set the total number of
swap memory big-pages TOT BIG PGS to 1024+128 = 1152 and define the big-page
table space as a one-dimensional array of this size: bpt[TOT BIG PGS].
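Both size estimates follow the same pattern: the exact requirement plus one unit of waste per process. The sketch below reproduces the arithmetic. The function names are hypothetical, and MAX PID = 128 is an assumption inferred from the stated total waste of 128 pages.

```c
#include <assert.h>

/* Pages needed for the page table space: one entry per virtual page,
   plus at most one page lost to alignment for each process. */
static unsigned long pt_space_pages(unsigned long long virt_bytes,
                                    unsigned long page_bytes,
                                    unsigned long pte_bytes,
                                    unsigned long max_pid) {
    unsigned long long entries = virt_bytes / page_bytes;   /* 2^20 entries */
    unsigned long long bytes   = entries * pte_bytes;       /* 4 * 2^20 bytes */
    return (unsigned long)(bytes / page_bytes) + max_pid;
}

/* Swap big-pages needed: one per virtual big-page, plus at most one
   partially used big-page for each process. */
static unsigned long swap_big_pages(unsigned long long virt_bytes,
                                    unsigned long big_page_bytes,
                                    unsigned long max_pid) {
    return (unsigned long)(virt_bytes / big_page_bytes) + max_pid;
}
```

With 4GB of virtual memory, 4KB pages, 4-byte entries, and 128 processes, both functions yield 1152, matching TOT PGS PT and TOT BIG PGS.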
User memory management. Starting from the page with index KERNEL PGS,
the user memory continues to the upper end of the physical memory. We define
simple structures, called page descriptors, in order to manage the user pages
(see Listing 2.1). Each page descriptor pd corresponds to exactly one physical
page index of the user memory and has the following elements:
• a process identifier pid and a virtual page index vpx, which identify the
process and the virtual page to which the physical page belongs if it is used,
• a physical page index ppx which points to the user page in the physical
memory, and
• pointers next and prev to organize doubly-linked lists of page descriptors.
Each page descriptor is associated with a physical page belonging to the
user memory. These pages fall into two categories.
• User memory pages that store a virtual page. These pages, as well as
their associated page descriptors, are called active.
• Free pages of user memory. Their page descriptors are also called free.
Both categories are managed using lists of page descriptors. The active
pages are maintained in a so-called active list, whereas the free pages are main-
tained in a free list. The implementation variables for these lists are active
and free, respectively.
Swap memory management. The swap memory is allocated to processes in
big pages of size 4MB. To accommodate a maximum total virtual memory
TOT PGS of 4GB, there must be, as previously pointed out, TOT BIG PGS big-
pages of swap memory.
To manage the swap memory, we use a stack of free big-pages bpfree. This
is realized by means of a statically defined array of size TOT BIG PGS which is
always accessed at index bpages free−1, the free big-page stack pointer. The
stack consists in fact only of the array entries with an index lower than or
equal to the stack pointer. Each entry on the stack contains the (unique)
index of a free swap memory big-page.
Initially, all the swap big-pages are free, so the stack bpfree holds all
the swap big-page indices from 0 to TOT BIG PGS−1. At allocation, when an
extra swap memory big-page is needed to store a virtual big-page, the big-page
table entry corresponding to the virtual big-page is assigned the value of the
topmost entry in bpfree and then the stack pointer is decremented. Conversely,
at deallocation, the stack pointer is incremented and then the swap memory
big-page index which is stored in the big-page table entry of the released
virtual big-page is assigned to the topmost entry of bpfree.
Please keep in mind that the stack pointer will always be non-negative.
This is ensured by setting TOT BIG PGS sufficiently big with respect to the
total virtual memory size of our system.
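The stack discipline described above can be sketched as follows. This is an illustration under assumptions, not the verified implementation: the function names and the explicit bounds checks are hypothetical, and bpages_free is taken to be the number of entries currently on the stack.

```c
#include <assert.h>

#define TOT_BIG_PGS 1152u

static unsigned bpfree[TOT_BIG_PGS];  /* stack of free swap big-page indices */
static unsigned bpages_free;          /* number of entries on the stack */

/* Initially every swap big-page is free. */
static void bpfree_init(void) {
    unsigned i;
    for (i = 0; i < TOT_BIG_PGS; i++) bpfree[i] = i;
    bpages_free = TOT_BIG_PGS;
}

/* Allocation: read the topmost entry, then decrement the stack pointer.
   Returns 0 on success, -1 if the stack is empty (sketch-only guard). */
static int bp_alloc(unsigned *bpx) {
    if (bpages_free == 0) return -1;
    *bpx = bpfree[bpages_free - 1];
    bpages_free--;
    return 0;
}

/* Deallocation: increment the stack pointer, then store the released index.
   Returns 0 on success, -1 if the stack is already full (sketch-only guard). */
static int bp_free(unsigned bpx) {
    if (bpages_free == TOT_BIG_PGS) return -1;
    bpages_free++;
    bpfree[bpages_free - 1] = bpx;
    return 0;
}
```

In the text the empty-stack case is excluded by choosing TOT BIG PGS large enough. The guards here only keep the sketch safe to run in isolation.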
Reverse lookup array. When a process voluntarily releases (portions of) its
virtual memory, the user memory pages that possibly used to store the released
virtual pages should also be freed. Therefore, we must remove all the physical
page descriptors associated with these user pages from the active list. Normally,
we would have to search the entire list for a page descriptor with a specific
physical page index, which is inefficient. In order to avoid the search time
we introduce an array storing pointers to all page descriptors. This array will
be used as a reverse lookup table, i.e., for associating a physical page index
to a page descriptor. We declare the array struct pd t* ppx2pd[TOT PHYS PGS]
which for all user pages with indices ppx ∈ {KERNEL PGS, . . . , TOT PHYS PGS−1}
maintains that ppx2pd[ppx] points to a page descriptor with the physical page
index field of value ppx.
Note that we only need to manipulate page descriptors associated with
user pages. Therefore, the first KERNEL PGS elements of the ppx2pd array do
not store valid pointers to page descriptors. In fact, they are never initialized.
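The benefit of the reverse lookup array shows when a page is released: the descriptor is found by indexing and unlinked from the doubly-linked list in constant time. The sketch below illustrates this. The list handle struct pd_list and all function names are hypothetical, and static C arrays are zero-initialized here, whereas the kernel leaves the first KERNEL PGS entries uninitialized.

```c
#include <assert.h>
#include <stddef.h>

#define TOT_PHYS_PGS 8192u
#define KERNEL_PGS   6390u
#define USER_PGS     (TOT_PHYS_PGS - KERNEL_PGS)

struct pd_t {
    unsigned int pid, vpx, ppx;
    struct pd_t *next, *prev;
};

struct pd_list { struct pd_t *head, *tail; };  /* hypothetical list handle */

static struct pd_t pds[USER_PGS];          /* one descriptor per user page */
static struct pd_t *ppx2pd[TOT_PHYS_PGS];  /* reverse lookup: ppx -> descriptor */

/* Append pd at the tail of list l. */
static void list_append(struct pd_list *l, struct pd_t *pd) {
    pd->next = NULL;
    pd->prev = l->tail;
    if (l->tail) l->tail->next = pd; else l->head = pd;
    l->tail = pd;
}

/* Unlink pd from list l in constant time. */
static void list_remove(struct pd_list *l, struct pd_t *pd) {
    if (pd->prev) pd->prev->next = pd->next; else l->head = pd->next;
    if (pd->next) pd->next->prev = pd->prev; else l->tail = pd->prev;
    pd->next = pd->prev = NULL;
}

/* Put every user page on the free list and fill the lookup array. */
static void pd_init(struct pd_list *free_list) {
    unsigned i;
    free_list->head = free_list->tail = NULL;
    for (i = 0; i < USER_PGS; i++) {
        pds[i].ppx = KERNEL_PGS + i;
        ppx2pd[KERNEL_PGS + i] = &pds[i];
        list_append(free_list, &pds[i]);
    }
}

/* Release user page ppx: no list search, just an array lookup. */
static void release_page(struct pd_list *active, struct pd_list *free_list,
                         unsigned ppx) {
    struct pd_t *pd = ppx2pd[ppx];
    list_remove(active, pd);
    list_append(free_list, pd);
}
```

Without ppx2pd, release_page would have to walk the active list until it finds the descriptor whose ppx field matches.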
Miscellaneous variables. The system also maintains the following variables.
The first two are used by the page-fault handler and the memory management
primitives, whereas the remaining three are used only by other parts of the
CVM code.
pages used    Number of currently used virtual pages.
pages free    Number of currently free physical pages.
SR            Device interrupt and overflow mask.
cup           Current user process identifier.
kheap         Kernel heap pointer.
2.5.2 Execution Scenarios
As mentioned above, two page descriptor lists are used by the page-fault han-
dler, namely the active list and the free list. Elements of the free list correspond
to physical pages in user memory that are currently not used by any process.
At startup, the free list contains all user memory pages. Complementarily,
elements of the active list correspond to allocated user pages. Every element
of the active list is associated with a physical page index ppx, a process
identifier pid, and a virtual page index vpx. For such an element in the
active list, the page table entry of vpx which belongs to task pid is valid
and points to ppx. The order of the elements in the active list is
significant: the list is used as a queue. Elements are created at its tail on
swap in, and they are taken away from its head on swap out. Hence, the page
fault handler implements a FIFO scheme that always selects the oldest virtual
page in user memory for eviction (swap out).
Page faults are exceptions raised by the hardware and delivered to the
software in the cases described by Definitions 2.7 and 2.8. On the software
layer, nevertheless, these hardware page faults receive different
interpretations. This depends on the complexity of the handler and the
features it supports. For instance, a page fault caused by a memory access
to a page not loaded in main memory might be identified by dissimilar
conditions in two distinct page fault handlers. As a result, we call software
page faults the collection of situations which a page fault handler is able
to treat. Every software page fault is based on a hardware page fault. Our
page-fault handler can handle two software page faults.
1. The invalid access page fault. It occurs when the accessed virtual page
does not reside in physical memory, which is signaled by an unset valid bit.
2. The zero protection page fault. It occurs on the very first user write
operation on a virtual page after its allocation. Additional conditions
for it are that the protection bit is set and that the physical page index
in the associated page table entry points to ZFP.
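The two cases can be told apart from the page table entry alone. The following sketch makes the conditions explicit. The decoded struct pte, the enum, and the function classify are hypothetical, since the real bit layout of a page table entry is not given in this section.

```c
#include <assert.h>

#define KERNEL_PGS 6390u
#define ZFP        (KERNEL_PGS - 1u)   /* index of the zero-filled page */

/* Hypothetical decoded page table entry. */
struct pte {
    unsigned ppx;     /* physical page index */
    int valid;        /* valid bit */
    int protected_;   /* protection bit */
};

enum sw_fault { NO_FAULT, INVALID_ACCESS, ZERO_PROTECTION };

/* Classify an access to the page described by e.
   is_write is nonzero for a write access. */
static enum sw_fault classify(struct pte e, int is_write) {
    if (!e.valid)
        return INVALID_ACCESS;      /* page is not in physical memory */
    if (is_write && e.protected_ && e.ppx == ZFP)
        return ZERO_PROTECTION;     /* first write after allocation */
    return NO_FAULT;                /* neither software page fault applies */
}
```

A freshly allocated page (valid, protected, mapped to ZFP) can thus be read without a fault, while the first write to it is classified as a zero protection fault.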
At allocation, virtual pages point to the zero-filled page and both the valid
and the protection bit in their page table entries are set. Pages that map to
the zero-filled page have a page descriptor neither in the active list nor in
the free list. The rights of the newly allocated pages allow for user read
access. However, the first user write access to such a page generates a zero
protection fault. In response, one page of the user memory must be filled
with zeros and its associated page descriptor must be included in the active
list. This implements the so-called copy-on-write principle. On swap in,
which is triggered by page faults subsequent to the zero protection fault,
pages are moved from the free list to the active list. Observe that at this
point the protection bit is not set in the respective page table entry, since
the only possible page fault from now on is the invalid access page fault. On
swap out, the victim pages are moved from the active list back to the free
list. The page table entry corresponding to the swapped-out virtual page is
entirely cleared. Pages are also moved from the active to the free list in
case of memory release. In this case, however, the page table entries are not
explicitly cleared. They simply become inaccessible due to the decrease of
the page table length.
To offer a better understanding of the page-fault handler we depict in Fig-
ure 2.1 its mechanism drawing attention to all possible page descriptor move-
ments between lists together with the corresponding conditions and effects on
the page table entries.
The page-fault handler is implemented as a function pfh touch addr which
receives four input arguments: the process identifier pid, the exception virtual
address addr, the intention of a call to the handler intent, and the number
count of calls to the handler during which the page specified by pid and addr
must survive in the physical memory. The page-fault handler returns (in a non-
error case) a translated physical address for the pair (pid,vpx) represented as
an unsigned integer.
The function pfh touch addr serves as a single entry point for all cases
where the kernel has or might have to swap in a page. That means the func-
tion pfh touch addr is invoked not only to handle user page faults but also to
perform all (software) address translations in the kernel implementation. We
distinguish four different intentions intent of a handler’s call: (i) READ: the
page in question should be readable afterwards, (ii) WRITE: the page in
question can afterwards be read and written, (iii) OVERWRITE: the caller
intends to overwrite the entire page after the call, and (iv) SWAP IN: a
page fault has occurred at the given address and the function will swap in
the page for arbitrary accesses. We describe the page-fault handler
implementation in detail in Section 5.6 where we give additional information
on the role of its parameters intent and count. Below we describe the handler
functionality in case a user page fault happens for the pair pid and addr.
[Figure 2.1: Page movements by the page-fault handler. Labels recoverable
from the figure: page allocation for (pid, va); ppx(pid, va) = ZFP;
zero-protection page fault.]
The execution of the page-fault handler can be divided into three steps:
• First, we decide whether the exception operation was legal or illegal. By
convention, an operation is illegal if we have a page table length exception.
If this is the case, we abort the execution of the handler and exit with
the code INVALID ADDR. Otherwise, we continue.
• Second, we determine the physical page index in the user memory that
we intend to use for swap in. At this point, there are two possibilities
for choosing this index. If there are still unused free pages in the user
memory, we take one of their indices. Otherwise, we select and swap out
one of the pages in user memory. By the first-in-first-out (FIFO) strategy
we select for eviction the page associated with the head of the active list.
• Third, we check whether the exception page already lies in the physical
memory, i.e., if the valid bit of the corresponding page table entry is set.
If so, a zero protection page fault took place and therefore, we fill the
target page determined in the second step with zeros. If the valid bit is
not set, we simply swap in the exception page. Finally, we update the
page table entry corresponding to the exception page accordingly.
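The three steps can be sketched as a small simulation. This is a simplified model under stated assumptions, not the verified pfh touch addr: it omits the intent and count parameters, replaces disk transfers and zero filling by counters, uses a tiny user memory of two frames, stores the page table length as a number of entries, and assumes the function is only called when a page fault actually occurred. All names are hypothetical.

```c
#include <assert.h>

enum { NPROC = 2, NVPAGES = 8, NFRAMES = 2 };
enum { ZFP = 1000 };          /* stand-in index of the zero-filled page */
enum { INVALID_ADDR = -1 };

struct pte { int valid, prot, ppx; };

static struct pte pt[NPROC][NVPAGES];        /* page tables */
static int ptl[NPROC];                       /* page table lengths (entries) */
static int owner_pid[NFRAMES], owner_vpx[NFRAMES];
static int free_frames[NFRAMES], nfree;      /* free list, modeled as a stack */
static int active[NFRAMES], head, nactive;   /* active list, a FIFO ring */
static int swap_ins, swap_outs, zero_fills;  /* stubs for disk I/O and filling */

static void pfh_init(void) {
    int i;
    for (i = 0; i < NFRAMES; i++) free_frames[i] = i;
    nfree = NFRAMES;
    head = nactive = 0;
    swap_ins = swap_outs = zero_fills = 0;
}

/* Allocation: the page maps to ZFP with valid and protection bits set. */
static void alloc_page(int pid, int vpx) {
    pt[pid][vpx].valid = 1;
    pt[pid][vpx].prot = 1;
    pt[pid][vpx].ppx = ZFP;
    if (ptl[pid] < vpx + 1) ptl[pid] = vpx + 1;
}

/* Handle a user page fault for (pid, vpx); returns the frame index used. */
static int pfh(int pid, int vpx) {
    int ppx;
    /* Step 1: a page table length exception makes the access illegal. */
    if (vpx < 0 || vpx >= ptl[pid]) return INVALID_ADDR;
    /* Step 2: take a free frame, or evict the head of the active list. */
    if (nfree > 0) {
        ppx = free_frames[--nfree];
    } else {
        struct pte *victim;
        ppx = active[head];
        head = (head + 1) % NFRAMES;
        nactive--;
        swap_outs++;                       /* stub: write the victim to swap */
        victim = &pt[owner_pid[ppx]][owner_vpx[ppx]];
        victim->valid = 0; victim->prot = 0; victim->ppx = 0;
    }
    /* Step 3: a set valid bit means zero protection fault, else swap in. */
    if (pt[pid][vpx].valid) zero_fills++; else swap_ins++;
    owner_pid[ppx] = pid;
    owner_vpx[ppx] = vpx;
    active[(head + nactive) % NFRAMES] = ppx;   /* enqueue at the tail */
    nactive++;
    pt[pid][vpx].valid = 1;
    pt[pid][vpx].prot = 0;
    pt[pid][vpx].ppx = ppx;
    return ppx;
}
```

With two frames, a third zero protection fault already forces an eviction of the oldest page, which illustrates the FIFO choice of the victim.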
2.5.3 Approach to Correctness Proof
So far we have presented the problem of virtual memory simulation, introduced
a correctness statement for this problem — the CVM correctness theorem, and
elaborated how demand paging is implemented in CVM in order to support
virtual memory. The goal of the remainder of this thesis is to formulate and
prove correctness theorems of demand paging such that they could be applied
in the context of CVM’s proof.
The target semantics of the CVM correctness theorem (Theorem 2.33) is
VAMP ISA. The demand paging is implemented in C0 with calls to hard-disk
drivers which contain assembly code. We could have verified this
implementation at the level of the C0 small-step semantics and, using the
simulation theorem between C0 and VAMP ISA, projected the obtained result to
the hardware level. However, proving the correctness of programs of the size
and complexity of the demand paging implementation is a big effort. In order
to ease this problem we introduce a so-called C0 semantics stack in the next
chapter (Chapter 3). The stack provides convenient means to prove correctness
of implementations not in the C0 small-step semantics but rather on the more
abstract level of the language Simpl in Hoare logic. The correctness results
obtained on the level of Simpl can be transferred, via an intermediate level
of C0 big-step semantics, down to the C0 small-step semantics level. Moreover,
the stack handles routines with inline assembly portions by abstracting them
to extended calls (XCalls).
The overall approach to proving correctness of demand paging is shown in
Figure 2.2. We first specify and prove all necessary and sufficient properties in
terms of the PFH automaton as discussed further in Chapter 4. By proving in
Hoare logic that the abstraction mapping holds in the state after executing the
page-fault handler provided it holds before, we obtain the desired properties
at the level of Simpl (Chapter 5). We map these results — via the C0 big-
step semantics — to the level of the (extended) C0 small-step semantics. We
justify the XCalls to the hard disk driver by plugging in their correctness
result [Alk09]. Next, with the help of the simulation theorem of C0 by ISA
we are able to state the page-fault handler functional correctness in terms of
the ISA semantics. This allows us to prove the page-fault handler top-level


















[Chapter 3 opening page. Partially recoverable contents listing: ". . . from
Simpl to BS"; "3.5 Property Transfer from BS to SS".]
In order to support pervasive verification of system software a stack of
semantics for the C0 language has been carefully crafted in Verisoft. The C0
semantics stack comprises a Hoare logic, a big-step semantics, and a
small-step semantics. By a higher level of abstraction in the Hoare logic
compared to the small-step semantics, we gain efficiency for the verification
of individual C0 programs. However, we have to integrate the results obtained
in the Hoare logic into our systems stack. Certain simulation theorems allow
us to transfer program properties from the Hoare logic down to the small-step
semantics. In this chapter we first introduce Simpl, a generic imperative
language for which a Hoare logic was defined. In the previous chapter we have
already defined the C0 small-step semantics; therefore, it remains only to
introduce the big-step semantics to cover all layers of the semantics stack.
Finally, we present simulation theorems between the stack's layers. They will
allow us to transfer correctness properties of programs between the layers.
42 Leveraging a Semantics Stack
The Hoare logic provides sufficient means to reason about pre- and postcon-
ditions of sequential, type-safe, and assembly-free C0 programs. The small-step
semantics level allows integration with inline assembly code. The big-step se-
mantics is a bridging layer, which is convenient to express results of the Hoare
logic operationally. The core differences of the semantical layers of C0 can be
summarized as follows:
• Hoare logic: split heap, compound values, implicit typing.
• Big step: single monolithic heap, compound values, explicit typing.
• Small step: single monolithic heap, flat values, explicit typing.
These design decisions reflect the purpose of the layers. The Hoare logic is
tuned to support verification of individual programs, whereas the small-step
semantics is nearer to the architecture level.
The levels below C0 allow integration of devices, or communication between
memory parts which do not belong to the program range. Our approach is to
abstract effects of those low-level computations into atomic XCalls (extended
calls) in all semantics layers. The state space of C0 is augmented with an
additional component that represents the state of the external component,
e.g., the device or parts of physical memory. An XCall is a procedure call
that makes a transition on this external state and communicates with C0 via
parameter passing and return values. With this model it is straightforward to
integrate XCalls into the semantics and into Hoare logic reasoning. Bodies of
XCalls are typically implemented in assembly. An implementation proof of this
piece of assembly justifies the abstraction to an atomic XCall. In this chapter
we introduce individual layers of the stack as well as equivalence theorems
between them.
In order to reason about programs on each of these levels a program’s source
code has to be correspondingly represented in Isabelle/HOL. Such representa-
tions are generated automatically by a translation tool developed in the frame
of this work.
While discussing Simpl, big-step semantics, and their equivalence we follow
the doctoral thesis of Schirmer [Sch06]. Our presentation of Hoare logic is based
on the doctoral thesis of Petrova [Pet07]. The section about equivalence be-
tween big-step and small-step semantics is guided by a JAR article [AHL+09].
3.1 Simpl
Simpl is a rather general sequential imperative programming language that
is not fixed to a specific real programming language like C, Pascal or Java.
It is more like a model for imperative programs that allows us to embed real
programming languages for the purpose of program verification. Simpl makes
no assumptions on the state space: it is polymorphic. We will denote the state
space by the type variable $C_H$, which is eventually instantiated with a
program-dependent record. Basic actions like variable assignments or memory
allocation are arbitrary state updates of type $C_H \mapsto C_H$.
Abstract syntax. Commands of Simpl are defined by the polymorphic data
type com($C_H$) parametrized over the state space:
Skip              Do nothing.
Basic(f)          Basic command; $f :: C_H \mapsto C_H$ is a state update.
Seq(s1, s2)       Sequential composition; $s_1$ and $s_2$ are of type com($C_H$).
Cond(b, s1, s2)   Conditional statement; $s_1, s_2$ :: com($C_H$) and the boolean
                  condition b is modeled as a state set of type pow($C_H$).
While(b, s)       Loop; s :: com($C_H$) and b :: pow($C_H$).
Call(p)           Procedure call; p is a procedure name of type S.
Guard(g, s)       Guarded command; g :: pow($C_H$) and s :: com($C_H$).
State. The core state space is polymorphic and denoted by the type variable
$C_H$. It includes the extended state of XCalls as an ordinary record
component. In order to define the semantics, the state space is augmented with
an error state $\bot$, i.e., the augmented state space is $C_{H\bot}$.
Executions start in some normal state $\lfloor c_H \rfloor$. In case a guard
Guard(g, s) is violated, the runtime fault is signaled by the state $\bot$.
Moreover, execution can get stuck because of a call to an undefined
procedure.
Semantics. The operational semantics of Simpl
$$\Gamma \vdash_H \langle s, c_H \rangle \Rightarrow c'_H$$
is defined inductively by the rules in Figure 2.1 of Schirmer's thesis (see
footnote 1). It means that in the procedure environment $\Gamma$ the execution
of command s transforms the initial state $c_H$ into the final state $c'_H$,
where $\Gamma :: S \mapsto$ com($C_H$), $c_H, c'_H :: C_{H\bot}$, and
s :: com($C_H$).
XCalls are specified as basic commands Basic(f) where the function
$f :: C_H \mapsto C_H$ updates the extended state component of $C_H$ (and
possibly some other variables).
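To make the command constructors concrete, the sketch below encodes a fragment of the abstract syntax as a C data structure with a small interpreter. It is an illustration only and rests on several assumptions: the state record with fields x and y is invented, state sets are represented by predicate functions, the interpreter is deterministic, Call and the procedure environment are omitted, and the fault state is signaled by the return value -1.

```c
#include <assert.h>
#include <stddef.h>

struct state { int x, y; };   /* hypothetical program-specific state record */

typedef void (*update_fn)(struct state *);       /* state update f */
typedef int  (*pred_fn)(const struct state *);   /* state set as a predicate */

enum tag { SKIP, BASIC, SEQ, COND, WHILE, GUARD };

struct com {
    enum tag tag;
    update_fn f;                 /* used by BASIC */
    pred_fn b;                   /* used by COND, WHILE, GUARD */
    const struct com *s1, *s2;   /* sub-commands */
};

/* Execute c on st. Returns 0 for a normal final state, -1 for a fault. */
static int exec(const struct com *c, struct state *st) {
    switch (c->tag) {
    case SKIP:  return 0;
    case BASIC: c->f(st); return 0;
    case SEQ:   return exec(c->s1, st) ? -1 : exec(c->s2, st);
    case COND:  return exec(c->b(st) ? c->s1 : c->s2, st);
    case WHILE:
        while (c->b(st))
            if (exec(c->s1, st)) return -1;
        return 0;
    case GUARD: return c->b(st) ? exec(c->s1, st) : -1;  /* violated guard */
    }
    return -1;
}

/* Example program: Guard(x >= 0, While(x > 0, Seq(x := x - 1, y := y + 2))). */
static void dec_x(struct state *s)  { s->x--; }
static void add2_y(struct state *s) { s->y += 2; }
static int x_pos(const struct state *s)    { return s->x > 0; }
static int x_nonneg(const struct state *s) { return s->x >= 0; }

static const struct com c_dec  = { BASIC, dec_x,  NULL,     NULL,    NULL };
static const struct com c_add  = { BASIC, add2_y, NULL,     NULL,    NULL };
static const struct com c_body = { SEQ,   NULL,   NULL,     &c_dec,  &c_add };
static const struct com c_loop = { WHILE, NULL,   x_pos,    &c_body, NULL };
static const struct com c_prog = { GUARD, NULL,   x_nonneg, &c_loop, NULL };
```

Starting c_prog in a state with x = 3 and y = 0 ends normally with x = 0 and y = 6, whereas a start state with negative x violates the guard and yields the fault result.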
Termination. To verify total correctness of a program one needs to show that
the program terminates for all valid inputs. To guarantee termination of a
Simpl program it is not sufficient to require the existence of a terminating
computation: $\exists\, c'_H : \Gamma \vdash_H \langle s, c_H \rangle \Rightarrow c'_H$.
Due to nondeterminism this does not guarantee that all computations from the
same initial state $c_H$ terminate. Guaranteed termination
$$\Gamma \vdash_H s \downarrow c_H$$
of program s in the initial state $c_H$ is defined inductively by the rules in
Figure 2.2 of Schirmer's thesis. If statement s terminates when started in
state $c_H$, then there exists a final state $c'_H$ with respect to the
semantics:
$$\Gamma \vdash_H s \downarrow c_H \longrightarrow \exists\, c'_H : \Gamma \vdash_H \langle s, c_H \rangle \Rightarrow c'_H.$$
The other direction is not valid since Simpl is nondeterministic: the
execution branch that is taken depends on the particular input data.
Footnote 1: Schirmer uses just a turnstile symbol $\vdash$. We extend it with
the subscript "H" to stress that it corresponds to Hoare logic, as we will also
have versions for big-step and small-step semantics, signaled by the subscripts
"BS" and "SS", respectively.
Heap and pointers. The heap model we use excludes explicit address arithmetic
but it is capable of representing typical heap structures like lists. We
follow the split heap approach that goes back to Burstall [Bur72] and was
recently taken up by Mehta and Nipkow [MN03]. The main benefit of this heap
model is that it already excludes aliasing between pointers of unequal type
or to different structure fields. The typed view of memory is hard-wired into
the model. That is why it is not possible to properly express low-level untyped
operations like pointer arithmetic in it. To highlight that we do not calculate
with pointers we introduce a type ref-t of references. It is isomorphic to the
natural numbers. We declare the reference NULL as a constant without any
definition; it is just one value among the references. To model allocation and
deallocation we need some bookkeeping of allocated references. We achieve this
by introducing a list of allocated references alloc to the state space. To model
the allocation of a new reference we use the function new :: pow(ref-t)
$\mapsto$ ref-t, which for a set of references A gives a fresh reference
a = new(A) with the condition $a \notin \{NULL\} \cup A$. Since type ref-t is
isomorphic to the natural numbers we have infinitely many references.
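One possible realization of new is to return the least reference that is neither NULL nor allocated. The sketch below is an assumption-laden illustration: the text leaves NULL without a definition, so the choice REF_NULL = 0 is arbitrary, the allocated set is passed as an array, and the names ref_t, is_allocated and new_ref are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned long ref_t;   /* references, isomorphic to the naturals */
#define REF_NULL 0ul           /* assumption: the text leaves NULL undefined */

/* Is reference r contained in the first n entries of alloc? */
static int is_allocated(const ref_t *alloc, size_t n, ref_t r) {
    size_t i;
    for (i = 0; i < n; i++)
        if (alloc[i] == r) return 1;
    return 0;
}

/* Return the least reference that is neither NULL nor in alloc.
   The loop terminates because at most n + 1 values are excluded. */
static ref_t new_ref(const ref_t *alloc, size_t n) {
    ref_t r = 0;
    for (;;) {
        if (r != REF_NULL && !is_allocated(alloc, n, r)) return r;
        r++;
    }
}
```

Any other choice function satisfying $a \notin \{NULL\} \cup A$ would serve equally well. The model only relies on freshness, not on the particular value returned.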
3.2 Hoare Logic
Hoare triples. The Hoare logic for Simpl is inductively defined by the rules in
Figure 3.1 of Schirmer's thesis. The judgment for partial correctness has the
following form:
$$\Gamma \vdash_H P\; s\; Q.$$
The intended formal semantics of a Hoare triple is defined by the notion of
validity:
$$\Gamma \models_H P\; s\; Q \;=\; \forall\, c_H, c'_H :\; \Gamma \vdash_H \langle s, c_H \rangle \Rightarrow c'_H \,\wedge\, c_H \in \{\lfloor x \rfloor : x \in P\} \longrightarrow c'_H \in \{\lfloor x \rfloor : x \in Q\}.$$
Given an execution of statement s from the initial state $c_H$ to the final
state $c'_H$, provided that $c_H$ is some state satisfying the precondition P,
$c'_H$ is a state satisfying Q. Validity and the Hoare logic are related by two
important theorems:
• Soundness: $\Gamma \vdash_H P\; s\; Q \longrightarrow \Gamma \models_H P\; s\; Q$, and
• Completeness: $\Gamma \models_H P\; s\; Q \longrightarrow \Gamma \vdash_H P\; s\; Q$.
Total correctness means partial correctness plus termination. This is
directly reflected in the validity notion for total correctness:
$$\Gamma \models^t_H P\; s\; Q \;=\; \Gamma \vdash_H P\; s\; Q \,\wedge\, \forall\, c'_H \in \{\lfloor x \rfloor : x \in P\} : \Gamma \vdash_H s \downarrow c'_H.$$
The various judgments for total correctness are distinguished from partial
correctness by the superscript "t". The total correctness Hoare logic for
Simpl is inductively defined by the rules in Figure 3.3 in the thesis of
Schirmer. The judgment has the following form:
$$\Gamma \vdash^t_H P\; s\; Q.$$
Here we also have soundness and completeness:
$$\Gamma \vdash^t_H P\; s\; Q \longrightarrow \Gamma \models^t_H P\; s\; Q, \qquad \Gamma \models^t_H P\; s\; Q \longrightarrow \Gamma \vdash^t_H P\; s\; Q.$$
Keeping track of modified global and heap variables. Let us interpret sets as
predicates, i.e., P(x) means x ∈ P. We extend Hoare triples and judgments
for them with the following notation
$$\Gamma \models^t_H P(c_H)\; s\; Q(c'_H) \cap \Delta(c_H, c'_H) = \{a, b, c, \ldots\}.$$
It means that during the execution of a statement s, i.e., a transition
from $c_H$ to $c'_H$, only the global or heap variables a, b, c, etc. were
modified.
VCG: Verification condition generator. Since the Hoare logic is defined induc-
tively by the rules for every language constructor, we can apply them backwards
to a statement in order to decompose it to the atomic ones (Skip and Basic).
The idea of the verification condition generator (VCG) is the following:
for a judgment $\Gamma \vdash_H P\; s\; Q$ that we want to prove, we
automatically apply the Hoare rules until the program s is completely
eliminated and a purely logical proof claim $P \subseteq WP$ remains. WP is a
weakest precondition that is computed from the given postcondition Q by
backward rule application. So, the weakest precondition is a set of states
which has only the properties that are necessary in order to guarantee that
the execution of the program ends in a state where the postcondition holds.
In order to be able to apply all the rules automatically we need a version of
the rules that can be applied to any Hoare triple and can compute its weakest
precondition. Therefore, for every language statement we provide a rule of the
following format:
$$\frac{P \subseteq WP \qquad T_1, \ldots, T_i}{\Gamma \vdash_H P\; s\; Q},$$
where $T_1, \ldots, T_i$ are side conditions needed to compute the weakest
precondition WP. Moreover, for each modified rule there exists a proof that it
can be deduced from the original ones.
• Let us consider the transformation of the original rule for Basic as an
example:
$$\Gamma \vdash_H \{c_H \mid f(c_H) \in Q\}\; Basic(f)\; Q$$
The postcondition can actually be arbitrary and the precondition
is the weakest precondition for the Basic statement, but we cannot expect
that the triple we want to prove exactly matches this precondition. Thus,
we transform the rule into the following one:
$$\frac{P \subseteq \{c_H \mid f(c_H) \in Q\}}{\Gamma \vdash_H P\; Basic(f)\; Q}$$
This rule has the appropriate format and can be applied to any Basic
statement.
• The rule for sequential composition combines pre- and postconditions for
both sub-statements, where R is the weakest precondition for Q and $s_2$:
$$\frac{\Gamma \vdash_H P\; s_1\; R \qquad \Gamma \vdash_H R\; s_2\; Q}{\Gamma \vdash_H P\; Seq(s_1, s_2)\; Q}$$
• For the conditional statement the following rule will be used by the VCG:
$$\frac{P \subseteq \{c_H \mid (b(c_H) \longrightarrow c_H \in P_1) \wedge (\neg b(c_H) \longrightarrow c_H \in P_2)\} \qquad \Gamma \vdash_H P_1\; s_1\; Q \qquad \Gamma \vdash_H P_2\; s_2\; Q}{\Gamma \vdash_H P\; Cond(b, s_1, s_2)\; Q}$$
If $P_1$ and $P_2$ are the weakest preconditions for the branches $s_1$ and
$s_2$, then the weakest precondition for the conditional combines them with
the value of the branching condition b:
$$\{c_H \mid (b(c_H) \longrightarrow c_H \in P_1) \wedge (\neg b(c_H) \longrightarrow c_H \in P_2)\}.$$
So, if state $c_H$ satisfies the condition b, then the precondition $P_1$ has
to hold in it; and if $c_H$ does not satisfy b, then it has to belong to $P_2$.
• For handling loops we need a rule which allows us to introduce an
invariant. Since it cannot be computed by the rules from a while loop, it must
be provided by the user. The statement WhileI(I, b, s) introduces a while
loop with the annotated invariant. Since the invariant does not have any
influence on the deduction, it is semantically defined as a simple while
loop WhileI(I, b, s) = While(b, s). We need the invariant annotation only
for the rule for the VCG:
$$\frac{P \subseteq I \qquad \Gamma \vdash_H (I \cap \{c_H \mid b(c_H)\})\; s\; I \qquad I \cap \{c_H \mid \neg b(c_H)\} \subseteq Q}{\Gamma \vdash_H P\; WhileI(I, b, s)\; Q}$$
To prove a triple for the while loop by deduction we have to show three
subgoals: (i) the precondition P must imply the invariant I, (ii) the
invariant is maintained while the loop is being executed, and (iii) the
invariant and the negated loop condition must imply the postcondition Q.
• The idea of the rule for a procedure call is to reduce the goal to proving
correctness of the procedure's body with pre- and postconditions P′ and
Q′, respectively, such that P′ is weaker than P whereas Q′ is stronger
than Q:
$$\frac{P \subseteq P' \qquad \Gamma \vdash_H P'\; (\text{body of } p)\; Q' \qquad Q' \subseteq Q}{\Gamma \vdash_H P\; Call(p)\; Q}$$
The described verification condition generator is implemented in Isabelle/HOL
by Schirmer [Sch06].
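The rules above can be read as a backwards computation of preconditions. The following is a minimal Python sketch of that reading over a small finite state space. It is not Schirmer's Isabelle/HOL implementation: the tuple encoding of statements, the names wp and verify, and the universe STATES are assumptions made for illustration, and termination and runtime faults are ignored.

```python
# Assertions are sets of states; a state is just an integer here (assumption).
STATES = frozenset(range(11))

def wp(stmt, Q, vcs):
    """Compute a precondition of stmt for postcondition Q.
    Side conditions of annotated loops are appended to the list vcs."""
    kind = stmt[0]
    if kind == "Basic":                      # {c | f(c) in Q}
        _, f = stmt
        return frozenset(c for c in STATES if f(c) in Q)
    if kind == "Seq":                        # R is the precondition of s2 and Q
        _, s1, s2 = stmt
        return wp(s1, wp(s2, Q, vcs), vcs)
    if kind == "Cond":                       # combine P1 and P2 via condition b
        _, b, s1, s2 = stmt
        P1, P2 = wp(s1, Q, vcs), wp(s2, Q, vcs)
        return frozenset(c for c in STATES if (c in P1 if b(c) else c in P2))
    if kind == "WhileI":                     # invariant I is user-supplied
        _, I, b, s = stmt
        I = frozenset(I)
        body_pre = wp(s, I, vcs)
        vcs.append(all(c in body_pre for c in I if b(c)))   # (ii) I is maintained
        vcs.append(all(c in Q for c in I if not b(c)))      # (iii) exit implies Q
        return I                             # (i) P ⊆ I is checked by the caller
    raise ValueError(kind)

def verify(P, stmt, Q):
    """Check P ⊆ wp(stmt, Q) together with all collected side conditions."""
    vcs = []
    pre = wp(stmt, frozenset(Q), vcs)
    return frozenset(P) <= pre and all(vcs)
```

For example, a loop that increments a counter up to 10 verifies against the trivial invariant STATES, whereas the wrong postcondition {9} is rejected by subgoal (iii).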
3.3 BS: C0 Big-Step Semantics
Values. The address model for C0 big-step semantics is rather abstract. No
assumptions about data-alignment or consecutive addresses are made. One
location can store any kind of value, even structured ones. This is quite similar
to references in Simpl as introduced in Section 3.1. For a convenient translation
of C0 addresses to Simpl we introduce type loc-t of locations with non-NULL
references ref-t:
loc-t = {r :: ref-t | r ≠ NULL}.
Primitive C0 values modeled by data type prim-t can be:
Bool(b) A Boolean b :: B. Chr(c) A character c :: Z.
Intg(i) A (signed) integer i :: Z. Addr(a) A reference a :: loc-t.
Unsgnd(n) An unsigned integer n :: N. Null The null reference.
C0 values modeled by data type val-t can be:
Prim(p) Primitive values, where p :: prim-t.
Arr(vs) Arrays, where vs :: val-t∗.
Struct(fs) Structures, where fs :: (S× val-t)∗.
Since there is no extra layer of values in Simpl, NULL is an ordinary element of
type ref-t, whereas in C0 Null is an extra constructor of values. The definition
of type loc-t allows us to map value Null to reference NULL since it can not be




Prim(Null) if r = NULL
Prim(Addr(ref2loc(r))) otherwise.
Function ref2loc(r) converts a reference to a location. Since types for both of
them are just aliases for the natural numbers the conversion is identity. The
same is true for loc2ref(l).
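The case distinction above can be sketched in Python as follows. The name ref_to_val is hypothetical (the name of the mapping is not preserved in the text above), and encoding NULL as 0 is an assumption; the text only states that references and locations are aliases for natural numbers.

```python
from dataclasses import dataclass

NULL = 0  # assumption: encoding of the Simpl NULL reference

@dataclass(frozen=True)
class Addr:      # C0 primitive value Addr(a)
    loc: int

@dataclass(frozen=True)
class Null:      # C0 primitive value Null
    pass

@dataclass(frozen=True)
class Prim:      # C0 value Prim(p)
    p: object

def ref2loc(r: int) -> int:
    """Identity on the underlying number; only defined for non-NULL references."""
    if r == NULL:
        raise ValueError("NULL is not a location")
    return r

def loc2ref(l: int) -> int:
    """Identity in the other direction."""
    return l

def ref_to_val(r: int) -> Prim:
    """Hypothetical name: map a Simpl reference to a C0 value."""
    return Prim(Null()) if r == NULL else Prim(Addr(ref2loc(r)))
```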
Programs. Although in Isabelle/HOL there is a slight technical difference in
representation of C0 programs at the level of big-step and small-step semantics,
in this thesis we will use the previously defined type prog-t to model programs
at the BS level.
A big-step program might contain extended calls (XCalls) to some proce-
dures which affect the extended state configuration cX :: CX. Semantics of
extended procedures is defined in an extended semantics environment modeled
by the type xsem-t:
xsem-t = (S× (S× ty-t)∗ × ty-t∗ × ((S 7→ val-t⊥)× CX 7→ (val-t∗ × CX)⊥))∗.
It consists of (i) a procedure name, (ii) list of parameter names and types,
(iii) list of return types, and (iv) a function that takes a parameter environment
(partial mapping from parameter names to values) and an extended state, and
returns (if possible) a pair of return values and updated extended state.
State. States cBS of C0 big-step semantics are modeled by records of type CBS
which have the following fields.
cBS.gvars Global variables, a mapping S 7→ val-t⊥ from names to values.
cBS.heap The heap, a mapping loc-t 7→ val-t⊥ from locations to values.
cBS.lvars Local variables modeled the same way as global.
cBS.free The counter for the amount of free heap memory.
cBS.x The placeholder for the extended state of XCalls.
Expression evaluation. Local variables may hide global variables. Through-
out execution we keep track of the local variables via a set L to decide whether
a name refers to a local or global variable. This set is also used as a parameter
for the expression evaluation. Expression evaluation evalBS(L, cBS, e) of expres-
sion e :: expr-t in state cBS :: CBS and in context of local variables L :: pow(S) is
defined in Figure 7.1 of Schirmer’s thesis [Sch06]. The function produces a re-
sult of option type val-t⊥, which models possible run-time faults. If the context
of local variables is irrelevant we will write evalBS(cBS, e) for evalBS(∅, cBS, e).
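The role of the context L can be illustrated for the simplest kind of expression, a variable access. The sketch below is a simplification and not the definition from Figure 7.1 of [Sch06]: the function name eval_var and the dictionary layout of the state are assumptions, and None stands for the fault value ⊥.

```python
def eval_var(L, state, name):
    """Look up a variable: names in L are local and hide globals of the same name.
    Returns None (modelling a run-time fault) if the variable holds no value."""
    store = state["lvars"] if name in L else state["gvars"]
    return store.get(name)

state = {"gvars": {"x": 1, "y": 2}, "lvars": {"x": 10}}
```

With L = {"x"} the access to x yields the local value 10, while with the empty context it yields the global value 1.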
Big-step semantics. The operational big-step semantics defined by judgment
Π, xsem, L `BS 〈s, cBS〉 ⇒ c′BS
means that with respect to a program Π, extended semantics environment
xsem, and context of local variables L execution of statement s transforms
the initial state cBS to the final state c′BS, where Π :: progBS, xsem :: xsem-t,
L :: pow(S), cBS, c′BS :: CBS⊥, and s :: stmt-t. The semantics is defined in the
thesis of Schirmer [Sch06] in Figure 7.7.
The termination-guaranteed judgment for C0 programs
Π, xsem, L `BS s ↓ cBS
of statement s in the initial state cBS within the context of program Π, extended
semantics environment xsem, and local variables L is defined inductively by
the rules in Figure 7.8 of Schirmer’s thesis [Sch06]. A crucial property is the
following equivalence between termination and the big-step semantics:
Π, xsem, L `BS s ↓ cBS ←→ (∃ c′BS : Π, xsem, L `BS 〈s, cBS〉 ⇒ c′BS).
Hoare triples. Now we define the notion of a valid Hoare triple for C0 big-step
semantics analogously to validity in Simpl. We start with partial correctness:
Π, xsem, L ⊨BS P s Q = ∀ cBS, c′BS : Π, xsem, L `BS 〈s, cBS〉 ⇒ c′BS
∧ cBS ∈ {⌊x⌋ : x ∈ P} −→ c′BS ∈ {⌊x⌋ : x ∈ Q}.
Given an execution of statement s from initial state cBS to final state c′BS,
if the initial state satisfies the precondition P, then the execution
of s does not cause a runtime fault and the final state satisfies the postcondition
Q. Total correctness additionally requires termination:
Π, xsem, L ⊨tBS P s Q = Π, xsem, L ⊨BS P s Q
∧ ∀ c′BS ∈ {⌊x⌋ : x ∈ P} : Π, xsem, L `BS s ↓ c′BS.
Typing. C0 is a statically typed language. In Simpl the state is already im-
plicitly typed by the translation to a state-record. The correspondence to C0
programs is only guaranteed for well-typed programs.
The judgment
HT `v v :: T
expresses that value v is compatible with type T :: ty-t with respect to an
option heap typing HT :: (loc-t 7→ S)⊥. It is defined inductively by the rules in
Figure 7.10 of Schirmer’s thesis [Sch06].
The judgment
Π,VT,HT ` s√
ensures that statement s :: stmt-t is well-typed with respect to program Π,
variable typing VT :: S 7→ ty-t⊥, and heap typing HT :: loc-t 7→ S⊥. It is
defined inductively by the rules in Figure 7.13 of Schirmer’s thesis [Sch06].
Definite assignment. The local variables and the result variable of a procedure
are not automatically initialized. However, uninitialized variables are a serious
threat for a type-safe execution of a program. Here we supply a simple static
analysis for the source program that ensures that we assign a value to a variable
before we read from it. The formalization of the definite assignment analysis
basically consists of two parts. Function A calculates the set of variables that
are certainly assigned to by a piece of code. Test D ensures that reading a
variable is safe because it has definitely been assigned to before. The
definite assignment analysis for C0 is defined by the following functions:
for statements D(s, L,A), A(s, L),
for expressions De(e, L,A),
for left-expressions Dl(e, L,A), Al(e, L).
Parameter L is the set of local variables and A is the set of assigned variables.
The definitions are given in Figure 7.14 of Schirmer’s thesis [Sch06].
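To illustrate how A and D interact, the following Python sketch implements both for a tiny statement language. It is a simplification and not the definition from Figure 7.14 of [Sch06]: the tuple encoding, the names assigned and definitely_assigned, and the representation of an expression by the list of variables it reads are assumptions.

```python
def assigned(s, L):
    """A(s, L): local variables that are certainly assigned by statement s."""
    kind = s[0]
    if kind == "Assign":                      # ("Assign", target, variables_read)
        return {s[1]} & L
    if kind == "Seq":                         # ("Seq", s1, s2)
        return assigned(s[1], L) | assigned(s[2], L)
    if kind == "Cond":                        # ("Cond", variables_read, s1, s2)
        return assigned(s[2], L) & assigned(s[3], L)
    return set()                              # Skip; While: body may not run

def definitely_assigned(s, L, A):
    """D(s, L, A): every local variable read by s is in A when it is read."""
    def reads_ok(names):
        return all(v in A for v in names if v in L)
    kind = s[0]
    if kind == "Skip":
        return True
    if kind == "Assign":
        return reads_ok(s[2])
    if kind == "Seq":
        return (definitely_assigned(s[1], L, A)
                and definitely_assigned(s[2], L, A | assigned(s[1], L)))
    if kind == "Cond":
        return (reads_ok(s[1]) and definitely_assigned(s[2], L, A)
                and definitely_assigned(s[3], L, A))
    if kind == "While":                       # ("While", variables_read, body)
        return reads_ok(s[1]) and definitely_assigned(s[2], L, A)
    raise ValueError(kind)
```

A variable assigned in only one branch of a conditional is not counted as assigned afterwards, so a later read of it fails the test.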
Well-formed programs. We consider program Π to be well-formed, denoted by
wf-prog?(Π) [Sch06, Definition 7.42], if it respects the following static conditions.
• Type names in the type environment and the global variables have to be
unique and all types have to be well-formed.
• Moreover, all procedure names have to be unique and their definitions
have to be well-formed.
• The names of parameters, local variables, and the result variable have to
be unique and all their types have to be well-formed.
• The procedure body has to be well-typed with respect to the variable
typing obtained from global variables, parameters, local variables, and
the result variable.
• Moreover, the body has to pass the definite assignment test, where pa-
rameters, local variables, and the result variables are considered as local
names and only the parameters are considered as assigned variables.
• Finally, the result variable has to be assigned in the procedure body.
Type safety. Type safety relates the static semantics like typing and definite
assignment with the dynamic semantics, the execution of the program. It
describes properties that are guaranteed during runtime if the static tests have
been passed. Here we are interested in the fact that execution of statements
preserves well-typedness or conformance of the program state. Conformance of
a value to its type is already captured by the typing judgment ⌊HT⌋ `v v :: T
where HT is the heap typing for the current state. A store like a local variable
or the heap is a mapping from variable names S or locations loc-t to option
values val-t⊥. A store s conforms to static typing ST if every stored value
conforms to its type:
HT ` s :: ST = ∀ p, v, T : s(p) = ⌊v⌋ ∧ ST(p) = ⌊T⌋ −→ ⌊HT⌋ `v v :: T.
A BS state is conforming if the heap, the local variables, and the global vari-
ables conform to the corresponding type environment. For a program state we
define conformance predicate TE ` cBS :: HT,LT,GT, where TE :: S 7→ ty-t is
a type environment, HT :: loc-t 7→ S⊥ is a heap typing, and LT,GT :: S 7→ ty-t⊥
are typings for local and global variables, respectively.
DEFINITION 3.1 (Conforming BS state)
TE ` cBS :: HT,LT,GT = HT ` cBS.heap :: (TE ◦m HT)
∧ dom(HT) ⊆ dom(cBS.heap)
∧ finite(dom(cBS.heap))
∧ HT ` cBS.gvars :: GT
∧ dom(GT) ⊆ dom(cBS.gvars)
∧ HT ` cBS.lvars :: LT.
Above, by f ◦m g we denote the composition of partial mappings f and g, by
dom(f) = {a | f(a) ≠ ⊥} we denote the domain of partial mapping f, and by
finite(s) we denote that set s is finite. The heap typing HT maps locations
to type names, and the type environment TE maps type names to types. The
heap has to conform to the composition of both. Moreover, the domain of
the heap typing is a subset of the domain of the heap and both are finite.
The global variables must conform to their types. The domain of the global
variable typing is a subset of the domain of the global variables. Finally, the
local variables have to conform to their types.
Sometimes, when we use a conforming BS state notation we might lack a
global, local, or heap typing to supply. In such situations we will provide an
empty typing, i.e., a typing with an empty set as a domain. We will denote
such typings by [ ] sharing the notation with an empty list.
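Store conformance and Definition 3.1 can be sketched in Python as follows. Partial mappings are represented by dictionaries that contain only the defined entries, so finiteness holds by construction. The value typing has_type is a hypothetical stand-in for the judgment of Figure 7.10 in [Sch06]: it covers only two primitive types and ignores the heap typing.

```python
def has_type(v, T):
    """Hypothetical, simplified value typing for two primitive types."""
    if T == "int":
        return isinstance(v, int) and not isinstance(v, bool)
    if T == "bool":
        return isinstance(v, bool)
    return False

def store_conforms(store, typing):
    """Every stored value that has a static type conforms to that type."""
    return all(has_type(v, typing[p]) for p, v in store.items() if p in typing)

def compose(f, g):
    """Composition f ∘m g of partial mappings given as dictionaries."""
    return {a: f[b] for a, b in g.items() if b in f}

def state_conforms(TE, c, HT, LT, GT):
    """Conjuncts of Definition 3.1; finiteness is implicit for dictionaries."""
    return (store_conforms(c["heap"], compose(TE, HT))
            and set(HT) <= set(c["heap"])
            and store_conforms(c["gvars"], GT)
            and set(GT) <= set(c["gvars"])
            and store_conforms(c["lvars"], LT))
```

Note that a local variable without a value (such as an uninitialized one) does not violate conformance, since only stored values are checked.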
3.4 Property Transfer from Simpl to BS
Ultimately, we want to prove functional properties and termination
of C0 programs. We start with a C0 program, abstract it to Simpl and then
verify this Simpl program by means of the Hoare logic. To transfer a Hoare
triple from Simpl to C0 we need to know that the behavior of the C0 program
is captured by the behavior of the Simpl program. Therefore, we have to prove
that the C0 program can be simulated by the corresponding Simpl program.
In the context of total correctness we also have to show that termination of the
Simpl program implies termination of the corresponding C0 program.
A peculiarity of the translation from C0 to Simpl is that it cannot be
defined generically for all C0 programs since variables in Simpl are represented
as record fields. Therefore the shape of the record depends on every individual
program.
State abstraction. To abstract the state we use the function
BS2Hstate :: CBS 7→ pow(CH).
It takes a big-step state and yields the set of all related Simpl states of type CH.
Clearly, this function could only be defined for a particular program state space.
We define the function for the state of the demand paging implementation in
Section 6.2 (Definition 6.19).
Program abstraction. An abstraction of C0 statements in the context of a
particular procedure is defined by function
BS2Hstmt :: stmt-t 7→ com(CH).
It is introduced in Definition 8.46 and Figure 8.4 in Schirmer’s thesis [Sch06].
Assertion abstraction and concretization. We want to transfer a Simpl Hoare
triple
Γ tH PH BS2Hstmt(s) QH
to its C0 big-step semantics variant
Π, L tBS PBS s QBS.
Assertions PH and PBS as well as QH and QBS have to be related. There are
two ways to describe this relation, either by concretizing a Hoare assertion to
a BS assertion, or by abstraction of a BS assertion to a Hoare one.
Every state cBS for which there exists an abstract state cH ∈ BS2Hstate(cBS)
that satisfies PH satisfies the concretization of assertion PH:
H2BSasrt :: pow(CH) 7→ pow(CBS),
H2BSasrt(PH) = {cBS | ∃ cH : cH ∈ BS2Hstate(cBS) ∧ cH ∈ PH}.
The union of the abstracted states for all BS states satisfying PBS is the
abstraction of assertion PBS:
BS2Hasrt :: pow(CBS) 7→ pow(CH),
BS2Hasrt(PBS) = {cH | ∃ cBS : cH ∈ BS2Hstate(cBS) ∧ cBS ∈ PBS}.
In practice, it turns out that a combination of both methods works best. For
the precondition we use abstraction and for the postcondition concretization.
For given assertions PBS, PH, QBS, and QH we need to show:
• BS2Hasrt(PBS) ⊆ PH, and
• H2BSasrt(QH) ⊆ QBS.
Unfolding the definitions yields the following proof obligations:
• cH ∈ BS2Hstate(cBS) ∧ cBS ∈ PBS −→ cH ∈ PH, and
• c′H ∈ BS2Hstate(c′BS) ∧ c′H ∈ QH −→ c′BS ∈ QBS.
In both cases we obtain either cH ∈ BS2Hstate(cBS) or c′H ∈ BS2Hstate(c′BS),
which we can exploit in order to transfer assertions.
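Both directions can be sketched over finite sets in Python. The state abstraction is passed in as a function returning a set of abstract states; the extra parameter bs_states (the universe of BS states to search) and the toy abstraction used in the usage note are assumptions needed only to make the sets enumerable.

```python
def H2BSasrt(P_H, BS2Hstate, bs_states):
    """Concretization: BS states having at least one abstraction in P_H."""
    return {c for c in bs_states if BS2Hstate(c) & P_H}

def BS2Hasrt(P_BS, BS2Hstate):
    """Abstraction: union of the abstract states of all BS states in P_BS."""
    return set().union(*(BS2Hstate(c) for c in P_BS))

def toy_abstraction(c):
    """Toy example: every BS state c is related to two abstract states."""
    return {(c, "a"), (c, "b")}
```

With this toy abstraction, BS2Hasrt({1}, toy_abstraction) yields both abstract states of 1, and H2BSasrt({(2, "a")}, toy_abstraction, range(4)) yields {2}.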
The following theorem which is a slightly modified version of Schirmer’s
Corollary 8.43 is used to transfer Hoare triples from Simpl to C0 big-step
semantics.
THEOREM 3.2 (Transfer from Simpl to BS)
Let Π be a program, let GT be a typing of global variables, let LT be a
typing of local variables, and let HT be a heap typing. Let us denote a combined
typing of global and local variables by GT◦LT. Let Π be well-formed (1), let s
be a well-typed statement (2) which is also definitely assigned (3), and let the
variables we assume to be assigned be a subset of the variables actually assigned
in the initial state (4). Moreover, assume that the initial state is conforming (5)
and there exists at least one abstracted state for the initial state (6). Having
the connections between preconditions PBS and PH (7) and postconditions QBS
and QH (8), we can conclude the valid transfer:
(1) wf-prog?(Π)
(2) ∧ ∀cBS ∈ PBS : Π,GT ◦ LT,HT ` s√
(3) ∧ D(s, dom(LT), A)
(4) ∧ ∀ cBS ∈ PBS : A ⊆ dom(cBS.lvars)
(5) ∧ ∀ cBS ∈ PBS : TE ` cBS :: HT,LT,GT
(6) ∧ ∀ cBS ∈ PBS : BS2Hstate(cBS) ≠ ∅
(7) ∧ BS2Hasrt(PBS) ⊆ PH
(8) ∧ (∀HT ′ : dom(HT) ⊆ dom(HT ′) −→
H2BSasrt(QH) ⊆
{c′BS | TE ` c′BS :: HT ′,LT,GT
∧ A ∪ (dom(LT)) ∩ A(s) ⊆ dom(c′BS.lvars)
−→ c′BS ∈ QBS})
−→ Γ tH PH BS2Hstmt(s) QH −→ Π, L tBS PBS s QBS.
3.5 Property Transfer from BS to SS
Extended C0 small-step semantics. In Section 2.3 we have defined the transi-
tion function δSS which describes the C0 source-level semantics of the compiler
correctness theorem [Lei07]. However, it does not properly treat XCalls. For
that we introduce an extended transition function δSSX which defines the ex-
tended C0 small-step semantics. Transition δSSX(te, ft, cSS, x, xsem) extends δSS
to handle XCalls on the extended state x, according to the extended semantics
environment xsem. We execute n steps by δnSSX.
The extended semantics environment for the small-step semantics is defined
the same way as for the big-step semantics (Section 3.3). The only difference
is value representation: in the definition of the type xsem-t the value type
val-t is replaced by the C0 small-step content type N 7→ mcell-t. We do not
introduce a special type of the extended state for the small-step semantics, but
rather overload the type xsem-t.
Valid SS configurations for property transfer. In addition to the standard
constraints on C0 small-step semantics configurations C0√ we impose further
restrictions for the purpose of property transfer. These define the set of SS
configurations validSS(te, ft).
• First, certain statements and expressions are not allowed since they are
not supported by the big-step semantics or the Hoare logic, namely the
address-of operation, external calls, and inline assembly statements. Their
absence is denoted by noAddrOf-Asm-ESCall?(s) for a statement s. Using
this predicate we require that forbidden statements and expressions occur
neither in the program rest nor in any procedure reachable via calls from
the statements in it:
noAddrOf-Asm-ESCall?(cSS.prog)
∧ ∀ p ∈ SCalls(ft, scalls(cSS.prog)) :
noAddrOf-Asm-ESCall?((Tmap-of(ft, p)U).body).
The function scalls collects the procedure calls of a statement and the set
SCalls inductively collects further calls in the procedure environment for
the given initial set.
• Global variables have to be initialized, denoted by glob-init?(cSS.mem):
∀vn : vn ∈ map(fst, gst(cSS.mem)) −→ vn ∈ cSS.mem.gm.init.
• We allow only pointers to (root) heap locations in the memory, denoted
by only-heap-pointer?(cSS.mem). This predicate is a strengthened version
of Leinenbach’s type correct memory [Lei07, Definition 5.23] and type
correct heap [Lei07, Definition 5.24].
• Moreover, every type in the heap has to have a proper type name in the
type environment, denoted by named-ty?(te, hst(cSS.mem)).
• Only root positions of local and global variables are valid return destina-
tions: ∀ gv ∈ map(snd, cSS.mem.lm) : root?(gv).
• In case the return destination is a local variable it has to be defined in the
frame stack below: wf-retvars?(snd(hd(cSS.mem.lm)), tl(cSS.mem.lm)).
• Each procedure in the procedure environment has to pass the definite
assignment check, where we assume parameters and local variables to be
initialized:
∀p ∈ {map(snd, ft)}, pns={map(fst, p.params)}, lns={map(fst, p.lvars)} :
D(p.body, pns ∪ lns, pns)
• Similarly, the program rest has to pass the definite assignment check.
However, as the program rest gets expanded during the small-step com-
putation it may contain multiple returns and hence the program rest is
split into several regions that correspond to procedure invocations on the
frame stack. This generalization is formalized by Ds and LAs:
Ds(cSS.prog, (map(fst, lsttop(cSS.mem)), lmtop(cSS.mem).init)
◦ LAs(restop(cSS.mem), tl(cSS.mem.lm))).
Relational view of SS transitions. A final configuration is reached when the
program rest is Skip. In this case the transition function is the identity. We
define a relational view of the transition function that really stops in this con-
figuration by a single rule:
cSS.prog ≠ Skip
te, ft, xsem `SS ⌊(cSS, x)⌋ → δSSX(te, ft, cSS, x, xsem) .
We refer to the reflexive transitive closure by substituting the arrow → by →∗.
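The difference between the transition function and its relational view can be sketched in Python with a toy configuration. Everything in the sketch is an assumption made for illustration: a configuration is a pair of a program rest (a tuple of numbers standing for statements) and a counter, and the empty tuple plays the role of Skip.

```python
SKIP = ()  # toy stand-in for the program rest Skip

def delta(cfg):
    """Toy transition function: the identity on final configurations."""
    prog, counter = cfg
    if prog == SKIP:
        return cfg
    return (prog[1:], counter + prog[0])   # execute the first toy statement

def step(cfg):
    """Relational view: a final configuration has no successor (None)."""
    return None if cfg[0] == SKIP else delta(cfg)

def run_to_final(cfg):
    """Follow the relation until no further step is possible."""
    while (nxt := step(cfg)) is not None:
        cfg = nxt
    return cfg
```

The function delta can be iterated forever on a final configuration, whereas step really stops there, which is what makes the notion of a final configuration usable in the Hoare triple below.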
SS Hoare triple. Our notion of a Hoare triple at the level of the small-step se-
mantics is biased towards property transfer from the big-step level. We encode
validity of configurations and transition invariants right into this notation as
well as the set of local variables L which comes from the BS level. The reason
for this peculiarity is that the set of local variables is directly encoded into each
configuration in the small-step semantics. Hence we cannot relate those entities
between the big- and the small-step semantics without referring to a small-step
configuration which is only accessible within the pre- and postconditions of the
Hoare triple.
Π, xsem, L ⊨tSS P s Q =
∀ cSS, x : (cSS.mem, x) ∈ P
∧ cSS ∈ validSS(Π.te,Π.ft)
∧ #ret(cSS.prog) = 0
∧ L = {map(fst, lsttop(cSS.mem))}
∧ ¬(Π.te,Π.ft, xsem `SS ⌊(cSS, x)⌋ → . . . (∞))
∧ ∀ cx′ : Π.te,Π.ft, xsem `SS ⌊(cSS, x)⌋ →∗ cx′ ∧ final?(cx′)
−→ ∃ c′SS, x′, hst :
cx′ = ⌊(c′SS, x′)⌋
∧ (c′SS.mem, x′) ∈ Q
∧ c′SS ∈ validSS(Π.te,Π.ft)
∧ trans-inv?(cSS.prog, hst, cSS, c′SS)
We consider an initial configuration (cSS, x). The memory of the configuration
and the extended state fulfill the precondition P . Moreover, the initial
configuration is valid and the program rest does not contain a return statement.
Since we define total correctness, non-terminating computations are ruled out by
negating the judgment
Π.te,Π.ft, xsem `SS ⌊(cSS, x)⌋ → . . . (∞) which is defined as
∃f : f(0) = ⌊(cSS, x)⌋ ∧ ∀i : Π.te,Π.ft, xsem `SS f(i)→ f(i+ 1).
For every computation the final configuration cx′, denoted by final?(cx′), must
not be the error configuration, the postcondition Q has to hold, and it has to
be valid. A non-error final configuration has an empty program rest. More-




trans-inv?(s, hst, cSS, c′SS) =
tl(c′SS.mem.lm) = tl(cSS.mem.lm)
∧ lsttop(c′SS.mem) = lsttop(cSS.mem)
∧ gst(c′SS.mem) = gst(cSS.mem)
∧ hst(c′SS.mem) = hst(cSS.mem) ◦ hst
∧ restop(c′SS.mem) = restop(cSS.mem)
∧ fst(hd(cSS.mem.lm)) ∪map(fst, lsttop(cSS.mem)) ∩ A(s)
⊆ fst(hd(c′SS.mem.lm)).init
The transition invariant captures essential invariants of the small-step com-
putation that hold between the initial and final configurations of the procedure
call that we transfer. First of all, the computation only affects the topmost
frame of local variables, and the type information of this frame is not changed;
the same holds for the global variables. The type information for the heap
memory may only grow. Return destinations of the top local memory frames are
equal. Finally, the set of initialized local variables may only grow, as
approximated by the definite assignment analysis A.
THEOREM 3.4 (Transfer from BS to SS)
Assume that SS configuration cSS satisfies validity requirements for property
transfer (1) and its program rest consists of statement s (2). Assume that the
context L of local variable names consists of the variables of the topmost local
memory frame (3). Moreover, a precondition on the SS level is satisfied (4). If under
these assumptions there exist pre- and postconditions on the BS level, such
that BS Hoare triple holds (5), BS precondition is satisfied by the abstracted
configuration (6), and for all final valid SS configurations (7) respecting the
transition invariant (8) and the BS postcondition (9) the SS postcondition
holds (10), then SS Hoare triple holds:
(1) cSS ∈ validSS(Π.te,Π.ft)
(2) ∧ cSS.prog = s ∧#ret(s) = 0
(3) ∧ L = map(fst, lsttop(cSS.mem))
(4) ∧ (cSS, x) ∈ PSS
(5) −→ (∃ PBS, QBS : Π, xsem, L tBS PBS s QBS
(6) ∧ SS2BSstate(cSS, x) ∈ PBS
(7) ∧ (∀ c′SS, x′, hst′ : c′SS ∈ validSS(Π.te,Π.ft)
(8) ∧ trans-inv?(s, hst′, cSS, c′SS)
(9) ∧ SS2BSstate(c′SS, x′) ∈ QBS
(10) −→ (c′SS, x′) ∈ QSS))
−→ Π, xsem, L tSS PSS s QSS
The core premise of this transfer theorem strengthens the precondition and
weakens the postcondition. Additionally, we take the different layers and the
system invariants of the small-step layer into account. For an arbitrary initial
configuration that fulfills the constraints of small-step validity and the precon-
dition PSS we have to supply a big-step Hoare triple Π, xsem, L tBS PBS s QBS
such that the big-step abstraction of the configuration fulfills the precondition
PBS. For a final valid small-step configuration that fulfills the transition invariant
and for which the big-step abstraction fulfills the postcondition QBS we have
to derive the small-step postcondition QSS. Note the existential quantification
on the big-step pre- and postcondition. It is under the universal quantification
of the initial configuration and hence can depend on this configuration. The
function SS2BSstate is used to abstract a small-step configuration to a big-step
state.
From Hoare triples to operational-semantics-style computations. At this point
we are able to transfer a Hoare triple for a single procedure call to the small-
step level. One basic motivation of reasoning at the low abstraction level of the
small-step semantics instead of the convenient Hoare logic level is the ability to
combine ordinary C0-computation with inline assembly code. Hence, it would
be worthless if we were incapable of using the transferred result in a situation
where inline assembly code is part of the program rest. The following theorem




(1) Π, xsem, L tSS P s Q
(2) ∧ s = cSS.prog ∧#ret(cSS.prog) = 0
∧ (cSS.mem, x) ∈ P
∧ cSS ∈ valid ′SS(Π.te,Π.ft)
∧ L = map(fst, lsttop(cSS.mem))
(3) −→ ∃ n, c′SS, x′, hst′ : δnSSX(Π.te,Π.ft, cSS, x, xsem) = ⌊(c′SS, x′)⌋
(4) ∧ c′SS.prog = Skip
∧ c′SS ∈ valid ′SS(Π.te,Π.ft)
∧ trans-inv?(s, hst′, cSS, c′SS)
∧ (c′SS.mem, x′) ∈ Q
We start in a configuration where the program rest agrees with the statement
in the Hoare triple (1,2). As we have total correctness we know that
the computation leads to a state where the statement is completely executed,
i.e., has evaluated to Skip (3,4). In the remainder of the theorem the usual























The chapter introduces an abstract page-fault handler configuration as well as
a configuration for extended state. These configurations are used to specify the
intended behavior of the demand paging software. In the subsequent chapters we
will map them to the demand paging implementation represented on the levels of
Simpl, big-step, and small-step semantics. We introduce the initial abstract
configuration which defines the state of the demand paging data structures after
execution of the initialization code. An address translation algorithm,
operations for transferring pages between the physical memory and the hard disk,
and a page-fault handling algorithm are defined over these configurations. All
correctness properties essential for the top-level correctness proof are
collected into a validity predicate of the abstract page-fault handler. This
predicate is the main invariant maintained by the demand paging implementation.
It is established over the initial configuration and proven to be preserved
under the page-fault handling algorithm.
58 Specification of Demand Paging
4.1 Configurations
The general approach to verification of low-level implementations is abstrac-
tion, i.e., mapping the data structures and algorithms constituting the imple-
mentation to higher-level concepts. The implementation of demand paging
operates on two kinds of variables: (i) C0 data structures and variables, and
(ii) user and swap memories which can be accessed only by assembly code.
For the first group we introduce an abstract page-fault handler configuration,
or abstract PFH configuration: a record which collects abstract HOL
representations of the handler’s C0 data structures. Memories from the second
group are modeled by the so-called extended state of the page-fault handler,
or simply the extended state.
Before defining abstract PFH configurations let us formalize an auxiliary
concept — a page descriptor.
Page descriptors. The record type pd-t holding information about one user
page defines the type for page descriptors. One such page descriptor pd corre-
sponds to exactly one physical page of user memory and has three components:
• pd.pid :: N, the process identifier which denotes to which virtual machine
an associated physical page belongs,
• pd.vpx :: N, the virtual page index showing to which virtual page the
corresponding physical page belongs, and
• pd.ppx :: N, the physical page index which points to the user page in the
physical memory.
Components of a page descriptor have to be appropriately bounded: the
process identifier field pd.pid is bounded by the maximum number of processes,
and the virtual pd.vpx and physical pd.ppx page index fields have to be page
addresses, i.e., bounded by TOT PGS.
DEFINITION 4.1 (Sub-typing of page descriptors)
subtypingpd?(pd) = pd.pid < MAX PID
∧ pd.vpx < TOT PGS
∧ pd.ppx < TOT PGS
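Definition 4.1 translates directly into a small Python sketch. The class name PageDescriptor and the concrete values of the bounds are hypothetical; the identifiers MAX_PID and TOT_PGS stand for the constants MAX PID and TOT PGS of the text. Since Python integers may be negative, the sketch also checks the lower bound that is implicit in the type N.

```python
from dataclasses import dataclass

MAX_PID = 128        # hypothetical value of the bound on process identifiers
TOT_PGS = 2 ** 20    # hypothetical value of the total number of pages

@dataclass(frozen=True)
class PageDescriptor:
    pid: int   # process identifier of the owning virtual machine
    vpx: int   # virtual page index
    ppx: int   # physical page index

def subtyping_pd(pd: PageDescriptor) -> bool:
    """Bounds of Definition 4.1, plus non-negativity implied by the type N."""
    return (0 <= pd.pid < MAX_PID
            and 0 <= pd.vpx < TOT_PGS
            and 0 <= pd.ppx < TOT_PGS)
```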
4.1.1 Abstract Configuration
Abstract versions of data structures from the page-fault handler implementa-
tion together with those fields of process control blocks that are relevant for
page-fault handling are collected in the record CPFH . An abstract configu-
ration cPFH of the page-fault handler is an element of this type. It has the
following fields:
• cPFH.active :: pd-t∗, the active list of page descriptors associated with user
memory pages that store a virtual page,
• cPFH.free :: pd-t∗, the free list of descriptors corresponding to unused
physical pages,
• cPFH.bpfree :: N∗, the stack of free big-page indices,
• cPFH.pt :: N∗∗, the page table space,
• cPFH.bpt :: N∗, the big-page table space,
• cPFH.pto :: Z∗, the list of page table origins of processes,
• cPFH.ptl :: Z∗, the list of page table lengths of processes,
• cPFH.bpto :: Z∗, the list of big-page table origins of processes,
• cPFH.bptl :: Z∗, the list of big-page table lengths of processes.
The components of the abstract configuration are modeled by unbounded data
types. However, for verification we need to impose size constraints on them
reflecting the size of the corresponding data structures from the
implementation.
Since active and free lists describe altogether all user memory pages their
total length has to be equal to the total number of user memory pages. Ele-
ments of these lists are page descriptors. They have to be bounded by means
of the sub-typing predicate for page descriptors (Definition 4.1).
DEFINITION 4.2 (Sub-typing of active and free lists)
subtypingactive-free?(cPFH) = |cPFH.active|+ |cPFH.free| = USER PGS
∧ (∀ i < |cPFH.active| : subtypingpd?(cPFH.active[i]))
∧ (∀ i < |cPFH.free| : subtypingpd?(cPFH.free[i]))
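A minimal Python sketch of Definition 4.2 follows. The value of USER_PGS is hypothetical, and the per-descriptor predicate of Definition 4.1 is passed in as the parameter pd_ok so that the sketch does not fix a particular representation of page descriptors.

```python
USER_PGS = 4   # hypothetical number of user memory pages

def subtyping_active_free(active, free, pd_ok):
    """Both lists together describe exactly USER_PGS pages,
    and every descriptor in either list satisfies pd_ok."""
    return (len(active) + len(free) == USER_PGS
            and all(pd_ok(pd) for pd in active)
            and all(pd_ok(pd) for pd in free))
```

Moving a descriptor from the free list to the active list preserves the length condition, which matches the intuition that the two lists partition the user memory pages.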
If no processes store (parts of) their virtual memory on the hard disk all big
pages are free. Therefore, the maximum length of the stack of free big pages
is the total number of big pages. Each entry of the stack points to some big





subtypingbpfree?(cPFH) = |cPFH.bpfree| ≤ TOT BIG PGS
∧ ∀ i < |cPFH.bpfree| : cPFH.bpfree[i] < TOT BIG PGS
Our abstraction of the page table space is quite close to the corresponding
array in the implementation: the size of dimensions of cPFH.pt is the same as
in the implementation. The first dimension is bounded by TOT PGS PT whereas
the second by PTES PER PG. Each entry of the page table space is four bytes




subtypingpt?(cPFH) = |cPFH.pt| = TOT PGS PT
∧ (∀ i < TOT PGS PT : |cPFH.pt[i]| = PTES PER PG)
∧ (∀ i < TOT PGS PT : ∀j < PTES PER PG :
cPFH.pt[i][j] < TOT PGS · PG SZ)
Each big page has its own entry in the big-page table. This entry is an





subtypingbpt?(cPFH) = |cPFH.bpt| = TOT BIG PGS
∧ ∀ i < TOT BIG PGS : cPFH.bpt[i] < TOT BIG PGS
As for the process control block components of the abstract configuration,
in each component there are as many entries as processes in the system. The
page table origin is bounded by the total number of virtual pages whereas the
big-page table origin by the total number of big pages. The same upper bounds
are respected by page and big-page table lengths. However, the domain of their
values is additionally extended with −1 to denote that a process is inactive and
has no virtual memory.
DEFINITION 4.6 (Sub-typing of PCBs)
subtypingPCB?(cPFH) = |cPFH.pto| = MAX PID
∧ |cPFH.ptl| = MAX PID
∧ |cPFH.bpto| = MAX PID
∧ |cPFH.bptl| = MAX PID
∧ (∀ i < MAX PID : 0 ≤ cPFH.pto[i] < TOT PGS
∧ 0 ≤ cPFH.bpto[i] < TOT BIG PGS
∧ −1 ≤ cPFH.ptl[i] < TOT PGS
∧ −1 ≤ cPFH.bptl[i] < TOT BIG PGS)
Altogether, sub-typing of an abstract PFH configuration subtypingPFH?(cPFH)
is the conjunction of Definitions 4.2–4.6.
4.1.2 Extended State
The extended state of the demand paging specification is an abstraction of the
non-system part of the physical memory of the machine running the handler
and the hard disk of this machine. The state is used to specify the effects of
read and write operations to/from the hard disk invoked by the handler. The
state is modeled by the record CX whose instances cX have two components:
• cX.mem :: N∗∗, the user part of the physical memory not reachable by
C0, and
• cX.swap :: N∗∗, the content of the hard disk excluding the boot region.
Figure 4.1 depicts the parts of the physical memory and the hard disk content
which are modeled by the extended state.
The component cX.mem is an abstraction of the user memory region —
USER PGS pages of memory starting from address KERNEL PGS — and the zero
filled page. The component cX.swap is an abstraction of the swap memory
stored at the hard disk — the region of TOT BIG PGS big pages starting from
address BOOT PGS. Each of the extended state components is modeled as a list
of pages, e.g., in the formula cX.mem[i][j] the index i denotes the page address




subtypingX?(cX) = |cX.mem| = USER PGS + 1
∧ |cX.swap| = TOT BIG PGS · PGS PER BIG PG
∧ (∀ i < |cX.mem| : |cX.mem[i]| = PG SZ WD)
∧ (∀ i < |cX.swap| : |cX.swap[i]| = PG SZ WD)
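The size constraints on the extended state can be sketched in Python as follows. The concrete values of the constants are hypothetical and chosen small for illustration; the identifiers correspond to the constants USER PGS, TOT BIG PGS, PGS PER BIG PG, and PG SZ WD of the text.

```python
USER_PGS = 4          # hypothetical number of user pages
TOT_BIG_PGS = 2       # hypothetical number of big pages
PGS_PER_BIG_PG = 2    # hypothetical number of pages per big page
PG_SZ_WD = 8          # hypothetical page size in words

def subtyping_x(mem, swap):
    """mem holds the user pages plus the zero-filled page; swap holds all pages
    of all big pages; every page consists of exactly PG_SZ_WD words."""
    return (len(mem) == USER_PGS + 1
            and len(swap) == TOT_BIG_PGS * PGS_PER_BIG_PG
            and all(len(page) == PG_SZ_WD for page in mem)
            and all(len(page) == PG_SZ_WD for page in swap))
```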
4.2 Initial Configuration
An abstract PFH configuration is set to its initial state by executing the ini-
tialization code of demand paging.
Initial page descriptor. An initial configuration of a page descriptor is con-
structed by means of the function init-pd(i) = pd0, such that pd0.pid = 0,
pd0.vpx = 0, and pd0.ppx = i.
Initial abstract PFH configuration. The initial configuration init-cPFH of the
abstract page-fault handler is defined as follows:
• init-cPFH.active = [ ], the active list is empty,
• init-cPFH.free = map(init-pd, [TOT PHYS PGS− 1..KERNEL PGS]), the free
list describes all user physical pages,
• init-cPFH.bpfree = [TOT BIG PGS− 1..0], the stack of free big-pages con-
tains entries for all big pages,
• init-cPFH.pt = rep(rep(0, PTES PER PG), TOT PGS PT), all entries of the
page table array are initialized with zeros,
• init-cPFH.bpt = rep(0, TOT BIG PGS), so do the entries of the big-page
table list,
• init-cPFH.pto = 0 ◦map(init-pto, [1..MAX PID− 1]), the page table origins
of users are distributed inside the page table array with equal gaps be-
tween them computed by the function
init-pto(i) =
PT START +




note that we take into consideration the start offset PT START of page
tables in the physical memory,
• init-cPFH.ptl = 0 ◦ rep(−1, MAX PID − 1), the page table lengths of
processes are set to −1, which denotes that the processes have no allocated
memory,
• init-cPFH.bpto = 0 ◦ rep(0, MAX PID − 1), the big-page table origins are
initialized with zeros, and
• init-cPFH.bptl = 0 ◦ rep(−1, MAX PID − 1), the big-page table lengths are
set to −1.
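The list components of the initial configuration can be sketched in Python as follows. This is an illustration under stated assumptions: the record is modeled as a dictionary, `rep` is rendered with list repetition, the descending ranges are rendered with `range`, and the page table origins are omitted because their formula is not fully reproduced above. All sizes passed in the example call are hypothetical.

```python
from typing import NamedTuple


class PD(NamedTuple):
    """Page descriptor with the three fields used in the text."""
    pid: int
    vpx: int
    ppx: int


def init_pd(i):
    # init-pd(i): pid = 0, vpx = 0, ppx = i
    return PD(0, 0, i)


def init_config(tot_phys_pgs, kernel_pgs, tot_big_pgs, ptes_per_pg, tot_pgs_pt, max_pid):
    return {
        "active": [],
        # [TOT_PHYS_PGS - 1 .. KERNEL_PGS], descending
        "free": [init_pd(i) for i in range(tot_phys_pgs - 1, kernel_pgs - 1, -1)],
        # [TOT_BIG_PGS - 1 .. 0], descending
        "bpfree": list(range(tot_big_pgs - 1, -1, -1)),
        # a fresh row per page so that rows are not aliased
        "pt": [[0] * ptes_per_pg for _ in range(tot_pgs_pt)],
        "bpt": [0] * tot_big_pgs,
        "ptl": [0] + [-1] * (max_pid - 1),
        "bpto": [0] * max_pid,
        "bptl": [0] + [-1] * (max_pid - 1),
    }


c0 = init_config(tot_phys_pgs=8, kernel_pgs=4, tot_big_pgs=3,
                 ptes_per_pg=4, tot_pgs_pt=2, max_pid=3)
```

With these toy sizes the free list describes the physical pages 7, 6, 5, 4 in that order, and the active list is empty.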
4.3 Address Translation
A memory access of a user process pid at virtual address va is subject to address
translation. This operation either generates a page fault signal or delivers a
physical memory address. In the following we define an address translation
mechanism in terms of abstract PFH configurations.
4.3.1 Physical Memory Address
For a process identifier pid the component of an abstract PFH configuration
cPFH.pto[pid] stores the origin of the page table of the process pid in the physical
memory of the underlying machine. The origin is measured in pages. On the
side of the abstract configuration the process’s page table origin is an index of
the first dimension of the page table array cPFH.pt. We compute this index by
means of the function
DEFINITION 4.9 (Page table origin in page table space)
\[
\text{pto}_{\text{PFH}}(c_{\text{PFH}}, pid) =
\frac{c_{\text{PFH}}.\text{pto}[pid] \cdot \text{PG SZ} - \text{PT START}}{4 \cdot \text{PTES PER PG}}.
\]
It first compensates for the start address PT START of the page table memory
region and then considers page-alignment of page tables: each page contains
PTES PER PG page table entries, each four bytes long.
The abstract configuration’s component cPFH.ptl[pid] stores the page table
length of the process pid. That means that the process pid has cPFH.ptl[pid] + 1
entries in the page table. Since page tables are page aligned, we define a
function which computes how many pages a process’s page table occupies:
DEFINITION 4.10 (Page table length)





Now we introduce page table entry addresses. Let vpx be a virtual page
index. As the page table array has two dimensions we need two addresses: one
for each dimension. The page table entry address for the first dimension is
computed by means of the function
DEFINITION 4.11 (Page table entry address: first dimension)
ptea1(cPFH, pid, vpx) = ptoPFH(cPFH, pid) + vpx/PTES PER PG.
It counts vpx entries in the page table of the process pid starting at its origin
ptoPFH(cPFH, pid) and aligns the result pagewise.
The page table entry address for the second dimension is defined as
DEFINITION 4.12 (Page table entry address: second dimension)
ptea2(vpx) = vpx mod PTES PER PG.
In this case the address computation function simply ignores the part of vpx
which is greater than the size of the second dimension of the page table array.
Having page table entry addresses we can define page table entries in terms
of the page table array:
DEFINITION 4.13 (Page table entry)
ptePFH(cPFH, pid, vpx) = cPFH.pt[ptea1(cPFH, pid, vpx)][ptea2(vpx)].
Finally, the physical memory address for a virtual address va is obtained
by combining the physical page index with the byte index:
DEFINITION 4.14 (Physical memory address)
pmaPFH(cPFH, pid, va) = px(ptePFH(cPFH, pid, px(va))) · PG SZ + bx(va).
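The translation chain from a virtual address to a physical memory address can be traced with a small Python sketch. Several things here are assumptions made for illustration: `px` and `bx` are taken to be integer division and remainder by the page size, the value of `PT_START` is hypothetical, and the configuration is passed as two plain lists (`pt` and `pto`) instead of a record. The page size of 4096 bytes follows from the statement that a page holds eight 512-byte sectors.

```python
PG_SZ = 4096              # bytes per page
PTES_PER_PG = PG_SZ // 4  # four-byte page table entries per page
PT_START = 16 * PG_SZ     # hypothetical byte address of the page table region


def px(a):
    # assumed page index: upper part of an address or of a page table entry
    return a // PG_SZ


def bx(a):
    # assumed byte index: offset inside the page
    return a % PG_SZ


def pto_pfh(pto, pid):
    # origin of the process's page table as a row index of the page table array
    return (pto[pid] * PG_SZ - PT_START) // (4 * PTES_PER_PG)


def ptea1(pto, pid, vpx):
    return pto_pfh(pto, pid) + vpx // PTES_PER_PG


def ptea2(vpx):
    return vpx % PTES_PER_PG


def pte_pfh(pt, pto, pid, vpx):
    return pt[ptea1(pto, pid, vpx)][ptea2(vpx)]


def pma_pfh(pt, pto, pid, va):
    return px(pte_pfh(pt, pto, pid, px(va))) * PG_SZ + bx(va)


# Toy setup: process 1 has its table at physical page 16, process 2 at page 18.
pto = [0, 16, 18]
pt = [[0] * PTES_PER_PG for _ in range(4)]
pt[2][5] = 7 * PG_SZ + 3   # entry of process 2, virtual page 5: physical page 7 plus flag bits
```

For process 2, the virtual address `5 * PG_SZ + 0x123` is mapped to `7 * PG_SZ + 0x123`: the flag bits in the low part of the entry are discarded by `px`, and the byte offset is carried over unchanged.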
4.3.2 Page Faults
Let pte be a four byte long page table entry. Table 4.1 defines how we compute
valid, protected, and executable bits from the entry.
Table 4.1: Bits computed from a page table entry
Bit          Notation      Computation
valid        valid?(pte)   pte / 2^{VALID POS} mod 2
protected    prot?(pte)    pte / 2^{PROT POS} mod 2
executable   exec?(pte)    pte / 2^{EXEC POS} mod 2
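If a memory access takes place at a virtual address va whose page index is
greater than or equal to the number of virtual pages of the process, a page
table length exception occurs:
DEFINITION 4.15 (Page table length exception)
ptlexcpPFH?(cPFH, pid, va) = px(va) ≥ cPFH.ptl[pid] + 1.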
An invalid access page fault occurs if a process pid accesses memory at
virtual address va such that the valid bit of the corresponding page table entry
is not set:
DEFINITION 4.16 (Invalid access page fault)
iapf?(cPFH, pid, va) = ¬valid?(ptePFH(cPFH, pid, px(va))).
A zero protection page fault occurs on writing at virtual address va, such
that the corresponding physical page index points to the zero filled page. That
the memory access is a write access is determined by the external parameter
intent, which is passed to the page-fault handler. Any value other than READ
denotes some kind of write access:
DEFINITION 4.17 (Zero protection page fault)
zppf?(cPFH, pid, va, intent) = px(ptePFH(cPFH, pid, px(va))) = ZFP
∧ prot?(ptePFH(cPFH, pid, px(va)))
∧ intent ≠ READ.
Altogether, the page faults distinguishable on the software level are
captured by the following predicate:
DEFINITION 4.18 (Page fault)
pfPFH?(cPFH, pid, va, intent) =
iapf?(cPFH, pid, va) ∨ zppf?(cPFH, pid, va, intent).
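The page fault predicates can be sketched in Python on an already looked-up page table entry. The bit positions, the index `ZFP` of the zero filled page, and the encoding of `READ` are hypothetical values chosen for the example; `px` is assumed to be integer division by the page size.

```python
PG_SZ = 4096
VALID_POS, PROT_POS, EXEC_POS = 11, 10, 9   # hypothetical bit positions below the page index
ZFP = 3                                     # hypothetical index of the zero filled page
READ = 0                                    # hypothetical encoding of the read intention


def px(a):
    return a // PG_SZ


def bit(pte, pos):
    # pte / 2^pos mod 2, as in Table 4.1
    return (pte // 2 ** pos) % 2


def valid(pte):
    return bit(pte, VALID_POS) == 1


def prot(pte):
    return bit(pte, PROT_POS) == 1


def iapf(pte):
    # invalid access page fault: valid bit not set
    return not valid(pte)


def zppf(pte, intent):
    # zero protection page fault: write to a protected entry pointing to the zero filled page
    return px(pte) == ZFP and prot(pte) and intent != READ


def pf(pte, intent):
    return iapf(pte) or zppf(pte, intent)


def ptlexcp(ptl, pid, va):
    # page table length exception: page index beyond the allocated virtual pages
    return px(va) >= ptl[pid] + 1


fresh = ZFP * PG_SZ + 2 ** VALID_POS + 2 ** PROT_POS    # newly allocated page
mapped = 7 * PG_SZ + 2 ** VALID_POS + 2 ** EXEC_POS     # page backed by physical page 7
```

Reading a freshly allocated page causes no fault, whereas any other intention on it does; a page backed by a real physical page faults in neither case.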
4.4 I/O Operations on Swap Memory
Let vpx be a virtual page index, a natural number less than TOT PGS. A big-
page index of vpx is computed by means of the function
DEFINITION 4.19 (Big-page index)
bpx-of-vpx(vpx) = vpx/PGS PER BIG PG.
With the constant values considered in this thesis the big-page index consists
of the ten leading bits of the virtual page index: bpx-of-vpx(vpx) = bin(vpx)[19 : 10].
The big byte index of vpx is computed by means of the function
DEFINITION 4.20 (Big byte index)
bbx-of-vpx(vpx) = vpx mod PGS PER BIG PG.
It consists of the ten trailing bits of the virtual page index: bbx-of-vpx(vpx) =
bin(vpx)[9 : 0].
For a process identifier pid and a virtual page index vpx a big-page table
entry address is computed by summing the big-page table origin of the process
and the big-page index of vpx:
DEFINITION 4.21 (Big-page table entry address)
bpteaPFH(cPFH, vpx, pid) = cPFH.bpto[pid] + bpx-of-vpx(vpx).
The corresponding big-page table entry is defined as an element of the
big-page table list taken at the big-page table entry address:
DEFINITION 4.22 (Big-page table entry)
bptePFH(cPFH, vpx, pid) = cPFH.bpt[bpteaPFH(cPFH, vpx, pid)].
We obtain a swap page index by adding the big byte index as an offset to the
big-page table entry. Additionally, we have to consider the offset BOOT PGS
of the boot region:
DEFINITION 4.23 (Swap page index)
spxPFH(cPFH, vpx, pid) =
bptePFH(cPFH, vpx, pid) · BIG PG SZ + bbx-of-vpx(vpx) + BOOT PGS.
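The computation of a swap page index can be followed step by step in a short Python sketch. The value 1024 for pages per big page follows from the ten-bit split stated above; treating the factor `BIG_PG_SZ` as the same number (a big page size measured in pages) and the value of `BOOT_PGS` are assumptions of this example, as is passing `bpt` and `bpto` as plain lists.

```python
PGS_PER_BIG_PG = 1024   # ten trailing bits of the virtual page index
BIG_PG_SZ = 1024        # assumption: big page size measured in pages
BOOT_PGS = 16           # hypothetical size of the boot region in pages


def bpx_of_vpx(vpx):
    return vpx // PGS_PER_BIG_PG


def bbx_of_vpx(vpx):
    return vpx % PGS_PER_BIG_PG


def bptea_pfh(bpto, vpx, pid):
    # big-page table entry address: origin of the process plus big-page index
    return bpto[pid] + bpx_of_vpx(vpx)


def bpte_pfh(bpt, bpto, vpx, pid):
    return bpt[bptea_pfh(bpto, vpx, pid)]


def spx_pfh(bpt, bpto, vpx, pid):
    # swap page index: big page base, offset inside the big page, boot region offset
    return bpte_pfh(bpt, bpto, vpx, pid) * BIG_PG_SZ + bbx_of_vpx(vpx) + BOOT_PGS


bpto = [0, 0, 4]                  # toy big-page table origins
bpt = [9, 8, 7, 6, 5, 4, 3, 2]    # toy big-page table
vpx = 2 * 1024 + 37               # big-page index 2, big byte index 37
```

For process 2 the entry address is 4 + 2 = 6, the entry is 3, and the resulting swap page index is 3 · 1024 + 37 + 16 = 3125.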
4.4.1 Address Adjustment
The granularity of hard disk accesses is a sector of 128 words (512 bytes).
Hence, a single page contains PG SZ/512 = 8 sectors. A swap memory address
computed by the function spxPFH(cPFH, vpx, pid) is measured in pages. In order
to convert it into a sector address we introduce the function
page2sec(ad) = ad · 8.
A conversion in opposite direction, i.e., from sector addresses to page addresses
is performed by the function
sec2page(ad) = ad/8.
As depicted in Figure 4.1 the swap memory component cX.swap of the
extended state models the part of the hard disk swap memory above the boot
region.
In order to use the swap memory page address to access the extended state
we have to compensate for the offset BOOT PGS. For this we define the following
function:
offsswap(ad) = ad − BOOT PGS.
Similarly, the physical memory component cX.mem of the extended state
models the user part of the physical memory and the zero filled page. Addresses
in the extended state component cX.mem start from zero. The following function
is used to convert page addresses in real physical memory to those in
the extended state:
offsmem(ad) = ad − ZFP.
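The four address adjustment functions are simple enough to write down directly. In the Python sketch below the values of `BOOT_PGS` and `ZFP` are hypothetical; the factor 8 is the number of sectors per page given in the text.

```python
SECTORS_PER_PAGE = 8   # PG_SZ / 512
BOOT_PGS = 16          # hypothetical size of the boot region in pages
ZFP = 31               # hypothetical physical page index of the zero filled page


def page2sec(ad):
    # page address -> sector address
    return ad * SECTORS_PER_PAGE


def sec2page(ad):
    # sector address -> page address (integer division)
    return ad // SECTORS_PER_PAGE


def offs_swap(ad):
    # hard disk page address -> index into cX.swap
    return ad - BOOT_PGS


def offs_mem(ad):
    # physical page address -> index into cX.mem
    return ad - ZFP
```

The first page after the boot region and the zero filled page both land on index 0 of their respective extended state component, and `sec2page` undoes `page2sec` for page-aligned sector addresses.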
4.4.2 Swap In
In the demand paging implementation a page is transferred from the hard disk
into the physical memory by means of the function read_from_disk, one of
the hard disk driver functions. On the abstract side we incorporate the effects
of this function into an XCall. The semantics of this XCall operating on the
extended state is defined as follows.
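Let cX be an extended state, let ma be a physical memory page address,
and let sa be a disk sector address. The semantics of reading from the hard
disk is defined by the function read-from-disk :: ℕ × ℕ × CX ↦ CX which returns
an updated extended state c′X = read-from-disk(ma, sa, cX) such that the page
in the physical memory component of the extended state at the address ma is
replaced by the swap memory page taken at the sector address sa converted to
a page address:
\[
c'_X.\text{mem}[i] =
\begin{cases}
c_X.\text{swap}[\text{offs}_{\text{swap}}(\text{sec2page}(sa))] & \text{if } i = \text{offs}_{\text{mem}}(ma) \\
c_X.\text{mem}[i] & \text{otherwise.}
\end{cases}
\]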
Note that ma and sa address pages in the real physical memory of the
underlying machine and in the hard disk content. Therefore, the respective
address adjustments with the functions offsmem and offsswap are applied in
order to translate them to the extended state.
The page-fault handler calls the function read_from_disk through a wrapper
function pfh_swap_in. The latter computes the swap memory address and calls
the hard disk driver.
DEFINITION 4.27 (Semantics of swap in)
Let cPFH be an abstract PFH configuration, let cX be an extended state,
let pid be a process identifier, let ppx be a physical page index, and let vpx
be a virtual page index. The semantics of swap in is defined by the function
swap-in :: ℕ × ℕ × ℕ × CPFH × CX ↦ CX which returns an updated extended
state c′X = swap-in(pid, ppx, vpx, cPFH, cX) such that the page in the physical
memory component of the extended state at the page address ppx is replaced
by the swap memory page taken at the computed swap memory address:
\[
c'_X.\text{mem}[i] =
\begin{cases}
c_X.\text{swap}[\text{offs}_{\text{swap}}(\text{spx}_{\text{PFH}}(c_{\text{PFH}}, vpx, pid))] & \text{if } i = \text{offs}_{\text{mem}}(ppx) \\
c_X.\text{mem}[i] & \text{otherwise.}
\end{cases}
\]
Again, the address adjustments with the functions offsmem and offsswap
translate both addresses to the extended state.
4.4.3 Swap Out
In our demand paging implementation pages are transferred in the opposite
direction, i.e., from the physical memory to the hard disk, by means of the
function write_to_disk, the second of the hard disk driver functions. The
following definition introduces the semantics of the XCall corresponding to
this function.
Let cX be an extended state, let ma be a physical memory page address,
and let sa be a disk sector address. The semantics of writing to the hard disk
is defined by the function write-to-disk :: ℕ × ℕ × CX ↦ CX which returns an
updated extended state c′X = write-to-disk(ma, sa, cX) such that the page in
the swap memory component at the address sa converted to the page address
is replaced by the physical memory page taken at the address ma. Necessary
address shifts with the functions offsmem and offsswap take place. Formally:
\[
c'_X.\text{swap}[i] =
\begin{cases}
c_X.\text{mem}[\text{offs}_{\text{mem}}(ma)] & \text{if } i = \text{offs}_{\text{swap}}(\text{sec2page}(sa)) \\
c_X.\text{swap}[i] & \text{otherwise.}
\end{cases}
\]
The page-fault handler calls the function write_to_disk through a wrapper
function pfh_swap_out. The latter computes the swap memory address and
calls the hard disk driver.
DEFINITION 4.29 (Semantics of swap out)
Let cPFH be an abstract PFH configuration, let cX be an extended state,
let pid be a process identifier, let ppx be a physical page index, and let vpx
be a virtual page index. The semantics of swap out is defined by the function
swap-out :: ℕ × ℕ × ℕ × CPFH × CX ↦ CX which returns an updated extended
state c′X = swap-out(pid, ppx, vpx, cPFH, cX) such that the page in the swap
memory component of the extended state at the computed swap memory
address is replaced by the physical memory page taken at the page address ppx:
\[
c'_X.\text{swap}[i] =
\begin{cases}
c_X.\text{mem}[\text{offs}_{\text{mem}}(ppx)] & \text{if } i = \text{offs}_{\text{swap}}(\text{spx}_{\text{PFH}}(c_{\text{PFH}}, vpx, pid)) \\
c_X.\text{swap}[i] & \text{otherwise.}
\end{cases}
\]

Let cX be an extended state, and let ppx be a physical page index. The
semantics of page zero fill is defined by the function zero-fill-page :: ℕ × CX ↦ CX
which returns an updated extended state c′X = zero-fill-page(ppx, cX) such that
the page in the physical memory component of the extended state at the page
address ppx is filled with zeros:
\[
c'_X.\text{mem}[i] =
\begin{cases}
\text{rep}(0, \text{PG SZ WD}) & \text{if } i = \text{offs}_{\text{mem}}(ppx) \\
c_X.\text{mem}[i] & \text{otherwise.}
\end{cases}
\]
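The three state updates on the extended state can be sketched as pure functions in Python. This is an illustration under assumptions: the record CX is modeled as a named tuple of two page lists, the constants are toy values, and the zero fill indexes the memory component through the memory offset function, since it updates cX.mem at a physical page address.

```python
from typing import List, NamedTuple


class XState(NamedTuple):
    mem: List[List[int]]    # user memory pages plus the zero filled page
    swap: List[List[int]]   # swap pages on the hard disk


BOOT_PGS = 2    # hypothetical
ZFP = 4         # hypothetical
PG_SZ_WD = 4    # toy page size in words


def sec2page(sa):
    return sa // 8


def offs_swap(ad):
    return ad - BOOT_PGS


def offs_mem(ad):
    return ad - ZFP


def _replace(pages, idx, page):
    # functional update: copy of `pages` with the page at `idx` replaced
    return [list(page) if i == idx else p for i, p in enumerate(pages)]


def read_from_disk(ma, sa, cx):
    src = cx.swap[offs_swap(sec2page(sa))]
    return XState(_replace(cx.mem, offs_mem(ma), src), cx.swap)


def write_to_disk(ma, sa, cx):
    src = cx.mem[offs_mem(ma)]
    return XState(cx.mem, _replace(cx.swap, offs_swap(sec2page(sa)), src))


def zero_fill_page(ppx, cx):
    return XState(_replace(cx.mem, offs_mem(ppx), [0] * PG_SZ_WD), cx.swap)


cx = XState(mem=[[0] * 4, [1] * 4, [2] * 4], swap=[[7] * 4, [8] * 4, [9] * 4])
```

Each function returns a new state and leaves its argument untouched, which mirrors the case distinction "replaced at one index, unchanged otherwise" of the definitions.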
4.5 Page-Fault Handling Algorithm
For natural numbers x and pos the function
\[
\text{set-bit}(x, pos) = x - x \bmod 2^{pos+1} + 2^{pos} + x \bmod 2^{pos}
\]
sets the bit at the position pos of x. The bit is cleared by means of the function
\[
\text{clear-bit}(x, pos) = x - x \bmod 2^{pos+1} + x \bmod 2^{pos}.
\]
Table 4.2 depicts how we use these two functions for setting and clearing valid,
protected, and executable bits.
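The arithmetic definitions of setting and clearing a bit can be compared against the usual bitwise operators. The Python sketch below transcribes both formulas literally; the function names are the only additions.

```python
def set_bit(x, pos):
    # drop bits 0..pos, put bit pos back as 1, restore bits 0..pos-1
    return x - x % 2 ** (pos + 1) + 2 ** pos + x % 2 ** pos


def clear_bit(x, pos):
    # drop bits 0..pos, restore bits 0..pos-1, leaving bit pos at 0
    return x - x % 2 ** (pos + 1) + x % 2 ** pos
```

For natural numbers the two functions agree with `x | (1 << pos)` and `x & ~(1 << pos)`: the term `x mod 2^(pos+1)` removes the bit at position pos together with everything below it, and `x mod 2^pos` puts the lower bits back.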
Suppose a page fault takes place in the abstract PFH configuration cPFH for
a process identifier pid, virtual address va, and intention intent,
i.e., pfPFH?(cPFH, pid, va, intent). The goal of the page-fault handler is to
move to a configuration c′PFH in which the page fault predicate does not hold.
The transition is defined by the page-fault handling algorithm specified
below.
The algorithm operates on two configurations: the abstract PFH config-
uration and the extended state. Modifications of the former specify a page
replacement strategy together with validation and invalidation of necessary
page table entries. Changes in the extended state define page transfer between
physical and swap memory.
Table 4.2: Setting and clearing bits of a page table entry
Function           Definition                  Meaning
set-valid(pte)     set-bit(pte, VALID POS)     set valid bit
clear-valid(pte)   clear-bit(pte, VALID POS)   clear valid bit
set-prot(pte)      set-bit(pte, PROT POS)      set protected bit
set-exec(pte)      set-bit(pte, EXEC POS)      set executable bit
4.5.1 Updating Data Structures
The abstract PFH configuration is brought to a non page faulting state c′PFH
by means of the function handle-pfPFH :: CPFH × ℕ × ℕ ↦ CPFH such that
c′PFH = handle-pfPFH(cPFH, pid, va). The function defines the following actions.
First of all a descriptor vict of a page for eviction from the physical memory
is selected. The free list is examined in order to find out whether any unused
page resides in the physical memory and could be given to the page faulting
process. If the free list is empty, the head of the active list is selected;
otherwise the head of the free list is taken:
\[
vict =
\begin{cases}
c_{\text{PFH}}.\text{active}[0] & \text{if } c_{\text{PFH}}.\text{free} = [\,] \\
c_{\text{PFH}}.\text{free}[0] & \text{otherwise.}
\end{cases}
\]
Updates of the active and free lists specify the page replacement strategy.
We use the FIFO eviction strategy, which guarantees that the page swapped
in during the previous call to the handler will not be swapped out during the
current call. This property is crucial for the page-fault handler’s liveness since
a single instruction can cause up to two page faults on the physical machine:
one during the fetch phase, the other during a load or store operation.
The active list is updated as follows. In case the free list is empty there is
no page in the physical memory which could be immediately given to the page
faulting process. Following the selected FIFO strategy we remove the element
at the head of the active list. By this we obtain a vacant place in the active
list. In case the free list is not empty there is at least one vacant place in the
active list. The reason is that the sum of the lengths of the active and free
lists is always equal to the total number of user pages. Now, we insert a descriptor
pd of the page that is swapped in at the end of the (so far partially) updated
active list. We assign the process identifier and virtual page index fields of pd
those values that caused a page fault: pd.pid = pid and pd.vpx = px(va). The
physical page index field of pd is assigned the page index of the evicted page,
pd.ppx = vict.ppx:
\[
c'_{\text{PFH}}.\text{active} =
\begin{cases}
\text{tl}(c_{\text{PFH}}.\text{active}) \circ pd & \text{if } c_{\text{PFH}}.\text{free} = [\,] \\
c_{\text{PFH}}.\text{active} \circ pd & \text{otherwise.}
\end{cases}
\]
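The free list update completes the specification of the page replacement
strategy:
\[
c'_{\text{PFH}}.\text{free} =
\begin{cases}
c_{\text{PFH}}.\text{free} & \text{if } c_{\text{PFH}}.\text{free} = [\,] \\
\text{tl}(c_{\text{PFH}}.\text{free}) & \text{otherwise.}
\end{cases}
\]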
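The victim selection and the two list updates can be sketched together in Python. The descriptor type `PD` and the function names are assumptions of this illustration; the lists are ordinary Python lists with the head at index 0.

```python
from typing import NamedTuple


class PD(NamedTuple):
    pid: int
    vpx: int
    ppx: int


def select_victim(active, free):
    # head of the active list if no free page is left, else head of the free list
    return active[0] if not free else free[0]


def update_lists(active, free, pid, vpx):
    """Return (new_active, new_free, vict) for a fault of (pid, vpx)."""
    vict = select_victim(active, free)
    pd = PD(pid, vpx, vict.ppx)        # the faulting page inherits the victim's physical page
    if not free:
        return active[1:] + [pd], free, vict   # FIFO: drop the head, append at the end
    return active + [pd], free[1:], vict
```

In both branches the sum of the list lengths is unchanged. The sketch assumes that the two lists are never empty at the same time, which holds as long as that sum equals the number of user pages.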
The modifications of the active and free lists have to be reflected on the
page tables: the entry of the page that is swapped out has to be invalidated
while the entry which corresponds to the process identifier and virtual address
at which the page fault is being handled must be validated.
The predicate invalid-ptea?(i, j) holds if i and j are indices of the entry for
invalidation in the page table array cPFH.pt. The indices correspond to the page
table entry addresses for both dimensions computed for the process identifier
and virtual page index of the page subject to eviction:
invalid-ptea?(i, j) = (i = ptea1(cPFH, vict.pid, vict.vpx)
∧ j = ptea2(vict.vpx)).
The predicate valid-ptea?(i, j) holds if i and j are indices of the entry for
validation in the page table array cPFH.pt. The indices correspond to the page
table entry addresses for both dimensions computed for the page faulting
process identifier and the page index of the faulting virtual address:
valid-ptea?(i, j) = (i = ptea1(cPFH, pid, px(va))
∧ j = ptea2(px(va))).
The page table array elements cPFH.pt[i][j] are updated in a straightforward
way. In case the free list is empty the entry with invalid-ptea?(i, j) is invalidated
by clearing its valid bit. The entry with valid-ptea?(i, j) is validated by setting
its valid bit. The executable bit is raised as well. All other entries remain
unchanged. Formally:
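DEFINITION 4.33 (Updating page table space during page-fault handling)
\[
c'_{\text{PFH}}.\text{pt}[i][j] =
\begin{cases}
\text{clear-valid}(c_{\text{PFH}}.\text{pt}[i][j]) & \text{if invalid-ptea?}(i, j) \land c_{\text{PFH}}.\text{free} = [\,] \\
\text{set-exec}(\text{set-valid}(vict.ppx \cdot \text{PG SZ})) & \text{if valid-ptea?}(i, j) \\
c_{\text{PFH}}.\text{pt}[i][j] & \text{otherwise.}
\end{cases}
\]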
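The per-entry case distinction of the page table update can be sketched in Python as a function of the old entry and of the two address predicates, which are passed in as booleans. The bit positions are hypothetical, and the bit operations are written with bitwise operators.

```python
PG_SZ = 4096
VALID_POS, EXEC_POS = 11, 9   # hypothetical bit positions


def set_valid(pte):
    return pte | (1 << VALID_POS)


def clear_valid(pte):
    return pte & ~(1 << VALID_POS)


def set_exec(pte):
    return pte | (1 << EXEC_POS)


def new_entry(old, is_invalid_ptea, is_valid_ptea, free_empty, vict_ppx):
    """New value of one page table entry; cases are tried in the order of the definition."""
    if is_invalid_ptea and free_empty:
        return clear_valid(old)                        # entry of the evicted page
    if is_valid_ptea:
        return set_exec(set_valid(vict_ppx * PG_SZ))   # entry of the faulting page
    return old                                         # everything else is untouched


mapped = 7 * PG_SZ | (1 << VALID_POS) | (1 << EXEC_POS)
```

The validated entry is built from scratch out of the victim's physical page index, so any protected bit of the old entry disappears. An entry at the eviction address is left alone when the free list is not empty, because then no page is evicted.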
4.5.2 Updating Extended State
On the side of the extended state a page fault is handled by means of the
function handle-pfX :: CPFH × CX × ℕ × ℕ × ℕ ↦ CX which returns an updated
extended state c′X = handle-pfX(cPFH, cX, pid, vpx, intent). The function
specifies the page transfer between the physical memory and the swap memory
during page-fault handling.
The extended state update must be consistent with the abstract PFH con-
figuration update. This results in two facts: (i) the page pointed to by the page
table entry invalidated during the abstract PFH configuration update must be
swapped out from the physical memory, and (ii) the page pointed to by the
page table entry validated during the abstract PFH configuration update must
be either filled with zeros, filled with the swapped in data, or intentionally left
unchanged.
First, we formally define the second fact. The data which is written to the
validated page depends on the kind of a page fault: an invalid access or zero
protection. For an easier distinction between them we introduce the predicate
zfp?(cPFH, pid, vpx). It holds if the page index of the page table entry which
corresponds to the process identifier pid and the virtual page index vpx points
to the zero filled page:
zfp?(cPFH, pid, vpx) = px(ptePFH(cPFH, pid, vpx)) = ZFP.
Recall that the parameter intent defines the intention of a call to the
page-fault handler. In case it equals OVERWRITE, denoted by owr?(intent), the
caller is going to overwrite the entire physical page which corresponds to pid
and vpx with some data¹. Therefore, we can optimize the algorithm by keeping
the original data at that page. If the intention is different from overwriting,
the physical memory component of the extended state is modified at the page
address vict.ppx by filling the page with zeros, in case zfp?(cPFH, pid, vpx),
or with the swapped in data, otherwise.
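\[
\text{fill-page}(c_{\text{PFH}}, c_X, pid, vpx, intent) =
\begin{cases}
\text{zero-fill-page}(vict.ppx, c_X) & \text{if } \lnot\text{owr?}(intent) \land \text{zfp?}(c_{\text{PFH}}, pid, vpx) \\
\text{swap-in}(pid, vict.ppx, vpx, c_{\text{PFH}}, c_X) & \text{if } \lnot\text{owr?}(intent) \land \lnot\text{zfp?}(c_{\text{PFH}}, pid, vpx) \\
c_X & \text{otherwise.}
\end{cases}
\]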
As for the first fact, we use the semantics of swap out with the process
identifier and the physical and virtual page indices taken from the descriptor
vict of the page for eviction. By that we obtain an intermediate configuration
ĉX of the extended state in which the invalidated page is swapped out:
ĉX = swap-out(vict.pid, vict.ppx, vict.vpx, cPFH, cX).
Swap out takes place only if the free list is empty.
Altogether, we define the configuration c′X of the extended state after
page-fault handling. The resulting configuration c′X is defined by the page
filling function applied to the intermediate configuration ĉX, in case the free
list is empty, or to the original configuration cX, otherwise:
\[
c'_X =
\begin{cases}
\text{fill-page}(c_{\text{PFH}}, \hat{c}_X, pid, vpx, intent) & \text{if } c_{\text{PFH}}.\text{free} = [\,] \\
\text{fill-page}(c_{\text{PFH}}, c_X, pid, vpx, intent) & \text{otherwise.}
\end{cases}
\]
¹ An example of such a call occurs in the CVM primitive cvm_copy (cf. Section 9.3 of
Tsyban’s thesis [Tsy09]).
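The control structure of the extended state update can be summarized by listing which operations are applied in which order. The Python sketch below abstracts from the actual page transfer and only returns the names of the steps; the string labels and function names are illustrative assumptions.

```python
def fill_action(owr, zfp):
    """Which data ends up in the validated page, following the three cases of fill-page."""
    if not owr and zfp:
        return "zero-fill"
    if not owr:
        return "swap-in"
    return "keep"            # the caller overwrites the page anyway


def handle_pf_x_plan(free_empty, owr, zfp):
    """Ordered list of steps applied to the extended state for one page fault."""
    plan = ["swap-out"] if free_empty else []   # evict the victim only if no free page is left
    plan.append(fill_action(owr, zfp))
    return plan
```

For example, a fault with an empty free list on a page that lives on the swap memory leads to a swap out followed by a swap in, whereas a fault with the overwrite intention never transfers data into the validated page.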
4.5.3 Prevention of Most Recently Loaded Page Swap Out
Our demand paging system supports features for close interaction with CVM.
The CVM implementation includes a primitive for copying data between two
user processes. This primitive requires that both the source and the destination
page reside in the physical memory. However, if the page-fault handler just checks
that the page is currently in the memory, a subsequent call to the handler
might invalidate it. In order to overcome this problem, the parameter count
with a default value of zero is introduced to the page-fault handler. When it
is set to one the page-fault handler ensures that the specified page will survive
the second call of the handler. In case the descriptor of the considered page is
the head of the active list we need to push it one position further inside
the active list. An updated abstract PFH configuration c′PFH in this case is
obtained by means of the function prevent-swap-out :: CPFH ↦ CPFH which
modifies only the active list component of its argument by swapping its first
and second elements:
DEFINITION 4.36 (Preventing page swap out)
c′PFH.active = hd(tl(cPFH.active)) ◦ hd(cPFH.active) ◦ tl(tl(cPFH.active)).
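Swapping the first two elements of the active list is a one-line operation on a Python list. The sketch below assumes that the list has at least two elements, as the definition does implicitly through hd(tl(·)); the function name is an illustrative assumption.

```python
def prevent_swap_out(active):
    """Return the active list with its first and second elements exchanged."""
    # hd(tl(active)) ◦ hd(active) ◦ tl(tl(active)); requires len(active) >= 2
    return [active[1], active[0]] + active[2:]
```

Since the FIFO strategy evicts the head of the active list, the former head now survives the next eviction and becomes the head only afterwards.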
4.6 Validity
We demand a variety of properties to hold for abstract PFH configurations.
These properties reflect the functional correctness and are necessary for the
top-level correctness proof of the page-fault handler. All of the properties
are established for the first time after an execution of the page-fault handler
initialization code and are preserved under calls to the handler. Therefore, we
will refer to these properties as validity invariants of the page-fault handler.
Whenever we introduce such an invariant further in this section we intro-
duce a lemma for it as well. Each lemma assumes that a page-fault takes
place in a configuration cPFH for a process identifier pid at virtual address va
and claims that the invariant is preserved under the page-fault handling algo-
rithm. Hence, each lemma states that the invariant holds in the configuration
handle-pfPFH(cPFH, pid, va).
Certainly, each validity invariant holds in the initial configuration init-cPFH
of the page-fault handler. However, since such lemmas are monotonous we
omit their statements. We rather present a single lemma at the end of this
section claiming that all validity invariants hold in the initial configuration.
4.6.1 Invariants about Active and Free Lists
Active list describes only user processes. Recall that our system supports
MAX PID processes. The process with identifier pid = 0 is an operating-system
kernel; all other identifiers denote user processes. We call pid a user process
identifier if it points to a user process and denote by the predicate
user-pid?(pid) = 0 < pid < MAX PID.
As we support virtual memory only for the latter the above property must hold
for all elements of the active list:
DEFINITION 4.37
user-pid-active?(cPFH) = ∀ i < |cPFH.active| : user-pid?(cPFH.active[i].pid).
Active list describes only user memory. We call ppx a user physical page index
if its value lies within the user memory region, i.e., USER PGS pages starting
from the page address KERNEL PGS, and denote this by
user-ppx?(ppx) = KERNEL PGS ≤ ppx < TOT PHYS PGS.
This property holds for all elements of the active and free lists: they have
their physical page index field inside the user memory range:
DEFINITION 4.38
user-ppx-active-free?(cPFH) =
∀ i < USER PGS : user-ppx?((cPFH.active ◦ cPFH.free)[i].ppx).
Active list describes only valid pages. A page descriptor is inserted in the
active list in case the memory page pointed to by it is either already in the
physical memory, or is supposed to be transferred to the main memory from the
hard disk within the current call to the page-fault handler. The corresponding
page table entry is valid afterwards. Therefore, all page table entries pointed
to by the active list have their valid bits set:
DEFINITION 4.39
valid-active?(cPFH) = ∀ i < |cPFH.active| :
valid?(ptePFH(cPFH, cPFH.active[i].pid, cPFH.active[i].vpx)).
Active list describes only unprotected pages. As follows from Definition 4.39
the active list describes only valid pages. A page might be validated only
during a call to the page-fault handler. According to the page-fault handling
algorithm, when a page table entry is validated only its valid and executable
bits are raised. The protected bit is kept cleared. Hence, all elements of the
active list describe page table entries which have their protected bits cleared:
DEFINITION 4.40
not-prot-active?(cPFH) = ∀ i < |cPFH.active| :
¬prot?(ptePFH(cPFH, cPFH.active[i].pid, cPFH.active[i].vpx)).
Active list points to physical page indices. The physical page index field of
active list’s elements must point exactly to the physical page associated with a
given process identifier and virtual page index. In other words, it must be equal
to the page index of the page table entry computed for this process identifier
and virtual page index.
DEFINITION 4.41
ppx-active-eq-ppx-pt?(cPFH) = ∀ i < |cPFH.active| :
cPFH.active[i].ppx = px(ptePFH(cPFH, cPFH.active[i].pid, cPFH.active[i].vpx)).
Active and free lists do not point to zero-filled page. We allocate a page of
virtual memory at page address vpx for a process pid by making the page table
entry computed for vpx and pid point to the (shared) zero-filled page. A write
access to this newly allocated virtual page generates a zero protection page
fault. The page-fault handler assigns a real physical page to the pair (pid, vpx)
and inserts this entry into the active list. Hence, no descriptors pointing to
the zero-filled page can occur in the active list. Initially, no elements of the
free list point to the zero-filled page. Since only descriptors from the active list
might be inserted into the free list during our system run, there are no elements
referring to the zero-filled page in the free list either:
DEFINITION 4.42
ppx-active-free-not-zfp?(cPFH) =
∀ i < USER PGS : (cPFH.active ◦ cPFH.free)[i].ppx ≠ ZFP.
Active list contains only distinct pairs (pid, vpx). None of the virtual pages of
a particular user process might be stored by two or more different active pages.
In other words, pairs of the form (pid, vpx) must be distinct within the active
list:
DEFINITION 4.43
dstnct-pid-vpx-active?(cPFH) = ∀ i, j < |cPFH.active|, i ≠ j :
cPFH.active[i].(pid, vpx) ≠ cPFH.active[j].(pid, vpx).
Active and free lists have distinct physical page indices. A physical page
cannot be shared by two or more virtual pages. The following invariant guarantees
that there are no two page descriptors in the active and free lists that point
to the same physical page:
DEFINITION 4.44
dstnct-ppx-active-free?(cPFH) = ∀ i, j < USER PGS, i ≠ j :
(cPFH.active ◦ cPFH.free)[i].ppx ≠ (cPFH.active ◦ cPFH.free)[j].ppx.
Active list respects page table lengths. Recall that the page table lengths
component of abstract PFH configurations encodes the amount of virtual memory
allocated for processes. A process pid has cPFH.ptl[pid]+1 pages of virtual memory.
Virtual page indices of the active list’s elements must be bounded from above by
the respective virtual memory amount:
DEFINITION 4.45
vpx-less-ptl-active?(cPFH) =
∀ i < |cPFH.active| : cPFH.active[i].vpx < cPFH.ptl[cPFH.active[i].pid]+1.
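Several of the list invariants above translate directly into executable checks. The following Python sketch covers five of them; the constants are toy values, the descriptor type `PD` is an assumption of the example, and the checks over the concatenated lists iterate over all of their elements, which coincides with the bound USER PGS whenever the two lengths add up to that number.

```python
from typing import NamedTuple


class PD(NamedTuple):
    pid: int
    vpx: int
    ppx: int


MAX_PID, KERNEL_PGS, TOT_PHYS_PGS = 4, 4, 8   # hypothetical toy constants


def user_pid_active(active):
    return all(0 < d.pid < MAX_PID for d in active)


def user_ppx_active_free(active, free):
    return all(KERNEL_PGS <= d.ppx < TOT_PHYS_PGS for d in active + free)


def dstnct_pid_vpx_active(active):
    pairs = [(d.pid, d.vpx) for d in active]
    return len(set(pairs)) == len(pairs)


def dstnct_ppx_active_free(active, free):
    ppxs = [d.ppx for d in active + free]
    return len(set(ppxs)) == len(ppxs)


def vpx_less_ptl_active(active, ptl):
    return all(d.vpx < ptl[d.pid] + 1 for d in active)


active = [PD(1, 0, 4), PD(2, 0, 5), PD(1, 1, 6)]
free = [PD(0, 0, 7)]
ptl = [0, 1, 0, -1]
```

Adding a second descriptor for the pair (1, 0) that reuses physical page 7 breaks both distinctness invariants at once.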
Valid page table entries describe active list. The correspondence between page
table entries which point to pages currently located in the physical memory
and elements of the active list is given by the following invariant. Consider all
user process identifiers pid and virtual page indices vpx such that the latter is
bounded by the amount of virtual memory the corresponding process has. If
the valid bit of the page table entry computed for pid and vpx is raised and this
entry does not point to the zero-filled page, then there is a page descriptor in
the active list such that its process identifier field equals pid, its virtual page
index field equals vpx, and its physical page index field equals the page index
of the computed page table entry:
DEFINITION 4.46
valid-pte-descr-active?(cPFH) =
∀ 0 < pid < MAX PID, vpx < cPFH.ptl[pid] + 1 :
valid?(ptePFH(cPFH, pid, vpx))
∧ px(ptePFH(cPFH, pid, vpx)) ≠ ZFP
−→ ∃ i < |cPFH.active| :
cPFH.active[i].pid = pid
∧ cPFH.active[i].vpx = vpx
∧ cPFH.active[i].ppx = px(ptePFH(cPFH, pid, vpx)).
4.6.2 Invariants about Page Tables
Page tables refer only to user memory. User processes, certainly, must not
modify kernel code and data. User processes operate with virtual addresses
which are translated into physical memory addresses. The latter addresses
must lie outside the kernel range. In order to guarantee that, page indices of
page table entries must also have values outside the kernel range. However,
we do not impose this restriction on all page table entries, but rather on those
that are valid and do not point to the zero-filled page, and, therefore, can be
used for address translation. We use the predicate user-ppx? in order to limit
page indices in page tables:
DEFINITION 4.47
user-ppx-pt?(cPFH) = ∀ i < TOT PGS PT, j < PTES PER PG :
valid?(cPFH.pt[i][j]) ∧ px(cPFH.pt[i][j]) ≠ ZFP
−→ user-ppx?(px(cPFH.pt[i][j])).
Page tables do not overlap. Our system features separate address spaces of
user processes. Therefore page tables must not overlap. For any given pair of
page tables of user processes, either one must end before the other starts or vice
versa. The page table of a process i starts in the page table space at the origin
ptoPFH(cPFH, i). This page table has ptlPFH(cPFH, i) entries which constitute its
length. The sum of the origin and the length has to be less than the origin of
the next process. We express this more formally in the following definition:
DEFINITION 4.48
not-overlap-pt?(cPFH) = ∀ j < MAX PID, 0 < i < j :
ptoPFH(cPFH, i) + ptlPFH(cPFH, i) < ptoPFH(cPFH, j).
Page tables do not exceed page table space. A slightly different invariant en-
sures that all page tables lie within the boundaries of the page table space:
DEFINITION 4.49
pto-ptl-less-pt?(cPFH) =
∀ 0 < i < MAX PID : ptoPFH(cPFH, i) + ptlPFH(cPFH, i) < TOT PGS PT.
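The two layout invariants on page tables can be checked on precomputed origins and lengths. In the Python sketch below the lists `origins` and `lengths` stand for the values of ptoPFH and ptlPFH per process and are assumed to be given; index 0 belongs to the kernel and is ignored, as in the definitions. The numbers are toy values.

```python
def not_overlap_pt(origins, lengths):
    """For all user processes i < j: table i ends strictly before table j starts."""
    n = len(origins)
    return all(
        origins[i] + lengths[i] < origins[j]
        for j in range(n)
        for i in range(1, j)
    )


def pto_ptl_less_pt(origins, lengths, tot_pgs_pt):
    """Every user page table lies inside the page table space."""
    return all(
        origins[i] + lengths[i] < tot_pgs_pt
        for i in range(1, len(origins))
    )


origins = [0, 0, 4, 8]   # toy values of ptoPFH for pid 0..3
lengths = [0, 2, 3, 1]   # toy values of ptlPFH for pid 0..3
```

Note that the first check presupposes that the origins grow with the process identifier, since only pairs with i < j are compared.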
Protected pages point to zero-filled page. In our kernel implementation the
protected bit is set only for page table entries of newly allocated pages. These
entries point to the zero-filled page. A write access to a page described by such
page table entry triggers a zero protection page fault. When handling it we
transfer a real physical page into the main memory and make the respective
page table entry point to it. At the same time we clear the protected bit. By
this an invariant arises that the protected bit is equivalent to the fact that a
page table entry points to the zero-filled page:
DEFINITION 4.50
zfp-eq-prot-pt?(cPFH) = ∀ i < TOT PGS PT, j < PTES PER PG :
px(cPFH.pt[i][j]) = ZFP ←→ prot?(cPFH.pt[i][j]).
Valid pages are executable. On the hardware level page faults occur, among
other cases, when the executable bit is not set for a fetch. However, by
design of our kernel we do not treat such a case as a page fault on the software
level. We solve the problem of inconsistency between page faults on the
hardware and software levels by implementing the kernel in a way that the
executable bit is set for all valid page table entries. The following invariant
defines this formally:
DEFINITION 4.51 valid-is-exec-pt?(cPFH) = ∀ i < TOT_PGS_PT, j < PTES_PER_PG :
valid?(cPFH.pt[i][j]) −→ exec?(cPFH.pt[i][j]).
Entries that point to zero-filled page are valid. When allocating a new page for
a process we make its page table entry point to the zero-filled page. At
the same time we set the valid bit for this entry. Our implementation of
demand paging never modifies the valid bit of a page table entry while the
entry points to the zero-filled page, so the following invariant holds:
DEFINITION 4.52 zfp-is-valid-pt?(cPFH) = ∀ i < TOT_PGS_PT, j < PTES_PER_PG :
px(cPFH.pt[i][j]) = ZFP −→ valid?(cPFH.pt[i][j]).
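To illustrate how the three flag invariants of Definitions 4.50 to 4.52 interact, the sketch below checks them on concrete entries. It rests on assumptions that are not part of this chapter: the bit positions of the valid, protected and executable flags, the position of the page index inside an entry, and the value of ZFP are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical entry layout and constants, for illustration only. */
#define PTE_VALID (1u << 11)
#define PTE_PROT  (1u << 10)
#define PTE_EXEC  (1u << 9)
#define ZFP       5u   /* assumed page index of the zero-filled page */

static uint32_t px(uint32_t pte)      { return pte >> 12; }
static bool     valid_p(uint32_t pte) { return (pte & PTE_VALID) != 0; }
static bool     prot_p(uint32_t pte)  { return (pte & PTE_PROT) != 0; }
static bool     exec_p(uint32_t pte)  { return (pte & PTE_EXEC) != 0; }

/* Conjunction of Definitions 4.50, 4.51 and 4.52 for one entry. */
bool pte_invariants(uint32_t pte)
{
    bool zfp_eq_prot   = (px(pte) == ZFP) == prot_p(pte);
    bool valid_is_exec = !valid_p(pte) || exec_p(pte);
    bool zfp_is_valid  = px(pte) != ZFP || valid_p(pte);
    return zfp_eq_prot && valid_is_exec && zfp_is_valid;
}

/* The same check over n entries of a (flattened) page table space. */
bool pt_invariants(const uint32_t *pt, int n)
{
    for (int k = 0; k < n; k++)
        if (!pte_invariants(pt[k]))
            return false;
    return true;
}
```

Under these assumptions an all-zero entry satisfies all three invariants, and so does a freshly allocated entry that is valid, executable, protected and points to ZFP.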
4.6.3 Invariants about Big-Page Tables
Free and used big pages constitute all big pages. The length of the stack of
free big pages gives the total number of free big pages. The total number of
used big pages can be computed as follows. For a process pid the number
of big pages consumed by it is cPFH.bptl[pid] + 1. If the process consumes no
big pages this sum is equal to zero. We sum this expression over all user process
identifiers from 1 to MAX_PID − 1 and obtain the desired total number of used
big pages. The following invariant states that the number of free big pages
plus the number of used big pages equals the total number of big pages:
DEFINITION 4.53 total-bpages?(cPFH) =
|cPFH.bpfree| + Σ_{i=1}^{MAX_PID−1} (cPFH.bptl[i] + 1) = TOT_BIG_PGS.
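The accounting behind this invariant is a plain sum, which the following C sketch makes explicit. The constants carry hypothetical values, and the function signature is an illustration rather than part of the specification.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical constants, for illustration only. */
#define MAX_PID     4
#define TOT_BIG_PGS 8

/* Free big pages plus used big pages must give TOT_BIG_PGS.
 * A process without memory has bptl = -1 and contributes 0. */
bool total_bpages(int bpfree_len, const int bptl[MAX_PID])
{
    int used = 0;
    for (int pid = 1; pid < MAX_PID; pid++)
        used += bptl[pid] + 1;
    return bpfree_len + used == TOT_BIG_PGS;
}
```

The encoding of "no memory" as −1 is what makes the summand bptl[pid] + 1 vanish for such processes, so no case distinction is needed in the sum.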
Big-page table entries are distinct. Free big page table entries are stored in the
stack of free big pages. Next, we formally define a list of used big page table
entries. A process pid occupies entries in the big page table at consecutive
addresses from cPFH.bpto[pid] up to cPFH.bpto[pid] + cPFH.bptl[pid]. We obtain
the list of big-page table entries of the process pid by accessing the big-page
table space at these addresses:
used-bpte-pid(cPFH, pid) = [cPFH.bpt[cPFH.bpto[pid]],
cPFH.bpt[cPFH.bpto[pid] + 1],
. . . ,
cPFH.bpt[cPFH.bpto[pid] + cPFH.bptl[pid]]].
Having this, we can obtain all used big-page table entries by iterating the user
process identifier from 1 to MAX PID− 1 and concatenating the results:
used-bpte(cPFH) = used-bpte-pid(cPFH, 1) ◦ used-bpte-pid(cPFH, 2) ◦ . . .
◦ used-bpte-pid(cPFH, MAX PID− 1).
Now, the requirement for all big-page table entries to be distinct is expressed by
requiring that all elements of the list made of free and used big-page table
entries are distinct.
DEFINITION 4.54 dstnct-bpte?(cPFH) = ∀ i, j < |cPFH.bpfree ◦ used-bpte(cPFH)|, i ≠ j :
(cPFH.bpfree ◦ used-bpte(cPFH))[i] ≠ (cPFH.bpfree ◦ used-bpte(cPFH))[j].
Big-page tables do not exceed big-page table space. This invariant is a version
of Definition 4.49 for the big-page table space:
DEFINITION 4.55 bpto-bptl-rel?(cPFH) = ∀ 0 < pid < MAX_PID :
cPFH.bpto[pid] + cPFH.bptl[pid] + 1 ≤ TOT_BIG_PGS.
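The construction of the list of used big-page table entries and the distinctness requirement on it can be sketched in C as follows. This is an illustration under assumptions: the constants are hypothetical, entries are modeled as plain integers, and the helper all_bpte is introduced only for this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical constants, for illustration only. */
#define MAX_PID     4
#define TOT_BIG_PGS 8

/* Writes bpfree followed by used-bpte into out.  Returns the number of
 * collected entries, or -1 if a slot lies outside the big-page table
 * (cf. Definition 4.55) or more than TOT_BIG_PGS entries would result. */
int all_bpte(const int *bpfree, int nfree, const int bpt[TOT_BIG_PGS],
             const int bpto[MAX_PID], const int bptl[MAX_PID],
             int out[TOT_BIG_PGS])
{
    int n = 0;
    for (int i = 0; i < nfree; i++) {
        if (n == TOT_BIG_PGS) return -1;
        out[n++] = bpfree[i];
    }
    for (int pid = 1; pid < MAX_PID; pid++) {
        /* process pid owns slots bpto[pid] .. bpto[pid] + bptl[pid] */
        for (int k = 0; k <= bptl[pid]; k++) {
            int slot = bpto[pid] + k;
            if (n == TOT_BIG_PGS || slot < 0 || slot >= TOT_BIG_PGS)
                return -1;
            out[n++] = bpt[slot];
        }
    }
    return n;
}

/* Definition 4.54: all collected entries are pairwise distinct. */
bool dstnct_bpte(const int *bpfree, int nfree, const int bpt[TOT_BIG_PGS],
                 const int bpto[MAX_PID], const int bptl[MAX_PID])
{
    int all[TOT_BIG_PGS];
    int n = all_bpte(bpfree, nfree, bpt, bpto, bptl, all);
    if (n < 0) return false;
    for (int i = 0; i < n; i++)
        for (int j = 0; j < i; j++)
            if (all[i] == all[j]) return false;
    return true;
}
```

A process with bptl[pid] = −1 contributes no entries because the inner loop body is never executed, which mirrors the empty list used-bpte-pid for processes without memory.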
4.6.4 Invariants about Process Control Blocks
Page table origins are monotonic. In order to support disjointness of page
tables of different processes we require page table origins to be monotonic:
DEFINITION 4.56 pto-mono?(cPFH) = ∀ j < MAX_PID, 0 < i < j : cPFH.pto[i] < cPFH.pto[j].
Active processes have memory. By design the active list describes pages only
of processes which have some allocated memory. Hence, page table lengths of
such processes must be non-negative:
DEFINITION 4.57 ptl-pid-alloc-active?(cPFH) = ∀ i < |cPFH.active| :
cPFH.ptl[cPFH.active[i].pid] ≥ 0.
Relation between page table origins and lengths. The page table origin field of
a process control block stores the start page address of the page table for the
process. This start address is counted from the very beginning of the physical
memory. Taking into consideration the fact that the page table space starts at
the page address PT_START/PG_SZ in the physical memory we can represent the
origins as a sum of this start address and some offset x. This constitutes the
first conjunct of the invariant below. The second conjunct imposes a restriction
on the choice of x. Recall that cPFH.ptl[pid] + 1 is the amount of virtual memory
of process pid measured in pages. On the other hand, this number also
defines the number of page table entries occupied by the process. Therefore,
(cPFH.ptl[pid] + 1)/PTES_PER_PG is the number of pages in the page table space
occupied by the process pid. The sum of x and this number must be less than
the number of pages in the page table space:
DEFINITION 4.58 pto-ptl-rel?(cPFH) = ∀ 0 < pid < MAX_PID : ∃ x :
cPFH.pto[pid] = PT_START/PG_SZ + x
∧ x + (cPFH.ptl[pid] + 1)/PTES_PER_PG < TOT_PGS_PT.
Big-page table lengths can be computed from page table lengths. Due to the
ratio PGS_PER_BIG_PG between pages and big pages, big-page table lengths can
always be computed from page table lengths via division by PGS_PER_BIG_PG.
The exception to this rule is a process without virtual memory. In this
case both the page table length and the big-page table length must be equal
to −1.
DEFINITION 4.59 ptl-bptl-rel?(cPFH) = ∀ 0 < pid < MAX_PID :
cPFH.ptl[pid] = −1 ∧ cPFH.bptl[pid] = −1
∨ cPFH.ptl[pid] = cPFH.bptl[pid] · PGS_PER_BIG_PG.
4.6.5 Altogether





The overall validity invariant of the abstract PFH configuration pfh√(cPFH)
is the conjunction of (i) individual validity invariants, Definitions 4.37–4.59, and
(ii) sub-typing requirements, Definition 4.7.
In the remainder of this chapter we prove three lemmas stating that the
page-fault handler validity predicate holds for the initial abstract PFH config-
uration and that it is preserved under page-fault handling and page swap out
prevention operations.
4.6.6 Validity Proofs
Table 4.3: Active list invariants trivial for init-cPFH
Invariant           Definition    Invariant                 Definition
user-pid-active?    4.37          dstnct-pid-vpx-active?    4.43
valid-active?       4.39          vpx-less-ptl-active?      4.45











PROOF.
• All sub-typing requirements (Definition 4.7) are trivial to show from the
invariants proven below.
• Since init-cPFH.active = [ ] all properties about elements of the active list
hold. These are listed in Table 4.3.
• The free list contains TOT_PHYS_PGS − KERNEL_PGS = USER_PGS elements,
and for all of them the following holds:
∀ i < USER_PGS : init-cPFH.free[i].pid = 0
∧ init-cPFH.free[i].vpx = 0
∧ init-cPFH.free[i].ppx = TOT_PHYS_PGS − i − 1.
From that we conclude:
KERNEL_PGS ≤ TOT_PHYS_PGS − i − 1 < KERNEL_PGS + USER_PGS
= KERNEL_PGS ≤ init-cPFH.free[i].ppx < TOT_PHYS_PGS
= user-ppx?(init-cPFH.free[i].ppx)
= user-ppx-active-free?(init-cPFH) (Definition 4.38),
KERNEL_PGS ≤ TOT_PHYS_PGS − i − 1
= init-cPFH.free[i].ppx ≠ ZFP
= ppx-active-free-not-zfp?(init-cPFH) (Definition 4.42), and
i ≠ j
= TOT_PHYS_PGS − i − 1 ≠ TOT_PHYS_PGS − j − 1
= init-cPFH.free[i].ppx ≠ init-cPFH.free[j].ppx
= dstnct-ppx-active-free?(init-cPFH) (Definition 4.44).
• For page table origins and lengths as well as their big versions we have
the following:
∀ 0 < pid < MAX_PID : init-cPFH.pto[pid] = init-pto(pid)
∧ init-cPFH.ptl[pid] = −1
∧ init-cPFH.bpto[pid] = 0
∧ init-cPFH.bptl[pid] = −1.
Exploiting this we conclude:
init-pto(pid1) < init-pto(pid2)
= init-cPFH.pto[pid1] < init-cPFH.pto[pid2]
= pto-mono?(init-cPFH) (Definition 4.56),
init-pto(pid) = PT_START/PG_SZ + ((pid − 1) · TOT_PGS_PT)/MAX_PID
∧ ((pid − 1) · TOT_PGS_PT)/MAX_PID < TOT_PGS_PT
= init-cPFH.pto[pid] = PT_START/PG_SZ + x ∧
x + (init-cPFH.ptl[pid] + 1)/PTES_PER_PG < TOT_PGS_PT
= pto-ptl-rel?(init-cPFH) (Definition 4.58),
0 + (−1) + 1 ≤ TOT_BIG_PGS
= init-cPFH.bpto[pid] + init-cPFH.bptl[pid] + 1 ≤ TOT_BIG_PGS
= bpto-bptl-rel?(init-cPFH) (Definition 4.55), and
init-cPFH.ptl[pid] = −1 ∧ init-cPFH.bptl[pid] = −1
= ptl-bptl-rel?(init-cPFH) (Definition 4.59).
• For a process pid the values of page table origin and length within the
page table space are
ptoPFH(init-cPFH, pid) = (init-pto(pid) · PG_SZ − PT_START)/(4 · PTES_PER_PG) =







Hence, the following invariants hold:
pid1 < pid2
= ptoPFH(init-cPFH, pid1) < ptoPFH(init-cPFH, pid2)
= ptoPFH(init-cPFH, pid1) + ptlPFH(init-cPFH, pid1) < ptoPFH(init-cPFH, pid2)
= not-overlap-pt?(init-cPFH) (Definition 4.48), and
ptoPFH(init-cPFH, pid) < TOT PGS PT
= ptoPFH(init-cPFH, pid) + ptlPFH(init-cPFH, pid) < TOT PGS PT
= pto-ptl-less-pt?(init-cPFH) (Definition 4.49).
• Invariant valid-pte-descr-active? (Definition 4.46) holds since there is no
virtual page index vpx for any process pid such that
vpx < init-cPFH.ptl[pid] + 1.
• The stack of free big pages is initialized with the list of elements from 0
to TOT_BIG_PGS − 1 such that
∀ i < TOT_BIG_PGS : init-cPFH.bpfree[i] = i.
Using the fact that no process has any memory allocated and bptl[pid] =
−1 we conclude:
TOT BIG PGS








= total-bpages?(init-cPFH) (Definition 4.53).
Also, there are no used big-page table entries, used-bpte(init-cPFH) = [ ],
therefore
i ≠ j
= (init-cPFH.bpfree)[i] ≠ (init-cPFH.bpfree)[j]
= dstnct-bpte?(init-cPFH) (Definition 4.54).
• The remaining group of validity invariants speaks about the page table
space. All its entries are initialized with zeros:




= user-ppx-pt?(init-cPFH) (Definition 4.47),
px(0) ≠ ZFP ∧ ¬prot?(0)
= px(init-cPFH.pt[i][j]) = ZFP ←→ prot?(init-cPFH.pt[i][j])
= zfp-eq-prot-pt?(init-cPFH) (Definition 4.50),
¬valid?(0)
= valid?(init-cPFH.pt[i][j]) −→ exec?(init-cPFH.pt[i][j])
= valid-is-exec-pt?(init-cPFH) (Definition 4.51), and
px(0) ≠ ZFP
= px(init-cPFH.pt[i][j]) = ZFP −→ valid?(init-cPFH.pt[i][j])
= zfp-is-valid-pt?(init-cPFH) (Definition 4.52).

LEMMA 4.62 (Validity is preserved under page-fault handling)
The validity invariant is preserved under handling a page fault which oc-




∧ pfPFH?(cPFH, pid, va, intent)
∧ ¬ptlexcpPFH?(cPFH, pid, va)
∧ user-pid?(pid)
−→ pfh√(handle-pfPFH(cPFH, pid, va)).
PROOF. During handling of the page fault, elements of the active and free lists
are permuted and only the last element of the active list, at position
l = |handle-pfPFH(cPFH, pid, va).active| − 1,
has new values of its pid and vpx fields:
handle-pfPFH(cPFH, pid, va).active[l].pid = pid,
handle-pfPFH(cPFH, pid, va).active[l].vpx = px(va).
Besides that, only the page table space component cPFH.pt of the abstract
PFH configuration is changed: the entry for the pair (pid, px(va))
ptePFH(handle-pfPFH(cPFH, pid, va), pid, px(va))
becomes valid and executable, and its physical page index is set to the ppx field
of the active list’s last element. Moreover, in case the free list was empty the
page table entry corresponding to the first element of the active list
ptePFH(handle-pfPFH(cPFH, pid, va), cPFH.active[0].pid, cPFH.active[0].vpx)
becomes invalid.
We will consider the most involved case of the proof: invariant
dstnct-pid-vpx-active? (Definition 4.43), which states that the active list
contains only distinct pairs (pid, vpx). To prove this distinctness we
consider pairs (i, j) of page descriptor positions in the active list. The
non-trivial case arises when one of the pair’s elements, say j, is the newly
inserted one: i < l and j = l. In case the free list was not empty the element
at position i remains the same, otherwise
handle-pfPFH(cPFH, pid, va).active[i] = cPFH.active[i + 1].
Let us denote this element as the i′-th one. So, we need to show
(cPFH.active[i′].pid, cPFH.active[i′].vpx) ≠ (pid, px(va)).
We prove the goal by contradiction. Assume that cPFH.active[i′].pid = pid
and cPFH.active[i′].vpx = px(va). Using validity invariant valid-active?
(Definition 4.39) we get
valid?(ptePFH(cPFH, pid, va)),
which is equivalent by Definition 4.16 to
¬iapf?(cPFH, pid, va).
Using validity invariants valid-pte-descr-active? and ppx-active-free-not-zfp?
(Definitions 4.46 and 4.42) we conclude
px(ptePFH(cPFH, pid, va)) ≠ ZFP,
which is equivalent by Definition 4.17 to
¬zppf?(cPFH, pid, va, intent).
Combining these negated page fault signals we obtain
¬pfPFH?(cPFH, pid, va, intent),
which contradicts the lemma’s assumptions.
LEMMA 4.63 (Validity is preserved under preventing page swap out)
The validity invariant is preserved under preventing page swap out:
pfh√(cPFH) ∧ cPFH.free = [ ]
−→ pfh√(prevent-swap-out(cPFH)).
PROOF. The algorithm for preventing page swap out only permutes elements of
the active list. Since all properties concerning the active list talk only about























In the previous chapter we have introduced abstract page-fault handler
configurations and formally specified demand paging operations. The goal of
this chapter is to formally verify that our demand paging implementation
satisfies the desired specification. We fulfill our goal by conducting the
corresponding proofs in Hoare logic. For this we translate the C0
implementation of demand paging into Simpl and define a concrete shape of the
Simpl state. Next we introduce an abstraction relation from the concrete Simpl
state towards the abstract page-fault handler configuration: global and heap
variables from the Simpl implementation are mapped to their meanings on the
abstract side. This relation will play the essential role in specifying pre-
and postconditions for the functions of our demand paging implementation in
Hoare logic. We elaborate on the implementation details of the initialization
code, swap in and swap out routines, as well as the page-fault handler, and
introduce pre- and postconditions for these functions. Having them, we




5.1 Simpl Hoare Logic State Space
We will call the state space for code verification of the demand paging imple-
mentation the Simpl state space or the Hoare logic state space and denote it
by CH. The subscript “H” stands for “Hoare”. A single state or configura-
tion is denoted by cH. The Simpl state is a record whose individual elements
are global and local variables from the demand paging implementation. All
variables of structural types are flattened: for each individual field there is a
separate component in the state space. The state space contains a special
variable alloc :: ref-t∗ which is a list of already allocated references. As pointed
out before, the Simpl state space also aggregates the extended state of demand
paging: cH.x :: CX. For each pointer variable or pointer structure field a heap
function is inserted into the state space as stated in Table 5.1. State space
components for heap functions are annotated with a subscript “heap”. Global
variables from the implementation are listed in Table 5.2. Note that this table
contains also variables like cH.srglob which are not used in the demand paging
code but rather in some other CVM parts. We have them in the state space in
order to be able to show by means of the Hoare logic verification environment
that they are not modified by the demand paging code. Global variables have
a subscript “glob”. Finally, Table 5.3 lists used local variables and function
parameters. Note that in order to reduce the state space size we share local
variables between functions. Local variables have suffixes “loc”.
Table 5.1: Simpl state components for heap functions
Name        Type             Meaning
ptheap      ref-t ↦ N∗∗      page table array
pidheap     ref-t ↦ N        process id field of page descriptors
vpxheap     ref-t ↦ N        virtual page index field of page descriptors
ppxheap     ref-t ↦ N        physical page index field of page descriptors
nextheap    ref-t ↦ ref-t    ’next’ field in active and free lists
prevheap    ref-t ↦ ref-t    ’previous’ field in active and free lists
5.2 Abstraction Relation from Simpl
In this section we introduce an abstraction relation from the demand paging
Simpl implementation states cH towards abstract PFH configurations cPFH.
The relation ties together global and heap variables from the implementation
and individual components of the abstract configuration. The relation is struc-
tured according to components in the abstract state space: below we introduce
separate conjuncts for each component.
5.2.1 Doubly-Linked Lists
To specify and verify programs which use (doubly) linked lists we use the
definitions of Schirmer who himself follows the approach of Mehta and Nip-
kow [MN03]. Generally, pointer structures in the heap are abstracted to a
suitable HOL type. A list in the heap is abstracted to a HOL list of references.
After this abstraction, specification and verification take place in the domain
of HOL lists. Predicate listH? :: ref-t × (ref-t ↦ ref-t) × ref-t∗ ↦ B defines this
abstraction.

Table 5.2: Simpl state components for global variables
Name               Type      Meaning
pcb-efglob         Z∗∗       PCB exception frames
pcb-ihdglob        N∗        PCB user-defined signals
pcb-bptoglob       Z∗        PCB big-page table origins
pcb-bptlglob       Z∗        PCB big-page table lengths
pcb-emptyglob      Z∗        PCB dummy space
pages-usedglob     N         number of currently used virtual pages
activeglob         ref-t     (pointer to the head of the) active list
freeglob           ref-t     (pointer to the head of the) free list
ppx2pdglob         ref-t∗    reverse lookup array for physical page indices
pages-freeglob     N         number of currently free physical pages
ptglob             ref-t     pointer to the page table array
bpfreeglob         N∗        list for the stack of free big pages
bpages-freeglob    N         number of free big pages
bptglob            N∗        big-page table array
srglob             N         devices interrupt mask
cupglob            N         current user process
kheapglob          Z         kernel heap size

DEFINITION 5.1 (List abstracted from Simpl)
listH?(p, next, l) =
    p = NULL                                        if l = [ ]
    p = a ∧ p ≠ NULL ∧ listH?(next(p), next, l′)    if l = a ◦ l′
The list of references is obtained by means of the heap function next by
starting with the reference p and following the references in the list l up to
NULL. A more involved kind of linked list is a doubly linked list. The list of
references l constituting it is traversed in two directions by means of heap
functions next and prev. The first and last elements of a doubly linked list
are p and q.
Predicate dlistH? :: ref-t × (ref-t ↦ ref-t) × (ref-t ↦ ref-t) × ref-t × ref-t∗ ↦ B
defines this abstraction:
dlistH?(p, next, prev, q, l) = listH?(p, next, l) ∧ listH?(q, prev, rev(l))
Thus, a doubly linked list is nothing but two singly linked lists traversed
in opposite directions.
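The two list predicates have a direct executable reading when the heap functions are modeled as arrays indexed by references. The C sketch below is such a model and not the HOL definition itself: the type ref_t, the encoding of NULL as reference 0, and the array representation of next and prev are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned ref_t;      /* assumed model of ref-t */
#define NULL_REF 0u          /* assumed encoding of NULL */

/* listH?(p, next, l): the recursion over l is unrolled into a loop.
 * Each element of l must equal the current non-NULL reference, and the
 * traversal must end in NULL exactly when l is exhausted. */
bool list_h(ref_t p, const ref_t *next, const ref_t *l, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (p == NULL_REF || p != l[i])
            return false;
        p = next[p];
    }
    return p == NULL_REF;
}

/* dlistH?(p, next, prev, q, l): l read forwards via next from p and
 * read backwards via prev from q. */
bool dlist_h(ref_t p, const ref_t *next, const ref_t *prev,
             ref_t q, const ref_t *l, size_t len)
{
    if (!list_h(p, next, l, len))
        return false;
    ref_t r = q;
    for (size_t i = len; i > 0; i--) {   /* walks rev(l) */
        if (r == NULL_REF || r != l[i - 1])
            return false;
        r = prev[r];
    }
    return r == NULL_REF;
}
```

The arrays next and prev must be large enough to be indexed by every reference that occurs in l; the sketch does not check this.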
Based on this abstraction a library of functions on doubly-linked lists was
specified and verified in Hoare logics against its C0 implementation (on the
Simpl level) [Ngu05, Sta06]. We import this library for our proofs, since our
demand paging implementation uses the functions for creating a list, as well as
inserting and deleting elements from it.
5.2.2 Page and Big-Page Management
Active and free lists. The page management lists are implemented as doubly-
linked lists of page descriptors. Such a list uses the following variables: (i) a
pointer p :: ref-t to the head of the list, (ii) heap functions nextheap and
prevheap, both of the type ref-t ↦ ref-t, used to obtain next elements in the
list, (iii) heap functions pidheap, vpxheap, and ppxheap, all of the type
ref-t ↦ N, to retrieve fields for a process identifier, virtual and physical
page indices, respectively, and (iv) a pointer q :: ref-t to the last element of
the list.

Table 5.3: Simpl state components for local variables
Name             Type     Meaning
pidloc           N        process id parameter
dummyloc         Z        integer result of calls to subroutines
dummy-boolloc    B        boolean result of calls to subroutines
mem-addrloc      N        memory address parameter
disk-addrloc     N        disk address parameter
sloc             ref-t    pointer to a page descriptor
nloc             N        loop counter
addrloc          N        virtual address parameter
intentloc        N        intention of a call to the handler
countloc         N        number of calls to the handler during which
                          the given page must not be swapped out
pteloc           N        page table entry
vpxloc           N        virtual page index parameter
resultloc        N        intermediate result
victloc          ref-t    (pointer to the) page descriptor for swap out
active-tailloc   ref-t    (pointer to the) tail of the active list
ppxloc           N        physical page index parameter
bpxloc           N        virtual big-page index
bbxloc           N        big-page byte index
pbpxloc          N        physical big-page index
resloc           Z        result variable

On the abstract side a page management list is specified by a list pds :: pd-t∗
of (abstract) page descriptors. It is easy to notice that the abstraction specifies
only the content of a page management list and lacks information about the
structure of the list, i.e., how list elements are placed in the memory. We fix
this problem by introducing an additional parameter to our abstraction map-
ping. The structure of the list’s elements in the memory is given by a list of
references l. All these three concepts, namely, variables from the implementa-
tion, a list of (abstract) page descriptors, and a list of references in the memory,
are tied together in the predicate below. It defines an abstraction for doubly
linked lists of page descriptors. Its meaning is that (i) the variables from the
implementation form a doubly linked list with the structures of references given
by l, (ii) there are as many references in l as in pds, and (iii) each page de-
scriptor field obtained on the implementation side corresponds to the one on
the abstract side.
DEFINITION 5.3 (List of page descriptors abstracted from Simpl)
pds-absH?(p, nextheap, prevheap, pidheap, vpxheap, ppxheap, l, pds) =
∃ q : dlistH?(p, nextheap, prevheap, q, l) ∧ |l| = |pds|
∧ ∀ i < |l| : pidheap(l[i]) = pds[i].pid
∧ vpxheap(l[i]) = pds[i].vpx
∧ ppxheap(l[i]) = pds[i].ppx
Having this, it is easy to define parts of the abstraction relation for the
active and free lists. For the active list the parameter refsactive plays the role of









For the free list l is substituted by refsfree and cPFH.free is the list of the








Stack of free big pages. On the implementation side the stack of free big pages
is a fixed-sized array of TOT BIG PGS elements. The index of the stack’s topmost
element is stored in the variable cH.bpages-freeglob. All the elements between
cH.bpages-freeglob and TOT BIG PGS have unknown values. On the abstract side
this data structure is modeled as a stack abstract data type: push and pop
operations actually change the stack’s length. Considering these facts we need
to state in the abstraction relation for the stack of free big pages that the
abstract stack is a prefix of the implementation stack and that the length of
the latter is bounded by TOT BIG PGS.
DEFINITION 5.6 (Stack of free big pages abstracted from Simpl)
bpfree-absH?(cH, cPFH) =
|cH.bpfreeglob| = TOT_BIG_PGS
∧ ∀ i < |cPFH.bpfree| : cH.bpfreeglob[i] = cPFH.bpfree[i]
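The difference between the fixed-size implementation array and the abstract stack can be made concrete with a small C model. It is a sketch under assumptions: TOT_BIG_PGS has a hypothetical value, the type bpstack_t and the helpers bp_push and bp_pop are introduced only here, and the field count is taken to hold the number of valid entries.

```c
#include <assert.h>
#include <stdbool.h>

#define TOT_BIG_PGS 8u   /* hypothetical value */

/* Implementation-side stack: a fixed array plus a counter.  Slots at
 * positions >= count keep stale values. */
typedef struct {
    unsigned slots[TOT_BIG_PGS];
    unsigned count;
} bpstack_t;

void bp_push(bpstack_t *s, unsigned bpx)
{
    assert(s->count < TOT_BIG_PGS);
    s->slots[s->count++] = bpx;
}

unsigned bp_pop(bpstack_t *s)
{
    assert(s->count > 0);
    return s->slots[--s->count];   /* the slot itself is not cleared */
}

/* Second conjunct of Definition 5.6: the abstract stack is a prefix of
 * the implementation array.  The first conjunct, the array length, is
 * fixed by the C type. */
bool bpfree_abs(const bpstack_t *impl, const unsigned *abs_stack,
                unsigned abs_len)
{
    if (abs_len > TOT_BIG_PGS)
        return false;
    for (unsigned i = 0; i < abs_len; i++)
        if (impl->slots[i] != abs_stack[i])
            return false;
    return true;
}
```

After a pop the array still contains the popped value in its old slot, yet the relation continues to hold for the shortened abstract stack, which is why the definition constrains only the prefix.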
5.2.3 Page and Big-Page Tables
Page table space. We access page tables of user processes in the implementa-
tion through a data structure called the page table space. This data structure
is a two-dimensional array located in the heap memory. The pointer to the
beginning of the array is stored in the global variable cH.ptglob. Since the array
of page tables is the first variable allocated on the heap, the value of cH.ptglob
must always be 1. The content of this array can be accessed in the Simpl
implementation by means of the heap function cH.ptheap. The obtained content
cH.ptheap(cH.ptglob) has to match the abstract page table space cPFH.pt. This




pt-absH?(cH, cPFH) = (cH.ptglob = 1) ∧ (cH.ptheap(cH.ptglob) = cPFH.pt)
Big-page table space. As for the big-page table space its abstraction relation is
straightforward. The space is implemented as a global memory array cH.bptglob.




bpt-absH?(cH, cPFH) = cH.bptglob = cPFH.bpt
5.2.4 Process Control Blocks
All abstraction relation parts for PCB fields relevant for demand paging follow
the same pattern. A single process control block is implemented as a structure
comprising, among others, the following fields: (i) the exception frame which
is an array of integers; its elements correspond to the visible registers of the
physical machine, (ii) the big-page table origin field, and (iii) the big-page
table length field. These structures for individual PCBs are assembled into an
array of MAX PID blocks. We are interested only in user processes and hence
consider only items with indices pid respecting 0 < pid < MAX PID. Moreover,
only page table origins and lengths as well as their big-page versions are of our
interest. Page table origin and length are stored in the exception frame array
at positions PTO and PTL, respectively, whereas the big-page version of these
concepts are stored in the individual PCB structure fields. Altogether, four




















∀ 0 < pid < MAX_PID : cH.pcb-bptlglob[pid] = cPFH.bptl[pid]
5.2.5 Miscellaneous
The implementation of demand paging also contains a number of global variables
for which there are no analogous components in the abstract state. However,
these variables have precise meanings and their values can be expressed as
formulas over the state space of the abstraction.





pages-free-absH?(cH, cPFH) = cH.pages-freeglob = |cPFH.free|
The number of free big pages cH.bpages-freeglob equals the length of the




bpages-free-absH?(cH, cPFH) = cH.bpages-freeglob = |cPFH.bpfree|
The total number of currently used virtual pages is stored in the implemen-
tation variable cH.pages-usedglob. This value is specified by the sum of virtual








The reverse lookup array used for associating physical page indices with
page descriptors is mapped to the abstract PFH state as follows. The length of
the implementation array cH.ppx2pdglob, which stores pointers to page
descriptors, equals the number of physical pages. For a user physical page index
ppx the element cH.ppx2pdglob[ppx] is contained in the list of active refsactive or
free refsfree references. Applying the heap function for physical page indices to




ppx2pd-absH?(cH, cPFH, refsactive, refsfree) =
|cH.ppx2pdglob| = TOT PHYS PGS
∧ ∀ppx : user-ppx?(ppx) −→ cH.ppx2pdglob[ppx] ∈ refsactive ◦ refsfree
∧ cH.ppxheap(cH.ppx2pdglob[ppx]) = ppx
5.2.6 Altogether
Now we can combine all abstraction relations for individual variables into an
overall abstraction relation from Simpl implementation towards the abstract
PFH configuration.
DEFINITION 5.17 (Abstraction relation from Simpl)
The relation absH?(cH, cPFH, refsactive, refsfree) is defined as a conjunction of
Definitions 5.4–5.16.
For convenience we combine this abstraction relation with the validity
invariant of abstract PFH configurations in a single predicate below. This
predicate also contains terms defining validity properties for the lists of active
refsactive and free refsfree references. Since the active and free lists are
disjoint doubly-linked lists we claim that refsactive and refsfree do not
overlap either. Further, we know that these lists are allocated in the heap
memory directly after the page table space. As we have a single pointer to the
page table space it consumes the very first address in the heap memory. Hence
elements of the active and free lists are accessed via pointers whose values
start at two. As there are USER_PGS elements altogether in both lists we devote
USER_PGS addresses in the heap memory starting from two to these lists.
Formally, this is stated in the following definition.
DEFINITION 5.18
absH√(cH, cPFH, refsactive, refsfree) =
absH?(cH, cPFH, refsactive, refsfree)
∧ pfh√(cPFH)
∧ refsactive ∩ refsfree = ∅
∧ refsactive ∪ refsfree = {x | 2 ≤ x ≤ USER_PGS + 1}
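The last two conjuncts say that the active and free references together partition the heap addresses 2, …, USER_PGS + 1. A C sketch of this check is given below; the value of USER_PGS is hypothetical and references are modeled as unsigned integers, which is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define USER_PGS 6u   /* hypothetical value */

/* Checks that the two reference lists are disjoint and that together
 * they cover exactly the addresses 2 .. USER_PGS + 1.  The check also
 * rejects a reference repeated within one list, which is slightly
 * stronger than the two set equations of the definition. */
bool refs_partition(const unsigned *refs_active, size_t n_active,
                    const unsigned *refs_free, size_t n_free)
{
    bool seen[USER_PGS + 2u] = { false };
    const unsigned *lists[2] = { refs_active, refs_free };
    const size_t lens[2] = { n_active, n_free };
    size_t covered = 0;

    for (int k = 0; k < 2; k++) {
        for (size_t i = 0; i < lens[k]; i++) {
            unsigned r = lists[k][i];
            if (r < 2u || r > USER_PGS + 1u)
                return false;        /* outside the devoted range */
            if (seen[r])
                return false;        /* overlap or repetition */
            seen[r] = true;
            covered++;
        }
    }
    return covered == USER_PGS;      /* every address is used */
}
```

With an empty active list and the free list [USER_PGS + 1, …, 2] the check succeeds, and it keeps succeeding when a reference moves from one list to the other.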
5.3 Initialization Code
When the kernel starts for the first time after the reset signal it has to
execute the initialization code of demand paging pfh_init() in order to set up
necessary data structures such that they conform to the initial abstract PFH
configuration (cf. Section 4.2).
5.3.1 Implementation
The initialization code of demand paging is presented in Listing 5.1. The
implementation starts with an invocation of function zero_fill_page which
fills the page at address ZFP with zeros. Next, at line 7 we allocate the page
table space in the heap memory. With the loop at lines 9–16 we initialize those
fields of process control blocks which are relevant for demand paging: page
table origins and lengths as well as their big versions. We distribute page table
origins of processes across the page table space with equal segments between
the origins of each two consecutive processes. The page table lengths are set
to −1, which denotes that all user processes have no virtual memory yet. At
lines 17–18 we create empty active and free lists. The loop at lines 21–27
fills the empty free list with USER_PGS = TOT_PHYS_PGS − KERNEL_PGS
elements and initializes the reverse lookup array ppx2pd with them. The last
loop of the function (lines 30–33) initializes the stack of free big pages.
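The distribution of page table origins can be checked on small numbers. The following C sketch evaluates the origin expression of the initialization code for hypothetical constants; it assumes that PX divides a byte address by the page size and that a page holds PTES_PER_PG entries of 4 bytes each, neither of which is stated in this section.

```c
#include <assert.h>

/* Hypothetical constants, for illustration only. */
#define PG_SZ       4096u
#define PTES_PER_PG 1024u            /* assumed: PG_SZ = 4 * PTES_PER_PG */
#define PT_START    (16u * PG_SZ)
#define TOT_PGS_PT  64u
#define MAX_PID     4u

/* Assumed meaning of PX: page index of a byte address. */
static unsigned PX(unsigned addr) { return addr / PG_SZ; }

/* Page table origin of user process n, computed as in the loop that
 * initializes the process control blocks. */
unsigned init_pto(unsigned n)
{
    return PX(PT_START +
              (n - 1u) * TOT_PGS_PT * PTES_PER_PG * 4u / MAX_PID);
}
```

With these values the origins of processes 1, 2 and 3 are the page addresses 16, 32 and 48: each process receives a segment of TOT_PGS_PT / MAX_PID = 16 pages, and the origins grow strictly with the process identifier.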
Listing 5.1: Initialization code of demand paging
 1 int pfh_init()
 2 {
 3   struct pd *s;
 4   unsigned int n;
 5   int dummy;
 6   dummy = zero_fill_page(ZFP);
 7   pt = new(pt_t);
 8   n = 1u;
 9   while (n < MAX_PID) {
10     pcb[n].ef[PTO] = int(PX(PT_START +
11       (n - 1u) * TOT_PGS_PT * PTES_PER_PG * 4u / MAX_PID));
12     pcb[n].ef[PTL] = -1;
13     pcb[n].bpto = 0;
14     pcb[n].bptl = -1;
15     n = n + 1u;
16   }
17   active = dList_New();
18   free = dList_New();
19   pages_used = 0u;
20   n = KERNEL_PGS;
21   while (n < TOT_PHYS_PGS) {
22     s = new(struct pd);
23     ppx2pd[n] = s;
24     s->ppx = n;
25     free = dList_InsertHead(free, s);
26     n = n + 1u;
27   }
28   pages_free = TOT_PHYS_PGS - KERNEL_PGS;
29   n = 0u;
30   while (n < TOT_BIG_PGS) {
31     bpfree[n] = TOT_BIG_PGS - 1u - n;
32     n = n + 1u;
33   }




Precondition. The precondition PRE Hinit?(cH) to the initialization code of de-
mand paging is a conjunction of the following facts:
• cH.pcb-efglob = rep(rep(0, EF_DIM), MAX_PID), exception frames of all
processes are initialized with zeros,
• cH.bptglob = rep(0, TOT_BIG_PGS), the big-page table variable is initialized
with zeros,
• |cH.pcb-bptoglob| = MAX_PID, the big-page table origin variable is a list of
length equal to the number of processes,
• |cH.pcb-bptlglob| = MAX_PID, the big-page table length variable is a list of
length equal to the number of processes,
• |cH.ppx2pdglob| = TOT_PHYS_PGS, the reverse lookup array variable is a list
of length equal to the total number of physical pages,
• |cH.bpfreeglob| = TOT_BIG_PGS, the stack of free big pages variable is a list
of length equal to the total number of big pages, and
• cH.alloc = [ ], the list of allocated references is empty.
Postcondition. The main idea of the postcondition for the demand paging
initialization code is to express that data structures from the implementation
respect the values defined by the initial abstract PFH configuration init-cPFH.
However, some variables in the implementation have no equivalents on the
abstract side. Therefore, we have to state their initial values directly. One
category of such variables is the exception frames of process control blocks.
The initial abstract PFH configuration defines only the values of registers PTO
and PTL. In order to describe the remaining fields — all initialized with zeros
— we introduce the following predicate.
DEFINITION 5.19 (Initialized exception frame)
init-pcb-ef?(cH) = ∀ 0 < i < MAX_PID :
∧ ∀ j < EF_DIM, j ≠ PTO, j ≠ PTL : cH.pcb-efglob[i][j] = 0
The postcondition POST Hinit?(cH) of the initialization code is a conjunction
of the following terms.
• absH√(c′H, init-cPFH, [ ], [USER_PGS + 1, . . . , 2]), the implementation state c′H
after the execution of the initialization code satisfies the abstraction
relation towards the initial abstract PFH configuration init-cPFH; the latter
configuration is valid. Initially there are no references to elements of
the active list. All user pages (there are USER_PGS of them) are free
and the free list is the second variable allocated on the heap (after the
page table array). This is indicated by the argument [USER_PGS + 1, . . . , 2]
of the valid abstraction relation. Note that we start counting references
from 1.
• init-pcb-ef?(c′H), exception frames are appropriately initialized.
• c′H.alloc = [USER_PGS + 1, . . . , 1], the Simpl allocation list c′H.alloc keeps
track of addresses consumed by the heap variables allocated in the
initialization code: the first address belongs to the pointer to the page table
space whereas the further USER_PGS addresses refer to the elements of
active and free lists.
• c′H.x = zero-fill-page(ZFP, cH.x), in the extended state the page at the
page address ZFP is filled with zeros.
• c′H.resloc = 0, the result value of the function execution is zero.
5.3.3 Correctness
The theorem below states the correctness of the initialization code at the level
of Simpl: (i) the postcondition of the function is satisfied in the state after the
function call provided its precondition has been satisfied in the state before the
call, (ii) the function terminates, and (iii) only those global and heap variables
are changed in the Simpl state that have been modified by the function. We
express this statement using a Hoare triple for Simpl (cf. Section 3.2).
THEOREM 5.20 (Implementation correctness of the initialization code)
Γ ⊢tH PRE Hinit?(cH)
cH.resloc = Call pfh init()
POST Hinit?(c′H) ∩ ∆(cH, c′H) =
{kheapglob, ptglob, pcb-efglob, pcb-bptoglob, pcb-bptlglob, activeglob, freeglob,
pages-usedglob, pages-freeglob, bpfreeglob, bpages-freeglob, ppx2pdglob,
ptheap, pidheap, vpxheap, ppxheap, nextheap, prevheap, alloc, x}
PROOF. We annotate loops of the function with invariants INV iinit? and ranking
functions RANK iinit for i ∈ {1, 2, 3} defined in Appendix B, run VCG, and
obtain the HOL subgoals stated below to be proven. Functions fx denote
modifications done over implementation configuration cH by code lines x in
Listing 5.1.
Subgoal 1. Implication from the precondition to the first invariant:
PRE Hinit?(cH) −→ INV 1init?(f6–8(cH)).
Subgoal 2. Preservation of the first invariant:
INV 1init?(cH, cX) −→ INV 1init?(f10–15(cH)).
Subgoal 3. Implication from the first invariant to the second invariant:
INV 1init?(cH, cX) −→ INV 2init?(f17–20(cH)).
Subgoal 4. Preservation of the second invariant:
INV 2init?(cH, cX) −→ INV 2init?(f22–26(cH)).
Subgoal 5. Implication from the second invariant to the third invariant:
INV 2init?(cH, cX) −→ INV 3init?(f28–29(cH)).
Subgoal 6. Preservation of the third invariant:
INV 3init?(cH, cX) −→ INV 3init?(f31–32(cH)).
Subgoal 7. Implication from the third invariant to the postcondition:
INV 3init?(cH, cX) −→ POST Hinit?(f34–35(cH)).
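To illustrate the role of the loop annotations, the following sketch shows a generic zero-filling loop in C whose invariant and ranking function are checked at run time by assertions. It is a hypothetical example, not one of the loops from Listing 5.1; the actual invariants and ranking functions are those of Appendix B. The run-time checks correspond to what the VCG turns into proof obligations: the invariant holds on loop entry and is preserved by the body, and the rank strictly decreases in every iteration, which yields termination.

```c
#include <assert.h>
#include <stddef.h>

/* Zero-fills a[0..n-1]. Invariant: a[0..i-1] are all zero.
   Ranking function: n - i, which strictly decreases per iteration. */
static void zero_fill(unsigned *a, size_t n)
{
    size_t i = 0;
    while (i < n) {
        size_t rank_before = n - i;
        for (size_t k = 0; k < i; k++)   /* check the invariant */
            assert(a[k] == 0u);
        a[i] = 0u;
        i++;
        assert(n - i < rank_before);     /* rank strictly decreases */
    }
}
```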

5.4 Swap In
The swap in routine pfh_swap_in is an interface between the hard disk driver
routine for reading, read_from_disk, and the page-fault handler. While data
is transferred between the physical memory and the hard disk by function
read_from_disk, the main purpose of function pfh_swap_in is to compute
memory and swap addresses and pass them to read_from_disk. The swap in
routine implements the corresponding I/O operation defined in Section 4.4 (cf.
Definition 4.27).
5.4.1 Implementation
The source code of function pfh_swap_in is presented in Listing 5.2. The
function loads a page from the hard disk at an address specified by process
identifier pid and virtual page index vpx to the physical memory at page
address ppx. The implementation starts with a computation of (virtual) big-page
index bpx and big byte index bbx from parameter vpx. By accessing
the corresponding big-page table entry at line 11 we obtain physical big-page
index pbpx. At line 12 we compute the disk sector address from pbpx and bbx,
performing the required conversions from pages to sectors and taking the
boot region offset into account. At line 14 we pass this address and the memory
address computed at line 13 to the hard disk driver routine read_from_disk.
Listing 5.2: Source code of the swap in
 1 int pfh_swap_in(unsigned int pid, unsigned int ppx, unsigned int vpx)
 2 {
 3   bool dummy_bool;
 4   unsigned int bpx;
 5   unsigned int bbx;
 6   unsigned int pbpx;
 7   unsigned int disk_addr;
 8   unsigned int mem_addr;
 9   bpx = BPX(vpx);
10   bbx = BBX(vpx);
11   pbpx = BPTE(pid, bpx);
12   disk_addr = (pbpx << 13u) + (bbx << 3u) + (BOOT_PGS << 3u);
13   mem_addr = ppx << 12u;
14   dummy_bool = read_from_disk(mem_addr, disk_addr);
15   return 0;
16 }
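The address arithmetic at lines 12 and 13 can be tried out in isolation. The sketch below is a standalone illustration in C under stated assumptions: the shift amounts are taken from the listing, whereas the value of BOOT_PGS and the definitions of the helpers bpx_of and bbx_of (standing in for the macros BPX and BBX, whose definitions are not shown in this section) are hypothetical. The split of vpx into an upper part and a 10-bit lower part is inferred from the shifts 13 = 10 + 3 and is an assumption.

```c
#include <assert.h>
#include <stdint.h>

enum { BOOT_PGS = 16 };  /* hypothetical size of the boot region in pages */

/* Assumed split of a virtual page index: the low 10 bits select the page
   inside a big page, the remaining bits select the big page. */
static uint32_t bpx_of(uint32_t vpx) { return vpx >> 10; }
static uint32_t bbx_of(uint32_t vpx) { return vpx & 0x3FFu; }

/* Disk sector address as computed at line 12: a shift by 3 converts pages
   to sectors, a shift by 13 converts big pages to sectors. */
static uint32_t disk_sector_addr(uint32_t pbpx, uint32_t bbx)
{
    return (pbpx << 13) + (bbx << 3) + ((uint32_t)BOOT_PGS << 3);
}

/* Byte address of a physical page as computed at line 13. */
static uint32_t mem_byte_addr(uint32_t ppx)
{
    assert(ppx < (1u << 20));  /* keeps ppx << 12 within 32 bits */
    return ppx << 12;
}
```

With these assumptions, the first page of the first big page starts right after the boot region, i.e., at sector BOOT_PGS · 8.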




5.4.2 Specification
Precondition. Below we present the facts constituting the precondition
PRE Hswap in?(cH, cPFH) to the swap in routine. First of all we impose a number
of constraints on the function’s parameters:
• 0 < cH.pidloc < MAX PID, the process identifier parameter corresponds to
some user process,
• ZFP ≤ cH.ppxloc < TOT PHYS PGS, the physical page index addresses a page
within the user region of the physical memory, and
• cH.vpxloc ≤ cPFH.ptl[cH.pidloc], the virtual page index parameter is bounded
by the amount of the process’s virtual memory.
The last constraint implies that the user process identified by cH.pidloc has
some virtual memory: cPFH.ptl[cH.pidloc] ≥ 0.
Further, it would be meaningful to have an abstraction relation from the
Simpl implementation towards the abstract PFH configuration in the precon-
dition. However, it turns out that we do not need a complete relation for
verification of the function but rather its particular parts, namely:
• bpt-absH?(cH, cPFH), the abstraction relation for the big-page table space,
and
• bpto-absH?(cH, cPFH), the abstraction relation for the big-page table ori-
gins.
Finally, the precondition requires an appropriate sub-typing of the extended
state: subtypingX?(cH.x).
Postcondition. The postcondition POST Hswap in?(c′H, cPFH) of the swap in
function comprises the following terms:
• c′H.x = swap-in(cH.pidloc, cH.ppxloc, cH.vpxloc, cPFH, cH.x), the extended
state is updated by the semantics of the swap in function (cf. Definition 4.27),
• subtypingX?(c′H.x), the sub-typing of the extended state is preserved, and
• c′H.resloc = 0, the result value of the function execution is zero.
5.4.3 Correctness




THEOREM (Implementation correctness of the swap in)
Γ ⊢tH PRE Hswap in?(cH, cPFH)
cH.resloc = Call swap in(cH.pidloc, cH.ppxloc, cH.vpxloc)
POST Hswap in?(c′H, cPFH) ∧ ∆(cH, c′H) = {x}
5.5 Swap Out
The swap out routine pfh_swap_out is an interface between the hard disk driver
routine for writing, write_to_disk, and the page-fault handler. The swap out
routine implements the corresponding I/O operation from Section 4.4 (cf.
Definition 4.29).
5.5.1 Implementation
The source code of function pfh_swap_out is presented in Listing 5.3. It differs
from pfh_swap_in only in the call at line 14, where we invoke write_to_disk.
5.5.2 Specification
Precondition. The precondition PRE Hswap out?(cH, cPFH) to the swap out func-
tion is the same as in the swap in function.
Listing 5.3: Source code of the swap out
 1 int pfh_swap_out(unsigned int pid, unsigned int ppx, unsigned int vpx)
 2 {
 3   bool dummy_bool;
 4   unsigned int bpx;
 5   unsigned int bbx;
 6   unsigned int pbpx;
 7   unsigned int disk_addr;
 8   unsigned int mem_addr;
 9   bpx = BPX(vpx);
10   bbx = BBX(vpx);
11   pbpx = BPTE(pid, bpx);
12   disk_addr = (pbpx << 13u) + (bbx << 3u) + (BOOT_PGS << 3u);
13   mem_addr = ppx << 12u;
14   dummy_bool = write_to_disk(mem_addr, disk_addr);
15   return 0;
16 }
Postcondition. The postcondition POST Hswap out?(c′H, cPFH) to the swap out
function differs from the one for the swap in function only in the term
concerning the extended state update. In the swap out function we use the
semantics of the swap out operation for this update (cf. Definition 4.29):
c′H.x = swap-out(cH.pidloc, cH.ppxloc, cH.vpxloc, cPFH, cH.x).
5.5.3 Correctness




THEOREM (Implementation correctness of the swap out)
Γ ⊢tH PRE Hswap out?(cH, cPFH)
cH.resloc = Call swap out(cH.pidloc, cH.ppxloc, cH.vpxloc)
POST Hswap out?(c′H, cPFH) ∧ ∆(cH, c′H) = {x}
5.6 Page-Fault Handler
Page faults are treated by a page-fault handler, a piece of software which trans-
lates addresses and loads missing pages from the hard disk into the physical
memory. The page-fault handler function is called in two situations: when
page-fault exceptions occur and when the kernel executes primitives that ac-
cess user memory. In the second case the handler simulates address translation
for the kernel, which runs untranslated, and makes sure that the corresponding
memory page is swapped in.
5.6.1 Implementation
Recall that the function pfh_touch_addr serves as a single entry point for
all cases where the kernel has or might have to swap in a page. The function
considers virtual address addr of user process pid. We distinguish four different
intentions intent of a handler’s call:
• READ: the page in question should be readable afterwards,
• WRITE: the page in question can afterwards be read and written,
• OVERWRITE: the caller intends to overwrite the entire page after the call,
and
• SWAP IN: a page fault has occurred at the given address and the function
will swap in the page for arbitrary accesses.
Listing 5.4: Source code of the page-fault handler
 1 unsigned int pfh_touch_addr(unsigned int pid, unsigned int addr,
 2                             unsigned int intent, unsigned int count)
 3 {
 4   int dummy;
 5   unsigned int pte;
 6   unsigned int vpx;
 7   unsigned int result;
 8   struct pd *vict;
 9   struct pd *active_tail;
10   result = 0u;
11   vpx = PX(addr);
12   if (intent == SWAP_IN && vpx > unsigned(pcb[pid].ef[PTL])) {
13     result = INVALID_ADDR;
14   } else {
15     pte = PTE(pid, vpx);
16     if (intent != SWAP_IN && VALID(pte) && (intent == READ || !PROT(pte))) {
17       if (count > pages_free) {
18         count = count - pages_free;
19         vict = active;
20         while ((vict->vpx != vpx || vict->pid != pid) && count > 0u) {
21           count = count - 1u;
22           vict = vict->next;
23         }
24         active_tail = vict;
25         while (count > 0u) {
26           count = count - 1u;
27           active_tail = active_tail->next;
28         }
29         if (active_tail != vict) {
30           active = dList_Delete(active, vict);
31           dummy = dList_InsertAfter(active_tail, vict);
32         }
33       }
34       result = (pte & PPX_MASK) + BX(addr);
35     } else {
36       if (free == NULL) {
37         vict = active;
38         active = dList_Delete(active, vict);
39         PTE(vict->pid, vict->vpx)
40           = PTE(vict->pid, vict->vpx) & INVALID_MASK;
41         dummy = pfh_swap_out(vict->pid, vict->ppx, vict->vpx);
42       } else {
43         vict = free;
44         free = dList_Delete(free, vict);
45         pages_free = pages_free - 1u;
46       }
47       active = dList_Append(active, vict);
48       vict->pid = pid;
49       vict->vpx = vpx;
50       if (intent != OVERWRITE) {
51         if (PX(PTE(pid, vpx)) == ZFP) {
52           dummy = zero_fill_page(vict->ppx);
53         } else {
54           dummy = pfh_swap_in(pid, vict->ppx, vpx);
55         }
56       }
57       pte = (vict->ppx << 12u) | VALID_MASK | EXEC_MASK;
58       PTE(pid, vpx) = pte;





Sometimes, it is necessary to have multiple pages available in memory. However,
if pfh_touch_addr just checks that the page is currently in the memory,
a subsequent call to pfh_touch_addr might invalidate it. In order to overcome
this problem, the parameter count is used. It should be set to the number
of subsequent calls to pfh_touch_addr that the currently touched page should
at least survive. Note that the function assumes that the argument count is
smaller than the number of user-available physical pages. The function
returns either the translated physical address of the given virtual address or
the error INVALID_ADDR in case of a page table length exception.
The implementation starts by computing virtual page index vpx of the given
address. At lines 12–13 we check for a page table length exception and exit the
handler with return code INVALID_ADDR in case it takes place. It is up to the
kernel to decide what should happen in this situation. Otherwise we compute
page table entry pte for the given parameters. At lines 16–34 we consider the
case when the requested page is already accessible and must be guaranteed to
survive count further calls to pfh_touch_addr. With the loop at lines 20–23
we search for the descriptor of the requested page in the active list. If the page
was found too early in the active list, we push it (as little as possible) backwards
at lines 24–32. Line 34 assigns the translated physical memory address to the
result variable. Further, at lines 35–60 we treat the situation of a legal user
page fault or a kernel-requested touch of a non-accessible page. The free list is
examined at line 36 in order to find out whether any unused page resides in the
physical memory and could be given (lines 43–46) to the page faulting process.
If not, a page from the active list is evicted (lines 37–41). The obtained vacant
page is then filled either with the desired data loaded from the disk or with
zeros, depending on the kind of page fault and intention (lines 50–56). The
page table entry of the evicted page is invalidated (lines 39–40) while the valid
and executable bits of the loaded page are set (line 57). Finally, at line 59 we
compute the translated physical memory address to be returned.
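The frame selection at lines 36–47 can be sketched in isolation as follows. This is a simplified model in C, not the verified C0 code: the list type struct dlist and the helpers dlist_delete, dlist_append, and acquire_frame are hypothetical stand-ins for the dList library and the inlined code of the handler, and the page table update and the swap out call are left to the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct pd { unsigned pid, vpx, ppx; struct pd *next, *prev; };
struct dlist { struct pd *head, *tail; };

/* Unlinks element e from list l. */
static void dlist_delete(struct dlist *l, struct pd *e)
{
    if (e->prev) e->prev->next = e->next; else l->head = e->next;
    if (e->next) e->next->prev = e->prev; else l->tail = e->prev;
    e->next = e->prev = NULL;
}

/* Appends element e at the tail of list l. */
static void dlist_append(struct dlist *l, struct pd *e)
{
    e->next = NULL;
    e->prev = l->tail;
    if (l->tail) l->tail->next = e; else l->head = e;
    l->tail = e;
}

/* Picks a frame: a free one if available, otherwise the head of the active
   list (FIFO eviction). The chosen descriptor is appended to the active
   list. *evicted tells the caller whether the old page, still identified
   by the descriptor's pid and vpx fields, has to be swapped out. */
static struct pd *acquire_frame(struct dlist *active, struct dlist *free_list,
                                bool *evicted)
{
    struct pd *vict;
    if (free_list->head == NULL) {
        vict = active->head;
        assert(vict != NULL);  /* there must be at least one user page */
        dlist_delete(active, vict);
        *evicted = true;
    } else {
        vict = free_list->head;
        dlist_delete(free_list, vict);
        *evicted = false;
    }
    dlist_append(active, vict);
    return vict;
}
```

Since new descriptors are always appended at the tail and victims are taken from the head, a descriptor pushed one position backwards survives one more eviction, which is what the parameter count exploits.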
5.6.2 Specification
Precondition. Besides such typical things as parameter sub-typing and
an abstraction relation, the precondition to the page-fault handler includes the
following condition. The parameter intent of the page-fault handler denotes
the intention of a handler’s invocation. That a micro kernel calls the page-fault
handler with intention SWAP IN means that a real hardware page fault
takes place. Recall that the page fault predicate pfPFH?(cPFH, pid, va, intent)
does not hold for the case of a page table length exception. The latter is
signaled by a separate predicate ptlexcpPFH?(cPFH, pid, va). Hence, in case of
a SWAP IN intention either the page fault or the PTL exception predicate must
hold. Conversely, if the intention is different from SWAP IN there are no hardware
page faults on the underlying machine. As a consequence, no page table length
exception takes place either.





pf-swap-in?(cPFH, pid, va, intent) =
  ptlexcpPFH?(cPFH, pid, va) ∨ pfPFH?(cPFH, pid, va, intent)   if intent = SWAP IN
  ¬ptlexcpPFH?(cPFH, pid, va)                                  otherwise.
Now we define the precondition PRE Hta ?(cH, cPFH, refsactive, refsfree) to the
page-fault handler formally. The first group of terms constituting the
precondition imposes restrictions on parameter ranges:
• user-pid?(cH.pidloc), the process identifier parameter corresponds to some
user process,
• cH.addrloc < TOT PGS · PG SZ, the virtual address parameter does not
exceed the size of a machine word, and
• cH.countloc ∈ {0, 1}, we are going to make at most one subsequent call to
the page-fault handler.
Also we require that the user process cH.pidloc has some virtual memory:
cPFH.ptl[cH.pidloc] ≥ 0. Further, the precondition contains the predicate
pf-swap-in?(cPFH, cH.pidloc, cH.addrloc, cH.intentloc). As described above, this
predicate makes a case distinction on the intention parameter and gives us
information about page fault and PTL exception signals in each case.
Finally, we need to state the abstraction relation and the validity require-
ments:
• absH√(cH, cPFH, refsactive, refsfree), the abstraction relation from Simpl to-
wards the abstract PFH configuration holds, and the latter configuration
is valid, and
• subtypingX?(cH.x), the extended state respects the sub-typing.
Postcondition. Depending on the functionality of our page-fault handler we
distinguish four cases of its invocation:
Case 1. User page fault with a PTL exception: a hardware page fault happens
due to a page table length exception.
Case 2. Legal user page fault: a hardware page fault happens and it is not a
PTL exception.
Case 3. Kernel requested touch of an inaccessible page: the kernel wants to
ensure that a certain page will reside in the physical memory for a specified
number of subsequent calls to the page-fault handler and this page
is not present in the physical memory.
Case 4. Kernel requested touch of an accessible page: the kernel wants to
ensure that a certain page will reside in the physical memory for a specified
number of subsequent calls to the page-fault handler and this page is
already in the physical memory.
A distinction between these four cases is performed formally inside the
predicate POST Hta ?(c′H, cPFH, refsactive, refsfree). Below we discuss how this
predicate is defined in each case.
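The case distinction can be summarized by a small decision function. The sketch below is an illustration in C that follows the informal descriptions of the four cases; the enumeration names and the function classify are hypothetical, and the two boolean arguments stand for the values of the predicates ptlexcpPFH? and pfPFH? on the current arguments. The function pf_swap_in mirrors the precondition predicate pf-swap-in?.

```c
#include <assert.h>
#include <stdbool.h>

enum intent { READ, WRITE, OVERWRITE, SWAP_IN };

enum pfh_case {
    PTL_EXCEPTION,              /* user page fault with a PTL exception */
    LEGAL_USER_PAGE_FAULT,      /* hardware page fault, no PTL exception */
    KERNEL_TOUCH_INACCESSIBLE,  /* kernel touch, page not in memory */
    KERNEL_TOUCH_ACCESSIBLE     /* kernel touch, page already in memory */
};

/* Precondition predicate: for SWAP_IN either signal must be raised,
   otherwise there must be no PTL exception. */
static bool pf_swap_in(bool ptl_excp, bool page_fault, enum intent intent)
{
    if (intent == SWAP_IN)
        return ptl_excp || page_fault;
    return !ptl_excp;
}

/* Selects the postcondition case, assuming the precondition holds. */
static enum pfh_case classify(bool ptl_excp, bool page_fault,
                              enum intent intent)
{
    assert(pf_swap_in(ptl_excp, page_fault, intent));
    if (intent == SWAP_IN)
        return ptl_excp ? PTL_EXCEPTION : LEGAL_USER_PAGE_FAULT;
    return page_fault ? KERNEL_TOUCH_INACCESSIBLE : KERNEL_TOUCH_ACCESSIBLE;
}
```

Under the precondition, a SWAP_IN call without a PTL exception implies that the page fault predicate holds, so the four outcomes are exhaustive and mutually exclusive.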
Case 1: User page fault with a PTL exception. The case takes place if the
intention parameter is SWAP IN and there is a PTL exception:
cH.intentloc = SWAP IN ∧ ptlexcpPFH?(cPFH, cH.pidloc, cH.addrloc).
A page table length exception signals to the kernel that some kind of invalid
memory operation has been undertaken. The page-fault handler does nothing
in this case and reports an error to the kernel. More precisely, the case
postcondition contains the following terms:
• absH√(c′H, cPFH, refsactive, refsfree), the abstraction relation as well as
validity of the abstract PFH configuration are preserved,
• c′H.x = cH.x, the extended state is unchanged, and
• c′H.resloc = INVALID ADDR, the result variable is assigned an invalid mem-
ory operation code.
Case 2: Legal user page fault. This situation happens if the intention
parameter is SWAP IN and there is no PTL exception:
cH.intentloc = SWAP IN ∧ ¬ptlexcpPFH?(cPFH, cH.pidloc, cH.addrloc).
The handler executes the page-fault handling code in this case. The
code brings the implementation data structures to the state consistent with the
abstract PFH configuration after applying the page-fault handling algorithm.
The extended state reflects necessary page transfers between the physical and
the swap memories. Formally, let us introduce the following abbreviations:
• c′PFH = handle-pfPFH(cPFH, cH.pidloc, cH.addrloc) is the abstract PFH
configuration after applying our page-fault handling algorithm,
• refs′active = tl(refsactive) ◦ hd(refsactive) and refs′free = refsfree are lists of
active and free references after executing the algorithm in case the list of
free references was empty, i.e., refsfree = [ ], and
• refs′active = refsactive ◦ hd(refsfree) and refs′free = tl(refsfree) are the same lists
in case the list of free references was not empty, i.e., refsfree ≠ [ ].
Formally, the postcondition of the legal user page fault case is a conjunction
of the following facts:
• absH√(c′H, c′PFH, refs′active, refs′free), the abstraction relation holds between
the Simpl implementation configuration and the updated abstract PFH
configuration,
• c′H.x = handle-pfX(cPFH, cH.x, cH.pidloc, cH.vpxloc, cH.intentloc), the
configuration of the extended state after transferring pages between the physical
and swap memory is specified by the page-fault handling algorithm,
• subtypingX?(c′H.x), the updated extended state is sub-typed, and
• c′H.resloc = pmaPFH(c′PFH, cH.pidloc, cH.addrloc), the result variable of the
page-fault handler stores the translated physical memory address for
parameters cH.pidloc and cH.addrloc.
Case 3: Kernel requested touch of an inaccessible page. The case takes place
if the intention parameter differs from SWAP IN and the page fault predicate holds:
cH.intentloc ≠ SWAP IN ∧ pfPFH?(cPFH, cH.pidloc, cH.addrloc, cH.intentloc).
The postcondition in this case is the same as in case 2.
Case 4: Kernel requested touch of an accessible page. This situation happens
if the intention parameter differs from SWAP IN and there is no page fault:
cH.intentloc ≠ SWAP IN ∧ ¬pfPFH?(cPFH, cH.pidloc, cH.addrloc, cH.intentloc).
The case corresponds to a situation when the kernel executes CVM primitives
that access user memory. Recall that sometimes it is necessary to have two
certain pages available in the physical memory. Examples include a situation
when the kernel copies data between two user processes. However, if the
page-fault handler just checks that the page is currently in the memory, a subsequent
call to the handler might invalidate it. In order to overcome this problem, the
parameter cH.countloc is set to one, denoting that the page will survive the
second call to the page-fault handler. In case the descriptor of the considered
page is the head of the active list we need to push it one position further





to-push?(cPFH, count, pid, ad) = count = 1
  ∧ cPFH.free = [ ]
  ∧ hd(cPFH.active).vpx = px(ad)
  ∧ hd(cPFH.active).pid = pid.
In case to-push?(cPFH, cH.countloc, cH.pidloc, cH.addrloc) holds we update the
active list: its first element is moved to the second position:
c′PFH.active = hd(tl(cPFH.active)) ◦ hd(cPFH.active) ◦ tl(tl(cPFH.active)).
The updated version of the list refs′active of active references in this case is
obtained in a similar manner.
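The predicate to-push? and the corresponding list update can be sketched on an array model of the abstract active list. This is a hypothetical illustration in C: the abstract lists are represented by an array of descriptors plus explicit lengths, the virtual page index is passed directly instead of being computed by px(ad), and the guard against an empty active list is an addition of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct pd { unsigned pid, vpx, ppx; };

/* Holds iff one further call must be survived, no free page is left, and
   the touched page is described by the head of the active list. */
static bool to_push(const struct pd *active, size_t n_active, size_t n_free,
                    unsigned count, unsigned pid, unsigned vpx)
{
    return count == 1u && n_free == 0 && n_active > 0
        && active[0].vpx == vpx && active[0].pid == pid;
}

/* Moves the first element of the active list to the second position. */
static void push_head_back(struct pd *active, size_t n_active)
{
    assert(n_active >= 2);
    struct pd head = active[0];
    active[0] = active[1];
    active[1] = head;
}
```

After the update the former head is no longer the next eviction victim, so it survives exactly one subsequent call that has to evict a page.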
Now, the postcondition in the case consists of the following terms:
• absH√(c′H, c′PFH, refs′active, refsfree), the abstraction relation holds between
the Simpl implementation configuration and the updated abstract PFH
configuration,
• c′H.x = cH.x, the extended state is unchanged, and
• c′H.resloc = pmaPFH(c′PFH, cH.pidloc, cH.addrloc), the result variable of the
page-fault handler stores the translated physical memory address for pa-
rameters cH.pidloc and cH.addrloc.
5.6.3 Correctness
The theorem below states the correctness of the page-fault handler at the level
of Simpl.
THEOREM 5.25 (Implementation correctness of the page-fault handler)
Γ ⊢tH PRE Hta ?(cH, cPFH, refsactive, refsfree)




POST Hta ?(c′H, cPFH, refsactive, refsfree) ∩ ∆(cH, c′H) =
{activeglob, freeglob, pages-freeglob, ptheap, pidheap, vpxheap, nextheap, prevheap, x}
PROOF. We annotate loops of the function with invariants INV ita? and ranking
functions RANK ita for i ∈ {1, 2} defined in Appendix B, run VCG, and obtain
the HOL subgoals stated below to be proven. Functions fx denote modifications
done over implementation configuration cH by code lines x in Listing 5.4.
Subgoal 1. Implication from the precondition to the postcondition:
PRE Hta ?(cH, cPFH, refsactive, refsfree)
−→ POST Hta ?(f10–13,36–62(cH), cPFH, refsactive, refsfree).
Subgoal 2. Implication from the precondition to the first invariant:
PRE Hta ?(cH, cPFH, refsactive, refsfree)
−→ INV 1ta?(f10–11,15–16(cH), cPFH, refsactive, refsfree).
Subgoal 3. Preservation of the first invariant:
INV 1ta?(cH, cPFH, refsactive, refsfree)
−→ INV 1ta?(f21–22(cH), cPFH, refsactive, refsfree).
Subgoal 4. Implication from the first invariant to the second invariant:
INV 1ta?(cH, cPFH, refsactive, refsfree)
−→ INV 2ta?(f24(cH), cPFH, refsactive, refsfree).
Subgoal 5. Preservation of the second invariant:
INV 2ta?(cH, cPFH, refsactive, refsfree)
−→ INV 2ta?(f26–27(cH), cPFH, refsactive, refsfree).
Subgoal 6. Implication from the second invariant to the postcondition:
INV 2ta?(cH, cPFH, refsactive, refsfree)


























So far we have proven in Hoare logic
that our Simpl implementation of demand
paging respects its specification. Our ulti-
mate goal is to infer correctness properties
of the demand paging in the semantics of
VAMP ISA. A big milestone on this way is to
prove correctness of the implementation rep-
resented in the C0 small-step semantics. In
this chapter we exploit the C0 semantics stack
in order to transfer implementation correct-
ness results from the level of Simpl down to
the level of SS. We achieve this goal through
an intermediate transfer of correctness results
to the big-step semantics level. We define ab-
straction relations between BS and SS states
on the one side, and abstract PFH configura-
tions on the other. With the help of these
relations we formulate specifications of the
demand paging functions on BS and SS lev-
els. Finally, using the C0 semantics stack we
prove that corresponding representations of
the implementation meet their specifications.
Throughout the transfer, calls to hard-disk
drivers whose bodies are implemented in as-
sembly are modeled as XCalls. Thus, the tar-
get level of this chapter is the extended C0
small-step semantics, i.e., it contains trusted
XCalls to the drivers. We will get rid of these
XCalls by plugging in the correctness state-
ments of the drivers in Chapter 7.
102 Property Transfer
6.1 Abstraction Relation from BS
This section defines abstraction relation
absBS?(cBS, cPFH,HT, locsactive, locsfree)
which claims that big-step semantics configuration cBS implements abstract
PFH configuration cPFH. As auxiliary parameters the relation takes heap
typing HT :: (loc-t ↦ S⊥) and lists of locations in memory locsactive, locsfree :: loc-t∗
which define the structure of the active and free lists. Technically, the relation
follows the idea of the abstraction relation from Simpl (Section 5.2) and
comprises predicates for individual components of the abstract PFH configuration.
We define them below, but do not give much textual description as the
predicates differ from those defined in Section 5.2 mainly in data representation.
For each predicate we provide references to the corresponding components of
the abstraction relation from Simpl such that the reader can examine differences
between the definitions for BS and for Simpl. Below we remind the reader of the
key differences in the memory models of C0 big-step semantics and Simpl.
Differences in the Simpl and BS memory models
As described in Chapter 3 the Simpl level features a split heap: every com-
ponent of a structure is stored in a separate heap. Moreover implicit typing
of variables is used. The program state is a record where every program vari-
able and every split heap are fields with their own HOL types. In Section 5.1
we define such a record for our implementation of demand paging. With this
shape of a program state it is convenient to define the abstraction relation
for the variables and data structures from our demand paging implementation
in Simpl towards the PFH abstraction: both levels are operating on ordinary
HOL variables (cf. Section 5.2).
In contrast to that, C0 big-step semantics (cf. Section 3.3) features a single
monolithic heap with compound values and explicit typing. The program state
is a record with three separate components for global, local, and heap variables.
These components are partial mappings from variable names (locations, in case
of heap variables) to compound values. The values are modeled by inductive
data type val-t with separate constructors for primitive, array, and structure
values. The primitive values are modeled by inductive data type prim-t whose
constructors correspond to primitive data types supported by C0: booleans,
integers, unsigned numbers, characters, and references.
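For readers more familiar with C than with HOL data types, the value model can be approximated by tagged unions. The following sketch is only an analogy under stated assumptions: the constructor names Prim, Arr, Struct, Unsgnd, Intg, Null, and Addr appear in this chapter, whereas the tags for booleans and characters, the field layout, and the helpers prim_unsgnd and arr_at are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Primitive values (prim-t): one tag per constructor. */
enum prim_tag { P_BOOL, P_INTG, P_UNSGND, P_CHAR, P_NULL, P_ADDR };
struct prim {
    enum prim_tag tag;
    union { bool b; int i; unsigned u; char c; size_t addr; } as;
};

/* Compound values (val-t): primitive, array, or structure. */
enum val_tag { V_PRIM, V_ARR, V_STRUCT };
struct val;
struct field { const char *name; const struct val *value; };
struct val {
    enum val_tag tag;
    union {
        struct prim prim;
        struct { const struct val *elems; size_t len; } arr;
        struct { const struct field *fields; size_t len; } str;
    } as;
};

/* Builds the value Prim(Unsgnd(u)). */
static struct val prim_unsgnd(unsigned u)
{
    struct val v;
    v.tag = V_PRIM;
    v.as.prim.tag = P_UNSGND;
    v.as.prim.as.u = u;
    return v;
}

/* Array access; NULL plays the role of the undefined result. */
static const struct val *arr_at(const struct val *v, size_t i)
{
    assert(v != NULL);
    if (v->tag != V_ARR || i >= v->as.arr.len)
        return NULL;
    return &v->as.arr.elems[i];
}
```

The explicit tags correspond to the explicit typing of the big-step semantics: whether a value is an array is decided by inspecting the value itself rather than by a static HOL type.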
C0 big-step semantics shares the model of expressions with C0 small-step
semantics: expressions are instances of inductive data type expr-t (cf. Sec-
tion 2.3). In order to evaluate an expression e :: expr-t with respect to a
big-step semantics state cBS we use the function evalBS(cBS, e) [Sch06, Figure
7.1]. It returns the value v :: val-t⊥ of the expression e. This expression
evaluation function will be used in definitions of abstraction relations from BS for
different variables from the demand paging implementation. The common idea
behind these abstraction relations is to evaluate the respective variable from
the big-step semantics state by means of the function evalBS and to express the
obtained value with the help of inductive constructors of data type val-t and
the corresponding HOL values taken from the abstract PFH configuration.
6.1.1 Doubly-Linked Lists
The idea for abstracting list implementation in BS is the same as for Simpl
but another syntax for variable access (pointers, structure fields, etc.) is used.
Also it is crucial that on the big-step level a single heap is used which is not
split for different structure fields. Predicate
listBS? :: CBS × (loc-t ↦ S⊥) × val-t × S × S × loc-t∗ ↦ B
has the meaning of listH? (Definition 5.1) in big-step semantics. The differences
are in parameters of the predicate. It takes (i) complete BS state cBS to evaluate
expressions, (ii) heap typing HT, (iii) pointer p to the first element of the list,
(iv) type name tn of list elements, (v) field name next which corresponds to
the list’s “next”-field which yields successor elements, and (vi) list of locations
l specifying the list structure.
The list predicate in the big-step semantics means that in the non-degenerate
case p points to the first address in list l which is appropriately typed and we
can obtain p’s successor p′ for which the BS list predicate holds with l’s tail l′:
DEFINITION 6.1 (List abstracted from BS)
listBS?(cBS, HT, p, tn, next, l) =
  p = Prim(Null)                                               if l = [ ]
  p = Prim(Addr(a)) ∧ HT(a) = ⌊tn⌋
  ∧ ∃ p′ : evalBS(cBS, StructAcc(Deref(Lit(p)), next)) = ⌊p′⌋
  ∧ listBS?(cBS, HT, p′, tn, next, l′)                         if l = a ◦ l′.
A BS doubly-linked list is formalized by application of the BS list predicate
in both directions. Pointer q to the list’s last element and the “predecessor”-




dlistBS? :: CBS × (loc-t ↦ S⊥) × val-t × val-t × S × S × S × loc-t∗ ↦ B,
dlistBS?(cBS,HT, p, q, tn,next, prev, l) =
listBS?(cBS,HT, p, tn,next, l) ∧ listBS?(cBS,HT, q, tn, prev, rev(l)).
Since in this work we deal only with page-management lists whose elements
are page descriptors we formalize below a doubly-linked list of page descriptors
by instantiating the type name as well as parameters next and prev :
dlist pdBS ?(cBS,HT, p, q, l) = dlistBS?(cBS,HT, p, q, pd, next, prev, l).
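The idea of checking a doubly-linked list against a list of locations in both directions can be sketched in C as follows. This is an analogy rather than a rendering of the HOL definitions: C pointers play the role of locations, the heap typing is left implicit in the C types, and the names struct pd, list_matches, and dlist_matches are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct pd { struct pd *next, *prev; };

/* Checks that following next (forward) or prev (backward) pointers from p
   visits exactly the locations in locs, in order (forward) or in reverse
   order (backward), and then reaches the null pointer. */
static bool list_matches(const struct pd *p, const struct pd *const *locs,
                         size_t n, bool forward)
{
    for (size_t k = 0; k < n; k++) {
        const struct pd *expected = forward ? locs[k] : locs[n - 1 - k];
        if (p == NULL || p != expected)
            return false;
        p = forward ? p->next : p->prev;
    }
    return p == NULL;
}

/* Doubly-linked list: the forward check from the first element and the
   backward check from the last element must both succeed. */
static bool dlist_matches(const struct pd *first, const struct pd *last,
                          const struct pd *const *locs, size_t n)
{
    assert(locs != NULL || n == 0);
    return list_matches(first, locs, n, true)
        && list_matches(last, locs, n, false);
}
```

As in the definition, the empty list is represented by null pointers for both the first and the last element.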
6.1.2 Page and Big-Page Management
First of all we define an abstraction relation for lists of page descriptors. It
states that list of abstract page descriptors pds :: pd-t∗ and list of locations
l :: loc-t∗ specify a doubly linked list of page descriptors starting at the variable
of name vn in big-step semantics configuration cBS with respect to heap typing
HT. Each implementation page descriptor fields pid, vpx, and ppx are specified




DEFINITION (List of page descriptors abstracted from BS)
pds-absBS?(cBS, HT, vn, l, pds) =
|l| = |pds| ∧ (∃ v : cBS.gvars(vn) = ⌊v⌋ ∧ ∃ q : dlist pdBS ?(cBS, HT, v, q, l))
∧ ∀ i < |l|, evalpd = λ fn : evalBS(cBS, StructAcc(Deref(
Lit(Prim(Addr(l[i])))), fn)) :
evalpd(pid) = ⌊Prim(Unsgnd(pds[i].pid))⌋
∧ evalpd(vpx) = ⌊Prim(Unsgnd(pds[i].vpx))⌋
∧ evalpd(ppx) = ⌊Prim(Unsgnd(pds[i].ppx))⌋.
By instantiating the head variable name with active and lists of abstract
page descriptors and locations with cPFH.active and locsactive, respectively, we




active-absBS?(cBS, cPFH,HT, locsactive) =
pds-absBS?(cBS,HT, active, locsactive, cPFH.active).




free-absBS?(cBS, cPFH,HT, locsfree) =
pds-absBS?(cBS,HT, free, locsfree, cPFH.free).
As for the stack of free big pages, on the implementation side it is a fixed-size
array of TOT BIG PGS elements. Its abstract counterpart cPFH.bpfree, however,
has variable length and stores only indices of free big pages. We bridge this
gap by claiming existence of an array’s postfix vs in the BS implementation
such that the prefixes of implementation and abstract arrays match (cf. Defi-
nition 5.6).
DEFINITION 6.6 (Stack of free big pages abstracted from BS)
bpfree-absBS?(cBS, cPFH) =
∃ vs : |cPFH.bpfree| + |vs| = TOT BIG PGS
∧ evalBS(cBS, VarAcc(bpfree)) =
⌊Arr(map(λ x : Prim(Unsgnd(x)), cPFH.bpfree) ◦ vs)⌋
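In executable terms, the relation says that the abstract stack is a prefix of the fixed-size implementation array, while the remaining entries are unconstrained. A minimal sketch in C, with a hypothetical value for TOT_BIG_PGS and hypothetical parameter names:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { TOT_BIG_PGS = 8 };  /* hypothetical number of big pages */

/* Holds iff the abstract stack abs_stack[0..abs_len-1] coincides with the
   first abs_len entries of the implementation array; the remaining
   TOT_BIG_PGS - abs_len entries (the postfix vs) may hold any values. */
static bool bpfree_abs(const unsigned impl[TOT_BIG_PGS],
                       const unsigned *abs_stack, size_t abs_len)
{
    assert(impl != NULL);
    if (abs_len > (size_t)TOT_BIG_PGS)
        return false;
    for (size_t i = 0; i < abs_len; i++)
        if (impl[i] != abs_stack[i])
            return false;
    return true;
}
```

The existential quantifier over vs disappears in the sketch because any array contents beyond the prefix are accepted.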
6.1.3 Page and Big-Page Tables
The abstraction relation for the page table space states that BS implementa-
tion variable pt is a reference which points to the first heap address and by
dereferencing it we obtain a two dimensional array on the heap with elements







⌊Arr(map(λ x : Arr(map(λ y : Prim(Unsgnd(y)), x)), cPFH.pt))⌋.
The abstraction relation for the big-page table space claims that BS imple-
mentation variable bpt is an array which matches the abstract big-page table





evalBS(cBS, VarAcc(bpt)) = ⌊Arr(map(λ x : Prim(Unsgnd(x)), cPFH.bpt))⌋.
6.1.4 Process Control Blocks
When it comes to the abstraction relation for process control blocks the major
differences between the Simpl and big-step semantics come to light. In
Simpl, with its flattened array representation, it was sufficient to claim
correspondence between the values of Simpl arrays representing the PCB fields
and their abstract counterparts (cf. Definitions 5.9–5.11). The big-step
semantics has a more accurate memory model. Therefore, in order to define the
abstraction relations for the process control blocks we have to consider how
the array of PCB data structures is declared (Listing 2.1) and model the latter
in the BS memory. The idea behind the abstraction relation is depicted in
Figure 6.1.
Recall that abstract PCB fields like cPFH.pto are arrays of MAX PID elements
including the (unused) element at position zero reserved for the kernel. The
abstraction relation for the page table origins states that by evaluating the BS
variable pcb for the PCB array we obtain an array of structures whose first
element equals some value pcbkernel. Each remaining element i ∈ [0..MAX PID−2]
has the following fields: (i) the exception frame array field ef composed of
some prefix efhd[i], the page table origin constructed from the abstract one
cPFH.pto[i+1], and some postfix eftl[i], (ii) user-defined interrupt handlers ihd[i],
(iii) big-page table origins and lengths bpto[i] and bptl[i], and (iv) empty space
empty[i]. Moreover, the abstraction relation claims that the lengths of efhd and eftl
are MAX PID−1 and that the lengths of their elements efhd[i] and eftl[i] are PTO
and EF DIM−PTL, respectively. The latter reflects position of the page table origin element




pto-absBS?(cBS, cPFH) = ∃ pcbkernel, efhd, eftl, ihd, bpto, bptl, empty :
evalBS(cBS, VarAcc(pcb)) =
⌊Arr(pcbkernel ◦
map(λ i : Struct([(ef, Arr(efhd[i] ◦
[Prim(Intg(cPFH.pto[i+1]))] ◦ eftl[i])),
ihd[i], bpto[i], bptl[i], empty[i]]),
[0..MAX PID−2]))⌋
∧ |efhd| = |eftl| = MAX PID−1
∧ (∀ i < MAX PID−1 : |efhd[i]| = PTO)
∧ (∀ i < MAX PID−1 : |eftl[i]| = EF DIM−PTL)
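To illustrate the shape of the value claimed by this predicate, the following sketch builds the compound value of the PCB array from an abstract list of page table origins and reads one origin back. It is only an illustration under simplifying assumptions: BS values are modeled as nested Python tuples tagged with "Arr", "Struct", and "Prim", the remaining structure fields are passed as an opaque list rest[i], and the helper names pcb_array and pto_of are hypothetical.

```python
def pcb_array(pcb_kernel, pto, efhd, eftl, rest):
    """Build the array value: pcb_kernel followed by one structure per user process."""
    structs = []
    for i in range(len(pto) - 1):  # i ranges over [0..MAX_PID-2]
        # Exception frame: prefix, then the page table origin of process i + 1, then postfix.
        ef_cells = efhd[i] + [("Prim", ("Intg", pto[i + 1]))] + eftl[i]
        structs.append(("Struct", [("ef", ("Arr", ef_cells))] + rest[i]))
    return ("Arr", [pcb_kernel] + structs)


def pto_of(pcb_value, i, pto_index):
    """Read the page table origin stored in element i back from the array value."""
    _tag, elems = pcb_value
    _stag, fields = elems[i + 1]           # skip pcb_kernel at position zero
    _name, (_atag, ef_cells) = fields[0]   # the ef field comes first
    return ef_cells[pto_index][1][1]
```

With prefixes of length PTO, the origin of element i sits at index PTO of its exception frame, which is what the length constraints on efhd[i] express.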
The abstraction relation for the page table lengths is defined in the same
fashion. However, since PTL is the last element which stores some value in the




ptl-absBS?(cBS, cPFH) = ∃ pcbkernel, efhd, ihd, bpto, bptl, empty :
evalBS(cBS, VarAcc(pcb)) =
⌊Arr(pcbkernel ◦
map(λ i : Struct([(ef, Arr(efhd[i] ◦
[Prim(Intg(cPFH.ptl[i+1]))])),
ihd[i], bpto[i], bptl[i], empty[i]]),
[0..MAX PID−2]))⌋
∧ |efhd| = MAX PID−1 ∧ (∀ i < MAX PID−1 : |efhd[i]| = PTL)
Figure 6.1: Abstraction relation for page table origins
[Figure: the value of pcb is an Arr whose first element is pcbkernel, followed by
one Struct for each i = 0, 1, ..., MAX_PID−2. Each Struct contains an Arr built
from efhd[i], cPFH.pto[i+1], and eftl[i], followed by ihd[i], bpto[i], bptl[i],
and empty[i].]
The abstraction relation for the big-page table origins is even simpler as it




bpto-absBS?(cBS, cPFH) = ∃ pcbkernel, ef, ihd, bptl, empty :
evalBS(cBS,VarAcc(pcb)) =







The abstraction relation for the big-page table lengths is stated by predicate
bptl-absBS?(cBS, cPFH) which is defined similarly to the predicate above.
6.1.5 Miscellaneous
The abstraction relations for the number of free physical pages, free big pages,










evalBS(cBS, VarAcc(bpages free)) = ⌊Prim(Unsgnd(|cPFH.bpfree|))⌋.










The reverse lookup array maps user physical page indices to the
corresponding page descriptors. Therefore, we state the existence of an array of
pointers to page descriptors pdskernel for the kernel pages. This array is
followed by an array obtained from the values of user page descriptors pdsuser
to form the complete array stored in variable ppx2pd. The lengths of pdskernel
and pdsuser are equal to the number of kernel and user pages, respectively.
Next, the abstraction relation states that pointers to user page descriptors are
drawn from the list of active or free locations and are correctly typed with respect
to heap typing HT. Finally, dereferencing of these pointers gives us the values




ppx2pd-absBS?(cBS, HT, locsactive, locsfree) = ∃ pdskernel, pdsuser :
evalBS(cBS, VarAcc(ppx2pd)) =
⌊Arr(pdskernel ◦ map(λ x : Prim(Addr(x)), pdsuser))⌋
∧ |pdskernel| = KERNEL PGS ∧ |pdsuser| = USER PGS
∧ ∀ i, j = i−KERNEL PGS : user-ppx?(i) −→
pdsuser[j] ∈ locsactive ◦ locsfree
∧ HT(pdsuser[j]) = ⌊pd⌋
∧ evalBS(cBS, StructAcc(Deref(Lit(Prim(Addr(pdsuser[j])))), ppx)) =
⌊Prim(Unsgnd(i))⌋
6.1.6 Altogether
Similarly to the abstraction relation from Simpl, we combine the individual
components defined above into the abstraction relation from BS.
DEFINITION 6.17 (Abstraction relation from BS)
The overall abstraction relation from the BS implementation towards abstract
PFH configurations absBS?(cBS, cPFH, HT, locsactive, locsfree) is the conjunction
of Definitions 6.4–6.16.
Finally, we combine the overall abstraction relation together with validity
properties of the abstract page-fault handler.
DEFINITION 6.18 (Valid abstraction relation from BS)
absBS√(cBS, cPFH, HT, locsactive, locsfree) =
absBS?(cBS, cPFH, HT, locsactive, locsfree)
∧ pfh√(cPFH)
∧ locsactive ∩ locsfree = ∅
∧ locsactive ∪ locsfree = {x | 2 ≤ x ≤ USER PGS + 1}.
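The last two conjuncts state that the active and free lists partition the heap locations 2 to USER PGS + 1 that hold page descriptors. The following is a small sketch of this condition, assuming that locations are plain natural numbers and that both lists are given as Python lists; the function name is illustrative only.

```python
def locs_partition_ok(locs_active, locs_free, user_pgs):
    """Check disjointness and coverage of the page descriptor locations."""
    active, free = set(locs_active), set(locs_free)
    expected = set(range(2, user_pgs + 2))  # {x | 2 <= x <= USER_PGS + 1}
    return active.isdisjoint(free) and (active | free) == expected
```

For instance, with three user pages the configuration with an empty active list and the free list [4, 3, 2] satisfies the condition, whereas a location that occurs in both lists violates it.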
6.2 Property Transfer from Simpl to BS
In the previous section we presented the abstraction relation for the demand
paging implementation in the big-step semantics towards an abstract PFH
configuration. In order to prove that the mentioned BS implementation is
correct it remains to do the following.
First, we have to define a concrete version of the function which abstracts
BS states towards Simpl states. As already mentioned such a function depends
on the concrete Simpl state and cannot be defined generically to fit any state.
Next, we have to prove two lemmas: the valid abstraction relation
from BS towards abstract PFH states implies the valid abstraction relation
from Simpl towards abstract PFH states, and vice versa. As we discussed in
Section 3.4, we need the first implication to transfer preconditions and
the second to transfer postconditions.
Further, we have to state specifications of the initialization code and the
page-fault handler at the level of BS. Having this, we finally will be able to
transfer implementation correctness of the initialization code and the page-fault
handler down to the level of the big-step semantics.
6.2.1 Mapping Simpl States to BS States
In Section 3.4 we mentioned the function BS2Hstate which abstracts big-step
semantics states towards Simpl states. Since Simpl features a polymorphic
state space and a shallow embedding, this function cannot be defined in
Isabelle/HOL generically: the definition has to consider every single variable
of the concrete state. With the example below we show how this function
can be defined for some selected global and heap variables. We skip local
variables as the part of the abstraction function for them is analogous
to that for global and heap variables.
For the global variables we choose the pointer to the free list and the big-
page table as examples. As for the heap variables, the abstraction could only
be defined for those BS heap values that are compatible with the types of
values the demand paging implementation allocates on the heap (cf. Figure
7.10 of Schirmer’s thesis [Sch06]). The demand paging implementation uses
only values of two types in the heap memory: the page table-space and page
descriptors. We define types typt and typd in the C0 big-step semantics for
them below.








For each heap location l in the BS state if the heap value cBS.heap(l) is
compatible with the page table space type typt then the heap function cH.ptheap
yields at the reference loc2ref(l) a two-dimensional array constructed from the
value cBS.heap(l). A similar constraint must hold for all heap locations compat-
ible with the page descriptor type: the values obtained by the heap functions
for page descriptor fields match the corresponding values from the BS state.
DEFINITION 6.19 (Abstraction of the state from BS towards Simpl)
BS2Hstate(cBS) =
{cH | cBS.gvars(free) = ⌊Ref(cH.freeglob)⌋
∧ cBS.gvars(bpt) = ⌊Arr(map(λx : Prim(Unsgnd(x)), cH.bptglob))⌋
∧ . . . (other global variables)
∧ (∀l : ⊢v TcBS.heap(l)U :: typt −→
cBS.heap(l) = ⌊Arr(map(λx : Arr(map(λy : Prim(Unsgnd(y)), x)),
cH.ptheap(loc2ref(l))))⌋)






∧ . . . (local variables)}
Now we have the abstraction relation between BS and Simpl states and
can, finally, transfer the correctness results of the demand paging implementation
presented in Section 5 down to the level of the big-step semantics. For that
we first show lemmas for the transfer of valid abstraction relations
(Definitions 5.18, 6.18) from Simpl to BS and vice versa. These lemmas will be the
main proof obligations for the transfer of the demand paging implementation
correctness to the level of BS.
6.2.2 Transfer of Abstraction Relation
Transfer of the valid abstraction relation from Simpl to BS. The lemma below
states that a valid abstraction relation from Simpl towards the abstract PFH
state implies the valid abstraction relation from BS towards the abstract PFH
state. Additional assumptions of the lemma are: (i) the Simpl state cH is drawn
from the set of states obtained by abstraction of global and heap variables from
the BS state cBS, and (ii) the conformance predicate holds for the BS state cBS with




transfer from Simpl to BS
absH√(cH, cPFH, refsactive, refsfree)
∧ cH ∈ BS2Hstate(cBS)
∧ TE ⊢ cBS :: HT, GT, [ ]
−→ absBS√(cBS, cPFH, HT, map(ref2loc, refsactive), map(ref2loc, refsfree))
PROOF. The idea of the lemma's proof is as follows. From absH√ we know that the Simpl
variables of state cH are correctly mapped to the abstract PFH state cPFH. Since
cH belongs to the set of states abstracted from the big-step state cBS, we know that
the variables of cH are also mapped to the variables of cBS. Having these two facts,
we conclude that the big-step state can also be mapped to the abstract PFH
state. □
Transfer of the valid abstraction relation from BS to Simpl. The lemma for the




transfer from BS to Simpl
absBS√(cBS, cPFH, HT, locsactive, locsfree)
∧ cH ∈ BS2Hstate(cBS)
∧ TE ⊢ cBS :: HT, GT, [ ]
−→ absH√(cH, cPFH, map(loc2ref, locsactive), map(loc2ref, locsfree))
6.2.3 Specification at the Level of BS
Pre- and postconditions to the functions of the demand paging implementation
formulated at the level of the big-step semantics literally follow those at the
level of Simpl. The only difference is data representation — we have already
illustrated this while defining the abstraction relation from BS (Definition 6.17).
Because of that we do not present formal definitions of pre- and postconditions
but only declare respective predicates.
Initialization code. The precondition to the initialization code of demand pag-
ing at the level of BS is stated by predicate
PRE BSinit?(cBS).




where r is an artificial variable for storing the result of the initialization code.
The definitions of both predicates follow Section 5.3.2. Most notably, the post-
condition establishes the valid abstraction relation from BS (Definition 6.18)
with the initial abstract PFH configuration:
absBS√(c′BS, init-cPFH, init-HT, [ ], [USER PGS+1..2]).
Above, init-HT is the initial value for the heap typing. The initialization code
allocates the page table space at the heap location one and page descriptors of
free and active lists at further USER PGS locations.
DEFINITION 6.22 (Initial heap typing)
init-HT(i) =
⌊pt t⌋ if i = 1
⌊pd t⌋ if 1 < i ≤ USER PGS+1
⊥ otherwise
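The case distinction of the initial heap typing translates directly into a small partial function. The sketch below is an illustration only: type names are represented by the strings "pt_t" and "pd_t", the undefined case ⊥ by None, and the value of the constant is chosen arbitrarily.

```python
USER_PGS = 3  # illustrative constant, not the value used in the implementation


def init_ht(i, user_pgs=USER_PGS):
    """Initial heap typing: page table space at location one, page descriptors after it."""
    if i == 1:
        return "pt_t"
    if 1 < i <= user_pgs + 1:
        return "pd_t"
    return None  # location is not allocated by the initialization code


# The initial free list [USER_PGS+1..2] from the postcondition, in descending order.
initial_free = list(range(USER_PGS + 1, 1, -1))
```

Every location in initial_free is typed as a page descriptor by init_ht, which matches the statement that the page descriptors of the free and active lists are allocated at the USER PGS locations following location one.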
We separate the specifications for the extended state into independent
predicates. This has the following advantage. Recall that the extended
state is shared between the big-step and small-step semantics. Therefore, we
can reuse the specifications for the extended state defined below when speaking
about correctness of demand paging at the level of SS.
The precondition for the extended state is defined by the predicate
PRE Xinit?(cX).
It requires sub-typing of the extended state (Definition 4.8). The postcondition




It claims that the sub-typing is preserved and the page at page address ZFP is
filled with zeros (Definition 4.30).




by the initialization code
modifiedinit = [kheap, pt, pcb, active, free, bpfree,
pages used, pages free, bpages free, ppx2pd]
Page-fault handler. The precondition of the page-fault handler function at the
level of BS is specified by predicate
PRE BSta ?(cBS, cPFH, pid, addr, intent, count,HT, locsactive, locsfree).
It explicitly takes values of the page-fault handler parameters as we evaluate
them outside the precondition directly in the formulation of the correctness
lemma at the level of BS (Lemma 6.30). The postcondition of the page-fault
handler function is expressed by predicate
POST BSta ?(c′BS, cPFH, pid, addr, intent, count, HT, locsactive, locsfree).
The formal definitions of both pre- and postconditions at the level of BS follow
corresponding definitions at the level of Simpl (cf. Section 5.6.2). Most im-
portantly, both pre- and postconditions contain the valid abstraction relation
from BS towards abstract PFH state (Definition 6.18):
absBS√(cBS, cPFH, HT, locsactive, locsfree).
In contrast to the specification of the initialization code, the BS postcondition
of the page-fault handler does not speak about the result variable r. That is
because we introduce a separate term to describe the result in the formulation
of the respective theorem. We will benefit from this style later on when dealing
with the page-fault handler top-level correctness theorem. The result of the
page-fault handler at the level of BS is computed by the function
pfh-ta-resBS(cPFH, pid, addr, intent, count)
which is defined following the cases described in Section 5.6.2: in case of a PTL
exception the function returns an error constant INVALID ADDR whereas in all
other cases the function yields a physical memory address
pmaPFH(cPFH, pid, addr) (Definition 4.14). The parameters intent and count
are taken only to perform the case distinction. The additional parameter of
the postcondition is an artificial variable r for the result.
The precondition for the extended state demanding only its sub-typing is:
PRE Xta ?(cX).
The postcondition is stated by the predicate
POST Xta ?(cX, c′X, cPFH, pid, addr, intent, count)
which is defined by case splitting according to Section 5.6.2.






modifiedta = [free, active, pages free]
6.2.4 Correctness at the Level of BS
Demand paging program. Recall that Hoare triples on the big-step and small-
step levels are defined with respect to some program Π and extended semantics
environment xsem. In order to claim such Hoare triples for our demand paging
functions we need to have concrete instances of the program and the extended
semantics environment.
We denote the program of our demand paging implementation by ΠPFH. It
contains the entries for (i) the initialization code pfh init, (ii) the page-fault
handler function pfh touch addr, (iii) the swapping routines pfh swap out
and pfh swap in, and (iv) the functions of the doubly-linked list library used
by the demand paging.
The extended semantics environment for the demand paging implementation
is xsemPFH. It contains entries for (i) the hard-disk drivers read from disk
and write to disk, and (ii) the zero fill page function zero fill page. The
formal definition of xsemPFH is introduced in Section 7.1 (Definition 7.6) where
we discuss correctness of extended calls to those functions in detail.
Having a concrete instance of the demand paging program ΠPFH we can
define concrete versions of the different typings. Basically, the idea behind the
following definition is to convert concepts like the type environment and the global
and local symbol tables of the program ΠPFH, given as lists of pairs of names and
types as defined in the syntax of C0 and the small-step semantics, to partial mappings






LTinit = map-of(fpfh init.params ◦ fpfh init.lvars)
LTta = map-of(fpfh touch addr.params ◦ fpfh touch addr.lvars)
Above, fpfh init and fpfh touch addr are function definitions in the function table
ΠPFH.ft.
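The conversion of a list of name and type pairs into a partial mapping can be sketched as follows. The sketch assumes that map-of behaves like the Isabelle/HOL function map_of, where the first pair with a given name determines the result; the parameter and variable names in the example lists are hypothetical and do not stem from the implementation.

```python
def map_of(pairs):
    """Turn a list of (name, type) pairs into a partial mapping; the first match wins."""
    table = {}
    for name, ty in pairs:
        table.setdefault(name, ty)  # later duplicates do not override earlier entries
    return table


# Hypothetical symbol tables of a function, given as lists of pairs.
params = [("pid", "UnsgndT"), ("addr", "UnsgndT")]
lvars = [("tmp", "IntegerT"), ("pid", "IntegerT")]

# Local typing in the style of LTta: parameters are concatenated with local variables.
local_typing = map_of(params + lvars)
```

Looking up a name that is absent from the list corresponds to the undefined case of the partial mapping, which local_typing.get(name) reports as None.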
Transfer of the initialization code correctness from Simpl to BS. The lemma
stated below claims correctness of the initialization code at the level of BS. It
has only a single assumption: the variable r to which we write the result of
the call of the initialization code must reside in the context of caller's variables
L. The lemma claims a valid big-step Hoare triple with respect to program
ΠPFH, extended semantics environment xsemPFH, and context of variables L.
We need to express which variables are modified by the initialization code.
For that we define below the predicate modifies BSinit? which compares two BS
states cBS and c′BS with respect to a global typing GT and a context of caller’s
variables L and claims that (i) all variables outside the global typing GT remain
unchanged, (ii) all variables from the global typing GT which do not belong to
the list of modified global variables modifiedinit remain unchanged, and (iii) all
caller’s variables from the context L except for the result variable r remain
unchanged.
DEFINITION 6.26
modifies BSinit?(cBS, c′BS, GT, L, r) =
(∀v ∉ dom(GT) : c′BS.gvars(v) = cBS.gvars(v))
∧ (∀v ∈ dom(GT) : v ∉ modifiedinit −→ c′BS.gvars(v) = cBS.gvars(v))
∧ (∀v ∈ L : v ≠ r −→ c′BS.lvars(v) = cBS.lvars(v))
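The three conjuncts form a frame condition on global and local variables. The following sketch illustrates it under the assumption that a BS state is a dictionary with the entries "gvars" and "lvars", each a dictionary from variable names to values, and that the global typing is a dictionary as well; the underscore spelling of the variable names and the function name are illustrative.

```python
MODIFIED_INIT = ["kheap", "pt", "pcb", "active", "free", "bpfree",
                 "pages_used", "pages_free", "bpages_free", "ppx2pd"]


def modifies_init(c, c2, gt, ctx, r, modified=MODIFIED_INIT):
    """Frame condition: only listed globals and the result variable r may change."""
    # Names absent from both states are trivially unchanged, so it suffices
    # to inspect the names that occur in at least one of the two states.
    for v in set(c["gvars"]) | set(c2["gvars"]):
        unchanged = c2["gvars"].get(v) == c["gvars"].get(v)
        if v not in gt and not unchanged:
            return False  # variable outside the global typing was changed
        if v in gt and v not in modified and not unchanged:
            return False  # typed global outside the modified list was changed
    # Caller's variables other than the result variable must keep their values.
    return all(c2["lvars"].get(v) == c["lvars"].get(v) for v in ctx if v != r)
```

The predicate deliberately says nothing about the new values of the listed variables; those are fixed by the postcondition of the initialization code.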
The precondition of the Hoare triple is the set of all big-step states cBS
which (i) conform with the type environment TEPFH, global typing GTPFH,
and local typing LTinit, and (ii) respect the preconditions of the initialization
code PRE BSinit? and PRE Xinit?.
The Hoare triple's postcondition is the set of all big-step states c′BS for
which (i) the postconditions POST BSinit? and POST Xinit? are satisfied, (ii) the
conformance predicate holds with respect to the initial heap typing init-HT,
and (iii) modification of variables is described by the predicate modifies BSinit?.
LEMMA 6.27 (Initialization code correctness at the level of BS)
r ∈ L −→ ΠPFH, xsemPFH, L tBS
{cBS | cBS = c0BS
∧ TEPFH ⊢ cBS :: [ ], GTPFH, LTinit
∧ PRE BSinit?(cBS) ∧ PRE Xinit?(cBS.x)}
SCall(r, pfh init, [ ])
{c′BS | POST BSinit?(c′BS, r) ∧ POST Xinit?(c0BS.x, c′BS.x)
∧ TEPFH ⊢ c′BS :: init-HT, GTPFH, LTinit
∧ modifies BSinit?(c0BS, c′BS, GTPFH, L, r)}
PROOF. We apply the transfer theorem from Simpl to BS (Theorem 3.2) in which
the Simpl Hoare triple is instantiated by the one proven correct for the
initialization code (Theorem 5.20). The core goals of the proof comprise an
implication from the big-step precondition of the function to the Simpl precondition,
and an implication from the Simpl postcondition to the big-step postcondition.
We use Lemmas 6.20 and 6.21, respectively, to conclude these goals. □
Transfer of the page-fault handler correctness from Simpl to BS. Before stating
the theorem of correctness results transfer from Simpl to BS for the page-fault
handler let us formalize a few additional concepts.
As defined before, the BS specification of the page-fault handler takes val-
ues of the handler’s call parameters. These values are obtained from the cor-
responding expressions passed as parameters. Since these expressions might
be different depending on the place of the page-fault handler call in the CVM
program, they appear as free variables in the formulation of the handler’s cor-
rectness theorem at the level of BS and are subject to instantiation by the
theorem's user. Our first goal is to guarantee that the passed expressions are
correctly evaluated. For this we define a predicate which ensures that the
expressions epid, eaddr, eintent, and ecount evaluate to the corresponding values
pid, addr, intent, and count in a big-step semantics configuration cBS with
respect to a context of caller's variables L.
DEFINITION 6.28 (Evaluation of parameters to the page-fault handler at the level of BS)
paramBSta ?(cBS, L, epid, eaddr, eintent, ecount, pid, addr, intent, count) =
evalBS(L, cBS, epid) = ⌊Prim(Unsgnd(pid))⌋
∧ evalBS(L, cBS, eaddr) = ⌊Prim(Unsgnd(addr))⌋
∧ evalBS(L, cBS, eintent) = ⌊Prim(Unsgnd(intent))⌋
∧ evalBS(L, cBS, ecount) = ⌊Prim(Unsgnd(count))⌋
We define the predicate modifies BSta ? which describes the modification of
variables done by the page-fault handler. The predicate compares two states cBS
and c′BS with respect to a given global typing GT, heap typing HT, type
environment TE, and context of caller's variables L. This predicate states that (i) all
variables outside the global typing GT remain unchanged, (ii) all
variables from the global typing GT which do not belong to the list modifiedta
remain unchanged, (iii) all variables from the context L except for r remain
unchanged, (iv) the domain of the big-step heap stays the same, (v) all heap
locations of the types outside the type environment TE remain unchanged, and
(vi) the heap might have changed only at the page table space (location one)
and the active and free lists (locations specified by locsactive and locsfree).
DEFINITION 6.29 (Modification of BS variables by the page-fault handler)
modifies BSta ?(cBS, c′BS, GT, HT, L, r, TE, locsactive, locsfree) =
(∀v ∉ dom(GT) : c′BS.gvars(v) = cBS.gvars(v))
∧ (∀v ∈ dom(GT) : v ∉ modifiedta −→ c′BS.gvars(v) = cBS.gvars(v))
∧ (∀v ∈ L : v ≠ r −→ c′BS.lvars(v) = cBS.lvars(v))
∧ dom(c′BS.heap) = dom(cBS.heap)
∧ (∀l, tn : HT(l) = ⌊tn⌋ ∧ tn ∉ dom(TE) −→ c′BS.heap(l) = cBS.heap(l))
∧ (∀tn, ty : TE(tn) = ⌊ty⌋ −→ ∀l : HT(l) = ⌊tn⌋ :
(ty = typt ∧ l ≠ 1) ∨ (ty = typd ∧ l ∉ locsactive ◦ locsfree)
−→ Tc′BS.heap(l)U = TcBS.heap(l)U)
Analogously to Lemma 6.27, the lemma below assumes the result variable r
to be in the context of caller's variables L. The lemma concludes a valid
BS Hoare triple with respect to the program ΠPFH and L. The precondition of
the Hoare triple is the set of all big-step states cBS which (i) conform with the
type environment TEPFH, local typing LTta, global typing GTPFH, and some
heap typing HT, (ii) correctly evaluate the page-fault handler parameters, and
(iii) respect the preconditions of the handler PRE BSta ? and PRE Xta ?.
The Hoare triple's postcondition is the set of all big-step states c′BS for
which (i) the postconditions POST BSta ? and POST Xta ? hold, (ii) the function's
result is appropriately computed, (iii) the conformance judgment is preserved,
and (iv) the variable modification is described by the predicate modifies BSta ?.
LEMMA 6.30 (Page-fault handler correctness at the level of BS)
r ∈ L −→ ΠPFH, xsemPFH, L tBS
{cBS | cBS = c0BS
∧ TEPFH ⊢ cBS :: LTta, GTPFH, HT
∧ paramBSta ?(cBS, L, epid, eaddr, eintent, ecount, pid, addr, intent, count)
∧ PRE BSta ?(cBS, cPFH, pid, addr, intent, count, HT, locsactive, locsfree)
∧ PRE Xta ?(cBS.x)}
SCall(r, pfh touch addr, [epid, eaddr, eintent, ecount])
{c′BS | POST BSta ?(c′BS, cPFH, pid, addr, intent, count, HT, locsactive, locsfree)
∧ POST Xta ?(c0BS.x, c′BS.x, cPFH, pid, addr, intent, count)
∧ r = pfh-ta-resBS(cPFH, pid, addr, intent, count)
∧ TEPFH ⊢ c′BS :: LTta, GTPFH, HT
∧ modifies BSta ?(c0BS, c′BS, GTPFH, HT, L, r, TEPFH, locsactive, locsfree)}
PROOF. We apply the transfer theorem from Simpl to BS (Theorem 3.2) in which
the Simpl Hoare triple is instantiated by the one proven correct for the
page-fault handler (Theorem 5.25). For the core proof goals, namely an implication from
the big-step precondition of the function to the Simpl precondition and an
implication from the Simpl postcondition to the big-step postcondition, we
use Lemmas 6.20 and 6.21, respectively. □
6.3 Program Context Extension
So far, we have correctness results of the demand paging implementation on the
big-step level with respect to the program ΠPFH (Lemmas 6.27 and 6.30). The
definition of ΠPFH contains only entries corresponding to the demand paging
functions. However, we want to apply this correctness result in the context of
a complete concrete kernel. That means that the correctness lemmas of the
demand paging must also hold in the context of a program which contains
entries for all functions of the CVM framework as well as the abstract kernel.
At first glance, it might seem that substituting ΠPFH by the program of
the concrete kernel ΠCK(ΠAK) (Definition 2.17) may suffice. However, the
concrete kernel program contains definitions of the hard-disk driver functions
read from disk and write to disk as well as the zero fill page function
zero fill page. These functions contain inline assembly code and are
subroutines of the page-fault handler and its initialization code, respectively. As
Section 3.5 points out, the simulation theorems between Simpl and BS as well as
BS and SS assume that the function subject to property transfer does not
execute assembly statements.
In order to solve this problem we have to remove the hard-disk driver func-
tions and the zero fill page routine from the definition of the concrete kernel
program. Information about the removed functions will be stored instead in
the extended semantics environment which is a parameter to correctness lem-
mas of demand paging on the small-step semantics level. Moreover, we have to
replace all calls to the removed functions by the extended calls (XCalls). Next





The function repl-hdzfpstmt(s) analyzes the statement (tree) s and replaces
all normal calls to the zero fill page and hard-disk driver functions by the
corresponding extended calls. The function is defined by structural induction.
For the zero fill page call s = SCall(vn, zero fill page, params) the result is
defined as
XCall(zero fill page, [VarAcc(vn, IntegerT)], params).
For the calls to the hard-disk driver functions s = SCall(vn, fn, params) with
fn ∈ {write to disk, read from disk} we define the results as
XCall(fn, [VarAcc(vn,BooleanT)], params).
For the compositional, conditional, and loop statement the function goes one






We define the function repl-hdzfpft(ft) which transforms the function table
ft to xpt in two steps.
• It removes the definitions of hard-disk driver and zero fill page functions:
ft′ = filter(λp : fst(p) /∈ {write to disk,
read from disk,
zero fill page}, ft).
• It replaces all calls to these functions by XCalls:
xpt = map(λp : (fst(p), f ′), ft′)
where f ′.body = repl-hdzfpstmt(snd(p).body).
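The two transformations can be sketched together as follows. The sketch is an illustration under stated assumptions: statements are modeled as tagged Python tuples, the constructor names Comp, Ifte, and Loop for compositional, conditional, and loop statements are hypothetical, and the recursion into their sub-statements is our reading of the structural induction. A function definition is modeled as a dictionary with a "body" entry.

```python
HD_DRIVERS = {"write_to_disk", "read_from_disk"}
ZFP = "zero_fill_page"


def repl_stmt(s):
    """Replace normal calls to the removed functions by extended calls."""
    tag = s[0]
    if tag == "SCall":
        _, vn, fn, params = s
        if fn == ZFP:
            return ("XCall", fn, [("VarAcc", vn, "IntegerT")], params)
        if fn in HD_DRIVERS:
            return ("XCall", fn, [("VarAcc", vn, "BooleanT")], params)
        return s
    if tag == "Comp":   # sequential composition of two statements
        return ("Comp", repl_stmt(s[1]), repl_stmt(s[2]))
    if tag == "Ifte":   # conditional with condition s[1]
        return ("Ifte", s[1], repl_stmt(s[2]), repl_stmt(s[3]))
    if tag == "Loop":   # loop with condition s[1]
        return ("Loop", s[1], repl_stmt(s[2]))
    return s            # all other statements stay as they are


def repl_ft(ft):
    """Drop the removed functions, then rewrite the bodies of the remaining ones."""
    removed = HD_DRIVERS | {ZFP}
    ft1 = [(name, f) for name, f in ft if name not in removed]
    return [(name, {**f, "body": repl_stmt(f["body"])}) for name, f in ft1]
```

Note that the filtering step and the rewriting step are independent of each other, so their order does not affect the resulting table.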
Having this, we can define the extended program Πext(ΠAK), the one we
will use instead of ΠPFH. The function table of this program is defined by
applying Definition 6.32 to the function table of the concrete kernel ftCK(ΠAK).
The type environment and the global symbol table are those of the concrete













Note that since program extension does not affect local symbol tables we have
no need to define local typings with respect to the extended program, but
rather can keep using LTinit and LTta.
So far we have defined the extended program context for our demand paging
implementation. Our next goal is to state and prove its correctness lemmas
with respect to this extended program context. These proofs will use the
following two lemmas, which have been proven by Schirmer.
The first lemma provides means to substitute a program Π used in a big-step




Assume that (i) a big-step semantics Hoare triple P s Q holds with respect
to a program Π and an extended semantics environment xsem, (ii) for all func-
tion names fn that have an entry in the function table Π.ft there is an entry
in the function table Π′.ft, and (iii) for all type names ty that have an entry
in the type environment Π.te there is an entry in the type environment Π′.te,
then the Hoare triple P s Q holds with respect to Π′:
Π, xsem, L tBS P s Q
∧ (∀fn : map-of(Π.ft)(fn) = bxc −→ map-of(Π′.ft)(fn) = bxc)
∧ (∀ty : map-of(Π.te)(ty) = bxc −→ map-of(Π′.te)(ty) = bxc)
−→ Π′, xsem, L tBS P s Q.
The lemma is proven by structural induction on the statement s.
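Assumptions (ii) and (iii) of the lemma are inclusion conditions on the function table and the type environment. The following sketch shows how they could be checked for concrete programs, assuming that a program is a dictionary with the entries "ft" and "te", each a list of pairs; following the formula, an entry of the small program must reappear with the same value in the big one. All names are illustrative.

```python
def first_match(pairs):
    """Partial mapping induced by a list of pairs; the first match wins."""
    table = {}
    for key, value in pairs:
        table.setdefault(key, value)
    return table


def table_included(small, big):
    """Every entry visible in `small` is visible with the same value in `big`."""
    small_map, big_map = first_match(small), first_match(big)
    return all(k in big_map and big_map[k] == v for k, v in small_map.items())


def program_extends(small_prog, big_prog):
    """Check the inclusion for both the function table and the type environment."""
    return all(table_included(small_prog[key], big_prog[key]) for key in ("ft", "te"))
```

A big program may contain additional functions and types; the check fails only if an entry of the small program is missing or bound to a different definition.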
The second lemma allows us to substitute pre- and postconditions P and Q
of a big-step semantics Hoare triple by different pre- and postconditions
P ′ and Q′ provided that P ′ ⊆ P and Q ⊆ Q′. The lemma is nothing but a
consequence rule for Hoare triples at the level of big-step semantics:
LEMMA 6.36 (Consequence rule for BS Hoare triples)
P ′ ⊆ P ∧ Q ⊆ Q′ ∧ Π, xsem, L tBS P s Q −→ Π, xsem, L tBS P ′ s Q′.
Below we reformulate the correctness lemmas of demand paging at the level of
BS (Lemmas 6.27 and 6.30) with respect to the extended program.
LEMMA 6.37 (Initialization code correctness at the level of BS w.r.t. the extended program)
abs-kernel-props(ΠAK) ∧ r ∈ L −→ Πext(ΠAK), xsemPFH, L tBS
{cBS | cBS = c0BS
∧ TEext(ΠAK) ⊢ cBS :: [ ], GText(ΠAK), LTinit
∧ PRE BSinit?(cBS) ∧ PRE Xinit?(cBS.x)}
SCall(r, pfh init, [ ])
{c′BS | POST BSinit?(c′BS, r) ∧ POST Xinit?(c0BS.x, c′BS.x)
∧ TEext(ΠAK) ⊢ c′BS :: init-HT, GText(ΠAK), LTinit
∧ modifies BSinit?(c0BS, c′BS, GText(ΠAK), L, r)}
PROOF. First, we apply Lemma 6.35 where we instantiate the “small” program with
ΠPFH and the “big” one with Πext(ΠAK). The first assumption of Lemma 6.35
is discharged by Lemma 6.27; the second and the third are proven by inspecting
the definitions of the small and big programs.
Second, we apply Lemma 6.36 in which P and Q are instantiated with the
pre- and postconditions of the Hoare triple from Lemma 6.27, whereas P ′
and Q′ are instantiated with those from the present lemma. This step leaves us two
subgoals to prove: P ′ ⊆ P and Q ⊆ Q′. With the described instantiations, the
difference between P and P ′, and likewise between Q and Q′, is that the primed versions
of the predicates are defined for the extended type environment TEext(ΠAK)
and global variables typing GText(ΠAK).
The subgoal P ′ ⊆ P boils down to a weakening of the conforming BS state
condition (cf. Definition 3.1). The main proof step is to show the implication
dom(GText(ΠAK)) ⊆ dom(cBS.gvars) −→ dom(GTPFH) ⊆ dom(cBS.gvars). It
follows from the fact that GTPFH is included in GText(ΠAK).
The subgoal Q ⊆ Q′ boils down to a strengthening of the conforming BS state
condition (cf. Definition 3.1). The main proof step is to show the implication
dom(GTPFH) ⊆ dom(cBS.gvars) −→ dom(GText(ΠAK)) ⊆ dom(cBS.gvars).
Here we need to show that the “difference” between GText(ΠAK) and GTPFH has
a counterpart in cBS.gvars. It follows from the condition ∀v ∉ dom(GTPFH) :
c′BS.gvars(v) = cBS.gvars(v) of Definition 6.26. □
LEMMA 6.38 (Page-fault handler correctness at the level of BS w.r.t. the extended program)
abs-kernel-props(ΠAK) ∧ r ∈ L −→ Πext(ΠAK), xsemPFH, L tBS
{cBS | cBS = c0BS
∧ TEext(ΠAK) ⊢ cBS :: LTta, GText(ΠAK), HT
∧ paramBSta ?(cBS, L, epid, eaddr, eintent, ecount, pid, addr, intent, count)
∧ PRE BSta ?(cBS, cPFH, pid, addr, intent, count, HT, locsactive, locsfree)
∧ PRE Xta ?(cBS.x)}
SCall(r, pfh touch addr, [epid, eaddr, eintent, ecount])
{c′BS | POST BSta ?(c′BS, cPFH, pid, addr, intent, count, HT, locsactive, locsfree)
∧ POST Xta ?(c0BS.x, c′BS.x, cPFH, pid, addr, intent, count)
∧ r = pfh-ta-resBS(cPFH, pid, addr, intent, count)
∧ TEext(ΠAK) ⊢ c′BS :: LTta, GText(ΠAK), HT
∧ modifies BSta ?(c0BS, c′BS, GText(ΠAK), HT, L, r, TEext(ΠAK), locsactive, locsfree)}
PROOF. The proof is similar to the proof of Lemma 6.37. We use Lemma 6.30 instead
of Lemma 6.27.
6.4 Abstraction Relation from SS
This section introduces the abstraction relation
absSS?(te, cSS, cPFH, gvarsactive, gvarsfree)
which states that the SS implementation state cSS encodes the abstract PFH
configuration cPFH with respect to the type environment te :: tenv-t and the lists of active and
free pointer variables gvarsactive, gvarsfree :: gvar-t∗ which specify the positions
of the respective lists in memory. The definition of the predicate follows its
counterparts for Simpl (Definition 5.17) and big-step semantics (Definition 6.17).
Below we highlight the relevant differences in the memory models of small-step
and big-step semantics.
Differences in the BS and SS memory models
We have described the memory model of C0 big-step semantics in Section 3.3
and the memory model of C0 small-step semantics in Section 2.3. The key
difference between these two models is the representation of values: big-step
semantics features compound values whereas small-step semantics uses flat values.
On the big-step level global, local, and heap memories are modeled as separate
partial mappings from variable names (locations) to compound values.
These values are modeled by inductive data type val-t with individual con-
structors for primitive, array, and structure values. The primitive values are
also modeled by an inductive data type whose constructors correspond to the
primitive types supported by C0.
On the small-step level global, local, and heap memories are modeled by
means of memory frames. A memory frame stores its symbol table which is a
list of pairs of variable names and types, the set of initialized variables, and
the content. The content is a mapping from natural numbers to memory cells.
A memory cell stores a single primitive value and it is modeled by inductive
data type mcell-t whose constructors correspond to the primitive C0 types. As
the reader might observe, there are no special data types to model array and
structure values. Thus, such compound values are flattened and stored within
consecutive memory cells.
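The flattening of compound values into consecutive memory cells can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: the tuple encoding of types, the dictionary encoding of structure values, the simplified page descriptor type PD, and the names flatten and as_content are hypothetical and are not taken from the Isabelle/HOL formalization.

```python
# Hypothetical encoding of C0 types (not the thesis's datatypes):
#   ("prim", name) | ("arr", n, elem_ty) | ("struct", [(field, ty), ...])
NAT = ("prim", "nat")
# Simplified page descriptor type, for illustration only.
PD = ("struct", [("pid", NAT), ("vpx", NAT), ("ppx", NAT)])


def flatten(value, ty):
    """Flatten a compound value into a list of primitive memory cells."""
    kind = ty[0]
    if kind == "prim":
        return [value]                       # one cell per primitive value
    if kind == "arr":
        cells = []
        for elem in value:                   # elements are laid out consecutively
            cells.extend(flatten(elem, ty[2]))
        return cells
    cells = []
    for name, field_ty in ty[1]:             # fields follow declaration order
        cells.extend(flatten(value[name], field_ty))
    return cells


def as_content(cells):
    """View a cell list as a content: a mapping from natural numbers to cells."""
    return dict(enumerate(cells))


two_pds = [{"pid": 1, "vpx": 7, "ppx": 3}, {"pid": 2, "vpx": 9, "ppx": 4}]
content = as_content(flatten(two_pds, ("arr", 2, PD)))
```

In this sketch the second page descriptor starts at cell index 3, directly after the three cells of the first one.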
The variables are modeled within the small-step semantics by inductive data
type gvar-t. It has constructors to model global, local, and heap variables as
well as array elements and structure fields. Having a variable g we can compute
its content in the C0 small-step memory configuration mem by means of the
function valueg(mem, g) [Lei07, Definition 4.22]. The obtained content ct is
a mapping from natural numbers to memory cells. This content computation
function will be used in the definitions of abstraction relations from SS for different
variables from the demand paging implementation. The common idea behind
these abstraction relations is to obtain the content of the respective SS variable
by means of the function valueg and to express the obtained result with the
help of inductive constructors of data type mcell-t and the corresponding HOL
values taken from the abstract PFH configuration.
6.4.1 Doubly-Linked Lists
Following the ideas behind the definitions of list abstractions in Hoare logic
(Definition 5.1) and big-step semantics (Definition 6.1) we introduce a list
formalization in small-step semantics by the predicate
listSS? :: CSS × tenv-t × gvar-t⊥ × S × gvar-t∗ ↦ B.
It takes the following parameters: (i) complete SS state cSS and (ii) type
environment te to evaluate expressions, (iii) pointer p to the first element of the list,
(iv) field name next which corresponds to the list’s “next”-field which yields
successor elements, and (v) list of variables l specifying the list structure.
The list predicate in the small-step semantics means that in the non-degenerate
case p is the first pointer in the list l, and evaluation yields the content
ct of p’s successor which, converted to a pointer, satisfies the SS list
predicate for the tail l′ of l:
DEFINITION 6.39 (List abstracted from SS)
listSS?(cSS, te, p, next, l) =
p = ⊥ if l = [ ]
p = a ∧ ∃ ct : evalSS(te, cSS.mem, p′) = ⌊ct⌋
∧ listSS?(cSS, te, mem2ptr(ct(0)), next, l′) if l = a ◦ l′,
where p′ = StructAcc(Deref(Lit(Prim(Addr(gvar2loc(a))))), next). Note that
contents ct are mappings from natural numbers to memory cells. The result
of evaluation in our case is stored at index zero: ct(0). In the definition
above, the function mem2ptr(m) converts a memory cell m to a pointer variable, and
gvar2loc(a) converts a variable a to a location.
An SS doubly-linked list is formalized by application of the SS list predicate
in both directions. Pointer q to the list’s last element and the “predecessor”-




dlistSS? :: CSS × tenv-t × gvar-t⊥ × gvar-t⊥ × S × S × gvar-t∗ ↦ B,
dlistSS?(cSS, te, p, q, next, prev, l) =
listSS?(cSS, te, p, next, l) ∧ listSS?(cSS, te, q, prev, rev(l)).
An instantiation for lists of page descriptors is defined as follows:
dlist pdSS ?(cSS, te, p, q, l) = dlistSS?(cSS, te, p, q, next, prev, l).
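The list and doubly-linked list predicates can be read as executable checks. The following Python sketch is an illustration only: the heap is modeled as a dictionary from hypothetical integer pointers to dictionaries of fields, None stands for the null pointer ⊥, and a missing node or field stands for a failed evaluation. None of these encodings come from the formalization.

```python
def list_ss(heap, p, field, l):
    """Mirror of the list predicate: following `field` from p visits exactly l."""
    for a in l:
        # Non-degenerate case: p must be the head a, and evaluating a->field
        # must succeed (here: the node and the field must exist).
        if p != a or a not in heap or field not in heap[a]:
            return False
        p = heap[a][field]                   # plays the role of mem2ptr(ct(0))
    return p is None                         # degenerate case: empty tail needs p = null


def dlist_ss(heap, p, q, next_field, prev_field, l):
    """Doubly-linked list: forward from p along next, backward from q along prev."""
    return (list_ss(heap, p, next_field, l)
            and list_ss(heap, q, prev_field, list(reversed(l))))


# Heap with three list elements; None models the null pointer.
heap = {
    1: {"next": 2, "prev": None},
    2: {"next": 3, "prev": 1},
    3: {"next": None, "prev": 2},
}
```

With this heap, the check succeeds for the head pointer 1, the tail pointer 3, and the structure list [1, 2, 3], and it fails as soon as the tail pointer or the structure list does not match the links.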
6.4.2 Page and Big-Page Management
As for the abstraction relations from Simpl and from BS, we first introduce an
abstraction relation for lists of page descriptors. The predicate below states
that a list of page descriptors starts at a variable of
name vn in the memory of a small-step configuration cSS with respect to a
type environment te. The list’s content is specified by a list of abstract page
descriptors pds and its structure in the memory is given by a list of pointer
variables l (cf. Definitions 5.3 and 6.3).
In the definition below the function evalpd(fn) is used to access the field
with name fn of the element of the doubly-linked list of page descriptors pointed to
by l[i]. This is achieved by evaluating the respective C0 expression for structure
field access StructAcc(Deref(Lit(Prim(Addr(gvar2loc(l[i]))))), fn) with the function
evalSS. If it succeeds, this evaluation function returns some content. We apply
evalpd for the page descriptor fields pid, vpx, and ppx. We claim that evaluation
succeeds and we obtain the corresponding contents ctpid, ctvpx, ctppx. These
contents contain at position zero the memory cells storing the respective values
from the PFH abstraction: pds[i].pid, pds[i].vpx, and pds[i].ppx.
DEFINITION 6.41 (List of page descriptors abstracted from SS)
pds-absSS?(cSS, te, vn, l, pds) =
|l| = |pds|
∧ (∃ q : dlist pdSS ?(cSS, te, mem2ptr(data(te, cSS.mem, gvargm(vn))), q, l))
∧ ∀ i < |l|, evalpd = λ fn : evalSS(te, cSS.mem, StructAcc(Deref(
Lit(Prim(Addr(gvar2loc(l[i]))))), fn)) :
∃ ctpid, ctvpx, ctppx :
evalpd(pid) = ⌊ctpid⌋ ∧ evalpd(vpx) = ⌊ctvpx⌋ ∧ evalpd(ppx) = ⌊ctppx⌋
∧ ctpid(0) = mcellnat(pds[i].pid)
∧ ctvpx(0) = mcellnat(pds[i].vpx)
∧ ctppx(0) = mcellnat(pds[i].ppx).
Substituting the list’s name with active and its specification lists with
cPFH.active and gvarsactive we obtain the abstraction relation for the active list




active-absSS?(te, cSS, cPFH, gvarsactive) =
pds-absSS?(cSS, te, active, gvarsactive, cPFH.active).
Setting the list’s name to free and providing cPFH.free and gvarsfree as its





free-absSS?(te, cSS, cPFH, gvarsfree) =
pds-absSS?(cSS, te, free, gvarsfree, cPFH.free).
Note that lists of active and free pointer variables gvarsactive, gvarsfree ::
gvar-t∗ which specify the positions of the respective lists in memory are specified
later in Definition 6.56.
The abstraction relation for the stack of free big pages states that data
read from the SS implementation array bpfree in the memory configura-
tion cSS.mem matches the corresponding specification array (cf. Definitions 5.6
and 6.6).
DEFINITION 6.44 (Stack of free big pages abstracted from SS)
bpfree-absSS?(te, cSS, cPFH) = ∀ i < |cPFH.bpfree| :
valueg(cSS.mem, gvararr(gvargm(bpfree), i)) = mcellnat(cPFH.bpfree[i]).
Recall that the function valueg(mem, g) [Lei07, Definition 4.22] yields the
content of the variable g in the C0 small-step memory configuration mem. In
this section, we are interested only in the memory cell which resides at index
zero of the obtained content. Because of that, we allow ourselves to abuse
notation and write valueg(mem, g) instead of valueg(mem, g)(0).
6.4.3 Page and Big-Page Tables
Notice a difference in numbering heap addresses in the small-step semantics
and on the level of Simpl/BS. In the former we count heap addresses starting
from zero whereas on the latter we start from one. That is why the abstraction
relation for the page table space first claims that the data stored at the variable
pt is a pointer to the heap memory cell at address zero. Reading data from
this variable gives us a two dimensional array with values matching the page




pt-absSS?(te, cSS, cPFH) =
valueg(cSS.mem, gvargm(pt)) = mcellptr(gvarhm(0))
∧ ∀ i < |cPFH.pt|, j < |cPFH.pt[i]| :
valueg(cSS.mem, gvararr(gvararr(gvarhm(0), i), j)) =
mcellnat(cPFH.pt[i][j]).
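As a hedged illustration of the page table abstraction and of the index shift between the two heap numberings, consider the following Python sketch. The encodings are assumptions made for this example only: globals are a dictionary from names to values, the SS heap is a Python list indexed from zero, and a pointer is represented by its heap index. The helper bs_loc_to_ss_index is hypothetical.

```python
def pt_abs(gm, hm, pt_spec):
    """gm: global name -> value; hm: list of heap objects, indexed from zero."""
    if gm.get("pt") != 0 or not hm:          # pt must point to heap index 0
        return False
    table = hm[0]
    for i, row in enumerate(pt_spec):
        for j, entry in enumerate(row):
            in_range = i < len(table) and j < len(table[i])
            if not in_range or table[i][j] != entry:
                return False
    return True


def bs_loc_to_ss_index(loc):
    """Simpl/BS heap locations start at one, SS heap indices at zero."""
    return loc - 1


hm = [[[5, 6], [7, 8]]]                      # heap object 0: a 2x2 page table space
```

The first heap object of Simpl/BS, location one, therefore corresponds to SS heap index zero, which is why the relation requires the pointer stored in pt to refer to index zero.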
The abstraction relation for the big-page table space checks whether the
data read from the variable bpt matches the abstract big-page table space (cf.




bpt-absSS?(te, cSS, cPFH) = ∀ i < |cPFH.bpt| :
valueg(cSS.mem, gvararr(gvargm(bpt), i)) = mcellnat(cPFH.bpt[i]).
6.4.4 Process Control Blocks
The abstraction relation for the page table origins claims that the data obtained
at the index PTO of the field ef of the variable pcb matches the page table origin





pto-absSS?(te, cSS, cPFH) = ∀ i : user-pid?(i) −→





The abstraction relation for the page table lengths ptl-absSS?(te, cSS, cPFH)
is defined similarly to the predicate above but PTL and cPFH.ptl are used (cf.
Definitions 5.10 and 6.10).
The abstraction relation for the big-page table origins states that the data
read from the field bpto of the variable pcb matches the big-page table origin
values from the abstract PFH configuration for all user processes (cf. Defini-




bpto-absSS?(te, cSS, cPFH) = ∀ i : user-pid?(i) −→





The abstraction relation for the big-page table lengths, denoted by
bptl-absSS?(te, cSS, cPFH), is defined similarly to the predicate above but bptl
and cPFH.bptl are used (cf. Definitions 5.12 and 6.12).
6.4.5 Miscellaneous
The abstraction relations for the number of free physical pages, free big pages,
and used virtual pages are straightforward and follow Definitions 5.13–5.15 on




pages-free-absSS?(te, cSS, cPFH) =




bpages-free-absSS?(te, cSS, cPFH) =




pages-used-absSS?(te, cSS, cPFH) =





The abstraction relation for the reverse lookup array states that for all
user physical page indices i the pointer which resides in the implementation
array ppx2pd at index i belongs to the set of active and free pointer variables.
By dereferencing this pointer at the field ppx we obtain the value of i (cf.




ppx2pd-absSS?(te, cSS, gvarsactive, gvarsfree) =
∀ i, ptrpd = λ j : mem2ptr(valueg(cSS.mem, gvararr(gvargm(ppx2pd), j))) :
user-ppx?(i) −→ ptrpd(i) ∈ gvarsactive ◦ gvarsfree
∧ valueg(cSS.mem, gvarstr(ptrpd(i), ppx)) = mcellnat(i).
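The reverse lookup condition can be sketched as a simple check in Python. This is an illustration under assumptions: the array ppx2pd is a Python list of hypothetical integer pointers, the heap maps pointers to dictionaries of fields, and the iterable user_ppxs stands in for the predicate user-ppx?(i).

```python
def ppx2pd_abs(ppx2pd, heap, gvars_active, gvars_free, user_ppxs):
    """ppx2pd: list of pointers; heap: pointer -> dict of fields."""
    known = set(gvars_active) | set(gvars_free)
    for i in user_ppxs:                      # stands in for user-ppx?(i)
        if i >= len(ppx2pd):
            return False
        ptr = ppx2pd[i]                      # ptr_pd(i) in the definition
        if ptr not in known:                 # must be an active or free descriptor
            return False
        if heap.get(ptr, {}).get("ppx") != i:
            return False                     # dereferencing at ppx must yield i
    return True


ppx2pd = [None, None, 10, 11]
heap = {10: {"ppx": 2}, 11: {"ppx": 3}}
```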
6.4.6 Altogether
Analogously to the abstraction relations from Simpl and from BS we combine
the individual components defined above into a single abstraction relation
from SS.
DEFINITION 6.55 (Abstraction relation from SS)
The overall abstraction relation from the SS implementation towards abstract
PFH configurations absSS?(te, cSS, cPFH, gvarsactive, gvarsfree) is a conjunction
of Definitions 6.42–6.54.
We combine the overall abstraction relation together with validity proper-
ties of the abstract page-fault handler.
DEFINITION 6.56
absSS√(te, cSS, cPFH, gvarsactive, gvarsfree) =
absSS?(te, cSS, cPFH, gvarsactive, gvarsfree)
∧ pfh√(cPFH)
∧ gvarsactive ∩ gvarsfree = ∅
∧ gvarsactive ∪ gvarsfree = {gvarhm(i) | 1 ≤ i ≤ USER PGS}.
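The last two conjuncts require the active and free pointer lists to partition the heap variables 1 to USER PGS. A minimal Python sketch of this condition follows; representing the heap variable with index i by the integer i, the function name valid_partition, and the value chosen for USER_PGS are assumptions made for illustration only.

```python
def valid_partition(gvars_active, gvars_free, user_pgs):
    """Heap variable number i is represented by the integer i."""
    active, free = set(gvars_active), set(gvars_free)
    return (active.isdisjoint(free)
            and active | free == set(range(1, user_pgs + 1)))


USER_PGS = 4                                 # illustrative value only
init_active = []                             # no active descriptors
init_free = list(range(USER_PGS, 0, -1))     # all descriptors free: [4, 3, 2, 1]
```

An empty active list together with a free list that enumerates all USER_PGS descriptors satisfies the condition, whereas overlapping lists or a missing descriptor violate it.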
6.5 Property Transfer from BS to SS
6.5.1 Mapping BS States to SS States
In Section 3.5 we considered a meta theorem for transfer correctness of function
calls from the big-step to the small-step semantics (Theorem 3.4). This theorem
has free variables for BS and SS configurations which have to match each
other. In order to instantiate these variables we need to define an abstraction
function which constructs a big-step state from a small-step state. Unlike
the state abstraction function from Simpl towards BS (cf. Definition 6.19) the
abstraction function from SS towards BS can be defined generically. It was
defined formally in Isabelle/HOL by Schirmer.
First we define how values are abstracted with respect to BS and SS se-
mantics, and then introduce the abstraction function for states.
Value abstraction. C0 small-step semantics values are abstracted to big-step
semantics values by means of the function SS2BSval.
DEFINITION 6.57 (Abstraction of values from SS towards BS)
SS2BSval :: (N ↦ mcell-t) × ty-t × N ↦ val-t⊥
It takes as arguments an SS content ct, a type ty, and an address a within the
content and returns the corresponding BS value. The function is defined by induction
on the type constructors. For primitive types ty listed in Table 6.1 the function
is defined as follows:
SS2BSval(ct, ty, a) =
⌊Prim(v)⌋ if ct(a) = m
⊥ otherwise
with values v and memory cells m specified in Table 6.1.
Table 6.1: Abstraction of primitive values from SS to BS







For the array type the function SS2BSval(ct, ArrT(n, ty), a) is additionally
recursive on the array size n. For n = 0 the function yields an empty array.
For n = m + 1 the function abstracts the values of the array’s elements.
SS2BSval(ct, ArrT(n, ty), a) =
⌊Arr([ ])⌋ if n = 0
⌊Arr(v ◦ vs)⌋ if SS2BSval(ct, ArrT(m, ty), a + sizet(ty)) = ⌊Arr(vs)⌋
∧ SS2BSval(ct, ty, a) = ⌊v⌋ ∧ n = m + 1
⊥ otherwise
Above, the function sizet(ty) [Lei07, Definition 4.1] computes the size of a type
(the number of elementary values in a flattened value of that type).
Finally, for the structure type the function SS2BSval(ct, Struct(fs), a) is defined
similarly to the array case. The definition is additionally recursive
on the list fs of structure fields. For fs = [ ] the obtained value is an empty
structure. For fs = x ◦ xs the function abstracts the values of the structure fields.
SS2BSval(ct, Struct(fs), a) =
⌊Struct([ ])⌋ if fs = [ ]
⌊Struct((fst(x), v) ◦ vs)⌋ if SS2BSval(ct, Struct(xs), a + sizet(snd(x))) = ⌊Struct(vs)⌋
∧ SS2BSval(ct, snd(x), a) = ⌊v⌋ ∧ fs = x ◦ xs
⊥ otherwise
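The three cases of the value abstraction can be sketched in Python as follows. This is an illustration, not the Isabelle/HOL definition: types are encoded as hypothetical tuples, memory cells as (kind, value) pairs, contents as dictionaries, and None models ⊥. The recursion on the array size and on the field list is written as loops, which visits the same addresses in the same order.

```python
# Hypothetical encoding of C0 types:
#   ("prim", name) | ("arr", n, elem_ty) | ("struct", [(field, ty), ...])
NAT = ("prim", "nat")


def sizet(ty):
    """Number of memory cells occupied by a flattened value of type ty."""
    kind = ty[0]
    if kind == "prim":
        return 1
    if kind == "arr":
        return ty[1] * sizet(ty[2])
    return sum(sizet(field_ty) for _, field_ty in ty[1])


def ss2bs_val(ct, ty, a):
    """Rebuild a compound value from content ct at address a; None models bottom."""
    kind = ty[0]
    if kind == "prim":
        cell = ct.get(a)
        if cell is None or cell[0] != ty[1]:  # missing cell or wrong cell kind
            return None
        return ("Prim", cell[1])
    if kind == "arr":
        elems = []
        for _ in range(ty[1]):
            v = ss2bs_val(ct, ty[2], a)
            if v is None:
                return None
            elems.append(v)
            a += sizet(ty[2])                 # next element starts after this one
        return ("Arr", elems)
    fields = []
    for name, field_ty in ty[1]:
        v = ss2bs_val(ct, field_ty, a)
        if v is None:
            return None
        fields.append((name, v))
        a += sizet(field_ty)                  # next field starts after this one
    return ("Struct", fields)


PAIR = ("struct", [("pid", NAT), ("vpx", NAT)])
ct = {0: ("nat", 1), 1: ("nat", 7), 2: ("nat", 2), 3: ("nat", 9)}
```

Abstracting an array of two PAIR structures at address 0 consumes the four cells in order, and any missing or ill-typed cell makes the whole result undefined.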
State abstraction. We denote the type of a variable vn in a given symbol table
st by typev(st, vn) ([Lei07, Definition 4.14]). We denote the base address of a
variable vn in a given symbol table st by bav(st, vn) ([Lei07, Definition 4.13]).
States of the small-step semantics are abstracted towards big-step states by
means of the function SS2BSstate.
DEFINITION 6.58 (Abstraction of states from SS towards BS)
SS2BSstate :: CSS ↦ CBS,
SS2BSstate(cSS) = cBS.
Below we describe how the abstracted cBS state is obtained. Let us abbreviate
different SS memory components: gm = cSS.mem.gm, lm = cSS.mem.lm, and
hm = cSS.mem.hm.
The BS heap memory component is obtained by abstraction of all heap




idx2addr(hm.st, l−1)) if l ≤ |hm.st|
⊥ otherwise
Above, idx2addr(st, i) computes the base address of the i-th value according
to the sizes of the preceding types in the symbol table st.
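The base address computation can be sketched in Python as a sum over the sizes of the preceding symbol table entries. The tuple encoding of types, the placeholder types PT and PD, and the helper heap_base_address are assumptions for this example. The lower bound on the location is an added guard that reflects that BS heap locations are counted from one.

```python
NAT = ("prim", "nat")


def sizet(ty):
    """Number of memory cells of a flattened value of type ty (hypothetical encoding)."""
    kind = ty[0]
    if kind == "prim":
        return 1
    if kind == "arr":
        return ty[1] * sizet(ty[2])
    return sum(sizet(field_ty) for _, field_ty in ty[1])


def idx2addr(st, i):
    """Base address of the i-th entry (counted from zero) of symbol table st."""
    return sum(sizet(ty) for _, ty in st[:i])


def heap_base_address(hm_st, l):
    """Base address for BS heap location l (counted from one); None models bottom."""
    if not 1 <= l <= len(hm_st):              # lower bound is an added guard
        return None
    return idx2addr(hm_st, l - 1)             # index shift between BS and SS


PT = ("arr", 4, NAT)                                       # placeholder type
PD = ("struct", [("pid", NAT), ("vpx", NAT), ("ppx", NAT)])  # placeholder type
st = [(None, PT), (None, PD), (None, PD)]
```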
In order to define the local variables of the big-step semantics configuration
cBS.lvars(vn) we first check whether the local memory is empty, i.e., lm = [ ].
If so, we assign cBS.lvars(vn) = ⊥. Otherwise, lm = l ◦ ls and we compute the
type and base address of the variable vn in the local memory frame l. If this
fails, i.e., either typev(fst(l).st, vn) = ⊥ or bav(fst(l).st, vn) = ⊥,
we assign cBS.lvars(vn) = ⊥ as well. Otherwise, typev(fst(l).st, vn) = ⌊ty⌋
and bav(fst(l).st, vn) = ⌊ba⌋, and we set
cBS.lvars(vn) =
SS2BSval(fst(l).ct, ty, ba) if vn ∈ fst(l).init
⊥ otherwise
Finally, for the global variables component cBS.gvars(vn) we compute the
type typev(gm.st, vn) and the base address bav(gm.st, vn) of the variable vn in
the global memory frame. If at least one of those is ⊥ we assign cBS.gvars(vn) = ⊥.
Otherwise, we have the type and base address values ⌊ty⌋ and ⌊ba⌋, respectively,
and assign
cBS.gvars(vn) = SS2BSval(gm.ct, ty, ba).
Having the abstraction relation between SS and BS states we can transfer
correctness results of the demand paging implementation further to the level
of small-step semantics. Similarly to a transfer from Simpl to BS we first show
lemmas about transfer of a valid abstraction relation from BS to SS and vice
versa. These lemmas will be essential arguments in the proofs of the demand
paging implementation down to the SS level.
6.5.2 Transfer of Abstraction Relation
Transfer of the valid abstraction relation from BS to SS. In order to formulate
a lemma that a valid abstraction relation can be transferred from the big-step
to the small-step level we need to formalize the structure of the heap memory
in the small-step semantics. Recall that the demand paging implementation
allocates the page table space and USER PGS page descriptors on the heap.
Moreover, no other part of the CVM implementation stores data on the heap.
Below we define a constant htyCVM which is a list of types of elements allo-
cated by the demand paging code in the heap memory. By hstCVM we denote a
corresponding heap symbol table. Recall that a symbol table is a list of pairs
consisting of variable names and their types. Since variables on the heap are
nameless we use the polymorphic arbitrary constant A as the first component
(variable name) of each element of hstCVM. Additionally, we introduce a predicate
which tests whether the list of second components of the symbol table of a
given heap memory frame hm starts with htyCVM.
DEFINITION 6.59 (SS heap structure of the demand paging)
htyCVM = typt ◦ rep(typd, USER PGS)
hstCVM = map(λx : (A, x), htyCVM)
prefix-hstCVM?(hm) = ∀ i ≤ USER PGS : snd(hm.st[i]) = htyCVM[i]
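The heap structure and the prefix test can be sketched in Python as follows. The sketch makes several assumptions for illustration: types are plain strings, None stands in for the arbitrary constant A used as the name of nameless heap variables, and the prefix test additionally checks that the heap symbol table is long enough.

```python
def hty_cvm(ty_pt, ty_pd, user_pgs):
    """One page table space followed by user_pgs page descriptors."""
    return [ty_pt] + [ty_pd] * user_pgs


def hst_cvm(ty_pt, ty_pd, user_pgs):
    """Heap symbol table: heap variables are nameless, so the name slot is None."""
    return [(None, ty) for ty in hty_cvm(ty_pt, ty_pd, user_pgs)]


def prefix_hst_cvm(hm_st, ty_pt, ty_pd, user_pgs):
    """Do the types in heap symbol table hm_st start with hty_cvm?"""
    expected = hty_cvm(ty_pt, ty_pd, user_pgs)
    actual = [ty for _, ty in hm_st]
    return actual[:len(expected)] == expected


st = hst_cvm("pt_t", "pd_t", 3)               # placeholder type names
```

A heap that contains further objects after the page table space and the page descriptors still passes the test, while a heap that is too short does not.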
The lemma below states that a valid abstraction relation from BS towards
the abstract PFH state implies the valid abstraction relation from SS towards
the abstract PFH state. The big-step configuration is obtained by abstracting
a given small-step configuration cSS with the state abstraction function (Def-
inition 6.58). Additional assumptions to the lemma are: (i) the small-step
configuration is valid, (ii) the global symbol table of cSS coincides with the
linked symbol table of the concrete kernel, and (iii) the SS heap structure of




transfer from BS to SS
absBS√(SS2BSstate(cSS), cPFH, HT, locsactive, locsfree)
∧ cSS ∈ validSS(teCK(ΠAK), ft)
∧ gst(cSS.mem) = gstCK(ΠAK)
∧ prefix-hstCVM?(cSS.mem.hm)
−→ absSS√(teCK(ΠAK), cSS, cPFH,map(loc2gvar, locsactive),
map(loc2gvar, locsfree))
PROOF. The lemma’s proof idea is quite similar to that of Lemma 6.20. The valid
abstraction relation in the assumptions maps the BS state to the abstract PFH
state. This BS state is obtained by abstracting the corresponding SS state.
Therefore, we can also map the SS state to the abstract PFH configuration.
Transfer of the valid abstraction relation from SS to BS. The lemma for the
implication in the direction from SS to BS is quite similar to the previous one
but has two additional assumptions.
The first assumption is the abstract kernel properties abs-kernel-props(ΠAK)
(cf. Section 6.4 of Tsyban’s thesis [Tsy09]).
The second assumption relates an SS type environment, an SS heap memory
frame, and a BS heap typing. Heap locations in the small-step semantics
are directly connected to the types, whereas in the big-step semantics there is
an indirection via the type name stored in the heap typing. Since the mapping
from type names to types specified by an SS type environment does not nec-
essarily have to be injective we require that there must be at least one name
that maps to the SS type according to the small-step heap symbol table. The
described idea is formalized in the definition below. We consider all BS heap
locations l. If the index number of such a location is bounded by the size of
the SS heap symbol table |hst(mem)| we require existence of a type name tn
such that (i) the heap typing maps to this type name at the location l, and
(ii) this type name corresponds to the type snd(hst(mem)[l−1]) in the SS type
environment te. Note the index shift by one in referring to BS and SS heap
locations: we start counting the former from one whereas the latter are counted
starting from zero.
DEFINITION 6.61
abs-heap-ty?(te, mem, HT) =
∀l :
{




transfer from SS to BS
absSS√(teCK(ΠAK), cSS, cPFH, gvarsactive, gvarsfree)
∧ cSS ∈ validSS(teCK(ΠAK), ft)




−→ absBS√(SS2BSstate(cSS), cPFH,HT,map(gvar2loc, gvarsactive),
map(gvar2loc, gvarsfree))
6.5.3 Specification at the Level of SS
Pre- and postconditions to the functions of the demand paging implementation
formulated at the level of the small-step semantics express the same meaning
as those formulated at the level of Simpl. Since the only difference between
them is data representation — and we have already illustrated this difference
while defining the abstraction relation from SS (Definition 6.55) — we do not
present formal definitions of pre- and postconditions but only declare predicates
for them.
Initialization code. The precondition of the initialization code at the level of
SS is stated by the predicate
PRE SSinit?(cSS, te).
Its arguments are an SS state cSS and a type environment te. The postcondition




It additionally takes a result variable r and lists of active and free pointer
variables. Formal definitions of the predicates follow Section 5.3.2. The essential
feature of the specification is that the postcondition establishes the valid




(te, cSS, init-cPFH, [ ], [gvarhm(USER PGS), . . . , gvarhm(1)]).
The set of names of global variables changed by the functions is shared with
the big-step specification (Definition 6.23).
Page-fault handler. The precondition of the page-fault handler at the level of
SS is formulated as the predicate
PRE SSta ?(cSS, cPFH, pid, addr, intent, count, te, locsactive, locsfree)
which additionally to the discussed parameters takes values of the handler’s call
arguments. The postcondition of the page-fault handler in small-step semantics
is given by the predicate
POST SSta ?(c′SS, cPFH, pid, addr, intent, count, te, locsactive, locsfree).
The result of the page-fault handler at the level of SS is computed by the
function
pfh-ta-resSS(cPFH, pid, addr, intent, count).
Formal definitions of the predicates follow Section 5.6.2. We stress the fact
that the most important conjunct of both pre- and postconditions is the valid
abstraction relation from SS towards abstract PFH state (Definition 6.56):
absSS√(te, cSS, cPFH, gvarsactive, gvarsfree).
The set of names of global and heap variables changed by the functions is
shared with the big-step specification (Definition 6.24).
6.5.4 Correctness at the Level of SS
Transfer of the initialization code correctness from BS to SS. The lemma stated
below describes the correctness of the demand paging initialization code at the
level of small-step semantics. The lemma assumes the abstract kernel proper-
ties to hold and the result variable r to be in the context of caller’s variables L.
The lemma concludes an SS Hoare triple with respect to the type environment
of the concrete kernel teCK(ΠAK), the extended function table ftext(ΠAK), the
extended semantics environment xsem, and the extended program Πext(ΠAK).
We define the predicate modifies SSinit? which compares two small-step states
cSS and c′SS and describes which variables were modified by the initialization
code. The predicate states that (i) all global variables of the concrete ker-
nel except those that are in the set modifiedinit remain the same, and (ii) all
initialized top-local variables except for r stay the same.
DEFINITION 6.63





(∀v : v ∈ map(fst, gstCK(ΠAK)) ∧ v ∉ modifiedinit
−→ valueg(c′SS.mem, gvargm(v)) = valueg(cSS.mem, gvargm(v)))
∧ (∀v : v ∈ map(fst, lsttop(cSS.mem)) ∧ v ∈ lmtop(cSS.mem).init ∧ v ≠ r
−→ valueg(c′SS.mem, gvarlm(n, v)) = valueg(cSS.mem, gvarlm(n, v)))
Above, n = |cSS.mem.lm| − 1.
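The frame condition expressed by the predicate can be sketched in Python. The state encoding is an assumption made for this example: a state is a dictionary with the globals under "gm", the values of the top local frame under "top", and the set of initialized top-local names under "init". The function name modifies_init is hypothetical.

```python
def modifies_init(c, c_new, r, modified_init):
    """Check that only the allowed globals and the result variable r may change."""
    # (i) globals outside modified_init keep their values
    globals_kept = all(
        c_new["gm"].get(v) == value
        for v, value in c["gm"].items()
        if v not in modified_init
    )
    # (ii) initialized top-local variables other than r keep their values
    locals_kept = all(
        c_new["top"].get(v) == value
        for v, value in c["top"].items()
        if v in c["init"] and v != r
    )
    return globals_kept and locals_kept


before = {"gm": {"pt": 0, "x": 1}, "top": {"r": 0, "y": 5}, "init": {"r", "y"}}
after = {"gm": {"pt": 9, "x": 1}, "top": {"r": 1, "y": 5}, "init": {"r", "y"}}
```

Here the change of pt is accepted only if pt belongs to the set of variables the code is allowed to modify, and the change of r is always accepted.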
The precondition of the Hoare triple is a set of pairs of small-step configurations
cSS and extended states cX such that (i) the global symbol table of cSS
coincides with the symbol table of the concrete kernel, (ii) the heap symbol
table of cSS is empty, and (iii) the preconditions PRE SSinit? and PRE Xinit? hold.
The Hoare triple’s postcondition is stated as a set of pairs of small-step
configurations c′SS and extended states c′X for which (i) the postconditions
POST SSinit? and POST Xinit? hold, (ii) the heap symbol table respects the SS heap
structure of the demand paging code, and (iii) modification of the variables is
described by the predicate modifies SSinit?.
LEMMA 6.64 (Initialization code correctness at the level of SS)
abs-kernel-props(ΠAK) ∧ r ∈ L −→
Πext(ΠAK), xsemPFH, L tSS
{(cSS, cX) | cSS = c0SS ∧ cX = c0X
∧ gst(cSS.mem) = gstCK(ΠAK) ∧ hst(cSS.mem) = [ ]
∧ PRE SSinit?(cSS, teCK(ΠAK)) ∧ PRE Xinit?(cX)}
SCall(r, pfh init, [ ])
{(c′SS, c′X) | POST SSinit?(c′SS, r, teCK(ΠAK)) ∧ POST Xinit?(c0X, c′X)
∧ hst(c′SS.mem) = hstCVM
∧ modifies SSinit?(c0SS, c′SS, r)}
PROOF. We apply the transfer theorem from BS to SS (Theorem 3.4) in which the
big-step Hoare triple is instantiated with the one concluded in the correctness
lemma of the initialization code at the BS level with respect to the extended
program (Lemma 6.37). The major proof effort consists of the implications from
SS to BS preconditions and from BS to SS postconditions. To conclude them,
Lemmas 6.60 and 6.62 are used.
Below, we give this lemma an operational-semantics-style look. This reveals
a number of terms hidden inside the SS Hoare triple notation. The (extended)
validity of C0 states is a lemma’s invariant (1,4). The lemma assumes the result
variable r to be in the top-local memory frame. On the side of conclusions we
additionally have a successful extended C0 computation (2,3) and the transition
invariant (5) introduced in Section 3.5 (Definition 3.3).
LEMMA 6.65 (Initialization code correctness at the level of SS: operational-semantics style)
abs-kernel-props(ΠAK)
(1) ∧ cSS ∈ validSS(teCK(ΠAK), ftext(ΠAK))
∧ cSS.prog = SCall(r, pfh init, [ ])
∧ gst(cSS.mem) = gstCK(ΠAK) ∧ hst(cSS.mem) = [ ]
∧ PRE SSinit?(cSS, teCK(ΠAK)) ∧ PRE Xinit?(cX)
∧ r ∈ map(fst, lmtop(cSS.mem))
(2) −→ ∃n, c′SS, c′X : δnSSX(teCK(ΠAK), ftext(ΠAK), xsem, cSS, cX) = ⌊(c′SS, c′X)⌋
(3) ∧ c′SS.prog = Skip
(4) ∧ c′SS ∈ validSS(teCK(ΠAK), ftext(ΠAK))
(5) ∧ trans-inv?(cSS.prog, hstCVM, cSS, c′SS)
∧ POST SSinit?(c′SS, r, teCK(ΠAK))
∧ POST Xinit?(cX, c′X)
∧ modifies SSinit?(cSS, c′SS, r)
PROOF. The lemma is proven using Theorem 3.5 which allows us to switch from the
Hoare-triple-style formulation to the view of operational semantics. The SS
Hoare triple is instantiated with the one concluded in Lemma 6.64. 
Transfer of the page-fault handler correctness from BS to SS. First we define
a predicate similar to Definition 6.28 which helps us to evaluate parameters of
the call to the page-fault handler.
DEFINITION 6.66 (Evaluation of parameters to the page-fault handler at the level of SS)
paramSSta ?(cSS, te, epid, eaddr, eintent, ecount, pid, addr, intent, count) =
∃ ctpid, ctaddr, ctintent, ctcount :
evalSS(te, cSS.mem, epid) = ⌊ctpid⌋ ∧ ctpid(0) = mcellnat(pid)
∧ evalSS(te, cSS.mem, eaddr) = ⌊ctaddr⌋ ∧ ctaddr(0) = mcellnat(addr)
∧ evalSS(te, cSS.mem, eintent) = ⌊ctintent⌋ ∧ ctintent(0) = mcellnat(intent)
∧ evalSS(te, cSS.mem, ecount) = ⌊ctcount⌋ ∧ ctcount(0) = mcellnat(count)
Modification of variables by the page-fault handler is described by the predicate
modifies SSta ? which states that (i) the size of the heap memory frame remains
unchanged, (ii) all global variables of the concrete kernel except those
that are allowed to be modified by the page-fault handler remain the same,
(iii) all initialized top-local variables except for r stay the same, and (iv) all
heap variables which lie after the page table space and the USER PGS page
descriptors remain unchanged.
DEFINITION 6.67 (Modification of SS variables by page-fault handler)




∧ (∀v : v ∈ map(fst, gstCK(ΠAK)) ∧ v ∉ modifiedta
−→ valueg(c′SS.mem, gvargm(v)) = valueg(cSS.mem, gvargm(v)))
∧ (∀v : v ∈ map(fst, lsttop(cSS.mem)) ∧ v ∈ lmtop(cSS.mem).init ∧ v ≠ r
−→ valueg(c′SS.mem, gvarlm(n, v)) = valueg(cSS.mem, gvarlm(n, v)))
∧ (∀ i : USER PGS < i < |hst(cSS.mem)|
−→ valueg(c′SS.mem, gvarhm(i)) = valueg(cSS.mem, gvarhm(i)))
Above, n = |cSS.mem.lm| − 1.
The SS Hoare triple’s precondition in the lemma below is a set of pairs
of SS and extended states (cSS, cX) for which (i) the global symbol table of
cSS coincides with the symbol table of the concrete kernel, (ii) the SS heap
structure of the demand paging is respected by cSS, (iii) the parameters of
the call are appropriately evaluated, (iv) the SS precondition of the page-fault
handler PRE SSta ? holds as well as the precondition over the extended state
PRE Xta ?.
The Hoare triple’s postcondition is stated as a set of pairs of small-step
configurations c′SS and extended states c′X for which (i) the SS postcondition
POST SSta ? of the page-fault handler is satisfied as well as the extended state
postcondition POST Xta ?, (ii) the result of the page-fault handler is appropriately
computed, and (iii) modifications of the SS memory are described by




at the level of SS
abs-kernel-props(ΠAK) ∧ r ∈ L −→
Πext(ΠAK), xsemPFH, L tSS
{(cSS, cX) | cSS = c0SS ∧ cX = c0X
∧ gst(cSS.mem) = gstCK(ΠAK) ∧ prefix-hstCVM?(cSS.mem.hm)
∧ paramSSta ?(cSS, teCK(ΠAK), epid, eaddr, eintent, ecount,
pid, addr, intent, count)
∧ PRE SSta ?(cSS, cPFH, pid, addr, intent, count,
teCK(ΠAK), locsactive, locsfree) ∧ PRE Xta ?(cX)}
SCall(r, pfh touch addr, [epid, eaddr, eintent, ecount])
{(c′SS, c′X) | POST SSta ?(c′SS, cPFH, pid, addr, intent, count,
teCK(ΠAK), locsactive, locsfree) ∧ POST Xta ?(c0X, c′X)
∧ r = pfh-ta-resSS(cPFH, pid, addr, intent, count)
∧modifies SSta ?(cSS, c′SS, r)}
PROOF. We apply the transfer theorem from BS to SS (Theorem 3.4) in which the
big-step Hoare triple is instantiated with the one concluded in the correctness
lemma of the page-fault handler at the BS level with respect to the extended program
(Lemma 6.38). The major proof effort, namely the implications from SS to BS
preconditions and from BS to SS postconditions, is handled using Lemmas 6.60
and 6.62.
Ultimately, we reformulate the above lemma in the style of operational
semantics. The additional terms revealed by that are the same as in Lemma 6.65.
LEMMA 6.69 (Page-fault handler correctness at the level of SS: operational-semantics style)
abs-kernel-props(ΠAK) ∧ cSS ∈ validSS(teCK(ΠAK), ftext(ΠAK))
∧ cSS.prog = SCall(r, pfh touch addr, [e-pid, e-addr, e-intent, e-count])
∧ gst(cSS.mem) = gstCK(ΠAK) ∧ prefix-hstCVM?(cSS.mem.hm)
∧ paramSSta ?(cSS, teCK(ΠAK), e-pid, e-addr, e-intent, e-count,
pid, addr, intent, count)
∧ PRE SSta ?(cSS, cX, cPFH, pid, addr, intent, count,
teCK(ΠAK), locsactive, locsfree) ∧ PRE Xta ?(cX)
∧ r ∈ map(fst, lmtop(cSS.mem))
−→ ∃n, c′SS, c′X : δnSSX(teCK(ΠAK), ftext(ΠAK), xsem, cSS, cX) = ⌊(c′SS, c′X)⌋
∧ c′SS.prog = Skip ∧ c′SS ∈ validSS(teCK(ΠAK), ftext(ΠAK))
∧ trans-inv?(cSS.prog, [ ], cSS, c′SS)
∧ POST SSta ?(c′SS, cPFH, pid, addr, intent, count,
teCK(ΠAK), locsactive, locsfree) ∧ POST Xta ?(cX, c′X)
∧ r = pfh-ta-resSS(cPFH, pid, addr, intent, count)
∧modifies SSta ?(cSS, c′SS, r)
PROOF. We apply Theorem 3.5 where the SS Hoare triple is instantiated by the



















Using Results in the
CVM Proof
In this chapter we prove the top-level correctness theorems of demand paging:
correctness of the initialization code and of the page-fault handler. Up to this
point we have shown correctness of the demand paging functions at the level of
extended C0 small-step semantics. We still have unjustified XCalls to the
hard-disk drivers. We plug in results from Alkassar’s doctoral thesis [Alk09] in
order to close this gap. First, we prove a theorem justifying correctness of the
extended calls towards their ISA implementations and show how we apply this
theorem in the context of the top-level correctness theorems of demand paging.
Next, we show the top-level correctness theorems, paying attention to the proofs
of the CVM relation for user processes as the latter essentially defines
correctness of memory virtualization. Finally, we elaborate how the results




7.1 Hard-Disk Drivers and Zero Fill Page
At the end of the previous chapter we obtained computations of demand paging
functions at the level of C0 small-step semantics. Note that these computations
(see Lemmas 6.65 and 6.69) are defined by means of the C0 transition func-
tion δnSSX which is capable of treating extended calls. As we described in Sec-
tion 6.3 we replaced all calls to the hard-disk driver functions read from disk
and write to disk as well as the function zero fill page by corresponding
XCalls. We also mentioned that the effects of these functions are expressed in
an extended semantics environment.
However, the correctness theorem of CVM (Theorem 2.33) is formulated
on the levels of VAMP ISA with devices and C0 small-step semantics without
XCalls support. The ultimate goal of the demand paging correctness theorems
is to be used in the CVM correctness proof. To achieve this we need to get rid of
the mentioned XCalls by justifying their correctness: the extended semantics
has to be respected by their implementations which contain inline assembly
code.
The means to deal with the described problem are provided by Alkassar.
In his thesis [Alk09] he proves a compiler correctness theorem with support
of XCalls for the mentioned functions [Alk09, Theorem 11] for the case of
write to disk. The remaining cases could be handled completely analogously.
The target level of that theorem is VAMP assembly. However, we aim at
formulating correctness of demand paging in VAMP ISA semantics. To that
end, below we transfer Alkassar’s theorem down to the VAMP ISA level.
The definitions in this section are a product of collaboration between Alkassar
and the author.
7.1.1 Extended Semantics Environment
Hard-disk drivers. The functions write_to_disk and read_from_disk are
implemented as C0 functions, but their bodies consist only of a single assembly
portion. In Section 4.4 we specified the behavior of these functions on the
abstract level (Definitions 4.28 and 4.26). Now we state the preconditions
under which this behavior occurs and specify the results of the functions in
small-step semantics.
Both functions take the start (page) address in the physical memory ma and
the start (sector) address in the swap memory sa as parameters. They copy
the specified page between the memories. The precondition to the functions
restricts their parameters: (i) the memory address is page aligned; (ii) the page
to be copied does not overlap with the devices memory region and the system
memory part, and (iii) lies within the virtual address space; (iv) the disk sector
address is page aligned, i.e., divisible by eight, and (v) lies outside the boot




PREdriver?(ma, sa) = ma mod PG SZ = 0
∧ ma + PG SZ ≤ DEVICES BORDER
∧ ZFP·PG SZ ≤ ma
∧ ma < ZFP·PG SZ+(USER PGS+1)·PG SZ
∧ sa mod 8 = 0
∧ BOOT PGS·8 ≤ sa
∧ sa < BOOT PGS·8+TOT BIG PGS·PGS PER BIG PG·8
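The precondition above can be read as a plain executable predicate. The following Python sketch restates it conjunct by conjunct. The numeric values of the constants are hypothetical placeholders chosen for illustration only; they are assumptions and not the values of the actual implementation.

```python
# Hypothetical layout constants (assumed values, for illustration only).
PG_SZ = 4096                      # page size in bytes
ZFP = 512                         # index of the zero-filled page
USER_PGS = 1024                   # number of user pages
DEVICES_BORDER = 2**32 - 2**20    # first address of the device region
BOOT_PGS = 64                     # pages in the boot region of the disk
TOT_BIG_PGS = 16                  # number of big pages in the swap
PGS_PER_BIG_PG = 1024             # pages per big page


def pre_driver(ma: int, sa: int) -> bool:
    """Executable reading of PREdriver?(ma, sa)."""
    return (
        ma % PG_SZ == 0                                   # (i) page aligned
        and ma + PG_SZ <= DEVICES_BORDER                  # (ii) below devices
        and ZFP * PG_SZ <= ma                             # (ii) above system part
        and ma < ZFP * PG_SZ + (USER_PGS + 1) * PG_SZ     # (iii) upper bound
        and sa % 8 == 0                                   # (iv) sector aligned
        and BOOT_PGS * 8 <= sa                            # (v) outside boot region
        and sa < BOOT_PGS * 8 + TOT_BIG_PGS * PGS_PER_BIG_PG * 8
    )
```

Under these assumed constants, the lowest admissible pair is `(ZFP * PG_SZ, BOOT_PGS * 8)`, and any address that is not a multiple of `PG_SZ` is rejected.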
The extended semantics of the functions takes as arguments a mapping
from parameter names to their contents args :: S ↦ (N ↦ mcell-t)⊥ and an
extended state cX. If the parameters fulfill the preconditions of the hard-disk
drivers, the returned result comprises the value T and an updated extended
state obtained via the functions write-to-disk and read-from-disk, respectively (Defini-





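\[
\mathit{xsem}_{\mathit{write}}(\mathit{args}, c_X) =
\begin{cases}
\lfloor ([\mathit{mcell}_{\mathit{bool}}(\mathrm{T})],\ \textit{write-to-disk}(ma, sa, c_X)) \rfloor & \text{if } \mathit{PRE}_{\mathit{driver}}?(ma, sa) \\
\bot & \text{otherwise}
\end{cases}
\]
\[
\text{where}\quad ma = \mathit{m2u}(\lceil \mathit{args}(\mathtt{mem\_addr}) \rceil(0)), \qquad
sa = \mathit{m2u}(\lceil \mathit{args}(\mathtt{disk\_addr}) \rceil(0)).
\]

\[
\mathit{xsem}_{\mathit{read}}(\mathit{args}, c_X) =
\begin{cases}
\lfloor ([\mathit{mcell}_{\mathit{bool}}(\mathrm{T})],\ \textit{read-from-disk}(ma, sa, c_X)) \rfloor & \text{if } \mathit{PRE}_{\mathit{driver}}?(ma, sa) \\
\bot & \text{otherwise}
\end{cases}
\]
\[
\text{where}\quad ma = \mathit{m2u}(\lceil \mathit{args}(\mathtt{mem\_addr}) \rceil(0)), \qquad
sa = \mathit{m2u}(\lceil \mathit{args}(\mathtt{disk\_addr}) \rceil(0)).
\]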
Above, m2u(m) converts a memory cell to an unsigned integer.
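The extended semantics of the write driver can be sketched in Python as follows. This is an illustrative model only: memory cells are modeled as plain integers (so m2u is the identity), the undefined result ⊥ is modeled as `None`, and the extended state is a dictionary with the two components `mem` and `swap`. The body of `write_to_disk` is an assumed reading of the abstract function (copy one page from the physical memory component to the swap component); all constant values are hypothetical.

```python
# Illustrative model only; all constant values are hypothetical.
PG_SZ = 4096                 # page size in bytes
ZFP = 2                      # index of the zero-filled page
USER_PGS = 3                 # number of user pages
DEVICES_BORDER = 1 << 20     # first address of the device region
BOOT_PGS = 1                 # pages in the boot region of the disk
SWAP_PGS = 4                 # stands for TOT_BIG_PGS * PGS_PER_BIG_PG


def pre_driver(ma, sa):
    """Precondition of the hard-disk drivers on memory and sector address."""
    return (ma % PG_SZ == 0
            and ma + PG_SZ <= DEVICES_BORDER
            and ZFP * PG_SZ <= ma < ZFP * PG_SZ + (USER_PGS + 1) * PG_SZ
            and sa % 8 == 0
            and BOOT_PGS * 8 <= sa < BOOT_PGS * 8 + SWAP_PGS * 8)


def write_to_disk(ma, sa, cx):
    """Assumed effect: copy one page from cx['mem'] to cx['swap']."""
    k_mem = ma // PG_SZ - ZFP        # page index inside the user memory
    k_swap = sa // 8 - BOOT_PGS      # page index inside the swap memory
    swap = list(cx["swap"])          # copy, so the old state stays intact
    swap[k_swap] = list(cx["mem"][k_mem])
    return {"mem": cx["mem"], "swap": swap}


def xsem_write(args, cx):
    """Return ([True], new_state) if the precondition holds, else None."""
    ma = args["mem_addr"][0]         # cell 0 of parameter mem_addr
    sa = args["disk_addr"][0]        # cell 0 of parameter disk_addr
    if not pre_driver(ma, sa):
        return None
    return ([True], write_to_disk(ma, sa, cx))
```

The semantics of the read driver would differ only in the direction of the copy. Returning a fresh state instead of mutating `cx` mirrors the functional style of the definitions.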
Zero fill page. The body of the function zero fill page also consists only of
assembly instructions. The function’s parameter is an index ppx of the page
to be filled with zeros. The preconditions to the function ensure that only the




PREzfp?(ppx) = ZFP ≤ ppx < ZFP + USER PGS + 1
The extended semantics of the function updates the extended state via the
function zero-fill-page (Definition 4.30) and returns zero as a result in case the






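\[
\mathit{xsem}_{\mathit{zfp}}(\mathit{args}, c_X) =
\begin{cases}
\lfloor ([\mathit{mcell}_{\mathit{int}}(0)],\ \textit{zero-fill-page}(ppx, c_X)) \rfloor & \text{if } \mathit{PRE}_{\mathit{zfp}}?(ppx) \\
\bot & \text{otherwise}
\end{cases}
\]
\[
\text{where}\quad ppx = \mathit{m2u}(\lceil \mathit{args}(\mathtt{ppx}) \rceil(0)).
\]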
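A minimal Python sketch of this extended semantics is given below. As before, ⊥ is modeled as `None` and memory cells as plain integers. The mapping of the page index `ppx` to position `ppx - ZFP` of the physical memory component, as well as all constant values, are assumptions made for illustration.

```python
# Illustrative model only; all constant values are hypothetical.
PG_SZ_WD = 4      # words per page (kept tiny for the example)
ZFP = 2           # index of the zero-filled page
USER_PGS = 3      # number of user pages


def pre_zfp(ppx):
    """Executable reading of PREzfp?(ppx)."""
    return ZFP <= ppx < ZFP + USER_PGS + 1


def zero_fill_page(ppx, cx):
    """Assumed effect: overwrite page ppx - ZFP of cx['mem'] with zeros."""
    mem = list(cx["mem"])            # copy, so the old state stays intact
    mem[ppx - ZFP] = [0] * PG_SZ_WD
    return {"mem": mem, "swap": cx["swap"]}


def xsem_zfp(args, cx):
    """Return ([0], new_state) if the precondition holds, else None."""
    ppx = args["ppx"][0]             # cell 0 of parameter ppx
    if not pre_zfp(ppx):
        return None
    return ([0], zero_fill_page(ppx, cx))
```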
Altogether. The extended semantics environment for demand paging, xsemPFH,
is a list with three elements. Each element is a pair of a function name and a







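\[
\begin{aligned}
\mathit{xsem}_{\mathit{PFH}} = [\,
&(\mathtt{write\_to\_disk},\ ([(\mathtt{mem\_addr}, \mathit{UnsgndT}), (\mathtt{disk\_addr}, \mathit{UnsgndT})],\ [\mathit{BooleanT}],\ \mathit{xsem}_{\mathit{write}})),\\
&(\mathtt{read\_from\_disk},\ ([(\mathtt{mem\_addr}, \mathit{UnsgndT}), (\mathtt{disk\_addr}, \mathit{UnsgndT})],\ [\mathit{BooleanT}],\ \mathit{xsem}_{\mathit{read}})),\\
&(\mathtt{zero\_fill\_page},\ ([(\mathtt{ppx}, \mathit{UnsgndT})],\ [\mathit{IntegerT}],\ \mathit{xsem}_{\mathit{zfp}}))\,]
\end{aligned}
\]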
134 Integrating Results
7.1.2 Correctness of the Extended Calls
Mapping the extended state to the physical and swap memories. As our cur-
rent goal is to get rid of XCalls, i.e., justify them by applying functional cor-
rectness of their implementations, we also have to get rid of the extended state
by mapping it to the physical and the swap memories of the underlying ISA
or assembly machine with devices. By that we will be able to project modi-
fications described by the extended semantics over the extended state to real
hardware memories.
The relation for the physical memory xconsismem? is defined for both VAMP
ISA configurations cISA and assembly configurations cASM (we use the same
function name). The first parameter of the relation is one of these
configurations, whereas the second parameter is the extended state of demand
paging cX. The relation states that each word taken from the physical memory
component of cX is equal to the corresponding word read from the physical memory.
DEFINITION 7.7 (Relation between physical memory and extended state)
xconsismem?(cASM, cX) = ∀ k < |cX.mem| : ∀ i < PG SZ WD :
i2n(cASM.m(PG SZ · (ZFP + k)/4 + i)) = cX.mem[k][i]
xconsismem?(cISA, cX) = ∀ k < |cX.mem| : ∀ i < PG SZ WD :
〈cISA.mword(bin(PG SZ · (ZFP + k) + 4 · i))〉 = cX.mem[k][i]
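The two variants of the relation differ only in how the physical memory is addressed: word-wise on the assembly level and byte-wise on the ISA level. The following Python sketch makes this explicit. The assembly memory is modeled as a list of integers indexed by word address, the ISA memory as a byte string. The little-endian byte order in `mword`, the reading of i2n as reduction modulo 2^32, and the tiny constant values are assumptions made for illustration.

```python
# Illustrative model only; constants are hypothetical and kept tiny.
PG_SZ = 16                  # page size in bytes
PG_SZ_WD = PG_SZ // 4       # page size in words
ZFP = 1                     # index of the zero-filled page


def i2n(x):
    """Assumed reading of i2n: reinterpret a 32-bit integer as a natural."""
    return x & 0xFFFFFFFF


def xconsis_mem_asm(asm_m, cx_mem):
    """Assembly variant: the memory is addressed in words."""
    return all(i2n(asm_m[PG_SZ * (ZFP + k) // 4 + i]) == cx_mem[k][i]
               for k in range(len(cx_mem))
               for i in range(PG_SZ_WD))


def mword(isa_m, addr):
    """Read the 4-byte word at byte address addr (byte order assumed)."""
    return int.from_bytes(isa_m[addr:addr + 4], "little")


def xconsis_mem_isa(isa_m, cx_mem):
    """ISA variant: the memory is addressed in bytes."""
    return all(mword(isa_m, PG_SZ * (ZFP + k) + 4 * i) == cx_mem[k][i]
               for k in range(len(cx_mem))
               for i in range(PG_SZ_WD))
```

Word `i` of page `k` sits at word address `PG_SZ * (ZFP + k) // 4 + i` in the first variant and at byte address `PG_SZ * (ZFP + k) + 4 * i` in the second, which is the same location.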
The relation for the swap memory xconsisswap? is defined between a
configuration of the hard disk cHD and the extended state cX. It states that each
word taken from the swap memory component of cX is equal to the respective word
read from the disk’s swap memory.
DEFINITION 7.8 (Relation between swap memory and extended state)
xconsisswap?(cHD, cX) = ∀ k < |cX.swap| : ∀ i < PG SZ WD :
cHD.sm[PG SZ WD · (BOOT PGS + k) + i] = cX.swap[k][i]
Adding relation for C0 states. We obtain a single extended consistency relation
by combining relations xconsismem? and xconsisswap? with the relation between
a C0 configuration cSS of the concrete kernel and its version cXSS from which
the extended calls to the hard-disk drivers and the zero fill page function are
removed. The memories of these C0 machines are equal. The differences in
their program rests are specified by the function repl-hdzfpstmt introduced in
Section 6.3 (Definition 6.31).
DEFINITION 7.9 (Extended consistency relation)
xconsis?(cSS, cISA, cHD, cXSS, cX) =
cXSS.mem = cSS.mem
∧ cXSS.prog = repl-hdzfpstmt(hd(s2l(cSS.prog)))
∧ xconsismem?(cISA, cX)
∧ xconsisswap?(cHD, cX)
Note that in this definition an assembly configuration cASM could be used
instead of cISA.
Relation between function tables and extended semantics environment. The
extended consistency relation ties together a complete C0 configuration of the
concrete kernel and its variant with XCalls removed. We also need to express
a relation between the function tables corresponding to these configurations.
The desired relation takes two function tables ft and xpt, and some extended
semantics environment xsem. It states that (i) xsem equals the extended
semantics for demand paging xsemPFH, (ii) xpt is obtained from ft by removing
the definitions of the hard-disk drivers and zero_fill_page, and (iii) these
functions are defined in the function table ft.
DEFINITION 7.10 (Relation between function tables and extended semantics)
xconsisft?(ft, xpt, xsem) = xsem = xsemPFH
∧ xpt = repl-hdzfpft(ft)
∧ renumevenfun (write to disk, fwrite to disk) ∈ ft
∧ renumevenfun (read from disk, fread from disk) ∈ ft
∧ renumevenfun (zero fill page, fzero fill page) ∈ ft
Above, f with subscripts corresponds to function definitions in a function
table.
Next, we present two theorems justifying correctness of the extended calls.
The target models of these theorems are VAMP assembly and ISA, respectively.
The theorem for the assembly is an adaptation of Theorem 11 from Alkassar’s
thesis [Alk09] in order to make it more suitable for further transfer down to
the ISA level. We use Alkassar’s theorem to prove the theorem for assembly.
The latter together with the simulation theorem of VAMP assembly by VAMP
ISA [Tsy09, Theorem 4.8] is used to prove the theorem at the ISA level.
Note that below we speak about correctness of a single function, like the
page-fault handler, and therefore the program rest consists of a single call
statement to the respective function.
XCalls correctness towards VAMP assembly. The purpose of the theorem below
is to justify correctness of XCalls occurring in a given extended C0
computation, i.e., a computation defined by the transition function δnSSX between
a start pair of C0 and extended configurations (cXSS, cX) and the
resulting states (c′XSS, c′X). The given extended C0 computation describes a
function call, e.g., a call to the page-fault handler. The extended consistency
relation between the extended C0 state cXSS and a normal C0 configuration cSS
is assumed to hold at the beginning. The theorem delivers a resulting normal
C0 configuration c′SS under preservation of the extended consistency relation.
Besides that, we state that the underlying assembly computation terminates.
Let us classify the theorem’s assumptions and conclusions in detail. The
theorem below assumes the following.
• Validity and size constraints of the involved models/concepts: C0
configuration and program (1,2), assembly configuration (7), execution sequence
(9), hard disk (10), and extended state (4).
• (Abstraction) relations between the mentioned concepts: simulation of
assembly by C0 (3), extended consistency relation (5), and function table
and extended semantics (6).
• Preconditions to executions: on the assembly level (8), and on the ex-
tended C0 level. The latter comprises absence of address-of operators
and assembly and external call statements inside the program part to be
executed (13) and dynamic (14) and static memory consumption con-
straints.
• Successful extended C0 execution: the result configuration is obtained
(11), the call statement is removed from the program rest (12), and the
heap borders are not violated (15,16).
On the side of conclusions the theorem provides a resulting normal C0
configuration, an updated C0 allocation function, a number of taken assembly
steps, and an updated VAMP assembly with devices configuration, such that
the following holds.
• The underlying assembly computation succeeds (17) and there is no
communication with devices except for the swap hard disk (26). The latter is
denoted by the predicate non-interf-dev ′?, which is defined by the equation
from Definition 2.15 where the device with the identifier SWAP_DID
is excluded from the universal quantification.
• The validity constraints are preserved over the model of C0 (18), assembly
(21), and hard-disk (23).
• The consistency relation between the C0 and assembly is preserved (19)
as well as the extended consistency relation (20).
• The assembly execution proceeds in the system mode (22) and absence
of interrupts is guaranteed (25).
• The assembly memory remains unchanged at the first two addresses




(1) cSS ∈ C0 ′√(te, ft)
(2) ∧ (te, ft, gst(cSS.mem)) ∈ xltblprog
(3) ∧ consis?(te, ft, cSS, alloc, cASM+DS.cpu)
(4) ∧ subtypingX?(cX)
(5) ∧ xconsis?(cSS, cASM+DS.cpu, the-hd(cASM+DS.devs(SWAP DID)), cXSS, cX)
(6) ∧ xconsisft?(ft, xpt, xsem)
(7) ∧ asm√(cASM+DS.cpu)
(8) ∧ sys-execASM?(cASM+DS.cpu)
(9) ∧ seq√(seq, cASM+DS.devs)
(10) ∧ hd√(cASM+DS.devs)
(11) ∧ δxc-stepsSSX (te, xpt, cXSS, cX, xsem) = ⌊(c′XSS, c′X)⌋
(12) ∧ is-SCall(cXSS.prog) ∧ is-Skip(c′XSS.prog)
(13) ∧ ∀ p ∈ SCalls(xpt, called-func(cXSS.prog)) :
noAddrOf-Asm-ESCall?(⌈map-of(xpt, p)⌉.body)
(14) ∧ (ft, called-func(cXSS.prog), sz) ∈ SE
(15) ∧ ABASElm + asizest∗(sc(cSS.mem).lst) + sz ≤ HEAP START
(16) ∧ HEAP START + asizeheap(hst(c′XSS.mem)) < ZFP · PG SZ
−→ ∃ c′SS, alloc′, asm-steps, c′ASM+DS :











δasm-stepsASM+DS (seq, cASM+DS) = c′ASM+DS
∧ c′SS ∈ C0 ′√(te, ft)










∧ ∀ i : 0 ≤ i ∧ i < 2 −→ c′ASM+DS.cpu.m(i) = cASM+DS.cpu.m(i)
∧ dyn-propDS(cASM+DS, PROGBASE,
csizeprog(te, gst(cSS.mem), ft), seq, asm-steps)
∧ non-interf-dev ′?(cASM+DS.devs, c′ASM+DS.devs, seq, asm-steps)
Let us clarify some definitions used in the theorem. The function asizest∗
is used to compute the size of the local memory stack. This function is defined





The allocated size of a symbol table st is computed by the function asizest(st)
[Lei07, Definition 7.11]. The function asizeheap(hst) [Lei07, Definition 7.12]
computes the allocated size of the heap for a heap symbol table hst.
Condition (14), (ft, called-func(cXSS.prog), sz) ∈ SE [Tsy09, Definition 7.21], is used
to estimate the stack consumption: starting from a call of the
function called-func(cXSS.prog), the execution with respect to the function table
ft consumes at most sz bytes of stack.
XCalls correctness towards VAMP ISA. The formulation of the theorem with VAMP
ISA as the target level literally follows the statement of the theorem above. We
use an ISA with devices configuration instead of an assembly one. Therefore, the
necessary validity constraints have to hold at the ISA level. Instead of the consistency
relation between C0 and assembly, a simulation relation between C0 and ISA





(3) ∧ C0-sim-isa?(te, ft, cSS, cISA+DS.cpu) ∧ · · ·
(5) ∧ xconsis?(cSS, cISA+DS.cpu,
the-hd(cISA+DS.devs(SWAP DID)), cXSS, cX) ∧ · · ·
(7) ∧ isa√(cISA+DS.cpu)
(8) ∧ sys-execISA?(cISA+DS.cpu)
(9) ∧ seq√(seq, cISA+DS.devs)
(10) ∧ hd√(cISA+DS.devs) ∧ · · ·









δisa-stepsISA+DS (seq, cISA+DS) = c′ISA+DS ∧ · · ·










∧ c′ISA+DS.cpu.m(rep(0, 29)) = cISA+DS.cpu.m(rep(29, 0))
∧ non-interf-dev ′?(cISA+DS.devs, c′ISA+DS.devs, seq, isa-steps)
7.1.3 Applying Correctness of the Extended Calls
We will apply Theorem 7.12 twice: for the initialization code and for the page-
fault handler. The first function has an XCall to the zero fill page whereas the
second function additionally uses hard-disk drivers. The theorem application
procedure is the same for these two cases. Because of that, we describe how the
XCalls correctness theorem is applied before presenting the actual statements
of top-level correctness theorems for the initialization code and the page-fault
handler.
So far, we have transferred correctness results of the demand paging im-
plementation down to the extended C0 level. Lemmas 6.65 and 6.69 give us
extended C0 computations:
δxc-stepsSSX (teCK(ΠAK), ftext(ΠAK), cXSS, cX, xsem) = ⌊(c′XSS, c′X)⌋.
Moreover, these configurations respect functional pre- and postconditions of
the demand paging.
On the other hand, demand paging top-level correctness theorems are sup-
posed to be applied during concrete kernel executions within CVM. That is
why they necessarily speak about some concrete kernel C0 configuration cSS
which respects the following properties:
abs-kernel-props(ΠAK),
gst(cSS.mem) = gstCK(ΠAK).
To apply Theorem 7.12 we instantiate the program with the linked kernel
program, i.e., te = teCK(ΠAK) and ft = ftCK(ΠAK). We instantiate the extended
C0 configuration with the configuration obtained from the normal C0 configu-
ration by replacing the program rest by its first statement. This is handled by
the function ss2xss(cSS) = cXSS:
cXSS.mem = cSS.mem,
cXSS.prog = hd(s2l(cSS.prog)).
We construct the extended state from the physical memory and the hard-
disk content of ISA with devices cISA+DS. For this we define the function
isa2x(cISA+DS) = cX which yields the extended state cX by (i) converting the
user part of the ISA memory to the physical memory component of the ex-
tended state with the function mem2x (cf. Figure 7.1), and (ii) converting the
hard-disk swap memory part outside the boot region to the swap component
of the extended state with the function swap2x (cf. Figure 7.2).
DEFINITION 7.13 (Construction of extended state from ISA with devices)
cX.mem = mem2x(cISA+DS.cpu.m)
cX.swap = swap2x(the-hd(cISA+DS.devs(SWAP DID)).sm)
mem2x(m) = map(λi : map(λj : 〈mword(bin(PG SZ · (ZFP + i) + 4 · j))〉,
[0, . . . , PG SZ WD− 1]),
[0, . . . , USER PGS])
swap2x(sm) = map(λi : map(λj : sm[PG SZ WD · (BOOT PGS + i) + j],
[0, . . . , PG SZ WD− 1]),
[0, . . . , TOT BIG PGS · PGS PER BIG PG− 1])
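The two conversion functions translate directly into nested list comprehensions. The Python sketch below follows the definition term by term. The ISA memory is modeled as a byte string and the swap memory as a list of words. The little-endian byte order in `mword` and the tiny constant values are assumptions made for illustration.

```python
# Illustrative model only; constants are hypothetical and kept tiny.
PG_SZ = 16                  # page size in bytes
PG_SZ_WD = PG_SZ // 4       # page size in words
ZFP = 1                     # index of the zero-filled page
USER_PGS = 2                # number of user pages
BOOT_PGS = 1                # pages in the boot region of the disk
TOT_BIG_PGS = 2             # number of big pages in the swap
PGS_PER_BIG_PG = 2          # pages per big page


def mword(m, addr):
    """Read the 4-byte word at byte address addr (byte order assumed)."""
    return int.from_bytes(m[addr:addr + 4], "little")


def mem2x(m):
    """Pages ZFP .. ZFP + USER_PGS of the byte-addressed memory m."""
    return [[mword(m, PG_SZ * (ZFP + i) + 4 * j) for j in range(PG_SZ_WD)]
            for i in range(USER_PGS + 1)]


def swap2x(sm):
    """Pages of the word-addressed swap memory sm behind the boot region."""
    return [[sm[PG_SZ_WD * (BOOT_PGS + i) + j] for j in range(PG_SZ_WD)]
            for i in range(TOT_BIG_PGS * PGS_PER_BIG_PG)]
```

By construction, the outer list of `mem2x` has `USER_PGS + 1` entries, the outer list of `swap2x` has `TOT_BIG_PGS * PGS_PER_BIG_PG` entries, and every inner list has `PG_SZ_WD` entries.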
The extended semantics environment is instantiated with the one for demand
paging, xsemPFH. For memory consumption restrictions we compute the
smallest number of bytes needed for the execution of each particular demand
paging function. For the initialization code we set sz = 52, whereas for the
page-fault handler we use sz = 136.

[Figure 7.1: the byte-addressable physical memory as accessed by read-isa, and the two-dimensional array of the physical memory component of the extended state as constructed by mem2x]

[Figure 7.2: the word-addressable swap memory component of the hard disk, and the two-dimensional array of the swap memory component of the extended state as constructed by swap2x]
Having these instantiations we justify the assumptions of Theorem 7.12 as
follows.
Assumption 1: C0 configuration is valid. This is a CVM invariant.
Assumption 2: C0 program is translatable. Since our program is a concrete
kernel obtained by linking we use Lemma 6.24 from Tsyban’s thesis [Tsy09]:
abs-kernel-props(ΠAK) −→ (teCK(ΠAK), ftCK(ΠAK), gstCK(ΠAK)) ∈ xltblprog
Assumption 3: simulation relation between C0 and ISA configurations. This is
a CVM invariant.
Assumption 4: sub-typing of the extended state. We need to show
subtypingX?(isa2x(cISA+DS)).
After unfolding Definition 4.8 we show four individual conjuncts. The length
of the physical memory component is the user memory size plus one — for the
zero filled page.
|mem2x(cISA+DS.cpu.m)|
= |map(. . . , [0, . . . , USER PGS])|
= |[0, . . . , USER PGS]|
= USER PGS + 1
The length of the swap memory component is the number of pages contained
in the total amount of big pages.
|swap2x(the-hd(cISA+DS.devs(SWAP DID)).sm)|
= |map(. . . , [0, . . . , TOT BIG PGS · PGS PER BIG PG− 1])|
= |[0, . . . , TOT BIG PGS · PGS PER BIG PG− 1]|
= TOT BIG PGS · PGS PER BIG PG
Each element of the physical memory component is a list of PG SZ WD thirty-
two-bit natural numbers.
∀ i < USER PGS + 1 : |mem2x(cISA+DS.cpu.m)[i]|
= |map(. . . , [0, . . . , PG SZ WD− 1])|
= |[0, . . . , PG SZ WD− 1]|
= PG SZ WD
The same is true for the swap memory component.
∀ i < TOT BIG PGS · PGS PER BIG PG :
|swap2x(the-hd(cISA+DS.devs(SWAP DID)).sm)[i]|
= |map(. . . , [0, . . . , PG SZ WD− 1])|
= |[0, . . . , PG SZ WD− 1]|
= PG SZ WD
Assumption 5: extended consistency relation. We need to prove
xconsis?(cSS, cISA+DS.cpu, the-hd(cISA+DS.devs(SWAP DID)),
ss2xss(cSS), isa2x(cISA+DS)).
We unfold Definition 7.9 and conclude its individual conjuncts. The function
ss2xss does not change C0 memory configurations.
ss2xss(cSS).mem = cSS.mem
The equation for the program rests holds because we handle a call to one of




For the physical memory component of the extended state using Definition 7.13
we have:
∀ k < USER PGS + 1 : ∀ i < PG SZ WD :
mem2x(cISA+DS.cpu.m)[k][i]
= map(. . . ,map(. . . , [0, . . . , USER PGS]))[k][i]
= map(. . . (k), [0, . . . , PG SZ WD− 1])[i]
= 〈cISA+DS.cpu.mword(bin(PG SZ · (ZFP + k) + 4 · i))〉
Finally, for the swap memory component we conclude:
∀ k < TOT BIG PGS · PGS PER BIG PG : ∀ i < PG SZ WD :
swap2x(the-hd(cISA+DS.devs(SWAP DID)).sm)[k][i]
= map(. . . ,map(. . . , [0, . . . , TOT BIG PGS · PGS PER BIG PG− 1]))[k][i]
= map(. . . (k), [0, . . . , PG SZ WD− 1])[i]
= the-hd(cISA+DS.devs(SWAP DID)).sm[PG SZ WD · (BOOT PGS + k) + i].
Assumption 6: relations between function tables and extended semantics. We
need to show
xconsisft?(ftCK(ΠAK), ftext(ΠAK), xsemPFH).
Unfolding Definition 7.10 we see that the condition for the extended semantics
of demand paging is trivial as well as the conjunct for the function tables (cf.
Definition 6.33):
ftext(ΠAK) = repl-hdzfpft(ftCK(ΠAK)).
For the last conjunct we use the fact that if the function does not contain any
call statements then:
(fn, f) ∈ ftCVM −→ renumevenfun (fn, f) ∈ {ftCK(ΠAK)}.
This is true for fn ∈ {write to disk, read from disk, zero fill page}.
Assumptions 7-10: validity of models. These are CVM invariants.
Assumptions 11,12: execution on the extended C0 level. Delivered by the
correctness lemmas of demand paging at the small-step semantics level (Lemmas 6.65
and 6.69).
Assumption 13: no address-of, assembly, or external calls. Using the concrete
kernel function table we can show that during all possible executions of the
page-fault handler no function of the abstract kernel is called. That means that
no external calls might occur. As for the absence of the address-of operator
and assembly statements, these conditions are shown by inspecting the demand
paging code.
Assumption 14: static stack consumption. It is proven similarly by analyzing
the defined program code:
(ftCK(ΠAK), pfh init, 52) ∈ SE, (ftCK(ΠAK), pfh touch addr, 136) ∈ SE.
Assumption 15: dynamic stack consumption. This will be added as an as-
sumption to the top-level correctness theorems of the page-fault handler since
we do not know the size of the already consumed stack.
Assumption 16: heap consumption. In the case of the initialization code we know
that the heap was empty before the function call. Analyzing the initialization
code, we show that the elements allocated during the function execution
are (i) the page table space (a two-dimensional array of thirty-two-bit natural
numbers with dimensions TOT_PGS_PT and PTES_PER_PG), and (ii) USER_PGS
page descriptors, each occupying 20 bytes. Therefore, we conclude
HEAP_START + TOT_PGS_PT · PTES_PER_PG · 4 + USER_PGS · 20 < ZFP · PG_SZ.
In the case of the page-fault handler we do not allocate any elements on the heap.
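The heap bound for the initialization code is plain arithmetic over the layout constants. The Python sketch below spells out the two summands and the final comparison. The function names and all numeric values used with them are hypothetical; the actual constants are fixed by the implementation.

```python
def init_heap_bytes(tot_pgs_pt, ptes_per_pg, user_pgs):
    """Bytes allocated on the heap by the initialization code."""
    page_tables = tot_pgs_pt * ptes_per_pg * 4   # 32-bit page table entries
    descriptors = user_pgs * 20                  # 20 bytes per page descriptor
    return page_tables + descriptors


def heap_fits(heap_start, tot_pgs_pt, ptes_per_pg, user_pgs, zfp, pg_sz):
    """Check that the heap ends strictly below the zero-filled page."""
    allocated = init_heap_bytes(tot_pgs_pt, ptes_per_pg, user_pgs)
    return heap_start + allocated < zfp * pg_sz
```

For example, with the assumed values `tot_pgs_pt = 8`, `ptes_per_pg = 1024`, and `user_pgs = 100`, the allocation amounts to 32768 + 2000 = 34768 bytes. With `heap_start = 65536`, `zfp = 64`, and `pg_sz = 4096` the bound reads 100304 < 262144 and holds.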
7.2 Initialization Code Top-Level Correctness
We have already defined all functional specifications as well as abstraction rela-
tions used in the formulation of demand paging top-level correctness theorems.
These theorems also contain a number of technical pre- and postconditions
necessary to apply the theorems in the context of CVM verification. Mainly,
these conditions correspond to parts of either the validity of small-step config-
urations (for property transfer) validSS (Section 3.5) or the transition invariant
trans-inv? (Definition 3.3). We group them below.
So far all our demand paging correctness lemmas assume that the program
rest consists only of a single function call statement. According to
C0 validity, this means that the local stack consists only of one local frame,
that of the callee. In order to adapt this to an application in the context of the CVM
proof we switch to complete C0 configurations of the concrete kernel. This
step is justified by the fact that an execution of the function call affects
neither the local frames below the callee’s one nor the
program rest tree behind the discussed call statement. Thus, below we write
hd(s2l(cSS.prog)) = SCall(. . .) instead of cSS.prog = SCall(. . .).
Technical preconditions. We impose the following restrictions on the C0 small-
step semantics configuration cSS of the concrete kernel at the point of the initial-
ization code invocation: (i) validity of the C0 configuration, (ii) only pointers
to heap locations in the memory are allowed, (iii) the top-most statement of the
program rest is a call to the initialization code of the demand paging, (iv) the
return variable is in the top-local memory frame, (v) the global symbol table of
the C0 configuration is the global symbol table of the concrete kernel program,
(vi) the heap symbol table is empty, and (vii) the local stack is appropriately




PRE techinit ?(ΠAK, cSS) =
cSS ∈ C0 ′√(teCK(ΠAK), ftCK(ΠAK))
∧ only-heap-pointer?(cSS.mem)
∧ hd(s2l(cSS.prog)) = SCall(r, pfh init, [ ])
∧ r ∈ map(fst, lsttop(cSS.mem))
∧ gst(cSS.mem) = gstCK(ΠAK)
∧ glob-init?(cSS.mem)
∧ hst(cSS.mem) = [ ]
∧ asizest∗(sc(cSS.mem).lst) + 52 ≤ ABASEhm − ABASElm
Technical postconditions. After the concrete kernel configuration cSS transits
to c′SS by executing the call to the initialization code, the following technical
postconditions hold: (i) the validity of the C0 configuration, (ii) only pointers
to heap locations in the memory are allowed, (iii) every type in the heap has
a proper name in the type environment, (iv) the program rest is updated
by removing the top-most statement, (v) the global and heap memories are
unchanged at all variables/locations except for those that are allowed to be
modified by the initialization code, (vi) the C0 memory invariant (described




POST techinit ?(ΠAK, cSS, c′SS) =





∧ c′SS.prog = rem-1st-stmt(cSS.prog)
∧ unchangedGM,HM(teCK(ΠAK), cSS.mem, c′SS.mem,
modifiedinit, [0..USER PGS])
∧ C0mem-inv?(teCK(ΠAK), cSS.mem, c′SS.mem,
hstCVM, r,mcellint(0)).
Let mc and mc′ be memory configurations before and after execution of some
call statement, let stheap be the additional part of the heap symbol table which
has been allocated during this call, and let ret-var and ret-val be the name and
the value of the return variable of the call, respectively. The C0 memory
invariant [Tsy09, Definition 7.31] C0mem-inv?(te,mc,mc′, stheap, ret-var, ret-val) is a
conjunction of the following facts: (i) the global symbol tables of both memory
configurations are equal, whereas the set of initialized global variables of mc
is included in that of mc′, (ii) the lengths of the local memory stacks are equal
and these stacks differ only at the topmost element, (iii) the top return destinations
and top local symbol tables of the memory configurations mc and mc′ are equal,
whereas the set of initialized local variables of mc is a subset of the initialized
local variables of mc′ (the call initializes the result variable), (iv) all initialized
top local variables have the same values in both memory configurations, (v) the
value of the return variable ret-var is ret-val, and (vi) the heap symbol table of
mc′ is composed of the heap symbol table of mc and stheap.
The predicate unchangedGM,HM(te,mc,mc′, varsgm, ind-setheap) [Tsy09, Def-
inition 7.29] compares two C0 memory configurations and asserts that they
differ only at given global and heap variable names (locations) specified by
varsgm and ind-setheap, respectively.
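The idea behind this predicate can be illustrated by a small Python sketch. It is not the definition from [Tsy09]: the memory model (a dictionary of global variables under `gm` and a list of heap objects under `hm`), the treatment of heap locations that exist in only one configuration, and the omission of the type environment parameter are all simplifying assumptions.

```python
def unchanged_gm_hm(mc, mc2, vars_gm, ind_set_heap):
    """True if mc and mc2 differ at most at the listed globals and heap indices.

    mc, mc2:       dicts with 'gm' (name -> value) and 'hm' (list of objects)
    vars_gm:       global variable names that may differ
    ind_set_heap:  heap indices that may differ
    """
    names = set(mc["gm"]) | set(mc2["gm"])
    gm_ok = all(mc["gm"].get(v) == mc2["gm"].get(v)
                for v in names if v not in vars_gm)

    n = max(len(mc["hm"]), len(mc2["hm"]))
    hm_ok = all(i < len(mc["hm"]) and i < len(mc2["hm"])
                and mc["hm"][i] == mc2["hm"][i]
                for i in range(n) if i not in ind_set_heap)
    return gm_ok and hm_ok
```

In this reading, a heap location that is newly allocated in `mc2` must be listed in `ind_set_heap`, which matches the use of the index range [0..USER_PGS] in the postconditions.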
With the function rem-1st-stmt(cSS.prog) [Tsy09, Definition 7.32] we remove
the top-most statement in the program state.
Correctness theorem. The top-level correctness theorem of the initialization
code has the following assumptions.
• Validity of involved models/concepts: underlying VAMP ISA
configuration (5), hard disk (3), and execution sequence (2).
• Simulation relation between a C0 configuration of the concrete kernel and
the ISA configuration (4).
• Preconditions to execution on different levels: abstract kernel properties
(1), ISA code invariants (6), system-mode execution (7), technical C0
preconditions (8), and functional preconditions of the initialization code
expressed in C0 small-step semantics (9).
On the side of conclusions the theorem claims the existence of an updated
VAMP ISA configuration achieved in T steps and an updated C0 configuration
of the concrete kernel such that the following holds.
• The ISA computation succeeds and all devices except for the swap hard
disk are untouched (10).
• Validity of the hard disk (11) and ISA (13), simulation of C0 by ISA (12),
as well as ISA code invariants (14) and system-mode execution condition
(15) are preserved.
• Technical (17) and functional (18) postconditions as well as the zero-filled
page condition (16) hold.
• The abstraction relation for user processes is established with the initial





(2) ∧ seq√(seq, cISA+DS.devs)
(3) ∧ hd√(cISA+DS.devs)
(4) ∧ C0-sim-isa?(teCK(ΠAK), ftCK(ΠAK), cSS, cISA+DS.cpu)
(5) ∧ isa√(cISA+DS.cpu)
(6) ∧ code-inv?(ΠAK, cISA+DS.cpu)
(7) ∧ sys-execISA?(cISA+DS.cpu)
(8) ∧ PRE techinit ?(ΠAK, cSS)
(9) ∧ PRE SSinit?(cSS, teCK(ΠAK))
−→ ∃ T, c′ISA+DS, c′SS : δTISA+DS(cISA+DS, seq) = c′ISA+DS
(10) ∧ non-interf-dev ′?(cISA+DS.devs, c′ISA+DS.devs, seq, T )
(11) ∧ hd√(c′ISA+DS.devs)
(12) ∧ C0-sim-isa?(teCK(ΠAK), ftCK(ΠAK), c′SS, c′ISA+DS.cpu)
(13) ∧ isa√(c′ISA+DS.cpu)
(14) ∧ code-inv?(ΠAK, c′ISA+DS.cpu)
(15) ∧ sys-execISA?(c′ISA+DS.cpu)
(16) ∧ zfp-cond?(c′ISA+DS.cpu)
(17) ∧ POST techinit ?(ΠAK, cSS, c′SS)
(18) ∧ POST SSinit?(c′SS, r, teCK(ΠAK), gvarsactive, gvarsfree)
(19) ∧ B(upsinit, c′ISA+DS)
PROOF. We apply Lemma 6.65 in order to discharge the functional postcondition
and obtain an extended C0 computation. By this we obtain a description
of physical and swap memories in terms of the extended state. We apply
Theorem 7.12 as described in Section 7.1.3 to map modifications done over the
extended state to real physical ISA and hard-disk memories.
Having this, the most important proof step is to establish the abstraction
relation for user processes B. Here we exploit two facts from the initialization
code specification (Section 5.3.2). All registers except PTO and PTL are
initialized with zeros as claimed by the specification (Definition 5.19). The PTL
registers are initialized with −1 for all user processes: no virtual memory is
allocated for any process. This gives the term describing the virtual memory
equality for free (cf. Definition 2.29). The remaining proof goal is to show that
the values of the PTO and PTL registers taken from the ISA memory correspond
to those specified by the small-step semantics postcondition of the
initialization code. We use the simulation relation between C0 and ISA, C0-sim-isa?, to
conclude that.
7.3 Page-Fault Handler Top-Level Correctness
As for the initialization code, we define technical pre- and postconditions
for the page-fault handler top-level correctness theorem. The respective
predicates additionally take as arguments the values pid, addr, intent, and count
of the parameters to the handler’s call.
Technical preconditions. At the moment the concrete kernel calls the page-
fault handler there exist expressions for the call parameters and the follow-
ing technical preconditions have to hold: (i) validity of the C0 configuration,
(ii) only pointers to heap locations in the memory are allowed, (iii) every type
in the heap has a proper name in the type environment, (iv) the top-most state-
ment of the program rest is a call to the page-fault handler with corresponding
parameter expressions, (v) the return variable is in the top-local memory frame,
(vi) the global symbol table of the C0 configuration is the global symbol table
of the concrete kernel program, (vii) all global variables of the concrete kernel
are initialized, (viii) the heap symbol table of the CVM implementation lies
at the beginning of the concrete kernel heap symbol table, (ix) the local stack
is appropriately bounded, (x) the heap is appropriately bounded, and (xi) the




PRE techta ?(ΠAK, cSS, pid, addr, intent, count) =
∃epid, eaddr, eintent, ecount :
cSS ∈ C0 ′√(teCK(ΠAK), ftCK(ΠAK))
∧ only-heap-pointer?(cSS.mem)
∧ named-ty?(teCK(ΠAK), hst(cSS.mem))
∧ hd(s2l(cSS.prog)) = SCall(r, pfh touch addr,
[epid, eaddr, eintent, ecount])
∧ r ∈ map(fst, lsttop(cSS.mem))
∧ gst(cSS.mem) = gstCK(ΠAK)
∧ glob-init?(cSS.mem)
∧ ∀ i < |hstCVM| : hst(cSS.mem)[i] = hstCVM[i]
∧ asizest∗(sc(cSS.mem).lst) + 136 ≤ ABASEhm − ABASElm
∧ asizeheap(hst(cSS.mem)) < ASIZEmaxhm
∧ paramSSta ?(cSS, teCK(ΠAK), epid, eaddr, eintent, ecount,
pid, addr, intent, count)
Technical postconditions. The technical postconditions of the page-fault
handler top-level theorem are the following: (i) the validity of the C0
configuration, (ii) only pointers to heap locations in the memory are allowed,
(iii) every type in the heap has a proper name in the type environment,
(iv) the program rest is updated by removing the top-most statement, (v) the
global and heap memories are unchanged at all variables/locations except for
those the handler may modify (the set modifiedta and the heap locations
[0..USER PGS]), and (vi) the return variable r holds the result of the
page-fault handler, as stated by the last conjunct below.
POST techta ?(ΠAK, cSS, c′SS, pid, addr, intent, count) =
c′SS ∈ C0 ′√(teCK(ΠAK), ftCK(ΠAK))
∧ only-heap-pointer?(c′SS.mem)
∧ named-ty?(teCK(ΠAK), hst(c′SS.mem))
∧ c′SS.prog = rem-1st-stmt(cSS.prog)
∧ unchangedGM,HM(teCK(ΠAK), cSS.mem, c′SS.mem,
modifiedta, [0..USER PGS])
∧ C0mem-inv?(teCK(ΠAK), cSS.mem, c′SS.mem,
[ ], r, mcellnat(pfh-ta-resSS(cPFH, pid, addr, intent, count))).
Weak relation for user processes. Instead of the relation for user processes
B, the top-level correctness theorem of the page-fault handler concludes its
weak version.
Basically, the weak version of the relation B, denoted by B′, states that
relation B holds for all virtual pages in the system but one. Recall that
when the page-fault handler is called for a pair (pid, addr) with the intention
intent = OVERWRITE, we know that the kernel is going to overwrite all data
stored in the physical page corresponding to (pid, addr). An example is the
primitive cvm copy [Tsy09, Section 9.3], which copies data between two
processes: some portions of the target process’s memory will be overwritten.
For this reason an optimization was introduced in the page-fault handler: in
case intent = OVERWRITE we do not swap in the data for the touched page but
only update its page table entry (consider lines 50–58 of Listing 5.4). The
latter includes setting the valid bit. Since the page table entry
corresponding to the pair (pid, addr) is validated while the respective
physical page is still filled with the old data, the relation for user
processes is violated for that particular page. Note that the relation B is
regained during the kernel run as soon as the kernel puts the right data into
that page. However, immediately after the page-fault handler returns, only
the relation B′ holds.
In order to define B′ formally, let us first adapt the assembly equality
predicate asm-equal? (Definition 2.29). The weak assembly equality, denoted by
the predicate asm-equal ′?, is additionally parametrized by a boolean flag flag
and a virtual page index vpx. The new predicate differs from asm-equal? in the
memory conjunct: virtual memories of two assembly configurations are equal at
all addresses except, if flag is set, those in the page vpx.
asm-equal ′?(c1ASM, c2ASM, vm-size, flag, vpx) =
c1ASM.dpc = c2ASM.dpc
∧ c1ASM.pc = c2ASM.pc
∧ tl(c1ASM.gpr) = tl(c2ASM.gpr)
∧ stored-spr(c1ASM.spr) = stored-spr(c2ASM.spr)
∧ ∀a < vm-size : ¬(flag ∧ px(a) = vpx) −→ c1ASM.m(a/4) = c2ASM.m(a/4)
Having this, it is fairly easy to define the relation B′. Its definition
extends the parameters of B (Definition 2.30) with a boolean flag flag, a
process identifier pid, and a virtual page index vpx. The new relation claims
for all user processes p the weak assembly equality between their
implementation vm(cISA+DS, p) and specification ups(p), where the flag
parameter of asm-equal ′? is strengthened with the condition p = pid.
Definition 7.20 (Relation for user processes: weak version).
B′(ups, cISA+DS, flag, pid, vpx) =
∀ 0 < p < MAX PID : asm-equal ′?(vm(cISA+DS, p), ups(p),
(ptlCVM(cISA+DS.cpu, p) + 1) · 2^12,
flag ∧ (p = pid), vpx)
By setting the flag parameter of B′ to intent = OVERWRITE we can express
the relation for user processes which holds after the page-fault handler:
B′(ups, cISA+DS, intent = OVERWRITE, pid, px(addr)).
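The memory conjunct of the weak assembly equality can be illustrated by a small C sketch. It is not part of the verified code: the function name weak_mem_equal, the representation of a memory as an array of 32-bit words, and the fixed page size of 2^12 bytes are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PG_SZ 4096u            /* page size in bytes (2^12), assumed */
#define PX(a) ((a) >> 12u)     /* page index of a byte address */

/* Hypothetical helper: two word-addressed memories must agree on every byte
 * address a < vm_size, except for addresses in page vpx when flag is set. */
static bool weak_mem_equal(const uint32_t *m1, const uint32_t *m2,
                           uint32_t vm_size, bool flag, uint32_t vpx)
{
    assert(m1 != NULL && m2 != NULL);
    assert(vm_size % 4u == 0u);                    /* whole words only */
    for (uint32_t a = 0; a < vm_size; a += 4u) {   /* one check per word */
        if (flag && PX(a) == vpx)
            continue;                              /* exempted page */
        if (m1[a / 4u] != m2[a / 4u])
            return false;
    }
    return true;
}
```

With flag unset the check degenerates to plain memory equality, which mirrors how B′ coincides with B whenever intent differs from OVERWRITE.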
Correctness theorem. Technically, the formulation of the page-fault handler
top-level correctness theorem is close to Theorem 7.16 (the top-level theorem
of the initialization code). Therefore we list only those assumptions and
conclusions that differ.
On the side of assumptions, the theorem has the zero-filled page condition
(1) and the relation for user processes (2). Both are maintained throughout
CVM, and the page-fault handler top-level correctness theorem is supposed to
be applied during the concrete kernel execution. Further, we assume the
technical (3) and functional (4) preconditions of the page-fault handler. As
for the conclusions, the theorem claims that the technical (5) and functional
(6) page-fault handler postconditions hold, together with the weak relation
for user processes (7).
Theorem 7.21 (page-fault handler top-level correctness).

(2) ∧ B(ups, cISA+DS)
(3) ∧ PRE techta ?(ΠAK, cSS, pid, addr, intent, count)
(4) ∧ PRE SSta ?(cSS, cPFH, pid, addr, intent, count,
teCK(ΠAK), locsactive, locsfree)
−→ ∃ T, c′ISA+DS, c′SS : δTISA+DS(cISA+DS, seq) = c′ISA+DS
∧ non-interf-dev ′?(cISA+DS.devs, c′ISA+DS.devs, seq, T )
∧ hd√(c′ISA+DS.devs)





(5) ∧ POST techta ?(ΠAK, cSS, c′SS, pid, addr, intent, count)
(6) ∧ POST SSta ?(c′SS, cPFH, pid, addr, intent, count, r,
teCK(ΠAK), locsactive, locsfree)
(7) ∧ B′(ups, c′ISA+DS, intent = OVERWRITE, pid, px(addr)).
Proof. We apply Lemma 6.69 in order to discharge the functional postcondition
and obtain an extended C0 computation. By this we obtain a description
of physical and swap memories in terms of the extended state. We apply
Theorem 7.12 as described in Section 7.1.3 to obtain relations between the
extended state and the real physical ISA and hard-disk memories. Showing the
relation for user processes constitutes the major proof effort.
The page-fault handler does not change registers of user processes. Hence,
the register conjuncts of the relation are easy to show. As for the virtual
memories of the processes, their contents stay the same. Because of these
facts we use the same user-process components ups in the assumptions and
conclusions of the theorem.
The only condition affected by the page-fault handler is the place where a
certain virtual memory page lies: in the physical memory or on the hard disk.
Formally, this boils down to investigating the address translation mechanism.
For the user-processes relation proof we have to map the changes specified
for the extended state to the real machine memory and the hard disk. These
two are related by the predicates xconsismem? and xconsisswap? which are part
of the relation xconsis?.
Let us consider as an example the case of a legal user page fault in the
situation where some page has to be swapped out (this turns out to be the
most complicated case of page-fault handler execution). Let mem and swap
be the components of the extended state before the function execution.
Suppose a page fault occurs for the pair (pidin, vain). In order to treat this
page fault the handler has to swap out the page corresponding to some other
pair (pidout, vaout). The extended state components after the handler
execution are
mem′[i] =
swap[offsswap(spxPFH(cPFH, px(vain), pidin))]
if i = offsmem(px(pmaPFH(cPFH, pidout, vaout)))
mem[i] otherwise
swap′[i] =
mem[offsmem(px(pmaPFH(cPFH, pidout, vaout)))]
if i = offsswap(spxPFH(cPFH, px(vaout), pidout))
swap[i] otherwise
Note that since the extended state models only parts of physical and swap
memories the corresponding offset functions offsswap and offsmem are applied
(cf. Definitions 4.24 and 4.25).
We instantiate the extended state component before the function execution
with mem2x(cISA.m) and swap2x(cHD.sm) and by unfolding xconsismem? and
xconsisswap? obtain the following.
∀ ad : PG SZ · ZFP ≤ ad
∧ ad < PG SZ · (ZFP + USER PGS + 1)
−→ 〈c′ISA.mword(bin(ad))〉 =
cHD.sm[(PG SZ · spxPFH(cPFH, px(vain), pidin) + bx(ad))/4]
if px(ad) = px(pmaPFH(cPFH, pidout, vaout))
〈cISA.mword(bin(ad))〉 otherwise
∀ ad : PG SZ WD · BOOT PGS ≤ ad
∧ ad < PG SZ WD · (BOOT PGS + TOT BIG PGS · PGS PER BIG PG)
−→ c′HD.sm[ad] =
〈cISA.mword(bin(PG SZ · px(pmaPFH(cPFH, pidout, vaout)) + bx(4 · ad)))〉
if px(4 · ad) = spxPFH(cPFH, px(vaout), pidout)
cHD.sm[ad] otherwise
Figure 7.3: Page movement during page-fault handling (labels: (pidin, vain),
(pidout, vaout); physical memory component mem at offsmem(px(pmaPFH(...)));
swap memory component swap at offsswap(spxPFH(...)))
Note that the physical memory address and the swap page index computa-
tions over the abstract PFH state are semantically equivalent to those which
are defined in the CVM theories and used in the definition of the relation B
(Definitions 2.22 and 2.26).
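The page movement in the swap-out case can be sketched in C as follows. This is an illustration under stated assumptions, not the verified handler: the function move_pages, the page-granular arrays mem and swap, and the parameters frame, slot_out and slot_in (standing for the physical page index of the victim and the swap page indices of the two pages) are hypothetical names. The sketch also assumes that the two pages own distinct swap slots.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PG_WORDS 1024u   /* words per 4 KB page, assumed */

/* Hypothetical model of the swap-out case: the victim's frame is first saved
 * to its swap slot, then the same frame is filled from the swap slot of the
 * faulting page. */
static void move_pages(uint32_t mem[][PG_WORDS], uint32_t swap[][PG_WORDS],
                       uint32_t frame, uint32_t slot_out, uint32_t slot_in)
{
    assert(slot_out != slot_in);   /* assumption: distinct swap slots */
    memcpy(swap[slot_out], mem[frame], sizeof mem[frame]);   /* swap out */
    memcpy(mem[frame], swap[slot_in], sizeof mem[frame]);    /* swap in  */
}
```

After the call, the frame holds the old contents of the faulting page's swap slot, and the victim's swap slot holds the old contents of the frame; all other pages are untouched, which corresponds to the "otherwise" branches of the two equations above.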
Ultimately, we want to show that the memory conjunct of the relation for user
processes holds for the same abstract user process configurations as before
the function call. This reduces to the claim that for every user process pid
and every address va in the set of allocated addresses of that process, the
value constructed by the B-relation before the function call is the same as
after. Here we distinguish three cases.
Case 1. The pair (pid, va) corresponds to the same page table entry as the
pair (pidin, vain), i.e., pid = pidin and px(va) = px(vain). From the functional
postcondition it follows that the valid bit for that page was not set before the
function call and is set after the call. So, for this pair we should read the
value from the swap memory before the call and from the physical memory after
the call. The value before the call is
cHD.sm[(spxPFH(cPFH, px(va), pid) · PG SZ + bx(va))/4].
The page index of the physical memory address after the call is
px(pmaPFH(c′PFH, pid, va))
= px(ptePFH(c′PFH, pid, px(va)))
= px(ptePFH(c′PFH, pidin, px(vain)))
= px(ptePFH(cPFH, pidout, px(vaout)))
= px(pmaPFH(cPFH, pidout, vaout)).
So, the value after the call is
〈c′ISA.mword(bin(pmaPFH(c′PFH, pid, va)))〉
= cHD.sm[(PG SZ · spxPFH(cPFH, px(vain), pidin) + bx(pmaPFH(c′PFH, pid, va)))/4]
= cHD.sm[(PG SZ · spxPFH(cPFH, px(vain), pidin) + bx(va))/4]
= cHD.sm[(PG SZ · spxPFH(cPFH, px(va), pid) + bx(va))/4].
Case 2. The pair (pid, va) corresponds to the same page table entry as the pair
(pidout, vaout), i.e., pid = pidout and px(va) = px(vaout). From the functional
postcondition it follows that the valid bit for that page was set before the
function call and is not set after the call. So, for this pair we should read the
value from the physical memory before the call and from the swap memory after
the call. The value before the call is
〈cISA.mword(bin(pmaPFH(cPFH, pid, va)))〉.
The page index of the swap memory address is
spxPFH(c′PFH, px(va), pid)
= spxPFH(c′PFH, px(vaout), pidout)
= spxPFH(cPFH, px(vaout), pidout).
So, the value after the call is
c′HD.sm[(spxPFH(c′PFH, px(va), pid) · PG SZ + bx(va))/4]
= 〈cISA.mword(bin(PG SZ · px(pmaPFH(cPFH, pidout, vaout)) +
bx(spxPFH(c′PFH, px(va), pid) · PG SZ + bx(va))))〉
= 〈cISA.mword(bin(PG SZ · px(ptePFH(cPFH, pidout, px(vaout))) + bx(va)))〉
= 〈cISA.mword(bin(PG SZ · px(ptePFH(cPFH, pid, px(va))) + bx(va)))〉
= 〈cISA.mword(bin(pmaPFH(cPFH, pid, va)))〉.
Case 3. The pair (pid, va) does not correspond to any of the handled pages.
In this case the page corresponding to (pid, va) resides at the same place before
and after the function call. The “if” conditions in the equations presented at
the beginning of this proof are not satisfied. Therefore we conclude
〈c′ISA.mword(bin(pmaPFH(c′PFH, pid, va)))〉
= 〈cISA.mword(bin(pmaPFH(cPFH, pid, va)))〉
and
c′HD.sm[(spxPFH(c′PFH, px(va), pid) · PG SZ + bx(va))/4]
= cHD.sm[(spxPFH(cPFH, px(va), pid) · PG SZ + bx(va))/4].



















Figure 7.4: Verification diagram of user page fault handling.
7.4 Using Results in the CVM Proof
CVM invokes the initialization code of demand paging while initializing its
data structures after a reset. Section 9.2.2 of Tsyban’s thesis [Tsy09] describes
verification of kernel initialization after a reset and gives details in which con-
text the top-level correctness theorem of the initialization code (Theorem 7.16)
is applied.
The page-fault handler is called by CVM in two situations: when page-
fault exceptions occur during user steps and when the kernel executes CVM
primitives that access user memory. In the second case the handler simulates
address translation for CVM, which runs untranslated, and makes sure that
the corresponding memory page is swapped in. In this section we briefly review
these two cases.
7.4.1 Handling User Page Faults
During a single user step up to two page faults might occur: a page fault
on fetch and a page fault on load/store. Our page-fault handler is designed
in such a way that no more than two page faults occur while processing a
single instruction. Hence, the following five situations are possible
regarding page faults: (i) no page faults, (ii) only a page fault on fetch,
(iii) only a page fault on load/store, (iv) a page fault on fetch followed by
a page fault on load/store, and (v) a page fault on load/store followed by a
page fault on fetch.
The ISA states c1ISA, c2ISA, and c3ISA discussed below are mapped to a single
user process state at which the user attempts to make a step. In state c1ISA
it is guaranteed that there is no page fault on fetch. In state c2ISA it is
guaranteed that there is no page fault on load/store. In case there was a page
fault on fetch we can also state its absence at this point. However, if there
was no page fault on fetch, it could arise during the handling of the page
fault on load/store, and it will be signaled in the next step. Hence, in
general we cannot claim anything about the instruction page fault in state
c2ISA. Only in state c3ISA is it guaranteed that both kinds of page faults are
absent.
The mentioned scenario is expressed as a formal lemma in Tsyban’s the-
sis [Tsy09, Lemma 10.1]. The proof of this lemma essentially boils down to a
triple application of the page-fault handler correctness theorem (Theorem 7.21).
In order to guarantee liveness of the system it is necessary to argue that during
the next call to the page-fault handler the page that was swapped in this time
will not be swapped out.
The page-fault handler implementation respects the mentioned property.
Suppose we want to claim that the page corresponding to a pair (pid, va) will
still be in the physical memory after handling of the next page fault. If the page
fault has occurred because the needed page was not in the physical memory,
the handler has to load it from the swap memory. In case there is not a single
vacant page in the physical memory, which is indicated by an empty free list,
some page has to be evicted. According to our page-replacement strategy a
page at the beginning of the active list has to be swapped out. The formula
below claims that the page associated with the pair (pid, va) will not be evicted,
hence, will not be swapped out during the next call to the handler:











In Section 9.2.5 of her doctoral thesis [Tsy09] Tsyban gives this predicate a
meaning on the ISA level and formally proves that it holds after the page-fault
handler call using the fact that we have at least two user pages in the physical
memory.
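The liveness argument can be illustrated by a minimal C sketch of a plain FIFO replacement policy. It is only an illustration: the types page_id and fifo, the function fifo_swap_in, and the array-based queue are hypothetical and much simpler than the doubly linked active and free lists of the implementation. The sketch assumes that a newly loaded page is appended at the tail of the active list.

```c
#include <assert.h>
#include <stddef.h>

#define USER_PGS 2u   /* number of user frames; the argument needs at least two */

typedef struct { unsigned pid, vpx; } page_id;

typedef struct {
    page_id active[USER_PGS];   /* active[0] is the head (oldest page) */
    size_t  len;
} fifo;

/* Hypothetical helper: make page p resident. If no frame is free, the head
 * of the active list is evicted into *victim. Returns 1 on eviction. */
static int fifo_swap_in(fifo *f, page_id p, page_id *victim)
{
    int evicted = 0;
    assert(f != NULL && victim != NULL);
    if (f->len == USER_PGS) {                 /* free list is empty */
        *victim = f->active[0];
        for (size_t i = 1; i < f->len; i++)   /* drop the head */
            f->active[i - 1] = f->active[i];
        f->len--;
        evicted = 1;
    }
    f->active[f->len++] = p;                  /* newest page goes to the tail */
    return evicted;
}
```

With at least two frames, the page loaded by one call sits at the tail, so the next call, which evicts the head, cannot remove it. With a single frame the head and the tail coincide and the property would fail.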
7.4.2 Address Translation in CVM Primitives
When invoked in CVM primitives, the page-fault handler is used for address
translation and for guaranteeing that the specified pages reside in the
physical memory. A good example is the primitive cvm copy. It copies a
specified amount of virtual memory from one process pid1 at virtual address
va1 to another process pid2 at virtual address va2. Correctness of the
primitive is shown in Section 9.3 of Tsyban’s thesis [Tsy09].
In order to proceed with copying, the primitive translates virtual addresses
to physical ones by invoking the page-fault handler. An important property for
the correctness proof of the primitive is process memory isolation: two
different pairs of process identifiers and virtual addresses
(pid1, va1) ≠ (pid2, va2) are translated into two different physical addresses
pmaPFH(cPFH, pid1, va1) ≠ pmaPFH(cPFH, pid2, va2). This property is respected
by the page-fault handler, and below we prove a lemma justifying it. Note that
this lemma is also applied in similar cases of the CVM proof where it has to
be shown that user processes operate in separate address spaces, e.g., the
correctness lemma of a user step without interrupts [Tsy09, Lemma 10.2].
The property can be shown only for valid page-fault handler configurations,
user process identifiers, and appropriately bounded virtual addresses. The
latter have to be less than the virtual memory size of the respective
process. Note that some virtual addresses are mapped to the zero-filled page.
Without loss of generality we consider the situation where one of the
addresses does not translate to the zero-filled page, whereas the second
address either translates to a valid user page or is mapped to the zero-filled
page.
(pid1, va1) ≠ (pid2, va2)
∧ pfh√(cPFH)
∧ user-pid?(pid1) ∧ user-pid?(pid2)
∧ va1 < (cPFH.ptl[pid1] + 1) · PG SZ ∧ va2 < (cPFH.ptl[pid2] + 1) · PG SZ
∧ valid?(ptePFH(cPFH, pid1, px(va1)))
∧ (valid?(ptePFH(cPFH, pid2, px(va2))) ∨ zfp?(cPFH, pid2, px(va2)))
−→ pmaPFH(cPFH, pid1, va1) ≠ pmaPFH(cPFH, pid2, va2)
Proof. We prove the lemma by a case split on the disjunction in the premises.
Case 1. The second pair of process identifier and virtual address corresponds
to the zero-filled page:
zfp?(cPFH, pid2, px(va2)).
Using the page-fault handler validity conjunct user-ppx-pt? (Definition 4.47)
which states that page table entries refer only to the user memory we reason
as follows:
pmaPFH(cPFH, pid2, va2)
= ZFP · PG SZ + bx(va2)
< ZFP · PG SZ + PG SZ
= KERNEL PGS · PG SZ
≤ px(ptePFH(cPFH, pid1, px(va1))) · PG SZ
≤ px(ptePFH(cPFH, pid1, px(va1))) · PG SZ + bx(va1)
= pmaPFH(cPFH, pid1, va1).
From that we conclude that translated addresses are different.
Case 2. The second pair of process identifier and virtual address corresponds
to the user memory:
valid?(ptePFH(cPFH, pid2, px(va2))).
From the page-fault handler validity conjunct valid-pte-descr-active?
(Definition 4.46) we conclude that two elements a1 and a2 of the active list
(possibly equal) correspond to the given page table entries:
a1 ∈ cPFH.active
∧ a1.pid = pid1
∧ a1.vpx = px(va1)
∧ a1.ppx = px(ptePFH(cPFH, pid1, px(va1)))
∧ a2 ∈ cPFH.active
∧ a2.pid = pid2
∧ a2.vpx = px(va2)
∧ a2.ppx = px(ptePFH(cPFH, pid2, px(va2))).
Case 2.1: a1 = a2. In this case the corresponding process identifiers as well
as the virtual and physical page indices are equal:
pid1 = pid2,
px(va1) = px(va2),
px(ptePFH(cPFH, pid1, px(va1))) = px(ptePFH(cPFH, pid2, px(va2))).
From the assumptions it follows that if the process identifiers are equal
then the virtual addresses are different, va1 ≠ va2:
(va1 ≠ va2)
= (bx(va1) ≠ bx(va2))
= (px(ptePFH(cPFH, pid1, px(va1))) · PG SZ + bx(va1)
≠ px(ptePFH(cPFH, pid2, px(va2))) · PG SZ + bx(va2))
= (pmaPFH(cPFH, pid1, va1) ≠ pmaPFH(cPFH, pid2, va2)).
Case 2.2: a1 ≠ a2. From the validity predicate dstnct-ppx-active-free?
(Definition 4.44) we get that two different elements of the active list have
different physical page indices:
a1.ppx ≠ a2.ppx.
Without loss of generality, assume that a1.ppx < a2.ppx. Then we have:
pmaPFH(cPFH, pid1, va1)
= px(ptePFH(cPFH, pid1, px(va1))) · PG SZ + bx(va1)
= a1.ppx · PG SZ + bx(va1)
< a1.ppx · PG SZ + PG SZ
≤ a2.ppx · PG SZ
≤ a2.ppx · PG SZ + bx(va2)
= px(ptePFH(cPFH, pid2, px(va2))) · PG SZ + bx(va2)
= pmaPFH(cPFH, pid2, va2).
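The arithmetic behind the isolation argument can be replayed with a small C sketch. It assumes a page size of 2^12 bytes; the function pma below is a hypothetical stand-in that takes the physical page index directly instead of looking it up in a page table, so it only illustrates the address composition used in the chains of (in)equalities above.

```c
#include <assert.h>
#include <stdint.h>

#define PG_SZ 4096u                    /* page size in bytes, assumed */
#define PX(a) ((a) >> 12u)             /* page index */
#define BX(a) ((a) & (PG_SZ - 1u))     /* byte index within the page */

/* Hypothetical helper: physical address of virtual address va whose page is
 * mapped to the physical page index ppx. */
static uint32_t pma(uint32_t ppx, uint32_t va)
{
    assert(ppx <= UINT32_MAX / PG_SZ);   /* keeps ppx * PG_SZ + BX(va) in range */
    return ppx * PG_SZ + BX(va);
}
```

Because BX(va) is always smaller than PG_SZ, two addresses with different physical page indices can never collide (Case 2.2), and two addresses with the same page index differ exactly when their byte indices differ (Case 2.1).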









Summary and Future Work
And that’s all I have to say about that.
– Forrest Gump
The future is never clear.
– Warren Buffett
This thesis presents, to the best of our knowledge, the first example of
pervasive formal verification of demand paging, a crucial component of every
modern operating system. For the first time we applied Verisoft’s semantics
stack, which allows results of sequential Hoare-logic-style reasoning about
systems software such as the page-fault handler to be combined at the level of
the low-level concurrent machine model. The results achieved within this
thesis are successfully integrated into the correctness theorem of CVM, a
verified framework for microkernel programmers.
At the time we started the verification, the implementation of demand paging
had been intensively tested on the VAMP simulator. Nevertheless, the
verification disclosed the following errors.
• The initialization code of demand paging lacked a call to the zero fill page
function. In fact, the zero-filled page was never initialized in the system.
• The page-fault handler lacked compatibility with executable bits of page
table entries supported by the VAMP hardware.
• The constant TOT BIG PGS, whose correct value is 1152, was incorrectly
assigned the value 1024.
• In case the page-fault handler was invoked in order to prevent swapping
out of a specified page, the page was identified only by its virtual page
index instead of a pair (pid,vpx). Only the pair uniquely defines the
physical page.
Formal theories in Isabelle/HOL developed in the scope of this work com-
prise more than 450 definitions and 1500 lemmas proven in up to 32000 steps.
Finally, we point out possible directions of future work.
• Verification of the memory-management CVM primitives cvm alloc and
cvm free. They share data structures with the demand paging code, and
hence one can reuse the valid abstraction relations of the PFH state
in the different semantics as well as the equivalence proofs between them.
This should significantly ease the verification effort.
• In the present work we focused on the integration of demand paging
into CVM and because of that assumed that the caller might request
the page-fault handler to guarantee the simultaneous presence of at most
two specified pages in the physical memory. However, in the implemen-
tation the corresponding parameter count is not limited by such means
but rather by the total number of user-available physical pages. One
might adapt the page-fault handler specification and proofs to support
the latter.
• This thesis considers a relatively simple page-fault handler which
implements a FIFO page replacement strategy. Among its disadvantages is the
fact that it only considers the swap-in time and does not keep track of how
frequently pages are accessed while in main memory. A good method to
approximate the optimal page replacement algorithm is the so-called FIFO with
second chance [Tan01], which itself approximates the better but
computationally rather expensive least recently used (LRU) strategy. A
page-fault handler implementing FIFO with second chance has been developed
within Verisoft [Con06]. This implementation is a good candidate to
illustrate formal verification of more advanced demand paging software.
Since the advanced implementation only extends the page-fault handler
state with an additional page management list and some variables, the
core parts of the specifications and proofs presented in this thesis might be
reused.
• Transferring correctness results from Simpl to the C0 big-step semantics
requires defining many variable lookup functions as well as the state
abstraction relation for concrete program states. One might add features to
the code translation tool to generate such functions automatically.
• Although we used mostly interactive verification techniques, we believe
there is room for automation. One might profit from methods of automated
verification while proving functional correctness of the source code. So far,
our experience with applying a software model checking tool [DMSS05] to
justify expression guards in the Hoare logic proofs has been unsatisfactory:
those guards that were proven by the tool automatically could equally be
proven by the built-in Isabelle simplifier.
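To make the FIFO with second chance strategy mentioned above concrete, the following C sketch shows the textbook victim selection. It is not the Verisoft implementation [Con06]: the type frame, the function second_chance_victim, and the array-based queue with a per-frame referenced flag are assumptions chosen for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct { unsigned ppx; bool referenced; } frame;

/* Hypothetical helper: select a victim among n frames kept in FIFO order
 * (q[0] is the oldest). A referenced head gets a second chance: its flag is
 * cleared and it is rotated to the tail. Returns the ppx of the evicted
 * frame, which is left in the last position of the queue. */
static unsigned second_chance_victim(frame *q, size_t n)
{
    assert(q != NULL && n > 0);
    for (;;) {
        frame head = q[0];
        for (size_t i = 1; i < n; i++)   /* rotate the queue by one */
            q[i - 1] = q[i];
        if (!head.referenced) {
            q[n - 1] = head;             /* victim parked at the tail */
            return head.ppx;
        }
        head.referenced = false;         /* second chance */
        q[n - 1] = head;
    }
}
```

The loop terminates after at most n rotations with cleared flags, at which point the policy behaves like plain FIFO. Compared with the handler verified in this thesis, the additional state is the referenced flag per page.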
Appendix A: Macros from the Implementation
#define PX(x) ((x) >> 12u)

#define BPX(page) ((page) >> 10u)

#define BPTE(pid,bpx) bpt[unsigned(pcb[pid].bpto) + bpx]
#define PTO_PT(pid) (((unsigned(pcb[pid].ef[PTO]) << 12u) - PT_START)\
/ (PTES_PER_PG * 4u))
#define PTE(pid,page) ((*pt)[(PTO_PT(pid) + (page / PTES_PER_PG))]\

#define VALID_MASK 2048u
#define INVALID_MASK (~VALID_MASK)
#define VALID(x) (((x) & VALID_MASK) != 0u)
#define PROT_MASK 1024u
#define PROT(x) (((x) & PROT_MASK) != 0u)
#define EXEC_MASK 512u
#define PPX_MASK 4294963200u
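The following C sketch shows how the mask macros decompose a page table entry. The macro values are taken from the listing above; the helper make_pte is hypothetical and only serves to build an entry for the demonstration. The layout (physical page index in the upper 20 bits, valid bit 11, protection bit 10) is read off the mask values.

```c
#include <assert.h>
#include <stdint.h>

#define VALID_MASK 2048u          /* bit 11 */
#define PROT_MASK  1024u          /* bit 10 */
#define PPX_MASK   4294963200u    /* 0xFFFFF000: upper 20 bits */
#define PX(x)      ((x) >> 12u)
#define VALID(x)   (((x) & VALID_MASK) != 0u)
#define PROT(x)    (((x) & PROT_MASK) != 0u)

/* Hypothetical helper: build a page table entry from a physical page index
 * and the valid and protection flags. */
static uint32_t make_pte(uint32_t ppx, int valid, int prot)
{
    assert(ppx < (1u << 20));     /* ppx must fit into the upper 20 bits */
    return (ppx << 12u) | (valid ? VALID_MASK : 0u) | (prot ? PROT_MASK : 0u);
}
```

For example, an entry for physical page 0x123 with the valid bit set and the protection bit clear is 0x00123800, and PX(pte & PPX_MASK) recovers 0x123.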
Appendix B: Loop Invariants and Ranking Functions
Initialization Code
First Loop
INV 1init?(cH) =
cH.kheapglob = PT START
∧ |cH.pcb-bptoglob| = MAX PID ∧ |cH.pcb-bptlglob| = MAX PID
∧ |cH.pcb-efglob| = MAX PID ∧ (∀ i < MAX PID : |cH.pcb-efglob[i]| = EF DIM)
∧ |cH.ppx2pdglob| = TOT PHYS PGS ∧ |cH.bpfreeglob| = TOT BIG PGS
∧ pt-absH?(cH, init-cPFH) ∧ bpt-absH?(cH, init-cPFH)
∧ init-pcb-ef?(cH)
∧ 0 < cH.nloc ≤ MAX PID
∧ (∀ 0 < i < cH.nloc : cH.pcb-efglob[i][PTO] = 1024+(i−1) · 9
∧ cH.pcb-efglob[i][PTL] = −1
∧ cH.pcb-bptoglob[i] = 0 ∧ cH.pcb-bptlglob[i] = −1)
∧ cH.alloc = [1] ∧ cH.free ≥ USER PGS · PD SZ
∧ cH.x = zero-fill-page(ZFP, c0H.x)
∧ pres({cH.bptglob})
RANK 1init(cH) = MAX PID − cH.nloc
Second Loop
INV 2init?(cH) =
cH.kheapglob = PT START
∧ |cH.ppx2pdglob| = TOT PHYS PGS ∧ |cH.bpfreeglob| = TOT BIG PGS
∧ active-absH?(cH, init-cPFH, [ ])
∧ pt-absH?(cH, init-cPFH) ∧ bpt-absH?(cH, init-cPFH)
∧ pto-absH?(cH, cPFH) ∧ ptl-absH?(cH, cPFH)
∧ bpto-absH?(cH, cPFH) ∧ bptl-absH?(cH, cPFH)
∧ init-pcb-ef?(cH)
∧ KERNEL PGS < cH.nloc ≤ TOT PHYS PGS
∧ (∃ q : dlistH?(cH.freeglob, cH.nextheap, cH.prevheap, q, [cH.nloc−KERNEL PGS+1..2])
∧ (∀ i < cH.nloc−KERNEL PGS : cH.pidheap(cH.nloc−KERNEL PGS−i+1) = 0
∧ cH.vpxheap(cH.nloc−KERNEL PGS−i+1) = 0
∧ cH.ppxheap(cH.nloc−KERNEL PGS−i+1) = n−i−1)
∧ (∀ KERNEL PGS ≤ i < cH.nloc : cH.ppx2pdglob[i] = 2+i−KERNEL PGS))
∧ cH.alloc = [cH.nloc − KERNEL PGS+1..1]
∧ TOT PHYS PGS · PD SZ ≤ cH.free + cH.nloc · PD SZ
∧ cH.x = zero-fill-page(ZFP, c0H.x)
∧ pres({cH.bptglob})
RANK 2init(cH) = TOT PHYS PGS− cH.nloc
Third Loop
INV 3init?(cH) =
cH.kheapglob = PT START
∧ |cH.bpfreeglob| = TOT BIG PGS
∧ active-absH?(cH, init-cPFH, [ ]) ∧ free-absH?(cH, init-cPFH, [USER PGS + 1..2])
∧ pt-absH?(cH, init-cPFH) ∧ bpt-absH?(cH, init-cPFH)
∧ pages-free-absH?(cH, cPFH)
∧ pto-absH?(cH, cPFH) ∧ ptl-absH?(cH, cPFH)
∧ bpto-absH?(cH, cPFH) ∧ bptl-absH?(cH, cPFH)
∧ init-pcb-ef?(cH)
∧ 0 < cH.nloc ≤ TOT BIG PGS
∧ (∀ i < cH.nloc : cH.bpfreeglob[i] = TOT BIG PGS−1−i)
∧ cH.alloc = [USER PGS+1..1]
∧ cH.x = zero-fill-page(ZFP, c0H.x)
∧ pres({cH.bptglob})
RANK 3init(cH) = TOT BIG PGS− cH.nloc
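How a loop invariant and a ranking function interact can be illustrated with a C sketch modeled on the third initialization loop. It is a sketch under assumptions, not the verified code: the array bpfree, the functions inv_holds and init_bpfree, and the loop counter starting at zero are hypothetical; only the invariant shape bpfree[i] = TOT BIG PGS − 1 − i and the rank TOT BIG PGS − n follow the formulas above.

```c
#include <assert.h>

#define TOT_BIG_PGS 1152u

static unsigned bpfree[TOT_BIG_PGS];

/* Invariant: every slot below n already holds TOT_BIG_PGS - 1 - i. */
static int inv_holds(unsigned n)
{
    for (unsigned i = 0; i < n; i++)
        if (bpfree[i] != TOT_BIG_PGS - 1u - i)
            return 0;
    return 1;
}

/* Hypothetical initialization loop with runtime checks of the invariant
 * and of the strictly decreasing ranking function. */
static void init_bpfree(void)
{
    unsigned n = 0;
    assert(inv_holds(n));                   /* invariant holds on entry */
    while (n < TOT_BIG_PGS) {
        unsigned rank = TOT_BIG_PGS - n;    /* ranking function before the step */
        bpfree[n] = TOT_BIG_PGS - 1u - n;
        n++;
        assert(inv_holds(n));               /* invariant is preserved */
        assert(TOT_BIG_PGS - n < rank);     /* rank decreases, so the loop ends */
    }
}
```

In the Hoare-logic proofs the same two obligations are discharged once and for all instead of being checked at run time: preservation of the invariant gives partial correctness, and the decreasing rank gives termination.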
Page-Fault Handler
First Loop
INV 1ta?(cH, cPFH, refsactive, refsfree) =
c0H.countloc = 1 ∧ cH.countloc ≤ 1
∧ cH.pages-freeglob = 0
∧ (cH.countloc = 1 −→ cH.victloc = cH.activeglob)
∧ (cH.countloc = 0 −→ cH.victloc = cH.nextheap(cH.activeglob)
∧ (cH.vpxheap(cH.activeglob) ≠ px(cH.addrloc)
∨ cH.pidheap(cH.activeglob) ≠ cH.pidloc))
∧ cH.vpxloc = px(cH.addrloc)
∧ cH.intentloc ≠ SWAP IN
∧ cH.pteloc =
cH.ptheap(cH.ptglob)[(cH.pcb-efglob[cH.pidloc][PTO] · PG SZ−PT START)/PG SZ
+ px(cH.addrloc)/PTES PER PG][ptea2(cH.addrloc)]
∧ user-pid?(cH.pidloc) ∧ cH.addrloc < TOT PGS · PG SZ ∧ cPFH.ptl[cH.pidloc] ≥ 0
∧ ¬ptlexcpPFH?(cPFH, cH.pidloc, cH.addrloc)
∧ ¬pfPFH?(cPFH, cH.pidloc, cH.addrloc, cH.intentloc)
∧ absH√(cH, cPFH, refsactive, refsfree)
∧ cH.x = c0H.x
∧ pres({cH.activeglob, cH.freeglob, cH.bpfreeglob, cH.ppx2pdglob, cH.ptglob,
cH.bptglob, cH.pages-usedglob, cH.pages-freeglob, cH.bpages-freeglob,
cH.pcb-efglob, cH.pcb-bptoglob, cH.pcb-bptoglob, cH.pcb-bptlglob,
cH.pidloc, cH.addrloc, cH.intentloc,
cH.ptheap, cH.pidheap, cH.vpxheap, cH.ppxheap, cH.nextheap, cH.prevheap})
RANK 1ta(cH) = cH.countloc
Second Loop
INV 2ta?(cH, cPFH, refsactive, refsfree) =
c0H.countloc = 1 ∧ cH.countloc ≤ 1
∧ cH.pages-freeglob = 0
∧ (cH.countloc = 1 −→ cH.active-tailloc = cH.activeglob
∧ cH.vpxheap(cH.activeglob) = px(cH.addrloc)
∧ cH.pidheap(cH.activeglob) = cH.pidloc)
∧ (cH.countloc = 0 −→ cH.active-tailloc = cH.nextheap(cH.activeglob))
∧ (cH.vpxheap(cH.activeglob) = px(cH.addrloc)
∧ cH.pidheap(cH.activeglob) = cH.pidloc
−→ cH.victloc = cH.activeglob)
∧ (cH.vpxheap(cH.activeglob) ≠ px(cH.addrloc)
∨ cH.pidheap(cH.activeglob) ≠ cH.pidloc
−→ cH.victloc = cH.nextheap(cH.activeglob))
∧ cH.vpxloc = px(cH.addrloc)
∧ cH.intentloc ≠ SWAP IN
∧ cH.pteloc =
cH.ptheap(cH.ptglob)[(cH.pcb-efglob[cH.pidloc][PTO] · PG SZ−PT START)/PG SZ
+ px(cH.addrloc)/PTES PER PG][ptea2(cH.addrloc)]
∧ user-pid?(cH.pidloc) ∧ cH.addrloc < TOT PGS · PG SZ ∧ cPFH.ptl[cH.pidloc] ≥ 0
∧ ¬ptlexcpPFH?(cPFH, cH.pidloc, cH.addrloc)
∧ ¬pfPFH?(cPFH, cH.pidloc, cH.addrloc, cH.intentloc)
∧ absH√(cH, cPFH, refsactive, refsfree)
∧ cH.x = c0H.x
∧ pres({cH.activeglob, cH.freeglob, cH.bpfreeglob, cH.ppx2pdglob, cH.ptglob,
cH.bptglob, cH.pages-usedglob, cH.pages-freeglob, cH.bpages-freeglob,
cH.pcb-efglob, cH.pcb-bptoglob, cH.pcb-bptoglob, cH.pcb-bptlglob,
cH.pidloc, cH.addrloc, cH.intentloc,
cH.ptheap, cH.pidheap, cH.vpxheap, cH.ppxheap, cH.nextheap, cH.prevheap})
RANK 2ta(cH) = cH.countloc
Appendix C: Mapping to Formal
Names in Isabelle/HOL
Name Formal name in Isabelle/HOL
Definition 2.2 VAMPasm2isaSystem/config correct::is dlx conft
Definition 2.4 VAMPisa/mem spec::compute pa
Definition 2.7 VAMPisa/dlxifspec::ipf
Definition 2.8 VAMPisa/dlxifspec::dpf
Definition 2.9 VAMPasm/Config::is ASMcore
Definition 2.10 VAMPisaDevices/dlxifspec dev hd::is valid idle hd
Definition 2.11 VAMPisaDevices/dlxifspec dev hd::swap disk exists
Definition 2.12 VAMPisaDevices/dlxifspec dev hd::{live input seq isa, efis welltyped}
Definition 2.13 VAMPisaDevices/dlxifspec dev hd::proc step number
Definition 2.14 VAMPisaDevices/dlxifspec dev hd::dev step times, dev step number
Theorem 2.16 C0SS2VAMPisaSystem/mini stack wo dev::C0 mini stack isa
Definition 2.17 cvm/config/c0 config::{linked tt, linked pt, linked st}
Definition 2.18 cvm/map/B relation::get p vm
Definition 2.19 cvm/map/B relation::{page index, byte index}
Definition 2.20 cvm/map/B relation::{page table origin, page table length}
Definition 2.21 cvm/map/B relation::page table entry
Definition 2.22 cvm/map/B relation::physical memory address
Definition 2.23 cvm/map/B relation::{big page index, big byte index}
Definition 2.24 cvm/map/B relation::big page table origin
Definition 2.25 cvm/map/B relation::big page table entry
Definition 2.26 cvm/map/B relation::swap memory address
Definition 2.27 cvm/map/B relation::valid bit
Definition 2.28 cvm/map/B relation::get mm
Definition 2.29 cvm/map/B relation::ASMcore equality
Definition 2.30 cvm/map/B relation::B relation
Definition 2.31 cvm/config/isa config::code invariant isa
Definition 2.32 pfh/Transfer/zfpCondition::zfp condition
Theorem 2.33 cvm/cvm correct/cvm correct::cvm correct
Definition 3.1 C0BS/ConformT::conforms
Theorem 3.2 HoareToBigStep/HoareToBigStep::validt hoare to C0 aux
Theorem 3.4 C0BSSSequiv/PropertyTransferBStoSS::
ultimate valid bs to valid ss transfer total
Theorem 3.5 C0BSSSequiv/PropertyTransferBStoSS::adapt hoare redex’
Definition 4.1 pfh/Validity::subtyping descriptor
Definition 4.2 pfh/Validity::{subtyping active, subtyping free}
Definition 4.3 pfh/Validity::subtyping bpfree stack
Definition 4.4 pfh/Validity::subtyping ptspace
Definition 4.5 pfh/Validity::subtyping bptspace
Definition 4.6 pfh/Validity::subtyping pcbs
Definition 4.7 pfh/Validity::subtyping pfh
Definition 4.8 pfh/pfhX::valid x
Definition 4.9 pfh/Utilities::ptspace origin
Definition 4.10 pfh/Utilities::ptspace length
Definition 4.11 pfh/Utilities::pte dim1
Definition 4.12 pfh/Utilities::pte dim2
Definition 4.13 pfh/Utilities::ptspace entry
Definition 4.14 pfh/Utilities::translate addr
Definition 4.15 pfh/Utilities::ptl excp
Definition 4.16 pfh/Utilities::invalid access
Definition 4.17 pfh/Utilities::zero protection
Definition 4.18 pfh/Utilities::page fault
Definition 4.19 pfh/Utilities::bpx of vpx
Definition 4.20 pfh/Utilities::bbx of vpx
Definition 4.23 pfh/Utilities::compute disk addr
Definition 4.24 pfh/Utilities::adjust swap addr
Definition 4.25 pfh/Utilities::adjust mem addr
Definition 4.26 pfh/pfhX::read from disk
Definition 4.27 pfh/pfhX::swap in
Definition 4.28 pfh/pfhX::write to disk
Definition 4.29 pfh/pfhX::swap out
Definition 4.30 pfh/pfhX::zero fill page
Def. 4.31,4.32,4.33 pfh/Configurations::abs pfh after handling
Definition 4.34 pfh/pfhX::fill page
Definition 4.35 pfh/pfhX::pfh touch addr POST modifying X state
Definition 4.36 pfh/Configurations::abs pfh after handling non pf
Definition 4.37 pfh/Validity::is used pid active
Definition 4.38 pfh/Validity::{valid ppx active, valid ppx free}
Definition 4.39 pfh/Validity::valid bit active
Definition 4.40 pfh/Validity::not protection bit active
Definition 4.41 pfh/Validity::page index active
Definition 4.42 pfh/Validity::{ppx active not zfp, ppx free not zfp}
Definition 4.43 pfh/Validity::distinct pid vpx active
Definition 4.44 pfh/Validity::distinct ppx active free
Definition 4.45 pfh/Validity::vpx inside ptl active
Definition 4.46 pfh/Validity::active describes valid pte
Definition 4.47 pfh/Validity::valid ppx ptspace
Definition 4.48 pfh/Validity::pt not overlap
Definition 4.49 pfh/Validity::pto plus ptl inside ptspace
Definition 4.50 pfh/Validity::ppx zfp protected
Definition 4.51 pfh/Validity::pte valid exec
Definition 4.52 pfh/Validity::ppx zfp valid
Definition 4.53 pfh/Validity::number of bpages eq free and used
Definition 4.54 pfh/Validity::bpte distinct
Definition 4.55 pfh/Validity::bpto bptl correct
Definition 4.56 pfh/Validity::pto mono
Definition 4.57 pfh/Validity::ptl pid alloc active
Definition 4.58 pfh/Validity::pto ptl correct
Definition 4.59 pfh/Validity::bptl correct
Definition 4.60 pfh/Validity::valid pfh pcbs
Theorem 4.61 pfh/pfhValidityLemmas::valid init pfh init pcbs
Theorem 4.62 pfh/pfhValidityLemmas::valid pfh pcbs abs pfh after handling
Theorem 4.63 pfh/pfhValidityLemmas::valid pfh pcbs abs pfh after handling non pf
Definition 5.1 Hoare/HeapList::List
Definition 5.2 dList/HeapdList::dList
Definition 5.3 pfh/Simpl/simplMapping::dList of pds map
Definition 5.4 pfh/Simpl/simplMapping::active map
Definition 5.5 pfh/Simpl/simplMapping::free map
Definition 5.6 pfh/Simpl/simplMapping::bpfree stack map
Definition 5.7 pfh/Simpl/simplMapping::ptspace map
Definition 5.8 pfh/Simpl/simplMapping::bptspace map
Definition 5.9 pfh/Simpl/simplMapping::ptos map
Definition 5.10 pfh/Simpl/simplMapping::ptls map
Definition 5.11 pfh/Simpl/simplMapping::bptos map
Definition 5.12 pfh/Simpl/simplMapping::bptls map
Definition 5.13 pfh/Simpl/simplMapping::number free pages map
Definition 5.14 pfh/Simpl/simplMapping::free big pages map
Definition 5.15 pfh/Simpl/simplMapping::pages used map
Definition 5.16 pfh/Simpl/simplMapping::ppx2pd map
Definition 5.17 pfh/Simpl/ValidSimplMapping::pfh pcbs map
Definition 5.18 pfh/Simpl/ValidSimplMapping::valid pfh pcbs map
Definition 5.19 pfh/Simpl/pfhInitSpec::initialized ef except pto ptl
Theorem 5.20 pfh/Simpl/pfhInitTotal::{pfh init spec, pfh init modifies}
Theorem 5.21 pfh/Simpl/pfhSwapInTotal::{pfh swap in spec, pfh swap in modifies}
Theorem 5.22 pfh/Simpl/pfhSwapOutTotal::{pfh swap out spec, pfh swap out modifies}
Definition 5.23 pfh/Utilities::no ptl excp unless swap in
Theorem 5.25 pfh/Simpl/pfhTouchAddrTotal::{pfh touch addr spec, pfh touch addr modifies}
Definition 6.1 pfh/Transfer/bsList::Listbs
Definition 6.2 pfh/Transfer/bsList::dListbs
Definition 6.3 pfh/Transfer/bsMapping::dList of pds mapbs
Definition 6.4 pfh/Transfer/bsMapping::active mapbs
Definition 6.5 pfh/Transfer/bsMapping::free mapbs
Definition 6.6 pfh/Transfer/bsMapping::bpfree stack mapbs
Definition 6.7 pfh/Transfer/bsMapping::ptspace mapbs
Definition 6.8 pfh/Transfer/bsMapping::bptspace mapbs
Definition 6.9 pfh/Transfer/bsMapping::ptos mapbs
Definition 6.10 pfh/Transfer/bsMapping::ptls mapbs
Definition 6.11 pfh/Transfer/bsMapping::bptos mapbs
Definition 6.12 pfh/Transfer/bsMapping::bptls mapbs
Definition 6.13 pfh/Transfer/bsMapping::number free pages mapbs
Definition 6.14 pfh/Transfer/bsMapping::free big pages mapbs
Definition 6.15 pfh/Transfer/bsMapping::pages used mapbs
Definition 6.16 pfh/Transfer/bsMapping::ppx2pd mapbs
Definition 6.17 pfh/Transfer/bsMapping::pfh pcbs mapbs
Definition 6.18 pfh/Transfer/bsMapping::valid pfh pcbs mapbs
Definition 6.19 HoareToBigStep/HoareToBigStep::absbs
Lemma 6.20 pfh/Transfer/MappingTransferSimplToBS::valid pfh pcbs map impl valid pfh pcbs mapbs
Lemma 6.21 pfh/Transfer/MappingTransferBSToSimpl::valid pfh pcbs mapbs impl valid pfh pcbs map
Definition 6.22 pfh/Transfer/pfhInitTransferSimplToBS::initial heap typing
Definition 6.23 pfh/Transfer/pfhCode::pfh init gm changed vars
Definition 6.24 pfh/Transfer/pfhCode::pfh touch addr gm changed vars
Lemma 6.27 pfh/Transfer/pfhInitTransferSimplToBS::pfh init hoare to bs’
Lemma 6.30 pfh/Transfer/pfhTouchAddrTransferSimplToBS::pfh touch addr hoare to bs’’
Definition 6.31 pfh/xSem::HDdriverXCall stmt
Definition 6.32 pfh/xSem::pt2xpt
Definition 6.33 pfh/pfhCode::pfh’bs’prog’
Lemma 6.35 C0BS/C0BSHoareValidity::c0validt extend program
Lemma 6.36 C0BS/C0BSHoareValidity::c0validt conseq
Lemma 6.37 pfh/Transfer/pfhInitTransferSimplToBS::pfh init hoare to bs’’’’
Lemma 6.38 pfh/Transfer/pfhTouchAddrTransferSimplToBS::pfh touch addr hoare to bs’’’’
Definition 6.39 pfh/Transfer/ssList::Listss
Definition 6.40 pfh/Transfer/ssList::dListss
Definition 6.41 pfh/Transfer/ssMapping::dList of pds mapss
Definition 6.42 pfh/Transfer/ssMapping::active mapss
Definition 6.43 pfh/Transfer/ssMapping::free mapss
Definition 6.44 pfh/Transfer/ssMapping::bpfree stack mapss
Definition 6.45 pfh/Transfer/ssMapping::ptspace mapss
Definition 6.46 pfh/Transfer/ssMapping::bptspace mapss
Definition 6.47 pfh/Transfer/ssMapping::ptos mapss
Definition 6.48 pfh/Transfer/ssMapping::ptls mapss
Definition 6.49 pfh/Transfer/ssMapping::bptos mapss
Definition 6.50 pfh/Transfer/ssMapping::bptls mapss
Definition 6.51 pfh/Transfer/ssMapping::number free pages mapss
Definition 6.52 pfh/Transfer/ssMapping::free big pages mapss
Definition 6.53 pfh/Transfer/ssMapping::pages used mapss
Definition 6.54 pfh/Transfer/ssMapping::ppx2pd mapss
Definition 6.55 pfh/Transfer/ssMapping::pfh pcbs mapss
Definition 6.56 pfh/Transfer/ssMapping::valid pfh pcbs mapss
Definition 6.57 C0BSSSequiv/ValueAbs::absVal
Definition 6.58 C0BSSSequiv/StateAbs::absState
Definition 6.59 pfh/Transfer/PfhBigStepToSmallStep::allocated cvm heap in heap
Lemma 6.60 pfh/Transfer/MappingTransferBSToSS::valid pfh pcbs mapbs impl valid pfh pcbs mapss
Definition 6.61 C0BSSSequiv/EquivStmt::absHeapType
Lemma 6.62 pfh/Transfer/MappingTransferSSToBS::valid pfh pcbs mapss impl valid pfh pcbs mapbs
Lemma 6.64 pfh/Transfer/pfhInitTransferBSToSS::pfh init bs to ss
Lemma 6.65 pfh/Transfer/pfhInitTransferBSToSS::pfh init ss adapt
Lemma 6.68 pfh/Transfer/pfhTouchAddrTransferBSToSS::pfh touch addr bs to ss
Lemma 6.69 pfh/Transfer/pfhTouchAddrTransferBSToSS::pfh touch addr bs to ss adapt
Definition 7.1 pfh/pfhX::{read from disk PRE, write to disk PRE}
Definition 7.2 pfh/xSem::XWtdSem’ss
Definition 7.3 pfh/xSem::XRfdSem’ss
Definition 7.4 pfh/pfhX::zero fill page PRE
Definition 7.5 pfh/xSem::XZfpSem’ss
Definition 7.6 pfh/xSem::pfh SS’xdefns
Definition 7.7 pfh/NoXCall/driverCorrectnessAsm::pfh memConsis
Definition 7.8 pfh/NoXCall/driverCorrectnessAsm::pfh swapConsis
Definition 7.9 pfh/NoXCall/driverCorrectnessAsm::driver xConsis
Definition 7.10 pfh/NoXCall/driverCorrectnessAsm::driver pt xpt xsem relation
Theorem 7.11 pfh/NoXCall/driverCorrectnessAsm::XCall driver correct asm
Theorem 7.12 pfh/NoXCall/driverCorrectnessIsa::XCall driver correct isa
Definition 7.13 pfh/NoXCall/driverCorrectnessCvm::{mem to pfhX, swap to pfhX}
Theorem 7.16 pfh/NoXCall/pfhInitTopLevel::pfh init correct
Definition 7.19 cvm/map/B relation::b ASMcore equality
Definition 7.20 cvm/map/B relation::b relation
Theorem 7.21 pfh/NoXCall/pfhTouchAddrTopLevel::pfh touch addr correct
Lemma 7.22 cvm/config/pfh conditions::ptspace entry plus byte index diff

Bibliography
[ABP09] E. Alkassar, S. Bogan, and W. Paul. Proving the correctness of
client/server software. 34:145–192, 2009.
[AHK+07] E. Alkassar, M. Hillebrand, S. Knapp, R. Rusev, and S. Tverdy-
shev. Formal device and programming model for a serial interface.
In B. Beckert, editor, Proceedings, 4th International Verification
Workshop (VERIFY), Bremen, Germany, pages 4–20. CEUR-WS
Workshop Proceedings, 2007.
[AHL+08] Eyad Alkassar, Mark A. Hillebrand, Dirk Leinenbach, Norbert W.
Schirmer, and Artem Starostin. The verisoft approach to sys-
tems verification. In 2nd IFIP Working Conference on Verified
Software: Theories, Tools, and Experiments (VSTTE’08), volume
5295 of LNCS, pages 209–224. Springer, 2008.
[AHL+09] E. Alkassar, M. A. Hillebrand, D. C. Leinenbach, N. W. Schirmer,
A. Starostin, and A. Tsyban. Balancing the load: Leveraging
semantics stack for systems verification. In Journal of Auto-
mated Reasoning: Special Issue on Operating Systems Verifica-
tion. Springer, 2009.
[Alk09] Eyad Alkassar. OS Verification Extended. On the Formal Ver-
ification of Device Drivers and the Correctness of Client/Server
Software. PhD thesis, Saarland University, Computer Science De-
partment, 2009.
[ASS08] Eyad Alkassar, Norbert Schirmer, and Artem Starostin. Formal
pervasive verification of a paging mechanism. In Juris Hartmanis
Gerhard Goos and Jan van Leeuwen, editors, 14th International
Conference on Tools and Algorithms for the Construction and
Analysis of Systems (TACAS’08), volume 4963 of LNCS, pages
109–123. Springer, 2008.
[Bev87] W. R. Bevier. A Verified Operating System Kernel. PhD thesis,
University of Texas at Austin, 1987.
[Bev89] W. R. Bevier. Kit: A study in operating system verification. IEEE
Transactions on Software Engineering, 15(11):1382–1396, 1989.
[Bey05] Sven Beyer. Putting It All Together: Formal Verification of the




[BHMY89a] William R. Bevier, Warren A. Hunt, Jr., J S. Moore, and
William D. Young. An approach to systems verification. 5(4):411–
428, December 1989.
[BHMY89b] W.R. Bevier, W.A. Hunt, J. Strother Moore, and W.D. Young.
Special issue on system verification. Journal of Automated Rea-
soning, 5(4):409–530, 1989.
[BHW06] Gerd Beuster, Niklas Henrich, and Markus Wagner. Real world
verification — experiences from the verisoft email client. In Pro-
ceedings of the Workshop on Empirical Succesfully Computerized
Reasoning (ESCoR 2006), 2006.
[BJK+03] S. Beyer, C. Jacobi, D. Kro¨ning, D. Leinenbach, and W.J. Paul.
Instantiating uninterpreted functional units and memory system:
functional verification of the vamp. In CHARME 2003, volume
2860 of LNCS, pages 51–65. Springer, 2003.
[BJK+06] Sven Beyer, Christian Jacobi, Daniel Kroening, Dirk Leinenbach,
and Wolfgang Paul. Putting it all together: Formal verification
of the VAMP. International Journal on Software Tools for Tech-
nology Transfer, 8(4–5):411–430, August 2006.
[Bog08] Sebastian Bogan. Formal Specification of a Simple Operating Sys-
tem. PhD thesis, Saarland University, Computer Science Depart-
ment, 2008.
[Bur72] R. Burstall. Some techniques for proving correctness of programs
which alter data structures. In B. Meltzer and D. Michie, editors,
Machine Intelligence 7, pages 23–50. Edinburgh University Press,
1972.
[CKS08] David Cock, Gerwin Klein, and Thomas Sewell. Secure micro-
kernels, state monads and scalable refinement. In Otmane Ait
Mohamed, Ce´sar Mu noz, and Sofie`ne Tahar, editors, Proceed-
ings of the 21st International Conference on Theorem Proving in
Higher Order Logics (TPHOLs’08), volume 5170 of Lecture Notes
in Computer Science, pages 167–182. Springer-Verlag, 2008.
[Con06] Cosmin Condea. Design and implementation of a page-fault han-
dler in c0. Master’s thesis, Saarland University, 2006.
[Dal06] Iakov Dalinger. Formal Verification of a Processor with Memory
Management Units. PhD thesis, Saarland University, Computer
Science Department, July 2006.
[DDB08] Matthias Daum, Jan Do¨rrenba¨cher, and Sebastian Bogan. Model
stack for the pervasive verification of a microkernel-based oper-
ating system. In 5th International Verification Workshop (VER-
IFY’08), volume 372 of CEUR Workshop Proceedings, pages 56–
70, 2008.
Bibliography 171
[DDSW08] Matthias Daum, Jan Do¨rrenba¨cher, Mareike Schmidt, and
Burkhart Wolff. A verification approach for system-level con-
current programs. In Verified Software: Theories, Tools, and Ex-
periments, volume 5295/2008 of LNCS, pages 161–176. Springer,
2008.
[DDW09] Matthias Daum, Jan Drrenbcher, and Burkhart Wolff. Proving
fairness and implementation correctness of a microkernel sched-
uler. In Gerwin Klein, Ralf Huuck, and Bastian Schlich, editors,
Journal of Automated Reasoning: Special Issue on Operating Sys-
tem Verification. Springer, 2009.
[DHP05] Iakov Dalinger, Mark Hillebrand, and Wolfgang Paul. On the
verification of memory management mechanisms. In D. Borrione
and W. Paul, editors, Proceedings of the 13th Advanced Research
Working Conference on Correct Hardware Design and Verifica-
tion Methods (CHARME 2005), volume 3725, pages 301–316.
Springer, 2005.
[DMSS05] M. Daum, S. Maus, N. Schirmer, and M.N. Seghir. Integration of
a software model checker into isabelle. In LPAR, volume 3835 of
LNCS, pages 381–395. Springer, 2005.
[EKE08] Dhammika Elkaduwe, Gerwin Klein, and Kevin Elphinstone. Ver-
ified protection model of the seL4 microkernel. In Jim Wood-
cock and Natarajan Shankar, editors, Second IFIP Working Con-
ference on Verified Software: Theories, Tools, and Experiments
(VSTTE 2008), volume 5295 of LNCS, pages 99–114, Toronto,
Canada, October 2008. Springer.
[End05] Endrawaty. Verification of the fiasco IPC implementation, 2005.
[FN79] R. Feiertag and P. Neumann. The foundations of a provably se-
cure operating system (PSOS). In Proceedings of the National
Computer Conference 48, pages 329–334, 1979.
[FSDG08] Xinyu Feng, Zhong Shao, Yuan Dong, and Yu Guo. Certify-
ing low-level programs with hardware interrupts and preemptive
threads. In 2008 ACM SIGPLAN Conference on Programming
Language Design and Implementation (PLDI’08), New York, NY,
USA, June 2008. ACM.
[GHLP05] Mauro Gargano, Mark Hillebrand, Dirk Leinenbach, and Wolf-
gang Paul. On the correctness of operating system kernels. In
J. Hurd and T. F. Melham, editors, 18th International Conference
on Theorem Proving in Higher Order Logics (TPHOLs 2005), vol-
ume 3603, pages 1–16. Springer, 2005.
[HEK+07] G. Heiser, K. Elphinstone, I. Kuz, G. Klein, and S.M. Petters.
Towards trustworthy computing systems: Taking microkernels to
the next level. Operating Systems Review, 41(4):3–11, 2007.
172 Bibliography
[Hil05] Mark Hillebrand. Address Spaces and Virtual Memory: Specifi-
cation, Implementation, and Correctness. PhD thesis, Saarland
University, Computer Science Department, June 2005.
[HIP05] Mark Hillebrand, Thomas In der Rieden, and Wolfgang Paul.
Dealing with I/O devices in the context of pervasive system ver-
ification. In ICCD ’05, pages 309–316. IEEE Computer Society,
2005.
[HP96] John L. Hennessy and David A. Patterson. Computer Architec-
ture: A Quantitative Approach. Morgan Kaufmann, San Mateo,
CA, second edition, 1996.
[HP07] M. A. Hillebrand and W. J. Paul. On the architecture of system
verification environments. In Haifa Verification Conference 2007,
October 23-25, 2007, Haifa, Israel, LNCS. Springer, 2007.
[HT03] M. Hohmuth and H. Tews. The semantics of C++ data
types: Towards verifying low-level system components. In
D. Basin and B. Wolff, editors, TPHOLs 2003, Emerging
Trends Proceedings, pages 127–144. 2003. Technical Report No.
187 Institut fu¨r Informatik Universita¨t Freiburg, url = cite-
seer.ist.psu.edu/article/hohmuth03semantics.html.
[HT05] Michael Hohmuth and Hendrik Tews. The VFiasco approach for a
verified operating system. In 2nd ECOOP Workshop on Program
Languages and Operating Systems (ECOOP-PLOS’05), 2005.
[IdRT08] T. In der Rieden and A. Tsyban. Cvm - a verified framework
for microkernel programmers. In 3rd International Workshop on
Systems Software Verification (SSV08). Elsevier Science B. V.,
2008.
[In 09] Tomas In der Rieden. Verifying CVM - The Kernel Parts. To
appear. PhD thesis, Saarland University, Computer Science De-
partment, 2009.
[JSM04] E. Northup S. Sridhar J. Shapiro, M. S. Doerrie and M. Miller.
Towards a verified, general-purpose operating system kernel. In
1st NICTA Workshop on Operating System Verification, October
2004.
[KDE09] Gerwin Klein, Philip Derrin, and Kevin Elphinstone. Experience
report: sel4 — formally verifying a high-performance microkernel.
In Proc. 2009 ACM SIGPLAN International Conference on Func-
tional Programming (ICFP), pages 91–96. ACM, August 2009.
[KEH+09] Gerwin Klein, Kevin Elphinstone, Gernot Heiser, June Andron-
ick, David Cock, Philip Derrin, Dhammika Elkaduwe, Kai Engel-
hardt, Rafal Kolanski, Michael Norrish, Thomas Sewell, Harvey
Tuch, and Simon Winwood. seL4: Formal verification of an OS
kernel. In Proc. 22nd ACM Symposium on Operating Systems
Principles (SOSP), pages 207–220, Big Sky, MT, USA, October
2009. ACM.
Bibliography 173
[KELS62] T. Kilburn, D. B. G. Edwards, M. J. Lanigan, and F. H. Sum-
ner. One-level storage system. IRE Transactions on Electronic
Computers, 11(2):223–235, 1962.
[KH92] Gerry Kane and Joe Heinrich. MIPS RISC architectures.
Prentice-Hall, Inc., 1992.
[KHPS61] T. Kilburn, D. J. Howarth, R. B. Payne, and F. H. Sumner. Atlas
operating system part i: Internal organization. The Computer
Journal, 4(3):222–225, 1961.
[Kle09] Gerwin Klein. Operating system verification — an overview.
Sa¯dhana¯, 34(1):27–69, February 2009.
[Kna08] Steffen Knapp. The Correctness of a Distributed Real-Time Sys-
tem. PhD thesis, Saarland University, Computer Science Depart-
ment, 2008.
[KR68] C.J. Kuehner and B. Randell. Demand paging in perspective. In
Fall Joint Computer Conference, 1968.
[Lei07] Dirk Leinenbach. Compiler Verification in the Context of Perva-
sive System Verification. PhD thesis, Saarland University, Com-
puter Science Department, 2007.
[LP08] D. Leinenbach and E. Petrova. Pervasive compiler verification –
from verified programs to verified systems. In 3rd intl Workshop
on Systems Software Verification (SSV08). Elsevier Science B. V.,
2008.
[LPP05] Dirk Leinenbach, Wolfgang Paul, and Elena Petrova. Towards
the formal verification of a C0 compiler: Code generation and im-
plementation correctness. In Bernhard Aichernig and Bernhard
Beckert, editors, 3rd International Conference on Software Engi-
neering and Formal Methods (SEFM 2005), 5–9 September 2005,
Koblenz, Germany, pages 2–11, 2005.
[MN03] Farhad Mehta and Tobias Nipkow. Proving pointer programs in
higher-order logic. In F. Baader, editor, Automated Deduction —
CADE-19, volume 2741, pages 121–135. Springer, 2003.
[Moo03] J Strother Moore. A grand challenge proposal for formal methods:
A verified stack. In Bernhard K. Aichernig and T. S. E. Maibaum,
editors, 10th Anniversary Colloquium of UNU/IIST, volume 2757,
pages 161–172. Springer, 2003.
[MP00] Silvia M. Mueller and Wolfgang J. Paul. Computer Architecture:
Complexity and Correctness. Springer, 2000.
[NBF+80] P.G. Neumann, R.S Boyer, R.J. Feiertag, K.N. Levitt, and
L. Robinson. A provably secure operating system: The system, its
applications, and proofs. Technical Report CSL-116. Computer
Science Laboratory, SRI International, Menlo Park, California,
May 1980.
174 Bibliography
[NF03] P. Neumann and R. Feiertag. PSOS revisited. In 19th Annual
Computer Security Applications Conference, 2003.
[Ngu05] V.G. Nguiekom. Verifikation von doppelt verketteten Listen auf
Pointerebene. Master’s thesis, Saarland University, 2005.
[NPW02] Tobias Nipkow, Lawrence C. Paulson, and Markus Wenzel. Is-
abelle/HOL: A Proof Assistant for Higher-Order Logic, volume
2283. Springer, 2002.
[NS06] Zhaozhong Ni and Zhong Shao. Certified assembly programming
with embedded code pointers. In POPL ’06: Conference record
of the 33rd ACM SIGPLAN-SIGACT symposium on Principles
of programming languages, pages 320–333, New York, NY, USA,
2006. ACM.
[NYS07] Zhaozhong Ni, Dachuan Yu, and Zhong Shao. Using xcap to
certify realistic systems code: Machine context management. In
TPHOLs, pages 189–206, 2007.
[Pet07] Elena Petrova. Verification of the C0 Compiler Implementation
on the Source Code Level. PhD thesis, Saarland University, Com-
puter Science Department, May 2007.
[Sch05] Norbert Schirmer. A verification environment for sequen-
tial imperative programs in Isabelle/HOL. In F. Baader and
A. Voronkov, editors, Logic for Programming, Artificial Intel-
ligence, and Reasoning, 11th International Conference, LPAR
2004, volume 3452, pages 398–414. Springer, 2005.
[Sch06] Norbert Schirmer. Verification of Sequential Imperative Programs
in Isabelle/HOL. PhD thesis, Technische Universita¨t Mu¨nchen,
Institut fu¨r Informatik, 2006.
[Sha06] Andrey Shadrin. Design and implementation of the portmapper
and rpc primitives in the context of the sos. Master’s thesis,
Saarland University, 2006.
[Smi78] Alan Jay Smith. Bibliography on paging and related topics.
SIGOPS Oper. Syst. Rev., 12(4):39–56, 1978.
[ST08a] Artem Starostin and Alexandra Tsyban. Correct microkernel
primitives. In 3rd International Workshop on Systems Software
Verification (SSV08). Elsevier Science B. V., 2008.
[ST08b] Artem Starostin and Alexandra Tsyban. Verified process-context
switch for c-programmed kernels. In Natarajan Shankar and Jim
Woodcock, editors, 2nd IFIP Working Conference on Verified
Software: Theories, Tools, and Experiments (VSTTE’08), vol-
ume 5295 of LNCS, pages 240–254. Springer, 2008.
[Sta06] Artem Starostin. Formal verification of a c-library for strings.
Master’s thesis, Saarland University, 2006.
Bibliography 175
[SW00] Jonathan S. Shapiro and Samuel Weber. Verifying the EROS
confinement mechanism. In IEEE Symposium on Security and
Privacy, pages 166–176, May 2000.
[Tan97] Andrew S. Tanenbaum. Operating Systems: Design and Imple-
mentation (Second Edition). Prentice-Hall, 1997.
[Tan01] Andrew S. Tanenbaum. Modern Operating Systems (Second Edi-
tion). Prentice-Hall, 2001.
[Tsy09] Alexandra Tsyban. Formal Verification of a Framework for Mi-
crokernel Programmes. PhD thesis, Saarland University, Com-
puter Science Department, 2009.
[Tve09] Sergey Tverdyshev. Formal Verification of Gate-Level Computer
Systems. PhD thesis, Saarland University, Computer Science De-
partment, 2009.
[WKP79] Bruce J. Walker, Richard A. Kemmerer, and Gerald J. Popek.
Specification and verification of the UCLA Unix security kernel.
In SOSP ’79: Proceedings of the seventh ACM symposium on




− ⊢H − − −, 44
− ⊢tH − − −, 44
− H − − −, 44
− tH − − −, 44
−,−,− BS − − −, 48
−,−,− tBS − − −, 48
−,−,− ⊢SS − → . . . (∞), 54
∆(−,−) = {. . .}, 45
−,−,− ⊢ −√, 48
− ⊢v − :: −, 48
〈−〉, 5
− ⊢ − :: −,−,−, 49

assembly w/dev., CASM+DS, 19
assembly, CASM, 12
C0 BS, CBS, 47
178 Index
C0 SS, CSS, 22
CVM, CCVM, 25
extended state, CX, 60
hard disk, CHD, 14
ISA w/dev., CISA+DS, 18
ISA, CISA, 8

modifies BSta ?, 114

PGS PER BIG PG, 33
PG SZ, 33






POST techinit ?, 143
POST Xinit?, 110
POST Hswap in?, 92
POST Hswap out?, 93
POST BSta ?, 111
POST Hta ?, 96
POST SSta ?, 127
POST techta ?, 146













PRE techinit ?, 142
PRE Xinit?, 110
PRE Hswap in?, 91
PRE Hswap out?, 92
PRE BSta ?, 111
PRE Hta ?, 96
PRE SSta ?, 127
PRE techta ?, 145

assembly w/dev., δASM+DS, 19
assembly, δASM, 13
C0 BS, − ⊢BS 〈−,−〉 ⇒ −, 48
C0 SS, δSS, 22
CVM, δCVM, 26
extended C0 SS, δSSX, 52
ISA w/dev., δISA+DS, 18
ISA, δISA, 11

TOT BIG PGS, 34
TOT PGS, 34
TOT PGS PT, 35
TOT PHYS PGS, 33
trans-inv?, 54
translexcpISA?, 10
typd, 108
typt, 108
ty-t, 20
user-modeISA?, 9
user-pid-active?, 71
user-ppx-active-free?, 71
user-ppx-pt?, 73
userprocs-t, 25
USER PGS, 33
valid?, 63
valid-active?, 71
validCVM, 30
valid-is-exec-pt?, 74
validity
assembly, asm√, 13
C0 SS, C0√, C0′√, 22
execution sequence, seq√, 17
hard disk, hd√, 16
ISA, isa√, 9
PFH abstraction, pfh√, 76
valid-pte-descr-active?, 73
validSS, 52
val-t, 46
vm, 29
vpx-less-ptl-active?, 72
wf-prog?, 49
wf-retvars?, 53
write-to-disk, 66
xconsis?, 134
xconsisft?, 135
xconsismem?, 134
xconsisswap?, 134
xsemPFH, 133
xsemPFH, 112
xsemread, 133
xsem-t, 47
xsemwrite, 133
xsemzfp, 133
zero-fill-page, 66
ZFP, 33
zfp-cond?, 31
zfp-eq-prot-pt?, 74
zfp-is-valid-pt?, 74
zppf?, 63
