On Transformations of Load-Store Maurer Instruction Set Architecture by Hou, Tie
arXiv:0808.2584v2 [cs.AR] 20 Jan 2009
On Transformations of Load-Store Maurer
Instruction Set Architectures
Tie Hou
Informatics Institute, University of Amsterdam,
Kruislaan 403, 1098 SJ Amsterdam, The Netherlands
November 19, 2018
1 Introduction
Maurer proposes a model for computers from the viewpoint of general function
and set theory in [7, 8]. Classical mathematical machines (Turing machines,
push-down automata, etc.) are widely regarded as inadequate representations of
modern computers; Maurer's model is intended to be closer to real computers.
Maurer machines [1], introduced by Bergstra and Middelburg, are based on this
model and on basic thread algebra, together with an operator for applying
threads to Maurer machines.
Basic thread algebra (BTA), which was introduced as Basic Polarized Process
Algebra (BPPA) in [5], is a theory that describes the behaviour of deterministic
sequential programs under execution. In BTA, these behaviours are called
threads (see more in [3]).
In load-store (or register-register) architectures (see, e.g., [6]), memory is
accessed only by explicit load and store instructions. Load instructions read
data from the memory and copy them to registers. Store instructions write data
from registers to the memory. Computers of today commonly use load-store
architectures, because (1) register access is faster than memory access;
(2) registers allow for compiler optimisations, e.g., the subexpressions of an
expression may be evaluated in any order; (3) registers can hold all the
variables relevant for a specific code segment, so operations on them are faster.
In [2], Bergstra and Middelburg introduced the concept of a strict load-store
Maurer instruction set architecture (strict load-store Maurer ISA, for short)
and studied under what conditions all transformations on the states of the
memory of a strict load-store Maurer ISA can be achieved.
A load-store instruction set architecture has three main parts: a
memory that contains data, registers, and an operating unit that processes data.
In this paper, we study how certain conditions affect the achievable
transformations when half of the data memory serves as part of the operating unit.
The rest of the paper is organised as follows. First of all, we review basic
thread algebra and Maurer machines in Section 2 and Section 3, respectively.
Next, in Section 4, we describe the apply operator. Following this,
we explain strict load-store instruction set architectures in Section 5. After
that, in Section 6, we review the concept of thread powered function classes
and present two completeness results. Then we give an incompleteness result in
Section 7. Finally, we give some concluding remarks in Section 8.
2 Basic Thread Algebra
Consider a fixed but arbitrary finite set A of basic actions with tau ∉ A. We
denote A ∪ {tau} by Atau. The signature of BTA consists of the following
constants and operators:
1. the deadlock constant D;
2. the termination constant S;
3. for each a ∈ Atau, a binary postconditional composition operator _ ⊴ a ⊵ _.
The constant D denotes inactive behaviour and the constant S denotes
successfully terminating behaviour. A single action is not a thread, and finite
threads always end in S or D. The thread x ⊴ a ⊵ y first performs a and then
proceeds as x if the processing of a produces the positive reply T, and as y if
the processing of a produces the negative reply F. We abbreviate P ⊴ a ⊵ P
using the action prefixing operator as a ◦ P, and take ◦ to bind strongest. The
action tau always produces a positive reply. The axiom for this action is given
in Table 1. Using the action prefixing operator, axiom T1 can also be written
as x ⊴ tau ⊵ y = tau ◦ x.
Table 1: Axioms for BTA
x ⊴ tau ⊵ y = x ⊴ tau ⊵ x    T1
Every thread in BTA is finite in the sense that the number of consecutive
actions it can perform is bounded. Infinite threads can be defined using guarded
recursive specifications.
A guarded recursive specification over BTA is a set of recursion equations
E = {Xi = ti(X) | Xi ∈ VE}, where VE = {X1, X2, . . . , Xn} is the set of all
variables that occur on the left-hand side of an equation in E, X is a vector
containing all variables in VE, i.e. X = X1, . . . , Xn, and each ti is a term of
the form D, S or t ⊴ a ⊵ t′ (where t and t′ are terms of BTA that contain only
variables from X).
A solution for a recursive equation is a thread that solves the equation.
We use the constant 〈Xi|E〉 to denote the solution for the recursive equation
(Xi = ti(X)) ∈ E. A solution for a guarded recursive specification E, with
VE = {X1, . . . , Xn}, is a vector 〈X1|E〉, . . . , 〈Xn|E〉 such that substituting each
variable in VE by its respective solution turns all equations in E into true
statements. Once E is declared, 〈Xi|E〉 can be abbreviated by 〈Xi〉. We give the
axioms for guarded recursion in Table 2. The recursive definition principle (RDP)
states that 〈X1|E〉, . . . , 〈Xn|E〉 is a solution for E. The recursive specification
principle (RSP) states that this solution is the only one.
Table 2: Axioms for guarded recursion
〈Xi|E〉 = ti(〈X1|E〉, . . . , 〈Xn|E〉) (i ∈ {1, . . . , n})    RDP
E ⇒ Xi = 〈Xi|E〉    RSP
We write BTA+REC for BTA extended with the constants for solutions of
guarded recursive specifications and axioms RDP and RSP.
From now on, we write Efin(A), where A ⊆ A, for the set of all finite guarded
recursive specifications over BTA that contain only postconditional operators
⊴ a ⊵ for which a ranges over A, and Tfinrec(A), where A ⊆ A, for the set
of all closed terms of BTA + REC that contain only postconditional operators
⊴ a ⊵ for which a ranges over A and only constants 〈Xi|E〉 for which E ranges
over Efin(A).
We give the following definition of the set of thread states, which will be
used later in Section 6.
Definition 1. Let A be some model of BTA + REC, and let p be an element
from the domain of A. Then the set of states of p, written Res(p), is inductively
defined as follows:
1. p ∈ Res(p);
2. if q ⊴ a ⊵ r ∈ Res(p), then q, r ∈ Res(p).
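Definition 1 can be illustrated for finite threads with a small sketch. The representation below (the strings "S" and "D" for the constants, and a hypothetical `Post` class for postconditional composition) is an assumption made for illustration only; it is not part of BTA or of the cited papers, and it does not cover threads defined by recursion.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Post:
    """Postconditional composition: left ⊴ action ⊵ right (hypothetical encoding)."""
    left: "Thread"
    action: str
    right: "Thread"


# A finite thread is the termination constant "S", the deadlock constant "D",
# or a postconditional composition of two finite threads.
Thread = Union[str, Post]


def prefix(action: str, p: Thread) -> Thread:
    """Action prefixing a ◦ p, i.e. p ⊴ a ⊵ p."""
    return Post(p, action, p)


def res(p: Thread) -> set:
    """Set of states Res(p) of a finite thread, following Definition 1."""
    states = {p}            # clause 1: p ∈ Res(p)
    stack = [p]
    while stack:
        q = stack.pop()
        if isinstance(q, Post):
            # clause 2: both branches of a composition are states as well
            for nxt in (q.left, q.right):
                if nxt not in states:
                    states.add(nxt)
                    stack.append(nxt)
    return states


p = Post(prefix("b", "S"), "a", "D")   # (b ◦ S) ⊴ a ⊵ D
print(len(res(p)))                     # 4 states: p, b ◦ S, S and D
```

The cardinality of `res(p)` is the quantity that is bounded by the parameter e in Section 6.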
In subsequent sections, threads with more than one initial state, such as the
one shown in Figure 1, are not used.
[Figure 1: Connected Thread. Two initial states, a ◦ S and b ◦ S, perform a and b respectively, and both lead to S.]
3 Maurer Machines
In this section we review Maurer machines, which were first introduced in [1].
Most modern computers use the binary system, i.e., information is exchanged
and processed internally using 2 as the numerical base. In principle, any other
integer greater than 1, such as 3, 5 or 8, can serve as the base. A computer
constructed to the base n represents all information using only the digits from
0 through n − 1. We assume that the base n is constant over the whole computer.
Every computer has a memory. We represent the memory of a computer
as a set M . Registers are regarded as subsets of M . We consider a set B as
the base set, whose cardinality is the base of the computer. If the base of a
computer is n, the base set of this computer is the set of all integers from 0 to
n − 1. A state of the computer is represented as an arbitrary map from M to
B . We can change one state to another by performing operations.
Maurer machines are based on this simple model of computers. The memory
of a Maurer machine consists of memory elements. Every memory element
contains a value from the base set of the Maurer machine as a content. The
contents of all memory elements build up a state of the Maurer machine. The
Maurer machine processes a basic action by performing the operation associated
with the basic action. The execution of an operation carries out the passing from
one state to the next. As a result of state changes, the content of the memory
element associated with the basic action is changed to the reply produced by
the Maurer machine.
Now we give the following definition of a Maurer machine.
Definition 2. Let M be a non-empty set, let B be a set with card(B) ≥ 2 (we
assume that B contains at least the two members T and F), let S be a set of
functions S : M → B, let O be a set of functions O : S → S, let A ⊆ A be a set,
and let ⟦_⟧ : A → (O × M) be a function, satisfying the following conditions:
- if S1, S2 ∈ S, M′ ⊆ M, and S3 : M → B is such that S3(x) = S1(x) if
x ∈ M′ and S3(x) = S2(x) if x ∉ M′, then S3 ∈ S;
- if S1, S2 ∈ S, then the set {x ∈ M | S1(x) ≠ S2(x)} is finite;
- if S ∈ S, a ∈ A, and ⟦a⟧ = (O, m), then S(m) ∈ {T, F}.
Then the 6-tuple H = (M, B, S, O, A, ⟦_⟧) is a Maurer machine. The set M is
the memory of H; the set B is the base set of H; the members of S are the
states of H; the members of O are the operations of H; the members of A are
the basic actions of H; and the function ⟦_⟧ is the basic action interpretation
function of H.
Every operation O : S → S is associated with two subsets of M . For
example, if we want to move the data in the memory Y to the register R, we
are implying Y and R are proper subsets of M . We give the relation between
O and these two subsets by the following notions of input and output regions
of an operation, which will be used later in Section 5.
Definition 3. Let H = (M, B, S, O, A, ⟦_⟧) be a Maurer machine, and let O :
S → S. Then we define the input region of O, written IR(O), and the output
region of O, written OR(O), which are subsets of M, as follows:
IR(O) = {x ∈ M | ∃S1, S2 ∈ S. (∀z ∈ M \ {x}. S1(z) = S2(z) ∧
∃y ∈ OR(O). O(S1)(y) ≠ O(S2)(y))},
OR(O) = {x ∈ M | ∃S ∈ S. S(x) ≠ O(S)(x)}.
According to this definition, in the above example, we call Y the input region
and R the output region of O . Each operation takes data only from its input
region and places data only in its output region.
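The input and output regions can be computed by brute force for a very small machine. The sketch below assumes a hypothetical two-element memory (`y` for a data element, `r` for a register), a binary base set, and a hypothetical operation `move` that copies `y` into `r`; none of these names come from the cited papers.

```python
from itertools import product

M = ("y", "r")   # hypothetical memory: one data element and one register
B = (0, 1)       # hypothetical base set

# All states S : M -> B, represented as dicts.
STATES = [dict(zip(M, vals)) for vals in product(B, repeat=len(M))]


def move(S):
    """Hypothetical operation: copy the content of y into r."""
    T = dict(S)
    T["r"] = S["y"]
    return T


def output_region(O):
    """OR(O): elements whose content is changed by O in at least one state."""
    return {x for x in M for S in STATES if S[x] != O(S)[x]}


def input_region(O):
    """IR(O): elements x such that two states differing at most at x
    give different results somewhere in OR(O)."""
    out = output_region(O)
    region = set()
    for x in M:
        for S1, S2 in product(STATES, repeat=2):
            agree_elsewhere = all(S1[z] == S2[z] for z in M if z != x)
            if agree_elsewhere and any(O(S1)[y] != O(S2)[y] for y in out):
                region.add(x)
    return region


print(output_region(move))   # {'r'}
print(input_region(move))    # {'y'}
```

This agrees with the example in the text: the memory element is the input region and the register is the output region of the move operation.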
4 Application of Threads to Maurer Machines
The binary apply operator •H connects a thread and a state of a Maurer
machine, and yields either a state of the Maurer machine or the undefined state
↑. In other words, p •H S is the state that results when the Maurer machine
H = (M, B, S, O, A, ⟦_⟧) executes all the basic actions performed by the thread
p ∈ Tfinrec(A) from the initial state S ∈ S. Let (Oa, ma) = ⟦a⟧ for all a ∈ A.
H executes a basic action a by performing Oa. This leads to a state change. In
the resulting state, the reply produced by H is the content of ma. If p is S, the
state does not change. If p is D, the result is ↑.
Then we give the following defining equations for the apply operator in
Table 3, where a ranges over A, and S ranges over S.
Table 3: Defining equations for apply operator
x •H ↑ = ↑
S •H S = S
D •H S = ↑
(x ⊴ a ⊵ y) •H S = x •H Oa(S) if Oa(S)(ma) = T
(x ⊴ a ⊵ y) •H S = y •H Oa(S) if Oa(S)(ma) = F
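The defining equations of Table 3 can be read as a simple interpreter loop. The following is a minimal sketch for finite threads only, under assumed encodings: the strings "S" and "D" for the thread constants, a hypothetical `Post` class for postconditional composition, dicts for states, `None` for the undefined state ↑, and a hypothetical action `dec` with reply element `rr`.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple, Union


@dataclass(frozen=True)
class Post:
    """Postconditional composition: left ⊴ action ⊵ right (hypothetical encoding)."""
    left: "Thread"
    action: str
    right: "Thread"


Thread = Union[str, Post]            # "S", "D", or a postconditional composition
State = Dict[str, object]
# Interpretation of basic actions: action -> (operation O_a, reply element m_a)
Interp = Dict[str, Tuple[Callable[[State], State], str]]


def apply_thread(p: Thread, S: Optional[State], interp: Interp) -> Optional[State]:
    """Compute p •_H S for a finite thread p; None stands for ↑."""
    while True:
        if S is None or p == "D":    # x •_H ↑ = ↑  and  D •_H S = ↑
            return None
        if p == "S":                 # S •_H S = S
            return S
        O_a, m_a = interp[p.action]
        S = O_a(S)                   # perform the operation of the action
        p = p.left if S[m_a] == "T" else p.right


def dec(S: State) -> State:
    """Hypothetical operation: decrement counter c if positive, reply in rr."""
    T = dict(S)
    if S["c"] > 0:
        T["c"], T["rr"] = S["c"] - 1, "T"
    else:
        T["rr"] = "F"
    return T


interp = {"dec": (dec, "rr")}
p = Post(Post("S", "dec", "D"), "dec", "D")   # (S ⊴ dec ⊵ D) ⊴ dec ⊵ D
print(apply_thread(p, {"c": 2, "rr": "T"}, interp))   # {'c': 0, 'rr': 'T'}
print(apply_thread(p, {"c": 1, "rr": "T"}, interp))   # None
```

The loop terminates because the thread is finite; threads given by guarded recursive specifications would need a different treatment and are outside this sketch.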
5 Strict Load-Store Maurer ISAs
In this section we review a strict load-store Maurer ISA [2, 4].
The basic idea of a strict load-store Maurer ISA is the following: in the
setting of Maurer machines, a segmented memory is used as a main memory to
contain data, and a small segmented memory is used as an operating unit to
process data, as shown in Figure 2. Only load and store instructions can access
the data memory, moving data from the memory to a register, or from a
register to the memory, respectively. All other instructions (e.g., instructions
for data manipulation) can use only register operands. Operations (such as
calculating a data address, addition, subtraction, AND, shifts, etc.) take their
operands from registers and are executed in the operating unit. The result is
stored back to a register. Without loss of generality, we assume that data is
restricted to the natural numbers.
A strict load-store Maurer ISA has the following parameters:
- an address width k;
- a word length l;
- a bit size m of the operating unit;
- a number u of pairs of address and data registers for load instructions;
- a number v of pairs of address and data registers for store instructions;
- a set A′ of basic instructions for data manipulation.
[Figure 2: Strict Load-Store Maurer ISA. Load instructions and store instructions move data between the data memory and the registers (R0, R1, R2, ...); the operating unit executes data manipulation instructions, which operate on registers only (e.g. add R0 R1 R2); the diagram also shows the memory element rr.]
The symbols can be regarded as follows:
- k: the number of bits used for the binary representation of addresses of
data memory elements;
- l: the number of bits used to represent data in data memory elements;
- m: the number of bits that the internal memory of the operating unit
contains.
The data memory is a fixed but arbitrary set Mdata which has a cardinality
of 2^k, as shown in Figure 3. Its elements can contain natural numbers as data in
the interval [0, 2^l − 1] (written Bdata), and can be addressed by natural numbers
in the interval [0, 2^k − 1] (written Baddr). Hence, we give a fixed but arbitrary
bijection mdata : Baddr → Mdata.
The operating unit memory is a fixed but arbitrary set Mou which has a
cardinality of m. Its elements can contain natural numbers in the set {0, 1}
(written Bit), i.e., bits.
[Figure 3: Data Memory. Each element has an address in [0, 2^k − 1] and holds data in [0, 2^l − 1].]
Registers are used to move data between the data memory and the operating
unit memory. Load address registers and load data registers are fixed but
arbitrary sets Mla and Mld respectively, each of cardinality u. Store address
registers and store data registers are fixed but arbitrary sets Msa and Msd
respectively, each of cardinality v. The contents of Mla and Msa are taken
as addresses, which are members of Baddr, while the contents of Mld and Msd
are taken as data, which are members of Bdata. Writing Bload for [0, u − 1] and
Bstore for [0, v − 1], we give fixed but arbitrary bijections
mld : Bload → Mld, mla : Bload → Mla, msd : Bstore → Msd and msa : Bstore → Msa.
The memory element rr stores the reply of processing Oa, the operation
associated with the basic action a.
We assume that Mdata, Mou, Mld, Msd, Mla, Msa and {rr} are pairwise disjoint
sets. The role of each of these sets is shown in Figure 4. Let n ∈ Baddr,
n′ ∈ Bload and n′′ ∈ Bstore. Then mdata(n) is denoted by Mdata[n], mld(n′) by
Mld[n′], mla(n′) by Mla[n′], msd(n′′) by Msd[n′′] and msa(n′′) by Msa[n′′].
We give the following definition of a strict load-store Maurer ISA.
Definition 4. A strict load-store Maurer ISA with parameters k, l, m, u, v
and A′ is a Maurer machine H = (M, B, S, O, A, ⟦_⟧) with
M = Mdata ∪ Mou ∪ Mld ∪ Msd ∪ Mla ∪ Msa ∪ {rr},
B = [0, j] ∪ {T, F} for j = max(2^k − 1, 2^l − 1),
S = {S : M → B |
∀m ∈ Mdata ∪ Mld ∪ Msd. S(m) ∈ Bdata ∧
∀m ∈ Mla ∪ Msa. S(m) ∈ Baddr ∧
∀m ∈ Mou. S(m) ∈ Bit ∧ S(rr) ∈ {T, F}},
O = {Oa | a ∈ A},
A = {load:n | n ∈ Bload} ∪ {store:n | n ∈ Bstore} ∪ A′,
⟦a⟧ = (Oa, rr) for all a ∈ A,
where for all n ∈ Bload, Oload:n is the unique function from S to S such that for
all S ∈ S:
Oload:n(S) ↾ (M \ {Mld[n], rr}) = S ↾ (M \ {Mld[n], rr}),
Oload:n(S)(Mld[n]) = S(Mdata[S(Mla[n])]),
Oload:n(S)(rr) = T,
and, for all n ∈ Bstore, Ostore:n is the unique function from S to S such that for
all S ∈ S:
Ostore:n(S) ↾ (M \ {Mdata[S(Msa[n])], rr}) = S ↾ (M \ {Mdata[S(Msa[n])], rr}),
Ostore:n(S)(Mdata[S(Msa[n])]) = S(Msd[n]),
Ostore:n(S)(rr) = T,
and, for all a ∈ A′, Oa is a function from S to S such that:
IR(Oa) ⊆ Mou ∪ Mld,
OR(Oa) ⊆ Mou ∪ Msd ∪ Mla ∪ Msa ∪ {rr}.
[Figure 4: The Set Indications. The diagram of Figure 2 annotated with the sets Mdata (data memory), Mld, Mla, Msd and Msa (registers), Mou (operating unit) and {rr} (the memory element rr).]
We denote the set of all strict load-store Maurer ISAs with parameters k, l,
m, u, v and A′ by MISAsls(k, l, m, u, v, A′).
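The load and store operations of Definition 4 can be sketched as functions on dict-based states. The key names (`("data", i)`, `("la", n)`, `("ld", n)`, `("sa", n)`, `("sd", n)`, `"rr"`) and the helper `make_state` are assumptions made for illustration; the operating unit memory Mou and the range checks on Bdata and Baddr are omitted from this sketch.

```python
def make_state(k, u, v):
    """Hypothetical initial state: every element holds 0, the reply element holds T."""
    S = {("data", i): 0 for i in range(2 ** k)}     # Mdata, addressed by [0, 2^k - 1]
    S.update({("la", n): 0 for n in range(u)})      # load address registers Mla
    S.update({("ld", n): 0 for n in range(u)})      # load data registers Mld
    S.update({("sa", n): 0 for n in range(v)})      # store address registers Msa
    S.update({("sd", n): 0 for n in range(v)})      # store data registers Msd
    S["rr"] = "T"
    return S


def load(n, S):
    """O_load:n: copy Mdata[S(Mla[n])] into Mld[n]; only Mld[n] and rr may change."""
    T = dict(S)
    T[("ld", n)] = S[("data", S[("la", n)])]
    T["rr"] = "T"
    return T


def store(n, S):
    """O_store:n: copy Msd[n] into Mdata[S(Msa[n])]; only that element and rr may change."""
    T = dict(S)
    T[("data", S[("sa", n)])] = S[("sd", n)]
    T["rr"] = "T"
    return T


S0 = make_state(k=2, u=1, v=1)
S1 = dict(S0)
S1[("sa", 0)], S1[("sd", 0)] = 3, 42     # prepare a store of 42 to address 3
S2 = store(0, S1)
S2[("la", 0)] = 3                        # prepare a load from address 3
S3 = load(0, S2)
print(S3[("ld", 0)])                     # 42
```

In the model itself the address and data registers would be set by data manipulation instructions from A′; here they are written directly to keep the example short.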
6 Thread Powered Function Classes
In this section we review the thread powered function classes, which help to
answer the following question: under which conditions can we achieve all the
possible state transformations by applying threads to a strict load/store Maurer
ISA with certain address width and word length?
A thread powered function class has the following parameters:
- an address width k;
- a word length l;
- an operating unit size m;
- an instruction set size d;
- a state space bound e;
- a working area flag f .
The symbols can be regarded as follows:
- d: the number of basic instructions excluding load and store instructions;
- e: a bound on the number of states of the threads that can be applied;
- f : indicates whether a part of the data memory is taken as a working
area. There are two cases. First, if f = T, we use the first half of the data
memory as the external memory and the second half of the data memory
as the internal data memory. Second, if f = F, we use the whole data
memory as the external memory.
The definition of the thread powered function class is given as follows.
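Definition 5. Let k, m ≥ 0 and l, d, e > 0, and let f ∈ {T, F} such that f = F
if k = 0. We define
\[
\begin{aligned}
M^{k}_{data} &= \{ m_{data}(i) \mid i \in [0, 2^{k} - 1] \}, \\
S_{data} &= \{ S \mid S : M^{k}_{data} \to B_{data} \}, \\
T_{data} &= \{ T \mid T : S_{data} \to S_{data} \}.
\end{aligned}
\]
Then the thread powered function class with parameters k, l, m, d, e, f, denoted
by TPFC(k, l, m, d, e, f), which is a subset of Tdata, is defined as follows:
\[
\begin{aligned}
T \in \mathcal{TPFC}(k,l,m,d,e,f) \iff{} & \exists A' \subseteq \mathcal{A}.\ \exists H \in MISA_{sls}(k,l,m,u,v,A').\ \exists p \in T_{finrec}(A_H). \\
& (\mathrm{card}(A') = d \land \mathrm{card}(\mathrm{Res}(p)) \le e \land \forall S \in S_H. \\
& \quad ((f = F \Rightarrow T(S \restriction M^{k}_{data}) = (p \bullet_H S) \restriction M^{k}_{data}) \land {} \\
& \quad\ (f = T \Rightarrow T(S \restriction M^{k}_{data}) \restriction M^{k-1}_{data} = (p \bullet_H S) \restriction M^{k-1}_{data}))).
\end{aligned}
\]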
Threads are stored in the data memory. When the internal data memory is
used as a part of the operating unit, threads are stored in the external memory.
We say that TPFC(k, l, m, d, e, f) is complete if TPFC(k, l, m, d, e, f) is
equal to Tdata.
The following theorem points out that we get completeness if we
use 5 data manipulation instructions and threads with at most 6 + w states (w
is the number of load and store instructions) and take the operating unit size
slightly greater than the data memory size.
The 5 data manipulation instructions (recall that load and store instructions
are not counted in the instruction set size) are as follows: an initialization
instruction, a pre-load instruction, a post-load instruction, a pre-store
instruction, and a transformation instruction. Before a data memory element m0
is moved to a register, the pre-load instruction sends the address of m0 to the
load address register. Then the content of m0 is loaded into the load data
register. Next, the post-load instruction moves the content of the load data
register to the operating unit. Similarly, before data is moved from a register
to the data memory, the pre-store instruction sends the intended address in the
data memory to the store address register. Then the content of the operating
unit is moved to the store data register. Next, the content of the store data
register is stored to the data memory. The transformation instruction applies
the relevant state transformation to the content of the operating unit.
The number of the states of the threads consists of 5 states associated with
the above 5 data manipulation instructions, the w states associated with load
and store instructions, and the termination state.
Theorem 1. Let k ≥ 0, l > 0 and f ∈ {T, F}, and let dms be the data memory
size, i.e., dms = 2^k · l. Then TPFC(k, l, dms + k + 1, 5, 6 + w, f) is complete.
In [2], a proof is given for the case in which there is only one load
instruction and one store instruction.
The following corollary points out that we can still get the completeness if
we use about half of the data memory size as the operating unit size.
Corollary 1. Let k, l > 0, and let ems be the external memory size in the
case that ems is half of the data memory size, i.e., ems = 2^(k−1) · l. Then
TPFC(k, l, ems + k, 5, 6 + w, T) is complete.
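To make the sizes in Theorem 1 and Corollary 1 concrete, the following sketch evaluates the operating unit sizes for sample parameters. The function names and the sample values k = 3, l = 8 are illustrative assumptions only.

```python
def theorem1_ou_size(k, l):
    """Operating unit size dms + k + 1 used in Theorem 1, with dms = 2^k * l."""
    dms = 2 ** k * l
    return dms + k + 1


def corollary1_ou_size(k, l):
    """Operating unit size ems + k used in Corollary 1, with ems = 2^(k-1) * l."""
    ems = 2 ** (k - 1) * l
    return ems + k


# Sample parameters: 3-bit addresses and 8-bit words, so dms = 64 and ems = 32.
print(theorem1_ou_size(3, 8))     # 68
print(corollary1_ou_size(3, 8))   # 35
```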
In the cases of Theorem 1 and Corollary 1, we need at least 5 data manipu-
lation instructions to accomplish the job.
7 Incompleteness
In this section we show under which conditions it is impossible to achieve all
transformations on the states of the external memory taking into account the
use of the internal data memory.
The idea of using the internal data memory is illustrated in Figure 5.
We move data from α to registers, operate on them (e.g., adding two numbers) in
γ, and then move the result back to registers. If it is not possible to process all
the operations in γ due to lack of space, we use β and γ together to process
operations.
[Figure 5: Using the Internal Data Memory. α: the first half of the data memory, used as the external memory (size ems); β: the second half of the data memory, used as the internal data memory (size ems); γ: the small internal memory of the operating unit (size m); the registers sit between the data memory and the operating unit.]
In [2], β is not used to process operations in the case of lack of space.
Lemma 1 in [2] states that if the operating unit size is at most ems/2, the
instruction set size is at most 2^(ems/2), and the number of threads that can be
applied is at most 2^ems, it is impossible to achieve all transformations on the
states of the external memory, where ems (external memory size) is half of the
data memory size.
We reformulate this lemma with the use of the internal data memory as
follows. It states that it is still impossible to achieve all transformations on the
states of the external memory if the total size of the operating unit and the used
internal data memory is at most ems/2.
Lemma 1. Let k > 1, l, m, d, e > 0 and ems = (2^k · l)/2, and let ims be the
used internal data memory size. Then TPFC(k, l, m, d, e, T) is not complete if
m + ims ≤ ems/2, d ≤ 2^(ems/2), and the number of threads that can be applied
to the members of
\[ \bigcup_{A' \subseteq \mathcal{A}} MISA_{sls}(k, l, m, u, v, A') \]
is at most 2^ems.
Proof. If the total size of the operating unit and the used internal
data memory is at most ems/2, then the operating unit and the used internal
data memory together have at most ems/2 bits. As shown in Figure 6,
every bit m_i, for 1 ≤ i ≤ ems/2, has two choices, 0 or 1, so the number of
states of the operating unit and the used internal data memory (in other words,
the number of sequences of ems/2 digits in which every digit has 2
choices) is at most 2^(ems/2).
[Figure 6: Bits in the Memory. A row of ems/2 bits m_i, each with 2 choices, 0 or 1.]
Hence there are at most
\[ \left(2^{ems/2}\right)^{\left(2^{ems/2}\right)} \]
transformations on the states of the operating unit and the used internal data
memory for one data manipulation instruction.
It follows that, if there are at most 2^(ems/2) data manipulation instructions,
then there are at most
\[ \left(\left(2^{ems/2}\right)^{\left(2^{ems/2}\right)}\right)^{\left(2^{ems/2}\right)} \]
transformations on the states of the external memory for one thread.
So, if at most 2^ems threads can be applied, then the number of transformations
on the states of the external memory is at most
\[ \left(\left(2^{ems/2}\right)^{\left(2^{ems/2}\right)}\right)^{\left(2^{ems/2}\right)} \cdot 2^{ems}. \]
This number is less than the number of all possible transformations on the
states of the external memory, which is (2^ems)^(2^ems), i.e.,
\[ \left(\left(2^{ems/2}\right)^{\left(2^{ems/2}\right)}\right)^{\left(2^{ems/2}\right)} \cdot 2^{ems} < \left(2^{ems}\right)^{\left(2^{ems}\right)}. \qquad (\ast) \]
Therefore, we get that TPFC(k, l, m, d, e, T) is not complete.
We prove (∗) by the following computation. Let x = 2^(ems/2). Then
\[
\begin{aligned}
(\ast) &\iff (x^{x})^{x} \cdot x^{2} < (x^{2})^{(x^{2})} \\
&\iff x^{(x^{2})} \cdot x^{2} < (x^{2})^{(x^{2})} \\
&\iff x^{(x^{2})} < (x^{2})^{(x^{2}-1)} \qquad (\star)
\end{aligned}
\]
Applying the logarithm to base 2 to both sides of (⋆), we have
\[ x^{2} \log_2 x < 2(x^{2} - 1) \log_2 x \iff (x^{2} - 2) \log_2 x > 0. \]
If x > √2, then we have x^2 > 2, i.e., x^2 − 2 > 0. Since log2 x > 1/2 if x > √2,
(x^2 − 2) log2 x > 0 holds if x > √2, i.e., if ems > 1.
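Inequality (∗) can also be checked exactly for concrete values of ems by comparing base-2 exponents, since both sides are powers of 2. The sketch below is only a numerical sanity check, not a replacement for the proof; the function name `star_holds` and the tested range are assumptions, and only even values of ems are covered so that ems/2 is an integer.

```python
def star_holds(ems):
    """Check (*) exactly by comparing base-2 exponents (ems even, ems >= 2)."""
    half = ems // 2
    # log2 of ((2^half)^(2^half))^(2^half) * 2^ems
    lhs_exponent = half * 2 ** half * 2 ** half + ems
    # log2 of (2^ems)^(2^ems)
    rhs_exponent = ems * 2 ** ems
    return lhs_exponent < rhs_exponent


print(all(star_holds(ems) for ems in range(2, 65, 2)))   # True
```

For ems = 2 the exponents are 6 and 8, and for ems = 4 they are 36 and 64, so the gap already grows quickly for small memories.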
Now we can give the following theorem showing that if the total size of
the operating unit and the used internal data memory is at most ems/2, the
instruction set size is at most 2^l − w − 1, and the maximal number of states of
the threads is at most 2^(k−2), then TPFC(k, l, m, d, e, T) is not complete.
Theorem 2. Let k > 2, l > 1, m, d > 0, e > 1 and ems = (2^k · l)/2, and let
ims be the used internal data memory size and w the number of load and store
instructions. Then TPFC(k, l, m, d, e, T) is not complete if m + ims ≤ ems/2,
d ≤ 2^l − w − 1, e ≤ 2^(k−2).
Proof. We have d data manipulation instructions plus w load and store
instructions, so there are d + w instructions in total. Suppose every state of a
thread can proceed either according to the positive reply produced by the
associated instruction, or according to the negative reply. Since e is the
maximal number of states of the threads that can be applied, whichever path is
taken, the number of states of each path is at most e. Hence, we have d + w
choices for instructions, e choices for the path caused by the positive reply,
and e choices for the path caused by the negative reply. Including termination
and deadlock, we have (d + w) · e^2 + 2 choices to form a thread. Therefore, the
number of threads with e states is ((d + w) · e^2 + 2)^e.
Since k > 2, l ≥ 2, e > 1, we have
\[ \left((d + w) \cdot e^{2} + 2\right)^{e} < \left((d + w) \cdot e^{2} + e^{2}\right)^{e} \le 2^{ems} \quad \text{if } l \ge 2k - 4. \]
Hence, the number of threads with e states is less than 2^ems.
It is easy to see that 2^l < 2^(l · 2^(k−2)) = 2^(ems/2). Then we get
2^l − w − 1 < 2^(ems/2), i.e., d < 2^(ems/2). Because m + ims ≤ ems/2, applying
Lemma 1, we can conclude that TPFC(k, l, m, d, e, T) is not complete if
m + ims ≤ ems/2, d ≤ 2^l − w − 1, e ≤ 2^(k−2).
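The counting bound used in this proof can be checked numerically. With d + w ≤ 2^l − 1 and e ≤ 2^(k−2), one has (d + w) · e^2 + e^2 ≤ 2^(l + 2k − 4), so the e-th power is at most 2^((l + 2k − 4) · 2^(k−2)), which does not exceed 2^ems = 2^(l · 2^(k−1)) exactly when l ≥ 2k − 4. The sketch below evaluates the extreme case with exact integers; the function names and the tested ranges are assumptions made for illustration.

```python
def thread_bound(k, l):
    """((d + w) * e^2 + 2)^e in the largest case allowed by Theorem 2:
    d + w = 2^l - 1 and e = 2^(k-2)."""
    dw = 2 ** l - 1
    e = 2 ** (k - 2)
    return (dw * e ** 2 + 2) ** e


def below_limit(k, l):
    """True if the bound on the number of threads stays below 2^ems."""
    ems = 2 ** (k - 1) * l
    return thread_bound(k, l) < 2 ** ems


# Parameter pairs that satisfy k > 2, l >= 2 and the proviso l >= 2k - 4.
checks = [below_limit(k, l)
          for k in range(3, 7)
          for l in range(max(2, 2 * k - 4), 2 * k + 3)]
print(all(checks))          # True
print(below_limit(4, 2))    # False: l = 2 violates l >= 2k - 4 = 4
```

The second call illustrates that this particular counting bound relies on the proviso l ≥ 2k − 4 stated in the proof: for k = 4 and l = 2 the bound is 50^4, which exceeds 2^16.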
8 Conclusion
We have reviewed the concepts of BTA and strict load-store Maurer ISA. We
also have shown under which conditions we can achieve all the possible trans-
formations on the states of the external memory of a strict load-store Maurer
ISA and under which conditions we cannot.
From Theorem 1 and Corollary 1, we get completeness with 5 data
manipulation instructions and threads with at most 6 + w states if we take the
operating unit size slightly greater than the data memory size, or about half
of the data memory size. Completeness is lost when the number of data
manipulation instructions and the number of states of the threads are
decreased. Theorem 2 states that it is impossible to achieve all
transformations if the total size of the operating unit and the used internal
data memory is at most half of the external memory size, the instruction set
size is at most 2^l − w − 1, and the maximal number of states of the threads is
at most 2^(k−2).
References
[1] J.A.Bergstra and C.A.Middelburg. Maurer computers with single-thread
control. Fundamenta Informaticae, 80(4):333–362, 2007.
[2] J.A.Bergstra and C.A.Middelburg. On the operating unit size of load/store
architectures. Technical Report PRG0703, University of Amsterdam, 2007.
[3] J.A.Bergstra and C.A.Middelburg. Thread algebra for strategic interleaving.
Formal Aspects of Computing, 19:445–474, 2007.
[4] J.A.Bergstra and C.A.Middelburg. Maurer computers for pipelined instruc-
tion processing. Mathematical Structures in Computer Science, 18:373–409,
2008.
[5] J.A.Bergstra and M.E.Loots. Program algebra for sequential code. Journal
of Logic and Algebraic Programming, 51(2):125–156, 2002.
[6] J.L.Hennessy and D.A.Patterson. Computer Architecture: A Quantitative
Approach. Morgan Kaufmann, third edition, 2003.
[7] W.D.Maurer. A theory of computer instructions. Journal of the ACM,
13(2):226–235, 1966.
[8] W.D.Maurer. A theory of computer instructions. Science of Computer
Programming, 60:244–273, 2006.
