A “Hardware Compiler” Semantics for Handel-C  by Butterfield, Andrew & Woodcock, Jim
Andrew Butterfield 1,2
Department of Computer Science
University of Dublin
Dublin, Ireland
Jim Woodcock 3
Department of Computer Science
University of York, UK
Abstract
We present a denotational semantics for the hardware compilation language Handel-C that maps language con-
structs to a set of equations, which describe the structure of the resulting hardware. This semantics is then shown
to be useful for validating various algebraic laws which should hold for Handel-C programs, as well as exposing a
key principle which governs how such hardware should be operated.
Keywords: Handel-C, Hardware Compilation, Denotational Semantics, CSP
1 Introduction
This paper describes a semantics for Handel-C which gives a program a meaning as a col-
lection of equations describing a possible (very naive) hardware implementation — hence
the term “Hardware Compiler” Semantics, in the title. This semantics, which sounds very
operational in nature, turns out in fact to have a strong denotational character, albeit in
an unconventional sense.
Handel-C 4 [3] is a language originally developed by the Hardware Compilation Group
at Oxford University Computing Laboratory, and now marketed by Celoxica Ltd. It is
a hybrid of CSP [5] and C, designed to target hardware implementations, speciﬁcally
ﬁeld-programmable gate arrays (FPGAs) [7]. The language has sequential and parallel
1 Done while on sabbatical at the University of Kent
2 Email: Andrew.Butterfield@cs.tcd.ie
3 Email: Jim.Woodcock@cs.york.ac.uk
4 Handel-C is the registered trademark of Celoxica Ltd (www.celoxica.com)
Electronic Notes in Theoretical Computer Science 161 (2006) 73–90
1571-0661© 2006 Elsevier B.V. 
www.elsevier.com/locate/entcs
doi:10.1016/j.entcs.2006.04.026
Open access under CC BY-NC-ND license.
constructs and global variable assignment and channel communication. The language
targets synchronous hardware with multiple clock domains. All assignments and channel
communication events take one clock cycle. All expression and conditional evaluations,
as well as priority resolutions are deemed to be instantaneous, eﬀectively being completed
before the current clock-cycle ends.
As the Handel-C language targets hardware, it is ideal for implementing embedded
systems, often in situations where high levels of assurance would be desirable [4]. There is
a clear need for both a formal semantics of Handel-C (or a reasonable subset) and
an appropriate methodology and tool support. The research described here is part of
a programme to provide just such an industrial-strength formal framework.
2 Syntax
We introduce here the “mathematical” syntax of a stripped-down version of Handel-C,
which albeit simpler, has all the essential features of the synchronous core of the full
language.
We have identiﬁers for channels (c ∈ Ch) and variables (x ∈ Var), and we assume
the existence of an expression syntax (e ∈ Exp) whose details need not concern us here.
We consider all the above as having either boolean or integer type. We also have the
notion of guards (g ∈ Grd), which denote the oﬀering and accepting of communication
actions. Guards either denote expression output along a channel (c!e), variable input via
a channel (c?x ), or a skip guard which always succeeds (!?).
A syntax of a process is as follows:
\[ P, Q ::= \mathit{Skip} \mid \mathit{Delay} \mid x := e \mid P ; Q \mid P \parallel Q \mid P \triangleleft e \triangleright Q \mid e * P \mid \langle g_i \to P_i \rangle \]
We use notation like 〈gi : pi 〉 as shorthand for 〈g1 : p1, . . . , gn : pn〉 where i is assumed to
index over 1 . . . n for appropriate n. In the last construct, if the !? guard appears it must
appear only once, as the last guard.
We can brieﬂy summarise the behaviour of a Handel-C process as follows: Skip does
nothing, in zero time; Delay does nothing, but takes one clock cycle to do it; x := e
assigns the value of e into x , taking one clock cycle; P ; Q ﬁrst executes P , and once it
has terminated immediately starts Q ; P ‖ Q runs both P and Q in lock-step parallel,
terminating when they have both finished; P ◁ e ▷ Q evaluates e : B and executes P
immediately if e is True, otherwise it runs Q ; and e ∗ P tests e : B and if True it runs P
and then repeats, otherwise it terminates.
The 〈gi → Pi〉 construct (“prialt”) is an ordered sequence of guard-process pairs. Each
guard is checked against the process environment to see if it is able to execute. If no guards
are so enabled, then the prialt blocks until the next clock cycle when it tries again. If
one or more guards are enabled, then the ﬁrst such in the list is executed, and the
corresponding process is executed subsequently. An input guard (c?x ) is enabled if there
is a corresponding output guard (c!e) in some other prialt executing at the same time,
and vice versa. The skip guard (!?) is always enabled. The input (c?x ) and output (c!e) guards
perform their actions taking one clock-cycle, while the skip guard (!?) acts like Skip so
the subsequent process starts execution immediately. It is this “instant” execution of !?
guards that so complicates the formal semantics of Handel-C [2].
To see the problem, consider the following process:
〈 c!66 → Skip 〉 || 〈 d!99 → Skip, !? → ( b ∗ P ; 〈c?x → Skip〉 ) 〉        (1)
In order to establish the outcome here (using the operational semantics of [2] for example)
we proceed as shown in Figure 1, where ⟶ⁿ indicates a transition sequence annotated with
the number n of clock cycles elapsing; [cond] denotes a side-condition; and [[effect]] denotes
some change to internal state.
〈 c!66 → Skip 〉 || 〈 d!99 → Skip, !? → ( b ∗ P ; 〈c?x → Skip〉 ) 〉
    ⟶⁰  [[requests lodged]]
〈 c!66 → Skip 〉 || 〈 d!99 → Skip, !? → ( b ∗ P ; 〈c?x → Skip〉 ) 〉
    ⟶⁰  [[requests resolved]]
〈 c!66 → Skip 〉 || !? → b ∗ P ; 〈c?x → Skip〉
    ⟶⁰
〈 c!66 → Skip 〉 || b ∗ P ; 〈c?x → Skip〉
    ⟶⁰  [b = False]
〈 c!66 → Skip 〉 || 〈c?x → Skip〉
    ⟶¹  [[x → 66]]
Skip || Skip
Fig. 1. Program Execution
Details of how requests are “lodged” and “resolved” can be found in [1].
Prialts nested inside default clauses of other prialts may become active in the same
clock cycle as those enclosing prialts, which requires us to iterate this request–resolve
loop several times, in any given clock cycle. Managing this micro-cycle activity severely
complicates the operational semantics. However, the underlying hardware doesn’t iterate,
as it computes what is to be active in any given clock cycle using combinatorial logic.
The “Hardware Compilation” semantics described here was initially developed to see if
such a semantics would give some insight into a simpler, less “micro”-iterative operational
semantics. In other words, can we ﬁnd a way to compute the outcome in one (functional)
step ?
3 Hardware Compilation
The key concept behind the hardware semantics is to recognise that the resulting hardware
simply consists of a ﬁxed bank of registers, connected by ﬁxed combinatorial logic — in
eﬀect a large (ﬁnite-)state machine. On each clock cycle, new values for the register state
are computed as a function of the current values. Program execution simply repeats this
ﬁxed calculation on every clock cycle. The hardware semantics of a Handel-C program is
therefore simply a ﬁxed function f : State → State, where State denotes the contents of
all the registers. The contribution of this work is to describe how f is determined from the
Handel-C language constructs in a compositional manner. We use equations that model
the behaviour of the hardware to describe how f is computed.
The main features of the hardware that need to be modelled are:
• Registers, loaded on a clock edge, used to store variable values and control tokens to
manage control flow.
• Multiplexers used to route expression results to registers and channels (wires) and
channel data to registers.
• Program statement hardware has two key signals: start : B, an input, starts execution
of the statement; while done : B, an output, indicates its termination.
• A control token is a register whose input is a done signal, and whose output is fed to
one or more starts.
We shall express all these components using a set of equations which distinguish between
combinatorial (pure functions) and sequential (stateful) hardware. An equation simply
equates a variable on its lefthand-side with either a combinatorial or sequential expression
on its righthand-side:
Eqn =̂ Var × Rhs
Rhs =̂ CombExpr | SeqExpr
We diﬀerentiate between combinatorial and sequential expressions by using parentheses
for the former (z = f (x , y)) and square brackets for the latter (w = g [x , y ]). The overall
system is described as a list of such equations,
Sys =̂ PEqn
which we expect to have no “combinatorial cycles”: Any circular chain of dependencies
must include a sequential equation. Generally we either list the equations one to a line
as follows:
x = f (u, v)
y = g [w , x ]
z = h(u, y)
or we list several on one line, separated by semi-colons:
x = f (u, v); y = g [w , x ]; z = h(u, y)
The combinatorial building blocks provided are the usual functions over B and Z such
as ∧, ∨, ¬, +, −, /, etc., plus multiplexers (with non-standard controls 5 ):
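\[ \mathit{mux}_n : \mathbb{B}^n \to \mathbb{Z}^n \to \mathbb{Z} \]
\[ \mathit{mux}_n(c_1, \ldots, c_n)(\mathit{data}_1, \ldots, \mathit{data}_n) \mathrel{\widehat{=}} \mathit{data}_i, \quad \text{if } c_i \]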
If no ci , or more than one, is true, then the output is undeﬁned. The latter case corre-
sponds to more than one process trying to update a variable in a given clock-cycle.
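The behaviour of this one-hot multiplexer can be sketched in a few lines of Python. This is only an illustration: the function name `mux` and the use of `None` to stand for the "undefined" output are assumptions of the sketch, not part of the semantics.

```python
from typing import Optional, Sequence


def mux(controls: Sequence[bool], data: Sequence[int]) -> Optional[int]:
    """One-hot multiplexer: return data[i] for the single true control.

    None models the 'undefined' output that arises when no control,
    or more than one control, is true.
    """
    if len(controls) != len(data):
        raise ValueError("controls and data must have the same length")
    selected = [d for c, d in zip(controls, data) if c]
    return selected[0] if len(selected) == 1 else None
```

For instance, `mux([False, True, False], [0, 7, -1])` selects the second data input, while two simultaneous true controls yield `None`.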
These multiplexers are required because a single process variable x may participate in
many assignment statements only one of which should be active during any clock cycle
(e.g.):
. . . ; x := 0; . . . ; x := x + 1; . . . ; x := y − 2 ∗ z ; . . . ; x := −1; . . .
A multiplexer connects the four pieces of hardware implementing the expressions 0, x +
1, y−2∗z , −1 to the input of register x . The start signals for each assignment statement
above control the multiplexer to determine which expression is routed through to the
output. The logical-or of these start signals enables the loading of the register. All
register updates occur on the appropriate edge of the global clock, which is implicit in
this semantic model 6 .
We use three sequential building blocks:
• Registers: register [load : B, in : Z] : Z
When load is true, in is stored at the clock edge.
• Wait Block (Control Token): wait [ﬁni : B] : B
The value of ﬁni is stored and appears on output after clock edge.
• Synchronisation Block: syncn [done1 : B, .., donen : B] : B
syncn ’s output is initially false. It waits, over many clock cycles if necessary, for all n
donei s to go true. Then its output goes true immediately, and reverts to false at the
next clock edge.
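A behavioural sketch of the three sequential blocks may help fix intuitions. It assumes a simple two-phase model that is not taken from the paper: `output(...)` gives the value visible during the current cycle and `tick(...)` performs the update at the clock edge. The class and method names are hypothetical.

```python
from typing import List


class Register:
    """register[load, in]: stores `data` at the clock edge when `load` is true."""

    def __init__(self, init: int = 0) -> None:
        self.value = init

    def output(self) -> int:
        return self.value

    def tick(self, load: bool, data: int) -> None:
        if load:
            self.value = data


class Wait:
    """wait[fini]: the input reappears on the output after the clock edge."""

    def __init__(self) -> None:
        self.stored = False

    def output(self) -> bool:
        return self.stored

    def tick(self, fini: bool) -> None:
        self.stored = fini


class Sync:
    """sync_n: true in the cycle in which the last outstanding done arrives."""

    def __init__(self, n: int) -> None:
        self.seen: List[bool] = [False] * n

    def output(self, dones: List[bool]) -> bool:
        # True immediately, i.e. within the current cycle.
        return all(s or d for s, d in zip(self.seen, dones))

    def tick(self, dones: List[bool]) -> None:
        if self.output(dones):
            self.seen = [False] * len(self.seen)   # revert to false next cycle
        else:
            self.seen = [s or d for s, d in zip(self.seen, dones)]
```

In this sketch a `Sync(2)` that sees its first `done` in one cycle and its second two cycles later reports true only in that later cycle.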
In order to generate hardware equations we need to generate hardware equation vari-
ables (not to be confused with process variables), which is achieved by giving every process
statement (atomic and compound) a unique label. So, for example, the conditional statement
p ◁ c ▷ q might be labelled as ℓ::(m::p ◁ c ▷ n::q), where ℓ, m and n are unique labels,
with ℓ labelling the entire conditional construct, while m and n label the true and false
branches respectively.
The trick now is to come up with a way of generating the hardware equations for a
process in a compositional manner. Initially this seems impossible, simply because some
of the hardware generated seems to require global knowledge about the whole process for
which hardware is being produced. For example, the multiplexers that feed results into
variables need to have one data and one control input for every use of the variable in the
entire (top-level) process! This seems to militate against a compositional semantics in
this case.
5 The standard n-way multiplexer encodes the controls in log2n bits.
6 Dealing with multi-clock Handel-C would require the use of explicit clock variables in the equations, but this is
relatively easy to add in if required.
However, there is a technical trick we can employ to make our semantics compo-
sitional: we generate partial hardware descriptions, and use an equation join operator
(⊎ : PEqn × PEqn → PEqn) to collect equations together with appropriate merging of
partial hardware elements into more complete ones. This technique for merging partial
hardware descriptions is required in three cases:
• Register multiplexers:
We generate a “singleton” multiplexer: in.x = mux1(startm2)(x + 1), or an empty one:
c = mux0()() (for input channels).
We merge them using 7
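\[
\begin{aligned}
& \mathit{in}.x = \mathit{mux}_m(c_1, \ldots, c_m)(d_1, \ldots, d_m) \;\uplus\; \mathit{in}.x = \mathit{mux}_n(c_{m+1}, \ldots, c_{m+n})(d_{m+1}, \ldots, d_{m+n}) \\
\equiv\;& \mathit{in}.x = \mathit{mux}_{n+m}(c_1, \ldots, c_m, c_{m+1}, \ldots, c_{m+n})(d_1, \ldots, d_m, d_{m+1}, \ldots, d_{m+n})
\end{aligned}
\]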
• Distributed-Or (used for register-load/channel-data controls)
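We generate either a singleton distributed-or, $v = \bigvee\{c\}$, or an empty one (for some
channel cases), $v = \bigvee\{\}$.
We merge these using
\[
\begin{aligned}
& v = \bigvee\{c_1, \ldots, c_m\} \;\uplus\; v = \bigvee\{c_{m+1}, \ldots, c_{m+n}\} \\
\equiv\;& v = \bigvee\{c_1, \ldots, c_m, c_{m+1}, \ldots, c_{m+n}\}
\end{aligned}
\]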
• Register instances:
We generate a register for every use: x = register [load .x , in.x ]
Merged using
x = register [load .x , in.x ] ⊎ x = register [load .x , in.x ] ≡ x = register [load .x , in.x ]
i.e., they all refer to the same register
The merging of all other sets of equations simply involves lumping them together as a
larger set of separate equations 8 .
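The three merge cases can be illustrated with a small Python sketch. The representation is an assumption made for the example: an equation set is a dictionary from variable names to tagged tuples, and the tags `"mux"`, `"or"`, `"register"` and `"wait"` are hypothetical encodings of the corresponding right-hand sides.

```python
from typing import Dict, Tuple

Rhs = Tuple            # ("mux", controls, data) | ("or", terms) | ("register", load, data) | ...
Eqns = Dict[str, Rhs]


def merge(a: Eqns, b: Eqns) -> Eqns:
    """Join two partial hardware descriptions into a more complete one."""
    result = dict(a)
    for var, rhs in b.items():
        if var not in result:
            result[var] = rhs                      # unrelated equations are lumped together
            continue
        old = result[var]
        if old[0] == rhs[0] == "mux":              # concatenate controls and data
            result[var] = ("mux", old[1] + rhs[1], old[2] + rhs[2])
        elif old[0] == rhs[0] == "or":             # union of the or-ed terms
            result[var] = ("or", old[1] + rhs[1])
        elif old[0] == rhs[0] == "register" and old == rhs:
            pass                                   # both refer to the same register
        else:
            raise ValueError(f"cannot merge two definitions of {var}")
    return result


# Partial hardware for `x := 0` (label 1) and `x := x + 1` (label 2).
first = {"in.x": ("mux", ("start_1",), ("0",)),
         "load.x": ("or", ("start_1",)),
         "x": ("register", "load.x", "in.x"),
         "done_1": ("wait", "start_1")}
second = {"in.x": ("mux", ("start_2",), ("x + 1",)),
          "load.x": ("or", ("start_2",)),
          "x": ("register", "load.x", "in.x"),
          "done_2": ("wait", "start_2")}
merged = merge(first, second)
```

After merging, `in.x` is a two-input multiplexer, `load.x` is the or of both start signals, and there is a single register equation for `x`.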
3.1 Hardware Compilation Semantics
We are now in a position to give the hardware semantics for all the process types in our
language. We introduce a semantic function which maps processes into sets of hardware
7 Merge (⊎) as defined here is non-commutative, but as the behaviour of mux is invariant under any consistent
re-ordering of controls and data, this has no real effect on the overall semantics, where we would expect merge to be
commutative.
8 There is another problem with compositionality and the requirement for unique labels, but this can be resolved
by having ⊎ use renaming to avoid label clashes.
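\[
\begin{aligned}
[\![\ell :: \mathit{Skip}]\!] &\mathrel{\widehat{=}} \mathit{done}_\ell = \mathit{start}_\ell \\
[\![\ell :: \mathit{Delay}]\!] &\mathrel{\widehat{=}} \mathit{done}_\ell = \mathit{wait}[\mathit{start}_\ell] \\
[\![\ell :: x := e]\!] &\mathrel{\widehat{=}} \mathit{in}.x = \mathit{mux}_1(\mathit{start}_\ell)([\![e]\!]);\ \mathit{load}.x = \bigvee\{\mathit{start}_\ell\} \\
&\phantom{\mathrel{\widehat{=}}}\ x = \mathit{register}[\mathit{load}.x, \mathit{in}.x];\ \mathit{done}_\ell = \mathit{wait}[\mathit{start}_\ell]
\end{aligned}
\]
Fig. 2. Compiling Atomic Statements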
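\[
\begin{aligned}
& [\![\ell :: (m::p \triangleleft c \triangleright n::q)]\!] \\
\mathrel{\widehat{=}}\;& [\![m::p]\!] \uplus [\![n::q]\!] \;\uplus \\
& \mathit{start}_m = \mathit{start}_\ell \land [\![c]\!];\ \mathit{start}_n = \mathit{start}_\ell \land \lnot[\![c]\!] \\
& \mathit{done}_\ell = \mathit{done}_m \lor \mathit{done}_n \\[4pt]
& [\![\ell :: (c * n::p)]\!] \\
\mathrel{\widehat{=}}\;& [\![n::p]\!] \;\uplus \\
& \mathit{start}_n = [\![c]\!] \land (\mathit{start}_\ell \lor \mathit{done}_n);\ \mathit{done}_\ell = \lnot[\![c]\!] \land (\mathit{start}_\ell \lor \mathit{done}_n) \\[4pt]
& [\![\ell :: (m::p \,;\, n::q)]\!] \\
\mathrel{\widehat{=}}\;& [\![m::p]\!] \uplus [\![n::q]\!] \;\uplus \\
& \mathit{start}_m = \mathit{start}_\ell;\ \mathit{start}_n = \mathit{done}_m;\ \mathit{done}_\ell = \mathit{done}_n \\[4pt]
& [\![\ell :: (m::p \parallel n::q)]\!] \\
\mathrel{\widehat{=}}\;& [\![m::p]\!] \uplus [\![n::q]\!] \;\uplus \\
& \mathit{start}_m = \mathit{start}_\ell;\ \mathit{start}_n = \mathit{start}_\ell;\ \mathit{done}_\ell = \mathit{sync}_2[\mathit{done}_m, \mathit{done}_n]
\end{aligned}
\]
Fig. 3. Compiling Compound Statements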
equations:
[[−]] : P → PEqn
The semantics of the atomic statements is given in Figure 2.
Skip asserts done the instant it is started.
Delay waits for the clock cycle in which it was started to end before asserting that it
is done. Hence it always takes one clock cycle to execute.
The semantics of assignment x := e is simply to route the current value of e through
a multiplexer to the input (in.x ) of register x . The assignment statement’s start control
is used to route the multiplexer and load the register (via load .x ). We rely on the merge
operator as previously described to link up the multiplexers, distributed-ors, and to merge
identical register invocations to get the global hardware required.
The semantics of the standard compound statements are given in Figure 3.
The conditional (◁ c ▷) uses c to determine which branch to start. It is done when
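\[
\begin{aligned}
& [\![\ell :: \langle m_1::g_1 \to n_1::p_1, \ldots, m_k::g_k \to n_k::p_k \rangle]\!] \\
\mathrel{\widehat{=}}\;& \biguplus_i [\![m_i::g_i]\!] \;\uplus\; \biguplus_i [\![n_i::p_i]\!] \;\uplus \\
& \mathit{offer}_{m_1} = \mathit{start}_\ell \lor \mathit{retry}_\ell;\quad \mathit{offer}_{m_{i+1}} = \mathit{offer}_{m_i} \land \lnot\mathit{active}_{m_i} \\
& \mathit{start}_{n_i} = \mathit{offer}_{m_i} \triangleleft g_i = \mathord{!?} \triangleright \mathit{wait}[\mathit{active}_{m_i}] \\
& \mathit{inactive}_\ell = \lnot\bigvee\{\mathit{active}_{m_1}, \ldots, \mathit{active}_{m_k}\} \\
& \mathit{retry}_\ell = \mathit{wait}[(\mathit{start}_\ell \lor \mathit{retry}_\ell) \land \mathit{inactive}_\ell] \\
& \mathit{done}_\ell = \bigvee\{\mathit{done}_{n_1}, \ldots, \mathit{done}_{n_k}\}
\end{aligned}
\]
Fig. 4. Compiling “prialt”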
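\[
\begin{aligned}
[\![\ell :: \mathord{!?}]\!] &\mathrel{\widehat{=}} \mathit{active}_\ell = \mathit{offer}_\ell \\
[\![\ell :: c!e]\!] &\mathrel{\widehat{=}} \mathit{out}.c = \bigvee\{\mathit{offer}_\ell\};\ \mathit{in}.c = \bigvee\{\} \\
&\phantom{\mathrel{\widehat{=}}}\ c = \mathit{mux}_1(\mathit{active}_\ell)([\![e]\!]);\ \mathit{active}_\ell = \mathit{offer}_\ell \land \mathit{in}.c \\
[\![\ell :: c?x]\!] &\mathrel{\widehat{=}} \mathit{in}.c = \bigvee\{\mathit{offer}_\ell\};\ \mathit{out}.c = \bigvee\{\};\ \mathit{active}_\ell = \mathit{offer}_\ell \land \mathit{out}.c \\
&\phantom{\mathrel{\widehat{=}}}\ \mathit{in}.x = \mathit{mux}_1(\mathit{active}_\ell)(c);\ \mathit{load}.x = \bigvee\{\mathit{active}_\ell\} \\
&\phantom{\mathrel{\widehat{=}}}\ x = \mathit{register}[\mathit{load}.x, \mathit{in}.x];\ c = \mathit{mux}_0()()
\end{aligned}
\]
Fig. 5. Compiling Guards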
either branch is.
The loop (∗) looks at its condition. If false it terminates immediately, otherwise it
starts its body. It itself starts on an external request, or if its body has just terminated.
Sequential composition (; ) starts its ﬁrst sub-statement immediately, its second the
instant the ﬁrst is done, and it terminates when the second does.
Parallel composition (||) starts all its sub-statements immediately once it is itself
started. It is done when all its sub-statements are done, as signalled by sync2.
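The control part of this compilation scheme can be sketched as a recursive Python function. The sketch makes several simplifying assumptions: processes are encoded as nested tuples, fresh labels are drawn from a counter, equations are plain strings, and only the start/done control equations are produced, so ordinary set union can stand in for the merge operator (the multiplexer, distributed-or and register equations of assignments are left out). The function name and the `&`, `|`, `~` spelling of the boolean operators are hypothetical.

```python
from itertools import count


def compile_control(proc, label="l", fresh=None):
    """Return the set of control equations for `proc`, whose own label is `label`."""
    fresh = count(1) if fresh is None else fresh
    start, done = f"start_{label}", f"done_{label}"
    kind = proc[0]
    if kind == "skip":                       # done the instant it is started
        return {f"{done} = {start}"}
    if kind in ("delay", "assign"):          # one clock cycle (datapath omitted)
        return {f"{done} = wait[{start}]"}
    if kind == "loop":                       # ("loop", condition, body)
        _, c, body = proc
        n = f"n{next(fresh)}"
        glue = {f"start_{n} = {c} & ({start} | done_{n})",
                f"{done} = ~{c} & ({start} | done_{n})"}
        return compile_control(body, n, fresh) | glue
    # The remaining constructs have two labelled sub-statements.
    m, n = f"n{next(fresh)}", f"n{next(fresh)}"
    if kind == "cond":                       # ("cond", condition, then, else)
        _, c, p, q = proc
        glue = {f"start_{m} = {start} & {c}", f"start_{n} = {start} & ~{c}",
                f"{done} = done_{m} | done_{n}"}
    elif kind == "seq":                      # ("seq", first, second)
        _, p, q = proc
        glue = {f"start_{m} = {start}", f"start_{n} = done_{m}",
                f"{done} = done_{n}"}
    elif kind == "par":                      # ("par", left, right)
        _, p, q = proc
        glue = {f"start_{m} = {start}", f"start_{n} = {start}",
                f"{done} = sync2[done_{m}, done_{n}]"}
    else:
        raise ValueError(f"unknown construct: {kind}")
    return compile_control(p, m, fresh) | compile_control(q, n, fresh) | glue
```

For example, `compile_control(("seq", ("skip",), ("delay",)))` yields five equations that chain `start_l` through the two sub-statements to `done_l`.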
The compilation of the prialt statement is shown in Figure 4.
The condition (gi =!?) is a “compile-time” conditional, which does not translate
into hardware. Once started, a prialt gets its ﬁrst guard to “oﬀer” to communicate. The
guard will report if it is active. If not, then each next guard in sequence is made to
“oﬀer”. Once a guard gets an active response, it executes in this cycle, followed by its
continuation process in the next (except for default guards, whose continuation process
starts immediately). The prialt terminates when the continuation process is done. If all
guards are inactive, it retries on the next clock cycle.
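The offer chain can be illustrated by evaluating one clock cycle of it in Python. As a simplifying assumption, each guard is summarised by a boolean `enabled` flag saying whether it would become active if offered (in the hardware this is determined by the guard equations); the function name and return shape are hypothetical.

```python
from typing import List, Tuple


def prialt_cycle(offering: bool, enabled: List[bool]) -> Tuple[List[bool], bool]:
    """Evaluate the offer chain for one cycle.

    offering    -- start or retry is asserted in this cycle
    enabled[i]  -- guard i would be active if it were offered
    Returns (active flags per guard, whether the prialt retries next cycle).
    """
    active = []
    offer = offering                 # the first guard is offered directly
    for en in enabled:
        act = offer and en
        active.append(act)
        offer = offer and not act    # pass the offer on only if this guard declined
    retry_next = offering and not any(active)
    return active, retry_next
```

With `enabled = [False, True, True]` only the second guard becomes active, reflecting the priority ordering; with no guard enabled the prialt retries in the next cycle. All of this is computed in a single pass, with no request and resolve iteration.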
The prialt semantics also makes use of a compilation scheme for guards:
[[−]] : Grd → PEqn
The compilation semantics for guards is described in Figure 5.
The guards do not have done and start control tokens, but instead use signals oﬀer
and active respectively to oﬀer to perform their corresponding action, and to be told that
their action is to be active.
The skip guard !? is implemented with a piece of wire, since it is always active if it
oﬀers (Who cares about how it complicates the semantics !!).
An output guard c!e makes a global output oﬀer on out .c, and becomes active if it
sees a global input oﬀer on in.c. If so, it then multiplexes its expression data onto wire c.
Here we deﬁne in.c as an empty distributed-or, simply as a place-holder. Any assertions
here come from input statements. We are exploiting the same merge mechanism used for
assignments.
An input guard c?x makes a global input offer on in.c, and becomes active if it sees
a global output oﬀer on out .c. If active, it behaves like an assignment x := c where c is
the channel data. It needs the value of c but cannot provide it, so an empty multiplexer
is used to complete the semantics and avoid a dangling reference.
3.2 Where has the fixed point gone?
A standard feature of denotational semantics is the use of ﬁxed points to reason about
recursion and iteration. However a look at the semantics of c ∗ p shows no sign of a ﬁxed
point. Fixed points are used to ensure that the semantics so given is compositional —
the semantics of a compound language construct is built up from the semantics of its
components.
First we point out that the semantics given here is compositional — for example [[c ∗ p]]
is given in terms of [[c]] and [[p]]. The merging of the semantics of program fragments
is achieved by ⊎, which is defined at the semantic level. Secondly, note that we are
deﬁning the behaviour of the program, or indeed any well-formed fragment, by giving its
computational behaviour for a single clock-cycle. The running program is characterised
by a sequence of states generated on successive clock-cycles by the repeated use of f on
some starting state s0 : State
\[ s_0,\ f(s_0),\ f^2(s_0),\ f^3(s_0),\ \ldots \]
This is where the fixed point has gone: the iteration and its fixed-point semantics are
effectively lifted up to the top level, where they cover the whole program's execution
trace.
This is why we refer to this semantics as “denotational”, but admittedly in an uncon-
ventional manner.
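The execution trace obtained by repeatedly applying f can be written as a short Python generator. The step function used in the example is purely illustrative and is not derived from any particular Handel-C program.

```python
from itertools import islice
from typing import Callable, Iterator, TypeVar

S = TypeVar("S")


def trace(f: Callable[[S], S], s0: S) -> Iterator[S]:
    """Yield s0, f(s0), f(f(s0)), ... one state per clock cycle."""
    state = s0
    while True:
        yield state
        state = f(state)


def count_to_three(x: int) -> int:
    """Hypothetical one-register step function: count up to 3, then hold."""
    return x + 1 if x < 3 else x


first_six = list(islice(trace(count_to_three, 0), 6))
```

Here `first_six` holds the first six states of the run; once the counter reaches 3 the state no longer changes.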
4 Laws of Handel-C
We would like to be able to validate a variety of algebraic laws for Handel-C processes, such
as:
Skip; P ≡ P ≡ P ; Skip
P || Q ≡ Q || P
P ; (Q ; R) ≡ (P ; Q); R
b ∗ P ≡ (P ; b ∗ P) ◁ b ▷ Skip
We consider two programs as equivalent (≡) if they both make the same variable assign-
ments on each clock cycle.
The hardware semantics makes it surprisingly easy to prove some of these laws, in
particular the structural ones.
In order to perform the proofs we need to introduce the notion of a process variable
denoting an arbitrary process (::P , say), and referring to its hardware semantics expansion
as
done = P [start]
Here P [. . .] represents all the hardware equations that correspond to the semantics of P .
The equation above simply serves to name the start and done signals for that hardware.
4.1 Proving Skip a unit for ;
We now consider the following three-way equation on processes:
Skip; P ≡ P ≡ P ; Skip
We introduce labels:
ℓ::(s::Skip ; p::P) = ℓ::(p::P) = ℓ::(p::P ; t::Skip)
The extra label on the middle process simply serves to make it easier to compare the
results. Expanding out the lefthand-side:
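\[
\begin{aligned}
& [\![\ell :: (s::\mathit{Skip} \,;\, p::P)]\!] \\
\mathrel{\widehat{=}}\;& \mathit{done}_p = P[\mathit{start}_p];\ \mathit{done}_s = \mathit{start}_s \\
& \mathit{start}_s = \mathit{start}_\ell;\ \mathit{start}_p = \mathit{done}_s;\ \mathit{done}_\ell = \mathit{done}_p
\end{aligned}
\]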
Expanding out the middle:
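\[
\begin{aligned}
& [\![\ell :: (p::P)]\!] \\
\mathrel{\widehat{=}}\;& \mathit{start}_p = \mathit{start}_\ell;\ \mathit{done}_\ell = \mathit{done}_p;\ \mathit{done}_p = P[\mathit{start}_p]
\end{aligned}
\]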
Expanding out the righthand-side:
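\[
\begin{aligned}
& [\![\ell :: (p::P \,;\, t::\mathit{Skip})]\!] \\
\mathrel{\widehat{=}}\;& \mathit{done}_p = P[\mathit{start}_p];\ \mathit{done}_t = \mathit{start}_t \\
& \mathit{start}_p = \mathit{start}_\ell;\ \mathit{start}_t = \mathit{done}_p;\ \mathit{done}_\ell = \mathit{done}_t
\end{aligned}
\]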
How do we reconcile these three ? We have a label s in one, but a label t in another,
which are not equivalent. The key is to deﬁne the concept of a degenerate equation as
one which simply equates two variables. We then add in the concept of a degenerate label
by deﬁning such as a label for which every equation in which it appears is degenerate,
and that all variables referencing it occur as the righthand-side of at least one such
degenerate equation. Careful examination of the equations above finds that labels s
and t are degenerate. Label ℓ is not degenerate, because start_ℓ does not appear on any
equation righthand-side.
We shall simply use appropriate equation substitution to eliminate degenerate labels
— for example if n is degenerate below then
\[ x_n = y_m;\quad z_p = f(\ldots x_n \ldots) \]
can be safely replaced by
\[ z_p = f(\ldots y_m \ldots) \]
We can safely do this as it has no eﬀect on the underlying hardware — in eﬀect degenerate
equations simply indicate a situation where wires have multiple names, and a degenerate
label is one whose sole use is in the provision of one of these aliases. Removing them
makes no diﬀerence to the underlying hardware.
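The substitution step can be sketched in Python. The encoding is an assumption of the example: equations are a dictionary from left-hand variable to a right-hand string, variables carry their label as a suffix such as `done_s`, and the label passed in is taken to have already been judged degenerate. The sketch also assumes that alias chains are acyclic.

```python
import re
from typing import Dict


def eliminate(eqns: Dict[str, str], label: str) -> Dict[str, str]:
    """Remove the alias equations of a degenerate label by substitution."""
    suffix = "_" + label
    aliases = {lhs: rhs for lhs, rhs in eqns.items() if lhs.endswith(suffix)}
    for lhs, rhs in aliases.items():
        if not rhs.isidentifier():
            raise ValueError(f"{lhs} = {rhs} is not a degenerate equation")

    def resolve(name: str) -> str:
        while name in aliases:           # follow chains such as done_s -> start_s -> start_l
            name = aliases[name]
        return name

    def rewrite(rhs: str) -> str:
        return re.sub(r"[A-Za-z_]\w*", lambda m: resolve(m.group(0)), rhs)

    return {lhs: rewrite(rhs) for lhs, rhs in eqns.items() if lhs not in aliases}


# Skip; P and P; Skip, using "l" for the outer label.
skip_then_p = {"done_p": "P[start_p]", "done_s": "start_s", "start_s": "start_l",
               "start_p": "done_s", "done_l": "done_p"}
p_then_skip = {"done_p": "P[start_p]", "done_t": "start_t", "start_p": "start_l",
               "start_t": "done_p", "done_l": "done_t"}
just_p = {"start_p": "start_l", "done_l": "done_p", "done_p": "P[start_p]"}
```

Eliminating `s` from the first set and `t` from the second leaves both equal to `just_p`, which mirrors the argument made in the text.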
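If we now strip t and s out, we obtain
\[
\begin{aligned}
[\![\ell :: (s::\mathit{Skip} \,;\, p::P)]\!] &\equiv \mathit{start}_p = \mathit{start}_\ell;\ \mathit{done}_\ell = \mathit{done}_p;\ \mathit{done}_p = P[\mathit{start}_p] \\
[\![\ell :: (p::P \,;\, t::\mathit{Skip})]\!] &\equiv \mathit{start}_p = \mathit{start}_\ell;\ \mathit{done}_\ell = \mathit{done}_p;\ \mathit{done}_p = P[\mathit{start}_p]
\end{aligned}
\]
We see all three sets of equations are now identical. □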
4.2 Proving commutativity of ||
We now consider the issues surrounding a proof of the commutativity of parallel compo-
sition:
P || Q = Q || P
We label both sides as follows:
ℓ::((p::P) || (q::Q)) = ℓ::((q::Q) || (p::P))
Firstly, for the proof we assume that P and Q have disjoint labels, and that neither use
labels p or q . We assume that both instances of P have the same labelling internally (and
similarly for Q).
We compile the lefthand-side to get:
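\[
\begin{aligned}
& [\![\ell :: ((p::P) \parallel (q::Q))]\!] \\
\mathrel{\widehat{=}}\;& [\![p::P]\!] \uplus [\![q::Q]\!] \;\uplus \\
& \mathit{start}_p = \mathit{start}_\ell;\ \mathit{start}_q = \mathit{start}_\ell;\ \mathit{done}_\ell = \mathit{sync}[\mathit{done}_p, \mathit{done}_q]
\end{aligned}
\]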
Before expanding out [[p::P ]] and [[q ::Q ]], we note that in general these produce not just
one equation, but many, and that these may refer to common variables. However, as the
⊎ operator is associative and commutative, we do not need to deal with this explicitly, so
we can complete the expansion of the lefthand-side as:
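\[
\begin{aligned}
& [\![\ell :: ((p::P) \parallel (q::Q))]\!] \\
\equiv\;& \mathit{done}_p = P[\mathit{start}_p] \;\uplus\; \mathit{done}_q = Q[\mathit{start}_q] \;\uplus \\
& \mathit{start}_p = \mathit{start}_\ell;\ \mathit{start}_q = \mathit{start}_\ell;\ \mathit{done}_\ell = \mathit{sync}[\mathit{done}_p, \mathit{done}_q]
\end{aligned}
\]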
This works ﬁne, but we need to keep in the back of our minds that P and Q in the
semantics also stand for zero or more additional hidden equations.
We expand out the righthand-side:
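\[
\begin{aligned}
& [\![\ell :: ((q::Q) \parallel (p::P))]\!] \\
\equiv\;& \mathit{done}_q = Q[\mathit{start}_q] \;\uplus\; \mathit{done}_p = P[\mathit{start}_p] \;\uplus \\
& \mathit{start}_q = \mathit{start}_\ell;\ \mathit{start}_p = \mathit{start}_\ell;\ \mathit{done}_\ell = \mathit{sync}[\mathit{done}_q, \mathit{done}_p]
\end{aligned}
\]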
The ordering of equations is irrelevant, and the only diﬀerence between the two forms
is the equation for done. To complete the proof we require that the sync function be
invariant on any re-ordering of its inputs:
sync[a, b] = sync[b, a]
If we assume that sync has this property then our proof is complete. In fact, we take this
property of sync as a speciﬁcation that sync must satisfy.
The proof that sequential composition is associative is similar to that showing that
Skip is a unit for composition, and requires eliminating degenerate variables in the same
way. The (perhaps unsurprising) result of that proof is the following:
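\[
\begin{aligned}
& [\![\ell :: (p::P \,;\, m::(q::Q \,;\, r::R))]\!] \\
\equiv\;& [\![\ell :: (n::(p::P \,;\, q::Q) \,;\, r::R)]\!] \\
\equiv\;& \mathit{start}_p = \mathit{start}_\ell;\ \mathit{done}_p = P[\mathit{start}_p] \\
& \mathit{start}_q = \mathit{done}_p;\ \mathit{done}_q = Q[\mathit{start}_q] \\
& \mathit{start}_r = \mathit{done}_q;\ \mathit{done}_r = R[\mathit{start}_r] \\
& \mathit{done}_\ell = \mathit{done}_r
\end{aligned}
\]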
which is the obvious way one would deﬁne the semantics of the three-way sequential
composition construct:
ℓ::(p::P ; q::Q ; r::R)
We have seen that in order to prove some laws we need lemmas regarding properties
of building blocks, such as sync. Proving b ∗ P = (P ; b ∗ P) ◁ b ▷ Skip requires a much
more complex result: namely the Hardware Cloning Lemma.
The Hardware Cloning Lemma states that if we clone a piece of control hardware, and
occasionally run the clone instead of the original, then the switch is unobservable. Note that
only the control hardware is cloned — a reference to a variable x or channel c denotes
the same hardware elements in both the original and cloned hardware.
Let donep = P [startp ] denote the hardware generated for program p::P . The cloned
hardware needs to have labels distinct from those of the original, so let doner = R(P)[startr ]
denote the cloned hardware, where R is a relabelling function, that maps label p to r ,
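\[
\begin{aligned}
\mathit{wait}[x] \lor \mathit{wait}[y] &= \mathit{wait}[x \lor y] && \text{(Wait)} \\
\mathit{mux}_2(c, d)(x, x) &= \mathit{mux}_1(c \lor d)(x) && \text{(Mux)} \\
\mathit{sync}_k[\mathit{TRUE}, \ldots, \mathit{TRUE}] &= \mathit{TRUE} && \text{(SyncNow)} \\
\mathit{sync}_k[\mathit{wait}[\mathit{TRUE}], \ldots, \mathit{wait}[\mathit{TRUE}]] &= \mathit{wait}[\mathit{TRUE}] && \text{(SyncNext)} \\
\mathit{sync}_2[m, n] \lor \mathit{sync}_2[s, t] &= \mathit{sync}_2[m \lor s, n \lor t] \\
&\phantom{=}\ \text{provided } \lnot((m \lor n) \land (s \lor t)) && \text{(SyncClone)}
\end{aligned}
\]
Fig. 6. Building Block Properties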
and maps other labels to new values. We can state the Lemma as
P [startp ] ∨ R(P)[startr ] ≡ P [startp ∨ startr ]
Here startp is true when we plan to run the original, and startr is true when we want
to run the clone. Starting P with startp ∨ startr corresponds to running the original
in all cases. We assert that we cannot distinguish these two cases. The proof is a long
induction over the abstract syntax structure of processes, which we omit.
Why do we need the Lemma to show b ∗ P ≡ (P ; b ∗ P) ◁ b ▷ Skip? Simply
because the lefthand-side mentions P once, but the righthand-side mentions it twice.
The proof also requires the properties of building blocks shown in Figure 6. Property
Mux is combinatorial and easy to prove, and captures the fact that consistent re-ordering
of controls and inputs does not alter the behaviour. Property SyncClone arises because
the ‖ case in the proof exposes a key assumption required in order for the cloning lemma
to hold, namely that for any language construct, once start is asserted, it must remain
false on subsequent cycles until done is asserted. In other words, we cannot re-start
hardware until it is done. We shall refer to this as the “No-Pipeline principle” 9 , which is
captured by the side-condition ¬ ((m ∨ n) ∧ (s ∨ t)). We consider this a nice example of
how a formal theoretical analysis of an artifact (Handel-C hardware in this case) exposes
a key underlying principle about how such an artifact should be operated.
This and the other properties require a formal model that captures time in order to
be proven. It is to this that we next turn our attention.
5 Register Transfer Notation
To capture time, we need to be very clear about when signals are latched into registers,
something about which the equations are somewhat vague. The equation x = g [y , z ]
indicates that clocked storage is used by g but is unclear about precise timings. We
shall deﬁne a subset of our hardware equation notation, called Register Transfer Nota-
tion (RTN) that explicitly deﬁnes which lefthand-side variables denote registers. The
9 This should not be interpreted as meaning that it is not possible to build pipe-lined architectures using Handel-C
— this is very feasible, by ensuring each pipeline stage is a separate block running in parallel with other stages.
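\[
\begin{aligned}
y = \mathit{wait}[x] \quad&\to\quad y := x \\
x = \mathit{register}[\mathit{load}, d] \quad&\to\quad x := d \triangleleft \mathit{load} \triangleright x \\
\mathit{alldone} = \mathit{sync}_k[\mathit{done}_1, \ldots, \mathit{done}_k] \quad&\to\quad \mathit{alldone} = \bigwedge\{\mathit{isdone}_1, \ldots, \mathit{isdone}_k\} \\
&\phantom{\to}\quad \mathit{isdone}_i = \mathit{done}_i \lor \mathit{cmpl}_i \\
&\phantom{\to}\quad \mathit{cmpl}_i := \lnot\mathit{alldone} \land (\mathit{cmpl}_i \lor \mathit{done}_i)
\end{aligned}
\]
Fig. 7. Sequential Building Blocks in RTN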
combinatorial expressions remain unchanged, but the sequential ones must now be “im-
plemented” in terms of a single store primitive store[in] which stores its in value (boolean
or integer) at every clock edge. We insist that a sequential statement can only consist of
a single use of store, so must be of the form: x = store[data] which we shall simplify with
the shorthand x := data. These latter equations are now referred to as storage equations.
We show the implementation of the sequential building blocks in terms of RTN in Figure
7. It is worth noting that the wait building block is in fact exactly the same as the store
block just introduced.
We can give a formal semantics to RTN by translating it into state machines. Given
combinatorial-cycle free RTN equations:
\[ w_1 = \mathit{expr}_1;\ \ldots;\ w_m = \mathit{expr}_m;\ v_1 := \mathit{expr}_{m+1};\ \ldots;\ v_n := \mathit{expr}_{m+n} \]
where expri ranges over v1, .., vn and x1, .., xℓ, we define the state to be the vector s =
(v1, .., vn ) ∈ S , the output vector to be o = (w1, ..,wm ) ∈ O , and the input vector to be
i = (x1, .., xℓ) ∈ I . In effect the lefthand-sides of the storage equations constitute the state,
and outputs are characterised by being the lefthand-sides of the other equations. If we
want to output a state component directly (vi say), then we add a (degenerate) equation
wm+1 = vi to signal this.
Given the vectors i, s and o, we can then summarise the equations as:
o = (expr1(s, i), .., exprm (s, i))
s := (exprm+1(s, i), .., exprm+n (s, i))
These can be interpreted as representing the next-state and output functions of a state
machine.
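One way to make this translation concrete is a small interpreter that evaluates a set of RTN equations for a single clock cycle. The encoding is an assumption of the sketch: each right-hand side is a Python function of an environment lookup, combinatorial equations live in `comb` and storage equations in `store`. The example equations are the RTN form of sync with k = 2.

```python
from typing import Callable, Dict, Tuple

Env = Callable[[str], bool]                 # looks up the current value of a variable
Eqns = Dict[str, Callable[[Env], bool]]


def step(comb: Eqns, store: Eqns, state: Dict[str, bool],
         inputs: Dict[str, bool]) -> Tuple[Dict[str, bool], Dict[str, bool]]:
    """Evaluate one clock cycle and return (outputs, next_state)."""
    cache: Dict[str, bool] = {}

    def env(name: str) -> bool:
        if name in state:
            return state[name]
        if name in inputs:
            return inputs[name]
        if name not in cache:               # terminates: no combinatorial cycles
            cache[name] = comb[name](env)
        return cache[name]

    outputs = {w: env(w) for w in comb}
    next_state = {v: rhs(env) for v, rhs in store.items()}   # all read the old state
    return outputs, next_state


# RTN equations for sync with two inputs.
comb = {
    "isdone1": lambda v: v("done1") or v("cmpl1"),
    "isdone2": lambda v: v("done2") or v("cmpl2"),
    "alldone": lambda v: v("isdone1") and v("isdone2"),
}
store = {
    "cmpl1": lambda v: (not v("alldone")) and (v("cmpl1") or v("done1")),
    "cmpl2": lambda v: (not v("alldone")) and (v("cmpl2") or v("done2")),
}

state = {"cmpl1": False, "cmpl2": False}
alldone_trace = []
for d1, d2 in [(True, False), (False, False), (False, True), (False, False)]:
    out, state = step(comb, store, state, {"done1": d1, "done2": d2})
    alldone_trace.append(out["alldone"])
```

In this run `alldone` is true only in the third cycle, when the second `done` arrives, and the completion flags are cleared again at the following clock edge.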
A state machine with input i : I , state s : S , output o : O , next-state (ns : I → S →
S ) and output (op : I → S → O) functions has behaviour:
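\[
\begin{aligned}
\mathit{run}_{ns,op} &: \mathrm{seq}\, I \to S \to S \times \mathrm{seq}\, O \\
\mathit{run}_{ns,op}(\langle\rangle)\, s_0 &\mathrel{\widehat{=}} (s_0, \langle\rangle) \\
\mathit{run}_{ns,op}(i : is)\, s_0 &\mathrel{\widehat{=}} (s', (op(i)\, s_0) : os') \\
&\phantom{\mathrel{\widehat{=}}}\ \text{where } (s', os') = \mathit{run}_{ns,op}(is)(ns(i)\, s_0)
\end{aligned}
\]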
By encoding hardware equation laws as state machines, we can use run to show that both
sides have the same outcome.
Consider proving that wait [x ] ∨ wait [y ] ≡ wait [x ∨ y ]. We deﬁne state-machines for
both sides of the equivalence as follows: For w1 = wait [x ] ∨ wait [y ] we get equations
w1 = sa ∨ sb ; sa := x ; sb := y and state machine:
s1 = (sa , sb); run1 = runns1,op1
ns1(x , y)(sa , sb) = (x , y); op1(x , y)(sa , sb) = sa ∨ sb
For w2 = wait [x ∨ y ] we obtain equations s2 := x ∨ y ; w2 = s2 and machine:
s2 = sa ∨ sb ; run2 = runns2,op2
ns2(x , y)(s2) = x ∨ y ; op2(x , y)(s2) = s2
We cannot prove that run1 = run2 because the states have diﬀerent types. Instead we
prove that outputs are identical for given inputs, and corresponding initial states:
π2(run1(is)(x0, y0)) = π2(run2(is)(x0 ∨ y0))
Proof: by induction on (the length of) is. The base case is straightforward. The inductive
step requires the following two lemmas:
\[
\begin{aligned}
run_1((x, y) : is)(s_x, s_y) &= (s'', (s_x \lor s_y) : os'') && \text{where } (s'', os'') = run_1(is)(x, y) \\
run_2((x, y) : is)(s) &= (s'', s : os'') && \text{where } (s'', os'') = run_2(is)(x \lor y)
\end{aligned}
\]
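The Inductive Step:
\[
\begin{array}{ll}
  & \pi_2(run_1((x, y) : is)(x_0, y_0)) \\
= & \text{“Lemma for } run_1 \text{”} \\
  & \pi_2((s'', (x_0 \lor y_0) : os'') \ \text{where}\ (s'', os'') = run_1(is)(x, y)) \\
= & \text{“defn. } \pi_2 \text{ (each way)”} \\
  & (x_0 \lor y_0) : os'' \ \text{where}\ os'' = \pi_2(run_1(is)(x, y)) \\
= & \text{“inductive hypothesis”} \\
  & (x_0 \lor y_0) : os'' \ \text{where}\ os'' = \pi_2(run_2(is)(x \lor y)) \\
= & \text{“defn. } \pi_2 \text{ (each way)”} \\
  & \pi_2((s'', (x_0 \lor y_0) : os'') \ \text{where}\ (s'', os'') = run_2(is)(x \lor y)) \\
= & \text{“Lemma for } run_2 \text{”} \\
  & \pi_2(run_2((x, y) : is)(x_0 \lor y_0))
\end{array}
\]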
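The law can also be checked mechanically on small cases. The following Python sketch is not a proof: it compares the output sequences of the two machines for every boolean input sequence up to a chosen length and every pair of corresponding initial states. The helper outputs_agree and its bound max_len are our own assumptions, introduced only for illustration.

```python
from itertools import product


def run(ns, op, inputs, s0):
    """Iterative form of run_{ns,op}: returns (final state, list of outputs)."""
    state, outputs = s0, []
    for i in inputs:
        outputs.append(op(i)(state))
        state = ns(i)(state)
    return state, outputs


# Machine 1: w1 = sa or sb; sa := x; sb := y
ns1 = lambda i: lambda s: (i[0], i[1])
op1 = lambda i: lambda s: s[0] or s[1]

# Machine 2: w2 = s2; s2 := x or y
ns2 = lambda i: lambda s: i[0] or i[1]
op2 = lambda i: lambda s: s


def outputs_agree(max_len=4):
    """Exhaustively compare outputs for all input sequences up to max_len."""
    pairs = list(product((False, True), repeat=2))
    for n in range(max_len + 1):
        for inputs in product(pairs, repeat=n):
            for x0, y0 in pairs:
                _, os1 = run(ns1, op1, inputs, (x0, y0))
                _, os2 = run(ns2, op2, inputs, x0 or y0)
                if os1 != os2:
                    return False
    return True


print(outputs_agree())
```

Only the second components of the results are compared, mirroring the use of π2 in the statement being proved, since the two state spaces have different types.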

State machines can be coded up in the UTP framework [6] (see Appendix A) and
similar proofs can be performed in that setting.
6 Conclusions
We have presented a formal semantics for Handel-C which is compositional and expresses
how a process denotes a chunk of hardware described by a set of equations. We have also
given examples of how we can use this semantics to prove various laws regarding Handel-
C, and we have sketched out a more complex result which requires a speciﬁc operating
principle (No-Pipelining) to hold in order for the underlying hardware to have the correct
behaviour.
We have also built a semantic bridge from Handel-C to UTP, as we can formulate a
theory in UTP about state machines (see Appendix A). This results in the ﬁrst UTP
semantics for Handel-C.
Has the Hardware Semantics provided any insight into an “improved” Operational
Semantics? This is not immediately clear, and as the issue has to do with
prialts and default clauses, we should consider a relevant example, namely the one
referred to earlier in this paper:
〈c!99 → Skip〉 || 〈d !66 → Skip, !? → (false ∗Delay ; 〈c?x → Skip〉)〉
We attach labels to obtain:
(1::〈3::c!99 → 5::Skip〉)
0:: || (2::〈6::d !66 → 8::Skip,
9::!? → 10::(11::(false ∗ 14::Delay);
12::〈15::c?x → 17::Skip〉)〉)
The resulting hardware semantics, simpliﬁed by removing degenerate equations and
dangling variables, is:
done0 = (start5 ∨ cmpl1) ∧ (start17 ∨ cmpl2)
cmpl1 := ¬done0 ∧ (start5 ∨ cmpl1)
cmpl2 := ¬done0 ∧ (start8 ∨ start17 ∨ cmpl2)
start5 := start0; start8 := false; start17 := start0
x := mux1(start0)(99) ◁ start0 ▷ x
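To see what these equations do, the control part can be stepped cycle by cycle. The Python sketch below is illustrative only: it omits the data equation for x, and it assumes that every storage element holds false before the first clock edge, which the equations themselves do not state. The function names step and simulate are ours.

```python
def step(start0, st):
    """One clock cycle of the control equations: returns (done0, next state)."""
    done0 = (st["start5"] or st["cmpl1"]) and (st["start17"] or st["cmpl2"])
    nxt = {
        "cmpl1": (not done0) and (st["start5"] or st["cmpl1"]),
        "cmpl2": (not done0) and (st["start8"] or st["start17"] or st["cmpl2"]),
        "start5": start0,
        "start8": False,
        "start17": start0,
    }
    return done0, nxt


def simulate(starts):
    """Feed a sequence of start0 values; return the sequence of done0 values."""
    # Assumption: all storage elements are initially false.
    st = dict.fromkeys(["cmpl1", "cmpl2", "start5", "start8", "start17"], False)
    trace = []
    for s in starts:
        done0, st = step(s, st)
        trace.append(done0)
    return trace


print(simulate([True, False, False]))
```

Under these assumptions a single start pulse yields a single done0 pulse one cycle later, with neither cmpl register ever being set.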
The process of selecting which guards are active has been effectively “calculated out” by
the process of simplification. It is not clear how this could reflect back into operational
semantics: there are chains of equations linking prialt guards in order, and cross-linking
to other prialts, but these are difficult to see, even with a global overview!
However, the Hardware Semantics is interesting in its own right, as it exposes clearly
how a Handel-C program is really a description of a finite state machine. It also exposed
the “No-Pipeline” principle, which suggests experimenting with pipelining language
constructs.
We need to complete ongoing work to fully formalise the linkages between hardware
equations, RTN, the state machines and the UTP semantics. It also remains to be seen
which Handel-C laws, in full, can be verified using this hardware semantics.
References
[1] Butterﬁeld, A. and J. Woodcock, Semantics of prialt in Handel-C, in: J. Pasco, P. Welch, R. Loader and
V. Sunderam, editors, Communicating Process Architectures – 2002, Concurrent Systems Engineering (2002),
pp. 1–16.
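\[
\begin{aligned}
init(s_0) &\mathrel{\widehat{=}} ins' = \langle\rangle \land st' = s_0 \land outs' = \langle\rangle \\
M_1 ; M_2 &\mathrel{\widehat{=}} \exists\, ins_0, st_0, outs_0 \bullet
  M_1[ins_0, st_0, outs_0 / ins', st', outs'] \land M_2[ins_0, st_0, outs_0 / ins, st, outs] \\
II &\mathrel{\widehat{=}} ins' = ins \land st' = st \land outs' = outs \\
step_{ns,op}(i) &\mathrel{\widehat{=}} ins' = ins \frown \langle i \rangle \land st' = ns(i)\,st \land outs' = outs \frown \langle op(i)\,st \rangle \\
run_{ns,op}(\langle\rangle) &\mathrel{\widehat{=}} II \\
run_{ns,op}(i : is) &\mathrel{\widehat{=}} step_{ns,op}(i) ; run_{ns,op}(is)
\end{aligned}
\]
Fig. A.1. UTP State-Machine Predicates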
[2] Butterfield, A. and J. Woodcock, An operational semantics for Handel-C, in: Thomas Arts and Wan Fokkink,
editors, Electronic Notes in Theoretical Computer Science, 80, Elsevier, 2003.
[3] Celoxica Ltd., “Handel-C Language Reference Manual, v3.0,” (2002), URL: www.celoxica.com.
[4] Dulay, N., T. Lee, W. Luk, E. Lupu, M. Sloman and S. Yusuf, Development framework for ﬁrewall processors,
URL: www.celoxica.com, in Academic Papers section.
[5] Hoare, C., “Communicating Sequential Processes”, Intl. Series in Computer Science, Prentice Hall, 1990.
[6] Hoare, C. A. R. and H. Jifeng, “Unifying Theories of Programming,” Series in Computer Science, Prentice Hall,
1998.
[7] Luk, W. and I. Page, Compiling Occam into field-programmable gate arrays, in: W. Moore and W. Luk,
editors, FPGAs, Oxford Workshop on Field Programmable Logic and Applications, Abingdon EE&CS Books,
15 Harcourt Way, Abingdon OX14 1NV, UK, 1991, pp. 271–283.
A Unifying Theories of Programming
We can encode state machine semantics in the Unifying Theories of Programming frame-
work (UTP) [6]. This describes systems via alphabetised relational predicates which
relate the values of observational variables before some action to their values after the
action has run. This framework has been used to model sequential programs, concurrent
process formalisms such as CSP, logic programming, and implementation techniques such
as assembly language, among others.
Given a state machine with input in : I , output out : O , state st : S , next-state
function ns : I → S → S and output function op : I → S → O , we establish our
UTP observation variables as being: the inputs seen so far, ins : seq I ; the current state,
st : S ; and the outputs generated to date, outs : seq O . A state-machine predicate relates
the values of these variables before a run (ins, st , outs) to their values once the run has
completed (ins ′, st ′, outs ′).
We define four actions on state-machines as UTP predicates in Figure A.1.
The notation P [a, b, c/x , y , z ] denotes the simultaneous substitution of a, b and c for
all free occurrences of x , y and z respectively.
The predicate init(s0) indicates that after a state-machine is initialised, its state is
s0 and the input and output sequences are empty, regardless of prior values. Sequential
composition (;) is taken directly from standard UTP theory [6], and is here given a definition
tailored to state-machine observables. The predicate II (Skip) simply describes a situation
where nothing happens — it is useful as an identity for sequential composition. The
predicate stepns,op(i) describes the eﬀect of stepping a state-machine over one input i .
The predicate runns,op(is) describes the eﬀect of stepping a state-machine over the input
sequence is. It is deﬁned in terms of skip, step and sequential composition. We can easily
show the following laws to hold true:
\[
\begin{aligned}
II ; P &\equiv P \equiv P ; II \\
run_{ns,op}(is_1) ; run_{ns,op}(is_2) &\equiv run_{ns,op}(is_1 \frown is_2)
\end{aligned}
\]
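These laws can be exercised on concrete cases with a small executable model. The Python sketch below is a simplification under stated assumptions: since step and run are deterministic, each predicate is modelled as a function from the before-observation (ins, st, outs) to the after-observation, rather than as a general relation, and sequential composition becomes function composition. The accumulator machine (acc_ns, acc_op) is a hypothetical example, not one taken from the text.

```python
def skip(obs):
    """II: all observations are left unchanged."""
    return obs


def seq(m1, m2):
    """Sequential composition M1; M2 for deterministic predicates."""
    return lambda obs: m2(m1(obs))


def step(ns, op, i):
    """step_{ns,op}(i): record the input, emit an output, update the state."""
    def pred(obs):
        ins, st, outs = obs
        return (ins + (i,), ns(i)(st), outs + (op(i)(st),))
    return pred


def run(ns, op, inputs):
    """run_{ns,op}(is), following the recursive definition of Figure A.1."""
    if not inputs:
        return skip
    return seq(step(ns, op, inputs[0]), run(ns, op, inputs[1:]))


# Hypothetical machine: the state accumulates inputs, the output is the old state.
acc_ns = lambda i: lambda s: s + i
acc_op = lambda i: lambda s: s

obs0 = ((), 0, ())
is1, is2 = (1, 2), (3,)
lhs = seq(run(acc_ns, acc_op, is1), run(acc_ns, acc_op, is2))(obs0)
rhs = run(acc_ns, acc_op, is1 + is2)(obs0)
print(lhs == rhs)
```

Agreement on one starting observation is evidence for the concatenation law, not a proof of it; the law itself follows by induction on the first input sequence.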
