Automatic Formal Synthesis of Hardware from Higher Order Logic  by Gordon, Mike et al.
Automatic Formal Synthesis of Hardware
from Higher Order Logic
Mike Gordon (a), Juliano Iyoda (a), Scott Owens (b), Konrad Slind (b)
(a) University of Cambridge Computer Laboratory, William Gates Building,
JJ Thomson Avenue, Cambridge CB3 0FD, UK
(b) University of Utah, School of Computing, 50 South Central Campus Drive,
Salt Lake City, Utah UT84112, USA
Abstract
A compiler that automatically translates recursive function deﬁnitions in higher order logic to
clocked synchronous hardware is described. Compilation is by mechanised proof in the HOL4
system, and generates a correctness theorem for each function that is compiled. Logic formulas
representing circuits are synthesised in a form suitable for direct translation to Verilog HDL for
simulation and input to standard design automation tools. The compilation scripts are open and
can be safely modiﬁed: synthesised circuits are correct-by-construction. The synthesisable subset
of higher order logic can be extended using additional proof-based tools that transform deﬁnitions
into the subset.
Keywords: Theorem proving, compiling, hardware synthesis
1 Introduction
Our goal is to synthesise correct-by-construction hardware directly from math-
ematical speciﬁcations in higher order logic (HOL [5]). The ‘synthesisable
subset’ of HOL is not intended to be ﬁxed, but to grow as we do case stud-
ies. The compiler currently generates hardware to implement tail-recursive
function deﬁnitions. An example is iterative accumulator-style multiplication:
MultIter(m,n,acc) =
if m = 0 then (0,n,acc) else MultIter(m-1,n,n+acc)
Since MultIter(m,n,acc) = (0,n,(m×n)+acc), a multiplier is deﬁned by:
Mult(m,n) = SND(SND(MultIter(m,n,0)))
Electronic Notes in Theoretical Computer Science 145 (2006) 27–43
1571-0661  © 2005 Elsevier B.V. 
www.elsevier.com/locate/entcs
doi:10.1016/j.entcs.2005.10.003
Open access under CC BY-NC-ND license.
where SND(SND(x,y,z)) evaluates to z, so Mult(m,n) = m×n. Using this
multiplier one could then deﬁne the factorial function by:
FACT n = if n = 0 then 1 else Mult(n, FACT(n-1))
This is not tail-recursive, so it is not synthesisable; however, a separate tool, linRec
(see Section 4), can automatically generate a synthesisable definition:
FactIter(n,acc) =
if n = 0 then (n,acc) else FactIter(n-1,Mult(n,acc))
Fact n = SND(FactIter (n,1))
linRec automatically proves FACT = Fact.
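The definitions above can be transcribed into an executable sketch. The Python below is only an illustration on unbounded integers (the compiler works on HOL terms and defaults to 32-bit words); the lower-case function names are ours, and the tail recursions are written as loops.

```python
def mult_iter(m, n, acc):
    """MultIter: accumulator-style multiplication; the tail call becomes a loop."""
    while m != 0:
        m, n, acc = m - 1, n, n + acc
    return (0, n, acc)

def mult(m, n):
    """Mult(m,n) = SND(SND(MultIter(m,n,0)))."""
    return mult_iter(m, n, 0)[2]

def fact_rec(n):
    """FACT: the linear (non-tail) recursion."""
    return 1 if n == 0 else mult(n, fact_rec(n - 1))

def fact_iter(n, acc):
    """FactIter: the tail-recursive form."""
    while n != 0:
        n, acc = n - 1, mult(n, acc)
    return (n, acc)

def fact(n):
    """Fact n = SND(FactIter(n,1))."""
    return fact_iter(n, 1)[1]
```

Checking fact_rec(n) == fact(n) on a few inputs is only a test, whereas linRec proves the equality for all n.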
The compiler translates a function f , deﬁned in HOL, into a device DEV f
that computes f via a four-phase handshake circuit on signals load, inp, done
and out. These signals are a request line, a data input bus, an acknowledge
line and a data output bus, respectively.
Fig. 1. The handshaking protocol.
The exact behaviour of such a handshaking device is speciﬁed in the HOL def-
inition of the predicate DEV, which is given in the Appendix. This speciﬁcation
says roughly that if a value v is input on inp when a request is made on load
then eventually f(v) will be output on out, and when this occurs is signalled
on done (Fig. 1). In more detail: at the start of a transaction
(say at time t) the device must be outputting T on done (to indicate
it is ready) and the environment must be asserting F on load, i.e. in a state
such that a positive edge on load can be generated. A transaction is initiated
by asserting (at time t+1) the value T on load, i.e. load has a positive edge
at time t+1. This causes the device to read the value, v say, being input on
inp (at time t+1) and to set done to F. The device then becomes insensitive
to inputs until T is next asserted on done, at which time the computed value
f(v) will be output on out.
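The transaction described above can be illustrated with a small behavioural model. This is a sketch under our own assumptions: signals are finite Python lists indexed by time, the function run_device and its fixed latency parameter are hypothetical, and nothing here corresponds to a circuit generated by the compiler.

```python
def run_device(f, load, inp, latency=3):
    """Toy handshaking device: reads inp on a positive edge of load while idle,
    holds done at F for `latency` cycles (latency >= 1), then outputs f(v)."""
    done, out = [], []
    remaining, result, current = 0, None, None
    for t in range(len(load)):
        if remaining > 0:                              # busy: count down
            remaining -= 1
            if remaining == 0:
                current = result                       # result becomes visible
            done.append(remaining == 0)
        elif t > 0 and load[t] and not load[t - 1]:    # idle and positive edge on load
            result, remaining = f(inp[t]), latency
            done.append(False)                         # device becomes busy
        else:
            done.append(True)                          # idle: ready
        out.append(current)
    return done, out

load = [False, False, True, True, False, False, False, False]
inp = [0, 0, 6, 0, 0, 0, 0, 0]
done, out = run_device(lambda v: v + 1, load, inp)
```

In this run the positive edge occurs at time 2, done is F at times 2 to 4, and done returns to T at time 5 with out carrying f(6).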
2 Representation of functions as circuits
A synchronous circuit clocked on the signal clk implements the handshake
protocol computing f if it guarantees that the higher order logic formula:
DEV f (load at clk, inp at clk, done at clk, out at clk)
is true (the Appendix has the formal deﬁnition of DEV). The signals load,
inp, done, out are modelled as functions mapping time to values, and the at-
operator projects a signal to the sequence of values occurring at rising edges
of the clock clk. More precisely, σ at clk is the signal whose value at each time t
is the value that the signal σ has at the t-th rising edge of clk.
The notation “σ@clk” is sometimes used instead of “σ at clk”. The formal
theory of temporal projection is covered in detail in Melham’s monograph [9]
(where it is called ‘temporal abstraction’).
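A finite-trace approximation of temporal projection can be sketched as follows. The helper names are ours, signals are lists, and the choice to sample σ at the instant t where Rise clk t holds (rather than at t+1) is an assumption of this sketch, not taken from the formal theory.

```python
def rising_edges(clk):
    """Times t at which clk is low at t and high at t+1."""
    return [t for t in range(len(clk) - 1) if not clk[t] and clk[t + 1]]

def at(sig, clk):
    """Project sig onto the rising edges of clk: element i is sig at the i-th edge."""
    return [sig[t] for t in rising_edges(clk)]

clk = [False, True, False, True, False, True]
sig = [10, 11, 12, 13, 14, 15]
projected = at(sig, clk)
```

On an infinite trace the precondition InfRise clk guarantees that every index of the projected signal is defined; on a finite list the projection is simply shorter.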
An actual circuit is represented as a conjunction of formulas, each repre-
senting a component instance. Internal wires are existentially-quantiﬁed. This
is a standard modelling of hardware in higher order logic, and is also described
in detail in Melham’s book (ibid).




(∃ v0 v1 v2 v3 v4 v5 v6 v7 v8 v9 v10 v11 v12 v13 v14 v15 v16 v17 v18 v19 v20 v21 v22
v23 v24 v25 v26 v27 v28 v29 v30 v31 v32 v33 v34 v35 v36 v37 v38 v39 v40 v41 v42
v43 v44 v45 v46 v47 v48 v49 v50 v51 v52 v53 v54 v55 v56 v57.
DtypeT(clk,load,v21) ∧ NOT(v21,v20) ∧ AND(v20,load,v19) ∧ Dtype(clk,done,v18) ∧
AND(v19,v18,v17) ∧ OR(v17,v16,v11) ∧ DtypeT(clk,v15,v23) ∧ NOT(v23,v22) ∧
AND(v22,v15,v16) ∧ MUX(v16,v14,inp1,v3) ∧ MUX(v16,v13,inp2,v2) ∧
MUX(v16,v12,inp3,v1) ∧ DtypeT(clk,v11,v26) ∧ NOT(v26,v25) ∧ AND(v25,v11,v24) ∧
MUX(v24,v3,v27,v10) ∧ Dtype(clk,v10,v27) ∧ DtypeT(clk,v11,v30) ∧ NOT(v30,v29) ∧
AND(v29,v11,v28) ∧ MUX(v28,v2,v31,v9) ∧ Dtype(clk,v9,v31) ∧
DtypeT(clk,v11,v34) ∧ NOT(v34,v33) ∧ AND(v33,v11,v32) ∧ MUX(v32,v1,v35,v8) ∧
Dtype(clk,v8,v35) ∧ DtypeT(clk,v11,v39) ∧ NOT(v39,v38) ∧ AND(v38,v11,v37) ∧
NOT(v37,v7) ∧ CONSTANT 0 v40 ∧ EQ32(v3,v40,v36) ∧ Dtype(clk,v36,v6) ∧
DtypeT(clk,v7,v44) ∧ NOT(v44,v43) ∧ AND(v43,v7,v42) ∧ AND(v42,v6,v5) ∧
NOT(v6,v41) ∧ AND(v41,v42,v4) ∧ DtypeT(clk,v5,v48) ∧ NOT(v48,v47) ∧
AND(v47,v5,v46) ∧ NOT(v46,v0) ∧ CONSTANT 0 v45 ∧ Dtype(clk,v45,out1) ∧
Dtype(clk,v9,out2) ∧ Dtype(clk,v8,out3) ∧ DtypeT(clk,v4,v53) ∧ NOT(v53,v52) ∧
AND(v52,v4,v51) ∧ NOT(v51,v15) ∧ CONSTANT 1 v54 ∧ SUB32(v10,v54,v50) ∧
ADD32(v9,v8,v49) ∧ Dtype(clk,v50,v14) ∧ Dtype(clk,v9,v13) ∧ Dtype(clk,v49,v12) ∧
Dtype(clk,v15,v56) ∧ AND(v15,v56,v55) ∧ AND(v0,v7,v57) ∧ AND(v57,v55,done))
==>
DEV MultIter
(load at clk, (inp1<>inp2<>inp3) at clk, done at clk, (out1<>out2<>out3) at clk)
This theorem has the form:
|- InfRise clk ==> circuit ==> device specification
The logic formula InfRise clk asserts that signal clk has an inﬁnite number
of rising edges. This is a standard precondition for temporal projection (ibid)
and is needed because of the use of the at-operator in the device speciﬁcation.
The logic formula circuit is the standard representation of the synthesised
circuit in higher order logic. The components are described in Section 2.
Circuits in this form are the lowest level of formal representation we generate.
However they are easily converted to HDL and then simulated or input to
other tools. We have written a ‘pretty-printer’ that generates Verilog HDL
and have used several simulators and the Quartus II FPGA synthesis tool to
run examples (including MultIter and Fact) on FPGAs.
The logic formula device speciﬁcation uses the HOL predicate DEV de-
scribed above to specify that MultIter is computed using a four-phase hand-
shake. Our compiler defaults to using 32-bit words. The input and output
of MultIter are thus triples of 32-bit words, which are represented by terms
inp1<>inp2<>inp3 and out1<>out2<>out3 where inp1, inp2, inp3, out1,
out2, out3 are 32-bit words and <> denotes word concatenation.
The compiler generates circuits using components from a predeﬁned li-
brary, which can be changed to correspond to the targeted technology (the
default target is Altera FPGAs synthesised using Quartus II).
The components used to implement MultIter are NOT, AND, OR (logic
gates), EQ32 (32-bit equality test), MUX (multiplexer), DtypeT (Boolean D-
type register that powers up into an initial state storing the value T), Dtype
(D-type register with unspeciﬁed initial state), CONSTANT (read-only register
with a predefined value), ADD32 (32-bit adder) and SUB32 (32-bit
subtractor). Each of these components is defined in a standard style in higher
order logic. For example, NOT is deﬁned by:
NOT(inp,out) = ∀t. out(t) = ¬inp(t)
NOT is typical of all the combinational components (i.e. components that can
be implemented directly with logic gates without using registers). The two
sequential components, Dtype and DtypeT, are registers that are triggered on
the positive (rising) edge of a clock and their deﬁnitions use the predicate
Rise deﬁned by:
Rise s t = ¬s(t) ∧ s(t+1)
and then Dtype and DtypeT are deﬁned by:
Dtype (clk, d, q) = ∀t. q(t+1) = if Rise clk t then d t else q t
DtypeT(clk, d, q) = (q 0 = T) ∧ Dtype(clk, d, q)
These models are standard and are described in Melham’s book (ibid).
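The register models can be read as predicates on signals. The sketch below checks them on finite traces (lists of Booleans), so the universal quantifier over t is bounded by the trace length; the function names and the simulate_dtype helper are ours.

```python
def rise(s, t):
    """Rise s t = not s(t) and s(t+1)."""
    return (not s[t]) and s[t + 1]

def dtype_holds(clk, d, q):
    """Dtype(clk,d,q): q(t+1) = if Rise clk t then d t else q t, for every t in range."""
    return all(q[t + 1] == (d[t] if rise(clk, t) else q[t])
               for t in range(len(clk) - 1))

def dtypeT_holds(clk, d, q):
    """DtypeT(clk,d,q): Dtype with initial state T."""
    return q[0] is True and dtype_holds(clk, d, q)

def simulate_dtype(clk, d, q0):
    """Generate the unique q satisfying Dtype for a chosen initial value q0."""
    q = [q0]
    for t in range(len(clk) - 1):
        q.append(d[t] if rise(clk, t) else q[t])
    return q

clk = [False, True, False, True, False, True]
d = [True, False, False, False, True, True]
q = simulate_dtype(clk, d, False)
```

Because Dtype leaves q(0) unconstrained, both initial values give traces that satisfy it, while DtypeT accepts only the trace starting at T.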
3 How the compiler works
The compiler is implemented in the HOL4 system and is a program in Stan-
dard ML that generates a proof in the version of higher order logic supported
by the system (which we refer to as “HOL”).
The compiler creates circuits implementing functions f in higher order logic
where f : σ1×· · ·×σm → τ1×· · ·× τn and σ1, . . . , σm, τ1, . . . , τn are the types
of values that can be carried on buses (e.g. n-bit words). The starting point
of compilation is the definition in HOL of such a function f by an equation
of the form: f(x1, . . . , xm) = e, where any recursive calls of f in e must be
tail-recursive. Invoking our compiler on such a deﬁnition (if necessary with a
user-supplied measure function to aid proof of termination) will ﬁrst deﬁne f
in higher order logic (using TFL [15]) and then prove a theorem:
|- InfRise clk
==> circuit
==> DEV f (load at clk, inputs at clk, done at clk, outputs at clk)
where inputs is inp1<>· · ·<>inpm, outputs is out1<>· · ·<>outn (with the type
of inpi matching σi and the type of outj matching τj) and circuit is a HOL
formula representing a circuit with inputs clk, load, inp1, . . ., inpm and
outputs done, out1, . . ., outn that computes f .
The first step (Step 1) in compiling f(x1, . . . , xm) = e encodes e as an
applicative expression, E say, built from the operators Seq (compute in se-
quence), Par (compute in parallel), Ite (if-then-else) and Rec (recursion),
deﬁned by:
Seq f1 f2 = λx. f2(f1 x)
Par f1 f2 = λx. (f1 x, f2 x)
Ite f1 f2 f3 = λx. if f1 x then f2 x else f3 x
Rec f1 f2 f3 = λx. if f1 x then f2 x else Rec f1 f2 f3 (f3 x)
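The four combinators translate almost literally into higher-order functions. The following is a sketch only: Rec is written as a loop rather than a recursive call, and the example functions at the end are made up for illustration.

```python
def Seq(f1, f2):
    """Compute f1, then feed its result to f2."""
    return lambda x: f2(f1(x))

def Par(f1, f2):
    """Compute f1 and f2 on the same input and pair the results."""
    return lambda x: (f1(x), f2(x))

def Ite(f1, f2, f3):
    """If f1 x then f2 x else f3 x."""
    return lambda x: f2(x) if f1(x) else f3(x)

def Rec(f1, f2, f3):
    """Tail recursion: iterate f3 until f1 holds, then apply f2."""
    def run(x):
        while not f1(x):
            x = f3(x)
        return f2(x)
    return run

# Hypothetical examples, not from the paper.
absolute = Ite(lambda x: x < 0, lambda x: -x, lambda x: x)
strip_twos = Rec(lambda x: x % 2 == 1, lambda x: x, lambda x: x // 2)
```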
The encoding into an applicative expression built out of Seq, Par, Ite and Rec
is performed by a proof script and results in a theorem ⊢ (λ(x1, . . . , xm). e) = E,
and hence ⊢ f = E. The algorithm used is straightforward and is not described
here. As an example, the proof script deduces from:
⊢ FactIter(n, acc) =
    if n = 0 then (n, acc) else FactIter(n − 1, n×acc)
the theorem:
⊢ FactIter =
    Rec (Seq (Par (λ(n, acc). n) (λ(n, acc). 0)) (=))
        (Par (λ(n, acc). n) (λ(n, acc). acc))
        (Par (Seq (Par (λ(n, acc). n) (λ(n, acc). 1)) (−))
             (Seq (Par (λ(n, acc). n) (λ(n, acc). acc)) (×)))
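The encoded form of FactIter can be evaluated directly once the combinators are given an executable reading. In this sketch the operators (=), (−) and (×) are taken to act on pairs, which we model with a hypothetical uncurry helper; fst, snd and const are also our names.

```python
import operator

def Seq(f1, f2):
    return lambda x: f2(f1(x))

def Par(f1, f2):
    return lambda x: (f1(x), f2(x))

def Rec(f1, f2, f3):
    def run(x):
        while not f1(x):
            x = f3(x)
        return f2(x)
    return run

def uncurry(op):
    """Turn a two-argument operator into a function on pairs."""
    return lambda pair: op(pair[0], pair[1])

def fst(p):          # λ(n, acc). n
    return p[0]

def snd(p):          # λ(n, acc). acc
    return p[1]

def const(k):        # λ(n, acc). k
    return lambda p: k

fact_iter = Rec(
    Seq(Par(fst, const(0)), uncurry(operator.eq)),       # test: n = 0
    Par(fst, snd),                                       # result: (n, acc)
    Par(Seq(Par(fst, const(1)), uncurry(operator.sub)),  # next n: n − 1
        Seq(Par(fst, snd), uncurry(operator.mul))),      # next acc: n × acc
)
```

Applying fact_iter to the pair (5, 1) iterates the step function five times and returns the final pair.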
The second step (Step 2) replaces the combinators Seq, Par, Ite and
Rec with corresponding circuit constructors SEQ, PAR, ITE and REC that compose
handshaking devices (see the Appendix for their definitions). The key
properties of these constructors are the following theorems, which enable us to
compositionally deduce theorems of the form ⊢ Imp =⇒ DEV f, where Imp is
a formula constructed using the circuit constructors, and hence is a handshaking
device. The long arrow symbol =⇒ denotes implication lifted to functions:
f =⇒ g = ∀load inp done out. f(load, inp, done, out) ⇒ g(load, inp, done, out).
⊢ DEV f =⇒ DEV f
⊢ (P1 =⇒ DEV f1) ∧ (P2 =⇒ DEV f2)
    ⇒ (SEQ P1 P2 =⇒ DEV (Seq f1 f2))
⊢ (P1 =⇒ DEV f1) ∧ (P2 =⇒ DEV f2)
    ⇒ (PAR P1 P2 =⇒ DEV (Par f1 f2))
⊢ (P1 =⇒ DEV f1) ∧ (P2 =⇒ DEV f2) ∧ (P3 =⇒ DEV f3)
    ⇒ (ITE P1 P2 P3 =⇒ DEV (Ite f1 f2 f3))
⊢ Total(f1, f2, f3)
    ⇒ (P1 =⇒ DEV f1) ∧ (P2 =⇒ DEV f2) ∧ (P3 =⇒ DEV f3)
    ⇒ (REC P1 P2 P3 =⇒ DEV (Rec f1 f2 f3))
The predicate Total is defined so that Total(f1, f2, f3) ensures termination.
If E is an expression built using Seq, Par, Ite and Rec, then by instantiating
the predicate variables P1, P2 and P3, these theorems enable a logic
formula F to be built from circuit constructors SEQ, PAR, ITE and REC such
that ⊢ F =⇒ DEV E. From Step 1 we have ⊢ f = E, hence ⊢ F =⇒ DEV f.
A function f which is combinational can be packaged as a handshaking
device using a constructor ATM, which creates a simple handshake interface
and satisfies the refinement theorem:
⊢ ATM f =⇒ DEV f
The circuit constructor ATM is deﬁned with the other constructors in the Ap-
pendix. To avoid a proliferation of internal handshakes, when the proof script
that constructs F from E is implementing Seq f1 f2, it checks to see whether f1
or f2 are compositions of combinational functions and if so introduces PRECEDE
or FOLLOW instead of SEQ, using the theorems:
⊢ (P =⇒ DEV f2) ⇒ (PRECEDE f1 P =⇒ DEV (Seq f1 f2))
⊢ (P =⇒ DEV f1) ⇒ (FOLLOW P f2 =⇒ DEV (Seq f1 f2))
PRECEDE f d processes inputs with f before sending them to d and FOLLOW d f
processes outputs of d with f . The deﬁnitions are:
PRECEDE f d (load, inp, done, out) =
∃v. COMB f (inp, v) ∧ d(load, v, done, out)
FOLLOW d f (load, inp, done, out) =
∃v. d(load, inp, done, v) ∧ COMB f (v, out)
COMB f (v1, v2) drives v2 with f(v1), i.e. COMB f (v1, v2) = ∀t. v2 t = f(v1 t).
SEQ d1 d2 introduces a handshake between the executions of d1 and d2, but
PRECEDE f d and FOLLOW d f just ‘wire’ f before or after d, respectively,
without introducing a handshake. Replacing SEQ by PRECEDE or FOLLOW is an
example of a ‘peephole’ optimisation.
Step 2 results in a theorem ⊢ F =⇒ DEV f, where F is a logic formula built
using the circuit constructors ATM, SEQ, PAR, ITE, REC, PRECEDE and FOLLOW.
The third step (Step 3) is to rewrite with the deﬁnitions of these construc-
tors (see their deﬁnitions in the Appendix) to get a circuit built out of standard
kinds of gates (AND, OR, NOT and MUX), the generic combinational component
COMB g (where g will be a function represented as a HOL λ-expression) and
Dtype registers.
Formulas of the form COMB g (inp, out) are then converted into circuits built
only using components in the library of predeﬁned circuits. The default library
currently includes Boolean functions (e.g. ∧, ∨ and ¬), multiplexers and
simple operations on n-bit words (e.g. versions of +, − and <, various shifts
etc.). A special purpose proof rule uses a recursive algorithm to synthesise
combinational circuits. For example:
⊢ COMB (λ(m,n). (m < n, m+1)) (inp1<>inp2, out1<>out2) =
    ∃v0. COMB (<) (inp1<>inp2, out1) ∧ CONSTANT 1 v0 ∧
         COMB (+) (inp1<>v0, out2)
where <> is bus concatenation, CONSTANT 1 v0 drives v0 high continuously, and
COMB < and COMB + are assumed given components (if they were not given,
then they could be implemented explicitly, but one has to stop somewhere).
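The decomposition in this example can be checked pointwise at the level of values. The sketch below compares the original λ-expression with the wired-up components at a single time instant; treating + as addition modulo 2^32 is our assumption about the word operations, and the names spec and circuit are ours.

```python
MASK = 0xFFFFFFFF  # 32-bit words, the compiler's stated default width

def spec(m, n):
    """The function λ(m,n). (m < n, m+1), with assumed wrap-around addition."""
    return (m < n, (m + 1) & MASK)

def circuit(inp1, inp2):
    """The synthesised structure: a comparator, a constant and an adder."""
    v0 = 1                          # CONSTANT 1 v0 (internal wire)
    out1 = inp1 < inp2              # COMB (<) (inp1<>inp2, out1)
    out2 = (inp1 + v0) & MASK       # COMB (+) (inp1<>v0, out2)
    return (out1, out2)

samples = [(0, 0), (3, 5), (5, 3), (MASK, 0)]
agree = all(spec(m, n) == circuit(m, n) for m, n in samples)
```

The existentially quantified v0 in the theorem corresponds to the local variable v0 here: it is an internal wire that is not visible at the interface.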
The circuit resulting at the end of Step 3 uses unclocked abstract registers
DEL, DELT and DFF that were chosen for convenience in deﬁning ATM, SEQ, PAR,
ITE and REC (see the Appendix). The register DFF is easily deﬁned in terms
of DEL, DELT and some combinational logic (details omitted).
The fourth step (Step 4) introduces a clock (with default name clk) and
performs an automatic temporal projection as described in Melham’s book [9]
using the theorems:
⊢ InfRise clk ⇒ ∀d q. Dtype(clk, d, q) ⇒ DEL(d at clk, q at clk)
⊢ InfRise clk ⇒ ∀d q. DtypeT(clk, d, q) ⇒ DELT(d at clk, q at clk)
By instantiating load , inp, done and out in the theorem obtained by Step 3
to load at clk , inp at clk , done at clk and out at clk , respectively, and then
performing some deductions using the above theorems and the monotonicity
of existential quantiﬁcation and conjunction with respect to implication, we
obtain a theorem:
|- InfRise clk ==>
circuit implementing f ==>
DEV f (load at clk, inputs at clk, done at clk, outputs at clk)
4 Additional tools: linRec
The ‘synthesisable subset’ of HOL is the subset that can be automatically
compiled to circuits. Currently this only includes tail-recursive function def-
initions. We anticipate compiling higher level speciﬁcations by using proof
tools that translate into the synthesisable subset. Such tools are envisioned as
‘third party’ add-ons developed for particular applications. As a preliminary
experiment we are implementing a tool linRec to translate linear recursions
to tail-recursions. This would enable, for example, the automatic generation
of MultIter and FactIter from the more natural deﬁnitions:
Mult(m,n) = if m = 0 then 0 else n+Mult(m-1,n)
Fact n = if n = 0 then 1 else n*Fact(n-1)
A prototype implementation of linRec exists. It uses the following deﬁ-
nition of linear and tail-recursive recursion schemes:
linRec(x) = if a(x) then b(x) else c (linRec(d x)) (e x)
tailRec(x,u) = if a(x) then c (b x) u else tailRec(d x, c (e x) u)
A linear recursion is matched with the deﬁnition of linRec to ﬁnd values of a,
b, c, d, e and then converted to a tail recursion by instantiating the theorem:
∀ R a b c d e.
WF R
∧ (∀ x. ¬(a x) ==> R (d x) x)
∧ (∀ p q r. c p (c q r) = c (c p q) r)
==>
∀ x u. c (linRec a b c d e x) u = tailRec a b c d e (x,u)
where WF R means that R is well-founded. Heuristics are used to choose an
appropriate witness for R.
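The two recursion schemes and the conversion theorem can be exercised on concrete instances. The sketch below is illustrative only: the parameters a, b, c, d, e are passed explicitly, tailRec is written as a loop, and the factorial and multiplication instantiations are our own choices. Testing a few inputs is of course much weaker than the theorem, which is proved for all x and u.

```python
import operator

def lin_rec(a, b, c, d, e):
    """linRec(x) = if a(x) then b(x) else c (linRec(d x)) (e x)."""
    def go(x):
        return b(x) if a(x) else c(go(d(x)), e(x))
    return go

def tail_rec(a, b, c, d, e):
    """tailRec(x,u) = if a(x) then c (b x) u else tailRec(d x, c (e x) u)."""
    def go(x, u):
        while not a(x):
            x, u = d(x), c(e(x), u)   # both right-hand sides use the old x
        return c(b(x), u)
    return go

# Factorial: c is multiplication (associative), d decrements n.
fact_args = (lambda n: n == 0, lambda n: 1, operator.mul,
             lambda n: n - 1, lambda n: n)
# Multiplication on pairs (m, n): c is addition, d decrements m.
mult_args = (lambda x: x[0] == 0, lambda x: 0, operator.add,
             lambda x: (x[0] - 1, x[1]), lambda x: x[1])

fact_lin, fact_tail = lin_rec(*fact_args), tail_rec(*fact_args)
mult_lin, mult_tail = lin_rec(*mult_args), tail_rec(*mult_args)
```

In both instances the relation R can be taken to be the usual ordering on the decreasing natural-number argument, and the associativity hypothesis holds for multiplication and addition.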
5 Current State and Future work
The compiler described here has been through several versions and now works
robustly on all the examples we have tried.
We have written a ‘pretty-printer’ that converts circuit formulas to Verilog,
so that they can be simulated and input to other tools. We initially had
difficulties when we first experimented with Verilog simulation. Our formal
model represents bits as Booleans (T, F), but the Verilog simulation model is
multi-valued (1, 0, x, z etc.), so our formal model does not predict the Verilog
simulation behaviour in which registers are initialised to x. As a result, Verilog
simulation was generating undeﬁned x-values instead of the outputs predicted
by our proofs. The behaviour of most real hardware does not correspond
to Verilog simulation because in reality registers initialise to a deﬁnite value,
which is 0 for the Altera FPGAs we are using. By making our Verilog model
of Dtype initialise its state to 0 we were able to successfully simulate all our
examples. Since our proofs are valid for any initial value, the Verilog model
of Dtype is a valid implementation of the model in higher order logic. Our
investigation of this issue was complicated by a bug in the Verilog simulation
test harness: load was being asserted before done became T, violating the
precondition of the handshake protocol, so even after we understood the ini-
tialisation problem, simulation was giving inexplicable results. However, once
we ﬁxed the test-bench, everything worked. All our examples now execute
correctly both under simulation and on an Altera Excalibur FPGA board.
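If we simulate our implementation of MultIter with inputs (5, 7, 0) using
a standard Verilog simulator (http://www.icarus.com) and view the resulting
waveform, we see the following.
[Figure: simulation waveform for MultIter; the values 0 7 14 21 28 35 appear along the trace.]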
load is asserted at time 15; done is T then, but immediately drops to F in
response to load being asserted. At the time when load is asserted the values
5, 7 and 0 are put on lines inp1, inp2 and inp3, respectively. At time 135
done rises to T again, and by then the values on out1, out2 and out3 are 0, 7
and 35, respectively; thus MultIter(5,7,0) = (0,7,35), which is correct.
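The iteration behind this result can be replayed at the function level. The sketch below records every intermediate state of the MultIter recursion on unbounded Python integers; it says nothing about cycle timing in the simulated circuit, and the function name is ours.

```python
def mult_iter_trace(m, n, acc):
    """Return the list of (m, n, acc) states visited by MultIter."""
    states = [(m, n, acc)]
    while m != 0:
        m, n, acc = m - 1, n, n + acc
        states.append((m, n, acc))
    return states

trace = mult_iter_trace(5, 7, 0)
acc_values = [state[2] for state in trace]
```

For inputs (5, 7, 0) the accumulator passes through 0, 7, 14, 21, 28 and 35, and the final state is (0, 7, 35).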
In the immediate future we plan to complete a substantial case study, under way
at the University of Utah, that uses our compiler to implement the Advanced
Encryption Standard (AES) [12] algorithm for private-key encryption.
This specifies a multi-round algorithm with primitive computations based on
finite field operations. Starting from an existing formalisation of AES [16],
we have generated netlists and circuits for the major components of an encryption
(and decryption) round. Although our work on AES is incomplete,
our current progress confirms the viability of our synthesis methodology. The
AES formalisation includes a proof of functional correctness for the algorithm:
speciﬁcally, encryption and decryption are inverse functions. Deriving the
hardware from the proven speciﬁcation using logical inference assures us that
the hardware encrypter is the inverse of the hardware decrypter. Many of the
AES speciﬁcations are not tail-recursive, but formally deriving (and verify-
ing) tail-recursive versions was straightforward. To automate such proofs for
future work we developed the linRec tool (Section 4).
At present all data-reﬁnement (e.g. from numbers or enumerated types to
words) must be done manually, by proof in higher order logic. The HOL4 sys-
tem has some ‘booliﬁcation’ facilities that automatically translate higher level
data-types into bit-strings, and we hope to develop ‘third-party’ tools based
on these that can be used for automatic data-reﬁnement with the compiler.
We want to investigate using the compiler to generate test-bench monitors
that can run in parallel simulation with designs that are not correct by con-
struction. Thus our hardware can act as a “golden” reference against which
to test other implementations.
The work described here is part of a project to create hardware/software
combinations by proof. We hope to investigate the option of creating soft-
ware for ARM processors and linking it to hardware created by our compiler
(possibly packaged as an ARM co-processor). Our emphasis is likely to be on
cryptographic hardware and software, because there is a clear need for high
assurance of correct implementation in this domain.
6 Related work
Previous approaches to combining theorem provers and formal synthesis established
an analogy between the goal-directed proof technique and an interactive
design process. In LAMBDA, the user starts from the behavioural specification
and builds the circuit incrementally by adding primitive hardware components,
which automatically simplify the goal [4]. Hanna et al. [6] introduce
several techniques (functions) that reduce the current goal to simpler subgoals.
These techniques are adaptations of LCF tactics to hardware design.
Alternative approaches synthesise circuits by applying semantics-preserving
transformations to their specifications. For instance, the Digital Design Derivation
(DDD) system transforms finite-state machines specified in terms of tail-recursive
lambda abstractions into hierarchical Boolean systems [7]. Lava and Hydra
are both hardware description languages embedded in Haskell whose programs
consist of deﬁnitions of gates and their connections (netlists) [1,11]. While
Lava interfaces with external theorem provers to verify its circuits, Hydra de-
signers can synthesise them via formal equational reasoning (using deﬁnitions
and lemmas from functional programming). The functional languages µFP
and Ruby adopt similar principles in hardware design [8,14]. The circuits are
deﬁned in terms of primitive functions over Booleans, numbers and lists, and
higher-order functions, the combining forms, which compose hardware blocks
in diﬀerent structures. Their mathematical properties provide a calculational
style in design exploration.
These approaches deal with interactive synthesis at the gate or state-machine
level of abstraction only. Moreover, the synthesis and the proof of
correctness require substantial user guidance. Gropius and SAFL are two
related works that address these issues.
Gropius is a hardware description language deﬁned as a subset of HOL [2,3].
Its algorithmic level provides control structures like if-then-else, sequential
composition and while loop. The atomic commands are DFGs (data ﬂow
graphs) represented by lambda abstractions. The compiler initially combines
every while loop into a single one at the outermost level of the program:
PROGRAM out default (LOCVAR vars (WHILE c (PARTIALIZE b)))
The body b of the WHILE loop is an acyclic DFG. The list out default provides
initial values for the output variables. The term LOCVAR declares the local
variables vars and PARTIALIZE converts a non-recursive (terminating) DFG
into a potentially non-terminating command. The compiler then synthesises
a handshaking interface which encapsulates this program. Each of these hardware
blocks is now regarded as a primitive block or process at the system
level. Processes are connected via communication units (k-processes) which
implement delay, synchronisation, duplication, splitting and joining of a process's
output data (there are 10 different k-processes [2]). Although
the synthesis produces the proof of correctness of each process and k-process,
the correctness of the top-level system is not generated. This is
mainly because the top-level interface of a network of processes and k-processes
does not match the handshaking interface pattern.
Our compilation method is partly inspired by SAFL (Statically Allocated
Functional Language) [10], especially the ideas in Richard Sharp's PhD thesis [13].
SAFL is a first-order functional language whose programs consist
of a sequence of tail-recursive function definitions. Its high level of abstraction
allows the exploitation of powerful program analyses and optimisations
not available in traditional synthesis systems. However, the synthesis is not
based on the correct-by-construction principle and the compiler has not been
verified.
The novelty of our approach is the automatic compilation of HOL functions
to hardware together with the automatic generation of the proof of correctness
of the synthesis. Our method provides an alternative approach to compiler
verification. Instead of proving the correctness of a compiler, we only need to
prove the correctness of five circuit constructors once and for all. A verifying
compiler can then be easily programmed with the facilities provided by a
mechanised proof assistant such as HOL.
7 Acknowledgements
David Greaves gave us advice on the hardware implementation of handshake
protocols and also helped us understand the results of simulating circuits pro-
duced by our compiler. Simon Moore and Robert Mullins lent us an Excalibur
FPGA board on which we are running compiled hardware and they helped
us with the Quartus II design software that we are using to drive the board.
Ken Larsen used his dynlib library to write an ML version of our original C
interface to the serial port (this is used to communicate with the Excalibur
board).
References
[1] Per Bjesse, Koen Claessen, Mary Sheeran, and Satnam Singh. Lava: Hardware design in
Haskell. ACM SIGPLAN Notices, 34(1):174–184, January 1999.
[2] Christian Blumenröhr. A formal approach to specify and synthesize at the system level. In GI
Workshop Modellierung und Verifikation von Systemen, pages 11–20, Braunschweig, Germany,
1999. Shaker-Verlag.
[3] Christian Blumenröhr and Dirk Eisenbiegler. Performing high-level synthesis via program
transformations within a theorem prover. In Proceedings of the Digital System Design
Workshop at the Euromicro 98 Conference, Västerås, Sweden, pages 34–37, Universität
Karlsruhe, Institut für Rechnerentwurf und Fehlertoleranz, 1998.
[4] Simon Finn, Michael P. Fourman, Michael Francis, and Robert Harris. Formal system design—
interactive synthesis based on computer-assisted formal reasoning. In Luc Claesen, editor,
IMEC-IFIP International Workshop on Applied Formal Methods for Correct VLSI Design,
Volume 1, pages 97–110, Houthalen, Belgium, November 1989. Elsevier Science Publishers,
B.V. North-Holland, Amsterdam.
[5] Michael J. C. Gordon and Thomas F. Melham, editors. Introduction to HOL: A theorem
proving environment for higher order logic. Cambridge University Press, 1993. HOL4 website:
http://hol.sourceforge.net.
[6] F.K. Hanna, M. Longley, and N. Daeche. Formal synthesis of digital systems. In L. Claesen,
editor, Applied Formal Methods for Correct VLSI Design, pages 153–170. North-Holland, 1989.
[7] Steven D. Johnson and Bhaskar Bose. DDD – A System for Mechanized Digital Design
Derivation. Technical Report TR323, Indiana University, IU Computer Science Department,
1990.
[8] Geraint Jones and Mary Sheeran. Circuit design in Ruby. Lecture notes on Ruby from a
summer school in Lyngby, Denmark, September 1990.
[9] Thomas F. Melham. Higher Order Logic and Hardware Veriﬁcation. Cambridge University
Press, Cambridge, England, 1993. Cambridge Tracts in Theoretical Computer Science 31.
[10] Alan Mycroft and Richard Sharp. Hardware/software co-design using functional languages.
In Proceedings of Tools and Algorithms for the Construction and Analysis of Systems
(TACAS’01), pages 236–251, Genova, Italy, April 2001. Springer-Verlag. LNCS Vol. 2031.
[11] John O’Donnell. Overview of Hydra: A concurrent language for synchronous digital circuit
design. In Proceedings of the 16th International Parallel and Distributed Processing Symposium.
IEEE Computer Society Press, 2002.
[12] United States National Institute of Standards and Technology. Advanced Encryption Standard.
2001.
[13] Richard Sharp. Higher-Level Hardware Synthesis. PhD thesis, University of Cambridge, the
Computer Laboratory, Cambridge, England, 2002.
[14] Mary Sheeran. muFP, A language for VLSI design. In Conference Record of the 1984 ACM
Symposium on Lisp and Functional Programming, pages 104–112. ACM, August 1984.
[15] Konrad Slind. Function deﬁnition in higher order logic. In Theorem Proving in Higher Order
Logics, number 1125 in Lecture Notes in Computer Science, pages 381–398, Turku, Finland,
August 1996. Springer-Verlag.
[16] Konrad Slind. A veriﬁcation of Rijndael in HOL. In V. A Carreno, C. A. Munoz, and
S. Tahar, editors, Supplementary Proceedings of TPHOLs 2002, number CP-2002-211736 in
NASA Conference Proceedings, August 2002.
APPENDIX: formal speciﬁcations in higher order logic
The speciﬁcation of the four-phase handshake protocol is represented by the
deﬁnition of the predicate DEV, which uses auxiliary predicates Posedge and
HoldF. A positive edge of a signal is deﬁned as the transition of its value from
low to high or, in our case, from F to T. The formula HoldF (t1 , t2 ) s says that
a signal s holds a low value F during a half-open interval starting at t1 to just
before t2. The formal definitions are:
⊢ Posedge s t = if t=0 then F else (¬s(t−1) ∧ s t)
⊢ HoldF (t1, t2) s = ∀t. t1 ≤ t < t2 ⇒ ¬(s t)
The behaviour of the handshaking device computing a function f is described
by the term DEV f (load, inp, done, out) where:
⊢ DEV f (load, inp, done, out) =
    (∀t. done t ∧ Posedge load (t+1)
         ⇒
         ∃t′. t′ > t+1 ∧ HoldF (t+1, t′) done ∧
              done t′ ∧ (out t′ = f(inp (t+1)))) ∧
    (∀t. done t ∧ ¬(Posedge load (t+1)) ⇒ done (t+1)) ∧
    (∀t. ¬(done t) ⇒ ∃t′. t′ > t ∧ done t′)
The first conjunct on the right-hand side specifies that if the device is available
and a positive edge occurs on load, there exists a time t′ in the future when done
signals termination and the output is produced. The value of the output at
time t′ is the result of applying f to the value of the input at time t+1. The
signal done holds the value F during the computation. The second conjunct
speciﬁes the situation where no call is made on load and the device simply
remains idle. Finally, the last conjunct states that if the device is busy, it will
eventually ﬁnish its computation and become idle.
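The three conjuncts of DEV can be approximated by a checker on finite traces. This is a sketch under a loud assumption: DEV quantifies over all times, whereas the function below (dev_bounded, our name) only looks at a finite prefix and demands that every existential witness t′ lies inside the prefix, so it is stricter than DEV near the end of a trace.

```python
def posedge(s, t):
    """Posedge s t = if t=0 then F else (not s(t-1) and s t)."""
    return t != 0 and (not s[t - 1]) and s[t]

def hold_f(t1, t2, s):
    """HoldF (t1,t2) s: s is F on the half-open interval [t1, t2)."""
    return all(not s[t] for t in range(t1, t2))

def dev_bounded(f, load, inp, done, out):
    """Bounded check of the three DEV conjuncts on lists of equal length."""
    n = len(done)
    for t in range(n - 1):
        if done[t] and posedge(load, t + 1):
            # Conjunct 1: some t2 > t+1 ends the transaction with the right output.
            if not any(hold_f(t + 1, t2, done) and done[t2]
                       and out[t2] == f(inp[t + 1])
                       for t2 in range(t + 2, n)):
                return False
        # Conjunct 2: with no request, an idle device stays idle.
        if done[t] and not posedge(load, t + 1) and not done[t + 1]:
            return False
    # Conjunct 3: a busy device eventually becomes idle (within the prefix).
    return all(done[t] or any(done[t + 1:]) for t in range(n))

load = [False, False, True, True, False, False, False, False]
inp = [0, 0, 6, 0, 0, 0, 0, 0]
done = [True, True, False, False, False, True, True, True]
good_out = [None] * 5 + [7, 7, 7]
bad_out = [None] * 5 + [8, 8, 8]
```

With f taken as the successor function, the trace with good_out passes and the one with bad_out fails the first conjunct.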
The circuit constructors
The following primitive components are used by the circuit constructors.
⊢ AND (in1, in2, out) = ∀t. out t = (in1 t ∧ in2 t)
⊢ OR (in1, in2, out) = ∀t. out t = (in1 t ∨ in2 t)
⊢ NOT (inp, out) = ∀t. out t = ¬(inp t)
⊢ MUX(sw, in1, in2, out) = ∀t. out t = if sw t then in1 t else in2 t
⊢ COMB f (inp, out) = ∀t. out t = f(inp t)
⊢ DEL (inp, out) = ∀t. out(t+1) = inp t
⊢ DELT (inp, out) = (out 0 = T) ∧ ∀t. out(t+1) = inp t
⊢ DFF(d, sel, q) = ∀t. q(t+1) = if Posedge sel (t+1) then d(t+1) else q t
⊢ POSEDGE(inp, out) = ∃c0 c1. DELT(inp, c0) ∧ NOT(c0, c1) ∧ AND(c1, inp, out)
Atomic handshaking devices.
⊢ ATM f (load, inp, done, out) =
    ∃c0 c1. POSEDGE(load, c0) ∧ NOT(c0, done) ∧ COMB f (inp, c1) ∧ DEL(c1, out)
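The definition of ATM can be simulated by reading each primitive as a function from input traces to output traces. This is a sketch on finite lists with our own function names; DEL leaves its output at time 0 unconstrained, which is modelled here as None.

```python
def delt(inp):
    """DELT: out 0 = T and out(t+1) = inp t."""
    return [True] + inp[:-1]

def del_(inp):
    """DEL: out(t+1) = inp t; out 0 is unconstrained (None in this sketch)."""
    return [None] + inp[:-1]

def posedge_comp(inp):
    """POSEDGE built from DELT, NOT and AND, as in its definition."""
    c0 = delt(inp)
    c1 = [not v for v in c0]
    return [a and b for a, b in zip(c1, inp)]

def atm(f, load, inp):
    """ATM f: done is the negated edge detector, out is f(inp) delayed by one step."""
    c0 = posedge_comp(load)
    done = [not v for v in c0]
    c1 = [f(v) for v in inp]        # COMB f (inp, c1)
    out = del_(c1)                  # DEL(c1, out)
    return done, out

load = [False, False, True, True, False, False]
inp = [0, 0, 6, 9, 9, 9]
done, out = atm(lambda v: v + 1, load, inp)
```

The positive edge on load at time 2 drives done to F for one step; done is T again at time 3, when out carries f applied to the input sampled at time 2.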
Sequential composition of handshaking devices.
⊢ SEQ f g (load, inp, done, out) =
    ∃c0 c1 c2 c3 data.
      NOT(c2, c3) ∧ OR(c3, load, c0) ∧ f(c0, inp, c1, data) ∧
      g(c1, data, c2, out) ∧ AND(c1, c2, done)
Parallel composition of handshaking devices.
⊢ PAR f g (load, inp, done, out) =
    ∃c0 c1 start done1 done2 data1 data2 out1 out2.
      POSEDGE(load, c0) ∧ DEL(done, c1) ∧ AND(c0, c1, start) ∧
      f(start, inp, done1, data1) ∧ g(start, inp, done2, data2) ∧
      DFF(data1, done1, out1) ∧ DFF(data2, done2, out2) ∧
      AND(done1, done2, done) ∧ (out = λt. (out1 t, out2 t))
Conditional composition of handshaking devices.
⊢ ITE e f g (load, inp, done, out) =
    ∃c0 c1 c2 start start′ done_e data_e q not_e data_f data_g sel
     done_f done_g start_f start_g.
      POSEDGE(load, c0) ∧ DEL(done, c1) ∧ AND(c0, c1, start) ∧
      e(start, inp, done_e, data_e) ∧ POSEDGE(done_e, start′) ∧
      DFF(data_e, done_e, sel) ∧ DFF(inp, start, q) ∧
      AND(start′, data_e, start_f) ∧ NOT(data_e, not_e) ∧
      AND(start′, not_e, start_g) ∧ f(start_f, q, done_f, data_f) ∧
      g(start_g, q, done_g, data_g) ∧ MUX(sel, data_f, data_g, out) ∧
      AND(done_e, done_f, c2) ∧ AND(c2, done_g, done)
Tail recursion constructor.
⊢ REC e f g (load, inp, done, out) =
    ∃done_g data_g start_e q done_e data_e start_f start_g inp_e done_f
     c0 c1 c2 c3 c4 start sel start′ not_e.
      POSEDGE(load, c0) ∧ DEL(done, c1) ∧ AND(c0, c1, start) ∧
      OR(start, sel, start_e) ∧ POSEDGE(done_g, sel) ∧
      MUX(sel, data_g, inp, inp_e) ∧ DFF(inp_e, start_e, q) ∧
      e(start_e, inp_e, done_e, data_e) ∧ POSEDGE(done_e, start′) ∧
      AND(start′, data_e, start_f) ∧ NOT(data_e, not_e) ∧
      AND(not_e, start′, start_g) ∧ f(start_f, q, done_f, out) ∧
      g(start_g, q, done_g, data_g) ∧ DEL(done_g, c3) ∧
      AND(done_g, c3, c4) ∧ AND(done_f, done_e, c2) ∧ AND(c2, c4, done)
[Figure 2, panel (d): PAR f g]
Fig. 2. Implementation of composite devices.
[Figure 3, panel (b): REC e f g]
Fig. 3. The conditional and the recursive constructors.
