The compilation of Handel-C programs into net-list descriptions of hardware components has been extensively used in commercial tools but never formally verified. In this paper, we first introduce an extension of the compilation schema that allows the synthesis of the prioritised choice construct. Then we present a variation of the existing semantic model for Handel-C compilation that is amenable to mechanical proof and detailed enough for analysing properties of the hardware generated. We use this model to prove the correctness of the wiring schema used to interconnect the components at the hardware level and propagate control signals among them. Finally, we present the most interesting aspects of the mechanisation of the model and the correctness proofs in the HOL theorem prover.
Introduction
Handel-C [14] is a Hardware Description Language (HDL) based on the syntax of the C language extended with constructs for dealing with CSP-based [10] parallel behaviour and process communications.
Handel-C's compilation into hardware components (synthesis) was initially formulated to be used in a hardware/software co-design project [20] and later adopted in commercial tools [15] . The synthesis process is described graphically and it covers most of the basic constructs excluding the prioritised choice operator. The approach is based on the fact that each synthesised circuit is controlled by a pair of signals (start and finish). Compositionality, even though informally addressed, is achieved by means of associating a circuit with each construct and then considering it as a black box driven only by means of the above described simple controlling interface. The correctness of the compilation approach is based on the assumption that the start signal will be given to a circuit only if all the previous circuits have already finished. The translation mechanism is designed in a way that, in principle, ensures the validity of this assumption, leading to the satisfaction of a system-level assumption: ''the environment will start the hardware program running just once and will not attempt to start it again before the program has completed'' [20] . There is, however, no formal evidence of any of these claims as no semantics or formal proofs are provided.
A formal model of the compilation and denotational semantics for the hardware generated has been proposed as a first step towards verifying the compilation scheme [21, 24] . The semantics relies on the concept of state-transformers and uses branching sequences of atomic hardware actions as its semantic domain. However, when trying to prove a correctness result from the semantics, we observed that the semantics fails to capture the hardware parallelism at the branching sequence level, making any correctness proof very difficult.
One of the objectives of this work is then to provide a semantics that is capable of capturing the parallelism the hardware is capable of and that is suitable for mechanically verifying the assumptions the synthesis process is based on. As we are interested in a bigger subset of Handel-C than the one addressed in [20] , we also need to define a compilation approach for the priAlt construct (prioritised choice) as well as providing the semantics corresponding to it. Finally, given the fact that we focus on the correctness of the synthesis process, we exclude any efficiency/optimisation analysis from this work.
The work presented in this paper is a first-order logic theory that describes the syntax (by means of a deep embedding of the constructs as datatypes) and semantics for our subset of Handel-C. The syntax and semantics are then embedded in higher order logics to allow the verification of correctness properties associated with the assumptions the synthesis process is based on. The theory and its verification have been mechanised in the HOL theorem prover [9] and further details about its definitions and theorems can be found in [22] . 1 In this paper we intend to give a self-contained and accessible account of that work. In doing so, we deliberately omit all proofs and consequently many technical details in favour of a clearer presentation of the work as a whole.
The rest of the paper is organised as follows. We start by outlining the compilation approach presented in [20] together with our approach for compiling the priAlt construct. We then propose a reformulation of the semantic domain in [21] in terms of a slightly higher level semantic domain that allows us to capture parallelism at the sequence level. We then redefine the semantics in terms of the new domain and prove the existence of fix-point solutions for the recursive equations. Finally, we define the concept of wire-satisfiability in our model and prove the correctness of the wiring schema used in the compilation.
Handel-C and its synthesis
In order to explain how the synthesis process is performed and to provide semantics for the language, a simplified subset that captures the major constructs in the Handel-C language is being used. Most constructs in the language can be built by combining constructs in this subset, with the exception of pointers and function calls. Our subset of Handel-C constructs is presented in Fig. 1 .
As described in the language documentation [14] , programs are comprised of at least one main function and, possibly, some additional functions. Multiple main functions (within the same file) produce the parallel execution of their bodies under the same clock domain. In the context of our elementary subset of the language, it is possible to produce the same effect by means of the parallel operator.
All C-based constructs in Handel-C behave as defined in ANSI-C [12] but with some additional restrictions regarding the clock-based, synchronous nature of the language. In this sense, the evaluation of expressions is performed by means of combinatorial circuitry and it is completed within the clock cycle in which it is initiated (expressions are said to be evaluated ''for free'' [14] due to this interpretation). This way of evaluating conditions affects the timing of all the constructs in the language. In the case of selection, the branch selected for execution (depending on the condition) will start execution within the same clock cycle in which the whole construct is initiated. The while construct starts its body in the same clock cycle in which its condition evaluates to the logical value true. On the other hand, it terminates within the same clock cycle in which its condition becomes false. The assignment construct updates the value of the variable being modified at the end of the clock cycle. This definition of the assignment construct allows swapping the contents of a pair of variables without the need to allocate any temporary memory. Variable declarations are assumed to be implicit; thus, no variable declaration constructs are included in our subset of Handel-C.
Parallel composition of statements executes in a real parallel fashion as it refers to independent pieces of hardware running in the same clock domain. The delay construct leaves the state unchanged but takes a whole clock cycle to finish.
The input and output constructs have the standard blocking semantics: if the two parts are ready to communicate, the value output at one end is assigned to the variable associated with the input side. Both sides of the communication take one full clock cycle to successfully communicate. A process trying to communicate over a channel without the other side being ready will block (delay) for a single clock cycle and try again. Finally, the priAlt construct provides a list of possible actions guarded by communication primitives. The alternatives are offered to the environment in descending order (i.e., there is a decreasing priority ordering along the list of guards). Two aspects are worth noticing as regards the priAlt construct:
1. The conditions guarding each alternative do not behave like the communication primitives described above (even thought their syntax is the same). As mentioned before, input/output constructs will iterate waiting until the communication is possible. The guards inside a priAlt will attempt the communication. If they succeed, the communication will take place and the actions associated with them will start executing in the next clock cycle. If the communication is not possible, the control will be passed to the next alternative in the list of guarded commands inside the priAlt (i.e., no iteration will take place inside the guard).
2. The presence/absence of a default clause radically changes the behaviour of the priAlt construct. If there is a default clause, the actions associated with it will be executed if all the other alternatives have failed. If the default actions are triggered, control will be passed to them within the same clock cycle in which the priAlt construct has been started and the priAlt will finish its execution after (but within the same clock cycle) the default actions have finish.
If there is no default clause and all the alternatives have failed the priAlt construct will do nothing until the end of the clock cycle and will offer all its alternatives again (following the priority order) in the next clock cycle.
The synthesis process
In this subsection we recast the compilation templates presented in [20] that we will formalise and verify in the following sections.
The circuit representing the delay construct ( Fig. 2a) takes advantage of the flip-flop's latency to propagate the start signal into its finish wire one clock cycle after receiving it. On the other hand, the synthesised version of the assignment construct (Fig. 2b) uses the same approach the delay circuit uses to preserve the timing model but also uses the start signal to enable the register storing the contents of the variable being assigned to.
Sequential composition of constructs is achieved by means of linking the finish wire with the start one of the corresponding circuits following the sequential ordering (Fig. 3a) . The parallel composition of constructs, on the other hand, propagates its start signal to all the parallel processes to achieve their simultaneous activation. Given the fact that each of the parallel branches may take a different amount of clock cycles to terminate, their finish signals are memorised (i.e., latched). The finish signal of the parallel composition is produced only when all the parallel branches have finished (Fig. 3b) .
In the case of the selection construct, Fig. 3c , the activation pulse (i.e., start signal) is given to the appropriate sub-circuit only if the selection construct has been started during that clock cycle and if the value of the condition enables its execution. The finish signal of the selection construct is just the logical disjunction of the finish signals of the two sub-circuits. This implementation produces the desired effect provided that the synthesis schema preserves the assumption that a circuit will not generate its finish signal unless it has been previously activated.
The compilation schema for the iteration construct (Fig. 3d ) is in principle similar to the one for the selection construct. The most relevant difference is that the finish signal of the while's body is looped back to keep the circuit active after the first iteration over the loop is finished. Also note that if the condition is false, the finish signal is generated combinatorially producing the desired effect of terminating the while construct within that clock cycle.
The synthesis of input and output constructs decentralises the control of the communication. Fig. 3e illustrates the hardware generated to perform the actions associated with input/output. The circuit begins the clock cycle by requesting to communicate (i.e., by means of showing its readiness to communicate). This signal is processed by a mechanism arbitrating the communication that will respond over the transfer line, whether the communication can be performed or not. In the former case, the construct behaves like assignment does. In the latter, the circuit memorises its active state and tries again during the next clock cycle. Finally, there is another circuit that is generated by the synthesis process (once per channel) and implements the arbitration mechanism that governs the communications over it (Fig. 3f) . The circuit collects all the input and output requests over the channel and it allows the communication when there is at least one input and one output request over the channel during the current clock cycle.
PriAlt
Up to this point we have presented the compilation schema described in [20] for the constructs in Handel-C ranging from assignment and delay to parallel composition and communication. In this subsection we present our extension to this work that allows the synthesis of hardware implementing the priAlt construct.
As observed in Section 2, the behaviour (and algebraic properties) of the priAlt construct change significantly according to whether it contains a default guard or not. We have devised a compilation schema for each of these two possible behaviours. Fig. 4a shows the circuit generated for a priAlt construct with default guard of the form priAlt {case guard 1 : P 1 break; case guard 2 : P 2 break; default: P 3 }. The start signal is used as the communication request associated with guard 1 sent to the appropriate arbitration circuit. The transfer signal returned by the arbiter is used to activate the communication and P 1 after one clock cycle. The negation of the transfer signal is used to trigger the communication request for guard 2 (in a similar way to how start was used for guard 1 ). The same mechanism as described above is used to handle the transfer signal for guard 2 . On the other hand, if all the arbiters return a false transfer signal, the circuit associated with P 3 is activated.
Finally, the finish signal for the priAlt construct is the disjunction of the individual finish signals of P 1 , P 2 and P 3 . Note that the finish signal is properly generated provided that P 1 , P 2 and P 3 produce a finish pulse only if they have been started.
The compilation of the priAlt construct without a default guard is described in Fig. 4b (here we show the compilation of priAlt {case guard 1 : P 1 break; case guard 2 : P 2 break}). In the case of a priAlt with a default guard we activated P 3 when neither guard 1 nor guard 2 were allowed to communicate. Here we use the same condition to determine whether the priAlt has been resolved or not. If it has not been resolved and the circuit is active (i.e., the start signal was given during (a) Default priAlt.
(b) Non-default priAlt. the current clock cycle or the circuit has been started in a previous clock cycle but it has not been able to communicate up to the present clock cycle) this 'active' status is memorised. The active state is used to restart the circuit in the next clock cycle.
The semantic domain
The existing denotational semantics for the synthesis process [4, 21] are based on the notion of ''branching sequences'' (also referred to as ''computation trees'' [2] and, in essence, similar to the multi-level lists used by Lengauer [13] ). In this setting, non-branching nodes denote the execution of actions in which the control flow of the program does not get modified.
Branching nodes, on the other hand, model a decision point (such as the condition in a while construct), where the state must be accessed in order to evaluate the condition. The actual trace of the program is obtained by keeping track of the updates over the environment and by pruning the branches that do not get executed. Even though this semantic model was successfully implemented and tested against existing semantics for Handel-C [4, 5] , our goal is to achieve a higher degree of confidence in it by means of proving its correctness.
When trying to prove correctness properties for the compiled hardware, we observed that the semantics fails to capture the hardware parallelism at the branching sequence level (parallel actions performed combinatorially in the hardware are ''linearised'' in the branching sequences). The main drawback of this feature of the semantics is that most of the properties we are interested in proving hold at the end of clock cycles. In this context, proving relatively simple properties (such as that parallel composition is commutative if only state updates are considered 2 ) showed itself to be very complicated, given the need to establish the equivalence at synchronous points (i.e., clock cycle edges) that did not occur at the same depth in the different parallel branches.
To overcome this problem, we observed that Handel-C's synchronous time model allowed us to group the actions performed in each clock cycle into two disjoint sets: combinatorial actions (performed before the end of the clock cycle) and sequential actions (performed at the end of the clock cycle). The idea of grouping similar actions together is not fully compatible with the tree-like structure used in branching sequences. In particular, branching conditions are combinatorial actions and we want to include them in the corresponding set of actions, rather than having them branching the sequences' structure. We address this problem by formulating branching sequences of the form
(where stands for the concatenation operation over Seq and cond → S 1 | S 2 represents a branching node that selects S 1 or S 2 depending on whether condition cond holds) and expanding them into two sequences: (where cond ⊥ is an assertion stating that cond must hold during the current clock cycle and a : S adds combinatorial action a to the head of sequence S). This observation allows us to turn the semantic domain into a set of finite-length linear sequences (this view of the selection construct is consistent with the one used in other formalisms such as [11] ).
A similar problem is encountered when constructs are composed in parallel as a structured node was generated in the sequence, pre-empting the actions in the parallel branches to be collected together. We overcome this problem by introducing a merge operator that joins the pair of sequences being composed in parallel.
Our new semantic domain is the powerset of sequences of the type:
Definition 3.0.1.
The Action type captures the notion of actions that the hardware can perform together with assertion-like extensions (in the sense of [8] ) in order to allow the verifications we need over the hardware. Let w be a wire identifier, w 1 and w 2 expressions involving wires (i.e., elements of type wExpression as defined below), var a variable or channel name, val a value in the corresponding type and cnd a boolean condition, then we define the Action type as: 
Domain operations
Our semantic domain is going to need two operators in order to be able to manipulate sequences within the semantic function: concatenation ( ) and parallel merge (⊎).
Regarding concatenation, our initial formulation was a non-standard one due to the possibility of having to concatenate a pair of sequences that overlap. Two sequences overlap iff the last and first element (in the first and second sequence respectively) are built from the same constructor (for example the application (Cb a 1 ) (Cb a 2 S 2 ) does overlap).
In the case of overlapping sequences, the lack of synchronous actions in between the last element of the first sequence and the first element of the second one exposes the fact that no clock cycle finishes between them. In the context of our semantics, this indicates the need to join the extremes of the two sequences and generate a single combinatorial node with the union of the comprised sets of actions (in the previous example, the expected result would be (Cb (a 1 ∪ a 2 ) S 2 )). In fact, this is a very common case, given the fact that circuits generated for all constructs start and finish their execution performing combinatorial actions, leading to sequences in which the first and last elements are of combinatorial type.
We use a trivial extension of the standard concatenation operator ( ) that is capable of handling our heterogeneous sequences. This definition allows consecutive nodes based on the same constructor to be put in sequence and this kind of behaviour is not suitable in our semantics (we would expect the actions in the two consecutive nodes of the same kind to be collected in a single set of actions). We solve this problem by ensuring that the sequences being concatenated using the operator avoid the problematic cases (see Section 4 for further details).
To address the definition of our merge operator we first need to be able to establish when a pair of sequences is in-Phase:
In the case of having not in-Phase sequences (i.e. a pair of sequences in which the structure is not node-wise equal), the parallel-merge operator should give priority to combinatorial actions over sequential ones (we should allow a combinatorial action to be executed in the current clock cycle if one of the circuits being composed in parallel needs to do so). A typical way of implementing this kind of behaviour is through priorities. Although an efficient solution in terms of implementation, dealing with priorities makes the definition of the function quite complicated and, as in the case of the concatenation function, reduces the set of properties of the function to a minimal one. On the other hand, the definition of the operator becomes completely symmetric (and there is no longer a need for priorities) if we can ensure that the sequences are in-Phase:
Fix-points
As our semantic function is going to use recursive equations, it is necessary to assure the existence of an appropriate semantic domain with fix-point solutions to them. The general idea is to establish a semantic domain with an associated complete partial order (CPO), that is, a set with a partial ordering v, a least element ⊥, and limits of all non-empty chains [7] and to describe the semantics in terms of continuous functions (i.e., functions between CPOs that preserve the partial order and limit structure).
To achieve this goal we extend Seq with a bottom element ⊥ and order our lifted semantic domain Seq ⊥ by the relation ≼ below. We prove this ordering to be a partial order and that the constructors are monotonic with respect to it. 
We have now to verify that the concatenation function is also monotonic with respect to ≼. We first extend to treat ⊥ as left and right zero. With this extended definition, we can easily prove to be monotonic on its first and second arguments (by structural induction on s 1 , s 2 and s). We extend the parallel-merge operator to treat ⊥ as zero when it appears in any of its arguments. We also use ⊥ as the result for the function when the arguments are not in-Phase, as we need to totalise the functions in order to be able to encode them in the HOL theorem prover. The proof of right monotonicity for ⊎ is carried out by case analysis on the result of ⊎'s application (after using the right kind of induction).
The proof of ⊎'s left monotonicity, on the other hand, cannot be performed by structural induction because the sequences cannot be handled as a pair (the induction principle must be applied to individual sequences in a sequential way). In particular, structural induction over the individual sequences provides us with a pair of hypotheses that refer to sequences that are not in-Phase with respect to each other (this happens because of the sequential way in which we apply the induction principle). As mentioned earlier in the paper, this is a case in which ⊎ is undefined and it only arises because the induction is not set up in the right way. The solution is to make the proof by complete induction over the sum of the lengths of the sequences being merged. This way of proving the theorem is quite laborious, but allows us to instantiate the inductive hypothesis to the sequences of the right shape when needed. Following this approach combined with the case analysis mentioned above, we prove ⊎ left monotonic. We then use this result, together with the fact that ⊎ is commutative, to prove that ⊎ is also right monotonic.
Having shown a suitable partial order over the semantic domain and proved that all the operators preserve that ordering, we can guarantee that fix-point solutions to the recursive equations introduced in the next section exist in our model.
A model for the compilation and its semantics
As we intend to reason about the compilation process from our subset of Handel-C into hardware components we need a way of introducing the language constructs in our model. Moreover, the kind of reasoning we intend to do will imply reasoning over all possible combinations of the constructs in the language. The rest of this section is dedicated to showing the way in which we embedded the input language in HOL and how we used the semantic domain defined in the previous section to give semantics to its compiled forms.
Embedding Handel-C's syntax in HOL
Boulton et al. [1] provided evidence that a deep embedding of the input language into the model is the best alternative for reasoning about properties regarding programs that can be generated from that input language. Our problem fits this kind of reasoning as we want to analyse the correctness of the hardware generated for all possible (and valid) combinations of our input language.
On the basis of the above observation, we have encoded the input language as recursive datatypes in HOL. In fact, the encoding using recursive datatypes is enough for modelling most of the input language with the exception of the priAlt construct. The variability in the length of the guarded commands inside a priAlt construct makes it impossible to capture it as a standard datatype as this would require having a mechanism for defining constructors of variable arity. We solved this problem by defining the recursive type of guarded choices, allowing us to move the variability in length from the priAlt construct into its argument. In this way, we can embed the priAlt construct as a constructor of a single, but variable in length, argument. From the mechanisation point of view, as the definitions of the datatype describing the constructs in the language and the one describing the list of alternatives inside occurrences of the priAlt operator depend on each other, we have implemented them by means of a mutually recursive definition. Our encoding of the input language in HOL is as shown in Definition 4.1.1. lang ::= Delay : wireId × wireId
It is important to notice that we have used the same start/finish interface from Page's compilation approach [19] to describe the guards inside the priAlt construct. We have not described our synthesis schema for priAlt in these terms but this problem is easily overcome if we consider all the hardware generated from the ready signal up to the finish signal of the circuit associated with each guard. Fig. 5 illustrates the idea by showing a projection of the components of Fig. 4a that implement the first guarded alternative and illustrating which wires are used to interface this sub-circuit.
Denotational semantics for the compilation: the semantic predicate
On the basis of the sequence-based domain defined in the previous section we can give semantics to the translation for all the constructs in our input language. The semantics is going to be described as a set containing all possible execution traces for the program being synthesised. However, due to the presence of circuits with loop-back connections and the fact that our hardware model will capture the semantics of iterating constructs by means of successive syntactic approximations, we cannot apply any form of explicit description of the semantic set. This observation is particularly relevant if we consider that we are also aiming at mechanically verifying our approach. The obvious solution is to find a predicate smPred capturing the semantics of the hardware generated from Handel-C program c such that we can define the semantic function Sm as (Sm c) = {s :
In this context, inductively defined relations [18, 6] are predicates defined by a set of rules for generating their elements. Inductive definitions can be regarded as a ''constructive mechanism'' in the sense that they provide information about how to generate some basic elements (base cases) and additional rules that instruct on how to obtain new elements from existing ones.
Following [16, 17] we adopt this approach to define our semantic predicate, taking the (informal) rules in the compilation schema as the basis for defining the set of rules for our semantic predicate smPred. In particular, delay, assignment, while (false case) and successful input/output are ''base case'' constructs for the semantics (given the fact that their semantics do not involve the semantics of other syntactic elements). The remaining constructs, on the other hand, involve the semantic expressions for their components and can be regarded as ''inductive cases''.
The case of the priAlt construct introduces an additional level of complexity to the problem as we have to provide semantics to the hardware that is generated for its list of guarded alternatives. Moreover, the mutual dependency between the embedding of the language constructs and the list of alternatives for the priAlt operator forced the semantics to also be mutually dependent. Fortunately, it is possible to define the kind of predicates we need by means of mutual induction [25] . With this idea in mind, we included the new relationship smGuard that generates the corresponding semantic expressions for the list of guarded alternatives associated with each occurrence of priAlt in a given program. Note that as the list of alternatives inside priAlt constructs are not valid program constructs on their own (i.e., they can only occur inside priAlts) this extension of the semantic predicate does not affect our previous definition of the semantic function Sm.
We also need a way to incorporate the unique pair of wires (start and finish) generated for each construct in the program by the synthesis process. We do so by means of two functions (π s (c) and π f (c)) returning, respectively, the start and finish identifiers for a given circuit c. In our semantics, the hardware components generated by the synthesis process start their execution by performing a set of combinatorial actions (essentially to propagate their start signal to their constituent constructs) and also finish by carrying out a set of combinatorial actions (to propagate the finish signal appropriately). Furthermore, these ''before'' and ''after'' combinatorial actions performed by all the circuits are likely to be executed during the same clock cycle in which the previous circuit was terminating (or the next one is starting). This suggests that these are points in which the merging of action sets should take place to condense actions of the same type that happen in the same clock cycle into a single node.
We capture this notion by isolating these special points in the semantics, allowing us to have greater control over the structure of the sequences and the way in which they get concatenated/merged. We redefine our semantic predicate to relate a circuit c with two sets of combinatorial actions: prologue and epilogue (accounting respectively for actions at the beginning and end of the execution of the circuit) and a behavioural sequence (capturing the actions being executed in between these two combinatorial fragments).
On the other hand, the fact that the actions when the while construct terminates and default actions in a priAlt reduce to pure combinatorial actions makes their behaviour different from the rest of the constructs (that take at least one clock cycle to finish their execution). In this sense, we allow smPred to produce two kinds of results: mixed, to capture the semantics of constructs involving both combinatorial and sequential actions (this is our formulation based on the prologue, behavioural sequences and epilogue mentioned above); and combinatorial, to encode the case in which the semantics involve only actions that are performed within the current clock cycle. We split the actions in this last type of semantic expression into two, allowing us to have a prologue and epilogue when we produce the combinatorial type of result. We formalise this notion by defining the following result type: Definition 4.2.1.
For the remainder of the paper, we will use   and as syntactic shorthand for the Mixed and Combin constructors respectively. In these terms, the semantics of the delay construct state that its combinatorial prelude only includes a verification for the start wire, while its combinatorial epilogue just sets its finish wire to the high value. The behavioural part of the circuit just states that it delays its execution for a single clock cycle (Definition 4.2.2). 
The semantics for the assignment construct is very similar but the behavioural component is modified to capture the update to the store: ∀s, f ∈ wireId; x ∈ varId; e ∈ expr • smPred  (Assign s f x e),  {s ⊥ } (Ck {x ← e}) {f }  .
In the case of constructs c 1 and c 2 being sequentially composed, the prelude transfers the start signal from the sequential composition circuit to c 1 's start and also includes c 1 's combinatorial prelude (notice here that we are joining at this point the sequential composition's and c 1 's preludes). The behavioural part comprises c 1 's behaviour followed by a combinatorial set of actions turning the finish signal of c 1 into c 2 's start and performing c 2 's prelude, to conclude with c 2 's behaviour. Finally, the sequential composition's epilogue is composed of c 2 's epilogue and combinatorial hardware to propagate c 2 's finish signal as the sequential composition's one. Definition 4.2.4 presents a more formal description of these actions. 
It is easy to observe from the definition above that there is no assurance (at least, not at the syntactic level) that the precondition we established for the concatenation function is going to be satisfied. In particular, we need to guarantee that all possible behavioural sequences (as we could essentially compose any construct in sequence with any other) begin and end with a Ck node. As we explained before, this is why it is not possible to allow empty behavioural sequences (as needed to describe the semantics for the case of a terminating while construct). In Section 5 we define a subclass of our sequences in which the required property holds and prove that all possible behavioural sequences produced by our semantic predicate (i.e., when the Mixed type of result is produced) belong to that subclass.
We also need to add rules to handle the cases in which one of the constructs (or both) being composed sequentially only performs combinatorial actions. As an example, the case in which the first construct terminates ''immediately'' is described by Definition 4.2.5. For conciseness, for the rest of the paper we omit the domain of quantified variables when this information is clear from the context. 
The symmetric case (the second construct terminates within the same clock cycle in which it was started) is described in a similar way and we do not show it here. The case in which the two constructs being composed in sequence terminate in the same clock cycle also terminates in a single clock cycle (Definition 4.2.6). 
It is important to notice that none of the three additional rules generated to handle the case of a combinatorial result needs to apply the concatenation operation to produce its results. We will take advantage of this observation when proving the correctness in the application of the domain operators within the semantic predicate.
In the case of the parallel composition of c 1 and c 2 , the combinatorial prelude propagates the parallel composition's start signal to c 1 
When defining the semantics of parallel composition, the second simplification in our hardware model makes itself evident: we are not using flip-flops to memorise the finish token of the circuit with the shortest execution in order to preserve its value until the other parallel circuit has also finished. Even though it would be possible to modify the definition of ⊎ to account for this, we opted to not follow this idea for the following reasons:
• The flip-flop implementation in the actual hardware is a good solution at the circuit level for ensuring that no finish signal will be lost while composing circuits of different length in parallel. We achieve the same effect at the semantic level by making the resulting composed sequence finish only when both sequences have finished and by making the information that both c 1 and c 2 have finished available for the combinatorial actions following that behavioural sequence.
• The compilation definition does not allow any wire to connect directly to the individual finish wires of either c 1 or c 2 (these wires are, in fact, ''hidden'' within the wrapper constructed by the parallel composition structure). This fact implies that there are no correctness concerns that we need to address from the possible connections from these wires (we can be certain, for example, that no circuit will use the finish pulse of c 1 to activate itself, as it cannot access it), allowing us to safely abstract out this feature in our semantic model.
As with the sequential composition construct, it is also necessary to address the cases involving the instantaneous termination of the constructs being composed in parallel. We omit these rules (together with the ones handling the selection construct) as they are similar to the rules we have already presented. The complete set of rules can be found in [22] .
The semantics for the while construct needs to provide rules for handling the two possible outcomes of the evaluation of the condition. The first rule accounts for the case when the condition is false and the while terminates immediately (Definition 4.2.8). When the condition holds, the traditional notion of syntactic approximations [26] is applied. We provide two rules for the approximation: a base case capturing the first approximation to the solution (by means of a single execution of the while's body) and another one to capture the way in which we can construct a longer approximation from an existing one.
The first approximation calculates the semantics of the while's body, assumes that the looping condition does not hold and uses its epilogue to signal the while's termination (Definition 4.2.9).
Definition 4.2.9.
Following the approach we have been using so far, it would be natural to add a rule to handle the case in which the while's body finishes immediately. This situation corresponds, for example, to the C program: while (1) {while (0);}.
This case is regarded as ''pathological'' [4] as the construct stutters without making any progress or advancing the clock. In fact, all implementations of the Handel-C compiler spot this case and flag the program as incorrect. As this problem has been already detected (and corrected from the implementation's perspective), we decided to exclude it from our analysis. In this sense, we do not contemplate the possibility of a while's body terminating immediately and, hence, we do not add rules for handling it.
The final rule for the while construct is meant to extend an existing approximation (generated either by the basic rule above or by previous applications of itself). The approximation is constructed by appending one expansion of the body and the proper linking combinatorial action in front of the approximation's behavioural sequence (Definition 4.2.10). 
We use the approximation-based principle defined above to give semantics to the input and output primitives. In this particular case, we need two rules per constructor because the approximation part of inputs and outputs only requires waiting for a clock cycle and retrying the communication (the equations for while are more complicated given the fact that we require including the body's semantics on each approximative step). In providing the semantics for input and output, we need to address the different parts involved in the communication. We use ch?, ch! and ch to denote the reading, writing and bus (the physical means by which the value being communicated is transmitted) components in the communication over channel ch. Note that ch?, ch! represent the corresponding ready wires used in the compilation approach described in Fig. 3e .
The base case for the inputting construct uses its prelude to state the presence of the reading component and requests the presence of the writing part (by means of an assertion). If this is the case, the behavioural part of the semantics updates the store according to the value being transmitted over the channel. The actions in the epilogue are used to establish the generation of the finish signal (Definition 4.2.11).
Definition 4.2.11.
The rule defining the approximations when the writer is not ready to communicate just asserts the proper condition in the combinatorial prelude (¬ch! ⊥ ) and adds a delay of one clock cycle and appropriate combinatorial actions in front of the existing approximation (Definition 4.2.12).
Definition 4.2.12.
Note that all the approximations constructed by this rule conclude with a successful communication (to see why, consider that the base case for the communication primitives is a successful one, and the fact that all approximations will just add ''failing'' communication behaviour in front of this successfully terminating trace). This fact is the key factor we use to order approximations (see Section 3.2) as the ''successful termination'' part of the sequence is turned into ⊥ by the prune function should the construct not terminate its execution at this moment of execution (i.e., the approximation has not yet reached the fix-point solution and, hence, the assertion stating the terminating condition to hold cannot be satisfied).
Finally, the base case for the output construct is similar to the one for input, but it inverts the roles in the combinatorial prelude (it establishes the presence of the writer and asserts the readiness of the reader). Its behavioural part is also different as it has to assign the value being transmitted to the appropriate channel (Definition 4.2.13). The rule capturing the inductive approximations to the output construct using the same approach as was introduced for the input approximative solution is as described by Definition 4.2.14. 
Semantics for the priAlt construct
In order to give semantics to the priAlt construct we first need to describe how the semantic predicate for the list of guarded alternatives works. As mentioned in previous sections, we have included the predicate smGuard to account for the semantic effects of the different guards associated with a priAlt construct. Moreover, the semantics of the priAlt as a whole is dependent on whether any of the guarded alternatives has succeeded in communicating (i.e., the priAlt has been resolved) or not. With this idea in mind, the smGuard relates a list of guarded alternatives not only with a semantic expression but also with a boolean value reflecting whether the priAlt was resolved or not. The signature of the smGuard predicate is ∀G ∈ guardList; S ∈ smResult; res ∈ bool • smGuard (G, S, res) where G corresponds to the list of guards which we are calculating the semantics for, S are the semantics associated with the possible evaluations of G, and res indicates whether the priAlt has been resolved (i.e., any of the communications succeeded or a default guard was reached) or not. smGuard is also an inductively defined predicate. Its definition is given by means of the different induction cases presented below. Definition 4.2.16 shows the semantic expression associated with the last guarded command when the communication has not been possible (i.e., the priAlt has not been resolved). The case of the last guard becoming active when the associated guarded alternative produces mixed results is captured in Definition 4.2.17. The definition that captures the same situation but with the associated guarded fragment produces combinatorial results similar to the one already presented (the sequence part of the result reduces to just (Ck {skip})) and we do not show it here. In the rest of this section we will only show definitions associated with the mixed results given the similarity with their combinatorial counterparts. 
So far we have described the base cases the list of alternatives can have (a singleton list or a default action). The case where we have a list longer than a single element as argument for the priAlt uses the previous rules to treat the first element in the list. Definition 4.2.19 illustrates the case where the first guard in the list is allowed to communicate. Note how the epilogue produces the disjunction of the finish signal of the current guarded command with the finish signal of the remaining guarded commands in the list. In this way, we produce the n-way or gate in Fig. 4a and b by means of chaining binary or gates. 
From the above definitions it is possible to capture the semantics of the priAlt operator. Definition 4.2.21 shows the case of a priAlt without a default guard where one of the guards was able to communicate. This definition also describes the case when there is a default clause and it has been activated. 
On the other hand, the case where none of the guards is able to communicate makes the priAlt construct wait for a clock cycle and start again. As with other iterating constructs, we use successive approximations to model the behaviour of the construct in this context. Definition 4.2.22 shows the details of the semantics of this case. Note that the semantics associated with the attempt of communicating during the first clock cycle terminate combinatorially.
Definition 4.2.22.
Pruning
So far we have described the semantics of the translation from Handel-C into net-lists as a (possibly infinite) set of finitelength sequences. In order to complete the semantic description of the generated circuits, we need to find (if it exists) a single sequence that specifies the actual execution path and outcome of the program being synthesised.
As in [21] , we define two auxiliary functions:
∆Env : env → Action → env and flattenEnv : env → env.
The former updates the environment according to the action passed as argument by means of rewriting the appropriate function using λ-abstractions. In the particular case of skip, ∆Env treats it as its unit value and returns the same environment.
On the other hand, flattenEnv is meant to be used to generate a new environment after a clock cycle edge. It flattens all wire values (to the logical value false), resets the channel values to the undefined value and advances the time-stamp by one unit. We define the execution of sets of hardware actions by means of the predicate
by means of the following rules: , a), s − {a}) .
We also need to be able to handle assertions, so we introduce the function
where holds(a, e) is true iff the assertion a is true in the environment e.
As we are dealing with sets of actions and assertions on each node of our sequences, we need to define the collective effect of this heterogeneous set of actions over the environment. The first difficulty we face when defining how a set of actions is going to be executed is that the initial order between actions and conditions has been lost. This is, however, not a problem if we consider that assertions and control flow conditions refer only to the present value of the memory and all variables preserve their values during the whole clock cycle. This fact makes the evaluation of assertions and control flow decisions independent of the combinatorial actions performed in parallel with them and they can be evaluated at any time.
From the observation above, we split the set of actions into the disjoint sets of assertions (As) and the remaining ''unconditional actions'' (HAs). The partition into As/HAs is induced over the set of actions by means of the functions ∇ A and ∇ HA . Also from the observations above, we know that control flow assertions can be evaluated at any time, and that wire-correctness assertions must be evaluated after HAs, allowing us to establish the following order of evaluation: HAs ≺
As. Taking advantage of this execution order, we can introduce the function
defined by the rule for the successful case:
In turn, the above functions can be used to define a single-node execution function for sequences
The simplest case is successful termination (i.e., when the sequence we are trying to execute is empty), captured by the following rule:
.
The next case captures the execution of a sequence that begins with a set of actions containing unsatisfiable conditions (the symmetric case is similar and we omit it here):
The case in which it is possible to perform all actions and satisfy all tests within a combinatorial node is described by the following rule:
The counterpart of the above rule (dealing with sequences starting with a clock-edged block) is as follows:
Note that the environment needs to be flattened after all actions at the clock edge have taken place. The flattening can take place only at this point because of the possibility of having a value being transmitted over a bus (we will lose the value being transferred if we flatten the environment before updating the store with it).
Having described the mechanism to execute a single node, the extension to executing a complete sequence is captured by the predicate
Finally, the actual execution path for the synthesised program (for the case in which the program terminates) is given by the operator prune : env → P(Seq) → P(Seq) defined as prune(e, sSet) = {S ∈ sSet | seqExec(e, S) = }.
Note that, as Handel-C's control flow is governed by boolean conditions, only one of the possible branches is executable at any given time, making our semantic traces mutually exclusive. From this observation, it follows that only one sequence (if any) exists that will lead to successful termination, leading to the result in the seqExec predicate above. In consequence, the set described in Definition 4.3.1 is either a singleton set, modelling termination, or the empty set (the original program diverges).
Consistency considerations
Our definition of the semantic function relies on the sequences produced to have certain properties in order to satisfy the preconditions imposed by the operations in our semantic domain. The concatenation function requires the behavioural sequences produced by smPred to have the pattern (Cb a 1 s) (Cb a n ) for sets of combinatorial actions a 1 and a n and a sequence s. Furthermore, we need to ensure that the sequences produced by the semantics are in-Phase to ensure a safe application of the merge operator.
The methodology we use to ensure the satisfaction of these requirements is as follows. We first define a suitable subset of the sequences in our semantic domain and prove that the properties mentioned above hold for any element in this subset. Then we prove that all the sequences produced by the semantic predicate belong to this subset, ensuring that the semantic function also satisfies the required conditions.
Clock-bounded sequences
Given the properties we need to satisfy, we observed that all instances of s above should have their boundaries (i.e., first and last elements) based on the Ck constructor. Moreover, s should alternate (there shouldn't be two consecutive applications of the same constructor). The predicate ckBnd captures this notion by means of an inductive definition:
We also need to show that any pair of sequences satisfying the ckBnd property is also in-Phase. The alternating nature of ckBnd sequences also guarantees the satisfaction of this requirement, allowing us to prove:
Lemma 5.1.1.
The semantic predicate only generates clock-bounded sequences
We need to show that all behavioural sequences generated by smPred belong to ckBnd subtype. To do so, we first need two lemmas proving that the semantic domain operators preserve the ckBnd property. Regarding the application of to clock-bounded sequences, it is not possible to prove that the concatenation of two ckBnd sequences is still a clock-bounded sequence (it is not even possible to apply as the arguments do not satisfy their precondition). On the other hand, it is possible to prove that: Lemma 5.2.1.
This result is strong enough to aid us in the proof of smPred's preservation of the ckBnd property. In fact, it is easy to observe that the way in which the concatenation function is applied in the above lemma is the only way in which the function is applied in smPred's definition.
The proof of the lemma stating ⊎'s monotonicity with respect to the ckBnd property is very complicated. The complication arises because of the way in which the ckBnd's induction principle has to be applied (i.e., in sequential order) together with ⊎'s definition. In order to overcome this problem, we capture ⊎'s behaviour in the inductive predicate ⊎ pred . The predicate ⊎ pred (S 1 , S 2 , S 3 ) relates the pair of sequences S 1 and S 2 with the sequence S 3 provided S 3 can be obtained from merging S 1 with S 2 following the inductive definition below. 
. From this definition, it is easy to show the above property for the ⊎ pred predicate (i.e., it preserves the ckBnd property):
The main advantage of the predicate-based form of ⊎ is that it provides an induction principle, allowing us to conduct proofs by induction on the merge operator, rather than on sequences or its properties. Before being able to take advantage of this feature we need to prove that it is safe to replace ⊎ by the ⊎ pred inductive predicate (i.e., we need to prove that they are equivalent). As we intend to replace ⊎ by ⊎ pred only in the context of our semantic predicate, we can safely assume that the ckBnd property holds and prove the equivalence result that we need by showing mutual implication between the two formulations for the parallel-merge operator:
Lemma 5.2.3.
 .
With the equivalence result above we show that Lemma 5.2.2 also holds for ⊎:
The two main lemmas of this section allow us to prove that the behavioural sequences generated by smPred belong to the subset of Seq induced by ckBnd:
Theorem 5.2.1.
This final result ensures that the operators in the semantic domain are always applied within their definition domain by the semantic predicate.
Wire-wise correctness
We are now in a good position to verify the correctness of the wiring schema used to link the different components used at the hardware level. In particular, we are interested in proving the wire-correctness of the generated hardware by means of verifying whether: (a) the activation signal is propagated from the finish signal of the previous circuit; (b) the start signal is given to each component at the right clock cycle; and (c) the internal wiring of each circuit propagates the control signals in the right way and produces the finish pulse at the right time.
It is worthwhile noting that part of the verification is straightforward from the way in which the compilation is done. In particular, each construct is compiled into a black box and it is interfaced only through its start and finish wires. In this sense, it is impossible for a circuit to be started by any component but the circuit containing it, pre-empting the chance of a component c being activated by a random piece of hardware. After this observation, to prove (a) we only need to prove that the finish wire of the appropriate circuit is set to high by the time the subsequent circuit is started.
Regarding the verification of the start signal given at the right time (condition b), we have already included assertions regarding the start signal in the combinatorial prelude of all constructs in order to make sure that the circuit receives a start pulse during the first clock cycle of its execution. The remaining aspect of this question is whether our semantic model of the hardware activates the circuits at the right clock cycle. Towards answering this question, the synchronous time model used in Handel-C together with the component-based approach used in the compilation allows us to verify that the timing in our semantic model is equivalent to the one of the generated hardware. In this context, assuming that the generated hardware implements the timing model correctly, we only need to verify that the wire-related assertions are satisfied in order to verify (b).
The rest of this section is devoted to verifying the wire-correctness of the hardware on the basis of the observations above. We first define a way of calculating whether a given wire is high within the current clock cycle. We then define the concept of wire-satisfiability capturing the notion of wire-based assertions being true in a given set of combinatorial actions and use it to prove all the circuits are given the start signal in the clock cycle they are supposed to be started in.
Wire-transfer closure
The fact that all the combinatorial actions happening at a given clock cycle are collected together in a set provides enough information to allow one to ''calculate'' which wires hold the high value in that clock cycle. This is because of the way in which clock edges are handled: all the wires are set to low, forcing the presence of explicit combinatorial actions to set the appropriate wires to high again in the next clock cycle. In fact, the information about a wire holding the high value can be given either by a single formula (such as π s (c)) or by a chain of value transfers from other wires (such as {w 1 ← w 2 ; w 2 }).
In this context, we define the notion of a wire in high by means of the isHigh predicate:
At this point it might be relevant to highlight the fact that the definition above provides the induction rules that are used to generate the relation isHigh (as mentioned by Melham [18] , ''inductive relations are based on the concept of a relation being closed under a set of rules''). In HOL, an inductively defined definition like the one above adds three elements to a theory: (a) induction rules, (the rules above); (b) the induction principle, capturing how structural induction is to be conducted on the basis of the induction rules; and (c) a case analysis theorem, allowing the derivation of the relation's rules from the conclusions. As an example, we present below the case analysis theorem generated for our isHigh predicate above.
The predicate isHigh captures the notion of a wire w holding the high value provided the actions in set are executed.
From this definition we were able to prove some lemmas that will be necessary in the following sections. In particular, if a given wire w holds the high value in a given set s, then it still does so in a bigger set:
It is also possible to prove that any explicit action in the set of actions setting up a wire w 1 to the high value can be replaced by a pair of actions, one setting up a new wire w 2 to the high value and another one to propagate its value into w 1 :
With these results we are able to prove that for any given construct c the epilogue in the semantics always sets its finish wire π f (c) to high:
This is the first result towards proving (a) in the introduction of this section. In order to complete our verification of (a) we introduce a new assertion type (w 1 w 2 ) ⊥ defined to hold iff isHigh(w 1 )∧isHigh(w 2 ) holds in a given set of combinatorial actions. We then modify our semantic predicate to include assertions of this new type at the places where condition (a) is expected to hold. As an example, we show the updated version of the semantics for the sequential composition construct: In order to complete the verification, we need a way to ensure that the new assertion holds for any possible semantic trace generated from the compilation. The next subsection provides the mechanisms needed to prove that this is the case.
Assertion satisfiability
Having defined the concept of wire being set to high, we need a way to capture the idea of satisfaction of our assertions regarding wires. In particular, we say that all the wire-related assertions in a set are satisfied iff the predicate wireSAT holds 3 :
Definition 6.2.1.
wireSAT (s) = ∀ w ∈ WireIds • w ⊥ ∈ s ⇒ isHigh(w ⊥ , s).
The hardware generated from any given syntactic construct c is started; then all the wire-related assertions in its combinatorial prelude hold: This result is proving consideration (a) from the previous section: the activation signal is propagated from the parent/previous circuit into the start signal of the current one. In fact, even though the above theorem is just stating that it happens that the right start/finish signals get the high value in the appropriate clock cycle, evidence gathered during the proof process showed that the high value actually gets propagated between them, proving consideration (a) to its full extent. It also shows that consideration (b) holds: the start signal is given to each circuit at the appropriate time (provided that the time models of the hardware compilation are correct as regards Handel-C's semantics).
We also proved that the epilogue set of combinatorial actions satisfies wireSAT : Provided that (a) and (b) hold, the verification of (c) can be reduced to proving that the assertions in the behavioural part of the semantic predicate are satisfied. The rationale behind this affirmation is that the base cases for the smPred trivially satisfy (c) while compound circuits can be regarded as placeholders linking start/finish wires of different components by means of combinatorial actions. The assertions introduced in order to verify (a) and (b) are checking that those actions are propagating the right value among the different components involved.
In order to verify the behavioural sequences from the semantic predicate we first need to extend the concept of wiresatisfiability to sequences. We do so by defining the function wireSAT seq as follows: Before being able to use the wireSAT seq function to prove that the control flow wiring is correct in the behavioural sequences we need to prove two lemmas as regarding and ⊎ preserving the wire-satisfiability property for sequences if it holds true for their arguments. The case of concatenation is straightforward by induction over the composed sequences: Proving the equivalent result for the parallel-merge operator is a very complicated task given the already mentioned lack of an induction principle capable of handling two sequences as a pair. As we are still within the context of the semantic predicate (and hence, within the subclass of clock-bounded sequences) we can prove an easier goal: (res) and then use the equivalence between parMerge and ⊎ to deduce the equivalent result for ⊎. With these two results, we are able to prove the correctness of the wiring for the behavioural sequences produced by smPred: With the three main theorems of this section, we show that the wiring is correct (regarding our wire-satisfiability criteria) for any given construct in the core language.
Conclusions and future work
The main contributions of this work are a compilation schema for the priAlt construct, an improved semantic model for the hardware components synthesised from Handel-C programs and the mechanical verification of the wiring schema used to handle the control flow among those components.
This work presents a more abstract semantic domain than the one used in previous works and allows a better description of the parallelism exhibited by the synthesised hardware. In particular, we have defined our semantic domain in terms of a deep embedding of sequences of state-transformers in higher order logic. We have also established a partial order relationship over the domain and proved the existence of fix-point solutions to our inductive approximations for recursive constructs.
The synthesis process we are formalising is based on the assumption that no hardware component will be activated unless a precise signal has been given to it through its interface. We have captured this notion by embedding Handel-C's syntactic constructs in HOL and providing a semantic function that maps each construct in the language to its representation in our semantic model. Moreover, the way in which Handel-C's synchronous nature is encoded in the model introduces explicit information about the value held by the wires used to link different components. We have taken advantage of this feature to formally verify the correctness of the wiring schema used in the compilation.
Even though we have proved that each of the hardware components propagates the control token in the right way, we still need to prove that the hardware generated by the compilation rules is correct (i.e., semantically equivalent to its original Handel-C code). This correctness proof will also allow us to discharge the only assumption of this work: that the timing model of the generated hardware is consistent with the one for Handel-C. Towards this end, our next step is to prove the existence of an equivalence relationship using the semantic models for Handel-C [5, 23] and the semantics for the hardware generated that is presented in this paper.
