Abstract
Introduction
In hardware synthesis, design errors may be extremely expensive. One therefore needs a design methodology that is safe in the sense that it guarantees correctness throughout the design process. Due to the complexity of today's circuits, simulation can never be exhaustive, and post-synthesis verification is NP-complete. This paper presents an approach to design correctness in which synthesis is performed via a sequence of logical transformations, thus guaranteeing correctness by construction.
This paper addresses synthesis at the algorithmic level. It is part of our ongoing work towards a formal synthesis tool named HASH (higher order logic applied to synthesis of hardware). In our previous work [3], algorithmic synthesis was restricted to pure data flow graphs. The extensions presented in this paper allow arbitrary algorithmic descriptions to be synthesised, i.e. mixed control/data flow descriptions.
*This work has been financed by the Deutsche Forschungsgemeinschaft, Project SCHM 623/6-1.
Our work is based on a formal hardware description language named Gropius¹, which ranges from the gate level to the system level. Gropius [2] is a language with a formally exact semantics, where each construct is derived from logic within the HOL [7] theorem prover. In this paper, we introduce the part of Gropius that is related to the algorithmic level (section 2), and we present a new formal hardware synthesis methodology in which the implementation is derived by applying a sequence of program transformations.
There are many hardware description languages. However, there are several reasons why we believe that Gropius is superior to most conventional description languages: Gropius is defined with a precise formal semantics, and it is functional, strongly typed, higher order and polymorphic.
There are also other approaches in which the synthesis process is based on a transformational design style. Unlike such approaches, we formalise the algorithmic description in a mathematical manner and perform the transformations directly in this representation within the theorem prover HOL. Because deriving theorems in HOL is restricted to a small core of rules and axioms, our approach can be considered extremely safe with respect to correctness. This design style, which we call formal synthesis, is superior to those in which the correctness of the transformations is proved but the correctness of their implementation is not. In those approaches, there are only paper-and-pencil proofs of correctness [9], or the circuit transformations are based on a (non-mathematical) formalisation and proofs are performed by intuition rather than within a mathematical logic [8]. In the CAMAD system [11], for instance, the algorithmic description is given in a Pascal-like notation. For transforming the program, it is translated into a formalisation based on timed Petri nets. Both the transformations from Pascal to Petri nets and the transformations within the Petri nets are pieces of software that are complex and safety critical. There is no explicit proof for the correctness of the implementation of these safety critical parts. There are also other formal synthesis approaches in which circuit transformations are performed by applying basic mathematical rules within a theorem prover. However, these are mostly restricted to lower abstraction levels (e.g. Lambda/Dialog [6]), or they are restricted to checking some plausibility criteria rather than performing a complete proof. See [10] for a survey of formal synthesis approaches.
¹Walter Gropius (1883-1969), founder of the Bauhaus.
The starting point for high-level synthesis is an algorithmic description. The result of high-level synthesis is a structure at the Register Transfer level (RT-level). Usually, hardware at the RT-level consists of a data-path and a controller. In the conventional approaches [5], several control states are introduced along a given control/data flow description, thus partitioning it into small cycle-free pieces of program, each corresponding to one clock tick. Then scheduling, allocation and binding are performed on this exponential number of cycle-free pieces, leading to a data-path and a symbolic state transition table. Afterwards, the controller and the communication part are generated.
We have developed a methodology that differs fundamentally from this standard flow. In our approach, the implementation is derived via program transformations. Synthesis is performed in two steps. The first step transforms the program into an equivalent program with a specific shape that we call SLF-representation (section 4). In this step, pre-proven program equations are applied. In the second step, a pre-proven implementation theorem is applied for mapping the SLF-program to an RT-level structure (section 3). There are several such implementation theorems, each corresponding to a specific pattern for the interface behaviour of the hardware implementation.
Formal representation of programs
At the algorithmic abstraction level, the behaviour of the circuit to be synthesised is represented as a pure software program. The concrete timing of the circuit is not yet considered. Gropius offers appropriate means for this level of abstraction. We will now briefly introduce them; for a detailed description see [1, 2].
In Gropius, we distinguish between two different algorithmic descriptions: DFG-terms and P-terms. Both DFG-terms and P-terms can be used as a starting point for synthesising hardware. DFG-terms (Data Flow Graphs) represent simple, non-recursive programs that always terminate. P-terms (Programs) are a means for representing arbitrary computable functions.
Both P-terms and DFG-terms are functions. DFG-terms always terminate. The evaluation of P-terms, however, may not terminate. In our approach, P-terms are used for representing entire programs as well as blocks. Blocks are used for representing inner pieces of programs. Blocks are based on conditions and basic blocks. Both basic blocks and conditions are DFG-terms, with basic blocks having the same input and output type and conditions having a boolean output type.
In Gropius, there is a small core of 8 basic control structures for building arbitrary computable blocks and programs based on basic blocks and conditions: PARTIALIZE (convert a basic block into a block), WHILE (loop), THEN (sequence of blocks), IFTE (conditional branching), LOCVAR (local variable), LEFTVAR and RIGHTVAR (apply block to left/right part of state), and PROGRAM (convert a block into a program).
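As a rough illustration of how these eight control structures compose, the following Python sketch models blocks as ordinary functions from state to state. It is not the Gropius definition: the pairing conventions (a local variable appended as the right component of the state, a program running on an `(input, output)` pair) are assumptions made for this example, and non-termination is simply modelled by a Python loop that never returns.

```python
# Minimal sketch, assuming blocks are state -> state functions.
# The state layout chosen here is an assumption, not taken from Gropius.

def PARTIALIZE(basic_block):
    """Lift an always-terminating basic block to a block."""
    return lambda s: basic_block(s)

def WHILE(cond, body):
    """Repeat body as long as cond holds on the current state."""
    def run(s):
        while cond(s):
            s = body(s)
        return s
    return run

def THEN(first, second):
    """Run first, then second, on the resulting state."""
    return lambda s: second(first(s))

def IFTE(cond, then_block, else_block):
    """Conditional branching between two blocks."""
    return lambda s: then_block(s) if cond(s) else else_block(s)

def LOCVAR(init, block):
    """Run block on (state, local), with local set to init; drop local afterwards."""
    return lambda s: block((s, init))[0]

def LEFTVAR(block):
    """Apply block to the left component of a pair state."""
    return lambda s: (block(s[0]), s[1])

def RIGHTVAR(block):
    """Apply block to the right component of a pair state."""
    return lambda s: (s[0], block(s[1]))

def PROGRAM(out_init, block):
    """Turn a block on (input, output) into a function input -> output."""
    return lambda x: block((x, out_init))[1]

# Example: factorial as a loop over the state (n, acc).
factorial = PROGRAM(
    1,
    WHILE(lambda s: s[0] > 0,
          PARTIALIZE(lambda s: (s[0] - 1, s[1] * s[0]))),
)
print(factorial(5))  # prints 120
```

The example program already has a single loop at its core, which is the kind of shape the later sections aim for.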
The syntax of DFG-terms, blocks and programs is defined by the following Backus-Naur form:
Providing only a small basic language for representing programs leads to a small number of syntactic constructs to be considered and therefore reduces the number of program transformations that have to be derived. However, Gropius allows the programmer to derive new control structures.
Here are some examples:
In our approach, the mapping from an algorithmic description to hardware is performed in two steps: programs are first turned into a specific pattern called single-loop form (SLF, see section 4), and then an implementation theorem is applied for mapping the SLF program to hardware (see section 3). Programs in SLF have the following shape.
In this expression, o_init and v_init denote arbitrary constants, c is a condition and a is an arbitrary basic block.
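The behaviour of a program in single-loop form can be sketched in Python as follows. The SLF term itself is not reproduced here, so the state layout `((input, output), local)` and the function name `run_slf` are assumptions made for illustration: the output is initialised with `o_init`, one local variable with `v_init`, and a single loop applies the basic block `a` while the condition `c` holds.

```python
# Minimal sketch of executing a single-loop form, under an assumed
# state layout ((input, output), local). Not the Gropius definition.

def run_slf(o_init, v_init, c, a, inp):
    """Run one loop: apply a while c holds, then return the output part."""
    state = ((inp, o_init), v_init)
    while c(state):
        state = a(state)
    (_, out), _ = state
    return out

# Example: sum of 1..n, with the local variable used as loop counter.
def sum_cond(state):
    (n, _), i = state
    return i < n

def sum_step(state):
    (n, out), i = state
    return ((n, out + i + 1), i + 1)

print(run_slf(0, 0, sum_cond, sum_step, 4))  # prints 10
```

All control flow is concentrated in one loop, so the whole program is characterised by the two terminating functions `c` and `a` plus the two constants.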
Converting programs to the Register Transfer level
The algorithmic description only defines the mapping from input values to output values. Time is not yet considered. To bridge the gap between the algorithmic description and the hardware implementation, the interface behaviour has to be described, specifying how the algorithm communicates with its environment.
Many approaches in the high-level synthesis domain use a notation in which algorithmic description and interface description are mingled [11]. In our approach, algorithmic description and interface description are strictly separated. A fixed set of interface patterns is provided. The circuit designer can first write an ordinary, time-independent algorithm and can then select one of the interface patterns, thus defining the way the circuit communicates with its environment. It is easy for the designer to switch from one interface behaviour to another without changing the program. Therefore, this methodology supports the reuse of designs in a systematic manner.
There are many possible interface specification patterns. Usually, the interface of the implementation not only consists of the data signals from the algorithmic description, but there are also additional control signals. They are used to steer the communication and to allow interrupting the execution of the algorithm.
The formula below shows the formal definition of an interface specification pattern called IFC1. It describes the relation between the interface signals input, reset, start, output and ready with respect to an arbitrary program f.
For each interface pattern that we provide, we also give a correct implementation pattern in terms of an implementation theorem. All implementation theorems expect the algorithmic description to be in SLF. Figure 1 shows the structure of the general hardware implementation that we found for interface pattern IFC1.
We represented this structure in logic and named it IMP1². The following theorem states that IMP1 fulfils IFC1 for every program f in SLF with some DFG-terms a and c and arbitrary constants v_init and o_init.
IFC1 (input, reset, start, output, ready,
²For the sake of space, we will not give the structural description in Gropius. See [2] for structural RT-level descriptions in Gropius.
When mapping an algorithmic description to hardware, certain cost functions have to be considered. The main critical aspects are area consumption and timing behaviour. In general, these optimisation goals are contradictory. When regarding the RT-level implementation in figure 1, one can see that the two DFG-terms a and c directly determine the hardware costs.
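To make the role of a and c more tangible, here is a hypothetical clocked model in Python. It is not the IMP1 structure from the paper and does not claim to satisfy IFC1. It only assumes the simplest conceivable arrangement: one state register, with `c` and `a` evaluated combinationally once per clock tick, and `start`, `reset` and `ready` handled in an invented way. The class name `SlfMachine` and the helper `run` are made up for this sketch.

```python
# Hypothetical clocked model of a single-loop program: one register,
# one evaluation of c (and possibly a) per tick. Signal handling is
# an assumption for illustration, not the paper's IFC1/IMP1.

class SlfMachine:
    def __init__(self, o_init, v_init, c, a):
        self.o_init, self.v_init = o_init, v_init
        self.c, self.a = c, a
        self.state = None   # contents of the state register
        self.busy = False   # True while a computation is in progress

    def tick(self, inp, reset=False, start=False):
        """Advance one clock tick and return (output, ready)."""
        if reset:
            self.state, self.busy = None, False
            return self.o_init, False
        if not self.busy:
            if start:  # load the input and the initial constants
                self.state = ((inp, self.o_init), self.v_init)
                self.busy = True
            return self.o_init, False
        if self.c(self.state):  # one loop iteration per tick
            self.state = self.a(self.state)
            return self.o_init, False
        self.busy = False       # condition false: result is available
        return self.state[0][1], True


def run(machine, inp, max_ticks=1000):
    """Start the machine and clock it until ready; return (output, ticks)."""
    out, ready = machine.tick(inp, start=True)
    ticks = 1
    while not ready:
        if ticks >= max_ticks:
            raise RuntimeError("no result within max_ticks")
        out, ready = machine.tick(inp)
        ticks += 1
    return out, ticks


def sum_cond(state):
    (n, _), i = state
    return i < n

def sum_step(state):
    (n, out), i = state
    return ((n, out + i + 1), i + 1)

print(run(SlfMachine(0, 0, sum_cond, sum_step), 4))  # prints (10, 6)
```

In such a model, the combinational logic of `a` and `c` bounds the clock period, and the number of loop iterations determines the number of ticks. That is one way to read the remark that these two DFG-terms determine the hardware costs.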
Converting programs to single-loop form
Every P-term can be transformed into an SLF. However, the SLF of a P-term is not unique; there are several equivalent SLFs. Different SLFs lead to different implementations with different costs with respect to hardware consumption and execution speed. In our approach, doing a good high-level synthesis means producing an SLF which corresponds to a cost-minimal implementation.
Within the HOL theorem prover environment, we have proven several transformation theorems, which can be subdivided into two groups: SPT (standard program transformation) theorems and OPT (optimisation program transformation) theorems. SPT theorems are used to convert arbitrary programs to SLF. Rewriting with the set of SPT theorems is confluent (i.e. applying them in an arbitrary order always leads to the same result) and always leads to an SLF. OPT theorems are used for optimising the control structures.
The SPT theorem set comprises a fixed number of 27 equations (see theorem 1 for an example). In simplified terms, the equations reduce the number of control structures (THEN, WHILE, ...) by adding new data variables holding the current control information. Furthermore, local variables are shifted from the inside to the outside of control structures.
[Theorem 1: the formula is not recoverable in this copy. The surviving fragments show a WHILE term, a PARTIALIZE block over a state of the form ((x, h1), h2), and a MUX.]
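The idea of trading control structures for data variables can be illustrated with a small Python sketch. It is not one of the 27 SPT equations. It is a hand-written example, under the assumption that a sequence of two loops is merged into a single loop by introducing one boolean variable `h` that records which of the original loops is currently active.

```python
# Illustration only: merging two sequential loops into one loop by
# adding a control flag. The flag h and both function names are
# hypothetical and not taken from the SPT theorem set.

def two_loops(c1, a1, c2, a2, x):
    """Reference behaviour: first loop, then second loop."""
    while c1(x):
        x = a1(x)
    while c2(x):
        x = a2(x)
    return x

def single_loop(c1, a1, c2, a2, x):
    """Same behaviour with one loop; h is False in phase 1, True in phase 2."""
    h = False
    while (not h) or c2(x):
        if not h:
            if c1(x):
                x = a1(x)   # still inside the first loop
            else:
                h = True    # first loop finished: switch phase
        else:
            x = a2(x)       # inside the second loop
    return x

# Example: count up in steps of 3 to at least 10, then down in steps of 4.
c1, a1 = (lambda x: x < 10), (lambda x: x + 3)
c2, a2 = (lambda x: x > 0), (lambda x: x - 4)
print(two_loops(c1, a1, c2, a2, 1), single_loop(c1, a1, c2, a2, 1))  # prints -2 -2
```

The loop body of `single_loop` contains no loops itself, so it could play the role of a basic block. The price is one extra iteration for switching phases and additional selection logic in the body.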

Conclusion
Currently, 12 OPT theorems have been proven. Unlike the SPT theorem set, this set is not fixed and may be extended. A very powerful OPT theorem implements loop unrolling. The theorem describes the equivalence between a while-loop and an n-fold unrolled while-loop with several loop bodies which are executed successively. Between two loop bodies, the loop condition is checked to guarantee that the next body is only executed if the condition still holds. The advantage of loop unrolling is that the combinatorial depth is increased, which reduces the number of clock ticks required for executing the program. Theorem (2) gives the definition of a special for-loop FOR_N, which realises an n-fold application of the same function (see section 2). Theorem (3) shows the general loop-unrolling theorem, and theorem (4) can be used to remove the function FOR_N after n has been instantiated.
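The effect of loop unrolling can be sketched in Python as follows. Theorems (2) to (4) are not reproduced here, so this is only an assumed reading of them: `for_n` applies a function n times, and the unrolled body applies the original body up to n times, re-checking the loop condition before each application. The iteration count returned by `while_count` stands in for the number of clock ticks.

```python
# Sketch of n-fold loop unrolling with a guarded body. The names
# for_n, unroll and while_count are chosen for this example only.

def for_n(n, f):
    """n-fold application of the same function f."""
    def run(s):
        for _ in range(n):
            s = f(s)
        return s
    return run

def unroll(n, c, a):
    """Loop body that executes a up to n times, checking c in between."""
    guarded = lambda s: a(s) if c(s) else s
    return for_n(n, guarded)

def while_count(c, a, s):
    """Run a while-loop and also count its iterations."""
    iterations = 0
    while c(s):
        s = a(s)
        iterations += 1
    return s, iterations

# Example: count down from 10.
c = lambda s: s > 0
a = lambda s: s - 1
print(while_count(c, a, 10))                # prints (0, 10)
print(while_count(c, unroll(3, c, a), 10))  # prints (0, 4)
```

Both loops compute the same result, but the unrolled one needs fewer iterations, at the cost of a body that chains three copies of `a` and `c` and is therefore combinationally deeper.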
Within the HOL theorem proving system, we have proven the SPT theorems and the OPT theorems by hand. These theorems are powerful enough to derive SLFs in different ways. A given program is first optimised by applying some OPT theorems. The OPT theorems can either be applied by hand or be selected by heuristics. Afterwards, the SPT theorems are applied to produce an SLF. This can be performed by pure rewriting, which is fully automated by the HOL system.
Converting a program to an SLF corresponds to conventional scheduling, allocation and binding techniques. In our approach, design goals such as hardware consumption and execution speed are reached by selecting suitable theorems among the OPT theorems. Unlike in conventional synthesis techniques, the schedule (the assignment of operations to clock cycles) is not calculated explicitly but is derived implicitly during the theorem applications. Allocation and binding are performed after the SLF-transformation within the basic blocks of the SLF. Performing allocation and binding within basic blocks in formal synthesis has already been presented in [3].
In this paper, we have presented a new methodology for deriving RT-level structures from circuit descriptions at the algorithmic level. It differs from other high-level synthesis approaches in three aspects. First, it is formal. The implementation is derived by applying basic logical transformations within a theorem prover, thus guaranteeing correctness implicitly. Second, it provides a new synthesis concept. The implementation is derived by applying program transformations rather than by extracting a control and data flow graph and analysing an exponential number of control paths. Third, the input language for our high-level synthesis supports design reuse by using interface patterns rather than mingling algorithmic aspects and interface behaviour.
Our hardware description language Gropius can be used to describe circuits at different abstraction levels (see [4]). It provides a consistent concept for a correctness-by-design synthesis style ranging from the algorithmic level down to the gate level.
