Abstract-This paper describes the T-Ruby system for designing VLSI circuits, starting from formal specifications in which they are described in terms of relational abstractions of their behaviour. The design process involves correctness-preserving transformations based on proved equivalences between relations, together with the addition of constraints. A class of implementable relations is defined. The tool enables such relations to be simulated or translated into a circuit description in VHDL. The design process is illustrated by the derivation of a circuit for 2-dimensional convolution.
I. INTRODUCTION
This paper describes a computer-based system, known as T-Ruby [10], for designing VLSI circuits starting from a high-level, mathematical specification of their behaviour. A circuit is described by a binary relation between appropriate, possibly complex domains of values, and simple relations can be composed into more complex ones by the use of a variety of combining forms which are higher-order functions.
The basic relations and their combining forms generate an algebra, which defines equivalences (which may take the form of equalities or conditional equalities) between relational expressions. In terms of circuits, each such equivalence describes a general correctness-preserving transformation for a whole family of circuits of a particular form. In the design process, these equivalences are exploited to transform a "specification" in the form of one Ruby expression to an "implementation" in the form of another Ruby expression, in a calculation-oriented style [3, 7, 11].
T-Ruby is based on a formalisation of Ruby, originally introduced by Jones and Sheeran [2], as a language of functions and relations, which we refer to as the T-Ruby language. The purpose of the paper is to demonstrate how such a general language can be used to bridge the gap between a purely mathematical specification and the implementable circuit. The design of a circuit for 2-dimensional convolution is used to illustrate some of the features of the method, in particular that the step from a given mathematical specification to the initial Ruby description is small and obvious, and that the method allows us to derive generic circuits where the choice of details can be postponed until the final actual synthesis. The T-Ruby system enables the user to perform the desired transformations in the course of a design, to simulate the behaviour of the resulting relation and to translate the final Ruby description of the relation into a VHDL description of the corresponding circuit for subsequent synthesis by a high-level synthesis tool. The transformational style of design ensures the correctness of the final circuit with respect to the initial specification, assuming that the equivalences used are correct. Proofs of correctness are performed with the help of a separate theorem prover, which has a simple interface to T-Ruby, so that proof burdens can be passed to the prover and proved equivalences passed back for inclusion in T-Ruby's database.

Dept. of Computer Science, Technical University of Denmark; e-mail: osr@id.dtu.dk
The division of the system into the main T-Ruby system, a theorem prover and a VHDL translator has followed a "divide and conquer" philosophy. Theorem proving can be very tedious and often needs specialists. In our system the designer can use the proved transformation rules in the computationally relatively cheap T-Ruby system, leaving proofs of specific rules and conditions to the theorem prover. When a certain level of concretisation is reached, efficient tools already exist to synthesise circuits. Therefore we have chosen to translate our relational descriptions into VHDL.
II. RUBY
The work described in this paper is based on the so-called Pure Ruby subset of Ruby, as introduced by Rossen [8]. This makes use of the observation that a very large class of the relations which are useful for describing VLSI circuits can be expressed in terms of four basic elements: two relations and two combining forms. These are usually defined in terms of synchronous streams of data as shown in Figure 1. In the figure, the type sig (7
The types of the relations describe the types of the signals passing through the interface between the circuit and its environment. However, the relational description does not specify the direction in which data passes through the interface. "Input" and "output" can be mixed in both the domain and the range.
A feature of Ruby is that relations and combinators not only have an interpretation in terms of circuit elements, but also have a natural graphical interpretation, corresponding to an abstract floorplan for the circuits. The conventional graphical interpretation of spread (or of any other circuit whose internal details we do not wish to show) is as a labelled rectangular box. The components of the domain and range are drawn as wire stubs, whose number reflects the types of the relations in an obvious manner: a simple type gives a single stub, a pair type two and so on. The components of the domain are drawn up the left hand side and the range up the right. The remaining elements of Pure Ruby are drawn in an intuitively obvious way, as illustrated in 
III. THE T-RUBY LANGUAGE
In T-Ruby all circuits and combinators are defined in terms of the four Pure Ruby elements using a syntax in the style of the typed lambda calculus. Definitions of some circuits and combinators with their types are given in Figure 3. α, β and so on denote type variables and can thus stand for any type. The first five definitions are of non-parameterised stream relations, which correspond to circuits. +, defined using the spread element applied to a function which evaluates to true when z equals the sum of x and y, pointwise relates two integers to their sum. ι is the (polymorphic) identity relation, dub pointwise relates a value to a pair of copies of that value and reorg pointwise relates two ways of grouping three values into pairs. These all describe combinational circuits; all except + just describe patterns of wiring, and are known as wiring relations. The fifth, SUMspec, describes a simple sequential circuit: an adding machine with an accumulator register.
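As a rough illustration of these definitions, the following Python sketch models a stream as a finite list of samples and a relation as a Boolean-valued function of a domain stream and a range stream. The finite-window model, the Python names (`spread`, `plus`, `ident`, `dub`, `reorg`) and the grouping direction chosen for reorg are assumptions made for this example; they are not T-Ruby syntax.

```python
def spread(pred):
    """Lift a pointwise predicate pred(x, y) to a relation on finite streams."""
    def rel(dom, rng):
        return len(dom) == len(rng) and all(pred(x, y) for x, y in zip(dom, rng))
    return rel

# '+' relates a pair (x, y) to a value z exactly when z = x + y.
plus = spread(lambda xy, z: z == xy[0] + xy[1])

# The identity relation relates a value to itself.
ident = spread(lambda a, b: a == b)

# dub relates a value to a pair of copies of that value.
dub = spread(lambda a, p: p == (a, a))

# reorg relates ((a, b), c) to (a, (b, c)); this direction is an assumption.
reorg = spread(lambda l, r: r == (l[0][0], (l[0][1], l[1])))

# A relation is only a test on (domain, range) pairs; it does not say which
# side is "input" and which is "output".
adds_up = plus([(1, 2), (3, 4)], [3, 7])
```

Because the relation is a predicate rather than a function, the same `plus` can be used to check an addition or a subtraction, which is the point taken up later in the causality analysis.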
The remaining definitions are examples of combinators, which always have one or more parameters, typically describing the circuits to be combined. Applying a combinator to suitable arguments gives a circuit. Thus (Fst R)

In the T-Ruby language, "repetitive" combinators and wiring relations are parameterised in the number of repetitions. This is reflected in the type system, which includes dependent product types [4], a generalisation of normal function types, which enable us to express the size of repetitive structures explicitly in the type system. For example, the combinator map (which "maps" a relation over all elements in a list of streams) has the polymorphic dependent type:
x n : int, R : a 5 p. a 2 p) ).
reorg ; Snd (s) ; reorg-I ; Fst (R)
). The combinator mapf is similar to map but the second parameter is a function from integers to relations, so that the relation used can depend on its position in the structure. tri creates a triangular circuit structure, and colf a column structure where each relation is parametrised in its position in the column. Similarly, rdrf, called "reduce right" as in functional programming, if used for example with the function (λi : int · +) as argument, gives a relation which relates a list of integers to their sum. The graphical interpretations of some of these repetitive combinators are shown in Figure 4.
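The role of the size parameter can be sketched in Python as follows. This is only an analogy: Python has no dependent types, so the size n is checked at run time, relations are modelled as predicates on single values rather than streams, and the names `map_n` and `mapf_n` as well as the choice of 0-based positions are assumptions of the example.

```python
def map_n(n, rel):
    """Relate two n-lists elementwise by rel; both sizes must equal n."""
    def r(dom, rng):
        return (len(dom) == n and len(rng) == n
                and all(rel(a, b) for a, b in zip(dom, rng)))
    return r

def mapf_n(n, f):
    """Like map_n, but the relation f(i) used depends on the position i."""
    def r(dom, rng):
        return (len(dom) == n and len(rng) == n
                and all(f(i)(a, b) for i, (a, b) in enumerate(zip(dom, rng))))
    return r

# A hypothetical pointwise relation: b is twice a.
double = lambda a, b: b == 2 * a

# A hypothetical position-dependent relation: b is a plus the position i.
add_pos = lambda i: (lambda a, b: b == a + i)

same_size_ok = map_n(3, double)([1, 2, 3], [2, 4, 6])
wrong_size = map_n(3, double)([1, 2], [2, 4])
by_position = mapf_n(3, add_pos)([10, 10, 10], [10, 11, 12])
```

In T-Ruby the size mismatch in `wrong_size` would be a type error rather than a run-time result.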
¹ Note that the size argument n here, as elsewhere, is written as a subscript to improve readability.

Note that the definitions are all given in a point-free notation, reflecting the fact that they are all expressed in terms of the elements of Pure Ruby. It is easy to show that they are equivalent to the expected definitions using data values; for example, that:
However, defining circuits in terms of Pure Ruby elements offers several advantages: it greatly simplifies the definition and use of general rewrite rules; it simplifies reasoning about circuits in a theorem prover; and it eases the task of translating the language into a more traditional VLSI specification language such as VHDL.
IV. THE TRANSFORMATIONAL PHASE OF T-RUBY DESIGN
The design process in T-Ruby involves three main activities, reflecting the overall design of the system: transformation by rewriting in the main T-Ruby system, proof of rewrite rules and conditions in the theorem prover, and translation to VHDL. In this section we consider the first phase, which involves transforming an initial specification by rewriting, possibly with the addition of typing or timing constraints, so as to approach an implementable design described as a Ruby relation.
A. Rewriting with Constraints
Rewriting is an essential feature of the calculational style of design which is used in Ruby. The T-Ruby system allows the user to rewrite Ruby expressions according to pre-defined rewrite rules. Rewriting takes place in an interactive manner directed by the user, using basic rewrite functions, known as tactics, which can be combined by the use of higher order functions known as tacticals. This style of system is often called a transformation system to distinguish it from a conventional rewrite system. T-Ruby is implemented in the functional programming language Standard ML (SML), which offers an interactive user environment, and the tactics and tacticals are all SML functions, applied in this environment.
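The idea of tactics combined by tacticals can be sketched as follows. T-Ruby's tactics are SML functions; the Python below is only an analogy, and the expression encoding (nested tuples), the tactic `fst_comp_dist` and the tacticals `then_`, `orelse`, `repeat` and `somewhere` are hypothetical names chosen for this example, not the system's actual interface.

```python
# Expressions are nested tuples: ("seq", R, S) for R ; S and ("fst", R) for Fst R.
# A tactic maps an expression to a rewritten expression, or to None if it fails.

def fst_comp_dist(e):
    """Fst (R ; S)  ->  Fst R ; Fst S (fails on any other shape)."""
    if (isinstance(e, tuple) and e[0] == "fst"
            and isinstance(e[1], tuple) and e[1][0] == "seq"):
        _, (_, r, s) = e
        return ("seq", ("fst", r), ("fst", s))
    return None

def then_(t1, t2):
    """Tactical: apply t1, then t2 to its result; fail if either fails."""
    def t(e):
        e1 = t1(e)
        return None if e1 is None else t2(e1)
    return t

def orelse(t1, t2):
    """Tactical: apply t1, falling back to t2 if t1 fails."""
    def t(e):
        e1 = t1(e)
        return e1 if e1 is not None else t2(e)
    return t

def somewhere(t):
    """Tactical: apply t at the first (outermost, leftmost) subterm where it succeeds."""
    def s(e):
        out = t(e)
        if out is not None:
            return out
        if isinstance(e, tuple):
            for i in range(1, len(e)):
                sub = s(e[i])
                if sub is not None:
                    return e[:i] + (sub,) + e[i + 1:]
        return None
    return s

def repeat(t):
    """Tactical: apply t until it fails or no longer changes the expression."""
    def r(e):
        while True:
            e1 = t(e)
            if e1 is None or e1 == e:
                return e
            e = e1
    return r

expr = ("fst", ("seq", "R", ("seq", "S", "T")))
result = repeat(somewhere(fst_comp_dist))(expr)
```

Here `result` is the expression with Fst distributed over both serial compositions; the user directs the rewriting by choosing which tactics and tacticals to apply.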
In the T-Ruby system, a rewrite rule is an expression with the form of an equality or an implication between two equalities, with explicit, typed universal quantification over term variables and in most cases implicit universal quantification over types via the use of type variables. Apart from this there are no restrictions on the forms of the rules which may be used. In practice, however, the commonly used rules are equalities between relational expressions, corresponding to equivalences between circuits, which can be used to manipulate a circuit description in Ruby to another, equivalent form. Rules for manipulating integer or Boolean expressions could, of course, also be introduced, but most such manipulations are performed automatically by a built-in expression simplifier based on traditional rewriting to a normal form.
Some examples of rules can be seen in Figure 5. The first rules express simple facts about the combinators, such as the commutativity of Fst and Snd (fstsndcomm), the fact that the inverse of a serial composition is the backward composition of the inverses (inversecomp), and the distributivity of Fst over serial composition (fstcompdist). The fourth rule, maptricomm, is an example of a conditional rule: the precondition that R and S commute over serial composition must be fulfilled in order for $tri_n\,R$ and $map_n\,S$ to commute. Similarly, forkmap states that if R is a functional relation then a single copy on the domain side of an n-way fork is equivalent to n copies on the range side of the fork. Finally, rules such as retimecol are used in Ruby synthesis to express timing features, such as the input-output equivalence of a circuit to a systolic version of the same circuit. Note that since these rules contain universal quantifications over relations of particular types, they essentially express general properties of whole families of circuits.
In the T-Ruby system, the directed rules used for rewriting come from three sources. They may be explicit rewrite rule definitions, implicit definitions derived from circuit or combinator definitions (which permit the named circuit or combinator to be replaced by its definition or vice-versa), or lemmata derived from previous rewrite processes which established the equality of two expressions, say t and t'.
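The first three rules can be checked on small examples. The sketch below models combinational relations as finite sets of pairs over a three-element value set; it assumes the usual Ruby readings Fst R = [R, id] and Snd S = [id, S], since the definitions in Figure 3 are not reproduced here. The helper names and the sample relations R and S are invented for the illustration, and a check on one example is of course not a proof.

```python
def seq(r, s):
    """Serial composition R ; S of two finite relations."""
    return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}

def inv(r):
    """Relational inverse."""
    return {(b, a) for (a, b) in r}

def par(r, s):
    """Parallel composition [R, S], acting on pairs."""
    return {((a, c), (b, d)) for (a, b) in r for (c, d) in s}

V = range(3)                      # the finite value set used in this example
ID = {(v, v) for v in V}

def fst(r):
    return par(r, ID)             # assumed reading: Fst R = [R, id]

def snd(s):
    return par(ID, s)             # assumed reading: Snd S = [id, S]

R = {(0, 1), (1, 2), (2, 2)}
S = {(0, 0), (1, 0), (2, 1)}

fstsndcomm = seq(fst(R), snd(S)) == seq(snd(S), fst(R))
inversecomp = inv(seq(R, S)) == seq(inv(S), inv(R))
fstcompdist = fst(seq(R, S)) == seq(fst(R), fst(S))
```

All three comparisons hold for these sample relations, as the rules predict.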
The correctness of the explicit rules is proved by the use of a tool [6] based on the Isabelle theorem prover [5] , using an axiomatisation of Ruby within ZF set theory. To make life easier for the user, conjectured rewrite rules can, however, be entered without having been proved. When rewriting is finished, all such unproved rewrite rules are printed out. Together with any instantiated conditions from the conditional rules, they form a proof obligation which the user must transfer to the theorem prover to ensure the soundness of the rewriting process.
The transformation process in the T-Ruby system primarily involves rewriting expressions as described above. However, rewriting can only produce relations which are exactly equivalent to the original, abstract specification. The T-Ruby system therefore offers the user the possibility of:
• Introducing subtyping by adding relational constraints [11].
• Modifying the timing by adding delay elements, as illustrated in the following section.
• Specialisation by instantiation of free type or term variables.

In general the transformation process starts from a relational specification, spec, of a circuit, at some suitably high level of abstraction. spec is then rewritten by a number of equality rewrites in order to reach a more implementable form. The process can be illustrated by a series of transformations:
$spec \rightarrow step_1 \rightarrow step_1' \rightarrow step_2 \rightarrow step_2' \rightarrow \cdots \rightarrow impl$

where the primes denote the added constraints. The original specification is changed accordingly from $spec$ to $spec^{\prime\prime\cdots\prime}$, reflecting the addition of the constraints and ensuring equality between impl and the final constrained specification. From a logical point of view [14], the constraints can be regarded as the assumptions under which the implementation fulfills the original specification:
B. An Example: 2-dimensional Convolution
As an example of the transformation process, we present part of the design of a VLSI circuit for 2-dimensional discrete convolution. The mathematical definition is that from a (2r +

The intuition behind this is that the stream a represents a sequence of rows of length w, and that each value in c is a weighted sum over the corresponding value in the a-stream and its "neighbours" out to distance ±r in two dimensions, using the weights given by the matrix K. This is commonly used in image processing, where a is a stream of pixel values scanned row-wise from a sequence of images, and K describes some kind of smoothing or weighting function. Note that for each i, the summation over j is equal to the 1-dimensional convolution of a with the i'th row of K with a time offset of w·i, where 1-dimensional convolution is defined by:
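The decomposition into row-wise 1-dimensional convolutions described above can be checked numerically. The sketch below is written under explicit assumptions: it takes the convention c(t) = Σ_i Σ_j K[i][j] · a(t + w·i + j) with i and j running from -r to r, treats samples outside the finite test stream as 0, and ignores the wrap-around between consecutive rows. The function names are invented for the example, and the sign and index conventions of the paper's own definitions may differ.

```python
def sample(a, t):
    """Value of the stream a at time t; 0 outside the finite test window (assumption)."""
    return a[t] if 0 <= t < len(a) else 0

def conv1(a, row, r, t):
    """Assumed 1-dimensional convolution: sum over j of row[j] * a(t + j)."""
    return sum(row[j + r] * sample(a, t + j) for j in range(-r, r + 1))

def conv2(a, K, r, w, t):
    """Assumed 2-dimensional convolution on a row-scanned stream of row length w."""
    return sum(K[i + r][j + r] * sample(a, t + w * i + j)
               for i in range(-r, r + 1) for j in range(-r, r + 1))

def conv2_by_rows(a, K, r, w, t):
    """The same value as a sum of 1-dimensional convolutions offset by w * i."""
    return sum(conv1(a, K[i + r], r, t + w * i) for i in range(-r, r + 1))

a = list(range(1, 21))            # a test stream: 4 rows of length w = 5
K = [[1, 2, 1], [2, 4, 2], [1, 2, 1]]
agree = all(conv2(a, K, 1, 5, t) == conv2_by_rows(a, K, 1, 5, t)
            for t in range(len(a)))
```

With a kernel whose only non-zero weight is the central one, `conv2` simply reproduces the input sample, which is a convenient sanity check on the indexing.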
The first step in the design process is to formulate the mathematical definitions in Ruby. Following the style of design used for a correlator in [2], we now divide the relation between a and c into a combinational part, which relates c-values at a given time to a'-values at the same time (for convenience we let the summation run from 1, applying the substitution $i_{new} = i_{old} + r + 1$ and likewise for j):

and a temporal part which relates the a' values at time t to the original a-values:
The temporal part, the matrix aij, can be further split into parts which can easily be specified directly in Ruby.
First, for a given $i$, we find a relation which relates $b'_i$ to a $(2r + 1)$-list of $a'_{ij}$:
1. An offset dependent on the position $j$, such that $a'_{ij}(t) = a''_{ij}(t + j - 1)$, which in Ruby can be specified by stating that $(a'', a')$ are related by $(tri_{2r+1}\, D^{-1})$.
2. A $(2r + 1)$-way fork, such that $a''_{ij}(t) = a'''_i(t)$, specified by $(a''', a'') \in fork_{2r+1}$.
3. A fixed offset, such that $a'''_i(t) = b'_i(t - r)$, specified by $(b', a''') \in D^r$.
7. Another fixed offset, such that $b'''(t) = a(t - wr)$, specified by $(a, b''') \in (D^w)^r$.
8. Assembling 5-7 we get:

It is convenient to rewrite relations (4) and (8) above as follows:
$\in (fork_{2r+1} \,;\, butterfly_r\, D^w)$ (6)
where the combinator butterfly is defined by:
To define these relations in T-Ruby, it is convenient to parameterise them, so that they become combinators 
z ( t ) ) is related to c i ( t ) + ~( t )
by the Ruby relation $(rdrf_{2r+1}\,(Q\ i))$, where rdrf is defined in Figure 3. Combining this with the temporal relations given in definitions 5 and 6 we find that the entire 2-dimensional convolution relation CR2, which relates

CR1 corresponds to the inner summation over j in the specification. The graphical interpretation of CR2 for r = 2 is shown on the left in Figure 6, and the interpretation of (CR1 i) for i = 1 on the right. The butterflies contain increasing numbers of delay elements, $D$, above the mid-line and increasing numbers of "anti-delay" elements, $D^{-1}$, below the mid-line. As follows from the definitions, the small butterflies use single delay elements, corresponding to the time difference between consecutive elements in the data stream, while the large butterflies use groups of w delay elements, corresponding to the time difference between consecutive lines in the data stream. With these definitions, the actual circuit for 2-dimensional convolution is described by the relation (conv2 r w Q) for suitable values of r, w and Q.
B2 Transformation to an implementable relation
Unfortunately, the relation given above does not describe a physically implementable circuit, if we assume (as we implicitly have done until now) that the inputs appear in the domain of the relation (as x and a) and the outputs in the range (as c). This is because of the "anti-delays", $D^{-1}$, in the butterflies. So instead of trying to implement the relation (conv2 r w Q) as it stands, we implement a retimed version of it, formed by adding a constraint on the domain side which delays all the input signals:
This will result in the anti-delays being cancelled out, as the delay elements in the constraint are moved "inwards" into the original relation. The resulting circuit will produce its outputs r·(w + 1) time units later than the original circuit, but this is the best we can achieve in the physical world we live in! From here on we use a series of rewrite rules to manipulate the relation into a more obviously implementable form. The derivation, shown in full in [12], finishes with the relational expression:
with no free variables. The graphical interpretation of this final version of the circuit is shown in Figure 7 . As can be seen in the figure, the circuit is semi-systolic, with a latch (described by a delay element, D) associated with each combinational element, but with a global distribution of the input stream a to all of the combinational elements.
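The arithmetic behind the retiming step can be illustrated with a small calculation. The sketch assumes that kernel position (i, j), with i and j between -r and r, refers to the input sample at time offset w·i + j relative to the output; the sign convention and the function names are assumptions of the example. Under that assumption the largest look-ahead is r·(w + 1), so delaying every input by that amount leaves only non-negative delays.

```python
def offsets(r, w):
    """Time offsets w*i + j used by the kernel positions (assumed convention)."""
    return [w * i + j for i in range(-r, r + 1) for j in range(-r, r + 1)]

def delays_after_retiming(r, w):
    """Delay needed at each kernel position once all inputs are delayed by r*(w+1)."""
    shift = r * (w + 1)
    return [shift - o for o in offsets(r, w)]

r, w = 2, 8
needs_future = any(o > 0 for o in offsets(r, w))   # anti-delays before retiming
d = delays_after_retiming(r, w)
```

For r = 2 and w = 8 the retimed delays range from 0 to 2·r·(w + 1) = 36, so no anti-delays remain, at the cost of a latency of r·(w + 1) = 18 time units.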
C. Selection and Extraction
The rewriting system of T-Ruby includes facilities for selection of subterms from the target expression by matching against a pattern with free variables. This can be used to restrict rewriting temporarily to a particular subterm, or, more importantly, for extraction of part of the target expression for implementation. In the latter case, the remainder of the target expression gives a context describing a set of implementation conditions that must be fulfilled for the extracted part to work. Extraction is in many respects the converse of adding relational constraints to the specification, and the context specifies the same sorts of requirement. Firstly, it may give representation rules which must be obeyed at the interface to the extracted subterm, and secondly (if the context contains delay elements, D), it may give timing requirements for the implementation of the subterm.
V. VLSI IMPLEMENTATION
The relational approach to describing VLSI circuits offers a greater degree of abstraction than descriptions using functions alone, since the direction of data flow is not specified. However, real circuits offer particular patterns of data flow, and this means that the interpretation of a relation may in general be 0, 1 or many different circuits. In the case of zero circuits, we say the relation is unimplementable. The widest class of relations which are generally implementable is believed to be the causal relations, as defined by Hutton [1]. These generalise functional relations in the sense that inputs are not restricted to the domain nor outputs to the range.

Figure 7 (caption, in part): Arrows in the figure indicate the input/output partitioning determined by the causality analysis.
In T-Ruby, causality analysis is performed at the end of the rewriting process, when the user has extracted the part of the relation which is to be implemented. In most cases, in fact, the context from which the relation is extracted is non-implementable: for example, it may specify timing requirements which (if they could be implemented) would correspond to foreseeing the future.
A. Causality analysis
More exactly, a relation is causal if the elements in each tuple of values in the relation can be partitioned into two classes, such that the first class (the outputs) are functionally determined by the second class (the inputs), and such that the same partitioning and functional dependency are used for all tuples in the relation. For example, the previously defined relation + is causal, in the sense that the three elements ((x, y), z) of each tuple of values in the relation can be partitioned as described, in fact in three different ways:
1. With x and y as inputs and z as output, so that the relation describes an adder.
2. With x and z as inputs and y as output, so that the relation describes a subtractor.
3. With y and z as inputs and x as output, so that the relation describes another subtractor.

Note that the relation $+^{-1}$ is also causal, although it is not functional. Essentially, causality means that the relation can be viewed under the partitioning as a deterministic function of its inputs.

In T-Ruby, the relation to be analysed is first expanded, using the definitions of its component relations, to a form where it is expressed entirely in terms of the four elements of Pure Ruby and relational inverse. The expanded relation is then analysed with a simple bottom-up analysis heuristic. For combinational elements described by spread relations, causality is determined by analysing the body of the spread, which must have the form of a body part which is:
• an equality with a single variable on the left-hand side,
• a conjunction of body parts, or
• a conditional choice between two body parts.
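Returning to the relation + and its three causal partitionings, the definition of causality can be tried out on a finite fragment of the relation. The sketch below is an illustration only: it enumerates tuples over a small integer range, flattens ((x, y), z) to (x, y, z), and the helper name `is_function_of` is invented for the example.

```python
from itertools import product

def is_function_of(tuples, inputs, outputs):
    """True if, across all tuples, the output positions are determined by the input positions."""
    seen = {}
    for t in tuples:
        key = tuple(t[i] for i in inputs)
        val = tuple(t[i] for i in outputs)
        if seen.setdefault(key, val) != val:
            return False
    return True

N = range(-3, 4)
# A finite fragment of '+', with each tuple ((x, y), z) flattened to (x, y, z).
plus = [(x, y, x + y) for x, y in product(N, N)]

adder = is_function_of(plus, (0, 1), (2,))          # x, y in; z out
subtractor_y = is_function_of(plus, (0, 2), (1,))   # x, z in; y out
subtractor_x = is_function_of(plus, (1, 2), (0,))   # y, z in; x out
z_alone = is_function_of(plus, (2,), (0, 1))        # z in; x, y out
```

The first three partitionings succeed, while `z_alone` fails: z on its own does not determine x and y. This is why the inverse of + is not functional from domain to range, although it remains causal under the other partitionings.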
In each equality, the result of the analysis depends on the form of the right-hand side. If this is a single variable, no conclusions are drawn, as the equality then just implies a wire in the abstract floorplan. If the right-hand side is an expression, all values in it are taken to be inputs, and the left-hand side is taken to be an output. In choices, all values in the condition are taken to be inputs. If these rules result in conflicts, no causal partitioning can be found. When there are several possible causal partitionings, as in the case of +, on the other hand, the rules enable us to choose a unique one.
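The rules for spread bodies described above can be sketched as a small recursive analysis. The body encoding (tuples tagged "eq", "and" and "if"), the function names and the simplified notion of conflict (a variable classified both as input and as output) are assumptions of this example; the actual T-Ruby analysis may detect further kinds of conflict.

```python
def vars_of(e):
    """Variables (strings) occurring in an expression such as ("+", "x", 1)."""
    if isinstance(e, str):
        return {e}
    if isinstance(e, tuple):
        return set().union(*(vars_of(x) for x in e[1:]))
    return set()                      # constants contain no variables

def analyse(body):
    """Return (inputs, outputs) for a spread body, or None if the rules conflict."""
    ins, outs = set(), set()

    def walk(p):
        kind = p[0]
        if kind == "eq":              # ("eq", variable, right-hand side)
            _, lhs, rhs = p
            if isinstance(rhs, str):  # variable = variable: a wire, no conclusion
                return
            outs.add(lhs)
            ins.update(vars_of(rhs))
        elif kind == "and":           # ("and", part, part)
            walk(p[1])
            walk(p[2])
        elif kind == "if":            # ("if", condition, part, part)
            ins.update(vars_of(p[1]))
            walk(p[2])
            walk(p[3])
        else:
            raise ValueError(f"unknown body part: {kind}")

    walk(body)
    return None if ins & outs else (ins, outs)

adder = analyse(("eq", "z", ("+", "x", "y")))
wire = analyse(("eq", "a", "b"))
conflict = analyse(("and", ("eq", "z", ("+", "x", "y")),
                           ("eq", "x", ("-", "z", "y"))))
```

For the body z = x + y the analysis selects x and y as inputs and z as output, which is the unique choice among the three possible partitionings of + that the text refers to.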
For delay elements, D, values in the domain are inputs and those in the range are outputs. Parallel composition preserves causality, and so in fact does inversion, but serial compositions in general require further analysis, to determine whether the input/output partitionings for the component relations are compatible with an implementable (unidirectional) data flow between the components. Essentially, checks are made as to whether two or more outputs are used to assign a new signal value to the same wire, whether some wires are not assigned signal values at all or whether there are loops containing purely combinational components. This additional analysis is exploited in order to determine the network of the circuit in the form of a netlist with named wires between active components. At present there is no backtracking, so if the arbitrary choice of partitioning when there are several possibilities is the "wrong" one, then it will not be possible to find a complete causal partitioning for the entire circuit.
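The three checks mentioned above for serial compositions (several outputs driving one wire, wires that are never assigned, and purely combinational loops) can be sketched on a simple netlist representation. The component format `(name, inputs, outputs, is_combinational)`, the wire names and the function name are assumptions made for this illustration and do not reflect T-Ruby's internal data structures.

```python
def check_netlist(components, external_inputs=()):
    """Return (multiply driven wires, undriven wires, has combinational loop)."""
    drivers = {}
    for name, ins, outs, comb in components:
        for w in outs:
            drivers.setdefault(w, []).append(name)
    multiply_driven = sorted(w for w, d in drivers.items() if len(d) > 1)

    used = {w for _, ins, _, _ in components for w in ins}
    undriven = sorted(used - set(drivers) - set(external_inputs))

    # Graph restricted to combinational components: a -> b if a drives an input of b.
    comb = {name: (ins, outs) for name, ins, outs, c in components if c}
    succ = {a: [b for b in comb if set(comb[a][1]) & set(comb[b][0])] for a in comb}
    state = {}

    def cyclic(n):
        if state.get(n) == "done":
            return False
        if state.get(n) == "open":    # reached a node already on the current path
            return True
        state[n] = "open"
        found = any(cyclic(m) for m in succ[n])
        state[n] = "done"
        return found

    loop = any(cyclic(n) for n in comb)
    return multiply_driven, undriven, loop

# A multiply-and-add cell whose result is fed back through a register.
good = [("mul", ["a"], ["m"], True),
        ("add", ["m", "s"], ["z"], True),
        ("reg", ["z"], ["s"], False)]
# The same cell with the register replaced by a combinational component.
bad = [("mul", ["a"], ["m"], True),
       ("add", ["m", "s"], ["z"], True),
       ("buf", ["z"], ["s"], True)]

good_result = check_netlist(good, external_inputs=["a"])
bad_result = check_netlist(bad, external_inputs=["a"])
```

The feedback path in `good` passes through a register and is therefore accepted, whereas the same path in `bad` forms a purely combinational loop.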
As an example, let us consider the analysis of parts of the relation for 2-dimensional convolution. The central element in this is the relation given by acc(p + k), which describes the combinational multiply-and-add circuit for kernel element (p, k). Using the definition of acc, and substituting (p + k) for W, this reduces to:
The body of the spread has the form of an equality with a single variable on the left-hand side, and thus the causal partitioning will make z an output and m and s both inputs. In this case, the relation is functional from domain to range, but in general this need not be so.
Further analysis proceeds in a similar manner, leading to the final data flow pattern shown by arrows in Figure 7.
B. Translation to VHDL
Since causality analysis gives both the network of the circuit and the direction of data flow along the individual wires between components, the actual translation to VHDL is comparatively simple. Each translated "top level" Ruby relation is declared as a single design unit, incorporating a single entity with a name specified by the user. In rough terms, each combinational relation C which is not a wiring relation within the expanded Ruby relation is translated into one or more possibly conditional signal assignments, where the outputs of C are assigned new values based on the inputs. For example, the relation acc(p + k) considered above gives rise to a single concurrent signal assignment of the form:
where sigz, sigs and sigm are the names of the VHDL signals corresponding to z, s and m respectively, and W is a constant equal to the value of (p + k) for the circuit element in question. Since the operators available for use with operands of integer, Boolean, bit and character types in Ruby are (with one simple exception: logical implication) a subset of those available in VHDL, this direct style of translation is problem-free. In a similar manner, any conditional (if-then-else) expressions in the body of a spread are directly translated into conditional assignment statements, possibly with extra signal assignments to evaluate a single signal giving the condition.
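The direct style of translation can be sketched as a small expression printer. This is an illustration under stated assumptions: the body of the multiply-and-add relation is taken to be z = W·m + s, which is inferred from the description rather than quoted from the paper, the expression encoding and function names are invented, and the text actually generated by the T-Ruby translator may differ in layout and parenthesisation. Only the VHDL signal assignment operator `<=` is standard.

```python
def to_vhdl(e, sig):
    """Render an expression tree as VHDL text, mapping Ruby variables to signal names."""
    if isinstance(e, str):
        return sig[e]
    if isinstance(e, int):
        return str(e)
    op, left, right = e               # binary operators only, in this sketch
    return f"({to_vhdl(left, sig)} {op} {to_vhdl(right, sig)})"

def assignment(out, rhs, sig):
    """Render one concurrent signal assignment for an output variable."""
    return f"{sig[out]} <= {to_vhdl(rhs, sig)};"

signals = {"z": "sigz", "s": "sigs", "m": "sigm"}
W = 3                                  # a hypothetical value of (p + k)
line = assignment("z", ("+", ("*", W, "m"), "s"), signals)
```

The causal partitioning determines which variable appears on the left of `<=`; the inputs appear only on the right.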
The VHDL types for the signals involved are derived from the Ruby types used in the domain and range of C in an obvious way. Thus for the elementary types, the Ruby type bit is translated to the VHDL type rubybit, bool to rubybool, int to rubyint, and char to rubychar, where the VHDL definitions of rubybit, rubybool, rubyint and rubychar are predefined in a package RUBYDEF, which is referred to by all generated VHDL units. Composed types give rise to groups of signals, generated by (possibly recursive) flattening of the Ruby type, such that a pair is flattened into its two components, a list into its n components and so on until elementary types are reached.
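The flattening of composed types can be sketched as a short recursive function. The type names rubybit, rubybool, rubyint and rubychar are those given in the text; the tuple encoding of pair and list types and the function name `flatten` are assumptions of the example.

```python
# Mapping of elementary Ruby types to the VHDL types named in the text.
ELEMENTARY = {"bit": "rubybit", "bool": "rubybool",
              "int": "rubyint", "char": "rubychar"}

def flatten(ty):
    """Flatten a Ruby type into the list of VHDL signal types it gives rise to.

    Types are encoded as an elementary name, ("pair", t1, t2) or ("list", n, t).
    """
    if isinstance(ty, str):
        return [ELEMENTARY[ty]]
    if ty[0] == "pair":
        return flatten(ty[1]) + flatten(ty[2])
    if ty[0] == "list":
        _, n, elem = ty
        return flatten(elem) * n
    raise ValueError(f"unknown type constructor: {ty[0]}")

signals = flatten(("pair", ("list", 2, "int"), "bit"))
```

A pair of a 2-list of integers and a bit thus becomes three signals: two of type rubyint and one of type rubybit.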
If the Ruby relation refers to elementary types other than these pre-defined ones, a package declaration containing suitable type definitions is generated by the translator. For example, if an enumerated type etyp is used, a definition of a VHDL enumerated type rubyetyp with the same named elements is generated.
Free variables of relational type and all non-combinational relations in the Ruby relation are translated into instantiations of one or more VHDL components. Standard definitions of these components for all standard simple Ruby types are available in a library. Other components (in particular those generated from free relational variables) are assumed to be defined by the user.
The final result of translating the fully instantiated 2-dimensional convolution relation into VHDL is shown in Figure 8. The figure does not show the entire VHDL code (which of course is very repetitive owing to the regular nature of the circuit), but illustrates the style. Signal identifiers starting with input and output correspond to the external inputs and outputs mentioned in the formal port clause of the entity, while names starting with wire identify internal signals. A clock input is generated if any of the underlying entities are sequential. The assignments marked Calculations describe the combinational components, and those marked Registers describe the component instantiations corresponding to the Delay elements. (Instantiations of any other user-defined components follow in a separate section if required.)
The correctness of the translation relies heavily on two facts:
1. There is a simple mapping between Ruby types and operators and types and operators which are available in VHDL.
2. Relations are only considered translatable if an (internally consistent) causal partitioning can be found.

The complete system is illustrated in Figure 9. A similar style of analysis to that used for generating VHDL code is used for controlling simulation of the behaviour of the extracted relation. The user must supply a stream of values for the inputs of the circuit and, if required, initial values for the latches, and the simulation then uses exactly the same assignments of new values to signals as appear in the VHDL description. Obviously, only fully instantiated causal relations can be simulated.
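The simulation scheme described above can be sketched for a very small sequential circuit. The example assumes an accumulating adder in the spirit of the adding machine mentioned earlier: a combinational assignment z = a + s and a latch that stores z for use as s in the next time step. The exact Ruby definition of that circuit is not reproduced here, so the circuit, the function name and the parameter names are assumptions of the example.

```python
def simulate(inputs, init_s=0):
    """Simulate an accumulating adder over a stream of input values.

    inputs : the stream of values supplied by the user for input a
    init_s : the initial value of the latch
    """
    s = init_s
    outputs = []
    for a in inputs:
        z = a + s          # combinational assignment, as in the generated VHDL
        outputs.append(z)
        s = z              # the latch takes its new value at the end of the time step
    return outputs

running_sums = simulate([1, 2, 3, 4])
```

The user-supplied input stream and initial latch value fully determine the output stream, which is why only fully instantiated causal relations can be simulated in this way.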
[Figure 9: the complete T-Ruby system. Surviving labels: Ruby expressions; parser/printer; type checker.]
The authors would like to thank Ole Sandum for his work on the design of the Ruby to VHDL translator. 
VI. CONCLUSION
In this paper, we have presented the T-Ruby Design System and outlined a general design method for VLSI circuits based on transformation of formal specifications using equality rewriting, constraints and extraction. The simple mathematical basis of the specification language in terms of functions and relations enables us to prove general transformation rules, and minimises the step from the mathematical description of the problem to the initial specification in our system.
The use of the system has been illustrated by the non-trivial example of a circuit for 2-dimensional convolution. This example shows how T-Ruby can be used to describe complex repetitive structures which are useful in VLSI design, and demonstrates how the system can be used to derive descriptions of highly generic circuits, from which concrete circuit descriptions can be obtained by instantiation of free parameters. Circuits described by so-called causal relations can be implemented and their behaviour simulated. In the T-Ruby system, a simple mapping from T-Ruby to VHDL for such relations is used to produce a VHDL description for final synthesis.
The design system basically relies on the existence of a large database of pre-proved transformation rules. However, during the design process, conjectured rules can be introduced at any time, and rewrite rules with preconditions may be used. The system keeps track of the relevant proof burdens and these can be transferred later to a separate theorem prover. Our belief is that this "divide and conquer" philosophy helps to make the use of formal methods more feasible for practical designs.
VII. ACKNOWLEDGEMENTS
The work described in this paper has been partially supported by the Danish Technical Research Council.
