An efficient representation for formal synthesis by Blumenroehr, Christian & Eisenbiegler, Dirk
Copyright 1997 IEEE. Published in the Proceedings of
ISSS’97, September 17-19, 1997 in Antwerp, Belgium.
An Efficient Representation for Formal Synthesis  
Christian Blumenröhr, Dirk Eisenbiegler
Institute for Circuit Design and Fault Tolerance
(Prof. Dr.-Ing. D. Schmid)
University of Karlsruhe, Germany
{blumen,eisen}@ira.uka.de
Abstract
In recent years, performing synthesis by logical refinement
has become an interesting alternative to post-synthesis
verification. This paper gives a case study of the
complexity of formal synthesis programs within a given
calculus. For a simple synthesis step, it discusses how
circuit transformations can be implemented efficiently with
respect to the complexity of the logical transformations of
the underlying calculus.
1. Introduction
Nowadays, more and more complex and sophisticated
synthesis tools are involved during the design of digital cir-
cuits. However, even a fully automated synthesis process
does not necessarily mean “correctness by construction”.
In general, bugs in synthesis tools lead to faulty implemen-
tations, and due to their complexity, it is almost impossi-
ble to formally verify synthesis tools. Therefore, formal
methods have to be applied to ensure that the implementa-
tion satisfies the specification. A possible approach is post-
synthesis verification [Gupt92]. However, fully automated
post-synthesis verification is only achievable for compara-
tively small sized circuits at lower levels of abstraction, and
theorem proving that requires interactive steps is not very
much accepted by circuit designers.
Formal synthesis is a complementary approach to hardware
verification, since formal verification is an integral part
of the synthesis process. A survey of and a classification
scheme for formal synthesis approaches can be found in
[KBES96]. In recent years, formal synthesis has become a new
research topic and several systems have been introduced,
such as T-Ruby [ShRa95], Lambda/Dialog [MaFo91], Veritas
[HaLD89], DDD [JoBo91] and HASH [EiKB97]. [Busc92] and
[BaFr96] present reimplementations of the Veritas formal
synthesis approach for different theorem provers (LAMBDA and
ISABELLE, respectively). Other research activities have been
started by [GrMT94], [Wang92] and [Lars95]. They all have
one thing in common: they are based on some calculus, i.e.
some small core of basic logical transformations. The
efficiency of formal synthesis approaches depends on how
efficiently hardware can be represented and on how fast
circuit transformations can be realized based on the given
set of logical transformations.

This work has been partly financed by the Deutsche
Forschungsgemeinschaft, Project SCHM 623/6-1.
This paper investigates efficiency aspects of implement-
ing formal synthesis transformations in the theorem prov-
ing environment HOL [GoMe93] illustrated by a basic cir-
cuit transformation step. In general, there are various ways
to perform such transformations. However, the complexity
very much depends on which basic logical transformations
are used and on the order in which the logical transforma-
tions are applied. Therefore, when implementing such syn-
thesis transformations, one has to consider the time com-
plexity of the basic logical transformations involved. We
will introduce a simple algorithm and discuss how its
performance can be improved.
In order to optimize the performance of formal synthesis,
it is also possible to modify the core of the theorem prover
such that the considered circuit transformations can be im-
plemented more efficiently. However, one has to be careful,
since modifying the core of a theorem prover may violate
the consistency of the calculus. We will introduce a
modification of the HOL theorem prover and discuss its
impact on safety and efficiency.
At the end of the paper, we will discuss the extra costs of
formal synthesis as compared to conventional synthesis. In
our approach HASH (Higher order logic Applied to Synthesis
of Hardware), which exploits standard synthesis algorithms,
circuit transformations are strictly separated from the
design space exploration parts (calculating the scheduling
table, determining the state encoding, etc.). The design
space exploration parts are the same as in conventional
synthesis programs [TLWN90, GDWL94]. The extra costs of
formal synthesis arise from the circuit transformations,
which have to be performed by rule applications within a
theorem prover.
2. Scheduling transformation
In this paper, we concentrate on the transformation in
HASH for performing the scheduling task within high-level
synthesis. High-level synthesis converts an algorithmic
description of the circuit into a structure at the Register-
Transfer (RT) level. The major steps in high-level synthesis
are scheduling, allocation of storage, functional and inter-
connection units, binding the allocated hardware onto some
library components and interface synthesis.
The scheduling task assigns a control step (c-step) to
each operation in the algorithmic specification. There exist
various heuristic scheduling algorithms which try to mini-
mize the number of control steps or the hardware require-
ments [CaWo91, GDWL94]. A large number of them start
from data flow graphs that correspond to the basic blocks
in the algorithmic description. Although certain scheduling
algorithms start from control/data flow graphs, we shall re-
strict ourselves to pure data flow graphs in this paper. The
handling of control flow is the topic of our current research.

Figure 1. Example for a scheduling step
2.1. Formalizing data flow graphs
In order to transform the data flow graphs, they first have
to be formalized suitably. In our approach, data flow graphs
are represented by means of functions that are nothing but
simple compositions of basic operations. They are formalized
using $\lambda$-expressions [Davi89].
The following term shows how the data flow graph in
figure 1 is represented in HOL:
  λ(a, b, c).
    let p = a op1 b in
    let s = b op2 c in
    let q = s op2 c in
    let r = p op1 q in
    let t = p op2 s in
    let x = r op3 t in
    let y = r op1 t in
    (x, y)
The above expression describes an input/output function in
terms of its basic operations. The function maps some input
triple $(a, b, c)$ to an output pair $(x, y)$. Each let-term
describes the connectivity of one operation. let-terms stand
for $\beta$-redices, where let $x = y$ in $z$ means
$(\lambda x.\ z)\ y$. Using let-expressions improves
readability.
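The correspondence between let-terms and redices can be mimicked in any language with first-class functions. The following Python sketch is only an illustration: the operators `+`, `*` and `-` are assumptions chosen for the example (they stand in for the basic operations of the data flow graph), and the names `g_let` and `g_redex` are hypothetical.

```python
# Illustrative only: +, * and - are assumed stand-ins for the basic operations.

def g_let(a, b, c):
    """Data flow graph written with local bindings (the 'let' style)."""
    p = a + b
    s = b * c
    q = s * c
    r = p + q
    t = p * s
    x = r - t
    y = r + t
    return (x, y)


def g_redex(a, b, c):
    """Same graph, with every 'let v = e in body' written as (lambda v: body)(e)."""
    return (lambda p:
            (lambda s:
             (lambda q:
              (lambda r:
               (lambda t:
                (lambda x:
                 (lambda y: (x, y))(r + t))(r - t))(p * s))(p + q))(s * c))(b * c))(a + b)
```

Both functions compute the same pair for any input triple; the second form only makes the redex structure behind each let explicit, at the price of readability.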
2.2. Transforming the data flow graphs within HOL
During scheduling, the function $g$ is split into a
concatenation of functions $g_1, g_2, \ldots, g_k$ with
$g = g_k \circ \ldots \circ g_2 \circ g_1$, and each function
again represents a data flow graph (see figure 1).
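To make the splitting concrete, the following Python sketch writes one function per control step and composes them. It is a hedged illustration: the operators `+`, `*`, `-`, the particular four-step schedule and all function names are assumptions, not taken from HASH.

```python
from functools import reduce

# Assumed operators (+, *, -) stand in for the basic operations of the example.

def g(v):
    """Unscheduled data flow graph: maps an input triple to an output pair."""
    a, b, c = v
    p = a + b
    s = b * c
    q = s * c
    r = p + q
    t = p * s
    return (r - t, r + t)

# One function per control step; each maps a tuple to a tuple.
def g1(v):
    a, b, c = v
    return (a + b, b * c, c)      # (p, s, c)

def g2(v):
    p, s, c = v
    return (p, s * c, p * s)      # (p, q, t)

def g3(v):
    p, q, t = v
    return (p + q, t)             # (r, t)

def g4(v):
    r, t = v
    return (r - t, r + t)         # (x, y)

def compose(*stages):
    """compose(g1, ..., gk) applies g1 first, i.e. it builds gk o ... o g1."""
    return lambda v: reduce(lambda acc, f: f(acc), stages, v)

g_scheduled = compose(g1, g2, g3, g4)
```

Comparing `g` and `g_scheduled` on sample inputs only tests the equality; the point of formal synthesis is that the equation is derived as a theorem instead.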
This section describes how the scheduling process shown in
figure 1 is implemented as a conversion in HOL.
Our high level synthesis conversion is steered by external
control information (the schedule table). The implemen-
tation of other high level synthesis conversions (allocation
and binding of storage and (multi-purpose) functional units)
is described in [EiBK96]. In this section we will only de-
scribe the logical aspects of formally deriving the synthe-
sis result from the input data flow graph. The computation
of the control information and invocation of the external
heuristics will be discussed in section 4.
The approach is based on a conversion for normalizing
functions. We will first describe this conversion and then
describe how scheduling can be realized based on it.
2.3. Function normalization
The HOL representations corresponding to figure 1 are
both nothing but simple compositions of the same basic
functions. In principle, normalizing such representations is
quite simple. The general algorithm looks as follows:
1. the original term $g$ is converted to
$\lambda (x_1, x_2, \ldots, x_m).\ g\ (x_1, x_2, \ldots, x_m)$
by applying a paired $\eta$-reduction in the inverse direction
2. the $\circ$ operations are expanded by rewriting, provided
there are any
3. $\beta$-reductions and paired $\beta$-reductions are
performed wherever possible
Given some function $g$ and the scheduled function
$g' = g_k \circ \ldots \circ g_2 \circ g_1$, this algorithm
leads to the same normalized representation, which looks like
$\lambda (x_1, x_2, \ldots, x_m).\ v[x_1, x_2, \ldots, x_m]$.
In $v[x_1, x_2, \ldots, x_m]$ there are no $\beta$-redices left
and there is nothing but pure function applications.
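The normalization idea can be sketched on a toy term representation. The Python code below is an assumption-laden illustration, not the HOL conversion: terms are nested tuples, `let` and `plet` are hypothetical node names for unpaired and paired redices, the operators are assumed, and expansion is done with an environment instead of HOL's primitive rules.

```python
# Terms: a str is a variable; ("op", name, l, r) applies a basic operation;
# ("tuple", t1, ..., tn) builds a tuple;
# ("let", x, e, body) stands for the redex (\x. body) e;
# ("plet", (x1, ..., xn), e, body) stands for the paired redex (\(x1,...,xn). body) e.

def normalize(term, env=None):
    """Expand all let/plet redices, leaving only variables, operations and tuples."""
    env = env or {}
    if isinstance(term, str):
        return env.get(term, term)
    tag = term[0]
    if tag == "op":
        return ("op", term[1], normalize(term[2], env), normalize(term[3], env))
    if tag == "tuple":
        return ("tuple",) + tuple(normalize(t, env) for t in term[1:])
    if tag == "let":
        _, x, e, body = term
        return normalize(body, {**env, x: normalize(e, env)})
    if tag == "plet":
        _, xs, e, body = term
        value = normalize(e, env)
        assert value[0] == "tuple" and len(value) - 1 == len(xs)
        return normalize(body, {**env, **dict(zip(xs, value[1:]))})
    raise ValueError(f"unknown term: {term!r}")

def op(name, l, r):
    return ("op", name, l, r)

def lets(bindings, body):
    """Nest a list of (variable, term) bindings around body."""
    for x, e in reversed(bindings):
        body = ("let", x, e, body)
    return body

# Unscheduled graph g applied to (a, b, c); operators are assumed.
g = lets([("p", op("+", "a", "b")), ("s", op("*", "b", "c")),
          ("q", op("*", "s", "c")), ("r", op("+", "p", "q")),
          ("t", op("*", "p", "s")), ("x", op("-", "r", "t")),
          ("y", op("+", "r", "t"))],
         ("tuple", "x", "y"))

# Scheduled graph g4 (g3 (g2 (g1 (a, b, c)))) with the compositions already expanded.
stage1 = ("tuple", op("+", "a", "b"), op("*", "b", "c"), "c")
stage2 = ("tuple", "p", op("*", "s", "c"), op("*", "p", "s"))
stage3 = ("tuple", op("+", "p", "q"), "t")
stage4 = ("tuple", op("-", "r", "t"), op("+", "r", "t"))
g_scheduled = ("plet", ("r", "t"),
               ("plet", ("p", "q", "t"),
                ("plet", ("p", "s", "c"), stage1, stage2),
                stage3),
               stage4)
```

Under these assumptions `normalize(g)` and `normalize(g_scheduled)` yield the same redex-free term, which mirrors the claim that both representations share one normal form.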
The most significant step performed in this normalization
is (paired) $\beta$-reduction. $\beta$-reduction means turning
some expression $(\lambda x.\ P[x])\ a$ into $P[a/x]$, where
$x$ is substituted by $a$ in $P$. Paired $\beta$-reduction
means turning some expression
$(\lambda (x_1, \ldots, x_n).\ P[x_1, \ldots, x_n])\ (a_1, \ldots, a_n)$
into $P[a_1/x_1, \ldots, a_n/x_n]$.
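As a small worked instance of both rules, consider the following two reductions. The terms and the operators in them are made up for illustration; the block assumes the `amsmath` package for `align*`.

```latex
\begin{align*}
(\lambda x.\; x + x)\,(a \cdot b) &\;\to_\beta\; a \cdot b + a \cdot b \\
(\lambda (x_1, x_2).\; x_1 - x_2)\,(a + b,\; c) &\;\to_\beta\; (a + b) - c
\end{align*}
```

The first line also shows that an argument is copied once for every occurrence of the bound variable, which is why the size of a term can grow under reduction.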
2.4. A universal conversion
We will now introduce a simple conversion which is
based on this normalization scheme. Given some data flow
graph representation $g$ and some schedule table, the
following steps have to be performed:
1. produce $g' = g_k \circ \ldots \circ g_2 \circ g_1$
according to the schedule table
2. derive $\vdash g = \hat{g}$ and $\vdash g' = \hat{g}$ via
normalization, where $\hat{g}$ is the common normalized
representation
3. the equations $\vdash g = \hat{g}$ and $\vdash g' = \hat{g}$
are combined to $\vdash g = g'$ (symmetry and transitivity of
equivalence).
The major drawback of this universal conversion is the
complexity of step 2 when dealing with data flow graphs of
large depth, i.e. with a large maximum number of operations
on a path from some input to some output. Data flow graphs
whose intermediate nodes have large fanouts, i.e. where the
output of a node is used by many successor nodes as input,
lead to a number of duplications during $\beta$-reduction.
Since $\beta$-reduction has to traverse the entire term and
since such $\beta$-redices can be nested, the term size and
the time consumption in step 2 may grow exponentially with
the depth.
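The blow-up is easy to reproduce on a hypothetical worst case: a chain in which every intermediate result is used twice by its successor. The Python sketch below (assumed operator `+`, hypothetical function names) builds the fully expanded term for such a chain and counts its nodes.

```python
def expanded(depth):
    """Fully expanded form of: let x1 = x0 + x0 in ... let xn = x(n-1) + x(n-1) in xn."""
    term = "x0"
    for _ in range(depth):
        term = ("+", term, term)   # both operands are copies of the previous result
    return term


def size(term):
    """Number of nodes when the term is read as a tree (sharing is not exploited)."""
    if isinstance(term, str):
        return 1
    return 1 + size(term[1]) + size(term[2])
```

The let-chain itself has only one binding per level, so its size is linear in the depth, whereas the expanded tree for depth n has 2^(n+1) - 1 nodes.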
2.5. An advanced conversion
The universal conversion does not exploit any knowledge
about how the synthesis step was performed. However, one
can think of an advanced conversion, where synthesis is per-
formed by a sequence of conversions which are optimized
for the scheduling step.
In principle, this conversion is similar to the universal
conversion except that step 2 is tuned towards scheduling.
The idea of our scheduling conversion is to split the data
flow graph step by step rather than doing it all at once, as
in the universal synthesis conversion. $\beta$-reduction is
only applied to those variables whose corresponding nodes
have been assigned to the current control step. Although
some $\beta$-redices will remain, the terms obtained after
normalization will be equal.
Other than in the universal synthesis conversion, k  
conversions (k – number of control steps) have to be ap-
plied successively rather than applying one single conver-
sion. Hence, the exponential complexity associated with
step 2 is avoided and the overall cost is reduced.
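Splitting the graph one control step at a time requires knowing which values cross each control step boundary. The following Python sketch computes these interfaces from a schedule table. It only illustrates the bookkeeping under stated assumptions (one control step per operation, hypothetical function and parameter names) and is not the HOL conversion used in HASH.

```python
def stage_interfaces(inputs, outputs, deps, schedule):
    """Return, for each control step i, the variables that are passed on after step i.

    deps maps each operation result to the variables it reads;
    schedule maps each operation result to its control step (counted from 1).
    """
    last = max(schedule.values())
    interfaces = []
    for step in range(1, last + 1):
        # Values that exist once this step has finished.
        available = set(inputs) | {v for v, s in schedule.items() if s <= step}
        # Values that are read by later steps or belong to the final result.
        needed_later = {u for v, s in schedule.items() if s > step for u in deps[v]}
        needed_later |= set(outputs)
        interfaces.append(sorted(available & needed_later))
    return interfaces
```

For a graph with the dependencies of figure 1 and a four-step schedule, the result lists the tuple components handed from one stage function to the next; only the variables computed in the current step have to be expanded when that stage is split off.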
To demonstrate the differences between the simple and
the advanced conversion, we will show some experimental
results in the following section.
2.6. Experimental results
We consider two scalable data flow graphs. As a first
example, we use a data flow graph which realizes the
division of two polynomials (polynomial division, PD).
The coefficients i and 	i should be computed. To facil-
itate the calculation, we assume that the divisor is normal-
ized with respect to p. After a few algebraic transforma-
tions we get the following two formulas for computing the
coefficients:
i  ip 
minfipqgX
ki 
ipk  k i  	    q
	j  j 
minfjqgX
k
jk  k j  	    p 
The data flow graph consists of pq subtractors, p q
multipliers and q p adders, so there is a total of 
pq
p
nodes. The critical path has a length of q  
 nodes.
Another scalable data flow graph is realized in our sec-
ond example. It calculates the discrete cosine transform
(DCT), which is popularly used for image compression.
 u v  	
  otherwise
In the following, we assume that $N = M$. In most
applications, the parameters are set to $N = M = 8$. The number
of additions is 
N N N and there are 
N N  

multiplications. So there is a total of N N  
N  

nodes. The length of the critical path is 
N  . A more
detailed description of this data flow graph can be found in
[BlEK96].
In figure 2, the runtimes for the simple and advanced
conversion are shown for the polynomial division. We set p
to 25 and increased q. Due to the exponential memory con-
sumption of the simple conversion, the computer’s capacity
of 1.2 GB was exceeded at about 600 nodes.
Figure 2. Simple and advanced conversion
applied to PD (plot of time [s] over nodes)
Figure 3 shows the runtimes for the two above mentioned
data flow graphs. We applied different schedules that were
derived by different scheduling algorithms namely ALAP
and FD (force-directed [PaKn89]). For PD, p was again
set to 25, and for the DCT, we increased N from 2 to 8.
The structures of the data flow graphs PD and DCT differ
in the depth and the number of reused intermediate results.
As can be seen, the runtime depends on the structure of the
DFG but is almost independent of the schedule table.
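For readers unfamiliar with ALAP scheduling, the following Python sketch shows the textbook as-late-as-possible rule on a pure data flow graph. It is not the scheduling program used for the measurements; the function name, the dictionary format and the assumption that every operation takes exactly one control step are ours.

```python
def alap(deps, num_steps):
    """As-late-as-possible schedule: every operation gets the latest feasible step.

    deps maps each operation result to the variables it reads; variables that
    are not keys of deps are primary inputs. Steps are counted from 1.
    """
    successors = {node: [] for node in deps}
    for node, operands in deps.items():
        for operand in operands:
            if operand in successors:          # skip primary inputs
                successors[operand].append(node)

    schedule = {}

    def step_of(node):
        if node not in schedule:
            later = [step_of(s) for s in successors[node]]
            # Sinks go into the last step, all others directly before their users.
            schedule[node] = min(later) - 1 if later else num_steps
        return schedule[node]

    for node in deps:
        step_of(node)
    if schedule and min(schedule.values()) < 1:
        raise ValueError("num_steps is smaller than the critical path length")
    return schedule
```

The returned dictionary plays the role of the schedule table: it is plain control information and carries no logical content by itself.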
3. Changing the core of the theorem prover
In the following, we will present a modified HOL theo-
rem proving system, named HOL’, where we changed the
core in order to increase efficiency. There are two modifica-
tions: the term representation is changed and also the core
of basic logical transformations is enlarged.
Figure 3. Advanced conversion for different
data flow graphs and schedules (plot of time [s])
3.1. Changing the term representation
In the HOL theorem prover, terms are implemented in the
so-called de Bruijn style, which means that free and bound
variables are represented differently. Free variables are
stored with their name and type, whereas bound variables are
only represented by a number linking to the
$\lambda$-abstraction they refer to. The number tells how many
$\lambda$-abstractions apart the referred $\lambda$-abstraction
stands. Example: The variables $x$ and $y$ in the body of the
term $\lambda x.\ \lambda y.\ x\ y$ are internally represented
by 1 and 0, respectively. One advantage of the de Bruijn
representation is that checking the $\alpha$-equivalence of
two terms can be performed in constant time.
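The two representations can be compared on a toy term type. The Python sketch below converts a name-carrying term into de Bruijn form; the tuple encoding, the function name and the choice to count indices from 0 are assumptions, and types are omitted although HOL stores them.

```python
def to_de_bruijn(term, bound=()):
    """Convert a name-carrying term into de Bruijn form.

    Terms: a str is a variable, ("lam", x, body) an abstraction and
    ("app", f, a) an application. Bound variables become integers
    (0 = innermost enclosing abstraction); free variables keep their names.
    """
    if isinstance(term, str):
        # bound lists the enclosing binders, innermost first.
        return bound.index(term) if term in bound else term
    if term[0] == "lam":
        _, x, body = term
        return ("lam", to_de_bruijn(body, (x,) + bound))
    _, f, a = term
    return ("app", to_de_bruijn(f, bound), to_de_bruijn(a, bound))
```

Two name-carrying terms that differ only in the names of their bound variables are mapped to the same de Bruijn term, so the equivalence check reduces to a plain comparison of the converted terms.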
However, the de Bruijn term representation has a decisive
drawback. Whenever one traverses a term while constructing
or destructing it and reaches a bound variable, one has to
go back to find the corresponding $\lambda$-abstraction in
order to identify the variable. Especially if, as in our
approach, variables are bound over a large range, this leads
to quadratic complexity. Therefore, the de Bruijn term
representation is not well suited for our application.
However, this problem does not only arise in our formal
synthesis approach. Whenever large circuits have to be
represented, variables, which correspond to signals, are
bound over a large range, and this means that the de Bruijn
term representation becomes very inefficient.
Therefore, we implemented HOL with a so-called name-carrying
term representation. Here, the bound variables are also
stored with their name and type. This has the advantage that
terms which correspond to circuit descriptions can be
handled far more efficiently. On the other hand, some basic
operations (e.g. the $\alpha$-equivalence check) become
somewhat less efficient when using this representation
style. Furthermore, it also increases the memory consumption
for representing bound variables, since instead of just a
number, a string (for the name) and a type expression have
to be stored.
3.2. Introduction of more efficient functions
As mentioned in section 2.3, $\beta$-reduction is applied very
often during the normalization of the terms. HOL only allows
one single $\beta$-reduction at a time. However, one can
increase the efficiency of the scheduling transformation by
performing several $\beta$-reductions in a single term
traversal. By adding this conversion to the core, we improve
the efficiency of the theorem prover. This advanced
$\beta$-conversion is equipped with a filter for selecting the
bound variables that are to be expanded. This is important
for our advanced conversion, where we only expand variables
that occur within the considered control step (see section
2.5).
Based on the advanced $\beta$-conversion, paired
$\beta$-reduction can be performed within one traversal. For
the implementation of the advanced paired $\beta$-conversion,
we added a conversion which changes paired $\beta$-redices
into an unpaired representation. Example:
$(\lambda (x, y).\ t)\ (a, b)$ is turned into
$(\lambda x.\ \lambda y.\ t)\ a\ b$. Afterwards, all the
$\beta$-redices can be expanded in one traversal using the
already mentioned advanced $\beta$-conversion.
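A filtered expansion in one traversal can be sketched as follows. The Python code works on a toy tuple encoding of let-terms and is an illustration under assumptions only: the encoding, the function name `expand_lets` and the `select` predicate are hypothetical, and bound names are assumed to be pairwise distinct so that no renaming is required.

```python
def expand_lets(term, select, env=None):
    """Expand every 'let x = e in body' with select(x) true, in a single traversal.

    Terms: a str is a variable, ("op", name, l, r) an operation,
    ("tuple", t1, ..., tn) a tuple and ("let", x, e, body) a redex.
    """
    env = env or {}
    if isinstance(term, str):
        return env.get(term, term)
    tag = term[0]
    if tag == "op":
        left = expand_lets(term[2], select, env)
        right = expand_lets(term[3], select, env)
        return ("op", term[1], left, right)
    if tag == "tuple":
        return ("tuple",) + tuple(expand_lets(t, select, env) for t in term[1:])
    _, x, e, body = term                       # a let-term
    value = expand_lets(e, select, env)
    if select(x):
        # Selected variable: remember its value and drop the redex.
        return expand_lets(body, select, {**env, x: value})
    # Unselected variable: keep the redex, but expand inside it.
    return ("let", x, value, expand_lets(body, select, env))
```

Calling the function with a predicate that accepts only the variables of one control step leaves all other let-terms in place, which corresponds to the selective expansion described above.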
The additional conversions are not only tailored to our
application, but are of general interest for other users of
the HOL system. However, it is to be noted that enlarging the
core is safety critical. Therefore, one has to consider
carefully which functions should be added to the core.
Figure 4 shows runtimes for a scheduling of the PD
data flow graph scheduled with an ALAP technique. It
shows that both the simple and the advanced conversion
run faster under the modified HOL system (indicated by
HOL’). Again, due to nested $\beta$-redices, the simple
conversion ran into memory problems for larger data flow
graphs.
Figure 4. Simple and advanced conversion in
different theorem provers (plot of time [s])
4. The formal synthesis scenario
Figure 5 demonstrates the underlying idea of our for-
mal synthesis approach HASH illustrated by the scheduling
step. Given a data flow graph, some scheduling heuristic
is started. This heuristic step has nothing to do with logic.
The heuristic returns a scheduling table which maps each
operation in the data flow graph onto a control step. This
scheduling table is now used by the formal logical
transformation.

Figure 5. Invoking synthesis heuristics within
formal synthesis
The split between design space exploration (i.e. differ-
ent schedule tables for different heuristics) and the logical
transformation is the core idea in HASH. This core idea is
applicable to most of the synthesis steps, e.g. allocation/no.
of resources available, retiming/split in the combinational
logic, etc.
Two important points are met independently with this
strategy: quality and correctness of the implementation.
The quality only depends on the algorithm that calculates
the control information, whereas the correctness aspect is
guaranteed due to the transformation being based on the
HOL system.
Since the entire synthesis process is nothing but a HOL
conversion, correctness is guaranteed implicitly. Faulty im-
plementations cannot be achieved even if the control infor-
mation produced by the external program is flawed, such as
a schedule where the data dependencies are disregarded. In
such cases, the transformation cannot be performed within
the logic and an exception will be raised. In conventional
synthesis programs, such bugs could lead to faulty
implementations. Our formal synthesis program either leads
to a correct implementation or to no implementation at all
but an exception. In case of an exception, a message is
produced telling the user in which synthesis step the error
occurred.
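The observable behaviour can be mimicked with an explicit check. Note the difference: in HASH the exception arises because the logical transformation itself cannot be carried out, whereas the Python sketch below tests the data dependencies directly. The class and function names are hypothetical, and operations are assumed to need their operands from a strictly earlier control step.

```python
class ScheduleError(Exception):
    """Raised when a schedule table cannot be realized for the given data flow graph."""


def check_schedule(deps, schedule):
    """Raise ScheduleError if some operation is not scheduled strictly after its operands.

    deps maps each operation result to the variables it reads; variables that are
    not keys of schedule are treated as primary inputs (available from the start).
    Returns the number of control steps used.
    """
    for node, operands in deps.items():
        if node not in schedule:
            raise ScheduleError(f"operation {node!r} has no control step")
        for operand in operands:
            if operand in schedule and schedule[operand] >= schedule[node]:
                raise ScheduleError(
                    f"{node!r} (step {schedule[node]}) reads {operand!r} "
                    f"(step {schedule[operand]}) too early")
    return max(schedule.values())
```

A flawed table therefore never yields an implementation; the caller only sees the exception and its message.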
In our approach, circuit transformations guarantee that
the functional behavior is preserved. Besides functional
correctness, there may be further requirements for the im-
plementation such as timing and area constraints. However,
checking, whether such constraints are fulfilled, can easily
be done by non-formal methods. Therefore, it is not nec-
essary to formally represent and verify such constraints in
logic. In our scheduling step, for example, it is pretty easy
to count the number of control steps and to check, whether
it exceeds some given limit.
Figures 6 and 7 show the runtimes for both heuristics
and transformational part of the formal synthesis step for
the PD data flow graph and the DCT data flow graph, re-
spectively (notice the different scales!). For determining
the schedule table, we applied both the force-directed and
the ALAP program. The circuit transformation was per-
formed in HOL’ (the modified HOL system) by applying
the advanced conversion. As can be seen, the runtime for
sophisticated design space exploration techniques such as
force-directed scheduling can exceed the runtime for the
transformational part by far. Even for simple design space
exploration techniques, the ratio between design space ex-
ploration and circuit transformation seems reasonable.
Figure 6. Time consumption for heuristic and
transformation for PD (plot of time [s]; legend entries:
transformations for ALAP & FD, determine ALAP)
Figure 7. Time consumption for heuristic and
transformation for DCT (plot of time [s]; legend entries:
transformations for ALAP & FD, determine ALAP)
5. Conclusions
For a given simple synthesis step, we have illustrated the
efficiency problem that arises when the step is performed by
means of a logical transformation. It turns out that it is
very important to be aware of the complexity of the basic
logical transformations when constructing formal synthesis
programs. On the other hand, it can be worthwhile to modify
the basic logical transformations themselves. We have
demonstrated that formal refinement techniques for hardware
synthesis are only applicable in practice when these
efficiency aspects are considered. Efficient formal
synthesis implementations, however, are applicable even to
large circuits, and their extra costs, which are independent
of the design space exploration part, are reasonable.
References
[BaFr96] D. Basin and S. Friedrich. Modeling a hardware
synthesis methodology in Isabelle. In
[HOL96], pages 33–50.
[BlEK96] C. Blumenröhr, D. Eisenbiegler, and R. Kumar.
Applicability of formal synthesis illustrated via
scheduling. In Workshop on Logic and Archi-
tecture Synthesis, Grenoble, France, Decem-
ber 1996. Institut National Polytechnique de
Grenoble.
[Busc92] H. Busch. Transformational design in a the-
orem prover. In V. Stavridou, T. F. Melham,
and R. T. Boute, editors, Theorem Provers
in Circuit Design, volume A-10, pages 175–
196, Nijmegen, The Netherlands, June 1992.
IFIP TC10/WG10.2 International Conference,
North-Holland.
[CaWo91] R. Camposano and W. Wolf. High-Level VLSI
Synthesis. Kluwer, Boston, 1991.
[Davi89] R. E. Davis. Truth, Deduction and Computation:
Logic and Semantics for Computer Science.
Computer Science Press, New York, 1st
edition, 1989.
[EiBK96] D. Eisenbiegler, C. Blumenröhr, and R. Ku-
mar. Implementation issues about the embed-
ding of existing high level synthesis algorithms
in HOL. In [HOL96], pages 157–172.
[EiKB97] D. Eisenbiegler, R. Kumar, and C. Blumenröhr.
A constructive approach towards correctness of
synthesis - application within retiming.
In The European Design & Test
Conference, pages 427–432, Paris, France,
March 1997. IEEE Computer Society and
ACM/SIGDA, IEEE Computer Society Press.
[GDWL94] D. Gajski, N. Dutt, A. Wu, and S. Lin. High-
Level Synthesis, Introduction to Chip and Sys-
tem Design. Kluwer Academic Publishers,
1994.
[GoMe93] M.J.C. Gordon and T.F. Melham. Introduction
to HOL: A Theorem Proving Environment for
Higher Order Logic. Cambridge University
Press, 1993.
[GrMT94] W. Grass, M. Mutz, and W. Tiedemann. High
level synthesis based on formal methods. In
Proc. EUROMICRO, pages 83–91, Liverpool,
1994.
[Gupt92] A. Gupta. Formal hardware verification meth-
ods: A survey. Formal Methods in System De-
sign, 1(2/3):151–238, 1992.
[HaLD89] F.K. Hanna, M. Longley, and N. Daeche. For-
mal synthesis of digital systems. In Luc J. M.
Claesen, editor, Applied Formal Methods For
Correct VLSI Design, volume 2, pages 532–
548. IMEC-IFIP, Elsevier Science Publishers,
1989.
[HOL96] Joakim von Wright, Jim Grundy, and John Harrison,
editors. Theorem Proving in Higher Order Logics:
9th International Conference, TPHOLs’96,
number 1125 in Lecture Notes in Computer
Science, Turku, Finland, August 1996.
Springer-Verlag.
[JoBo91] S.D. Johnson and B. Bose. DDD - A system
for mechanized digital design derivation. In
Workshop on Formal Methods in VLSI Design,
Miami, Florida, January 1991. ACM/SIGDA.
[KBES96] R. Kumar, C. Blumenröhr, D. Eisenbiegler,
and D. Schmid. Formal synthesis in circuit
design - A classification and survey. In
M. Srivas and A. Camilleri, editors, Formal
Methods in Computer-Aided Design. First
International Conference, FMCAD’96, number 1166
in Lecture Notes in Computer Science, pages
294–309, Palo Alto, CA, USA, November 1996.
Springer-Verlag.
[Lars95] M. Larsson. An engineering approach to for-
mal digital system design. The Computer Jour-
nal, 38(2):101–110, 1995.
[MaFo91] E.M. Mayger and M.P. Fourman. Integration
of formal methods with system design. In
A. Halaas and P.B. Denyer, editors, Interna-
tional Conference on Very Large Scale Integra-
tion, pages 59–70, Edinburgh, Scotland, Au-
gust 1991. IFIP Transactions, North-Holland.
[PaKn89] P. G. Paulin and J. P. Knight. Force-directed
scheduling for the behavioral synthesis of
ASIC’s. IEEE Transactions on Computer
Aided Design, 8(6):661–679, June 1989.
[ShRa95] R. Sharp and O. Rasmussen. The T-Ruby de-
sign system. In CHDL ’95, pages 587–596,
1995.
[TLWN90] D.E. Thomas, E.D. Lagnese, R.A. Walker,
J.A. Nestor, J.V. Rajan, and R.L. Blackburn.
Algorithmic and Register-Transfer Level Syn-
thesis: The System Architect’s Workbench.
Kluwer Academic Publishers, 1990.
[Wang92] L. Wang. Deriving a correct computer. In
L. J. M. Claesen and M. J. C. Gordon, ed-
itors, Higher Order Logic Theorem Proving
and its Applications, volume A-20, pages 449–
458, Leuven, Belgium, September 1992. IFIP
TC10/WG10.2, North-Holland.
