Using Formal Tools to Study Complex Circuits Behaviour
Paul Amblard
TIMA-CMP, 46 av. Félix Viallet, 38031 GRENOBLE Cedex, France
Fabienne Lagnier
Vérimag, Centre Equation, 2 avenue de Vignate, 38610 GIERES, France
Michel Lévy
LSR-IMAG, B.P. 72, 38042 St MARTIN D'HERES Cedex, France
email: {Paul.Amblard, Fabienne.Lagnier, Michel.Levy}@imag.fr
Université Joseph Fourier, Grenoble, France.
Abstract
We use a formal tool to extract Finite State Machine (FSM) based representations (lists of states and transitions) of sequential circuits described by flip-flops and gates. These complete and optimized representations help the designer to understand the exact behaviour of the circuit. This deep understanding is a prerequisite for any verification or test process. An example is fully presented to illustrate our method: a simple pipelined processor that comes from our experience in computer architecture and digital design education ([2]).
1. Introduction
It is now widely accepted that a clean specification of a circuit must be designed in parallel with the design of the circuit itself. This hardware-brainware co-design is necessary to verify, simulate, prove, test and validate the circuit. Our observation is that it is often difficult to express all the specifications of a circuit even if we know how to design it.
Our proposal, computer assisted exploration, can help to obtain some properties. In particular it can reveal unexpected, but correct, behaviours. It also helps to understand complex behaviours, as will be the case in a pipelined example. An obvious result is that exploration can reveal some differences between circuit and specifications. The earlier in the design process this discovery is made, the better.
This work is at the logic level: we deal with flip-flops and gates, so the approach is typically for sequential circuits. At this level of abstraction, the formal model is a Finite State Machine. We obviously face the combinatorial explosion problem: in this approach, if a circuit has N flip-flops, the corresponding automaton has potentially $2^N$ states. As a consequence, we must try to keep N small. Our approach is well suited to small circuits, or mechanisms. Its goal is to analyse a part of a circuit, not a large real device.
Another key point of our approach must be pointed out: the underlying principles, and the tool we use, suggest manipulating the circuit in different formal frames (typically an FSM, or flip-flops + gates).
By suggesting that design and specification be expressed in different formalisms, and by proposing to transform one of them automatically into the other, we hope to offer a better analysis of hardware devices.
The next section explains the principles of our approach. The central section gives the details and analysis of an example extracted from our experience in digital circuit design and computer architecture education. The circuit is inspired by a pipelined processor. A further section compares our results with other approaches in similar situations, and with the other main validation techniques (model checking, theorem proving, simulation).
2. The principles
All the modern C.A.D. tools contain a FSM synthesis
package : given a list of states and transitions (called the
speciﬁcation), the tool computes a netlist of gates and ﬂip-
ﬂops (called the implementation).
The basics of our approach are very simple: given a description of an implementation of a Finite State Machine (FSM), the tool computes the expansion of the FSM into states and transitions. Several automata deliver the same output sequence for the same input sequence; one of them has a minimal number of states. The tool delivers this minimal equivalent FSM. The designer works with this minimal representation. Obviously this does not allow us to deal with big automata. We shall give some details on the techniques used to describe the source FSM. The techniques used to compute this expansion are presented in [5]; the present contribution is not about such a C.A.D. tool but about a new way to use it.
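The two steps just described (expansion into states and transitions, then reduction to a minimal equivalent machine) can be illustrated with a small sketch. This is not the algorithm of [5] nor the Lustre tool itself: it is a plain breadth-first expansion followed by a textbook partition refinement, written in Python. The toy circuit (`step`, `output`) and all function names are hypothetical and chosen only for illustration.

```python
from collections import deque

def expand(step, initial, inputs):
    """Breadth-first enumeration of the states reachable from `initial`."""
    states, trans, todo = {initial}, {}, deque([initial])
    while todo:
        s = todo.popleft()
        for i in inputs:
            t = step(s, i)
            trans[(s, i)] = t
            if t not in states:
                states.add(t)
                todo.append(t)
    return states, trans

def renumber(signature):
    """Replace each distinct signature by a small integer block number."""
    ids = {v: k for k, v in enumerate(sorted(set(signature.values())))}
    return {s: ids[signature[s]] for s in signature}

def minimize(states, trans, output, inputs):
    """Partition refinement: states end in the same block when no input
    sequence can distinguish them through the outputs."""
    block = renumber({s: tuple(output(s, i) for i in inputs) for s in states})
    while True:
        refined = renumber({
            s: (block[s],) + tuple(block[trans[(s, i)]] for i in inputs)
            for s in states})
        if len(set(refined.values())) == len(set(block.values())):
            return refined
        block = refined

# Hypothetical toy circuit: a 2-bit shift register (two D flip-flops)
# whose only observed output is the first flip-flop.
INPUTS = (False, True)

def step(state, x):
    q0, q1 = state
    return (x, q0)          # q0 <- x, q1 <- q0 at each clock edge

def output(state, x):
    return state[0]         # q1 is never observed

states, trans = expand(step, (False, False), INPUTS)
blocks = minimize(states, trans, output, INPUTS)
print(len(states), "reachable states,",
      len(set(blocks.values())), "after minimization")
```

In this toy case the second flip-flop never influences the output, so the four reachable states collapse into two. The sketch also makes the cost visible: the expansion visits every reachable state, which is why N must stay small.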
2.1. How do we describe ?
All the descriptions are given in the language LUSTRE ([7] and [8]). LUSTRE looks like Lola, the language used by Wirth in his book ([15]). Descriptions may be of different types:
- Circuits described as a set of nodes: the nodes contain logic gates and edge-triggered D-type flip-flops. The only data type is boolean.
- Generic circuits of size N, dealing with boolean vectors of size N. In the description, registers have N flip-flops. N must be instantiated before effective use, whether for the description of a physical device or for the description of an automaton. A physical hardware device cannot have N pins. An automaton cannot have N states.
- Circuits described as a hierarchical or compositional set of nodes. The nodes can be different (cooperating) automata. The language is such that, basically, all the automata share the same clock. Due to this feature, LUSTRE is often referred to as a synchronous language ([6]).
2.2. An example of describing circuits in Lustre
Here is an example of an n-bit adder and an n-bit bus multiplexer. X: bool^n defines X as a bus of n wires, each of which is a boolean. The least significant wire is X[0] and the most significant one is X[n-1].
node add1 (a,b,c:bool) returns (r,s:bool);
let
  r = a and b or a and c or b and c;
  s = a xor b xor c;
tel;
node add (const n: int;
          a,b: bool^n)
returns (sum: bool^n);
var carry: bool^(n+1);
let
  carry[0] = false;
  (carry[1..n], sum[0..n-1]) =
    add1(a[0..n-1], b[0..n-1], carry[0..n-1]);
tel;
node mux1 (i,t,e:bool) returns (s:bool);
let
  s = i and t or not i and e;
tel;
node mux (const n: int;
          i: bool;
          t,e: bool^n)
returns (s: bool^n);
let
  s[0..n-1] =
    mux1(i^n, t[0..n-1], e[0..n-1]);
  -- boolean i is repeated n times
tel;
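For readers who do not know LUSTRE, the same adder and multiplexer can be mirrored in Python on lists of booleans, least significant bit first. This is only a sketch for checking the gate equations; the helpers `to_bits` and `to_int` are hypothetical and have no counterpart in the LUSTRE text.

```python
def add1(a, b, c):
    """One-bit full adder: returns (carry out, sum)."""
    r = (a and b) or (a and c) or (b and c)
    s = a ^ b ^ c
    return r, s

def add(a, b):
    """Ripple-carry adder on two buses of equal width.
    The final carry is dropped, as in the LUSTRE node."""
    carry, total = False, []
    for ai, bi in zip(a, b):
        carry, si = add1(ai, bi, carry)
        total.append(si)
    return total

def mux(i, t, e):
    """Bus multiplexer: returns t when i is true, e otherwise."""
    return [(i and ti) or ((not i) and ei) for ti, ei in zip(t, e)]

def to_bits(x, n):
    """Hypothetical helper: integer to n booleans, X[0] first."""
    return [bool((x >> k) & 1) for k in range(n)]

def to_int(bits):
    """Hypothetical helper: list of booleans back to an integer."""
    return sum(1 << k for k, b in enumerate(bits) if b)

# With n = 3 the adder works modulo 8.
print(to_int(add(to_bits(5, 3), to_bits(6, 3))))
```

With 3-bit buses, 5 + 6 gives 3, since the carry out of the most significant bit is discarded.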
A basic node flipflop defines an edge-triggered D device:
node flipflop (D:bool; clock,reset:bool)
returns (Q:bool);
We use it to implement registers (Fig 4).
node partoffig4 (const n: int;
                 depl: bool^n;
                 clock,reset,cond,br: bool)
returns
  (newpc: bool^n);
var pc, spc: bool^n;
let
  -- br (branch opcode bit) and depl (displacement) come from
  -- the current instruction
  pc = flipflop (newpc, clock^n, reset^n);
  spc = flipflop (pc, clock^n, reset^n);
  newpc = mux (n, cond and br,
               add (n, spc, depl),
               plusone (n, pc) );
tel;
2.3. What do we obtain ?
A first use is to compile the circuit description given by gates and flip-flops. The compiler delivers a description of the given automaton. The description is based on the set of states and two functions: the transition function and the output function. If the input description contains several automata, the compiler computes the product automaton. We must be careful to avoid machines that are too large.
The description of the resulting automaton is given in an internal textual form or in a graphical form. It could as well be written in VHDL or another Hardware Description Language. Another tool minimizes this automaton.
The result of this extraction can be used in different ways; this paper concentrates on the third one.
- A first use is to check the circuit obtained by a commercial CAD synthesis tool. Our tool contains, in a certain form, the reverse function. Given a list of states and transitions, one of the tools we use gives the minimal equivalent FSM.
- A second very important task is made possible by this tool, but we shall not go into the details in this paper: if we describe two FSMs, and if we add a comparator on the outputs, we can check the equivalence of the two FSMs (let us notice that the comparator is virtual in this case). They are equivalent if (and only if) the comparator always delivers "True". In this case the minimal automaton resulting from the composition of the two automata and the comparator has only one state. We used this approach in education [3].
- In this paper we present another use, exploration, based on the careful manual analysis of the result of this extraction process.
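The second use in the list above, equivalence checking with a virtual comparator, can be sketched as a walk over the product of two machines. The sketch below is in Python and does not reproduce the LUSTRE tool; the two counters (a binary one and a Gray-coded one) are hypothetical examples invented for the illustration.

```python
def equivalent(step1, out1, init1, step2, out2, init2, inputs):
    """Explore the product automaton; the 'comparator' is the test
    out1 == out2, which must hold in every reachable product state."""
    seen = {(init1, init2)}
    todo = [(init1, init2)]
    while todo:
        s1, s2 = todo.pop()
        for i in inputs:
            if out1(s1, i) != out2(s2, i):
                return False            # the comparator delivers False
            nxt = (step1(s1, i), step2(s2, i))
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return True

INPUTS = (False, True)

# Machine 1 (specification style): a modulo-4 counter with enable.
def spec_step(s, inc):
    return (s + 1) % 4 if inc else s

def spec_out(s, inc):
    return s == 3

# Machine 2 (implementation style): two flip-flops in Gray code,
# visiting 00, 01, 11, 10 when enabled.
def gray_step(s, inc):
    g1, g0 = s
    return (g0, not g1) if inc else s

def gray_out(s, inc):
    g1, g0 = s
    return g1 and not g0                # true in the fourth state, 10

def buggy_out(s, inc):
    g1, g0 = s
    return g1 and g0                    # wrong decoding, for contrast

print(equivalent(spec_step, spec_out, 0,
                 gray_step, gray_out, (False, False), INPUTS))
print(equivalent(spec_step, spec_out, 0,
                 gray_step, buggy_out, (False, False), INPUTS))
```

The first call reports equivalence and the second does not, because the wrongly decoded output differs as soon as both counters reach their third state.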
A second use of the description is simulation. An interactive version of the simulator allows the user to give inputs to the automaton and to observe outputs. Timing diagrams can be drawn. A batch version reads the inputs from a text file. We use this simulator in education.
2.4. What is exploration ?
Our technique is based on exploration of the circuit behaviour. Exploration means that the designer is not yet completely certain about what must be done. It is too early to give any kind of implementation, formal specification or similar description. Exploration is what you do on draft paper, with a pencil. You are entering into your design and you need to dig into it. You try, in fact, to understand what your circuit will be (or would be?). Exploration is certainly not for a full circuit but for a part of a circuit, for a mechanism, for a hardware trick.
The tools used in this phase help you to get all the information from a rough description. They provide a kind of Computer Aided Draft. But your draft paper deals with formal proofs when needed. In the next section an example is detailed to make this approach clear.
3. A complex behaviour, adapted from SPARC
Our circuit is a pipelined processor. To make the paper as self-contained as possible, we made drastic simplifications and limited ourselves to a very simple example. Our 3-bit microprocessor could seem ridiculous compared to processors with 50 000 000 transistors! We use this example in digital design education at an introductory level.
Our example is a reduced version of a SPARC processor and is organized around the part computing the next value of the Program Counter (PC). We study the so-called delayed branch mechanism.
address  label  instr
0        zz     instr0
1               instr1
2               brcond ss
3        tt     instr3
4               instr4
5               brcond zz
6               brcond tt
7        ss     instr7
Figure 1. A short program in assembly language. All the instrN instructions are arithmetic. Label zz is at address zero, tt at three and ss at seven.
3.1. How does the Program Counter of a non-SPARC processor progress?
Let us consider a machine with only two classes of instructions: arithmetic instructions (their only influence upon the Program Counter is to increment it), and conditional branches BRCOND. In this simple machine, the Program Counter is coded on three bits. The short program of figure 1 contains 8 instructions, and after address 7 comes address 0.
This program can exhibit different behaviours, depending upon the values of the condition at the instants where it is tested. We can represent them by significant sequences of instructions:
- The sequence of instructions [instr0, instr1, brcond ss, instr7] occurs if the condition is Yes when evaluated at the instruction at address 2.
- [instr0, instr1, brcond ss, instr3] occurs if this same condition is No.
- [instr3, instr4, brcond zz, instr0] occurs if Yes occurs at address 5.
- [instr3, instr4, brcond zz, brcond tt, instr3] occurs if No occurs at address 5 and Yes at address 6.
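The four sequences above can be replayed with a few lines of Python. This is a minimal sketch of a Program Counter with immediate (non-delayed) branches; the table `BRANCH_TARGET` and the function `run` are hypothetical names, and the targets are simply the label addresses of figure 1.

```python
# Address of each brcond in figure 1 -> address of its target label.
BRANCH_TARGET = {2: 7,   # brcond ss
                 5: 0,   # brcond zz
                 6: 3}   # brcond tt

def run(start, cond_at, steps):
    """Return the successive PC values of a machine without delayed branch.
    cond_at maps a branch address to the condition value seen there."""
    pc = start
    trace = [pc]
    for _ in range(steps):
        if pc in BRANCH_TARGET and cond_at.get(pc, False):
            pc = BRANCH_TARGET[pc]       # branch taken immediately
        else:
            pc = (pc + 1) % 8            # 3-bit incrementation
        trace.append(pc)
    return trace

print(run(0, {2: True}, 3))              # Yes at address 2
print(run(0, {2: False}, 3))             # No at address 2
print(run(3, {5: True}, 3))              # Yes at address 5
print(run(3, {5: False, 6: True}, 4))    # No at 5, Yes at 6
```

The traces are lists of addresses; reading the instruction at each address in figure 1 gives back the four instruction sequences listed above.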
3.2. How does a SPARC Program Counter progress?
The SPARC system is different from the standard one and is well known ([18], [16]):
There are Control Transfer Instructions (CTI). Different CTI exist: Jump and Link, Conditional Branch and Call. We shall simplify here by considering only conditional branch instructions.

code
       Inst1
I2     Brcond label
       Inst3
       Inst4
       ...
label  Inst5
       ...

line  value of cond  Inst Sequence
I2    true           [Inst1 Brcond Inst3 Inst5,..]
I2    false          [Inst1 Brcond Inst3 Inst4,..]
Figure 2. Delayed branch mechanism. The first table is a small SPARC program. The second table gives possible behaviours assuming that Inst1, Inst3, Inst4 are not Control Transfer Instructions.
The instruction written immediately after a CTI is executed first, then the transfer of control occurs. This mechanism is known as Delayed Branch. The inserted instruction is said to be in the Delay Slot. There is also an annul bit mechanism, which we do not introduce in this paper.
In the small program of figure 2 two sequences of instructions may occur (assuming that Inst1, Inst3, Inst4, Inst5 are not CTI):
- if the condition is true when it is examined in instruction I2, the sequence of instructions is Inst1 Brcond Inst3 Inst5
- if the condition is false when it is examined in instruction I2, the sequence of instructions is Inst1 Brcond Inst3 Inst4
This behaviour is made possible by the existence of a (classical) Program Counter (PC) register and of another piece of information named Next Program Counter (nPC). The immediate question is obviously: what occurs when two CTI are written consecutively? The standard practice of a programmer, however, is not to write programs with such features [13].
The complete documentation ([16]) explains the different possible behaviours in this case. We take here a simplified version.
We present such a situation in figure 3: the program contains two consecutive conditional branches. They appear at addresses 5 and 6. The program is the same as the one of figure 1.
3.3. Our exploration experiment with this Very Reduced Computer
Our experiment was the following: we got a VHDL description of a SPARC architecture from the European Space Agency site (Leon version [17]) and we simplified it.

address  label  instr
0        zz     instr0
1               instr1
2               brcond ss
3        tt     instr3
4               instr4
5               brcond zz
6               brcond tt
7        ss     instr7

line      value of cond     Prog Counter values
2         true              [0, 1, 2, 3, 7]
2         false             [0, 1, 2, 3, 4]
5 then 6  false then false  [3, 4, 5, 6, 7, 0]
5 then 6  true then false   [3, 4, 5, 6, 0, 1]
5 then 6  false then true   [3, 4, 5, 6, 7, 3]
5 then 6  true then true    [3, 4, 5, 6, 0, 3]
Figure 3. A SPARC program with intertwined branches and the possible behaviours.

[Figure 4: block diagram; labels: depl, br, mux, add, memory, pc, spc, newpc, plusone, cond.]
Figure 4. Organization of the Program Counter updating in the reduced SPARC processor. Instead of a condition code register we use cond as an external input.

For this experiment, we kept only:
- the Program Counter (pc) and its ghost copy (spc),
- the Next Program Counter value (newpc),
- the combinational incrementer (plusone) associated with these registers,
- the adder used to add a displacement to obtain the branch target address,
- the Instruction Register containing the current instruction. It contains two fields: br is the operation code, depl is the displacement of branch instructions.
The Register Transfer Level description of the system is given graphically in figure 4.
Our circuit is composed of this restricted SPARC and of an 8-word memory containing the aforementioned program. This memory can be a ROM because we do not use any STORE instructions.
Let us examine the small program used as a testbench (figure 3). The expected behaviour depends upon the values of the condition during the execution of the instructions at addresses 5 and 6. For instance, if the condition tested in the instruction at address 5 and the condition tested in the instruction at address 6 are both true, the sequence of values of the Program Counter is 3, 4, 5, 6, 0, 3.
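The table of figure 3 can be replayed with a small Python model of the registers pc and spc. This is a sketch under explicit assumptions that are not stated in the text: the fields br and depl seen by the multiplexer are those of the instruction at address spc, the branch target is spc + depl modulo 8, and the displacements (5, 3 and 5) are derived from the label addresses. The names `ROM`, `step`, `run` and `figure3_row` are hypothetical.

```python
# One (br, depl) pair per address of the program of figure 3.
# ASSUMPTION: displacements are relative to the branch address, modulo 8.
ROM = [(False, 0),   # 0  zz  instr0
       (False, 0),   # 1      instr1
       (True, 5),    # 2      brcond ss  (2 + 5 = 7)
       (False, 0),   # 3  tt  instr3
       (False, 0),   # 4      instr4
       (True, 3),    # 5      brcond zz  (5 + 3 = 8 = 0 mod 8)
       (True, 5),    # 6      brcond tt  (6 + 5 = 11 = 3 mod 8)
       (False, 0)]   # 7  ss  instr7

def step(pc, spc, cond):
    """One clock edge: returns the new (pc, spc).
    ASSUMPTION: br and depl belong to the instruction at address spc."""
    br, depl = ROM[spc]
    newpc = (spc + depl) % 8 if (cond and br) else (pc + 1) % 8
    return newpc, pc

def run(pc, spc, conds):
    """PC values visited when cond takes the given value at each cycle."""
    trace = [pc]
    for cond in conds:
        pc, spc = step(pc, spc, cond)
        trace.append(pc)
    return trace

def figure3_row(c5, c6):
    """Start at address 3; c5 and c6 are the conditions of the branches
    at addresses 5 and 6, each sampled one cycle after its fetch."""
    return run(3, 0, [False, False, False, c5, c6])

print(run(0, 0, [False, False, False, True]))   # branch at 2 taken
print(figure3_row(True, True))
```

In this model the condition of a branch is sampled while pc already holds the address of the delay slot. The other rows of figure 3 are obtained by changing the arguments of `figure3_row`.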
3.4. Boolean level description
Let us recall that the automata computing facility of the Lustre compiler ([7], [19]) can compute the minimal automaton from one of its descriptions, for instance a description given in logic gates and flip-flops. To use this facility, we restricted the data path to 3 address bits and 4 data bits. The ROM contains eight 4-bit words as in figure 3. The OpCode has only one bit (true for a BRCOND, false for a NOP) and the displacement is coded on 3 bits. It was enough for our experiment, as will be shown.
To focus on the role of the condition, we considered it to be an external input. The logic description is simple: a 3-bit adder, a 3-bit incrementer, etc. We compiled the Lustre version of this logic description, obtained an automaton, and minimized it. We obtained the automaton described by figure 5.
3.5. How do we understand this automaton ?
Figure 5 gives the states obtained from the compiler. We named them A, B, C, D, E, F, G, H and a, d, dd and h. A is the initial state. Next to the states we added the corresponding values of the Program Counter. For instance in states D, d and dd, the PC value is 3.

[Figure 5: state diagram; states A, B, C, D, E, F, G, H, a, d, dd, h; the left column lists the Program Counter values 0 to 7.]
Figure 5. All the possible states of the program from figure 3. The left column gives the Program Counter values. The picture has three kinds of arrows: black thin arrows correspond to standard PC incrementation (example: from PC=1 to PC=2), black bold arrows correspond to rejected control transfers when the condition is false (example: from PC=3 to PC=4), grey arrows correspond to control transfers when the condition is true (example: from PC=3 to PC=7).

Let us comment on a transition in the automaton:
- Arrow D → h (PC = 3 → PC = 7) corresponds to the instruction Brcond at address 2 and a condition True.
All the possible behaviours given in figure 3 correspond to a path in this automaton. The sequence of PC values 3, 4, 5, 6, 0, 1 (condition true for the instruction at address 5 and condition false for the instruction at address 6) corresponds to the sequence of states D, E, F, G, a, B.
Exploration gave us confidence that our PC computation mechanism is correct with respect to the specification of the processor with delayed branch. We could also observe that our simplified model introduces a simulation artifact: cond seems to be tested one clock cycle too late.
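The kind of state list discussed in this section can be approximated by enumerating the reachable pairs (pc, spc) of a simplified model. The Python sketch below makes the following assumptions, none of which is given in the text: both registers reset to 0, the fields br and depl are those of the instruction at address spc, and the displacements are 5, 3 and 5. The pairs are raw states, not the minimized automaton of figure 5, so their number need not match the number of named states.

```python
# (br, depl) per address; ASSUMPTION: target = branch address + depl mod 8.
ROM = [(False, 0), (False, 0), (True, 5), (False, 0),
       (False, 0), (True, 3), (True, 5), (False, 0)]

def step(pc, spc, cond):
    """One clock edge of the assumed model: returns the new (pc, spc)."""
    br, depl = ROM[spc]
    newpc = (spc + depl) % 8 if (cond and br) else (pc + 1) % 8
    return newpc, pc

def reachable(initial):
    """All (pc, spc) pairs reachable for any sequence of cond values,
    together with the labelled transitions between them."""
    seen, todo, edges = {initial}, [initial], set()
    while todo:
        state = todo.pop()
        for cond in (False, True):
            nxt = step(state[0], state[1], cond)
            edges.add((state, cond, nxt))
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return seen, edges

states, edges = reachable((0, 0))        # ASSUMPTION: reset gives pc = spc = 0

# Group the reachable pairs by PC value, as in the left column of figure 5.
by_pc = {}
for pc, spc in sorted(states):
    by_pc.setdefault(pc, []).append(spc)

for pc in sorted(by_pc):
    print("PC =", pc, " spc in", by_pc[pc])
```

Under these assumptions, three distinct pairs share PC = 3, which is consistent with the three states D, d and dd, and the pair (pc=3, spc=2) followed by (pc=7, spc=3) appears to play the role of the arrow D → h.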
4. Comparison of exploration with other approaches
In this section we compare the principles of this technique to other approaches.
- Obviously our exploration must not be confused with the state exploration used in certain model checkers.
- A first characteristic of our approach is its relation to simulation. Exploration can give complete information while simulation cannot. Let us explain:
  - In a first step, simulation allows the designer to check consistency between the implementation and the intention. This part is known to be difficult and unsafe. The behaviour of the implementation is seen through timing diagrams and the reliability depends upon the testbench prepared. Elaborating a good testbench is a highly difficult problem. In exploration, we do not need to give a testbench, and we have, in a certain way, ALL the possible testbenches. Obviously it can be too much... But the behaviour obtained by this technique is complete.
  - In a second step, simulation can take into account information extracted from layout steps. Nothing can replace these computations, which manage wire and gate delays. Exploration is not useful at this level.
- A second characteristic is the relation to model checking. We use a formal approach, and associated tools. The LUSTRE compiler is in fact a model checker. It contains the functionality to build and minimize an automaton. But in this paper we present how exploration information can be obtained from these tools. In the example presented here, the verifier is used with a trick to obtain ALL the states of an automaton. This is generally frightening for people involved in model checking: they use equivalence relations on the states to avoid combinatorial explosion. In our examples these complete Finite State Machine structures are the useful information.
- It is difficult to compare exploration to Theorem Prover based techniques. Exploration can only help the designer to establish the expected properties or to discover certain counter-examples. By this experiment, we have not demonstrated that our SPARC is correct with regard to a given specification. We have done a kind of symbolic execution of the machine language program of figure 3 and we have seen that we obtain all the expected behaviours. Giving any kind of proof would have needed a specification of the delayed branch mechanism. Such a specification is difficult to establish. A proof of the implementation of a delayed branch mechanism appears in [10] and is based on the use of the theorem prover P.V.S. ([14]).
We can certainly not draw general conclusions from this study: in particular it is unrealistic to try to generalize such a study to a full processor. We simply made possible the accurate study of a subtle behaviour, and we gained confidence in our implementation of this behaviour. Our technique should be compared to J. Levitt and K. Olukotun's unpipelining.
"Our technique, which we call unpipelining, removes pipeline stages from an implementation while preserving the implementation's behaviour, collapsing it into a single stage through a series of transformations. The complexity due to the pipelining is completely eliminated and the deconstructed pipeline can be compared directly to the ISA specification." [11]
We also break the complexity of pipeline implementations by considering the developed form. But we establish properties, at the logic level, with a tool similar to a model checker, while those authors use a theorem prover to manipulate formulas representing the behaviour at Register Transfer Level.
5. Conclusion
This kind of exploration, based on human understanding of computer generated automata, is very fruitful. We are aware that it is also difficult.
Our other experiments, not presented here because they were too big, show that 100 states is the maximal manageable complexity. It means that we must drastically simplify a mechanism to study it. In a previous study ([4]) we obtained a 6500-state automaton and it was impossible to manage it by hand.
It remains of course tempting to explore the behaviour of other tricks in computer architecture. Branch Target Buffer or Register Renaming as in the processor AX ([1]) are good candidates. At least we can fully make ours those authors' comment: "Experience in teaching computer architectures partially motivated this work."
References
[1] Arvind and X. Shen, Using Term Rewriting Systems to Design and Verify Processors, IEEE Micro, May-June 1999, pp 36-46.
[2] P. Amblard, J.C. Fernandez, F. Lagnier, F. Maraninchi, P. Sicard et P. Waille, Architectures Logicielles et Matérielles, Dunod, 2000.
[3] P. Amblard, F. Lagnier and M. Levy, Introducing Digital Circuits Design and Verification Concurrently, Proceedings of the 3rd European Workshop on Microelectronics Education, Aix en Provence, 18-19 May 2000, Kluwer, pp 261-264.
[4] P. Amblard, A Finite State Description of the Earliest Logical Computer: the Jevons' Machine, Mixed Design of Integrated Circuits and Systems, (Ed A. Napieralski, Z. Ciota, A. Martinez, G. De Mey, J. Cabestany), Kluwer, 1998, pp 195-202.
[5] A. Bouajjani, J.-C. Fernandez, N. Halbwachs, P. Raymond, C. Ratel, Minimal state graph generation, Science of Computer Programming, Vol. 18, 1992, pp 247-269.
[6] N. Halbwachs, Synchronous Programming of Reactive Systems, Kluwer Academic Pub., 1993.
[7] N. Halbwachs, P. Caspi, P. Raymond and D. Pilaud, The Synchronous Data-flow Programming Language Lustre, Proceedings of the IEEE, September 1991, pp 1305-1320.
[8] N. Halbwachs, F. Lagnier and C. Ratel, Programming and Verifying Real-time Systems by Means of the Synchronous Data-flow Programming Language Lustre, IEEE Transactions on Software Engineering, Special Issue on the Specification and Analysis of Real-Time Systems, September 1992, pp 785-793.
[9] S. Huang, K. Cheng, K. Chen, C. Huang, F. Brewer, AQUILA: An Equivalence Checking System for Large Sequential Designs, IEEE Transactions on Computers, May 2000, pp 443-464.
[10] D. Kroening, W. Paul and S. Mueller, Proving the Correctness of Pipelined Micro-Architectures, Proc. of the ITG/GI/GMM Workshop, (Ed K. Waldschmidt and C. Grimm), VDE Verlag, 2000, pp 89-98. http://www-wjp.cs.uni-sb.de/projects/comparch/papers/pipe.pdf
[11] J. Levitt and K. Olukotun, A Scalable Formal Verification Methodology for Pipelined Microprocessors, DAC 1996, Las Vegas, June 1996, pp 558-563.
[12] K. McMillan, Verification of Infinite State Systems by Compositional Model Checking, CHARME 1999, Bad Herrenalb, September 1999, pp 219-233. (LNCS 1703)
[13] R. Paul, SPARC Architecture, Assembly Language Programming, and C, Prentice-Hall, Inc., 1994.
[14] M. Srivas, H. Rueß and D. Cyrluk, Hardware Verification using PVS, Formal Hardware Verification, Ed T. Kropf, 1997, pp 156-205. (LNCS 1287)
[15] N. Wirth, Digital Circuit Design, Springer-Verlag, 1995.
[16] The SPARC Architecture Manual, version 8, Prentice-Hall, Inc., 1992.
[17] http://www.estec.esa.nl/wsmwww/leon/
[18] http://www.sparc.org/standards/V8.pdf
[19] http://www-verimag/SYNCHRONE