Synthesis and optimization of interfaces between hardware modules with incompatible protocols by Androutsopoulos,V. et al.
SYNTHESIS AND OPTIMIZATION OF INTERFACES BETWEEN HARDWARE MODULES 
WITH INCOMPATIBLE PROTOCOLS 
Vassilis Androutsopoulos, TJW Clarke and DM Brookes 
Department of Electrical and Electronic Engineering, Imperial College 
London SW7 2BT, UK 
ABSTRACT 
In this paper, we present a new algorithm that performs auto- 
matic interface synthesis between two synchronous hardware mod- 
ules with incompatible data communication protocols. We intro- 
duce the Data Path State Machine (DPSM) which captures data 
path dependencies. This allows control logic for data paths to be 
synthesized which is optimized for bandwidth over multiple trans- 
actions. 
1. INTRODUCTION 
The recent advent of Systems-On-Chip products, the growing
complexity of designs and stringent time-to-market pressures are all
factors behind the so-called design productivity gap. Unsurprisingly,
pressure has therefore been exerted on EDA companies to develop
tool environments that encourage the reuse of previous designs.
The increasing reuse of RTL hardware blocks makes the interfacing
of these blocks important. Communication between the blocks is made
possible if proper interface circuits are introduced. Manually
adapting these interfaces is a tedious and error-prone process.
Instead, methods and algorithms to automatically synthesize
interfaces need to be developed.
The problem can be expressed as: given the producer and consumer
data communication protocols and a description of the data path
that interfaces the two sides, generate an optimal (in terms of
performance) interface machine automatically that will synchronize
and preserve the meaning of the data between the two sides.
Passerone [PRSV98] showed that if protocols are represented 
naturally as Deterministic Finite Automata (DFA), the product FSM 
can be pruned to implement interface control logic. Passerone has 
argued that DFAs are not as easy for designers to use for proto- 
col specification as regular expressions. In this paper we will as- 
sume that the protocol specifications can be translated automati- 
cally into equivalent DFAs as shown by [PRSV98]. We will ad- 
dress the problem of automatic interface synthesis from protocol 
DFAs. 
Passerone’s algorithm uses acyclic DFAs to simplify synthe- 
sis, and so does not optimize latency and bandwidth over multiple 
transactions. Furthermore, it does not consider data path issues. We 
present a new algorithm that deals with more realistic interface 
synthesis in which protocols are represented by cyclic DFAs. The 
introduction of a Data Path State Machine (DPSM), capturing data 
path dependencies, allows control logic for data paths to be opti- 
mized for bandwidth. 
The rest of the paper is organized as follows: Sect. 2 gives a
brief description of previous and related work, Sect. 3 presents
terminology required in the rest of the paper, Sect. 4 presents the
protocol specification and DPSM formalisms, and Sect. 5 presents the
algorithm. Finally, Sect. 6 describes the results and Sect. 7
concludes the paper and presents directions for future work.
Figure 1: Problem Definition
2. RELATED WORK 
Interface synthesis has been addressed in a broad range of
literature. The STG is introduced in [Bor88] as a means to establish
synchronization between synchronous and/or asynchronous components.
However, the protocol specifications are too low level (timing
diagrams) and the correspondence between the different data items
is not resolved automatically.
The authors in [AM91] describe the protocols using two Verilog
FSMs, and a non-deterministic Cartesian product is obtained which
forms the interface. This is determinized by using a third machine
called the C-machine, which describes the intended behavior of the
interface. The method does not solve the data correspondence
problem mentioned above and does not consider any data path issues.
Passerone et al. [PRSV98] describe the protocols using Reg- 
ular expressions. These are translated into finite automata which 
are then synthesized into FSMs using a product algorithm which 
resolves the pseudo non-determinism that arises by making the 
composition causal, non-deadlocking and optimal in terms of its 
latency. It solves the data correspondence problem but is limited 
in form of communication i.e. only a single transaction, point-to- 
point communication and common clock are assumed. 
More recently there have been efforts by [PCPK00] to generate
hardware interfaces with both sides operating at different clock
frequencies by inserting additional states and edges into the
product FSM. The authors in [SG02] have recently proposed an
interface architecture with three FSMDs (one for each of the
producer, consumer and queue) and a data path consisting of a queue,
which they believe is general enough to accommodate any component
protocols. The protocols are specified using FSMDs and the synthesis
algorithm is responsible for mapping these onto the FSMDs of the
target architecture. The algorithm does not address the data
correspondence problem.
V-613    0-7803-7761-3/03/$17.00 ©2003 IEEE
3. PRELIMINARY TERMINOLOGY 
If $c_1, c_2, \ldots, c_n$ are the control ports associated with a
certain protocol, assuming values from the sets
$\sigma_1, \sigma_2, \ldots, \sigma_n$, the control set of the
protocol is defined as the product $C = \prod_{i=1}^{n} \sigma_i$.
Elements from the control set are called control symbols.
If $d_1, d_2, \ldots, d_p$ are the data ports associated with a
certain protocol, assuming values from the sets
$\rho_1, \rho_2, \ldots, \rho_p$, the data set of the protocol is
defined as the product $D = \prod_{i=1}^{p} \rho_i$. Elements from
the data set are called data symbols.
The alphabet $\Sigma$ of the protocol is defined as the product
$C \times D$. The elements of the alphabet are called symbols. A
formal language over $\Sigma$ is a set of strings of symbols from
$\Sigma$. A protocol is a formal language over $\Sigma$. In other
words a protocol is a set of strings of symbols from $\Sigma$ where
each string of symbols represents a legal manifestation of a certain
transaction or behaviour.
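As a small illustration of these definitions, the sketch below builds a control set, a data set and the resulting alphabet as Cartesian products. The ports and their value sets (two one-bit control ports and one two-bit data port) are assumptions chosen for the example and do not come from the paper.

```python
from itertools import product

# Hypothetical value sets: two 1-bit control ports and one 2-bit data port.
control_port_values = [(0, 1), (0, 1)]
data_port_values = [(0, 1, 2, 3)]

# The control set C and data set D are Cartesian products of the port value sets.
C = list(product(*control_port_values))
D = list(product(*data_port_values))

# The alphabet is C x D: each symbol pairs a control symbol with a data symbol.
alphabet = list(product(C, D))

# A protocol is a set of strings of symbols; one such string, written as a tuple:
example_string = (((1, 0), (3,)), ((1, 1), (3,)), ((0, 0), (0,)))
assert all(symbol in alphabet for symbol in example_string)
```

With these assumed ports the control set and the data set each contain four elements, so the alphabet contains sixteen symbols.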
Elements from the protocol set are called tokens. A token 
represents a complete communication between the producer and 
consumer. The set of data symbols associated with the token is 
known as the data type. In a bus transaction a token is broken 
down into a series of sub-tokens. Each sub-token consists of a 
string of data and control symbols. The data symbols are associ- 
ated with the data bus and the control symbols are associated with 
the control signals. 
Figure 2 illustrates the producer and consumer Data Flow 
States (DFS). Boolean expressions involving DFS form the condi- 
tions on the transitions of the Data Path State Machine. 
Figure 2: Producer and Consumer Data Flow States 
4. INTERFACE SPECIFICATION 
A DFA is translated into the equivalent Labelled pseudo Non De- 
terministic Finite State Machine by considering the behaviour of 
the input and output wires separately. 
Definition 1 A Labelled pseudo Non Deterministic Finite State
Machine (LNDFSM) is a directed graph defined by either of the
following tuples, depending on whether the hardware block
conforming to the communication protocol is a producer or a
consumer.
$LNDFSM_p := \langle S, I, O \cup V, L, \delta, \lambda, P, F, s_0 \rangle$
$LNDFSM_c := \langle S, I \cup V, O, L, \delta, \lambda, P, F, s_0 \rangle$
$S$ denotes the set of protocol states with $s_0 \in S$ being the
reset state. $I$ represents the finite input control space, $O$
represents the finite output control space and $V \subset D$ defines
the set of storage variables. $L$ is the set of transition labels
$(\alpha / \beta, \gamma)$ (for a producer) or
$(\alpha, \gamma / \beta)$ (for a consumer) where $\alpha \in I$,
$\beta \in O$, and $\gamma$ is a label distinguishing between the
different storage variables in the transaction.
$\delta : S \times I \rightarrow S$ is the next state function and
$\lambda$ is the output relation denoted by the characteristic
function $\lambda : I \times S \times O \rightarrow B$ where
$B = \{0, 1\}$. $F \subseteq S$ denotes the set of final states. A
final state is the state which represents the entire data type being
sent/received for the last time in the transaction. For a linear FSM
$|F| = 1$. For a non-linear (branching) protocol $|F| \geq 1$. $P$ is
the set of transaction parameters (i.e. the number of data symbols
associated with the sub-token and token).
Figure 3: LNDFSMs for Producer and Consumer protocols 
Example linear LNDFSMs for a non-stallable serial protocol 
(acting as the producer) and a 4-phase handshake protocol (acting 
as the consumer) are shown in figure 3. The serial protocol initially 
waits for the environment to set its input control signal to 1. An 
associated data symbol d1 is waiting to be placed onto the data 
bus. Once a 1 is received from the environment, the producer puts 
the data onto the bus. The control signal then goes to 0 one clock 
cycle later and another data symbol d2 is placed onto the bus. The 
last data symbol in the data type, d4, is placed onto the data bus 
two clock cycles later. 
The 4-phase handshake protocol initially waits for a request 
signal from the environment. After the environment sets the re- 
quest signal, it waits for the hardware block to assign the acknowl- 
edge signal. After the acknowledge signal is received, the environ- 
ment puts the data onto the bus. The hardware block reads the data 
from the bus until the environment drops the request signal. 
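The handshake behaviour described above can be sketched as a small state machine. This is only an illustration: the three state names, the `step` and `run` helpers and the example trace are assumptions, and the sketch does not reproduce the LNDFSM of Figure 3.

```python
# Hypothetical state names for a block following the 4-phase handshake.
IDLE, REQUESTED, TRANSFER = "idle", "requested", "transfer"

def step(state, req, ack):
    """Advance one clock cycle; return (next_state, data_read)."""
    if state == IDLE:
        # Wait for the environment to raise the request signal.
        return (REQUESTED, False) if req else (IDLE, False)
    if state == REQUESTED:
        # Wait for the hardware block to assert acknowledge.
        return (TRANSFER, False) if ack else (REQUESTED, False)
    # TRANSFER: data is read from the bus until the request is dropped.
    return (TRANSFER, True) if req else (IDLE, False)

def run(trace):
    """Feed (req, ack) pairs; return (cycles in which data is read, final state)."""
    state, reads = IDLE, []
    for cycle, (req, ack) in enumerate(trace):
        state, data_read = step(state, req, ack)
        if data_read:
            reads.append(cycle)
    return reads, state

# Example trace: request raised, acknowledged, held for two reads, then dropped.
trace = [(0, 0), (1, 0), (1, 1), (1, 1), (1, 1), (0, 0)]
reads, final_state = run(trace)
```

In this trace the block reads data in the two cycles after the acknowledge has been seen and returns to the idle state once the request is dropped.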
Definition 2 A Data Path State Machine (DPSM) is a directed
acyclic graph defined by the tuple
$DPSM := \langle R, \delta, C, r_0, F \rangle$
where $R$ denotes the set of protocol states, with $r_0 \in R$ being
the initial state and $F \in R$ the final state,
$\delta \subseteq R \times R$ is the set of state transitions, and
$C$ is the set of all possible DFS conditions for the producer and
the consumer. The states in the DPSM emphasize the acceptance or
rejection of a sequence of DFS conditions. The edges are labelled
with these conditions.
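To make Definition 2 more concrete, the following sketch models a two-state data path machine for a single register that accepts or rejects a sequence of conditions. The state names, the two Boolean conditions (`producer_write`, `consumer_release`) and the choice to treat every unlisted combination as a rejection are assumptions made for illustration, not the authors' DPSM.

```python
# Hypothetical states: FREE (register empty) and FULL (data not yet consumed).
FREE, FULL, REJECT = 0, 1, "reject"

# Edges labelled with an assumed condition pair (producer_write, consumer_release).
TRANSITIONS = {
    (FREE, (False, False)): FREE,    # nothing happens
    (FREE, (True, False)): FULL,     # producer has written the data type
    (FULL, (False, False)): FULL,    # waiting for the consumer
    (FULL, (False, True)): FREE,     # consumer relinquishes the register
    (FULL, (True, False)): REJECT,   # overwrite would corrupt the contents
}

def accepts(conditions, state=FREE):
    """Return True if the sequence of condition pairs never reaches REJECT."""
    for condition in conditions:
        # Any combination not listed above is treated as a rejection.
        state = TRANSITIONS.get((state, condition), REJECT)
        if state == REJECT:
            return False
    return True

ok = accepts([(True, False), (False, False), (False, True), (True, False)])
corrupted = accepts([(True, False), (True, False)])
```

The first sequence is accepted because the consumer releases the register before the producer writes again; the second is rejected because the producer writes twice in a row.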
The DPSM for a single abstract register wide enough to store the
entire data type is illustrated in Figure 4. We assume that the
register is tied to the producer and that the data is clocked into
the register when it is first made available. State 1 in the DPSM
is entered when the producer has written the entire data type to
the register and is waiting for the consumer to relinquish the use
of the data contents in the register. If, at any time while the data
path is in state 1, the producer writes a new set of data symbols
to the register, the contents of the register will be corrupted and
the consumer will thereafter read incorrect data.
Figure 4: DPSM for a single abstract register
A possible refined version of the abstract register is shown in
Figure 5. It will consist of D modified registers where D represents
the size of the data type. These registers are modified to optimize
latency. X and Y represent the input and output port sizes
respectively in terms of the number of data symbols that can be
associated with them.
Figure 5: Data path architecture
5. SYNTHESIS ALGORITHM
The inputs to the synthesis algorithm are the LNDFSMs for the
producer and consumer data communication protocols and the DPSM
for the abstract register. The resulting composition is the Labelled
Deterministic Finite State Machine (LDFSM), a subset of the
Cartesian product of the two LNDFSMs and the DPSM. The LDFSM will
control the data flow between the producer and consumer, and
preserve the meaning of the data according to their LNDFSM
specifications. The LDFSM coupled with the abstract register of
Figure 5 will form the interface.
Definition 3 A Labelled Deterministic Finite State Machine
(LDFSM) is defined by the tuple
$LDFSM := \langle Q, I \cup R \cup V, O \cup V, \Lambda, \Delta, \lambda, q_0, F \rangle$
where $Q \subseteq S_p \times S_c \times R$ denotes the set of states
with $q_0 \in Q$ being the reset state and $F \subset Q$ the set of
final states. $I$ represents the finite input space, $O$ the finite
output space, $R$ represents the set of possible data path states,
$V \subset D$ defines the set of storage variables,
$\Lambda \subseteq L_p \times L_c$ is the set of transition labels
$(\alpha, \beta / \gamma, \delta)$ where $\alpha$ and $\gamma$ are
the input and output control symbols and $\beta$ and $\delta$ are
the input and output data labels distinguishing between the
different data symbols. $\Delta \subseteq Q \times Q$ is the set of
state transitions, and $\lambda : (I \cup R) \times Q \rightarrow O$
is the output relation.
The LDFSM is computed by using a modified version of Passerone's
product computation [PRSV98] (see Figure 6). The algorithm has been
re-implemented in approximately 2000 lines of C code and modified
to handle multiple transactions. The algorithm accepts the DPSM and
the producer and consumer data communication protocols as input.
The DPSM is used by the algorithm to capture the desired data
dependencies for the given data path. In particular, what is new in
this implementation of the algorithm is the possibility for the
user to specify any arbitrary data path behaviour in the form of
the DPSM. It returns the LDFSM composition if one exists, which is
optimal in terms of the cycle length.
Two auxiliary data structures are created to assist the product
machine computation: Stack and Fail pool. The stack marks the
explored path and is used to detect loops and avoid endless
computation. The fail pool is used to record failed states.
The Explore function returns one of three objects: Success, Fail,
Loop. The Explore function returns Success for a transition that
will definitely lead to a successful transfer of data and Loop if
the state already exists on the stack and has already been visited.
Fail is returned if the interface is in a state where the data
transfer is non-causal, the buffer has overflowed or is
uncontrollable (i.e. will lead to either one of these states in an
unfriendly environment).
The product is computed by performing a depth first recursive 
search with backtracking on all possible states in the product ma- 
chine. The required subset of the product machine is constructed 
by starting from no states and then adding states. Each new state 
in the tree is explored and the pseudo non-determinism that arises 
is resolved by choosing the transitions which make the resulting 
composition causal, controllable and optimal in terms of its cycle 
length. In particular, we define a final state as a state which con- 
tains backedges to states previously marked on the stack and the 
minimum cycle length of all the states are computed with reference 
to these final states. If a final state contains multiple backedges 
which are non-deterministic, the backedge which results in the 
smaller cycle length is chosen. 
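The depth-first search with backtracking described above can be sketched as follows. This is a simplified illustration and not the pseudo code of Figure 6: it omits the cycle-length optimization, and the helpers `moves` (environment input to candidate next states) and `is_bad` (non-causal or overflowed states), as well as the example graph, are hypothetical.

```python
from enum import Enum

class Result(Enum):
    SUCCESS = "success"
    FAIL = "fail"
    LOOP = "loop"

def explore(state, moves, is_bad, stack, fail_pool):
    """Return (Result, choices); choices maps state -> {env_input: next_state}."""
    if is_bad(state) or state in fail_pool:
        return Result.FAIL, {}
    if state in stack:
        return Result.LOOP, {}           # back edge to a state on the explored path
    stack.append(state)
    choices = {state: {}}
    for env_input, candidates in moves(state).items():
        for nxt in candidates:           # try each alternative the interface has
            result, sub = explore(nxt, moves, is_bad, stack, fail_pool)
            if result is not Result.FAIL:
                choices[state][env_input] = nxt
                choices.update(sub)      # keep only choices from surviving branches
                break
        else:
            # Every alternative fails for this input: the state is uncontrollable.
            stack.pop()
            fail_pool.add(state)
            return Result.FAIL, {}
    stack.pop()
    return Result.SUCCESS, choices

# Hypothetical product machine: "s3" can only reach the bad state.
GRAPH = {
    "s0": {"go": ["s3", "s1"]},
    "s1": {"a": ["s0"], "b": ["s2"]},
    "s2": {"x": ["s0"]},
    "s3": {"e": ["bad"]},
}

def moves(state):
    return GRAPH[state]

def is_bad(state):
    return state == "bad"

fail_pool = set()
result, choices = explore("s0", moves, is_bad, [], fail_pool)
```

Choices are returned rather than stored globally so that decisions made inside a branch that is later abandoned are discarded when the search backtracks. Successful states are deliberately not cached here, so a state may be explored more than once.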
6. EXPERIMENTAL RESULTS AND DISCUSSION 
In the first experiment, a non-stallable serial protocol is
interfaced to a 4-phase request acknowledge protocol (see Figure 3).
In the second experiment, the same protocols are interfaced to one
another as in the first experiment, only this time the order in
which the producer sends the data symbols to the consumer is
reversed. In the third experiment, the 4-phase request acknowledge
protocol is interfaced to the non-stallable serial protocol. This
involves translating the 4-phase request acknowledge protocol input
signal ack (as observed by the interface) into an output, and the
output signal req into an input. The non-stallable serial protocol
transfers one data symbol at a time without interruption until the
entire token has been transferred. The start of the next transaction
is regulated by the interface. The request acknowledge side uses a
bus four times larger in size which is regulated by the
request-acknowledge signals.
Figure 6: Explore Function Pseudo Code
As expected, the resulting controller FSMs are cyclic. For the 
same examples, Passerone's algorithm [PRSV98] produced acyclic 
FSMs because only a single transaction is considered. The main 
results are summarized in figure 7. The first experiment contains 
many states because the output of the data is concurrent to the in- 
put of the data. The possibility to stop and resume the protocol 
anywhere during the data transfer gives rise to a large number of 
states. Also note that concurrency leads to an exponential increase 
in the number of state explorations with increasing data type size. 
This problem can be overcome and is currently being dealt with. 
One way to significantly reduce the number of explorations 
is to record the explored successful product states along with their 
minimum cycle length whilst performing the recursive search. In 
this way re-exploration of the same states can be avoided. Despite 
the current large number of explorations, the generated controller is 
optimal in terms of bandwidth. 
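The effect of recording explored states together with their minimum length can be illustrated with a generic memoized search. The sketch assumes an acyclic graph and a simple "distance to a final state" measure, so it stands in for, rather than reproduces, the cycle-length computation on the cyclic product machine; the graph and all names are hypothetical.

```python
def min_length(state, successors, final, memo, counter):
    """Minimum number of transitions from `state` to any state in `final`.

    memo: dict of recorded results, or None to disable recording.
    counter: one-element list counting how many states are expanded.
    """
    if state in final:
        return 0
    if memo is not None and state in memo:
        return memo[state]               # reuse the recorded minimum length
    counter[0] += 1
    best = 1 + min(min_length(n, successors, final, memo, counter)
                   for n in successors[state])
    if memo is not None:
        memo[state] = best
    return best

# Hypothetical acyclic graph: a chain of N "diamonds" with two branches each.
N = 10
successors = {}
for i in range(N):
    successors[("n", i)] = [("a", i), ("b", i)]
    successors[("a", i)] = [("n", i + 1)]
    successors[("b", i)] = [("n", i + 1)]
final = {("n", N)}

plain_count, memo_count = [0], [0]
plain = min_length(("n", 0), successors, final, None, plain_count)
cached = min_length(("n", 0), successors, final, {}, memo_count)
```

Both calls return the same length, but the unrecorded search expands a number of states that grows exponentially with N, while the recorded search expands each state once.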
In the second experiment, the interface has to wait until the end
of the input phase to begin the output phase. Less choice leads to
fewer states than in the first experiment. Also note that there are
far fewer state explorations. This is because all the illegal
states are found to be close to the initial state. The existence of
a protocol violation in the third experiment reduces the number of
choices for the interface machine, which results in far fewer states
than the first machine. The resulting interface is non-optimal in
terms of the bandwidth. To optimize the bandwidth, the capacity of
the interconnecting buffer was increased to store two tokens. The
number of states increases with increasing buffer size. This is
because increasing the buffer size increases the state space and
also reduces the prospect for a protocol violation.
7. CONCLUSIONS AND FUTURE WORK 
We presented a novel extension to Passerone's algorithm [PRSV98]
which supports multiple transactions. The algorithm finds the
optimal solution in terms of the bandwidth. Still, there are a
number of interesting extensions. An obvious extension to this work
is to devise a DPSM formalism to synthesize optimal control logic
for different register configurations, to allow more complex data
path synthesis. Another interesting extension would be to extend
the approach to optimize the interface for data transfer latency
and to determine a means to generate the best interface in terms of
both data transfer latency and bandwidth.
Figure 7: Experimental Results
8. REFERENCES
[AM91] Janaki Akella and Kenneth McMillan. Synthesizing
Converters between Finite State Protocols. In Proc. of the
International Conference on Computer Design (ICCD'91),
pages 410 - 413, 1991.
[Bor88] Gaetano Borriello. A New Specification Methodology and its
Application to Transducer Synthesis. PhD Thesis UCB/CSD 88/430,
University of California, Berkeley, 1988.
[PCPK00] Bong-Il Park, Hoon Choi, InCheol Park, and Chong-Min
Kyung. Synthesis and Optimization of Interface Hardware between
IP's Operating at Different Clock Frequencies. In Proc.
International Conference on Computer Design (ICCD'00),
pages 519 - 524, 2000.
[PRSV98] R. Passerone, J. Rowson, and A. Sangiovanni-Vincentelli.
Automatic Synthesis of interfaces between incompatible protocols.
In Proceedings of the Design Automation Conference (DAC'98),
pages 8 - 13, 1998.
[SG02] Dongwan Shin and Daniel Gajski. Interface Synthesis from
Protocol Specification. Technical Report CECS-02-13, University of
California, Irvine, April 2002.
