System-level partitioning groups processes and variables in the system specification into modules representing chips and memories. Communication between the modules is represented by abstract communication channels, which are merged and implemented as a bus to minimize interconnect. Given a set of channels, bus generation synthesizes the bus structure by trading off the width of the bus against the performance of the processes communicating over it. For each channel, we describe a method to generate protocols that specify the mechanism of data transfer over the bus. The protocol generation presented in this paper results in a refined system specification that is simulatable. Both bus generation and protocol generation are demonstrated on detailed examples.
Introduction
A system can be viewed as a set of processes which communicate with each other over channels. A channel is an abstract communication medium over which two processes can transfer data. System partitioning [1] may group processes and variables in the system specification into modules. The set of tasks performed to implement communication between the modules in a system is collectively defined as Interface Synthesis. For example, process A in Figure 1 is mapped to two system modules after system partitioning. The variables MEM and STATUS, which were originally declared within process A, are now mapped to processes A1 and A2 respectively in a different module. Process A reads and writes data to the variable MEM over channels ch1 and ch2 respectively. STATUS is accessed over channel ch3. In addition, to minimize the interconnect cost between the system modules, system partitioning may group channels to be implemented as a bus.

A channel is a virtual entity and free of any implementation details. After interface synthesis, a set of channels is implemented as a bus consisting of a set of wires and a protocol defining some behavior over the wires. Given a group of communication channels to be implemented as a bus, the goal of interface synthesis is to synthesize the bus structure and protocol with a view to minimizing the interconnect and maximizing the performance of the processes communicating over the bus.

This work at U.C. Irvine was supported by the Semiconductor Research Corporation (grant #92-DJ-146).
31st ACM/IEEE Design Automation Conference. Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying it is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission. © 1994 ACM 0-89791-653-0/94/0006 3.50

Most previous research efforts that have examined interfaces can be classified into two broad categories. In the first category are research efforts which have incorporated interface timing constraints into the scheduling of operations and events within the process during high-level synthesis. These include ISYN [2], CONSPEC [3], and Interface Matching [4]. The second set of approaches addresses issues relating to interfacing standard components which have incompatible protocols. Synthesis of transducers was presented in [5, 6] where, given timing diagrams of two incompatible interfaces, a template matching strategy is used to assign hardware to behavior. In [7], the two behaviors being interfaced were specified as Verilog-based FSMs, and a cross product of the two FSMs was optimized to obtain the description of the transducer.
While channel merging was addressed in [4], it was assumed that all the channels transferred data of identical bitwidths. In our case, we wish to synthesize a bus to implement communication between different processes communicating over abstract channels created as a result of system partitioning. In addition, we seek to examine issues relating to design parameters external to the process (such as the number of pins or data transfer rates, and their effects on the performance of processes). None of the above approaches addresses such issues.
In Section 2, we formulate the interface synthesis problem. Bus and protocol generation are discussed in Sections 3 and 4. Results of our experiments with interface synthesis are presented in Section 5. Finally, we present our conclusions and plans for future work.
Problem Formulation
In implementing a group of channels, the goal of interface synthesis should be to synthesize a bus which has 100% utilization, i.e., the bus is never idle. Consider two channels A and B which transfer data as shown in Figure 2. For simplicity, assume that the 4-second time interval shown in the figure is representative of the data transfer over the lifetimes of the processes which communicate over channels A and B.
The channel average rate, AveRate(C), is defined as the rate at which data is sent over channel C over the lifetime of the processes communicating over it. Channels A and B in Figure 2 have average rates of 4 and 12 bits/second respectively. If channels A and B are merged into a single bus AB, then the bus needs to send data at the rate of 16 bits/second to satisfy the data transfer requirements of the two original channels A and B. The data items transferred over the channels have been labeled to make it easier to associate them with the data transferred over the shared bus. Consider the data item labeled B2, transferred at t=1 second in the original channel B, which is now transferred on bus AB at t=1.5 seconds. While individual data transfers may be delayed due to bus access conflicts, the bits transferred over the individual channels before channel merging are still sent over the shared bus in the same amount of time.
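The arithmetic behind this example can be written out as a small sketch. The function names are illustrative only, and the bit totals (16 and 48 bits over the 4-second window) are assumptions chosen to reproduce the 4 and 12 bits/second rates quoted above; Figure 2 itself is not reproduced here.

```python
from typing import Sequence


def average_rate(total_bits: int, lifetime_seconds: float) -> float:
    """Bits sent over a channel divided by the lifetime of its processes."""
    if lifetime_seconds <= 0:
        raise ValueError("lifetime must be positive")
    return total_bits / lifetime_seconds


def required_bus_rate(channel_rates: Sequence[float]) -> float:
    """Rate a shared bus must sustain to carry every merged channel."""
    return sum(channel_rates)


# Assumed totals: 16 and 48 bits in the 4-second window, so that the rates
# match the 4 and 12 bits/second stated for channels A and B.
rate_a = average_rate(16, 4.0)
rate_b = average_rate(48, 4.0)
rate_ab = required_bus_rate([rate_a, rate_b])
print(rate_a, rate_b, rate_ab)
```

Under these assumed totals the merged bus AB must sustain 16 bits/second, which matches the figure given in the text.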
In synthesizing the bus AB in Figure 2, we take advantage of the fact that the individual channels will not always be transferring data. We attempt to utilize the idle time slots of one channel for data transfers of other channels by synthesizing a bus over which data is always being transferred at a constant rate.
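If the channels were transferring data at a certain average rate before being merged into a bus, they should be able to transfer the data over the bus at the same average rate. This can be achieved if the data transfer rate of bus B, BusRate(B), is greater than or equal to the sum of the individual channel average rates, as with the 16 bits/second bus AB that serves channels with rates of 4 and 12 bits/second. Thus,

$$\mathrm{BusRate}(B) \geq \sum_{C \in B} \mathrm{AveRate}(C) \qquad (1)$$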
Given a set of channels to be implemented as a single bus and a set of constraints, interface synthesis consists of two tasks: (1) bus generation, which determines the minimum-cost buswidth that satisfies the data transfer rates of the individual channels, and (2) protocol generation, which generates the protocols for data transfer over the bus for each channel.
Bus Generation
The bus generation algorithm determines the width of the bus required to implement a group of channels. The bus generation algorithm was presented in [8] .
Intuitively, the algorithm examines a range of possible buswidths. For each buswidth, the bus rate and the channel average rates are computed. If the bus rate satisfies Equation 1, that is, it is no smaller than the sum of the channel average rates, then we have a feasible bus implementation. From the set of feasible bus implementations, each corresponding to a different buswidth, we select the one which has the least cost. Briefly, the algorithm consists of five steps:
(1) Determine buswidth range: The smallest buswidth examined by the Bus Generation algorithm is 1, and the largest buswidth examined is equal to the largest size of message sent by any channel.
For each buswidth, CurrBW, in the range determined above, repeat steps 2 and 3.
(2) Compute the bus rate: The bus rate depends on the delay of the protocol that will be used to implement the data transfer. For a handshake protocol, we assume that the delay is two clock cycles; thus,

$$\mathrm{BusRate}(B) = \frac{\mathrm{CurrBW}}{2 \times \mathrm{ClockPeriod}} \qquad (2)$$

(3) Determine average rates for each channel: For all channels, determine the average rate for the current buswidth being examined. Estimation of channel average rates was presented in [8].
If the bus rate, BusRate(B), is greater than or equal to the sum of the average rates of all the channels, then we have a feasible implementation for the bus. Go to Step 4. If the bus rate, BusRate(B), is less than the sum of the channel average rates, the bus will not be able to satisfy the performance requirements of the channels. Go to Step 2 and try the next buswidth in the range determined in Step 1.
(4) Determine the cost function for CurrBW: For a given set of channels which have been grouped together to be implemented as a single bus, the designer can specify constraints and relative weights for the buswidth and for the minimum/maximum values of the channel average and peak rates. The cost of a bus implementation is calculated as the sum of the squares of the violations of each of the constraints, weighted by the relative weights specified for them.
(5) Select the buswidth: If one or more feasible solutions were found at the end of Step 3, select the buswidth corresponding to the one with the least cost determined in Step 4. If no feasible solution was found at the end of Step 3 for any of the buswidths examined, then an implementation for the group of channels is not possible. Any implementation for such a group of channels would progressively delay the processes communicating over the bus. Such a situation can arise when several channels that have very high average rate requirements are grouped together to be implemented as a bus. One solution to this problem would be to split the group of channels further, so that it is implemented by more than one bus.
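The five steps can be sketched roughly as follows. This is a minimal illustration, not the algorithm of [8]: the `Constraint` class, the metric naming scheme, and the `toy_ave_rate` estimator are all hypothetical stand-ins, peak-rate constraints are omitted, and a buswidth is treated as feasible when the bus rate is not less than the summed average rates.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Constraint:
    """Designer constraint on one metric (hypothetical representation)."""
    metric: str                      # "buswidth" or "ave_rate:<channel>"
    weight: float
    minimum: Optional[float] = None
    maximum: Optional[float] = None

    def violation(self, value: float) -> float:
        """Distance by which `value` falls outside [minimum, maximum]."""
        if self.minimum is not None and value < self.minimum:
            return self.minimum - value
        if self.maximum is not None and value > self.maximum:
            return value - self.maximum
        return 0.0


def bus_rate(width: int, clock_period: float) -> float:
    """Equation 2: a handshake moves `width` bits every two clock cycles."""
    return width / (2.0 * clock_period)


def generate_bus(
    message_bits: Dict[str, int],
    ave_rate: Callable[[str, int], float],
    constraints: List[Constraint],
    clock_period: float,
) -> Optional[Tuple[int, float]]:
    """Return (buswidth, cost) of the cheapest feasible bus, or None."""
    best: Optional[Tuple[int, float]] = None
    # Step 1: widths from 1 up to the largest message sent by any channel.
    for width in range(1, max(message_bits.values()) + 1):
        rate = bus_rate(width, clock_period)                       # Step 2
        rates = {ch: ave_rate(ch, width) for ch in message_bits}   # Step 3
        if rate < sum(rates.values()):
            continue  # infeasible: the bus cannot keep up at this width
        # Step 4: weighted sum of squared constraint violations.
        metrics = {"buswidth": float(width)}
        metrics.update({f"ave_rate:{ch}": r for ch, r in rates.items()})
        cost = sum(c.weight * c.violation(metrics[c.metric]) ** 2
                   for c in constraints)
        # Step 5: keep the cheapest feasible width seen so far.
        if best is None or cost < best[1]:
            best = (width, cost)
    return best


def toy_ave_rate(channel: str, width: int) -> float:
    """Hypothetical estimator: 23-bit messages separated by fixed compute time."""
    compute_clocks = {"ch1": 10, "ch2": 20}[channel]
    transfers = -(-23 // width)          # ceiling division
    return 23 / (compute_clocks + 2 * transfers)


result = generate_bus(
    message_bits={"ch1": 23, "ch2": 23},
    ave_rate=toy_ave_rate,
    constraints=[
        Constraint("buswidth", weight=1.0, maximum=8),
        Constraint("ave_rate:ch1", weight=100.0, minimum=1.5),
    ],
    clock_period=1.0,                    # rates are then in bits/clock
)
print(result)
```

With these made-up numbers the sketch selects a buswidth of 8: narrower buses miss the assumed rate target for ch1 by a wider margin, while wider buses pay a growing penalty for exceeding the width limit. A return value of `None` corresponds to the case where the channel group should be split across more than one bus.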
Protocol Generation
Once an appropriate buswidth has been selected to implement the channel group, protocol generation defines the exact mechanism of data transfer over the bus. A bus consists of three sets of wires.
(1) Data lines are used to send data over the bus. The number of data lines (i.e., the buswidth) can be determined by the bus generation algorithm or specified by the system designer.
(2) [...] is assigned the ID "01", and so on.
[Figure labels: channel IDs "00", "10", "11", "01"]
3. Bus structure and procedure definition: The structure determined for the bus (i.e., the data, control and ID lines) is defined in the specification. For each channel mapped to the bus, appropriate send/receive procedures are generated, encapsulating the sequence of assignments to the bus control, data and ID lines to execute the data transfer.
4. Update variable references: References to a variable that has been assigned to another system component by system partitioning must be updated in behaviors that were originally referencing it directly. Accesses to variables are replaced by the send and receive procedure calls corresponding to the channel over which the variable is accessed. For example, in Figure 3, behavior P writes the value "32" to variable X directly. Channel CH0 represents the write to variable X. The statement "X <= 32" is replaced by the send procedure call "sendCH0(32)" as shown in Figure 5. The statement "MEM(60) := COUNT" in behavior Q is updated to "sendCH3(60, COUNT)", indicating that the value in COUNT is to be written to address 60 of array MEM.

5. Generate variable processes: In order to obtain a simulatable system specification, a separate behavior is created for each group of variables accessed over a channel. Appropriate send and receive procedure calls are included in the behavior to respond to access requests to the variable over the bus. In Figure 3, the variables X and MEM were assigned to different system components as shown by the dashed lines. In Figure 5, behaviors Xproc and MEMproc have been created for these two variables.
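The effect of steps 3 to 5 can be illustrated with a small sketch. It is not the generated specification of Figure 5: the bus is modeled as a plain list of (ID, data) words rather than as signals with a handshake, and the 8-bit buswidth, the 7-bit address, the 16-bit data word, the ID code "11" for CH3, and the low-bits-first word order are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Word = Tuple[str, int]  # (value on the ID lines, value on the data lines)

# Assumed sizes and ID code; none of these come from Figure 3 or Figure 5.
ADDR_BITS, DATA_BITS, BUS_WIDTH = 7, 16, 8
CH3_ID = "11"


def send(bus: List[Word], channel_id: str, message: int,
         message_bits: int, width: int) -> None:
    """Place the message on the bus one bus-width word at a time."""
    mask = (1 << width) - 1
    for offset in range(0, message_bits, width):
        bus.append((channel_id, (message >> offset) & mask))


def receive(bus: List[Word], channel_id: str,
            message_bits: int, width: int) -> int:
    """Take this channel's words off the bus and reassemble the message."""
    message = 0
    for offset in range(0, message_bits, width):
        word_id, word = bus.pop(0)
        if word_id != channel_id:
            raise RuntimeError("word belongs to another channel")
        message |= word << offset
    return message & ((1 << message_bits) - 1)


def send_ch3(bus: List[Word], address: int, value: int) -> None:
    """Stand-in for sendCH3(address, value): pack address and data, then send."""
    send(bus, CH3_ID, (value << ADDR_BITS) | address,
         ADDR_BITS + DATA_BITS, BUS_WIDTH)


def mem_proc(bus: List[Word], mem: Dict[int, int]) -> None:
    """Stand-in for the MEMproc variable process: serve one write request."""
    message = receive(bus, CH3_ID, ADDR_BITS + DATA_BITS, BUS_WIDTH)
    mem[message & ((1 << ADDR_BITS) - 1)] = message >> ADDR_BITS


bus: List[Word] = []
mem: Dict[int, int] = {}
COUNT = 1234
send_ch3(bus, 60, COUNT)   # takes the place of "MEM(60) := COUNT"
mem_proc(bus, mem)
print(mem)
```

Because the behavior only calls `send_ch3`, switching to a different protocol would mean rewriting `send` and `receive` while leaving the call site untouched.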
Experiments and Results
The interface synthesis presented in this paper has been implemented and integrated with the system-level partitioner presented in [1]. The partitioner groups the variables, behaviors and channels in a system specification into memories, modules, and buses respectively. We performed several experiments involving the application of the bus generation algorithm to synthesize module interfaces in an answering machine, an Ethernet network coprocessor and a fuzzy logic controller [9]. We present the results for the fuzzy logic controller (FLC) of Figure 6 in greater detail and use it to illustrate the Bus Generation algorithm.

The FLC has two inputs, which sense the temperature and the humidity in a room. Depending on these two inputs, the FLC evaluates 4 rules to compute the output signal which determines the operation of the air conditioning system. System partitioning mapped the memories (array variables in the description) which store the membership functions and fuzzy logic rules in the FLC to a separate chip. Processes EVAL R3 and CONV R2 of the FLC access the array variables trRu0 and trRu2 respectively over communication channels ch1 and ch2, which have been merged to be implemented as a single bus.

Figure 7 shows how the performance of the two processes transferring data over bus B is affected by the various bus widths that can be used to implement the bus. For each bus width, the protocol required for each channel in the bus was generated. A performance estimator [10] was then used to obtain the execution times of the processes. Clearly, as the bus width increases, the execution time for the processes decreases.
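A rough transfer count shows why wider buses help only up to a point. The sketch below assumes that each message on ch1 and ch2 is 23 bits long (the 16 data bits plus 7 address bits these channels carry) and that every bus transfer costs two clock cycles, as assumed for the handshake protocol; the function name is illustrative and the process computation time is ignored.

```python
def clocks_per_message(message_bits: int, width: int,
                       clocks_per_transfer: int = 2) -> int:
    """Clock cycles needed to move one message over a bus of the given width."""
    transfers = -(-message_bits // width)  # ceiling division
    return transfers * clocks_per_transfer


MESSAGE_BITS = 16 + 7  # data bits plus address bits per message

for width in (4, 8, 16, 23, 32):
    print(width, clocks_per_message(MESSAGE_BITS, width))
```

Once the bus is as wide as the message, a single transfer suffices, so extra lines no longer shorten the communication time.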
Since the two channels each transfer 16 bits of data and 7 bits of address, bus widths greater than 23 pins do not yield any further improvement in performance, as the data transfer cannot be parallelized any further. If any performance constraints exist for these processes, the designer can select an appropriate buswidth for implementing the bus. For example, by examining Figure 7, if process CONV R2 has a maximum execution time constraint of 2000 clocks, then only buswidths greater than 4 bits will be considered by the designer during interface synthesis for implementing the bus.

To demonstrate how the designer can exercise control over interface synthesis by specifying appropriate constraints and weighing them accordingly, the Bus Generation algorithm was applied with three different sets of constraints. Figure 8 shows the bus constraints, selected bus widths, and corresponding bus rates for the three bus designs A, B and C. In each case, by specifying and weighing the constraints appropriately, the designer can implement the channel group with a different buswidth. For example, in design A of Figure 8, the designer has specified a minimum peak rate for channel ch2 of 10 bits/clock. The minimum cost function corresponds to a bus width of 20, which is then used to implement the bus. The reduction in the number of data lines, compared to implementing the two channels separately (2 × 23 = 46 lines), is 56%. In all three examples, this reduction has been achieved without sacrificing any performance of the processes.

Conclusions and Future Work

Communication channels in system-level synthesis are often grouped together to reduce the interconnect cost at the module boundaries in a system. In this paper we have presented a method to generate protocols for implementing a group of communication channels as a single bus.
The protocol generation presented in this section has several advantages. First, the refined specification is simulatable, so the design functionality after insertion of buses and communication protocols can be verified. Second, by encapsulating data transfer over the bus in terms of send and receive procedures, the description of the behavior remains relatively uncluttered compared to the situation that would arise if we were to insert the assignments for the control and data lines at each communication point in the behavior. Finally, if at a later stage another communication protocol is selected for communication over the bus, only the bus declaration and the send and receive procedures need be changed. The descriptions of the behaviors in the system, including the send and receive procedure calls, remain unchanged.
The work presented in this paper can be extended in several ways. We plan to study ways in which two or more channels may transfer data simultaneously over the same bus by utilizing different sets of data and control lines. This would be useful in cases when no feasible solution can be found in the range of buswidths examined. Incorporating protocols other than a full handshake needs to be studied. In addition, further work is needed to examine the effect of bus arbitration delays on the performance of processes.
