Comparing RTL and Behavioral Design Methodologies
in the Case of a 2M-Transistor ATM Shaper
Imed Moussa (1), Zoltan Sugar (1), Rodolph Suescun (2), Mario Diaz-Nava (3),
Marco Pavesi (4), Salvatore Crudo (4), Luca Gazi (4) and Ahmed Amine Jerraya (1)
(1) TIMA laboratory, 46 Av. Felix Viallet, 38031 Grenoble, France
(2) AREXSYS, Grenoble, France
(3) STMicroelectronics, 850 rue Jean Monet, 38921 Crolles, France
(4) Italtel, 20019 Settimo Milanese, Italy
email : Imed.Moussa@imag.fr
ABSTRACT
This paper describes the experience gained and the lessons learned during
the design of an ATM traffic shaper circuit using behavioral synthesis.
The experiment compares the results of two parallel design flows starting
from the same specification. The first flow used a classical design method
based on RTL synthesis. The second was based on behavioral synthesis. The
experiment showed that behavioral synthesis can produce an efficient design
in terms of gate count and timing while bringing a threefold reduction in
design effort compared to the RTL design methodology.
1 INTRODUCTION
Asynchronous Transfer Mode (ATM) devices have a short product life cycle
due to the constant evolution of the ATM recommendations. Furthermore, the
complexity of these devices is growing exponentially due to the various
demands of system integration of this new telecommunication technology. The
success of the whole ATM technology (ICs, boards, systems, ...) depends on
the ability to meet strong time-to-market constraints. To satisfy this
stringent requirement, more productive design methodologies are required.
Current RTL design methodologies are tedious and time consuming. The
alternative is to start the design at a higher level and use high-level
synthesis to automate the generation of the lower description levels, such
as RTL and gate level. The benefits of this approach are improved design
quality and a shorter design cycle.
This paper demonstrates the efficiency of behavioral synthesis for the
design of an ATM traffic Shaper circuit and describes the lessons learned
from this development. The complexity of this ATM function makes it a good
example to exercise the high-level synthesis design flow.
[Figure 1 diagram: both flows start from the same specification. The classic
RTL flow goes through RTL hand coding, RTL VHDL, RTL synthesis, gates and
back-end P&R. The behavioral synthesis flow goes through behavioral VHDL,
behavioral synthesis, RTL VHDL, RTL synthesis, gates and back-end P&R. Each
VHDL level is simulated with the test.]
Figure 1: Comparison between the RTL and the behavioral design flow.
Three teams collaborated in this work. A system design team was in charge
of writing the Shaper specifications. The second team followed the classical
synthesis flow: they hand coded the specifications at RT level, used RTL
synthesis to generate the gate-level netlist, and then used back-end tools.
The third team used a behavioral synthesis flow: they coded the
specifications at the behavioral level, used behavioral synthesis to
generate RTL models automatically, and then followed the rest of the
classical design flow. Figure 1 shows the flows used by each team. The three
teams worked in parallel and were located in three different places.
The analysis of the results obtained from the two design flows shows that
the high-level synthesis methodology handles complexity and time-to-market
more efficiently than manual RTL design. In terms of gate count, both flows
are almost equivalent. However, in terms of design time, the behavioral
design is 3 times faster than the RTL one. Additionally, the RTL model
produced automatically by behavioral synthesis is 10% smaller than, and
almost as readable as, the manual RTL model.
Starting from the behavioral model brings other advantages that are very
useful in the design of complex systems, such as fast functional
verification (in terms of simulation time and functionality abstraction),
fewer errors at lower design levels, and easier code maintenance thanks to
its smaller size compared with an RTL model. All these features contribute
strongly to shrinking the design cycle, which helps to meet time-to-market
constraints and increases productivity.
____________________________
Permission to make digital/hardcopy of all or part of this work for personal or
classroom use is granted without fee provided that copies are not made or distributed
for profit or commercial advantage, the copyright notice, the title of the publication
and its date appear, and notice is given that copying is by permission of ACM, Inc.
To copy otherwise, to republish, to post on servers or to redistribute to lists, requires
prior specific permission and/or a fee.
DAC 99, New Orleans, Louisiana
(c) 1999 ACM 1-58113-109-7/99/06..$5.00
This paper focuses on comparing the results of behavioral and RTL
synthesis. It does not discuss the specific tools used for this experiment,
because their specific characteristics are not relevant to the contribution
of this paper. For more information about behavioral synthesis flows the
reader is referred to [1], [2], and for other comparisons between RTL and
behavioral synthesis see [3], [4], [5], [6].
The remainder of this paper is organized as follows: section 2 gives an
overview of the global architecture and the specifications of the ATM
Shaper. Section 3 presents the behavioral specification of the traffic
Shaper implementation. Section 4 compares the results of the behavioral and
RTL methodologies. Section 5 summarizes the most important differences
between RTL and behavioral modeling styles. Finally, the conclusions of this
work are provided in section 6.
2 SHAPER ARCHITECTURE
In ATM telecommunication systems, the network operator must be
able to ensure minimum congestion conditions. One of the basic
problems faced in the design of efﬁcient trafﬁc and congestion con-
trol schemes is related to the wide variety of services with different
trafﬁc characteristics and quality of service (QoS). To solve this
problem, trafﬁc shaping is one of the functions needed by an ATM
system in order to satisfy the QoS and avoid congestion at the same
time. This function is implemented by the ATM Shaper presented
in this paper. The complexity of the overall system is about 2M
transistors including memories.
2.1 Traffic Shaping Function
The main objective of trafﬁc shaping is the smoothing of the traf-
ﬁc characteristics. Trafﬁc bursts are buffered and delayed, and the
trafﬁc is transmitted respecting the connection priorities (accord-
ing to the trafﬁc type) and using the silent periods. This method
decreases the bandwidth requirements for the source, as well as the
probability of congestion conditions in switches due to the bursty
trafﬁc. The data transfer from the user to the network should be
controlled by the Shaper mechanism implemented in the terminal
adapter to shape the trafﬁc according to appropriate trafﬁc charac-
teristics. The main features of the ATM Shaper presented here are
the following: the bandwidth is 155 Mbit/s supporting all types of
trafﬁc such as Variable Bit Rate (VBR), Constant Bit Rate (CBR),
Unspeciﬁed Bit Rate (UBR) and Available Bit Rate (ABR) [7] and
up to 4K connections can be simultaneously managed.
2.2 Functional description of the Shaper architecture
In order to manage the traffic, the Shaper makes use of two basic concepts:
permits and calendars. A permit is a data structure that describes a
connection. It holds pointers to the ATM cells transmitted through the
connection. These permits are stored in circular memories called calendars.
Figure 2 shows a functional block diagram of the Shaper. The system to be
designed is surrounded by a dotted line. The Shaper architecture is composed
of 4 interfaces, 4 main functional blocks and 7 memories containing the
traffic parameters. A brief description of these elements follows:
The interfaces are:
• The external traffic description memory interface is used by
  the Shaper to read the contracted traffic parameters from the
  external traffic description memory (TDM).
• The microprocessor interface exchanges control and status
  information with the microprocessor. The microprocessor
  configures the ABR parameters and can access all the internal
  memories, in read and write modes, for setup and diagnostic
  purposes.
• The ATM Adaptation Layer (AAL) interface indicates to the
  Shaper when a new packet has arrived, and the Shaper gives
  the AAL the Queue IDentifier (QID) of the cell to be sent.
• The ATM interface is used to exchange data with the ATM
  layer and to manage ABR connections properly, providing
  the ATM layer with the QID of the cells belonging to the
  ABR connections.
[Figure 2 diagram: the main Shaper functions (SENDER, SCHEDULER,
ABR_DECISOR with DECISOR and ABR Management, Time_Unit) surrounded by the
AAL Intf, ATM Intf, TDM Intf and Microproc_Interface, with the memories RT1,
RT2, NRT, GT (GR_Table), UBR (UBR_cal), CSM and TDM.]
Figure 2: Shaper functional block diagram
The internal memories used for the Shaper management are:
• Two real time (rt) calendars (RT1 and RT2) to handle the rt
  traffic.
• The non real time (nrt) calendar (NRT) to handle the nrt
  traffic.
• The UBR-calendar to handle the UBR traffic.
• The Group Table (GT) memory is used to resolve the scheduling
  problem associated with the permits inside the NRT.
• The Current State Memory (CSM) contains the contracted
  timing parameters of each connection.
The only external memory used for the Shaper management is
the TDM containing the traffic parameters.
The main functional blocks are:
• The Sender reads the permits from the calendars (RT1, RT2,
  NRT and UBR) and sends them to the Decisor. Then the
  Sender updates the pointers to point at the next location to be
  read from the calendars.
• The Scheduler receives from the Decisor the QID and the
  time stamp indicating in which memory location the permit
  should be stored. The Scheduler puts the permit in the NRT
  calendar with the help of the Group Table memory. Then,
  it updates the GT memory if required.
• The ABR-Decisor performs the following tasks: it validates
  the permits read by the Sender, sends the right one to the
  proper interface (ATM or AAL), then calculates for each non
  real time connection the new position in the calendar of the
  next permit. This block also performs the ABR management
  in the case of ABR traffic.
• The timing unit block is used to control ABR traffic
  parameters such as the Allowed Cell Rate (ACR) Decrease Time
  Factor (ADTF) and Trm.
3 BEHAVIORAL SPECIFICATION
Behavioral synthesis is the process of refining a specification into
an RTL model [8], [9]. The main difference between behavioral
and RTL models is related to timing. At the RT level, a description
details the behavior of the design at the clock-cycle level.
A behavioral model is generally specified in terms of computation
steps. The synchronization between the behavioral model and the
external world is performed using specific protocols. For example,
in VHDL, the basic behavioral description is generally made
of processes communicating with the external world through wait
statements. In this case the computation step corresponds to the
code between two wait statements.
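The notion of a computation step can be illustrated with a minimal sketch. The entity below is not taken from the Shaper: the name step_demo, its ports and the addition it performs are hypothetical, chosen only to show a process whose computation steps are delimited by wait statements. How many clock cycles each step finally occupies is left to behavioral synthesis.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical example entity (not part of the Shaper design).
entity step_demo is
  port (
    clk   : in  std_logic;
    start : in  std_logic;
    a, b  : in  unsigned(7 downto 0);
    sum   : out unsigned(8 downto 0);
    done  : out std_logic
  );
end entity step_demo;

architecture behavioral of step_demo is
begin
  main : process
    variable acc : unsigned(8 downto 0);
  begin
    -- Protocol: wait for a clock edge on which start is asserted.
    done <= '0';
    wait until rising_edge(clk) and start = '1';

    -- Computation step 1: the code between this wait and the next one.
    acc := resize(a, 9) + resize(b, 9);
    wait until rising_edge(clk);

    -- Computation step 2: publish the result and flag completion.
    sum  <= acc;
    done <= '1';
    wait until rising_edge(clk);
  end process main;
end architecture behavioral;
```

The process has no sensitivity list; it suspends only at the wait statements, which is what distinguishes this style from a clocked RTL process.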
The main task of behavioral synthesis is to split these computation
steps into clock cycles in order to produce an RTL model [9],
[8], [2]. Of course, the behavioral specification of complex systems
like the ATM Shaper may be composed of interconnected blocks.
The specification phase is then generally composed of two steps:
partitioning the system into separate modules, and describing the
behavior of these modules and their interconnections.
This section discusses the behavioral description of the Shaper.
Although we used a speciﬁc behavioral synthesis tool, the follow-
ing discussion may apply to other existing behavioral approaches.
3.1 System partitioning
In order to implement a behavioral speciﬁcation suited to behav-
ioral synthesis, a complex design needs to be partitioned into sub-
systems, or modules in order to reduce its complexity so that each
module can be described and synthesized efﬁciently [10].
The partitioning of the system was driven by the functionality
of the Shaper and the timing constraints. The functional block
diagram was used to define the main components of the architecture.
The main timing constraint states that the processing of an ATM
cell cannot exceed 106 clock cycles at a 40 MHz operating frequency.
This computation requires the activation of several components of
the architecture. In order to meet this constraint we were obliged
to introduce some parallelism into the architecture.
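To put this budget in perspective, the arithmetic can be worked out as follows. The comparison with the cell time is an assumption: the text does not say how the 106-cycle figure was derived. The calculation uses the standard 53-byte ATM cell and the 155 Mbit/s bandwidth quoted in section 2.1.

```latex
\begin{align*}
T_{\mathrm{clk}}    &= \frac{1}{40\ \mathrm{MHz}} = 25\ \mathrm{ns} \\
T_{\mathrm{budget}} &= 106 \times 25\ \mathrm{ns} = 2.65\ \mu\mathrm{s} \\
T_{\mathrm{cell}}   &= \frac{53 \times 8\ \mathrm{bits}}{155\ \mathrm{Mbit/s}}
                     = \frac{424}{155}\ \mu\mathrm{s} \approx 2.74\ \mu\mathrm{s}
\end{align*}
```

Under this assumption, one cell time corresponds to about 109 clock cycles at 40 MHz, so the 106-cycle budget is consistent with processing one cell per cell time.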
[Figure 3 diagram, "Global Shaper Circuit Architecture": a Top_Controller and
the Scheduler, Decisor, Sender, ABR Management, Time_Unit and Microproc
Interface blocks, linked by hand-shake signals and connected through a
communication network of communicating buses to the internal/external
RAMs/ROMs.]
Figure 3: Global partitioning architecture of the Shaper
The critical path was the ABR-Decisor function. It was split
into two sub-blocks called ABR and Decisor, as shown in figure 3.
This decomposition allows us to execute the two sub-functions
concurrently. Additionally, the ABR may be executed concurrently
with the Scheduler, as required by some configurations. The
concurrency between the Decisor and the ABR is managed by the top
controller, and the concurrency between the ABR and the Scheduler
is managed by the Decisor.
3.2 Behavioral description style
Each module of the Shaper system has been described by a single
process using a VHDL behavioral description. These processes
may perform condition tests and branch operations,
arithmetic/logic operations, read and write operations and I/O
operations. The main difficulty when describing such processes is to
mix protocol (handshaking), control algorithms (loops and
conditional statements) and computation. The complete specification of
the Shaper is made of 3200 lines of behavioral VHDL.
 if (DC = 16#000#) then
     Aux_pt := nrt_pt;
 else  -- (DC /= 16#000#)
     Addr_nrt := nrt_pt;
     read_ram_nrt(Addr_nrt, Gn_nrt, qid_nrt, TD, fe_nrt);
     Addr_grt := Gn_nrt;
     read_ram_grt(Addr_grt, FFL);
     Aux_pt := FFL;
     AUX_POINT : while (DC /= 16#000#) loop
         read_ram_nrt(Addr_nrt, Gn_nrt, qid_nrt, TD, fe_nrt);
         if (fe_nrt = '0') then
             Aux_pt   := Aux_pt + 1;
             Addr_nrt := Addr_nrt + 1;
         else  -- (fe_nrt = '1')
             Aux_pt   := Addr_nrt;
         end if;
         DC := DC - 1;
     end loop AUX_POINT;
 end if;
Figure 4: Mixing protocol, control, computation and procedure call
Figure 4 shows an extract of the Sender speciﬁcation. This
model includes a combination of an if statement, a data dependent
loop and a procedure call that includes wait statements. This cod-
ing style allows us to make a compact description. In addition, the
capability to mix wait and control statements allows us to describe
precise protocols such as those required for exchanging data with
the memories. More information about the behavioral coding style
will be given in section 5.
4 SYNTHESIS RESULTS
The main results for the whole ATM Shaper are summarized in the table
of Figure 5. It compares both design methodologies, the RTL and the
behavioral one, in terms of number of VHDL lines, number of gates and
design effort.
Figure 5 indicates that the behavioral description of the ATM
Shaper is more than 50% smaller than the RTL description.
Therefore, it will be much easier to maintain and modify the design at
the behavioral level. Section 5 analyzes the difference between RTL
and behavioral coding styles.
The most surprising result comes from the size of the generated
RTL code, which was smaller than the manual RTL code. The analysis
of both coding styles has shown that the main difference comes
from the partitioning. The RTL designers decomposed the system into
a large number of blocks, which induced extra VHDL lines for the
interconnection of the blocks. The behavioral designers split the
system into only 7 modules (including the top-controller block),
while the RTL designers split the same design into 15 modules.
In both cases partitioning was driven by the complexity of the
modules. In fact, it is very difficult to handle a VHDL process with
more than a few hundred lines of code. This corresponds to FSMs
with a few dozen states. For example, the Decisor function is
described as a single VHDL process of 850 lines in the behavioral
model. The same function was described using several modules at
the RTL level. The analysis has also shown that the coding style
generated by behavioral synthesis is as efficient as the one
produced manually by the RTL designers.

                                  Behavioral design    RTL design
Behavioral description            ~3200 VHDL lines     No behavioral model
RTL VHDL description              9311 VHDL lines      10250 VHDL lines
Number of gates                   28500 gates          27000 gates
Number of partitioning modules    7 modules            15 modules
Design effort (Persons/Month)     8 P/M                25 P/M
Figure 5: Behavioral vs RTL synthesis results
In terms of gate count, the RTL and the behavioral designs are
almost equivalent. Another important result shown in Figure 5
is the design effort, which is three times lower when using behavioral
synthesis than when using RTL. This is critical for telecommunication
devices, since time-to-market is the key issue for this kind
of application. The difference may be explained by analyzing the design
loops of both approaches. In this experiment the time-to-market
was comparable for both approaches. However, the RTL design
team was three times larger than the behavioral design team.
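The ratios quoted in this section can be checked directly against the numbers of Figure 5. The assignment of the two gate counts to the behavioral and RTL designs is read from the column order of the figure and should be treated as an assumption.

```latex
\begin{align*}
\text{design effort ratio}          &= \frac{25\ \text{P/M}}{8\ \text{P/M}} \approx 3.1 \\
\text{source size reduction}        &= 1 - \frac{3200}{10250} \approx 0.69 \\
\text{generated RTL size reduction} &= 1 - \frac{9311}{10250} \approx 0.09 \\
\text{gate count ratio}             &= \frac{28500}{27000} \approx 1.06
\end{align*}
```

These values agree with the statements that the effort reduction is threefold, that the behavioral description is more than 50% smaller than the manual RTL, that the generated RTL is about 10% smaller than the manual RTL, and that the gate counts are almost equivalent.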
Figure 6 shows the high-level synthesis design flow applied to
the ATM Shaper design with the corresponding verification process
and design loops [11]. From the initial specification, the circuit is
manually partitioned and described at the behavioral level. The
model is simulated (validation B) to check the global functionality.
Loop B1 is used to debug the behavioral description.
From the behavioral level, a register transfer level architecture
of the circuit is automatically generated using behavioral synthesis.
The RTL description is simulated (validation R) to check the
behavior at the clock-cycle level. Loop R1 is used to debug the
behavioral description with regard to the synthesizable subset and
the writing style accepted by behavioral synthesis. Loop R2 is used
to debug the directives and other inputs to behavioral synthesis that
affect functionality, such as the availability of the right components
in the library. In the case of manual design, the RTL is produced
manually and loop R2 is used to debug the RTL model.
From the register transfer level, the architecture of the circuit
is mapped onto a gate-level netlist and optimized with an RTL and
logic synthesizer, according to manual directives. The gate-level
description is simulated (validation G) to check the delays. Loop
G1 debugs the behavioral description for missing signal and variable
initializations not detected during validation R. Loop G2 debugs
the directives for behavioral synthesis for performance optimization
such as pipelining. Loop G3 debugs the directives for
RTL and logic synthesis for performance optimization such as
retiming or pipelining. From the gate level, the circuit is then placed
and routed. Loop G3 is the only one used by RTL designers to
debug the RTL description.
[Figure 6 diagram: the design levels are, from top to bottom, specification
level, behavioral level, register transfer level, gate level and layout. The
behavioral design process goes through PARTITIONING, BEHAVIORAL SYNTHESIS,
RTL AND LOGIC SYNTHESIS and PLACE & ROUTE; the RTL design process goes
through PARTITIONING, RTL AND LOGIC SYNTHESIS and PLACE & ROUTE. Validations
B, R and G each lead to an "OK?" test that feeds loops B1, R1, R2, G1, G2 and
G3 back to the descriptions or to the synthesis directives. Validation:
simulation at every level with the same test-bench.]
Figure 6: Design loops
Because of the abstraction level and the amount of information
handled, the behavioral simulation (validation B) is less time
consuming than the RTL simulation (validation R), which is itself much
faster than gate simulation (validation G). For the same reasons,
behavioral synthesis is faster than RTL synthesis, which is itself much
faster than gate-level synthesis. The main benefit of behavioral
synthesis is that it allows the design to be validated at a higher level,
which reduces the number of lower-level iterations from
RTL to gate synthesis. In this experiment the behavioral flow
and the RTL design flow were carried out in parallel. During the first
phase of the design, the specification was continuously changing.
This was a heavy penalty for the RTL designers, since the
behavioral model is easier to modify than the RTL one.
5 EVALUATION
The main added value of behavioral synthesis comes from the fact
that the behavioral model is smaller than the RTL one. The generated
RTL code is readable and easy to understand, except for the
portions that mix loops and wait statements. This is because the
system unrolls the loops and introduces extra intermediate variables
in order to perform chaining of operations [8], [12]. The rest of this
section gives some examples that show differences in coding styles.
• Combining wait statements and procedure calls: The Shaper function
  relies on intensive memory accesses with complex protocols, so for
  this kind of application it is very profitable to write behavioral
  code rather than RTL code. The Scheduler block includes loop
  statements, procedure calls, wait statements and several memory
  accesses. Behavioral synthesis allows procedures to be used to
  perform read and write memory accesses. These procedures may contain
  several wait statements, as shown in figure 7(a), which illustrates
  a read memory access. Figure 7(b) shows a sequence that calls this
  procedure several times with different parameters. In RTL, this
  sequence would have required more than 60 VHDL lines. In fact, since
  the RTL style forbids wait statements within procedures, the
  procedure calls need to be in-lined manually.

(a) Procedure read memory
procedure read_tdm (Adresse : in  Bu_15bits;
                    tdm1    : out Bu_10bits;
                    tdm2    : out Bu_3bits) is
begin
    tdm_cs   <= '0';
    tdm_r_w  <= '1';
    addr_tdm <= Adresse;
    wait until rising_edge(clk);
    tdm_cs   <= '1';
    tdm_r_w  <= '0';
    wait until rising_edge(clk);
    tdm1 := tdm_dout(15 downto 6);
    tdm2 := tdm_dout(2 downto 0);
end read_tdm;

(b) Calling the procedure 6 times in the main process
tdm_addr := QID & "000";
read_tdm(tdm_addr, pcr, ct);
tdm_addr := QID & "001";
read_tdm(tdm_addr, mcr, rif);
tdm_addr := QID & "010";
read_tdm(tdm_addr, icr, rdf);
tdm_addr := QID & "011";
read_tdm(tdm_addr, adtf, cdf);
tdm_addr := QID & "100";
read_tdm(tdm_addr, crm1, var1);
tdm_addr := QID & "101";
read_tdm(tdm_addr, crm2, var2);
Figure 7: Behavioral description using procedure call
• Description of complex processes:
  Describing a complex process in RTL is tedious and error prone.
  For this reason the RTL designers split the Sender process into
  2 modules, while it was described as a single process in the
  behavioral description. The two modules designed by the RTL
  designers were coded with 1230 lines of RTL VHDL, mainly because
  of the details involved in the RTL description. In contrast, the
  Sender was described by a single behavioral process with 540 VHDL
  lines. The difference in the number of lines is due to the fact that
  a behavioral specification allows us to specify complex protocols
  in an algorithmic way, mixing wait statements and control
  statements. In RTL, handshakes need to be specified as verbose
  FSMs.
• Wait statement in behavioral description:
  The behavioral synthesis tool used in this experiment allows
  wait statements with until expressions that may include signals
  other than the clock. This writing style is illustrated in
  figure 8(a), which shows a typical handshake sequence using wait
  statements. The RTL code generated from this sequence is
  made of two states corresponding to the two wait statements
  of the initial model. In this case, the automatically generated
  code is as efficient as hand-coded VHDL.

a) Behavioral code
    -- Start Scheduler Process
    start_schedul <= '1';
    qid_to_sche   <= qid_send;
    wait until rising_edge(clk);
    wait until (done_schedul = '1');
    start_schedul <= '0';

b) RTL code
        next_start_schedul  <= '1';
        next_qid_to_sche    <= qid_send;
        NEXT_STATE_decisorp <= ST_decisorp_1;
    when ST_decisorp_1 =>
        NEXT_STATE_decisorp <= ST_decisorp_2;
    when ST_decisorp_2 =>
        if (done_schedul = '1') then
            next_start_schedul  <= '0';
            NEXT_STATE_decisorp <= ST_decisorp_3;
        else
            NEXT_STATE_decisorp <= ST_decisorp_2;
        end if;
Figure 8: Behavioral vs RTL writing VHDL code
• Mixing loop, if and wait statements:
  This is probably the most significant difference between RTL
  and behavioral coding styles. Figure 9(b) shows a sequence
  including a loop and a procedure call that includes wait. This
  sequence is extracted from the Shaper. The RTL code
  produced for this sequence is shown in figure 9(c).

a) read memory procedure
procedure read_ram_grt2 (Adresse : in  Bu_11bits;
                         grt1    : out Bu_11bits;
                         grt2    : out std_logic;
                         grt3    : out std_logic) is
begin
    cs_grt   <= '0';
    grt_addr <= Adresse;
    rw_grt   <= '1';
    oe_grt   <= '0';
    wait until rising_edge(clk);
    rw_grt   <= '0';
    cs_grt   <= '1';
    wait until rising_edge(clk);
    grt1 := dout_grt(16 downto 6);
    grt2 := dout_grt(1);
    grt3 := dout_grt(0);
end read_ram_grt2;

b) mix procedure call and while statements
Main_loop : loop
    …….
    else  -- (FE ='1') Filled position an
        ……
        ……
        while (RI = '0') loop
            Addr_grt := gn_grt;
            read_ram_grt2(Addr_grt, gn_grt, SG, RI);
            gn_grt := gn_grt + 1;
        end loop;
        ……
    end if;
    ….
end loop main_loop;

c) RTL code
when ( ST_schedul1_47_schedul1 ) =>
    C_H_A_I_N_E_D_67 := dout_grt(16 downto 6);
    next_gn_grt <= C_H_A_I_N_E_D_67;
    next_sg <= dout_grt(1);
    C_H_A_I_N_E_D_68 := dout_grt(0);
    next_ri <= C_H_A_I_N_E_D_68;
    if (C_H_A_I_N_E_D_68 = '0') then
        C_H_A_I_N_E_D_69 := C_H_A_I_N_E_D_67;
        next_addr_grt <= C_H_A_I_N_E_D_69;
        next_cs_grt <= '0';
        next_grt_addr <= C_H_A_I_N_E_D_69;
        next_rw_grt <= '1';
        next_oe_grt <= '0';
        NEXT_STATE_schedul1 <= ST_schedul1_48_schedul1;
    else
        next_cs_grt <= '0';
        next_grt_addr <= addr_grt;
        next_rw_grt <= '1';
        next_oe_grt <= '0';
        NEXT_STATE_schedul1 <= ST_schedul1_50_schedul1;
    end if;
when ( ST_schedul1_48_schedul1 ) =>
    next_rw_grt <= '0';
    next_cs_grt <= '1';
    NEXT_STATE_schedul1 <= ST_schedul1_49_schedul1;
when ( ST_schedul1_49_schedul1 ) =>
    C_H_A_I_N_E_D_70 := dout_grt(16 downto 6);
    next_gn_grt <= C_H_A_I_N_E_D_70;
    next_sg <= dout_grt(1);
    C_H_A_I_N_E_D_71 := dout_grt(0);
    next_ri <= C_H_A_I_N_E_D_71;
    C_H_A_I_N_E_D_70 := (C_H_A_I_N_E_D_70 + "00000000001");
    next_gn_grt <= C_H_A_I_N_E_D_70;
    if (C_H_A_I_N_E_D_71 = '0') then
        C_H_A_I_N_E_D_72 := C_H_A_I_N_E_D_70;
        next_addr_grt <= C_H_A_I_N_E_D_72;
        next_cs_grt <= '0';
    end if;
Figure 9: Behavioral vs RTL writing VHDL code
  The loop and the wait statements of the procedure in figure 9(a)
  were translated into only 2 states. Behavioral synthesis also
  performed several optimizations that are difficult to perform
  manually. In this case, the system unrolled the loop without
  creating extra states. All the C_H_A_I_N_E_D_XX variables
  are temporary and will be removed by the RTL synthesis.
  Of course, all these transformations make the generated RTL
  model difficult to read.
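To make the cost of manual in-lining more concrete, the following is a minimal sketch of how a single read_tdm access (figure 7(a)) might be hand coded at the RT level. It is not the code written by the RTL team. The entity name tdm_read_fsm, the rst, start, addr_in and done ports, the state names and the std_logic_vector widths (standing in for Bu_15bits, Bu_10bits and Bu_3bits) are all assumptions made for illustration, and the exact cycle alignment with the real memory has not been verified. The control signal polarities follow figure 7(a).

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical hand-coded RTL equivalent of one read_tdm call.
entity tdm_read_fsm is
  port (
    clk      : in  std_logic;
    rst      : in  std_logic;                      -- assumed synchronous reset
    start    : in  std_logic;                      -- assumed request strobe
    addr_in  : in  std_logic_vector(14 downto 0);  -- stands for Adresse
    tdm_dout : in  std_logic_vector(15 downto 0);
    tdm_cs   : out std_logic;
    tdm_r_w  : out std_logic;
    addr_tdm : out std_logic_vector(14 downto 0);
    tdm1     : out std_logic_vector(9 downto 0);
    tdm2     : out std_logic_vector(2 downto 0);
    done     : out std_logic                       -- assumed completion flag
  );
end entity tdm_read_fsm;

architecture rtl of tdm_read_fsm is
  type state_t is (ST_IDLE, ST_ACCESS, ST_CAPTURE);
  signal state : state_t := ST_IDLE;
begin
  fsm : process (clk)
  begin
    if rising_edge(clk) then
      if rst = '1' then
        state   <= ST_IDLE;
        tdm_cs  <= '1';
        tdm_r_w <= '0';
        done    <= '0';
      else
        done <= '0';
        case state is
          when ST_IDLE =>
            -- Code before the first wait of read_tdm.
            if start = '1' then
              tdm_cs   <= '0';
              tdm_r_w  <= '1';
              addr_tdm <= addr_in;
              state    <= ST_ACCESS;
            end if;
          when ST_ACCESS =>
            -- Code between the two waits of read_tdm.
            tdm_cs  <= '1';
            tdm_r_w <= '0';
            state   <= ST_CAPTURE;
          when ST_CAPTURE =>
            -- Code after the second wait: capture the data fields.
            tdm1  <= tdm_dout(15 downto 6);
            tdm2  <= tdm_dout(2 downto 0);
            done  <= '1';
            state <= ST_IDLE;
        end case;
      end if;
    end if;
  end process fsm;
end architecture rtl;
```

Even in this simplified form, each wait statement of the behavioral procedure becomes an explicit state, and the six calls of figure 7(b) would each need their own states or an extra sequencing counter, which is in line with the observation that the RTL version is much longer.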
6 CONCLUSION
This paper has described our experience with VHDL-based behavioral
synthesis for a telecommunication ATM Shaper. We compared the results
of two designs implemented in parallel from the same specifications:
a classical RTL design and a more advanced behavioral one.
The synthesis results showed that, given the design constraints,
the behavioral synthesis methodology was able to produce an equivalent
gate count and a threefold reduction in design effort when
compared with the classical RTL methodology.
The main added value of behavioral synthesis comes from the
fact that the behavioral model is smaller than the corresponding
RTL design. In the case of the ATM Shaper, the behavioral model
was less than half the size of the RTL one (about 3200 versus 10250
VHDL lines).
Acknowledgment
This work was supported by STMicroelectronics, Italtel, the Jessi
program A102 and MEDEA program under project SMT AT403.
References
[1] D.D. Gajski, N.D. Dutt, A.C.H. Wu, and S.Y.L. Lin. High-level
Synthesis, Introduction to Chip and System Design. Kluwer
Academic Publishers, Boston/London/Dordrecht, 1991.
[2] D. Ku and G. De Micheli. High-level Synthesis of ASICs under
Timing and Synchronization Constraints. Kluwer Academic
Publishers, Boston/London/Dordrecht, 1992.
[3] E. Berrebi, P. Kission, S. Vernalde, S. De Troch, J.C. Herluison,
J. Frehel, A.A. Jerraya, and I. Bolsens. Combined
control flow dominated and data flow dominated high-level
synthesis. 33rd ACM/IEEE Design Automation Conference
DAC'96, June 1996.
[4] T.E. Fuhrman. Industrial extensions to university high-level
synthesis tools: Making it work in the real world. 28th
ACM/IEEE Design Automation Conference DAC'91, June
1991.
[5] M. Genoe, P. Vanoostende, and G. Van Wauwe. On the use
of VHDL-based behavioral synthesis for telecom ASIC design.
In the Proceedings of the International Symposium on System
Synthesis ISSS'95, February 1995.
[6] M.T. Lee, Y. Hsu, Ben Chen, and M. Fujita. Domain-specific
high-level modeling and synthesis for ATM switch design using
VHDL. In 33rd ACM/IEEE Design Automation Conference
DAC'96, June 1996.
[7] The ATM Forum Technical Committee. Traffic management
specification v4.0. af-tm-0056.000 Letter Ballot, April 1996.
[8] R. A. Walker and Gaetano Boriello. A Survey of High-Level
Synthesis Systems. Kluwer Academic Publishers,
Boston/London/Dordrecht, 1991.
[9] D.D. Gajski and L. Ramachandran. Introduction to high level
synthesis. IEEE Design and Test of Computers, October 1994.
[10] A. Seawright and W. Meyer. Partitioning and optimizing
controllers synthesized from hierarchical high-level descriptions.
35th ACM/IEEE Design Automation Conference, June 1998.
[11] A.A. Jerraya, H. Ding, P. Kission, and M. Rahmouni. Behavioral
Synthesis and Component Reuse with VHDL. Kluwer
Academic Publishers, Boston/London/Dordrecht, 1997.
[12] R.A. Bergamaschi. Productivity issues in high-level design:
Are tools solving the real problems? 32nd ACM/IEEE Design
Automation Conference DAC'95, June 1995.