network of cells on a target structure whose connection delays have discrete values following its hierarchy.
The circuit is modelled by a set of timed cones whose delay histograms allow their classification into critical, potential critical and neutral cones according to predicted delays. The floorplanning is then guided by this cone structuring and has two innovative features: first, it is shown that the placement of the elements of the neutral cones has no impact on timing results, so a significant reduction in complexity is obtained; second, despite a greedy approach, a near-optimal floorplan is achieved in a large number of examples.
Introduction
The problem of incorporating performance objectives in the physical design of integrated circuits has been widely addressed in the past relating to placement, floorplanning, and chip partitioning.
In [BurYou85] an approach to the automatic layout design for VLSI chips was proposed. It incorporates timing information to influence the placement and wiring processes. Placement is based on a successive partitioning algorithm. Weighting nets according to their timing criticality biases the gain computation of the FM partitioning algorithm.
In [ShiKuTsay92] a system partitioning algorithm for MCMs is proposed under timing and capacity constraints. The authors use a divide-and-conquer strategy: first, clustering is applied to ensure timing correctness; then K-way packing is applied to obtain an initial solution satisfying the capacity constraints; finally, the K&L algorithm tries to minimize net crossings.
Permission to make digital/hard copies of all or part of this material for personal or classroom use is granted without fee provided that the copies are not made or distributed for profit or commercial advantage, the copyright notice, the title of the publication and its date appear, and notice is given that copyright is by permission of the ACM, Inc. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires specific permission and/or fee.
FPGA 98 Monterey CA USA. Copyright 1998 ACM 0-89791-978-5/98/01...$5.00
In [KatsKoWaYo95] a partitioning method under performance, area and IO pin constraints was proposed for MCM systems. The method first clusters all nodes that cause timing violations. After that, iterative improvement with mathematical programming is applied to minimize the number of cuts.
In [RajWong93] the problem of circuit clustering for delay minimization was considered under any monotone constraint. The proposed algorithm is timing-optimal, but the penalty is a high degree of replication. In [RaWo95] the pin constraint is also taken into consideration.
In [SwaSe95] a path-based timing-driven placement algorithm is presented. The difference from the preceding path-based approaches is that it can handle very large circuits. A hierarchical methodology is applied by condensing the netlist and applying simulated annealing with different temperatures to netlists with different degrees of condensing. A timing penalty is incorporated into the annealing cost function.
In [YousSaitSS] a timing-driven floorplanning approach is presented for general cell layouts. The approach incorporates timing criteria into the objective function of a greedy force-directed block placement algorithm.
[RoySe94] presents a multi-FPGA partitioning algorithm that handles timing constraints. The method considers the geometric aspects, namely the relative positions of the partitions with respect to each other, by subdividing each partition into bins. A timing penalty function is incorporated into the simulated annealing cost function.
In [SawTho95] a constructive set-cover-based approach is proposed to minimize the number of chip crossings in multi-way partitioning for FPGAs. In [SauBra93] cone-based clustering and cluster merging are applied to contain critical paths inside cones.
None of the previous approaches addressed the multilevel target. In this paper, we present a timing-driven floorplanning approach for hierarchically structured programmable targets. We propose an algorithm which defines a partial assignment of the design cells to the target structure nodes. This assignment is performed only for timing-critical cells and guarantees the timing predictability of the final placement.
The paper is organized as follows. In section 2, we give the basic definitions, terminology and notations used thereafter. Our design modelling is presented in sections 3 and 4. Timing modelling and the floorplanning problem are described in section 5. The algorithm for a timing-invariant partial floorplan is discussed in section 6. In section 7, we show how to translate floorplan constraints to the place and route tool. Experimental results and the conclusion are given in sections 8 and 9.
2 Hierarchical target
Physical characteristics
A target architecture is characterized by a set of basic modules/cells and interconnection resources. A hierarchical target is defined in addition by a hierarchy tree diagram with a depth corresponding to the number of levels of hierarchy. At each level the chip is organized into a set of regions, called quadrants, each containing a fixed number of modules. As in any hierarchy, the subsets at a given level are included in a subset associated with a higher level. In this paper we make the basic assumption that connections have a discrete delay at each hierarchy level. Usually the interconnect delay grows at higher levels. Figure 1 gives an example of a hierarchical device structure represented by a hierarchy tree and a chip layout. Each hierarchy tree node corresponds to a chip region, or quadrant, and may be weighted by a number of parameters (size in terms of the number of basic cells, number of IO pins, etc.).
Timing characteristics
As stated above, each level of hierarchy in the target architecture is characterized by an interconnect delay that is added when traversing this level. For example, $i_1$, $i_2$ and $i_3$ correspond to the interconnect delays of levels 1, 2 and 3 respectively in Figure 1.
In the following, we suppose that the target hierarchy has 3 levels as shown in Figure 1. We call the first-level nodes quadrants and the second-level nodes segments. The interconnect delay between two cells in two different segments is constant and denoted $d_s$.
3 Design modelling
Boolean network
The digital circuit is modelled as a boolean network. This network is represented as a directed bipartite graph $G = (V_1, V_2, E)$, where the node set $V_1$ represents the circuit elements and the node set $V_2$ represents the nets.
• A node $N_1 \in V_1$ is said to be a predecessor of a node $N_2 \in V_2$ if there exists a directed edge $e \in E$ from $N_1$ to $N_2$, in other words if the net $N_2$ is connected to the output pin of the module represented by $N_1$. The node $N_2$ is then called a successor of the node $N_1$.
• A node $N_1 \in V_1$ is said to be a successor of a node $N_2 \in V_2$ if there exists a directed edge $e \in E$ from $N_2$ to $N_1$, in other words if the net $N_2$ is connected to an input pin of the module represented by $N_1$. The node $N_2$ is then called a predecessor of the node $N_1$.
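As an illustration of this bipartite representation, the following minimal Python sketch stores cells and nets in separate sets and records predecessor and successor relations for both edge directions. The class and method names (`BooleanNetwork`, `drive`, `load`) and the two-gate example are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


class BooleanNetwork:
    """Directed bipartite graph G = (V1, V2, E) of cells (V1) and nets (V2)."""

    def __init__(self):
        self.cells = set()             # V1: circuit elements
        self.nets = set()              # V2: nets
        self.succ = defaultdict(set)   # node -> set of successor nodes
        self.pred = defaultdict(set)   # node -> set of predecessor nodes

    def _edge(self, src, dst):
        self.succ[src].add(dst)
        self.pred[dst].add(src)

    def drive(self, cell, net):
        """Edge cell -> net: the net is connected to the cell's output pin."""
        self.cells.add(cell)
        self.nets.add(net)
        self._edge(cell, net)

    def load(self, net, cell):
        """Edge net -> cell: the net is connected to an input pin of the cell."""
        self.nets.add(net)
        self.cells.add(cell)
        self._edge(net, cell)


# Hypothetical two-gate example: g1 drives net n1, which feeds g2.
network = BooleanNetwork()
network.drive("g1", "n1")
network.load("n1", "g2")
network.drive("g2", "n2")
```

Keeping both `succ` and `pred` makes backward traversals from an output net as cheap as forward ones.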
Predecessor cone
We define the predecessor cone of a node of the set $V_2$ as the set of paths connecting that node to primary or secondary inputs without traversing any node corresponding to a sequential element.
Figure 3: Predecessor cones.
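The cone definition can be sketched as a backward traversal that stops at sequential elements. The Python fragment below is an illustrative sketch, not the authors' implementation: it collects the nodes of the cone rather than enumerating its paths, and the function name, the `pred` dictionary layout and the small example netlist are assumptions.

```python
def predecessor_cone(root_net, pred, sequential):
    """Return the nodes reachable backwards from root_net.

    pred maps a node to its predecessor nodes; nodes listed in
    `sequential` are neither entered nor traversed.
    """
    cone = set()
    stack = [root_net]
    while stack:
        node = stack.pop()
        if node in cone or node in sequential:
            continue
        cone.add(node)
        stack.extend(pred.get(node, ()))
    return cone


# Hypothetical netlist: gate g3 is fed by n1 (from g1) and n2 (from flip-flop ff).
pred = {
    "out": ["g3"],
    "g3": ["n1", "n2"],
    "n1": ["g1"],
    "g1": ["a"],
    "n2": ["ff"],
    "ff": ["n0"],
    "n0": ["g0"],
}
cone = predecessor_cone("out", pred, sequential={"ff"})
```

In this example the logic behind the flip-flop (`n0`, `g0`) stays outside the cone, while the net `n2` driven by the flip-flop acts as a secondary input.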
3.3 Paths in a boolean network
Definition 1: Logic delay of a path. A path $P$ traversing $n$ cells has a logic delay $T_L(P) = \sum_{i=1}^{n} \delta_i$, where $\delta_i$ is the delay of the $i$-th cell on the path.
The interconnect delay of a path $P$ is determined after the floorplanning process. It takes into account the number of traversed hierarchy levels.
The physical delay of a path $P$ is defined as follows: $T_P(P) = T_L(P) + T_I(P)$, where $T_L(P)$ is the logic delay of the path $P$ and $T_I(P)$ is the interconnect delay of the path $P$.
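The three delays can be made concrete with a small sketch. It assumes, for illustration only, that a path is given as a list of cell delays plus a list of per-connection interconnect delays known after floorplanning; the function names and the numeric values are hypothetical.

```python
def logic_delay(cell_delays):
    """Logic delay of a path: the sum of the delays of its cells."""
    return sum(cell_delays)


def physical_delay(cell_delays, hop_delays):
    """Physical delay of a path: logic delay plus interconnect delay.

    hop_delays holds one interconnect delay per connection between
    consecutive cells, as fixed by the floorplan.
    """
    if len(hop_delays) != max(len(cell_delays) - 1, 0):
        raise ValueError("expected one interconnect delay per connection")
    return logic_delay(cell_delays) + sum(hop_delays)


# Three cells; the second connection crosses a hierarchy boundary.
t_logic = logic_delay([2, 3, 2])
t_physical = physical_delay([2, 3, 2], [0, 5])
```

Here the logic delay is 7 and the physical delay is 12, the difference being the interconnect contribution of the single boundary crossing.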
Prime cones of a design
A prime cone in the network is the predecessor cone of a primary or secondary output. A node may belong to one or more cones, which gives rise to cone intersections. An example of a circuit containing two intersecting cones is given in Figure 4.
3.5 Design profile
Figure 5 gives statistics on the number of prime cones in different industrial circuits. Figure 6 shows the saturation of the prime cones in terms of cells. The size of the cones in terms of cells is a criterion for choosing an appropriate algorithm to perform the floorplanning; in the following we therefore consider two kinds of cones, the wide ones and the narrow ones.
Timing modelling for prime cones
We use the prime cones as the basic constituents. The prime cones are classified according to their timing criticality. This allows a timing-driven assignment of prime cones to quadrants later on.
Predicted arrival time
The arrival time of a cone $C_i$ is the arrival time at the output of the top cell of this cone. With each node are associated a predicted arrival time and a predecessor cone. For a signal to reach the root of the cone, it potentially has to cross quadrants, segments and basic cells. Without taking into account any interconnection delay, the arrival time at a node $N_i$ is at least equal to $\max_{P \in C_i}(T_L(P))$, the maximal logic delay over all paths ending at node $N_i$.
To predict the arrival time at a node, it is also required to predict the number of quadrants or segments traversed before reaching this node. This number of quadrants depends both on the I/O and size saturation. 
Segment saturation constraints
$N_S(C_i)$ is the minimal number of traversed segments required by the assignment of the predecessor cone $C_i$ of the node $N_i$ and is defined as follows:
The upper bound predicted arrival time of the cone Ci is defined as follows :
$AT(C_i) = \max_{P \in C_i}(T_L(P)) + (m - 1) \cdot d_s$,
where $m$ is the number of basic cells belonging to the path $P$.
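A small sketch of this upper bound follows. It assumes that $m$ is read from the path with the longest logic delay, which matches the worst-case argument that every cell of the longest path lands in a different segment; the function name, the list-of-cell-delays path format and the numbers are illustrative assumptions.

```python
def upper_bound_arrival(paths, d_s):
    """Upper bound AT of a cone.

    paths is a list of paths, each a list of cell delays; d_s is the
    inter-segment interconnect delay. m is taken from the path with
    the largest logic delay (assumption).
    """
    longest = max(paths, key=sum)
    m = len(longest)
    return sum(longest) + (m - 1) * d_s


# Two paths in a hypothetical cone, inter-segment delay of 5 time units.
at_upper = upper_bound_arrival([[2, 3, 2], [4, 2]], d_s=5)
```

For this cone the longest path has logic delay 7 and three cells, giving a bound of 17. A more conservative variant would maximise the logic delay plus $(m - 1) \cdot d_s$ over all paths rather than only the longest one; the text does not say which reading is intended.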
The lower bound predicted arrival time of the cone $C_i$ is defined as follows:
The physical delay of a cone is equal to the physical delay of its longest path: $T_P(C_i) = T_P(P)$, where $P$ is the longest path in the cone $C_i$.
Property :
Let $T_P(C_i)$ be the physical delay of the prime cone $C_i$, then we have:
Let $C_i$ be a prime cone. Its physical delay after placement is at worst equal to $AT(C_i)$, because in the worst case each cell of the longest path in the cone $C_i$ is assigned to a different segment.
5 Design timing profile and classifying prime cones of a circuit 
5.2 Design timing profile
In Table 1, we present the results of the timing analysis performed on the MCNC benchmark C880. It may be seen that the number of $\epsilon$-critical cones is small (equal to 4), and that the number of potential critical cones (equal to 3) is smaller than the number of neutral cones (equal to 19). The value of $\epsilon$ is fixed here to approximately 10% of the maximum lower bound predicted time over all prime cones.
Floorplanning problem
The floorplanning process consists of assigning cells to the quadrants of the hierarchical target.
Theorem
The physical delay of a set of prime cones is independent of the assignment of the cells of neutral cones. [...] is the lower bound predicted arrival time of the set of prime cones; this implies that the assignment of the cells of the neutral cone $C_i$ has no influence on the physical delay of the set of prime cones.
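The classification criteria themselves are not reproduced in this section, so the following sketch rests on an assumption inferred from the theorem: a cone is treated as $\epsilon$-critical when its lower bound lies within $\epsilon$ of the largest lower bound, as neutral when even its upper bound cannot exceed that largest lower bound, and as potential critical otherwise. The function name, labels and numbers are hypothetical.

```python
def classify_cones(bounds, eps):
    """Classify prime cones from their predicted arrival-time bounds.

    bounds maps a cone name to (lower_bound, upper_bound).
    The thresholds are an assumed reading, not the paper's definition.
    """
    max_lower = max(lower for lower, _ in bounds.values())
    classes = {}
    for name, (lower, upper) in bounds.items():
        if lower >= max_lower - eps:
            classes[name] = "critical"    # within eps of the worst lower bound
        elif upper <= max_lower:
            classes[name] = "neutral"     # worst case never dominates
        else:
            classes[name] = "potential"   # may become critical after placement
    return classes


bounds = {"c1": (90, 120), "c2": (100, 130), "c3": (60, 110), "c4": (40, 80)}
classes = classify_cones(bounds, eps=10)
```

Under this reading only the cells of `c1`, `c2` and `c3` would need to be assigned by the floorplanner; `c4` can be left to the placement tool.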
Experimental results on complexity reduction
The complexity reduction of the floorplanning problem is approximately (number of neutral prime cones / number of prime cones). Figure 7 presents an experimental evaluation of the complexity reduction due to the elimination of neutral cones. We observe that, on average, the complexity was reduced by more than 47%.
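As a worked instance of this ratio, the sketch below applies it to the C880 cone counts quoted from Table 1 (4 $\epsilon$-critical, 3 potential critical and 19 neutral cones). The function name is hypothetical; only the counts come from the text.

```python
def complexity_reduction(n_critical, n_potential, n_neutral):
    """Fraction of prime cones that need no assignment (neutral / total)."""
    total = n_critical + n_potential + n_neutral
    if total == 0:
        raise ValueError("the design has no prime cones")
    return n_neutral / total


# C880 counts as reported in Table 1.
reduction_c880 = complexity_reduction(4, 3, 19)
```

For C880 this gives 19 / 26, i.e. roughly 73% of the prime cones can be ignored by the floorplanner.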
Notations: Pot-critical: Potential Critical.
In the first step, the prime cones and their intersections are created; we then perform the timing analysis by computing the different delays (AT, ...).
This timing analysis allows us to classify the prime cones into three sets: $\epsilon$-critical prime cones, potential critical prime cones and neutral prime cones. During the floorplanning process, the current arrival time is updated as follows. Consider the floorplanning performed on the elements of a cone $C_i$ whose root is a node $N_i$. At each step, a cell $C_j$ of the cone is assigned to a quadrant $Q$. If at a given time the number of quadrants used is greater than the minimum number of quadrants required to implement the cone $C_i$, which is estimated as $N_Q(C_i)$, the current arrival time is updated. To update the current arrival time, the interconnect delay between quadrants is taken into account. If two quadrants belong to the same segment and are connected to each other, an interconnect delay $d_q$ is added. Otherwise, if two quadrants belong to two different segments and are connected to each other, an interconnect delay $d_s$ is added.
All the paths which cross the cone $C_i$ are also updated, but we have to distinguish two cases. In the first case, $C_i$ belongs to an intersection between two different prime cones; then all the paths which cross the cone $C_i$ and may belong to the two prime cones have to be updated. Otherwise, only the paths which cross the cone $C_i$ and the selected prime cone are updated.
The resulting information of the floorplanning algorithm is stored as constraints in a file for the place and route tool. Facilities for propagating floorplan constraints exist in most FPGA design environments (Xilinx, ORCA, Altera, ...). In this paper, we take as an illustration a hierarchical target, the AMD MACH5. In the corresponding software environment, these constraints are propagated to a special file called PI (physical information) and passed to the AMD fitter MACHXL. The same can be done for Xilinx using RLOC constraints and the PPR tool.
This constraint means that the place and route tool has to assign the cells S1 and S2 to the quadrant a of the segment S0 (see Figure 1).
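Returning to the arrival-time update performed during floorplanning, the rule can be sketched as follows. This is an illustrative reading, not the authors' code: quadrants are identified by a (segment, quadrant) pair, the same-segment delay is called `d_q` here, a connection inside one quadrant is assumed to add no delay, and all names and numbers are hypothetical.

```python
def quadrant_delay(qa, qb, d_q, d_s):
    """Interconnect delay between two connected quadrants.

    qa and qb are (segment, quadrant) pairs.
    """
    if qa == qb:
        return 0        # assumption: no extra delay inside a quadrant
    if qa[0] == qb[0]:
        return d_q      # same segment, different quadrants
    return d_s          # different segments


def updated_arrival(current, used_quadrants, min_quadrants, qa, qb, d_q, d_s):
    """Update the current arrival time after an assignment step.

    The arrival time changes only once the cone occupies more
    quadrants than its estimated minimum.
    """
    if len(used_quadrants) <= min_quadrants:
        return current
    return current + quadrant_delay(qa, qb, d_q, d_s)


# A cone estimated to fit in one quadrant spills into a second segment.
used = {("S0", "a"), ("S1", "a")}
arrival = updated_arrival(20, used, 1, ("S0", "a"), ("S1", "a"), d_q=2, d_s=5)
```

In the example the spill across segments raises the current arrival time from 20 to 25.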
Experimental results
The floorplanning algorithm described in this paper has been implemented in the C language on Sun SPARC workstations and tested on a set of industrial examples. We compared the results with those obtained by the placement tool without floorplanning constraints. The experimental results (Table 2, Table 3 and Table 4) show reductions of 57%, 79.44% and 61% in the delay due to the interconnections in the circuits, and of 15%, 19.22% and 15% in the critical path delay of the circuits.
Timing-predictable layout is one of the most difficult problems in the electronic circuit design world. The target addressed here makes the timing prediction easier. In addition to the complexity reduction of the floorplanning problem, we focused on designs where a logic structuring also contributes to the problem simplification for the sufficient condition track.
Delay  9%  100%  9%  100%  6%  100%  0%  0%  21%  85%  17%  70%  17%  74%  14%  61%  21%  75%  11%  42%  12%  47%  10%  39%  14%  52%  14%  46%  13%  49%  14%  50%  17%  56%  50%  17% 15% 61%
10 References
