In this paper we present the essential features of an approach for the design of a parallel algorithm for the layout compaction problem. We begin with a formulation of the problem presented by Yoshimura in [4]. This formulation is in terms of the dual transshipment problem. Our approach to the solution of the dual transshipment problem involves repeated applications of three basic steps, namely, testing feasibility, shortest-path computations, and performing concurrent pivot operations. Our discussion is in terms of marked graph concepts and results presented in [5], [6]. Our approach can also be used in the study of the relative placement problem discussed in [7] by Mlynski and Weiss.
I. INTRODUCTION
VLSI designers have come to rely heavily on automatic layout tools. It has therefore become necessary for these tools to do a good job of minimizing the final area and other cost-critical factors. But better quality comes at the price of much longer computation time. One way to alleviate this problem is to take advantage of fast, commercially available parallel computers.
Thus, for the different steps in the layout phase of the VLSI design process, parallel algorithms are called for that are suitable for implementation on a MIMD multiprocessor and that yield considerable savings in computing time. Recent work on the design of multiprocessor-based algorithms for certain phases of the layout problem may be found in [1], [2], [3].
In this paper, we are concerned with the design of a parallel algorithm for the layout compaction problem. The paper is organized as follows. In Section II, we present a brief review of the literature on the compaction problem. We also present the work of Yoshimura [4], wherein he formulates the problem as a dual transshipment problem. The remainder of the paper is concerned with our research towards the design of distributed and parallel algorithms for the dual transshipment problem. Our approach is based on results and concepts from the theory of marked graphs presented in [5], [6].
II. LAYOUT COMPACTION
Compaction is the CAD tool used to pack rough sketches or symbolic diagrams to produce IC layouts. Manual compaction is tedious, time consuming, and error-prone; automated compaction tools can greatly shorten the layout design cycle. The aim of layout compaction can be stated as follows: starting from an initial layout and without changing its topology, a final mask layout has to be achieved with a minimum chip area and consistent with the design rules. The restriction to invariance of topology is necessary in order not to render the previous steps of placement and routing obsolete. It is achieved by maintaining relative neighbors, i.e., adjacency of layout elements. Elements are either on the bottom-most mask level, e.g., diffusion windows, or on higher design levels, e.g., transistors and cells. These elements are not allowed to jump across each other during compaction.
Most of the available compactors are based on the same one-dimensional approach, which solves the two-dimensional compaction problem by two one-dimensional procedures, i.e., a horizontal compaction and a vertical compaction. These two procedures are applied successively. Thus, the layout elements are moved in one direction at a time, changing either only their x-coordinates during horizontal compaction or only their y-coordinates during vertical compaction.
A simultaneous compaction in both the horizontal and the vertical directions is preferable, since it avoids the tradeoff of moving an element in either direction. Although a general two-dimensional compaction strategy is not yet available, all attempts to treat x- and y-positions simultaneously, at least during some part of the compaction procedure, shall be classified as two-dimensional.
In our work, we will be primarily concerned with the one-dimensional approach. Of the several strategies based on the one-dimensional method, the constraint graph approach seems to be the most promising from the point of view of parallel implementation. This approach consists of two main steps: (1) build the constraint graph to indicate the relative positions and the minimum distances required among the elements, (2) solve the constraint graph to minimize the chip area using the longest path method. The basic constraint-graph approach separates the compaction problem into two independent compactions, one in the X-direction and one in the Y-direction. During X-compaction, elements can only move strictly horizontally to the left, and during Y-compaction, elements can only move strictly vertically to the bottom. Simultaneous compaction in both directions has been considered in the literature. In addition to compressing spaces, the compaction process may expand the input to resolve design-rule violations. The user can do as many X- and Y-compactions as necessary.
To satisfy all spacing constraints, each node element must be at least its longest path's length away from the boundary element, but it does not need to exceed that distance. Hence, a solution of the maximum-packing problem is that each node should be at exactly that distance, assuming the boundary element is located at x = 0. Since all spacing constraints are satisfied by this solution, X-compaction also corrects design-rule violations in the X-direction that may have been present in the input. Overall, the constraint-graph model offers better flexibility than the other models and still allows for efficient implementation.
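The longest path method described above can be illustrated with a minimal sketch. It assumes that each spacing constraint is given as a triple (i, j, d) meaning x_j - x_i >= d, that the constraint graph is acyclic, and that node 0 is the boundary element; the function name `compact_x` and this data layout are hypothetical and are not taken from [4].

```python
from collections import defaultdict, deque


def compact_x(num_nodes, constraints, boundary=0):
    """Return the smallest x-coordinates with x[j] - x[i] >= d for every (i, j, d).

    Assumes an acyclic constraint graph in which every node is reachable
    from `boundary`, which is placed at x = 0.
    """
    succ = defaultdict(list)
    indeg = [0] * num_nodes
    for i, j, d in constraints:
        succ[i].append((j, d))
        indeg[j] += 1

    # Longest path in a DAG, processing nodes in topological order.
    x = [float("-inf")] * num_nodes
    x[boundary] = 0
    queue = deque(v for v in range(num_nodes) if indeg[v] == 0)
    visited = 0
    while queue:
        u = queue.popleft()
        visited += 1
        for v, d in succ[u]:
            x[v] = max(x[v], x[u] + d)
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    if visited != num_nodes:
        raise ValueError("constraint graph contains a cycle")
    return x


# Node 0 is the left boundary; node 3 must clear both node 1 and node 2.
example = [(0, 1, 2), (0, 2, 1), (1, 3, 3), (2, 3, 5)]
positions = compact_x(4, example)
```

In this example node 3 ends up at distance 6 from the boundary, the length of the longer of its two constraint paths, so every element sits exactly at its longest-path distance.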
Our work on parallel algorithm design will start with a formulation of Yoshimura [4], described below.
Given: Initial placement of blocks, horizontal wire segments and vertical wire segments.
Subject to: Preservation of the relative positions between blocks and horizontal wire segments. No overlaps between blocks and wire segments.
Minimize: Chip height and total wire length in the Y-direction.
First a directed graph G(V,E) is constructed, where V and E are the node set and the edge set, respectively. Each node corresponds to a block upper edge, a block lower edge or a horizontal wire segment. Using this graph, a minimum height layout can be obtained by calculating a "constrained longest distance tree".
The total wire length minimization problem is formulated as a linear programming problem. The constraints are of the form
$u_i - u_j \ge d_{ij}$,
where $u_i$ and $u_j$ are the y-coordinates of node i and node j, respectively, and $d_{ij}$ is a constant. An optimum solution is constructed using a variation of the simplex method.
It can be seen that the above formulation of the compaction problem is in terms of the dual transshipment problem. We also note that Mlynski and Weiss [7] formulate the relative placement problem in terms of the dual transshipment problem. Thus the approach we present in the following sections will also be applicable in the study of the relative placement problem. For a good treatment of standard algorithms for network optimization problems, the reader is referred to [8].
III. DUAL TRANSSHIPMENT PROBLEM: A NEW APPROACH
To present an algorithm which obtains the feasible solution given in the above theorem, let us define the node firing operation as follows. Firing a node v z times refers to the operation of adding z to the token of every outgoing edge at v and subtracting z from the token of every incoming edge at v. In the following, M(e) denotes the token of edge e. The firing number of a node refers to the number of times the node has been fired. To start with, all the firing numbers are zero. Note that in our algorithm the value of $y_i$ at any time will in fact be equal to the firing number of node i at that time.
The main steps in this implementation are:
Step 1:
Step 2:
Step 3:
Step 4: If ai for all i, STOP. ELSE return to Step 2.
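The node firing operation defined above can be sketched as follows. This is only an illustration under assumed conventions: tokens are stored in a dictionary keyed by directed edge (u, v), and the helper name `fire` is hypothetical.

```python
def fire(tokens, node, z):
    """Fire `node` z times.

    Adds z to the token of every outgoing edge at `node` and subtracts z
    from the token of every incoming edge at `node`. `tokens` maps a
    directed edge (u, v) to its token M(e); a new dictionary is returned.
    """
    updated = dict(tokens)
    for (u, v) in tokens:
        if u == node:
            updated[(u, v)] += z
        if v == node:
            updated[(u, v)] -= z
    return updated


# A directed cycle a -> b -> c -> a with initial tokens.
m0 = {("a", "b"): 3, ("b", "c"): 0, ("c", "a"): 1}
m1 = fire(m0, "b", 2)

# With firing numbers y (all zero initially), each token equals
# M0(u, v) + y[u] - y[v].
y = {"a": 0, "b": 2, "c": 0}
consistent = all(m1[(u, v)] == m0[(u, v)] + y[u] - y[v] for (u, v) in m0)
```

Firing b twice moves two tokens from its incoming edge to its outgoing edge, so the total token count around the directed cycle is unchanged, and the tokens remain determined by the initial tokens and the firing numbers.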
A distributed/parallel implementation of the above algorithm which achieves a time complexity of O(n) and a message complexity of O(mn) is given in [9]. In this implementation each node is associated with a single processor, and information is communicated from one processor to another through messages.
Suppose that the given dual transshipment problem is feasible. Then, after an application of Algorithm FEASIBLE, all the tokens associated with the edges will be non-negative. Let the corresponding graph be G'. From this point onwards, our approach is to decrease the value of the objective WY as much as possible until optimality is reached. This is achieved by firing the nodes in an appropriate manner without ever allowing the tokens to become negative. Thus we fire only negative-weight nodes. This is repeated until no further firing of these nodes is possible. Let the graph at this point be denoted as G''. Note that in G'', at each negative-weight node, there will be at least one edge with zero token incident into the node.
Also we show that each negative-weight node $v_i$ would have been fired exactly $f_i$ times, where $f_i$ is the token in G' (the sum of the edge tokens) of a shortest path to $v_i$ from a positive-weight node. So, to avoid redundant firings, we compute in G' the values of the $f_i$'s for all the negative-weight nodes and then fire the nodes accordingly. This will transform the graph G' to G''. Note that this step involves only shortest path computations and can be done very efficiently using the distributed/parallel algorithm presented in [10]. We shall call this Algorithm SHORT-PATH.
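A sequential sketch of this shortest-path firing step is given below; it is not the distributed algorithm of [10]. It assumes non-negative tokens, uses Dijkstra's algorithm with every positive-weight node as a source at distance zero, and fires every other node by its distance. Firing zero-weight nodes as well is an assumption made here so that the triangle inequality keeps every token non-negative; the function name `short_path_firing` is hypothetical.

```python
import heapq


def short_path_firing(nodes, weight, tokens):
    """Return (firing_numbers, new_tokens) for one shortest-path firing pass.

    `tokens` maps a directed edge (u, v) to a non-negative token. Every
    positive-weight node is a source at distance 0; every other node is
    fired as many times as its shortest token distance from a source.
    """
    succ = {v: [] for v in nodes}
    for (u, v), m in tokens.items():
        succ[u].append((v, m))

    dist = {v: float("inf") for v in nodes}
    heap = []
    for v in nodes:
        if weight[v] > 0:
            dist[v] = 0
            heap.append((0, v))
    heapq.heapify(heap)

    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, m in succ[u]:
            if d + m < dist[v]:
                dist[v] = d + m
                heapq.heappush(heap, (d + m, v))

    if any(d == float("inf") for d in dist.values()):
        raise ValueError("some node is unreachable from every positive-weight node")

    # Token of (u, v) after firing: M(u, v) + y[u] - y[v].
    new_tokens = {(u, v): m + dist[u] - dist[v] for (u, v), m in tokens.items()}
    return dist, new_tokens


nodes = ["p", "a", "b"]
weight = {"p": 3, "a": -1, "b": -2}
tokens = {("p", "a"): 2, ("a", "b"): 1, ("p", "b"): 5}
fired, after = short_path_firing(nodes, weight, tokens)
```

In this small example nodes a and b are fired 2 and 3 times respectively, all tokens stay non-negative, and each negative-weight node ends with a zero-token edge incident into it, in line with the observation made above.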
Consider now the graph G''. A number of edge tokens in G'' will be zero. The subgraph of G'' induced by the zero-token edges may not be connected. In that case, let the connected components of this subgraph be $G_1, G_2, \ldots, G_k$. We then construct a new graph as follows. Node i in the new graph represents $G_i$, and the edge $e_{ij}$ (directed from node i to node j) is assigned the smallest of the tokens of all edges directed from $G_i$ to $G_j$.
Next we compute the weight of each node (which is now a cluster of nodes) in the new graph, given by the sum of the weights of all the nodes in the corresponding cluster. Then we apply Algorithm SHORT-PATH to compute, for each negative-weight node, its shortest path from a positive-weight node, and then fire these nodes by the appropriate amounts. Note that firing a cluster z times results in adding z to the current firing numbers of all the nodes in the cluster.
We repeat the above process until all nodes coalesce into a single cluster. At this point we will have obtained a basic feasible solution (represented by a spanning tree) of the dual simplex. We now test the optimality of the solution using the simplex optimality criterion. Suppose the solution is not optimal. Then we determine for each branch (i,j) of the spanning tree (representing the basic solution) the corresponding fundamental cutset.
Let the corresponding vertex partition be $(V_i, \bar{V}_i)$. Assume without loss of generality that node i is in $V_i$. Then $V_i$ will be called the fundamental cluster corresponding to the branch (i,j). If the weight of the cluster $V_i$ is negative, then we could fire $V_i$ to decrease the value of the objective. Firing a fundamental cluster $V_i$ is in fact the same as a simplex pivot operation with respect to the branch (i,j). Firing all negative-weight fundamental clusters may result in producing negative tokens. So, we have designed a strategy to concurrently fire certain negative-weight fundamental clusters in an appropriate manner so that no token will become negative during this process.
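As an illustration of the fundamental clusters just described, the following sketch removes each branch of a spanning tree in turn and collects the nodes that remain connected to the branch's first endpoint, together with their total weight. It only identifies the candidate clusters; the strategy for choosing which of them to fire concurrently is not reproduced here. The function name `fundamental_clusters` and the data layout are hypothetical.

```python
def fundamental_clusters(nodes, tree_edges, weight):
    """Map each tree branch (i, j) to (cluster, cluster_weight).

    The cluster is the set of nodes on i's side of the spanning tree
    once the branch (i, j) is removed.
    """
    adj = {v: set() for v in nodes}
    for i, j in tree_edges:
        adj[i].add(j)
        adj[j].add(i)

    result = {}
    for i, j in tree_edges:
        # Depth-first search from i that never crosses the branch (i, j).
        cluster, stack = {i}, [i]
        while stack:
            u = stack.pop()
            for v in adj[u]:
                if (u, v) in ((i, j), (j, i)) or v in cluster:
                    continue
                cluster.add(v)
                stack.append(v)
        result[(i, j)] = (cluster, sum(weight[v] for v in cluster))
    return result


nodes = [1, 2, 3, 4]
tree = [(1, 2), (2, 3), (2, 4)]
weight = {1: -2, 2: 1, 3: -1, 4: 2}
clusters = fundamental_clusters(nodes, tree, weight)
negative = [branch for branch, (_, w) in clusters.items() if w < 0]
```

Here the branches (1,2) and (2,4) have negative-weight fundamental clusters and are therefore candidates for firing, while the cluster of branch (2,3) has positive weight.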
At the end of the above steps, the solution may not be basic. In that case, we repeat the above process until an optimum basic feasible solution is obtained.
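The cluster construction used in this process (grouping nodes joined by zero-token edges, summing their weights, and keeping the smallest token between each ordered pair of clusters) can be sketched as follows. The union-find representation and the function name `contract_zero_token_clusters` are assumptions made for illustration only.

```python
def contract_zero_token_clusters(nodes, weight, tokens):
    """Contract the components induced by zero-token edges.

    Returns (cluster_of, cluster_weight, cluster_tokens), where clusters
    are identified by an arbitrary representative node.
    """
    parent = {v: v for v in nodes}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    # Nodes joined by a zero-token edge belong to the same cluster.
    for (u, v), m in tokens.items():
        if m == 0:
            parent[find(u)] = find(v)

    cluster_of = {v: find(v) for v in nodes}

    # Cluster weight is the sum of the weights of its member nodes.
    cluster_weight = {}
    for v in nodes:
        r = cluster_of[v]
        cluster_weight[r] = cluster_weight.get(r, 0) + weight[v]

    # Between two clusters keep the smallest token in each direction.
    cluster_tokens = {}
    for (u, v), m in tokens.items():
        key = (cluster_of[u], cluster_of[v])
        if key[0] != key[1]:
            cluster_tokens[key] = min(m, cluster_tokens.get(key, m))
    return cluster_of, cluster_weight, cluster_tokens


nodes = [1, 2, 3, 4, 5]
weight = {1: 2, 2: -1, 3: -3, 4: 1, 5: 1}
tokens = {(1, 2): 0, (2, 3): 4, (3, 4): 0, (1, 4): 2, (4, 5): 3, (5, 3): 1}
cluster_of, cluster_weight, cluster_tokens = contract_zero_token_clusters(
    nodes, weight, tokens
)
```

In this example the nodes fall into three clusters, {1, 2}, {3, 4} and {5}; the cluster {3, 4} has weight -2, and the edge from {1, 2} to {3, 4} carries token 2, the smaller of the two tokens between them.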
Thus, summarizing, our approach consists of the following main steps. We start with the given graph and the associated edge tokens prescribed by $M_0$.
Step 1: We apply Algorithm FEASIBLE to test feasibility of the problem. At the end of this step, all the edge tokens will be non-negative.
Step 2: We then fire negative-weight nodes as much as possible with a view to decreasing the objective function. This process can be performed very efficiently using Algorithm SHORT-PATH. When no more firings of negative-weight nodes are possible, the nodes will partition into clusters. At this point, we fire negative-weight clusters as much as possible. Again, this can be done efficiently by constructing a smaller graph in which a node represents a cluster and then applying Algorithm SHORT-PATH on this new graph.
Step 3: We repeat Step 2 until we obtain a basic solution (represented by a spanning tree) of the dual simplex. At this point, we test the optimality of the solution using the simplex optimality criterion. If the solution is not optimal, we concurrently fire negative-weight fundamental clusters in an appropriate manner and decrease the value of the objective as much as possible.
At the end of Step 3, the solution may not be basic. We now repeat Steps 2 and 3 until optimality is reached.
The interesting features of the above approach are:
(1) No auxiliary graph is constructed to test feasibility.
(2) Node and cluster firing operations of Step 2 can be performed efficiently using Algorithm SHORT-PATH.
(3) Several simplex pivot operations are performed concurrently in Step 3. The classical simplex approach does not permit concurrent pivots because these operations will destroy basicness of the solution.
(4) If the solution at the end of Step 3 is not basic, Step 2 will convert it to a basic solution with an objective value less than that of the previous basic solution. In other words, Step 2 converts a non-basic solution to a basic one without ever increasing the value of the objective function.
An implementation of this algorithm suitable for execution on the Hypercube computer is now under way. We are currently trying to port this dual algorithm to the Cosmic Environment and Reactive Kernel systems (CE/RK) [11], which support a multi-process message-passing environment. The CE/RK environment provides uniform communication between processes independent of their actual location. Since our algorithm consists of a collection of processes which communicate with each other by passing messages, it will be fairly easy to port this algorithm to the CE/RK environment. We plan to study the run-time characteristics of the parallel algorithm under these conditions.
IV. SUMMARY
In this paper we have presented the essential features of a parallel algorithm for layout compaction. Starting with a formulation in terms of the dual transshipment problem, we have designed a parallel algorithm which, in addition to shortest-path computations, involves concurrent pivot operations. This algorithm can also be used to study the relative placement problem studied in [7]. The parallel algorithm is now under implementation.
