Placement of multiple dies on an MCM or high-performance VLSI substrate is a non-trivial task in which multiple criteria must be considered simultaneously to obtain a true multi-objective optimization. Unfortunately, the exact physical attributes of a design are not known at the placement step; they become available only after the entire design process has been carried out. When performance issues are considered, crosstalk noise constraints in the form of net separation and via constraints become important. In this paper, to obtain better performance and wirability estimation during placement for MCMs, several performance constraints are taken into account simultaneously. A graph-based wirability estimation, combined with a genetic placement optimization technique, is proposed to minimize crosstalk, crossings, wirelength, and the number of layers. Our work is significant since it is the first attempt at bringing crosstalk and other performance issues into the placement domain.
I. Introduction
Rapid growth of multimedia and communication systems demands the use of both analog/digital mixed-signal ICs and deep sub-micron (below 0.6 µm) CMOS technologies. The higher density and improved electrical performance of MCM technology are needed in these systems.
The first step of the physical design process is placement. Over the years, a wide variety of placement algorithms have been developed. For a comprehensive overview of placement and routing algorithms, see [SDP96]. One example is from Esbensen and Kuh [EK96b], [EK96a], who present genetic-based multi-objective optimization in a floorplanning tool called Explorer, which generates alternative solutions with different main objectives and simultaneously minimizes layout area, deviation from the target aspect ratio, routing congestion, and the maximum path delay.
In mixed analog-digital layout design and deep sub-micron CMOS technologies, automated synthesis of interconnections during the early placement stage of the design cycle is emerging as a most promising approach. High-speed design techniques and signal integrity analysis become important when the propagation delay across the interconnect is 20-25% of the signal rise/fall time. Current placement-level design models do not simultaneously capture important physical design signal integrity effects such as crosstalk, power, and timing, which are becoming first-order factors in chip performance. Crosstalk noise should be considered because crosstalk between long wires increases delay (because of the larger effective line capacitance), degrades signal integrity, and causes logic faults. Crosstalk contributes as much as 50-75% of the interconnection delay [Bak90] as the wire width and the spacing between wires are reduced.
As physical feature sizes decrease, the time delay of electrical signals traveling in the interconnect between active devices and gates is approaching the delay through the devices and gates themselves. The parasitic information of the interconnect is absolutely critical to predicting circuit performance. Physical interconnection delay will overtake gate delay as a design concern by the year 2000, mandating a shift in the physical design flow for deep-submicron technologies. As a consequence, iterations between synthesis and layout increase dramatically due to timing and routability problems. The key to solving this problem is knowing more about the physical design, i.e., placement and estimated interconnect, early in the design cycle. RTL is being defined to accurately predict size, timing, and power early in the design cycle and so avoid downstream iterations. This means the design engineer needs to get back to the fundamentals of physics. It is the goal of this paper to explore physical-level solutions to the deep-submicron problems.
In MCMs (resp. deep-submicron layouts), some of the nets connecting bare dies (or modules) can be so long that their resistance is comparable to the resistance of the driver. Performance-driven placement is very important since the interconnect delays form a major part of the system cycle time. A resistance-driven placement algorithm has been proposed in [SK94].
However, when more performance issues are considered, e.g., in deep-submicron technologies, additional constraints in the form of net separation and via constraints become important. This is because the fabrication of densely routed designs may result in low fabrication yield or in a design with excessive crosstalk. Excessive local congestion gives rise to later routing difficulty and also increases the potential crosstalk noise in high-speed signal lines. PCB/IC design has considered electrical interference (crosstalk and reflection), which is very likely to be encountered in MCM designs because of the fast signal rise times and long chip-to-chip connections. Furthermore, larger capacitance increases the power consumption of the dies. Crosstalk is minimized by ensuring that wires carrying high-activity signals are placed sufficiently far from the other wires. Moreover, for high-performance MCM routing, intersections of wires cause the use of more vias which, in turn, require more routing resources (because of the large via pitch), lower the manufacturing yield, and cause noise problems (because of the mismatched characteristic impedance between wires and vias) [Sys93]. As a result, wire design becomes the central problem as we move toward sub-0.1 µm design.
The problem of crosstalk is typically addressed after the placement step. It is now the responsibility of the P&R tool to detect potential crosstalk as early as possible. The next step in physical design is to assign every global route in the layout environment to a plane pair, called layer assignment, so that the capacity constraints are satisfied on all plane pairs and the number of plane pairs is minimized. A number of papers have been published on crosstalk issues. The layer assignment algorithms to reduce crosstalk presented in [HSVW90], [CRS+93], [CW96a] maximize the layer separation between interfering nets, so as to reduce both intralayer and interlayer crosstalk.
There are several works related to crosstalk-minimum routing. Recently, [TCD95] proposed a channel-based thin-film wiring methodology (using two layers) that considers crosstalk minimization between adjacent transmission lines. [SFM+90] described a new detailed routing method for analog full-custom ICs. The router is gridless and performs electrical optimization of crosstalk, resistivity, ground capacitance, and electrical symmetry at a symbolic level. The algorithm is a sequential routing based on a wire ordering. [KAE93] proposed a spacing algorithm for performance enhancement and crosstalk minimization, and [GL93], [CC97] proposed a channel routing enhancement to minimize crosstalk. In [CW96b], the routing area is decomposed into channels, which are dense and thus crosstalk-sensitive areas. The main goal of the MCM router developed in [Dev94] is to route all the nets with a minimum number of layers and to reduce crosstalk by separating high-frequency wires, with a bound on the number of vias used in routing. The routing is carried out by scanning the routing region horizontally or vertically from one end of the substrate to the other. The algorithm is an extension of [KC93]. A post-global-routing crosstalk risk estimation and reduction is presented in [XKW96].
There are no known reports on minimizing crosstalk during placement for high-performance circuits. We need a reasonably fast timing and wirability estimation during placement. Timing errors due to crosstalk and timing violations found after layout require another design iteration. However, if we handle this problem at an earlier stage such as placement, timing errors are much less likely in the later routing stage. This shortens the design cycle and thus lowers the design cost.
As a result, early estimation of wirability during placement is important, but net topology is difficult to estimate at the placement stage. One way to overcome the problem is to consider placement and global routing simultaneously. However, the high problem complexity may prevent an effective solution. The advantage of our solution over existing methods is that we introduce a fast topology-mapped global routing (described in the next section) to take several performance constraints into account simultaneously. The problem is formulated as a graph-based optimization problem. Our work is significant and innovative since it is the first attempt at bringing the crosstalk and crossing minimization problem into the placement domain.
In Section 2, we formulate the problem. In Section 3, a new heuristic to find a global routing estimation using one-bend routes is presented. An efficient solution to the problem using a genetic algorithm is then proposed in Section 4. Experimental results and the conclusion are presented in Sections 5 and 6, respectively.
II. The New Estimation
Our placement model targets MCMs, but can also be applied to module placement in a chip layout. The input is a set of rectangular chips of the same size, with pins fixed within each block, and a specification of $n$ nets, including timing constraints on the nets. Each output solution specifies an absolute position for each chip. The problem is stated as follows:
Given a set of chips $C$ and a set of chip sites $S$, find a mapping $\phi : C \to S$ that minimizes the crosstalk, crossings, and total wirelength needed for routing, and that ensures routability of the design in a minimum number of routing layers. Conventionally, a cost metric based on wirelength plus congestion is used to increase wirability. In our formulation, however, we do not consider the congestion measure explicitly. We observed experimentally that congestion is minimized automatically when crosstalk and crossings are minimized simultaneously, because this distributes wires evenly over the MCM substrate. Note that minimizing the number of crossings reduces the wirelength, whereas minimizing the crosstalk does not always do so. Next, we introduce a new interference measure based on crosstalk and crossings.
Net Topology and Graph Generation
Multi-terminal nets have many possible routing topologies, such as daisy chain, Steiner tree, star, and A-tree [VMS94], [Cho96]. However, it is impractical to consider all configurations of a large fan-out net because the number of net topologies grows rapidly with the number of receivers. In [RCS86], Raghavan, Cohoon, and Sahni demonstrated a polynomial-time solution ($O(n^2)$ time) for a one-layer routing problem on two-terminal nets called the single bend wirability problem, which is the problem of determining whether there exists a planar routing with at most one bend per net. The problem can be reduced to the 2-satisfiability problem. However, allowing multiple terminals renders the single bend wirability problem NP-complete [Yen94].
This formulation cannot be applied directly to our problem, which considers multiple constraints on wires. The bounding box measure (of wiring interference) for placement, which does not take net topologies fully into account, is not sufficient. Thus, for a two-terminal net $i$, we consider two possible one-bend global routes, denoted $i(\tau_1)$ and $i(\tau_2)$. It is desirable that multi-terminal nets are routed within the smallest bounding box enclosing the terminals belonging to the net, and with their preferred topologies as mentioned previously. For example, one may restrict a multi-terminal net to a specific routing pattern, such as a min-cost Steiner tree having minimum wirelength, minimum bends, and minimum stubs. A stub or branch in a tree introduces extra delay and/or ringing in the received signal waveform [SK93]. Evidently, a topology estimate obtained from a placement in this way is poorer than the estimate from global routing, but it is a necessary compromise for a strong coupling between placement and global routing.
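The two candidate one-bend routes of a two-terminal net can be enumerated directly from its terminal coordinates. The following is a minimal sketch, assuming integer grid coordinates and axis-aligned wiring; the function name `one_bend_routes` and the segment representation are illustrative choices, not taken from the paper.

```python
from typing import List, Tuple

Point = Tuple[int, int]
Segment = Tuple[Point, Point]


def one_bend_routes(src: Point, dst: Point) -> List[List[Segment]]:
    """Return the two L-shaped (one-bend) routes between src and dst.

    Route 0 runs horizontally first and then vertically; route 1 runs
    vertically first and then horizontally. Zero-length segments are
    dropped, so aligned terminals yield two identical straight routes.
    """
    (x1, y1), (x2, y2) = src, dst
    corners = [(x2, y1), (x1, y2)]  # bend point of each candidate route
    routes = []
    for corner in corners:
        segments = [(src, corner), (corner, dst)]
        routes.append([s for s in segments if s[0] != s[1]])
    return routes
```

For terminals at (0, 0) and (3, 2), the first route bends at (3, 0) and the second at (0, 2); these play the role of the two topologies of a net in the text.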
Based on these facts, given a placement, we create an interference graph $G = (V, E)$ (refer to Figure 2), where $|V| = 2n$ and $|E| \le 2n(n-1)$ (in the case of two-terminal nets), to formulate the interference relation between $n$ nets. In $G$, each of the $n$ nets appears with two types of topology ($i(\tau_1)$ and $i(\tau_2)$), thus $|V| = 2n$. Edges are formed by connecting every node $i(\tau_1)$ to the $2(n-1)$ other nodes, excluding $i(\tau_2)$. Each node in $V$ represents a net with one of its topologies, and the weight on an edge in $E$ represents the net-pair crosstalk and crossings, measured as described below. If there is no crosstalk effect between two nets, then there is no corresponding edge in $G$; thus $|E|$ can be less than or equal to $2n(n-1)$. In general, a graph $G(V, E)$ is a comparability graph if it satisfies the transitive orientation property, i.e., if it has an orientation such that in the resulting directed graph $(V, F)$, $(a, b) \in F$ and $(b, c) \in F$ imply $(a, c) \in F$.
Crosstalk Measure
A popular approach used in the past to model the dependency of performance functions on parasitics is net classification. Nets are classified according to the type of signal they carry (stable, large swing, sensitive to noise, etc.). A bus of several sensitive nets running parallel to each other with correlated signals might inject considerable noise into a single net. The crosstalk-critical region is defined as a region enclosed by two wire segments of net $i$ and net $j$ such that their coupling distance $d(i, j)$ (i.e., the distance between two wire segments of different nets that run in parallel) is less than or equal to a small constant $\delta$. The value of $\delta$ depends on the device technology. For example, using AC device technology on an MCM-L layer, $\delta = 1$ cm [CW96a]. The shaded regions in Figure 1 correspond to the set of crosstalk-critical regions induced by the given global routes of the two nets.
The crosstalk between two nets $i(\tau_p)$ with topology $\tau_p$ and $j(\tau_q)$ with topology $\tau_q$, denoted $\chi(i(\tau_p), j(\tau_q))$, is estimated as proportional to the maximum length over which the two nets run in parallel and inversely proportional to the minimum separation between the parallel wires:
where $K(i(\tau_p), j(\tau_q))$ is the set of crosstalk-critical regions between net $i$ with topology $\tau_p$ and net $j$ with topology $\tau_q$. The interference graph, with the net-pairwise crosstalk value as the edge weight, is established in $O(n^2)$ time. Then, the noise tolerance $T_{i(\tau_p)}$ for net $i$ with topology $\tau_p$ with respect to the crosstalk measure is approximated as
, where $M_{i(\tau_p)}$ is the maximum allowable coupled noise for net $i$ with topology type $\tau_p$ and $j(\tau_q)$ is the crosstalk-critically adjacent net with respect to net $i$, with topology type $\tau_q$.
We aim to identify the placement that either maximizes the sum of the noise tolerances or maximizes the minimum noise tolerance over all nets. Thus, our goal is to remove all noise tolerance violations.
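The crosstalk estimate and the noise tolerance can be illustrated with a small sketch over axis-aligned routes. It assumes that the estimate is the longest coupled length divided by the smallest separation among segment pairs lying within the coupling distance, and that the tolerance is the allowable noise minus the summed crosstalk from adjacent nets. Both forms, the constant `k`, and all function names are assumptions made for illustration, not the paper's exact equations.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Segment = Tuple[Point, Point]


def parallel_overlap(a: Segment, b: Segment) -> Tuple[float, float]:
    """Return (overlap length, separation) of two axis-aligned segments.

    The overlap is 0.0 when the segments are perpendicular or share no
    span along their common direction.
    """
    (ax1, ay1), (ax2, ay2) = a
    (bx1, by1), (bx2, by2) = b
    if ay1 == ay2 and by1 == by2:  # both horizontal
        lo = max(min(ax1, ax2), min(bx1, bx2))
        hi = min(max(ax1, ax2), max(bx1, bx2))
        return max(0.0, hi - lo), abs(ay1 - by1)
    if ax1 == ax2 and bx1 == bx2:  # both vertical
        lo = max(min(ay1, ay2), min(by1, by2))
        hi = min(max(ay1, ay2), max(by1, by2))
        return max(0.0, hi - lo), abs(ax1 - bx1)
    return 0.0, 0.0


def crosstalk(route_i: List[Segment], route_j: List[Segment],
              delta: float, k: float = 1.0) -> float:
    """Assumed estimate: k * (max coupled length) / (min separation)."""
    coupled = []
    for a in route_i:
        for b in route_j:
            length, sep = parallel_overlap(a, b)
            # Only pairs inside the crosstalk-critical distance count;
            # sep == 0 (wires on the same track) is ignored in this sketch.
            if length > 0 and 0 < sep <= delta:
                coupled.append((length, sep))
    if not coupled:
        return 0.0
    max_len = max(length for length, _ in coupled)
    min_sep = min(sep for _, sep in coupled)
    return k * max_len / min_sep


def noise_tolerance(max_noise: float, route: List[Segment],
                    neighbours: List[List[Segment]], delta: float) -> float:
    """Assumed form: allowable noise minus the summed crosstalk."""
    return max_noise - sum(crosstalk(route, other, delta) for other in neighbours)
```

A negative tolerance then marks a net whose coupled noise exceeds its budget, which is the kind of violation the placement tries to remove.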
Crossing Measure
To minimize the number of signal line crossings, and to minimize both congestion and wirelength, we also incorporate wire crossings into our cost function. Especially for analog nets, crossings are one of the dominant sources of crosstalk noise.
The crossing effect of a net pair $(i(\tau_p), j(\tau_q))$ can be computed as the number of intersection points between net $i$ with its topology $\tau_p$ and net $j$ with its topology $\tau_q$.
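Counting intersection points between two rectilinear routes reduces to testing every horizontal segment of one route against every vertical segment of the other. The sketch below assumes axis-aligned segments and counts endpoint touches as intersections; `crossing_points` is a hypothetical helper name, and collecting points in a set avoids counting a shared bend point twice.

```python
from typing import List, Set, Tuple

Point = Tuple[int, int]
Segment = Tuple[Point, Point]


def crossing_points(route_i: List[Segment], route_j: List[Segment]) -> Set[Point]:
    """Return the distinct points where the two routes intersect."""
    points: Set[Point] = set()
    for a in route_i:
        for b in route_j:
            # Try both role assignments: (horizontal, vertical).
            for h, v in ((a, b), (b, a)):
                (hx1, hy1), (hx2, hy2) = h
                (vx1, vy1), (vx2, vy2) = v
                if hy1 != hy2 or vx1 != vx2:
                    continue  # h must be horizontal and v vertical
                in_x = min(hx1, hx2) <= vx1 <= max(hx1, hx2)
                in_y = min(vy1, vy2) <= hy1 <= max(vy1, vy2)
                if in_x and in_y:
                    points.add((vx1, hy1))
    return points
```

The crossing measure of a net pair is then `len(crossing_points(route_i, route_j))`.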
Timing Constraints
We need to consider net topologies and sizes that cause delays larger than the performance requirements or paths longer than the maximum allowable driver-to-receiver path length. A method of generating bounds on both the net length and the width of lossy transmission line interconnects to satisfy timing and overshoot constraints of MCM and PCB designs is described in [FSe92]. The timing tolerance $S_{i(\tau)}$ for net $i$ with topology $\tau$ with respect to the maximum driver-to-receiver path length $L_{i(\tau)}$ is then approximated as $S_{i(\tau)} = M_{i(\tau)} - L_{i(\tau)}$, where $M_{i(\tau)}$ is the maximum allowable driver-to-receiver path length for net $i$ with topology $\tau$.
Objective (Fitness) Function
The first approach of our placement algorithm is to find a placement $\phi$ in $\Phi$ (a set of placements), together with its global routing $\omega$ among a set of global routing solutions $\Omega(\phi)$, that attains $\min_{\forall \phi \in \Phi} f(\phi)$, where the objective function $f(\phi)$ is $\min_{\forall \omega \in \Omega(\phi)} \{ \alpha \, \mathrm{norm}(\max_{i(\tau) \in N} \{-T_{i(\tau)}\}) + \beta \, \mathrm{norm}(\sum_{(i(\tau_p), j(\tau_q)) \in E} \xi(i(\tau_p), j(\tau_q))) + \gamma \, \mathrm{norm}(\max_{i(\tau) \in N} \{-S_{i(\tau)}\}) \}$. Here, $\xi$ denotes the crossing measure of a net pair, the user-defined parameters $\alpha$, $\beta$, and $\gamma$ reflect the relative importance of minimizing the crosstalk, crossing, and timing errors, respectively, and $\mathrm{norm}()$ is a function that normalizes the values of crosstalk, crossings, and timing errors so that they lie between 0 and 1.
A point of the design space is called a Pareto point if there is no other point (in the design space) with at least one inferior objective, all others being inferior or equal. A Pareto point corresponds to a global optimum in a monodimensional design evaluation space. The image of the Pareto points in the design evaluation space is the set of optimal trade-offs. We first select a placement that minimizes the wirelength while satisfying the given timing constraints. Given this initial placement, crosstalk and crossings are the next objectives to be minimized.
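As a rough illustration of how such a weighted, normalized objective could be evaluated for one placement and one global routing, consider the sketch below. It assumes min-max normalization against caller-supplied bounds, which the text does not specify; the names `normalize`, `fitness`, and `bounds` are hypothetical, and the outer minimization over routings is left out.

```python
from typing import Dict, Sequence, Tuple


def normalize(value: float, lo: float, hi: float) -> float:
    """Map value into [0, 1] by min-max scaling (an assumed choice of norm)."""
    if hi <= lo:
        return 0.0
    return min(1.0, max(0.0, (value - lo) / (hi - lo)))


def fitness(noise_tol: Sequence[float], crossings: Sequence[int],
            timing_tol: Sequence[float],
            bounds: Dict[str, Tuple[float, float]],
            alpha: float, beta: float, gamma: float) -> float:
    """Weighted sum of worst noise violation, total crossings, worst timing violation."""
    worst_noise = max(-t for t in noise_tol)     # largest -T over all nets
    total_crossings = float(sum(crossings))      # summed over net pairs
    worst_timing = max(-s for s in timing_tol)   # largest -S over all nets
    return (alpha * normalize(worst_noise, *bounds["noise"])
            + beta * normalize(total_crossings, *bounds["crossings"])
            + gamma * normalize(worst_timing, *bounds["timing"]))
```

Lower values are better, so a genetic search would treat this as a cost to minimize.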
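The Pareto definition can be made concrete with a short filter over candidate solutions. This sketch assumes every objective is to be minimized and that each solution is summarized by a tuple of objective values; the function names are illustrative.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b everywhere and strictly better somewhere."""
    no_worse = all(x <= y for x, y in zip(a, b))
    better = any(x < y for x, y in zip(a, b))
    return no_worse and better


def pareto_front(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep the points that no other point dominates (input order preserved)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```

With objective tuples such as (crosstalk, wirelength), the returned list is the set of trade-offs from which a designer would choose.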
Note that we select a topology for each net from a set of topologies for that net given by the designer. We therefore refer to the problem as "Topology-Mapped Global Routing". The problem of identifying the global routing that minimizes the total crosstalk noise and the number of crossings can be formulated as finding a minimum edge-weighted clique $K_n$ of size $n$ in $G$. Note that in $G$ the maximum clique has $n$ nodes. In general graphs, both the maximum node-weighted and the maximum edge-weighted $k$-clique problems are NP-complete, but when restricted to a comparability graph, the exact algorithm for the node-weighted maximum clique problem can be implemented to run in time linear in the size of the graph [Gol80].
However, no exact polynomial-time algorithm for the minimum edge-weighted $k$-clique problem is known, even in comparability graphs. In general, given a complete undirected graph $G(V, E)$ with edge weights $w(i, j)$ and an integer $k$, the minimum edge-weighted $k$-clique problem is to find a clique with $k$ nodes of minimum total edge weight. The problem can be formulated as a 0-1 integer programming problem, which can be solved by any ILP package. The problem of minimizing the maximum crosstalk noise can be formulated similarly. We conjecture that the problem can be shown to be NP-complete. Evidently, ILP is not our choice for fast wirability estimation, even though it can be used for global routing.
Our experimental results with some heuristics for the problem showed that the edge-weighted formulation does not work well in practice, owing to the intractable complexity of finding a minimum edge-weighted clique.
Motivated by this fact, for fast and reasonable estimation, the problem is transformed into a version of finding a maximum node-weighted clique of a comparability graph $G$. Each node $i(\tau_p)$ is assigned its corresponding average
, where $|Q|$ is the number of different topologies for net $j$ and the other symbols are defined as in the previous section. Refer to Figure 2. The crossing measure on node $i$ can be computed similarly. The problem can be solved optimally in linear time by selecting, for each net, the vertex with maximum weight among the vertices corresponding to that net. In this case, the result of minimizing the maximum crosstalk (equivalently, maximizing the minimum crosstalk noise tolerance) is the same as that of minimizing the total crosstalk.
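One standard way to write a minimum edge-weighted $k$-clique problem as a 0-1 program is sketched below. It is a textbook linearization given for orientation, not necessarily the formulation the authors used; the variables $x_i$ (node $i$ is selected) and $y_{ij}$ (both endpoints of edge $(i, j)$ are selected) are introduced here for illustration.

```latex
\begin{align*}
\min \quad & \sum_{(i,j) \in E} w(i,j)\, y_{ij} \\
\text{s.t.} \quad & \sum_{i \in V} x_i = k, \\
& y_{ij} \ge x_i + x_j - 1, \quad y_{ij} \le x_i, \quad y_{ij} \le x_j
  && \forall (i,j) \in E, \\
& x_i \in \{0,1\}, \quad y_{ij} \in \{0,1\}
  && \forall i \in V,\ \forall (i,j) \in E.
\end{align*}
```

In the topology-mapping setting one would presumably also require exactly one selected node per net, for example $x_{i(\tau_1)} + x_{i(\tau_2)} = 1$ for every two-terminal net $i$; this extra constraint is likewise an assumption.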
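The per-net selection step can be sketched as follows. The sketch assumes that the weight of a node is the negated interference it receives from every other net, averaged over that net's candidate topologies, so that the maximum-weight node of a net is its least-interfering topology. This weighting, and the names `node_weight`, `select_topologies`, and `pair_cost`, are assumptions made for illustration.

```python
from typing import Callable, Dict, Hashable, Tuple

Node = Tuple[Hashable, int]  # (net id, topology index)


def node_weight(node: Node, topologies: Dict[Hashable, int],
                pair_cost: Callable[[Node, Node], float]) -> float:
    """Negated interference of a node, averaged over each other net's topologies."""
    net, _ = node
    total = 0.0
    for other, count in topologies.items():
        if other == net:
            continue  # the two topologies of one net are never adjacent
        total += sum(pair_cost(node, (other, q)) for q in range(count)) / count
    return -total


def select_topologies(topologies: Dict[Hashable, int],
                      pair_cost: Callable[[Node, Node], float]) -> Dict[Hashable, int]:
    """Pick, for every net, the topology index whose node weight is largest."""
    return {
        net: max(range(count),
                 key=lambda p: node_weight((net, p), topologies, pair_cost))
        for net, count in topologies.items()
    }
```

Here `topologies` maps each net to its number of candidate topologies and `pair_cost` returns the crosstalk (or crossing) value of an edge. Each edge is inspected a constant number of times, consistent with the linear-time claim in the text.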
IV. Placement with Genetic Algorithm
Over the years, a wide variety of placement algorithms have been developed. Shahookar and Mazumder [SM91] have provided a survey of various placement techniques. For solving simultaneous multi-objective optimization problems, iterative probabilistic search algorithms such as simulated annealing [Sec88] or genetic algorithms are used, and the process is iterated until some stopping criterion is satisfied. Several genetic placement algorithms have been presented [?], [J.P91], [Vem94]. The problem of crosstalk is typically addressed after the placement step. There is no known report on minimizing crossings and crosstalk simultaneously during placement; thus our work is significant and innovative since it is the first attempt at bringing crosstalk and crossings into the placement domain.
The usual string representation in a genetic placement algorithm is as follows. The $i$-th element of a string corresponds to the placement of the $i$-th chip, in row-major order, among all possible placement locations in all the rows of the MCM layout. Each string is also associated with the fitness cost function described before. Finding an appropriate set of parameters for a GA is crucial to its performance. A genetic algorithm is characterized by the number of offspring to be generated ($0 < C < 1$) and the fraction of the population to be mutated ($0 < M < 1$). Optimal values for these parameters, $C$ and $M$, are obtained automatically by running another meta-genetic algorithm that optimizes them. Esbensen and Mazumder [EM94] have combined the genetic algorithm and the simulated annealing algorithm to speed up the optimization search and obtain better placements than either algorithm alone. Our genetic paradigm is similar to the above approach except that we set $C = M = 1$. Our algorithm can be described as follows.
Procedure GENETIC PLACE() 
END Procedure
The procedures for crossover and mutation are as usual, for example as in [SM90], and are not shown for brevity. In summary, our approach is depicted in Figure 3.
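For readers unfamiliar with permutation-encoded genetic placement, a generic skeleton is sketched below. It is not the authors' GENETIC PLACE procedure: the order crossover, swap mutation, elitist survivor selection, and all parameter names are assumptions chosen for brevity. Every offspring is mutated, loosely mirroring the setting $C = M = 1$.

```python
import random
from typing import Callable, List

Placement = List[int]  # placement[i] = site index assigned to chip i


def order_crossover(p1: Placement, p2: Placement, rng: random.Random) -> Placement:
    """Copy a random slice from p1 and fill the rest in p2's order."""
    n = len(p1)
    lo, hi = sorted(rng.sample(range(n + 1), 2))
    kept = set(p1[lo:hi])
    fill = iter(g for g in p2 if g not in kept)
    return [p1[i] if lo <= i < hi else next(fill) for i in range(n)]


def swap_mutation(perm: Placement, rng: random.Random) -> None:
    """Exchange the sites of two randomly chosen chips (in place)."""
    i, j = rng.sample(range(len(perm)), 2)
    perm[i], perm[j] = perm[j], perm[i]


def genetic_place(num_chips: int, cost: Callable[[Placement], float],
                  population_size: int = 20, generations: int = 50,
                  seed: int = 0) -> Placement:
    """Return the lowest-cost placement found (needs num_chips >= 2)."""
    rng = random.Random(seed)
    population = [rng.sample(range(num_chips), num_chips)
                  for _ in range(population_size)]
    for _ in range(generations):
        offspring = []
        for _ in range(population_size):
            p1, p2 = rng.sample(population, 2)
            child = order_crossover(p1, p2, rng)
            swap_mutation(child, rng)  # every offspring is mutated
            offspring.append(child)
        # Elitist survivor selection: keep the best of parents and offspring.
        population = sorted(population + offspring, key=cost)[:population_size]
    return population[0]
```

In the setting of this paper, `cost` would be the fitness function evaluated after the topology-mapped global routing estimate of each candidate placement.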
V. Experimental Results
To evaluate the effectiveness of the algorithm, the placer was implemented in C and was tested on MCC1, Ami49, and apte from the MCNC benchmarks. Table 1 gives the description of the designs on which the placer was tested. All the experiments were run on a SUN SPARCclassic.
To see the impact of crosstalk and crossings, we compared the proposed method with a placement that does not consider crosstalk and crossings, and with different values of $\alpha$ and $\beta$ in the objective function of the previous section. Our approach is similar to that of [EK96b], but differs in generating a set of alternative solutions by controlling the parameters $\alpha$ and $\beta$. Among all the solutions with different values of the parameters, we select the ones that correspond to the Pareto point set. The user can then choose his or her favorite solutions.
We selected the solution with minimum crosstalk among 9 different values of $\alpha$ and $\beta$. Figure 4 shows the result for MCC1. The best solution was obtained with $\alpha = 0.2$ and $\beta = 0.8$. After performing our placement algorithm, the wirability (i.e., the number of layers required), wirelength, and crosstalk were compared by running a crosstalk-driven MCM router [Dev94]. As shown in Table 2, crosstalk is significantly reduced to 26% on average. Furthermore, wirelength and the number of layers decreased by 8.9% and 8.3%, respectively, on average. For example, in the case of MCC1, the wirelength and crosstalk improved by 14% and 17%, respectively, without an increase in the number of layers. Running our algorithm took about 2 hours of clock time (not CPU time) for MCC1, 30 minutes for Ami49, and 10 minutes for apte; the computing time was mainly due to the wirability estimation step described in Section 3. Finally, we investigated the effect of minimizing crosstalk on wirelength. Recall that crosstalk is proportional to the maximum length over which two nets run in parallel. Figure 6 shows the relationship between wirelength and crosstalk. It is interesting to observe that our placement algorithm implicitly minimized the wirelength metric without a wirelength measure in the objective function. Thus, crosstalk can serve as a "universal" metric for optimizing deep-submicron physical designs.
VI. Conclusion
We have presented an effective crosstalk-minimum placement algorithm that considers minimum-bend global routing simultaneously. A genetic approach to solving the placement problem is presented. A novel graph formulation that finds a minimum node-weighted k-clique on a comparability graph is presented for fast wirability estimation and performance enhancement.
Whether finding an optimal minimum edge-weighted k-clique in interval graphs is NP-complete is not known; it remains an open problem. One future direction is therefore to develop a more efficient heuristic to find a near-optimal or optimal minimum edge-weighted k-clique in an interference graph.
