Abstract: Ball Grid Array packages, in which I/O pins are arranged in a grid array pattern, realize a large number of connections between chips and a printed circuit board, but routing them manually takes much time. We propose a fast routing method for 2-layer Ball Grid Array packages to support designers. Our method distributes wires evenly on the top layer and increases the completion ratio of nets by iteratively improving the via assignment.
I. Introduction
In current VLSI circuits, there can be hundreds of required I/O pins. Instead of the Dual In-line Package (DIP) or Quad Flat Package (QFP), in which the number of available I/O pins is small, Ball Grid Array (BGA) packages are used to realize a large number of connections between VLSI chips and printed circuit boards (PCBs).
Though many routing approaches exist and some of them are included in tools, most tools target routing on PCBs or VLSI chips. Most approaches proposed so far cannot be applied directly to BGA package routing, which has its own requirements and constraints, since it is hard for them to obtain a routing pattern as good as a manual one. The structure of BGA packages is symmetric, and given netlists have some properties. In current design practice, designers generate routing patterns for BGA packages by using such properties. We therefore introduce know-how from manual routing into our method to obtain a satisfactory routing pattern efficiently.
The first approach for BGA package routing was proposed in [1] and was improved in [2] . In these approaches, it is assumed that there is a single routing layer, and that a netlist is not given. These approaches generate a netlist and a global route for each net. Each net connects a finger and a ball. The objective is to balance the congestion over the routing area and shorten the wire length of each net. Since a netlist is usually given in package routing design, these approaches are primarily used for package architecture design or flip-chip bonding design. For a given netlist in two-layer BGA model, as shown in Fig. 1 , global routing on layer 1 may be possible by using these algorithms if a candidate of via positions is considered as a ball. The feasibility of the global routes on layer 2, however, is not guaranteed.
Algorithms for multi-layer Pin Grid Array (PGA) and BGA routing have been proposed in [3] and [4] , respectively. These algorithms first assign each net to a layer and then generate routes in each layer. However, neither the feasibility of the routes from the finger of each net to the assigned layer nor the routes from the assigned layer to the ball of the net are guaranteed. These routes require vias that are large compared to the wire width, and these algorithms omit the via assignment planning, which is the most difficult part of package routing.
A via assignment and global routing method for single-chip two-layer BGA packages that considers total wire length and wire congestion has been proposed as the first stage of package substrate routing in [5], and this method has been improved in [6]. In these papers, the concepts of monotonic global routing and monotonic via assignment, focusing mainly on layer 1, are introduced. In the method, a via assignment is iteratively improved to minimize the maximum wire congestion on layer 1 while the total wire length on layer 2 is kept small enough.
Though the method achieves small total wire length and congestion, several enhancements are required in order to use it in actual package routing design. In the method, since the via of a net is placed near the ball of the net, the wire of each net on layer 2 is short and routing on layer 2 does not seem difficult. However, there is no guarantee that 100% routing on layer 2 is possible. Moreover, various kinds of obstacles exist in a package substrate. For example, mold gates, from which resin is poured into the package, are placed on layer 1. In the region where a mold gate is placed, routing on layer 1 is not allowed but routing on layer 2 is allowed. Mold gates make it difficult to generate 100% routing, since the via of a net may be placed away from its ball if the ball is under a mold gate. Even if a via assignment is evaluated favorably by the cost function, it cannot be adopted if 100% routing is impossible.
In this paper, we propose a via assignment and global routing method which is an enhancement of the method proposed in [6]. In our proposed method, the maximum wire congestion on layer 1 and the wire length of a net on layer 1 and layer 2 are minimized. Moreover, a global routing on layer 2 is generated in the final stage to guarantee 100% routing if a feasible via assignment is obtained. Our method consists of two phases. In the first phase, a via assignment is iteratively improved to minimize the maximum wire congestion on layer 1 while the total wire length on layer 2 is kept small enough. Though this phase is based on the method proposed in [6], the computational complexity of obtaining the maximum gain is improved from $O(N^2)$ to $O(N)$, where $N$ is the number of grid nodes. In the second phase, a via assignment is iteratively improved to improve the routability on layer 2 while the maximum wire congestion on layer 1 is maintained. In this phase, global routing on layer 2 is generated. A new modification that improves the routability on layer 2 while maintaining the maximum wire congestion on layer 1 is introduced. In our experiments, our method obtains a via assignment that distributes wires evenly in less time than the method proposed in [6], and the routability on layer 2 is drastically improved.
II. Preliminary

A. Problem definition
In this paper, we consider a basic model of a BGA package as shown in Fig. 1. Our BGA package model has two routing layers and a single chip which is smaller than the package. A bonding finger, which we will refer to as a finger, is connected to the chip by a bonding wire. Bonding fingers are placed on the perimeter of a rectangle enclosing the chip on layer 1. A solder ball, which we will refer to as a ball, is an I/O pin of the package, and is connected to the PCB. Solder balls are placed in a grid array pattern on layer 2. There are connection requirements between bonding fingers and solder balls. A connection requirement is called a net, and is realized by wires on each layer and vias which connect wires on different layers. At most one via can be placed in the area surrounded by four adjacent balls.
Mold gates are located in some corners of the top layer to pour resin into the package. In the region where a mold gate is placed, routing on layer 1 is not allowed but routing on layer 2 is allowed.
A ring structure used for electroplating surrounds the package. Each net should be connected to the ring in order to enable electroplating to protect its wires. The extra connection of a net to the ring is called a plating lead. The ring is cut off when the package is used. A plating lead is redundant for operation, but is normally used to reduce the fabrication cost and to improve the reliability.
The routing area of a package is usually divided into sectors. Our approach is applied to each sector. In the following, we focus on the bottom sector as shown in Fig. 2 .
In this paper, we assume that a net consists of a finger and a ball. Nets are labeled according to the order of fingers on the perimeter from left to right as $n_1, n_2, n_3, \ldots$.
Since the radius of a ball is large compared to the interval of the balls, the number of possible routes between adjacent balls on layer 2 is at most one. Therefore, routes on layer 2 should be short and most of plating leads should be routed on layer 1. For this reason, we restrict the route of each net so that it has only one via, the wire on layer 1 connects the finger of the net and the ring through the via of the net, and the wire on layer 2 connects the ball and the via of the net.
The set of candidate locations of vias which include the locations within a mold gate is represented by the via grid array N . The interval of via grid array N is the same as that of the balls, and is unit length as shown in Fig. 2 . An element in N is called a grid node. We assume the number of possible routes on layer 1 between two vias placed in adjacent grid nodes under a design rule is at most h where h is the number of rows of balls.
The via of net $n_i$ is denoted by $v_i$.
The routing problem for a two-layer BGA package is defined as follows:
A routing problem for 2-layer BGA
Input: Fingers, balls, and a netlist (connection requirements between fingers and balls)
Output: A via assignment Φ, the corresponding routing on layer 1, and routing on layer 2
Objective: Minimize the total wire length and the maximum wire congestion
Constraint: All nets are realized, and vias are placed outside mold gates.
B. Monotonic via assignment
If the route of each net on layer 1 from its finger to the outer ring intersects every horizontal grid line only once, then the route is said to be monotonic. Otherwise, it is said to be non-monotonic. It is clear that a monotonic routing is possible for via assignment Φ if and only if $x_{v_i} < x_{v_j}$ is satisfied for any pair of nets $n_i$ and $n_j$ ($i < j$) such that $y_{v_i} = y_{v_j}$, where $(x_{v_i}, y_{v_i})$ is the grid node of the via $v_i$ of net $n_i$.
A via assignment is said to be monotonic if a monotonic routing of layer 1 is possible [5, 6] .
Given a monotonic via assignment, monotonic routing on layer 1 is uniquely determined. The via assignment shown in Fig. 3 is monotonic, and its routing is unique. For example, three vias $v_5$, $v_6$, and $v_{10}$ are assigned on the middle row in Fig. 3. The routes of nets $n_1$, $n_2$, $n_3$, and $n_4$ in monotonic routing need to pass to the left of $v_5$ as shown in Fig. 3.
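The monotonic condition can be checked row by row. The following Python sketch is only an illustration: the function name `is_monotonic` and the dictionary layout (net index mapped to the grid node `(x, y)` of its via) are assumptions made for the example and are not taken from [5, 6].

```python
from collections import defaultdict

def is_monotonic(via_pos):
    """Check the monotonic condition for a via assignment.

    via_pos maps a net index i (fingers ordered left to right) to the
    grid node (x, y) of via v_i. The assignment is monotonic if, on every
    row, the vias appear left to right in increasing net order.
    """
    rows = defaultdict(list)
    for net, (x, y) in via_pos.items():
        rows[y].append((net, x))
    for vias in rows.values():
        vias.sort()  # order the vias of this row by net index
        xs = [x for _, x in vias]
        # x coordinates must strictly increase with the net index
        if any(a >= b for a, b in zip(xs, xs[1:])):
            return False
    return True

example = {1: (0, 0), 2: (1, 0), 3: (0, 1), 4: (2, 0)}
swapped = {**example, 2: (2, 0), 4: (1, 0)}  # v_2 and v_4 exchanged
print(is_monotonic(example))  # True
print(is_monotonic(swapped))  # False
```

In `swapped`, net 2 lies to the right of net 4 on the same row, so no monotonic routing on layer 1 exists for it.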
C. Indices for Evaluation of a via assignment
In this section, the indices used in the evaluation of a via assignment are explained briefly.
The number of wires on layer 1 between via $v$ and the via above $v$ is denoted by $cut_a(v)$. If no via exists above $v$, $cut_a(v)$ is zero. The Manhattan distance between via $v$ and its ball is denoted by $d(v)$. The balance of wire congestion of via $v$ is denoted by $F(v)$. Details are explained in [6].
The indices defined above are also used in [6], and their calculations are discussed there. The indices defined below, which are not used in [6], are mainly used to improve the completion ratio of nets.
The illegality of via v is denoted by obs(v). That is, if v is on a mold gate, obs(v) = 1. Otherwise, obs(v) = 0.
The violation of wire congestion of layer 1 between via $v$ and the left of $v$ is denoted by $vio_l(v)$. That is, $vio_l(v) = \max\{density_l(v) - C_{MAX}, 0\}$, where $C_{MAX}$ is the allowable wire congestion of layer 1. The total violation is denoted by Δ. That is, Δ is the sum of violations over the whole via grid array.
The number of unconnected nets is denoted by $U$. The total wire length on layer 2 of connected nets is denoted by $L$. In order to evaluate $U$ and $L$, a routing graph for layer 2 is defined and a rip-up and reroute method is used, both of which are explained in Section V.
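As a small worked example of the violation index, the Python sketch below evaluates max{density − C_MAX, 0} per via and sums the results to obtain Δ. The function names and the density values are made up for illustration; how the density itself is computed is not shown.

```python
def violation(density_l, c_max):
    """Violation of one via: the excess of its density over C_MAX."""
    return max(density_l - c_max, 0)

def total_violation(densities, c_max):
    """Total violation (Delta): sum of violations over all grid nodes."""
    return sum(violation(d, c_max) for d in densities.values())

# Hypothetical densities for three grid nodes, with C_MAX = 4.
densities = {(0, 0): 3, (1, 0): 5, (2, 0): 7}
delta = total_violation(densities, c_max=4)
print(delta)  # 0 + 1 + 3 = 4
```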
D. Modifications
There are many ways to modify a via assignment. In this paper, four simple modifications are used.
(EXC) Two adjacent vias on a vertical grid line are exchanged.
(ROT) Three vias on a unit square on the via grid array are rotated.
(MSEQ) Vias are moved one by one to adjacent grid nodes on the via grid array until a grid node without a via is reached, where every horizontal movement is in the same direction (all left or all right) and every vertical movement is in the same direction (all above or all below).
(CEXC) Any two vias are exchanged.
EXC, ROT, and MSEQ have been proposed in [5] to improve the wire congestion of layer 1 while keeping the wire length of layer 2 as small as possible. Examples of them are shown in Fig. 4. CEXC, in contrast, is introduced here to improve the routability of layer 2 while keeping the global structure of the routing pattern of layer 1. See Fig. 5. EXC may drastically change routes of layer 1 while keeping the distance between the via and the ball, whereas CEXC may improve the routability of layer 2 without changing routes of layer 1 drastically if the number of wires of layer 1 between the vias is small.
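The two exchange-type modifications differ only in which pairs of vias they may touch. The Python sketch below illustrates this under an assumed data layout (net index mapped to the grid node `(x, y)` of its via); the names `exchange` and `is_exc` are hypothetical, and ROT and MSEQ are not covered.

```python
def exchange(via_pos, i, j):
    """Return a new assignment in which the vias of nets i and j swap nodes."""
    new_pos = dict(via_pos)
    new_pos[i], new_pos[j] = via_pos[j], via_pos[i]
    return new_pos

def is_exc(via_pos, i, j):
    """EXC applies only to two vias adjacent on a vertical grid line.

    Any pair that fails this test can still be exchanged as a CEXC.
    """
    (xi, yi), (xj, yj) = via_pos[i], via_pos[j]
    return xi == xj and abs(yi - yj) == 1

via_pos = {1: (0, 0), 2: (0, 1), 3: (2, 1)}
print(is_exc(via_pos, 1, 2))     # True: same column, adjacent rows
print(is_exc(via_pos, 1, 3))     # False: only a CEXC can exchange these
print(exchange(via_pos, 1, 3))   # {1: (2, 1), 2: (0, 1), 3: (0, 0)}
```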
III. Outline of our method
In our proposed method, an initial monotonic via assignment is generated by the method proposed in [5] . Then the initial via assignment is iteratively improved. Our method consists of two phases. In the first phase, a via assignment is iteratively improved under the monotonic condition to minimize the maximum wire congestion on layer 1 while the total wire length on layer 2 is kept to be small enough. In the second phase, a via assignment is iteratively improved under the monotonic condition to improve the routability on layer 2 while the maximum wire congestion on layer 1 is maintained.
The first phase is based on the method proposed in [6]. In this phase, three types of modification, EXC, ROT, and MSEQ, are used. In each iteration, the modification with the maximum gain among EXCs, ROTs, and MSEQs is applied to the current via assignment to improve the total wire length and the wire congestion. Though the initial via assignment may have vias placed on an obstacle, all vias are moved to the routing region by this iterative modification. In [6], it takes $O(|N|^2)$ time to find an MSEQ with the maximum gain. However, we show that it can be obtained in $O(|N|)$ time. In the first phase of our proposed method, each iteration takes only $O(|N|)$ time.
In the second phase, the via assignment generated by the first phase is iteratively modified by CEXC, which is proposed here, so that the maximum wire congestion on layer 1 is maintained and the routability on layer 2 is improved. In order to evaluate the routability on layer 2, the routing graph corresponding to the routing problem on layer 2 is introduced, and routes are generated on it. In each iteration, an allowable CEXC with the maximum gain is applied. In order to find an allowable CEXC with the maximum gain, the routing corresponding to each CEXC needs to be generated. However, this is not very time consuming, since the routes are changed incrementally and most of the routes are unchanged when a CEXC is applied.
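The iterative improvement shared by both phases can be sketched as a greedy loop that always applies the candidate modification with the largest gain. The Python sketch below is illustrative only: `improve`, `candidates`, and `cost` are hypothetical names, the gain is taken to be the decrease in cost, and stopping when no candidate has positive gain is an assumption, since the stopping rule is not stated here.

```python
def improve(assignment, candidates, cost, max_iters=1000):
    """Greedy iterative improvement.

    candidates(a) returns the assignments reachable from a by one allowed
    modification. The gain of a candidate is cost(a) - cost(candidate).
    The loop stops when no candidate has positive gain (assumed rule).
    """
    current = assignment
    for _ in range(max_iters):
        # The candidate with minimum cost is the one with maximum gain.
        best = min(candidates(current), key=cost, default=None)
        if best is None or cost(best) >= cost(current):
            break
        current = best
    return current

# Toy stand-in: the "assignment" is an integer and a modification moves
# it by one step; the cost is minimized at 3.
def neighbors(x):
    return [x - 1, x + 1]

def toy_cost(x):
    return (x - 3) ** 2

result = improve(0, neighbors, toy_cost)
print(result)  # 3
```

In the method described above, `candidates` would enumerate EXCs, ROTs, and MSEQs in the first phase and allowable CEXCs in the second phase.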
IV. The first phase
A. Cost of a via assignment
The routing cost for monotonic via assignment used in [6] is extended to move vias out of mold gates, since the initial via assignment may have vias on a mold gate.
The routing cost for monotonic via assignment Φ, which is denoted by $COST_1(\Phi)$, is defined as a weighted combination of the indices $cut_a(v)$, $d(v)$, $F(v)$, and $obs(v)$,
where $\alpha_1$, $\beta_1$, $\gamma_1$, and $\delta_1$ are coefficients. Note that $\delta_1$ is set much larger than the others in order to obtain a via assignment where vias are out of mold gates.
B. Improvement of the maximum gain computation
In the first phase of our method, a modification with the maximum gain under the monotonic condition is selected and applied to the current via assignment. The number of patterns of EXCs and ROTs is $O(|N|)$, which is small enough to enumerate all the patterns, while the number of patterns of MSEQs is exponential in the number of grid nodes. In order to find an MSEQ with the maximum gain in polynomial time, cost graphs are used.
A cost graph is a directed acyclic graph (DAG), and has some sources and sinks. All sources in the graph correspond to the start vias of MSEQs, and all sinks in the graph correspond to the end dummy vias of MSEQs. Every directed path from a source to a sink corresponds to an MSEQ, and the length of the path corresponds to the gain of the MSEQ. The type of an MSEQ is either above-left, above-right, below-left, or below-right, since the directions are restricted. An MSEQ with the maximum gain is obtained by generating a cost graph for each type and searching for a longest path on the graphs.
In [6], the cost graph for every MSEQ beginning with a via is constructed, and a longest path in each cost graph is obtained. The algorithm for constructing the above-right type cost graph is shown in Fig. 6. In a cost graph, the number of vertices in which $v$ is the last element of the label is at most seven. Note that a vertex is not generated if the via assignment becomes non-monotonic when the modification corresponding to the vertex is applied. The number of edges incident from a vertex is at most two. Therefore, the numbers of vertices and edges of a cost graph are $O(|N|)$. An example of the labels in which $v$ is the last element is given for the via assignment shown in Fig. 7.
Since each cost graph contains all start vias of its type as sources, and a longest path in a DAG can be found in time linear in the numbers of vertices and edges, a modification with the maximum gain on MSEQs can be obtained in $O(|N|)$ time.
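The linear-time bound rests on the standard longest-path computation on a DAG with several sources. The Python sketch below shows that computation on a toy graph. It is not the cost graph construction of Fig. 6: the vertex numbering, the edge gains, and the function name `longest_path` are assumptions made for the example.

```python
from collections import deque

def longest_path(num_vertices, edges, sources, sinks):
    """Longest source-to-sink path in a DAG in O(V + E) time.

    edges is a list of (u, v, gain); gains may be negative.
    Returns (best_gain, path) over all paths from any source to any sink.
    """
    adj = [[] for _ in range(num_vertices)]
    indeg = [0] * num_vertices
    for u, v, w in edges:
        adj[u].append((v, w))
        indeg[v] += 1
    neg_inf = float("-inf")
    dist = [neg_inf] * num_vertices
    pred = [None] * num_vertices
    for s in sources:
        dist[s] = 0
    # Process vertices in topological order (Kahn's algorithm).
    queue = deque(v for v in range(num_vertices) if indeg[v] == 0)
    while queue:
        u = queue.popleft()
        for v, w in adj[u]:
            if dist[u] != neg_inf and dist[u] + w > dist[v]:
                dist[v] = dist[u] + w
                pred[v] = u
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    end = max(sinks, key=lambda v: dist[v])
    path = []
    v = end
    while v is not None:  # walk predecessors back to a source
        path.append(v)
        v = pred[v]
    return dist[end], path[::-1]

# Toy graph: vertices 0 and 1 are sources, 3 and 4 are sinks.
edges = [(0, 2, 2), (1, 2, 5), (2, 3, 1), (2, 4, -1), (0, 3, 4)]
gain, path = longest_path(5, edges, sources=[0, 1], sinks=[3, 4])
print(gain, path)  # 6 [1, 2, 3]
```

Because every start via is a source of the same graph, one pass per MSEQ type suffices, rather than one pass per start via.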
V. The second phase
Though each via is placed near its ball in the output of the first phase, not all nets can be connected if the via assignment is poor. So, in the second phase, the via assignment generated by the first phase is iteratively modified by CEXC to improve the routability of layer 2 while the maximum wire congestion on layer 1 is maintained.
A. Evaluation of a via assignment
In the second phase, we use another cost, defined here, since the target is different from that of the first phase. Routability improvement has priority, since a near-optimal via assignment is obtained by the first phase.
The cost of a via assignment Φ is defined as follows:
$$COST_2(\Phi) = \alpha_2 \Delta + \beta_2 L + \gamma_2 U$$
where $\alpha_2$, $\beta_2$, and $\gamma_2$ are coefficients. Note that $\gamma_2$ is set much larger than the others in order to realize more nets.
The total violation Δ corresponds to the routability on layer 1, while the number of unconnected nets U corresponds to the routability on layer 2. The total wire length L on layer 2 is used to decrease U .
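The effect of a dominant coefficient on the number of unconnected nets can be seen in a small numerical sketch. The Python code below assumes that the second-phase cost is a plain weighted sum of Δ, L, and U, which is inferred from the description of the coefficients rather than stated explicitly. The name `cost2`, the value 1000 for the large coefficient, and the sample numbers are all hypothetical.

```python
def cost2(delta, length, unconnected, alpha2=1.0, beta2=0.25, gamma2=1000.0):
    """Assumed second-phase cost: alpha2 * Delta + beta2 * L + gamma2 * U.

    gamma2 = 1000 is an arbitrary stand-in for "much larger than the others".
    """
    return alpha2 * delta + beta2 * length + gamma2 * unconnected

# Assignment A: fewer violations and shorter wires, but one unconnected net.
cost_a = cost2(delta=2, length=40, unconnected=1)
# Assignment B: more violations and longer wires, but all nets connected.
cost_b = cost2(delta=3, length=48, unconnected=0)
print(cost_a, cost_b)  # 1012.0 15.0
```

Under these assumptions assignment B is preferred, which reflects the priority given to routability on layer 2.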
B. Routing graph on layer 2
A routing graph representing the routing resources on layer 2 is constructed. Its structure changes depending on the via assignment. The routing graph has ball vertices, via vertices, and extra vertices. A ball vertex and a via vertex correspond to a ball and a via, respectively. The number of routes passing between two adjacent balls is at most one, since the ball radius is large. The subgraph of a routing graph in Fig. 8(a) corresponds to a grid in which a ball exists in each corner and to which a via is not assigned. The subgraph of a routing graph in Fig. 8(b) corresponds to a grid to which a via is assigned.
A global routing on layer 2 is obtained by using a rip-up and reroute technique on the routing graph. In each iteration of the rip-up and reroute method, a shortest path of each net is sequentially generated on the routing graph, regarding the routes of the other nets as obstacles. If the route of a net cannot be found, then a shortest path is generated in the graph without the other routes, and the routes of the other nets which intersect the found shortest path are ripped up. Whenever a route is ripped up, the weight of each vertex on the ripped-up route and on the found shortest path is increased, to avoid cycles in which the routes of two nets are alternately generated and ripped up. The number of iterations of the rip-up and reroute method is restricted. Therefore, some nets remain unconnected if 100% routing cannot be found within the prespecified number of iterations.
In order to find an allowable CEXC with the maximum gain, a global routing on layer 2 is generated for each routing graph obtained by applying an allowable CEXC. In the routing graph corresponding to an allowable CEXC, the routes of the two nets whose vias are exchanged have to be regenerated. However, most of the routes away from the exchanged vias are not modified. Therefore, the execution time of a global routing on layer 2 in each iteration is not very large.
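The rip-up and reroute loop described above can be sketched on a generic undirected graph. The Python code below is a simplified illustration, not the routing graph of Fig. 8: the graph, the vertex names, the unit initial weights, the penalty of 1, and the rule that a route may not pass through another net's terminal are all assumptions made for the example.

```python
import heapq

def shortest_path(adj, weight, src, dst, blocked):
    """Dijkstra with vertex weights; vertices in `blocked` are never entered."""
    dist = {src: 0}
    prev = {}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            path = [u]
            while u in prev:
                u = prev[u]
                path.append(u)
            return path[::-1]
        if d > dist[u]:
            continue  # stale heap entry
        for v in adj[u]:
            if v in blocked:
                continue
            nd = d + weight[v]
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    return None

def rip_up_and_reroute(adj, nets, max_iters=20, penalty=1):
    """nets maps a net name to (src, dst). Returns the routed nets' paths."""
    weight = {v: 1 for v in adj}
    terminals = {t for pair in nets.values() for t in pair}
    routes = {}
    for _ in range(max_iters):
        unrouted = [n for n in nets if n not in routes]
        if not unrouted:
            break
        for n in unrouted:
            src, dst = nets[n]
            used = {v for r in routes.values() for v in r}
            # Other routes and other nets' terminals act as obstacles.
            path = shortest_path(adj, weight, src, dst,
                                 (used | terminals) - {src, dst})
            if path is None:
                # Retry while ignoring the routes of the other nets.
                path = shortest_path(adj, weight, src, dst,
                                     terminals - {src, dst})
                if path is None:
                    continue  # no route exists even in the empty graph
                on_path = set(path)
                hit = [m for m, r in routes.items() if on_path & set(r)]
                for m in hit:  # rip up intersecting routes, penalize them
                    for v in routes.pop(m):
                        weight[v] += penalty
                for v in path:
                    weight[v] += penalty
            routes[n] = path
    return routes

# Net B can only pass through x, so net A must give x up and detour.
adj = {
    "s1": ["x", "y1"], "t1": ["x", "y2"],
    "s2": ["x"], "t2": ["x"],
    "x": ["s1", "t1", "s2", "t2"],
    "y1": ["s1", "y2"], "y2": ["y1", "t1"],
}
nets = {"A": ("s1", "t1"), "B": ("s2", "t2")}
routes = rip_up_and_reroute(adj, nets)
print(routes["A"])  # ['s1', 'y1', 'y2', 't1']
print(routes["B"])  # ['s2', 'x', 't2']
```

Nets missing from the returned dictionary are the unconnected nets, so their count plays the role of U and the total length of the returned paths plays the role of L.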
C. Local modification for routability
In the first phase, the wire congestion is minimized effectively by using EXCs, ROTs, and MSEQs. In the second phase, routes on layer 1 should not be changed drastically, in order to maintain their quality. So, in the second phase, we use a local modification, CEXC, which has a high probability of improving the routability on layer 2 while having only a small effect on the routes on layer 1. In a CEXC, two vias are exchanged, but the pairs of vias to be exchanged are restricted to allowable ones.
A CEXC is said to be allowable if it satisfies the following three conditions. In each iteration, a CEXC with the maximum gain is selected among the allowable CEXCs. First, the via assignment must satisfy the monotonic condition after the CEXC is applied. Second, the CEXC must not increase the maximum wire congestion on layer 1. Let $c$ and $c'$ be the maximum wire congestion around $v_i$ and $v_j$ before they are exchanged and after they are exchanged, respectively.
VI. Experiments and Results
We implemented the proposed method in C++ and applied it to several test cases which have mold gates as shown in Fig. 2. The number of rows of balls is 4 in all data. The program ran on a personal computer with a 3.4 GHz CPU and 1 GB of memory.
In our experiments, the number of used edges on the routing graph is used as the total wire length of routing on layer 2. $\alpha_1$, $\beta_1$, $\gamma_1$, and $\alpha_2$ are set to 1. $\delta_1$ and $\gamma_2$ are set much larger than the others. $\beta_2$ is set to $1/4$ to balance the terms: the distance between two adjacent grid nodes is 4 on the routing graph, so this scaling makes it count as 1. In addition, the number of iterations in each rip-up and reroute is limited to 20.
In the tables, C, D, F, and OBS are $cut_a(v)$, $d(v)$, $F(v)$, and $obs(v)$, respectively, and TOTAL is the sum of them. Δ is the total violation of the wire congestion. L is the wire length and U is the number of unconnected nets for routing on layer 2. OLD is the execution time of the method proposed in [6], and PROP is the execution time of our proposed method, in which the cost graph is improved.
The initial cost of each data set is shown in Table I. The outputs of the first phase for most inputs are improved drastically, as shown in Table II.
Although the method proposed in [6] needs 45 seconds for data1, our proposed method obtains the identical output within 3 seconds by improving the computational complexity of each iteration. Table III shows the result of the second phase. Though TOTAL of the second phase increases in comparison to that of the first phase, the routability is improved. In this experiment, all nets are realized in data2, data3, data4, and data5, and L is lower than in the output of the first phase in all data.
The total violations on layer 1 are reduced by the second phase. A result with violations cannot be used as it is. However, even though violations still exist, a result with few violations might be acceptable in practical design. A few violations could be eliminated easily by manual modifications and/or neglected by allowing a few narrow wire segments for non-critical parts and signals.
Fig. 9. The output routes of the first phase for data4.
Fig. 10. The output routes of the second phase for data4.
The output of the first phase for data4 is shown in Fig. 9 , and the output of the second phase is shown in Fig. 10 . Mold gates are not drawn in these figures. In the second phase, routability is improved without changing the structure of global routing of layer 1 drastically.
VII. Conclusion
We showed that a modification with the maximum gain is obtained in $O(|N|)$ time, though it takes $O(|N|^2)$ time in [6]. Moreover, we gave a routing graph for routing on layer 2, and a local modification to improve the routability on layer 2 while maintaining the maximum wire congestion.
In our experiments, our method obtains a via assignment that distributes wires evenly in less time than the method proposed in [6], and the routability on layer 2 is improved drastically. Our proposed method explores monotonic via assignments effectively, and a via assignment which guarantees 100% routability on layer 2 is obtained in most of the test cases.
On the other hand, our method does not realize all nets in data1, where the output of the first phase has many unconnected nets. In addition, though most wires are distributed evenly, there exist places with high wire congestion near mold gates. This is because, in the first phase, moving vias out of mold gates has priority over improving or maintaining the wire congestion and the distance between a via and a ball. These effects would be reduced if an initial via assignment with vias placed outside mold gates were created with routability analysis, and the wire congestion on layer 1 could be improved if a part of the plating leads were realized on layer 2. Both are left as future work.
